Hadoop and the Velocity Problem

deepak kapse
4 min read · Sep 6, 2022

Hadoop uses the concept of parallelism to upload split data across many nodes, which is how it addresses the velocity problem of big data.

Integrating Hadoop with LVM

Hi friends, I am back with another blog. In this one we will discuss Hadoop, LVM, and elastic storage, i.e., increasing, decreasing, and merging storage devices, and integrating that LVM storage with Hadoop.

Before starting the practical, let us go over a few basic terms.

What is Hadoop?

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.

What is LVM?

In Linux, Logical Volume Manager (LVM) is a device-mapper framework that provides logical volume management for the Linux kernel. Most modern Linux distributions are LVM-aware, to the point of being able to have their root file systems on a logical volume. With LVM we can combine multiple storage devices into one volume and increase or decrease its size; this is the main benefit.

What is elastic storage in Hadoop?

The storage is flexible: we can increase, decrease, and merge storage devices attached to a single DataNode. This makes it easy to maintain flexible, elastic storage.

Let’s dive into the practical part.

Create HDFS Hadoop Cluster:

I have already set up a Hadoop cluster; you can go through my earlier Hadoop Cluster post for the setup steps.

Once the Hadoop cluster setup is done and everything is fine, we can start working with LVM.
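Before moving on, it is worth a quick sanity check that the daemons are running and the NameNode can see the DataNode, using the standard Hadoop CLI:

>> jps
>> hdfs dfsadmin -report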

Create an LVM

First, I add two storage devices to my VM.

Before creating the LVM, we can see that the new devices are not yet mounted:

>> df -Th
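You can also list the raw block devices to confirm the two new disks are attached. In my setup they show up as /dev/sdh and /dev/sdi (device names may differ on your machine):

>> lsblk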

Creating PV, Volume Group, and Logical Volume

Create two physical volumes:

>> pvcreate /dev/sdh
>> pvcreate /dev/sdi
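You can verify both physical volumes with:

>> pvdisplay /dev/sdh /dev/sdi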

Create Volume Group

>> vgcreate hadoopLVM /dev/sdh /dev/sdi
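To confirm the volume group and see its total size:

>> vgdisplay hadoopLVM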

Create Logical Volume

>> lvcreate --size 40G --name hadoopLV hadoopLVM

Display Logical Volume
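The lvdisplay command shows the new LV and its size:

>> lvdisplay /dev/hadoopLVM/hadoopLV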

Format the storage using the command:

>> mkfs.ext4 /dev/hadoopLVM/hadoopLV

Mount it to the DataNode storage directory.
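If the /datanode directory does not exist yet, create it first; it should be the same path configured as dfs.datanode.data.dir in the DataNode's hdfs-site.xml (I am assuming /datanode is that path here):

>> mkdir -p /datanode

Then mount the logical volume: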

>> mount /dev/hadoopLVM/hadoopLV  /datanode/

Display the mounted storage; below we can see 35G mounted:

Storage of Datanode

Below, the DataNode has 35GiB of storage.
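We can cross-check this from the Hadoop side too; the NameNode report should show roughly the same configured capacity for this DataNode:

>> hdfs dfsadmin -report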

Increase Datanode size

Here we can see the storage increased from 35GiB to 50GiB:

>> lvextend --size +<Size_to_increase>G /dev/<group_name>/<LV_name>
>> resize2fs /dev/<group_name>/<LV_name>
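For example, to grow by 15G as in the screenshots above, the filled-in commands would look like this (assuming the volume group and LV names we created earlier):

>> lvextend --size +15G /dev/hadoopLVM/hadoopLV
>> resize2fs /dev/hadoopLVM/hadoopLV

Note that extending works online: the filesystem can stay mounted while it grows, so the DataNode keeps serving data.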

Hadoop data node Storage

Decrease A Logical Volume Size

Before decreasing, make sure the storage is unmounted:

>> umount <mounted_path>

Run a filesystem check first (required before shrinking):

>> e2fsck -f /dev/<group_name>/<LV_name>

Resize the filesystem down to the target size:

>> resize2fs /dev/<group_name>/<LV_name> <sizeToFormat>G

Now it is safe to reduce the storage; reduce the Logical Volume:

>> lvreduce -L <sizeToReduce>G /dev/<group_name>/<LV_name>
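After reducing, remount the volume on the DataNode directory and verify the new size:

>> mount /dev/hadoopLVM/hadoopLV /datanode/
>> df -Th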

Hadoop data node Storage
