It's complicated and time-consuming, but totally worth it.
I've been a sysadmin for a long time, and part of being a sysadmin is doing more than is humanly possible. Sometimes that means writing wicked cool scripts, sometimes it means working late, and sometimes it means learning to say no. Unfortunately, it also sometimes means cutting corners. I confess, I've been “that guy” more than once. A good example is SELinux. On more than a few (hundred!) occasions, I've simply disabled SELinux, because getting things to work right is often really frustrating and time-consuming. The same is true with LVM (Logical Volume Manager). I didn't get it. I thought it added an unnecessary layer of complexity. I thought it meant another potential point of failure. I thought it was stupid.
I was wrong.
LVM is an incredibly flexible, ridiculously useful system that isn't terribly complicated to use. It makes life easier. It makes future storage upgrades and migrations simple. Quite simply, I love it. So in this article, I cover the concepts and usage of LVM. By the time I'm done, hopefully you'll love it as much as I do!
The best analogy I can come up with for explaining LVM is a SAN. If you've ever used a SAN (Storage Area Network) in your server environment, you know it abstracts the idea of individual hard drives and allows you to carve out “chunks” of space to use as drives. Rather than worrying about how big your hard drives might be, a SAN lets you throw all your hard drives into a big chassis and then allocate space to individual clients without being concerned about how many or how few physical drives are being used. LVM is sort of like that, but for an individual system rather than an entire network.
Figure 1 shows my poor attempt at drawing the concept of an LVM system. At first glance, it might seem like using an LVM is silly. Why combine a bunch of drives together, only to carve them up into virtual drives, right? Thankfully, that simple concept gives incredible flexibility down the road. Need a great big partition, but have only a bunch of smaller disks? No problem. Have only a couple disks now, but want to add more later without reformatting? No problem. Need to take snapshots, like with virtual servers, but you're using actual bare metal? No problem. LVM makes dealing with storage far better than partitioning drives or using a simple RAID setup (which, incidentally, brings me to the next issue).
With all the flexibility and expandability I mentioned in the previous paragraph, it seems like LVM would be a perfect replacement for hardware- or software-based RAID. After all, one of the big advantages of RAID is that multiple smaller drives can be used as a single, larger drive. For that particular feature, LVM is indeed ideal. Unfortunately, however, LVM doesn't provide any options for redundancy or parity. That means if you have a drive fail in LVM, you lose data. There's no such thing as striped LVM or mirrored LVM; it's simply not designed to do that.
LVM also isn't designed to increase speed by striping reads and writes across multiple disks. As block devices in the volume group fill up, such simultaneous read/writes may occur, but it's not by design and certainly not to gain speed. Hopefully, it's clear: LVM is really cool, but it is not in any way a replacement for RAID. Thankfully, it doesn't need to be.
(Note: recent versions of LVM actually do provide striping and mirroring features. In some cases, it can take the place of RAID completely. I still think understanding them as separate concepts is important. If you want to learn more about LVM and utilize the RAID features, I'll leave that as an exercise for you.)
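For the curious, here's roughly what a first step down that road looks like on a recent LVM2 (a sketch only; my_vg is a placeholder volume group that needs at least two physical volumes in it):

$ sudo lvcreate --type raid1 -m 1 -L 5G -n mirrored_lv my_vg

The -m 1 asks for one extra copy of the data, so losing a single drive no longer means losing that volume.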
If you look at the first “stage” in my drawing (Figure 1), you'll notice that I didn't call the 10GB chunks “drives”; I called them physical volumes. That's because although it's certainly possible to use a physical drive as a physical volume in LVM, it's not a requirement. In fact, it's not even the most common scenario. In most production environments, LVM is used in combination with RAID. Whether that's hardware-based RAID or software-based RAID, having your underlying physical volumes exist as RAID devices is ideal.
As someone who has had problems with hardware-based RAID arrays, I tend to lean toward software-based RAID in my systems. That's certainly a matter of personal preference, but it's good to know that since software-based RAID and LVM both operate at the kernel level, both are extremely efficient. Software-based RAID admittedly uses some CPU, especially when rebuilding arrays, but LVM uses very little. If I/O performance is of utmost importance for your purposes, it's worth doing some research and possibly testing before committing to any solution.
Although it's certainly possible to transition to an LVM system after Linux is already installed, it's far preferable to do so during the initial setup. Most distributions allow for LVM setup to take place during the installation process, and in the case of CentOS and RHEL, LVM is used by default. Even if you're installing only onto a single, non-RAID hard drive, setting up LVM allows you flexibility and expansion opportunity later. Heck, it's possible to add RAID to a server later on, then simply migrate the data from your original physical volume to the RAID physical volume. That's far easier than using dd, especially when you'd like to keep your server running!
Because this is an introduction, let me start with a simplistic setup. Let's say you have two hard drives, /dev/sdb and /dev/sdc. With LVM, any block device can be used as a physical volume (PV), which means you can use either partitions or entire drives. If you need to have a “traditional” partition (in some cases, the /boot partition might need to be on a regular, non-LVM device), be sure to partition the drive before adding the physical volumes to your volume group. In this example, let's use the raw disks themselves.
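If you did need a traditional partition instead of a raw disk, the setup might look something like this (a sketch assuming parted and a GPT label; the device name is an example only):

$ sudo parted /dev/sdb -- mklabel gpt
$ sudo parted /dev/sdb -- mkpart primary 1MiB 100%
$ sudo parted /dev/sdb -- set 1 lvm on

In that case, you'd hand /dev/sdb1, rather than /dev/sdb, to the LVM commands below.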
Once you have the block devices you want to add to your volume group (again, keep referring to my drawing if the terms get confusing), you need to establish them as LVM physical volumes. To do that, use the pvcreate command:
pvcreate /dev/sdb
pvcreate /dev/sdc
These commands configure the drives as potential candidates to be added to a volume group. If you want to make sure it worked correctly, you can type pvdisplay or pvscan to show the status of any existing LVM Physical Volumes:
$ sudo pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb
  VG Name
  PV Size               10.4 GiB / not usable 3.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              2661
  Free PE               2661
  Allocated PE          0
  PV UUID               SRKAXh-EpYr-r2td-g0gA-31RA-fnfz-3qqGrO

  --- Physical volume ---
  PV Name               /dev/sdc
  VG Name
  PV Size               10.4 GiB / not usable 3.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              2661
  Free PE               2661
  Allocated PE          0
  PV UUID               t2cKru-IwMy-I8re-ADp2-vzFF-Tvh5-O4zMhI
And, the simpler pvscan:
$ sudo pvscan
  PV /dev/sdb      lvm2 [10.4 GiB]
  PV /dev/sdc      lvm2 [10.4 GiB]
  Total: 2 [20.8 GiB] / in use: 0 [0   ] / in no VG: 2 [20.8 GiB]
Once you create the volume group and logical volumes, go ahead and run these commands again to see how the information changes. The differences should be obvious and should make sense.
You don't currently have any volume groups, so create one using the two physical volumes you just made:
vgcreate my_volume_group /dev/sdb /dev/sdc
Hopefully the command is clear. You've created a volume group named my_volume_group using the physical volumes /dev/sdb and /dev/sdc. As with the physical volumes, if you want to check the current state of LVM Volume Groups on your system, type vgdisplay to get a listing:
$ sudo vgdisplay
  --- Volume group ---
  VG Name               my_volume_group
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                2
  Act PV                2
  VG Size               20.8 GiB
  PE Size               4.00 MiB
  Total PE              5322
  Alloc PE / Size       0 / 0 GiB
  Free PE / Size        5322 / 20.8 GiB
  VG UUID               oVYiY6-bQp9-4CVO-QgrN-LGgB-1umR-ebJQo4
As you can see in the output, you've combined the available space of the two physical volumes (10.4GB each) into a total pool of 20.8GB. You could add more drives to the volume group or mix and match entire drives with partitions from other drives. LVM is very flexible. The large pool of available space does no good, however, until you create Logical Volumes to act as your usable disks.
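For example, adding a third drive to the pool later takes only two commands (assuming a hypothetical new disk at /dev/sdd):

$ sudo pvcreate /dev/sdd
$ sudo vgextend my_volume_group /dev/sdd

After that, the extra space shows up as Free PE in vgdisplay, ready for new or expanded logical volumes.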
When you add a hard drive to your system, you don't really get to pick its name. You get /dev/sda, /dev/sdb and so on. When you create logical volumes, however, you decide what you want the devices to be called. You also get to decide how large each “drive” is as you carve it out of the larger volume group. It's good to note here that if you make your logical volumes too small, it's very easy to expand them later, so don't worry too much about planning for long-term potential needs. If you need more space later, you can just add it. To create your logical volumes, type:
$ sudo lvcreate -L 5G -n 5gig my_volume_group
  Logical volume "5gig" created
Then to see what happened behind the scenes, type:
$ sudo lvdisplay
  --- Logical volume ---
  LV Path                /dev/my_volume_group/5gig
  LV Name                5gig
  VG Name                my_volume_group
  LV UUID                3MxOB0-ce5o-yvBD-YORT-52qV-j8HJ-oDru2G
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                5.0 GiB
  Current LE             1280
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:0
Notice how nice and clean the device-naming system is with LVM. It's important to run the lvdisplay command, however, to make sure you know the mapped device name. Many systems use symbolic linking in an attempt to make the device's virtual locations easier to find, but I think that adds a layer of confusion for folks trying to understand what's really going on.
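If you want to see that linking for yourself, compare the friendly path with the device-mapper node behind it (on most modern distributions, the first path is a symlink to something like ../dm-0):

$ ls -l /dev/my_volume_group/5gig
$ ls -l /dev/mapper/my_volume_group-5gig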
Once you've successfully created your logical volumes, it's just a matter of using them as block devices. If you need a filesystem to mount as your /home directory, just do this:
$ sudo mkfs.ext4 /dev/my_volume_group/5gig
$ sudo mount -t ext4 /dev/my_volume_group/5gig /home
And, your /home directory will be a whopping 5GB in size, but fully expandable, thanks to LVM. (Obviously, if you really want to mount your logical volume as your home directory, you should add an entry to /etc/fstab so it mounts on boot.) From the standpoint of your Linux system, however, /dev/my_volume_group/5gig is a block device similar to any hard drive you might plug in. You can use it as swap, format it like you did above, or even encrypt it and mount it somewhere as an encrypted partition.
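A minimal /etc/fstab entry for the example above might look like this (one line, using the ext4 filesystem created a moment ago):

/dev/my_volume_group/5gig  /home  ext4  defaults  0  2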
I know, in this little example, you've done nothing but create a JBOD (Just a Bunch Of Disks) type system, which will completely fail if you lose even one drive. The power of LVM isn't fully realized until down the road when you want to expand your logical volumes without migrating data. Or, when you want to take an LVM snapshot of your drive so you can roll back to an instantaneous backup when an upgrade fails. Or, when you replace a small drive with a fast RAID array and want to migrate the data quietly to your new PV.
The Logical Volume Manager is a system that abstracts storage devices. I won't lie: it does add a layer of complexity to your system. The trade-off is significant, though, because that little bit of added complexity now simplifies your work a great deal when you have to deal with storage in the future.
Hopefully at this point, you see LVM isn't a complete waste of time. When the time comes, what sorts of advantages will LVM provide? Here's a quick off-the-top-of-my-head list you might want to check out:
Move Logical Volumes from old, slow PVs to new, fast PVs, on the fly (sketched below, after this list).
Resize Logical Volumes, filling more space in the Volume Group (also sketched below).
Stripe data across PVs in a VG for increased performance.
Resize Volume Groups by adding or subtracting physical volumes.
Take a snapshot of any Logical Volume, which can be restored later.
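To make the first two items concrete, here are rough sketches (device and volume names come from the example above; as always, have a backup before resizing anything):

# Grow the 5gig volume by 2GB, then grow the ext4 filesystem to match:
$ sudo lvextend -L +2G /dev/my_volume_group/5gig
$ sudo resize2fs /dev/my_volume_group/5gig

# Move everything off an old PV to the rest of the VG, then retire it:
$ sudo pvmove /dev/sdb
$ sudo vgreduce my_volume_group /dev/sdb

Both operations can happen while the volumes are mounted and in use, which is a big part of LVM's charm.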
One of my favorite uses for LVM in production is to take an LVM snapshot before an upgrade. If something goes wrong, I can simply revert to the snapshot. Once you start thinking about all the possibilities LVM offers, you'll wonder why you waited so long!
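Here's roughly what that looks like, using the example volume from earlier (a sketch; the snapshot only needs enough space to hold the changes made while it exists):

# Before the upgrade:
$ sudo lvcreate -s -L 1G -n pre_upgrade /dev/my_volume_group/5gig

# If the upgrade goes badly, merge the snapshot back into the origin:
$ sudo lvconvert --merge /dev/my_volume_group/pre_upgrade

# If all went well, just remove it:
$ sudo lvremove /dev/my_volume_group/pre_upgrade

One caveat worth knowing: if the origin volume is in use, the merge is deferred until the next time the volume is activated, such as after a reboot.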