LVM
Logical Volume Management.
Device Mapper
Device-mapper is infrastructure in the Linux kernel that provides a generic way to create virtual layers of block devices.
The device-mapper has been part of the Linux kernel since version 2.6 and underpins logical volume management; it is required by LVM2 and EVMS. If you need device-mapper you should install dmsetup and libdevmapper.
The encryption target dm-crypt can be used to create and use encrypted disks.
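As a quick way to see device-mapper in action, the low-level dmsetup tool can show the virtual block devices that LVM or dm-crypt have created (the output simply reflects whatever mappings exist on your system):
dmsetup ls        # list mapped devices (LVM LVs appear as <vg>-<lv>)
dmsetup table     # show the low-level mapping table behind each device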
LVM
Logical Volume Management utilizes the kernel's device-mapper feature to provide a system of partitions independent of underlying disk layout. With LVM you abstract your storage and have "virtual partitions", making extending/shrinking easier (subject to potential filesystem limitations).
Virtual partitions allow addition and removal without worrying about whether you have enough contiguous space on a particular disk, about re-partitioning a disk that is in use (and wondering whether the kernel is using the old or the new partition table), or about having to move other partitions out of the way.
Basic building blocks of LVM:
Physical volume (PV)
- Unix block device node, usable for storage by LVM. Examples are:
- a hard disk,
- an MBR or GPT partition,
- a loopback file,
- a device mapper device (e.g. dm-crypt).
The PV hosts an LVM header.
Volume group (VG)
- A group of PVs that serves as a container for LVs. PEs are allocated from a VG for an LV.
Logical volume (LV)
- "Virtual/logical partition" that resides in a VG and is composed of PEs. LVs are Unix block devices analogous to physical partitions, e.g. they can be directly formatted with a file system.
Physical extent (PE)
- The smallest contiguous unit of space (default 4 MiB) in a PV that can be assigned to an LV. Think of PEs as parts of PVs that can be allocated to any LV.
Examples:
Physical disks
Disk1 (/dev/sda):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
|Partition1 50 GiB (Physical volume) |Partition2 80 GiB (Physical volume) |
|/dev/sda1 |/dev/sda2 |
|_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ |_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ |
Disk2 (/dev/sdb):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
|Partition1 120 GiB (Physical volume) |
|/dev/sdb1 |
|_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _|
LVM logical volumes
Volume Group1 (/dev/MyVolGroup/ = /dev/sda1 + /dev/sda2 + /dev/sdb1):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
|Logical volume1 15 GiB |Logical volume2 35 GiB |Logical volume3 200 GiB |
|/dev/MyVolGroup/rootvol |/dev/MyVolGroup/homevol |/dev/MyVolGroup/mediavol |
|_ _ _ _ _ _ _ _ _ _ _ _ |_ _ _ _ _ _ _ _ _ _ _ _ _ _ |_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _|
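As a sketch, the layout above could be created with commands along these lines (MyVolGroup and the LV names are taken from the diagram; ext4 is just an example file system):
pvcreate /dev/sda1 /dev/sda2 /dev/sdb1
vgcreate MyVolGroup /dev/sda1 /dev/sda2 /dev/sdb1
lvcreate -L 15G -n rootvol MyVolGroup
lvcreate -L 35G -n homevol MyVolGroup
lvcreate -L 200G -n mediavol MyVolGroup
mkfs.ext4 /dev/MyVolGroup/rootvol    # then format each LV as needed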
Advantages
LVM gives you more flexibility than just using normal hard drive partitions:
- Use any number of disks as one big disk.
- Have logical volumes stretched over several disks.
- Create small logical volumes and resize them "dynamically" as they get filled up.
- Resize logical volumes regardless of their order on disk: an LV's size does not depend on its position within the VG, and there is no need for contiguous free space around it.
- Resize/create/delete logical and physical volumes online. File systems on them still need to be resized, but some (such as ext4) support online resizing.
- Online/live migration of LV being used by services to different disks without having to restart services.
- Snapshots allow you to backup a frozen copy of the file system, while keeping service downtime to a minimum.
- Support for various device-mapper targets, including transparent filesystem encryption and caching of frequently used data. This allows creating a system with (one or more) physical disks (encrypted with LUKS) and LVM on top to allow for easy resizing and management of separate volumes (e.g. for /, /home, /backup, etc.) without the hassle of entering a key multiple times on boot.
Disadvantages
- Additional setup steps and more complexity; requires one or more daemons to run constantly.
- If dual-booting, note that Windows does not support LVM; you will be unable to access any LVM partitions from Windows.
- If your physical volumes are not on RAID-1, RAID-5, or RAID-6, losing one disk can lose one or more logical volumes if you span (or extend) your logical volumes across multiple non-redundant disks.
Volume Operations
| Command | Comment |
|---|---|
| pvcreate /dev/sda | create a PV on entire disk /dev/sda |
| pvs | display PVs |
| pvresize | extend a PV after you have enlarged its partition (via fdisk, parted or gparted); pvresize detects the new size and grows the PV to the maximum automatically. When shrinking, shrink the PV first (pvresize --setphysicalvolumesize) and adjust the partition with the partitioning tool afterwards. |
| pvdisplay | display information and extents of PVs, the associated volume group and any free space on the PVs |
| pvmove | move PEs from one PV to another. This may take a long time on large volumes and fsck should be run afterwards; a crash during pvmove could prove fatal. |
| vgcreate | create a volume group from one or more devices, e.g. vgcreate vg0 /dev/sdb1 /dev/sdb2 |
| vgs | display volume groups |
| vgrename | rename the volume group |
| vgextend <name> <dev> | extend vg <name> onto <dev> |
| pvmove <dev> | move all PEs off <dev> onto other PVs in the same VG |
| pvmove <dev1> <dev2> | move the PEs on <dev1> onto <dev2>; then drop <dev1> from the VG with vgreduce <vg> <dev1> |
| vgreduce <vg> <dev> | remove PV dev from VG |
| vgreduce --all <vg> | remove all empty PV from the VG |
| vgreduce --removemissing --force <vg> | remove missing/broken disk from VG |
| pvremove <dev> | wipe the LVM label from <dev> (after it has been removed from its VG) so it can be used for something else |
| lvcreate -L <size> <vg> -n <name> | create a logical volume of the specified size in a volume group, appearing as <vg>-<name> under /dev/mapper, e.g. lvcreate -L 8G redhat -n swap |
| lvrename <vg> <lvold> <lvnew> | rename logical volume in VG e.g. lvrename redhat swap swap-s |
| lvresize -L 15G --resizefs <vg>/<lv> | resize logical volume <lv> in <vg> to 15 GiB and resize the file system all at once. |
| lvextend -l 100%FREE --resizefs <vg>/<lv> | resize the logical volume in <vg> to full extents and extend the filesystem |
| resize2fs /dev/vg/lv | resize an ext2/3/4 file system to fill the LV (other file systems have their own resize tools) |
| lvs | list all logical volumes |
| lsblk | list volumes and mount point |
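As a worked example combining several rows above, migrating everything off a disk you want to retire might look like this (the VG and device names are assumptions):
vgextend vg0 /dev/sdc1    # add a replacement PV to the VG
pvmove /dev/sdb1          # move all extents off the old PV
vgreduce vg0 /dev/sdb1    # drop the old PV from the VG
pvremove /dev/sdb1        # wipe the LVM label so the disk can be reused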
snapshots
LVM allows you to take a snapshot of your system in a much more efficient way than a traditional backup. It does this efficiently by using a COW (copy-on-write) policy.
LVM snapshots work at the block level rather than by copying files. When a snapshot is created it initially shares all of its blocks with the origin LV; only when a block on the origin (or on the snapshot) is modified is the old block copied into the snapshot's space. Thus you can snapshot a system with 35 GiB of data using just 2 GiB of free space, as long as less than 2 GiB of data changes (on the origin and snapshot combined). To be able to create snapshots you need unallocated space in the volume group: a snapshot, like any other volume, takes up space in the VG. So, if you plan to use snapshots for backing up your root partition, do not allocate 100% of your volume group to the root logical volume.
You create snapshot logical volumes just like normal ones.
lvcreate --size 100M --snapshot --name snap01 /dev/vg0/lv
With that volume, you may modify less than 100 MiB of data before the snapshot volume fills up.
Reverting the modified 'lv' logical volume to the state when the 'snap01' snapshot was taken can be done with
lvconvert --merge /dev/vg0/snap01
In case the origin logical volume is active, merging will occur on the next reboot (merging can be done even from a LiveCD). Note: The snapshot will no longer exist after merging.
Also multiple snapshots can be taken and each one can be merged with the origin logical volume at will.
The snapshot can be mounted and backed up with dd or tar. A dd image will be the size of the whole snapshot volume, whereas tar archives only the files it contains. To restore, just create a snapshot, mount it, write or extract the backup onto it, and then merge it with the origin.
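For example, backing up the snap01 snapshot created above might look like this (the mount point and archive path are assumptions):
mkdir -p /mnt/snap
mount -o ro /dev/vg0/snap01 /mnt/snap
tar -czpf /backup/snap01.tar.gz -C /mnt/snap .
umount /mnt/snap
lvremove vg0/snap01       # drop the snapshot once the backup is done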
Snapshots are primarily used to provide a frozen copy of a file system to make backups; a backup taking two hours provides a more consistent image of the file system than directly backing up the partition.
See Create root filesystem snapshots with LVM for automating the creation of clean root file system snapshots during system startup for backup and rollback.
See also https://tutonics.com/2012/12/lvm-guide-part-2-snapshots.html on how to use a snapshot to secure your system state before making changes.
LV cache
create cache
The quick method is to create a PV (if necessary) on the fast disk (replace X with your drive letter) and add it to the existing volume group:
vgextend dataVG /dev/sdX
Create a cache pool with automatic meta data on sdX, and convert the existing logical volume (dataLV) to a cached volume, all in one step:
lvcreate --type cache --cachemode writethrough -L 20G -n dataLV_cachepool dataVG/dataLV /dev/sdX
Obviously, if you want a bigger cache, change the -L parameter to a different size. Note: --cachemode has two possible options:
- writethrough ensures that any data written will be stored both in the cache pool LV and on the origin LV. The loss of a device associated with the cache pool LV in this case would not mean the loss of any data;
- writeback ensures better performance, but at the cost of a higher risk of data loss in case the drive used for cache fails.
If a specific --cachemode is not indicated, the system will assume writethrough as default.
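To confirm that dataLV is now cached and to see which devices back it, a quick check could be (dataVG/dataLV as above):
lvs -a -o +devices dataVG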
Remove cache
If you ever need to undo the one step creation operation above:
lvconvert --uncache dataVG/dataLV
This commits any pending writes still in the cache back to the origin LV, then deletes the cache. Other options are available and described in lvmcache(7).
RAID
From lvmraid(7):
- lvm(8) RAID is a way to create a Logical Volume (LV) that uses multiple physical devices to improve performance or tolerate device failures. In LVM, the physical devices are Physical Volumes (PVs) in a single Volume Group (VG).
LVM RAID supports RAID 0, RAID 1, RAID 4, RAID 5, RAID 6 and RAID 10. See Wikipedia:Standard RAID levels for details on each level.
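For instance, creating a mirrored LV or a striped-with-parity LV might look like this (the VG name and sizes are assumptions):
lvcreate --type raid1 -m 1 -L 20G -n myraid1lv MyVolGroup    # RAID 1 with one mirror (two copies)
lvcreate --type raid5 -i 2 -L 30G -n myraid5lv MyVolGroup    # RAID 5 across two data stripes plus parity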
fsck count
You can adjust the fsck mount count via tune2fs if the file systems on your LVs are ext2/3/4.
- locate the mapped drives you want to alter the fsck count for via:
fsck -N
On padme.server this gives:
[/usr/sbin/fsck.ext4 (1) -- /] fsck.ext4 /dev/mapper/os-root
[/usr/sbin/fsck.vfat (1) -- /boot/efi] fsck.vfat /dev/sdb1
[/usr/sbin/fsck.ext4 (1) -- /hdd] fsck.ext4 /dev/mapper/hdd-hdstore
[/usr/sbin/fsck.ext4 (1) -- /ssd] fsck.ext4 /dev/mapper/ssd-ssdstore
- now list the volume file system e.g.
tune2fs -l /dev/mapper/os-root
tune2fs 1.46.2 (28-Feb-2021)
Filesystem volume name:   root
Last mounted on:          /
Filesystem UUID:          a745eed1-e988-440b-869c-ea16525eee41
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1892352
Block count:              7567360
Reserved block count:     378368
Free blocks:              2461154
Free inodes:              1449221
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Wed Jan 20 19:00:53 2021
Last mount time:          Fri Feb 18 12:03:25 2022
Last write time:          Fri Feb 18 12:03:25 2022
Mount count:              338
Maximum mount count:      -1
Last checked:             Sun Jan 24 16:51:26 2021
Check interval:           0 (<none>)
Lifetime writes:          255 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
First orphan inode:       1706072
Default directory hash:   half_md4
Directory Hash Seed:      790edd61-9039-4383-af73-85b601163e31
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0x5dde0897
- set the fsck count via:
tune2fs -c 5 /dev/mapper/os-root
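If you also want a time-based check interval rather than only a mount count, tune2fs can set one as well (one month here is just an example):
tune2fs -i 1m /dev/mapper/os-root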
Configuration Files
Configuration:
- /etc/lvm/lvm.conf - the central config read by tools
- /etc/lvm/lvm_<host_tag>.conf - host tag configuration (e.g. for clusters)
- /etc/lvm/profile - where configuration profiles are stored
- /etc/lvm/cache/.cache - device name filter cache file
- /etc/lvm/backup - backups of volume group metadata (see the vgcfgbackup sketch after this list)
- /etc/lvm/archive - archives of volume group metadata
- /var/lock/lvm (or /run/lock/lvm) - lock files used to prevent parallel tool runs from corrupting the metadata
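The backup and archive directories are maintained automatically by the tools, but they can also be driven by hand with vgcfgbackup/vgcfgrestore (the VG name here is just an example):
vgcfgbackup MyVolGroup              # write the current metadata to /etc/lvm/backup/MyVolGroup
vgcfgrestore --list MyVolGroup      # list the archived metadata versions available for restore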
You can display the LVM configuration with lvmconfig:
root@c3po:/etc/lvm# lvmconfig
config {
checks=1
abort_on_errors=0
profile_dir="/etc/lvm/profile"
}
dmeventd {
mirror_library="libdevmapper-event-lvm2mirror.so"
snapshot_library="libdevmapper-event-lvm2snapshot.so"
thin_library="libdevmapper-event-lvm2thin.so"
}
activation {
checks=0
udev_sync=1
udev_rules=1
verify_udev_operations=0
retry_deactivation=1
missing_stripe_filler="error"
use_linear_target=1
reserved_stack=64
reserved_memory=8192
process_priority=-18
raid_region_size=512
readahead="auto"
raid_fault_policy="warn"
mirror_image_fault_policy="remove"
mirror_log_fault_policy="allocate"
snapshot_autoextend_threshold=75
snapshot_autoextend_percent=20
thin_pool_autoextend_threshold=100
thin_pool_autoextend_percent=20
use_mlockall=0
monitoring=1
polling_interval=15
activation_mode="degraded"
}
global {
umask=63
test=0
units="h"
si_unit_consistency=1
suffix=1
activation=1
proc="/proc"
etc="/etc"
locking_type=1
wait_for_locks=1
fallback_to_clustered_locking=1
fallback_to_local_locking=1
locking_dir="/run/lock/lvm"
prioritise_write_locks=1
abort_on_internal_errors=0
detect_internal_vg_cache_corruption=0
metadata_read_only=0
mirror_segtype_default="raid1"
raid10_segtype_default="raid10"
sparse_segtype_default="thin"
use_lvmetad=1
use_lvmlockd=0
system_id_source="none"
use_lvmpolld=1
notify_dbus=1
}
shell {
history_size=100
}
backup {
backup=1
backup_dir="/etc/lvm/backup"
archive=1
archive_dir="/etc/lvm/archive"
retain_min=10
retain_days=30
}
log {
verbose=0
silent=0
syslog=1
overwrite=0
level=0
indent=1
command_names=0
prefix=" "
activation=0
debug_classes=["memory","devices","activation","allocation","lvmetad","metadata","cache","locking","lvmpolld","dbus"]
}
allocation {
maximise_cling=1
use_blkid_wiping=1
wipe_signatures_when_zeroing_new_lvs=1
mirror_logs_require_separate_pvs=0
cache_pool_metadata_require_separate_pvs=0
thin_pool_metadata_require_separate_pvs=0
}
devices {
dir="/dev"
scan="/dev"
obtain_device_list_from_udev=1
external_device_info_source="none"
filter=["a|^/dev/sd.*|","r|^/dev/cdrom|"]
cache_dir="/run/lvm"
cache_file_prefix=""
write_cache_state=1
sysfs_scan=1
multipath_component_detection=1
md_component_detection=1
fw_raid_component_detection=0
md_chunk_alignment=1
data_alignment_detection=1
data_alignment=0
data_alignment_offset_detection=1
ignore_suspended_devices=0
ignore_lvm_mirrors=1
disable_after_error_count=0
require_restorefile_with_uuid=1
pv_min_size=2048
issue_discards=0
allow_changes_with_duplicate_pvs=0
}
Resizing FS
Once an LV has been resized, the file system residing on it must also be resized. For ext file systems:
e2fsck -f /dev/volume_group/volume
resize2fs /dev/volume_group/volume
See Resize fs for other leads.
Backup and Restore
This example is for a system running LVM2 with two partitions: the first, /dev/sda1, is /boot and the second, /dev/sda2, is an LVM2_member holding the PV for /.
on live system
Note: If you booted a live image you may need to do the following:
- install lvm2
apt install lvm2
- activate LVM groups
vgchange -a y
This technique highlighted that I needed some free extents available to make a snapshot. Since the volume group holding /dev/mapper/os-root was allocated to 100%, I downsized the hdd PV, made a new 20 GiB partition, and extended the os volume group onto it, which gave enough free extents to make the snapshot. (I need to read up on LVM to better understand extents and extending / to see whether this can be fixed offline instead of with vgextend on the live system.)
- create a snapshot (note you only need a snapshot size big enough to hold any changes that occur while you take the backup):
lvcreate -L 100M -s /dev/vg/lvm -n lvm-snapshot
- mount the snapshot
mount /dev/vg/lvm-snapshot /mount-point
- archive with tar
tar -zcvpf /backup/snapshot.tar.gz /mount-point
- save the boot partition (it may be a UEFI or a legacy boot and cannot reside on LVM)
dd if=/dev/sdxn of=boot-backup.img
- note the UUIDs of the physical volumes:
pvscan -u > uuid-root.txt
- preserve details of the logical volumes
lvdisplay > lvdisplay.txt
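It can also be worth keeping the PV and VG details alongside the LV details (purely optional; the file names are just suggestions):
pvdisplay > pvdisplay.txt
vgdisplay > vgdisplay.txt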
restore onto existing system
If you need to restore an LVM snapshot backup onto your OS root (/) disk, you must boot a recovery or live image that includes LVM2.
- if you do not have LVM2 in the live image, install it via the package manager
apt update
apt install lvm2
- you might also want a few tools e.g.
apt install lz4 pv
- now scan your prior system volumes:
vgscan
- make what you need active (I am taking the lazy way and activate the lot):
vgchange -ay
lsblk
NAME             MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                8:0    0 931.5G  0 disk
├─sda1             8:1    0  15.4G  0 part [swap]
├─sda2             8:2    0 916.1G  0 part /boot
└─hdd-hddstore   254:2    0 916.1G  0 lvm  /hdd
sdb                8:16   0 111.8G  0 disk
├─sdb1             8:17   0   467M  0 part
├─sdb2             8:18   0  28.9G  0 part
│ └─os-root      254:1    0  28.9G  0 lvm  /
├─sdb3             8:18   0  82.5G  0 part
└─sdd-ssdstore   254:0    0  26.8G  0 lvm  /ssd
- now mount your root partition
mkdir /mnt/root
mount /dev/os/root /mnt/root
- examine
ls /mnt/root
- (optional)
rm -fr /mnt/root/*
- now overwrite the root with the backup image
cd /mnt/root
tar xvzf snapshot.tar.gz
on new system
To restore the / partition from your backup you need a Linux live CD that supports LVM, such as Knoppix or a Debian live image (or you can attach your live boot to the network and obtain the lvm2 package):
- prerequisites (dd and tar are normally already present)
apt update
apt install lvm2 lz4 pv
- recreate the partition layout on your main disk, then restore the boot partition:
dd if=boot-backup.img of=/dev/sda1 bs=2M
- examine your former uuid for /
more uuid-root.txt
- create a PV for the second partition, reusing its old UUID (newer LVM versions may also require --restorefile pointing at the metadata backup, or --norestorefile):
pvcreate --uuid <uuid> /dev/sda2
- check the saved layout, then recreate the volume group:
more /backup/lvdisplay.txt
vgcreate Volume-Group-name /dev/sda2
- create the LV for /
lvcreate -l 100%FREE -n root os
- mount the partition
mount /dev/Volume-Group/Logic-Volume /mount-point
- restore the former contents
cd /mount-point
tar -xvzf snapshot.tar.gz
- umount
umount /mount-point
- test
reboot
Note: LVM stores backups of its metadata in /etc/lvm/backup. Each volume group has a file there, which lists the UUID for each PV.
You may also take tar snapshots of any other PVs that you wish to copy to another machine. This will be faster than performing disk and partition imaging.
Tricks
where are things
If you want to know which file system (and thus which logical volume) a file or directory resides on, then type
df <fullpath>
move an LV from one VG to another
dd image copy
umount /somedir/
lvdisplay /dev/vgsource/lv0 --units b
lvcreate -L 12345b -n lv0 vgtarget
dd if=/dev/vgsource/lv0 of=/dev/vgtarget/lv0 bs=1024K conv=noerror,sync status=progress
mount /dev/vgtarget/lv0 /somedir/
If everything is good, remove the source:
lvremove vgsource/lv0
via partclone
If you need to copy a logical volume from VG A to another VG B, I found an interesting variant using partclone. The snapshot-then-dd method above works but can be slow if your file systems are not full; this approach is much faster because it copies only the used blocks.
First create a snapshot of the source LV
lvcreate --snapshot --size 1G /dev/sourcevg/lv --name lv-backup
the --size here is how much data can change on the origin before the snapshot is invalidated
Create the destination LV in the destination VG
lvcreate --size <new_lv_size> /dev/destvg --name newlv
new_lv_size must be at least the size of the source LV
Copy the file-system from source lv backup to destination LV
partclone.<fs_type> --dev-to-dev --source /dev/sourcevg/lv-backup --output /dev/destvg/newlv
fs_type can be ext4, fat32, btrfs, xfs, ... any FS supported by partclone
Delete the snapshot
lvremove /dev/sourcevg/lv-backup
abort pvmove
If a pvmove is interrupted (e.g. by a crash) you can:
pvmove          # with no arguments, the interrupted move resumes
pvmove --abort  # the pvmove (and its temporary mirror) is wound back
- no lvmetad: warnings like the following appear when duplicate PVs are found:
pvs
  WARNING: Not using lvmetad because duplicate PVs were found.
  WARNING: Use multipath or vgimportclone to resolve duplicate PVs?
  WARNING: After duplicates are resolved, run "pvscan --cache" to enable lvmetad.
  PV         VG      Fmt  Attr PSize   PFree
  /dev/sda3  c3po-vg lvm2 a--    1.82t 648.75g
  /dev/sdb1  hdd     lvm2 a--  919.17g 919.17g
  /dev/sdb2  swap    lvm2 a--   32.00g      0
  /dev/sdb3  hdd     lvm2 a--  911.84g 390.84g
more PV stuff
- move a PV https://www.google.com.au/amp/s/www.thegeekdiary.com/centos-rhel-move-a-physical-volume-from-an-existing-volume-group-to-another-volume-group/amp/
- downsize PV https://unix.stackexchange.com/questions/479545/how-to-shrink-a-physical-volume
removable drives
Suspend/resume with LVM and removable media
In order for LVM to work properly with removable media – like an external USB drive – the volume group of the external drive needs to be deactivated before suspend. If this is not done, you may get buffer I/O errors on the dm device after resume. For this reason, it is not recommended to mix external and internal drives in the same volume group.
To automatically deactivate the volume groups with external USB drives, tag each volume group with the sleep_umount tag in this way:
vgchange --addtag sleep_umount vg_external
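Tagging by itself does not deactivate anything; a suspend hook has to act on the tag. A minimal sketch of a systemd system-sleep hook (the path and script name are assumptions, and it presumes the tagged LVs are already unmounted) could be:
#!/bin/sh
# /usr/lib/systemd/system-sleep/lvm-sleep.sh (hypothetical name)
# $1 is "pre" before sleep and "post" after resume
case "$1" in
  pre)  vgchange -an @sleep_umount ;;   # deactivate tagged volume groups before suspend
  post) vgchange -ay @sleep_umount ;;   # reactivate them on resume
esac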
Configuration changes
- configuration is stored in
/etc/lvm/lvm.conf
- configuration may be examined and operated on by
lvmconfig
- after substantive changes, including fstab, make a new kernel initramfs on Debian:
cp -p /boot/initrd.img-$(uname -r) /boot/initrd.img-$(uname -r).$(date +%Y%m%dT%H%M%S).bak
update-initramfs -u
- on Centos:
cp -p /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.$(date +%Y%m%dT%H%M%S).bak
dracut -f
If you receive a warning from update-initramfs regarding RESUME being different from your swap device/UUID, then edit the file
/etc/initramfs-tools/conf.d/resume
and remove the RESUME override.
Problems
- See https://wiki.archlinux.org/index.php/LVM#Troubleshooting
- PV metadata recovery cache https://fy.blackhats.net.au/blog/html/2019/11/26/recovering_lvm_when_a_device_is_missing_with_a_cache_pool_lv.html
- PV metadata recovery https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/mdatarecover
- swap: consider not placing swap on LVM https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1768230
References
- Redhat manual https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/configuring_and_managing_logical_volumes/index
- https://wiki.gentoo.org/wiki/LVM
- https://wiki.archlinux.org/index.php/LVM