Distroname and release: Debian Squeeze
Debian Linux software RAID with MDADM
With the help of mdadm it is possible to set up a software RAID. This has some clear benefits: you do not depend on a specific controller. With a hardware controller, if it fails, you need the exact same model to be able to reconstruct your RAID. Since software RAID is not tied to any particular hardware, we are not as vulnerable to hardware failures.
This of course comes with a downside as well: the performance is not as good as a real hardware RAID controller card.
That said, most cheap RAID controllers are just emulated hardware RAID (meaning software RAID anyway).
Setup RAID under installation, Debian Squeeze
1) Boot the installer.
2) Go through the required steps, until you are at the point called [!!] Partition disks... Select Manual.
3) (Step might not be required.) Select Disk 1 and press enter. If it asks whether you want to create an empty partition table, select Yes. (WARNING! This will erase ALL data!)
4) Go down to Disk 1, where it says free space, and press enter to create a partition. Give it a size, go to "Use as:" and select "physical volume for RAID". Continue this step until all of your wanted partitions are created, make sure they all are marked as RAID, and set the first one bootable.
Now do the exact same for disk 2.
Whether swap should be on RAID is debatable, look here for some detailed info.
http://tldp.org/HOWTO/Software-RAID-HOWTO-2.html
I want my partition layout to look like this.
Disk 1:
/ 15 gb (bootable)
swap 4 gb
/var 171.1 gb
/root 10 gb
Disk 2:
/ 15 gb (bootable)
swap 4 gb
/var 171.1 gb
/root 10 gb
No filesystem types or mountpoints are created at this time. To do this, go to "Configure Software RAID" and say Yes to write the changes to disk.
"Create MD Device"
"Select your RAID type (1 In my case)"
"Select number of active devices for the RAID1 array". (This is the number of disks available. 2 in my case)
Number of spare devices for the RAID1 array. (0, I do not have any spare disks).
Now select two devices. They are required to match (for example /dev/sda1 and /dev/sdb1). PLEASE MAKE SURE THAT THE PARTITIONS ARE THE CORRECT SIZE, SO YOU ARE SELECTING THE CORRECT ONES.
If the partition layout was exactly the same for Disk1 and Disk2, the devices /dev/sda1 and /dev/sdb1 should match, and /dev/sda2 /dev/sdb2 should match, and so on.
I have however experienced that if you made a mistake during the partition layout (the order of the partitions), and you delete the partitions and re-add them, then for example /dev/sda1 and /dev/sdb1 do not match!
In that situation I solved it by starting from scratch with the partition layout, so the numbers match across partitions.
Now continue with this step until you are done selecting and matching active partitions.
When you are done, select finish.
You will be brought back to the first partition screen, and now you will have the option to select filesystems and mountpoints for the newly configured RAID volumes.
RAID1 device #0  /     ext3  15 gb
RAID1 device #1  swap        4 gb
RAID1 device #2  /var        171.1 gb
RAID1 device #3  /root       10 gb
After this you will have to select "Finish partitioning and write changes to disk".
Continue and complete the installation.
Note:
You will now most likely have quite a lot of disk activity, since the RAID is currently out of sync and needs to sync. There will be some performance degradation while this is going on. It can take quite some time, depending on the size of the disks and the disk utilization.
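Once the system is up, you can keep an eye on the initial sync. A minimal sketch (the 50000 KB/s value is just an example, not from this guide), in case you want the sync to finish faster at the cost of interactive performance:
# Show current sync progress and estimated finish time
cat /proc/mdstat
# Optionally raise the minimum per-device resync speed (in KB/s)
sysctl -w dev.raid.speed_limit_min=50000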
Setup RAID on a running system
Installation:
aptitude install mdadm
Select whether mdadm devices are holding the root file system. In my case the root is on a USB stick, so I write "none" in this field. Start md arrays automatically: yes. Then configure the mdadm arrays. In this example, I am setting up a RAID5 array. My devices are as follows: /dev/sda /dev/sdb /dev/sdc
mdadm -C /dev/md0 -l 5 -n 3 /dev/sda /dev/sdb /dev/sdc -v
Alternatively, set up the sdc disk as a SPARE. This has the advantage that it will not wear the spare disk until another disk dies, but it reduces the RAID capacity. For example, 3 drives of 500GB will give 500GB in total, whereas the above solution gives 1TB instead.
mdadm -C /dev/md0 -l 5 -n 2 /dev/sda /dev/sdb -x 1 /dev/sdc -v
This should print some information about the chunk size (default 512), and if you have created partitions on some of these devices, it will warn you that these will be lost! Answer "yes" or just "y" to continue. It should show this output: mdadm: array /dev/md0 started. Now the RAID build is in progress. This can take a LONG time, depending on the size of the disks, the speed of the hardware and so on. To see the status, check /proc/mdstat.
cat /proc/mdstat
Or if you want frequent updates you can use the watch command, which shows a full-screen output. This example will update every 5 seconds.
watch -n 5 cat /proc/mdstat
As you can see, this will take a little more than 2 hours. Have no worries, in the meantime it is possible to set up the array, create partitions and filesystems, mount and use them. Please notice that if you continue to work with the new array, for example creating the filesystem as we will do later, the build time will increase.
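If you prefer a one-shot summary over /proc/mdstat, mdadm itself can also report the build progress; a small example (the exact output fields vary with the mdadm version):
# Shows the array state, rebuild progress and the member devices
mdadm --detail /dev/md0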
First, I will create a partition on the mdadm array. (Yes, I prefer cfdisk over fdisk.) It might warn you about "Unknown partition table type". Answer yes to start with a zero table. Create a partition with a Linux type, write the partition table, and quit.
cfdisk /dev/md0
Now you should have a partition on /dev/md0p1. Note: It is actually not needed to create a partition if you wish to use the entire array as one filesystem. I prefer to have partitions; I personally think it gives me a better overview. Just do this step if you wish to, or skip it.
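If you would rather script this step than use cfdisk interactively, a sketch with parted (assuming you want a single partition spanning the whole array; this destroys any existing data on /dev/md0):
# Create an msdos label and one partition covering the array
parted -s /dev/md0 mklabel msdos
parted -s /dev/md0 mkpart primary ext4 1MiB 100%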
mkfs.ext4 /dev/md0
Create the filesystem like this if you have NOT created a partition above; this takes some time. Notice: doing so will increase the build time of the array if the build is not complete.
mkfs.ext4 /dev/md0p1
Add the array to the config file. If this is not done, it is possible that the array is sometimes recognized as /dev/md127 instead of /dev/md0. First get the UUID of the array.
mdadm --detail --scan
Output should be something like below.
ARRAY /dev/md/0 metadata=1.2 name=thor:0 UUID=1878c959:78d3c43d:82f3ec63:3a410a8e
Use the UUID in the mdadm.conf file.
/etc/mdadm/mdadm.conf
# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.2 name=thor:0 UUID=1878c959:78d3c43d:82f3ec63:3a410a8e
Create a mountpoint, and mount the volume.
mkdir /mnt/RAID5array
mount /dev/md0p1 /mnt/RAID5array
Configure fstab to automount this volume on boot. Insert the following line.
/etc/fstab
/dev/md0p1 /mnt/RAID5array ext4 rw,user,auto 0 0
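Instead of editing mdadm.conf by hand, you can also append the scan output directly; a sketch (on Debian the config lives in /etc/mdadm/mdadm.conf, and rebuilding the initramfs makes sure the array name is also known at early boot):
# Append the ARRAY line and rebuild the initramfs
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u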
Monitoring
It is crucial that we discover if something goes wrong with our software RAID. We can do this with e-mail alerts. You must be able to send mail from this host before this will work.
Now we want to be sure that this is triggered when an event occurs.
"Reconfigure" mdadm, and select Yes to start the MD monitoring daemon, and i the next step put your e-mail address.
dpkg-reconfigure mdadm
Next we can test this, by sending test e-mails for the mdadm volumes.
mdadm --monitor --scan --test
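If you prefer, you can also set the destination address directly in mdadm.conf with the MAILADDR keyword; a sketch (the address is just a placeholder):
# /etc/mdadm/mdadm.conf
MAILADDR root@example.org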
Increase performance
On RAID5+6 arrays only, increasing the stripe cache size can be a significant improvement, which I would highly recommend. You can try other numbers, but this works very well for me.
echo 16384 > /sys/block/md0/md/stripe_cache_size
Note: you have to add it to a startup script, or create a new one, copy it to /etc/init.d/ and activate it with insserv, so it runs at every boot. A sketch of such a script is below.
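A minimal sketch of such a startup script, saved for example as /etc/init.d/md-stripe-cache (the script name is just an example; adjust md0 and the value to your setup):
#!/bin/sh
### BEGIN INIT INFO
# Provides:          md-stripe-cache
# Required-Start:    $local_fs
# Required-Stop:
# Default-Start:     2 3 4 5
# Default-Stop:
# Short-Description: Set md0 stripe_cache_size at boot
### END INIT INFO
case "$1" in
  start)
    # Same tuning as above, applied at every boot
    echo 16384 > /sys/block/md0/md/stripe_cache_size
    ;;
esac
exit 0
Make it executable and enable it:
chmod +x /etc/init.d/md-stripe-cache
insserv md-stripe-cache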
References and links:
https://raid.wiki.kernel.org/index.php/Linux_Raid
Replacing a failed drive
This will also work with encrypted MDADM using LUKS, and LVM.
DISK -> MD -> LUKS -> (LVM) -> FS
Start by locating the dead disk (in this example it is /dev/sda).
It could also be smartctl/smartd which has reported that the disk is not healthy, or even MDADM which has reported the array as faulty.
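A quick example of such a health check with smartctl (from the smartmontools package):
# Overall SMART health verdict for the suspect disk
smartctl -H /dev/sda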
tail -f /var/log/messages |grep sd
Aug 28 18:40:47 server1 kernel: [2141433.351542] sd 0:0:0:0: [sda] Unhandled error code
Aug 28 18:40:47 server1 kernel: [2141433.351549] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 28 18:40:47 server1 kernel: [2141433.351557] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Aug 28 18:40:47 server1 kernel: [2141433.351678] sd 0:0:0:0: [sda] Unhandled error code
Aug 28 18:40:47 server1 kernel: [2141433.351683] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 28 18:40:47 server1 kernel: [2141433.351690] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 3a 38 60 28 00 00 08 00
Aug 28 18:40:47 server1 kernel: [2141433.351810] sd 0:0:0:0: [sda] Unhandled error code
Aug 28 18:40:47 server1 kernel: [2141433.351815] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug 28 18:40:47 server1 kernel: [2141433.351822] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Example of a faulty MDADM array, look for "(F)" and/or the sda disk.
cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda4[0](F) sdb4[1]
      9770936 blocks super 1.2 [2/1] [_U]
md2 : active raid1 sda3[0](F) sdb3[1]
      460057464 blocks super 1.2 [2/1] [_U]
md1 : active (auto-read-only) raid1 sda2[0] sdb2[1]
      3905524 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sda1[0](F) sdb1[1]
      14646200 blocks super 1.2 [2/1] [_U]
unused devices:
If in doubt which drive is which, locate the serial number, and later on remove the disk labeled with this serial number.
hdparm -I /dev/sdb|grep -i serial
Serial Number: WD-WCAPW7666349
hdparm -I /dev/sda|grep -i serial
Serial Number: WD-WCAPW7216370
Mark /dev/sda as faulty, and then remove the drive from the md devices.
mdadm /dev/md0 --fail /dev/sda1
mdadm: set /dev/sda1 faulty in /dev/md0
mdadm /dev/md1 --fail /dev/sda2
mdadm: set /dev/sda2 faulty in /dev/md1
mdadm /dev/md2 --fail /dev/sda3
mdadm: set /dev/sda3 faulty in /dev/md2
mdadm /dev/md3 --fail /dev/sda4
mdadm: set /dev/sda4 faulty in /dev/md3
Remove the disk from the md devices.
mdadm /dev/md0 --remove /dev/sda1
mdadm: hot removed /dev/sda1 from /dev/md0
mdadm /dev/md1 --remove /dev/sda2
mdadm: hot removed /dev/sda2 from /dev/md1
mdadm /dev/md2 --remove /dev/sda3
mdadm: hot removed /dev/sda3 from /dev/md2
mdadm /dev/md3 --remove /dev/sda4
mdadm: hot removed /dev/sda4 from /dev/md3
Now replace the disk. If it is not hot-swappable, then shut down the machine first. Dump the partition table of the original disk, and write it to the other disk!
You can also just backup the partition table of the original disk, and import it afterwards.
In this example /dev/sda is the disk we have replaced, and to which the partition layout needs to be copied.
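If you want the backup-file approach mentioned above, a small sketch (the file name is just an example):
# Save the healthy disk's layout to a file, then write it to the replacement disk
sfdisk -d /dev/sdb > sdb-partition-table.dump
sfdisk /dev/sda < sdb-partition-table.dump
Or you can pipe it directly, as the next command shows.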
sfdisk -d /dev/sdb | sfdisk /dev/sda
Another approach is simply to use fdisk.
fdisk -l /dev/sdb
Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000a2320
Device Boot Start End Blocks Id System
/dev/sdb1 2048 29296639 14647296 fd Linux raid autodetect
/dev/sdb2 29296640 37109759 3906560 fd Linux raid autodetect
/dev/sdb3 37109760 957227007 460058624 fd Linux raid autodetect
/dev/sdb4 957227008 976771071 9772032 fd Linux raid autodetect
Create the partition layout on the replaced disk. I am using the "Start" and "End" sectors when creating the disk layout, so I am certain that it is the same.
fdisk /dev/sda
Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1):
Using default value 1
First sector (2048-976773167, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-976773167, default 976773167): 29296639
Command (m for help): n
Partition type:
   p   primary (1 primary, 0 extended, 3 free)
   e   extended
Select (default p): p
Partition number (1-4, default 2):
Using default value 2
First sector (29296640-976773167, default 29296640):
Using default value 29296640
Last sector, +sectors or +size{K,M,G} (29296640-976773167, default 976773167): 37109759
Command (m for help): n
Partition type:
   p   primary (2 primary, 0 extended, 2 free)
   e   extended
Select (default p):
Using default response p
Partition number (1-4, default 3):
Using default value 3
First sector (37109760-976773167, default 37109760):
Using default value 37109760
Last sector, +sectors or +size{K,M,G} (37109760-976773167, default 976773167): 957227007
Command (m for help): n
Partition type:
   p   primary (3 primary, 0 extended, 1 free)
   e   extended
Select (default e): p
Selected partition 4
First sector (957227008-976773167, default 957227008):
Using default value 957227008
Last sector, +sectors or +size{K,M,G} (957227008-976773167, default 976773167): 976771071
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
Now we are ready to add the partitions to the md devices.
mdadm /dev/md0 --add /dev/sda1
mdadm /dev/md1 --add /dev/sda2
mdadm /dev/md2 --add /dev/sda3
mdadm /dev/md3 --add /dev/sda4
If the disk is part of a bootable OS/partition/array/disk, then we need to install grub in the MBR of the newly replaced disk. If this is not done, and the other disk fails, there is no MBR to load, and the system will not boot.
grub-install /dev/sda
If grub-install is run while the array is in degraded mode, you will get this warning, NOT an error.
grub-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image.
You can ignore it, or run grub-install again after the rebuild is complete.
The rebuild could take a long time, depending on partition size.
Known issues and fixes
md array inactive
cat /proc/mdstat
Personalities : [raid1]
md1 : inactive sdb2[1](S)
      3905536 blocks super 1.2
Stop the array (if possible).
mdadm --stop /dev/md1
mdadm: stopped /dev/md1
Assemble the array when stopped.
mdadm -A /dev/md1 /dev/sdb2
mdadm: /dev/md1 assembled from 1 drive - need all 2 to start it (use --run to insist).
Force start.
mdadm -A /dev/md1 /dev/sdb2 --run
mdadm: /dev/md1 has been started with 1 drive (out of 2)
Lastly, add the secondary disk.
mdadm /dev/md1 --add /dev/sda2
mdadm: added /dev/sda2
mdadm: Cannot open /dev/sda1: Device or resource busy
Check what's using sda1:
cat /proc/mdstat
md0 : active raid1 sdb1[2]
      14646200 blocks super 1.2 [2/1] [U_]
md2 : active raid1 sdb3[2]
      460057464 blocks super 1.2 [2/1] [U_]
md1 : active (auto-read-only) raid1 sdb2[2]
      3905524 blocks super 1.2 [2/1] [U_]
md3 : active raid1 sdb4[2]
      9770936 blocks super 1.2 [2/1] [U_]
md127 : inactive sda[0](S)
      488385560 blocks super 1.2
Stopping md arrays
Then stop the md array. Warning: be careful if this is actively in use and/or part of the OS itself.
mdadm --stop /dev/md127
Now it should be possible to add the disks.
mdadm /dev/md0 --add /dev/sda1
mdadm: added /dev/sda1
mdadm /dev/md1 --add /dev/sda2
mdadm: added /dev/sda2
mdadm /dev/md2 --add /dev/sda3
mdadm: added /dev/sda3
mdadm /dev/md3 --add /dev/sda4
mdadm: added /dev/sda4
Running partprobe
Make sure you did run partprobe after sfdisk!
partprobe
Swapping
If you get the error on a swap device, try to disable swap first.
swapoff -a
Then add the device again and re-enable swap.
mdadm /dev/md1 --add /dev/sdd1
swapon -a
LVM
I've seen some cases where an older disk was used, and LVM was present.
pvdisplay /dev/sdd1
WARNING: PV /dev/sdd1 in VG VG_XenStorage-34507d32-dcd5-83fa-6a4b-70c21a34f3k8 is using an old PV header, modify the VG to update.
WARNING: Device /dev/sdd1 has size of 62498816 sectors which is smaller than corresponding PV size of 1465147120 sectors. Was device resized?
WARNING: One or more devices used as PVs in VG VG_XenStorage-34507d32-dcd5-83fa-6a4b-70c21a34f3k8 have changed sizes.
--- Physical volume ---
PV Name               /dev/sdd1
VG Name               VG_XenStorage-34507d32-dcd5-83fa-6a4b-70c21a34f3k8
PV Size               <698,64 GiB / not usable <11,87 MiB
Allocatable           yes
PE Size               4,00 MiB
Total PE              178848
Free PE               140401
Allocated PE          38447
PV UUID               dhMyiN-75V9-PvMo-SgV5-zseJ-id5p-kVBHKB
Then remove the VG (answer yes to everything).
vgremove VG_XenStorage-34507d32-dcd5-83fa-6a4b-70c21a34f3k8
Then try again:
mdadm /dev/md1 --add /dev/sdd1