How to replace software RAID Disk
Revision as of 16:02, 7 September 2011
Replacing A Failed Hard Drive In A Software RAID1 Array
This guide shows how to remove a failed hard drive from a Linux RAID1 array (software RAID), and how to add a new hard disk to the RAID1 array without losing data.
I offer no guarantee that this will work for you!
1 Preliminary Note
In this example I have two hard drives, /dev/sda and /dev/sdb, with the partitions /dev/sda1 and /dev/sda2 as well as /dev/sdb1 and /dev/sdb2.
/dev/sda1 and /dev/sdb1 make up the RAID1 array /dev/md0.
/dev/sda2 and /dev/sdb2 make up the RAID1 array /dev/md1.
/dev/sda1 + /dev/sdb1 = /dev/md0
/dev/sda2 + /dev/sdb2 = /dev/md1
/dev/sdb has failed, and we want to replace it.
2 How Do I Tell If A Hard Disk Has Failed?
If a disk has failed, you will probably find a lot of error messages in the log files, e.g. /var/log/messages or /var/log/syslog.
You can also run
cat /proc/mdstat
and instead of the string [UU] you will see [U_] (or [_U], depending on which member is missing) if you have a degraded RAID1 array.
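The check above can be scripted. The following is a minimal sketch, not a definitive tool: the function name degraded_arrays is made up for this example, and it assumes the /proc/mdstat layout shown in this guide, where the status string ([UU], [U_], ...) sits on the "blocks" line below each array name.

```shell
#!/bin/sh
# Hypothetical helper: list md arrays whose status string contains "_",
# i.e. arrays with at least one missing member.
# Reads mdstat-formatted text from the file given as $1 (default: /proc/mdstat).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                    # remember the current array name
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }   # status such as [U_] or [_U]
    ' "${1:-/proc/mdstat}"
}

# Possible use on a live system:
#   degraded_arrays
```

On the example system from this guide, with /dev/sdb1 already removed from /dev/md0, this would be expected to print md0 only.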
3 Removing The Failed Disk
To remove /dev/sdb, we will mark /dev/sdb1 and /dev/sdb2 as failed and remove them from their respective RAID arrays (/dev/md0 and /dev/md1).
First we mark /dev/sdb1 as failed:
mdadm --manage /dev/md0 --fail /dev/sdb1
The output of
cat /proc/mdstat
should look like this:
server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md0 : active raid1 sda1[0] sdb1[2](F)
24418688 blocks [2/1] [U_]
md1 : active raid1 sda2[0] sdb2[1]
24418688 blocks [2/2] [UU]
unused devices: <none>
Then we remove /dev/sdb1 from /dev/md0:
mdadm --manage /dev/md0 --remove /dev/sdb1
The output should be like this:
server1:~# mdadm --manage /dev/md0 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1
And
cat /proc/mdstat
should show this:
server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md0 : active raid1 sda1[0]
24418688 blocks [2/1] [U_]
md1 : active raid1 sda2[0] sdb2[1]
24418688 blocks [2/2] [UU]
unused devices: <none>
Now we do the same steps again for /dev/sdb2 (which is part of /dev/md1):
mdadm --manage /dev/md1 --fail /dev/sdb2
cat /proc/mdstat
server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md0 : active raid1 sda1[0]
24418688 blocks [2/1] [U_]
md1 : active raid1 sda2[0] sdb2[2](F)
24418688 blocks [2/1] [U_]
unused devices: <none>
mdadm --manage /dev/md1 --remove /dev/sdb2
server1:~# mdadm --manage /dev/md1 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2
cat /proc/mdstat
server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md0 : active raid1 sda1[0]
24418688 blocks [2/1] [U_]
md1 : active raid1 sda2[0]
24418688 blocks [2/1] [U_]
unused devices: <none>
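The fail and remove steps follow the same pattern for every partition of the failed disk, so they can be generated from a list. The sketch below only prints the commands rather than running them; the function name print_removal_commands and the "array:partition" argument format are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the fail/remove commands
# for each "array:partition" pair given as an argument.
print_removal_commands() {
    for pair in "$@"; do
        array=${pair%%:*}   # text before the first colon, e.g. /dev/md0
        part=${pair#*:}     # text after the first colon, e.g. /dev/sdb1
        echo "mdadm --manage $array --fail $part"
        echo "mdadm --manage $array --remove $part"
    done
}

# Example for the layout used in this guide:
print_removal_commands /dev/md0:/dev/sdb1 /dev/md1:/dev/sdb2
```

Printing first leaves a chance to review the device names before anything is run as root, which matters because failing the wrong partition would degrade the healthy half of the mirror.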
Then power down the system:
shutdown -h now
and replace the old /dev/sdb hard drive with a new one. The new drive must be at least as large as the old one; if it is even a few MB smaller, rebuilding the arrays will fail.
4 Adding The New Hard Disk
After you have changed the hard disk /dev/sdb, boot the system.
The first thing we must do now is to create exactly the same partitioning on /dev/sdb as on /dev/sda. We can do this with one simple command:
sfdisk -d /dev/sda | sfdisk /dev/sdb
You can run
fdisk -l
to check if both hard drives have the same partitioning now.
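Instead of comparing the fdisk -l listings by eye, the two sfdisk -d dumps can be compared by a script. This is a rough sketch under some assumptions: the function names and the /tmp file names are made up for this example, and the disks are assumed to use MBR (DOS) partition tables, since GPT dumps contain per-partition UUIDs that always differ.

```shell
#!/bin/sh
# Hypothetical helper: normalise an "sfdisk -d" dump so two disks can be compared.
# Drops per-disk identifier and comment lines and replaces the disk name with "DISK".
# $1 = dump file, $2 = disk name that appears in it (e.g. /dev/sda)
normalize_dump() {
    grep -v -e '^label-id:' -e '^device:' -e '^#' "$1" | sed "s|$2|DISK|g"
}

# Succeeds (exit status 0) when both dumps describe the same layout.
# $1/$2 = first dump file and its disk name, $3/$4 = second dump file and its disk name
same_layout() {
    a=$(normalize_dump "$1" "$2")
    b=$(normalize_dump "$3" "$4")
    [ "$a" = "$b" ]
}

# Possible use as root (file names are placeholders):
#   sfdisk -d /dev/sda > /tmp/sda.dump
#   sfdisk -d /dev/sdb > /tmp/sdb.dump
#   same_layout /tmp/sda.dump /dev/sda /tmp/sdb.dump /dev/sdb && echo "layouts match"
```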
Next we add /dev/sdb1 to /dev/md0 and /dev/sdb2 to /dev/md1:
mdadm --manage /dev/md0 --add /dev/sdb1
server1:~# mdadm --manage /dev/md0 --add /dev/sdb1
mdadm: re-added /dev/sdb1
mdadm --manage /dev/md1 --add /dev/sdb2
server1:~# mdadm --manage /dev/md1 --add /dev/sdb2
mdadm: re-added /dev/sdb2
Now both arrays (/dev/md0 and /dev/md1) will be synchronized. Run
cat /proc/mdstat
to see when it's finished.
During the synchronization the output will look like this:
server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md0 : active raid1 sda1[0] sdb1[1]
24418688 blocks [2/1] [U_]
[=>...................] recovery = 9.9% (2423168/24418688) finish=2.8min speed=127535K/sec
md1 : active raid1 sda2[0] sdb2[1]
24418688 blocks [2/1] [U_]
[=>...................] recovery = 6.4% (1572096/24418688) finish=1.9min speed=196512K/sec
unused devices: <none>
When the synchronization is finished, the output will look like this:
server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid5] [raid4] [raid6] [raid10]
md0 : active raid1 sda1[0] sdb1[1]
24418688 blocks [2/2] [UU]
md1 : active raid1 sda2[0] sdb2[1]
24418688 blocks [2/2] [UU]
unused devices: <none>
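Rather than re-running cat /proc/mdstat by hand, a script can poll for the progress line. The following is a small sketch; the function name rebuild_in_progress is an assumption for this example, and it relies only on the "recovery =" / "resync =" text that appears in /proc/mdstat while a rebuild is running or queued.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) while the mdstat text in $1
# (default: /proc/mdstat) still shows a progress line such as
# "recovery = 9.9%" or "resync = 12.0%".
rebuild_in_progress() {
    grep -Eq '(recovery|resync) *=' "${1:-/proc/mdstat}"
}

# Possible polling loop on a live system:
#   while rebuild_in_progress; do sleep 30; done; echo "rebuild finished"
```

mdadm also has a --wait option (for example mdadm --wait /dev/md0 /dev/md1) that blocks until resync or recovery activity on the given arrays has finished, which may be simpler where it is available.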
That's it, you have successfully replaced /dev/sdb!
Original site here:
http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array