RAID-1 configuration and management

From PDP/Grid Wiki
Revision as of 14:59, 3 November 2006

RAID-1 configuration via Kickstart

(Under construction) How to change the Kickstart file to define RAID-1

The following example uses two serial ATA disks (/dev/sda and /dev/sdb) with four partitions (/boot, /, swap and /tmp), each in a RAID-1 configuration:

part raid.01 --size=128  --ondisk=sda
part raid.02 --size=8192 --ondisk=sda
part raid.03 --size=3072 --ondisk=sda
part raid.04 --size=512 --ondisk=sda

part raid.05 --size=128  --ondisk=sdb
part raid.06 --size=8192 --ondisk=sdb
part raid.07 --size=3072 --ondisk=sdb
part raid.08 --size=512 --ondisk=sdb

raid /boot --level=RAID1 --device=md0 --fstype=ext2 raid.01 raid.05
raid /     --level=RAID1 --device=md1 --fstype=ext3 raid.02 raid.06
raid swap  --level=RAID1 --device=md2 --fstype=swap raid.03 raid.07
raid /tmp  --level=RAID1 --device=md3 --fstype=ext3 raid.04 raid.08
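For reference, the arrays this Kickstart file defines correspond roughly to the manual mdadm invocations below. This is only a sketch that prints the commands instead of running them; the partition numbers (sda1/2/3/5 and their sdb twins) are assumptions based on the example layout used elsewhere on this page, so review and adjust before executing any of them:

```shell
# Print (not run) the manual mdadm equivalents of the Kickstart RAID
# definitions above.  Device and partition names are assumptions taken
# from the example layout on this page; adjust them to your installation.
for def in "md0 sda1 sdb1" "md1 sda2 sdb2" "md2 sda3 sdb3" "md3 sda5 sdb5"; do
    set -- $def    # $1 = array, $2 = first member, $3 = second member
    echo "mdadm --create /dev/$1 --level=1 --raid-devices=2 /dev/$2 /dev/$3"
done
```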

Restoring data on a new disk

OK, so there are two disks in a RAID-1 (mirror) configuration. What to do if one of them dies?

Well, all data are still available on the other disk, so they can be restored once a replacement disk is installed. The following steps show how to restore the data:

0. Do not reboot the machine! A reboot may change the drive names (e.g., /dev/sdb becoming /dev/sda) and may hang the machine if it cannot find the GRUB boot loader.

1. Remove the partitions of the defective disk from the RAID configuration:

mdadm /dev/mdX -r /dev/sdYZ            (X=0,1,2,... Y=a,b,c,... Z=1,2,3,...)

For example,

mdadm /dev/md0 -r /dev/sdb1

to remove partition 1 on disk sdb from raid device md0.
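Note that mdadm only removes devices that are marked faulty or spare; if the kernel has not already flagged the dead disk's partitions, fail each one first with -f. The sketch below merely prints the fail-and-remove command pairs for the layout assumed on this page (sdb with partitions 1, 2, 3 and 5), so the list can be checked before anything is executed:

```shell
# Print the fail+remove command pair for every partition of the dead
# disk.  The md/partition pairs below assume the example layout on this
# page (sdb holds partitions 1, 2, 3 and 5); adjust as needed.
for pair in md0:sdb1 md1:sdb2 md2:sdb3 md3:sdb5; do
    md=/dev/${pair%%:*} part=/dev/${pair##*:}
    echo "mdadm $md -f $part && mdadm $md -r $part"
done
```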

2. Replace the bad disk by a fresh one. Do not reboot the machine!

3. Rescan the SCSI bus for new devices. Use the script

/usr/local/bin/rescan-scsi-bus.sh

(if installed from rpm), or download it from:

http://stal.nikhef.nl/scripts/rescan-scsi-bus.sh

4. Now the partition table has to be created on the new disk. Use the following command to clone the partition table of the surviving disk sdX to the new disk sdY:

sfdisk -d /dev/sdX | \
sed -e 's/sdX/sdY/' | \
sfdisk /dev/sdY

(using X=a,b,c,.. and Y=a,b,c,..., X different from Y).
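For the concrete case assumed on this page (sda survives, sdb is the replacement), the full pipeline is `sfdisk -d /dev/sda | sed -e 's/sda/sdb/' | sfdisk /dev/sdb`. The sed stage can be sanity-checked in isolation on a made-up sample line of sfdisk -d output before letting it write to a real disk:

```shell
# Check the device-name rewrite on a fabricated sfdisk -d dump line
# (Id=fd is "Linux raid autodetect").  A real run reads the dump from
# "sfdisk -d /dev/sda" and pipes the rewritten table into "sfdisk /dev/sdb".
echo '/dev/sda1 : start=63, size=256768, Id=fd' | sed -e 's/sda/sdb/'
# -> /dev/sdb1 : start=63, size=256768, Id=fd
```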

5. Add all partitions of the new disk to the corresponding raid devices:

mdadm /dev/mdX -a /dev/sdYZ            (X=0,1,2,... Y=a,b,c,... Z=1,2,3,...)

For example,

mdadm /dev/md0 -a /dev/sdb1

to add partition 1 on disk sdb to raid device md0. This automatically triggers synchronization of the data on that partition from the surviving disk to the new one. The command may immediately be repeated for all remaining partitions; the actual synchronization takes place sequentially.
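The add commands for the whole disk can be generated in one go. This sketch prints them rather than running them (the md/partition numbering again assumes the example layout on this page), so the list can be checked first:

```shell
# Print the mdadm add command for each partition of the replacement
# disk (assumed: sdb with partitions 1, 2, 3 and 5, paired with
# md0..md3 as elsewhere on this page); run the lines once they look right.
for md in 0:1 1:2 2:3 3:5; do
    echo "mdadm /dev/md${md%:*} -a /dev/sdb${md#*:}"
done
```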

6. The progress of the synchronization can be monitored via the following command:

cat /proc/mdstat

which produces output like:

Personalities : [raid1]
read_ahead 1024 sectors
Event: 23
md0 : active raid1 sda1[0] sdb1[1]
      128384 blocks [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      8385856 blocks [2/2] [UU]

md2 : active raid1 sda3[0] sdb3[1]
      3148672 blocks [2/2] [UU]

md3 : active raid1 sda5[2] sdb5[1]
      521984 blocks [2/1] [_U]
      [==============>......]  recovery = 73.5% (384524/521984) finish=0.0min speed=54932K/sec
unused devices: <none>
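When polling from a script, the completion percentage can be pulled out of this output with a small sed filter. Demonstrated here on the sample recovery line above; on a live system, pipe the contents of /proc/mdstat through the same filter:

```shell
# Extract the completion percentage from a recovery line; shown on a
# sample string -- on a real system, apply the same sed to /proc/mdstat.
echo '  [==============>......]  recovery = 73.5% (384524/521984) finish=0.0min speed=54932K/sec' \
    | sed -n 's/.*recovery = \([0-9.]*%\).*/\1/p'
# -> 73.5%
```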