Difference between revisions of "Managing RAID Controllers"
Revision as of 14:49, 14 January 2015
Most systems come equipped with a hardware RAID controller, which can be controlled from the OS when the right software is installed. There are a couple of flavours available, and the one you need is not always obvious. The software can be used to destroy the RAID set, but more usefully it can be used to:
- manage the audible alarm (e.g. in case the backup battery unit is failing)
- blink the LED of a disk that needs replacement
- enable/disable a disk
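As a rough sketch of what these tasks look like with storcli: the controller index, enclosure and slot numbers passed to the helpers below are placeholders, not values taken from these systems, and the helper names themselves are made up for illustration. The wrapper only prints each command unless DRY_RUN=0 is set, so nothing touches the controller by accident.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; controller/enclosure/slot
# numbers must be looked up on the actual machine.
STORCLI=${STORCLI:-/opt/MegaRAID/storcli/storcli64}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode; execute it only when DRY_RUN=0.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = controller index
silence_alarm() { run "$STORCLI" "/c$1" set alarm=silence; }

# $1 = controller index, $2 = enclosure id, $3 = slot number
blink_led()     { run "$STORCLI" "/c$1/e$2/s$3" start locate; }
unblink_led()   { run "$STORCLI" "/c$1/e$2/s$3" stop locate; }
disk_offline()  { run "$STORCLI" "/c$1/e$2/s$3" set offline; }
disk_online()   { run "$STORCLI" "/c$1/e$2/s$3" set online; }
```

For example, `blink_led 0 252 3` would print the locate command for slot 3 in enclosure 252 on controller 0; rerun with DRY_RUN=0 once the numbers have been checked.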
Installing the software
If it is not already installed, the software is available through the nikhef-external repo with yum. Use any of:
 yum install MegaCli
 yum install storcli
 yum install MegaRAID_Software_Manager
MegaCli is a command-line tool with a rather painful syntax; storcli is nicer. To display the controller configuration with either tool:
 /opt/MegaRAID/MegaCli/MegaCli64 -CfgDsply -aALL
 /opt/MegaRAID/storcli/storcli64 show all
In case these tools report no RAID controllers, try the StorCLI binary that ships with the last package, MegaRAID_Software_Manager:
/usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 show all
One of these should work.
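Since it is not obvious up front which storcli binary is present, a small helper can pick the first one that exists. This is only a sketch: the function name find_storcli and the STORCLI_CANDIDATES variable are made up here, and the default list simply repeats the two storcli paths mentioned above. It only locates a binary; it does not check whether that binary actually sees a controller.

```shell
#!/bin/sh
# Candidate locations, one per line (the second path contains spaces).
# Override STORCLI_CANDIDATES if the tools live elsewhere.
STORCLI_CANDIDATES=${STORCLI_CANDIDATES:-"/opt/MegaRAID/storcli/storcli64
/usr/local/MegaRAID Storage Manager/StorCLI/storcli64"}

# Print the first candidate that exists and is executable.
# Returns non-zero (and prints nothing) if none is found.
find_storcli() {
    found=$(printf '%s\n' "$STORCLI_CANDIDATES" | while IFS= read -r path; do
        if [ -x "$path" ]; then
            printf '%s\n' "$path"
            break
        fi
    done)
    [ -n "$found" ] && printf '%s\n' "$found"
}
```

Typical use would be `"$(find_storcli)" show all`, with the quotes kept because of the spaces in the second path.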
Using smartctl on the carnaval nodes
The carnaval blades are equipped with Dell PERC H200 cards, each with two 500 GB disks in a RAID0 setup. Reading the SMART status of the virtual disk /dev/sda does not work, but querying the generic SCSI devices that the system exposes for the physical disks does:
 smartctl -a /dev/sg1
 smartctl -a /dev/sg2
This helps to reveal which of the two disks caused the problem.
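For a quick first pass over both disks, something like the following could be used. It is a sketch under a few assumptions: the function name check_disks is made up, the devices are passed in by the caller, and it relies on smartctl returning a non-zero exit status when the -H health check finds a problem. A non-zero status can also have other causes (for instance a device that cannot be opened), so a flagged disk still needs the full -a output.

```shell
#!/bin/sh
# SMARTCTL can be overridden, e.g. with a full path to the binary.
SMARTCTL=${SMARTCTL:-smartctl}

# Run the overall health check (-H) on each device given as an argument
# and print a one-line verdict per device.
check_disks() {
    for dev in "$@"; do
        if "$SMARTCTL" -H "$dev" >/dev/null 2>&1; then
            echo "$dev: ok"
        else
            echo "$dev: check with smartctl -a $dev"
        fi
    done
}
```

On a carnaval node this would be called as `check_disks /dev/sg1 /dev/sg2`.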