Managing RAID Controllers

From PDP/Grid Wiki
Revision as of 11:47, 15 October 2015 by Dennisvd@nikhef.nl (talk | contribs)

Most systems come equipped with a hardware RAID controller, which can be managed from the OS once the right software is installed. Several flavours of this software are available, and the one you need is not always obvious. The software can be used to destroy the RAID set, but more usefully it can be used to:

  • manage the audible alarm (e.g. in case the backup battery unit is failing)
  • blink the LED of a disk that needs replacement
  • enable/disable a disk

See also the specific pages for oliebol and strijker type nodes.
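As a rough sketch, the three tasks listed above might look like this with storcli. The /cX/eY/sZ object syntax and the alarm, locate and offline/online keywords are assumptions based on the usual StorCLI command style and should be checked against the tool's built-in help before use. The function names are hypothetical, and the controller, enclosure and slot numbers are placeholders to be looked up on the machine in question.

```shell
#!/bin/sh
# Path as used elsewhere on this page; override STORCLI if the binary lives elsewhere.
STORCLI="${STORCLI:-/opt/MegaRAID/storcli/storcli64}"

# Hypothetical helpers. Arguments: controller [enclosure slot].
# The numbers are placeholders; look them up first with "show all".

# Silence the audible alarm on controller $1.
silence_alarm() { "$STORCLI" "/c$1" set alarm=silence; }

# Start or stop blinking the locate LED of the disk in enclosure $2, slot $3.
locate_start() { "$STORCLI" "/c$1/e$2/s$3" start locate; }
locate_stop()  { "$STORCLI" "/c$1/e$2/s$3" stop locate; }

# Take a disk offline or bring it back online.
# Careful: taking a member of a RAID0 set offline makes the set unusable.
disk_offline() { "$STORCLI" "/c$1/e$2/s$3" set offline; }
disk_online()  { "$STORCLI" "/c$1/e$2/s$3" set online; }

# Example (placeholder numbers): blink the LED of controller 0, enclosure 252, slot 3
# locate_start 0 252 3
```

Wrapping the calls in functions keeps the awkward object path in one place, so only the three numbers need to be supplied per disk.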

Installing the software

If it is not already installed, the software is available with yum through the nikhef-external repo. Install one of:

yum install MegaCli
yum install storcli
yum install MegaRAID_Software_Manager

MegaCli is a command-line tool with a rather painful syntax; storcli is nicer. To check that the tool detects the controller, run one of:

/opt/MegaRAID/MegaCli/MegaCli64 -CfgDsply -aALL
/opt/MegaRAID/storcli/storcli64 show all

If these tools report no RAID controllers, try the StorCLI binary that comes with the last package in the list above (MegaRAID_Software_Manager):

/usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 show all

One of these should work.
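The search for a usable binary can be sketched as a small helper. This is a minimal sketch, not an established procedure: the function name find_storcli is hypothetical, and it only checks which binary is installed and executable. It does not detect the case where a tool runs but reports no controllers, so the output still needs to be read by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that is an executable file.
# Returns non-zero if none of the candidates is usable.
find_storcli() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage, with the two storcli64 locations mentioned on this page:
#   cli=$(find_storcli \
#       /opt/MegaRAID/storcli/storcli64 \
#       "/usr/local/MegaRAID Storage Manager/StorCLI/storcli64") \
#     && "$cli" show all
```

Quoting the second path avoids having to escape the spaces in "MegaRAID Storage Manager".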

Using smartctl on the carnaval nodes

The carnaval blades are equipped with Dell PERC H200 cards, with two 500 GB disks in a RAID0 setup. Reading the SMART status of the virtual disk /dev/sda does not work, but querying the generic SCSI devices exposed to the system does:

smartctl -a /dev/sg1
smartctl -a /dev/sg2

This helps to reveal which of the two disks caused the problem.
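For a quick comparison of the two disks, the overall health verdict can be pulled out of smartctl -H for each device. This is a minimal sketch: the function name smart_health is hypothetical, and the grep pattern assumes the health line reads either "SMART overall-health self-assessment test result" (ATA disks) or "SMART Health Status" (SCSI disks). If neither appears, fall back to the full smartctl -a output.

```shell
#!/bin/sh
# smartctl binary to use; override SMARTCTL if it is not in the PATH.
SMARTCTL="${SMARTCTL:-smartctl}"

# Hypothetical helper: print one health line per generic SCSI device given.
smart_health() {
    for dev in "$@"; do
        printf '%s: ' "$dev"
        # -H asks smartctl for the overall health assessment only.
        "$SMARTCTL" -H "$dev" \
            | grep -i -E 'overall-health|SMART Health Status' \
            || echo "no health line found, check the full smartctl -a output"
    done
}

# Usage on a carnaval node (run as root):
#   smart_health /dev/sg1 /dev/sg2
```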