Line 1: |
Line 1: |
− | The strijker nodes are [http://downloads.dell.com/Manuals/all-products/esuprt_ser_stor_net/esuprt_poweredge/poweredge-r515_Owner%27s%20Manual_en-us.pdf 2U Dell R515] servers with 12 front-loading disks each.
| + | This page [https://wiki.nikhef.nl/nikhef/ctb/NDPF:Strijker_Node_type_details has been moved] to the internal wiki. |
− | | |
− | They have a [http://www.dell.com/downloads/global/products/pvaul/en/perc-technical-guidebook.pdf PERC H700 Integrated] RAID controller, which can be managed by the MegaRAID Storage Manager software:
| |
− | /usr/local/MegaRAID\ Storage\ Manager/StorCLI/storcli64 /c0 show all
| |
− | | |
− | The Nagios sensor in /usr/local/lib/nagios/plugins/check_lsi_raid checks the state of the controller with a somewhat poorly documented command:
| |
− | | |
− | /opt/MegaRAID/storcli/storcli64 adpallinfo a0
| |
− | | |
− | Here 'a0' stands for controller 0.
| |
− | | |
− | The output of this command contains a summary of the controller status, of which one block is particularly interesting:
| |
− | | |
− | Device Present
− | ================
− | Virtual Drives : 2
− | Degraded : 0
− | Offline : 0
− | Physical Devices : 16
− | Disks : 14
− | Critical Disks : 1
− | Failed Disks : 0
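As an illustration of how a monitoring check could use the 'Device Present' block, here is a minimal Python sketch that extracts the counters and maps them to a Nagios-style status. It is not the code of check_lsi_raid; the function names and the thresholds (warn on critical or degraded, fail on failed or offline) are assumptions made for this example, and the sample text is copied from the block above.

```python
import re

# Sample text copied from the 'Device Present' block above.
SAMPLE = """\
Device Present
================
Virtual Drives : 2
Degraded : 0
Offline : 0
Physical Devices : 16
Disks : 14
Critical Disks : 1
Failed Disks : 0
"""


def parse_counts(text):
    """Collect every 'Name : <integer>' line into a dict of counters."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z ]+?)\s*:\s*(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def nagios_status(counts):
    """Map counters to (exit code, label).

    The thresholds are hypothetical, not those of check_lsi_raid.
    """
    if counts.get("Failed Disks", 0) or counts.get("Offline", 0):
        return 2, "CRITICAL"
    if counts.get("Critical Disks", 0) or counts.get("Degraded", 0):
        return 1, "WARNING"
    return 0, "OK"


counts = parse_counts(SAMPLE)
code, label = nagios_status(counts)
print(label, counts)
```

In a real check the text would come from the controller command rather than from a constant, and the exit code would be passed to `sys.exit`.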
| |
− | | |
− | The 'Critical Disks' count in the 'Device Present' block shows that a disk is failing or about to fail. Individual disk info can be retrieved, e.g. with
| |
− | | |
− | storcli64 /c0/e32/sall show all
| |
− | | |
− | This output doesn't directly say that a disk is critical; one has to infer it from the error counts, e.g.
| |
− | | |
− | Drive /c0/e32/s5 State :
− | ======================
− | Shield Counter = 0
− | Media Error Count = 46
− | Other Error Count = 137
− | Drive Temperature = 36C (96.80 F)
− | Predictive Failure Count = 34
− | S.M.A.R.T alert flagged by drive = Yes
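The inference from error counts can be sketched in Python as follows. This is only an illustration: the rule used to call a drive suspect (any media errors, any predictive failures, or a S.M.A.R.T alert) is an assumption for this example and is not necessarily how the controller arrives at its 'Critical Disks' count. The function names are hypothetical and the sample is the drive state block shown above.

```python
import re

# Sample text copied from the drive state block above.
SAMPLE = """\
Drive /c0/e32/s5 State :
======================
Shield Counter = 0
Media Error Count = 46
Other Error Count = 137
Drive Temperature = 36C (96.80 F)
Predictive Failure Count = 34
S.M.A.R.T alert flagged by drive = Yes
"""


def parse_drive_state(text):
    """Return {drive path: {field name: raw string value}}."""
    drives = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^Drive (\S+) State", line)
        if header:
            current = drives.setdefault(header.group(1), {})
            continue
        key, sep, value = line.partition("=")
        # Skip the '=====' ruler (empty key) and lines without '='.
        if current is not None and sep and key.strip():
            current[key.strip()] = value.strip()
    return drives


def to_int(value):
    """Convert a counter to int, treating anything non-numeric as 0."""
    return int(value) if value.isdigit() else 0


def looks_critical(fields):
    """Assumed heuristic: any of these signals marks the drive as suspect."""
    return (
        to_int(fields.get("Media Error Count", "0")) > 0
        or to_int(fields.get("Predictive Failure Count", "0")) > 0
        or fields.get("S.M.A.R.T alert flagged by drive", "No") == "Yes"
    )


drives = parse_drive_state(SAMPLE)
suspects = [path for path, fields in drives.items() if looks_critical(fields)]
print(suspects)
```

Because the parser starts a new entry at every 'Drive ... State' header, it should also cope with the multi-drive output of the 'sall' form of the command, provided each drive's section uses the same header format.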
| |
− | | |
− | Visual inspection of the machine will show a blinking LED on the failing disk. Disks are numbered top to bottom, then left to right, with the top-left position being 0.
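To locate a disk on the front panel from its number, the numbering rule can be written as a small Python helper. It assumes that the 12 bays are arranged as 3 rows by 4 columns and that the slot number reported by the controller matches the physical position number; neither assumption is stated above, so verify them against the machine. The name slot_position is hypothetical.

```python
ROWS = 3  # assumption: 12 front bays arranged as 3 rows by 4 columns


def slot_position(slot, rows=ROWS):
    """Return (row, column), both 0-based, counted from the top-left bay.

    Numbering runs top to bottom within a column, then moves one
    column to the right.
    """
    column, row = divmod(slot, rows)
    return row, column


for slot in (0, 1, 3, 5):
    row, column = slot_position(slot)
    print(f"slot {slot}: row {row}, column {column}")
```

Under these assumptions slot 5 would sit in the bottom row of the second column from the left.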
| |
Latest revision as of 11:27, 13 April 2018