HooiMaanden


How to manage and configure the Hooi Maanden

First some specs about the cluster:

  • 12 x Dell R710
    • 2 x Intel L5520 quadcore
    • 72GB DDR3 RAM
    • 2 x 154GB 10K RPM SAS disks in hardware RAID 1
    • Intel 10GbE 82598EB network card
    • Mellanox Infiniband MT25208 card
    • iDRAC6 Enterprise management card
    • Onboard 1GbE network cards are disabled in the BIOS
  • 6 x DDN SA9900
    • 2 Controllers per cabinet
    • Each controller has a 20Gbit/s Infiniband connection to one of the Dell machines
    • 5 disk drawers per cabinet, each drawer can hold up to 60 disks
    • Hitachi 7200RPM 2TB disks

The hostnames of the disk controllers:

  • hooimaand-mngt-C37-1.ipmi.nikhef.nl
  • hooimaand-mngt-C37-2.ipmi.nikhef.nl
  • hooimaand-mngt-C38-1.ipmi.nikhef.nl
  • hooimaand-mngt-C38-2.ipmi.nikhef.nl
  • hooimaand-mngt-C39-1.ipmi.nikhef.nl
  • hooimaand-mngt-C39-2.ipmi.nikhef.nl
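
The controller names follow a fixed pattern (cabinet 37 to 39, controller 1 or 2), so they can be generated in a loop, for example to check all of them in one go. The following is only a small sketch; the function name controller_hosts is made up for this page.

```shell
#!/bin/sh
# Print the management hostname of every DDN controller.
# The cabinet numbers (37-39) and controller numbers (1-2) come from the list above.
controller_hosts() {
    for cabinet in 37 38 39; do
        for ctrl in 1 2; do
            echo "hooimaand-mngt-C${cabinet}-${ctrl}.ipmi.nikhef.nl"
        done
    done
}

# Example use (left as a comment so that pasting this block has no side effects):
# for h in $(controller_hosts); do ping -c 1 "$h" >/dev/null && echo "$h is up"; done
controller_hosts
```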

Basic commands

When you're logged in to a disk controller, you'll find the following commands useful: host, lun, disk, faults.

How to set up the Hooi Maanden

First, install the machines (for example with a base EL5 Quattor install). The steps below must be executed on each machine (Dell server).

When you have a working system, install these packages:

yum install srptools opensm xfsprogs

This will install the packages you'll need for the Infiniband stack. After that we're going to activate the Infiniband stack, so that the DDN controller sees the connection and the machine can be hooked up to the disks.

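# Load the Infiniband drivers
/etc/init.d/openibd start
sleep 3
# Start the OpenSM subnet manager and give it time to bring up the link
/etc/init.d/opensmd start
sleep 20
# Discover the SRP targets and log in to them, adding the max_sect and max_cmd_per_lun options
/usr/sbin/ibsrpdm -d /dev/infiniband/umad1 -c | sed -e "s/\(service_id\)/max_sect=4096,max_cmd_per_lun=4,\1/" > /sys/class/infiniband_srp/srp-mthca0-2/add_target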

Now we're going to log in to the DDN disk controllers. There are two options: a serial cable or a telnet session. Our disk controllers already have a network connection, but we're going to use the serial connection.

Note (RS): It seems that the Infiniband commands above (from openibd start up to the write to add_target) also need to be executed each time the machine boots. Otherwise, the machine cannot find the disks.
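
Because the commands have to be repeated after every boot, it may be convenient to collect them in one script. The following is only a sketch: the function names are invented here, and calling the script from /etc/rc.local is an assumption, not something that is configured on the Hooi Maanden. The device paths and sleep times are the ones used above.

```shell
#!/bin/sh
# Insert the SRP tuning options in front of the service_id field of every
# target line printed by "ibsrpdm -c".
add_srp_options() {
    sed -e 's/\(service_id\)/max_sect=4096,max_cmd_per_lun=4,\1/'
}

# Start the Infiniband stack and log in to the SRP targets.
connect_srp_targets() {
    /etc/init.d/openibd start
    sleep 3
    /etc/init.d/opensmd start
    sleep 20
    /usr/sbin/ibsrpdm -d /dev/infiniband/umad1 -c | add_srp_options \
        > /sys/class/infiniband_srp/srp-mthca0-2/add_target
}

# Only act when called with "start", so that sourcing this file is harmless.
if [ "${1:-}" = "start" ]; then
    connect_srp_targets
fi
```

If the script were saved as, say, /usr/local/sbin/hooimaand-srp (a hypothetical path), a line "/usr/local/sbin/hooimaand-srp start" in /etc/rc.local would run it at the end of the boot sequence.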

Configuring the DDN setup

Log in to one of the DDN controllers (the one that the machine's Infiniband connection is attached to). The default username and password are admin and password. Once you're logged in, type: lun. You'll see something like this:

LUN  Label         Owner  Status        (Mbytes)  Size  Tiers Tier list
-------------------------------------------------------------------------------
  1 SATA_1           1    Ready          15261576  4096    1  1 
  2 SATA_2           2    Ready          15261576  4096    1  2 
  3 SATA_3           1    Ready          15261576  4096    1  3 
  4 SATA_4           2    Ready          15261576  4096    1  4 
  5 SATA_5           1    Ready          15261576  4096    1  5 
  6 SATA_6           2    Ready          15261576  4096    1  6 
  7 SATA_7           1    Ready          15261576  4096    1  7 
  8 SATA_8           2    Ready          15261576  4096    1  8 
  9 SATA_9           1    Ready          15261576  4096    1  9 
 10 SATA_10          2    Ready          15261576  4096    1  10 
 11 SATA_11          1    Ready          15261576  4096    1  11 
 12 SATA_12          2    Ready          15261576  4096    1  12 
 13 SATA_13          1    Ready          15261576  4096    1  13 
 14 SATA_14          2    Ready          15261576  4096    1  14 
 15 SATA_15          1    Ready          15261576  4096    1  15 
 16 SATA_16          2    Ready          15261576  4096    1  16 
 17 SATA_17          1    Ready          15261576  4096    1  17 
 18 SATA_18          2    Ready          15261576  4096    1  18 
 19 SATA_19          1    Ready          15261576  4096    1  19 
 20 SATA_20          2    Ready          15261576  4096    1  20 
 21 SATA_21          1    Ready          15261576  4096    1  21 
 22 SATA_22          2    Ready          15261576  4096    1  22 
 23 SATA_23          1    Ready          15261576  4096    1  23 
 24 SATA_24          2    Ready          15261576  4096    1  24

Here you see the different LUNs and which controller owns each of them. In this example we have 4 machines, 2 controllers and 24 LUNs, which means 6 LUNs per machine. So LUNs 1, 3, 5, 7, 9 and 11 will be connected to the first machine.
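
To double-check how the LUNs are spread over the two controllers, the lun listing can be saved to a text file and summarised on any Linux machine. The sketch below assumes the column layout shown above (LUN, label, owner, status, capacity in Mbytes); summarise_luns is a made-up name and the two input lines are copied from the listing.

```shell
#!/bin/sh
# Count the Ready LUNs and add up their capacity per owning controller.
# Reads a saved "lun" listing from the given file(s) or from standard input.
summarise_luns() {
    awk '$1 ~ /^[0-9]+$/ && $4 == "Ready" {
             count[$3]++          # column 3 is the owning controller
             mbytes[$3] += $5     # column 5 is the capacity in Mbytes
         }
         END {
             for (owner in count)
                 printf "controller %s: %d LUNs, %d Mbytes\n", owner, count[owner], mbytes[owner]
         }' "$@" | sort
}

# Small demonstration with two lines of the listing above.
summarise_luns <<'EOF'
  1 SATA_1           1    Ready          15261576  4096    1  1
  2 SATA_2           2    Ready          15261576  4096    1  2
EOF
```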

Notes (RS):

1. Only hooimaand-01 ... hooimaand-08 (cabinets 38 and 39) have 6 LUNs. For hooimaand-09 ... hooimaand-12 (cabinet 37), there are only 4 LUNs.

2. Complete overview of LUNs per machine, since the assignment for the 2nd to 4th machine in each rack cannot be guessed trivially:

hooimaand-01: 1,3,5,7,9,11
hooimaand-02: 13,15,17,19,21,23
hooimaand-03: 2,4,6,8,10,12
hooimaand-04: 14,16,18,20,22,24
hooimaand-05: 1,3,5,7,9,11
hooimaand-06: 13,15,17,19,21,23
hooimaand-07: 2,4,6,8,10,12
hooimaand-08: 14,16,18,20,22,24
hooimaand-09: 1,3,5,7
hooimaand-10: 2,4,6,8
hooimaand-11: 9,11,13,15
hooimaand-12: 10,12,14,16
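
Since the assignment does not follow one simple rule, a lookup table is the safest way to use it in scripts. The following sketch simply encodes the overview above; the function name luns_for_host is invented for this example.

```shell
#!/bin/sh
# Print the internal LUN numbers that belong to a given machine,
# exactly as listed in the overview above.
luns_for_host() {
    case "$1" in
        hooimaand-01|hooimaand-05) echo "1 3 5 7 9 11" ;;
        hooimaand-02|hooimaand-06) echo "13 15 17 19 21 23" ;;
        hooimaand-03|hooimaand-07) echo "2 4 6 8 10 12" ;;
        hooimaand-04|hooimaand-08) echo "14 16 18 20 22 24" ;;
        hooimaand-09) echo "1 3 5 7" ;;
        hooimaand-10) echo "2 4 6 8" ;;
        hooimaand-11) echo "9 11 13 15" ;;
        hooimaand-12) echo "10 12 14 16" ;;
        *) echo "unknown host: $1" >&2; return 1 ;;
    esac
}

# Example: on a machine itself one could call: luns_for_host "$(hostname -s)"
luns_for_host hooimaand-02
```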

With the commands from the setup section above, you set up a basic Infiniband connection from the machine to the controller. When you type host, you'll see:

Host                      Timeout
Port LID Depth MaxMsgSize seconds  Current GUID  Rate(Gbps) Port Status 
-------------------------------------------------------------------------------
  1    1   64    4192     75   50001FF100050B86    20.0     Good
  2    2   64    4192     75   50001FF200050B86  Link Down  Not connected
  3    1   64    4192     75   50001FF300050B86  Link Down  Not connected
  4    2   64    4192     75   50001FF400050B86  Link Down  Not connected
                           Current Logins
                 Frame/ S_ID/
 User       Port  MTU    LID   World Wide Name   Login
-------------------------------------------------------------------------------
Anonymous     1  2048       2  50001FF8000505A8  WED MAY 19 02:33:25 2010

The current login of Anonymous is the machine on which you just ran the commands above. Let's add this machine as a real user and give it some disks. Type the following command: user add

 Currently logged-in Anonymous Users:
 ID  User Name    World Wide Name  S_ID   Port Time Logged in
 ----------------------------------------------------------------------------
   0  Anonymous   50001FF8000505A8      2   1  WED MAY 19 02:33:25 2010
Enter: an Anonymous User ID, 
      's' to specify a new Host User's world wide name, or 
      'e' to escape: 0

Enter the ID of the anonymous user here. In the example above that is 0, the entry with World Wide Name 50001FF8000505A8.

Enter: An alias name for the host user 
          (up to 12 characters accepted): HooiMaand-01
Host users can have their port access 'zoned', in order to limit which 
ports the host user is allowed to log into.
By default, host users are given access to all the ports in the system.
Do you want to zone the host ports for this user? (y/N): y

Enter the alias ("HooiMaand-01") for the user, then y to confirm that you want to zone the ports.

Enter: For unit #1: A list of active ports in the range 1..4, (one per line), or
      'e' to escape:
1
2
3
4
Enter: For unit #2: A list of active ports in the range 1..4, (one per line), or
      'e' to escape:
1
2
3
4

Enter ports 1 through 4 for both units.

Host users are limited to accessing specific LUNs, as follows: 
   a host user may have its own unique LUN mapping, or 
   a host user may use the anonymous LUN mapping.
The anonymous user LUN mapping is handled by the port ZONING command.
In either case, the LUN mapping applies on all the ports that the user has 
been zoned for.
Do you want to assign a unique LUN mapping to this user? (y/N): y

Enter y to assign a LUN mapping to this user.

Enter a new unique LUN mapping for this user.
Enter the unique LUN mapping, as follows:
 G.l   GROUP.LUN number
 P     Place-holder
 R     Before GROUP.LUN to indicate Read-Only
 N     Clear current assignment
 <cr>  No change
 E     Exit command
 ?     Display detailed help text
Map external LUN 0 to internal LUN: 1
Map external LUN 1 to internal LUN: 3
Map external LUN 2 to internal LUN: 5
Map external LUN 3 to internal LUN: 7
Map external LUN 4 to internal LUN: 9
Map external LUN 5 to internal LUN: 11
Map external LUN 6 to internal LUN: e
 *** User ID 0: (HooiMaand-01, 50001FF8000505A8) has been added! ***

Enter 1, 3, 5, 7, 9 and 11, one per prompt (depending on the host, see the overview above), followed by e to exit.

 The new User's settings are:
                                     Unit Port-Map
 ID   User Name      World Wide Name    1    2       Zoning Method 
------------------------------------------------------------------------------
 0    HooiMaand-01   50001FF8000505A8  1234 1234   000,001     001,003    ...
                                 LUN Zoning
  ID    User Name      (External LUN, Internal LUN)
 ----------------------------------------------------------------------
  0     HooiMaand-01    000,001     001,003     002,005     003,007    
                        004,009     005,011
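
The answers for the LUN mapping prompts follow directly from the list of internal LUNs of the host: external LUN numbers start at 0 and count up, and the last prompt is answered with e. The following sketch prints that sequence so it can be kept next to the terminal while typing; lun_mapping_answers is a made-up helper, not a DDN command.

```shell
#!/bin/sh
# Print the prompt/answer pairs for the "unique LUN mapping" dialogue.
# Arguments: the internal LUN numbers of the host, in order.
lun_mapping_answers() {
    external=0
    for internal in "$@"; do
        echo "Map external LUN ${external} to internal LUN: ${internal}"
        external=$((external + 1))
    done
    # The first unused external LUN is answered with e to leave the dialogue.
    echo "Map external LUN ${external} to internal LUN: e"
}

# The mapping used for HooiMaand-01 in the example above.
lun_mapping_answers 1 3 5 7 9 11
```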

You're now finished configuring the DDN controllers. Next we can create the RAID 0 array and the LVM volume on the machine itself.

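# Repeat the SRP login so that the LUNs mapped above show up as sd devices
/usr/sbin/ibsrpdm -d /dev/infiniband/umad1 -c | sed -e "s/\(service_id\)/max_sect=4096,max_cmd_per_lun=4,\1/" > /sys/class/infiniband_srp/srp-mthca0-2/add_target
# Collect all sd devices except sda (assumed to be the local RAID 1 system disk)
devices=$(cd /sys/block; /bin/ls -1d * | grep sd | grep -v sda)
fd=$(echo $devices | sed 's/sd/\/dev\/sd/g')
ndev=$(echo $devices | wc -w)
# Stripe the LUNs into one RAID 0 array
mdadm --create /dev/md0 --level=stripe --chunk=4096 --raid-devices=$ndev $fd
# Put LVM on top of the array and create a 10G test volume
pvcreate /dev/md0
vgcreate data /dev/md0
lvcreate --name test --size 10G data
# Create an XFS file system with 4k sectors and 2k inodes, then mount it
mkfs.xfs -f -s size=4k -i size=2k /dev/data/test
mkdir -p /export/data/test
mount /dev/data/test /export/data/test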
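
The device discovery step above can also be written as a small function, which makes it easy to check the list before mdadm is run. This is only a sketch: srp_devices is an invented name, and it skips exactly one device (sda by default) instead of every name that contains "sda" as the grep version does.

```shell
#!/bin/sh
# List every sd* block device except the local system disk as /dev paths.
# $1: directory to scan (default /sys/block), $2: device to skip (default sda).
srp_devices() {
    sysdir=${1:-/sys/block}
    skip=${2:-sda}
    for path in "$sysdir"/sd*; do
        # Skip the unexpanded pattern when no sd device exists at all.
        if [ ! -e "$path" ]; then
            continue
        fi
        name=${path##*/}
        if [ "$name" != "$skip" ]; then
            echo "/dev/$name"
        fi
    done
}

# Print the devices that would go into the array, one per line.
srp_devices
```

On a machine with 6 LUNs this should print six device names; if it prints fewer, the LUN mapping or the SRP login is probably not complete yet. After the array has been created, cat /proc/mdstat shows its state.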