Background
This page documents how to do management tasks -- like looking at the console, powering machines on and off, etc. -- from your desktop, instead of going into the machine room.
IPMI
IPMI is a "standard" interface for remote management tasks. All physical machines in the room should have IPMI interfaces, although this has not been verified for every machine. If you know the physical machine you want to access, you can find the corresponding IPMI address on the page NDPF_Node_Functions, in the table called "IPMI dedicated management network". For example, suppose you want to access wn-val-003.farm.nikhef.nl. The table contains the following line:
0.20-0.121 wn-val-(001-102) valentine LCG2ELPROD
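Read this as follows: the first field gives the range of the last two parts of the IP address (the first part is always 172.20), and the second field gives the matching range of node names. Addresses follow the node numbers, so wn-val-001 has 172.20.0.20 and wn-val-003 has 172.20.0.22. The IPMI hostname is the node name in the ipmi.nikhef.nl domain, so this node has IPMI hostname "wn-val-003.ipmi.nikhef.nl". You can check this with the host command: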
bosui:~> host wn-val-003.ipmi.nikhef.nl
wn-val-003.ipmi.nikhef.nl has address 172.20.0.22
This information can be used as hostname/ip input for the various IPMI client tools.
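The address arithmetic for a table row can be sketched as a small shell function. This is only an illustration: the function name ipmi_addr and its arguments are made up for this page, and it assumes that addresses within a row are handed out consecutively in node order, which matches the wn-val example above but has not been checked for every row.

```shell
#!/bin/sh
# Hypothetical helper, not an existing tool on the farm.
# Usage: ipmi_addr START FIRST N
#   START : first address of the table row in "c.d" form (e.g. 0.20)
#   FIRST : node number that START belongs to (e.g. 001)
#   N     : node number you are looking for (e.g. 003)
# Note: plain POSIX sh has no local variables, so these names are global.
ipmi_addr() {
    start=$1
    # Drop leading zeros so "003" is read as decimal 3, not as octal.
    first=$(printf '%s' "$2" | sed 's/^0*//')
    n=$(printf '%s' "$3" | sed 's/^0*//')
    c=${start%%.*}    # third part of the address
    d=${start##*.}    # fourth part of the address
    # Assumption: consecutive numbering, rolling over into the third
    # part after 255.
    pos=$(( c * 256 + d + ${n:-0} - ${first:-0} ))
    printf '172.20.%d.%d\n' $(( pos / 256 )) $(( pos % 256 ))
}

# Row "0.20-0.121 wn-val-(001-102)", node wn-val-003:
ipmi_addr 0.20 001 003    # prints 172.20.0.22
```

The result should always agree with what the host command reports; if it does not, trust DNS rather than this sketch.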
Important Note: the IPMI network is not accessible from everywhere. From your desktop, you need the OpenVPN tunnel running; otherwise you won't be able to connect. Within the farm, not every node can reach the IPMI network either; it does work from e.g. the install server (stal).
Remote Consoles and Switches
The more modern machines in the farm support a "KVM console" over IPMI: with a tool like the Supermicro IPMI viewer (IPMIView), you can look at, and type at, the console of a physical machine. For these machines, IPMIView does essentially everything you want; the valentine nodes are a good example. Not all machines in the farm are modern enough to have this feature, and for those you must do something else. Good examples are the older Dell machines like the luilaks and bulldozers. To look at the console on these WNs, you have to use the Dell Remote Console switch software. This software only works under Linux and Windows; Mac users can resort to a virtual machine. Other solutions may exist and will be documented here when known.
Command-line operations via ipmitool
There are a lot of commands you can run via ipmitool. One example is turning the power of a machine on or off. Here is how to turn one on:
ipmitool -P $P -U root -H wn-lui1-001.ipmi.nikhef.nl chassis power on
This turns on the power for wn-lui1-001. $P is the password, which is not listed on this page.
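Passing the password with -P makes it visible in the process list for as long as the command runs. ipmitool can instead read it from the IPMI_PASSWORD environment variable when given the -E option. The wrapper below is a sketch along those lines: the function name ipmi_power and the DRY_RUN switch are inventions for this page, while the chassis power subcommands (status, on, off, cycle) are standard ipmitool ones.

```shell
#!/bin/sh
# Hypothetical wrapper, not an existing tool on the farm.
# Usage: ipmi_power HOST USER ACTION
# Expects the password in the IPMI_PASSWORD environment variable.
# Note: plain POSIX sh has no local variables, so these names are global.
ipmi_power() {
    host=$1
    user=$2
    action=$3
    # Refuse anything that is not one of the power actions handled here.
    case $action in
        status|on|off|cycle) ;;
        *) echo "unknown power action: $action" >&2
           return 1 ;;
    esac
    # With DRY_RUN set (an assumption of this sketch), the command is
    # printed instead of executed. -E tells ipmitool to take the
    # password from IPMI_PASSWORD.
    ${DRY_RUN:+echo} ipmitool -E -U "$user" -H "$host" chassis power "$action"
}

# Show the command that would be run, without touching the machine:
DRY_RUN=1 ipmi_power wn-lui1-001.ipmi.nikhef.nl root status
```

For real use, export IPMI_PASSWORD first and leave DRY_RUN unset, for example: ipmi_power wn-lui1-001.ipmi.nikhef.nl root on. Checking "status" before "on" or "off" is a cheap way to confirm you are talking to the machine you intended.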
Usernames and passwords
There is no standard username for the IPMI interface on our farm.
- for valentine machines, the username is ADMIN
- for luilak machines, the username is root
- for the Dell Remote Console Switches, the username is apparently Admin, although this has not been verified as of this writing.
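The list above can be turned into a small lookup so that scripts do not have to hard-code the username. This is a sketch only: the function name ipmi_user is made up, and the hostname patterns (wn-val-* for valentine, wn-lui* for luilak) are inferred from the two example hostnames on this page rather than from a complete naming scheme. The Dell Remote Console Switches are left out because their username is unverified.

```shell
#!/bin/sh
# Hypothetical helper, not an existing tool on the farm.
# Usage: ipmi_user HOSTNAME
# Prints the IPMI username for a host, or fails if the host class is
# not known.
ipmi_user() {
    case $1 in
        wn-val-*) echo ADMIN ;;   # valentine machines
        wn-lui*)  echo root ;;    # luilak machines
        *) echo "no known IPMI username for $1" >&2
           return 1 ;;
    esac
}

ipmi_user wn-val-003.ipmi.nikhef.nl    # prints ADMIN
ipmi_user wn-lui1-001.ipmi.nikhef.nl   # prints root
```

Failing loudly for unknown hosts is deliberate: guessing a username only leads to confusing authentication errors from the BMC.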
Good luck and happy system management!