Difference between revisions of "NDPF GS environment"

From PDP/Grid Wiki
(15 intermediate revisions by the same user not shown)
== The Grid Services environment ==
+
The Grid Services environment contains nodes and virtual machines that run special or dedicated services for grid and grid-related work: web servers, the EUGridPMA Repository, the CA and RA systems, et cetera. These service nodes are ‘one-off’ systems, not under quattor control, installed separately, and updating themselves using yum or apt. They do not even all run the same OS version or flavour.
  
The "Grid Service" environment contains nodes and virtual machines that run special or dedicated services that are not under regular (Quattor) control. They live on a separate network (194.171.96.64/28). The following services are provided:
+
They mostly live on a separate network [http://www.nikhef.nl/pub/projects/grid/gridwiki/index.php/NDPF_Node_Functions#Grid_Services_Protected_Network (194.171.96.64/28)], and at the Remote Housing Location.  
  
== Services ==
+
= Machine overview =
  
=== A note on web servers ===
+
{| style="background:silver; color:black"
 +
! colspan="5" style="background:green; color:white" | ''Machine (real or virtual) overview''
 +
|- style="background:green; color:white"
 +
| ''machine''||''responsible''||''Level''||''Tasks''||''Comments''
 +
|- style="background: gray"
 +
| rooier || sveng || low || web server for EGEE Security SSCs ||
 +
|- style="background: lightgray"
 +
| beerput || davidg || medium || rsync backup service || with ADSM client and backup
 +
|- style="background: gray"
 +
| gierput || davidg || low || no useful purpose left || spare for beerput
 +
|- style="background: yellow"
 +
| sikkel || davidg || high || NDPF subversion service ||
 +
|- style="background: red"
 +
| zeis || davidg || critical || www.eugridpma.org web site (with dynamic content) || a hot spare is available on dodo, re-point the DNS (hosted at [https://access.enom.com/ https://access.enom.com/]) in case it really does not come back
 +
|- style="background: red"
 +
| weikuip || davidg || critical || dist.eugridpma.info web (IGTF CA distribution) || a hot spare is available on lama, re-point the DNS (hosted at [https://access.enom.com/ https://access.enom.com/]) in case it really does not come back
 +
|- style="background: red"
 +
| keerder || davidg || critical || physical host system || serves: zeis, weikuip, rooier, sikkel
 +
|- style="background: yellow"
 +
| hek || davidg || high || DutchGrid CA 'internal' system || ra.dutchgrid.nl, used by the CA admins
 +
|- style="background: red"
 +
| kaasvat || davidg || critical || ca.dutchgrid.nl (DutchGrid CRL distribution) || a hot spare is available on vink, re-point the DNS for ca.dutchgrid.nl, ask PaulKS
 +
|- style="background: yellow"
 +
| rakel || davidg || high || physical host system || Blade #1 (top left, in c15). Hosts: mestkar
 +
|- style="background: yellow"
 +
| mestkar || davidg || high || web server for dutchgrid (and some NDPF stats) ||
 +
|- style="background: lightgray"
 +
| rijf || davidg || medium || NDPF mirror service || stalkaars-02, in 2nd valentine rack
 +
|- style="background: yellow"
 +
| salado || davidg || high || network management host || in cabinet of deel. Makes the cricket graphs. Warning: disk is NOT raided!
 +
|}
  
All web servers share a similar layout. All stuff related to the web service is contained in
+
= Web sites =
  
/project/srv/www
+
== EUGridPMA and IGTF ==
  
and uses the standard httpd as shipped with the OS. To get that far, we move the httpd.conf in
+
For the EUGridPMA and IGTF web sites, Anders Waananen (NBI, DK) also has the access rights and the means to log in. He could potentially also do the system swap in DNS with ENOM, but has never tried that yet.
<tt>/etc/httpd/conf/httpd.conf</tt> out of the way and make a symlink there to <tt>/project/srv/www/conf/httpd.conf</tt>.  
 
  
The <tt>.../www/conf/httpd.conf</tt> file contains only generic configuration, and then includes
+
These web sites *really* have a high profile, so please take care of them for me. Mails sent to the EUGridPMA Operations email address get forwarded to the grid sysadmin list as well.
one config file per virtual host to add and remove web sites. Each virtual web site is self-contained in a directory with the following structure
 
  
/project/srv/www/site/<i>webname</i>/<i>webname</i>.conf
+
== DutchGrid CA ==
/project/srv/www/site/<i>webname</i>/html
 
/project/srv/www/site/<i>webname</i>/cgi
 
/project/srv/www/site/<i>webname</i>/...
 
  
All certificates are stored in <tt>.../www/conf/...</tt> and the shared scripts and php includes (such as genpg) are in <tt>.../www/share/...</tt>.
+
The DutchGrid CA has, besides its off-line signing system, 2 (two) on-line systems: the 'RA' box that serves the internal web management console that Djuhaeri, Andre and Dennis can use; and the 'public' box that serves the web site for user requests, as well as the CRL download location. This latter function (CRL downloads) is *really* critical and gets noticed by each and every site in the grid. Please keep it running, and look for complaints sent to ca@dutchgrid.nl. Dennis, Djuhaeri and Andre get these mails.
  
=== List of Services ===
+
Neither of the two boxes has a redundant power supply, but they do have redundant RAID-1 disks (on a 3ware controller)
; [[Web site www.dutchgrid.nl]] : on host beerput.nikhef.nl as a standard virtual web site, aliases www.dutchgrid.(com,net,org)
 
; [[Web site www.biggrid.nl]] : on host beerput.nikhef.nl as a standard virtual web site (both plain and SSL). Note that the main page http://www.biggrid.nl/ is redirected in the site-local httpd.conf file to point to http://www.nikhef.nl/grid/biggrid, but that the /twiki link is actually hosted there, and contains some magic in the conf file for client-side authentication
 
; [[Web site internal.vl-e.nl]] : on host beerput.nikhef.nl as a standard virtual web site
 
; [[Web site www.eugridpma.org]] : on host beerput.nikhef.nl as a virtual web site with its own IP address (aliases are www.eugridpma.info and www.gridpma.eu). This is the PHP enabled web site with the agenda and such: note that the distribution server advertised to the public is https://dist.eugridpma.info/, although this site ''does'' contain a local (unreachable) copy of this as well. This site runs with a separate host certificate, issued by the GlobalSign SureServer EDU CA
 
; [[VO LDAP service grid-vo.nikhef.nl]] : on host beerput.nikhef.nl, as part of the one and only LDAP service running on this host. The same LDAP server hosts the obsolete and outdated CA LDAP server for issued certificates up till approx. 2003.
 
; [[RSync Backup Server]] : an rsync-based backup system that periodically (once per day) obtains a copy of the filesystems of designated clients, and puts the stuff on beerput for further secondary backup via ADSM to SARA.
 
; [[Beerput CVS service]] : hosted on beerput, is used as a CVS repository for the EUGridPMA (distribution service, utilities, web site), DutchGrid CA public web site, and the NDPF Quattor configuration, system utilities, and the configuration of deel. Data is contained in /project/srv/cvs, with a symlink "/cvs" thereto. Account management via unix on beerput itself.
 
; [[Beerput TFTP service]] : hosted on beerput, used to upload configuration and firmware into deel for emergency cases.
 
; [[Deel master configuration]] : hosted on beerput in /tftpboot/deel.src, which is a symlink to a file under CVS, checked out as davidg on deel. Please edit the master "deel.src" file, ''su'' to davidg and commit changes to CVS there. Then, copy deel.src to deel in /tftpboot if needed. Otherwise, keep the running-config and this file in sync manually by editing according to the live changes on deel.
 
  
== Hosts ==
+
== DutchGrid web site, BiG Grid and the VL-e PoC ==
  
; [[Host dist.eugridpma.info]] : HIGH QOS SERVICE! A vmware guest (sikkel.nikhef.nl) on rooier.nikhef.nl, CentOS 4/i686. It serves the EUGridPMA/IGTF trust anchor distribution from a web server with only static content. A GlobalSign SureServer EDU certificate has been issued to this host.
+
These web sites have (just!) been migrated to mestkar.nikhef.nl, a VM(Xen,PV) hosted on the first blade top-left in the new chassis, on a host called rakel. This machine also does the CVS service for now.  
; [[Host rooier.nikhef.nl]] : VMware hosting system. A "HA-GRID" PowerEdge 1950 8GB/dual Woodcrest system that runs CentOS4 x86_64 and whose sole purpose is hosting VMs. The services on this system are limited to SSH from within trusted NIKHEF internal networks, and the VMware server management port from these same networks. No other services are (to be) run on this system!
+
New here: the uids are taken from the NDPF LDAP, and no longer follow the ikonet assignments.
; [[Host beerput.nikhef.nl]] : The central web, LDAP, rsync-backup and CVS hosting server. It is a dual-Xeon system with a 3ware SATA RAID-1 card and 2x250GByte +1x500 SATA disks. The 500 GB SATA disk contains only the backups of the other hosts in transit to ADSM, and the mirror of external web sites. Note that the hardware is identical to gierput, which can be cannibalized for parts if needed.
+
 
; [[Host gierput.nikhef.nl]] : warm spare for beerput. Takes a nightly mirror of all data on the /project disk of beerput, using rsync from cron.
+
The only service that was NOT yet migrated away from beerput is the ADSM backup. Even more: mestkar is now backed-up TO beerput on a daily basis.
; [[Host hek.nikhef.nl]] : DutchGrid CA online-protected system for CA operations (alias: ra.dutchgrid.nl)
+
 
; [[Host TRIANGEL]] : This host is NOT connected to any network, and only bears "triangel" on the case to identify it. It is the off-line CA signing system, without a hard disk, but with a CD-ROM tray to put the Knoppix CD from the CA safe into. Can be replaced with any system, as long as the replacement similarly has ''no disk and no network, and is booted from the trusted Knoppix CD''.
+
= CVS =
 +
 
 +
The CVS service, using ssh access only, is now provided from mestkar (was: beerput)
 +
 
 +
= SVN =
 +
 
 +
The SVN service runs on sikkel, a VM(Xen,PV) on keerder.
 +
 
 +
= ADSM and backup =
 +
 
 +
The rsync backup service runs on beerput. In /export/data/backups/''FQDN''/ you find the most recent backup. The time stamp of the top-level directory is the time the backup last ran.
 +
 
 +
This area is again backed-up through ADSM to SARA on a daily basis, with 100 days history. In case of trouble with ADSM, contact Ton.
 +
 
 +
= Cricket and network monitoring and control =
 +
 
 +
The 'salado' host (a.k.a. schoffel) is directly connected over a private 192.168.254.0/24 LAN to the management blades in deel and nikopn, and uses several of its other interfaces to connect to guestnet, the public-farmnet(sec) and to the IPMI network.
 +
 
 +
This host also runs the cricket grapher (the site http://www.dutchgrid.nl/ndpf/cricket/ is just a proxy forward).
 +
If the machine is completely hosed, the cricket config (and graphs up to and including June 5th, and INCLUDING the new hef-router collectors)
 +
is copied in a tar-ball to <tt>/global/ices/grid/nikhef/network/</tt>. Unpack in /project/cricket and restart the cron job (from a host with the same network addresses).
 +
 
 +
The cron job is
 +
*/5 * * * *    /project/cricket/deploy/cricket/collect.sh > /dev/null 2>&1
 +
 
 +
Also add to /etc/hosts the correct guestnet-side address of salado:
 +
192.16.192.80  hef-router.nikhef.nl hef-router
 +
 
 +
and of course, enable cgi and the web server on the new salado
 +
lrwxrwxrwx  1 cricket cricket 43 Aug 18  2008 grapher.cgi -> /project/cricket/deploy/cricket/grapher.cgi
 +
lrwxrwxrwx  1 cricket cricket 38 Aug 18  2008 images -> /project/cricket/deploy/cricket/images
 +
lrwxrwxrwx  1 cricket cricket 35 Aug 18  2008 lib -> /project/cricket/deploy/cricket/lib
 +
lrwxrwxrwx  1 cricket cricket 46 Aug 18  2008 mini-graph.cgi -> /project/cricket/deploy/cricket/mini-graph.cgi
 +
 
 +
The firewall of this box is really strict, make sure to make any new box as paranoid as this one.
 +
 
 +
= The Real Hosts =
 +
 
 +
Most of the grid services run off 2 (two) physical hosts: keerder, a PE1950-III with a software-raid-1 setup from the HA-GRID series systems; the other is rakel, a M600e blade with hardware raid-1 over SATA in position 1 of the enclosure.
 +
Physical hosts left are: beerput, gierput, hek, kaasvat, rijf/stalkaars-02.
 +
 
 +
= Decommissioned services =
 +
 
 +
The following services have been decommissioned:
 +
 
 +
* VO LDAP services at grid-vo.nikhef.nl
 +
* SecureGrid.org web site
 +
 
 +
Also, all running services that used to run on <tt>beerput.nikhef.nl</tt>, '''except for the ADSM backup''', have been migrated to "mestkar.nikhef.nl".
 +
 
 +
= Older documentation that still has validity =
 +
 
 +
For the non-migrated services (mainly the DutchGrid CA and the rsync-based backup service), the attached document (PDF) is still valid!
 +
[[Image:Grid-Service-Systems-Guide-20070518.pdf||Grid Service Guide]]

Latest revision as of 16:31, 5 June 2009

The Grid Services environment contains nodes and virtual machines that run special or dedicated services for grid and grid-related work: web servers, the EUGridPMA Repository, the CA and RA systems, et cetera. These service nodes are ‘one-off’ systems, not under quattor control, installed separately, and updating themselves using yum or apt. They do not even all run the same OS version or flavour.

They mostly live on a separate network (194.171.96.64/28), and at the Remote Housing Location.
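
As a small illustration of what the /28 covers, the sketch below checks whether an address falls inside the Grid Services network. The function name and the two sample addresses are made up for the example; only the network range comes from the text above.

```python
import ipaddress

# Network range quoted in the text above.
GRID_SERVICES_NET = ipaddress.ip_network("194.171.96.64/28")


def in_grid_services_net(address: str) -> bool:
    """Return True if the address lies inside the Grid Services network."""
    return ipaddress.ip_address(address) in GRID_SERVICES_NET


if __name__ == "__main__":
    # Hypothetical sample addresses, chosen only to show both outcomes.
    for candidate in ("194.171.96.70", "194.171.96.80"):
        print(candidate, in_grid_services_net(candidate))
```

A /28 holds 16 addresses, so the range runs from 194.171.96.64 up to and including 194.171.96.79.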

Machine overview

{|
|+ Machine (real or virtual) overview
! machine !! responsible !! Level !! Tasks !! Comments
|-
| rooier || sveng || low || web server for EGEE Security SSCs ||
|-
| beerput || davidg || medium || rsync backup service || with ADSM client and backup
|-
| gierput || davidg || low || no useful purpose left || spare for beerput
|-
| sikkel || davidg || high || NDPF subversion service ||
|-
| zeis || davidg || critical || www.eugridpma.org web site (with dynamic content) || a hot spare is available on dodo, re-point the DNS (hosted at https://access.enom.com/) in case it really does not come back
|-
| weikuip || davidg || critical || dist.eugridpma.info web (IGTF CA distribution) || a hot spare is available on lama, re-point the DNS (hosted at https://access.enom.com/) in case it really does not come back
|-
| keerder || davidg || critical || physical host system || serves: zeis, weikuip, rooier, sikkel
|-
| hek || davidg || high || DutchGrid CA 'internal' system || ra.dutchgrid.nl, used by the CA admins
|-
| kaasvat || davidg || critical || ca.dutchgrid.nl (DutchGrid CRL distribution) || a hot spare is available on vink, re-point the DNS for ca.dutchgrid.nl, ask PaulKS
|-
| rakel || davidg || high || physical host system || Blade #1 (top left, in c15). Hosts: mestkar
|-
| mestkar || davidg || high || web server for dutchgrid (and some NDPF stats) ||
|-
| rijf || davidg || medium || NDPF mirror service || stalkaars-02, in 2nd valentine rack
|-
| salado || davidg || high || network management host || in cabinet of deel. Makes the cricket graphs. Warning: disk is NOT raided!
|}

Web sites

EUGridPMA and IGTF

For the EUGridPMA and IGTF web sites, Anders Waananen (NBI, DK) also has the access rights and the means to log in. He could potentially also do the system swap in DNS with ENOM, but has never tried that yet.

These web sites *really* have a high profile, so please take care of them for me. Mails sent to the EUGridPMA Operations email address get forwarded to the grid sysadmin list as well.

DutchGrid CA

The DutchGrid CA has, besides its off-line signing system, 2 (two) on-line systems: the 'RA' box that serves the internal web management console that Djuhaeri, Andre and Dennis can use; and the 'public' box that serves the web site for user requests, as well as the CRL download location. This latter function (CRL downloads) is *really* critical and gets noticed by each and every site in the grid. Please keep it running, and look for complaints sent to ca@dutchgrid.nl. Dennis, Djuhaeri and Andre get these mails.

Neither of the two boxes has a redundant power supply, but they do have redundant RAID-1 disks (on a 3ware controller).

DutchGrid web site, BiG Grid and the VL-e PoC

These web sites have (just!) been migrated to mestkar.nikhef.nl, a VM(Xen,PV) hosted on the first blade top-left in the new chassis, on a host called rakel. This machine also does the CVS service for now. New here: the uids are taken from the NDPF LDAP, and no longer follow the ikonet assignments.

The only service that was NOT yet migrated away from beerput is the ADSM backup. Even more: mestkar is now backed-up TO beerput on a daily basis.

CVS

The CVS service, using ssh access only, is now provided from mestkar (was: beerput)

SVN

The SVN service runs on sikkel, a VM(Xen,PV) on keerder.

ADSM and backup

The rsync backup service runs on beerput. In /export/data/backups/FQDN/ you find the most recent backup. The time stamp of the top-level directory is the time the backup last ran.
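
Since the directory time stamp records the last run, a stale backup can be spotted by comparing it with the current time. The following is a minimal sketch of such a check, not an existing tool on beerput: the function names and the 36-hour threshold are assumptions (picked because the backups are described as daily), and only the backup path comes from the text above.

```python
import os
import time

BACKUP_ROOT = "/export/data/backups"  # path from the text above


def backup_age_hours(fqdn, root=BACKUP_ROOT, now=None):
    """Hours since the top-level backup directory of `fqdn` was last modified."""
    mtime = os.stat(os.path.join(root, fqdn)).st_mtime
    current = time.time() if now is None else now
    return (current - mtime) / 3600.0


def is_stale(fqdn, max_age_hours=36.0, root=BACKUP_ROOT, now=None):
    """True if the backup is older than the threshold (36 h is an assumed value)."""
    return backup_age_hours(fqdn, root=root, now=now) > max_age_hours
```

Calling `is_stale("mestkar.nikhef.nl")` on beerput would then flag a missed run, provided the client directory is named after the FQDN as described.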

This area is again backed-up through ADSM to SARA on a daily basis, with 100 days history. In case of trouble with ADSM, contact Ton.

Cricket and network monitoring and control

The 'salado' host (a.k.a. schoffel) is directly connected over a private 192.168.254.0/24 LAN to the management blades in deel and nikopn, and uses several of its other interfaces to connect to guestnet, the public-farmnet(sec) and to the IPMI network.

This host also runs the cricket grapher (the site http://www.dutchgrid.nl/ndpf/cricket/ is just a proxy forward). If the machine is completely hosed, a copy of the cricket config (and graphs up to and including June 5th, and INCLUDING the new hef-router collectors) is available as a tar-ball in /global/ices/grid/nikhef/network/. Unpack it in /project/cricket and restart the cron job (from a host with the same network addresses).

The cron job is

*/5 * * * *     /project/cricket/deploy/cricket/collect.sh > /dev/null 2>&1
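
The restore described above could be scripted roughly as follows. This is only a sketch: the tar-ball file name is not given in the text, so the `tarball` argument is a placeholder the operator has to fill in, and whether the archive unpacks cleanly into /project/cricket depends on how it was created. The function name is hypothetical; the cron line is the one quoted above.

```python
import tarfile
from pathlib import Path

# Cron entry quoted in the text above; re-add it to the crontab after restoring.
CRON_LINE = "*/5 * * * *    /project/cricket/deploy/cricket/collect.sh > /dev/null 2>&1"


def restore_cricket(tarball, dest="/project/cricket"):
    """Unpack the saved cricket tar-ball into `dest` and return the member names."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball) as archive:
        names = archive.getnames()
        archive.extractall(dest_path)
    return names
```

Returning the member names makes it easy to confirm that the config and graph data actually came out of the archive before the collector is restarted.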

Also add to /etc/hosts the correct guestnet-side address of salado:

192.16.192.80   hef-router.nikhef.nl hef-router

And of course, enable CGI and the web server on the new salado. The symlinks involved are:

lrwxrwxrwx  1 cricket cricket 43 Aug 18  2008 grapher.cgi -> /project/cricket/deploy/cricket/grapher.cgi
lrwxrwxrwx  1 cricket cricket 38 Aug 18  2008 images -> /project/cricket/deploy/cricket/images
lrwxrwxrwx  1 cricket cricket 35 Aug 18  2008 lib -> /project/cricket/deploy/cricket/lib
lrwxrwxrwx  1 cricket cricket 46 Aug 18  2008 mini-graph.cgi -> /project/cricket/deploy/cricket/mini-graph.cgi
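
On a replacement host the four links in the listing above could be recreated with a few lines of Python. The directory that holds the links is not named in the text, so `web_dir` is a placeholder, and the function name is made up for this sketch; the link names and targets are taken from the listing.

```python
import os

DEPLOY_DIR = "/project/cricket/deploy/cricket"  # link targets from the listing above
LINK_NAMES = ("grapher.cgi", "images", "lib", "mini-graph.cgi")


def create_cricket_links(web_dir, deploy_dir=DEPLOY_DIR):
    """Create the four symlinks in `web_dir`, skipping links that already exist.

    Raises FileExistsError if a regular file or directory is in the way.
    """
    created = []
    for name in LINK_NAMES:
        link = os.path.join(web_dir, name)
        if os.path.islink(link):
            continue
        os.symlink(os.path.join(deploy_dir, name), link)
        created.append(name)
    return created
```

Run it as the user that should own the links (the listing shows user and group cricket); the function does not change ownership itself.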

The firewall of this box is really strict; make sure any new box is as paranoid as this one.
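
Before restarting the collector on a replacement host, the /etc/hosts entry for hef-router shown earlier in this section can be verified with a small parser. The helper below is a sketch with a hypothetical name; it only reads text and does not modify the hosts file.

```python
def hosts_has_entry(hosts_text, address, name):
    """True if `hosts_text` maps `name` to `address`; comments and blank lines are ignored."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == address and name in fields[1:]:
            return True
    return False
```

For example, `hosts_has_entry(open("/etc/hosts").read(), "192.16.192.80", "hef-router.nikhef.nl")` reports whether the entry is already present.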

The Real Hosts

Most of the grid services run off 2 (two) physical hosts: keerder, a PE1950-III with a software-raid-1 setup from the HA-GRID series systems; the other is rakel, a M600e blade with hardware raid-1 over SATA in position 1 of the enclosure. Physical hosts left are: beerput, gierput, hek, kaasvat, rijf/stalkaars-02.

Decommissioned services

The following services have been decommissioned:

  • VO LDAP services at grid-vo.nikhef.nl
  • SecureGrid.org web site

Also, all running services that used to run on beerput.nikhef.nl, except for the ADSM backup, have been migrated to "mestkar.nikhef.nl".

Older documentation that still has validity

For the non-migrated services (mainly the DutchGrid CA and the rsync-based backup service), the attached document (PDF) is still valid! File:Grid-Service-Systems-Guide-20070518.pdf