Difference between revisions of "NDPF"

From PDP/Grid Wiki

Revision as of 15:38, 17 November 2008

Nikhef Data Processing Facility

pending actions needed on NDPF

Hardware Inventory and Connections

which node does what??
Network Connections in the NDPF (all routers and switches)
Cabling of the OPN switches and routers
which disk server contains what??
domUs on a Dom0
Layout of the cabinets in the server room
NCF nodes, mostly in the leftmost rack (node16-17 to node16-41), have mismatches between what is written on the label (front and back) and the node actually installed. This table has the mapping.
MAC addresses "EasterEgg" Sun X4500 Thumpers
Where is my Thumper? Physical locations per serial number
OLD Deel interfaces

Operating procedures and manuals

User Management

how to create poolaccounts for a new VO (or extend an existing set), or even recover from an empty gridmapdir
LDAP directory setup and slave linking
Adding local users
How to create a new user account.
Adding a new VO
How to add a new VO
Uid and Gid number plan for the NDPF clusters

Grid, Quattor and Yaim

How to work with our Quattor setup
How to work with the template repository, compiling and deploying node templates
Upgrading Quattor managed glite servers
Information on how to upgrade Quattor managed glite servers.
Upgrading Quattor Components
Information on how to upgrade quattor components using the CVS repository on stal and the local tools for rpm management.
Upgrading the nikhef-yaim-local rpm
How to build a release of the nikhef-yaim-local rpm and include it in the Quattor profiles.
Namespaced Quattor configuration
Documentation on the Quattor setup, including the namespaced templates
AII version 2 and complex block device schemas
How to configure complex block device schemas in Pan with AII version 2
Setting up a gLite WMS
Notes for installing and configuring a gLite WMS
LCAS and LCMAPS installation for gLExec and (GT4) gatekeepers
What to install and what/how to configure the components
Requesting or Renewing Host certificates
Guide on the procedure to request a new host certificate or renew an existing certificate

Batch Systems and Scheduling

Adding/removing nodes to PBS
Information on how to add or remove nodes from PBS.
Information on how the accounting chain works from PBS to the local and APEL accounting portals.
Creating the voview/grisview graphs on www.nikhef.nl/grid/stats, and the cron jobs that run on naab.

Systems Documentation

Remote usage of the Dell console switches
How to remotely connect to the Dell console switches.
Serial Consoles
how to set up your serial (over LAN) console and IPMI 2.0 SoL stuff, even with Xen, on a PE1950
how to set the LCD panel text at runtime on a Dell PowerEdge using IPMI

NDPF Dell switch config
setup and configuration of the Dell switches (disabling port-fast) - required for every new installation
use and configuration of the chassis and other controls of the Dell (PE1950) systems
documentation for the Equinox ELS-16 TS
NDPF GS environment
Grid Service (specialties) environment documentation
NDPF rsync backup
backup of service nodes using rsync and indirect ADSM usage
NDPF MySQL configurations
Configuration and monitoring of the MySQL service on bedstee
RAID-1 configuration and management
how to set up RAID-1 and how to manage RAID devices
some tidbits on the X4500 Thumpers
Increasing Thumper filesystems for DPM
recipe for increasing the logical volume, file system and making it visible to DPM
handling NL-T1 alarms at Nikhef
NDPF SubVersion Repository set up

Problem Recovery

Procedures to be followed when recovering from major outages. No extensive descriptions, just a list of what to do, without having to think/know much about the topic.

Restoring Services
order in which to bring services back on-line
Resurrecting kuiken
steps to restart and check the VOMS server kuiken

System Utilities

The current LCG software can put severe load on a PBS server because of how often it calls qstat (we have seen as many as 25 qstat calls per second). This load can eventually bring the entire system to a halt. To reduce the impact of this problem (a full solution requires rewriting part of the grid job manager), Davide wrote a set of utilities that wrap the original PBS commands and provide caching.
MoniFarm Utilities
Graphing package for farm utilisation (used to make the production graphs for the NDPF Facilities)
The proxy renewal wrapper script, enhanced with status info
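
The caching idea behind the PBS command wrappers described above can be illustrated with a minimal sketch. This is not Davide's implementation: the function name cached_run, the cache directory and the 30-second lifetime are all hypothetical, chosen only to show how repeated identical calls within a short window can be answered from a file instead of reaching the PBS server.

```python
import hashlib
import os
import subprocess
import tempfile
import time

# Hypothetical defaults, not taken from the NDPF utilities.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "cmd-cache")
CACHE_TTL = 30.0  # seconds a cached answer stays valid


def cached_run(argv, ttl=CACHE_TTL, cache_dir=CACHE_DIR, runner=None):
    """Return the stdout of argv, reusing a cached copy younger than ttl seconds."""
    if runner is None:
        # Default: really execute the command and capture its output.
        def runner(args):
            return subprocess.run(
                args, capture_output=True, text=True, check=True
            ).stdout

    os.makedirs(cache_dir, exist_ok=True)
    # One cache file per distinct command line.
    key = hashlib.sha256("\0".join(argv).encode()).hexdigest()
    path = os.path.join(cache_dir, key)

    try:
        if time.time() - os.path.getmtime(path) < ttl:
            with open(path) as fh:
                return fh.read()
    except OSError:
        pass  # no cache entry yet, fall through and run the command

    output = runner(argv)

    # Write to a temporary file and rename it, so that concurrent
    # callers never read a partially written cache entry.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w") as fh:
        fh.write(output)
    os.replace(tmp, path)
    return output
```

A wrapper script installed in place of the real command could call cached_run(["qstat"] + sys.argv[1:]) with the path of the original binary substituted in; with 25 calls per second and a 30-second lifetime, nearly all calls would then be served from the cache file.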


Xen on CentOS 5
Installation and configuration of XEN Dom0 CentOS 5 x86_64, DomU CentOS 4 i386

Virtualisation Issues

NDPF VMware authentication
controlling and creating VMs on the central VMware server
NDPF vmware tips
tips and tricks for generating vmware images
Xen on CentOS 5 - Notes
Notes for installing and configuring a XEN System on CentOS 5
Xen on CentOS - Automating Installation-Administration
Quattor managed Xen-Dom0 and DomUs
Xen 3.2, CentOS 5.1 and NAT HOWTO
HOWTO document to set up Xen with NAT networking

Performance data and analysis

NDPF Node Performance
performance figures for nodes used for accounting purposes

User Security and eTokens

Using an Aladdin eToken PRO with grid certificates (including gridproxy generation)
Using an Aladdin eToken PRO to store grid certificates
Using an Aladdin eToken PRO with grid certificates (including gridproxy generation)