Introduction Grid Computing Lab Course Overview

From PDP/Grid Wiki
Revision as of 22:35, 26 August 2005 by Davidg@nikhef.nl (talk | contribs)

Structure

The aim of the lab course is to install, deploy and operate a mini-grid, with some applications and services. The entire mini-grid will be built and run by the students participating in the course (with some help from the tutors, of course). At the end of the lab course you will know what a grid is, be able to build one, and understand what is needed to make it useful for applications.

A grid contains a few components that you cannot do without:

  • a common trust domain (authentication)
  • communities of resources and users (authorization)
  • an information service

and of course some services to make the grid useful, like

  • a job submission service
  • data movement or indexing
  • workload management
  • database access
  • your favourite custom services ...

For each of these, literature and documentation are given below, together with one or two projects (assignments) to be picked up by a team of students (say, 2-3 students per project).

Authentication

Trust in the grid today is established via a Public Key Infrastructure (PKI). Every entity in the system is issued a "certificate" that links an identifier (the person's name, or a DNS name) to a piece of unique cryptographic data (an RSA keypair, for instance). These certificates usually have a limited lifetime when stored in a file, or are carried on hardware tokens like smart cards and USB keys.

Commercial providers, like Verisign, Thawte, or Entrust, operate a Certification Authority and sell X.509 public key certificates.

You can also set up an X.509 Certification Authority (CA) yourself. The simplest way is to use the OpenSSL commands, which even come with a shell script to automate the task. More complete functionality can be found in OpenCA. Recent versions of the Globus Toolkit also come with a package called "globus-simple-ca".
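As a rough illustration of the OpenSSL route, the sketch below assembles the three command lines that a very small CA needs: a self-signed CA certificate, a user key with a signing request, and the signing step. The file names, subject names and lifetimes are assumptions made for this example, the keys are left unencrypted (`-nodes`), and nothing here is a substitute for the CA script shipped with OpenSSL or for globus-simple-ca. By default the commands are only printed; `run_all` executes them if the `openssl` binary is installed.

```python
import shlex
import subprocess
from pathlib import Path


def ca_commands(workdir, ca_subject, user_subject, ca_days=365, user_days=30):
    """Return the openssl command lines for a toy CA and one user certificate.

    All file names below are illustrative choices, not a required layout.
    """
    d = Path(workdir)
    ca_key, ca_crt = str(d / "ca.key"), str(d / "ca.crt")
    user_key, user_csr, user_crt = (
        str(d / name) for name in ("user.key", "user.csr", "user.crt")
    )
    return [
        # 1. Self-signed CA certificate with a fresh RSA key (unencrypted: lab use only).
        ["openssl", "req", "-x509", "-newkey", "rsa:2048", "-nodes",
         "-keyout", ca_key, "-out", ca_crt,
         "-days", str(ca_days), "-subj", ca_subject],
        # 2. User key and certificate signing request (CSR).
        ["openssl", "req", "-new", "-newkey", "rsa:2048", "-nodes",
         "-keyout", user_key, "-out", user_csr, "-subj", user_subject],
        # 3. The CA signs the request, producing a short-lived user certificate.
        ["openssl", "x509", "-req", "-in", user_csr,
         "-CA", ca_crt, "-CAkey", ca_key, "-CAcreateserial",
         "-out", user_crt, "-days", str(user_days)],
    ]


def run_all(commands):
    """Execute the commands in order; raises on the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    cmds = ca_commands("lab-ca",
                       "/O=GridLab/CN=GridLab Toy CA",
                       "/O=GridLab/CN=Jane Student")
    for argv in cmds:
        print(shlex.join(argv))  # print only; call run_all(cmds) to execute
```

The working directory (here `lab-ca`) must exist before `run_all` is called. A real CA would also protect its private key with a passphrase and keep a record of every certificate it issues.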

Establishing a trust domain is non-trivial (see, e.g., the EUGridPMA or IGTF web sites), and it raises issues like the validity period of the certificates, revocation lists or CRLs, and on-line status checking via OCSP.
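To make these checks concrete, here is a minimal sketch of the decisions a relying party takes before accepting a certificate: is the issuer trusted, is the certificate inside its validity period, and is its serial number on the issuer's revocation list. `ToyCert` and `check_cert` are hypothetical names invented for this example, and no real X.509 parsing or signature verification takes place; a dictionary of revoked serial numbers stands in for the CRLs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ToyCert:
    """Simplified stand-in for an X.509 certificate (no real cryptography)."""
    serial: int
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime


def check_cert(cert, trusted_issuers, revoked_serials, now):
    """Return 'ok', or the reason why the certificate is rejected.

    revoked_serials maps an issuer name to the set of serial numbers that
    issuer has revoked, playing the role of one CRL per CA.
    """
    if cert.issuer not in trusted_issuers:
        return "untrusted issuer"
    if now < cert.not_before:
        return "not yet valid"
    if now > cert.not_after:
        return "expired"
    if cert.serial in revoked_serials.get(cert.issuer, set()):
        return "revoked"
    return "ok"
```

A CRL is only as fresh as its last download, which is why on-line status checking is attractive: the same question ("is this serial number revoked?") is put to the CA's responder at the moment the certificate is used.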


Project proposals

  • Build a simple CA service, e.g. based on OpenSSL, that can be used by your fellow students to obtain certificates.
  • Describe the way in which you would identify entities, and what the level of trust in your certificates should be. Describe the limitations, vulnerabilities, and possible attack vectors.
  • Build a more scalable system, incorporating Registration Authorities, and on-line checking of the status of your certificates (using an independent client program).
  • Integrate on-line checks in a piece of middleware (optional)


Authorization

Users and resources in a grid are grouped in Virtual Organisations (VOs). A VO can be based on a list of users stored in an LDAP directory, on attributes that the VO issues to the user and that are embedded in the proxy certificate (as in VOMS), or on a Community Authorization Service (CAS) that issues the proxy to the user.

The proxy certificate is the basis for grid authorization today, and enables single sign-on. To access these proxy certs from web portals (and for proxy renewal for long-running jobs), a MyProxy service has been built. This MyProxy service is required for portal operations.

Literature

Project proposals

  • Provide a VO management service for the two grid clusters that will be built later on (this can best be done with a VO-LDAP server).
  • Old-style systems required the system administrators of a grid site to maintain a file (grid-mapfile) with a list of the authorized users. With VO-LDAP and VOMS, the membership list can be maintained in a central directory for the VO. What else is needed for smooth operation with a VO-LDAP, i.e. how to prevent the sysadmin from having to type something for each new member? (keywords: gridmapdir, LCMAPS, WorkSpace Service/WSS).
  • Set up a CAS service (with GT4) and CAS-enable an example service.
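For the grid-mapfile question above, the following sketch may help as a starting point. It assumes the common grid-mapfile layout of one entry per line, with the certificate subject name in double quotes followed by a local account name. `parse_gridmap` reads such a file, and `render_gridmap` regenerates it from a VO membership list, which is the step a central directory lets you automate. Both function names and the sample DNs are invented for this example; the leading dot in `.gridlab` follows the pool-account convention used with gridmapdir, which you should verify against the documentation of the middleware you install.

```python
import shlex


def parse_gridmap(text):
    """Parse grid-mapfile text into a dict mapping subject DN -> local account."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the double quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"malformed grid-mapfile line: {line!r}")
        dn, account = parts
        mapping[dn] = account
    return mapping


def render_gridmap(member_dns, account):
    """Produce grid-mapfile text mapping every VO member to the same account."""
    return "".join(f'"{dn}" {account}\n' for dn in sorted(member_dns))


if __name__ == "__main__":
    # Hypothetical membership list, as it might be fetched from a VO directory.
    members = ["/O=GridLab/CN=Jane Student", "/O=GridLab/CN=Joe Student"]
    print(render_gridmap(members, ".gridlab"), end="")
```

Run periodically against the VO directory, a generator like this removes the per-user typing on the site; what remains is deciding which local account each member actually gets, which is where gridmapdir and LCMAPS come in.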

Information Services

A grid consists of many autonomous resources that come and go. A resource information system for finding the resources available to you is therefore vitally important. The system must be stable, scale to several hundred sites and hundreds of queries per second, and be universally understood.

Information systems have evolved significantly over the years. The Globus Toolkit shipped originally with the "Metacomputing Directory Service" (later renamed to Monitoring and Discovery Service, MDS). The information was presented via an LDAP interface with a proprietary schema. This system later evolved into the

With Condor you get its own monitoring system, Hawkeye.

There are also various management presentation tools, such as GridICE, MapCenter, GOC Monitor, etc.

Literature