Using the Grid/Job Advanced MPI script
Introduction
MPI jobs on the Grid are slightly different from ordinary jobs: they run on a single site, and you can request a number of nodes per job. On the Life Science Grid the clusters have 16 nodes, so if you request more nodes your job can only run on the larger clusters; currently those are the cluster in Groningen and Gina. Several flavors of MPI are available, but MPICH and OpenMPI are the most common on the Grid.
To request four CPUs for an MPICH job, add JobType = "MPICH"; and NodeNumber = 4; to the JDL file.
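As a minimal sketch, the two JDL attributes look like this in context (the Requirements line, which asks for sites that advertise MPICH support, is taken from the full example under Example files below):

```
JobType      = "MPICH";
NodeNumber   = 4;
Requirements = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);
```

Without the Requirements clause the workload management system may match your job to a site that cannot run MPICH binaries.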
Submission
The three files from the example below can be submitted with:
user$ startGridSession lsgrid
user$ glite-wms-job-submit -d $USER -o myjobs test-mpi.jdl
Use the usual commands to check the status, retrieve the output, or cancel your job:
user$ glite-wms-job-status -i myjobs
user$ glite-wms-job-output --dir ./my_output -i myjobs
user$ glite-wms-job-cancel -i myjobs
Example files
file: test-mpi.jdl
Type           = "Job";
JobType        = "MPICH";
NodeNumber     = 4;
Executable     = "test-mpi.sh";
Arguments      = "test-mpi";
StdOutput      = "test-mpi.out";
StdError       = "test-mpi.err";
InputSandbox   = {"test-mpi.sh","test-mpi.c"};
OutputSandbox  = {"test-mpi.err","test-mpi.out","mpiexec.out"};
Requirements   = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);
file: test-mpi.sh
#!/bin/sh -x

# the binary to execute
EXE=$1

echo "***********************************************************************"
echo "Running on: $HOSTNAME"
echo "As: " `whoami`
echo "***********************************************************************"

echo "***********************************************************************"
echo "Compiling binary: $EXE"
echo mpicc -o ${EXE} ${EXE}.c
mpicc -o ${EXE} ${EXE}.c
echo "*************************************"

# Find the machine file for the local batch system (PBS or LSF)
if [ "x$PBS_NODEFILE" != "x" ] ; then
  echo "PBS Nodefile: $PBS_NODEFILE"
  HOST_NODEFILE=$PBS_NODEFILE
fi
if [ "x$LSB_HOSTS" != "x" ] ; then
  echo "LSF Hosts: $LSB_HOSTS"
  HOST_NODEFILE=`pwd`/lsf_nodefile.$$
  for host in ${LSB_HOSTS}
  do
    echo $host >> ${HOST_NODEFILE}
  done
fi
if [ "x$HOST_NODEFILE" = "x" ]; then
  echo "No hosts file defined. Exiting..."
  exit
fi

echo "***********************************************************************"
CPU_NEEDED=`cat $HOST_NODEFILE | wc -l`
echo "Node count: $CPU_NEEDED"
echo "Nodes in $HOST_NODEFILE: "
cat $HOST_NODEFILE
echo "***********************************************************************"

echo "***********************************************************************"
echo "Checking ssh for each node:"
NODES=`cat $HOST_NODEFILE`
for host in ${NODES}
do
  echo "Checking $host..."
  ssh $host "hostname; set; ls -l `which mpirun`; rpm -qf `which mpirun`; rpm -qa | grep mpi; hostname"
done
echo "***********************************************************************"

echo "***********************************************************************"
echo "Executing $EXE with mpiexec"
chmod 755 $EXE
mpiexec `pwd`/$EXE > mpiexec.out 2>&1
echo "***********************************************************************"
file: test-mpi.c
/* hello.c
 *
 * Simple "Hello World" program in MPI.
 */
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int numprocs;   /* Number of processors */
    int procnum;    /* Processor number */

    /* Initialize MPI */
    MPI_Init(&argc, &argv);

    /* Find this processor number */
    MPI_Comm_rank(MPI_COMM_WORLD, &procnum);

    /* Find the number of processors */
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    printf("Hello world! from processor %d out of %d\n", procnum, numprocs);

    /* Shut down MPI */
    MPI_Finalize();

    return 0;
}