Using the Grid/Job Advanced MPI script
		
		
		
		
		
		
		
	
MPI Job
Introduction
MPI jobs on the Grid differ slightly from regular jobs: each job runs on a single site, and you can request a number of nodes per job. On the Life Science Grid the clusters have 16 nodes. If you request more nodes than that, your job can only run on the larger clusters, currently the cluster in Groningen and Gina. Several flavors of MPI are available; MPICH and OpenMPI are the most common on the Grid.
To request four CPUs for an MPICH job, add JobType = "MPICH"; and NodeNumber = 4; to the JDL file.
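Before submitting, it can help to confirm that a JDL file really carries both attributes. The following is a minimal sketch, not part of the Grid tooling: the function name `jdl_node_number` is hypothetical, and it assumes each attribute sits on its own line, as in the example file on this page.

```shell
#!/bin/sh
# Hypothetical helper: print the NodeNumber requested by an MPICH JDL file.
# Returns non-zero if JobType is not "MPICH" or NodeNumber is missing.
jdl_node_number() {
  jdl=$1
  # The node request only applies to jobs of type MPICH
  grep -Eq '^[[:space:]]*JobType[[:space:]]*=[[:space:]]*"MPICH"[[:space:]]*;' "$jdl" || return 1
  # Extract the integer that follows "NodeNumber ="
  n=$(sed -n 's/^[[:space:]]*NodeNumber[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*;.*$/\1/p' "$jdl")
  [ -n "$n" ] || return 1
  echo "$n"
}
```

Called as `jdl_node_number test-mpi.jdl` on the example JDL from this page, it prints `4`.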
Submission
You can submit the three example files below with:
user$ startGridSession lsgrid
user$ glite-wms-job-submit -d $USER -o myjobs test-mpi.jdl
Use the usual commands to check the status, retrieve the output, and if necessary cancel your job:
user$ glite-wms-job-status -i myjobs
user$ glite-wms-job-output --dir ./my_output -i myjobs
user$ glite-wms-job-cancel -i myjobs
Example files
file: test-mpi.jdl
Type = "Job";
JobType = "MPICH";
NodeNumber = 4;
Executable = "test-mpi.sh";
Arguments = "test-mpi";
StdOutput = "test-mpi.out";
StdError = "test-mpi.err";
InputSandbox = {"test-mpi.sh","test-mpi.c"};
OutputSandbox = {"test-mpi.err","test-mpi.out","mpiexec.out"};
Requirements = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);
file: test-mpi.sh
#!/bin/sh -x
# the binary to execute
EXE=$1 
echo "***********************************************************************" 
echo "Running on: $HOSTNAME" 
echo "As:       " `whoami` 
echo "***********************************************************************" 
echo "***********************************************************************" 
echo "Compiling binary: $EXE" 
echo mpicc -o ${EXE} ${EXE}.c
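# Stop early if compilation fails, so mpiexec is not run on a missing binary
mpicc -o ${EXE} ${EXE}.c || { echo "Compilation of ${EXE}.c failed.  Exiting..."; exit 1; }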
echo "*************************************" 
if [ "x$PBS_NODEFILE" != "x" ] ; then 
  echo "PBS Nodefile: $PBS_NODEFILE" 
  HOST_NODEFILE=$PBS_NODEFILE 
fi
if [ "x$LSB_HOSTS" != "x" ] ; then 
  echo "LSF Hosts: $LSB_HOSTS" 
  HOST_NODEFILE=`pwd`/lsf_nodefile.$$ 
  for host in ${LSB_HOSTS} 
  do 
    echo $host >> ${HOST_NODEFILE} 
  done 
fi
if [ "x$HOST_NODEFILE" = "x" ]; then
  echo "No hosts file defined.  Exiting..."
  exit 1
fi 
echo "***********************************************************************" 
CPU_NEEDED=`cat $HOST_NODEFILE | wc -l` 
echo "Node count: $CPU_NEEDED"
echo "Nodes in $HOST_NODEFILE: "
cat $HOST_NODEFILE
echo "***********************************************************************" 
echo "***********************************************************************" 
echo "Checking ssh for each node:"
NODES=`cat $HOST_NODEFILE`
for host in ${NODES}
do
  echo "Checking $host..." 
  # Single quotes: the `which mpirun` lookups must run on the remote node, not locally
  ssh $host 'hostname; set; ls -l `which mpirun`; rpm -qf `which mpirun`; rpm -qa | grep mpi; hostname'
done
echo "***********************************************************************" 
echo "***********************************************************************" 
echo "Executing $EXE with mpiexec" 
chmod 755 $EXE 
mpiexec `pwd`/$EXE > mpiexec.out 2>&1 
echo "***********************************************************************" 
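The node count printed by the script is simply the number of lines in the host file. With PBS, the node file typically lists a host once per allocated slot, so the same hostname can appear several times. The following is a minimal sketch for summarizing slots per host; the function name `slots_per_host` is hypothetical and not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper: print "<host> <slots>" for each distinct host in a node file.
# Assumes one hostname per line, as in $PBS_NODEFILE or the generated LSF node file.
slots_per_host() {
  # sort groups identical hostnames, uniq -c counts them,
  # awk swaps the columns to "<host> <count>"
  sort "$1" | uniq -c | awk '{ print $2, $1 }'
}
```

Inside test-mpi.sh it could be called as `slots_per_host "$HOST_NODEFILE"` after the node file has been determined.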
file: test-mpi.c
/*  test-mpi.c
 *
 *  Simple "Hello World" program in MPI.
 *
 */
   
#include "mpi.h"
#include <stdio.h>
int main(int argc, char *argv[])
{
  int numprocs;  /* Number of processors */
  int procnum;   /* Processor number */
  /* Initialize MPI */
  MPI_Init(&argc, &argv);
  /* Find this processor number */
  MPI_Comm_rank(MPI_COMM_WORLD, &procnum);
  /* Find the number of processors */
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  printf ("Hello world! from processor %d out of %d\n", procnum, numprocs);
  /* Shut down MPI */
  MPI_Finalize();
  return 0;
}
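Once the output sandbox has been retrieved, mpiexec.out should contain one greeting line per MPI process. The following is a minimal sketch for checking this; the function name `check_hello_output` is hypothetical, and the pattern assumes the exact printf format used in test-mpi.c above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if <file> contains exactly <expected>
# greeting lines in the format printed by test-mpi.c.
check_hello_output() {
  file=$1
  expected=$2
  # grep -c prints the number of matching lines (and exits non-zero when it is 0)
  actual=$(grep -c '^Hello world! from processor [0-9][0-9]* out of [0-9][0-9]*$' "$file") || true
  [ "$actual" -eq "$expected" ]
}
```

For the four-node example job, `check_hello_output ./my_output/mpiexec.out 4` would be the matching call; the exact path depends on where the output was retrieved to.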