Using the Grid/Job Advanced MPI script
MPI Job
Introduction
MPI jobs on the Grid are slightly different: each job runs on a single site, and you can request a number of nodes per job. On the lifescience Grid the clusters have 16 nodes, so if you request more nodes than that, your job can only run on the larger clusters; currently these are the cluster in Groningen and Gina. Several flavors of MPI are available, but MPICH and OpenMPI are the most common on the Grid.
To request four CPUs for an MPICH job, add JobType = "MPICH"; and NodeNumber = 4; to the JDL file.
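In JDL syntax these attributes look as follows; the Requirements line makes sure the job only lands on sites that advertise an MPICH runtime (all three lines also appear in the full example further down):

```text
JobType      = "MPICH";
NodeNumber   = 4;
Requirements = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);
```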
Submission
You can submit the three files from the example below with:
user$ startGridSession lsgrid
user$ glite-wms-job-submit -d $USER -o myjobs test-mpi.jdl
Use the usual commands to check the status, retrieve the output, and cancel your job if necessary:
user$ glite-wms-job-status -i myjobs
user$ glite-wms-job-output --dir ./my_output -i myjobs
user$ glite-wms-job-cancel -i myjobs
Example files
file: test-mpi.jdl
Type = "Job";
JobType = "MPICH";
NodeNumber = 4;
Executable = "test-mpi.sh";
Arguments = "test-mpi";
StdOutput = "test-mpi.out";
StdError = "test-mpi.err";
InputSandbox = {"test-mpi.sh","test-mpi.c"};
OutputSandbox = {"test-mpi.err","test-mpi.out","mpiexec.out"};
Requirements = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);
file: test-mpi.sh
#!/bin/sh -x
# the binary to execute
EXE=$1
echo "***********************************************************************"
echo "Running on: $HOSTNAME"
echo "As: " `whoami`
echo "***********************************************************************"
echo "***********************************************************************"
echo "Compiling binary: $EXE"
echo mpicc -o ${EXE} ${EXE}.c
mpicc -o ${EXE} ${EXE}.c
echo "*************************************"
if [ "x$PBS_NODEFILE" != "x" ] ; then
echo "PBS Nodefile: $PBS_NODEFILE"
HOST_NODEFILE=$PBS_NODEFILE
fi
if [ "x$LSB_HOSTS" != "x" ] ; then
echo "LSF Hosts: $LSB_HOSTS"
HOST_NODEFILE=`pwd`/lsf_nodefile.$$
for host in ${LSB_HOSTS}
do
echo $host >> ${HOST_NODEFILE}
done
fi
if [ "x$HOST_NODEFILE" = "x" ]; then
echo "No hosts file defined. Exiting..."
exit 1
fi
echo "***********************************************************************"
CPU_NEEDED=`cat $HOST_NODEFILE | wc -l`
echo "Node count: $CPU_NEEDED"
echo "Nodes in $HOST_NODEFILE: "
cat $HOST_NODEFILE
echo "***********************************************************************"
echo "***********************************************************************"
echo "Checking ssh for each node:"
NODES=`cat $HOST_NODEFILE`
for host in ${NODES}
do
echo "Checking $host..."
ssh $host "hostname; set;ls -l `which mpirun`;rpm -qf `which mpirun`;rpm -qa | grep mpi;hostname"
done
echo "***********************************************************************"
echo "***********************************************************************"
echo "Executing $EXE with mpiexec"
chmod 755 $EXE
mpiexec `pwd`/$EXE > mpiexec.out 2>&1
echo "***********************************************************************"
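The LSF branch of the script above turns the space-separated LSB_HOSTS list into a nodefile with one host per line, which is then used to count the CPUs. That conversion can be exercised standalone; the host names below are made-up example values:

```shell
#!/bin/sh
# Standalone sketch of the LSB_HOSTS -> nodefile conversion from test-mpi.sh.
# node001..node004 are hypothetical host names, not real cluster nodes.
LSB_HOSTS="node001 node002 node003 node004"
HOST_NODEFILE=lsf_nodefile.$$
for host in ${LSB_HOSTS}
do
  echo $host >> ${HOST_NODEFILE}
done
# One line per host, so the line count equals the number of CPUs requested
COUNT=`wc -l < ${HOST_NODEFILE}`
echo "Node count: $COUNT"
cat ${HOST_NODEFILE}
rm -f ${HOST_NODEFILE}
```

On a PBS site this step is unnecessary, because $PBS_NODEFILE already points at a file in this one-host-per-line format.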
file: test-mpi.c
/* test-mpi.c
*
* Simple "Hello World" program in MPI.
*
*/
#include "mpi.h"
#include <stdio.h>
int main(int argc, char *argv[])
{
int numprocs; /* Number of processors */
int procnum; /* Processor number */
/* Initialize MPI */
MPI_Init(&argc, &argv);
/* Find this processor number */
MPI_Comm_rank(MPI_COMM_WORLD, &procnum);
/* Find the number of processors */
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
printf ("Hello world! from processor %d out of %d\n", procnum, numprocs);
/* Shut down MPI */
MPI_Finalize();
return 0;
}