Difference between revisions of "User:Dennisvd@nikhef.nl/mpi"
From PDP/Grid Wiki
Latest revision as of 15:40, 12 June 2009
OK, so maybe not heroic. But my attempts at getting MPI running have been an uphill battle.
I have only partly succeeded so far, so here are some notes:
- I used passwordless host-based ssh logins between the nodes
- I used Torque as the batch system
- I used the RHEL4-provided openmpi (which does not include the tm module, hence the ssh setup)
- It worked with torque and mpirun
- ... but not with mpi-start.
- It does not yet work with CREAM, as all the processes are run on the same node.
One thing I had forgotten was to set the MPI_*_PATH variables that mpi-start needs. Without them it fell back to /opt/i2g/openmpi, which did not distribute the work as planned. After I straightened this out, I got the same result: everything ran on the same node.
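A quick way to see which of these variables are actually set in a job's environment is a small check like the one below. This is only a sketch: it matches names against the `MPI_*_PATH` pattern from the note above and does not know which specific variables mpi-start reads. The helper names `find_mpi_path_vars` and `missing_dirs` are made up for this example.

```python
import fnmatch
import os


def find_mpi_path_vars(environ=None):
    """Return {name: value} for every variable whose name matches MPI_*_PATH."""
    env = os.environ if environ is None else environ
    return {
        name: value
        for name, value in env.items()
        if fnmatch.fnmatchcase(name, "MPI_*_PATH")
    }


def missing_dirs(path_vars):
    """Return the sorted names whose value is not an existing directory."""
    return sorted(
        name for name, value in path_vars.items() if not os.path.isdir(value)
    )


if __name__ == "__main__":
    found = find_mpi_path_vars()
    if not found:
        # Per the note above, mpi-start then falls back to a default location.
        print("no MPI_*_PATH variables are set")
    for name in sorted(found):
        print(f"{name}={found[name]}")
    for name in missing_dirs(found):
        print(f"warning: {name} does not point to an existing directory")
```

Running this inside the job script (rather than on the submit host) shows the environment that mpi-start will actually see.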
mpi-start is supposed to do the right thing, but it did not. The debug message I got clarified things:
found openmpi and PBS, don't set machinefile
which means that the call to mpirun (or mpiexec) does not include the -machinefile option it needs. You wouldn't need it if openmpi came with the PBS startup support (the tm module mentioned above), but in this case it does not.
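For comparison, this is roughly what a launch with an explicit machine file looks like when it is assembled by hand. It is a sketch, not what mpi-start does internally. It assumes Torque exports `PBS_NODEFILE` (one line per allocated slot) and that the installed mpirun accepts `-np` and `-machinefile`. The program name `./my_mpi_program` and the helper names are placeholders.

```python
import os
from collections import Counter


def read_nodefile(path):
    """Return the host names in a Torque node file, one entry per slot."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def slots_per_host(nodefile):
    """Count how many slots each host contributes to the job."""
    return Counter(read_nodefile(nodefile))


def build_mpirun_command(program, nodefile, launcher="mpirun"):
    """Build an mpirun argument list that passes the machine file explicitly."""
    hosts = read_nodefile(nodefile)
    return [launcher, "-np", str(len(hosts)), "-machinefile", nodefile, program]


if __name__ == "__main__":
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile is None:
        print("PBS_NODEFILE is not set; this is probably not a Torque job")
    else:
        # More than one host here means the job was given several nodes.
        print(dict(slots_per_host(nodefile)))
        print(" ".join(build_mpirun_command("./my_mpi_program", nodefile)))
```

If the slot count shows several hosts but all processes still land on one node, the machine file is not reaching mpirun, which matches the debug message above.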
References
- http://egee-uig.web.cern.ch/egee-uig/production_pages/MPIJobs.html Official EGEE user documentation on using MPI.
- http://www.grid.ie/mpi/wiki/YaimConfig YAIM configuration notes for MPI sites.