Practical details
Several MPI implementations are available. I have fetched and
compiled MPICH. Manual pages
are available in /chalmers/sw/unsup64/mpich-3.1.3/share/man
(in the file system). Append this path to your MANPATH environment
variable, and you can then type
man name_of_MPI_routine
in a terminal window to get help about name_of_MPI_routine.
This is how you use the MPICH-software:
I have set up the distribution in /chalmers/sw/unsup64/mpich-3.1.3.
The binaries are located in the bin-subdirectory. Set
your PATH environment variable so that the shell finds these versions
of the binaries. You should, for example, be able to type
which mpif90
and the shell should respond with
/chalmers/sw/unsup64/mpich-3.1.3/bin/mpif90
Make sure you get the correct mpiexec as well. Note that
there are other MPI-distributions installed in the system, but use the
one I have installed.
You will find documentation in the share-subdirectory (in
particular in share/man and share/doc).
To compile your program, type one of:
mpicc my_prog.c
mpic++ my_prog.cc
mpicxx my_prog.cc
mpif90 my_prog.f90
mpif77 my_prog.f
To run your program, use mpiexec. You can run more
processes than you have CPUs, but then you do not get
true parallel computing, nor should you expect the best speedup.
In this example we run two (-n 2) processes:
mpiexec -n 2 ./a.out
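To check whether a given -n gives more processes than the machine has cores, one can ask the system how many cores are online. The following is only a sketch: the helper names online_cpu_count and oversubscribed are my own (hypothetical), and it assumes a Linux system where sysconf accepts _SC_NPROCESSORS_ONLN.

```c
#include <unistd.h>   /* sysconf */

/* Hypothetical helper: number of cores currently online.
   Assumes Linux/glibc, where _SC_NPROCESSORS_ONLN is available.
   Falls back to 1 if the query fails. */
long online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n < 1) ? 1 : n;
}

/* Hypothetical helper: returns 1 if nprocs processes would have
   to share cores on this machine, 0 otherwise. */
int oversubscribed(int nprocs)
{
    return nprocs > online_cpu_count();
}
```

Call oversubscribed(n) from your own main with the value you intend to pass to -n. On a four-core student machine, any n greater than four should give 1.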
The following instructions may not work due to security issues.
The student machines have four cores. If you want to run on
several computers, create a hostfile (named hostfile
in the example) containing a list of system names, one name per
row, e.g.:
host1
host2:4
host3:2
This means: run one process on host1, four on host2
and two on host3. The Linux command hostname
prints the appropriate name of a system. This is
how you start the seven processes:
mpiexec -f hostfile ./a.out
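The number seven is just the sum of the counts in the hostfile (1 + 4 + 2), where a name without a count stands for one process. As a sketch of that arithmetic, here is a small helper that sums the counts. The name count_hostfile_processes is hypothetical, and the parsing is my own simplification (only the name and name:count forms shown above; a missing or unparsable count is treated as 1), not how mpiexec itself reads the file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sum the process counts in a hostfile
   whose lines have the form "name" or "name:count". */
int count_hostfile_processes(FILE *fp)
{
    char line[256];
    int  total = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';    /* strip the line ending */
        if (line[0] == '\0')
            continue;                           /* skip empty lines */

        char *colon = strchr(line, ':');
        if (colon == NULL) {
            total += 1;                         /* no count: one process */
        } else {
            long n = strtol(colon + 1, NULL, 10);
            total += (n > 0) ? (int) n : 1;     /* assumption: bad count -> 1 */
        }
    }
    return total;
}
```

Open the hostfile with fopen and pass the resulting FILE pointer to the function. For the three-line example above it returns 7.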
In the following example-program each process will print out the rank
and the name of the host the process is executing on.
#include <stdio.h>
#include <unistd.h>   /* gethostname */
#include "mpi.h"

int main(int argc, char *argv[])
{
  int    my_rank;
  size_t len = 50;
  char   host_name[len];

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

  gethostname(host_name, len);
  host_name[len - 1] = '\0';  /* a truncated name need not be terminated */

  printf("%2d %s\n", my_rank, host_name);

  MPI_Finalize();
  return 0;
}
Using the hostfile:
remote11.chalmers.se:2
remote12.chalmers.se:2
One may get the following printout (I gave the mpiexec-command
on remote11). The order may change between runs.
% mpiexec -f hostfile ./a.out
0 remote11.chalmers.se
2 remote12.chalmers.se
1 remote11.chalmers.se
3 remote12.chalmers.se
Note that these machines may be heavily loaded, so do not expect a good speedup.
With gfortran you can use the subroutine hostnm (a GNU extension)
to retrieve the hostname, e.g.:
program main
character (len = 60) :: name
call hostnm(name)
print*, name
end
A debug hint:
MPI may be buffering the output from print-statements, which means that
you may not see a debug-printout (fairly likely if your program is
incorrect, e.g. if it crashes or hangs before the buffers are emptied).
In order to see the printout you can flush the print-buffers.
In Fortran:
print*, 'x = ', x ! for example
call flush ! force printout
In C:
printf("x = %e\n", x); // for example
fflush(stdout); // force printout
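If you have many debug-printouts in C, an alternative to calling fflush after each one is to switch off the buffering of stdout once and for all with the standard C function setvbuf. This is only a sketch; the helper name make_stdout_unbuffered is my own (hypothetical).

```c
#include <stdio.h>

/* Hypothetical helper: make stdout unbuffered, so that every
   printf is written out immediately. Call it once, at the start
   of main, before anything has been printed.
   Returns 0 on success, nonzero on failure. */
int make_stdout_unbuffered(void)
{
    return setvbuf(stdout, NULL, _IONBF, 0);
}
```

This only affects the C library buffer inside each process. The order in which the lines from different processes appear in the terminal may still change between runs. Unbuffered output is slower, so remove the call when you are done debugging.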
