MPI Newbie - some questions about how mpirun and process control work

First of all, I am not a programmer by profession, but I need to write the code for my project myself (I do have experience with C++ and Python). I have often come here when I was stuck, and most of the time I found good solutions. Now I have some fundamental questions about MPI programming, and I cannot continue until I understand the concepts.

Here is a description of my problem.

I would like to create code for an algorithm for scientific computation. The code can be divided into 2 parts.

A.) Matrix vector multiplication and matrix inversion. This part is relatively simple, and I even have my own working MPI code for this part

B.) Calling an external program with MPI support for more complex computation (this part should also be simple, because it just calls the UNIX command line).

The problem I am facing is how to combine these two parts. My algorithm is as follows:

for k in specified range
   Divide the state vector of size 6NMx1 into M blocks and let each of the M nodes handle one block
   Manipulate the state vector of size 6NMx1 according to A.) in parallel
   After A.) is done, run B.) using the M nodes in parallel /* THIS IS WHERE I GOT STUCK */
   Update the state vector
end for
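The first step of the loop, splitting the state vector into M contiguous blocks, can be sketched as follows. This is only an illustration in plain Python: the helper name `block_ranges` and the sizes `N = 2`, `M = 3` are hypothetical, and a real MPI code would compute the same index ranges per rank.

```python
def block_ranges(total_len, num_blocks):
    """Split range(total_len) into num_blocks contiguous (start, stop) pairs."""
    base, extra = divmod(total_len, num_blocks)
    ranges = []
    start = 0
    for b in range(num_blocks):
        # The first `extra` blocks take one additional element when the
        # length does not divide evenly.
        size = base + (1 if b < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Hypothetical sizes, for illustration only.
N, M = 2, 3
state = list(range(6 * N * M))  # stand-in for the 6NM x 1 state vector
blocks = [state[a:b] for a, b in block_ranges(len(state), M)]
```

With a length of 6NM every block has exactly 6N entries, so the remainder branch only matters for lengths that are not a multiple of M.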

To run B.), I have to invoke a UNIX command through mpirun:

mpirun -np #PPN my_app > some_output

My questions:

  • How does mpirun work? Does calling it spawn new processes?

  • Say I use M cluster compute nodes, each with 16 processors. If I use only 1 process on a node to invoke the above UNIX command, will it spawn 16 more processes? If so, I would end up with 256M processes, am I right?

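  • What I want is for each node (which holds one 6Nx1 block of the state vector) to run B.) on its own block, and for every node to return to A.) once B.) is done. Is there a way to control this from within MPI? Alternatively, I could run A.) and B.) separately from a Python script.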

Python script:

for k in specified range
   mpirun A.) --> This is straightforward for me
   mpirun B.)
end for
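A driver loop of this kind might look like the sketch below, using only the standard `subprocess` module. The program names `./part_a` and `./part_b` and the helper names are hypothetical; the only mpirun option used is `-np`, as in the command above. The `runner` parameter exists so the loop can be exercised without an MPI installation.

```python
import subprocess


def mpirun_cmd(num_procs, program):
    """Build the argument list for `mpirun -np <num_procs> <program>`."""
    return ["mpirun", "-np", str(num_procs), program]


def drive(num_iterations, num_procs, runner=subprocess.run):
    """Run part A and then part B once per iteration, strictly in sequence."""
    for _ in range(num_iterations):
        # check=True makes subprocess.run raise CalledProcessError if the
        # command exits with a nonzero status, so a failed step stops the loop.
        runner(mpirun_cmd(num_procs, "./part_a"), check=True)  # hypothetical name
        runner(mpirun_cmd(num_procs, "./part_b"), check=True)  # hypothetical name
```

Calling `drive(10, 64)` would start 20 separate mpirun jobs one after another; each call blocks until its job has finished, so B.) never starts before A.) is done.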

B.)

/* THIS PROGRAM SHOULD HAVE 16M PROCESSES */
if rank % 16 == 0
   mpirun -np 16 my_app > output
end if
/* I WANT M CALLS TO THIS PROGRAM IN PARALLEL */
MPI_COMM.BARRIER
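One alternative to starting mpirun from inside MPI ranks is to start the M jobs from an ordinary (non-MPI) Python process and wait for all of them, which plays the role of the barrier. The sketch below assumes this is acceptable for the use case; `launch_all` is a hypothetical helper, and placing each job on its own node is left to the scheduler or a hostfile and is not shown here.

```python
import subprocess


def launch_all(commands, output_paths):
    """Start all commands concurrently, each writing stdout to its own file,
    then wait for every one of them and return their exit codes."""
    procs, files = [], []
    try:
        for cmd, path in zip(commands, output_paths):
            out = open(path, "w")
            files.append(out)
            # Popen returns immediately, so the jobs run side by side.
            procs.append(subprocess.Popen(cmd, stdout=out))
        # Waiting on every process acts as the barrier.
        return [p.wait() for p in procs]
    finally:
        for out in files:
            out.close()
```

For the case in the question, the command list would be M copies of `["mpirun", "-np", "16", "my_app"]`, with one output path per job.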

, 16M B.)? B.), , , , A.), , !

3.) , - . , .

, , . , !:)


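mpirun is a launcher: it starts the requested number of copies of your program on the machines you tell it to use, and those processes then communicate with each other through MPI.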

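On a cluster you usually do not start the processes by hand; a job scheduler does it for you. For example, with slurm and sbatch, you write something like: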

#!/bin/bash
# number of procs on each node
#SBATCH --ntasks-per-node=2
# number of nodes
#SBATCH -N 4

srun ./a.out

This runs a.out on 4 nodes with 2 procs on each.
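To make the process counts concrete, here is a small arithmetic sketch. The value `M = 4` is hypothetical, and the last two lines only count what would be requested if mpirun were called from inside already running ranks, as described in the question; they do not claim that such nested launches work on a given MPI installation.

```python
def total_processes(num_nodes, procs_per_node):
    """Processes started when every node runs procs_per_node ranks."""
    return num_nodes * procs_per_node


# The slurm example: 4 nodes with 2 procs each.
slurm_total = total_processes(4, 2)

# The setup from the question, with a hypothetical M = 4 nodes of 16 processors.
M, PPN = 4, 16
ranks = total_processes(M, PPN)   # 16M ranks in the outer program
extra_one_per_node = M * PPN      # one launcher per node, 16 new procs each: 16M
extra_every_rank = ranks * PPN    # every rank launches 16 procs: 256M

print(slurm_total, ranks, extra_one_per_node, extra_every_rank)  # prints: 8 64 64 1024
```

So the 256M figure only arises if all 16M ranks call `mpirun -np 16`; with a single launching rank per node the extra count is 16M.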

Another option is a hybrid setup: use MPI between nodes and, inside a node, OpenMP.

MPI , , node, .

, .


Source: https://habr.com/ru/post/1530060/