Communication between individual MPI programs

I have the following problem:

Program 1 has a huge amount of data, say 10 GB. The data consists of large integer and double arrays. Program 2 runs 1..n MPI processes that use pieces of this data to compute results.

How can I send data from program 1 to MPI processes?

Using file I/O is out of the question. The compute node has ample RAM.

+4
4 answers

Depending on your MPI implementation, it should be possible to run several different programs in the same MPI job. For example, with Open MPI you can run

mpirun -n 1 big_program : -n 20 little_program 

and both programs will share the same MPI_COMM_WORLD. From there you can use the usual MPI functions to transfer your data from the big program to the little one.
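
As a rough sketch of that transfer (the chunk size is invented, and the two branches are shown in one file only for brevity; in the MPMD launch above each branch would live in its own executable, with big_program taking rank 0):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int chunk = 1000000;   /* doubles per little_program rank (illustrative) */

    if (rank == 0) {
        /* big_program: owns the full arrays, hands one piece to each rank */
        double *data = malloc((size_t)chunk * (size - 1) * sizeof(double));
        /* ... fill data ... */
        for (int dest = 1; dest < size; ++dest)
            MPI_Send(data + (size_t)chunk * (dest - 1), chunk, MPI_DOUBLE,
                     dest, 0, MPI_COMM_WORLD);
        free(data);
    } else {
        /* little_program: receive my piece and compute on it */
        double *piece = malloc(chunk * sizeof(double));
        MPI_Recv(piece, chunk, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        /* ... compute ... */
        free(piece);
    }

    MPI_Finalize();
    return 0;
}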

+4

Another answer could be that the two programs live on separate communicators: one executable could launch both sets of applications using MPI-2 dynamic process management, with the producer program communicating with the consumer application over MPI_COMM_WORLD. All of the consumer application's internal communication would then have to run inside a sub-communicator that excludes the producer's ranks, which would mean rewriting the consumer to avoid calling MPI_COMM_WORLD directly.
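
A minimal sketch of the sub-communicator part, assuming the producer runs as world rank 0 and every other rank belongs to the consumer (the launch mechanics, whether MPMD or dynamic process management, are left out):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Producer (assumed to be world rank 0) gets one color,
       all consumer ranks get another. */
    int color = (world_rank == 0) ? 0 : 1;
    MPI_Comm consumer_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &consumer_comm);

    if (color == 1) {
        /* Consumer-internal communication now uses consumer_comm instead
           of MPI_COMM_WORLD, so it never involves the producer rank. */
        int consumer_rank;
        MPI_Comm_rank(consumer_comm, &consumer_rank);
        printf("world rank %d is consumer rank %d\n",
               world_rank, consumer_rank);
    }

    MPI_Comm_free(&consumer_comm);
    MPI_Finalize();
    return 0;
}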

+1

Based on your description, β€œProgram 1” is not an MPI application, and β€œProgram 2” is not an MPI application. The shortest solution path is likely to open a socket between two programs and send data in this way. This does not require that "Program 1" be changed as an MPI program. I will start with a socket between "Program 1" and "Program 2: Rank 0", with a rank of 0, distributing the data in the remaining rows.
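
A rough sketch of the receiving side of that bridge (the port number, chunk size, and raw-double wire format are all assumptions for illustration; error handling is minimal): rank 0 accepts a TCP connection from the unmodified Program 1 and forwards one chunk to each remaining rank.

#include <mpi.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define CHUNK 1000000   /* doubles handed to each worker rank (illustrative) */

static void read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) exit(1);   /* connection lost */
        p += n;
        len -= (size_t)n;
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *chunk = malloc(CHUNK * sizeof(double));

    if (rank == 0) {
        /* Accept one connection from (unmodified) Program 1 on port 5555. */
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(5555);
        bind(srv, (struct sockaddr *)&addr, sizeof addr);
        listen(srv, 1);
        int conn = accept(srv, NULL, NULL);

        /* Forward one chunk of raw doubles to each worker rank. */
        for (int dest = 1; dest < size; ++dest) {
            read_full(conn, chunk, CHUNK * sizeof(double));
            MPI_Send(chunk, CHUNK, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);
        }
        close(conn);
        close(srv);
    } else {
        MPI_Recv(chunk, CHUNK, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        /* ... compute on the chunk ... */
    }

    free(chunk);
    MPI_Finalize();
    return 0;
}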

Several of the suggestions so far have included running a heterogeneous set of executables as a possible solution. There is no requirement that all ranks in the same MPI job come from the same executable, but this does require that both executables be "MPI programs" (i.e., contain at least the MPI_Init and MPI_Finalize calls). The level of modification required for "Program 1", and the fact that it could then no longer run outside the MPI environment, may make this option unattractive.

I would recommend avoiding the dynamic process approach unless you are using a commercial implementation whose vendor offers support for it. Connect/accept support tends to be spotty in open source MPI implementations. It may "just work", but getting help when it does not can be an open-ended problem.

+1

Mixing sockets and MPI is not recommended. The simplest way to achieve this is to bring both program 1 and program 2 into a single MPI application.

The best way to implement this is the MPMD (Multiple Program Multiple Data) programming model. As the name implies, your MPI application consists of several programs working on different data. Even though program 1 is not an MPI application, you do not need to change much: just call MPI_Init and add the send/recv routines for the data. You can think of it as a master-slave model, where Prg1 is the master and the rest are slaves that receive pieces of data to work on from the master.
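
A hedged sketch of such a master-slave loop (the chunk count, chunk size, and tags are invented for illustration, and it assumes there are at least as many chunks as slaves): the master hands out a new chunk whenever a slave returns a result, then tells the slaves to stop.

#include <mpi.h>
#include <stdlib.h>

#define CHUNK    100000    /* doubles per work unit (illustrative) */
#define TAG_WORK 1
#define TAG_STOP 2

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                    /* master: Prg1 with the big arrays */
        const long nchunks = 1000;      /* assumed >= number of slaves */
        double *data = malloc((size_t)nchunks * CHUNK * sizeof(double));
        /* ... fill data ... */
        long next = 0;
        int active = size - 1;

        /* Prime every slave with one chunk. */
        for (int dest = 1; dest < size; ++dest, ++next)
            MPI_Send(data + next * CHUNK, CHUNK, MPI_DOUBLE,
                     dest, TAG_WORK, MPI_COMM_WORLD);

        /* Hand out the rest as results come back, then stop the slaves. */
        while (active > 0) {
            double result;
            MPI_Status st;
            MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (next < nchunks) {
                MPI_Send(data + next * CHUNK, CHUNK, MPI_DOUBLE,
                         st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                ++next;
            } else {
                MPI_Send(NULL, 0, MPI_DOUBLE, st.MPI_SOURCE, TAG_STOP,
                         MPI_COMM_WORLD);
                --active;
            }
        }
        free(data);
    } else {                            /* slave: Prg2 */
        double *chunk = malloc(CHUNK * sizeof(double));
        while (1) {
            MPI_Status st;
            MPI_Recv(chunk, CHUNK, MPI_DOUBLE, 0, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;
            double result = 0.0;        /* ... compute on the chunk ... */
            MPI_Send(&result, 1, MPI_DOUBLE, 0, TAG_WORK, MPI_COMM_WORLD);
        }
        free(chunk);
    }

    MPI_Finalize();
    return 0;
}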

Another method could be a pool of workers, making program 1 the same as program 2, with each one reading part of the data file and starting to work. But you have ruled out file I/O, so I assume that programs 2..n have no access to the data file at runtime; master-slave will suit your needs best.

0

Source: https://habr.com/ru/post/1310660/

