I would have thought this would work too, but apparently it does not.
If you read through the relevant part of the MPI standard, where it actually defines the resulting type map, the reason becomes clear: MPI_Type_create_subarray maps out the region that the subarray occupies in the full array, but marches through memory in linear order, so the data layout does not change. In other words, when the sizes equal the subsizes, the subarray is simply a contiguous block of memory; and for a subarray strictly smaller than the whole array, you only change the subregion that is sent or received, not the data order. You can see the effect when choosing only a subregion:

    int sizes[]={cols,rows};
    int subsizes[]={2,4};
    int starts[]={1,1};

    /* Same sizes, subsizes and starts; only the order flag differs. */
    MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_FORTRAN, MPI_INT, &ftype);
    MPI_Type_commit(&ftype);

    MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C, MPI_INT, &ctype);
    MPI_Type_commit(&ctype);

    /* Send and receive the subregion with the C-order type. */
    MPI_Isend(&(send[0][0]), 1, ctype, 0, 1, MPI_COMM_WORLD,&reqc);
    MPI_Recv(&(recvc[0][0]), 1, ctype, 0, 1, MPI_COMM_WORLD, &statusc);

    /* Send with the C-order type again, but receive with the Fortran-order type. */
    MPI_Isend(&(send[0][0]), 1, ctype, 0, 1, MPI_COMM_WORLD,&reqf);
    MPI_Recv(&(recvf[0][0]), 1, ftype, 0, 1, MPI_COMM_WORLD, &statusf);

    /*...*/

    printf("Original:\n");
    printarr(send,rows,cols);
    printf("\nReceived -- C order:\n");
    printarr(recvc,rows,cols);
    printf("\nReceived: -- Fortran order:\n");
    printarr(recvf,rows,cols);

gives you the following:

    Original:
      0  1  2  3  4  5  6
     10 11 12 13 14 15 16
     20 21 22 23 24 25 26
     30 31 32 33 34 35 36
     40 41 42 43 44 45 46
     50 51 52 53 54 55 56
     60 61 62 63 64 65 66

    Received -- C order:
      0  0  0  0  0  0  0
      0 11 12 13 14  0  0
      0 21 22 23 24  0  0
      0  0  0  0  0  0  0
      0  0  0  0  0  0  0
      0  0  0  0  0  0  0
      0  0  0  0  0  0  0

    Received: -- Fortran order:
      0  0  0  0  0  0  0
      0 11 12  0  0  0  0
      0 13 14  0  0  0  0
      0 21 22  0  0  0  0
      0 23 24  0  0  0  0
      0  0  0  0  0  0  0
      0  0  0  0  0  0  0

So the same data is sent and received; all that really happens is that the sizes, subsizes and starts of the arrays are interpreted in reverse order.
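To make the type map concrete, here is a minimal sketch in plain C (no MPI calls) that lists the element offsets a 2-D subarray selects in each order, and then copies data the way a send with one map and a receive with another would. The names `subarray_offsets`, `copy_via_maps`, `ORDER_C` and `ORDER_FORTRAN` are hypothetical stand-ins of my own, not MPI API, and the offset formula is my reading of the subarray definition rather than anything taken from an MPI implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MPI_ORDER_C / MPI_ORDER_FORTRAN. */
enum order { ORDER_C, ORDER_FORTRAN };

/* Fill disp[] with the element offsets a 2-D subarray selects, in the
   order the type map visits them (increasing memory address).
   Returns the number of offsets written. */
size_t subarray_offsets(const int sizes[2], const int subsizes[2],
                        const int starts[2], enum order ord, size_t *disp)
{
    /* In C order the last dimension is contiguous in memory;
       in Fortran order the first dimension is. */
    int fast = (ord == ORDER_C) ? 1 : 0;
    int slow = 1 - fast;
    size_t n = 0;

    for (int s = 0; s < subsizes[slow]; s++) {
        for (int f = 0; f < subsizes[fast]; f++) {
            size_t off = (size_t)(starts[slow] + s) * sizes[fast]
                       + (size_t)(starts[fast] + f);
            assert(off < (size_t)sizes[0] * sizes[1]); /* stay inside the full array */
            disp[n++] = off;
        }
    }
    return n;
}

/* Emulate "send with one map, receive with another": the i-th element
   picked by the send map lands at the i-th offset of the receive map. */
void copy_via_maps(const int *src, const size_t *sdisp,
                   int *dst, const size_t *rdisp, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[rdisp[i]] = src[sdisp[i]];
}
```

With sizes {7,7}, subsizes {2,4} and starts {1,1}, the C-order map visits offsets 8..11 and 15..18, while the Fortran-order map visits 8, 9, 15, 16, 22, 23, 29, 30. Copying from the first map into the second places 11 12 / 13 14 / 21 22 / 23 24 in rows 1 to 4, which is the pattern in the output above: the values arrive in the same sequence, only the shape of the region changes.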
You can transpose using MPI datatypes - the standard even gives a couple of examples, one of which I have transliterated into C here - but you must create the types yourself. The good news is that it is really no longer than the subarray code:
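
    /* One column: rows blocks of 1 int, each cols ints apart. */
    MPI_Type_vector(rows, 1, cols, MPI_INT, &col);
    /* cols such columns, each starting sizeof(int) bytes after the previous one.
       MPI_Type_create_hvector replaces MPI_Type_hvector, which was removed in MPI-3.0. */
    MPI_Type_create_hvector(cols, 1, sizeof(int), col, &transpose);
    MPI_Type_commit(&transpose);

    /* Send the array as a contiguous stream; receive it through the transposing type. */
    MPI_Isend(&(send[0][0]), rows*cols, MPI_INT, 0, 1, MPI_COMM_WORLD,&req);
    MPI_Recv(&(recv[0][0]), 1, transpose, 0, 1, MPI_COMM_WORLD, &status);
    /* Complete the nonblocking send before the buffers are used again. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    MPI_Type_free(&col);
    MPI_Type_free(&transpose);

    printf("Original:\n");
    printarr(send,rows,cols);
    printf("Received\n");
    printarr(recv,rows,cols);

Running this gives:

    $ mpirun -np 1 ./transpose2
    Original:
      0  1  2  3  4  5  6
     10 11 12 13 14 15 16
     20 21 22 23 24 25 26
     30 31 32 33 34 35 36
     40 41 42 43 44 45 46
     50 51 52 53 54 55 56
     60 61 62 63 64 65 66
    Received
      0 10 20 30 40 50 60
      1 11 21 31 41 51 61
      2 12 22 32 42 52 62
      3 13 23 33 43 53 63
      4 14 24 34 44 54 64
      5 15 25 35 45 55 65
      6 16 26 36 46 56 66
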
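As a rough illustration of why the vector-inside-hvector type transposes, the following plain C sketch (again no MPI calls) writes out the offsets that such a type visits and scatters a contiguous stream through them. The function names `transpose_offsets` and `recv_through_map` are hypothetical, and the sketch assumes the square case shown above, where the sent and received arrays have the same shape.

```c
#include <assert.h>
#include <stddef.h>

/* Element offsets visited by "cols copies, one int apart, of a column
   made of rows single ints with stride cols". Returns the count. */
size_t transpose_offsets(int rows, int cols, size_t *disp)
{
    assert(rows > 0 && cols > 0);
    size_t n = 0;
    for (int c = 0; c < cols; c++)       /* outer hvector: shift by one int per copy */
        for (int r = 0; r < rows; r++)   /* inner vector: walk down one column */
            disp[n++] = (size_t)r * cols + (size_t)c;
    return n;
}

/* Emulate receiving a contiguous stream through the map:
   the k-th incoming element is stored at offset disp[k]. */
void recv_through_map(const int *stream, const size_t *disp, size_t n, int *dst)
{
    for (size_t k = 0; k < n; k++)
        dst[disp[k]] = stream[k];
}
```

The first `rows` incoming elements (row 0 of the sender) are written down column 0 of the receiver, the next `rows` elements down column 1, and so on, which is exactly a transpose. Unlike the subarray types, this map does not visit memory in increasing address order, and that is what lets it reorder the data.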