As @Yossarian noted, HDF5 always stores data in row-major order (the C convention). Octave, like Fortran, stores data internally in column-major order.
When writing a matrix from Octave, the HDF5 layer transposes it for you, so it is always written in row-major order, no matter what language you use. This is what makes the file portable.
There is a very good example in section 7.3.2.5 of the HDF5 User Guide mentioned by @Yossarian. Here is an example that (almost) reproduces it with Octave:
    octave:1> A = [ 1:3; 4:6 ]
    A =

       1   2   3
       4   5   6

    octave:2> save("-hdf5", "test.h5", "A")
    octave:3> quit
    ~$ h5dump test.h5
    HDF5 "test.h5" {
    GROUP "/" {
       COMMENT "# Created by Octave 3.6.4, Fri Jun 13 08:36:16 2014 MDT <user@localhost>"
       GROUP "A" {
          ATTRIBUTE "OCTAVE_NEW_FORMAT" {
             DATATYPE  H5T_STD_U8LE
             DATASPACE  SCALAR
             DATA {
             (0): 1
             }
          }
          DATASET "type" {
             DATATYPE  H5T_STRING {
                   STRSIZE 7;
                   STRPAD H5T_STR_NULLTERM;
                   CSET H5T_CSET_ASCII;
                   CTYPE H5T_C_S1;
                }
             DATASPACE  SCALAR
             DATA {
             (0): "matrix"
             }
          }
          DATASET "value" {
             DATATYPE  H5T_IEEE_F64LE
             DATASPACE  SIMPLE { ( 3, 2 ) / ( 3, 2 ) }
             DATA {
             (0,0): 1, 4,
             (1,0): 2, 5,
             (2,0): 3, 6
             }
          }
       }
    }
    }
Note how the HDF5 layer transposes the matrix to make sure it is stored in row-major format: the 2 x 3 matrix from Octave appears in the file as a 3 x 2 dataset.
Next, an example that reads the file in C:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <hdf5.h>

    #define FILE "test.h5"
    #define DS   "A/value"

    int
    main(int argc, char **argv)
    {
        int i = 0;
        int j = 0;
        int n = 0;
        int x = 0;
        int rank = 0;
        hid_t file_id;
        hid_t space_id;
        hid_t dset_id;
        herr_t stat;
        hsize_t *dims = NULL;
        int *data = NULL;

        file_id = H5Fopen(FILE, H5F_ACC_RDONLY, H5P_DEFAULT);
        /* The third argument is a dataset access property list,
         * not a dataset id. */
        dset_id = H5Dopen(file_id, DS, H5P_DEFAULT);

        space_id = H5Dget_space(dset_id);
        n = H5Sget_simple_extent_npoints(space_id);
        rank = H5Sget_simple_extent_ndims(space_id);
        /* dims holds hsize_t values, so size the buffer accordingly. */
        dims = malloc(rank * sizeof(hsize_t));
        stat = H5Sget_simple_extent_dims(space_id, dims, NULL);

        printf("rank: %d\t dimensions: ", rank);
        for (i = 0; i < rank; ++i) {
            if (i == 0) {
                printf("(");
            }
            printf("%llu", (unsigned long long)dims[i]);
            if (i == (rank - 1)) {
                printf(")\n");
            } else {
                printf(" x ");
            }
        }

        /* The dataset is stored as doubles; HDF5 converts to int on read. */
        data = malloc(n * sizeof(int));
        memset(data, 0, n * sizeof(int));
        stat = H5Dread(dset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                       H5P_DEFAULT, data);

        /* Print the buffer as a row-major dims[0] x dims[1] matrix. */
        printf("%s:\n", DS);
        for (i = 0; i < dims[0]; ++i) {
            printf(" [ ");
            for (j = 0; j < dims[1]; ++j) {
                x = i * dims[1] + j;
                printf("%d ", data[x]);
            }
            printf("]\n");
        }

        free(data);
        free(dims);
        stat = H5Sclose(space_id);
        stat = H5Dclose(dset_id);
        stat = H5Fclose(file_id);

        return(EXIT_SUCCESS);
    }
When compiled and run, this gives:
    ~$ h5cc -o rmat rmat.c
    ~$ ./rmat
    rank: 2     dimensions: (3 x 2)
    A/value:
     [ 1 4 ]
     [ 2 5 ]
     [ 3 6 ]
This is great, because it means the matrices are stored optimally in memory. It also means you have to change how you perform your calculations: for row-major you need to pre-multiply, while for column-major you need to post-multiply. Here is an example that hopefully explains this a little more clearly.
Does that help?