R from C: setting the row / column names of a matrix

I am writing an R package that manipulates matrices in C. Currently, the matrices returned to R have numeric indices for row / column names. I would prefer to assign my own row / column names when building the object in C.

I searched for about an hour but have not yet found a good solution. The closest thing I found is dimnames, but I want to name every row and column, not just the two dimensions. The matrices get larger than 4x4; below is just a small example of what I want to do.

The number of rows is 4^x, where x is the length of the row-name string.

Current

          [,1] [,2] [,3] [,4]
     [1,] 0.20 0.00 0.00 0.80
     [2,] 0.25 0.25 0.25 0.25
     [3,] 0.25 0.25 0.25 0.25
     [4,] 1.00 0.00 0.00 0.00
     [5,] 0.20 0.00 0.00 0.80
     [6,] 0.25 0.25 0.25 0.25
     [7,] 0.25 0.25 0.25 0.25
     [8,] 1.00 0.00 0.00 0.00
     [9,] 0.20 0.00 0.00 0.80
    [10,] 0.25 0.25 0.25 0.25
    [11,] 0.25 0.25 0.25 0.25
    [12,] 1.00 0.00 0.00 0.00
    [13,] 0.20 0.00 0.00 0.80
    [14,] 0.25 0.25 0.25 0.25
    [15,] 0.25 0.25 0.25 0.25
    [16,] 1.00 0.00 0.00 0.00

Desired

         [A]  [C]  [G]  [T]
    [AA] 0.20 0.00 0.00 0.80
    [AC] 0.25 0.25 0.25 0.25
    [AG] 0.25 0.25 0.25 0.25
    [AT] 1.00 0.00 0.00 0.00
    [CA] 0.20 0.00 0.00 0.80
    [CC] 0.25 0.25 0.25 0.25
    [CG] 0.25 0.25 0.25 0.25
    [CT] 1.00 0.00 0.00 0.00
    [GA] 0.20 0.00 0.00 0.80
    [GC] 0.25 0.25 0.25 0.25
    [GG] 0.25 0.25 0.25 0.25
    [GT] 1.00 0.00 0.00 0.00
    [TA] 0.20 0.00 0.00 0.80
    [TC] 0.25 0.25 0.25 0.25
    [TG] 0.25 0.25 0.25 0.25
    [TT] 1.00 0.00 0.00 0.00
3 answers

As Jim said, this is much easier to do in R, so I pass the names to the C function through the nam argument.

    #include <Rinternals.h>

    SEXP myMat(SEXP nam)
    {
        /* PrintValue(nam); */
        SEXP ans, dimnames;
        /* square matrix, one row and one column per name (values are not initialized) */
        PROTECT(ans = allocMatrix(REALSXP, length(nam), length(nam)));
        /* dimnames is a list of two vectors: row names, then column names */
        PROTECT(dimnames = allocVector(VECSXP, 2));
        SET_VECTOR_ELT(dimnames, 0, nam);
        SET_VECTOR_ELT(dimnames, 1, nam);
        setAttrib(ans, R_DimNamesSymbol, dimnames);
        UNPROTECT(2);
        return ans;
    }

If you put this code in a file called myMat.c, you can test it with the commands below. I use Ubuntu, so you will have to change myMat.so to myMat.dll if you are on Windows.

    R CMD SHLIB myMat.c
    Rscript -e 'dyn.load("myMat.so"); .Call("myMat", c("A","C","G","T"))'

If you are open to C++ instead of C, then Rcpp can make this a little easier. We simply create a list object with the row and column names, as in R, and assign it to the dimnames attribute of the matrix object:

    R> library(inline)   # to compile, link, load the code here
    R> src <- '
    +   Rcpp::NumericMatrix x(2,2);
    +   x.fill(42);      // or more interesting values
    +   // C++0x can assign a set of values to a vector, but we use older standard
    +   Rcpp::CharacterVector rows(2); rows[0] = "aa"; rows[1] = "bb";
    +   Rcpp::CharacterVector cols(2); cols[0] = "AA"; cols[1] = "BB";
    +   // now create an object "dimnms" as a list with rows and cols
    +   Rcpp::List dimnms = Rcpp::List::create(rows, cols);
    +   // and assign it
    +   x.attr("dimnames") = dimnms;
    +   return(x);
    + '
    R> fun <- cxxfunction(signature(), body=src, plugin="Rcpp")
    R> fun()
       AA BB
    aa 42 42
    bb 42 42
    R>

The actual assignment of the column and row names is rather manual because the current C++ standard does not allow a set of values to be assigned directly to a vector at initialization, but that will change.

Edit: I just realized that I can of course use the static create() method for the row and column names too, which makes this a little easier and shorter.

    R> src <- '
    +   Rcpp::NumericMatrix x(2,2);
    +   x.fill(42);      // or more interesting values
    +   Rcpp::List dimnms =   // two vec. with static names
    +       Rcpp::List::create(Rcpp::CharacterVector::create("cc", "dd"),
    +                          Rcpp::CharacterVector::create("ee", "ff"));
    +   // and assign it
    +   x.attr("dimnames") = dimnms;
    +   return(x);
    + '
    R> fun <- cxxfunction(signature(), body=src, plugin="Rcpp")
    R> fun()
       ee ff
    cc 42 42
    dd 42 42
    R>

So we are down to three or four statements, with no monkeying around with PROTECT / UNPROTECT and no manual memory management.


The remark above is instructive. dimnames is a list with as many elements as the object has dimensions, where each element is a character vector with one name per entry along that dimension, e.g. list(c('a','c','g','t'), c('a','c','g','t')).

To set this up in C, I would recommend:

    PROTECT(dimnames = allocVector(VECSXP, 2));
    PROTECT(rownames = allocVector(STRSXP, 4));
    PROTECT(colnames = allocVector(STRSXP, 4));
    setAttrib( ? , R_DimNamesSymbol, dimnames);

You will then need to fill in the elements of rownames and colnames and store both vectors in dimnames. In general, this kind of thing is much easier to do in R.
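If the names do have to be produced on the C side, the row labels from the question (AA, AC, ..., TT, with 4^x rows for names of length x) can be generated by reading the row index as a base-4 number. The following is only a minimal sketch in plain C with no R headers; the helper names kmer_count and kmer_name are made up for this illustration and are not part of R or of any answer above.

```c
#include <stddef.h>

/* Alphabet for the labels; assumed to be the A/C/G/T of the question. */
static const char ALPHABET[] = "ACGT";

/* Hypothetical helper: number of rows for names of length x, i.e. 4^x. */
size_t kmer_count(unsigned x)
{
    size_t n = 1;
    while (x-- > 0)
        n *= 4;
    return n;
}

/*
 * Hypothetical helper: write the label for row `index` (0-based) into
 * `buf`, which must have room for x + 1 chars. Rows are ordered
 * AA, AC, AG, AT, CA, ..., so the index is treated as a base-4 number
 * and its digits are written from the last character to the first.
 */
void kmer_name(size_t index, unsigned x, char *buf)
{
    for (unsigned i = x; i > 0; i--) {
        buf[i - 1] = ALPHABET[index % 4];
        index /= 4;
    }
    buf[x] = '\0';
}
```

Inside the extension, each generated label could then be copied into the row-name vector with something like SET_STRING_ELT(rownames, i, mkChar(buf)), looping i from 0 to kmer_count(x) - 1, before the vector is stored in dimnames with SET_VECTOR_ELT.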

Jim


Source: https://habr.com/ru/post/886194/

