Copy dimnames without copying objects?

I could not find the previous question about this, but this one is pretty close.

Often I create new objects and want them to have the same dimnames ( names , colnames , rownames ) like some other object. Usually I would use names or rownames + colnames , but I'm tired of doing this, and I want a better solution. I also want a solution that allows partial overlap, so I need a new function. My problem is that it seems not quite easy to get it exactly.

Firstly, an auxiliary function:

 get_dims = function(x) { if (is.null(dim(x))) { return(length(x)) } else { return(dim(x)) } } 

This gets the size of any object. dim() returns NULL for atomic objects (vectors and lists), whereas it really should just return its length.

Then we compile some minimal test data:

 t = matrix(1:9, nrow=3) t2 = t rownames(t) = LETTERS[1:3]; colnames(t) = letters[1:3] 

Check:

 > t abc A 1 4 7 B 2 5 8 C 3 6 9 > t2 [,1] [,2] [,3] [1,] 1 4 7 [2,] 2 5 8 [3,] 3 6 9 

The test is that t2 should receive dimnames t . I print them because == apparently cannot handle list comparing (returns logical(0) ).

A simple solution is to take the object whose names I want to copy, the object into which I want to copy them, and simply change the dimnames in the function and return the object back. This can be done as follows:

 copy_names1 = function(x, y, partialmatching = T) { #find object dimensions x_dims = get_dims(x) y_dims = get_dims(y) #set names if matching dims if (all(x_dims == y_dims)) { #loop over each dimension for (dim in 1:length(dimnames(x))) { dimnames(y)[[dim]] <- dimnames(x)[[dim]] } } return(y) } 

Test:

 > copy_names1(t, t2) abc A 1 4 7 B 2 5 8 C 3 6 9 

Thus, it works fine, but returns an object, which means that you need to use the assignment operator, which is not needed with ordinary * names() functions.

We can also assign from a function using assign() :

 copy_names2 = function(x, y, partialmatching = T) { #find object dimensions x_dims = get_dims(x) y_dims = get_dims(y) #what is the object in y parameter? y_obj_name = deparse(substitute(y)) #set names if matching dims if (all(x_dims == y_dims)) { #loop over each dimension for (dim in 1:length(dimnames(x))) { dimnames(y)[[dim]] <- dimnames(x)[[dim]] } } #assign in the outer envir assign(y_obj_name, pos = 1, value = y) } 

Test:

 > copy_names2(t, t2) > t2 abc A 1 4 7 B 2 5 8 C 3 6 9 

It also works: it does not require the use of an assignment operator and returns quietly. However, it copies the object to RAM (I think), which is bad when using large objects. It would be better to call dimnames on an existing object without copying it. Therefore, I try:

 copy_names3 = function(x, y, partialmatching = T) { #find object dimensions x_dims = get_dims(x) y_dims = get_dims(y) #what is the object in y parameter? y_obj_name = deparse(substitute(y)) get(y_obj_name, pos = -1) #test that it works #set names if matching dims if (all(x_dims == y_dims)) { #loop over each dimension for (dim in 1:length(dimnames(x))) { dimnames(get(y_obj_name, pos = -1))[[dim]] <- dimnames(x)[[dim]] } } } 

Test:

 > copy_names3(t, t2) Error in dimnames(get(y_obj_name, pos = -1))[[dim]] <- dimnames(x)[[dim]] : could not find function "get<-" 

A very mysterious mistake! According to the previous question, get() cannot be used like that because it only retrieves values, not assigns them. Persons write instead of assign() . However, in the documentation for assign() we find:

assign does not send assignment methods, so it cannot be used to set vector elements, names, attributes, etc.

How to copy dimnames without copying objects using function?

-1
source share
2 answers

One solution is to start calling dimnames in the parent environment, and not inside the function. This can be done as follows:

 copy_names4 = function(x, y, partialmatching = T) { library(stringr) #find object dimensions x_dims = get_dims(x) y_dims = get_dims(y) #what is the object in y parameter? x_obj_name = deparse(substitute(x)) y_obj_name = deparse(substitute(y)) #set names if matching dims if (all(x_dims == y_dims)) { #loop over each dimension for (dim in 1:length(dimnames(x))) { str_call = str_c("dimnames(", y_obj_name, ")[[", dim, "]] <- dimnames(" ,x_obj_name, ")[[", dim, "]]") eval(parse(text = str_call), parent.frame(1)) } } } 

Check this:

 > t2 [,1] [,2] [,3] [1,] 1 4 7 [2,] 2 5 8 [3,] 3 6 9 > copy_names4(t, t2) > t2 abc A 1 4 7 B 2 5 8 C 3 6 9 

Success!

But faster?

 library(microbenchmark) microbenchmark(copy_names1 = {t2 = copy_names1(t, t2)}, copy_names2 = copy_names2(t, t2), copy_names4 = copy_names4(t, t2)) 

Results:

 Unit: microseconds expr min lq mean median uq max neval copy_names1 8.778 10.6795 14.57945 11.9960 15.653 46.812 100 copy_names2 24.869 27.7950 38.62004 33.7925 39.937 202.168 100 copy_names4 466.067 478.9405 507.48058 494.4460 514.488 840.559 100 

Surprisingly, the original version was much faster, about 40-50 times. However, the latter should be faster for large objects. Try a larger test:

 #larger test t = matrix(1:9000000, nrow=3000) t2 = t rownames(t) = sample(LETTERS[1:26], size = 3000, replace = T); colnames(t) = sample(letters[1:26], size = 3000, replace = T) t[1:5, 1:5] t2[1:5, 1:5] microbenchmark(copy_names1 = {t2 = copy_names1(t, t2)}, copy_names2 = copy_names2(t, t2), copy_names4 = copy_names4(t, t2)) 

Results:

 Unit: milliseconds expr min lq mean median uq max neval copy_names1 4.146032 4.442115 33.09852 12.14201 13.00495 242.2970 100 copy_names2 4.229708 4.553877 41.39389 12.23739 20.12995 229.4899 100 copy_names4 5.104497 5.499469 44.42764 13.24267 21.41507 228.7731 100 

Now they are about equally fast, although the first two are still a little faster.

0
source

I'm not sure how a โ€œpartial matchโ€ should work, but maybe this is:

 t = matrix(1:9, nrow=3) t2 = t t2 <- rbind(t2, 11:13) rownames(t) = LETTERS[1:3]; colnames(t) = letters[1:3] d <- dim(t) == dim(t2) dimnames(t2)[d] <- dimnames(t)[d] t2 # abc #[1,] 1 4 7 #[2,] 2 5 8 #[3,] 3 6 9 #[4,] 11 12 13 

Edit:

Here's how you can do this from within the setter function without eval(parse(...)) :

 t = matrix(1:9, nrow=3) t2 = t t2 <- rbind(t2, 11:13) rownames(t) = LETTERS[1:3]; colnames(t) = letters[1:3] fun <- function(x, template, ...) { y <- substitute(x) z <- substitute(template) d <- dim(x) == dim(template) expr <- bquote(dimnames(.(y))[.(d)] <- dimnames(.(z))[.(d)]) eval(expr, ...) invisible(NULL) } fun(t2, t, .GlobalEnv) t2 # abc #[1,] 1 4 7 #[2,] 2 5 8 #[3,] 3 6 9 #[4,] 11 12 13 

Of course, if you need something really fast, you need to implement it in C (as was done using the dimnames<- function).

+4
source

Source: https://habr.com/ru/post/1209157/


All Articles