R self reference

In R, I do something like this a lot:

 adataframe[adataframe$col==something]<-adataframe[adataframe$col==something]+1

This is long and tedious to type. Is there any way for me
to reference the object I'm trying to change, for example

 adataframe[adataframe$col==something]<-$self+1 

?

+35
r dataframe
Oct. 14 '11 at 1:55
4 answers

Try the data.table package and its := operator. It is very fast and very concise.

 DT[col1==something, col2:=col3+1] 

The first part, col1==something, is the subset. You can put any expression here and use column names as if they were variables; i.e. there is no need to use $. The second part, col2:=col3+1, assigns the RHS to the LHS within that subset, and again column names can be used and assigned to as if they were variables. := is assignment by reference. No copies of any object are made, so it is faster than <-, =, within and transform.
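As a minimal sketch of the above (the table DT and its values are made up for illustration, and the data.table package is assumed to be installed), the subset-and-assign pattern might look like this:

```r
library(data.table)

# Hypothetical example data; the column names follow the answer above.
DT <- data.table(col1 = c("a", "b", "a"),
                 col2 = c(0, 0, 0),
                 col3 = c(10, 20, 30))

# Where col1 == "a", set col2 to col3 + 1 (DT is modified in place).
DT[col1 == "a", col2 := col3 + 1]

# The case from the question: increment a column within the subset.
DT[col1 == "a", col3 := col3 + 1]

print(DT)
```

Note that nothing is assigned back with <- here; the := call itself updates DT.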

Also, soon to be implemented in v1.8.1: one end goal of allowing := in j like this is to combine it with by; see the question "When should I use the := operator in data.table?".

UPDATE: this was indeed released (:= by group) in July 2012.
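For the grouped form, a rough sketch could look like the following. The data is made up, grp, val and total are hypothetical column names, and a data.table version that supports := with by is assumed:

```r
library(data.table)

# Hypothetical example data.
DT <- data.table(grp = c("a", "a", "b"), val = c(1, 2, 10))

# Add a column holding each group's sum; assigned by reference, per group.
DT[, total := sum(val), by = grp]

print(DT)
```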

+34
Oct. 14 '11 at 3:14 a.m.

You should be paying more attention to Gabor Grothendieck (and not just in this instance). The inc function cited on Matt Asher's blog does everything you are asking for:

(And the obvious extension works as well.)

 add <- function(x, inc=1) {
   eval.parent(substitute(x <- x + inc))
 }
 # Testing the `inc` function behavior

EDIT: After my temporary annoyance at the lack of approval in the first comment, I took up the challenge of adding a further function argument. Supplied with a single argument that is a portion of a data frame, it still increments that range of values by one. So far it has been only lightly tested on infix dyadic operators, but I see no reason why it should not work with any function that accepts exactly two arguments:

 transfn <- function(x, func="+", inc=1) {
   eval.parent(substitute(x <- do.call(func, list(x , inc))))
 }
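A small usage sketch of transfn, with a made-up data frame (df and its columns are only for illustration), first with a different operator and then with the defaults:

```r
transfn <- function(x, func = "+", inc = 1) {
  # Rewrites the call into `x <- func(x, inc)` and runs it in the caller.
  eval.parent(substitute(x <- do.call(func, list(x, inc))))
}

# Hypothetical example data.
df <- data.frame(a1 = 1:5, a2 = 21:25)

# Multiply only the a1 values greater than 3 by 10.
transfn(df$a1[df$a1 > 3], "*", 10)

# With the defaults it behaves like the original inc(): add 1.
transfn(df$a2)

print(df)
```

Because the assignment is evaluated in the calling frame, df itself changes and nothing needs to be assigned back.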

(Guilty admission: this somehow "feels wrong" from the traditional R perspective of returning values for assignment.) The earlier testing of the inc function follows:

 > df <- data.frame(a1 =1:10, a2=21:30, b=1:2)
 > inc <- function(x) {
 +   eval.parent(substitute(x <- x + 1))
 + }
 > inc(df$a1) # works on whole columns
 > df
    a1 a2 b
 1   2 21 1
 2   3 22 2
 3   4 23 1
 4   5 24 2
 5   6 25 1
 6   7 26 2
 7   8 27 1
 8   9 28 2
 9  10 29 1
 10 11 30 2
 > inc(df$a1[df$a1>5]) # testing on a restricted range of one column
 > df
    a1 a2 b
 1   2 21 1
 2   3 22 2
 3   4 23 1
 4   5 24 2
 5   7 25 1
 6   8 26 2
 7   9 27 1
 8  10 28 2
 9  11 29 1
 10 12 30 2
 > inc(df[ df$a1>5, ]) # testing on a range of rows for all columns being transformed
 > df
    a1 a2 b
 1   2 21 1
 2   3 22 2
 3   4 23 1
 4   5 24 2
 5   8 26 2
 6   9 27 3
 7  10 28 2
 8  11 29 3
 9  12 30 2
 10 13 31 3
 # and even in selected rows and grepped names of columns meeting a criterion
 > inc(df[ df$a1 <= 3, grep("a", names(df)) ])
 > df
    a1 a2 b
 1   3 22 1
 2   4 23 2
 3   4 23 1
 4   5 24 2
 5   8 26 2
 6   9 27 3
 7  10 28 2
 8  11 29 3
 9  12 30 2
 10 13 31 3
+15
oct. 2018-11-14T00:

Here is what you can do. Let's say you have a dataframe

 df = data.frame(x = 1:10, y = rnorm(10)) 

And you want to increase all y by 1. You can do it easily using transform

 df = transform(df, y = y + 1) 
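The call above changes every row. If only a subset should change, as in the question, one possible variation (my assumption, not part of the answer above) is to put the condition inside transform with ifelse. The data here is made up, with fixed values instead of rnorm so the result is predictable:

```r
# Hypothetical example data.
df <- data.frame(x = 1:10, y = (1:10) / 10)

# Add 1 to y only where x > 5; other rows keep their value.
df <- transform(df, y = ifelse(x > 5, y + 1, y))

print(df)
```

The condition is written once and y is referred to by its bare name, which avoids repeating the data frame name on both sides.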
+6
Oct 14 '11 at 2:07 a.m.

I'd be partial to (presumably the subset is on rows)

 ridx <- adataframe$col==something
 adataframe[ridx,] <- adataframe[ridx,] + 1

which doesn't rely on any fancy / fragile parsing, is reasonably expressive about the operation being performed, and is not too verbose. It also tends to break lines into nice human-parseable units, and there is something appealing about using standard idioms: R's vocabulary and idiosyncrasies are already large enough for my taste.
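A runnable sketch of this idiom with made-up data (the values, and the names original, all_cols and one_col, are only for illustration). Indexing whole rows increments every column in those rows, including col itself, so a single-column variant is shown as well:

```r
# Hypothetical example data.
original <- data.frame(col = c(1, 2, 1, 3), val = c(10, 20, 30, 40))
something <- 1

# Whole-row form: every column in the matching rows gets + 1.
all_cols <- original
ridx <- all_cols$col == something
all_cols[ridx, ] <- all_cols[ridx, ] + 1

# Single-column form: only `val` changes in the matching rows.
one_col <- original
ridx <- one_col$col == something
one_col[ridx, "val"] <- one_col[ridx, "val"] + 1

print(all_cols)
print(one_col)
```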

+5
Oct. 14 '11 at 16:36


