How to match rows with the same value in one column of a data frame in R

Question


I have data in the following form:

    set.seed(1234)
    data <- data.frame(cbind(runif(40, 0, 10), rep(seq(1, 20, 1), each = 2)))
    data <- data[sample(nrow(data)), ]
    colnames(data) <- c("obs", "subject")
    head(data)

          obs subject
    1.5904600      12
    8.1059855      13
    5.4497484       6
    0.3999592      12
    2.5880982      19
    2.6682078       9
          ...     ...

Suppose I have exactly two observations (column "obs") per subject (column "subject", where the subjects are numbered from 1 to 20).

I would like to "group" the rows by the values of the "subject" column. More precisely, I would like to "order" the data by subject while preserving the order in which the subjects first appear above. Thus, the final data would look something like this:

          obs subject
    1.5904600      12
    0.3999592      12
    8.1059855      13
    2.3656473      13
    5.4497484       6
    7.2934746       6

Any ideas? I was thinking of identifying the rows that correspond to each subject with which:

 which(data$subject==x)

and then rbind-ing these rows in a loop, but I'm sure there is a simpler and faster way to do this, right?


r group-by order dataframe

Ladislas nalborczyk Sep 27 '16 at 21:08


4 answers

zx8754 · Answer 1 · 2016-09-27T21:17:49+0000

Convert subject to a factor whose levels follow the order of first appearance, then order by that factor:

    data$group <- factor(data$subject, levels = unique(data$subject))
    data[order(data$group), ]
    #           obs subject group
    # 1  1.59046003      12    12
    # 4  0.39995918      12    12
    # 2  8.10598552      13    13
    # 30 2.18799541      13    13
    # ...
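The same idea can be sketched without adding a helper column, by ranking each row's subject with match() against unique(). This is only a minimal base R sketch that rebuilds the question's data frame; first_seen and grouped are illustrative names that do not appear in the original answer.

```r
# Rebuild the example data from the question
set.seed(1234)
data <- data.frame(obs = runif(40, 0, 10), subject = rep(1:20, each = 2))
data <- data[sample(nrow(data)), ]

# Position of each row's subject in the order of first appearance
# (1 = the first subject seen, 2 = the second, and so on)
first_seen <- match(data$subject, unique(data$subject))

# order() leaves ties in their original order, so rows that share a
# subject keep their relative order from the shuffled data
grouped <- data[order(first_seen), ]
head(grouped)
```

Because the ranking is computed on the fly, the result has the same columns as the input, with no extra group column to drop afterwards.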

Joe · Answer 2 · 2016-10-22T20:38:17+0000

Nest the data by obs and unnest it again. The result keeps the subjects in their original order of appearance, but the rows are grouped by subject.

    library(tidyr)
    # Syntax as originally posted; newer tidyr versions expect
    # nest(data = obs) and unnest(cols = data)
    data %>% nest(obs) %>% unnest()
    # A tibble: 6 × 2
    #   subject       obs
    #     <int>     <dbl>
    # 1      12 1.5904600
    # 2      12 0.3999592
    # 3      13 8.1059855
    # 4       6 5.4497484
    # 5      19 2.5880982
    # 6       9 2.6682078

kwicher · Answer 3 · 2016-09-27T21:46:59+0000

This is based on zx8754's answer, but retains the data type:

    library(dplyr)  # for arrange()
    group <- factor(data[, 'subject'], levels = unique(data[, 'subject']))
    data <- cbind(data, group)
    data <- arrange(as.data.frame(data), group)
    data <- as.matrix(data[, -3])  # drop the helper column

smci · Answer 4 · 2016-09-27T22:25:28+0000

dplyr is a great package with various useful verbs, one of which is arrange(variable). It groups rows with the same subject together, and more elegantly (the result is usually also a data.frame, so you don't need cbind). Note that it sorts by the value of subject rather than by order of first appearance:

    require(dplyr)
    as.data.frame(data) %>% arrange(subject)
    # or, if you want reverse order:
    as.data.frame(data) %>% arrange(-subject)
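Since arrange(subject) sorts by subject value, keeping the order of first appearance asked for in the question needs a different sort key. One possible sketch, assuming dplyr is installed, sorts on each subject's position in unique(subject); grouped is an illustrative name, not part of the original answer.

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(1234)
data <- data.frame(obs = runif(40, 0, 10), subject = rep(1:20, each = 2))
data <- data[sample(nrow(data)), ]

# Sort by where each subject first appears instead of by its value
grouped <- data %>%
  arrange(match(subject, unique(subject)))
head(grouped)
```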

(data.table is also well suited to this. In fact, you can combine the two via the dtplyr package.)
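For the data.table route mentioned above, a rough sketch might look like the following. It assumes the data.table package is installed and reuses the match()/unique() ranking as the sort key; dt and grouped are illustrative names.

```r
library(data.table)

# Rebuild the example data from the question
set.seed(1234)
data <- data.frame(obs = runif(40, 0, 10), subject = rep(1:20, each = 2))
data <- data[sample(nrow(data)), ]

# Convert to a data.table (as.data.table() makes a copy and leaves data untouched)
dt <- as.data.table(data)

# Order rows by the position at which each subject first appears
grouped <- dt[order(match(subject, unique(subject)))]
head(grouped)
```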
