How to match rows with the same value in one column of a data frame in R

I have data in the following form:

    set.seed(1234)
    data <- data.frame(cbind(runif(40, 0, 10), rep(seq(1, 20, 1), each = 2)))
    data <- data[sample(nrow(data)), ]
    colnames(data) <- c("obs", "subject")
    head(data)

          obs subject
    1.5904600      12
    8.1059855      13
    5.4497484       6
    0.3999592      12
    2.5880982      19
    2.6682078       9
          ...     ...

Each subject (column "subject", numbered from 1 to 20) has exactly two observations (column "obs").

I would like to "group" the rows by the values of the "subject" column. More precisely, I would like to "order" the data by subject while preserving the order in which the subjects first appear above. The final data would then look something like this:

          obs subject
    1.5904600      12
    0.3999592      12
    8.1059855      13
    2.3656473      13
    5.4497484       6
    7.2934746       6

Any ideas? I was thinking of identifying the rows that correspond to each subject with which:

 which(data$subject==x) 

and then rbind-ing these rows in a loop, but I'm sure there is a simpler and faster way to do this, right?
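For reference, the loop idea described above could be sketched roughly as follows. This is only an illustration of the approach in the question, not a recommended solution; the names `subjects`, `idx`, and `grouped` are made up for the example, and the row indices are collected first so that the data frame is subset once instead of being grown with rbind.

```r
# Rebuild the example data from the question.
set.seed(1234)
data <- data.frame(cbind(runif(40, 0, 10), rep(seq(1, 20, 1), each = 2)))
data <- data[sample(nrow(data)), ]
colnames(data) <- c("obs", "subject")

# Subjects in the order they first appear in the shuffled data.
subjects <- unique(data$subject)

# Collect the row positions for each subject in turn (hypothetical helper
# vector `idx`), then subset the data frame once at the end.
idx <- integer(0)
for (x in subjects) {
  idx <- c(idx, which(data$subject == x))
}
grouped <- data[idx, ]
head(grouped)
```

It works, but it scans the subject column once per subject, which is why a vectorised alternative is worth looking for.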

+5
4 answers

Convert subject to a factor whose levels follow the order of first appearance, then order by that factor:

    data$group <- factor(data$subject, levels = unique(data$subject))
    data[order(data$group), ]
    #           obs subject group
    # 1  1.59046003      12    12
    # 4  0.39995918      12    12
    # 2  8.10598552      13    13
    # 30 2.18799541      13    13
    # ...
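If the extra group column is not wanted, the same ordering can probably be obtained with match(), which returns the position of each subject in the first-appearance sequence. The sketch below is a variation on the factor approach, not part of the original answer; `first_seen` and `grouped` are illustrative names, and the data is rebuilt in a slightly simplified way so the block runs on its own.

```r
# Example data shaped like the question's: 20 subjects, 2 observations each.
set.seed(1234)
data <- data.frame(obs = runif(40, 0, 10), subject = rep(1:20, each = 2))
data <- data[sample(nrow(data)), ]

# For each row, the rank of its subject by first appearance.
first_seen <- match(data$subject, unique(data$subject))

# order() leaves ties in their original order, so the two rows of each
# subject stay in the sequence they had in the shuffled data.
grouped <- data[order(first_seen), ]
head(grouped)
```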
+5

Nest the data on obs and unnest it again. The result keeps the subjects in their original order of first appearance, but the rows are grouped by subject.

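    library(tidyr)
    # tidyr >= 1.0 expects named arguments; older versions used nest(obs) %>% unnest()
    data %>% nest(data = obs) %>% unnest(data)
    # Output for the six rows of head(data):
    # A tibble: 6 × 2
    #   subject       obs
    #     <int>     <dbl>
    # 1      12 1.5904600
    # 2      12 0.3999592
    # 3      13 8.1059855
    # 4       6 5.4497484
    # 5      19 2.5880982
    # 6       9 2.6682078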
0

This builds on zx8754's answer, but retains the original data type:

    library(dplyr)  # for arrange()
    group <- factor(data[, 'subject'], levels = unique(data[, 'subject']))
    data <- cbind(data, group)
    data <- arrange(as.data.frame(data), group)
    data <- as.matrix(data[, -3])
-1

dplyr is a great package with various useful verbs. One of them is arrange(variable), which groups the rows by subject here, and more elegantly (the result is usually a data.frame as well, so you don't need cbind). Note that it sorts by the value of subject, not by order of first appearance:

    require(dplyr)
    as.data.frame(data) %>% arrange(subject)
    # or, if you want reverse order:
    as.data.frame(data) %>% arrange(-subject)

(data.table works well for this too. In fact, you can combine the two via the dtplyr package.)
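Since the question asks for subjects in the order they first appear rather than sorted by value, arrange() can be given a match() expression instead of the bare column. This is a sketch under that assumption, not something the answer above proposed; `grouped` is an illustrative name and the data is rebuilt so the block is self-contained.

```r
library(dplyr)

# Example data shaped like the question's: 20 subjects, 2 observations each.
set.seed(1234)
data <- data.frame(obs = runif(40, 0, 10), subject = rep(1:20, each = 2))
data <- data[sample(nrow(data)), ]

# arrange(subject) alone would sort by subject value (1, 2, 3, ...).
# Arranging by match() keeps subjects in first-appearance order instead.
grouped <- data %>%
  arrange(match(subject, unique(subject)))
head(grouped)
```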

-3

Source: https://habr.com/ru/post/1257354/

