Rearrange the table for each group

I am working with R and I have this data:

data <- structure(list(Col1 = 1:9, Col2 = structure(c(2L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 3L), .Label = c("Administrative ", "National", "Regional"), class = "factor"), Col3 = structure(c(NA, 3L, 4L, NA, 2L, 3L, 1L, 4L, 3L), .Label = c("bike", "boat", "car", "truck" ), class = "factor"), Col4 = c(56L, 65L, 58L, 62L, 24L, 25L, 120L, 89L, 468L), X = c(NA, NA, NA, NA, NA, NA, NA, NA, NA), X.1 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Col1", "Col2", "Col3", "Col4", "X", "X.1"), class = "data.frame", row.names = c(NA, -9L)) 

I would like to reshape it so that, for each group in Col2, I can see which Col3 values are available (1) or not (0). The result should look like this:

  result <- structure(list(Col1 = c(1L, 4L, 5L), Col2 = structure(c(2L, 1L, 3L), .Label = c("Administrative ", "National", "Regional"), class = "factor"), car = c(1L, 0L, 1L), truck = c(1L, 0L, 1L), boat = c(0L, 0L, 1L), bike = c(0L, 0L, 1L)), .Names = c("Col1", "Col2", "car", "truck", "boat", "bike"), class = "data.frame", row.names = c(NA, -3L)) 

I tried aggregate, but I'm still far from the result:

 t <- aggregate(data$Col2, by=list(data$Col3), c) 

Help is appreciated!

4 answers

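We can use dcast from data.table with length as fun.aggregate. Note that length returns counts rather than 0/1 flags, so a value that occurs twice within a group (car in Regional) shows up as 2: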

    library(data.table)
    dcast(setDT(data), Col2 ~ Col3, length)[, 1:5, with = FALSE]

Here is an idea using base R:

    # convert to character
    data[2:3] <- lapply(data[2:3], as.character)
    # get unique elements to tabulate
    i1 <- unique(data$Col3)
    i1 <- i1[!is.na(i1)]
    setNames(
      data.frame(do.call(rbind, lapply(split(data$Col3, data$Col2),
                 function(i) as.integer(match(i1, i, nomatch = 0) > 0)))),
      i1)

which gives:

                   car truck boat bike
    Administrative   0     0    0    0
    National         1     1    0    0
    Regional         1     1    1    1

Here is a dplyr solution if you are interested, although akrun's solution looks more concise:

    library(tidyverse)
    result <- data %>%
      group_by(Col2, Col3) %>%
      summarise(tot = sum(Col4)) %>%
      mutate(bool = if_else(tot > 0, 1, 0)) %>%
      select(Col2, Col3, bool) %>%
      spread(key = Col3, value = bool, fill = 0) %>%
      select(-`<NA>`)

Here is another base R method using table and some coercion:

    (table(data$Col2, data$Col3) > 0) + 0L

                    bike boat car truck
    Administrative     0    0   0     0
    National           0    0   1     1
    Regional           1    1   1     1

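table counts occurrences and drops NA values by default, so a group that contains only NA gets 0 in every column. Comparing with > 0 turns the counts into logical values, which collapses any count greater than 1 to TRUE, and adding 0L coerces the result back to integer.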
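The table approach yields the 0/1 flags but not the Col1 column shown in the expected result. A minimal sketch of one way to attach it, assuming Col1 should be the first value in each Col2 group (which is what the expected result suggests) and that rows should follow the order in which groups first appear. The names flags, first and res are only illustrative, and the all-NA columns X and X.1 are left out of the rebuilt data for brevity:

```r
# Rebuild the question's data (columns X and X.1, which are all NA, are omitted)
data <- data.frame(
  Col1 = 1:9,
  Col2 = factor(c("National", "National", "National", "Administrative ",
                  "Regional", "Regional", "Regional", "Regional", "Regional")),
  Col3 = factor(c(NA, "car", "truck", NA, "boat", "car", "bike", "truck", "car")),
  Col4 = c(56L, 65L, 58L, 62L, 24L, 25L, 120L, 89L, 468L)
)

# 0/1 availability flags, one row per Col2 level
flags <- (table(data$Col2, data$Col3) > 0) + 0L

# First row of each group, in order of first appearance (assumption, see above)
first <- data[!duplicated(data$Col2), c("Col1", "Col2")]

# Pick the flag rows by group name and the columns in the order used in the question
res <- cbind(
  first,
  as.data.frame.matrix(flags)[as.character(first$Col2),
                              c("car", "truck", "boat", "bike")]
)
rownames(res) <- NULL
res
```

Indexing the flags by as.character(first$Col2) keeps the lookup by name, so the trailing space in the "Administrative " level does not matter as long as it is the same in both places.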


Source: https://habr.com/ru/post/1272146/

