Split a column into adjacent columns, use the row name as the new column name in R

Question

I have a data frame containing two columns of identifying information and one column of letter pairs separated by a hyphen:

df<-data.frame(
    list = rep(1:3, each = 2),
    set =  rep(c("A","B"), times = 3),
    item = c("ab-cd","ef-gh","ij-kl","mn-op","qr-st","uv-wx")  
    )

What I'm trying to accomplish is converting the data frame into the following form, in which: 1. the rows that share a "list" value are collapsed into a single row; 2. the "item" column is split at the hyphen into adjacent columns; 3. the "set" column provides the names of the resulting columns.

df2 <- data.frame(
       list = c(1:3),
       A_1 = c("ab", "ij", "qr"),
       A_2 = c("cd", "kl", "st"),
       B_1 = c("ef", "mn", "uv"), 
       B_2 = c("gh", "op", "wx"))

(Packages mentioned: base R, reshape, splitstackshape.)

split r transformation

Steve'sConnect Jan 13 '16 at 18:42

3 Answers

One approach, using the Hadleyverse (dplyr and tidyr):

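library(dplyr)
library(tidyr)
df %>% 
  separate(item, c("1", "2"), sep = "-") %>%    # split "ab-cd" into columns "1" and "2"
  gather(val, item, -set, -list) %>%            # stack the two new columns into long form
  mutate(set = paste(set, val, sep = "_")) %>%  # build the target names, e.g. "A_1"
  select(-val) %>% 
  spread(set, item)                             # one column per name, one row per list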
#   list A_1 A_2 B_1 B_2
# 1    1  ab  cd  ef  gh
# 2    2  ij  kl  mn  op
# 3    3  qr  st  uv  wx
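The same idea can also be sketched with pivot_wider(), which tidyr now offers as the successor to gather() and spread(). This is not part of the original answer. It assumes a tidyr version that provides pivot_wider() with the names_glue argument, and the intermediate names split_df and wide are chosen here only for illustration.

```r
library(tidyr)

# Same example data as in the question
df <- data.frame(
  list = rep(1:3, each = 2),
  set  = rep(c("A", "B"), times = 3),
  item = c("ab-cd", "ef-gh", "ij-kl", "mn-op", "qr-st", "uv-wx"),
  stringsAsFactors = FALSE
)

# Step 1: split "ab-cd" into two columns named "1" and "2"
split_df <- separate(df, item, into = c("1", "2"), sep = "-")

# Step 2: spread both value columns across the levels of `set`,
# gluing the names together as <set>_<position>
wide <- pivot_wider(
  split_df,
  names_from  = set,
  values_from = c("1", "2"),
  names_glue  = "{set}_{.value}"
)

# Step 3: put the columns in the order requested in the question
wide <- as.data.frame(wide)[, c("list", "A_1", "A_2", "B_1", "B_2")]
wide
```

The explicit reordering in the last step keeps the sketch independent of the default column order that pivot_wider() produces.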

lukeA Jan 13 '16 at 19:11

For completeness, this also works well with the nemesis of the Hadleyverse, base R's reshape:

reshape(cbind(df[-3], 
              do.call(rbind, strsplit(as.character(df$item), "-"))), 
        direction = "wide", idvar = "list", timevar = "set")
#   list 1.A 2.A 1.B 2.B
# 1    1  ab  cd  ef  gh
# 3    2  ij  kl  mn  op
# 5    3  qr  st  uv  wx

(But dcast + cSplit will be much more efficient and readable.)
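The reshape output above uses names such as 1.A rather than the A_1 form requested in the question. A minimal base R sketch of one way to rename them follows. The regular expression and the names parts, long and wide are assumptions made for this illustration, not part of the original answer.

```r
# Same example data as in the question
df <- data.frame(
  list = rep(1:3, each = 2),
  set  = rep(c("A", "B"), times = 3),
  item = c("ab-cd", "ef-gh", "ij-kl", "mn-op", "qr-st", "uv-wx"),
  stringsAsFactors = FALSE
)

# Split each item at the hyphen into a two-column character matrix
parts <- do.call(rbind, strsplit(df$item, "-", fixed = TRUE))
colnames(parts) <- c("1", "2")

# Bind the pieces to the identifying columns, then reshape to wide format
long <- cbind(df[c("list", "set")], parts)
wide <- reshape(long, direction = "wide", idvar = "list", timevar = "set")

# Turn names like "1.A" into "A_1"; "list" does not match and is left alone
names(wide) <- sub("^([0-9]+)\\.(.+)$", "\\2_\\1", names(wide))
rownames(wide) <- NULL
wide
```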

A5C1D2H2I1M1N2O1R2T1 Jan 13 '16 at 19:29

Heroka · Accepted Answer · 2016-01-13T18:54:59+0000

Following @AnandaMahto's suggestion:

library(data.table)       # provides as.data.table() and dcast()
library(splitstackshape)  # provides cSplit()
cSplit(dcast(as.data.table(df), list ~ set, value.var = "item"), c("A", "B"), "-")

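Alternatively, here is an approach using base R and reshape2.

First, split the item column into two new columns named "1" and "2".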

df[,c("1","2")] <- do.call(rbind,strsplit(as.character(df$item),"-"))
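To see what the right-hand side of that assignment produces, here is a small sketch on a two-element vector. The names items and parts are made up for this illustration.

```r
# Two sample values in the same "xx-yy" shape as the item column
items <- c("ab-cd", "ef-gh")

# strsplit() returns a list with one character vector per input element;
# do.call(rbind, ...) stacks those vectors as the rows of a character matrix
parts <- do.call(rbind, strsplit(items, "-", fixed = TRUE))
parts
```

Each row of parts corresponds to one input string and each column to one side of the hyphen, which is why the matrix can be assigned directly to two new data frame columns.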

Then reshape to wide format with recast:

library(reshape2)  # provides recast()
res <- recast(data = df, list ~ set + variable, measure.var = c("1", "2"))
res

  list A_1 A_2 B_1 B_2
1    1  ab  cd  ef  gh
2    2  ij  kl  mn  op
3    3  qr  st  uv  wx
