Text-to-columns equivalent in R: splitting data frame columns by a character

Question


I would like to know how to split columns the way Excel's text-to-columns function does. There are many Stack Exchange tutorials on splitting columns by a character, but they don't address three things I need:

1) work with a column where only some of the rows contain the separator character
2) work with a data frame that has many columns
3) treat the columns as characters / factors

For example, I have a data frame:

    df <- data.frame(V1 = c("01, 02", "04", "05, 06", "07, 08", "09", "10"),
                     V2 = c("11, 12", "14", "13, 14", "11, 14", "13", "15"))

If I used text-to-columns on V1 in Excel, I would end up with 3 columns, split at the comma. The second column would be created only for those cells that contain a comma; rows without a comma would get empty cells. I would also have the option to treat the new column as a number or as text. In this case I need the leading zero, so it should be treated as text.

It would look something like this:

          V1 V2 V3
    Row 1 01 02 11,12
    Row 2 04 NA 14

How would I do something like this in R, bearing in mind that my dataset has many columns, so it is not practical to rename each column in the code?

Hope this was clear. Thanks for the help!


r

tom Dec 12 '14 at 3:49


2 answers

akrun · Answer 1 · 2014-12-12T03:52:45+0000

Maybe this helps:

    library(splitstackshape)
    cSplit(df, 'V1', sep=", ", type.convert=FALSE)
    #        V2 V1_1 V1_2
    # 1: 11, 12   01   02
    # 2:     14   04   NA
    # 3: 13, 14   05   06
    # 4: 11, 14   07   08
    # 5:     13   09   NA
    # 6:     15   10   NA

If you want both columns to be split:

    cSplit(df, 1:ncol(df), sep=",", stripWhite=TRUE, type.convert=FALSE)
    #    V1_1 V1_2 V2_1 V2_2
    # 1:   01   02   11   12
    # 2:   04   NA   14   NA
    # 3:   05   06   13   14
    # 4:   07   08   11   14
    # 5:   09   NA   13   NA
    # 6:   10   NA   15   NA

The default is type.convert=TRUE, which converts the split values to numeric and would therefore drop the leading zeros.
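To illustrate why type.convert=FALSE matters here, the snippet below uses base R's utils::type.convert on a few zero-padded strings. This is only a sketch: it assumes that the conversion cSplit applies behaves like this base function, and the variable names are made up for the example.

```r
# Zero-padded values as they would appear after splitting (illustrative data)
x <- c("01", "02", NA)

# Converting to the "natural" type turns the strings into integers,
# so the leading zeros are lost.
converted <- utils::type.convert(x, as.is = TRUE)
class(converted)   # "integer"

# Leaving the values as character keeps them exactly as written.
kept <- x
class(kept)        # "character"
```

If the padded form is needed later (for example as an ID), keeping the column as character is usually simpler than re-padding numbers afterwards.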

data

    df <- data.frame(V1 = c("01, 02", "04", "05, 06", "07, 08", "09", "10"),
                     V2 = c("11, 12", "14", "13, 14", "11, 14", "13", "15"))

42- · Answer 2 · 2014-12-12T06:52:23+0000

Splitting with strsplit and then extracting the pieces with "[" seems to work. You do realize that these columns were factors, I hope?

    # as.character() is needed because the columns may be factors;
    # splitting on ", " (comma plus space) avoids a leading space in the second piece
    spl <- strsplit(as.character(df$V1), ", ")
    data.frame(V1 = sapply(spl, "[", 1), V2 = sapply(spl, "[", 2), df$V2)
      V1   V2  df.V2
    1 01   02 11, 12
    2 04 <NA>     14
    3 05   06 13, 14
    4 07   08 11, 14
    5 09 <NA>     13
    6 10 <NA>     15
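The same strsplit idea can be extended to every column without naming any of them, which is what the question asks for. The following is a minimal base R sketch, not code from either answer: split_all_columns is a hypothetical helper name, and it assumes the separator is the literal string ", " and that the data frame has at least one row.

```r
# Hypothetical helper: split every column on `sep`, keep pieces as character,
# and pad rows that have fewer pieces with NA.
split_all_columns <- function(df, sep = ", ") {
  pieces <- lapply(names(df), function(col) {
    # as.character() guards against factor columns
    parts <- strsplit(as.character(df[[col]]), sep, fixed = TRUE)
    width <- max(lengths(parts))
    # Indexing past the end of a vector yields NA, which pads short rows
    mat <- do.call(rbind, lapply(parts, `[`, seq_len(width)))
    out <- as.data.frame(mat, stringsAsFactors = FALSE)
    names(out) <- paste(col, seq_len(width), sep = "_")
    out
  })
  do.call(cbind, pieces)
}

df <- data.frame(V1 = c("01, 02", "04", "05, 06", "07, 08", "09", "10"),
                 V2 = c("11, 12", "14", "13, 14", "11, 14", "13", "15"))

res <- split_all_columns(df)
names(res)   # "V1_1" "V1_2" "V2_1" "V2_2"
```

New columns are named after the original column plus a running index, mirroring the naming seen in the cSplit output, and all values stay character so leading zeros survive.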
