Splitting multiple values in a single column into multiple rows in R

Question

I have a data frame in which, for the most part, each row is one observation. However, some rows contain several values:

    # A tibble: 3 x 2
      `number` abilities
         <dbl> <chr>
    1       51 b1261
    2       57 d710
    3       57 b1301; d550

    structure(list(`number` = c(51, 57, 57),
                   abilities = c("b1261", "d710", "b1301; d550")),
              .Names = c("number", "abilities"),
              row.names = c(NA, -3L),
              class = c("tbl_df", "tbl", "data.frame"))

I would like to get the following:

    # A tibble: 4 x 2
      `number` abilities
         <dbl> <chr>
    1       51 b1261
    2       57 d710
    3       57 d550
    4       57 b1301

It is straightforward to split on the ";", but I'm not sure how to add the new rows, especially since abilities can contain more than two values.

This is very similar to the question about splitting a semicolon-delimited column into rows in R, but it does not require removing duplicates.

+5

r dplyr tidyr

pluke Jun 06 '17 at 23:04

3 answers

The tidyverse is good for this, as tidyr has unnest:

    library(tidyverse)
    library(stringr)

    df %>%
      mutate(unpacked = str_split(abilities, ";")) %>%  # list column of pieces
      unnest(unpacked) %>%                               # one row per piece
      mutate(abilities = str_trim(unpacked))             # strip surrounding spaces
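The same split-then-expand idea can also be sketched in base R, without any packages. This is only an illustration under assumptions: the data frame below is hypothetical (the question's three rows plus an invented fourth row with three values), and the names `pieces` and `long` are made up for the example.

```r
# Hypothetical example data: the question's rows plus one row with
# three values, to check that more than two values are handled.
df <- data.frame(
  number = c(51, 57, 57, 60),
  abilities = c("b1261", "d710", "b1301; d550", "a1; a2;a3"),
  stringsAsFactors = FALSE
)

# Split each string on ";" followed by optional whitespace.
# The result is a list with one character vector per original row.
pieces <- strsplit(df$abilities, ";\\s*")

# Repeat each `number` once per piece, then flatten the pieces
# into a single column.
long <- data.frame(
  number = rep(df$number, lengths(pieces)),
  abilities = trimws(unlist(pieces)),
  stringsAsFactors = FALSE
)

print(long)
```

Here `lengths(pieces)` gives the number of values in each original row, so `rep()` duplicates the other column exactly as many times as needed.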

+3

Marius Jun 06 '17 at 23:12

Another option is cSplit from the splitstackshape package:

    library(splitstackshape)

    cSplit(df1, 'abilities', '; ', 'long')
    #   number abilities
    #1:     51     b1261
    #2:     57      d710
    #3:     57     b1301
    #4:     57      d550

+1

akrun Jun 07 '17 at 4:07

Lamia · Accepted Answer · 2017-06-06T23:12:12+0000

tidyr has a separate_rows function to do this:

    library(tidyr)

    ## The ";\\s+" means that the separator is a ";" followed by one or more spaces
    separate_rows(df, abilities, sep = ";\\s+")

      number abilities
       <dbl> <chr>
    1     51 b1261
    2     57 d710
    3     57 b1301
    4     57 d550
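Since the question mentions that abilities can hold more than two values, here is a small sketch of separate_rows on such a row. The fourth row is invented for illustration and is not from the question. It also uses ";\\s*" rather than ";\\s+", on the assumption that some values may follow the semicolon with no space at all, in which case ";\\s+" would not split them.

```r
library(tidyr)

# Hypothetical data: the question's rows plus one row with three
# values and inconsistent spacing after the semicolons.
df <- data.frame(
  number = c(51, 57, 57, 60),
  abilities = c("b1261", "d710", "b1301; d550", "a1;a2;  a3"),
  stringsAsFactors = FALSE
)

# ";\\s*" matches a ";" followed by zero or more whitespace characters,
# so "a1;a2" and "a2;  a3" are both split cleanly.
long <- separate_rows(df, abilities, sep = ";\\s*")

print(long)
```

Each value ends up in its own row, with `number` repeated alongside it.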
