Another solution uses data.table, and it does not rely on any unique fields existing in the source data.
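
    library(data.table)

    # Build the example table; strip.white = TRUE trims the padding around "|"
    DT = data.table(read.table(header=T, text="blah | splitme
    T | a,b,c
    T | a,c
    F | b,d
    F | e,f", stringsAsFactors=F, sep="|", strip.white = TRUE))

    # Split each row's splitme on commas, one output row per piece
    DT[,.( blah
         , splitme
         , splitted=unlist(strsplit(splitme, ","))
         ),by=seq_len(nrow(DT))]
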
The important part is by=seq_len(nrow(DT)): it is the "fake" unique identifier on which the splitting occurs, so every row forms its own group. It is tempting to use by=.I instead, since it looks like it should be defined the same way, but .I is a special symbol whose meaning seems to change with context, so it is best to stick to by=seq_len(nrow(DT)).
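To see why the row-number grouping matters, here is a minimal sketch with a made-up table (dup is a hypothetical name, not part of the original data) that contains two identical rows. Grouping by the existing columns merges the duplicates into one group, while grouping by row number keeps them apart:

```r
library(data.table)

# Hypothetical data: two identical rows, so no column identifies a row uniquely
dup <- data.table(blah = c(TRUE, TRUE), splitme = c("a,b", "a,b"))

# Grouping by row number: each row is split on its own
by_row <- dup[, .(blah, splitme, splitted = unlist(strsplit(splitme, ","))),
              by = seq_len(nrow(dup))]

# Grouping by the existing columns: both rows fall into a single group,
# and a by-column has length 1 inside j, so one duplicate is lost
by_cols <- dup[, .(splitted = unlist(strsplit(splitme, ","))),
               by = .(blah, splitme)]

nrow(by_row)   # 4 rows: a, b, a, b
nrow(by_cols)  # 2 rows: a, b
```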
The j expression produces three columns (the grouping column from by is added in front of them). We simply name the two existing columns and then compute the third by splitting splitme on commas:

    .( blah      # first column of original
     , splitme   # second column of original
     , splitted = unlist(strsplit(splitme, ","))
     )

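Because the result also carries the grouping column, it can be convenient to give that column an explicit name and drop it afterwards. The following is only a sketch of one way to do that; row_id and long are names chosen here for illustration, and the table is rebuilt directly rather than parsed from text:

```r
library(data.table)

# Same example data as above, built directly
DT <- data.table(
  blah    = c(TRUE, TRUE, FALSE, FALSE),
  splitme = c("a,b,c", "a,c", "b,d", "e,f")
)

# Name the per-row grouping column explicitly (row_id is an illustrative name)
long <- DT[, .(blah, splitme, splitted = unlist(strsplit(splitme, ","))),
           by = .(row_id = seq_len(nrow(DT)))]

# Remove the helper column by reference once the split is done
long[, row_id := NULL]

# Sanity check: one output row per comma-separated piece
stopifnot(nrow(long) == sum(lengths(strsplit(DT$splitme, ","))))
```

After this, long holds only blah, splitme and splitted.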
Aaron McDaid, Jul 14 '16 at 17:55