Convert the numeric representation of the variable column to the original string after melting using patterns

Question

I use patterns() in the measure argument of data.table::melt() to melt data whose columns follow several easily defined patterns. It works, but I don't see how to get a character index variable instead of the default numeric one.

For example, in A the dog and cat columns have numeric suffixes; look at the "variable" column:

    library(data.table)

    A = data.table(idcol = c(1:5),
                   dog_1 = c(1:5), cat_1 = c(101:105),
                   dog_2 = c(6:10), cat_2 = c(106:110),
                   dog_3 = c(11:15), cat_3 = c(111:115))

    head(melt(A, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat")))
       idcol variable dog cat
    1:     1        1   1 101
    2:     2        1   2 102
    3:     3        1   3 103
    4:     4        1   4 104
    5:     5        1   5 105
    6:     1        2   6 106

However, in B the dog and cat columns have text suffixes, yet the variable column is still numeric.

    B = data.table(idcol = c(1:5),
                   dog_one = c(1:5), cat_one = c(101:105),
                   dog_two = c(6:10), cat_two = c(106:110),
                   dog_three = c(11:15), cat_three = c(111:115))

    head(melt(B, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat")))
       idcol variable dog cat
    1:     1        1   1 101
    2:     2        1   2 102
    3:     3        1   3 103
    4:     4        1   4 104
    5:     5        1   5 105
    6:     1        2   6 106

How can I fill the variable column with one / two / three instead of 1/2/3?


r data.table melt

Nancy Jan 26 '17 at 21:48


1 answer

Henrik · Accepted Answer · 2017-01-26T22:20:58+0000

There may be simpler ways, but this seems to work:

    # grab suffixes of 'variable' names
    suff <- unique(sub('^.*_', '', names(B[ , -1])))
    # suff <- unique(tstrsplit(names(B[, -1]), "_")[[2]])

    # melt
    B2 <- melt(B, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat"))

    # replace factor levels in 'variable' with the suffixes
    setattr(B2$variable, "levels", suff)

    B2
    #     idcol variable dog cat
    #  1:     1      one   1 101
    #  2:     2      one   2 102
    #  3:     3      one   3 103
    #  4:     4      one   4 104
    #  5:     5      one   5 105
    #  6:     1      two   6 106
    #  7:     2      two   7 107
    #  8:     3      two   8 108
    #  9:     4      two   9 109
    # 10:     5      two  10 110
    # 11:     1    three  11 111
    # 12:     2    three  12 112
    # 13:     3    three  13 113
    # 14:     4    three  14 114
    # 15:     5    three  15 115
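The relabelling above relies on the dog and cat columns appearing in the same suffix order, because melt() pairs the i-th column matched by each pattern. A minimal sketch of the same idea with that assumption checked explicitly follows; the names dog_suff and cat_suff are illustrative, and factor(..., labels = ) stands in for setattr(), so it is a variation rather than the answer's exact method.

```r
library(data.table)

B <- data.table(idcol = 1:5,
                dog_one = 1:5,     cat_one = 101:105,
                dog_two = 6:10,    cat_two = 106:110,
                dog_three = 11:15, cat_three = 111:115)

# Suffixes for each prefix, in the order the columns appear in B
# (dog_suff / cat_suff are hypothetical helper names)
dog_suff <- sub("^dog_", "", grep("^dog", names(B), value = TRUE))
cat_suff <- sub("^cat_", "", grep("^cat", names(B), value = TRUE))

# melt() matches the i-th "^dog" column with the i-th "^cat" column,
# so relabelling is only meaningful when both suffix vectors agree
stopifnot(identical(dog_suff, cat_suff))

B2 <- melt(B, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat"))

# 'variable' is a factor with levels "1", "2", "3"; relabel them in order
B2[, variable := factor(variable, labels = dog_suff)]
```

If the columns were interleaved differently (say cat_two before cat_one), the stopifnot() would fail instead of silently attaching the wrong labels.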

Note that there is an open issue on this very topic, with some other alternatives: FR: expanding melt functionality to handle output names.
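For readers on a newer data.table: as far as I know, later releases (1.14.2 onwards, to the best of my recollection) added a measure() helper that addresses this kind of request. The sketch below assumes such a version is installed; suffix is an arbitrary column name chosen for illustration, while value.name is the keyword measure() uses for the part of each column name that becomes an output column.

```r
library(data.table)

B <- data.table(idcol = 1:5,
                dog_one = 1:5,     cat_one = 101:105,
                dog_two = 6:10,    cat_two = 106:110,
                dog_three = 11:15, cat_three = 111:115)

# ASSUMPTION: the installed data.table provides measure().
# Each measured column name is split on "_": the first piece names the
# output value column (dog / cat), the second piece fills the 'suffix' column.
B3 <- melt(B, measure.vars = measure(value.name, suffix, sep = "_"))
```

Here the suffixes are taken directly from the column names, so no relabelling of factor levels is needed afterwards.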

This is one of the (rare) cases where I find good ol' base::reshape cleaner. The sep argument is useful here: both the names of the "value" columns and the levels of the "variable" column are generated in one go:

 reshape(data = B, varying = names(B[ , -1]), sep = "_", direction = "long")
