How does spread() in tidyr handle factor levels?

While manipulating my data, I found that something had gone wrong at some point in the process. When I investigated, the problem boiled down to the following behavior of spread() in the tidyr package.

Here is a demo. Say we have a data frame as shown below.

    > d <- data.frame(factor1 = rep(LETTERS[1:3], each = 3),
    +                 factor2 = rep(paste0("level", c(1, 2, 10)), 3),
    +                 num = 1:9
    + )
    > d
      factor1 factor2 num
    1       A  level1   1
    2       A  level2   2
    3       A level10   3
    4       B  level1   4
    5       B  level2   5
    6       B level10   6
    7       C  level1   7
    8       C  level2   8
    9       C level10   9

I wanted to convert this long-format data frame to wide format, and I thought spread() was the way to go. The result, however, was not what I expected.

    > spread(d, factor2, num)
      factor1 level1 level2 level10
    1       A      1      3       2
    2       B      4      6       5
    3       C      7      9       8

If factor1 is "A" and factor2 is "level2", the value should be 2, but the resulting wide format says 3. Apparently the num values are placed in the wide format following the alphabetical order of factor2 (level1, level10, level2), while the column labels keep the order in which they appear in the original data frame (level1, level2, level10). The values and the labels therefore no longer line up.
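The alphabetical order mentioned above can be reproduced in base R without tidyr. This is only a minimal sketch of how factor levels are sorted; it assumes factor2 ends up as a factor (the default for data.frame() before R 4.0.0), and the names f_default and f_ordered are made up for illustration.

```r
# Base R only; this does not call tidyr.
labels <- rep(paste0("level", c(1, 2, 10)), 3)

# factor() sorts the unique values as strings,
# so "level10" lands before "level2".
f_default <- factor(labels)
levels(f_default)
#> [1] "level1"  "level10" "level2"

# Passing levels explicitly keeps the order of first appearance.
f_ordered <- factor(labels, levels = unique(labels))
levels(f_ordered)
#> [1] "level1"  "level2"  "level10"
```

Setting the levels explicitly like this is one way to control the column order of a wide table, independent of which ordering a given tidyr version applies.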

Can someone explain why this is happening (and/or where I can find relevant information)?

1 answer

Using the provided data, I got a different result:

    > packageVersion("tidyr")
    [1] '0.1'
    > spread(d, factor2, num)
      factor1 level1 level10 level2
    1       A      1       3      2
    2       B      4       6      5
    3       C      7       9      8
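As a cross-check that does not depend on the tidyr version, the same wide table can be built with base R's tapply(). This is a sketch, not what spread() does internally; the name wide is chosen here for illustration, and stringsAsFactors = FALSE is set so the example behaves the same across R versions.

```r
# Rebuild the question's data frame with plain character columns.
d <- data.frame(factor1 = rep(LETTERS[1:3], each = 3),
                factor2 = rep(paste0("level", c(1, 2, 10)), 3),
                num = 1:9,
                stringsAsFactors = FALSE)

# Each (factor1, factor2) pair occurs once, so sum() just returns num.
# tapply() names rows and columns by level, so values and labels stay paired.
wide <- with(d, tapply(num, list(factor1, factor2), sum))

wide["A", "level2"]   # 2, as the question expects
wide["A", "level10"]  # 3
colnames(wide)        # "level1" "level10" "level2"
```

Indexing by name rather than by column position makes it easy to confirm that "A" and "level2" map to 2.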

Source: https://habr.com/ru/post/1204123/

