Long and wide data - when to use what?

I am going to compile data from several data sets into one data set for analysis. I will be doing exploratory work, trying different things to find out what patterns might be hidden in the data, so at the moment I do not have a specific method in mind. Now I am wondering whether I should compile my data in long or wide format.

Which format should be used and why?

I understand that data can be reshaped from long to wide or vice versa, but the mere existence of this functionality implies that there is sometimes a need to change the format, and that need in turn implies that a particular format may be better suited to a particular task. So, when do I need which format, and why?

I am not asking about performance. That has been covered in other questions.

+4
3 answers

Hadley Wickham's Tidy Data paper, and the tidyr package, which is his (latest) implementation of its principles, are a great place to start.



Take, for example, mtcars. Reshaped into long form, it looks something like this:

        model type   value
1 AMC Javelin  mpg  15.200
2 AMC Javelin  cyl   8.000
3 AMC Javelin disp 304.000
4 AMC Javelin   hp 150.000
5 AMC Javelin drat   3.150
6 AMC Javelin   wt   3.435

Here mpg and cyl end up in the same value column, although they measure different things.
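
A table like the long mtcars one above can be produced roughly as follows. This is only a sketch: it assumes tidyr 1.0.0 or later for pivot_longer(), and `cars` and `cars_long` are names made up for the illustration. The rows come out in the data set's own order, so the first model listed will differ from the excerpt shown.

```r
library(tidyr)

# mtcars keeps the model names as row names; move them into a real column
# so they are carried along when the table is reshaped.
cars <- data.frame(model = rownames(mtcars), mtcars, row.names = NULL)

# Wide -> long: one row per (model, variable) pair.
cars_long <- pivot_longer(
  cars,
  cols = -model,        # reshape every column except the identifier
  names_to = "type",    # former column names go here
  values_to = "value"   # former cell values go here
)

head(cars_long)
```

Every numeric column is stacked into `value`, which is exactly why unrelated measurements such as mpg and cyl end up side by side in one column.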

Taking ChickWeight (which is in long form) and spreading it to wide by Time:

library(tidyr)
# spread() still works, but is superseded by pivot_wider() in tidyr >= 1.0.0
ChickWeight %>% spread(Time, weight)
   Chick Diet  0  2  4  6   8  10  12  14  16  18  20  21
1     18    1 39 35 NA NA  NA  NA  NA  NA  NA  NA  NA  NA
2     16    1 41 45 49 51  57  51  54  NA  NA  NA  NA  NA
3     15    1 41 49 56 64  68  68  67  68  NA  NA  NA  NA
4     13    1 41 48 53 60  65  67  71  70  71  81  91  96
5      9    1 42 51 59 68  85  96  90  92  93 100 100  98
6     20    1 41 47 54 58  65  73  77  89  98 107 115 117
7     10    1 41 44 52 63  74  81  89  96 101 112 120 124
8      8    1 42 50 61 71  84  93 110 116 126 134 125  NA
9     17    1 42 51 61 72  83  89  98 103 113 123 133 142
10    19    1 43 48 55 62  65  71  82  88 106 120 144 157
11     4    1 42 49 56 67  74  87 102 108 136 154 160 157
12     6    1 41 49 59 74  97 124 141 148 155 160 160 157
13    11    1 43 51 63 84 112 139 168 177 182 184 181 175
...
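
For reference, the same reshaping can be sketched with the newer tidyr verbs, including the trip back to long form. This assumes tidyr 1.0.0 or later; `chicks`, `chicks_wide` and `chicks_long` are hypothetical names, and the data is first copied into a plain data frame as a precaution, since ChickWeight carries extra classes.

```r
library(tidyr)

# Copy the four columns into an ordinary data.frame.
chicks <- with(ChickWeight, data.frame(Chick, Diet, Time, weight))

# Long -> wide: one row per chick, one column per measurement day.
# Days on which a chick was not weighed become NA.
chicks_wide <- pivot_wider(chicks, names_from = Time, values_from = weight)

# Wide -> long: stack the day columns again and drop the NA cells
# that the wide layout introduced.
chicks_long <- pivot_longer(
  chicks_wide,
  cols = -c(Chick, Diet),
  names_to = "Time",
  values_to = "weight",
  values_drop_na = TRUE
)

dim(chicks_wide)
dim(chicks_long)
```

Note that `Time` comes back as a character column after the round trip, because it was stored as column names in between; convert it with as.numeric() if it is needed as a number.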


+4



Source: https://habr.com/ru/post/1622724/

