Transpose data in column X in data state in column Y

I have a problem with data organization. I have data that looks like this:

ROW   date      names
1     1.1.2000  A
2     NA        B
3     NA        C
4     1.1.2000  X
5     NA        Y
6     2.1.2000  Z

I want it to look like this:

ROW   date      name1 name2 name3 name4
1     1.1.2000  A     B     C     NA
2     1.1.2000  X     Y     NA    NA
3     2.1.2000  Z     NA    NA    NA

Therefore, the code must check the "date" column, and when it finds the date, it stores the row number (ROW1). Then it checks the following lines of "date" -column and, if the value is "NA", the program saves its rowan berries (ROW y: x) until it finds the next date (the next date line is not saved). The code then moves on to the y: x rows in the name columns and moves their data to the new columns in ROW1. After that, the code performs the same process for the next date that it finds after the y: x lines.

ROW1 y: x , . , , , .

, , , , .

+4
1
library(dplyr)
library(tidyr)

df = read.table(text = "
ROW   date      names
1     1.1.2000  A
2     NA        B
3     NA        C
4     1.1.2000  X
5     NA        Y
6     2.1.2000  Z
", header=T, stringsAsFactors=F)

df %>%
  group_by(ROW = cumsum(!is.na(date))) %>%       # create the rows of updated dataset based on rows without NAs; for each new row:
  mutate(counter = row_number(),                 # count how many columns you need for each new row
         date = unique(date[!is.na(date)])) %>%  # keep unique date after excluding NAs
  ungroup() %>%                                  # forget the grouping
  mutate(counter = paste0("name",counter)) %>%   # update variable to use as column names
  spread(counter, names)                         # reshape dataset

# # A tibble: 3 x 5
#     ROW     date name1 name2 name3
# * <int>    <chr> <chr> <chr> <chr>
# 1     1 1.1.2000     A     B     C
# 2     2 1.1.2000     X     Y  <NA>
# 3     3 2.1.2000     Z  <NA>  <NA>
+6

Source: https://habr.com/ru/post/1689870/


All Articles