Using separate() to Separate a Date

I have this data frame:

Source: local data frame [446,604 x 2]

                  date pressure
    1  2014_01_01_0:01      991
    2  2014_01_01_0:02      991
    3  2014_01_01_0:03      991
    4  2014_01_01_0:04      991
    5  2014_01_01_0:05      991
    6  2014_01_01_0:06      991
    7  2014_01_01_0:07      991
    8  2014_01_01_0:08      991
    9  2014_01_01_0:09      991
    10 2014_01_01_0:10      991
    ..             ...      ...

I want to split the date column using separate() from tidyr:

library(tidyr)
separate(df, date, into = c("year", "month", "day", "time"), sep="_") 

But that does not work. I managed to do it with substr() and mutate():

library(dplyr)
df %>%
  mutate(
    year = substr(date, 1, 4),
    month = substr(date, 6, 7),
    day = substr(date, 9, 10),
    time = substr(date, 12, 15))

Update:

The separate() call fails because some rows are invalid. I diagnosed this with my original substr() approach and found strange entries in the data frame:

df %>%
  select(date) %>%
  mutate(
    year = substr(date, 1, 4),
    month = substr(date, 6, 7),
    day = substr(date, 9, 10),
    time = substr(date, 12, 15)) %>%
  group_by(year) %>%
  summarise(n = n())

And here is what I get:

Source: local data frame [33 x 2]

   year      n
1  2014 446293
2  4164      9
3  4165     10
4  4166     10
5  4167     10
6  4168     10
7  4169     10
8  4170     10
9  4171     10
10 4172     10
11 4173     10
12 4174     10
13 4175     10
14 4176     10
15 4177     10
16 4178     10
17 4179     10
18 4180     10
19 4181     10
20 4182     10
21 4183     10
22 4184     10
23 4185     10
24 4186     10
25 4187     10
26 4188     10
27 4189     10
28 4190     10
29 4191     10
30 4192     10
31 4193     11
32 4194     10
33 4195      1

Is there a more efficient way to diagnose the structure of the column's values and find invalid rows before running separate()?

1 answer

Steps:

  • Run separate() on the date column.
  • Call separate() with extra = "drop".
  • Use group_by() and summarise() to count the rows in each group.
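
The steps above could look roughly like the sketch below. It is only an illustration under stated assumptions: the toy data frame, the regular expression for the expected YYYY_MM_DD_H:MM layout, and the names pattern, checked, counts and parts are all made up here and do not come from the original post. In current tidyr versions, extra controls rows with too many pieces and fill controls rows with too few, so both are set.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): two well-formed rows, one row without
# underscores, and one row with a surplus piece.
df <- data.frame(
  date     = c("2014_01_01_0:01", "2014_01_01_0:02", "bad entry", "2014_01_01_0:03_x"),
  pressure = c(991, 991, 991, 991),
  stringsAsFactors = FALSE
)

# Assumed layout of a valid value: YYYY_MM_DD_H:MM
pattern <- "^[0-9]{4}_[0-9]{2}_[0-9]{2}_[0-9]{1,2}:[0-9]{2}$"

# Flag rows that do not match the layout before splitting anything.
checked <- df %>% mutate(valid = grepl(pattern, date))

# Count valid and invalid rows.
counts <- checked %>%
  group_by(valid) %>%
  summarise(n = n())

# Split the column; surplus pieces are dropped and short rows are
# padded with NA on the right instead of triggering a warning.
parts <- separate(df, date, into = c("year", "month", "day", "time"),
                  sep = "_", extra = "drop", fill = "right")
```

Filtering with checked %>% filter(!valid) then lists the offending rows directly, which avoids scanning a per-year summary by eye.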

Source: https://habr.com/ru/post/1568188/

