Extract hours and seconds from POSIXct for plotting purposes in R

Suppose I have data.frame foo

               start.time duration
    1 2012-02-06 15:47:00        1
    2 2012-02-06 15:02:00        2
    3 2012-02-22 10:08:00        3
    4 2012-02-22 09:32:00        4
    5 2012-03-21 13:47:00        5

And class(foo$start.time) returns

 [1] "POSIXct" "POSIXt" 

I would like to create a plot of foo$duration vs. foo$start.time. In my scenario, I'm only interested in the time of day, not the actual day of the year. How do I extract the time of day, in the form hours:seconds, from a vector of class POSIXct?

+42
datetime r ggplot2 lubridate
May 22 '12 at 15:38
4 answers

This is a good question, and it highlights some of the difficulties of working with dates in R. The lubridate package is very convenient, so below I present two approaches: one using base R (as suggested by @RJ-) and the other using lubridate.

Recreate (the first three rows of) the data frame from the original post:

 foo <- data.frame(start.time = c("2012-02-06 15:47:00", "2012-02-06 15:02:00", "2012-02-22 10:08:00"), duration = c(1,2,3)) 

Convert the strings to a date-time class (two ways to do this; note that strptime returns POSIXlt, while ymd_hms returns POSIXct):

    # using base::strptime
    t.str <- strptime(foo$start.time, "%Y-%m-%d %H:%M:%S")

    # using lubridate::ymd_hms
    library(lubridate)
    t.lub <- ymd_hms(foo$start.time)

Now extract the time of day as decimal hours:

    # using base::format
    h.str <- as.numeric(format(t.str, "%H")) +
             as.numeric(format(t.str, "%M"))/60

    # using lubridate::hour and lubridate::minute
    h.lub <- hour(t.lub) + minute(t.lub)/60

Demonstrate that these approaches are equal:

 identical(h.str, h.lub) 

Then choose one of the approaches above to assign the decimal hour to foo$hr:

    foo$hr <- h.str

    # If you prefer, the choice can be made at random:
    foo$hr <- if(runif(1) > 0.5){ h.str } else { h.lub }

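Then plot using the ggplot2 package:

    library(ggplot2)
    # foo$hr is numeric (decimal hours), so a continuous scale is used here;
    # scale_x_datetime expects date-time values
    qplot(foo$hr, foo$duration) +
        scale_x_continuous(labels = function(h)
            sprintf("%02d:%02d", as.integer(h %/% 1), as.integer(round((h %% 1) * 60))))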
+38
May 22 '12 at 18:42

You can rely on base R:

    # Using R 2.14.2
    # The same toy data
    foo <- data.frame(start.time = c("2012-02-06 15:47:00",
                                     "2012-02-06 15:02:00",
                                     "2012-02-22 10:08:00"),
                      duration = c(1,2,3))

Since POSIXct values print in a fixed, structured format, you can use substr to extract the characters at the positions that hold the time. That is, if you know the format of your POSIXct (how it is displayed when printed), you can extract the hours and minutes:

    # Extract hour and minute as a character vector, of the form "%H:%M"
    substr(foo$start.time, 12, 16)
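The character positions used by substr depend on how the values happen to be printed. As a hedged alternative, assuming start.time has already been converted to POSIXct (the toy data above stores it as text), format() with "%H:%M" asks for the hours and minutes explicitly. The names st and hhmm, and the choice of tz = "UTC", are illustrative only and not part of the original answer.

```r
# Assumption: the start times are parsed to POSIXct first; tz = "UTC" is an arbitrary choice
st <- as.POSIXct(c("2012-02-06 15:47:00",
                   "2012-02-06 15:02:00",
                   "2012-02-22 10:08:00"), tz = "UTC")

# Request hours and minutes by format code rather than by character position
hhmm <- format(st, "%H:%M")
hhmm
```

The result is a character vector of the same "%H:%M" form that the substr call produces, so it can be used in the same way in the next step.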

Then paste it onto an arbitrary date to convert it back to POSIXct. In this example I use January 1, 2012, but if you do not specify a date and supply a format instead, R uses the current date.

    # Store time information as POSIXct, using an arbitrary date
    foo$time <- as.POSIXct(paste("2012-01-01", substr(foo$start.time, 12, 16)))
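To illustrate the remark about the current date, here is a small sketch that parses a time-only string by supplying a format. The variable name t.only is mine, not the answer's, and the date is taken from the local time zone of the R session.

```r
# Parse a time of day without supplying a date
t.only <- as.POSIXct("15:47", format = "%H:%M")

# The date part is filled in with the current (local) date
format(t.only, "%Y-%m-%d") == format(Sys.time(), "%Y-%m-%d")

# The time of day is kept as given
format(t.only, "%H:%M")
```

One consequence is that values created this way on different days will not share a date, which is why pasting a fixed arbitrary date, as above, is the safer choice for plotting.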

Both plot and ggplot2 can format POSIXct times out of the box.

    # Plot it using base graphics
    plot(duration~time, data=foo)

    # Plot it using ggplot2 (0.9.2.1)
    library(ggplot2)
    qplot(x=time, y=duration, data=foo)
+14
Oct. 12 '12 at 23:18

This code is much faster than converting to a string and back to numeric:

    time <- c("1979-11-13T08:37:19-0500", "2014-05-13T08:37:19-0400");
    time.posix <- as.POSIXct(time, format = "%Y-%m-%dT%H:%M:%S%z");
    time.epoch <- as.vector(unclass(time.posix));
    time.poslt <- as.POSIXlt(time.posix, tz = "America/New_York");
    time.hour.new.york <- time.poslt$hour + time.poslt$min/60 + time.poslt$sec/3600;

    > time;
    [1] "1979-11-13T08:37:19-0500" "2014-05-13T08:37:19-0400"
    > time.posix;
    [1] "1979-11-13 15:37:19 IST" "2014-05-13 15:37:19 IDT"
    > time.poslt;
    [1] "1979-11-13 08:37:19 EST" "2014-05-13 08:37:19 EDT"
    > time.epoch;
    [1] 311348239 1399984639
    > time.hour.new.york;
    [1] 8.621944 8.621944
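The output above shows the same instants printed in two different zones, so the extracted hour depends on the tz passed to as.POSIXlt. A minimal sketch of that dependence, wrapping the same field arithmetic in a hypothetical helper called decimal_hour (my name, not the answer's) and assuming the "America/New_York" zone is available on the system:

```r
# One instant, parsed with its UTC offset
x <- as.POSIXct("2014-05-13T08:37:19-0400",
                format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")

# Hypothetical helper: decimal hour of day as seen in a given time zone
decimal_hour <- function(x, tz) {
  lt <- as.POSIXlt(x, tz = tz)
  lt$hour + lt$min / 60 + lt$sec / 3600
}

h.utc <- decimal_hour(x, "UTC")
h.ny  <- decimal_hour(x, "America/New_York")
h.utc - h.ny   # offset between the two zones, in hours
```

Choosing the zone explicitly avoids surprises when the session's local time zone differs from the one in which the data were recorded.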
+4
Jul 01 '14 at 18:32

lubridate does not handle time-of-day data, so Hadley recommends the hms package for this data type. Something like this will work:

    library(lubridate)
    library(readr)   # provides parse_datetime
    library(dplyr)   # provides mutate and %>%

    foo <- data.frame(start.time = parse_datetime(c("2012-02-06 15:47:00",
                                                    "2012-02-06 15:02:00",
                                                    "2012-02-22 10:08:00")),
                      duration = c(1,2,3))
    foo <- foo %>%
        mutate(time_of_day = hms::hms(second(start.time), minute(start.time), hour(start.time)))

Watch out for two potential problems: 1) lubridate has another function called hms, and 2) hms::hms takes its arguments in the opposite order to the one its name suggests (so that seconds alone can be supplied).

+1
Dec 6 '17 at 11:43


