I have wide-format data that has two different sets of columns of values: those that contain mass (Mass1, Mass2, etc.) and those that contain the corresponding dates (Mass1_date, Mass2_date, etc.).
library(tidyr)
library(dplyr)
library(lubridate)
df <- structure(list(Year = 2004, Nest_no = 21, Mass1 = 2325, Mass1_date = structure(1081987200, class = c("POSIXct",
"POSIXt"), tzone = "UTC"), Mass2 = 2000, Mass2_date = structure(1082851200, class = c("POSIXct",
"POSIXt"), tzone = "UTC"), Mass3 = 1750, Mass3_date = structure(1083715200, class = c("POSIXct",
"POSIXt"), tzone = "UTC")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L), .Names = c("Year", "Nest_no", "Mass1",
"Mass1_date", "Mass2", "Mass2_date", "Mass3", "Mass3_date"))
df
## Source: local data frame [1 x 8]
##
## Year Nest_no Mass1 Mass1_date Mass2 Mass2_date Mass3 Mass3_date
## (dbl) (dbl) (dbl) (time) (dbl) (time) (dbl) (time)
## 1 2004 21 2325 2004-04-15 2000 2004-04-25 1750 2004-05-05
I would like to reshape the data into long format, gathering ("melting") the two sets of columns into two value columns: one holding the values of the "Mass" columns and one holding the values of the corresponding date columns:
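## Source: local data frame [3 x 5]
##
## Year Nest_no capture date weight
## (dbl) (dbl) (dbl) (date) (dbl)
## 1 2004 21 1 2004-04-15 2325
## 2 2004 21 2 2004-04-25 2000
## 3 2004 21 3 2004-05-05 1750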
At first I thought that I could use tidyr and do it in two steps.
gather(df, capture, date, contains("Date")) %>%
gather(capture2, weight, contains("Mass"))
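To show what the two-step approach seems to be doing, here is a minimal sketch on a stand-in data frame. The name `toy` is hypothetical, and I use plain `Date` values instead of `POSIXct` for brevity, which is a simplification of my real data. As far as I can tell, each `gather()` call stacks its columns independently, so the second call repeats every mass for every date:

```r
library(tidyr)
library(dplyr)

# Stand-in for the one-row data above (assumption: Date instead of POSIXct).
toy <- tibble(
  Year = 2004, Nest_no = 21,
  Mass1 = 2325, Mass1_date = as.Date("2004-04-15"),
  Mass2 = 2000, Mass2_date = as.Date("2004-04-25"),
  Mass3 = 1750, Mass3_date = as.Date("2004-05-05")
)

# Step 1 stacks the three date columns: 1 row becomes 3.
step1 <- gather(toy, capture, date, contains("Date"))

# Step 2 stacks the three mass columns for each of those rows: 3 rows become 9.
step2 <- gather(step1, capture2, weight, contains("Mass"))

nrow(step1)  # 3
nrow(step2)  # 9

# Only the rows where the date key and the mass key share a capture number
# are real measurements; keeping just those leaves 3 rows.
matched <- filter(step2, sub("_date$", "", capture) == capture2)
nrow(matched)  # 3
```

Filtering on matching keys like this would recover the three real rows, but it feels like a workaround rather than a proper reshape.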
This two-step approach did not work as expected: it pairs every date with every mass, giving nine rows instead of three. After several attempts, I came up with this solution:
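# Gather every Mass* column (masses and dates alike) into key/value pairs;
# the dates are coerced to numeric seconds since 1970-01-01 along the way.
df <- gather(df, capture2, weight, contains("Mass"), convert = TRUE) %>%
  mutate(capture = extract_numeric(capture2))  # capture number: 1, 2, 3
# Relabel the key as either "date" or "weight", then spread back into two columns
df$capture2 <- ifelse(grepl("date", df$capture2), "date", "weight")
df <- spread(df, capture2, weight) %>%
  mutate(date = as.Date(as.POSIXct(date, origin = "1970-01-01")))  # numeric back to Date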
df
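## Source: local data frame [3 x 5]
##
## Year Nest_no capture date weight
## (dbl) (dbl) (dbl) (date) (dbl)
## 1 2004 21 1 2004-04-15 2325
## 2 2004 21 2 2004-04-25 2000
## 3 2004 21 3 2004-05-05 1750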
I was wondering if there is a better way to achieve this?
Thanks, Philip