A subset of the data frame for the specified month and year

I have a dataframe where the first column is a date in d / m / y format and the second is a numeric value (sale).

I want to create subsets for each month in one year (e.g. 11/11, 12/11, etc.). I tried the code suggested in this answer: a subset of data.frame with several conditions

and it works when the month condition is met:

subset(sales, format.Date(date, "%m")=="11") 

but it returns an empty subset with the invalid 'x' argument error message when I add the condition of the year:

 subset(sales, format.Date(date, "%m")=="11" & format.Date(date, "%y")=="11") 

I am using R 2.10.1-2 on Ubuntu 10.04, thanks for the help you can give.

+6
source share
3 answers

Since you did not provide the dataset, I made my own from the link you provided. Your method works for me, and I get an empty data set only if I do not agree with both conditions (month and year), so I assume that you are trying to multiply a series of dates (month and year) that does not exist ( but cannot say for sure without the code you use). Here is the code I used:

 sales <- read.table(text="2372 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 1.3 05/07/2006 9104 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 0.34 07/23/2006 9212 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 0.33 02/11/2007 2094 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 1.4 05/06/2007 16763 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 0.61 05/11/2009 1076 Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR 0.48 05/12/2002 1077 Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR 0.3 05/07/2006") sales$V9 <- as.Date(sales$V9, "%m/%d/%Y") names(sales)[9] <- 'date' subset(sales, format.Date(date, "%m")=="05" & format.Date(date, "%y")=="07") # V1 V2 V3 V4 V5 V6 V7 V8 date #4 2094 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE 1.4 2007-05-06 subset(sales, format.Date(date, "%m")=="05" & format.Date(date, "%y")=="10") #[1] V1 V2 V3 V4 V5 V6 V7 V8 date #<0 rows> (or 0-length row.names) 
+4
source

This answer avoids the subset , handles missing cases, and uses the as.POSIXct date / time as.POSIXct . Although, the rest of the code is pretty much the same as Tyler Rinker's answer. Note that I should specify the name of the date / time variable inside as.POSIXct , and not use the name of the unformatted variable Date_Time .

 my.data <- read.csv(text = ' Date_Time, state, city 10/05/2011 07:32:40, AK, aa 15/06/2011 13:26:02, AK, bb 19/07/2011 13:26:02, OH, cc NA, OH, dd 20/05/2012 14:57:27, PA, ee 22/07/2012 14:57:27, AL, ff 20/03/2013 15:03:18, NY, gg ', header=TRUE, stringsAsFactors = FALSE, na.strings = 'NA', strip.white = TRUE) my.data$my_Date_Time <- as.POSIXct(my.data$Date_Time, format = "%d/%m/%Y %H:%M:%S") # Select May my.data[format.Date(my.data$my_Date_Time, "%m")=="05" & !is.na(my.data$my_Date_Time),] # Date_Time state city my_Date_Time # 1 10/05/2011 07:32:40 AK aa 2011-05-10 07:32:40 # 5 20/05/2012 14:57:27 PA ee 2012-05-20 14:57:27 # Select 2012 my.data[format.Date(my.data$my_Date_Time, "%Y")=="2012" & !is.na(my.data$my_Date_Time),] # Date_Time state city my_Date_Time # 5 20/05/2012 14:57:27 PA ee 2012-05-20 14:57:27 # 6 22/07/2012 14:57:27 AL ff 2012-07-22 14:57:27 # Select May 2012 my.data[format.Date(my.data$my_Date_Time, "%m")=="05" & format.Date(my.data$my_Date_Time, "%Y")=="2012" & !is.na(my.data$my_Date_Time),] # Date_Time state city my_Date_Time # 5 20/05/2012 14:57:27 PA ee 2012-05-20 14:57:27 
+2
source

"Y" is case sensitive in dates. I don’t know why, but β€œm” for several months and β€œd” for days is lowercase, but β€œY” should be in uppercase. This should work for you:

 subset(sales, format.Date(date, "%m")=="11" & format.Date(date, "%Y")=="11") 
0
source

Source: https://habr.com/ru/post/909149/


All Articles