Subsetting a data.frame with multiple conditions

Suppose my data looks like this:

 2372  Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE    1.3  05/07/2006
 9104  Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE    0.34 07/23/2006
 9212  Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE    0.33 02/11/2007
 2094  Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE    1.4  05/06/2007
 16763 Kansas KS2000111 HUMBOLDT, CITY OF ATRAZINE    0.61 05/11/2009
 1076  Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR 0.48 05/12/2002
 1077  Kansas KS2000111 HUMBOLDT, CITY OF METOLACHLOR 0.3  05/07/2006

I want to subset by analyte and a partial date match (namely, just the year). I tried the following, but I know it is not quite right:

  data[data$Analyte=="ATRAZINE" & grep("2006",as.character(data$Date)),] 

Any suggestions?

3 answers

For this problem, I would go with the approach in the “Student Queue” answer of extracting the year from the date, rather than doing a generic string match. I would suggest the following; note that the $year component of a POSIXlt object counts years since 1900, so 2006 is stored as 106:

 data[data$Analyte == "ATRAZINE" & as.POSIXlt(data$Date, format="%m/%d/%Y")$year == 106, ]

But if you really need regular expression matching, use grepl, which returns a logical vector, rather than grep, which returns a vector of indices:

 data[data$Analyte=="ATRAZINE" & grepl("2006",as.character(data$Date)),] 
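The difference between the two functions can be seen on a small example. This is only a sketch: the three-row data frame below is a hypothetical sample, and only the Analyte and Date column names are taken from the question. With grep, the index vector is coerced to logical inside `&` (every nonzero index becomes TRUE) and recycled, so the date condition is effectively ignored; grepl avoids that.

```r
# Hypothetical sample; only the Analyte and Date column names come from the question.
data <- data.frame(
  Analyte = c("ATRAZINE", "ATRAZINE", "METOLACHLOR"),
  Date    = c("05/07/2006", "02/11/2007", "05/07/2006"),
  stringsAsFactors = FALSE
)

# grep() returns the positions of the matching elements ...
idx <- grep("2006", data$Date)
# ... whereas grepl() returns one TRUE/FALSE per element,
# which is what `&` needs to combine row-wise conditions.
hit <- grepl("2006", data$Date)

# POSIXlt stores the year as an offset from 1900, so add it back.
yr <- as.POSIXlt(data$Date, format = "%m/%d/%Y")$year + 1900

by_regex <- data[data$Analyte == "ATRAZINE" & hit, ]
by_year  <- data[data$Analyte == "ATRAZINE" & yr == 2006, ]
```

Both by_regex and by_year select the same rows here; the year-based version has the advantage that it cannot accidentally match "2006" somewhere else in the string.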

One way is to use date literals (this assumes data$Date has already been converted to class Date):

 data[data$Analyte == "ATRAZINE" & (data$Date >= '2006-01-01' & data$Date < '2007-01-01'), ]

Another way is to use format:

 data[data$Analyte == "ATRAZINE" & format(data$Date, "%Y") == '2006', ]
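Both variants depend on the Date column being a real Date rather than a character string in mm/dd/yyyy form, where `>=` would compare text instead of dates. A minimal sketch of the conversion followed by both filters, using a hypothetical three-row sample (only the Analyte and Date column names come from the question):

```r
# Hypothetical sample; only the Analyte and Date column names come from the question.
data <- data.frame(
  Analyte = c("ATRAZINE", "ATRAZINE", "METOLACHLOR"),
  Date    = c("05/07/2006", "02/11/2007", "05/07/2006"),
  stringsAsFactors = FALSE
)

# Convert the character column to class Date once, up front.
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")

# Half-open range: on or after 1 Jan 2006 and strictly before 1 Jan 2007.
in_2006  <- data$Date >= as.Date("2006-01-01") & data$Date < as.Date("2007-01-01")
by_range <- data[data$Analyte == "ATRAZINE" & in_2006, ]

# Same rows, selected by formatting the year as a string.
by_format <- data[data$Analyte == "ATRAZINE" & format(data$Date, "%Y") == "2006", ]
```

The range form extends naturally to arbitrary date windows, while the format form is shorter when only a single calendar year is needed.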

I realize this question was asked quite a few years ago, but I hope this helps someone in the future.

Using dplyr to subset on several conditions, checking the year after converting to a Date type:

 library(dplyr)
 data %>%
   filter(Analyte == "ATRAZINE" &
          format(as.Date(Date, format = "%m/%d/%Y"), "%Y") == "2006")

Source: https://habr.com/ru/post/909151/

