Removing days before and after matching rows

I can remove rows that match between two data frames, df1 and df2, with some code kindly provided by @Eric Fail:

df1[!(apply(df1[1:2], 1, toString) %in% apply(df2[1:2], 1, toString)), ]

or using dplyr by @steveb

df1 %>% filter( ! ((date == df2$date) & (ticker == df2$ticker)) )

However, I realized that removing only the matching row, as in the following example, is not enough:

    df1 <- data.frame(ticker = c("MSFT", "MSFT", "MSFT", "MSFT"),
                      date = c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04"),
                      stringsAsFactors = FALSE)
    df1
      ticker       date
    1   MSFT 2016-01-01
    2   MSFT 2016-01-02
    3   MSFT 2016-01-03
    4   MSFT 2016-01-04

    df2 <- data.frame(ticker = c("AAPL", "GOOG", "MSFT", "FB"),
                      date = c("2016-01-01", "2016-01-01", "2016-01-02", "2016-01-03"),
                      stringsAsFactors = FALSE)
    df2
      ticker       date
    1   AAPL 2016-01-01
    2   GOOG 2016-01-01
    3   MSFT 2016-01-02
    4     FB 2016-01-03

    df3
      ticker       date
    1   MSFT 2016-01-01
    2   MSFT 2016-01-03
    3   MSFT 2016-01-04

I also need to remove the rows for the day before and the day after the matching row. So my final data frame would be this:

      ticker       date
    1   MSFT 2016-01-04

Note that MSFT 2016-01-02 (row 3 of df2) is the match, so that row must be removed from df1 along with the day before and the day after it, i.e. MSFT 2016-01-01 and MSFT 2016-01-03.

An example with two matches:

    df1 <- data.frame(ticker = c("MSFT", "MSFT", "MSFT", "MSFT"),
                      date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04")),
                      stringsAsFactors = FALSE)
    df2 <- data.frame(ticker = c("AAPL", "GOOG", "MSFT", "MSFT"),
                      date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-01", "2016-01-02")),
                      stringsAsFactors = FALSE)

Target Output:

      ticker       date
    4   MSFT 2016-01-04
1 answer

You can convert the strings to dates, so that you can add and subtract days:

    df1 <- data.frame(ticker = c("MSFT", "MSFT", "MSFT", "MSFT"),
                      date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04")),
                      stringsAsFactors = FALSE)
    df2 <- data.frame(ticker = c("AAPL", "GOOG", "MSFT", "FB"),
                      date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-02", "2016-01-03")),
                      stringsAsFactors = FALSE)

    (m <- df2[(df2$date %in% df1$date) & (df2$ticker %in% df1$ticker), ])
    #   ticker       date
    # 3   MSFT 2016-01-02

    df1[!(df1$date %in% (m$date + c(-1, 0, 1))), ]
    #   ticker       date
    # 4   MSFT 2016-01-04
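
One caveat: testing `date` and `ticker` with two separate `%in%` calls can flag a row of df2 whose date and ticker each occur somewhere in df1, even if that exact pair does not. A minimal sketch of a pairwise match, assuming base R's `merge` on both columns is acceptable here (the names `matches`, `drop_dates` and `result` are only illustrative):

```r
df1 <- data.frame(ticker = c("MSFT", "MSFT", "MSFT", "MSFT"),
                  date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04")),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ticker = c("AAPL", "GOOG", "MSFT", "FB"),
                  date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-02", "2016-01-03")),
                  stringsAsFactors = FALSE)

# Inner join on both columns: keeps only (ticker, date) pairs present in both frames
matches <- merge(df1, df2, by = c("ticker", "date"))

# Each matched date plus the day before and the day after;
# rep() keeps the Date class, and c(-1, 0, 1) is recycled per match
drop_dates <- rep(matches$date, each = 3) + c(-1, 0, 1)

result <- df1[!(df1$date %in% drop_dates), ]
result
```

With the data above this should leave only the MSFT 2016-01-04 row, the same as the `%in%` version.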

Edit: for multiple matches, just apply function(x) to each matched date:

    df1 <- data.frame(ticker = c("MSFT", "MSFT", "MSFT", "MSFT"),
                      date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04")),
                      stringsAsFactors = FALSE)
    df2 <- data.frame(ticker = c("AAPL", "GOOG", "MSFT", "MSFT"),
                      date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-01", "2016-01-02")),
                      stringsAsFactors = FALSE)

    (m <- df2[(df2$date %in% df1$date) & (df2$ticker %in% df1$ticker), ])
    #   ticker       date
    # 3   MSFT 2016-01-01
    # 4   MSFT 2016-01-02

    df1[!(df1$date %in% (sapply(m$date, function(x) x + c(-1, 0, 1)))), ]
    #   ticker       date
    # 4   MSFT 2016-01-04
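
Note that `sapply` returns the shifted dates as plain numbers (the Date class is dropped), and that the filter above looks at dates only, not tickers. If df1 can hold several tickers, something like the following may be closer to what is wanted. This is only a sketch: it assumes the one-day window should apply per ticker, which the question does not state, and `remove_around_matches`, its `window` argument and the extra GOOG row are hypothetical additions for illustration.

```r
# Hypothetical helper: drop rows of `x` that lie within `window` days of any
# (ticker, date) pair that also appears in `y`, matching per ticker.
remove_around_matches <- function(x, y, window = 1) {
  # Pairs present in both data frames
  matches <- merge(x, y, by = c("ticker", "date"))
  offsets <- seq(-window, window)
  # One row per matched pair and offset; rep() keeps the Date class
  expanded <- data.frame(
    ticker = rep(matches$ticker, each = length(offsets)),
    date   = rep(matches$date, each = length(offsets)) + offsets,
    stringsAsFactors = FALSE
  )
  # Compare on a combined "ticker date" key
  key <- function(d) paste(d$ticker, format(d$date))
  x[!(key(x) %in% key(expanded)), ]
}

# Illustrative data: the two-match example plus one GOOG row (an assumption)
df1 <- data.frame(ticker = c("MSFT", "MSFT", "MSFT", "MSFT", "GOOG"),
                  date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03",
                                   "2016-01-04", "2016-01-03")),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ticker = c("AAPL", "GOOG", "MSFT", "MSFT"),
                  date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-01", "2016-01-02")),
                  stringsAsFactors = FALSE)

result <- remove_around_matches(df1, df2)
result
# keeps MSFT 2016-01-04 and GOOG 2016-01-03
```

The GOOG row survives because GOOG has no matching (ticker, date) pair in df2, whereas a date-only filter would have removed it along with MSFT 2016-01-03.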

Source: https://habr.com/ru/post/1241286/