Finding the previous index based on two columns in a data frame in R

I have a question about looking up values in R. It is similar to a question posted yesterday (Looking for a vector / data table back in R), except that I think my problem is a bit more complicated, and that question does the opposite of what I want. Since I'm very new to R, I'm not sure how to solve it.

I have a data frame similar to the one below. For the current row, I want to find the closest previous index where the Times column differs from the current time and the Midquote column is not NA.

Index               Times    |    Midquote
                -----------------------------
   1            10:30:45.58  |    5.319
   2            10:30:45.93  |    5.323
   3            10:30:45.104 |    5.325
   4            10:30:45.127 |    5.322
   5            10:30:45.188 |    5.325
   6            10:30:45.188 |    NA
   7            10:30:45.212 |    NA
   8            10:30:45.231 |    5.321
   9            10:30:45.231 |    5.321

Suppose we start at the bottom of the data frame and treat it as the "current" row: index 9, with Times 10:30:45.231 and Midquote 5.321. Going backwards, the first index whose time differs from the current time is index 7, with time 10:30:45.212 (index 8 has the same time as index 9). But Midquote at index 7 is NA, so I have to keep looking. Index 6 also has a different time (10:30:45.188), but its Midquote is NA as well. Going back to index 5, Times again differs from the current time (10:30:45.188) and Midquote is 5.325.

So, since index 5 has time 10:30:45.188 (different from the current time, 10:30:45.231) and Midquote at index 5 is not NA, the output should be '5'.

My question: how can I do this in R?



You could use a function like the one below and apply it to every row:

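# Walk backwards from row t and return the first earlier row whose time
# differs from row t's time and whose Midquote is not NA.
ind <- function(t, df) {
    ind <- t                      # remember the "current" row
    while (t > 1) {
        t <- t - 1
        if ((df$Times[t] != df$Times[ind]) && (!is.na(df$Midquote[t]))) {
            return(t)
        }
    }
    # no earlier row qualifies: the function falls through and returns NULL
}
# Apply to every row, from the last row up to the first
sapply(nrow(df):1, FUN = ind, df)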

#[[1]]
#[1] 5

#[[2]]
#[1] 5

#[[3]]
#[1] 5

#[[4]]
#[1] 4

#[[5]]
#[1] 4

#[[6]]
#[1] 3

#[[7]]
#[1] 2

#[[8]]
#[1] 1

#[[9]]
#NULL

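Each element of the output is the result for one row of the data.frame, starting from the last row.

Explanation: ind stores the current row t, then steps t back from ind-1 down to 1. df is the data.frame passed as an argument. Inside the while loop, df$Times[t] and df$Midquote[t] are checked against the two conditions. If both hold, t is returned; otherwise the loop continues with the previous row.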

Without sapply, for a single row:

 ind(9,df)
 [1] 5
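
For comparison, here is a minimal sketch of the same search that returns NA instead of NULL when no earlier row qualifies, so the result is a plain integer vector rather than a list. The name prev_index and the way df is built from the question's table are assumptions made for this illustration (Midquote is assumed to be numeric and Times character).

```r
# Example data copied from the table in the question (assumed column types).
df <- data.frame(
  Times = c("10:30:45.58", "10:30:45.93", "10:30:45.104", "10:30:45.127",
            "10:30:45.188", "10:30:45.188", "10:30:45.212", "10:30:45.231",
            "10:30:45.231"),
  Midquote = c(5.319, 5.323, 5.325, 5.322, 5.325, NA, NA, 5.321, 5.321),
  stringsAsFactors = FALSE
)

# prev_index is a hypothetical helper: the closest row before t whose time
# differs from row t's time and whose Midquote is not NA, or NA if none.
prev_index <- function(t, df) {
  earlier <- seq_len(t - 1)
  ok <- earlier[df$Times[earlier] != df$Times[t] & !is.na(df$Midquote[earlier])]
  if (length(ok) == 0) NA_integer_ else max(ok)
}

# One result per row, in row order 1..nrow(df).
res <- vapply(seq_len(nrow(df)), prev_index, integer(1), df = df)
res
```

For the table in the question this gives NA 1 2 3 4 4 5 5 5, which is the sapply output above read from the last element to the first.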

Here is another option. It relies on the "Times" column being in increasing order:

library(magrittr)
which(df$Times < df[9,1] & !is.na(df$Midquote)) %>% max()

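which returns the row numbers (the "Index") where "Times" is smaller than in row 9 and "Midquote" is not NA. %>% max() then takes the largest of them, which is the closest preceding row that meets both conditions.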
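
One caveat worth checking: if "Times" is stored as character, < compares the strings rather than clock times. A possible variant, sketched here, only tests for a different time among the earlier rows, so it does not depend on how the values sort. The construction of df and the variable names i, earlier, hits and result are assumptions for illustration.

```r
# Example data copied from the table in the question (assumed column types).
df <- data.frame(
  Times = c("10:30:45.58", "10:30:45.93", "10:30:45.104", "10:30:45.127",
            "10:30:45.188", "10:30:45.188", "10:30:45.212", "10:30:45.231",
            "10:30:45.231"),
  Midquote = c(5.319, 5.323, 5.325, 5.322, 5.325, NA, NA, 5.321, 5.321),
  stringsAsFactors = FALSE
)

i <- 9                      # the "current" row
earlier <- seq_len(i - 1)   # only rows before the current one

# Rows before i with a different time and a non-NA Midquote
hits <- which(df$Times[earlier] != df$Times[i] & !is.na(df$Midquote[earlier]))

# Closest such row, or NA if there is none
result <- if (length(hits) > 0) max(hits) else NA_integer_
result
```

Because earlier starts at row 1, the positions returned by which are already row numbers of df.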


With data.table this can be done in 1 line.

library(data.table)

dt <- data.table(Index = 1:9,
                 Times = c( '10:30:45.58', '10:30:45.93','10:30:45.104','10:30:45.127','10:30:45.188','10:30:45.188','10:30:45.212','10:30:45.231','10:30:45.231' ),
                 Midquote = c('5.319','5.323','5.325','5.322','5.325',NA,NA,'5.321','5.321')
                )

> dt[ Times != Times[.N] & !is.na(Midquote), max(Index) ]
[1] 5

If you don't have an Index column, there are other options (see below):

dt2 <- data.table(Times = c( '10:30:45.58', '10:30:45.93','10:30:45.104','10:30:45.127','10:30:45.188','10:30:45.188','10:30:45.212','10:30:45.231','10:30:45.231' ),
                  Midquote = c('5.319','5.323','5.325','5.322','5.325',NA,NA,'5.321','5.321'))


# Option 1 - create an id column on the fly (data.table recalculates .I after evaluating the "where" clause, so you need to save it first)
dt2[, cbind(.SD, id=.I)][ Times != Times[.N] & !is.na(Midquote), max(id) ]

# Option 2 - simply check the last position of where your condition is met
dt2[, max(which(Times != Times[.N] & !is.na(Midquote))) ]

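NB: do not use nrow for this. If, for example, the 1st, 2nd and 4th rows meet the condition, nrow returns 3, even though the 3rd row does not meet the condition and the answer should be 4.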

EDIT 2 (added option 3)

dt3 <- data.table(Times = c( '10:30:45.58', '10:30:45.93','10:30:45.104','10:30:45.127','10:30:45.188','10:30:45.188','10:30:45.212','10:30:45.231','10:30:45.231' ),
                  Midquote = c('5.319','5.323', NA,'5.322','5.325', NA, NA,'5.321','5.321'))


# Option 1 - create an id column on the fly (data.table recalculates .I after evaluating the "where" clause, so you need to save it first)
dt3[, cbind(.SD, id=.I)][ Times != Times[.N] & !is.na(Midquote), max(id) ]
[1] 5

# Option 2 - simply check the last position of where your condition is met
dt3[, max(which(Times != Times[.N] & !is.na(Midquote))) ]
[1] 5

# Option 3 - good luck with this
nrow(dt3[Times != Times[.N] & !is.na(Midquote)])
[1] 4
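
The difference between options 2 and 3 can be seen with a plain logical vector. This sketch mirrors the dt3 example: cond marks the rows whose time differs from the last row's time and whose Midquote is not NA. The vector is written out by hand, and the names cond, n_matching and last_match are assumptions for illustration.

```r
# TRUE where Times != last Times and Midquote is not NA, for the 9 rows of dt3
cond <- c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)

n_matching <- sum(cond)          # how many rows match (what nrow() counts)
last_match <- max(which(cond))   # position of the last matching row

n_matching
last_match
```

Counting gives 4 while the last matching position is 5, because row 3 does not match and is skipped in the count.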

Source: https://habr.com/ru/post/1680682/

