To make this question more general, I believe it could also be phrased as: creating a rolling, time-windowed factor variable. Although this is an unusual requirement, it could be useful for many different data sources.
I have irregular time series data with more than one record per day for thousands of users. I want to create a new player_type column that tracks a rolling 30-day definition of each player's behavior. The behavior is determined by which games they play; the 'games' column is a factor with two levels, GameA and GameB.
Thus, there are three types of behavior:
- Exclusively plays GameA: 'A'
- Exclusively plays GameB: 'B'
- Plays both games: 'Hybrid'
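The three categories above can be sketched as a small base R helper. The name `classify_games` is hypothetical, and the sketch assumes the window's games arrive as a vector of 'A' and 'B' values; returning NA for an empty window is also an assumption.

```r
# Hypothetical helper: label the set of games seen inside one window.
classify_games <- function(games) {
  played <- unique(as.character(games))
  if (length(played) == 0) return(NA_character_)  # no records in the window
  if (length(played) > 1) return("Hybrid")        # both games appear
  played                                          # only "A" or only "B"
}

classify_games(c("A", "A"))
classify_games(c("A", "B", "B"))
classify_games(character(0))
```

Keeping the labelling separate from the date filtering means the same helper can be reused whatever window length is chosen.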
I want to use this new column to see how each player's game behavior changes over time, and also to count the number of players in each group over time to see how the groups change.
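Once a type label exists, the per-group counts over time could be obtained with a cross-tabulation. This is only a sketch: the `labelled` data frame and its values are made up for illustration, and it assumes one label per player per day is wanted.

```r
# Made-up, already-labelled rows purely for illustration.
labelled <- data.frame(
  player_id = c(1, 2, 6, 1, 2, 6),
  date      = as.Date(c("2014-10-01", "2014-10-01", "2014-10-01",
                        "2014-10-02", "2014-10-02", "2014-10-02")),
  type      = c("A", "A", "B", "Hybrid", "A", "B")
)

# Drop exact duplicates so a player with several identical rows
# on the same day is not counted more than once.
labelled <- unique(labelled)

# Cross-tabulate: one row per date, one column per player type.
counts <- table(labelled$date, labelled$type)
counts
```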
The time series is very irregular for each player. Players can play several games in one day or not play at all for many months. A record is created only when the player plays a game, so I expect the solution to use a filter something like:
interval(current_date - days(30), current_date)
(using lubridate).
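A minimal sketch of such a window filter is shown here in base R, so that it runs without extra packages; with lubridate, a similar test could be written using `%within%` on an interval. The `records` data frame and its values are made up for illustration.

```r
# Toy records for a single player (made-up values for illustration).
records <- data.frame(
  date  = as.Date(c("2014-08-15", "2014-09-20", "2014-10-05", "2014-10-09")),
  games = c("A", "B", "B", "B")
)

current_date <- as.Date("2014-10-09")
window_days  <- 30

# Subtracting an integer from a Date moves it back by that many days,
# so this keeps records in the closed range [current_date - 30, current_date].
in_window <- records$date >= current_date - window_days &
             records$date <= current_date

records[in_window, ]
```

Here the August record falls outside the window, so only game B remains in the last 30 days.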
Here is an example dataset. Keep in mind that it is simplified and checks for a rolling change over 1 day rather than 30, so simple methods that only check the previous record will not work on the real data. If you can make a better dataset, let me know and I will edit this post.
# Player IDs, the game played in each record, and one record per day
p <- c(1, 1, 1, 2, 2, 2, 6, 6, 6)
g <- c('A', 'B', 'B', 'A', 'B', 'A', 'A', 'B', 'B')
d <- seq(as.Date('2014-10-01'), as.Date('2014-10-09'), by = 1)
df <- data.frame(player_id = p, date = d, games = g)
As a result, I need:
  player_id       date games      type
1         1 2014-10-01     A A (OR NA)
2         1 2014-10-02     B    Hybrid
3         1 2014-10-03     B         B
4         2 2014-10-04     A A (OR NA)
5         2 2014-10-05     B    Hybrid
6         2 2014-10-06     A    Hybrid
7         6 2014-10-07     A A (OR NA)
8         6 2014-10-08     B    Hybrid
9         6 2014-10-09     B         B
I expect the solution to be something like an apply() over the data with a function that looks back 30 days in time, plus an ifelse() statement to determine which games were played.
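The idea described above might look roughly like the following base R sketch. It is one possible approach under stated assumptions, not a definitive answer: `rolling_type` and `window_days` are hypothetical names, the window is closed on both ends, and it is set to 1 day to match the toy data (30 would be used on the real data).

```r
p <- c(1, 1, 1, 2, 2, 2, 6, 6, 6)
g <- c("A", "B", "B", "A", "B", "A", "A", "B", "B")
d <- seq(as.Date("2014-10-01"), as.Date("2014-10-09"), by = "day")
df <- data.frame(player_id = p, date = d, games = g, stringsAsFactors = FALSE)

window_days <- 1  # assumption: 1 for the toy data, 30 for the real data

# For row i, collect the same player's records dated within
# [date - window_days, date] and label the games found there.
rolling_type <- function(i) {
  in_window <- df$player_id == df$player_id[i] &
               df$date >= df$date[i] - window_days &
               df$date <= df$date[i]
  played <- unique(df$games[in_window])
  ifelse(length(played) > 1, "Hybrid", played[1])
}

df$type <- sapply(seq_len(nrow(df)), rolling_type)
df
```

Each row scans the whole data frame, so on thousands of users it would probably be worth splitting by player_id first. Note also that all records on the current date count towards the window, including any logged later on the same day.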
This is a very similar post and should help solve this problem: How do I do a conditional sum which only looks between certain date criteria
I have also looked into rowwise() and conditional mutate() with dplyr; however, the catch for me is the historical time component.
Thanks for the help! I cannot thank this forum enough. I will check back often.