Lagging with ddply within multiple subsets

I believe that ddply is the tool I need for my task, but I am having trouble getting the right results. I spent several hours reading about ddply and experimenting with different code, but I could not work it out on my own. Here is an example data frame:

    station <- c(rep("muc", 13), rep("nbw", 17))
    year <- c(rep(1994, 4), rep(1995, 4), rep(1996, 5),
              rep(1994, 5), rep(1995, 4), rep(1996, 4), rep(1997, 4))
    depth <- c(rep(c("HUM", "31-60", "61-90", "91-220"), 2),
               rep(c("HUM", "0-30", "31-60", "61-90", "91-220"), 2),
               rep(c("HUM", "0-30", "31-60", "91-220"), 1),
               rep(c("HUM", "0-30", "31-60", "61-90"), 2))
    doc <- c(80, 10, 3, 2,
             70, 15, 5, 5,
             70, 20, 5, 5, 2,
             40, 10, 3, 2, 1,
             50, 15, 5, 2,
             45, 20, 2, 1,
             35, 8, 2, 1)
    df <- data.frame(station, year, depth, doc)
    df

depth refers to the soil depth (HUM = humus layer), and doc is the dissolved organic carbon (DOC) measured at that depth. Please note that doc was not measured in every year and that some depth classes are missing. This is annoying, but it happens often in my dataset. With ddply, I would like to add a column to this data frame that holds, for each depth, the doc value of the overlying soil layer. For HUM the value should be NA, since nothing lies on top of the humus layer. For example:

    depth   doc  doc_m1
    HUM      80      NA
    31-60    10      80
    61-90     3      10
    91-220    2       3

In the data frame, this must of course be calculated for each depth within each year. I would like to avoid for loops, and ddply seems suitable for this, but I have had no luck getting a lag command to work with ddply. This is as far as I got with the code (obviously not very far):

 doc <- ddply(df, .(year), transform, doc_m1 = ????) 

Does anyone have a suggestion? Thanks in advance!

1 answer

If your depths are already in the correct order in your dataset (as in your example), you can simply do:

 doc2 <- ddply(df, .(station, year), transform, doc_m1 = c(NA, doc[-length(doc)])) 

Note that I also grouped by station. This gives:

    > head(doc2, 10)
       station year  depth doc doc_m1
    1      muc 1994    HUM  80     NA
    2      muc 1994  31-60  10     80
    3      muc 1994  61-90   3     10
    4      muc 1994 91-220   2      3
    5      muc 1995    HUM  70     NA
    6      muc 1995  31-60  15     70
    7      muc 1995  61-90   5     15
    8      muc 1995 91-220   5      5
    9      muc 1996    HUM  70     NA
    10     muc 1996   0-30  20     70

If the rows are not yet sorted by depth, convert depth to a factor with its levels in the correct order, and then sort by it. This approach should then work.
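The sorting step could look roughly like the following. This is only a minimal sketch: the small shuffled data frame `df_unsorted` and the vector `depth_levels` are made up for illustration and are not part of the question's data, and the level order assumes the profile runs from HUM downwards.

```r
library(plyr)

# Hypothetical shuffled example with the same columns as the question's df
df_unsorted <- data.frame(
  station = "muc",
  year    = 1994,
  depth   = c("61-90", "HUM", "91-220", "31-60"),
  doc     = c(3, 80, 2, 10)
)

# Assumed order of the depth classes, from the top of the profile downwards
depth_levels <- c("HUM", "0-30", "31-60", "61-90", "91-220")
df_unsorted$depth <- factor(df_unsorted$depth, levels = depth_levels)

# Sort by station, year and the ordered depth factor
df_sorted <- df_unsorted[order(df_unsorted$station,
                               df_unsorted$year,
                               df_unsorted$depth), ]

# Lag doc by one row within each station/year group
doc2 <- ddply(df_sorted, .(station, year), transform,
              doc_m1 = c(NA, doc[-length(doc)]))
doc2
```

Keep in mind that the lag is purely positional: if a depth class is missing in a given station and year, doc_m1 takes the value of the nearest layer that is present above it, not NA.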


Source: https://habr.com/ru/post/1398988/

