I believe ddply is the tool I need for my task, but I am struggling to get the right results. I have read about ddply for several hours and experimented with different code, but I could not work it out on my own. Here is an example data frame:
station <- c(rep("muc", 13), rep("nbw", 17))
year <- c(rep(1994, 4), rep(1995, 4), rep(1996, 5),
          rep(1994, 5), rep(1995, 4), rep(1996, 4), rep(1997, 4))
depth <- c(rep(c("HUM", "31-60", "61-90", "91-220"), 2),
           rep(c("HUM", "0-30", "31-60", "61-90", "91-220"), 2),
           rep(c("HUM", "0-30", "31-60", "91-220"), 1),
           rep(c("HUM", "0-30", "31-60", "61-90"), 2))
doc <- c(80, 10, 3, 2,
         70, 15, 5, 5,
         70, 20, 5, 5, 2,
         40, 10, 3, 2, 1,
         50, 15, 5, 2,
         45, 20, 2, 1,
         35, 8, 2, 1)
df <- data.frame(station, year, depth, doc)
df
depth refers to the soil depth (HUM = humus layer), and doc is the dissolved organic carbon measured at that depth. Note that doc measurements are not available for every year and that some depth classes are missing. This is annoying, but it happens often in my dataset. With ddply, I would like to add a column to this data frame that gives, for each depth, the doc of the overlying soil layer. For HUM the value should be NA, since nothing lies on top of the humus layer. As an example:
depth   doc  doc_m1
HUM      80      NA
31-60    10      80
61-90     3      10
91-220    2       3
In the data frame, this of course has to be calculated for each year and each depth. I would like to avoid for loops for this, and ddply seems well suited, but I had no luck getting the lag command to work with ddply. This is the code I have so far (obviously not very far):
doc <- ddply(df, .(year), transform, doc_m1 = ????)
Does anyone have a suggestion? Thanks in advance!
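One possible approach, as a rough sketch rather than a definitive answer: shift doc down by one row within each station/year group, so the first row of each group (HUM) gets NA. The sketch assumes that the rows of each profile are already ordered from top to bottom, as in the example data, and it groups by station as well as year (an assumption, since the attempt above groups by year only). It uses base R's ave so that it runs without extra packages; the comment shows what the corresponding ddply call would presumably look like.

```r
# Rebuild the example data frame from the question.
station <- c(rep("muc", 13), rep("nbw", 17))
year <- c(rep(1994, 4), rep(1995, 4), rep(1996, 5),
          rep(1994, 5), rep(1995, 4), rep(1996, 4), rep(1997, 4))
depth <- c(rep(c("HUM", "31-60", "61-90", "91-220"), 2),
           rep(c("HUM", "0-30", "31-60", "61-90", "91-220"), 2),
           c("HUM", "0-30", "31-60", "91-220"),
           rep(c("HUM", "0-30", "31-60", "61-90"), 2))
doc <- c(80, 10, 3, 2, 70, 15, 5, 5, 70, 20, 5, 5, 2,
         40, 10, 3, 2, 1, 50, 15, 5, 2, 45, 20, 2, 1, 35, 8, 2, 1)
df <- data.frame(station, year, depth, doc)

# Shift a vector down by one position: NA first, last value dropped.
# Assumption: x is ordered from the top layer downwards.
shift_down <- function(x) c(NA, head(x, -1))

# Apply the shift separately within each station/year profile.
df$doc_m1 <- ave(df$doc, df$station, df$year, FUN = shift_down)

# With plyr loaded, the same idea would presumably be written as:
# ddply(df, .(station, year), transform, doc_m1 = c(NA, head(doc, -1)))

df
```

Note that where a depth class is missing from a profile, this returns the doc of the next available layer above, not of a fixed depth class. If the rows are not already sorted by depth, they would need to be ordered within each group first.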