I am new to R and I have a problem splitting a very large data frame into a nested list. I tried to find help on the Internet, but I was unsuccessful.
I have a simplified example of how my data is organized:
Headers:
1. "station" (number)
2. "date.str" (date string)
3. "member"
4. "forecast.time"
5. "data"
I'm not sure that my sample data will display correctly, but if so, it looks like this:
    station  date.str  member  forecast.time  data1
    6019     20110805  mbr000  06             77
    6031     20110805  mbr000  06             28
    6071     20110805  mbr000  06             45
    6019     20110805  mbr001  12             22
    6019     20110806  mbr024  18             66
I want to split the large data frame into a nested list by "station", "member", "date.str" and "forecast.time". That is, mylist[[c(s, m, d, t)]] should contain a data frame with the data for station "s", member "m", date.str "d" and forecast time "t", and that data frame should still store the values of s, m, d and t.
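To make the target structure concrete, here is a minimal sketch of one way such a nested list could be built on the sample data. It is an illustration under assumptions, not a tested solution for the real data: `nested_split` is a hypothetical helper name, the value column is called `data1` as in the sample printout, and `forecast.time` is assumed to be stored as character so that the leading zero in "06" survives.

```r
# Sample data from the question. forecast.time is kept as character
# (an assumption) so that "06" does not become 6.
mydata <- data.frame(
  station       = c(6019, 6031, 6071, 6019, 6019),
  date.str      = c("20110805", "20110805", "20110805", "20110805", "20110806"),
  member        = c("mbr000", "mbr000", "mbr000", "mbr001", "mbr024"),
  forecast.time = c("06", "06", "06", "12", "18"),
  data1         = c(77, 28, 45, 22, 66),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split by the first key column, then recurse into
# each piece with the remaining key columns. With no keys left, the
# data frame itself is returned as the leaf.
nested_split <- function(df, keys) {
  if (length(keys) == 0) return(df)
  pieces <- split(df, df[[keys[1]]], drop = TRUE)
  lapply(pieces, nested_split, keys = keys[-1])
}

mylist <- nested_split(mydata, c("station", "member", "date.str", "forecast.time"))

# Access one level at a time ...
leaf_a <- mylist[["6019"]][["mbr000"]][["20110805"]][["06"]]
# ... or recursively, with a vector of names passed to [[ ]]
leaf_b <- mylist[[c("6019", "mbr000", "20110805", "06")]]
```

Each leaf is an ordinary data frame that still carries its station, member, date.str and forecast.time columns. Indexing by name rather than by position avoids having to know how many members or dates exist for a given station.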
My code is:
    data.st <- list()
    data.st.member <- list()
    data.st.member.dato <- list()
    # Split by station, then split each station's data by member
    data.st <- split(mydata, mydata$station)
    data.st.member <- lapply(data.st, FUN = fsplit.member)
(fsplit.member is a function I wrote that splits by "member".)
    # Loop over station number:
    for (s in 1:S) {
      data.st.member.dato[[s]] <- list()
      # Loop over members:
      for (m in 1:length(members)) {
        # tmp is a list with one data frame per distinct date.str
        tmp <- split(data.st.member[[s]][[m]], data.st.member[[s]][[m]]$date.str)
        data.st.member.dato[[s]][[m]] <- list()
        # Loop over the different "date.str"s
        for (t in seq_along(tmp)) {
          data.st.member.dato[[s]][[m]][[t]] <- tmp[[t]]
        }
      } # end m loop
    } # end s loop
I would also like to split by the forecast time (forecast.time), but I have not worked out how to do this.
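For comparison, a flat (non-nested) split on all four columns at once, including forecast.time, might look like the sketch below. This is a different technique from the nested loops above: `split()` accepts a list of grouping columns and combines them into one key per group. The column names follow the sample printout, and storing `forecast.time` as character is an assumption.

```r
# Sample data from the question (forecast.time kept as character, an assumption)
mydata <- data.frame(
  station       = c(6019, 6031, 6071, 6019, 6019),
  date.str      = c("20110805", "20110805", "20110805", "20110805", "20110806"),
  member        = c("mbr000", "mbr000", "mbr000", "mbr001", "mbr024"),
  forecast.time = c("06", "06", "06", "12", "18"),
  data1         = c(77, 28, 45, 22, 66),
  stringsAsFactors = FALSE
)

keys <- c("station", "member", "date.str", "forecast.time")

# Passing several grouping columns makes split() build one combined key
# per row; drop = TRUE discards combinations that never occur in the data.
flat <- split(mydata, mydata[keys], drop = TRUE)

# Each element is named by its key values joined with "."
one_group <- flat[["6019.mbr001.20110805.12"]]
```

The result is a single-level list with one data frame per combination that actually occurs, which may be easier to loop over with `lapply()` than a four-level nested list, at the cost of losing the `mylist[[c(s, m, d, t)]]` style of access.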
I tried a couple of different configurations inside loops, so at the moment I don't have a consistent error message. I cannot understand what I am doing or thinking wrong.
Any help is much appreciated!
Regards,
Sisse