Is there a function to split a large data frame into n smaller data frames of equal size (by rows), plus an (n+1)th data frame holding the remainder?

The title pretty much says it. I have a data frame with 7+ million rows, too big for me to process without my machine choking. I want to split it into 100 smaller data frames of 70,000 rows each, with a 101st data frame holding the remaining rows (<70,000). This seems to be nontrivial.

I know that I can manually work out the size of the (n+1)th data frame, remove those rows, and then use the split function as follows:

 d <- split(my_data_frame, rep(1:100, each = 70000))

But I have several large data frames, and doing all of these calculations by hand is tedious. Is there an alternative solution?

1 answer

How about something like this:

 df <- data.frame(x = 1:7235000, y = runif(7235000))
 # chunk size = rows per piece, rounded to the nearest 10,000 (70,000 here)
 split(df, rep(1:100, each = round(NROW(df) / 100, -4)))

Or, more generally:

 num_dfs <- 100
 split(df, rep(1:num_dfs, each = round(NROW(df) / num_dfs, -4)))
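
One caveat with the rep() approach: when rows are left over, the grouping vector is shorter than the data frame, and split recycles it (with a warning) instead of putting the leftover rows into a group of their own. A minimal sketch of one way to get the remainder as its own final piece is to compute a group number for every row directly. The helper name split_by_rows and the small stand-in data frame are made up for illustration, not from any package:

```r
# Hypothetical helper (not part of base R or any package): split a data frame
# into consecutive chunks of `chunk_size` rows; the last chunk holds whatever
# rows are left over.
split_by_rows <- function(df, chunk_size) {
  # Row i goes to group ceiling(i / chunk_size):
  # rows 1..chunk_size -> 1, the next chunk_size rows -> 2, and so on.
  groups <- ceiling(seq_len(nrow(df)) / chunk_size)
  split(df, groups)
}

# Small stand-in for the 7-million-row data frame
df_small <- data.frame(x = 1:25, y = runif(25))
pieces <- split_by_rows(df_small, chunk_size = 10)

length(pieces)        # 3 pieces
sapply(pieces, nrow)  # 10, 10 and 5 rows
```

For the data frame in the question this would be called as split_by_rows(my_data_frame, 70000), which should give 100 full pieces plus one smaller final piece if the remainder is indeed below 70,000 rows.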

You might want to consider something from the caret package, for example: caret::createFolds(df$x)


Source: https://habr.com/ru/post/989098/

