Split and save in new data.frames

I have a big data.frame (144 columns). I would like to break it into groups of 3 columns each (sub data.frames) and then save each sub data.frame in a separate file. In other words: file1 will contain columns 1 through 3, file2 will contain columns 4 through 6, and so on.

Any ideas?

Just an example:

    Hb1 Int1 Value1 Hb2 Int2 Value2
    A   c    0.3    SW  n    0.34
    V   sd   0.45   FG  b    0.345
    N   wer  0.76   GH  m    0.67

So: The file "output1" will contain:

    Hb1 Int1 Value1
    A   c    0.3
    V   sd   0.45
    N   wer  0.76

The file "output2" will contain:

    Hb2 Int2 Value2
    SW  n    0.34
    FG  b    0.345
    GH  m    0.67

etc.

I tried adding a column to the transposed data.frame containing index values, such as:

    Index = rep(1:48, each = 3)

Then I tried to split the large data.frame according to the Index column, but I could not get any further.

2 answers

Perhaps this is useful for you:

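    # A simple function (EDIT: FIXED)
    Split_and_save_DF <- function(DF, split) {
      # Split the data frame by columns into a list of data.frames,
      # each holding `split` consecutive columns. min() keeps the last
      # group in bounds when ncol(DF) is not a multiple of `split`, and
      # drop = FALSE keeps a data.frame even when a group has one column.
      DFlist <- lapply(seq(1, ncol(DF), split), function(x, i) {
        x[, i:min(i + (split - 1), ncol(x)), drop = FALSE]
      }, x = DF)
      # Save each data.frame as a .txt file in the working directory
      invisible(sapply(seq_along(DFlist), function(x, i) {
        write.table(x[[i]], file = paste0('DF', i, '.txt'))
      }, x = DFlist))
    }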

Example

    DF <- data.frame(matrix(rnorm(144*12, 100, 30), ncol=144))
    dim(DF)  # a data frame with 12 rows and 144 columns
    Split_and_save_DF(DF=DF, split=3)  # will produce 48 DFs

Here DF is the data.frame to split and split is the number of columns that each resulting data.frame should contain.

This is not a good answer, but it does what you want.

This function will split your DF and save each new DF in the current working directory under the names DF1.txt, DF2.txt, DF3.txt, and so on, so you can read each file by doing:

 read.table("DF1.txt", header=TRUE) # and so on 

To check the output:

    dim(read.table("DF1.txt", header=TRUE))  # checking dims of new DFs
    [1] 12  3
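To check every file rather than only DF1.txt, the pieces can be read back into a list in one pass. The following is a minimal sketch, not part of the original answer: it uses a smaller 12 x 9 data frame, an arbitrary seed, and a hypothetical subdirectory `df_pieces` under `tempdir()` so that it does not touch the working directory. The file names follow the DF1.txt, DF2.txt pattern described above.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
DF <- data.frame(matrix(rnorm(12 * 9, 100, 30), ncol = 9))
split_size <- 3

# Hypothetical output directory (assumption), kept out of the working directory
out_dir <- file.path(tempdir(), "df_pieces")
dir.create(out_dir, showWarnings = FALSE)

# Write the pieces using the same naming pattern: DF1.txt, DF2.txt, ...
starts <- seq(1, ncol(DF), by = split_size)
files <- file.path(out_dir, paste0("DF", seq_along(starts), ".txt"))
for (k in seq_along(starts)) {
  cols <- starts[k]:(starts[k] + split_size - 1)
  write.table(DF[, cols], file = files[k])
}

# Read every file back into a list of data.frames
pieces <- lapply(files, read.table, header = TRUE)

# One row per file: number of rows, number of columns
dims <- t(sapply(pieces, dim))

# Putting the pieces side by side restores the original column layout
rebuilt <- do.call(cbind, pieces)
```

The file names are built with paste0() instead of being collected with list.files(), because list.files() sorts names alphabetically and would place DF10.txt before DF2.txt.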

You were close with Index = rep(1:48, each = 3); you can use it to split the column names into groups:

 lapply(split(colnames(DF), rep(1:48,each=3)), function(x)DF[,x]) 

Testing with the @Jilber example:

    colnames(DF) <- paste(c('Hb','Int', 'Value'),rep(1:48,each=3),sep='')
    > ll <- lapply(split(colnames(DF),
    +                    rep(1:48,each=3)),
    +              function(x)DF[,x])
    > head(ll)
    $`1`
             Hb1      Int1    Value1
    1  155.56103 114.70061  50.15758
    2  100.91212 108.93485 138.43324
    3   65.02612  97.95829  60.55026
    4  102.85399  99.80714  74.53144
    5  152.52558 100.28795 109.27979
    6  110.84282 122.67727 100.60916
    7  100.06572  92.96498 118.99915
    8  104.69424  91.46041  38.57983
    9   74.59960 119.89719 158.41313
    10 100.89299  85.79222 122.57668
    11  92.87294  84.40889  95.39005
    12  81.20039 127.29311  92.19261

    $`2`
             Hb2      Int2    Value2
    1  101.27385  96.21813  21.83450
    2  124.26445 117.29466  53.67718
    3  144.58042 111.06022  91.92567
    4  120.74942  98.63582 123.98479
    5   95.74860  79.96633 149.62814
    6   74.78898  68.25731 122.72720
    7  132.12760  97.76982  56.66394
    8   47.18706 118.68346 113.63118
    9  115.27
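The split()/lapply() approach above only builds the list; the question also asks for one file per group. A minimal sketch of that last step follows, using the toy data from the question. The file names output1.txt, output2.txt follow the question's "output1"/"output2" naming, while the tab separator, the use of `tempdir()` as `out_dir`, and the `ceiling()`-based index (equivalent to rep(1:48, each = 3) when the column count is a multiple of 3) are assumptions made for illustration.

```r
# Toy data following the question's example (two groups of three columns)
DF <- data.frame(
  Hb1 = c("A", "V", "N"), Int1 = c("c", "sd", "wer"), Value1 = c(0.3, 0.45, 0.76),
  Hb2 = c("SW", "FG", "GH"), Int2 = c("n", "b", "m"), Value2 = c(0.34, 0.345, 0.67)
)

group_size <- 3
# Group index 1,1,1,2,2,2,...; also valid when the last group is short
idx <- ceiling(seq_along(DF) / group_size)

# One data.frame per group; drop = FALSE keeps single-column groups as data.frames
pieces <- lapply(split(colnames(DF), idx), function(cols) DF[, cols, drop = FALSE])

# Assumed output location and separator; adjust as needed
out_dir <- tempdir()
files <- file.path(out_dir, paste0("output", names(pieces), ".txt"))
for (k in seq_along(pieces)) {
  write.table(pieces[[k]], file = files[k],
              row.names = FALSE, quote = FALSE, sep = "\t")
}
```

Each file can then be read back with read.table(files[1], header = TRUE, sep = "\t").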

Source: https://habr.com/ru/post/955217/

