R: reading a file with thousands of columns, concatenating everything after the first 10

I am reading a file with several thousand columns, but I'm only interested in the first 10. How can I tell fread to read the first 10 columns and then combine all the remaining ones into a single column? I assume this would significantly speed up reading the file.

+4
2 answers

You can do this with awk:

> fread("../foo.csv")
       a     b     c     d     e     f     g     h     i
   <int> <int> <int> <int> <int> <int> <int> <int> <int>
1:     1     2     3     4     5     6     7     8     9
2:     2     3     4     5     6     7     8     9    10
> fread("cat ../foo.csv | awk -F ',' 'BEGIN { s = 5 } { for (i=1; i<=NF; i++) printf(\"%s%s\", $(i), i<s ? OFS : i<NF ? \"\" : ORS) }'")
       a     b     c     d  efghi
   <int> <int> <int> <int>  <int>
1:     1     2     3     4  56789
2:     2     3     4     5 678910
> 

But I would probably use this approach only if you cannot get by with reading the data as-is. An alternative would be to concatenate the columns after the file has been read. I am also skeptical that this will speed up reading the file.
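The alternative mentioned above, concatenating after the file has been read, could look roughly like the sketch below. It assumes the data.table package and uses a small inline sample in place of `../foo.csv`; the names `keep`, `rest`, `merged`, and `out` are made up for illustration. Unlike the awk version, the merged column comes back as character rather than integer.

```r
library(data.table)

# Hypothetical sample standing in for ../foo.csv
dt <- fread(text = "a,b,c,d,e,f,g,h,i\n1,2,3,4,5,6,7,8,9\n2,3,4,5,6,7,8,9,10")

keep <- 4                          # number of leading columns to keep as-is
rest <- names(dt)[-seq_len(keep)]  # names of the columns to merge

# Paste the remaining columns together row by row (result is character)
merged <- do.call(paste0, unname(as.list(dt[, ..rest])))

# Keep the leading columns and append the merged one, named after its parts
out <- dt[, seq_len(keep), with = FALSE]
out[, (paste(rest, collapse = "")) := merged]
```

This still parses every column, so it is not expected to be faster than a plain `fread`; it only changes the shape of the result.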

+2

, , . . df, ( NB )

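 df10 <- df[, 1:10]  # keep only the first 10 columns
 df <- NULL          # drop the reference to the full table so its memory can be freed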

, . -, , .
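As a runnable sketch of the read-then-subset idea above, the example below builds a small hypothetical wide file (2 rows, 30 columns, in a temporary path) and keeps the first 10 columns. It also shows an option that is not part of this answer: `fread` has a `select` argument that restricts which columns are kept while reading. Whether that is noticeably faster on a given file is an assumption to check with your own timings.

```r
library(data.table)

# Hypothetical wide file: 2 rows, 30 columns named V1..V30
path <- tempfile(fileext = ".csv")
wide <- as.data.table(matrix(1:60, nrow = 2))
fwrite(wide, path)

# Read everything, then keep the first 10 columns
df <- fread(path)
df10 <- df[, 1:10]
df <- NULL  # release the full table

# Alternative (not from the answer): keep only columns 1..10 while reading
df10_select <- fread(path, select = 1:10)
```

Both `df10` and `df10_select` should hold the same ten columns; the second form avoids materialising the unwanted columns in R at all.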

0

Source: https://habr.com/ru/post/1675950/

