R file with thousands of columns, concat after the first 10

Question


I am reading a file with several thousand columns, but I am only interested in the first 10. How can I tell fread to read the first 10 columns and then combine all the rest into one column? I assume this would significantly speed up reading the file.

+4

r data.table fread

par Apr 28 '17 at 20:20


2 answers

, , . . df, ( NB )

 df10 <- df[, 1:10]  # keep only the first 10 columns
 df <- NULL          # drop the full table so its memory can be freed

, . -, , .

0

Umberto Apr 28 '17 at 20:43

Clayton stanley · Accepted Answer · 2017-04-28T21:40:27+0000

You can do this with awk:

> fread("../foo.csv")
       a     b     c     d     e     f     g     h     i
   <int> <int> <int> <int> <int> <int> <int> <int> <int>
1:     1     2     3     4     5     6     7     8     9
2:     2     3     4     5     6     7     8     9    10
> fread("cat ../foo.csv | awk -F ',' 'BEGIN { s = 5 } { for (i=1; i<=NF; i++) printf(\"%s%s\", $(i), i<s ? OFS : i<NF ? \"\" : ORS) }'")
       a     b     c     d  efghi
   <int> <int> <int> <int>  <int>
1:     1     2     3     4  56789
2:     2     3     4     5 678910
>
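
The command above uses s = 5 on a nine-column file. For the case in the question (keep the first 10 columns, merge everything after them), a rough sketch might look like this. The sample data and the variable name merged are made up for illustration, OFS is set to a comma so the output stays a CSV, and the condition is reordered so that the last field always ends the line, even on rows with fewer than s fields.

```shell
# Hypothetical 13-column sample: one header line and one data row.
# s is the index of the first column to merge; columns 1..s-1 stay separate.
merged=$(printf '%s\n' \
  'c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13' \
  '1,2,3,4,5,6,7,8,9,10,11,12,13' |
  awk -F ',' -v s=11 -v OFS=',' '{
    for (i = 1; i <= NF; i++)
      # last field: end the record; before s: keep a separator; otherwise: glue
      printf("%s%s", $i, (i == NF) ? ORS : ((i < s) ? OFS : ""))
  }')

printf '%s\n' "$merged"
```

This prints `c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11c12c13` followed by `1,2,3,4,5,6,7,8,9,10,111213`. The same awk program could be passed to fread as a shell command, in the way the call above does.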

But I would probably only use this approach if you have no other way to cope with the data you are working with. An alternative would be to concatenate the columns after the file has been read. I am also skeptical that this will speed up reading the file.
