Read.table in Chunks - error message

I have a large file with 6 million rows, and I'm trying to read the data in chunks for processing so that I don't hit the RAM limit. Here is my code (note that temp.csv is just a dummy file with 41 entries):

    infile <- file("data/temp.csv", open="r")
    headers <- as.character(read.table(infile, header = FALSE, nrows=1, sep=",", stringsAsFactors=FALSE))
    while(length(temp <- read.table(infile, header = FALSE, nrows=10, sep=",", stringsAsFactors=FALSE)) > 0){
      temp <- data.table(temp)
      setnames(temp, colnames(temp), headers)
      setkey(temp, Id)
      print(temp[1, Tags])
    }
    print("hi")
    close(infile)

Everything runs smoothly until the last iteration. I get this error message:

    Error in read.table(infile, header = FALSE, nrows = 10, sep = ",", stringsAsFactors = FALSE) :
      no lines available in input
    In addition: Warning message:
    In read.table(infile, header = FALSE, nrows = 10, sep = ",", stringsAsFactors = FALSE) :
      incomplete final line found by readTableHeader on 'data/temp.csv'

Presumably this is because in the last iteration there is only 1 row of records and read.table expects 10?

All of the data is actually read, in order. Surprisingly, even in the final iteration, temp is still converted to a data.table. But print("hi") and everything after it are never executed. Is there something I can do to get around this?

Thanks.

1 answer

Figured it out!

    repeat{
      temp <- read.table(infile, header = FALSE, nrows=10, sep=",", stringsAsFactors=FALSE)
      temp <- data.table(temp)
      setnames(temp, colnames(temp), headers)
      setkey(temp, Id)
      print(temp[1, Tags])
      if (nrow(temp) < 10) break
    }
    print("hi")

This still issues a warning, but no more errors:

    Warning message:
    In read.table(infile, header = FALSE, nrows = 10, sep = ",", stringsAsFactors = FALSE) :
      incomplete final line found by readTableHeader on 'data/temp.csv'
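One caveat with the repeat loop: it only stops when a chunk comes back short. If the number of data rows happens to be an exact multiple of 10, the next read.table call finds nothing left on the connection and raises the same "no lines available in input" error. A minimal sketch of one way to guard against that is below. The function name read_chunks and its handle argument are hypothetical names for illustration, and matching on the error text assumes R is running with English messages.

```r
# Sketch: read a CSV in fixed-size chunks from an open connection and stop
# cleanly when the input runs out. read_chunks() and handle() are
# illustrative names, not part of base R or data.table.
read_chunks <- function(path, chunk_size = 10, handle = function(chunk) NULL) {
  infile <- file(path, open = "r")
  on.exit(close(infile))  # close the connection even if an error occurs

  # The first line holds the column names.
  headers <- as.character(read.table(infile, header = FALSE, nrows = 1,
                                     sep = ",", stringsAsFactors = FALSE))
  total <- 0
  repeat {
    chunk <- tryCatch(
      read.table(infile, header = FALSE, nrows = chunk_size, sep = ",",
                 stringsAsFactors = FALSE, col.names = headers),
      error = function(e) {
        # read.table raises this error when the connection is exhausted.
        # Assumption: messages are in English; any other error is re-raised.
        if (grepl("no lines available", conditionMessage(e), fixed = TRUE)) {
          NULL
        } else {
          stop(e)
        }
      }
    )
    if (is.null(chunk)) break            # nothing left to read
    handle(chunk)                        # per-chunk processing goes here
    total <- total + nrow(chunk)
    if (nrow(chunk) < chunk_size) break  # short chunk means end of file
  }
  total  # number of data rows processed
}
```

It could be called as read_chunks("data/temp.csv", 10, function(chunk) print(chunk[1, "Tags"])), with the data.table conversion and setkey done inside the handler. As for the remaining warning, "incomplete final line" generally means the file's last line has no trailing newline; the row is still read.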

Source: https://habr.com/ru/post/1500075/

