fread() a file from an archive

I would like to know the recommended way to read a data.table from an archive file (a zip archive in my case). One obvious option is to unzip it into a temporary file and then fread() it as usual. I don't want to bother with creating a new file, so instead I use read.table() on an unz() connection and then convert the result to a data.table:

 mydt <- data.table(read.table(unz(myzipfilename, myfilename))) 

This works fine, but read.table() is slow for large files, and fread() cannot read directly from an unz() connection. Is there a better solution?
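For comparison, the temporary-file option mentioned above might look roughly like the sketch below. The helper name fread_via_tempfile is hypothetical and only for illustration; it assumes the data.table package is installed and uses utils::unzip() to extract a single member into a throwaway directory that is deleted when the function returns.

```r
library(data.table)

# Hypothetical helper (not part of data.table): extract one archive member
# into a temporary directory, read it with fread(), then clean up.
fread_via_tempfile <- function(zipfile, member, ...) {
  exdir <- tempfile("unz_")
  dir.create(exdir)
  # Remove the extracted copy when the function exits, even on error
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)
  # utils::unzip() returns the path(s) of the extracted file(s)
  path <- utils::unzip(zipfile, files = member, exdir = exdir)
  fread(path, ...)
}
```

Usage would be along the lines of mydt <- fread_via_tempfile(myzipfilename, myfilename).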

1 answer

Take a look at: Reading a zipped CSV file using fread. To avoid temporary files, you can use unzip with -p, which extracts the files to a pipe (standard output) without printing any messages.

You can use this kind of statement with fread():

 x = fread('unzip -p test/allRequests.csv.zip') 

Or with gunzip:

 x = fread('gunzip -cq test/allRequests.csv.gz') 
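If the archive holds several files, as in the question, a member name can be given after the archive name so that unzip -p prints only that file. The sketch below is one way to wrap this; the helper name fread_zip_member is hypothetical, and it assumes a data.table version that supports the cmd argument of fread() and an unzip program on the PATH.

```r
library(data.table)

# Hypothetical helper (not part of data.table): stream one member of a
# zip archive into fread() without writing an extracted copy yourself.
fread_zip_member <- function(zipfile, member, ...) {
  # shQuote() protects paths containing spaces or shell metacharacters
  cmd <- paste("unzip -p", shQuote(zipfile), shQuote(member))
  # cmd = makes it explicit that the string is a shell command, not a path
  fread(cmd = cmd, ...)
}
```

For the question's variables this would be called as mydt <- fread_zip_member(myzipfilename, myfilename). Note that unzip treats the member name as a wildcard pattern, so names containing characters such as * or [ may need escaping.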

You can also use grep or other tools.
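As an illustration of the grep idea, the sketch below filters the decompressed stream before fread() parses it. The helper name fread_gz_grep and its col_names parameter are hypothetical. Because grep also drops the header line unless it happens to match, the column names are supplied explicitly. The case where nothing matches is not handled here.

```r
library(data.table)

# Hypothetical helper (not part of data.table): keep only the lines of a
# gzipped text file that contain a fixed string, then parse them.
fread_gz_grep <- function(gzfile, pattern, col_names, ...) {
  # grep -F matches the pattern literally (no regular expression syntax)
  cmd <- paste("gunzip -cq", shQuote(gzfile), "| grep -F", shQuote(pattern))
  # The header row is filtered out by grep, so name the columns manually
  fread(cmd = cmd, header = FALSE, col.names = col_names, ...)
}
```

Filtering in the shell can help when only a small share of a large file is needed, since the rows that are discarded never reach R.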


Source: https://habr.com/ru/post/1234512/

