Which data format is suitable for large files in R?

I am creating a very large data file with Python. It consists almost entirely of 0 (false), with just a few 1 (true). It has about 700,000 columns and 15,000 rows, and therefore a size of 10.5 GB. The first line is the header.
This file should then be read and visualized in R.

I am looking for a suitable data format for exporting my file from Python.

As indicated here:

HDF5 is row based. You get MUCH efficiency by having tables that are not too wide but quite long.

Since I have a very wide table, I suppose HDF5 is inappropriate in my case?

So which data format is suitable for this purpose?
Would it also be wise to zip it?

An example of my file:

id,col1,col2,col3,col4,col5,...
1,0,0,0,1,0,...
2,1,0,0,0,1,...
3,0,1,0,0,1,...
4,...

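Zipping will not help you, since you would have to unzip the file again in order to process it. If you could post the code that generates the file, that might help a lot. Also, what do you want to accomplish in R? Might it be faster to visualize the data in Python and avoid writing and reading 10.5 GB?

Perhaps rethinking how you store the data (for example: store only the coordinates of the 1s, if there are very few of them) would be a better angle here.

For instance, instead of storing a 700K by 15K table that is all zeros except for a 1 in row 10786, column 600492, I might just store the tuple (600492, 10786) and achieve the same visualization in R.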
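
The coordinate idea can be sketched in Python roughly as follows. This is only an illustration under assumptions: the helper name `write_coordinates`, the `col,row` output layout, and the 1-based indexing are hypothetical choices, and the toy rows are taken from the example in the question.

```python
import csv
import io


def write_coordinates(rows, out):
    """Write one (column, row) pair per 1 found in an iterable of 0/1 rows.

    Hypothetical helper: the name, the "col,row" layout and the 1-based
    indices are illustrative assumptions, not something from the thread.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["col", "row"])
    count = 0
    for row_idx, row in enumerate(rows, start=1):
        for col_idx, value in enumerate(row, start=1):
            if value:
                writer.writerow([col_idx, row_idx])
                count += 1
    return count


# Toy data: the three complete rows from the question's example.
toy = [
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
]

buf = io.StringIO()  # stands in for a real output file
n_ones = write_coordinates(toy, buf)
coords_text = buf.getvalue()
print(coords_text)
```

Because rows are consumed one at a time, the generator script would never need to hold the full table in memory, and the output grows with the number of 1s rather than with the 700K by 15K grid.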


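SciPy has scipy.io.mmwrite, which writes files that can be read by R's readMM command. SciPy also supports several different sparse matrix representations.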
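
A minimal sketch of the Python side, assuming NumPy and SciPy are installed. The file name `flags.mtx`, the temporary directory, and the tiny 3 by 5 matrix are illustrative assumptions; the real matrix would be built from the positions of the 1s in the full data.

```python
import os
import tempfile

import numpy as np
from scipy import sparse
from scipy.io import mmread, mmwrite

# Zero-based positions of the 1s in a toy 3 x 5 matrix
# (the same pattern as the example rows in the question).
rows = np.array([0, 1, 1, 2, 2])
cols = np.array([3, 0, 4, 1, 4])
data = np.ones(len(rows), dtype=np.int64)

# COO (coordinate) format stores only the non-zero entries.
matrix = sparse.coo_matrix((data, (rows, cols)), shape=(3, 5))

# Write a Matrix Market file; the path is an illustrative assumption.
path = os.path.join(tempfile.mkdtemp(), "flags.mtx")
mmwrite(path, matrix)

# Read it back in Python as a sanity check.
restored = mmread(path)
print(restored.shape, restored.nnz)
```

On the R side, the answer's suggestion corresponds to calling `readMM` from the Matrix package on the same file, which should return a sparse matrix rather than a dense 700K by 15K table.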


Source: https://habr.com/ru/post/1625033/
