Import large matrices: import all or column by column? - MATLAB

This is a general programming question, but there may be considerations specific to MATLAB.

I need to import a very large data file. Is it better practice / faster / more efficient to import the entire file into memory and then split it into sub-matrices, or to import it a few columns at a time, putting every n columns into a new matrix?

My guess is that it would be faster to load it all into cache and then work with it, but this is just an uneducated assumption.

1 answer

In my experience, the best approach is to parse the file once using csvread (which calls dlmread, which in turn calls textscan, so the overhead of the wrappers is not significant). This assumes, of course, that the file fits in the free RAM you have available. If the file is larger than RAM (I just had to parse a 31 GB file, for example), I would use fopen, read it line by line (or in chunks, blocks, whatever you prefer), and write the pieces to a writable MAT-file. That way, in theory, the size of the file you can write is limited only by your file system.
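A rough sketch of the chunked approach for the larger-than-RAM case, not a definitive implementation. It assumes a purely numeric, comma-separated file with no header and a known column count. The file names, the variable name data, nCols and chunkRows are all made up for illustration, and a tiny demo file is written first so the snippet runs on its own. It reads with textscan directly rather than csvread, and writes through matfile, which stores the data in a version 7.3 MAT-file so that parts of a variable can be written without holding the whole matrix in memory.

```matlab
% Hypothetical file names and sizes, chosen only for this sketch.
inFile    = fullfile(tempdir, 'big_demo.csv');
outFile   = fullfile(tempdir, 'big_demo.mat');
nCols     = 3;   % assumed to be known in advance
chunkRows = 4;   % rows per read; use something like 1e5 for real data

% Write a small stand-in for the "very large" file (10 rows x 3 columns).
demo = reshape(1:30, nCols, 10)';
fid = fopen(inFile, 'w');
fprintf(fid, '%d,%d,%d\n', demo');
fclose(fid);

% Start from a fresh MAT-file so matfile creates it in v7.3 format.
if exist(outFile, 'file')
    delete(outFile);
end
m = matfile(outFile, 'Writable', true);

fid = fopen(inFile, 'r');
assert(fid ~= -1, 'Could not open %s', inFile);

fmt = repmat('%f', 1, nCols);   % one '%f' per column
rowsWritten = 0;
while ~feof(fid)
    % Read at most chunkRows rows; CollectOutput gives one numeric matrix.
    c = textscan(fid, fmt, chunkRows, 'Delimiter', ',', 'CollectOutput', true);
    block = c{1};
    if isempty(block)
        break;
    end
    n = size(block, 1);
    % Append the chunk to the variable "data" on disk.
    m.data(rowsWritten + (1:n), 1:nCols) = block;
    rowsWritten = rowsWritten + n;
end
fclose(fid);
```

Afterwards, sub-matrices can be pulled from disk without loading everything, for example secondCol = m.data(1:5, 2). When the file does fit in RAM, the single-parse route from the answer is just A = csvread(inFile) followed by ordinary indexing such as A(:, 1:2).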


Source: https://habr.com/ru/post/1485869/