Streaming data in Julia

Is there currently a good way to read data in Julia in a streaming way?

For example, let's say I have a CSV file that is too large to fit in memory. Are there built-in functions or a library that make such a file easier to work with?

I am aware of the DataStream prototype in DataFrames, but it is not currently exposed through the public API.

1 answer

The eachline function turns an IO source into an iterator of strings. This lets you read the file one line at a time. From there, readdlm (in the DelimitedFiles standard library since Julia 1.0; readcsv on earlier versions) can parse each line if you wrap it in an IOBuffer.

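using DelimitedFiles  # provides readdlm; readcsv was removed in Julia 1.0

for ln in eachline("file.csv")  # the file is closed when iteration ends
  data = readdlm(IOBuffer(ln), ',')  # one row, parsed as a 1×N matrix
  # do something with this data
end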

This is still fairly manual, but it only takes a few steps, so it is not too bad.
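To make the line-by-line idea concrete, here is a minimal sketch that sums one numeric column without ever holding the whole file in memory. The helper name stream_column_sum, the header keyword, and the sample data are hypothetical, chosen only for illustration. It also assumes a simple comma-separated file with no quoted fields, so it uses a plain split instead of readdlm.

```julia
# Hypothetical helper: sum column `col` of a CSV file, one line at a time.
# Assumes plain comma-separated values with no quoted or escaped commas.
function stream_column_sum(path::AbstractString, col::Integer; header::Bool=true)
    total = 0.0
    # eachline(path) opens the file and closes it when iteration ends
    for (i, ln) in enumerate(eachline(path))
        header && i == 1 && continue      # skip the header row
        isempty(strip(ln)) && continue    # ignore blank lines
        fields = split(ln, ',')           # naive split, only this line is in memory
        total += parse(Float64, fields[col])
    end
    return total
end

# Usage: a small temporary file stands in for the large one
path, io = mktemp()
write(io, "id,value\n1,2.5\n2,4.0\n3,1.5\n")
close(io)
println(stream_column_sum(path, 2))
```

Only the current line and the running total are kept, so memory use should stay roughly constant regardless of file size. For files with quoting or mixed types, a real CSV parser would be a safer choice than the naive split used here.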


Source: https://habr.com/ru/post/1541697/

