Reading a file or loading a file into main memory from a disk for processing

How to load a file into main memory?

I currently read files using:

BufferedReader buf = new BufferedReader(new FileReader(fileName));

I assume this is reading the file line by line from disk. What is the advantage of this?

What is the advantage of loading a file directly into memory? How do we do this in Java?

I found some examples that use Scanner or RandomAccessFile. Do they load files into memory? Should I use them? Which of the two should I use?

Thanks in advance!

+4
2 answers
BufferedReader buf = new BufferedReader(new FileReader(fileName));

I assume this is reading the file line by line from disk. What is the advantage of this?

Not really. It reads the file in chunks the size of the buffer (8192 characters by default, I believe), and readLine() then serves lines out of that buffer.

The advantage is that you do not need a huge heap to read a huge file. This matters because the maximum heap size can only be set when the JVM starts (at least with HotSpot).

You also do not consume system physical / virtual memory resources to represent a huge heap.
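To make the chunked reading described above concrete, here is a minimal sketch that streams through a file with BufferedReader and never holds more than one line (plus the reader's internal buffer) in memory. The class name StreamingLineCount and the method countLines are hypothetical names chosen for illustration, and the sample file is a temporary file created only for the demo.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class StreamingLineCount {
    // Counts lines while keeping only the current line and the reader's
    // internal buffer in memory, regardless of how large the file is.
    static long countLines(String fileName) throws IOException {
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a throwaway temp file with three lines.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample, "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countLines(sample.toString())); // prints 3
        Files.delete(sample);
    }
}
```

The heap needed here depends on the longest line, not on the file size, which is the point made above.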

What is the advantage of loading a file directly into memory?

It reduces the number of system calls and may read the file faster. How much faster depends on a number of factors. The downside is that really large files become a problem.

How do we do this in Java?

  • Find out how big the file is.
  • Allocate an array of bytes (or characters) large enough to hold it.
  • To read the entire file, use the appropriate read(byte[], int, int) or read(char[], int, int) method.
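The three steps above can be sketched roughly as follows. WholeFileLoader and loadFully are hypothetical names, and the size check is an assumption added because a Java array cannot hold more than about 2 GB. Note that a single read call may return fewer bytes than requested, hence the loop.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.util.Arrays;

final class WholeFileLoader {
    static byte[] loadFully(File file) throws IOException {
        // Step 1: find out how big the file is.
        long size = file.length();
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single array: " + size);
        }
        // Step 2: allocate an array large enough.
        byte[] data = new byte[(int) size];
        // Step 3: read until the array is full; read() may deliver less
        // than requested, so keep going from the current offset.
        try (InputStream in = new FileInputStream(file)) {
            int offset = 0;
            while (offset < data.length) {
                int n = in.read(data, offset, data.length - offset);
                if (n < 0) {
                    throw new EOFException("File shrank while reading");
                }
                offset += n;
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a throwaway temp file with five known bytes.
        File sample = File.createTempFile("sample", ".bin");
        sample.deleteOnExit();
        byte[] expected = {1, 2, 3, 4, 5};
        Files.write(sample.toPath(), expected);
        System.out.println(Arrays.equals(expected, loadFully(sample))); // prints true
    }
}
```

On Java 7 and later, Files.readAllBytes(path) does the same job in one call; the manual version is shown only because it mirrors the steps listed above.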

You can also use a memory mapped file ... but this requires using the Buffer APIs, which can be a little complicated to use.
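For the memory-mapped route, a minimal sketch might look like the one below. MappedFileSum and sumBytes are hypothetical names; the example assumes the file is smaller than 2 GB, since a single FileChannel.map call cannot map a larger region.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileSum {
    // Sums the unsigned value of every byte in the file through a
    // read-only mapping; the OS pages the data in on demand.
    static long sumBytes(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (buffer.hasRemaining()) {
                sum += buffer.get() & 0xFF;
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a throwaway temp file with three known bytes.
        Path sample = Files.createTempFile("sample", ".bin");
        sample.toFile().deleteOnExit();
        Files.write(sample, new byte[] {1, 2, 3});
        System.out.println(sumBytes(sample)); // prints 6
    }
}
```

The file contents live outside the Java heap here, so the heap-size concern mentioned earlier does not apply in the same way, at the cost of working with ByteBuffer rather than plain arrays or readers.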

I found some examples of the Scanner or RandomAccessFile methods. Do they load files into memory?

No and no.

Should I use them? Which of the two should I use?

Do they provide the required features? Do you need to read / analyze text data? Do you need random access to binary data?
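To illustrate the difference in purpose, here is a small sketch: Scanner for parsing text tokens, RandomAccessFile for jumping to an offset in binary data. Neither loads the whole file. The class and method names (ScannerVsRandomAccess, sumInts, byteAt) are hypothetical and only serve the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Scanner;

final class ScannerVsRandomAccess {
    // Scanner: parses whitespace-separated integers while streaming
    // through the text; stops at the first token that is not an int.
    static long sumInts(File textFile) throws IOException {
        long sum = 0;
        try (Scanner scanner = new Scanner(textFile)) {
            while (scanner.hasNextInt()) {
                sum += scanner.nextInt();
            }
        }
        return sum;
    }

    // RandomAccessFile: seeks straight to a byte offset and reads one
    // byte there; returns -1 if the offset is at or past end of file.
    static int byteAt(File binaryFile, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(binaryFile, "r")) {
            raf.seek(offset);
            return raf.read();
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a throwaway temp file used for both calls.
        File sample = File.createTempFile("sample", ".txt");
        sample.deleteOnExit();
        Files.write(sample.toPath(), "10 20 30".getBytes(StandardCharsets.US_ASCII));
        System.out.println(sumInts(sample));
        System.out.println((char) byteAt(sample, 1));
    }
}
```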

Under normal circumstances, you should choose I/O APIs based primarily on the functionality you need, and only secondarily on performance. Wrapping the stream in a BufferedInputStream or BufferedReader usually gives acceptable performance * if you intend to parse the data as you read it. (But if you really need to hold the entire file in memory in its original form, a BufferedXxx wrapper class actually makes reading a little slower.)


* - Please note that acceptable performance is not the same as optimal performance, but your client / project manager will probably not want you to spend time writing code for optimal performance ... unless that is a stated requirement.

+7

If you read the file and parse it in a single pass from beginning to end, extracting your data and never referring to the file again, a buffered reader is about as "optimal" as you will get. You can "tune" the performance somewhat by adjusting the buffer size: a larger buffer reads larger chunks from the file. (Make the buffer size a power of 2, for example 262144.) Reading in an entire large file (larger than, say, 1 MB) will usually cost you performance in paging and heap management.
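A sketch of the buffer tuning suggested above, using the two-argument BufferedReader constructor. LargeBufferRead and countChars are hypothetical names; whether 262144 actually helps depends on your disk and workload, so treat it as a starting point to measure rather than a recommendation.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class LargeBufferRead {
    // 2^18 characters, the example value from the answer above.
    static final int BUFFER_SIZE = 262144;

    // Single pass over the file with an enlarged buffer; returns the
    // total number of characters, not counting line terminators.
    static long countChars(String fileName) throws IOException {
        long total = 0;
        try (BufferedReader reader =
                     new BufferedReader(new FileReader(fileName), BUFFER_SIZE)) {
            String line;
            while ((line = reader.readLine()) != null) {
                total += line.length();
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Demo input: a throwaway temp file with two short lines.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample, "abc\nde\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(countChars(sample.toString())); // prints 5
        Files.delete(sample);
    }
}
```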

+3

Source: https://habr.com/ru/post/1442429/
