What is the fastest way to load a large 2D int array from a file?

I am loading a 2D array from a file. It is 15,000,000 × 3 ints in size (eventually it will be 40,000,000 × 3). Right now I read the ints one at a time with dataInputStream.readInt(), which takes about 15 seconds. Can I make it significantly (at least 3 times) faster, or is this as fast as it gets?

2 answers

Yes, you can. From a benchmark of 13 different ways to read files:

If you need the fastest approach, it will be one of the following:

  • FileChannel with a MappedByteBuffer and array reads.
  • FileChannel with a direct ByteBuffer and array reads.
  • FileChannel with a ByteBuffer wrapped around an array, and direct access to that array.
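
As an illustration of the second option, here is a minimal sketch that reads a whole file of ints through a direct ByteBuffer and copies them out in bulk. The class name `DirectBufferIntReader`, the method `readInts`, and the 1 MiB chunk size are assumptions made for this example, not part of the benchmark. It assumes Java 8 or later (for `Integer.BYTES`) and a file of big-endian ints, which is the layout DataInputStream.readInt() expects.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DirectBufferIntReader {
    // Reads the file as a flat array of big-endian ints.
    // Trailing bytes that do not form a full int are ignored.
    static int[] readInts(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            int[] result = new int[(int) (channel.size() / Integer.BYTES)];
            // Hypothetical chunk size; tune it for your system.
            ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
            int filled = 0;
            while (filled < result.length && channel.read(buf) != -1) {
                buf.flip();
                int n = Math.min(buf.remaining() / Integer.BYTES, result.length - filled);
                // Bulk copy: one call moves n ints into the array.
                buf.asIntBuffer().get(result, filled, n);
                // The int view has its own position, so advance the byte buffer by hand.
                buf.position(buf.position() + n * Integer.BYTES);
                filled += n;
                // Keep any partial int at the front for the next read.
                buf.compact();
            }
            return result;
        }
    }
}
```

For an N × 3 table stored row by row, element (row, col) of the result is at index `row * 3 + col`.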

There are 4 things to keep in mind for better read performance in Java:

  • Minimize I/O operations by reading an array at a time rather than a byte at a time. An 8 KB array is a good size (which is why it is the default buffer size for BufferedInputStream).
  • Minimize method calls by getting data an array at a time rather than a byte at a time. Use array indexing to get at the bytes in the array.
  • Minimize thread synchronization locks if you don't need thread safety. Either make fewer method calls to a thread-safe class, or use a non-thread-safe class such as MappedByteBuffer.
  • Minimize data copying between the JVM/OS, internal buffers, and application arrays. Use a FileChannel with memory mapping, or with a direct or array-wrapped ByteBuffer.
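
The four points above can be sketched with a ByteBuffer wrapped around an 8 KB byte array: each channel read fills the array in one call, and the ints are decoded with plain array indexing. The class name `ChunkedArrayIntReader` and the method `readInts` are hypothetical, and the sketch assumes big-endian ints, as written by DataOutputStream.writeInt().

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChunkedArrayIntReader {
    static int[] readInts(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            int[] result = new int[(int) (channel.size() / 4)];
            byte[] chunk = new byte[8192];               // a multiple of 4, so ints never straddle chunks
            ByteBuffer wrapper = ByteBuffer.wrap(chunk); // channel reads land directly in chunk
            int filled = 0;
            while (filled < result.length) {
                wrapper.clear();
                // Fill the chunk completely, or up to the end of the file.
                while (wrapper.hasRemaining() && channel.read(wrapper) != -1) {
                    // keep reading
                }
                int bytes = wrapper.position();
                if (bytes < 4) {
                    break; // nothing decodable left
                }
                // Decode with array indexing instead of one method call per byte.
                for (int i = 0; i + 4 <= bytes && filled < result.length; i += 4) {
                    result[filled++] = ((chunk[i] & 0xFF) << 24)
                            | ((chunk[i + 1] & 0xFF) << 16)
                            | ((chunk[i + 2] & 0xFF) << 8)
                            | (chunk[i + 3] & 0xFF);
                }
            }
            return result;
        }
    }
}
```

Whether 8 KB or a larger chunk is faster depends on the machine, so it is worth measuring both.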

Map the file into memory!

Java 7 code:

 FileChannel channel = FileChannel.open(Paths.get("/path/to/file"), StandardOpenOption.READ);
 // map() takes the mode first, then the start position and the size in bytes
 ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
 // use buf

See here for more details.

If you are using Java 6, you need to:

 RandomAccessFile file = new RandomAccessFile("/path/to/file", "r");
 FileChannel channel = file.getChannel();
 // same thing to obtain buf

You can even call .asIntBuffer() on the buffer if you want. You then read only what you actually need, when you need it, and the mapped data does not take up space on your heap.
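
To tie this back to the question, here is a minimal sketch that maps the file, views it as an IntBuffer, and fills an N × 3 array one row per call. The class name `MappedIntTableReader`, the method `readRows`, and the `columns` parameter are assumptions for this example. It also assumes the file holds big-endian ints stored row by row and is smaller than 2 GB, since a single mapping cannot exceed Integer.MAX_VALUE bytes.

```java
import java.io.IOException;
import java.nio.IntBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedIntTableReader {
    static int[][] readRows(Path path, int columns) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            IntBuffer ints = channel
                    .map(FileChannel.MapMode.READ_ONLY, 0, channel.size())
                    .asIntBuffer();
            int[][] rows = new int[ints.remaining() / columns][columns];
            for (int[] row : rows) {
                ints.get(row); // bulk read of one full row
            }
            return rows;
        }
    }
}
```

Copying into an `int[][]` does put the data on the heap. To keep it off the heap, skip the copy and read values straight from the buffer with `ints.get(row * columns + col)`.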


Source: https://habr.com/ru/post/1487755/

