Java NIO MappedByteBuffer OutOfMemoryException

I really have a problem: I want to read HUGE files (more than a few GB) using FileChannel and MappedByteBuffer - all the documentation I found implies that it is quite simple to map a file using the FileChannel.map() method. Of course, there is a 2 GB limit, since all Buffer methods use int for position, limit, and capacity - but what about the system-implied limits below that?

In fact, I get lots of problems regarding OutOfMemoryExceptions! And there is no documentation at all that really defines the limits! So - how can I map a file that fits within the int limit safely into one or several MappedByteBuffers without just getting exceptions?

Can I ask the system which portion of a file I can safely map before trying FileChannel.map()? How? Why is there so little documentation about this feature?

+4
source share
4 answers

The bigger the file, the less you want it all in memory at once. Devise a way to process the file a buffer at a time, a line at a time, etc.

MappedByteBuffers are especially problematic, since there is no defined release of the mapped memory, so using more than one at a time is essentially bound to fail.
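For illustration, a minimal sketch of buffer-at-a-time processing with a plain FileChannel and no mapping at all (the class name and the 64 KB buffer size are just examples, not from the answer):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Reads a file of any size one fixed-size buffer at a time, so only a few KB
// are ever held in memory regardless of how big the file is.
public class BufferAtATime {
    public static void process(Path path) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(64 * 1024); // reused for every read
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            while (ch.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    byte b = buf.get();  // handle each byte (or hand the buffer to a parser)
                }
                buf.clear();             // make the buffer ready for the next read
            }
        }
    }
}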

+2
source

I can offer some working code; whether it solves your problem or not is hard to say. It hunts through a file for a pattern recognized by a Hunter.

See the excellent article Java tip: How to read files quickly for the original research (not mine).

// 4k buffer size.
static final int SIZE = 4 * 1024;
static byte[] buffer = new byte[SIZE];

// Fastest because a FileInputStream has an associated channel.
private static void ScanDataFile(Hunter p, FileInputStream f) throws FileNotFoundException, IOException {
    // Use a mapped and buffered stream for best speed.
    // See: http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly
    FileChannel ch = f.getChannel();
    long red = 0L;
    do {
        // Map at most Integer.MAX_VALUE bytes at a time (the per-mapping limit).
        long read = Math.min(Integer.MAX_VALUE, ch.size() - red);
        MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, red, read);
        int nGet;
        while (mb.hasRemaining() && p.ok()) {
            nGet = Math.min(mb.remaining(), SIZE);
            mb.get(buffer, 0, nGet);
            for (int i = 0; i < nGet && p.ok(); i++) {
                p.check(buffer[i]);
            }
        }
        red += read;
    } while (red < ch.size() && p.ok());

    // Finish off.
    p.close();
    ch.close();
    f.close();
}
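The Hunter type is not shown in the answer; a hypothetical shape of that callback, just so the code above reads on its own, might be:

// Hypothetical callback interface assumed by ScanDataFile above (not part of
// the original answer): it inspects one byte at a time and can stop the scan early.
interface Hunter {
    boolean ok();        // return false once the pattern is found or scanning should stop
    void check(byte b);  // feed the next byte of the file
    void close();        // finish up and report results
}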
+8
source

What I use is a List<ByteBuffer>, where each ByteBuffer maps the file in blocks of 16 MB to 1 GB. I use powers of 2 to simplify the logic. I have used this to map files up to 8 TB.

A key limitation of memory-mapped files is that you are limited by your virtual memory. If you have a 32-bit JVM, you will not be able to map very much.

I would not keep creating new memory mappings for a file, because they are never cleaned up. You can create many of them, but there appears to be a limit of about 32K of them on some systems (no matter how small they are).

The main reason I find MemoryMappedFiles useful is that they don't need to be flushed (if you can assume the OS won't die). This lets you write data with low latency, without worrying about losing too much data if the application dies, and without giving up too much performance by having to write() or flush().
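A minimal sketch of the List-of-mappings approach, assuming read-only access and a 1 GB power-of-two chunk size (the class and method names are illustrative, not from the answer):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

// Maps a file of arbitrary size as a list of fixed-size chunks, so any position
// addressable with a long can be read without a single mapping exceeding 2 GB.
public class ChunkedMapping {
    static final long CHUNK = 1L << 30; // 1 GB; a power of two keeps the index math simple

    final List<MappedByteBuffer> chunks = new ArrayList<>();

    ChunkedMapping(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r");
             FileChannel ch = raf.getChannel()) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += CHUNK) {
                long len = Math.min(CHUNK, size - pos);
                // Mappings stay valid after the channel is closed.
                chunks.add(ch.map(FileChannel.MapMode.READ_ONLY, pos, len));
            }
        }
    }

    // Read one byte at an absolute file offset.
    byte get(long offset) {
        return chunks.get((int) (offset / CHUNK)).get((int) (offset % CHUNK));
    }
}

Because the chunk size is a power of two, the chunk index and the offset within it are just a division and a remainder (or equivalently a shift and a mask).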

+6
source

You don't use the FileChannel API to read the entire file at once. Instead, you process the file in parts. See the sample code in Martin Thompson's article comparing the performance of Java IO methods: Java Sequential IO Performance

Also, there is not much documentation because you are making a platform-dependent call. From the map() JavaDoc:

Many of the details of memory mapped files depend on the nature of the underlying operating system and are therefore undefined.

+3
source

Source: https://habr.com/ru/post/1435507/

