Jumping to a line and reading it

I need to work with large files (many GB) and need fast lookups that fetch specific strings on demand.

The idea is to maintain a mapping:

some_key -> byte_location

Where the byte location represents where the line begins in the file.
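
One way such an index could be built is to scan the file once as raw bytes and record the offset at which each line starts. The sketch below is only an illustration: the class name `LineIndex` and the rule that the key is the text before the first tab are assumptions, since the question does not say how keys are derived. Scanning bytes for `\n` is safe in UTF-8 because that byte never occurs inside a multi-byte sequence.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

final class LineIndex {
    /** Scans the file once and maps each line's key to the byte offset where that line begins. */
    static Map<String, Long> build(Path file) throws IOException {
        Map<String, Long> index = new HashMap<>();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            long offset = 0;    // number of bytes consumed so far
            long lineStart = 0; // offset of the first byte of the current line
            int b;
            while ((b = in.read()) != -1) {
                offset++;
                if (b == '\n') {
                    put(index, line, lineStart);
                    line.reset();
                    lineStart = offset;
                } else {
                    line.write(b);
                }
            }
            if (line.size() > 0) {
                put(index, line, lineStart); // last line without a trailing newline
            }
        }
        return index;
    }

    // Assumption: the key is everything before the first tab (hypothetical line format).
    private static void put(Map<String, Long> index, ByteArrayOutputStream line, long start) {
        String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
        if (text.endsWith("\r")) {
            text = text.substring(0, text.length() - 1); // tolerate CRLF line endings
        }
        int tab = text.indexOf('\t');
        index.put(tab >= 0 ? text.substring(0, tab) : text, start);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lines", ".txt");
        Files.write(file, "k1\t\u00e9t\u00e9\nk2\tsecond\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(build(file));
        Files.delete(file);
    }
}
```

Because the offsets are counted in bytes rather than characters, they stay valid for seeking even when lines contain multi-byte UTF-8 characters.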

Edit: The question has changed a bit:

At first I used:

    FileInputStream stream = new FileInputStream(file);
    BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
    FileChannel channel = stream.getChannel();

I noticed that FileChannel.position() does not return the exact position the reader has reached, because the reader is buffered. It pulls data from the stream in chunks of a fixed size (16k here), so what I get from the FileChannel is a multiple of 16k rather than the position the reader is actually at.

PS: the file is in UTF-8.
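
The effect is easy to reproduce on a small file. In the sketch below (the class and method names are made up for illustration), the reader returns only the first line, yet the channel has already moved past it because the reader filled its buffer from the stream in one go.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ChannelPositionDemo {
    /** Reads one line through a BufferedReader and returns the channel position afterwards. */
    static long positionAfterFirstLine(Path file) throws IOException {
        try (FileInputStream stream = new FileInputStream(file.toFile());
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            FileChannel channel = stream.getChannel();
            reader.readLine();         // consumes one line from the reader's point of view
            return channel.position(); // but the stream has been read further ahead
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".txt");
        Files.write(file, "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8));
        // The first line ends at byte 6; the reported position is larger than that.
        System.out.println(positionAfterFirstLine(file));
        Files.delete(file);
    }
}
```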

+3
2 answers

Try a RandomAccessFile instead of a BufferedReader:

    RandomAccessFile raf = new RandomAccessFile(file, "r");
    ...
    raf.seek(position);
    raf.readLine();
    ...

The catch is that readLine() assumes 8-bit characters: each byte becomes one char. That works for ASCII and Latin-1, but not for UTF-8.
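
To make the problem concrete, here is a small sketch (the names `RafReadLineDemo`, `rawLine` and `utf8Line` are hypothetical). `rawLine` shows what `readLine()` returns for a UTF-8 line. `utf8Line` applies a commonly used workaround that is not part of this answer: since every char in the result holds exactly one original byte, the string can be turned back into bytes via ISO-8859-1 and decoded again as UTF-8.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class RafReadLineDemo {
    /** Returns the line at the given byte offset exactly as RandomAccessFile.readLine() produces it. */
    static String rawLine(Path file, long pos) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(pos);
            return raf.readLine(); // one char per byte, so multi-byte characters come out garbled
        }
    }

    /** Workaround sketch: recover the original bytes and decode them as UTF-8. */
    static String utf8Line(Path file, long pos) throws IOException {
        String raw = rawLine(file, pos);
        if (raw == null) {
            return null; // offset was at or past the end of the file
        }
        byte[] bytes = raw.getBytes(StandardCharsets.ISO_8859_1); // chars 0..255 map back to the same bytes
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("raf", ".txt");
        Files.write(file, "caf\u00e9\nnext\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(rawLine(file, 0).length());  // 5 chars: the 2-byte character became 2 chars
        System.out.println(utf8Line(file, 0).length()); // 4 chars after re-decoding
        Files.delete(file);
    }
}
```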

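Note that RandomAccessFile does have readUTF() and writeUTF() methods, but they work with length-prefixed "modified" UTF-8 strings, not plain UTF-8 text.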

Followup

dammit... utf-8

In that case, to read UTF-8 lines with RandomAccessFile:

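  • seek to the byte position of the line,
  • use readFully(byte[]) to read into a byte[],
  • scan the buffer for the end of the line, so that pos == the index of the line terminator,
  • if it is not found, grow the buffer and go back to step 2,
  • once it is found, use new String(bytes, 0, pos, "UTF-8") to convert the bytes to a Java string.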
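
The steps above can be sketched roughly as follows. The class name `Utf8LineReader`, the method `readLineAt` and the initial buffer size are assumptions made for illustration. One deliberate deviation: the sketch calls `read(byte[], int, int)` rather than `readFully(byte[])`, because `readFully` throws an EOFException when fewer bytes remain than the buffer holds, which would happen for lines near the end of the file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class Utf8LineReader {
    /** Returns the UTF-8 line starting at the given byte offset, or null at end of file. */
    static String readLineAt(RandomAccessFile raf, long position) throws IOException {
        raf.seek(position);
        byte[] buf = new byte[256]; // arbitrary starting size, doubled when full
        int filled = 0;
        while (true) {
            int n = raf.read(buf, filled, buf.length - filled);
            int end = filled + Math.max(n, 0);
            // Scan only the newly read bytes for the line terminator.
            for (int i = filled; i < end; i++) {
                if (buf[i] == '\n') {
                    int len = (i > 0 && buf[i - 1] == '\r') ? i - 1 : i; // drop CR of a CRLF pair
                    return new String(buf, 0, len, StandardCharsets.UTF_8);
                }
            }
            filled = end;
            if (n < 0) {
                // End of file: return what was collected, or null if there was nothing.
                return filled == 0 ? null : new String(buf, 0, filled, StandardCharsets.UTF_8);
            }
            if (filled == buf.length) {
                buf = Arrays.copyOf(buf, buf.length * 2); // no terminator yet, grow and read more
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("utf8", ".txt");
        Files.write(file, "k1\t\u00e9t\u00e9\nk2\tsecond\n".getBytes(StandardCharsets.UTF_8));
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            System.out.println(readLineAt(raf, 9)); // the second line starts at byte 9
        }
        Files.delete(file);
    }
}
```

Decoding happens only after the whole line has been collected, so a multi-byte character that straddles two reads is never split.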

Alternatively, forget about readLine() on RandomAccessFile, and use a FileInputStream with skip() to get to the right position.

+2

Why not create a FileInputStream, call stream.skip(pos), then create an InputStreamReader on it and a BufferedReader on top of that InputStreamReader?
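
A minimal sketch of that suggestion, with a made-up class name `SkipAndRead`. Two details are assumptions added here: `skip()` is called in a loop because it may skip fewer bytes than requested, and the charset is passed explicitly so that decoding does not depend on the platform default.

```java
import java.io.BufferedReader;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SkipAndRead {
    /** Opens the file, skips to the byte offset, then reads one UTF-8 line from there. */
    static String readLineAt(Path file, long pos) throws IOException {
        try (FileInputStream stream = new FileInputStream(file.toFile())) {
            long remaining = pos;
            while (remaining > 0) {
                long skipped = stream.skip(remaining); // may skip less than asked for
                if (skipped <= 0) {
                    throw new EOFException("could not skip to offset " + pos);
                }
                remaining -= skipped;
            }
            // Create the readers only after skipping, so their buffers start at the offset.
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(stream, StandardCharsets.UTF_8));
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("skip", ".txt");
        Files.write(file, "k1\t\u00e9t\u00e9\nk2\tsecond\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLineAt(file, 9)); // the second line starts at byte 9
        Files.delete(file);
    }
}
```

This version reopens the file for every lookup, which keeps it simple; whether that cost matters depends on how often lookups happen.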

+3

Source: https://habr.com/ru/post/1773007/

