Java: the fastest way to read text files char by char

I have almost 500 text files with 10 million words. I have to index these words. What is the fastest way to read a text file character by character? Here is my initial attempt:

    InputStream ist = new FileInputStream(this.path + "/" + doc);
    BufferedReader in = new BufferedReader(new InputStreamReader(ist));
    String line;
    while ((line = in.readLine()) != null) {
        line = line.toUpperCase(Locale.ENGLISH);
        String word = "";
        // j < line.length(): with <= the last charAt(j) throws StringIndexOutOfBoundsException
        for (int j = 0; j < line.length(); j++) {
            char c = line.charAt(j);
            // OPERATIONS
        }
    }
3 answers

Switching from readLine() to read() will not make a significant performance difference.

Read more: Peter Lawrey's comparison of read() and readLine()

Now back to your original question:
Suppose the input line is: hello how are you?
You then need to index the words of that line, for example:

    BufferedReader r = new BufferedReader(new InputStreamReader(inputStream));
    String line;
    while ((line = r.readLine()) != null) {
        String[] splitString = line.split("\\s+");
        // Do stuff with the array here, i.e. construct the index.
    }

Note: the pattern \\s+ splits the line on any run of whitespace, such as tabs, spaces, etc.
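
To make the "construct the index" step concrete, here is a minimal sketch. It assumes that "index" simply means a map from upper-cased word to occurrence count, which the question does not specify; the class name SplitIndexSketch and the method indexWords are made up for illustration. It takes any Reader, so for a file you could pass an InputStreamReader over a FileInputStream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class SplitIndexSketch {
    // Hypothetical helper: counts how often each whitespace-separated word occurs.
    // Treating the "index" as word -> count is an assumption, not the asker's design.
    static Map<String, Integer> indexWords(Reader source) throws IOException {
        Map<String, Integer> index = new HashMap<>();
        try (BufferedReader r = new BufferedReader(source)) {
            String line;
            while ((line = r.readLine()) != null) {
                // trim() avoids a leading empty token when the line starts with whitespace
                for (String word : line.trim().split("\\s+")) {
                    if (word.isEmpty()) {
                        continue; // a blank line still yields one empty token
                    }
                    index.merge(word.toUpperCase(Locale.ENGLISH), 1, Integer::sum);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> index =
                indexWords(new StringReader("hello how are you?\nhello again"));
        System.out.println(index); // HashMap iteration order is unspecified
    }
}
```

Note that splitting on whitespace alone keeps punctuation attached, so "you?" is indexed with its question mark unless you strip it separately.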


The InputStreamReader read() method reads one character at a time.

You can wrap it around a FileInputStream and, for buffering, wrap the result in a BufferedReader, for example.

Hope this helps!


Do not read lines and then scan the resulting strings char by char. That way you process each character twice. Just read the characters directly through BufferedReader.read().
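
A minimal sketch of that single-pass approach, under some assumptions: a "word" is taken to be a run of letters or digits, and the names CharTokenizerSketch and readWords are hypothetical. read() returns -1 at end of input, so the loop builds each word as the characters arrive instead of materializing lines first.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class CharTokenizerSketch {
    // Hypothetical helper: reads the input one char at a time and collects
    // upper-cased words. Defining a word as letters/digits is an assumption.
    static List<String> readWords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            int c;
            while ((c = in.read()) != -1) { // -1 signals end of stream
                if (Character.isLetterOrDigit(c)) {
                    word.append(Character.toUpperCase((char) c));
                } else if (word.length() > 0) {
                    words.add(word.toString()); // a delimiter ends the current word
                    word.setLength(0);
                }
            }
        }
        if (word.length() > 0) {
            words.add(word.toString()); // flush the last word if input ends mid-word
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readWords(new StringReader("hello how are you?")));
    }
}
```

For a real file you would pass something like new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8); the explicit charset is a suggestion, since the question relies on the platform default. Character.toUpperCase is locale-independent, so it may differ slightly from toUpperCase(Locale.ENGLISH) for a few non-ASCII characters.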


Source: https://habr.com/ru/post/1379221/

