Does the Scanner class load an entire file into memory at once?

I often use the Scanner class to read files because it is so convenient.

    String inputFileName;
    Scanner fileScanner;
    inputFileName = "input.txt";
    fileScanner = new Scanner(new File(inputFileName));

My question is: does the code above load the entire file at once? Or do subsequent calls on the Scanner, for example

    fileScanner.nextLine();

read from the file (i.e. from external storage, not from memory)? I ask because I am worried about what happens if the file is too large to fit in memory all at once. Thank you.

+3
4 answers

If you read the source code, you can answer the question yourself.

The implementation of the Scanner constructor in question is:

    public Scanner(File source) throws FileNotFoundException {
        this((ReadableByteChannel)(new FileInputStream(source).getChannel()));
    }

The channel is then wrapped in a Reader:

    private static Readable makeReadable(ReadableByteChannel source, CharsetDecoder dec) {
        return Channels.newReader(source, dec, -1);
    }

And it is read through a buffer of this size:

 private static final int BUFFER_SIZE = 1024; // change to 1024; 

As you can see in the final constructor in the constructor chain:

    private Scanner(Readable source, Pattern pattern) {
        assert source != null : "source should not be null";
        assert pattern != null : "pattern should not be null";
        this.source = source;
        delimPattern = pattern;
        buf = CharBuffer.allocate(BUFFER_SIZE);
        buf.limit(0);
        matcher = delimPattern.matcher(buf);
        matcher.useTransparentBounds(true);
        matcher.useAnchoringBounds(false);
        useLocale(Locale.getDefault(Locale.Category.FORMAT));
    }

So the Scanner does not read the entire file at once.
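If you would rather check this empirically than trust a reading of the source, here is a minimal sketch. It assumes nothing beyond the public Scanner(Readable) constructor; CountingReadable and ScannerBufferDemo are hypothetical names made up for this illustration, and an in-memory StringReader stands in for the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.CharBuffer;
import java.util.Scanner;

// Hypothetical helper: forwards reads to another Readable and counts
// how many characters the Scanner has actually pulled from it.
class CountingReadable implements Readable {
    private final Readable delegate;
    long charsDelivered = 0;

    CountingReadable(Readable delegate) {
        this.delegate = delegate;
    }

    @Override
    public int read(CharBuffer target) throws IOException {
        int n = delegate.read(target);
        if (n > 0) {
            charsDelivered += n;
        }
        return n;
    }
}

class ScannerBufferDemo {
    // Reads a single line and reports how many characters were pulled
    // from the underlying source in order to produce it.
    static long charsPulledForFirstLine(String text) {
        CountingReadable source = new CountingReadable(new StringReader(text));
        try (Scanner scanner = new Scanner(source)) {
            scanner.nextLine();
            return source.charsDelivered;
        }
    }
}
```

Calling ScannerBufferDemo.charsPulledForFirstLine(text) with a large multi-line string should report far fewer characters than text.length(), roughly on the order of the buffer size, which is consistent with the incremental reading described above.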

+12

From reading the code, it loads 1,024 characters (about 1K) at a time by default. The buffer can grow for long lines of text, up to the size of the longest line you have.

+1

You are better off with something like a BufferedReader wrapping a FileReader for large files. A basic example can be found here.
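In case the linked example is unavailable, here is a rough sketch of the idea: process one line at a time so that only the current line (plus BufferedReader's internal buffer) is held in memory. LargeFileLineCounter and its methods are hypothetical names for this illustration; counting lines is just a stand-in for whatever per-line work you need.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class LargeFileLineCounter {
    // Streams the source line by line; memory use does not depend on
    // the total size of the input, only on the longest line.
    static long countLines(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            // Unlike Scanner, BufferedReader reports I/O failures directly.
            throw new UncheckedIOException(e);
        }
    }

    // Convenience overload for files; FileReader uses the platform
    // default charset unless you pass one explicitly.
    static long countLines(File file) {
        try {
            return countLines(new FileReader(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, LargeFileLineCounter.countLines(new File("input.txt")) would count the lines of the file from the question without loading it whole.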

0

In ACM contests, fast input reading is very important. In Java, we found that something like this is very fast:

    FileInputStream inputStream = new FileInputStream("input.txt");
    InputStreamReader streamReader = new InputStreamReader(inputStream, "UTF-8");
    BufferedReader in = new BufferedReader(streamReader);
    Map<String, Integer> map = new HashMap<String, Integer>();
    int trees = 0;
    for (String s; (s = in.readLine()) != null; trees++) {
        Integer n = map.get(s);
        if (n != null) {
            map.put(s, n + 1);
        } else {
            map.put(s, 1);
        }
    }

In this case the file contains tree names, one per line:

    Red Alder
    Ash
    Aspen
    Basswood
    Ash
    Beech
    Yellow Birch
    Ash
    Cherry
    Cottonwood

You can use StringTokenizer to pull out any part of the line you want.
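As a small sketch of the StringTokenizer part (LineTokens is a hypothetical helper name, and splitting on whitespace is an assumption; pass a delimiter string to the constructor if your input uses something else):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class LineTokens {
    // Splits one line into tokens using StringTokenizer's default
    // delimiters (space, tab, newline, carriage return, form feed).
    static List<String> tokens(String line) {
        StringTokenizer tokenizer = new StringTokenizer(line);
        List<String> result = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }
}
```

With the sample data above, LineTokens.tokens("Yellow Birch") would give the two words "Yellow" and "Birch", so you could pick out just the second one, for instance.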

We ran into errors when using Scanner on large files: it read only 100 lines from a file with 10,000 lines!

A scanner can read text from any object that implements the Readable interface. If a call to the underlying readable's Readable.read(java.nio.CharBuffer) method throws an IOException, the scanner assumes that the end of the input has been reached. The most recent IOException thrown by the underlying readable can be retrieved via the ioException() method.

(as stated in the API documentation)
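To make that behaviour concrete, here is a sketch that simulates a failing source. FailAfterFirstRead and ScannerIoExceptionDemo are hypothetical names invented for this illustration; the only Scanner calls used are hasNextLine(), nextLine() and ioException(). The input is assumed to be shorter than the Scanner's buffer, so the first read delivers all of it and the second read fails.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.CharBuffer;
import java.util.Scanner;

// Hypothetical source that succeeds on the first read and then fails,
// standing in for a disk or network error partway through a file.
class FailAfterFirstRead implements Readable {
    private final StringReader delegate;
    private boolean firstReadDone = false;

    FailAfterFirstRead(String text) {
        this.delegate = new StringReader(text);
    }

    @Override
    public int read(CharBuffer target) throws IOException {
        if (firstReadDone) {
            throw new IOException("simulated I/O failure");
        }
        firstReadDone = true;
        return delegate.read(target);
    }
}

class ScannerIoExceptionDemo {
    // Reads until the Scanner reports no more lines, then checks whether
    // it stopped because of a swallowed IOException.
    static String summarize(String text) {
        int lines = 0;
        try (Scanner scanner = new Scanner(new FailAfterFirstRead(text))) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                lines++;
            }
            IOException failure = scanner.ioException();
            String cause = (failure == null) ? "none" : failure.getMessage();
            return lines + " lines read, ioException: " + cause;
        }
    }
}
```

The loop ends quietly, as if the input had simply run out; only the ioException() check afterwards reveals that reading was cut short. If a Scanner returns fewer lines than you expect, calling ioException() after the loop is a cheap way to find out whether this is what happened.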

Good luck

0

Source: https://habr.com/ru/post/985170/

