There are two factors here: time and memory consumption. The time mainly depends on the number of times java.lang.AbstractStringBuilder.expandCapacity() is called. Of course, the cost of each call is linear in the current buffer size, but I simplify here and just count the calls:
Number of expandCapacity() calls (time)
Default configuration (16-character capacity)
- In 60% of cases, StringBuilder will expand 0 times
- In 39% of cases, StringBuilder will expand 8 times
- In 1% of cases, StringBuilder will expand 11 times
The expected number of expandCapacity() calls is 3.23.
Starting capacity of 4096 characters
- In 99% of cases, StringBuilder will expand 0 times
- In 1% of cases, StringBuilder will expand 3 times
The expected number of expandCapacity() calls is 0.03.
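The counts above can be reproduced with a small simulation. This is only a sketch under two assumptions that the answer does not spell out: that the buffer grows to `2 * capacity + 2` on each expansion (the rule I believe the classic JDK implementation uses) and that the three cases correspond to final lengths of 16, 4096 and 32768 characters, appended one character at a time. The class and method names are made up for illustration.

```java
public class ExpansionCount {

    // Counts how many times a buffer starting at initialCapacity must grow
    // to hold finalLength characters, assuming each growth yields 2 * old + 2.
    static int expansions(int initialCapacity, int finalLength) {
        int capacity = initialCapacity;
        int count = 0;
        while (capacity < finalLength) {
            capacity = capacity * 2 + 2; // assumed growth rule
            count++;
        }
        return count;
    }

    // Weighted average of the given values.
    static double expected(double[] probabilities, int[] values) {
        double sum = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            sum += probabilities[i] * values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] lengths = {16, 4096, 32768}; // assumed representative lengths
        double[] probabilities = {0.60, 0.39, 0.01};

        for (int initialCapacity : new int[] {16, 4096}) {
            int[] counts = new int[lengths.length];
            for (int i = 0; i < lengths.length; i++) {
                counts[i] = expansions(initialCapacity, lengths[i]);
            }
            System.out.println("initial capacity " + initialCapacity
                    + ": expected expansions = " + expected(probabilities, counts));
        }
    }
}
```

With these assumptions the per-case counts come out as 0, 8 and 11 for the default capacity and 0, 0 and 3 for a 4096-character start. The second configuration folds the 60% and 39% cases together, which is where the 99% figure comes from.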
As you can see, the second scenario looks much faster, since the StringBuilder rarely needs to expand (three expansions per 100 inputs on average). Note, however, that the first expansions are the least significant (they copy only a small amount of memory); also, if you append strings to the builder in large chunks, it grows more aggressively and needs fewer iterations.
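The effect of chunk size can be observed directly through StringBuilder.capacity(). The sketch below (class and method names are hypothetical) fills a default-sized builder with the same number of characters twice, once character by character and once in large chunks, and counts how often the reported capacity changes. Exact counts may depend on the JDK version, so none are stated here.

```java
import java.util.Arrays;

public class ChunkGrowthDemo {

    // Appends totalChars characters in pieces of chunkSize and returns
    // how many times the builder's capacity changed along the way.
    static int countGrowths(int totalChars, int chunkSize) {
        char[] filler = new char[chunkSize];
        Arrays.fill(filler, 'x');
        String chunk = new String(filler);

        StringBuilder sb = new StringBuilder(); // default capacity
        int growths = 0;
        int lastCapacity = sb.capacity();
        while (sb.length() < totalChars) {
            sb.append(chunk);
            if (sb.capacity() != lastCapacity) {
                growths++;
                lastCapacity = sb.capacity();
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        System.out.println("char by char: " + countGrowths(32768, 1));
        System.out.println("8K chunks:    " + countGrowths(32768, 8192));
    }
}
```

When a single append needs more room than one doubling provides, the builder jumps straight to the required size, so the chunked run should report noticeably fewer capacity changes.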
On the other hand, memory consumption increases:
Memory consumption
Default configuration (16-character capacity)
- In 60% of cases, StringBuilder will occupy 16 characters
- In 39% of cases, StringBuilder will occupy 4K characters
- In 1% of cases, StringBuilder will occupy 32K characters
Expected average memory consumption: 1935 characters.
Starting capacity of 4096 characters
- In 99% of cases, StringBuilder will occupy 4K characters
- In 1% of cases, StringBuilder will occupy 32K characters
Expected average memory consumption: 4383 characters.
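For anyone who wants to check the averages, here is the arithmetic as a short sketch. It takes 4K as 4096 and 32K as 32768 characters, which is my reading of the figures above, and the names are illustrative only.

```java
import java.util.Locale;

public class ExpectedMemory {

    // Weighted average buffer size in characters.
    static double expectedChars(double[] probabilities, int[] capacities) {
        double sum = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            sum += probabilities[i] * capacities[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Default configuration: 60% stay at 16, 39% end near 4K, 1% near 32K.
        double defaultConfig = expectedChars(
                new double[] {0.60, 0.39, 0.01}, new int[] {16, 4096, 32768});
        // 4096 start: 99% stay at 4K, 1% end near 32K.
        double largeConfig = expectedChars(
                new double[] {0.99, 0.01}, new int[] {4096, 32768});

        System.out.println("default: " + Math.round(defaultConfig) + " characters");
        System.out.println("4096:    " + Math.round(largeConfig) + " characters");
        System.out.println(String.format(Locale.ROOT, "ratio:   %.2f",
                largeConfig / defaultConfig));
    }
}
```

The two averages round to 1935 and 4383 characters, a ratio of roughly 2.27.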
TL;DR
This makes me think that increasing the initial buffer to 4K will more than double the memory consumption (about 4383 versus 1935 characters on average) while cutting the number of buffer expansions by about two orders of magnitude (0.03 versus 3.23 per input).
The bottom line is: try it! It's not so difficult to write a test that processes millions of strings of different lengths with different initial capacities. But I believe that a large buffer may be a good choice.
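A rough starting point for such a test might look like the sketch below. It is deliberately naive: it reuses the 60/39/1 distribution from this answer with assumed lengths of 16, 4096 and 32768 characters, appends one character at a time, and times the runs with System.nanoTime() after a single warm-up pass. All names are hypothetical, and for numbers you intend to rely on, a proper harness such as JMH and your real input data would be a better choice.

```java
import java.util.Random;

public class BuilderCapacityBenchmark {

    // Draws string lengths from the assumed distribution:
    // 60% short (16 chars), 39% around 4K, 1% around 32K.
    static int[] sampleLengths(int count, long seed) {
        Random random = new Random(seed);
        int[] lengths = new int[count];
        for (int i = 0; i < count; i++) {
            int p = random.nextInt(100);
            lengths[i] = p < 60 ? 16 : (p < 99 ? 4096 : 32768);
        }
        return lengths;
    }

    // Builds one string per length with the given initial capacity and
    // returns the total character count so the work is not optimized away.
    static long buildAll(int[] lengths, int initialCapacity) {
        long total = 0;
        for (int length : lengths) {
            StringBuilder sb = new StringBuilder(initialCapacity);
            for (int i = 0; i < length; i++) {
                sb.append('x');
            }
            total += sb.length();
        }
        return total;
    }

    public static void main(String[] args) {
        int[] lengths = sampleLengths(20_000, 42L);
        for (int capacity : new int[] {16, 4096}) {
            buildAll(lengths, capacity); // warm-up pass
            long start = System.nanoTime();
            long total = buildAll(lengths, capacity);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("capacity " + capacity + ": "
                    + elapsedMs + " ms for " + total + " characters");
        }
    }
}
```

Timings from a loop like this vary a lot between machines and JVM versions, so treat them as a hint rather than a measurement.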