This looks like it may be related to your problem. It is really only worth a comment, but it is too long to post as one without becoming illegible.
    perlbrew exec --with=5.10.0 memusage perl -e '$x = q[a] x 1_000_000_000; print length($x)'
    5.10.0
    ==========
    1000000000
    Memory usage summary: heap total: 2000150514, heap peak: 2000141265, stack peak: 4896
Yes, that is 2 GB of memory for 1 GB of text.
Now with 2 GB of text ...
    perlbrew exec --with=5.10.0 memusage perl -e '$x = q[a] x 1_000_000_000; $y = q[a] x 1_000_000_000; print length($x)+length($y)'
    5.10.0
    ==========
    2000000000
    Memory usage summary: heap total: 4000151605, heap peak: 4000142092, stack peak: 4896
Ouch. That would certainly hit the 32-bit limit if you had one.
I was spoiled and did my testing on 5.19.5, which has a notable improvement, namely copy-on-write strings, which significantly reduces memory consumption:
    perlbrew exec --with=5.19.5 memusage perl -e '$x = q[a] x 1_000_000_000; $y = q[a] x 1_000_000_000; print length($x)+length($y)'
    5.19.5
    ==========
    2000000000
    Memory usage summary: heap total: 2000157713, heap peak: 2000150396, stack peak: 5392
So, anyway, if you are using any version of Perl other than a development release, you should expect it to consume twice as much memory as the string itself needs.
If, for some reason, there is a memory limit around the 2 GB mark for 32-bit processes, you will hit it with a 1 GB string.
Why does copy-on-write matter?
OK, when you do $a = $b, $a is a copy of $b.
So when you do $a = "a" x 1_000_000_000, Perl first expands the right-hand side, creating a temporary value, and then makes a copy of it to store in $a.
You can prove this by eliminating the copy as follows:
    perlbrew exec --with=5.10.0 memusage perl -e 'print length(q[a] x 1_000_000_000)'
    5.10.0
    ==========
    1000000000
    Memory usage summary: heap total: 1000150047, heap peak: 1000140886, stack peak: 4896
See, all I did was remove the intermediate variable, and memory usage halved! :S
However, since 5.19.5 only references the original string and copies it on write, it is efficient by default, so removing the intermediate variable brings negligible benefit:
    perlbrew exec --with=5.19.5 memusage perl -e 'print length(q[a] x 1_000_000_000)'
    5.19.5
    ==========
    1000000000
    Memory usage summary: heap total: 1000154123, heap peak: 1000145146, stack peak: 5392