TL DR: Your example is a pathological case for one of the internal representations of v8 strings. You can fix this by indexing in output once in a while (information on why below).
First, we can use heapdump to find out what the garbage collector does:

The picture above was taken shortly before the node runs out of memory. As you can see, most things look fine: we see two lines (a very large output and a small piece to be added), three links to the same big array (approximately 64 MB, similar to what we expect), and many smaller items that don't look unusual.
But, one thing stands out: output is a whopping 1.4+ GB. At the time the picture was taken, it was approximately 80 million characters, so ~ 160 MB, assuming 2 bytes per character. How is this possible?
Perhaps this is due to the v8 internal string representation. Quoting mraleph :
There are two types of [lines of v8] (actually more, but only these two are important for the problem under consideration):
- Flat strings are immutable character arrays
- cons strings are pairs of strings, the result of concatenation.
If you concatenate a and b, you get the string cons (a, b), which represents the result of concatenation. If you later agree to this, you will get another cons string ((a, b), d).
Indexing into such a "tree-like" string is not O (1), therefore, for acceleration, V8 aligns the string during indexing: copies all characters to a flat string.
So can it be that v8 represents output as a giant tree? One way to check is to make v8 smooth the line (as suggested by mraleph above), for example, by indexing in output at regular intervals inside the for loop:
if (i % 10000000 === 0) {
And indeed, the program works successfully!
One question remains: why did version 2 work? It seems that in this case, v8 can optimize most string concatenations (all on the right side, which are converted to bitwise operations in a 4-element array).