The tests on this page are heavily biased, so let's see by how much. The author claims to be testing string manipulation, but here is what the programs on that page are actually testing:
- String concatenation
- Memory management, either explicit (C) or implicit
- In some languages, regular expressions
- In other cases, string search algorithms and substring replacement
- Memory accesses that are bounds-checked in several languages
There are too many aspects. Here's how it is measured:
This is unfortunate, since the computer would have to be dedicated entirely to this one test to produce sane values, with no other processes running at all: no services, antivirus, browsers, not even a waiting *nix shell. CPU time would be much more useful, and you could then even run the tests in a virtual machine.
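As a minimal sketch of what measuring CPU time could look like in C, the following uses the standard `clock()` function, which on POSIX systems reports processor time consumed by the process rather than elapsed time. `busy_work` is a hypothetical stand-in for the benchmark body and `cpu_seconds` is an illustrative helper name; neither comes from the original tests.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload standing in for the benchmark body. */
unsigned long busy_work(size_t n) {
    volatile unsigned long acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += i;
    }
    return acc;
}

/* Returns the CPU seconds consumed while running fn(n),
   or -1.0 if clock() cannot report processor time. */
double cpu_seconds(unsigned long (*fn)(size_t), size_t n) {
    assert(fn != NULL);
    clock_t start = clock();
    if (start == (clock_t)-1) {
        return -1.0;
    }
    fn(n);
    clock_t end = clock();
    if (end == (clock_t)-1) {
        return -1.0;
    }
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Time spent by other processes on the same machine does not count toward this figure, which is why it is less sensitive to background load than elapsed time.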
Another aspect is that characters in C, C++, Perl, Python, PHP and Ruby are 8-bit, but they are 16-bit in many of the other tested languages. This means that memory is stressed in very different amounts, by at least a factor of 2. Here, cache misses are much more noticeable.
I suspect Perl is so fast because it checks its arguments once before calling the C functions, rather than constantly checking bounds. Other languages with 8-bit strings are not as fast, but still fast enough.
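To illustrate the difference between checking on every access and checking once up front, here is a rough sketch in C. Both functions and their names are hypothetical; they do not reflect how Perl or any other runtime is actually implemented, only the general shape of the two approaches.

```c
#include <assert.h>
#include <stddef.h>

/* Per-access checking: every single read validates the index first.
   Returns 0 and stores the byte in *out, or -1 if i is out of range. */
int checked_get(const char *buf, size_t len, size_t i, char *out) {
    assert(buf != NULL && out != NULL);
    if (i >= len) {
        return -1;
    }
    *out = buf[i];
    return 0;
}

/* Hoisted checking: validate the range [start, end) once,
   then scan with raw, unchecked accesses.
   Returns the index of the first match, end if there is no match,
   or (size_t)-1 if the range is invalid. */
size_t find_byte_in_range(const char *buf, size_t len,
                          size_t start, size_t end, char c) {
    assert(buf != NULL);
    if (start > end || end > len) {
        return (size_t)-1;
    }
    for (size_t i = start; i < end; i++) {
        if (buf[i] == c) {
            return i;
        }
    }
    return end;
}
```

A search built on `checked_get` pays for one comparison per character on top of the search itself, while `find_byte_in_range` pays for two comparisons per call.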
JavaScript V8 has strings that are ASCII when possible, so if the appended and replaced token were "ëfgh", you would pay a lot more in that implementation.
Python 3 is almost three times slower than Python 2, and I assume this is due to the internal representation of strings as wchar_t * versus char *.
JavaScript SpiderMonkey uses 16-bit strings. I did not dig deep into the sources, but the jsstr.h file mentions ropes.
Java is so slow because String is immutable, so for this test it is definitely not the appropriate data type. You pay the price of creating a huge string after each .replace(). I have not tested it, but StringBuffer would probably be much faster.
So this test should be taken not just with a grain of salt, but with a handful.
In Common Lisp, bounds checking and type dispatching in aref and its setf are probably the bottlenecks.
For good performance, you would have to drop generic string sequences and use simple-strings or simple-vectors, whichever your implementation optimizes best. Then you need a way to make calls to schar or svref and their setf-able forms bypass bounds checking. From there, you can implement your own simple-string-search or simple-character-vector-search (and replace-simple-string or replace-simple-vector, although they play a much smaller role in this particular example) with full speed optimization and type declarations, with bounds checks at the head of each call instead of on every array access.
A sufficiently smart compiler™ would do all this for you, given the “right” declarations. The problem is that you would have to use (concatenate 'simple-string/simple-vector ...), because neither simple strings nor simple vectors are adjustable.
With a compacting/moving GC there is always a penalty in these cases (e.g. copying the array or object), so the choice between adjusting an array and concatenating should really depend on profiling. Otherwise, adjusting can be faster than concatenating, as long as there is enough free memory to grow the array.
You could use adjustable arrays, provided the implementation accesses the actual elements directly after a brief bounds check at the head of optimized calls to search and replace on adjustable arrays (e.g. by having internal definitions that take the actual displaced vector/array and the start and end offsets).
But I am speculating a lot here. You would have to compile, inspect the compiled code and profile in each implementation to get real-world facts.
As a side note, the example C code is full of bugs, such as:
- Off-by-one (-1, actually) allocations: the strcat calls write one extra byte, the null terminator
- An uninitialized gstr that is treated as a null-terminated string (the first strcat works only by luck, since the memory is not guaranteed to be initialized to 0)
- Conversions of size_t and time_t to int, and the assumption of those types in the printf format string
- An unused variable pos_c, initialized from the first allocation of gstr and incremented without taking into account that realloc may move the buffer
- No error handling at all
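For comparison, here is a minimal sketch of an append routine that avoids the allocation issues described above. It is not the benchmark's code: `struct gbuf` and `gbuf_append` are hypothetical names chosen for illustration, and the sketch ignores integer overflow of the requested size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; not taken from the benchmark. */
struct gbuf {
    char  *data;  /* null-terminated once non-NULL */
    size_t len;   /* length excluding the terminator */
};

/* Appends s to b. Returns 0 on success, or -1 on allocation
   failure, in which case b is left unchanged. */
int gbuf_append(struct gbuf *b, const char *s) {
    assert(b != NULL && s != NULL);
    size_t add = strlen(s);
    /* The +1 reserves the byte for the null terminator. */
    char *p = realloc(b->data, b->len + add + 1);
    if (p == NULL) {
        return -1;
    }
    /* Copy at a known offset instead of relying on strcat finding a
       terminator in uninitialized memory; add + 1 copies the terminator. */
    memcpy(p + b->len, s, add + 1);
    b->data = p;  /* realloc may have moved the buffer */
    b->len += add;
    return 0;
}
```

Positions inside the buffer are best kept as size_t offsets rather than pointers, so they stay valid when realloc moves the block, and a size_t value can be printed with the %zu conversion instead of being converted to int.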