Possible duplicate:
C ++ string :: find complexity
I recently found out that the
std::string::find
function is an order of magnitude slower than the
std::strstr
- in my environment with GCC 4.7 on Linux. The difference in performance depends on the length of the lines and the hardware architecture.
However, there is a simple reason for the difference: std::string::find
basically calls std::memcmp
in a loop - with O(m * n)
time complexity. In contrast, std::strstr
highly optimized for hardware architecture (e.g. with SSE instructions) and uses a more sophisticated string matching algorithm (apparently Knuth-Morris-Pratt).
I was also surprised if I did not find the time complexity of these two functions in language documents (i.e., draft N3290 and N1570). I just found temporary difficulties for char_traits
. But this does not help, because there is no function to search for a substring in char_traits
.
I would expect std::strstr
and memmem
contain similar optimizations with almost the same performance. And until recently, I assumed that std::string::find
uses memmem
internally.
Questions: Is there a good reason why std::string::find
does not use std::memmem
? And is this different from another implementation?
The question is not: what is the best implementation of this function? In C ++, it's really hard to argue if it's slower than C. I wouldn’t care if both implementations are slow. This is a performance difference that really hurts.
source share