Regular expression performance across languages or libraries

I could not find anything on this subject, so I wonder whether anyone has compared regular expression speed across different languages. I would like to know which language evaluates regular expressions fastest, because my current project has to evaluate a huge number of regular expressions constantly. The choice of language will be based mainly on this performance.

My guess is that C/C++ will naturally be faster, but I want to avoid it if possible, and I'm not sure I'm right. For example, a C# library might call native code through P/Invoke, in which case the speed difference could be negligible. But I don't know which library to choose, or whether I need to write a wrapper around a C++ library (and if so, which one?).

+4
4 answers

What kind of regular expressions? Will they use features such as lookaheads, lookbehinds, backreferences, reluctant quantifiers, atomic groups, possessive quantifiers, and so on?

Other responders have linked to the regex-dna benchmark, but it uses only the most basic features common to all regular expression flavors, such as the Kleene star (*) and alternation (|). So while the GNU C/C++ implementations look like clear winners there, they won't be much use if you need any of the features listed above.
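One way to check this for a candidate engine is to probe it for the features you need before benchmarking anything. Below is a minimal sketch using Python's built-in re module purely as an example engine. The probe patterns and the names FEATURE_PROBES and probe_features are my own illustrations, not part of any benchmark:

```python
import re

# Hypothetical probe patterns, one per feature named in the answer.
# Each entry is (pattern using the feature, sample text it should match).
FEATURE_PROBES = {
    "lookahead": (r"foo(?=bar)", "foobar"),
    "lookbehind": (r"(?<=foo)bar", "foobar"),
    "backreference": (r"(\w)\1", "aa"),
    "reluctant quantifier": (r"<.+?>", "<a><b>"),
    "atomic group": (r"(?>a+)b", "aab"),
    "possessive quantifier": (r"a++b", "aab"),
}


def probe_features(probes=FEATURE_PROBES):
    """Return {feature: bool}: does this engine compile and match each probe?"""
    supported = {}
    for name, (pattern, sample) in probes.items():
        try:
            supported[name] = re.search(pattern, sample) is not None
        except re.error:
            # The engine rejected the syntax, so treat the feature as unavailable.
            supported[name] = False
    return supported


if __name__ == "__main__":
    for name, ok in probe_features().items():
        print(f"{name:22} {'yes' if ok else 'no'}")
```

The result depends on the engine and its version (for instance, older Python releases reject the atomic group and possessive quantifier probes), which is exactly the point: a fast engine that lacks a feature you rely on is not a real candidate.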

Another consideration is Unicode support. If you are dealing with actual text (and not data presented as text, as in the regex-dna test), you should use a regular expression flavor with good Unicode support.
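To illustrate what Unicode support changes in practice, here is a small sketch, again using Python's re module only as an example. The sample text is my own. The same \w+ pattern gives different results depending on whether the engine treats \w as Unicode-aware:

```python
import re

# "naive cafe" with accented letters, written with escapes to avoid encoding issues.
text = "na\u00efve caf\u00e9"

# Default for str patterns in Python 3: \w is Unicode-aware.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w only matches [a-zA-Z0-9_], so accented letters split the words.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(unicode_words)  # two whole words
print(ascii_words)    # three fragments: 'na', 've', 'caf'
```

An engine that only offers the ASCII behaviour will look fast on a benchmark like regex-dna and still give wrong answers on real text.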

I suggest you take a look at C#. The .NET regex flavor does not have a reputation for being slow (which, IMO, is the only sensible thing one can say about regex speed), and for performance-critical applications it offers the option of compiling a regex directly to bytecode, which can significantly improve performance.

+4

There is a regex benchmark here: http://shootout.alioth.debian.org/u64q/benchmark.php?test=regexdna&lang=all&box=1

But the kinds of regular expressions you are going to use can potentially matter a lot more than your choice of engine. Some engines perform better than others on certain kinds of patterns, and some patterns are slow no matter which engine you use (for example, certain regular expressions can take exponential time on a backtracking engine).
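As a rough illustration of the exponential case, here is a sketch that times a failed match of a pattern with nested quantifiers, using Python's re module as an example of a backtracking engine. The pattern and the helper name time_failed_match are my own choices for the demonstration:

```python
import re
import time

# Nested quantifiers: a classic pattern that backtracking engines handle badly
# when the input almost matches.
PATHOLOGICAL = re.compile(r"^(a+)+$")


def time_failed_match(n):
    """Time one failed match against n 'a' characters followed by '!'."""
    text = "a" * n + "!"
    start = time.perf_counter()
    result = PATHOLOGICAL.match(text)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' makes a match impossible
    return elapsed


if __name__ == "__main__":
    for n in (10, 14, 18, 22):
        print(f"n={n:2d}  {time_failed_match(n):.4f} s")
```

On a backtracking engine the time should grow very quickly as n increases, even though the input only gets a few characters longer. Engines that do not backtrack would not show this growth, but they typically do not support features such as backreferences either.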

+3

I suggest evaluating your complex regular expressions in RegexBuddy.
Try them with the languages you want to test. It shows the speed in ms. Believe me, it is a great tool.
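If you would rather measure inside the language itself, a similar per-pattern timing in milliseconds can be sketched with a few lines of code. This is a minimal example in Python using the standard timeit module. The helper name ms_per_search and the sample patterns and text are hypothetical, so substitute your own:

```python
import re
import timeit


def ms_per_search(pattern, text, repeats=1000):
    """Return the average time in milliseconds for one search over text."""
    compiled = re.compile(pattern)  # compile once, outside the timed loop
    total_seconds = timeit.timeit(lambda: compiled.search(text), number=repeats)
    return total_seconds / repeats * 1000.0


if __name__ == "__main__":
    # Hypothetical sample input; use data that resembles your real workload.
    sample_text = "id=12345; name=example; id=67890; " * 100
    for pattern in (r"id=\d+", r"name=(\w+)", r"\bname=\w+;\s*$"):
        print(f"{pattern!r:25} {ms_per_search(pattern, sample_text):.4f} ms")
```

Numbers from a harness like this are only comparable across languages if every implementation uses the same patterns, the same input, and the same policy about compiling the pattern up front.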

0

The choice of language will be mainly based on this performance.

Then your choice may come down to choosing a regular expression engine.

Will your program run on single-core or multi-core machines, and on x86 or x64?

0

Source: https://habr.com/ru/post/1341422/