OutOfMemoryException in regular expression Matches when processing large files

I have an exception log from one of our production releases.

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Text.RegularExpressions.Match..ctor(Regex regex, Int32 capcount, String text, Int32 begpos, Int32 len, Int32 startpos)
   at System.Text.RegularExpressions.RegexRunner.InitMatch()
   at System.Text.RegularExpressions.RegexRunner.Scan(Regex regex, String text, Int32 textbeg, Int32 textend, Int32 textstart, Int32 prevlen, Boolean quick)
   at System.Text.RegularExpressions.Regex.Run(Boolean quick, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
   at System.Text.RegularExpressions.MatchCollection.GetMatch(Int32 i)
   at System.Text.RegularExpressions.MatchEnumerator.MoveNext()

The data being processed is about 800 KB.

It works fine in my local tests. Have you ever seen this behavior, and what could be causing it?

Do I have to split the text before processing it? Obviously, in that case the regular expression might fail to match, because the source file would be split at an arbitrary position.
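On the splitting concern: if the patterns can never match across a line break (an assumption that has to be checked for each pattern), the input can be split at line boundaries rather than at arbitrary offsets, so no match is cut in half. A minimal sketch of that idea follows; the `CountMatches` helper, the sample pattern, and the sample text are hypothetical and used only for illustration.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

class LineByLineDemo
{
    // Hypothetical helper: scans the input one line at a time, so only
    // the current line (not the whole file) is handed to the regex engine.
    static int CountMatches(TextReader reader, Regex regex)
    {
        int count = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Match/NextMatch walks the matches one by one without
            // building a MatchCollection for the line.
            for (Match m = regex.Match(line); m.Success; m = m.NextMatch())
            {
                if (m.Length > 0) count++;   // ignore empty matches
            }
        }
        return count;
    }

    static void Main()
    {
        // Illustrative pattern only; it is not the pattern from the question.
        Regex regex = new Regex(@"/[\w/.]+");

        // A StringReader stands in for a StreamReader over the real file.
        using (TextReader reader = new StringReader("a /x/y.html\nb /z\n"))
        {
            Console.WriteLine(CountMatches(reader, regex));
        }
    }
}
```

This only helps when a line is a safe unit for every pattern involved; a pattern that can span lines would need a different boundary.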

My regular expressions are:

EDIT 2:

I think this particular regex is causing the problem. When I test it in an isolated environment, it eats memory instantly.

((?:( |\.\.|\.|""|'|=)[\/|\?](?:[\w#!:\.\?\+=&@!$'~*,;\/\(\)\[\]\-]|%[0-9a-f]{2})*)( |\.|\.\.|""|'| ))?
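One property of this pattern may be relevant: the whole expression is wrapped in an optional group `( ... )?`, so it can succeed with an empty match at any position where the inner part does not match. Each of those empty matches is still a `Match` object, and the stack trace above shows the exception being thrown in the `Match` constructor while a `MatchCollection` is being enumerated. The sketch below counts empty and non-empty matches on a short sample string. It assumes the `""` in the pattern is an escaped double quote from a C# verbatim string; the sample input is made up.

```csharp
using System;
using System.Text.RegularExpressions;

class EmptyMatchDemo
{
    // The pattern from the question, written as a C# verbatim string
    // (assumption: "" in the original stands for one double quote).
    const string Pattern =
        @"((?:( |\.\.|\.|""|'|=)[\/|\?](?:[\w#!:\.\?\+=&@!$'~*,;\/\(\)\[\]\-]|%[0-9a-f]{2})*)( |\.|\.\.|""|'| ))?";

    static void Main()
    {
        // Made-up sample input, for illustration only.
        string input = "see =/path/to/page.html and more text";
        Regex regex = new Regex(Pattern);

        int total = 0;
        int empty = 0;
        for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
        {
            total++;
            if (m.Length == 0) empty++;   // zero-length match
        }

        Console.WriteLine("input length: {0}", input.Length);
        Console.WriteLine("matches: {0}, of which empty: {1}", total, empty);
    }
}
```

If most matches turn out to be empty, removing the trailing `?` from the outer group (so that only real hits produce a `Match`) would be one thing to try. Whether that alone explains the exception on the production data is not something this sketch can confirm.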

EDIT

. , , .NET Framework, OOM RegEx, ( , , ).

.NET Framework 2.0.

+3

Regex, , , Greedy Lazy.

Regex , Greedy match , Regex 800k, .

.

+2

, , , - .

( ) , ?

CLR Profiler. , , , . , .

+1

Based on your edit, it looks like your own code can create strings that take up large amounts of memory. That would mean that even though the out-of-memory exception is thrown from the Regex code, the Regex itself is not necessarily what consumes the memory. So if using StringBuilder in your own code fixes the problem, then that is what you should do.
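As an illustration of the StringBuilder suggestion, here is a minimal sketch that collects match results into a single buffer instead of concatenating strings in a loop, which would allocate a new, ever larger string on each iteration. The `CollectMatches` helper, the sample pattern, and the sample input are hypothetical; the question does not show the actual code.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

class StringBuilderDemo
{
    // Hypothetical helper: gathers every non-empty match, one per line.
    static string CollectMatches(Regex regex, string input)
    {
        // StringBuilder grows one internal buffer instead of creating
        // a new intermediate string for every appended match.
        StringBuilder sb = new StringBuilder();
        for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
        {
            if (m.Length == 0) continue;   // skip empty matches
            sb.Append(m.Value).Append('\n');
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Illustrative pattern and input only.
        Regex regex = new Regex(@"\d+");
        Console.Write(CollectMatches(regex, "a1 b22 c333"));
    }
}
```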

+1

Source: https://habr.com/ru/post/1706042/

