I am currently iterating over somewhere between 7,000 and 10,000 text definitions, ranging in size from 0 to 5,000 characters, and I want to check whether a given string occurs in any of them. I want to do this for roughly 5,000 different search strings.
In most cases I just need an exact, case-insensitive match, but sometimes a regular expression is required. I was wondering whether it would be faster to use a different search technique when a regular expression is not required.
A stripped-down version of the code looks something like this:
    foreach (string find in stringsiWantToFind)
    {
        // One Regex per search string, reused for every text
        Regex rx = new Regex(find, RegexOptions.IgnoreCase);
        foreach (string s in listOfText)
        {
            if (rx.IsMatch(s))
                find.FoundIn(s);
        }
    }
I did a little reading to see whether I was missing something obvious. There are a number of suggestions to use compiled regular expressions, but I don't see how that would be useful here, given the "dynamic" nature of the regular expressions.
I also read an interesting article on CodeProject, so I'm going to look at using "FastIndexOf" to see how it compares in performance.
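For comparison, the non-regex path I have in mind could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the `TextSearch` class, the `FindMatches` helper, and the `isRegex` flag are hypothetical names introduced for illustration, and the plain branch uses the built-in `string.IndexOf` with `StringComparison.OrdinalIgnoreCase` rather than "FastIndexOf". Ordinal case folding may not agree with `RegexOptions.IgnoreCase` for every culture-specific character, so the two branches are not guaranteed to be strictly equivalent.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TextSearch
{
    // Hypothetical helper: returns every text that contains `find`.
    // `isRegex` is an assumed flag telling us whether `find` is a pattern
    // or a literal string.
    public static List<string> FindMatches(
        string find, bool isRegex, IEnumerable<string> texts)
    {
        var hits = new List<string>();

        if (isRegex)
        {
            // Build the Regex once per search term and reuse it for all texts.
            var rx = new Regex(find, RegexOptions.IgnoreCase);
            foreach (string s in texts)
            {
                if (rx.IsMatch(s))
                    hits.Add(s);
            }
        }
        else
        {
            // Literal, case-insensitive substring search without the regex engine.
            foreach (string s in texts)
            {
                if (s.IndexOf(find, StringComparison.OrdinalIgnoreCase) >= 0)
                    hits.Add(s);
            }
        }

        return hits;
    }
}
```

Calling `TextSearch.FindMatches("dog", false, texts)` takes the `IndexOf` branch, while `TextSearch.FindMatches(@"qu\w+k", true, texts)` takes the `Regex` branch. A side effect of splitting the paths is that literal search strings containing characters such as `.` or `(` are no longer interpreted as regex syntax.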
I was just wondering whether anyone has any advice on this, and on how the performance could be improved?
Thanks.