A quick algorithm to find out if a string contains any string in a given array

I have a list of 50 keywords and about 50,000 lines. For every line, I check whether it contains at least one of the keywords. I'm not interested in which keyword matched or in how many keywords matched. I only need "true" or "false" back, as quickly as possible.

So, is there an algorithm that is much faster than my current LINQ version?

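using System.Collections.Generic;
using System.Linq;

// Extension methods must be declared in a static class.
static class MyEnumerableExtension
{
    public static bool ContainsAny(this string searchString, IEnumerable<string> keywords)
    {
        return keywords.Any(keyword => searchString.Contains(keyword));
    }
}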

bool foundAny = "abcdef".ContainsAny(new string[] { "ac", "bd", "cd" } );

You could also use a Regex that alternates the keywords:

        // requires: using System.Text.RegularExpressions;
        string[] keywords = { "ac", "bd", "cd" };
        string[] tosearch = { "abcdef" };
        // Alternation pattern: "ac|bd|cd"
        string pattern = String.Join("|", keywords);
        Regex regex = new Regex(pattern, RegexOptions.Compiled);
        bool foundAny = regex.IsMatch(String.Join("|", tosearch));

Also note that this only works as long as your keywords do not contain special regular expression characters (and the search strings do not contain a pipe character). Special characters can be handled by escaping them, for example with Regex.Escape, and the search strings should not really be joined the way I did here; it is safer to match each one separately.
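
The two caveats above could be addressed roughly as follows. This is only a sketch, not the original answer's code: the class name KeywordMatcher and the methods Build and MatchLines are hypothetical names made up for illustration. It escapes every keyword with Regex.Escape and tests each line on its own instead of joining the lines with a pipe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class KeywordMatcher
{
    // Builds one alternation pattern. Each keyword is escaped so that
    // characters such as '.', '+' or '|' are matched literally.
    public static Regex Build(IEnumerable<string> keywords)
    {
        string pattern = string.Join("|", keywords.Select(k => Regex.Escape(k)));
        return new Regex(pattern, RegexOptions.Compiled);
    }

    // Tests every line separately, so a pipe inside a line is harmless.
    public static bool[] MatchLines(Regex regex, IEnumerable<string> lines)
    {
        return lines.Select(line => regex.IsMatch(line)).ToArray();
    }
}
```

For example, with the keywords { "a.c", "bd" }, the line "abcdef" does not match because the dot is treated literally, while "xa.cy" and "abd" do. Build the Regex once and reuse it for all 50,000 lines; whether it actually beats the LINQ version is something to measure on your own data. Note that an empty keyword list produces an empty pattern, which matches every line.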


Source: https://habr.com/ru/post/1775640/

