How to compare a string against a filter list in LINQ?

I am trying to filter a collection of strings using a "filter" list, i.e. a list of bad words. If a string contains a word from the list, I do not want it.

This is how far I have gotten; the bad word here is "frakk":

    string[] filter = { "bad", "words", "frakk" };
    string[] foo =
    {
        "this is a lol string that is allowed",
        "this is another lol frakk string that is not allowed!"
    };
    var items = from item in foo
                where (item.IndexOf((from f in filter select f).ToString()) == 0)
                select item;

But that doesn't work, why?

3 answers

You can use Any + Contains:

 var items = foo.Where(s => !filter.Any(w => s.Contains(w))); 

If you want a case-insensitive comparison:

 var items = foo.Where(s => !filter.Any(w => s.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0)); 

Update: If you want to exclude sentences in which at least one whole word is in the filter list, you can use String.Split() and Enumerable.Intersect:

 var items = foo.Where(sentence => !sentence.Split().Intersect(filter).Any()); 

Enumerable.Intersect is very efficient since it uses a set under the hood. It is more efficient to put the longer sequence first, because the set is built from the second sequence. Thanks to LINQ's deferred execution, Any() stops at the first matching word.

(Note that the parameterless Split() splits on all whitespace characters, including tab and newline.)
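The difference between the substring approach (Any + Contains) and the whole-word approach (Split + Intersect) can be seen in a minimal sketch. The third test string is a hypothetical addition, not part of the question; it contains "bad" only as part of "badge".

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
    "this is a lol string that is allowed",
    "this is another lol frakk string that is not allowed!",
    "nobody wears a badge here" // hypothetical extra case: "bad" appears only inside "badge"
};

// Substring match: rejects a string if a filter word occurs anywhere in it.
var bySubstring = foo.Where(s => !filter.Any(w => s.Contains(w))).ToArray();

// Whole-word match: rejects a string only if a whitespace-separated token equals a filter word.
var byWord = foo.Where(s => !s.Split().Intersect(filter).Any()).ToArray();

Console.WriteLine(bySubstring.Length); // 1: the "badge" sentence is rejected as well
Console.WriteLine(byWord.Length);      // 2: the "badge" sentence is kept
```

Which behavior is preferable depends on whether partial matches such as "badge" should count as hits.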


The first problem you need to solve is breaking a sentence up into a series of words. The easiest way to do this is to split on spaces.

 string[] words = sentence.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries); 

From there, you can use a simple LINQ expression to find the profanities:

 var badWords = words.Where(x => filter.Contains(x)); 

However, this is a somewhat primitive solution. It will not handle several more complex cases that you probably need to think about:

  • There are many characters that qualify as whitespace. My solution uses only ' '.
  • Splitting does not handle punctuation, so dog! will not be treated as dog. It is probably much better to break up words based on the characters that are legal inside a word.
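One way to address both points is to extract words with a regular expression instead of splitting on a single space. The following is only a sketch under assumptions: the helper name Tokenize and the choice of "legal" characters (letters, digits, apostrophes) are hypothetical, and the case-insensitive HashSet is an addition not present in the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Case-insensitive set of filter words (assumption: case should not matter).
var filter = new HashSet<string>(new[] { "bad", "words", "frakk" }, StringComparer.OrdinalIgnoreCase);

// Hypothetical helper: treats runs of letters, digits and apostrophes as words,
// so any whitespace and punctuation act as separators.
static IEnumerable<string> Tokenize(string sentence) =>
    Regex.Matches(sentence, @"[\p{L}\p{Nd}']+").Cast<Match>().Select(m => m.Value);

string[] foo =
{
    "this is a lol string that is allowed",
    "this is another lol Frakk! string that is not allowed",
    "tabs\tand\nnewlines, then frakk."
};

// Keep only the sentences in which no token is a filter word.
var allowed = foo.Where(s => !Tokenize(s).Any(w => filter.Contains(w))).ToArray();

Console.WriteLine(allowed.Length); // 1: only the first sentence passes
```

With this tokenization, "Frakk!" and "frakk." are both recognized as the filter word, which a plain Split(' ') would miss.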

The reason your initial attempt did not work is that this expression:

 (from f in filter select f).ToString() 

evaluates to the type name of the array iterator that the LINQ query expression creates implicitly. So you are actually searching your phrases for the characters of the following string:

System.Linq.Enumerable+WhereSelectArrayIterator`2[System.String,System.String]

and not for the filter words.
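A short sketch makes this visible and shows one possible correction in query syntax. The exact iterator type name depends on the .NET runtime version, so the sketch prints it without assuming a specific value. Note also that comparing IndexOf(...) with 0 would only match a word at the very start of a string; the sketch uses >= 0 instead, which is an assumption about the intended behavior.

```csharp
using System;
using System.Linq;

string[] filter = { "bad", "words", "frakk" };
string[] foo =
{
    "this is a lol string that is allowed",
    "this is another lol frakk string that is not allowed!"
};

// ToString() on the query yields the iterator's type name, not the filter words.
string queryText = (from f in filter select f).ToString();
Console.WriteLine(queryText); // a type name under System.Linq.Enumerable; varies by runtime

// One possible correction: keep an item only if no filter word occurs anywhere in it.
var items = from item in foo
            where !filter.Any(f => item.IndexOf(f, StringComparison.Ordinal) >= 0)
            select item;

foreach (var item in items)
    Console.WriteLine(item); // prints only the first string
```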


Source: https://habr.com/ru/post/950397/

