Search for dictionary keys contained in an array of strings

I have a list of strings where each element is free text describing a skill, so it looks like this:

    List<string> list = new List<string>
    {
        "very good right now", "pretty good", "convinced me that is good",
        "pretty medium", "just medium" .....
    };

I want to keep a score for the user based on these free texts. At the moment I am using if conditions:

    foreach (var item in list)
    {
        if (item.Contains("good"))
        {
            score += 2.5;
            Console.WriteLine("good skill, score+= 2.5, is now {0}", score);
        }
        else if (item.Contains("low"))
        {
            score += 1.0;
            Console.WriteLine("low skill, score+= 1.0, is now {0}", score);
        }
    }

Suppose I want to use a dictionary to map keywords to scores instead, for example:

 Dictionary<string, double> dic = new Dictionary<string, double> { { "good", 2.5 }, { "low", 1.0 }}; 

What would be a good way to match the dictionary keys against the list of strings? Right now I see it as a nested loop:

    foreach (var item in list)
    {
        foreach (var key in dic.Keys)
        {
            if (item.Contains(key))
                score += dic[key];
        }
    }

But I'm sure there are better ways: faster, or at least more pleasing to the eye (LINQ).

Thanks.

+5
4 answers
    var scores = from item in list
                 from word in item.Split()
                 join kvp in dic on word equals kvp.Key
                 select kvp.Value;
    var totalScore = scores.Sum();

Note: your current solution checks whether an item in the list contains a dictionary key as a substring. That check also returns true when the key is merely part of a longer word in the item. For instance, "follow the rabbit" contains "low". Splitting the item into words solves this problem.

LINQ's join also builds a hash-based lookup internally to match elements of the first sequence against the second. This gives you an O(1) lookup per word instead of an O(N) scan over all dictionary entries.
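The difference between substring matching and whole-word matching can be sketched as follows. This is a minimal illustration, not the answerer's code: the class and method names (`WordMatchDemo`, `ScoreBySubstring`, `ScoreByWord`) are hypothetical, and the word-based version uses `Dictionary.TryGetValue` directly rather than a LINQ join, which gives the same kind of hashed lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordMatchDemo
{
    // Substring matching: a key counts if it occurs anywhere in the line,
    // even inside a longer word.
    public static double ScoreBySubstring(List<string> lines, Dictionary<string, double> weights)
    {
        return lines.Sum(line => weights.Where(w => line.Contains(w.Key)).Sum(w => w.Value));
    }

    // Whole-word matching: split each line on whitespace and look every
    // word up by key (a hashed lookup, no scan over the dictionary).
    public static double ScoreByWord(List<string> lines, Dictionary<string, double> weights)
    {
        double total = 0.0;
        foreach (var line in lines)
        {
            foreach (var word in line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
            {
                if (weights.TryGetValue(word, out var value))
                    total += value;
            }
        }
        return total;
    }

    public static void Main()
    {
        var weights = new Dictionary<string, double> { { "good", 2.5 }, { "low", 1.0 } };
        var lines = new List<string> { "follow the rabbit", "pretty good" };

        // The substring version also counts "low" inside "follow".
        Console.WriteLine(ScoreBySubstring(lines, weights));
        // The word version only counts the standalone word "good".
        Console.WriteLine(ScoreByWord(lines, weights));
    }
}
```

Two caveats with the word-based version: a key that appears twice in one line is counted twice, and a word with punctuation attached (such as "good,") will not match unless the text is cleaned up first.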

+2

If your code finds N skill lines containing the word "good", it adds the score 2.5 N times.

So you can simply count the skill lines containing each dictionary key and multiply that count by the corresponding score.

    var scores = from pair in dic
                 let word = pair.Key
                 let score = pair.Value
                 let count = list.Count(x => x.Contains(word))
                 select score * count;
    var totalScore = scores.Sum();
+2

It isn't faster, but you can use LINQ:

    score = list.Select(s => dic.Where(d => s.Contains(d.Key))
                                .Sum(d => d.Value))
                .Sum();

Note that your nested loop adds the score for every key that matches a line, so a line containing two different keys is scored for both. I preserved that behavior in my solution.
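To make that distinction concrete, here is a small sketch comparing "score every matching key" (the nested loop) with "score only the first matching key" (the `if` / `else if` chain in the question). The names `MatchModeDemo`, `ScoreAllMatches`, and `ScoreFirstMatch` are made up for illustration, and an array of pairs is used instead of a dictionary so that "first" has a well-defined order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MatchModeDemo
{
    // Every key found in the line contributes its value.
    public static double ScoreAllMatches(
        IEnumerable<string> lines, IEnumerable<KeyValuePair<string, double>> weights)
    {
        return lines.Sum(line => weights.Where(w => line.Contains(w.Key)).Sum(w => w.Value));
    }

    // Only the first key found in the line contributes; a line with no
    // matching key contributes 0.0 (the default of double).
    public static double ScoreFirstMatch(
        IEnumerable<string> lines, IEnumerable<KeyValuePair<string, double>> weights)
    {
        return lines.Sum(line => weights.Where(w => line.Contains(w.Key))
                                        .Select(w => w.Value)
                                        .FirstOrDefault());
    }

    public static void Main()
    {
        var weights = new[]
        {
            new KeyValuePair<string, double>("good", 2.5),
            new KeyValuePair<string, double>("low", 1.0),
        };
        var lines = new List<string> { "good memory but low stamina", "pretty medium" };

        // The first line matches both keys, so the two modes disagree on it.
        Console.WriteLine(ScoreAllMatches(lines, weights));
        Console.WriteLine(ScoreFirstMatch(lines, weights));
    }
}
```

Which mode is right depends on what the score is meant to express, so it is worth deciding explicitly before switching from the `if` chain to a loop or a LINQ query.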

+1

Well, you aren't really using the dictionary as a dictionary, so we can simplify things a bit with a new class:

    class TermValue
    {
        public string Term { get; set; }
        public double Value { get; set; }

        public TermValue(string t, double v)
        {
            Term = t;
            Value = v;
        }
    }

With this, we can be a little more direct:

    void Main()
    {
        var dic = new TermValue[]
        {
            new TermValue("good", 2.5),
            new TermValue("low", 1.0)
        };
        List<string> list = new List<string>
        {
            "very good right now", "pretty good", "convinced me that is good",
            "pretty medium", "just medium"
        };

        double score = 0.0;
        foreach (var item in list)
        {
            var entry = dic.FirstOrDefault(d => item.Contains(d.Term));
            if (entry != null)
                score += entry.Value;
        }
    }

From here we can play around a bit (the compiled code for this will probably be the same as above):

    double score = 0.0;
    foreach (var item in list)
    {
        score += dic.FirstOrDefault(d => item.Contains(d.Term))?.Value ?? 0.0;
    }

Then (in the word Violet) we can go crazy:

    double score = list.Aggregate(0.0, (scre, item) =>
        scre + (dic.FirstOrDefault(d => item.Contains(d.Term))?.Value ?? 0.0));
0

Source: https://habr.com/ru/post/1268368/

