Items Common to Most Lists

Given a list of lists (say, 5 lists in order to have a real number to work with), I can easily find items that are common to all 5 lists (see Crossing multiple lists with IEnumerable.Intersect () ) using the following code:

var list1 = new List<int>() { 1, 2, 3 }; var list2 = new List<int>() { 2, 3, 4 }; var list3 = new List<int>() { 3, 4, 5 }; var listOfLists = new List<List<int>>() { list1, list2, list3 }; var intersection = listOfLists.Aggregate((previousList, nextList) => previousList.Intersect(nextList).ToList()); 

Now let's say that intersection ends with 0 elements. It is possible that there are some objects that are common to 4/5 lists. How can I find them in the most efficient way?

I know that I can just run through all the combinations from 4 lists and save all the results, but this method does not scale very well (this, ultimately, will need to be done in about 40 lists).

If no items are common to 4 lists, the repeated search will search for items common to 3/5 lists, etc. Visually, this can be represented by lists of grid points, and we are looking for points that have the greatest match.

Any ideas?

EDIT: Maybe it would be better to look at each point and track how many times it appears in each list, and then create a list of points with the highest result?

+4
source share
2 answers

You can select all numbers (points) from all lists and group them by value. Then, sort the result by the size of the group (i.e., Count the number where the dot is present), and select the most common element:

 var mostCommon = listOfLists.SelectMany(l => l) .GroupBy(i => i) .OrderByDescending(g => g.Count()) .Select(g => g.Key) .First(); // outputs 3 

Instead of accepting only the first element, you can take a few top elements by replacing First() with Take(N) .


Return items with the number of lists (sorted by the number of lists):

 var mostCommonItems = from l in listOfLists from i in l group i by i into g orderby g.Count() descending select new { Item = g.Key, NumberOfLists = g.Count() }; 

Usage (element is a strongly typed anonymous object):

 var topItem = mostCommonItems.First(); var item = topItem.Item; var listsCount = topItem.NumberOfLists; foreach(var item in mostCommonItems.Take(3)) // iterate over top three items 
+6
source

First you can combine all the lists, and then find the list mode using the dictionary strategy as follows. This is pretty fast:

 /// <summary> /// Gets the element that occurs most frequently in the collection. /// </summary> /// <param name="list"></param> /// <returns>Returns the element that occurs most frequently in the collection. /// If all elements occur an equal number of times, a random element in /// the collection will be returned.</returns> public static T Mode<T>(this IEnumerable<T> list) { // Initialize the return value T mode = default(T); // Test for a null reference and an empty list if (list != null && list.Count() > 0) { // Store the number of occurences for each element Dictionary<T, int> counts = new Dictionary<T, int>(); // Add one to the count for the occurence of a character foreach (T element in list) { if (counts.ContainsKey(element)) counts[element]++; else counts.Add(element, 1); } // Loop through the counts of each element and find the // element that occurred most often int max = 0; foreach (KeyValuePair<T, int> count in counts) { if (count.Value > max) { // Update the mode mode = count.Key; max = count.Value; } } } return mode; } 
+1
source

Source: https://habr.com/ru/post/1491912/


All Articles