Where can I use the technique from the majority vote algorithm

As can be seen from the answers to the linear-time majority algorithm question, you can find the majority element of an array in linear time and log(n) space.
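
For reference, here is a minimal sketch of that algorithm in Python (my own paraphrase; the verification pass at the end only matters when a majority element is not guaranteed to exist):

    def majority_vote(arr):
        # Boyer-Moore majority vote: one candidate plus a counter of its
        # "spare" (so far unmatched) occurrences.
        candidate, count = None, 0
        for x in arr:
            if count == 0:
                candidate, count = x, 1
            elif x == candidate:
                count += 1
            else:
                count -= 1
        # Verification pass: only needed when a majority is not guaranteed to exist.
        if candidate is not None and arr.count(candidate) > len(arr) // 2:
            return candidate
        return None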

Everyone who sees this algorithm seems to agree that it is a cool technique. But does the idea generalize to new algorithms?

It seems that the hidden strength of this algorithm is that the counter plays a subtle role, something like "(count of the majority element so far) - (count of the second most frequent element so far)". Are there other algorithms based on the same idea?

+6
3 answers

Umh, let's first try to understand why the algorithm works, in order to "isolate" the ideas in it.

The point of the algorithm is that if you have a majority element, then you can match each occurrence of it with a "different" element, and still have some "spare" occurrences left over.

So, we keep a counter of the number of "spare" occurrences of our current guess. If it reaches 0, then the guess is not a majority element of the subsequence that starts where we "picked" it and ends at the "current" position. Moreover, since every occurrence of our "guess" element in that subsequence has been matched with some other element, no element at all can be a majority element of that subsequence.

Now, since:

  • our algorithm is only required to give a correct answer when a majority element exists, and
  • if a majority element of the whole sequence exists, it is still a majority element of what remains after we discard the "current" subsequence whenever the counter drops to zero,

it is clear that, if a majority element exists, there is a suffix of the whole sequence over which the counter never drops to zero, and the candidate chosen on that suffix is the majority element.

Now: what is the idea here that we can reuse in new O(1)-space, O(n)-time algorithms?

For me, you can apply this technique whenever you need to compute a property P of a sequence of elements such that:

  • P can be updated in O(1) time when going from seq[n, m] to seq[n, m+1], as long as Q(seq[n, m+1]) does not hold, and
  • P(seq[n, m]) can be computed in O(1) time and space from P(seq[n, j]) and P(seq[j, m]) whenever Q(seq[n, j]) holds.

In our case, P is the number of "spare" occurrences of our "picked" majority candidate, and Q is "P is zero".
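
To make this concrete, here is a generic "scan with reset" skeleton (my own sketch; all names are made up) that captures the two conditions above, with the majority vote as an instance:

    def scan_with_reset(seq, init, update, q):
        # Generic skeleton: maintain P for the subsequence that starts at the
        # last "reset" point, and discard that prefix whenever Q holds.
        state = init()
        for x in seq:
            state = update(state, x)      # O(1) update from seq[j, m] to seq[j, m+1]
            if q(state):                  # Q(seq[j, m+1]) holds: prefix can be discarded
                state = init()
        return state

    # Majority vote as an instance: P = (candidate, spare occurrences), Q = "spares == 0".
    def majority_candidate(seq):
        def init():
            return (None, 0)

        def update(state, x):
            candidate, spare = state
            if spare == 0:
                return (x, 1)
            return (candidate, spare + 1) if x == candidate else (candidate, spare - 1)

        def q(state):
            return state[1] == 0

        return scan_with_reset(seq, init, update, q)[0]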

If you look at it this way, the longest common subsequence uses the same idea (I don't know about its "coolness factor" ;))

+6

Jayadev Misra and David Gries have a paper called "Finding Repeated Elements" (ACM page), which generalizes this to elements that repeat more than n/k times (k = 2 is the majority problem).

Of course, this is very similar to the original problem, and you are probably looking for more "different" algorithms.

Here is an example that may be different.

Give an algorithm that determines whether a string of parentheses ('(' and ')') is well-formed.

I believe the standard solution is to maintain a counter.
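
A minimal counter-based check might look like this (my own sketch):

    def balanced_parens(s):
        depth = 0                    # number of '(' opened but not yet closed
        for c in s:
            if c == '(':
                depth += 1
            elif c == ')':
                depth -= 1
                if depth < 0:        # a ')' with no matching '(' before it
                    return False
        return depth == 0            # every '(' must have been closed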

Side note:

For answers that claim this cannot be done in constant space etc., ask them about the model of computation. For example, in the word-RAM model, integers and array indices are usually assumed to take O(1) space.

Many people mix and match models incorrectly: for example, they will happily count an input array of n integers as O(n) space and an array index as O(1), but then insist that a counter is Omega(log n), which is inconsistent. If they want to measure sizes in bits, then the input itself is already Omega(n log n) bits, and so on.

+4

For people who want to understand what this algorithm does and why it works: see my detailed answer.

Here I will describe a natural extension (or generalization) of this algorithm. In the standard majority-vote algorithm you have to find an element that appears in the stream more than n/2 times, where n is the size of the stream. You can do this in O(n) time (with a tiny constant) and in O(log(n)) space in the worst case.


The generalized algorithm allows you to find up to k elements, each of which appears in the original stream more than n/(k+1) times. Note that for k = 1 you get the original problem.

The solution to this problem is really similar to the original one, except that instead of one counter and one candidate element you maintain k counters and k candidate elements. The logic works the same way: you iterate over the array, and if the element is among the candidates, you increase its counter; if one of the counter slots is free, you make the element a new candidate; otherwise you simply decrease all the counters.

As with the original majority-vote algorithm, this only yields candidates: unless you have a guarantee that such frequent elements exist, you have to make another pass through the array to verify that the previously found candidates are correct. Here is my Python attempt (it has not gone through rigorous testing).

    from collections import defaultdict

    def majority_element_general(arr, k=1):
        # Fill the first k candidate slots from the beginning of the array.
        counter, i = defaultdict(int), 0
        while len(counter) < k and i < len(arr):
            counter[arr[i]] += 1
            i += 1
        for x in arr[i:]:
            if x in counter:
                counter[x] += 1            # one of the current candidates: increment
            elif len(counter) < k:
                counter[x] = 1             # a free slot: make it a new candidate
            else:
                # no free slot: decrement every counter, dropping those that hit zero
                fields_to_remove = []
                for el in counter:
                    if counter[el] > 1:
                        counter[el] -= 1
                    else:
                        fields_to_remove.append(el)
                for el in fields_to_remove:
                    del counter[el]
        potential_elements = counter.keys()
        # might want to check that they are really frequent.
        return potential_elements
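
For example, a quick usage sketch of the function above (my own example):

    arr = [1, 2, 1, 2, 3, 1, 2, 1, 2, 4]
    # 1 and 2 each appear 4 times out of 10, i.e. more than 10 / (2 + 1) times.
    print(majority_element_general(arr, k=2))   # dict_keys([1, 2])
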
+2
source
