There is no general way to prevent arbitrary functions from duplicating. Of course, you can always filter out duplicates, but you do not want this for very good reasons. Therefore, you need a special way to create only non-duplicates.
One way would be to generate permutations in increasing lexicographical order. Then you can simply compare if the “new” permutation is the same as the last, and then skip it. It gets even better: the algorithm for generating permutations to increase the lexicographical order specified in http://en.wikipedia.org/wiki/Permutations#Generation_in_lexicographic_order does not even create duplicates at all!
However , this is not the answer to your question, as it is a different algorithm (albeit based on the exchange).
So let's take a closer look at your algorithm. One of the key observations:
- Once the character changes to
begin , it will remain there for all nested permute calls.
We combine this with the following general observation of permutations:
- If you rearrange the string
s , but only in those places where there is the same character, s will remain unchanged. In fact, all duplicate permutations have a different order for the occurrence of some character c, where c occurs at the same positions.
OK, so all we need to do is make sure that the occurrences of each character are always in the same order as at the beginning. The code follows, but ... I really do not speak C ++, so I will use Python and hope to leave with a statement about its pseudocode.
Let's start with your original pseudo-code rewritten algorithm:
def permute(s, begin, end): if end == begin + 1: print(s) else: for i in range(begin, end): s[begin], s[i] = s[i], s[begin] permute(s, begin + 1, end) s[begin], s[i] = s[i], s[begin]
and an auxiliary function that makes it easier to call:
def permutations_w_duplicates(s): permute(list(s), 0, len(s))
Now we extend the permutation function with some bookkeeping about how many times a certain character has been replaced by the begin position (i.e. it has been fixed), and we also remember the original order of occurrence of each character ( char_number ). Each character that we are trying to swap to the begin position must then be next higher in the original order, that is, the number of corrections for a symbol determines which original origin of this symbol can be corrected as follows: I call it next_fixable .
def permute2(s, next_fixable, char_number, begin, end): if end == begin + 1: print(s) else: for i in range(begin, end): if next_fixable[s[i]] == char_number[i]: next_fixable[s[i]] += 1 char_number[begin], char_number[i] = char_number[i], char_number[begin] s[begin], s[i] = s[i], s[begin] permute2(s, next_fixable, char_number, begin + 1, end) s[begin], s[i] = s[i], s[begin] char_number[begin], char_number[i] = char_number[i], char_number[begin] next_fixable[s[i]] -= 1
Again, we use a helper function:
def permutations_wo_duplicates(s): alphabet = set(s) next_fixable = dict.fromkeys(alphabet, 0) count = dict.fromkeys(alphabet, 0) char_number = [0] * len(s) for i, c in enumerate(s): char_number[i] = count[c] count[c] += 1 permute2(list(s), next_fixable, char_number, 0, len(s))
What is it!
Nearly. You can stop here and rewrite to C ++ if you want, but if you are interested in some test data, read on.
I used slightly different code for testing because I did not want to print all permutations. In Python, you replace print with yield , turning the function into a generator function, the result of which can be iterated over with a for loop, and permutations will be calculated only as needed. This is the real code and test I used:
def permute2(s, next_fixable, char_number, begin, end): if end == begin + 1: yield "".join(s)
And the result:
FOOQUUXFOO has 3628800 permutations (counting duplicates) permutations of these that are unique: 37800 FOOQUUXFOO has 37800 unique permutations (directly computed) The first 10 permutations : ['FOOQUUXFOO', 'FOOQUUXFOO', 'FOOQUUXOFO', 'FOOQUUXOOF', 'FOOQUUXOOF', 'FOOQUUXOFO', 'FOOQUUFXOO', 'FOOQUUFXOO', 'FOOQUUFOXO', 'FOOQUUFOOX'] The first 10 unique permutations: ['FOOQUUXFOO', 'FOOQUUXOFO', 'FOOQUUXOOF', 'FOOQUUFXOO', 'FOOQUUFOXO', 'FOOQUUFOOX', 'FOOQUUOFXO', 'FOOQUUOFOX', 'FOOQUUOXFO', 'FOOQUUOXOF']
Note that permutations are calculated in the same order as the original algorithm, without duplicates. Please note that 37800 * 2! * 2! * 4! = 3628800, as you would expect.