Create lists with one common item

Question

Create lists with one common item

In Python, I reflect on an idea, but I'm not quite sure how to implement it correctly.

I have a pool of 26 letters ('A' - 'Z'), and each letter can be used as many times as necessary. I want to create lists using these letters; each list will contain 10 letters without repetitions inside them, and I want to guarantee that if I compare any generated two lists, there will be exactly one common letter.

Questions:

Is there an easy way to do this (i.e. using library functions)?
Is it possible to calculate the maximum number of lists that I can create, given the size of the pool (pool_size) and the length of the list (list_length)?

It would be helpful to evaluate any pointers to relevant material; I don’t need an implementation like somewhere to lay my archimedean lever (to be read: I need a foundation before I can build my idea).

"Give me a place to stand, and I will move the earth." - Archimedes

UPDATE . How naive I am to think that the alphabet was enough for the job. Allow a pool of up to 300 characters, but keep lists of length 10. Does this work?

+4

python

taserian Oct 24 '12 at 18:50

source share

2 answers

Mark byers · Answer 1 · 2012-10-24T18:54:08+0000

Only 26 letters to choose from it, you can create only two lists.

Select one letter at random and place it on both lists. Then select 18 different letters and put nine in random order in each list. Then your lists will look something like this:

ABCDEFGHIJ AKLMNOPQRS

If you add a third list, it is not possible to satisfy your restrictions, because there are only seven unused letters. The third list will have to have at least two letters with one of the other lists that you prohibit.

Update

This only partially answers your updated question, but I will post it anyway, as this can help you or other people find the best solution.

In general, with characters n and lists of length x you can easily generate lists of at least floor((n-1)/(x-1)) using the algorithm described above (selecting 1 letter and adding it to all lists). Thus, for 300 characters and lists of length 10, which give 33 lists.

But this can be improved using a different algorithm. For example, if n is 10 and x is 4, the above algorithm gives only three lists:

 ABCD AEFG AHIJ

But an algorithm that uses letters more efficiently can generate five lists:

 ABCD AEFG BEHI CFHJ DGIJ

I generated these lists using a greedy algorithm: for each new list, reuse as many different letters from previous lists as possible, which means that you add as few new letters as possible.

The second list reuses one letter from the first list and adds three new letters. The third list reuses a different letter from each of the first two lists and therefore only introduces two new letters. The fourth list repeats the three letters that were earlier, and adds another new letter. The final list can now reuse the letter from each of the previous lists and there is no need to add new letters.

Update 2

A greedy algorithm is definitely not an optimal solution.

Let's try: n = 26, x = 2

A simple solution gives the optimal 25 lists:

 AB AC AD .. AZ

However, the greedy algorithm generates only 3 lists:

 AB AC BC

Now it is impossible to add more lists without breaking one of the rules.

sarwar · Answer 2 · 2012-10-24T20:42:55+0000

This was a common solution that I found for all the Set and Line Length values. The first assumes that you do not need two solutions for sharing the same element, but you want each solution to have one common element with any other solution. Given the infinite pool for form selection, the total number of solutions is limited by the length of each solution.

 SET_LENGTH = 10 CHOICE_LENGTH = 300 data = set(range(CHOICE_LENGTH)) solutions =[] solution_sets = [] used = set() while True: new_solution = [] #Try to get unique values from each previous set try: for sol_set in solution_sets: while True: candidate = sol_set.pop() if not candidate in used: new_solution.append(candidate) used.update([candidate]) break except KeyError, e: print e break #Fill with new data until the line is long enough try: while len(new_solution) < SET_LENGTH: new_solution.append(data.pop()) except KeyError, e: print e break solutions.append(new_solution) solution_sets.append(set(new_solution)) #Show the results for solution in solutions: print solution print "Orphans %s" % len(data)

For example, n = 300 x = 10 gives:

 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [0, 10, 11, 12, 13, 14, 15, 16, 17, 18] [1, 10, 19, 20, 21, 22, 23, 24, 25, 26] [2, 11, 19, 27, 28, 29, 30, 31, 32, 33] [3, 12, 20, 32, 34, 35, 36, 37, 38, 39] [4, 13, 21, 33, 34, 40, 41, 42, 43, 44] [5, 14, 22, 27, 36, 40, 45, 46, 47, 48] [6, 15, 23, 28, 37, 41, 45, 49, 50, 51] [7, 16, 24, 29, 38, 42, 47, 49, 52, 53] [8, 17, 25, 30, 39, 43, 48, 50, 52, 54] [9, 18, 26, 31, 35, 44, 46, 51, 53, 54] Orphans 245

If you don't care how many solutions use the same common element, then this is even simpler:

 SET_LENGTH = 2 CHOICE_LENGTH = 300 data = set(range(CHOICE_LENGTH)) solutions =[] alpha = data.pop() while True: new_solution = [alpha] try: [new_solution.append(data.pop()) for x in range(SET_LENGTH-1)] except KeyError, e: break solutions.append(new_solution) for solution in solutions: print solution print "Solutions: %s" % len(solutions) print "Orphans: %s" % len(data)

Create lists with one common item

More articles: