Python script to calculate added combinations from a dictionary

Question

I am trying to write a script that takes a dictionary of items, each of which contains properties with values from 0 to 10, and adds up various items to find which combinations of items reach the desired totals. The script must also use only one item per "slot".

For instance:

    item_list = {
        'item_1': {'slot': 'top', 'prop_a':  2, 'prop_b': 0, 'prop_c':  2, 'prop_d':  1},
        'item_2': {'slot': 'top', 'prop_a':  5, 'prop_b': 0, 'prop_c':  1, 'prop_d': -1},
        'item_3': {'slot': 'top', 'prop_a':  2, 'prop_b': 5, 'prop_c':  2, 'prop_d': -2},
        'item_4': {'slot': 'mid', 'prop_a':  5, 'prop_b': 5, 'prop_c': -5, 'prop_d':  0},
        'item_5': {'slot': 'mid', 'prop_a': 10, 'prop_b': 0, 'prop_c': -5, 'prop_d':  0},
        'item_6': {'slot': 'mid', 'prop_a': -5, 'prop_b': 2, 'prop_c':  3, 'prop_d':  5},
        'item_7': {'slot': 'bot', 'prop_a':  1, 'prop_b': 3, 'prop_c': -4, 'prop_d':  4},
        'item_8': {'slot': 'bot', 'prop_a':  2, 'prop_b': 2, 'prop_c':  0, 'prop_d':  0},
        'item_9': {'slot': 'bot', 'prop_a':  3, 'prop_b': 1, 'prop_c':  4, 'prop_d': -4},
    }

The script should then find which combinations of items from the "item_list" dict, using 1 item per "slot", achieve the desired result when added together.

For example, if the desired result is "prop_a": 3, "prop_b": 3, "prop_c": 8, "prop_d": 0, the script would select "item_2", "item_6", and "item_9", along with any other combination that works.

    'item_2': {'slot': 'top', 'prop_a':  5, 'prop_b': 0, 'prop_c': 1, 'prop_d': -1}
    'item_6': {'slot': 'mid', 'prop_a': -5, 'prop_b': 2, 'prop_c': 3, 'prop_d':  5}
    'item_9': {'slot': 'bot', 'prop_a':  3, 'prop_b': 1, 'prop_c': 4, 'prop_d': -4}
    'total':                  'prop_a':  3, 'prop_b': 3, 'prop_c': 8, 'prop_d':  0

Any ideas how to do this? It doesn't have to be in Python, or even a complete script; an explanation of how to do this in theory would be enough for me. I tried looping through every combination, but that quickly gets out of hand and becomes unmanageable. The actual script will have to do this for approximately 1000 items, using 20 different slots, each item having 8 properties.

Thanks for the help!

python language-agnostic algorithm combinations combinatorics

user325441 Apr 25 '10 at 16:24

4 answers

Alex Martelli · Answer 1 · 2010-04-25T17:34:18+0000

Since the properties can have both positive and negative values, and you need all satisfactory combinations, I believe there is no "significant" optimization possible, that is, no polynomial-time solution (assuming P != NP... ;-). All solutions come down to enumerating all the one-per-slot combinations and checking the final results, with very minor tweaks that may save you a few percent of the effort here or there, but nothing really big.

If you have 1000 items in 20 possible slots, say evenly distributed at about 50 items per slot, there are about 50**20 possibilities in total, i.e. 9536743164062500000000000000000000 - about 10**34 (myriad billions of billions of billions...). You cannot, in general, prune any subtree from the "search for all solutions", because no matter what the property values are, once you have a hypothetical choice for the first 20-p slots, there may still be a choice for the remaining p that satisfies the constraint (or more than one).
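The size estimate above is easy to reproduce. The following is only a back-of-the-envelope sketch; the rate of one billion combinations checked per second is an assumption made purely for illustration, not a figure from the answer.

```python
# Size of the search space for 1000 items spread evenly over 20 slots.
items_total = 1000
slots = 20
items_per_slot = items_total // slots      # 50 items in each slot
combinations = items_per_slot ** slots     # one item chosen per slot

# Hypothetical throughput, assumed only to give a feel for the scale.
checks_per_second = 10 ** 9
seconds_per_year = 60 * 60 * 24 * 365
years_needed = combinations // (checks_per_second * seconds_per_year)

print(combinations)
print(f"{combinations:.2e} combinations, roughly {years_needed:.1e} years")
```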

If you could find an exact polynomial-time solution for this NP-complete problem, you would basically revolutionize modern mathematics and computer science - Turing Awards and Fields Medals would be just the start of the accolades that followed. This is not very likely.

To get to a feasible problem, you will have to relax your requirements in some way (accept finding only a subset of the solutions, accept a probabilistic rather than deterministic approach, accept approximate solutions, ...).

Once you do this, some small optimizations may make sense. For example, start by adding constants (equal to one more than the magnitude of the most negative value of each property) to all property values, shifting the targets correspondingly (by that constant times the number of slots), so that every property value is > 0. Now you can sort the slots by (for example) the value of a certain property, or the sum of all properties, and do some pruning based on the knowledge that adding one more slot to a partial hypothetical solution will increase each cumulative property value by at least X and the total by at least Y (so you can prune that branch if either condition makes the running totals exceed the target). This kind of heuristic approximation need not make the big-O behavior any better in general, but it may reduce the expected multiplier enough to bring the problem closer to being computationally feasible.
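A minimal sketch of that shift-and-prune idea, run on the question's sample data regrouped by slot. The names `SLOTS`, `TARGET`, `shift_nonnegative` and `search` are hypothetical, and the sketch shifts values to be >= 0 rather than strictly > 0, which is enough for the "running total may never exceed the target" test. It does not include the slot-sorting refinement.

```python
# Sample data from the question, regrouped by slot as (prop_a, prop_b, prop_c, prop_d).
SLOTS = {
    "top": {"item_1": (2, 0, 2, 1), "item_2": (5, 0, 1, -1), "item_3": (2, 5, 2, -2)},
    "mid": {"item_4": (5, 5, -5, 0), "item_5": (10, 0, -5, 0), "item_6": (-5, 2, 3, 5)},
    "bot": {"item_7": (1, 3, -4, 4), "item_8": (2, 2, 0, 0), "item_9": (3, 1, 4, -4)},
}
TARGET = (3, 3, 8, 0)

def shift_nonnegative(slots, target):
    """Add a per-property constant so no value is negative; adjust the target to match."""
    dims = range(len(target))
    offset = [max(0, -min(vec[d] for items in slots.values() for vec in items.values()))
              for d in dims]
    shifted = {slot: {name: tuple(v + o for v, o in zip(vec, offset))
                      for name, vec in items.items()}
               for slot, items in slots.items()}
    # One item is taken from every slot, so the target moves by len(slots) * offset.
    goal = tuple(t + len(slots) * o for t, o in zip(target, offset))
    return shifted, goal

def search(slots, target):
    shifted, goal = shift_nonnegative(slots, target)
    order = list(shifted)
    solutions = []

    def extend(depth, totals, chosen):
        if depth == len(order):
            if totals == goal:
                solutions.append(list(chosen))
            return
        for name, vec in shifted[order[depth]].items():
            new_totals = tuple(t + v for t, v in zip(totals, vec))
            # Values are non-negative, so an overshoot can never be undone: prune.
            if any(n > g for n, g in zip(new_totals, goal)):
                continue
            chosen.append(name)
            extend(depth + 1, new_totals, chosen)
            chosen.pop()

    extend(0, (0,) * len(goal), [])
    return solutions

print(search(SLOTS, TARGET))  # [['item_2', 'item_6', 'item_9']]
```

As the answer warns, this only trims branches; the worst case is still exponential in the number of slots.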

But it is not even worth looking for such clever little tricks if no requirement can be relaxed: in that case the problem will remain computationally infeasible, so hunting for clever little optimizations would not be practically productive anyway.

ShreevatsaR · Answer 2 · 2010-04-25T20:11:39+0000

This problem is essentially a generalization of the subset-sum problem (which is NP-complete, yes) to multiple dimensions. To restate the problem (to make sure we are solving the same problem): you have 1000 items divided into 20 classes (which you call slots). Each item has an integer value in [-10,10] for each of 8 properties; thus each item can be considered to have a value which is an 8-dimensional vector. You want to pick one item from each slot so that the total value (adding these 8-dimensional vectors) is a given vector.

In the example you gave, you have 4 dimensions, the 9 items in 3 classes have values (2,0,2,1), (5,0,1,-1), ... etc., and you want to pick one item from each class to make the sum (3,3,8,0). Right?

Brute force

First, there is the brute-force search, which enumerates all possibilities. Assuming that your 1000 items are divided equally into 20 classes (so you have 50 in each), you have 50 choices for each class, which means you would have to check 50²⁰ = 9536743164062500000000000000000000 choices (and for each of them you need to add up 20 items along each of the 8 coordinates and check, so the running time would be about 50²⁰ · 20 · 8): this is not feasible.

Dynamic programming, single-shot

Then there is the dynamic programming solution, which is different, and in practice often works where brute force is infeasible, but in this case unfortunately also seems infeasible. (It would improve exponentially if you had better bounds on your property values.) The idea here is to keep track of one way of reaching each possible sum. The sum of 20 numbers from [-10,10] lies in [-200,200], so there are only about 400⁸ = 655360000000000000000 possible sums for your 8-dimensional vector. (This is a tiny fraction of the other search space, but that will not console you. You can also take, for each property, the difference between the sums of [the largest item in each class] and [the smallest item in each class] to replace 400 with a smaller number.) The idea of the dynamic programming algorithm is as follows.

Let last[(a,b,c,d,e,f,g,h)][k] denote one item that you can take from the k-th class (together with one item each from the first k-1 classes) to make the sum exactly (a,b,c,d,e,f,g,h). Then, in pseudocode:
```
for k = 1 to 20:
    for each item i in class k:
        for each vector v for which last[v][k-1] is not null:
            last[v + value(i)][k] = i
```

Then, if your desired final sum is s, you pick item i = last[s][k] from the k-th class (starting with k = 20), then item last[s - value(i)][k-1] from the (k-1)-th class, and so on. This takes time roughly 20 · 50 · 400⁸ · 8 in the worst case (just a loose upper bound, not a tight analysis).

Dynamic programming, separately

So much for "perfect" solutions. However, if you allow heuristic solutions and those that "will probably work in practice", you can do better (even for solving the problem exactly). For example, you can solve the problem separately for each of the 8 dimensions. This is even easier to implement, takes only about 20 · 50 · 400 · 8 = 3200000 steps in the worst case, and can be done quite easily. If you keep last[][] as a list instead of a single item, then at the end you (effectively) have a list of the subsets that reach the given sum for that coordinate (in "product form"). In practice, not many subsets may add up to exactly the sum you want, so you can start with the coordinate for which the number of subsets is smallest, and then try each of those subsets against the other 7 coordinates. The complexity of this step depends on the data in the problem, but I suspect (or hope) that either (1) there will be very few sets with equal sums, in which case this intersection will cut down the number of sets to check, or (2) there will be many sets with the given sum, in which case you will find one quite early.

In any case, doing the dynamic programming separately for each coordinate first will let you search a much smaller space in the second stage.
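One way to use per-coordinate tables to shrink the second-stage search is sketched below. Note that this is a variant, not the list-of-subsets construction described above: for each coordinate it precomputes which partial sums the remaining slots can still reach, and the combined search abandons any partial choice that some coordinate can no longer complete. The names `SLOTS`, `TARGET`, `suffix_reachable` and `solve_with_coordinate_filter` are hypothetical.

```python
# Sample data from the question, regrouped by slot as (prop_a, prop_b, prop_c, prop_d).
SLOTS = {
    "top": {"item_1": (2, 0, 2, 1), "item_2": (5, 0, 1, -1), "item_3": (2, 5, 2, -2)},
    "mid": {"item_4": (5, 5, -5, 0), "item_5": (10, 0, -5, 0), "item_6": (-5, 2, 3, 5)},
    "bot": {"item_7": (1, 3, -4, 4), "item_8": (2, 2, 0, 0), "item_9": (3, 1, 4, -4)},
}
TARGET = (3, 3, 8, 0)

def suffix_reachable(classes, dim):
    """reach[k] = sums in coordinate `dim` reachable with one item from each of classes[k:]."""
    reach = [None] * (len(classes) + 1)
    reach[len(classes)] = {0}  # nothing left to pick: only the sum 0 is reachable
    for k in range(len(classes) - 1, -1, -1):
        reach[k] = {vec[dim] + rest
                    for vec in classes[k].values()
                    for rest in reach[k + 1]}
    return reach

def solve_with_coordinate_filter(slots, target):
    classes = list(slots.values())
    dims = range(len(target))
    # Stage 1: a cheap one-dimensional table for every coordinate.
    reach = [suffix_reachable(classes, d) for d in dims]
    solutions = []

    # Stage 2: search all coordinates together, filtered by the stage-1 tables.
    def extend(k, remaining, chosen):
        if k == len(classes):
            if not any(remaining):
                solutions.append(list(chosen))
            return
        for name, vec in classes[k].items():
            rest = tuple(r - v for r, v in zip(remaining, vec))
            # Keep the branch only if every coordinate can still be completed.
            if all(rest[d] in reach[d][k + 1] for d in dims):
                chosen.append(name)
                extend(k + 1, rest, chosen)
                chosen.pop()

    extend(0, tuple(target), [])
    return solutions

print(solve_with_coordinate_filter(SLOTS, TARGET))  # [['item_2', 'item_6', 'item_9']]
```

Each one-dimensional table is small (at most a few hundred sums per slot for values in [-10,10]), so stage 1 is cheap; how much it prunes in stage 2 depends on the data, as the answer says.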

Approximation algorithms

If you don't need the sums to be exactly equal and will accept sums that are within a certain factor of your required sum, there is a well-known idea used to obtain an FPTAS (fully polynomial-time approximation scheme) for the subset-sum problem, which runs in time polynomial in (the number of elements, etc.) and 1/ε. I have run out of time to explain it, but you can look it up - basically, it just replaces the 400⁸ space with a smaller one, for example by rounding numbers to the nearest multiple of 5 or so.
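The following is a rough sketch in the spirit of that rounding idea, not an implementation of the FPTAS and with no approximation guarantee claimed. It runs the layered dynamic programme but merges all partial sums that fall into the same grid cell (each coordinate rounded to a multiple of `step`), keeping one representative per cell, and finally returns the representative closest to the target. The names `SLOTS`, `TARGET`, `approximate` and the L1 error measure are assumptions of this sketch.

```python
# Sample data from the question, regrouped by slot as (prop_a, prop_b, prop_c, prop_d).
SLOTS = {
    "top": {"item_1": (2, 0, 2, 1), "item_2": (5, 0, 1, -1), "item_3": (2, 5, 2, -2)},
    "mid": {"item_4": (5, 5, -5, 0), "item_5": (10, 0, -5, 0), "item_6": (-5, 2, 3, 5)},
    "bot": {"item_7": (1, 3, -4, 4), "item_8": (2, 2, 0, 0), "item_9": (3, 1, 4, -4)},
}
TARGET = (3, 3, 8, 0)

def approximate(slots, target, step=5):
    """Return (names, total, error) for a combination whose sum is near target."""
    def cell(vec):
        # Grid cell of a sum vector: each coordinate rounded to a multiple of step.
        return tuple(round(v / step) for v in vec)

    zero = (0,) * len(target)
    # One representative (exact running total, chosen names) per grid cell.
    states = {cell(zero): (zero, [])}
    for items in slots.values():
        merged = {}
        for total, path in states.values():
            for name, vec in items.items():
                new_total = tuple(t + v for t, v in zip(total, vec))
                merged.setdefault(cell(new_total), (new_total, path + [name]))
        states = merged

    def error(total):
        # L1 distance to the target (an arbitrary choice for this sketch).
        return sum(abs(t - g) for t, g in zip(total, target))

    best_total, best_path = min(states.values(), key=lambda state: error(state[0]))
    return best_path, best_total, error(best_total)

# With step=1 nothing is merged, so the search is exact on integer data.
print(approximate(SLOTS, TARGET, step=1))
# A coarser grid keeps fewer states per layer but may miss the exact solution.
print(approximate(SLOTS, TARGET, step=5))
```

The number of states per layer is bounded by the number of grid cells, so a larger `step` trades accuracy for speed and memory.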

Stephen · Answer 3 · 2010-04-25T16:47:16+0000

This sounds like a variation of the knapsack problem, which is commonly solved with dynamic programming.

But you could probably write a fairly simple (but slower) solution using recursion:

    def GetItemsForSlot(item_list, slot):
        return [(k, v) for (k, v) in item_list.items() if v['slot'] == slot]

    def SubtractWeights(current_weights, item_weights):
        remaining_weights = {}
        for (k, v) in current_weights.items():
            remaining_weights[k] = current_weights[k] - item_weights[k]
        return remaining_weights

    def AllWeightsAreZero(remaining_weights):
        return not [v for v in remaining_weights.values() if v != 0]

    def choose_items(item_list, remaining_weights, available_slots,
                     accumulated_items=[]):
        print("choose_items: ", remaining_weights, available_slots,
              accumulated_items)
        # Base case: we have no more available slots.
        if not available_slots:
            if AllWeightsAreZero(remaining_weights):
                # This is a solution.
                print("SOLUTION FOUND: ", accumulated_items)
                return
            else:
                # This had remaining weight, not a solution.
                return

        # Pick the next available_slot
        slot = available_slots[0]
        # Iterate over each item for this slot, checking to see if they're in a
        # solution.
        for name, properties in GetItemsForSlot(item_list, slot):
            choose_items(item_list,  # pass the items recursively
                         SubtractWeights(remaining_weights, properties),
                         available_slots[1:],  # pass remaining slots
                         accumulated_items + [name])  # Add this item

    if __name__ == "__main__":
        total_weights = {
            'prop_a': 3,
            'prop_b': 3,
            'prop_c': 8,
            'prop_d': 0
        }
        choose_items(item_list, total_weights, ["top", "mid", "bot"])

This was tested and seemed to work. No promises though :)

Keeping slot and prop_a as properties of the same object made it harder to work with. I would suggest using classes instead of a dictionary to make the code easier to understand.
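One possible shape for such a class, as a minimal sketch: the slot is kept apart from the numeric properties, so summing code never has to skip over it. The `Item` class and the helpers `load_items` and `group_by_slot` are hypothetical names, and the item list is trimmed to four entries from the question.

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    slot: str
    props: dict  # numeric properties only, e.g. {'prop_a': 2, ...}

def load_items(item_list):
    """Convert the question's dict-of-dicts into Item objects."""
    items = []
    for name, fields in item_list.items():
        props = {key: value for key, value in fields.items() if key != "slot"}
        items.append(Item(name=name, slot=fields["slot"], props=props))
    return items

def group_by_slot(items):
    """Map each slot name to the list of items that belong to it."""
    by_slot = {}
    for item in items:
        by_slot.setdefault(item.slot, []).append(item)
    return by_slot

# Trimmed sample in the question's original format.
item_list = {
    'item_1': {'slot': 'top', 'prop_a': 2, 'prop_b': 0, 'prop_c': 2, 'prop_d': 1},
    'item_2': {'slot': 'top', 'prop_a': 5, 'prop_b': 0, 'prop_c': 1, 'prop_d': -1},
    'item_6': {'slot': 'mid', 'prop_a': -5, 'prop_b': 2, 'prop_c': 3, 'prop_d': 5},
    'item_9': {'slot': 'bot', 'prop_a': 3, 'prop_b': 1, 'prop_c': 4, 'prop_d': -4},
}

by_slot = group_by_slot(load_items(item_list))
for slot, members in by_slot.items():
    print(slot, [item.name for item in members])
```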

Juho Vepsäläinen · Answer 4 · 2010-04-25T17:01:14+0000

"I tried looping through every combination, but that quickly gets out of hand and becomes unmanageable. The actual script will have to do this for approximately 1000 items, using 20 different slots, each item having 8 properties."

It might help your thinking to first load the structure into a nice object hierarchy and then solve it piecewise.

Example:

    class Items(dict):
        def find(self, **clauses):
            raise NotImplementedError  # TODO!

    class Slots(dict):
        pass  # TODO!

    items = Items()
    for item, slots in item_list.items():
        items[item] = Slots(slots)
        # consider abstracting out slot based on location (top, mid, bot) too

    print(items.find(prop_a=3, prop_b=3, prop_c=8, prop_d=0))
