Create a list of random weighted tuples from a list

Given a list of tuples a:

 a =[(23, 11), (10, 16), (13, 11), (12, 3), (4, 15), (10, 16), (10, 16)] 

We can count how many times each tuple occurs using Counter:

 >>> from collections import Counter
 >>> b = Counter(a)
 >>> b
 Counter({(4, 15): 1, (10, 16): 3, (12, 3): 1, (13, 11): 1, (23, 11): 1})

Now the idea is to select 3 random tuples from the list without repetition, so that the counts in the Counter determine the probability of choosing a particular tuple.

For example, (10, 16) is more likely to be chosen than the others: its weight is 3/7, while each of the other four tuples has weight 1/7.
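As a quick check of these weights, here is a minimal sketch that derives them from the Counter. The names counts and weights are introduced here for illustration only and are not part of the original question.

```python
from collections import Counter

a = [(23, 11), (10, 16), (13, 11), (12, 3), (4, 15), (10, 16), (10, 16)]
counts = Counter(a)

# Each weight is the tuple's count divided by the total number of items.
weights = {t: n / len(a) for t, n in counts.items()}

for t, w in weights.items():
    print(t, round(w, 3))
```

The weights sum to 1, so they can be passed as a probability vector to a sampler that accepts one.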

I tried using np.random.choice:

 a[np.random.choice(len(a), 3, p=b/len(a))] 

But this does not produce tuples.

This is what I am trying now:

 from collections import Counter
 import numpy as np

 a = [(23, 11), (10, 16), (13, 11), (10, 16), (10, 16), (10, 16), (10, 16)]
 b = Counter(a)
 c = []
 print("counter list")
 print(b)
 for item in b:
     print("item from current list")
     print(item)
     print("prob of the item")
     print(float(b[item]) / float(len(a)))
     c.append(float(b[item]) / float(len(a)))
 print("prob list")
 print(c)
 print(np.random.choice(np.arange(len(b)), 3, p=c, replace=False))

In this case, I get random array indices rather than tuples.

  • Is there a more efficient way that avoids calculating an array of probabilities?

  • There is also the problem that the probability array does not match up with the entries of the Counter.

3 answers

This will do the trick:

 from collections import Counter
 import numpy as np

 listOfNumbers = [(23, 11), (10, 16), (13, 11), (10, 16), (10, 16), (10, 16), (10, 16)]
 b = Counter(listOfNumbers)
 c = []     # probabilities
 pres = []  # tuples, in the same order as c
 for k, v in b.most_common():
     c.append(float(v) / float(len(listOfNumbers)))
     pres.append(k)
 # Draw 3 distinct indices, weighted by c
 resultIndex = np.random.choice(np.arange(len(b)), 3, p=c, replace=False)
 ass = []
 for res in resultIndex:
     ass.append(pres[res])  # map each index back to its tuple
 print(ass)

What remains is to see whether there is a way to optimize it.
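One possible way to shorten this, offered as a sketch rather than a definitive answer: take the distinct tuples and their counts straight from the Counter, normalize the counts once, and index back into the list of keys. It assumes a NumPy version that provides np.random.default_rng; the names keys, p, idx and sample are illustrative. Indices are sampled instead of the tuples themselves because choice expects a one-dimensional input.

```python
from collections import Counter

import numpy as np

a = [(23, 11), (10, 16), (13, 11), (12, 3), (4, 15), (10, 16), (10, 16)]
counts = Counter(a)

keys = list(counts)                               # distinct tuples
p = np.array(list(counts.values()), dtype=float)  # counts in the same order as keys
p /= p.sum()                                      # normalize to probabilities

rng = np.random.default_rng()
# Draw 3 distinct indices into keys, weighted by p.
idx = rng.choice(len(keys), size=3, replace=False, p=p)
sample = [keys[i] for i in idx]
print(sample)
```

Because keys and p are both built from the same Counter, in the same iteration order, the probabilities stay aligned with the tuples they belong to.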


If you are not interested in the intermediate step of calculating frequencies, you can use random.shuffle (either on the list itself or on a copy) and then slice off as many elements as you need.

For example:

 import random

 a = [(23, 11), (10, 16), (13, 11), (12, 3), (4, 15), (10, 16), (10, 16)]
 random.shuffle(a)
 random_sample = a[0:3]
 print(random_sample)

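A random permutation never selects the same list position twice, and it should be statistically equivalent to sampling positions without replacement (apart from differences in random number generation between np and random). Note that duplicate tuples occupy different positions, so the same tuple value can still appear more than once in the sample.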
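If the original list should stay untouched, the standard library's random.sample does the same job as shuffling and slicing in a single call. A minimal sketch:

```python
import random

a = [(23, 11), (10, 16), (13, 11), (12, 3), (4, 15), (10, 16), (10, 16)]

# random.sample picks 3 distinct positions and leaves a unchanged.
random_sample = random.sample(a, 3)
print(random_sample)
```

Like the shuffle approach, this samples list positions, so tuples that occur more often in a are proportionally more likely to be picked.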


You can repeat the following steps 3 times:

  • Randomly select a number i in the range [0..n-1], where n is the current number of elements in a.
  • Find the tuple at the i-th position in a. Add this tuple to the resulting triplet.
  • Remove all occurrences of this tuple from a.

Watch out for the corner case where a becomes empty.

The total time complexity is O(n) in the length of the list.

In the first step, the number i must be generated from a uniform distribution, which the standard random module provides. The more occurrences of a particular tuple there are in a, the more likely it is to be selected.
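The steps above can be sketched as follows. The function name pick_distinct and its parameters are hypothetical, chosen here for illustration; the function works on a copy so that the caller's list is not modified, and it stops early if the list runs out of elements.

```python
import random


def pick_distinct(items, k):
    """Pick up to k distinct values, weighted by how often each occurs."""
    pool = list(items)  # work on a copy of the input
    result = []
    while pool and len(result) < k:
        i = random.randrange(len(pool))  # uniform index in [0, n-1]
        chosen = pool[i]
        result.append(chosen)
        # Remove all occurrences of the chosen value from the pool.
        pool = [t for t in pool if t != chosen]
    return result


a = [(23, 11), (10, 16), (13, 11), (12, 3), (4, 15), (10, 16), (10, 16)]
triplet = pick_distinct(a, 3)
print(triplet)
```

Each pass over the pool is linear in its length, and the number of passes is fixed at 3, which is consistent with the O(n) estimate.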


Source: https://habr.com/ru/post/1241043/

