What is a more efficient way to select a random pair of objects from a list of lists or tuples?

I have a list of 2d coordinates with this structure:

coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)] 

where coo[0] is the first coordinate, stored as a tuple.

I would like to select two different random coordinates. I can of course use this method:

    import numpy as np

    rndcoo1 = coo[np.random.randint(0, len(coo))]
    rndcoo2 = coo[np.random.randint(0, len(coo))]
    if rndcoo1 != rndcoo2:
        pass  # do something

Since I have to repeat this operation 1,000,000 times, I was wondering whether there is a faster way to do it. np.random.choice() cannot be used on a 2D array; is there an alternative I can use?

3 answers
    import random

    result = random.sample(coo, 2)

will give you the expected result. And this is (possibly) as fast as you can get with Python.
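A minimal sketch of how this might look when the draw is repeated many times, as in the question. The helper name `sample_pairs` and the iteration count are illustrative assumptions, not part of the original answer. Because `random.sample` draws without replacement, the two items always come from different positions in the list.

```python
import random

coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]

def sample_pairs(points, num_iter):
    """Hypothetical helper: draw num_iter pairs, each pair without replacement."""
    return [random.sample(points, 2) for _ in range(num_iter)]

pairs = sample_pairs(coo, 1000)

# coo contains no duplicate tuples, so the two coordinates in each pair differ.
first, second = pairs[0]
```

If the list itself contained duplicate coordinates, two different positions could still hold equal values, so an explicit inequality check would be needed in that case.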


Listed in this post is a vectorized approach that produces such random selections for many iterations in a single pass, without looping over the iterations. The idea uses np.argpartition and is inspired by this post.

Here's the implementation -

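    import numpy as np

    def get_items(coo, num_items=2, num_iter=10):
        # One row of random keys per iteration; the indices of the num_items
        # smallest keys in each row give num_items distinct positions in coo.
        idx = np.random.rand(num_iter, len(coo)).argpartition(num_items, axis=1)[:, :num_items]
        return np.asarray(coo)[idx]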

Note that we are returning a 3D array whose first dimension is the number of iterations, whose second dimension is the number of selections made in each iteration, and whose last dimension is the length of each tuple.
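The steps inside the function can be unpacked as follows. This is only a sketch with illustrative variable names (`keys`, `idx`, `pairs`); it shows why each row ends up with distinct indices and what shape the result has.

```python
import numpy as np

coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]
num_items, num_iter = 2, 5

# One row of random keys per iteration, one key per coordinate.
keys = np.random.rand(num_iter, len(coo))

# argpartition returns a permutation of column indices for each row; after
# partitioning around position num_items, the first num_items columns hold
# the indices of the num_items smallest keys (in no particular order).
idx = keys.argpartition(num_items, axis=1)[:, :num_items]

# Fancy indexing with a (num_iter, num_items) index array into a (7, 2) array.
pairs = np.asarray(coo)[idx]
print(pairs.shape)  # (5, 2, 2)
```

Since each row of `idx` is part of a permutation, the indices within a row never repeat, which is what guarantees two different coordinates per iteration.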

A sample run should make this clearer -

    In [55]: coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]

    In [56]: get_items(coo, 2, 5)
    Out[56]:
    array([[[2, 0],
            [1, 1]],

           [[0, 0],
            [1, 1]],

           [[0, 2],
            [2, 0]],

           [[1, 1],
            [1, 0]],

           [[0, 2],
            [1, 1]]])

A runtime test comparing this against a looped implementation using random.sample, as suggested in @freakish's answer -

    In [52]: coo = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0)]

    In [53]: %timeit [random.sample(coo, 2) for i in range(10000)]
    10 loops, best of 3: 34.4 ms per loop

    In [54]: %timeit get_items(coo, 2, 10000)
    100 loops, best of 3: 2.81 ms per loop

Is coo just an example, or are your coordinates actually evenly distributed? If so, you can simply draw the M 2D coordinates directly, as follows:

    import numpy

    N = 100
    M = 1000000
    coo = numpy.random.randint(0, N, size=(M, 2))

Of course, you can also offset and scale the distribution using addition and multiplication to take into account different step sizes and offsets.
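As a rough illustration of the offset-and-scale idea, the sketch below maps integer grid indices to coordinates with a different spacing and origin. The values of `step` and `offset` are assumptions chosen only for the example.

```python
import numpy as np

N = 100        # number of grid positions per axis
M = 1_000_000  # number of coordinates to draw
step = 0.5     # assumed grid spacing (illustrative value)
offset = 10.0  # assumed origin shift (illustrative value)

# Draw integer grid indices, then scale and shift them.
grid_idx = np.random.randint(0, N, size=(M, 2))
coo = offset + step * grid_idx
```

With these values every coordinate component lies between 10.0 and 59.5 inclusive, in steps of 0.5.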

If you run into memory constraints with large M, you can of course generate smaller batches, or just a single array of 2 values with size=2.
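For the memory-constrained case, one possible sketch is to generate the coordinates in batches. The generator name `iter_coordinates` and the batch size are hypothetical choices for illustration, not part of the original answer.

```python
import numpy as np

def iter_coordinates(n, total, batch_size):
    """Hypothetical generator: yield (k, 2) integer arrays until `total` rows are produced."""
    remaining = total
    while remaining > 0:
        k = min(batch_size, remaining)
        yield np.random.randint(0, n, size=(k, 2))
        remaining -= k

count = 0
for batch in iter_coordinates(n=100, total=1_000_000, batch_size=100_000):
    # Process each batch here; only one batch is held in memory at a time.
    count += len(batch)
```

At the other extreme, `np.random.randint(0, n, size=2)` returns a single coordinate per call, which uses almost no memory but gives up the benefit of vectorization.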


Source: https://habr.com/ru/post/1260513/

