Counting the occurrences of each column in a numpy array

Given a 2-D numpy array M, I want to count the number of occurrences of each column of M. That is, I'm looking for a generic version of bincount.

What I have tried so far: (1) converted the columns to tuples, (2) hashed the tuples (via hash) to natural numbers, (3) used numpy.bincount.

This seems rather awkward. Does anyone know a more elegant and efficient way?
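For reference, the three steps described above could be sketched roughly as follows. This is only an illustration under assumptions: the function name bincount_columns is hypothetical, and a dict is used to assign small consecutive integer labels in place of raw hash() values, because numpy.bincount expects small non-negative integers.

```python
import numpy as np

def bincount_columns(m):
    """Count occurrences of each column of a 2-D array (hypothetical helper)."""
    labels = {}                                   # column tuple -> small integer label
    ids = np.empty(m.shape[1], dtype=np.intp)
    for j, col in enumerate(map(tuple, m.T)):     # step 1: columns to tuples
        # step 2: map each distinct tuple to the next unused natural number
        ids[j] = labels.setdefault(col, len(labels))
    # step 3: bincount over the integer labels
    return list(labels), np.bincount(ids)

a = np.array([[0, 1, 2, 4, 5, 1, 2, 3],
              [4, 5, 6, 8, 9, 5, 6, 7],
              [8, 9, 10, 12, 13, 9, 10, 11]])
cols, counts = bincount_columns(a)
```

Here cols lists the distinct columns in first-seen order, and counts[i] is the number of times cols[i] occurs.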

2 answers

You can use collections.Counter:

    >>> import numpy as np
    >>> a = np.array([[ 0,  1,  2,  4,  5,  1,  2,  3],
    ...               [ 4,  5,  6,  8,  9,  5,  6,  7],
    ...               [ 8,  9, 10, 12, 13,  9, 10, 11]])
    >>> from collections import Counter
    >>> Counter(map(tuple, a.T))
    Counter({(2, 6, 10): 2, (1, 5, 9): 2, (4, 8, 12): 1, (5, 9, 13): 1, (3, 7, 11): 1, (0, 4, 8): 1})
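If the result is needed as arrays instead of a Counter, the same idea might be wrapped as below. This is a minimal sketch: the helper name count_columns is an assumption for illustration, and it relies on Counter keeping insertion order (Python 3.7+).

```python
from collections import Counter
import numpy as np

def count_columns(m):
    """Return (unique_columns, counts) for a 2-D array, in first-seen order."""
    counter = Counter(map(tuple, m.T))         # each column becomes a hashable tuple
    cols = np.array(list(counter.keys())).T    # unique columns, shape (n_rows, n_unique)
    counts = np.array(list(counter.values()))  # counts[i] belongs to cols[:, i]
    return cols, counts

a = np.array([[0, 1, 2, 4, 5, 1, 2, 3],
              [4, 5, 6, 8, 9, 5, 6, 7],
              [8, 9, 10, 12, 13, 9, 10, 11]])
cols, counts = count_columns(a)
```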

Given:

    a = np.array([[ 0,  1,  2,  4,  5,  1,  2,  3],
                  [ 4,  5,  6,  8,  9,  5,  6,  7],
                  [ 8,  9, 10, 12, 13,  9, 10, 11]])
    b = np.transpose(a)

  • A more efficient solution than hashing (it still requires some manipulation):

    I create a view of the array with the flexible data type np.void, so that each row becomes a single element. Converting to this form allows np.unique to work on it.

        %%timeit
        c = np.ascontiguousarray(b).view(np.dtype((np.void, b.dtype.itemsize * b.shape[1])))
        _, idx, cnt = np.unique(c, return_index=True, return_counts=True)
        # counts are in the last column; remember that b is the transpose of a
        >>> np.concatenate((b[idx], cnt[:, None]), axis=1)
        array([[ 0,  4,  8,  1],
               [ 1,  5,  9,  2],
               [ 2,  6, 10,  2],
               [ 3,  7, 11,  1],
               [ 4,  8, 12,  1],
               [ 5,  9, 13,  1]])
        10000 loops, best of 3: 65.4 µs per loop

    The counts are appended to the unique columns of a (the rows of b).

  • Your hashing solution:

        %%timeit
        array_hash = [hash(tuple(row)) for row in b]
        uniq, idx, cnt = np.unique(array_hash, return_index=True, return_counts=True)
        np.concatenate((b[idx], cnt[:, None]), axis=1)
        10000 loops, best of 3: 89.5 µs per loop
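As a possible alternative to the void view in the first bullet, NumPy 1.13 and later let np.unique operate along an axis directly. The sketch below assumes such a NumPy version is available; the names uniq_cols, counts and result are illustrative, and no timing is claimed for it.

```python
import numpy as np

a = np.array([[0, 1, 2, 4, 5, 1, 2, 3],
              [4, 5, 6, 8, 9, 5, 6, 7],
              [8, 9, 10, 12, 13, 9, 10, 11]])

# axis=1 treats each column as one item; the unique columns come back sorted
uniq_cols, counts = np.unique(a, axis=1, return_counts=True)

# append the counts as an extra row below the unique columns
result = np.vstack((uniq_cols, counts))
```

Working on a directly avoids both the transpose and the contiguous copy needed for the void view.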

Update: Eph's solution is the most efficient and elegant.

    %%timeit
    Counter(map(tuple, a.T))
    10000 loops, best of 3: 38.3 µs per loop

Source: https://habr.com/ru/post/1237959/

