Calculate percentage for array list

A simple problem, but I can't get it to work. For each array in a list, I want to calculate what percentage of its elements each value accounts for, and output those percentages. My list of arrays looks like this:

import numpy as np

# Create some data   
listvalues = []

arr1 = np.array([0, 0, 2])
arr2 = np.array([1, 1, 2, 2])
arr3 = np.array([0, 2, 2])

listvalues.append(arr1)
listvalues.append(arr2)
listvalues.append(arr3)

listvalues
>[array([0, 0, 2]), array([1, 1, 2, 2]), array([0, 2, 2])]

Now I count the occurrences using collections, which gives me a list of collections.Counter objects:

import collections 

counter = []
for arr in listvalues:
    counter.append(collections.Counter(arr))

counter
>[Counter({0: 2, 2: 1}), Counter({1: 2, 2: 2}), Counter({0: 1, 2: 2})]

The result I'm looking for is an array with 3 columns, one for each value from 0 to 2, and len(listvalues) rows. Each cell should hold the percentage of that value in the corresponding array:

# Result
66.66    0      33.33
0        50     50
33.33    0      66.66

So 0 makes up 66.66% of array 1, 0% of array 2 and 33.33% of array 3, and so on.

What would be the best way to achieve this? Many thanks!


Here is a vectorized approach using np.bincount:

# Get lengths of each element in input list
lens = np.array([len(item) for item in listvalues])

# Form group ID array to ID elements in flattened listvalues
ID_arr = np.repeat(np.arange(len(lens)),lens)

# Flatten all values, then count each (row, value) pair through a flat index.
# minlength keeps the reshape valid even if the last array lacks the largest value.
vals = np.concatenate(listvalues)
out_shp = [ID_arr.max()+1,vals.max()+1]
counts = np.bincount(ID_arr*out_shp[1] + vals, minlength=out_shp[0]*out_shp[1])

# Finally get the percentages by dividing by the length of each array
out = 100*np.true_divide(counts.reshape(out_shp),lens[:,None])

Sample run, with a fourth array added to the list:

In [316]: listvalues
Out[316]: [array([0, 0, 2]),array([1, 1, 2, 2]),array([0, 2, 2]),array([4, 0, 1])]

In [317]: print out
[[ 66.66666667   0.          33.33333333   0.           0.        ]
 [  0.          50.          50.           0.           0.        ]
 [ 33.33333333   0.          66.66666667   0.           0.        ]
 [ 33.33333333  33.33333333   0.           0.          33.33333333]]
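
For reuse, the same flat-index idea can be wrapped in a function. This is only a sketch: the name percent_table is made up for illustration, and it assumes every array is non-empty and holds non-negative integers.

```python
import numpy as np

def percent_table(arrays):
    """Hypothetical helper: row i holds the percentage of each value in arrays[i]."""
    lens = np.array([len(a) for a in arrays])
    # Row index of every element once the arrays are flattened
    row_ids = np.repeat(np.arange(len(arrays)), lens)
    vals = np.concatenate(arrays)
    n_rows, n_cols = len(arrays), int(vals.max()) + 1
    # One flat bin per (row, value) pair; minlength pads trailing empty bins
    counts = np.bincount(row_ids * n_cols + vals, minlength=n_rows * n_cols)
    return 100.0 * counts.reshape(n_rows, n_cols) / lens[:, None]

listvalues = [np.array([0, 0, 2]), np.array([1, 1, 2, 2]), np.array([0, 2, 2])]
table = percent_table(listvalues)
print(np.round(table, 2))
```

Each row of the returned table sums to 100, which is a quick sanity check on the result.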

With the numpy_indexed package, you can use its count_table function:

import numpy_indexed as npi
arrs = [arr1, arr2, arr3]
idx = [np.ones(len(a))*i for i, a in enumerate(arrs)]
(rows, cols), table = npi.count_table(np.concatenate(idx), np.concatenate(arrs))
table = table / table.sum(axis=1, keepdims=True)
print(table * 100)

Collect the distinct values, then compute the share of each value in each array:

values = sorted(set(y for row in listvalues for y in row))
print([[(a == x).sum() * 100.0 / len(a) for x in values] for a in listvalues])


Starting from the counter list you already have, build a flat list of percentages:

percentage_list = [(counter[i].get(j, 0)*10000)//len(listvalues[i])/100.0 for i in range(len(listvalues)) for j in range(3)]

Convert it to a NumPy array:

results = np.array(percentage_list)

Then reshape it into 3 rows and 3 columns:

results = results.reshape(3,3)
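
The hard-coded 3 in range(3) and reshape(3,3) only fits this particular data. As a rough sketch of the same Counter-based idea without fixed sizes (the names counters and n_values are mine, and non-negative integer values are assumed):

```python
import collections
import numpy as np

listvalues = [np.array([0, 0, 2]), np.array([1, 1, 2, 2]), np.array([0, 2, 2])]

# tolist() turns NumPy integers into plain Python ints before counting
counters = [collections.Counter(arr.tolist()) for arr in listvalues]

# One column per value from 0 up to the largest value seen in any array
n_values = max(max(c) for c in counters) + 1

# A Counter returns 0 for a missing key, so no special case is needed
results = np.array([[100.0 * c[v] / len(arr) for v in range(n_values)]
                    for c, arr in zip(counters, listvalues)])
print(np.round(results, 2))
```

Building the rows as a nested list means the array already has the right shape, so no reshape step is required.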



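Another option uses np.unique with return_counts=True. Note that the result below holds fractions rather than percentages and is transposed relative to the layout in the question (one row per value, one column per array):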

>>> import numpy as np
>>> import pprint
>>> 
>>> arr1 = np.array([0, 0, 2])
>>> arr2 = np.array([1, 1, 2, 2])
>>> arr3 = np.array([0, 2, 2])
>>> 
>>> arrays = (arr1, arr2, arr3)
>>> 
>>> u = np.unique(np.hstack(arrays))
>>> 
>>> result = [[1.0 * c.get(uk, 0) / l
...            for l, c in ((len(arr), dict(zip(*np.unique(arr, return_counts=True))))
...            for arr in arrays)] for uk in u]
>>> 
>>> pprint.pprint(result)
[[0.6666666666666666, 0.0, 0.3333333333333333],
 [0.0, 0.5, 0.0],
 [0.3333333333333333, 0.5, 0.6666666666666666]]
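
To get the layout asked for in the question (one row per array, values as percentages), the same np.unique counting can be arranged the other way round. A minimal sketch, with the variable names share and percent chosen only for illustration:

```python
import numpy as np

arrays = (np.array([0, 0, 2]), np.array([1, 1, 2, 2]), np.array([0, 2, 2]))

# Distinct values across all arrays, in sorted order; these become the columns
u = np.unique(np.hstack(arrays)).tolist()

rows = []
for arr in arrays:
    vals, cnts = np.unique(arr, return_counts=True)
    share = dict(zip(vals.tolist(), cnts.tolist()))  # value -> count in this array
    rows.append([100.0 * share.get(v, 0) / len(arr) for v in u])

percent = np.array(rows)
print(np.round(percent, 2))
```

The columns here follow the distinct values actually present, so a value that never occurs in any array gets no column.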

Source: https://habr.com/ru/post/1648651/

