A more elegant way to map list values to numbers

I would like to convert a list into numbers, so that equal values get the same number.

For instance:

['aa', 'b', 'b', 'c', 'aa', 'b', 'a'] -> [0, 1, 1, 2, 0, 1, 3]

I am trying to achieve this using NumPy and a mapping dict.

import numpy as np

def number(lst):
    x = np.array(lst)
    unique_names = list(np.unique(x))
    mapping = dict(zip(unique_names, range(len(unique_names))))  # name -> number lookup
    map_func = np.vectorize(lambda name: mapping[name])
    return map_func(x)

Is there a more elegant / faster way to do this?

Update: Bonus question: do it while preserving the order of first appearance.

5 answers

You can use the return_inverse keyword of np.unique:

x = np.array(['aa', 'b', 'b', 'c', 'aa', 'b', 'a'])
uniq, map_ = np.unique(x, return_inverse=True)
map_
# array([1, 2, 2, 3, 1, 2, 0])

Edit: order-preserving version:

x = np.array(['aa', 'b', 'b', 'c', 'aa', 'b', 'a'])
uniq, idx, map_ = np.unique(x, return_index=True, return_inverse=True)
mxi = idx.max()+1
mask = np.zeros((mxi,), bool)
mask[idx] = True
oidx = np.where(mask)[0]
iidx = np.empty_like(oidx)
iidx[map_[oidx]] = np.arange(oidx.size)
iidx[map_]
# array([0, 1, 1, 2, 0, 1, 3])
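
The masking steps above amount to ranking each unique value by the position of its first occurrence. A shorter sketch of the same idea, assuming a 1-D input; the helper name number_keep_order is made up for illustration:

```python
import numpy as np

def number_keep_order(values):
    # Hypothetical helper, not part of NumPy.
    x = np.asarray(values)
    # idx[i] is the position where uniq[i] first occurs in x
    uniq, idx, inv = np.unique(x, return_index=True, return_inverse=True)
    # argsort of argsort turns the first-occurrence positions into ranks 0..k-1
    rank = np.argsort(np.argsort(idx))
    return rank[inv]

print(number_keep_order(['aa', 'b', 'b', 'c', 'aa', 'b', 'a']))
# [0 1 1 2 0 1 3]
```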

Here's a vectorized NumPy-based solution:

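import numpy as np

def argsort_unique(idx):
    # Original idea : http://stackoverflow.com/a/41242285/3293881 by @Andras
    # Inverts a permutation; same result as idx.argsort() when idx has unique values
    n = idx.size
    sidx = np.empty(n,dtype=int)
    sidx[idx] = np.arange(n)
    return sidx

def map_uniquetags_keep_order(a):
    arr = np.asarray(a)

    # Stable sort, so the earliest occurrence comes first within each run of equal values
    sidx = np.argsort(arr, kind='stable')
    s_arr = arr[sidx]

    # Mark the first element of each run of equal values
    m = np.concatenate(( [True], s_arr[1:] != s_arr[:-1] ))
    unq = s_arr[m]
    # Tag every element with the index of its value in sorted order
    tags = np.searchsorted(unq, arr)
    # Convert sorted-order tags into ranks by first appearance in the input
    rev_idx = argsort_unique(sidx[np.searchsorted(s_arr, unq)].argsort())
    return rev_idx[tags]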

Sample run:

In [169]: a = ['aa', 'b', 'b', 'c', 'aa', 'b', 'a'] # String input

In [170]: map_uniquetags_keep_order(a)
Out[170]: array([0, 1, 1, 2, 0, 1, 3])

In [175]: a = [4, 7, 7, 5, 4, 7, 2]                 # Numeric input

In [176]: map_uniquetags_keep_order(a)
Out[176]: array([0, 1, 1, 2, 0, 1, 3])

Using a set to remove duplicates:

myList = ['a', 'b', 'b', 'c', 'a', 'b']
mySet = set(myList)

Then create a dictionary using a dict comprehension:

mappingDict = {letter:number for number,letter in enumerate(mySet)}
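
The dictionary above still has to be applied to the list, and because a set has no defined order, the numbers it assigns are arbitrary. A minimal sketch that numbers items in order of first appearance instead; the name number_in_order is an illustrative assumption:

```python
def number_in_order(items):
    # Hypothetical helper: each new value gets the next unused number.
    mapping = {}
    result = []
    for item in items:
        # setdefault stores len(mapping) only the first time item is seen
        result.append(mapping.setdefault(item, len(mapping)))
    return result

print(number_in_order(['aa', 'b', 'b', 'c', 'aa', 'b', 'a']))
# [0, 1, 1, 2, 0, 1, 3]
```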

I did this using ASCII values because it is easy and short (it only works when every item is a single lowercase letter).

def number(lst):
    # ord('a') == 97, so 'a' -> 0, 'b' -> 1, ...
    return [ord(x) - 97 for x in lst]

l = ['a', 'b', 'b', 'c', 'a', 'b']
print(number(l))

Output:

[0, 1, 1, 2, 0, 1]


If the order is not a concern:

[sorted(set(x)).index(item) for item in x]

# returns:
[1, 2, 2, 3, 1, 2, 0]
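
The one-liner rebuilds and re-sorts the set for every item. A sketch that builds the lookup once, assuming the items are hashable and mutually comparable; number_sorted is a made-up name:

```python
def number_sorted(items):
    # Hypothetical helper: number values by their position in sorted order.
    lookup = {value: i for i, value in enumerate(sorted(set(items)))}
    return [lookup[item] for item in items]

print(number_sorted(['aa', 'b', 'b', 'c', 'aa', 'b', 'a']))
# [1, 2, 2, 3, 1, 2, 0]
```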

Source: https://habr.com/ru/post/1674818/

