How to evaluate and add a string to a numpy array element

Have this piece of code I'm trying to optimize. It uses lists and works.

series1 = np.asarray(range(10)).astype(float)
series2 = series1[::-1]

ntup = zip(series1,series2)
[['', 't:'+str(series2)][series1 > series2] for series1,series2 in ntup ]
 #['', '', '', '', '', 't:4.0', 't:3.0', 't:2.0', 't:1.0', 't:0.0']

Trying to use np.where()here. Is there a solution with numpy. (Without using a series)

series1 = np.asarray(range(10)).astype(float)
series2 = series1[::-1]  

np.where(series1 >  series2 ,'t:'+ str(series2),'' )

The results are as follows:

array(['', '', '', '', '', 't:[ 9.  8.  7.  6.  5.  4.  3.  2.  1.  0.]',
       't:[ 9.  8.  7.  6.  5.  4.  3.  2.  1.  0.]',
       't:[ 9.  8.  7.  6.  5.  4.  3.  2.  1.  0.]',
       't:[ 9.  8.  7.  6.  5.  4.  3.  2.  1.  0.]',
       't:[ 9.  8.  7.  6.  5.  4.  3.  2.  1.  0.]'], 
      dtype='|S43')
+4
source share
3 answers

We can use a vector approach based on

  • np.core.defchararray.addto add a string 't:'with valid strings and

  • np.where select based on the conditional statement and add or just use the default value of the empty string,

So, we would have such an implementation -

np.where(series1>series2,np.core.defchararray.add('t:',series2.astype(str)),'')

Strengthen it!

np.core.defchararray.add series1>series2, .

, :

mask = series1>series2
out = np.full(series1.size,'',dtype='U34')
out[mask] = np.core.defchararray.add('t:',series2[mask].astype(str))

:

def vectorized_app1(series1,series2):
    mask = series1>series2
    return np.where(mask,np.core.defchararray.add('t:',series2.astype(str)),'')

def vectorized_app2(series1,series2):
    mask = series1>series2
    out = np.full(series1.size,'',dtype='U34')
    out[mask] = np.core.defchararray.add('t:',series2[mask].astype(str))
    return out

-

In [283]: # Setup input arrays
     ...: series1 = np.asarray(range(10000)).astype(float)
     ...: series2 = series1[::-1]
     ...: 

In [284]: %timeit [['', 't:'+str(s2)][s1 > s2] for s1,s2 in zip(series1, series2)]
10 loops, best of 3: 32.1 ms per loop # OP/@hpaulj soln

In [285]: %timeit vectorized_app1(series1,series2)
10 loops, best of 3: 20.5 ms per loop

In [286]: %timeit vectorized_app2(series1,series2)
100 loops, best of 3: 10.4 ms per loop

OP in comments, , , dtype series2 . , U32, dtype , str dtype, .. series2.astype('U32') np.core.defchararray.add. :

In [290]: %timeit vectorized_app1(series1,series2)
10 loops, best of 3: 20.1 ms per loop

In [291]: %timeit vectorized_app2(series1,series2)
100 loops, best of 3: 10.1 ms per loop

, !

+2

, . , , , .

In [521]: series1=[float(i) for i in range(10)]
In [522]: series2=series1[::-1]
In [523]: [['', 't:'+str(s2)][s1 > s2] for s1,s2 in zip(series1, series2)]
Out[523]: ['', '', '', '', '', 't:4.0', 't:3.0', 't:2.0', 't:1.0', 't:0.0']

@Divaker, np.char.add, . , , . , .

=========

array, @Divakar

In [539]: aseries1=np.array(series1)
In [540]: aseries2=np.array(series2)
In [541]: np.where(aseries1>aseries2, np.char.add('t:',aseries2.astype('U3')), '
     ...: ')
Out[541]: 
array(['', '', '', '', '', 't:4.0', 't:3.0', 't:2.0', 't:1.0', 't:0.0'], 
      dtype='<U5')

:

In [542]: timeit [['', 't:'+str(s2)][s1 > s2] for s1,s2 in zip(series1, series2)
     ...: ]
100000 loops, best of 3: 15.5 µs per loop

In [543]: timeit np.where(aseries1>aseries2, np.char.add('t:',aseries2.astype('U3')), '')
10000 loops, best of 3: 63 µs per loop
+1

This works for me. Fully vectorized.

import numpy as np
series1 = np.arange(10)
series2 = series1[::-1]
empties = np.repeat('', series1.shape[0])
ts = np.repeat('t:', series1.shape[0])
s2str = series2.astype(np.str)
m = np.vstack([empties, np.core.defchararray.add(ts, s2str)])
cmp = np.int64(series1 > series2)
idx = np.arange(m.shape[1])
res = m[cmp, idx]
print res 
+1
source

Source: https://habr.com/ru/post/1653208/


All Articles