Approach No. 1
Perform an elementwise comparison for inequality, reduce with ANY along the last axis, and finally count the rows that differ -
(a!=b).any(-1).sum()
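As a small worked example of the three steps above (the sample arrays here are made up for illustration and are not from the original post), the intermediate results look like this:

```python
import numpy as np

# Hypothetical sample data: the second and third rows differ between a and b
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
b = np.array([[1, 2, 3],
              [4, 0, 6],
              [0, 0, 0]])

neq = a != b                # elementwise boolean mask, shape (3, 3)
row_differs = neq.any(-1)   # True where a row has at least one mismatch
count = row_differs.sum()   # number of rows that differ

print(row_differs)  # [False  True  True]
print(count)        # 2
```

The reduction over the last axis is what turns a per-element mask into a per-row one, so the final sum counts rows rather than individual mismatched elements.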
Approach No. 2
Probably faster with np.count_nonzero for counting boolean elements -
np.count_nonzero((a!=b).any(-1))
Approach No. 3
Faster with views, which collapse each row into a single void-dtype element so that every row is compared as one item -
def view1D(a, b):
    # Views need C-contiguous memory
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    # One void element spanning all the bytes of a row
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

a1D, b1D = view1D(a, b)
out = np.count_nonzero(a1D != b1D)
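A minimal, self-contained sketch to check that the view-based count agrees with the elementwise approach. The array sizes, the seed and the fill value 9 are assumptions chosen for illustration. It also assumes that both arrays share the same dtype and shape, and that a byte-wise row comparison is acceptable; for floating-point data, values such as -0.0 and 0.0 compare equal elementwise but have different bytes.

```python
import numpy as np

def view1D(a, b):
    # Views need C-contiguous memory
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    # One void element spanning all the bytes of a row
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

# Illustrative data (assumed sizes and seed)
rng = np.random.RandomState(0)
a = rng.randint(0, 9, (1000, 20))    # values in 0..8
b = a.copy()
changed = rng.choice(len(a), 100, replace=False)
b[changed] = 9                       # 9 never occurs in a, so these rows really differ

a1D, b1D = view1D(a, b)
count_views = np.count_nonzero(a1D != b1D)       # view-based count
count_ref = np.count_nonzero((a != b).any(-1))   # elementwise count

print(count_views, count_ref)
```

Both counts should equal the number of modified rows, since each modified row is guaranteed to contain values that were absent from the original.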
Benchmarking
In [32]: np.random.seed(0)
...: m,n = 10000,100
...: a = np.random.randint(0,9,(m,n))
...: b = a.copy()
...:
...:
...: b[np.random.choice(len(a), len(a)//10, replace=0)] = 0
In [33]: %timeit (a!=b).any(-1).sum()
...: %timeit np.count_nonzero((a!=b).any(-1))
...: %timeit np.any(a - b, axis=1).sum()
1000 loops, best of 3: 1.14 ms per loop
1000 loops, best of 3: 1.08 ms per loop
100 loops, best of 3: 2.33 ms per loop
In [34]: %%timeit
...: a1D,b1D = view1D(a,b)
...: out = np.count_nonzero(a1D!=b1D)
1000 loops, best of 3: 797 µs per loop