Seems like the fastest way to do this is simple a[mask[a]]. I wrote a quick test that shows the difference in speed between the two methods depending on the coverage of the mask, p (the number of true elements / n).
import timeit
import matplotlib.pyplot as plt
import numpy as np
n = 10000
p = 0.25
slow_times = []
fast_times = []
p_space = np.linspace(0, 1, 100)
for p in p_space:
mask = np.random.choice([True, False], n, p=[p, 1 - p])
a = np.arange(n)
np.random.shuffle(a)
y = np.array([x for x in a if mask[x]])
z = a[mask[a]]
n_test = 100
t1 = timeit.timeit(lambda: np.array([x for x in a if mask[x]]), number=n_test)
t2 = timeit.timeit(lambda: a[mask[a]], number=n_test)
slow_times.append(t1)
fast_times.append(t2)
plt.plot(p_space, slow_times, label='slow')
plt.plot(p_space, fast_times, label='fast')
plt.xlabel('p (# true items in mask)')
plt.ylabel('time (ms)')
plt.legend()
plt.title('Speed of method vs. coverage of mask')
plt.show()
Who gave me this plot

Thus, this method is much faster, regardless of the coverage of the mask.
source
share