The line
A[start:end] = B[mask]
will, according to the Python language definition, first evaluate the right-hand side, producing a new array that contains the selected rows of B and occupies additional memory. The most efficient pure-Python way I know of to avoid this is an explicit loop:
# Python 2; on Python 3, use the built-in zip instead of izip.
from itertools import izip, compress
for i, b in izip(range(start, end), compress(B, mask)):
    A[i] = b
Of course, this will be much slower than your original code, but it uses only O(1) additional memory. Also note that itertools.compress() is only available in Python 2.7, or in 3.1 or higher.
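For reference, here is a minimal Python 3 sketch of the same loop, wrapped in a function so it can be tried in isolation. The name assign_masked and the sample data are my own, purely for illustration; plain lists stand in for the arrays, on the assumption that the loop relies only on iteration and item assignment and so behaves the same way with NumPy arrays.

```python
from itertools import compress


def assign_masked(A, start, end, B, mask):
    """Copy the items of B selected by mask into A[start:end], one at a time.

    Hypothetical helper: no intermediate container holding B[mask] is built,
    so the extra memory used is O(1).
    """
    # compress() lazily yields the items of B whose mask entry is truthy;
    # zip() pairs each one with the next target index in A.
    for i, b in zip(range(start, end), compress(B, mask)):
        A[i] = b


# Illustrative data only.
A = [0, 0, 0, 0, 0, 0]
B = [10, 20, 30, 40]
mask = [True, False, True, True]

assign_masked(A, 1, 4, B, mask)
print(A)  # [0, 10, 30, 40, 0, 0]
```

Note that zip() stops at the shorter of the two iterables, so if end - start does not match the number of selected items, the loop silently copies fewer items instead of raising an error as the slice assignment would.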