I saw several discussions on this forum about applying a median filter with a moving window, but my application has a special feature.
I have a 3D array of 750x12000x10000 size , and I need to apply a media filter to get a 2D array (12000x10000). To do this, each median calculation should take into account a fixed neighborhood window (usually 100x100 ) and all values of the z axis . There are zero values in the matrix, and they cannot be considered to calculate the median. To process real data, I use numpy.memmap:
fp = np.memmap(filename, dtype='float32', mode='w+', shape=(750, 12000, 10000))
To execute the real data stored in memmap, my input array is divided into several pieces, but to increase the speed of my tests I will use a reduced array (11, 200, 300) and a smaller window (11, 5, 5) or ( 11, 50, 50), and I expect a result matrix (200, 300):
import numpy as np
from timeit import default_timer as timer
zsize, ysize, xsize = (11, 200, 300)
w_size = 5
m_in=np.arange(zsize*ysize*xsize).reshape(zsize, ysize, xsize)
m_out = np.zeros((ysize, xsize))
First, I tried the brute force method, but it is very slow as expected (even for a small array):
start = timer()
for l in range(0, ysize):
i_l = max(0, l - w_size/2)
o_l = min(ysize, i_l+w_size/2)
for c in range(0, xsize):
i_c = max(0, c - w_size/2)
o_c = min(xsize, i_c+w_size/2)
values = m_in[:, i_l:o_l, i_c:o_c]
values = values[np.nonzero(values)]
value = np.median(values)
m_out[l, c] = value
end = timer()
print("Time elapsed: %f seconds"%(end-start))
To remove double-for, I tried using itertools.product, but it still remains slow:
from itertools import product
for l, c in product(range(0, ysize), range(0, xsize)):
i_l = max(0, l - w_size/2)
o_l = min(ysize, i_l+w_size/2)
i_c = max(0, c - w_size/2)
o_c = min(xsize, i_c+w_size/2)
values = m_in[:, i_l:o_l, i_c:o_c]
values = values[np.nonzero(values)]
value = np.median(values)
m_out[l, c] = value
So, I tried using the performance of numpy matrix operations, so I tried with scipy.ndimage:
from scipy import ndimage
m_all = ndimage.median_filter(m_in, size=(zsize, w_size, w_size))
m_out[:] = m_all[0]
and scipy.signalalso:
m_all = signal.medfilt(m_in, kernel_size=(zsize, w_size, w_size))
m_out[:] = m_all[0] #only first layer of 11, considering all the same
scipy , , , (all_z, w_size, w_size).
, ((11, 200, 300) (11, 50, 50)). , ( 750x12000x10000 750x100x100).
, - (3D- 2D-) ?
Edit1
. 750 15 . , - .