SciPy implements standard N-dimensional convolutions, so the array being convolved and the kernel must both be N-dimensional.
A quick fix is to add an extra dimension to Y, so that Y becomes 3-dimensional:
result = signal.convolve(X, Y[..., None], 'valid')
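I am assuming here that the last axis is the image index, as in your example: [width, height, image_idx] (or [height, width, image_idx]). If it is the other way around and the images are indexed along the first axis (as is more common for C-ordered arrays), replace Y[..., None] with Y[None, ...].
Y[..., None] adds an extra axis to Y, making it 3-dimensional with shape [kernel_width, kernel_height, 1], and therefore a valid 3-dimensional convolution kernel.
NOTE: This assumes that all images in the mini-batch have the same width x height, which is standard for CNN mini-batches.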
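As a quick check of the reasoning above, here is a minimal sketch (the array sizes and the names `X_first`, `per_image`, and `matches` are made up for illustration) comparing the 3-dimensional convolution against convolving each image separately in 2-D, for both axis layouts:

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8, 3))   # hypothetical stack: [height, width, image_idx]
Y = rng.standard_normal((3, 3))      # 2-D kernel

# Images on the last axis: append a length-1 axis to the kernel.
stacked = signal.convolve(X, Y[..., None], 'valid')

# Reference: convolve every image separately in 2-D.
per_image = np.stack(
    [signal.convolve2d(X[..., i], Y, 'valid') for i in range(X.shape[-1])],
    axis=-1,
)

# Images on the first axis: prepend the length-1 axis instead.
X_first = np.moveaxis(X, -1, 0)      # [image_idx, height, width]
stacked_first = signal.convolve(X_first, Y[None, ...], 'valid')

shapes = (Y[..., None].shape, stacked.shape, stacked_first.shape)
matches = np.allclose(stacked, per_image) and np.allclose(
    np.moveaxis(stacked_first, 0, -1), per_image
)
print(shapes, matches)
```

Because the kernel has length 1 along the image axis, no mixing between images takes place, which is why the result should agree with the per-image 2-D convolution.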
EDIT: Some timings, as suggested by another user.
The test setup is as follows:
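import numpy as np
from scipy import ndimage, signal

# %timeit is an IPython magic, so run this in IPython or Jupyter
def test(S, N, K):
    """S: image size, N: number of images, K: kernel size"""
    a = np.random.randn(S, S, N)
    b = np.random.randn(K, K)
    # ndimage.convolve returns a full-size result; crop it to the 'valid' region
    # (a tuple of slices, since indexing with a list of slices fails in recent NumPy)
    valid = (slice(K//2, -K//2+1), slice(K//2, -K//2+1))
    %timeit signal.convolve(a, b[..., None], 'valid')
    %timeit signal.fftconvolve(a, b[..., None], 'valid')
    %timeit ndimage.convolve(a, b[..., None])[valid]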
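Before comparing speed, it is worth confirming that the three calls compute the same thing. The following is a small sanity check, assuming an odd kernel size K (the sizes and the names `half`, `ref`, `fft`, `nd`, and `agree` are chosen only for this illustration):

```python
import numpy as np
from scipy import ndimage, signal

S, N, K = 20, 4, 5                      # small, hypothetical sizes; K is odd
rng = np.random.default_rng(0)
a = rng.standard_normal((S, S, N))
b = rng.standard_normal((K, K))

# ndimage.convolve returns an output the same size as the input,
# so trim K//2 pixels from each border of the two image axes.
half = K // 2
valid = (slice(half, -half), slice(half, -half))

ref = signal.convolve(a, b[..., None], 'valid')
fft = signal.fftconvolve(a, b[..., None], 'valid')
nd = ndimage.convolve(a, b[..., None])[valid]

agree = np.allclose(ref, fft) and np.allclose(ref, nd)
out_shape = nd.shape
print(out_shape, agree)
```

The cropped region does not depend on the boundary mode of ndimage.convolve, because border handling only affects the K//2 pixels that are trimmed away.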
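Below are the results for different configurations. In each test the three timings are, in order, signal.convolve, signal.fftconvolve and ndimage.convolve.
Varying the image size S: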
>>> test(100, 50, 11)
1 loop, best of 3: 909 ms per loop
10 loops, best of 3: 116 ms per loop
10 loops, best of 3: 54.9 ms per loop
>>> test(1000, 50, 11)
1 loop, best of 3: 1min 51s per loop
1 loop, best of 3: 16.5 s per loop
1 loop, best of 3: 5.66 s per loop
Varying the number of images N:
>>> test(100, 5, 11)
10 loops, best of 3: 90.7 ms per loop
10 loops, best of 3: 26.7 ms per loop
100 loops, best of 3: 5.7 ms per loop
>>> test(100, 500, 11)
1 loop, best of 3: 9.75 s per loop
1 loop, best of 3: 888 ms per loop
1 loop, best of 3: 727 ms per loop
Varying the kernel size K:
>>> test(100, 50, 5)
1 loop, best of 3: 217 ms per loop
10 loops, best of 3: 100 ms per loop
100 loops, best of 3: 11.4 ms per loop
>>> test(100, 50, 31)
1 loop, best of 3: 4.39 s per loop
1 loop, best of 3: 220 ms per loop
1 loop, best of 3: 560 ms per loop
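So, in short, ndimage.convolve is the fastest in every test except when the kernel is very large (K = 31 in the last test), where signal.fftconvolve is faster.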
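To rerun the comparison outside IPython, something like the sketch below could be used. The helper `time_convolutions` is hypothetical and not part of SciPy. It assumes an odd K of at least 3, and it passes method='direct' to signal.convolve because recent SciPy versions default to method='auto', which may pick the FFT implementation on its own and hide the difference between the first two rows.

```python
import timeit

import numpy as np
from scipy import ndimage, signal


def time_convolutions(S, N, K, repeat=3):
    """Return the best wall-clock time in seconds for each method."""
    rng = np.random.default_rng(0)
    a = rng.standard_normal((S, S, N))
    kernel = rng.standard_normal((K, K))[..., None]
    half = K // 2                        # assumes K is odd and >= 3
    valid = (slice(half, -half), slice(half, -half))

    candidates = {
        'signal.convolve': lambda: signal.convolve(a, kernel, 'valid', method='direct'),
        'signal.fftconvolve': lambda: signal.fftconvolve(a, kernel, 'valid'),
        'ndimage.convolve': lambda: ndimage.convolve(a, kernel)[valid],
    }
    # Run each candidate `repeat` times and keep the fastest run.
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeat))
        for name, fn in candidates.items()
    }


if __name__ == '__main__':
    for name, seconds in time_convolutions(100, 50, 11).items():
        print(f'{name:>20}: {seconds * 1e3:.1f} ms')
```

Absolute numbers will depend on the machine and the SciPy version, so only the relative ordering is comparable with the timings above.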