The stride determines how the filter moves along the input image (tensor). Nothing prevents you from striding differently along different axes: for example, stride=[1, 2]
means moving 1px at a time along axis 0 and 2px at a time along axis 1. This particular combination is not common, but it is possible.
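To make the per-axis behaviour concrete, here is a minimal pure-Python sketch of a strided 2D convolution without padding. The function name conv2d_valid and the toy image and kernel are made up for illustration; this is not how any particular library implements it.

```python
def conv2d_valid(image, kernel, stride=(1, 1)):
    """Slide `kernel` over `image` with no padding.

    `stride` is (step along axis 0, step along axis 1).
    Like deep learning frameworks, this computes a cross-correlation
    (the kernel is not flipped).
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    sh, sw = stride
    out = []
    # The top-left corner of the window jumps by `sh` rows and `sw` columns.
    for top in range(0, ih - kh + 1, sh):
        row = []
        for left in range(0, iw - kw + 1, sw):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Toy 5x6 image (values 0..29) and a 2x2 kernel, both assumed for the example.
image = [[r * 6 + c for c in range(6)] for r in range(5)]
kernel = [[1, 0], [0, 1]]

result = conv2d_valid(image, kernel, stride=(1, 2))
print(len(result), len(result[0]))  # 4 3
```

With stride=(1, 2) the window visits every row offset but only every second column offset, so the 5x6 input yields a 4x3 output instead of the 4x5 output that stride=(1, 1) would give.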
The TensorFlow API goes even further and accepts a separate stride for each axis of the 4D input tensor (see tf.nn.conv2d). With this API you can simply set strides=[1, 2, 2, 1], which makes perfect sense: the convolution should process every image (the first 1) and every input channel (the last 1), but use a 2x2 stride over the spatial dimensions. As far as the convolution itself is concerned, the operation is defined for any strides array; however, not all values are equally useful.
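The effect of strides=[1, 2, 2, 1] on the output shape can be sketched without running TensorFlow. The helper below, conv2d_output_shape, is a hypothetical name and not part of any library; it assumes the NHWC input layout and the [height, width, in_channels, out_channels] filter layout, and it only covers the common case where the batch and channel strides are 1.

```python
import math


def conv2d_output_shape(input_shape, filter_shape, strides, padding="VALID"):
    """Output shape of a 2D convolution for an NHWC input (illustrative helper)."""
    n, h, w, _ = input_shape
    fh, fw, _, out_c = filter_shape
    s_batch, sh, sw, s_chan = strides
    # Assumption of this sketch: only spatial striding is handled.
    if s_batch != 1 or s_chan != 1:
        raise ValueError("this sketch only handles strides of the form [1, sh, sw, 1]")
    if padding == "VALID":
        # No padding: the window must fit entirely inside the input.
        oh = (h - fh) // sh + 1
        ow = (w - fw) // sw + 1
    elif padding == "SAME":
        # Input is padded so that the output size depends only on the stride.
        oh = math.ceil(h / sh)
        ow = math.ceil(w / sw)
    else:
        raise ValueError("padding must be 'VALID' or 'SAME'")
    return (n, oh, ow, out_c)


# A batch of 8 RGB images of 32x32, sixteen 3x3 filters (sizes assumed for the example).
# Roughly corresponds to tf.nn.conv2d(x, w, strides=[1, 2, 2, 1], padding=...).
valid_shape = conv2d_output_shape((8, 32, 32, 3), (3, 3, 3, 16), [1, 2, 2, 1], "VALID")
same_shape = conv2d_output_shape((8, 32, 32, 3), (3, 3, 3, 16), [1, 2, 2, 1], "SAME")
print(valid_shape)  # (8, 15, 15, 16)
print(same_shape)   # (8, 16, 16, 16)
```

The batch size is untouched and the channel count is set by the number of filters; only the two spatial dimensions shrink, by roughly the stride factor.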
We strongly recommend this CS231n tutorial for more details on this.
Maxim source