Convnets can work with any input image size, as long as it is large enough. However, if you have a fully connected layer at the end, that layer requires a fixed input size. Therefore, the complete network requires a fixed input image size.
However, you can remove the fully connected layer and work with convolutional layers only. You can make the last convolutional layer have as many filters as there are classes. But you need a single value per class that indicates the probability of that class. Therefore, you apply a pooling filter across the whole remaining feature map. This pooling is "global" because it is always as large as necessary. In contrast, regular pooling layers have a fixed size (for example, 2x2 or 3x3).
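To make this concrete, here is a minimal NumPy sketch of global average pooling, not tied to any particular library. It assumes the last convolutional layer outputs one feature map per class in a (channels, height, width) layout. The names `global_average_pool`, `softmax`, and `num_classes`, as well as the map sizes, are made up for illustration.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each channel over its full spatial extent.

    feature_maps: array of shape (channels, height, width).
    Returns an array of shape (channels,), whatever height and width are.
    """
    return feature_maps.mean(axis=(1, 2))

def softmax(scores):
    """Turn per-class scores into probabilities that sum to 1."""
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

num_classes = 10  # hypothetical: one feature map per class
rng = np.random.default_rng(0)

# Stand-ins for the last conv layer's output on two different input image sizes.
small = rng.standard_normal((num_classes, 7, 7))
large = rng.standard_normal((num_classes, 13, 20))

probs_small = softmax(global_average_pool(small))
probs_large = softmax(global_average_pool(large))

print(probs_small.shape, probs_large.shape)  # (10,) (10,)
```

Both inputs give one value per class even though their spatial sizes differ, which is why no fixed input image size is needed. Global max pooling would work the same way with `max` in place of `mean`.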
This is a general concept. You can also find global pooling in other libraries, for example Lasagne. If you want a good reference in the literature, I recommend reading the paper Network In Network.