Think of a fully connected layer as a simple matrix multiplication: a 1xN input times an NxM weight matrix gives a 1xM output.
Suppose we pass data of dimension 56x56x3 as the input to a fully connected layer. Let the weight dimensions be the unknowns NxM, and set num_output = 4096.
To compute the output, the fully connected layer reshapes (flattens) the 56x56x3 input into a 1xN row vector, i.e. 1x(56x56x3) = 1x9408.
This gives:
N = 9408
M = num_output = 4096
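The flattening step can be checked with a few lines of NumPy. This is only a sketch of the shape arithmetic, not Caffe's actual implementation; the array `x` and the variable names are placeholders chosen for illustration.

```python
import numpy as np

# Placeholder input with the shape from the example; the values are irrelevant here.
x = np.zeros((56, 56, 3), dtype=np.float32)
num_output = 4096

# Flatten the input into a single row vector of shape 1 x N.
x_flat = x.reshape(1, -1)

N = x_flat.shape[1]   # 56 * 56 * 3
M = num_output
print(x_flat.shape, N, M)
```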
In effect, we are multiplying a (1x9408) matrix by a (9408x4096) matrix, which gives a 1x4096 output.
If num_output were changed to, say, 100, the layer would instead multiply a (1x9408) matrix by a (9408x100) matrix, giving a 1x100 output.
Thus, increasing num_output increases the number of weight parameters the model has to learn: N x M, i.e. 9408 x 4096 = 38,535,168 weights versus 9408 x 100 = 940,800.
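To see how the weight count scales with num_output, here is a rough NumPy sketch of the multiplication for both values. The helper `fc_forward` is hypothetical, the zero weights are a stand-in for learned values, and any bias term is left out, so treat it as an illustration of the shapes rather than of how Caffe implements the layer.

```python
import numpy as np

def fc_forward(x, num_output):
    """Flatten x to 1 x N and multiply it by an N x num_output weight matrix."""
    x_flat = x.reshape(1, -1)
    n = x_flat.shape[1]
    # Zero weights are a stand-in; a real layer would hold learned values.
    weights = np.zeros((n, num_output), dtype=np.float32)
    y = x_flat @ weights
    return y, weights

x = np.zeros((56, 56, 3), dtype=np.float32)
for m in (4096, 100):
    y, w = fc_forward(x, m)
    # Output shape, weight shape, and number of weight parameters (N * M).
    print(m, y.shape, w.shape, w.size)
```

Only the second dimension of the weight matrix changes with num_output; the input size N is fixed by the data.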