Think of a fully connected layer as a simple matrix multiplication: a 1xN input times an NxM weight matrix gives a 1xM output.
Suppose data of dimension 56x56x3 is passed as the input of a fully connected layer. Let the weight matrix have the (so far unknown) dimension NxM, and set num_output = 4096.
To compute its output, the fully connected layer reshapes (flattens) the 56x56x3 input into a 1xN row vector: 1x(56x56x3) = 1x9408.
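As a minimal sketch of this flattening step in plain Python (the zero-filled nested-list input and the helper name flatten are illustrative assumptions, not part of any framework):

```python
# Hypothetical 56x56x3 input, stored as nested lists filled with zeros.
H, W, C = 56, 56, 3
data = [[[0.0 for _ in range(C)] for _ in range(W)] for _ in range(H)]

def flatten(volume):
    """Flatten a 3-D nested list into a single row (a 1xN vector)."""
    return [value for plane in volume for row in plane for value in row]

row_vector = flatten(data)
N = len(row_vector)
print(N)  # 56 * 56 * 3 = 9408
```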
This gives:
N = 9408
M = num_output = 4096
In effect, we are computing the product (1x9408) matrix x (9408x4096) matrix, which yields a 1x4096 output.
If num_output were changed to, say, 100, the layer would instead compute (1x9408) matrix x (9408x100) matrix, giving a 1x100 output.
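The multiplication itself can be sketched in plain Python as follows. The helper fc_forward and the constant placeholder input and weights are assumptions chosen only to make the shapes visible; a real framework would use optimized routines and typically adds a bias term as well.

```python
def fc_forward(x, weights):
    """Multiply a 1xN row vector by an NxM weight matrix, returning a 1xM row."""
    n = len(x)
    m = len(weights[0])
    assert len(weights) == n, "weight matrix must have N rows"
    out = [0.0] * m
    for i in range(n):
        xi = x[i]
        row = weights[i]
        for j in range(m):
            out[j] += xi * row[j]
    return out

N, M = 9408, 100                          # flattened input size, num_output
x = [1.0] * N                             # placeholder 1xN input
weights = [[0.5] * M for _ in range(N)]   # placeholder NxM weights

y = fc_forward(x, weights)
print(len(y))                             # number of outputs, equal to M
```

Changing M changes only the number of columns in the weight matrix and the length of the output; the flattened input stays 1x9408.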
Thus, increasing num_output increases the number of weight parameters that the model has to learn.
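To make the growth concrete, the weight count is simply N x M. The short sketch below compares the two num_output values used above; the function name weight_count is hypothetical, and bias terms are left out because only the weight matrix is discussed here.

```python
def weight_count(n_inputs, num_output):
    """Number of entries in an NxM weight matrix (biases not counted)."""
    return n_inputs * num_output

N = 56 * 56 * 3  # 9408 flattened inputs

for num_output in (100, 4096):
    print(num_output, weight_count(N, num_output))
# num_output = 100   ->    940800 weights
# num_output = 4096  ->  38535168 weights
```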