The variants of Xavier initialization for neural network weights that I have found all mention fan-in and fan-out. Could you tell me how these two parameters are calculated? In particular, for these two examples:
1) initializing the weights of a convolutional layer whose filter has shape [5, 5, 3, 6] (width, height, input depth, output depth);
2) initializing the weights of a fully connected layer with shape [400, 120] (i.e., mapping 400 input variables to 120 output variables).
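To make the question concrete, here is a minimal sketch of the convention I have seen most often, which I am assuming rather than asserting: fan-in is the receptive field size times the input depth, and fan-out is the receptive field size times the output depth. The helper names `compute_fans` and `xavier_uniform_limit` are my own and do not come from any library.

```python
import math

def compute_fans(shape):
    """Return (fan_in, fan_out) for a weight shape.

    Assumed convention: the last two dimensions are (input depth, output depth)
    and all leading dimensions form the receptive field.
    """
    if len(shape) < 2:
        raise ValueError("expected a weight shape with at least 2 dimensions")
    # Product of the spatial dimensions; an empty product is 1 for a dense layer.
    receptive_field = math.prod(shape[:-2])
    fan_in = shape[-2] * receptive_field
    fan_out = shape[-1] * receptive_field
    return fan_in, fan_out

def xavier_uniform_limit(shape):
    """Bound of the uniform Xavier (Glorot) range: sqrt(6 / (fan_in + fan_out))."""
    fan_in, fan_out = compute_fans(shape)
    return math.sqrt(6.0 / (fan_in + fan_out))

conv_fans = compute_fans([5, 5, 3, 6])   # 5*5*3 = 75 and 5*5*6 = 150
dense_fans = compute_fans([400, 120])    # 400 and 120
print(conv_fans, dense_fans)
```

Under this assumption the two examples would give (75, 150) and (400, 120), but I am not sure this is the definition every variant uses.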
Thanks!
Fanta