I would like to train a quantized network, i.e. use quantized weights during the forward pass to calculate the loss, and then update the underlying full-precision floating-point weights during the backward pass.
This question has already been asked here, but it has not been answered.
Note that in my case, "fake quantization" is sufficient. This means that the weights can still be stored as 32-bit floating-point values, as long as they only take values that are representable at a low bit width.
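To make clear what I mean by fake quantization, here is a minimal sketch in plain Python. It is not TensorFlow's implementation; the function name `fake_quantize`, its parameters, and the example values are my own, hypothetical choices, and I assume a fixed, known range for the weights.

```python
def fake_quantize(x, x_min, x_max, num_bits=8):
    """Round x to the nearest of 2**num_bits evenly spaced levels in [x_min, x_max].

    The result is still an ordinary float; only the set of values it can take
    is restricted. (Hypothetical helper, not a TensorFlow function.)
    """
    steps = 2 ** num_bits - 1                 # number of intervals between the levels
    scale = (x_max - x_min) / steps           # width of one quantization step
    clipped = min(max(x, x_min), x_max)       # saturate values outside the range
    index = round((clipped - x_min) / scale)  # integer level in 0..steps
    return x_min + index * scale              # map the level back to a float


# Example: 2 bits over [-1, 1] gives the four levels -1, -1/3, 1/3, 1.
weights = [-1.3, -0.26, 0.1, 0.4, 0.91]
quantized = [fake_quantize(w, -1.0, 1.0, num_bits=2) for w in weights]
print(quantized)
```

Every entry of `quantized` is a float, but each one lies on one of the four levels, which is all I need.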
In a blog post, Pete Warden writes:
"[...] we do have support for "fake quantization" operators. If you include these in your graphs at the points where quantization is expected to occur (for example, after convolutions), then in the forward pass the float values will be rounded to the specified number of levels (typically 256) to simulate the effect of quantization."
The operators he mentions can be found in the TensorFlow API.
Can someone tell me how to use these operators? If I call them after, for example, a conv layer in my model definition, why would this quantize the weights of that layer rather than its outputs (activations)?
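To make the distinction concrete, here is a minimal sketch of the behaviour I am after, again in plain Python rather than TensorFlow. The quantizer is applied to the weight itself before it is used in the forward pass, and the gradient is applied to the full-precision weight as if the rounding were the identity (a straight-through estimator, which is my assumption about how this is usually done). All names and values are hypothetical, and the target weight is deliberately chosen to lie on a quantization level so that this toy example settles.

```python
def fake_quantize(x, x_min=-1.0, x_max=1.0, num_bits=2):
    """Round x to the nearest of 2**num_bits levels in [x_min, x_max] (hypothetical helper)."""
    steps = 2 ** num_bits - 1
    scale = (x_max - x_min) / steps
    clipped = min(max(x, x_min), x_max)
    return x_min + round((clipped - x_min) / scale) * scale


# Toy regression data for y = x / 3; 1/3 is one of the four 2-bit levels.
data = [(1.0, 1.0 / 3.0), (2.0, 2.0 / 3.0), (-1.0, -1.0 / 3.0)]

w_float = 0.9   # full-precision weight that the optimizer updates
lr = 0.05       # learning rate

for epoch in range(50):
    for x, y in data:
        w_q = fake_quantize(w_float)      # forward pass sees the quantized weight
        pred = w_q * x                    # the "layer": a single multiplication
        grad_w_q = 2.0 * (pred - y) * x   # d/dw_q of the squared error (pred - y)**2
        # Straight-through estimator: treat d(w_q)/d(w_float) as 1, so the
        # gradient w.r.t. the quantized weight updates the float weight.
        w_float -= lr * grad_w_q

w_q_final = fake_quantize(w_float)
print("float weight:", w_float, "quantized weight:", w_q_final)
```

Here the activations `pred` are never quantized; only the weight is. This is the placement I do not know how to express with the TensorFlow operators.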