Hi, thanks again for sharing this repo for reproducing the awesome results.
I am curious whether the BatchNorm layers are folded into the preceding Conv or FC layers during inference.
I ran both static and retrain modes for mobilenetv2. During inference, I found that the BatchNorm variables (mean/var/gamma/beta) are filled with non-trivial values rather than 1s or 0s, and are still involved in the computation graph. Is that working as intended?
(I load the quantized model from the .ckpt and .pb files.)
Hi @jakc4103,
The folding of BatchNorm layers in Graffitist is not an in-place operation. By that I mean, the BN parameters (mean, var, gamma, beta, etc.) are retained as variables, and only the graph is modified to fold them into the weights / biases. As a result, the folded weights and biases are not modified in place; rather, they are computed at run-time by ops that implement the folding. If you're interested in getting the final folded & quantized weights / biases of a convolutional layer, you may pass the dump_quant_params=True argument to the quantize transform like this.
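For reference, here is a minimal NumPy sketch of the standard BN-folding arithmetic being described (function and variable names are illustrative, not Graffitist's actual op names; Graffitist builds equivalent ops into the graph, so this computation happens at run-time from the retained BN variables):

```python
import numpy as np

def fold_batchnorm(w, b, gamma, beta, moving_mean, moving_var, eps=1e-3):
    """Fold BatchNorm params into the preceding conv's weights/bias.

    Illustrative only: the BN variables stay in the checkpoint, and the
    folded values are recomputed from them each time, rather than being
    written back in place. Assumes a conv kernel w of shape
    [kh, kw, cin, cout] and per-output-channel BN params of shape [cout].
    """
    scale = gamma / np.sqrt(moving_var + eps)   # per-channel BN scale
    w_fold = w * scale                          # broadcasts over cout
    b_fold = beta + (b - moving_mean) * scale
    return w_fold, b_fold
```

This is why the BN variables you saw hold real statistics and still feed the graph: they are inputs to the folding ops, not dead parameters.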
Just to be sure: I found the dumped weights are in integer format, while the whole training and inference pipeline runs with FakeQuant ops (computations in FP32). Is that right?
Also, are there any other debugging or experimental flags for quantize()? (e.g. setting the weight calibration method to MAX for training mode)
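To make the FakeQuant question above concrete, here is a minimal sketch of simulated (fake) quantization, assuming symmetric signed quantization; this is a generic illustration, not Graffitist's implementation:

```python
import numpy as np

def fake_quantize(x, scale, num_bits=8):
    """Round to an integer grid, then rescale back to FP32.

    The intermediate integer values q are what a quant-params dump
    would contain; the graph itself keeps computing in FP32 on q * scale.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / scale), qmin, qmax)  # integer grid values
    return q * scale                              # FP32 values used in the graph
```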