
Are variables of batch norm layers folded during inference? #12

Open
jakc4103 opened this issue May 4, 2020 · 2 comments

@jakc4103

jakc4103 commented May 4, 2020

Hi, thanks again for sharing this repo for reproducing the awesome results.

I am curious whether the BatchNorm layers are folded into the preceding Conv or FC layers during inference.
I ran both static and retrain modes for MobileNetV2. At inference time, I found that the BatchNorm variables (mean/var/gamma/beta) are filled with some values rather than 1s or 0s, and they still appear in the computation graph. Is that working as intended?
(I load the quantized model from the .ckpt and .pb files.)

@sjain-stanford
Member

Hi @jakc4103,
The folding of BatchNorm layers in Graffitist is not an in-place operation. By that I mean the BN parameters (mean, var, gamma, beta, etc.) are retained as variables, and only the graph is modified to fold them with the weights / biases. As a result, the folded weights and biases are not modified in place; rather, they are computed at run-time by ops that implement the folding. If you're interested in the final folded and quantized weights / biases of a convolutional layer, you can pass the dump_quant_params=True argument to the quantize transform like this:

python $groot/graffitize.pyc \
    --in_graph $in_graph \
    --out_graph $infquant_graph \
    --inputs $input_node \
    --outputs $output_node \
    --input_shape $input_shape \
    --transforms 'fix_input_shape' \
                 'fold_batch_norms' \
                 'remove_training_nodes' \
                 'strip_unused_nodes' \
                 'preprocess_layers' \
                 'quantize(dump_quant_params=True, ...)'

This will save out an HDF5 dump of quantized weights and biases, which will have BN params folded in and quantized appropriately.
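For reference, the folding those run-time ops compute is the standard batch-norm fold. A minimal NumPy sketch of the arithmetic (illustrative only, not Graffitist's actual implementation; the function name and the HWIO weight layout are assumptions):

import numpy as np

def fold_batch_norm(w, b, gamma, beta, mean, var, eps=1e-3):
    # Illustrative sketch of standard BN folding (not Graffitist's code):
    # conv(x, w) + b followed by BN equals conv(x, w_fold) + b_fold.
    scale = gamma / np.sqrt(var + eps)   # per-output-channel scale
    w_fold = w * scale                   # broadcasts over the output-channel axis (HWIO layout assumed)
    b_fold = (b - mean) * scale + beta   # fold running mean and beta into the bias
    return w_fold, b_fold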

@jakc4103
Author

jakc4103 commented May 5, 2020

@sjain-stanford thanks for the kind reply!

Just to be sure: I found that the dumped weights are in integer format, while the whole training and inference pipelines use FakeQuant (with computations in FP32). Is that right?
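(For context, a FakeQuant node is a quantize-then-dequantize step, so tensors stay FP32 but only take values on the integer grid. A generic sketch of that behavior, not Graffitist's exact op:

import numpy as np

def fake_quant(x, x_min, x_max, num_bits=8):
    # Generic fake quantization: map to the integer grid, then back to float.
    levels = 2 ** num_bits - 1
    scale = (x_max - x_min) / levels
    q = np.round((np.clip(x, x_min, x_max) - x_min) / scale)
    return q * scale + x_min  # FP32 values restricted to levels + 1 points
)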

Also, are there any other debugging or experimental flags for quantize()? (e.g. setting the weight calibration method to MAX for training mode)
