Hi, thanks again for sharing this repo for reproducing the awesome results.
I am curious whether the BatchNorm layers are folded into the preceding Conv or FC layers during inference.
I ran both static and retrain modes for mobilenetv2. During inference, I found that the BatchNorm variables (mean/var/gamma/beta) are filled with non-trivial values rather than 1s or 0s, and are still involved in the computation graph. Is that working as intended?
(I load the quantized model from the .ckpt and .pb files.)
Hi @jakc4103,
The folding of BatchNorm layers in Graffitist is not an in-place operation. By that I mean, the BN parameters (mean, var, gamma, beta, etc.) are retained as variables, and only the graph is modified to fold them into the weights / biases. As a result, the folded weights and biases are not modified in place; rather, they are computed at run-time by ops that implement the folding. If you're interested in getting the final folded & quantized weights / biases of a convolutional layer, you may pass the dump_quant_params=True argument to the quantize transform like this.
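For reference, here is a minimal NumPy sketch of the standard BN-folding arithmetic being described (function and variable names are illustrative, not Graffitist's actual op names; Graffitist builds equivalent ops into the graph, so this computation happens at run-time from the retained BN variables):

```python
import numpy as np

def fold_batchnorm(w, b, gamma, beta, moving_mean, moving_var, eps=1e-3):
    """Fold BatchNorm params into the preceding conv's weights/bias.

    Illustrative only: the BN variables stay in the checkpoint, and the
    folded values are recomputed from them each time, rather than being
    written back in place. Assumes a conv kernel w of shape
    [kh, kw, cin, cout] and per-output-channel BN params of shape [cout].
    """
    scale = gamma / np.sqrt(moving_var + eps)   # per-channel BN scale
    w_fold = w * scale                          # broadcasts over cout
    b_fold = beta + (b - moving_mean) * scale
    return w_fold, b_fold
```

This is why the BN variables you saw hold real statistics and still feed the graph: they are inputs to the folding ops, not dead parameters.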
Just to be sure: I found the dumped weights are in integer format, while the whole training and inference pipeline runs with FakeQuant ops (computations in FP32). Is that right?
Also, are there any other debugging or experimental flags for quantize()? (e.g. setting the weight calibration method to MAX for training mode)
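To make the FakeQuant question above concrete, here is a minimal sketch of simulated (fake) quantization, assuming symmetric signed quantization; this is a generic illustration, not Graffitist's implementation:

```python
import numpy as np

def fake_quantize(x, scale, num_bits=8):
    """Round to an integer grid, then rescale back to FP32.

    The intermediate integer values q are what a quant-params dump
    would contain; the graph itself keeps computing in FP32 on q * scale.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / scale), qmin, qmax)  # integer grid values
    return q * scale                              # FP32 values used in the graph
```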