
about your gradient #126

Open
BobxmuMa opened this issue Sep 19, 2021 · 2 comments

Comments

@BobxmuMa

First of all, thank you very much for open-sourcing your XNOR-pytorch code. I noticed that when updating the full-precision weights, you multiply the weight gradient by some coefficients:

self.target_modules[index].grad.data = m.add(m_add).mul(1.0-1.0/s[1]).mul(n)
self.target_modules[index].grad.data = self.target_modules[index].grad.data.mul(1e+9)

I could not find any description of these coefficients in the original paper. Could you explain why you transform the gradient this way?
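For context, the XNOR-Net paper's gradient for the full-precision weights combines the straight-through estimator for sign() with the gradient through the scaling factor alpha = ||W||_1 / n. A minimal NumPy sketch of that formula (variable names are mine, not the repo's; the extra (1-1/s[1])*n and 1e+9 coefficients asked about above do not appear here):

```python
import numpy as np

def xnor_weight_grad(w, grad_wb):
    # w: full-precision weights of one filter (flattened)
    # grad_wb: gradient w.r.t. the binarized weights alpha * sign(w)
    n = w.size
    alpha = np.abs(w).mean()      # scaling factor alpha = ||w||_1 / n
    ste = (np.abs(w) <= 1.0)      # straight-through estimator for d sign(w)/dw
    # d(alpha * sign(w))/dw = 1/n + alpha * 1_{|w| <= 1}, since sign(w)^2 = 1
    return grad_wb * (1.0 / n + alpha * ste)
```

For example, with w = [0.5, -2.0] we get alpha = 1.25, and the second weight's STE term is masked out because |w| > 1.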

@zhaoxiangshun

I would like to know as well. If you (the OP) have figured it out, could you please explain? Thanks.

@jiecaoyu
Owner

Hi @BobxmuMa @zhaoxiangshun, this parameter 1e+9 appears in the paper authors' original repo, so I kept it as well. Its main effect is to increase the range of the weights and reduce the effect of weight decay. I suppose using a much smaller weight decay value would have the same effect. I also tested the accuracy with and without this parameter; in my tests, accuracy was higher with it.
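A minimal numerical sketch of this point (assuming PyTorch-style SGD, where weight decay `wd` is folded into the gradient as `grad + wd * w`; the concrete numbers are illustrative, not from the repo): once the data gradient is scaled by 1e+9, the weight-decay term becomes negligible in comparison.

```python
def sgd_step(w, grad, lr=0.01, wd=1e-4):
    # PyTorch-style SGD: weight decay is added to the gradient before the step
    return w - lr * (grad + wd * w)

w, g = 0.02, 0.5

# Fraction of the update contributed by weight decay, with and without
# the 1e+9 gradient scaling (the learning rate would be rescaled to match)
share_plain  = (1e-4 * w) / g          # decay vs. raw gradient
share_scaled = (1e-4 * w) / (g * 1e9)  # decay vs. scaled gradient

print(share_scaled / share_plain)  # decay's relative influence drops by 1e+9
```

This is consistent with the comment above that shrinking the weight-decay value directly should have a similar effect.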
