Hi @BobxmuMa @zhaoxiangshun, the factor 1e+9 appears in the paper author's initial repo, so I kept it. Its main effect is to increase the range of the weights and reduce the effect of weight decay. I suspect that using a much smaller weight decay value would have the same effect. I also tested the accuracy with and without this factor, and in my tests the accuracy was higher with it.
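To illustrate why a large gradient scale can weaken weight decay, here is a minimal dependency-free sketch. It assumes an Adam-style optimizer in which weight decay is added to the gradient as an L2 term before normalization; the optimizer is not named in this thread, so that is an assumption. The function `adam_steps` and all values below are hypothetical and chosen only for illustration.

```python
import math

def adam_steps(grads, w0, lr=1e-3, weight_decay=1e-5, grad_scale=1.0,
               betas=(0.9, 0.999), eps=1e-8):
    """Scalar Adam with L2-style weight decay (assumed setup, not the repo's code)."""
    w, m, v = w0, 0.0, 0.0
    b1, b2 = betas
    for t, g in enumerate(grads, start=1):
        # The loss gradient is scaled first; the decay term is added afterwards.
        g = g * grad_scale + weight_decay * w
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
        v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Tiny loss gradients that alternate in sign, so the loss alone has no net pull.
grads = [1e-7 if i % 2 == 0 else -1e-7 for i in range(100)]

w_plain = adam_steps(grads, w0=1.0)                       # decay term dominates
w_scaled = adam_steps(grads, w0=1.0, grad_scale=1e9)      # decay term is negligible
w_nodecay = adam_steps(grads, w0=1.0, weight_decay=0.0)   # no decay at all

print(w_plain, w_scaled, w_nodecay)
```

Under these assumptions, the unscaled run drifts steadily toward zero because the decay term is larger than the loss gradient, while the scaled run stays close to its starting value, much like the run with weight decay disabled. This is only a toy scalar case and says nothing about the accuracy difference reported above.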
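First of all, thank you very much for open-sourcing your XNOR-pytorch code. I also noticed that when you update the single-precision weights, you multiply the weight gradients by several coefficients: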
self.target_modules[index].grad.data = m.add(m_add).mul(1.0-1.0/s[1]).mul(n)
self.target_modules[index].grad.data = self.target_modules[index].grad.data.mul(1e+9)
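I could not find a description of these coefficients in the original paper, so I would like to ask why you transform the gradients in this way.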