exclude_from_weight_decay for AdamW #16201
Comments
Can you please elaborate about your feature and please specify the use cases for this feature. Thanks! |
I need to exclude some parameters from weight decay, which is useful when training Transformer models. The parameter is a list of regex patterns for variables to exclude from weight decay: any variable whose name contains a substring matching one of the patterns is excluded. Here is the use case. The code behind exclude_from_weight_decay from tensorflow/addons is here. |
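To make the matching rule concrete, here is a minimal pure-Python sketch of the behavior described above. It is not the tensorflow/addons or Keras implementation: the helper names `is_excluded` and `apply_weight_decay`, the variable names, and the use of plain floats in place of tensors are all assumptions made for illustration.

```python
import re


def is_excluded(var_name, patterns):
    """Return True if any regex pattern matches a substring of var_name."""
    return any(re.search(pattern, var_name) for pattern in patterns)


def apply_weight_decay(params, lr, weight_decay, exclude_patterns=()):
    """Apply decoupled weight decay, w <- w - lr * weight_decay * w,
    to every parameter whose name does not match an exclusion pattern.

    `params` maps variable names to plain floats (stand-ins for tensors).
    """
    updated = {}
    for name, value in params.items():
        if is_excluded(name, exclude_patterns):
            updated[name] = value  # excluded: left unchanged
        else:
            updated[name] = value - lr * weight_decay * value
    return updated


# Hypothetical variable names, chosen only to illustrate the matching.
params = {
    "encoder/dense/kernel": 1.0,
    "encoder/dense/bias": 1.0,
    "encoder/LayerNorm/gamma": 1.0,
}
decayed = apply_weight_decay(
    params,
    lr=0.1,
    weight_decay=0.5,
    exclude_patterns=["bias", "LayerNorm"],
)
```

With these patterns only the kernel is decayed; the bias and LayerNorm entries keep their values because `re.search` finds a match anywhere in their names.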
@markub3327 What is the difference between this issue and #16195? Can you please state the differences clearly, or close this issue if it is a duplicate of #16195? Thanks! |
This is a proposal to add #16195 is related to port RectifiedAdam (a new optimizer) with |
Thanks a lot ... Very good explanation! |
@chenmoneygithub Any comments on this? |
@markub3327 Thanks for reporting the issue! Are you willing to open a PR to add |
Yes. Thanks. |
Hello,
please can you add an exclude_from_weight_decay parameter to AdamW? Some parameters must be excluded from weight decay. For more details please look here.
Thanks a lot.
Have a nice day.