Algorithms in `qpolgrad` have been organized to define functions for loss calculation. Those functions are then called in the `update` function for the algorithm. A2C and PPO need to be brought up to that same structure.
Specifically:

- Define `compute_policy_loss` and `compute_value_loss` functions in A2C and PPO.
- Modify the update rules for both algorithms to call the loss computation functions.
- Update docstrings to reflect your changes! If there aren't docstrings (sorry), add them!
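
For anyone picking this up, here is a rough sketch of the target structure. It is not the existing code: it uses plain Python floats instead of tensors so it runs on its own, and the class layout, the argument names (`logps`, `old_logps`, `advantages`, `values`, `returns`), the `batch` dict, the `clip_ratio` default, and the `_mean` helper are all assumptions made for illustration. The A2C policy loss shown is the usual negative mean of log-probability times advantage, and the PPO one is the usual clipped surrogate; the real versions should match whatever the current update rules compute.

```python
import math


def _mean(xs):
    """Hypothetical helper: arithmetic mean of an iterable of floats."""
    xs = list(xs)
    return sum(xs) / len(xs)


class A2C:
    """Sketch of A2C with loss computation split out of update()."""

    def compute_policy_loss(self, logps, advantages):
        """Return the negative mean of log-probability times advantage."""
        return -_mean(lp * adv for lp, adv in zip(logps, advantages))

    def compute_value_loss(self, values, returns):
        """Return the mean squared error between value estimates and returns."""
        return _mean((v - r) ** 2 for v, r in zip(values, returns))

    def update(self, batch):
        """Compute both losses for a batch (assumed to be a dict of lists)."""
        policy_loss = self.compute_policy_loss(batch["logps"], batch["advantages"])
        value_loss = self.compute_value_loss(batch["values"], batch["returns"])
        # The real update rule would backpropagate and step the optimizers here.
        return policy_loss, value_loss


class PPO:
    """Sketch of PPO with the clipped surrogate objective as its policy loss."""

    def __init__(self, clip_ratio=0.2):
        self.clip_ratio = clip_ratio  # assumed default, not taken from the repo

    def compute_policy_loss(self, logps, old_logps, advantages):
        """Return the negative mean of the clipped surrogate objective."""
        terms = []
        for lp, old_lp, adv in zip(logps, old_logps, advantages):
            ratio = math.exp(lp - old_lp)
            clipped = min(max(ratio, 1.0 - self.clip_ratio), 1.0 + self.clip_ratio)
            terms.append(min(ratio * adv, clipped * adv))
        return -_mean(terms)

    def compute_value_loss(self, values, returns):
        """Return the mean squared error between value estimates and returns."""
        return _mean((v - r) ** 2 for v, r in zip(values, returns))

    def update(self, batch):
        """Compute both losses for a batch (assumed to be a dict of lists)."""
        policy_loss = self.compute_policy_loss(
            batch["logps"], batch["old_logps"], batch["advantages"]
        )
        value_loss = self.compute_value_loss(batch["values"], batch["returns"])
        # The real update rule would backpropagate and step the optimizers here.
        return policy_loss, value_loss
```

The point is only the shape: `update` does no loss math itself and just calls the two loss functions, each of which carries its own docstring.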