Reimplementation of "Exploration by Random Network Distillation" aiming to train as fast as possible.
A final project for the course "Advanced Topics in Deep Reinforcement Learning" (a report is available in Russian).
Install all dependencies from either the yml or the txt file.
Adjust the config.yml file as needed (note the "SavePath", "OptimDevice" and "RunDevice" arguments).
Run model training via
python montezuma_train.py
The trained model can be evaluated with
python montezuma_eval.py
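Before launching a long run, it may help to confirm that the three config.yml arguments mentioned in the setup steps are actually set. The following is a minimal sketch, not part of this repository: the helper `missing_config_keys`, the flat key layout, and the example values are all assumptions made for illustration; only the key names come from this README.

```python
# Hypothetical pre-flight check. Only the three key names come from the README;
# the flat layout and the example values below are assumptions.
REQUIRED_KEYS = ("SavePath", "OptimDevice", "RunDevice")


def missing_config_keys(config):
    """Return the required keys that are absent or empty in a config mapping."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Example mapping, as it might look after parsing config.yml (values assumed).
example_config = {
    "SavePath": "./checkpoints",
    "OptimDevice": "cuda:0",
    "RunDevice": "cpu",
}

missing = missing_config_keys(example_config)
if missing:
    raise SystemExit("config.yml is missing: " + ", ".join(missing))
```

The check works on an already parsed mapping, so it does not depend on which YAML loader is used.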
Montezuma's Revenge
Training with both intrinsic and extrinsic rewards
Training with intrinsic-only reward
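One way to read the two settings above is as different weightings of the two reward streams. The sketch below assumes the combined advantage is a weighted sum of an extrinsic and an intrinsic advantage, with default coefficients 2 and 1 as reported in the RND paper; whether this repository uses the same values is an assumption, and `combine_advantages` is a hypothetical helper name.

```python
def combine_advantages(ext_adv, int_adv, ext_coef=2.0, int_coef=1.0):
    """Weighted sum of per-step extrinsic and intrinsic advantages.

    The default coefficients follow the RND paper; they are an assumption
    about this repository, not a statement of what its config contains.
    """
    if len(ext_adv) != len(int_adv):
        raise ValueError("advantage sequences must have the same length")
    return [ext_coef * e + int_coef * i for e, i in zip(ext_adv, int_adv)]


# Toy per-step advantages (made-up numbers, chosen to be exact in binary).
ext = [0.0, 1.0, -0.5]
intr = [0.25, 0.5, 0.125]

both = combine_advantages(ext, intr)                          # intrinsic + extrinsic
intrinsic_only = combine_advantages(ext, intr, ext_coef=0.0)  # extrinsic ignored
```

Setting `ext_coef=0.0` reproduces the intrinsic-only setting in this sketch: the agent is then driven purely by the novelty signal.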
- Separate actor and learner
- Log number of rooms visited
- Add optional V-trace target correction
- Add TPU support
- Add fp16 support
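For the V-trace item above, the target computation from the IMPALA paper can be sketched in plain Python as follows. This is an illustration only, not code from this repository: the function name, argument layout, and default values are assumptions, and the sketch handles a single trajectory without episode terminations.

```python
def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets for one trajectory (no episode ends).

    rewards[t], values[t] and ratios[t] = pi(a_t|x_t) / mu(a_t|x_t) are
    per-step lists; bootstrap_value is V(x_T) for the state after the last step.
    """
    n = len(rewards)
    next_values = list(values[1:]) + [bootstrap_value]
    targets = [0.0] * n
    acc = 0.0  # holds v_{t+1} - V(x_{t+1}); zero past the end of the trajectory
    for t in reversed(range(n)):
        rho = min(rho_bar, ratios[t])  # clipped weight on the TD error
        c = min(c_bar, ratios[t])      # clipped trace coefficient
        delta = rho * (rewards[t] + gamma * next_values[t] - values[t])
        acc = delta + gamma * c * acc
        targets[t] = values[t] + acc
    return targets


# Toy on-policy trajectory (all importance ratios equal to 1).
targets = vtrace_targets(
    rewards=[1.0, 0.0, 2.0],
    values=[0.5, 0.25, 1.0],
    bootstrap_value=4.0,
    ratios=[1.0, 1.0, 1.0],
    gamma=0.5,
)
```

With all ratios equal to 1 and both clipping thresholds at 1, the targets reduce to the ordinary bootstrapped n-step returns, which makes the on-policy case a convenient sanity check.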