RL_baselines

Custom implementations of state-of-the-art (SOTA) RL baseline algorithms for which no public implementation is available.

Currently available PyTorch implementations -

  • POIS (Policy Optimization via Importance Sampling) (NeurIPS 2018) - paper
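
For orientation, the following is a minimal sketch of the kind of penalized importance-sampling objective that POIS optimizes, reduced to a one-dimensional Gaussian policy with a fixed standard deviation and a one-step setting. It is not the code in this repository: the function names, the one-step simplification, and the penalty coefficient `lam` are illustrative assumptions, and it uses only the Python standard library rather than PyTorch.

```python
import math


def gaussian_log_prob(x, mean, sigma):
    """Log-density of N(mean, sigma^2) evaluated at x."""
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def renyi2_exp(mu_target, mu_behavior, sigma):
    """Exponentiated order-2 Renyi divergence between two Gaussians
    that share the same standard deviation: exp((mu_t - mu_b)^2 / sigma^2)."""
    return math.exp((mu_target - mu_behavior) ** 2 / sigma ** 2)


def penalized_is_objective(actions, returns, mu_target, mu_behavior, sigma, lam):
    """Importance-weighted mean return minus a divergence-based penalty.

    actions were sampled from the behavior policy N(mu_behavior, sigma^2);
    lam is a hypothetical penalty coefficient (a hyperparameter here).
    """
    n = len(actions)
    # Importance weight = target density / behavior density, per sample.
    weights = [
        math.exp(
            gaussian_log_prob(a, mu_target, sigma)
            - gaussian_log_prob(a, mu_behavior, sigma)
        )
        for a in actions
    ]
    is_estimate = sum(w * r for w, r in zip(weights, returns)) / n
    # The penalty grows as the target policy moves away from the behavior policy.
    penalty = lam * math.sqrt(renyi2_exp(mu_target, mu_behavior, sigma) / n)
    return is_estimate - penalty
```

When the target and behavior means coincide, every importance weight equals 1 and the objective reduces to the sample mean of the returns minus `lam / sqrt(n)`.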

In progress -

  • Minimum-Variance Policy Evaluation for Policy Improvement (UAI 2023) - paper

Installation

pip install -r requirements.txt

Replicating POIS results -

python evaluate_pois.py

Results for 500 iterations

We compare POIS against PPO and TRPO using two policy classes: a linear Gaussian policy and an MLP Gaussian policy, each with a learnable mean and fixed variance.

[Figure: POIS evaluation results compared with PPO and TRPO]
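
As an illustration of what "learnable mean and fixed variance" means for the linear Gaussian case, here is a small framework-free sketch. The class name `LinearGaussianPolicy`, its methods, and the scalar-action setting are hypothetical and chosen for illustration; the repository's actual policies are implemented in PyTorch and may be structured differently.

```python
import math
import random


class LinearGaussianPolicy:
    """Scalar-action policy a ~ N(w . s, sigma^2).

    Only the weights w (the mean parameters) are meant to be learned;
    sigma is a fixed constant.
    """

    def __init__(self, weights, sigma=1.0):
        self.weights = list(weights)  # learnable mean parameters
        self.sigma = sigma            # fixed standard deviation

    def mean(self, state):
        # Linear mean: dot product of weights and state features.
        return sum(w * s for w, s in zip(self.weights, state))

    def sample(self, state, rng=random):
        # Draw an action from the Gaussian centered at the linear mean.
        return rng.gauss(self.mean(state), self.sigma)

    def log_prob(self, state, action):
        z = (action - self.mean(state)) / self.sigma
        return -0.5 * z * z - math.log(self.sigma) - 0.5 * math.log(2.0 * math.pi)

    def grad_log_prob(self, state, action):
        # d/dw log pi(a|s) = (a - mean) / sigma^2 * s
        scale = (action - self.mean(state)) / self.sigma ** 2
        return [scale * s for s in state]
```

An MLP Gaussian policy would follow the same pattern, with the linear mean replaced by the output of a small neural network and the standard deviation still held fixed.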