The paper *Reinforcement learning with a bilinear Q function* represents the Q function as Q(s, a) = s^T W a, where s is the state vector, a is the action vector, and W is a learned matrix. W is fit by linear regression on a batch of transitions sampled from the environment. One advantage of this method is that training is offline: once the batch has been collected, no further agent-environment interaction is needed. Another is that the number of learned parameters in W is small, just dim(s) × dim(a).
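The regression step can be made concrete: since s^T W a = vec(W) · vec(s a^T), fitting W reduces to ordinary least squares on outer-product features. Below is a minimal sketch of this idea in a fitted Q-iteration loop; the function and argument names are illustrative assumptions rather than the repository's actual API, and terminal-state handling is omitted for brevity.

```python
import numpy as np

def fit_bilinear_q(states, actions, rewards, next_states,
                   candidate_actions, gamma=0.99, n_iters=50):
    """Sketch of fitted Q-iteration with a bilinear Q: Q(s, a) = s^T W a.

    Because s^T W a = vec(W) . vec(s a^T), each iteration is an
    ordinary least-squares fit on outer-product features.
    All names here are illustrative, not the repository's actual API.
    """
    n, ds = states.shape
    da = actions.shape[1]
    # Feature matrix: one flattened outer product s a^T per sample.
    X = np.einsum('ni,nj->nij', states, actions).reshape(n, ds * da)
    W = np.zeros((ds, da))
    for _ in range(n_iters):
        # Bootstrapped targets r + gamma * max_a' Q(s', a'), with the
        # maximum taken over a finite grid of candidate actions.
        q_next = next_states @ W @ candidate_actions.T  # (n, n_candidates)
        y = rewards + gamma * q_next.max(axis=1)
        # Least-squares fit of vec(W) against the targets.
        w_vec, *_ = np.linalg.lstsq(X, y, rcond=None)
        W = w_vec.reshape(ds, da)
    return W
```

Maximizing over a discrete set of candidate actions is one simple way to handle the continuous action space; for a bilinear Q the maximizing action could also be found in closed form, since Q is linear in a for a fixed state.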
A Python implementation on the continuous Mountain Car problem from the Gym library can be found here.
The program is started from the Main.py file. At the top of the file are parameters and flags you can set; you can sweep over many parameter combinations at once or test a single set. You can also render a video of the algorithm's behaviour by setting the render_video flag to True. Note that the samples are generated randomly at the start of training, so it is possible to draw a bad batch that never leads to a solution.
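For orientation, a configuration block at the top of Main.py might look like the sketch below. Only the render_video flag is documented above; the other names and values are illustrative assumptions, not the file's actual contents.

```python
# Hypothetical excerpt from the top of Main.py.
# Only render_video is documented in this README; the rest are assumed names.
render_video = True    # render episodes to watch the learned policy
gamma = 0.99           # discount factor (assumed parameter)
n_samples = 10000      # size of the randomly generated training batch
seed = 0               # fixing a seed helps reproduce a good/bad batch
```

Because a bad random batch can prevent the method from finding a solution, fixing the random seed (or rerunning with a new one) is a practical way to diagnose whether a failure comes from the batch or from the parameters.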