In this Jupyter notebook, you will compare the performance of three reinforcement learning algorithms (On-Policy First-Visit Monte-Carlo Control, Sarsa, and Q-Learning) in a simple racetrack environment. You will then implement a modified TD agent that improves upon the learning performance of a basic Q-Learning agent.
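As a rough orientation, the three algorithms differ mainly in the target they move their action-value estimates toward. The sketch below is illustrative only and is not the notebook's own implementation: the function names, the tabular `Q` array layout (`Q[state, action]`), and the default values of `gamma` and `alpha` are all assumptions made for this example.

```python
import numpy as np

# Hypothetical helpers for illustration; not part of the notebook's API.

def mc_return(rewards, gamma=1.0):
    """Monte Carlo target: discounted return over the rest of an episode."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

def sarsa_target(Q, reward, next_state, next_action, gamma=1.0):
    """On-policy TD target: bootstraps from the action actually taken next."""
    return reward + gamma * Q[next_state, next_action]

def q_learning_target(Q, reward, next_state, gamma=1.0):
    """Off-policy TD target: bootstraps from the greedy next action."""
    return reward + gamma * np.max(Q[next_state])

def td_update(Q, state, action, target, alpha=0.5):
    """Move Q[state, action] a fraction alpha toward the target (in place)."""
    Q[state, action] += alpha * (target - Q[state, action])

# Toy table with 2 states and 2 actions (values chosen arbitrarily).
Q = np.array([[0.0, 0.0],
              [1.0, 3.0]])

s_target = sarsa_target(Q, reward=-1.0, next_state=1, next_action=0)
q_target = q_learning_target(Q, reward=-1.0, next_state=1)
```

With the same transition, the Sarsa target depends on which next action was sampled, while the Q-Learning target always uses the highest-valued next action. Monte Carlo uses no bootstrapping at all and waits for the episode to finish.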