Code

Technical

What are MDPs? And how can we solve them?

Markov decision processes (MDPs) are a common framework for setting up reinforcement learning problems. In this post, we define what they are, highlight their universality, and present two methods for solving them.

Technical

What is BC? How can we use it?

How can an agent learn a policy when it doesn't have access to the underlying reward structure of its environment? We cover one method for solving this problem, Behavioral Cloning, and provide the required theory along with an implemented example.
