With the growing ubiquity of AI and machine learning, intelligent agents, especially robots, will be expected to perform in increasingly complex environments. Recent successes in model-free reinforcement learning have demonstrated the ability of agents to succeed in such environments. However, model-free methods use data inefficiently and are therefore impractical for complex robotics tasks, where evaluating the system is expensive. Recent work on aDOBO has demonstrated the efficiency of model-based reinforcement learning in practical robotics applications. We extend this work to non-linear dynamics models and show that the new framework can successfully control systems with highly non-linear, complex dynamics and scales well to high-dimensional environments.
We have been testing aDOBO on a toy environment, the Dubins car, in which a car with a three-dimensional state (position and heading) must be steered to a goal location. We are also evaluating the efficacy of aDOBO in comparison to model-free approaches. Additionally, we plan to extend aDOBO to allow efficient learning of multiple goals. For example, suppose we want to learn to drive the car not just to one goal location but to any goal location. In a typical model-based RL approach this is straightforward, because the cost function can be changed without re-learning the dynamics model. In the aDOBO framework, however, the dynamics are learned in a goal-driven way, so they must be re-learned whenever the cost function changes. We are investigating ways to extend aDOBO to avoid this problem.
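For concreteness, the Dubins car environment can be summarized by the standard unicycle-style dynamics, which are nonlinear in the heading angle; this nonlinearity is what motivates moving beyond linear dynamics models. The sketch below is illustrative only: the timestep and the choice of speed and turn rate as controls are assumptions, not details fixed by our setup.

```python
import numpy as np

def dubins_step(state, control, dt=0.1):
    """One Euler step of the Dubins car dynamics.

    state   = [x, y, theta]  -- planar position and heading
    control = [v, omega]     -- speed and turn rate (assumed parameterization)

    The cos/sin terms make the dynamics nonlinear in theta,
    so a single linear model cannot capture them globally.
    """
    x, y, theta = state
    v, omega = control
    return np.array([
        x + dt * v * np.cos(theta),
        y + dt * v * np.sin(theta),
        theta + dt * omega,
    ])

# Example: drive straight along the x-axis for one step from the origin.
s = dubins_step(np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0]))
```

Here the goal-reaching task amounts to penalizing the distance between the state and a goal location in the cost function, which is exactly the piece that changes when the goal changes.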