Reinforcement learning (RL) agents learn to perform a task through trial-and-error interactions with an initially unknown environment. Despite recent progress in deep RL, it remains challenging to train intelligent agents that can efficiently explore a large state space and quickly solve a wide variety of tasks. One of the biggest obstacles is the high cost of human supervision in RL: it is difficult to design reward functions that provide enough learning signal yet still induce the correct behavior at convergence. To reduce the amount of human supervision required, there has been recent progress on self-supervised RL approaches, where the agent learns on its own by interacting with the environment without an extrinsic reward function. However, without any prior knowledge about the task, these methods can be sample-inefficient and suffer from poor exploration. Toward addressing these challenges, this thesis focuses on how to balance self-supervised RL with scalable forms of human supervision to efficiently train an agent to solve a variety of high-dimensional robotic tasks. Mindful of the cost of human labor, we consider alternative modalities of supervision that are more scalable and easier for the human user to provide. We show that such supervision can drastically improve the agent's learning efficiency, enabling directed exploration and learning within a large search space of states.
Ruslan Salakhutdinov (Co-chair)
Eric Xing (Co-chair)
Chelsea Finn (Stanford University)
Sergey Levine (University of California, Berkeley)
Zoom Participation. See announcement.