Conventional planning and control of highly articulated legged robots is challenging because of the high dimensionality of the state space, and such conventional techniques normally produce a single point solution. In this work, we present a framework for end-to-end learning of a set of parameterized families of behaviors that can be modulated by low-dimensional sets of control parameters. We draw inspiration from Central Pattern Generators (CPGs), which use networks of oscillators to form expressive low-dimensional parameterizations of locomotive behaviors, thereby reducing the dimensionality of the planning problem. The design of CPGs, however, requires significant domain knowledge and hand tuning. Model-free deep reinforcement learning (RL), on the other hand, offers a framework for learning behavioral policies by interacting with the environment, with minimal to no domain expertise about the robot or its environment. The results from RL are still restrictive, however, because a learned policy represents a single behavior whose characteristics cannot be easily modified. This work presents a framework that brings together ideas from CPGs and model-free deep RL to enable expressive parameterizations of behaviors to be learned end-to-end by interaction with the environment.
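To make the CPG idea concrete, the following is a minimal sketch (not the system described in the talk) of a network of Kuramoto-style coupled phase oscillators, a common CPG formulation. The function names, coupling scheme, and parameter choices here are illustrative assumptions: each oscillator advances at its intrinsic frequency plus coupling terms that pull pairwise phase differences toward prescribed biases, and joint targets are read out as sinusoids of the phases. The low-dimensional "control parameters" are the frequencies, amplitudes, offsets, and phase biases.

```python
import math

def cpg_step(phases, omegas, coupling, phase_biases, dt=0.01):
    """One forward-Euler step of a coupled phase-oscillator network.

    phases, omegas: per-oscillator phase (rad) and intrinsic frequency (rad/s).
    coupling[i][j]: strength with which oscillator j pulls on oscillator i.
    phase_biases[i][j]: desired phase lead of oscillator j over oscillator i.
    """
    n = len(phases)
    new_phases = []
    for i in range(n):
        dphi = omegas[i]
        for j in range(n):
            if i != j:
                # Coupling term vanishes when phases[j] - phases[i] equals the bias.
                dphi += coupling[i][j] * math.sin(
                    phases[j] - phases[i] - phase_biases[i][j]
                )
        new_phases.append(phases[i] + dt * dphi)
    return new_phases

def joint_targets(phases, amplitudes, offsets):
    """Map oscillator phases to joint-angle setpoints."""
    return [o + a * math.sin(p) for p, a, o in zip(phases, amplitudes, offsets)]
```

For example, two oscillators with an anti-phase bias of pi converge to a half-cycle offset regardless of initial conditions, the basic mechanism behind alternating-gait patterns. In the hybrid framework the talk describes, RL can tune or modulate such low-dimensional parameters instead of outputting raw joint torques.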
Matthew Travers (Co-Advisor)
Howie Choset (Co-Advisor)