Robotics Thesis Oral
- Newell-Simon Hall
- VENKATRAMAN NARAYANAN
- Ph.D. Student
- Robotics Institute
- Carnegie Mellon University
A recurrent and elementary robot perception task is to identify and localize objects of interest in the physical world. In many real-world situations such as in automated warehouses and assembly lines, this task entails localizing specific object instances with known 3D models. Most modern-day methods for the 3D multi-object localization task employ scene-to-model feature matching or regression/classification by learners trained on synthetic or real scenes. While these methods are typically fast in producing a result, they are often brittle, sensitive to occlusions, and depend on the right choice of features and/or training data.
This thesis introduces and advocates a deliberative approach, where the multi-object localization task is framed as an optimization over the space of hypothesized scenes. We demonstrate that deliberative reasoning, such as understanding inter-object occlusions, is essential to robust perception, and that discriminative techniques can effectively guide such reasoning. The contributions of this thesis fall into three parts:
The first part, PErception via SeaRCH (PERCH) and its extension C-PERCH, formulates Deliberative Perception as an optimization over hypothesized scenes, and develops an efficient tree search algorithm to solve it.
The second part focuses on accelerating global search through statistical learners, both in the form of search heuristics (Discriminatively-guided Deliberative Perception) and by modulating the search space (RANSAC-Trees).
The final part introduces general-purpose graph search algorithms that bridge statistical learning and search. Of these, the first is an anytime algorithm for leveraging edge validity priors to accelerate graph search, and the second, Improved Multi-Heuristic A*, permits the use of multiple, inadmissible heuristics that might arise from learning.
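As a rough illustration of the framing in the first part (and not the PERCH algorithm itself), the sketch below treats localization as a tree search over hypothesized scenes in a toy 1D grid world. Each level of the tree fixes the pose of one more object, and each edge cost counts the cells that the hypothesis renders but the observation does not contain. Everything here is an assumption made for illustration: the function name `localize`, the parameters `observed`, `widths`, and `world_size`, the 1D world, and the cell-mismatch cost are hypothetical and do not come from the thesis.

```python
import heapq
from itertools import count


def localize(observed, widths, world_size):
    """Uniform-cost search over partial scenes in a toy 1D world.

    observed:   iterable of occupied cell indices (the "sensor data")
    widths:     width in cells of each known object model
    world_size: number of cells in the world
    Returns (cost, poses), where poses[i] is the start cell of object i,
    or None if the objects cannot be placed without overlapping.
    """
    observed = frozenset(observed)
    tie = count()  # unique tie-breaker so the heap never compares states
    # Heap entries: (cost so far, tie, poses, covered cells, finished flag)
    frontier = [(0, next(tie), (), frozenset(), False)]
    while frontier:
        cost, _, poses, covered, finished = heapq.heappop(frontier)
        if finished:
            # All edge costs are non-negative, so the first finished
            # scene popped from the heap has minimum total cost.
            return cost, poses
        if len(poses) == len(widths):
            # Closing edge: pay for observed cells that no object explains.
            total = cost + len(observed - covered)
            heapq.heappush(frontier, (total, next(tie), poses, covered, True))
            continue
        width = widths[len(poses)]
        for start in range(world_size - width + 1):
            cells = frozenset(range(start, start + width))
            if cells & covered:
                continue  # hypothesized objects may not interpenetrate
            # Edge cost: rendered cells that are absent from the observation.
            step = len(cells - observed)
            heapq.heappush(
                frontier,
                (cost + step, next(tie), poses + (start,), covered | cells, False),
            )
    return None
```

For example, `localize({1, 2, 3, 6, 7}, [3, 2], 10)` explains every observed cell by placing the first object at cell 1 and the second at cell 6. If cell 7 is missing from the observation, as it would be under partial occlusion, the search still returns a complete scene and charges a cost of 1 for the unseen cell.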
Experimental validation on multiple robots and real-world datasets, one of which we introduce, indicates that we can leverage the complementary strengths of fast learning-based methods and deliberative classical search to handle both "hard" (severely occluded) and "easy" portions of a scene by automatically adjusting the amount of deliberation required.
Maxim Likhachev (Chair)
Siddhartha S. Srinivasa
Manuela M. Veloso
Dieter Fox (University of Washington)
Copy of Draft Thesis Document