Robotics Thesis Proposal
- Remote Access - Zoom
- Virtual Presentation - ET (NEW TIME)
- MENGTIAN (MARTIN) LI
- Ph.D. Student
- Robotics Institute
- Carnegie Mellon University
Resource-Constrained Learning and Inference for Visual Perception
Real-world applications usually require computer vision algorithms to meet certain resource constraints. In this talk, I will present evaluation methods and principled solutions for both training and testing. First, I will present a formal setting for studying training under the non-asymptotic, resource-constrained regime, i.e., budgeted training. We analyze the following problem: given a dataset, an algorithm, and a fixed resource budget, what is the best achievable performance? Such a setting could be essential for the democratization of deep learning. Second, I will discuss how vision algorithms should respond to the resource constraints inherent in embodied perception, where an autonomous agent must perceive its environment and (re)act in time. We introduce a meta-benchmark that systematically converts any single-frame understanding task into a streaming understanding task. This streaming perception framework yields several surprising conclusions and solutions. Third, I will present an unconventional approach to streaming object detection. Image downsampling is a commonly adopted technique for meeting latency constraints, but this naive approach greatly restricts an object detector's ability to identify small objects. Inspired by foveated human vision, we elastically magnify certain regions while maintaining a small input canvas. With attentional magnification, we set a new record for streaming AP on Argoverse-HD.
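The core idea of streaming evaluation is that each ground-truth frame is scored against the most recent prediction the algorithm had emitted by that frame's timestamp, so latency directly penalizes accuracy. A minimal sketch of that pairing logic (function and variable names are illustrative, not the benchmark's actual code):

```python
# Sketch of streaming-evaluation pairing: for each ground-truth frame time,
# find the latest prediction that was already available at that instant.
# Timestamps are assumed to be sorted in increasing order.

def pair_streaming(gt_timestamps, pred_timestamps):
    """Return, for each ground-truth time, the index of the latest
    prediction emitted at or before it, or None if none exists yet."""
    pairs = []
    j = -1  # index of the latest usable prediction so far
    for t in gt_timestamps:
        # advance while the next prediction was already emitted by time t
        while j + 1 < len(pred_timestamps) and pred_timestamps[j + 1] <= t:
            j += 1
        pairs.append(j if j >= 0 else None)
    return pairs

# Predictions arrive late due to inference latency, so early frames
# may have no prediction to be scored against.
print(pair_streaming([0.0, 0.1, 0.2, 0.3], [0.05, 0.25]))
# → [None, 0, 0, 1]
```

Under this pairing, a slow but accurate detector can score worse than a faster, less accurate one, which is one source of the framework's counterintuitive conclusions.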
Thesis Committee:
Deva Ramanan (Chair)
Raquel Urtasun (Waabi / University of Toronto)
Ross Girshick (Facebook AI Research)
Zoom Participation. See announcement.