Incorporating Structural Bias into Neural Networks
The rapid progress in artificial intelligence in recent years can largely be attributed to the resurgence of neural networks, which enable learning representations in an end-to-end manner. Although neural networks are powerful, they have many limitations. For example, neural networks are computationally expensive and memory inefficient, and their training requires many labeled examples, especially for tasks that involve reasoning and external knowledge. The goal of this thesis is to overcome some of these limitations by designing neural networks that take the structural bias of the inputs into consideration.
This thesis aims to improve the efficiency of neural networks by exploiting structural properties of the inputs in the design of model architectures. Specifically, this thesis augments neural networks with designed modules that improve their computational and statistical efficiency. We instantiate these modules in a wide range of tasks, including supervised and unsupervised learning, and show that they not only make neural networks consume less memory but also help them generalize better.
Thesis Committee:
Eric Xing (Co-Chair)
Taylor Berg-Kirkpatrick (Co-Chair, LTI)
Ruslan Salakhutdinov
Alexander Smola (Amazon)
Nando de Freitas (DeepMind)