SCS Faculty Candidate

  • Remote Access - Zoom
  • Virtual Presentation - ET

Universal Natural Language Processing with Limited Annotations

Research in Natural Language Processing (NLP) aims to endow machines with the human ability to understand natural language. This mission has been broken down into a number of subtasks. We often focus on solving individual tasks by first collecting large-scale task-specific data and then developing a learning algorithm to fit the data. This research paradigm has considerably pushed the frontiers of NLP. However, it also means that we have to build new systems to handle new tasks, which is undesirable in the long run since (i) we do not really know all the tasks that we need to solve, and (ii) it discourages people from thinking about how to make systems truly understand natural language rather than memorize patterns in the training data.

In this talk, I will share my progress towards the goal of universal NLP (i.e., building a single system to solve a wide range of NLP tasks) in three stages. First, I study why humans show vastly superior generalization to machines when classifying open-genre text into open-form labels. The first part of the talk will present a single, static system that unifies various text classification problems: new text labels keep arriving at the system while no supporting examples are available. Second, I define a more realistic task, "incremental few-shot text classification", in which the system needs to learn new labels incrementally with k examples per label. Third, I shift my focus from classification problems to more complex and distinct tasks (e.g., Question Answering, Coreference Resolution, and Relation Extraction). This part will elaborate on how to optimize the generalization of a pre-trained entailment model with k task-specific examples so that a single entailment model can generalize to a variety of NLP tasks. Overall, universal NLP research pushes us to think more about the universal reasoning that underlies various problems, and makes it easier to use indirect supervision to solve new tasks.

Dr. Wenpeng Yin is a research scientist at Salesforce Research, Palo Alto, California. He received his Ph.D. from the University of Munich, Germany, in 2017 under the supervision of Prof. Hinrich Schütze and then worked as a postdoc at UPenn with Prof. Dan Roth. Wenpeng has broad research interests in Natural Language Processing (NLP) and Machine Learning, with a recent focus on Universal & Trustworthy NLP. He has received multiple awards, including the WISE2013 "Best Paper" award, the "Baidu Ph.D. Fellowship" in 2014 and 2015, the "Chinese Government Award for Outstanding Self-financed Ph.D. Students Abroad" in 2016, and the "Area Chair Favorites" paper award at COLING2018. He was an invited Senior Area Chair for NAACL'21 and an Area Chair for NAACL'19 and ACL'19&21.

Faculty Host: Yiming Yang

Language Technologies Institute

Zoom Participation. See announcement.

For More Information, Please Contact: