DSA Thrust Seminar

Towards Diverse, Informative and Robust Natural Language Generation

Deep neural networks have shown remarkable effectiveness in natural language generation (NLG) tasks (e.g., text continuation and dialog generation). However, there is still a noticeable quality gap between human-written and machine-generated text. Current NLG models often produce text that lacks diversity and informativeness, and they struggle to perform robustly in low-resource data settings.

In this talk, I will present my efforts towards developing new principled modeling and learning frameworks for human-like text generation. First, I will present a new framework that learns a semantic latent space for generating diverse text. Different formulations and training methods can be instantiated to learn a proper semantic latent space and obtain significant improvements in diversity. Second, I will introduce our research on exploring and exploiting useful knowledge for informative text generation. Third, I will present a unified view of data augmentation for building robust generation models. It allows us to formulate the problem of data augmentation for general text generation models without any use of augmented data mapping functions. Finally, I will conclude with a blueprint for next-level NLG systems.

Wei BI

Principal Researcher

Tencent AI Lab

Dr. BI Wei is a principal researcher at Tencent AI Lab. She received her Ph.D. in Computer Science and Engineering from the Hong Kong University of Science and Technology in 2015. She was awarded the Google Ph.D. Fellowship in 2013 and the Google Anita Borg Scholarship in 2014. Her research interests lie in the broad areas of machine learning, natural language processing, and artificial intelligence. In particular, she is interested in developing principles, methodologies, and AI systems that learn from rich content (data, knowledge, symbolic rules, rewards, etc.), and in their applications to natural language generation. Her proposed models have been deployed to support many of Tencent's products, including dialog systems, virtual humans, and Tencent games.

Date

05 December 2022

Time

14:00 - 15:00

Venue

Online

Join Link

Zoom Meeting ID:
950 5762 1601


Passcode: dsat

Organizer

Data Science and Analytics Thrust

Contact Email

dsat@hkust-gz.edu.cn