PhD Qualifying Examination

From Research to Reach: A Survey of Language Models for Scientific Literature Summarization and Simplification

The Hong Kong University of Science and Technology (Guangzhou)

Data Science and Analytics Thrust

PhD Qualifying Examination

By Mr. JIANG Gongyao

Abstract

The growing complexity and volume of scientific literature pose significant challenges to effective knowledge dissemination across expertise levels. Language Models (LMs) have emerged as promising tools to address these challenges through automated text adaptation techniques, particularly summarization and simplification. Summarization distills technical content into concise forms for rapid comprehension, while simplification reduces linguistic complexity to enhance accessibility for non-specialists. This survey examines advancements in adapting LMs for scientific literature transformation through three interconnected dimensions: data curation, model architectures, and evaluation frameworks. We analyze how domain-specific datasets, pretraining and finetuning strategies, and hybrid evaluation protocols collectively enable systems to transform content into variants with high readability and faithfulness. Current approaches demonstrate robust capabilities in extracting salient information and generating coherent outputs, yet critical gaps persist in handling multimodal content, dynamically adapting to diverse audiences, and aligning automated metrics with human judgment. In the future, the integration of visual-textual reasoning, user-aware generation, and scalable evaluation pipelines will be essential to bridge the gap between specialized research and equitable scientific communication. By synthesizing progress and challenges, this work provides a roadmap for developing LM-powered tools that empower researchers and the public to navigate the expanding frontiers of scientific knowledge.

PQE Committee

Chair of Committee: Prof. TSUNG Fugee

Prime Supervisor: Prof. LUO Qiong  

Co-Supervisor: Prof. MA Xiaojuan

Examiner: Prof. ZHANG Yongqi

Date

10 June 2025

Time

14:00 - 15:00

Venue

E1-147 (HKUST-GZ)