Unlocking LLM Potential: Investigating Graph Problems and Mathematical Reasoning
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Thesis Examination
By Mr. Nuo CHEN
Abstract
Large language models (LLMs) excel at pattern matching but still struggle with rigorous reasoning. To systematically unlock their latent potential, we address key challenges in mathematical and graph-based reasoning: enabling LLMs to perform self-reflection and calibration for better problem-solving, overcoming data scarcity, leveraging reinforcement learning to enhance reasoning capabilities, and bridging graph problem reasoning with general reasoning to improve overall LLM performance.
First, we propose IMR-TIP (Improving Math Reasoning with Tool-augmented Interleaf Prompting), an interactive reasoning framework that empowers LLMs to self-reflect and invoke external tools for solving mathematical problems. We further introduce MathOctopus, a multilingual mathematical reasoning model capable of addressing problems in over ten languages. To enhance generalization, we design a controllable math data generation method, significantly improving LLMs’ performance in mathematical tasks.
Having stabilised mathematical reasoning, we turn to structure as a catalyst for general reasoning. Graphs offer a universal abstraction, so we build GraphWiz, the first LLM fine-tuned specifically on graph computational problems. Its success motivates a larger vision: if graphs embody the relations underlying many tasks, then pre-training on rich graph reasoning should lift performance broadly. We curate GraphPile, a large, diverse corpus of graph-centric problems and tutorials, and use it to pre-train GraphMind. Empirically, GraphMind transfers its relational bias to out-of-domain benchmarks, outperforming vanilla LLMs across arithmetic, commonsense, and program synthesis. Finally, we close the loop at the reward level. We propose GraphPRM, a process-level reward model that grades each reasoning step by graph-problem correctness. Reinforcement learning with GraphPRM further sharpens both mathematical and graph reasoning, confirming our hypothesis that structured rewards amplify structured pre-training.
Thesis Examination Committee (TEC)
Chairperson: Prof Xin WANG
Prime Supervisor: Prof Jia LI
Co-Supervisor: Prof Yangqiu SONG
Examiners:
Prof Xiaowen CHU
Prof Wenjia WANG
Prof Enyan DAI
Prof Xu CHEN
Date
5 June 2025
Time
14:00:00 - 16:00:00
Venue
E1-202, HKUST(GZ)
Join Link
Zoom Meeting ID: 968 3773 6868
Passcode: dsa2025
Organizer
Data Science and Analytics Thrust
Contact Email
dsarpg@hkust-gz.edu.cn