Neural Information Retrieval and Beyond
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Thesis Proposal Examination
By Mr. ZHOU Jiawei
Abstract
For decades, information retrieval (IR) has been fundamental in helping people access knowledge. As data volumes grow beyond human capacity, retrieval systems have become indispensable.
However, we are now witnessing a fundamental shift. With the emergence of large language models (LLMs) such as ChatGPT, Claude, and DeepSeek, retrieval is no longer designed solely for human users. Increasingly, LLM-based AI systems act as autonomous agents, performing tasks on behalf of people. These systems depend on retrieval not just to deliver results to users, but to gather and ground information for reasoning, content generation, and decision-making. This has given rise to a new paradigm: retrieval-augmented generation (RAG), where retrieval serves as the memory and knowledge interface for AI.
This thesis investigates retrieval for both human users and AI systems. I begin by examining retrieval systems tailored for human-facing applications, with an emphasis on pre-training, transparency, and multi-modal data. Building on this foundation, I analyze the emerging gap between user-oriented retrieval and AI-oriented retrieval, highlighting the limitations of current methods when integrated into LLM pipelines. Finally, I propose to explore how retrieval systems can be explicitly redesigned to support AI agents, improving factuality, adaptability, and efficiency in end-to-end tasks.
TPE Committee
Chair of Committee: Prof. CHU Xiaowen
Prime Supervisor: Prof. CHEN Lei
Co-Supervisor: Prof. TSUNG Fugee
Examiner: Prof. LIANG Yuxuan
Date
10 June 2025
Time
09:00 - 10:00
Location
E1-147 (HKUST-GZ)