Automated Alpha Factor Discovery in Quantitative Finance: A Critical Survey from Human Expertise to Large Language Models
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Qualifying Examination
By Mr. ZHANG, Junxiang
Abstract
Alpha factor mining is a central problem in quantitative investment because it connects raw market, fundamental, and textual data to tradable portfolio decisions. This survey reviews automated alpha factor discovery as a sequence of expanding search spaces and feedback mechanisms: human-designed economic factors, formulaic alpha libraries, evolutionary and symbolic search, machine-learning and deep-learning factor modeling, reinforcement-learning-based generation, and recent LLM-, RAG-, and agent-based workflows. Rather than treating these paradigms as isolated tool families, the survey compares them through a common framework: factor representation, search space, optimization mechanism, evaluation protocol, interpretability, and robustness. The review emphasizes that automation improves discovery capacity but also amplifies long-standing validation risks. Larger generators can produce many plausible candidates, yet financial data are noisy, non-stationary, and vulnerable to look-ahead bias, data snooping, factor redundancy, crowding, and alpha decay. LLM-based systems add new opportunities for semantic hypothesis generation and code synthesis, but they also introduce hallucination, leakage, invalid-code, and post-hoc explanation risks. The survey concludes that progress in alpha mining depends less on unconstrained generation and more on reliable evaluation, economic grounding, diversity control, reproducible infrastructure, and human-AI collaboration.
PQE Committee
- Chair: Prof. WANG, Wei
- Prime Supervisor: Prof. TANG, Jing
- Co-Supervisor: Prof. TANG, Nan
- Examiner: Prof. DING, Ningning
Date
10 June 2026
Time
13:00 - 14:00
Venue
E1-148, HKUST(GZ)