A Survey on Ground-level Visual Geolocalization
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Qualifying Examination
By Miss LI Ling
Abstract
Ground-level visual geolocalization seeks to determine the geographic location of an image based solely on its visual content. While early methods relied on recognition-based techniques such as image classification and retrieval, recent advances have introduced models that incorporate contextual understanding and reasoning. This survey traces the evolution of the field from handcrafted features and deep learning models to recent reasoning-driven frameworks powered by large vision-language models (LVLMs). We categorize existing approaches into classification-based, retrieval-based, reasoning-based, and hybrid methods, and summarize their methodological characteristics and representative techniques. We also examine recent trends in both dataset construction and modeling frameworks, highlighting the need for high-quality, reasoning-compatible data and the growing role of LVLMs in inference and training. Finally, we outline open challenges and future directions focused on scalable data curation, effective training strategies, and the design of geolocalization systems that are accurate, interpretable, and robust across real-world conditions.
PQE Committee
Chair of Committee: Prof. CHU Xiaowen
Prime Supervisor: Prof. WEI Jiaheng
Co-Supervisor: Prof. TSUNG Fugee
Examiner: Prof. LIANG Yuxuan
Date
11 June 2025
Time
09:00 - 10:00
Venue
E1-147 (HKUST-GZ)