A Survey on Ground-level Visual Geolocalization

PhD Qualifying Examination

The Hong Kong University of Science and Technology (Guangzhou)

Data Science and Analytics Thrust

By Miss LI Ling

Abstract

Ground-level visual geolocalization seeks to determine the geographic location of an image based solely on its visual content. While early methods relied on recognition-based techniques such as image classification and retrieval, recent advances have introduced models that incorporate contextual understanding and reasoning. This survey traces the evolution from handcrafted features and deep learning models to recent reasoning-driven frameworks powered by large vision-language models (LVLMs). We categorize existing approaches into classification-based, retrieval-based, reasoning-based, and hybrid methods, and summarize their methodological characteristics and representative techniques. We also examine recent trends in both dataset construction and modeling frameworks, highlighting the need for high-quality, reasoning-compatible data and the growing role of LVLMs in inference and training. Finally, we outline open challenges and future directions focused on scalable data curation, effective training strategies, and the design of geolocalization systems that are accurate, interpretable, and robust across real-world conditions.

PQE Committee

Chair of Committee: Prof. CHU Xiaowen

Prime Supervisor: Prof. WEI Jiaheng

Co-Supervisor: Prof. TSUNG Fugee

Examiner: Prof. LIANG Yuxuan

Date

11 June 2025

Time

09:00:00 - 10:00:00

Venue

E1-147 (HKUST-GZ)