GRAPH REPRESENTATION LEARNING ON REAL-WORLD COMPLEX NETWORKS
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Thesis Proposal Examination
By Mr SONG Yifan
Abstract
Graph Representation Learning (GRL) has become a fundamental approach for analyzing complex networks. However, a significant gap remains between the capabilities of traditional GRL models and the multifaceted challenges posed by real-world networks. This thesis proposal identifies three critical challenges in dealing with real-world complex graphs: scale and dynamics, data incompleteness, and negative feedback. We propose a solution for each. First, to tackle scale and dynamics, we introduce LTGE (Large-scale Temporal Graph Embedding). This method employs a novel temporal similarity matrix factorization technique to capture temporal dynamics efficiently, generating embeddings for graphs with over a billion edges. We also develop LTGEInc, an incremental update algorithm with provable error bounds that enables rapid adaptation to evolving graph structures. Second, to address data incompleteness, a pervasive issue that severely degrades Graph Neural Network (GNN) performance, we propose DDFI (Diverse and Distribution-aware Missing Feature Imputation). DDFI combines feature propagation with a Graph Masked Autoencoder and introduces a two-step inference process to mitigate feature distribution shifts in inductive learning settings. Its effectiveness is validated on multiple public datasets and on Sailing, a newly collected real-world benchmark dataset with naturally missing features. Finally, to model graphs with negative feedback, we develop SPGNN (Signed Proximity-based Graph Neural Network) for signed graph recommendation. SPGNN introduces novel Signed Local and Global Proximity metrics alongside a Spectral Feature Initialization strategy, replacing standard GNN convolutions to better capture the nuanced structural information of signed graphs. This approach achieves state-of-the-art performance on multiple benchmark datasets. In summary, building on these preliminary results, we expect to deliver a cohesive toolkit that advances GRL towards scalable, robust, and semantically faithful modeling of real-world complex networks, supported by principled theory and reproducible benchmarks.
TPE Committee
Chair of Committee: Prof. FAN, Mingming
Prime Supervisor: Prof. TANG, Jing
Co-Supervisor: Prof. YANG, Jinglei
Examiner: Prof. DING, Ningning
Date
29 August 2025
Time
10:30:00 - 11:30:00
Venue
E3-201 (HKUST-GZ)