GRAPH REPRESENTATION LEARNING ON REAL-WORLD COMPLEX NETWORKS
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Thesis Proposal Examination
By Mr SONG Yifan
Abstract
Graph Representation Learning (GRL) has become a fundamental approach for analyzing complex networks. However, a significant gap exists between the capabilities of traditional GRL models and the multifaceted challenges presented by real-world networks. This thesis proposal identifies three critical challenges in dealing with real-world complex graphs: scale and dynamics, data incompleteness, and graphs with negative feedback. To address these challenges, we propose a novel and effective solution for each.
First, to tackle the challenge of scale and dynamics, we introduce LTGE (Large-scale Temporal Graph Embedding). This method employs a novel temporal similarity matrix factorization technique to efficiently capture temporal dynamics at an unprecedented scale, successfully generating embeddings for graphs with over a billion edges. Additionally, we develop LTGEInc, an incremental update algorithm with provable error bounds, which enables rapid adaptation to evolving graph structures.
Second, to address data incompleteness, a pervasive issue that severely degrades Graph Neural Network (GNN) performance, we propose DDFI (Diverse and Distribution-aware Missing Feature Imputation). DDFI uniquely combines feature propagation with a Graph Masked Autoencoder and introduces a two-step inference process to mitigate feature distribution shifts in inductive learning settings. Its effectiveness is validated on multiple public datasets, as well as on a newly collected real-world benchmark dataset with naturally missing features, referred to as Sailing.
Finally, to model graphs with negative feedback, we develop SPGNN (Signed Proximity-based Graph Neural Network) for signed graph recommendation. SPGNN introduces novel Signed Local and Global Proximity metrics alongside a Spectral Feature Initialization strategy, replacing standard GNN convolutions to better capture the nuanced structural information of signed graphs. This approach achieves state-of-the-art performance on multiple benchmark datasets.
In summary, building upon the current preliminary results and findings, we expect to create a cohesive toolkit that advances GRL towards scalable, robust, and semantically faithful modeling of real-world complex networks, supported by principled theory and reproducible benchmarks.
TPE Committee
Chair of Committee: Prof. FAN, Mingming
Prime Supervisor: Prof. TANG, Jing
Co-Supervisor: Prof. YANG, Jinglei
Examiner: Prof. DING, Ningning
Date
29 August 2025
Time
10:30:00 - 11:30:00
Location
E3-201 (HKUST-GZ)