Thesis Proposal Examination

Data Stream Management: Efficient T-GNN Training over Large-Scale Dynamic Graphs

The Hong Kong University of Science and Technology (Guangzhou)

Data Science and Analytics Thrust

PhD Thesis Proposal Examination

By Mr GAO Shihong

Abstract

Temporal Graph Neural Networks (T-GNNs) have become the de facto solution for representation learning on dynamic graphs, enabling state-of-the-art performance on tasks such as temporal link prediction and recommendation. However, existing T-GNN training pipelines scale poorly because of ill-suited batching and high input data loading costs, which severely limit their efficiency on large-scale graphs. This thesis proposal addresses both of these bottlenecks with two complementary system prototypes. First, we propose ETC, a generic framework that introduces a theoretically grounded batch splitting algorithm and a three-step deduplication policy to improve computation throughput and reduce I/O overhead. Second, we present SIMPLE, a dynamic data placement system that maintains a GPU buffer for frequently accessed inputs and optimizes data reuse through an interval selection algorithm with approximation guarantees. Together, ETC and SIMPLE significantly accelerate T-GNN training, achieving up to 62.4× speedup over state-of-the-art baselines while preserving model accuracy, as demonstrated by extensive experiments on real-world datasets.

TPE Committee

Chair of Committee: Prof. ZHOU, Xiaofang (Online)

Prime Supervisor: Prof. YANG, Can (Online)

Co-Supervisor: Prof. CHEN, Lei

Examiner: Prof. ZHANG, Yongqi

Date

04 August 2025

Time

15:00:00 - 16:00:00

Venue

E3-201 (HKUST-GZ)

Join Link

Zoom Meeting ID: 971 7136 0711

Passcode: dsa2025