ACCELERATING SPARSE COMPUTING ON GPUS: FROM GRAPH NEURAL NETWORKS TO LARGE LANGUAGE MODELS
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Thesis Proposal Examination
By FAN Ruibo
Abstract
Sparse computing has become increasingly critical in modern artificial intelligence applications, from Graph Neural Networks (GNNs) to Large Language Models (LLMs). This thesis proposal presents a comprehensive research program addressing the fundamental challenges of accelerating sparse matrix operations on Graphics Processing Units (GPUs). The research tackles three key problems across the spectrum of sparse computing applications: efficient GNN training through hybrid-parallel algorithms, general sparse matrix multiplication using Tensor Cores, and low-level sparsity acceleration for LLM inference. The proposed work builds upon three published contributions: HP-SpMM for GNN acceleration, DTC-SpMM for general Tensor Core-based sparse computing, and SpInfer for sparse LLM inference. These works demonstrate significant performance improvements and establish new state-of-the-art results across diverse application domains. The research addresses critical gaps in current sparse computing techniques, particularly the challenge of efficiently utilizing modern GPU architectures for sparse operations while maintaining both performance and memory efficiency.
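The core kernel recurring throughout the abstract is sparse-dense matrix multiplication (SpMM), C = A × B with a sparse A and dense B, which underlies both GNN feature aggregation and pruned-LLM inference. The following is a minimal illustrative sketch (not taken from the thesis) of SpMM over the standard CSR format; the function name and layout are assumptions for illustration only:

```python
# Illustrative sketch: SpMM computes C = A @ B, where A is sparse
# (stored here in CSR: indptr, indices, data) and B is dense.
# Real GPU kernels (e.g., Tensor Core-based designs) parallelize and
# tile these loops; this scalar version only shows the data flow.

def spmm_csr(indptr, indices, data, B, n_rows):
    """Multiply a CSR sparse matrix A (n_rows x k) by a dense B (k x n)."""
    n_cols = len(B[0])
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):                       # each sparse row of A
        for p in range(indptr[i], indptr[i + 1]): # its nonzeros
            j, v = indices[p], data[p]
            for c in range(n_cols):               # scale-add row j of B
                C[i][c] += v * B[j][c]
    return C

# A = [[1, 0], [0, 2]] in CSR form
indptr, indices, data = [0, 1, 2], [0, 1], [1.0, 2.0]
B = [[1.0, 2.0], [3.0, 4.0]]
print(spmm_csr(indptr, indices, data, B, 2))  # [[1.0, 2.0], [6.0, 8.0]]
```

The nested-loop structure makes the GPU challenge visible: the irregular inner extent `indptr[i+1] - indptr[i]` causes load imbalance and scattered memory access, which the proposed works address with hybrid parallelism and Tensor Core-friendly formats.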
TPE Committee
Chair of Committee: Prof. TANG Nan
Prime Supervisor: Prof. CHU Xiaowen
Co-Supervisor: Prof. WANG Wei
Examiner: Prof. WEN Zeyi
Date
09 June 2025
Time
10:00 - 11:00
Venue
E1-147 (HKUST-GZ)