SPARSE MATRIX-MATRIX MULTIPLICATION ON GPUS: A COMPREHENSIVE SURVEY

The Hong Kong University of Science and Technology (Guangzhou)

Data Science and Analytics Thrust

PhD Qualifying Examination

By Mr. Ruibo FAN

Abstract

Sparse Matrix-Matrix Multiplication (SpMM) is a fundamental operation in many scientific and machine learning applications, particularly in Graph Neural Networks (GNNs), Convolutional Neural Networks (CNNs), and Large Language Models (LLMs). Efficient SpMM is critical because the matrices involved in these applications are both large and highly sparse. This survey provides a comprehensive overview of state-of-the-art techniques for optimizing SpMM on GPUs, covering advances in memory access optimization, workload balancing, compiler optimizations, Tensor Core utilization, matrix reordering, and adaptive frameworks.

Performance evaluations on NVIDIA's RTX 3090 and RTX 4090 GPUs show significant performance gains from Tensor Core-based implementations, particularly advanced methods such as DTC-SpMM. These results underline the importance of specialized hardware and sophisticated optimization strategies for achieving high SpMM performance. The analysis distills the key advances and open challenges in GPU-accelerated SpMM and points to directions for future research.

PQE Committee

Chairperson: Prof. Nan TANG

Prime Supervisor: Prof. Xiaowen CHU

Co-Supervisor: Prof. Wei WANG

Examiner: Prof. Zeyi WEN

Date

June 4, 2024

Time

13:30:00 - 14:45:00

Venue

E1-150

Join Link

Zoom Meeting ID: 863 2469 1073

Passcode: dsa2024
