Efficient IO for Graph Processing

DSA Thrust Seminar

Abstract

Graphs are widely used in many domains because they can flexibly express the relations among entities as edges. However, this flexibility also leads to random data access and poor IO performance, since the basic operation of graph processing is to access the neighbors of a node, which are usually randomly scattered. In this talk, I will share how to tailor system designs to improve IO efficiency for graph processing systems. I will cover three typical computing scenarios, i.e., in-memory graph processing, disk-based graph processing, and distributed graph processing, and present the core challenges and key techniques for each scenario, e.g., cache misses and prefetching for in-memory processing, read amplification and data packing for disk-based processing, and network communication and computation push for distributed processing.

Speaker Bio

Yan Xiao

Research Scientist

CPII, Hong Kong

Dr. Yan is a research scientist at the Centre for Perceptual and Interactive Intelligence (CPII), Hong Kong. He received his PhD degree in Computer Science and Engineering from the Chinese University of Hong Kong. His research interests are machine learning systems and database systems, including graph processing systems, graph learning systems, vector databases, and model training and inference systems. He has worked with top companies including Meta, AWS, Huawei, and Alibaba on system development, and won Track 2 of the Billion-scale Approximate Nearest Neighbor Search Challenge at NeurIPS’21.

Date

06 January 2025

Time

14:00 - 15:00

Venue

Room 202, 2/F, E3, The Hong Kong University of Science and Technology (Guangzhou)

Join Link

Zoom Meeting ID:
985 0373 6391

Passcode: dsat

Organizer

Data Science and Analytics Thrust

Contact Email

dsat@hkust-gz.edu.cn