A Survey of Communication Efficiency in Mixture of Experts Systems: Foundations, Frontiers, and Future Outlook
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Qualifying Examination
By Mr. PAN, Xinglin
Abstract
In recent years, Mixture of Experts (MoE) has become a prominent architectural choice for large-scale models, driven by its advantages in parameter scalability, conditional computation, and task specialization. However, as parameter counts scale and deployment environments grow, MoE systems face increasing challenges in execution efficiency and scalability, particularly in distributed settings. Among these challenges, the communication overhead introduced by expert parallelism and token routing has become a significant system bottleneck. To address this, communication-aware system optimization has emerged as a crucial and rapidly developing area of research. This survey presents a focused review of communication efficiency in MoE systems. We begin by outlining the architectural characteristics of MoE and analyzing the communication bottlenecks it introduces. We then systematically review recent research progress from three core system perspectives: overlapping communication and computation, optimizing All-to-All primitives, and balancing expert workloads across devices. We compare representative techniques for each dimension and summarize their application scenarios and performance trade-offs. Finally, we discuss current challenges and outline promising directions for future research in communication-efficient MoE system design.
PQE Committee
Chair of Committee: Prof. TANG Nan
Prime Supervisor: Prof. CHU Xiaowen
Co-Supervisor: Prof. WANG Wei
Examiner: Prof. WEN Zeyi
Date
09 June 2025
Time
09:00:00 - 10:00:00
Location
E1-147 (HKUST-GZ)
Join Link
Zoom Meeting ID: 983 5853 6058
Passcode: dsa2025