Bridging Classification and Generation: From Neural Collapse in Imbalanced Classification to Discriminative-driven Image Generation
The Hong Kong University of Science and Technology (Guangzhou)
Data Science and Analytics Thrust
PhD Thesis Examination
By Ms. Xuantong LIU
Abstract
Deep learning models have demonstrated remarkable success in both classification and generation tasks, yet their behavior under imbalanced data distributions and their ability to adapt to new tasks remain active areas of research. This thesis investigates the intersection of discriminative learning and generative modeling, aiming to bridge the gap between classification and generation under practical and theoretical constraints. We identify three core challenges: (1) learning robust representations under imbalanced class distributions in deep classification models; (2) improving controllability in conditional generative modeling; and (3) developing a unified generative paradigm that works across modalities such as language and vision. To address these challenges, we present three contributions. First, we explore the phenomenon of Neural Collapse and propose a method to explicitly induce it in long-tailed classification tasks. This approach enhances generalization by enforcing geometric structure in the learned representations. Second, we propose a training-free framework for controllable image generation by inverting a pre-trained vision-language model. This method leverages the strong alignment capabilities of discriminative models to guide image synthesis, achieving strong controllability without generative training. Third, we perform a systematic exploration of applying autoregressive language models to image generation. We investigate key factors such as tokenization methods, scan patterns, scaling behavior, and vocabulary design, and show that next-token prediction, a language modeling paradigm, can, surprisingly, yield state-of-the-art performance in image generation. Together, these contributions provide a unified perspective on how discriminative structures and objectives can inform and improve generative modeling, and lay the groundwork for building more generalizable, controllable, and unified deep learning systems.
Thesis Examination Committee (TEC)
Chairperson: Prof Mark Nicholas GRIMSHAW-AAGAARD
Prime Supervisor: Prof Yuan YAO
Co-Supervisor: Prof Nevin Lianwen ZHANG
Examiners:
Prof Qiong LUO
Prof Lei LI
Prof Menglin YANG
Prof Zenglin XU
Date
22 April 2025
Time
15:00:00 - 17:00:00
Venue
W3-105, HKUST(GZ)
Join Link
Zoom Meeting ID: 93921426405
Passcode: dsa2025
Organizer
Data Science and Analytics Thrust
Contact Email
dsarpg@hkust-gz.edu.cn