About Me

I am Liangyu Wang, a Ph.D. candidate in Computer Science at King Abdullah University of Science and Technology (KAUST), specializing in efficient training and inference for large language models (LLMs) through distributed computing and advanced GPU programming. Before that, I completed my master's degree at The Chinese University of Hong Kong, focusing on multimodal machine learning.

Currently, I am conducting large language model pretraining research with the Alibaba Qwen Team.

My research interests include optimizing distributed training and inference of LLMs, improving multi-threaded and multi-stream scheduling, and enhancing privacy-preserving methods for LLMs. Previously, I interned as an LLM Pretraining Engineer at Aramco, where I worked with large-scale GPU clusters to improve training throughput and model scalability. My current work focuses on:

  • Efficient reinforcement learning (RL) for LLM reasoning
  • Distributed training and inference of LLMs
  • Efficient algorithm and infrastructure design for LLMs
  • Efficient privacy-preserving methods

News

Projects

ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory

  • A framework that enables fine-tuning of extremely large language models (such as OPT-175B) on limited GPU memory through zeroth-order optimization and CPU-GPU offloading (see the sketch below). GitHub
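The sketch below illustrates the zeroth-order idea in its simplest SPSA-style form: the gradient is estimated from two forward passes along a shared random direction, so no backward pass (and none of its activation or gradient memory) is needed. It is a minimal, illustrative NumPy example, not code from the ZO2 repository, which adds CPU-GPU offloading and LLM-scale engineering on top of this idea.

    import numpy as np

    def zo_sgd_step(theta, loss_fn, rng, lr=1e-4, eps=1e-3):
        # Estimate a directional derivative from two forward evaluations
        # along a shared random perturbation z, then step along z.
        z = rng.standard_normal(theta.shape)
        g = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
        return theta - lr * g * z

    # Toy usage: minimize a quadratic "loss" without ever computing its gradient.
    rng = np.random.default_rng(0)
    theta = np.ones(4)
    loss = lambda t: float(np.sum(t ** 2))
    for _ in range(1000):
        theta = zo_sgd_step(theta, loss, rng, lr=1e-1)
    print(round(loss(theta), 6))  # approaches 0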

Tiny-DeepSpeed: A Minimalistic Re-Implementation of DeepSpeed

  • A concise re-implementation of DeepSpeed, designed to help users understand the core functionalities of distributed training and model optimization (a minimal data-parallel sketch follows below). GitHub
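As a flavor of the kind of primitive such a walkthrough covers, here is a minimal sketch of data-parallel gradient averaging with torch.distributed. It is illustrative only, not code from Tiny-DeepSpeed, and it assumes the process group has already been initialized on each rank (e.g. via torchrun).

    import torch.distributed as dist

    def data_parallel_step(model, batch, loss_fn, optimizer):
        # Each rank computes gradients on its local shard of the batch,
        # then gradients are summed across ranks and averaged so every
        # replica applies the same update.
        inputs, targets = batch
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        world_size = dist.get_world_size()
        for p in model.parameters():
            if p.grad is not None:
                dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
                p.grad.div_(world_size)
        optimizer.step()
        return loss.item()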

Publications

  • ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory
    Liangyu Wang, Jie Ren, Hang Xu, Junxiao Wang, Huanyi Xie, David E. Keyes, and Di Wang
    NeurIPS Workshop 2024; arXiv preprint arXiv:2503.12668, 2025. Paper | Code

  • FlashDP: Memory-Efficient and High-Throughput DP-SGD Training for Large Language Models
    Liangyu Wang, Junxiao Wang, Jie Ren, Zihang Xiang, David E. Keyes, and Di Wang
    NeurIPS Workshop 2024. Paper

  • WiP: Towards Light Adaptation of Large Language Models For Personal Hardware
    Liangyu Wang, Junxiao Wang, and Di Wang
    MobiSys Workshop 2024. Paper

Reviewer Service