CV
Education
- Ph.D., King Abdullah University of Science and Technology, 2026 (expected)
- M.S., The Chinese University of Hong Kong, 2022
- B.S., Tianjin University, 2021
Work experience
Summer 2024: LLM Pretraining Engineer (Intern)
- Aramco
- Duties included: pretraining large-scale models on a cluster of 24 H100 GPUs; building a pretraining framework; and improving training throughput with CUDA kernel fusion, multi-thread scheduling, and asynchronous checkpointing.
- Supervisor: Salma Alsinan
Fall 2022: Research Assistant
- King Abdullah University of Science and Technology
- Duties included: Distributed Federated Learning
- Supervisor: Di Wang
Skills
- PyTorch / LibTorch: In-depth knowledge of the workflow and implementation of PyTorch operators, including distributed training packages and multi-threading / streaming programming.
- CUDA programming / Triton: Intermediate proficiency in CUDA stream and kernel programming, with a solid understanding of CUDA principles.
- DeepSpeed / Megatron: Experience using DeepSpeed and Megatron for distributed training, including manual implementation for optimization.
- Programming Languages: Python (mainly for PyTorch), C/C++ (mainly for multithreading, CUDA programming, and LibTorch).