Hao Lin @ Alibaba Cloud

Hao Lin

R&D Engineer,
Platform of Artificial Intelligence (PAI),
Alibaba Cloud,
Hangzhou, Zhejiang, China.

[ Biography | Education | Working Experience | Publication and Preprint | Award | Invited Talk | Teaching Assistant | Correspondence ]

Biography

Currently, I am an R&D engineer on the Platform of Artificial Intelligence (PAI) team at Alibaba Cloud. My research interests include machine learning systems and reinforcement learning systems.

Prior to this position, I received my Master's degree from the Department of Computer Science and Technology at Nanjing University in June 2024, under the supervision of Professor Wu-Jun Li.

Before that, I received my B.Sc. degree from the Department of Computer Science and Technology at Nanjing University in June 2021. In the same year, I was admitted to the Master's program without taking the entrance examination.

Education

Working Experience

Publication and Preprint

Qwen3 
  • Chujie Zheng†, Kai Dang, Bowen Yu†, Mingze Li, Huiqiang Jiang, Junrong Lin, Yuqiong Liu, Hao Lin, Chencan Wu, Feng Hu, An Yang, Jingren Zhou, Junyang Lin: Stabilizing Reinforcement Learning with LLMs: Formulation and Practices. CoRR abs/2512.01374, 2025. [PDF]

  • We propose a new formulation for reinforcement learning with LLMs, viewing the token-level optimization objective as a first-order approximation to the true expected sequence-level reward.
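  The paper's exact objective is not reproduced here. As a minimal sketch under a plain single-sample REINFORCE assumption, the sequence-level objective E[R(y)] has gradient E[R(y) Σ_t ∇ log πθ(y_t | y_<t)], so a token-level surrogate loss with that same gradient can be written as below; the function name token_level_surrogate and the numbers are made up for illustration.

  ```python
  import torch

  def token_level_surrogate(logprobs, reward):
      """Token-level surrogate loss whose gradient equals the single-sample
      REINFORCE gradient of the sequence-level reward objective:
          grad E[R(y)] ~= R(y) * sum_t grad log p(y_t | y_<t).
      logprobs: (T,) log-probabilities of the sampled tokens
      reward:   scalar sequence-level reward (a plain number here; it only
                scales the token log-prob gradients)
      """
      return -(reward * logprobs).sum()

  # Hypothetical usage with made-up numbers:
  logprobs = torch.tensor([-0.2, -1.3, -0.7], requires_grad=True)
  loss = token_level_surrogate(logprobs, reward=1.5)
  loss.backward()
  print(logprobs.grad)  # -1.5 at every token position
  ```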

UniAP 
  • Hao Lin*, Ke Wu*, Jie Li*, Jun Li, and Wu-Jun Li†: UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming. Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 20947-20957. Award Candidate. [PDF] [Supplemental] [Slide] [YouTube] [bibtex]

  • We propose UniAP, an automatic parallelism framework that uses mixed integer quadratic programming (MIQP) to jointly optimize data parallelism (DP), tensor parallelism (TP), fully sharded data parallelism (FSDP), and pipeline parallelism (PP) for efficient large-model training. Experimental results show that UniAP outperforms state-of-the-art methods by up to 3.80x in throughput and reduces strategy optimization time by up to 107x across five Transformer-based models.
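  The MIQP formulation itself is not reproduced here. The toy sketch below only illustrates the underlying idea of choosing parallel degrees jointly rather than one axis at a time; it uses brute-force enumeration instead of MIQP, and search_parallel_strategy, toy_cost, and all constants are hypothetical, not taken from the paper.

  ```python
  from itertools import product

  def search_parallel_strategy(num_gpus, cost_model):
      """Toy joint search over (data, tensor, pipeline) parallel degrees.
      UniAP solves this jointly with MIQP; here the tiny search space is
      brute-forced purely to illustrate the joint-optimization idea."""
      best = None
      for dp, tp, pp in product([1, 2, 4, 8], repeat=3):
          if dp * tp * pp != num_gpus:   # a strategy must use all GPUs
              continue
          cost = cost_model(dp, tp, pp)
          if best is None or cost < best[0]:
              best = (cost, (dp, tp, pp))
      return best

  def toy_cost(dp, tp, pp):
      """Made-up per-step cost model (constants are illustrative only)."""
      compute = 100.0 / (dp * tp * pp)        # ideal compute scaling
      comm = 2.0 * (tp - 1) + 1.0 * (dp - 1)  # TP/DP communication penalty
      bubble = 5.0 * (pp - 1) / pp            # pipeline bubble overhead
      return compute + comm + bubble

  print(search_parallel_strategy(8, toy_cost))
  ```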

Qwen3 
  • Qwen Team‡: Qwen3 Technical Report. CoRR abs/2505.09388, 2025. [PDF]

  • We present the Qwen3 series of models, covering both dense and Mixture-of-Experts (MoE) architectures. Qwen3 integrates thinking mode and non-thinking mode into a unified framework. Empirical evaluations demonstrate that Qwen3 achieves state-of-the-art results across diverse benchmarks.

Qwen2.5 
  • Qwen Team‡: Qwen2.5 Technical Report. CoRR abs/2412.15115, 2024. [PDF]

  • We introduce Qwen2.5, a comprehensive series of large language models (LLMs). Qwen2.5 demonstrates top-tier performance on a wide range of benchmarks. The Qwen2.5 models have also been instrumental in training specialized models such as Qwen2.5-Math, Qwen2.5-Coder, QwQ, and multimodal models.

(*: equal contribution. †: corresponding author. ‡: one of the contributors)

Award

Invited Talk

Teaching Assistant

Correspondence

E-mail Address

baodong.lh{AT}alibaba-inc.com (Business)
hao.lin.msc{AT}gmail.com (Private)



   
