About me

I am currently a second-year Master’s student at the National Engineering Research Center for Software Engineering, Peking University. Previously, I had the good fortune to collaborate with researchers from NEUIR, THUNLP, ModelBest (面壁智能), and Shanghai AI Lab. My research interests include Code Intelligence, LLM Reasoning, and LLM Agents.

I am looking for PhD opportunities. Feel free to contact me! 🔥

📖 Education

  • 2024.09 - 2027.06, National Engineering Research Center for Software Engineering, Peking University, Beijing, China.
  • 2020.09 - 2024.06, Software College, Northeastern University, Shenyang, China.

💼 Internships

  • 2025.07 - now, ByteDance Seed, Beijing, China.
  • 2024.07 - 2024.09, Shanghai AI Laboratory, Beijing, China.
  • 2023.10 - 2024.07, TsinghuaNLP & ModelBest Inc (面壁智能), Beijing, China.
  • 2022.08 - 2024.07, NEUIR Lab, Shenyang, China.
  • 2023.03 - 2023.06, The Knowledge Computing Lab, Peking University, Beijing, China.

📝 Preprints

* indicates equal contribution.

  • Group Pattern Selection Optimization: Let LRMs Pick the Right Pattern for Reasoning [Paper]

    Hanbin Wang, Jingwei Song, Jinpeng Li, Fei Mi, Lifeng Shang.

  • UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning [Paper]

    ByteDance Seed UI-TARS Team.

  • From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones [Paper]

    Lifan Yuan, Weize Chen, Yuchen Zhang, Ganqu Cui, Hanbin Wang, Ziming You, Ning Ding, Zhiyuan Liu, Maosong Sun, Hao Peng.

  • Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities [Paper]

    Hanbin Wang, Xiaoxuan Zhou, Zhipeng Xu, Keyuan Cheng, Yuxin Zuo, Kai Tian, Jingwei Song, Junting Lu, Wenhui Hu, Xueyang Liu.

  • Process Reinforcement through Implicit Rewards [Paper][Blog]

    Ganqu Cui*, Lifan Yuan*, Zefan Wang, Hanbin Wang, Wendi Li, Bingxiang He, Yuchen Fan, Tianyu Yu, Qixin Xu, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding*.

📝 Publications

* indicates equal contribution.

  • (ACL 2025) CODEMENV: Benchmarking Large Language Models on Code Migration [Paper]

    Keyuan Cheng, Xudong Shen, Yihao Yang, Tengyue Wang, Yang Cao, Muhammad Asif Ali, Hanbin Wang, Lijie Hu, Di Wang.

  • (ACL 2025) KnowCoder-X: Boosting Multilingual Information Extraction via Code [Paper]

    Yuxin Zuo, Wenxuan Jiang, Wenxuan Liu, Zixuan Li, Long Bai, Hanbin Wang, Yutao Zeng, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng.

  • (ICLR 2025) Advancing LLM Reasoning Generalists with Preference Trees [Paper]

    Lifan Yuan*, Ganqu Cui*, Hanbin Wang*, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun.

  • (NAACL 2025) Enhancing the Code Debugging Ability of LLMs via Communicative Agent Based Data Refinement [Paper]

    Weiqing Yang*, Hanbin Wang*, Zhenghao Liu, Xinze Li, Yukun Yan, Shuo Wang, Yu Gu, Minghe Yu, Zhiyuan Liu, Ge Yu.

  • (TOIS 2024) Building A Coding Assistant via the Retrieval-Augmented Language Model

    Xinze Li*, Hanbin Wang*, Zhenghao Liu, Shi Yu, Shuo Wang, Yukun Yan, Yukai Fu, Yu Gu, Ge Yu.

  • (ACL 2024) INTERVENOR: Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing [Paper]

    Hanbin Wang, Zhenghao Liu, Shuo Wang, Ganqu Cui, Ning Ding, Zhiyuan Liu, Ge Yu.