About me

I am currently a second-year Master’s student at the National Engineering Research Center for Software Engineering, Peking University. Previously, I had the fortune to collaborate with researchers from THUNLP, Shanghai AI Lab, and the ByteDance Seed UI-TARS Team. My research interests include LLM agents (joint use of search, code, and GUI tools), code intelligence, and LLM reasoning.

I expect to graduate in 2027 and am currently on the job market! If you are interested in working with me, please feel free to contact me. 🔥

📖 Education

  • 2024.09 - 2027.06, National Engineering Research Center for Software Engineering, Peking University, Beijing, China.
  • 2020.09 - 2024.06, Software College, Northeastern University, Shenyang, China.

💼 Internships

  • 2025.07 - now, ByteDance Seed, Beijing, China.
  • 2024.07 - 2024.09, Shanghai AI Laboratory, Beijing, China.
  • 2023.10 - 2024.07, TsinghuaNLP & ModelBest Inc (面壁智能), Beijing, China.
  • 2022.08 - 2024.07, NEUIR Lab, Shenyang, China.
  • 2023.03 - 2023.06, The Knowledge Computing Lab, Peking University, Beijing, China.

๐Ÿ“ Preprints

* indicates equal contribution.

  • (Core Contributor) Seed1.8 Model Card: Towards Generalized Real-World Agency [Paper]

    ByteDance Seed.

  • UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning [Paper]

    ByteDance Seed UI-TARS Team.

  • Group Pattern Selection Optimization: Let LRMs Pick the Right Pattern for Reasoning [Paper]

    Hanbin Wang*, Jingwei Song*, Jinpeng Li, Fei Mi, Lifeng Shang.

  • Teaching Large Reasoning Models Effective Reflection [Paper]

    Hanbin Wang*, Jingwei Song*, Jinpeng Li, Qi Zhu, Fei Mi, Ganqu Cui, Yasheng Wang, Lifeng Shang.

  • From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones [Paper]

    Lifan Yuan, Weize Chen, Yuchen Zhang, Ganqu Cui, Hanbin Wang, Ziming You, Ning Ding, Zhiyuan Liu, Maosong Sun, Hao Peng.

  • Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities [Paper]

    Hanbin Wang*, Xiaoxuan Zhou*, Zhipeng Xu, Keyuan Cheng, Yuxin Zuo, Kai Tian, Jingwei Song, Junting Lu, Wenhui Hu, Xueyang Liu.

  • Process Reinforcement through Implicit Rewards [Paper][Blog]

    Ganqu Cui*, Lifan Yuan*, Zefan Wang, Hanbin Wang, Wendi Li, Bingxiang He, Yuchen Fan, Tianyu Yu, Qixin Xu, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding*.

๐Ÿ“ Publications

* indicates equal contribution.

  • (ACL 2025) CODEMENV: Benchmarking Large Language Models on Code Migration [Paper]

    Keyuan Cheng, Xudong Shen, Yihao Yang, Tengyue Wang, Yang Cao, Muhammad Asif Ali, Hanbin Wang, Lijie Hu, Di Wang.

  • (ACL 2025) KnowCoder-X: Boosting Multilingual Information Extraction via Code [Paper]

    Yuxin Zuo, Wenxuan Jiang, Wenxuan Liu, Zixuan Li, Long Bai, Hanbin Wang, Yutao Zeng, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng.

  • (ICLR 2025) Advancing LLM Reasoning Generalists with Preference Trees [Paper]

    Lifan Yuan*, Ganqu Cui*, Hanbin Wang*, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun.

  • (NAACL 2025) Enhancing the Code Debugging Ability of LLMs via Communicative Agent Based Data Refinement [Paper]

    Weiqing Yang*, Hanbin Wang*, Zhenghao Liu, Xinze Li, Yukun Yan, Shuo Wang, Yu Gu, Minghe Yu, Zhiyuan Liu, Ge Yu.

  • (TOIS 2024) Building A Coding Assistant via the Retrieval-Augmented Language Model

    Xinze Li*, Hanbin Wang*, Zhenghao Liu, Shi Yu, Shuo Wang, Yukun Yan, Yukai Fu, Yu Gu, Ge Yu.

  • (ACL 2024) INTERVENOR: Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing [Paper]

    Hanbin Wang, Zhenghao Liu, Shuo Wang, Ganqu Cui, Ning Ding, Zhiyuan Liu, Ge Yu.