Yifan Zhang - AI Research

Princeton University · AI Research

Yifan Zhang

PhD candidate at Princeton University and Princeton AI Lab Fellow, focusing on Large Language Models and Multimodal Foundation Models — especially Reinforcement Learning and LLM Reasoning, Pretraining and Language Modeling, as well as Agentic AI and Agentic RL.

About

About Me

I am a PhD candidate at Princeton University and a Princeton AI Lab Fellow, working with Prof. Mengdi Wang, Prof. Andrew Yao, and Prof. Quanquan Gu. My research focuses on building scalable and capable large language models (LLMs) and multimodal foundation models. I explore methods to improve LLM reasoning and agentic AI via reinforcement learning, advance the data and algorithms behind foundation models, and develop new attention mechanisms, positional encodings, and model architectures.

Previously, I was a visiting PhD student at the UCLA AGI Lab, and a Top Seed researcher with the Seed Foundation Model Team, working on LLM and MLLM pretraining and scaling.

For a concise overview of my research, see my interactive research-talk slides.

If you are interested in my research or projects, I would be happy to discuss potential collaborations via email.


Focus

Research Interests

01 Reinforcement Learning and LLM Reasoning

02 Pretraining and Language Modeling

03 Agentic AI and Agentic RL

You can find my publications on Google Scholar, and my writing at Yifan's Blog.

Research

Selected Works

[SDPG] Self-Distilled Policy Gradient

Yifeng Liu*, Shiyuan Zhang*, Yifan Zhang†, Quanquan Gu

arXiv:2606.04036

Project Page Website

[Falcon] Fast-Weight Attention for Continual Learning

Yifan Zhang et al.

Preprint

Project Page Website

FlashSampling: Fast and Memory-Efficient Exact Sampling

Tomas Ruiz*, Zhen Qin*, Yifan Zhang†, Xuyang Shen, Yiran Zhong, Mengdi Wang†

arXiv:2603.15854

Project Page Website

Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity

Seed Team

Project Page Website

[DDL] Deep Delta Learning

Yifan Zhang, Yifeng Liu, Mengdi Wang, Quanquan Gu

arXiv:2601.00417

Project Page Website

[GRAPE] Group Representational Position Encoding

Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Yang Yuan, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026)

Project Page Website

[RPG] On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Yang Yuan, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026); see also Thinking Machines Tinker and DeepSeek V3.2

Project Page Website

[TPA] Tensor Product Attention Is All You Need

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

Project Page Website

[GPM] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

International Conference on Machine Learning (ICML 2025)

Project Page Website

[Proposer-Verifier] Cumulative Reasoning with Large Language Models

Yifan Zhang*, Jingqin Yang*, Yang Yuan, Andrew C Yao

Transactions on Machine Learning Research (TMLR)

Project Page Website

(* denotes equal contribution, † denotes corresponding authors)

Papers

Recent Publications

Group Representational Position Encoding

Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026)

Project Page Website

On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026); see also Thinking Machines Tinker and DeepSeek V3.2

Project Page Website

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Zhongyuan Peng*, Yifan Yao*, Kaijing Ma*, Shuyue Guo, Yizhe Li, Yichi Zhang, Chenchen Zhang, Yifan Zhang, et al.

Annual Meeting of the Association for Computational Linguistics (ACL 2026)

Tensor Product Attention Is All You Need

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

Project Page Website

Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Yifan Zhang*, Yifan Luo*, Yang Yuan, Andrew C Yao

Findings of the Association for Computational Linguistics (ACL 2025 Findings)

Project Page Website

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

International Conference on Machine Learning (ICML 2025)

Project Page Website

Augmenting Math Word Problems via Iterative Question Composing

Haoxiong Liu*, Yifan Zhang*, Yifan Luo, Andrew C Yao

AAAI Conference on Artificial Intelligence (AAAI 2025)

Project Page Website

Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks

Rui Hu*, Yifan Zhang*, Zhuoran Li, Longbo Huang

International Conference on Learning Representations (ICLR 2025 Spotlight)

(* denotes equal contribution)

All Publications

Writing

Blog Highlights

Visit the Blog

Service

Professional Activities

Teaching

Teaching Assistant, Machine Learning for Yao Class, IIIS, Tsinghua University

Academic Services

Conference Reviewer: NeurIPS, ICLR, ICML, COLM, AAAI, AISTATS
Journal Reviewer: TMLR, IEEE TDSC, ACM TKDD, ACM TIST, Neurocomputing, Neural Networks

Awards

William G. Bowen Merit Fellowship at Princeton University (awarded to only one student in each academic division)
2025 Stanford University-Elsevier World's Top 2% Scientist