Homepage - Shatong Zhu

Education

Stanford University

Department of Electrical Engineering

M.S., Software & Computer Systems focus

Aug. 2025 - Jun. 2027 (expected)

Tongji University

School of Computer Science

B.Eng. in Data Science

Sep. 2021 - Jun. 2025

Honors & Awards

ICLR 2026 paper acceptance in cooperative multi-agent reinforcement learning

2026

NeurIPS 2024 Spotlight paper on context-based offline meta-reinforcement learning

2024

National Scholarship, Tongji University (top 0.2%), second-time laureate

2024

National First Prize, RoboCup China Open

2024

National First Prize, Information Security and Countermeasures Competition

2024

Outstanding Graduate, Tongji University

2025

Experience

NVIDIA

Incoming Software Engineer Intern, AI/ML

Incoming

Santa Clara, CA

Incoming role focused on AI/ML software systems and infrastructure.

AI/ML, AI Infrastructure, Software Engineering

Will work on production AI/ML systems, combining software engineering with AI infrastructure.

ZhongAn Insurance

Software Engineer Intern, AI/ML

Jun. 2024 - Sep. 2024

Shanghai, China

Built agent-powered BI workflows for Text-to-SQL and structured analytics.

Agentic BI, Text-to-SQL, RAG, LangChain, FastAPI, FAISS

Built a conversational intake workflow that generated structured JSON for downstream BI analytics and automation.
Upgraded the LangChain pipeline with RAG, a FastAPI retrieval API, FAISS vector search, and HTTP connection pooling for more stable serving.
Led a 4-intern team with code reviews and documentation standards to deliver the system on schedule.

MeetSocial

Software Engineer Intern, LLM Applications

Mar. 2024 - Jun. 2024

Shanghai, China

Contributed to early LLM application systems including MeetAsk, RAG search, and multi-agent workflows.

LLM Applications, RAG, Milvus, Elasticsearch, LangChain, Multi-Agent Systems

Implemented RAG modules for an enterprise Q&A platform with hybrid retrieval over vector and keyword search systems.
Optimized prompt and multi-agent workflows for topic identification, keyword extraction, and structured response quality.
Built review-insight and content-understanding pipelines using topic modeling and LLM-assisted summarization.
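
The hybrid retrieval mentioned above merges results from a vector index and a keyword index. One common way to do that is reciprocal rank fusion (RRF); the sketch below is only an illustration under that assumption, not the method used at MeetSocial. The function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs are all hypothetical.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Ties are broken by document ID so the
    output is deterministic.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical outputs of a vector search and a keyword search.
    vector_hits = ["a", "b", "c"]
    keyword_hits = ["b", "d", "a"]
    print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
```

RRF only needs ranks, so it avoids having to put vector similarity scores and keyword relevance scores on a common scale.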

News

2026

Our cooperative multi-agent reinforcement learning paper was accepted to ICLR 2026.

Jan 26

2025

Started the M.S. program in Electrical Engineering at Stanford University.

Aug 01

Received M.S. admission offers from Stanford, CMU, Berkeley, and other universities.

Apr 01

2024

Awarded the National Scholarship for the second time at Tongji University.

Nov 01

Our paper on context-based offline meta-reinforcement learning was accepted to NeurIPS 2024 as a Spotlight.

Sep 01

Joined ZhongAn Insurance as a Software Engineer Intern working on AI/ML systems.

Jun 01

Selected Publications

Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning

Chang Huang*, Shatong Zhu*, Junqiao Zhao, Hongtu Zhou, Hai Zhang, Di Zhang, Chen Ye, Ziqiao Wang, Guang Chen (* equal contribution)

International Conference on Learning Representations (ICLR) 2026, Poster

Value function factorization is widely used in cooperative multi-agent reinforcement learning, but monotonicity constraints can limit expressiveness and hinder optimal policy learning. This work proposes Potentially Optimal Joint Actions Weighting (POW), an architecture-agnostic method that iteratively identifies potentially optimal joint actions and assigns them higher training weights. The approach provides a theoretical guarantee for recovering the optimal joint policy and improves stability and performance across matrix games, difficulty-enhanced predator-prey, SMAC, SMACv2, and highway-env intersection scenarios.

[OpenReview] [PDF]
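
The abstract above describes giving higher training weights to joint actions identified as potentially optimal. The toy sketch below illustrates only that general weighting idea on a 2x2 matrix game. It is not the paper's algorithm: the flagging rule in `flag_candidates`, the `low_weight` value, and the payoff numbers are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

JointAction = Tuple[int, int]


def flag_candidates(targets: Dict[JointAction, float], tol: float = 0.0) -> Set[JointAction]:
    """Hypothetical rule: flag joint actions whose target is within tol of the best target."""
    best = max(targets.values())
    return {action for action, value in targets.items() if value >= best - tol}


def weighted_loss(
    q_joint: Dict[JointAction, float],
    targets: Dict[JointAction, float],
    flagged: Set[JointAction],
    low_weight: float = 0.1,
) -> float:
    """Mean weighted squared error: flagged joint actions get weight 1.0, others low_weight."""
    total = 0.0
    for action, target in targets.items():
        weight = 1.0 if action in flagged else low_weight
        total += weight * (q_joint[action] - target) ** 2
    return total / len(targets)


if __name__ == "__main__":
    # Illustrative payoff matrix for two agents with two actions each.
    targets = {(0, 0): 8.0, (0, 1): -12.0, (1, 0): -12.0, (1, 1): 0.0}
    q_joint = {action: 0.0 for action in targets}  # untrained estimates
    flagged = flag_candidates(targets)
    print(flagged, weighted_loss(q_joint, targets, flagged))
```

With these numbers only the joint action (0, 0) is flagged, so its error dominates the loss while the heavily penalized off-diagonal actions contribute less.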

Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning

Lanqing Li*, Hai Zhang*, Xinyu Zhang, Shatong Zhu, Yang Yu, Junqiao Zhao, Pheng-Ann Heng (* equal contribution)

Advances in Neural Information Processing Systems (NeurIPS) 2024, Spotlight

As a marriage between offline reinforcement learning and meta-reinforcement learning, context-based offline meta-RL aims to learn a universal policy conditioned on effective task representations. This work shows that several mainstream COMRL methods can be understood as optimizing the same mutual-information objective between the task variable and its latent representation via different approximate bounds. The framework leads to supervised and self-supervised implementations that generalize across RL benchmarks, context shift scenarios, data qualities, and deep learning architectures, providing an information-theoretic foundation for task representation learning in offline meta-RL.

[arXiv] [DOI] [PDF] [Code]
