Preprints

Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models
Mickel Liu*, Liwei Jiang*, Yancheng Liang, Simon Shaolei Du, Yejin Choi, Tim Althoff*, Natasha Jaques*
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
Bo Liu*, Leon Guertler*, Simon Yu*, Zichen Liu*, Penghui Qi, Daniel Balcells, Mickel Liu, Cheston Tan, Weiyan Shi, Min Lin, Wee Sun Lee, Natasha Jaques
Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward
Yanming Wan*, Jiaxing Wu*, Marwa Abdulhai, Lior Shani, Natasha Jaques
Improving Human-AI Coordination through Online Adversarial Training and Generative Models
Paresh Chaudhary, Yancheng Liang, Daphne Chen, Simon S. Du, Natasha Jaques

2025

ReaLJam: Real-Time, Synchronous Human-AI Music Jamming with Reinforcement Learning-Tuned Transformers
Alexander Scarlatos, Yusong Wu, Ian Simon, Adam Roberts, Tim Cooijmans, Natasha Jaques, Cassie Tarakajian, Anna Huang
Extended Abstracts of The ACM Conference on Human Factors in Computing Systems (CHI) 2025
Achieving Human Level Competitive Robot Table Tennis
David D’Ambrosio, Saminda Wishwajith Abeyruwan, Laura Graesser, Atil Iscen, Heni Ben Amor, Alex Bewley, Barney J. Reed, Krista Reymann, Leila Takayama, Yuval Tassa, Krzysztof Choromanski, Erwin Coumans, Deepali Jain, Navdeep Jaitly, Natasha Jaques, Satoshi Kataoka, Yuheng Kuang, Nevena Lazic, Reza Mahjourian, Sherry Moore, Kenneth Oslund, Anish Shankar, Vikas Sindhwani, Vincent Vanhoucke, Grace Vesom, Peng Xu, Pannag Sanketi
IEEE International Conference on Robotics and Automation (ICRA) 2025 (Best Paper Finalist)
Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination
Kunal Jha, Wilka Carvalho, Yancheng Liang, Simon S. Du, Max Kleiman-Weiner*, Natasha Jaques*
International Conference on Machine Learning (ICML) 2025 (Oral Paper - Top 1%) and CogSci 2025
InvestESG: A multi-agent reinforcement learning benchmark for studying climate investment as a social dilemma
Xiaoxuan Hou, Jiayi Yuan, Joel Z. Leibo, Natasha Jaques
International Conference on Learning Representations (ICLR) 2025

2024

Infer Human’s Intentions Before Following Natural Language Instructions
Yanming Wan, Yue Wu, Yiping Wang, Jiayuan Mao*, Natasha Jaques*
AAAI Conference on Artificial Intelligence (AAAI) 2025
Multi Agent Reinforcement Learning for Sequential Satellite Assignment Problems
Josh Holder, Natasha Jaques, Mehran Mesbahi
AAAI Conference on Artificial Intelligence (AAAI) 2025 (Oral Paper - Top 5%)
Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
Sriyash Poddar, Yanming Wan, Hamish Ivison, Abhishek Gupta*, Natasha Jaques*
Neural Information Processing Systems (NeurIPS) 2024 (Spotlight - Top 2%)
Learning to Cooperate with Humans Using Generative Agents
Yancheng Liang, Daphne Chen, Abhishek Gupta, Simon Du, Natasha Jaques
Neural Information Processing Systems (NeurIPS) 2024
Impossibility theorems for feature attribution
Blair Bilodeau, Natasha Jaques, Pang-Wei Koh, Been Kim
Proceedings of the National Academy of Sciences (PNAS) 2024
Adaptive Accompaniment with ReaLchords
Yusong Wu, Tim Cooijmans, Kyle Kastner, Adam Roberts, Ian Simon, Alexander Scarlatos, Chris Donahue, Cassie Tarakajian, Shayegan Omidshafiei, Aaron Courville, Pablo Samuel Castro, Natasha Jaques, Cheng-Zhi Anna Huang
International Conference on Machine Learning (ICML) 2024