Rosie Zhao | research

My current research interests span the following:

understanding training dynamics of deep models, particularly with respect to factors such as implicit bias, random variation, and scale
finetuning language models for domains like reasoning or alignment
reinforcement learning

I’ve also had the privilege of previously working on research projects in theoretical computer science and applied machine learning. See a list of publications below, or check out my Google Scholar for a more up-to-date list.

2025

Improving SOAP Using Iterative Whitening and Muon

Vyas, Nikhil, Zhao, Rosie, Morwani, Depen, Kwun, Mujin, and Kakade, Sham

2025
Distributional Scaling Laws for Emergent Capabilities

Zhao, Rosie, Qin, Tian, Alvarez-Melis, David, Kakade, Sham, and Saphra, Naomi

arXiv preprint arXiv:2502.17356 2025
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining

Zhao, Rosie, Meterez, Alexandru, Kakade, Sham, Pehlevan, Cengiz, Jelassi, Samy, and Malach, Eran

2025

2024

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

Panangaden, Prakash, Rezaei-Shoshtari, Sahand, Zhao, Rosie, Meger, David, and Precup, Doina

Journal of Machine Learning Research 2024
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning

Vyas, Nikhil, Morwani, Depen, Zhao, Rosie, Kaplun, Gal, Kakade, Sham M, and Barak, Boaz

In Forty-first International Conference on Machine Learning 2024
Feature emergence via margin maximization: case studies in algebraic tasks

Morwani, Depen, Edelman, Benjamin L, Oncescu, Costin-Andrei, Zhao, Rosie, and Kakade, Sham M

In The Twelfth International Conference on Learning Representations 2024
Deconstructing What Makes a Good Optimizer for Language Models

Zhao, Rosie, Morwani, Depen, Brandfonbrener, David, Vyas, Nikhil, and Kakade, Sham

arXiv preprint arXiv:2407.07972 2024
SOAP: Improving and Stabilizing Shampoo using Adam

Vyas, Nikhil, Morwani, Depen, Zhao, Rosie, Shapira, Itai, Brandfonbrener, David, Janson, Lucas, and Kakade, Sham

arXiv preprint arXiv:2409.11321 2024
Creating a Cooperative AI Policymaking Platform through Open Source Collaboration

Lewington, Aiden, Vittalam, Alekhya, Singh, Anshumaan, Uppuluri, Anuja, Ashok, Arjun, Athmaram, Ashrith Mandayam, Milt, Austin, Smith, Benjamin, Weinberger, Charlie, Sarin, Chatanya, and others

arXiv preprint arXiv:2412.06936 2024

2023

On the peel number and the leaf-height of Galton–Watson trees

Devroye, Luc, Goh, Marcel K, and Zhao, Rosie Y

Combinatorics, Probability and Computing 2023
Loss of plasticity in continual deep reinforcement learning

Abbas, Zaheer, Zhao, Rosie, Modayil, Joseph, White, Adam, and Machado, Marlos C

In Conference on Lifelong Learning Agents 2023

2022

Leaf multiplicity in a Bienaymé-Galton-Watson tree

Brandenberger, Anna M, Devroye, Luc, Goh, Marcel K, and Zhao, Rosie Y

Discrete Mathematics & Theoretical Computer Science 2022
Boolean functions with small approximate spectral norm

Cheung, Tsun Ming, Hatami, Hamed, Zhao, Rosie, and Zilberstein, Itai

In 2022
Lower bound methods for sign-rank and their limitations

Hatami, Hamed, Hatami, Pooya, Pires, William, Tao, Ran, and Zhao, Rosie

In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022) 2022
Continuous MDP homomorphisms and homomorphic policy gradient

Rezaei-Shoshtari, Sahand, Zhao, Rosie, Panangaden, Prakash, Meger, David, and Precup, Doina

Advances in Neural Information Processing Systems 2022
Continuous Homomorphisms and Leveraging Symmetries in Policy Gradient Algorithms for Markov Decision Processes

Zhao, Rosie Y

2022

2021

Bridging the gap between supervised classification and unsupervised topic modelling for social-media assisted crisis management

Brunila, Mikael, Zhao, Rosie, Mircea, Andrei, Lumley, Sam, and Sieber, Renee

In Proceedings of the Second Workshop on Domain Adaptation for NLP 2021
Arithmetic Subsequences in a Random Ordering of an Additive Set

Goh, Marcel K, and Zhao, Rosie Y

INTEGERS 2021

2020

Using deep learning and social network analysis to understand and manage extreme flooding

Romascanu, Andrei, Ker, Hannah, Sieber, Renee, Greenidge, Sarah, Lumley, Sam, Bush, Drew, Morgan, Stefan, Zhao, Rosie, and Brunila, Mikael

Journal of Contingencies and Crisis Management 2020