About me

I am an Assistant Professor at USC in the Thomas Lord Dept of CS, and by courtesy in the Ming Hsieh Dept of ECE. Until recently, I was an SNSF postdoc working with Mike I. Jordan at UC Berkeley, and I obtained my PhD at EPFL, advised by Martin Jaggi. I also co-lead the Federated Learning for Health working group at MONAI with Holger Roth at NVIDIA. My work has been deployed across industry at Meta, Google, OpenAI, and Owkin.

Selected Awards

Some awards I have been fortunate to receive:
- Amazon Center on Secure & Trusted ML award.
- Capital One Fellowship.
- SNSF Mobility Fellowship.
- Patrick Denantes Memorial Prize for the best thesis in computer science.
- Chorafas Foundation Prize for exceptional applied research.

I am actively looking for Fall 2026 PhD students, especially in (a) data-centric ML: data valuation, agentic markets, synthetic data; (b) privacy and memorization in ML; (c) foundations of robust and secure ML. If any of this is you, reach out! :)

Teaching

- Fall 2025: CSCI 699 Privacy Preserving Machine Learning (website)
- Spring 2025: CSCI 599 Optimization for Machine Learning (website)
- Fall 2024: CSCI 699 Privacy Preserving Machine Learning (website)

My research

Soon, a network of interacting AI agents will be mediating our social, cultural, and political exchanges, giving rise to complex “machine behavior”. Understanding this goes well beyond the current scope of machine learning research, which focuses narrowly on developing better models on individual datasets. It instead requires taking a holistic view of the systems that arise from these agents interacting with society and with each other. I use tools from optimization, statistics, and economics to study and design such systems.

I am currently focusing on the role of data in such ML ecosystems. Data is the most important factor determining the quality of an ML system. However, we understand very little about what makes data good or bad. Further, the most valuable data (e.g. health records) are either extremely expensive to collect or simply inaccessible. Some questions I am thinking about nowadays:

- Large-scale Private & Federated Optimization. Medical data is subject to strict privacy regulations. How can we privately train ML models on data distributed across multiple hospitals without the data leaving the hospitals? This uses tools like Federated Learning, Differential Privacy, and optimization for large-scale machine learning.
- Data Valuation and Data Markets. The data commons that current AI relies on is disappearing. In order to build a sustainable data ecosystem, people need to be compensated for their data. But how much is a specific data point worth? This question requires understanding i) how data affects uncertainty in an ML model, and ii) the relative importance of data points.
- AI for Health. Healthcare comes with tons of data challenges on top of privacy concerns: the data may be highly heterogeneous and have missing features. Further, because of the high stakes involved, fairness and equity, reliable uncertainty quantification, and interpretable predictions are all extremely important.

News

- (Jun 25): Excited to be part of this workshop at TTIC this summer on Incentives in Data Sharing. We are now accepting submissions and registrations. See you all there soon!

- (Feb 25): I am co-organizing a TTIC summer workshop on Incentives in Data Sharing and Collaborative Learning on Aug 13-15. Please reach out if you’re interested in giving a talk or participating.

- (Jan 25): I was named a Capital One Fellow along with Robin Jia. Thank you for the support!

- (Dec 24): I received a joint appointment with the Ming Hsieh Dept of ECE. If you are interested in working with me, you can now apply to PhD programs in both the CS and ECE departments.

- (Nov 24): I will be giving talks in the Chicago area - in the computer science seminar at the University of Chicago on Nov 11, and in the LANS seminar at the Argonne National Laboratory.

- (Sep 24): I will be giving invited talks at USC Theory Lunch on Sep 19 and at INFORMS 24 on Oct 22, and a guest lecture in USC CSCI 697 on “Building Collaborative Data-Ecosystems”.

- (Sep 24): Serving as an area chair at ICLR 25.

- (Aug 24): I’m co-organizing the workshop on Federated Learning in the Age of Foundation Models as part of NeurIPS 2024. Please consider submitting your work!

- (Jun 24): I was appointed co-lead of the Data Quality and Federated Learning working group along with Holger Roth as part of MONAI. Looking forward to pushing the boundaries of AI for healthcare!

- (Jun 24): I will be in Boston for the Foundations of Responsible Computing conference (FORC 2024), Jun 12-14. Come say hi!

Publications

* indicates equal contribution.

2025

A Systematic Analysis of Base Model Choice for Reward Modeling.
Kian Ahrabian, Pegah Jandaghi, Negar Mokhberian, Sai Praneeth Karimireddy, Jay Pujara.
[ EMNLP Main 2025 ]
The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage.
Skyler Hallinan, Jaehun Jung, Melanie Sclar, Ximing Lu, Abhilasha Ravichander, Sahana Ramnath, Yejin Choi, Sai Praneeth Karimireddy, Niloofar Mireshghallah, Xiang Ren.
[ COLM 2025 ]
TruthTorchLM: A Comprehensive Library for Predicting Truthfulness in LLM Outputs.
Duygu Nur Yaldiz, Yavuz Faruk Bakman, Sungmin Kang, Alperen Öziş, Hayrettin Eren Yildiz, Mitash Ashish Shah, Zhiqi Huang, Anoop Kumar, Alfy Samuel, Daben Liu, Sai Praneeth Karimireddy, Salman Avestimehr.
[ EMNLP Demo 2025 ]
Reconsidering LLM Uncertainty Estimation Methods in the Wild.
Yavuz Faruk Bakman, Duygu Nur Yaldiz, Sungmin Kang, Tuo Zhang, Baturalp Buyukates, Salman Avestimehr, Sai Praneeth Karimireddy.
[ ACL Main 2025 ]
Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum.
Riccardo Zaccone, Sai Praneeth Karimireddy, Carlo Masone, Marco Ciccone.
[ TMLR 2025 ]
ContextLeak: Auditing Leakage in Private In-Context Learning Methods.
Jacob Choi, Shuying Cao, Xingjian Dong, Sai Praneeth Karimireddy.
[ ICML 2025 Workshop MemFM ]
LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation.
Ljubomir Rokvic, Panayiotis Danassis, Sai Praneeth Karimireddy, Boi Faltings.
[ IEEE International Conference on Big Data 2025 ]
The Price is Right? Making Data Valuations Incentive-Compatible.
Dongyang Fan, Tyler J. Rotello, Sai Praneeth Karimireddy.
[ ICLR Data Problems 2025 ], [ Short talk (5 mins) ], [ Slides ]

2024

DAVED: Data Acquisition via Experimental Design for Decentralized Data Markets.
Charles Lu, Baihe Huang, Sai Praneeth Karimireddy, Michael I. Jordan.
[ NeurIPS 2024 ]
Conformal Prediction Adaptive to Unknown Subpopulation Shifts.
Nien-Shao (Regan) Wang, Sai Praneeth Karimireddy.
[ NeurIPS SFLFM Workshop 2024 ]
Defection-Free Collaboration between Competitors in a Learning System.
Mariel Werner, Sai Praneeth Karimireddy, Michael I. Jordan.
[ NeurIPS FL@FM Workshop 2024 ], [ Arxiv 2024 ]
My-This-Your-That - Interpretable Identification of Systematic Bias in Federated Learning for Biomedical Images.
Mary-Anne Hartley, Klavdiia Naumova, Arnout Devos, Sai Praneeth Karimireddy, Martin Jaggi.
[ NPJ Digital Medicine 2024 ], [ Blog ]
Collaborative Heterogeneous Causal Inference Beyond Meta-analysis.
Tianyu Guo, Sai Praneeth Karimireddy, Michael I. Jordan.
[ ICML 2024 ]
Privacy Can Arise Endogenously in an Economic System with Learning Agents.
Nivasini Ananthakrishnan, Tiffany Ding, Mariel Werner, Sai Praneeth Karimireddy, Michael I. Jordan.
[ FORC 2024 ]
Optimization with Access to Auxiliary Information. (Invited for ICLR 2025 Oral)
El Mahdi Chayti, Sai Praneeth Karimireddy.
[ TMLR 2024 ]

2023

Provably Personalized and Robust Federated Learning.
Mariel Werner, Lie He, Sai Praneeth Karimireddy, Michael I. Jordan, Martin Jaggi.
[ TMLR 2023 ]
Federated Conformal Predictors for Distributed Uncertainty Quantification.
Charles Lu*, Yaodong Yu*, Sai Praneeth Karimireddy, Michael I. Jordan, Ramesh Raskar.
[ ICML 2023 ], [ Code ]
Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning.
Baihe Huang, Sai Praneeth Karimireddy, Michael I. Jordan.
[ ICML FL@FM Workshop 2023 ], [ Talk ], [ Slides ]
Federated Learning Showdown: The Comparative Analysis of Federated Learning Frameworks.
Sai Praneeth Karimireddy, Narasimha Veeraragavan, Severin Elvatun, Jan Nygård.
[ FMEC 2023 ]
Agree to Disagree: Diversity through Disagreement for Better Transferability. (Notable Top 5%)
Matteo Pagliardini, Martin Jaggi, François Fleuret, Sai Praneeth Karimireddy.
[ ICLR 2023 ], [ Code ]

2022

Mechanisms that Incentivize Data Sharing in Federated Learning. (Best paper)
Sai Praneeth Karimireddy*, Wenshuo Guo*, Michael I. Jordan.
[ Arxiv 2022 ], [ FL NeurIPS workshop 2022 ]
Towards Provably Personalized Federated Learning via Threshold-Clustering of Similar Clients.
Mariel Werner, Lie He, Sai Praneeth Karimireddy, Michael I. Jordan, Martin Jaggi.
[ FL NeurIPS workshop 2022 ]
Byzantine-Robust Decentralized Learning via Self-Centered Clipping.
Lie He, Sai Praneeth Karimireddy, Martin Jaggi.
[ FL NeurIPS workshop 2022 ], [ Code ]
FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings.
Jean du Terrail et al. (multi-institutional collaborative effort)
[ NeurIPS 2022 ], [ Code ]
TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels.
Yaodong Yu, Alexander Wei, Sai Praneeth Karimireddy, Yi Ma, Michael I. Jordan.
[ NeurIPS 2022 ], [ Code ]
Towards Model Agnostic Federated Learning using Knowledge Distillation.
Andrei Afonin, Sai Praneeth Karimireddy.
[ ICLR 2022 ], [ Slides ]
Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing. (Spotlight)
Sai Praneeth Karimireddy*, Lie He*, Martin Jaggi.
[ ICLR 2022 ], [ SPICY-FL NeurIPS workshop 2020 ], [ Slides ]

2021

A Field Guide to Federated Optimization.
Jianyu Wang, et al. (Collaborative survey by the FL community)
[ Arxiv ]
Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning.
Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian U. Stich, Ananda Theertha Suresh.
[ NeurIPS 2021 ], [ Short Talk ], [ Long Talk ], [ Slides ], [ Code ]
RelaySum for Decentralized Deep Learning on Heterogeneous Data.
Thijs Vogels*, Lie He*, Anastasia Koloskova, Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi.
[ NeurIPS 2021 ], [ Talk ], [ Slides ], [ Code ]
Optimal Model Averaging: Towards Personalized Collaborative Learning. (Best paper)
Felix Grimberg, Mary-Anne Hartley, Sai Praneeth Karimireddy, Martin Jaggi.
[ FL ICML workshop 2021 ], [ Talk ]
Learning from History for Byzantine Robust Optimization.
Sai Praneeth Karimireddy, Lie He, Martin Jaggi.
[ ICML 2021 ], [ Short talk ], [ Poster ], [ Slides ], [ Code ]
Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data.
Tao Lin, Sai Praneeth Karimireddy, Sebastian Stich, Martin Jaggi.
[ ICML 2021 ], [ Short talk ], [ Code ]

2020

Why Adaptive methods beat SGD for Attention Models.
Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank Reddi, Sanjiv Kumar.
[ NeurIPS 2020 ], [ Short talk ]
PowerGossip: Practical Communication Compression in Decentralized Deep Learning.
Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi.
[ NeurIPS 2020 ], [ Short talk ], [ Code ]
Weight Erosion: An Update Aggregation Scheme for Personalized Collaborative Machine Learning.
Felix Grimberg, Mary-Anne Hartley, Martin Jaggi, Sai Praneeth Karimireddy.
[ DART 2020 (pdf) ]
Secure Byzantine Machine Learning.
Lie He, Sai Praneeth Karimireddy, Martin Jaggi.
[ SPICY-FL NeurIPS workshop 2020 ]
Accelerated Gradient Boosted Machines.
Haihao Lu*, Sai Praneeth Karimireddy*, Natalia Ponomareva, Vahab Mirrokni.
[ AISTATS 2020 ]
The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication.
Sebastian Stich, Sai Praneeth Karimireddy.
[ JMLR 2020 ]
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.
Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian Stich, Ananda Theertha Suresh.
[ ICML 2020 ], [ Short talk ], [ Long talk ], [ Slides ]

2019

PowerSGD: Practical Low-rank Gradient Compression for Distributed Optimization.
Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi.
[ NeurIPS 2019 ], [ Short video ], [ Code ]
Global Convergence of Newton-type Methods without Strong-Convexity or Lipschitz Gradients.
Sai Praneeth Karimireddy, Sebastian Stich, Martin Jaggi.
[ NeurIPS OptML 2019 ]
Efficient greedy coordinate descent for composite problems.
Sai Praneeth Karimireddy*, Anastasia Koloskova*, Martin Jaggi.
[ AISTATS 2019 ]
Error Feedback fixes SignSGD and other Gradient Compression Schemes. (Long talk)
Sai Praneeth Karimireddy, Quentin Rebjock, Sebastian Stich, Martin Jaggi.
[ ICML 2019 ], [ Slides ], [ Code ]

2018

On Matching Pursuit and Coordinate Descent.
Francesco Locatello*, Anant Raj*, Sai Praneeth Karimireddy, Sebastian Stich, Martin Jaggi.
[ ICML 2018 ]
Adaptive Balancing of Gradient and Update Computation Times using Approximate Subproblem Solvers. (Oral)
Sai Praneeth Karimireddy, Sebastian Stich, Martin Jaggi.
[ AISTATS 2018 ], [ Slides ]

2016

Assignment Techniques for Crowdsourcing Sensitive Tasks.
Elisa Celis*, Sai Praneeth Karimireddy*, Ishaan Singh*, Shailesh Vaya*.
[ CSCW 2016 ]
Multi-Broadcasting under SINR Model.
Darek Kowalski*, Sai Praneeth Karimireddy*, Shailesh Vaya*.
[ PODC 2016 ]
Some results on a class of van der Waerden Numbers.
Sai Praneeth Karimireddy*, Kaushik Maran*, Dravyansh Sharma*, Amitabha Tripati*.
[ Rocky Mountain Journal of Mathematics Vol. 48 ]