About me

I am an Assistant Professor at USC in the Thomas Lord Dept of CS, and by courtesy in the Ming Hsieh Dept of ECE. Until recently, I was an SNSF postdoc working with Mike I. Jordan at UC Berkeley, and I obtained my PhD at EPFL advised by Martin Jaggi. I also co-lead the Federated Learning for Health working group at MONAI with Holger Roth at NVIDIA, and work closely with Satyen Kale at Apple Research New York. My work has previously been deployed across industry at Meta, Google, OpenAI, and Owkin.

Selected Awards

Some awards I have been fortunate to receive:
- 2023 SNSF Mobility Fellowship.
- 2022 Patrick Denantes Memorial Prize for the best thesis in computer science.
- 2022 EPFL thesis distinction awarded to the top 8% of theses at EPFL.
- 2021 Chorafas Foundation Prize for exceptional applied research.

I am actively looking for Fall 2025 PhD students, especially in: a. game theory and incentives in ML; b. optimization and ML theory; and c. privacy and memorization in ML. If you’re interested in a summer internship, see this program if you’re from India, and this program if you’re from Tsinghua. See also the USC JumpStart Program for undergrads. If any of this is you, reach out! :)

Teaching

- Spring 2025: CSCI 599 Optimization for Machine Learning (website)
- Fall 2024: CSCI 699 Privacy Preserving Machine Learning (website)

My research

Soon, a network of interacting AI agents will be mediating our social, cultural, and political exchanges, giving rise to complex “machine behavior”. Understanding this goes well beyond the current scope of machine learning research, which focuses narrowly on developing better models on individual datasets. It instead requires taking a holistic view of the systems that arise from these agents interacting with society and with each other. I use tools from optimization, statistics, and economics to study and design such systems.

I am currently focusing on the role of data in such ML ecosystems. Data is the most important factor determining the quality of an ML system. However, we understand very little about what makes data good or bad. Further, the most valuable data (e.g. health records) are either extremely expensive to collect or simply inaccessible. Some questions I am thinking about nowadays:

Large-scale Private & Federated optimization. Medical data is subject to strict privacy regulations. How can we privately train ML models on data distributed across multiple hospitals without the data leaving the hospitals? This uses tools like Federated Learning, Differential Privacy, and optimization for large-scale machine learning.
Data Valuation and Data Markets. The data commons that current AI relies on is disappearing. In order to build a sustainable data ecosystem, people need to be compensated for their data. But how much is a specific data point worth? This question requires understanding i) how data affects uncertainty in an ML model, and ii) the relative importance of datapoints.
AI for Health. Healthcare comes with tons of data challenges on top of privacy concerns: the data may be highly heterogeneous and have missing features. Further, because of the high stakes involved, fairness and equity, reliable uncertainty quantification, and interpretable predictions are all extremely important.

News

- (Jun 25): Excited to be part of this workshop at TTIC this summer on Incentives in Data Sharing. We are now accepting submissions and registrations. See you all there soon!

- (Feb 25): I am co-organizing a TTIC summer workshop on Incentives in Data Sharing and Collaborative Learning on Aug 13-15. Please reach out if you’re interested in giving a talk or participating.

- (Jan 25): I was named a Capital One Fellow. Thank you for the support!

- (Dec 24): I received a joint appointment with the Ming Hsieh Dept of ECE. If you are interested in working with me, you can now apply to PhD programs in both the CS and ECE departments.

- (Nov 24): I will be giving talks in the Chicago area - in the computer science seminar at the University of Chicago on Nov 11, and in the LANS seminar at the Argonne National Laboratory.

- (Sep 24): I will be giving invited talks at USC Theory Lunch on Sep 19 and at INFORMS 24 on Oct 22, and a guest lecture in USC CSCI 697 on “Building Collaborative Data-Ecosystems”.

- (Sep 24): Serving as an area chair at ICLR 25.

- (Aug 24): I’m co-organizing the workshop on Federated Learning in the Age of Foundation Models as part of NeurIPS 2024. Please consider submitting your work!

- (Jun 24): I was appointed co-lead of the Data Quality and Federated Learning working group along with Holger Roth as part of MONAI. Looking forward to pushing the boundaries of AI for healthcare!

- (Jun 24): I will be in Boston for the Foundations of Responsible Computing conference (FORC 2024), Jun 12-14. Come say hi!

Publications

* indicates equal contribution.

Preprints

A Systematic Analysis of Base Model Choice for Reward Modeling.
Kian Ahrabian, Pegah Jandaghi, Negar Mokhberian, SPK, Jay Pujara.
[ Arxiv 2025 ]
From Fairness to Truthfulness: Rethinking Data Valuation Design.
Dongyang Fan, Tyler J. Rotello, SPK.
[ Arxiv 2025 ]
Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning.
Baihe Huang, SPK, Michael I. Jordan.
[ Arxiv 2023 ], [ Talk ], [ Slides ]
Online Learning in a Creator Economy.
Banghua Zhu, SPK, Jiantao Jiao, Michael I. Jordan.
[ Arxiv 2023 ]

2025

Reconsidering LLM Uncertainty Estimation Methods in the Wild.
Yavuz Faruk Bakman, Duygu Nur Yaldiz, Sungmin Kang, Tuo Zhang, Baturalp Buyukates, Salman Avestimehr, SPK.
[ ACL 2025 ]
Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum.
Riccardo Zaccone, SPK, Carlo Masone, Marco Ciccone.
[ TMLR 2025 ]
LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation.
Ljubomir Rokvic, Panayiotis Danassis, SPK, Boi Faltings.
[ IEEE International Conference on Big Data 2025 ]
The Price is Right? Making Data Valuations Incentive-Compatible.
Dongyang Fan, Tyler J. Rotello, SPK.
[ ICLR Data Problems 2025 ], [ Short talk (5 mins) ], [ Slides ]

2024

DAVED: Data Acquisition via Experimental Design for Decentralized Data Markets.
Charles Lu, Baihe Huang, SPK, Michael I. Jordan.
[ NeurIPS 2024 ]
Conformal Prediction Adaptive to Unknown Subpopulation Shifts.
Nien-Shao (Regan) Wang, SPK.
[ NeurIPS SFLFM Workshop 2024 ]
Defection-Free Collaboration between Competitors in a Learning System.
Mariel Werner, SPK, Michael I. Jordan.
[ NeurIPS FL@FM Workshop 2024 ], [ Arxiv 2024 ]
My-This-Your-That - Interpretable Identification of Systematic Bias in Federated Learning for Biomedical Images.
Mary-Anne Hartley, Klavdiia Naumova, Arnout Devos, SPK, Martin Jaggi.
[ NPJ Digital Medicine 2024 ] [ Blog ]
Collaborative Heterogeneous Causal Inference Beyond Meta-analysis.
Tianyu Guo, SPK, Michael I. Jordan.
[ ICML 2024 ]
Privacy Can Arise Endogenously in an Economic System with Learning Agents.
Nivasini Ananthakrishnan, Tiffany Ding, Mariel Werner, SPK, Michael I. Jordan.
[ FORC 2024 ]
Optimization with Access to Auxiliary Information. (Invited for ICLR 2025 Oral)
El Mahdi Chayti, SPK.
[ TMLR 2024 ]

2023

Provably Personalized and Robust Federated Learning.
Mariel Werner, Lie He, SPK, Michael I. Jordan, Martin Jaggi.
[ TMLR 2023 ]
Federated Conformal Predictors for Distributed Uncertainty Quantification.
Charles Lu*, Yaodong Yu*, SPK, Michael I. Jordan, Ramesh Raskar.
[ ICML 2023 ], [ Code ]
Federated Learning Showdown: The Comparative Analysis of Federated Learning Frameworks.
SPK, Narasimha Veeraragavan, Severin Elvatun, Jan Nygård.
[ FMEC 2023 ]
Agree to Disagree: Diversity through Disagreement for Better Transferability. (Notable Top 5%)
Matteo Pagliardini, Martin Jaggi, François Fleuret, SPK.
[ ICLR 2023 ], [ Code ]

2022

Mechanisms that Incentivize Data Sharing in Federated Learning. (Best paper)
SPK*, Wenshuo Guo*, Michael I. Jordan.
[ Arxiv 2022 ], [ FL NeurIPS workshop 2022 ]
Towards Provably Personalized Federated Learning via Threshold-Clustering of Similar Clients.
Mariel Werner, Lie He, SPK, Michael I. Jordan, Martin Jaggi.
[ FL NeurIPS workshop 2022 ]
Byzantine-Robust Decentralized Learning via Self-Centered Clipping.
Lie He, SPK, Martin Jaggi.
[ FL NeurIPS workshop 2022 ], [ Code ]
FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings.
Jean du Terrail et al. (multi-institutional collaborative effort)
[ NeurIPS 2022 ], [ Code ]
TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels.
Yaodong Yu, Alexander Wei, SPK, Yi Ma, Michael I. Jordan.
[ NeurIPS 2022 ], [ Code ]
Towards Model Agnostic Federated Learning using Knowledge Distillation.
Andrei Afonin, SPK.
[ ICLR 2022 ], [ Slides ]
Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing. (Spotlight)
SPK*, Lie He*, Martin Jaggi.
[ ICLR 2022 ], [ SPICY-FL NeurIPS workshop 2020 ], [ Slides ]

2021

A Field Guide to Federated Optimization.
Jianyu Wang, et al. (Collaborative survey by the FL community)
[ Arxiv ]
Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning.
SPK, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian U. Stich, Ananda Theertha Suresh.
[ NeurIPS 2021 ], [ Short Talk ], [ Long Talk ], [ Slides ], [ Code ]
RelaySum for Decentralized Deep Learning on Heterogeneous Data.
Thijs Vogels*, Lie He*, Anastasia Koloskova, Tao Lin, SPK, Sebastian U. Stich, Martin Jaggi.
[ NeurIPS 2021 ], [ Talk ], [ Slides ], [ Code ]
Optimal Model Averaging: Towards Personalized Collaborative Learning. (Best paper)
Felix Grimberg, Mary-Anne Hartley, SPK, Martin Jaggi.
[ FL ICML workshop 2021 ], [ Talk ]
Learning from History for Byzantine Robust Optimization.
SPK, Lie He, Martin Jaggi.
[ ICML 2021 ], [ Short talk ], [ Poster ], [ Slides ], [ Code ]
Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data.
Tao Lin, SPK, Sebastian Stich, Martin Jaggi.
[ ICML 2021 ], [ Short talk ], [ Code ]

2020

Why Adaptive methods beat SGD for Attention Models.
Jingzhao Zhang, SPK, Andreas Veit, Seungyeon Kim, Sashank Reddi, Sanjiv Kumar.
[ NeurIPS 2020 ], [ Short talk ]
PowerGossip: Practical Communication Compression in Decentralized Deep Learning.
Thijs Vogels, SPK, Martin Jaggi.
[ NeurIPS 2020 ], [ Short talk ], [ Code ]
Weight Erosion: An Update Aggregation Scheme for Personalized Collaborative Machine Learning.
Felix Grimberg, Mary-Anne Hartley, Martin Jaggi, SPK.
[ DART 2020 (pdf) ]
Secure Byzantine Machine Learning.
Lie He, SPK, Martin Jaggi.
[ SPICY-FL NeurIPS workshop 2020 ]
Accelerated Gradient Boosted Machines.
Haihao Lu*, SPK*, Natalia Ponomareva, Vahab Mirrokni.
[ AISTATS 2020 ]
The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication.
Sebastian Stich, SPK.
[ JMLR 2020 ]
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.
SPK, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian Stich, Ananda Theertha Suresh.
[ ICML 2020 ], [ Short talk ], [ Long talk ], [ Slides ]

2019

PowerSGD: Practical Low-rank Gradient Compression for Distributed Optimization.
Thijs Vogels, SPK, Martin Jaggi.
[ NeurIPS 2019 ], [ Short video ], [ Code ]
Global Convergence of Newton-type Methods without Strong-Convexity or Lipschitz Gradients.
SPK, Sebastian Stich, Martin Jaggi.
[ NeurIPS OptML 2019 ]
Efficient greedy coordinate descent for composite problems.
SPK*, Anastasia Koloskova*, Martin Jaggi.
[ AISTATS 2019 ]
Error Feedback fixes SignSGD and other Gradient Compression Schemes. (Long talk)
SPK, Quentin Rebjock, Sebastian Stich, Martin Jaggi.
[ ICML 2019 ], [ Slides ], [ Code ]

2018

On Matching Pursuit and Coordinate Descent.
Francesco Locatello*, Anant Raj*, SPK, Sebastian Stich, Martin Jaggi.
[ ICML 2018 ]
Adaptive Balancing of Gradient and Update Computation Times using Approximate Subproblem Solvers. (Oral)
SPK, Sebastian Stich, Martin Jaggi.
[ AISTATS 2018 ], [ Slides ]

2016

Assignment Techniques for Crowdsourcing Sensitive Tasks.
Elisa Celis*, SPK*, Ishaan Singh*, Shailesh Vaya*.
[ CSCW 2016 ]
Multi-Broadcasting under SINR Model.
Darek Kowalski*, SPK*, Shailesh Vaya*.
[ PODC 2016 ]
Some results on a class of van der Waerden Numbers.
SPK*, Kaushik Maran*, Dravyansh Sharma*, Amitabha Tripathi*.
[ Rocky Mountain Journal of Mathematics Vol. 48 ]