Sai Praneeth Karimireddy


sp dot lastname at berkeley dot edu

About me

I am an SNSF postdoc working with Mike I. Jordan at UC Berkeley, and obtained my PhD at EPFL advised by Martin Jaggi. I also collaborate with Satyen Kale, Mehryar Mohri, and Ananda Theertha Suresh at Google Research. Additionally, I am affiliated with iGH where I work on distributed intelligence in health with Mary-Anne Hartley. Before all this, I graduated from IIT Delhi.

My work has previously been deployed across industry at Facebook, Google, OpenAI, and Owkin, and has been awarded the Chorafas Foundation Prize for exceptional applied research, the Patrick Denantes Memorial Prize for the best computer science thesis, an SNSF fellowship, and a best paper award at FL-ICML 2021. It also played a tiny part in making some memes.

My research

All models are wrong, but some are useful. - George Box.

I am interested in building intelligence infrastructure to enable collaborative machine learning, and more generally in the generation, usage, and governance of data. My work attempts to use theory as a guide to build systems which work in the real world. So far, it has involved topics such as federated learning, decentralized learning, mechanism design, game theory, robustness, security, and privacy.

The true potential of machine learning can only be unleashed when it is tightly integrated into everyday society. Such a future will involve a network of interacting machine learning agents mediating our social, cultural, and political exchanges, giving rise to complex “machine behavior”. Carefully building, designing, and understanding such systems is, I believe, one of the most pressing problems we currently face.

For more details on some of these, you can read this research statement or watch these interviews. I am always looking for collaborations and would love to hear from you! Reach out if any of my work sounds interesting. Especially if your background is different from mine :)

Selected Talks

  • Byzantine Robust Collaborative Learning. BLISS Seminar 03/2022.
    [ Video ], [ Slides ]

  • A Tutorial on Efficient Federated Learning. POSTECH Seminar 02/2022.
    [ Video ], [ Slides ]

  • What is Privacy? MLO Seminar 12/2021.
    [ Slides ]

  • Interview on Federated Learning. ZettaBytes 12/2019.
    [ Playlist ]


* indicates equal contribution.


  • Towards Provably Personalized Federated Learning via Threshold-Clustering of Similar Clients.
    Mariel Werner, Lie He, SPK, Mike I. Jordan, Martin Jaggi.
    [ FL NeurIPS workshop 2022 ]

  • Mechanisms that Incentivize Data Sharing in Federated Learning. (Oral)
    SPK*, Wenshuo Guo*, Michael I. Jordan.
    [ Arxiv 2022 ], [ FL NeurIPS workshop 2022 ]

  • Agree to Disagree: Diversity through Disagreement for Better Transferability.
    Matteo Pagliardini, Martin Jaggi, François Fleuret, SPK.
    [ Arxiv 2022 ], [ Code ], [ DistShift NeurIPS workshop 2022 ]

  • Optimization with Access to Auxiliary Information.
    El Mahdi Chayti, SPK.
    [ Arxiv 2022 ]

  • Byzantine-Robust Decentralized Learning via Self-Centered Clipping.
    Lie He, SPK, Martin Jaggi.
    [ Arxiv 2022 ], [ Code ]


  • FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings.
    Jean du Terrail et al. (multi-institutional collaborative effort)
    [ NeurIPS 2022 ], [ Code ]

  • TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels.
    Yaodong Yu, Alexander Wei, SPK, Yi Ma, Michael I. Jordan.
    [ NeurIPS 2022 ], [ Code ]

  • Towards Model Agnostic Federated Learning using Knowledge Distillation.
    Andrei Afonin, SPK.
    [ ICLR 2022 ], [ Slides ]

  • Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing. (Spotlight)
    SPK*, Lie He*, Martin Jaggi.
    [ ICLR 2022 ], [ SPICY-FL NeurIPS workshop 2020 ], [ Slides ]


  • A Field Guide to Federated Optimization.
    Jianyu Wang, et al. (Collaborative survey by the FL community)
    [ Arxiv ]

  • Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning.
    SPK, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian U. Stich, Ananda Theertha Suresh.
    [ NeurIPS 2021 ], [ Short Talk ], [ Long Talk ], [ Slides ], [ Code ]

  • RelaySum for Decentralized Deep Learning on Heterogeneous Data.
    Thijs Vogels*, Lie He*, Anastasia Koloskova, Tao Lin, SPK, Sebastian U. Stich, Martin Jaggi.
    [ NeurIPS 2021 ], [ Talk ], [ Slides ], [ Code ]

  • Optimal Model Averaging: Towards Personalized Collaborative Learning. (Best paper)
    Felix Grimberg, Mary-Anne Hartley, SPK, Martin Jaggi.
    [ FL ICML workshop 2021 ], [ Talk ]

  • Learning from History for Byzantine Robust Optimization.
    SPK, Lie He, Martin Jaggi.
    [ ICML 2021 ], [ Short talk ], [ Poster ], [ Slides ], [ Code ]

  • Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data.
    Tao Lin, SPK, Sebastian Stich, Martin Jaggi.
    [ ICML 2021 ], [ Short talk ], [ Code ]


  • Why Adaptive methods beat SGD for Attention Models.
    Jingzhao Zhang, SPK, Andreas Veit, Seungyeon Kim, Sashank Reddi, Sanjiv Kumar.
    [ NeurIPS 2020 ], [ Short talk ]

  • PowerGossip: Practical Communication Compression in Decentralized Deep Learning.
    Thijs Vogels, SPK, Martin Jaggi.
    [ NeurIPS 2020 ], [ Short talk ], [ Code ]

  • Weight Erosion: An Update Aggregation Scheme for Personalized Collaborative Machine Learning.
    Felix Grimberg, Mary-Anne Hartley, Martin Jaggi, SPK.
    [ DART 2020 (pdf) ]

  • Secure Byzantine Machine Learning.
    Lie He, SPK, Martin Jaggi.
    [ SPICY-FL NeurIPS workshop 2020 ]

  • Accelerated Gradient Boosted Machines.
    Haihao Lu*, SPK*, Natalia Ponomareva, Vahab Mirrokni.
    [ AISTATS 2020 ]

  • The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication.
    Sebastian Stich, SPK.
    [ JMLR 2020 ]

  • SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.
    SPK, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian Stich, Ananda Theertha Suresh.
    [ ICML 2020 ], [ Short talk ], [ Long talk ], [ Slides ]


  • PowerSGD: Practical Low-rank Gradient Compression for Distributed Optimization.
    Thijs Vogels, SPK, Martin Jaggi.
    [ NeurIPS 2019 ], [ Short video ], [ Code ]

  • Global Convergence of Newton-type Methods without Strong-Convexity or Lipschitz Gradients.
    SPK, Sebastian Stich, Martin Jaggi.
    [ NeurIPS OptML 2019 ]

  • Efficient greedy coordinate descent for composite problems.
    SPK*, Anastasia Koloskova*, Martin Jaggi.
    [ AISTATS 2019 ]

  • Error Feedback fixes SignSGD and other Gradient Compression Schemes. (Long talk)
    SPK, Quentin Rebjock, Sebastian Stich, Martin Jaggi.
    [ ICML 2019 ], [ Slides ], [ Code ]


  • On Matching Pursuit and Coordinate Descent.
    Francesco Locatello*, Anant Raj*, SPK, Sebastian Stich, Martin Jaggi.
    [ ICML 2018 ]

  • Adaptive Balancing of Gradient and Update Computation Times using Approximate Subproblem Solvers. (Oral)
    SPK, Sebastian Stich, Martin Jaggi.
    [ AISTATS 2018 ], [ Slides ]


  • Assignment Techniques for Crowdsourcing Sensitive Tasks.
    Elisa Celis*, SPK*, Ishaan Singh*, Shailesh Vaya*.
    [ CSCW 2016 ]

  • Multi-Broadcasting under SINR Model.
    Darek Kowalski*, SPK*, Shailesh Vaya*.
    [ PODC 2016 ]

  • Some results on a class of van der Waerden Numbers.
    SPK*, Kaushik Maran*, Dravyansh Sharma*, Amitabha Tripati*.
    [ Rocky Mountain Journal of Mathematics Vol. 48 ]