Sai Praneeth Karimireddy


sai dot karimireddy at epfl dot ch

About me

I am a fifth-year PhD student at EPFL, advised by Martin Jaggi. I am also affiliated with iGH and IDDO, where I work on distributed intelligence in health. Before this, I graduated from IIT Delhi. I am on the job market starting Fall 2021.

My research

All models are wrong, but some are useful. - George Box.

My main research interest is enabling machine learning in the wild, taking it beyond clean, centralized datasets. My past research has covered learning in low-resource settings using compression, federated learning, decentralized learning, robustness, security, and privacy. For more detail, read this one-page research overview or watch these interviews.


* indicates equal contribution.


  • Learning from History for Byzantine Robust Optimization.
    SPK, Lie He, Martin Jaggi.
    [ Arxiv 2020 ]

  • Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning.
    SPK, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian Stich, Ananda Theertha Suresh.
    [ Arxiv 2020 ], [ Slides ], [ Code ]

  • Byzantine-Robust Learning on Heterogeneous Datasets via Resampling.
    Lie He*, SPK*, Martin Jaggi.
    [ Arxiv 2020 ]

  • Secure Byzantine Machine Learning.
    Lie He, SPK, Martin Jaggi.
    [ Arxiv 2020 ]


  • Why Adaptive methods beat SGD for Attention Models.
    Jingzhao Zhang, SPK, Andreas Veit, Seungyeon Kim, Sashank Reddi, Sanjiv Kumar.
    [ NeurIPS 2020 ]

  • PowerGossip: Practical Communication Compression in Decentralized Deep Learning.
    Thijs Vogels, SPK, Martin Jaggi.
    [ NeurIPS 2020 ], [ Slides ], [ Code ]

  • Weight Erosion: An Update Aggregation Scheme for Personalized Collaborative Machine Learning.
    Felix Grimberg, Mary-Anne Hartley, Martin Jaggi, SPK.
    [ DART 2020 (pdf) ]

  • Accelerated Gradient Boosted Machines.
    Haihao Lu*, SPK*, Natalia Ponomareva, Vahab Mirrokni.
    [ AISTATS 2020 ]

  • The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication.
    Sebastian Stich, SPK.
    [ JMLR 2020 ]

  • SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.
    SPK, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian Stich, Ananda Theertha Suresh.
    [ ICML 2020 ], [ Short talk ], [ Long talk ]


  • PowerSGD: Practical Low-rank Gradient Compression for Distributed Optimization.
    Thijs Vogels, SPK, Martin Jaggi.
    [ NeurIPS 2019 ], [ Short video ], [ Code ]

  • Global Convergence of Newton-type Methods without Strong-Convexity or Lipschitz Gradients.
    SPK, Sebastian Stich, Martin Jaggi.
    [ NeurIPS OptML 2019 ]

  • Efficient Greedy Coordinate Descent for Composite Problems.
    SPK*, Anastasia Koloskova*, Martin Jaggi.
    [ AISTATS 2019 ]

  • Error Feedback fixes SignSGD and other Gradient Compression Schemes. (Long talk)
    SPK, Quentin Rebjock, Sebastian Stich, Martin Jaggi.
    [ ICML 2019 ], [ Slides ], [ Code ]


  • On Matching Pursuit and Coordinate Descent.
    Francesco Locatello*, Anant Raj*, SPK, Sebastian Stich, Martin Jaggi.
    [ ICML 2018 ]

  • Adaptive Balancing of Gradient and Update Computation Times using Approximate Subproblem Solvers. (Oral)
    SPK, Sebastian Stich, Martin Jaggi.
    [ AISTATS 2018 ], [ Slides ]


  • Assignment Techniques for Crowdsourcing Sensitive Tasks.
    Elisa Celis*, SPK*, Ishaan Singh*, Shailesh Vaya*.
    [ CSCW 2016 ]

  • Multi-Broadcasting under SINR Model.
    Darek Kowalski*, SPK*, Shailesh Vaya*.
    [ PODC 2016 ]

  • Some results on a class of van der Waerden Numbers.
    SPK*, Kaushik Maran*, Dravyansh Sharma*, Amitabha Tripathi*.
    [ Rocky Mountain Journal of Mathematics Vol. 48 ]