Course Description and Objectives
This course focuses on the foundations of privacy-preserving machine learning. Highly personal data is being collected at an unprecedented scale by companies training ML models. While training models on such confidential data can be highly beneficial, it also carries serious privacy risks. This course addresses the dual challenge of maximizing the utility of machine learning models while protecting individual privacy. We will cover the following topics: differential privacy; private training of ML models; privacy attacks and audits; federated and decentralized machine learning.
This course will prepare you to rigorously identify, reason about, and manage privacy risks in machine learning. You will learn to design algorithms that protect sensitive information and to analyze the privacy leakage of ML systems. The course will also introduce you to cutting-edge research and practical applications. By the end of the course, you will be well equipped to undertake research and address real-world privacy challenges in machine learning.
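As a taste of the differential privacy material covered in the first half of the course, the sketch below shows the classic Laplace mechanism: a query answer is released with noise whose scale depends on the query's sensitivity and the privacy budget epsilon. This is an illustrative sketch, not course-provided code; the function and variable names are our own.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with epsilon-differential privacy via Laplace noise."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon  # smaller epsilon (stronger privacy) => more noise
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Example: privately release a counting query over a toy dataset.
ages = [23, 35, 41, 29, 52]
true_count = sum(a > 30 for a in ages)  # a counting query has sensitivity 1
private_count = laplace_mechanism(true_count, sensitivity=1, epsilon=1.0)
```

Running the release twice gives two different noisy answers; averaging many releases of the same query would erode the privacy guarantee, which is exactly the kind of accounting the course formalizes.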
To provide feedback anonymously at any point in the course, please use this anonymous form.
Grading
- Assignments (30%)
- 3 assignments in the first half of semester (submitted on Brightspace)
- Short conceptual checks + practical components
- Goal: ensure understanding of core concepts and let you “play” with them
- Project Report (35%) — due exam day
- Option 1: Paper Reading
- Team up with 1–3 others working on related papers
- Teach each other your papers and background
- Replicate core experiments from state-of-the-art
- Submit a 4-page report
- Option 2: Research (encouraged)
- Teams of 1–3
- Develop your own research question (from class readings or otherwise)
- Meet with instructor before Oct 6 (Fall break) for feedback
- Submit a 4-page report
- Paper Reading & Discussion (35%)
- Uses a role-playing discussion format
- Each week post-Fall break, we will discuss 2–3 papers
- Roles:
    - Presenter: present the paper in class (presentation counts for 20% of the total grade)
- Antagonist: identify flaws, missing experiments
- Archaeologist: situate the paper in the broader field
- Researcher: propose a follow-up abstract
- Practitioner: pitch how to turn it into a product
- Each student rotates through all roles equally
- Non-presenters: submit a 1-paragraph role write-up before class (on Brightspace), then join in-class discussion (15%)
Prerequisites
While there are no official prerequisites, knowledge of advanced probability (at the level of MATH 505a), linear algebra and multivariable calculus (at the level of MATH 225), analysis of algorithms (at the level of CSCI 570), introductory statistics and hypothesis testing (at the level of MATH 308), and machine learning (at the level of CSCI 567) is recommended.
Syllabus
Week | Date | Lecture | Presentation | Items Due | Lecture Material
---|---|---|---|---|---
1 | Aug 25 | | | |
&nbsp; | Sep 1 | Labor Day | | |
2 | Sep 8 | | | | To be uploaded
3 | Sep 15 | | | HW 1 due | To be uploaded
4 | Sep 22 | | | | To be uploaded
5 | Sep 29 | | | | To be uploaded
6 | Oct 6 | | | | To be uploaded
7 | Oct 13 | | | | To be uploaded
8 | Oct 20 | | | | To be uploaded
9 | Oct 27 | | | | To be uploaded
10 | Nov 3 | | | | To be uploaded
11 | Nov 10 | | | | To be uploaded
12 | Nov 17 | | | | To be uploaded
13 | Nov 24 | | | | To be uploaded
14 | Dec 1 | | | | To be uploaded
&nbsp; | Dec 8 | Study break | | |
&nbsp; | Dec 15 | | | |
Resources
There are no required textbooks. The following write-ups are excellent supplemental readings and may be used as references.
- C. Dwork and A. Roth. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, 2014. pdf. Reference for DP.
- Nissim et al. Differential Privacy: A Primer for a Non-technical Audience. Vanderbilt Journal of Entertainment & Technology Law, 2018. pdf. A great read, with many examples tying legal definitions to privacy in practice.
- Kairouz et al. Advances and Open Problems in Federated Learning. Community survey on federated learning. pdf.
This course builds on several related courses which can serve as valuable additional references:
- Privacy-Preserving Machine Learning by Aurelien Bellet at Inria (link)
- Trustworthy Machine Learning by Reza Shokri at NUS (link)
- Federated and Collaborative Learning by Virginia Smith at CMU (link)
- Large Scale Optimization for Machine Learning (ISE 633) by Meisam Razaviyayn at USC (link)
- Digital Privacy by Vitaly Shmatikov at Cornell (link)