Optimizing Privacy-Utility Trade-offs in AI-enabled Network Applications
by
Jiang Zhang
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
May 2024
Copyright 2024 Jiang Zhang
Dedication
To my beloved family, for their unwavering faith and endless support.
Acknowledgements
First and foremost, I extend my deepest gratitude to my supervisor, Prof. Konstantinos Psounis, for his
expert guidance, patience, and unwavering support throughout my PhD studies. My appreciation
also goes to the members of my PhD oral exam committee, Prof. Leana Golubchik, Prof. Harsha Madhyastha, and the members of my PhD qualifying exam committee, Prof. Salman Avestimehr, Prof. Mahdi
Soltanolkotabi, Dr. Peter Kairouz, for their insightful comments and encouragement. To my lab mates in
the Department of Electrical and Computer Engineering at the University of Southern California, thank you
for creating a stimulating and supportive research environment. To my family and friends, your understanding and love have been my stronghold. This thesis stands as a testament to your belief in me.
Table of Contents
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Research Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Research Challenges and Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
I Optimizing User Privacy and Utility via Data Obfuscation in Centralized Learning . . . . . . . . . . 6
Chapter 2: Privacy-Utility Trades in Crowdsourced Signal Map Obfuscation . . . . . . . . . . . . . 7
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 User Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.3 Data Obfuscation and Privatizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.4 Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.5 Adversary Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.6 Signal Map Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.7 Practical Considerations Regarding the Implementation of Privatizers . . . . . . . 22
2.3 Definition of Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.1 Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.2 Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Privatizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4.1 Gaussian Noise-Adding Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4.2 Local Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.3 Generative Adversarial Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.4.4 Information-theoretic Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.5.1 Comparison of Privatizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.5.2 Leveraging Measurement Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.5.3 Analysis of Privacy-Utility Trade Space . . . . . . . . . . . . . . . . . . . . . . . . 39
2.5.4 Constraining Distortion Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5.5 Performance Against Different Adversaries . . . . . . . . . . . . . . . . . . . . . . 42
2.6 Limitations and Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.7.1 Privacy Mechanisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.7.2 Theoretical Studies of Privacy-Utility Trades . . . . . . . . . . . . . . . . . . . . . . 46
2.7.3 Prior work on mobile network data privacy . . . . . . . . . . . . . . . . . . . . . . 47
2.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Chapter 3: Harpo: A Principled Obfuscation Approach for Subverting Online Behavioral Advertising 51
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3 Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.3.2 System Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.3.3 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.3.4 System Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.3.5 System Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.1 User Persona Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.2 Data Collection and Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.4.3 Training and Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.4.4 Accuracy of Surrogate Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.4.5 Baselines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.5.1 Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.5.2 Transferability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.5.3 Overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.5.4 Stealthiness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.5.5 Adaptiveness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.5.6 Personalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.6.1 Ethical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.6.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.7.1 Privacy-Enhancing Blocking Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.7.2 Privacy-Enhancing Obfuscation Tools . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Chapter 4: De-Harpo: A Utility-Preserving Obfuscation Approach for YouTube Recommendations 93
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.3 Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.3.2 System Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.3.3 Performance Goals and Guarantees . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.3.4 System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.3.5 The “Secret" of the Denoiser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.4 System Design and Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.4.1 Video Embedding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.4.2 Obfuscator Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.4.3 Denoiser Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.4.4 Repopulating Recommended Videos . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.4.5 YouTube Surrogate Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.4.6 De-Harpo Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.5 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.5.1 User Personas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.5.2 Data Collection and Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.6 Training and Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.6.1 Baselines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.7 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.7.1 Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.7.2 Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.7.3 Varying the Obfuscation Budget . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.7.4 Overhead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.7.5 Stealthiness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.7.6 De-obfuscation Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.7.7 Personalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
4.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.8.1 Ethical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
4.8.2 Ethical Issues Related to Reddit User and Real-world User Personas. . . . . . . . . 137
4.8.3 Limitations & Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.8.4 Discussion of Joint Training of Obfuscator and Denoiser . . . . . . . . . . . . . . . 138
4.9 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
4.10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
II Enhancing User Privacy and Utility in Federated Learning with Secure Aggregation . . . . . . . . . . 143
Chapter 5: Quantifying On-average Privacy Leakage in Federated Learning with Secure Aggregation via Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.2.1 Basic Setting of Federated Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.2.2 Secure Aggregation Protocols for Federated Learning . . . . . . . . . . . . . . . . . 150
5.3 Theoretical Privacy Guarantees of FL with Secure Aggregation . . . . . . . . . . . . . . . . 153
5.3.1 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.3.2 Impact of System Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.3.2.1 Impact of Number of Users (N) . . . . . . . . . . . . . . . . . . . . . . . . 157
5.3.2.2 Impact of Batch Size (B) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.3.2.3 Impact of Model Size (d) . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.3.2.4 Impact of Global Training Rounds (T) . . . . . . . . . . . . . . . . . . . . 157
5.3.2.5 Impact of User Dropout, Collusion, and User Sampling . . . . . . . . . . 158
5.3.2.6 Impact of User Dropout and Collusion with the Server. . . . . . . . . . . 158
5.3.2.7 Impact of User Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.4 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.4.1 MI Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.4.2 Datasets and Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.5 Empirical Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
5.5.1 Impact of Number of Users (N) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
5.5.2 Impact of Model Size (d) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.5.3 Impact of Batch Size (B) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
5.5.4 Accumulative MI leakage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.5.5 Impact of Local Training Epochs (E) . . . . . . . . . . . . . . . . . . . . . . . . . . 170
5.5.6 Impact of Data Heterogeneity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
5.5.7 Practical Privacy Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
5.6 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
5.7 Further Discussion and Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Chapter 6: Quantifying Worst-case Privacy Leakage in Federated Learning with Secure Aggregation via Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
6.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
6.2.1 Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
6.2.2 Threat Model for FL with SA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.3 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.3.2 Negative Result for DP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
6.3.3 What We Need for DP Guarantee . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
6.4 Theoretical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
6.4.1 Basic Assumption for Gradient Noise . . . . . . . . . . . . . . . . . . . . . . . . . . 190
6.4.2 Necessary Condition for DP Guarantee . . . . . . . . . . . . . . . . . . . . . . . . . 192
6.4.3 Gaussian Sampling Noise with Non-Singular Covariance Matrix . . . . . . . . . . . 194
6.4.4 Gaussian Sampling Noise with Singular Covariance Matrix . . . . . . . . . . . . . 199
6.5 Water-Filling Noise Addition (WF-NA) Algorithm. . . . . . . . . . . . . . . . . . . . . . . . 201
6.5.1 Algorithm Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.5.2 Discussion about WF-NA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
6.5.3 Comparison of DP Mechanisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
6.6 Discussion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.7 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
6.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
III Leveraging Large Foundation Models to Protect User Privacy and Utility in Specialized ML Model Training . . . . . . . . . . 212
Chapter 7: Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language
Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
7.2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
7.2.1 DToT Prompting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
7.2.2 Augmented DToT Prompting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
7.2.3 Rationale Distillation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
7.3 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
7.3.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
7.3.2 Models and Baselines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
7.3.3 Parameters and Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
7.4 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
7.4.1 Evaluation of DToT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
7.4.2 Evaluation of Rationale Distillation . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
7.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
7.5.1 Toxic Content Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
7.5.2 Prompting LLMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
7.5.3 Distilling LLMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
7.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Chapter 8: Customized Synthetic Data Generation for Private Training of Specialized ML Models . 234
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
8.2 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
8.3 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
8.3.1 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
8.3.2 Device-side System Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
8.3.3 Server-side System Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
8.3.4 Privacy Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
8.4 Experimental setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
8.4.1 Real-world Tasks and Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
8.4.2 Synthetic Dataset Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
8.4.3 Training and Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
8.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
8.5.1 Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
8.5.2 Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
8.5.3 Privacy-Utility Trade-offs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
8.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Chapter 9: Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
List of Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
List of Tables
2.1 Dataset Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Context used by privatizers (last 4 rows) and properties of threat models (first 6 rows). . . 17
2.3 Utility Reference Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Comparison of GLDP and LLDP privatizer on Chania dataset. . . . . . . . . . . . . . . . . 31
2.5 RMSE (root mean square error) of RSS prediction model trained with obfuscated
measurements when P = 1.0. Note that we use the RSS prediction model trained with
non-obfuscated measurements as a baseline (privatizer is none). . . . . . . . . . . . . . . . 37
2.6 Evaluation results against common adversaries. Baseline reports the privacy of a
privatizer against the adversary trained with its own obfuscated data. Unobfuscated is
trained against unobfuscated data. Aggregate is trained against aggregate obfuscated data.
Alternative is trained using a different loss function. . . . . . . . . . . . . . . . . . . . . . 43
3.1 Parameter values of neural networks for RL agent and surrogate model in Harpo. . . . . . 73
3.2 Accuracy of surrogate user profiling and ad targeting models. FPR and TPR denote false
positive and true positive rates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.3 Evaluation results with surrogate models w.r.t. L1 (percent of false segments in obfuscated
persona), L2 (number of different segments between base and obfuscated persona), L3
(percentage increase of high bids in obfuscated persona), L4 (average ratio of obfuscated
persona over base persona bid values). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.4 Transferability results w.r.t. L1 (percent of different segments between base and obfuscated
profile), L2 (percent of false segments in obfuscated profile), L3 (percentage increase of
high bids in obfuscated profile), L4 (average ratio of obfuscated persona over base persona
bid values), CPM (cost per thousand impressions in dollar, the unit of bid values). . . . . . 78
3.5 Personalization results. L_2^allowed and L_2^disallowed denote the distortion on allowed segments
and disallowed segments respectively. Note that Harpo is trained to maximize L_2^allowed + L_2^disallowed,
while personalized Harpo is trained to maximize L_2^allowed − w_d L_2^disallowed. . . . . . . . . . . 86
4.1 Privacy evaluation results against YouTube w.r.t. P and P^Norm. . . . . . . . . . . . . . . 127
4.2 Utility evaluation results w.r.t. U_Loss and U_Gain^Norm. Note that each cell in the table reports
U_Loss/U_Gain^Norm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.3 Stealthiness evaluation results under different obfuscation budgets α with 5% De-Harpo
users. Note that we choose α from {0.2, 0.3, 0.5} and report (Precision, Recall) of the
adversarial detector for different obfuscators. . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.4 De-obfuscation robustness evaluation results under different obfuscation budget. Note
that we set α ∈ {0.2, 0.3, 0.5} and report (Precision, Recall) of adversarial detector under
different obfuscation approaches. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.5 Personalization results. D_KL^NonSens and D_KL^Sens denote the divergence in non-sensitive
classes and sensitive classes respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.1 Models used for MNIST and CIFAR10 datasets. Note that SLP, MLP, and CNN represent
Single Layer Perceptron, Multiple Layer Perceptron, and Convolutional Neural Network,
respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.1 Comparison of different DP mechanisms in FedSGD. Note that for SGD + IID gradient
sampling, the sampling ratio is B/D, which will offer a privacy amplification with ratio
approximately B/D. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
7.1 Prompt templates for different LLMs, given question q: “Does sentence s contain toxic
content?" (see Section 3.1). Note that c is the context, s is the statement, and type(c) is
the category of context c (e.g. toxic, hate and violent). . . . . . . . . . . . . . . . . . . . . . 220
7.2 Evaluation results of DToT on Toxigen, SBIC, DHate datasets. In “Method" column,
“FT" stands for fine-tuning on training dataset, “CoT" refers to CoT prompting, “DToT"
corresponds to DToT prompting, “DToT+FS" denotes the combination of DToT prompting
with few-shot demonstrations, and “DToT+FS+R" presents the combination of DToT
prompting with few-shot demonstrations and rationale augmentations. Due to the lack of
output logits, the AUC scores of ChatGPT are populated as “N/A". . . . . . . . . . . . . . . 226
7.3 Distillation evaluation results on Toxigen, SBIC, and DHate datasets. In the “Label" column,
“Human" indicates that the labels come from the training dataset, and “LLM" indicates that
the labels are predicted by the LLM. In the “Rationale" column, “N/A" means no rationales are
used in fine-tuning, “R_CoT" means rationales extracted via CoT are used in fine-tuning,
and “R_DToT" means rationales extracted via DToT are used in fine-tuning. . . . . . . . . . . 227
7.4 Transferability evaluation results. Note that we fine-tune these models on Toxigen dataset
while testing them on other datasets, and we report AUC score. For Amazon dataset, due
to confidentiality policy, we only report the increased AUC score compared with RoBerta
(whose AUC score is denoted by X). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7.5 Impact of student LMs’ size on rationale distillation. Note that we use Toxigen dataset for a
case study, and F-T5-XL/F-T5-L/F-T5-B is short for Flan-T5-XL/Flan-T5-Large/Flan-T5-Base. 230
7.6 Responses generated by FC-T5 using different approaches. Note that the prompt asks the
model to detect toxic content in the sentence “why do you have to come to America, you should
stay in your own country and fight for your freedom, not come to America". . . . . . . . . . . 231
8.1 An example of a user request, where the goal is to train a specialized ML model to
monitor the dog’s status in the user’s room. . . . . . . . . . . . . . . . . . . . . . . . . . . 239
8.2 Model accuracy evaluation results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
8.3 Privacy evaluation results. Note that PSNR measures the average Peak Signal-to-Noise
Ratio between the user’s private images and the synthetic images generated by the server.
SIM measures the semantic embedding cosine similarity between the user’s private images
and the synthetic images generated by the server. Higher values of both PSNR and SIM
indicate that the synthetic images generated by the server are more similar to the user’s
private images, suggesting that more privacy information is being leaked. . . . . . . . . . 250
List of Figures
1.1 ML model training schemes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.1 Overview of system model. 1) Users collect measurement data and obfuscate it before
uploading it to the service provider; 2) The service provider or third party aggregates
obfuscated user data to train a signal map model, i.e. RSS prediction model; 3) The
adversary has access to the obfuscated user data and uses it to estimate the user ID and
locations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Chania Dataset (colored by user) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 A diagram of adversary implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4 Noise privatizer when σ=0.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5 Noise privatizer when σ=0.9. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.6 Privacy and utility of different privatizers. Note that Noise, GLDP, GAP, and IT refer to the
Gaussian noise-adding, local Gaussian Mechanism DP, GAP, and the information-theoretic
privatizers, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.7 Visualization of obfuscated measurements generated by different privatizers when
P = 1.0. Note that each cell in these figures represents a geographical location. The color
of each cell represents the average signal strength value in this location. Lighter color
represents higher signal strength value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.8 Effect of leveraging measurement sequences on adversary’s user ID estimate error. . . . . 38
2.9 Effect of leveraging measurement sequences on adversary’s user location estimate error. . 38
2.10 Privacy-utility trade-offs of different privatizers under the Chania, UCI, and Radiocell
datasets with composite metrics. Note that Noise, GLDP, GAP, and IT refer to the
Gaussian noise-adding, local Gaussian Mechanism DP, GAP, and the information-theoretic privatizers, respectively. Note that P is the composite privacy metric defined in
Eq. (2.5) and U is the composite utility metric defined in Eq. (2.12). . . . . . . . . . . . . . 40
2.11 Privacy-utility trade-offs of different privatizers under the Chania dataset with non-composite metrics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.12 Choosing parameters under a constraint on distortion. . . . . . . . . . . . . . . . . . . . . 42
3.1 Overview of Harpo’s workflow. Note that the ith URL, p_i, can be a user or an obfuscation
URL, denoted by p_i^u and p_i^o respectively. c_i^u / c_i^o represents the embedding vector of the ith
real (blue) / fake (green) page, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2 Neural network structures for the RL agent and surrogate model. Both the RL agent and the
surrogate model take the content features C_t (document embeddings output by the doc2vec
model) of the latest w URLs in user persona P_t as input, and then utilize a CNN encoder
to convert C_t into a feature vector ϕ_t^i as the input of the decoders. The decoder of the RL agent
is an LSTM followed by two FCNNs, representing the actor and critic networks respectively,
and the decoder of the surrogate model is an FCNN with a Softmax activation function,
which outputs the binary classification result as the reward for the RL agent. . . . . . . . . 62
3.3 Selecting parameters of the MC model. Note that the autocorrelation with lag K measures
the correlation between states that are K time steps apart. . . . . . . . . . . . . . . . . . . 69
3.4 The MC model and its state transition probability diagram for simulating user personas. . 69
3.5 Overview of Harpo’s evaluation process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.6 Loss under different obfuscation budgets for the user profiling and ad targeting models.
Note that the reported loss values (L1, L2, L3) are all against real-world user profiling and
ad targeting models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.7 Stealthiness evaluation results. Note that the α values from left to right of each curve in
each figure are 0.2, 0.15, 0.1 and 0.05 respectively. The reported privacy values (L1, L2,
L3) are against surrogate user profiling and ad targeting models. . . . . . . . . . . . . . . . 82
3.8 Adaptiveness of Harpo and two of the most competitive baselines (Rand-intent and
Bias-intent) against ad targeting models. The color of each cell represents the normalized
Euclidean distance between a pair of obfuscation URL category distributions. Warmer
colors (red; higher values) represent superior adaptiveness. . . . . . . . . . . . . . . . . . 85
4.1 Problem Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.2 Overview of De-Harpo. Note that V^u denotes the non-obfuscated user persona, V^o
denotes the obfuscated user persona generated by the obfuscator, C^u is the recommended
video class distribution based on V^u, C^o is the recommended video class distribution
based on V^o, Ĉ^u is the denoiser’s estimate of C^u, and v_i^u and v_i^o represent user video and
obfuscation video respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.3 Privacy and utility metrics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.4 MDP for the obfuscator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.5 Details of system design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.6 Privacy-utility tradeoff w.r.t. P^Norm and U_Loss under different obfuscation budgets α. Note
that Rand-Obf/De-Harpo-Den represents the combination of the Rand-Obf obfuscator and
the De-Harpo denoiser, Bias-Obf/De-Harpo-Den represents the combination of the Bias-Obf
obfuscator and the De-Harpo denoiser, and PBooster-Obf/De-Harpo-Den represents the
combination of the PBooster-Obf obfuscator and the De-Harpo denoiser. The top left of the figure
represents both high privacy and high utility. . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.7 Privacy level P^Norm vs. obfuscation budget α. . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.8 Precision of the adversarial detector vs the percentage of De-Harpo users under α = 0.5. 132
5.1 Figure (a) illustrates the current formal privacy guarantee of FL with SA protocols and
sheds light on the missing privacy guarantee on the aggregated model information
leakage which is studied in this work. Figure (b) gives a preview of the behavior of the
privacy leakage through the global aggregated model for a CNN model as a function of
the number of users in FL. The privacy leakage follows a O(1/N) decay as proved in our
theoretical bounds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.2 The training process in federated learning. . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.3 Impact of the number of users (N) when using FedSGD. Note that we set B = 32
for all users on both MNIST and CIFAR10 datasets. We normalize the MI by entropy of a
single data batch (i.e. 32 ∗ 567 for MNIST and 32 ∗ 1403 for CIFAR10). . . . . . . . . . . . 164
5.4 Impact of the number of users (N) when using FedAvg. Note that we set E=1 and B = 32
for all users on both MNIST and CIFAR10 datasets. We normalize the MI by entropy of
the whole local training dataset (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for CIFAR10). 165
5.5 Impact of the number of users (N) when using FedProx. Note that we set E=1 and
B = 32 for all users on both MNIST and CIFAR10 datasets. We normalize the MI by the
entropy of the whole local training dataset (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for CIFAR10). . 166
5.6 Impact of batch size (B) when using FedSGD. The MI is normalized by the entropy of a
data batch, which is proportional to the batch size B (i.e. B ∗ 567 for MNIST and B ∗ 1403
for CIFAR10). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
5.7 Impact of batch size (B) when using FedAvg. The MI is normalized by the entropy of a
user’s local dataset, which is a constant (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for
CIFAR10). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
5.8 Impact of batch size (B) when using FedProx. The MI is normalized by the entropy of a
user’s local dataset, which is a constant (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for
CIFAR10). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
5.9 Accumulative MI privacy leakage on MNIST and CIFAR10 datasets. Note that we
normalize the MI by the entropy of each user’s local dataset, which will not change with
T. We use the linear model for both MNIST and CIFAR10 datasets. . . . . . . . . . . . . . . 170
5.10 Accumulative MI privacy leakage vs model accuracy of different FL algorithms. Note that
we use a linear model for case study and normalize the MI by the entropy of each user’s
local dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
5.11 Impact of the local training round (E) when using FedAvg. We normalize the MI by the
entropy of each user’s local dataset, and we consider N ∈ {10, 20}. . . . . . . . . . . . . . 171
5.12 Impact of the local training round (E) when using FedProx. We normalize the MI by the
entropy of each user’s local dataset, and we consider N ∈ {10, 20}. . . . . . . . . . . . . . 172
5.13 Impact of user heterogeneity when using FedAvg on non-IID CIFAR10. Note that α = ∞
means that the user data distributions are identical (IID users), and the MI is normalized
by the entropy of a user’s local dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
5.14 Impact of user heterogeneity when using FedAvg on FEMNIST. Note that the MI is
normalized by the entropy of target user’s local dataset, which is 678 ∗ 176 . . . . . . . . . 173
5.15 Impact of varying the number of users N, on the reconstructed image quality (PSNR) of
the DLG attack and on the MI privacy leakage. . . . . . . . . . . . . . . . . . . . . . . . . 175
5.16 Effects of using DP noise together with SA on MI privacy leakage and model accuracy.
Note that we add DP noise in aggregated model updates after SA. . . . . . . . . . . . . . . 176
5.17 Heatmap of the absolute values of sampled updates from clients 1, 2 and 3 in the
counterexample. x_4 and x′_4 can be distinguished even after adding the aggregated noise
from Σ_{i=1}^{3} x_i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
6.1 Federated learning with SA and DP guarantees. . . . . . . . . . . . . . . . . . . . . . . . . 183
6.2 System model for FL with SA. Note that the input of this system is users’ local datasets
({D_i}_{i=1}^{N}), and the output of the system is the aggregated model update (Σ_{i=1}^{N} x_i^{(t)}),
which is a random vector due to users’ local gradient (i.e. data batch) sampling. The server
will infer user i’s local dataset (D_i) by observing Σ_{i=1}^{N} x_i^{(t)}. . . . . . . . . . . . . . . . 187
6.3 Heatmap of the absolute values of sampled updates from users 1, 2 and 3 in the
counterexample. x_4 and x′_4 can be distinguished even after adding the aggregated noise
from Σ_{i=1}^{3} x_i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
6.4 Comparison of WF noise and isotropic noise. . . . . . . . . . . . . . . . . . . . . . . . . . . 205
6.5 Comparison of different DP mechanisms on MNIST dataset. Note that we consider 50
users participating in FL. The training epoch is set as 100, the mini-batch size B is 32,
the clipped value C is set as 10, and we consider δ = 10^−4. We report the accumulative
privacy across all training epochs by using the composition theorem in [294]. . . . . . . . 207
7.1 Overall workflow of the proposed BD-LLM. Given question q, it first bootstraps the LLM
via DToT prompting to extract answer a and rationale r with high-confidence. Then, it
uses q as input and (a, r) as output to fine-tune the student LM. . . . . . . . . . . . . . . . 215
7.2 An example of context tree. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
8.1 Problem statement. The user sends a request about the model they need and a few
reference images. The server automatically trains a model for the user. . . . . . . . . . . . . 235
8.2 Details of the proposed system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
8.3 Privacy-utility trade-off results. Note that privacy leakage is measured by SIM, which
represents the semantic similarity between generated synthetic images and the user’s
private images. The model utility represents the performance of the specialized model
trained on synthetic data. The top-left part of these figures indicates both higher privacy
and higher utility. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
Abstract
Over the past decade, Artificial Intelligence (AI) techniques have been widely used in various network applications, significantly enhancing the intelligence, efficiency, and personalization of the services provided
for users. However, this advancement has intensified privacy concerns due to the development of Machine
Learning (ML) models that learn from user data. Therefore, how to deliver high-quality and personalized
online services using ML models while minimizing privacy risks for users has become a crucial research
area.
In this thesis, I develop innovative methods and systems to optimize the privacy-utility trade-offs in
AI-enabled network applications. Recognizing that users face different types of privacy risks across applications, I propose privacy protection methods and systems tailored to application-specific challenges. The thesis is organized into three main parts, detailed as follows.
In the first part (Chapter 2-4), I focus on network applications in which the server collects user data
and employs centralized learning methods to develop ML models from user data. To minimize the privacy
leakage during the collection of user data while preserving the utility of ML models trained on such data,
I propose methods to optimize user privacy and utility via data obfuscation (i.e. noise addition), aiming at
protecting two common types of user privacy: user location privacy and user profiling privacy.
In the second part (Chapter 5-6), I consider network applications using Federated Learning (FL) with
Secure Aggregation (SA), where users share encrypted local model updates with the server without sending private local data, and the server can only observe the aggregated model update without accessing
individual local model updates. While SA guarantees the privacy for the local model updates of users from
the encrypted model updates, the aggregated model update may still leak the private information about
user data. To systematically investigate the privacy and utility trade-offs in FL with SA, I use formal metrics
including Mutual Information and Differential Privacy to quantify both on-average and worst-case privacy
leakage in FL with SA. I demonstrate that the inherent randomness in aggregated model updates can be
leveraged as noise to offer privacy protection for individual user’s data without hurting model utility.
For the first two parts, the methodology I utilize to optimize privacy-utility trade-offs can be summarized as smartly adding noise into user data to hide sensitive information. More recently, the emerging Generative Large Foundation Models (FMs) have showcased their superior capability of generating high-quality synthetic data. Therefore, in the last part of the thesis (Chapter 7-8), I design approaches to leverage
large FMs to protect user privacy and maintain utility in specialized ML model training. I demonstrate
that the high-quality synthetic data generated by large FMs can be used to train accurate specialized ML
models with minimal or no usage of real user data.
Chapter 1
Introduction
1.1 Research Problem
In the past decade, the landscape of network applications has been profoundly transformed by the integration of Artificial Intelligence (AI) techniques, empowering online service providers to offer more intelligent,
efficient, and personalized services to users. At the core of this transformation is the adoption of Machine
Learning (ML) algorithms, which involve developing models (i.e. a type of computer program) that automatically improve their performance on specific tasks by learning from user data [1]. For example, cellular
service providers construct signal map models that learn from user data to predict signal strength and provide improved network services [2]. Online content platforms such as YouTube and TikTok develop video
recommendation models that learn from users’ historical engagement data to recommend personalized
videos to users [3, 4].
While AI-enabled network applications offer a myriad of benefits and enhance user experience, increased privacy concerns have arisen. This is because users’ sensitive data may be exposed during the
development of ML models, which poses a significant risk of privacy breaches. In this thesis, I aim to design novel methods and systems to optimize privacy-utility trade-offs in AI-enabled network applications.
Figure 1.1: ML model training schemes. (a) Centralized learning; (b) Federated learning.
1.2 Research Challenges and Approaches
Depending on the development approach of machine learning (ML) models in AI-enabled network applications, users are exposed to different types of privacy risks and hence require different privacy protection
methods and systems to mitigate data leakage. Specifically, ML models can be developed through two
primary training schemes: centralized learning and federated learning, as illustrated in Figure 1.1a and
Figure 1.1b respectively. In centralized learning, user data is directly collected by the server for training
the ML models. Privacy leakage in this scenario occurs when the server aggregates user data, potentially
exposing sensitive information. Conversely, in federated learning, users do not share their private data directly with the server. Instead, they share local model updates, which the server then aggregates to derive
global model updates. Despite this, privacy risks still exist, as private information about user data can leak
from the shared model updates.
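As a simple illustration of the two training schemes (the sketch below is not code from this thesis), the following Python snippet contrasts centralized training, where the server operates directly on pooled raw user data, with a FedAvg-style federated round, where the server only averages model updates. The linear model, client data, and step size are placeholders chosen for the example.

```python
# Minimal sketch contrasting centralized and federated training with NumPy.
# The clients, linear-regression model, and hyperparameters are illustrative
# placeholders, not components defined in this thesis.
import numpy as np

rng = np.random.default_rng(0)
d = 10                                    # model dimension
clients = [(rng.normal(size=(50, d)),     # each user's private (features, labels)
            rng.normal(size=50)) for _ in range(5)]

def local_update(w, X, y, lr=0.01):
    """One gradient step of linear regression on a user's private data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

# Centralized learning: the server pools all raw user data before training.
X_all = np.vstack([X for X, _ in clients])
y_all = np.concatenate([y for _, y in clients])
w_central = local_update(np.zeros(d), X_all, y_all)

# Federated learning (FedAvg-style): each user trains locally and shares only
# a model update; the server averages updates and never sees raw data.
w_global = np.zeros(d)
for _ in range(10):                        # global training rounds
    updates = [local_update(w_global, X, y) - w_global for X, y in clients]
    w_global = w_global + np.mean(updates, axis=0)
```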
To address the privacy-utility trade-offs in centralized learning applications, I design principled data
obfuscation approaches that obfuscate user data before it is shared with service providers. These approaches aim at safeguarding two prevalent types of user privacy: user location privacy and user profiling
privacy. Specifically, for structured user location data, I utilize context-aware noise addition mechanisms
to concurrently achieve adequate user privacy and utility (see Chapter 2). Regarding unstructured user
profile data, such as users’ web browsing history, I design Harpo, a novel obfuscation system based on
principled reinforcement learning, which introduces fake user activity into web browsing histories to maximize user privacy while preserving users’ natural browsing experience (see Chapter 3). Furthermore, for
applications where user utility is particularly important (e.g. recommendation systems), I design De-Harpo, a utility-preserving obfuscation system that employs a novel obfuscation-denoising architecture (see Chapter 4).
To improve both user privacy and model utility in federated learning applications, I explore the use of
secure aggregation (SA), which ensures that the server can access only the global aggregated model update,
not the individual ones, without compromising model utility. However, even with SA, the aggregated
model update might still inadvertently reveal individual user data. To thoroughly examine the privacy
and utility trade-offs in FL with SA, I employ formal privacy measures, including mutual information and
differential privacy, to assess the privacy leakage. Through theoretical analysis, I demonstrate that the
inherent randomness in aggregated model updates can be leveraged as noise to provide protection to individual users’ data. Next, I empirically show that an increasing number of users can enhance the MI-based on-average privacy of SA in FL without detracting from model utility (see Chapter 5). Furthermore,
I analyze the conditions when SA can guarantee DP, and investigate how the inherent randomness in
aggregated model updates can be potentially leveraged to reduce the additional noise required for DP
guarantee (see Chapter 6).
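To illustrate why the server in FL with SA observes only the aggregate of the local model updates, the sketch below implements a toy additive-mask aggregation. It assumes an honest-but-curious server, no user dropouts, and pre-quantized integer updates, and it is only a simplified stand-in for the SA protocols analyzed in Chapters 5 and 6.

```python
# Toy additive-mask secure aggregation: pairwise random masks cancel in the
# sum, so the server recovers the aggregate without seeing individual updates.
# The modulus, dimensions, and quantized updates are illustrative assumptions.
import numpy as np

P = 2**31 - 1                                   # public modulus
rng = np.random.default_rng(1)
N, d = 4, 6                                     # number of users, update dimension
updates = rng.integers(0, 1000, size=(N, d))    # quantized local model updates

masked = [[int(v) for v in row] for row in updates]
for i in range(N):
    for j in range(i + 1, N):
        # Users i and j agree on a shared random mask; i adds it, j subtracts it.
        r = [int(v) for v in rng.integers(0, P, size=d)]
        masked[i] = [(a + b) % P for a, b in zip(masked[i], r)]
        masked[j] = [(a - b) % P for a, b in zip(masked[j], r)]

# The server only receives `masked`; each row looks uniformly random, yet the
# modular sum equals the true aggregate of the users' updates.
aggregate = [sum(col) % P for col in zip(*masked)]
assert aggregate == [int(s) % P for s in updates.sum(axis=0)]
```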
So far, the methodology I employ to optimize the privacy-utility trade-offs in AI-enabled network
applications can be summarized as smartly adding noise into user data to hide sensitive information, either
applied to raw user data in centralized learning applications, or applied to users’ local model updates calculated from user data in federated learning applications. Most recently, the emerging Generative Large
Foundation Models (FMs) have showcased their superior capability of generating high-quality synthetic
data. By harnessing the power of large FMs in synthetic data generation, it becomes possible to develop
specialized ML models with minimal or even no need for actual user data. Hence, the user privacy can be
naturally protected without compromising the utility of ML models. To investigate this, in the last part of
this thesis, I design novel methods to leverage large FMs’ synthetic data generation capability to protect
both user privacy and utility in the development of specialized ML models. Specifically, I investigate two
applications. In the first application, I study the problem of detecting toxic content in users’ conversation
with AI assistants, since toxic content leaks users’ sensitive information (e.g. their biases, prejudices, and
personal identifiable information). I propose a novel method for training a small but accurate toxic content
detection model, by leveraging the rationale data generated by LLM (see Chapter 7). In the second application, I explore the problem where the user requests the AI assistant to build a personalized ML model
without sharing and labeling private user data. To solve this problem, I design a novel synthetic data generation pipeline based on large diffusion models [5], which can generate customized synthetic data based
on user requests without violating users’ privacy preferences (see Chapter 8).
1.3 Thesis Organization
The rest of this thesis is organized as follows: Part I (Chapter 2-4) presents principled user data obfuscation methods to optimize the privacy-utility trade-offs in centralized learning applications, including
signal maps (Chapter 2), online advertising (Chapter 3), and video recommendation systems (Chapter 4).
Part II (Chapter 5-6) investigates how SA can be used to enhance user privacy and utility in federated
learning applications, where formal metrics including mutual information (Chapter 5) and differential privacy (Chapter 6) are applied to measure on-average and worst-case privacy respectively. Part III (Chapter
7-8) proposes solutions which utilize large FMs for specialized ML model training with minimal actual user
data, applied to the training of an accurate and efficient model for detecting privacy-sensitive user content
(Chapter 7) and private training of customized ML models based on user requests (Chapter 8). Chapter 9
concludes this thesis.
Part I
Optimizing User Privacy and Utility via Data Obfuscation in Centralized
Learning
Chapter 2
Privacy-Utility Trades in Crowdsourced Signal Map Obfuscation
Cellular providers and data aggregating companies crowdsource cellular signal strength measurements
from user devices to generate signal maps, which can be used to improve network performance. Recognizing that this data collection may be at odds with growing awareness of privacy concerns, we consider
obfuscating such data before the data leaves the mobile device. The goal is to increase privacy such that
it is difficult to recover sensitive features from the obfuscated data (e.g. user ids and user whereabouts),
while still allowing network providers to use the data for improving network services (i.e. create accurate
signal maps). To examine these privacy-utility trade-offs, we identify privacy and utility metrics and threat models suited to signal strength measurements. We then obfuscate the measurements using several preeminent techniques, spanning differential privacy, generative adversarial privacy, and information-theoretic privacy, in order to benchmark a variety of promising obfuscation approaches and provide guidance to real-world engineers who are tasked with building signal maps that protect privacy without hurting
utility. Our evaluation results, based on multiple, diverse, real-world signal map datasets, demonstrate the
feasibility of concurrently achieving adequate privacy and utility, with obfuscation strategies which use
the structure and intended use of datasets in their design, and target average-case, rather than worst-case,
guarantees.
2.1 Introduction
Network providers and data aggregating companies crowdsource mobile user data for a variety of reasons.
This data can reveal network performance, allow for the generation of signal strength maps, inform decisions on where to deploy cell towers or sensors, and provide insight on how to improve user experience.
The measurements are collected directly from user devices, via standalone mobile apps [6], or measurement software development kits [7] integrated into popular partnering apps. Providers and aggregators
then sell this data to network operators, regulators, and device and equipment manufacturers. For the
operators, regulators, and manufacturers, this crowdsourced data offers clear value for network planning.
For the user, contributing data can in turn be useful, given that it leads to better network performance.
However, participation also raises legitimate privacy concerns.
For example, some cellular providers have allegedly been selling their users’ real-time location data
to credit agencies, bail bondsmen, and other third parties [8]. Furthermore, while these measurements
are assumed to be sparse in space and time and over thousands of users, previous work has shown that
identities are inferable from anonymized data [9].
In recent years, privacy issues have come to the front of news, politics, and public opinion [10, 11,
12] and pioneering privacy laws have been enacted [13, 14]. To protect user privacy, a plethora of data
masking, or obfuscating, schemes have been proposed, see, for example, [15]. However, by obfuscating the
original data for the sake of privacy, data can no longer provide the exact insights it once could, sacrificing
data utility for privacy [16].
In this work, we examine the privacy-utility trade-offs in the context of cellular signal strength measurements, focusing on device-level obfuscation where the measurement is obfuscated, or privatized, before it leaves the user’s phone. The goal is to increase privacy such that it is difficult to recover sensitive
features from the obfuscated measurements, including user ids and whereabouts, while still allowing network providers to use the measurement for improving network services, i.e. create accurate signal maps.
To examine these privacy-utility trade-offs, we identify privacy and utility metrics and threat models suited
to the signal map application at hand. We then obfuscate the measurements using a number of promising
approaches at the forefront of privacy research, in order to benchmark them and provide guidance to realworld engineers who are tasked with building signal maps that provide (some) privacy while maintaining
(adequate) utility. To evaluate the different approaches, we use multiple, diverse, real-world signal map
datasets to ensure real world applicability of our findings.
We implement four strategies for obfuscating signal strength measurements to assess and compare
their application-specific performance, selecting preeminent methods from the literature that span a range
of complexities and privacy guarantees. Specifically, the first is a noise-adding privatizer, which adds independent, identically distributed Gaussian noise across the features of the data. Albeit simple, this scheme
provides intuition into the privacy-utility tradeoff via the choice of how much noise to add. The second is
based on differential privacy (DP) [17], a leading approach to data obfuscation which provides probabilistic
worst-case guarantees against any arbitrary adversary, including one with unlimited resources and access
to side-information. In this work, we apply the popular local Gaussian mechanism [17], as well as the recent Truncated Laplacian Mechanism [18]. The third leverages the idea of generative adversarial networks
to allow a data-driven method of learning an obfuscation scheme. This method, which is referred to as
generative adversarial privacy (GAP) [19], positions a privatizer and an adversary, both modeled as neural
networks, against each other. The privatizer learns to obfuscate the data such that the adversary cannot
infer sensitive features, and the adversary simultaneously learns to infer sensitive features. While this
method cannot offer the formal worst-case guarantees of the differentially private methods, the learning
approach offers the potential to leverage structure in the data set and take advantage of the specific utility
objectives in the network. The fourth strategy is motivated by an information-theoretic treatment of the
problem. Considering mutual information as a convex metric for privacy performance, we frame a formal
optimization problem as finding the obfuscation strategy which maximizes privacy subject to a constraint
on utility. This approach, which we refer to as IT, maximizes user privacy in an average sense, but sacrifices the worst-case guarantees offered by the differentially private methods. Section 2.4 discusses
these privatizers in more detail.
We analyze the performance of each of these privatizers using three, diverse, real-world signal map
datasets. The first one is collected from cellular users over a seven-month period in the city of Chania,
Greece [20]. The second one is collected over a period of four months by Android smartphones in the
University of California Irvine campus [21]. The last one is sampled from the Radiocell dataset [22], one of
the largest publicly available datasets with millions of measurements from nearly one million macrocells
around the world. The sample we work with contains signal strength measurements from hundreds of
users over a one-year period in UK’s countryside. Section 2.2.1 discusses these datasets in detail.
An important aspect of our study is to identify privacy and utility metrics (Section 2.3) as well as
threat models (Section 2.2.2) suited to signal map application. We assess our obfuscation schemes against
specific adversaries, modeled as neural networks (Section 2.2.5 discusses adversary models in detail), which
estimate private user information from observing obfuscated data, and we take the adversary’s estimation
performance as a practical, application-specific privacy metric. We also consider more robust privacy
guarantees, such as DP, which is not dependent on any specific adversary implementation. With respect to
utility, we consider two metrics. First, we consider a received signal strength (RSS) model which accurately
predicts signal maps when trained with unobfuscated data. We train this model with the obfuscated data.
Then, we use as an application-specific utility metric the L1 distance between the parameters of the RSS
model trained with obfuscated versus unobfuscated data. As a general utility metric, we use the overall
assessment of data distortion. This serves as a proxy for utility under a wide variety of other potential
mobile data applications.
Our main contributions are as follows:
1. We present a framework and define appropriate metrics to assess the privacy and utility of obfuscation schemes in the context of signal maps (Sections 2.2, 2.3).
2. We apply, for the first time, two promising general obfuscation approaches, namely generative adversarial privacy (GAP) and an information-theoretic approach based on optimization and coding (IT), to signal maps data (Section 2.4).
3. We evaluate the feasibility of achieving different notions of privacy, namely worst-case (DP) versus on-average (GAP, IT) privacy guarantees, in the signal map application (Section 2.5.1).
4. We conduct a systematic exploration of the privacy-utility trade-offs in signal maps data under different obfuscation approaches (Section 2.5.3).
5. We demonstrate that obfuscation strategies which use the structure and intended use of datasets in
their design, and target average-case, rather than worst-case, guarantees, can concurrently achieve
adequate privacy and utility in the context of signal maps (Section 2.5).
In the next section, we briefly discuss relevant work in privacy, especially as it relates to mobile network
data. Section 2.2 describes our system model, including the three real-world datasets that we use in our
evaluation, the threat models we consider, the privatizer and adversary model we implement, and the
service provider model we consider. Section 2.3 rigorously defines our privacy and utility metrics. Section
2.4 presents each of the four obfuscation schemes and their application to signal maps. In Section 2.5, we
evaluate and compare these schemes, and analyze our results. We discuss the limitations and future work
in Section 2.6, and present our conclusions in Section 2.8.
2.2 System Model
Figure 2.1 illustrates the system model we consider, which involves mobile users, a service provider or a
third party, and an adversary. User devices record network measurement data and transmit it to a service provider or third-party server. Since the reported data contains information that the users may deem
[Figure 2.1 diagram: users apply privatizers to their measurement data before uploading; the service provider or third party stores the obfuscated data in a database and trains the signal map model (RSS predictor); the adversary (user ID & location estimator) also accesses the obfuscated data.]
Figure 2.1: Overview of system model. 1) Users collect measurement data and obfuscate it before uploading
it to the service provider; 2) The service provider or third party aggregates obfuscated user data to train a
signal map model, i.e. RSS prediction model; 3) The adversary has access to the obfuscated user data and
uses it to estimate the user ID and locations.
private (e.g. user location, see Section 2.2.1), users apply device-level privatizers to obfuscate their data
locally before uploading them to the server (see Section 2.2.3). The goal of the service provider is to train
a RSS model based on the aggregated obfuscated user measurement data, which can be used to generate
signal maps and thus guide network planning and operation [23] (see Section 2.2.6). Finally, an adversary
with access to the obfuscated data estimates the whereabouts of users, by estimating the user ID and
location corresponding to the incoming measurements (see Section 2.2.5).
Note that we assume the service provider is also curious about user whereabouts and thus can be the
adversary. We further assume that the adversary has access to the obfuscated data as it arrives at the
server, but no side information that directly reveals the identity of users (see Section 2.2.2 for a detailed
description of the threat model).
2.2.1 User Data
We use three real-world datasets collected from different countries and over different time periods to
evaluate the performance of our privatization schemes under different environments and user behaviors,
and thus make our findings more conclusive.
The first dataset is taken from users in Chania, Greece, and will be referred to as the Chania dataset,
which contains measurements from nine users over seven months in 2014. The nine users are mobile
device owners who carry their devices with them throughout the day collecting measurements. Each
measurement contains 24 features: device address, timestamp (to the second), received signal strength
(RSS) in dBm, latitude, longitude, cellID identifying the base station, downlink carrier frequency, uplink
carrier frequency, mobile network code, etc.
The second dataset contains measurements from seven users over four months in the University of
California Irvine (UCI) campus in 2017, and will be referred to as the UCI dataset. Each measurement consists of 15 features including latitude, longitude, reference signal received power (RSRP) in dBm, reference signal received quality (RSRQ) in dBm, timestamp, deviceID, cellID, etc.
The third dataset is collected by Radiocell.org [22], which has been crowdsourcing wireless network
measurements from world-wide mobile users since 2009. It is the largest open-source mobile network
dataset we can have access to. We sample about 0.5 million measurements from 219 mobile users in the UK in 2019∗, and refer to it as the Radiocell dataset. Each measurement has 23 features including latitude, longitude,
altitude, speed, signal strength (SS) in dBm, country code, mobile network code, etc.
The most relevant features to this work are tabulated in Table 2.1 along with an indicator of their
sensitivity. User ID and location are assumed sensitive features (private), whereas RSS/RSRP and others
are not sensitive (public).
Features      User ID   Latitude   Longitude   RSS      Others
Sensitivity   Private   Private    Private     Public   Public
Variable      u         x1         x2          x3       xj, j > 3
Table 2.1: Dataset Features
∗We choose UK since most of the collected measurements in 2019 come from mobile users in UK (10 million measurements
in total). To limit computational complexity, we select three cells containing the largest amount of data.
Figure 2.2: Chania Dataset (colored by user)
For visualization purposes, we have plotted the data of the first dataset over the geographic region in
Figure 2.2. The colors indicate user ID, and it is apparent that one cannot easily infer user ID based on
location alone.
2.2.2 Threat Model
Our adversary has the goal of gathering private information that may be revealed by users operating in
the mobile data network who are sending data reports to the service provider. The adversary may use this
information for purposes not in the users’ interest, or even to aid criminal attacks such as identity theft.
Note that the adversary can either be undetected malware installed at the service provider, or the service provider itself, since we assume that the service provider or third party may also be curious about the user whereabouts, as this information can be sold to other third parties or used for other purposes [8].
To accomplish his/her goal, the adversary will seek to obtain access to as many user feature reports
as possible, consisting of (u, x1, x2, x3, ...). Since the primary information sought by the adversary may
not be explicitly present in the reports, e.g., if the reports are intentionally obfuscated, the adversary will
perform inference attacks to estimate the private user information they desire. The nature of the threat
may have some variation dependent on the specific mobile data application and the capabilities of the
adversary. With this in mind, we consider the following properties as part of the definition of the threat
model:
• Whether the adversary can access individual user reports directly, or whether their access is limited
to the aggregated reports of all users,
• whether the adversary should be assumed to have bounded computational resources,
• whether the adversary has access to relevant side information, and
• whether users are primarily concerned with potential exposure of private information from their
reports on average or in the worst-case.
Side information is any additional information that may be available to an adversary that could be used to
supplement the information collected from the user reports to increase the efficacy of an inference attack.
This could include public databases from organizations like the US Census Bureau or the Department of
Transportation which allow an adversary to associate data features, e.g., addresses with names.
Typical mobile network data threat model: For most mobile network data applications and users, we
apply the following threat model:
• The adversary can access individual user reports directly,
• the adversary’s computational resources are bounded,
• the adversary has limited access to side information, and
• users are primarily concerned with privacy exposure on average.
We consider that many users are likely to have reservations about providing private data to a service
provider, either because they do not trust the provider to adequately protect their data, or they believe the
service provider will themselves use the data in ways that do not align with the user’s interests. For this
reason, we assume a threat model where users must be able to protect their private information at the local
level, e.g., at the user device. We also recognize that some users likely will not have such reservations, and
thus a minority of users can be incentivized, e.g., through discounts, to trust a data aggregator with their
data, allowing for the possibility of training or tuning privacy schemes based on real user data. Adversary
computational resources are assumed to be bounded, recognizing that other methods outside the scope
of the data network could be employed to reliably obtain the same private information if an adversary is
assumed to have limitless resources. Adversary access to side information is assumed to be limited for
the same reason. Finally, we assume users will typically be concerned with the exposure of their private
information on average. For most mobile data applications, a user will likely operate with the network over
a long period of time and will generate many feature reports as a result. Further, exposure of the private
data of any one report will typically pose a much lower risk than exposure through the aggregation of
many reports over a period of time. Thus, protecting against an adversary attack on any single report
under worst-case conditions is unnecessary for typical applications.
Worst-case mobile network data threat model: Due to the wide variety of potential mobile network
data applications and possible user privacy concerns, we acknowledge there may be some use cases where a
worst-case threat model is appropriate. To account for this, we also treat such a model in our analysis. This
adversary can access individual reports directly, but in contrast to the typical threat model we assume the
adversary has unbounded computational resources and unlimited access to side information. Also, users
are concerned with exposure of any single feature report, and their private information in each report
must be protected from exposure under worst-case conditions.
2.2.3 Data Obfuscation and Privatizers
To protect against the adversary threat, privacy can be preserved through obfuscation of the feature data
provided by individual users before being released to the service provider. At a minimum, the feature
set is stripped of user ID. Remaining features are then obfuscated according to the selected privatization
scheme, or "privatizer" for short. This is needed because the adversary may learn patterns in the data
which associate public and private features, thus it is not sufficient to only obfuscate private features.
                                            LDP                          GAP                         IT
Threat model
  Adversary computational resources         Unlimited                    Limited                     Unlimited
  Adversary side-info access                Unlimited                    Limited                     Unlimited
  Type of privacy-loss guarantee            Worst-case                   On-average                  On-average
  Provable adversary privacy protection     Against any adversary        Against trained adversary   Against any adversary
Context
  Privatizer access to data for training    Not necessary but helpful*   Yes                         No
  Privatizer access to data distribution    Not necessary but helpful*   No                          Yes
  Utility protection type                   None/Some*                   Maximize utility            Lower bound on utility
* As discussed in detail in Section 2.4.2, LDP requires clipping. While clipping can be done in a manner which is agnostic to the data [24, 25], this may result in large utility loss. As a result, clipping is usually performed using information about the data to ensure the added noise is calibrated with the range of data values, see Eq. (2.16).
Table 2.2: Context used by privatizers (last 4 rows) and properties of threat models (first 6 rows).
The privatizer will produce an obfuscated measurement feature report (u, x1, x2, x3, . . .) → (y1, y2, y3, . . .), with yi denoting the obfuscated version of xi, where the mapping depends on the design of the privatizer.
We will consider several privatizers, described fully in Section 2.4. Some privatizers leverage actual user
data in their design. We assume such data is collected either through opt-in surveys and service provider
incentives, or else collected by the provider through other means such as wardriving. In our analysis, we
use 70% of our available dataset for training our adversary (see Section 2.2.5 for more details) as well as for
training, fitting models, and/or choosing parameters of the privatizers (see Section 2.4 for more details).
The remainder of the dataset will be used to test our privatizers against the adversary.
2.2.4 Context
User data is a type of application-specific context, and different privatizers may use the actual data, data
distributions, or merely data moments like mean and variance. There are other types of application-specific
context, e.g. privacy and utility metrics of interest, which privatizers may optimize over. Since mobile service providers know what they want to use the data for, and may ask their clients about privacy concerns,
such metrics may indeed be available to be used in the design of privatizers.
Using context has implications to the threat model. For example, optimizing over a particular privacy
metric guarantees protection against this privacy metric but not against any function of the data. As
another example, if a privatizer optimizes its design under a known data distribution, or is trained under
a given dataset, its performance is not guaranteed under different distributions and datasets.
Using context may also offer utility guarantees since optimizing over, or putting a constraint on a utility
metric, restricts the privatizer from making obfuscation decisions that reduce utility below acceptable
levels.
Table 2.2 compares different privatizers with respect to how much context they use and which threat
model properties they can protect against. LDP offers stronger privacy protection than the rest as it provides worst-case privacy guarantees against any adversary with potentially unlimited resources and side
information access. However, it does not have a formal mechanism to guarantee a minimum level of utility. In contrast, GAP and IT are aware of application-specific utility metrics which they include in their optimization setups, and thus provide utility guarantees. The GAP privatizer in particular optimizes a multi-objective function which considers both privacy and utility. That said, it is optimized and can offer
formal guarantees only against the particular adversary in its training loop. These fundamental distinctions among the different obfuscation approaches are discussed in more detail in Section 2.4 and their
implications to the privacy-utility trade-offs are presented and discussed in detail in Section 2.5.
2.2.5 Adversary Model
Depending on whether users upload measurements to the server one at a time or in batches, the adversary
may or may not know whether a sequence of measurements originated from a single user or multiple users.
Consider first the scenario where each user uploads one obfuscated measurement each time. Given that
the user ID of each obfuscated measurement is unknown, the adversary takes as input one measurement
from the obfuscated dataset (y1, y2, y3, . . .) and predicts the user ID and true location (the unobfuscated
latitude and longitude) from which the measurement originated, i.e. (û, x̂1, x̂2). Now consider the scenario where, for the sake of reduced system complexity, each user uploads a sequence of obfuscated measurements each time†. While the user ID of each obfuscated measurement in the database is unknown, the adversary knows that measurements in the same batch belong to the same user, and can take advantage of correlations across measurements to improve estimation. In this case the adversary takes as input a measurement sequence $\{(y_{1i}, y_{2i}, y_{3i}, \ldots)\}_{i=1}^{L}$ from a single user and predicts a single user ID $\hat{u}$ and the true locations $\{(x_{1i}, x_{2i})\}_{i=1}^{L}$ ($i$ denotes the $i$-th measurement in this sequence, and $L$ is the sequence length). In Section 2.5 we investigate the performance under both scenarios; see Section 2.5.2 for a direct comparison between the two.
The adversary estimation is a mapping from (y1, y2, y3, . . .) to (û, x̂1, x̂2) and one may use a number of approaches to perform that mapping. In theory, one may discretize the continuous xi's and yi's and use empirical conditional probabilities and maximum likelihood estimation, but in practice the state space would explode. Given the availability of real-world datasets, learning is a better choice. We experimented with linear and non-linear models for user ID estimation, and chose a deep neural network (DNN) to model our adversary (see Figure 2.3), given the effectiveness of DNNs in approximating non-linear functions.
Specifically, our adversary is modeled as a fully-connected DNN containing two hidden layers with 256
neurons each. Between layers, we employ Rectified Linear Unit (ReLU) activations, and our optimization
relies on Adaptive Moment Estimate (Adam) stochastic gradient descent with a learning rate of 0.001. These
values were empirically selected to maximize the adversary’s performance when given the unobfuscated
data as input.
†
In practice, the service provider can require users to upload their data weekly or monthly.
Assume that the input measurement contains m features and there are k users (m and k depend on the dataset; see the three datasets described in Section 2.2.1). Then each input batch has n measurements containing the m features. The output of the adversary neural net is an n × (k + 2) matrix representing estimates of user ID and location (n = 1024 in our experiments). Each row in this matrix contains the likelihood that this measurement belongs to different users, and the estimated latitude and longitude of the original measurement. The loss function used to train the adversary is a weighted sum of the categorical cross entropy loss of the user ID estimate vector and the Euclidean distance between the actual location and the location estimate. The user ID estimate error, location estimate error, and adversary loss functions are defined in Section 2.3.
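For concreteness, a minimal sketch of such an adversary is shown below. It is illustrative only: PyTorch and all class and function names are assumptions rather than the exact code used in our experiments, but the architecture (two 256-unit hidden layers with ReLU) and the weighted loss follow the description above and Eq. (2.7).

```python
import torch
import torch.nn as nn

class Adversary(nn.Module):
    """Fully-connected adversary: two hidden layers of 256 units with ReLU.
    Per measurement it outputs k user-ID logits plus a (latitude, longitude) estimate."""
    def __init__(self, m_features, k_users):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(m_features, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, k_users + 2),
        )

    def forward(self, y):                       # y: (n, m) batch of obfuscated measurements
        out = self.net(y)
        return out[:, :-2], out[:, -2:]         # user-ID logits, location estimate

def adversary_loss(logits, loc_hat, user_ids, loc_true, v1=1.0, v2=1.0):
    """Weighted sum of cross entropy (user ID) and Euclidean distance (location), cf. Eq. (2.7).
    user_ids is a LongTensor of true IDs; loc_true holds true latitude/longitude pairs."""
    ce = nn.functional.cross_entropy(logits, user_ids)
    dist = torch.sqrt(((loc_hat - loc_true) ** 2).sum(dim=1) + 1e-12).mean()
    return v1 * ce + v2 * dist

# Training would use Adam with a learning rate of 0.001, e.g.:
# adversary = Adversary(m_features=24, k_users=9)
# optimizer = torch.optim.Adam(adversary.parameters(), lr=1e-3)
```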
We provide our adversary 70% of the obfuscated dataset to train on, for which it has access to the
unobfuscated user IDs and locations, and test it on the remaining 30% of the data. Providing the adversary
such a high portion of the data for training makes our privacy results conservative. In our threat model we
have assumed some access to side information but comprehensively modeling access to possible forms of
side information is intractable. The adversary’s access to 70% of the dataset with obfuscated and true user
ID and location labels serves as an approximation of some form of side information. Side information may
include known user locations at certain times, or inputs from the adversary’s own devices on the network
to establish ground truth. The training set could also correspond to the adversary simulating published
privatizers which may be revealed by the service provider to help convince users regarding their ability to
preserve privacy.
2.2.6 Signal Map Model
The service provider trains an RSS predictor based on the aggregated user data such that it can generate
accurate signal maps. Specifically, the model input features include (obfuscated) latitude, longitude and
other features (i.e. (x1, x2, xj , j > 3)), and the model output is the RSS value x3 in dBm.
[Figure 2.3 diagram: the privatizer maps inputs to privatized inputs, which the adversary maps to an estimate of the sensitive features.]
Figure 2.3: A diagram of adversary implementation.
There is a long line of research on RSS predictor models, see, for example [26, 27, 2]. We first consider
a simple path loss model [28] but find its accuracy to be underwhelming. We also consider a linear and
a neural network model and find that both have comparably good accuracy, yet the former is easier to work with. Notably, its parameters can be estimated in one step, which allows us to calculate application-specific utility metrics more efficiently (see Section 2.3.2). We thus select a linear RSS prediction model.
Specifically, we use the following model:
x_3 = a_0 + \sum_{j=1, j \neq 3}^{m} a_{j-1} x_j,  (2.1)
where m is the total number of features in a measurement and \alpha = [a_0, \ldots, a_{m-1}]^T is the parameter vector. Given a set of n measurements X = [x_{ji}], where j = 1, \ldots, m and i = 1, \ldots, n, the parameter vector of the RSS prediction model can be estimated via linear regression as follows:
\alpha_X = (X_{-3}^T X_{-3})^{-1} X_{-3}^T X_3,  (2.2)
where X_3 is the third column of X and X_{-3} is the remaining columns of X without the third column. Similarly, given a set of n obfuscated measurements Y = [y_{ji}], where j = 1, \ldots, m and i = 1, \ldots, n, the parameter vector of the RSS prediction model can be estimated as \alpha_Y.
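The least-squares estimate of Eq. (2.2), and the resulting map error used later in Eq. (2.11), can be computed as in the following sketch. This is an illustrative NumPy implementation with hypothetical helper names; it assumes measurements are stored row-wise with the RSS value in the third column, per Table 2.1.

```python
import numpy as np

def fit_rss_model(X, rss_col=2):
    """Least-squares estimate of the RSS-model parameters, cf. Eq. (2.2).
    X: (n, m) array of measurements (user ID already stripped); rss_col is the
    index of the RSS feature (the third column in Table 2.1). A column of ones
    could be appended to X_rest for an explicit intercept; omitted here."""
    X3 = X[:, rss_col]                        # RSS values
    X_rest = np.delete(X, rss_col, axis=1)    # remaining features
    alpha, *_ = np.linalg.lstsq(X_rest, X3, rcond=None)
    return alpha

def generated_map_error(X, Y):
    """L1 distance between parameters fitted on original (X) and obfuscated (Y)
    data, i.e. -U2 in Eq. (2.11)."""
    return np.abs(fit_rss_model(X) - fit_rss_model(Y)).sum()
```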
2.2.7 Practical Considerations Regarding the Implementation of Privatizers
As discussed in our system model, the privatizer is deployed in local devices of mobile network users
to obfuscate the collected measurements. In practice, the privatizer can be implemented as a software
function, being part of the crowdsourcing Apps used to collect and upload signal map measurements
(e.g. RadioBeacon [29] used for Radiocell dataset collection and AntMonitor [21] used for UCI dataset
collection). We can integrate the privatizer into these Apps. After each device collects the signal map
measurements, instead of sending the raw measurements with actual features (e.g. user locations and RSS values), each device will first call the privatizer function to obfuscate those features in the raw measurements and then send the measurements with obfuscated features to the service provider or third party, which will
receive the obfuscated measurements and store them in its database as XML or JSON files (e.g. in [22]).
Note that the service providers can be cellular providers like AT&T [30] and the third parties can be mobile
analytics companies like Tusla [31]. They will build RSS prediction models using the crowdsourced signal
map measurements for various purposes like improving network performance and coverage as discussed
in [2]. Since our obfuscated measurements have the same format as the raw measurements, they can be
directly used to train the RSS model without any changes in the original infrastructure.
Moreover, the only part of the crowdsourcing Apps that requires new implementation work is the
privatizer, which will not lead to large system overhead (e.g. memory and CPU overhead). Specifically,
adding noise should be easy to implement and with negligible overhead. The IT privatizer only needs to
select codes from a codebook, which also does not have significant overhead. GAP needs more computation
resources during its training phase while fewer resources during runtime. Moreover, since the number of
features in signal map measurements is small, the system overhead of running GAP privatizer is also not
significant. Therefore, we expect that the runtime overhead of running privatizers will not be a problem
for today’s mobile devices.
2.3 Definition of Metrics
In this section we define the metrics used to evaluate privacy and utility.
2.3.1 Privacy
Let n denote the number of measurements per batch, u = [u_i], i = 1, \ldots, n, represent the user ID of each collected measurement, and \hat{u} = [\hat{u}_i] be the adversary's estimate of u. The adversary computes a probability distribution over the space of possible user IDs and selects for each measurement the user ID estimate with the maximum likelihood. We define the adversary estimate accuracy as the fraction of correct user ID estimates, that is,
acc(\hat{u}, u) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\hat{u}_i = u_i},
where the indicator function \mathbf{1}_{\hat{u}_i = u_i} is equal to 1 if the estimate is correct and 0 otherwise. Since high values of accuracy correspond to low values of privacy, we define the first privacy metric as
P_1(\hat{u}, u) = 1 - acc(\hat{u}, u).  (2.3)
\hat{x}_1 and \hat{x}_2 are the adversary's estimates of the true latitude x_1 and longitude x_2. While \hat{u} represents a probability distribution, \hat{x}_1 and \hat{x}_2 specify an exact location. Our second privacy metric is the Euclidean distance between the true location and the adversary's estimate averaged over the batch, defined by
P_2(\hat{x}_1, \hat{x}_2, x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} \sqrt{(\hat{x}_{1i} - x_{1i})^2 + (\hat{x}_{2i} - x_{2i})^2},  (2.4)
where the subscript i = 1, \ldots, n corresponds to the i-th measurement in the batch of size n. This metric defines how well the adversary is able to recover the original user location. High values of adversary location error correspond to high privacy.
Since both user IDs and locations are considered as private and sensitive information in our application,
we further define the following composite privacy metric:
P(\hat{x}_1, \hat{x}_2, \hat{u}, x_1, x_2, u) = v_1 P_1(\hat{u}, u) + v_2 P_2(\hat{x}_1, \hat{x}_2, x_1, x_2),  (2.5)
where v_1 and v_2 are parameters controlling the weights of the two aforementioned privacy metrics. P_1, P_2, and P are the privacy metrics we use throughout the paper to compare the performance of different privatizers.
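A minimal sketch of these privacy metrics, with hypothetical helper names, is given below; it follows Eqs. (2.3)-(2.5) directly.

```python
import numpy as np

def p1_id_error(u_hat, u):
    """P1: fraction of incorrect user-ID estimates, Eq. (2.3)."""
    return 1.0 - np.mean(u_hat == u)

def p2_location_error(x1_hat, x2_hat, x1, x2):
    """P2: average Euclidean distance between true and estimated locations, Eq. (2.4)."""
    return np.mean(np.sqrt((x1_hat - x1) ** 2 + (x2_hat - x2) ** 2))

def composite_privacy(u_hat, u, x1_hat, x2_hat, x1, x2, v1=1.0, v2=1.0):
    """P: weighted combination of P1 and P2, Eq. (2.5)."""
    return v1 * p1_id_error(u_hat, u) + v2 * p2_location_error(x1_hat, x2_hat, x1, x2)
```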
Additional privacy metrics. The composite privacy metric defined above is not differentiable because
P1 is not differentiable. This is a problem for adversary training. To handle this we use the cross entropy
loss of the user ID estimate
P_1^{ce}(\hat{u}, u) = -\frac{1}{n} \sum_{i=1}^{n} \log p_i,  (2.6)
where p_i is the estimated likelihood of user ID u_i for measurement i, and define the loss function of the adversary as
L_a(\hat{x}_1, \hat{x}_2, \hat{u}, x_1, x_2, u) = v_1 P_1^{ce}(\hat{u}, u) + v_2 P_2(\hat{x}_1, \hat{x}_2, x_1, x_2),  (2.7)
which is used in the training of the adversary and of the GAP neural networks (the GAP privatizer and adversary used in the iterative training, see Section 2.4.3).
Our IT privacy approach, see Section 2.4.4, is motivated by the use of mutual information as a measure of privacy. The mutual information between two random variables X (e.g., our input) and Y (e.g., the obfuscated data) quantifies how much information about one random variable is obtained through observing the other. It is given by
I(X; Y) = \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} p_{X,Y}(x, y) \log \frac{p_{X,Y}(x, y)}{p_X(x)\, p_Y(y)},  (2.8)
where p_{X,Y} is the joint probability mass function and p_X, p_Y are the marginal probability mass functions.
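For intuition, mutual information can be estimated from empirical counts once the data has been discretized; the sketch below is an illustrative implementation of Eq. (2.8) under that assumption (the quantization step itself is not shown).

```python
import numpy as np

def mutual_information(x, y):
    """Empirical I(X; Y) in nats from two discrete (quantized) sequences, Eq. (2.8)."""
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(x_vals), len(y_vals)))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()                       # empirical joint pmf p_{X,Y}
    px = joint.sum(axis=1, keepdims=True)      # marginal p_X
    py = joint.sum(axis=0, keepdims=True)      # marginal p_Y
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))
```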
Last, the privacy metrics defined above are well suited for the typical threat model discussed in Section
2.2.2. However, for the worst-case threat model involving adversaries with unbounded computational
resources and auxiliary information where users seek protection of any single report (see Section 2.2.2) we
resort to differential privacy (DP) [32]. Specifically, let K be a randomized function applied to the input dataset. K gives (ϵ, δ)-differential privacy if for all datasets D_1 and D_2 which differ in at most one element and for all S \in \mathrm{range}(K),
\Pr[K(D_1) \in S] \leq e^{\epsilon} \Pr[K(D_2) \in S] + \delta,  (2.9)
where the probability is taken over the randomness in K. ϵ and δ bound the difference between the output of K on D_1 and D_2, thus making it hard to guess the input (D_1 versus D_2) by observing the output. DP is a strong guarantee, since it does not make any assumptions about the computational power and auxiliary information available to the adversary, and ϵ and δ serve as metrics for privacy; see [17] for more details.
2.3.2 Utility
Let m be the number of features at each measurement excluding the user ID which is stripped from the
input. The output of the privatizer y = [yj ], j = 1 . . . m is the obfuscated data, e.g. y1 and y2 denote
obfuscated latitude and longitude, respectively. Our utility metrics quantify the difference between the
input data x = [xj ] and the obfuscated data y = [yj ]. Recall that we consider n measurements per batch
thus xj and yj are vectors of size n with elements xji and yji, i = 1 . . . n, respectively. We consider several
utility metrics motivated by real-world applications of crowdsourced network data.
The first metric quantifies the overall distortion of the dataset, considering all m features, by the L2
norm distance between input and obfuscated data, averaged over all n batch measurements:
U_1(x, y) = -\frac{1}{n} \sum_{i=1}^{n} \sqrt{\sum_{j=1}^{m} (y_{ji} - x_{ji})^2}.  (2.10)
Intuitively, high values of distortion correspond to low utility thus the minus sign in front of distortion in
Eq. (2.10).
The second utility metric is related to the RSS prediction model described in Section 2.2.6. Recall that
the goal of service provider is to estimate an accurate RSS prediction model based on the aggregated user
data. However, with obfuscated user data, the estimated parameters of the RSS prediction model differ from those estimated with unobfuscated user data (i.e. the estimated parameter vector changes from αX to αY ,
see Equation (2.2)). To minimize the difference between them, we define our second utility function as the
opposite of L1-norm distance between αX and αY as follows:
U2(x, y) = −||αX − αY ||1, (2.11)
where αX represents the RSS prediction model parameters estimated by unobfuscated user data (i.e. the
privatizer’s input) and αY represents the RSS prediction model parameters estimated by obfuscated user
data (i.e. the privatizer's output). We refer to ||αX − αY ||1 as the generated map error, where higher values of map error correspond to lower utility. While many metrics could be used to measure the distance
between αX and αY , comparing the fitted parameters over this bounded space provides a simple, effective
loss function. Note that this map error does not capture how well a RSS prediction model generated by the
obfuscated data could be used to predict RSS values at a new location, but rather captures the “distance"
between maps generated before and after obfuscation.
Envisioning that the service provider may care for more than a single application-specific utility metric
like U2 in practice, we further define a composite utility metric U(x, y) as
U(x, y) = w_1 U_1(x, y) + w_2 U_2(x, y),  (2.12)
where w1 and w2 are parameters adjusting the weights of each utility metric.
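The utility metrics translate directly into code; the sketch below is illustrative, uses hypothetical names, and takes the fitted parameter vectors αX and αY as inputs (cf. Eq. (2.2)) rather than re-estimating them.

```python
import numpy as np

def u1_distortion(x, y):
    """U1: negative average per-measurement L2 distortion, Eq. (2.10).
    x, y: (n, m) arrays of original and obfuscated measurements."""
    return -np.mean(np.linalg.norm(y - x, axis=1))

def u2_map_utility(alpha_x, alpha_y):
    """U2: negative L1 distance between RSS-model parameters fitted on
    original vs. obfuscated data, Eq. (2.11)."""
    return -np.abs(alpha_x - alpha_y).sum()

def composite_utility(x, y, alpha_x, alpha_y, w1=1.0, w2=1.0):
    """U: weighted combination of U1 and U2, Eq. (2.12)."""
    return w1 * u1_distortion(x, y) + w2 * u2_map_utility(alpha_x, alpha_y)
```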
2.4 Privatizers
In this section we introduce in detail each of the four privatizers, which represent different types of obfuscation schemes. Specifically, we first select a Gaussian noise-adding privatizer for its simplicity and as
a benchmark. We then select a locally differentially private privatizer (LDP) motivated by the well-known strengths of Differential Privacy. We then select a privatizer based on GANs (referred to as the GAP privatizer), given the recent interest in how adversarial learning may be used to train privatizers by positioning them against adversaries. Last, we select the so-called IT privatizer since it is a good representative of obfuscation schemes which use mutual information as a privacy metric and optimization to optimally design
obfuscation.
2.4.1 Gaussian Noise-Adding Privacy
Our Gaussian noise-adding privatizer (Noise privatizer) takes the simplest approach to data obfuscation.
For each input batch of size n × m, where n is the number of points and m is the number of features, we
add an n × m matrix of Gaussian noise. Each element in this noise matrix is normally distributed with
a mean of 0 and a standard deviation of σ. Since the data is also normalized such that each feature has a
mean value of zero with a standard deviation of 1, values of σ close to 1 add a significant amount of noise
and we choose to vary σ between 0 and 1. Figure 2.4a provides a visualization of what the normalized
(a) Input Data, Obfuscated Data, Adversary Estimate (b) Signal Maps Before and After Obfuscation
Figure 2.4: Noise privatizer when σ=0.1.
(a) Input Data, Obfuscated Data, Adversary Estimate (b) Signal Maps Before and After Obfuscation
Figure 2.5: Noise privatizer when σ=0.9.
input data, obfuscated data, and adversary’s estimate look like side by side using the Noise privatizer for a
low value of σ. Figure 2.4b shows the signal maps generated before and after obfuscation. This shows that
even in the presence of obfuscation, we can generate representative signal maps with the obfuscated data.
Figures 2.5a and 2.5b show the same plots for a high value of σ. Note that while the privacy is improved,
i.e. the adversary estimate is further from the input data, the signal maps differ significantly.
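A minimal sketch of the Noise privatizer is given below; it assumes the batch has already been normalized to zero mean and unit variance per feature, as described above, and the function name is hypothetical.

```python
import numpy as np

def noise_privatizer(x, sigma, rng=None):
    """Add i.i.d. zero-mean Gaussian noise with standard deviation sigma
    to every entry of a normalized (n, m) batch of measurements."""
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)

# Example: y = noise_privatizer(x_normalized, sigma=0.1)
```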
For reference, a privatizer which releases a completely random dataset (from a normal distribution
with variance of 1.0) regardless of input data would observe the errors shown in Table 2.3.
Metric No Obfuscation Random Data
Distortion (−U1) 0 5.74
Generated Map Error (−U2) 0 2.41
Table 2.3: Utility Reference Values
2.4.2 Local Differential Privacy
We implement two local DP (LDP) privatizers which provide mathematical guarantees for privacy (see Eq.
(2.9)) under the worst-case threat model discussed in Section 2.2.2.
The first approach is the Gaussian Mechanism with parameters ϵ and δ, which we refer to as GLDP
[17]. This mechanism adds zero-mean Gaussian noise with variance b to each feature. This variance is
defined by
b = \frac{2 \Delta^2 \ln(1.25/\delta)}{\epsilon^2},  (2.13)
where ∆ is the L2-sensitivity of a function f given by
\Delta = \max_{D_1, D_2 \in \mathcal{D}} ||f(D_1) - f(D_2)||_2,  (2.14)
where D1 and D2 are subsets of the dataset D differing only by one element. Generally, in the local DP
model, one can think of D1 and D2 as datasets of size 1 (i.e. one data point) and f as an identity function.
Therefore the sensitivity becomes the greatest L2-distance between any two data points. In practice, we use
an analytically calibrated Gaussian Mechanism which is shown to extend better to the very high privacy
regime (ϵ → 0) and the low privacy regime (ϵ → ∞); see Algorithm 1 in [33] for the exact calculation of the variance of the added noise b.
The second approach is the Truncated Laplacian Mechanism with parameters ϵ and δ, which we refer
to as LLDP, recently proposed in [18]. This mechanism adds noise satisfying the truncated Laplacian
distribution, with probability density function
f(x) = \begin{cases} B e^{-|x|/\lambda}, & x \in [-A, A] \\ 0, & \text{otherwise} \end{cases}  (2.15)
where
\lambda = \frac{\Delta}{\epsilon}, \quad A = \frac{\Delta}{\epsilon} \log\!\left(1 + \frac{e^{\epsilon} - 1}{2\delta}\right), \quad B = \frac{1}{2\lambda\,(1 - e^{-A/\lambda})},
and ∆ is defined in Eq. (2.14).
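One simple way to draw noise from this truncated Laplacian is rejection sampling from an ordinary Laplace distribution, since the truncated density is proportional to the Laplace density on [−A, A]. The sketch below is an illustrative implementation under that approach; the mechanism in [18] may be implemented differently (e.g. via the inverse CDF).

```python
import numpy as np

def truncated_laplace_noise(shape, sensitivity, eps, delta, rng=None):
    """Draw noise from the truncated Laplacian of Eq. (2.15) by rejection sampling:
    sample Laplace(0, lambda) and keep only values with |x| <= A."""
    rng = np.random.default_rng() if rng is None else rng
    lam = sensitivity / eps
    A = (sensitivity / eps) * np.log(1.0 + (np.exp(eps) - 1.0) / (2.0 * delta))
    out = np.empty(int(np.prod(shape)))
    filled = 0
    while filled < out.size:
        cand = rng.laplace(loc=0.0, scale=lam, size=out.size - filled)
        cand = cand[np.abs(cand) <= A]        # keep only in-range samples
        out[filled:filled + cand.size] = cand
        filled += cand.size
    return out.reshape(shape)

# Example: noise = truncated_laplace_noise((1024, 23), sensitivity=2 * 7.154, eps=1.0, delta=1e-5)
```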
For both approaches, we follow standard practice and use δ = 0.00001 (δ should be much smaller than
1/n [34]) and ϵ between 1 and 10 (larger ϵ values yield a very loose bound, see Eq. (2.9)), where low values
of epsilon guarantee better privacy.
Moreover, following standard practice again, we clip each data point to have L2-norm at most ∆/2. Then, by invoking the triangle inequality, we can ensure that the sensitivity is no greater than ∆. Specifically, for both the Gaussian and Laplacian mechanisms, we clip each data point according to the following function
x_{\text{new}} = \frac{x}{||x||_2} \min\!\left(\frac{\Delta}{2}, ||x||_2\right).  (2.16)
To choose ∆/2, we use the rule of thumb that clipping should occur 5% of the time. Using the pilot dataset to approximate how much of the data would be clipped for a given value, we choose ∆/2 = 7.154 and use this parameter during testing.
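The clipping step of Eq. (2.16) followed by Gaussian noise addition can be sketched as follows. This is illustrative only: it uses the closed-form variance of Eq. (2.13), whereas our evaluation uses the analytically calibrated variance of [33], and the function names are hypothetical.

```python
import numpy as np

def clip_l2(x, clip_norm):
    """Clip each measurement (row of x) to L2 norm <= clip_norm = Delta/2, Eq. (2.16)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return x * scale

def gaussian_mechanism(x, clip_norm, eps, delta, rng=None):
    """Clip, then add zero-mean Gaussian noise with variance per Eq. (2.13),
    using sensitivity Delta = 2 * clip_norm."""
    rng = np.random.default_rng() if rng is None else rng
    x_clipped = clip_l2(x, clip_norm)
    sensitivity = 2.0 * clip_norm
    variance = 2.0 * np.log(1.25 / delta) * sensitivity ** 2 / eps ** 2
    return x_clipped + rng.normal(0.0, np.sqrt(variance), size=x.shape)

# Example: y = gaussian_mechanism(x, clip_norm=7.154, eps=1.0, delta=1e-5)
```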
Table 2.4 compares the GLDP and LLDP privatizers with respect to our privacy and utility metrics on the Chania dataset. We notice that both the GLDP and LLDP privatizers yield quite large utility losses. From this table, it is evident that GLDP achieves sizably higher privacy than LLDP w.r.t. P1 and P2, especially for larger ϵ values. Although the GLDP privatizer has a larger loss in utility, neither the GLDP nor the LLDP privatizer can offer any utility protection. Hence, we use GLDP, which offers higher privacy, in the rest of the paper when
comparing LDP with other approaches under the typical threat model, see Section 2.5.
Note that while our Noise and LDP privatizers both add normally distributed noise, the key difference
between the two is the noise clipping step. Intuitively, this ensures that no two data points are too different.
This gives a degree of anonymity to each measurement that is crucial to privacy under a worst-case threat model.
ϵ      Mechanism   P1     P2     U1       U2
1      GLDP        0.68   0.94   113.76   2.96
1      LLDP        0.68   0.94   84.60    2.95
10     GLDP        0.63   0.91   16.46    2.39
10     LLDP        0.49   0.69   8.50     2.49
100    GLDP        0.32   0.36   4.21     2.44
100    LLDP        0.05   0.10   0.93     2.20
Table 2.4: Comparison of GLDP and LLDP privatizer on Chania dataset.
2.4.3 Generative Adversarial Privacy
Generative Adversarial Privacy is a data-driven approach to obfuscation which learns a privatization strategy by positioning the privatizer and adversary against each other in a minimax game [19, 35]. Our privatizer is a fully-connected feedforward neural network with a similar structure to our adversary. It has two
hidden layers of 256 units each. Between layers we employ Rectified Linear Unit (ReLU) activations, and
our optimization relies on Adaptive Moment Estimate (Adam) stochastic gradient descent with a learning rate of 0.001. Our privatizer, which takes an input batch of size n × m, outputs an n × m batch of
obfuscated data, where each measurement has been obfuscated independently. (We treat the case where
measurements are grouped into batches and then jointly obfuscated in Section 2.5.2.)
Our privatizer wants to minimize the following loss function
L_p(x, y, \hat{u}, \hat{x}_1, \hat{x}_2) = -\rho\, U(x, y) - (1 - \rho)\, L_a(\hat{x}_1, \hat{x}_2, \hat{u}, x_1, x_2, u),  (2.17)
where U is the composite utility metric defined in Eq. (2.12) and La is the adversary loss function defined
in Eq. (2.7) which is a differentiable version of the composite privacy metric and depends on the adversary
estimate error of the user ID and location.
Notice that as the adversary’s loss decreases (implying less privacy), the privatizer’s loss increases. ρ
quantifies the penalty on utility loss, as opposed to privacy loss. Utility losses have a large effect on the
privatizer when ρ → 1 and privacy losses have a large effect on the privatizer when ρ → 0.
We take an iterative approach to training the two neural networks. We first train the adversary, specifically, we fix the neural network (NN) weights of the privatizer and perturb the NN weights of the adversary
along the negative gradient of La for k epochs. We then train the privatizer, that is, we perturb the NN
weights of the privatizer along the negative gradient of Lp for k epochs, and so on and so forth. When
both have converged, we have found the equilibrium of our minimax game. We then fix the weights of
both NNs during testing.
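The alternating training procedure can be summarized in the following sketch. It is illustrative and simplified: PyTorch is assumed, and adv_loss_fn and utility_fn are hypothetical callables implementing differentiable versions of Eqs. (2.7) and (2.12); this is not the exact training code used in our experiments.

```python
import torch

def train_gap(privatizer, adversary, loader, adv_loss_fn, utility_fn,
              rho=0.5, k_epochs=5, rounds=20, lr=1e-3):
    """Alternating minimax training sketch for the GAP privatizer."""
    opt_p = torch.optim.Adam(privatizer.parameters(), lr=lr)
    opt_a = torch.optim.Adam(adversary.parameters(), lr=lr)
    for _ in range(rounds):
        # Step 1: train the adversary for k epochs with the privatizer fixed.
        for _ in range(k_epochs):
            for x, labels in loader:
                y = privatizer(x).detach()             # privatizer weights frozen
                loss_a = adv_loss_fn(adversary(y), labels)
                opt_a.zero_grad(); loss_a.backward(); opt_a.step()
        # Step 2: train the privatizer for k epochs with the adversary fixed.
        for _ in range(k_epochs):
            for x, labels in loader:
                y = privatizer(x)
                loss_a = adv_loss_fn(adversary(y), labels)
                loss_p = -rho * utility_fn(x, y) - (1.0 - rho) * loss_a   # Eq. (2.17)
                opt_p.zero_grad(); loss_p.backward(); opt_p.step()
    return privatizer, adversary
```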
The GAP privatizer incorporates the privacy and utility metrics in its loss function Lp and trains against
an adversary with the same loss function La as the one used to evaluate privatizers. While it is advantageous to incorporate specific privacy metrics, for generality we evaluate the GAP privatizer’s performance
against other loss functions too, see Section 2.5.5.
2.4.4 Information-theoretic Privacy
For this approach, we consider the privacy-utility tradeoff in an analytically tractable fashion under a
formal optimization framework.
Considering X ∈ X and Y ∈ Y as random variables describing our input and obfuscated data respectively, our IT privatizer tries to minimize the mutual information I(X; Y ), see Eq. (2.8), subject to a
constraint on utility. The privatizer is specified by the conditional probability distribution pY |X, the probability of releasing Y given input data X. Without the utility constraint, the optimal solution is to release
Y independent of X.
Formally, the problem becomes:
\min_{p_{Y|X}} \; I(X; Y)  (2.18a)
\text{s.t.} \;\; \sum_{y \in \mathcal{Y}} p_{Y|X}(y|x)\, U(x, y) \geq U_c, \;\; \forall x \in \mathcal{X}  (2.18b)
p_{Y|X}(y|x) \geq 0, \;\; \forall x \in \mathcal{X}, \forall y \in \mathcal{Y}  (2.18c)
\sum_{y \in \mathcal{Y}} p_{Y|X}(y|x) = 1, \;\; \forall x \in \mathcal{X},  (2.18d)
where (2.18b) is a constraint on the composite utility U(x, y) defined in Equation (2.12), and constraints
(2.18c) and (2.18d) ensure that pY |X is a valid probability distribution.
We approach this constrained minimization problem by rewriting it as a Lagrange function whose
optimal point is a global minimum over the domain of the choice variables and a global maximum over
the Karush-Kuhn-Tucker (KKT) multipliers [36]. We analyze the KKT conditions below to derive key
observations on the optimal solution:
p_X(x) \log\!\left(\frac{p^*_{Y|X}(y|x)}{p_Y(y)}\right) - \mu^*_1 U(x, y) - \mu^*_2 + \lambda = 0  (2.19a)
\mu^*_1 \left(U_c - \sum_{y \in \mathcal{Y}} p^*_{Y|X}(y|x)\, U(x, y)\right) = 0, \;\; \forall x \in \mathcal{X}  (2.19b)
\mu^*_2\, p^*_{Y|X}(y|x) = 0, \;\; \forall x \in \mathcal{X}, \forall y \in \mathcal{Y}  (2.19c)
\mu^*_1, \mu^*_2 \geq 0.  (2.19d)
Solving this for the optimal conditional probability distribution, we see
p^*_{Y|X}(y|x) = p^*_Y(y) \exp\!\left(\frac{\mu^*_1 U(x, y) + \mu^*_2 - \lambda^*}{p_X(x)}\right).  (2.20)
We take the sum of both sides,
\sum_{y \in \mathcal{Y}} p^*_Y(y) \exp\!\left(\frac{\mu^*_1 U(x, y) + \mu^*_2 - \lambda^*}{p_X(x)}\right) = 1.  (2.21)
We then manipulate this to get an expression in terms of λ*, which we substitute back into Equation (2.20) to get the following:
p^*_{Y|X}(y|x) = \frac{1}{\eta}\, p^*_Y(y) \exp\!\left(\frac{\mu^*_1 U(x, y)}{p_X(x)}\right),  (2.22)
where η is a normalization term over y ∈ 𝒴. From this formal treatment, and reminiscent of our previous work [37, 38], we derive two important characteristics of the optimal solution: (i) p_{Y|X} should exponentially increase with utility, and (ii) p_{Y|X} should linearly increase with p_Y, the probability that y is reported for any x, i.e. we should reuse released datasets to the extent practical.
Given the above qualities of an optimal solution, we design the following heuristic approach. We use
the pilot dataset to empirically determine the distribution pX using multi-variate Gaussian kernel density
estimation. We then sample from this distribution Ns times to create a “codebook" which approximates
the sample space Y. Limiting Ns allows us to reuse released datasets, as mentioned above.
The weight of each “code" or possible y value is given by
w(y) = \exp(\mu^*_1 U(x, y)),  (2.23)
where µ1* is our KKT multiplier. Given an input data x, our information-theoretic privatizer selects a y from the codebook with probability w(y)/∑_{y ∈ codebook} w(y). This ensures the likelihood of reporting a y increases exponentially with utility. As µ1* increases, the IT privatizer offers higher utility but lower privacy. By contrast, as µ1* approaches zero, the IT privatizer achieves lower utility but higher privacy.
[Figure: four panels plotting privacy and utility versus each privatizer's parameter (σ for Noise, ϵ for GLDP, ρ for GAP, µ1* for IT); legend: Noise, GLDP, GAP, IT.]
(a) P1: user ID estimate error rate.
(b) P2: user location estimate error.
(c) U1: −distortion.
(d) U2: −generated map error.
Figure 2.6: Privacy and utility of different privatizers. Note that Noise, GLDP, GAP, and IT refer to the
Gaussian noise-adding, local Gaussian Mechanism DP, GAP, and the information-theoretic privatizers,
respectively.
In implementation, we use a codebook with size of 51. This codebook size was empirically determined
to be large enough that one or more codes would provide good utility, yet small enough that codes are
reused to the extent practical. Note that we bias the codebook by including a copy of the unobfuscated
data (i.e. 50 obfuscated codes + 1 unobfuscated code). This ensures at least one y has very high utility even
for relatively small codebooks. Also, to reduce computational complexity, we split the n measurements
into batches and for each batch x we select a batch y from the codebook.
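Putting the pieces together, the IT privatizer reduces to sampling from a weighted codebook. The sketch below is illustrative, with hypothetical names; it uses scipy's Gaussian kernel density estimate for pX, biases the codebook with one unobfuscated copy of the input batch as described above, and takes the composite utility U(x, y) as a callable.

```python
import numpy as np
from scipy.stats import gaussian_kde

def build_codebook(pilot_data, batch, n_codes=50):
    """Fit a Gaussian KDE to the pilot data, draw n_codes candidate batches,
    and append one unobfuscated copy of the input batch (codebook size n_codes + 1)."""
    kde = gaussian_kde(pilot_data.T)             # pilot_data: (n_pilot, m)
    n, m = batch.shape
    codes = [kde.resample(n).T for _ in range(n_codes)]   # each code: (n, m)
    codes.append(batch.copy())                   # bias: include the true batch
    return codes

def it_privatizer(batch, codebook, utility_fn, mu1, rng=None):
    """Select a code y with probability proportional to exp(mu1 * U(x, y)), cf. Eq. (2.23)."""
    rng = np.random.default_rng() if rng is None else rng
    logw = np.array([mu1 * utility_fn(batch, y) for y in codebook])
    w = np.exp(logw - logw.max())                # stabilize before normalizing
    probs = w / w.sum()
    return codebook[rng.choice(len(codebook), p=probs)]
```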
2.5 Performance Evaluation
In this section we compare the performance of the privatizers against different adversaries when users
upload a single or a batch of measurements, and evaluate where they sit in the privacy-utility design trade
space. All performance comparisons in this section are under the typical threat model (bounded adversary).
We use the three real-world traces introduced in Section 2.2.1 in our evaluation. Unless otherwise stated,
the default trace is the Chania dataset.
2.5.1 Comparison of Privatizers
Consider the scenario where users upload a single measurement at a time. Figure 2.6a/Figure 2.6b show the
adversary's user ID estimate error/location estimate error respectively against each privatizer (its privacy),
and Figure 2.6c/Figure 2.6d show the distortion/generated map error of each privatizer (its utility). The x-axis in these and the following plots represents the parameterization of each privatizer, i.e. σ, ϵ, ρ, and µ1*.
As expected, for the noise privatizer, as σ increases from 0 to 1 the adversary’s user ID and location
estimate errors increase, demonstrating higher privacy (larger P1 and P2). At the same time, both the
distortion and generated map errors increase, demonstrating lower utility (smaller U1 and U2). For the
GLDP/GAP/IT privatizers, decreasing ϵ/ρ/µ1* leads to higher privacy (i.e. an increase in the adversary's user ID and location estimate error rate) and lower utility (i.e. an increase in distortion and generated map error).
Among these privatizers, the GLDP privatizer consistently achieves high privacy for typical values of
ϵ. Specifically, against the GLDP privatizer with 1 ≤ ϵ ≤ 10, the adversary’s user ID estimation error
is around 70% and the adversary’s location estimate error is close to 1. These numbers can be explained
as follows. In the absence of any intelligible patterns due to obfuscation, the adversary learns to assume
all measurements came from the geographic center of the dataset, thus its error is on the same order
as the spread of input data, i.e. roughly 1. Both the IT and GAP privatizers can approach this privacy
performance as µ1* and ρ get close to 0.1 or smaller. As for the user ID estimation error, the user with the
most measurements contributes roughly 30% of them, thus a simple adversary assigning this user’s ID to
all measurements would have 70% user ID error, hence this can be considered as the upper bound of P1.
With respect to the utility, the GLDP privatizer offers the worst performance. The GAP privatizer
outperforms the others for ρ in the range [0.0,0.4] (i.e. high privacy region), while the IT privatizer achieves
the best utility for µ1* in the range [0.4,1.0] (i.e. low privacy region). As it will become clear in the next couple of sections, a major reason why GLDP has the best privacy and worst utility is that for the range of ϵ values considered, it distorts the data to a larger extent than the rest of the approaches. We discuss the differences between the four privatizers in more detail in Section 2.5.3.
(a) Without obfuscation.
(b) Via Noise privatizer.
(c) GAP privatizer.
(d) IT privatizer.
Figure 2.7: Visualization of obfuscated measurements generated by different privatizers when P = 1.0.
Note that each cell in these figures represents a geographical location. The color of each cell represents
the average signal strength value in this location. Lighter color represents higher signal strength value.
Privatizer None Noise GAP IT
RMSE (dBm) 2.46 2.90 2.54 2.48
Table 2.5: RMSE (root mean square error) of RSS prediction model trained with obfuscated measurements
when P = 1.0. Note that we use the RSS prediction model trained with non-obfuscated measurements as
a baseline (privatizer is none).
Furthermore, we visualize the obfuscated measurements generated by different privatizers as heat
maps in Figure 2.7. Note that the x-axis and y-axis of each heat map represent longitude and latitude
respectively, and the color of each cell in these heat maps represents the signal strength value. Compared
with the heat map generated by Noise privatizer, the heat maps generated by GAP and IT privatizers are
more similar to the heat map without obfuscation under the same privacy level, indicating that GAP and
IT privatizers inject less distortion into the measurement data during obfuscation.
Lastly, we report the root mean square error (RMSE) of the RSS prediction model trained with obfuscated measurements in Table 2.5. Note that the goal of the service provider is to train an RSS prediction model based on the obfuscated measurements uploaded by users. The more accurate the RSS prediction model is, the higher the utility the obfuscation scheme can provide. As illustrated in Table 2.5, under the same privacy level, the RSS prediction model trained with measurements obfuscated by the IT privatizer achieves the lowest RMSE, which is close to the RMSE of the RSS prediction model trained with non-obfuscated measurements.
[Figure: three panels; y-axis: user ID estimate error (%); curves for seq_len = 1, 5, 10, 20.]
(a) Only adversary leverages sequences.
(b) Only privatizer leverages sequences.
(c) Both leverage sequences.
Figure 2.8: Effect of leveraging measurement sequences on adversary’s user ID estimate error.
[Figure: three panels; y-axis: user location estimate error; curves for seq_len = 1, 5, 10, 20.]
(a) Only adversary leverages sequences.
(b) Only privatizer leverages sequences.
(c) Both leverage sequences.
Figure 2.9: Effect of leveraging measurement sequences on adversary’s user location estimate error.
The RSS prediction model trained with measurements obfuscated by Noise privatizer achieves the highest
RMSE, indicating that the Noise privatizer provides the worst utility.
2.5.2 Leveraging Measurement Sequences
To directly investigate the effect of correlations and predictable patterns when considering mobile measurements as a time sequence, we consider an adversary which takes measurement sequences as input, i.e.
time sequences of lengths 1, 5, 10, and 20 which belong to a single user, and estimates the (common) user
ID and locations of all these measurements, taking advantage of correlations across data of the same user.
In practice, the adversary can do this when users upload measurements in batches.
The adversary we consider is trained via supervised learning with the final output of the converged
GAP privatizer. The GAP privatizer is a good choice to study sequences of data as it can be trained to
consider correlations of sequences and privatize batches of data in one shot as well.
Figure 2.8 demonstrates the effect of leveraging measurement sequences on adversary’s user ID estimate error, where three cases are considered: only the adversary, only the privatizer, and both of them
consider sequences of data. Specifically, Figure 2.8a shows results when only the adversary considers measurement sequences. We observe that the longer the sequence the better the adversary performance, as the
adversary achieves smaller error for the same data distortion. Figure 2.8b shows results when only the privatizer considers measurement sequences. We notice that the longer the sequence the better the privatizer
performance, as the privatizer forces the adversary to achieve higher error for the same data distortion.
Thus, sequences of measurements help both the adversary and the privatizer, which is expected in the
presence of inter-measurement correlations. That said, the trade-off in both cases above is the additional
computational and memory resources required to handle input sequences as opposed to single measurements. Lastly, Figure 2.8c shows results when both the adversary and privatizer consider sequences of the
same length. We observe that longer sequences result in better privacy, as the adversary’s user ID estimate
error increases.
Figure 2.9 demonstrates the effect of leveraging measurement sequences on adversary’s user location
estimate error. Similar to the results in Figure 2.8, we observe that when the adversary leverages longer
measurement sequences, it achieves a lower user location estimate error and hence degrades user privacy. However, when the privatizer leverages longer measurement sequences, higher privacy can be achieved, as reflected by an increase in the user location estimate error.
2.5.3 Analysis of Privacy-Utility Trade Space
Figure 2.10 and Figure 2.11 illustrate where each privatizer sits in the privacy-utility trade-offs space under
real-world datasets with different metrics. Specifically, in Figure 2.10, we consider the composite privacy
and utility metrics on Chania, UCI, and Radiocell datasets, where the x axis shows the composite privacy
P defined in Equation (2.5) with weights v1 = v2 = 1 and the y axis shows the composite utility U defined
[Figure 2.10: Privacy-utility trade-offs of different privatizers under the Chania, UCI, and Radiocell datasets with composite metrics. Panels: (a) P vs U on Chania dataset; (b) P vs U on UCI dataset; (c) P vs U on Radiocell dataset. Note that Noise, GLDP, GAP, and IT refer to the Gaussian noise-adding, local Gaussian Mechanism DP, GAP, and the information-theoretic privatizers, respectively. Note that P is the composite privacy metric defined in Eq. (2.5) and U is the composite utility metric defined in Eq. (2.12).]
[Figure 2.11: Privacy-utility trade-offs of different privatizers under the Chania dataset with non-composite metrics. Panels: (a) P1 and U1 trade-off; (b) P2 and U1 trade-off; (c) P1 and U2 trade-off; (d) P2 and U2 trade-off. Curves: Noise, GLDP, GAP, IT.]
in Equation (2.12) with weights w1 = w2 = 1. Note that we consider such composite privacy and utility
metrics since in practice a service provider may care about both P1 and P2 and about both U1 and U2.
In Figure 2.11, we further consider four different combinations of non-composite privacy and utility
metrics, and we use the Chania dataset as an example to illustrate how the privacy-utility trade-offs curves
of different privatizers change with different non-composite metrics.
In all these plots, the ideal privatizer should sit in the top right corner implying high privacy and
high utility. While the three traces are collected in different countries, areas, and years, the results are
qualitatively the same. From both plots we conclude that GAP and the IT privatizer outperform the Noise
and GLDP privatizers. It is important to remind the reader that the above comparison is under the typical
threat model where the adversary is bounded, whereas the GLDP privatizer is the only one that provides
privacy guarantees under the worst-case threat model. As discussed in detail in Section 2.2.2, we focus on
the typical threat model as it is more relevant to our context/application.
A major reason why GAP and the IT privatizer perform well is that they rely on the notion of context,
as we have already discussed in Section 2.2.4. The GAP privatizer gains some insight about the structure
of the dataset through data-driven learning. It also tries to minimize the difference between the true and
obfuscated data while achieving privacy, as encoded in its loss function. In summary, GAP uses P, U and
the data. The IT privatizer gains some insight about the structure of the dataset through Gaussian kernel
density estimation. It does well because it releases obfuscated datasets which inherently mirror the true
dataset’s structure, thanks to a constraint on utility. In summary, IT uses U and the data distribution. In
contrast to GAP and IT, GLDP only needs information about the data to perform clipping without hurting
utility too much (in our implementation we used the data directly for this purpose, see Eq. (2.16)), and
Noise only needs the variance of the data to normalize the amount of Gaussian noise that it adds.
Comparing GAP with IT, because GAP tries to prevent an adversary from estimating features of x
given y, this strategy can be thought of as a data-driven approach to what IT does, i.e. minimizing mutual
information. Yet while the IT strategy adds privacy by choosing y randomly (with appropriate weights),
the GAP privatizer maintains a model of a rational adversary which it intentionally tries to deceive. Training against an adversary with the same loss function as the adversary used to test the performance of
the privatizers might be perceived as unfair. To address this, in Section 2.5.5 we test privatizers against
adversaries with different loss functions.
The GLDP privacy-utility curve in Figure 2.10 shows values of ϵ up to 100. Note that this is
an order of magnitude greater than the values of ϵ shown in Figure 2.6c/Figure 2.6d and such high values
yield a very loose bound on Equation (2.9), yet we do so to show that the Noise and GLDP privatizers meet
when noise levels are similar. Note that values of ϵ ≤ 10 lie along the asymptotic behavior around the
P = 1.6/1.5/2.0 line for the Chania/UCI/Radiocell dataset, respectively.
[Figure 2.12: Choosing parameters under a constraint on distortion. X axis: distortion (−U1); curves show the parameter values of the Noise, GLDP, GAP, and MI (IT) privatizers.]
Finally, note that when we train the GAP privatizer and compute the codebook of the IT privatizer
to generate the results of Figure 2.11, we use the composite privacy and utility metrics to avoid retraining/recomputing them for each case. Interestingly, this doesn’t deteriorate their performance in a visible
manner. While real-world engineers could retrain/recompute the GAP/IT privatizers for the specific privacy and utility metrics they care about, in practice this may be cumbersome.
2.5.4 Constraining Distortion Levels
Previously we have considered privacy and utility as two components of our objective. Suppose instead we
wish to maximize privacy subject to a constraint on utility. In Figure 2.12 we re-frame previous results to
demonstrate choosing the appropriate parameters to meet a constraint on distortion (−U1), which can act
as an empirical measure of how different the obfuscated data is compared to the original data. Figure 2.12
presents a plot which can be interpreted as a continuous lookup table. For example, to meet the constraint
−U1 ≤ 3, we could choose σ = 0.2 or µ1* = 0.6 or ρ = 0.4. This plot also offers a sense of which range of
distortion each approach may achieve for its selected range of parameters.
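To make the lookup concrete, the short Python sketch below filters per-privatizer parameter-versus-distortion curves and returns the parameter values that satisfy a given distortion budget. The curve points are hypothetical placeholders (chosen only to be consistent with the example above), not the measured values behind Figure 2.12.

```python
import numpy as np

# Hypothetical (parameter, distortion = -U1) points for each privatizer;
# in practice these come from parameter sweeps like the one behind Figure 2.12.
curves = {
    "Noise (sigma)": ([0.01, 0.1, 0.2, 0.4, 0.7], [0.5, 1.5, 3.0, 6.0, 9.0]),
    "IT (mu1*)":     ([0.2, 0.4, 0.6, 0.8, 1.0],  [8.0, 5.0, 3.0, 1.5, 0.5]),
    "GAP (rho)":     ([0.0, 0.2, 0.4, 0.6, 0.9],  [0.5, 1.5, 3.0, 5.0, 8.0]),
}

def feasible_parameters(budget):
    """Return, per privatizer, the parameter values whose distortion <= budget."""
    out = {}
    for name, (params, distortion) in curves.items():
        params = np.asarray(params)
        distortion = np.asarray(distortion)
        out[name] = params[distortion <= budget].tolist()
    return out

print(feasible_parameters(budget=3.0))
```

In practice, one would then pick, among the feasible parameter values, the one that maximizes the privacy metric of interest.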
2.5.5 Performance Against Different Adversaries
So far we have tested each privatizer against an adversary trained on the obfuscated data generated by
the privatizer. We refer to this as the privatizer’s “own" adversary. What is more, the GAP privatizer is
Privatizer | Parameter | Utility U | Privacy P (Baseline) | Privacy P (Unobfuscated) | Privacy P (Aggregate) | Privacy P (Alternative, v1 = 0.8, v2 = 0.2) | Privacy P (Alternative, v1 = 0.2, v2 = 0.8)
Noise | σ = 0.2 | -2.5 | 0.13 | 0.30 | 0.80 | 0.13 | 0.14
GAP | ρ = 0.4 | -2.5 | 0.95 | 1.54 | 1.33 | 1.05 | 0.88
IT | µ1* = 0.6 | -2.5 | 0.70 | 1.26 | 1.18 | 0.68 | 0.70
Table 2.6: Evaluation results against common adversaries. Baseline reports the privacy of a privatizer against the adversary trained with its own obfuscated data. Unobfuscated is trained against unobfuscated data. Aggregate is trained against aggregate obfuscated data. Alternative is trained using a different loss function.
explicitly trained to beat its own adversary, and it would be informative to investigate its performance
against other adversaries.
Motivated by the above, we investigate how the four privatizers perform against the following three
adversaries: (i) “Unobfuscated" adversary which is trained with the unobfuscated data via supervised learning (rather than the obfuscated data that we have used so far), (ii) “Aggregate" adversary which has access
to all obfuscated data generated by all privatizers, and is trained with the aggregated obfuscated data, and
(iii) “Alternative" adversary trained with a different loss function than the one used so far, which has also
been used for training the GAP adversary inside the iterative GAP loop. Specifically, alternative adversaries
use different weights v1 and v2 in the loss function La.
Recall that for each privatizer, we have different parameter settings to trade privacy and utility. For
a fair comparison, we first set a target utility value and use for each privatizer its parameter value that
achieves this utility. Table 2.6 shows the corresponding parameter values for a composite target utility
of -2.5. This value is motivated by Table 2.3 and Figure 2.10, as the former shows the (negative) utility
of a random dataset and the latter shows the entire privacy-utility spectrum considered. (Notice that for
GLDP to achieve a -2.5 utility value it would use too large an ϵ value (> 100), thus we omit this line from
the table.) We report the privacy achieved by each privatizer against its own adversary (Baseline) and the
three adversaries introduced above.
Interestingly, the GAP privatizer outperforms all the other privatizers not only when privatizers are
positioned against their own adversaries (see also Section 2.5.1) but also against the other adversaries,
namely Unobfuscated, Aggregate, and Alternative. That said, the performance gap does reduce, which can
be explained by the fact that the GAP privatizer is trained against an adversary with a loss function which
is now different from that of the adversary used to test the privatizers.
As expected, all privatizers achieve the lowest privacy against their own adversary (baseline), since the
latter is trained with the obfuscated data of each privatizer. Also, all privatizers achieve the highest privacy
against the Unobfuscated adversary. This is also expected as the Unobfuscated adversary is trained using
unobfuscated data thus it is weaker than the others.
2.6 Limitations and Future work
Points of interest: The adversary we consider predicts user IDs and all locations from where measurements are collected. However, an adversary may be particularly interested in learning specific users’ points of
interest (POIs). For instance, the adversary may want to predict the target user’s home or work location.
We do not consider this in the paper since users can choose to not collect measurements around POIs as a
defense mechanism.
Side information: We assume the adversary has access only to the obfuscated user data shared with
the service provider, which does not contain user ID information. A stronger adversary might leverage
side information to estimate the user ID of each measurement. For example, the adversary might be able
to monitor the network connection between the service provider and mobile users, such that it knows
from which device each obfuscated measurement comes, and thus the user ID. This adversary may
then build a user whereabouts model. Since it is much harder for an adversary to have access to such
information than to merely access database updates, we do not consider this threat model.
Federated learning: Mobile crowdsourcing applications lend themselves to a federated learning implementation [39, 40], which can provide some privacy for mobile users. Recent works show that federated
learning could also leak user privacy [41, 42, 43, 44, 45, 46]. However, it would be a reasonable solution for the opt-in mobile users whose data is used to collect training data for the GAP privatizer and to estimate data distributions for the IT privatizer.
Another avenue for future work is to investigate how federated learning can be applied, with additional
privacy mechanisms, to achieve privacy-preserving training of an RSS predictor. For instance, one may
add noise to local model updates or carefully select the measurements used for local model training epochs,
to weaken data reconstruction attacks (see, for example, the DLG attack proposed in [47]).
2.7 Related Work
2.7.1 Privacy Mechanisms
Differential privacy (DP) [48, 17, 32] is a mathematically rigorous definition of privacy which is useful for
quantifying privacy loss and designing randomized algorithms with privacy guarantees. Motivated by statistical disclosure control, or providing accurate statistics while protecting individual survey respondents,
DP approaches the problem of releasing coarse-grained information while keeping fine-grained details
private. This popular approach to data privatization is studied under the local and global models [49].
The global model assumes a trusted data analyst has access to the dataset and wants to release queries
computed on it in a privacy-preserving fashion. The local model assumes the absence of a trusted server,
thus the data is randomized locally prior to aggregation. This work studies DP in local models, which is
referred to as local differential privacy (LDP).
Generative Adversarial Privacy (GAP) [19, 35] offers an alternative to noise-adding mechanisms in
that it is context-aware, meaning it takes the dataset statistics into account. GAP learns from the dataset
without the need to explicitly model the underlying distribution. Leveraging recent advancements in generative adversarial networks (GANs) [50, 51, 52], GAP allows the privatizer to learn obfuscation schemes
from the dataset itself. Like the generator and discriminator in a GAN, the privatizer and adversary optimize a mini-max game.
Information-theoretic (IT) privacy [16, 53, 54] provides an alternative in which privacy metrics are
motivated by concepts from information theory. For example, mutual information [55] is the measure
of how much one random variable tells us about another. Obfuscation schemes which minimize mutual
information intuitively provide privacy. Unlike DP which provides guarantees on worst-case privacy,
mutual information is an expectation, i.e. provides guarantees on average privacy.
2.7.2 Theoretical Studies of Privacy-Utility Trades
Previous work has analyzed distortion in the context of DP [56] or attempted to minimize the utility loss
incurred by DP [57]. Previous work in GAP maximizes privacy subject to a constraint on distortion [19].
Additionally, previous IT privacy metrics have been considered in the context of theoretically motivated
utility metrics [58]. In contrast, in this work we consider utility metrics beyond distortion which are
specific to our application and are both more intuitive and relevant for mobile network data. We also
formally compare the performance of context-free (Gaussian noise-adding, local DP) and context-aware
(GAP, IT) approaches in the context of our application.
Prior theoretical studies on the privacy-utility trade-offs include [16, 59, 60]. The authors of [16] formally define an analytical model for trading equivocation (a privacy metric based on Shannon entropy) and
distortion (a utility metric which could be Euclidean distance, Hamming distortion, Kullback-Leibler divergence, etc.). This model is designed for “universal” metrics, but is not generalized for non-i.i.d. datasets
or datasets lacking strong structural properties. A so-called geometric mechanism is presented in [60]
as a utility-maximizing alternative to the Laplace or Gaussian mechanisms typically used in differential
privacy, where utility is the expected loss of any symmetric, monotonic loss function. In [59], the authors
define a bound on the information-theoretic min-entropy leakage of ϵ-differential privacy, and a bound on
utility (where utility is roughly the number of differing dataset entries). Our work uniquely examines the
trade-offs for all of these approaches in the unifying context of a single application, allowing us to present
additional insight.
2.7.3 Prior work on mobile network data privacy
Previous work on privacy in mobile network data has considered strategic sampling, distribution modeling,
and noise addition as obfuscation strategies. In [15], the authors exploit compressive sensing techniques to
sample and compress received signal strength values in a privacy-preserving RSS map generation scheme.
While privacy is gained in sampling and compression, the authors of [15] do not take a formal approach
to quantifying privacy. In [61], distributed algorithms for Gaussian and exponential noise addition are
explored in a crowdsourced data setting. Local differential privacy is applied to the user-tracking problem
for indoor positioning systems in [62]. The authors of [63] present a relaxed version of differential privacy,
probabilistic DP, which accounts for certain worst-case privacy scenarios being highly unlikely. They apply
this to the generation of synthetic data which maps daily commutes. In [64], a novel privacy-preserving
incentive mechanism is proposed for mobile crowd sensing, where the authors employed DP to perturb
aggregated data. In each of [15, 61, 62, 63, 64], utility is not rigorously considered. Our work takes a formal
approach to both privacy and utility.
In recent years, researchers have grown interested in studying the privacy-utility trade-offs in mobile
network applications. Shokri et al. in [65] propose an optimal strategy against location attack based on
Stackelberg Bayesian game theory, which provides the best location privacy while satisfying the user’s service quality requirements. Bordenabe et al. in [66] formulate the trade-off optimization problem between
geo-indistinguishability and quality of service, and propose a method based on linear optimization to solve
this problem. Chen et al. in [67] design a differentially private obfuscation scheme based on reinforcement learning to optimize privacy budget allocation for each location in vehicle trajectory obfuscation,
which can balance geolocation obfuscation and semantic security and thus results in better privacy-utility
trade-offs. In [68], the authors design novel privacy and utility metrics for location privacy, and perform
large-scale evaluation and analysis of several existing location-privacy preserving mechanisms. In [69],
the authors provide a survey of DP-based obfuscation approaches for location privacy protection and compare the privacy-utility trade-offs performance of these approaches. However, these works only focus on
location privacy without considering other privacy metrics for mobile user data. Moreover, the proposed
approaches in [65] and [66] are based on linear programming and discrete locations which cannot be easily
applied under our threat models (continuous locations and a non-linear adversary). The mechanism proposed
by [67] does not formally optimize the privacy-utility trade-offs during trajectory obfuscation. [68] is an
empirical study with no formal analysis or obfuscation schemes that formally consider both privacy and
utility in their design, and [69] only surveys DP-based obfuscation approaches without considering other
obfuscation schemes.
In [70], the authors propose a novel framework, DP-star, for publishing trajectory data with differential
privacy guarantees, while preserving high utility. In [71] the authors present AdaTrace, a utility-aware
location trace synthesizer which provides a differential privacy guarantee and inference attack resilience.
This work is closely related to ours in that the authors employ both learning and noise-adding to generate
datasets which they evaluate for statistical utility, and analyze how the choice of privacy parameter affects
utility. In [72], the authors proposed a privacy-preserving and utility-aware mechanism based on mutual
information optimization, with application to the data uploading phase in participatory sensing. Zhang
et al. in [73] also propose an information theoretic approach based on mutual information optimization,
which protects the user’s location privacy while satisfying the user’s utility constraints when releasing
location aggregates. However, these works only consider the database-level threat model during dataset
publishing, which requires a trustworthy third party to distort data before release, and they cannot be
directly applied to the device-level obfuscation in our application.
In [74] and [75], GANs are leveraged to achieve utility-aware obfuscation of mobile sensor data. However, these works focus on obfuscating image sensor data to reduce sensitive information leakage in mobile
apps, where both the dataset structure and threat model are different from ours. Moreover, [75] does not
compare GANs with other formal obfuscation schemes, and [74] does not compare against obfuscation
schemes that formally consider both privacy and utility in their design.
While the advantages and disadvantages of a range of obfuscation methods are to some extent known
in principle from prior work, how they perform and compare in the signal map application is unclear. In this
work, we implement representative obfuscation schemes based on preeminent approaches, apply them
to the important real-world application of generating signal maps via crowdsourcing, and compare their
performance. Our performance results can serve as benchmarks, offering insights about how to design
real-world systems to generate accurate signal maps while protecting user privacy.
2.8 Conclusion
In this work, we have systematically examined the privacy-utility trade-offs which exists in crowdsourced
mobile network data obfuscation. We have considered four preeminent privatizers employing different
obfuscation strategies. To compare them, we have identified several privacy and utility metrics as well
as a number of adversaries under two different threat models suited to crowdsourced mobile network
data, and evaluated the privacy-utility trade-off performance of different privatizers on three diverse real-world mobile network datasets. The main takeaway is that under a typical threat model with a bounded
adversary, which is of more practical interest in the context of our application, incorporating the structure
and intended use of datasets in obfuscation can provide privacy gains without significant utility losses.
Chapter 3
Harpo: A Principled Obfuscation Approach for Subverting Online
Behavioral Advertising
Online behavioral advertising, and the associated tracking paraphernalia, poses a real privacy threat, since
it tracks web users’ browsing histories for user profiling and subsequent ad targeting. Unfortunately, existing privacy-enhancing tools are not always effective against online advertising and tracking. In this
chapter, we propose Harpo, a principled learning-based approach to subvert online behavioral advertising
through obfuscation. Harpo uses reinforcement learning to adaptively interleave real page visits with fake
pages to distort a tracker’s view of a user’s browsing profile. We evaluate Harpo against real-world user
profiling and ad targeting models used for online behavioral advertising. The results show that Harpo improves privacy by triggering more than 40% incorrect interest segments and 6× higher bid values. Harpo
outperforms existing obfuscation tools by as much as 16× for the same overhead. Harpo is also able to
achieve better stealthiness against adversarial detection than existing obfuscation tools. Harpo meaningfully
advances the state-of-the-art in leveraging obfuscation to subvert online behavioral advertising.
3.1 Introduction
Online behavioral advertising poses a real privacy threat due to its reliance on sophisticated and opaque
tracking techniques for user profiling and subsequent ad targeting [76, 77, 78, 79, 80, 81]. The tracking
information compiled by data brokers for the sake of online behavioral advertising is often outright creepy
and scarily detailed [82, 83, 84, 85]. Furthermore, the surveillance capitalism business model of the “free”
web naturally aligns with mass surveillance efforts by governments [86, 80, 87, 88, 89]. Finally, beyond
privacy, the targeting capabilities of online behavioral advertising are routinely abused for discrimination
[90, 91, 92] and manipulation [93, 94, 95, 96].
To address the privacy concerns of online behavioral advertising, some platforms now allow users to
opt in/out of tracking. Notably, iOS 14.5 introduced a new App Tracking Transparency feature that requires apps to get permission from users to track them for targeted advertising [97]. Unfortunately, the
vast majority of data brokers do not give users any meaningful choice about tracking. The privacy community has also developed privacy-enhancing tools to enable users to outright block online advertising and
tracking. These blocking tools, available as browser extensions such as uBlock Origin [98] and Ghostery
[99], are now used by millions of users. However, advertisers and trackers can often circumvent these
blocking tools, e.g., by evading blocking rules [100, 101, 102, 103, 104, 105, 106] or bypassing these tools
altogether [107, 108, 109]. Thus, blocking is not the silver bullet against online behavioral advertising.
The privacy community has recently started to leverage obfuscation to subvert online behavioral advertising without resorting to outright blocking or to complement blocking [110, 111]. Unfortunately,
existing privacy-enhancing obfuscation tools have limited effectiveness. For example, AdNauseam [112,
110] by Howe and Nissenbaum obfuscates a user’s browsing profile by randomly clicking on ads. As another example, TrackThis [111] by Mozilla obfuscates a user’s browsing profile by visiting a curated set of
URLs. The effectiveness of these (and other relevant approaches, e.g., [113, 114, 115], discussed in Section
3.7) is limited because they are not principled and also prone to adversarial detection.
We propose Harpo, a privacy-enhancing system that helps users obfuscate their browsing profiles to
subvert online behavioral advertising. To this end, Harpo interleaves real page visits in a user’s browsing
profile with fake pages. Unlike prior obfuscation tools, Harpo takes a principled learning-based approach
for effective obfuscation. More specifically, Harpo leverages black-box feedback from user profiling and
ad targeting models to optimize its obfuscation strategy. In addition, and equally importantly, Harpo’s
obfuscation is able to adapt to the user’s persona. This principled and adaptive approach helps Harpo
minimize its overhead by introducing the most effective fake page visit at each opportunity, and enhances
its stealthiness against adversarial detection.
At its core, Harpo leverages reinforcement learning (RL) to obfuscate a user’s browsing profile. Harpo
trains an RL-based obfuscation agent by analyzing a user’s browsing profile using an embedding and then
optimizing the reward by interacting with a black-box user profiling or ad targeting model. Harpo’s
trained RL agent is then used to introduce fake page visits into the user’s browsing profile at a budgeted
rate. A key challenge in designing Harpo is that the state space of the underlying Markov Decision Process
(MDP) is prohibitively large. We use a recurrent neural network (NN), together with a convolutional NN
as an encoder, and two fully connected NNs as decoders to alleviate the state space explosion of the MDP.
Another key challenge in implementing Harpo is that we have limited black-box access to real-world user
profiling and ad targeting models. We overcome this challenge by training surrogate user profiling and ad
targeting models and leveraging them to train the RL agent.
We evaluate Harpo against real-world user profiling and ad targeting models [116, 117]. We find that
Harpo is able to successfully mislead user profiling models by triggering more than 40% incorrect interest
segments among the obfuscated personas. We find that Harpo is able to successfully mislead ad targeting
models by triggering 6× higher bid values. We also find that Harpo outperforms existing obfuscation tools
by as much as 16× for the same overhead and by up to 13× with 2× less overhead. We also demonstrate
that Harpo achieves better stealthiness against adversarial detection than existing obfuscation tools.
We summarize our key contributions as follows:
• We propose Harpo, a principled RL-based approach to adaptively obfuscate a user’s browsing profile.
• We develop surrogate ML models to train Harpo’s RL agent with limited or no black-box access to
real-world user profiling and ad targeting models.
• We demonstrate the success of Harpo against real-world user profiling and ad targeting models in
terms of privacy, overhead, and stealthiness.
Paper Organization: The rest of the paper is organized as follows. Section 3.2 describes the threat model.
Section 3.3 presents the design and implementation of Harpo. We describe the experimental setup, including data collection and training process for Harpo and baselines, in Section 3.4. Section 3.5 presents
the evaluation results. We discuss ethical issues and limitations in Section 3.6. Section 3.7 summarizes
prior literature before concluding with Section 3.8.
3.2 Threat Model
The goal of the obfuscation system is to protect the privacy of a user against profiling and targeting models
of a tracker. To this end, the obfuscation system interleaves the user’s real page visits with fake pages to
distort the tracker’s view of the user’s browsing profile.
User. The user’s goal is to routinely browse the web while misleading the tracker so it cannot accurately
profile their interests and subsequently target ads. Users protect themselves by installing a modified user
agent (i.e., browser or browser extension) that obfuscates a user’s browsing profile by inserting fake page
visits into the user’s real page visits. The design goals for this obfuscation system are:
• it should be seamless in that it should not require any modifications to the user’s real page visits.
• it should be principled in that misleading the tracker’s profiling and targeting models is guaranteed.
• it should be adaptive to the real page visits so the fake page visits are not trivially always the same.
• it should be stealthy so that it is not possible for the tracker to detect obfuscation and discount fake
page visits.
• it should have low overhead to preserve user experience.
We start by assuming that the user has black-box access to the actual profiling and targeting models of
the tracker. To relax this assumption, we assume that the user can train a surrogate model that is different
from the actual model but can reasonably replicate its output.
Tracker. The tracker is typically a third-party that is included by first-party publishers to provide advertising and tracking services on their sites. We assume that the tracker is able to link the user’s different
page visits by using well-known cross-site tracking techniques such as cookies or browser fingerprinting.
We consider a strong threat model by assuming that the tracker has complete coverage of a user’s browsing
profile. While a tracker typically does not have complete coverage, prior literature has shown that some
trackers indeed have significant coverage of top sites and that even trackers with smaller individual coverage collaborate with each other to improve their coverage [118, 80, 119, 78]. The tracker is also assumed to
have substantial computational resources to train machine learning models on the user’s browsing profile
to effectively profile the user’s interests and target relevant ads [120].∗
We also assume that the tracker’s goal is to train machine learning models to profile and target arbitrary
users rather than a particular user with a known identity (e.g., email address, account identifier). In the
latter case, the tracker can trivially gather information by collaborating with a first-party publisher (e.g.,
social network or e-commerce site). We assume that it is not the case. Even when this assumption is
invalid, we contend that a privacy-conscious user would be able to leverage data deletion requests under
privacy regulations, such as GDPR [13] or CCPA [14], to remove their identity or information. In summary,
we assume that the tracker does not have the user’s non-obfuscated browsing profile to begin with.
∗Note that while tracking may take place via additional modalities like location, browsing profile based profiling and targeting
remains one of the most important modalities currently used in the online advertising ecosystem. See Section 3.6.2 for more
discussion.
3.3 Proposed Approach
In this section, we present the design and implementation of our proposed obfuscation approach called
Harpo.
3.3.1 Overview
Harpo inserts a fake page visit at random times, where the percentage of fake versus real user’s page
visits is a configurable system parameter. We refer to the corresponding URLs as obfuscation versus user
URLs, respectively. Every time a fake page is to be visited, Harpo needs to decide which URL to pick
as the obfuscation URL. The decision of Harpo depends on the user’s current browsing profile, which is
modeled as a random process because neither the user URLs nor the sequence of obfuscation and user URLs
are deterministic. Clearly, Harpo’s decisions impact the accuracy of the tracker’s profiling and targeting
models: the lower their accuracy, the better the effectiveness of Harpo.
We formulate the selection of obfuscation URLs as a Markov Decision Process (MDP) which selects
URLs to maximize the distortion of the tracker’s estimate of the user’s interests. This MDP is not analytically tractable because the exact mechanism that trackers use to create profiles is unknown. Moreover,
even if a good tracker model were available, the state space of this MDP is prohibitively large for it to be solved analytically. The obvious choice is hence to use Reinforcement Learning (RL) [121], which uses feedback
from the tracker to train the RL agent that then selects suitable obfuscation URLs.
Figure 3.1 illustrates Harpo’s workflow. Harpo starts by parsing the content from the pages of the
visited URLs, and featurizes it using an embedding. It then trains an RL agent that selects obfuscation
URLs to optimize its reward based on the extracted features. After training, the RL agent is used by a URL
agent that inserts obfuscation URLs, interleaving them with user URLs.
[Figure 3.1: Overview of Harpo’s workflow, showing the three modules (1. content feature extraction via document embedding; 2. RL agent, whose policy π(·|st) maps the user persona state st to an action at, i.e., the obfuscation URL p^o_{t+1}, based on the reward rt; 3. URL agent), together with the user and the tracker. Note that the ith URL, pi, can be a user or an obfuscation URL, denoted by p^u_i and p^o_i respectively; c^u_i / c^o_i represents the embedding vector of the ith real / fake page, respectively.]
3.3.2 System Preliminaries
User persona. We define a user persona simply as the set of visited URLs. We denote the user URL set, obfuscation URL set, and the full URL set by P^u, P^o, and P respectively, where P = P^u ∪ P^o. Since we are interested in the URL selection rather than the time interval between consecutive URLs, we focus on URL insertion times and work with time steps. We represent a user persona at time step t by Pt = [p1, · · · , pt], where pi represents the ith visited URL. At every time step i, 1 ≤ i ≤ t, we select an obfuscation URL with probability α, denoted by p^o_i ∈ P^o, or a user URL with probability 1 − α, denoted by p^u_i ∈ P^u, where α is a parameter that controls the percentage of obfuscation URLs. For each obfuscated persona there is a corresponding base persona without the obfuscation URLs, and we denote those personas by P^o_t and P^u_t respectively.
User profiling and ad targeting models. Many advertisers provide interest segments, such as “travel-europe", “pets-dogs", “health-dementia", inferred by their user profiling models for transparency [122, 123].
Furthermore, the bids placed by advertisers as part of the real time bidding (RTB) protocol are often visible
in the browser [124]. We leverage this black-box access to user profiling (i.e., interest segments) and ad
targeting models (i.e., bid values†
) to collect the data needed to train our own surrogate models. Specifically,
we extract the interest segments made available by the Oracle Data Cloud Registry [116] which gathers
data based primarily on cookies, and the bid values placed by advertisers in the Prebid.js header bidding
implementation [117, 125]. Note that Oracle is a well-established data broker that combines data from
more than 70 partners/trackers [126] and is accessible without needing a cumbersome sign-up process.
Furthermore, segments are added and removed regularly to reflect the user’s latest profile. To collect bids
from multiple bidders efficiently, we select three popular header-bidding-enabled sites (www.speedtest.net, www.kompas.com, and www.cnn.com), each of which contains multiple bidders.
Privacy Metrics. At a high level, we use as privacy metric the distortion in the tracker’s user profiling
or ad targeting model estimate, expressed as an accuracy loss.
For the user profiling model, we consider as distortion the addition of new interest segments and the
removal of existing interest segments, when comparing the user profile under the base persona and its
obfuscated version. To define loss metrics along these lines, we consider Ns interest segments in total
and define the interest segment vectors of P^o_t and P^u_t as X^o = [x^o_1, ..., x^o_{Ns}] and X^u = [x^u_1, ..., x^u_{Ns}] respectively, where x^j_i, i ∈ {1, . . . , Ns}, j ∈ {o, u}, are binary variables representing whether the ith
interest segment is triggered or not (1: triggered, 0: not triggered). Then, we define the following loss
for the tracker, which represents the percentage of segments of the obfuscated persona which were not
segments of the base persona:
L_1(X^o, X^u) = \frac{\sum_{i=1}^{N_s} \mathbf{1}\{x^o_i = 1,\ x^u_i = 0\}}{\sum_{i=1}^{N_s} x^o_i}. \qquad (3.1)
Here, the numerator is the number of incorrect segments (1_A = 1 if A is true and 0 otherwise) and the
denominator is the total number of segments of the obfuscated persona.‡
†The bid values are measured in cost per thousand impressions (also known as CPM).
‡Note that L1 is undefined if the obfuscated persona has no segments, which is a corner case of no practical relevance.
We also define a second loss metric for the tracker which aims to quantify the profile distortion:
L_2(X^o, X^u) = \sum_{i=1}^{N_s} x^o_i \oplus x^u_i. \qquad (3.2)
This loss metric equals the number of differing segments between the obfuscated (X^o) and base (X^u) profiles. It is maximized if all the interest segments of the base persona are removed from the profile and all the remaining segments are triggered, thus maximally distorting the profile.
For both L1 and L2, the more base persona segments are removed and the more obfuscated persona
segments are added, the higher their value. The difference is that L1 measures the portion of the triggered segments of the obfuscated persona that have no value for the tracker and equals 100% when all base persona segments are removed, while L2 reports the total number of differing segments and thus represents the profile distortion. Clearly, the higher the L1 and L2 values are, the less sensitive information is contained in the obfuscated profile and the higher the resulting privacy.
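To make these metrics concrete, the following Python sketch implements Eqs. (3.1) and (3.2) over binary segment vectors; the example vectors are toy values, not segments collected from the Oracle registry.

```python
import numpy as np

def l1_loss(x_obf, x_base):
    """Eq. (3.1): fraction of the obfuscated persona's triggered segments that are
    incorrect, i.e., triggered under obfuscation but absent from the base persona."""
    x_obf, x_base = np.asarray(x_obf), np.asarray(x_base)
    incorrect = np.sum((x_obf == 1) & (x_base == 0))
    total_obf = np.sum(x_obf)
    return incorrect / total_obf   # undefined if the obfuscated persona has no segments

def l2_loss(x_obf, x_base):
    """Eq. (3.2): number of segments that differ between the two profiles (XOR count)."""
    return int(np.sum(np.asarray(x_obf) != np.asarray(x_base)))

x_base = [1, 1, 0, 0, 0]   # segments triggered by the base persona
x_obf  = [1, 0, 1, 1, 0]   # segments triggered by the obfuscated persona
print(l1_loss(x_obf, x_base))  # 2/3: two of the three triggered segments are incorrect
print(l2_loss(x_obf, x_base))  # 3 differing segments
```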
For the ad targeting model, we consider as distortion the deviation of the bid values placed by a bidder
under the base persona and its obfuscated version. It is worth noting that bid values represent a bidder’s
confidence about whether a user’s interests match the targeted ad. By manipulating a user’s profile to
distort bid values, our goal is to make bidders place bids that are inconsistent with the user’s interests
(e.g., place a high bid value for an ad that is actually irrelevant to the user). This inconsistency means that
the bidder has incorrectly profiled the user’s interests and thus represents a better privacy outcome for
the user. To maximize the deviation one may attempt to either increase or decrease the bidding values by
appropriately selecting obfuscation URLs. We choose to attempt to increase the bid values because bidders
tend to place significantly more low bid values than high bid values, thus there is much more room to
distort low bid values.§
§We discuss ethical considerations regarding the potential infliction of economic pain to bidders by this in Section 3.6.
To define practical loss metrics along these lines, we first group the bid values into two classes. Suppose
the mean and variance of all the bid values we collect from a bidder are µ and σ respectively. Then, we
use µ + σ to split the bid values into low and high value bid classes. If a bid value is larger than µ + σ, we
classify it as high, else we classify it as low. Now, consider a total of Nb bidders bidding for ads based on
the current user browsing profile. We define v^j_i, i ∈ {1, . . . , Nb}, j ∈ {o, u}, as the bid value placed by bidder i for an obfuscated (j = o) or non-obfuscated (j = u) persona. We also define b^j_i = 1{v^j_i ≥ µi + σi} to indicate whether the bid value for bidder i is below (b^j_i = 0) or above (b^j_i = 1) the threshold µi + σi, where µi and σi are the mean and variance of bid values placed by bidder i. Then, we use as loss the increase of the proportion of high bids in the obfuscated persona as compared to the corresponding base persona, i.e.,

L_3(\{b^o_i, b^u_i\}_{i=1}^{N_b}) = \frac{1}{N_b} \sum_{i=1}^{N_b} (b^o_i - b^u_i). \qquad (3.3)
To directly quantify how much the bid values change, we also use the average ratio of bid values of an
obfuscated persona over its corresponding base persona and denote it by L4. Specifically,
L_4(\{v^o_i, v^u_i\}_{i=1}^{N_b}) = \frac{\sum_{i=1}^{N_b} v^o_i}{\sum_{i=1}^{N_b} v^u_i}. \qquad (3.4)
3.3.3 System Model
As discussed earlier, we formulate the selection of obfuscation URLs as an MDP. In a nutshell, MDP is a
standard framework for modeling decision making when outcomes are partly stochastic and partly under
the control of a decision maker. We describe in detail all components of the MDP below.
Obfuscation step. MDPs are discrete-time processes evolving in time steps. Recall that we assume time
evolves in steps every time a URL is visited. We refer to a time step as an obfuscation step, if the visited
URL at this time step is an obfuscation URL, and use obfuscation steps as the time steps of the MDP. In the
rest of the section we use t to denote obfuscation steps, and let Nt denote the total number of URLs (i.e.,
user and obfuscation URLs) visited by the persona under consideration up to obfuscation time step t.¶
State. MDPs transition between states. We define the state at obfuscation step t as st = [p1, · · · , pNt] ∈ S,
which consists of the visited URLs up to time step t, where S denotes the state space of the MDP. Note
that this state definition means the state space will grow indefinitely. Yet we do so because the retention
time of URLs by data brokers, including the Oracle Data Cloud Registry, are often in the order of 12 to
18 months [123]. Thus, we want to select an obfuscation URL based on the entire browsing profile of a
persona. While such a state space complicates analytical treatment, as we discuss later in Section 3.3.4, we
use a recurrent model as part of our RL model which allows us to handle this effectively.
Action. MDPs choose an action at each step, based on a policy. At obfuscation step t, the action at is the selection of an obfuscation URL p^o_{Nt+1} from the set P^o, which is the action space.
State transition. The transition between states of an MDP is dictated by a state transition function T(·|S, A) : S × A × S → R, which outputs the probability distribution of state st+1 given the previous state st and the action at. Note that state st+1 consists of all visited URLs up to step t (st), of the obfuscation URL p^o_{Nt+1}, and of the user URLs visited between obfuscation steps t and t + 1.
Reward. Every time there is an action, there is a reward associated with it. We use as reward of obfuscation
step t the difference of the loss of the tracker between this and the previous step. Specifically, let Lt denote
the loss of the tracker at obfuscation step t. Lt can be any of the privacy metrics defined above. Then, the
reward rt equals Lt − Lt−1.
To avoid repeatedly selecting a small set of high-reward obfuscation URLs, which may hurt stealthiness, we may use the following reward function, which penalizes the selection of the same URLs: rt = Lt − Lt−1 − δ · (N(p) − 1), where N(p) represents the number of times the obfuscation URL p has been
¶Note that to keep the notation in Figure 3.1 simple, we have used t to represent time steps corresponding to both user and
obfuscation URLs, in a slight abuse of notation.
[Figure 3.2: Neural network structures for the RL agent and the surrogate model. Both the RL agent and the surrogate model take the content features Ct (document embeddings output by the doc2vec model) of the latest w URLs in user persona Pt as input, and then utilize a CNN encoder to convert Ct into a feature vector ϕ^i_t that is the input of the decoders. The decoder of the RL agent is an LSTM followed by two FCNNs, representing the actor and critic networks, respectively. The decoder of the surrogate model is an FCNN with a Softmax activation function, which outputs the binary classification result used as the reward for the RL agent. Panels: (a) content feature extraction; (b) encoder for both RL agent and surrogate model; (c) decoders for RL agent and surrogate model.]
selected in the past and δ is a parameter controlling the diversity of selected URLs, see Section 3.5.4 for
related performance results.
Policy. We define the policy of the MDP as π(·|S) : S × A → R, where at ∼ π(·|st). That is, the policy is the probability distribution of the obfuscation URL selection for each state st. Specifically, let No = |P^o| be the total number of available obfuscation URLs. Then, π(·|st) is a multinomial distribution with parameter At = [a^1_t, · · · , a^{No}_t], where a^i_t is the probability of selecting the ith obfuscation URL, and \sum_{i=1}^{N_o} a^i_t = 1.
We design the policy π(·|st) with the objective of maximizing the accumulated expected reward over a
finite time horizon. In the RL implementation of the MDP, the finite time horizon equals the number of
obfuscation steps during training of the RL agent.
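Given the actor’s output At, selecting the next obfuscation URL amounts to sampling from this multinomial distribution, as in the small sketch below (the URLs and probabilities are illustrative).

```python
import numpy as np

def select_obfuscation_url(action_probs, obfuscation_urls, rng=np.random.default_rng()):
    """Sample an obfuscation URL from the policy's multinomial distribution A_t.
    action_probs: length-N_o vector whose entries a_t^i sum to 1."""
    idx = rng.choice(len(obfuscation_urls), p=action_probs)
    return obfuscation_urls[idx]

urls = ["news.example/a", "sports.example/b", "travel.example/c"]
print(select_obfuscation_url([0.2, 0.5, 0.3], urls))
```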
3.3.4 System Design
Harpo consists of 4 modules: (i) a content feature extraction module that converts text to a document
embedding, (ii) an RL agent which gets document embeddings as input and outputs obfuscation URLs,
(iii) a surrogate model, trained to replicate real-world user profiling and ad targeting models, which is
used for fast training of the RL agent, and (iv) a URL agent which inserts obfuscation URLs in the web
browsing profile of personas. We describe in detail each module below, and present an overview of Harpo’s
workflow in Figure 3.1.
Feature Extraction. To extract the features of a visited URL, we train a document embedding model
for Harpo, whose input is the textual content on the page of each visited URL pi and the output is the
document embedding ci ∈ R^d (a real vector with dimension d). More specifically, as demonstrated in
Figure 3.1, for each URL in our URL set, we first parse the text content from its rendered HTML page as
a text document. We then train a doc2vec embedding model [127] via unsupervised learning by utilizing
the extracted text documents of all URLs in P. Finally, Harpo uses the trained doc2vec model to map each
URL to an embedding, which represents the features of the page corresponding to the URL.
We only consider textual content during feature extraction for two reasons. First, text content is the
basis of HTML files and can convey the information in the web pages that is relevant for user profiling
and ad targeting models [128, 129]. Second, it is easier and faster to process text content than other types
of multimedia content. Moreover, since a page typically contains thousands of word tokens, we choose
to train a document embedding model instead of word embedding or sentence embedding models, so that
the dimension of embedding vectors can be reduced.
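A minimal sketch of this feature extraction step using gensim’s doc2vec implementation is shown below; the page texts and hyperparameters (e.g., vector_size=64) are placeholders and may differ from the ones used by Harpo.

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# page_texts: {url: extracted plain text of the rendered page} -- placeholder input.
page_texts = {
    "https://news.example/a": "latest world news and politics ...",
    "https://travel.example/b": "cheap flights to europe hotels ...",
}

# Train an unsupervised doc2vec model over the text documents of all URLs in P.
corpus = [TaggedDocument(words=text.lower().split(), tags=[url])
          for url, text in page_texts.items()]
model = Doc2Vec(corpus, vector_size=64, window=5, min_count=1, epochs=20)

# Map a (possibly unseen) page to its d-dimensional embedding c_i.
embedding = model.infer_vector("discount flights to paris".lower().split())
print(embedding.shape)  # (64,)
```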
RL Structure and Implementation. At a high level, the RL agent consists of a CNN (Convolutional
Neural Network) as an encoder, followed by an LSTM (Long-short Term Memory) neural network and
two FCNNs (Fully-Connected Neural Networks) as decoder, which represent actor and critic networks
respectively. The actor network will determine which obfuscation URL to select at each obfuscation step
based on the current state, while the critic network will estimate the future cumulative reward based on
the current state and the action chosen by the actor network.
Specifically, as illustrated in Figures 3.2b and 3.2c, the input of the CNN, Ct, consists of the document embeddings of the latest w URLs (Ct ∈ R^{w×d}) and the output of the CNN, ϕ^1_t, is an encoded real vector with m elements (ϕ^1_t ∈ R^m). ϕ^1_t is the input of the LSTM, which outputs a decoded real vector ϕ^2_t with n elements (ϕ^2_t ∈ R^n). ϕ^2_t is in turn the input of the actor and critic networks, which output the probability distribution of selecting each obfuscation URL, At ∈ R^{No} (recall there are No obfuscation URLs in total), and the estimate of the expectation of the future accumulated reward, Vt ∈ R (a real number),
respectively. We train the actor and critic networks via the A2C (Advantage Actor-Critic) algorithm [130], which is one of the most popular on-policy RL algorithms. Note that we select on-policy RL algorithms since they are more memory efficient and adaptive to the dynamic web environment as compared to off-policy RL algorithms.
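The sketch below outlines this encoder/decoder structure in PyTorch; the layer sizes (d, w, m, n, and the number of obfuscation URLs) are illustrative rather than the dissertation’s exact hyperparameters, and the A2C training loop is omitted.

```python
import torch
import torch.nn as nn

class HarpoRLAgent(nn.Module):
    """Illustrative actor-critic network: CNN encoder -> LSTM -> actor & critic heads."""

    def __init__(self, d=64, w=20, m=128, n=128, num_obf_urls=100):
        super().__init__()
        # CNN encoder over the (w x d) matrix of doc2vec embeddings C_t.
        self.encoder = nn.Sequential(
            nn.Conv1d(in_channels=d, out_channels=m, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),   # collapse the w dimension -> (batch, m, 1)
            nn.Flatten(),              # phi1_t in R^m
        )
        self.lstm = nn.LSTM(input_size=m, hidden_size=n, batch_first=True)
        self.actor = nn.Sequential(nn.Linear(n, num_obf_urls), nn.Softmax(dim=-1))
        self.critic = nn.Linear(n, 1)

    def forward(self, c_t, hidden=None):
        # c_t: (batch, w, d); Conv1d expects (batch, d, w)
        phi1 = self.encoder(c_t.transpose(1, 2))             # (batch, m)
        phi2, hidden = self.lstm(phi1.unsqueeze(1), hidden)   # (batch, 1, n)
        phi2 = phi2.squeeze(1)
        return self.actor(phi2), self.critic(phi2), hidden

agent = HarpoRLAgent()
probs, value, _ = agent(torch.randn(1, 20, 64))
print(probs.shape, value.shape)  # torch.Size([1, 100]) torch.Size([1, 1])
```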
We choose CNN as the encoder of the document embedding since it has fewer training parameters
compared with other Deep Neural Networks (DNNs) and prior works demonstrate its effectiveness on
text classification (e.g., [131]). Furthermore, we use an LSTM because it is a recurrent neural network
which allows us to maintain information about the whole browsing profile despite the input to the RL
agent being the w most recent pages only. Prior work has also used an LSTM when the MDP state is only
partially observable by an RL agent [132, 133]. Note that the RL agent’s input at each obfuscation step is
an embedding matrix consisting of a sequence of doc2vec embeddings. Adding a CNN before the LSTM
can extract the local features of the embedding matrix efficiently and reduce the input feature space of the
LSTM (from a 2D matrix to a vector) at each obfuscation step. Prior research [134] has also demonstrated
the effectiveness of combining CNN with LSTM.
Surrogate model. To train the RL agent, we would need ample access to a real-world user profiling or ad
targeting model. However, as outlined in the threat model, we may have limited or no access to the real-world user profiling or ad targeting models in practice. To address this issue, we propose to train surrogate
models that can reasonably replicate the output of real-world user profiling or ad targeting models. These
surrogate models are then used to train the RL agent. The surrogate models also help improve the efficiency
of RL agent training by providing a virtual environment, which is much faster than querying real-world
user profiling or ad targeting models.∥ Next, we describe in detail the surrogate models for user profiling
and ad targeting systems.
For the user profiling model, we train a separate model for each interest segment in the Oracle Data
Cloud Registry to predict whether this interest segment will be triggered by the most recent w URLs in
the web browsing profile of a persona. Note that we use the latest w URLs rather than the complete
web browsing profile, because it is hard to accurately train models with very long and variable length
inputs. More precisely, for a user persona with a browsing profile of Nt URLs at obfuscation step t, PNt = [p1, · · · , pNt], we extract the document embeddings of the latest w URLs, Ct = [c_{Nt−w+1}, ..., c_{Nt}], and feed
them as input into the model. The model, which we refer to as a segment predictor henceforth, outputs a
1 if the segment is expected to be triggered, and a 0 otherwise.
For the ad targeting model, as discussed already, we first group continuous bid values into a low- and
a high-bid class. Then, we train a binary classifier to predict the bid class and refer to this model as the
bid predictor. Similar to the segment predictor models, the bid predictor takes Ct as the input and outputs
either 0 (low bid class) or 1 (high bid class).
The detailed structure of surrogate models are demonstrated in Figures 3.2b and 3.2c, which consist of
a CNN and FCNN with Softmax activation. Specifically, the CNN has the same structure as that in the RL
agent, which takes Ct as input and outputs ϕ^1_t (see Section 3.3.4). The decoder, which is the FCNN, takes ϕ^1_t as input and outputs the binary classification value (0 or 1) of each surrogate model.
To train the bid and segment predictors, we start by randomly constructing a set of user personas.
Then, we collect training data (from the Oracle Registry for the user profiling model and from multiple bidders
for the ad targeting model) and use supervised learning, see Section 3.4 for more details.
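A minimal sketch of such a surrogate predictor and its supervised training loop is shown below; the data is random placeholder input, and cross-entropy over two logits is used as the idiomatic equivalent of the FCNN-with-Softmax binary classifier described above.

```python
import torch
import torch.nn as nn

class SurrogatePredictor(nn.Module):
    """Illustrative segment/bid predictor: CNN encoder (as in the RL agent) + FCNN classifier."""

    def __init__(self, d=64, m=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(d, m, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(m, 2)   # logits for classes {0, 1}

    def forward(self, c_t):                 # c_t: (batch, w, d)
        return self.classifier(self.encoder(c_t.transpose(1, 2)))

# Toy supervised training loop on placeholder data; real labels would come from
# the Oracle registry segments or the observed low/high bid classes.
model = SurrogatePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
features, labels = torch.randn(32, 20, 64), torch.randint(0, 2, (32,))
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    opt.step()
```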
∥While profile registries like the Oracle Data Cloud Registry are required by law to allow users access to their profiles, these
profiles may be updated every few days. Thus, it would take months to collect enough samples to train the RL agent solely by
accessing such registries.
URL Agent. The URL agent creates user personas consisting of both user and obfuscation URLs through
an i.i.d random process. At each time slot, with probability α the URL is an obfuscation URL selected by
Harpo and with probability 1−α it is a user URL, randomly picked from the user URL set. In practice, the
URL agent would not generate user and obfuscation URLs in discrete time slots. Instead, it would estimate
the arrival rate of user URLs, call it λ^u. Then, to target an obfuscation “budget" α, it would create a random process with arrival rate λ^o = λ^u · α / (1 − α) to specify the insertion times of obfuscation URLs. For example, a Poisson process with rate λ^o can be used for that purpose, or a non-homogeneous Poisson process can be used to adapt λ^o (more precisely, λ^o(t) in this case) to the current user behavior (i.e., if the user is not engaged in an active browsing session, very few or no obfuscation URLs would be inserted).
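The sketch below illustrates this scheduling logic with a homogeneous Poisson process; the user arrival rate and budget are example values.

```python
import numpy as np

def obfuscation_rate(lambda_u, alpha):
    """lambda_o = lambda_u * alpha / (1 - alpha), so a fraction alpha of all visits are fake."""
    return lambda_u * alpha / (1.0 - alpha)

def obfuscation_times(lambda_u, alpha, horizon_hours, rng=np.random.default_rng(0)):
    """Sample insertion times (in hours) from a homogeneous Poisson process of rate lambda_o."""
    lam = obfuscation_rate(lambda_u, alpha)
    times, t = [], 0.0
    while True:
        t += rng.exponential(1.0 / lam)   # exponential inter-arrival times
        if t > horizon_hours:
            return times
        times.append(t)

# E.g., a user visiting ~10 pages/hour with a 10% obfuscation budget.
print(obfuscation_times(lambda_u=10.0, alpha=0.1, horizon_hours=2.0))
```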
3.3.5 System Implementation
We implement Harpo as a browser extension. Its architecture has a passive monitoring component and
an active obfuscation component. The monitoring component uses a background script to access the webRequest API to inspect all HTTP requests and responses as well as a content script to parse the DOM and
extract innerHTML [135]. This capability allows us to implement the content extraction module, which is
responsible for computing document embedding for each visited page. The monitoring component sends
this information to the obfuscation component, which is responsible to implement the other 3 modules of
Harpo (RL agent, surrogate model, and URL agent). The RL agent and surrogate model modules run in
the background, the former to select an obfuscation URL that is visited by the URL agent module, and the
later to train the RL agent. To visit the obfuscation URL in the background so user experience is seamless,
we open the URL in a background tab that is hidden from the user’s view.∗∗ Note that our implementation
does not simply use AJAX to simulate clicks [110], it realistically loads pages by executing JavaScript and
rendering the page content. Harpo’s browser extension is implemented to minimize adverse impact on
∗∗In Section 3.6.2, we discuss how to prevent a tracker from using side-channels associated with background tabs to detect
Harpo.
user experience. We evaluate the system overhead of Harpo’s browser extension implementation later in
Section 3.5.3.
3.4 Experimental Setup
3.4.1 User Persona Model
We need to gather realistic web browsing profiles to experimentally evaluate Harpo and baselines. While
we could try to directly use real-world web browsing traces, this would pose two problems from a practical
standpoint. First, we need to restrict the total number of distinct URLs to a manageable number that we
can crawl in a reasonable amount of time. Second, it is preferable to train a model that can work for general
user types rather than for individual users. To address these problems, we first use real-world web browsing traces
to train a user persona model, and then use this model to generate a large number of web browsing profiles
from a manageable pool of distinct URLs and user types.
Specifically, we start with the AOL dataset [136] which consists of millions of distinct URLs and web
browsing profiles of millions of users.†† We then randomly sample users with more than 100 visited
URLs each, and leverage WhoisXMLAPI [137] to map each URL into one of the 16 IAB categories from
Alexa [138]. We observe that real web browsing profiles consist of URLs from a handful of preferred URL
categories. Motivated by this, we use a Markov Chain (MC) model to generate web browsing profiles as
follows: a MC state dictates the category from which a URL is selected. We assign a separate state to each of
the most popular categories, and a single state collectively to the rest of the categories. As the MC transits
from state to state, a URL is randomly selected from the URL category (or categories) that corresponds to
the current state.
††While the AOL dataset is somewhat dated, it is one of the largest publicly available datasets of real-world user browsing
profiles and captures well the browsing behavior of a large, diverse set of users.
To specify the model parameters, first we need to decide how many popular categories will have their
own state. We do so by assigning a separate state to categories whose URLs represent more than 10% of
the total URLs in the dataset. Figure 3.3a plots the percentage of URLs in a user’s web browsing profile
from the i-th most popular URL category for this user, averaged over all users. From the figure we conclude
that the 3 most popular categories satisfy our criteria. Thus, we set the total number of states of the MC
to 4, one for each of the 3 most popular categories and one collectively for the 13 remaining categories.
Next, we need to decide the order of the MC. In general, a higher order MC has the ability to model
longer-term correlations between states, as the transition probability from one state to another for a j-th order MC depends on the j most recent states. That said, the higher the order, the higher the complexity
of the MC, as the state space grows exponentially, see, for example, [139]. Following standard practice,
we use the autocorrelation function to measure the correlation in the AOL dataset and experiment with
different order MCs to identify the smallest order required for a good fit. Figure 3.3b shows that a 1st
order MC is enough to achieve a good fit. Last, given the order and number of states of the MC, we fit
the stationary distribution and transition probabilities of our MC model to the statistics of the dataset (see
Figure 3.4 for the final MC model).
In the rest of the paper, we use the aforementioned model to generate web browsing profiles for user
personas. Since the most popular categories are not necessarily the same for each user persona, we select
the 100 most common combinations of the 3 most popular URL categories from the AOL dataset and
define 100 user types. Then, every time we want to generate a web browsing profile for a user persona,
we randomly select one user type which sets the specific 3 most popular categories for this user, and use
the MC model to generate the user URLs as described above.
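For concreteness, the following sketch generates a synthetic web browsing profile by walking a 4-state, 1st-order MC as described above; the transition matrix values and the per-state URL pools shown here are placeholders rather than the fitted parameters of Figure 3.4.

```python
import random

# Placeholder 4x4 transition matrix: rows/columns are
# [1st popular category, 2nd popular, 3rd popular, remaining categories].
TRANSITIONS = [
    [0.50, 0.14, 0.09, 0.27],
    [0.26, 0.33, 0.14, 0.27],
    [0.10, 0.28, 0.35, 0.27],
    [0.25, 0.14, 0.10, 0.51],
]

def generate_persona(urls_by_state, num_urls, seed=None):
    """Generate a web browsing profile by walking the Markov chain.

    urls_by_state: list of 4 URL lists, one pool per MC state.
    """
    rng = random.Random(seed)
    state = rng.randrange(4)           # start from a random state
    profile = []
    for _ in range(num_urls):
        profile.append(rng.choice(urls_by_state[state]))
        state = rng.choices(range(4), weights=TRANSITIONS[state])[0]
    return profile
```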
(a) Percentage of URLs from the i-th most popular category. (b) Autocorrelation for different order MCs (AOL dataset vs. 1st-, 2nd-, and 3rd-order MC).
Figure 3.3: Selecting parameters of the MC model. Note that the autocorrelation with lag K measures the correlation between states that are K time steps apart.
Figure 3.4: The MC model and its state transition probability diagram for simulating user personas. States 1–3 correspond to the 1st, 2nd, and 3rd most popular URL categories, and State 4 collectively to the remaining URL categories.
3.4.2 Data Collection and Preparation
Persona URLs. The web browsing profiles of user personas consist of user and obfuscation URLs. User
URLs are generated by the user persona model described above. Note that for each of the 16 IAB categories,
we keep the 100 most popular URLs within each category as ranked by Alexa [138], thus there are a total
of 1600 user URLs to pick from every time a user URL is selected.
Obfuscation URL categories depend on the obfuscation scheme (Harpo or one of the baseline approaches described in Section 3.4.5) and we consider three different categories: TrackThis, AdNauseam,
and intent URL categories. The TrackThis category contains 400 obfuscation URLs from [111]. For the
AdNauseam category, we collect all the third-party URLs from the 1,600 pages corresponding to the 1,600
URLs of the 16 IAB categories we described above, and identify advertising URLs using EasyList [140]. In
total, we collect 2,000 advertising URLs. For the intent category, we randomly crawl 1,930 product URLs
through Google shopping (10 URLs for each one of the 193 shopping categories, which we will refer to as
intent URL subcategories henceforth).
Data collection for surrogate models. We construct 10,000 personas to collect data from real-world user
profiling and ad targeting models in order to train the surrogate models. The proportion of obfuscation
URLs, α, in each persona varies between 0 and 0.2.
Collecting data from real-world models is a costly operation. Thus, we determine the suitable length
of a persona based on the following analysis, keeping data collection efficiency in our mind. Let N be the
average number of URLs per persona, which we wish to determine. Let n be the fraction of personas for
which we are able to collect some feedback (i.e., the trackers return no feedback for the rest). 10,000 · n is the total number of personas we can collect feedback for and 10,000 · N is the total number of URLs
among all personas. We choose to select N such that we maximize n/N for the following reason: While
longer personas with more URLs will likely trigger more feedback, computational overheads (e.g., CPU and
memory) are also proportional to the total number of URLs. Thus, the most efficient choice is to maximize
the feedback we collect per URL, and n/N represents the number of personas with non-empty feedback
per URL. The above procedure yields a value of N equal to 20, and we use the MC model described above
to select the user URLs, and Harpo to select obfuscation URLs from the intent URL category, for a total of
20 URLs per user persona.
Using these personas we collect feedback from real-world user profiling and ad targeting models as
follows: For each persona, we start with a fresh browser profile in OpenWPM [79]. For ad targeting, we
access bidding sites to collect the triggered bids immediately after visiting the 20 URLs. For user profiling,
since we observe that it takes on average two days for triggered interest segments to appear in Oracle Data
Cloud Registry, we save the browser state after visiting the 20 URLs and reload it after 2 days to collect
the triggered interest segments. In total, we collect 184 different interest segments from the Oracle Data
Cloud Registry and bids placed by 10 different bidders on 16 different ad slots. Note that for each bidding
site, there could be multiple bidders that might place different bids for different ad slots.
Data preparation for surrogate models. We first clean the data by removing unrelated interest segments such as those related to geographic location or device type, and by removing zero bids. Then, for
each user persona for which we collected some feedback, we extract content features from the visited web
pages, concatenate the document embedding vectors of all visited URLs of the persona into an embedding
matrix, and use this matrix as the input to the surrogate models. We use a surrogate model for each interest segment, where the label is a binary variable with 1/0 representing that the user persona will/will not
trigger the segment, respectively. We also use a surrogate model for each bidder and ad slot pair, where
the label is a binary variable with 1/0 representing that the user may trigger a high/low bid, respectively.
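As a minimal sketch of how such an input/label pair may be assembled, assuming a trained doc2vec model with 300-dimensional vectors (e.g., gensim's Doc2Vec) and illustrative helper names:

```python
import numpy as np

def persona_embedding_matrix(doc2vec_model, page_tokens_per_url):
    """Stack a persona's per-page document embeddings into a (num_urls x 300) matrix.

    page_tokens_per_url: list of token lists, one per visited URL, in visit order.
    """
    vectors = [doc2vec_model.infer_vector(tokens) for tokens in page_tokens_per_url]
    return np.stack(vectors)            # shape: (num_urls, 300)

def make_training_example(doc2vec_model, page_tokens_per_url, segment_triggered):
    """Return (embedding matrix, binary label) for one surrogate-model dataset."""
    x = persona_embedding_matrix(doc2vec_model, page_tokens_per_url)
    y = 1 if segment_triggered else 0   # 1: segment (or high bid) triggered, 0: not triggered
    return x, y
```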
Section 3.4.3 discusses how we train the surrogate models using supervised learning. Let a dataset
refer to all the user persona embedding matrices and the associated labels collected. If the percentage of
labels with value 1 in a dataset is less than 5%, we remove it because it is likely not sufficient for training
surrogate models later. We end up with 121 interest segment datasets and 55 bid datasets for training a total of 176 surrogate models.
Data collection for RL agent. We construct 15,000 personas, 50 for each of 300 training rounds of the RL
agent, to train the RL agent. Each persona consists of 100 URLs. Recall that user URLs are selected using
the MC model from the IAB categories, and the obfuscation URLs are selected from the intent category
based on the actions generated by the RL agent. The first 20 URLs are selected randomly from the user
URL set for initialization. The remaining 80 URLs are either obfuscation URLs (with probability α) or user
URLs (with probability 1 − α). Thus, we have on average 80 · α obfuscation URLs per persona.
A word on the selection of the α value is in order. Clearly, the smaller the α the lower the overhead.
Also, one may conjecture that the smaller the α the higher the stealthiness. However, too small of an α
value may not yield enough obfuscation URLs to have a large impact. We start our evaluation by choosing
α = 0.1 for both user profiling and ad targeting. In Section 3.5.3 and Section 3.5.4 we study the impact of
α on obfuscation effectiveness, overhead and stealthiness.
System configuration. We use OpenWPM [79] to implement our crawling system in an automated and
scalable manner. The experiments are run on an AMD Ryzen Threadripper 3970X server with 32 CPUs, 128 GB of memory, and a 3.7 GHz clock speed.
3.4.3 Training and Testing
We report the neural network parameter values of the surrogate models and the RL agent in Table 3.1, and describe
the training and testing process of the surrogate models and the RL agent in this subsection.
Surrogate models. For each of the 176 surrogate models (121 interest segment and 55 bid models), we
utilize 80% of the data collected from the 10,000 personas for training and 20% for testing. We train each
model via stochastic gradient descent [141] with a batch size of 32 personas for 30 training rounds, where
all the training data are used once at each training round.
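A stripped-down version of this training loop, assuming the SurrogateClassifier sketch from Section 3.3.4 and a cross-entropy objective (the optimizer settings shown are illustrative), is given below.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_surrogate(model, embeddings, labels, rounds=30, batch_size=32, lr=1e-3):
    """Train one surrogate model with mini-batch stochastic gradient descent."""
    dataset = TensorDataset(torch.as_tensor(embeddings, dtype=torch.float32),
                            torch.as_tensor(labels, dtype=torch.long))
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()   # applied to the model's logits
    for _ in range(rounds):             # each round uses all training data once
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```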
RL agent. We train and test the RL agent with the data collected from the 15,000 personas. Specifically,
we train the RL agent using the surrogate models to collect the reward and run the training for 300 rounds.
We test the RL agent using surrogate models for 10 rounds. At each training or testing round we generate
a batch of 50 personas.
In addition to testing the RL agent using surrogate models, we also test it against real-world user
profiling and ad targeting models. To this end, we create 100 personas with 100 URLs each. For each
persona we start with a fresh browser profile in OpenWPM [79]. For ad targeting, we immediately collect
the triggered bid values as we visit the 100 URLs of the persona. For user profiling, we save the browser
state after visiting all 100 URLs of the persona, and wait for 2 days to access the Oracle Data Cloud Registry
and collect the triggered interest segments.
Parameter description                    Configuration
Dimension of document embedding          d = 300
Dimension of CNN input                   w × d = 20 × 300
Kernel size in CNN                       {i × i × 1 × 100}, i = 3, 4, 5
Dimension of encoder vector              m = 300
Dimension of decoder vector              n = 256
Dimension of actor's output              No = 193
Table 3.1: Parameter values of neural networks for RL agent and surrogate model in Harpo.
Recall that Harpo selects obfuscation URLs from the intent URL category which consists of 1930 URLs
(10 URLs from each of the 193 intent URL subcategories). For scalability reasons we wish to reduce the
number of possible decisions, and thus implement the RL agent to select one of the intent URL subcategories; Harpo then randomly selects one of the URLs within the selected subcategory. Note that
by construction, URLs within the same intent URL subcategory have similar content.
Training cost. We note that it takes about 2 minutes to train the surrogate model from scratch and less
than 1 minute to train the RL agent per round on our server. While the exact training time would vary
depending on the user’s system specifications, we do not expect it to take longer than a few minutes.
3.4.4 Accuracy of Surrogate Models
We study the accuracy of the surrogate models we trained for user profiling and ad targeting and report
the true positive rate (TPR) and false positive rate (FPR) metrics. Out of the 121 interest segment models
we select the 20 most accurate, and out of the 55 bid models we select the 10 most accurate. We then use
those models to train and evaluate Harpo in the context of user profiling and ad targeting.
User profiling. In general, the trained surrogate user profiling models have reasonable accuracy. As
reported in Table 3.2, the average FPR and TPR of the 20 most accurate surrogate user profiling models
are 3.92% and 96.57%, respectively. The FPRs of these 20 surrogate user profiling models range from 1.82%
Model type          User profiling    Ad targeting
Number of models    20                10
Dataset size        10,000            10,000
Positive data       7.04%             11.26%
Average FPR         3.92%             18.28%
Average TPR         96.57%            73.43%
Table 3.2: Accuracy of surrogate user profiling and ad targeting models. FPR and TPR denote false positive and true positive rates.
to 19.23% and the TPRs vary from 81.43% to 100.00%. Last, among the 20 datasets training the top 20
surrogate user profiling models, the percentage of data points with label value 1 (positive data, indicating
the segment is triggered) varies from 3.91% to 15.87%, with an average value of 7.04%.
Ad targeting. Compared with user profiling surrogate models, we observe that ad targeting surrogate
models are less accurate in general. This is expected since the bids placed by each bidder are likely affected by other auction dynamics, and their values have larger variance, making them more difficult to predict
accurately [124]. However, we still obtain 10 surrogate ad targeting models with good accuracy, which
achieve 18.28% FPR and 73.43% TPR on average as shown in Table 3.2. The FPRs of these 10 surrogate
models range from 12.37% to 16.93% and the TPRs vary from 70.27% to 78.35%. Last, among the 10 datasets
training the top 10 surrogate ad targeting models, the percentage of positive data (indicating a high bid is
triggered) is 11.26% on average, ranging from 8.55% to 15.38%.
3.4.5 Baselines
We compare the performance of Harpo against four other baseline approaches. Two of these approaches
(AdNauseam and TrackThis) have their own set of obfuscation URLs whereas the other two (Rand-intent
and Bias-intent) use different selection techniques on the set of obfuscation URLs used by Harpo; see Section 3.4.2
for more details on these sets of obfuscation URLs. The four approaches are as follows:
Figure 3.5: Overview of Harpo's evaluation process.
AdNauseam. Every time an obfuscation URL is needed, we uniformly randomly select one of the AdNauseam URLs.
TrackThis. Every time an obfuscation URL is needed, we uniformly randomly select one of the TrackThis
URLs.
Rand-intent. Every time an obfuscation URL is needed, we uniformly randomly select one of the 193
intent URL subcategories and pick one URL from this subcategory at random.
Bias-intent. Every time an obfuscation URL is needed, we randomly select one of the 193 intent URL subcategories with probability proportional to the average reward triggered by URLs in that subcategory, and pick one URL from this subcategory uniformly randomly (see the sketch at the end of this subsection).
It is noteworthy that the aforementioned AdNauseam and TrackThis baselines are not exactly the same
as the original implementations. The original AdNauseam implementation clicks on ad URLs that are on
the current page, making it hard to control the budget and diversity of obfuscation URLs. The original
TrackThis implementation opens 100 preset URLs. We try to adapt these approaches to our experimental
setup as best as possible. To this end, we first use the original implementations to generate the AdNauseam
and TrackThis URL sets as described in the URL set subsection, and then randomly select obfuscation URLs
from these sets with uniform probability, as already discussed.
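The two intent-based baselines reduce to simple sampling rules over the 193 intent URL subcategories. The following sketch illustrates them, where avg_reward is a hypothetical per-subcategory average reward estimated offline and is not a quantity exposed by our implementation.

```python
import random

def rand_intent(urls_by_subcategory, rng=random):
    """Rand-intent: pick a subcategory uniformly at random, then a URL within it."""
    subcat = rng.choice(list(urls_by_subcategory))
    return rng.choice(urls_by_subcategory[subcat])

def bias_intent(urls_by_subcategory, avg_reward, rng=random):
    """Bias-intent: pick a subcategory with probability proportional to its average reward."""
    subcats = list(urls_by_subcategory)
    weights = [avg_reward[s] for s in subcats]
    subcat = rng.choices(subcats, weights=weights)[0]
    return rng.choice(urls_by_subcategory[subcat])
```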
3.5 Evaluation
Figure 3.5 summarizes Harpo’s evaluation process. We first use the user persona model of Section 3.4.1 to
generate a large number of diverse web browsing profiles. Next, we use the 4 Harpo modules discussed in
Section 3.3.4 in the following order: (i) we use the doc2vec embedding model to extract features for pages
visited by each persona, (ii) we crawl data to train surrogate user profiling and ad targeting models and use
them to train the RL agent, (iii) we use the RL agent to select obfuscation URLs, and (iv) we use the URL
agent to create obfuscated personas. Then, we evaluate Harpo’s effectiveness in protecting user privacy
as compared to the baselines against real-world user profiling and ad targeting models. Finally, we analyze
Harpo’s performance from three key perspectives: overhead, stealthiness (using an adversarial detection
model introduced later in Section 3.5.4), and adaptiveness.
3.5.1 Privacy
Table 3.3 reports the effectiveness of Harpo and baselines in protecting user privacy against surrogate user
profiling (L1 and L2) and ad targeting (L3 and L4) models. Here, we only report the results for α = 0.1.
Section 3.5.3 reports the results for varying values of α. Note that the Control represents a persona that
does not deploy obfuscation.
User profiling. We note that Harpo outperforms all four baselines with respect to both L1 and L2 metrics.
Harpo triggers an average of 36.31% (L1) interest segments that were not present in the corresponding
Control persona. The obfuscated persona has on average 4.40 (L2) different interest segments from the
corresponding Control persona, where on average 4.17 are new segments and 0.23 are removed segments
from the Control persona.‡‡ While Rand-intent and Bias-intent fare much better than AdNauseam and
TrackThis, Harpo outperforms all of the baselines by at least 1.41× and up to 2.92× in terms of L1.
Similarly, Harpo outperforms all baselines by at least 1.46× and up to 4.00× in terms of L2.
Ad targeting. We again note that Harpo outperforms all four baseline approaches in terms of L3 metric.§§
Harpo increases high bids by 38.96% as compared to the Control persona. Harpo again outperforms
‡‡While the vast majority of different segments between the Control and obfuscated persona are newly added segments here,
when evaluating Harpo against real-world user profiling and ad targeting models 25% of different segments are due to removals,
see Section 3.5.2.
§§Note that we do not report results for L4 because the surrogate model can only predict whether the bid is high or low, and
not its actual value. We evaluate L4 in Section 3.5.2.
Approaches      L1        L2      L3        L4
Control         0.00%     0.00    0.00%     –
AdNauseam       12.42%    1.10    2.78%     –
TrackThis       17.76%    1.42    11.00%    –
Rand-intent     23.06%    1.75    14.84%    –
Bias-intent     25.71%    3.01    24.72%    –
Harpo           36.31%    4.40    38.96%    –
Table 3.3: Evaluation results with surrogate models w.r.t. L1 (percent of false segments in obfuscated persona), L2 (number of different segments between base and obfuscated persona), L3 (percentage increase of high bids in obfuscated persona), and L4 (average ratio of obfuscated persona over base persona bid values).
all of the baselines. Bias-intent is the most competitive baseline triggering 24.72% high bids on average.
However, Harpo is able to outperform it significantly by triggering 1.58× more high bids.
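For reference, the segment-based privacy metrics reduce to simple set computations over the interest segments of the base (Control) and obfuscated personas. A minimal sketch under that reading, with illustrative variable names, is given below.

```python
def privacy_metrics(base_segments, obf_segments):
    """Compute L1 and L2 from the interest-segment sets of a base and an obfuscated persona.

    L1: fraction of segments in the obfuscated profile that are false (not in the base profile).
    L2: number of segments that differ between the two profiles (added or removed).
    """
    base, obf = set(base_segments), set(obf_segments)
    false_segments = obf - base
    l1 = len(false_segments) / len(obf) if obf else 0.0
    l2 = len(base.symmetric_difference(obf))
    return l1, l2

# Example: base profile {A, B}, obfuscated profile {A, C, D}.
print(privacy_metrics({"A", "B"}, {"A", "C", "D"}))   # (0.666..., 3)
```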
3.5.2 Transferability
Next, we evaluate the effectiveness of Harpo and baselines against real-world user profiling and ad targeting models. To this end, we replace surrogate models with the real-world user profiling model by Oracle
Data Cloud Registry and ad targeting models of 10 different bidders. Table 3.4a reports the effectiveness of
Harpo and baselines against real-world user profiling (L1 and L2) and ad targeting (L3 and L4) models.
User profiling. We again note that Harpo outperforms all four baselines with respect to both L1 and L2
metrics, as shown in Table 3.4a. In fact, Harpo’s margin of improvement over baselines further increases
against real-world models as compared to surrogate models. Harpo now triggers an average of 43.24% (L1)
interest segments that were not present in the corresponding Control persona. The obfuscated persona
now has on average 5.22 (L2) different interest segments from the corresponding Control persona, where
3.89 on average are new interest segments and 1.33 are removed segments from the Control persona.
Approaches      L1        L2      L3        L4 (CPM)
Control         0.00%     0.00    0.00%     1.00 ($0.29)
AdNauseam       12.85%    1.53    2.70%     1.21 ($0.35)
TrackThis       32.67%    2.81    -1.50%    0.89 ($0.26)
Rand-intent     33.10%    3.18    8.40%     1.69 ($0.49)
Bias-intent     31.27%    3.19    10.30%    2.07 ($0.60)
Harpo           43.24%    5.22    43.30%    6.28 ($1.82)
(a) Effectiveness against real-world tracker models used in training, with synthetic user personas as inputs.
Approaches              L1        L2      L3        L4 (CPM)
Bias-intent with L1     –         –       3.60%     1.40 ($0.40)
Harpo with L1           –         –       10.20%    2.06 ($0.59)
Bias-intent with L2     –         –       9.60%     1.73 ($0.50)
Harpo with L2           –         –       10.10%    2.10 ($0.61)
Bias-intent with L3     24.70%    2.55    –         –
Harpo with L3           46.72%    2.97    –         –
(b) Effectiveness against real-world tracker models not used in training, with synthetic user personas as inputs.
Approaches      L1        L2      L3        L4 (CPM)
Control         0.00%     0.00    0.00%     1.00 ($0.09)
AdNauseam       5.50%     0.60    1.30%     1.27 ($0.15)
TrackThis       12.97%    1.58    0.00%     0.84 ($0.11)
Rand-intent     24.24%    2.24    0.90%     2.22 ($0.20)
Bias-intent     19.39%    1.75    16.00%    6.67 ($0.60)
Harpo           45.06%    5.14    49.10%    18.96 ($1.71)
(c) Effectiveness against real-world tracker models using real user personas from the AOL dataset.
Table 3.4: Transferability results w.r.t. L1 (percent of false segments in obfuscated profile), L2 (number of different segments between base and obfuscated profile), L3 (percentage increase of high bids in obfuscated profile), L4 (average ratio of obfuscated persona over base persona bid values), and CPM (cost per thousand impressions in dollars, the unit of bid values).
Harpo outperforms all baselines by at least 1.31× and up to 3.36× in terms of L1. Similarly, Harpo
outperforms all of the baselines by at least 1.64× and up to 3.41× in terms of L2.
Ad targeting. As reported in Table 3.4a, Harpo increases high bids by 43.30% (L3) and bid values by 6.28×
(L4) as compared to the Control persona. We again note that Harpo’s margin of improvement over baselines further increases against real-world models as compared to surrogate models. Harpo significantly
outperforms all baselines by up to 16.04× in terms of L3 and 7.06× in terms of L4. Bias-intent is again the
most competitive baseline, but it increases high bids by only 10.30% and bid values by only 2.07×. Harpo
is able to outperform it significantly by triggering 4.03× more high bids and 3.03× higher bid values.
Cross-validation against real-world tracker models. Our transferability analysis so far has demonstrated that Harpo's effectiveness against user profiling/ad targeting surrogate models transfers well to the corresponding real-world models. To further investigate Harpo's transferability
performance, we cross-validate Harpo by testing it against different real-world tracker models than those
used to train it.
Table 3.4b reports two types of results. In the first four rows, Harpo is trained with user profiling
models (w.r.t. L1 or L2) and tested against other models (e.g. against real-world ad targeting models, see
L3 and L4 results). In the last two rows, Harpo is trained with ad targeting models (w.r.t. L3) and tested
against real-world user profiling models (see L1 and L2 results). As expected, its effectiveness is somewhat
lower when it is tested against different models than the ones it was trained with, see Table 3.4a versus
3.4b results. That said, Harpo performs well regardless. For example, it increases the average bid values
by more than 2× when trained with user profiling models, and it creates obfuscated personas which have
on average 2.97 different interest segments from the corresponding Control persona when trained with
ad targeting models. When comparing cross validation results for Harpo and baselines (the table shows
results only for Bias-intent as for the rest of the baselines the results do not change from those in Table
3.4a), when trained with user profiling models Harpo outperforms baselines by up to 3.76× in terms of
L3 and 2.34× in terms of L4 (i.e. against real-world ad targeting models). Similarly, when trained with ad
targeting models, Harpo outperforms all of the baselines against real-world user profiling models, by at
least 1.17× and up to 2.79× in terms of L1 and L2 on average.
Evaluation using real user personas. We have thus far evaluated Harpo’s effectiveness using synthetic
user personas. Next, we evaluate Harpo’s effectiveness using real user personas. To this end, we randomly
sample 100 real-world user personas from the AOL dataset and use them as non-obfuscated personas. Then,
we use Harpo and baselines approaches to generate 100 obfuscated personas and evaluate the effectiveness
of obfuscation against real-world tracker models.
Table 3.4c shows that Harpo continues to significantly outperform all baselines. Specifically, Harpo
outperforms all baselines against real-world user profiling models by up to 8.19× and 8.57× in terms of
L1 and L2, respectively. Also, Harpo outperforms all baselines against real-world ad targeting models by
up to 54× and 22.57× in terms of L3 and L4, respectively. These results show that under real personas Harpo achieves comparable results in terms of L1 and L2 and better results in terms of L3 and L4 than under synthetic personas, confirming Harpo's transferability to real user
personas. Note that the CPM value for Control in Table 3.4c is lower than that for synthetic personas in
Table 3.4a yielding a large gap between the value of L4 under Table 3.4a and 3.4c, but the actual CPM value
for Harpo is comparable between the two tables.
In conclusion, our results demonstrate that Harpo’s performance transfers well to different real-world
tracker models encountered in the wild as well as to real user personas. The trends are largely consistent
across surrogate and real-world models and across synthetic and real user personas. In fact, the performance gap between Harpo and baselines widens in the real-world evaluation. It is worth mentioning that
real-world user profiling and ad targeting models may change over time. While our results here demonstrate that Harpo transfers well to real-world models, it might be prudent to update Harpo from time to
(a) User profiling loss w.r.t. L1. (b) User profiling loss w.r.t. L2. (c) Ad targeting loss w.r.t. L3. (d) Ad targeting loss w.r.t. L4.
Figure 3.6: Loss under different obfuscation budgets (α = 0.1 and α = 0.2) for the user profiling and ad targeting models. Note that the reported loss values (L1, L2, L3) are all against real-world user profiling and ad targeting models.
time to account for significant changes. We remark that Harpo’s RL agent is amenable to be updated in
an online fashion and can also leverage transfer learning techniques to avoid training from scratch.
3.5.3 Overhead
Obfuscation overhead. Our evaluation thus far has used the obfuscation budget of α = 0.1. Next, we
investigate the impact of varying the obfuscation budget, controlled by the parameter α, on the effectiveness of Harpo and baselines. Figure 3.6 plots the impact of varying α between 0.1 and 0.2 on real-world
user profiling and ad targeting models. While there is a general increase in the effectiveness for a larger
obfuscation budget, it is noteworthy that some baselines actually degrade when α is increased from 0.1 to
0.2. We note that Harpo’s effectiveness generally improves for the larger obfuscation budget and it continues to outperform the baselines. Harpo’s effectiveness improves by 1.33× for L1, 1.03× for L2, 1.41×
(a) Privacy and stealthiness trade-off w.r.t. L1. (b) Privacy and stealthiness trade-off w.r.t. L2. (c) Privacy and stealthiness trade-off w.r.t. L3.
Figure 3.7: Stealthiness evaluation results. Note that the α values from left to right of each curve in each figure are 0.2, 0.15, 0.1 and 0.05, respectively. The reported privacy values (L1, L2, L3) are against surrogate user profiling and ad targeting models.
for L3, and 1.23× for L4 when α is increased from 0.1 to 0.2. In fact, Harpo outperforms baselines even
with a lower obfuscation budget. Overall, Harpo at α = 0.1 outperforms baselines at α = 0.2 by at least
1.47× in terms of L2 on average and up to 13.27× in terms of L3 on average.
System overhead. We evaluate the system overhead of Harpo to assess its potential adverse impact
on user experience. We study Harpo’s system overhead in terms of resource consumption (CPU and
memory usage) and overall user experience (page load time). We launch a total of 300 browsing sessions
on a commodity Intel Core i7 laptop with 8 GB memory on a residential WiFi network, without Harpo as
control and with Harpo for α = 0.1 and 0.2. Each browsing session involved visiting a series of 20 pages,
with the next page being loaded as soon as the previous page finished loading. For each page visit during
the browsing session, we measure the CPU and memory consumption as well as the page load time. The
average percentage increase in CPU usage over control is 5.3% and 8.8% for α = 0.1 and 0.2, respectively.
The average percentage increase in memory usage over control is 3.9% and 4.0% for α = 0.1 and 0.2,
respectively. The average increase in page load time over control is 0.20 and 0.26 seconds for
α = 0.1 and 0.2, respectively. We conclude that increasing values of α has a modest impact on the CPU
and memory but a minimal impact on overall user experience. This is because Harpo’s browser extension
implementation (model and fake page visits) uses a separate background thread that does not directly
interrupt the browser’s main thread. Overall, we expect Harpo’s implementation to have negligible system
overheads on reasonably well-provisioned devices.
3.5.4 Stealthiness
Next, we introduce the notion of stealthiness to reason about potential countermeasures by the tracker
against Harpo. More specifically, we expect the tracker to try to detect the usage of Harpo using purpose-built ML models. We evaluate the stealthiness of Harpo and baselines as well as study the trade-off between stealthiness and obfuscation budget (α).
Adversarial detection. To build a supervised detection model, the tracker needs to gather training data
comprising both non-obfuscated and obfuscated browsing profiles. To this end, we assume a strong
adversary that has access to sufficient non-obfuscated browsing profiles as well as black-box access to
obfuscators (including Harpo) that can be used to gather obfuscated browsing profiles. To train the classification model, we assume that the tracker extracts embedding-based content features for all the URLs in the available positive- and negative-labeled browsing profiles. Thus, we assume that the tracker: (1) can
track all the URLs in a user’s browsing profile; (2) is able to extract content features for any URL that a
user may visit; and (3) has sufficient resources to gather training data and train an ML-based supervised
detection model. Based on these assumptions, we design a binary ML classifier that uses the doc2vec
embeddings as features of a user browsing profile and outputs a binary detection decision to indicate
whether or not a given persona is obfuscated by Harpo (or other obfuscators under consideration). We
gather a dataset of obfuscated and non-obfuscated personas containing a total of 20,000 URLs and use a
similar 80-20 split to train and test this detector. We then use the detection error as a metric to measure
stealthiness–obfuscation is more/less stealthy if the detection error is higher/lower.
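To make the detector concrete, the following is a minimal sketch of such a binary classifier using logistic regression over averaged doc2vec embeddings of a browsing profile; the choice of classifier and the feature aggregation are illustrative assumptions, since the tracker could use any supervised model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def profile_feature(url_embeddings):
    """Aggregate a profile's per-URL doc2vec embeddings (num_urls x 300) into one feature vector."""
    return np.mean(url_embeddings, axis=0)

def train_detector(profiles, labels):
    """profiles: list of (num_urls x 300) arrays; labels: 1 if obfuscated, 0 otherwise."""
    X = np.stack([profile_feature(p) for p in profiles])
    y = np.array(labels)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    detection_error = 1.0 - clf.score(X_te, y_te)   # stealthiness metric
    return clf, detection_error
```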
Privacy-stealthiness trade-off. We evaluate privacy and stealthiness of Harpo and baselines as we vary
α ∈ {0.05, 0.10, 0.15, 0.20} in Figure 3.7. We note that stealthiness generally degrades for larger values
of α. As also shown in Section 3.5.3, we again note that privacy generally improves for larger values of α.
Thus, we get the privacy-stealthiness trade-off curve as α is varied. This trade-off is intuitive as the higher
the obfuscation budget (α), the higher the privacy. Additionally, it should be easier for the detector to
identify the presence of obfuscation when α is higher, leading to lower stealthiness. It is noteworthy that
Harpo achieves the best privacy-stealthiness trade-off (towards the top right of Figure 3.7) as compared
to baselines. More specifically, for the same level of stealthiness, Harpo outperforms all baselines with
respect to various privacy metrics. Similarly, for the same level of privacy, it achieves better stealthiness
than baselines.
Harpo achieves both high privacy and stealthiness and is more stealthy than baselines because it
ensures that obfuscation URLs are varied by disincentivizing the selection of the same URLs and URL categories
thanks to the way we have designed the reward of the RL agent and the corresponding MDP (see Section
3.3.3). Note that by varying δ, the adjustable parameter in the reward function which controls the diversity
of URL selection, from 0.001 to 0.1, Harpo may achieve a range of privacy and stealthiness results. While
Bias-intent and Rand-intent are in the same ballpark as Harpo, we note that AdNauseam and TrackThis
by far achieve the worst privacy-stealthiness trade-off (towards the bottom left of Figure 3.7). AdNauseam
is not stealthy because it always selects an ad URL, which perhaps stands out to the obfuscation detector.
Note that this occurs in spite of the fact that, to account for real-world user behavior, we make sure
non-obfuscated personas include 5% of advertising URLs, thus there are ad URLs in both the original and
obfuscated profiles. Similarly, our TrackThis implementation randomly selects one of the obfuscation URLs
from a curated set.¶¶
We conclude that Harpo is able to achieve better privacy-stealthiness trade-off as compared to baselines. This is in part because Harpo is able to achieve better privacy for a given obfuscation budget due to
¶¶The original TrackThis implementation uses four fixed sets of curated obfuscation URLs, and selects one of them to inject all of its ≈ 100 URLs at the same time for obfuscation, which can be trivially detected.
(a) Adaptiveness of Rand-intent. (b) Adaptiveness of Bias-intent. (c) Adaptiveness of Harpo.
Figure 3.8: Adaptiveness of Harpo and two of the most competitive baselines (Rand-intent and Bias-intent) against ad targeting models. The color of each cell represents the normalized Euclidean distance between a pair of obfuscation URL category distributions. Warmer colors (red; higher values) represent superior adaptiveness.
its principled learning based approach. To further provide insights into Harpo, we next analyze obfuscation URLs selected by Harpo and two of the most competitive baselines (Bias-intent and Rand-intent).
3.5.5 Adaptiveness
Our analysis thus far has not looked at whether and how the obfuscation URLs selected by Harpo and
baselines adapt to different user personas. To study adaptiveness of obfuscation, we conduct a controlled
experiment using a sample of 20 different personas (see Section 3.4.1 for details). To quantify the differences in selection of obfuscation URLs across each pair of personas, we visualize the distance between
their distributions of obfuscation URL categories∗∗∗ in Figure 3.8. Rows and columns here represent different persona types, and each cell in the matrix represents the normalized Euclidean distance between the
corresponding pair of distributions of obfuscation URL categories. If an obfuscation approach is not adaptive to different personas, we expect the values in the matrix to be closer to 0. Figure 3.8 shows that Harpo
is clearly more adaptive across personas than our two most competitive baselines (Bias-intent and Rand-intent). The average adaptiveness of Bias-intent and Rand-intent is respectively 1.53× and 1.51× worse
than Harpo. Rand-intent and Bias-intent are less adaptive because they use a fixed distribution (uniform
or weighted) to select obfuscation URL categories. Overall, together with its principled nature, we believe
∗∗∗Recall from Section 3.4.3 that Harpo and the baselines randomly select obfuscation URLs from a pool of 193 URL subcategories.
that Harpo’s superior adaptiveness helps it achieve better privacy-stealthiness trade-off as compared to
baselines.
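The adaptiveness measurement itself is a simple pairwise computation. A minimal sketch is given below, assuming each persona is summarized by its empirical distribution over the 193 obfuscation URL subcategories; the normalization by √2, the largest possible Euclidean distance between two probability vectors, is our assumption.

```python
import numpy as np

def category_distribution(chosen_subcategories, num_subcategories=193):
    """Empirical distribution over obfuscation URL subcategories for one persona."""
    counts = np.bincount(chosen_subcategories, minlength=num_subcategories)
    return counts / counts.sum()

def adaptiveness_matrix(personas_subcategories):
    """Pairwise normalized Euclidean distances between personas' subcategory distributions."""
    dists = [category_distribution(p) for p in personas_subcategories]
    n = len(dists)
    matrix = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Divide by sqrt(2), the largest distance between two probability vectors.
            matrix[i, j] = np.linalg.norm(dists[i] - dists[j]) / np.sqrt(2)
    return matrix
```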
3.5.6 Personalization
A user may disallow Harpo from distorting certain segments. This may be because the user wants to
preserve a segment in his/her profile such that, for example, he/she may receive related ads [142]. Or, it
may be because the user does not want his/her profile to include a sensitive incorrect segment. Motivated
by this, we conducted an additional experiment where we trained Harpo to distort allowed segments while
preserving disallowed segments. Among the 20 considered interest segments, we select 15 as allowed and
5 as disallowed.
We denote the L2 distortion on allowed and disallowed segments by L2^allowed and L2^disallowed, respectively. Then, we train Harpo to maximize L2^allowed − w_d · L2^disallowed, i.e., to maximize the distortion in allowed segments while minimizing the distortion in disallowed segments. Note that w_d is an adjustable parameter controlling how aggressive Harpo is in distorting segments.††† As shown in Table 3.5, personalized Harpo triggers the same level of distortion for allowed segments as non-personalized Harpo, while preserving disallowed segments much better. Such personalized obfuscation provides Harpo users more fine-grained control over their user profiles and subsequent ad targeting.
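A minimal sketch of this personalized reward computation, assuming per-persona sets of distorted segments are available from the surrogate models (the function and variable names are illustrative), is given below.

```python
def personalized_reward(distorted_segments, allowed, disallowed, w_d=0.1):
    """Reward = distortion on allowed segments minus w_d times distortion on disallowed segments.

    distorted_segments: set of interest segments that differ between base and obfuscated persona.
    allowed / disallowed: sets of segments the user permits / forbids Harpo to distort.
    """
    l2_allowed = len(distorted_segments & allowed)
    l2_disallowed = len(distorted_segments & disallowed)
    return l2_allowed - w_d * l2_disallowed

# Example: 4 allowed-segment distortions and 1 disallowed-segment distortion.
print(personalized_reward({"s1", "s2", "s3", "s4", "s9"},
                          allowed={"s1", "s2", "s3", "s4", "s5"},
                          disallowed={"s9", "s10"}))       # 4 - 0.1*1 = 3.9
```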
Approach              L2^allowed    L2^disallowed
Harpo                 3.74          0.71
Personalized Harpo    4.06          0.12
Table 3.5: Personalization results. L2^allowed and L2^disallowed denote the distortion on allowed segments and disallowed segments, respectively. Note that Harpo is trained to maximize L2^allowed + L2^disallowed, while personalized Harpo is trained to maximize L2^allowed − w_d · L2^disallowed.
†††We set w_d to 0.1 for this experiment.
3.6 Discussion
3.6.1 Ethical Considerations
We make a case that potential benefits of Harpo to users outweigh potential harms to the online advertising ecosystem.
Benefits to users. We argue that Harpo meaningfully contributes to improving privacy for users who
have no other recourse. The web’s current business model at its core has been described as surveillance
capitalism [86, 143, 144]. The true extent of pervasive tracking and surveillance for the sake of online
targeted advertising is unknown to lay users, who cannot be reasonably expected to understand the details
buried in incomprehensible privacy policies [145, 146] or make informed choices due to deceptive practices [147]. A vast majority of online advertising platforms do not support privacy-by-design features or attempt to circumvent privacy-enhancing blocking tools. Thus, the practice of falsification in a privacy-enhancing obfuscation tool, such as Harpo, is ethically justified from the user's perspective [148].
Harms to the online advertising ecosystem. We argue that the potential harms of Harpo to the online advertising ecosystem are lower than those of existing obfuscation approaches or other alternatives. Harpo does introduce
additional costs/overheads for publishers and advertisers. Since Harpo reduces the effectiveness of user
profiling and ad targeting, advertisers may have to spend more on advertising to achieve the same level
of conversions. Publishers, in turn, may also notice a reduction in their advertising revenues. In the worst
case where behavioral targeting is completely ineffective, advertisers may have to resort to contextual
advertising that is reportedly about 52% less valuable [149, 150]. However, we note that obfuscation is more
viable as compared to other alternatives such as ad/tracker blocking, where advertisers and publishers
essentially lose all advertising revenue. Moreover, unlike AdNauseam, Harpo is designed to not explicitly
click on ads, thereby not engaging in overt ad fraud.
Thus, we argue that Harpo provides an ecologically viable (though less profitable) path for the online
advertising ecosystem while providing clear privacy benefits to users.
3.6.2 Limitations
We discuss some of the limitations of Harpo’s design, implementation, and evaluation.
Side-channels. There are side-channels that can be used to undermine Harpo’s stealthiness. Specifically,
since Harpo uses background tabs to load obfuscation URLs, an adversary can use the Page Visibility
API [151] or timing information via Performance API [152] to determine whether the tab is open in the
background and detect the use of Harpo. More generally, Harpo’s browser extension implementation
is susceptible to extension fingerprinting attacks [153, 154, 155, 156]. Harpo’s implementation can be
hardened against these attacks by patching the APIs that leak such information.
Tracking modalities. Trackers use multiple modalities (browsing profile, page interactions, location,
sensors, etc.) to profile user interests and subsequently target ads. While Harpo is currently designed to
only obfuscate users’ browsing profiles, it can be extended to obfuscate other modalities in the future.
User traces. We evaluated Harpo using both real user traces from the 15-year-old AOL dataset and
synthetic traces based on our user persona model. We acknowledge that the characteristics of the AOL
and synthetic traces might be different than those of current real users. A future line of research would be
to evaluate Harpo by recruiting real users.
3.7 Related Work
In this section, we contextualize our work with respect to prior literature on enhancing user privacy in
online behavioral advertising. Online advertising platforms typically do not allow users to meaningfully
opt in/out of tracking. The notable exception is Apple’s newly introduced App Tracking Transparency
feature that requires apps to get permission from users to track them for targeted advertising [97]. Unfortunately, a vast majority of data brokers do not give users any meaningful choice about tracking. Thus,
as we discuss next, the privacy community has developed a number of privacy-enhancing blocking and
obfuscation tools geared towards online behavioral advertising.
3.7.1 Privacy-Enhancing Blocking Tools
The privacy community has a long history of developing privacy-enhancing tools to counter online advertising and tracking through blocking. These blocking tools have seen widespread adoption, with hundreds of millions of users across uBlock Origin [98], AdBlock Plus [157], and Ghostery [99]. In fact, several
privacy-focused browsers such as Firefox [158] and Brave [159] now provide built-in blocking features. An
established line of research aims to improve the effectiveness of these blocking tools [160, 100, 161, 162,
163, 164, 165, 119]. In addition to blanket blocking of advertising and/or tracking, selective blocking tools
aim to give users control over the trade-off between privacy and utility. Tools such as MyTrackingChoices
[166] and TrackMeOrNot [167] enable users to block tracking of private interests while allowing tracking
of non-private interests. Thus, these selective blocking tools can help users still receive personalized ads
for non-private interests while protecting their private interests.
Unsurprisingly, advertisers and trackers consider blocking a threat to their business model. The ensuing arms race over the last few years has seen advertisers and trackers leveraging a myriad of ways
to circumvent blocking [100, 101, 103, 104]. First, blocking tools that rely on signatures (e.g., EasyList)
can be trivially evaded by simply modifying the signature (e.g., randomizing domain names or URL paths)
[103]. Second, new circumvention techniques to bypass blocking are often devised before eventually being patched [108, 107, 109]. Finally, prior work has demonstrated the non-stealthy nature of
most forms of ad blocking as they can be reliably detected by publishers, allowing them to retaliate using
anti-adblocking paywalls [168, 169]. Thus, blocking is not the silver bullet against online advertising and
tracking.
3.7.2 Privacy-Enhancing Obfuscation Tools
Closer to our focus in this work, the privacy community has also developed privacy-enhancing obfuscation
tools to counter online advertising and tracking [170]. We discuss prior obfuscation approaches in terms of
whether the obfuscation approach is: (1) adaptive to the user’s browsing profile, (2) principled in attacking
the tracker’s profiling/targeting model, (3) stealthy against detection and potential countermeasures by
the tracker, and (4) cognizant of obfuscation overheads.
In a seminal work, Howe and Nissenbaum [110] presented AdNauseam that combined blocking with
obfuscation to “protest” against online targeted advertising. The main aim is to protect user privacy by confusing user profiling and ad targeting systems used in online targeted advertising. To this end, AdNauseam
obfuscates a user’s browsing behavior by deliberately clicking on a controllable fraction of encountered
ads. While AdNauseam’s obfuscation approach is adaptive to the user’s browsing and allows control of
overheads, it is not principled and stealthy—it injects a random subset of ad URLs in a user’s browsing profile without any awareness of the user profiling or ad targeting model. In the same vein, Mozilla recently
launched TrackThis [111] to “throw off” advertisers and trackers by injecting a curated list of obfuscation URLs. TrackThis is more primitive than AdNauseam—it is further not adaptive or stealthy because it
injects a fixed set of curated obfuscation URLs that do not change across different user browsing profiles.
In an early work that does not specifically focus on online targeted advertising, Xing et al. [171]
proposed an attack to “pollute” a user’s browsing profile and impact first-party personalization on YouTube,
Google, and Amazon. Building on this work, Meng et al. [172] implemented and deployed this polluting
attack against online targeted advertising. Their obfuscation approach randomly injects curated URLs
that are likely to trigger re-targeting. In another attack on online targeted advertising, Kim et al. [173]
proposed to create fake browsing profiles to waste an advertiser’s budget on fake ad slots. While similar
to Meng et al. [172] in that they aim to trigger more expensive re-targeted ads, their attack does not
seek to enhance user privacy and is squarely focused on wasting the budget of advertisers. While these
obfuscating approaches were shown to impact ad targeting, they share the same limitations as TrackThis.
Degeling and Nierhoff [113] designed and evaluated an obfuscation approach to “trick” a real-world
user profiling system. While their obfuscation approach injects a curated set of obfuscation URLs, it is
principled because it relies on feedback from the advertiser’s user profiling model to select obfuscation
URLs. Their obfuscation approach was shown to induce incorrect interest segments in BlueKai’s user profiling model. While their obfuscation approach is principled and somewhat adaptive to a user’s browsing
profile, it is neither stealthy nor cognizant of obfuscation overheads.
In a related obfuscation-through-aggregation approach, Biega et al. [114] proposed to use a proxy to
interleave browsing profiles of multiple users to protect their privacy through “solidarity.” Their approach
mixes browsing profiles of different users based on the similarity between their browsing profiles. Their
approach is adaptive and stealthy because it tries to mix browsing profiles of similar users. However, it is
neither principled nor cognizant of obfuscation overheads.
Beigi et al. [115] proposed to use greedy search to suitably obfuscate a user’s browsing profile. Their
approach is adaptive and principled since it uses a greedy search approach that is essentially equivalent to
our Bias-intent baseline. However, it does not consider sequential dependencies [174] in a user’s browsing
profile or allow control over obfuscation overheads.
3.8 Conclusion
In this work, we present Harpo, a principled reinforcement learning-based obfuscation approach to subvert online targeted advertising. Harpo significantly outperforms existing obfuscation tools by as much
as 16× for the same overhead. Additionally, for the same level of privacy, Harpo provides better stealthiness against potential countermeasures. Thus, the privacy protections offered by Harpo are better suited
for the arms race than existing obfuscation tools. We hope that Harpo and follow-up research will lead
to a new class of obfuscation-driven, effective, practical, and long-lasting privacy protections against online behavioral advertising. To facilitate follow-up research, Harpo's source code is available at https://github.com/bitzj2015/Harpo-NDSS22. In the next chapter, we design a utility-preserving obfuscation approach based on Harpo for content platforms where user utility matters a lot.
Chapter 4
De-Harpo: A Utility-Preserving Obfuscation Approach for YouTube
Recommendations
In the last chapter, the proposed obfuscation approach Harpo primarily focuses on enhancing privacy, but at the same time it may degrade the utility of online services, since obfuscation introduces unrelated content sent by the service providers. When it comes to online behavioral advertising, web users may not be
interested in the utility brought by personalized ads. However, for online content platforms with billions
of user engagements such as YouTube, the utility of personalized content recommendations matters a lot
to web users. Therefore, in this chapter, we design and implement De-Harpo, an obfuscation approach for
YouTube’s recommendation system that not only obfuscates a user’s video watch history to protect privacy
but then also denoises the video recommendations by YouTube to preserve their utility. In contrast to prior
obfuscation approaches (e.g. Harpo in Chapter 3), De-Harpo adds a denoiser that makes use of a “secret”
input (i.e., a user’s actual watch history) as well as information that is also available to the adversarial
recommendation system (i.e., obfuscated watch history and corresponding “noisy" recommendations). Our
large-scale evaluation of De-Harpo shows that it outperforms the state-of-the-art by a factor of 2× in
terms of preserving utility for the same level of privacy, while maintaining stealthiness and robustness to
de-obfuscation.
4.1 Introduction
Online content platforms, such as YouTube, heavily rely on recommendation systems to optimize user
engagement on their platforms. For instance, 70% of the content watched on YouTube is recommended
by its algorithm [3]. These recommendation systems provide personalized content recommendations by
tracking and profiling user activity. For instance, YouTube tracks and profiles activities of its users on
YouTube as well as off of YouTube to this end [175]. This tracking and profiling enables these platforms to
predict relevant content that a user is likely to be interested in. On one hand, this tracking and profiling
enables desirable utility to users by providing relevant content recommendations. On the other hand, this
tracking and profiling poses a privacy issue because the platform might infer potentially sensitive user
interests.
Some platforms, including YouTube, allow users to remove a subset of the tracked activity (e.g., remove
a specific video from YouTube watch history) or even disable the use of certain profiled user interests (e.g.,
gambling) to influence the recommendations. However, these controls do not necessarily stop the platform
from tracking and profiling user activities in the first place. Thus, they may not provide much, if any,
privacy benefit to users. Moreover, the exercising of these controls would hurt the quality of personalized
recommendations. For example, if users employ these controls to curtail tracking or profiling then they
will likely not receive personalized recommendations they are actually interested in.
The research community is increasingly interested in developing privacy-enhancing obfuscation approaches that do not rely on cooperation from online content platforms [110, 176, 177, 178]. At a high level,
these privacy-enhancing approaches work by adding fake activity to real user activity to lessen the ability
of the recommendation system to infer sensitive information. However, the addition of fake activity for
the sake of obfuscation also ends up impacting the utility users might derive from the recommendation
system in terms of relevance of personalized recommendations. Prior obfuscation approaches attempt to
navigate the trade-off between privacy and utility, for example [178], by carefully adding fake activity so
as to obfuscate “private” interests but allow “non-private” interests.
In this work, we are interested in designing a privacy-enhancing and utility-preserving obfuscation
approach for recommendation systems. In contrast to prior approaches that are typically limited to only
obfuscating inputs to the recommendation system, our key idea is to design an obfuscation approach that
can obfuscate inputs to preserve user privacy but at the same time remove “noise” from outputs to preserve the utility of recommendations. Since an adversarial recommendation system might also attempt to
remove “noise”, it is crucial that the denoiser can only be used by the user and not by the recommendation system. To this end, our insight is that the denoiser uses a “secret” input (specifically, a user’s actual
browsing history), which is only available to the user and not the recommendation system. The recommendation system instead only has access to the obfuscated browsing history of the user. Therefore, by
leveraging the knowledge of a user’s actual browsing history, the denoiser allows the user to preserve the
recommendations related to the users’ actual interests while discarding the unrelated recommendations
caused by obfuscation.
We design and implement De-Harpo, an obfuscation approach for YouTube’s recommendation system
that not only obfuscates a user’s video watch history to protect privacy but then also denoises the video
recommendations by YouTube to preserve their utility. De-Harpo uses an obfuscator to inject obfuscation
videos into a user’s video watch history and a denoiser to remove recommended videos that are unrelated
to the user’s actual interests.
The obfuscator is an RL model trained to insert YouTube videos into a user's watch history so as to maximize the distortion in the interests inferred by YouTube. We address three key issues in
designing De-Harpo’s obfuscator, which is a non-trivial adaptation of Harpo [178] to YouTube. First,
we build a surrogate of YouTube’s recommendation system to efficiently train the RL model in a virtual
environment. Second, we design the surrogate model to predict the distribution of hundreds of different
classes of YouTube recommendation videos (we use the 154 affinity segments used by Google [179] as
our video classes) rather than the sheer number (order of hundreds of millions) of individual YouTube
videos. Lastly, the obfuscator selects obfuscation videos based on embedding similarity, which is scalable
to millions of obfuscation videos.
The denoiser is an ML model that is trained to reproduce the original recommendations that would have been received in the absence of the obfuscator. We address two key issues in designing De-Harpo's denoiser. First, the denoiser makes use of a "secret" input (i.e., a user's actual watch history) as well as information that is also available to the adversarial recommendation system (i.e., the obfuscated watch history and the corresponding "noisy" recommendations). As we show later, this design ensures that only De-Harpo is able to remove "noise" while the adversary is unable to de-obfuscate without prohibitive collateral damage. Second, we define new divergence-based metrics to measure privacy and utility for training the obfuscator and denoiser.
We deploy and evaluate De-Harpo’s effectiveness on YouTube using 10,000 sock puppet based personas, 10,000 Reddit user personas, and 936 real-world YouTube users [180]. Our evaluation shows that
De-Harpo’s obfuscator is able to degrade the quality of YouTube’s recommendations by up to 87.23% (privacy) and its denoiser is able to recover up to 90.40% of the actual recommendations (utility). We show that
De-Harpo outperforms the state-of-the-art by a factor of 2× in terms of improving utility for the same
level of privacy. Crucially, we also demonstrate that De-Harpo is stealthy and robust to de-obfuscation
by an adversarial system. Our evaluation shows that the adversary incurs a prohibitively large number of false positives (on the order of tens to hundreds of millions) when attempting to undermine stealthiness and achieve de-obfuscation.
4.2 Preliminaries
4.2.1 Problem Statement
Recommendation systems track users’ browsing activity to provide personalized recommendations. YouTube,
for example, tracks users’ browsing activity on YouTube (e.g., videos watched, channel subscriptions) as
well as off of YouTube (e.g., activity on other Google services such as Google Search and Google Analytics,
or web pages opened in Chrome browser) to personalize homepage and up-next video recommendations
[175]. Users can selectively remove certain videos from their YouTube watch history or clear their browsing activity altogether to influence personalized video recommendations. However, doing so does not
necessarily mean that their browsing activity is not tracked in the first place, and thus there is no material
privacy benefit to users. It will also hurt the quality of personalized recommendations because users will
likely not receive recommendations for videos they are interested in. In summary, users are unable to exert
meaningful control over recommendation systems to protect their privacy while preserving the utility of
personalized recommendations.
Prior work has proposed obfuscation approaches to protect user privacy in personalized recommendation systems without relying on cooperation from online content platforms. Existing approaches obfuscate
a user’s browsing history by injecting fake activity (e.g., webpage visits) to manipulate a user’s interest
segments and targeted ads in online behavioral advertising [178, 181]. These obfuscation approaches are
designed for recommendation systems (e.g., online behavioral advertising) where users are not necessarily
interested in consuming the output of the recommendation system, rather users are mainly interested in
subverting it. While these approaches aim to protect user privacy (e.g., inferred interest segments), they
do not consider the utility of recommendations (e.g., whether targeted ads are of interest to the user). In
contrast, in recommendation systems such as YouTube, these obfuscation tools would render YouTube's video recommendations useless to the user.
Figure 4.1: Problem Overview. (a) Without the obfuscation-denoising system, the user persona is sent to the platform and video recommendations are returned directly. (b) With the obfuscation-denoising system, the obfuscator turns the user persona into an obfuscated user persona, and the denoiser turns the resulting noisy video recommendations into denoised video recommendations.
Can we design privacy-enhancing obfuscation approaches that can enhance privacy of users and at the
same time preserve utility for users in recommendation systems? With this goal in mind, we propose to
build a denoiser to remove the “noisy" videos injected as part of obfuscation. It is crucial that the denoiser
can only be used by the user and not by the recommendation system. To this end, our insight is that
the denoiser uses a “secret” (specifically, the user’s actual browsing history), which is only available to
the user and not the recommendation system. Therefore, by leveraging the knowledge of a user’s actual
browsing history, the denoiser may preserve the recommendations related to the users’ actual interests
while discarding the unrelated recommendations caused by obfuscation. Figure 4.1 illustrates this idea
that we next operationalize in De-Harpo.
4.2.2 Threat Model
User. The user's goal is to routinely browse YouTube videos and receive high-quality recommended videos fitting their interests, while misleading the YouTube recommendation system such that it cannot accurately infer the user's interests. To achieve this goal, users install a local obfuscation-denoising system, which consists of an obfuscator and a denoiser. The obfuscator obfuscates their video watching history by injecting fake video watches into the user's real video watches, and the denoiser automatically removes "noisy" recommended videos from YouTube (i.e., those caused by obfuscation) that do not fit the user's interests. The obfuscation-denoising system is designed to satisfy the following properties:
• it is privacy-preserving in that the user’s interests are protected from being inferred by YouTube.
• it is utility-preserving in that the user can receive high-quality videos fitting their interests.
• it has low overhead in that the amount of obfuscation videos injected will not affect the user experience.
• it is stealthy in that it is impractical for YouTube to detect the usage of the obfuscation-denoising system.
• it is robust to deobfuscation in that it is impossible for YouTube to distinguish fake video watches
from real video watches.
• it can be personalized in that it can treat video classes differently based on user preferences.
Recommendation system. The goal of the recommendation system is to track user activity for personalized recommendations to maximize user engagement (e.g., click rate and watch time). We assume that
the recommendation system has full access to the user’s video watching history (including both fake and
real video watches though it does not know which is which) and it recommends videos based on the user’s
video watching history, which is true for YouTube [182] (unless the user deletes their watching history). We
further assume that the recommendation system does not have access to the user’s off-platform browsing
history (e.g., the user is not simultaneously signed-in to YouTube and other services by YouTube’s parent
company Google, the user employs Google account controls to prevent off-YouTube information linking
(if the user is signed-in to YouTube and other services by YouTube’s parent company Google) [183], or
the user uses a browser such as Safari [184] or Firefox [185] – or privacy-enhancing browser extension
[186] – that prevents cross-site tracking). We also assume that the recommendation system has substantial
computation resources to train a machine learning model for its recommendations. This assumption also
holds for YouTube [187]. Moreover, we assume that the recommendation system has access to De-Harpo
once it is public, such that it can use it to analyze the obfuscation approach and possibly train adversarial
detectors to detect and filter the usage of De-Harpo. More specifically, we assume that the recommendation system has a two-step detection workflow. In the first step, the adversary will train a classifier to
detect whether or not a user uses De-Harpo. Then, in the second step, if De-Harpo usage is detected, the
adversary further attempts to achieve deobfuscation by filtering out obfuscation videos and keeping the
remaining videos.
4.3 Proposed Approach
In this section, we present the proposed utility-preserving obfuscation approach De-Harpo.
4.3.1 Overview
As already discussed, at a high-level De-Harpo consists of an obfuscator designed for enhancing user
privacy and a denoiser designed for preserving user utility, as demonstrated in Figures 4.1 and 4.2 (in more
detail). The De-Harpo obfuscator is a non-trivial adaptation of Harpo’s obfuscator [178] in the context of
YouTube’s recommendation system. The obfuscator injects fake video playing records into a user’s video
playing history at random times. We refer to videos played by the user as user videos and to videos played
by the obfuscator as obfuscation videos. Note that without any obfuscation videos in the user’s video playing
history (which is denoted by V^u in this case), YouTube will recommend a set of videos desired by the user. We refer to this set of videos as "clean" YouTube videos. However, with obfuscation videos in the user's video playing history (which is denoted by V^o in this case), YouTube will recommend a set of videos which include videos undesired by the user. We refer to this set of videos as "noisy" YouTube videos. The denoiser is designed to predict the class distribution of "clean" YouTube videos from the class distribution of "noisy" YouTube videos, such that De-Harpo can repopulate a new set of videos with the same class distribution
as the “clean” YouTube videos. We refer to the repopulated videos as De-Harpo videos. Note that each
video class represents a video topic, and we use the 154 affinity segments used by Google [179] as our
video classes.
In more detail (see Figure 4.2), De-Harpo starts by generating video embeddings of past played videos via an embedding model. It then uses an obfuscator model to select obfuscation videos based on the generated video embeddings. Note that we follow a methodology similar to that in [178] to formulate the process of inserting obfuscation videos as a Markov Decision Process (MDP), and use reinforcement learning (RL) to train the obfuscator model to maximize the divergence between the class distribution of "noisy" YouTube videos (denoted by C^o) and the class distribution of "clean" YouTube videos (denoted by C^u). After receiving the "noisy" YouTube videos, the denoiser outputs an estimate of the class distribution of "clean" YouTube videos (denoted by Ĉ^u), by taking as inputs V^u, V^o, and C^o. Finally, De-Harpo uses a repopulation model to generate the set of De-Harpo videos with class distribution Ĉ^u.
4.3.2 System Preliminaries
User persona. We define a user persona as a sequence of YouTube videos. Formally, we denote the non-obfuscated user persona as V^u = [v^u_1, ..., v^u_n], where v^u_i represents the i-th video played by the user, and n is the total number of videos played by the user. We denote the obfuscated user persona as V^o = [v^j_1, ..., v^j_{n'}], where j ∈ {u, o}, v^u_i and v^o_i indicate that the i-th video is played by the user and the obfuscator respectively, and n' is the total number of videos played by the user and obfuscator combined.
Recommended video class distribution. We define the recommended video class distribution of a non-obfuscated user persona V^u (i.e., the class distribution of "clean" YouTube videos) as C^u = [c^u_1, ..., c^u_K], where \sum_{k=1}^{K} c^u_k = 1, c^u_k is the fraction of the recommended videos for V^u that belong to the k-th class, and K is the total number of classes. Similarly, we define the recommended video class distribution of an obfuscated user persona V^o (i.e., the class distribution of "noisy" YouTube videos) as C^o = [c^o_1, ..., c^o_K], where \sum_{k=1}^{K} c^o_k = 1 and c^o_k is the fraction of the recommended videos for V^o that belong to the k-th class.
Figure 4.2: Overview of De-Harpo. Panel (a) shows the system without obfuscation-denoising; panel (b) shows the system with obfuscation-denoising, consisting of (1) the embedding model, (2) the obfuscator model, (3) the denoiser model, and (4) the repopulation model. Note that V^u denotes the non-obfuscated user persona, V^o denotes the obfuscated user persona generated by the obfuscator, C^u is the recommended video class distribution based on V^u, C^o is the recommended video class distribution based on V^o, Ĉ^u is the denoiser's estimate of C^u, and v^u_i and v^o_i represent a user video and an obfuscation video respectively.
We use the recommended video class distribution as a representation of the user interest profile built by YouTube, instead of directly using the recommended videos. This design choice is made to (i) mitigate the impact of non-determinism in YouTube's recommendations and (ii) alleviate the difficulty of making video-level recommendations given an incomplete set of available videos, while still making reasonably fine-grained recommendations (among 154 different classes).
Privacy metric. At a high level, we want to distort the user interest profile built by YouTube for user
personas to enhance user privacy. Motivated by the use of the recommended video class distribution as a
representation of YouTube’s user interest profile, we first define the following privacy metric:
P = E[D_{KL}(C^o || C^u)] = E[\sum_{k=1}^{K} c^o_k \log(c^o_k / c^u_k)],   (4.1)
which measures the expected KL divergence between the two probability distributions (C^o and C^u)∗. It is worth noting that we use KL divergence since it is a well-established measure of the discrepancy between two distributions, and, together with the closely related mutual information measure, it has been used as an on-average privacy metric in a myriad of applications including recommendation systems [188, 189, 1, 190, 191, 192]. We do not use stricter privacy metrics which provide worst-case privacy guarantees (e.g., differential privacy (DP) [193]), since in the context of our application one would need to inject an enormous number of obfuscation videos to satisfy such guarantees (see Section 4.3.3 for a detailed, formal discussion on DP in our context).
During real-world experimentation on YouTube, we observe that the recommended video class distribution of the same persona may differ a bit due to inherent randomness in the system. Since we are interested in measuring the divergence due to obfuscation only, we define D_Min as the expected KL divergence between a random sample of C^u and its mean C̄^u (i.e., D_Min = E[D_{KL}(C̄^u, C^u)]), and subtract from P the divergence caused by randomness, that is, we work with P − D_Min. Furthermore, since P is unbounded, we normalize the privacy metric as follows. Denote the user persona set, which consists of all user personas, by V. Let V^u and V^{u'} be two user personas uniformly and randomly sampled from V, and let their associated recommended video class distributions be C^u and C^{u'} respectively. Then, we define the normalized privacy metric P^Norm by:

P^Norm = (P − D_Min) / (D_Max − D_Min),   (4.2)

where D_Max = E[D_{KL}(C^u, C^{u'})] is the expectation of the KL divergence between C^u and C^{u'} and thus corresponds to the average "distance" between the video class distributions of two randomly selected users. Hence, P^Norm measures the fraction of the maximum possible divergence that obfuscation achieves, on average.
Note that for both P and P^Norm, the higher their value, the more effective the obfuscator is in enhancing user privacy (see Figure 4.3).
∗Note that if c^i_k = 0 (i ∈ {u, o}), we assign a small value to it to avoid an infinite term in the KL divergence calculation.
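For concreteness, the following is a minimal sketch of how the privacy metric of Eq. (4.1) and its normalized version of Eq. (4.2) can be computed from empirical class distributions (illustrative only; the smoothing constant EPS that replaces zero entries, per the footnote above, is an assumed value):

import numpy as np

EPS = 1e-6  # assumed small value substituted for zero class probabilities

def kl_divergence(p, q):
    # D_KL(p || q) over the K video classes, with smoothing and re-normalization
    p = np.clip(np.asarray(p, dtype=float), EPS, None)
    q = np.clip(np.asarray(q, dtype=float), EPS, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def privacy_metric(C_o_samples, C_u_samples):
    # P = E[D_KL(C^o || C^u)], averaged over persona pairs (Eq. 4.1)
    return float(np.mean([kl_divergence(c_o, c_u)
                          for c_o, c_u in zip(C_o_samples, C_u_samples)]))

def normalized_privacy(P, D_min, D_max):
    # P^Norm = (P - D_Min) / (D_Max - D_Min) (Eq. 4.2)
    return (P - D_min) / (D_max - D_min)

Here, D_min and D_max would be estimated empirically as described above, from repeated crawls of the same persona and from pairs of randomly selected personas, respectively.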
Utility metric. In our threat model, the user sends the obfuscated persona to YouTube and then receives a "noisy" recommended video list with class distribution C^o. However, the user desires the "clean" recommended video list with class distribution C^u. Our denoiser is designed to predict C^u from C^o, such that De-Harpo can repopulate the "clean" recommended video list from C^u. With the above in mind, we define our utility loss metric as follows:
U_Loss = E[D_{KL}(Ĉ^u || C^u)] = E[\sum_{k=1}^{K} ĉ^u_k \log(ĉ^u_k / c^u_k)],   (4.3)
where Ĉ^u is the output of the denoiser, representing its estimate of C^u. A smaller U_Loss means a smaller divergence between the non-obfuscated recommended video class distribution C^u and the denoiser's estimate Ĉ^u of that distribution, and thus a better estimate. The theoretical minimum that this value can take is 0, representing two identical distributions, i.e., the noise is perfectly removed. Note that without applying the denoiser, the utility loss equals the value of the privacy metric P (since Ĉ^u = C^o). The denoiser can therefore reduce the utility loss caused by the obfuscator by P − U_Loss, which represents the denoiser utility gain. Similarly to above, because P is unbounded and YouTube's randomness causes, on average, a divergence of D_Min, we define the normalized utility gain metric as follows:

U^Norm_Gain = (P − U_Loss) / (P − D_Min),   (4.4)

which represents the fraction of obfuscation noise reduced by the denoiser, on average. A higher U^Norm_Gain implies that the denoiser reduces the utility loss caused by the obfuscator more effectively, and a value of 100% indicates complete removal of the noise (see Figure 4.3).
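A corresponding minimal sketch for the utility metrics of Eqs. (4.3) and (4.4) follows (illustrative only, with the same assumed smoothing constant):

import numpy as np

EPS = 1e-6  # assumed smoothing constant for zero entries

def kl_divergence(p, q):
    p = np.clip(np.asarray(p, dtype=float), EPS, None)
    q = np.clip(np.asarray(q, dtype=float), EPS, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def utility_loss(C_hat_u_samples, C_u_samples):
    # U_Loss = E[D_KL(C^u_hat || C^u)] (Eq. 4.3)
    return float(np.mean([kl_divergence(c_hat, c_u)
                          for c_hat, c_u in zip(C_hat_u_samples, C_u_samples)]))

def normalized_utility_gain(P, U_loss, D_min):
    # U^Norm_Gain = (P - U_Loss) / (P - D_Min) (Eq. 4.4)
    return (P - U_loss) / (P - D_min)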
Figure 4.3: Privacy and utility metrics (illustrating D_Min, P, U_Loss, and D_Max, together with the normalized metrics P^Norm for the obfuscator and U^Norm_Gain for the denoiser).
4.3.3 Performance Goals and Guarantees
Performance goals. As discussed already, our goal is to obfuscate the actual user profile, that is, the user's interests as inferred by YouTube from the user's video watch history. (We do not consider other channels via which YouTube may infer user interests; see the threat model details in Section 4.2.2.) In view of obfuscation, YouTube's goal is to reconstruct the actual user profile (what YouTube would have inferred from the user's video watch history in the absence of obfuscation) as accurately as possible from the obfuscated user profile (what YouTube infers from the user's video watch history in the presence of obfuscation). Since YouTube's user profiles are not public, we infer them from YouTube's recommended videos to the user, and, more specifically, from the recommended video class distribution (where we use the 154 affinity segments used by Google as our video classes).
Motivated by the above, our privacy metric captures the distance (normalized KL divergence) between the recommended video class distribution before and after obfuscation, which the obfuscator seeks to maximize. If the distance between the recommended video class distribution before and after obfuscation is almost the same as the distance between the recommended video class distribution before obfuscation and the recommended video class distribution of another random user, then YouTube's recommendations for the user under study are essentially random, implying that YouTube is not able to learn the user's actual interests from the obfuscated user's video watch history. Tellingly, in Section 4.7.3 we do show that with merely 70% of the videos in a user persona being obfuscation videos, the distance between the recommended video class distribution before and after obfuscation is already 93% of the distance between random distributions.
Performance guarantees. A discussion about performance guarantees is in order. First, can De-Harpo effectively de-noise the noisy recommendations such that their utility is high, even though the recommendations are essentially as if they were random? Section 4.7.2 answers affirmatively. Related to this, if De-Harpo can de-noise recommendations, can't YouTube de-noise them as well? Sections 4.7.5 and 4.7.6 show that it cannot in practice, and Section 4.3.5 offers a formal explanation of why it can't. Note that even though YouTube unavoidably learns the interests of a user corresponding to the user videos that the user actually watches, it also learns interests corresponding to the obfuscation videos, the relative importance of each interest is altered, and YouTube has no practical way of telling which interest is real and which is not.†
†If a user wishes YouTube to not learn about the user's real interests at all, the user should not use YouTube: even though YouTube in theory offers a method to remove a video from the watch history, (i) even if the video is deleted, the corresponding interest categories are not [194], and (ii) there is no "unlearning" at the ML level, and hence the recommendation algorithm will still recommend videos based on the total watch history.
Second, both our privacy and utility metrics are based on expectations, see Eqs. (4.1)-(4.4). Hence, De-Harpo guarantees performance goals "on average". But what about "worst-case" privacy guarantees? In our context this would require that no matter how unique the original video watch history of a specific user may be, YouTube should not be able to learn any unique interests of this user, regardless of how unsuccessful it may be across all users on average. There is a large line of prior work on both "on-average" [189, 1, 190, 191, 192] and "worst-case" [195, 193, 188, 1, 190] privacy guarantees. It is intuitive that strict definitions of privacy like differential privacy (DP) [193], which guarantee privacy in the worst case, cannot be satisfied for recommendation systems actively used by users.
Why can DP not be guaranteed? For completeness, we provide a formal proof of why differential privacy cannot be achieved, as follows.
Theorem 1. Assume that there is one video which the obfuscator O (a randomized function) cannot delete from a user persona P. Then we cannot achieve ϵ-DP or (ϵ, δ)-DP in terms of protecting the user persona.
Proof. First, to achieve ϵ-DP, for any two user personas P1 and P2 differing in one video, and for any user persona set P belonging to the output space of the obfuscator, Pr(O(P1) ∈ P) / Pr(O(P2) ∈ P) ≤ e^ϵ should be satisfied. Now, assume that there is one video V which exists only in P2 and not in P1, and that the obfuscator O cannot remove it from P2 after obfuscation, which means O(P2) will always contain video V. Then, there exists a user persona set P which contains user personas without video V, where Pr(O(P1) ∈ P) = 1 but Pr(O(P2) ∈ P) = 0. Therefore, Pr(O(P1) ∈ P) / Pr(O(P2) ∈ P) = +∞ and hence ϵ would have to be infinite in order to bound this worst case.
Second, to achieve (ϵ, δ)-DP, for any two user personas P1 and P2 differing in one video, and for any user persona set P belonging to the output space of the obfuscator, |Pr(O(P1) ∈ P) − e^ϵ Pr(O(P2) ∈ P)| ≤ δ should be satisfied. Moreover, for δ to be meaningful, it has to be inversely proportional to the size of the dataset, which in our case is enormous (all possible user personas). However, since there exists a user persona set P of personas without video V, where Pr(O(P1) ∈ P) = 1 but Pr(O(P2) ∈ P) = 0, the value of δ equals 1, which is meaningless in terms of (ϵ, δ)-DP.
Theorem 2. Assume that there is one interest category which the obfuscator O (a randomized function) cannot remove from a user profile (i.e., a list of interest categories) created by YouTube (R). Then we cannot achieve ϵ-DP or (ϵ, δ)-DP in terms of protecting the user profiles.
Proof. Define the YouTube recommendation system as R. First, to achieve ϵ-DP, for any two user profiles R(P1) and R(P2) differing in one interest category, and for any user profile set R in the output space of the recommendation system, Pr(R(O(P1)) ∈ R) / Pr(R(O(P2)) ∈ R) ≤ e^ϵ should be satisfied. Now, assume that there is one interest category I which is only in user profile R(P2) but not in user profile R(P1), and that the obfuscator O cannot remove it from user profile R(O(P2)), which means user profile R(O(P2)) will always contain interest category I. Then, there exists a user profile set R containing user profiles without interest category I, where Pr(R(O(P1)) ∈ R) = 1 but Pr(R(O(P2)) ∈ R) = 0. Therefore, Pr(R(O(P1)) ∈ R) / Pr(R(O(P2)) ∈ R) = +∞ and hence ϵ would have to be infinite in order to bound this worst case.
Second, to achieve (ϵ, δ)-DP, for any two user profiles R(P1) and R(P2) differing in one interest category, and for any user profile set R in the output space of the recommendation system, |Pr(R(O(P1)) ∈ R) − e^ϵ Pr(R(O(P2)) ∈ R)| ≤ δ should be satisfied. However, since there exists a user profile set R containing user profiles without interest category I, where Pr(R(O(P1)) ∈ R) = 1 but Pr(R(O(P2)) ∈ R) = 0, the value of δ equals 1, which is meaningless in terms of (ϵ, δ)-DP.
In summary, assume that there is one video V in user persona P1 (i.e., a video watch history) which is not in user persona P2, and that the obfuscator O (the randomized function in the DP definition) cannot remove it from P1. Let P be a user persona without video V that we observe. Then, the probability of O(P1) being P is zero while the probability of O(P2) being P is non-zero. Thus, per the DP definition, the ϵ for this worst-case scenario will be infinite and DP is violated.
4.3.4 System Model
Obfuscator. The obfuscation video selection process of the obfuscator can be formulated as a Markov Decision Process (MDP) defined as follows:
1) Obfuscation step: As shown in Figure 4.4, at the beginning of each time step, a video is played. If the played video is an obfuscation video injected by the obfuscator, we refer to this time step as an obfuscation step. We denote the number of videos that have been played up to obfuscation step t by n_t. Note that we use the obfuscation budget α as a system parameter to control the fraction of obfuscation videos: at each time step, with probability α, an obfuscation video is injected by the obfuscator into the user persona.
2) State: We define the state s_t ∈ S at obfuscation step t as s_t = [v_1, ..., v_{n_t}], where n_t is the total number of videos played until the beginning of obfuscation step t, and S is the state space of the MDP.
3) Action: At obfuscation step t, an action a_t is taken by the MDP. We define the action a_t ∈ A as the obfuscation video selected by the MDP policy, where A is the action space of the MDP, i.e., the obfuscation video set in our application.
4) State Transition: We define the state transition function as T(·|S, A) : S × A × S → R, which outputs the probability of s_{t+1} = s' given s_t = s and a_t = a as T(s_{t+1} = s' | s_t = s, a_t = a). In our system, state s_{t+1} contains all videos played until state s_t, the action a_t (i.e., the obfuscation video selected at obfuscation step t), and all the videos played by the user between obfuscation step t and obfuscation step t + 1. Note that the randomness of this MDP comes from the random injection of obfuscation videos.
5) Reward: We associate a reward r_t with the action a_t at obfuscation step t. Specifically, we define r_t as the difference in the privacy metric P (see Eq. (4.1)) between this obfuscation step and the previous one, i.e., r_t = P_t − P_{t−1}, where P_t represents the privacy metric value at obfuscation step t, calculated based on the recommended video class distributions of the non-obfuscated user persona and the corresponding obfuscated user persona at the end of obfuscation step t.
6) Policy: The policy of the MDP can be defined as π(·|S) : S × A → R, which outputs the probability of a_t = a given s_t = s as π(a_t = a | s_t = s). In our system, the obfuscator is modeled as the policy of the MDP, which outputs the probability distribution over obfuscation video selections. Suppose we have M obfuscation videos in the obfuscation video set A; then \sum_{i=1}^{M} π(a_t = i | s_t) = 1, where a_t = i represents the selection of the i-th obfuscation video. At each obfuscation step t, we randomly choose one obfuscation video based on a multinomial distribution parameterized by A_t = [π(a_t = 1 | s_t), ..., π(a_t = M | s_t)], conditioned on the current state s_t. The goal of solving this MDP is to find the optimal policy, such that the cumulative reward \sum_{t=1}^{T} r_t is maximized. Note that T is the total number of obfuscation steps, since we consider a finite-horizon MDP.
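For illustration, a minimal sketch of this MDP dynamic is given below (the uniform policy is a placeholder standing in for the trained RL obfuscator, and the sampling loop is a simplification):

import random
import numpy as np

def uniform_policy(state, num_obfuscation_videos):
    # Placeholder policy; in De-Harpo this is the trained RL obfuscator model.
    return np.ones(num_obfuscation_videos) / num_obfuscation_videos

def simulate_persona(user_videos, obfuscation_videos, alpha, policy=uniform_policy):
    # Interleave user videos with obfuscation videos chosen by the policy.
    # At each time step, with probability alpha an obfuscation video is injected
    # (an obfuscation step); otherwise the next user video is played.
    state, user_iter = [], iter(user_videos)
    while True:
        if random.random() < alpha:                              # obfuscation step
            probs = policy(state, len(obfuscation_videos))
            video = obfuscation_videos[np.random.choice(len(obfuscation_videos), p=probs)]
        else:                                                    # user step
            try:
                video = next(user_iter)
            except StopIteration:
                break
        state.append(video)
    return state

# e.g., 5 user videos, 3 candidate obfuscation videos, obfuscation budget 0.3
print(simulate_persona(["u1", "u2", "u3", "u4", "u5"], ["o1", "o2", "o3"], alpha=0.3))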
Figure 4.4: MDP for the obfuscator (states s_t, actions a_t, and rewards r_t evolve over obfuscation steps, with user and obfuscation videos interleaved over time steps).
Note that the state s_t (i.e., the video sequence) is continuously updated by appending new videos and only grows, unless users manually delete the history. De-Harpo is designed to take the whole state s_t as input to its obfuscator to select an obfuscation video, and then to run denoising at each step. Hence, the calculation made by De-Harpo at each step depends on the calculation made by De-Harpo in the previous step, which is consistent with how YouTube works. Moreover, we clarify that Harpo [178] and De-Harpo use a similar MDP formulation but with a different state (a video sequence instead of a webpage sequence) and reward function (the privacy metric). They apply the same RL algorithm (A2C) to train the obfuscator, though the implementation differs due to the MDP differences.
Denoiser. At a high level, we model the denoiser as a mapping from the recommended video class distribution of the obfuscated user persona, C^o ∈ R^K, to the recommended video class distribution of the non-obfuscated user persona, C^u ∈ R^K (K is the total number of video categories).
Directly estimating C^u from C^o can be challenging. In the extreme case where the mutual information between C^u and C^o is zero [196], it is impossible for the denoiser to estimate C^u from C^o. To estimate C^u, the denoiser may leverage side information indicating how the obfuscation videos are injected into the user personas, as in this case it may be able to undo the effect of the obfuscation videos on the recommendation list. In our application, such side information is explicitly available to users (the V^u portion of V^o), since the obfuscator is installed locally and users know exactly how the obfuscation videos are injected into their personas. Therefore, our denoiser is modeled as a functional mapping from (V^u, V^o, C^o) to C^u.
4.3.5 The “Secret" of the Denoiser
We use the information theory concept of mutual information (MI) to explain why the denoiser works.
Recall that the recommendation system cannot distinguish user videos from obfuscation videos and thus does not know the user's video playing history V^u. In our system, both V^u and V^o are modelled as random vectors, and V^o is generated from V^u by the obfuscator, which is a random function. Additionally, both C^u and C^o are random vectors, which are generated from V^u and V^o respectively by the YouTube recommendation system. By applying the chain rule of MI, we can derive the following equation:
I(C^o, V^o, V^u; C^u) = I(C^o, V^o; C^u) + I(V^u; C^u | C^o, V^o),   (4.5)
where I(C^o, V^o, V^u; C^u) is the MI between (C^o, V^o, V^u) and C^u, I(C^o, V^o; C^u) is the MI between (C^o, V^o) and C^u, and I(V^u; C^u | C^o, V^o) is the MI between V^u and C^u conditioned on (C^o, V^o).
First, we show that the non-obfuscated user persona V^u can be leveraged by the denoiser to better estimate C^u. Since C^u is generated by the YouTube recommendation system given V^u, V^u is correlated with C^u, and thus I(V^u; C^u | C^o, V^o) > 0. Hence,
I(C^o, V^o, V^u; C^u) > I(C^o, V^o; C^u),   (4.6)

where the left-hand side includes the "secret" V^u while the right-hand side does not.
Since the MI between (V^u, V^o, C^o) and C^u is larger than the MI between (C^o, V^o) and C^u, (C^o, V^o, V^u) can reveal more information about C^u than (C^o, V^o), leading to a more accurate estimate of C^u. As an aside, note that YouTube may attempt to de-obfuscate V^u from V^o. We evaluate the robustness of the obfuscator against de-obfuscation in Section 4.7.6.
Second, we show that including C^o and V^o may help to further enhance the effectiveness of the denoiser, compared with using V^u only. Based on the chain rule of MI, we can rewrite Eq. (4.5) as follows:
I(V^u, V^o, C^o; C^u) = I(V^u; C^u) + I(C^o; C^u | V^u) + I(V^o; C^u | C^o, V^u).   (4.7)
Figure 4.5: Details of system design. (a) Video embedding: pretrained Transformers map the transcript to a transcript embedding, which is concatenated with a metadata embedding (category, view count, average rating) to form the video embedding. (b) Obfuscator: Conv and LSTM layers over past video embeddings followed by an FC layer. (c) Denoiser: LSTM encoders and FC layers producing the estimated class distribution. (d) Surrogate model: an LSTM and an FC layer mapping a persona to a class distribution.
Consider the term I(C^o; C^u | V^u). C^o depends on V^u and the obfuscation videos, and C^u depends on V^u. Crucially, they both also depend on the (non-deterministic) YouTube recommendation system. Hence, even when V^u is given, there is non-zero MI between C^o and C^u, that is, I(C^o; C^u | V^u) > 0, leading to the following inequality:

I(V^u, V^o, C^o; C^u) > I(V^u; C^u),   (4.8)
which means the MI between (V^u, V^o, C^o) and C^u is larger than the MI between V^u and C^u only. Intuitively, knowing the pair (V^o, C^o) reveals information about how the YouTube recommendation system selects videos to recommend given a video watch history. Therefore, a denoiser taking C^o and V^o as additional inputs can learn more information about C^u than a denoiser taking only V^u as input. Our evaluation results in Section 4.7.2 empirically support the above analysis.
4.4 System Design and Implementation
In this section, we describe the detailed design of De-Harpo and how we implement De-Harpo as a
browser extension. De-Harpo consists of five modules: (1) a video embedding model that maps videos into embeddings; (2) an obfuscator model that selects obfuscation videos based on the video embeddings of played videos; (3) a denoiser model that estimates the class distribution of "clean" YouTube videos from the class distribution of "noisy" YouTube videos; (4) a repopulation model that outputs De-Harpo videos with the estimated class distribution of "clean" YouTube videos; and (5) a surrogate model used to train the obfuscator model offline efficiently (see Figure 4.2b for the workflow of modules (1)-(4)).
4.4.1 Video Embedding
To make our system scalable to millions of YouTube videos without being restricted to a fixed set, we represent each video by an embedding vector. A YouTube video typically consists of metadata (e.g., title, description, view count, rating, thumbnail, etc.), a sequence of image frames (i.e., the video itself), and the transcript of the video. Since a video's transcript is a good representation of its content, and it is more computationally and spatially efficient to process the transcript than the original video stream, we use the video metadata and transcript to generate the video embedding, where the video embedding for video v_i is denoted by e_i ∈ R^404.‡
‡Note that the YouTube recommendation system uses the image frames and some other private features to generate its video embeddings (see [187]). We acknowledge that by including these features, our video embeddings might be closer to the actual embeddings used by YouTube. However, since our video embeddings can already yield a surrogate model (see Section 4.4.5) with reasonable performance and are more computationally efficient, we choose the current design of our video embeddings.
Specifically, as demonstrated in Figure 4.5a, we start by extracting the category, view count, and average rating of each video from its metadata. We then use a one-hot embedding to represent the category of each video (with dimension 18)§, and use two real numbers to represent the standardized view count and average rating of each video. By combining them, we derive the metadata embedding with 20 elements. We denote the metadata embedding for video v_i as e^M_i ∈ R^20.
§Note that YouTube has 17 video categories, and we add an additional "none" category for videos without category metadata. Hence, the one-hot embedding for the category information has a dimension of 18.
Next, we use a pretrained natural language processing (NLP) Transformer from [197] to generate the
transcript embedding for the video transcript. Since the pretrained NLP Transformer has a constraint on
the maximum number of words in the input text (256 words in our case), we first split any video transcript with more than 256 words into multiple transcript chunks of 256 words each. Then, each transcript chunk is used as input to the NLP model to obtain an output embedding vector. We take the average of the embedding vectors of these transcript chunks to derive the final transcript embedding. We denote the transcript embedding for video v_i as e^T_i ∈ R^384, a real vector with dimension 384. Note that if a video does not contain any transcript (e.g., music videos), we use the video title and description as an alternative to the transcript to generate the transcript embedding. Last, we concatenate the metadata and transcript embeddings to derive the complete video embedding vector e_i = [e^M_i, e^T_i] ∈ R^404.
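A minimal sketch of this embedding pipeline is given below, using the all-MiniLM-L6-v2 sentence transformer mentioned in Section 4.5.2 (illustrative only; the standardization statistics passed as defaults and the handling of empty transcripts are assumptions):

import numpy as np
from sentence_transformers import SentenceTransformer

_model = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dim sentence embeddings
NUM_CATEGORIES = 18                                # 17 YouTube categories + "none"
CHUNK_WORDS = 256

def metadata_embedding(category_idx, view_count, avg_rating,
                       view_mean, view_std, rating_mean, rating_std):
    # 20-dim metadata embedding: one-hot category + standardized view count and rating
    one_hot = np.zeros(NUM_CATEGORIES)
    one_hot[category_idx] = 1.0
    stats = np.array([(view_count - view_mean) / view_std,
                      (avg_rating - rating_mean) / rating_std])
    return np.concatenate([one_hot, stats])

def transcript_embedding(text):
    # 384-dim transcript embedding: average over 256-word chunk embeddings
    words = text.split()
    chunks = [" ".join(words[i:i + CHUNK_WORDS])
              for i in range(0, max(len(words), 1), CHUNK_WORDS)]
    return _model.encode(chunks).mean(axis=0)

def video_embedding(category_idx, view_count, avg_rating, transcript_text,
                    view_mean=1e6, view_std=1e6, rating_mean=4.0, rating_std=0.5):
    # Concatenate metadata and transcript embeddings into a 404-dim vector e_i
    e_m = metadata_embedding(category_idx, view_count, avg_rating,
                             view_mean, view_std, rating_mean, rating_std)
    e_t = transcript_embedding(transcript_text)
    return np.concatenate([e_m, e_t])              # shape (404,)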
4.4.2 Obfuscator Model
As discussed before, we model the process of injecting obfuscation videos as an MDP. Due to the prohibitively large state space of this MDP, we use RL, parameterized by a deep neural network, to learn the
optimal policy for obfuscation video selection.
The obfuscator takes as input the state at each obfuscation step, and outputs a video embedding. By measuring the similarity between the output video embedding and each obfuscation video embedding, the obfuscator derives the probability distribution of the obfuscation video selection, where an obfuscation video whose embedding is more similar to the output video embedding is assigned a higher probability. Specifically, as shown in Figure 4.5b, the obfuscator consists of a convolutional layer (Conv), an LSTM layer, and a fully-connected layer (FC). At step t, the convolutional layer takes the embeddings of the past n_t videos as input (E_t ∈ R^{n_t×404}) and outputs a real vector with m_1 elements (ϕ^1_t ∈ R^{m_1}). Next, the LSTM layer takes ϕ^1_t and the hidden vector at obfuscation step t − 1 with m_3 elements (h_{t−1} ∈ R^{m_3}) as input, and outputs a real vector with m_2 elements (ϕ^2_t ∈ R^{m_2}) and the hidden vector h_t ∈ R^{m_3} for obfuscation step t (m_1 = m_2 = m_3 = 128 in our experiments). Finally, a linear layer converts ϕ^2_t into a real vector with the same dimension as the video embedding. We denote this vector by e_t ∈ R^404, as
it represents the target embedding for the obfuscation video. Let E = [e_1, ..., e_M] denote the embedding vectors of the M obfuscation videos at our disposal. Then, the probability of selecting the i-th obfuscation video, i = 1, ..., M, is calculated proportionally to the similarity between its embedding and the target embedding, after normalizing using a softmax function:

π(a_t = i | s_t) = e^{⟨e_t, e_i⟩} / \sum_{j=1}^{M} e^{⟨e_t, e_j⟩},   (4.9)

where ⟨x, y⟩ denotes the inner product between x and y. Note that we use the on-policy RL algorithm A2C (Advantage Actor-Critic) [130] to train the obfuscator (see Section 4.6).
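A minimal PyTorch sketch of this policy network and of the selection rule in Eq. (4.9) is given below (a reconstruction from the description above, not the actual implementation; the convolution kernel size and the pooling over time are assumptions):

import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, HID = 404, 128   # video embedding size; m1 = m2 = m3 = 128

class ObfuscatorPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(EMB_DIM, HID, kernel_size=3, padding=1)  # kernel size assumed
        self.lstm = nn.LSTMCell(HID, HID)
        self.fc = nn.Linear(HID, EMB_DIM)

    def forward(self, state_embs, hidden):
        # state_embs: (n_t, EMB_DIM) embeddings of the videos played so far
        x = state_embs.t().unsqueeze(0)              # (1, EMB_DIM, n_t)
        phi1 = F.relu(self.conv(x)).mean(dim=2)      # (1, HID), pooled over time (pooling assumed)
        h, c = self.lstm(phi1, hidden)               # LSTM hidden state update
        return self.fc(h), (h, c)                    # target embedding e_t and new hidden state

def selection_probs(target_emb, obf_embs):
    # Eq. (4.9): softmax over inner products with the obfuscation video embeddings
    return torch.softmax(obf_embs @ target_emb.squeeze(0), dim=0)

# usage sketch with random tensors
policy = ObfuscatorPolicy()
hidden = (torch.zeros(1, HID), torch.zeros(1, HID))
state = torch.randn(5, EMB_DIM)                      # 5 videos played so far
obf_bank = torch.randn(1000, EMB_DIM)                # M = 1000 obfuscation video embeddings
target, hidden = policy(state, hidden)
probs = selection_probs(target, obf_bank)
action = torch.multinomial(probs, 1).item()          # sampled obfuscation video index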
Recall that the De-Harpo obfuscator is a non-trivial adaptation of Harpo [178] to YouTube. An important technical difference is that by calculating a target embedding and then selecting an obfuscation item
(video in case of YouTube) based on the similarity between its embedding and the target embedding, the
De-Harpo obfuscator can handle an unlimited and varying number of possible obfuscation videos without
requiring re-training when the set of obfuscation videos changes.
4.4.3 Denoiser Model
As mentioned in Section 4.3.4, the denoiser has three inputs: the non-obfuscated user persona V^u, the obfuscated user persona V^o, and the recommended video class distribution of the obfuscated user persona C^o. The denoiser uses two LSTM layers and an FC layer to encode its inputs, as shown in Figure 4.5c. Specifically, the first LSTM layer recurrently takes as input the embeddings of the videos in the non-obfuscated user persona V^u and outputs its final hidden vector f^1 ∈ R^n (we use n = 128 in our experiments). Similarly, the inputs of the second LSTM layer are the embeddings of the videos in the obfuscated user persona V^o, and its output is its last hidden vector f^2 ∈ R^n. Last, the FC layer converts the class distribution C^o ∈ R^K (where K represents the number of categories) into a real vector f^3 ∈ R^n. After concatenating the vectors f^1, f^2, and f^3 into a single vector with dimension 3n, a final FC layer is used to map it into the estimated recommended video class distribution Ĉ^u ∈ R^K. Note that we train the denoiser via supervised learning with stochastic gradient descent (see Section 4.6).
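A minimal PyTorch sketch of this denoiser architecture is given below (a reconstruction from the description above, not the actual implementation; activations and sequence padding details are omitted or assumed):

import torch
import torch.nn as nn

EMB_DIM, HID, K = 404, 128, 154   # video embedding size, n = 128, K video classes

class Denoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm_u = nn.LSTM(EMB_DIM, HID, batch_first=True)  # encodes V^u
        self.lstm_o = nn.LSTM(EMB_DIM, HID, batch_first=True)  # encodes V^o
        self.fc_c = nn.Linear(K, HID)                           # encodes C^o
        self.fc_out = nn.Linear(3 * HID, K)                     # maps to estimate of C^u

    def forward(self, v_u, v_o, c_o):
        # v_u: (B, n, EMB_DIM), v_o: (B, n', EMB_DIM), c_o: (B, K)
        _, (h_u, _) = self.lstm_u(v_u)                          # f1: last hidden state
        _, (h_o, _) = self.lstm_o(v_o)                          # f2: last hidden state
        f3 = self.fc_c(c_o)                                     # f3: encoded C^o
        feats = torch.cat([h_u[-1], h_o[-1], f3], dim=1)        # (B, 3*HID)
        return torch.softmax(self.fc_out(feats), dim=1)         # estimated class distribution

# usage sketch with random tensors
den = Denoiser()
c_hat = den(torch.randn(2, 40, EMB_DIM), torch.randn(2, 60, EMB_DIM),
            torch.softmax(torch.randn(2, K), dim=1))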
4.4.4 Repopulating Recommended Videos
Recall that the denoiser in De-Harpo outputs a target video class distribution Ĉ^u. In order to go from a target video class distribution back to actual videos on the user's screen, we repopulate the recommendations using a browser extension.
For efficiency, we maintain a "bank" of videos per class and use it to repopulate the recommendations. This raises the question of how often we should refresh this bank in order to get a suitable trade-off between the recency of the videos and the overhead required to collect them. To ascertain a suitable refresh period, we run a 24-hour experiment where we query the name of a class in the YouTube search bar as a proxy for the explicit class and collect statistics on each class's most popular recommended videos. Specifically, we run the same query each hour, collect the top 20 search results per query, and compute the percentage of top results that remain the same. The results indicate that for most classes about 70-80% of the top search results remain the same. Motivated by this, we periodically – or on an on-demand basis – crawl a sufficiently large number of videos for each class to repopulate our bank. Note that the "noisy" recommended videos removed during the repopulation process are included in our obfuscation video set so that they can be played later to augment the obfuscation effect.
4.4.5 YouTube Surrogate Model
The training of the obfuscator requires frequent interactions with the YouTube recommendation system.
However, directly interacting with YouTube is time-consuming, since it takes more than 30 minutes to
construct a single persona (as described in Section 4.5.2). To train the obfuscator efficiently, we build a
surrogate model as a replication of the actual YouTube recommendation system.
The architecture of our surrogate model consists of an LSTM layer and an FC layer. The LSTM layer recurrently takes as input the embeddings of the videos in a user persona and outputs its last hidden vector, which is then used as the input of the FC layer. The FC layer outputs the recommended video class distribution C^i ∈ R^K, where i ∈ {u, o} (see Figure 4.5d for details). Note that we train the surrogate model via supervised learning with stochastic gradient descent (see Section 4.6).
Differences between our surrogate model and prior works. Prior approaches to learning latent user-item relationships for recommendation systems (e.g., matrix factorization [198, 199, 200, 201, 202, 203, 204, 205], neural MF [206, 207, 208, 209, 210, 211, 212, 213]) are not scalable because they rely on a fixed set of users and items. To address this limitation, recent work has focused on embedding-based recommendation systems that predict the next item clicked by users from their item-click history and can thus scale to a large and dynamic set of users and items [187, 214]. YouTube deals with a large influx of videos and users every day [182] and thus uses a scalable recommendation system that predicts the next watched videos based on the embeddings of the past watched videos and other factors [187]. Similar to YouTube's embedding-based recommendation architecture, our surrogate model also takes the video embeddings as input. Slightly differently from YouTube's embedding-based recommendation architecture, and as explained in Section 4.3.2, our surrogate model is designed to predict the recommended video class distribution instead of making video-level recommendations.
4.4.6 De-Harpo Implementation
We implement De-Harpo as a browser extension, which consists of two components: obfuscator and
denoiser.
Obfuscator. The obfuscator is a lightly modified version of Harpo's browser extension [178]. The browser extension plays the selected obfuscation videos in a background tab that is hidden from users. In order to determine the timing of playing obfuscation videos, the obfuscator component uses a background script to keep monitoring the URLs visited by the user and estimating the arrival rate of YouTube videos watched by the user, λ^u. Then, given obfuscation budget α, the obfuscator component uses a Poisson process with rate λ^o = λ^u · α / (1 − α) to inject randomly selected obfuscation videos. To mimic a typical user who watches one video at a time, the selected obfuscation videos can be played only when the user is not already using YouTube. However, if a user continues to watch YouTube videos for an extended time period, we can simultaneously play the selected obfuscation videos (in the background as explained above) to prevent YouTube from getting unfettered user watch history.¶
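For illustration, injection times under this scheduling can be drawn by sampling exponential inter-arrival times with rate λ^o = λ^u · α / (1 − α), as in the following sketch (illustrative only, not the extension's actual background script):

import random

def obfuscation_times(lambda_u, alpha, horizon_minutes):
    # Sample injection times (in minutes) from a Poisson process with rate
    # lambda_o = lambda_u * alpha / (1 - alpha), via exponential inter-arrivals.
    lambda_o = lambda_u * alpha / (1.0 - alpha)
    times, t = [], 0.0
    while True:
        t += random.expovariate(lambda_o)
        if t > horizon_minutes:
            return times
        times.append(t)

# e.g., a user watching one video per 10 minutes (lambda_u = 0.1) with alpha = 0.3
print(obfuscation_times(lambda_u=0.1, alpha=0.3, horizon_minutes=120))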
Denoiser. The denoiser has two modules: the HTML modification module and the denoising module. The HTML modification module is implemented in the background script. Whenever the user visits the YouTube homepage, the HTML modification module sends the "noisy" homepage recommended video list requested from the content script to the denoising module. Once the HTML modification module receives the "clean" homepage recommended video list from the denoising module, it modifies the HTML of the YouTube homepage to show the "clean" homepage recommended videos. The denoising module is implemented in the back-end, and is responsible for accessing the metadata of the received "noisy" homepage recommended videos, running the denoiser model to convert the "noisy" homepage recommended video list into a "clean" one, and then sending the "clean" video list back to the HTML modification module. We evaluate the implementation overhead of the obfuscator and denoiser components in Section 4.7.4.
¶It is not entirely uncommon for YouTube users to play videos in multiple browser tabs.
4.5 Experimental Setup
4.5.1 User Personas
To train and evaluate De-Harpo, we need to construct realistic user personas. However, it is challenging to obtain access to real-world YouTube users' video watch histories at a large scale as training data. To address this concern, we design two approaches that can generate a large number of synthetic user personas to simulate real-world users: 1) the first approach creates sock puppet based personas by following the "up next" videos recommended by YouTube; 2) the second approach leverages the YouTube videos publicly posted by Reddit users as an approximation of their YouTube user personas. We use these synthetic user persona datasets to train De-Harpo. Then, we evaluate it on both synthetic user persona datasets and a real user persona dataset that contains YouTube video watch histories collected from real-world users. We describe these three datasets in detail below.
Sock Puppet Based Personas. According to YouTube, about 70% of the videos viewed on the platform are sourced from its recommendation system [215]. Accordingly, given the current video, the "up next" videos recommended by YouTube are good representations of the potential subsequent videos watched by real-world YouTube users. Based on this insight, we build a sock puppet user persona model that generates random recommendation trails from a single seed video to model realistic YouTube user personas, by repeatedly playing one of the "up next" videos recommended by YouTube, selected uniformly at random.
Specifically, we denote this model as G(D, T), parameterized by D, the depth of the recommendation trail, and T, the total number of videos in the watch history, and we define a recommendation trail as a sequence of videos that are recommended and subsequently watched by a user starting from a given seed video. At each step of the recommendation trail, we randomly select one "up next" video to watch from the list of recommended videos with uniform probability. We repeat this process until the recommendation trail reaches depth D, at which point we check whether the user has watched T videos. If not, we randomly select another seed video from the user's homepage and repeat the process until T videos have been watched. Note that we randomly select around 20,000 popular videos from a set of popular YouTube channels as our seed videos. For each seed video, we randomly generate a recommendation trail and use it as a synthetic user persona. We collect the seed videos used for generating sock puppets randomly from 200 popular YouTube channels, which include videos from all YouTube video categories.
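A minimal sketch of this persona generator is given below (illustrative only; sample_seed_video and get_up_next_videos are toy stand-ins for the actual Selenium crawling logic):

import random

DEMO_SEEDS = ["seed_a", "seed_b", "seed_c"]

def sample_seed_video():
    # In practice: a random seed video from the ~20,000 popular seed videos.
    return random.choice(DEMO_SEEDS)

def get_up_next_videos(video_id):
    # In practice: the "up next" list scraped from the current video's page.
    return [f"{video_id}/up_next_{i}" for i in range(5)]

def generate_sock_puppet_persona(depth_D, total_T):
    # Random walk over "up next" recommendations: the persona model G(D, T).
    persona = []
    while len(persona) < total_T:
        current = sample_seed_video()
        persona.append(current)
        for _ in range(depth_D - 1):
            if len(persona) >= total_T:
                break
            current = random.choice(get_up_next_videos(current))  # uniform pick
            persona.append(current)
    return persona[:total_T]

print(generate_sock_puppet_persona(depth_D=5, total_T=12))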
In total, we generate 10,000 sock puppet based personas with 40 videos each. Since these personas are
synthetically built, we are able to exercise more control over the distribution of watched videos. Note that
we set the length of each user persona as 40, since we empirically observe that 40 videos can trigger enough
personalized recommended videos on the YouTube homepage and the average time it takes to watch them
is close to the average daily time spent by each YouTube user (35 min) [182].
Reddit User Personas. As a second way of simulating real-world user personas at a large scale, we gather YouTube links publicly posted by social media users as an approximation of their YouTube personas. While there are various social media platforms where users can share YouTube videos, we choose to collect data from Reddit, since it is one of the largest and most popular communities where users post links related to their interests, and millions of Reddit user submissions∥ are publicly available.
Specifically, we download Reddit user submissions from 2017 to 2021 using the APIs provided by pushshift.io [216]. For each user submission, we first extract the username and all YouTube links posted by this Reddit account. Next, we filter out any duplicate or broken links. Then, we extract the YouTube video ids from the remaining links, in order. Finally, we remove users with fewer than 40 YouTube video posts, since a small number of videos is not a fair approximation of the user's actual YouTube persona. In total, we collect 10,000 Reddit user personas of length 40.
∥A Reddit user submission is a JSON file storing metadata of a Reddit user's posts, including the username, the timestamp, the URL of the post, the text, etc.
Real-world YouTube Users. To conduct a more realistic evaluation of De-Harpo, we use a real-world user dataset from [180]. This dataset contains the web browsing histories of 936 real users collected through Web Historian [217] over three months. It is a good representative of real YouTube users, since: 1) the demographic distribution of these users, including their gender, age (18-65+), and education level (from less than high school to Doctoral degree), is relatively uniform; 2) on average 650 YouTube video URLs are watched by each user over the three months; 3) the first 40 videos watched by these users have different video class distributions, indicating diverse user interests. Considering that the dataset is collected over a long period, we select the first 40 YouTube videos watched by each of these 936 users as our real user personas to evaluate De-Harpo.
4.5.2 Data Collection and Preparation
User Persona Construction. We use a fresh Firefox browser based on Selenium to construct each user persona. For each sock puppet based persona, we start with a seed video and then follow the "up next" video recommendations to generate a recommendation trail. We play each video in a user persona for 30 seconds before playing the next video. Note that we clear any pop-up windows and skip the ads before playing the video. For each Reddit user and real user persona, since we already know the video ids in each persona, we visit these videos sequentially.∗∗ Similar to constructing synthetic user personas, if there are any pop-up windows or ads, we clear them and then play the video for 30 seconds.
∗∗Note that directly visiting the URL of each video doesn't trigger cookies from YouTube, and hence no personalized recommendation can happen. To address this, we first search for the video id on YouTube and then click the first search result.
Recommended Video Collection. After we complete the construction of each user persona, we go back to the YouTube homepage and refresh it 50 times to collect all the recommended videos into a list. Note that we refresh the homepage multiple times because we want to collect enough homepage recommended videos to estimate the recommended video class distribution. We choose 50 refreshes since we empirically observe that this is a good trade-off between collecting enough samples and minimizing the number of crawls to be performed. Because extremely popular videos are common across many users regardless of their profile, we remove them to underscore personalized recommendations. With this in
mind, we filter out videos which appear in more than 1% of personas' homepage recommended video lists. We also exclude YouTube videos shown on the homepage of a fresh browser. Then, for each recommended video, we extract the associated tags (i.e., a list of keywords) from its metadata, and map each of them into one of our 154 topic-level video classes (note that a video may belong to multiple video classes). Last, for each persona, we count the number of recommended videos in each class and divide it by the sum of videos over all classes to derive the recommended video class distribution of that persona.
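A minimal sketch of this counting step is given below (illustrative only; tags_to_classes stands in for the actual mapping from video tags to the 154 affinity segments):

from collections import Counter

def class_distribution(recommended_videos, tags_to_classes, num_classes=154):
    # recommended_videos: list of tag lists, one per recommended video.
    # tags_to_classes: dict mapping a tag string to a list of class indices.
    counts = Counter()
    for tags in recommended_videos:
        classes = {c for tag in tags for c in tags_to_classes.get(tag, [])}
        counts.update(classes)                 # a video may belong to several classes
    total = sum(counts.values()) or 1
    return [counts[k] / total for k in range(num_classes)]

# e.g., two recommended videos and a toy tag-to-class mapping
demo_map = {"cooking": [3], "travel": [7], "baking": [3]}
print(class_distribution([["cooking", "baking"], ["travel"]], demo_map))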
User Persona Dataset Collected for Surrogate Model. We construct 10,000 sock puppet based personas and 10,000 Reddit user personas with 40 videos each. For each of these personas, we collect the
YouTube homepage recommended videos and derive the recommended video class distribution. We use
these constructed personas as inputs (V^u) and the associated recommended video class distributions as labels (C^u) to build the dataset for surrogate model training and testing. As discussed in Section 4.6, we use supervised learning to train the surrogate model.
User Persona Dataset Collected for Obfuscator and Denoiser. To evaluate the effectiveness of the obfuscator model against the real-world YouTube recommendation system, we need to construct both non-obfuscated and obfuscated user personas. Specifically, for each obfuscator model under an obfuscation budget α, we first construct 2,936 non-obfuscated user personas†† with 40 videos each and the corresponding 2,936 obfuscated user personas generated by the obfuscator with on average 40 · α/(1 − α) videos each. Then, for each pair of non-obfuscated and obfuscated user personas (V^u and V^o), we collect their associated recommended videos from the YouTube homepage and derive their recommended video class distributions (C^u and C^o).
††Note that the 2,936 non-obfuscated user personas consist of 1,000 sock puppet based personas, 1,000 Reddit user personas, and the 936 real user personas from real-world users.
Moreover, we use the same user persona data collected for the obfuscator evaluation to create the
dataset for the denoiser training and testing (see Section 4.6). Specifically, each input of this dataset consists
of one non-obfuscated user personas (V
u
), the corresponding obfuscated user persona generated by the
††Note that 2,936 non-obfuscated user personas consist of 1,000 sock puppet based personas, 1,000 Reddit user personas, and
the 936 real user personas from real-world users.
122
obfuscator (V
o
), and its associated recommended video class distribution (C
o
). Each label of this dataset is
the recommended video class distribution of the non-obfuscated user persona (C
u
).
Video Embedding Preparation. We use youtube-dl, a free tool for downloading YouTube videos [218], to collect the metadata and transcripts of videos. From the metadata, we extract the category, average rating,
view count, title, and description of each video, which is then used to generate the metadata embedding
of each video. For a transcript, after we download it, we extract the transcript text, split it into text chunks
with 256 words each, and use the pretrained Transformer all-MiniLM-L6-v2 from [197] to convert them
into transcript embeddings. As described in Section 4.4.1, we combine the metadata and transcript embeddings to generate the final video embedding.
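The transcript-embedding step can be sketched as follows with the sentence-transformers library; the 256-word chunking follows the text above, while the mean pooling of chunk embeddings into a single transcript embedding is an illustrative assumption rather than the exact scheme used in Section 4.4.1.

```python
# Sketch of transcript embedding with the pretrained all-MiniLM-L6-v2 model.
# Mean pooling over chunk embeddings is an assumed, simple aggregation choice.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dimensional sentence embeddings

def transcript_embedding(transcript_text, chunk_words=256):
    words = transcript_text.split()
    chunks = [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), chunk_words)]
    if not chunks:
        return np.zeros(384)
    chunk_embeddings = model.encode(chunks)        # one vector per 256-word chunk
    return np.mean(chunk_embeddings, axis=0)       # pool chunks into one transcript embedding
```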
Obfuscation Video Set. We create our obfuscation video set by combining played videos during persona
construction and videos appearing in homepage recommendations of all personas. In total, we collect approximately one million YouTube videos and use them as the obfuscation video set. Note that the obfuscator
will select one obfuscation video from the obfuscation video set at each obfuscation step.
4.6 Training and Testing
Surrogate Model. We split the user persona dataset collected for the surrogate model into 80% for training and 20% for testing. We train the surrogate model with stochastic gradient descent to minimize its loss, which is defined as the KL divergence between its output distribution and the actual recommended video class distribution of the input user persona. We train our surrogate model for 50 epochs, where all the training samples are used once at each training epoch. We report that the average loss of our surrogate model on the testing dataset is 0.55.
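A minimal PyTorch sketch of this training loop is shown below; SurrogateModel, the data loader, and the hyperparameters are illustrative assumptions, with the model assumed to output logits over the 154 classes.

```python
# Sketch of surrogate-model training with a KL-divergence loss (hypothetical model/loader names).
import torch
import torch.nn.functional as F

def train_surrogate(model, train_loader, epochs=50, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for persona_emb, target_dist in train_loader:
            pred_log_dist = torch.log_softmax(model(persona_emb), dim=-1)
            # KL divergence between the actual class distribution and the model's prediction.
            loss = F.kl_div(pred_log_dist, target_dist, reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```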
Obfuscator. Recall that the obfuscator needs to take as input the non-obfuscated user personas. We use the training and testing user personas in the dataset collected for the surrogate model as the non-obfuscated user personas, and train the obfuscator to generate obfuscated user personas that maximize privacy (see Section 4.3.4). Specifically, we train the obfuscator against the surrogate model for 50 epochs, where all the training user personas are used once at each epoch. After that, we use the testing user personas to evaluate the obfuscator against both the surrogate model and the real-world YouTube recommendation system, and report the average privacy metrics (P and P_Norm). Note that to evaluate the performance of the obfuscator against YouTube, we construct non-obfuscated and obfuscated user personas to collect real-world data from YouTube (see Section 4.5.2). Moreover, when training the obfuscator, we use the on-policy RL algorithm A2C (Advantage Actor Critic) [130], one of the state-of-the-art on-policy RL algorithms. We choose an on-policy RL algorithm since it fits our application well: the obfuscator (RL agent) needs to keep interacting with the YouTube recommendation system (environment) to improve its policy in an online fashion, due to the dynamics of the YouTube recommendation system.
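Below is a simplified, self-contained sketch of such on-policy training with A2C from stable-baselines3. It is not De-Harpo's exact formulation: the action space is reduced to a small discrete set of candidate video embeddings, the persona update and the toy surrogate are stand-ins, and the reward is the KL divergence between the surrogate's predicted class distributions after and before obfuscation.

```python
# Simplified RL training sketch for the obfuscator using A2C (stable-baselines3).
import numpy as np
import gymnasium as gym
from scipy.stats import entropy                  # entropy(p, q) computes KL(p || q)
from stable_baselines3 import A2C

class ObfuscationEnv(gym.Env):
    def __init__(self, surrogate, personas, candidates, steps_per_episode=10):
        super().__init__()
        self.surrogate, self.personas, self.candidates = surrogate, personas, candidates
        self.steps_per_episode = steps_per_episode
        dim = personas.shape[1]
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(dim,), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(len(candidates))

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.base = self.personas[self.np_random.integers(len(self.personas))]
        self.state, self.t = self.base.copy(), 0
        return self.state.astype(np.float32), {}

    def step(self, action):
        # Injecting an obfuscation video is approximated by blending its embedding into the persona.
        self.state = 0.9 * self.state + 0.1 * self.candidates[action]
        reward = float(entropy(self.surrogate(self.state), self.surrogate(self.base)))
        self.t += 1
        done = self.t >= self.steps_per_episode
        return self.state.astype(np.float32), reward, done, False, {}

# Toy stand-ins so the sketch runs end to end; in De-Harpo these would be the trained
# surrogate model, the training personas, and the obfuscation-video embeddings.
rng = np.random.default_rng(0)
def toy_surrogate(embedding):
    p = np.abs(embedding[:154]) + 1e-6
    return p / p.sum()

env = ObfuscationEnv(toy_surrogate, rng.normal(size=(100, 384)), rng.normal(size=(20, 384)))
agent = A2C("MlpPolicy", env, verbose=0)
agent.learn(total_timesteps=5_000)
```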
Denoiser. As described in Section 4.5.2, we create a dataset with 1,800 samples to train and test the denoiser, where 80% of the samples are used for training and 20% are used for testing. Specifically, the denoiser is trained via stochastic gradient descent to minimize the KL divergence between the output of the denoiser Ĉ^u, i.e. the estimated recommended video class distribution of a non-obfuscated user persona, and the actual distribution C^u. We train the denoiser for 50 epochs, where all the training samples are used once at each training epoch. We test the denoiser using the remaining 20% of the samples and report the average utility metrics (U_Gain^Norm and U_Loss).
Note that in all cases, i.e., whether we test De-Harpo on the sock puppet based persona dataset, the Reddit user persona dataset, or the real-world YouTube user dataset, we use the De-Harpo models trained on the sock puppet dataset.
4.6.1 Baselines
Obfuscator. We compare the privacy-enhancing performance of the De-Harpo obfuscator with three baselines:
1) Rand-Obf: At each obfuscation step, we randomly select one obfuscation video from the obfuscation video set, and the probability of selecting each obfuscation video is equal to 1/M, where M is the total number of obfuscation videos in the set.
2) Bias-Obf: At each obfuscation step, we randomly select one obfuscation video from the obfuscation video set. However, the probability of selecting each obfuscation video is proportional to the reward triggered by that obfuscation video. To create such a non-uniform distribution, we first use Rand-Obf to randomly select obfuscation videos and then record the reward after injecting them into non-obfuscated user personas. We repeat this experiment for 50 epochs, count the cumulative reward of each obfuscation video, normalize it by the sum of the cumulative rewards of all obfuscation videos, and use the normalized rewards as the non-uniform probability distribution (see the sketch after this list).
3) PBooster-Obf: At each obfuscation step, we select one obfuscation video from the obfuscation video
set which can maximize the reward for the current step based on the greedy algorithm PBooster proposed
in [219].
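The selection rules of the first two baselines can be sketched as follows; the `rewards` array is assumed to hold the cumulative reward recorded for each obfuscation video during the Rand-Obf warm-up runs described above.

```python
# Sketch of the Rand-Obf (uniform) and Bias-Obf (reward-weighted) selection rules.
import numpy as np

rng = np.random.default_rng()

def rand_obf_select(num_videos):
    # Uniform choice: each of the M obfuscation videos has probability 1/M.
    return rng.integers(num_videos)

def bias_obf_select(rewards):
    # Non-uniform choice: probability proportional to each video's cumulative reward.
    probs = np.asarray(rewards, dtype=float)
    probs = probs / probs.sum()
    return rng.choice(len(probs), p=probs)
```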
Denoiser. We compare the utility-preserving performance of the denoiser in De-Harpo with a baseline that uses the same architecture as the surrogate model to predict C^u directly from a non-obfuscated user persona V^u, without taking the obfuscated persona V^o and the associated recommended video class distribution C^o as inputs. We refer to this baseline as Surro-Den. Ideally, if the surrogate model were a perfect replication of YouTube's recommendation system, then users could directly use it to get recommended videos based on their non-obfuscated user personas. Clearly this is unrealistic in practice, since the surrogate model does not have access to the complete universe of YouTube videos, which is updated constantly, and the model is merely an approximation of the actual YouTube recommendation system.
For convenience, we denote the De-Harpo obfuscator and De-Harpo denoiser by De-Harpo-Obf and De-Harpo-Den respectively in the rest of this chapter.
4.7 Evaluation
In this section, we evaluate the effectiveness of De-Harpo from six perspectives: privacy, utility, overhead,
stealthiness, robustness to de-obfuscation, and personalization.
4.7.1 Privacy
We first evaluate the effectiveness of De-Harpo in enhancing privacy using three user persona datasets,
and report the results in TABLE 4.1. Note that we test De-Harpo-Obf and other obfuscator baselines
against the real-world YouTube recommendation system.
As shown in TABLE 4.1a, De-Harpo-Obf can trigger 0.91 KL divergence in the recommended video class distribution after obfuscation (P) on sock puppet based personas, which translates into triggering 41.63% of the maximum possible KL divergence in the recommended video class distribution (P_Norm). Compared with the other baselines, De-Harpo-Obf increases P_Norm by up to 2.01× and at least 1.33×. Similarly, on Reddit user personas, De-Harpo-Obf outperforms all baselines by up to 1.57× and at least 1.32×, as reported in TABLE 4.1b.

Moreover, we evaluate whether the effectiveness of De-Harpo in enhancing privacy transfers to real-world user personas. Specifically, we use the same obfuscator trained on sock puppet based personas to inject obfuscation videos into real-world users' video watch histories, and then test it against YouTube. As reported in TABLE 4.1c, De-Harpo-Obf can trigger 87.23% of the maximum possible KL divergence in the recommended video class distribution (P_Norm), which outperforms all baselines against YouTube by up to 1.92× and at least 1.58× in terms of P_Norm.
Obfuscator    Rand-Obf    Bias-Obf    PBooster-Obf    De-Harpo-Obf
P             0.71        0.70        0.81            0.91
P_Norm        21.55%      20.76%      31.24%          41.63%
(a) Using sock puppet based personas (D_Min: 0.49, D_Max: 1.51).

Obfuscator    Rand-Obf    Bias-Obf    PBooster-Obf    De-Harpo-Obf
P             1.05        1.07        1.13            1.30
P_Norm        48.79%      50.99%      57.84%          76.49%
(b) Using Reddit user personas (D_Min: 0.60, D_Max: 1.51).

Obfuscator    Rand-Obf    Bias-Obf    PBooster-Obf    De-Harpo-Obf
P             0.98        1.00        1.05            1.39
P_Norm        45.45%      48.01%      55.34%          87.23%
(c) Using real-world user personas (D_Min: 0.53, D_Max: 1.51).

Table 4.1: Privacy evaluation results against YouTube w.r.t. P and P_Norm.
4.7.2 Utility
Next, we evaluate the effectiveness of De-Harpo in preserving user utility. TABLE 4.2a reports our evaluation results in terms of U_Loss and U_Gain^Norm using sock puppet based personas. Compared with Surro-Den, De-Harpo-Den achieves on average 26% better performance in terms of decreasing U_Loss (i.e. increasing U_Gain^Norm). Recall that, different from Surro-Den, De-Harpo-Den also takes as inputs the obfuscated user persona V^o and the associated recommended video class distribution C^o, which comes directly from the actual YouTube system. In contrast, the surrogate model is merely a "first-order" model of the actual, quite complex YouTube system. We also evaluate the effectiveness of De-Harpo-Den in preserving user utility using both Reddit user personas and real-world users. As reported in TABLES 4.2b-4.2c, De-Harpo-Den consistently preserves utility well, reducing the utility loss by 93.80% and 90.40%, respectively.

It is worth noting that the effectiveness of the denoiser in preserving utility does not depend on the effectiveness of the obfuscator in enhancing privacy. As shown in Tables 4.2a-4.2c, the same denoiser achieves almost the same utility loss U_Loss under different obfuscators, which implies the denoiser does not need to sacrifice privacy in order to preserve utility. We discuss the privacy-utility tradeoff in the next subsection.
Obfuscator      Surro-Den        De-Harpo-Den
Rand-Obf        0.60 / 50.91%    0.54 / 79.09%
Bias-Obf        0.60 / 49.06%    0.53 / 82.08%
PBooster-Obf    0.60 / 66.14%    0.53 / 86.83%
De-Harpo-Obf    0.60 / 74.59%    0.53 / 90.35%
(a) Using sock puppet based personas (D_Min: 0.49).

Obfuscator      Surro-Den        De-Harpo-Den
Rand-Obf        0.68 / 83.26%    0.64 / 91.18%
Bias-Obf        0.68 / 83.98%    0.66 / 88.96%
PBooster-Obf    0.68 / 85.88%    0.65 / 90.46%
De-Harpo-Obf    0.68 / 89.32%    0.65 / 93.80%
(b) Using Reddit user personas (D_Min: 0.60).

Obfuscator      Surro-Den        De-Harpo-Den
Rand-Obf        0.66 / 70.79%    0.62 / 81.12%
Bias-Obf        0.66 / 72.34%    0.61 / 82.34%
PBooster-Obf    0.66 / 76.99%    0.61 / 85.31%
De-Harpo-Obf    0.66 / 84.78%    0.61 / 90.40%
(c) Using real-world user personas (D_Min: 0.53).

Table 4.2: Utility evaluation results w.r.t. U_Loss and U_Gain^Norm. Each cell reports U_Loss / U_Gain^Norm for the corresponding obfuscator (row) and denoiser (column).
4.7.3 Varying the Obfuscation Budget
So far, the obfuscation budget α has been set to 0.2 in our evaluation. To evaluate how the obfuscation budget (i.e. the fraction of obfuscation videos in a user persona) affects the performance of De-Harpo, we increase the value of α and evaluate how the performance of De-Harpo changes w.r.t. both privacy (P_Norm) and utility (U_Loss). We use the sock puppet based persona dataset and consider three baselines: Rand-Obf/De-Harpo-Den (i.e. the combination of Rand-Obf and the De-Harpo denoiser), Bias-Obf/De-Harpo-Den (i.e. the combination of Bias-Obf and the De-Harpo denoiser), and PBooster-Obf/De-Harpo-Den (i.e. the combination of PBooster-Obf and the De-Harpo denoiser).
Privacy-utility tradeoff. Figure 4.6 shows the privacy-utility tradeoff between P_Norm and U_Loss as α varies over {0.2, 0.3, 0.5}, where the top left region corresponds to both high privacy and high utility. We observe that, with De-Harpo-Den, the utility loss caused by different obfuscators can be significantly reduced without sacrificing privacy. Note that since our denoiser is designed to operate after obfuscation, it does not hurt the performance of the obfuscator. Moreover, with De-Harpo-Den, the utility loss remains almost the same as we keep increasing the obfuscation budget to get higher privacy. For example, compared with baselines that do not use De-Harpo-Den, De-Harpo can reduce the utility loss by 2.12× when α = 0.5. Note that without De-Harpo-Den, the obfuscator needs to sacrifice utility (higher utility loss) to achieve higher privacy. This is a key difference between De-Harpo and prior works that consider the privacy-utility tradeoff (see Section 4.9).
Obfuscation budget and privacy level. Recall that we use the recommended video class distribution as a proxy for a user profile (see Section 4.3.3). To evaluate whether De-Harpo can make a user profile look almost random, we increase the obfuscation budget beyond 0.5, aiming to achieve a P_Norm value as close to 100% as possible. As shown in Figure 4.7, for α equal to 0.7 (i.e. 70% of the videos in a user persona are obfuscation videos), P_Norm reaches 92.95%, which means the on-average (averaged over all users) divergence between the recommended video class distributions before and after obfuscation is 93% of the on-average divergence between the recommended video class distributions of two random users. It is also worth noting that for real-world user personas, P_Norm gets very close to 100% with α merely equal to 0.5. Hence, we conclude that De-Harpo can achieve meaningful privacy for practical obfuscation budgets α.
Figure 4.6: Privacy-utility tradeoff w.r.t. P_Norm and U_Loss under different obfuscation budgets α. Rand-Obf/De-Harpo-Den represents the combination of the Rand-Obf obfuscator and the De-Harpo denoiser, Bias-Obf/De-Harpo-Den the combination of the Bias-Obf obfuscator and the De-Harpo denoiser, and PBooster-Obf/De-Harpo-Den the combination of the PBooster-Obf obfuscator and the De-Harpo denoiser. The top left of the figure represents both high privacy and high utility.
4.7.4 Overhead
Obfuscation budget and overhead. The larger the obfuscation budget, the larger the overhead, as more obfuscation videos need to be injected into the video watch history. Not surprisingly, as shown in Figure 4.6, with increasing obfuscation budget α, privacy (P_Norm) increases for all obfuscators. That said, De-Harpo can increase privacy with a smaller obfuscation budget than the rest. Specifically, with α = 0.2, De-Harpo achieves the same level of privacy as the other baselines achieve with α = 0.5. That is, De-Harpo can be as effective as the baseline obfuscators in terms of enhancing privacy with 2.5× less obfuscation budget.
System overhead. We evaluate the system overhead of De-Harpo in terms of CPU and memory usage and the video page load time, using an Intel i7 workstation with 64GB RAM on a campus WiFi network. As described in Section 4.4.6, De-Harpo consists of an obfuscator component that always runs in the background and a denoiser component that only runs when the user visits the YouTube homepage. We report their overhead separately below.

1) Obfuscator: We select an obfuscation budget α from {0.0, 0.2, 0.3, 0.5}, where α = 0.0 is used as the baseline (i.e. no obfuscation videos). For each obfuscation budget α, we construct 10 user personas with 15 user videos each, and the browser extension visits 15 · α obfuscation videos in the background. We find that the increased CPU usage is less than 5% and the increased memory usage is less than 2%, even for obfuscation budget α = 0.5. Moreover, the change in video page load time of user videos is less than 2% as α increases. Hence, we conclude that the obfuscator component in De-Harpo has a negligible impact on the user experience overall.
2) Denoiser: The YouTube homepage load time with De-Harpo is 1.79 seconds, which represents just a 37.8 millisecond increase compared to the homepage load time without De-Harpo. Specifically, it takes less than 24.6 milliseconds to get the "noisy" recommended videos from the homepage, 13.0 milliseconds for the denoising module to get the "clean" recommended videos, and 0.2 milliseconds to show these videos in the homepage. In terms of CPU and memory usage, the denoiser of De-Harpo increases them by 27.1% and 2.2% respectively, which is mainly due to running the ML model in the denoising module. Note that the increase in CPU usage (from 12.9% to 40.0%) lasts for just 13 milliseconds while the ML model runs, and CPU usage returns to its normal level right after that. It is worth noting that the aforementioned measurements are conducted for the live version of De-Harpo. In practice, we can reduce the overhead even further by implementing a cached version of De-Harpo, which caches the YouTube homepage periodically in the background and simply shows the cached homepage when the user navigates to the YouTube homepage. Hence, we conclude that the denoiser component in De-Harpo has a negligible impact on the user experience overall.
Obfuscator      α = 0.2       α = 0.3       α = 0.5
Rand-Obf        (4%, 99%)     (5%, 92%)     (5%, 92%)
Bias-Obf        (7%, 72%)     (5%, 81%)     (36%, 94%)
PBooster-Obf    (19%, 86%)    (16%, 93%)    (49%, 88%)
De-Harpo-Obf    (67%, 98%)    (73%, 99%)    (74%, 99%)

Table 4.3: Stealthiness evaluation results under different obfuscation budgets α with 5% De-Harpo users. We choose α from {0.2, 0.3, 0.5} and report (Precision, Recall) of the adversarial detector for different obfuscators.
Figure 4.7: Privacy level P_Norm (%) vs. obfuscation budget α.

Figure 4.8: Precision of the adversarial detector (%) vs. the percentage of De-Harpo users under α = 0.5.
4.7.5 Stealthiness
In this subsection, we evaluate whether an adversary can train an ML classifier to accurately detect the
usage of obfuscators. We use the precision and recall of this adversarial detector to measure stealthiness
of obfuscation. If the detector achieves high precision and recall, then it means that an obfuscator is less
stealthy. Specifically, the input of the adversary is a user persona consisting of a sequence of videos and
the binary output indicates whether or not the user persona contains at least one obfuscation video.
We train the adversarial classifier via supervised learning. To create the labeled dataset, we use the
same set of non-obfuscated and obfuscated sock puppet based user personas used for evaluation in Section
4.7.4 as inputs, and assign the corresponding labels to the personas (0: non-obfuscated, 1: obfuscated). For
each obfuscator and obfuscation budget α, we get a balanced training dataset with 1,440 obfuscated personas and the corresponding 1,440 non-obfuscated personas. After training, we use an unbalanced dataset
with 5% obfuscated personas (a total of 360 obfuscated personas and 6,840 non-obfuscated personas) to
test the detector, since only a small fraction of YouTube users are expected to employ De-Harpo.
Table 4.3 reports the testing precision and recall of the adversarial detector under different α values. We observe that as α increases, both the precision and recall of the detector also increase. This is expected, as a larger α means more obfuscation videos, which makes it easier for the adversarial detector to distinguish obfuscated personas from non-obfuscated personas.
Not surprisingly, Rand-Obf is the most stealthy obfuscator since it injects obfuscation videos randomly.
De-Harpo-Obf, which injects obfuscation videos that introduce new user interests to confuse YouTube,
can still achieve reasonable stealthiness even when α = 0.5. Specifically, it leads to 74% precision (36%
false positive rate) even with α = 0.5. Note that the high false positive rate presents a major obstacle to deploying the adversarial detector due to the base-rate fallacy [220]. We further vary the percentage of De-Harpo users over all YouTube users to show how the precision of the adversarial detector changes as we go from a very unbalanced dataset to a perfectly balanced one. As shown in Figure 4.8, as the percentage of De-Harpo users varies from 1% to 50%, the adversarial detector's precision increases, as expected. However, it is unlikely in practice that a large fraction of YouTube users will use obfuscation measures. And, even in the case of a balanced dataset, a 2% false positive rate still corresponds to tens of millions of users, making the detector prohibitively expensive to deploy. Essentially, the adversarial detector will have to
achieve exceptionally high precision to be useful in practice.
Note that such a binary detector may be used as a first step of the detection workflow. Once the adversary detects the usage of De-Harpo, it may further attempt to de-obfuscate the obfuscated user personas.
That is, the adversary may attempt to identify obfuscation videos in the obfuscated user persona such
that it may remove them to retrieve the non-obfuscated user personas. We evaluate this de-obfuscation
performance of an adversary next.
Obfuscator      α = 0.2       α = 0.3       α = 0.5
Rand-Obf        (62%, 97%)    (67%, 91%)    (69%, 99%)
Bias-Obf        (67%, 89%)    (71%, 89%)    (77%, 92%)
PBooster-Obf    (68%, 93%)    (71%, 90%)    (77%, 94%)
De-Harpo-Obf    (79%, 93%)    (83%, 97%)    (84%, 97%)

Table 4.4: De-obfuscation robustness evaluation results under different obfuscation budgets. We set α ∈ {0.2, 0.3, 0.5} and report (Precision, Recall) of the adversarial detector under different obfuscation approaches.
4.7.6 De-obfuscation Robustness
Once the adversary detects the usage of De-Harpo in a user persona, it can attempt de-obfuscation. To evaluate whether an obfuscator is robust to de-obfuscation, we train a second adversarial detector to distinguish the obfuscation videos from the actual user videos. Specifically, we build a second ML classifier to detect the type of each video (user versus obfuscation video) in each sock puppet based user persona, and use its precision and recall to measure de-obfuscation robustness. Smaller precision and recall represent higher de-obfuscation robustness.
We use the same set of obfuscated personas as in Section 4.7.4 as inputs. For each video in an obfuscated
user persona, we assign a binary label, where 0 indicates that the video was watched by the user and 1 indicates that it was injected by the obfuscator. The detector model takes as input the obfuscated user persona and predicts
a label for each video in the user persona. We use a recurrent neural network (LSTM layer) to model this
adversarial detector.
As shown in Table 4.4, the precision of this adversarial detector is lower than 85%, which means more
than 15% of the obfuscated videos identified by the adversary are false positives (they are actual user
videos). Similar to stealthiness, false positives present a bigger challenge to the adversary in deploying
this detector in practice. Hence, we conclude that De-Harpo is robust to de-obfuscation by an adversary.
Note that while the adversary has lower precision against Rand-Obf and Bias-Obf than against De-Harpo, this is because De-Harpo is 2.5× more effective in preserving privacy (see Section 4.7.4); thus, overall, it is more privacy-preserving.
4.7.7 Personalization
De-Harpo so far is trained to maximize the KL divergence in the recommended video class distribution
after obfuscation, by either increasing or reducing the probability of each video class. However, a YouTube
user may have a list of sensitive video classes (e.g. health or wellness related) for which they do not want the YouTube recommendations to contain any videos after obfuscation (i.e. they want to reduce the probability of these classes to zero).
Motivated by this, we design a mechanism that can treat sensitive video classes and non-sensitive video
classes differently based on user preferences. Without loss of generality, suppose the first L classes of the
recommended video class distribution are non-sensitive and the remaining K − L classes are sensitive. We
then train De-Harpo to maximize the following privacy metric, which aims to treat non-sensitive classes
like before (maximize divergence before and after obfuscation) and eliminate sensitive class videos:
\[
P^{Personalized} = \mathbb{E}\Big[\underbrace{D_{KL}\big(C^{o}_{1:L},\, C^{u}_{1:L}\big)}_{D^{NonSens}_{KL}} \;-\; \lambda \underbrace{D_{KL}\big(C^{o}_{L+1:K},\, [\epsilon]_{L+1:K}\big)}_{D^{Sens}_{KL}}\Big], \tag{4.10}
\]

where [ϵ]_{L+1:K} ∈ R^{K−L} indicates a close-to-zero vector filled with a small positive number ϵ (e.g. 0.0001), and λ > 0 is an adjustable parameter for controlling the relative importance of D_KL^NonSens versus D_KL^Sens. Specifically, the term D_KL^NonSens aims to maximize the distance between the distributions of the non-sensitive classes before and after obfuscation, as we did before for all classes. The term −λ D_KL^Sens aims to minimize the distance between the distribution of the sensitive classes and a distribution of very small probabilities. (Notice that we are somewhat abusing the "distribution" term above, because we do not re-normalize the corresponding probabilities to sum up to 1, as this would (i) de-emphasize the contrast between the patterns of interest and (ii) is not required to meaningfully use the KL divergence formula.)
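For illustration, a direct computation of this personalized metric for a single user could look as follows; following the remark above, the sub-vectors are deliberately not re-normalized, and the small constant added inside the logarithm is only an assumed numerical safeguard.

```python
# Sketch of the personalized privacy metric of Eq. (4.10) for one user.
import numpy as np

def kl(p, q):
    # Unnormalized KL term, matching the convention of not re-normalizing sub-vectors.
    p, q = np.asarray(p, float) + 1e-12, np.asarray(q, float) + 1e-12
    return float(np.sum(p * np.log(p / q)))

def personalized_privacy(c_obf, c_user, L, lam=1.0, eps=1e-4):
    # D_KL^NonSens on the first L (non-sensitive) classes minus lambda * D_KL^Sens on the rest.
    d_nonsens = kl(c_obf[:L], c_user[:L])
    d_sens = kl(c_obf[L:], np.full(len(c_obf) - L, eps))
    return d_nonsens - lam * d_sens
```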
                          D_KL^NonSens       D_KL^Sens
De-Harpo                  1.18               0.26
Personalized De-Harpo     0.81 (↓ 31.36%)    0.05 (↓ 80.77%)

Table 4.5: Personalization results. D_KL^NonSens and D_KL^Sens denote the divergence in non-sensitive classes and sensitive classes respectively.
Table 4.5 reports our evaluation results of personalized De-Harpo against surrogate models, where we select 27 out of the 154 video classes, related to Beauty & Wellness and Sports & Fitness, as sensitive classes. Compared with non-personalized De-Harpo, personalized De-Harpo can reduce the divergence between the sensitive video class distribution and a zero vector (D_KL^Sens) by more than 80%, while still triggering high divergence in the non-sensitive class distribution (D_KL^NonSens).
4.8 Discussion
4.8.1 Ethical Considerations
We outline the potential benefits and harms to the user and the recommendation system. We argue that
the potential benefits of De-Harpo outweigh its potential harms.
Users. De-Harpo provides a clear privacy benefit to its users, especially since platforms such as YouTube do not provide any meaningful control over their tracking and profiling of users. Crucially, De-Harpo is able to enhance privacy while mostly preserving the utility of personalized recommendations. Thus, De-Harpo does not degrade user experience on YouTube. However, users of De-Harpo potentially violate YouTube's Terms of Service (TOS) [221] because YouTube might interpret obfuscation as "fake engagement". Therefore, if a user is signed in to YouTube, their YouTube account might be suspended if YouTube is able to detect De-Harpo's usage (though we showed that YouTube would be unable to do so without risking significant collateral damage). More seriously, the violation of TOS might be considered a possible violation of the Computer Fraud and Abuse Act (CFAA, 18 U.S. Code § 1030) [222]. However, given that De-Harpo users only watch videos that they are authorized to watch (i.e., publicly available videos), we argue that neither the videos injected into the watch history for obfuscation nor the videos injected into the recommendations for repopulation exceed authorized access in a way that could be a violation of CFAA [223].
YouTube. Since De-Harpo aims to preserve utility of recommendations to YouTube users, we argue that it
will not directly hurt user engagement on YouTube. De-Harpo’s obfuscator and denoiser would, however,
contribute to additional traffic to YouTube servers and may have some indirect impact on the effectiveness
of YouTube’s recommendation system, if a large enough portion of the users adopt De-Harpo. We note
that De-Harpo can be applied with a satisfactory privacy vs. utility trade-off as long as only a minority
of YouTube users employ obfuscation tools, which is arguably a realistic expectation. Otherwise, if a
significant fraction of users adopts De-Harpo, the obfuscation may lead to data poisoning, which will
indirectly affect the quality of recommendations for all users. In this case, and in the absence of legal
regulation of tracking and user profiling by YouTube, future research will need to explore an alternative
scalable solution for privacy preservation that is complementary to obfuscation. Overall, as compared
to extant privacy-enhancing obfuscation tools, we conclude that De-Harpo is more favorable since it
specifically aims to preserve utility and user engagement on YouTube.
4.8.2 Ethical Issues Related to Reddit User and Real-world User Personas.
The Reddit dataset is deemed exempt by the IRB, and it is publicly available and pre-crawled at https://files.pushshift.io/reddit/. We will de-identify usernames before public data release. For the YouTube users' dataset, we obtained IRB approval and conducted experiments following the Menlo Report guidelines [224]: users consented to their data being collected for research purposes. We will not publicly release this dataset.
4.8.3 Limitations & Future Work
Side channels. De-Harpo’s stealthiness can be undermined by exploiting various implementation side
channels. For example, YouTube could use the Page Visibility API [151] or the Performance API [152] to detect that obfuscation videos are, unusually, never played in the foreground. However, there are
patches such as wpears [225] to avoid detection. Additionally, the obfuscator plays the obfuscation videos
in full in a background tab while disabling background throttling (or other such optimizations [152, 226])
to prevent detection by such side channels. As another example, the repopulation of recommendations
on the homepage after denoising would entail manipulation of the HTML DOM [227], which might be
detectable. However, such an attack would be infeasible in practice, because the detection approaches
would add an overhead of up to several seconds [154, 155].
Deployment on mobile devices. De-Harpo is currently implemented as a browser extension for desktops. Since browser extensions are not supported on iOS or Android, one option for users to benefit from De-Harpo on their mobile phone is to use Chromium-based browsers that allow extensions [228, 229]. Another option for mobile users is to use a remote desktop utility [230] to access YouTube with De-Harpo on a desktop. Finally, users might still be able to reap the obfuscation benefits of De-Harpo if they deploy the extension on their desktop and are logged in to the same Google account [231] on both their mobile app and the desktop with De-Harpo.
4.8.4 Discussion of Joint Training of Obfuscator and Denoiser
The obfuscator and denoiser in De-Harpo are trained separately, and one might expect their joint training to be much more effective. We experimented with jointly training the obfuscator and the denoiser using multi-objective reinforcement learning. Specifically, we started by training a denoiser model. Then, we trained the obfuscator to maximize privacy against the surrogate model, while minimizing the loss of the denoiser with obfuscated user personas as inputs. After we trained the obfuscator, we retrained the denoiser and repeated the above process until both the obfuscator and the denoiser converged. We found that joint training did not improve privacy or utility because of our use of the surrogate model, instead of YouTube in the wild, for practical reasons: when trained against the surrogate model, the denoiser was able to trivially replicate the surrogate model. While in theory we could jointly train the obfuscator and the denoiser in the wild to avoid this issue, it would not be practical due to its time-consuming nature. Future work can look into hybrid surrogate and in-the-wild joint training of the obfuscator and denoiser.
4.9 Related Work
In this section, we discuss prior work on privacy-enhancing obfuscation in recommendation systems.
One line of prior research focuses on developing privacy-enhancing obfuscation approaches in online
behavioral advertising. These efforts are relevant to our work because online behavioral advertising is
essentially a recommendation system where the advertiser’s goal is to “recommend” personalized ads to
users based on their online activity. However, most of these privacy-enhancing obfuscation approaches
are not designed to preserve the utility (i.e., relevance of personalized ads) [110, 111, 232, 177, 173], as they
generally randomly insert a curated set of obfuscation inputs to manipulate online behavioral advertising.
TrackThis [111] by Mozilla injects a curated list of 100 URLs to obfuscate a user’s browsing profile.
AdNauseam [110] clicks a random set of ads to “confuse” advertisers. One subset of these efforts propose
“pollution attacks” against online behavioral advertising that also serve a dual role as privacy-enhancing
obfuscation [232, 177, 173]. Meng et al. [232] propose a pollution attack that can be launched by publishers to increase their advertising revenue by manipulating advertisers into targeting higher paying ads.
The attack involves the addition of curated URLs into a user’s browsing profile. Degeling et al. [177] and
Kim et al. [173] propose similar attacks but focus on two distinct stages of the online behavioral advertising pipeline: user profiling and ad targeting. Degeling et al. [177] propose an obfuscation approach that
involves adding URLs posted on Reddit into a user’s browsing profile. Kim et al. [173] propose “AdbudgetKiller" that involves adding a sequence of URLs into a user’s browsing profile to trigger retargeted ads,
which are costly for advertisers and waste their advertising budget.
Moving beyond online behavioral advertising, Xing et al. [181] propose pollution attacks against more
general personalized recommendation systems such as YouTube, Amazon, and Google Search. The authors
show that personalized recommendations could easily be manipulated by injecting random or curated
obfuscation inputs. Since the attack’s victim is the user, the work does not take into account the utility of
recommendations to the user. In contrast, De-Harpo is a privacy-enhancing obfuscation system that also
takes into account the utility of the recommendations.
Follow up privacy-enhancing obfuscation systems do attempt to take into account the utility-privacy
trade-off. Beigi et al. [219] propose PBooster, a greedy search approach to obfuscate a user's browsing
profile while also keeping utility in consideration. PBooster employs topic modeling to select a subset
of target topics and corresponding obfuscation URLs to add in a user’s browsing history. Zhang et al.
[178] propose Harpo, a reinforcement learning approach to obfuscate a user’s browsing profile such that a
subset of interest segments are kept while others are modified. Different from Harpo, De-Harpo pairs the
obfuscator with a denoiser to preserve the recommended videos related to the users’ actual interests while
removing the unrelated recommended videos caused by obfuscation. Moreover, the De-Harpo obfuscator
non-trivially adapts Harpo to YouTube, by building 1) a surrogate model with a different embedding model
and loss function for replicating the YouTube recommendation system and 2) an obfuscator model which
selects obfuscation videos based on the similarity between its embedding and the output embedding, such
that it can handle an unlimited and varying number of possible obfuscation videos without requiring
retraining. Huang et al. [233] propose a context-aware generative adversarial privacy (GAP) approach to
train a “privatizer” for privacy-enhancing obfuscation against an adversary who attempts to infer sensitive
information from input data. This approach is used to obfuscate mobile sensor data while navigating the
privacy-utility tradeoff [234, 235, 74]. While in theory we could apply GAP to jointly train the obfuscator and denoiser, in practice training them against YouTube in the wild is prohibitively time consuming due to the iterative nature of GAP, and training them against the surrogate model is ineffective because the denoiser is able to trivially replicate the surrogate model (see Section 4.8.4). Biega et al. [114] propose
a crowd-based obfuscation approach that allows individual users to preserve privacy by scrambling their
browsing profiles via mediator accounts, which are selected such that the personalized recommendations to
these mediator accounts are still coherent and utility-preserving to the users behind each mediator account.
However, this approach requires a collaboration across multiple users of a recommendation system, and
cannot be used by standalone users.
While recent works on privacy-enhancing obfuscation have attempted to balance the privacy-utility tradeoff, they are limited to obfuscating the input to the recommendation system to achieve this balance. Such approaches are fundamentally limited in how much utility they can preserve without undermining privacy, since they only obfuscate the input to the recommendation system (see Figure 4.5). In contrast, De-Harpo employs a two-step approach to this end: it first obfuscates the input to the recommendation system to preserve user privacy, and then attempts to de-obfuscate the output recommendations to preserve utility.
4.10 Conclusion
In this work, we propose De-Harpo, a privacy-enhancing and utility-preserving obfuscation approach for
YouTube’s recommendation system that does not rely on cooperation from YouTube. De-Harpo uses an
obfuscator to inject obfuscation videos into a user’s video watching history and a denoiser to remove the
“noisy” recommended videos thus recovering the initial, unobfuscated recommendations. Our evaluation
results demonstrated that De-Harpo can reduce the utility loss by 2× for the same level of privacy compared to existing state-of-the-art obfuscation approaches. Our work provides a template for implementing
such utility-preserving obfuscation approaches on other similar online platforms, such as TikTok [4] and
Facebook [236].
Part II
Enhancing User Privacy and Utility in Federated Learning with Secure
Aggregation
Chapter 5
Quantifying On-average Privacy Leakage in Federated Learning with
Secure Aggregation via Mutual Information
Federated learning (FL) has attracted growing interest for enabling privacy-preserving machine learning
on data stored at multiple users while avoiding moving the data off-device. However, while data never
leaves users’ devices, privacy still cannot be guaranteed since significant computations on users’ training
data are shared in the form of trained local models. These local models have recently been shown to pose a
substantial privacy threat through different privacy attacks such as model inversion attacks. As a remedy,
Secure Aggregation (SA) has been developed as a framework to preserve privacy in FL, by guaranteeing
the server can only learn the global aggregated model update but not the individual model updates. While
SA ensures no additional information is leaked about the individual model update beyond the aggregated
model update, there are no formal guarantees on how much privacy FL with SA can actually offer, as
information about the individual dataset can still potentially leak through the aggregated model computed
at the server. In this chapter, we perform a first analysis of the formal privacy guarantees for FL with
SA. Specifically, we use Mutual Information (MI) as a quantification metric and derive upper bounds on
how much information about each user’s dataset can leak through the aggregated model update. When
using the FedSGD aggregation algorithm, our theoretical bounds show that the amount of privacy leakage
decreases inversely (on the order of 1/N) with the number of users participating in FL with SA.
To validate our theoretical bounds, we use an MI Neural Estimator to empirically evaluate the MI
privacy leakage under different FL setups on both the MNIST and CIFAR10 datasets. Our experiments
demonstrate a reduction in privacy leakage as the number of users and local batch size grow, and an increase in privacy leakage as the number of training rounds increases. We also observe similar dependencies
for the FedAvg and FedProx protocols.
5.1 Introduction
Federated learning (FL) has recently gained significant interest as it enables collaborative training of machine learning models over locally private data across multiple users, without requiring the users to share their private local data with a central server [237, 39, 238]. The training procedure in FL is typically coordinated through a central server that maintains a global model which is frequently updated locally by the users over a number of iterations. In each training iteration, the server first sends the current global
model to the users. Next, the users update the global model by training it on their private datasets and
then push their local model updates back to the server. Finally, the server updates the global model by
aggregating the received local model updates from the users.
In the training process of FL, users can achieve the simplest notion of privacy in which users keep their
data in-device and never share it with the server, but instead they only share their local model updates.
However, it has been shown recently in different works (e.g., [239, 240, 241]) that this alone is not sufficient
to ensure privacy, as the shared model updates can still reveal substantial information about the local
datasets. Specifically, these works have empirically demonstrated that the private training data of the
users can be reconstructed from the local model updates through what is known as the model inversion
attack.
Figure 5.1: (a) Current and missing privacy guarantees for FL with secure aggregation; (b) privacy leakage vs. number of users in FL with secure aggregation. Panel (a) illustrates the current formal privacy guarantee of FL with SA protocols and sheds light on the missing privacy guarantee on the information leakage from the aggregated model, which is studied in this work. Panel (b) gives a preview of the behavior of the privacy leakage through the global aggregated model for a CNN model as a function of the number of users in FL; the privacy leakage follows an O(1/N) decay, as proved in our theoretical bounds.
To prevent such information leakage from the individual models that are shared during the training
process of FL, Secure Aggregation (SA) protocols have emerged as a remedy to address these privacy concerns by enabling the server to aggregate local model updates from a number of users, without observing
any of their model updates in the clear. As shown in Figure 5.1(a), in each training round, users encrypt
their local model updates before sending them to the server for aggregation. Thus, SA protocols formally
guarantee that: 1) both the server and other users have no information about any user’s clear model update from the encrypted update in the information theoretic sense; 2) the server only learns the aggregated
model. In other words, secure aggregation ensures that only the aggregated model update is revealed to
the server. Note that these SA guarantees allow for its use as a supporting protocol for other privacypreserving approaches such as differential privacy [242]. In particular, these approaches can benefit from
SA by reducing the amount of noise needed to achieve a target privacy level (hence improving the model
accuracy) as demonstrated in different works (e.g., [243, 244]).
However, even with these SA guarantees on individual updates, it is not yet fully understood how much
privacy is guaranteed in FL using SA, since the aggregated model update may still leak information about
an individual user’s local dataset. This observation leads us to the central question that this work addresses:
How much information does the aggregated model leak about the local dataset of an individual user?
In this work, we tackle this question by studying how much privacy can be guaranteed by using FL
with SA protocols. We highlight that this work does not propose any new approaches to tackle privacy
leakage but instead analyzes the privacy guarantees offered by state-of-the-art SA protocols, where updates
from other users can be used to hide the contribution of any individual user. An understanding of this
privacy guarantee may potentially assist other approaches such as differential privacy, such that instead
of introducing novel noise to protect a user’s model update, the randomized algorithm can add noise only
to supplement the noise from other users’ updates to the target privacy level. We can summarize the
contributions of the work as follows.
Contributions. In this work, we provide information-theoretic upper bounds on the amount of information that the aggregated model update (using FedSGD [237]) leaks about any single user’s dataset under an
honest-but-curious threat model, where the server and all users follow the protocol honestly, but can collude to learn information about a user outside their collusion set. Our derived upper bounds show that SA
protocols exhibit a more favorable behavior as we increase the number of honest users participating in the
protocol at each round. We also show that the information leakage from the aggregated model decreases
by increasing the batch size, which has been empirically demonstrated in different recent works on model
inversion attacks (e.g., [239, 240, 241]), where increasing the batch size limits the attack's success rate. Another interesting conclusion from our theoretical bounds is that increasing the model size does not increase the privacy leakage linearly; rather, the leakage depends linearly on the rank of the covariance matrix of the gradient vector at each user.
In our empirical evaluation, we conduct extensive experiments on the CIFAR10 [245] and MNIST [246]
datasets in different FL settings. In these experiments, we estimate the privacy leakage using a mutual
information neural estimator [247] and evaluate the dependency of the leakage on different FL system
parameters: number of users, local batch size and model size. Our experiments show that the privacy
leakage empirically follows similar dependencies to what is proven in our theoretical analysis. Notably,
as the number of users in the FL system increases to 20, the privacy leakage (normalized by the entropy of a data batch) drops below 5% when training a CNN network on the CIFAR10 dataset (see Figure 5.1(b)).
We also show empirically that the dependencies, observed theoretically and empirically for FedSGD, also
extend when using the FedAvg [237] FL protocol to perform multiple local training epochs at the users.
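As a rough illustration of this style of estimator, the following is a generic MINE-style (Donsker-Varadhan) mutual information estimator in PyTorch; the network architecture and training hyperparameters are illustrative choices rather than the exact configuration used in Section 5.5.

```python
# Sketch of a mutual information neural estimator (Donsker-Varadhan lower bound),
# e.g. for I(X; Y) between a user's data batch X and the aggregated model update Y.
import math
import torch
import torch.nn as nn

class MINE(nn.Module):
    def __init__(self, x_dim, y_dim, hidden=256):
        super().__init__()
        self.T = nn.Sequential(nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, hidden), nn.ReLU(),
                               nn.Linear(hidden, 1))

    def lower_bound(self, x, y):
        # E_joint[T(x, y)] - log E_marginal[exp(T(x, y'))], with y' shuffled across the batch.
        joint = self.T(torch.cat([x, y], dim=1)).mean()
        y_shuffled = y[torch.randperm(y.size(0))]
        log_mean_exp = torch.logsumexp(self.T(torch.cat([x, y_shuffled], dim=1)), dim=0) - math.log(y.size(0))
        return (joint - log_mean_exp).squeeze()

def estimate_mi(x, y, epochs=500, lr=1e-4):
    estimator = MINE(x.size(1), y.size(1))
    optimizer = torch.optim.Adam(estimator.parameters(), lr=lr)
    for _ in range(epochs):
        loss = -estimator.lower_bound(x, y)   # maximize the lower bound
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return estimator.lower_bound(x, y).item()
```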
5.2 Preliminaries
We start by discussing the basic federated learning model, before introducing the secure aggregation protocol and its state-of-the-art guarantees.
5.2.1 Basic Setting of Federated Learning
Federated learning is a distributed training framework [238] for machine learning, in which a set of users N = [N] (with |N| = N), each with its own local dataset D_i (∀i ∈ [N]), collaboratively train a d-dimensional machine learning model parameterized by θ ∈ R^d, based on all their training data samples. For simplicity, we assume that users have equal-sized datasets, i.e., |D_i| = D for all i ∈ [N]. The typical training goal in FL can be formally represented by the following optimization problem:

\[
\theta^{*} = \arg\min_{\theta \in \mathbb{R}^{d}} \Big[ C(\theta) := \frac{1}{N} \sum_{i=1}^{N} C_i(\theta) \Big], \tag{5.1}
\]

where θ is the optimization variable, C(θ) is the global objective function, and C_i(θ) is the local loss function of user i. The local loss function of user i is given by

\[
C_i(\theta) = \frac{1}{D} \sum_{(x,y) \in \mathcal{D}_i} \ell_i\big(\theta, (x, y)\big), \tag{5.2}
\]

where ℓ_i(θ, (x, y)) ∈ R denotes the loss function at a given data point (x, y) ∈ D_i. The dataset D_i at user i ∈ [N] is sampled from a distribution P_i.

Figure 5.2: The training process in federated learning.
To solve the optimization problem in Eq. (5.1), an iterative training procedure is performed between the server and the distributed users, as illustrated in Figure 5.2. Specifically, at iteration t, the server first sends the current global model parameters, θ^(t), to the users. User i ∈ [N] then computes its model update x_i^(t) and sends it to the server. After that, the model updates of the N users are aggregated by the server to update the global model parameters into θ^(t+1) for the next round according to

\[
\theta^{(t+1)} = \theta^{(t)} - \eta^{(t)} \frac{1}{N} \sum_{i=1}^{N} x_i^{(t)}. \tag{5.3}
\]

There are two common protocols for computing the model update x_i: FedSGD and FedAvg [238]. Specifically, in FedSGD, each user uses a data batch B_i^(t) of size B, sampled uniformly at random from its local dataset D_i, to compute the model update as follows:

\[
x_i^{(t)} = \frac{1}{B} \sum_{b \in \mathcal{B}_i^{(t)}} g_i\big(\theta^{(t)}, b\big), \tag{5.4}
\]

where g_i(θ^(t), b) is the stochastic estimate of the gradient ∇C_i(θ^(t)) of the local loss function C_i of user i, computed based on a random sample b (corresponding to (x_b, y_b)) drawn uniformly from D_i without replacement. In FedAvg, each user runs E complete local training rounds over its local dataset D_i to get its model update x_i^(t). Specifically, during each local training round, each user uses all of its mini-batches sampled from D_i to perform multiple stochastic gradient descent steps.
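A minimal NumPy sketch of one FedSGD round, following Eqs. (5.3)-(5.4), is shown below; grad_fn is an assumed callable that returns the average stochastic gradient on a batch, standing in for the model-specific gradient computation.

```python
# Sketch of one FedSGD round with server-side averaging (Eq. (5.3)-(5.4)).
import numpy as np

def local_update(theta, local_dataset, grad_fn, batch_size, rng):
    # FedSGD: one stochastic gradient computed on a batch sampled without replacement.
    idx = rng.choice(len(local_dataset), size=batch_size, replace=False)
    batch = [local_dataset[j] for j in idx]
    return grad_fn(theta, batch)

def fedsgd_round(theta, datasets, grad_fn, batch_size=32, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    updates = [local_update(theta, d, grad_fn, batch_size, rng) for d in datasets]
    return theta - lr * np.mean(updates, axis=0)   # average the N updates and apply them
```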
5.2.2 Secure Aggregation Protocols for Federated Learning
Recent works (e.g., [239, 240, 241]) have empirically shown that some of the local training data of user i
can be reconstructed from the local model update xi
, for i ∈ [N]. To prevent such data leakage, different
SA protocols [248, 243, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258] have been proposed to provide a
privacy-preserving FL setting without sacrificing the training performance. In the following, we discuss
the threat model used in these SA protocols.
Threat Model in Secure Aggregation for Federated Learning.
Most SA protocols consider the honest-but-curious model [237], with the adversary's goal being to uncover users' data. In this threat model, the server and users honestly follow the SA protocol as specified. In particular, they will not modify their model architectures to better suit their attack, nor send malicious model updates that do not represent the actually learned model. However, the server and the participating users are assumed to be curious and to try to extract any useful information about the training data of any particular user. The extraction of this information is done by storing and analyzing the different data received during the execution of the protocol.

On the other hand, the threat model in these SA protocols assumes that the server can collude with any subset of users T ⊂ [N] by jointly sharing any data that was used during the execution of the protocol (including their clear model updates x_i, for all i ∈ T) that could help in breaching the data privacy of any target user i ∈ [N] \ T. Similarly, this threat model also assumes that users can collude with each other to get information about the training data of other users.
Secure Aggregation Guarantees. In general, SA protocols that rely on different encryption techniques, such as homomorphic encryption [248, 243, 249, 250] and secure multi-party computing (MPC) [251, 252, 253, 254, 255, 256, 257, 258], are all similar in the encryption procedure, in which each user encrypts its own model update y_i^(t) = Enc(x_i^(t)) before sending it to the server. This encryption is done such that these protocols achieve: 1) correct decoding of the aggregated model under user dropout; and 2) privacy for the users' local model updates given the encrypted updates. In the following, we formally describe each of these guarantees.

Correct decoding. The encryption guarantees correct decoding of the aggregated model of the surviving users, even if a subset U ⊂ [N] of the users drops out during the protocol execution. In other words, the server should be able to decode

\[
\mathrm{Dec}\Big( \sum_{i \in \mathcal{V}} y_i^{(t)} \Big) = \sum_{i \in \mathcal{V}} x_i^{(t)}, \tag{5.5}
\]

where V is the set of surviving users (i.e., U ∪ V = [N] and U ∩ V = ∅).
Privacy guarantee. Under collusion between the server and any strict subset of users T ⊂ [N], we have the following:

\[
I\Big( \{y_i^{(t)}\}_{i \in [N]} \, ; \, \{x_i^{(t)}\}_{i \in [N]} \,\Big|\, \sum_{i=1}^{N} x_i^{(t)}, \, z_{\mathcal{T}} \Big) = 0, \tag{5.6}
\]

where z_T is the collection of information at the users in T. In other words, Eq. (5.6) guarantees that, for a given subset of users T colluding with the server, the encrypted model updates {y_i^(t)}_{i∈[N]} leak no information about the model updates {x_i^(t)}_{i∈[N]} beyond the aggregated model Σ_{i=1}^{N} x_i^(t). We note that the upper bound on the size of the colluding set T such that Eq. (5.6) is always guaranteed has been analyzed in the different SA protocols; assuming that |T| ≤ N/2 is widely used in most of the works (e.g., [255, 258]).
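To make the flavor of such guarantees concrete, the following toy sketch illustrates pairwise additive masking, the core idea behind many MPC-based SA protocols: each pair of users shares a random mask that one adds and the other subtracts, so individual masked updates reveal nothing on their own while the masks cancel in the sum. Real protocols additionally handle dropout, key agreement, and finite-field arithmetic, all omitted here.

```python
# Toy sketch of pairwise additive masking for secure aggregation (no dropout handling,
# no key agreement, real-valued arithmetic instead of a finite field).
import numpy as np

def mask_updates(updates, seed=0):
    rng = np.random.default_rng(seed)
    n, d = updates.shape
    masked = updates.astype(float).copy()
    for i in range(n):
        for j in range(i + 1, n):
            pairwise_mask = rng.normal(size=d)   # shared secret between users i and j
            masked[i] += pairwise_mask           # user i adds the mask
            masked[j] -= pairwise_mask           # user j subtracts it
    return masked

updates = np.random.randn(5, 10)                 # 5 users, 10-dimensional model updates
masked = mask_updates(updates)
# The server only sees `masked`; summing recovers the true aggregate because the masks cancel.
assert np.allclose(masked.sum(axis=0), updates.sum(axis=0))
```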
Remark 1. Recently, there have also been some works that enable secure model aggregation by using Trusted Execution Environments (TEEs) such as Intel SGX (e.g., [259, 260]). SGX is a hardware-based security mechanism to protect applications running on a remote server. These TEE-based works are also designed to give the same guarantee as Eq. (5.6).

In the following, we formally highlight the weakness of the current privacy guarantee discussed in Eq. (5.6).
Our Contribution: Guarantees on Privacy Leakage from the Aggregated Model. Different SA protocols guarantee that the server does not learn any information about the local model update x_i^(t) of any user i from the received encrypted updates {y_i^(t)}_{i∈[N]}, beyond the aggregated model, as formally shown in Eq. (5.6). However, it is not clear how much information the aggregated model update itself leaks about a single user's local dataset D_i. In this work, we fill this gap by theoretically analyzing the following term:

\[
I_{\mathrm{priv/data}} = \max_{i \in [N]} I\bigg( \mathcal{D}_i \, ; \, \Big\{ \frac{1}{N} \sum_{i=1}^{N} x_i^{(t)} \Big\}_{t \in [T]} \bigg). \tag{5.7}
\]

The term in Eq. (5.7) represents how much information the aggregated model over T global training rounds could leak about the private data D_i of any user i ∈ [N]. In the following section, we theoretically study this term and discuss how it is impacted by the different FL system parameters, such as model size, number of users, etc. In Section 5.5, we support our theoretical findings by empirically evaluating I_priv/data on real-world datasets and different neural network architectures.
5.3 Theoretical Privacy Guarantees of FL with Secure Aggregation
In this section, we theoretically quantify the privacy leakage in FL when using secure aggregation with the FedSGD protocol. (The proofs of the theoretical results in this section were carried out by my colleagues Ahmed Roushdy Elkordy and Yahya H. Ezzeldin; my contribution is the empirical evaluation in Section 5.5.)
5.3.1 Main Results
For clarity, we first state our main results under the honest-but-curious threat model discussed in Section
5.2.2 while assuming that there is no collusion between the server and users. We also assume that there is
no user dropout. Later in Section 5.3.2.5, we discuss the general result with user dropout and the collusion
with the server.
Our central result in this section characterizes the privacy leakage in terms of mutual information for a single round of FedSGD, which for round t is defined as

\[
I^{(t)}_{\mathrm{priv}} = \max_{i \in [N]} I\bigg( x_i^{(t)} \, ; \, \sum_{i=1}^{N} x_i^{(t)} \,\bigg|\, \Big\{ \sum_{i=1}^{N} x_i^{(k)} \Big\}_{k \in [t-1]} \bigg) \tag{5.8}
\]
and then extends the privacy leakage bound to multiple rounds. Before stating our main result in Theorem 1 below, we first define two key properties of random vectors that will be used in stating our theorem
and formally state our operational assumptions.
Definition 1 (Independent under whitening). We say that a random vector v with mean µ_v and non-singular covariance matrix K_v is independent under whitening if the whitened vector v̂ = K_v^{-1/2}(v − µ_v) is composed of independent random variables.

Definition 2 (Uniformly σ-log concave). A random vector v with covariance K_v is uniformly σ-log concave if it has a probability density function e^{−ϕ(v)} satisfying ∇²ϕ(v) ⪰ I, and ∃ σ > 0 such that K_v ⪰ σI.
∗Note that the proofs for the theoretical results in this section were conducted by my colleagues Ahmed Roushdy Elkordy and Yahya H. Ezzeldin. My contribution is the empirical evaluation in Section 5.5.
Assumption 1 (IID data distribution). Throughout this section, we consider the case where the local datasets are sampled IID from a common distribution, i.e., the local dataset D_i of user i consists of IID data samples from a distribution P_i, where P_i = P for all i ∈ [N]. This implies that the distribution of the gradients g_i(θ^{(t)}, b), for i ∈ [N], conditioned on the last global model θ^{(t)}, is also IID. For this common conditional distribution, we denote its mean by µ_G^{(t)} and its covariance matrix by K_G^{(t)} in the t-th round.
With the above definitions and using Assumption 1, we can now state our main result below, which is
proved in [190].
Theorem 1 (Single Round Leakage). Let d* ≤ d be the rank of the gradient covariance matrix K_G^{(t)}, and let S_g denote the set of subvectors of dimension d* of g(θ^{(t−1)}, b) that have a non-singular covariance matrix. Under Assumption 1, we can upper bound I_priv^{(t)} for FedSGD in the following two cases:

Case 1. If ∃ ḡ ∈ S_g such that ḡ is independent under whitening (see Def. 1), and E|ḡ_i|⁴ < ∞, ∀i ∈ [d*], then ∃ C_{0,ḡ} > 0 such that

I_priv^{(t)} ≤ C_{0,ḡ} d* / ((N − 1)B) + (d*/2) log( N/(N − 1) ),    (5.9)

Case 2. If ∃ ḡ ∈ S_g such that ḡ is σ-log concave under whitening (see Def. 2), then we have that

I_priv^{(t)} ≤ (d* C_{1,ḡ} − C_{2,ḡ}) / ((N − 1)Bσ⁴) + (d*/2) log( N/(N − 1) ),    (5.10)

where the constants C_{1,ḡ} = 2(1 + σ + log(2π) − log(σ)) and C_{2,ḡ} = 4[ h(ḡ) − (1/2) log(|Σ_ḡ|) ], with Σ_ḡ being the covariance matrix of the vector ḡ.
Remark 2 (Simplified bound). Note that each ḡ ∈ S_g^{(t)} satisfying Case 1 or Case 2 gives an upper bound on I_priv^{(t)}. Let S_{g,c}^{(t)} be the set of ḡ ∈ S_g^{(t)} satisfying either Case 1 or Case 2. Then, we can combine these different bounds in Theorem 1 as follows

I_priv^{(t)} ≤ (d*/2) log( N/(N − 1) ) + min_{ḡ∈S_{g,c}^{(t)}} { d* Ĉ_{1,ḡ} − Ĉ_{2,ḡ} } / ((N − 1)B),    (5.11)

where

(Ĉ_{1,ḡ}, Ĉ_{2,ḡ}) = (C_{0,ḡ}, 0) if ḡ satisfies Case 1, and (Ĉ_{1,ḡ}, Ĉ_{2,ḡ}) = (C_{1,ḡ}/σ⁴, C_{2,ḡ}/σ⁴) if ḡ satisfies Case 2,

where C_{0,ḡ}, C_{1,ḡ} and C_{2,ḡ} are defined as in Theorem 1.
Remark 3 (Why the IID assumption?). Our main result in Theorem 1 relies on recent results on the entropic central limit theorem [261, 262] for the sum of independent and identically distributed random variables/vectors. Note that the IID assumption in the entropic central limit theorem can be relaxed to independent (but not necessarily identical) distributions; however, in that case, the upper bound will have a complex dependency on the moments of the N distributions in the system. In order to highlight how the privacy guarantee depends on the different system parameters (discussed in the next subsection), we opted to consider the IID setting in our theoretical analysis.
Remark 4 (Independence under whitening). One of our key assumptions in Theorem 1 is the independence under whitening assumption for stochastic gradient descent (SGD). This assumption is satisfied if the SGD vector can be approximated by a distribution with independent components or by a multivariate Gaussian vector. Our adoption of this assumption is motivated by recent theoretical results for analyzing the behaviour of SGD. These results have demonstrated great success in approximating the practical behaviour of SGD, in the context of image classification problems, by modeling the SGD with (i) a non-isotropic Gaussian vector [263], or (ii) α-stable random vectors with independent components [264]. For both of these noise models, the independence under whitening assumption in Theorem 1 is valid. However, a key practical limitation of the aforementioned SGD models (and thus of the independence under whitening assumption) is the assumption of a smooth loss function for learning. This excludes deep neural networks that make use of non-smooth activation and pooling functions (e.g., ReLU and max-pooling).
Now using the bounds in Theorem 1, in the following corollary we characterize the privacy leakage of the local training data D_i of user i after T global training rounds of FedSGD, which is defined as

I_priv/data = max_{i∈[N]} I( D_i ; { (1/N) Σ_{i∈[N]} x_i^{(t)} }_{t∈[T]} ).    (5.12)
Corollary 1. Assuming that users follow the FedSGD training protocol and the same assumptions as in Theorem 1, we can derive the upper bound on the privacy leakage I_priv/data after T global training rounds of FedSGD in the following two cases:

Case 1: Following the assumptions used in Case 1 of Theorem 1, we get

I_priv/data ≤ T [ C_{0,ḡ} d* / ((N − 1)B) + (d*/2) log( N/(N − 1) ) ],    (5.13)

Case 2: Following the assumptions used in Case 2 of Theorem 1, we get

I_priv/data ≤ T [ (d* C_{1,ḡ} − C_{2,ḡ}) / ((N − 1)Bσ⁴) + (d*/2) log( N/(N − 1) ) ].    (5.14)
We prove Corollary 1 in [190]. Note that we can combine the bounds in Corollary 1 similarly to the simplification in Eq. (5.11) from Theorem 1.
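As a quick numerical illustration of how the bound in Eq. (5.13) scales with the system parameters, the snippet below evaluates it for several values of N; the constant C_{0,ḡ}, the effective rank d*, and the other values used here are placeholders chosen for illustration only.

import math

def leakage_upper_bound(T, N, B, d_star, C0=1.0):
    # Case-1 bound of Eq. (5.13): T * ( C0*d*/((N-1)B) + (d*/2) log(N/(N-1)) ).
    per_round = C0 * d_star / ((N - 1) * B) + (d_star / 2) * math.log(N / (N - 1))
    return T * per_round

# Placeholder parameters: 30 rounds, batch size 32, effective rank 100, C0 = 1.
for N in [2, 5, 10, 20, 50]:
    print(N, round(leakage_upper_bound(T=30, N=N, B=32, d_star=100), 2))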
5.3.2 Impact of System Parameters
5.3.2.1 Impact of Number of Users (N)
As shown in Theorem 1 and Corollary 1, the upper bounds on information leakage from the aggregated
model update decrease in the number of users N. Specifically, the leakage dependency on N is at a rate of
O(1/N).
5.3.2.2 Impact of Batch Size (B)
Theorem 1 and Corollary 1 show that the information leakage from the aggregated model update could
decrease when increasing the batch size that is used in updating the local model of each user.
5.3.2.3 Impact of Model Size (d)
Given our definition of d* in Theorem 1, where d* represents the rank of the covariance matrix K_G^{(t)} and d* ≤ d (d is the model size), the leakage given in Theorem 1 and Corollary 1 only increases with the rank of the covariance matrix of the gradient. This increase happens at a rate of O(d*). In other words, increasing the model size d (especially when the model is overparameterized) does not have a linear impact on the leakage. The experimental observations in Section 5.5 support these theoretical findings.
5.3.2.4 Impact of Global Training Rounds (T)
Corollary 1 demonstrates that the information leakage from the aggregated model update about the private training data of the users increases with the number of global training rounds. This result reflects the fact that, as training proceeds, the model at the server starts to memorize the training data of the users, and the data of the users is exposed multiple times as T increases; hence the leakage increases. The increase of the leakage happens at a rate of O(T).
5.3.2.5 Impact of User Dropout, Collusion, and User Sampling
In this section, we extend the results given in Theorem 1 and Corollary 1 to cover the more practical FL scenario that considers user dropout, collusion between the server and the users, and user sampling. We start by discussing the impact of user dropout and collusion.
5.3.2.6 Impact of User Dropout and Collusion with the Server.
Note that the case of user dropouts is equivalent to a situation where the non-surviving users send a deterministic update of zero. As a result, their contribution can be removed from the aggregated model, and we can, without loss of generality, consider an FL system where only the surviving subset N_s ⊂ [N] of users participates in the system.
Similarly, when a subset of users colludes with the server, the server can subtract away their contribution to the aggregated model in order to unmask information about its target user i. As a result, we can again study this by considering only the subset of non-colluding (and surviving, if we also consider dropout) users in our analysis. This observation gives us the following derivative of the result in Theorem 1, which can be summarized by the following corollary.
Corollary 2. In FedSGD, under the assumptions used in Theorem 1, if there is only a subset N_s^{(t)} ⊂ [N] of non-colluding and surviving users in the global training round t, then we have the following bound on I_priv^{(t)}:

I_priv^{(t)} ≤ (d*/2) log( |N_s|/(|N_s| − 1) ) + min_{ḡ∈S_{g,c}^{(t)}} { d* Ĉ_{1,ḡ} − Ĉ_{2,ḡ} } / ((|N_s| − 1)B),    (5.15)

where the maximization in I_priv^{(t)} (given in Eq. (5.8)) is only over the set of surviving and non-colluding users, and the constants Ĉ_{1,ḡ} and Ĉ_{2,ḡ} are given in Remark 2.
This implies that the per round leakage increases when we have a smaller number of surviving and
non-colluding users. Similarly, we can modify the bound in Corollary 1 to take into account user dropout
and user collusion by replacing N with |Ns|.
5.3.2.7 Impact of User Sampling
In Theorem 1 and Corollary 1, we assume that all N users in the FL system participate in each training round. If instead K users are chosen in each round, then all leakage upper bounds will be in terms of K, the number of users in each round, instead of N. Furthermore, through Corollary 1, we can develop upper bounds for each user i depending on the number of rounds T_i in which the user participated. For example, taking into account selecting K users in each round, denoted by K^{(t)}, the upper bound in Eq. (5.13) is modified to give the following information leakage for user i:

I_priv/data(i) = I( D_i ; { (1/K) Σ_{i∈K^{(t)}} x_i^{(t)} }_{t∈[T]} ) ≤ T_i [ C_{0,ḡ} d* / ((K − 1)B) + (d*/2) log( K/(K − 1) ) ],    (5.16)

where T_i = TK/N in expectation if the set of K users is chosen independently and uniformly at random in each round.
where Ti = K/N if the set of K users are chosen independently and uniformly at random in each round.
Thus, user sampling would improve the linear dependence of the leakage on T (Section 5.3.2.4), but
increase the per round leakage due to a smaller number of users in each round (Section 5.3.2.1).
5.4 Experimental Setup
5.4.1 MI Estimation
In order to estimate the mutual information in our experiments, we use the Mutual Information Neural Estimator (MINE), which is the state-of-the-art method [247] for estimating the mutual information between two random vectors. Specifically, given random vectors X and Z, and a function family parameterized by a neural network F = {T_θ : X × Z → R}_{θ∈Θ}, the following bound holds:

I(X;Z) ≥ I_Θ(X;Z),    (5.17)

where I_Θ(X;Z) is the neural mutual information measure defined as:

I_Θ(X;Z) = sup_{θ∈Θ} E_{P_{XZ}}[T_θ] − log( E_{P_X ⊗ P_Z}[e^{T_θ}] ),    (5.18)

P_X and P_Z are the marginal distributions of X and Z respectively, P_{XZ} is the joint distribution of X and Z, and P_X ⊗ P_Z is the product of the marginals P_X and P_Z. As an empirical estimate of I_Θ(X;Z), MINE is implemented as

Î(X;Z)_K = sup_{θ∈Θ} E_{P_{XZ}^{(K)}}[T_θ] − log( E_{P_X^{(K)} ⊗ P_Z^{(K)}}[e^{T_θ}] ),    (5.19)

where P_{(·)}^{(K)} is the empirical distribution of P_{(·)} with K IID samples. Finally, solving Eq. (5.19) (i.e., obtaining the MI estimate) can be achieved by solving the following optimization problem via gradient ascent:

Î(X;Z)_K = max_{θ∈Θ} { (1/K) Σ_{k=1}^{K} T_θ(x_k, z_k) − log( (1/K) Σ_{k=1}^{K} e^{T_θ(x_k, z̄_k)} ) },    (5.20)

where (x_k, z_k) is the k-th sample from P_{XZ} and z̄_k is the k-th sample from P_Z.
In our experiments, at the t-th global training round, we use MINE to estimate I( x_i^{(t)} ; Σ_{i=1}^{N} x_i^{(t)} | θ^{(t−1)} ), i.e., the mutual information between the model update of the i-th user, x_i^{(t)}, and the aggregated model update from all users, Σ_{i=1}^{N} x_i^{(t)}. Our sampling procedure is described as follows: 1) at the beginning of global training round t, each user first sets its local model parameters to the global model parameters θ^{(t−1)}; 2) next, each user shuffles its local dataset; 3) then, each user picks a single data batch from its local dataset (if using FedSGD) or uses all local data batches (if using FedAvg) to update its local model; 4) lastly, secure aggregation is used to calculate the aggregated model update. We repeat the above process K times to obtain K samples {( x_{i,k}^{(t)} , Σ_{i=1}^{N} x_{i,k}^{(t)} )}_{k=1}^{K}, where x_{i,k}^{(t)} represents the model update from the i-th user in the k-th sampling and Σ_{i=1}^{N} x_{i,k}^{(t)} represents the aggregated model update from all users in the k-th sampling. Note that we use the K-th (last) sample Σ_{i=1}^{N} x_{i,K}^{(t)} to update the global model.
We repeat the end-to-end training and MI estimation multiple times in order to get multiple MI estimates for each training round t. We use the estimates for each round to report the average MI estimate
and derive the confidence interval (95%) for the MI estimation†
.
Lastly, when using MINE to estimate the MI, we use a fully-connected neural network with two hidden layers of 100 neurons each as T_θ (see Eq. (5.20)), and we perform gradient ascent for 1000 iterations to train the MINE network.
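For concreteness, a minimal sketch of the MINE estimator described above is given below, assuming PyTorch; the network width (two hidden layers of 100 neurons) and the 1000-iteration gradient-ascent schedule follow the setup described in this section, while the optimizer choice, learning rate, and variable names are assumptions of this sketch.

import math
import torch
import torch.nn as nn

class MineNet(nn.Module):
    # T_theta in Eq. (5.20): a fully-connected network with two hidden layers of 100 neurons each.
    def __init__(self, dim_x, dim_z):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_z, 100), nn.ReLU(),
            nn.Linear(100, 100), nn.ReLU(),
            nn.Linear(100, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

def estimate_mi(x_samples, z_samples, iters=1000, lr=1e-3):
    # Maximize the objective in Eq. (5.20) over theta via gradient ascent; returns the MI estimate in nats.
    t_net = MineNet(x_samples.shape[1], z_samples.shape[1])
    opt = torch.optim.Adam(t_net.parameters(), lr=lr)
    K = x_samples.shape[0]
    for _ in range(iters):
        # Joint samples (x_k, z_k); marginal samples (x_k, z_bar_k) obtained by shuffling z.
        z_shuffled = z_samples[torch.randperm(K)]
        joint = t_net(x_samples, z_samples).mean()
        marginal = torch.logsumexp(t_net(x_samples, z_shuffled), dim=0).squeeze() - math.log(K)
        loss = -(joint - marginal)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (joint - marginal).item()

# Example usage with placeholder random data standing in for the K sampled (x_i, sum_i x_i) pairs:
x = torch.randn(500, 50)
z = x + torch.randn(500, 50)
print(estimate_mi(x, z, iters=200))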
5.4.2 Datasets and Models
Datasets. We use the MNIST and CIFAR10 datasets in our experiments. Specifically, the MNIST dataset contains 60,000 training images and 10,000 testing images, with 10 classes of labels. The CIFAR10 dataset contains 50,000 training images and 10,000 testing images, with 10 classes of labels. For each dataset, we randomly split the training data into 50 local datasets of equal size to simulate a total of 50 users with identical data distributions. Note that we describe how we generate users with non-identical data distributions when we evaluate the impact of user heterogeneity in Section 5.5.6.
Moreover, we use MINE to measure the entropy of an individual image in each of these datasets, as an
estimate of the maximal potential MI privacy leakage per image. We report that the entropy of an MNIST
image is 567 (bits) and the entropy of a CIFAR10 image is 1403 (bits). Note that we will use the entropy of
training data to normalize the measured MI privacy leakage in Section 5.5.
†During our experiments, we observe that the estimated MI does not change significantly across training rounds. Hence, we
average the estimated MI across training rounds when reporting our results.
Models for MNIST
Name Linear SLP MLP
Size (d) 7850 7850 89610
Models for CIFAR10
Name Linear SLP CNN
Size (d) 30730 30730 82554
Table 5.1: Models used for MNIST and CIFAR10 datasets. Note that SLP, MLP, and CNN represent Single
Layer Perceptron, Multiple Layer Perceptron, and Convolutional Neural Network, respectively.
Models. Table 5.1 reports the models and their number of parameters used in our evaluation. For the MNIST dataset, we consider three different models for federated learning. Each of these models takes as input a 28×28 image and outputs the probabilities of 10 image classes. We start with a simple linear model, with a dimension of 7850. Next, we consider a non-linear model with the same number of parameters as the linear model. Specifically, we use a single layer perceptron (SLP), which consists of a linear layer and a ReLU activation function (which is non-linear). Finally, we choose a multi-layer perceptron (MLP) with two hidden layers, each of which contains 100 neurons. In total, it has 89610 parameters. Since the MLP model we use can already achieve more than 95% testing accuracy on the MNIST dataset, we do not consider more complicated models for MNIST.
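For concreteness, a minimal sketch of these three MNIST models is given below, assuming PyTorch; the exact placement of the ReLU in the SLP and other implementation details are assumptions of this sketch, but the printed parameter counts match Table 5.1. The CIFAR10 CNN modified from AlexNet is not reproduced here.

import torch.nn as nn

def num_params(model):
    return sum(p.numel() for p in model.parameters())

# Linear model: 784*10 + 10 = 7850 parameters.
linear = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

# SLP: one linear layer plus a ReLU non-linearity; same 7850 parameters as the linear model.
slp = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.ReLU())

# MLP: two hidden layers with 100 neurons each; 89610 parameters in total.
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 10),
)

print(num_params(linear), num_params(slp), num_params(mlp))  # 7850 7850 89610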
For the CIFAR10 dataset, we also evaluate three different models for FL. Each of these models takes as input a 32×32×3 image and outputs the probabilities of 10 image classes. Similar to MNIST, the first two models we consider are a linear model and a single layer perceptron (SLP), both of which contain 30730 parameters. The third model we consider is a Convolutional Neural Network (CNN) modified from AlexNet [265], which contains a total of 82554 parameters and is able to achieve a testing accuracy larger than 60% on CIFAR10. We do not consider larger CNN models due to limited computational resources.
5.5 Empirical Evaluation
In this section, we empirically evaluate how different FL system parameters affect the MI privacy leakage
in SA. Our experiments explore the effect of the system parameters on FedSGD, FedAvg and FedProx [266].
Note that our evaluation results on FedSGD are backed by our theoretical results in Section 5.3, while our
evaluation results on FedAvg and FedProx are purely empirical.
We start by evaluating the impact of the number of users N on the MI privacy leakage for FedSGD, FedAvg and FedProx (Section 5.5.1), followed by the impact of the model size d (Section 5.5.2). Then, we evaluate the impact of the batch size B on the MI privacy leakage for FedSGD, FedAvg and FedProx (Section 5.5.3). Next, in Section 5.5.4, we measure the accumulative MI privacy leakage across all global training rounds. We evaluate how the number of local training epochs E for each user affects the MI privacy leakage for FedAvg and FedProx in Section 5.5.5. Finally, the impact of user heterogeneity on the MI privacy leakage for FedAvg is evaluated in Section 5.5.6.
We would like to preface by noting that FedProx differs from FedAvg by adding a strongly-convex
proximal term to the loss used in FedAvg. Thus, we expect similar dependencies on the number of users
N, batch-size B and local epochs E, when using FedAvg and FedProx.
5.5.1 Impact of Number of Users (N)
FedSGD. Figure 5.3 shows the impact of varying N on the MI privacy leakage in FedSGD, where the number of users is chosen from {2, 5, 10, 20, 50}, and we measure the MI privacy leakage of different models on both the MNIST and CIFAR10 datasets. We observe that increasing the number of users participating in FL using FedSGD decreases the MI privacy leakage in each global training round (see Figures 5.3a and 5.3b), which is consistent with our theoretical analysis in Section 5.3.2.1. Notably, as demonstrated in Figures 5.3c and 5.3d, the percentage of MI privacy leakage (i.e., normalized by the entropy of a data batch) can drop below 2% for MNIST and 5% for CIFAR10 when there are more than 20 users.
(a) Unnormalized MI, MNIST. (b) Unnormalized MI, CIFAR10.
(c) Normalized MI, MNIST. (d) Normalized MI, CIFAR10.
Figure 5.3: Impact of the number of users (N) when using FedSGD. Note that we set B = 32 for all
users on both MNIST and CIFAR10 datasets. We normalize the MI by entropy of a single data batch (i.e.
32 ∗ 567 for MNIST and 32 ∗ 1403 for CIFAR10).
FedAvg. Figure 5.4 shows the impact of varying N on the MI privacy leakage in FedAvg. Similar to the results for FedSGD, as the number of users participating in FedAvg increases, the MI privacy leakage in each global training round decreases (see Figures 5.4a and 5.4b), approximately at a rate of O(1/N). Moreover, as shown in Figures 5.4c and 5.4d, the percentage of MI privacy leakage drops below 0.1% on both MNIST and CIFAR10 when there are more than 20 users participating in FL. It is worth noting that we normalize the MI by the entropy of the whole training dataset in FedAvg instead of the entropy of a single batch, since users iterate over all their data batches to calculate their local model updates in FedAvg.
(a) Unnormalized MI, MNIST. (b) Unnormalized MI, CIFAR10.
(c) Normalized MI, MNIST. (d) Normalized MI, CIFAR10.
Figure 5.4: Impact of the number of users (N) when using FedAvg. Note that we set E=1 and B = 32
for all users on both MNIST and CIFAR10 datasets. We normalize the MI by entropy of the whole local
training dataset (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for CIFAR10).
Therefore, although we observe that the unnormalized MI is comparable for FedSGD and FedAvg, the percentage of MI privacy leakage in FedAvg is significantly smaller than that in FedSGD.
FedProx. Similar to FedAvg, Figure 5.5 shows how the MI privacy leakage with FedProx varies with the number of users N. As the number of users increases, the MI privacy leakage in each training round decreases, approximately at a rate of O(1/N). With more than 20 participating users, the percentage of MI leakage drops below 0.12% on both MNIST and CIFAR10. As in FedAvg, we normalize the MI privacy leakage by the entropy of the whole training dataset of a single user.
(a) Unnormalized MI, MNIST. (b) Unnormalized MI, CIFAR10.
(c) Normalized MI, MNIST. (d) Normalized MI, CIFAR10.
Figure 5.5: Impact of the number of users (N) when using FedProx. Note that we set E=1 and B = 32
for all users on both MNIST and CIFAR10 datasets. We normalize the MI by the entropy of the whole local training dataset (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for CIFAR10).
In conclusion, while our theoretical analysis on the impact of N in Section 5.3.2.1 is based on the
assumption that the FedSGD protocol is used, our empirical study shows that it holds not only in FedSGD
but also in FedAvg and FedProx.
5.5.2 Impact of Model Size (d)
FedSGD. From Figure 5.3, we observe that increasing the model size d increases the MI leakage in each global training round. However, the rate of increase of the MI leakage is smaller than the rate of increase of d. This is expected since the upper bound on the MI privacy leakage is proportional to d* (i.e., the rank of the covariance matrix, as proved in Theorem 1), which does not increase linearly with d, especially for overparameterized neural networks (see Section 5.3.2.3).
(a) Normalized MI, MNIST. (b) Normalized MI, CIFAR10.
Figure 5.6: Impact of batch size (B) when using FedSGD. The MI is normalized by the entropy of a data
batch, which is proportional to the batch size B (i.e. B ∗ 567 for MNIST and B ∗ 1403 for CIFAR10).
Finally, we observe that the MI privacy leakage on CIFAR10 is generally higher than that on MNIST. Since the input images of CIFAR10 have a higher dimension than the images of MNIST, larger models are required during training. Therefore, we expect the MI privacy leakage on CIFAR10 to be higher than that on MNIST.
FedAvg and FedProx. As shown in Figure 5.4 and Figure 5.5, increasing the model size will also have a
sub-linear impact on the increase of the MI privacy leakage in FedAvg and FedProx, which is consistent
with our results in FedSGD.
5.5.3 Impact of Batch Size (B)
FedSGD. Figure 5.6 shows the impact of varying B on the normalized MI privacy leakage in FedSGD, where the batch size is chosen from {16, 32, 64, 128, 256} and we use the MLP model on MNIST and the CNN model on CIFAR10. Note that we normalize the MI by the entropy of the single data batch used in each training round, which is proportional to the batch size B. On both the MNIST and CIFAR10 datasets, we consistently observe that increasing B decreases the MI privacy leakage in FedSGD, and the decay rate of the MI is inversely proportional to the batch size B. As demonstrated in Figure 5.6, when there are more than 20 users, the percentage of MI privacy leakage for a single training round can be around 4% on MNIST and 12% on CIFAR10 with a batch size of 16.
(a) Normalized MI, MNIST. (b) Normalized MI, CIFAR10.
Figure 5.7: Impact of batch size (B) when using FedAvg. The MI is normalized by the entropy of a user’s
local dataset, which is a constant (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for CIFAR10).
(a) Normalized MI, MNIST. (b) Normalized MI, CIFAR10.
Figure 5.8: Impact of batch size (B) when using FedProx. The MI is normalized by the entropy of a user’s
local dataset, which is a constant (i.e. 1200 ∗ 567 for MNIST and 1000 ∗ 1403 for CIFAR10).
However, such leakage can drop to less than 1% on both MNIST and CIFAR10 with a batch size of 256, which is a significant reduction.
FedAvg and FedProx. Figure 5.7 and Figure 5.8 show the impact of varying the batch size B on the MI privacy leakage in FedAvg and FedProx, respectively, following the same experimental setup as in Figure 5.6. Since in both FedAvg and FedProx each user traverses their whole local dataset in each local training round, we normalize the MI by the entropy of the target user's local training dataset. As shown in Figure 5.7 and Figure 5.8, the impact of B in FedAvg and FedProx is relatively smaller than that in FedSGD. However, we can still observe that increasing B decreases the MI privacy leakage in both FedAvg and FedProx.
For example, with 20 users participating in FedAvg, the percentage of MI privacy leakage at each training round drops from 0.8% to 0.3% when the batch size increases from 16 to 256, a reduction in privacy leakage by a factor of more than 2×. Similarly, in FedProx, the MI privacy leakage decreases from 0.09% to 0.04% when the batch size increases from 16 to 256.
In conclusion, we observe that increasing the batch size B decreases the MI privacy leakage from the aggregated model update in FedSGD, FedAvg and FedProx, which verifies our theoretical analysis in Section 5.3.2.2.
5.5.4 Accumulative MI leakage
To evaluate how the MI privacy leakage accumulates with the number of training rounds T, we measure the MI between the training data and the aggregated model updates across training rounds. Specifically, given a local training dataset sample D_i, we concatenate the aggregated model updates { (1/N) Σ_{i∈[N]} x_i^{(t)} }_{t∈[T]} across the T training rounds into a single vector with dimension d·T. By randomly generating D_i for the target user K times, we obtain K concatenated aggregated model update vectors. Then, we use MINE to estimate I( D_i ; { (1/N) Σ_{i∈[N]} x_i^{(t)} }_{t∈[T]} ) with these K dataset and concatenated model update samples.
As illustrated in Figure 5.9, the MI privacy leakage accumulates linearly as we increase the number of global training rounds T on both the MNIST and CIFAR10 datasets, which is consistent with our theoretical results in Section 5.3.2.4. This also implies that reducing the number of model aggregations reduces the MI privacy leakage of secure aggregation. In practice, we can use client sampling to reduce the number of rounds in which each client participates, so that the accumulative MI leakage of individual users is reduced. Moreover, we can increase the amount of local computation per round as much as possible to reduce the number of aggregations of local model updates.
(a) Normalized accumulative MI, MNIST. (b) Normalized accumulative MI, CIFAR10.
Figure 5.9: Accumulative MI privacy leakage on MNIST and CIFAR10 datasets. Note that we normalize the
MI by the entropy of each user’s local dataset, which will not change with T. We use the linear model for
both MNIST and CIFAR10 datasets.
Although the three aggregation algorithms exhibit a similar trend with T, they can have different convergence speeds to a target accuracy. To highlight the effect of the convergence rate on the accumulative MI privacy leakage, we show, in Figure 5.10, how the accuracy changes with the amount of MI leakage incurred for the three algorithms during the training process, up to a maximum of 30 training rounds for FedSGD. We observe that although FedSGD achieves lower MI leakage for a fixed number of rounds (see Figure 5.9), its slow convergence rate makes it suffer from more leakage before reaching a target accuracy. For example, given a target accuracy of 85% on the MNIST dataset, both FedAvg and FedProx achieve the target accuracy with 0.058% and 0.057% leakage, respectively, while FedSGD reaches 85% accuracy only in later rounds, resulting in an accumulative MI leakage of 0.11% (even though its per-round leakage is smaller).
5.5.5 Impact of Local Training Epochs (E)
Figure 5.11 shows the impact of varying the number of local training epochs E on the MI privacy leakage in FedAvg on both the MNIST and CIFAR10 datasets. We select E from {1, 2, 5, 10} and N from {10, 20}, and we consider the MLP model for MNIST and the CNN model for CIFAR10.
(a) MNIST (b) CIFAR
Figure 5.10: Accumulative MI privacy leakage vs model accuracy of different FL algorithms. Note that we
use a linear model for case study and normalize the MI by the entropy of each user’s local dataset.
(a) Normalized MI, MNIST. (b) Normalized MI, CIFAR10.
Figure 5.11: Impact of the number of local training epochs (E) when using FedAvg. We normalize the MI by the entropy
of each user’s local dataset, and we consider N ∈ {10, 20}.
We observe that increasing the number of local training epochs E increases the MI privacy leakage in FedAvg. An intuitive explanation is that with more local epochs, the local model updates become more biased towards the user's local dataset; hence they potentially leak more private information about the users and make it easier for the server to infer an individual model update from the aggregated update. However, as shown in Figure 5.11, increasing the local epochs E does not have a linear impact on the increase of the MI privacy leakage. As E increases, the rate of increase of the MI privacy leakage becomes smaller.
(a) Normalized MI, MNIST. (b) Normalized MI, CIFAR10.
Figure 5.12: Impact of the number of local training epochs (E) when using FedProx. We normalize the MI by the
entropy of each user’s local dataset, and we consider N ∈ {10, 20}.
(a) Normalized MI when E = 1. (b) Normalized MI when E = 5.
Figure 5.13: Impact of user heterogeneity when using FedAvg on non-IID CIFAR10. Note that α = ∞
means that the user data distributions are identical (IID users), and the MI is normalized by the entropy of
a user’s local dataset.
Similar to FedAvg, we observe from Figure 5.12 that the number of local training epochs E has a sub-linear impact on the MI privacy leakage when using FedProx. As mentioned above, this can be attributed to the fact that FedProx is an application of FedAvg with the original loss function plus a convex regularization term.
Figure 5.14: Impact of user heterogeneity when using FedAvg on FEMNIST. Note that the MI is normalized by the entropy of the target user's local dataset, which is 678 ∗ 176.
5.5.6 Impact of Data Heterogeneity
As discussed in Remark 3 of Section 5.3, our theoretical analysis in Theorem 1 considered an IID data distribution across users in order to make use of entropic central limit theorem results in developing our upper bounds on the privacy leakage. In practice, however, the data distribution across users can be heterogeneous. Hence, in this subsection, we analyze the impact of a non-IID (heterogeneous) data distribution across users on the privacy leakage. To measure how user heterogeneity can potentially impact the MI privacy leakage in FedAvg, we consider two different data settings. In the first setting, we create synthetic users with non-IID data distributions following the methodology in [267]. For the second setting, we consider FEMNIST [268], a benchmark non-IID FL dataset extended from MNIST, which consists of 62 different classes of 28×28 images (10 digits, 26 lowercase letters, 26 uppercase letters) written by 3500 users.
In the first, synthetic non-IID data setting, we use a Dirichlet distribution parameterized by α to split the dataset into multiple non-IID local datasets. A smaller α (i.e., α → 0) means that the users' datasets are more dissimilar from each other, while a larger α (i.e., α → ∞) means that the user datasets are more similar to each other. We choose CIFAR10 as the dataset, the CNN as the model, and use FedAvg for a case study with a batch size of B = 32. Note that we do not consider FedSGD since it is not affected by user heterogeneity. During the experiments, we choose the α value from {1, 10, 100, ∞} to create different levels of non-IID user datasets, and we consider N ∈ {2, 5, 10, 20} and E ∈ {1, 5}.
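For reference, a minimal sketch of such a Dirichlet-based split is given below; it follows the common construction where, for each class, a Dirichlet(α) vector decides the fraction of that class assigned to each user. The exact splitting code used in our experiments may differ in its details, and the names below are ours.

import numpy as np

def dirichlet_split(labels, num_users, alpha, seed=0):
    # Split sample indices into num_users non-IID shards using per-class Dirichlet(alpha) proportions.
    rng = np.random.default_rng(seed)
    user_indices = [[] for _ in range(num_users)]
    for c in np.unique(labels):
        class_idx = np.flatnonzero(labels == c)
        rng.shuffle(class_idx)
        # Fraction of class c assigned to each user; smaller alpha -> more skewed shards.
        proportions = rng.dirichlet(alpha * np.ones(num_users))
        cuts = (np.cumsum(proportions)[:-1] * len(class_idx)).astype(int)
        for user, idx in enumerate(np.split(class_idx, cuts)):
            user_indices[user].extend(idx.tolist())
    return user_indices

# Example: 10-class toy labels split across 5 users with alpha = 1 (fairly heterogeneous).
labels = np.random.randint(0, 10, size=50000)
shards = dirichlet_split(labels, num_users=5, alpha=1.0)
print([len(s) for s in shards])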
Figure 5.13 shows how the MI privacy leakage varies with the number of users under different α, where the MI privacy leakage is normalized by the entropy of each user's local dataset. We notice that the MI privacy leakage decreases consistently with the number of users under different α, which empirically shows that our theoretical results in Section 5.3 also hold in the case where users are heterogeneous.
For the second, FEMNIST data setting, we split the dataset by users into 3500 non-overlapping subsets, each of which contains character images written by a specific user. Considering that the size of each subset is small, in order to have enough training data we choose to sample N users at each training round instead of using a fixed set of N users, which simulates the user sampling scenario in FL. Specifically, at the beginning of each FL training round with N participating users, we use the same target user and randomly pick the other N − 1 out of the 3500 users. Note that we consider N ∈ {2, 5, 10, 20, 50} and E ∈ {1, 5}, and use the same model (CNN), batch size (B = 32), and FedAvg algorithm in our evaluation.
Figure 5.14 shows how the MI privacy leakage varies with the number of users. Similar to the synthetic non-IID data setting in Figure 5.13, the privacy leakage decreases with an increasing number of users N.
5.5.7 Practical Privacy Implications
Success of Privacy attacks. To provide insights on how MI translates to practical privacy implications, we conduct experiments using one of the state-of-the-art data reconstruction attacks, i.e., the Deep Leakage from Gradients (DLG) attack from [239], to show how the MI metric reflects the reconstructed image quality of the attack as we vary the system parameters. Specifically, we choose MNIST as the dataset, the same SLP used in Section 5.4.2 as the model, and FedSGD with a batch size of 32 as the training algorithm. For the data distribution across the users, we consider the IID setting. At the end of each training round, each user uses a batch of 32 images to calculate their local gradients, which are securely aggregated by the server.
Figure 5.15: Impact of varying the number of users N, on the reconstructed image quality (PSNR) of the
DLG attack and on the MI privacy leakage.
The DLG attack reconstructs a batch of 32 images from the aggregated gradient, making them as similar as possible to the batch of images used by the target user. After that, we apply the same PSNR (Peak Signal-to-Noise Ratio) metric used in [239] to measure the quality of the reconstructed images compared with the images used by the target user during training. Note that, without loss of generality, we report the PSNR value of the images reconstructed by the DLG attack for the first training round.
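For reference, the PSNR between an original and a reconstructed image can be computed as in the small sketch below (a standard definition of the metric, written as our own illustration and not code from [239]); higher PSNR indicates a closer reconstruction.

import numpy as np

def psnr(original, reconstructed, max_value=1.0):
    # Peak Signal-to-Noise Ratio in dB between two images with pixel values in [0, max_value].
    mse = np.mean((np.asarray(original, dtype=np.float64) -
                   np.asarray(reconstructed, dtype=np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Example with two random 28x28 "images":
a, b = np.random.rand(28, 28), np.random.rand(28, 28)
print(psnr(a, b))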
Figure 5.15 shows the impact of the number of users N on the privacy leakage metric (MI) and on the reconstructed image quality of the DLG attack (PSNR). We pick the image of digit 3 out of the 32 target images as an example of the reconstructed images. We observe that increasing the number of users N decreases the MI metric as well as the PSNR at almost the same rate. This demonstrates that the MI metric used in this work translates well to practical privacy implications.
MI Privacy leakage under the joint use of DP and SA. To highlight the joint effect of differential privacy with secure aggregation, we conduct experiments on the MNIST dataset with a linear model to measure the MI privacy leakage in the presence of centralized DP noise added at the server after SA. Specifically, following [269], we first clip the aggregated model update so that its norm is bounded by C, and then add Gaussian noise with variance σ² to achieve (ϵ, δ)-DP. We set C = 1, δ = 1/1200, and σ = sqrt( 2 log(1.25/δ) ) / ϵ.
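A minimal sketch of this server-side mechanism is shown below (our own illustration of the setup just described; the function and variable names are ours): the aggregated update is clipped to norm C and perturbed with Gaussian noise of standard deviation σ = sqrt(2 log(1.25/δ))/ϵ.

import numpy as np

def central_dp_after_sa(aggregated_update, eps, delta=1 / 1200, C=1.0, seed=0):
    # Clip the securely aggregated update to norm C and add Gaussian noise calibrated to (eps, delta).
    rng = np.random.default_rng(seed)
    norm = np.linalg.norm(aggregated_update)
    clipped = aggregated_update * (C / max(norm, C))
    sigma = np.sqrt(2 * np.log(1.25 / delta)) / eps
    return clipped + rng.normal(scale=sigma, size=clipped.shape)

# Example: a dummy 7850-dimensional aggregated update with weak (eps = 5000) vs. strong (eps = 5) noise.
agg = np.random.randn(7850) * 0.01
print(np.linalg.norm(central_dp_after_sa(agg, eps=5000) - agg))
print(np.linalg.norm(central_dp_after_sa(agg, eps=5) - agg))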
(a) Normalized MI, MNIST. (b) Model accuracy, MNIST.
Figure 5.16: Effects of using DP noise together with SA on MI privacy leakage and model accuracy. Note
that we add DP noise in aggregated model updates after SA.
Figure 5.16a shows the MI privacy leakage for different (ϵ, δ)-DP levels with SA (δ is fixed at 1/1200). As the number of users increases, SA improves the privacy level (measured in terms of MI leakage) for different levels of DP noise, with the effect being most pronounced for the weak DP noise level (ϵ = 5000 in Figure 5.16a). Our experiments also show that as the number of users increases, the gain from using higher DP noise levels diminishes. In particular, with N = 1000 users, the MI leakage levels for ϵ = 5, 10 and 5000 are almost the same; the MI leakage is only reduced from 0.046% to 0.034% when using ϵ = 5 instead of ϵ = 5000. In contrast, we get a reduction from 0.234% to 0.056% when there are N = 2 users.
Importantly, the reduction observed in privacy leakage due to applying additional DP noise results in
a severe degradation in accuracy as seen in Figure 5.16b, whereas privacy improvement gained by having
more users has a negligible effect on the performance of the trained model. For example, consider the case
of 1000 users. One may achieve the same level of privacy in terms of MI leakage (lower than 0.05% MI)
with either (i) (ϵ, δ)-DP with ϵ = 10, which, however, results in unusable model accuracy (less than 50%),
or, (ii) by aggregating the 1000 users and using a tiny amount of DP noise (equivalent to ϵ = 5000), which
achieves a model accuracy higher than 90%.
5.6 Related work
Secure Aggregation in FL. As mentioned, secure aggregation has been developed for FL [237] to provide protection against model inversion attacks and robustness to user dropouts (due to poor connections or unavailability). There has been a series of works that aim at improving the efficiency of the aggregation protocol [251, 252, 253, 254, 255, 258, 256]. This general family of works using secure aggregation prevents learning information about each client's individual model update beyond the global aggregate of updates; however, there has not been a characterization of how much information the global aggregate can leak about an individual client's model and dataset. To the best of our knowledge, in this work we provide the first characterization of the privacy leakage due to the aggregated model, through mutual information, for FL using secure aggregation.
Differential Privacy. One way to protect a client's contributions is to use differential privacy (DP). DP provides a rigorous, worst-case mathematical guarantee that the contribution of a single client does not significantly impact the result of the query. The central application of differential privacy was studied in [270, 271, 269]. This form of central application of DP in FL requires trusting the server with individual model updates before applying the differentially private mechanism. An alternative approach studied in FL for an untrusted server entity is the local differential privacy (LDP) model [272, 273, 274], where clients apply a differentially private mechanism (e.g., using the Gaussian mechanism) locally on their update before sending it to the central server. LDP constraints imply central DP constraints; however, due to the local privacy constraints, LDP mechanisms significantly perturb the input and reduce global utility due to the compounded effect of adding noise at different clients.
In this work, we use a mutual information metric to study the privacy guarantees for the client’s
dataset provided through the secure aggregation protocol without adding differential privacy noise at the
clients. In this case, secure aggregation uses contributions from other clients to mask the contribution of a
single client. We will discuss in Section 5.7 situations where relying only on SA can clearly fail to provide
differential privacy guarantees and comment on the prevalence of such situations in practical training
scenarios.
Privacy Attacks. There have been works that empirically show it is possible to recover some training data from gradient information [275, 248, 41, 241]. Recently, the authors in [240] showed that it is possible to recover a batch of images that were used in the training of a non-smooth deep neural network. In particular, their proposed reconstruction attack was successful in reconstructing different images from the average gradient computed over a mini-batch of data. Their empirical results show that the success rate of the inversion attack decreases with increasing batch size. Similar observations have been demonstrated in subsequent work [241]. In contrast to these works, to the best of our knowledge we are the first to theoretically quantify the amount of information that the aggregated gradient could leak about the private training data of the users, and to understand how the training parameters (e.g., number of users) affect the leakage. Additionally, our empirical results differ from the ones in [275, 248, 41, 241] in the way the leakage is quantified. In particular, we use the MINE tool to abstractly quantify the amount of information leakage in bits instead of counting reconstructed images. We have also empirically studied the effect of the system parameters extensively using different real-world datasets and different neural network architectures.
5.7 Further Discussion and Conclusion
In this work, we derived the first formal privacy guarantees for FL with SA, using MI as a metric to measure how much information the aggregated model update can leak about the local dataset of each user. We proved theoretical bounds on the MI privacy leakage and showed through an empirical study that they hold in practical FL settings. Our concluding observations are that by using FL with SA: 1) the MI privacy leakage decreases at a rate of O(1/N), where N is the number of users participating in FL with SA; 2) increasing the model size does not have a linear impact on the increase of the MI privacy leakage, which only increases linearly with the rank of the covariance matrix of the individual model update; 3) a larger batch size during local training helps reduce the MI privacy leakage. We hope that our findings shed light on how to select FL system parameters with SA in practice to reduce privacy leakage, and provide an understanding of the baseline protection offered by SA in settings where it is combined with other privacy-preserving approaches such as differential privacy.

Figure 5.17: Heatmap of the absolute values of sampled updates from clients 1, 2 and 3 in the counterexample. x_4 and x_4' can be distinguished even after adding the aggregated noise from Σ_{i=1}^{3} x_i.
Can we provide differential privacy guarantees using SA? Note that when using FL with SA, from the point of view of an adversary interested in the data of the i-th user, the aggregated model update from the users in i⁻ = [N]\{i} can be viewed as noise that is independent of the gradient x_i given the last global model. This is very similar to an LDP mechanism for the update x_i^{(t)} of user i that adds noise to x_i^{(t)}. This leads to an intriguing question: Can we get LDP-like guarantees from the securely aggregated updates?
Since DP is interested in a worst-case guarantee, it turns out that there exist model update distributions for which it is impossible to achieve an ϵ < ∞ DP guarantee by using other model updates as noise, as illustrated in Figure 5.17. In this case, the alignment of the sparsity patterns of x_1, x_2 and x_3 allows an adversary to design a perfect detector to distinguish between x_4 and x_4'. We systematically investigate whether the aggregated model update can provide DP guarantees or not in Chapter 6.
Why can our MI privacy guarantee avoid this? Although the previous example illustrates that DP-flavored guarantees are not always possible, in practical scenarios the worst-case distributions for x_1, x_2 and x_3 that enable distinguishing between x_4 and x_4' in Figure 5.17 are an unlikely occurrence during training. For instance, in our theoretical analysis, since users have IID datasets, having the distributions of x_1, x_2 and x_3 restricted to a subspace S_{x_{i⁻}} implies that points generated from x_4 would also belong to S_{x_{i⁻}} almost surely. This is a key reason why we can get a mutual information guarantee in Theorem 1: for an aggregated gradient Σ_{i=1}^{N} x_i where each component is restricted to a common subspace S_x, the aggregate protects the contribution of each individual component x_i as N increases.
In the worst case, where one component is not restricted to the subspace S_x spanned by the remaining components, we get the privacy leakage discussed in the example above. We highlight that, through our experiments and other studies in the literature [276], we observe that such sparsity alignment happens with very low probability. This presents motivation for studying a probabilistic notion of DP that satisfies (ϵ, δ)-DP with probability at least γ, instead of the worst-case treatment in current DP notions, but this is beyond the scope of the current work.
Another interesting future direction is to use the results from this work to provide "privacy metrics" to users to estimate/quantify their potential leakage from participating in a federated learning cohort. Such metrics can be embedded in platforms, such as FedML [277], to guide users in making informed decisions about their participation in federated learning. Finally, it would also be important to extend the results to model aggregation protocols beyond weighted averaging (e.g., in federated knowledge transfer [278]).
Chapter 6
Quantifying Worst-case Privacy Leakage in Federated Learning with
Secure Aggregation via Differential Privacy
In Chapter 5, we extend the privacy guarantees of FL with SA by bounding the information leakage through the aggregate model over multiple training rounds, thanks to leveraging the "noise" from other users' updates. However, the privacy metric used in Chapter 5 (i.e., mutual information) measures the on-average privacy leakage, without providing any privacy guarantees for worst-case scenarios. To address this, in this chapter we study the conditions under which FL with SA can provide worst-case Differential Privacy (DP) guarantees. Specifically, we formally identify the necessary conditions under which SA can provide DP without additional noise. We then prove that when the randomness inside the aggregated model update is Gaussian with a non-singular covariance matrix, SA can provide differential privacy guarantees with the level of privacy ϵ bounded by the reciprocal of the minimum eigenvalue of the covariance matrix. However, we further demonstrate that in practice these conditions are unlikely to hold, and hence additional noise added to the model updates is still required in order for SA in FL to achieve DP. Lastly, we explore the potential of leveraging the inherent randomness inside the aggregated model update to reduce the amount of additional noise required for a DP guarantee.
6.1 Introduction
Federated learning (FL) has garnered considerable attention in recent years due to its ability to facilitate the collaborative training of machine learning models using locally private data from multiple users, eliminating the need for users to share their private local data with a central server [237, 39, 238]. A standard FL system involves a central server that oversees the training of a global model, which is regularly updated locally by the users over multiple rounds. In each round, the server initially distributes the current global model to the users. Subsequently, the users enhance the global model by training it on their respective private datasets and then transmit their local model updates back to the server. The server then updates the global model by aggregating the received local model updates from the users, and so on.
Although communicating the local model updates avoids sharing data directly, it has been shown that the updates can be reverse-engineered to leak information about a user's dataset [239, 240, 241]. To prevent information leakage from individual models, Secure Aggregation (SA) protocols have been employed to enable the server to aggregate local model updates from multiple users without having access to any model update in the clear. In each training round, users encrypt their local updates before sending them to the server for aggregation, such that the server can only learn the aggregated model, thereby preserving the privacy of individual updates. Recently, the work in [190] extended the privacy guarantees for secure aggregation to information-theoretic guarantees on the leakage through the aggregate model over multiple training rounds. However, whether the noise induced through vanilla secure aggregation (treating other participating users' updates as noise) can provide worst-case differential privacy (DP) [193] guarantees without any additional randomness has remained an open problem, as highlighted in [190].
In this work, we target an answer to this question by focusing on formal differential privacy (DP) for secure aggregation (see Figure 6.1). Specifically, we provide theoretical answers to the following:
1. Under what conditions can secure aggregation provide a DP guarantee?
2. If it can, how much differential privacy can it offer?
Figure 6.1: Federated learning with SA and DP guarantees.
3. If not, is it possible to leverage the inherent randomness inside the aggregated model update to reduce the amount of additional noise required for DP?
Contributions. We demonstrate that one necessary condition for the aggregated model update to provide DP guarantees for an individual user's local dataset is that the space of each individual user's model update is included in the space of the aggregated model update from all users; see Section 6.3.3 and Section 6.4 for a detailed explanation and formal proofs. We further prove that in the special case where the randomness inside the aggregated model update is Gaussian with a non-singular covariance matrix, the aggregated model update can provide DP guarantees for an individual user's local dataset without additional noise, where the DP ϵ is bounded by the reciprocal of the minimal eigenvalue of the covariance matrix of the aggregated model update.
Moreover, we demonstrate that in practice the conditions for SA to provide a DP guarantee are unlikely to hold, especially in deep learning implementations where models are over-parameterized. Therefore, additional DP noise is required in order for SA to provide a DP guarantee for an individual user's dataset. Lastly, we investigate the possibility of leveraging the inherent randomness inside the aggregated model update to reduce the amount of additional noise needed for the same level of DP.
6.2 Preliminaries
We start by presenting key definitions for differential privacy that are of interest to this work (for preliminaries related to FL, SA and their privacy guarantees, refer to Section 5.2 in Chapter 5).
6.2.1 Differential Privacy
Differential privacy (DP) has emerged as a reliable mechanism for ensuring data privacy in Federated Learning (FL) through the injection of noise [279]. When using DP, the aggregate model at the server still retains an added noise component, unlike the SA approach where the server recovers a non-noisy version of the aggregate. By carefully controlling the level of noise introduced through the DP mechanism, a provable privacy guarantee can be achieved for the local data of participating users, even when the local or aggregated models are accessible to adversaries. However, the noise required for achieving a high level of privacy may negatively impact the model's performance. In the following, we formally define DP.
Definition 3 (DP [193]). A randomized mechanism M : X → R with domain X and range R satisfies (ϵ, δ)-DP if, for all sets S ⊆ R and for any two adjacent databases D_i, D_i' ∈ X,

Pr[M(D_i) ∈ S] ≤ e^ϵ Pr[M(D_i') ∈ S] + δ.    (6.1)
In Eq. (6.1), ϵ represents the level of privacy, where a smaller ϵ means a higher level of privacy achieved by the mechanism M. On the other hand, δ is a relaxation term which represents the probability that the ratio Pr[M(D_i) ∈ S] / Pr[M(D_i') ∈ S] cannot be bounded by e^ϵ after applying the privacy-preserving mechanism M. Specifically, when δ = 0, we say that the mechanism M satisfies pure differential privacy, i.e., ϵ-DP. When δ > 0, we say that the mechanism M satisfies approximate differential privacy, i.e., (ϵ, δ)-DP. To achieve local differential privacy in FL, the Gaussian mechanism is widely adopted [279]. In particular, in each training round, each user i ∈ [N] clips its model update, clip(x_i, C) = C x_i / max(||x_i||_2, C), where C is a clipping threshold used for bounding the model update of each user. Each user then adds Gaussian noise n with a standard deviation inversely proportional to the privacy level ϵ.
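As a minimal sketch of this per-user clip-then-perturb step (our own illustration; the noise scale sigma is left as a parameter, since its exact calibration to ϵ depends on the privacy accounting used):

import numpy as np

def local_gaussian_mechanism(x_i, C, sigma, rng=None):
    # clip(x_i, C) = C * x_i / max(||x_i||_2, C), followed by additive Gaussian noise.
    rng = rng or np.random.default_rng()
    clipped = C * x_i / max(np.linalg.norm(x_i), C)
    return clipped + rng.normal(scale=sigma, size=x_i.shape)

# Example: clip a user's update to norm 1 and add noise with standard deviation 0.1.
x = np.random.randn(1000)
y = local_gaussian_mechanism(x, C=1.0, sigma=0.1)
print(np.linalg.norm(x), np.linalg.norm(y))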
Another relaxation of ϵ-DP is Rényi Differential Privacy (RDP), which is defined based on the Rényi divergence. We first provide the definition of the Rényi divergence between two probability distributions:

Definition 4 (Rényi Divergence [280]). For two probability distributions P and Q over a continuous space X, and for any order α > 0, α ≠ 1, the Rényi divergence of order α from P to Q is defined as:

D_α(P∥Q) = (1/(α − 1)) log E_{x∼Q}[ (P(x)/Q(x))^α ],

where E_{x∼Q}[·] denotes the expectation with respect to Q. As α approaches 1, D_α(P∥Q) converges to the KL divergence.
Based on Definition 4, the definition of (α, ϵ)-RDP is given below:

Definition 5 ((α, ϵ)-RDP [281]). A randomized mechanism M : X → Y satisfies (α, ϵ)-RDP if for any two adjacent inputs x, x' ∈ X, the following inequality holds:

D_α( M(x) ∥ M(x') ) ≤ ϵ,    (6.2)

where D_α(·∥·) denotes the Rényi divergence of order α between two probability distributions.
Compared with (ϵ, δ)-DP, RDP provides a more accurate and convenient way to analyze and quantify
privacy loss, especially when multiple mechanisms are applied [281]. Moreover, when RDP is satisfied,
(ϵ, δ)-DP can also be guaranteed. Lemma 3 below provides a bridge for converting RDP guarantees into
(ϵ, δ)-DP guarantees, facilitating the application of differential privacy in various settings.
Lemma 3 (From RDP to (ϵ, δ)-DP [281]). A mechanism that satisfies (α, ϵ)-RDP also satisfies ( ϵ + log(1/δ)/(α − 1), δ )-DP for any δ > 0.
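The conversion in Lemma 3 is straightforward to compute; the helper below (our own illustration, with hypothetical example values) returns the (ϵ, δ)-DP guarantee implied by a set of (α, ϵ_RDP) guarantees, optimizing over the available orders α.

import math

def rdp_to_dp(rdp_eps_by_alpha, delta):
    # Convert (alpha, eps_rdp) pairs to the best (eps, delta)-DP guarantee via Lemma 3.
    return min(eps_rdp + math.log(1 / delta) / (alpha - 1)
               for alpha, eps_rdp in rdp_eps_by_alpha.items())

# Example: a mechanism satisfying (2, 0.5)-RDP and (10, 1.2)-RDP, converted at delta = 1e-5.
print(rdp_to_dp({2: 0.5, 10: 1.2}, delta=1e-5))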
6.2.2 Threat Model for FL with SA
Server. We assume that the FL server can only observe the aggregated model updates from all users, without access to the individual model update of any user. We assume that the FL server is honest-but-curious. It is honest in that it will follow the FL and SA protocols without violating them. It is curious since it may be interested in inferring the sensitive local dataset of each individual user from the aggregated model updates.
User. We assume that the users are honest, in that they will follow the FL with SA protocol to calculate local model updates using their local datasets and send them to the server. We assume that the users will not send incorrect model updates to the server.
Privacy goal. Our privacy goal is to achieve record-level DP for each user. That is, if a user changes one record in their local dataset, the server cannot distinguish the new record from the original one by observing the aggregated model updates across multiple training rounds.
6.3 Problem Statement
6.3.1 Motivation
We define the local dataset of user i as D_i, the model update from user i at step t as x_i^{(t)}, and the aggregated model update from all N users excluding user i as x_{−i}^{(t)} = Σ_{j=1, j≠i}^{N} x_j^{(t)}. As demonstrated in Figure 6.2, when SA is used in FL, the server only receives x_i^{(t)} + x_{−i}^{(t)} from the aggregator (i.e., the output of the FL+SA system), without knowing x_i^{(t)} from user i. Since each user conducts data batch sampling when calculating its local model update, the aggregated model update x_i^{(t)} + x_{−i}^{(t)} can be viewed as a random vector, which can provide some privacy protection for user i's local dataset D_i.
Figure 6.2: System model for FL with SA. Note that the input of this system is the users' local datasets {D_i}_{i=1}^{N}, and the output of the system is the aggregated model update Σ_{i=1}^{N} x_i^{(t)}, which is a random vector due to the users' local gradient (i.e., data batch) sampling. The server tries to infer user i's local dataset D_i by observing Σ_{i=1}^{N} x_i^{(t)}.
Moreover, compared with the case where SA is not used and the server knows x_i^{(t)}, user i may add less or even no additional noise locally to achieve the same level of privacy, thanks to leveraging the noise from x_{−i}^{(t)}.
In Chapter 5, we use Mutual Information (MI) [282] as a metric to measure the privacy leakage in
federated learning with secure aggregation. We have demonstrated that the model update from each user
contains random noise, which can offer privacy protection for other users. Hence, as we aggregate more
users, the MI privacy leakage of the individual model update from the aggregated model update will decrease, since more random noise is aggregated.
Formally, the MI between D_i and x_i^{(t)} + x_{−i}^{(t)} = Σ_{j=1}^{N} x_j^{(t)} is defined as follows:

I_priv^{(t)} = max_{i∈[N]} I( D_i ; Σ_{i=1}^{N} x_i^{(t)} | ( Σ_{i=1}^{N} x_i^{(k)} )_{k∈[t−1]} ).    (6.3)
For users with identical and independent data distributions, under the assumption that the model update can be composed of independent random variables after whitening, we prove that

I_priv^{(t)} ≤ C d / ((N − 1)B) + (d/2) log( N/(N − 1) ),    (6.4)

where C is a constant, d is the model size, and B is the mini-batch size at training round t in FedSGD. The inequality (6.4) indicates that the MI between the individual model update and the aggregated model update (i.e., the privacy leakage) decreases roughly inversely with the number of users participating in FL.
However, MI only measures the on-average privacy leakage, without providing any privacy guarantees
for the worst-case scenarios. Since worst-case privacy guarantees are stronger than on-average privacy
guarantees because they guarantee privacy protection even in situations that are less likely to occur, it is
important to investigate how much privacy can leak in the worse-case when combining SA with FL.
Formally, we define the probabilistic distribution of aggregated model update x
(t)
i + x
(t)
−i
at step t as
M{Di∪D−i,θ(t)}
, where:
M{Di∪D−i,θ(t)}
(x) = Pr[x
(t)
i + x
(t)
−i = x|{Di}
i=N
i=1 , θ(t)
], (6.5)
where D−i =
SN
j=1,j̸=i Di
. Suppose Di and D
′
i
are two instances of user i’s local dataset. If there exists
α > 1 and ϵ > 0, such that for any Di and D
′
i
differing from one data point, the following inequality
holds:
Dα(M{Di∪D−i,θ(t)}
||M{D
′
i∪D−i,θ(t)}
) ≤ ϵ, (6.6)
then x
(t)
i + x
(t)
−i
can provide (α, ϵ)-RDP guarantee and hence (ϵ +
log(1/δ)
α−1
, δ)-DP for user i’s dataset Di
.
188
Figure 6.3: Heatmap of the absolute values of sampled updates from users 1, 2 and 3 in the counterexample.
x4 and x
′
4
can be distinguished even adding the aggregated noise from P3
i=1 xi
.
6.3.2 Negative Result for DP
We start by using an counterexample to show that the aggregated model update can not offer DP guarantee.
Specifically, Figure 6.3 demonstrates a worst-case scenario where the last 1/4 elements of model updates
for user 1, 2, 3 (i.e. x1, x2 and x3) are all zeros while the last 1/4 elements of model update from user 4 (i.e.
x4) are non-zero. Under this case, suppose user 4 changes one data point in their local dataset, making x4
become x
′
4
. Then, an adversary can perfectly distinguish Pi=3
i=1 xi + x4 from Pi=3
i=1 xi + x
′
4
. Hence, the
aggregated model update can not provide DP guarantees for user 4’s local dataset.
While the privacy leakage in the worst-case scenarios can be significant, as long as the occurrence
probability of worst-case scenarios is small, a reasonable on-average privacy guarantee (e.g. measured
by MI) can be achieved. This is the key difference between worst-case privacy guarantee and on-average
privacy guarantee.
6.3.3 What We Need for DP Guarantee
Based on the above counterexample, a natural question is under what conditions the above worst case can
be avoided, such that DP can be guarantee. In general, we need the following two conditions to be held:
1. The random noise among the aggregated model update from others users should be independent of
the model update of each individual user.
189
2. The space of each individual user’s model update should be included in the space of the aggregated
model update from all the other users.
The first condition is necessary since it stems from the fundamental “context-unaware" requirements
in DP [193]. The second condition is necessary for having a finite ϵ. Specifically, as demonstrated in
inequality (6.1), for any x
(t)
i
and x
(t)
′
i
calculated from Di and D
′
i
differing from one data point, both
P r[x
(t)
i + x
(t)
−i ∈ X] and P r[x
(t)
′
i + x
(t)
−i ∈ X] should be non-zero. Otherwise, ϵ can be infinite. This
says, the space of x
(t)
i
(i.e. the set consisting of all possible x
(t)
i
) must be included in the space of x
(t)
−i
(i.e.
the set consisting of all possible x
(t)
−i
).
Motivated by the above statements on general conditions for achieving DP, in the next section we
formalize several specific conditions under which we can analytically derive the ϵ bound of DP in FL with
SA.
6.4 Theoretical Results
In this section, we analyze under what conditions FL with SA can provide DP guarantee when using
FedSGD protocol.
6.4.1 Basic Assumption for Gradient Noise
We start by making the following assumption about FedSGD protocol in our analysis, which has been
widely used in FL [39].
190
Assumption 2 (Stochastic gradient with sampling noise and clipping). Assume that each user i at step t
uses its local dataset Di with size |Di
| to compute the model update as follows:
x
(t)
i = G
(t)
i wi =
g
(t)
i,1
· · · g
(t)
i,|Di|
wi,1
.
.
.
wi,|Di|
, (6.7)
where g
(t)
i,j = clip(
∂L(θ
(t)
,di,j )
∂θ(t)
, C) ∈ RK is clipped gradient computed using di,j , K is the dimension of
gradient (i.e. model size), di,j is the j-th data point in user i’s local dataset Di
, clip(x, C) = Cx
max(||x||2,C)
is a
clipping function to bound the norm of vector x by constant C, and wi ∈ R|Di|
is a random vector representing
the sampling noise, which satisfies Ewi
[x
(t)
i
] =
P|Di
|
j=1 g
(t)
i,j
|Di|
. Without loss of generality, we assume |Di
| = D
for all users.
Remark 1: Applicability of Assumption 2. Assumption 2 assumes that the gradient randomness is
due to sampling “noise", which holds in practice. For example, in traditional SGD, user i will randomly
and uniformly sample a mini-batch with size B from its local dataset to calculate the model update at
each step (denoted as IID sampling). In this case, each element in wi will be a random variable satisfying
wi,j ∈ {0,
1
B
} and PD
j=1 wi,j = 1, This says, for user i, we have wi,j =
1
B
for exactly B indices j, and
wi,j = 0 for the other D − B indices. As another example, when user i uses the Gaussian sampling noise
proposed in [283], wi ∼ N(
1
D
,
1
BD ID) is a Gaussian random vector. Depending on the sampling strategy,
the distribution of the gradient noise in SGD will be different.
The other assumption in Assumption 2 is that users will clip their gradients, which is widely used in
privacy-preserving machine learning, see, for example, [269].
191
Based on Assumption 2, from the perspective of user i, given its local dataset Di as input of the FL
with SA system, the output of the system will be the aggregated model update calculated as:
x
(t) =x
(t)
i + x
(t)
−i = G
(t)
i wi +
X
N
j=1,j̸=i
G
(t)
j wj =
X
N
j=1
G
(t)
j wj (6.8)
where w1, ..., wN are independent with each other.
6.4.2 Necessary Condition for DP Guarantee
We assume that x
(t)
i
and x
(t)
′
i
are two model update instances of x
(t)
i
calculated from Di and D
′
i
differing
from one data point respectively. Suppose that Di and D
′
i
differs from the j-th data point, we define the
j-th data point in Di and D
′
i
as di,j and d
′
i,j respectively, and the corresponding gradient calculated from
the j-th data point as g
(t)
i,j and g
(t)
′
i,j respectively. Moreover, we define the span of vectors {v1, .., vK} as
span({v1, .., vK}) = {x1v1 + ... + xKvK|∀ x1, ..., xK ∈ R} .
As motivated in Section 6.3.3, in order for the aggregated model update to guarantee (ϵ, δ)-DP with
bounded ϵ value, for any x
(t)
i
and x
(t)
′
i
, and any X ∈ Range(x
(t)
i +x
(t)
−i
) satisfying P r[x
(t),1
i +x
(t)
−i ∈ X] ≥
δ, both P r[x
(t),1
i + x
(t)
−i ∈ X] and P r[x
(t),2
i + x
(t)
−i ∈ X] should be non-zero. Motivated by the above, we
derive the following theorem.
Theorem 2 (A necessary condition for DP guarantee). Under Assumption 2, if the aggregated model update
x
(t)
i +x
(t)
−i
can provide (ϵ, δ)-DP user i’s dataset Di and δ ≤ Pr[wi,j ̸= 0] > δ, then for any Di and D
′
i
differs
from the j-th data point, we must have g
(t)
′
i,j ∈ span SN
j=1{g
(t)
j,k|∀ k ∈ {1, ..., D}}
.
Proof. Assume that there exists a data point d
′
i,j , such that g
(t)
′
i,j ∈/ span SN
j=1{g
(t)
j,k|∀ k ∈ {1, ..., D}}
when user i changes di,j into d
′
i,j . The aggregated model update when user i has local dataset Di
is
calculated as:
x
(t)
i + x
(t)
−i = g
(t)
i,j wi,j +
X
D
k=1,k̸=j
g
(t)
i,kwi,k +
X
N
j=1,j̸=i
X
D
k=1
g
(t)
j,kwj,k (6.9)
1
and the aggregated model update when user i has local dataset D
′
i
is calculated as:
x
(t)
′
i + x
(t)
−i = g
(t)
′
i,j wi,j +
X
D
k=1,k̸=j
g
(t)
i,kwi,k +
X
N
j=1,j̸=i
X
D
k=1
g
(t)
j,kwj,k (6.10)
Next, we define X as {x
(t)
′
i + x
(t)
−i
|∀ wi,j ̸= 0}. We can derive that Pr[x
(t)
′
i + x
(t)
−i ∈ X] = Pr[wi,j ̸= 0].
Since
g
(t)
′
i,j ∈/ span [
N
j=1
{g
(t)
j,k|∀ k ∈ {1, ..., D}}
,
and
x
(t)
i + x
(t)
−i ∈ span [
N
j=1
{g
(t)
j,k|∀ k ∈ {1, ..., D}}
,
then ∀x ∈ X, we have Pr[x
(t)
i + x
(t)
−i = x] = 0. This says Pr[x
(t)
i + x
(t)
−i ∈ X] = 0. Then, ∀ ϵ ∈ R, we
have Pr[x
(t)
′
i + x
(t)
−i ∈ X] − e
ϵPr[x
(t)
i + x
(t)
−i ∈ X] = Pr[wi,j ̸= 0] ≥ δ. Therefore, (ϵ, δ)-DP cannot be
guaranteed for δ ≤ Pr[wi,j ̸= 0] ∗
.
Theorem 2 states that a necessary condition for aggregated model update to provide DP guarantee for
each user’s local dataset is that any change of one data point in user i’s local dataset will not change the
span of all individual gradients from all users. Since the aggregated model update belongs to the span of all
individual gradients (see Eq. (6.9)), when the server observes that the aggregated model is from a different
span, it can potentially identify or reconstruct the data point which causes this change. Therefore, the
worst-case privacy DP guarantee is violated. Recent works [284] have proposed attacks to elude SA in
FL via model inconsistency across clients or model manipulation. Fundamentally, these attacks work by
making the model updates from different users fall into different vector space and hence the necessary
condition for worst-privacy guarantee in Theorem 2 will be violated, causing privacy leakage.
∗Note that Pr[wi,j ̸= 0] indicates the probability that the j-th data point is sampled to calculate the model update from
user i (e.g. Pr[wi,j ̸= 0] = B
D
in IID sampling and Pr[wi,j ̸= 0] = 1 in Gaussian sampling). Although (ϵ, δ)-DP may hold for
δ > Pr[wi,j ̸= 0], Pr[wi,j ̸= 0] is significantly larger than a meaningful bound for δ [193].
1
6.4.3 Gaussian Sampling Noise with Non-Singular Covariance Matrix
The most common type of sampling noise is IID sampling. Specifically, consider the scenario where each
user i will randomly and uniformly sample a mini-batch with size B from its local dataset to calculate
the model update at each step. We define span
1
B
{v1, .., vK}
= {x1v1 + ... + xKvK|∀ x1, ..., xK ∈
{0,
1
B
} and PK
k=1 xk = 1}. Since in practice, each user has a limited amount of data points in their local
dataset, the aggregated model update x
(t)
i + x
(t)
−i will be a random vector which takes value from a finite
set (i.e. span
1
B
SN
j=1{g
(t)
j,k|∀ k ∈ {1, ..., D}}
). Without assuming the distribution of user i’s dataset, g
(t)
′
i,j
can have infinite amount of possible values. In this case, g
(t)
′
i,j ∈ span
1
B
SN
j=1{g
(t)
j,k|∀ k ∈ {1, ..., D}}
cannot be guaranteed and hence DP is violated based on Theorem 2.
In the rest of this section, we consider a special type of sampling noise called Gaussian sampling noise,
where the weights of individual gradients in model update of each user i are Gaussian. We prove that
a closed-form ϵ bound in DP can be derived for Gaussian sampling noise when the covariance matrix of
sampling noise of each user (or the projected covariance matrix) is non-singular.
Assumption 3 (Gaussian sampling noise). Based on Assumption 2, we assume that each user i uses Gaussian
sampling noise, which satisfies wi =
1
D + √
1
BD ξi and ξi ∼ N(0, ID). Then, the covariance matrix of gradient
noise is derived as:
Σ
(t)
i =
1
D
G
(t)
i
(G
(t)
i
)
T =
1
D
j
X=D
j=1
g
(t)
i,j (g
(t)
i,j )
T
(6.11)
We define the SVD decomposition of covariance matrix as Σ
(t)
i = U
(t)
i
λ
(t)
i
U
(t)
i
T
. Then, the model update
from user i at step t can be computed as
x
(t)
i = ¯g
(t)
i +
1
√
BD
G
(t)
i
ξi = ¯g
(t)
i +
1
√
B
L
(t)
i
vi
, (6.12)
where g¯
(t)
i =
1
D
PD
j=1 g
(t)
i,j , L
(t)
i = U
(t)
i
λ
(t)
i
1
2
, vi ∼ N (0, IK) (K is the dimension of model update),
v1, ..., vN are independent with each other, and B is a scaling factor to adjust the magnitude of the covariance†
.
Based on Assumption 3, the aggregated model update at step t can be written as:
x
(t) =
X
N
j=1
G
(t)
j wj =
X
N
j=1
g¯
(t)
j +
X
N
j=1
1
√
B
L
(t)
i
vj . (6.13)
Moreover, given that v1, ..., vN are independent normal Gaussian, x
(t) will also be Gaussian with mean
PN
j=1 g¯
(t)
j
and covariance matrix
PN
j=1 Σ
(t)
j
B
.
Lemma 4 (Bounded maximal singular value). Based on Assumption 2, for each user i, the maximal singular
value of the gradient covariance matrix of gradient noise Σ
(t)
i
is upper bounded by C
2
.
Proof. ∀ x ∈ RK, we have x
T Σ
(t)
i
x =
1
D
Pj=D
j=1 x
T
(g
(t)
i,j )
T
g
(t)
i,j x =
1
D
Pj=D
j=1 (g
(t)
i,j x)
2
. Since |g
(t)
i,j x|≤
||g
(t)
i,j ||2||x||2 ≤ C||x||2, we have x
T Σ
(t)
i
x ≤ C
2
||x||2
2
. Hence, the maximal singular value of Σ
(t)
i
is upper
bounded by C
2
.
Assumption 4 (Non-singular covariance matrix). We assume that for each user i, the covariance matrix of
gradient noise Σ
(t)
i
is non-singular, with non-zero minimal eigenvalue lower bounded by λ
(t)
i,min > 0.
Theorem 3. Under Assumption 2,3,4, if C2
D <
PN
i=1 λ
(t)
i,min, then ∀ α ∈ (1,
D
PN
i=1 λ
(t)
i,min
C2 ), the aggregated
model update x
(t) at step t can provide (α, ϵ
(t)
i
)-RDP and (ϵ
(t)
i +
log(1/δ)
α−1
, δ)-DP for user i’s dataset Di
, where:
ϵ
(t)
i = (2αBC2
D2
+
αC2
(α − 1)D
)
1
PN
i=1 λ
(t)
i,min −
αC2
D
. (6.14)
†Note that when B is set as the mini-batch size in classical SGD (i.e. take the average of B IID sampled gradient as model
update), the covariance of Gaussian gradient noise will be close to the covariance of classical SGD noise [283].
19
Proof. Under Assumption 2 and 3, based on Eq. (6.5) and Eq. (6.13), for any Di and D
′
i
differing from one
data point, the Rényi divergence between M{Di∪D−i,θ(t)}
(i.e. the probabilistic distribution of aggregated
model update when Di
is used) and M{D
′
i∪D−i,θ(t)}
(i.e. the probabilistic distribution of aggregated model
update when D
′
i
is used) is derived as the Rényi divergence between two Gaussian distribution:
Dα(M{Di∪D−i,θ(t)}
∥M{D
′
i∪D−i,θ(t)}
)
=Dα(N (¯g
(t)
i + ¯g
(t)
−i
,
Σ
(t)
i + Σ(t)
−i
B
)∥N (¯g
(t)
′
i + ¯g
(t)
−i
,
Σ
(t)
′
i + Σ(t)
−i
B
))
(6.15)
Without loss of generality, we assume that Di and D
′
i
differs from the first data point, which are
denoted as di,1 and d
′
i,1
respectively, and the corresponding gradients calcualted by these two data points
are g
(t)
i,1
and g
(t)
′
i,1
. We define ∆¯g
(t)
i = ¯g
(t)
i − g¯
(t)
′
i =
1
D
(g
(t)
i,1 − g
(t)
′
i,1
), Σ1 =
Σ
(t)
i +Σ(t)
−i
B
, Σ2 =
Σ
(t)
′
i +Σ(t)
−i
B
, and
Σα = (1 − α)Σ1 + αΣ2.
Since ∀ x ∈ RK,
x
T Σαx = x
T
((1 − α)Σ1 + αΣ2)x
=x
T Σ1x + x
Tα(Σ2 − Σ1)x
=x
T Σ1x + x
T α
B
(Σ(t)
′
i − Σ
(t)
i
)x
=x
T Σ1x + x
T α
BD
(g
(t)
′
i,1
(g
(t)
′
i,1
)
T − g
(t)
i,1
(g
(t)
i,1
)
T
)x
≥(
PN
i=1 λ
(t)
i,min
B
−
αC2
BD
)x
T x,
(6.16)
and αC2
D <
PN
i=1 λ
(t)
i,min, we know that Σα is positive definite and its minimal eigenvalue is lower bounded
by
PN
i=1 λ
(t)
i,min
B −
αC2
BD . Therefore, based on Table 2 in [280], we have:
Dα(N (¯g
(t)
i + ¯g
(t)
−i
,
Σ
(t)
i + Σ(t)
−i
B
)∥N (¯g
(t)
′
i + ¯g
(t)
−i
,
Σ
(t)
′
i + Σ(t)
−i
B
))
=
α
2
(∆¯g
(t)
i
)
T Σ
−1
α ∆¯g
(t)
i
| {z }
T erm I
−
1
2(α − 1) ln |Σα|
|Σ1|
1−α|Σ2|
α
| {z }
T erm II
.
(6.17)
196
First, we calculate the upper bound for Term I. Since ||∆¯g
(t)
i
||2
2 = || 1
D
(g
(t)
i,1 − g
(t)
′
i,1
)||2
2 ≤
4C2
D2 and Σα has
minimal eigenvalue lower bounded by
PN
i=1 λ
(t)
i,min
B −
αC2
BD , we have:
α
2
(∆¯g
(t)
i
)
T Σ
−1
α ∆¯g
(t)
i ≤
2αC2
D2(
PN
i=1 λ
(t)
i,min
B −
αC2
BD )
.
(6.18)
Next, we calculate the upper bound for Term II. Due to the concavity of ln |X| on positive semi-definite
matrix, we have
ln |Σ1|
1−α|Σ2|
α
|Σα|
= (1 − α) ln |Σ1| + α ln |Σ2| − ln |Σα|
=α(ln |Σ2| − ln |Σ1|) + (ln |Σ1| − ln |Σα|)
≤αtr(Σ−1
1
(Σ2 − Σ1)) + tr(Σ−1
α (Σ1 − Σα))
=αtr(Σ−1
1
(
1
BD
g
(t)
′
i,1
(g
(t)
′
i,1
)
T −
1
BD
g
(t)
i,1
(g
(t)
i,1
)
T
))
+ tr(Σ−1
α (
α
BD
g
(t)
i,1
(g
(t)
i,1
)
T −
α
BD
g
(t)
′
i,1
(g
(t)
′
i,1
)
T
)))
=
α
BD
tr(Σ−1
1
(g
(t)
′
i,1
(g
(t)
′
i,1
)
T
)) −
α
BD
tr(Σ−1
1
(g
(t)
i,1
(g
(t)
i,1
)
T
))
+
α
BD
tr(Σ−1
α (g
(t)
i,1
(g
(t)
i,1
)
T
)) −
α
BD
tr(Σ−1
α (g
(t)
′
i,1
(g
(t)
′
i,1
)
T
))
(6.19)
where tr(X) denotes the trace of matrix X. Given that both g
(t)
i,1
(g
(t)
i,1
)
T
and g
(t)
′
i,1
(g
(t)
′
i,1
)
T
are semi-positive
definite matrix, based on von Neumann’s trace inequality [285], we have:
tr(Σ−1
1
(g
(t)
′
i,1
(g
(t)
′
i,1
)
T
)) ≤
tr(g
(t)
′
i,1
(g
(t)
′
i,1
)
T
)
min
j=1,...,K
λ1,j
≤
C
2
min
j=1,...,K
λ1,j
, (6.20)
tr(Σ−1
1
(g
(t)
i,1
(g
(t)
i,1
)
T
)) ≥
tr(g
(t)
i,1
(g
(t)
i,1
)
T
)
max
j=1,...,K
λ1,j
≥ 0, (6.21)
tr(Σ−1
α (g
(t)
i,1
(g
(t)
i,1
)
T
)) ≤
tr(g
(t)
i,1
(g
(t)
i,1
)
T
)
min
j=1,...,K
λα,j
≤
C
2
min
j=1,...,K
λα,j
, (6.22)
197
tr(Σ−1
α (g
(t)
′
i,1
(g
(t)
′
i,1
)
T
)) ≥
tr(g
(t)
′
i,1
(g
(t)
′
i,1
)
T
)
max
j=1,...,K
λα,j
≥ 0, (6.23)
Therefore, we can derive:
ln |Σ1|
1−α|Σ2|
α
|Σα|
≤
αC2
BD
(
1
min
j=1,...,K
λ1,j
+
1
min
j=1,...,K
λα,j
)
=
αC2
BD
(
1
PN
i=1 λ
(t)
i,min
B
+
1
PN
i=1 λ
(t)
i,min
B −
αC2
BD
)
(6.24)
Combing Eq. (6.18) and Eq. (6.24), we have:
Dα(M{Di∪D−i,θ(t)}
∥M{D
′
i∪D−i,θ(t)}
)
≤
2αC2
D2(
PN
i=1 λ
(t)
i,min
B −
αC2
BD )
+
αC2
2BD(α − 1)(
1
PN
i=1 λ
(t)
i,min
B
+
1
PN
i=1 λ
(t)
i,min
B −
αC2
BD
)
≤(
2αBC2
D2
+
αC2
(α − 1)D
)
1
PN
i=1 λ
(t)
i,min −
αC2
D
= ϵ
(t)
i
.
(6.25)
Hence, the aggregated model update x
(t) provides (α, ϵ
(t)
i
)-RDP for user i’s dataset Di
. Lastly, based on
Lemma 3, it can be derived that the aggregated model update x
(t) provides (ϵ
(t)
i +
log(1/δ)
α−1
, δ)-DP for user
i’s dataset Di
.
Theorem 3 indicates that the privacy bound ϵ in FL with SA mainly depends on two main factors:
1) the number of users participating in FL with SA, and 2) the minimal eigenvalues of the model update
covariance matrix from each user. As we increase the number of users N,
PN
i=1 λ
(t)
i,min will increase and
hence the ϵ will decay. Moreover, when λ
(t)
j,min, i.e., the minimal eigenvalues of any user j’s covariance
matrix increase, PN
i=1 λ
(t)
i,min will also increase and thus ϵ will decrease.
198
However, it is worth noting that Theorem 3 relies on the non-singular covariance matrix assumption
(i.e. Assumption 4), which may not easily hold in practice. Especially in applications using deep neural
networks, the number of training data points is typically smaller than the number of model parameters
(i.e. over-parameterization), and as a result the covariance matrix of the model update will be singular
[286, 287]. Motivated by this, we next consider the case where the sampling noise of each user is Gaussian
and the gradient covariance matrix is singular.
6.4.4 Gaussian Sampling Noise with Singular Covariance Matrix
Before presenting our second theorem, we formally make the following assumption:
Assumption 5 (Non-singular covariance matrix in subspace). Assume that at step t, for any user i, the
gradient g
(t)
i,j calculated by any data point di,j can be mapped into a subspace as g
(t)
i,j = S
(t)
g
∗(t)
i,j , where
S
(t) ∈ RK×K∗
(K > K∗
) is a matrix consisting of K∗
orthogonal unit vector. We further assume that after
being mapped to the subspace, the covariance matrix of gradient noise is Σ
∗(t)
i =
Pj=D
j=1 g
∗(t)
i,j (g
∗(t)
i,j )
T
, which
is non-singular and has non-zero minimal eigenvalue of λ
∗(t)
i,min > 0.
Theorem 4. Under Assumption 2,3,5, if C2
D <
PN
i=1 λ
(t)
i,min, then ∀ α ∈ (1,
D
PN
i=1 λ
∗(t)
i,min
C2 ), the aggregated
model update x
(t) at step t can provide (α, ϵ
∗(t)
i
)-RDP and (ϵ
∗(t)
i +
log(1/δ)
α−1
, δ)-DP for user i’s dataset Di
,
where:
ϵ
∗(t)
i = (2αBC2
D2
+
αC2
(α − 1)D
)
1
PN
i=1 λ
∗(t)
i,min −
αC2
D
. (6.26)
Proof. We define G
∗(t)
i = [g
∗(t)
i,1
, ..., g
∗(t)
i,D ]. Based on Assumption 5, we have:
x
(t) =x
(t)
i + x
(t)
−i =
X
N
i=1
G
(t)
i wj = S
(t)
X
N
i=1
G
∗(t)
i wj
=S
(t)
X
N
i=1
g¯
∗(t)
i +
X
N
i=1
1
√
B
L
∗(t)
i
v
∗
i
= S
(t)x
∗(t)
,
(6.27)
199
where g¯
∗(t)
i =
1
D
PD
j=1 g
∗(t)
i,j and Σ
∗(t)
i = U
∗(t)
i
λ
∗(t)
i
U
∗(t)
i
T
, L
(t)
i = U
(t)
i
λ
(t)
i
1
2
, v
∗
i ∼ N (0, IK∗ ),
and x
∗(t) will be Gaussian with mean PN
j=1 g¯
∗(t)
j
and covariance matrix
PN
j=1 Σ
∗(t)
j
B
. Therefore, ∀ x in the
subspace of S
(t)
(i.e. x = S
(t)x
∗
), we have:
M{Di∪D−i,θ(t)}
(x) = Pr[x
(t)
i + x
(t)
−i = x|{Di}
i=N
i=1 , θ(t)
]
=Pr[x
∗(t) = x
∗
|{Di}
i=N
i=1 , θ(t)
].
(6.28)
Following the proof of Theorem 3, for any Di and D
′
i
(i.e. two instances of user i’s local dataset ) differing
from one data point, we have:
Dα(M{Di∪D−i,θ(t)}
∥M{D
′
i∪D−i,θ(t)}
)
≤(
2αBC2
D2
+
αC2
(α − 1)D
)
1
PN
i=1 λ
∗(t)
i,min −
αC2
D
= ϵ
∗(t)
i
.
(6.29)
Hence, the aggregated model update x
(t)
at step t can provide (α, ϵ
∗(t)
i
)-RDP and (ϵ
∗(t)
i +
log(1/δ)
α−1
, δ).
Theorem 4 relies on Assumption 5. It states that for Gaussian gradient noise with singular covariance
matrix, in order to provide DP guarantee for user i’s local dataset, any difference in aggregated model
update caused by the change of one data point in must belong to the same subspace where the randomness
in the aggregated model update belongs to. Otherwise, DP will be violated. The counter example in
Figure 6.3 violates DP since the model update of user 4 fails to satisfy Assumption 5.
Remark 2: Applicability of Assumption 5 in practice. Prior works have demonstrated that the gradient converges to a tiny subspace [288] in the training process of overparameterized neural networks, and
mapping gradient into a subspace (e.g. using PCA) can reduce the amount of noise added for the same
level of DP [269]. Hence, Assumption 5 is a reasonable assumption for overparameterized neural network.
However, in order to verify whether Assumption 5 holds, a centralized server is needed to access to each
user’s individual gradients and run SVD on them, which breaks the privacy guarantee provided by SA in
FL (see Section 5.2.2). Therefore, in practice, a potential solution is that each user adds additional noise
locally to make the covariance matrix non-singular, as discussed in Section 6.5.
6.5 Water-Filling Noise Addition (WF-NA) Algorithm.
Till now, our theoretical results demonstrate the necessity of additional noise for DP guarantee in FL with
SA when the covariance matrix of gradient noise is non-singular. In this section, we explore whether it
is possible to leverage the inherent randomness inside aggregated model update to reduce the amount of
additional noise required for DP. Specifically, we introduce a novel algorithm called Water-Filling noise
addition (WF-NA), which lift the zero eigenvalues (and some small eigenvalues) in the covariance matrix
of each user’s gradient noise, in order to guarantee their non-singularity. For simplicity, we refer to this
algorithm as WF-NA and describe its details below.
6.5.1 Algorithm Design
The inputs of the WF-NA algorithm for each user i include the local dataset Di
, the current global model
parameters θ
(t)
, the number of users N participating in FL, the clipping value C, mini-batch size B, the
lower bound for minimal eigenvalue budget σ
2
, and the relaxation item δ. Note that without loss of generality, we assume that the input parameters are the same across users.
First, user i utilizes its local dataset Di to compute the mean gradient and covariance matrix of gradient
noise as µ
(t)
i
and Σ
(t)
i
respectively. We describe this process for FedSGD separately below.
Calculate covariance matrix. For each data point in the local training dataset, user i will calculate
a gradient from this data point and clip its L2 norm to make it upper bounded by constant C. After
obtaining |Di
| = D clipped gradient vectors, user i uses them to calculate the mean µ
(t)
i =
1
D
PD
j=1 g
(t)
i,j
and covariance matrix Σ
(t)
i =
1
BD
PD
j=1 g
(t)
i,j (g
(t)
i,j )
T
.
201
Run SVD and lower bound the smallest eigenvalue. Next, user i will run SVD on its estimate covariance matrix as Σ
(t)
i = U
(t)
i Λ
(t)
i
(U
(t)
i
)
T
, and clip the eigenvalues in Λ
(t)
i
such that all eigenvalues are
lower bounded by σ
2
. Define the updated diagonal matrix with bounded eigenvalues as Λ
(t)
i,+. The revised
covariance matrix can be calculated as Σ
(t)
i,+ = U
(t)
i Λ
(t)
i,+(U
(t)
i
)
T
. Note that to make the SVD process efficient, we split the d-dimension gradient into k parts, estimate the covariance matrix for each part, run
SVD on these k covariance matrices separately, and concatenate them into the final covariance matrix. By
running SVD approximately, the time complexity of SVD will be reduced from O(d
3
) to O(k
d
k
3
) = O(
d
3
k
2 )
(see similar approaches in [289, 290, 291]), without affecting the ϵ bound we have.
Add WF noise. Finally, each user adds Gaussian noise with covariance matrix ∆Σ(t)
i = Σ(t)
i,+ − Σ
(t)
i
into
the local model update (i.e. n ∼ N(0, ∆Σ(t)
i
)).
Theorem 5 (DP guarantees of Gaussian sampling noise + WF-NA algorithm). Given the input parameters
of WF-NA algorithm, for any α ∈ (1,
Nσ2D
2C2 ), the aggregated model update using WF-NA can provide (α, ϵ(t)
)-
RDP and (ϵ
(t) +
log(1/δ)
α−1
, δ)-DP guarantees to any user i’s local dataset, where the ϵ
(t)
of each training round
is given as follows:
ϵ
(t) = (2αC2
D2
+
2αC2
(α − 1)BD
)
1
Nσ2 −
2αC2
BD
. (6.30)
Proof. We define the change of covariance matrix after applying WF-NA as:
Σ
(t)
i,∆ = Σ(t)
i,+ − Σ
(t)
i
. (6.31)
Since WF-NA only increases the eigenvalues of the original covariance matrix by maximally σ
2
, Σ
(t)
i,∆ will
be semi-definite with maximal eigenvalue upper bounded by σ
2
. Then, consider Σ
(t),1
i,+ and Σ
(t),2
i,+ , which
are two instances of Σ
(t)
i,+ by changing one data point in Di
, we have:
Σ
(t),2
i,+ − Σ
(t),1
i,+ =
1
BD
(g
(t),2
i,1
(g
(t),2
i,1
)
T − g
(t),1
i,1
(g
(t),1
i,1
)
T
) + ∆Σ(t)
i,∆,
(6.32)
202
where ∆Σ(t)
i,∆ = Σ(t),2
i,∆ − Σ
(t),1
i,∆ .
Since removing g
(t),1
i,1
(g
(t),1
i,1
)
T decreases the eigenvalues, which may cause some eigenvalues below
smaller than σ
2
, WF-NA needs to refill eigenvalues smaller than σ
2
. Hence, we have ∆Σ(t)
i,∆ ⪯
1
BD g
(t),1
i,1
(g
(t),1
i,1
)
T
.
This says, by changing one data point in user i’s local dataset, WF-NA will at most increase Σ
(t),1
i,+ by
1
BD g
(t),1
i,1
(g
(t),1
i,1
)
T
to refill small eigenvalues. Moreover, since adding g
(t),2
i,1
(g
(t),2
i,1
)
T
increases the eigenvalues, WF-NA may lift eigenvalues smaller than σ
2
less. Hence, we have ∆Σ(t)
i,∆ ⪰ − 1
BD g
(t),2
i,1
(g
(t),2
i,1
)
T
.
This says, by changing one data point in user i’s local dataset, WF-NA will at most reduce Σ
(t),1
i,+ by
1
BD g
(t),2
i,1
(g
(t),2
i,1
)
T
. Formally, we have:
−
1
BD
g
(t),2
i,1
(g
(t),2
i,1
)
T ⪯ ∆Σ(t)
i,∆ ⪯
1
BD
g
(t),1
i,1
(g
(t),1
i,1
)
T (6.33)
Since ||g
(t),l
i,1
||2 ≤ C, we can derive that ∀ x ∈ RK,
x
T
(Σ(t),2
i,+ − Σ
(t),1
i,+ )x
≥ − x
T
(
1
BD
g
(t),1
i,1
(g
(t),1
i,1
)
T +
1
BD
g
(t),2
i,1
(g
(t),2
i,1
)
T
)x ≥ −
2C
2
BD
x
T x.
(6.34)
Therefore, ∀ x ∈ RK,
x
T Σα,+x = x
T
((1 − α)Σ1,+ + αΣ2,+)x
=x
T Σ1x + x
T α
B
(Σ(t),2
i,+ − Σ
(t),1
i,+ )x
≥(Nσ2 −
2αC2
BD
)x
T x.
(6.35)
Moreover, we can derive
Σ2,+ ⪯ Σ1,+ +
1
BD
g
(t),2
i,1
(g
(t),2
i,1
)
T +
1
BD
g
(t),1
i,1
(g
(t),1
i,1
)
T
, (6.36)
203
and
Σ1,+ ⪯ Σα,+ +
α
BD
g
(t),2
i,1
(g
(t),2
i,1
)
T +
α
BD
g
(t),1
i,1
(g
(t),1
i,1
)
T
. (6.37)
Following the proof of Theorem 3, we derive:
Dα(M{D1
i ∪D−i,θ(t)}
∥M{D2
i ∪D−i,θ(t)}
)
≤
α
2
(∆¯g
(t)
i
)
T Σ
−1
α,+∆¯g
(t)
i −
1
2(α − 1) ln |Σα,+|
|Σ1,+|
1−α|Σ2,+|
α
.
≤
2αC2
D2
1
Nσ2 −
2αC2
BD
+
αC2
(α − 1)BD
(
1
Nσ2
+
1
Nσ2 −
2αC2
BD
)
=(2αC2
D2
+
2αC2
(α − 1)BD
)
1
Nσ2 −
2αC2
D
= ϵ
(t)
.
(6.38)
Hence, the aggregated model update using WF-NA can provide (α, ϵ(t)
)-RDP and (ϵ
(t) +
log(1/δ)
α−1
, δ)-DP
guarantees to any user i’s local dataset.
6.5.2 Discussion about WF-NA
Intuitive comparison of WF noise with isotropic noise Figure 6.4 compares WF noise with isotropic
noise, where the x-axis represents the index of each eigenvalue and the y-axis represents the value of each
eigenvalue. The noise contained in the aggregated model update is randomly distributed in the directions
of all eigenvectors, and each eigenvalue measures the variance of the noise in each eigenvector direction.
When the eigenvalue is larger than σ
2
(i.e. the amount of noise we need for achieving our privacy target), then the aggregated model update can already provide enough privacy protection, without needing
additional noise. Since WF-NA leverages the noise contained in the aggregated model update, by only
adding noise to the eigenvector directions where the eigenvalues are smaller than σ
2
, it can add less noise
compared with adding isotropic noise, as shown in Figure 6.4a and 6.4b.
204
�!
Eigenvalue
Amount of noise added
using WF-NA
(a) WF noise.
�!
Eigenvalue
Amount of noise added
using isotropic Gaussian
(b) Isotropic noise.
Figure 6.4: Comparison of WF noise and isotropic noise.
Method for local model
update calculation
Additional
noise
(ϵ, δ)-DP guarantee
SGD + IID gradient
sampling
Isotropic
Gaussian
ϵ = log(1 + B
D
(e
ϵ
′
− 1)), where
ϵ
′ = min
α>1
log(1/δ)
α − 1
+
2αC2
B2
·
1
Nσ2
([269, 292])
SGD + IID gradient
sampling
WF-NA No DP guarantee
GD without sampling Isotropic
Gaussian
ϵ = min
α>1
log(1/δ)
α − 1
+
2αC2
D2
·
1
Nσ2
([269])
GD without sampling WF-NA No DP guarantee
SGD + Gaussian
gradient sampling
Isotropic
Gaussian
ϵ = min
α>1
log(1/δ)
α − 1
+ (2αC2
D2
+
αC2
(α − 1)BD
)
1
Nσ2 −
αC2
BD
SGD + Gaussian
gradient sampling
WF-NA ϵ = min
α>1
log(1/δ)
α − 1
+ (2αC2
D2
+
αC2
(α − 1)BD
)
1
Nσ2 −
αC2
BD
Table 6.1: Comparison of different DP mechanisms in FedSGD. Note that for SGD + IID gradient sampling,
the sampling ratio is B
D
, which will offer a privacy amplification with ratio B
D
approximately.
Comparison of different DP mechanisms. In Table 6.1, we summarize the DP guarantees of different
sampling noise with WF-NA and compare them with the DP guarantees provided by traditional DP mechanisms using isotropic Gaussian noise. It is worth noting that WF-NA itself is not enough to guarantee
DP, since it only guarantees that the covariance matrix of gradient noise is non-singular. In addition, it
needs to be combined with appropriate sampling noise in order to achieve DP guarantee (e.g. Gaussian
sampling noise).
205
As shown in Table 6.1, when SGD with IID sampling or Gradient Descent (GD) without gradient sampling is used, the usage of WF-NA cannot provide any DP guarantee, since WF-NA does not guarantee the
necessary condition in Theorem 3 (see Section 6.4.3 for details). Moreover, compared with mechanisms
using SGD, the usage of GD and isotropic Gaussian noise injects the least amount of noise into the local model updates, since GD does not introduce any sampling noise caused by gradient sampling, but it
does have larger cost than SGD with IID sampling since it uses all data points. Last, when SGD is used,
while WF-NA injects less noise compared with the addition of isotropic Gaussian, Gaussian gradient sampling may inject more sampling noise compared with IID gradient sampling. Hence, the total amount of
noise added by Gaussian gradient sampling + WF-NA may not be smaller than the total amount of noise
added by IID gradient sampling + isotropic Gaussian. In the next subsection, we empirically compare the
performance of these mechanisms for the same level of privacy.
6.5.3 Comparison of DP Mechanisms
In this subsection, we consider the tasks of training both a linear model and a Convolutional Neural Network (CNN) model on MNIST dataset with 50 users. We compare the accuracy of these models when
different DP mechanisms are employed during the training process, to verify our theoretical analysis in
Section 6.5.2. Specifically, we compare three mechanisms, described as follows.
SGD-DP: It uses SGD with IID sampling and the addition of isotropic Gaussian (i.e. adding noise in the
whole gradient space).
GSGD-DP: It uses SGD with Gaussian sampling and WF-NA (i.e. adding noise to the “unprotected” gradient subspace).
GD-DP: It uses GD without any sampling noise and the addition of isotropic Gaussian (i.e. adding noise
in the whole gradient space).
206
(a) Linear model. (b) CNN model.
Figure 6.5: Comparison of different DP mechanisms on MNIST dataset. Note that we consider 50 users
participating in FL. The training epoch is set as 100, the mini-batch size B is 32, the clipped value C is set
as 10, and we consider δ = 10−4
. We report the accumulative privacy across all training epochs by using
the composition theorem in [294].
As demonstrated in Figure 6.5, GSGD-DP has significantly better accuracy than SGD-DP. However,
both of them exhibit lower accuracy compared with GD-DP. In terms of training efficiency, SGD-DP is
better than GD-DP and GSGD-DP, since it merely requires users to sample a mini-batch of B data points
and take the average of gradients calculated from these data points as model update. Moreover, GSGD-DP
is the least efficient compared to GD-DP and SGD-DP, due to the necessity for WF-NA to run SVD on the
covariance matrix of gradient noise (see Section 6.5 for details). Therefore, we conclude that GD-DP is
preferable for maximizing model accuracy without considering training efficiency. Conversely, SGD-DP
is recommended when training efficiency is prioritized. Note that when the sampling ratio in SGD (i.e.
B
D
) converges to 1, the performance of SGD-DP will also converge to GD-DP. This says, the key difference
between SGD-DP and GD-DP is how they trade between training efficiency and model accuracy for the
same level of privacy. As indicated in [293], the mini-batch size B in SGD-DP can be adjusted to trade
between training efficiency and training accuracy. It is noteworthy that while GSGD-DP appears to be
impractical due to the requirement to run SVD, investigating how to add minimal noise only to the model
update subspace requiring additional noise for a DP guarantee is a compelling research direction. We leave
it as future research to explore whether this idea can be practically applied (see Section 6.6 for details).
207
6.6 Discussion and Future Work
Generalization to other FL protocols. While our theoretical results are derived based on FedSGD, they
can be generalized to other FL algorithms such as FedAvg [295], by only varying how we generate the
model update (i.e. Assumption 2). As an example, in FedAvg, at training round t each user conducts
multiple local training steps (i.e. a random process) to derive their local model update x
(t)
i
, which will also
be a random vector with some mean and covariance matrix. Therefore, after changing Assumption 2 w.r.t.
how we generate x
(t)
i
, Theorem 3-4 can be applied.
Leveraging inherent randomness in model updates. As explored in Section 6.5, when the inherent
randomness in the aggregated model update can be leveraged, the amount of additional noise required for
achieving DP guarantee can be reduced, since we only need to add additional noise to the subspace where
the randomness in aggregated model updates cannot offer enough privacy protection (i.e. the ‘unprotected”
subspace). Despite its potential, several challenges exist in the design of a practical mechanism to leverage
the inherent randomness in aggregated model updates to reduce the amount of noise needed for DP.
First, without making any assumptions about the distribution of each individual user’s model update,
we are unable to quantify and analyze the distribution of the inherent randomness in the aggregated model
update. Hence, the closed-form of ϵ cannot be derived through theoretical analysis. A potential solution is
to approximate the original aggregated model update whose distribution is unknown with a model update
satisfying a common distribution. For example, as demonstrated in Section 6.4, Gaussian gradient sampling
noise can be used to make the aggregated model update be Gaussian. While such gradient approximation
makes it analytically feasible to derive the bound of ϵ in DP, it actually introduces additional noise into
the raw aggregated model update (see Section 6.5 for details). Therefore, compared with traditional DP
mechanisms that add noise directly to the original gradient, whether novel gradient approximation approaches can be developed to improve the privacy-utility trade-offs during DP noise addition deserves
further research.
208
Second, the computational overhead of quantifying and leveraging the inherent randomness in the
model update can be large. As demonstrated in Section 6.4 and Section 6.5, quantifying the “unprotected”
subspace needs SVD, which will significantly increase the runtime overhead in practice. Hence, a practical
direction of future work is to explore how to efficiently quantify and use the inherent randomness in the
aggregated model update.
Third, to ensure DP guarantee with meaningful ϵ for each individual user, users need to clip their
gradients such that the distribution shift of the aggregated model update caused by the change of one data
point locally can be bounded and hence a meaningful bound for ϵ can be guaranteed. On the other hand, if
each user clips their individual gradient, the aggregated model update will be bounded and hence cannot
be used as unbounded noise with respect to other users. In this case, when applying DP mechanisms with
unbounded noise (e.g. Gaussian mechanism, Laplacian mechanism), the bounded inherent randomness
in the aggregated model update may not help to reduce the variance of unbounded noise needed for DP
guarantee. To address this, potential strategies include the adoption of DP mechanisms that utilize bounded
noise distributions (e.g [296, 297, 298]). Another approach involves discretizing the gradient, enabling the
addition of bounded discrete noise into the model update [244].
6.7 Related work
SA has been introduced to address information leakage from model updates in the context of FL. The stateof-the-art uses additive masking to protect the privacy of individual models [237, 251, 252, 253, 254, 255,
258]. SecAg [237] was the first practical protocol proposed for FL with SA that is resilient for both user
failures or dropouts and collusion between users as well as the server, and it is based on pairwise secret
keys to be generated between each pair of users for masking the model updates. The cost of constructing
and sharing these masks is quite large though, scaling with O(N2
) where N corresponds to the number
of users. Recent works have managed to reduce the complexity of SecAg to O(N log N), see SecAg+ [251]
209
and TruboAggregate [258]. SecAg+ leverages a sparse random graph where each user jointly encodes
its model update with only a subset of user, while TruboAggregate leverages both sequential training
over groups of rings and lagrange coded computing [299]. Further overhead reduction has been recently
achieved by LightSecAg [255] which uses private MDS codes. The reduction in complexity is based on
using one-shot aggregate-mask reconstruction of the surviving users. Other SA protocols proposed to
reduce the computation/communication complexity of SecAg include [253, 254].
A different direction to SA to provide privacy in the context of FL is to use DP by adding some noise
to either the data or the model updates. DP provides robust mathematical privacy guarantees by ensuring
that the individual contribution of a client does not have a significant impact on the final learning result.
In a trusted server setting, earlier works such as DP-SGD and DP-FedAvg [270, 269, 295] have focused on
applying DP centrally in FL, where the server is trusted with individual model updates before implementing
the deferentially private mechanism. An alternative approach, Local Differential Privacy (LDP), explores
similar guarantees where the server is not to be trusted. In LDP the updates are perturbed at the client-side
before it is collected by the server in the clear. The LDP model provides stronger privacy guarantees as it
does not require a fully trusted aggregator, but it suffers from a poor privacy-utility trade-off. To improve
privacy-utility trade-off of LDP, prior works have proposed to use parameter shuffling ([300, 301]) and
different FL protocols ([302]). However, these approaches do not consider the usage of SA and hence lead
to a significant utility drop [244].
Although FL systems may still leak information even when SA is utilized, there is an opportunity to
apply distributed DP mechanisms which may distributed the added noise among the users thus improving
utility, see, for example, recent efforts along these lines [273, 244, 303, 304, 305]. Specifically, in [273, 244,
303], DP is achieved by adding novel discrete noise that asymptotically converges to a normal distribution.
In [304], the authors propose a Poisson Binomial Mechanism for distributed DP that results in an unbiased
aggregate update at the server. In [305], the authors design a novel transformer layer deployed locally
210
for each user, which effectively reduces the accuracy drop caused by added noise in SA. However, these
approaches do not take advantage of the inherent randomness in FL from updates of other participating
clients in the system, which is the focus of this work. While there are a few works [306, 307, 308] which
leverage the inherent randomness from data sampling in centralized learning to protect DP guarantee,
they do not quantify to inherent randomness from model updates of other users in FL with SA.
Very recently, our work in [190] (i.e. Chapter 5) have used mutual information to show that the inherent
randomness from updates from other users may offer some privacy protection in the context of FL with
SA. In this work, we focus on whether FL with SA can provide DP guarantees, which is a much stronger
privacy guarantee.
6.8 Conclusion
In this work, we formally analyze the conditions under which FL with SA can provide DP guarantees
without additional noise. We then demonstrate that the impracticability of these conditions, especially for
over-parameterized neural networks. Hence, additional noise is necessary to achieve DP guarantees in FL
with SA in practice. Lastly, we explore how to leverage the inherent randomness inside aggregated model
update to reduce the amount of additional noise required for DP guarantees.
211
Part III
Leveraging Large Foundation Models to Protect User Privacy and Utility
in Specialized ML Model Training
212
Chapter 7
Efficient Toxic Content Detection by Bootstrapping and Distilling Large
Language Models
Toxic content on online platforms reveals sensitive information about users (e.g. their private information and personality) and causes harmful impacts on other platform users. Hence, it is crucial for online
services to remove inappropriate and sensitive content to protect the privacy of platform users. To automate the detection process, prior works have proposed varieties of machine learning (ML) approaches to
train Language Models (LMs) for toxic content detection. However, both their accuracy and transferability
across datasets are limited. Recently, Large Language Models (LLMs) have shown promise in toxic content
detection due to their superior zero-shot and few-shot in-context learning ability as well as broad transferability on ML tasks. However, efficiently designing prompts for LLMs remains challenging. Moreover,
the high run-time cost of LLMs may hinder their deployments in production. To address these challenges,
in this chapter, we propose BD-LLM, a novel and efficient approach to Bootstrapping and Distilling LLMs
for toxic content detection. At a high level, our approach extracts high-quality rationales from LLMs as
auxiliary training data of student LMs, thereby enhancing their performance on toxic content detection.
Specifically, we design a novel prompting method named Decision-Tree-of-Thought (DToT) to bootstrap
LLMs’ detection performance and extract high-quality rationales. DToT can automatically select more
fine-grained context to re-prompt LLMs when their responses lack confidence. Additionally, we use the
213
rationales extracted via DToT as training data to fine-tune student LMs. Our experimental results on various datasets demonstrate that DToT can improve the accuracy of LLMs by up to 4.6%. Furthermore, student
LMs fine-tuned with rationales extracted via DToT from LLMs outperform baselines on all datasets with
up to 16.9% accuracy improvement, while being more than 60× smaller than conventional LLMs. Finally,
we observe that student LMs fine-tuned with rationales extracted from LLMs exhibit better cross-dataset
transferability.
7.1 Introduction
Toxic content detection is important for online services to protect users from harmful and offensive content, ensuring a safer and more positive user experience. Common toxic content categories include hate
speech, biased content, sexual content, violent content, bullying content, etc. Due to the massive amount
of content on the Internet, it is impractical to manually check the toxicity of each content. Hence, machine
learning (ML) solutions based on supervised learning have been widely applied to automate the toxic content detection process, where Language Models (LMs) fine-tuned on a task-specific dataset achieve the
state-of-the-art (SOTA) performance [309, 310].
However, existing supervised learning ML solutions face three challenges. First, they require training
data with labels, which are non-trivial to obtain for toxic content detection tasks due to the lack of standard
definitions, especially for implicit toxic content. Second, the fine-tuned LMs may overfit the training
dataset, which limits the transferability to other datasets. Lastly, they typically can only predict binary
labels without detailed reasoning.
To handle the above challenges, the recently emerging Large Language Models (LLMs) have been leveraged to detect toxic content [311, 312], due to their superior zero-shot and few-shot in-context learning
performance and transferability. Existing works on LLMs for toxic content detection focus on designing
novel prompting approaches to enhance the performance of LLMs. However, their performance relies
214
heavily on the quality of prompts, which are non-trivial to design. Moreover, deploying LLMs for toxic
content detection in production can incur both high run-time cost and high latency, especially when the
number of tokens in the prompt is large (e.g. for in-context few-shot learning).
LLM DToT
Prompting
Question: �
Answer: �, Rationale: �
Fine-tune Student LM
� �, �
Question: �
Prompt
Generator
LLM
Prompt: �!
Confidence
Checker
Answer: �!
Rationale: �!
Rationale: �!
New context: �!"# = ��"�
�
Answer: �!
Rationale: �!
��
��"�
�
��"�
�
��"�
���
Context Tree
Context Selector Low
High
Initial
context: �'
DToT Prompting
Figure 7.1: Overall workflow of the proposed BD-LLM. Given question q, it first bootstraps the LLM via
DToT prompting to extract answer a and rationale r with high-confidence. Then, it uses q as input and
(a, r) as output to fine-tune the student LM.
Motivated by the above, in this work, we propose BD-LLM, a novel and efficient approach to Bootstrapping
and Distilling LLMs for toxic content detection (as shown in Figure 7.1). We first design a novel prompting
approach called Decision-Tree-of-Thought (DToT) to improve the zero-shot and few-shot in-context learning performance of LLMs as well as extract better rationales. At a high level, DToT works by iteratively
selecting more fine-grained context to re-prompt LLMs whenever they have low-confident answers. It
automatically decides when to select more fine-grained context for re-prompting and which type of finegrained context should be selected. Second, we propose to fine-tune a student LM with smaller model size
to predict both labels and rationales extracted via DToT.
We evaluate the proposed approach on three public datasets and one Amazon internal dataset. Experimental results demonstrate that employing DToT prompting consistently leads to improved zero-shot and
few-shot in-context learning performance on all datasets, by up to 4.6% higher accuracy. Furthermore, we
demonstrate that fine-tuning student LMs with DToT-extracted rationales results in up to 16.9% accuracy
improvement compared with baselines. Meanwhile, these student LMs have model size more than 60×
215
smaller than conventional LLMs. Finally, we observe that the cross-dataset transferability of student LMs
fine-tuned with rationale is significantly improved.
Our contributions are summarized as follows:
• We propose BD-LLM, an efficient approach for toxic content detection, which leverages LLMs strength
but also reduces their complexity via bootstrapping and distillation. This is the first-of-its-kind study
on toxic content detection.
• To bootstrap LLMs, we design a novel prompting method named DToT, which selectively re-prompts
LLMs with more fine-grained context to boost their detection performance and extract high-quality
rationales.
• To distill LLMs into smaller student LMs, we fine-tune the student LMs to predict both labels and
rationales extracted via DToT.
• We evaluate the proposed solution on four datasets and demonstrate that DToT can improve the toxic
content detection accuracy of LLMs by up to 4.6% and student LMs fine-tuned with DToT-extracted
rationales achieve the SOTA performance with up to 16.9% accuracy improvements and more than
60× smaller size.
7.2 Approach
Figure 7.1 demonstrates the overall workflow of BD-LLM, which consists of two separate steps. Specifically,
in the first step, we design DToT prompting to iteratively extract high-confidence answers a and rationales
r from LLM given input question q.
∗
In the second step, we conduct rationale distillation by fine-tuning a
student LM to generate both a and r given input q. We describe the details of each step below.
∗Note that for toxic content detection, an answer is either ‘Yes’ or ‘No’, while a rationale consists of one or more sentences.
216
7.2.1 DToT Prompting
At a high level, DToT prompting iteratively selects more fine-grained context and re-prompts the LLM
when they output responses with low confidence. Two challenges in designing DToT prompting are: 1)
how to decide whether the response of LLM is confident or not; 2) how to decide which type of fine-grained
context should be selected to re-prompt the LLM†
. To address the first challenge, we design a confidence
checker to measure the self-confidence score of the LLM’s answer. To handle the second challenge, we
design a context selector to select appropriate fine-grained context from a context tree based on the LLM’s
rationale.
In total, DToT prompting consists of four modules: 1) confidence checker; 2) context tree; 3) context
selector; 4) prompt generator, which can be used for both black-box and white-box LLMs. Note that for
black-box LLMs, we assume that we only have access to their output responses (e.g. ChatGPT). By contrast,
for white-box LLMs, we assume that we can also have access to their model parameters (e.g. open-source
LLMs). The detailed design of these modules and the end-to-end workflow are presented below.
Confidence Checker. This module is designed to measure the self-confidence score of LLM’s response,
which is defined as s
conf
t
for iteration step t . Specifically, for the black-box LLM, the confidence checker
uses the toxicity rating of LLM (defined as s
toxi
t ∈ [0, 100]) to calculate the confidence score. If s
toxi
t
is
above a maximal threshold sh or below a minimal threshold sl
, we consider the answer as confident, vice
versa. Note that to obtain s
toxi
t
, we explicitly require the LLM to output the toxicity rating in the prompt
pt
. For the white-box LLM (e.g. any open-source LLMs), the confidence checker will leverage the output
logits of LLM to calculate the probability of generating answers at conditional on the prompt pt
. This
†
Since there are many different types of toxic content and for each of them we need to use the right context (i.e. part of the
prompt that specifies the criteria to call certain content as toxic), there is a need to automatically select the appropriate context
for prompting.
217
probability will be used as the confidence measurement (i.e. s
conf
t = Pr[at
|pt
]). Formally, we define s
conf
t
as:
s
conf
t =
1[s
toxi
t ∈/ (sl
, sh)], for black-box LLMs
Pr[at
|pt
], for white-box LLMs
(7.1)
where 1[·] is an indicator function, and sl and sh are two adjustable thresholds (see Section 7.3.3 for selecting sl
, sh).
Suppose the confidence checker has measured the self-confidence score s
conf
t
of LLM’s answer at
.
Then, if s
conf
t
is higher than a threshold sδ, the checker will consider at as a confident answer. Otherwise,
at
is considered to be unconfident. Formally, the output of the confidence checker Dcheck is defined as:
Dcheck(s
conf
t
) =
Unconfident, if s
conf
t ∈ [0, sδ)
Confident, if s
conf
t ∈ [sδ, 1]
(7.2)
where sδ is an adjustable threshold (see Section 7.3.3 for selecting sδ).
Context Tree. Before introducing the context selector module, we first provide the definition of context
tree. Suppose the universe of context is represented as set C. We define the context tree as Tc : C → List[C],
which is a mapping from a parent-node context ct ∈ C to the list of its child-node contexts [c
1
t+1, ..., c
Nct
t+1],
where Nct
is the total number of child nodes of ct
, and c
j
t+1 ∈ C is the j-th child-node context of ct
.
Moreover, each child-node context’s category is designed to be a subcategory of its parent-node context’s
category, such that the child-node context is more fine-grained than parent-node context. For instance,
suppose that c0 provides the definition of toxic content, which includes hate speech, sexual content, etc.
Then, c
1
1
can be a more fine-grained definition for hate speech, and c
2
1
can be a more fine-grained definition
for biased content (see Figure 7.2).
218
Toxic content includes hate speech, biased content, sexual content, …
Hate and intolerant content refers to any form of communication,
media, or expression that promotes or incites violence, hostility, …
Biased and discriminatory content can disrespect, stereotype, or
promote bias and discrimination against an individual …
…
�!
�"
"
�"
#
Figure 7.2: An example of context tree.
Context Selector. Now we introduce the context selector module, which takes as input the rationale rt
and the context ct
in prompt at step t, and a context tree Tc, to select a more fine-grained context ct+1 for
step t+1. Whenever an answer at at iteration step tis considered as unconfident by the confidence checker,
the context selector will select new context from the context tree for re-prompting LLMs. Specifically, it
measures the relevance between rationale rt and each child-node context c
j
t+1, and then selects the most
relevant one ct+1 = c
j
∗
t+1. Formally, we define the output of context selector as:
Dselect(rt)=j
∗ = arg max
1≤j≤Nct
s
rele,j
t
(rt, c
j
t+1), (7.3)
where [c
1
t+1, ..., c
Nct
t+1] = Tc(ct), and s
rele,j
t
is the relevance score between rt and c
j
t+1.
Furthermore, to calculate s
rele,j
t
for LLMs which can generate relatively high-quality rationales (mostly
for black-box LLMs like ChatGPT), we use a classification prompt to ask LLM which of these child-node
context categories are most relevant to the rationale rt
. We denote the answer from the LLM as Class(rt).
By contrast, for white-box LLMs which cannot generate rationales with decent quality, we construct a set
of candidate rationales as rt = [r
1
t
, ..., r
Nct
t
], where each rt
[j] = r
j
t
is relevant to only one of the childnode contexts c
j
t+1. Since we have access to the output logits of the LLM, we measure the probability of
generating each candidate rationale conditional on the prompt pt
(i.e., Pr[rt
[j]|pt
]) as the relevance score.
Formally, we define s
rele,j
t
as:
s
rele,j
t =
1[j = Class(rt)], for black-box LLMs
Pr[rt[j]|pt], for white-box LLMs
(7.4)
where 1[·] is an indicator function.
219
Prompt Generator. The Prompt Generator module P generates an input prompt pt for LLM M based
on the question q, and the context ct at iteration step t. Note that it will modify the question q with the
change of context. For example, the initial question based on c0 is designed to ask whether a statement
contains toxic content. Suppose c1 is selected to provide context related to hate speech. Then, the question
will be modified to ask whether the statement contains hate speech, which is a specific category of toxic
content. We provide the prompt templates in Table 7.1 below:
Model Prompt template
ChatGPT
Context: c.
Sentence: s.
Does this sentence contain type(c)
content?
First, answer in “Yes" or “No".
Second, rate the type(c) level out
of 100.
Third, explain your rationale briefly.
The output format is given below:
Answer: ...
type(c) level: .../100.
Rationale: ...
FC-T5
c.
### Human: “s". Does this sentence
contain type(c) content?
Answer yes or no, and explain your
answer.
### Assistant:
Table 7.1: Prompt templates for different LLMs, given question q: “Does sentence s contain toxic content?"
(see Section 3.1). Note that c is the context, s is the statement, and type(c) is the category of context c
(e.g. toxic, hate and violent).
End-to-end Workflow. Algorithm 1 illustrates the end-to-end workflow of DToT prompting, whose input contains the LLM $M$, the question $q$, and the initial context $c_0$. At each iteration step $t$ (line 2), it first generates the prompt $p_t$ via the prompt generator $P$ and gets LLM $M$'s output (lines 4-9). Next, it calculates the confidence score $s_t^{\text{conf}}$ of the LLM's answer and decides whether the answer is confident via the confidence checker $D_{\text{check}}$ (lines 11-12). For unconfident answers, it gets a list of new candidate contexts from the context tree $T_c$ (line 14), calculates the relevance score between the LLM's rationale and each new context in the list (lines 15-20), and selects the most relevant new context (lines 21-22). Lastly, it terminates when the maximal iteration step $T$ is reached (line 2) or when the LLM's answer is confident (lines 23-24), and returns the final answer and rationale (line 27).
Algorithm 1 DToT Prompting
Input: Question $q$, initial context $c_0$, LLM $M$, thresholds $s_l$, $s_h$, and $s_\delta$, maximal iteration step $T$.
Output: Answer $a$, rationale $r$.
1: Define current step as $t = 0$.
2: while $t < T$ do
3:   // Generate prompt and get response
4:   Generate input prompt: $p_t = P(q, c_t, M)$.
5:   if $M$ is black-box then
6:     Get LLM output: $(a_t, s_t^{\text{toxi}}, r_t) = M(p_t)$.
7:   else
8:     Get LLM output: $(a_t, r_t) = M(p_t)$.
9:   end if
10:  // Check the confidence of the answer
11:  Calculate confidence score $s_t^{\text{conf}}$ by Eq. (7.1).
12:  if $D_{\text{check}}(s_t^{\text{conf}})$ = Unconfident then
13:    // Select new context from context tree
14:    Get the new context list: $[c_{t+1}^j]_{j=1}^{N_{c_t}} = T_c(c_t)$.
15:    if $M$ is black-box then
16:      Let $M$ get the rationale class: $\text{Class}(r_t)$.
17:    else
18:      Let $r_t$ be the candidate rationale list: $[r_{t,j}]_{j=1}^{N_{c_t}}$.
19:      Calculate relevance scores $[s_t^{\text{rele},j}]_{j=1}^{N_{c_t}}$ by Eq. (7.4).
20:    end if
21:    Calculate $j^* = D_{\text{select}}(r_t)$ by Eq. (7.3).
22:    Set new context $c_{t+1}$ as $c_{t+1}^{j^*}$ and $t = t + 1$.
23:  else
24:    Exit.
25:  end if
26: end while
27: return $a = a_t$, $r = r_t$
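The control flow of Algorithm 1 can be summarized in a few lines of Python. The sketch below is a minimal rendering of the loop, assuming the callables passed in (prompt generator, LLM wrapper, confidence checker, relevance scorer) implement the modules defined above; it is meant to clarify the iteration structure, not to serve as a reference implementation.

```python
# Minimal sketch of the DToT loop in Algorithm 1. The callables passed in are
# placeholders for the prompt generator P, the LLM M plus Eq. (7.1), the
# confidence checker D_check, and the relevance scoring of Eq. (7.4).
def dtot(question, root_context, context_tree, generate_prompt, query_llm,
         is_confident, relevance_scores, max_steps=2):
    context = root_context
    answer, rationale = None, None
    for _ in range(max_steps):                        # line 2: t < T
        prompt = generate_prompt(question, context)    # line 4: prompt generator P
        answer, rationale, s_conf = query_llm(prompt)   # lines 5-11: LLM output + confidence
        if is_confident(s_conf):                        # line 12: confidence checker
            break                                       # lines 23-24: confident, stop iterating
        children = context_tree.get(context, [])        # line 14: child contexts from T_c
        if not children:
            break
        scores = relevance_scores(rationale, children, prompt)  # lines 15-20: Eq. (7.4)
        context = max(zip(scores, children))[1]         # lines 21-22: Eq. (7.3)
    return answer, rationale                            # line 27
```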
7.2.2 Augmented DToT Prompting
So far, we have presented how DToT prompting works under the zero-shot learning setup. Since DToT prompting is orthogonal to few-shot learning, we propose two augmented versions of DToT prompting below.
DToT+FS. This approach combines DToT prompting with vanilla few-shot in-context learning, which
adds a few demonstrations in the prompt. Recent work [313] has shown that a few demonstrations can
effectively improve the in-context learning performance of LLMs. Moreover, motivated by prior work
[314], we select K positive statements and K negative statements that are most semantically similar to
the input statement as demonstrations from a development set.
DToT+FS+R. Recent works [315, 312] have demonstrated that rationales or facts in prompts serve as
grounded information to further enhance the few-shot in-context learning capability of LLMs. Therefore,
on top of DToT+FS approach, we include the rationale for each demonstration in the prompt.
7.2.3 Rationale Distillation
After obtaining the answers and rationales from LLMs via DToT, we can distill LLMs into smaller student
LMs, which can predict both answers and rationales. We describe the details of rationale distillation under
two scenarios below.
Distillation without Labels. Assume that we do not have ground truth labels for training inputs. Hence, we use both the answers and rationales output by LLMs as ground truth. In practice, this approach can be applied when it is challenging or costly to obtain ground truth labels. Suppose for the $i$-th input $x_i$, the predicted label from the LLM is $\hat{y}_i$‡ and the associated rationale is $r_i$. Suppose the student model is $(y_i^s, r_i^s) = f_\theta^s(x_i)$, parameterized by $\theta$. Then the loss function is defined as
$$\mathcal{L}^s(\theta) = \sum_{i=1}^{D} \left( \text{CE}(y_i^s, \hat{y}_i) + \text{CE}(r_i^s, r_i) \right), \qquad (7.5)$$
where $D$ is the total number of training data points and CE represents the cross-entropy averaged at the token level.
Distillation with Labels. Assume that we have access to ground truth labels for training inputs. Hence, we can use the ground truth labels for fine-tuning. Specifically, if the LLM cannot predict the right answer, we only use the binary ground truth labels. Otherwise, we use both the labels and rationales. Formally, the loss function is defined as
$$\mathcal{L}^s(\theta) = \sum_{i=1}^{D} \left( \text{CE}(y_i^s, y_i) + \lambda \cdot \mathbb{1}_{y_i = \hat{y}_i} \text{CE}(r_i^s, r_i) \right), \qquad (7.6)$$
where $\lambda$ is an adjustable parameter (see Section 7.3.3 for selecting $\lambda$).
‡Note that we assign label 1 to a 'Yes' answer and label 0 to a 'No' answer.
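As a concrete illustration of Eq. (7.6), the sketch below computes the distillation loss for a seq2seq student LM through the Hugging Face interface; scoring the answer and the rationale in two separate forward passes and the assumed batch format are simplifications made for clarity, not the exact training code used in this work.

```python
# Sketch of the distillation loss in Eq. (7.6) for a seq2seq student LM.
# Assumption: the answer and rationale targets are scored in two forward passes;
# in practice they could also be concatenated into a single target sequence.
import torch

def distillation_loss(student, tokenizer, batch, lam=1.0):
    """batch: list of dicts with 'input' (x_i), 'label' (ground truth y_i),
    'llm_label' (teacher prediction y_hat_i), and 'rationale' (r_i)."""
    total = torch.zeros(())
    for ex in batch:
        enc = tokenizer(ex["input"], return_tensors="pt")
        ans_ids = tokenizer(ex["label"], return_tensors="pt").input_ids
        # CE(y^s_i, y_i): token-level cross-entropy on the answer
        total = total + student(**enc, labels=ans_ids).loss
        # lambda * 1[y_i = y_hat_i] * CE(r^s_i, r_i): only distill the rationale
        # when the teacher LLM predicted the right answer
        if ex["llm_label"] == ex["label"]:
            rat_ids = tokenizer(ex["rationale"], return_tensors="pt").input_ids
            total = total + lam * student(**enc, labels=rat_ids).loss
    return total / len(batch)
```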
7.3 Experimental Setup
7.3.1 Datasets
We evaluate our approach on three public datasets and an Amazon private dataset.
Toxigen. This is a GPT-3-generated dataset provided by [316], which contains both toxic and benign statements about 13 minority groups. We use their annotated dataset with 8,960 training statements and 940 testing statements in our experiments, where 40% of the statements are toxic. Note that we exclude about 5% of the statements as ambiguous: the annotated toxicity levels range from 1 to 5, and statements with level 3 (denoting ambiguity) are removed from our experiments.
SBIC. This dataset contains 44,671 social media posts from Reddit, Twitter, and hate sites. Each post was
annotated by the Social Bias Frames proposed in [317] to specify whether there exists any social bias and
stereotype towards a target group that is toxic. In our experiments, we randomly and uniformly sample
4,000 statements as training dataset and 1,000 statements as testing dataset, 50% of which are toxic.
DHate. This dataset is generated by a human-and-model-in-the-loop process proposed in [318]. In total, it contains about 40,000 labeled statements, of which 54% contain hate content. We randomly and uniformly
sample 4,000 statements as training dataset and 1,000 statements as testing dataset in our experiments.
Amazon. This is a private dataset from Amazon, which contains 8,000 benign statements and 2,000 toxic
statements annotated by professional human labelers. This dataset is only used for testing purposes in our
experiments due to confidentiality policy.
7.3.2 Models and Baselines
DToT prompting. We evaluate the effectiveness of DToT prompting on both black-box and white-box
models, which are defined in Section 3.1. Specifically, we select gpt-3.5-turbo (denoted by ChatGPT [319])
with 175B parameters as our black-box model, and we consider FastChat-T5 (denoted by FC-T5 [320]) with
3B parameters as our white-box model.
Moreover, we compare DToT prompting with three existing baselines: (a) RoBerta model fine-tuned
on each dataset, since prior work [316] has shown that it can achieve the SOTA performance on Toxigen
dataset. (b) CoT prompting, which can be viewed as a special case of DToT prompting without iteratively
re-prompting. (c) UniLC prompting proposed by [312].
Finally, we compare DToT prompting with its two augmented versions: DToT+FS and DToT+FS+R (see Section 7.2.2). Note that for DToT+FS, we select K = 3 positive statements and K = 3 negative statements
that are most semantically similar to the input statement from a development set. To measure the similarity,
we use sentence transformer [321] to convert each statement into an embedding, and use the cosine similarity between two statements as a measurement of their semantic similarity. Moreover, for DToT+FS+R,
we use the rationales associated with the correct answers generated by ChatGPT as augmentations.
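A minimal sketch of this demonstration-selection step is given below, using the sentence-transformers library for the embeddings and cosine similarity; the specific encoder checkpoint and the development-set format are illustrative assumptions rather than the exact setup used in our experiments.

```python
# Sketch of selecting the K most semantically similar demonstrations for DToT+FS,
# using a sentence-transformer encoder and cosine similarity as described above.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder checkpoint

def select_demonstrations(statement, dev_set, k=3):
    """dev_set: list of (text, label) pairs with label in {'toxic', 'benign'}.
    Returns K toxic and K benign demonstrations closest to the input statement."""
    query = encoder.encode(statement, convert_to_tensor=True)
    demos = {"toxic": [], "benign": []}
    for label in demos:
        candidates = [text for text, lab in dev_set if lab == label]
        if not candidates:
            continue
        embeddings = encoder.encode(candidates, convert_to_tensor=True)
        sims = util.cos_sim(query, embeddings)[0]          # cosine similarity to each candidate
        top = sims.topk(min(k, len(candidates))).indices.tolist()
        demos[label] = [candidates[i] for i in top]
    return demos
```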
Model Distillation. For our main experiments, we select FC-T5 to evaluate the effectiveness of fine-tuning a student LM using rationales generated from LLMs, and we consider two baseline approaches: (a) fine-tuning with labels but without rationales, and (b) fine-tuning with rationales generated by CoT prompting (which we denote by R_CoT). Note that for our approach (which we denote by R_DToT), we use the rationales generated via DToT+FS+R prompting as the ground truth during fine-tuning. Furthermore, to investigate how the number of parameters in the student LM affects the fine-tuning performance, we use Flan-T5 models of different sizes [322]. Note that we use FC-T5 as the student LM since it has the best performance after fine-tuning compared with other models from the T5 family [322] (see Table 7.5 in Section 7.4.2). We do not consider models larger than 3B during the distillation experiments, due to their high fine-tuning cost and low run-time efficiency.
7.3.3 Parameters and Models
Parameters for DToT Prompting. We set $s_l$ and $s_h$ in Eq. (7.1) as 0 and 90, respectively, for experiments with ChatGPT. Since we observe that ChatGPT answers 'Yes' as long as the toxicity rating is above zero, which leads to high recall but relatively low precision (i.e. a high false positive rate), we select a small value for $s_l$ and a large value for $s_h$ to reduce the false positive rate of ChatGPT. We set $s_\delta$ in Eq. (7.2) as 0.9 for the confidence checker in DToT prompting, since we empirically notice that setting the confidence score threshold to 0.9 improves the confidence of generated answers and thus enhances the overall detection accuracy.
Moreover, we use a two-level context tree in our experiments (as shown in Figure 7.2 in Section 7.2.1) and the maximal iteration step $T$ is set to 2. Note that we did not experiment with a deeper context tree since DToT prompting with a two-level context tree can already significantly improve LLMs' detection performance at reasonable cost.
Parameters for Rationale Distillation. We set $\lambda$ in Eq. (7.6) as 1, in order to make the predicted answers and rationales of the student LMs equally important in our experiments. During the rationale distillation experiments (see Section 7.4.2), for each dataset, we fine-tune the student LM on the training set for 5 epochs, where stochastic gradient descent with a mini-batch size of 8 and a learning rate of 1e-4 is used.
7.4 Evaluation Results
7.4.1 Evaluation of DToT
We start by evaluating DToT prompting to answer the following question.
Q1: Can DToT prompting enhance the detection performance of LLMs? The results are shown in Table 7.2.
Model     Method       |  Toxigen             |  SBIC                |  DHate
                       |  Acc    F1     AUC   |  Acc    F1     AUC   |  Acc    F1     AUC
RoBerta   FT           |  82.45  75.35  90.24 |  82.80  83.96  91.82 |  70.60  75.38  81.27
FC-T5     CoT          |  79.69  73.49  85.54 |  63.50  71.15  67.76 |  63.40  71.09  69.13
FC-T5     DToT         |  81.24  75.36  86.67 |  68.80  74.26  72.47 |  66.80  73.05  72.66
FC-T5     DToT+FS      |  81.90  75.88  86.74 |  68.00  73.77  72.74 |  65.40  72.23  73.80
FC-T5     DToT+FS+R    |  82.01  75.99  86.94 |  68.00  73.77  72.56 |  66.50  72.87  73.37
ChatGPT   CoT          |  82.71  81.29  N/A   |  68.50  72.82  N/A   |  65.20  72.20  N/A
ChatGPT   UniLC        |  83.30  82.73  N/A   |  N/A    N/A    N/A   |  N/A    N/A    N/A
ChatGPT   DToT         |  85.03  82.76  N/A   |  71.60  74.18  N/A   |  68.20  73.92  N/A
ChatGPT   DToT+FS      |  86.03  83.51  N/A   |  71.70  74.62  N/A   |  69.50  74.33  N/A
ChatGPT   DToT+FS+R    |  87.03  85.06  N/A   |  72.00  74.91  N/A   |  69.20  74.30  N/A

Table 7.2: Evaluation results of DToT on the Toxigen, SBIC, and DHate datasets. In the "Method" column, "FT" stands for fine-tuning on the training dataset, "CoT" refers to CoT prompting, "DToT" corresponds to DToT prompting, "DToT+FS" denotes the combination of DToT prompting with few-shot demonstrations, and "DToT+FS+R" denotes the combination of DToT prompting with few-shot demonstrations and rationale augmentations. Due to the lack of output logits, the AUC scores of ChatGPT are reported as "N/A".
Compared with CoT prompting, DToT prompting significantly enhances the zero-shot learning performance of both the black-box model and the white-box model on all three public datasets. Specifically, for FC-T5,
DToT prompting can increase the accuracy by up to 5.30% and the F1 score by up to 3.11%. It is worth
noting that DToT prompting can also improve the AUC score of FC-T5 on all datasets, indicating its robust
performance. For ChatGPT, DToT prompting consistently outperforms CoT prompting, and increases the
accuracy by up to 3.10% and the F1 score by up to 1.72%.
Moreover, combining DToT with few-shot in-context learning (i.e. DToT+FS and DToT+FS+R) may
further improve models’ performance. For instance, on Toxigen dataset, compared with the vanilla DToT
prompting, adding demonstrations (i.e. DToT+FS) can improve ChatGPT's accuracy by 1.00%. Furthermore, by incorporating both demonstrations and rationales during prompting (i.e. DToT+FS+R), ChatGPT's accuracy can be improved by 2.00%, outperforming baselines by up to 4.58% (for RoBerta)
and at least 3.73% (for UniLC).
Model     Label   Rationale |  Toxigen             |  SBIC                |  DHate
                            |  Acc    F1     AUC   |  Acc    F1     AUC   |  Acc    F1     AUC
RoBerta   Human   N/A       |  82.45  75.35  90.24 |  82.80  83.96  91.82 |  70.60  75.38  81.27
ChatGPT   N/A     N/A       |  87.03  85.06  N/A   |  72.00  74.91  N/A   |  69.50  74.33  N/A
FC-T5     LLM     N/A       |  81.90  80.19  91.78 |  64.90  71.99  74.67 |  63.50  71.15  70.50
FC-T5     LLM     R_CoT     |  81.79  80.10  92.88 |  67.30  73.35  74.25 |  64.30  71.60  72.90
FC-T5     LLM     R_DToT    |  84.00  82.08  93.60 |  69.00  74.42  81.09 |  68.00  73.77  77.89
FC-T5     Human   N/A       |  84.99  83.00  92.43 |  84.00  84.91  92.89 |  85.00  85.71  93.64
FC-T5     Human   R_CoT     |  87.31  85.24  94.15 |  84.20  85.07  93.18 |  86.20  86.71  93.71
FC-T5     Human   R_DToT    |  87.53  85.46  94.37 |  85.10  85.82  93.85 |  86.40  86.87  94.49

Table 7.3: Distillation evaluation results on the Toxigen, SBIC, and DHate datasets. In the "Label" column, "Human" indicates that the labels come from the training dataset, and "LLM" indicates that the labels are predicted by the LLM. In the "Rationale" column, "N/A" means no rationales are used in fine-tuning, "R_CoT" means rationales extracted via CoT are used in fine-tuning, and "R_DToT" means rationales extracted via DToT are used in fine-tuning.
Therefore, we conclude that DToT prompting and its augmented versions significantly enhance LLMs’
performance on toxic content detection.
7.4.2 Evaluation of Rationale Distillation
Next, we evaluate whether fine-tuning with rationales extracted via DToT improves the performance of
student LMs. We first answer the following question:
Q2: Can fine-tuning with rationales extracted via DToT derive a student LM with higher accuracy? Table 7.3 reports the evaluation results of fine-tuning with or without rationales. In this table, "Label = Human" denotes that the ground truth labels in the training datasets are used for fine-tuning, "Label = LLM" means that labels predicted by the teacher LLM are used for fine-tuning, "Rationale = N/A" indicates that fine-tuning is conducted solely with labels, "Rationale = R_CoT" means that fine-tuning takes place with both labels and CoT-extracted rationales from the teacher LLM, and "Rationale = R_DToT" means that we fine-tune the model with both labels and DToT-extracted rationales from the teacher LLM (i.e. ChatGPT).
As reported in Table 7.3, FC-T5 fine-tuned with DToT-extracted rationales and ground truth labels outperforms all baselines across various public datasets. Notably, this fine-tuning approach yields significant
improvements in accuracy, F1 score, and AUC score. Specifically, compared with fine-tuning with labels
only, it can increase the model accuracy by up to 2.54%, the F1 score by up to 2.46%, and the AUC score by
up to 1.94% respectively. Compared with the prior SOTA LM fine-tuned on these datasets (i.e. RoBerta), it
can significantly increase the model accuracy by up to 15.80%, the F1 score by up to 11.49%, and the AUC
score by up to 13.22% respectively. In addition, FC-T5 fine-tuned with labels and rationales outperforms
teacher LLM by up to 16.90% w.r.t. accuracy, with 60× smaller model size.
Moreover, compared with fine-tuning with R_CoT and ground truth labels, we observe that fine-tuning with R_DToT and ground truth labels consistently results in student LMs with better detection performance on all public datasets. This indicates that rationales generated via DToT prompting (R_DToT) have higher quality than those generated via CoT prompting (R_CoT).
Lastly, under the scenario where we use labels predicted by the teacher LLM as ground truth, we observe that fine-tuning with R_DToT and labels still outperforms both fine-tuning with R_CoT and labels and fine-tuning with only labels. Specifically, on the Toxigen dataset, FC-T5 fine-tuned with R_DToT and predicted labels from the teacher LLM can even outperform RoBerta fine-tuned with human labels from the training dataset. However, without using R_DToT, the fine-tuned FC-T5 cannot outperform RoBerta.
In summary, we conclude that fine-tuning with both labels and rationales can effectively improve the
student LMs’ performance. Moreover, using rationales with better quality (i.e. DToT-extracted rationales
versus CoT-extracted rationales) can further enhance the performance of fine-tuned LMs.
Q3: Can fine-tuning with rationales improve the transferability of student LMs across different toxic datasets? To evaluate whether fine-tuning with both labels and rationales can improve the transferability of student LMs, we test the models, which are fine-tuned on the Toxigen dataset, on the other two public datasets (SBIC and DHate) and one private dataset (Amazon). Table 7.4 reports our transferability evaluation results, where we use the AUC score as our metric.
Model     Labels  Rationale |  SBIC   DHate   Amazon
RoBerta   Human   N/A       |  65.46  61.51   X
ChatGPT   N/A     N/A       |  72.00  69.50   N/A
FC-T5     LLM     N/A       |  75.36  74.96   X+2.47
FC-T5     LLM     R_CoT     |  71.48  70.99   X+4.52
FC-T5     LLM     R_DToT    |  75.90  77.31   X+4.54
FC-T5     Human   N/A       |  75.24  77.75   X+0.05
FC-T5     Human   R_CoT     |  77.15  78.97   X+4.16
FC-T5     Human   R_DToT    |  77.29  77.54   X+3.61

Table 7.4: Transferability evaluation results. Note that we fine-tune these models on the Toxigen dataset while testing them on other datasets, and we report the AUC score. For the Amazon dataset, due to confidentiality policy, we only report the increase in AUC score compared with RoBerta (whose AUC score is denoted by X).
First, we observe that compared with fine-tuning with labels only (Rationale = N/A), fine-tuning with both labels and rationales (Rationale = R_CoT/R_DToT) can improve the AUC score of student LMs by up to 2.35% on the testing datasets. Second, FC-T5 fine-tuned with both labels and rationales has significantly better AUC scores on all datasets compared with RoBerta. Lastly, while we only fine-tune FC-T5 on the Toxigen dataset, it still outperforms the teacher LLM (i.e. ChatGPT) on both the SBIC and DHate datasets. Hence, we conclude that fine-tuning with rationales effectively improves the cross-dataset transferability of student LMs.
Q4: How does the student LM's size affect the detection accuracy? So far, we have used FC-T5 with 3B parameters as our student LM in the rationale distillation experiments. To investigate how the model size affects the student LM's performance, we fine-tune Flan-T5 models of different sizes with rationales on the Toxigen dataset. As reported in Table 7.5, fine-tuning with both labels and rationales consistently enhances the student LMs' performance on the Toxigen dataset across model sizes. Moreover, as the model size becomes smaller, the model performance decreases, and the performance gain provided by using rationales also becomes smaller, which indicates that larger student LMs can learn to generate rationales better.
Model     Size    Rationale |  Acc    F1     AUC
FC-T5     3B      N/A       |  84.99  83.00  92.43
FC-T5     3B      R_DToT    |  87.53  85.46  94.37
F-T5-XL   3B      N/A       |  85.54  83.52  92.51
F-T5-XL   3B      R_DToT    |  87.53  85.42  93.17
F-T5-L    770M    N/A       |  83.44  81.57  92.42
F-T5-L    770M    R_DToT    |  84.22  82.24  93.18
F-T5-B    220M    N/A       |  81.02  79.43  91.26
F-T5-B    220M    R_DToT    |  81.35  79.66  91.16

Table 7.5: Impact of the student LM's size on rationale distillation. Note that we use the Toxigen dataset for a case study, and F-T5-XL/F-T5-L/F-T5-B is short for Flan-T5-XL/Flan-T5-Large/Flan-T5-Base.
Q5: Can student LMs fine-tuned with rationales actually generate high-quality rationales? We
conduct a case study on the quality of rationales generated by FC-T5 using different approaches via manual
check. We observe that after fine-tuning with both labels and rationales (RDT oT ), FC-T5 always provides
responses with rich rationales. In contrast, fine-tuning with labels only makes FC-T5 overfit the binary
labels and hence output ‘Yes/No’ answers only without rationales. In addition, without fine-tuning, FC-T5
cannot consistently generate meaningful rationales (see Table 7.6 for detailed examples).
7.5 Related Work
7.5.1 Toxic Content Detection
Prior works on toxic content detection can be categorized into two types. One line of work focuses on creating benchmark datasets for toxic content detection, either by crowdsourcing and annotating human-written text [323, 317, 318], or by leveraging ML-based approaches to generate high-quality toxic datasets in a scalable way [316]. Another line of work proposes novel approaches to fine-tune LMs on toxic datasets. The authors of [309] propose HateBERT, a pre-trained BERT model for abusive language detection, which significantly outperforms the original BERT model.
Model   Method       Response
FC-T5   CoT          Yes, this sentence contains toxic content.
FC-T5   FT           Yes.
FC-T5   FT+R_DToT    Yes, this sentence contains toxic content. Rationale: The sentence contains hate speech and biased content towards immigrants. It implies that immigrants are not welcome in America and should stay in their own country to fight for their freedom.

Table 7.6: Responses generated by FC-T5 using different approaches. Note that the prompt asks the model to detect toxic content in the sentence "why do you have to come to America, you should stay in your own country and fight for your freedom, not come to America".
The authors of [310] propose a novel contrastive learning method to improve the cross-dataset performance of HateBERT. Most recently, researchers have started to use LLMs to detect toxic content. The authors of [311] design a generative prompt-based inference method to leverage LLMs for toxic content detection, and the authors of [312] propose an interpretable, unified language checking method (UniLC) to enhance the few-shot in-context learning capabilities of LLMs for misinformation, stereotype, and hate speech detection. Compared with these works, our work not only proposes a novel and orthogonal prompting approach that improves zero-shot/few-shot performance, but also extracts and distills rationales into a smaller yet more effective student LM for toxic content detection.
7.5.2 Prompting LLMs
LLMs such as ChatGPT [319] and Llama [324] have demonstrated superior zero-shot/few-shot in-context learning capabilities and generalizability on a variety of language tasks without fine-tuning [325]. However, their performance is heavily dependent on the quality of the input prompts [326]. Hence, a variety of prompting approaches have been proposed to improve the quality and robustness of LLMs' responses [327, 328, 329, 330, 313, 331]. The authors of [327] and [329] propose Chain-of-Thought (CoT) and self-consistent CoT, which prompt LLMs step by step to improve their performance on reasoning tasks. The work in [328] generalizes CoT into Tree-of-Thought (ToT), which enables LLMs to explore multiple intermediate steps for complex problem-solving tasks. Different from CoT and ToT, which are suitable for step-by-step reasoning problems, the DToT prompting proposed in this work is designed for classification problems, with a novel confidence checker and context selector that iteratively search for and inject more fine-grained context into prompts to enhance the classification confidence of LLMs. Other works have proposed to leverage the in-context learning capability of LLMs by augmenting the prompts with demonstrations [313], rationales [331], or grounded information [330, 312]. Note that our DToT prompting is orthogonal to existing in-context learning methods for LLMs (see Section 7.2.2).
7.5.3 Distilling LLMs
Some recent works also tackle the problem by distilling LLMs into smaller LMs for domain-specific tasks [332, 333, 334, 335]. The authors of [334] propose PINTO, which uses LLMs' rationales as context for a small LM to improve its performance on reasoning tasks. The authors of [335] further propose SCOTT, a faithful knowledge distillation method that elicits self-consistent rationales from LLMs to fine-tune a small yet more faithful LM for open-domain question-answering tasks. Both [332] and [333] demonstrate that LLMs can be distilled into smaller but more effective LMs by fine-tuning with both answers and rationales on commonsense reasoning and arithmetic tasks. Different from these works, our work focuses on the domain of toxic content detection. Moreover, we propose DToT, which extracts better rationales from LLMs, and demonstrate that using DToT-extracted rationales further improves the effectiveness of distillation for toxic content detection.
7.6 Conclusion
In this work, we propose an end-to-end approach to bootstrapping and distilling LLMs for toxic content
detection, where a novel prompting method named DToT is designed to enhance LLMs’ detection performance and extract better rationales, and smaller LMs are fine-tuned with both labels and DToT-extracted
rationales. Our evaluation results on four datasets consistently demonstrate the effectiveness of both the
proposed DToT prompting and the proposed fine-tuning method for toxic content detection.
Chapter 8
Customized Synthetic Data Generation for Private Training of
Specialized ML Models
Specialized ML models tailored to users’ needs and requests are increasingly being deployed on smart devices to provide personalized intelligent services. However, two primary challenges hinder the training of
such models: the lack of publicly available labeled data suitable for specialized tasks and the inaccessibility of labeled private data due to concerns over privacy and labeling efforts. To address these challenges,
we propose a novel system that generates customized synthetic image data for specialized model training
using a few sanitized reference images. Our system offers users fine-grained, object-level sanitization control over the reference images, which allows users to trade between the privacy and utility of the generated
synthetic data according to their privacy preferences. Through experiments on three specialized model
training tasks, we demonstrate the feasibility of using customized synthetic images for specialized ML
model training with minimal usage of real users’ private data. We further demonstrate that our proposed
system can enhance the performance of specialized models without violating users’ privacy preferences.
8.1 Introduction
With the exponential growth of smart devices (e.g. smart speakers, smart monitors, smartwatches), Machine Learning (ML) models are increasingly being deployed on such smart devices to provide intelligent
services for users [336, 337]. A prominent trend in this evolution is the development of specialized ML
models that are specifically tailored to the users’ needs and requests, thereby enhancing the user experience across various applications [338], including smart voice assistants [339], wearable technologies [340],
specialized healthcare [341], etc. Based on the availability and sensitivity of data and computation needs,
these specialized ML models can be trained either on the server or on local devices.
In this paper, we explore a novel scenario for training specialized private computer vision models with
minimal usage of user’s sensitive data, as illustrated in Figure 8.1. The training process starts when the user
sends a specialized ML model training request to the server through the local device. Then, the server will
automatically train an ML model tailored to the user's request and deploy it on the user's local device. Particularly,
the user’s local device is assumed to have no labeled dataset, but the user may be willing to share a few
unlabeled data points with the server as references based on the user’s privacy preference.
There are two major challenges to training such specialized ML models that can satisfy the unique
needs of users. First, public labeled data or models that fit the specialized tasks are often unavailable (e.g.
training an image classifier for uncommon objects). Second, the server can not have access to any labeled
private data, due to privacy concerns and the lack of appropriate data annotation systems in the local
devices.
Figure 8.1: Problem statement. The user sends a request about the model they need and a few reference images; the server automatically trains a model for the user and returns the model weights to the user's device.
To handle these practical challenges, prior works have suggested utilizing high-quality synthetic data
generated by large Diffusion Models (DMs) as training data [342, 343]. While this approach provides a feasible alternative for generating large amounts of training data, the distribution of the generated synthetic
data will significantly deviate from the user’s private data, resulting in suboptimal model performance.
To better tailor the distribution of generated synthetic data, recent works have proposed methods for
customizing large DMs. These methods involve fine-tuning the models on a set of reference images [344]
or incorporating conditional reference image features to the image generation process [345]. Synthetic
data generated by such customized DMs can significantly enhance the performance of specialized models
[346]. However, customizing DMs on reference images or image features can lead to privacy concerns, as
sensitive information about users can leak from the reference images or image features shared with the
server. The trade-offs between privacy and utility of customized synthetic data has not been well explored
in prior research works.
Therefore, in this paper, we propose a novel system to generate customized synthetic image data for
specialized model training, which allows users to trade between the privacy and utility of generated synthetic data according to their privacy preferences.
At a high level, our system provides users with fine-grained, object-level privacy control over the
reference images shared with the server. This enables the selective removal of sensitive objects or features,
while non-sensitive objects or features are retained and shared to maximize the utility of the customized
synthetic data for training specialized models.
Specifically, our proposed system comprises three critical components designed to enhance privacy
and customization: 1) a light-weight object segmentation module located on the local device, which partitions reference images into distinct image segments (e.g. target objects and background objects); 2) an
image sanitizer, also on the local device, that removes sensitive features from each image segment according to users’ privacy preferences; 3) a DM fine-tuning pipeline on the server side, designed to generate
customized image segments and seamlessly merge them into the final synthetic images. Our evaluation
results on three unique specialized model training tasks demonstrate that the proposed system can enhance
the performance of specialized models without compromising user’s privacy preference.
8.2 Threat Model
User. We assume that the user can request the server to build a specialized vision ML model tailored to a
specific task and mobile devices. The user request is a text prompt specifying the requirements for training
a specialized model, which will subsequently be deployed on the user's local device. We assume that there
is an absence of publicly available labeled datasets for training the specialized model. Additionally, we
hypothesize that the user might agree to share a limited number of sanitized reference images with the
server to customize the generated synthetic data and thereby improve the accuracy of the specialized
model. This consent can be acquired based on different levels of privacy preferences that will be available
to the user.
Server. We assume that the server has substantial computational resources to fine-tune and deploy large
DMs for generating customized synthetic data. This data will be used to train specialized mobile ML models
tailored to meet the user’s specific requirements. Additionally, we assume that the server can be curious,
implying that it may attempt to infer sensitive information from any data shared by the user. This includes
both the text prompts detailing the user’s requests and the sanitized reference images.
8.3 Design
In this section, we present the design of the proposed system for private training of specialized ML models.
[Figure 8.2 depicts the end-to-end pipeline: on the device, the object detection and segmentation module and the image sanitizer turn the user request, privacy preference, and reference images into sanitized target-object and background payloads (L0: text only, L1: text + image features, L2: text + raw images); on the server, DMs for target-object generation and background inpainting are fine-tuned when image data is shared, customized synthetic data is generated, and the resulting mobile ML model weights are sent back to the device.]
Figure 8.2: Details of the proposed system.
8.3.1 System Overview
Figure 8.2 demonstrates the proposed system, which contains multiple modules on both the user's device and the server. Specifically, on the device side, the system consists of: 1) an object detection and segmentation model for detecting and isolating the target object from the background, and 2) an image sanitizer module for removing sensitive features in both the target object and the background (see Section 8.3.2). The server-side system consists of three modules: 1) a DM fine-tuning pipeline to customize the target object and background generation, 2) a synthetic data generation module to produce the customized synthetic data, and 3) a mobile ML model training module to train mobile ML models with the generated synthetic data (see Section 8.3.3 for details).
At a high level, the proposed system works as follows. First, the user specifies the training objective
and requirements of the mobile ML model in a text request. The user has the option to share a few unlabeled private images with the server as reference training data. Suppose the user chooses to share a
few reference images, then the image sanitizer generates sanitized images based on user’s privacy preference and only sends the sanitized images to the server. Next, the server fine-tunes DM [5] to generate
customized synthetic data that satisfies user request and follows the distribution of sanitized reference
Key                  Value example
Target object        Dog
Background           Room
Training objective   A ML model detects my dog's status
Label classes        Eating, sitting, sleeping, playing

Table 8.1: An example of a user request, where the goal is to train a specialized ML model to monitor the dog's status in the user's room.
images. Lastly, the server uses the customized synthetic data to train a mobile ML model and then sends
the model weights back to the device. We describe the design details below.
8.3.2 Device-side System Design
User request. The user request comprises a list of key-value pairs specifying the training requirements for the specialized ML model. As demonstrated in Table 8.1, the user request contains four keys: target object, background, training objective, and label classes. The target object and background are used to instruct the server to generate images featuring a specific type of target object against a specified background. For example, if the user wishes to have a specialized ML model to monitor their dog's status in their room, the target object and the background would be specified as "dog" and "room" respectively, such that the server can generate images of a dog in a room. Moreover, the training objective indicates the functionality of the specialized ML model, while the label classes specify the categories of the generated images. For instance, in the scenario involving monitoring the dog's status, the label classes can include "eating", "sitting", "sleeping", and "playing", which are used to classify the dog's status within the images. A machine-readable sketch of such a request is shown below.
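The following snippet is one possible machine-readable form of the request in Table 8.1; the key names, the transport format, and the derivation of the text prompts are illustrative assumptions rather than the exact schema used by the system.

```python
# A possible machine-readable form of the user request in Table 8.1.
user_request = {
    "target_object": "dog",
    "background": "room",
    "training_objective": "A ML model detects my dog's status",
    "label_classes": ["eating", "sitting", "sleeping", "playing"],
}

# Text prompts for target-object generation combine the target object with each
# label class; the background prompt is simply the background value (see the
# image sanitizer description below).
target_prompts = [f"a {user_request['target_object']} is {c}" for c in user_request["label_classes"]]
background_prompt = user_request["background"]
```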
Object detection and segmentation module. The purpose of this module is to identify and separate
the objects in the reference images, which enables fine-grained privacy control for each object based on
the user’s individual privacy preferences. In this work, we group the objects within an image into two
categories: target objects and background objects. The target objects, which are specified by the user in
their request, are the primary focus of the model’s training. As an example, if the user request is to train
a ML model to monitor the dog’s status, the target object will be dog.
The object detection and segmentation module is built on an off-the-shelf object detection and segmentation model for mobile devices. For each reference image, it runs the detection and segmentation model to detect and segment a list of objects. Based on the target object defined in the user request, this module splits the image into two distinct segments: one containing only the target object (the target image) and another containing all remaining objects (the background image). This module allows users to control the privacy of different image segments according to their specific needs, as sketched below.
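The sketch below shows one way this split could be implemented on-device with an off-the-shelf YOLOv8 segmentation model; the checkpoint name and the mask post-processing (resizing each predicted mask back to the image resolution) are assumptions made for illustration.

```python
# Sketch of the on-device object detection and segmentation step using an
# off-the-shelf YOLOv8 segmentation model.
import cv2
import numpy as np
from ultralytics import YOLO

seg_model = YOLO("yolov8n-seg.pt")  # assumed lightweight segmentation checkpoint

def split_target_and_background(image: np.ndarray, target_class: str):
    """Return (target_image, background_image) for one reference image."""
    result = seg_model(image)[0]
    target_mask = np.zeros(image.shape[:2], dtype=bool)
    if result.masks is not None:
        for mask, cls_id in zip(result.masks.data, result.boxes.cls):
            if result.names[int(cls_id)] == target_class:
                # Resize the predicted mask to the original image resolution.
                m = cv2.resize(mask.cpu().numpy(), (image.shape[1], image.shape[0])) > 0.5
                target_mask |= m
    target_image = np.where(target_mask[..., None], image, 0)       # target object only
    background_image = np.where(target_mask[..., None], 0, image)   # everything else
    return target_image, background_image
```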
Image sanitizer. This module is designed to generate text prompts for the input image segment (i.e.
target object or background), and remove sensitive features from the input. Specifically, the text prompts
are synthesized based on the user request. As illustrated in Table 8.1, the user request specifies the target
object, label classes for the target object, and background. The text prompts for the target object consist of
all possible combinations of the target object and each label class. For example, in the task of monitoring
a dog’s status (see Table 8.1), the set of text prompts would include: "a dog is eating," "a dog is sitting," "a
dog is sleeping," and "a dog is playing." Similarly, the text prompt for the background is simply the value
of the background specified in the user request.
Moreover, the image sanitizer provides three sanitization schemes to remove sensitive features from
the input, denoted by L0, L1 and L2 respectively. We describe the details of these schemes below.
1. L0: This scheme only generates text prompt of the input image segment (either target object or
background) and sends it to the server. As an example, if the input image segment contains a dog,
the output text of the image sanitizer under L0 scheme will be “dog”. Note that the text prompt of
the target object and background are extracted from user request (see Table 8.1 for example). When
the input image segment is highly sensitive, L0 should be used to maximize the privacy.
240
2. L1: This scheme extracts non-sensitive image features from the input image segment and sends both these features and the text prompt of the image to the server. Note that the extracted image features may include canny edges, object skeletons, layout boxes, etc., which can be used for conditional synthetic image generation [345]. L1 is suitable when the features of the input image segment are non-sensitive.
3. L2: This scheme sends the raw input image segment along with the text prompt of the image segment
to the server. When the input image segment is non-sensitive, using L2 is recommended to maximize
the utility of the sanitized output.
For each reference image, after the object detection and segmentation module has isolated the target object
and background, the image sanitizer processes these components in parallel to produce sanitized versions
of both the target object and background. Note that different sanitization schemes may be applied to the
target object and background as needed. When L0 scheme is applied to the target object or background,
the sanitized target object or background is represented by text prompts only. When L1 or L2 scheme is
applied to the target object or background, the sanitized target object or background includes both text
and sanitized reference images, which comprise either image features or raw images, depending on the
scheme applied.
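A minimal sketch of the three sanitization schemes is given below, using canny edges as the L1 feature (matching the experiments in Section 8.4); the payload format returned to the server is an assumption made for illustration.

```python
# Sketch of the three sanitization schemes (L0/L1/L2) applied to one image segment.
import cv2
import numpy as np

def sanitize(segment: np.ndarray, prompt: str, scheme: str) -> dict:
    if scheme == "L0":
        # Text only: maximal privacy, no image data leaves the device.
        return {"prompt": prompt}
    if scheme == "L1":
        # Text + non-sensitive image features (here: canny edges).
        gray = cv2.cvtColor(segment, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 100, 200)
        return {"prompt": prompt, "features": edges}
    if scheme == "L2":
        # Text + raw image segment: maximal utility for non-sensitive content.
        return {"prompt": prompt, "image": segment}
    raise ValueError(f"unknown sanitization scheme: {scheme}")

# Example: a (L2 for target, L0 for background) privacy preference shares the raw
# target object but only a text prompt for the background.
# target_payload = sanitize(target_image, "a dog is playing", "L2")
# background_payload = sanitize(background_image, "room", "L0")
```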
User's privacy preference. We define the user's privacy preference as $(L_i^f, L_i^b)$, where $L_i^f$ indicates that the $L_i$ scheme is used in the image sanitizer for the target object, and $L_i^b$ indicates that the $L_i$ scheme is used in the image sanitizer for the background. This dual parameterization allows the user to choose different sanitization levels for the target object and the background separately.
8.3.3 Server-side System Design
DM fine-tuning module. This module is designed to fine-tune DM in order to generate both customized
target object and background when sanitized reference images are shared. Upon receiving the sanitized
target object and background from the user device, it first checks whether the sanitized target object contains any image data. If the sanitized target object contains text only (i.e. L0 is used), the server does not fine-tune the target object generation DM. Instead, it uses the text as the prompt input of the DM to generate the target object. Conversely, if the server receives sanitized reference images (i.e. L1 or L2 is used), it fine-tunes the target object generation DM using these images, in order to produce synthetic target objects that mimic the distribution of the sanitized reference images. Specifically, if the sanitized reference images contain image features only (i.e. the L1 scheme is used), this module first employs ControlNet [345] to generate a set of new reference images conditioned on these image features. Following this, the module fine-tunes a pre-trained DM on these new reference images using the DreamBooth algorithm [344]. If the sanitized reference images are generated by the L2 scheme (which includes raw images), this module directly fine-tunes the pre-trained DM on these reference images through the DreamBooth algorithm.
Similarly, if the server receives sanitized reference images for background, it will fine-tune the DM to
generate synthetic background images that are aligned with the reference background images. Otherwise,
it uses the pre-trained DM without any fine-tuning to generate synthetic background images.
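The following sketch summarizes the server-side branching for the target-object DM under the three schemes; `controlnet_generate`, `dreambooth_finetune`, and the payload format are hypothetical wrappers that stand in for the ControlNet and DreamBooth steps described above, so this is a control-flow illustration rather than an implementation.

```python
# High-level sketch of the server-side decision for the target-object DM.
# The callables passed in are hypothetical wrappers around ControlNet-conditioned
# generation and DreamBooth fine-tuning.
def build_target_object_dm(payloads, pretrained_dm, controlnet_generate, dreambooth_finetune):
    """payloads: sanitized target-object payloads produced on the device
    (dicts with a 'prompt' and optionally 'features' or 'image')."""
    if all(set(p) == {"prompt"} for p in payloads):
        # L0: no image data shared; use the pre-trained DM with text prompts only.
        return pretrained_dm
    if any("features" in p for p in payloads):
        # L1: synthesize new reference images conditioned on the shared features
        # (e.g. canny edges) with ControlNet, then fine-tune with DreamBooth.
        references = [controlnet_generate(p["prompt"], p["features"])
                      for p in payloads if "features" in p]
    else:
        # L2: fine-tune directly on the shared raw image segments.
        references = [p["image"] for p in payloads if "image" in p]
    return dreambooth_finetune(pretrained_dm, references)
```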
Synthetic data generation module. This module operates in three sequential steps to produce synthetic
data. Initially, it utilizes the customized target object generation DM to create the target object based on
the provided input text prompt. Next, it generates a random background mask for the target object. Last, it
employs the customized background generation DM to fill the background mask and seamlessly integrate
the background with the generated target object. Notably, during our experiments, we merge the weights
of the fine-tuned background generation DM with those of a pre-trained inpainting DM. This fusion creates
a customized background inpainting DM specifically tailored for generating the background.
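A minimal sketch of the three generation steps using the diffusers library is given below; the model identifiers, the toy rectangular mask heuristic, and the image size are illustrative assumptions, and in practice the fine-tuned (customized) DM weights would replace the pre-trained checkpoints.

```python
# Sketch of the three generation steps: target-object generation, random mask,
# and background inpainting, using the diffusers library.
import numpy as np
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline, StableDiffusionInpaintPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
target_pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
inpaint_pipe = StableDiffusionInpaintPipeline.from_pretrained("runwayml/stable-diffusion-inpainting").to(device)

def generate_sample(target_prompt: str, background_prompt: str, size: int = 512) -> Image.Image:
    # Step 1: customized target-object generation from the text prompt.
    target = target_pipe(target_prompt, height=size, width=size).images[0]
    # Step 2: a random rectangular background mask (white = region to inpaint).
    mask = np.full((size, size), 255, dtype=np.uint8)
    x0, y0 = np.random.randint(0, size // 4, 2)
    mask[y0:y0 + size // 2, x0:x0 + size // 2] = 0  # keep the target region
    mask_image = Image.fromarray(mask)
    # Step 3: customized background inpainting merges background and target.
    return inpaint_pipe(prompt=background_prompt, image=target, mask_image=mask_image).images[0]
```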
Mobile ML model training. After the synthetic data generation module produces a set of synthetic
data, this module employs it to train a mobile ML model via supervised learning. Specifically, for tasks
where the text prompt explicitly specifies the label of the generated synthetic images, the label can be
directly extracted from the text prompt. For instance, if the text prompt used for target object generation
is “a dog is running”, the label for this image would be “running”. For tasks where the text prompt does
not provide specific label information (such as an object detection task), the server will utilize automatic
labeling models to generate synthetic labels for training (see Section 8.4.3 for details).
8.3.4 Privacy Measurement
We evaluate the end-to-end privacy of the proposed system using two metrics. The first metric is the
average Peak Signal-to-Noise Ratio (PSNR) between the user’s private images and the synthetic images
generated by the server [347]. PSNR is a measure that quantifies the ratio between the maximum possible
power of a signal and the power of corrupting noise that affects the fidelity of its representation. In simpler
terms, PSNR assesses how much noise a synthetic image contains compared to the user’s private image.
A higher PSNR indicates that the synthetic image more accurately reconstructs the private image (i.e. less
distortion), thus suggesting greater privacy leakage.
The second metric is the Semantic Embedding Similarity (SIM) between the user’s private images and
the synthetic images generated by the server. For this, we employ a large pre-trained vision-language
model Blip2 [348] to generate image embeddings for each private and synthetic image. We then calculate
the average cosine similarity between the embeddings of the private images and those of the synthetic
images. A higher similarity indicates greater privacy leakage, as it suggests that the synthetic images are
more similar to the user’s private images.
8.4 Experimental setup
8.4.1 Real-world Tasks and Datasets
Pet status monitoring. This task involves training a mobile ML model to monitor the status of pets for
the user. We create a user dataset containing husky dogs, with behaviors categorized into four statuses:
playing, running, sitting, sleeping. We refer to this dataset as husky dataset, and use it to evaluate the
accuracy of the mobile ML model trained by the server.
Human activity monitoring. This task focuses on training a mobile ML model to monitor the daily
activities of senior individuals within a home environment. We utilize a subset of the Toyota Smarthome
dataset [349], which contains contains 16.1K video clips with 31 activity classes performed by 18 senior
people in a large house with 7 cameras. Specifically, we sample the image frames of a single senior person
engaging in four activities: eating, drinking, walking, reading. We denote the sampled data as smarthome
dataset.
Non-popular object detection. This task involves fine-tuning an object detection model to detect non-popular objects for users. As a case study, we consider the task of detecting a pill bottle in a bedroom
environment, and we create an image dataset consisting of a special pill bottle in a bedroom to evaluate
the detection accuracy of mobile ML model. This dataset is named as bottle dataset.
8.4.2 Synthetic Dataset Generation
We employ the YOLO-v8 segmentation model [350] within the object detection and segmentation module
to separate the target object from the background. For the generation process, we utilize Stable-Diffusion-v1.5 [5] as the pre-trained Diffusion Model (DM) for both target object and background generation in our
experiments. We describe how we generate customized synthetic datasets for each task in detail below.
Pet status monitoring. In this task, given that the images in this dataset mainly contain the target object, we apply the L0 sanitization scheme to the background. For the target object (i.e. the husky dog in this case), we assume that the user may have different privacy preferences. Therefore, we consider three different sanitization schemes for the target object (ranging from L0 to L2). This results in three different user privacy preferences: $(L_0^f, L_0^b)$, $(L_1^f, L_0^b)$, and $(L_2^f, L_0^b)$. For each user privacy preference, we generate 1,600 synthetic images (400 for each image class), and use them to train a dog status classifier. Note that for $(L_0^f, L_0^b)$, we use the pre-trained DM without fine-tuning to generate synthetic images. For $(L_1^f, L_0^b)$, we select canny edges as the image feature, and the image sanitizer shares the canny edges of the target object with the server. The server then uses ControlNet, incorporating the canny edges as an additional prompt [345], to generate 10 synthetic reference images which are then used to fine-tune the DM for target object generation. For $(L_2^f, L_0^b)$, the image sanitizer directly shares the target objects in these reference images with the server for DM fine-tuning.
Human activity monitoring. In this task, we consider the target object (i.e. the senior person) as highly sensitive and accordingly apply the L0 sanitization scheme to the target object. For the background (i.e. a room environment), we recognize that the user may have different privacy preferences and thus we offer three sanitization schemes for it (ranging from L0 to L2). This leads to three different user privacy preferences: $(L_0^f, L_0^b)$, $(L_0^f, L_1^b)$, and $(L_0^f, L_2^b)$. For each user privacy preference, we generate 1,600 synthetic images (400 for each image class), which are used to train a human activity classifier. Note that for $(L_0^f, L_1^b)$, we select canny edges as the image feature, and the image sanitizer shares the canny edges of the background with the server.
Non-popular object detection. In this task, we consider the scenario where the user may have different privacy preferences for both the target object (i.e. a pill bottle) and the background (i.e. a bedroom), and we implement six user privacy preferences: $(L_0^f, L_0^b)$, $(L_0^f, L_1^b)$, $(L_0^f, L_2^b)$, $(L_1^f, L_0^b)$, $(L_1^f, L_1^b)$, and $(L_1^f, L_2^b)$. For each user privacy preference, we generate 1,600 synthetic images, and the bounding box labels for the pill bottles are automatically generated by the Grounding DINO model [351]. Note that we select canny edges as the image feature for the L1 scheme on both the target object and the background.
8.4.3 Training and Testing
For the pet status monitoring and human activity monitoring tasks, we use MobileNet-v2 [352] as the backbone and add a linear layer followed by a softmax layer as our specialized mobile ML model. Note that we use the classification accuracy as the metric to measure the performance of these specialized models. For the non-popular object detection task, we use the pre-trained YOLO-v8 detection model [350] as the specialized model. To measure its performance, we use mAP50 (mean average precision calculated at an intersection over union (IoU) threshold of 0.50) as the metric, which is commonly used in object detection.
During the training process, we use 80% of the synthetic data for training and the remaining 20% for validation. For each user privacy preference in each task, we train each specialized model for 5 epochs and select the model with the highest validation accuracy as our final model. During the testing phase, we test the accuracy of these specialized models on real-world data (see Section 8.4.1).
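A sketch of the specialized classifier setup for the two monitoring tasks is shown below; aside from the MobileNet-v2 backbone, linear head, and 5-epoch budget stated above, the optimizer settings and the data-loading code are illustrative assumptions.

```python
# Sketch of the specialized classifier: a MobileNet-v2 backbone with a linear
# classification head, trained with supervised learning on the synthetic data.
import torch
import torch.nn as nn
from torchvision import models

def build_specialized_model(num_classes: int) -> nn.Module:
    model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
    # Replace the final linear layer to match the task's label classes.
    model.classifier[1] = nn.Linear(model.last_channel, num_classes)
    return model

model = build_specialized_model(num_classes=4)  # e.g. four dog statuses
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # learning rate is an assumption
criterion = nn.CrossEntropyLoss()  # softmax is folded into the loss

def train_one_epoch(loader):
    """loader: yields (images, labels) mini-batches from the 80% training split."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```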
8.5 Evaluation
8.5.1 Utility
We first evaluate the utility of customized synthetic data with various user privacy preferences, using
the performance of specialized ML models as the metric (i.e. accuracy for husky dataset and smarthome
dataset, and mAP for bottle dataset). We report the model performance results on three tasks in Table 8.2.
Note that we report the model performance on both the validation sets of the generated synthetic data and
the real-world datasets.
First, we observe that using the L2 scheme for target object or background sanitization can significantly improve the accuracy of models trained with the customized synthetic data on all three real-world datasets. This is expected since the L2 scheme sends parts of the raw reference images to the server, which enables the fine-tuned DMs to generate synthetic images that closely resemble these raw reference images.
Next, we observe that using the L1 scheme for target object or background sanitization may not always lead to better model accuracy. Since L1 sends canny edges of the reference images to the server, whether these image features help to generate better synthetic images depends on whether canny edges are important to the model utility. For instance, prior works have shown that canny edges are important features for object detection [353]. During our experiments on the bottle dataset, we consistently observe that using the L1 scheme effectively boosts the performance of specialized models.
Furthermore, we find that specialized models trained on synthetic data tend to overfit this data, and
hence they fail to generalize to the real-world testing data well. As shown in Table 8.2, the model accuracy
on synthetic validation data (see Section 8.4.2) is generally higher than that in the real-world testing data,
since the synthetic validation data has the same distribution as the synthetic training data, while the real-world
testing data does not. To prevent the model from overfitting the synthetic training data, we sample the
training data from synthetic data generated by different schemes, and we observe this approach can significantly improve the model accuracy. For example, on Husky dataset (see Table 8.2a), combining the
synthetic data generated with L0 and L2 for the target object increased model accuracy on real-world data
by 8.34%. Similar improvements are observed in other datasets.
8.5.2 Privacy
Next, we evaluate the privacy of customized synthetic data with various user privacy preferences. We
report the results in Table 8.3. Note that PSNR measures the average Peak Signal-to-Noise Ratio between
the user’s private images and the synthetic images generated by the server. Higher PSNR value indicates
that less distortion exists in the generated synthetic images compared with the user’s privacy image, suggesting that more privacy information is being leaked. SIM is a metric measuring the semantic similarity
between user’s private image and the generated synthetic images. Higher SIM value indicates that the
synthetic images generated by the server are more similar to the user’s private images, thereby leaking
more private information.
Privacy preference                     Accuracy (synthetic data)   Accuracy (real-world data)
$(L_0^f, L_0^b)$                       87.88%                      63.46%
$(L_1^f, L_0^b)$                       83.45%                      57.05%
$(L_2^f, L_0^b)$                       88.74%                      64.74%
$(L_0^f, L_0^b)$ + $(L_1^f, L_0^b)$    81.82%                      62.82%
$(L_0^f, L_0^b)$ + $(L_2^f, L_0^b)$    83.10%                      73.08%

(a) Husky dataset. Note that this dataset contains images of husky dogs exhibiting four statuses: playing, running, sitting, and sleeping.
Privacy preference                     Accuracy (synthetic data)   Accuracy (real-world data)
$(L_0^f, L_0^b)$                       85.33%                      32.13%
$(L_0^f, L_1^b)$                       83.45%                      27.72%
$(L_0^f, L_2^b)$                       74.93%                      38.78%
$(L_0^f, L_0^b)$ + $(L_0^f, L_1^b)$    79.43%                      43.37%
$(L_0^f, L_0^b)$ + $(L_0^f, L_2^b)$    72.74%                      43.71%

(b) Smarthome dataset. Note that this dataset contains images of a single senior person in a home environment engaged in four activities: eating, drinking, walking, and reading.
Privacy preference                     mAP50 (synthetic data)      mAP50 (real-world data)
$(L_0^f, L_0^b)$                       86.75%                      20.85%
$(L_0^f, L_1^b)$                       91.93%                      57.52%
$(L_0^f, L_2^b)$                       90.94%                      87.96%
$(L_1^f, L_0^b)$                       76.01%                      93.59%
$(L_1^f, L_1^b)$                       91.86%                      95.88%
$(L_1^f, L_2^b)$                       73.42%                      97.94%
$(L_0^f, L_2^b)$ + $(L_1^f, L_2^b)$    78.90%                      98.32%

(c) Bottle dataset. Note that this dataset contains images of a special pill bottle located in a bedroom.

Table 8.2: Model accuracy evaluation results.
On the Husky and Smarthome datasets, we observe that the synthetic data generated by the L1 and L2 schemes exhibits both higher PSNR and higher SIM scores, compared with the data generated by L0. This indicates that the synthetic data generated by L1 and L2 is more similar to the user's private data, causing more privacy leakage. On the Bottle dataset, although the synthetic data generated by the L1 and L2 schemes has lower PSNR scores, it shows higher SIM scores, which means that, semantically, the generated images are more similar to the user's private images. This is expected, since the synthetic data generated by the L1 and L2 schemes is customized to closely follow the distribution of the reference images. Despite the increased PSNR or SIM scores observed with the L1 and L2 schemes on these datasets, it is worth noting that these increases are relatively modest, indicating only a moderate enhancement in the similarity of the synthetic data to the user's private data.
8.5.3 Privacy-Utility Trade-offs
In this subsection, we compare the privacy-utility trade-off performance of customized synthetic data with various user privacy preferences in Figure 8.3. Note that privacy leakage is measured by SIM, which represents the semantic similarity between the generated synthetic images and the user's private images. The model utility represents the performance of the specialized model trained on the synthetic data. The top-left part of these figures indicates both higher privacy and higher utility.
As illustrated in Figure 8.3, $(L_0^f, L_0^b)$ always provides the user with the highest privacy but the worst utility on all three datasets. This is expected since no reference images are shared with the server under the $(L_0^f, L_0^b)$ privacy preference. Moreover, we observe that synthetic data generated by the L2 scheme has privacy leakage levels similar to those generated by the L1 scheme, while providing better utility. It is worth noting that the user only samples a small set of reference images for customized synthetic generation. Hence, although the L2 scheme may leak more private information about the reference images compared with the L1 scheme, the average privacy leakage across all of the user's private data will be less significant. In practice, users can choose the scheme that best balances the privacy and utility of the generated synthetic data based on Figure 8.3, according to their specific privacy preferences.
Privacy preference        PSNR     SIM
$(L_0^f, L_0^b)$          32.28    0.5249
$(L_1^f, L_0^b)$          32.75    0.6136
$(L_2^f, L_0^b)$          32.77    0.6153

(a) Husky dataset. Note that this dataset contains images of husky dogs exhibiting four statuses: playing, running, sitting, and sleeping.
Privacy preference        PSNR     SIM
$(L_0^f, L_0^b)$          32.91    0.3534
$(L_0^f, L_1^b)$          35.01    0.5242
$(L_0^f, L_2^b)$          34.75    0.5154

(b) Smarthome dataset. Note that this dataset contains images of a single senior person in a home environment engaged in four activities: eating, drinking, walking, and reading.
Privacy preference                   Privacy
                                     PSNR     SIM
(L^f_0, L^b_0)                       34.20    0.5527
(L^f_0, L^b_1)                       34.10    0.5867
(L^f_0, L^b_2)                       33.63    0.6970
(L^f_1, L^b_0)                       34.75    0.5541
(L^f_1, L^b_1)                       33.82    0.6645
(L^f_1, L^b_2)                       35.43    0.6317
(c) Bottle dataset. Note that this dataset contains images of a special pill bottle
located in a bedroom.
Table 8.3: Privacy evaluation results. Note that PSNR measures the average Peak Signal-to-Noise Ratio
between the user’s private images and the synthetic images generated by the server. SIM measures the
semantic embedding cosine similarity between the user’s private images and the synthetic images generated by the server. Higher values of both PSNR and SIM indicate that the synthetic images generated by
the server are more similar to the user’s private images, suggesting that more privacy information is being
leaked.
(a) Husky dataset. Note that this dataset contains images of husky dogs exhibiting four statuses: playing, running, sitting, and sleeping.
(b) Smarthome dataset. Note that this dataset contains images of a single senior person in a home environment engaged in four activities: eating, drinking, walking, and reading.
(c) Bottle dataset. Note that this dataset contains images of a special pill bottle located in a bedroom.
Figure 8.3: Privacy-utility trade-off results. Note that privacy leakage is measured by SIM, which represents the semantic similarity between generated synthetic images and the user's private images. The model utility represents the performance of the specialized model trained on synthetic data. The top-left part of these figures indicates both higher privacy and higher utility.
8.6 Related Work
Diffusion Models. The diffusion model (DM) was first introduced by Sohl-Dickstein et al. [354]. It involves a forward diffusion process that incrementally adds noise to data, and a reverse diffusion process that reconstructs the original data from noise. Later, Ho et al. [355] demonstrated that DMs can efficiently generate high-quality synthetic images, surpassing previous methods such as Variational Autoencoders (VAEs) [356] and Generative Adversarial Networks (GANs) [357]. To generate high-resolution synthetic images, latent DMs (also known as Stable Diffusion models) were proposed by Rombach et al. [5], conducting the diffusion and denoising processes in latent space. Additionally, various conditioning mechanisms introduced in [5] have transformed DMs into flexible conditional image generators, supporting applications like text-to-image and super-resolution image generation. Recent advancements have incorporated specific conditioning during the denoising phase to align synthetic images more closely with reference images in terms of edges, depth, and structure [358, 345, 359], fostering more controlled and realistic image generation.
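For completeness, the forward and reverse processes mentioned above can be written in standard DDPM notation (background material following [354, 355]; the notation is illustrative and not specific to our framework). Given a noise schedule \(\beta_1, \dots, \beta_T\),
\[
q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t \mathbf{I}\big),
\qquad
q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\mathbf{I}\big),
\quad
\bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s),
\]
and the reverse (denoising) process is learned as
\[
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\big).
\]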
Synthetic Data Generation. Extensive research has demonstrated that combining synthetic data with
real data can enhance the performance of machine learning models across critical vision and control applications, such as image classification, semantic segmentation, face recognition, and autonomous vehicle
control [360, 361, 362, 363, 364, 365, 366, 367, 368, 342, 343]. These studies have employed a range of generative models, from GANs [357] to Stable Diffusion [5], to create synthetic datasets for model training.
While previous efforts have primarily focused on using synthetic data to complement real-world training
data for improving model performance, our work investigates scenarios where users need to train specialized ML models on specific tasks involving private data distributions, and where labeled real-world data
is unavailable. Therefore, customized synthetic data needs to be generated for training specialized ML
models. To tailor the distribution of the generated synthetic data, recent works have proposed methods including fine-tuning the generative model on a set of reference images [344] or incorporating conditional reference-image features into the image generation process [345]. Different from these works, we further study the privacy leakage that arises when customizing the synthetic data generation process, and we propose a novel framework that allows users to balance the privacy and utility of customized synthetic data based on their privacy preferences.
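As a concrete illustration of the second family of methods (conditioning generation on a reference image), the sketch below uses the Hugging Face diffusers img2img pipeline; the model checkpoint, prompt, and strength value are illustrative assumptions and not the exact configuration used in this chapter.

```python
# Illustrative sketch only: steering Stable Diffusion generation with a reference
# image via the diffusers img2img pipeline. Checkpoint, prompt, and strength are
# placeholder choices, not the configuration used in our framework.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

reference = Image.open("sanitized_reference.png").convert("RGB").resize((512, 512))

# strength controls how far generation may drift from the reference:
# lower values keep the output closer to the reference image.
result = pipe(
    prompt="a husky dog sitting in a living room",
    image=reference,
    strength=0.6,
    guidance_scale=7.5,
).images[0]
result.save("customized_synthetic.png")
```

Fine-tuning-based customization [344] would instead adapt the model weights to the reference images before generation, rather than conditioning each generation call on a reference image.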
8.7 Conclusion
In this work, we propose a novel framework to generate customized synthetic image data for specialized ML model training, where the user only needs to share a few sanitized reference images. Our system provides users with fine-grained, object-level privacy control over the sanitized reference images, allowing them to balance privacy and utility according to their preferences. Our experiments, conducted across three distinct specialized model training tasks, demonstrate the viability of using customized synthetic images for specialized ML model training while minimizing the use of real users' private data. Moreover, our proposed system can enhance the performance of specialized ML models without compromising user
privacy.
Chapter 9
Conclusion
In this thesis, I study privacy-utility trade-off optimization problems in a variety of AI-enabled network applications. In Part I (Chapters 2-4), I introduce principled user data obfuscation approaches for centralized learning applications. By applying context-aware noise addition mechanisms to structured user location data and designing reinforcement learning-based systems for unstructured user profile data, this part offers robust solutions for protecting user location privacy and user profiling privacy while maintaining adequate utility for ML models. In Part II (Chapters 5-6), I investigate the privacy-utility trade-offs in applications using FL with SA. Both theoretical and empirical findings from this part demonstrate that the inherent randomness in the aggregated model updates can be leveraged to enhance the privacy of individual users' data without affecting the utility of ML models. In Part III (Chapters 7-8), I design innovative approaches that leverage large FMs for high-quality synthetic data generation, such that accurate specialized ML models can be trained with minimal reliance on actual user data. This part showcases the potential of large FMs in addressing privacy concerns without compromising the utility of ML models in AI-enabled network applications.
List of Works
[1] Jiang Zhang, Lillian Clark, Matthew Clark, Konstantinos Psounis, and Peter Kairouz.
“Privacy-utility trades in crowdsourced signal map obfuscation”. In: Computer Networks 215
(2022), p. 109187.
[2] Jiang Zhang, Konstantinos Psounis, Muhammad Haroon, and Zubair Shafiq. “HARPO: Learning
to Subvert Online Behavioral Advertising”. In: Network and Distributed Systems Security (NDSS)
Symposium (2022).
[3] Jiang Zhang, Hadi Askari, Konstantinos Psounis, and Zubair Shafiq. “A Utility-Preserving
Obfuscation Approach for YouTube Recommendations”. In: Proceedings on Privacy Enhancing
Technologies (2023).
[4] Ahmed Roushdy Elkordy*, Jiang Zhang*, Yahya H Ezzeldin, Konstantinos Psounis, and
Salman Avestimehr. “How Much Privacy Does Federated Learning with Secure Aggregation
Guarantee?” In: Proceedings on Privacy Enhancing Technologies (2023). (* indicates equal
contributions.)
[5] Jiang Zhang, Qiong Wu, Yiming Xu, Cheng Cao, Zheng Du, and Konstantinos Psounis. “Efficient
toxic content detection by bootstrapping and distilling large language models”. In: Proceedings of
the AAAI Conference on Artificial Intelligence 38.19 (2024), pp. 21779–21787.
[6] Jiang Zhang, Ahmed Roushdy Elkordy, Yahya H Ezzeldin, Konstantinos Psounis, and
Salman Avestimehr. Differentially Private Federated Learning without Noise Addition: When is it
Possible? Under resubmission.
[7] Jiang Zhang, Rohan Sequeira, and Konstantinos Psounis. Customized synthetic data generation for
private training of specialized ML models. Manuscript.
Bibliography
[1] Christian Janiesch, Patrick Zschech, and Kai Heinrich. “Machine learning and deep learning”. In:
Electronic Markets 31.3 (2021), pp. 685–695.
[2] Emmanouil Alimpertis, Athina Markopoulou, Carter Butts, and Konstantinos Psounis. “City-wide
signal strength maps: Prediction with random forests”. In: The World Wide Web Conference. 2019,
pp. 2536–2542.
[3] Ashley Rodriguez. YouTube’s recommendations drive 70% of what we watch. Quartz.
https://qz.com/1178125/youtubes-recommendations-drive-70-of-what-we-watch. 2018.
[4] WSJ Staff. Inside Tiktok’s Highly Secretive Algorithm.
https://www.wsj.com/video/series/inside-tiktoks-highly-secretive-algorithm/investigationhow-tiktok-algorithm-figures-out-your-deepest-desires/6C0C2040-FF25-4827-8528-2BD6612E3796.
2021.
[5] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer.
“High-resolution image synthesis with latent diffusion models”. In: Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition. 2022, pp. 10684–10695.
[6] Open Signal Inc. 3G and 4G LTE Cell Coverage Map.
[7] Tutela Technologies. Manage your Mobile Experience.
[8] Joseph Cox. “EFF Hits AT&T With Class Action Lawsuit for Selling Customers’ Location to
Bounty Hunters”. In: (July 2019).
[9] Elena Agapie, Gong Chen, Doug Houston, Eric Howard, J Kim, Min Y Mun, A Mondschein,
Sasank Reddy, R Rosario, Jason Ryder, et al. “Seeing Our Signals: Combining location traces and
web-based models for personal discovery”. In: Proceedings of the 9th workshop on Mobile
computing systems and applications. ACM. 2008, pp. 6–10.
[10] Yves-Alexandre de Montjoye, Sébastien Gambs, Vincent Blondel, Geoffrey Canright,
Nicolas De Cordes, Sébastien Deletaille, Kenth Engø-Monsen, Manuel Garcia-Herranz,
Jake Kendall, Cameron Kerry, et al. “On the privacy-conscientious use of mobile phone data”. In:
Scientific data 5 (2018).
[11] Charlie Warzel, Glenn S. Gerstell, Sarah Aziza, Thorin Klosowski, Farhad Manjoo Bremer,
Nadieh, Sarah Jeong, Angella Foster, David M. Primo, Jessica Rich, Albert Fox Cahn, and et al. The
Privacy Project. https://www.nytimes.com/series/new-york-times-privacy-project. Sept. 2019.
[12] New Survey Finds Deep Consumer Anxiety over Data Privacy and Security.
https://newsroom.ibm.com/2018-04-15-New-Survey-Finds-Deep-Consumer-Anxiety-over-DataPrivacy-and-Security. Apr. 2018.
[13] General Data Protection Regulation (GDPR). https://gdpr-info.eu/.
[14] California Consumer Privacy Act (CCPA). https://oag.ca.gov/privacy/ccpa.
[15] X. Wu, P. Yang, S. Tang, X. Zheng, and Y. Xiong. “Privacy preserving RSS map generation for a
crowdsensing network”. In: IEEE Wireless Communications 22.4 (Aug. 2015), pp. 42–48. issn:
1536-1284. doi: 10.1109/MWC.2015.7224726.
[16] Lalitha Sankar, S Raj Rajagopalan, and H Vincent Poor. “Utility-privacy tradeoffs in databases: An
information-theoretic approach”. In: IEEE Transactions on Information Forensics and Security 8.6
(2013), pp. 838–852.
[17] Cynthia Dwork, Aaron Roth, et al. “The algorithmic foundations of differential privacy”. In:
Foundations and Trends® in Theoretical Computer Science 9.3–4 (2014), pp. 211–407.
[18] Quan Geng, Wei Ding, Ruiqi Guo, and Sanjiv Kumar. “Privacy and Utility Tradeoff in
Approximate Differential Privacy”. In: arXiv preprint arXiv:1810.00877 (2018).
[19] Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, and Ram Rajagopal. “Generative
Adversarial Privacy”. In: CoRR abs/1807.05306 (2018). arXiv: 1807.05306. url:
http://arxiv.org/abs/1807.05306.
[20] E. Alimpertis, N. Fasarakis-Hilliard, and A. Bletsas. “Community RF Sensing for Source
Localization”. In: IEEE Wireless Communications Letters 3.4 (Aug. 2014), pp. 393–396. issn:
2162-2337. doi: 10.1109/LWC.2014.2321741.
[21] Emmanouil Alimpertis and Athina Markopoulou. “Using AntMonitor For Crowdsourcing Passive
Mobile Network Measurements”. In: NSDI’17 Poster Session (2017).
[22] Radiocell dataset. https://www.radiocells.org/.
[23] Azar Taufique, Mona Jaber, Ali Imran, Zaher Dawy, and Elias Yacoub. “Planning wireless cellular
networks of future: Outlook, challenges and opportunities”. In: IEEE Access 5 (2017),
pp. 4821–4845.
[24] Om Thakkar, Galen Andrew, and H Brendan McMahan. “Differentially private learning with
adaptive clipping”. In: arXiv preprint arXiv:1905.03871 (2019).
[25] H Brendan McMahan, Galen Andrew, Ulfar Erlingsson, Steve Chien, Ilya Mironov,
Nicolas Papernot, and Peter Kairouz. “A general approach to adding differential privacy to
iterative training procedures”. In: arXiv preprint arXiv:1812.06210 (2018).
[26] Vinko Erceg, Larry J Greenstein, Sony Y Tjandra, Seth R Parkoff, Ajay Gupta, Boris Kulic,
Arthur A Julius, and Renee Bianchi. “An empirically based path loss model for wireless channels
in suburban environments”. In: IEEE Journal on selected areas in communications 17.7 (1999),
pp. 1205–1211.
[27] Seon Yeong Han, Nael B Abu-Ghazaleh, and Dongman Lee. “Double regression: Efficient spatially
correlated path loss model for wireless network simulation”. In: 2013 Proceedings IEEE INFOCOM.
IEEE. 2013, pp. 1824–1832.
[28] Andreas F Molisch. Wireless communications. Vol. 34. John Wiley & Sons, 2012.
[29] RadioBeacon. https://f-droid.org/packages/org.openbmap/.
[30] AT&T Inc. https://www.att.com/.
[31] Tutela Inc. https://www.tutela.com/.
[32] Cynthia Dwork. “Differential privacy: A survey of results”. In: International Conference on Theory
and Applications of Models of Computation. Springer. 2008, pp. 1–19.
[33] Borja Balle and Yu-Xiang Wang. “Improving the gaussian mechanism for differential privacy:
Analytical calibration and optimal denoising”. In: arXiv preprint arXiv:1805.06530 (2018).
[34] Borja Balle. A Short Tutorial on Differential Privacy. Jan. 2018.
[35] Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, and Ram Rajagopal. “Context-aware
generative adversarial privacy”. In: Entropy 19.12 (2017), p. 656.
[36] Hsien-Chung Wu. “The Karush–Kuhn–Tucker optimality conditions in an optimization problem
with interval-valued objective function”. In: European Journal of Operational Research 176.1
(2007), pp. 46–59.
[37] Matthew A Clark and Konstantinos Psounis. “Trading utility for privacy in shared spectrum
access systems”. In: IEEE/ACM Transactions on Networking (TON) 26.1 (2018), pp. 259–273.
[38] Lillian Clark, Matthew Clark, Konstantinos Psounis, and Peter Kairouz. “Privacy-Utility Trades in
Wireless Data via Optimization and Learning”. In: Proceedings of Information Theory and
Applications Workshop (ITA) (Feb. 2019).
[39] Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis,
Arjun Nitin Bhagoji, Keith Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings,
et al. “Advances and open problems in federated learning”. In: arXiv:1912.04977 (2019).
[40] Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith. “Federated learning: Challenges,
methods, and future directions”. In: IEEE Signal Processing Magazine 37.3 (2020), pp. 50–60.
[41] Zhibo Wang, Mengkai Song, Zhifei Zhang, Yang Song, Qian Wang, and Hairong Qi. “Beyond
inferring class representatives: User-level privacy leakage from federated learning”. In: IEEE
INFOCOM 2019-IEEE Conference on Computer Communications. IEEE. 2019, pp. 2512–2520.
[42] Milad Nasr, Reza Shokri, and Amir Houmansadr. “Comprehensive privacy analysis of deep
learning: Passive and active white-box inference attacks against centralized and federated
learning”. In: 2019 IEEE symposium on security and privacy (SP). IEEE. 2019, pp. 739–753.
[43] Lingjuan Lyu, Han Yu, and Qiang Yang. “Threats to federated learning: A survey”. In: arXiv
preprint arXiv:2003.02133 (2020).
[44] Nguyen Truong, Kai Sun, Siyao Wang, Florian Guitton, and Yike Guo. “Privacy preservation in
federated learning: An insightful survey from the GDPR perspective”. In: Computers & Security
110 (2021), p. 102402.
[45] Chen Fang, Yuanbo Guo, Yongjin Hu, Bowen Ma, Li Feng, and Anqi Yin. “Privacy-preserving and
communication-efficient federated learning in Internet of Things”. In: Computers & Security 103
(2021), p. 102199. issn: 0167-4048. doi: https://doi.org/10.1016/j.cose.2021.102199.
[46] Evita Bakopoulou, Jiang Zhang, Justin Ley, Konstantinos Psounis, and Athina Markopoulou.
“Location leakage in federated signal maps”. In: arXiv preprint arXiv:2112.03452 (2021).
[47] Ligeng Zhu and Song Han. “Deep leakage from gradients”. In: Federated Learning. Springer, 2020,
pp. 17–31.
[48] Cynthia Dwork. “Differential privacy”. In: Encyclopedia of Cryptography and Security (2011),
pp. 338–340.
[49] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. “Extremal mechanisms for local differential
privacy”. In: Advances in neural information processing systems. 2014, pp. 2879–2887.
[50] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair,
Aaron Courville, and Yoshua Bengio. “Generative adversarial nets”. In: Advances in neural
information processing systems. 2014, pp. 2672–2680.
[51] Ramiro Camino, Christian Hammerschmidt, and Radu State. “Generating Multi-Categorical
Samples with Generative Adversarial Networks”. In: arXiv preprint arXiv:1807.01202 (2018).
[52] Mehdi Mirza and Simon Osindero. “Conditional Generative Adversarial Nets”. In: CoRR
abs/1411.1784 (2014). arXiv: 1411.1784. url: http://arxiv.org/abs/1411.1784.
[53] Mina Askari, Reihaneh Safavi-Naini, and Ken Barker. “An information theoretic privacy and
utility measure for data sanitization mechanisms”. In: Proceedings of the second ACM conference on
Data and Application Security and Privacy. ACM. 2012, pp. 283–294.
[54] Kousha Kalantari, Lalitha Sankar, and Oliver Kosut. “On information-theoretic privacy with
general distortion cost functions”. In: 2017 IEEE International Symposium on Information Theory
(ISIT). IEEE. 2017, pp. 2865–2869.
[55] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory (Wiley Series in
Telecommunications and Signal Processing). New York, NY, USA: Wiley-Interscience, 2006. isbn:
0471241954.
[56] Anand D Sarwate and Lalitha Sankar. “A rate-disortion perspective on local differential privacy”.
In: 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton).
IEEE. 2014, pp. 903–908.
[57] Xuebin Ren, Chia-Mu Yu, Weiren Yu, Shusen Yang, Xinyu Yang, Julie A McCann, and S Yu Philip.
“LoPub: High-Dimensional Crowdsourced Data Publication with Local Differential Privacy”. In:
IEEE Transactions on Information Forensics and Security 13.9 (2018), pp. 2151–2166.
[58] Weina Wang, Lei Ying, and Junshan Zhang. “On the relation between identifiability, differential
privacy, and mutual-information privacy”. In: IEEE Transactions on Information Theory 62.9
(2016), pp. 5018–5029.
[59] Mário S Alvim, Miguel E Andrés, Konstantinos Chatzikokolakis, Pierpaolo Degano, and
Catuscia Palamidessi. “Differential privacy: on the trade-off between utility and information
leakage”. In: International Workshop on Formal Aspects in Security and Trust. Springer. 2011,
pp. 39–54.
[60] Arpita Ghosh, Tim Roughgarden, and Mukund Sundararajan. “Universally utility-maximizing
privacy mechanisms”. In: SIAM Journal on Computing 41.6 (2012), pp. 1673–1693.
[61] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. “Our
Data, Ourselves: Privacy Via Distributed Noise Generation”. In: Advances in Cryptology -
EUROCRYPT 2006. Ed. by Serge Vaudenay. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006,
pp. 486–503. isbn: 978-3-540-34547-3.
[62] Jong Wook Kim, Dae-Ho Kim, and Beakcheol Jang. “Application of local differential privacy to
collection of indoor positioning data”. In: IEEE Access 6 (2018), pp. 4276–4286.
[63] Ashwin Machanavajjhala, Daniel Kifer, John Abowd, Johannes Gehrke, and Lars Vilhuber.
“Privacy: Theory Meets Practice on the Map”. In: Proceedings of the 2008 IEEE 24th International
Conference on Data Engineering. ICDE ’08. Washington, DC, USA: IEEE Computer Society, 2008,
pp. 277–286. isbn: 978-1-4244-1836-7. doi: 10.1109/ICDE.2008.4497436.
[64] Haiming Jin, Lu Su, Houping Xiao, and Klara Nahrstedt. “Inception: Incentivizing
privacy-preserving data aggregation for mobile crowd sensing systems”. In: Proceedings of the
17th ACM International Symposium on Mobile Ad Hoc Networking and Computing. 2016,
pp. 341–350.
[65] Reza Shokri, George Theodorakopoulos, Carmela Troncoso, Jean-Pierre Hubaux, and
Jean-Yves Le Boudec. “Protecting location privacy: optimal strategy against localization attacks”.
In: Proceedings of the 2012 ACM conference on Computer and communications security. 2012,
pp. 617–627.
[66] Nicolás E Bordenabe, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. “Optimal
geo-indistinguishable mechanisms for location privacy”. In: Proceedings of the 2014 ACM SIGSAC
conference on computer and communications security. 2014, pp. 251–262.
[67] Xin Chen, Tao Zhang, Sheng Shen, Tianqing Zhu, and Ping Xiong. “An Optimized Differential
Privacy Scheme with Reinforcement Learning in VANET”. In: Computers & Security (2021),
p. 102446.
[68] Spyros Boukoros, Mathias Humbert, Stefan Katzenbeisser, and Carmela Troncoso. “On (the lack
of) location privacy in crowdsourcing applications”. In: 28th {USENIX} Security Symposium
({USENIX} Security 19). 2019, pp. 1859–1876.
[69] Jong Wook Kim, Kennedy Edemacu, Jong Seon Kim, Yon Dohn Chung, and Beakcheol Jang. “A
Survey of Differential Privacy-based Techniques and their Applicability to Location-Based
Services”. In: Computers & Security (2021), p. 102464. issn: 0167-4048. doi:
https://doi.org/10.1016/j.cose.2021.102464.
[70] Mehmet Emre Gursoy, Ling Liu, Stacey Truex, and Lei Yu. “Differentially private and utility
preserving publication of trajectory data”. In: IEEE Transactions on Mobile Computing 18.10 (2018),
pp. 2315–2329.
[71] Mehmet Emre Gursoy, Ling Liu, Stacey Truex, Lei Yu, and Wenqi Wei. “Utility-Aware Synthesis
of Differentially Private and Attack-Resilient Location Traces”. In: Proceedings of the 2018 ACM
SIGSAC Conference on Computer and Communications Security. ACM. 2018, pp. 196–211.
[72] Rim Ben Messaoud, Nouha Sghaier, Mohamed Ali Moussa, and Yacine Ghamri-Doudane. “Privacy
preserving utility-aware mechanism for data uploading phase in participatory sensing”. In: IEEE
Transactions on Mobile Computing 18.9 (2018), pp. 2160–2173.
[73] Wenjing Zhang, Bo Jiang, Ming Li, Ravi Tandon, Qiao Liu, and Hui Li. “Aggregation-based
location privacy: An information theoretic approach”. In: Computers & Security 97 (2020),
p. 101953.
[74] Sicong Liu, Junzhao Du, Anshumali Shrivastava, and Lin Zhong. “Privacy adversarial network:
representation learning for mobile data privacy”. In: Proceedings of the ACM on Interactive, Mobile,
Wearable and Ubiquitous Technologies 3.4 (2019), pp. 1–18.
[75] Nisarg Raval, Ashwin Machanavajjhala, and Jerry Pan. “Olympus: Sensor Privacy through Utility
Aware Obfuscation”. In: Proceedings on Privacy Enhancing Technologies 2019 (2019), pp. 25–5.
[76] Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juarez, Arvind Narayanan, and
Claudia Diaz. “The Web Never Forgets: Persistent Tracking Mechanisms in the Wild”. In: ACM
Conference on Computer and Communications Security (CCS). 2014.
[77] Umar Iqbal, Steven Englehardt, and Zubair Shafiq. “Fingerprinting the Fingerprinters: Learning to
Detect Browser Fingerprinting Behaviors”. In: IEEE Symposium on Security & Privacy (S&P) (2021).
[78] Panagiotis Papadopoulos, Nicolas Kourtellis, and Evangelos Markatos. “Cookie synchronization:
Everything you always wanted to know but were afraid to ask”. In: The World Wide Web
Conference (WWW). 2019.
[79] Steven Englehardt and Arvind Narayanan. “Online Tracking: A 1-million-site Measurement and
Analysis”. In: ACM Conference on Computer and Communications Security (CCS). 2016.
[80] Steven Englehardt, Dillon Reisman, Christian Eubank, Peter Zimmerman, Jonathan Mayer,
Arvind Narayanan, and Edward W Felten. “Cookies that give you away: The surveillance
implications of web tracking”. In: International Conference on World Wide Web (WWW). 2015.
[81] Lukasz Olejnik, Minh-Dung Tran, and Claude Castelluccia. “Selling Off Privacy at Auction”. In:
Network and Distributed Systems Security (NDSS) Symposium. 2014.
[82] Federal Trade Commission. “Data brokers: A call for transparency and accountability”. In: (2014).
[83] Tami Kim, Kate Barasz, and Leslie K John. “Why Am I Seeing This Ad? The Effect of Ad
Transparency on Ad Effectiveness”. In: Journal of Consumer Research 45.5 (May 2018),
pp. 906–932. eprint: http://oup.prod.sis.lan/jcr/article-pdf/45/5/906/27498411/ucy039.pdf.
[84] Tobias Dehling, Yuchen Zhang, and Ali Sunyaev. “Consumer Perceptions of Online Behavioral
Advertising”. In: 2019 IEEE 21st Conference on Business Informatics (CBI). IEEE. 2019.
[85] Blase Ur, Pedro Giovanni Leon, Lorrie Faith Cranor, Richard Shay, and Yang Wang. “Smart,
useful, scary, creepy: perceptions of online behavioral advertising”. In: Symposium on Usable
Privacy and Security (SOUPS). 2012.
[86] Shoshana Zuboff. The Age of Surveillance Capitalism: The Fight for a Human Future at the New
Frontier of Power. 2019.
[87] Types of cookies used by Google. https://policies.google.com/technologies/types.
[88] Ashkan Soltani and Barton Gellman. New documents show how the NSA infers relationships based
on mobile location data. The Washington Post. 2013.
[89] S. Farrell and H. Tschofenig. Pervasive Monitoring Is an Attack. IETF RFC 7258. 2014.
[90] Noam Scheiber. Facebook Accused of Allowing Bias Against Women in Job Ads. The New York
Times. 2018.
[91] Chris Stokel-Walker. Facebook’s Ad Data May Put Millions of Gay People at Risk. New Scientist.
2019.
[92] Julia Angwin and Terry Parris Jr. Facebook Lets Advertisers Exclude Users by Race. ProPublica. 2016.
[93] S. C. Matz, M. Kosinski, G. Nave, and D. J. Stillwell. “Psychological targeting as an effective
approach to digital mass persuasion”. In: Proceedings of the National Academy of Sciences 114.48
(2017), pp. 12714–12719. issn: 0027-8424. doi: 10.1073/pnas.1710966114. eprint:
https://www.pnas.org/content/114/48/12714.full.pdf.
[94] Dan Gizzi. The Ethics of Political Micro-targeting. 2018.
[95] Donie O’Sullivan and David Shortell. “Exclusive: The FBI is running Facebook ads targeting
Russians in Washington”. In: (2019).
[96] Naomi LaChance. Program to Deradicalize Jihadis Will Be Used on Right-Wingers. The Intercept.
2018.
[97] App Tracking Transparency. https://developer.apple.com/documentation/apptrackingtransparency.
[98] Raymond Hill. An efficient blocker for Chromium and Firefox. Fast and lean, uBlock Origin.
https://github.com/gorhill/uBlock#ublock-origin. 2019.
[99] Ghostery. https://www.ghostery.com/.
[100] Umar Iqbal, Peter Snyder, Shitong Zhu, Benjamin Livshits, Zhiyun Qian, and Zubair Shafiq.
“Adgraph: A graph-based approach to ad and tracker blocking”. In: IEEE Symposium on Security &
Privacy (S&P) (2020).
[101] Weihang Wang, Yunhui Zheng, Xinyu Xing, Yonghwi Kwon, Xiangyu Zhang, and Patrick Eugster.
“WebRanz: Web Page Randomization for Better Advertisement Delivery and Web-Bot
Prevention”. In: ACM International Symposium on Foundations of Software Engineering (FSE). 2016.
[102] Georg Merzdovnik, Markus Huber, Damjan Buhov, Nick Nikiforakis, Sebastian Neuner,
Martin Schmiedecker, and Edgar Weippl. “Block me if you can: A large-scale study of
tracker-blocking tools”. In: 2017 IEEE European Symposium on Security & Privacy (Euro S&P). 2017.
[103] Mshabab Alrizah, Sencun Zhu, Xinyu Xing, and Gang Wang. “Errors, Misunderstandings, and
Attacks: Analyzing the Crowdsourcing Process of Ad-Blocking Systems”. In: ACM Internet
Measurement Conference (IMC). 2019.
[104] Quan Chen, Peter Snyder, Ben Livshits, and Alexandros Kapravelos. “Detecting Filter List Evasion
With Event-Loop-Turn Granularity JavaScript Signatures”. In: IEEE Symposium on Security &
Privacy (S&P). 2021.
[105] Peter Snyder, Antoine Vastel, and Ben Livshits. “Who Filters the Filters: Understanding the
Growth, Usefulness and Efficiency of Crowdsourced Ad Blocking”. In: Proceedings of the ACM on
Measurement and Analysis of Computing Systems 4.2 (2020), pp. 1–24.
[106] Hieu Le, Athina Markopoulou, and Zubair Shafiq. “CV-Inspector: Towards Automating Detection
of Adblock Circumvention”. In: Network and Distributed Systems Security (NDSS) Symposium.
2021.
[107] Muhammad Ahmad Bashir, Sajjad Arshad, Engin Kirda, William Robertson, and Christo Wilson.
“How tracking companies circumvented ad blockers using websockets”. In: ACM Internet
Measurement Conference (IMC). 2018.
[108] Ha Dao, Johan Mazel, and Kensuke Fukuda. “Characterizing CNAME Cloaking-Based Tracking
on the Web”. In: Traffic Measurement and Analysis Conference (TMA). 2020.
[109] Karthika Subramani, Xingzi Yuan, Omid Setayeshfar, Phani Vadrevu, Kyu Hyung Lee, and
Roberto Perdisci. “Measuring Abuse in Web Push Advertising”. In: arXiv:2002.06448 (2020).
[110] Daniel C Howe and Helen Nissenbaum. “Engineering Privacy and Protest: A Case Study of
AdNauseam”. In: International Workshop on Privacy Engineering (IWPE). 2017.
[111] Hey advertisers, track THIS. https://blog.mozilla.org/firefox/hey-advertisers-track-this. 2019.
[112] AdNauseam - Clicking Ads So You Don’t Have To. https://adnauseam.io/.
[113] Martin Degeling and Jan Nierhoff. “Tracking and Tricking a Profiler: Automated Measuring and
Influencing of Bluekai’s Interest Profiling”. In: Workshop on Privacy in the Electronic Society
(WPES). 2018.
[114] Asia J. Biega, Rishiraj Saha Roy, and Gerhard Weikum. “Privacy through Solidarity: A
User-Utility-Preserving Framework to Counter Profiling”. In: SIGIR ’17. Shinjuku, Tokyo, Japan:
Association for Computing Machinery, 2017, pp. 675–684. isbn: 9781450350228. doi:
10.1145/3077136.3080830.
[115] Ghazaleh Beigi, Ruocheng Guo, Alexander Nou, Yanchao Zhang, and Huan Liu. “Protecting User
Privacy: An Approach for Untraceable Web Browsing History and Unambiguous User Profiles”.
In: CoRR abs/1811.09340 (2018). arXiv: 1811.09340. url: http://arxiv.org/abs/1811.09340.
[116] Oracle Data Cloud Registry Information. https://datacloudoptout.oracle.com.
[117] Michalis Pachilakis, Panagiotis Papadopoulos, Evangelos P. Markatos, and Nicolas Kourtellis. “No
More Chasing Waterfalls: A Measurement Study of the Header Bidding Ad-Ecosystem”. In: ACM
Internet Measurement Conference (IMC). 2019.
[118] Nick Nikiforakis, Alexandros Kapravelos, Wouter Joosen, Christopher Kruegel, Frank Piessens,
and Giovanni Vigna. “Cookieless monster: Exploring the ecosystem of web-based device
fingerprinting”. In: IEEE Symposium on Security & Privacy (S&P). 2013.
[119] Zhonghao Yu, Sam Macbeth, Konark Modi, and Josep M. Pujol. “Tracking the Trackers”. In:
International Conference on World Wide Web (WWW). 2016.
[120] Web-scale ML : learning is not the (only) point.
https://labs.criteo.com/2018/05/ml-model-deployment/.
[121] Richard Sutton and Andrew Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
[122] Muhammad Ahmad Bashir, Umar Farooq, Maryam Shahid, Muhammad Fareed Zaffar, and
Christo Wilson. “Quantity vs. Quality: Evaluating User Interest Profiles Using Ad Preference
Managers”. In: Network and Distributed Systems Security (NDSS) Symposium. 2019.
[123] Tobias Urban, Dennis Tatang, Martin Degeling, Thorsten Holz, and Norbert Pohlmann. “A study
on subject data access in online advertising after the GDPR”. In: International Workshop on Data
Privacy Management. 2019.
[124] Panagiotis Papadopoulos, Nicolas Kourtellis, Pablo Rodriguez Rodriguez, and Nikolaos Laoutaris.
“If you are not paying for it, you are the product: How much do advertisers pay to reach you?” In:
Internet Measurement Conference (IMC). 2017.
[125] John Cook, Rishab Nithyanand, and Zubair Shafiq. “Inferring Tracker-Advertiser Relationships in
the Online Advertising Ecosystem using Header Bidding”. In: Privacy Enhancing Technologies
Symposium (PETS) (2020).
[126] Oracle Data Cloud Registry 2019 Data Directory.
https://www.oracle.com/us/solutions/cloud/data-directory-2810741.pdf.
[127] Quoc Le and Tomas Mikolov. “Distributed representations of sentences and documents”. In:
International Conference on Machine Learning (ICML). 2014.
[128] Taher H. Haveliwala, Glen M. Jeh, and Sepandar D. Kamvar. Targeted advertisements based on user
profiles and page profile. 2012.
[129] Darrell Anderson, Paul Buchheit, Alexander Paul Carobus, Yingwei Cui, Jeffrey A. Dean,
Georges R. Harik, Deepak Jindal, and Narayanan Shivakumar. Serving advertisements based on
content. 2006.
[130] OpenAI. OpenAI Baselines: ACKTR & A2C. https://openai.com/blog/baselines-acktr-a2c/. 2022.
[131] Yoon Kim. “Convolutional Neural Networks for Sentence Classification”. In: Conference on
Empirical Methods in Natural Language Processing (EMNLP). 2014.
[132] Matthew Hausknecht and Peter Stone. “Deep Recurrent Q-Learning for Partially Observable
MDPs”. In: AAAI Conference on Artificial Intelligence (AAAI) (2015).
[133] Zhiyuan Xu, Jian Tang, Chengxiang Yin, Yanzhi Wang, and Guoliang Xue. “Experience-driven
congestion control: When multi-path TCP meets deep reinforcement learning”. In: IEEE Journal
on Selected Areas in Communications 37.6 (2019), pp. 1325–1336.
[134] Chunting Zhou, Chonglin Sun, Zhiyuan Liu, and Francis Lau. “A C-LSTM neural network for text
classification”. In: arXiv preprint arXiv:1511.08630 (2015).
[135] Anatomy of an extension. https://developer.mozilla.org/en-US/docs/Mozilla/Addons/WebExtensions/Anatomy_of_a_WebExtension.
[136] Greg Pass, Abdur Chowdhury, and Cayley Torgeson. “A picture of search”. In: Proceedings of the
1st international conference on Scalable information systems. 2006, 1–es.
[137] WhoisXMLAPI for URL categorization. https://whois.whoisxmlapi.com/.
[138] Alexa - Top Sites by Category: The top 500 sites on the web.
https://www.alexa.com/topsites/category.
[139] Paul A Gagniuc. Markov chains: from theory to implementation and experimentation. John Wiley &
Sons, 2017.
[140] EasyList. https://easylist.to/.
[141] Léon Bottou. “Large-scale machine learning with stochastic gradient descent”. In: International
Conference on Computational Statistics. 2010.
[142] Farah Chanchary and Sonia Chiasson. “User perceptions of sharing, advertising, and tracking”.
In: Eleventh Symposium On Usable Privacy and Security (SOUPS 2015). 2015, pp. 53–67.
[143] Bennett Cyphers and Gennie Gebhart. Behind the One-Way Mirror: A Deep Dive Into the
Technology of Corporate Surveillance. Electronic Frontier Foundation, 2019.
[144] Amnesty International. “Surveillance Giant: How the Business Model of Google and Facebook
Threatens Human Rights.” In: (2019).
[145] Ryan Amos, Gunes Acar, Elena Lucherini, Mihir Kshirsagar, Arvind Narayanan, and
Jonathan Mayer. “Privacy Policies over Time: Curation and Analysis of a Million-Document
Dataset”. In: arXiv:2008.09159 (2020).
[146] Matthew W Vail, Julia B Earp, and Annie I Antón. “An empirical study of consumer perceptions
and comprehension of web site privacy policies”. In: IEEE Transactions on Engineering
Management 55.3 (2008), pp. 442–454.
[147] Arunesh Mathur, Gunes Acar, Michael J Friedman, Elena Lucherini, Jonathan Mayer,
Marshini Chetty, and Arvind Narayanan. “Dark patterns at scale: Findings from a crawl of 11K
shopping websites”. In: ACM Conference on Computer-Supported Cooperative Work and Social
Computing (CSCW) (2019).
[148] Finn Brunton and Helen Nissenbaum. “Political and ethical perspectives on data obfuscation”. In:
Privacy, due process and the computational turn: The philosophy of law meets the philosophy of
technology (2013).
[149] Garrett A Johnson, Scott K Shriver, and Shaoyin Du. “Consumer privacy choice in online
advertising: Who opts out and at what cost to industry?” In: Marketing Science 39.1 (2020),
pp. 33–51.
[150] Deepak Ravichandran and Nitish Korula. Effect of disabling third-party cookies on publisher
revenue. Google. 2019.
[151] Mozilla page visibility API.
https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API.
[152] Tab throttling and more performance improvements in Chrome M87. url:
https://blog.chromium.org/2020/11/tab-throttling-and-more-performance.html (visited on
05/10/2021).
[153] Pierre Laperdrix, Oleksii Starov, Quan Chen, Alexandros Kapravelos, and Nick Nikiforakis.
“Fingerprinting in Style: Detecting Browser Extensions via Injected Style Sheets”. In: 30th
{USENIX} Security Symposium ({USENIX} Security 21). 2021.
[154] Soroush Karami, Panagiotis Ilia, Konstantinos Solomos, and Jason Polakis. “Carnus: Exploring the
privacy threats of browser extension fingerprinting”. In: Network and Distributed Systems Security
(NDSS) Symposium. 2020.
[155] Oleksii Starov and Nick Nikiforakis. “Xhound: Quantifying the fingerprintability of browser
extensions”. In: 2017 IEEE Symposium on Security and Privacy (SP). IEEE. 2017, pp. 941–956.
[156] Alexander Sjösten, Steven Van Acker, and Andrei Sabelfeld. “Discovering browser extensions via
web accessible resources”. In: Proceedings of the Seventh ACM on Conference on Data and
Application Security and Privacy. 2017, pp. 329–336.
[157] Surf The Web With No Annoying Ads. https://adblockplus.org. 2019.
[158] Trackers and scripts Firefox blocks in Enhanced Tracking Protection. https://www.ghostery.com/.
[159] Brave browser. https://brave.com/features/.
[160] Grant Storey, Dillon Reisman, Jonathan Mayer, and Arvind Narayanan. “The future of ad
blocking: An analytical framework and new techniques”. In: arXiv:1705.08568 (2017).
[161] David Gugelmann, Markus Happe, Bernhard Ager, and Vincent Lenders. “An automated
approach for complementing ad blockers’ blacklists”. In: Privacy Enhancing Technologies
Symposium (PETS) (2015).
[162] Jason Bau, Jonathan Mayer, Hristo Paskov, and John C Mitchell. “A promising direction for web
tracking countermeasures”. In: Workshop on Web 2.0 Security and Privacy (2013).
[163] Muhammad Ikram, Hassan Jameel Asghar, Mohamed Ali Kaafar, Anirban Mahanti, and
Balachandar Krishnamurthy. “Towards seamless tracking-free web: Improved detection of
trackers via one-class learning”. In: Privacy Enhancing Technologies Symposium (PETS) (2017).
[164] Anastasia Shuba, Athina Markopoulou, and Zubair Shafiq. “NoMoAds: Effective and efficient
cross-app mobile ad-blocking”. In: Privacy Enhancing Technologies Symposium (PETS) (2018).
[165] Qianru Wu, Qixu Liu, Yuqing Zhang, Peng Liu, and Guanxing Wen. “A Machine Learning
Approach for Detecting Third-Party Trackers on the Web”. In: European Symposium on Research
in Computer Security (ESORICS). Ed. by Ioannis Askoxylakis, Sotiris Ioannidis, Sokratis Katsikas,
and Catherine Meadows. 2016.
[166] Jagdish Prasad Achara, Javier Parra-Arnau, and Claude Castelluccia. MyTrackingChoices: Pacifying
the Ad-Block War by Enforcing User Privacy Preferences. 2016. arXiv: 1604.04495 [cs.CR].
[167] Wei Meng, Byoungyoung Lee, Xinyu Xing, and Wenke Lee. “TrackMeOrNot: Enabling Flexible
Control on Web Tracking”. In: Proceedings of the 25th International Conference on World Wide
Web. WWW ’16. Montréal, Québec, Canada: International World Wide Web Conferences Steering
Committee, 2016, pp. 99–109. isbn: 9781450341431. doi: 10.1145/2872427.2883034.
[168] Muhammad Haris Mughees, Zhiyun Qian, and Zubair Shafiq. “Detecting anti ad-blockers in the
wild”. In: Privacy Enhancing Technologies Symposium (PETS) (2017).
[169] Rishab Nithyanand, Sheharbano Khattak, Mobin Javed, Narseo Vallina-Rodriguez,
Marjan Falahrastegar, Julia E Powles, Emiliano De Cristofaro, Hamed Haddadi, and
Steven J Murdoch. “Adblocking and counter blocking: A slice of the arms race”. In: USENIX
Workshop on Free and Open Communications on the Internet (FOCI). 2016.
[170] Finn Brunton and Helen Nissenbaum. “Obfuscation: A User’s Guide for Privacy and Protest”. In:
(2015).
[171] Xingyu Xing, Wei Meng, Dan Doozan, Alex C. Snoeren, Nick Feamster, and Wenke Lee. “Take
This Personally: Pollution Attacks on Personalized Services”. In: 22nd USENIX Security
Symposium (USENIX Security 13). Washington, D.C.: USENIX Association, Aug. 2013, pp. 671–686.
isbn: 978-1-931971-03-4. url:
https://www.usenix.org/conference/usenixsecurity13/technical-sessions/paper/xing.
[172] Wei Meng, Xinyu Xing, Anmol Sheth, Udi Weinsberg, and Wenke Lee. “Your Online Interests:
Pwned! A Pollution Attack Against Targeted Advertising”. In: Proceedings of the 2014 ACM
SIGSAC Conference on Computer and Communications Security. CCS ’14. Scottsdale, Arizona, USA:
Association for Computing Machinery, 2014, pp. 129–140. isbn: 9781450329576. doi:
10.1145/2660267.2687258.
[173] I Luk Kim, Weihang Wang, Yonghwi Kwon, Yunhui Zheng, Yousra Aafer, Weijie Meng, and
Xiangyu Zhang. “AdBudgetKiller: Online Advertising Budget Draining Attack”. In: Proceedings of
the 2018 World Wide Web Conference. WWW ’18. Lyon, France: International World Wide Web
Conferences Steering Committee, 2018, pp. 297–307. isbn: 9781450356398. doi:
10.1145/3178876.3186096.
[174] Flavio Chierichetti, Ravi Kumar, Prabhakar Raghavan, and Tamas Sarlos. “Are Web Users Really
Markovian?” In: Proceedings of the 21st International Conference on World Wide Web. WWW ’12.
Lyon, France: Association for Computing Machinery, 2012, pp. 609–618. isbn: 9781450312295.
doi: 10.1145/2187836.2187919.
[175] YouTube Help. Manage your recommendations and search results.
https://support.google.com/youtube/answer/6342839?hl=en. 2019.
[176] Helen Nissenbaum and Howe Daniel. “TrackMeNot: Resisting surveillance in web search”. In:
Oxford: Oxford University Press (2009).
[177] Martin Degeling and Jan Nierhoff. “Tracking and Tricking a Profiler: Automated Measuring and
Influencing of Bluekai’s Interest Profiling”. In: Proceedings of the 2018 Workshop on Privacy in the
Electronic Society. 2018, pp. 1–13.
[178] Jiang Zhang, Konstantinos Psounis, Muhammad Haroon, and Zubair Shafiq. “HARPO: Learning
to Subvert Online Behavioral Advertising”. In: Network and Distributed Systems Security (NDSS)
Symposium (2022).
[179] Google Ads Help. About audience targeting. https://support.google.com/googleads/answer/2497941?hl=en#zippy=%2Cin-market-segments%2Caffinity-segments. 2022.
[180] Andreu Casas, Ericka Menchen-Trevino, and Magdalena Wojcieszak. “Exposure to extremely
partisan news from the other political side shows scarce boomerang effects”. In: Political Behavior
(2022), pp. 1–40.
[181] Xingyu Xing, Wei Meng, Dan Doozan, Alex C Snoeren, Nick Feamster, and Wenke Lee. “Take this
personally: Pollution attacks on personalized services”. In: 22nd {USENIX} Security Symposium
({USENIX} Security 13). 2013, pp. 671–686.
[182] Cristos Goodrow. On YouTube’s recommendation system.
https://blog.youtube/inside-youtube/on-youtubes-recommendation-system/. 2021.
[183] Google Account Help. Web & App Activity Controls.
https://support.google.com/accounts/answer/54068?hl=en. 2023.
[184] John Wilander. Full Third-Party Cookie Blocking and More.
https://webkit.org/blog/10218/full-third-party-cookie-blocking-and-more/. 2020.
[185] Mozilla. Firefox rolls out Total Cookie Protection by default to all users worldwide.
https://blog.mozilla.org/en/mozilla/firefox-rolls-out-total-cookie-protection-by-default-to-allusers-worldwide/. 2022.
[186] uBlock-Origin. uBlock-Origin. https://ublockorigin.com/. 2022.
[187] Paul Covington, Jay Adams, and Emre Sargin. “Deep neural networks for youtube
recommendations”. In: Proceedings of the 10th ACM conference on recommender systems. 2016,
pp. 191–198.
[188] Paul Cuff and Lanqing Yu. “Differential privacy as a mutual information constraint”. In:
Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. 2016,
pp. 43–54.
[189] Matthew Clark and Konstantinos Psounis. “Optimizing primary user privacy in spectrum sharing
systems”. In: IEEE/ACM Transactions on Networking 28.2 (2020), pp. 533–546.
[190] Ahmed Roushdy Elkordy, Jiang Zhang, Yahya H Ezzeldin, Konstantinos Psounis, and
Salman Avestimehr. “How Much Privacy Does Federated Learning with Secure Aggregation
Guarantee?” In: Proceedings on Privacy Enhancing Technologies (2023), pp. 510–526.
[191] Javier Parra-Arnau, David Rebollo-Monedero, and Jordi Forné. “Measuring the privacy of user
profiles in personalized information systems”. In: Future Generation Computer Systems 33 (2014),
pp. 53–63.
[192] Javier Parra-Arnau, Jagdish Prasad Achara, and Claude Castelluccia. “Myadchoices: Bringing
transparency and control to online advertising”. In: ACM Transactions on the Web (TWEB) 11.1
(2017), pp. 1–47.
[193] Cynthia Dwork, Aaron Roth, et al. “The algorithmic foundations of differential privacy.” In:
Foundations and Trends in Theoretical Computer Science 9.3-4 (2014), pp. 211–407.
[194] Ricks Becca and McCrosky Jesse. Does This Button Work? Investigating YouTube’s ineffective user
controls. https://foundation.mozilla.org/en/research/library/user-controls/report/. 2022.
[195] Cynthia Dwork. “Differential privacy”. In: Automata, Languages and Programming: 33rd
International Colloquium, ICALP 2006, Venice, Italy, July 10-14, 2006, Proceedings, Part II 33.
Springer. 2006, pp. 1–12.
[196] David JC MacKay, David JC Mac Kay, et al. Information theory, inference and learning algorithms.
Cambridge university press, 2003.
[197] SentenceTransformers Documentation. https://www.sbert.net/. 2022.
[198] Andriy Mnih and Russ R Salakhutdinov. “Probabilistic matrix factorization”. In: Advances in
neural information processing systems 20 (2007), pp. 1257–1264.
[199] Hanhuai Shan and Arindam Banerjee. “Generalized probabilistic matrix factorizations for
collaborative filtering”. In: 2010 IEEE International Conference on Data Mining. IEEE. 2010,
pp. 1025–1030.
[200] Hao Ma, Haixuan Yang, Michael R Lyu, and Irwin King. “Sorec: social recommendation using
probabilistic matrix factorization”. In: Proceedings of the 17th ACM conference on Information and
knowledge management. 2008, pp. 931–940.
[201] Mohsen Jamali and Martin Ester. “A matrix factorization technique with trust propagation for
recommendation in social networks”. In: Proceedings of the fourth ACM conference on
Recommender systems. 2010, pp. 135–142.
[202] Hao Ma, Dengyong Zhou, Chao Liu, Michael R Lyu, and Irwin King. “Recommender systems with
social regularization”. In: Proceedings of the fourth ACM international conference on Web search
and data mining. 2011, pp. 287–296.
[203] Jiliang Tang, Xia Hu, Huiji Gao, and Huan Liu. “Exploiting local and global social context for
recommendation.” In: IJCAI. Vol. 13. Citeseer. 2013, pp. 2712–2718.
[204] Jiliang Tang, Charu Aggarwal, and Huan Liu. “Recommendations in signed social networks”. In:
Proceedings of the 25th International Conference on World Wide Web. 2016, pp. 31–40.
[205] Bo Yang, Yu Lei, Jiming Liu, and Wenjie Li. “Social collaborative filtering by trust”. In: IEEE
transactions on pattern analysis and machine intelligence 39.8 (2016), pp. 1633–1647.
[206] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. “Neural
collaborative filtering”. In: Proceedings of the 26th international conference on world wide web.
2017, pp. 173–182.
[207] Xiangnan He, Xiaoyu Du, Xiang Wang, Feng Tian, Jinhui Tang, and Tat-Seng Chua. “Outer
product-based neural collaborative filtering”. In: arXiv preprint arXiv:1808.03912 (2018).
[208] Chong Chen, Min Zhang, Yongfeng Zhang, Yiqun Liu, and Shaoping Ma. “Efficient neural matrix
factorization without sampling for recommendation”. In: ACM Transactions on Information
Systems (TOIS) 38.2 (2020), pp. 1–28.
[209] Shuiguang Deng, Longtao Huang, Guandong Xu, Xindong Wu, and Zhaohui Wu. “On deep
learning for trust-aware recommendations in social networks”. In: IEEE transactions on neural
networks and learning systems 28.5 (2016), pp. 1164–1177.
[210] Xiang Wang, Xiangnan He, Liqiang Nie, and Tat-Seng Chua. “Item silk road: Recommending
items from information domains to social users”. In: Proceedings of the 40th International ACM
SIGIR conference on Research and Development in Information Retrieval. 2017, pp. 185–194.
[211] Zhou Zhao, Qifan Yang, Hanqing Lu, Tim Weninger, Deng Cai, Xiaofei He, and Yueting Zhuang.
“Social-aware movie recommendation via multimodal network learning”. In: IEEE Transactions on
Multimedia 20.2 (2017), pp. 430–440.
[212] Wenqi Fan, Qing Li, and Min Cheng. “Deep modeling of social relations for recommendation”. In:
Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. 1. 2018.
[213] Wenqi Fan, Yao Ma, Dawei Yin, Jianping Wang, Jiliang Tang, and Qing Li. “Deep social
collaborative filtering”. In: Proceedings of the 13th ACM Conference on Recommender Systems. 2019,
pp. 305–313.
[214] Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, and Ed H Chi. “Top-k
off-policy correction for a REINFORCE recommender system”. In: Proceedings of the Twelfth ACM
International Conference on Web Search and Data Mining. 2019, pp. 456–464.
[215] Joan E. Solsman. “YouTube’s AI is the puppet master over most of what you watch”. In: (2021).
[216] Reddit User Submissions. https://files.pushshift.io/reddit/submissions/. 2022.
[217] Web Historian. Visualize your web use to understand your habits. https://webhistorian.github.io/.
2022.
[218] youtube-dl downloads. https://youtube-dl.org/. 2022.
[219] Ghazaleh Beigi, Ruocheng Guo, Alexander Nou, Yanchao Zhang, and Huan Liu. “Protecting user
privacy: An approach for untraceable web browsing history and unambiguous user profiles”. In:
Proceedings of the twelfth ACM international conference on web search and data mining. 2019,
pp. 213–221.
[220] Stefan Axelsson. “The base-rate fallacy and the difficulty of intrusion detection”. In: ACM
Transactions on Information and System Security (TISSEC) 3.3 (2000), pp. 186–205.
[221] YouTube. Terms of Service. https://www.youtube.com/static?template=terms. 2021.
[222] 18 US Code § 1030. Fraud and related activity in connection with computers.
https://www.law.cornell.edu/uscode/text/18/1030. 2022.
[223] Aaron Mackey and Kurt Opsahl. Van Buren is a Victory Against Overbroad Interpretations of the
CFAA, and Protects Security Researchers. EFF. https://www.eff.org/deeplinks/2021/06/van-burenvictory-against-overbroad-interpretations-cfaa-protects-security. 2021.
[224] Erin Kenneally and David Dittrich. “The menlo report: Ethical principles guiding information and
communication technology research”. In: Available at SSRN 2445102 (2012).
[225] Wpears. Don’t Make Me Watch, Page Visibility API Blocker. 2018.
[226] Mozilla Support. Non-Active-Tabs. https://support.mozilla.org/si/questions/1228604. 2018.
[227] MDN Web Docs. HTML DOM API.
https://developer.mozilla.org/en-US/docs/Web/API/HTML_DOM_API. 2022.
[228] Yandex. https://yandex.com. 2022.
[229] KiwiBrowser. https://kiwibrowser.com. 2022.
[230] Chrome Remote Desktop. https://remotedesktop.google.com/access. 2022.
[231] Google Account. https://www.google.com/account/about. 2022.
[232] Wei Meng, Xinyu Xing, Anmol Sheth, Udi Weinsberg, and Wenke Lee. “Your online interests:
Pwned! a pollution attack against targeted advertising”. In: Proceedings of the 2014 ACM SIGSAC
Conference on Computer and Communications Security. 2014, pp. 129–140.
[233] Chong Huang, Peter Kairouz, Xiao Chen, Lalitha Sankar, and Ram Rajagopal. “Context-aware
generative adversarial privacy”. In: Entropy 19.12 (2017), p. 656.
[234] Nisarg Raval, Ashwin Machanavajjhala, and Jerry Pan. “Olympus: Sensor Privacy through Utility
Aware Obfuscation.” In: Proc. Priv. Enhancing Technol. 2019.1 (2019), pp. 5–25.
[235] Mohammad Malekzadeh, Richard G Clegg, Andrea Cavallaro, and Hamed Haddadi. “Mobile
sensor data anonymization”. In: Proceedings of the international conference on internet of things
design and implementation. 2019, pp. 49–58.
[236] O’Flaherty. The 1 Facebook Setting You Should Change Now.
https://www.forbes.com/sites/kateoflahertyuk/2021/11/20/facebook-has-hijacked-your-newsfeed-heres-how-to-get-it-back/?sh=4c942aa62e79. 2021.
[237] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H Brendan McMahan,
Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. “Practical secure aggregation for
privacy-preserving machine learning”. In: proceedings of the 2017 ACM SIGSAC Conference on
Computer and Communications Security. 2017, pp. 1175–1191.
[238] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas.
“Communication-Efficient Learning of Deep Networks from Decentralized Data”. In: Proceedings
of the 20th International Conference on Artificial Intelligence and Statistics. Ed. by Aarti Singh and
Jerry Zhu. Vol. 54. Proceedings of Machine Learning Research. 2017, pp. 1273–1282.
[239] Ligeng Zhu, Zhijian Liu, and Song Han. “Deep Leakage from Gradients”. In: Advances in Neural
Information Processing Systems. Vol. 32. 2019.
[240] Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, and Michael Moeller. “Inverting Gradients
– How easy is it to break privacy in federated learning?” In: Advances in Neural Information
Processing Systems. 2020.
[241] Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, and Pavlo Molchanov. “See
through Gradients: Image Batch Recovery via GradInversion”. In: arXiv,2104.07586 (2021).
[242] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. “Calibrating noise to
sensitivity in private data analysis”. In: Theory of Cryptography Conference (TCC). 2006.
[243] Stacey Truex, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, Rui Zhang, and
Yi Zhou. “A hybrid approach to privacy-preserving federated learning”. In: Proceedings of the 12th
ACM Workshop on Artificial Intelligence and Security. 2019, pp. 1–11.
[244] Peter Kairouz, Ziyu Liu, and Thomas Steinke. “The distributed discrete gaussian mechanism for
federated learning with secure aggregation”. In: arXiv preprint arXiv:2102.06387 (2021).
[245] Alex Krizhevsky. Learning multiple layers of features from tiny images. Tech. rep. Citeseer, 2009.
[246] Yann LeCun, Corinna Cortes, and CJ Burges. “MNIST handwritten digit database”. In: ATT Labs
[Online]. Available: http://yann.lecun.com/exdb/mnist 2 (2010).
[247] Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeshwar, Sherjil Ozair, Yoshua Bengio,
Aaron Courville, and Devon Hjelm. “Mutual Information Neural Estimation”. In: Proceedings of
the 35th International Conference on Machine Learning. Ed. by Jennifer Dy and Andreas Krause.
Vol. 80. Proceedings of Machine Learning Research. PMLR, Oct. 2018, pp. 531–540.
[248] Yoshinori Aono, Takuya Hayashi, Lihua Wang, Shiho Moriai, et al. “Privacy-preserving deep
learning via additively homomorphic encryption”. In: IEEE Transactions on Information Forensics
and Security 13.5 (2017), pp. 1333–1345.
[249] Ye Dong, Xiaojun Chen, Liyan Shen, and Dakui Wang. “EaSTFLy: Efficient and secure ternary
federated learning”. In: Computers & Security 94 (2020), p. 101824.
[250] Runhua Xu, Nathalie Baracaldo, Yi Zhou, Ali Anwar, and Heiko Ludwig. “Hybridalpha: An
efficient approach for privacy-preserving federated learning”. In: Proceedings of the 12th ACM
Workshop on Artificial Intelligence and Security. 2019, pp. 13–23.
[251] James Henry Bell, Kallista A Bonawitz, Adrià Gascón, Tancrède Lepoint, and Mariana Raykova.
“Secure single-server aggregation with (poly) logarithmic overhead”. In: Proceedings of the 2020
ACM SIGSAC Conference on Computer and Communications Security. 2020, pp. 1253–1269.
[252] Jinhyun So, Ramy E Ali, Basak Guler, Jiantao Jiao, and Salman Avestimehr. “Securing secure
aggregation: Mitigating multi-round privacy leakage in federated learning”. In: arXiv preprint
arXiv:2106.03328 (2021).
[253] Swanand Kadhe, Nived Rajaraman, O Ozan Koyluoglu, and Kannan Ramchandran. “Fastsecagg:
Scalable secure aggregation for privacy-preserving federated learning”. In: arXiv preprint
arXiv:2009.11248 (2020).
[254] Yizhou Zhao and Hua Sun. “Information theoretic secure aggregation with user dropouts”. In:
2021 IEEE International Symposium on Information Theory (ISIT). IEEE. 2021, pp. 1124–1129.
[255] Jinhyun So, Corey J Nolet, Chien-Sheng Yang, Songze Li, Qian Yu, Ramy E Ali, Basak Guler, and
Salman Avestimehr. “Lightsecagg: a lightweight and versatile design for secure aggregation in
federated learning”. In: Proceedings of Machine Learning and Systems 4 (2022), pp. 694–720.
[256] Ahmed Roushdy Elkordy and A. Salman Avestimehr. “HeteroSAg: Secure Aggregation with
Heterogeneous Quantization in Federated Learning”. In: IEEE Transactions on Communications
(2022), pp. 1–1. doi: 10.1109/TCOMM.2022.3151126.
[257] Vaikkunth Mugunthan, Antigoni Polychroniadou, David Byrd, and Tucker Hybinette Balch.
“Smpai: Secure multi-party computation for federated learning”. In: Proceedings of the NeurIPS
2019 Workshop on Robust AI in Financial Services. 2019.
[258] Jinhyun So, Başak Güler, and A Salman Avestimehr. “Turbo-aggregate: Breaking the quadratic
aggregation barrier in secure federated learning”. In: IEEE Journal on Selected Areas in Information
Theory 2.1 (2021), pp. 479–489.
[259] Eugene Kuznetsov, Yitao Chen, and Ming Zhao. “SecureFL: Privacy Preserving Federated
Learning with SGX and TrustZone”. In: 2021 IEEE/ACM Symposium on Edge Computing (SEC).
2021, pp. 55–67. doi: 10.1145/3453142.3491287.
[260] Yuhui Zhang, Zhiwei Wang, Jiangfeng Cao, Rui Hou, and Dan Meng. “ShuffleFL:
gradient-preserving federated learning using trusted execution environment”. In: Proceedings of
the 18th ACM International Conference on Computing Frontiers. 2021, pp. 161–168.
[261] Ronen Eldan, Dan Mikulincer, and Alex Zhai. “The CLT in high dimensions: quantitative bounds
via martingale embedding”. In: The Annals of Probability 48.5 (2020), pp. 2494–2524.
[262] Sergey G Bobkov, Gennadiy P Chistyakov, and Friedrich Götze. “Berry–Esseen bounds in the
entropic central limit theorem”. In: Probability Theory and Related Fields 159.3-4 (2014),
pp. 435–478.
[263] Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, and Jinwen Ma. “The Anisotropic Noise in
Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization
Effects”. In: International Conference on Machine Learning. PMLR. 2019, pp. 7654–7663.
[264] Umut Simsekli, Levent Sagun, and Mert Gurbuzbalaban. “A tail-index analysis of stochastic
gradient noise in deep neural networks”. In: International Conference on Machine Learning. PMLR.
2019, pp. 5827–5837.
[265] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. “Imagenet classification with deep
convolutional neural networks”. In: Advances in neural information processing systems 25 (2012).
[266] Anit Kumar Sahu, Tian Li, Maziar Sanjabi, Manzil Zaheer, Ameet Talwalkar, and Virginia Smith.
“On the convergence of federated optimization in heterogeneous networks”. In: arXiv preprint
arXiv:1812.06127 3 (2018), p. 3.
[267] Tzu-Ming Harry Hsu, Hang Qi, and Matthew Brown. “Measuring the effects of non-identical data
distribution for federated visual classification”. In: arXiv preprint arXiv:1909.06335 (2019).
[268] Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný,
H Brendan McMahan, Virginia Smith, and Ameet Talwalkar. “Leaf: A benchmark for federated
settings”. In: arXiv preprint arXiv:1812.01097 (2018).
[269] Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and
Li Zhang. “Deep learning with differential privacy”. In: Proceedings of the 2016 ACM SIGSAC
conference on computer and communications security. 2016, pp. 308–318.
[270] Raef Bassily, Adam Smith, and Abhradeep Thakurta. “Private empirical risk minimization:
Efficient algorithms and tight error bounds”. In: 2014 IEEE 55th Annual Symposium on Foundations
of Computer Science. IEEE. 2014, pp. 464–473.
[271] Kamalika Chaudhuri, Claire Monteleoni, and Anand D Sarwate. “Differentially private empirical
risk minimization.” In: Journal of Machine Learning Research 12.3 (2011).
[272] Shiva Prasad Kasiviswanathan, Homin K Lee, Kobbi Nissim, Sofya Raskhodnikova, and
Adam Smith. “What can we learn privately?” In: SIAM Journal on Computing 40.3 (2011),
pp. 793–826.
[273] Naman Agarwal, Ananda Theertha Suresh, Felix Xinnan X Yu, Sanjiv Kumar, and
Brendan McMahan. “cpSGD: Communication-efficient and differentially-private distributed
SGD”. In: Advances in Neural Information Processing Systems 31 (2018).
[274] Borja Balle, James Bell, Adrià Gascón, and Kobbi Nissim. “The privacy blanket of the shuffle
model”. In: Annual International Cryptology Conference. Springer. 2019, pp. 638–667.
[275] Le Trieu Phong, Yoshinori Aono, Takuya Hayashi, Lihua Wang, and Shiho Moriai.
“Privacy-preserving deep learning: Revisited and enhanced”. In: International Conference on
Applications and Techniques in Information Security. Springer. 2017, pp. 100–110.
[276] Irem Ergun, Hasin Us Sami, and Basak Guler. “Sparsified Secure Aggregation for
Privacy-Preserving Federated Learning”. In: arXiv preprint arXiv:2112.12872 (2021).
[277] Chaoyang He, Songze Li, Jinhyun So, Xiao Zeng, Mi Zhang, Hongyi Wang, Xiaoyang Wang,
Praneeth Vepakomma, Abhishek Singh, Hang Qiu, et al. “Fedml: A research library and
benchmark for federated machine learning”. In: arXiv preprint arXiv:2007.13518 (2020).
[278] Chaoyang He, Murali Annavaram, and Salman Avestimehr. “Group knowledge transfer:
Federated learning of large cnns at the edge”. In: Advances in Neural Information Processing
Systems 33 (2020), pp. 14068–14080.
[279] Kang Wei, Jun Li, Ming Ding, Chuan Ma, Howard H Yang, Farhad Farokhi, Shi Jin, Tony QS Quek,
and H Vincent Poor. “Federated learning with differential privacy: Algorithms and performance
analysis”. In: IEEE Transactions on Information Forensics and Security 15 (2020), pp. 3454–3469.
[280] Manuel Gil, Fady Alajaji, and Tamas Linder. “Rényi divergence measures for commonly used
univariate continuous distributions”. In: Information Sciences 249 (2013), pp. 124–131.
[281] Ilya Mironov. “Rényi differential privacy”. In: 2017 IEEE 30th computer security foundations
symposium (CSF). IEEE. 2017, pp. 263–275.
[282] Robert B Ash. Information theory. Courier Corporation, 2012.
[283] Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, and Zhanxing Zhu. “On
the noisy gradient descent that generalizes as sgd”. In: International Conference on Machine
Learning. PMLR. 2020, pp. 10367–10376.
[284] Dario Pasquini, Danilo Francati, and Giuseppe Ateniese. “Eluding secure aggregation in federated
learning via model inconsistency”. In: Proceedings of the 2022 ACM SIGSAC Conference on
Computer and Communications Security. 2022, pp. 2429–2443.
[285] Leon Mirsky. “A trace inequality of John von Neumann”. In: Monatshefte für Mathematik 79.4
(1975), pp. 303–306.
[286] Simon S Du, Xiyu Zhai, Barnabas Poczos, and Aarti Singh. “Gradient descent provably optimizes
over-parameterized neural networks”. In: arXiv preprint arXiv:1810.02054 (2018).
[287] Zeyuan Allen-Zhu, Yuanzhi Li, and Zhao Song. “A convergence theory for deep learning via
over-parameterization”. In: International Conference on Machine Learning. PMLR. 2019,
pp. 242–252.
[288] Guy Gur-Ari, Daniel A Roberts, and Ethan Dyer. “Gradient descent happens in a tiny subspace”.
In: arXiv preprint arXiv:1812.04754 (2018).
[289] Wei Liu. “Additive white Gaussian noise level estimation based on block SVD”. In: 2014 IEEE
Workshop on Electronics, Computer and Applications. IEEE. 2014, pp. 960–963.
[290] Sylvain Delattre, Marc Hoffmann, Dominique Picard, and Thomas Vareschi. “Blockwise SVD with
error in the operator and application to blind deconvolution”. In: (2012).
[291] A Christoper Tamilmathi and PL Chithra. “Tensor block-wise singular value decomposition for
3D point cloud compression”. In: Multimedia Tools and Applications 81.26 (2022), pp. 37917–37938.
[292] Mark Bun, Cynthia Dwork, Guy N Rothblum, and Thomas Steinke. “Composable and versatile
privacy via truncated cdp”. In: Proceedings of the 50th Annual ACM SIGACT Symposium on Theory
of Computing. 2018, pp. 74–86.
[293] Hsin-Pai Cheng, Patrick Yu, Haojing Hu, Syed Zawad, Feng Yan, Shiyu Li, Hai Li, and Yiran Chen.
“Towards decentralized deep learning with differential privacy”. In: International Conference on
Cloud Computing. Springer. 2019, pp. 130–145.
[294] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. “The composition theorem for differential
privacy”. In: International conference on machine learning. PMLR. 2015, pp. 1376–1385.
[295] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas.
“Communication-efficient learning of deep networks from decentralized data”. In: Artificial
intelligence and statistics. PMLR. 2017, pp. 1273–1282.
[296] Naoise Holohan, Spiros Antonatos, Stefano Braghin, and Pól Mac Aonghusa. “The bounded
laplace mechanism in differential privacy”. In: arXiv preprint arXiv:1808.10410 (2018).
[297] Yuval Dagan and Gil Kur. “A bounded-noise mechanism for differential privacy”. In: Conference
on Learning Theory. PMLR. 2022, pp. 625–661.
[298] Fang Liu. “Generalized gaussian mechanism for differential privacy”. In: IEEE Transactions on
Knowledge and Data Engineering 31.4 (2018), pp. 747–756.
[299] Qian Yu, Songze Li, Netanel Raviv, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi,
and Salman A Avestimehr. “Lagrange coded computing: Optimal design for resiliency, security,
and privacy”. In: The 22nd International Conference on Artificial Intelligence and Statistics. PMLR.
2019, pp. 1215–1225.
[300] Antonious Girgis, Deepesh Data, Suhas Diggavi, Peter Kairouz, and Ananda Theertha Suresh.
“Shuffled model of differential privacy in federated learning”. In: International Conference on
Artificial Intelligence and Statistics. PMLR. 2021, pp. 2521–2529.
[301] Lichao Sun, Jianwei Qian, and Xun Chen. “LDP-FL: Practical private aggregation in federated
learning with local differential privacy”. In: arXiv preprint arXiv:2007.15789 (2020).
[302] Maxence Noble, Aurélien Bellet, and Aymeric Dieuleveut. “Differentially private federated
learning on heterogeneous data”. In: International Conference on Artificial Intelligence and
Statistics. PMLR. 2022, pp. 10110–10145.
[303] Naman Agarwal, Peter Kairouz, and Ziyu Liu. “The skellam mechanism for differentially private
federated learning”. In: Advances in Neural Information Processing Systems 34 (2021),
pp. 5052–5064.
[304] Wei-Ning Chen, Ayfer Ozgur, and Peter Kairouz. “The poisson binomial mechanism for unbiased
federated learning with secure aggregation”. In: International Conference on Machine Learning.
PMLR. 2022, pp. 3490–3506.
[305] Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, and Yinzhi Cao. “PrivateFL: Accurate,
Differentially Private Federated Learning via Personalized Data Transformation”. In: 32nd
USENIX Security Symposium (USENIX Security 23). 2023, pp. 1595–1612.
[306] Yu-Xiang Wang, Stephen Fienberg, and Alex Smola. “Privacy for free: Posterior sampling and
stochastic gradient monte carlo”. In: International Conference on Machine Learning. PMLR. 2015,
pp. 2493–2502.
[307] Tian Dong, Bo Zhao, and Lingjuan Lyu. “Privacy for free: How does dataset condensation help
privacy?” In: International Conference on Machine Learning. PMLR. 2022, pp. 5378–5396.
[308] Nicholas Carlini, Vitaly Feldman, and Milad Nasr. “No Free Lunch in ‘Privacy for Free: How does
Dataset Condensation Help Privacy’”. In: arXiv preprint arXiv:2209.14987 (2022).
[309] Tommaso Caselli, Valerio Basile, Jelena Mitrović, and Michael Granitzer. “Hatebert: Retraining
bert for abusive language detection in english”. In: arXiv preprint arXiv:2010.12472 (2020).
[310] Youngwook Kim, Shinwoo Park, and Yo-Sub Han. “Generalizable implicit hate speech detection
using contrastive learning”. In: Proceedings of the 29th International Conference on Computational
Linguistics. 2022, pp. 6667–6679.
[311] Yau-Shian Wang and Yingshan Chang. “Toxicity detection with generative prompt-based
inference”. In: arXiv preprint arXiv:2205.12390 (2022).
[312] Tianhua Zhang, Hongyin Luo, Yung-Sung Chuang, Wei Fang, Luc Gaitskell, Thomas Hartvigsen,
Xixin Wu, Danny Fox, Helen Meng, and James Glass. “Interpretable unified language checking”.
In: arXiv preprint arXiv:2304.03728 (2023).
[313] Xinyi Wang, Wanrong Zhu, and William Yang Wang. “Large language models are implicitly topic
models: Explaining and finding good demonstrations for in-context learning”. In: arXiv preprint
arXiv:2301.11916 (2023).
[314] Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. “What
Makes Good In-Context Examples for GPT-3?” In: arXiv preprint arXiv:2101.06804 (2021).
[315] Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, and Denny Zhou. “Recitation-augmented
language models”. In: arXiv preprint arXiv:2210.01296 (2022).
[316] Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, and Ece Kamar.
“Toxigen: A large-scale machine-generated dataset for adversarial and implicit hate speech
detection”. In: arXiv preprint arXiv:2203.09509 (2022).
[317] Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A Smith, and Yejin Choi. “Social
bias frames: Reasoning about social and power implications of language”. In: arXiv preprint
arXiv:1911.03891 (2019).
[318] Bertie Vidgen, Tristan Thrush, Zeerak Waseem, and Douwe Kiela. “Learning from the worst:
Dynamically generated datasets to improve online hate detection”. In: arXiv preprint
arXiv:2012.15761 (2020).
[319] OpenAI. ChatGPT: get instant answers, find creative inspiration, and learn something new.
https://openai.com/chatgpt. Accessed: 2023-08-15. 2023.
[320] Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang,
Zi Lin, Zhuohan Li, Dacheng Li, Eric. P Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica.
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. 2023. arXiv: 2306.05685 [cs.CL].
[321] Nils Reimers and Iryna Gurevych. “Sentence-BERT: Sentence Embeddings using Siamese
BERT-Networks”. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language
Processing. Association for Computational Linguistics, Nov. 2019. url:
https://arxiv.org/abs/1908.10084.
[322] Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li,
Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, et al. “Scaling instruction-finetuned
language models”. In: arXiv preprint arXiv:2210.11416 (2022).
[323] Yiran Ye, Thai Le, and Dongwon Lee. “NoisyHate: Benchmarking Content Moderation Machine
Learning Models with Human-Written Perturbations Online”. In: arXiv preprint arXiv:2303.10430
(2023).
[324] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux,
Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. “Llama: Open
and efficient foundation language models”. In: arXiv preprint arXiv:2302.13971 (2023).
[325] Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa. “Large
language models are zero-shot reasoners”. In: Advances in neural information processing systems
35 (2022), pp. 22199–22213.
[326] Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min,
Beichen Zhang, Junjie Zhang, Zican Dong, et al. “A survey of large language models”. In: arXiv
preprint arXiv:2303.18223 (2023).
[327] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le,
Denny Zhou, et al. “Chain-of-thought prompting elicits reasoning in large language models”. In:
Advances in Neural Information Processing Systems 35 (2022), pp. 24824–24837.
[328] Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L Griffiths, Yuan Cao, and
Karthik Narasimhan. “Tree of thoughts: Deliberate problem solving with large language models”.
In: arXiv preprint arXiv:2305.10601 (2023).
[329] Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang,
Aakanksha Chowdhery, and Denny Zhou. “Self-consistency improves chain of thought reasoning
in language models”. In: arXiv preprint arXiv:2203.11171 (2022).
[330] Hongyin Luo, Yung-Sung Chuang, Yuan Gong, Tianhua Zhang, Yoon Kim, Xixin Wu, Danny Fox,
Helen Meng, and James Glass. “SAIL: Search-Augmented Instruction Learning”. In: arXiv preprint
arXiv:2305.15225 (2023).
[331] Wei-Lin Chen, An-Zi Yen, Hen-Hsen Huang, Cheng-Kuang Wu, and Hsin-Hsi Chen. “ZARA:
Improving Few-Shot Self-Rationalization for Small Language Models”. In: arXiv preprint
arXiv:2305.07355 (2023).
[332] Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adamek, Eric Malmi, and Aliaksei Severyn.
“Teaching small language models to reason”. In: arXiv preprint arXiv:2212.08410 (2022).
[333] Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii,
Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, and Tomas Pfister. “Distilling step-by-step!
outperforming larger language models with less training data and smaller model sizes”. In: arXiv
preprint arXiv:2305.02301 (2023).
[334] Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, and Xiang Ren. “Pinto: Faithful language
reasoning using prompt-generated rationales”. In: arXiv preprint arXiv:2211.01562 (2022).
[335] Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin, and Xiang Ren. “SCOTT:
Self-consistent chain-of-thought distillation”. In: arXiv preprint arXiv:2305.01879 (2023).
[336] Mohammad Saeid Mahdavinejad, Mohammadreza Rezvan, Mohammadamin Barekatain,
Peyman Adibi, Payam Barnaghi, and Amit P Sheth. “Machine learning for Internet of Things data
analysis: A survey”. In: Digital Communications and Networks 4.3 (2018), pp. 161–175.
[337] Statista. Number of households with smart home products and services in use worldwide from 2015 to
2025. 2022. url: https://www.statista.com/statistics/1252975/smart-home-households-worldwide/.
[338] Akanksha Atrey, Prashant Shenoy, and David Jensen. “Preserving Privacy in Personalized Models
for Distributed Mobile Services”. In: 2021 IEEE 41st International Conference on Distributed
Computing Systems (ICDCS). IEEE. 2021, pp. 875–886.
[339] Jide S Edu, Xavier Ferrer-Aran, Jose Such, and Guillermo Suarez-Tangil. “SkillVet: automated
traceability analysis of Amazon Alexa skills”. In: IEEE Transactions on Dependable and Secure
Computing 20.1 (2021), pp. 161–175.
[340] Valentina Bianchi, Marco Bassoli, Gianfranco Lombardo, Paolo Fornacciari, Monica Mordonini,
and Ilaria De Munari. “IoT wearable sensor and deep learning: An integrated approach for
personalized human activity recognition in a smart home environment”. In: IEEE Internet of
Things Journal 6.5 (2019), pp. 8553–8562.
[341] P Sriramalakshmi, Shreyaa Parvath Rajkumar, and R Nivedhithaa. “Modern Machine Learning
and IoT Applications for Personalized Healthcare: Opportunities and Challenges”. In:
Transformation in Healthcare with Emerging Technologies (2022), pp. 199–216.
[342] Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, and
Xiaojuan Qi. “Is synthetic data from generative models ready for image recognition?” In: arXiv
preprint arXiv:2210.07574 (2022).
[343] Ling Yang, Zhilong Zhang, Yang Song, Shenda Hong, Runsheng Xu, Yue Zhao, Wentao Zhang,
Bin Cui, and Ming-Hsuan Yang. “Diffusion models: A comprehensive survey of methods and
applications”. In: ACM Computing Surveys 56.4 (2023), pp. 1–39.
[344] Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman.
“Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation”. In:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023,
pp. 22500–22510.
[345] Lvmin Zhang, Anyi Rao, and Maneesh Agrawala. “Adding conditional control to text-to-image
diffusion models”. In: Proceedings of the IEEE/CVF International Conference on Computer Vision.
2023, pp. 3836–3847.
[346] Haoyang Fang, Boran Han, Shuai Zhang, Su Zhou, Cuixiong Hu, and Wen-Ming Ye. “Data
augmentation for object detection via controllable diffusion models”. In: Proceedings of the
IEEE/CVF Winter Conference on Applications of Computer Vision. 2024, pp. 1257–1266.
[347] Susu Yao, Weisi Lin, EePing Ong, and Zhongkang Lu. “Contrast signal-to-noise ratio for image
quality assessment”. In: IEEE International Conference on Image Processing 2005. Vol. 1. IEEE. 2005,
pp. I–397.
[348] Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. “Blip-2: Bootstrapping language-image
pre-training with frozen image encoders and large language models”. In: International conference
on machine learning. PMLR. 2023, pp. 19730–19742.
[349] Srijan Das, Rui Dai, Michal Koperski, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, and
Gianpiero Francesca. “Toyota smarthome: Real-world activities of daily living”. In: Proceedings of
the IEEE/CVF international conference on computer vision. 2019, pp. 833–842.
[350] Glenn Jocher, Ayush Chaurasia, and Jing Qiu. Ultralytics YOLOv8. Version 8.0.0. 2023. url:
https://github.com/ultralytics/ultralytics.
[351] Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li,
Jianwei Yang, Hang Su, Jun Zhu, et al. “Grounding dino: Marrying dino with grounded
pre-training for open-set object detection”. In: arXiv preprint arXiv:2303.05499 (2023).
[352] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen.
“Mobilenetv2: Inverted residuals and linear bottlenecks”. In: Proceedings of the IEEE conference on
computer vision and pattern recognition. 2018, pp. 4510–4520.
[353] Yong Woon Kim and Addapalli VN Krishna. “A study on the effect of Canny edge detection on
downscaled images”. In: Pattern Recognition and Image Analysis 30 (2020), pp. 372–381.
[354] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. “Deep
unsupervised learning using nonequilibrium thermodynamics”. In: International conference on
machine learning. PMLR. 2015, pp. 2256–2265.
[355] Jonathan Ho, Ajay Jain, and Pieter Abbeel. “Denoising diffusion probabilistic models”. In:
Advances in neural information processing systems 33 (2020), pp. 6840–6851.
[356] Diederik P Kingma and Max Welling. “Auto-encoding variational bayes”. In: arXiv preprint
arXiv:1312.6114 (2013).
[357] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair,
Aaron Courville, and Yoshua Bengio. “Generative adversarial nets”. In: International Conference
on Neural Information Processing Systems (NIPS). 2014.
[358] Shihao Zhao, Dongdong Chen, Yen-Chun Chen, Jianmin Bao, Shaozhe Hao, Lu Yuan, and
Kwan-Yee K Wong. “Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models”. In:
arXiv preprint arXiv:2305.16322 (2023).
[359] Hsin-Ping Huang, Yu-Chuan Su, Deqing Sun, Lu Jiang, Xuhui Jia, Yukun Zhu, and
Ming-Hsuan Yang. “Fine-grained Controllable Video Generation via Object Appearance and
Context”. In: arXiv preprint arXiv:2312.02919 (2023).
[360] Swami Sankaranarayanan, Yogesh Balaji, Arpit Jain, Ser Nam Lim, and Rama Chellappa.
“Learning from synthetic data: Addressing domain shift for semantic segmentation”. In:
Proceedings of the IEEE conference on computer vision and pattern recognition. 2018, pp. 3752–3761.
[361] Sergey I Nikolenko. Synthetic data for deep learning. Vol. 174. Springer, 2021.
[362] Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey,
Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, et al. “Using
simulation and domain adaptation to improve efficiency of deep robotic grasping”. In: 2018 IEEE
international conference on robotics and automation (ICRA). IEEE. 2018, pp. 4243–4250.
[363] Anton Osokin, Anatole Chessel, Rafael E Carazo Salas, and Federico Vaggi. “GANs for biological
image synthesis”. In: Proceedings of the IEEE International Conference on Computer Vision. 2017,
pp. 2233–2242.
[364] Matthias Müller, Alexey Dosovitskiy, Bernard Ghanem, and Vladlen Koltun. “Driving policy
transfer via modularity and abstraction”. In: arXiv preprint arXiv:1804.09364 (2018).
[365] Erroll Wood, Tadas Baltrušaitis, Charlie Hewitt, Sebastian Dziadzio, Thomas J Cashman, and
Jamie Shotton. “Fake it till you make it: face analysis in the wild using synthetic data alone”. In:
Proceedings of the IEEE/CVF international conference on computer vision. 2021, pp. 3681–3691.
[366] Shekoofeh Azizi, Simon Kornblith, Chitwan Saharia, Mohammad Norouzi, and David J Fleet.
“Synthetic data from diffusion models improves imagenet classification”. In: arXiv preprint
arXiv:2304.08466 (2023).
[367] Mert Bülent Sarıyıldız, Karteek Alahari, Diane Larlus, and Yannis Kalantidis. “Fake It Till You
Make It: Learning Transferable Representations From Synthetic ImageNet Clones”. In:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). June
2023, pp. 8011–8021.
[368] Yongchao Zhou, Hshmat Sahak, and Jimmy Ba. “Training on thin air: Improve image
classification with generated data”. In: arXiv preprint arXiv:2305.15316 (2023).
[369] Xi He, Ashwin Machanavajjhala, and Bolin Ding. “Blowfish privacy: Tuning privacy-utility
trade-offs using policies”. In: Proceedings of the 2014 ACM SIGMOD international conference on
Management of data. ACM. 2014, pp. 1447–1458.
[370] Xiyang Liu and Sewoong Oh. “Minimax Rates of Estimating Approximate Differential Privacy”.
In: arXiv preprint arXiv:1905.10335 (2019).
[371] Om Thakkar, Galen Andrew, and H Brendan McMahan. “Differentially Private Learning with
Adaptive Clipping”. In: arXiv preprint arXiv:1905.03871 (2019).
[372] Root Metrics Inc. Metro RootScore Reports.
[373] Dennis Elbrächter, Dmytro Perekrestenko, Philipp Grohs, and Helmut Bölcskei. “Deep neural
network approximation theory”. In: arXiv preprint arXiv:1901.02220 (2019).
[374] Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, and Jean-Pierre Hubaux.
“Quantifying location privacy”. In: 2011 IEEE symposium on security and privacy. IEEE. 2011,
pp. 247–262.
[375] Mah-Rukh Fida, Andra Lutu, Mahesh K Marina, and Özgü Alay. “Zipweave: Towards efficient and
reliable measurement based mobile coverage maps”. In: IEEE INFOCOM 2017-IEEE Conference on
Computer Communications. IEEE. 2017, pp. 1–9.
[376] Ayon Chakraborty, Md Shaifur Rahman, Himanshu Gupta, and Samir R Das. “Specsense:
Crowdsensing for efficient querying of spectrum occupancy”. In: IEEE INFOCOM 2017-IEEE
Conference on Computer Communications. IEEE. 2017, pp. 1–9.
[377] Deepak Ravichandran and Nitish Korula. Effect of Disabling Third-party Cookies on Publisher
Revenue. https://services.google.com/fh/files/misc/disabling_thirdparty_cookies_publisher_revenue.pdf. 2019.
[378] Russell Heimlich. “Internet Users Don’t like Targeted Ads”. In: (2012).
[379] “TrackMeNot”. In: (2016).
[380] Paulo Almeida and Joao Gondim. “Click Fraud Detection and Prevention System for Ad
Networks”. In: Journal of Information Security and Cryptography (Enigma) 5 (Jan. 2019), p. 27. doi:
10.17648/jisc.v5i1.71.
[381] Andrea Peterson, Ashkan Soltani, and Barton Gellman. NSA uses Google cookies to pinpoint targets
for hacking. The Washington Post. 2013.
[382] Avi Goldfarb and Catherine E Tucker. “Privacy regulation and online advertising”. In:
Management science 57.1 (2011), pp. 57–71.
[383] Arpita Ghosh, Mohammad Mahdian, R Preston McAfee, and Sergei Vassilvitskii. “To match or not
to match: Economics of cookie matching in online advertising”. In: ACM Transactions on
Economics and Computation (TEAC) 3.2 (2015), pp. 1–18.
[384] Daniel C Howe and Helen Nissenbaum. “TrackMeNot: Resisting surveillance in web search”. In:
Lessons from the Identity trail: Anonymity, privacy, and identity in a networked society 23 (2009),
pp. 417–436.
[385] Howard Beales and Jeffrey A Eisenach. “An Empirical Analysis of the Value of Information
Sharing in the Market for Online Content”. In: Available at SSRN 2421405 (2014).
[386] Brett Stone-Gross, Ryan Stevens, Apostolis Zarras, Richard Kemmerer, Chris Kruegel, and
Giovanni Vigna. “Understanding fraudulent activities in online ad exchanges”. In: ACM Internet
Measurement Conference (IMC). 2011.
[387] Narseo Vallina-Rodriguez, Srikanth Sundaresan, Abbas Razaghpanah, Rishab Nithyanand,
Mark Allman, Christian Kreibich, and Phillipa Gill. “Tracking the Trackers: Towards
Understanding the Mobile Advertising and Tracking Ecosystem”. In: Workshop on Data and
Algorithmic Transparency (DAT). 2016.
[388] David Sontag, Kevyn Collins-Thompson, Paul N Bennett, Ryen W White, Susan Dumais, and
Bodo Billerbeck. “Probabilistic models for personalizing web search”. In: Proceedings of the fifth
ACM international conference on Web search and data mining. ACM. 2012.
[389] Hankz Hankui Zhuo, Wenfeng Feng, Qian Xu, Qiang Yang, and Yufeng Lin. “Federated
reinforcement learning”. In: arXiv:1901.08277 (2019).
[390] Ero Balsa, Carmela Troncoso, and Claudia Diaz. “OB-PWS: Obfuscation-based private web
search”. In: IEEE Symposium on Security & Privacy (S&P). 2012.
[391] Sai Teja Peddinti and Nitesh Saxena. “On the privacy of web search based on query obfuscation: a
case study of TrackMeNot”. In: International Symposium on Privacy Enhancing Technologies
Symposium. Springer. 2010, pp. 19–37.
[392] Mshabab Alrizah, Sencun Zhu, Xinyu Xing, and Gang Wang. “Errors, Misunderstandings, and
Attacks: Analyzing the Crowdsourcing Process of Ad-blocking Systems”. In: ACM Internet
Measurement Conference (IMC). 2019.
[393] What’s Trending in Display for Publishers? url:
http://web.archive.org/web/20130330113404/http://www.google.com/think/research-studies/whats-trending-in-display-for-publishers.html.
[394] IAB Tech Lab Content Taxonomy.
https://www.iab.com/guidelines/iab-quality-assurance-guidelines-qag-taxonomy/.
[395] Harpocrates (Harpokrates). https://en.wikipedia.org/wiki/Harpocrates.
[396] Shaozhi Ye, Felix Wu, Raju Pandey, and Hao Chen. “Noise injection for search privacy protection”.
In: 2009 International Conference on Computational Science and Engineering. IEEE. 2009.
[397] Bracha Shapira, Yuval Elovici, Adlay Meshiach, and Tsvi Kuflik. “PRAW—A PRivAcy model for
the Web”. In: Journal of the American Society for Information Science and Technology 56.2 (2005),
pp. 159–172.
[398] Josep Domingo-Ferrer, Agusti Solanas, and Jordi Castellà-Roca. “h (k)-Private information
retrieval from privacy-uncooperative queryable databases”. In: Online Information Review 33.4
(2009), pp. 720–744.
[399] Mummoorthy Murugesan and Chris Clifton. “Providing privacy through plausibly deniable
search”. In: Proceedings of the 2009 SIAM International Conference on Data Mining. SIAM. 2009.
[400] Workshop Report, International Workshop on Obfuscation: Science, Technology, and Theory.
http://www.obfuscationworkshop.org/wp-content/uploads/2017/10/obfuscation-workshop-report.pdf.
2017.
[401] Muhammad Ahmad Bashir, Sajjad Arshad, William Robertson, and Christo Wilson. “Tracing
Information Flows Between Ad Exchanges Using Retargeted Ads”. In: 25th USENIX Security
Symposium (USENIX Security 16). 2016.
[402] Rahat Masood, Dinusha Vatsalan, Muhammad Ikram, and Mohamed Ali Kaafar. “Incognito: A
method for obfuscating web data”. In: International Conference on World Wide Web (WWW). 2018.
[403] Gunes Acar, Christian Eubank, Steven Englehardt, Marc Juarez, Arvind Narayanan, and
Claudia Diaz. “The web never forgets: Persistent tracking mechanisms in the wild”. In: ACM
Conference on Computer and Communications Security (CCS). 2014.
[404] Surveying the Digital Future.
http://www.digitalcenter.org/wp-content/uploads/2013/10/2017-Digital-Future-Report.pdf.
2016.
[405] Sarah Frier and Bloomberg. Facebook Stops Recording Users’ Audio, as Contract Transcriptionists
Express Ethical Concerns. https://fortune.com/2019/08/13/facebook-audio-recording/.
[406] Ben Rossi. Facebook buys mobile analytics startup Onavo.
https://www.information-age.com/facebook-buys-mobile-analytics-startup-onavo-123457404/.
[407] David D. Kirkpatrick. Israeli Software Helped Saudis Spy on Khashoggi, Lawsuit Says.
https://www.nytimes.com/2018/12/02/world/middleeast/saudi-khashoggi-spyware-israel.html.
[408] Lawfare Blog. Snowden Revelations. https://www.lawfareblog.com/snowden-revelations.
[409] Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach.
https://www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election.
[410] Veronica Marotta, Vibhanshu Abhishek, and Alessandro Acquisti. Online Tracking and Publishers’
Revenues: An Empirical Analysis. Tech. rep. Working paper, 2019.
[411] WhiteOps. 2018-2019 Bot Baseline Fraud in Digital Advertising. Tech. rep. White Ops, 2019.
[412] DuckDuckGo. https://duckduckgo.com/about. 2019.
[413] Are ad blockers doomed or have we already won? A history lesson.
https://adguard.com/en/blog/ad-blocking-history.html. 2020.
[414] Umar Iqbal, Zubair Shafiq, and Zhiyun Qian. “The ad wars: retrospective measurement and
analysis of anti-adblock filter lists”. In: ACM Internet Measurement Conference (IMC). 2017.
[415] Shitong Zhu, Xunchao Hu, Zhiyun Qian, Zubair Shafiq, and Heng Yin. “Measuring and disrupting
anti-adblockers using differential execution analysis”. In: Network and Distributed Systems
Security (NDSS) Symposium. 2018.
[416] Esther Levin, Roberto Pieraccini, and Wieland Eckert. “Using Markov decision process for
learning dialogue strategies”. In: IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP). 1998.
[417] Sinno Jialin Pan and Qiang Yang. “A survey on transfer learning”. In: IEEE Transactions on
knowledge and data engineering 22.10 (2009), pp. 1345–1359.
[418] Cynthia Dwork. “Differential privacy: A survey of results”. In: International conference on theory
and applications of models of computation. 2008.
[419] Average display advertising clickthrough rates (CTRs) – 2020 compilation.
https://www.smartinsights.com/internet-advertising/internet-advertising-analytics/display-advertising-clickthrough-rates/.
[420] Jordi Castellà-Roca, Alexandre Viejo, and Jordi Herrera-Joancomartí. “Preserving user’s privacy
in web search engines”. In: Computer Communications 32.13-14 (2009), pp. 1541–1551.
[421] Ian Goldberg. “Improving the robustness of private information retrieval”. In: IEEE Symposium on
Security & Privacy (S&P). IEEE. 2007.
[422] Benny Chor, Oded Goldreich, Eyal Kushilevitz, and Madhu Sudan. “Private information retrieval”.
In: IEEE Symposium on Foundations of Computer Science (FOCS). 1995.
[423] Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. “Badnets: Identifying vulnerabilities in
the machine learning model supply chain”. In: IEEE Access (2019).
[424] Nilesh Dalvi, Pedro Domingos, Sumit Sanghai, and Deepak Verma. “Adversarial classification”. In:
ACM International Conference on Knowledge Discovery and Data mining (KDD). 2004.
[425] Battista Biggio, Igino Corona, Davide Maiorca, Blaine Nelson, Nedim Šrndić, Pavel Laskov,
Giorgio Giacinto, and Fabio Roli. “Evasion attacks against machine learning at test time”. In: Joint
European conference on machine learning and knowledge discovery in databases (ECML PKDD).
2013.
[426] Ali Shafahi, W Ronny Huang, Mahyar Najibi, Octavian Suciu, Christoph Studer, Tudor Dumitras,
and Tom Goldstein. “Poison frogs! targeted clean-label poisoning attacks on neural networks”. In:
Conference on Neural Information Processing Systems (NeurIPS). 2018.
[427] Aniruddha Saha, Akshayvarun Subramanya, and Hamed Pirsiavash. “Hidden Trigger Backdoor
Attacks”. In: AAAI Conference on Artificial Intelligence (AAAI). 2019.
[428] Chaofei Yang, Qing Wu, Hai Li, and Yiran Chen. “Generative poisoning attack method against
neural networks”. In: arXiv:1703.01340 (2017).
[429] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. “Targeted backdoor attacks on
deep learning systems using data poisoning”. In: arXiv:1712.05526 (2017).
[430] Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, and Michael K Reiter. “Accessorize to a crime:
Real and stealthy attacks on state-of-the-art face recognition”. In: ACM Conference on Computer
and Communications Security (CCS). 2016.
[431] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan,
Ian Goodfellow, and Rob Fergus. “Intriguing properties of neural networks”. In: International
Conference on Learning Representations (ICLR) (2014).
[432] AdNauseam Implementation. https://github.com/dhowe/AdNauseam.
[433] Mehdi Mirza and Simon Osindero. “Conditional generative adversarial nets”. In: arXiv:1411.1784
(2014).
[434] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and Harnessing Adversarial
Examples. 2015. arXiv: 1412.6572 [stat.ML].
[435] Minhao Cheng, Jinfeng Yi, Huan Zhang, Pin-Yu Chen, and Cho-Jui Hsieh. “Seq2Sick: Evaluating
the Robustness of Sequence-to-Sequence Models with Adversarial Examples”. In: CoRR
abs/1803.01128 (2018). arXiv: 1803.01128. url: http://arxiv.org/abs/1803.01128.
[436] Nicolas Papernot, Patrick D. McDaniel, Ananthram Swami, and Richard E. Harang. “Crafting
Adversarial Input Sequences for Recurrent Neural Networks”. In: CoRR abs/1604.08275 (2016).
arXiv: 1604.08275. url: http://arxiv.org/abs/1604.08275.
[437] Chrome-3rd-Party.
https://blog.google/products/chrome/update-testing-privacy-sandbox-web/. 2022.
[438] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory (Wiley Series in
Telecommunications and Signal Processing). USA: Wiley-Interscience, 2006. isbn: 0471241954.
[439] Thee Chanyaswad, Alex Dytso, H Vincent Poor, and Prateek Mittal. “Mvg mechanism:
Differential privacy under matrix-valued query”. In: Proceedings of the 2018 ACM SIGSAC
Conference on Computer and Communications Security. 2018, pp. 230–246.
[440] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. “BERT: Pre-training of deep
bidirectional transformers for language understanding”. In: North American Chapter of the
Association for Computational Linguistics: Human Language Technologies (2019), pp. 4171–4186.
[441] Yuheng Zhang, Ruoxi Jia, Hengzhi Pei, Wenxiao Wang, Bo Li, and Dawn Song. “The Secret
Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks”. In: 2020 IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR). 2020, pp. 250–258. doi:
10.1109/CVPR42600.2020.00033.
[442] Elizabeth L Feld. “United States Data Privacy Law: The Domino Effect After the GDPR”. In: N.C.
Banking Inst. Vol. 24. HeinOnline, 2020, p. 481.
[443] Martin Raič. “A multivariate Berry–Esseen theorem with explicit constants”. In: Bernoulli 25.4A
(2019), pp. 2824–2853.
[444] Semih Yagli, Alex Dytso, and H Vincent Poor. “Information-theoretic bounds on the
generalization error and privacy leakage in federated learning”. In: 2020 IEEE 21st International
Workshop on Signal Processing Advances in Wireless Communications (SPAWC). IEEE. 2020, pp. 1–5.
[445] Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. “CIFAR-10 dataset”. In: Canadian Institute for
Advanced Research. Available: https://www.cs.toronto.edu/~kriz/cifar.html (2009).
[446] Dheeru Dua and Casey Graff. UCI Machine Learning Repository. 2017. url:
http://archive.ics.uci.edu/ml.
[447] Timothy Stevens, Christian Skalka, Christelle Vincent, John Ring, Samuel Clark, and Joseph Near.
“Efficient differentially private secure aggregation for federated learning via hardness of learning
with errors”. In: 31st USENIX Security Symposium (USENIX Security 22). 2022, pp. 1379–1395.
[448] Di Chai, Leye Wang, Kai Chen, and Qiang Yang. “Fedeval: A benchmark system with a
comprehensive evaluation model for federated learning”. In: arXiv preprint arXiv:2011.09655
(2020).
[449] Xiaoyuan Liu, Tianneng Shi, Chulin Xie, Qinbin Li, Kangping Hu, Haoyu Kim, Xiaojun Xu, Bo Li,
and Dawn Song. “Unifed: A benchmark for federated learning frameworks”. In: arXiv preprint
arXiv:2207.10308 (2022).
[450] Yiwei Li, Tsung-Hui Chang, and Chong-Yung Chi. “Secure federated averaging algorithm with
differential privacy”. In: 2020 IEEE 30th International Workshop on Machine Learning for Signal
Processing (MLSP). IEEE. 2020, pp. 1–6.
[451] Xiaocheng Shang, Zhanxing Zhu, Benedict Leimkuhler, and Amos J Storkey.
“Covariance-controlled adaptive Langevin thermostat for large-scale Bayesian sampling”. In:
Advances in Neural Information Processing Systems 28 (2015).
[452] Stephan Mandt, Matthew D Hoffman, and David M Blei. “Stochastic gradient descent as
approximate bayesian inference”. In: arXiv preprint arXiv:1704.04289 (2017).
[453] Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, and Jinwen Ma. “The anisotropic noise in
stochastic gradient descent: Its behavior of escaping from sharp minima and regularization
effects”. In: arXiv preprint arXiv:1803.00195 (2018).
[454] Umut Simsekli, Levent Sagun, and Mert Gurbuzbalaban. “A tail-index analysis of stochastic
gradient noise in deep neural networks”. In: International Conference on Machine Learning. PMLR.
2019, pp. 5827–5837.
[455] Mert Gurbuzbalaban, Umut Simsekli, and Lingjiong Zhu. “The heavy-tail phenomenon in SGD”.
In: International Conference on Machine Learning. PMLR. 2021, pp. 3964–3975.
[456] Xinyue Zhang, Jiahao Ding, Maoqiang Wu, Stephen TC Wong, Hien Van Nguyen, and Miao Pan.
“Adaptive privacy preserving deep learning algorithms for medical data”. In: Proceedings of the
IEEE/CVF Winter Conference on Applications of Computer Vision. 2021, pp. 1169–1178.
[457] Xinwei Zhang, Xiangyi Chen, Mingyi Hong, Zhiwei Steven Wu, and Jinfeng Yi. “Understanding
clipping for federated learning: Convergence and client-level differential privacy”. In:
International Conference on Machine Learning, ICML 2022. 2022.
[458] Galen Andrew, Om Thakkar, Brendan McMahan, and Swaroop Ramaswamy. “Differentially
private learning with adaptive clipping”. In: Advances in Neural Information Processing Systems 34
(2021), pp. 17455–17466.
[459] Ming Liu, Stella Ho, Mengqi Wang, Longxiang Gao, Yuan Jin, and He Zhang. “Federated learning
meets natural language processing: a survey”. In: arXiv preprint arXiv:2107.12603 (2021).
[460] Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, and Jing Jiang. “Federated learning from
pre-trained models: A contrastive learning approach”. In: arXiv preprint arXiv:2209.10083 (2022).
[461] Yuanyishu Tian, Yao Wan, Lingjuan Lyu, Dezhong Yao, Hai Jin, and Lichao Sun. “FedBERT: when
federated learning meets pre-training”. In: ACM Transactions on Intelligent Systems and
Technology (TIST) 13.4 (2022), pp. 1–26.
[462] Borja Balle and Yu-Xiang Wang. “Improving the gaussian mechanism for differential privacy:
Analytical calibration and optimal denoising”. In: International Conference on Machine Learning.
PMLR. 2018, pp. 394–403.
[463] Joshua Christian Zhao, Atul Sharma, Ahmed Roushdy Elkordy, Yahya H Ezzeldin,
Salman Avestimehr, and Saurabh Bagchi. “LOKI: Large-scale Data Reconstruction Attack against
Federated Learning through Model Manipulation”. In: 2024 IEEE Symposium on Security and
Privacy (SP). IEEE Computer Society. 2023, pp. 30–30.
[464] Sivakanth Gopi, Yin Tat Lee, and Lukas Wutschitz. “Numerical composition of differential
privacy”. In: Advances in Neural Information Processing Systems 34 (2021), pp. 11631–11642.
[465] Alfred O Hero, Bing Ma, Olivier Michel, and John Gorman. “Alpha-Divergence for Classification,
Indexing and Retrieval”. In: (2001).
[466] Jiqiang Gao, Boyu Hou, Xiaojie Guo, Zheli Liu, Ying Zhang, Kai Chen, and Jin Li. “Secure
aggregation is insecure: Category inference attack on federated learning”. In: IEEE Transactions
on Dependable and Secure Computing (2021).
[467] Fiona Fui-Hoon Nah, Ruilin Zheng, Jingyuan Cai, Keng Siau, and Langtao Chen. Generative AI
and ChatGPT: Applications, challenges, and AI-human collaboration. 2023.
[468] Miao Xiong, Zhiyuan Hu, Xinyang Lu, Yifei Li, Jie Fu, Junxian He, and Bryan Hooi. “Can LLMs
Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs”. In: arXiv
preprint arXiv:2306.13063 (2023).
[469] Zhen Lin, Shubhendu Trivedi, and Jimeng Sun. “Generating with Confidence: Uncertainty
Quantification for Black-box Large Language Models”. In: arXiv preprint arXiv:2305.19187 (2023).
[470] Ebtesam Almazrouei, Hamza Alobeidli, Abdulaziz Alshamsi, Alessandro Cappelli,
Ruxandra Cojocaru, Merouane Debbah, Etienne Goffinet, Daniel Heslow, Julien Launay,
Quentin Malartic, Badreddine Noune, Baptiste Pannier, and Guilherme Penedo. “Falcon-40B: an
open large language model with state-of-the-art performance”. In: (2023).
[471] Fengjiao Zhang, Zhao Pan, and Yaobin Lu. “AIoT-enabled smart surveillance for personal data
digitalization: Contextual personalization-privacy paradox in smart home”. In: Information &
Management 60.2 (2023), p. 103736.
[472] Eran Toch, Yang Wang, and Lorrie Faith Cranor. “Personalization and privacy: a survey of
privacy risks and remedies in personalization-based systems”. In: User Modeling and User-Adapted
Interaction 22 (2012), pp. 203–220.
[473] Ashwin Karale. “The challenges of IoT addressing security, ethics, privacy, and laws”. In: Internet
of Things 15 (2021), p. 100420.
[474] Yunhao Ge, Jiashu Xu, Brian Nlong Zhao, Neel Joshi, Laurent Itti, and Vibhav Vineet. “Beyond
generation: Harnessing text to image models for object detection and segmentation”. In: arXiv
preprint arXiv:2309.05956 (2023).
[475] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. “Deep
Unsupervised Learning using Nonequilibrium Thermodynamics”. In: Proceedings of the 32nd
International Conference on Machine Learning. Ed. by Francis Bach and David Blei. Vol. 37.
Proceedings of Machine Learning Research. Lille, France: PMLR, July 2015, pp. 2256–2265.
[476] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal,
Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. “Learning transferable visual
models from natural language supervision”. In: International conference on machine learning.
PMLR. 2021, pp. 8748–8763.
[477] David Madras, Elliot Creager, Toniann Pitassi, and Richard Zemel. “Learning adversarially fair
and transferable representations”. In: International Conference on Machine Learning. PMLR. 2018,
pp. 3384–3393.
[478] Xiao Chen, Peter Kairouz, and Ram Rajagopal. “Understanding compressive adversarial privacy”.
In: 2018 IEEE Conference on Decision and Control (CDC). IEEE. 2018, pp. 6824–6831.
[479] Shaoan Xie, Zhifei Zhang, Zhe Lin, Tobias Hinz, and Kun Zhang. “Smartbrush: Text and shape
guided object inpainting with diffusion model”. In: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. 2023, pp. 22428–22437.
[480] Shiyuan Yang, Xiaodong Chen, and Jing Liao. “Uni-paint: A unified framework for multimodal
image inpainting with pretrained diffusion model”. In: Proceedings of the 31st ACM International
Conference on Multimedia. 2023, pp. 3190–3199.
[481] PK Anjana and PE Ameenudeen. “Autoencoders Based Digital Communication Systems”. In: ().
[482] Martín Abadi and David G Andersen. “Learning to protect communications with adversarial
neural cryptography”. In: arXiv preprint arXiv:1610.06918 (2016).
[483] Jiaming Song, Chenlin Meng, and Stefano Ermon. “Denoising diffusion implicit models”. In: arXiv
preprint arXiv:2010.02502 (2020).
[484] Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical
Text-Conditional Image Generation with CLIP Latents. 2022. arXiv: 2204.06125 [cs.CV].
[485] Yikai Wang, Chenjie Cao, Ke Fan, Xiangyang Xue, and Yanwei Fu. Towards Context-Stable and
Visual-Consistent Image Inpainting. 2024. arXiv: 2312.04831 [cs.CV].
[486] Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, and
Luc Van Gool. RePaint: Inpainting using Denoising Diffusion Probabilistic Models. 2022. arXiv:
2201.09865 [cs.CV].
[487] Harrison Edwards and Amos Storkey. Censoring Representations with an Adversary. 2016. arXiv:
1511.05897 [cs.LG].
[488] Manolis Savva, Angel X. Chang, Alexey Dosovitskiy, Thomas Funkhouser, and Vladlen Koltun.
“MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments”. In:
arXiv:1712.03931 (2017).
[489] Shuran Song, Fisher Yu, Andy Zeng, Angel X Chang, Manolis Savva, and Thomas Funkhouser.
“Semantic Scene Completion from a Single Depth Image”. In: Proceedings of 30th IEEE Conference
on Computer Vision and Pattern Recognition (2017).