EFFICIENCY IN PRIVACY-PRESERVING COMPUTATION VIA DOMAIN KNOWLEDGE
by
Weizhao Jin
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
May 2025
Copyright 2025 Weizhao Jin
Acknowledgements
I would like to first thank my advisor Dr. Srivatsan Ravi, for the freedom that he gave me during my PhD
years to pursue what really interests me. I would also like to thank my committee members, Dr. Bhaskar
Krishnamachari, Dr. Harsha Madhyastha, and Dr. Fred Morstatter, for all the support and guidance they
have kindly provided.
During my years in the program, I was also fortunate to receive multiple external mentorships. Dr.
Carlee Joe-Wong’s meticulousness always amazed me at how many aspects of any research problem can be
there (and of course, can be ignored by most people). And with her, every scary paper rebuttal even seems
like a fun problem-solving exercise. Among the few lucky ones, I was able to witness the startup story
of Dr. Salman Avestimehr and Dr. Chaoyang He and had the opportunity to work on FedML. Chaoyang
taught me a completely unique way of viewing the relationship between research and industry, exactly
when I was about to be consumed by the confusion most PhD students would experience in their early
days as researchers.
Since 2022, I have spent every summer at Amazon as an intern ready to accidentally rm -rf / at any
given moment but I also had the chance to get to know many amazing folks at Amazon. Shahin was kind
enough to patiently guide me through my first industry experience and teach me the ins and outs of being
a scientist at a tech company. Naveed taught me how interconnected different research problems can be
even when they seem not even close to being remotely related. Tancrède showed me that a singular person
can handle so many things at the same time (and I still have not learned how!) and be good at any research
topic he lays his eyes on. Aws made me realize there exists a poetic but also humorous approach towards
research. And Bee, thank you for putting your trust in me more than twice!
I would like to express my sincere gratitude towards the awesome advisors I had before I started my program,
Dr. Wenyuan Xu, Dr. Xiaoyu Ji, and Dr. Yuan Tian, for inspiring me years ago to pursue a PhD.
I am grateful for the fellow travelers whom I established great friendships with during my years trying
to figure out this whole PhD thing. Yuhang, Shanshan, Yixiang, and Jiajun helped build a community
where we truly enjoy the concept of collaboration.
My dog, Xigua, and my cat, Teemo, gave the unconditional and conditional love. They accompanied
me during those stressful deadlines in their sweet naps. I enjoy practicing our little barter system of treats
for attention.
To my sister, thank you for all the love and support throughout my entire life.
Lastly, Samantha, my partner in crime, experienced all the bittersweetness together with me in those
years. Without her, I would not be able to finish this journey by myself.
Table of Contents
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Privacy-Preserving Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Efficient Privacy via Domain Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3.1 Selective Protection of Private Data Sharing . . . . . . . . . . . . . . . . . . . . . . 4
1.3.2 Pre-Transformation of Private Data Sharing . . . . . . . . . . . . . . . . . . . . . . 5
1.3.3 Minimization of Involved Parties for Private Data Sharing . . . . . . . . . . . . . . 6
1.4 Organization of The Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Chapter 2: Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Federated Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Entity Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Path Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Homomorphic Encryption. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Differential Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.6 Non-Interactive Zero-Knowledge Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Chapter 3: Efficient Privacy-Preserving Deep Distributed Training . . . . . . . . . . . . . . . . . . 16
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3 Domain Knowledge Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.4 FedML-HE: Federated Learning With Selective Parameter Encryption . . . . . . . . . . . . 20
3.4.1 Methodology Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.4.2 Algorithm for HE-Based Federated Aggregation . . . . . . . . . . . . . . . . . . . . 21
3.4.3 Efficient Optimization by Selective Parameter Encryption . . . . . . . . . . . . . . 23
3.5 Quantifying Privacy Of Selective Parameter Encryption . . . . . . . . . . . . . . . . . . . . 25
3.5.1 Encrypted Aggregation Quantified in Privacy Budget . . . . . . . . . . . . . . . . . 25
3.5.2 Selective Parameter Encryption by Privacy Composition . . . . . . . . . . . . . . . 25
3.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.6.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.6.2 FL With Threshold HE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.6.3 Optimized Overheads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.6.4 Effectiveness of Selective Encryption Defense . . . . . . . . . . . . . . . . . . . . . 31
3.6.5 Privacy Guarantee Quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Chapter 4: Efficient Multi-Party Privacy-Preserving Machine Learning Inference for Entity
Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.3 Domain Knowledge Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.4 HASP: Efficient Two-Stage Encrypted Deep Entity Resolution . . . . . . . . . . . . . . . . 39
4.4.1 Methodology Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.4.2 Offline Plaintext Embedding Generation . . . . . . . . . . . . . . . . . . . . . . . . 41
4.4.3 Online Ciphertext Differential Evaluation . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.4 Efficient Training Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.5 Encrypted Approximation Layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.5.1 Synthetic Ranging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.5.2 Polynomial Degree Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5.3 Privacy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.6.1 Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.6.2 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.6.2.1 Model Entity Resolution Performance . . . . . . . . . . . . . . . . . . . . 51
4.6.2.2 Privacy Overheads in HASP . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.6.2.3 Dimension Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Chapter 5: Efficient Privacy-Preserving Path Validation System for Multi-Authority Sliced Networks 56
5.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.3 Domain Knowledge Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.4 P3V: Efficient Privacy-Preserving Path Validation . . . . . . . . . . . . . . . . . . . . . . . 60
5.4.1 Methodology Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.4.2 Base Approach: XOR-Hash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.3 Improved Design: XOR-Hash-NIZK . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.3.1 Protocol Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4.4 Malicious Node Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4.5 Path Validation Property Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.5 Application Use Case: Integration With 5G Network Slicing . . . . . . . . . . . . . . . . . 66
5.5.0.1 Path Slicing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.5.0.2 Path Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.5.0.3 Path Rerouting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.6 Path Validation Protocol Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.6.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.6.2 Evaluation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Chapter 6: Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.1 Dissertation Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.2 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
.1 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
.1.1 FL & Homomorphic Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
.1.2 HE Key Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
.1.3 Software Framework: Homomorphic Encryption In Federated Learning . . . . . . 89
.1.4 Framework APIs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
.1.5 Deploy Anywhere: A Deployment Platform MLOps For Edges/Cloud . . . . . . . . 90
.1.6 Proof of Base Full Encryption Protocol . . . . . . . . . . . . . . . . . . . . . . . . . 92
.1.7 Quantifying negligible privacy value in full encryption . . . . . . . . . . . . . . . . 94
.1.8 Proof of rJ-Privacy by Selective Parameter Encryption . . . . . . . . . . . . . . . . 95
.1.9 Proof of Privacy Budget Relationship Under Different Parameter Encryption Options 96
.1.10 Selective Parameter Encryption Privacy Proof Under Uniform Distribution . . . . . 96
.1.11 Selective Parameter Encryption Privacy Proof Under Exponential Distribution . . . 97
.1.12 Sensitivity Distribution and Privacy Budget Ratio of the Models Included . . . . . 97
.1.13 Parameter Sensitivity Map for LeNet . . . . . . . . . . . . . . . . . . . . . . . . . . 101
.1.14 Defense Effectiveness on CV and NLP Models . . . . . . . . . . . . . . . . . . . . . 101
.1.15 Experiments on Quantifying Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . 102
.1.16 Parameter Efficiency Techniques in HE-Based FL . . . . . . . . . . . . . . . . . . . 103
.1.17 Results on Different Scales of Models . . . . . . . . . . . . . . . . . . . . . . . . . . 105
.1.18 Results on Different Cryptographic Parameters . . . . . . . . . . . . . . . . . . . . 105
.1.19 Client Data Distribution Impact on Sensitivity . . . . . . . . . . . . . . . . . . . . . 107
.1.20 Impact from Number of Clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
.1.21 Communication Cost on Different Bandwidths . . . . . . . . . . . . . . . . . . . . 108
.1.22 Different Encryption Selections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
.1.23 Comparison with Other FL-HE Frameworks . . . . . . . . . . . . . . . . . . . . . . 109
.1.24 Change in Attack Performance over Training . . . . . . . . . . . . . . . . . . . . . 111
.1.25 MLOps Running Example Configuration . . . . . . . . . . . . . . . . . . . . . . . . 113
.1.26 The detailed procedure of P3V. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
List of Tables
3.1 Defense Effectiveness on CV and NLP Models: each configuration is attacked 10 times
and the best attack score is recorded (VIF for CV tasks and Reconstruction Accuracy for
NLP tasks). The minimum encryption ratios are selected as the smallest encryption ratio
observed that reduces the attack score to below a certain level (0.2 for VIF of images
and 0.1 for Reconstruction Accuracy of texts). The largest encryption ratio used will be
recorded if the method fails to provide the desired protection level. . . . . . . . . . . . . . 33
3.2 Quantifying Privacy of Selective Parameter Encryption: r1 and r2 represent the ratio
of sum induced by the random encryption and selective encryption respectively. The
minimum Laplace scales are taken based on the smallest scale of the Laplace noises that
reduces the attack score to a desired level. The theoretical value of r1 is one minus the
encryption ratio and that of r2 is calculated based on the corresponding sensitivity data. . 33
4.1 Comparisons of Entity Resolution Solutions: ●: best; ◐: medium; ○: worst. . . . . . . . . 38
4.2 Entity Resolution Datasets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3 Main Results of Non-private ER, Privacy-preserving ER, and HASP on Various Datasets:
each HASP result contains three entries with three distance measures. “-” indicates no
experiment is conducted in the corresponding paper. The best result for each dataset
among all methods, excluding HASP’s plaintext variants, is in bold. . . . . . . . . . . . . . 50
4.4 Ciphertext Overhead: the input and output file sizes are measured during the server-client
communication. The intermediate file sizes are measured for memory usage during
the computation on the server. Due to CKKS’s ciphertext packing techniques (in our
experiment ciphertext batch size is set to be 8192), any vector whose size is smaller than
the ciphertext batch size can be packed into a ciphertext of the same file size. . . . . . . . . 53
4.5 Time Cost of Encrypted Inference (in seconds) with Different Input Dimensions:
Manhattan distance does not increase multiplicative depth, while Euclidean and Cosine
increase it by one. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.1 Comparison of Different Path Validation Protocols: we refer to path validation protocols
without privacy-preserving (PP) design as Transparent Validation and path validation
solutions with some privacy-preserving design as Semi-PP Validation. . . . . . . . . . . . . 58
5.2 Step Cost of XOR-Hash-NIZK: with n nodes, in (i : j), i means iterations required across
parties and j means total runtime. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
1 Comparison of Differential Privacy, Secure Aggregation, and Homomorphic Encryption . . 88
2 HE Framework APIs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3 Vanilla Fully-Encrypted Models of Different Sizes: with 3 clients; Comp Ratio is calculated
by time costs of HE over time costs of Non-HE; Comm Ratio is calculated by file sizes of
HE over file sizes of Non-HE. CKKS is configured with default crypto parameters. . . . . . 102
4 Parameter Efficiency Overhead: PT means plaintext and CT means ciphertext. Communication reductions are 0.60 and 0.96. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5 Computational & Communicational Overhead of Different Crypto Parameter Setups:
tested with CNN (2 Conv+ 2 FC) and on 3 clients; model test accuracy ∆s is the difference
between the best plaintext global model and the best encrypted global model. . . . 106
6 Overheads With Different Parameter Selection Configs Tested on Vision Transformer:
“Enc w/ 10%” means performs encrypted computation only on 10% of the parameters; all
computation and communication results include overheads from plaintext aggregation for
the rest of the parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7 Comparison with Existing HE-Based FL Systems: ⃝ implies limited support. For Selective
Parameter Encryption, FLARE offers the (random) partial encryption option which does
not have clear indications of privacy impacts; for Encrypted Foundation Model Training,
the other two platforms require massive resources to train foundation models in encrypted
federated learning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
8 Different Frameworks: tested with CNN (2 Conv + 2 FC) and on 3 clients; Github commit
IDs are specified. For key management, our work uses a key authority server; FLARE uses
a security content manager; IBMFL currently provides a local simulator. . . . . . . . . . . 110
List of Figures
2.1 Examples of Typical Malicious Behaviors En Route: compromised nodes (red) can conduct
their major malicious behaviors, namely skipping (skips certain honest nodes between
compromised nodes), detour (reroutes packets via other compromised nodes that are not
on the path) and out-of-order (disrupts the assigned node order). . . . . . . . . . . . . . . . 12
2.2 Basic HE Operations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 (left) Data Reconstruction Attacks: an adversarial server can recover local training data
from local model updates and global model at last round; (middle) HE-based Federated
Aggregation: models are encrypted and the server acts as a computing service without
access to models; (right) Computation and Communication Overhead for Aggregating
Fully Encrypted Models: compared with Nvidia Flare [98] (which does not have provable
selective parameter encryption), overheads include encryption/decryption and encrypted
aggregation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2 Federated Learning Pipeline With Selective Parameter Encryption: in the Encryption
Key Agreement stage, clients can either use a distributed threshold key agreement protocol
or outsource a trusted key authority. We simplify the illustration here by abstracting the
key pair of the public key and secret key (partial secret keys if using threshold protocol)
as one key; in the Encryption Mask Calculation stage, clients use local datasets to
calculate local model sensitivity maps which are homomorphically aggregated at the
server to generate an encryption mask; in the Encrypted Federated Learning stage,
clients use homomorphic encryption with encryption mask to protect local model updates
where the server aggregates them but does not have access to sensitive local models. . . . 21
3.3 Selective Parameter Encryption: in the initialization stage, clients first calculate privacy
sensitivities on the model using its own dataset and local sensitivities will be securely
aggregated to a global model privacy map. The encryption mask will be then determined
by the privacy map and a set selection value p per overhead requirements and privacy
guarantee. Only the masked parameters will be aggregated in the encrypted form. . . . . . 24
3.4 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(Transformer-3t): calculated parameter sensitivity follows a Log-Normal Mixture
distribution, allowing a smaller privacy budget to achieve the same privacy guarantee. . . 27
3.5 Microbenchmark of Threshold-HE-Based FedAvg Implementation: with the x-axis
showing the sizes of vectors being aggregated, we use a two-party threshold setup.
Both the single-key variant and the threshold variant are configured with an estimated
precision of 36 for a fair comparison. Note that bars represent communication overheads
and lines represent computation overheads. . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.6 Computation (left) and Communication (right) Overhead Comparison For Models of
Different Sizes (logarithmic scale): 10% Encryption is based on our selection strategy and
50% encryption is based on random. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.7 Time Distribution of A Training Cycle on ResNet-50 on our industrial deployment
platform: plaintext FL (left), HE with full encryption (middle), and HE with selective
encryption (right). MLOps test env has a bandwidth of 20 MB/s (Multiple AWS Region).
The optimization setup uses an encryption mask with an encrypted ratio s = 0.01.
Detailed training configuration can be found in Appendix §.1.25. . . . . . . . . . . . . . . . 30
3.8 Selection Protection Against Gradient Inversion Attack [145] On LeNet with the CIFAR-100 Dataset: attack results when protecting random parameters (left) vs protecting
top-s sensitive parameters (right). Each configuration is attacked 10 times and the
best-recovered image is selected. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.9 Language Model Inversion Attacks [34] on GPT-2 with the wikitext Dataset: Red
indicates falsely-inverted words and Yellow indicates correctly-inverted words. . . . . . . 32
3.10 Sensitivity Distributions of Llama-3.2-1B & Llama-3.2-3B. . . . . . . . . . . . . . . . . . . . 34
4.1 Encrypted Computation Overhead Comparison in the Number of Encrypted Model
Parameters and Encryption Multiplicative Depth: compared to our efficient two-stage
solution HASP, an end-to-end fully encrypted deep entity resolution model (e.g., based on
BERT) will yield overheads of impractical scales. . . . . . . . . . . . . . . . . . . . . . . . . 36
4.2 HASP: Two-Stage Privacy-preserving Deep Entity Resolution: the raw record r1 and r2
from each data owner is encoded to contextualized embeddings e1 and e2 locally using
fine-tuned language models. These embeddings are then encrypted with homomorphic
encryption, resulting in ciphertexts Je1K and Je2K, which are transmitted to the remote
server. On the server side, a distance in ciphertext JedK between record ciphertexts is
calculated and served as input to the subsequent encrypted neural classifier for resolving
entity pairs. The final encrypted result JzK is sent back to the data owners for decryption. . 39
4.3 ReLU Approximation with Different Ranges: polynomial degree of 4; inaccurate input
ranges result in large errors between the target function and the approximation function. . 44
4.4 Estimating Approximation Function Ranges via Synthetic Ranging: synthetically
generated data samples are fed into the neural network. The activation range of the
function is evaluated for the polynomial approximation algorithm. . . . . . . . . . . . . . 46
4.5 Multiplicative Depth vs. Cumulative Approximation Errors (range=[-0.2, 0.2]): with
different polynomial degrees, there are two competing optimization objectives, namely
encryption multiplicative depth and cumulative approximation errors. . . . . . . . . . . . 47
4.6 Inference Computation Overhead Comparison: the computational cost of the fully
encrypted pipeline is estimated. The error bars for Plaintext and HASP are not clearly
visible due to the significant difference in scale. . . . . . . . . . . . . . . . . . . . . . . . . 54
5.1 XOR-Hash-NIZK Protocol Workflow: blue depicts the packet forwarding process and red
indicates the backward path validation process. Note that “forward” payload forwarding
and “backward” are independent of each other from a user experience perspective. . . . . 62
5.2 The 3-Stage Modular System: Path Slicing to generate a viable path, Path Validation
to validate the path while forwarding packets, and Path Rerouting to reroute if any
malicious behavior detected. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.3 Testbed Example: with a substrate network of 12 nodes and 15 edges, an optimal path
with 5 hops is found and to be validated. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.4 Protocol Comparison On Testbed: the XOR-Hash protocol and our protocol XOR-Hash-NIZK with two different implementations. . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.5 End-User Perspective Testbed Evaluation: with an example of 100 nodes (100 Mbps) where
forward steps (blue) directly impact service quality (e.g., the content delivery time cost per
communication constraints) and backward steps (red) are related to the path validation
running in the background (in normal cases which will not be directly noticed by users). . 70
1 Framework Structure: our framework consists of a three-layer structure including Crypto
Foundation to support basic HE building blocks, ML Bridge to connect crypto tools with
ML functions, and FL Orchestration to coordinate different parties during a task. . . . . . 89
2 Deployment Interface Example: Overhead distribution monitoring on each edge device
(e.g. Desktop (Ubuntu), Laptop (MacBook), and Raspberry Pi 4), which can be used to
pinpoint HE overhead bottlenecks and guide optimization. . . . . . . . . . . . . . . . . . . 91
3 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(Transformer-3). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(Transformer-3f). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(Transformer-S). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(GPT-2). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(LeNet). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(CNN). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
9 Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(ResNet-18). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
10 Model Privacy Map Calculated by Sensitivity on LeNet: darker color indicates higher
sensitivity. Each subfigure shows the sensitivity of parameters of the current layer. The
sensitivity of parameters is imbalanced and many parameters have very little sensitivity
(its gradient is hard to be affected by tuning the data input for attack). . . . . . . . . . . . . 101
11 Results for Selected CV Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
12 Results for Selected NLP Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
13 Defense Effectiveness of DP Noises of Different Scales Under Three Protection Methods:
an encryption ratio is fixed for each model from the beginning to guarantee a good attack
performance at first. Each configuration is attacked 10 times and the best attack score is
recorded. The experiments are repeated for at least three different sets of applied DP noises. 104
14 Deviation of Sensitivity Distribution Induced by Different Client Data Distribution: two
client data distributions constructed from the ImageNet dataset with 100 images from
distinct classes sampled at equal intervals. Distribution 1 contains data with labels of [0,
1, 2, 3, 5] while Distribution 2 contains data whose labels span across 0 to 400. . . . . . . . 107
15 Results on Different Number of Clients and Communication Setup . . . . . . . . . . . . . . 108
16 Attack Performance on Transformer-3 over Batch Iterations. Each configuration is
attacked 10 times and the best score is recorded. The experiment is repeated on 10
different data points and their mean is presented. . . . . . . . . . . . . . . . . . . . . . . . 112
17 ResNet-50 MLOps Training Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
18 XOR-Hash-NIZK Path Validation Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Abstract
In recent years, the growing reliance on user data for building server-side applications and services has
significantly heightened the importance of data privacy. To meet expanding privacy regulations like GDPR,
service providers have turned to privacy-preserving methods that maintain computational functionality
while protecting user privacy. However, integrating techniques such as homomorphic encryption into
application protocols presents a critical challenge: achieving a balance between privacy and efficiency. This
thesis explores two distinct domains within privacy-preserving computation, offering practical, domain-specific solutions to address challenges related to overheads and protocol complexity. The focus is on achieving efficient privacy in both machine learning and networks/IoT.
To illustrate how leveraging domain-specific insights from federated learning, entity resolution, and computer networking can substantially enhance the efficiency of privacy-preserving computation, we first introduce a selective encryption strategy for large-scale federated learning models, reducing overhead by encrypting only sensitive parameters while still maintaining robust privacy guarantees; secondly, we demonstrate how homomorphic encryption can be optimized for deep entity resolution via a two-stage computation scheme and novel techniques, including synthetic ranging and polynomial degree optimization, that preserve accuracy under encrypted computation; finally, we apply Non-Interactive Zero-Knowledge proofs to achieve lightweight privacy-preserving path validation across multi-authority network slices, ensuring data forwarding compliance without revealing sensitive topology details by utilizing a backward pairwise validation procedure. Taken together, these studies highlight how targeting domain-specific challenges via domain-specific knowledge can yield practical, scalable frameworks for efficient privacy-preserving computation.
Chapter 1
Introduction
1.1 Preamble
In an increasingly interconnected and data-driven world, ensuring privacy during computational processes
has become critical [5]. Privacy-preserving computation techniques, such as homomorphic encryption
(HE) [18, 30, 54], zero-knowledge proofs [49, 57, 71], and differential privacy [41, 42, 44], offer robust methods to protect sensitive information while enabling data processing. However, their widespread adoption
is hampered by computational inefficiencies and communication overhead, particularly when applied to
large-scale models and complex systems. Traditionally, the additional computational and communication
overheads introduced by privacy-preserving computation techniques, such as secure multi-party computation and homomorphic encryption, were regarded as inevitable trade-offs for ensuring strong privacy
guarantees. These methods, while theoretically sound, often face scalability challenges that limit their
practicality in real-world applications [27, 110].
However, recent advances suggest that integrating domain knowledge into the design and optimization of privacy-preserving protocols can significantly mitigate these inefficiencies [27, 102, 110, 138]. Tailoring solutions to the unique characteristics and requirements of specific computational domains opens the door to domain-specific optimizations. For example, exploiting sparsity in data representations [142],
leveraging proper cryptographic primitives for domain-specific problems [68], or customizing algorithmic
workflows in protocols [69] can yield considerable performance gains without compromising privacy.
This dissertation explores the central idea that domain knowledge can be leveraged to improve the efficiency of privacy-preserving computation without compromising privacy guarantees. This shift toward
domain-aware optimization highlights an emerging paradigm where the traditionally rigid overheads of
privacy-preserving computation can be reduced, paving the way for more practical and efficient implementations across diverse fields.
1.2 Privacy-Preserving Computation
Privacy-preserving computation is a field of study aimed at enabling computations on sensitive data while
ensuring its confidentiality. This paradigm is increasingly relevant in domains such as healthcare, finance,
and social sciences, where data privacy concerns often hinder collaborative analytics and research.
One of the foundational techniques in privacy-preserving computation is secure multi-party computation (SMPC), a cryptographic framework introduced by Yao [135]. SMPC allows multiple parties to jointly
compute a function over their inputs while keeping those inputs private.
Another key approach is homomorphic encryption, which permits computations to be performed directly on encrypted data without requiring decryption. The work by Gentry [54] introduced the first
fully homomorphic encryption (FHE) scheme, paving the way for secure outsourcing of computation to
untrusted environments, such as cloud servers. FHE has since been optimized for practical use cases,
including machine learning on encrypted datasets [4].
Differential privacy is another widely adopted technique for ensuring privacy during data analysis. Differentially private algorithms add calibrated noise to outputs to prevent leakage of individual data points,
even in aggregated results. Dwork et al. formalized this concept [41], which has been applied extensively
in statistics, database queries, and machine learning model training [1].
Generally, privacy-preserving computation primitives focus on extracted computation problems in
simplified forms, such as simple linear functions and statistical queries. However, in recent years, the integration of these methods has enabled significant advancements in privacy-preserving machine learning
(PPML) and computer network implementations, examples that combine specific domain problems with
privacy techniques to provide computational capabilities without compromising data privacy for users [1,
15, 61, 115].
1.3 Efficient Privacy via Domain Knowledge
Despite its progress, privacy-preserving computation faces challenges related to scalability, computational
efficiency, and usability [40, 52]. Addressing these limitations requires interdisciplinary collaboration
across computer science and domain-specific expertise.
The key to improving the efficiency of privacy-preserving computation is to reduce the amount of
private data being used for computation across parties, that is, whom to share private data with, what (form
of) private data to share, and how much private data to share. In practice, a direct translation of a plaintext
computation problem often results in a privacy-preserving variant of the problem where the additional
overhead of privacy primitives compromises practicability [27, 130]. Different from (trusted) plaintext
computation where the data processing pipeline is essentially regarded as being carried out at a single
party, the privacy-preserving variant of the same computation problem involves multiple untrusted parties
where the order of the computation steps and the data allocation at each step are tightly related to the
privacy guarantee promised. For example, vertical federated learning requires a data alignment step of the
data samples across different data owners before sharing data for training [81]. Another example is that for
privacy-regulated medical data collaboration projects, which require a patient entity matching procedure
via private set intersection [29], only the data belonging to the patients that exist in both medical entities
can be utilized without revealing their identities. This differs from the non-private approach where two
entire databases can be pulled into a single location for processing.
At a high level, the approach of leveraging domain knowledge to improve the efficiency of privacypreserving computation can be categorized into three meta-methods, regarding the three aspects we mentioned earlier, namely how much private data to share, what (form of) private data to share, and whom to
share private data with:
1. Selective protection of private data sharing;
2. Pre-transformation of private data sharing;
3. Minimization of involved parties for private data sharing.
The remaining parts of this section will provide an abstract view of how these three meta-methods
aim to optimize efficient privacy without involving specific domain knowledge.
1.3.1 Selective Protection of Private Data Sharing
In many privacy-preserving applications, not all data carries the same degree of importance. Instead of universally encrypting or protecting all data, given proper domain knowledge, we might be able to identify
and selectively protect only the most privacy-sensitive subset. This approach drastically reduces computational overhead, communication costs, and protocol complexity without significantly compromising
privacy.
Let $D = (d_1, d_2, \ldots, d_n)$ be the full set of private data in a given process. Define a binary mask $M \in \{0, 1\}^n$ such that $M_j = 1$ if $d_j$ is protected (encrypted, secret-shared, or otherwise hidden) and $M_j = 0$ otherwise. The goal is:
$$\min_{M \in \{0,1\}^n} O(M) \quad \text{subject to} \quad P(M) \leq t,$$
where:
• $O(M)$ measures the overhead (computational, communication, etc.) incurred by protecting the data elements indexed by $M$;
• $P(M)$ quantifies the privacy leakage when only a subset is protected;
• $t$ is a user- or system-defined privacy tolerance threshold.
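To make the selection concrete, a minimal greedy sketch in Python is given below; the sensitivity scores, per-element costs, and the additive leakage model standing in for $P(M)$ are illustrative placeholders rather than part of the formulation above.

```python
import numpy as np

def select_mask(sensitivity, cost, tolerance):
    """Greedily set M_j = 1 for the most sensitive elements until the estimated
    leakage P(M) of the still-unprotected elements drops below `tolerance` (t).
    The additive leakage model and the inputs are illustrative only."""
    mask = np.zeros(len(sensitivity), dtype=bool)   # M: True = protected, False = plaintext
    for j in np.argsort(-sensitivity):              # most sensitive first
        if sensitivity[~mask].sum() <= tolerance:   # P(M) already within tolerance t
            break
        mask[j] = True
    return mask, cost[mask].sum()                   # (M, O(M))

rng = np.random.default_rng(0)
mask, overhead = select_mask(rng.lognormal(size=10), np.ones(10), tolerance=2.0)
```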
1.3.2 Pre-Transformation of Private Data Sharing
Raw data (e.g., text, images, and sensor inputs) often contains redundancies and high-dimensional details
that are not always necessary for downstream tasks. A standard approach is to locally compress or embed
data into a representation that captures the essential information needed for the subsequent operations
while discarding extraneous details during pre-processing. Only this reduced representation is then processed in a privacy-preserving manner.
Definition 1.3.1 (Transform Function) A transformation function $F_T : \mathcal{X} \rightarrow \mathcal{Z}$, which maps raw private data $x \in \mathcal{X}$ to a compressed representation $z \in \mathcal{Z}$.
Definition 1.3.2 (Downstream Function) A downstream function $F_D : \mathcal{Z} \rightarrow \mathcal{Y}$, which completes the target computation (e.g., classification) on the representation $z$.
When privacy-preserving methods (e.g., homomorphic encryption) are applied (marked in double brackets), the overhead typically grows with the number of processing functions. Hence:
$$O = |\llbracket F_D(z) \rrbracket| + |F_T(x)|,$$
where:
• $O$ measures overhead (computational and communication);
• $z = F_T(x)$,
with $F_T$ running locally in plaintext to generate $z$, and $F_D$ running on protected data (e.g., encrypted $z$).
Thus, one aims to minimize the scope of the functions applied in a private manner:
$$\min_{F_T,\, F_D} |\llbracket F_D(z) \rrbracket| \quad \text{subject to} \quad \mathcal{L}_{\llbracket F_D(F_T) \rrbracket} \leq l,$$
where $\mathcal{L}$ is the task loss and $l$ constrains the task accuracy.
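A minimal sketch of this split is shown below. The random linear projection standing in for $F_T$ and the plaintext stub standing in for the protected evaluation of $F_D$ are illustrative assumptions; in practice $F_T$ would be a learned encoder and $F_D$ would be evaluated under a primitive such as homomorphic encryption.

```python
import numpy as np

rng = np.random.default_rng(0)

def transform_locally(x, projection):
    """F_T: runs in plaintext at the data owner, compressing raw x into a short z."""
    return projection @ x

def downstream_protected(z):
    """F_D: stand-in for the protected step; in practice this linear score would be
    evaluated over an encrypted z rather than in plaintext."""
    w = np.ones_like(z)
    return float(w @ z)

x = rng.normal(size=512)                          # raw, high-dimensional private record
projection = rng.normal(size=(16, 512)) / np.sqrt(512)
z = transform_locally(x, projection)              # cheap local pre-transformation
score = downstream_protected(z)                   # only the 16-dim z enters the private step
```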
1.3.3 Minimization of Involved Parties for Private Data Sharing
Privacy-preserving protocols often incur multiplicative overhead when many different parties (servers,
clients, nodes) must simultaneously participate, exchange secret tokens, or compute shared private data.
Moreover, sharing private information across multiple participants increases the likelihood of leakage (e.g.,
if any participant is malicious or compromised). To address this, one can minimize the number of parties
that must be involved in each protocol step, reducing overheads in both communication and computation.
In a straw-man design, each protocol step $k$ might involve all participants, i.e., $|P_k| = N$. However, domain-specific insights can show that only a subset $S_k \subseteq P_k$ of interactions is needed, so $|S_k|$ is significantly smaller. Formally, one aims to minimize the number of parties involved in the interactive privacy-preserving protocol steps:
$$\min \sum_{k=1}^{K} |S_k|,$$
where $K$ is the total number of protocol steps required.
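The toy comparison below, with hypothetical party sets, contrasts the straw-man cost $N \cdot K$ against a design where each step touches only a small subset $S_k$ (e.g., the pairwise interactions used for path validation in Chapter 5).

```python
def interaction_cost(step_parties):
    """Total number of party involvements, i.e., the sum over steps k of |S_k|."""
    return sum(len(parties) for parties in step_parties)

all_parties = {"A", "B", "C", "D", "E"}                 # N = 5 participants
strawman = [all_parties, all_parties, all_parties]      # every step involves everyone
pairwise = [{"A", "B"}, {"B", "C"}, {"C", "D"}]         # each step touches only two parties

print(interaction_cost(strawman), interaction_cost(pairwise))  # 15 vs. 6
```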
1.4 Organization of The Dissertation
To address the efficiency challenges of privacy-preserving computation, this dissertation explores specific
domain problems by exploiting domain-aware optimizations to balance privacy and efficiency guided by
the abstract optimization schema described above:
• In Chapter 2, we describe the preliminaries necessary for this dissertation, including the problems from the specific domains in which we wish to improve efficient privacy.
• In Chapter 3, we introduce FedML-HE: Efficient Privacy-Preserving Deep Distributed Training. Observing that not all model parameters contribute equally to privacy leakage, this
work proposes Selective Parameter Encryption, which encrypts only the most sensitive model parameters. This case study demonstrates how domain knowledge helps determine how much private data
to share, thus reducing the privacy-preserving workloads via selective protection of private data sharing.
Additionally, our formal theoretical framework quantifies the privacy-utility trade-off, showing significant improvements in overhead—achieving up to a 100x reduction for large-scale models like GPT-2 —
while retaining adequate privacy guarantees.
• In Chapter 4, we present HASP: Efficient Multi-Party Privacy-Preserving Machine Learning Inference for
Entity Resolution. This framework adopts a two-stage design, leveraging domain knowledge to split computations into offline plaintext embedding generation and online encrypted classification. In this case
study, the domain knowledge from deep learning-based entity resolution determines what form of private data to share across computing parties: the original private data is processed into rich embeddings via pre-transformation of private data sharing, followed by secure computation. By focusing on efficient function
approximation techniques, such as polynomial degree optimization, HASP reduces the overhead associated with homomorphic encryption while maintaining high accuracy in resolving entity pairs without
revealing private personal data.
• In Chapter 5, we design P3V: Efficient Privacy-Preserving Path Validation. In multi-authority network
environments, this work develops a backward pairwise validation protocol that ensures efficient and decentralized validation while guaranteeing path privacy without revealing the overall network topology
to adversarial nodes. This specific domain case illustrates how the domain knowledge from network
service validation can quantify the answer to whom to share the private data with through minimization
of involved parties for private data sharing while complying with the service requirements. Our protocol combines lightweight XOR-Hash operations with NIZK proofs such that each privacy-preserving
validation step involves only a minimal number of untrusted parties.
• In Chapter 6, we summarize the contributions of this dissertation and discuss future directions of efficient
privacy-preserving computation.
Collectively, these contributions demonstrate that integrating domain knowledge into privacy-preserving computation systems enables significant performance gains. This dissertation establishes a practical pathway toward scalable and efficient privacy-preserving solutions, with real-world applicability in fields such as healthcare, federated learning, and network security. By optimizing computational processes through problem-specific insights, this work advances the state of the art, addressing overhead, the key barrier to the deployment of privacy-aware systems.
Chapter 2
Preliminaries
2.1 Federated Learning
Federated learning (FL) was first proposed in [85]; it builds distributed machine learning models while
keeping personal data on clients. Instead of uploading data to the server for centralized training, clients
process their local data and share updated local models with the server. Model parameters from a large
population of clients are aggregated by the server and combined to create an improved global model.
FedAvg [85] is commonly used on the server to combine client updates and produce a new global model. At each round, a global model $W_{glob}$ is sent to $N$ client devices. Each client $i$ performs gradient descent on its local data with $E$ local iterations to update the model $W_i$. The server then does a weighted aggregation of the local models to obtain a new global model, $W_{glob} = \sum_{i=1}^{N} \alpha_i W_i$, where $\alpha_i$ is the weighting factor for client $i$.
Typically, the aggregation runs using plaintext model parameters through a central server (in some
cases, via a decentralized protocol), giving the server visibility of each local client’s model in plaintext.
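A minimal NumPy sketch of this aggregation step follows; taking $\alpha_i$ as normalized local dataset sizes is one common choice and is assumed here only for illustration.

```python
import numpy as np

def fedavg(local_models, dataset_sizes):
    """Weighted FedAvg aggregation: W_glob = sum_i alpha_i * W_i, with alpha_i
    taken here as normalized local dataset sizes."""
    alphas = np.asarray(dataset_sizes, dtype=float)
    alphas = alphas / alphas.sum()
    return sum(alpha * w for alpha, w in zip(alphas, local_models))

local_models = [np.random.default_rng(i).normal(size=4) for i in range(3)]  # W_1..W_3
w_glob = fedavg(local_models, dataset_sizes=[100, 50, 50])
```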
2.2 Entity Resolution
Privacy-preserving entity resolution (PPER) can be framed as a triple $T = (D, M, E)$, where $D = \{D_1, \ldots, D_n\}$ represents a collection of $n$ distinct datasets comprising records $r$, each owned by data owners $P = \{P_1, \ldots, P_n\}$. The encoding or encryption algorithm $E$ is responsible for maintaining the confidentiality of records from each dataset, such that $r$ from each $D$ is transformed into encoded or ciphertext form, denoted as $\llbracket r \rrbracket \in E(D)$, where $\llbracket r \rrbracket$ indicates a privacy-preserving representation of $r$ (i.e., ciphertext in our setting). The match set $M$ contains pairs of records that match between any two datasets among the $n$ parties, i.e., $M = \{(\llbracket r_i \rrbracket, \llbracket r_j \rrbracket) \mid r_i = r_j;\ \llbracket r_i \rrbracket \in E(D_k), \llbracket r_j \rrbracket \in E(D_m)\}$, where $r_i = r_j$ indicates that $r_i$ and $r_j$ refer to the same entity.
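As a toy illustration of the triple $T = (D, M, E)$, the sketch below stands in for $E$ with a salted hash so that the match set $M$ can be computed on encoded records; in our setting $E$ is homomorphic encryption over learned embeddings rather than exact hashing.

```python
import hashlib

def encode(record, salt=b"shared-secret"):
    """Stand-in for E: a salted hash of the normalized record (illustration only)."""
    return hashlib.sha256(salt + record.lower().encode()).hexdigest()

D1 = ["Alice Smith", "Bob Jones"]          # dataset of owner P1
D2 = ["Bob Jones", "Carol Wu"]             # dataset of owner P2
E1 = {encode(r): r for r in D1}
E2 = {encode(r): r for r in D2}

# Match set M: pairs of privacy-preserving representations referring to the same entity
M = [(h, h) for h in E1.keys() & E2.keys()]
```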
2.3 Path Validation
Network slicing has been proposed to incorporate a more robust array of services beyond what traditional 4G networks can provide [133]. One limitation of 4G networks is the number of User Equipment (UE) devices connected to Radio Access Networks (RAN); with mMTC, the expectation is that 5G should be able to provide service to up to 1 million radios in a square mile. Other high-level service features like URLLC require that radio transmissions not exceed 1 ms latency for 32-byte packets, and eMBB requires up to 20 Gigabit per second
peak usage. 3GPP proposes network slices [129] as a mechanism to create a logical slice over the physical
infrastructure, which transits the RAN, Transport Network (TN), and Core Network (CN) to meet strict
QoS constraints. Take for example the use case for a URLLC slice [103] where a doctor with a controller
and monitor is performing surgery on a patient across the country. The stability of a low-latency slice
enables the doctor to operate in conditions that would be natural if they were operating in person. It is
also for this reason that path validation of slices is a necessity.
Network path validation is used to enforce and verify cross-party multi-hop paths agreed to satisfy
certain service requirements. Deviating from agreed paths not only downgrades network service quality
but also disrupts network orchestration at large. The general objectives of path validation can be defined
as follows [72]:
• Enforcement: Path validation enforces that a packet is forwarded on the agreed path over each node
en route in the correct order.
• Verification: The sender node, intermediate nodes, or the receiver node is able to verify that the packet
forwarding follows the correct path.
It is worth mentioning that the verification objective reinforces the enforcement objective: since each node must provide proof that it followed the agreed path in order to pass path verification, each node tends to follow the protocol, thus enforcing the path. The general procedure of path validation can be described as follows: after correctly executing its portion of a given packet forwarding task, a node en route is required to prove, either to its neighboring nodes or to some third-party trusted authorities/APIs, that it indeed followed the agreed path/protocol.
We model the adversary A as a probabilistic polynomial-time machine and the nodes on the path
as interactive Turing machines. A, controlling a subset of compromised nodes, may exhibit the following malicious behaviors [20] (cf. Figure 2.1):
• Skipping: A node skips its successor and forwards the packet to another malicious node later en route.
• Detour: A malicious node forwards the packet to other malicious nodes that are not en route but eventually return back to the agreed path.
• Out-of-order: A group of malicious nodes forwards the packet but not in the agreed order.
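These deviation types can be made concrete with a small checker that compares the agreed path against an observed forwarding trace. The toy version below assumes the full trace is visible in plaintext, whereas the protocols in Chapter 5 must detect the same deviations cryptographically and without revealing the path.

```python
def classify_deviation(agreed_path, observed_path):
    """Toy classifier for the three deviation types, assuming the full forwarding
    trace is observable (real path validation cannot assume this)."""
    if observed_path == agreed_path:
        return "compliant"
    observed_set, agreed_set = set(observed_path), set(agreed_path)
    if not observed_set <= agreed_set:
        return "detour"        # visited nodes that are not on the agreed path
    if observed_set < agreed_set:
        return "skipping"      # some agreed nodes were never visited
    return "out-of-order"      # same nodes, wrong order

agreed = ["S", "n1", "n2", "n3", "R"]
print(classify_deviation(agreed, ["S", "n1", "n3", "R"]))             # skipping
print(classify_deviation(agreed, ["S", "n1", "m", "n2", "n3", "R"]))  # detour
print(classify_deviation(agreed, ["S", "n2", "n1", "n3", "R"]))       # out-of-order
```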
Figure 2.1: Examples of Typical Malicious Behaviors En Route: compromised nodes (red) can conduct their major malicious behaviors, namely skipping (skips certain honest nodes between compromised nodes), detour (reroutes packets via other compromised nodes that are not on the path), and out-of-order (disrupts the assigned node order).
2.4 Homomorphic Encryption
Homomorphic encryption (HE) allows computation over encrypted data while preserving the input/output relationship of a given function between the plaintext space and the ciphertext space, without the need for decryption during the computation [4, 50]. A typical public-key homomorphic encryption scheme consists of four major operations: key generation, encryption, encrypted evaluation, and decryption; see Figure 2.2. Since most real-world applications deal with floating-point numbers, we utilize the Cheon-Kim-Kim-Song (CKKS) scheme [30], a leveled homomorphic encryption scheme designed to handle approximate numbers.
• Key Generation: (pk, sk) ← HE.KeyGen(λ), which takes in a security parameter λ and outputs public
key pair (pk, sk);
• Encryption: c ← HE.Enc(pk, m), which encrypts plaintext message m with pk and returns ciphertext
c;
• Evaluation: cf ← HE.Eval(f, (c1, c2, . . . , ck)), which evaluates f over a ciphertext tuple (c1, c2, . . . , ck) and returns cf;
• Decryption: HE.Dec(sk, cf ) → f(m1, m2, . . . , mk), which decrypts ciphertext cf with sk; the result is equivalent to applying f directly on the plaintexts, that is, f(m1, m2, . . . , mk).
Figure 2.2: Basic HE Operations.
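The four operations can be exercised end to end with an off-the-shelf CKKS implementation. The sketch below assumes the TenSEAL Python bindings and illustrative cryptographic parameters; it is not the exact configuration used later in this dissertation.

```python
import tenseal as ts

# Key generation: a CKKS context holds the keys and crypto parameters
ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

# Encryption of two plaintext vectors
c1 = ts.ckks_vector(ctx, [1.0, 2.0, 3.0])
c2 = ts.ckks_vector(ctx, [0.5, 0.5, 0.5])

# Encrypted evaluation of f(m1, m2) = 2 * m1 + m2, entirely on ciphertexts
cf = c1 * 2 + c2

# Decryption: approximately [2.5, 4.5, 6.5] (CKKS works with approximate numbers)
print(cf.decrypt())
```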
2.5 Differential Privacy
Differential privacy (DP) [43] is a formal framework that ensures the output of a statistical analysis remains
nearly the same whether or not any individual’s data is included, thereby protecting individual privacy. It
can be formalized as follows:
Definition 2.5.1 (ϵ-Privacy [43]) A randomized algorithm $\mathcal{M}$ satisfies ϵ-privacy if for any two adjacent datasets $D_1$ and $D_2$ that vary by one data point, and for any possible output $O \subseteq \text{Range}(\mathcal{M})$, the following inequality holds:
$$\frac{\Pr[\mathcal{M}(D_1) \in O]}{\Pr[\mathcal{M}(D_2) \in O]} \leq e^{\epsilon}. \tag{2.1}$$
ϵ-privacy can be achieved by adding Laplace noise to model updates. Note that we can also use the Gaussian mechanism to quantify privacy here, with conversion between the mechanisms [21].
Lemma 2.5.2 (Achieving ϵ-Privacy by Laplace Mechanism [43]) A scale parameter $b$ can be chosen as $b = \frac{\Delta f}{\epsilon}$, such that the Laplace mechanism satisfies ϵ-privacy, where $\Delta f$ is the DP sensitivity, defined as the maximum difference in the output of a function $f$.
Definition 2.5.3 (Adjacent Datasets) Two datasets D1 and D2 are said to be adjacent if they differ in the
data of exactly one individual. Formally, they are adjacent if:
$$|D_1 \Delta D_2| = 1.$$
Definition 2.5.4 (Laplace Mechanism) Given a function $f : \mathcal{D} \rightarrow \mathbb{R}^d$, where $\mathcal{D}$ is the domain of the dataset and $d$ is the dimension of the output, the Laplace mechanism adds Laplace noise to the output of $f$. Let $b$ be the scale parameter of the Laplace distribution, whose density is given by:
$$\text{Lap}(x \mid b) = \frac{1}{2b} e^{-\frac{|x|}{b}}.$$
Given a dataset $D$, the Laplace mechanism $\mathcal{M}$ is defined as:
$$\mathcal{M}(D) = f(D) + \text{Lap}(0 \mid b)^d.$$
Definition 2.5.5 (Differential Privacy Sensitivity) To ensure ϵ-privacy, we need to determine the appropriate scale parameter $b$. The DP sensitivity $\Delta f$ of a function $f$ is the maximum difference in the output of $f$ when applied to any two adjacent datasets:
$$\Delta f = \max_{D_1, D_2 : |D_1 \Delta D_2| = 1} \| f(D_1) - f(D_2) \|_1.$$
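A direct implementation of Definition 2.5.4 together with the scale choice of Lemma 2.5.2 is sketched below; the counting-query example and its sensitivity of 1 are illustrative.

```python
import numpy as np

def laplace_mechanism(f_output, sensitivity, epsilon, rng=None):
    """Release f(D) under epsilon-privacy by adding Lap(0, b) noise per coordinate,
    with scale b = sensitivity / epsilon as in Lemma 2.5.2."""
    if rng is None:
        rng = np.random.default_rng()
    b = sensitivity / epsilon
    noise = rng.laplace(loc=0.0, scale=b, size=np.shape(f_output))
    return np.asarray(f_output, dtype=float) + noise

# Example: a counting query has L1 sensitivity 1
noisy_count = laplace_mechanism(f_output=42, sensitivity=1.0, epsilon=0.5)
```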
2.6 Non-Interactive Zero-Knowledge Proofs
In a zero-knowledge proof [49], a prover wants to convince a verifier that they possess certain knowledge
or that a statement is true, all without disclosing any secrets that could compromise privacy or security.
Traditionally, zero-knowledge proofs require multiple rounds of interaction between the prover and verifier—hence, they are called interactive zero-knowledge proofs.
However, in many real-world scenarios (e.g., secure cryptocurrency transactions, contract verification,
or secure voting protocols) [9, 93], repeated interactions between a prover and a verifier can be impractical.
To address this issue, Non-Interactive Zero-Knowledge Proofs (NIZK) were introduced. Non-Interactive
Zero-Knowledge proofs [89, 105] require little interaction (only one exchange) between a prover P and
a verifier V. During the process, P computes a proof π to convince V that a statement x ∈ L is true. V
verifies π, then decides to either accept or reject.
Several schemes and tools of NIZK have been developed over the past two decades, including
Groth16 [59], PLONK [53], Arkworks [32], and CIRCOM [12].
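To make the single-exchange structure concrete, the sketch below implements a toy Schnorr-style proof of knowledge of a discrete logarithm, made non-interactive with the Fiat-Shamir heuristic (the challenge is derived by hashing the commitment). It is not one of the production schemes listed above, and its parameters are far too small for real security; it only illustrates how P produces π and V checks it without further interaction.

```python
import hashlib
import secrets

# Toy Schnorr-style NIZK for knowledge of x such that y = g^x (mod p).
p = 0xFFFFFFFFFFFFFFC5          # the prime 2^64 - 59 (demo only, NOT secure)
g = 5
q = p - 1                        # exponent arithmetic is done modulo p - 1

def prove(x, y):
    r = secrets.randbelow(q)
    t = pow(g, r, p)                                                   # commitment
    c = int.from_bytes(hashlib.sha256(f"{g}|{y}|{t}".encode()).digest(), "big") % q
    s = (r + c * x) % q                                                # response
    return (t, s)                                                      # the proof pi

def verify(y, proof):
    t, s = proof
    c = int.from_bytes(hashlib.sha256(f"{g}|{y}|{t}".encode()).digest(), "big") % q
    return pow(g, s, p) == (t * pow(y, c, p)) % p                      # g^s == t * y^c (mod p)

x = secrets.randbelow(q)         # prover's secret witness
y = pow(g, x, p)                 # public statement
assert verify(y, prove(x, y))    # single message, no further interaction
```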
Chapter 3
Efficient Privacy-Preserving Deep Distributed Training
3.1 Background
Federated learning allows distributed clients to collectively train a global model without directly sharing
data. Instead of uploading raw data to a central server for training, clients train models locally and share
their model updates with the server, where the model updates are then averaged based on the aggregation
functions [85] to obtain a global model. While federated learning ensures that local raw data does not
leave their original locations, it remains vulnerable to eavesdroppers and malicious servers that might
exploit plaintext model updates to reconstruct sensitive training data (Fig. 3.1 (left)), i.e., gradient inversion
attacks [14, 33, 51, 62, 63, 64, 145]. This poses a privacy vulnerability especially when local models are
trained on small local datasets (e.g., smartphone text data for large language models). Local models derived
from these small datasets inherently contain fine-grained information, making it easier for adversaries to
extract sensitive information from local model updates.
Existing defense methods that reduce privacy leakage include differential privacy (DP) [23, 124] and secure aggregation [16, 117]. DP adds noise to the original model updates but may result in model performance degradation due to the privacy noise introduced. On the other hand, secure aggregation employs zero-sum masks to shield local model updates, ensuring that individual updates remain private. However, secure aggregation demands additional interactive synchronization steps and is sensitive to client dropout, making it less practical in real-world FL applications, where the unstable environments of clients face challenges such as unreliable internet connections and software crashes.
above, homomorphic encryption (HE) [18, 30, 47, 54, 99] offers a robust post-quantum secure solution
that protects local models against attacks and provides a privacy guarantee while introducing minimal model
performance degradation. As shown in Figure 3.1 (middle), HE-based federated learning (FedHE) encrypts
local models on clients and performs model aggregation over ciphertexts on the server to protect against
privacy attacks, which has been adopted by several FL systems [39, 66, 109, 138] and domain-specific
applications [120, 137].
Despite the advantages, homomorphic encryption remains a powerful but complex cryptographic foundation with impractical federated aggregation overheads (as shown in Figure 3.1 (right)) for most real-world applications. Prior FedHE solutions mainly employ existing generic HE methods without sufficient optimization for large-scale FL deployment [39, 66, 109, 138]. The scalability of encrypted computation and communication during federated training then becomes a bottleneck, restricting its feasibility for real-world scenarios. This HE overhead limitation is particularly noticeable (commonly a ∼15x increase in both computation and communication), as both grow linearly with the size of the model [30, 58]. Especially across resource-constrained devices, encrypted computation and communication of large models might take considerably longer than the actual model training.
To address these challenges, we propose an efficient homomorphic-encryption-based privacy-preserving FL solution with Selective Parameter Encryption for practical deployment. Our method significantly reduces communication and computation overheads, enabling efficient HE-based federated learning. We further provide the first theoretical framework to quantify the privacy guarantee of selective encryption, which indicates a significant improvement over random encryption and differential privacy, together with the important observation that the parameter sensitivities of most existing models follow Log-Normal Mixture distributions. Extensive experiments validate our privacy quantification framework.
Figure 3.1: (left) Data Reconstruction Attacks: an adversarial server can recover local training data from
local model updates and global model at last round; (middle) HE-based Federated Aggregation: models
are encrypted and the server acts as a computing service without access to models; (right) Computation
and Communication Overhead for Aggregating Fully Encrypted Models: compared with Nvidia Flare [98]
(which does not have provable selective parameter encryption), overheads include encryption/decryption
and encrypted aggregation.
Key contributions:
• We propose Selective Parameter Encryption in §3.4 that selectively encrypts the most privacy-sensitive parameters to minimize encrypted model updates and reduce overheads while providing a privacy guarantee quantified by our proposed privacy analysis framework.
• We provide the first theoretical framework for quantifying the privacy guarantee of selective homomorphic encryption in §3.5. Selective Parameter Encryption requires significantly less encryption than random selection, with a provable guarantee validated empirically.
• Extensive experiments in §3.6 show that the optimized system achieves significant overhead reduction while preserving privacy against state-of-the-art ML privacy attacks, particularly for large models (e.g., ∼1000x reduction for ResNet and ∼100x reduction for GPT-2), demonstrating the potential for real-world HE-based FL deployments.
3.2 Related Work
Privacy Attacks On FL. Threats and attacks on privacy in the domain of Federated Learning have been
studied in recent years [91]. Data reconstruction attacks [14, 33, 64] are usually carried out on the models
to retrieve certain properties of data providers or even reconstruct the data in the training datasets. With
direct access to more fine-grained local models trained on a smaller dataset [128], the adversary can have a
higher chance of a successful attack. Moreover, GAN-based attacks can even fully recover the original data [64]. The majority of these privacy attacks can be traced back to the direct exposure of plaintext local models to other parties.
Non-HE Defense Mechanisms. Local differential privacy has been adopted to protect local model updates by adding differential noise on the client side before server-side aggregation [23, 124]; however, the privacy guarantee requires large-scale statistical noise on fine-grained local updates, which generally degrades model performance [125]. Other work proposes applying zero-sum masks (usually pair-wise) to local model updates such that any individual local update is indistinguishable to the server [16, 117]. However, such a strategy introduces several challenges, including key/mask synchronization requirements and federated learner dropouts. Compared to these solutions providing privacy protection in FL, HE is non-interactive and dropout-resilient (vs. general secure aggregation protocols [16, 117]), and it introduces negligible model performance degradation (vs. noise-based differential privacy solutions [23, 124]).
Existing HE-based FL Work. Existing HE-based FL work either applies restricted HE schemes (e.g., the additive scheme Paillier) [48, 67, 138] without extensibility to further FL aggregation functions or provides a generic but impractical HE implementation of FL aggregation [39, 67, 83], including industrial platforms such as IBM FL [66], while leaving the key issue of impractical HE overheads as an unresolved question. In our work, we propose a novel Selective Parameter Encryption optimization scheme that largely reduces
the overheads as well as provides the first theoretical framework to quantify the privacy guarantee of
selective encryption, which makes HE-based FL viable and provable in practical deployments.
3.3 Domain Knowledge Extraction
Fully encrypted models can guarantee no access to plaintext local models from the adversary, but they have
high overheads. However, previous work on privacy leakage analysis shows that “partial transparency”,
e.g. hiding parts of the models [63, 90], can limit an adversary’s ability to successfully perform attacks
like gradient inversion attacks [82]. Combined with the observation that HE overheads are directly related
to the size of encrypted model parameters [83], we propose Selective Parameter Encryption to selectively
encrypt the most privacy-sensitive parameters to reduce impractical overheads while providing quantifiable
privacy preservation.
At each round t ∈ [T], the server performs the aggregation
[Wglob] = Σ_{i=1}^{N} αi [[M ⊙ Wi]] + Σ_{i=1}^{N} αi ((1 − M) ⊙ Wi),   (3.1)
where [Wglob] is the partially-encrypted global model, Wi is the i-th plaintext local model, [[·]] indicates the portion of the model that is fully encrypted, αi is the aggregation weight for client i, and M is the global model encryption mask (details in Algorithm 1).
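The following minimal NumPy sketch (not from the dissertation) illustrates the arithmetic of Eq. (3.1) under the simplifying assumption that a hypothetical FakeCiphertext class stands in for a CKKS ciphertext; a real deployment would use an HE library such as PALISADE or TenSEAL for the encrypted part.

import numpy as np

# Hypothetical stand-in for a CKKS ciphertext; real systems would use PALISADE/TenSEAL.
class FakeCiphertext:
    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)
    def __add__(self, other):
        return FakeCiphertext(self.values + other.values)    # homomorphic addition
    def scale(self, alpha):
        return FakeCiphertext(alpha * self.values)            # plaintext-scalar multiplication

def selective_aggregate(local_models, alphas, mask):
    """Eq. (3.1): encrypted aggregation on masked entries, plaintext aggregation elsewhere."""
    enc_part = None
    plain_part = np.zeros_like(local_models[0])
    for W_i, alpha_i in zip(local_models, alphas):
        c_i = FakeCiphertext(mask * W_i).scale(alpha_i)       # [[M ⊙ W_i]], weighted
        enc_part = c_i if enc_part is None else enc_part + c_i
        plain_part += alpha_i * ((1 - mask) * W_i)            # (1 − M) ⊙ W_i, weighted
    return enc_part, plain_part                               # together they form [W_glob]

# Toy example: 3 clients, 6 parameters, encrypt the 2 most sensitive positions.
models = [np.random.randn(6) for _ in range(3)]
alphas = [1 / 3] * 3
mask = np.array([1, 0, 0, 1, 0, 0])
enc, plain = selective_aggregate(models, alphas, mask)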
3.4 FedML-HE: Federated Learning With Selective Parameter
Encryption
In this section, we first provide the overview of FL with Selective Parameter Encryption in §3.4.1,
describe the general algorithmic design of HE-based FL in §3.4.2 and explain how Selective Parameter
Encryption optimizes the overheads in §3.4.3.
3.4.1 Methodology Overview
Figure 3.2: Federated Learning Pipeline With Selective Parameter Encryption: in the Encryption Key
Agreement stage, clients can either use a distributed threshold key agreement protocol or outsource to a
trusted key authority. We simplify the illustration here by abstracting the key pair of the public key and
secret key (partial secret keys if using threshold protocol) as one key; in the Encryption Mask Calculation stage, clients use local datasets to calculate local model sensitivity maps which are homomorphically
aggregated at the server to generate an encryption mask; in the Encrypted Federated Learning stage,
clients use homomorphic encryption with encryption mask to protect local model updates where the server
aggregates them but does not have access to sensitive local models.
As shown in Figure 3.2, our efficient HE-based federated training process at a high level goes through
three major stages: (1) Encryption key agreement: the clients either use threshold HE key agreement
protocol or trusted key authority to generate HE keys; (2) Encryption mask calculation: the clients and
the server apply Selective Parameter Encryption to agree on a selective encryption mask; (3) Encrypted
federated learning: at each round, the clients selectively encrypt local model updates using the HE key
and the encryption mask for efficient encrypted federated aggregation at the server.
3.4.2 Algorithm for HE-Based Federated Aggregation
We define a semi-honest adversary A that can corrupt the aggregation server or any subset of local clients.
A follows the protocol but tries to learn as much information as possible. Loosely speaking, under such
an adversary, the security definition requires that only the private information in local models from the
corrupted clients will be learned when A corrupts a subset of clients.
When A corrupts both the aggregation server and a number of clients, the default setup where the
private key is shared with all clients (also with corrupted clients) will allow A to decrypt local models
from benign clients (by combining encrypted local models received by the corrupted server and the private
key received by any corrupted client). This issue can be mitigated by adopting the threshold or multi-key
variant of HE where decryption must be collaboratively performed by a certain number of clients [8, 39, 83].
Since the multi-key homomorphic encryption issue is not the focus of this work, in the rest of the chapter
we default to a single-key setup, but details on threshold homomorphic encryption federated learning and
microbenchmarks are provided in §3.6.2.
Privacy-preserving federated learning systems utilize homomorphic encryption to enable the aggregation server to combine local model parameters without viewing them in their unencrypted form by
designing homomorphically encrypted aggregation functions. We primarily focus on FedAvg [85], which
has been shown to remain one of the best-performing federated aggregation strategies while maintaining
computational simplicity [127].
Our HE-based secure aggregation algorithm can be summarized as follows: given an aggregation server and N clients, each client i ∈ [N] owns a local dataset Di and initializes a local model Wi with the aggregation weighting factor αi; the key authority or the distributed threshold key agreement protocol generates a key pair (pk, sk) and the crypto context, then distributes the key pair and crypto context to clients and only the crypto context, which is public, to the server. The clients and the server then collectively calculate a global encryption mask M for Selective Parameter Encryption, also using homomorphic encryption.
Our algorithm needs only one HE multiplicative depth for weighting, which is preferred to reduce HE multiplication operations. Our method can also be easily extended to support more FL aggregation functions with HE by encrypting and computing the new parameters in these algorithms (e.g., FedProx [78]). We explain next in detail how the encryption mask M is formalized.
Algorithm 1 HE-Based Federated Aggregation
• [[W]]: the fully encrypted model | [W]: the partially encrypted model;
• p: the ratio of parameters for selective encryption;
• b: (optional) differential privacy parameter.
// Key Authority Generates Key
(pk, sk) ← HE.KeyGen(λ);
// Local Sensitivity Map Calculation
for each client i ∈ [N] do in parallel
    Wi ← Init(W);
    Si ← Sensitivity(W, Di);
    [[Si]] ← Enc(pk, Si);
    Send [[Si]] to server;
end
// Server Encryption Mask Aggregation
[[M]] ← Select(Σ_{i=1}^{N} αi [[Si]], p);
// Training
for t = 1, 2, . . . , T do
    for each client i ∈ [N] do in parallel
        if t = 1 then
            Receive [[M]] from server;
            M ← HE.Dec(sk, [[M]]);
        end
        if t > 1 then
            Receive [Wglob] from server;
            Wi ← HE.Dec(sk, M ⊙ [Wglob]) + (1 − M) ⊙ [Wglob];
        end
        Wi ← Train(Wi, Di);
        // Additional Differential Privacy
        if Add DP then
            Wi ← Wi + Noise(b);
        end
        [Wi] ← HE.Enc(pk, M ⊙ Wi) + (1 − M) ⊙ Wi;
        Send [Wi] to server S;
    end
    // Server Model Aggregation
    [Wglob] ← Σ_{i=1}^{N} αi [[M ⊙ Wi]] + Σ_{i=1}^{N} αi ((1 − M) ⊙ Wi);
end
3.4.3 Efficient Optimization by Selective Parameter Encryption
Selective Parameter Encryption works in two major steps: privacy leakage analysis on clients and encryption mask agreement across clients (see Figure 3.3).
Figure 3.3: Selective Parameter Encryption: in the initialization stage, clients first calculate privacy sensitivities of the model using their own datasets, and local sensitivities are securely aggregated into a global model privacy map. The encryption mask is then determined by the privacy map and a set selection value p per overhead requirements and privacy guarantee. Only the masked parameters will be aggregated in the encrypted form.
Step 1: Privacy Leakage Analysis on Clients. We adopt sensitivity [90, 97, 118] for measuring the general privacy risk of model gradients with respect to the input data. Formally, given a model W and K data samples with input matrix X and ground-truth label vector y, we compute the sensitivity of each parameter wm as
(1/K) Σ_{k=1}^{K} ∥Jk,m∥,
where Jk,m can be approximated by the second-order gradient ∂²ℓ(X, y, W) / (∂xk ∂wm), ℓ(·) is the loss function given X, y, and W, and ∥·∥ denotes the absolute value. The intuition is to measure how much the gradient of the parameter changes for each data point k. Each client i then sends the encrypted parameter sensitivity matrix [[Si]] to the server.
Different parts of a model contribute to attacks by revealing uneven amounts of information. Using this
insight, we propose to only select and encrypt parts of the model that are more important and susceptible
to attacks to reduce HE overheads while preserving adequate privacy.
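A minimal PyTorch sketch of this sensitivity calculation is given below. It is not the dissertation's implementation; for tractability it aggregates over input coordinates before taking absolute values, so the resulting per-parameter score only approximates the per-coordinate definition (1/K) Σk ∥∂²ℓ/∂xk∂wm∥ above.

import torch

def parameter_sensitivity(model, loss_fn, x, y):
    """Approximate per-parameter sensitivity of the loss gradient w.r.t. the inputs.

    Computes d/dw [ sum_k | dL/dx_k | ], a cheap stand-in for sum_k | d^2 L / dx_k dw |.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    # First-order gradient w.r.t. the inputs, kept in the graph for a second backward pass.
    (grad_x,) = torch.autograd.grad(loss, x, create_graph=True)
    surrogate = grad_x.abs().sum()
    # Second backward pass: how strongly each parameter influences the input gradient.
    grads_w = torch.autograd.grad(surrogate, list(model.parameters()), allow_unused=True)
    return [g.abs() if g is not None else None for g in grads_w]

# Toy usage on a tiny model and batch.
model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2))
x, y = torch.randn(16, 4), torch.randint(0, 2, (16,))
sens = parameter_sensitivity(model, torch.nn.CrossEntropyLoss(), x, y)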
Step 2: Encryption Mask Agreement across Clients. The sensitivity map depends on the model and also on the data. With potentially heterogeneous data distributions, the server aggregates local sensitivity maps into a global privacy map Σ_{i=1}^{N} αi [[Si]]. The global encryption mask M is then configured using a privacy-overhead ratio p ∈ [0, 1], which is the ratio of selecting the most sensitive parameters for
encryption. The global encryption mask is then shared among clients as part of the federated learning
configuration.
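For concreteness, a minimal NumPy sketch of how a top-p encryption mask could be derived from an aggregated sensitivity map is shown below; the function name and the plaintext handling are illustrative assumptions, since in the actual protocol the aggregation of [[Si]] happens under HE before the mask is revealed.

import numpy as np

def encryption_mask(global_sensitivity, p):
    """Return a 0/1 mask selecting the top-p fraction of most sensitive parameters."""
    flat = global_sensitivity.ravel()
    k = max(1, int(round(p * flat.size)))
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return (global_sensitivity >= threshold).astype(np.uint8)

# Toy example: aggregate three clients' sensitivity maps with equal weights,
# then keep the top 10% most sensitive parameters for encryption.
client_maps = [np.abs(np.random.randn(1000)) for _ in range(3)]
global_map = sum((1 / 3) * s for s in client_maps)
M = encryption_mask(global_map, p=0.10)
print(M.sum(), "parameters selected for encryption")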
3.5 Quantifying Privacy Of Selective Parameter Encryption
Although sensitivity calculation provides guidance on selecting important model parameters, to the best of
our knowledge there is no existing work that successfully quantifies the privacy guarantee from the model
parameter sensitivity. In this section, we, for the first time, provide a proof analyzing the privacy guarantee
of Selective Parameter Encryption using the theoretical framework of privacy budget analysis [41].
3.5.1 Encrypted Aggregation Quantified in Privacy Budget
Theorem 3.5.1 (Achieving ϵ0-Privacy by Full Homomorphic Encryption) For any two adjacent datasets D1 and D2, since M(D) is computationally indistinguishable, we have
Pr[M(D1) ∈ O] / Pr[M(D2) ∈ O] ≤ e^ϵ.   (3.2)
We then have ϵ = ϵ0 if O is encrypted, where ϵ0 is some negligible value.
In other words, A cannot retrieve any useful sensitive information from encrypted parameters. The
simulation-based proof of the basic protocol with fully encrypted federated learning can be found in Appendix §.1.6 and approximating the negligible value of ϵ0 can be found in Appendix §.1.7.
3.5.2 Selective Parameter Encryption by Privacy Composition
Lemma 3.5.2 (Sequential Composition [43]) If M1(x) satisfies ϵ1-privacy and M2(x) satisfies ϵ2-privacy, then the mechanism G(x) = (M1(x), M2(x)) that releases both results satisfies (ϵ1 + ϵ2)-privacy.
Based on Lemmas 2.5.2 and 3.5.2 and Theorem 3.5.1, letting J = Σ_{i=1}^{N} ∆fi / b, we can quantify the privacy of Full DP, random parameter encryption, and Selective Parameter Encryption.
Remark 3.5.3 (Achieving J-Privacy by Laplace Mechanism on All Model Parameters) If we add
Laplace noise on all parameters with fixed noise scale b, it satisfies J-privacy.
Remark 3.5.4 (Achieving (1 − p)J-Privacy by Random Encryption) If we randomly select model parameters with ratio p for homomorphic encryption and add Laplace noise on the remaining parameters, it
satisfies (1 − p)J-privacy.
Theorem 3.5.5 (rJ-Privacy by Selective Parameter Encryption) Suppose the sensitivity data follows a distribution with density function p(x), x ∈ [0, xmax]. Applying homomorphic encryption on a subset of model parameters S and the Laplace Mechanism on the remaining parameters [N] \ S with fixed noise scale b satisfies rJ-privacy with the privacy budget ratio
r = (1/µ) ∫_0^{Q_{1−p}} x p(x) dx,   (3.3)
where p is the fraction of homomorphically encrypted parameters, and µ and Q_{1−p} are the mean and the (1 − p)-th quantile of p(x), respectively.
The proof of Theorem 3.5.5 can be found in Appendix §.1.8.
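As a quick empirical check of Theorem 3.5.5, the privacy budget ratio r can be estimated directly from a sample of parameter sensitivities: it is the share of the total sensitivity mass that falls below the (1 − p)-th quantile. The NumPy sketch below is illustrative only and not taken from the dissertation's code.

import numpy as np

def privacy_budget_ratio(sensitivities, p):
    """Empirical r from Eq. (3.3): sensitivity mass left unencrypted over total mass."""
    s = np.asarray(sensitivities, dtype=float)
    q = np.quantile(s, 1.0 - p)              # the (1 - p)-th quantile Q_{1-p}
    return s[s <= q].sum() / s.sum()          # estimate of (1/mu) * integral_0^{Q_{1-p}} x p(x) dx

# Heavy-tailed sensitivities: encrypting the top 10% removes most of the budget.
rng = np.random.default_rng(0)
sens = rng.lognormal(mean=-6.0, sigma=2.5, size=100_000)
print(privacy_budget_ratio(sens, p=0.10))     # typically much smaller than 1 - p = 0.9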
Remark 3.5.6 Let b0, b1, and b2 respectively be the scales of the Laplace noise necessary for no encryption, (uniform) random encryption, and selective encryption to reach the desired protection level (approximating using J0 = J1 = J2). We then have the relation: b0 = (1 / (1 − p)) b1 = (1 / r) b2.
Letting ∆f ∼ D, it is clear that the quantification of the privacy guarantee of our Selective Parameter Encryption depends on the actual sensitivity distribution D of a given model.
[Figure 3.4 panels: (a) Estimation of the Sensitivity Distribution (true distribution vs. fitted Log-normal Mixture model); (b) Estimation of the Privacy Budget Ratio vs. encryption ratio, comparing random encryption against Uniform, Exponential, Log-normal Mixture, and true sensitivity distributions.]
Figure 3.4: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption
(Transformer-3t): calculated parameter sensitivity follows a Log-Normal Mixture distribution, allowing
a smaller privacy budget to achieve the same privacy guarantee.
Key Observation. Our extensive experiments indicate that the parameter sensitivities of a noticeable collection of popular models can be closely modeled by the Log-Normal Mixture distribution (as shown by the Transformer-3t example in Figure 3.4a and Figure 3.4b, with more models in Appendix §.1.12). Assuming the sensitivity distribution of a given model follows a Log-normal Mixture distribution D′ (with log-mean µi and log-standard-deviation σi for each component), Selective Parameter Encryption requires only an r portion of the privacy budget of complete privacy with the same privacy guarantee, where
r = [ Σ_i (λi / σi) ∫_0^{F^{−1}(1−p)} exp( −(ln x − µi)² / (2σi²) ) dx ] / [ √(2π) Σ_i λi exp(µi + σi²/2) ].   (3.4)
Compared with random encryption, Selective Parameter Encryption provides much stronger privacy preservation with the same encryption ratio (validated in §3.6.5). Such a framework can also fit any sensitivity distribution (Uniform and Exponential in Appendix §.1.10 and §.1.11).
3.6 Evaluation
In this section, we focus on the evaluation results to show how our proposed Selective Parameter Encryption largely mitigates HE overheads for real-world deployment while still guaranteeing an adequate defense against privacy attacks. We also validate our proposed theoretical privacy quantification.
Note that additional experimental results regarding other FL system aspects are included in Appendix
§.1.15.
3.6.1 Experiment Setup
Models. We test our framework on models in different ML domains with different sizes including LLMs
(more details in Appendix §.1.15).
Attack Datasets. We use the MNIST dataset (70k images), the CIFAR-100 dataset (50k images), and the WikiText dataset (100M tokens).
HE Libraries. We implement our HE core using both PALISADE and TenSEAL. Unless otherwise specified,
our results show the evaluation of the PALISADE version.
Default Crypto Parameters. Unless otherwise specified, we choose a multiplicative depth of 1, a scaling factor of 52 bits, an HE packing batch size of 4096, and a security level of 128 bits as our default HE cryptographic parameters during the evaluation.
Machines. (1) For microbenchmarking HE overheads, we use an Intel 8-core 3.60GHz i7-7700 CPU with
32 GB memory and an NVIDIA Tesla T4 GPU; (2) For real MLOps system experiments: we use machines
with Intel 6-core 3.70GHz i7-8700K CPU, 64GB memory and NVIDIA GeForce GTX 1080 Ti as clients and
an M3 Pro 11-core CPU with 18 GB memory as the aggregation server; (3) For attacking experiments, we
use 6 NVIDIA DGX H100 GPUs with 720 GPU hours.
[Figure 3.5 plot: communication cost (kB, bars) and execution time (ms, lines) for single-key HE vs. threshold HE across aggregated vector sizes.]
Figure 3.5: Microbenchmark of Threshold-HE-Based FedAvg Implementation: with the x-axis showing the
sizes of vectors being aggregated, we use a two-party threshold setup. Both the single-key variant and the
threshold variant are configured with an estimated precision of 36 for a fair comparison. Note that bars
represent communication overheads and lines represent computation overheads.
3.6.2 FL With Threshold HE
The threshold variant of HE schemes is generally based on Shamir’s secret sharing [116] (which is also
implemented in PALISADE). Key generation/agreement and decryption processes are in an interactive
fashion where each party shares partial responsibility for the task. Threshold key generation results in
each party holding a share of the secret key and threshold decryption requires each party to partially
decrypt the final ciphertext result and merge the partial results to get the final plaintext. We provide benchmarks
of the threshold-HE-based FedAvg implementation in Figure 3.5.
[Figure 3.6 bar charts: per-model computation and communication overheads for Llama 2 (7B), BERT, LeNet, ResNet-18, and a linear model.]
Figure 3.6: Computation (left) and Communication (right) Overhead Comparison For Models of Different
Sizes (logarithmic scale): 10% Encryption is based on our selection strategy and 50% encryption is based
on random selection.
3.6.3 Optimized Overheads
We first examine the overhead optimization gains from Selective Parameter Encryption. Figure 3.6 microbenchmarks the overhead reduction from encrypting only certain parts of models: both overheads are nearly proportional to the size of the encrypted model parameters, which is consistent with the general relationship between HE overheads and input sizes. Note that with 10% encryption under our Selective Parameter Encryption, the overheads are close to those of plaintext aggregation.
Plaintext FL: Train 88.64%, Comm:C-S 5.08%, Comm:S-C 5.08%, PlainAgg 1.21%.
Full HE: Train 33.40%, Comm:C-S 30.90%, Comm:S-C 30.90%, FHEAgg 2.34%, Dec 1.77%, Enc 0.69%, Init 0.01%.
Selective HE: Train 87.17%, Comm:C-S 5.75%, Comm:S-C 5.75%, FHEAgg 1.24%, Dec 0.05%, Init 0.02%, Enc 0.02%.
Figure 3.7: Time Distribution of A Training Cycle on ResNet-50 on our industrial deployment platform:
plaintext FL (left), HE with full encryption (middle), and HE with selective encryption (right). MLOps test
env has a bandwidth of 20 MB/s (Multiple AWS Region). The optimization setup uses an encryption mask
with an encrypted ratio s = 0.01. Detailed training configuration can be found in Appendix §.1.25.
Figure 3.7 dissects the training-cycle overhead distribution for the HE framework (both with and without optimization) and the plaintext framework. Note that here we only consider the cost distribution of a single round instead of the entire federated training. This is because, with a proper CKKS crypto parameter setup, the model accuracy of encrypted training differs only marginally from that of plaintext training, even though encrypted training performs approximate computation under the hood (experimental results regarding this part can be found in Table 3 in the Appendix). For a medium-sized model, the majority of overheads (both computation and communication) shift to HE-related steps in the full HE mode (w/o optimization) compared to the plaintext mode. However, when optimized by Selective Parameter Encryption, the overheads from HE drop dramatically such that the local training step becomes the majority again.
3.6.4 Effectiveness of Selective Encryption Defense
To evaluate the defense effectiveness of Selective Parameter Encryption, we encrypt model parameters per
parameter sensitivity and perform inversion attacks (CV: DLG [145]; NLP: TAG [34]).
Figure 3.8: Selection Protection Against Gradient Inversion Attack [145] On LeNet with the CIFAR-100
Dataset: attack results when protecting random parameters (left) vs protecting top-s sensitive parameters
(right). Each configuration is attacked 10 times and the best-recovered image is selected.
Defense effectiveness on CV tasks. We use image samples from CIFAR-100 to calculate the parameter sensitivities of the model. In the DLG attack experiments, we use the Multi-Scale Structural Similarity Index (MS-SSIM), Visual Information Fidelity (VIF), and Universal Image Quality Index (UQI) as metrics to measure the similarity between recovered images and original training images, which reflects the attack quality and hence the privacy leakage. In Figure 3.8, whereas random encryption needs to encrypt 42.5% of the parameters before it starts to protect against attacks, our top-5% encryption selection according to the model privacy map alone can defend against the attacks, meaning a lower overall overhead with the same amount of privacy protection.
Original Text: “The Tower Building of the Little Rock Arsenal, also known as U.S. Arsenal Building, is a building located in MacArthur Park in downtown Little Rock, Arkansas”.
Privacy Attack on 10% Random Encryption (Accuracy: 0.2500 | S-BLEU: 0.16 | ROUGE-L: 0.61): “The Tower Building of Little Rock the the Arsenal, also as of and the in Arsenal building in in MacArthur for an Some Park a in downtown Rock, It public”.
Privacy Attack on 1% Selective Encryption (Accuracy: 0.0312 | S-BLEU: 0.02 | ROUGE-L: 0.11): “As It Robert of one One December My students an gave ---------- as as a for historical it 83 <|endoftext|> stating ignore far suggested That What It undeniably pseud make persuaded bi”.
Figure 3.9: Language Model Inversion Attacks [34] on GPT-2 with the wikitext Dataset: Red indicates
falsely-inverted words and Yellow indicates correctly-inverted words.
Defense effectiveness on NLP tasks. We use language samples from the wikitext dataset in our experiment. As shown in Figure 3.9, with our sensitivity map indicating the top 1% privacy-sensitive parameters, our encryption mask prevents inversion attacks and yields better defense results than randomly encrypting 10% of the model parameters.
Empirical Selection Recipe. In Table 3.1, we show that empirically, encrypting the top-10% most sensitive parameters tends to be adequate to defend against inversion attacks [63], but up to 90% are needed
for random encryption.
3.6.5 Privacy Guarantee Quantification
To validate Remark 3.5.6, we fix the encryption ratio for both random and selective encryption on each selected model and gradually increase the noise scales. When all the encryption methods reach a predefined protection level, we record the minimum noise scale needed and compare the experimental ratios with the theoretical values. The encryption ratio is chosen to be small so that the attack score is not already too low before noise is added, allowing us to observe the influence of the Laplace noise.
Model | Size | Selective Enc. Min. Ratio | Selective Enc. Attack Score | Random Enc. Min. Ratio | Random Enc. Attack Score
LeNet | 88,648 | 0.05 | 0.1411 ± 0.0487 | 0.11 | 0.1835 ± 0.0720
CNN | 2,202,660 | 0.001 | 0.1640 ± 0.0530 | 0.007 | 0.1861 ± 0.0494
ResNet-18 | 11,220,132 | 0.001 | 0.1792 ± 0.1234 | 0.05 | 0.1458 ± 0.0732
Transformer-3f | 10,702,129 | 0.1 | 0.0000 ± 0.0000 | 0.9 | 0.2000 ± 0.1672
Transformer-3 | 10,800,433 | 0.1 | 0.0000 ± 0.0000 | 0.9 | 0.9750 ± 0.0415
Transformer-S | 53,091,409 | 0.1 | 0.0000 ± 0.0000 | 0.6 | 0.0875 ± 0.0573
GPT-2 | 124,439,808 | 0.01 | 0.0875 ± 0.0935 | 0.4 | 0.0644 ± 0.0720
Table 3.1: Defense Effectiveness on CV and NLP Models: each configuration is attacked 10 times and the best attack score is recorded (VIF for CV tasks and Reconstruction Accuracy for NLP tasks). The minimum encryption ratios are selected as the smallest encryption ratio observed that reduces the attack score to below a certain level (0.2 for VIF of images and 0.1 for Reconstruction Accuracy of texts). The largest encryption ratio used will be recorded if the method fails to provide the desired protection level.
Model | Enc Ratio | Min. Laplace Scale (Full DP) | Min. Laplace Scale (Random + DP) | Min. Laplace Scale (Selective + DP) | r1 Exp. | r1 Theo. | r2 Exp. | r2 Theo.
LeNet | 0.005 | 0.11 | 0.09 | 0.09 | 0.8182 | 0.9950 | 0.8182 | 0.8094
TF-3 | 0.01 | 0.013 | 0.013 | 0.003 | 1.0000 | 0.9995 | 0.2308 | 0.8850
TF-3f | 0.01 | 0.0125 | 0.0125 | 0.0025 | 1.0000 | 0.9999 | 0.2000 | 0.9587
TF-3t | 0.01 | 0.013 | 0.012 | 0.004 | 0.9231 | 0.9990 | 0.3077 | 0.9214
Table 3.2: Quantifying Privacy of Selective Parameter Encryption: r1 and r2 represent the ratio of sum induced by the random encryption and selective encryption respectively. The minimum Laplace scales are taken based on the smallest scale of the Laplace noises that reduces the attack score to a desired level. The theoretical value of r1 is one minus the encryption ratio and that of r2 is calculated based on the corresponding sensitivity data.
As shown in Table 3.2, the four cases show, with acceptable errors, that our theorem provides an upper bound for the differential privacy budget of the random and selective encryption methods.
Figure 3.10 shows how our method performs on newer LLMs from the Llama-3.2 collection. The experimental results indicate that newer LLMs align closely with the findings observed in our experiments
on earlier models.
Figure 3.10: Sensitivity Distributions of Llama-3.2-1B & Llama-3.2-3B.
3.7 Conclusion
In this work, we propose the first practical homomorphic-encryption-based privacy-preserving FL solution with Selective Parameter Encryption, which is designed to support efficient foundation model federated training. Selective Parameter Encryption selectively encrypts the most privacy-sensitive parameters to minimize the size of encrypted model updates and reduce overheads while providing privacy guarantees quantifiable by our proposed theoretical privacy analysis framework. Future work includes further improving the performance of threshold HE in the less trusted FL setting, as well as supporting decentralized primitives such as Proxy Re-Encryption [11] for a dynamic FL ecosystem.
Chapter 4
Efficient Multi-Party Privacy-Preserving Machine Learning Inference
for Entity Resolution
4.1 Background
Entity resolution (ER), also known as record linkage or data matching, identifies and merges data records
that refer to the same real-world entity across different datasets. It is widely applied in big data applications such as business, finance, customer data integration, medical data analysis, etc. [22, 31, 134]. In data
management, ER plays a crucial role in ensuring data consistency and quality across different sources, particularly when dealing with records that may contain variations or incomplete information. By effectively
resolving entities, ER overcomes challenges posed by data inconsistencies, duplicates, and variations across
different data sources, thus enabling data owners to significantly improve data quality, enhance decision-making processes, and gain more comprehensive insights from their data. However, entity resolution
in real-world applications faces two primary challenges: i) the inherent variability and unpredictability
of data quality across diverse sources, and ii) privacy concerns, especially when dealing with sensitive
datasets. Datasets used for entity resolution are often collected from various devices and sources, resulting in different quality, formats, and standards. Some datasets may suffer from inconsistencies or missing
values, further complicating the entity resolution process. Moreover, many datasets contain personal or
sensitive information, making privacy a significant concern, especially for applications in healthcare [2, 123], finance [132, 139], and education [84, 106]. Users may require entity resolution services without exposing their sensitive data, which requires the method to determine record matches without sharing or revealing plaintext information.
[Figure 4.1 bar chart: number of parameters (millions) and encryption multiplicative depth for HASP, BERT-base, and BERT-large.]
Figure 4.1: Encrypted Computation Overhead Comparison in the Number of Encrypted Model Parameters
and Encryption Multiplicative Depth: compared to our efficient two-stage solution HASP, an end-to-end
fully encrypted deep entity resolution model (e.g., based on BERT) will yield overheads of impractical
scales.
Traditional ER approaches [31, 100] typically rely on manual feature engineering and rule-based methods. These methods require predefined rules and human intervention to handle variations in entity representation. While these methods might be useful for simple datasets, they struggle with scalability and
adaptability when faced with the complexity and dynamism of real-world data. In recent years, deep learning models, particularly embedding-based techniques, have demonstrated significant improvements in ER
tasks [19, 77, 79, 92, 131, 143]. These approaches automatically learn entity representations from raw data,
which eliminates the need for extensive feature crafting. To further address privacy concerns, privacy-preserving
privacy mechanisms such as noise addition or perturbation to address some privacy risks, they often come
at the cost of reduced performance or remain susceptible to certain attacks [119, 141].
Among various privacy primitives, homomorphic encryption (HE) is a promising technology for ensuring privacy by allowing encrypted data to be processed without exposing the underlying information [4,
50, 96, 104, 113, 114, 136, 146]. However, directly integrating HE into large-scale models like BERT is impractical due to the large numbers of parameters and high computational cost; see Figure 4.1. Moreover,
HE schemes are limited to supporting only basic operators like addition and multiplication, while more
complex operations, such as non-linear transformations in neural networks, lack direct mapping, making
it impractical to adapt these operators straightforwardly.
To mitigate the efficiency challenge in encrypted deep entity resolution, we propose Hide-And-Seek-Pair (HASP), a novel privacy-preserving framework for entity resolution that leverages homomorphic
encryption in conjunction with embedding models in a two-stage fashion. Our two-stage approach is
designed to achieve a practical balance between ensuring robust privacy protection and minimizing the
computational overhead typically associated with HE, while still maintaining strong performance. This
framework offers a secure and efficient solution for performing ER on encrypted data, making it suitable
for real-world applications involving sensitive information.
We summarize our main contributions as follows:
• We propose a novel homomorphic-encryption-based privacy-preserving framework, HASP, that leverages embedding models for entity resolution tasks in an efficient two-stage fashion.
• Our approach provides a practical balance between privacy and utility with a minimal overhead increase.
• Our approach adopts Synthetic Ranging and Polynomial Degree Optimization to effectively realize precise non-linear function approximation in encrypted neural networks.
• Our approach outperforms other state-of-the-art deep entity resolution methods on various dataset tasks.
Methods | Performance | Privacy | Overhead
Non-private [79, 92] | ● | ○ | ●
Bloom Filter based approach [113] | ○ | ◐ | ◐
CampER (NN-based) [61] | ◐ | ◐ | ◐
HASP (Ours) | ● | ● | ◐
Table 4.1: Comparisons of Entity Resolution Solutions: ●: best; ◐: medium; ○: worst.
4.2 Related Work
Traditional rule-based approaches [31, 100] and deep learning (DL) methods [19, 77, 79, 92, 131, 143] have
been widely used in ER. While these approaches have shown effectiveness in ER tasks, they fail to address a critical concern: data privacy. As awareness of user data privacy grows and regulations around
data privacy compliance become more stringent, the need for privacy-preserving ER solutions has become
paramount. However, bridging privacy-preserving computation in these DL-based ER solutions is more
challenging due to the scale of the DL models and the complexity of the computations [61]. Researchers
proposed CampER [61], a privacy-preserving DL-based entity resolution method that is designed with privacy protection via perturbation. However, it fails to provide sufficient privacy with a formal proof. Moreover,
although CampER proposed to use partial homomorphic encryption to help improve the privacy guarantee, it regressed back to using simple rule-based entity-matching calculation rather than leveraging the
advantages of DL models, which presents more complex challenges for a privacy-preserving variant, such
as non-linear function approximation.
Contrary to previous work (see Table 4.1), HASP adopts an efficient two-stage deep entity resolution
framework by leveraging the contextual information of embeddings generated by pre-trained language
models [36] along with a lightweight encrypted neural classifier optimized using function approximation [28, 122], which provides a practical and efficient deep entity resolution solution with an adequate privacy guarantee.
4.3 Domain Knowledge Extraction
Contextualized embedding vectors produced by fine-tuned language models, such as S-BERT ∗
, offer a
compact, fixed-size representation that effectively captures the essential information of the original record.
When these embeddings are used as input, a multi-layer perceptron classifier is sufficient for entity resolution, making the use of more complex, heavy-weight language models unnecessary.
Based on these observations, we propose a two-stage privacy-preserving deep entity resolution protocol that follows a two-stage pipeline (shown in Figure 4.2 and Algorithm 2). The key idea is to (1) keep the
computation-intensive embedding generation offline on the local data owner side such that the embedding
model can run in plaintext, and (2) perform a differential classification on encrypted embeddings using a
simple neural network in ciphertext online on the server side.
4.4 HASP: Efficient Two-Stage Encrypted Deep Entity Resolution
[Figure 4.2 pipeline: offline plaintext embedding generation at the clients (contextualized embedding generation, optionally with dimension reduction), followed by encryption, online ciphertext differential evaluation at the server (Manhattan/Euclidean/Cosine distance calculation, then linear layers with ReLU approximation for classification), and decryption of the result by the data owners.]
Figure 4.2: HASP: Two-Stage Privacy-Preserving Deep Entity Resolution: the raw records r1 and r2 from each data owner are encoded into contextualized embeddings e1 and e2 locally using fine-tuned language models. These embeddings are then encrypted with homomorphic encryption, resulting in ciphertexts ⟦e1⟧ and ⟦e2⟧, which are transmitted to the remote server. On the server side, a distance in ciphertext ⟦ed⟧ between record ciphertexts is calculated and served as input to the subsequent encrypted neural classifier for resolving entity pairs. The final encrypted result ⟦z⟧ is sent back to the data owners for decryption.
∗ https://sbert.net
4.4.1 Methodology Overview
Applying privacy-preserving primitives, e.g., homomorphic encryption, on language-model-based deep
entity resolution is impractical due to the substantial overhead increase in computation and communication. Alternative approaches should balance both overheads and privacy guarantees in practical scenarios.
The two-stage PPER solution works as follows:
Algorithm 2 Two-Stage Encrypted Deep Entity Resolution
• WE: the embedding model;
• ⟦WC⟧: the encrypted classification model.
// Initialization
(pk, sk) ← HE.KeyGen(λ); // Crypto context is synced across parties.
// Offline Stage
for each client i do in parallel
    ei ← WE(ri);
    ⟦ei⟧ ← HE.Enc(pk, ei);
    Send ⟦ei⟧ to server;
end
// Online Stage
for server do
    ⟦ed⟧ ← HE.Eval(⊙, [⟦e0⟧, ⟦e1⟧]); // ⊙ is the chosen distance calculation function.
    ⟦b⟧ ← HE.Eval(⟦WC⟧, ⟦ed⟧);
    Send ⟦b⟧ to clients;
end
for each client i do in parallel
    b ← HE.Dec(sk, ⟦b⟧);
end
1. Offline Plaintext Embedding Generation: at each client locally (offline), a fine-tuned language
model is used to generate contextualized embeddings of fixed length from local records. These
embeddings will be encrypted using the public key, and the encrypted embeddings will be shared with
the server.
2. Online Ciphertext Differential Evaluation: at the server remotely (online), a deep neural network-based differential classifier designed using homomorphic encryption first calculates the distance of the encrypted embedding pair in ciphertext and then feeds the distance ciphertext into an encrypted multi-layer perceptron classifier. The output of the classifier in ciphertext indicates whether the two original records belong to the same entity, and can only be decrypted collaboratively by local clients using their private keys.
The two-stage design allows the computationally heavy embedding generation module to be executed
in plaintext at local clients. Only the fixed-length embedding vectors will be encrypted and evaluated
in the form of ciphertexts using a differential classifier with a relatively simple structure, which requires
dramatically smaller overheads (compared to fully encrypted deep models).
4.4.2 Offline Plaintext Embedding Generation
As language models have advanced, the contextualized embeddings they generate are able to effectively
capture the context of the original record’s content, whether it is in free-text or structured format. Besides, the fixed-length embeddings offer advantages for encryption, particularly in terms of efficient space
allocation.
Assume a record r is composed of multiple attributes or columns (ci) and their corresponding values (vi), such that r = {(ci, vi)} for i ∈ [1, n], where n represents the total number of attributes. For records in free text, n = 1. Following the encoding strategy in [61], we encode each record with the special tokens [COL] and [VAL] into the following representation:
T(r) = [COL] c1 [VAL] v1 · · · [COL] cn [VAL] vn.   (4.1)
The transformed T(r) is subsequently input into a locally deployed language model to produce a contextualized embedding vector e ∈ R^D, where D is the dimension of the embedding vector. While a pre-trained language model is capable of providing sufficient performance for the classifier, its effectiveness can be further enhanced by fine-tuning the language model using labeled pairs from the training data in plaintext. Additionally, to further reduce the embedding dimension D† and emphasize the most relevant information, dimension reduction techniques like principal component analysis (PCA) can be applied to the original embedding.
Lastly, each embedding e is encrypted using homomorphic encryption with the public key pk, resulting in ⟦e⟧. The ciphertexts of the two records of a candidate pair, (⟦e1⟧, ⟦e2⟧), are then transmitted individually to the remote server.
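A minimal sketch of this offline stage is shown below, assuming the sentence-transformers and scikit-learn packages; the record serialization follows Eq. (4.1), while the record contents and the reduced dimension are illustrative examples only.

from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

def serialize(record):
    """Eq. (4.1): flatten an attribute/value record into a [COL]/[VAL] token sequence."""
    return "".join(f"[COL]{c}[VAL]{v}" for c, v in record.items())

# Pre-trained (or fine-tuned) S-BERT model, as used in the evaluation.
encoder = SentenceTransformer("all-distilroberta-v1")

records = [
    {"Name": "John A. Doe", "Diagnosis": "Type 2 Diabetes"},
    {"Name": "Jonathan Doe", "Diagnosis": "Type 2 Diabetes Mellitus"},
]
embeddings = encoder.encode([serialize(r) for r in records])   # shape (2, 768)

# Optional dimension reduction before encryption; with this tiny toy corpus only
# 1 component is possible, whereas a real deployment fits PCA on many records.
pca = PCA(n_components=1).fit(embeddings)
reduced = pca.transform(embeddings)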
4.4.3 Online Ciphertext Differential Evaluation
The second stage takes place on a remote server, where all evaluations are performed directly on encrypted
data, without decryption, by utilizing homomorphic encryption. The final encrypted output is then transmitted back to the local clients, who can collaboratively decrypt it using their private keys to obtain the
final result. The encrypted evaluations are dissected in the following two modules.
Distance Calculation. To enable collaborative features and further reduce the input size of the classifier,
a distance metric is introduced to quantify the difference between two embeddings. We propose three
such distance metrics in Equation (4.2), which are both effective for comparing embedding vectors and
compatible with homomorphic encryption schemes that only support a limited set of operations, such as
addition and multiplication.
⟦e1⟧ ⊙ ⟦e2⟧ =
    ⟦e1⟧ − ⟦e2⟧        if Manhattan
    (⟦e1⟧ − ⟦e2⟧)²      if Euclidean
    ⟦e1⟧ ∗ ⟦e2⟧         if Cosine        (4.2)
where the output ⟦ed⟧ = ⟦e1⟧ ⊙ ⟦e2⟧ has the same dimension as ⟦e1⟧ and ⟦e2⟧, that is, ⟦ed⟧ ∈ R^D.
†Typically, for BERT-based language models, D = 768.
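As an illustration of these element-wise distances on CKKS ciphertexts, the following sketch uses the TenSEAL library (one of the HE backends mentioned in Chapter 3); the context parameters and vector values are illustrative defaults, not the configuration used in the dissertation's experiments.

import tenseal as ts

# CKKS context with illustrative parameters.
ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

e1 = ts.ckks_vector(ctx, [0.12, -0.40, 0.33])   # client 1's (reduced) embedding
e2 = ts.ckks_vector(ctx, [0.10, -0.35, 0.50])   # client 2's (reduced) embedding

manhattan = e1 - e2                  # element-wise difference
euclidean = (e1 - e2) * (e1 - e2)    # element-wise squared difference
cosine = e1 * e2                     # element-wise product

print(euclidean.decrypt())           # approximate plaintext result after decryption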
Classification. SOTA methods [79] leverage language models to resolve entities due to their general capability. These models are usually large, making them impractical for efficient homomorphic encryption evaluation (Figure 4.1). Instead of using a sledgehammer to crack a nut, we propose reverting to a simple yet effective multi-layer perceptron for binary classification, particularly when using the high-quality, high-dimensional distance vector ⟦ed⟧ as the input. The classifier contains a series of layers, each consisting of a linear transformation followed by a non-linear activation. The linear layer involves only matrix addition and multiplication, allowing for straightforward homomorphic evaluation. In contrast, the non-linear activation necessitates appropriate approximation, which we detail in §4.5. The output of the classifier is an encrypted 2-dimensional vector ⟦z⟧ ∈ R², which represents the probability that r1 and r2 refer to the same entity (yes or no).
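A plaintext NumPy simulation of such a classifier forward pass is sketched below; under HE, each operation would act on ciphertexts, but the arithmetic (matrix multiply, add, polynomial activation) is the same. The layer sizes and polynomial coefficients are illustrative assumptions.

import numpy as np

def poly_activation(x, coeffs):
    """HE-friendly activation: evaluate a fixed polynomial instead of ReLU."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def classifier_forward(e_d, weights, biases, coeffs):
    """MLP over the distance vector e_d; only add/mul and the polynomial are needed."""
    h = e_d
    for W, b in zip(weights[:-1], biases[:-1]):
        h = poly_activation(h @ W + b, coeffs)
    return h @ weights[-1] + biases[-1]          # 2-dimensional logits z

rng = np.random.default_rng(0)
D = 384                                          # reduced embedding dimension (example)
weights = [rng.normal(0, 0.05, (D, 64)), rng.normal(0, 0.05, (64, 2))]
biases = [np.zeros(64), np.zeros(2)]
relu_like = [0.01, 0.5, 1.2]                     # degree-2 polynomial standing in for ReLU
z = classifier_forward(rng.normal(0, 0.1, D), weights, biases, relu_like)
print(z)                                         # argmax(z) decides is_pair after decryption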
4.4.4 Efficient Training Workflow
The training of the encrypted neural classifier does not necessarily need to be in ciphertext. More practically, the training can be performed with public or synthetic data with a similar distribution as the data
used in inference. Since this is a supervised learning process, the records in the training data must be
annotated in pairs to indicate whether they represent the same entity.
Given that HASP implements non-linear activations with approximations, the neural classifier must account for a certain degree of accuracy loss. However, directly transferring a plaintext neural classifier to an encrypted model that uses approximation functions cannot guarantee model convergence. Specifically, the model weights and the optimal approximation polynomial degree and range differ between these two setups. Thus, to mitigate such issues, the activation layer must already use the approximation when training the neural classifier. A more detailed explanation of approximation is provided in §4.5.
Additionally, the training data is involved in fine-tuning the language model to produce embeddings.
A typical training strategy is to employ contrastive loss to maximize the distribution difference between
irrelevant entity pairs while minimizing the difference between relevant pairs.
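A possible fine-tuning sketch with sentence-transformers is shown below; it only illustrates the contrastive-loss strategy described above, and the pair texts and hyperparameters are placeholders rather than the dissertation's training setup.

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("all-distilroberta-v1")

# Labeled candidate pairs: label 1.0 = same entity, 0.0 = different entities.
train_examples = [
    InputExample(texts=["[COL]Name[VAL]John A. Doe", "[COL]Name[VAL]Jonathan Doe"], label=1.0),
    InputExample(texts=["[COL]Name[VAL]John A. Doe", "[COL]Name[VAL]Jane Smith"], label=0.0),
]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.ContrastiveLoss(model)   # pulls matching pairs together, pushes others apart

model.fit(train_objectives=[(train_loader, train_loss)], epochs=1, warmup_steps=0)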
[Figure 4.3 panels: exact ReLU vs. its degree-4 polynomial approximation over the ranges [−0.1, 0.1], [−0.2, 0.2], [−0.3, 0.3], and [−0.4, 0.4].]
Figure 4.3: ReLU Approximation with Different Ranges: polynomial degree of 4; inaccurate input ranges
result in large errors between the target function and the approximation function.
4.5 Encrypted Approximation Layers
HE does not directly support complex non-linear functions (e.g., ReLU) [28]. The common practice to implement these functions in homomorphic encryption is to use optimal polynomial approximation, such as Minimax optimization [122], to find an optimal polynomial function that approximates a given non-linear function:
min_{P ∈ Rn} max_{−a ≤ x ≤ a} |F(x) − P(x)|,   (4.3)
where F(x) is the target function with the range −a ≤ x ≤ a and P(x) is the polynomial approximation for that range.
In our work, we use the Remez algorithm [108] to iteratively solve such a polynomial optimization
problem:
1. Initialization: Start with an initial set of points K, typically the Chebyshev nodes scaled to the
desired range for a hot start;
2. Solve Linear System: A system of linear equations b0 + b1·xi + · · · + bn·xi^n + (−1)^i·E = F(xi), for i ∈ {1, . . . , n + 2}, is constructed and solved to find the coefficients of the polynomial approximation P(x) and an error term E;
3. Residual Calculation: The residual function r(x) = F(x) − P(x) is computed using the polynomial coefficients extracted from the solution to the linear system, which represents the error between
the original function and the polynomial approximation;
4. Update Local Extrema: Find the local extrema of the residual function using a root-finding method
and update the set of points K with these new extrema;
5. Iteration: Repeat Step 2-4 until a maximum iteration condition is satisfied.
Using the above implementation of the Remez algorithm, given a non-linear function (e.g., ReLU) of a
certain input range and a desired polynomial degree (corresponding to the multiplicative depth in HE), we
can find an approximation function in the form of polynomials. For example, the ReLU function F(x) =
max(0, x) can be approximated by a degree-of-4 polynomial function with an input range of [−0.2, 0.2]:
P(x) = −63.51018555883651 + 0.9044829773777646·x + 4.873617651588173·x² + 0.4971786059988112·x³ + 0.006671590830613045·x⁴.   (4.4)
4.5.1 Synthetic Ranging
The Remez algorithm requires the expected input range as one of the constraints when finding a function
approximation. An inaccurate range will substantially impact the approximation quality and subsequently
reduce the model performance, as shown in Figure 4.3.
[Figure 4.4 pipeline: Synthetic Sample Generation → Sample Activation → Sample Range Measurement.]
Figure 4.4: Estimating Approximation Function Ranges via Synthetic Ranging: synthetically generated
data samples are fed into the neural network. The activation range of the function is evaluated for the
polynomial approximation algorithm.
While it is possible to design an HE-based range calculation function that dynamically measures the activation ranges at runtime for choosing a proper approximation function, this approach has several issues, including substantial runtime overheads and intermediate decryption interaction requirements across parties. We propose a range estimation approach that leverages synthetic data generation [80, 101] to determine the function range at the initialization phase. Figure 4.4 shows the overall pipeline for Synthetic Ranging: a small set of synthetic samples is generated that follows the original data distribution, and the generated samples are used to activate the function and obtain a range used as the reference when choosing the approximation function.
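A minimal PyTorch sketch of this range estimation is given below (illustrative only): synthetic inputs are pushed through the plaintext classifier, and forward pre-hooks record the minimum and maximum values entering each activation, which then define the approximation range.

import torch

def activation_ranges(model, synthetic_inputs):
    """Record the [min, max] of the inputs to every ReLU using forward pre-hooks."""
    ranges, handles = {}, []

    def make_hook(name):
        def hook(module, inputs):
            x = inputs[0].detach()
            lo, hi = ranges.get(name, (float("inf"), float("-inf")))
            ranges[name] = (min(lo, x.min().item()), max(hi, x.max().item()))
        return hook

    for name, module in model.named_modules():
        if isinstance(module, torch.nn.ReLU):
            handles.append(module.register_forward_pre_hook(make_hook(name)))
    with torch.no_grad():
        model(synthetic_inputs)
    for h in handles:
        h.remove()
    return ranges

# Toy usage: synthetic samples drawn from an assumed data distribution.
model = torch.nn.Sequential(torch.nn.Linear(384, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
synthetic = torch.randn(256, 384) * 0.1
print(activation_ranges(model, synthetic))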
4.5.2 Polynomial Degree Optimization
Naturally, a higher polynomial degree results in a closer approximation, but it also increases the homomorphic encryption multiplicative depth, which yields a larger computation overhead, as shown in Figure 4.5.
Achieving a balance between approximation accuracy and encrypted computation overhead can be
viewed as an optimization problem with two competing objectives: minimizing approximation error and
reducing multiplicative depth, both of which depend on the variable degree of the polynomial:
min_d ( α · |F(x) − Pd(x)| + β · muldepth(d) ),   (4.5)
where d is the degree of the polynomial Pd(x), and α and β are weighting factors balancing the approximation error minimization and the homomorphic encryption multiplicative depth reduction objectives, respectively.
[Figure 4.5 plot: multiplicative depth (ReLU-only and overall) and cumulative approximation difference vs. polynomial degree from 1 to 10.]
Figure 4.5: Multiplicative Depth vs. Cumulative Approximation Errors (range=[-0.2, 0.2]): with different
polynomial degrees, there are two competing optimization objectives, namely encryption multiplicative
depth and cumulative approximation errors.
Note that for a polynomial of degree n, the multiplicative depth is approximately log₂(n) when multiplications are grouped as efficiently as possible using a balanced binary tree structure; for example, x⁴ (multiplicative depth of 3 if computed as repeated multiplication) can be grouped as (x²)² (depth of 2).
Finding an optimal polynomial degree can be achieved programmably with user-defined objectives [70].
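One way to program such a degree search is sketched below, combining the least-squares approximation error from the earlier snippet with depth ≈ ⌈log₂(d)⌉ in the weighted objective of Eq. (4.5); the weights alpha and beta are placeholders to be chosen per application, and this is not the dissertation's optimizer.

import math
import numpy as np

def relu_error(degree, lo=-0.2, hi=0.2, samples=2001):
    """Maximum error of a least-squares polynomial fit to ReLU on [lo, hi]."""
    xs = np.linspace(lo, hi, samples)
    relu = np.maximum(0.0, xs)
    coeffs = np.polynomial.polynomial.polyfit(xs, relu, degree)
    return float(np.max(np.abs(relu - np.polynomial.polynomial.polyval(xs, coeffs))))

def best_degree(max_degree=10, alpha=1.0, beta=0.01):
    """Minimize alpha * approximation error + beta * multiplicative depth over the degree."""
    scores = {}
    for d in range(1, max_degree + 1):
        depth = math.ceil(math.log2(d)) if d > 1 else 0     # balanced-tree evaluation depth
        scores[d] = alpha * relu_error(d) + beta * depth
    return min(scores, key=scores.get), scores

degree, scores = best_degree()
print(degree, scores[degree])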
4.5.3 Privacy Analysis
Threat Model. We define a semi-honest non-colluding adversary A that can corrupt the server or any
subset of local clients. A follows the protocol but tries to learn as much information as possible.
Definition 4.5.1 (UC-Security) A homomorphic-encryption entity resolution protocol π is simulation-secure in the presence of a semi-honest adversary A if, provided that the homomorphic encryption scheme is securely realized, there exists a simulator S in the ideal world that corrupts the same set of parties and produces an output identically distributed to A's output in the real world.
To prove the security of our protocol π, we show that no environment Z can distinguish between the real-world execution with A and the ideal-world execution with S interacting with F.
Ideal World. Our ideal world functionality F interacts with clients and the server as follows:
• Both honest and corrupted clients upload embedding ei to F.
• If local embeddings ⃗e from clients are enough to compute the binary classification label b, F sends
b ← WC(⊙(⃗e)) to all clients, otherwise F sends empty message ⊥.
During the entire interaction, the input from another client is not revealed to each client.
Real World. In the real world, F is replaced by our protocol π described in Algo. 2.
Simulator. We describe a simulator S that simulates the view of A in the real-world execution of our protocol. Our privacy definition in Definition .1.4 and the simulator S prove both confidentiality and correctness in different cases.
In the case of a corrupted server, S receives λ and 1^n from F and executes the following steps:
1. S runs the key generation function to sample pk: (pk, sk) ← HE.KeyGen(λ);
2. S generates two ciphertexts of embeddings ⟦e0⟧′ and ⟦e1⟧′ to send to the corrupted server, which are forwarded to A.
The execution of S implies that:
{(⟦e0⟧, ⟦e1⟧)} ≡_s {(HE.Enc(pk, e0′), HE.Enc(pk, e1′))}.   (4.6)
Similarly, in the case of a corrupted client, S executes the following steps:
1. S forwards e0 from A (the corrupted client) to F;
2. S generates a ciphertext of embedding ⟦e1⟧′ to simulate the benign client's encryption;
3. S simulates ⟦ed⟧′ ← HE.Eval(⊙, [⟦e0⟧, ⟦e1⟧′]) and ⟦b⟧′ ← HE.Eval(⟦WC⟧′, ⟦ed⟧′);
4. S obtains b ← WC(e0 ⊙ e1) from F and then sends ⟦b⟧ to A on behalf of the server.
The execution of S implies that:
{⟦b⟧} ≡_s {⟦b⟧′ ← HE.Eval(⟦WC⟧′, HE.Eval(⊙, [⟦e0⟧, ⟦e1⟧′]))}.   (4.7)
The encrypted values are indistinguishable due to the security property of homomorphic encryption [30]. Thus, we conclude that S’s output in the ideal world is computationally indistinguishable from
the view of A in a real-world execution:
{S(1^n, λ)} ≡_s {view^π(λ)},   (4.8)
where view is the view of A in the real execution of π.
4.6 Evaluation
4.6.1 Settings
Datasets. We conduct experiments on benchmarking datasets [74, 92], as detailed in Table 4.2. These
datasets span a range of domains, including citations, restaurants, products, etc. While some datasets
consist primarily of textual data, others are structured with multiple attributes.
Baselines. We compare HASP against both non-private and privacy-preserving ER methods. For non-private methods, ZeroER [131] leverages unsupervised learning techniques and external knowledge to automatically resolve entities. DeepMatcher [92] uses attribute embedding to convert record attributes into
Code Dataset Name Domain Type
AB Abt-Buy Product Textual
AG Amazon-Google Software Structured
DA DBLP-ACM Citation Structured
FZ Fodors-Zagats Restaurant Structured
WA Walmart-Amazon Electronics Structured
Table 4.2: Entity Resolution Datasets.
Method AG DA WA FZ AB
Non-private ER
ZeroER .435 .982 .672 .955 -
DeepMatcher .707 .985 .736 1.0 .628
DITTO .756 .990 .868 1.0 .893
(Limited) Privacy-preserving ER
BF (0.7) .475 .846 .310 .911 -
BF (0.8) .315 .927 .187 .881 -
CampER (w) .734 .990 .832 .977 -
CampER (t) .739 .989 .829 .977 -
HASP (Manhattan / Euclidean / Cosine)
Plaintext (768) .902 / .979 / .990 .990 / 1.0 / .990 .922 / .990 / .990 .990 / .980 / .970 .951 / .970 / .990
Plaintext (384) .948 / .979 / .970 1.0 / 1.0 / .990 .970 / .990 / .990 .970 / .980 / .903 .990 / .980 / .980
Plaintext (192) .958 / .958 / .979 1.0 / 1.0 / .990 .979 / .990 / .990 .979 / .980 / .867 .990 / .980 / .980
Ciphertext (768) .885 / .979 / .979 .907 / 1.0 / .990 .881 / .979 / .990 .935 / .980 / .980 .951 / .970 / .990
Ciphertext (384) .902 / .979 / .979 .990 / 1.0 / .990 .950 / .990 / .990 .971 / .980 / .903 .980 / .980 / .961
Ciphertext (192) .939 / .979 / .969 .980 / 1.0 / .990 .959 / .990 / .990 .980 / .980 / .891 .970 / .980 / .961
Table 4.3: Main Results of Non-private ER, Privacy-preserving ER, and HASP on Various Datasets: each
HASP result contains three entries with three distance measures. “-” indicates no experiment is conducted
in the corresponding paper. The best result for each dataset among all methods, excluding HASP’s plaintext
variants, is in bold.
vector representations, a similarity representation to capture the similarity between record pairs, and a classification layer that predicts whether the records are matches based on the similarity vectors. DITTO [79]
utilizes pre-trained language models (e.g., BERT) and augmentation techniques, improving the model’s
ability to resolve entities in complex, heterogeneous data.
In the realm of privacy-preserving methods, BF [113] uses Bloom filters to hash records and applies the
Dice coefficient to compare the hashed representations. We report its performance in two variants with
similarity thresholds of 0.7 (BF(0.7)) and 0.8 (BF(0.8)). CampER [61] utilizes collaborative match-aware
representation learning and privacy-aware similarity measurement to effectively resolve entities. Two
variants are reported: CampER(t) for supervised learning and CampER(w) for unsupervised learning.
Implementation details. We utilize S-BERT to convert original records into embedding vectors, specifically employing the pre-trained model all-distilroberta-v1. To fine-tune the embedding models with labeled
pairs, we apply contrastive loss using cosine distance. Additionally, we reduce the embedding dimensions
from 768 to 384 and 192 through principal component analysis. Note that we do not perform any particular
data cleaning or augmentation.
The neural classifier architecture comprises three linear layers: the first layer transforms the input to the hidden layer, which is a fully connected layer of size 2048x2048, and the final layer maps the output to a 2D
vector. An approximated ReLU activation function is used, with the Remez approximation degree set to 4,
and the activation range is adjusted based on the datasets. We employ cross-entropy as the classification
loss function and use the Adam optimizer with a learning rate of 2e-5, setting the batch size to 16. Model
checkpoints are saved every 10 epochs, up to a maximum of 100 epochs, and the best result is selected.
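For concreteness, a minimal sketch of fitting such a polynomial activation is shown below; we use a NumPy least-squares fit as a simple stand-in for the Remez (degree-4) approximation, and the activation range is an illustrative example.

import numpy as np

def fit_relu_poly(degree=4, lo=-0.2, hi=0.2, n_samples=2000):
    # Least-squares polynomial fit to ReLU on [lo, hi];
    # a stand-in for the minimax (Remez) fit used for the encrypted activation.
    xs = np.linspace(lo, hi, n_samples)
    return np.polyfit(xs, np.maximum(xs, 0.0), degree)

coeffs = fit_relu_poly()

def approx_relu(x):
    # Polynomial activation: only additions/multiplications, so it is
    # directly evaluable under CKKS.
    return np.polyval(coeffs, x)

print(approx_relu(np.array([-0.1, 0.0, 0.1])))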
For encryption, we utilize CKKS [30], a homomorphic encryption scheme optimized for efficient computations using approximate arithmetic on encrypted data. Our implementation is built on TenSEAL [13],
an open-source Python library for homomorphic encryption built on top of Microsoft SEAL.
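As an illustration of the encrypted stage, the minimal TenSEAL sketch below encrypts two toy embedding vectors and computes an encrypted squared Euclidean distance; the encryption parameters shown are illustrative and do not reproduce our exact experimental configuration.

import tenseal as ts

# CKKS context (parameters are illustrative, not our exact configuration).
ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

e0 = [0.1, 0.2, 0.3, 0.4]   # toy embeddings, one per data owner
e1 = [0.1, 0.1, 0.5, 0.4]

enc0 = ts.ckks_vector(ctx, e0)
enc1 = ts.ckks_vector(ctx, e1)

diff = enc0 - enc1
sq_dist = (diff * diff).sum()        # encrypted squared Euclidean distance

print(sq_dist.decrypt())             # ~[0.05], up to CKKS approximation error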
Metrics. In line with the majority of existing studies, we employ the F1 score to evaluate the performance
of entity resolution. Furthermore, we analyze the communication and computation costs incurred by
encryption.
4.6.2 Experimental Results
4.6.2.1 Model Entity Resolution Performance
The performance results are demonstrated in Table 4.3. For HASP, we report its performance in both
plaintext and ciphertext configurations. The plaintext configuration represents the ideal scenario, where
the HASP pipeline operates without any approximations, while the ciphertext configuration reflects the
real-world scenario, incorporating encryption and approximation. In both configurations, we conduct
experiments using input embeddings of 768, 384, and 192 dimensions, with Manhattan, Euclidean, and
Cosine distance metrics for each.
On the one hand, we compare HASP performance between plaintext and ciphertext settings. Vertically, the input with 768 dimensions is not always better than the ones with 384 and 192 dimensions. On the contrary, when the distance is Manhattan, the input with 192 dimensions consistently outperforms the input with 768 dimensions by 2-8%. This indicates that dimension reduction is effective in terms of concentrating on
important information in the record content. Horizontally, when employing Euclidean or Cosine distance
metrics, performance is superior to that of Manhattan distance. This may be attributed to several factors:
(1) The embedding model used is designed for assessing similarity using Euclidean and Cosine distances,
and the contrastive pair-wise fine-tuning of the embedding model also leverages Cosine as the similarity
metric. (2) Euclidean and Cosine distances can offer advantages in terms of capturing similarity more
effectively, especially in high-dimensional or normalized data contexts.
On the other hand, we compare HASP’s ciphertext performance with SOTA methods regardless of
whether they are private or non-private. Non-private ER methods, particularly DeepMatcher and DITTO,
which leverage deep neural networks or language models as their backbone, demonstrate superior performance compared to privacy-preserving SOTA methods. This disparity is largely due to the trade-off
between introducing noise or perturbations for privacy protection and maintaining data utility. However,
HASP outperforms all SOTA methods on the AB, AG, DA, DS, and WA datasets; for the CO and FZ datasets, DeepMatcher and DITTO achieve superior performance by 7% and 2%, respectively, even though they do not have any privacy protection for the record content. When comparing HASP with only privacy-preserving ER methods, it consistently outperforms them in terms of performance. Furthermore, HASP
provides robust privacy protection through homomorphic encryption, as opposed to other methods relying
solely on hashing or the addition of noise.
Input Intermediate (Max) Output
Plaintext 3.4 kB 8.4 kB 0.4 kB
HASP 1.7 MB 1.7 MB 215 kB
Fully-encrypted 1.7 MB 10.2 GB 215 kB
Table 4.4: Ciphertext Overhead: the input and output file sizes are measured during the server-client
communication. The intermediate file sizes are measured for memory usage during the computation on
the server. Due to CKKS’s ciphertext packing techniques (in our experiment ciphertext batch size is set to
be 8192), any vector whose size is smaller than the ciphertext batch size can be packed into a ciphertext of
the same file size.
In conclusion, HASP demonstrates superior performance compared to both non-private and privacy-preserving entity resolution methods while offering robust privacy protection through homomorphic encryption.
4.6.2.2 Privacy Overheads in HASP
We analyze the overhead of HASP from two aspects: ciphertext overhead and computation overhead. We
additionally introduce a fully-encrypted pipeline, which refers to the deep entity resolution model if all
the computation is carried out in an end-to-end encrypted form (contrary to HASP’s plaintext-ciphertext
two-stage solution). We report the overhead of plaintext and HASP based on actual experiments, while
the results for the fully-encrypted pipeline are derived from estimations. Note that in the fully encrypted
version, embeddings are computed on the server end in the encrypted form. Specifically, the S-BERT pretrained model all-distilroberta-v1 is based on DistilRoBERTa-base [112], which has 6 layers, 768 dimensions,
and 12 attention heads, totaling 82M parameters.
Ciphertext Overhead. As shown in Table 4.4, the use of homomorphic encryption substantially increases the file sizes of the data (including inputs/outputs and intermediate results). Although HASP yields roughly a 500x increase in file size over plaintext, it still significantly reduces the ciphertext overhead during computation due to the simple structure of HASP's encrypted neural network, compared to the fully-encrypted deep entity resolution model mentioned above.
Computation Overhead. The computation overhead is not only due to the multiplicative depth but also
the model complexity. For DistilRoBERTa-base, the multiplicative depth is 42, and the total number of parameters is 82M. Since it has a more complex architecture due to multi-head attention, layer normalization,
and other operations, we estimate an additional overhead factor of 2. As shown in Figure 4.6, the runtime
overhead of HASP is relatively close to the plaintext version, whereas the fully-encrypted pipeline incurs
significantly higher cost, approximately 130x more.
[Figure 4.6 plots inference time (s) against the number of records for the Plaintext, HASP, and Fully-encrypted (estimate) pipelines.]
Figure 4.6: Inference Computation Overhead Comparison: the computational cost of the fully encrypted
pipeline is estimated. The error bars for Plaintext and HASP are not clearly visible due to the significant
difference in scale.
4.6.2.3 Dimension Reduction
Table 4.3 empirically shows that dimension reduction somewhat enhances the overall performance by concentrating on essential information. In addition, it also brings runtime improvements due to the reduced input size. As shown in Table 4.5, a smaller dimension consistently results in faster runtime. This trend is
observed across all distance calculation metrics. Note that when the distance metric is Euclidean or Cosine,
its runtime is worse than Manhattan due to an extra increment of multiplicative depth.
Dimension Manhattan Euclidean Cosine
768 20.45 43.10 49.13
384 17.13 35.21 35.88
192 15.67 30.84 31.09
Table 4.5: Time Cost of Encrypted Inference (in seconds) with Different Input Dimensions: Manhattan
distance does not increase multiplicative depth, while Euclidean and Cosine increase it by one.
4.7 Conclusion
We propose HASP to address the efficiency challenges of performing entity resolution on encrypted data
by combining homomorphic encryption with embedding models in a two-stage process. This innovative
approach achieves a practical balance between robust privacy guarantees and added overhead, making it a viable solution for real-world applications that handle sensitive data. Additionally, by
employing Synthetic Ranging and Polynomial Degree Optimization, HASP ensures accurate non-linear
function approximation in encrypted neural networks. Overall, the framework provides stronger privacy
guarantees with minimal performance trade-offs, outperforming existing state-of-the-art methods in key
dataset tasks. Future work could explore expanding HASP’s capabilities to support more complex data
sources including multi-modal records and further improving its scalability for even larger datasets by
integrating privacy-preserving blocking methods.
Chapter 5
Efficient Privacy-Preserving Path Validation System for Multi-Authority
Sliced Networks
5.1 Background
Modern networks traverse multiple administrative domains run by mutually independent network operators. As an example, Network Slicing is a popular mechanism in 5G and future networks to provision
resources across multiple infrastructures resulting in a virtual network that meets some Quality of Service (QoS), Service Level Agreement (SLA), or security requirements. Network slicing has evolved from a
traditional network segregation technique using virtual local area networks (VLANs) to a Network Function Virtualization (NFV) enforcement mechanism. Initially, the crafting of slices was a manual process
based on leasing agreements and other contracts. The advent and adoption of NFV within network orchestration has resulted in the potential to rapidly instantiate slices from the underlying physical network
substrate given operator requirements (QoS, SLA, security). However, this also risks significant information leakage to other slice operators (SOs), infrastructure operators (IOs), and adversaries. New techniques
are required to leverage advances in NFV to quickly calculate and provision slices with assurances of the
slice correctness and security. The lack of sufficient path validation solutions poses an obvious threat to
service quality [20, 72]. To deliver the promised QoS required and enforce the correct order of network
operations, designing and implementing path validation solutions that adapt to the evolving network standards is a constant challenge for network operators.
To validate a path in a network slice, traditional solutions require the entire path to be revealed to
all parties involved, which exposes the network structure to third parties and malicious entities. One
research direction is to use Public Key Infrastructure signatures with third-party authorities (BGPSec [76])
to validate network paths, which struggles to scale with multiple independent network operators. Current
scalable path validation solutions [24, 25, 72, 94] generally inform all nodes en route with information
about the path (sometimes even the entire path), which can be harnessed by malicious nodes to reveal the
underlying network infrastructure and additional targets. Most network operators regard their network
topology as sensitive, as it may reveal the location of security devices, proxies, caches, and databases. More
importantly, the exposure of these resources enables an adversary to launch targeted attacks on specific
network bottlenecks when vulnerable network structures are pinpointed [7, 46], and exfiltrate sensitive
data from where it is being aggregated [3, 107]. For users, path information being revealed means that
adversaries can easily observe and correlate anonymous or private network traffic [35, 95].
Privacy-Preserving Path Validation Problem. Network path validation enforces and verifies cross-party paths to satisfy QoS requirements [20]. Deviating from agreed paths not only downgrades service quality but also disrupts network orchestration. The problem we formulate and address in this work is: is it possible to
implement a practical privacy-preserving path validation protocol that can be deployed by individual slice
nodes to periodically validate the control slice path?
Contributions. To summarize our contributions in this work: we present the design of a decentralized privacy-preserving path validation protocol XOR-Hash-NIZK using Non-Interactive Zero-Knowledge
proofs, to provide path privacy guarantees with reasonable performance tradeoffs. The NIZK-based pairwise validation design allows network operators to periodically monitor the network path in a decentralized manner, thus mitigating the impact on service QoS.
We have implemented and evaluated our protocol on the modDeter testbed [88], where the results
demonstrate scalability while presenting, in detail, the trade-offs between privacy and performance. As an
application use-case, we explain in §5.5 how the proposed approach can be incorporated within a modular
architecture in modern 5G networks such that it supports path creation, privacy-preserving path validation, and deterministic identification of nodes deviating from the prescribed path.
5.2 Related Work
Protocol: Transparent Validation [24, 25, 72, 94] | Semi-PP Validation [115] | XOR-Hash-NIZK
Privacy Guarantee: All nodes en route learn the info about the whole path | Intermediate nodes learn some info | Each intermediate node only learns its neighboring nodes
Integration with Network Slicing: ✗ | ✗ | ✓
Malicious Rerouting Resolution: ✗ | ✗ | ✓
Cryptographic Overhead: Message Authentication, Encryption/Decryption | Message Authentication, Anonymous Key Agreement, Encryption/Decryption | Message Authentication, NIZK
Table 5.1: Comparison of Different Path Validation Protocols: we refer to path validation protocols without privacy-preserving (PP) design as Transparent Validation and path validation solutions with some privacy-preserving design as Semi-PP Validation.
Several path validation protocols have been proposed over the years. A path validation protocol by
Kim et al. enables a node to validate all its predecessors en route before it moves on to delivering the
packets [72]. However, like other works on path validation [24, 25, 94], it fails to protect the privacy of
nodes and the path. To validate the path, their protocol reveals the information of the entire path to every
intermediate node as well as the identities of the end users. Sengupta et al. proposed a privacy-preserving path validation scheme to tackle this privacy issue [115], but it has the following issues: (a) each intermediate node still learns the number of nodes en route, which can still potentially leak more information (e.g., malicious nodes on the same path can compare the number of nodes to learn whether they are on the same path and reconstruct the path), or even the entire path information under certain attack conditions; (b) there is no malicious node report mechanism; (c) there is no system-level solution once certain malicious nodes are detected, such as re-routing and backup paths. With better privacy properties, our protocol guarantees that
intermediate nodes would learn nothing beyond their predecessor and successor. It is worth mentioning
that our backward validation design allows minimal user service quality degradation from path validation
compared to other validate-then-forward protocols. A summarized comparison of related work can be
found in Table 5.1.
5.3 Domain Knowledge Extraction
In practical networking environments, particularly those where most parties are benign, several domain-specific insights can help improve the efficiency of path validation. We describe how these insights guide
our design choices to reduce overhead by mitigating waiting delays while preserving the privacy of node
identities.
Traditional path validation protocols [24, 25, 72, 94] often require each node to gather validation tokens
from all path predecessors, effectively unveiling each node’s identity to the entire path. This all-at-once
validation also forces the network to hold or delay packet forwarding while each node waits for all previous validations to complete before proceeding with its packet forwarding task. Consequently, every node in
the path is engaged, and every predecessor in the chain must remain involved during each validation step.
Another insight comes from observing that path validation and packet forwarding are relatively independent tasks in benign conditions, which cover the vast majority of real-world cases [3, 7, 46, 107]. Traditional
validation techniques often couple the validation process too closely with packet forwarding [24, 25, 72,
94], namely the validate-then-forward design, creating a bottleneck in which the packet must wait until
the path is validated.
By leveraging these two domain insights, we design an efficient path validation protocol that incorporates two novel strategies: (i) pairwise validation strategy. Instead of requiring collective validation
by all predecessors at once, each node on the path only verifies the authenticity of its immediate neighbor. Over time, pairwise validations across each hop transitively validate the entire path; (ii) forward-first
strategy. Nodes can forward packets upon arrival in most benign scenarios, without immediate validation.
Validation logic runs in a backward direction from successors to predecessors in the background.
5.4 P3V: Efficient Privacy-Preserving Path Validation
5.4.1 Methodology Overview
We design a decentralized privacy-preserving path validation protocol XOR-Hash-NIZK using Non-Interactive Zero-Knowledge proofs, to provide path privacy guarantees with reasonable performance
tradeoffs.
With the presence of an adversary A, our privacy-preserving path validation provides properties [72]:
• Path verification: Each intermediate node has to forward the packet received from its predecessor to
its successor per the assigned path order.
• End-host anonymity: Identities of neither the sender nor the receiver will be revealed to intermediate
nodes.
• Path privacy: (1) Even if A compromises intermediate nodes, A is still uncertain about the sender-receiver pair identity; (2) if there are at least two adjacent honest nodes between two compromised
intermediate nodes, A is not certain whether there are more nodes between these two honest nodes,
which means malicious nodes will not learn anything beyond their direct neighbors.
• Malicious node detection: If any node deviates from its fair share of the path, it cannot provide the valid token; hence the malicious action is detectable.
5.4.2 Base Approach: XOR-Hash
A base approach has been introduced using XOR operations and hashing for path validation [69]. In this
approach, XOR operators mask the paths from the current intermediate nodes to the receiver node and
hashing is used to validate the XORed results of tokens en route. The protocol interacts as follows: with n hops, the sender samples n independent strings (x1, . . . , xn) and correspondingly computes yi = H(⊕_{j=i}^{n} xj) using a hash function H and the XOR operator ⊕. Each intermediate node i receives from the sender over an anonymous channel [26] a tuple (yi+1, xi), with the exception of the receiver node, which receives (yn, xn). The validation procedure starts at the receiver, which sends the XOR bit string to its predecessor; this string should match the hash digest that the predecessor obtained in the initialization stage. Moving on from here, the predecessor combines the XOR bit string with its token to generate a new bit string for validation at its own predecessor. This procedure propagates until the sender node, and the sender node will then infer a binary result.
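A minimal plaintext sketch of this base approach is shown below (Python, for illustration only; a real deployment would distribute tuples over an anonymous channel and exchange the bit strings over the network rather than in memory).

import hashlib
import secrets

def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def setup(n, token_len=32):
    # Sender samples n tokens; node i is handed (y_{i+1}, x_i), where
    # y_i = H(x_i XOR x_{i+1} XOR ... XOR x_{n-1}) (0-indexed here).
    x = [secrets.token_bytes(token_len) for _ in range(n)]
    y, acc = [], bytes(token_len)
    for xi in reversed(x):
        acc = xor(acc, xi)
        y.append(H(acc))
    y.reverse()                       # y[i] = H(XOR of x[i..n-1])
    return x, y

def validate(x, y):
    # Backward validation: the receiver releases its token, and each
    # predecessor checks the received string against the digest it holds,
    # folds in its own token, and passes the new string backward.
    acc = x[-1]
    for i in range(len(x) - 2, -1, -1):
        if H(acc) != y[i + 1]:        # node i checks its successor's string
            return False
        acc = xor(acc, x[i])
    return H(acc) == y[0]             # final check on the sender side

x, y = setup(5)
print(validate(x, y))                 # True if every hop forwarded honestly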
Limitations of XOR-Hash. Several issues exist in this approach: (a) each node needs all the nodes after it on the path to finish their validation (recall that yi is derived from the XOR of all xj from i to n) before it can execute its own validation procedure; such a serial execution fashion incurs an excessive overall waiting period from a task perspective; (b) failures at any point of the path can fail the validation of all predecessor nodes in the serial execution; (c) it is hard to pinpoint the locations of malicious nodes because the protocol only produces a binary result on whether one or more malicious nodes exist on the path, without the accurate malicious location(s).
5.4.3 Improved Design: XOR-Hash-NIZK
In our protocol, each party obtains the validation token from its successor after finishing its task and uses
NIZK to prove to its predecessor that it correctly delivered the packet with a statement that it possesses
[Figure 5.1 diagram: Node 0 through Node n. Step ⓪: each node Ni receives (xi, yi, yi+1, provK, veriK) over an anonymous channel, and the receiver receives (xn, xr, provK, veriK); ①: the successor releases xi+1; ②: Ni checks yi = H(xi ⊕ xi+1) and generates proof pfi; ③: the successor's proof pfi+1 is sent backward; ④: Ni verifies pfi+1. Packet forwarding and path validation proceed in opposite directions.]
Figure 5.1: XOR-Hash-NIZK Protocol Workflow: blue depicts the packet forwarding process and red indicates the backward path validation process. Note that the "forward" payload forwarding and the "backward" path validation are independent of each other from a user experience perspective.
the token. Note that this NIZK proof can be verified by any related party in the system. Specifically, the
authorities can also verify the proof for malicious detection.
5.4.3.1 Protocol Design
To solve the issues in the initial XOR-Hash approach, we have constructed a protocol that builds on the XOR combiner and the hash function in a similar form, yi = H(⊕_{j=i}^{m} xj). However, instead of a pure hash function, the validation uses NIZK with SHA256 for extra enforcement. After completing the content delivery task and receiving its successor's secret, each node Ni generates a NIZK proof as the prover, proving that yi = H(⊕_{j=i}^{m} xj) without revealing ⊕_{j=i}^{m} xj to its predecessor. The predecessor receives the proof and verifies it. This validation process starts from the receiver and propagates back to the sender. Building on the NIZK-enhanced construction, we propose a pairwise validation that validates nodes only using neighboring nodes' secret tokens, i.e., each node proves to its predecessor that it has delivered the content using its secret and its successor's secret. The intuition behind this design is that if the delivery between each pair is proven to be valid, the entire path should also be valid.
Figure 5.1 shows how the protocol works using NIZK (with a more detailed procedure described in
Appendix §.1.26). The sender node N0 first sends to each intermediate node Ni along the agreed path
its secret xi and the hash value yi+1 for validating its successor. Each hash value yi = H(xi ⊕ xi+1),
except the receiver’s hash value yn = H(xn). If a packet is delivered to the next node and the content
of the packet is unmodified, the next node reveals its secret to its predecessor. After the delivery task is
fulfilled, the intermediate Ni receives the secret xi+1 from its successor and generates its proof to prove
that it obtains xi+1. Ni sends the proof to its predecessor Ni−1. Ni−1 then verifies the proof. The entire
path is validated only when all nodes en route complete their pairwise validation.
By avoiding the sequential execution, NIZK overheads will not grow linearly as the size of the path
increases while the zero-knowledge property means performing local validation without revealing extra
information about the path. Note that the validation process runs across nodes in a near-parallel fashion
from the perspective of the entire system.
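A simplified plaintext sketch of the pairwise relation each node proves is shown below; here the zero-knowledge proof is mocked by directly revealing and hashing the witness, purely to illustrate the token algebra, whereas the actual protocol replaces this with a SHA256-based NIZK proof (Circom/libsnark, see §5.6) so the witness is never revealed.

import hashlib
import secrets

def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def init_tokens(n, token_len=32):
    # Sender assigns node i the secret x_i and the digest y_i = H(x_i XOR x_{i+1});
    # the last node's digest is H of its own token alone (0-indexed here).
    x = [secrets.token_bytes(token_len) for _ in range(n)]
    y = [H(xor(x[i], x[i + 1])) for i in range(n - 1)] + [H(x[-1])]
    return x, y

def prove_pair(x_i, x_next, y_i):
    # Stand-in for the NIZK prover: node i shows it knows a witness
    # (x_i XOR x_{i+1}) hashing to y_i. A real deployment emits a zkSNARK
    # proof here instead of revealing the witness.
    witness = xor(x_i, x_next)
    assert H(witness) == y_i
    return witness

def verify_pair(proof, y_i):
    # Stand-in for the NIZK verifier run by the predecessor (or an authority).
    return H(proof) == y_i

x, y = init_tokens(5)
# After delivering the packet, node i obtains x_{i+1} from its successor
# and proves the pairwise relation to node i-1:
proofs = [prove_pair(x[i], x[i + 1], y[i]) for i in range(len(x) - 1)]
print(all(verify_pair(p, y[i]) for i, p in enumerate(proofs)))   # True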
Detailed pseudocode for NIZK operators is described in Algorithm 3.
5.4.4 Malicious Node Detection
In addition to making a privacy-preserving pairwise validation solution possible, NIZK proofs can be
proved by anyone in the system, which makes local path validation and distributed malicious node detection viable. The key idea behind our malicious node detection function is that authorities can also learn
the hash digests of nodes managed and verify potential malicious nodes using NIZK proofs.
5.4.5 Path Validation Property Analysis
Definition 5.4.1 (Privacy-Preserving Validation) A path validation protocol π is securely realized if
there exists a simulator S in the ideal world such that, in the presence of A, for all inputs, probability distributions of the ideal world and the real world are computationally indistinguishable.
Algorithm 3 XOR-Hash-NIZK Protocol Components
Function Init_N0({N0, Ni, . . . , Nn}):
    (s0, s1) ← random;
    (pubK, privK) := KeyGen(s0);
    Send(OA0, pubK);
    for i from n to 1 do
        if i == n then
            (xn, xn+1) ← s1;
        else
            xi ← s1;
        end
        yi := xi ⊕ xi+1;
        mi := Sign(privK, (xi, yi, yi+1, provK, veriK));
        Send(Ni, mi);
    end
Function Prove_Ni(xi+1, provK, mi, pubK):
    if !(yi == xi ⊕ xi+1 and VerifySign(mi, pubK)) then
        return error;
    end
    pfi := PfGen(provK, yi, xi ⊕ xi+1);
    Send(Ni−1, pfi);
Function Validate_Ni(pfi+1, verK):
    if !PfVerify(verK, pfi+1, yi+1) then
        Send(OAi, rptMsg);
    end
Definition 5.4.2 (UC-Security) Let H be a random-oracle hash function and (P, V) be a NIZK proof system; then the XOR-Hash-NIZK protocol π securely realizes the ideal functionality F in the presence of an adversary A.
We use a simulator S to simulate the real-world executions and interact with the ideal world functionality F. Communications among parties are through secure channels.
The original sender is assumed to be honest. The operations of our protocol are re-described here
from the perspectives of F, S, and A. Each honest node ui receives a message parsed as (⟨ui−1, ui⟩, ⟨ui, ui+1⟩, xi, yi, yi+1), with the special case of the honest receiver receiving (⟨un−1, un⟩, xn, xr, yn). For each ui, S checks if V(yi = H(xi ⊕ xi+1)) = 1 (for the receiver, it is V(yn = H(xn ⊕ xr)) = 1) and sends F a correct message, and aborts otherwise. For each corrupted intermediate node, S is also notified with (⟨ui−1, ui⟩, ⟨ui, ui+1⟩, xi, yi, yi+1) from F. S samples x ∈ {0, 1}^λ and x′ ∈ {0, 1}^λ. On input (x, x′, H(x ⊕ x′)), S runs the simulation of NIZK to obtain the proof P(y = H(x ⊕ x′)) = 1. A sends ⊤ to F if A sends x′′ such that x′′ = x ⊕ x′ to S; otherwise the simulation is aborted.
The view of the environment in such a polynomial-efficient simulation is indistinguishable from the
view of the execution of the real-world protocol. Our NIZK-based protocol breaks the multi-hop validation
into a single-hop interaction mode. F always receives ⊤ returned from each honest node that completes
the delivery with correct validation, which means the validation is not interrupted at honest nodes. S
aborts if A fails to output xi ⊕ xi+1 for the corrupted node ui. This enforces the verification of the path.
Additionally, S learns nothing more from F than relative identities of the corrupted intermediate node’s
predecessor and successor, which is the same in the real-world execution. In other words, A is uncertain
about the corrupted node’s neighboring node’s absolute identities on the path, i.e., identities of potential
end-hosts. The distributions of real-world protocol executions and simulated results are probabilistically
indistinguishable.
[Figure 5.2 flowchart: service requests and network infrastructure specs feed VNE-CBS path slicing; the sliced path is validated with XOR-Hash-NIZK while the service runs; if no malicious behavior is detected the network service is fulfilled, otherwise the system reroutes with updated paths (LR updates) until the attempt threshold is exceeded and the path is destroyed.]
Figure 5.2: The 3-Stage Modular System: Path Slicing to generate a viable path, Path Validation to validate the path while forwarding packets, and Path Rerouting to reroute if any malicious behavior is detected.
5.5 Application Use Case: Integration With 5G Network Slicing
We demonstrate here how the proposed approach can be integrated within a modular architecture in
modern networks that support path slicing, privacy-preserving path validation, and rerouting due to nodes
deviating from the prescribed network slice path. However, to the best of our knowledge, there is no prior
work that manages the entire network lifecycle involving privacy-preserving path validation. Previous
work focused heavily on designing solutions for each individual component and each specific sub-task [24,
55, 72, 94, 144]. Assembling these sub-solutions fails to realize a functioning network path resolution
system in practice. For example, existing path migration techniques [55] do not update path computation
based on the output of their systems. These approaches rely on other systems to generate paths, which
also means that there are inefficiencies (such as caching) in integration between creation, validation, and
migration.
We provide a holistic system architecture for testing integration. As shown in Figure 5.2, our system
follows three key stages of the network service: path slicing, path validation, and path rerouting. We use
the testbed as the substrate network.
5.5.0.1 Path Slicing
The first module of our system is the Virtual Network Embedding (VNE) path slicing/finding. The VNE
problem is the combinatorial problem of embedding Virtual Network Requests (VNRs) into a Substrate
Network (SN) while satisfying constraints, such as bandwidth or latency on SN edges, CPU or Memory
capacities on SN vertices, and geographical constraints on the VNR vertices. The embedding maps each
VNR vertex to a unique SN vertex and each VNR edge to a path in the SN between the corresponding
vertices. A solution is then the set of resources fitting the constraints, and ultimately a path or paths made up of vertices and edges through the SN.
Given a set of network nodes with various available resources of CPU and bandwidth at different
locations, a VNE algorithm (e.g. [144]) will be executed to generate a suitable path that satisfies QoS/SLA
for a specific task. From an input/output perspective, the inputs are the VNR with the specific sender-receiver relationship and the substrate network specifications; the output is the ordered list of selected nodes.
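As a toy illustration of this input/output interface (not the VNE-CBS algorithm itself), the following sketch prunes the substrate to CPU- and bandwidth-feasible resources and returns an ordered node path with networkx; all attribute names and values are made up for the example.

import networkx as nx

def slice_path(substrate: nx.Graph, sender, receiver, min_bw, min_cpu):
    # Keep substrate nodes with enough CPU and links with enough bandwidth,
    # then return the ordered node list of a shortest feasible path.
    ok_nodes = [n for n, d in substrate.nodes(data=True)
                if d["cpu"] >= min_cpu or n in (sender, receiver)]
    pruned = substrate.subgraph(ok_nodes)
    feasible = nx.Graph((u, v, d) for u, v, d in pruned.edges(data=True)
                        if d["bw"] >= min_bw)
    if sender not in feasible or receiver not in feasible:
        return None
    try:
        return nx.shortest_path(feasible, sender, receiver)
    except nx.NetworkXNoPath:
        return None

# Toy substrate: nodes carry CPU capacity, links carry bandwidth (Mbps).
g = nx.Graph()
g.add_nodes_from([(0, {"cpu": 8}), (1, {"cpu": 4}), (2, {"cpu": 2}), (3, {"cpu": 8})])
g.add_edges_from([(0, 1, {"bw": 100}), (1, 3, {"bw": 50}),
                  (0, 2, {"bw": 100}), (2, 3, {"bw": 100})])
print(slice_path(g, sender=0, receiver=3, min_bw=80, min_cpu=2))   # [0, 2, 3]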
5.5.0.2 Path Validation
After path slicing, authorities agree on a certain path for the packet delivery task and then inform each node
en route of its predecessor and successor. Path nodes execute the XOR-Hash-NIZK protocol to validate the
path while forwarding packets. If any malicious behaviors are detected by the protocol, the system will
proceed to the next stage of path rerouting.
5.5.0.3 Path Rerouting
Once a set of malicious nodes is detected, the slicing authorities work in real time to generate a new path that excludes the malicious nodes and potentially related nodes. When regenerating On-the-fly (OTF) paths, the path slicing works on an updated substrate network with the detected malicious nodes removed. This means that the major part of the network topology remains unchanged. Using this observation, we can leverage the idea from D∗ Lite [73] to implement an efficient path regenerating algorithm for malicious node resolution. The main idea of D∗ Lite is that changes in the graph topology (e.g., a new blockage that was unknown before) only change a small number of cells' estimated goal distances while most of the cell status stays the same, which means recalculating a path only involves cells with changed status.
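A minimal sketch of the rerouting step is shown below; for clarity it recomputes the path from scratch on the pruned substrate with networkx rather than performing incremental D∗ Lite replanning, and the attribute names are illustrative.

import networkx as nx

def reroute(substrate: nx.Graph, old_path, malicious, min_bw):
    # Remove detected malicious nodes from the substrate and recompute a
    # bandwidth-feasible path between the same endpoints; most of the topology
    # is untouched, which is what makes incremental replanning attractive.
    src, dst = old_path[0], old_path[-1]
    pruned = substrate.copy()
    pruned.remove_nodes_from(malicious)
    feasible = nx.Graph((u, v, d) for u, v, d in pruned.edges(data=True)
                        if d["bw"] >= min_bw)
    if src not in feasible or dst not in feasible:
        return None   # the caller then checks the retry threshold and may destroy the path
    try:
        return nx.shortest_path(feasible, src, dst)
    except nx.NetworkXNoPath:
        return None

g = nx.Graph()
g.add_edges_from([(0, 1, {"bw": 100}), (1, 3, {"bw": 100}),
                  (0, 2, {"bw": 100}), (2, 3, {"bw": 100})])
print(reroute(g, old_path=[0, 1, 3], malicious=[1], min_bw=80))   # [0, 2, 3]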
5.6 Path Validation Protocol Evaluation
In this section, we present our experimental setup and a detailed evaluation of XOR-Hash-NIZK regarding
the efficiency of privacy-preserving validation.
5.6.1 Experimental Setup
Substrate network. Our system runs on a testbed [88] with 200 computing nodes allocated (Debian GNU/Linux 11 | single-core AMD EPYC 7702 processor | 2 GB RAM), which are dynamically configured to be the substrate network per request and orchestrated using Ansible. To generate substrate network topologies, we utilize Waxman graphs [144]. The Waxman graph is a popular model for random but realistic geometric graphs. In our experiments, we generate testing communication networks and network service requests with Waxman parameters α = 0.5 and β = 0.2.
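For reference, the topology generation can be sketched with networkx as below; note that networkx's waxman_graph uses the convention P(u, v) = β·exp(−d/(α·L)), so mapping our α = 0.5 and β = 0.2 onto its keyword arguments (as done below) is an assumption about the parameter convention, and the attached resource attributes are illustrative.

import networkx as nx

# Generate a Waxman random topology for the substrate network.
# networkx uses P(u, v) = beta * exp(-d(u, v) / (alpha * L)); our alpha = 0.5
# (prefactor) and beta = 0.2 (distance scale) are assumed to map to the
# swapped keyword arguments below.
substrate = nx.waxman_graph(n=200, beta=0.5, alpha=0.2, seed=42)

# Attach illustrative resource attributes (bandwidth on links, CPU on nodes).
for u, v in substrate.edges():
    substrate[u][v]["bw"] = 100          # Mbps, matching the testbed bandwidth
for node in substrate.nodes():
    substrate.nodes[node]["cpu"] = 1     # single-core nodes, as on the testbed

print(substrate.number_of_nodes(), substrate.number_of_edges())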
An illustrative example of the substrate network on the testbed is shown in Figure 5.3.
NIZK implementation. We use the Circom library∗
for its properties such as constant proof sizes and
constant verification time. In the initialization stage upon the sender’s service request, the sender’s authority (acting as Generator) is responsible for generating a key pair of proving key and verifying key. The
key pair will then be distributed to all nodes via authority-to-authority communication. Keys and proofs
are serialized/deserialized using JSON. We also implemented a libsnark† version (by adding finalization
∗ https://github.com/iden3/circomlib
† https://github.com/scipr-lab/libsnark
[Figure 5.3 panels: (a) Path Slicing; (b) Path Forwarding + Validation.]
Figure 5.3: Testbed Example: with a substrate network of 12 nodes and 15 edges, an optimal path with 5
hops is found and to be validated.
steps including padding to the compression hashing gadget) of our protocol with OpenSSL for comparison
(in Figure 5.4) with Circom.
5.6.2 Evaluation Results
Operation (ms) | Sender | Intermediate | Receiver
Token Generation + Hashing | n: 8.32 | 0 | 0
Signature Generation (RSA 1024) | n: 48.17 | 0 | 0
Signature Verification (RSA 1024) | 0 | 1: 0.02 | 1: 0.02
ZK Proof Generation | 0 | 1: 1231.13 | 1: 1231.13
ZK Proof Verification | 1: 335.12 | 1: 335.12 | 0
Table 5.2: Step Cost of XOR-Hash-NIZK: with n nodes, in (i : j), i means iterations required across parties
and j means total runtime.
XOR-Hash-NIZK testbed evaluation. We evaluated the performance of our path validation protocol
on the testbed. We first break down the overhead of the protocol by looking at different parties in the
[Figure 5.4 plots total time (s) against the number of nodes (25-200) for XOR-Hash, XOR-Hash-NIZK (libsnark), and XOR-Hash-NIZK (Circom); bandwidth 100 Mbps, node memory 2 GB.]
Figure 5.4: Protocol Comparison On Testbed: the XOR-Hash protocol and our protocol XOR-Hash-NIZK
with two different implementations.
[Figure 5.5 shows per-step time (s) for a 100-node path: Validation Init 6.501, Packet Forward 409.6, Prove+Verify 54.578, Proof Transfer 0.056, separated into forward and backward steps.]
Figure 5.5: End-User Perspective Testbed Evaluation: with an example of 100 nodes (100 Mbps) where forward steps (blue) directly impact service quality (e.g., the content delivery time cost per communication constraints) and backward steps (red) are related to the path validation running in the background (which, in normal cases, will not be directly noticed by users).
system, i.e., the sender, the intermediate nodes, and the receiver shown in Table 5.2. The sender needs
to generate tokens with hash digests and also sign the messages. These operations have to be repeated
for all nodes but are relatively low-cost. The intermediate nodes (and the receiver) are mainly responsible
for verifying the message signature received. Additionally, for the backward path validation, they need to
generate their own NIZK proof and also verify their successor's proof (except at the receiver). Although each node is required to perform at most one iteration of the NIZK operation, NIZK proofs are in general computationally demanding cryptographic primitives where the majority of the cost resides in the proof generation phase. The effect of this additional overhead, compared to path validation protocols that
are not based on NIZK (the XOR-Hash protocol), can be seen in Figure 5.4. As the result analysis shows,
costs involve stages of initialization, validation token distribution, secret release, and validation execution
for each protocol.
Takeaways Intuitively, this delay of around 4 seconds (from NIZK proof generation) will inevitably compromise the service quality for end users. However, this apparent network performance compromise can
be justified. Recall that compared to the traditional validate-then-forward fashion [24, 25, 72, 94, 115], our
path validation is carried out in a backward fashion and is also executed in a pair-wise validation mode.
With these two characteristics, as we focus on the service quality in terms of user experience, our path
validation process is separated from the packet forwarding process as a background service compared to
validate-then-forward approaches and will not be noticed from a user standpoint in normal cases with no
malicious behaviors. In the presence of malicious parties, the forwarding service is mainly disrupted by malicious behaviors. For example, in the case of detouring, the user experience might not encounter any noticeable delay caused by the validation module other than the detouring forced by the adversary.
XOR-Hash-NIZK backward validation advantage. In our testbed system implementation, we also
focus on evaluating the trade-offs from the end-user perspective. In Figure 5.5, we use an example of a
path with 100 nodes en route on the testbed and the sender has requested a forward task of a 5 GB file
(the typical file size of an HD movie). The tested average bandwidth from Node 1 to Node 100 is around
100 Mbps. As shown in the figure, (a) most of the service wait time comes from the actual network task
(blue) of packet forwarding (b) while the backward path validation task (red) should run in the background
without interrupting regular service for the majority of the cases where there are no malicious behaviors.
Takeaways According to the mechanism of our validation process, the backward pairwise validation starts
right after the packet reaches the first hop, which means that a large portion of the nodes have finished the
validation at the time when the forward process completes. As a result, from the end-user experience, the
potential performance compromise from additional privacy-preserving secure path validation is inevitable,
but to some extent, can be justifiable regarding service quality guarantee. It is worth noting that, due to
the backward validation property of our protocol where the path validation task runs separately in the
background, even with increased bandwidth in practical 5G network cases (e.g., from our testbed's 100 Mbps to 5 Gbps), the path validation overheads do not directly affect the user experience, especially with networks
that adopt some periodic validation checking setups [37].
5.7 Conclusion
In this work, we propose a decentralized privacy-preserving path validation protocol that guarantees the
privacy of paths and nodes while further enhancing network security during packet delivery tasks against
information leakage about multi-hop paths and potentially the underlying network infrastructures. We
outlined how we can integrate our path validation protocol with an efficient path slicing algorithm and
a malice-resilient path rerouting mechanism, which is built and evaluated in a testbed environment. In
future implementations, it is essential to balance privacy with service quality for scalability; to address the
potential threat in collusion adversary models where nodes are controlled by a single authority via multiparty computation; and to integrate path validation directly into the transport layer protocol to support
data plane application traffic.
Chapter 6
Concluding Remarks
In this chapter, we summarize the key contributions of this dissertation and also provide insights on future
directions of domain-knowledge-guided optimizations for efficient privacy-preserving computation.
6.1 Dissertation Summary
This dissertation addresses the core challenge of making privacy-preserving computation more efficient
and scalable. While techniques such as homomorphic encryption offer strong data protection, they introduce significant computational and communication overheads that can limit their practicality in real-world
applications. This dissertation explores the possibility of integrating domain knowledge to guide the design of more efficient privacy-preserving computation solutions. The meta-methods proposed in this
dissertation include selective protection of private data sharing, pre-transformation of private data sharing,
and minimization of involved parties for private data sharing, which are further backed and illustrated by
specific domain problem studies spanning across application areas such as medical data entity resolution,
distributed machine learning training, and multi-authority network path validation.
6.2 Future Directions
Although this dissertation provides a blueprint for leveraging domain knowledge to advance efficiency optimization in privacy-preserving computation, there are still future research opportunities to further explore the fundamentals of privacy in modern-day computation regarding concrete domains:
• Formal privacy quantification of efficient privacy-preserving computation solutions. In FedML-HE,
we provide a theoretical approach to quantify the privacy provided by Selective Parameter Encryption. Though limited by the currently available methods, it would be interesting to explore precise
privacy analysis schemes to quantify the privacy properties of pre-transformed private data sharing and to bound the private information flow under minimal certification across parties. Or is there a
general scheme that can capture the privacy behaviors when the protocol is optimized (or in some way
reduced) via domain knowledge?
• Explore the connection between other optimization research aspects of a domain and privacy. For example, explainable learning techniques aim to improve interpretability by identifying critical parameters
or model components that significantly impact predictions [111, 140]. In our work, FedML-HE targets
privacy sensitivity, identifying parameters based on their vulnerability to privacy risks. While there will likely be overlap between the parameters selected by both approaches, since privacy sensitivity also involves analyzing how sensitively model parts react to data (similar to sensitivity analysis in explainable learning), our focus is on preserving privacy across distributed clients. We believe it would be beneficial future work to precisely bridge the domains of privacy-preserving ML, efficient ML [87], and explainable ML.
• Computation quality analysis along with privacy and efficiency. Although evaluated in all the works presented in this dissertation during the battle between privacy and efficiency, computation quality (e.g., model accuracy) can be largely impacted by the pursuit of either privacy [1] or efficiency [87]. The complexity of the problem grows substantially when privacy, efficiency, and computation quality all need to be optimized. A general scheme that can integrate and theorize these three pillars together can facilitate the design of practical privacy-preserving computation solutions.
Bibliography
[1] Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and
Li Zhang. “Deep learning with differential privacy”. In: Proceedings of the 2016 ACM SIGSAC
conference on computer and communications security. 2016, pp. 308–318.
[2] Mahyar Abbasian, Iman Azimi, Amir M Rahmani, and Ramesh Jain. “Conversational health
agents: A personalized llm-powered agent framework”. In: arXiv preprint arXiv:2310.02374 (2023).
[3] Abdallah Mustafa Abdelrahman, Joel JPC Rodrigues, Mukhtar ME Mahmoud, Kashif Saleem,
Ashok Kumar Das, Valery Korotaev, and Sergei A Kozlov. “Software-defined networking security
for private data center networks and clouds: vulnerabilities, attacks, countermeasures, and
solutions”. In: International Journal of Communication Systems 34.4 (2021), e4706.
[4] Abbas Acar, Hidayet Aksu, A Selcuk Uluagac, and Mauro Conti. “A survey on homomorphic
encryption schemes: Theory and implementation”. In: ACM Computing Surveys (Csur) 51.4 (2018),
pp. 1–35.
[5] Alessandro Acquisti, Laura Brandimarte, and George Loewenstein. “Privacy and human behavior
in the age of information”. In: Science 347.6221 (2015), pp. 509–514.
[6] Ehud Aharoni, Allon Adir, Moran Baruch, Nir Drucker, Gilad Ezov, Ariel Farkash, Lev Greenberg,
Ramy Masalha, Guy Moshkowich, Dov Murik, et al. HeLayers: A tile tensors framework for large
neural networks on encrypted data. 2011.
[7] Réka Albert, Hawoong Jeong, and Albert-László Barabási. “Error and attack tolerance of complex
networks”. In: nature 406.6794 (2000), pp. 378–382.
[8] Asma Aloufi, Peizhao Hu, Yongsoo Song, and Kristin Lauter. “Computing blindfolded on data
homomorphically encrypted under multiple keys: A survey”. In: ACM Computing Surveys (CSUR)
54.9 (2021), pp. 1–37.
[9] Syada Tasmia Alvi, Mohammed Nasir Uddin, Linta Islam, and Sajib Ahamed. “DVTChain: A
blockchain-based decentralized mechanism to ensure the security of digital voting system voting
system”. In: Journal of King Saud University-Computer and Information Sciences 34.9 (2022),
pp. 6855–6871.
[10] Gilad Asharov, Abhishek Jain, Adriana López-Alt, Eran Tromer, Vinod Vaikuntanathan, and
Daniel Wichs. “Multiparty computation with low communication, computation and interaction
via threshold FHE”. In: Advances in Cryptology–EUROCRYPT 2012: 31st Annual International
Conference on the Theory and Applications of Cryptographic Techniques, Cambridge, UK, April
15-19, 2012. Proceedings 31. Springer. 2012, pp. 483–501.
[11] Giuseppe Ateniese, Kevin Fu, Matthew Green, and Susan Hohenberger. “Improved proxy
re-encryption schemes with applications to secure distributed storage”. In: ACM Transactions on
Information and System Security (TISSEC) 9.1 (2006), pp. 1–30.
[12] Marta Bellés-Muñoz, Miguel Isabel, Jose Luis Muñoz-Tapia, Albert Rubio, and Jordi Baylina.
“Circom: A Circuit Description Language for Building Zero-knowledge Applications”. In: IEEE
Transactions on Dependable and Secure Computing (2022).
[13] Ayoub Benaissa, Bilal Retiat, Bogdan Cebere, and Alaa Eddine Belfedhal. TenSEAL: A Library for
Encrypted Tensor Operations Using Homomorphic Encryption. 2021. arXiv: 2104.03152 [cs.CR].
[14] Abhishek Bhowmick, John Duchi, Julien Freudiger, Gaurav Kapoor, and Ryan Rogers. “Protection
against reconstruction and its applications in private federated learning”. In: arXiv preprint
arXiv:1812.00984 (2018).
[15] Keith Bonawitz. “Towards federated learning at scale: System design”. In: arXiv preprint
arXiv:1902.01046 (2019).
[16] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H Brendan McMahan,
Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. “Practical secure aggregation for
privacy-preserving machine learning”. In: proceedings of the 2017 ACM SIGSAC Conference on
Computer and Communications Security. 2017, pp. 1175–1191.
[17] Dan Boneh, Xavier Boyen, and Shai Halevi. “Chosen ciphertext secure public key threshold
encryption without random oracles”. In: Cryptographers’ Track at the RSA Conference. Springer.
2006, pp. 226–243.
[18] Zvika Brakerski, Craig Gentry, and Vinod Vaikuntanathan. “(Leveled) fully homomorphic
encryption without bootstrapping”. In: ACM Transactions on Computation Theory (TOCT) 6.3
(2014), pp. 1–36.
[19] Ursin Brunner and Kurt Stockinger. “Entity matching with transformer architectures-a step
forward in data integration”. In: 23rd International Conference on Extending Database Technology,
Copenhagen, 30 March-2 April 2020. OpenProceedings. 2020, pp. 463–473.
[20] Kai Bu, Avery Laird, Yutian Yang, Linfeng Cheng, Jiaqing Luo, Yingjiu Li, and Kui Ren. “Unveiling
the mystery of Internet packet forwarding: A survey of network path validation”. In: ACM
Computing Surveys (CSUR) 53.5 (2020), pp. 1–34.
[21] Mark Bun and Thomas Steinke. “Concentrated differential privacy: Simplifications, extensions,
and lower bounds”. In: Theory of cryptography conference. Springer. 2016, pp. 635–658.
[22] Douglas Burdick, Lucian Popa, and Rajasekar Krishnamurthy. “Towards high-precision and
reusable entity resolution algorithms over sparse financial datasets”. In: Proceedings of the Second
International Workshop on Data Science for Macro-Modeling. 2016, pp. 1–4.
[23] David Byrd and Antigoni Polychroniadou. “Differentially private secure multi-party computation
for federated learning in financial applications”. In: Proceedings of the First ACM International
Conference on AI in Finance. 2020, pp. 1–9.
[24] Hao Cai and Tilman Wolf. “Source authentication and path validation with orthogonal network
capabilities”. In: 2015 IEEE Conference on Computer Communications Workshops (INFOCOM
WKSHPS). IEEE. 2015, pp. 111–112.
[25] Kenneth L Calvert, James Griffioen, and Leonid Poutievski. “Separating routing and forwarding:
A clean-slate network layer design”. In: 2007 Fourth International Conference on Broadband
Communications, Networks and Systems (BROADNETS’07). IEEE. 2007, pp. 261–270.
[26] Jan Camenisch and Anna Lysyanskaya. “A formal treatment of onion routing”. In: Annual
International Cryptology Conference. Springer. 2005, pp. 169–187.
[27] Hao Chen, Wei Dai, Miran Kim, and Yongsoo Song. “Efficient multi-key homomorphic encryption
with packed ciphertexts with application to oblivious neural network inference”. In: Proceedings
of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 2019, pp. 395–412.
[28] Hao Chen, Ran Gilad-Bachrach, Kyoohyung Han, Zhicong Huang, Amir Jalali, Kim Laine, and
Kristin Lauter. “Logistic regression over encrypted data from fully homomorphic encryption”. In:
BMC medical genomics 11 (2018), pp. 3–12.
[29] Hao Chen, Kim Laine, and Peter Rindal. “Fast private set intersection from homomorphic
encryption”. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and
Communications Security. 2017, pp. 1243–1255.
[30] Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. “Homomorphic encryption for
arithmetic of approximate numbers”. In: Advances in Cryptology–ASIACRYPT 2017: 23rd
International Conference on the Theory and Applications of Cryptology and Information Security,
Hong Kong, China, December 3-7, 2017, Proceedings, Part I 23. Springer. 2017, pp. 409–437.
[31] Vassilis Christophides, Vasilis Efthymiou, Themis Palpanas, George Papadakis, and
Kostas Stefanidis. “An overview of end-to-end entity resolution for big data”. In: ACM Computing
Surveys (CSUR) 53.6 (2020), pp. 1–42.
[32] arkworks contributors. arkworks zkSNARK ecosystem. 2022. url: https://arkworks.rs.
[33] John Criswell, Nathan Dautenhahn, and Vikram Adve. “KCoFI: Complete control-flow integrity
for commodity operating system kernels”. In: 2014 IEEE symposium on security and privacy. IEEE.
2014, pp. 292–307.
[34] Jieren Deng, Yijue Wang, Ji Li, Chenghong Wang, Chao Shang, Hang Liu,
Sanguthevar Rajasekaran, and Caiwen Ding. “TAG: Gradient Attack on Transformer-based
Language Models”. In: Findings of the Association for Computational Linguistics: EMNLP 2021.
Ed. by Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih. Punta Cana,
Dominican Republic: Association for Computational Linguistics, Nov. 2021, pp. 3600–3610. doi:
10.18653/v1/2021.findings-emnlp.305.
[35] Liangdong Deng, Yuzhou Feng, Dong Chen, and Naphtali Rishe. “Iotspot: Identifying the iot
devices using their anonymous network traffic data”. In: MILCOM 2019-2019 IEEE Military
Communications Conference (MILCOM). IEEE. 2019, pp. 1–6.
[36] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. “Bert: Pre-training of deep
bidirectional transformers for language understanding”. In: arXiv preprint arXiv:1810.04805 (2018).
[37] Mohan Dhawan, Rishabh Poddar, Kshiteej Mahajan, and Vijay Mann. “Sphinx: detecting security
attacks in software-defined networks.” In: Ndss. Vol. 15. 2015, pp. 8–11.
[38] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai,
Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al.
“An image is worth 16x16 words: Transformers for image recognition at scale”. In: arXiv preprint
arXiv:2010.11929 (2020).
[39] Weidong Du, Min Li, Liqiang Wu, Yiliang Han, Tanping Zhou, and Xiaoyuan Yang. “A efficient
and robust privacy-preserving framework for cross-device federated learning”. In: Complex &
Intelligent Systems (2023), pp. 1–15.
[40] Yitao Duan, John Canny, and Justin Zhan. “P4P: Practical Large-Scale Privacy-Preserving
distributed computation robust against malicious users”. In: 19th USENIX Security Symposium
(USENIX Security 10). 2010.
[41] Cynthia Dwork. “Differential privacy”. In: International colloquium on automata, languages, and
programming. Springer. 2006, pp. 1–12.
[42] Cynthia Dwork. “Differential privacy: A survey of results”. In: International conference on theory
and applications of models of computation. Springer. 2008, pp. 1–19.
[43] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. “Calibrating noise to sensitivity
in private data analysis”. In: Theory of Cryptography: Third Theory of Cryptography Conference,
TCC 2006, New York, NY, USA, March 4-7, 2006. Proceedings 3. Springer. 2006, pp. 265–284.
[44] Cynthia Dwork, Aaron Roth, et al. “The algorithmic foundations of differential privacy”. In:
Foundations and Trends® in Theoretical Computer Science 9.3–4 (2014), pp. 211–407.
[45] Cynthia Dwork, Guy N Rothblum, and Salil Vadhan. “Boosting and differential privacy”. In: 2010
IEEE 51st annual symposium on foundations of computer science. IEEE. 2010, pp. 51–60.
[46] Ernesto Estrada. “Network robustness to targeted attacks. The interplay of expansibility and
degree distribution”. In: The European Physical Journal B-Condensed Matter and Complex Systems
52.4 (2006), pp. 563–574.
[47] Junfeng Fan and Frederik Vercauteren. Somewhat Practical Fully Homomorphic Encryption.
Cryptology ePrint Archive, Paper 2012/144. https://eprint.iacr.org/2012/144. 2012. url:
https://eprint.iacr.org/2012/144.
[48] Haokun Fang and Quan Qian. “Privacy preserving machine learning with homomorphic
encryption and federated learning”. In: Future Internet 13.4 (2021), p. 94.
[49] Uriel Fiege, Amos Fiat, and Adi Shamir. “Zero knowledge proofs of identity”. In: Proceedings of the
nineteenth annual ACM symposium on Theory of computing. 1987, pp. 210–217.
[50] Caroline Fontaine and Fabien Galand. “A survey of homomorphic encryption for nonspecialists”.
In: EURASIP Journal on Information Security 2007 (2007), pp. 1–10.
[51] Liam Fowl, Jonas Geiping, Steven Reich, Yuxin Wen, Wojtek Czaja, Micah Goldblum, and
Tom Goldstein. “Decepticons: Corrupted transformers breach privacy in federated learning for
language models”. In: arXiv preprint arXiv:2201.12675 (2022).
[52] Benjamin CM Fung, Ke Wang, Rui Chen, and Philip S Yu. “Privacy-preserving data publishing: A
survey of recent developments”. In: ACM Computing Surveys (Csur) 42.4 (2010), pp. 1–53.
[53] Ariel Gabizon, Zachary J Williamson, and Oana Ciobotaru. “Plonk: Permutations over
lagrange-bases for oecumenical noninteractive arguments of knowledge”. In: Cryptology ePrint
Archive (2019).
[54] Craig Gentry. “Fully homomorphic encryption using ideal lattices”. In: Proceedings of the
forty-first annual ACM symposium on Theory of computing. 2009, pp. 169–178.
[55] Masoumeh Gholami and Behzad Akbari. “Congestion control in software defined data center
networks through flow rerouting”. In: 2015 23rd Iranian conference on electrical engineering. IEEE.
2015, pp. 654–657.
[56] Aris Gkoulalas-Divanis, Dinusha Vatsalan, Dimitrios Karapiperis, and Murat Kantarcioglu.
“Modern privacy-preserving record linkage techniques: An overview”. In: IEEE Transactions on
Information Forensics and Security 16 (2021), pp. 4966–4987.
[57] Oded Goldreich and Yair Oren. “Definitions and properties of zero-knowledge proof systems”. In:
Journal of Cryptology 7.1 (1994), pp. 1–32.
[58] Charles Gouert, Dimitris Mouris, and Nektarios Georgios Tsoutsos. “New insights into fully
homomorphic encryption libraries via standardized benchmarks”. In: Cryptology ePrint Archive
(2022).
[59] Jens Groth. “On the size of pairing-based non-interactive arguments”. In: Advances in
Cryptology–EUROCRYPT 2016: 35th Annual International Conference on the Theory and
Applications of Cryptographic Techniques, Vienna, Austria, May 8-12, 2016, Proceedings, Part II 35.
Springer. 2016, pp. 305–326.
[60] Arpit Guleria, J Harshan, Ranjitha Prasad, and BN Bharath. “On Homomorphic Encryption Based
Strategies for Class Imbalance in Federated Learning”. In: arXiv preprint arXiv:2410.21192 (2024).
[61] Yuxiang Guo, Lu Chen, Zhengjie Zhou, Baihua Zheng, Ziquan Fang, Zhikun Zhang, Yuren Mao,
and Yunjun Gao. “CampER: An Effective Framework for Privacy-Aware Deep Entity Resolution”.
In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
2023, pp. 626–637.
[62] Shanshan Han, Baturalp Buyukates, Zijian Hu, Han Jin, Weizhao Jin, Lichao Sun, Xiaoyang Wang,
Chulin Xie, Kai Zhang, Qifan Zhang, et al. “FedMLSecurity: A Benchmark for Attacks and
Defenses in Federated Learning and LLMs”. In: arXiv preprint arXiv:2306.04959 (2023).
[63] Ali Hatamizadeh, Hongxu Yin, Holger R Roth, Wenqi Li, Jan Kautz, Daguang Xu, and
Pavlo Molchanov. “Gradvit: Gradient inversion of vision transformers”. In: Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022, pp. 10021–10030.
[64] Briland Hitaj, Giuseppe Ateniese, and Fernando Perez-Cruz. “Deep models under the GAN:
information leakage from collaborative deep learning”. In: Proceedings of the 2017 ACM SIGSAC
conference on computer and communications security. 2017, pp. 603–618.
[65] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang,
and Weizhu Chen. “Lora: Low-rank adaptation of large language models”. In: arXiv preprint
arXiv:2106.09685 (2021).
[66] IBM. IBMFL Crypto. https://github.com/IBM/federated-learning-lib/blob/main/Notebooks/crypto_fhe_pytorch/pytorch_classifier_aggregator.ipynb. Accessed:
2023-1-25. 2022.
[67] Zhifeng Jiang, Wei Wang, and Yang Liu. “Flashe: Additively symmetric homomorphic encryption
for cross-silo federated learning”. In: arXiv preprint arXiv:2109.00675 (2021).
[68] Weizhao Jin, Bhaskar Krishnamachari, Muhammad Naveed, Srivatsan Ravi, Eduard Sanou, and
Kwame-Lante Wright. “Secure Publish-Process-Subscribe System for Dispersed Computing”. In:
2022 41st International Symposium on Reliable Distributed Systems (SRDS). IEEE. 2022, pp. 58–68.
[69] Weizhao Jin, Srivatsan Ravi, and Erik Kline. “Decentralized Privacy-Preserving Path Validation
for Multi-Slicing-Authority 5G Networks”. In: 2022 IEEE Wireless Communications and Networking
Conference (WCNC). IEEE. 2022, pp. 31–36.
[70] Vinu Joseph, Ganesh L Gopalakrishnan, Saurav Muralidharan, Michael Garland, and
Animesh Garg. “A programmable approach to neural network compression”. In: IEEE Micro 40.5
(2020), pp. 17–25.
[71] Joe Kilian. “A note on efficient zero-knowledge proofs and arguments”. In: Proceedings of the
twenty-fourth annual ACM symposium on Theory of computing. 1992, pp. 723–732.
[72] Tiffany Hyun-Jin Kim, Cristina Basescu, Limin Jia, Soo Bum Lee, Yih-Chun Hu, and
Adrian Perrig. “Lightweight source authentication and path validation”. In: Proceedings of the
2014 ACM Conference on SIGCOMM. 2014, pp. 271–282.
[73] Sven Koenig and Maxim Likhachev. “D* Lite”. In: AAAI/IAAI 15 (2002), pp. 476–483.
[74] Hanna Köpcke, Andreas Thor, and Erhard Rahm. “Evaluation of entity resolution approaches on
real-world match problems”. In: Proceedings of the VLDB Endowment 3.1-2 (2010), pp. 484–493.
[75] Peeter Laud and Long Ngo. “Threshold homomorphic encryption in the universally composable
cryptographic library”. In: International Conference on Provable Security. Springer. 2008,
pp. 298–312.
[76] Matt Lepinski and Kotikalapudi Sriram. BGPsec Protocol Specification. RFC 8205. Sept. 2017. doi:
10.17487/RFC8205.
[77] Bing Li, Yukai Miao, Yaoshu Wang, Yifang Sun, and Wei Wang. “Improving the efficiency and
effectiveness for bert-based entity resolution”. In: Proceedings of the AAAI Conference on Artificial
Intelligence. Vol. 35. 15. 2021, pp. 13226–13233.
[78] Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith.
“Federated optimization in heterogeneous networks”. In: Proceedings of Machine learning and
systems 2 (2020), pp. 429–450.
[79] Yuliang Li, Jinfeng Li, Yoshihiko Suhara, AnHai Doan, and Wang-Chiew Tan. “Deep entity
matching with pre-trained language models”. In: arXiv preprint arXiv:2004.00584 (2020).
[80] Fan Liu, Zhiyong Cheng, Huilin Chen, Yinwei Wei, Liqiang Nie, and Mohan Kankanhalli.
“Privacy-preserving synthetic data generation for recommendation systems”. In: Proceedings of
the 45th International ACM SIGIR Conference on Research and Development in Information
Retrieval. 2022, pp. 1379–1389.
[81] Yang Liu, Yan Kang, Tianyuan Zou, Yanhong Pu, Yuanqin He, Xiaozhou Ye, Ye Ouyang,
Ya-Qin Zhang, and Qiang Yang. “Vertical federated learning: Concepts, advances, and challenges”.
In: IEEE Transactions on Knowledge and Data Engineering (2024).
[82] Jiahao Lu, Xi Sheryl Zhang, Tianli Zhao, Xiangyu He, and Jian Cheng. “APRIL: Finding the
Achilles’ Heel on Privacy for Vision Transformers”. In: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. 2022, pp. 10051–10060.
[83] Jing Ma, Si-Ahmed Naas, Stephan Sigg, and Xixiang Lyu. “Privacy-preserving federated learning
based on multi-key homomorphic encryption”. In: International Journal of Intelligent Systems 37.9
(2022), pp. 5880–5901.
[84] Amarachi B Mbakwe, Ismini Lourentzou, Leo Anthony Celi, Oren J Mechanic, and Alon Dagan.
ChatGPT passing USMLE shines a spotlight on the flaws of medical education. 2023.
[85] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas.
“Communication-efficient learning of deep networks from decentralized data”. In: Artificial
intelligence and statistics. PMLR. 2017, pp. 1273–1282.
[86] Matias Mendieta, Taojiannan Yang, Pu Wang, Minwoo Lee, Zhengming Ding, and Chen Chen.
“Local learning matters: Rethinking data heterogeneity in federated learning”. In: Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022, pp. 8397–8406.
[87] Gaurav Menghani. “Efficient deep learning: A survey on making deep learning models smaller,
faster, and better”. In: ACM Computing Surveys 55.12 (2023), pp. 1–37.
[88] mergetb.org. The Merge Testbed Platform. https://mergetb.org/. 2024.
[89] Silvio Micali. “Computationally sound proofs”. In: SIAM Journal on Computing 30.4 (2000),
pp. 1253–1298.
[90] Fan Mo, Anastasia Borovykh, Mohammad Malekzadeh, Hamed Haddadi, and Soteris Demetriou.
“Layer-wise characterization of latent information leakage in federated learning”. In: ICLR
Distributed and Private Machine Learning workshop. 2020.
[91] Viraaji Mothukuri, Reza M Parizi, Seyedamin Pouriyeh, Yan Huang, Ali Dehghantanha, and
Gautam Srivastava. “A survey on security and privacy of federated learning”. In: Future
Generation Computer Systems 115 (2021), pp. 619–640.
[92] Sidharth Mudgal, Han Li, Theodoros Rekatsinas, AnHai Doan, Youngchoon Park,
Ganesh Krishnan, Rohit Deep, Esteban Arcaute, and Vijay Raghavendra. “Deep learning for
entity matching: A design space exploration”. In: Proceedings of the 2018 international conference
on management of data. 2018, pp. 19–34.
[93] Ujan Mukhopadhyay, Anthony Skjellum, Oluwakemi Hambolu, Jon Oakley, Lu Yu, and
Richard Brooks. “A brief survey of cryptocurrency systems”. In: 2016 14th annual conference on
privacy, security and trust (PST). IEEE. 2016, pp. 745–752.
[94] Jad Naous, Michael Walfish, Antonio Nicolosi, David Mazieres, Michael Miller, and Arun Seehra.
“Verifying and enforcing network paths with ICING”. In: Proceedings of the Seventh Conference on
Emerging Networking Experiments and Technologies. 2011, pp. 1–12.
[95] Milad Nasr, Alireza Bahramali, and Amir Houmansadr. “Deepcorr: Strong flow correlation attacks
on Tor using deep learning”. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and
Communications Security. 2018, pp. 1962–1976.
[96] Frank Niedermeyer, Simone Steinmetzer, Martin Kroll, and Rainer Schnell. “Cryptanalysis of
basic bloom filters used for privacy preserving record linkage”. In: German Record Linkage Center,
Working Paper Series, No. WP-GRLC-2014-04 (2014).
[97] Roman Novak, Yasaman Bahri, Daniel A Abolafia, Jeffrey Pennington, and Jascha Sohl-Dickstein.
“Sensitivity and Generalization in Neural Networks: an Empirical Study”. In: International
Conference on Learning Representations. 2018.
[98] Nvidia. NVIDIA FLARE Federated Learning with Homomorphic Encryption.
https://developer.nvidia.com/blog/federated-learning-with-homomorphic-encryption. Accessed:
2023-1-25. 2021.
[99] Pascal Paillier. “Public-key cryptosystems based on composite degree residuosity classes”. In:
Advances in Cryptology—EUROCRYPT’99: International Conference on the Theory and Application
of Cryptographic Techniques Prague, Czech Republic, May 2–6, 1999 Proceedings 18. Springer. 1999,
pp. 223–238.
[100] George Papadakis, Dimitrios Skoutas, Emmanouil Thanos, and Themis Palpanas. “Blocking and
filtering techniques for entity resolution: A survey”. In: ACM Computing Surveys (CSUR) 53.2
(2020), pp. 1–42.
[101] Neha Patki, Roy Wedge, and Kalyan Veeramachaneni. “The synthetic data vault”. In: 2016 IEEE
international conference on data science and advanced analytics (DSAA). IEEE. 2016, pp. 399–410.
[102] Albin Petit, Thomas Cerqueus, Sonia Ben Mokhtar, Lionel Brunie, and Harald Kosch. “PEAS:
Private, efficient and accurate web search”. In: 2015 IEEE Trustcom/BigDataSE/ISPA. Vol. 1. IEEE.
2015, pp. 571–580.
[103] Shiva Raj Pokhrel, Jie Ding, Jihong Park, Ok-Sun Park, and Jinho Choi. “Towards enabling critical
mMTC: A review of URLLC within mMTC”. In: IEEE Access 8 (2020), pp. 131796–131813.
[104] Catherine Quantin, Hocine Bouzelat, FAA Allaert, Anne-Marie Benhamiche, Jean Faivre, and
Liliane Dusserre. “How to ensure data security of an epidemiological follow-up: quality
assessment of an anonymous record linkage procedure”. In: International journal of medical
informatics 49.1 (1998), pp. 117–122.
[105] Charles Rackoff and Daniel R Simon. “Non-interactive zero-knowledge proof of knowledge and
chosen ciphertext attack”. In: Annual international cryptology conference. Springer. 1991,
pp. 433–444.
[106] Md Mostafizer Rahman and Yutaka Watanobe. “ChatGPT for education and research:
Opportunities, threats, and strategies”. In: Applied Sciences 13.9 (2023), p. 5783.
[107] David R Raymond and Scott F Midkiff. “Denial-of-service in wireless sensor networks: Attacks
and defenses”. In: IEEE Pervasive Computing 7.1 (2008), pp. 74–81.
[108] Eugene Y Remez. “Sur la détermination des polynômes d’approximation de degré donnée”. In:
Comm. Soc. Math. Kharkov 10.196 (1934), pp. 41–63.
[109] Holger R Roth, Yan Cheng, Yuhong Wen, Isaac Yang, Ziyue Xu, Yuan-Ting Hsieh,
Kristopher Kersten, Ahmed Harouni, Can Zhao, Kevin Lu, et al. “NVIDIA FLARE: Federated
Learning from Simulation to Real-World”. In: arXiv preprint arXiv:2210.13291 (2022).
[110] Ahmad-Reza Sadeghi, Thomas Schneider, and Immo Wehrenberg. “Efficient privacy-preserving
face recognition”. In: International conference on information security and cryptology. Springer.
2009, pp. 229–244.
[111] Wojciech Samek and Klaus-Robert Müller. “Towards explainable artificial intelligence”. In:
Explainable AI: interpreting, explaining and visualizing deep learning (2019), pp. 5–22.
[112] Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. “DistilBERT, a distilled
version of BERT: smaller, faster, cheaper and lighter”. In: ArXiv abs/1910.01108 (2019).
[113] Rainer Schnell, Tobias Bachteler, and Jörg Reiher. “Privacy-preserving record linkage using
Bloom filters”. In: BMC medical informatics and decision making 9 (2009), pp. 1–11.
[114] Ziad Sehili, Lars Kolb, Christian Borgs, Rainer Schnell, and Erhard Rahm. “Privacy preserving
record linkage with PPJoin”. In: (2015).
[115] Binanda Sengupta, Yingjiu Li, Kai Bu, and Robert H Deng. “Privacy-preserving network path
validation”. In: ACM Transactions on Internet Technology (TOIT) 20.1 (2020), pp. 1–27.
[116] Adi Shamir. “How to share a secret”. In: Communications of the ACM 22.11 (1979), pp. 612–613.
[117] Jinhyun So, Corey J Nolet, Chien-Sheng Yang, Songze Li, Qian Yu, Ramy E Ali, Basak Guler, and
Salman Avestimehr. “Lightsecagg: a lightweight and versatile design for secure aggregation in
federated learning”. In: Proceedings of Machine Learning and Systems 4 (2022), pp. 694–720.
[118] Jure Sokolić, Raja Giryes, Guillermo Sapiro, and Miguel RD Rodrigues. “Robust large margin deep
neural networks”. In: IEEE Transactions on Signal Processing 65.16 (2017), pp. 4265–4280.
[119] Congzheng Song and Ananth Raghunathan. “Information leakage in embedding models”. In:
Proceedings of the 2020 ACM SIGSAC conference on computer and communications security. 2020,
pp. 377–390.
[120] Dimitris Stripelis, Hamza Saleem, Tanmay Ghai, Nikhil Dhinagar, Umang Gupta,
Chrysovalantis Anastasiou, Greg Ver Steeg, Srivatsan Ravi, Muhammad Naveed,
Paul M Thompson, et al. “Secure neuroimaging analysis using federated learning with
homomorphic encryption”. In: 17th International Symposium on Medical Information Processing
and Analysis. Vol. 12088. SPIE. 2021, pp. 351–359.
[121] Hanlin Tang, Chen Yu, Xiangru Lian, Tong Zhang, and Ji Liu. “Doublesqueeze: Parallel stochastic
gradient descent with double-pass error-compensated compression”. In: International Conference
on Machine Learning. PMLR. 2019, pp. 6155–6165.
[122] Sherif A Tawfik. “Minimax approximation and Remez algorithm”. In: Lecture Notes,
http://www.math.unipd.it/~alvise/CS_2008/APPROSSIMAZIONE_2009/MFILES/Remez.pdf (2005).
[123] Arun James Thirunavukarasu, Darren Shu Jeng Ting, Kabilan Elangovan, Laura Gutierrez,
Ting Fang Tan, and Daniel Shu Wei Ting. “Large language models in medicine”. In: Nature
medicine 29.8 (2023), pp. 1930–1940.
[124] Stacey Truex, Nathalie Baracaldo, Ali Anwar, Thomas Steinke, Heiko Ludwig, Rui Zhang, and
Yi Zhou. “A hybrid approach to privacy-preserving federated learning”. In: Proceedings of the 12th
ACM workshop on artificial intelligence and security. 2019, pp. 1–11.
[125] Stacey Truex, Ling Liu, Ka-Ho Chow, Mehmet Emre Gursoy, and Wenqi Wei. “LDP-Fed:
Federated learning with local differential privacy”. In: Proceedings of the third ACM international
workshop on edge systems, analytics and networking. 2020, pp. 61–66.
[126] Dinusha Vatsalan, Peter Christen, and Vassilios S Verykios. “A taxonomy of privacy-preserving
record linkage techniques”. In: Information Systems 38.6 (2013), pp. 946–969.
[127] Jianyu Wang, Rudrajit Das, Gauri Joshi, Satyen Kale, Zheng Xu, and Tong Zhang. “On the
unreasonable effectiveness of federated averaging with heterogeneous data”. In: arXiv preprint
arXiv:2206.04723 (2022).
[128] Zhibo Wang, Mengkai Song, Zhifei Zhang, Yang Song, Qian Wang, and Hairong Qi. “Beyond
inferring class representatives: User-level privacy leakage from federated learning”. In: IEEE
INFOCOM 2019-IEEE conference on computer communications. IEEE. 2019, pp. 2512–2520.
[129] Wikipedia. 5G network slicing. https://en.wikipedia.org/wiki/5G_network_slicing. 2022.
[130] Alexander Wood, Kayvan Najarian, and Delaram Kahrobaei. “Homomorphic encryption for
machine learning in medicine and bioinformatics”. In: ACM Computing Surveys (CSUR) 53.4
(2020), pp. 1–35.
[131] Renzhi Wu, Sanya Chaba, Saurabh Sawlani, Xu Chu, and Saravanan Thirumuruganathan.
“Zeroer: Entity resolution using zero labeled examples”. In: Proceedings of the 2020 ACM SIGMOD
International Conference on Management of Data. 2020, pp. 1149–1164.
[132] Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann,
Prabhanjan Kambadur, David Rosenberg, and Gideon Mann. “Bloomberggpt: A large language
model for finance”. In: arXiv preprint arXiv:2303.17564 (2023).
[133] www.3gpp.org. www.3gpp.org. https://www.3gpp.org/ftp/Specs/archive/22_series/22.261/. 2022.
[134] Baoshi Yan, Lokesh Bajaj, and Anmol Bhasin. “Entity resolution using social graphs for business
applications”. In: 2011 International Conference on Advances in Social Networks Analysis and
Mining. IEEE. 2011, pp. 220–227.
[135] Andrew Chi-Chih Yao. “How to generate and exchange secrets”. In: 27th annual symposium on
foundations of computer science (Sfcs 1986). IEEE. 1986, pp. 162–167.
[136] Yixiang Yao, Tanmay Ghai, Srivatsan Ravi, and Pedro Szekely. “Amppere: A universal abstract
machine for privacy-preserving entity resolution evaluation”. In: Proceedings of the 30th ACM
International Conference on Information & Knowledge Management. 2021, pp. 2394–2403.
[137] Yuhang Yao, Weizhao Jin, Srivatsan Ravi, and Carlee Joe-Wong. “Fedgcn: Convergence and
communication tradeoffs in federated training of graph convolutional networks”. In: Advances in
neural information processing systems (2023).
[138] Chengliang Zhang, Suyi Li, Junzhe Xia, Wei Wang, Feng Yan, and Yang Liu. “Batchcrypt: Efficient
homomorphic encryption for cross-silo federated learning”. In: Proceedings of the 2020 USENIX
Annual Technical Conference (USENIX ATC 2020). 2020.
[139] Haobo Zhang, Junyuan Hong, Fan Dong, Steve Drew, Liangjie Xue, and Jiayu Zhou. “A
privacy-preserving hybrid federated learning framework for financial crime detection”. In: arXiv
preprint arXiv:2302.03654 (2023).
[140] Pin Zhang. “A novel feature selection method based on global sensitivity analysis with application
in machine learning-based prediction model”. In: Applied Soft Computing 85 (2019), p. 105859.
[141] Ruisi Zhang, Seira Hidano, and Farinaz Koushanfar. “Text revealer: Private text reconstruction via
model inversion attacks against transformers”. In: arXiv preprint arXiv:2209.10505 (2022).
[142] Zheng Zhang, Yong Xu, Jian Yang, Xuelong Li, and David Zhang. “A survey of sparse
representation: algorithms and applications”. In: IEEE access 3 (2015), pp. 490–530.
[143] Chen Zhao and Yeye He. “Auto-em: End-to-end fuzzy entity-matching using pre-trained deep
models and transfer learning”. In: The World Wide Web Conference. 2019, pp. 2413–2424.
[144] Yi Zheng, Srivatsan Ravi, Erik Kline, Sven Koenig, and T. K. Satish Kumar. “Conflict-Based Search
for the Virtual Network Embedding Problem”. In: Proceedings of the International Conference on
Automated Planning and Scheduling 32.1 (June 2022), pp. 423–433. url:
https://ojs.aaai.org/index.php/ICAPS/article/view/19828.
[145] Ligeng Zhu, Zhijian Liu, and Song Han. “Deep leakage from gradients”. In: Advances in neural
information processing systems 32 (2019).
[146] Ruiyu Zhu and Yan Huang. “Efficient privacy-preserving general edit distance and beyond”. In:
Cryptology ePrint Archive (2017).
.1 Appendix
Preliminaries
.1.1 FL & Homomorphic Encryption
Homomorphic Encryption (HE) is a cryptographic primitive that allows computation to be performed on encrypted data without revealing the underlying plaintext. It usually serves as a foundation for privacy-preserving outsourced computation. HE generally consists of four algorithms (KeyGen, Enc, Eval, Dec): data are encrypted prior to computation, the computation is performed on the encrypted data without decryption, and the resulting ciphertext is then decrypted to obtain the final plaintext result.
Since FL model parameters are usually not integers, our method is built on the Cheon-Kim-Kim-Song (CKKS) scheme [30], a (leveled) HE variant that works with approximate numbers. A comparison of HE with other privacy-preserving primitives can be found in Table 1.
Method | Model Degradation | Overheads | Client Dropout | Interactive Sync | Model Visible To Server
Differential Privacy | With noise | Light | Robust | No | Yes
Secure Aggregation | Exact | Medium | Susceptible | Yes | Yes
Homomorphic Encryption | Exact | Heavy | Robust | No | No
Table 1: Comparison of Differential Privacy, Secure Aggregation, and Homomorphic Encryption
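To make the CKKS workflow above concrete, the following is a minimal sketch of the (KeyGen, Enc, Eval, Dec) steps using the TenSEAL Python wrapper, one of the HE backends compared later in this appendix. The parameter values below are illustrative defaults rather than the exact configuration used in our experiments.

# Minimal CKKS sketch with TenSEAL (illustrative parameters only).
import tenseal as ts

# KeyGen: build a CKKS context; keys are generated and stored inside the context.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()

# Enc: encrypt two toy "model updates" as CKKS vectors.
update_a = ts.ckks_vector(context, [0.5, -1.2, 3.0])
update_b = ts.ckks_vector(context, [1.5, 0.2, -1.0])

# Eval: homomorphic addition without decrypting anything.
aggregate = update_a + update_b

# Dec: decrypt the aggregated result; values are approximate under CKKS.
print(aggregate.decrypt())  # approximately [2.0, -1.0, 2.0]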
Key Management And Threshold HE
.1.2 HE Key Management
Our general system structure assumes the existence of a potentially compromised aggregation server, which performs the HE-based secure aggregation. Alongside this aggregation server, there also exists a trusted key authority server that generates and distributes HE keys and the related crypto context files to authenticated parties (as described previously in Algorithm 1 in the main paper). We assume there is no collusion between these two servers.
Moreover, secure computation protocols for more decentralized settings without an aggregation server are also available using cryptographic primitives such as Threshold HE [8], Multi-Key HE [8], and Proxy Re-Encryption [11, 68]. In such settings, secure computation and decryption can be performed collaboratively across multiple parties without the need for a centralized point. We plan to introduce a more decentralized version in the future. Due to the collaborative nature of such secure computation, the key management server would then act more as a coordination point than as a trusted source for key generation.
Framework and Platform Deployment
.1.3 Software Framework: Homomorphic Encryption In Federated Learning
In this part, we will illustrate how we design our HE-based aggregation from a software framework perspective.
Figure 1: Framework Structure: our framework consists of a three-layer structure including Crypto Foundation to support basic HE building blocks, ML Bridge to connect crypto tools with ML functions, and FL
Orchestration to coordinate different parties during a task.
Figure 1 provides a high-level design of our framework, which consists of three major layers:
• Crypto Foundation. The foundation layer is where Python wrappers are built to realize HE functions
including key generation, encryption/decryption, secure aggregation, and ciphertext serialization using
open-sourced HE libraries;
• ML Bridge. The bridging layer connects the FL system orchestration and the cryptographic functions. Specifically, ML processing APIs convert the outputs of local training processes into inputs for the HE functions, and vice versa. The optimization module that mitigates HE overheads is also realized in this layer;
• FL Orchestration. The FL system layer is where the key authority server manages the key distribution
and the (server/client) managers and task executors orchestrate participants.
Our layered design makes the HE crypto foundation and the optimization module semi-independent, allowing different HE libraries to be easily switched in our system and further FL optimization techniques
to be easily added to the system.
.1.4 Framework APIs
Table 2 shows the framework APIs in our system related to HE.
API Name | Description
pk, sk = key_gen(params) | Generate a pair of HE keys (public key and private key)
1d_local_model = flatten(local_model) | Flatten locally trained model tensors into a 1D local model
enc_local_model = enc(pk, 1d_model) | Encrypt the 1D model
enc_global_model = he_aggregate(enc_models[n], weight_factors[n]) | Homomorphically aggregate a list of 1D local models
dec_global_model = dec(sk, enc_global_model) | Decrypt the 1D global model
global_model = reshape(dec_global_model, model_shape) | Reshape the 1D global model back to the original shape
Table 2: HE Framework APIs
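As a sketch of how the APIs in Table 2 compose into one aggregation round, the snippet below chains them together; the function names follow Table 2, while the surrounding orchestration (who holds the secret key, how weights are chosen) is simplified for illustration.

# Hypothetical one-round flow built from the Table 2 APIs (sketch only).
def secure_aggregation_round(local_models, weight_factors, params, model_shape):
    pk, sk = key_gen(params)                      # key authority generates the HE key pair

    enc_models = []
    for local_model in local_models:              # on each client, before upload
        one_d = flatten(local_model)              # flatten model tensors into a 1D vector
        enc_models.append(enc(pk, one_d))         # encrypt the flattened model

    # On the server: weighted aggregation performed directly on ciphertexts.
    enc_global_model = he_aggregate(enc_models, weight_factors)

    # Back on the clients: decrypt and restore the original tensor shapes.
    dec_global_model = dec(sk, enc_global_model)
    return reshape(dec_global_model, model_shape)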
.1.5 Deploy Anywhere: A Deployment Platform MLOps For Edges/Cloud
We implement our deployment-friendly platform such that our system can be easily deployed across cloud
and edge devices. Before the training starts, a user uploads the configured server package and the local
client package to the web platform. The server package defines the operations on the FL server, such as
the aggregation function and client sampling function; the local client package defines the customized
model architecture to be trained (model files will be distributed to edge devices in the first round of the
training). Both packages are written in Python. The platform then builds and runs the docker image with
the uploaded server package to operate as the server for the training with edge devices configured using
the client package.
As shown in Figure 2, during training users can also track the learning procedure, including device status, training progress/model performance, and system overheads (e.g., training time, communication time, CPU/GPU utilization, and memory utilization), via the web interface. Our platform keeps close track of overheads, which allows users to pinpoint HE overhead bottlenecks in real time, if any exist.
Figure 2: Deployment Interface Example: Overhead distribution monitoring on each edge device (e.g.
Desktop (Ubuntu), Laptop (MacBook), and Raspberry Pi 4), which can be used to pinpoint HE overhead
bottlenecks and guide optimization.
Additional Definitions And Proofs
Definition .1.1 (Gradient-Based Sensitivity) For a function f : R^n → R, its gradient-based sensitivity ∆f ∈ R^n can be evaluated as its gradient

∆f = ∂f(D)/∂D.
We adopt the gradient of f as the sensitivity (see Definition .1.1), which appears to differ from the form in Definition 2.5.5. However, we argue that this notion is loosely compatible with the use of differential privacy if we view it as an extension to the continuous case, i.e., |D1 − D2| = 1 is replaced with |D1 − D2| ≤ ε for some small ε.
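As an illustration of Definition .1.1, the gradient-based sensitivity can be computed with automatic differentiation. The sketch below assumes a differentiable scalar function f and a data tensor D; it is a generic example, not the exact procedure used for the sensitivity maps in our experiments.

# Sketch: gradient-based sensitivity of f with respect to the data D (PyTorch autograd).
import torch

def gradient_sensitivity(f, D):
    D = D.clone().detach().requires_grad_(True)   # treat the data as the differentiation variable
    out = f(D)                                    # f must return a scalar
    (grad,) = torch.autograd.grad(out, D)         # delta_f = d f(D) / d D
    return grad

# Toy usage: f(D) = sum of squares, so the sensitivity equals 2 * D.
D = torch.tensor([0.5, -1.0, 2.0])
print(gradient_sensitivity(lambda x: (x ** 2).sum(), D))  # tensor([ 1., -2.,  4.])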
.1.6 Proof of Base Full Encryption Protocol
In this subsection, we prove the privacy of the base protocol, where homomorphic-encryption-based federated learning uses full model parameter encryption (i.e., the selective parameter encryption rate is set to 1). We define the adversary in Definition .1.2 and privacy in Definition .1.4.
Definition .1.2 (Single-Key Adversary) A semi-honest adversary A can corrupt any subset of the n learners or the aggregation server, but not both at the same time.
Note that the rest of the proof assumes the single-key setup; the privacy of the threshold variant of HE-FL (as shown in Definition .1.3) can be easily proved by extending the proofs of threshold homomorphic encryption [10, 17, 75].
Definition .1.3 (Threshold Adversary) A semi-honest adversary A_Th can corrupt, at the same time, any subset of n − k learners and the aggregation server.
Definition .1.4 (Privacy) A homomorphic-encryption federated learning protocol π is simulation secure in the presence of a semi-honest adversary A if there exists a simulator S in the ideal world that corrupts the same set of parties and produces an output identically distributed to A's output in the real world.
Ideal World. Our ideal world functionality F interacts with the learners and the aggregation server as follows:
• Each learner sends a registration message to F for a federated training model task W_glob. F determines a subset N′ ⊂ N of learners whose data can be used to compute the global model W_glob.
• Both honest and corrupted learners upload their local models to F.
• If the local models W⃗ of the learners in N′ are enough to compute W_glob, F sends W_glob ← Σ_{i=1}^{N′} α_i W_i to all learners in N′; otherwise F sends the empty message ⊥.
Real World. In the real world, F is replaced by our protocol described in Algorithm 1 with full model parameter encryption.
We describe a simulator S that simulates the view of A in the real-world execution of our protocol. Our privacy definition .1.4 and the simulator S prove both confidentiality and correctness. We omit the simulation of the view of an A that corrupts only learners: the learners never receive the ciphertexts of other learners' local models in the execution of π, so such a simulation is immediate and trivial.
Simulator. In the ideal world, S receives λ and 1^n from F and executes the following steps:
1. S chooses a uniformly distributed random tape r.
2. S runs the key generation function to sample pk: (pk, sk) ← HE.KeyGen(λ).
3. For a chosen i-th learner, S runs the encryption function to sample: (c_i) ← HE.Enc(pk, r^{|W_i|}).
4. S repeats Step 3 for all other learners to obtain c⃗, and runs the federated aggregation function f to sample: (c_glob) ← HE.Eval(c⃗, f).
The execution of S implies that

{c_i, c_glob} ≡_s {HE.Enc(pk, W_i), HE.Eval(W⃗, f)}.

Thus, we conclude that S's output in the ideal world is computationally indistinguishable from the view of A in a real-world execution:

{S(1^n, λ)} ≡_s {view_π(λ)},

where view_π is the view of A in the real execution of π.
.1.7 Quantifying negligible privacy value in full encryption
Given a security parameter λ that denotes the desired security level of the scheme, i.e., λ-bit security, we can obtain a relaxed catastrophic failure probability δ_0 = 1/2^λ, which satisfies (ϵ_approx, δ_0)-DP under approximate DP (Gaussian mechanism), where ϵ_approx = 0. Note that, in general for approximate DP, the Gaussian mechanism will not actually release the entire dataset under the catastrophic failure probability; rather, it fails gracefully, thus δ_0 is a good approximation of the catastrophic failure probability under the failure of the security scheme.
With (ϵ_approx, δ_0)-DP, we can switch the pure DP used in our paper to approximate DP and use Advanced Composition [45] (Theorem 3.20) to get a tight composition. On the other hand, to compose the privacy of (ϵ_approx, δ_0)-DP under the Gaussian mechanism into our current pure DP composition in the paper, we can also use Lemma 3.7 [21] to obtain a partial converse (up to a loss in parameters) from approximate DP to pure DP via zCDP. With

δ_0 = 1/2^λ, (1)

ρ = ϵ_approx + 2 ln(1/δ_0) − 2 √( ln(1/δ_0) (ϵ_approx + ln(1/δ_0)) ), (2)

ϵ_0 = √(2ρ), (3)

we can have

ϵ_0 = √( 2ϵ_approx + 2 ln(2^λ) − 2 √( ln(2^λ) (ϵ_approx + ln(2^λ)) ) ).

Letting ϵ_approx = 10^{-12} and λ = 128 for 128-bit security, we can have a negligible ϵ_0 = 9.97 × 10^{-7}. Note that ϵ_approx = 10^{-12} is a really conservative value for estimating the privacy obtained from encryption; when ϵ_approx = 0 we have ϵ_0 ≃ 0. Thus, we have ϵ_0-DP from the security of encryption, where ϵ_0 ≃ 0.
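As a numeric sanity check of the expression above, ϵ_0 can be evaluated directly; the snippet below assumes natural logarithms and the stated λ = 128 and ϵ_approx = 10^{-12}.

# Sketch: evaluating epsilon_0 from lambda-bit security, following Eqs. (1)-(3).
import math

def epsilon_from_security(lam, eps_approx=1e-12):
    log_inv_delta0 = lam * math.log(2)            # ln(1/delta_0) with delta_0 = 2^(-lambda)
    inner = math.sqrt(log_inv_delta0 * (eps_approx + log_inv_delta0))
    return math.sqrt(2 * eps_approx + 2 * log_inv_delta0 - 2 * inner)

print(epsilon_from_security(128))  # about 1e-6, the same order as the 9.97e-7 reported above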
.1.8 Proof of rJ-Privacy by Selective Parameter Encryption
The mean value of the sensitivity within [0, Q_{1−p}] is calculated by

E[X | X ≤ Q_{1−p}] = (1/(1 − p)) ∫_0^{Q_{1−p}} x p(x) dx.

Suppose the total number of parameters is n; the ratio is then obtained as

r = ( n(1 − p) · (1/(1 − p)) ∫_0^{Q_{1−p}} x p(x) dx ) / (nµ) = (1/µ) ∫_0^{Q_{1−p}} x p(x) dx.

Therefore, the total privacy budget is

J′ = Σ_{i ∈ [N]∖S} ∆f_i / b = r Σ_{i=1}^{N} ∆f_i / b = rJ.
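An empirical counterpart of this ratio can be computed directly from a vector of measured sensitivities; the sketch below simply compares the sensitivity mass below the (1 − p)-quantile with the total mass, and the log-normal sampling is only a stand-in for real measurements.

# Sketch: empirical privacy budget ratio r for a given encryption ratio p.
import numpy as np

def budget_ratio(sensitivities, p):
    q = np.quantile(sensitivities, 1 - p)         # the (1 - p)-th quantile Q_{1-p}
    kept = sensitivities[sensitivities <= q]      # parameters left unencrypted (noised instead)
    return kept.sum() / sensitivities.sum()       # r = (1/mu) * integral_0^{Q_{1-p}} x p(x) dx

rng = np.random.default_rng(0)
sens = rng.lognormal(mean=-8.0, sigma=3.0, size=100_000)   # right-skewed toy sensitivities
print(budget_ratio(sens, p=0.05))                          # small r: most mass sits in the top 5%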
.1.9 Proof of Privacy Budget Relationship Under Different Parameter Encryption
Options
b_m induces the privacy budget ε_i^{(m)} = ∆f_i / b_m for the encryption method indicated by m. The total privacy budgets for all the methods are then given by

J_0 = Σ_i ε_i^{(0)} = (1/b_0) Σ_i ∆f_i,

J_1 = (1 − p) Σ_i ε_i^{(1)} = ((1 − p)/b_1) Σ_i ∆f_i,

J_2 = r Σ_i ε_i^{(2)} = (r/b_2) Σ_i ∆f_i.

When the methods reach a similar protection level (approximating using J_0 = J_1 = J_2), we have the relation above by canceling out the term Σ_i ∆f_i.
.1.10 Selective Parameter Encryption Privacy Proof Under Uniform Distribution
Assume ∆f ∼ U(0, 1), where U denotes the uniform distribution; we then have the following privacy quantification.

Remark .1.5 (Achieving (1 − p)^2 J-Privacy (Uniformly Distributed Sensitivity)) If we select the most sensitive parameters with ratio p for homomorphic encryption and add Laplace noise on the remaining parameters, it satisfies (1 − p)^2 J-Privacy.

For a uniform distribution with density function p(x) = 1/x_max, x ∈ [0, x_max], mean µ = (1/2) x_max, and (1 − p)-th quantile Q_{1−p} = (1 − p) x_max,

r = (2/x_max) ∫_0^{(1−p) x_max} (x / x_max) dx = (1 − p)^2.
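A quick Monte Carlo check of this closed form (a self-contained sketch with synthetic uniform sensitivities):

# Sketch: uniform sensitivities give a privacy budget ratio of (1 - p)^2.
import numpy as np

rng = np.random.default_rng(0)
sens = rng.uniform(0.0, 1.0, size=1_000_000)      # Delta f ~ U(0, 1)

for p in (0.1, 0.3, 0.5):
    q = np.quantile(sens, 1 - p)
    r = sens[sens <= q].sum() / sens.sum()
    print(p, round(r, 4), (1 - p) ** 2)           # empirical r vs. the closed form (1 - p)^2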
The uniform distribution is a conservative estimation of the sensitivity distribution. In our experiments, the obtained sensitivity data are mostly right-skewed and can be well modeled by a mixture of several log-normal distributions (see the case of Transformer-3 shown in Figure 3). However, it is hard to derive the conclusion analytically for log-normal distributions, so we provide Remark .1.6 as a demonstration of the right-skewed case with the simpler exponential distribution.
.1.11 Selective Parameter Encryption Privacy Proof Under Exponential Distribution
Remark .1.6 (Achieving (p ln p − p + 1)J-Privacy (Exponentially Distributed Sensitivity))
For an exponential distribution with density function p(x) = λe^{−λx}, mean µ = 1/λ, and (1 − p)-th quantile Q_{1−p} = −(ln p)/λ, the corresponding ratio is then

r = λ ∫_0^{−(ln p)/λ} λx e^{−λx} dx = p ln p − p + 1.
Taking Transformer-3t as an example, the estimated privacy budget ratio for the sensitivity data under different distributions is presented in Figure 3.4b. It is clear from the figure that a better fit of the sensitivity data distribution yields a better estimation of the privacy budget ratio. Note that the estimation here is imperfect, since finding the best fit is not the main concern of our study, but it is sufficient to show the correctness of our theorem.
.1.12 Sensitivity Distribution and Privacy Budget Ratio of the Models Included
Figures 3, 4, 5, 6, 7, 8, and 9 show that the log-normal mixture model is a good fit for the models we use in our evaluation experiments.
Figure 3: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption (Transformer-3): (a) estimation of the sensitivity distribution; (b) estimation of the privacy budget ratio.
Figure 4: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption (Transformer-3f): (a) estimation of the sensitivity distribution; (b) estimation of the privacy budget ratio.
Figure 5: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption (Transformer-S): (a) estimation of the sensitivity distribution; (b) estimation of the privacy budget ratio.
Figure 6: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption (GPT-2): (a) estimation of the sensitivity distribution; (b) estimation of the privacy budget ratio.
Figure 7: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption (LeNet): (a) estimation of the sensitivity distribution; (b) estimation of the privacy budget ratio.
Figure 8: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption (CNN): (a) estimation of the sensitivity distribution; (b) estimation of the privacy budget ratio.
Figure 9: Sensitivity Distribution and Privacy Budget Ratio from Selective Parameter Encryption (ResNet-18): (a) estimation of the sensitivity distribution; (b) estimation of the privacy budget ratio.
Supporting Materials for Defense Effectiveness Experiments
.1.13 Parameter Sensitivity Map for LeNet
Figure 10 visualizes the parameter sensitivity map of LeNet.
Figure 10: Model Privacy Map Calculated by Sensitivity on LeNet (panels: Conv_Layer1, Conv_Layer2, Conv_Layer3, Conv_Layer4, Linear_Classifier): darker color indicates higher sensitivity. Each subfigure shows the sensitivity of the parameters of the corresponding layer. The sensitivity of parameters is imbalanced, and many parameters have very little sensitivity (their gradients are hardly affected by tuning the data input for the attack).
.1.14 Defense Effectiveness on CV and NLP Models
Figures 11 and 12 provide the detailed records underlying Table 3.1.
Figure 11: Results for Selected CV Models (LeNet, CNN, and ResNet-18): attack VIFP vs. encryption ratio under No Encryption, Selective Encryption, and Random Encryption.
.1.15 Experiments on Quantifying Privacy
Figure 13 shows the privacy guarantee of Selective Parameter Encryption using the equivalent privacy
budget.
Additional Experiments
Model | Model Size | HE Time (s) | Non-HE Time (s) | Comp Ratio | Ciphertext | Plaintext | Comm Ratio
Linear Model | 101 | 0.216 | 0.001 | 150.85 | 266.00 KB | 1.10 KB | 240.83
TimeSeries Transformer | 5,609 | 2.792 | 0.233 | 12.00 | 532.00 KB | 52.65 KB | 10.10
MLP (2 FC) | 79,510 | 0.586 | 0.010 | 60.46 | 5.20 MB | 311.98 KB | 17.05
LeNet | 88,648 | 0.619 | 0.011 | 57.95 | 5.97 MB | 349.52 KB | 17.50
RNN (2 LSTM + 1 FC) | 822,570 | 1.195 | 0.013 | 91.82 | 52.47 MB | 3.14 MB | 16.70
CNN (2 Conv + 2 FC) | 1,663,370 | 2.456 | 0.058 | 42.23 | 103.15 MB | 6.35 MB | 16.66
MobileNet | 3,315,428 | 9.481 | 1.031 | 9.20 | 210.41 MB | 12.79 MB | 16.45
ResNet-18 | 12,556,426 | 19.950 | 1.100 | 18.14 | 796.70 MB | 47.98 MB | 16.61
ResNet-34 | 21,797,672 | 37.555 | 2.925 | 12.84 | 1.35 GB | 83.28 MB | 16.60
ResNet-50 | 25,557,032 | 46.672 | 5.379 | 8.68 | 1.58 GB | 97.79 MB | 16.58
GroupViT | 55,726,609 | 86.098 | 19.921 | 4.32 | 3.45 GB | 212.83 MB | 16.61
Vision Transformer | 86,389,248 | 112.504 | 17.739 | 6.34 | 5.35 GB | 329.62 MB | 16.62
BERT | 109,482,240 | 136.914 | 19.674 | 6.96 | 6.78 GB | 417.72 MB | 16.62
Llama 2 | 6.74 B | 13067.154 | 2423.976 | 5.39 | 417.43 GB | 13.5 GB | 30.92
Table 3: Vanilla Fully-Encrypted Models of Different Sizes: with 3 clients; Comp Ratio is calculated by time costs of HE over time costs of Non-HE; Comm Ratio is calculated by file sizes of HE over file sizes of Non-HE. CKKS is configured with default crypto parameters.
Figure 12: Results for Selected NLP Models (Transformer-3, Transformer3-f, Transformer-S, and GPT-2): attack accuracy vs. encryption ratio under No Encryption, Selective Encryption, and Random Encryption.
We evaluate the HE-based training overheads (without our optimization in place) across various FL training scenarios and configurations. This analysis covers diverse model scales, HE cryptographic parameter configurations, numbers of clients involved in the task, and communication bandwidths. It helps us identify bottlenecks in the HE process throughout the entire training cycle. We also benchmark our framework against other open-source HE solutions to demonstrate its advantages.
.1.16 Parameter Efficiency Techniques in HE-Based FL
Table 4 shows the optimization gains by applying model parameter efficiency solutions in HE-Based FL.
Figure 13: Defense Effectiveness of DP Noise of Different Scales Under Three Protection Methods (Full DP, Selective Encryption + DP, Random Encryption + DP): (a) results for selected NLP models (Transformer-3, Transformer3-t, Transformer3-f; accuracy vs. scale of the Laplace distribution); (b) results for LeNet (MSSSIM, UQI, and VIFP vs. scale of the Laplace distribution). An encryption ratio is fixed for each model from the beginning to guarantee a good attack performance at first. Each configuration is attacked 10 times and the best attack score is recorded. The experiments are repeated for at least three different sets of applied DP noises.
Models | PT (MB) | CT | Opt (MB)
ResNet-18 (12 M) [121] | 47.98 | 796.70 MB | 19.03
BERT (110 M) [65] | 417.72 | 6.78 GB | 16.66
Table 4: Parameter Efficiency Overhead: PT means plaintext and CT means ciphertext. Communication reductions are 0.60 and 0.96.
.1.17 Results on Different Scales of Models
We evaluate our framework on models of different size scales and from different domains, from small models like the linear model to large foundation models such as the Vision Transformer [38] and BERT [36]. As Table 3 shows, both the computational and communication overheads are generally proportional to model size.
Table 3 also illustrates the overhead increase over plaintext federated aggregation. The computation ratio is generally 5x ∼ 20x, while the communication overhead commonly jumps to around 15x. Small models tend to show a higher computational overhead ratio, mainly due to the standard HE initialization process, which plays a more significant role when compared to the plaintext cost. The communication cost increase is significant for models with fewer than 4096 parameters (the packing batch size). Recall that, because of the way our HE core packs encrypted numbers, an array smaller than the packing batch size still requires a full ciphertext.
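The packing effect can be made explicit with a little arithmetic; the sketch below estimates the number of CKKS ciphertexts needed for a given model size and packing batch size, under the simplifying assumption that every ciphertext has roughly the same serialized size.

# Sketch: why small models pay a disproportionate communication cost under packing.
import math

def num_ciphertexts(n_params, batch_size=4096):
    # Every group of `batch_size` values shares one ciphertext; a partial group
    # still needs a full ciphertext, which dominates for models smaller than one batch.
    return math.ceil(n_params / batch_size)

for n in (101, 5_609, 88_648, 1_663_370):
    print(n, num_ciphertexts(n))   # 101 parameters already cost one full ciphertext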
.1.18 Results on Different Cryptographic Parameters
We evaluate the impact of differently configured cryptographic parameters, primarily the packing batch size and the scaling bits. The packing batch size determines the number of slots packed into a single ciphertext, while the number of scaling bits affects the “accuracy” (i.e., how close the decrypted ciphertext result is to the plaintext result) of approximate numbers represented from integers.
HE Batch Size | Scaling Bits | Comp (s) | Comm (MB) | Model Test Accuracy ∆ (%)
1024 | 14 | 8.834 | 407.47 | -0.28
1024 | 20 | 7.524 | 407.47 | -0.21
1024 | 33 | 7.536 | 407.47 | 0
1024 | 40 | 7.765 | 407.47 | 0
1024 | 52 | 7.827 | 407.47 | 0
2048 | 14 | 3.449 | 204.50 | -0.06
2048 | 20 | 3.414 | 204.50 | -0.13
2048 | 33 | 3.499 | 204.50 | 0
2048 | 40 | 3.621 | 204.50 | 0
2048 | 52 | 3.676 | 204.50 | 0
4096 | 14 | 1.837 | 103.15 | -1.85
4096 | 20 | 1.819 | 103.15 | 0.32
4096 | 33 | 1.886 | 103.15 | 0
4096 | 40 | 1.998 | 103.15 | 0
4096 | 52 | 1.926 | 103.15 | 0
Table 5: Computational & Communication Overhead of Different Crypto Parameter Setups: tested with CNN (2 Conv + 2 FC) and 3 clients; the model test accuracy ∆ is the difference between the best plaintext global model and the best encrypted global model.
From Table 5, larger packing batch sizes generally result in faster computation and smaller overall ciphertext files, thanks to the packing mechanism. The scaling bit number, in contrast, has an almost negligible impact on overheads.
Unsurprisingly, a higher number of scaling bits results in higher “accuracy” of the decrypted ciphertext value, which generally means the encrypted aggregated model has a test performance close to that of the plaintext aggregated model. However, it is worth mentioning that, since CKKS is an approximate scheme with noise, the decrypted aggregated model can yield either a positive or negative model test accuracy ∆, though usually a negative or nearly zero ∆.
.1.19 Client Data Distribution Impact on Sensitivity
Figure 14 shows the difference in the sensitivity distribution of ResNet-50 under two different client data distributions. The two sensitivity distributions still preserve the characteristics of a log-normal mixture distribution, but a slight shift in aspects such as their mode and range is noticeable. This observation suggests that alternative global mask aggregation functions, such as maximum-based aggregation, might outperform our current weighted averaging method in terms of privacy protection. Investigating this specific aspect of our selective encryption is worthwhile future work.
Figure 14: Deviation of Sensitivity Distribution Induced by Different Client Data Distributions: two client data distributions constructed from the ImageNet dataset with 100 images from distinct classes sampled at equal intervals. Distribution 1 contains data with labels in [0, 1, 2, 3, 5], while Distribution 2 contains data whose labels span 0 to 400.
To further investigate this aspect, the experimental setups for FL data heterogeneity from prior work [60, 86] can be adopted in future studies of privacy sensitivity calculation.
.1.20 Impact from Number of Clients
As real-world systems often experience a dynamic number of participants, we evaluate how the overhead shifts with the number of clients. Figure 15a breaks down the cost distribution as the number of clients increases. A growing number of clients means proportionally more ciphertexts as inputs to the secure aggregation function, so the major impact falls on the server. When the server is overloaded, our system also supports client selection to remove certain clients without largely degrading model performance.
Figure 15: Results on Different Numbers of Clients and Communication Setups. (a) Step breakdown of HE computational cost vs. number of clients (up to 200): tested on a fully-encrypted CNN; execution time is broken down into Init, Enc, Secure Agg, and Dec. (b) Impact of different bandwidths on communication and training cycles on a fully-encrypted ResNet-50: HE means HE-enabled training and Non means plaintext; Others includes all procedures other than communication during training; percentages represent the portion of communication cost in the entire training cycle.
.1.21 Communication Cost on Different Bandwidths
FL parties can be located in different geo-locations, which might result in communication bottlenecks. Typically, there are two common scenarios: inter-data-center and intra-data-center communication. In this part, we evaluate the impact of bandwidth on communication costs and on the FL training cycle. We categorize communication bandwidths into three cases:
• Infiniband (IB): communication between intra-center parties. 5 GB/s as the test bandwidth.
• Single AWS Region (SAR): communication between inter-center parties within the same geo-region (within US-WEST). 592 MB/s as the test bandwidth.
• Multiple AWS Regions (MAR): communication between inter-center parties across different geo-regions (between US-WEST and EU-NORTH). 15.6 MB/s as the test bandwidth.
As shown in Figure 15b, we deploy our system in 3 different geo-distributed environments operated under different bandwidths. The secure HE functionality has an enormous impact in low-bandwidth environments, while medium-to-high-bandwidth environments suffer limited impact from the increased communication overhead during training cycles, compared to Non-HE settings.
.1.22 Different Encryption Selections
Table 6 shows the overhead reductions with different selective encryption rates.
Selection | Comp (s) | Comm | Comp Ratio | Comm Ratio
Enc w/ 0% | 17.739 | 329.62 MB | 1.00 | 1.00
Enc w/ 10% | 30.874 | 844.49 MB | 1.74 | 2.56
Enc w/ 30% | 50.284 | 1.83 GB | 2.83 | 5.69
Enc w/ 50% | 70.167 | 2.83 GB | 3.96 | 8.81
Enc w/ 70% | 88.904 | 3.84 GB | 5.01 | 11.93
Enc w/ All | 112.504 | 5.35 GB | 6.34 | 16.62
Table 6: Overheads With Different Parameter Selection Configs Tested on Vision Transformer: “Enc w/ 10%” means performing encrypted computation on only 10% of the parameters; all computation and communication results include overheads from the plaintext aggregation of the remaining parameters.
.1.23 Comparison with Other FL-HE Frameworks
Comparison with other popular HE-based FL work can be found in Table 7.
Features | IBMFL | Nvidia FLARE | Ours
Homomorphic Encryption | ✓ | ✓ | ✓
Threshold Key Management | ✗ | ✗ | ✓
Selective Parameter Encryption | ✗ | ⃝ | ✓
Encrypted Foundation Model Training | ⃝ | ⃝ | ✓
Table 7: Comparison with Existing HE-Based FL Systems: ⃝ implies limited support. For Selective Parameter Encryption, FLARE offers a (random) partial encryption option which does not have clear indications of privacy impacts; for Encrypted Foundation Model Training, the other two platforms require massive resources to train foundation models in encrypted federated learning.
Frameworks | HE Core | Key Management | Comp (s) | Comm (MB) | HE Multi-Party Functionalities
Ours | PALISADE | ✓ | 2.456 | 105.72 | PRE, ThHE
Ours (w/ Opt) | PALISADE | ✓ | 0.874 | 16.37 | PRE, ThHE
Ours | SEAL (TenSEAL) | ✓ | 3.989 | 129.75 | —
Nvidia FLARE (9a1b226) | SEAL (TenSEAL) | ✓ | 2.826 | 129.75 | —
IBMFL (8c8ab11) | SEAL (HELayers) | ⃝ | 3.955 | 86.58 | —
Plaintext | — | — | 0.058 | 6.35 | —
Table 8: Different Frameworks: tested with CNN (2 Conv + 2 FC) and 3 clients; Github commit IDs are specified. For key management, our work uses a key authority server; FLARE uses a security content manager; IBMFL currently provides a local simulator.
We compare our framework to other open-source FL frameworks with HE capability, namely NVIDIA FLARE (NVIDIA) and IBMFL.
Both NVIDIA and IBMFL utilize Microsoft SEAL as the underlying HE core, with NVIDIA using OpenMined's Python tensor wrapper over SEAL, TenSEAL, and IBMFL using IBM's Python wrapper over SEAL, HELayers (HELayers also has an HElib version). Our HE core module can be replaced with different available HE cores; to give a more comprehensive comparison, we also implement a TenSEAL version of our framework for evaluation.
Table 8 summarizes the performance of the different frameworks using a CNN model with 3 clients as an example. Our PALISADE-powered framework has the smallest computational overhead thanks to the performance of the PALISADE library. In terms of communication cost, our system (PALISADE) comes second after IBMFL, whose smallest serialized file sizes are due to the efficient packing of HELayers' Tile tensors [6].
Note that NVIDIA's TenSEAL-based realization is faster than the TenSEAL variant of our system. This is because NVIDIA scales each learner's local model parameters locally rather than weighting ciphertexts on the server. This approach removes the one multiplication operation usually performed during secure aggregation (recall that HE multiplications are expensive). However, such a setup would not suit the scenario where the central server does not want to reveal its weighting mechanism for each individual local model to the learners, as it reveals partial (even full, in some cases) information about the participants in the system.
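The trade-off discussed above can be sketched as two aggregation variants. The function and variable names below are hypothetical, and "enc_model * weight" stands for the one ciphertext-plaintext multiplication per client that the client-side variant avoids.

# Sketch: server-side weighting vs. client-side scaling in HE-based aggregation.
def aggregate_server_weighted(enc_models, weights):
    # The server multiplies each ciphertext by its weight: the weighting scheme stays
    # private to the server, but it costs one HE multiplication per client.
    total = enc_models[0] * weights[0]
    for enc_model, w in zip(enc_models[1:], weights[1:]):
        total = total + enc_model * w
    return total

def aggregate_client_scaled(enc_prescaled_models):
    # Clients scale their plaintext updates by their weight before encryption
    # (the FLARE-style setup), so the server only adds ciphertexts, but each client
    # learns the weight assigned to it.
    total = enc_prescaled_models[0]
    for enc_model in enc_prescaled_models[1:]:
        total = total + enc_model
    return total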
.1.24 Change in Attack Performance over Training
This experiment studies the attack performance at different stages of model training. We use Transformer-3 to illustrate the trend, as shown in Figure 16. The encryption ratio for random and selective encryption is set to 0.0005 to guarantee good attack performance at the beginning of training. The results indicate that the attack performance decreases as the model becomes more and more useful, which makes sense since the importance of the information contained in the gradient is expected to drop gradually as training approaches convergence. Note that the experiment is conducted on only one model because this part is not the main concern of our study; a more comprehensive setup should include multiple CV and NLP models.
Figure 16: Attack Performance on Transformer-3 over Batch Iterations (Accuracy, SacreBLEU, Google BLEU, ROUGE-1, ROUGE-2, and ROUGE-L) under No Encryption, Selective Encryption, and Random Encryption. Each configuration is attacked 10 times and the best score is recorded. The experiment is repeated on 10 different data points and their mean is presented.
.1.25 MLOps Running Example Configuration
common_args:
  training_type: "cross_silo"
  scenario: "horizontal"
  random_seed: 0

data_args:
  dataset: "cifar100"
  partition_method: "hetero"
  partition_alpha: 0.5

model_args:
  model: "resnet50"

train_args:
  federated_optimizer: "FedAvg"
  client_num_in_total: 3
  client_num_per_round: 3
  comm_round: 5
  epochs: 1
  batch_size: 10
  client_optimizer: sgd
  learning_rate: 0.03
  weight_decay: 0.001

validation_args:
  frequency_of_the_test: 5

device_args:
  worker_num: 2
  using_gpu: true
  gpu_mapping_file: config/gpu_mapping.yaml

comm_args:
  backend: "MQTT_S3"
  mqtt_config_path: config/mqtt_config.yaml
  s3_config_path: config/s3_config.yaml

fhe_args:
  enable_fhe: true
  scheme: ckks
  batch_size: 8192
  scaling_factor: 52
  file_loc: "resources/cryptoparams/"
Figure 17: ResNet-50 MLOps Training Configuration
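As a usage note, the fhe_args block above is what switches homomorphic encryption on and selects the CKKS parameters for this running example. The snippet below is a minimal sketch of how such a YAML file could be read with PyYAML; the loader function and file path are hypothetical and do not reflect FedML's actual configuration API.

```python
import yaml

def load_fhe_args(path: str) -> dict:
    # Hypothetical helper: parse the training configuration and return its fhe_args block.
    with open(path, "r") as f:
        cfg = yaml.safe_load(f)
    fhe = cfg.get("fhe_args", {})
    if fhe.get("enable_fhe", False):
        # CKKS parameters the HE core would be initialized with.
        print(f"FHE enabled: scheme={fhe['scheme']}, "
              f"batch_size={fhe['batch_size']}, scaling_factor={fhe['scaling_factor']}")
    else:
        print("FHE disabled: falling back to plaintext aggregation")
    return fhe

# load_fhe_args("config/fedml_config.yaml")  # hypothetical path
```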
.1.26 The Detailed Procedure of P3V
Initialization
• The sender node N_0 generates a secret x_i and a hash digest y_i for each node along the path. Via an anonymous channel, N_0 sends to each intermediate node N_i along the agreed path, as directed by the network slicing authority, its secret x_i (the receiver additionally receives an extra value x_r), its hash value y_i, and the hash value y_{i+1} for validating its successor. Each hash value y_i = H(x_i ⊕ x_{i+1}), i.e., y_i is the hash of the XOR of x_i and x_{i+1}, except for the receiver's hash value y_n = H(x_n ⊕ x_r). To prove its anonymous identity to the other nodes, the sender also generates a private/public key pair (the public key is distributed to each node via the authorities) and uses its private key to sign the message containing the validation tokens.
• The authority of each node acts as the NIZK generator and produces a proving key pk and a verification key vk (which will be distributed to the node's predecessor via the other authorities).
• Upon receiving messages on the anonymous channel, each node first verifies the anonymous sender's identity by checking the signature with the received public key, and only proceeds if the identity verification passes.
Secret Release
• After the delivery contract is fulfilled, node N_{i+1} sends the secret x_{i+1} to its predecessor.
Proof Generation
• The intermediate node N_i receives the secret x_{i+1} from its successor and generates a proof that it has obtained x_{i+1}.
Proof Verification
• N_i sends the proof to its predecessor N_{i−1}, which verifies it against y_i = H(x_i ⊕ x_{i+1}) (a minimal sketch of this check is given after Figure 18).
• If N_i's proof does not pass, N_{i−1} reports the potential malicious behavior to its authority. The authority requests the NIZK proof from N_i and verifies it. If the proof fails or is not received, the authority punishes the node per policy; if the proof passes, the authority investigates the reporting node per policy.
Figure 18: XOR-Hash-NIZK Path Validation Protocol
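To illustrate the XOR-hash check that Figure 18 relies on, the following is a toy sketch of the sender's token generation and a predecessor's validation step. The NIZK and anonymous-channel components are omitted; the hash function, byte lengths, and the representation of the proof as the plain XOR value are illustrative assumptions, not the exact proof format used in the protocol.

```python
import os
import hashlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def H(data: bytes) -> bytes:
    # SHA-256 as an illustrative collision-resistant hash.
    return hashlib.sha256(data).digest()

# --- Sender-side initialization (N_0) ---
n = 4                                          # nodes on the path (N_1 ... N_n)
x = [os.urandom(32) for _ in range(n)]         # secrets x_1 ... x_n
x_r = os.urandom(32)                           # extra value held by the receiver
# Validation tokens: y_i = H(x_i XOR x_{i+1}); for the last node, y_n = H(x_n XOR x_r).
y = [H(xor_bytes(x[i], x[i + 1])) for i in range(n - 1)] + [H(xor_bytes(x[-1], x_r))]

# --- After secret release: N_i holds x_i and has received x_{i+1} from its successor ---
def make_proof(x_i: bytes, x_next: bytes) -> bytes:
    # Illustrative proof value: the XOR of the two secrets, revealing neither one.
    return xor_bytes(x_i, x_next)

def predecessor_check(y_i: bytes, proof: bytes) -> bool:
    # N_{i-1} validates its successor by hashing the proof and comparing with y_i.
    return H(proof) == y_i

i = 1  # check node N_2 (0-indexed position 1)
print(predecessor_check(y[i], make_proof(x[i], x[i + 1])))        # True for honest behavior
print(predecessor_check(y[i], make_proof(x[i], os.urandom(32))))  # False for a forged secret
```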
Abstract
In recent years, the growing reliance on user data for building server-side applications and services has significantly heightened the importance of data privacy. To meet expanding privacy regulations like GDPR, service providers have turned to privacy-preserving methods that maintain computational functionality while protecting user privacy. However, integrating techniques such as homomorphic encryption into application protocols presents a critical challenge: achieving a balance between privacy and efficiency. This thesis explores two distinct domains within privacy-preserving computation, offering practical, domain-specific solutions to address challenges related to overheads and protocol complexity. The focus is on achieving efficient privacy in both machine learning and networks/IoT.
To illustrate how leveraging domain-specific insights from federated learning, entity resolution, and computer networking can substantially enhance the efficiency of privacy-preserving computation, we make three contributions. First, we introduce a selective encryption strategy for large-scale federated learning models that reduces overhead by encrypting only sensitive parameters while still maintaining robust privacy guarantees. Second, we demonstrate how homomorphic encryption can be optimized for deep entity resolution via a two-stage computation scheme and novel techniques, including synthetic ranging and polynomial degree optimization, that preserve accuracy under encrypted computation. Finally, we apply Non-Interactive Zero-Knowledge proofs to achieve lightweight privacy-preserving path validation across multi-authority network slices, ensuring data-forwarding compliance without revealing sensitive topology details by utilizing a backward pairwise validation procedure. Taken together, these studies highlight how targeting domain-specific challenges via domain-specific knowledge can yield practical, scalable frameworks for efficient privacy-preserving computation.