Formal Analysis of Data Poisoning Robustness of K-Nearest Neighbors by Yannan Li A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE) May 2023 Copyright 2023 Yannan Li for my family ii Acknowledgments Pursuing a Ph.D. has been an incredibly fulfilling and significant experience in my life. It was an immense challenge, and completing it would not have been possible without the assistance and support of numerous individuals I met during my Ph.D. journey. First and foremost, I am deeply grateful to my advisor Chao Wang for his unwavering commit- ment, patience, and consistent guidance in helping me develop into an independent researcher. His genuine passion for research and insatiable curiosity always encouraged me to establish a strong research foundation, enabling me to overcome countless obstacles and make groundbreaking dis- coveries. His boundless creativity consistently prompted me to think critically, as he challenged my ideas from various perspectives. Defending my arguments against his critiques pushed me to refine my thoughts and build stronger cases, ultimately resulting in robust research outcomes and the publication of high-quality research papers. I will forever treasure the wisdom and support he has imparted. His inspiration has propelled me to aim for excellence as a researcher and to emulate his compassionate mentorship. I firmly believe that this is the most appropriate means of expressing my everlasting appreciation for him. I would also really like to thank all committee members in my defense, proposal and qualifying exam: Nenad Medvidovi´ c, Jyotirmoy V . Deshmukh, Pierluigi Nuzzo and Mukund Raghothaman. Their insightful feedback and astute questions have played a crucial role in elevating and refining my dissertation. The lessons gleaned from their guidance were truly indispensable, and I could not have acquired them without their attentive suggestions and remarks on both my presentation and dissertation. Throughout my Ph.D. journey, I was fortunate to be surrounded by amazing labmates and “neighbor” labmates who have enriched my Ph.D. experience with countless delightful memories. iii Jingbo Wang has been an invaluable help in both my research and personal life. Brandon Paulsen, Shengjian (Daniel) Guo, and Chungha Sung, have provided assistance in my research endeavors and job interviews. With Meng Wu, Zunchen Huang, Brian Hyeongseok Kim, Xin Qin, Yuan Xia, Yifei Huang, Yixue Zhao, Mian Wan, Yingjun Lyu, and Zhaoxu Zhang, I have enjoyed memorable experiences such as hikes and dining out. I am deeply thankful for the unwavering support of my numerous friends, both emotionally and physically. To Ziyue Zhu and Lin Yuan, we have all accomplished our goals of obtaining our Ph.D. degrees. To Anguo Hu, Hang Yu, Weiye Wang, and Yuan Meng, I cherish the memories of our gym workouts, gaming adventures and festival celebrations. To Feixuan Wang, Wenxiu Zhang, and Xiaoyi He, your sincere encouragement and kind words will eternally echo in my heart. Finally, I wish to convey my heartfelt gratitude to my devoted parents and loving husband, Zhenyuan. Completing my Ph.D. would have been impossible without their unwavering support and the boundless freedom they have granted me throughout my life. They share in the achievement of this degree. 
During my darkest moments, they not only offered constant encouragement but also allowed me the autonomy to make decisions about my own path. They never imposed their personal desires upon me but instead genuinely listened, sought to understand, and helped me analyze my situation. They trusted my judgment and decisions, providing as much support as possible. I am eternally grateful for their parenting philosophy, which has provided me with a strong foundation, enabling me to take risks and choose what I truly value without fear. iv Contents Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Motivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1.1 Two Ways in which Poisoned Data Affects KNN Prediction Results . . . . 2 1.1.2 Challenges of Applying Formal Analysis . . . . . . . . . . . . . . . . . . 4 1.2 Insight and Hypothesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.1 Insight 1: Abstract Interpretation Facilitates Analysis. . . . . . . . . . . . . 4 1.2.2 Insight 2: Approximation Facilitates Analysis. . . . . . . . . . . . . . . . . 5 1.2.3 Hypothesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.3.1 Certifying Robustness of KNNs under Data Poisoning . . . . . . . . . . . 7 1.3.2 Falsifying Robustness of KNNs under Data Poisoning . . . . . . . . . . . . 7 1.3.3 Certifying Fairness of KNNs under Historical Bias in Training Dataset . . . 7 1.4 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Chapter 2: Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.1 Data-Poisoning Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2 k-Nearest Neighbors (KNN) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 v 2.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3.1 Data Poisoning in General . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3.2 Mitigating Data Poisoning . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.3.3 Certifying the Defenses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.3.4 Leveraging KNN for Attacks or Defenses . . . . . . . . . . . . . . . . . . 12 2.3.5 Mitigating Bias in Machine Learning . . . . . . . . . . . . . . . . . . . . . 13 Chapter 3: Certifying Robustness of KNNs under Data Poisoning . . . . . . . . . . . . 14 3.1 The Intuition and Overview of Proposed Method . . . . . . . . . . . . . . . . . . . 17 3.1.1 Two Ways of Affecting the Prediction Result . . . . . . . . . . . . . . . . 17 3.1.2 Overview of Proposed Method . . . . . . . . . . . . . . . . . . . . . . . . 18 3.2 Analyzing the KNN Parameter Tuning Phase . . . . . . . . . . . . . . . . . . . . . 19 3.2.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 3.2.2 The Label Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 3.2.3 The Removal Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.2.4 Misclassification Error Bounds . . . . . . . . 
. . . . . . . . . . . . . . . . 23 3.3 Analyzing the KNN Prediction Phase . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.3.1 Computing the Classification Labels . . . . . . . . . . . . . . . . . . . . . 26 3.3.2 Pruning Redundant K Values . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.4.1 Results on the Small Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.4.2 Results on the Large Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.4.3 Compared with the Existing Method . . . . . . . . . . . . . . . . . . . . . 32 3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Chapter 4: Falsifying Robustness of KNNs under Data Poisoning . . . . . . . . . . . . 34 4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 4.1.1 The n-Poisoning Robustness . . . . . . . . . . . . . . . . . . . . . . . . . 38 vi 4.1.2 Challenge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 4.1.3 The Baseline Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.2 Overview of Proposed Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 4.3 Quickly Certifying Robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 4.3.1 The QUICKCERTIFY Subroutine . . . . . . . . . . . . . . . . . . . . . . . 43 4.3.2 Two Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.3.3 Correctness and Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . 45 4.4 Reducing the Search Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 4.4.1 Minimal Violating Removal in Neighbors . . . . . . . . . . . . . . . . . . 47 4.4.2 An Illustrative Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 4.4.3 The Reduced Search Space . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.5 Incremental Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4.5.1 The Intuition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 4.5.2 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 4.6 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4.6.1 Evaluation Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.6.2 Results on the Smaller Datasets . . . . . . . . . . . . . . . . . . . . . . . 54 4.6.3 Results on All Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 4.6.4 Effectiveness of Proposed Method and Impact of Poisoning Threshold . . . 56 4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Chapter 5: Certifying Fairness of KNNs under Historical Bias in Training Dataset . . 59 5.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 5.1.1 Fairness of the Learned Model . . . . . . . . . . . . . . . . . . . . . . . . 63 5.1.2 Fairness in the Presence of Dataset Bias . . . . . . . . . . . . . . . . . . . 63 5.2 Overview of Proposed method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 5.3 Abstracting the KNN Prediction Step . . . . . . . . . . . . . . . . . . . . . . . . . 66 5.3.1 Finding the K-Nearest Neighbors . . . . . . . . . . . . . . . . . . . . . . . 66 vii 5.3.1.1 The Challenge. . . . . . . . . . . . . . . . . . . . . . . . . . . . 
66 5.3.1.2 Bounding Distance Between∆ ε (x) and t. . . . . . . . . . . . . . 67 5.3.1.3 Distance Bounds are Compositional. . . . . . . . . . . . . . . . 67 5.3.1.4 Four Cases in Each Dimension. . . . . . . . . . . . . . . . . . . 68 5.3.1.5 Computing overNN Using Bounds. . . . . . . . . . . . . . . . . 70 5.3.2 Checking the Classification Result . . . . . . . . . . . . . . . . . . . . . . 71 5.4 Abstracting the KNN Parameter Tuning Step . . . . . . . . . . . . . . . . . . . . . 72 5.4.1 Overapproximating the Classification Error . . . . . . . . . . . . . . . . . 73 5.4.2 Underapproximating the Classification Error . . . . . . . . . . . . . . . . . 74 5.4.2.1 The LP Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 75 5.4.2.2 Necessary Conditions . . . . . . . . . . . . . . . . . . . . . . . 76 5.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 5.5.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 5.5.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 5.5.3 Results on Efficiency and Accuracy . . . . . . . . . . . . . . . . . . . . . 78 5.5.4 Results on the Certification Rates . . . . . . . . . . . . . . . . . . . . . . . 79 5.5.5 Results on Demographic Groups . . . . . . . . . . . . . . . . . . . . . . . 80 5.5.6 Caveat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 Chapter 6: Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 viii List of Tables 3.1 Statistics of the supervised learning datasets. . . . . . . . . . . . . . . . . . . . . . 29 3.2 Results of the proposed method and the baseline method on the small datasets with the maximal poisoning number n=1, 2, and 3. . . . . . . . . . . . . . . . . . . . . 30 3.3 Results of the proposed method on large datasets, and on small datasets but with larger poisoning numbers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 4.1 Comparing the accuracy of the proposed method with the baseline (ground truth)and two existing methods on the smaller datasets, for which the ground truth can be ob- tained by the baseline enumerative method (Algorithm 8). . . . . . . . . . . . . . . 54 4.2 Comparing the accuracy and efficiency of the proposed method with existing meth- ods on all datasets, with large poisoning thresholds; the percentages of certified and falsified cases are reported in Section 4.6.4 and shown in Figure 4.3. . . . . . . . . 56 5.1 Statistics of the datasets used in the experimental evaluation. . . . . . . . . . . . . 77 5.2 Results for certifying label-flipping and individual fairness (gender), for which ground truth is obtained by naive enumeration, and compared with the proposed method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.3 Results for certifying label-flipping , individual, and ε-fairness by the proposed method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 5.4 Results for certifying label-flipping + ε-fairness with both Race and Gender as protected attributes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 ix List of Figures 1.1 Example of direct influence of the poisoned data. . . . . . . . . . . . . . . . . . . 
2 1.2 Example of indirect influence of poisoned data. . . . . . . . . . . . . . . . . . . . 3 2.1 The KNN algorithm, consisting of the parameter tuning and prediction steps. . . . . 11 3.1 Example of comparing the error bounds. . . . . . . . . . . . . . . . . . . . . . . . 21 3.2 Examples for Algorithm 5 with K= 5, n= 4, and y= orange being the correct label. 26 3.3 Example for Algorithm 6 with K = 5, n = 4, y = orange as correct label, and y ′ = blue as the most frequent wrong label. . . . . . . . . . . . . . . . . . . . . . . 27 3.4 Comparing the proposed method (blue) with Jia et al. [33] (orange): the x-axis is poisoning number n and the y-axis is the percentage of certified test data. . . . . . . 33 4.1 Robust example for QUICKCERTIFY, where the poisoning threshold is n= 2, and candidate K values are{1, 3}. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 4.2 Unknown example for QUICKCERTIFY, where the poisoning threshold is n= 2 and the only two candidate values are K=1 and K=3. . . . . . . . . . . . . . . . . 46 4.3 Results on how the poisoning threshold (in the x-axis) affects the percentages of certified, falsified, and unknown test cases (in the y-axis) in the proposed method. Here, falsified is in ‘− ’, unknown is in ‘.’, and certified is in either ‘|’ (quick certify) or ‘/’ (slow certify). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 5.1 FAIRKNN: the proposed method for certifying fairness of KNNs with label bias. . 61 5.2 Four cases for computing the upper and lower bounds of the distance function d ε i (δ i )=(δ i + A) 2 for δ i ∈[− ε i ,ε i ]. In these figures, δ i is the x-axis, and d ε i is the y-axis,LB denotes LB(d ε i ), andUB denotes UB(d ε i ). . . . . . . . . . . . . . . . . 69 x Abstract As machine learning techniques continue to gain prominence in software systems, ensuring their security has become a crucial software engineering concern. Data poisoning is an emerging secu- rity risk wherein attackers compromise machine learning models by contaminating their training data. This attack poses a significant threat to the safety and integrity of software systems that rely on machine learning technology. However, formally analyzing data poisoning robustness is a challenging task. I designed and implemented a set of formal methods for analyzing, both efficiently and ac- curately, the data-poisoning robustness of the k-nearest neighbors (KNN) algorithm, which is a widely-used supervised machine learning technique. First, I developed a method for certifying the data-poisoning robustness of KNN by soundly overapproximating both the parameter tuning and prediction phases of the KNN algorithm. Second, I developed a method for falsifying data- poisoning robustness of KNN, by quickly detecting the truly-non-robust cases using search space pruning and sampling. Finally, I extended these methods to encompass fairness certification, thus allowing for a more comprehensive analysis of the robustness of KNN. Experimental evaluations demonstrate that the proposed methods are both efficient and accurate in solving these problems. xi Chapter 1 Introduction As machine learning techniques are increasingly used in software systems, the security of these techniques becomes an important software engineering problem. Data poisoning is a type of emerging security risk where the attacker corrupts a machine learning model by polluting its train- ing data. 
Specifically, the attacker aims to change the result of a prediction model by injecting a small number of malicious data elements into the training set used to learn this model. Such attacks are possible, for example, when training data are collected from online repositories or gathered via crowdsourcing. Many existing studies have shown the effectiveness of these attacks, e.g., in malware detection systems [64] and facial recognition systems [12]. Data poisoning is an adversarial attack that aims to corrupt a machine learning model by poi- soning its training data, and thus affect the prediction results for test data [54]. Prior work shows that even a small amount of poisoned data, e.g.,≤ 0.4% of the training set, is enough to affect the prediction result [55, 8, 12]. Thus, analyzing the robustness of the prediction result in the presence of data poisoning is a practically important problem. Specifically, given a potentially-poisoned training set, T , and the assumption that at most n elements in T are poisoned, if we can prove that the prediction result for a test input x remains unchanged by any n poisoned elements in T , the prediction result can still be considered trustworthy. This dissertation focuses on the data poisoning robustness of the k-nearest neighbors (KNN) al- gorithm, which is a widely used supervised learning technique in applications such as e-commerce, 1 video recommendation, document categorization, and anomaly detection [30, 3, 62, 1, 49, 26, 40, 57, 66]. 1.1 Motivations I first introduce the two ways in which poisoned data affects KNN prediction results and then explain why applying formal analysis to KNN data poisoning robustness is challenging. 1.1.1 Two Ways in which Poisoned Data Affects KNN Prediction Results First, assume that the potentially-poisoned training set T may be partitioned into T ′ and(T\ T ′ ), where T ′ contains the clean data elements and(T\ T ′ ) contains the poisoned data elements. The KNN’s parameter K indicates how many neighbors to consider when predicting the class label for a test input x. For example, K = 3 means that the predicted label of x is the label most frequent among the 3-nearest neighbors of x in the training set. ? (a) poisoned set (K=3) Poisoned data ? (b) clean set (K=3) Figure 1.1: Example of direct influence of the poisoned data. One of the two ways in which poisoned data may affect the classification result is called direct influence . In this case, the poisoned elements directly change the K-nearest neighbors of x and thus the most frequent label, as shown in Figure 1.1. 2 Figure 1.1(a) shows only the clean subset T ′ , where the triangles and stars represent the train- ing data elements, and the square represents the test input x. Furthermore, triangle and star repre- sent the two distinct class labels. The goal is to predict the class label of x. In this figure, the dashed circle contains the 3-nearest neighbors of x. Since the most frequent label is star, x is classified as star. Figure 1.1(b) shows the entire training set T , including all of the elements in T ′ as well as a poisoned data element. In this figure, the dashed circle contains the 3-nearest neighbors of x. Due to the poisoned data element, the most frequent label becomes triangle and, as a result, x is mistakenly classified as triangle. ? (a) poisoned dataset (K=3) xs Poisoned data ? (b) clean dataset (K=5) Figure 1.2: Example of indirect influence of poisoned data. 
The other way in which poisoned data may affect the classification result is called indirect influence . In this case, the poisoned elements may not be close neighbors of x, but their presence in T changes the parameter K (Section 2.2 explains how to compute K), and thus the prediction label. Figure 1.2 shows such an example where the poisoned element is not one of the 3-nearest neighbors of x. However, its presence changes the parameter K from 3 to 5 in Figure 1.2(b). As a result, the predicted label for x is changed from star in Figure 1.2(a) to triangle in Figure 1.2(b). 3 The existence of indirect influence makes it incorrect to assume that parameter K is unaffected by poisoned elements and to only focus on cases where poisoned elements are near x; instead, each possible poisoning scenario should be taken into consideration. 1.1.2 Challenges of Applying Formal Analysis It is a challenging problem to apply formal analysis for two reasons. First, KNN relies heavily on numerical analysis, which involves a large number of non-linear arithmetic computations and complex statistical analysis techniques such as p-fold cross valida- tion. They are known to be difficult for existing software analysis and verification techniques. Second, even with a small n, there can be an extremely large number of possible scenarios in which poisoned elements in T may affect the trained model and hence the prediction result. As illustrate in Fig. 1.2, a poisoned data element far away from the test input x can still change the predicted label of x, by changing the optimal value of the parameter K and thus (indirectly) changing the most frequent label of x’s K nearest neighbors. This highlights the fact that the poisoned data’s influence is global – it cannot be soundly approximated by analyzing only the few data elements nearest to x. 1.2 Insight and Hypothesis In this section, I present the insight derived from my research and the hypothesis that this disser- tation aims to test, with the objective of efficiently and accurately analyzing the data-poisoning robustness of the k-nearest neighbors (KNN) algorithm. 1.2.1 Insight 1: Abstract Interpretation Facilitates Analysis. One insight is that employing appropriate abstract interpretation can greatly contribute to the effi- ciency and accuracy of the analysis process. 4 For example, consider the potentially-poisoned training set T =(x 1 ,l B ),(x 2 ,l A ),(x 3 ,l A ), where x i represents a feature vector and l j denotes a class label. Assuming that at most two elements are poisoned in the dataset T , there are 6 possible poisonings: {(x 1 ,l B )}, {(x 2 ,l A )}, {(x 3 ,l A )}, {(x 1 ,l B ),(x 2 ,l A )},{(x 1 ,l B ),(x 3 ,l A )}, and{(x 2 ,l A ),(x 3 ,l A )}. However, when only considering the label counts, there are merely four possible situations: {(l A : 1)},{(l B : 1)},{(l A : 2)}, and{(l A : 1),(l B : 1)}. In the above example, only considering the label counts is a form of abstract interpretation. Here,{(l A : 2)} represents removing an element labeled l A , but it does not specify which of the l A elements. Thus, it encompasses any concrete set with the same label counter. Let|L| be the total number of class labels, which is often small in practice (e.g., 2 or 10), and let there be at most n poisoned data. The number of poisoning count situations is∑ n i=0 i+|L|− 1 i . This value can be exponentially smaller than the number of possible concrete poisoning sets, which is∑ n i=0 |T| i , resulting in a more efficient analysis. 
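To make this gap concrete, the following Python sketch (my own illustration, not part of the dissertation's implementation) enumerates both views for the three-element example above, and also evaluates the two bounds mentioned in the text, ∑_{i=0}^{n} C(i+|L|−1, i) and ∑_{i=0}^{n} C(|T|, i):

    from collections import Counter
    from itertools import combinations
    from math import comb

    # The example training set: one element labeled l_B and two labeled l_A; at most n = 2 are poisoned.
    T = [("x1", "l_B"), ("x2", "l_A"), ("x3", "l_A")]
    n = 2

    # Concrete view: every non-empty subset of T with at most n elements is a possible poisoning.
    poisonings = [c for i in range(1, n + 1) for c in combinations(T, i)]
    print(len(poisonings))   # 6 possible poisonings

    # Abstract view: only the label counts of the removed elements matter.
    strategies = {tuple(sorted(Counter(lbl for _, lbl in c).items())) for c in poisonings}
    print(len(strategies))   # 4 situations: {lA:1}, {lB:1}, {lA:2}, {lA:1, lB:1}

    # General bounds from the text (here including the empty removal, i = 0), with |L| = 2:
    count_bound    = sum(comb(i + 2 - 1, i) for i in range(n + 1))   # at most 6 label-count situations
    concrete_bound = sum(comb(len(T), i) for i in range(n + 1))      # 7 concrete clean subsets
    print(count_bound, concrete_bound)

With a realistic |T| and a small |L|, the second quantity grows combinatorially while the first stays small, which is exactly the efficiency gain described above.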
The level of accuracy is closely linked to the granularity of the abstraction grid. With an appropriate abstraction grid, it is possible to achieve a balance between accuracy and computational efficiency.

1.2.2 Insight 2: Approximation Facilitates Analysis.

Another insight is that employing appropriate over- and under-approximation techniques can greatly contribute to the efficiency and accuracy of the analysis process.

For example, to certify robustness, instead of enumerating all possible scenarios, an over-approximation technique can be employed to estimate the potential changes introduced by poisoning, ensuring a sound certification for robust cases. Conversely, to falsify the robustness, an under-approximation technique can be employed to eliminate scenarios where contaminated elements will not impact the prediction result, ensuring a safe exclusion of irrelevant possible poisonings.

The level of accuracy is closely linked to the granularity of the approximation grid. With an appropriate approximation grid, it is possible to achieve a balance between accuracy and computational efficiency.

1.2.3 Hypothesis

Based on the insights, I state the hypothesis of this dissertation as follows:

A sound formal analysis framework can help analyze (both efficiently and accurately) the robustness of KNNs under data poisoning.

To test the hypothesis, I designed and implemented a set of formal methods for deciding, both efficiently and accurately, the data-poisoning robustness of the KNN algorithm. First, I developed a method for certifying the data-poisoning robustness of KNN, by soundly overapproximating both the parameter tuning and prediction phases of the KNN algorithm. Second, I developed a method for falsifying data-poisoning robustness of KNN, by quickly detecting the truly-non-robust cases using search space pruning and sampling. Finally, I extended these methods to encompass fairness certification, thus allowing for a more comprehensive analysis of the robustness of KNN.

The experimental evaluation demonstrates the efficiency and accuracy of the proposed methods in handling popular supervised-learning datasets. These results confirm the hypothesis of this dissertation.

1.3 Contribution

This dissertation focuses on three formal analysis problems to test my hypothesis: certifying the robustness of KNNs under data poisoning, falsifying the robustness of KNNs under data poisoning, and certifying the fairness of KNNs under historical bias in the training dataset. Throughout these works, I have designed and implemented a set of formal methods that enable efficient and accurate analysis. Below, I elaborate on the contributions of each work.

1.3.1 Certifying Robustness of KNNs under Data Poisoning

I propose a method for certifying robustness of KNNs under data poisoning, by soundly overapproximating the KNN algorithm to consider all possible scenarios in which poisoned elements may affect the prediction result. To the best of my knowledge, the proposed method is the only one that can formally certify n-poisoning robustness of the entire KNN algorithm, including both the parameter tuning phase and the prediction phase. The experimental evaluation shows the accuracy and efficiency of the proposed method in handling popular supervised-learning datasets.
1.3.2 Falsifying Robustness of KNNs under Data Poisoning

I propose a method for falsifying robustness of KNNs under data poisoning, by a novel over-approximate analysis in the abstract domain to quickly narrow down the search space, and systematic testing in the concrete domain to find the actual violations. To the best of my knowledge, this is the only method available for falsifying the complete KNN system, including both the parameter tuning and the prediction phases. The experimental evaluation shows the accuracy and efficiency of the proposed method in handling popular supervised-learning datasets.

1.3.3 Certifying Fairness of KNNs under Historical Bias in Training Dataset

I propose a method for certifying the fairness of KNNs under historical bias in the training dataset, by soundly approximating the complex arithmetic computations used in the state-of-the-art KNN algorithm. To the best of my knowledge, this is the first method for KNN fairness certification in the presence of dataset bias. The experimental evaluation shows the accuracy and efficiency of these techniques in handling popular datasets from the fairness research literature.

1.4 Outline

The remainder of this dissertation is organized as follows. First, I present the technical background and prior work in Chapter 2. Then, I present the main contributions of this dissertation in Chapters 3 to 5. Lastly, I conclude the dissertation in Chapter 6. Chapter 3 presents my work for certifying robustness of KNNs under data poisoning, which has been published [41] in the main track of IEEE/ACM Formal Methods in Computer-Aided Design (FMCAD 2022). Chapter 4 presents my work for falsifying robustness of KNNs under data poisoning, which has been published [43] in the main track of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023). Chapter 5 presents my work for certifying fairness of KNNs under historical bias in the training dataset, which has been published [42] in the main track of the ACM International Conference on Computer Aided Verification (CAV 2023).

Chapter 2 Background

This chapter introduces the necessary background information that is used throughout the dissertation and related work in the three applications this dissertation focuses on.

2.1 Data-Poisoning Robustness

Let L be a supervised learning algorithm M = L(T), which takes a set T = {(x, y)} of training data elements as input and returns a learned model M as output. Within each data element, the input x ∈ X ⊆ R^D is a D-dimensional real-valued feature vector, and the output y ∈ Y ⊆ N is a class label. The model is a prediction function M : X → Y that maps a test input x ∈ X to its class label y ∈ Y. Following Drews et al. [19], we define data-poisoning robustness below.

n-Poisoning Model. Let T be a potentially-poisoned training set, m = |T| be the total number of elements in T, and n be the maximum number of poisoned elements in T. Assuming that we do not know which elements in T are poisoned, the set of all possible scenarios is captured by the set of clean subsets, denoted ∆_n(T) = {T′ ⊆ T : |T \ T′| ≤ n}. In other words, each T′ may be the result of removing all of the poisoned elements from T.

n-Poisoning Robustness. We say the inference result y = M(x) for a test input x ∈ X is robust to n-poisoning attacks of T if and only if, for all T′ ∈ ∆_n(T) and the corresponding model M′ = L(T′), we have M′(x) = M(x). In other words, the predicted label remains the same.
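As a minimal illustration of this definition (and not the certification method developed later in this dissertation), the following Python sketch checks n-poisoning robustness by brute-force enumeration of the clean subsets in ∆_n(T); the callables train and predict are placeholders of my own standing in for the learning algorithm L and the learned model M:

    from itertools import combinations
    from typing import Any, Callable, List, Tuple

    def is_n_poisoning_robust(
        T: List[Tuple[Any, Any]],                        # potentially-poisoned training set
        n: int,                                          # maximum number of poisoned elements
        x: Any,                                          # test input
        train: Callable[[List[Tuple[Any, Any]]], Any],   # learning algorithm L: training set -> model
        predict: Callable[[Any, Any], Any],              # (model, test input) -> predicted label
    ) -> bool:
        """Brute-force check of n-poisoning robustness: the prediction for x must be
        identical for every clean subset T' in Delta_n(T)."""
        default_label = predict(train(T), x)
        for i in range(1, n + 1):
            for removed in combinations(range(len(T)), i):
                removed = set(removed)
                T_clean = [e for j, e in enumerate(T) if j not in removed]
                if predict(train(T_clean), x) != default_label:
                    return False   # some clean subset yields a different label
        return True

This enumeration is exactly the |∆_n(T)| = ∑_{i=0}^{n} C(m, i) blow-up discussed later, which the abstract analysis of Chapter 3 avoids.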
For example, when T = {a, b, c, d} and n = 1, the clean subsets are T_1 = {b, c, d}, T_2 = {a, c, d}, T_3 = {a, b, d} and T_4 = {a, b, c}, which correspond to models M_1, ..., M_4 and inference results y_1 = M_1(x), y_2 = M_2(x), y_3 = M_3(x) and y_4 = M_4(x). Let M be the model obtained from T and y = M(x) be the default output label. The inference result is 1-poisoning robust if and only if y_1 = y_2 = y_3 = y_4 = y.

This robustness definition has two advantages. First, it provides a strong guarantee of trustworthiness. Second, it does not require the actual label of x to be known, which means it is applicable to unlabeled data, which are common in practice.

2.2 k-Nearest Neighbors (KNN)

Unlike many other machine learning techniques, KNN does not have an explicit model M; instead, M can be regarded as the combination of T and K. KNN is a supervised learning algorithm with two phases. As shown in Fig. 2.1, KNN has a parameter tuning phase where KNN_paratune picks the optimal value of the parameter K, which indicates how many neighbors to consider when deciding the output label for a test input x; and a prediction phase where KNN_predict computes the predicted label for an input x using T and a given parameter K.

Inside KNN_paratune, based on the training set T, a technique called p-fold cross validation is used to select the optimal value for K, e.g., from a set of candidate k values in the range [1, |T| × (p−1)/p], by minimizing the classification error, as shown in Line 6. This is accomplished by first partitioning T into p groups of roughly equal size (Line 3), and then computing err_i^k (the set of misclassified samples from G_i) by treating G_i as the evaluation set and T \ G_i as the training set. Here, an input (x, y) ∈ G_i is "misclassified" if the expected output label, y, differs from the output of KNN_predict using the candidate k value.

     1  func KNN_paratune(T) {
     2    for (each candidate k value) {   // conducting p-fold cross validation
     3      Let {G_i} = a partition of T into p groups of roughly equal size;
     4      Let err_i^k = {(x, y) ∈ G_i | y ≠ KNN_predict(T \ G_i, k, x)} for each G_i;
     5    }
     6    Let K = argmin_k (1/p) ∑_{i=1}^{p} |err_i^k| / |G_i|;
     7    return K;
     8  }
     9
    10  func KNN_predict(T, K, x) {
    11    Let T_x^K = the K nearest neighbors of x in T;
    12    Let Freq(T_x^K) = the most frequent label in T_x^K;
    13    return Freq(T_x^K);
    14  }

Figure 2.1: The KNN algorithm, consisting of the parameter tuning and prediction steps.

Inside KNN_predict, given an unlabeled test vector x ∈ X, the K nearest neighbors of x in T are used to compute the most frequent label Freq(T_x^K), which is returned as the output label of x. The distance between data elements, which is used to find the nearest neighbors, is defined on the input feature vectors. The most widely used metric is the Euclidean distance: given two elements x_a, x_b ∈ X ⊆ R^D, where D is the dimension of the input feature vector, the Euclidean distance is √( ∑_{i=1}^{D} (x_a[i] − x_b[i])^2 ).

2.3 Related Work

2.3.1 Data Poisoning in General

KNN is not the only type of machine learning technique found vulnerable to adversarial data poisoning; prior work shows that regression models [46], support vector machines (SVM) [8, 65, 63], clustering algorithms [9], and neural networks [55, 58, 17, 68] are also vulnerable. Unlike our work, this line of research is primarily concerned with showing the security threats and identifying the poisoning sets, which is often formulated as a constrained optimization problem.
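For readers who prefer running code to pseudocode, the following short Python sketch is one possible concrete rendering of the two-phase procedure in Figure 2.1 (Section 2.2), using Euclidean distance and a plain p-fold split; it is written only for illustration under these assumptions and is not the implementation evaluated in this dissertation (in particular, ties are broken arbitrarily here rather than lexicographically):

    import numpy as np

    def knn_predict(T, K, x):
        # Predict the label of x as the most frequent label among its K nearest neighbors in T.
        X = np.array([xi for xi, _ in T], dtype=float)
        labels = [yi for _, yi in T]
        dist = np.sqrt(((X - np.asarray(x, dtype=float)) ** 2).sum(axis=1))  # Euclidean distance
        nearest = np.argsort(dist)[:K]
        votes = [labels[i] for i in nearest]
        return max(set(votes), key=votes.count)  # most frequent label (arbitrary tie-breaking)

    def knn_paratune(T, candidate_Ks, p=5):
        # Pick K by p-fold cross validation, minimizing the average misclassification rate
        # (assumes p <= |T| so that every group is non-empty).
        groups = [T[i::p] for i in range(p)]     # p groups of roughly equal size
        best_K, best_err = None, float("inf")
        for K in candidate_Ks:
            fold_errors = []
            for i, G in enumerate(groups):
                rest = [e for j, g in enumerate(groups) if j != i for e in g]
                wrong = sum(1 for (xi, yi) in G if knn_predict(rest, K, xi) != yi)
                fold_errors.append(wrong / len(G))
            avg_err = sum(fold_errors) / p
            if avg_err < best_err:
                best_K, best_err = K, avg_err
        return best_K

A typical use would be K = knn_paratune(T, candidate_Ks=range(1, 16, 2)) followed by knn_predict(T, K, x).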
11 2.3.2 Mitigating Data Poisoning Techniques have been proposed to mitigate data poisoning for various machine learning algo- rithms [56, 59, 31, 24, 7]. There are also techniques [35, 44] for assessing the effectiveness of mit- igation techniques such as data sanitization [35] and differentially-private countermeasures [44]. More recently, Bahri et al. [5] propose a method that leverages both KNN and a deep neural net- work to remove mislabeled data. 2.3.3 Certifying the Defenses There is also a growing interest in studying certified defenses [52, 38, 32] where accuracy is guar- anteed either probabilistically or in a deterministic manner. For example, Rosenfeld et al. [52] leverage randomized smoothing to guarantee test-time robustness to adversarial manipulation with high probability. However, it is a probabilistic guarantee instead of formal guarantee provided by our method. Levine et al. [38] certify robustness of a defense by deriving a lower bound of classi- fication error, which relies on their deep partition aggregation (DPA) learning and is not applicable to typical learning approaches. Jia et al. [32] show that, when the poisoning set is bounded in a certain manner, ensemble learning can provably predict the correct classification. 2.3.4 Leveraging KNN for Attacks or Defenses There are techniques that leverage KNN to generate attacks or provide defenses for other machine learning models. For example, Li et al. [39] present a data-poisoning attack that leverages KNN to maximize the effectiveness of malicious behavior while mimicking the user’s benign behavior. Peri et al. [50] use KNN to defend against adversarial input based attacks, although it focuses only on tweaking the test input during the inference phase. 12 2.3.5 Mitigating Bias in Machine Learning There are also techniques for mitigating bias in machine learning systems. Some focus on improv- ing the learning algorithms using random smoothing [51], better embedding [10] or fair represen- tation [53], while others rely on formal methods such as iterative constraint solving [60]. There are also techniques for repairing models to make them fair [2]. Except for Ruoss et al. [53], most of them focus on group fairness such as demographic parity and equal opportunity; they are signifi- cantly different from our focus on certifying individual and ε-fairness of the classification results in the presence of dataset bias. Since fairness is a type of non-functional property, the verification/certification techniques are often significantly different from techniques used to verify/certify functional correctness. Instead, they are more closely related to techniques for verifying/certifying robustness [11] or noninter- ference [6] properties of a program, where the program may run multiple times, each time for a slightly different input drawn from a large (and sometimes infinite) set, to see if they all agree on the output. At a high level, this is related to a class of k-safety hyperproperties [25]. 13 Chapter 3 Certifying Robustness of KNNs under Data Poisoning In this chapter, I propose a method for certifying robustness of KNNs under data poisoning, by soundly overapproximating the KNN algorithm to consider all possible scenarios in which poi- soned elements may affect the prediction result. Unlike existing methods which only analyze the prediction phase but not the significantly more complex parameter tuning phase, the proposed method is capable of certifying the entire KNN algorithm. 
The experimental evaluation shows that the proposed method is signifi- cantly more accurate than existing methods, and is able to prove the robustness of KNNs generated from popular supervised-learning datasets. As the n-Poisoning Robustness defined in Section 2.1, the certification problem can be con- sidered as proving that, regardless of which of the n data elements in T are poisoned, they do not change the predicted labels of the test data. The certification problem is challenging if we attempt to explicitly check each possible poisoning scenario, as it requires considering a vast number of cases. Let the number of data elements in T be m=|T|. Since up to n elements in T may be poisoned, the number of possible clean subsets of T is∑ n i=0 m i . This number can be astronomically large in practice. Let∆ n (T) represent the set of all possible clean subsets of T . To certify data-poisoning robustness, we must prove that, for all T ′ ∈∆ n (T) and the corresponding model M ′ = L(T ′ ), the classification label of a test input x, denoted y ′ = M ′ (x), remains the same. When m= 100 and n= 5, for example,|∆ n (T)| is close to 8∗ 10 7 . Thus, it is practically impossible to explicitly check, for each T ′ ∈∆ n (T), the prediction result of M ′ generated from T ′ . 14 A practical approach, which is the one used by the proposed method, is to soundly over- approximate the impact of all the clean subsets while analyzing the machine learning algorithm following the abstract interpretation [15] paradigm. Here, the word soundly means that the pro- posed method guarantees that, as long as the over-approximated prediction result is proved robust, the actual prediction result is robust. In addition, the proposed method is efficient in that, instead of training a model for each clean subset T ′ , it combines all clean subsets together to compute a set of abstract models in a single pass. For KNN, in particular, each model corresponds to an optimal value of the parameter K, indi- cating how many neighbors are used to infer the output label of a test input x. Thus, the proposed method computes an over-approximated set of K values, denoted KSet. Then, it over-approximates the KNN’s prediction phase, to check if the output label of x remains the same for all K∈ KSet. If the output label remains the same, the prediction result for x is considered robust to any n-poisoning attack of the training set T . To the best of my knowledge, the proposed method is the only one that can formally certify n-poisoning robustness of the entire KNN algorithm, consisting of both the parameter tuning phase and the prediction phase. In the literature, there are two closely related prior works. The first one, by Jia et al. [33], aims to certify the robustness of KNN’s prediction phase only; in other words, they require the K value to be fixed and given, with the implicit assumption that the optimal K value is not affected by data poisoning. Unfortunately, this is not a valid assumption, as shown in Section 1.1.1. Furthermore, by fixing the K value, they side-step the more challenging part of the certification problem, which is certifying the p-fold cross validation during KNN’s parameter tuning phase. How to over-approximate KNN’s parameter tuning phase soundly and efficiently is a main contribution of this work. The other closely-related prior work, by Drews et al. [19], aims to prove robustness of a differ- ent machine learning technique, namely the decision tree learning (DTL) algorithm. 
Since DTL differs significantly from KNN in that it relies primarily on logical operations (such as And, Or, 15 and Negation) as opposed to nonlinear arithmetic computations, their certification method relies on a fundamentally different technique (symbolic path exploration). At a high level, the proposed method works as follows. Given a tuple⟨T,n,x⟩ as input, where T is the potentially-poisoned training set, n is the maximum number of poisoned elements in T , and x is a test input, the proposed method tries to prove that, no matter which i≤ n elements in T are poisoned, the KNN’s prediction result for x remains the same. By default, the training set T will lead to a model M, whose prediction result for x will be y= M(x). Using an overapproximated analysis, the proposed method checks if the output label y ′ = M ′ (x) produced by any clean subset of T ′ ⊆ T , and its corresponding model M ′ , remains the same as the default label y= M(x). If that is the case, the proposed method certifies the robustness of the prediction result. Otherwise, it remains inconclusive. I have implemented the proposed method and conducted experimental evaluation using popular machine learning datasets, which include both small and large datasets. The small datasets are particularly useful in evaluating the accuracy of the certification result because, when datasets are small, even the baseline approach of explicitly enumerating all clean subsets is fast enough to complete and obtain the ground truth. The large datasets, some of which have more than 50,000 training data elements and thus are well beyond the reach of the baseline enumeration approach, are useful in evaluating the efficiency of the proposed method. For comparison, I also evaluated the method of Jia et al. [33] with fixed K values. The experimental results show that, for KNN’s prediction phase only, the proposed method is significantly more accurate than the method of Jia et al. [33] and as a result, proves robustness for many more cases. Overall, the proposed method is able to achieve similar empirical accuracy as the ground truth on small datasets, while being reasonably accurate on large datasets and several orders-of-magnitudes faster than the baseline method. In particular, the proposed method is the only one that can finish the complete certification of 10,000 test inputs for a training dataset with more than 50,000 elements within half an hour. To summarize, this paper has the following contributions: 16 • I propose the first method for soundly certifying data-poisoning robustness of the entire KNN algorithm, consisting of both the parameter tuning phase and the prediction phase. • I evaluate the method on popular supervised learning datasets to demonstrate its advantages over both the baseline and a state-of-the-art technique. The reminder of this paper is organized as follows. First, I present the intuition and overview of the proposed method in Section 3.1. Next, I present the proposed method for overapproximating the KNN’s parameter tuning phase in Section 3.2 and the proposed method for overapproximating the KNN’s prediction phase in Section 3.3. Then, I present the experimental results in Section 3.4. Finally, I give the summary in Section 3.5. 3.1 The Intuition and Overview of Proposed Method I first present the intuition behind the proposed method, and then give an overview of the method in contrast to the baseline. 
3.1.1 Two Ways of Affecting the Prediction Result

As introduced in Section 1.1.1, there are two ways in which poisoned training elements in T affect the prediction result. One, called direct influence, is to change the neighbors of x and thus their most frequent label. The other one, called indirect influence, is to change the parameter K itself.

These two ways of influence highlight the importance of analyzing both the parameter tuning phase and the prediction phase of the KNN algorithm. Otherwise, certification would be unsound, which is the case for Jia et al. [33] due to their (incorrect) assumption that K is not affected by poisoned elements in T. In contrast, the proposed method soundly certifies both phases of the KNN algorithm. While certifying the prediction phase itself is already challenging, certifying the parameter tuning phase is even more challenging, since it uses p-fold cross validation to compute the optimal K value.

3.1.2 Overview of Proposed Method

Before presenting the proposed method, I present a conceptually-simple, but computationally-expensive, baseline method and then compare it with the proposed method.

    Algorithm 1: Baseline method KNN_Certify(T, n, x).
        for each T′ ∈ ∆_n(T) do
            K′ ← KNN_paratune(T′)
            y′ ← KNN_predict(T′, K′, x)
            YSet ← YSet ∪ {y′}
        end
        robust ← (|YSet| = 1)

The Baseline Method. This method relies on checking whether the prediction result remains the same for all possible ways in which the training set is poisoned. Algorithm 1 shows the pseudocode, where T is the training set, n is the maximal poisoning number, and x is a test input. For each clean subset T′ ∈ ∆_n(T), the parameter K is computed using the standard KNN_paratune subroutine, and used to predict the label of x using the standard KNN_predict subroutine. Here, YSet stores the set of predicted labels; thus, |YSet| = 1 means the prediction result is always the same (and hence robust).

The baseline method is both sound and complete, and thus may be used to obtain the ground truth when the size of the dataset is small enough. However, it is not a practical solution for large datasets because of the combinatorial blowup: it has to explicitly enumerate all |∆_n(T)| = ∑_{i=0}^{n} C(m, i) cases. Even for m = 100 and n = 5, the number becomes as large as 8 × 10^7. For realistic datasets, often with tens of thousands of elements, the baseline method cannot even be finished within a billion years.

    Algorithm 2: Proposed method abs_KNN_Certify(T, n, x).
        KSet ← abs_KNN_paratune(T, n)
        YSet ← abs_KNN_predict(T, n, KSet, x)
        robust ← (|YSet| = 1)

    Algorithm 3: Subroutine for the baseline: KNN_paratune(T).
        Divide T into p groups {G_i} of equal size;
        for each K ∈ CandidateKSet do
            for each group G_i do
                errCnt_i^K = 0
                for each sample (x, y) ∈ G_i do
                    errCnt_i^K ++ when (KNN_predict(T \ G_i, K, x) ≠ y);
                error_i^K = errCnt_i^K / |G_i|
            error^K = (1/p) ∑_{i=1}^{p} error_i^K
        return the K value with the smallest error^K

The Proposed Method. The proposed method avoids enumerating the individual scenarios in ∆_n(T). As shown in Algorithm 2, it first analyzes, in a single pass, the KNN's parameter tuning phase while simultaneously considering the impact of up-to-n poisoning elements in T. The result of this over-approximated analysis is a superset of possibly-optimal K values, stored in KSet. Details of the subroutine abs_KNN_paratune are presented in Section 3.2. Then, for each K ∈ KSet, the proposed method analyzes the KNN's prediction phase while considering all possible ways in which up-to-n elements in T may be poisoned.
The result of this over-approximated analysis is a superset of possible output labels, denoted YSet. We say the prediction result for x is robust if the cardinality of YSet is 1; that is, the label of x remains the same regardless of how T was poisoned. Details of the subroutine abs_KNN_predict are presented in Section 3.3.

3.2 Analyzing the KNN Parameter Tuning Phase

To understand why soundly analyzing the KNN parameter tuning phase is challenging, we need to review the original subroutine, KNN_paratune, shown in Algorithm 3, which computes the optimal K value using p-fold cross-validation (see Section 2.2 for the detailed explanation).

    Algorithm 4: Subroutine KSet = abs_KNN_paratune(T, n).
        Divide T into p groups {G_i} of equal size;
        for each K ∈ CandidateKSet do
            for each group G_i do
                errCntLB_i^K = errCntUB_i^K = 0;
                for each sample (x, y) ∈ G_i do
                    errCntLB_i^K ++ if (abs_KNN_cannot_obtain_correct_label(T \ G_i, n, K, x, y) == True);
                    errCntUB_i^K ++ if (abs_KNN_may_obtain_wrong_label(T \ G_i, n, K, x, y) == True);
                errorLB_i^K = max{0, (errCntLB_i^K − n) / (|G_i| − n)};
                errorUB_i^K = min{errCntUB_i^K / (|G_i| − n), 1};
            errorLB^K = (1/p) ∑_{i=1}^{p} errorLB_i^K;
            errorUB^K = (1/p) ∑_{i=1}^{p} errorUB_i^K;
        Let minUB = the smallest errorUB^K over all K;
        KSet = {K | errorLB^K ≤ minUB};

3.2.1 The Algorithm

In contrast, the proposed method shown in Algorithm 4 computes an over-approximated set of K values. The input consists of the training set T and the maximal poisoning number n, while the output KSet is a superset of the optimal K values.

Inside Algorithm 4, the proposed method first computes the lower and upper bounds of the misclassification error for each K value, by considering the best case (errorLB^K) and the worst case (errorUB^K) when up-to-n elements in T are poisoned. After computing the interval [errorLB^K, errorUB^K] for each K value, it computes minUB, which is the minimal upper bound among all K values. Then, by comparing minUB with the errorLB^K for each K, it over-approximates the set of possible K values that may become the optimal K value for some T′ ∈ ∆_n(T). Here, the intuition is that, by excluding K values that are definitely not the optimal K for any T′ ∈ ∆_n(T) (namely, the ones whose errorLB^K is larger than minUB), we obtain a sound over-approximation in KSet.

Figure 3.1: Example of comparing the error bounds. [Plot: one vertical bar per candidate K showing the interval [errorLB^K, errorUB^K], with a dashed horizontal line at minUB.]

Example for minUB. Fig. 3.1 shows an example, where each vertical bar represents the interval [errorLB^K, errorUB^K] of a candidate K value, and the blue dashed line represents minUB. The selected K values are those corresponding to the blue bars, since their errorLB^K values are smaller than minUB. The K values corresponding to the gray bars are dropped, since they definitely cannot have the smallest misclassification error.

The Soundness Guarantee. To understand why the KSet computed in this manner is an over-approximation, assume that minUB = errorUB^{K′} for some value K′. I now explain why K cannot be the optimal value (with the smallest error) when errorLB^K > minUB. Let the actual errors be error^K ∈ [errorLB^K, errorUB^K] and error^{K′} ∈ [errorLB^{K′}, errorUB^{K′}]. Since we have errorLB^K > errorUB^{K′}, we know error^K must be larger than error^{K′}. Therefore, K cannot have the smallest error.

To compute the interval [errorLB^K, errorUB^K], we add up the misclassification error for each element (x, y) ∈ G_i, where x ∈ X is the input and y ∈ Y is the (correct) label.
For each element (x,y), there is a misclassification error if, for some reason, y differs from the predicted label. Here, errCntLB K i corresponds to the best case scenario — removing n elements from T in such a way that prediction becomes as correct as possible. In contrast, errCntUB K i corresponds to the worst case scenario — removing n elements from T in such a way that prediction becomes as 21 incorrect as possible. These two error counts are computed by two subroutines, which will be presented later in this section. To convert errCntLB K i and errCntUB K i to error rates, we consider removing n misclassified elements when computing the lower bound errorLB K i , and removing n correctly-classified data elements when computing the upper bound errorUB K i . We assume n<|G i |, which is a reasonable assumption in practice. To explain subroutines abs cannot obtain correct label and abs may obtain wrong label, we need to define some notations, including label counter and removal strategy. 3.2.2 The Label Counter Nearest Neighbors T K x . Let T K x be a subset of T consisting of the K nearest neighbors of x. For example, given T ={((0.1, 0.1), l 2 ), ((1.1, 0.1), l 1 ), ((0.1, 1.1), l 1 ), ((2.1, 3.1), l 3 ), ((3.3, 3.1), l 3 )}, test input x=(1.1,1.1), and K= 3, the set is T 3 x ={((0.1, 0.1), l 2 ), ((1.1, 0.1), l 1 ), ((0.1, 1.1), l 1 )}. Label CounterE(T K x ). Given any dataset Z, including T K x , we useE(Z)={ (l i : #l i )} to represent the label counts, where l i is a class label, and #l i ∈N is the number of elements in Z that have the label l i . For example, given T 3 x above, we haveE(T 3 x )={(l 1 : 2),(l 2 : 1)}, meaning it has two elements with label l 1 and one with label l 2 . Most Frequent Label Freq(E(T K x )). Given a label counterE , the most frequent label, denoted Freq(E), is the label with the largest count. Similarly, we can define the second most frequent la- bel. Thus, the KNN prediction phase can be described as computing Freq(E(T K x )) for the training set T , test input x, and K value. Tie-Breaker 1 (l i <l j ) . If two labels have the same frequency, we use their lexicographic order as a tie-breaker: Let< be the order relation,(l i < l j ) must be either true or false. Thus, we define an indicator function, 1 (l i <l j ) , to return the numerical value 1 (or 0) when(l i < l j ) is true (or false). 22 3.2.3 The Removal Strategy The removal strategy is an abstract way of modeling the impact of poisoned data elements. In contrast, the removal set is a concrete way of modeling the impact. The Removal Set. Given a dataset Z, the removal set R⊂ Z can be any subset of Z. Given T 3 x above, for example, there are 6 possible removal sets: R 1 ={(x 1 ,y 1 )}, R 2 ={((x 2 ,y 2 ))}, R 3 ={(x 3 ,y 3 )}, R 4 ={(x 1 ,y 1 ),(x 2 ,y 2 )}, R 5 ={(x 1 ,y 1 ),(x 3 ,y 3 )}, and R 6 ={(x 2 ,y 2 ),(x 3 ,y 3 )}. In particular, R 1 means removing element(x 1 ,y 1 ) from Z. The Removal Strategy. The removal strategy is simply the label counter of a removal set R, denotedS =E(R). In the above example, the six removal sets correspond to only four removal strategiesS 1 ={(l 1 : 1)},S 2 ={(l 2 : 1)},S 3 ={(l 1 : 1),(l 2 : 1)}, andS 4 ={(l 1 : 2)} . In particular,S 2 means removing an element labeled l 2 ; however, it does not care which of the l 2 elements is removed. Thus, it captures any removal set that has the same label counter. The Strategy Size. 
Let a removal strategy be denoted S = {(l_i : #l_i)}; we define its size as ||S|| = ∑_{(l_i, #l_i) ∈ S} #l_i, i.e., the total number of removed elements. For S_1 = {(l_1 : 1)}, S_2 = {(l_2 : 2)}, and S_3 = {(l_1 : 1), (l_3 : 3)}, the strategy sizes would be ||S_1|| = 1, ||S_2|| = 2, and ||S_3|| = 4.

Following the abstract interpretation [15] paradigm, we view the removal sets as the concrete domain and the removal strategies as the abstract domain. Focusing on the abstract domain makes the proposed method more efficient. Let |L| be the total number of class labels, which is often small in practice (e.g., 2 or 10). Since the count of each label in a removal set is at most n, the number of removal strategies is at most ∑_{i=0}^{n} C(i+|L|−1, i). This can be exponentially smaller than the number of removal sets, which is ∑_{i=0}^{n} C(|T|, i).

3.2.4 Misclassification Error Bounds

Using the notations defined so far, I present the proposed method for computing the lower and upper bounds, errCntLB_i^K and errCntUB_i^K, as shown in Algorithms 5 and 6. Both bounds rely on computing T_x^{K+n}, the K+n nearest neighbors of x in T, and the label counter E(T_x^{K+n}).

• The first subroutine checks whether it is impossible, even after removing up-to-n elements from T, that the correct label y becomes the most frequent label.
• The second subroutine checks whether it is possible, after removing up-to-n elements from T, that some wrong label becomes the most frequent label.

Before explaining the details, I present Theorem 1, which states the correctness of these checks. It says that, to model the impact of all subsets T′ ∈ ∆_n(T), we only need to analyze the (K+n) nearest neighbors of x, stored in T_x^{K+n}.

Theorem 1. ∀ T′ ∈ ∆_n(T), we have Freq(E((T′)_x^K)) ∈ { Freq(E(T_x^{K+n}) \ S) | S ⊂ E(T_x^{K+n}), ||S|| ≤ n }.

For brevity, the detailed proof is omitted. Instead, the intuition behind the proof is as follows:

• For each clean training subset T′ ∈ ∆_n(T), we can always find a label counter E(T_x^{K+i}) and a removal strategy S ⊆ E(T_x^{K+i}), where ||S|| = i ≤ n, satisfying E(T_x^{K+i}) \ S = E((T′)_x^K).
• If we want to check all the predicted labels of x generated by all T′ ∈ ∆_n(T), we need to search through all of E(T_x^K), E(T_x^{K+1}), ..., E(T_x^{K+n}), which is expensive when n is large.
• Fortunately, E(T_x^{K+n}) \ S, where ||S|| ≤ n, contains all the possible scenarios denoted by E(T_x^{K+i}) \ S, where ||S|| = i and i = 0, ..., n−1.

As a result, we only need to analyze E(T_x^{K+n}), which corresponds to the (K+n) nearest neighbors of x; other elements which are further away from x can be safely ignored.

    Algorithm 5: Subroutine used in Algorithm 4: flag = abs_KNN_cannot_obtain_correct_label(T, n, K, x, y).
        Let E(T_x^{K+n}) be the label counter of T_x^{K+n};
        Define removal strategy S = {(y′ : #y′ − #y + 1_{y′<y}) | (y′ : #y′) ∈ E(T_x^{K+n}), y′ ≠ y, #y′ ≥ #y};
        return (||S|| > n);

    Algorithm 6: Subroutine used in Algorithm 4: flag = abs_KNN_may_obtain_wrong_label(T, n, K, x, y).
        Let E(T_x^{K+n}) be the label counter of T_x^{K+n};
        Let y′ be the most frequent label in E(T_x^{K+n}) except the label y;
        Define removal strategy S = {(y : max{0, #y − #y′ + 1_{y<y′}})};
        return (||S|| ≤ n);

Algorithm 5. To compute the lower bound errCntLB_i^K, Algorithm 5 checks if all the strategies S satisfying Freq(E(T_x^{K+n}) \ S) = y and S ⊂ E(T_x^{K+n}) must have ||S|| > n.

Fig. 3.2 shows two examples. In each example, the gray dot is the test input x and the other dots are neighbors of x in T_x^{K+n}. In Fig. 3.2 (a), #orange = 2 is the number of orange dots (votes of the correct label).
In contrast, #blue= 5 and #green= 2 are votes of the incorrect labels. By assuming the lexicographic order blue< green< orange, we define the indicator function as 1 blue<orange = 1 and 1 green<orange = 1. Given the removal strategyS ={(blue : 4),(green : 1)}, we know||S||= 5 and, since n= 4, we have||S||> n. Thus, removing up to n=4 dots cannot make the test input x correctly classified (as orange). As a result, errCntLB K i ++ is executed to increase the lower bound. In Fig. 3.2 (b), however, since #blue= 4, #orange= 3, 1 blue<orange = 1, andS ={(blue : 2)}, we have||S||= 2. Since||S||≤ n, removing up to n=4 dots can make the test data x correctly classified (as orange). As a result, errCntLB K i ++ is not executed. Algorithm 6 To compute the upper bound errCntUB K i , Algorithm 6 checks if there exists a strategyS that satisfies the condition: Freq(E(T K+n x )\S)̸= y,S ⊂ E(T K+n x ), and||S||≤ n. Fig. 3.3 shows two examples. In Fig. 3.3 (a), #orange= 2 is the number of correct label, and #blue= 5 is the number of dots with the most frequent wrong label. Thus,S = / 0 and since 25 ? (a)S ={(blue : 4),(green : 1)} and return true. ? (b)S ={(blue : 2)} and return f alse. Figure 3.2: Examples for Algorithm 5 with K= 5, n= 4, and y= orange being the correct label. ||S||≤ n, we know that removing up to n= 4 elements can make the test data misclassified. As a result, errCntUB K i ++ is executed. In Fig. 3.3 (b), #orange= 7 is the number of orange dots, #blue= 2 is the number of dots with the most frequent wrong label. Here, we assume 1 orange<blue = 0. Thus,S ={(orange : 5))} and since||S||> n, we know that removing up to n= 4 dots cannot make ‘blue’ (or any other wrong label) the most frequent label. As a result, errCntUB K i ++ is not executed. 3.3 Analyzing the KNN Prediction Phase In this section, I present the proposed method for analyzing the KNN prediction phase, imple- mented in Algorithm 2 as the subroutine Y Set = abs KNN predict(T,n,KSet,x), which returns a set of predicted labels for test input x, by assuming that T contains up-to-n poisoned elements. 3.3.1 Computing the Classification Labels Algorithm 7 shows the proposed method, which first checks whether the second most frequent label (y ′ ) can become the most frequent one after removing at most n elements. This is possible only if there exists a strategyS such that (1) it removes at most n elements labeled y, and (2) 26 ? (a)S = / 0 and return true. ? (b)S ={(orange : 5)} and return f alse. Figure 3.3: Example for Algorithm 6 with K= 5, n= 4, y= orange as correct label, and y ′ = blue as the most frequent wrong label. after the removal, y ′ becomes the most frequent label. This is captured by the condition||S||= (#y− #y ′ + 1 y<y ′)≤ n. Otherwise, the predicted label is not unique. We do not attempt to compute more than two labels, as shown by the return statement in the then-branch, because they are not needed by the top-level procedure (Algorithm 2), which only checks if|Y Set|= 1. 3.3.2 Pruning Redundant K Values Inside Algorithm 7, after checking K∈ KSet, the proposed method puts K into the visited set to make sure it will never be checked again for the same test input x. In addition, it identifies other values in KSet that are guaranteed to be equivalent to K, and prunes away these redundant values. Here, equivalent K values are defined as those with the same prediction result for test input x. To be conservative, we underapproximate the set of equivalent K values. 
As a result, these K values can be safely skipped since the (equivalent) prediction result has been checked. This optimization is implemented using the visited set in Algorithm 7. The visited set is computed from K andE(T K+n x ) based on the expression(#y− #y ′ − n− 1 y ′ <y ) over the removal strategy. 27 Algorithm 7: Methodabs KNN predict(T,n,KSet,x). Y Set={} visited={} while ∃K∈(KSet\ visited) do LetE(T K+n x ) be the label counter of T K+n x ; Let y be the most frequent label ofE(T K+n x ); Let y ′ be the second most frequent label ofE(T K+n x ); Let removal strategyS ={(y : #y− #y ′ + 1 y<y ′)}; if||S||≤ n then Y Set= Y Set∪{y,y ′ }; return Y Set; else Y Set= Y Set∪{y}; K LB = K− (#y− #y ′ − n− 1 y ′ <y ); K UB = K+(#y− #y ′ − n− 1 y ′ <y ); visited= visited∪[K LB ,K UB ] return Y Set; The Correctness Guarantee I now explain why this pruning technique is safe. The intuition is that, if the most frequent label Freq(E(T K+n x )) is the label with significantly more counts than the second most frequent label, then it may also be the most frequent label for another value K ′ . There are two possibilities: • If(K ′ < K), then T K ′ +n x has(K− K ′ ) fewer elements than T K+n x . Since removing elements from the neighbors will not increase the label count #y ′ , the only way to change the prediction result is decreasing the label count #y. When(K− K ′ )≤ (#y− #y ′ − n− 1 y ′ <y ), decreasing #y will not make any difference. Thus, the lower bound of K ′ is K− (#y− #y ′ − n− 1 y ′ <y ). • If(K ′ > K), then T K ′ +n x has(K ′ − K) more elements than T K+n x . Since adding elements to the neighbors will not decrease the label count #y, the only way to change the prediction result is increasing the label count #y ′ . However, as long as(K ′ − K)≤ (#y− #y ′ − n), increasing #y ′ will not make any difference. Thus, the upper bound of K ′ is K+(#y− #y ′ − n− 1 y ′ <y ). For example, consider K = 13, n = 2, andE(T 15 x )={(l 1 : 12),(l 2 : 2),(l 3 : 1)}. According to Algorithm 7, #y− #y ′ − n− 1 y ′ <y = 12− 2− 2= 8 and thus we compute the interval[13− 8,13+ 8]=[5,21]. As a result, K values in the set{5,6,7,...,21} can be safely skipped. 28 Table 3.1: Statistics of the supervised learning datasets. Name # training data # test data # output label # input dimension (|T|) (|XSet|) (L ) (D) Iris [27] 135 15 3 4 Digits [29] 1,617 180 10 64 HAR [4] 9,784 515 6 561 Letter [28] 18,999 1,000 26 16 MNIST [37] 60,000 10,000 10 36 CIFAR10 [36] 50,000 10,000 10 288 3.4 Experiments I have implemented the proposed method in Python and using the machine learning library scikit- learn 0.24.2, and evaluated it on two sets of supervised learning datasets. Table 3.1 shows the statistics, including the name, size of the training set, size of the test set, number of class labels, and dimension of the input feature space. For MNIST and CIFAR10, in particular, the features were extracted using the standard histogram of oriented gradients (HOG) method [16]. The first set consists of Iris and Digits, two small datasets for which even the baseline method as shown in Algorithm 1 can finish and thus obtain the ground truth. We use the ground truth to evaluate the accuracy of the proposed method. The second set consists of HAR, Letter, MNIST, and CIFAR10, which are larger datasets used to evaluate the efficiency of the proposed method. For comparison purposes, I also implemented the baseline method in Algorithm 1, and the method of Jia et al. [33], which represents the state of the art. 
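The HOG feature extraction mentioned above for MNIST is straightforward to reproduce with scikit-image; the exact parameters are not listed here, so the ones below are an assumption chosen only because they yield the 36-dimensional vectors reported in Table 3.1.

import numpy as np
from skimage.feature import hog

def extract_hog_features(images_28x28):
    # Map each 28x28 grayscale image to a HOG descriptor. With 14x14-pixel cells
    # and 1x1-cell blocks, a 28x28 image yields 2*2 cells * 9 orientations = 36 features.
    feats = [hog(img, orientations=9, pixels_per_cell=(14, 14),
                 cells_per_block=(1, 1), feature_vector=True)
             for img in images_28x28]
    return np.asarray(feats)

demo = np.random.rand(5, 28, 28)          # stand-in for five MNIST images
print(extract_hog_features(demo).shape)   # (5, 36)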
Experiments were conducted on poisoned training sets obtained by randomly inserting≤ n input and output mutated samples to the original datasets. All experiments were conducted on a computer with a 2 GHz Quad-Core Intel Core i5 CPU and 16 GB of memory. 3.4.1 Results on the Small Datasets I first compared the proposed method with the baseline on the small datasets where the baseline method could actually finish. This is important because the baseline method does not rely on over- approximation, and thus can obtain the ground truth. Here, the ground truth means which of the 29 Table 3.2: Results of the proposed method and the baseline method on the small datasets with the maximal poisoning number n=1, 2, and 3. Name Baseline Proposed Method Accuracy # robust time (s) # robust time (s) Iris (n=1) 15/15 60 14/15 1 93.3% iris (n=2) 14/15 4,770 13/15 1 93.3% iris (n=3) - >9,999 11/15 1 - Digits (n=1) 179/180 8,032 172/180 1 96.1% Digits (n=2) - >9,999 170/180 1 - Digits (n=3) - >9,999 165/180 1 - test data (in XSet) have prediction results that are actually robust against n-poisoning attacks. By comparing with the ground truth, we were able to evaluate the accuracy of the proposed method. Table 3.2 shows the results. Column 1 shows the name of the dataset and the poisoning number n. Columns 2-3 show the result of the baseline method, consisting of the number of certified test data and the time taken. Similarly, Columns 4-5 show the result of our method. Column 6 shows the accuracy of the proposed method in percentage. The results indicate that, for test data that are indeed robust according to the ground truth, the proposed method can successfully certify most of them. In Iris (n=2), for example, Column 2 shows that 14 of the 15 test data are robust according to the baseline method, and Column 4 shows that 13 out of these 15 test data are certified by the proposed method. Therefore, the proposed method is 93.3% accurate. The proposed method is also faster than the baseline. For Digits (n=1), the proposed method took only 1 second to certify 172 out of the 180 test data as being robust while the baseline method took 8,032 seconds. As the maximal poisoning number n increases, the baseline method ran out of time even for these small datasets. As a result, we no longer have the ground truth needed to directly measure the accuracy of our method. Nevertheless, the number of certified test data in Column 4 of Table 3.2 serves as a proxy – it decreases slowly as n increases, indicating that the accuracy of the proposed method remains high. 30 Table 3.3: Results of the proposed method on large datasets, and on small datasets but with larger poisoning numbers. Name Poisoning Number Certified Percentage Certification Time (n) (# robust/|XSet|) (s) Iris 1∼ 5 (4%) 93.3%∼ 73.3% 1∼ 1 Digits 1∼ 16 (1%) 95.6%∼ 80.6% 1∼ 2 HAR 1∼ 98 (1%) 99.4%∼ 71.7% 85∼ 93 Letter 1∼ 190 (1%) 94.0%∼ 5.5% 33∼ 43 MNIST 1∼ 600 (1%) 99.9%∼ 53.5% 888∼ 994 CIFAR10 1∼ 500 (1%) 99.2%∼ 2.8% 1,453∼ 1,559 3.4.2 Results on the Large Datasets I also evaluated the proposed method on the large datasets. Table 3.3 summarizes the results on these large datasets as well as the two small datasets but with larger maximal poisoning numbers. Since these certification problems are out of the reach of the baseline method, we no longer has the ground truth. Thus, instead of measuring the accuracy, we measure the percentage of test data that we can certify, shown in Column 3 of Table 3.3. 
For example, in Iris, n= 1∼ 5 (4%) in Column 2 means that these experiments were conducted for each maximal poisoning number n= 1,2,...5. Since the training dataset has 135 elements, n= 5 means 4% (or 5/135) of these training data may be poisoned. In Column 3, 93.3% is the percentage of certified test data for n= 1, while 73.3% is the percentage of certified test data for n= 5. Except for Iris, which has a small number of training data, we set the maximal poisoning number n to be less than 1% of the training dataset. Overall, the proposed method remains fast as the sizes of T , XSet and n increase. For MNIST, in particular, the proposed method finished analyzing both 10-fold cross validation and KNN predic- tion in 26 minutes, for all of the 60,000 data elements in the training set and 10,000 data elements in the test set. In contrast, the baseline method failed to certify any of the test data within the 9999-second time limit. Without the ground truth, the certified percentage provides a lower bound on the number of test data that remain robust against data-poisoning attacks. When n=1, the certified percentage in Column 3 is high for all datasets. As the poisoning number n increases to 1% of the entire 31 training set T , the certified percentage decreases. Furthermore, the decrease is more significant for some datasets than for other datasets. For example, In MNIST, at least 53.5% of the test data remain robust under 1% (or 600) poisoning attacks. In CIFAR, however, only 2.8% of the test data remains robust under 1% (or 500) poisoning attacks. Thus, the relationship between the certified percentage and the poisoning number reflects more about the unique characteristics of these datasets. 3.4.3 Compared with the Existing Method While the proposed method is the only one that can certify the entire KNN algorithm, there are existing methods that can certify part of the KNN algorithm. The most recent method proposed by Jia et al. [33], in particular, aims to certify the KNN prediction step with a given K value; thus, it can be regarded as functionally equivalent to the subroutine of the proposed method as presented in Algorithm 7. However, the proposed method is significantly more accurate. To make the two methods comparable, we use their method to replace Algorithm 7 in the proposed method before conducting the experimental comparison. Since an open-source implementation of their method is not available, I have implemented it myself. Fig. 3.4 shows the results, where blue lines represent the proposed method and orange lines represent their method [33]. Overall, the certified percentage obtained by the proposed method is significantly higher. For all datasets, the certified percentage obtained by their method drops more quickly than the certified percentage obtained by the proposed method. For Iris, in particular, their method cannot certify any of the test data, while the proposed method can certify more than 70% of them as being robust. 3.5 Summary I have presented the first method for soundly certifying n-poisoning robustness for the entire KNN algorithm that includes both the parameter tuning and the prediction phases. 
It relies on sound overapproximation to exhaustively and yet efficiently cover the astronomically large number of possible adversarial scenarios. I have demonstrated the accuracy and efficiency of the proposed method, and its advantages over a state-of-the-art method, through experimental evaluation using both small and large supervised-learning datasets. Besides KNN, the proposed method for soundly over-approximating p-fold cross validation is also applicable to similar cross-validation steps frequently used in other modern machine learning systems.

Figure 3.4 (panels (a) Iris, (b) Digits, (c) HAR, (d) Letter, (e) MNIST, (f) CIFAR10): Comparing the proposed method (blue) with Jia et al. [33] (orange): the x-axis is poisoning number n and the y-axis is the percentage of certified test data.

Chapter 4
Falsifying Robustness of KNNs under Data Poisoning

In this chapter, I propose a method for falsifying robustness of KNNs under data poisoning, using a novel over-approximate analysis in the abstract domain to quickly narrow down the search space, and systematic testing in the concrete domain to find the actual violations. Existing methods for deciding data-poisoning robustness have either poor accuracy or long running time and, more importantly, they can only certify some of the truly-robust cases, but cannot falsify the truly-non-robust cases. The proposed method overcomes this limitation. I have evaluated this method on a diverse set of supervised-learning datasets. The results show that the method significantly outperforms state-of-the-art techniques, and can decide data-poisoning robustness of KNN prediction results for most of the test inputs.

Faced with data-poisoning attacks, users may be interested in knowing if the result generated by a potentially-poisoned prediction model is robust, i.e., whether the result remains the same regardless of whether or how the training set may have been poisoned by up-to-n elements [19]. This is motivated, for example, by the following use case scenario: the model trainer collects data from potentially malicious sources but is confident that the number of potentially-poisoned elements is bounded by n; and despite the risk, the model trainer wants to use the learned model to make a prediction for a new test input. If we can certify the robustness, the prediction result can still be used; this is called robustness certification. If, on the other hand, we can find a scenario that violates the robustness property, the prediction result is discarded; this is called robustness falsification. Therefore, the falsification and certification problems are analogous to software testing and verification problems: falsification aims to detect violations of a property, while certification aims to prove that such violations do not exist.

Conceptually, the data-poisoning robustness problem can be solved as follows. First, we assume that the training set T consists of both clean and poisoned data, but which elements are poisoned remains unknown. Based on T, we use a learning algorithm L to obtain a model M = L(T), and then use it to predict the output label y = M(x) for a test input x.
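To make this mental model concrete, the following minimal sketch uses scikit-learn's off-the-shelf KNN classifier (the same library the implementation described later builds on) to learn M = L(T) and compute y = M(x) for a toy training set; it assumes K is already known and is an illustration rather than the actual implementation.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def learn_and_predict(train_inputs, train_labels, x, K):
    model = KNeighborsClassifier(n_neighbors=K)   # M = L(T)
    model.fit(train_inputs, train_labels)
    return model.predict([x])[0]                  # y = M(x)

# Toy training set T with two classes in a 2-D feature space.
train_inputs = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_labels = np.array([0, 0, 1, 1])
print(learn_and_predict(train_inputs, train_labels, x=np.array([0.05, 0.1]), K=3))  # -> 0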
Next, we check if the pre- diction result could have been different by removing the poisoned elements from T . Assuming that exactly 1≤ i≤ n of the|T| data elements are poisoned, the clean subset T ′ ⊂ T will have the remaining (|T|− i) elements. Using T ′ to learn the model M ′ = L(T ′ ), we could have predicted the result y ′ = M ′ (x). Finally, by comparing y ′ with y, we decide if prediction for the (unlabeled) test input x is robust: the prediction is considered robust if and only if, for all 1≤ i≤ n, y ′ is the same as the default result y. While the solution presented above (called the baseline approach) is a useful mental model, it is not efficient enough for practical use. This is because for a given training set T , the number of possible clean subsets (T ′ ⊂ T ) can be as large asΣ n i=1 |T| i . Due to this combinatorial explosion, it is practically impossible to enumerate all the clean subsets and then check if they generate the same result as y= M(x). To avoid the combinatorial explosion, I propose a more efficient method for deciding n-poisoning robustness. Instead of enumerating the clean subsets (T ′ ⊂ T ), we use an over-approximate analysis to either certify robustness quickly or narrow down the search space, and then rely on systematic testing in the narrowed search space to find a subset T ′ that can violate robustness. However, deciding the n-poisoning robustness of KNN is a challenging task. This is because the KNN algorithm has two phases: the parameter tuning phase and the prediction phase (see Section 2.2). During the parameter tuning phase, the entire training set T is used to compute the optimal value of parameter K such that, if the most frequent label among the K-nearest neighbors of an input is used to generate the prediction label, the average misclassification error will be minimized. Here, the misclassification error is computed over data elements in T using a technique 35 called p-fold cross validation (see Section 2.2) and the distance used to define nearest neighbors is the Euclidean distance in the input vector space. As a result, the parameter tuning phase itself can be time-consuming, e.g., computing the optimal K for the MNIST dataset with|T|=60,000 elements may take 30 minutes, while computing the prediction result for a test input may take less than a minute. The large size of T and the complex nature of the mathematical computations make it difficult for conventional software testing and verification techniques to accurately decide the robustness of the KNN system. To overcome these challenges, I propose three novel techniques. First, I propose an over- approximate analysis to certify n-poisoning robustness in a sound but incomplete manner. That is, if the analysis says that the default result y= M(x) is n-poisoning robust, the result is guaranteed to be robust. However, this quick certification step may return unknown and thus is incomplete. Second, I propose a search space reduction technique, which analyzes both the parameter tuning and the prediction phases of the KNN algorithm in an abstract domain, to extract common prop- erties that all potential robustness violations must satisfy, and then uses these common properties to narrow down the search space in the concrete domain. Third, I propose a systematic testing technique for the narrowed search space, to find a clean subset T ′ ⊂ T that violates the robustness property. During systematic testing, incremental computation techniques are used to reduce the computational cost. 
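A schematic sketch of how these three components fit together is shown below; all helper names (quick_certify, gen_promising_subsets, paratune_init, paratune_update, knn_predict) are hypothetical placeholders for the subroutines described in the following sections, not the actual implementation.

import time

def decide_robustness(T, n, x, quick_certify, gen_promising_subsets,
                      paratune_init, paratune_update, knn_predict,
                      time_budget_s=1800):
    # T is a set of (feature_tuple, label) pairs; the helpers are injected as callables.
    if quick_certify(T, n, x):                        # (1) sound but incomplete certification
        return "Certified", None
    K, cv_state = paratune_init(T)                    # full p-fold cross validation, done once
    y = knn_predict(T, K, x)
    deadline = time.time() + time_budget_s
    for T_clean in gen_promising_subsets(T, n, x, y): # (2) reduced search space
        if time.time() > deadline:
            return "Unknown", None
        K2 = paratune_update(T - T_clean, cv_state)   # (3) incremental re-tuning of K
        if knn_predict(T_clean, K2, x) != y:
            return "Falsified", T - T_clean           # evidence: suspected poisoned elements
    return "Certified", None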
I have implemented the proposed method as a software tool that takes as input the potentially- poisoned training set T , the poisoning threshold n, and a test input x. The output may be Certified , Falsified or Unknown. Whenever the output is Falsified , a subset T ′ ⊂ T is also returned as ev- idence of the robustness violation. I evaluated the tool on a diverse set of benchmarks collected from the literature. For comparison, I also implemented three alternative approaches. The first one is the baseline approach that explicitly enumerates all subsets T ′ ⊂ T . The other two are existing methods by Jia et al. [33] and Li et al. [41] which only partially solve the robustness problem: Jia 36 et al. [33] do not analyze the KNN parameter tuning phase at all, and thus require the optimal pa- rameter K to be given manually; and both Jia et al. [33] and Li et al. [41] focus only on certification in that they may return Certified or Unknown, but not Falsified . The benchmarks used in the experimental evaluation are six supervised learning datasets. Two of them are small enough that the ground truth (robust or non-robust) may be obtained by the base- line enumerative approach, and thus are useful in evaluating the accuracy of the proposed tool. The others are larger datasets, e.g., with 60,000 training elements and 10,000 test elements, which are useful in evaluating the efficiency. The experimental results show that the proposed method can certify or falsify n-poisoning robustness for the vast majority of test cases. Furthermore, among the three competing methods, the proposed method has the best overall performance. Specifically, the proposed method is as accurate as the baseline enumerative approach on benchmarks that the baseline approach can handle, while being exponentially faster. Compared with the other exist- ing method [33], the proposed method is significantly more accurate. For example, on the large CIFAR10 dataset with the poisoning threshold set to n=150, the proposed method successfully decide 100% of the test cases, while Li et al. [41] resolved only 36.0%, and Jia et al. [33] resolved only 10.0%. To summarize, this paper makes the following contributions: • I propose the first method capable to certifying as well as falsifying n-poisoning robustness of the entire state-of-the-art KNN system, including both the parameter tuning and the pre- diction phase. • I propose techniques to keep the proposed method accurate as well as efficient, by using over-approximate analysis in the abstract domain to narrow down the search space before using systematic testing to identify violations in the concrete domain. • I implement the proposed method as a software tool and evaluate it on six popular supervised learning datasets to demonstrate its advantages over two state-of-the-art techniques. 37 The remainder of this paper is organized as follows. I first present the technical background in Section 4.1. Then, I present proposed method in Sections 4.2, 4.3, 4.4 and 4.5. Next, I present the experimental results in Section 4.6. Finally, I summarize this work in Section 4.7. 4.1 Background In this section, we highlight the challenges in deciding n-poisoning robustness. 4.1.1 The n-Poisoning Robustness Let us recall the definition of n-Poisoning Robustness in Section 2.1. Given a potentially-poisoned training set T and a poisoning threshold n indicating the maxi- mal poisoning count, the set of possible clean subsets of T is represented by∆ n (T)={T ′ | T ′ ⊂ T and|T\ T ′ |≤ n}. 
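As a small illustration of this definition (a toy example of my own), the clean-subset space can be enumerated explicitly for a tiny training set, which also shows why enumerating it directly quickly becomes intractable.

from itertools import combinations

def delta_n(T, n):
    # All clean subsets T' obtained from T by deleting at most n elements
    # (including T itself, i.e., zero removals).
    T = list(T)
    subsets = []
    for i in range(0, n + 1):
        for removed in combinations(range(len(T)), i):
            subsets.append([e for j, e in enumerate(T) if j not in removed])
    return subsets

toy_T = [("a", 0), ("b", 0), ("c", 1), ("d", 1)]
print(len(delta_n(toy_T, 2)))   # 1 + C(4,1) + C(4,2) = 11 subsets, already for |T| = 4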
That is,∆ n (T) captures all possible situations where the poisoned elements are eliminated from T . We say the prediction y= M(x) for a test input x is robust if and only, for all T ′ ∈∆ n (T) such that M ′ = L(T ′ ) and y ′ = M ′ (x), we have y ′ = y. In other words, the default result y= M(x) is the same as all of the possible results, y ′ = M ′ (x), no matter which are the (i≤ n) poisoned data elements in the training set T . 4.1.2 Challenge As shown in Section 1.1.1, the poisoned data may change the prediction result in two ways. One way in which poisoned data may affect the classification result is called direct influence . In this case, the poisoned elements directly change the K-nearest neighbors of x and thus the prediction label. The other way in which poisoned data may affect the classification result is called indirect influence . In this case, the poisoned elements may not be close neighbors of x, but their presence in T changes the parameter K (Section 2.2 explains how to compute K), and thus the prediction label. 38 Algorithm 8: Procedure FALSIFY BASELINE(T,n,x). K← KNN PARATUNE(T) y← KNN PREDICT(T,K,x) ∆ n (T)←{ T ′ | T ′ ⊂ T and|T\ T ′ |≤ n} while∆ n (T)̸= / 0∧ consumed time< time limit do Remove a clean subset T ′ from∆ n (T) K ′ ← KNN PARATUNE(T ′ ) y ′ ← KNN PREDICT(T ′ ,K ′ ,x) if y̸= y ′ then return Falsified with (T\ T ′ ) as evidence if∆ n (T)= / 0 then return Certified else return Unknown The existence of indirect influence prevents us from only considering the cases where poisoned elements are near x; instead, we must consider each T ′ ∈∆ n (T). 4.1.3 The Baseline Method I first present the baseline method in Algorithm 8, and then compare it with the proposed method in Algorithm 9 (Section 4.2). The baseline method explicitly enumerates the possible clean subsets T ′ ∈∆ n (T) to check if the prediction result y ′ produced by T ′ is the same as the prediction result y produced by T for the given input x. As shown in Algorithm 8, the input consists of the training set T , the poisoning threshold n, and the test input x. The subroutines KNN PARATUNE and KNN PREDICT implement the standard parameter tuning and prediction phases of the KNN algorithm. Without the time limit, the baseline method would be both sound and complete; in other words, it would return either Certified (Line 13) or Falsified (Line 9). With the time limit, however, the baseline method will return Unknown (Line 15) after it times out. The baseline procedure is inefficient for three reasons. First, it is a slow certification (Line 13) to check whether the prediction result for x remains the same for all possible clean subsets T ′ ∈ ∆ n (T). In many cases, most of the elements around x belong to the same class, and thus x’s 39 predicted label cannot be changed by either direct or indirect influence. However, the baseline procedure cannot quickly identify and exploit this to avoid enumeration. Second, even if a violat- ing subset T ′ exists, the vast majority of subsets in∆ n (T) are often non-violating. However, the baseline procedure cannot quickly identify the violating T ′ from∆ n (T). Third, within the while- loop, different subsets share common computations inside KNN PARATUNE, but these common computations are not leveraged by the baseline procedure to reduce the computational cost. 4.2 Overview of Proposed Method There are three main differences between the proposed method in Algorithm 9 and the baseline method in Algorithm 8. They are marked in dark blue. 
They are the novel components designed specifically to overcome limitations of the baseline method. First, we add the subroutine QUICKCERTIFY to quickly check whether it is possible to change the prediction result for the test input x. This is a sound but incomplete check in that, if the subroutine succeeds, we guarantee that the result is robust. If it fails, however, the result remains unknown and we still need to execute the rest of the procedure. The detailed implementation of QUICKCERTIFY is presented in Section 4.3. Second, before searching for a clean subset that violates robustness, we compute ∇ x n (T)⊆ ∆ n (T), to capture the likely violating subsets. In other words, the obviously non-violating ones in ∆ n (T) are safely skipped. Note that, while∆ n (T) depends only on T and n,∇ x n (T) depends also on the test input x. For this reason,∇ x n (T) is expected to be significantly smaller than ∆ n (T), thus reducing the search space. The detailed implementation of GENPROMISINGSUBSETS is presented in Section 4.4. Third, instead of applying the standard KNN PARATUNE subroutine to each subset T ′ to per- form the expensive p-fold cross validation from scratch, we split it to KNN PARATUNE INIT and KNN PARATUNE UPDATE, where the first subroutine is applied only once to the original train- ing set T , and the second subroutine is applied to each subset T ′ ∈∇ x n (T). Within subroutine 40 Algorithm 9: The proposed procedure FALSIFY NEW(T,n,x). if QUICKCERTIFY(T,n,x) then return Certified ⟨K,Error⟩← KNN PARATUNE INIT(T) y← KNN PREDICT(T,K,x) ∇ x n (T)← GENPROMISINGSUBSETS(T,n,x,y) while∇ x n (T)̸= / 0∧ consumed time< time limit do Remove a subset T ′ from∇ x n (T) K ′ ← KNN PARATUNE UPDATE(T\ T ′ ,Error) y ′ ← KNN PREDICT(T ′ ,K ′ ,x) if y̸= y ′ then return Falsified with (T\ T ′ ) as evidence if∇ x n (T)= / 0 then return Certified else return Unknown KNN PARATUNE UPDATE, instead of performing p-fold cross validation for T ′ from scratch, we leverage the results returned by KNN PARATUNE INIT to incrementally compute the results for K ′ . The detailed implementation of these two new subroutines is presented in Section 4.5. 4.3 Quickly Certifying Robustness In this section, I present the subroutine QuickCertify, which is a sound but incomplete procedure for certifying robustness of the KNN for a given input x. Therefore, if it returns True, the pre- diction result for x is guaranteed to be robust. If it returns False, however, we still need further investigation. Before presenting the algorithm, I need to define the notations used in the algorithm, following the ones used by Li et al. [41]. Training Set T . Let T ={(x 1 ,y 1 ),(x 2 ,y 2 ), ..., (x m ,y m )} be a set of labeled data elements, where input x i ∈X ⊆ R D is a feature vector in the feature spaceX , and y∈Y ⊆ N is a class label in the label spaceY . 41 Set of K-nearest Neighbors T K x . Let T K x be the set of K nearest neighbors of test input x in the training set T . Label CounterE(·). LetE(D)={(l i : #l i )} be the set of label counts for a dataset D, where l i ∈Y is a label and #l i ∈N is the number of elements in D with label l i . Most Frequent Label Freq(·). Let Freq(E(D)) be the most frequent label in the label counter E(D) for the dataset D. Lexicographic Order 1 y<y ′. Let 1 y<y ′ be the tie-breaker used by the KNN algorithm to decide the most frequent label, whenever two labels y and y ′ have the same count. 1 y<y ′ evaluates to 1 if y is ahead of y ′ in their lexicographic order, and evaluates to 0 if y is behind y ′ . 
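These notations can be rendered directly in Python; the helper names below are my own, but the tie-breaking rule follows the definition of 1_{y<y'} just given, so the lexicographically smaller label wins a tie.

from collections import Counter

def label_counter(dataset):
    # E(D): the label counts of a dataset D given as a list of (x, y) pairs.
    return Counter(y for _, y in dataset)

def most_frequent_label(counter):
    # Freq(E(D)): highest count first; ties broken toward the lexicographically smaller label.
    return min(counter, key=lambda label: (-counter[label], label))

neighbors = [("x1", "l_a"), ("x2", "l_a"), ("x3", "l_b")]
print(label_counter(neighbors))                            # Counter({'l_a': 2, 'l_b': 1})
print(most_frequent_label(label_counter(neighbors)))       # 'l_a'
print(most_frequent_label(Counter({"l_a": 1, "l_b": 1})))  # 'l_a' (tie goes to 'l_a')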
Consider T_x^3 = {(x_1, l_a), (x_2, l_a), (x_3, l_b)} as an example, which captures the 3 nearest neighbors of a test input x. The corresponding label counter is E(T_x^3) = {(l_a : 2), (l_b : 1)}, meaning that two elements in T_x^3 have the label l_a and one element has the label l_b. The corresponding most frequent label is Freq(E(T_x^3)) = l_a.

For each subset T' ∈ Δ_n(T), we define a removal set R = (T \ T') and a removal strategy S = E(R).
• A removal set R for a set T is a non-empty subset R ⊂ T, representing the removal of the elements in R from T.
• A removal strategy S is the label counter of a removal set R, i.e., S = E(R).

Thus, all the removal sets form the concrete domain, and all the removal strategies form an abstract domain. While analysis in the (large) concrete domain is expensive, analysis in the (smaller) abstract domain is much cheaper. This is analogous to the abstract interpretation [14] paradigm for static program analysis. For the set T_x^3 above, there are 6 removal sets: R_1 = {(x_1, l_a)}, R_2 = {(x_2, l_a)}, R_3 = {(x_3, l_b)}, R_4 = {(x_1, l_a), (x_2, l_a)}, R_5 = {(x_1, l_a), (x_3, l_b)}, and R_6 = {(x_2, l_a), (x_3, l_b)}. They correspond to 4 removal strategies: S_1 = {(l_a : 1)}, S_2 = {(l_b : 1)}, S_3 = {(l_a : 1), (l_b : 1)}, and S_4 = {(l_a : 2)}. As the number of elements in T increases, the size gap between the concrete and abstract domains increases drastically — this is the reason why the proposed method is efficient.

Algorithm 10: Subroutine QUICKCERTIFY(T, n, x).
  LabelSet ← {}
  for each candidate K value do
    Let y = Freq(E(T_x^K)) and add y into LabelSet;
    if y ≠ Freq(E(T_x^{K+n}) \ {(y : n)}) then return False
  if |LabelSet| > 1 then return False
  return True

4.3.1 The QUICKCERTIFY Subroutine

This subroutine checks a series of sufficient conditions under which the prediction result for test input x is guaranteed to be robust. These sufficient conditions are designed to avoid the most expensive step of the KNN algorithm, which is the parameter tuning phase that relies on p-fold cross validations to compute the optimal K parameter. Since the optimal K parameter is chosen from a set of candidate values, where p-fold cross validations are used to identify the value that minimizes prediction error, skipping the parameter tuning phase means we must directly analyze the behavior of the KNN prediction phase for all candidate K values. That is, assuming any of the candidate K values may be the optimal one, we prove that the prediction result remains the same no matter which candidate K value is used as the K parameter.

Algorithm 10 shows the procedure, which takes the training set T, poisoning threshold n, and test input x as input, and returns either True or False as output. Here, True means the result is n-poisoning robust, and False means the result is unknown. For each candidate K value, y = Freq(E(T_x^K)) is the most frequent label of the K-nearest neighbors of x.

Recall that, in Section 1.1.1, I have explained the two ways in which poisoned data in T may affect the prediction result. The first one is called direct influence: without changing the K value, the poisoned data may affect the K-nearest neighbors of x and thus their most frequent label. The second one is called indirect influence: by changing the K value, the poisoned data may affect how many neighbors to consider. Inside the QUICKCERTIFY subroutine, we check for sufficient conditions under which none of the above two types of influence is possible.

The check for direct influence is implemented in Line 4.
Here, T K+n x consists of the (K+ n) nearest neighbors of x, andE(T K+n x ) is the label counter. Therefore,E(T K+n x )\{(y : n)} means removing n data elements labeled y. Freq(E(T K+n x )\{(y : n)}) represents the most frequent label after the removal. If it is possible for this removal strategy to change the most frequent label, then we conservatively assume that the prediction result may not be robust. The check for indirect influence is implemented in Line 7. Here, LabelSet stores all of the most frequent labels for different candidate K values. If the most frequent labels for any two candidate K values differ, i.e., |LabelSet| > 1, we conservatively assume the prediction result may not be robust. On the other hand, if the prediction result remains the same during both checks, we can safely assume that the prediction result is n-poisoning robust. 4.3.2 Two Examples I illustrate Algorithm 10 using two examples. Figure 4.1 shows an example where robustness can be proved by QUICKCERTIFY. For sim- plicity, we assume the only two candidate values for the parameter K are K= 1 and K= 3. When K = 1, as shown in Figure 4.1 (a), star is the most frequent label of the x’s neighbors, denoted E(T 1 x )={(star : 1)}, and inside Algorithm 10, we have LabelSet ={star}. The extreme case is represented byE(T 1+2 x )\{(star : 2)}={(star : 1)}, which means x is still classified as star after applying this aggressive removal strategy. When K= 3, as shown in Figure 4.1 (b), star is also the most frequent label inE(T 3 x )={star : 3} and thus LabelSet={star}. The extreme case is represented byE(T 3+2 x )\{star : 2}={star : 44 ? (a) For K= 1, Freq(E(T 1 x ))= star, and Freq(E(T 1+n x )\{star : n})= star ? (b) For K= 3, Freq(E(T 3 x ))= star, and Freq(E(T 3+n x )\{star : n})= star Figure 4.1: Robust example for QUICKCERTIFY, where the poisoning threshold is n= 2, and candidate K values are{1, 3}. 3}, which means x is still classified as star after applying this removal strategy. In this example n= 2, thus x is proved to be robust against 2-poisoning attacks. Figure 4.2 shows an example where the robustness cannot be proved by QUICKCERTIFY. When K = 1, as shown in Figure 4.2 (a), star is the most frequent label inE(T 1 x )={(star : 1)} and LabelSet={star}. The extreme case isE(T 1+2 x )\{(star : 2)}={triangle : 2}, which means x is classified as triangle. Thus, QUICKCERTIFY returnsFalse in Line 5. 4.3.3 Correctness and Efficiency The following theorem states that the proposed method is sound in proving n-poisoning robustness. Theorem 2. If QUICKCERTIFY(T,n,x) returns True, the KNN’s prediction result for x is guar- anteed to be n-poisoning robust. Due to space limit, the full proof is omitted. Instead, the intuition behind Line 4 of the al- gorithm is explained as follows. First, we note that the prediction label Freq(E(T ′ K x )) from any T ′ ⊂ ∆ n (T) can correspond to a Freq(E(D)) where D is obtained by removing i(≤ n) elements 45 ? (a) For K= 1, Freq(E(T 1 x ))= star, and Freq(E(T 1+n x )\{star : n})= triangle ? (b) For K= 3, Freq(E(T 3 x ))= triangle, and Freq(E(T 3+n x )\{triangle : n})= star Figure 4.2: Unknown example for QUICKCERTIFY, where the poisoning threshold is n= 2 and the only two candidate values are K=1 and K=3. from T K+n x . Thus, we only need to pay attention to the(K+ n) nearest neighbors of x; other ele- ments which are further away from x can be safely ignored (cf. [41, 33]). 
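As an aside, the direct-influence check illustrated in Figures 4.1 and 4.2 can be written in a few lines of Python; this is my own sketch of the check on Line 4 of Algorithm 10, not the actual implementation, and it reuses the lexicographic tie-breaking rule defined earlier.

from collections import Counter

def most_frequent_label(counter):
    return min(counter, key=lambda label: (-counter[label], label))

def survives_direct_influence(neighbor_labels, K, n):
    # neighbor_labels: labels of x's nearest neighbors, closest first (at least K+n of them).
    # Returns True if the K-NN label of x provably survives the most aggressive removal
    # of up to n elements, i.e., Freq(E(T_x^{K+n}) \ {(y : n)}) == y.
    y = most_frequent_label(Counter(neighbor_labels[:K]))    # Freq(E(T_x^K))
    counts = Counter(neighbor_labels[:K + n])                # E(T_x^{K+n})
    counts[y] -= n                                           # remove up to n copies of y
    if counts[y] <= 0:
        del counts[y]
    return bool(counts) and most_frequent_label(counts) == y

# In the spirit of Figure 4.1 (all five closest neighbors labeled star, n = 2):
print(survives_direct_influence(["star"] * 5, K=3, n=2))                       # True
# In the spirit of Figure 4.2 (a) (star is closest, two triangles follow, n = 2):
print(survives_direct_influence(["star", "triangle", "triangle"], K=1, n=2))   # False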
Next, to maximize the chance of changing the most frequent label from y to another label, we want to remove as many y-labeled elements as possible from x’s neighbors. Thus, the most aggressive removal case is cap- tured byE(T K+n x )\{(y : n)}. If the most frequent label remains unchanged even in this case, it is guaranteed unchanged. Next, I explain why QUICKCERTIFY is fast. There are three reasons. First, it completely avoids the computationally expensive p-fold cross validations. Second, it considers only the K+n nearest neighbors of x. Third, it focuses on analyzing the label counts, which are in the (small) abstract domain, as opposed to the removal sets, which are in the (large) concrete domain. For these reasons, the execution time of this subroutine is often negligible (e.g., less than 1 second) even for large datasets. At the same time, the experimental evaluation shows that it can prove robustness for a surprisingly large number of test inputs. 46 4.4 Reducing the Search Space In this section, I present the subroutine GENPROMISINGSUBSETS, which narrows down the search space by removing obviously non-violating subsets from ∆ n (T) and returns the remaining ones, denoted by the set∇ x n (T) in Algorithm 9. 4.4.1 Minimal Violating Removal in Neighbors We filter the obviously non-violating subsets by computing some common property for each can- didate K value such that it must be part of every violating removal set. We observe that any violating removal set for a specific candidate K value must ensure that, for test input x, its new K nearest neighbors after removal have a most frequent label y ′ that is different from the default label y. The proposed method computes the minimal number of removed elements in x ′ s neighborhood to achieve this, let us call it minimal violating removal, denote min rmv. With this number, we know the every violating removal set must have at least min rmv elements from x’s neighbors T K+n x . The test input x’s new nearest neighbors after removal is represented as T K+i x \{i elements from T K+i x }, where i= 1,2,...n. To compute the minimal violating removal, rather than checking each possible value of i from 1 to n, we need a more efficient method, e.g., binary search with O(log n). To use binary search, we need to prove the monotonicity of violating removals, defined below. Theorem 3 (Monotonicity). If there is some i allowing T K+i x \{i elements from T K+i x } to have a different most-frequent label y ′ , then any larger value j> i will also allow T K+ j x \{ j elements from T K+ j x } to have a different most-frequent label y ′ . Conversely, if i does not allow it, then any smaller value j< i does not allow it either. Proof. If there is some i allowing T K+i x \{i elements from T K+i x } to have a different most-frequent label y ′ , there exists S⊂ T K+i x such that|S|= i and Freq(T K+i x \S)= y ′ . For any j> i and T K+ j x , we can always construct S ′ = S∪(T K+ j x \ T K+i x ), which satisfies S ′ ⊂ T K+ j x ,|S ′ |= j and Freq(T K+ j x \ S ′ )= y ′ . The reverse can be proved similarly. 47 Lines 2-11 in Algorithm 11 show the process of finding the minimal violating removal using binary search. Assume the possible range is 0∼ n+ 1 (line 2), the binary search divides the range in half (line 4) and checks the middle value (line 5). To check whether a removal mid can result in a different label y ′ ̸= y, the most possible operation is to remove mid elements with y label. 
If mid works, then according to Theorem 3, we know the minimal removal is in the range start ∼ mid (line 6); otherwise it is in the range mid+1 ∼ end (line 8). The binary search stops when start equals end, and this value is the minimal violating removal.

Algorithm 11: GENPROMISINGSUBSETS(T, n, x, y).
  for each candidate K value do
    start = 0; end = n + 1;
    while start < end do
      mid = (start + end) / 2;
      if y ≠ Freq(E(T_x^{K+mid}) \ {(y : mid)}) then
        end = mid;
      else
        start = mid + 1;
    min_rmv = start;
    if min_rmv ≤ n then
      for each R_1 ⊆ T_x^{K+n} s.t. |R_1| ≥ min_rmv do
        for each R_2 ⊆ (T \ T_x^{K+n}) s.t. |R_2| ≤ n − |R_1| do
          R = R_1 ∪ R_2;
          Add (T \ R) to ∇_n^x(T);

Since n is the maximal allowed removal, when min_rmv > n, it is impossible for the most frequent label to change from y to y'.

4.4.2 An Illustrative Example

Here we give an example of the binary search in Algorithm 11. Assume that in the original training set T, for the test input x, the optimal K is K = 1 and the default label is y = star.

Example 1. Assume n = 5, E(T_x^3) = {(star : 2), (triangle : 1)}, E(T_x^4) = {(star : 2), (triangle : 2)}, and E(T_x^5) = {(star : 3), (triangle : 2)}. For the candidate K = 2, we show how to compute the minimal violating removal in x's neighbors.

At first, start = 0 and end = 6, which means the possible value range of the minimal removal is 0 ∼ 6. The proposed method first checks mid = 3: since E(T_x^{2+3}) \ {(star : 3)} results in the most-frequent label triangle, the proposed method can cut the possible range by half to 0 ∼ 3. Next, we check mid = 1, and reduce the range to 0 ∼ 1. Finally, we check mid = 0, which does not work, so the range becomes 1 ∼ 1, and we return 1 as the minimal violating removal in x's neighbors.

Since binary search reduces the range by half at each step, it is efficient. For example, when n = 180 for MNIST, binary search needs only 8 checks to compute the result, whereas going through each value in the range requires 180 checks. In other words, the speedup is more than 20X.

4.4.3 The Reduced Search Space

Based on the minimal violating removal, min_rmv, we compute the reduced set ∇_n^x(T) as shown in Lines 12-20 of Algorithm 11. Here, each removal set R is the union of two sets, R_1 and R_2, where R_1 is a removal set that contains at least min_rmv elements from x's neighborhood T_x^{K+n}, and R_2 ⊆ (T \ T_x^{K+n}) is a subset of the left-over data elements.

The experiments show that, in practice, the reduced set ∇_n^x(T) is often significantly smaller than the original set Δ_n(T). A special case is when min_rmv = 0, for which ∇_n^x(T) is the same as Δ_n(T), meaning the search space is not reduced. However, this special case is rare and, during the experimental evaluation, it never occurred.

4.5 Incremental Computation

In this section, I present the proposed method for speeding up an expensive step of the KNN algorithm, the p-fold cross validations inside KNN PARATUNE. We achieve this speedup by splitting KNN PARATUNE into two subroutines: KNN PARATUNE INIT, which is applied only once to the original training set T, and KNN PARATUNE UPDATE, which is applied to each individual removal set R = (T \ T'), where T' ∈ ∇_n^x(T).

4.5.1 The Intuition

First, I explain why the standard KNN PARATUNE is computationally expensive. This is because, for each candidate value of parameter K, denoted K_i, the standard p-fold cross validation [45] must be used to compute the classification error. Algorithm 12 (excluding Lines 15-16) shows the computation. First, the training set T is partitioned into p groups, denoted {G_1, G_2, ..., G_p}.
Then, the set of misclassification samples in each group G j is computed, denoted errSet K i G j . Next, the error is aver- aged over all groups, which results in error K i . Finally, the K i value with the smallest classification error is chosen as the optimal K value. The computation is expensive because error K i G j , for each K i , requires exactly|G j | calls to the standard KNN PREDICT(T\ G j ,K i ,x), one per data element x∈ G j , while treating the set D= (T\ G j ) as the training set. The intuition for speeding up this computation is as follows. Given the original training set T , and a subset T ′ ∈∇ x n (T), the corresponding removal set R=(T\ T ′ ) can capture the difference between these two sets, and thus capture the difference of their error K i . Since K i is fixed when computing error K i , we only need to consider the direct influence (i.e., neighbors change) brought by removal set R. In practice, the removal set is often small, which means the vast majority of data elements in the p-fold partition of T ′ , denoted{G ′ 1 ,...,G ′ p }, are the same as data elements in the p-fold partition of T , denoted{G 1 ,...,G p }. Thus, for most elements, their neighbors are almost the same. Instead of computing the error sets (errSet K i G ′ j ) from scratch for every single G ′ j , we can use the error sets (errSet K i G j ) for G j as the starting point, and only compute the change brought by removal set R, leveraging the intermediate computation results stored in Error. 4.5.2 The Algorithm The proposed incremental computation consists of two steps. As shown in Algorithm 9, we apply KNN PARATUNE INIT only once to the original training set T , and then apply KNN PARATUNE UPDATE to each removal set R=(T\ T ′ ). 50 Algorithm 12: Subroutine KNN PARATUNE INIT(T). Partition the training set T into p groups{G 1 ,G 2 ,...,G p } for each candidate K i value do for each group G j do errSet K i G j ←{} for each data element(x,y)∈ G j do if KNN PREDICT(T\ G j ,K i ,x)̸= y then Add(x,y) to errSet K i G j ; error K i G j = errSet K i G j / G j error K i = 1 p ∑ p j=1 error K i G j K← argmin K i error K i Error←⟨{ G 1 ,G 2 ,...,G p },{(errSet K i G 1 ,...,errSet K i G p )}⟩ return⟨K,Error⟩ The subroutine KNN PARATUNE INIT is shown in Algorithm 12. It differs from the standard KNN PARATUNE only in Lines 15-16, where it stores the intermediate computation results in Error. The first component in Error is the set of p groups in T . The second component contains, for each K i , the misclassified elements in G j . The subroutine KNN PARATUNE UPDATE is shown in Algorithm 13. It computes the new errSet K i G ′ j based on the errSet K i G j stored in Error. First, it computes the new groups G ′ j by removing elements in R from the old groups G j . Then, it computes the influenced set ( in f luSet), which is defined in the next paragraph. Finally, it modifies the old errSet K i G j (in Line 16) based on three cases: it removes the set R (Case 1) and the set newSet − (Case 2), and adds the set newSet + (Case 3). Below are the detailed explanations of these three cases: 1. If(x,y)∈ G j \G ′ j was misclassified by (T\G j ), but this element is no longer in T ′ , it should be removed. 2. If(x,y)∈ G j ∩ G ′ j was misclassified by (T\ G j ), but this element is correctly classified by T ′ \ G ′ j , it should be removed. 51 Algorithm 13: KNN PARATUNE UPDATE(R,Error). 
Let{G 1 ,...,G p } and{(errSet K i G 1 ,...,errSet K i G p )} be groups and error sets stored in Error Compute the new groups{G ′ j | G ′ j = G j \ R where j= 1,..., p} Compute the new training set T ′ = S j∈{1,...,p} G ′ j Compute the influenced set, in f luSet, using R and{G j } for each candidate K i value do for each new group G ′ j do newSet + = newSet − ={} for each data element(x,y)∈(G ′ j ∩in f luSet) do if KNN PREDICT(T\ G j ,K i ,x)= y and KNN PREDICT(T ′ \ G ′ j ,K i ,x)̸= y then Add(x,y) to newSet + ; if KNN PREDICT(T\ G j ,K i ,x)̸= y and KNN PREDICT(T ′ \ G ′ j ,K i ,x)= y then Add(x,y) to newSet − ; errSet K i G ′ j = errSet K i G j \ R\ newSet − ∪ newSet + error K i G ′ j = errSet K i G ′ j / G ′ j error K i = 1 p ∑ p j=1 error K i G ′ j K← argmin K i error K i return K 3. If (x,y)∈ G j ∩ G ′ j was correctly classified by (T\ G j ), but is misclassified by T ′ \ G ′ j , it should be added. Case (1) can be regarded as an explicit change brought by the removal set R, whereas Case (2) and Case (3) are implied changes brought by R: these changes are implied because, while the element (x,y) is not inside R, it is classified differently after the elements in R are removed from T . Since the removal set is small, most data elements in G j will not be part of the explicit or implied changes. To avoid redundantly invoking KNN PREDICT on these data elements, we filter them out using the influenced set (Line 8). Here, assume that K max = max({K i }) is the maximal candidate value, and during cross-validation, when G j is treated as the test set, D=(T\ G j ) is the corresponding training set. 52 in f luSet={(x,y)∈ G j | (x,y)̸∈ R, D K max x ∩ R̸= / 0, and QUICKCERTIFY(D,n,x)=False} In other words, every element(x,y) inside in f luSet must satisfy three conditions: (1) the element is not in R; (2) at least one of its neighbors in D K max x is in R; and (3) the element may be misclassified when at most n neighbors are removed. Recall that the subroutine used in the last condition has been explained in Algorithm 10. 4.6 Experiments I have implemented the proposed method using Python and the popular machine learning toolkit scikit-learn 0.24.2, together with the baseline method in Algorithm 8, and the two existing methods of Jia et al. [33] and Li et al. [41]. For experimental comparison, I used six popular supervised learning datasets as benchmarks. There are two relatively small datasets, Iris [27] and Digits [29]. Iris has 135 training and 15 test elements with 3 classes and 4-D features. Digits has 1,617 training and 180 test elements with 10 classes and 64-D features. Since the baseline approach (Algorithm 8) can finish on these small datasets and thus obtain the ground truth (i.e., whether prediction is truly robust), these small datasets are useful in evaluating the accuracy of the proposed method. The other four benchmarks are larger datasets, including HAR (human activity recognition using smartphones) [4], which has 9,784 training and 515 test elements with 6 classes and 561-D features, Letter (letter recognition) [28], which has 18,999 training and 1,000 test elements with 26 classes and 16-D features, MNIST (hand-written digit recognition) [37], which has 60,000 training and 10,000 test elements with 10 classes and 36-D features, and CIFAR10 (colored image classification) [36], which has 50,000 training and 10,000 test elements with 10 classes and 288- D features. 
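For reference, the two smallest benchmarks are bundled with scikit-learn, and the 135/15 and 1,617/180 sizes reported above are consistent with a 90/10 train/test split; since the exact splitting procedure is not spelled out here, the split below is an assumption that merely reproduces those sizes.

from sklearn.datasets import load_iris, load_digits
from sklearn.model_selection import train_test_split

for name, loader in [("Iris", load_iris), ("Digits", load_digits)]:
    X, y = loader(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)
    print(name, X_tr.shape, X_te.shape)
# Iris   (135, 4)   (15, 4)
# Digits (1617, 64) (180, 64)

The four larger benchmarks (HAR, Letter, MNIST, and CIFAR10) are external datasets that are not bundled with scikit-learn.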
Since none of these datasets can be handled by the baseline approach, they are used primarily to evaluate the efficiency of the proposed method. 53 Table 4.1: Comparing the accuracy of the proposed method with the baseline (ground truth)and two existing methods on the smaller datasets, for which the ground truth can be obtained by the baseline enumerative method (Algorithm 8). Benchmark Baseline Jia et al. [33] Li et al. [41] Proposed Method dataset test data certified falsified unknown time certified falsified unknown time certified falsified unknown time certified falsified unknown time # # # # (s) # # # (s) # # # (s) # # # (s) Iris (n=1) 15 15 0 0 49 0 0 15 1 14 0 1 1 15 0 0 1 Iris (n=2) 15 14 1 0 3,086 0 0 15 1 13 0 2 1 14 1 0 5 Iris (n=3) 15 0 1 14 6,721 0 0 15 1 11 0 4 1 13 1 1 120 Digits (n=1) 180 0 1 179 7,168 170 0 10 1 172 0 8 1 179 1 0 3 4.6.1 Evaluation Criteria The experiments aimed to answer the following research questions: RQ1 Is the proposed method accurate enough for deciding (certifying or falsifying) n-poisoning robustness for most of the test cases? RQ2 Is the proposed method efficient enough for handling all of the datasets? RQ3 How often can prediction be successfully certified or falsified by the proposed method, and how is the result affected by the poisoning threshold n? I used the state-of-the-art implementation of KNN in the experiments, with 10-fold cross val- idation and candidate K values in the range 1∼ 1 10 |T|. The set T is obtained by inserting up-to-n malicious samples to the datasets. I first generate a random number n ′ ≤ n, and then insert exactly n ′ mutations of randomly picked input features and output labels of the original samples. I ran all four methods on all datasets. For the slow baseline, I set the time limit to 7200 seconds per test input. For the other methods, I set the time limit to 1800 seconds per test input. The experiments were conducted (single threaded) on a CloudLab [22] c6252-25g node with 16-core AMD 7302P at 3 GHz CPU and 128GB EEC Memory (8 x16 GB 3200MT/s RDIMMs). 4.6.2 Results on the Smaller Datasets To answer RQ1, I compared the result of the proposed method with the ground truth obtained by the baseline enumerative method on the two smallest datasets. 54 Table 4.1 shows the experimental results. Columns 1-2 show the name of the dataset, the poisoning threshold n, and the number of test data. Columns 3-6 show the result of the baseline method, including the number of test data that are certified, falsified, and unknown, respectively, and the average time per test input. The remaining columns compare the results of the two existing methods and the proposed method. Since the goal is to compare the proposed method with the ground truth (obtained by the baseline method), I must choose small n values to ensure that the baseline method does not time out. On Iris (n= 2), the baseline method was able to certify 14/15 of the test data and falsify 1/15. However, it was slow: the average time was 3,086 seconds per test input. In contrast, the method by Jia et al. [33] was much faster, albeit with low accuracy. It took 1 second per test input, but failed to certify any of the test data. The method by Li et al. [41] certified 11/15 of the test data but left 4/15 as unknown. The proposed method certified 14/15 of the test data and falsified the remaining 1/15, and thus is as accurate as the ground truth; the average time is 5 seconds per test input. While the slow baseline method was able to handle Iris, it did not scale well. 
With a slightly larger dataset or larger poisoning threshold, it would run out of time. On Digits (n=1), the baseline method falsified only 1/180 of the test data and returned the remaining 179/180 as unknown. In contrast, the proposed method successfully certified or falsified all of the 180 test data. 4.6.3 Results on All Datasets To answer RQ2, I compared the proposed method with the two state-of-the-art methods [33, 41] on all datasets, using significantly larger poisoning thresholds. Since these benchmarks are well beyond the reach of the baseline method, the ground truth is no longer available. However, when- ever the proposed method returns Certified or Falsified , the results are guaranteed to be conclusive. Thus, the Unknown cases are the only unresolved cases. If the percentage of Unknown cases is small, it means the proposed method is accurate. 55 Table 4.2: Comparing the accuracy and efficiency of the proposed method with existing methods on all datasets, with large poisoning thresholds; the percentages of certified and falsified cases are reported in Section 4.6.4 and shown in Figure 4.3. Benchmark Jia et al. [33] Li et al. [41] Proposed Method dataset poisoning unknown time unknown time unknown time threshold % (s) % (s) % (s) Iris n=3 (2%) 100% 1 26.7% 1 6.7% 120 Digits n=16 (1%) 100% 1 19.4% 1 1.0% 19 HAR n=97 (1%) 100% 1 28.3% 1 0.8% 21 Letter n=190 (1%) 100% 1 94.5% 1 0.0% 4 MNIST n=180 (0.3%) 38.1% 1 25.0% 1 2.0% 47 CIFAR10 n=150 (0.3%) 90.0% 1 64.0% 1 0.0% 558 Table 4.2 shows the results, where Column 1 shows the name of the dataset, and Column 2 shows the poisoning threshold. For the smallest dataset, we set n to be 2% of the size of T . For medium datasets, we set it to be 1%. For large datasets, we set it to be 0.3%. Columns 3-6 show the percentage of test data left as unknown by the two existing methods and the average time taken. Recall that these methods can only certify, but not falsify, n-poisoning robustness. Columns 7-8 show the percentage of test data left as unknown by the proposed method. While the proposed method has a higher computational cost, it is also drastically more accurate than the two existing methods. On HAR, for example, the existing methods left 100% and 28.3% of the test data as unknown when n= 97. The proposed method, on the other hand, left only 0.8% of the test data as unknown. On CIFAR10, which has 50,000 data elements with 288-D feature vectors, the proposed method was able to resolve 100% of the test cases when the poisoning threshold was as large as n= 150. In contrast, the two existing methods resolved only 10.0% and 36.0%. In other words, they left 90.0% and 64.0% as unknown. 4.6.4 Effectiveness of Proposed Method and Impact of Poisoning Threshold To answer RQ3, we studied the percentages of certified , falsified , and unknown cases reported by the proposed method, as well as how they are affected by the poisoning threshold n. 
In addition to the percentage of unknown cases shown in Table 4.2, we show the percentages of certified and falsified cases reported by the proposed method below. There is no need to report these percentages for the two existing methods, because they always have 0% falsified cases.

dataset, poisoning threshold | certified by proposed method | falsified by proposed method
Iris, n=3 (2%)        | 86.6% | 6.7%
Digits, n=16 (1%)     | 80.0% | 19.0%
HAR, n=97 (1%)        | 71.8% | 26.8%
Letter, n=190 (1%)    | 5.6%  | 94.4%
MNIST, n=180 (0.3%)   | 75.0% | 23.0%
CIFAR10, n=150 (0.3%) | 36.0% | 64.0%

Figure 4.3 (plots omitted; panels (a) Iris, (b) Digits, (c) HAR, (d) Letter, (e) MNIST, (f) CIFAR10): Results on how the poisoning threshold (on the x-axis) affects the percentages of certified, falsified, and unknown test cases (on the y-axis) in the proposed method. Here, falsified is shown in '−', unknown in '.', and certified in either '|' (quick certify) or '/' (slow certify).

Figure 4.3 shows how these percentages are affected by the poisoning threshold. Here, the x-axis shows n/|T| as a percentage, and the y-axis shows the percentages of falsified in '−', unknown in '.', and certified in either '|' (quick certify) or '/' (slow certify). Recall that in Algorithm 9, a test case may be certified in either Line 2 or Line 16. When it is certified in Line 2, it belongs to the '|' region (quick certify) in Figure 4.3. When it is certified in Line 16, it belongs to the '/' region (slow certify).

For example, in Figure 4.3(e): When n=1, the falsify percentage is 0%, the unknown percentage is 10%, and the quick-certify percentage is 90%. When n=180, the falsify percentage is 23%, the unknown percentage is 2%, and the quick-certify percentage is 75%.

Figure 4.3 demonstrates the effectiveness of the proposed method. Since the '.' regions that represent unknown cases remain small, the vast majority of cases are successfully certified or falsified. The results also reflect the nature of n-poisoning robustness: as n increases, the percentage of truly robust cases decreases. This is inevitable since having more poisoned elements in T leads to a higher likelihood of changing the classification label. This is consistent with the results of prior studies [55, 8, 12], which found that the prediction errors became significant even if a small percentage (< 0.2%) of training data in T was poisoned.

4.7 Summary

I have presented a method for deciding n-poisoning robustness accurately and efficiently for the state-of-the-art implementation of the KNN algorithm. To the best of my knowledge, this is the only method available for certifying as well as falsifying the complete KNN system, including both the parameter tuning and the prediction phases. The proposed method relies on novel techniques that first narrow down the search space using over-approximate analysis in the abstract domain, and then find violations using systematic testing in the concrete domain. I have evaluated the proposed techniques on six popular supervised-learning datasets, and demonstrated the advantages of the proposed method over two state-of-the-art techniques.
Chapter 5
Certifying Fairness of KNNs under Historical Bias in Training Dataset

In this chapter, I propose a method for certifying the fairness of KNNs under data bias, by soundly approximating the complex arithmetic computations used in the state-of-the-art KNN algorithm. As far as I know, this is the first certification method for KNN based on three variants of the fairness definition: individual fairness, ε-fairness, and label-flipping fairness. I first define the fairness certification problem for KNN and then propose methods to lift the computation results from the concrete domain to an abstract domain, to reduce the computational cost. I show the effectiveness of this abstract-interpretation-based technique through experimental evaluation on six datasets widely used in the fairness research literature. I also show that the method is accurate enough to obtain fairness certifications for a large number of test inputs, despite the presence of historical bias in the datasets.

Certifying the fairness of the classification output of a machine learning model has become an important problem. This is in part due to a growing interest in using machine learning techniques to make socially sensitive decisions in areas such as education, healthcare, finance, and criminal justice systems. One reason why the classification output may be biased against an individual from a protected minority group is that the dataset used to train the model may have historical bias; that is, there is systematic mislabeling of samples from the protected minority group. Thus, we must be extremely careful when considering the possibility of using the classification output of a machine learning model, to avoid perpetuating or even amplifying historical bias.

One solution to this problem is to have the ability to certify, with certainty, that the classification output y = M(x) for an individual input x is fair, even though the model M is learned from a dataset T with historical bias. This is a form of individual fairness that has been studied in the fairness literature [23]; it requires that the classification output for input x remains the same as it would be if the historical bias were not present in the training dataset T. However, this is a challenging problem and, as far as I know, techniques for solving it efficiently are still severely lacking. The proposed work aims to fill this gap.

Specifically, we are concerned with three variants of the fairness definition. Let the input x = ⟨x_1, ..., x_D⟩ be a D-dimensional input vector, and P be the subset of vector indices corresponding to the protected attributes (e.g., race, gender, etc.). The first variant of the fairness definition is individual fairness, which requires that similar individuals are treated similarly by the machine learning model. For example, if two individual inputs x and x' differ only in some protected attribute x_i, where i ∈ P, but agree on all the other attributes, the classification output must be the same. The second variant is ε-fairness, which extends the notion of individual fairness to include inputs whose unprotected attributes differ and yet the difference is bounded by a small constant (ε). In other words, if two individual inputs are almost the same in all unprotected attributes, they should also have the same classification output. The third variant is label-flipping fairness, which requires the aforementioned fairness requirements to be satisfied even if a biased dataset T has been used to train the model in the first place.
That is, as long as the number of mislabeled elements in T is bounded by n, the classification output must be the same.

However, obtaining a fairness certification for KNN is still challenging and, in practice, the most straightforward approach of enumerating all possible scenarios and then checking if the classification outputs obtained in these scenarios agree would be prohibitively expensive.

To overcome the challenge, I propose an efficient method based on the idea of abstract interpretation [14]. The proposed method relies on sound approximations to analyze the arithmetic computations used by the state-of-the-art KNN algorithm both accurately and efficiently.

Figure 5.1 (diagram omitted): FAIRKNN, the proposed method for certifying fairness of KNNs with label bias. The upper (concrete domain) half runs KNN_ParaTune and KNN_Predict on T and x to produce a single K value and a single y label; the lower (abstract domain) half runs Abs_KNN_ParaTune and Abs_KNN_Predict on T, n, and x to produce a set of K values and a set of y labels.

Figure 5.1 shows an overview: the lower half of the figure is the proposed method, which conducts the analysis in an abstract domain, and the upper half is the default KNN algorithm, which operates in the concrete domain. The main difference is that, by staying in the abstract domain, the proposed method is able to analyze a large set of possible training datasets (derived from T due to n label-flips) and a potentially-infinite set of inputs (derived from x due to ε perturbation) symbolically, as opposed to analyzing a single training dataset and a single input concretely.

To the best of my knowledge, this is the first method for KNN fairness certification in the presence of dataset bias. While Meyer et al. [47, 48] and Drews et al. [20] have investigated robustness certification techniques, their methods target decision trees and linear regression, which are machine learning models quite different from KNN. The proposed method also differs from the KNN data-poisoning robustness certification techniques developed by Jia et al. [34] and Li et al. [41], which do not focus on fairness at all; for example, they do not distinguish protected attributes from unprotected attributes. Furthermore, Jia et al. [34] consider the prediction step only while ignoring the parameter tuning step, and Li et al. [41] do not consider label flipping. The proposed method, in contrast, considers all of these cases.

I have implemented the proposed method and demonstrated its effectiveness through experimental evaluation. I used six popular datasets from the fairness research literature as benchmarks. The evaluation results show that the proposed method is efficient in analyzing complex arithmetic computations used in the state-of-the-art KNN algorithm, and is accurate enough to obtain fairness certifications for a large number of test inputs. To better understand the impact of historical bias, I also compared the fairness certification success rates across different demographic groups.

To summarize, this paper makes the following contributions:
• I propose an abstract-interpretation-based method for efficiently certifying the fairness of KNN classification results in the presence of dataset bias. The method relies on sound approximations to speed up the analysis of both the parameter tuning and the prediction steps of the state-of-the-art KNN algorithm, and is able to handle three variants of the fairness definition.
• I implement the method and evaluate it on six datasets that are widely used in the fairness literature, to demonstrate the efficiency of the proposed approximation techniques as well as the effectiveness of the proposed method in obtaining sound fairness certifications for a large number of test inputs. The remainder of this paper is organized as follows. I first present the technical background in Section 5.1 and then give an overview of the proposed method in Section 5.2. Next, I present the detailed algorithms for certifying the KNN prediction step in Section 5.3 and certifying the KNN parameter tuning step in Section 5.4. This is followed by the experimental results in Section 5.5. Finally, I summarize this work in Section 5.6. 5.1 Background Let L be a supervised learning algorithm that takes the training dataset T as input and returns a learned model M= L(T) as output. The training set T ={(x,y)} is a set of labeled samples, where each x∈X ⊆ R D has D real-valued attributes, and the y∈Y ⊆ N is a class label. The learned model M :X →Y is a function that returns the classification output y ′ ∈Y for any input x ′ ∈X . 62 5.1.1 Fairness of the Learned Model We are concerned with fairness of the classification output M(x) for an individual input x. LetP be the set of vector indices corresponding to the protected attributes in x∈X . We say that x i is a protected attribute (e.g., race, gender, etc.) if and only if i∈P. Definition 1 (Individual Fairness). For an input x, the classification output M (x) is fair if, for any input x ′ such that (1) x j ̸= x ′ j for some j∈P and (2) x i = x ′ i for all i̸∈P, we have M(x)= M(x ′ ). It means two individuals (x and x ′ ) differing only in some protected attribute (e.g., gender) but agreeing on all other attributes must be treated equally. While being intuitive and useful, this notion of fairness may be too narrow. For example, if two individuals differ in some unprotected attributes and yet the difference is considered immaterial, they must still be treated equally. This can be captured by ε− fairness. Definition 2 (ε-Fairness). For an input x, the classification output M (x) is fair if, for any input x ′ such that (1) x j ̸= x ′ j for some j∈P and (2)|x i − x ′ i |≤ ε for all i̸∈P, we have M(x)= M(x ′ ). In this case, such inputs x ′ form a set. Let∆ ε (x) be the set of all inputs x ′ considered in the ε− fairness definition. That is, ∆ ε (x) :={x ′ | x j ̸= x ′ j for some j∈P,|x i − x ′ i |≤ ε for all i̸∈P}. By requiring M(x)= M(x ′ ) for all x ′ ∈∆ ε (x), ε-fairness guarantees that a larger set of individuals similar to x are treated equally. Individual fairness can be viewed as a special case of ε-fairness, where ε = 0. In contrast, when ε > 0, the number of elements in∆ ε (x) is often large and sometimes infinite. Therefore, the most straightforward approach of certifying fairness by enumerating all possible elements in∆ ε (x) would not work. Instead, any practical solution would have to rely on abstraction. 5.1.2 Fairness in the Presence of Dataset Bias Due to historical bias, the training dataset T may have contained samples whose output are unfairly labeled. Let the number of such samples be bounded by n. We assume that there are no additional 63 clues available to help identify the mislabeled samples. Without knowing which these samples are, fairness certification must consider all of the possible scenarios. Each scenario corresponds to a de-biased dataset, T ′ , constructed by flipping back the incorrect labels in T . 
Let dBias_n(T) = {T'} be the set of these possible de-biased (clean) datasets. Ideally, we want all of them to lead to the same classification output.

Definition 3 (Label-flipping Fairness). For an input x, the classification output M(x) is fair against label-flipping bias of at most n elements in the dataset T if, for all T' ∈ dBias_n(T), we have M'(x) = M(x) where M' = L(T').

Label-flipping fairness differs from and yet complements individual and ε-fairness in the following sense. While individual and ε-fairness guarantee equal output for similar inputs, label-flipping fairness guarantees equal output for similar datasets. Both aspects of fairness are practically important. By combining them, we are able to define the entire problem of certifying fairness in the presence of historical bias.

To understand the complexity of the fairness certification problem, we need to look at the size of the set dBias_n(T), similar to how we have analyzed the size of Δ_ε(x). While the size of dBias_n(T) is always finite, it can be astronomically large in practice. Let q be the number of unique class labels and m be the actual number of flipped elements in T. Assuming that each flipped label may take any of the other q−1 possible labels, the total number of possible clean sets is C(|T|, m) · (q−1)^m for each m, where C(|T|, m) is the binomial coefficient. Since m ≤ n, |dBias_n(T)| = ∑_{m=1}^{n} C(|T|, m) · (q−1)^m. Again, the number of elements in dBias_n(T) is too large to enumerate, which means any practical solution would have to rely on abstraction.

5.2 Overview of the Proposed Method

Given the tuple ⟨T, P, n, ε, x⟩, where T is the training set, P represents the protected attributes, n bounds the number of biased elements in T, and ε bounds the perturbation of x, the proposed method checks if the KNN classification output for x is fair.

Since the proposed method relies on an abstract interpretation of the KNN algorithm, let us first review how the original KNN algorithm operates in the concrete domain. As described in Section 2.2, the KNN algorithm has two phases: the parameter tuning phase and the prediction phase. In the parameter tuning phase, KNN_paratune computes the optimal value of the parameter K using p-fold cross-validation. In the prediction phase, KNN_predict computes the predicted label for an input x as the most frequent label among the K nearest neighbors in the training set T.

Next, I present the proposed method. Algorithm 14 outlines the top-level procedure of the proposed fairness certification method, which first executes the KNN algorithm in the concrete domain (Lines 1-2) to obtain the default K and y, and then proceeds to the analysis in the abstract domain.

Algorithm 14: Proposed method for certifying fairness of KNN for input x.
1: K = KNN_paratune(T);
2: y = KNN_predict(T, K, x);
3: KSet = abs_KNN_paratune(T, n);
4: for each K ∈ KSet do
5:   if abs_KNN_predict_same(T, n, K, x, y) = False then
6:     return unknown;
7:   end if
8: end for
9: return certified;

In the abstract parameter tuning step (Line 3), instead of considering T, the proposed method considers the set of all clean datasets in dBias_n(T) symbolically, to compute the set of possible optimal K values, denoted KSet.

In the abstract prediction step (Lines 4-8), for each K, instead of considering input x, the proposed method considers all perturbed inputs in Δ_ε(x) and all clean datasets in dBias_n(T) symbolically, to check if the classification output always stays the same. The proposed method returns "certified" only when the classification output always stays the same (Line 9); otherwise, it returns "unknown" (Line 6).
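To see concretely why the symbolic, abstract-domain analysis in Algorithm 14 is needed rather than enumeration, the counting formula from Section 5.1.2 can be evaluated directly. The short sketch below is only an illustration: the dataset size and n are taken from Table 5.1, while the assumption of a binary label (q = 2) for the German dataset is mine.

```python
import math

def num_debiased_datasets(t_size: int, q: int, n: int) -> int:
    """|dBias_n(T)| = sum_{m=1}^{n} C(|T|, m) * (q - 1)^m  (Section 5.1.2)."""
    return sum(math.comb(t_size, m) * (q - 1) ** m for m in range(1, n + 1))

# German credit dataset: |T| = 1,000 and n = 10 (Table 5.1), assuming q = 2 labels.
print(num_debiased_datasets(1000, 2, 10))   # roughly 2.7e23 candidate clean datasets
```

Even for a medium-sized dataset, the number of candidate clean datasets is far beyond what any enumerative approach could examine.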
In the next two sections, I present the detailed algorithms for abstracting the prediction step and the parameter tuning step, respectively.

5.3 Abstracting the KNN Prediction Step

I start with abstract KNN prediction, which is captured by the subroutine abs_KNN_predict_same used in Line 5 of Algorithm 14. It consists of two parts. The first part (to be presented in Section 5.3.1) computes a superset of T^K_x, denoted overNN, while considering the impact of ε perturbation of the input x. The second part (to be presented in Section 5.3.2) leverages overNN to decide if the classification output always stays the same, while considering the impact of label-flipping bias in the dataset T.

5.3.1 Finding the K-Nearest Neighbors

To compute overNN, which is a set of samples in T that may be the K nearest neighbors of the test input x, we must be able to compute the distance between x and each sample in T. This is not a problem at all in the concrete domain, since the set of K nearest neighbors of x in T, denoted T^K_x, is fixed and is determined solely by the Euclidean distance between x and each sample in T in the attribute space. However, when ε perturbation is applied to x, the distance changes and, as a result, the K nearest neighbors of x may also change. Fortunately, the distance in the attribute space is not affected by label-flipping bias in the dataset T, since label-flipping only impacts sample labels, not sample attributes. Thus, in this subsection, we only need to consider the impact of ε perturbation of the input x.

5.3.1.1 The Challenge.

Due to ε perturbation, a single test input x becomes a potentially-infinite set of inputs Δ_ε(x). Since the goal is to over-approximate the K nearest neighbors of Δ_ε(x), the expectation is that, as long as there exists some x' ∈ Δ_ε(x) such that a sample input t in T is one of the K nearest neighbors of x', denoted t ∈ T^K_{x'}, then t must be included in the set overNN. That is,

∪_{x' ∈ Δ_ε(x)} T^K_{x'} ⊆ overNN ⊆ T.

However, finding an efficient way of computing overNN is a challenging task. As explained before, the naive approach of enumerating x' ∈ Δ_ε(x), computing the K nearest neighbors, T^K_{x'}, and taking the union of all of them would not work. Instead, we need abstraction that is both efficient and accurate enough in practice.

The proposed solution is that, for each sample t in T, I first analyze the distances between t and all inputs in Δ_ε(x) symbolically, to compute a lower bound and an upper bound of the distances. Then, I leverage these lower and upper bounds to compute the set overNN, which is a superset of the samples in T that may become the K nearest neighbors of Δ_ε(x).

5.3.1.2 Bounding Distance Between Δ_ε(x) and t.

Assume that x = (x_1, x_2, ..., x_D) and t = (t_1, t_2, ..., t_D) are two real-valued vectors in the D-dimensional attribute space. Let ε = (ε_1, ε_2, ..., ε_D), where ε_i ≥ 0, be the small perturbation. Thus, the perturbed input is x' = (x'_1, x'_2, ..., x'_D) = (x_1 + δ_1, x_2 + δ_2, ..., x_D + δ_D), where δ_i ∈ [−ε_i, ε_i] for all i = 1, ..., D.

The distance between x and t is a fixed value d(x, t) = sqrt(∑_{i=1}^{D} (x_i − t_i)^2), since both x and the samples t in T are fixed, but the distance between x' ∈ Δ_ε(x) and t is a function of δ_i ∈ [−ε_i, ε_i], since sqrt(∑_{i=1}^{D} (x'_i − t_i)^2) = sqrt(∑_{i=1}^{D} (x_i − t_i + δ_i)^2). For ease of presentation, I define the distance as d^ε = sqrt(∑_{i=1}^{D} d^ε_i), where d^ε_i = (x_i − t_i + δ_i)^2 is the (squared) distance function in the i-th dimension.
Then, the goal becomes computing the lower bound, LB(d^ε), and the upper bound, UB(d^ε), in the domain δ_i ∈ [−ε_i, ε_i] for all i = 1, ..., D.

5.3.1.3 Distance Bounds are Compositional.

The first observation is that bounds on the distance d^ε as a whole can be computed using bounds in the individual dimensions. To see why this is the case, consider the (squared) distance in the i-th dimension, d^ε_i = (x_i − t_i + δ_i)^2, where δ_i ∈ [−ε_i, ε_i], and the (squared) distance in the j-th dimension, d^ε_j = (x_j − t_j + δ_j)^2, where δ_j ∈ [−ε_j, ε_j]. By definition, d^ε_i is completely independent of d^ε_j when i ≠ j.

Thus, the lower bound of d^ε, denoted LB(d^ε), can be calculated by finding the lower bound of each d^ε_i in the i-th dimension. Similarly, the upper bound of d^ε, denoted UB(d^ε), can also be calculated by finding the upper bound of each d^ε_i in the i-th dimension. That is,

LB(d^ε) = sqrt(∑_{i=1}^{D} LB(d^ε_i))  and  UB(d^ε) = sqrt(∑_{i=1}^{D} UB(d^ε_i)).

5.3.1.4 Four Cases in Each Dimension.

The second observation is that, by utilizing the mathematical nature of the (squared) distance function, we can calculate the minimum and maximum values of d^ε_i, which can then be used as the lower bound LB(d^ε_i) and upper bound UB(d^ε_i), respectively.

Specifically, in the i-th dimension, the (squared) distance function d^ε_i = ((x_i − t_i) + δ_i)^2 may be rewritten as (δ_i + A)^2, where A = (x_i − t_i) is a constant and δ_i ∈ [−ε_i, +ε_i] is a variable. The function can be plotted in two-dimensional space, using δ_i as the x-axis and the output of the function as the y-axis; thus, it is a quadratic function Y = (X + A)^2. Fig. 5.2 shows the plot, which reminds us of where the minimum and maximum values of a quadratic function lie. There are two versions of the quadratic function, depending on whether A > 0 (corresponding to the two subfigures at the top) or A < 0 (corresponding to the two subfigures at the bottom). Each version also has two cases, depending on whether the perturbation interval [−ε_i, ε_i] falls inside the constant interval [−|A|, |A|] (corresponding to the two subfigures on the left) or falls outside (corresponding to the two subfigures on the right). Thus, there are four cases in total. In each case, the maximal and minimal values of the quadratic function are different, as shown by the LB and UB marks in Fig. 5.2.

Figure 5.2 (plots omitted; panels (a)-(d)): Four cases for computing the upper and lower bounds of the distance function d^ε_i(δ_i) = (δ_i + A)^2 for δ_i ∈ [−ε_i, ε_i]. In these figures, δ_i is the x-axis, d^ε_i is the y-axis, LB denotes LB(d^ε_i), and UB denotes UB(d^ε_i).

Case (a) This is when (x_i − t_i) > 0 and −ε_i > −(x_i − t_i), which is the same as saying A > 0 and −ε_i > −A. In this case, the function d^ε_i(δ_i) = (δ_i + A)^2 is monotonically increasing w.r.t. the variable δ_i ∈ [−ε_i, +ε_i]. Thus, LB(d^ε_i) = (−ε_i + (x_i − t_i))^2 and UB(d^ε_i) = (+ε_i + (x_i − t_i))^2.

Case (b) This is when (x_i − t_i) > 0 and −ε_i < −(x_i − t_i), which is the same as saying A > 0 and −ε_i < −A. In this case, the function is not monotonic. The minimal value is 0, obtained when δ_i = −A. The maximal value is obtained when δ_i = +ε_i. Thus, LB(d^ε_i) = 0 and UB(d^ε_i) = (+ε_i + (x_i − t_i))^2.

Case (c) This is when (x_i − t_i) < 0 and ε_i < −(x_i − t_i), which is the same as saying A < 0 and ε_i < −A. In this case, the function is monotonically decreasing w.r.t. the variable δ_i ∈ [−ε_i, ε_i]. Thus, LB(d^ε_i) = (ε_i + (x_i − t_i))^2 and UB(d^ε_i) = (−ε_i + (x_i − t_i))^2.

Case (d) This is when (x_i − t_i) < 0 and ε_i > −(x_i − t_i), which is the same as saying A < 0 and ε_i > −A. In this case, the function is not monotonic. The minimal value is 0, obtained when δ_i = −A. The maximal value is obtained when δ_i = −ε_i. Thus, LB(d^ε_i) = 0 and UB(d^ε_i) = (−ε_i + (x_i − t_i))^2.

Summary By combining the above four cases, we compute the bounds of the entire distance function d^ε as follows:

[ sqrt(∑_{i=1}^{D} max(|x_i − t_i| − ε_i, 0)^2),  sqrt(∑_{i=1}^{D} (|x_i − t_i| + ε_i)^2) ]

Here, the take-away message is that, since x_i, t_i, and ε_i are all fixed values, the upper and lower bounds can be computed in constant time, even though there is a potentially-infinite number of inputs in Δ_ε(x).
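The closed form above translates directly into a few lines of code. The sketch below is only an illustration of that formula, not the tool's implementation; the function name and the use of NumPy are my own choices.

```python
import numpy as np

def distance_bounds(x: np.ndarray, t: np.ndarray, eps: np.ndarray):
    """Bounds on the distance between t and any x' with |x'_i - x_i| <= eps_i.

    Returns (LB, UB) following the summary formula:
      LB = sqrt(sum_i max(|x_i - t_i| - eps_i, 0)^2)
      UB = sqrt(sum_i (|x_i - t_i| + eps_i)^2)
    """
    gap = np.abs(x - t)
    lb = np.sqrt(np.sum(np.maximum(gap - eps, 0.0) ** 2))
    ub = np.sqrt(np.sum((gap + eps) ** 2))
    return lb, ub

# Sanity check: with eps = 0 the two bounds collapse to the ordinary distance d(x, t).
x = np.array([1.0, 2.0, 3.0])
t = np.array([2.0, 0.0, 3.5])
print(distance_bounds(x, t, np.zeros(3)))       # both equal sqrt(5.25)
print(distance_bounds(x, t, np.full(3, 0.1)))   # the interval widens under perturbation
```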
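For illustration, the selection rule just proved sound can be sketched in a few lines; the function works on precomputed bound arrays (for example, produced by the distance_bounds sketch above), the function name over_nn is mine, and the numbers in the usage example are taken from the running example above.

```python
import numpy as np

def over_nn(lbs: np.ndarray, ubs: np.ndarray, k: int) -> np.ndarray:
    """Indices of training samples that may be among the K nearest neighbors
    of some perturbed input x' in Delta_eps(x).

    lbs[i] and ubs[i] are LB(d_eps(x, t_i)) and UB(d_eps(x, t_i)) for sample t_i.
    A sample is kept iff its lower bound does not exceed UB_Kmin,
    the K-th smallest upper bound (Section 5.3.1.5).
    """
    ub_kmin = np.sort(ubs)[k - 1]            # K-th minimum upper bound
    return np.where(lbs <= ub_kmin)[0]

# The running example: five samples with the bounds listed above, K = 3.
lbs = np.array([25.4, 30.1, 35.3, 37.2, 85.5])
ubs = np.array([29.4, 34.1, 39.3, 41.2, 90.5])
print(over_nn(lbs, ubs, k=3))                # [0 1 2 3] -- only t_5 is excluded
```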
5.3.2 Checking the Classification Result

Next, we try to certify that, regardless of which of the K elements are selected from overNN, the prediction result obtained using them is always the same. The prediction label is affected by both ε perturbation of the input x and label-flipping bias in the dataset T. Since ε perturbation affects which points are identified as the K nearest neighbors, and its impact has been accounted for by overNN, from now on, we focus only on label-flipping bias in T.

Algorithm 15: Subroutine abs_same_label(overNN, K, y).
1: Let S be the subset of overNN obtained by removing all y-labeled elements;
2: Let y' = Freq(S), and #y' be the count of y'-labeled elements in S;
3: if #y' < K − |S| − 2*n then
4:   return True;
5: return False;

The proposed method is shown in Algorithm 15, which takes the set overNN, the parameter K, and the expected label y as input, and checks if it is possible to find a subset of overNN with size K whose most frequent label differs from y. If such a "bad" subset cannot be found, we say that KNN prediction always returns the same label.

To try to find such a "bad" subset of overNN, we first remove all elements labeled with y from overNN, to obtain the set S (Line 1). After that, there are two cases to consider.

1. If the size of S is equal to or greater than K, then any subset of S with size K must have a different label because it will not contain any element labeled with y. Thus, the condition in Line 3 of Algorithm 15 is not satisfied (#y' is positive while the right-hand side is not), and the procedure returns False.

2. If the size of S, denoted |S|, is smaller than K, the most likely "bad" subset will be S_K = S ∪ { any (K − |S|) y-labeled elements from overNN }. In this case, we need to check if the most frequent label in S_K is y or not. In S_K, the most frequent label must be either y (whose count is K − |S|) or y' (which is the most frequent label in S, with the count #y'). Moreover, since we can flip up to n labels, we can flip n elements from label y to label y'. Therefore, to check if the proposed method should return True, meaning the prediction result is guaranteed to be the same as label y, we only need to compare K − |S| with #y' + 2*n. This is checked using the condition in Line 3 of Algorithm 15.

5.4 Abstracting the KNN Parameter Tuning Step

In this section, I present the proposed method for abstracting the parameter tuning step, which computes the optimal K value based on T and the impact of flipping at most n labels. The output is a superset of possible optimal K values, denoted KSet.

Algorithm 16 shows the proposed method, which takes the training set T and the parameter n as input, and returns KSet as output. To be sound, we require KSet to include any candidate k value that may become the optimal K for some clean set T' ∈ dBias_n(T).

Algorithm 16: Subroutine abs_KNN_paratune(T, n).
1: for each candidate k value do
2:   Let {G_i} = a partition of T into p groups of roughly equal size;
3:   errUB^k_i = {(x,y) ∈ G_i | abs_may_err(T \ G_i, n, k, x, y) = true} for each G_i;
4:   errLB^k_i = {(x,y) ∈ G_i | abs_must_err(T \ G_i, n, k, x, y) = true} for each G_i;
5:   UB_k = (1/p) ∑_{i=1}^{p} |errUB^k_i| / |G_i|;
6:   LB_k = (1/p) ∑_{i=1}^{p} |errLB^k_i| / |G_i|;
7: end for
8: Let minUB = the minimum of UB_k over all candidate k values;
9: return KSet = { k | LB_k ≤ minUB };

In Algorithm 16, the proposed method first computes the lower and upper bounds of the classification error for each k value, denoted LB_k and UB_k, as shown in Lines 5-6. Next, it computes minUB, which is the minimal upper bound for all candidate k values (Line 8). Finally, by comparing minUB with LB_k for each candidate k value, the proposed method decides whether this candidate k value should be put into KSet (Line 9).

I will explain the steps needed to compute LB_k and UB_k in the remainder of this section. For now, assuming that they are available, I explain how they are used to compute KSet.

Example Given the candidate k values k_1, k_2, k_3, k_4 and their error bounds [0.1, 0.2], [0.1, 0.3], [0.3, 0.4], [0.3, 0.5], the smallest upper bound is minUB = 0.2. By comparing minUB with the lower bounds, we compute KSet = {k_1, k_2}, since only LB_{k_1} and LB_{k_2} are lower than or equal to minUB.
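The selection in Lines 8-9 of Algorithm 16 is easy to state in code. The sketch below is only an illustration: it assumes the error bounds have already been computed (Lines 1-7 are elided), and the concrete candidate values 1, 3, 5, 7 merely stand in for k_1, ..., k_4 from the example.

```python
def select_kset(bounds: dict) -> set:
    """Candidate k values that may achieve the smallest cross-validation error.

    bounds maps each candidate k to its (LB_k, UB_k) pair.  A candidate is kept
    iff its lower bound does not exceed minUB, the smallest upper bound over
    all candidates (Lines 8-9 of Algorithm 16).
    """
    min_ub = min(ub for _, ub in bounds.values())
    return {k for k, (lb, _) in bounds.items() if lb <= min_ub}

# The running example: four candidates with bounds [0.1,0.2], [0.1,0.3], [0.3,0.4], [0.3,0.5].
bounds = {1: (0.1, 0.2), 3: (0.1, 0.3), 5: (0.3, 0.4), 7: (0.3, 0.5)}
print(select_kset(bounds))   # {1, 3}, mirroring KSet = {k_1, k_2}
```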
Soundness Proof Here we prove that any k' ∉ KSet cannot result in the smallest classification error. Assume that k_s is the candidate k value that has the minimal upper bound (minUB), and err_{k_s} is its actual classification error. By definition, we have err_{k_s} ≤ minUB. Meanwhile, for any k' ∉ KSet, we have LB_{k'} > minUB, and the actual classification error satisfies err_{k'} ≥ LB_{k'}. Combining the two cases, we have err_{k'} > minUB ≥ err_{k_s}. Here, err_{k'} > err_{k_s} means that k' cannot result in the smallest classification error.

5.4.1 Overapproximating the Classification Error

To compute the upper bound errUB^k_i defined in Line 3 of Algorithm 16, we use the subroutine abs_may_err to check if (x, y) ∈ G_i may be misclassified when using T \ G_i as the training set.

Algorithm 17 shows the implementation of the subroutine, which checks, for a sample (x, y), whether it is possible to obtain a set S by flipping at most n labels in T^K_x such that the most frequent label in S is not y. If it is possible to obtain such a set S, we conclude that the prediction label for x may be an error.

Algorithm 17: Subroutine abs_may_err(T, n, K, x, y).
1: Let y' be, among the non-y labels, the label with the highest count in T^K_x;
2: Let #y be the number of elements in T^K_x with the y label;
3: Let n' be min(n, #y);
4: Change n' elements in T^K_x from label y to label y';
5: return Freq(T^K_x) ≠ y;

The condition Freq(T^K_x) ≠ y, computed on T^K_x after the y label of n' elements is changed to the y' label, is a sufficient condition under which the prediction label for x may be an error. The rationale is as follows. In order to make the most frequent label in the set T^K_x different from y, we need to focus on the label most likely to become the new most frequent label. It is the label y' (≠ y) with the highest count in the current T^K_x. Therefore, Algorithm 17 checks whether y' can become the most frequent label by changing at most n elements in T^K_x from label y to label y' (Lines 3-5).

5.4.2 Underapproximating the Classification Error

To compute the lower bound errLB^k_i defined in Line 4 of Algorithm 16, we use the subroutine abs_must_err to check if (x, y) ∈ G_i must be misclassified when using T \ G_i as the training set.

Algorithm 18 shows the implementation of the subroutine, which checks, for a sample (x, y), whether it is impossible to obtain a set S by flipping at most n labels in T^K_x such that the most frequent label in S is y. In other words, is it impossible to avoid the classification error? If it is impossible to avoid the classification error, the prediction label must be erroneous, and thus the procedure returns True. In this sense, all samples in errLB^k_i (computed in Line 4 of Algorithm 16) are guaranteed to be misclassified.

Algorithm 18: Subroutine abs_must_err(T, n, K, x, y).
1: if ∃ S obtained from T^K_x by flipping up to n labels such that Freq(S) = y then
2:   return False;
3: return True;

The challenge in Algorithm 18 is to check if such a set S can be constructed from T^K_x. The intuition is that, to make y the most frequent label, we should flip the labels of non-y elements to label y. Let us consider two examples first.

Example 1 Given the label counts of T^K_x, denoted {l_1 * 4, l_2 * 4, l_3 * 2}, meaning that 4 elements are labeled l_1, 4 elements are labeled l_2, and 2 elements are labeled l_3. Assume that n = 2 and y = l_3. Since we can flip at most 2 elements, we choose to flip one l_1 → l_3 and one l_2 → l_3, to get a set S = {l_1 * 3, l_2 * 3, l_3 * 4}.

Example 2 Given the label counts of T^K_x, denoted {l_1 * 5, l_2 * 3, l_3 * 2}, n = 2, and y = l_3. We can flip two l_1 → l_3 to get a set S = {l_1 * 3, l_2 * 3, l_3 * 4}.
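Both examples follow the same pattern: keep flipping an element of the currently largest non-y label to y until y becomes strictly the most frequent label or the budget n runs out. The sketch below is my own greedy illustration of this existence check, not the dissertation's procedure; the next subsection formulates the check precisely as a small linear program, and Algorithm 18 returns True (a guaranteed misclassification) exactly when no such set S exists.

```python
from collections import Counter

def can_make_most_frequent(counts: Counter, y, n: int) -> bool:
    """Greedy illustration: can label y become strictly the most frequent label
    in T^K_x by flipping at most n elements of other labels to y?"""
    counts = Counter(counts)                       # work on a copy
    for _ in range(n):
        rivals = {l: c for l, c in counts.items() if l != y and c > 0}
        if not rivals or counts[y] > max(rivals.values()):
            break                                  # y already wins, or nothing left to flip
        top = max(rivals, key=rivals.get)          # strongest competitor
        counts[top] -= 1                           # flip one element from top ...
        counts[y] += 1                             # ... to y
    rival_counts = [c for l, c in counts.items() if l != y]
    return not rival_counts or counts[y] > max(rival_counts)

# Example 1: {l1*4, l2*4, l3*2}, y = l3, n = 2  ->  True (flip one l1 and one l2)
print(can_make_most_frequent(Counter({"l1": 4, "l2": 4, "l3": 2}), "l3", 2))
# Example 2: {l1*5, l2*3, l3*2}, y = l3, n = 2  ->  True (flip two l1)
print(can_make_most_frequent(Counter({"l1": 5, "l2": 3, "l3": 2}), "l3", 2))
```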
5.4.2.1 The LP Problem

The question is how to decide whether the set S (defined in Line 1 of Algorithm 18) exists. We can formulate it as a linear programming (LP) problem.

The LP problem has two constraints. The first one is defined as follows: Let y be the expected label and l_i ≠ y be another label, where i = 1, ..., q and q is the total number of class labels (e.g., in the above two examples, q = 3). Let #y be the number of elements in T^K_x that have the y label. Similarly, let #l_i be the number of elements with the l_i label. Assume that a set S as defined in Algorithm 18 exists; then all of the labels l_i ≠ y must satisfy

#l_i − #flip_i < #y + ∑_{j=1}^{q} #flip_j ,   (5.1a)

where #flip_i is a variable representing the number of l_i-to-y flips. Thus, in the above formula, the left-hand side is the count of l_i after flipping, and the right-hand side is the count of y after flipping. Since y is the most frequent label in S, y should have a higher count than any other label.

The second constraint is

∑_{i=1}^{q} #flip_i ≤ n ,   (5.2a)

which says that the total number of label flips is bounded by the parameter n.

Since the number of class labels (q) is often small (from 2 to 10), this LP problem can be solved quickly. However, the LP problem must be solved |T| times, where |T| may be as large as 50,000. To avoid invoking the LP solver unnecessarily, I propose two easy-to-check conditions. They are necessary conditions in that, if either of them is violated, the set S does not exist. Thus, we invoke the LP solver only if both conditions are satisfied.

5.4.2.2 Necessary Conditions

The first condition is derived from Formula (5.1a), by adding up the two sides of the inequality constraint for all labels l_i ≠ y. The resulting condition is

( ∑_{l_i ≠ y} #l_i ) − ( ∑_{j=1}^{q} #flip_j ) < (q − 1) #y + (q − 1) ( ∑_{j=1}^{q} #flip_j ).

The second condition requires that, in S, label y has a higher count (after flipping) than any other label, including the label l_p ≠ y with the highest count in the current T^K_x. The resulting condition is (#l_p − #y)/2 < n, since only when this condition is satisfied is it possible for y to have a higher count than l_p, by flipping at most n elements with the label l_p to y.

Table 5.1: Statistics of the datasets used in the experimental evaluation.

Dataset | Description | Size |T| | # Attr. | Protected Attr. | Parameters ε and n
Salary  | salary level [61]         | 52     | 4  | Gender      | ε = 1% of attribute range, n = 1
Student | academic performance [13] | 649    | 30 | Gender      | ε = 1% of attribute range, n = 1
German  | credit risk [21]          | 1,000  | 20 | Gender      | ε = 1% of attribute range, n = 10
Compas  | recidivism risk [18]      | 10,500 | 16 | Race+Gender | ε = 1% of attribute range, n = 10
Default | loan default risk [67]    | 30,000 | 36 | Gender      | ε = 1% of attribute range, n = 50
Adult   | earning power [21]        | 48,842 | 14 | Race+Gender | ε = 1% of attribute range, n = 50

5.5 Experiments

I have implemented the proposed method as a software tool written in Python using the scikit-learn machine learning library. I evaluated the tool on six datasets that are widely used in the fairness research literature.

5.5.1 Datasets

Table 5.1 shows the statistics of each dataset, including the name, a short description, the size (|T|), the number of attributes, the protected attributes, and the parameters ε and n. The value of ε is set to 1% of the attribute range.
The bias parameter n is set to 1 for small datasets, 10 for medium datasets, and 50 for large datasets. The protected attributes include Gender for all six datasets, and Race for two datasets, Compas and Adult, which are consistent with known biases in these datasets.

In preparation for the experimental evaluation, I have employed state-of-the-art techniques in the machine learning literature to preprocess and balance the datasets for KNN, including encoding, standard scaling, k-bins-discretizer, downsampling and upweighting.

5.5.2 Methods

For comparison purposes, I implemented six variants of the proposed method, by enabling or disabling the ability to certify label-flipping fairness, the ability to certify individual fairness, and the ability to certify ε-fairness.

For all variants except ε-fairness, I also implemented the naive approach of enumerating all T' ∈ dBias_n(T). Since the naive approach does not rely on approximation, its result can be regarded as the ground truth (i.e., whether the classification output for an input x is truly fair). The goal is to obtain the ground truth on small datasets, and use it to evaluate the accuracy of the proposed method. However, as explained before, enumeration does not work for ε-fairness, since the number of inputs in Δ_ε(x) is infinite.

The experiments were conducted on a computer with a 2 GHz Quad-Core Intel Core i5 CPU and 16 GB of memory. The experiments were designed to answer two questions. First, is the proposed method efficient and accurate enough in handling popular datasets in the fairness literature? Second, does the proposed method help us gain insights? For example, it would be interesting to know whether a decision made on an individual from a protected minority group is more (or less) likely to be certified as fair.

5.5.3 Results on Efficiency and Accuracy

I first evaluate the efficiency and accuracy of the proposed method. For the two small datasets, Salary and Student, I was able to obtain the ground truth using the naive enumeration approach, and then compare it with the result of the proposed method. The objective was to know the extent of deviation between the results of the proposed method and the ground truth.

Table 5.2: Results for certifying label-flipping and individual fairness (gender), for which the ground truth is obtained by naive enumeration and compared with the proposed method.

Certifying label-flipping fairness:
Name    | Ground truth, Time | Proposed method, Time | Accuracy | Speedup
Salary  | 50.0%, 1.7s        | 33.3%, 0.2s           | 66.7%    | 8.5X
Student | 70.8%, 23.0s       | 60.0%, 0.2s           | 84.7%    | 115X

Certifying label-flipping + individual fairness:
Name    | Ground truth, Time | Proposed method, Time | Accuracy | Speedup
Salary  | 33.3%, 1.5s        | 33.3%, 0.2s           | 100%     | 7.5X
Student | 58.5%, 25.2s       | 44.6%, 0.2s           | 76.2%    | 116X

Table 5.2 shows the results obtained by treating Gender as the protected attribute. Column 1 shows the name of the dataset. The first block compares the naive approach (ground truth) and the proposed method in certifying label-flipping fairness, and the second block compares them in certifying label-flipping plus individual fairness.

Table 5.3: Results for certifying label-flipping, individual, and ε-fairness by the proposed method.
Name             | Label-flipping fairness, Time | + Individual fairness, Time | + ε-fairness, Time
Salary (gender)  | 33.3%, 0.2s | 33.3%, 0.2s | 33.3%, 0.2s
Student (gender) | 60.0%, 0.2s | 44.6%, 0.2s | 32.3%, 0.2s
German (gender)  | 48.0%, 0.2s | 44.0%, 0.3s | 43.0%, 0.2s
Compas (race)    | 95.0%, 0.3s | 63.4%, 1.4s | 59.4%, 1.1s
Compas (gender)  | 95.0%, 0.3s | 62.4%, 1.3s | 60.4%, 1.0s
Default (gender) | 83.2%, 2.3s | 73.3%, 4.4s | 64.4%, 3.5s
Adult (race)     | 76.2%, 2.2s | 65.3%, 4.5s | 53.5%, 5.3s
Adult (gender)   | 76.2%, 2.2s | 52.5%, 3.5s | 43.6%, 3.3s

Based on the results in Table 5.2, we conclude that the accuracy of the proposed method is high (81.9% on average) despite its aggressive use of abstraction to reduce the computational cost. The proposed method is also 7.5X to 116X faster than the naive approach. Furthermore, the larger the dataset, the higher the speedup.

5.5.4 Results on the Certification Rates

I now present the success rates of the proposed certification method for the three variants of fairness. Table 5.3 shows the results for label-flipping fairness, label-flipping plus individual fairness (denoted + Individual fairness), and label-flipping plus ε-fairness (denoted + ε-fairness). For each variant of fairness, I show the percentage of test inputs that are certified to be fair, together with the average certification time (per test input). In all six datasets, Gender was treated as the protected attribute. In addition, Race was treated as the protected attribute for Compas and Adult.

From the results, we can see that, as a more stringent fairness standard is used, the certified percentage either stays the same (as in Salary) or decreases (as in Student). This is consistent with what we expect, since the classification output is required to stay the same for an increasingly larger number of scenarios. For Compas (race), in particular, adding ε-fairness on top of label-flipping plus individual fairness causes the certified percentage to drop from 63.4% to 59.4%.

Nevertheless, the proposed method still maintains a high certification percentage. Recall that, for Salary, the 33.3% certification rate (for +Individual fairness) is actually 100% accurate according to the comparison with the ground truth in Table 5.2, while for Student the 44.6% certification rate (for +Individual fairness) is actually 76.2% accurate. Furthermore, the efficiency of the proposed method is high: for Adult, which has nearly 50,000 samples in the training set, the average certification time of the proposed method remains within a few seconds.

5.5.5 Results on Demographic Groups

Table 5.4 shows the certified percentage of each demographic group, when both label-flipping and ε-fairness are considered, and both Race and Gender are treated as protected attributes. The four demographic groups are (1) White Male, (2) White Female, (3) Other Male, and (4) Other Female. For each group, I show the certified percentage obtained by the proposed method. In addition, I show the weighted averages for White and Other, as well as the weighted averages for Male and Female.

Table 5.4: Results for certifying label-flipping + ε-fairness with both Race and Gender as protected attributes.

(a) Compas
        | White | Other | Wt. Avg
Male    | 55.6% | 49.3% | 50.0%
Female  | 100%  | 61.1% | 66.7%
Wt. Avg | 66.7% | 51.7% | 53.5%

(b) Adult
        | White | Other | Wt. Avg
Male    | 35.3% | 33.3% | 35.1%
Female  | 33.3% | 66.7% | 37.0%
Wt. Avg | 34.8% | 44.4% | 35.6%

For Compas, White Female has the highest certified percentage (100%) while Other Male has the lowest certified percentage (49.3%); here, the classification output represents the recidivism risk.
For Adult, Other Female has the highest certified percentage (66.7%) while the other three groups have certified percentages in the range of 33.3%-35.3%.

The differences may be attributed to two sources, one of which is technical and the other is social. The social reason is related to historical bias, which is well documented for these datasets. If the actual percentages (ground truth) are different, the percentages reported by the proposed method will also be different. The technical reason is related to the nature of the KNN algorithm itself, which I explain as follows.

In these datasets, some demographic groups have significantly more samples than others. In KNN, the lowest occurring group may have a limited number of close neighbors. Thus, for each test input x from this group, its K nearest neighbors tend to have a larger radius in the input vector space. As a result, the impact of ε perturbation on x will be smaller, resulting in fewer changes to its K nearest neighbors. That may be one of the reasons why, in Table 5.4, the lowest occurring groups, White Female in Compas and Other Female in Adult, have significantly higher certified percentages than the other groups.

5.5.6 Caveat

The proposed work should not be construed as an endorsement or criticism of the use of machine learning techniques in socially sensitive applications. Instead, it is an effort to develop methods and tools that help improve the understanding of these techniques.

5.6 Summary

I have presented a method for certifying the individual and ε-fairness of the classification output of the KNN algorithm, under the assumption that the training dataset may have historical bias. The proposed method relies on abstract interpretation to soundly approximate the arithmetic computations in the parameter tuning and prediction steps. The experimental evaluation shows that the method is efficient in handling popular datasets from the fairness research literature and accurate enough in obtaining certifications for a large number of test data. While this paper focuses on KNN only, as future work, the proposed method can be extended to other machine learning models.

Chapter 6
Conclusions

Data poisoning presents a considerable threat to the safety and integrity of software systems reliant on KNNs. However, analyzing the data poisoning robustness of KNNs is notoriously challenging. To address the challenge, in this dissertation I have designed and implemented a set of novel methods for analyzing, both efficiently and accurately, the data-poisoning robustness of KNNs.

In Chapter 3, I have presented a method for certifying KNN data-poisoning robustness accurately and efficiently. To the best of my knowledge, the proposed method is the only one that can formally certify n-poisoning robustness of the entire KNN algorithm, including both the parameter tuning phase and the prediction phase. The proposed method relies on novel abstract interpretation techniques to exhaustively cover the often astronomically large number of possible scenarios while aggressively pruning the redundant scenarios. The experimental evaluation shows the accuracy and efficiency of these techniques in handling both small and large supervised-learning datasets.

In Chapter 4, I have presented a method for falsifying KNN data-poisoning robustness accurately and efficiently. To the best of my knowledge, this is the only method available for falsifying the complete KNN system, including both the parameter tuning and the prediction phases.
The proposed method relies on novel techniques that first narrow down the search space using over- approximated analysis in the abstract domain, and then find violations using random sampling in the concrete domain. The experimental evaluation shows the accuracy and efficiency of these techniques in handling popular supervised-learning datasets. 82 In Chapter 5, I have presented a method for certifying KNN fairness accurately and efficiently, under the assumption that the training dataset may have historical bias. To the best of my knowl- edge, this is the first method for KNN fairness certification in the presence of dataset bias. The proposed method relies on abstract interpretation to soundly approximate the arithmetic computa- tions in the parameter tuning and prediction steps. The experimental evaluation shows the accuracy and efficiency of these techniques in handling popular datasets from the fairness research literature. These results demonstrate that a sound formal analysis framework can indeed help analyze (both efficiently and accurately) the robustness of KNNs under data poisoning. 83 References [1] David Adedayo Adeniyi, Zhaoqiang Wei, and Y Yongquan. Automated web usage data mining and recommendation system using k-nearest neighbor (knn) classification method. Applied Computing and Informatics, 12(1):90–108, 2016. [2] Aws Albarghouthi, Loris D’Antoni, and Samuel Drews. Repairing decision-making pro- grams under uncertainty. In International Conference on Computer Aided Verification , pages 181–200. Springer, 2017. [3] Moa Andersson and Lisa Tran. Predicting movie ratings using knn, 2020. [4] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, and Jorge Luis Reyes-Ortiz. A public domain dataset for human activity recognition using smartphones. In Esann, volume 3, page 3, 2013. [5] Dara Bahri, Heinrich Jiang, and Maya Gupta. Deep k-nn for noisy labels. In International Conference on Machine Learning, pages 540–550. PMLR, 2020. [6] Gilles Barthe, Pedro R. D’Argenio, and Tamara Rezk. Secure information flow by self- composition. In 17th IEEE Computer Security Foundations Workshop, (CSFW-17 2004), 28-30 June 2004, Pacific Grove, CA, USA , pages 100–114. IEEE Computer Society, 2004. [7] Battista Biggio, Igino Corona, Giorgio Fumera, Giorgio Giacinto, and Fabio Roli. Bagging classifiers for fighting poisoning attacks in adversarial classification tasks. In International workshop on multiple classifier systems , pages 350–359. Springer, 2011. [8] Battista Biggio, Blaine Nelson, and Pavel Laskov. Poisoning attacks against support vector machines. arXiv preprint arXiv:1206.6389, 2012. [9] Battista Biggio, Konrad Rieck, Davide Ariu, Christian Wressnegger, Igino Corona, Giorgio Giacinto, and Fabio Roli. Poisoning behavioral malware clustering. In Proceedings of the 2014 workshop on artificial intelligent and security workshop , pages 27–36, 2014. [10] Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Advances in neural information processing systems, 29, 2016. [11] Swarat Chaudhuri, Sumit Gulwani, and Roberto Lublinerman. Continuity and robustness of programs. Commun. ACM, 55(8):107–115, 2012. 84 [12] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526, 2017. [13] Paulo Cortez and Alice Maria Gonc ¸alves Silva. 
Abstract
As machine learning techniques continue to gain prominence in software systems, ensuring their security has become a crucial software engineering concern. Data poisoning is an emerging security risk in which attackers compromise machine learning models by contaminating their training data. This attack poses a significant threat to the safety and integrity of software systems that rely on machine learning. However, formally analyzing data-poisoning robustness remains a challenging task.
I designed and implemented a set of formal methods for analyzing, both efficiently and accurately, the data-poisoning robustness of the k-nearest neighbors (KNN) algorithm, a widely used supervised machine learning technique. First, I developed a method for certifying the data-poisoning robustness of KNN by soundly overapproximating both the parameter-tuning and prediction phases of the KNN algorithm. Second, I developed a method for falsifying the data-poisoning robustness of KNN by quickly detecting truly non-robust cases using search-space pruning and sampling. Finally, I extended these methods to fairness certification, allowing for a more comprehensive analysis of the robustness of KNN. Experimental evaluations demonstrate that the proposed methods are both efficient and accurate in solving these problems.
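To make the underlying robustness question concrete, the Python sketch below gives a naive, brute-force check of whether a single KNN prediction is robust to the removal of up to n training points. This is only an illustrative baseline under an assumed removal-only poisoning model, not the certification or falsification methods of this dissertation; the names knn_predict and is_n_removal_robust and the tiny dataset are hypothetical. The enumeration grows exponentially with n, which is exactly the blow-up that sound overapproximation (for certification) and search-space pruning with sampling (for falsification) are intended to sidestep.

# Illustrative only: brute-force n-removal robustness check for one KNN prediction.
from collections import Counter
from itertools import combinations
import math

def knn_predict(train, x, k):
    """Return the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def is_n_removal_robust(train, x, k, n):
    """True if no removal of at most n training points changes the prediction on x."""
    baseline = knn_predict(train, x, k)
    for r in range(1, n + 1):
        for removed in combinations(range(len(train)), r):
            removed = set(removed)
            reduced = [p for i, p in enumerate(train) if i not in removed]
            if knn_predict(reduced, x, k) != baseline:
                return False  # a poisoning of size <= n flips the prediction
    return True

# Hypothetical 2-D training set: (feature vector, label) pairs.
train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(is_n_removal_robust(train, x=(0.15, 0.15), k=3, n=1))  # expected: True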
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Learning logical abstractions from sequential data
Side-channel security enabled by program analysis and synthesis
Scalable optimization for trustworthy AI: robust and fair machine learning
Differential verification of deep neural networks
Security-driven design of logic locking schemes: metrics, attacks, and defenses
Automated repair of layout accessibility issues in mobile applications
Robust causal inference with machine learning on observational data
Attacks and defense on privacy of hardware intellectual property and machine learning
Assume-guarantee contracts for assured cyber-physical system design under uncertainty
Graph machine learning for hardware security and security of graph machine learning: attacks and defenses
Data-driven and logic-based analysis of learning-enabled cyber-physical systems
Neighborhood and graph constructions using non-negative kernel regression (NNK)
Identifying and mitigating safety risks in language models
Improving binary program analysis to enhance the security of modern software systems
Custom hardware accelerators for boolean satisfiability
Optimization strategies for robustness and fairness
Robust and adaptive online reinforcement learning
Sample-efficient and robust neurosymbolic learning from demonstrations
Efficient machine learning techniques for low- and high-dimensional data sources
Hybrid methods for robust image matching and its application in augmented reality
Asset Metadata
Creator
Li, Yannan (author)
Core Title
Formal analysis of data poisoning robustness of K-nearest neighbors
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Computer Science
Degree Conferral Date
2023-05
Publication Date
05/09/2023
Defense Date
04/25/2023
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
AI safety, data poisoning, Fairness, formal methods, k nearest neighbors, machine learning, OAI-PMH Harvest, robustness
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Wang, Chao (committee chair), Nuzzo, Pierluigi (committee member), Raghothaman, Mukund (committee member)
Creator Email
hi.yannan.li@gmail.com, yannanli@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113103445
Unique identifier
UC113103445
Identifier
etd-LiYannan-11789.pdf (filename)
Legacy Identifier
etd-LiYannan-11789
Document Type
Dissertation
Format
theses (aat)
Rights
Li, Yannan
Internet Media Type
application/pdf
Type
texts
Source
20230509-usctheses-batch-1040 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu