DETECTING AND CHARACTERIZING NETWORK DEVICES USING SIGNATURES OF TRAFFIC ABOUT END-POINTS

by Hang Guo

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, in Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE)

August 2020

Copyright 2020 Hang Guo

Dedication

I dedicate this dissertation to Jesus Christ, my wife Yongxi, my son Nathaniel, and my parents, without whom this dissertation would never have been possible.

Acknowledgments

I would like to give thanks to all those who supported and helped me through my PhD journey.

I thank my PhD advisor, Prof. John Heidemann, for his teaching, guidance, and support throughout my PhD career. I especially appreciate his patience when I had a slow start, and I benefited tremendously from his explicit and implicit teachings: critical thinking, a rigorous attitude towards science, and the art of making complex ideas understandable, just to name a few.

I thank Prof. Shahram Ghandeharizadeh, Prof. Barath Raghavan, Prof. Jelena Mirkovic, and Prof. Rahul Jain for their service on my qualification exam committee. I thank Prof. Barath Raghavan and Prof. Bhaskar Krishnamachari for their service on my dissertation committee.

I thank my fellow PhD students and colleagues at USC/ISI for their help and support: Xun Fan, Liang Zhu, Yuri Pradkin, Calvin Ardi, Lan Wei, Abdul Qadeer, Guillermo Baltra, Basileal Iman, A.S.M. Rizvi, Asma Enayet, Abdulla Alwabel, Hao Shi, Xiyue Deng, and many others.

I thank Joe Kemp, Alba Regalado, and Jeanine Yamazaki for their help with administrative tasks at ISI.

Table of Contents

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract
1 Introduction
  1.1 Thesis Statement
  1.2 Demonstrating the Thesis Statement
  1.3 Research Contributions
2 ICMP Rate Limiting Detection
  2.1 Introduction
  2.2 Problem Statement
  2.3 Modeling Rate Limited Blocks
    2.3.1 Rate Limit Implementations in Commercial Routers
    2.3.2 Modeling Availability
    2.3.3 Modeling Response Rate
    2.3.4 Modeling Alternation Count
  2.4 Detecting Rate Limited Blocks
    2.4.1 Input for Detection
    2.4.2 Four Phases of ICMP Rate Limiting
    2.4.3 Detecting Rate Limited Blocks
  2.5 Results: Rate Limiting in the Wild
    2.5.1 How Many Blocks are Rate Limited in the Internet?
    2.5.2 Verifying Results Hold Over Time
    2.5.3 Is Faster Probing Rate Limited?
    2.5.4 Rate Limiting of Response Errors at Nearby Routers
  2.6 Validation
    2.6.1 Does the Model Match Real-World Implementations?
    2.6.2 Correctness in Noise-Free Testbed
    2.6.3 Correctness in the Face of Packet Loss
    2.6.4 Correctness with Partially Responsive Blocks
    2.6.5 Correctness in Other Network Conditions
  2.7 Related Work
  2.8 Conclusion
3 General IoT Device Detection
  3.1 Introduction
  3.2 Methodology
    3.2.1 IP and DNS-Based Detection Methods
    3.2.2 Certificate-Based IoT Detection Method
    3.2.3 Adversarial Prevention of Detection
  3.3 Results: IoT Devices in the Wild
    3.3.1 IP-Based IoT Detection Results
    3.3.2 DNS-Based IoT Detection Results
    3.3.3 Certificate-Based IoT Detection Results
  3.4 Validation
    3.4.1 Accuracy of IP-Based IoT Detection
    3.4.2 Accuracy of DNS-Based IoT Detections
  3.5 Related Work
  3.6 Conclusion
4 Compromised IoT Detection and DDoS Mitigation
  4.1 Introduction
  4.2 Methodology
    4.2.1 Device Detection
    4.2.2 Server Learning
    4.2.3 Traffic Filtering
    4.2.4 Deployment Incentives
    4.2.5 Countermeasures by Knowledgeable Adversaries
  4.3 Validation by Trace Replay
    4.3.1 False Positives with Benign Traffic
    4.3.2 True Positives and False Negatives with Attack Traffic
  4.4 Validation by Router Deployment
  4.5 Related Work
    4.5.1 IoT Device Detection
    4.5.2 IoT-Based DDoS Defense
    4.5.3 Traditional DDoS Defense
  4.6 Conclusion
5 Future Work and Conclusions
  5.1 Future Work
    5.1.1 Immediate Future Work for Our Studies
    5.1.2 Future Work Suggested by This Thesis
  5.2 Conclusions
Bibliography

List of Tables

2.1 Datasets used in this paper
2.2 Application of FADER to it71w census and survey
2.3 True rate limited blocks in the it71w Census and Survey
2.4 Results of rate-limit detection on it71w
2.5 Applying 15 FADER Tests to Each of ZMap Target /16 Blocks
3.1 The 10 IoT Devices that We Purchased
3.2 Datasets for Real-world IoT Detection
3.3 4-Month IoT Detection Results on USC Campus and Our Estimations of IoT Users and Devices
3.4 August IoT Detection Results on USC Campus
3.5 IoT Deployment for One House in CCZ Data
3.6 IPCam Detection Break-Down
3.7 Partial Validation of Certificate-Based Detection Results
3.8 Detected IP Cameras and NVRs by Country
3.9 Resilience of Detection and Server Learning
4.1 14 IoT Devices We Own
4.2 Eight Domains from Five IoT Platforms
4.3 Server Learning Breakdown with Benign IoT Traffic
4.4 Simulated Attacks in Validation
4.5 Simulated Attacks in Deployment

List of Figures

2.1 Four phases of ICMP rate limiting
2.2 Confirming block 182.237.200/24 is rate limited with additional probing
2.3 Response rate of 202.120.61/24
2.4 The Original ZMap 50-Second Experiment Availability Chart
2.5 Rate limiting model for ZMap data
2.6 Our modeled availability and response rate
2.7 Two ZMap Target Blocks Showing Multiple Rate Limits
2.8 Responses from 103.163.18/24 over time
2.9 Validating the availability model
2.10 Validation of the Alternation Count Model
2.11 FADER detection in a noise-free environment
2.12 FADER detection with packet loss
2.13 FADER detection with partially responsive target blocks
3.1 Workflow for DNS-Based IoT Detection with Server Learning
3.2 Overall AS Penetration for Our 23 Device Types from 2013 to 2018
3.3 ECDF for Device Type Density in IoT-ASes from 2013 to 2018
3.4 Detected IoT-ASes under Extended Observation at B Root
3.5 Per-Device-Type AS Penetrations
3.6 IoT Deployments for All Houses in CCZ Data
4.1 Directed Graph Storing Known IoT Manufacturers and Collaborators
4.2 ECDF for Server Names Queried by 60 IoT and 6 non-IoT Devices
4.3 Name- and IP-accessed Servers Our Devices Talk to Within 10 Days
4.4 IP-accessed Servers Our IoT Devices Visit Per Hour Within 10 Days
4.5 IoTSTEED's Per-hour CPU and Memory Usage during Deployment

Abstract

The Internet has become an inseparable part of our society. Since the Internet is essentially a distributed system of billions of inter-connected, networked devices, learning about these devices is essential for better understanding, managing, and securing the Internet. Studying these network devices, without direct control over them or direct contact with their users, requires traffic-based methods for detecting devices.
To identify target devices from traffic measurements, detection of network devices relies on signatures of traffic: mappings from certain characteristics of traffic to target devices. This dissertation focuses on device detection that uses signatures of traffic about end-points: mappings from characteristics of traffic end-points, such as their counts and identities, to target devices. The thesis of this dissertation is that new signatures of traffic about end-points enable detection and characterization of new classes of network devices. We support this thesis statement through three specific studies, each detecting and characterizing a new class of network devices with a new signature of traffic about end-points. In our first study, we present detection and characterization of network devices that rate limit ICMP traffic, based on how they change the responsiveness of traffic end-points to active probing. In our second study, we demonstrate mapping identities of traffic end-points to a new class of network devices: Internet-of-Things (IoT) devices. In our third study, we explore detecting compromised IoT devices by identifying IoT devices talking to suspicious end-points. Detection of these compromised IoT devices enables us to mitigate DDoS traffic between them and suspicious end-points.

Chapter 1  Introduction

The Internet has become an inseparable part of our society, powering our communications, business operations, financial transactions, education, and entertainment. The recent proliferation of the Internet of Things (IoT), such as network-enabled light bulbs and thermostats, further blurs the line between the Internet and our everyday life. As of December 2019, more than half of the world population (4.6 billion out of 7.8 billion, about 59%) had become users of the Internet [68].

At its core, the Internet is a distributed system with billions of network devices inter-connecting with each other. Learning about these network devices is essential for better understanding, managing, and securing the Internet.

Studying and understanding these network devices requires detection of network devices, or simply device detection. (Device detection addresses the limitation that researchers may not have direct control over all network devices being studied.) Device detection helps researchers understand aspects of the Internet by revealing the Internet-wide distributions and historical growth of target devices. Device detection aids network management by enabling IT administrators to discover and monitor target devices in their networks. Device detection also helps protect Internet security by identifying vulnerable network devices before malware infection and by discovering compromised devices for quarantine. As more everyday objects get connected to the Internet, device detection may even help understand the physical world, for example by tracking network-enabled vehicles for crime investigation.

Device detection is usually traffic-based. Passive device detection detects network devices by passively measuring and identifying traffic generated by target devices. Active device detection detects network devices by actively probing target devices and identifying their response traffic. Active device detection can also identify traffic-manipulating middle-boxes by probing IPs on the other side of the target middle-box and identifying signs of traffic manipulation in the reply traffic.
To detect network devices from traffic measurements, device detection can use signatures of traffic: mappings from certain traffic characteristics to target devices. One example is signatures of traffic flows, which map characteristics of traffic flows, such as a flow's packet rates, packet lengths, and inter-packet intervals, to network devices. Signatures of traffic flows are suitable for machine learning models and can be constructed in a data-driven way. Signatures of traffic flows can also protect user privacy by avoiding deep packet inspection (DPI). However, they require target devices to have distinct and deterministic traffic patterns, which may not always be true. Neither are they robust to network topologies such as those with a Network Address Translator (NAT), because when NAT mixes traffic from multiple NATted devices, traffic flow patterns are often obscured for outside observers. Lastly, since the correlation between a network device and the statistics of its traffic flow may not be obvious to humans, it can be hard to construct signatures of traffic flows based on human intuition and domain knowledge.

This dissertation focuses on device detection using signatures of traffic about end-points, which map traffic characteristics about end-points, such as the end-points' counts, responsiveness, and identities observed from the traffic, to network devices. Detection based on signatures of traffic about end-points can be robust to complexities in network topologies and in traffic flow characteristics because of its focus on traffic end-points. In some cases, signatures of traffic about end-points can also be more intuitive to construct. For example, knowing that IoT devices talk to many end-points run by their vendors [134], we can map traffic to end-points run by IoT vendors (judged by their DNS domain names) to IoT devices.

In this thesis, we examine three studies of device detection, each detecting a new class of network devices based on a new signature of traffic about end-points.

1.1 Thesis Statement

The thesis of this dissertation is that new signatures of traffic about end-points enable detection and characterization of new classes of network devices. By "new signatures of traffic about end-points", we mean we use new traffic characteristics about end-points that have not been used as signals to detect network devices in prior work. By "detection and characterization", we mean that besides establishing the existence of network devices, we also infer certain of their characteristics, such as device types. By "new classes of network devices", we mean our detection covers new types of network devices that have not been fully covered in prior work.

We demonstrate the validity of our thesis with three studies that detect and characterize new classes of network devices using new signatures of traffic about end-points. First, we present detection and characterization of network devices that rate limit ICMP traffic based on how they change the observed responsiveness of traffic end-points to active probing. Second, we demonstrate mapping observed identities of traffic end-points to a new class of network devices: IoT devices. Last, we explore identifying compromised IoT devices by identifying IoT devices talking to suspicious end-points. Detection of compromised IoT devices also enables us to mitigate DDoS attacks from these compromised devices to suspicious end-points.
1.2 Demonstrating the Thesis Statement

We prove the thesis statement through three specific studies, each developing a new signature of traffic about end-points to detect and characterize a new class of network devices.

In our first study (Chapter 2), we detect network devices rate limiting ICMP traffic by ICMP-scanning the IPv4 space and comparing the observed responsiveness of target end-points between fast and slow scans (an active device detection). We also estimate the packet rate of detected rate limits based on the differences in observed responsiveness of end-points between scans.

Our first study demonstrates four aspects of our thesis statement. First, we develop a new signature of traffic about end-points: differences in observed responsiveness of probed end-points between fast and slow scans. Second, with this new signature, we detect a new class of network devices: ICMP rate limiters between our prober and target /24 blocks. Third, besides detecting the existence of ICMP rate limiters, this signature also enables us to characterize these detected rate limiters by estimating the effective rate limits they apply to the target /24 blocks. Fourth, this signature makes our detection robust to the complexities of network topology: regardless of the number and location of rate limiters, we always detect the effective rate limiting applied to the target block.

Having shown an example of active device detection (recall we use active probing in the detection of ICMP rate limiters), we next apply passive device detection to a new class of network devices: IoT devices. In our second study (Chapter 3), we propose two methods to detect IoT devices based on the observed identities of their traffic end-points. Our methods cover general IoT devices both on the public Internet and behind NATs. One of our two methods also supports automatically learning IoT-specific traffic end-points from detected devices during detection.

Our second study demonstrates four aspects of our thesis statement. First, we use a new signature of traffic about end-points: devices talking to known combinations of end-points operated by IoT manufacturers are IoT devices. Second, this new signature enables us to detect a new class of network devices: general IoT devices both on the public Internet and behind NATs. Third, this new signature also enables us to characterize detected IoT devices: we infer their device types based on the combination of end-points they talk to, because the combination of end-points can be unique to an IoT device type. Fourth, this signature is robust to different network topologies, especially those with NAT, because when NAT mixes traffic from multiple devices, the identities of remote end-points are still preserved in the mixed traffic. This signature is also robust to different traffic flow characteristics and allows us to detect IoT devices regardless of the pattern of their talking (such as timing and rates).

Having shown that a signature based on observed identities of traffic end-points enables detecting general IoT devices, we apply this signature to a special class of IoT devices: compromised IoT devices. In our third study (Chapter 4), we detect compromised IoT devices participating in distributed denial-of-service (DDoS) attacks by identifying IoT devices talking to end-points other than known benign traffic end-points. Detection of these compromised IoT devices enables us to mitigate DDoS traffic between them and suspicious end-points.
Our third study demonstrates four aspects of our thesis statement. First, we develop a new traffic signature about end-points: IoT devices talking to end-points other than known benign end-points are compromised. Second, our detection covers a new class of network devices: general IoT devices that are compromised and participating in DDoS attacks. Third, our detection of compromised IoT devices is essentially a characterization: we characterize all IoT devices under monitoring as either benign or malicious depending on whether they talk to suspicious end-points. Fourth, by using traffic characteristics about end-points, our detection is robust to both the type of malicious IoT traffic (such as TCP SYN flooding and DNS query flooding) and the flow characteristics of malicious IoT traffic (such as packet rates).

Our studies above show that signatures of traffic about end-points can effectively detect and characterize network devices. These studies serve as three examples supporting our thesis statement. They suggest that signatures of traffic about end-points are effective in detecting and studying the ever-increasing, heterogeneous population of networked devices, and can help us gain a better understanding of the Internet.

1.3 Research Contributions

Each of the above three studies demonstrates our thesis statement, and consequently their first contribution is that they prove our thesis. In addition, each work has its own contributions that can benefit the research community, industry, and others, as summarized below.

In our study of ICMP rate limiting detection (Chapter 2), our first contribution is to create FADER, a new algorithm that can identify rate limiting from user-side traces with minimal new measurement traffic. Our second contribution is to apply our methods to real-world network traces to explore the distribution of rate limiting in the Internet. We show that rate limiting exists but that for slow probing rates, rate limiting is very rare: out of a sample of 40,493 /24 blocks, we confirm 6 blocks (0.02%!) see rate limiting at 0.39 packets/s per block. We also look at higher rates in public datasets and suggest that the fall-off in responses as rates approach 1 packet/s per /24 block is consistent with rate limiting. Lastly, we show that even very slow probing (0.0001 packets/s) can encounter rate limiting of NACKs that are concentrated at a single router near the prober.

In our study of general IoT device detection (Chapter 3), our first contribution is to create three new methods that detect IoT devices on the Internet, using server IP addresses in traffic, server names in DNS queries, and manufacturer information in TLS certificates. Our second contribution is applying our algorithms to a number of observations to explore current distributions and historical growth of IoT devices. Our IP-based algorithms report detections from a university campus over 4 months and from traffic transiting an IXP over 10 days. We apply our DNS-based algorithm to traffic from 8 root DNS servers from 2013 to 2018 to study AS-level IoT deployment. We find substantial growth (about 3.5x) in AS penetration for 23 types of IoT devices and a modest increase in device type density for ASes detected with these device types (at most 2 device types in 80% of these ASes in 2018). DNS also shows substantial growth in IoT deployment in residential households from 2013 to 2017.
Our certificate-based algorithm finds 254k IP cameras and network video recorders from 199 countries around the world.

In our study of compromised IoT detection and DDoS mitigation (Chapter 4), our first contribution is to propose IoTSTEED (IoT bot-Side Traffic-Endpoint-basEd Defense), a system that runs in the edge routers of bots' access networks and mitigates DDoS traffic at victims by filtering at the source. IoTSTEED processes each packet entering the router with 3 steps: detection of IoT devices in the access network, learning of the benign servers that detected IoT devices talk to, and filtering of suspicious IoT traffic to and from all other servers. We validate IoTSTEED's correctness in device detection and its false positives (FP) in server learning and traffic filtering with replay of a 10-day benign traffic capture from an IoT access network. We show IoTSTEED correctly detects all 14 IoT and 6 non-IoT devices in this network and maintains low FP in server learning (flagging 2% of 642 benign IoT servers as suspicious) and traffic filtering (dropping 0.45% of about 7 million benign IoT packets). We validate IoTSTEED's true positives (TP) and false negatives (FN) in filtering attacks with replay of real-world DDoS traffic. Validation results suggest IoTSTEED can mitigate all except two types of attacks. First, IoTSTEED cannot mitigate attacks launched shortly after bootup of the attacking devices; however, we show that the probability of such attacks is low. Second, IoTSTEED cannot mitigate attacks launched by devices that constantly talk to new server IPs. However, we show that some of these devices are simply responding to unsolicited probes from Internet scanners, a side effect of the Universal Plug'n'Play (UPnP) service [112] in the NAT router. We show that once we disable the UPnP service in the router, these devices stop responding to Internet scanners and no longer defeat our defense. Lastly, validation results show IoTSTEED mitigates all other attacks tested regardless of the attacks' types (TCP SYN flooding and DNS query flooding), flow characteristics (packet rates), and exact attacking devices. Our second contribution is to deploy IoTSTEED in the NAT router of an IoT access network for 10 days. We show IoTSTEED can run on a commodity router with reasonable run-time overhead: small memory usage (4% of 512MB) and no degradation of the router's packet forwarding. We confirm IoTSTEED's correctness (FP, TP, and FN in server learning and traffic filtering) during online deployment is similar to what we report in offline validation.

Chapter 2  ICMP Rate Limiting Detection

In this chapter we detect the existence of network devices that rate limit ICMP traffic by comparing the observed responsiveness of end-points between fast and slow scans. We also estimate the rate limits of the detected ICMP rate limiters. We show that ICMP rate limiting exists but is very rare for probing rates up to 0.39 packets/s per block. We also show fast probing, up to 1 packet/s per block, risks being rate limited.

This study of ICMP rate limiting demonstrates our thesis statement as follows. We develop a new signature of traffic based on differences in observed responsiveness of end-points between scans. With this new signature, we detect a new class of network devices: rate limiters of ICMP traffic. Besides detection, this signature also enables us to characterize these rate limiters by estimating the packet rates of their rate limits.
Our detection is robust to the number and topological locations of the actual rate limiters, because it focuses on changes in the observed responsiveness of target end-points and identifies the effective rate limit applied to target blocks.

Part of this chapter was published in the Passive and Active Measurement Conference 2018 [53].

2.1 Introduction

Active probing with pings and traceroutes (ICMP echo requests) is often the first tool network operators turn to when assessing problems, and such probes are widely used tools in network research [60, 67, 90, 103, 118]. Studies of Internet address usage [27, 60, 81, 116, 153], path performance [90], outages [118, 128], Carrier-Grade NAT deployment [125], DHCP churn [103], and topology [14, 67, 88, 92, 137] all depend on ICMP.

An ongoing concern about active probing is that network administrators rate limit ICMP. If widespread, rate limiting could easily distort measurements, possibly silently corrupting results. Researchers try to avoid rate limiting by intentionally probing slowly and selecting targets in a pseudo-random order [60, 82], but recent work has emphasized probing as quickly as possible [34]. For IPv4 scanning, the Internet Census (2008) sends 1.5k probes/s [60], IRLscanner (2010) sends 22.1k probes/s [83], ZMap (2013) sends 1.44M probes/s [34], or 14M probes/s in its latest revision [7], and Yarrp (2016) sends 100k probes/s or more [14]. Assuming about 3 billion target addresses and pseudorandom probing, these rates imply a probe arrives at a router handling a given /16 every 0.003 to 30 seconds. Interest in faster probing makes rate limit detection a necessary part of measurement, since undetected rate limiting can silently distort results.

Although rate limiting is a concern for active probing and has been studied briefly in some papers that consider active probing [14, 49, 60, 81, 137], we know of only two prior studies explicitly looking for rate limiting in the general Internet [38, 123]. The work from Universite Nice Sophia Antipolis detects and characterizes rate limiting of ICMP Time Exceeded replies in response to expired ICMP echo requests [123]. However, their detection is expensive, requiring hundreds of vantage points and 17 probing rates to cover 850 routers in the Internet. More importantly, they never look at rate limiting of ICMP echo requests on the forward path. Google studied traffic policing of TCP from server-side traces [38]; their detection depended on server-side traffic analysis of billions of packets in Google's CDN. Like those prior works, we want to study rate limiting of ICMP at global scale, but our goal is to do so in a lightweight manner that does not require intensive traffic probing or extensive server-side data. Lightweight methods to detect rate limiting will help researchers by preventing their results from being distorted silently, while not adding too much extra complexity and cost to their research.

Our first contribution is to provide a new lightweight algorithm to detect ICMP rate limiting and estimate rate limits across the Internet. Our approach is based on two insights about how rate limiting affects traffic: first, rate limiting causes probe loss that is visible when we compare slower scans with faster scans, and second, this probe loss is randomized. As a result, we can analyze two ICMP scans taken at different rates to identify rate limiting at any rate less than the faster scan.

Our second contribution is to re-examine existing public data for signs of ICMP rate limiting in the whole Internet.
We examine two different groups of data. First, we use random samples of about 40k /24 blocks to show that ICMP rate limiting is very rare in the general Internet for rates up to 0.39 packets/s per /24: only about 1 in 10,000 /24 blocks is actually rate limited. Thus it is almost always safe to probe in this range. Second, we look at higher-rate scans (up to 0.97 packets/s) and show that the fall-off of responses at higher probing rates is consistent with rate limits at rates from 0.28 to 0.97 packets/s per /24 in parts of the Internet.

Our third contribution is to show that rate limiting explains results for error replies when Internet censuses cover non-routed address space, although low-rate scans do not usually trigger rate limiting.

The final contribution is that this study supports the thesis by using a new signature of traffic about end-points (differences in end-points' responsiveness between scans) to detect a new class of network devices (ICMP rate limiters) and characterize these network devices (estimation of rate limits).

2.2 Problem Statement

Rate limiting is a facility provided in all routers to allow network administrators to control access to their networks. In most routers, rate limiting can be configured in several ways. Administrators may do traffic policing, limiting inbound ICMP (or TCP or UDP) to prevent Denial-of-Service (DoS) attacks against internal networks. Routers also often rate-limit generation of ICMP error messages (ICMP types 3 and 11, called here ICMP NACKs) to prevent use of the router to attack others (an attacker generates a stream of ICMP NACK-generating traffic, spoofing a victim's address to amplify the attack's effects). ICMP rate limiting in either direction matters to researchers. Limits on the forward path affect address usage and outage studies [60, 123], while limits on the reverse path affect studies that use traceroute-like mechanisms [67, 137].

When a rate limit is reached, the router can simply drop packets over the limit, or it can generate an error reply (ICMP type 3). Most routers simply drop traffic over the rate limit, but Linux iptables can also generate NACKs [4]. Dropping traffic over the rate limit matches the typical goal of protecting the network from excessive traffic, since generating NACKs adds more traffic to the network.

Our paper develops FADER (Frequent Alternation Availability Difference ratE limit detector), an algorithm that can detect and estimate rate limits in the forward path. Our method estimates the effective rate limit at each target /24 block, or the aggregate rate limit of intermediate routers across their covered space. Our goal is to estimate rate limits while minimizing network traffic on infrastructure: our approach works from a single vantage point and requires two scans at different rates, detecting rate limits that take any value between those rates. This goal is challenging for two reasons. First, the amount of information conveyed through two-rate probing from a single vantage point is very limited. Second, active probing data can be distorted by events at target IP blocks such as DHCP churn [103], diurnal variation [119], and outages [118].

2.3 Modeling Rate Limited Blocks

Our detection algorithm is based on models of rate limiting in commercial routers.
2.3.1 Rate Limit Implementations in Commercial Routers

We examined router manuals and two different router implementations; most routers, including those from Cisco [1] and Juniper [2], implement ICMP rate limiting with some variation on a token bucket. With a token bucket, tokens accumulate in a "bucket" of size B tokens at a rate of L tokens/s. When the bucket size is exceeded, extra tokens are discarded. When a packet arrives, it consumes one token and is forwarded, or the packet is discarded if the token bucket is empty (we assume 1 token per packet, although one can also use tokens per byte). Ideally (assuming smooth traffic), for incoming traffic at rate P packets/s, if P < L, the traffic is below the rate limit and will be passed by the token bucket without loss. When P > L, initially all packets will be passed as the bucket drains, then packet loss and transmission will alternate as packets and tokens arrive and are consumed. In the long run, when P > L, egress traffic exits at rate L packets/s. We only model steady-state behavior of the token bucket because our active probing (Section 2.5) lasts long enough (2 weeks, 1800 iterations) to avoid disturbance from transient conditions.

2.3.2 Modeling Availability

We assume a block of addresses is behind a rate-limited router, with all addresses sharing a common IPv4 prefix of some length. When not otherwise specified, we assume blocks have /24 prefixes, and we use n_B to represent the number of IPs in a /24 block: 256.

We first model the availability of that block: the fraction of IPs in the target block that respond positively to our probing. We consider both the true availability (A), ignoring rate limiting, and the observed availability (\hat{A}) as affected by rate limiting.

Two observations help model availability. First, from Section 2.3.1, recall that L packets/s pass (the rate limit) when P packets/s are presented to the token bucket; therefore L/P is the proportion of probes that are passed. Second, if N IPs in the target block are responsive, a non-rate-limited ping hits a responsive IP that replies with probability N/n_B. Putting these two observations together gives us the model of a rate-limited block's availability:

  A = N / n_B,  and  \hat{A} = A (L/P) if P > L;  \hat{A} = A otherwise.    (2.1)

2.3.3 Modeling Response Rate

Response rate is the number of positive responses we receive from the target block per second. In our model (Equation 2.2), we consider both the true value (R), ignoring the rate limit, and the observed value (\hat{R}), affected by the rate limit:

  R = (N / n_B) P,  and  \hat{R} = R (L/P) if P > L;  \hat{R} = R otherwise.    (2.2)

2.3.4 Modeling Alternation Count

A response alternation is defined as the transition of an address from responsive to non-responsive or the other way around. For instance, if a target IP responds positively twice in a row, then omits a response, then responds again, it alternates responses twice (from responsive to non-responsive and back). Response alternation is important to distinguish between rate limits and other sources of packet loss: rate limits cause frequent alternation between periods of packet response and drops as the token bucket fills. Frequent alternation helps distinguish rate limiting from network outages, since outages are long-lived while rate limits show up as intermittent failures. Frequent alternation is, however, less effective in distinguishing rate limiting from transient network congestion, because congestion losses are randomized (mostly because our probes are randomized) and also create frequent alternation.
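To make the token-bucket behavior and the availability model of Equation 2.1 concrete, the following is a minimal Python sketch, not part of the dissertation; the bucket size, rates, and block parameters are illustrative assumptions. It simulates steady-state probing of a /24 behind a token bucket and compares the simulated availability with the modeled \hat{A} = A (L/P):

```python
import random

def simulate_block_availability(P, L, B=50, N=128, n_B=256, duration_s=20000):
    """Steady-state token-bucket simulation for one /24 block.

    P: probing rate aimed at the block (probes/s)
    L: token refill rate, i.e. the rate limit (packets/s)
    B: bucket capacity in tokens
    N: responsive IPs in the block (out of n_B addresses)
    Returns the observed availability: the fraction of probes that
    pass the bucket, reach a responsive IP, and are answered.
    """
    tokens = float(B)
    gap = 1.0 / P                           # time between consecutive probes
    answered = 0
    probes = int(duration_s * P)
    for _ in range(probes):
        tokens = min(B, tokens + L * gap)   # tokens accumulate between probes
        if tokens >= 1.0:                   # probe passes the bucket...
            tokens -= 1.0
            if random.random() < N / n_B:   # ...and hits a responsive address
                answered += 1
    return answered / probes

# Compare the simulation with Equation 2.1: A-hat = A * (L/P) when P > L.
A = 128 / 256
for P, L in [(0.1, 0.39), (1.0, 0.39), (5.0, 0.39)]:
    modeled = A * min(1.0, L / P)
    print(f"P={P} L={L}  simulated={simulate_block_availability(P, L):.3f}"
          f"  modeled={modeled:.3f}")
```

With P below L the simulated availability stays near A = N/n_B, and with P above L it falls toward A (L/P), matching the two cases of Equation 2.1.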
An extra round of re-probing can ensure the detection results are robust against potential false positives caused by transient network congestion.

We model the count of observed response alternations, \hat{C}, both accurately and approximately. The accurate model fits measured values very precisely over all values of P/L, but it requires enumerating the probabilities and response alternation counts for all 2^r possible cases for r rounds of observation (where p_n and c_n in the model are the probability and response alternation count of the n-th case). Since r is quite large for the full datasets we study in Section 2.5 (our data has r = 1800 iterations), this enumeration is not computable. The accurate model is:

  \hat{C} = N \sum_{n=1}^{2^r} p_n c_n  if P > L;  \hat{C} = 0 otherwise,    (2.3)

for a rate limit L, probing rate P, and N ideal responsive IPs.

The approximate model, on the other hand, provides a single expression covering all r, but it fits only when P >> L (so that consecutive packets from the same sender are never passed by the token bucket). We use it in our evaluation since it is computable when r = 1800. It is:

  \hat{C} = 2 (L/P) N r,  when P >> L.    (2.4)

2.4 Detecting Rate Limited Blocks

The models (Section 2.3) assist our detection algorithm.

2.4.1 Input for Detection

Our detection algorithm requires low- and high-rate measurements as input.

Low-rate measurements must be slower than any rate limit to be detected. They therefore capture the true availability (A) of target blocks. While they require that we guess the range of possible rate limits, we observe that most routers have a minimum value for rate limits (for example, 640 kb/s for the Cisco ME3600-A [1]), and we can easily be well under this limit. An example of a suitable low-rate scan is the ISI Internet censuses at 0.0001 pings/s per /24 block [60].

High-rate measurements must exceed the target rate limit. It is more difficult to guarantee we exceed rate limits because we do not want measurements to harm target blocks with too much traffic. However, selecting some high rate allows us to detect or rule out rate limiting up to that rate; the high rate sets the upper bound for FADER's detection range.

Figure 2.1: Four phases of ICMP rate limiting, with \hat{A} as a function of P/L (observed availability vs. probing rate / rate limit, spanning the regions Non-RL (P < L), RL-Tran (L < P < 1.1L), RL-Sat (1.1L < P < 100L), and RL-Rej (100L < P)).

In addition, high-rate measurements must be repeated to use the alternation detection portion of our algorithm (Algorithm 1). Validation shows that 6 repetitions are sufficient for alternation detection (Section 2.6.1), but our existing datasets include as many as 1800 repetitions, and our algorithm can detect rate-limiting candidates without the alternation count, although with a high false-positive rate. An example high-rate measurement is the ISI Internet surveys that repeatedly scan many blocks about 1800 times at 0.39 pings/s per /24 block.

Both low- and high-rate measurements need to last a multiple of 24 hours to account for regular diurnal variations in address usage [119].

2.4.2 Four Phases of ICMP Rate Limiting

The models from Section 2.3 allow us to classify the effects of ICMP rate limiting on block availability into four phases (Figure 2.1). Defined by the relationship between probing rate P and rate limit L, these phases guide our detection algorithm:

1. Non-RL (Non-Rate-Limited): when P < L, before rate limiting has any effect.

2. RL-Tran (Rate Limit Transition): when L < P < 1.1L, rate limiting begins to reduce the response rate with alternating responses. \hat{A} starts to fall, but not distinctly enough for detection.

3. RL-Sat (Rate Limit Saturation): when 1.1L < P < 100L, rate limiting and frequent alternation are common, and \hat{A} falls significantly.

4. RL-Rej (Rate Limit Rejection): when P > 100L, beyond the threshold T_rej = P/L = 100. Here \hat{A} < 0.01 N/n_B; most packets are dropped and response alternations become rare.

These phases also help identify regions where no algorithm can work: rate limits right at the probing rate, or far above it. We use the empirical thresholds 1.1L and 100L to define these two cases. For rate limits that happen to lie in the RL-Tran phase (L < P < 1.1L), there is not enough change in response for us to identify rate limiting over normal packet loss. Fortunately, this region is narrow. In addition, no algorithm can detect rate limits in the RL-Rej phase, because such a rate-limited block will look dark (\hat{A} < 0.01 N/n_B) and give too little information (at most one response for every one hundred probes sent) for anyone to know whether it is a heavily rate-limited block or a non-rate-limited block that has gone dark. In Section 2.6.2 we show that our algorithm is correct in the remaining large regions (Non-RL and RL-Sat), provided P < 60L.

2.4.3 Detecting Rate Limited Blocks

FADER is inspired by the observations that the RL-Tran phase is narrow, but that we can easily tell the difference between the Non-RL and RL-Sat phases. Instead of trying to probe at many rates, we probe at a slow and a fast rate, with the hope that the slow probes land in the Non-RL phase and with the goal of bracketing the RL-Tran phase. If the target block shows much higher availability in slow probing, we consider the block a rate-limit candidate and examine whether its traffic pattern looks like rate limiting: consistent and randomized packet dropping and passing.

Algorithm 1: Frequent Alternation Test
Input:
  \hat{C}: observed response alternation count in the high-rate measurement
  r: number of probing rounds in the high-rate measurement
  \hat{N}_L: number of responsive IPs observed in the low-rate measurement
  \hat{N}_H: number of responsive IPs observed in each round of the high-rate measurement (responsive IPs observed in the i-th round are \hat{N}_{H_i})
Output:
  O_fat: result of the frequent alternation test
 1: if \hat{C} > (2 \hat{N}_L r) / T_rej and NotDirTmpDn(\hat{N}_H, \hat{N}_L, r) then
 2:   O_fat <- Passed                 /* has frequent alternations */
 3: else
 4:   O_fat <- Failed                 /* no frequent alternations */
 5: end if
 6: function NotDirTmpDn(\hat{N}_H, \hat{N}_L, r)
 7:   for i = 1 to r do
 8:     if \hat{N}_{H_i} >= \hat{N}_L then
 9:       return false
10:     end if
11:   end for
12:   return true
13: end function

We first introduce the Frequent Alternation Test (Algorithm 1) to distinguish rate limiting from other types of packet loss. This subroutine identifies the consistent and randomized packet dropping caused by rate limiting by looking for a large number of response alternations. The threshold (2 \hat{N}_L r) / T_rej is derived from our approximate alternation count model (Equation 2.4). Since the low-rate measurement is assumed to be non-rate-limited, we have \hat{N}_L (the number of responsive IPs observed in the low-rate measurement) = N (the number of responsive IPs when not rate limited). Recalling that we do not try to detect rate limits in the RL-Rej phase (Section 2.4.2), we have P < T_rej L. Substituting both into the alternation count model, for a rate-limited block there must be at least (2 \hat{N}_L r) / T_rej response alternations.
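The following is a minimal Python sketch of the Frequent Alternation Test described above; it is not the dissertation's implementation, and the input data structures and the T_rej = 100 default are assumptions made for illustration.

```python
T_REJ = 100  # lower bound of the RL-Rej phase (P/L threshold)

def frequent_alternation_test(C_hat, r, N_L_hat, N_H_hat):
    """Frequent Alternation Test (cf. Algorithm 1).

    C_hat:   observed response-alternation count in the high-rate measurement
    r:       number of probing rounds in the high-rate measurement
    N_L_hat: responsive IPs observed in the low-rate measurement
    N_H_hat: list of responsive-IP counts, one per high-rate round
    Returns True ("Passed") if the block shows the frequent, randomized
    alternations expected from rate limiting.
    """
    def not_diurnal_or_temp_down():
        # Any high-rate round that looks as responsive as the low-rate
        # measurement suggests a diurnal or temporarily-down block,
        # not a rate-limited one.
        return all(n_i < N_L_hat for n_i in N_H_hat)

    threshold = 2 * N_L_hat * r / T_REJ   # from the approximate model, Eq. 2.4
    return C_hat > threshold and not_diurnal_or_temp_down()

# Example: 1800 rounds, 100 responsive IPs at low rate, many alternations.
print(frequent_alternation_test(C_hat=5000, r=1800, N_L_hat=100,
                                N_H_hat=[40] * 1800))
```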
The function NotDirTmpDn filters out diurnal and temporarily down blocks, which otherwise may be misinterpreted (false positives) because their addresses also alternate between responsive and non-responsive. NotDirTmpDn checks whether any round of the high-rate measurement looks like the daytime (active period) of a diurnal block or the up-time of a temporarily down block, satisfying \hat{N}_{H_i} = \hat{N}_L or even \hat{N}_{H_i} > \hat{N}_L.

Next, we describe our detection algorithm FADER (Algorithm 2). FADER first detects whether the target block is rate limited (lines 1-19), producing "cannot tell" for blocks that are non-responsive when probed or respond too little to judge. No active measurement system can judge the status of non-responsive blocks; we mark such blocks as cannot-tell rather than misclassifying them as rate limited or not. In our experiments we see cannot-tell rates of 65% when the probing rate is 100 times faster than the rate limit and, on average, only 2.56 IPs respond in each target block (Figure 2.11a); these rates reflect the fundamental limit of any active probing algorithm in an Internet with firewalls, rather than a limit specific to our algorithm. Once the target block is detected as rate limited, FADER estimates its rate limit (lines 20-22).

Note that the threshold \hat{N}_L < 10 used in line 3 is empirical, but it is chosen because very sparse blocks provide too little information to reach definite conclusions. The test (\hat{A}_L - \hat{A}_H) / \hat{A}_L > 0.1 in line 5 is derived by substituting P > 1.1L, the lower bound of the RL-Sat phase (recall that we intentionally give up detecting rate limits in the RL-Tran phase L < P < 1.1L), into the availability model (Equation 2.1). The test \hat{A}_H / \hat{A}_L < 1/T_rej (line 6) is derived by substituting P > T_rej L (the threshold beyond which we give up detection) into the availability model (Equation 2.1). Estimating the rate limit (line 21) inverts our availability model (Equation 2.1).

Algorithm 2: FADER
Input:
  \hat{A}_L: measured block availability in the low-rate measurement
  \hat{A}_H: measured block availability in the high-rate measurement
  \hat{N}_L: number of responsive IPs in the low-rate measurement
  T_rej: lower bound of the RL-Rej phase
  O_fat: result of the frequent alternation test
Output:
  O_fader: detection result of FADER
  \hat{L}: estimated rate limit (if a rate limit is detected)
 1: if \hat{A}_L = 0 or \hat{A}_H = 0 then           /* target block down */
 2:   O_fader <- Can-Not-Tell
 3: else if \hat{N}_L < 10 then                       /* block barely responsive */
 4:   O_fader <- Can-Not-Tell
    /* significantly lower availability in faster probing */
 5: else if (\hat{A}_L - \hat{A}_H) / \hat{A}_L > 0.1 then
 6:   if \hat{A}_H / \hat{A}_L < 1/T_rej then         /* RL-Rej phase */
 7:     O_fader <- Can-Not-Tell
 8:   else
 9:     if O_fat = Passed then
10:       O_fader <- Rate-Limited
11:     else                                          /* no frequent alternations (most */
12:                                                   /* likely target block temporarily down) */
13:       O_fader <- Can-Not-Tell
14:     end if
15:   end if
    /* no significant availability drop in faster probing */
16: else if 0.1 >= (\hat{A}_L - \hat{A}_H) / \hat{A}_L > -0.1 then
17:   O_fader <- Not-Rate-Limited
18: else                                              /* -0.1 >= (\hat{A}_L - \hat{A}_H) / \hat{A}_L */
19:   O_fader <- Not-Rate-Limited
20: end if
    /* estimate rate limit if detected */
21: if O_fader = Rate-Limited then
22:   \hat{L} <- n_B \hat{A}_H P_H / \hat{N}_L
23: end if

2.5 Results: Rate Limiting in the Wild

We next apply FADER to existing public Internet scan datasets to learn about ICMP rate limiting in the Internet. (We validate the algorithm later in Section 2.6.)
Table 2.1: Datasets used in this paper.
  Start Date (Duration)   Size (/24 blocks)   Alias          Full Name
  2016-08-03 (32 days)    14,460,160          it71w census   internet_address_census_it71w-20160803
  2016-08-03 (14 days)    40,493              it71w survey   internet_address_survey_reprobing_it71w-20160803
  2016-06-02 (32 days)    14,476,544          it70w census   internet_address_census_it70w-20160602
  2016-06-02 (14 days)    40,493              it70w survey   internet_address_survey_reprobing_it70w-20160602
  2013-09-17 (32 days)    14,477,824          it56j census   internet_address_census_it56j-20130917
  2013-11-27 (33 days)    14,476,544          it57j census   internet_address_census_it57j-20131127
  2014-01-22 (29 days)    14,472,192          it58j census   internet_address_census_it58j-20140122

2.5.1 How Many Blocks are Rate Limited in the Internet?

We first apply FADER to find rate-limited blocks in the Internet, confirming what we find with additional probing.

Input data: Rather than do new probing, we use existing Internet censuses and surveys as test data [60]. Reusing existing data places less stress on other networks, and it allows us to confirm our results at different times (Section 2.5.2). Table 2.1 lists the datasets we use in this results section; they are available publicly [143, 144].

Censuses and surveys define the low and high rates that bound the rate limits detected by our algorithm. A census scans at 0.0001 pings/s per block, while surveys send 0.39 pings/s per block. These two rates provide the low and high rates to test our algorithm. We could re-run FADER with higher rates to test other upper bounds; we report on existing higher-rate scans in Section 2.5.3.

Both censuses and surveys are intentionally slow and randomized, to spread out load on the Internet and avoid abuse complaints. Surveys cover about 40k blocks (30k of which are randomly selected and 10k drawn from specific levels of responsiveness), and they probe those blocks about 1800 times over two weeks, supporting Frequent Alternation detection. Covering about 2% of the responsive IPv4 address space, randomly chosen, our data is representative of the Internet. Censuses cover almost the entire unicast IPv4 Internet, but we use only the part that overlaps the survey.

Table 2.2: Application of FADER to it71w census and survey.
  blocks studied       40,493   (100%)
  not rate limited     24,414   (60%)
  cannot tell          15,941   (39%)
  rate limited            111   (0.27%)   (100%)
    false positives       105   (0.25%)    (95%)
    true positives          6   (0.015%)    (5%)

Initial results: Here we apply FADER to it71w, the latest census and survey datasets (Table 2.2). We find that most blocks are not rate limited (60%), while a good number (39%) are cannot-tell, usually because they are barely responsive and so provide little information for detection. However, our algorithm classifies a few blocks (111 blocks, 0.27%) as apparently rate limited.

Validation with additional probing: To confirm our results, we next re-examine these likely rate-limited blocks. We re-probe each block, varying probing rates from 0.01 to 20 pings/s per block, to confirm the actual rate limiting. Our additional probing is relatively soon (one month) after our overall scan; it is unlikely that many network blocks changed use in that short time.

Figure 2.2 shows this confirmation process for one example block. Others are similar. In these graphs, red squares show modeled availability and response rate, assuming the block is rate limited (given the rate limit estimation from our algorithm in Table 2.3).
The green line with diamonds shows the availability and response rate if the block is not rate limited. For a rate-limited block, its measured availability and response rate should match the modeled values with rate limiting. As Figure 2.2 shows, this block's measured availability (blue dots) and response rate (cyan asterisks) tightly match the modeled values with rate limiting (red squares) while diverging from the theoretical values without rate limiting (green diamonds). This data shows that this block, 182.237.200.0/24, is rate limited.

Table 2.3: True rate limited blocks in the it71w Census and Survey.
  /24 Block       Response Rate         Availability      Rate Limit (ping/s per blk)
                  (measured, pkts/s)    (\hat{A}_L, %)    (measured)   (estimated)
  124.46.219.0    0.009                  9.77             0.09         0.09
  124.46.239.0    0.08                  53.13             0.15         0.12
  182.237.200.0   0.06                  58.98             0.10         0.12
  182.237.212.0   0.04                  27.34             0.15         0.10
  182.237.217.0   0.06                  49.61             0.12         0.13
  202.120.61.0    0.35                  17.58             1.99         0.32

Although this example shows a positive confirmation, we find that most of the 111 blocks are false positives (their availabilities and response rates in re-probing do not match the rate limit models). Only the 6 blocks listed in Table 2.3 are indeed rate limited. Two reasons contribute to this high false-positive rate. First, we design FADER to favor false positives in order to avoid missing rate-limited blocks (false negatives), by using necessary conditions to detect rate limiting: significantly lower availability in the faster scan and frequent response alternations in the fast scan. Second, this trade-off (favoring false positives over false negatives) is required to confirm the near-absence of rate limiting we observe. We rule out the possibility that these false positives are caused by concurrent high-rate ICMP activity at our target blocks by observing over long durations and at different times (Section 2.5.2).

We use additional verification to confirm true positives. Among the 6 rate-limited blocks, five belong to the same ISP, Keumgang Cable Network in South Korea, while the last block is from Shanghai Jiaotong University in China. We have contacted both ISPs to confirm our findings, but they did not reply.

Our first conclusion from this result is that there are ICMP rate-limited blocks, but they are very rare. We find only 6 blocks in 40k, less than 0.02%.

Figure 2.2: Confirming block 182.237.200/24 is rate limited with additional probing. (a) Availability and (b) response rate (echo replies/s) vs. probing rate (ping/s per /24), comparing experimental values with modeled values with and without rate limiting.

Second, we see that each of FADER's steps rules out about 95% of all the blocks entering that step. As shown in Table 2.4, 2,088 out of 40,403 (5.2%) passed phase 1 (Availability Difference Test) and 111 out of 2,088 (5.3%) passed phase 2 (Frequent Alternation Test). However, even after two phases of filtering, there is still a fairly high false-positive rate among the remaining blocks, since only 6 of 111 (5.4%) are finally confirmed as rate limited.

Finally, we show that when we detect rate limiting, our estimates of the rate limit are generally correct. Table 2.3 shows this accuracy: five out of six rate limits observed in re-probing closely match FADER's estimates.
However, our rate limit estimate (0.32 pings/s per block) for block 202.120.61/24 is 5 times smaller than the rate limit (1.99 pings/s) that we observe when we re-probe. (We compute the rate limit when re-probing by measuring \hat{R} (the response rate) and \hat{A}_L, and inverting our response-rate model, Equation 2.2.) When we review the raw data, we believe that the rate limit for this block changed between our measurements.

2.5.2 Verifying Results Hold Over Time

To verify that our approach works on other datasets, we also apply FADER to the it70w census and survey data. This data was taken two months before it71w and shares 76% of the same target blocks.

Figure 2.3: Response rate (replies/s) of 202.120.61/24, measured every 1000 s and compared to the rate expected when not rate-limited, for (a) the it71w survey and (b) the it70w survey.

Table 2.4: Results of rate-limit detection on it71w (number of blocks, with ratios).
  Test Name                 Input    Passed          Filtered
  Availability Difference   40,403   2,088 (5.2%)    38,315 (94.8%)
  Frequent Alternation       2,088     111 (5.3%)     1,977 (94.7%)
  Re-probing Validation        111       5 (4.5%)       106 (95.5%)

The detection results on the it70w data agree with our previous conclusion, with about the same number of blocks identified as rate limited (0.3%, 138 of 40,493) and the same fraction as actually limited (0.012%, 5). Of the blocks that we confirm as rate limited after re-probing, four are also detected and confirmed in it71w. The fifth, 213.103.246/24, is from the ISP Swipnet in the Republic of Lithuania and is not probed in it71w.

We observe inconsistencies between it70w and it71w for two blocks, 124.46.219/24 and 202.120.61/24. These blocks are detected as rate-limited blocks in it71w, but are classified as Can-Not-Tell and Not-Rate-Limited respectively in it70w. We believe one block is hard to measure, and the other actually changed its use between the measurements. Block 124.46.219/24 is only sparsely responsive, with only 25 addresses responding (9.8%). Most of our probes to this block go to non-responding addresses and are dropped, making it difficult to detect rate limiting (as shown in Section 2.6.4). For the 202.120.61/24 block, we believe it is not rate limited in it70w even though it is in it71w. Figure 2.3b shows its response in the it70w survey; this measured response rate closely matches what we expect without rate limiting. In comparison, the measured response rate in it71w is strictly below the expected value without rate limiting, as shown in Figure 2.3a, matching our expectation of a reduced response rate under rate limiting.

2.5.3 Is Faster Probing Rate Limited?

Having shown that rate-limited blocks are very rare, at least when considering rates up to 0.39 packets/s, we next evaluate whether faster probing shows signs of rate limiting. We study ZMap TCP-SYN probing datasets from 0.1M to 14M packets/s [7] (0.007 to 0.97 packets/s per /24 block, as estimated later in this section) and show that rate limiting could explain the drop-off in response they see at higher rates. Although both our models and FADER were originally designed for ICMP rate limiting, they also detect TCP-SYN rate limiting because they detect the actions of the underlying token bucket.
ZMap achieves a probing rate of 14.23M packets/s, allowing a full scan of IPv4 in about 4 minutes [7]. To evaluate these very fast rates, they perform a series of 50-second experiments from 0.1M to 14M packets/s [1, 2]. Each experiment targets a different random sample of an IP pool of about 3.7 billion addresses. Their probing results show overall availability (the fraction of positive responses over all hosts that are probed) is roughly stable for probing rates up to 4M packets/s. However, when probing rates exceed 4M packets/s, the availability starts to decline linearly as the probing rate increases. At 14M packets/s they see availability that is only 67% of the availability of measurements at 4M packets/s. They state that they do not know the exact reason for this decline. Figure 2.4, which is a copy of Figure 2 in the ZMap paper [7], visualizes this linear availability drop from 4M to 14M packets/s.

Figure 2.4: The original ZMap 50-second experiment availability (hit-rate) chart, reproduced from Figure 2 of the ZMap paper [7]: normalized hit rate versus scan rate (pps) for ZMap and Masscan; ZMap's hit rate is roughly stable up to 4 Mpps and then declines linearly.

We believe the cause of this drop is rate limiting—once rate limits are exceeded, as the packet rate increases, availability drops. We also believe that there is roughly the same amount of rate limiting at each packet rate between 4M and 14M packets/s (0.28 to 0.97 packets/s per /24, as estimated in Section 2.5.3) in the Internet, causing the overall availability drop to be linear.

We would like to apply FADER to the ZMap probing results. Unfortunately we cannot, because there is no way to recover the exact IPs that are probed in each experiment, so we cannot compare observed availability against actual availability (they probe in a pseudorandom sequence, but do not preserve the seed of this sequence). In addition, each of their experiments is a one-time run, so we cannot look for response alternation.

However, we still manage to reveal signs of rate limiting with the existing ZMap data. (We chose not to collect new, high-rate ZMap data to avoid stressing target networks.) We create a model of their measurement process and show rate limiting can cause the same drop in availability as the probe rate increases (Section 2.5.3). We also show that the availability of many ZMap target blocks matches our expectation for rate-limited blocks by statistically estimating the number of IPs probed in each block and the block availability (Section 2.5.3).
Rate Limiting Can Explain ZMap Probing Drop-Off

To support our hypothesis that rate limiting is the cause of the linear drop in availability in the ZMap probing results, we create a model of their measurement process and show rate limiting can cause the same probing results. Our model, shown in Figure 2.5, simulates the whole measurement process of the ZMap 50-second experiments. More specifically, we model a simplified network topology that just captures rate limiting and the traffic generated by the ZMap 50-second experiments.

In our modeled network topology, there is one prober and 100 rate-limiting routers whose rate limits are 40k (4M/100) packets/s, 41k (4.1M/100) packets/s, and so on up to 139k (13.9M/100) packets/s. We use 100 routers, each with a rate limit 1k packets/s faster than the one before, to match our assumption that there is the same amount of rate limiting at every probing rate from 4M to 14M packets/s. Each router covers roughly 1/100th of the 3.7 billion IP addresses, 1.08% of which are responsive to TCP-SYN probing. (1.08% is the availability before the linear drop in the ZMap data.) In each experiment, the prober probes a random sample of this 3.7 billion IP pool at P packets/s for 50 seconds. As a consequence, there are roughly (50/100)·P target IPs behind each router in each experiment. It is reasonable to assume at least one probe is sent to each target /24 block, because the set of ZMap experiments we care about (from 4M to 14M packets/s) sends on average 14 to 48 probes to each target /24 block, as the probes are uniformly random.

Availability (Figure 2.6a) and response rate (Figure 2.6b) produced by this model (red squares in the charts) closely match those in the ZMap probing results (blue circles in the charts).

Figure 2.5: Rate limiting model for ZMap data.
Figure 2.6: Our modeled availability and response rate (red) closely match the experimental values in the ZMap probing results (blue, 3 trials at each probing rate): (a) availability comparison, (b) response rate comparison.

The similarity of the results of our model to their experimental results suggests that our simple model can provide a plausible explanation for the packet loss observed at their fast probing. We know that our model is limited—the Internet is not 100 routers each handling 1/100th of ZMap traffic. However, it shows that it is possible to explain the fall-off in ZMap response with rate limiting. The precision of our model is not very surprising, because we set up this model to simulate the ZMap data. If we could determine the exact IPs that are probed and carry out multiple passes, we could apply the complete FADER and draw a clear picture of rate limiting in fast probing.

Availability of ZMap Target Blocks Shows Signs of Rate Limiting

Observing that rate limits are consistent with the drops in response of ZMap at high speeds, we next apply FADER to the ZMap data to look for specific blocks that appear to be rate limited. (We cannot apply the Frequent Alternation Test with single-round ZMap data, so we omit this test.)
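To make the 100-router model above concrete, the sketch below computes the modeled availability at a given ZMap probing rate, and also includes the per-block probe estimate used in the per-block analysis that follows. This is an illustrative re-implementation under the assumptions stated in the text (1.08% base availability, 40k to 139k packets/s limits, uniform sampling of a 3.7 billion address pool), not the exact code behind Figures 2.5 and 2.6.

```python
def modeled_availability(P, n_routers=100, base_avail=0.0108):
    """Modeled ZMap availability at probing rate P (packets/s): router i
    rate-limits at (40+i)k packets/s and sees 1/n_routers of the probes;
    1.08% of targets are responsive before any rate limiting."""
    total = 0.0
    for i in range(n_routers):
        limit = (40 + i) * 1000              # router i's rate limit (packets/s)
        seen = P / n_routers                 # probe rate arriving at router i
        passed = min(1.0, limit / seen)      # fraction surviving the rate limiter
        total += base_avail * passed
    return total / n_routers

def probes_per_block(P, duration=50, pool=3.7e9, prefix_len=16):
    """Expected number of addresses probed in each /prefix_len block during a
    `duration`-second scan at P packets/s, assuming uniform pseudorandom
    sampling of the ~3.7 billion-address pool."""
    fraction_probed = duration * P / pool
    return fraction_probed * 2 ** (32 - prefix_len)

# Availability is flat below 4M packets/s and falls roughly linearly above it:
for P in (2e6, 4e6, 9e6, 14e6):
    print(int(P), round(modeled_availability(P), 4))

# At 14M packets/s each /24 sees about 48 probes, i.e., roughly 0.97 packets/s per /24:
print(probes_per_block(14e6, prefix_len=24), probes_per_block(14e6, prefix_len=24) / 50)
```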
Input data: A challenge in using the ZMap data is that there is no easy way to recover which specific addresses are probed in an incomplete run—we know the order is pseudorandom, but they do not provide the seed. We address this gap by statistically estimating the number of IPs probed in each block, assuming pseudorandom probes into the same 3.7 billion IPv4 pool. Assuming uniform sampling, about the same number of IPs will be sampled from each /16 block in the pool. (Here we look at /16 blocks instead of /24 blocks because larger blocks decrease the statistical variance.) As a consequence, for a 50-second ZMap scan at P packets/s, approximately (50P / (3.7×10^9)) × 2^16 IPs are probed in each /16 block, since 50P / (3.7×10^9) is the fraction of addresses probed in 50 s, applied to a target of 2^16 addresses. We then estimate the availability of each /16 block as the fraction of target IPs that respond positively to probes. We estimate probe rates by substituting 2^8 for 2^16, finding that ZMap probing rates are 0.007 to 0.97 packets/s per /24 block.

Initial Results: We next apply FADER to detect rate limiting (assuming all blocks pass Frequent Alternation Tests). For each ZMap target block, we use the slowest 50-second scan (0.1M packets/s) as the low-rate measurement and test each of the other 15 faster ZMap scans as the high-rate measurement. This gives us 15 test results (each at a different high rate) for each target block. We consider a block potentially rate limited if it is detected as rate limited in at least one test. We do not consider the other blocks (cannot-tell or not-rate-limited) further.

blocks studied                    56,550  (100%)
  0 rate limited                  53,460  (94.54%)  (100%)
    15 cannot tell                53,149  (93.99%)  (99.42%)
    15 not-rate limited              311  (0.55%)   (0.58%)
    others                             0  (0%)      (0%)
  at least 1 rate limited          3,090  (5.46%)   (100%)
    at least 13 rate limited       2,153  (3.81%)   (69.68%)
    less than 13 rate limited        937  (1.66%)   (30.32%)
Table 2.5: Applying 15 FADER Tests to Each of the ZMap Target /16 Blocks

Table 2.5 shows the detection results. Most ZMap target blocks (53,149 blocks, 93.99%) are cannot-tell in all 15 FADER tests (43,067 of them because the target block went completely dark during the low-rate measurement and provides no information for detection). A good number of them (3,090 blocks, 5.46%) are classified as rate-limited in at least one FADER test and are thus considered potentially rate-limited. It is worth noting that most (69.68%) of those potentially rate-limited blocks are consistently classified as rate-limited in most FADER tests (at least 13 of the 15 tests), supporting our claim that those blocks are potentially rate-limited.

Confirmation with Additional Examinations: Our algorithm is optimized to avoid false negatives, so we know many of these potentially rate-limited blocks are false positives. Since we cannot do Frequent Alternation Tests, to further filter out false positives and false negatives in the detection results, we manually check a 565-block (1%) random sample of the 56,500 ZMap target blocks. Of these sample blocks, 31 are detected as rate-limited in at least one FADER test and are considered potentially rate-limited. Among the other 534 blocks (detected as rate limited in zero tests and considered cannot-tell or not-rate-limited), 532 are classified as cannot-tell in all 15 FADER tests while 2 are classified as not-rate-limited in all 15 tests. We find the 534 cannot-tell or not-rate-limited blocks to be true negatives.
They either have almost zero Â_L or Â_H (providing no information for detection) or become more available at the higher probing rate (opposing our expectation of reduced availability in a faster scan).

Figure 2.7: Two ZMap target blocks showing multiple rate limits: availability versus probing rate (packets/s) for (a) 125.182/16 in log scale and (b) 50.62/16 in linear scale.

All 31 potentially rate-limited blocks show reduced availability at higher probing rates (regardless of jitter caused by probing noise and distortion introduced by our statistical estimation), matching our expectation for rate-limited blocks. We also find that 7 of them appear to have more than one rate limit. For example, block 125.182/16 in Figure 2.7a looks like a superposition of the Â curves of two rate limits: one at 0.5M packets/s, the other at 4M packets/s (recall the ideal Â curve of a rate-limited block in Figure 2.1). Block 50.62/16 in Figure 2.7b, on the other hand, shows a nearly linear drop in availability as probing rates get higher, suggesting it consists of multiple rate limits (for reasons similar to those in Section 2.5.3). We manually check each /24 block in those two /16 blocks, and it appears that those /24 blocks indeed have multiple rate limits. This observation supports our claim that different parts of the /16 have different rate limits.

2.5.4 Rate Limiting of Response Errors at Nearby Routers

Although we have shown that probing rates up to 0.39 pings/s trigger rate limits on almost no target blocks, we next show that even slow probing can trigger rate limits when traffic to many targets is aggregated at a nearby router.

In the it57j census we see this kind of reverse-path aggregation because a router near our prober generates ICMP error messages for packets sent to unrouted IPv4 address space. We examine millions of NACK replies in this census and see they are all generated by the same router. We confirm this router is near our prober with traceroute and based on its hostname. This router sees about 500 ping/s, about one-third of census traffic. The router was configured to generate ICMP error messages (NACKs), but it had NACK rate limiting of about 80 NACK/s.

To better understand this behavior, we visualize responses from one of the target blocks in the it57j census. Figure 2.8 shows block 103.163.18/24, one of the unreachable blocks behind this router. The it56j census (Figure 2.8a) shows the whole block is non-responsive, as is the first half of the it57j census (Figure 2.8b). However, in the middle of the it57j census (2013-12-16 GMT), this router began to generate NACK feedback for probes targeting unreachable IPs. Rather than each probe drawing a NACK, we instead see a roughly constant rate of 1.64 NACK/day. Similar NACK traffic is also seen in the it58j census (Figure 2.8c). This response is consistent with a rate-limited return path.

2.6 Validation

We validate our model against real-world routers and our testbed, and our algorithm with testbed experiments.

2.6.1 Does the Model Match Real-World Implementations?

We next validate our models for availability, response alternation, and response rate of rate-limited blocks. We show they match the ICMP rate limiting implementations in two carrier-grade, commercial routers and our testbed.
Figure 2.8: Responses from 103.163.18/24 over time (measured every 1.85 days), broken down as Echo Reply, NACK, no response, and other reply: (a) the it56j census, (b) the it57j census, (c) the it58j census.

Our experiments use two commercial routers (Cisco ME3600-A and Cisco 7204VXR) and one Linux box running the Linux filter iptables as rate limiters. Our measurement target is a fully responsive /16 block, simulated by one Linux box loaded with our customized Linux kernel [3]. In each experiment, we run a 6-round active ICMP probe, with the rate changing from below the limit to at most 7500× the rate limit (while holding the rate limit fixed), pushing our model to extremes.

We begin by validating our availability model from Equation 2.1. Figure 2.9 shows the model-predicted availability (the red line with squares) closely matches the router experiments (blue line with dots in the left graph) and the testbed experiments (blue line with dots in the right graph) from below to above the rate limit.

Figure 2.9: Validating the availability model: (a) router experiment, up to 8× the rate limit; (b) testbed experiment, up to 7500× the rate limit.
Figure 2.10: Validation of the alternation count model, up to 7500× the rate limit: (a) precise model, (b) approximate model.

We validate our response rate model from Equation 2.2. We omit this data due to space limitations, but our response rate model is accurate from 0.01× to 90× the rate limit.

We next validate our models of alternation counts (Equation 2.3 and Equation 2.4). Figure 2.10a shows the precise model fits perfectly from below the rate limit up to 7500× the rate limit. Figure 2.10b shows the approximate model (defined in Equation 2.4) fits when P ≫ L. In our case, with 6 rounds of active probing, the approximate model fits when P > 10L.

We are unable to validate the alternation count model with commercial routers; the routers were only available to us for a limited time. But we believe the testbed validation shows the correctness of our alternation count models, since we have already shown that rate limiting in the testbed matches that of the two commercial routers.

Figure 2.11: FADER detection in a noise-free environment: (a) detection correctness, (b) rate limit estimation.

2.6.2 Correctness in Noise-Free Testbed

We next test the correctness of FADER in a testbed without noise.
For our noise-free experiment, we run our high-rate measurement probing from 1.6L all the way to 240L, stressing FADER beyond its designed detection range of P < 60L. Figure 2.11a shows that FADER detection is perfect for P < 60L. However, as we exceed FADER's design limit (60L), it starts marking blocks as can-not-tell. The fraction of can-not-tell rises as P grows from 60L to 144L. Fortunately, without packet loss, even when the design limit is exceeded, FADER is never incorrect (it never gives a false positive or false negative); it just refuses to answer (returning can-not-tell).

In addition to detecting rate limiting, FADER gives an estimate of what that rate limit is. Figure 2.11b shows the precision of this estimate, varying P from L to 144L. The rate limit estimate is within 7% (from −4.2% to +6.9%) when P < 60L, and it degrades gradually as the design limit is exceeded.

Figure 2.12: FADER detection with packet loss: (a) detection correctness for 0% to 20% random loss, (b) rate limit estimation when P = 26L.

2.6.3 Correctness in the Face of Packet Loss

We next consider FADER with packet loss, one form of noise. Packet loss could be confused with loss due to rate limiting, so we vary the amount of random packet loss from 0 to 20%.

Figure 2.12a shows FADER detection as packet loss increases. There is almost no misdetection until probe rates become very high. At the design limit of P = 60L, we see only about 4% of trials reported as cannot-tell.

While the ability to detect is insensitive to noise, our estimate of the rate limit is somewhat less robust. Figure 2.12b shows that packet loss affects our estimate of the value of the rate limit (here we fix P = 26L, but we see similar results for other probe rates). The error in our rate limit estimate is about equal to the dropping rate (at a 20% loss rate, the median estimate of the rate limit is 20.72% high).

2.6.4 Correctness with Partially Responsive Blocks

We next consider what happens when blocks are only partially responsive. Partially responsive blocks are more difficult for FADER because probes sent to non-responsive addresses are dropped, reducing the signal induced by rate limiting. Here we vary the probe rate for blocks of different response densities. (We hold other parameters fixed and so do not add packet loss.)

Figure 2.13: FADER detection with partially responsive target blocks: (a) detection correctness, (b) rate limit estimation when P = 26L.

In Figure 2.13a we vary the relative probing rate and plot separate lines for each level of block responsiveness. In general, the number of can-not-tell responses increases as block responsiveness falls, but only when the probe rate is also much greater than the rate limit. In the worst case, with only 10% of IPs responding at a probe rate of 60× the rate limit, 35% of trials report can-not-tell.
Fortunately, even in these worst cases, the algorithm reports that it cannot tell rather than silently giving a wrong answer.

Figure 2.13b shows the rate limit output by FADER as the block density changes. We show the median and quartiles with box plots, and the minimum and maximum with whiskers. The median stays at the true value, but the variance increases, as shown by generally wider boxes and whiskers. Here P = 26L; we see similar results at other probing rates.

2.6.5 Correctness in Other Network Conditions

FADER is designed for the general Internet, but we consider how blocks that use DHCP or serve mobile networks might affect its accuracy.

DHCP: Address turnover in a DHCP block might affect FADER: long-term changes may affect its comparison of availability (lines 5, 16 and 18 in Algorithm 2), and short-timescale turnover might appear to be frequent alternation (Algorithm 1). When a DHCP server allocates addresses sequentially from a large block (multiple /24s), some of its /24 components may switch from busy to completely unutilized. When empty, FADER's availability comparison will trigger, but not frequent alternation. FADER's frequent alternation test would be fooled by short-term changes in DHCP address use (around minutes). However, DHCP studies suggest that typical churn occurs on timeframes of 5 to 61 hours [103], so DHCP churn will not usually affect FADER.

Mobile Networks: Mobile networks (telephones) may have higher packet loss than typical due to wireless fading. However, as the validation in Section 2.6.3 shows, random loss does not affect FADER's detection precision, although it does gradually decrease the precision of rate limit estimation as loss increases.

2.7 Related Work

Two other groups have studied the problem of detecting rate limits in the Internet.

Work from Université Nice Sophia Antipolis studies router rate-limiting for traceroutes [123]. Specifically, they study ICMP Type 11 (Time Exceeded) replies on reverse paths. They detect rate limits by launching TTL-limited ICMP echo requests from 180 vantage points, varying the probing rate from 1 to 4000 ping/s. Their algorithm looks for constant response rates as a sign of rate limits. They studied 850 routers and found 60% to do rate limiting. Our work has several important differences. The overall result is quite different: they find 60% of reverse paths are rate limited in their 850 routers, measured up to 4000 ping/s, while we find only 0.02% of forward paths in 40k /24 blocks are rate limited, with measurements up to 0.39 pings/s.
They also provide suggestions to both ISPs and content providers on how to mitigate the negative effects of traffic policing on user quality of experience. Their focus on TCP differs from ours on ICMP rate-limiting. Their coverage is far greater than ours, although that coverage is only possible because Google is a major content provider. They find fairly widespread rate limiting of TCP traffic, but their subject (TCP video) is so much faster than ours (ICMP) that such differences in results are not surprising.

2.8 Conclusion

Undetected rate limiting can silently distort network measurement and bias research results. We have developed FADER, a new, light-weight method to detect ICMP rate limiting. We validated FADER against commercial routers and through sensitivity experiments in a testbed, showing it is very accurate at detecting rate limits when probe traffic is between 1× and 60× the rate limit. We applied FADER to a large sample of the Internet (40k blocks) on two separate dates. We find that only a tiny fraction (0.02%) of Internet blocks are ICMP rate limited up to 0.39 pings/s per /24 block. We also examined public high-rate datasets (up to about 1 ping/s per /24 block) and showed their probing results are consistent with rate limiting. We only see significant rate limiting on the reverse path when routers near the prober see a large amount of traffic. We conclude that low-rate ICMP measurements (up to 0.39 ping/s per block) are unlikely to be distorted, while high-rate measurements (up to 1 ping/s per block) risk being rate limited.

This chapter is an example that supports our thesis statement. Specifically, we show that our new signature of traffic, based on differences in observed responsiveness of end-points between scans, enables detecting ICMP rate limiters, a new class of network devices. We show that besides detecting the existence of ICMP rate limiters, this new signature also enables characterizing them by estimating the packet rates of their rate limits. By applying detection with our new signature to multiple real-world network traces, we show that ICMP rate limiting exists but is very rare for probing rates up to 0.39 packets/s per block (Section 2.5.1). We also show that fast probing up to 1 packet/s per block risks being rate limited (Section 2.5.3). We thus conclude this work partially supports the thesis statement, providing one example as evidence. Our detection of ICMP rate limiting is an example of active device detection, because we use active probing; we next show passive detection of network devices (general IoT devices in Chapter 3 and compromised IoT devices in Chapter 4) using signatures of traffic about end-points.

Chapter 3

General IoT Device Detection

We next show an example of passive device detection with a signature of traffic about end-points. In this study, we use a new signature (observed identities of traffic end-points) to detect a new class of network devices: general IoT devices. This new signature also enables us to characterize detected IoT devices by inferring their device types. We report detections from a university campus over 4 months and from traffic transiting an IXP over 10 days. We show that AS penetration for 23 types of IoT devices has grown substantially (about 3.5×) from 2013 to 2018, but the device-type density in ASes detected with these device types increases only modestly. We also show substantial IoT deployment growth at the household level from 2013 to 2017.
This study of IoT device detection demonstrates our thesis statement as follows. We develop a new signature of traffic based on observed identities of end-points: devices talking to known combinations of IoT-specific end-points are IoT devices. With this new signature, we detect a new class of network devices: general IoT devices, both on the public Internet and behind NATs. Besides detection, this new signature also enables us to characterize these detected IoT devices by inferring their device types from the combination of end-points they talk to. Our detection is robust to different network topologies, especially those with NAT: we can identify IoT devices behind a NAT by observing their traffic end-points from outside. Our detection is also robust to the patterns of IoT traffic such as timing and rates.

Part of this chapter was published in the Workshop on Internet of Things Security and Privacy 2018 [56].

3.1 Introduction

There is huge growth in sales and the installed base of Internet-of-Things (IoT) devices like Internet-connected cameras, light-bulbs, and TVs. Gartner forecasts the global IoT installed base will grow from 3.81 billion in 2014 to 20.41 billion in 2020 [41].

This large and growing number of devices, coupled with multiple security vulnerabilities, brings an increasing concern about the security threats they raise for the Internet ecosystem. A significant risk is that compromised IoT devices can be used to mount large-scale Distributed Denial-of-Service (DDoS) attacks. In 2016, the Mirai botnet, with over 100k compromised IoT devices, launched a series of DDoS attacks that set records in attack bit-rates. Estimated attack sizes include a 620 Gb/s attack against the cybersecurity blog KrebsOnSecurity.com (2016-09-20) [75], and a 1 Tb/s attack against French cloud provider OVH (2016-09-23) [113] and DNS provider Dyn (2016-10-21) [35]. The size of the Mirai botnet used in these attacks has been estimated at 145k [113] and 100k [35]. Source code to the botnet was released [85], showing it targeted IoT devices with multiple vulnerabilities.

If we are to defend against IoT security threats, we must understand how many and what kinds of IoT devices are deployed. Our paper proposes three algorithms to discover the location, distribution and growth of IoT devices. We believe our algorithms and results could help guide the design and deployment of future IoT security solutions by revealing the scale of the IoT security problem (how wide-spread are certain IoT devices across the whole Internet or in certain parts of it?), the problem's growth (how quickly do new IoT devices spread over the Internet?) and the distribution of the problem (which countries or autonomous systems have certain IoT devices?). Our goal here is to assess the scope of the IoT problem; improving defenses is complementary future work.

Our IoT detection algorithms can also help network researchers study the distribution and growth of target IoT devices and help IT administrators discover and monitor IoT devices in their networks. As more every-day objects get connected to the Internet, our algorithms may even help understand the physical world by, for example, detecting and tracking network-enabled vehicles for crime investigation.

Our first contribution is to propose three IoT detection methods. Our two main methods detect IoT devices from observations of network traffic: IPs in Internet flows (Section 3.2.1) and stub-to-recursive DNS queries (Section 3.2.1).
They both use knowledge of servers run by manufacturers of these devices (called device servers). Our third method detects IoT devices supporting HTTPS remote access (called HTTPS-accessible IoT devices) from the TLS (Transport Layer Security [30]) certificates they use (Section 3.2.2). (We reported an early version of the IP-based detection method [56]; here we add additional methods and better evaluate our prior method in Section 3.3.1.)

Our second contribution is to apply our three detection methods to multiple real-world network measurements (Table 3.2). We apply our IP-based method to flow-level traffic from a college campus over 4 months (Section 3.3.1) and a regional IXP (Internet Exchange Point [23]) over 10 days (Section 3.3.1). We apply our DNS-based method to DNS traffic at 8 root name servers from 2013 to 2018 (Section 3.3.2) to study IoT deployment by Autonomous Systems (ASes [77]). We find about 3.5× growth in AS penetration for 23 types of IoT devices and a modest increase in device-type density for ASes detected with these device types (we find at most 2 known device types in 80% of these ASes in 2018). We confirm substantial deployment growth at the household level by applying the DNS-based method to DNS traffic from a residential neighborhood from 2013 to 2017 (Section 3.3.2). We apply our certificate-based method to a public TLS certificate dataset (Section 3.3.3) and find 254K IP cameras and network video recorders (NVRs) from 199 countries.

Our third contribution is that this study supports our thesis by using a new signature (observed identities of end-points) to detect a new class of network devices (general IoT devices) and characterize these network devices by inferring their device types.

This paper builds on prior work in the area. We draw on data from the University of New South Wales (UNSW) [134]. Others are currently studying the privacy and vulnerabilities of individual devices (for example [6]); we focus on detection. Prior work has studied detection [10, 20, 32, 97, 130, 132, 134], but we use different detection signals to observe devices behind NATs (Network Address Translation devices [37]) as well as those on public IP addresses (detailed comparisons in Section 3.5). We published an early version of IP-based detection in a workshop [56]. This paper adds two new detection methods, DNS-based detection (Section 3.2.1) and certificate-based detection (Section 3.2.2), and adds a new 4-month study of IoT devices on a college campus for IP-based detection (Section 3.3.1).

Our studies of IP-based and DNS-based detections are approved by USC IRB as non-human subject research (IRB IIR00002433 on 2018-03-27 and IRB IIR00002456 on 2018-04-19). We make data captured from our 10 IoT devices (Table 3.1) public at [51].

3.2 Methodology

We next describe our three methods to find IoT devices. Our three detection methods use different types of network measurements (IPs in Internet flows, Section 3.2.1; stub-to-recursive DNS queries, Section 3.2.1; and TLS certificates, Section 3.2.2) to achieve different coverage of IoT devices. Combining our three methods reveals a more complete picture of IoT deployment in the Internet. (However, even with all three methods, we do not claim complete coverage of global IoT deployment.)
3.2.1 IP and DNS-Based Detection Methods

Our two main methods detect general IoT devices from two types of passive measurements: Internet flows, measured from any vantage point in the Internet (IP-based method); and DNS queries, measured between stub and recursive servers (DNS-based method). These two methods cover IoT devices that are visible to these two data sources, including those that use public IP addresses or are behind NAT devices.

Our methods exploit the observation that most IoT devices exchange traffic regularly with device-specific servers. (For example, the IoT Inspector project observes 44,956 IoT devices from 53 manufacturers talking to cloud servers during normal operation [65].) If we know these servers, we can identify IoT devices by watching traffic for these packet exchanges. Since servers are usually unique for each class of IoT device, we can also identify the types of devices. Our approaches consider only with whom IoT devices exchange traffic, not patterns like timing or rates, because patterns are often obscured when traffic mixes (such as with multiple devices behind a NAT).

Our two methods depend on identifying servers that devices talk to (Section 3.2.1), and looking for these servers by IP address (Section 3.2.1) and DNS name (Section 3.2.1). Although our method is general, it requires knowledge of what servers devices talk to, and therefore it requires device-specific data obtained by us or others. We still detect devices that change the servers with which they interact, provided they continue to talk to most of their old servers. For IoT devices behind NAT, our methods only identify the existence of each type of IoT device but cannot know the exact number of devices of each type, because we cannot count NATted devices from outside the NAT.

Manufacturer     Model                Alias
Amazon           Dash Button          Amazon Button
Amazon           Echo Dot             Amazon Echo
Amazon           Fire TV Stick        Amazon FireTV
Amcrest          IP2M-841 IP Cam      Amcrest IPCam
D-Link           DCS-934L IP Cam      D-Link IPCam
Foscam           FI8910W IP Cam       Foscam IPCam
Belkin (Wemo)    Mini Smart Plug      Belkin Plug
TP-Link          HS100 Smart Plug     TPLink Plug
Philips (Hue)    A19 Starter Kit      Philips LightBulb
TP-Link          LB110 Light Bulb     TPLink LightBulb
Table 3.1: The 10 IoT Devices that We Purchased

Identifying Device Server Names

Our approach depends on knowing what servers devices talk to. Our goal is to find domain names for all servers that IoT devices regularly and uniquely talk to. However, we need to remove server names that are often shared across multiple types of devices, since they would otherwise produce false detections. Note that even with our filtering of common shared server names, we sometimes find servers that are shared across multiple types of devices. We handle this ambiguity from shared servers by not trying to distinguish these device types in detection, as we explain later in this section.

Identifying Candidate Server Names: We bootstrap our list of candidate server names by purchasing samples of IoT devices and recording who they talk to. We describe the list of devices we purchased in Table 3.1 and provide the information we learned as a public dataset [51]. For each IoT device we purchase, we boot it and record the traffic it sends. We extract the domain names of server candidates from the type-A DNS requests made by the target IoT device in operation. We capture DNS queries at the ingress side of the recursive DNS resolver to mitigate the effects of DNS caching.
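As an illustration of this bootstrapping step, the sketch below pulls the queried names out of type-A DNS requests in a packet capture using Scapy. The capture file name is hypothetical and the chapter does not prescribe a particular tool; this is only one way the extraction could be done.

```python
from scapy.all import rdpcap, DNS, DNSQR

def candidate_server_names(pcap_path):
    """Return the set of domain names queried in type-A DNS requests seen in
    a capture taken while the IoT device under test is running."""
    names = set()
    for pkt in rdpcap(pcap_path):
        if pkt.haslayer(DNSQR) and pkt[DNS].qr == 0:   # qr == 0 marks a query
            q = pkt[DNSQR]
            if q.qtype == 1:                           # qtype 1 is an A record
                names.add(q.qname.decode().rstrip("."))
    return names

# Example (hypothetical capture of an IP camera booting):
print(sorted(candidate_server_names("ipcam_boot.pcap")))
```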
Filtering Candidate Server Names: We exclude domain names for two kinds of servers that would otherwise cause false positives in detection. One is third-party servers: servers not run by IoT manufacturers that are often shared across many device types. The other is human-facing servers: servers that also serve humans.

Third-party servers usually offer public services like time, news, and music or video streaming. If we include them, they would cause false positives because they interact with many different clients. We consider server name S a third-party server for some IoT product P if neither P's manufacturer nor the sub-brand P belongs to (if any) is a substring of S's domain (regardless of case). We define the domain of a URL as the immediate left neighbor of the URL's public suffix. (We identify public suffixes based on the public suffix list from the Mozilla Foundation [104].) We use the Python library tldextract to identify TLD suffixes [78].

Human-facing servers serve both humans and devices (note that all server candidates serve devices, because they were DNS-queried by IoT devices in the first place). They may cause mis-classifying a laptop or cellphone (operated by a human) as an IoT device. We identify human-facing servers by whether they respond to web requests (HTTP or HTTPS GET) with human-focused content. We define respond as returning an HTML page with status code 200. We define human-focused content as the existence of any web content instead of place-holder content. Typically place-holder content is quite short. (For example, http://appboot.netflix.com shows the place holder "Netflix appboot" and is just 487 bytes.) So we treat HTML text longer than 630 bytes as human-focused content. We determined this threshold empirically from HTTP and HTTPS content at 158 server domain names queried by our 10 devices (Table 3.1).

We call the remaining server names device-facing manufacturer servers, or just device servers, because they are run by IoT manufacturers and serve devices only. We use device servers for detection.

Handling Shared Server Names: Some device server names are shared among multiple types of IoT devices from the same manufacturer and can cause ambiguity in detection. If different device types share the exact set of server names, then we cannot distinguish them and simply treat them as the same type—a device merge. If different device types have partially overlapping sets of device server names, we cannot guarantee they are distinguishable. If we treat them as separate types, we risk false positives and confusing the two types. We avoid this problem with a detection merge: when we detect device types sharing common server names, we conservatively report that we detect at least one of these device types. (Potentially we could look for unique device servers in each type; we do not currently do that.)

Handling Future Server Name Changes: The server names that our devices (Table 3.1) use are quite stable over 1 to 1.5 years (as shown in Section 3.4.2). However, both our IP-based and DNS-based detections risk missing devices that get software updates that cause them to talk to new server names. We mitigate these potential missed detections by reporting that a device exists when we see a majority of server names for that device (both the IP-based method, Section 3.2.1, and the DNS-based method, Section 3.2.1). For the DNS-based method, we also propose a technique to discover new device server names during detection (Section 3.2.1).
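Returning to the third-party-server rule above, here is a minimal sketch using the tldextract library mentioned in the text. The is_third_party helper and the example server names are ours, for illustration only.

```python
import tldextract

def is_third_party(server_name, manufacturer, sub_brand=None):
    """A server is third-party for a product if neither the manufacturer name
    nor the product's sub-brand (if any) appears in the server name's domain,
    case-insensitively, where the domain is the label just left of the public suffix."""
    domain = tldextract.extract(server_name).domain.lower()
    brands = [manufacturer] + ([sub_brand] if sub_brand else [])
    return not any(b.lower() in domain for b in brands)

print(is_third_party("pool.ntp.org", "Amcrest"))               # True: shared time service
print(is_third_party("config.amcrestcloud.com", "Amcrest"))    # False: manufacturer server
```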
IP-Based IoT Detection Method

Our first method detects IoT devices by identifying packet exchanges between IoT devices and device servers. For each device type, we track a device-type-to-server-name mapping: a list of device server names that type of device talks to. We then define a threshold number of server names; we interpret the presence of traffic to that number of server names (identified by server IP) from a given IP address as indicating the presence of that type of IoT device.

Tracking Server IP Changes: We search for device servers by IP address in traffic, but we discover device servers by domain name from sample devices. We therefore need to track when the DNS resolution for a server name changes. We assume server names are long-lived, but the IP addresses they use sometimes change. We also assume server-name-to-IP mappings could be location-dependent. We track changes of server-name-to-IP mappings by resolving server names to IP addresses every hour (frequent enough to detect possible DNS-based load balancing). To make sure the IPs for detection are correct, we track server IPs across the same time period and at roughly the same geo-location as the measurement of the network traffic under detection.

Completeness Threshold Selection: Since some device servers may serve both devices and individuals (because we use a necessary condition to determine device-facing servers in Section 3.2.1 and risk mis-classifying a human-facing manufacturer server as a device server), and sometimes we might miss traffic to a server name due to observation duration or lost captures, we set a threshold of server names required to indicate the presence of each IoT device type. This threshold is typically a majority, but not all, of the server names we observe a representative device talk to in the lab. (This majority-but-not-all threshold also mitigates potential detection misses caused by devices that start talking to new servers.)

Most devices talk to a handful of device server names (up to 20, from our laboratory measurements, Section 3.3.1). For these types of devices, we require seeing at least 2/3 of the device server names to believe a type of IoT device exists at a given source IP address. The threshold 2/3 is chosen because, for devices with 3 or more server names, requiring anything more than 2/3 of the server names is equivalent to requiring all server names for some devices. For example, requiring at least 4/5 of the server names is equivalent to requiring all server names for devices with 3 to 4 device server names. For devices that talk to many device server names (more than 20), we lower our threshold to 1/2. Typically these are devices with many functions whose manufacturer uses a large pool of server names. (For example, our Amazon FireTV, as in Table 3.1, has 41 device server names.) Individual devices will most likely talk to only a subset of the pool, at least over short observations.

Limitation: Although effective, IP-based detection faces two limitations. First, it cannot detect IoT devices in previously stored traces, since we usually do not know device server IPs in the past, and the coverage of commercial historical DNS datasets can be limited [56]. Second, we assume we can learn the set of servers the IoT devices talk to. If we do not learn all servers during bootstrapping (Section 3.2.1), or if device behavior changes (perhaps due to a firmware update), we need to learn new servers.
However, we cannot learn new device servers during IP-based detection, because we find it hard to judge whether an unknown IP is a device server, even with the help of reverse DNS and TLS certificates from that IP. These limitations motivate our next detection method.

DNS-Based IoT Detection Method

Our second method detects IoT devices by identifying the DNS queries made prior to the actual packet exchanges between IoT devices and device servers.

Strengths: This method addresses the two limitations of IP-based detection (Section 3.2.1). First, we can directly apply DNS-based detection to old network traces, because server names are stable while server IPs can change. Second, we can learn new device server names during DNS-based detection by examining unknown server names DNS-queried by detected IoT devices and learning those that look like device servers (using the rules in Section 3.2.1).

Figure 3.1: Workflow for DNS-based IoT detection with server learning: detection is repeated with the updated list of IoT device server names, learning new server names after each iteration and splitting device types when a detection decrease signals an incorrect merge, until no new server names are learned.

Limitations: This method requires observation of DNS queries between end-user machines and recursive DNS servers, limiting its use to locations that can see "under" recursive DNS resolvers. This method also works with recursive-to-authority DNS queries (see Section 3.3.2) when observations last longer than DNS caching, since then we see user-driven queries for server names even above the recursive. Detection with recursive-to-authority DNS queries reveals the presence of IoT devices at the AS level, since recursives are usually run by ISPs (Internet service providers [139]) for their users.

Method Description: Our DNS-based method has three components: detection, server learning and device splitting. Figure 3.1 illustrates this method's overall workflow: it repeatedly conducts detections with the latest knowledge of IoT device server names, learns new device server names after each detection, and terminates when no new server names are learned (see the loop of "Detection" and "Server Learning" in Figure 3.1). This method also revises newly learned server names by device splitting if it suspects they are incorrect, as signaled by decreased detection after new server names are added (see "Device Splitting" in Figure 3.1).

Detection: Similar to Section 3.2.1, for each type of IoT device, we track a list of device server names that type of device talks to. We interpret the presence of DNS queries for more than a threshold (the same as Section 3.2.1) number of device server names from a given IP address as the presence of that IoT device type. (We call this IP an IoT user IP.) To cover possible variants of known device servers, in detection we treat digits in a server name's sub-domain as matching any digit. We define the sub-domain of a URL as everything to the left of the URL's domain (the URL's domain as defined in Section 3.2.1).

Server Learning: After each detection, we learn new device server names and use them in subsequent detections. Specifically, we examine unknown server names DNS-queried by IoT user IPs, and if we find any unknown server names that resemble device servers for a certain IoT device detected at a certain IoT user IP (judged by the rules in Section 3.2.1), we extend this IoT device's server name list with these unknown server names.
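To illustrate the detection step described above (digit-wildcard matching of known server names plus the majority threshold from Section 3.2.1), here is a sketch. The data structures, example names, and helper functions are ours; for simplicity it wildcards digits anywhere in the name, whereas the text wildcards only digits in the sub-domain.

```python
import re
from collections import defaultdict

def name_pattern(server_name):
    """Turn a known device server name into a regex in which each digit matches
    any digit, so 'api1.example-iot.com' also matches 'api7.example-iot.com'."""
    return re.compile("^" + re.sub(r"\d", r"\\d", re.escape(server_name)) + "$")

def detect(dns_queries, device_servers):
    """dns_queries: iterable of (source_ip, queried_name) pairs.
    device_servers: maps a device type to its list of known server names.
    A device type is reported at an IP when the fraction of its server names
    seen there meets the threshold: 2/3, or 1/2 for types with > 20 names."""
    patterns = {d: [(s, name_pattern(s)) for s in servers]
                for d, servers in device_servers.items()}
    seen = defaultdict(set)                      # (ip, device type) -> matched names
    for ip, qname in dns_queries:
        for device, plist in patterns.items():
            for server, pat in plist:
                if pat.match(qname):
                    seen[(ip, device)].add(server)
    detections = []
    for (ip, device), matched in seen.items():
        total = len(device_servers[device])
        threshold = 0.5 if total > 20 else 2 / 3
        if len(matched) / total >= threshold:
            detections.append((ip, device))
    return detections

# Hypothetical example: 2 of 3 known server names seen at one source IP.
queries = [("10.0.0.5", "api2.example-iot.com"), ("10.0.0.5", "time.example-iot.com")]
known = {"Example Plug": ["api1.example-iot.com", "time.example-iot.com", "fw.example-iot.com"]}
print(detect(queries, known))   # [('10.0.0.5', 'Example Plug')]
```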
Device Splitting: We may incorrectly merge two types of devices that talk to different sets of servers if we only know their shared server names prior to detection. Incorrect device merges can reduce detection rates. When we falsely merge different device types P1 and P2 as P, we risk learning new server names for the merged type P that P1 and P2 devices do not both talk to, causing reduced detections of P in subsequent iterations, because we miss some P1 (or P2) devices by searching for newly-acquired server names that P1 (or P2) does not talk to.

Device splitting addresses this problem by reverting the incorrect merge. If we detect fewer devices of type P at certain IPs after learning new server names, we know P is an incorrect merge of two different device types, P1 and P2, and that the new server names learned for P do not apply to both P1 and P2. We thus split P into P1 and P2, with P1 talking to P's server names before the last server learning (without the newly-learned server names) and P2 talking to P's latest server names (with the new server names). We show an example of how device splitting reverts an incorrect device merge later in a controlled experiment (Section 3.4.2).

3.2.2 Certificate-Based IoT Detection Method

Our third method detects IoT devices using HTTPS by actively scanning for TLS certificates and identifying target IoT devices' TLS certificates. This method thus covers HTTPS-accessible IoT devices either with public IPs or behind NATs but forwarded to a public port. However, certificate scanning will miss devices behind NATs that lack public-facing IP addresses, and IoT devices that do not use TLS.

Note that prior work has mapped TLS certificates to IoT devices, both by matching text (like "IP camera") with certificates [130], and by using community-maintained annotations [32]. In comparison, our method uses multiple techniques to improve the accuracy of certificate matching, and also confirms that matched certificates come from HTTPS servers running in IoT devices.

We use existing public crawls of IPv4 TLS certificates. We first identify candidate certificates: the TLS certificates that contain target devices' manufacturer names and (optionally) product information. Candidate certificates most likely come from HTTPS servers related to target devices, such as HTTPS servers run by their manufacturers and HTTPS servers running directly in them. We then identify IoT certificates: the candidate certificates that come from HTTPS servers running directly in target devices. Each IoT certificate represents an HTTPS-accessible IoT device.

Identify Candidate Certificates

We identify candidate certificates for every target device by testing each TLS certificate against a set of text strings we associate with each device (called matching keys). (We describe where our list of target devices comes from in Section 3.3.3.)

Matching Keys: We build a set of matching keys for each target device with the goal of suppressing false positives in finding candidate certificates. If a target device's manufacturer does not produce any other type of Internet-enabled product (per product information on manufacturer websites), its matching key is simply the name of its manufacturer (called the manufacturer key). Otherwise, its matching keys are the manufacturer key plus its product type (like "IP Camera"). We also include IoT-specific sub-brands (if any). For example, "American Dynamics" is the sub-brand associated with the IP cameras manufactured by Tyco International.
We do two kinds of matching between a matching key K and a field S in a TLS certificate: Match means K is a substring of S (ignoring case); Good-Match means K is a Match of S and the character(s) adjacent to K's match in S are neither letters nor digits. For example, "GE" is a Match but not a Good-Match of "Privilege", because the character adjacent to "GE" in "Privilege" is "e" (a letter). (We do not simply look for identical K and S because often S uses a prefix or suffix. For example, a certificate's subject-organization field "Amcrest Technologies LLC" will be a Good-Match with the manufacturer key "Amcrest", but is not identical due to the suffix "Technologies LLC".)

Requiring a Good-Match for manufacturer keys reduces false positives caused by IoT manufacturer names being substrings of other companies' names. For example, the name of IP camera manufacturer "Axis Communications" is a substring of telecom company "Maxis Communications", but they are not a Good-Match.

We use the Match (not Good-Match) rule for the other keys (product types and sub-brands) because they require greater flexibility. For example, the product type "NVR" can be matched to a text string like "myNVR".

Key Matching Algorithm: We test each TLS certificate (input) with the matching keys from each target device. Specifically, we examine four subject fields in a TLS certificate C (organization C_O, organization units C_OU, common name C_CN and SubjectAltNames C_DN, if present) and consider C a candidate certificate for device P if P's manufacturer key (K_m^P) Good-Matches C_O and any non-manufacturer keys for P Match any of these four subject fields in C.

We handle two edge cases when testing whether K_m^P Good-Matches C_O. If C_O is empty, or a default ("SomeOrganization" or "company"), we instead test whether K_m^P Good-Matches any of the other three fields we examine (C_OU, C_CN and C_DN). If we compare K_m^P to a field that is a URL, we only match K_m^P against the URL's domain part (the URL's domain as defined in Section 3.2.1), because the domain shows ownership of a server name. (For example, Accedo Broadband owns *.sharp.accedo.tv, not Sharp.)

Identify IoT Certificates

We identify IoT-specific certificates because they are not typically signed by a Certificate Authority. We identify them because they are self-signed and lack valid domain names.

Self Signing: Many HTTPS servers on IoT devices use self-signed certificates rather than CA-signed certificates to avoid the cost and complexity of CA signing. We consider a candidate certificate C (for device P) self-signed if C's issuer organization C_iO is either a copy of any of the 4 subject fields we examine (C_O, C_OU, C_CN and C_DN) or is Good-Matched by P's manufacturer key (K_m^P).

Lacking Valid Domain Names: Often IoT users lack dedicated DNS domain names for their home networks. The only exception we found is that some devices use "www."+manufacturer+".com" as a place holder for C_CN. (For example, www.amcrest.com for Amcrest IP Cameras.) We consider a candidate certificate C to lack valid domain names if none of the values in C_CN and C_DN (if present) is a valid domain name. We ignore Dynamic DNS names (using a public list of dynamic DNS providers [110]) and default names.
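The sketch below restates the Good-Match rule and the self-signing test in code. The certificate is represented as a plain dict with our own field names, and the valid-domain-name test and dynamic-DNS list are omitted, so this is an illustration of the rules above rather than the full classifier.

```python
import re

def good_match(key, field):
    """key Good-Matches field: key is a substring of field (ignoring case) and the
    characters adjacent to the match are neither letters nor digits, so "GE" does
    not Good-Match "Privilege" but "Amcrest" Good-Matches "Amcrest Technologies LLC"."""
    if not field:
        return False
    for m in re.finditer(re.escape(key), field, re.IGNORECASE):
        before = field[m.start() - 1] if m.start() > 0 else ""
        after = field[m.end()] if m.end() < len(field) else ""
        if not before.isalnum() and not after.isalnum():
            return True
    return False

def is_self_signed(cert, manufacturer_key):
    """cert: dict with subject organization 'O', org units 'OU', common name 'CN',
    SubjectAltNames 'SAN' (list), and issuer organization 'iO'.  Self-signing test
    from above: the issuer organization copies one of the subject fields or is
    Good-Matched by the manufacturer key."""
    subject_fields = [cert.get("O", ""), cert.get("OU", ""), cert.get("CN", "")] \
                     + list(cert.get("SAN", []))
    issuer = cert.get("iO", "")
    return issuer in subject_fields or good_match(manufacturer_key, issuer)

print(good_match("GE", "Privilege"))                       # False
print(good_match("Amcrest", "Amcrest Technologies LLC"))   # True
print(is_self_signed({"O": "Amcrest", "CN": "192.168.1.10",
                      "iO": "Amcrest"}, "Amcrest"))        # True
```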
Dataset   Type   Span       IP Assignment   Coverage
USC       IP     4 Months   Dynamic         A College Campus
FRGP      IP     10 Days    Dynamic         An IXP's Clients
DITL      DNS    6 Years    N/A             The Whole Internet
CCZ       DNS    5 Years    Static          A Neighborhood
ZMap      Cert   1 Day      N/A             The Public Internet
Table 3.2: Datasets for Real-World IoT Detection
3.2.3 Adversarial Prevention of Detection
Although our methods generally work well in IoT detection, they are not designed to prevent an adversary from hiding IoT devices. For example, use of a VPN (Virtual Private Network [46]) that tunnels traffic from the IoT device to its servers would evade IP-based detection. IoT devices that access device servers with hard-coded IP addresses rather than DNS names will avoid our DNS-based detection. Although an adversary can hide IoT devices, since they are designed for consumer use and to minimize costs, we do not anticipate widespread intentional concealment of IoT devices. (We did not observe any devices intentionally avoiding detection during our study.)
3.3 Results: IoT Devices in the Wild
We next apply our detection methods to real-world network traffic (Table 3.2) to learn about the distribution and growth of IoT devices in the wild. Although we have no ground truth from the real world, we demonstrate that our methods show high detection accuracy in controlled experiments with ground truth in Section 3.4.
3.3.1 IP-Based IoT Detection Results
To apply our IP-based detection, we first extract device server names from 26 devices by 15 vendors (Section 3.3.1). We then apply detection to Internet flows at a college campus over a 4-month period (Section 3.3.1) and to partial traffic from an IXP (Section 3.3.1).
Identifying Device Server Names
We use device servers from two sets of IoT devices in detection: 10 IoT devices we purchased (Table 3.1) and 21 IoT devices from data provided by UNSW (devices as listed in Figure 1b of [134]). (Our 10 devices were chosen for their popularity on amazon.com in 2018.) We extract device server names from both sets of devices with the method in Section 3.2.1.
We break down the server names we found. Of the 171 candidate server names from our 10 devices, about half (56%, 96) are third-party servers, providing time, news or music streaming, while the other half (44%, 75) are manufacturer servers. Of these manufacturer servers, only a small portion (7%, 5) are human-facing (like prime.amazon.com). The majority of manufacturer servers (93%, 70) are device-facing and will be used in detection. We manually examine the 171 candidate server names and confirm the classifications for most of them are correct (for 157 out of 171, ownership of the server domain is verified by whois or websites).
We cannot verify ownership of 11 candidate server names. Luckily, our method lists them as third-party servers and they will not be used in detection. We find three candidate server names (api.xbcs.net, heartbeat.lswf.net, and nat.xbcs.net) are falsely classified as third-party servers. We confirm they are run by IoT manufacturer Belkin, based on whois of lswf.net and prior work [129], and add them back to our list. These three server names fail our test for manufacturer servers (Section 3.2.1) because their domains show no information about the manufacturer.
Similarly, we extracted 48 device servers from 18 of the 21 IoT devices from UNSW (using datasets available on their website https://iotanalytics.unsw.edu.au). The remaining 3 of their devices are not detectable with our method because they only visit third-party and human-facing servers.
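The manufacturer-server test mentioned above (a server whose domain shows no manufacturer information fails it, as api.xbcs.net did for Belkin) can be sketched as a simple domain check. This is only a rough illustration of the idea; the actual classification in Section 3.2.1 uses additional conditions, and the registered-domain extraction here ignores multi-label public suffixes.

```python
def registered_domain(server_name):
    """Crude registered-domain extraction: last two labels of the name.
    (Real code should use the Public Suffix List to handle e.g. .co.uk.)"""
    return '.'.join(server_name.lower().rstrip('.').split('.')[-2:])

def looks_like_manufacturer_server(server_name, manufacturer):
    """Heuristic: the manufacturer's name appears in the server's registered
    domain.  Names like api.xbcs.net fail this test even though whois shows
    Belkin runs them, which is the misclassification described above."""
    return manufacturer.lower().replace(' ', '') in registered_domain(server_name)

for name in ['api.amazon.com', 'time.nist.gov', 'api.xbcs.net']:
    print(name, looks_like_manufacturer_server(name, 'Amazon'),
          looks_like_manufacturer_server(name, 'Belkin'))
# api.amazon.com  True  False
# time.nist.gov   False False
# api.xbcs.net    False False   <- Belkin server missed by the domain check
```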
Combining the server names measured from our 10 devices and the 18 detectable devices from UNSW (merging two duplicate devices, Amazon Echo and TPLink Plug) gives us 26 detectable IoT devices. Among these 26 detectable IoT devices, we merge TPLink IPCam, TPLink Plug and TPLink LightBulb as a meta-device because they talk to the same set of device servers (a device merge, recall Section 3.2.1). Similarly, we merge Belkin Switch and Belkin MotionSensor. After device merging, we are left with 23 merged devices talking to 23 distinct sets of device server names. (Together they have 99 distinct device server names.) By detecting with these server names, we are essentially looking for 23 types of IoT devices that talk to these 23 sets of server names, including but not limited to the 26 IoT devices owned by us and UNSW.
IoT Deployment in a College Campus
We apply our IP-based detection method to partial network traffic from our university campus over a 4-month period in 2018.
Input Datasets: We use passive Internet measurements at the University of Southern California (USC) guest WiFi for 4 different 4-day-long periods from August to November 2018 (Table 3.2). To protect user privacy, packet payloads are not kept and IPs are anonymized by scrambling the last byte of each IP address in a prefix-preserving manner.
Month   IoT Detections   IoT User IPs   Est. IoT Users (Res : Non-Res)   Est. IoT Devices
Aug     13               6              2 (2 : 0)                        5 to 7
Sep     23               6              5 (2 : 3)                        21 to 28
Oct     19               6              4 (3 : 1)                        11 to 15
Nov     10               3              2 (2 : 0)                        8 to 12
Table 3.3: 4-Month IoT Detection Results on USC Campus and Our Estimations of IoT Users and Devices
Input Server IPs: Since server-name-to-IP bindings could vary over time and physical location (as discussed in Section 3.2.1), we collect the latest IPv4 addresses for our 99 device server names daily at USC, as described in Section 3.2.1. Ideally we would always use the latest server IPs in detection. However, due to outages in our infrastructure, we can only ensure the server IPs we use in detection are no more than one month old.
IoT Detection Results: As shown in Table 3.3, IoT detections increase on campus from August to September (from 13 to 23), but decrease in October and November (to 19 and then 10). In comparison, IoT user IPs on campus remain the same from August to October (6) and drop in November (3). (We discuss reasons behind these variations in campus IoT deployment later in this section.)
We show our August detection results in Table 3.4. (Detections in other months are similar.) Note that "Amazon *" in Table 3.4 stands for at least one of Amazon FireTV and Amazon Echo. Similarly, "Withings *" stands for at least one of Withings Scale and Withings SleepSensor (recall detection merge in Section 3.2.1). We find that IoT user IPs are often detected with multiple device types, suggesting the use of network-address translation (NAT) devices. We also find two sets of IoT user IPs (A and H; C and F), each sharing the exact same set of IoT device types. A likely explanation is that these two sets of IPs belong to two IoT users using dynamically assigned IP addresses, and these addresses changed once during our 4-day observation. (We discuss IoT users on campus further below.)
61 IP-A & IP-H IP-B IP-C & IP-F IP-D LiFX LightBulb Withings * HP Printer LiFX LightBulb Amazon * Withings * Withings * Amazon * Amazon * Table 3.4: August IoT Detection Results on USC Campus (Merging IPs with Identical Detections) Since USC guest WiFi dynamically assigns IPs, our counts of IoT detections and IoT user IPs risk over-estimating actual IoT deployments on campus. When one user gets multiple IPs, our IoT user IP count over-estimates IoT user count. When one user’s devices show up in multiple IPs, our IoT detection count gets inflated. (We validate our claim that dynamic IPs inflate detection in Section 3.4.1.) Estimating Numbers of IoT Users and Devices: To get a better knowledge of actual IoT deployments on campus, we estimate the number of IoT users on campus based on the insight that although one user could get assigned different IPs, he may still be identified by the combination of IoT device types he owns. We then infer the number of IoT devices we see on campus given this many users. We infer the existence of IoT users by clustering IoT user IPs from the same month or adjacent months that have similar detections. We consider detec- tions at two IPs (represented by two sets of detected IoT device types d1 and d2, without detection merge) to be similar if they satisfy the following heuristic: size(intersect(d1;d2))=size(union(d1;d2))0:8. While our technique risks under-estimating the number of IoT users by combining different users who happen to own same set of device types into one user, we argue this error is unlikely because most IP addresses that have IoT devices (16 out of 21, 76%) show multiple device types (at least 4, without detection merge), and the chance that two different users have identical sets of device types seems low. 62 We find three clusters of IPs: with one each spanning 4, 3 and 2 months. These three clusters of IPs likely belong to three campus residents who could install their IoT devices relatively permanently on campus, such as students living on campus and faculty (or staff) who have office on campus. We find four IPs that do not belong to any clusters. These four IPs likely belong to four campus non-residents who only brought their devices to campus briefly, such as students living off-campus and other campus visitors. We then estimate number of IoT devices on campus in each month by adding up devices owned by estimated IoT users in each month. We estimate devices owned by a given user in a given month by taking the union of device types detected from this user’s IPs in this month and assuming this user owns exactly one device from each detected type. (Recall from Section 3.2.1 that for NATted IoT devices, our method only identifies the existence of device types but cannot know the device count for each type.) We summarize our estimated numbers of IoT users and devices in Table 3.3. (Our estimated IoT device counts are ranges of numbers because we do not always know the exact number of detected device types due to detection merge). Our first observations is campus residents are mostly stable except an existing resident disappear in November (likely due to he stops using his only detected device type: LiFX LightBulb) and a new resident show up in October (likely due to a faculty or staff installing new IoT devices in their office). Our second observation is number of campus non-residents differs a lot by month. While we find 3 non-residents in September and 1 non-resident in October, we find none in August and November. 
One explanation for this trend is that there are more campus events in the middle of the semester (September and October), which attract more campus visitors (potentially bringing IoT devices).
We argue that the small number of IoT users and devices we detect is an under-estimation of the actual campus IoT deployment, since our measurements only cover the campus guest WiFi and we expect IoT devices to also be deployed on wired networks and secure WiFi that we do not cover.
IoT Devices at an IXP
We also apply IP-based detection to partial traffic from an IXP, using the FRGP ContinuousFlowData (FRGP) dataset [142] collected by Colorado State University from 2015-05-10 to 2015-05-19 (10 days), as in Table 3.2. We find 122 triggered detections of 10 to 11 device types (we do not know the exact number of types due to detection merge, Section 3.2.1) from 111 IPs. (Similar to Section 3.3.1, since clients of FRGP may use dynamically assigned IPs, our detection counts and IoT user IP counts risk being inflated.) Please see our tech report for details [54].
3.3.2 DNS-Based IoT Detection Results
We next apply our DNS-based detection to two real-world DNS datasets.
Global AS-Level IoT Deployments
We apply detection to Day-in-the-Life of the Internet (DITL) datasets from 2013 to 2018 to explore growth of AS-level deployments for our 23 device types.
Input Datasets: Our detection uses DITL datasets from 8 of the 13 root DNS servers (each a root letter) between 2013 and 2018 (excluding the G, D, E and L roots for not participating in all of these DITL collections, and the I root for using anonymized IPs) to show growth in AS-level IoT deployment in this period, as summarized in Table 3.2. Each DITL dataset contains the DNS queries received by a root letter in a 2-day window.
Since root DNS servers see requests from recursive DNS resolvers (usually run by ISPs for their users), these results detect devices at the AS level, not for households. To find out the ASes where detected devices come from, we map recursive DNS resolvers' IPs to AS numbers (ASNs) with CAIDA's Prefix-to-AS mappings dataset [16]. Since the data represents ASes instead of households, we do detection only (Section 3.2.1) and omit the server-learning portion of our algorithm: with many households mixed together, AS-sized aggregation risks learning wrong servers. To count per-device-type detections, we do not use detection merge (Section 3.2.1).
With more than half of all 13 root letters (62%, 8 out of 13), we expect to observe queries from the majority of recursives in the Internet, because prior work has shown that under 2-day observation, most (at least 80%) recursives query multiple root letters (with 60% of recursives querying at least 6 root letters) [106]. However, even with visibility into the majority of recursives, our detection still risks under-estimating AS-level IoT deployment because the 2-day DITL measurement is too short to observe DNS queries from all known IoT device types behind these visible recursives. (Under short observation, IoT DNS queries can be hidden from root letters by both DNS caching and non-IoT overshadowing: if a non-IoT device queries a TLD before an IoT device behind the same recursive does, the IoT DNS query, instead of being sent to a root letter, will be answered by the DNS cache created or renewed by the non-IoT DNS query.) Consequently, we mainly focus on the trend shown in our detection results instead of the exact number of detections.
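Mapping a recursive resolver's IP to its AS is a longest-prefix match against CAIDA's prefix-to-AS data. The sketch below shows the idea with the standard library only; the file format assumed here is the routeviews-style "network length asn" lines, and production code would use a radix tree (for example, the pyasn library) rather than a linear scan.

```python
import ipaddress

def load_prefix2as(lines):
    """Parse 'network length asn' lines into (network, asn) pairs.
    Multi-origin entries such as '3356_7018' are kept as strings."""
    table = []
    for line in lines:
        net, length, asn = line.split()
        table.append((ipaddress.ip_network(f"{net}/{length}"), asn))
    # Most-specific prefix first, so the first hit is the longest match.
    table.sort(key=lambda e: e[0].prefixlen, reverse=True)
    return table

def ip_to_asn(ip, table):
    """Longest-prefix match of a resolver IP against the prefix-to-AS table."""
    addr = ipaddress.ip_address(ip)
    for net, asn in table:
        if addr in net:
            return asn
    return None

# Toy table standing in for CAIDA's pfx2as file.
table = load_prefix2as([
    "192.0.2.0\t24\t64500",
    "198.51.0.0\t16\t64501",
])
print(ip_to_asn("192.0.2.53", table))    # 64500
print(ip_to_asn("198.51.100.9", table))  # 64501
```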
Growth in AS Penetrations: We first study the "breadth" of AS-level IoT deployment by examining the number of ASes that our 23 IoT device types have penetrated. We show the overall AS penetration for our 23 IoT device types (the number of ASes where we find at least one of our 23 IoT device types) in Figure 3.2 as the blue crosses.
Figure 3.2: Overall AS Penetration for Our 23 Device Types from 2013 to 2018 (number of ASes versus DITL date)
Figure 3.3: ECDF for Device Type Density in IoT-ASes from 2013 to 2018
We find the overall AS penetration for our device types increases significantly from 2013 to 2017 (from 244 to 846 ASes, about 3.5 times) but plateaus from 2017 to 2018 (from 846 to 856 ASes).
We believe the reason that overall AS penetration for our 23 IoT device types plateaus between 2017 and 2018 is that sales and deployment decline as these models are replaced by newer releases. To support this hypothesis, we estimate release dates for our device types and compare these estimated release dates with per-device-type AS penetration (the number of ASes where each of our 23 device types is found) from 2013 to 2018 (Figure 3.5).
Figure 3.4: Detected IoT-ASes under Extended Observation at B Root (number of IoT-ASes versus observation duration in days)
We estimate release dates for 22 of our 23 device types based on estimated release dates for our 26 detectable IoT devices (recall Section 3.2.1). (We exclude the device type HP Printer here because there are many HP wireless printers released across a wide range of years, and it would be inaccurate to estimate the release date of this whole device type based on any particular HP Printer device.) If a device type includes more than one of our 26 detectable IoT devices (due to device merge), we estimate release dates for all these devices and use the earliest date for the device type. We estimate the release date for a given IoT device from one of three sources (ordered by priority, high to low): the release date found online, the device's first appearance date, and the device's first customer comment date on Amazon.com. We confirm all 22 device types were released at least two years before 2017 (2 in 2011, 7 in 2012, 3 in 2013, 5 in 2014 and 5 in 2015), consistent with our claim that their sales are declining in 2017.
We compare estimated release dates with per-device-type AS penetration results (Figure 3.5) and find that detections of device types tend to plateau after release, consistent with product cycles and a decrease in sales and use of these devices. For example, Withings SmartScale and Netatmo WeatherStation, which were released in 2012, stop growing roughly after 2016-10-04 and 2017-04-11, suggesting product cycles of about 4 and 5 years.
Figure 3.5: Per-Device Type AS Penetrations (Omitting 7 Device Types Appearing in Fewer Than 10 ASes); one panel per device type: Amazon-Echo, Amazon-FireTV, Belkin-SmartPlug, D-Link-IPCam, Foscam-IPCam, HP-Printer, LiFX-LightBulb, NEST-SmokeAlarm, Nest-IPCam, Netatmo-WeatherStation, PIX-STAR-PhotoFrame, Philips-LightBulb, Samsung-IPCam, TPLink-IPCam/Plug/LightBulb, Withings-SleepSensor, Withings-SmartScale
In comparison, TPLink-IPCam/Plug/LightBulb is the only device type released around 2016 (TPLink IPCam on 2015-12-15, TPLink Plug on 2016-01-01 and TPLink LightBulb on 2016-08-09), and its AS penetration continues to rise even on 2018-04-10, while the AS penetration of the other device types (released between 2011 and 2015) roughly stops increasing by 2017.
Note that the fact that the AS penetrations of our 23 device types plateau does not contradict the constant growth of overall IoT deployment, because new IoT devices are constantly appearing.
Growth in Device Type Density: Having shown that our 23 IoT device types penetrate about 3.5 times more ASes from 2013 to 2018, we next study how many IoT device types are found in these ASes—their device type density. We use device type density to show the "depth" of AS-level IoT deployment.
For every AS detected with at least one of our 23 IoT device types (referred to as an IoT-AS for simplicity) from 2013 to 2018, we compute its device type density. We present the empirical cumulative distribution (ECDF) of device type densities of IoT-ASes from 2013 to 2018 in Figure 3.3.
Our first observation from Figure 3.3 is that from 2013 to 2018, not only are there 3.5 times more IoT-ASes (as shown by AS penetration), but the device type density in these IoT-ASes is also constantly growing. Our second observation is that despite this constant growth, device type density in IoT-ASes is still very low as of 2018. In 2018, most (79%) of the IoT-ASes have at most 2 of our 23 device types, a modest increase compared to 2013, when a similar percentage (80%) of IoT-ASes have at most 1 of our 23 device types.
Our results suggest that for IoT devices, besides the potential to further grow in AS penetration (which would lead to growth in household penetration), there exists even larger potential to grow in device type density (which would lead to growth in device density). This unique potential for two-dimensional growth (penetration and density) sets IoT devices apart from other fast-growing electronic products in recent history, such as cell phones and personal computers (PCs), which mostly grow in penetration (considering that while a person may only own 1 to 2 cell phones and PCs, he could own many more IoT devices).
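Per-AS device type density and its ECDF can be computed directly from per-detection records. A minimal sketch is below; the (ASN, device type) records shown are made up, and only the aggregation logic reflects the density metric described above.

```python
from collections import defaultdict

def device_type_density(detections):
    """Map each ASN to the number of distinct device types detected in it."""
    types_per_as = defaultdict(set)
    for asn, device_type in detections:
        types_per_as[asn].add(device_type)
    return {asn: len(types) for asn, types in types_per_as.items()}

def ecdf(values):
    """Empirical CDF: fraction of ASes with density <= x, for each observed x."""
    xs = sorted(set(values))
    n = len(values)
    return [(x, sum(1 for v in values if v <= x) / n) for x in xs]

# Hypothetical (ASN, device type) detections from one DITL snapshot.
detections = [(64500, 'Amazon-Echo'), (64500, 'HP-Printer'),
              (64501, 'Amazon-Echo'), (64502, 'Nest-IPCam'),
              (64502, 'Amazon-Echo'), (64502, 'Withings-SmartScale')]
density = device_type_density(detections)
print(density)                        # {64500: 2, 64501: 1, 64502: 3}
print(ecdf(list(density.values())))   # [(1, 0.33...), (2, 0.66...), (3, 1.0)]
```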
We rule out the possibility that the increasing AS penetration and device type density we observe is an artifact of the device servers we use in detection (measured around 2017) not applying to IoT devices in the past, by showing that IoT device-type-to-server-name mappings are stable over time in Section 3.4.2.
ASes with Highest Device Type Density in 2018: We examined the top 10 ASes with the highest device type density in 2018 (detected with 8 to 14 of our 23 device types). Our first observation is that they are predominantly from the U.S. (4 ASes) and Europe (3 ASes). There are also 2 ASes from Eastern Asia (Korea and China) and 1 from Haiti. This distribution also shows up consistently in the top 20 ASes, with 10 ASes from the U.S. and 5 ASes from Europe. Our second observation is that these top 10 ASes are mostly major consumer ISPs in their operating regions, such as Comcast, Charter, AT&T and Verizon from the U.S., Korea Telecom from South Korea and Deutsche Telekom from Germany.
Estimating Actual Overall AS Penetration in 2018: Recall that the overall AS penetrations for our 23 device types reported in Figure 3.2 are under-estimations of the ground truth, both because our DITL data is not complete (8 of 13 root letters provide visibility to most but not all global recursives), and because two days of data will miss many queries due to DNS caching and non-IoT overshadowing. We estimate the actual overall AS penetration in 2018 by applying detection to an extended measurement at B root. With this extended measurement, we expect to observe queries from most global recursives at B root because most global recursives rotate among root letters (at least 80% [106]). We also hope to observe IoT DNS queries that would otherwise be hidden by DNS caching and non-IoT overshadowing under short observation. (Ideally, when adding more observations leads to no new detections, we know we have detected all IoT-ASes that could be visible to B root.)
To evaluate how many IoT-ASes we could see, we extend the 2-day 2018 DITL observation at B root to 112 days. As shown in Figure 3.4, we see a constant increase in detection of IoT-ASes over longer observation. With the 112-day observation, we detect 3106 IoT-ASes, 8 times more than what we see in 2 days of B root only (388 IoT-ASes), and 3.6 times more than 2 days with 8 roots (856 IoT-ASes, as in Figure 3.2). In 112 days, we see about 5% of all unique ASes in the routing system in early 2018 (about 60,000, reported by CIDR-report.org [149]).
However, we do not see the detection curve in Figure 3.4 flattening even after 112 days. We model IoT query rates from an IoT-AS as seen by a single root letter. Simple models (a root letter receives 1/13th of the traffic) show the curve flattening after at least 300 days, consistent with what we see in Figure 3.4. However, a detailed model requires understanding the IoT query rates and the aggregate (IoT and non-IoT) query rates, more information than we have. We conclude that the real number of IoT-ASes is much higher than our detections with DITL in Figure 3.2.
IoT Deployments in a Residential Neighborhood
We next explore deployments of our 23 device types in a residential neighborhood from 2013 to 2017.
Input Datasets: We use DNS datasets from the Case Connection Zone (CCZ) to study a residential neighborhood [8]. This dataset records DNS lookups made by around 100 residential houses in Cleveland, OH that connect to the CCZ Fiber-To-The-Home experimental network, and covers a random 7-day interval in each month between 2011 and 2017.
Specifically, we apply DNS-based detection (both with and without server learning) to the January captures of the 2013 to 2017 CCZ DNS data (Table 3.2).
Results without Server Learning: As shown in Figure 3.6, from 2013 to 2017, we see roughly more detections and more types of devices detected each year in this neighborhood. (Similar to Section 3.3.2, to count per-device-type detections, we do not use detection merge.)
We believe our detection counts in Figure 3.6 lower-bound the actual IoT device counts in this neighborhood for two reasons: first, unlike our study on the USC campus where dynamically assigned IPs inflate IoT detection counts (Section 3.3.1), IPs in the CCZ data are static to each house and do not cause such inflation; second, recalling that for NATted devices, our method only detects the existence of device types but cannot know the device counts for each type (Section 3.2.1), our detection counts in Figure 3.6 under-estimate IoT device counts if any household owns multiple devices of the same type. We conclude that the lower bound of the IoT device count in this neighborhood increases about 4 times from 2013 (at least 3 devices) to 2017 (at least 13 devices), consistent with our observation of increasing AS-level IoT deployment in this period.
Figure 3.6: IoT Deployments for All Houses in CCZ Data (number of detections per year, January 2013 to January 2017, for Amazon-Echo, Amazon-FireTV, HP-Printer, NEST-SmokeAlarm, Nest-IPCam, Philips-LightBulb and Withings-SmartScale)
We want to track IoT deployment by house, but we can do that for only about half the houses because (according to the author of this dataset) although IPs are almost static to each house, about half of the houses are rentals and see natural year-to-year variation from student tenants. Our detection results are consistent with this variation: most IPs with IoT detections in one year cannot be re-detected with the same set of device types in the following years.
We show that the increasing IoT deployment can also be observed from a single house by tracking one house whose tenant looks very stable (since it is detected with a consistent set of IoT device types over the 5 years). As shown in Table 3.5, this household owns none of our known device types in 2013 (omitted in the table), acquires an HP Printer in 2014, a Nest IPCam and Nest SmokeAlarm in 2015, as well as a Philips LightBulb and Withings SmartScale in 2016. The Withings SmartScale is missed in the 2017 detection, potentially because this type of device generates no background traffic and it was not used during the 7-day measurement of the 2017-01 CCZ data.
2014-01      2015-01           2016-01             2017-01
HP Printer   HP Printer        HP Printer          HP Printer
             Nest IPCam        Nest IPCam          Nest IPCam
             Nest SmokeAlarm   Nest SmokeAlarm     Nest SmokeAlarm
                               Philips LightBulb   Philips Bulb
                               Withings Scale
Table 3.5: IoT Deployment for One House in CCZ Data
Results with Server Learning: With server learning, we see no additional detections. We do observe that during our detection over 5 years of CCZ DNS data, 951 distinct server names are learned and 3 known IoT device types are split. By analyzing these new server names, we conclude that server learning can discover new sub-types of known IoT device types but risks learning wrong servers from NATted traffic.
We first show server learning can learn new device server names and even new sub-types for known IoT device types. HP Printer is originally mapped to 3 server names (per prior knowledge obtained in Section 3.3.1).
In the 2015-01 detection (others are similar), we learn 9 new server names for it in the first iteration. But with these updated 12 server names, we find 2 fewer HP Printers in subsequent detection, suggesting HP Printer is in fact an aggregation of two sub-types (just as we merge Belkin Switch and Belkin MotionSensor as one type in Section 3.2.1): one sub-type talks to the original 3 server names while the other talks to the updated 12 server names. We split HP Printer into two sub-types and re-discover the two missed HP Printers in subsequent detection.
We show our method risks learning wrong servers for a given IoT device type P behind a NAT if there are non-IoT devices behind the same NAT visiting servers run by P's manufacturer. This is caused by two limitations in our method's design: first, our method tries to learn all unknown server names queried by IoT user IPs (Section 3.2.1) because we cannot distinguish between DNS queries from detected IoT devices and DNS queries from other non-IoT devices behind the same NAT; second, we risk mis-classifying human-facing manufacturer servers (that also serve non-IoT devices) as device servers because we use a necessary condition to determine device-facing servers in Section 3.2.1. In the 2015-01 detection (others are similar), we learn a suspiciously high 176 device servers for Amazon Echo and 277 device servers for Amazon FireTV in the first iteration, suggesting many of these new servers are learned from non-IoT devices (like laptops using Amazon services) behind the same NAT as the detected Amazon devices (because IoT devices usually only talk to at most 10 servers per day [134]). This false learning poisons our knowledge of device servers and causes us to detect two fewer Amazon FireTVs and one fewer Amazon Echo in the second iteration. Luckily, our method splits Amazon Echo and Amazon FireTV into two sub-types each, where one sub-type is still mapped to the original, un-poisoned set of device servers, allowing us to re-discover these missing Amazon devices in subsequent detections. (We observe good performance in validation in Section 3.4.2, where we apply server learning inside the NAT.)
3.3.3 Certificate-Based IoT Detection Results
Certificate-based detection only applies to devices that directly provide public web pages. IP cameras and Network Video Recorders (NVRs) both often export their content, so we search for these. We find distinguishing them is hard because IP camera manufacturers often also produce NVRs, and distinguishing them requires finding the non-manufacturer keys "IP Camera" and "NVR" in TLS certificates (per the rules in Section 3.2.2). Since we find certificates rarely contain these two text strings, we do not try to distinguish them and report them together as "IPCam".
Input Datasets: We apply detection to ZMap's 443-https-ssl 3-full ipv4 TLS certificate dataset captured on 2017-07-26 [155] (as in Table 3.2). This dataset consists of certificates found by ZMap TCP SYN scans on port 443 in the public IPv4 address space.
Manufacturer     Candidate Certificates   IoT Certificates   Adding Foscam Rule
Dahua            228,080                  228,045            228,045
Hikvision        9,243                    9,169              9,169
Amcrest          5,458                    5,458              5,458
Mobotix          956                      954                954
Foscam           10,833                   290                10,814
Vivotek          95                       77                 77
Tyco Intl        60                       60                 60
Schneider        4                        4                  4
NetGear          1                        1                  1
Axis Comm        31                       0                  0
Exacq            12                       0                  0
Arecont Vision   5                        0                  0
Apexis           1                        0                  0
Table 3.6: IPCam Detection Break-Down
We target IPCam devices from 31 manufacturers (obtained from market reports [47, 48] and top Amazon sellers).
We build matching keys for these IPCams based on the rules in Section 3.2.2.
Initial Detection Results: Table 3.6 shows the 244,058 IPCam devices we detect (represented by IoT certificates, 0.46% of all 52,968,272 input TLS certificates) from 9 manufacturers (29% of the 31 input manufacturers; we see no detections from the other 22 manufacturers). Among the detected devices, most (228,045, 93.43%) come from the top manufacturer, Dahua. (Dahua is responsible for most IP cameras used in one DDoS attack [101].) Almost all (243,916, 99.94%) detected devices come from the top 5 manufacturers.
Partial Validation: Due to lack of ground truth, it is not possible to directly validate our results. We indirectly validate our results by accessing (via browser) the IPs of 50 random candidate certificates from each IPCam manufacturer where we found at least one candidate certificate. If browser access shows a login screen with the correct manufacturer name on it, we consider the detection valid. This validation is limited since even a true positive may not pass it because the device may be off-line or may not show the manufacturer when we try it. (Our validation tests were done only 3 days after TLS certificate collection, to minimize IP address changes.)
Table 3.7 shows our results, with 66% of detections correct. For the 106 false positives, in 40 cases the IP address did not respond, and in 53 cases we get a login screen showing no manufacturer information. All 33 false negatives are due to Foscam IPCams failing our two rules for finding IoT certificates in Section 3.2.2: they are signed by a CA called "WoSign" and have the uncommon C_CN placeholder *.myfoscam.org.
Devices studied         404 (100%)
Correctness             265 (66%)
Incorrectness           139 (34%)  (100%)
  False Positives       106 (26%)  (76%)   (100%)
    IP Non-Responsive    40 (10%)  (29%)   (38%)
    Login w/o Mfr Info   53 (13%)  (38%)   (50%)
  False Negatives        33 (8%)   (24%)
Table 3.7: Partial Validation of Certificate-Based Detection Results
By adding a special rule for Foscam devices (candidate certificates of Foscam that are signed by WoSign and have *.myfoscam.org as C_CN are IoT certificates), our detection correctness increases to 70% (283 out of 404, with 15 true negatives becoming false positives because we cannot confirm ground truth for the 15 newly detected Foscam IPCams), and the false negative percentage drops to 0%.
Revised Detection Results: The last row of Table 3.6 shows our revised detection results with the special rule for Foscam: with 10,524 more detected Foscam devices, we have a total of 254,582 IPCam detections.
Geo-location Analysis: We geo-locate our revised detection results with MaxMind data published on 2017-07-18 (8 days before the collection of the TLS certificate data we use) and find our detected IPCams come from 199 countries. We examine what devices are in each country to gain confidence in what we detect (a small sketch of this per-country tally follows below).
Country     Total     Dahua     Foscam   Hikvision   Amcrest   Mobotix
USA         47,690    38,139    3,666    655         5,038     143
S.Korea     22,821    22,520    84       212         4         0
India       19,244    19,029    23       186         6         0
China       17,575    15,539    288      1,748       0         0
Vietnam     14,092    13,794    113      176         9         0
France      8,006     7,059     506      372         1         62
Mexico      7,868     7,593     71       158         34        11
Poland      7,252     6,870     171      200         1         9
Argentina   6,384     6,141     154      75          13        0
Romania     5,646     5,272     139      207         2         23
Table 3.8: Detected IP Cameras and NVRs by Country
Table 3.8 shows the top ten countries by number of detected devices, and breaks down how many devices are found in each country by manufacturer. (We show only manufacturers with at least 1,000 global detections in Table 3.6.)
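The per-country tally behind Table 3.8 amounts to geolocating each detected device IP and counting by (country, manufacturer). A minimal sketch with the MaxMind geoip2 reader follows; the database path, the sample detections, and the use of the free GeoLite2 country database (rather than the commercial data used in the study) are all assumptions for illustration.

```python
from collections import Counter
import geoip2.database  # pip install geoip2; GeoLite2-Country.mmdb downloaded separately
import geoip2.errors

def tally_by_country(detections, mmdb_path='GeoLite2-Country.mmdb'):
    """Count detected devices per (country, manufacturer) pair.
    `detections` is an iterable of (ip, manufacturer) pairs."""
    counts = Counter()
    with geoip2.database.Reader(mmdb_path) as reader:
        for ip, manufacturer in detections:
            try:
                country = reader.country(ip).country.iso_code or '??'
            except geoip2.errors.AddressNotFoundError:
                country = '??'
            counts[(country, manufacturer)] += 1
    return counts

# Hypothetical detections: (device IP from an IoT certificate, manufacturer).
detections = [('198.51.100.10', 'Dahua'), ('203.0.113.5', 'Foscam')]
for (country, mfr), n in tally_by_country(detections).most_common():
    print(country, mfr, n)
```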
We find manufacturers prefer different operating regions. We believe these pref- erences are related to their business strategies. While Dahua, Foscam and Hikvision are global,the latter two show substantially more deployment in the U.S. and China, respectively. Amcrest (formerly Foscam U.S. [26]) is almost exclusive to the American market. The German company Mobotix, while is present in Europe and America, seems completely absent from Asian markets. 3.4 Validation We validate the accuracy of our two main methods by controlled experiments. Validation requires ground truth, so we turn to controlled experiments with devices we own. We have 10 devices (Table 3.1) from 7 different manufacturers and at different prices (from $5 to $85, in 2018). This diversity provides a range of test subjects, but the requirement to own the devices means our controlled experiment is limited in size. In principle, we could scale up testing by by crowd-sourcing traffic captures, as shown in [65]. Our experiments also show our method correctly detects multiple devices from same manufacturer (3 devices from Amazon and 2 from TP-Link, as in Table 3.1) using device 77 merge and detection merge (recall how we handle devices from same manufacturer shar- ing servers in Section 3.2.1). 3.4.1 Accuracy of IP-Based IoT Detection We validate the correctness and completeness of our IP-based method by controlled experiments. We set up our experiment by placing our 10 IoT devices (Table 3.1) and 15 non-IoT devices in a wireless LAN behind a home router. We assign static IPs to these 25 devices. We run tcpdump inside the wireless LAN to observe all traffic from the LAN to the Internet. We run our experiments for 5 days to simulate 3 possible cases in real-world IoT measurements. On Day 1 to 2 (inactive days), we do not interact with IoT devices at all. So first 2 days’ data simulates observations of unused devices and contains only background traffic from the devices, not user-driven traffic. On day 3 to 4 (active days), we trigger the device-specific functionality of each of the 10 devices like viewing the cameras and purchasing items with Amazon Button. The first 4 days’ data shows extended device use. On day 5, we reboot each device, looking how a restart affects device traffic. Our detection algorithm uses the same set of device server names that we describe in Section 3.3.1. We collect IPv4 addresses for these device server names (by issuing DNS queries every 10 minutes) during the same 5-day period at the same location as our controlled experiments. Detection During Inactive Days: We begin with detection using the first 2 days of data when the devices are inactive. We detect more than half of the devices (6 true positives out of 10 devices); we miss the remaining 4 devices: Amazon Button, Fos- cam IPCam, Amcrest IPCam, and Amazon Echo (4 false negative). We see no false positives. (All 15 no-IoT devices are detected as non-IoT.) This result shows that short 78 measurements will miss some inactive devices, but background traffic from even unused devices is enough to detect more than half. Detection During Inactive and Active Days: We next consider the first four days of data, including both inactive periods and active use of the devices. When observations include device interactions, we find all devices. We also see one false positive: a laptop is falsely classified as Foscam IPCam. We used the laptop to configure the device and change the device’s dynamic DNS setting. 
As part of this configuration, the laptop contacts ddns.myfoscam.org, a device-facing server name. Since the Foscam IPCam has only one device server name, this overlap is sufficient to detect the laptop as a camera. This example shows that IoT devices that use only a few device server names are liable to false positives.
Applying Detection to All Data: When we apply detection to the complete dataset, including inactivity, active use, and reboots, we see the same results as without reboots. We conclude that user device interaction is sufficient for IoT detection; we do not need to ensure observations last long enough to include reboots.
Simulating Dynamic IPs: We next show how dynamically assigned IPs can inflate IoT detections (both at USC, Section 3.3.1, and at an IXP, Section 3.3.1). We simulate dynamically assigned IPs by manually re-assigning random static IPs to our 25 devices every day during our 5-day experiment.
Our IP-based detection with this simulated 5-day dynamic-IP measurement finds 26 true-positive IoT detections from 25 dynamic IPs. One IP is detected with two IoT devices because they were each assigned to this IP on a different day. Similar to our 4-day and 5-day static-IP detections, we see a false detection of a laptop as a Foscam IPCam, and no false negatives. This experiment showed 2.6 times more IoT devices than we have, less than the 5 times inflation that would have occurred with each device being detected on a different IP each day. We conclude that dynamic addresses can inflate device counts, and the degree depends on address lease times.
3.4.2 Accuracy of DNS-Based IoT Detections
We validate the correctness and completeness of our DNS-based detection method by controlled experiments. We use the same setup, devices and device server names as in Section 3.4.1. We also validate our claim that DNS-based detection can be applied to old network measurements by showing IoT device-type-to-server-name mappings are stable over time.
We run our experiments for 7 days and trigger device-specific functionality of each of the 10 devices every day to mitigate the effect of DNS caching. We first apply detection with the complete set of device server names to evaluate the detection correctness and server learning performance of our DNS-based method. We then detect with an incomplete set of device server names to test the resilience of detection and server learning to incomplete prior knowledge of device servers.
Detection with Complete Server Names: Results show 100% correctness (10 true positives and 15 true negatives), with 13 new device server names learned and 1 known device type split. By analyzing the detection log, we show server learning and device splitting can correct an incorrect device merge. Recall from Section 3.3.1 that we merge TPLink Plug and TPLink LightBulb as one type (TPLink Plug/Bulb) per our prior knowledge that they talk to the same server name, devs.tplinkcloud.com. After the first iteration of detection, we learn a new server, deventry.tplinkcloud.com, for TPLink Plug/Bulb (from a detected TPLink LightBulb, as shown by ground truth). However, with 2 server names now mapped to TPLink Plug/Bulb, we see one less detection of it in the second iteration (ground truth shows a TPLink Plug becomes un-detected).
This reduced detection suggests TPLink LightBulb and TPLink Plug are in fact different device types: the former talks to the updated set of servers (devs.tplinkcloud.com and deventry.tplinkcloud.com) while the latter talks to the original set of servers (devs.tplinkcloud.com). We split TPLink Plug/Bulb back into two types to fix this incorrect device merge and re-discover the missed TPLink Plug in subsequent detections.
Detection with an Incomplete Set of Server Names: We detect with an incomplete set of device server names to test the resilience of detection and server learning to incomplete prior knowledge. Our goal is to simulate cases where we do not know all the servers devices contact. We can have incomplete information if we do not learn from the devices for long enough prior to detection (Section 3.2.1), or because they change servers over time (perhaps due to firmware changes).
We randomly drop 10%, 20%, up to 50% of known device-type-to-server-name mappings while ensuring each device type is still mapped to at least one server. We then compare the detection correctness and the learn-back ratio (how many dropped mappings are learned back after detection) of each experiment.
Percentage of        Detection     Mappings Learned   Learn-back
Mappings Dropped     Correctness   Back / Dropped     Ratio
0%                   100%          —                  —
10%                  100%          5/8                63%
20%                  96%           6/15               40%
30%                  96%           10/22              46%
40%                  92%           11/29              38%
50%                  96%           21/36              58%
Table 3.9: Resilience of Detection and Server Learning
Results (Table 3.9) show our detection correctness is fairly stable: with 50% of servers dropped, we still have 96% correctness. We believe two reasons cause this high correctness: our detection method suppresses false positives (by ensuring device servers are not likely to serve humans or IoT devices from other manufacturers), and the way we drop servers (ensuring each device is mapped to at least one server name) guarantees low false negatives. We also find the learn-back ratio is relatively stable, fluctuating around 50%.
To explore how false detections happen and why about half the dropped mappings cannot be learned back, we closely examine the detection and server learning with 20% (15) of mappings dropped (others are similar). This experiment has only one false detection: Belkin Plug is not detected because 2 of its 3 server names are dropped while the remaining 1 server name is not queried in the validation data. This experiment fails to learn back 9 of the 15 dropped mappings: 4 because the server names are not seen in the validation data, 2 because of the non-detection of Belkin Plug (recall we only try to learn servers from detected devices), and the remaining 3 because the server names are not considered unknown (recall we only try to learn unknown servers), since they are originally mapped to both Amazon FireTV and Amazon Echo and we only dropped them from the server list of Amazon Echo.
Stability of Device Server Names: We support our claim that DNS-based detection can be applied to old network measurements by verifying that IoT device-type-to-server-name mappings are stable over time. We show 8 of our 10 IoT devices (Table 3.1) and a newly purchased Samsung IPCam talk to an almost identical set of device server names across 1 to 1.5 years. We exclude Amazon Echo and Amazon FireTV from this experiment because they talk to a large number of device servers (previously measured 15 and 45) and it is hard to track all of them over time.
We update these 9 devices to latest firmwares on May, 2018, measure latest servers name they talk to and compare these servers name with those we used in detection (measured on Oct 2016 for 1 device, on Dec, 2016 for 6 devices and on June 2017 for 2 devices). We found these 9 devices still talk to 17 of the 18 device server names we measured from them 1 to 1.5 years ago. The only difference is D-Link IPCam who changes 1 of its 3 device server name from 82 signal.mydlink.com to signal.auto.mydlink.com. A close inspection shows signal.auto.mydlink.com is CNAME of signal.mydlink.com, suggesting although D-Link IPCam change the server names it queries (making it less detectable for our DNS-based method) , it still talk to the same set of actual servers (meaning our IP-based method is un-affected). 3.5 Related Work Prior groups considered detection of IoT devices: Heuristic-based traffic analysis: IoTScanner detects LAN-side devices by passive measurement within the LAN [132]. They intercept wireless signals such as WiFi pack- ets and identify existence of IoT devices by packets’ MAC addresses. While their work require LAN access and cannot generalize to Internet-wide detection, our three methods apply to whatever parts of the Internet that are visible in available network measure- ments, and are able to categorize device types. Work from Georgia Institute of Technology detects existence of Mirai-infected IoT devices by watching for hosts doing Mirai-style scanning (probes with TCP sequence numbers equal to destination IP addresses) [10]. Their detection reveals existence of Mirai-specific IoT devices, but does not further characterize device types. In compari- son, our three detection methods reveal both existence and type of IoT devices. Our IP and DNS-based method cover general IoT devices talking to device servers rather than just Mirai-infected devices. Work from University of Maryland detects Hajime infected IoT devices by mea- suring the public distributed hash table (DHT) that Hajime use for C&C communi- cation [61]. They characterize device types with Censys [32], but types for most of 83 their devices remain unknown. In comparison, our three detection methods detect exis- tence of known devices and always characterize their device types. Our IP and DNS- based methods cover general IoT devices talking to device servers rather than just those infected by Hajime. Machine-learning-based traffic analysis: Work from Ben-Gurion University of the Negev (BGUN) detect IoT devices from LAN-side measurement by identifying their traffic flow statistics with machine learning (ML) models such as random forest and GBM [94, 95]. They use a wide range of features (over 300) extracted from network, transport and application layers, such as number of bytes and number of HTTP GET requests. Similarly, work from the University of New South Wales (UNSW) characterizes the traffic statistics of 21 IoT devices such as packet rates and average packet sizes and briefly discusses detecting these devices from LAN-side by identifying their traffic statistics with ML model (random forest) [134]. Comparing to work from BGUN from UNSW, our work uses different features: packet exchanges with particular device servers and TLS certificate for IoT remote access rather than traffic statistics or traffic flow features. 
While they use LAN-side measurement where traffic from each device can be separated by IP or MAC addresses, our IP-based and DNS-based methods can work with aggregated traffic from outside the NAT and cover IoT devices both on public Internet and behind NAT. Not requir- ing LAN-side measurement also enables our IP-based and DNS-based methods to do Internet-wide detection. Our certificate-based method covers HTTPS-Accessible IoT devices on public Internet by crawling TLS certificates in IPv4 space. Work from IBM transforms DNS names into embeddings, the numeric represen- tations that capture the semantics of DNS names, and classify devices as either IoT or 84 non-IoT based on embeddings of their DNS queries using ML model (multilayer percep- tron) [79]. In comparison, our three methods not only detect existence of IoT devices, but also categorize their device types. While they rely on LAN-side measurement to aggregate DNS queries by device IPs, our three methods do not require measuring from inside the LAN. IPv4 scanners: Shodan is a search engine that provides information (mainly ser- vice banners, the textual information describing services on a device, like certificates from HTTPS TLS Service) about Internet-connected devices on public IP (including IoT devices) [130]. Shodan actively crawls all IPv4 addresses on a small set of ports to detect devices by matching texts (like “IP camera”) with service banners and other device-specific information. Censys is similar to Shodan but they also support community maintained annotation logic that annotate manufacturer and model of Internet-connected devices by matching texts with banner information [32]. Compared to Shodan and Censys, our IP-based and DNS-based methods cover IoT devices using both public and private IP addresses, because we use passive measure- ments to look for signals that work with devices behind NATs. These two methods thus cover all IoT devices that exchanges packets with device servers during operation. Our certificate-based method, while also relying on TLS certificates crawled from IPv4 space, provides a better algorithm to match TLS certificates with IoT related text strings (with multiple techniques to improve matching accuracy) and ensures matched certifi- cates come from HTTPS servers running in IoT devices. Work from Concordia University infers compromised IoT devices by identifying the fraction of IoT devices detected by Shodan that send packets to allocated but un-used IPs monitored by CAIDA [140]. Their focus on compromised IoT devices is different from our focus on general IoT devices. Due to their reliance on Shodan data, they cover 85 devices with public IP while our IP-based and DNS-based method cover devices on both public and private IP. We also report IoT deployment growth over a much longer period (6 years) than they do (6 days). Northeastern University infers devices hosting invalid certificates (including IoT devices) by manually looking up model numbers in certificates and inspecting web pages hosted on certificates’ IP addresses [20]. In comparison, our certificate-based method introduces an algorithm to map certificates to IoT devices and does not fully rely on manual inspection. Work from University of Michigan detects industrial control systems (ICS) by scanning the IPv4 space with ICS-specific protocols and watching for positive responses [97]. Unlike from their focus on ICS-protocol-compliant devices and pro- tocols, our approaches considers general IoT devices. 
Our approach also uses different measurements and signals for detection.
3.6 Conclusion
Understanding the security threats of IoT devices requires knowledge of their location, distribution and growth. To help provide this knowledge, we propose two methods that detect general IoT devices from passive network measurements (IPs in network flows and stub-to-recursive DNS queries) with knowledge of their device servers. We also propose a third method to detect HTTPS-Accessible IoT devices from their TLS certificates. We apply our methods to multiple real-world network measurements. Our IP-based algorithm reports detections from a university campus over 4 months and from traffic transiting an IXP over 10 days. Our DNS-based algorithm finds about 3.5 times growth in AS penetration for 23 device types from 2013 to 2018 and a modest increase in device type density in the ASes detected with these device types. Our DNS-based method also confirms substantial growth in IoT deployments at the household level in a residential neighborhood. Our certificate-based algorithm finds 254K IP cameras and NVRs from 199 countries around the world.
This study supports our thesis statement by showing that a new signature of traffic based on observed identities of end-points enables detecting general IoT devices, a new class of network devices. Specifically, we detect the existence of IoT devices by identifying devices talking to end-points run by IoT manufacturers. We also characterize detected IoT devices by inferring their device types from the combinations of end-points they talk to. By applying our new signature to multiple real-world network traces, we report detections from a university campus over 4 months (Section 3.3.1) and from traffic transiting an IXP over 10 days (Section 3.3.1). We show that AS penetration for 23 types of IoT devices has grown substantially (about 3.5 times) from 2013 to 2018 but the device type density in ASes detected with these device types increases only modestly (Section 3.3.2). We also show substantial IoT deployment growth at the household level from 2013 to 2017 (Section 3.3.2). Thus, our second study of general IoT device detection shows another example in support of our thesis statement.
Chapter 4
Compromised IoT Detection and DDoS Mitigation
Having shown that a signature based on observed identities of traffic end-points can detect general IoT devices, we next apply this signature to detect compromised IoT devices. Specifically, we detect compromised IoT devices participating in DDoS attacks by identifying IoT devices talking to end-points other than a list of known benign end-points. Detection of these compromised IoT devices also enables us to filter DDoS traffic between them and potential DDoS victims. We show our method maintains a low false-positive rate in flagging suspicious remote endpoints (2%) and filtering DDoS packets (0.45%) when validated with replay of benign IoT traffic captures. We show our method mitigates all except two types of attacks tested, regardless of the attacks' types (such as TCP SYN flooding and DNS query flooding) and traffic characteristics (such as packet rates), when validated with replay of real-world DDoS traffic.
This study of compromised IoT detection and DDoS mitigation demonstrates our thesis statement as follows. We develop a new signature of traffic based on observed identities of end-points: IoT devices talking to end-points other than known benign IoT end-points are compromised.
This new signature enables us to detect a new class of network devices: compromised IoT devices. Our detection of compromised IoT devices is essentially a characterization: we characterize IoT devices under monitoring as either benign or malicious depending on whether they talk to suspicious end-points. Our detection is robust to both the type of malicious traffic (such as TCP SYN flooding or DNS query flooding) and the flow characteristics of malicious traffic (such as packet rates).
4.1 Introduction
There is increasing concern about the security threats that Internet-of-Things (IoT) devices, such as Internet-enabled light bulbs and cameras, raise for the Internet ecosystem. The massive number of IoT devices, together with their often inadequate security [24, 25, 124] and even unpatchability [127], makes them attractive targets for compromise. One flagrant example is that compromised IoT devices (also known as "bots") can be used to mount large-scale Distributed Denial-of-Service (DDoS) attacks and significantly damage Internet security. In 2016, over 100k IoT devices, compromised by the IoT malware Mirai [86], launched a series of record-breaking DDoS attacks, including a 620 Gb/s attack against krebsonsecurity.com (2016-09-20) [76] and 1 Tb/s attacks against cloud provider OVH (2016-09-23) [114] and DNS provider Dyn (2016-10-21) [36].
A naive way to defend against IoT-based DDoS attacks is to make all IoT devices secure. However, IoT manufacturers are not incentivized to produce more secure and likely more expensive products, because they may see fewer sales from price-sensitive customers [147, 151].
Another option is to mitigate IoT-based DDoS attacks at the victims, as advocated by much prior work on defending against traditional DDoS attacks [5, 13, 19, 64, 71–74, 84, 107, 117, 120–122, 145, 146, 150, 152]. However, due to the large number of IoT devices (5.8 billion in 2020 [40]) and the resulting high volume of IoT-based DDoS traffic (up to 1 Tb/s for one attack as of 2016), this option could be extremely costly in practice. (For example, Akamai estimated a defense cost of millions of dollars [15].) As a result, defending against an attack with capacity at the victim is possible only for the largest operators today.
In this chapter, we advocate the third option: defending against IoT-based DDoS attacks by mitigating at the bot side. The main advantage of bot-side defense is that the attack traffic volume at the bot side is much smaller than that at the victim side. As a result, bot-side defense is much less costly and is more likely to cope with future growth in attack volume.
Our first contribution is to propose IoTSTEED (IoT bot-Side Traffic-Endpoint-basEd Defense), a system that runs in edge routers and defends against IoT-based DDoS attacks by mitigating at the bots' access networks (Section 4.2). IoTSTEED watches traffic that leaves and enters the home network, detecting IoT devices at home (Section 4.2.1), learning the benign servers they talk to (Section 4.2.2), and filtering their traffic to other servers as a potential DDoS attack (Section 4.2.3).
Our second contribution is to validate IoTSTEED's correctness with replay of off-line traffic captures (Section 4.3). We validate IoTSTEED's accuracy in device detection and false positives (FP) in server learning and traffic filtering with replay of a 10-day benign traffic capture from an IoT access network (Section 4.3.1).
We show IoTSTEED correctly detects all 14 IoT and 6 non-IoT devices in this network (100% accuracy) and maintains low false-positive rates when learning the servers IoT devices talk to (flagging 2% of benign servers as suspicious) and filtering IoT traffic (dropping only 0.45% of benign packets). We validate IoTSTEED's true positives (TP) and false negatives (FN) in filtering attacks with replay of real-world DDoS traffic (Section 4.3.2). Our experiments show IoTSTEED could mitigate all except four types of attacks (as described in Section 4.2.5) regardless of the attacks' traffic types, attacking devices and victims; an intelligent adversary can design attacks to avoid detection in a few cases, but at the cost of a weaker attack.

Our third contribution is to deploy IoTSTEED in the NAT router of an IoT access network for 10 days (Section 4.4). We show IoTSTEED runs well on a commodity router: memory usage is small (4% of 512MB) and the router forwards traffic at full uplink rates. We confirm IoTSTEED's accuracy in device detection and FP, TP and FN in server learning and traffic filtering during on-line router deployment are similar to what we report in off-line trace-replay validation. (We make the source code of IoTSTEED and the 10-day benign IoT traffic capture we use in the validation experiment public at [50, 52].)

4.2 Methodology

IoTSTEED follows the observation that IoT devices usually talk to a small number of benign servers (from our prior work [55, 57]). By whitelisting these benign servers, it can mitigate suspicious IoT traffic to all other servers. IoTSTEED examines packets entering and leaving an IoT access network from its edge router, detecting IoT devices in this network (Section 4.2.1), learning benign servers these IoT devices talk to (Section 4.2.2) and filtering suspicious IoT traffic to and from other servers (Section 4.2.3).

IoTSTEED thus focuses on single-purpose IoT devices, such as smart plugs and cameras, that talk to a small number of server names. IoTSTEED does not work with multi-purpose IoT devices, such as smart TVs, that could talk to hundreds of server names by installing new applications. IoTSTEED currently handles IPv4 traffic since our test home networks have been v4-only (Section 4.3 and Section 4.4); adding IPv6 support should be straightforward. We believe that our results of defending against IPv4 attacks (Section 4.3 and Section 4.4) prove the effectiveness of our system and we leave defending against IPv6 attacks as future work.

Manufacturer | Device Type (Model)           | Alias
Amcrest      | IP Camera (IP2M-841)          | Amcrest Cam
Belkin       | Smart Plug (Wemo Mini)        | Belkin Plug
Dyson        | Air Purifier (Pure Cool Link) | Dyson Purifier
D-Link       | IP Camera (DCS-934L)          | D-Link Cam
Foscam       | IP Camera (FI8910W)           | Foscam Cam
Foscam       | IP Camera (R2C)               | Foscam Cam2
HP           | Wireless Printer (Envy 4500)  | HP Printer
Samsung      | IP Camera (SNH-P6410BN)       | Samsung Cam
Philips      | Light Bulb (Hue A19 Kit)      | Philips Bulb
TP-Link      | Smart Plug (HS100)            | TPLink Plug
TP-Link      | Light Bulb (LB110)            | TPLink Bulb
Tenvis       | IP Camera (WH-TH661)          | Tenvis Cam
Wyze         | IP Camera (WYZEC2)            | Wyze Cam
Wansview     | IP Camera (633GBU)            | Wansview Cam
Table 4.1: 14 IoT Devices We Own

4.2.1 Device Detection

IoTSTEED first detects IoT devices in the access network where it runs. We later learn these IoT devices' benign servers (Section 4.2.2) and filter IoT traffic to other, suspicious servers (Section 4.2.3).

Overview

Our detection follows the observation that many IoT manufacturers only produce IoT devices.
Therefore we can detect IoT devices by mapping their MAC addresses to their manufacturer names and identifying known IoT manufacturer names (Section 4.2.1). For manufacturers that produce both IoT and non-IoT devices (such as Samsung, which makes IP cameras and smart phones), IoTSTEED risks misclassifying their non-IoT devices as IoT. We correct these false-positive detections by watching for detected IoT devices that talk to an excessive number of server names. The rationale is that we find non-IoT devices usually talk to more server names than IoT devices do (Figure 4.2).

[Figure 4.1: Part of the directed graph storing our knowledge of IoT manufacturers (dark circles) and collaborators (light circles); vertices shown include Dahua, August, Xiaomi, Amcrest, Null, Ampak, Roborock, Yeelight, Yeelink and Insteon.]

Collect IoT Manufacturer Names

To detect IoT devices by comparing their MAC-inferred manufacturers with known IoT manufacturers, we collect a list of IoT manufacturer names. However, knowledge of IoT manufacturers is not enough. Some IoT MAC addresses (about one third of the 185 we examine in the next paragraph and about one third of the 522 that IoT Inspector examines in [66]) get mapped to third-party organizations such as parts makers and original equipment manufacturers (OEMs) that collaborate with the actual IoT manufacturers. We call these "manufacturer-collaborators" or simply "collaborators", and collect collaborators for each known IoT manufacturer. When an IoT MAC address gets mapped to a collaborator, we can narrow down the device's potential manufacturers to a list of manufacturers related to this collaborator.

We first find a list of IoT manufacturers by collecting IoT MAC addresses and their ground truth manufacturer names. We collect 185 IoT MAC addresses from 67 IoT manufacturers based on devices we own (Table 4.1), public IoT traffic captures [9, 58, 133], and Google image searches (for example, we search "smart plug MAC address" for MAC addresses printed on the bottom of smart plugs). (We ensure the first three octets of our 185 IoT MAC addresses, which uniquely identify vendors, are all distinct.) We then find collaborators for these 67 IoT manufacturers by looking up these 185 MAC addresses with a MAC-to-vendor mapping library [89] and identifying lookup results different from the ground truth manufacturer names as collaborators. We show 67 of these 185 MAC addresses (36%) get mapped to collaborators. As a result, we obtain 67 distinct IoT manufacturer and 45 distinct collaborator names.

To expedite finding potential manufacturers for IoT devices whose MAC addresses get mapped to collaborators, we store known IoT manufacturer and collaborator names in a directed graph. We store each manufacturer and its collaborators as vertices in this graph (known as manufacturer and collaborator vertices respectively) and connect them with edges pointing to each manufacturer vertex from its collaborator vertices. The resulting graph (part of which is shown in Figure 4.1) allows us to identify all manufacturers related to a collaborator by identifying all manufacturer vertices reachable from this collaborator's vertex.

We handle two edge cases in building this directed graph. For IoT manufacturers that are also collaborators for other manufacturers (such as the IP camera maker Dahua, which also OEMs for other IP camera makers like Amcrest [69]), we label them as IoT manufacturers in our graph (see the "Dahua" manufacturer vertex in Figure 4.1).
We still know these manufacturers are also collaborators from our graph because their vertices point to other manufacturer vertices. For IoT manufacturers whose MAC addresses cannot be mapped to any organization, or that use private MAC addresses in some devices, we add a special collaborator ("null" or "private") to these manufacturers (as shown in Figure 4.1).

Our IoT detection risks being incomplete because our knowledge of manufacturers and their collaborators is limited. In principle, we could scale up by crowd-sourcing IoT MAC addresses and ground truth manufacturer names, as shown in [66].

Detect IoT Devices by MAC Lookup

We next detect IoT devices by identifying devices whose MAC lookup results ([89]) match certain vertices in our directed graph. When finding such a matching vertex (indicating a new IoT device), we determine this device's potential manufacturers by finding all manufacturer vertices reachable from this matching vertex.

IoTSTEED examines source and destination MAC addresses for every packet entering and leaving the IoT access network where it runs. When finding a new MAC address, indicating a new device, IoTSTEED looks up this MAC address with the MAC-to-vendor mapping library [89] and classifies this new device as IoT if its lookup result matches any vertex in our graph. (IoTSTEED otherwise considers this device non-IoT.) IoTSTEED considers a vertex to match a lookup result if this vertex's manufacturer (or collaborator) name is a substring of the lookup result (regardless of case) and every word in the vertex's name shows up in the lookup result. When detecting a new IoT device (that is, its lookup result matches some vertex in our graph), IoTSTEED infers this device's manufacturer (or a list of potential manufacturers) by finding all manufacturer vertices (directly or indirectly) reachable from this matching vertex. For example, if an IoT device's MAC address gets mapped to the parts maker "Ampak" in Figure 4.1, its potential manufacturers include "August" and "Xiaomi", which use parts from Ampak, and "Roborock" and "Yeelight", which partner with Xiaomi.

Correct Potential Incorrect Detections

We constantly monitor whether any detected IoT device talks to an excessive number of server names, a sign of incorrect IoT detection. If any IoT device DNS-queries more than T_svr distinct server names, we re-classify it as non-IoT to correct potential false positives (FP) caused by IoT manufacturers that also produce non-IoT devices. We set T_svr as 70 based on examining 10-day operational traffic from 60 IoT devices and 6 non-IoT devices (we measure our 14 IoT and 6 non-IoT devices, as in Table 4.1, and use public traffic pcaps for the remaining 46 IoT devices [9, 58, 133]). As in Figure 4.2, we find these 60 IoT devices each query at most 15 distinct server names (on average 5) while these 6 non-IoT devices each query at least 128 distinct server names (on average 451) in 10 days.

[Figure 4.2: ECDF of accumulated distinct server names queried by 60 IoT and 6 non-IoT devices in 10 days.]
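To make this detection logic concrete, the following is a minimal, illustrative sketch of the manufacturer/collaborator graph and the lookup it supports; it is not the deployed implementation. The graph contents are a tiny subset of Figure 4.1, the vendor string is assumed to come from a MAC-to-vendor lookup such as [89], and the simplified matching rule checks only the substring condition described above.

    # Illustrative sketch of MAC-based IoT detection (Section 4.2.1).
    # Assumptions: the graph below is a small subset of Figure 4.1, and the
    # vendor string is whatever a MAC-to-vendor lookup library returns.
    MANUFACTURERS = {"august", "xiaomi", "roborock", "yeelight",
                     "dahua", "amcrest", "insteon"}
    # Directed edges point from a collaborator (or OEM) vertex to the
    # manufacturer vertices related to it.
    EDGES = {
        "ampak":  {"august", "xiaomi"},
        "xiaomi": {"roborock", "yeelight"},
        "dahua":  {"amcrest"},
        "null":   {"insteon"},
    }

    def matches(vertex, lookup_result):
        """A vertex matches if its name appears, case-insensitively,
        in the vendor string returned by the MAC lookup."""
        return vertex in lookup_result.lower()

    def reachable_manufacturers(vertex):
        """All manufacturer vertices reachable from a matching vertex."""
        seen, stack, found = set(), [vertex], set()
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            if v in MANUFACTURERS:
                found.add(v)
            stack.extend(EDGES.get(v, ()))
        return found

    def classify_device(mac_lookup_result):
        """Return (is_iot, potential_manufacturers) for a newly seen MAC."""
        for vertex in MANUFACTURERS | set(EDGES):
            if matches(vertex, mac_lookup_result):
                return True, reachable_manufacturers(vertex)
        return False, set()

    # A device whose MAC maps to the parts maker "Ampak" is classified as
    # IoT, with August, Xiaomi, Roborock and Yeelight as potential makers.
    print(classify_device("AMPAK Technology, Inc."))

The T_svr correction described above would be applied on top of this classification: a device initially labeled IoT is re-labeled non-IoT once it is seen querying more than T_svr distinct server names.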
4.2.2 Server Learning

We next learn the benign remote endpoints detected IoT devices talk to (IoT servers). Knowing IoT servers enables us to mitigate suspicious traffic to all other endpoints (non-IoT servers) in Section 4.2.3.

Overview

We learn IoT servers from all remote endpoints detected IoT devices talk to in two rounds. (We maintain a separate IoT server list for each IoT device.) We also whitelist a short list of server IPs that we always consider benign: Google's public DNS resolvers (8.8.8.8 and 8.8.4.4), which are often visited by IoT devices, the public IPs assigned to the NAT router where IoTSTEED runs (since IoT devices sometimes talk to their NAT router's public IPs) and the public IPs of the mobile phone used to remotely access IoT devices.

Our first-round learning bootstraps a list of IoT servers for every IoT device by including all endpoints they DNS-query or directly talk to (without querying) shortly after bootup (Section 4.2.2). Our second-round learning expands these IoT server lists by including those endpoints visited after the first round that resemble common IoT servers (Section 4.2.2).

Server Identification

To learn IoT servers, we identify servers by either their DNS names or IP addresses. We identify servers mainly by their DNS names because we find server names to be relatively stable over time while server IPs can change. We find some devices can visit servers directly by IP without preceding DNS queries (such as Google's public DNS resolver 8.8.8.8) and identify these servers by their IPs (called "IP-accessed servers" hereafter). (We call servers visited with preceding DNS queries "name-accessed servers".)

Since we mainly identify servers by DNS names but IoTSTEED sees server IPs in traffic, we track server name-to-IP mappings based on the DNS traffic observed from IoT devices. We track the list of server names each IoT device talks to based on the server names they query using type A, AAAA and CNAME DNS requests. We then extract server IPs and canonical names for these server names from the corresponding type A and CNAME DNS replies. (We do not track AAAA DNS replies because currently IoTSTEED ignores non-DNS IPv6 traffic.)

First-Round Learning: Server Bootstrapping

IoTSTEED bootstraps a list of IoT servers for every IoT device by including all remote endpoints they DNS-query (for name-accessed servers) or directly talk to without querying (for IP-accessed servers) shortly after the most recent bootup. The rationale is that we trust recently booted devices to be uncompromised and only talking to benign endpoints, because IoT malware usually does not survive a device reboot [11, 42, 44, 96] and re-infections take time (considering that much malware randomly scans for infection [11, 43, 96]). For a newly-detected IoT device D, IoTSTEED first estimates its most recent bootup time (T_bt^D) with the timestamp of the first observed packet from D and then classifies all servers that D DNS-queries or directly visits between [T_bt^D, T_bt^D + T_sp) as benign, where T_sp is the duration of server bootstrapping. Our estimation of T_bt^D holds intuitively if D is first booted up after IoTSTEED starts. If D is already running, we require the owner to reboot it before starting IoTSTEED so that we correctly estimate T_bt^D.

We experimentally set T_sp to two hours for name-accessed IoT servers (annotated as T_sp^abn). By observing our 14 IoT devices for 10 days after bootup, we find that they talk to most (75% or 67) of their 89 name-accessed IoT servers (blue bars in Figure 4.3) within the first two hours. We experimentally set T_sp to 120 hours for IP-accessed IoT servers (annotated as T_sp^abi).
By observing our 14 IoT devices for 10 days after bootup, we find 10 devices (all except Foscam Cam, Amcrest Cam, Belkin Plug and D-Link Cam) talk to most (91% or 30) of their 33 IP-accessed IoT servers (gray bars in Figure 4.3) in the first 120 hours (green area in Figure 4.4). For the remaining four IoT devices (Foscam Cam, Amcrest Cam, Belkin Plug and D-Link Cam) that keep talking to new IP-accessed servers even after bootstrapping (see Figure 4.4), our server learning cannot handle them: the bootstrapping-based first round only covers part of their IP-accessed servers, and the server-name-based second round does not apply to IP-accessed servers. We choose not to filter traffic between these devices and their IP-accessed IoT servers (called "turning off IP-accessing filtering") to reduce potential FP in traffic filtering. (See Section 4.2.3 for details.) (We use separate 10-day measurements to select T_sp and to validate IoTSTEED later in Section 4.3.)

[Figure 4.3: Distinct name- and IP-accessed IoT servers each of our devices visits within 10 days of bootup (Amcrest Cam, D-Link Cam and Foscam Cam visit 117, 328 and 141 servers but are cropped for display).]

We show that the lack of bootstrapping in talking to IP-accessed IoT servers we observe is mainly an artifact of the UPnP service in our router. We find three of the four devices without bootstrapping behavior (Foscam Cam, Amcrest Cam, and D-Link Cam) set up static port mappings on our router via UPnP. (We confirm Foscam Cam uses UPnP for remote device access but are not certain about the other devices.) As a side effect, they get unsolicited packets from a large number of remote IPs (574 or 98% of their 586 IP-accessed IoT servers), such as scanners from internet-census.org and shodan.io. By responding to these unsolicited packets, these three devices appear to be constantly talking to new IP-accessed servers. We support our hypothesis that UPnP causes the lack of bootstrapping by showing that these three devices show bootstrapping in talking to IP-accessed servers in a similar 10-day experiment without UPnP (Section 4.4). For the remaining device (Belkin Plug) that does not use UPnP, its 22 IP-accessed IoT servers are mostly STUN servers for NAT traversal (73% or 16). One explanation for its lack of bootstrapping is that Belkin Plug keeps trying to connect to different STUN server IPs for NAT relay services.

UPnP service also explains the large number of IP-accessed IoT servers (641, as in Figure 4.3) our IoT devices talk to. The three devices (Foscam Cam, Amcrest Cam and D-Link Cam) that contribute almost all (91%, 586) of these 641 IP-accessed IoT servers all set up static port mappings via UPnP. We have shown their 586 IP-accessed IoT servers are mainly an artifact of their responding to unsolicited probes from remote IPs (98%, 574). To support our hypothesis that UPnP service inflates the IP-accessed server count, we show our 14 devices only talk to 69 IP-accessed IoT servers (about 9x fewer than the 641 servers with UPnP) in 10 days without UPnP (Section 4.4).

While we used public IoT traces [9, 58, 133] for collecting IoT manufacturer names (Section 4.2.1) and setting T_svr values (Section 4.2.1), we cannot do that here because these captures do not contain the device bootup traffic that we need to set T_sp values.
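The first-round (bootstrapping) learning described above can be summarized in a short sketch. This is an illustration under the stated assumptions, not the deployed code; the class and variable names are ours.

    # Simplified sketch of first-round server learning: servers a device
    # contacts within T_sp of its estimated bootup time T_bt are learned
    # as benign. (Illustrative only.)
    T_SP_NAME = 2 * 3600      # T_sp^abn: 2 hours for name-accessed servers
    T_SP_IP = 120 * 3600      # T_sp^abi: 120 hours for IP-accessed servers
    WHITELIST = {"8.8.8.8", "8.8.4.4"}   # always-benign IPs (Section 4.2.2)

    class FirstRoundLearner:
        def __init__(self):
            self.boot_time = {}  # device MAC -> estimated bootup time T_bt
            self.benign = {}     # device MAC -> set of learned benign servers

        def observe(self, dev_mac, timestamp, server, name_accessed):
            """Call for each server a device DNS-queries (name-accessed)
            or contacts directly by IP (IP-accessed)."""
            # T_bt is estimated as the timestamp of the first packet seen
            # from this device (owners reboot running devices at startup).
            t_bt = self.boot_time.setdefault(dev_mac, timestamp)
            window = T_SP_NAME if name_accessed else T_SP_IP
            if timestamp < t_bt + window:
                self.benign.setdefault(dev_mac, set()).add(server)

        def is_benign(self, dev_mac, server):
            return (server in WHITELIST or
                    server in self.benign.get(dev_mac, set()))

Endpoints first seen after this window are handled by the second-round rules described next.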
[Figure 4.4: Distinct IP-accessed IoT servers each device visits per hour within 10 days of bootup, one panel per device (Foscam_Cam, AMCREST_Cam, Belkin_Plug, D_Link_Cam, Samsung_Cam, TENVIS_Cam, WANSVIEW_Cam, Wyze_Cam; six devices talking to no more than one server are omitted). The green area highlights the 120-hour T_sp^abi.]

Second-Round Learning: Server Expansion

After the server bootstrapping period, we only consider a server benign if its DNS domain resembles one of three classes of common IoT servers, judged by the per-class rules below.

Manufacturer servers are servers run by IoT manufacturers to implement core IoT functions such as remote control and device monitoring. Manufacturer servers can usually be identified by their manufacturer-owned DNS domain. (We already know at least a list of potential manufacturer names for each detected IoT device from Section 4.2.1.) We consider a server name N a manufacturer server for IoT device D if any of D's potential manufacturer names is a substring of N's domain (regardless of case). We define the domain of a URL as the immediate left neighbor of the URL's public suffix. (We identify the public suffix based on the public suffix list from the Mozilla Foundation [105].)

Third-party servers are servers run by non-IoT-manufacturers that provide services such as time (NTP) services and news services to IoT devices. We find it challenging to identify third-party servers because they could be specific to device types, which we do not know. We thus only identify two groups of third-party servers. The first group of servers provides NTP (time) services, which we find common for IoT devices to talk to. We consider a server name an NTP server if it uses well-known NTP server domains (nist.gov and ntp.org) or has the string "time" or "ntp" in its sub-domain. The second group of servers are those run by the same organizations as some of our bootstrapped third-party servers (determined by identical server domain).

Platform servers are special third-party servers that allow manufacturers to implement core IoT functions without setting up their own servers. Platform servers can be identified by their platform-specific domain names. We currently look for eight domains from five IoT platforms, as summarized in Table 4.2.

Platform  | Domains
Tuya      | tuyacn, tuyaus, tuyaeu
Evrythng  | evrythng
PubNub    | pubnub, pndsn
Xively    | xively
Azure IoT | azure-devices
Table 4.2: Eight Domains from Five IoT Platforms

Our second-round learning does not apply to IP-accessed servers due to their lack of DNS names. For these servers, if they fail first-round learning, we consider them malicious unless they are visited by devices whose IP-accessing filtering gets turned off in Section 4.2.3. To cover more IP-accessed servers, we keep monitoring whether any LAN device queries a server name that gets resolved to an IP-accessed server IP: if found, we assign this server name to the IP-accessed server and re-learn it with both rounds of learning as if it were a name-accessed server.
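A rough sketch of these per-class rules follows; it is illustrative only. We use the tldextract package here as a stand-in for public-suffix-based domain extraction (the thesis uses Mozilla's public suffix list [105] directly), and the platform domains come from Table 4.2.

    # Rough sketch of the second-round (post-bootstrapping) rules.
    # Assumption: tldextract stands in for public-suffix-based extraction.
    import tldextract

    PLATFORM_DOMAINS = {"tuyacn", "tuyaus", "tuyaeu", "evrythng",
                        "pubnub", "pndsn", "xively", "azure-devices"}
    NTP_SITES = {"nist.gov", "ntp.org"}

    def second_round_benign(server_name, potential_manufacturers,
                            bootstrapped_domains):
        ext = tldextract.extract(server_name)
        domain = ext.domain.lower()       # left neighbor of public suffix
        suffix = ext.suffix.lower()
        sub = ext.subdomain.lower()
        # 1. Manufacturer servers: a potential manufacturer name is a
        #    substring of the domain (case-insensitive).
        if any(m.lower() in domain for m in potential_manufacturers):
            return True
        # 2a. Third-party NTP servers: well-known NTP domains, or
        #     "time"/"ntp" in the sub-domain.
        if f"{domain}.{suffix}" in NTP_SITES or "time" in sub or "ntp" in sub:
            return True
        # 2b. Third-party servers run by the same organization as a server
        #     learned during bootstrapping (same registered domain).
        if domain in bootstrapped_domains:
            return True
        # 3. Platform servers (Table 4.2).
        if domain in PLATFORM_DOMAINS:
            return True
        return False

For example, a name like "time.example.com" would pass rule 2a and a name under a tuyaus domain would pass rule 3, while a name matching none of the rules is flagged as suspicious.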
4.2.3 Traffic Filtering

We defend against DDoS attacks by allowing traffic between IoT devices and the benign servers we learn in Section 4.2.2 and filtering IoT traffic to and from all other servers. Upon dropping traffic from some IoT device, we notify the device owner about the potential device compromise through our user interface. We also suggest that device owners reboot (or factory reset) their devices (to clean up potential malware), change device login passwords and update device firmware (to prevent re-infection).

For devices that show no bootstrapping behavior and instead keep talking to new IP-accessed servers, we turn off their IP-accessing filtering by passing all traffic between them and their IP-accessed servers. By doing so, we avoid dropping benign traffic to IP-accessed IoT servers that we fail to learn (FP) but risk allowing these devices to attack IP-accessed servers (FN).

To detect devices without bootstrapping behavior and turn off their IP-accessing filtering, we count the new IP-accessed servers every IoT device talks to and look for devices that talk to new IP-accessed servers at a roughly constant rate (servers per second) both before and after the server bootstrapping period. We use A(x) to denote the number of distinct IP-accessed servers a given IoT device D talks to during the period [T_bt^D, T_bt^D + x) (excluding the whitelisted server IPs in Section 4.2.2). We consider D to lack bootstrapping behavior if, at t seconds after bootup, the average rate at which D talks to new IP-accessed servers after the server bootstrapping period (R_t in Equation 4.1) is larger than a threshold. We set this threshold as a fraction (r in Equation 4.1, empirically set to 50%) of the average rate at which D talks to new IP-accessed servers during server bootstrapping (R_sp in Equation 4.1). We do not turn off IP-accessing filtering for devices that do not normally visit IP-accessed servers (judged by A(T_sp^abi) < 3, where 3 is empirical) because it is suspicious if they suddenly talk to some servers by IP.

    R_t = (A(t) - A(T_sp^abi)) / (t - T_sp^abi),
    R_sp = A(T_sp^abi) / T_sp^abi,
    and we flag D as lacking bootstrapping when R_t > R_sp * r.    (4.1)

To prevent a compromised device D from evading our defense by intentionally probing many server IPs and causing us to turn off its IP-accessing filtering, we can turn IP-accessing filtering back on when D talks to too many new IP-accessed servers per second after bootstrapping (for example, when R_t > 10 * R_sp).
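The bootstrapping-rate test of Equation 4.1 and the exemptions around it can be sketched as follows. This is an illustration only; the constants are the empirical values given above, and A is assumed to be a callable that returns the distinct IP-accessed server count seen within a given number of seconds after bootup.

    # Sketch of the Equation 4.1 test that decides whether a device lacks
    # bootstrapping behavior (and so gets its IP-accessing filtering
    # turned off). Illustrative only.
    T_SP_IP = 120 * 3600   # T_sp^abi in seconds
    R_FRACTION = 0.5       # r, empirically 50%
    MIN_SERVERS = 3        # devices with A(T_sp^abi) < 3 are never exempted

    def lacks_bootstrapping(A, t):
        """A(x) counts distinct IP-accessed servers within x seconds of
        bootup (whitelisted IPs excluded); t is seconds since bootup,
        with t > T_sp^abi."""
        if A(T_SP_IP) < MIN_SERVERS:
            return False
        r_t = (A(t) - A(T_SP_IP)) / (t - T_SP_IP)   # rate after bootstrapping
        r_sp = A(T_SP_IP) / T_SP_IP                 # rate during bootstrapping
        return r_t > r_sp * R_FRACTION              # Equation 4.1

    def reenable_filtering(A, t):
        """Turn IP-accessing filtering back on if the post-bootstrapping
        rate is implausibly high (e.g., R_t > 10 * R_sp), to resist a bot
        that probes many server IPs on purpose."""
        r_t = (A(t) - A(T_SP_IP)) / (t - T_SP_IP)
        r_sp = A(T_SP_IP) / T_SP_IP
        return r_t > 10 * r_sp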
4.2.4 Deployment Incentives

Both IoT owners and ISPs have reasons to deploy our system.

We incentivize IoT owners to run IoTSTEED in their home routers by protecting them from the potential privacy and security breaches resulting from compromised IoT devices. For example, a compromised IP camera may leak live footage [126, 154], a hacked smart lock might lead to robbery [138] and a hacked smart oven could potentially cause a house fire [18]. IoTSTEED protects IoT owners by constantly monitoring their IoT devices (Section 4.2.1 and Section 4.2.2) and notifying them about device compromises (Section 4.2.3). We also prevent IoT devices from talking to suspicious servers (Section 4.2.3), which mitigates the risk of IoT-related privacy breaches (considering, for example, that compromised IP cameras may talk to adversarial servers to transmit video footage).

We incentivize ISPs to pre-install IoTSTEED in their customer premises equipment (CPEs) for two reasons. First is value-added service: by pre-installing IoTSTEED (which protects their customers from compromised IoT devices), an ISP is effectively providing an IoT security service. (A survey shows two-thirds of households with up to ten IoT devices are willing to pay an average of $6.90 per month for IoT security services [87].) Second is bandwidth saving: by rejecting IoT-based DDoS traffic at CPEs, ISPs save their bandwidth for legitimate user traffic.

4.2.5 Countermeasures by Knowledgeable Adversaries

DDoS attacks are launched by criminals who will seek to avoid detection. We next discuss four possible ways a knowledgeable adversary would try to evade our defense. While they show how a knowledgeable adversary can reduce IoTSTEED's effectiveness, these countermeasures are either difficult, of limited applicability, or weaken the attacks.

First, a bot master could exploit first-round server learning (Section 4.2.2) by launching attacks during bots' server bootstrapping period and causing IoTSTEED to incorrectly learn attacks as valid behavior. Such evasion is unlikely in practice. Our bootstrapping period is a relatively short time window (2 hours or 120 hours after device bootup, Section 4.2.2). It is challenging for a bot master to infect IoT devices, rent out these devices to customers as part of a DDoS-for-a-service infrastructure and launch attacks via these devices all within this time window.

Second, a bot master could exploit second-round server learning (Section 4.2.2) by launching attacks against the three types of common IoT servers we consider benign in Section 4.2.2 (such as servers run by the attacking devices' manufacturers) and causing IoTSTEED to pass these attacks. IoTSTEED indeed cannot defend these common IoT servers from bots. However, by filtering attacks to all other servers, IoTSTEED still effectively breaks the economy of running DDoS as a service. The rationale is that to monetize DDoS infrastructure, a bot master needs to be able to attack any server and not just a few common IoT servers.

Third, a bot master could also exploit traffic filtering (Section 4.2.3) by using bots without IP-accessing filtering to attack IP-accessed servers and surpass our defense. While IoTSTEED works on most devices (11 or 12 of the 14 devices in Section 4.3 and Section 4.4), it does not work well on these devices. In addition, if UPnP is disabled, some of these devices become defendable (Section 4.4).

Lastly, a bot master could exploit device detection (Section 4.2.1) by disguising bots as non-IoT devices and bypassing our defense (IoTSTEED does not filter non-IoT traffic). There are two ways to disguise bots as non-IoT devices: one could spoof bots' MAC addresses with non-IoT MAC addresses (recalling Section 4.2.1), or one could make bots query more than T_svr server names and cause IoTSTEED to re-classify these bots as non-IoT devices (recalling Section 4.2.1). While disguising bots as non-IoT devices evades our defense, by adding the extra need to disguise bots, our defense makes launching IoT-based DDoS attacks harder. The need to disguise bots also weakens IoT-based DDoS attacks by making bots more identifiable. Disguised bots could potentially be identified by the pool of non-IoT MAC addresses a bot master uses for MAC spoofing or the pool of server names that a bot master makes bots query.

Potentially, we could make the last exploitation difficult to implement by requiring IoT owners to manually specify the MAC addresses of their non-IoT devices when starting IoTSTEED and treating the remaining MAC addresses as IoT devices. This way, to disguise bots as non-IoT devices, a bot master needs to know the MAC addresses of other non-IoT devices in the same LAN and spoof its bots with these specific non-IoT MAC addresses.
(We do not currently do so to minimize the manual operations required from IoT owners.)

4.3 Validation by Trace Replay

We validate the correctness of IoTSTEED with replay of off-line traffic captures. We first test IoTSTEED's accuracy in device detection (the fraction of IoT and non-IoT devices correctly detected) and false positives (FP) in server learning (flagging of benign endpoints as suspicious) and traffic filtering (dropping of benign packets). In this test, we run IoTSTEED with replay of a 10-day benign traffic capture from an IoT access network (Section 4.3.1). We then test IoTSTEED's true positives (TP) and false negatives (FN) in server learning and traffic filtering (Section 4.3.2). In this test, we replay real-world DDoS traffic captures together with the same 10-day benign traffic capture above (Section 4.3.2).

We use off-line captures because they enable testing IoTSTEED with real-world DDoS traffic by replaying DDoS captures (Section 4.3.2). In Section 4.4, we test IoTSTEED's accuracy in device detection and FP, TP and FN in server learning and traffic filtering with live traffic from an IoT access network.

4.3.1 False Positive with Benign Traffic

To understand IoTSTEED's accuracy in device detection and FP in server learning and traffic filtering, we capture 10-day benign traffic from an IoT access network and run IoTSTEED with replay of this traffic capture. We show IoTSTEED correctly detects all 14 IoT and 7 non-IoT devices in this network (100% accuracy) and maintains low false-positive rates: flagging 2% of 642 benign IoT endpoints as suspicious and dropping 0.45% of about seven million benign IoT packets.

Experiment Setup: To test IoTSTEED with benign traffic, we set up an experimental IoT access network by placing 14 IoT devices (Table 4.1) and seven non-IoT devices (two mobile phones, two tablets and three laptops) in a wireless LAN behind a NAT router. (Our IoT devices, as in Table 4.1, are mostly IP cameras because IP cameras are used in large-scale DDoS attacks [102].) To simulate running IoTSTEED inside the NAT router, we capture all traffic between the LAN and the Internet by running tcpdump in the NAT router (with UPnP service on) for 10 days. Since we require rebooting existing devices before starting IoTSTEED (Section 4.2.2), we shut off our IoT devices and boot them after tcpdump begins. We interact with our IoT devices daily from one of our mobile phones. (By testing with only our 14 IoT devices, our experiment is limited in device coverage. In principle, we could scale up by crowd-sourcing traffic captures from IoT devices owned by others, as shown in [66].)

Accuracy in Device Detection: We first test IoTSTEED's accuracy for device detection (Section 4.2.1). Our definition of accuracy is (TP+TN)/(TP+TN+FP+FN), where we treat IoT devices as positives and non-IoT devices as negatives. To get IoTSTEED's accuracy, we compare the devices it detects with ground truth and find the fraction of IoT and non-IoT devices it correctly detects.

We show IoTSTEED correctly detects all 14 IoT devices and infers their manufacturers. IoTSTEED infers both Dahua and Amcrest as Amcrest Cam's potential manufacturers because this device's MAC lookup result shows "Dahua", which is both an IoT manufacturer and a collaborator (OEM) to "Amcrest", recalling the rules from Section 4.2.1.

We show IoTSTEED correctly identifies all seven non-IoT devices, resulting in overall 100% accuracy in device detection.
IoTSTEED initially classifies six of these non-IoT devices as IoT because they come from IoT vendors (five from Apple and one from Samsung). IoTSTEED later re-classifies them as non-IoT since it observes that they query more than T_svr server names (Section 4.2.1). We show this initial mis-classification causes incorrect packet loss later in this section.

False Positive in Server Learning: We next examine IoTSTEED's FP in server learning (Section 4.2.2) and show it maintains a low false-positive rate: flagging a small fraction (12 out of 642 or 2%) of benign IoT endpoints as malicious.

We break down server learning results by round in Table 4.3 to understand what causes the few FPs. We show that first-round learning causes no FP: it tests most (464 or 73%) of the 642 ground truth IoT servers and correctly identifies all of them (Table 4.3). Our second-round learning causes all the FPs by mis-identifying a small fraction (12 or 7%) of the 170 IoT servers it tests as malicious. Eight of these 12 FPs are IP-accessed servers and the remaining four are name-accessed servers whose domains ("google" and "opendns") do not resemble common IoT servers, judged by the rules in Section 4.2.2.

Lastly, we show that turning off certain devices' IP-accessing filtering is crucial for keeping IoTSTEED's false positives in server learning low. IoTSTEED turns off three devices' IP-accessing filtering (Samsung Cam, D-Link Cam and Foscam Cam) because they constantly talk to new IP-accessed servers even after bootstrapping (Section 4.2.3). We find that most (146 or 86%) of the 170 IoT servers that second-round learning tests are IP-accessed servers visited by these three devices and are labeled benign (TP) only because these devices' IP-accessing filtering is off (see Table 4.3). If we kept these three devices' IP-accessing filtering on, these 146 IP-accessed servers would be flagged as malicious (FP) (recalling Section 4.2.2), boosting IoTSTEED's false-positive rate in server learning to 25% (158 out of 642). (In Section 4.3.2, we show that the tradeoff of turning off filtering for these devices is that we risk allowing attacks from them.)

False Positive in Traffic Filtering: We next examine IoTSTEED's FP in traffic filtering (Section 4.2.3). We show IoTSTEED's false-positive rate is low, dropping only a tiny fraction (33,183 or 0.45%) of the about seven million packets sent to (or originating from) our IoT devices in this 10-day measurement.

ground truth IoT servers from 14 IoT devices              642 (100%)
  whitelisted server IPs (all correctly identified)         8   (1%)
  enter first-round learning (all correctly identified)   464  (72%)
  enter second-round learning                             170  (26%) (100%)
    correctly identified                                  158  (25%)  (93%)
      detected by 3rd-party-svr rule                       12   (2%)   (7%)
      visited by devs without IP-acs fltr                 146  (23%)  (86%)
    mis-identified as malicious                            12   (2%)   (7%)
Table 4.3: Server Learning Breakdown with Benign IoT Traffic (the first percentage column is relative to all 642 servers; the second, where shown, is relative to the 170 servers entering second-round learning)
Victim  | Access Type | Simulated Attacker | Traffic Type | Start After Bootstrap? | Filtering Decision
B root  | IP          | Amcrest Cam        | TCP          | Yes                    | Drop
B root  | IP          | Amcrest Cam        | DNS          | Yes                    | Drop
B root  | IP          | Amcrest Cam        | TCP          | No                     | Pass
B root  | IP          | Foscam Cam         | TCP          | Yes                    | Pass
B root  | IP          | a non-IoT dev      | TCP          | Yes                    | Pass
Krebs   | Name        | Amcrest Cam        | TCP          | Yes                    | Drop
Krebs   | Name        | Philips Bulb       | TCP          | Yes                    | Drop
Philips | Name        | Philips Bulb       | TCP          | Yes                    | Pass
Krebs   | Name        | HP Printer         | TCP          | Yes                    | Drop
Krebs   | Name        | Dyson Purifier     | TCP          | Yes                    | Drop
Table 4.4: Simulated Attacks in Validation

We show that of these 33,183 false-positive packet losses, most (90.72% or 30,104) occur because IoTSTEED misclassifies 12 IoT servers as malicious (Table 4.3). We show that the rest of the false-positive packet loss (7.83%, 2,597 out of 33,183) occurs because IoTSTEED initially mis-classifies six non-IoT devices as IoT and filters these non-IoT devices' traffic as if it were IoT traffic. After we correctly re-classify these six non-IoT devices later, we no longer filter their traffic (because we ignore non-IoT traffic). In Section 4.4 we show that we could avoid this packet loss by listing all non-IoT devices' MAC addresses as exceptions (whose packets IoTSTEED will ignore) when starting IoTSTEED.

4.3.2 True Positives and False Negatives with Attack Traffic

To understand IoTSTEED's true positives (TP) and false negatives (FN) in server learning and traffic filtering, we spoof real-world DDoS attack captures and run IoTSTEED with replay of these spoofed attack traffic captures.

Replay of Real-world DDoS Traffic: We test IoTSTEED with attack traffic from two real-world DDoS events captured at the B-root DNS server (simply "B root" hereafter): a DNS query flooding event in December 2015 and a TCP SYN flooding event in June 2016. We first simulate ten IoT-based DDoS attacks (each a row in Table 4.4) by spoofing attack traffic captures from these two DDoS events so that the attacks appear to come from our IoT devices. (Among these ten simulated attacks, four are examples of the four potential countermeasures to IoTSTEED discussed in Section 4.2.5.) We then test IoTSTEED with replay of each of these simulated IoT-based DDoS attacks together with the 10-day benign traffic capture from Section 4.3.1.

We first simulate 5 IoT-based DDoS attacks to an IP-accessed server: B root, shown as the top five attacks in Table 4.4. (We simulate attacks to B root because IoT-based DDoS attacks have frequently targeted DNS servers [36]. We do not simulate attacks to other IP-accessed servers because IoTSTEED only cares about when an IP-accessed server gets accessed and who accesses it, not the exact server IP.) To simulate attacks to B root, we first extract one random attacker's DDoS packets from a 15-minute sample of each DDoS event (referred to as the DNS or TCP "attacker capture"). We then simulate an IoT device D attacking B root at time t by replacing the source MAC and IP addresses in an attacker capture with the MAC and LAN IP of D and shifting the timestamps of all DDoS packets in this attacker capture to right after t (called a "spoofed attacker capture"). Since the two DDoS events we use are captured at the victim (B root) and do not include potential DNS queries from attackers about the victim, our spoofed attacker capture simulates device D directly attacking B root by IP without preceding DNS queries.
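The rewriting step just described could be sketched as follows. This is an illustration only, not the scripts we used: it relies on scapy, and the file names, MAC address and IP address are placeholders.

    # Illustrative sketch of building a "spoofed attacker capture":
    # rewrite the source MAC and IP of one attacker's packets and shift
    # their timestamps so the attack appears to start at time t from a
    # chosen IoT device. (File names and addresses are placeholders.)
    from scapy.all import rdpcap, wrpcap, Ether, IP, TCP, UDP

    def spoof_attacker_capture(infile, outfile, dev_mac, dev_ip, start_time):
        packets = rdpcap(infile)             # 15-minute attacker sample
        base = float(packets[0].time)
        for pkt in packets:
            if Ether in pkt:
                pkt[Ether].src = dev_mac      # appear to come from device D
            if IP in pkt:
                pkt[IP].src = dev_ip          # D's LAN IP
                del pkt[IP].chksum            # force checksum recomputation
            if TCP in pkt:
                del pkt[TCP].chksum
            elif UDP in pkt:
                del pkt[UDP].chksum
            # shift timestamps so the capture starts right after time t
            pkt.time = start_time + (float(pkt.time) - base)
        wrpcap(outfile, packets)

    # e.g., make the TCP SYN flood appear to come from one IoT camera:
    # spoof_attacker_capture("attacker.pcap", "spoofed.pcap",
    #                        "aa:bb:cc:dd:ee:ff", "192.168.1.23", t)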
We summarize the five IoT-based DDoS attacks to B root we simulate in Table 4.4. Four of them are based on spoofing the TCP attacker capture with four combinations of attackers and attacking times: Amcrest Cam attacking both after and during server bootstrapping, Foscam Cam attacking after bootstrapping, and a non-IoT device attacking after bootstrapping. (We test IoTSTEED with a non-IoT attack to simulate defending against IoT devices disguised as non-IoT, one potential countermeasure to IoTSTEED from Section 4.2.5.) The remaining simulated attack is based on spoofing the DNS attacker capture with one attacking device (Amcrest Cam) and time (after bootstrapping).

We next simulate five attacks to two name-accessed servers (the bottom five attacks in Table 4.4): www.krebsonsecurity.com and www.philips.com (shortened as "Krebs" and "Philips" hereafter). We simulate an attack to Philips, a manufacturer server for Philips Bulb, to test IoTSTEED's defense against attacks to common IoT servers from Section 4.2.2. We simulate attacks to Krebs, the victim of an IoT-based DDoS attack in 2016 [76], to test IoTSTEED's defense of all other name-accessed servers. To simulate an IoT device D attacking Krebs at time t via TCP SYN flooding (simulating attacks to Philips is similar), we first generate a spoofed TCP attacker capture for D (which simulates D TCP-SYN flooding B root at time t) as discussed above. We then replace the victim IP in the spoofed TCP attacker capture (the B-root IP) with Krebs's IP. Lastly, we inject forged DNS traffic (containing a type-A DNS query from D about Krebs and the corresponding DNS replies) at the beginning of this spoofed TCP attacker capture.

We summarize the five IoT-based DDoS attacks to Krebs and Philips we simulate in Table 4.4. These five simulated attacks are based on spoofing the TCP attacker capture with five different combinations of attackers and victims: Amcrest Cam, Philips Bulb, HP Printer and Dyson Purifier to Krebs, and Philips Bulb to Philips.

True Positives with Attack Traffic: We show IoTSTEED defends against all attacks except the four attacks based on countermeasures from Section 4.2.5 (see the six attacks with "Drop" filtering decisions in Table 4.4). We show IoTSTEED defends against these six attacks regardless of their attacking devices (four different devices, Table 4.4), traffic types (TCP SYN flooding and DNS query flooding), victims (B root and Krebs) and accessing types for victims (IP-accessed and name-accessed). In live deployment (Section 4.4), we extend this observation by showing that IoTSTEED defends against attacks regardless of their packet rates.

False Negatives with Attack Traffic: We confirm IoTSTEED indeed cannot mitigate the four types of attacks discussed in Section 4.2.5, contributing to the four FNs we observe in Table 4.4 (see the four "Pass" attacks). We also confirm our conclusion from Section 4.2.5 that these countermeasures are either difficult, of limited applicability, or weaken attacks.

First, we show that IoTSTEED cannot defend against attacks launched during the server bootstrapping period of the attacking devices, which contributes to the passing of the attack in which Amcrest Cam attacks B root during bootstrapping (third B-root attack in Table 4.4). However, this type of attack is not likely to happen in practice since the bootstrapping period is relatively short (see Section 4.2.5 for details).
Second, we show IoTSTEED cannot defend against devices attacking the three classes of common IoT servers we consider benign in Section 4.2.2, which causes us to miss the attack from Philips Bulb to www.philips.com (see the attack to Philips in Table 4.4). While IoTSTEED cannot defend against attacks to a few common IoT servers, it still defends the remaining majority of servers and breaks the economy of running commercial DDoS attacks (Section 4.2.5).

Victim    | Access Type | Simulated Attacker | Traffic Type | Start After Bootstrap? | Filtering Decision
Univ Svrs | IP          | Amcrest Cam        | Slow TCP     | No                     | Pass
Univ Svrs | IP          | Foscam Cam         | Slow TCP     | Yes                    | Drop
Univ Svrs | IP          | Philips Bulb       | Slow TCP     | Yes                    | Drop
Univ Svrs | IP          | TPLink Plug        | Fast TCP     | Yes                    | Drop
Table 4.5: Simulated Attacks in Deployment

Third, we show that IoTSTEED cannot defend against devices without IP-accessing filtering attacking IP-accessed servers, which contributes to the passing of the attack from Foscam Cam, whose IP-accessing filtering is off (fourth B-root attack in Table 4.4). We argue that since these devices are in the minority (3 of 14 IoT devices in Section 4.3) and some of them (including Foscam Cam) become defendable once UPnP service is disabled in the router, IoTSTEED still defends the majority of IoT devices and could effectively reduce IoT-based DDoS traffic at the victim (Section 4.2.5). (We show later in Section 4.4 that by disabling UPnP service, Foscam Cam becomes defendable.)

Lastly, we show IoTSTEED cannot defend against attacks from IoT devices disguised as non-IoT devices, which contributes to the passing of the attack from a non-IoT device to B root (last B-root attack in Table 4.4). As discussed in Section 4.2.5, while disguising IoT devices does evade our defense, it weakens the resulting IoT-based DDoS attack by making the attack harder to implement and making the bots easier to identify.

In summary, our experiment results suggest IoTSTEED could mitigate all except the four types of attacks discussed in Section 4.2.5, regardless of the attacks' types, flow characteristics and exact attacking devices.

4.4 Validation by Router Deployment

Having tested IoTSTEED with replay of off-line traffic captures (Section 4.3), we next deploy it on-line in an IoT access network's NAT router for 10 days. We show IoTSTEED works in router deployment similarly to trace-replay validation: with reasonable run-time overhead, few false positives (FP) with benign traffic and similar true positives (TP) and false negatives (FN) with attack traffic.

Experiment Setup: We deploy IoTSTEED in the NAT router of the experimental IoT access network from Section 4.3. (The NAT router is a Linksys WRT1900ACS router running OpenWRT version 19.07.1 [111].) We add one extra Linux PC to this network to simulate IoT-based DDoS attacks. Similar to trace-replay validation (Section 4.3), our deployment experiment lasts 10 days. We shut off all IoT devices initially and boot them up after IoTSTEED starts. We interact with IoT devices daily with the same mobile phone as in Section 4.3.

We simulate four IoT-based DDoS attacks during the 10-day deployment, as summarized in Table 4.5. We run hping3 [63] from one Linux laptop in this IoT access network (with one "fast" TCP SYN flooding attack of 1000 packets/s and three "slow" TCP SYN attacks of 10 packets/s) and spoof this laptop's MAC and IP addresses with those of certain IoT devices (the attackers in Table 4.5).
We block all inbound and outbound traffic to this laptop except traffic to the DDoS victim, to prevent the laptop from talking to non-victim PC-oriented servers using spoofed IoT MAC addresses and confusing IoTSTEED into thinking that some IoT devices are talking to these PC-oriented servers. (We cannot simulate attacks to name-accessed servers because if we allowed DNS traffic from this Linux laptop, it could DNS-query non-victim PC-oriented servers, such as connectivity-check.ubuntu.com, with a spoofed IoT MAC address and confuse IoTSTEED.)

[Figure 4.5: IoTSTEED's per-hour CPU and memory usage during deployment.]

Lastly, we apply two tweaks to our router deployment experiment to validate our claims from trace-replay validation that they could reduce FP (Section 4.3.1) and FN (Section 4.3.2) in traffic filtering. First, we list all non-IoT devices' MAC addresses as exceptions (whose packets IoTSTEED would ignore) when starting IoTSTEED. Our goal is to validate the claim that doing so could reduce FP in traffic filtering (caused by the initial mis-classification of some non-IoT devices as IoT, recalling Section 4.3.1). Second, we disable UPnP service in the NAT router during our 10-day deployment. Disabling UPnP allows us to validate our claim that without UPnP, devices like Foscam Cam will stop constantly talking to new IP-accessed servers and causing potential FN in filtering DDoS attacks (Section 4.3.2).

Measuring Run-time Overhead: We measure the memory and CPU usage of IoTSTEED every hour during this 10-day deployment. We show IoTSTEED uses a small amount (on average 4%) of the 512MB memory and about half (on average 49%) of the 1.6 GHz dual-core CPU in this NAT router, and its memory and CPU usage are quite stable, as illustrated in Figure 4.5.

We confirm that IoTSTEED's CPU usage does not slow down the router's packet forwarding, and thus even though half the CPU is busy, the router's user-visible performance is unaffected. We run an Internet speed test (www.speedtest.net) from one laptop in this access network and confirm that this laptop's peak download and upload speeds with IoTSTEED running (113 Mb/s and 11 Mb/s, averaged over 10 tests) are roughly identical to those without IoTSTEED running (114 Mb/s and 11 Mb/s, averaged over 10 tests). (As router CPUs keep growing faster, we expect IoTSTEED's CPU usage to decrease over time.)

False Positive with Benign Traffic: We confirm IoTSTEED's accuracy in device detection and FP in server learning and traffic filtering during router deployment are similar to what we report in trace-replay validation (Section 4.3.1). IoTSTEED correctly detects all 14 IoT devices and infers their manufacturers (100% accuracy). It maintains a low false-positive rate in server learning: flagging a small number (6 or 4%) of 139 benign IoT endpoints as suspicious. (The six FPs include five IP-accessed servers entering second-round learning and one name-accessed server with domain "Google" that does not resemble common IoT servers.) It also shows a low false-positive rate in traffic filtering: dropping a tiny fraction (8,769 or 0.07%) of 12 million IoT packets observed from this LAN. We show all FP in traffic filtering are due to IoTSTEED flagging six benign endpoints as suspicious during server learning.
Comparing to Section 4.3.1, we see no false-positive filtering caused by initial misclassification of non-IoT devices as IoT because we exclude non-IoT MAC addresses. We conclude that excluding non-IoT MAC addresses when starting IoTSTEED could reduce FP in traffic filtering. True Positive and False Negative with Attack Traffic: We next examine IoT- STEED’s TP and FN in filtering attack traffic during router deployment and show they are similar to what we observe in trace-replay validation. We confirm that IoTSTEED cannot defend attack during bootstrapping, which causes the one FN in Table 4.5. We confirm IoTSTEED mitigates all other three attacks tested regardless of the attacking devices (three different devices Table 4.5) and packet rates (1000 packets/s and 10 pack- ets/s) of attacks. 116 We note that we successfully mitigate the attack from Foscam Cam during router deployment while we pass a similar attack from Foscam Cam during trace-replay val- idation Section 4.3.2) . We believe the reason for this difference is that we turn off UPnP service in deployment. In trace-replay validation, Foscam Cam keeps respond- ing to unsolicited probes from Internet scanners (a side effect of UPnP) and appears to be constantly visiting new IP-accessed servers, forcing IoTSTEED to turn off its IP- accessing filtering. In deployment, these Internet scanners cannot reach Foscam Cam because without UPnP service, Foscam Cam cannot set up static port mapping in router. As a result, IoTSTEED keeps IP-accessing filtering for Foscam Cam on and mitigate its attack. We conclude that for some devices (such as Foscam Cam, Samsung Cam and D-Link Cam in router deployment), we can prevent them from constantly contacting new IP-accessed servers and failing our defense by disabling UPnP service in router. Lastly, we show that even without UPnP service, a few devices could still keep talk- ing to new IP-accessed servers and IoTSTEED cannot defend them from attacking IP- accessed servers. During our 10-day deployment, we find Belkin Plug and Tenvis Cam both talk to a few new IP-accessed servers after bootstrapping and cause us to turn off their IP-accessing filtering. For Belkin Plug, we find most (12 or 80%) of its 15 IP- accessed servers to be STUN servers for NAT traversal (similar to what we observe in Section 4.2.2 and in Section 4.3). However we are not certain about why Belkin Plug keeps talking to new STUN servers throughout router deployment while in Section 4.3, we observe Belkin Plug stops talking to new STUN servers in the middle of measure- ment. We are also unclear about why Tenvis Cam talks to a few new IP-accessed servers (3, not counting whitelisted server IPs in Section 4.2.2) after bootstrapping in router deployment but not in our prior measurements (Section 4.2.2 and Section 4.3). 117 4.5 Related Work Prior groups have studied detecting IoT devices and defending both IoT-based and tra- ditional DDoS attacks (launched from PCs and servers). 4.5.1 IoT Device Detection Several prior projects detect IoT devices with public IPs by active scanning [21, 33, 98, 131, 141]. In comparison, our detection uses passive measurement and thus could detect devices both with public IPs and behind NATs (when running from NAT box). Other prior work also measures IoT devices passively and thus covers devices behind NAT [11, 55, 57, 62, 135]. 
However they either rely on pre-training with traffic from tar- get devices [55,57,135] (our prior studies on general IoT detection [55,57] fall into this category), or only cover IoT devices infected by certain malware [11, 62]. In compar- ison, our detection is based on MAC addresses and requires no such pre-training with target devices’ traffic. Our detection also applies to general IoT devices instead of just the compromised ones. 4.5.2 IoT-Based DDoS Defense Traffic Endpoint as Signals: similar to our work, prior work also explores the idea of detecting malicious IoT traffic based on traffic endpoints. They either operate near the IoT devices or near remote endpoints IoT devices talk to. Several groups propose detecting malicious IoT traffic from the edge router of IoT access network based on traffic endpoints [29, 59, 115]. The preliminary measure- ment study from Brown University shows it is feasible to detect DDoS attack traffic from (and to) IoT devices by whitelisting their legitimate endpoints without providing a method to build this whitelist [29]. Along the same line, work from University of New 118 South Wales proposes building whitelists of benign IoT endpoints based on manufac- turer usage description (MUD, an IETF draft [80]) profile provided by MUD-compliant IoT devices [59]. Different from both works, we provide concrete mechanisms to build a whitelist of benign IoT endpoints, without relying on the currently non-existent MUD- compliant IoT devices. Bogazici University detects compromised IoT devices by identi- fying devices that send TCP SYN packets to many different endpoints in a short interval without receiving as much positive responses [115]. Unlike their focus on TCP SYN scanning traffic, we detect all types of malicious IoT traffic not sent to (nor come from) benign IoT endpoints. One group detects malicious IoT traffic by measuring from the remote endpoints IoT devices talk to [141]. Work from Concordia University ( [141]) infers compromised IoT devices in the public Internet by identifying the fraction of IoT devices detected by Shodan ( [131]) that send packets to allocated but un-used IPs monitored by CAIDA (as known as darknet [17]). In comparison, our work have very different coverage: we cover compromised IoT devices in the access network we monitor (potentially with private IP addresses) while they cover a subset of compromised IoT devices on public Internet. We cover malicious traffic to all endpoints suspicious for IoT devices to visit, unlike their focus on malicious IoT traffic to a specific group of suspicious endpoints: the CAIDA darknet IPs. Traffic Flow Statistics as Signals Several prior work detect IoT-based DDoS traf- fic based on traffic flow statistics (such as packet rates and packet sizes) from the edge router of an IoT access network (such as the NAT router of a LAN) [31, 93, 109]. They detect IoT-based DDoS traffic using machine learning (ML) models trained with either a mixture of benign IoT traffic and simulated attack traffic (binary classification that detects DDoS by looking for similarity to known attack traffic patterns [31]) or with only benign IoT traffic (anomaly detection that detects DDoS by looking for deviations 119 from known benign traffic patterns [93, 109]). Unlike their signals based on traffic flow statistics, we use first-visit time and identities of traffic endpoints as signal. 
Comparing to their detection techniques of ML-based binary classification [31] and anomaly detec- tion [93,109], we use a different technique: heuristic-based rules. While they all assume an IoT-only access network, IoTSTEED could separate IoT from non-IoT devices and could operate in realistic access networks with both IoT and non-IoT devices. IoT- STEED also covers malicious traffic of different types (such as TCP SYN flooding and DNS query flooding) and flow statistics (such as packet rates) by dropping all traf- fic not sent to (nor come from) benign endpoints. In comparison, ML-based binary classification only detects attacks similar to known attack traffic seen in model train- ing [31]. Although in principle ML-based anomaly detection [93, 109] could identify malicious traffic of different types and different flow statistics by looking for deviations from known benign traffic patterns, prior work has shown that ML models are not good at detecting such deviations especially when dealing with real-world network traffic of highly variable flow statistics [136]. Lastly, their methods are computationally heavy and are not likely to run from resource-limited NAT router. In comparison, IoTSTEED is light-weight and could run entirely in commodity NAT router without downgrading router’s packet forwarding capabilities, as shown in Section 4.4. Other Signals: Other prior work detects compromised IoT devices and defend mali- cious IoT traffic with other signals [28,70]. Work from IFFAR detects compromised IoT sensors reporting altered measurements by finding outliers in measurements reported by a pool of homogeneous IoT sensors [28]. In comparison, IoTSTEED applies to all types of IoT access network instead of just networks of homogeneous IoT sensors. Work from National University of Singapore [70] mitigates IoT-based DDoS attack to a given server by setting static traffic quotas in this server for each contacting IoT device and dropping excessive packets. In comparison, IoTSTEED do not require access to victim servers. 120 4.5.3 Traditional DDoS Defense Prior to IoT-based DDoS attacks, DDoS attacks launched from traditional network devices such as PC and servers has been studied widely. Defense at Bot Side: similar to our work, prior work propose defending traditional DDoS traffic at bot side [39, 45, 99]. D-WARD detects DDoS attacks at bot side by comparing traffic flow statistics from and to the access network of bots with pre-defined models for normal flow statistics [99]. FireCol puts multiple intrusion prevention sys- tems (IPS) close to the access network of bots and detects DDoS based on traffic band- widths measured from these IPSes [39]. MULTOPS detects DDoS attacks overloading network bandwidth of victims by identifying disproportional difference between packet rate coming from and going back to the bots’ access network [45]. Different from their focus on traffic flow statistics such as packet rates and ratio of number of packets sent and received, our work focuses on traffic endpoints such as their first-visit time and DNS names. Defense at Intermediate Network: Prior work also studies defending DDoS attack at the intermediate network between bots and victims. Work from AT&T Labs detects DDoS traffic from intermediate network by identifying aggregates of network flows that cause network link congestion. Their detection relies on traffic flow statistics such as packet arrival rates and packet dropping history [91]. 
In comparison, IoTSTEED detects DDoS traffic using traffic endpoints as signals and operates at the bot side.

Defense at Victim Side: Other prior work studies defending against DDoS traffic at the victim network. Prior works defend against network- and transport-layer DDoS attacks that consume bandwidth and other resources of victims (such as ICMP flooding and TCP SYN flooding) with techniques such as statistical anomaly detection (which detects DDoS by identifying anomalies in certain traffic flow statistics) [145], flow-imbalance heuristics (which look for unbalanced packet rates between the incoming and outgoing traffic flows) [5] and TCP SYN cookies [13]. Prior works defend against application-layer DDoS attacks targeting web applications (such as HTTP GET flooding) with techniques such as Turing tests (which distinguish legitimate human users from machines) [73, 120], moving-target techniques (which move the target web application around a pool of servers to increase uncertainty for the attacker) [71, 74, 150], domain-helps-domain collaboration (which allows a domain to direct excess traffic to other trusted external domains for DDoS filtering) [122] and also statistical anomaly detection [107, 121]. Different from these prior works that defend against DDoS attacks at the victim side, IoTSTEED defends against DDoS traffic at the bot side.

4.6 Conclusion

We propose IoTSTEED, a system that defends against IoT-based DDoS attacks at the bot side. IoTSTEED detects IoT devices in its deployed network, learns their benign servers and filters suspicious IoT traffic to and from all other servers. We validate IoTSTEED with replay of a 10-day benign traffic capture from an IoT access network and simulated IoT-based DDoS attacks. We show IoTSTEED correctly detects the 14 LAN IoT and 6 non-IoT devices in this access network (100% accuracy). We show IoTSTEED maintains low false-positive rates in server learning (2%) and traffic filtering (0.45%). Experimental results also show IoTSTEED mitigates all except four types of attacks, regardless of the attacks' types, flow characteristics and exact attacking devices. We show that none of the four types of attacks IoTSTEED misses could completely evade our defense without weakening the attacks themselves. Lastly, we deploy IoTSTEED in the NAT router of an IoT access network for 10 days. We show IoTSTEED can run on a commodity router with reasonable overhead: small memory usage (4% of 512 MB) and no degradation of the router's packet forwarding. We confirm that IoTSTEED's accuracy in device detection, and its false-positive, true-positive and false-negative rates in server learning and traffic filtering, during on-line router deployment are similar to what we report in off-line trace-replay validation.

In our study of compromised-IoT detection and DDoS mitigation we develop a new signature of traffic based on the observed identities of end-points: IoT devices talking to end-points other than known benign IoT end-points are compromised. We use this new signature to detect compromised IoT devices, a new class of network devices. Our detection of compromised IoT devices is essentially also a characterization, where we characterize the IoT devices under monitoring as either benign or malicious depending on whether they talk to suspicious end-points. Our detection of compromised IoT devices also enables filtering of DDoS traffic between them and suspicious endpoints.
We show our method maintains low false-positive rates in flagging suspicious remote endpoints (2%) and filtering DDoS packets (0.45%) when validated with replay of benign IoT traffic captures (Section 4.3.1). We show our method mitigates all except two types of the attacks tested, regardless of the attacks' types (such as TCP SYN flooding and DNS query flooding) and traffic characteristics (such as packet rates), when validated with replay of real-world DDoS traffic (Section 4.3.2). Lastly, we show our method can run on a commodity router with reasonable overhead: small memory usage (4% of 512 MB) and no degradation of the router's packet forwarding (Section 4.4). We also confirm that our method's correctness (false-positive, true-positive and false-negative rates in server learning and traffic filtering) during on-line deployment is similar to what we report in off-line validation (Section 4.4). Thus our study of compromised-IoT detection and DDoS mitigation demonstrates the thesis statement.

Chapter 5 Future Work and Conclusions

In this chapter, we discuss possible directions for future work and then conclude this thesis.

5.1 Future Work

There is immediate future work for each of our three studies that could strengthen our thesis statement. Our work also suggests some future studies that could benefit from ideas in this thesis. We next discuss possible future work.

5.1.1 Immediate Future Work for Our Studies

In Chapter 2, we presented detecting and characterizing ICMP rate limiters based on how they change the responsiveness of traffic end-points to active probing. There are two directions to strengthen this work. First, our study (Chapter 2) focused on detecting and characterizing rate limiters for IPv4 ICMP traffic, but we would also like to understand ICMP rate limiting in IPv6. We could apply our detection techniques to IPv6 ICMP measurements to detect rate limiters of ICMPv6 traffic. By detecting ICMPv6 rate limiters and estimating the packet rates of their rate limits, we strengthen our thesis statement by showing that our end-point-based device detection works for ICMPv6 rate limiters (a new class of network devices). By comparing detected ICMP rate limiting (rate limiters and their estimated rate limits) in IPv6 with that in IPv4, we get a more complete view of the distribution of ICMP rate limiting in the Internet. Second, our study of rate limiting of fast probing was based on existing ZMap measurements, which do not satisfy all our needs (they do not record the exact IPs probed and probe only one round, while we need multi-round probing). As a result, we could only apply part of our method to these existing ZMap measurements and could not be definite about the distribution of ICMP rate limiting at high packet rates (up to 1 packet/s per /24). To better understand the distribution of ICMP rate limiting at high packet rates, we need high-rate ICMP measurements that fit the needs of our method, either by collecting new high-rate measurements ourselves or by asking ZMap to collect high-rate measurements for us. Understanding ICMP rate limiting at high rates will complete our view of the distribution of ICMP rate limiting at different packet rates.

In Chapter 3, we demonstrated mapping identities of traffic end-points to IoT devices. There are two directions to strengthen this work. First, our IoT detection technique only identified the existence of IoT devices behind NAT without knowing the exact device count.
We would like to know the exact device count to better understand the number and types of IoT devices IoT users own. We could explore counting detected IoT devices behind NAT based on TCP/IP header information in their traffic, such as the values of IP ID fields and TCP timestamp options, as pointed out by prior work like [12, 148]. Second, our estimate of the overall AS penetration of our 23 device types was an under-estimate of the ground truth (it is currently based on a 112-day measurement at B-Root, recalling Section 3.3.2). We would like to understand the actual number of ASes in the world where these 23 device types exist. We could estimate the actual overall AS penetration of our 23 device types by applying detection to even longer measurements at B-Root. (We know our B-Root measurement is long enough when we do not see extra detections with additional measurements.)

In Chapter 4, we explored detecting compromised IoT devices by identifying IoT devices talking to end-points other than a list of known benign IoT end-points. We also filtered DDoS traffic between these compromised IoT devices and suspicious end-points. There are two directions to strengthen this work. First, our method built the list of benign end-points that uncompromised IoT devices talk to by learning from their traffic, which we show to be imperfect (our method flags 2% of benign end-points as malicious in validation). We could explore directly importing the ground truth for this benign-end-point list from IoT devices supporting MUD (Manufacturer Usage Description), an IETF draft that allows IoT manufacturers to specify the legitimate network communications of their products (called a MUD file), including the benign traffic end-points [80]. By reading from MUD files, we could ensure our list of benign IoT end-points is correct, at least for MUD-compliant IoT devices. Second, our study only defended against attack traffic over IPv4. Extending our methods to also defend against attack traffic over IPv6 will increase our method's coverage of attacks.

5.1.2 Future Work Suggested by This Thesis

In addition to the above immediate future work that could strengthen our three specific studies, our thesis suggests wider future directions.

First, future work that studies the Internet with active probing could benefit from both the detection method and the results of our study of ICMP rate-limiting detection (Chapter 2). Studies relying on ICMP probing could adjust their probing rates based on our recommendations (that probing up to 0.39 packets/s per block is safe and that probing up to 1 packet/s per /24 risks being rate limited) and avoid being distorted by rate limiting. Studies that rely on other types of active probing (such as TCP SYN) could run our method, detect potential rate limiting of their probing, and adjust their packet rate accordingly. Although our method was originally designed to detect ICMP rate limiting, this approach could also detect other types of rate limiting because it detects the actions of the underlying token bucket.

Second, future efforts on IoT security and IoT management could benefit from the detection algorithms in our general IoT device detection study (Chapter 3). Future studies of IoT security solutions could understand the scale and growth of the IoT security problem with our IoT detection algorithms. DNS-based detection applied to recursive-to-authority DNS queries shows which ASes have target IoT devices. Certificate-based detection reveals target devices on public IPs.
DNS- and certificate-based detections applied to historical measurements show the growth of target IoT devices at the AS level and on public IPs over the years. Our IoT detection algorithms could also benefit network management by enabling ISPs and IT administrators to discover and monitor IoT devices in their networks.

Third, future studies of victim-side defense against IoT-based DDoS attacks could benefit from the bot-side mitigation system in our study of compromised-IoT detection and DDoS mitigation (Chapter 4). While it is very costly to defend against large-scale DDoS attacks at the victim due to the large attack volume, our bot-side DDoS mitigation system could be deployed in conjunction with victim-side mitigation to decrease the attack volume reaching the victim (by pre-filtering at the bot side) and so decrease the cost of victim-side DDoS defense. Our bot-side DDoS mitigation system could also be used as part of an end-to-end DDoS defense solution that filters at the bot side, in intermediate networks and at the victim side.

Lastly, our thesis demonstrates the effectiveness and benefits (such as NAT robustness) of using signatures of traffic about end-points to detect three types of network devices. Future work on detecting other network devices that talk to a small number of benign end-points (for example, medical devices in hospitals [108] and voting machines used for elections [100]), and on monitoring these devices for compromise, could benefit from adopting signatures of traffic about end-points. Future work on detecting network devices from aggregated traffic (such as NATted traffic) and regulated traffic (such as bandwidth-throttled traffic [22]) could also benefit from our idea of profiling traffic end-points, because while traffic patterns such as rates and timing are usually obfuscated in aggregated and regulated traffic, traffic end-points are preserved.

5.2 Conclusions

Detecting and studying network devices is essential for better understanding, managing and securing the Internet. Detecting network devices from traffic measurements requires a signature of traffic: a mapping from certain traffic characteristics to target devices. This thesis focuses on detecting network devices using one type of signature of traffic, the signature of traffic about end-points: mappings from traffic characteristics about end-points (such as end-points' counts, responsiveness and identities observed from traffic) to target devices.

This thesis states that "new signatures of traffic about end-points enable detection and characterizations of new class of network devices." We support this statement through three specific studies, each detecting and characterizing a new class of network devices with a new signature of traffic about end-points. In our first study, we present detecting and characterizing network devices that rate limit ICMP traffic, based on how they change the observed responsiveness of traffic end-points to active probing. In our second study, we demonstrate mapping observed identities of traffic end-points to a new class of network devices: general IoT devices. In our third study, we explore detecting compromised IoT devices by identifying IoT devices talking to suspicious end-points. Detection of compromised IoT devices also enables us to mitigate DDoS attacks from these compromised devices to suspicious end-points.
Our three studies cover two subspaces of the traffic-based device detection problem: active device detection (ICMP rate-limiting detection) and passive device detection (general-IoT and compromised-IoT detection), and suggest our thesis statement is helpful across multiple parts of the problem space.

Bibliography

[1] Cisco Manual For Configuring Traffic Policing. http://www.cisco.com/c/en/us/td/docs/switches/metro/me3600x_3800x/software/release/15-3_2_S/configuration/guide/3800x3600xscg/swqos.html#wp999715.

[2] Juniper Manual For Configuring Traffic Policing. http://www.juniper.net/techpubs/en_US/junos14.2/topics/concept/policer-types.html.

[3] rejwreply: a Linux kernel patch that adds echo-reply to the feedback types of the iptables REJECT rule. https://ant.isi.edu/software/rejwreply/index.html.

[4] Ubuntu User Manual for IPtables. http://manpages.ubuntu.com/manpages/natty/man8/iptables.8.html.

[5] S. Abdelsayed, D. Glimsholt, C. Leckie, S. Ryan, and S. Shami. An efficient filter for denial-of-service bandwidth attacks. In IEEE Global Telecommunications Conference, volume 3, pages 1353–1357, Dec 2003.

[6] Günes Acar, Noah Apthorpe, Nick Feamster, Danny Y. Huang, Frank Li, and Arvind Narayanan. IoT Inspector Project from Princeton University. https://iot-inspector.princeton.edu/.

[7] David Adrian, Zakir Durumeric, Gulshan Singh, and J. Alex Halderman. Zippier ZMap: Internet-wide scanning at 10 Gbps. In Proceedings of the USENIX Workshop on Offensive Technologies, San Diego, CA, USA, August 2014. USENIX.

[8] Mark Allman. Case Connection Zone DNS Transactions, January 2018 (latest release). http://www.icir.org/mallman/data.html.

[9] O. Alrawi, C. Lever, M. Antonakakis, and F. Monrose. SoK: Security evaluation of home-based IoT deployments. In 2019 IEEE Symposium on Security and Privacy (SP), pages 1362–1380, May 2019.

[10] Manos Antonakakis, Tim April, Michael Bailey, Matt Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, J. Alex Halderman, Luca Invernizzi, Michalis Kallitsis, Deepak Kumar, Chaz Lever, Zane Ma, Joshua Mason, Damian Menscher, Chad Seaman, Nick Sullivan, Kurt Thomas, and Yi Zhou. Understanding the Mirai botnet. In 26th USENIX Security Symposium, 2017.

[11] Manos Antonakakis, Tim April, Michael Bailey, Matt Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, J. Alex Halderman, Luca Invernizzi, Michalis Kallitsis, Deepak Kumar, Chaz Lever, Zane Ma, Joshua Mason, Damian Menscher, Chad Seaman, Nick Sullivan, Kurt Thomas, and Yi Zhou. Understanding the Mirai botnet. In 26th USENIX Security Symposium (USENIX Security 17), pages 1093–1110, Vancouver, BC, 2017. USENIX Association.

[12] Steven M. Bellovin. A technique for counting NATted hosts. In Proceedings of the ACM SIGCOMM Workshop on Internet Measurement, 2002.

[13] D. J. Bernstein. TCP SYN cookies. http://cr.yp.to/syncookies.html.

[14] Robert Beverly. Yarrp'ing the Internet: Randomized high-speed active topology discovery. In Proceedings of the ACM Internet Measurement Conference, Santa Monica, CA, USA, November 2016. ACM.

[15] Hiawatha Bray. Akamai breaks ties with security expert. https://www.bostonglobe.com/business/2016/09/23/cybercrooks-akamai/qOAhvHoohJcmkxIwg5ChKO/story.html.

[16] CAIDA. Routeviews prefix to AS mappings dataset. https://www.caida.org/data/routing/routeviews-prefix2as.xml.

[17] CAIDA. The UCSD network telescope. https://www.caida.org/projects/network_telescope/.

[18] Ashley Carman. Smart ovens have been turning on overnight and preheating to 400 degrees.
https://www.theverge.com/2019/8/14/20802774/ june-smart-oven-remote-preheat-update-user-error. [19] D. Chourishi, A. Miri, M. Mili´ c, and S. Ismaeel. Role-based multiple controllers for load balancing and security in SDN. In IEEE Canada International Humani- tarian Technology Conference (IHTC), pages 1–4, May 2015. [20] Taejoong Chung, Yabing Liu, David Choffnes, Dave Levin, Bruce MacDowell Maggs, Alan Mislove, and Christo Wilson. Measuring and applying invalid SSL certificates: the silent majority. In Proceedings of the 2016 Internet Measurement Conference, 2016. 132 [21] Taejoong Chung, Yabing Liu, David Choffnes, Dave Levin, Bruce MacDowell Maggs, Alan Mislove, and Christo Wilson. Measuring and applying invalid SSL certificates: The silent majority. In Proceedings of the 2016 Internet Measure- ment Conference, IMC ’16, pages 527–541, New York, NY , USA, 2016. ACM. [22] Cisco. Configuring traffic shaping. cisco.com/c/en/us/td/docs/ switches/datacenter/nexus3000/sw/qos/7x/b_3k_QoS_ Config_7x/b_3k_QoS_Config_7x_chapter_0100.pdf. [23] Cloudflare. What is an IXP. https://www.cloudflare.com/ learning/cdn/glossary/internet-exchange-point-ixp/. [24] Lucian Constantin. Hackers found 47 new vulnerabilities in 23 IoT devices at DEF CON. https://www.csoonline.com/article/3119765/ security/hackers-found-47-new-vulnerabilities-in-23- iot-devices-at-def-con.html. [25] Andrei Costin, Apostolis Zarras, and Aurlien Francillon. Automated dynamic firmware analysis at scale: A case study on embedded web interfaces. In Pro- ceedings of the 11th ACM on Asia Conference on Computer and Communications Security, ASIA CCS ’16, pages 437–448, New York, NY , USA, 2016. ACM. [26] Dahua. Important message from Foscam digital technologies regarding US sales and service. http://foscam.us/products.html/. [27] Alberto Dainotti, Karyn Benson, Alistair King, kc claffy, Michael Kallitsis, Eduard Glatz, and Xenofontas Dimitropoulos. Estimating Internet address space usage through passive measurements. ACM Computer Communication Review, 44(1):42–49, January 2014. [28] P. S. S. de Souza, W. dos Santos Marques, F. D. Rossi, G. da Cunha Rodrigues, and R. N. Calheiros. Performance and accuracy trade-off analysis of techniques for anomaly detection in IoT sensors. In 2017 International Conference on Infor- mation Networking (ICOIN), pages 486–491, Jan 2017. [29] Nicholas DeMarinis and Rodrigo Fonseca. Toward usable network traffic policies for IoT devices in consumer networks. In Proceedings of the 2017 Workshop on Internet of Things Security and Privacy, IoTS&P ’17, pages 43–48, New York, NY , USA, 2017. ACM. [30] T. Dierks and E. Rescorla. The transport layer security (TLS) protocol. RFC 4346, Internet Request For Comments, 2006. [31] Rohan Doshi, Noah Apthorpe, and Nick Feamster. Machine learning DDoS detection for consumer Internet of Things devices. CoRR, abs/1804.04159, 2018. 133 [32] Zakir Durumeric, David Adrian, Ariana Mirian, Michael Bailey, and J. Alex Hal- derman. A search engine backed by Internet-wide scanning. In Proceedings of the ACM Conference on Computer and Communications Security, 2015. [33] Zakir Durumeric, David Adrian, Ariana Mirian, Michael Bailey, and J. Alex Hal- derman. A search engine backed by Internet-wide scanning. In Proceedings of the ACM Conference on Computer and Communications Security, pages 542– 553, Denver, CO, USA, October 2015. ACM. [34] Zakir Durumeric, Eric Wustrow, and J. Alex Halderman. Zmap: Fast Internet- wide scanning and its security applications. 
In Proceedings of the 22Nd USENIX Conference on Security, SEC’13, pages 605–620, Berkeley, CA, USA, 2013. USENIX Association. [35] Dyn. Analysis of October 21 attack. http://dyn.com/blog/dyn- analysis-summary-of-friday-october-21-attack/. [36] Dyn. Dyn analysis summary of Friday October 21 attack. http: //dyn.com/blog/dyn-analysis-summary-of-friday- october-21-attack/. [37] K. Egevang and P. Francis. The IP network address translator (NAT). RFC 1631, Internet Request For Comments, 1994. [38] Tobias Flach, Pavlos Papageorge, Andreas Terzis, Luis Pedrosa, Yuchung Cheng, Tayeb Karim, Ethan Katz-Bassett, and Ramesh Govindan. An Internet-wide anal- ysis of traffic policing. In Proceedings of the ACM SIGCOMM Conference, pages 468–482, Floranopolis, Brazil, 2016. ACM. [39] J. Francois, I. Aib, and R. Boutaba. FireCol: A collaborative protection net- work for the detection of flooding DDoS attacks. IEEE/ACM Transactions on Networking, 20(6):1828–1841, Dec 2012. [40] Gartner. Gartner says 5.8 billion enterprise and automotive IoT endpoints will be in use in 2020. https://www.gartner.com/en/newsroom/ press-releases/2019-08-29-gartner-says-5-8-billion- enterprise-and-automotive-io. [41] Gartner. IoT installed base forcast. https://www.statista.com/ statistics/370350/internet-of-things-installed-base- by-category/. 134 [42] Pascal Geenens. Assessing the threat the Reaper botnet poses to the Inter- net—what we know now. https://arstechnica.com/information- technology/2017/10/assessing-the-threat-the-reaper- botnet-poses-to-the-internet-what-we-know-now/. [43] Pascal Geenens. Hajime: Analysis of a decentralized Internet worm for IoT devices. https://security.rapiditynetworks.com/ publications/2016-10-16/hajime.pdf. [44] Pascal Geenens. Hajime – sophisticated, flexible, thoughtfully designed and future-proof.https://blog.radware.com/security/2017/04/ hajime-futureproof-botnet/. [45] Thomer M. Gil and Massimiliano Poletto. MULTOPS: A data-structure for band- width attack detection. In Proceedings of the 10th Conference on USENIX Secu- rity Symposium, SSYM, Berkeley, CA, USA, 2001. USENIX Association. [46] B. Gleeson, A. Lin, J. Heinanen, Telia Finland, G. Armitage, and A. Malis. A framework for IP based virtual private networks. RFC 2764, Internet Request For Comments, 2000. [47] GlobalInfoResearch. IP camera market report. https://goo.gl/254g2M. [48] GlobalInfoResearch. NVR market report. https://goo.gl/sxQRis. [49] Mehmet H. Gunes and Kamil Sarac. Analyzing router responsiveness to active measurement probes. In Proceedings of the 10th International Conference on Passive and Active Network Measurement, PAM ’09, pages 23–32, Berlin, Hei- delberg, 2009. Springer-Verlag. [50] Hang Guo and John Heidemann. 10-day IoT traffic captures. https://ant. isi.edu/datasets/iot/. [51] Hang Guo and John Heidemann. IoT traces from 10 devices we purchased. https://ant.isi.edu/datasets/iot/. [52] Hang Guo and John Heidemann. IoTSTEED source code. https://ant. isi.edu/software/iotsteed/index.html. [53] Hang Guo and John Heidemann. Detecting ICMP rate limiting in the Internet (extended). Technical Report ISI-TR-717, USC/Information Sciences Institute, May 2017. [54] Hang Guo and John Heidemann. Detecting IoT devices in the Internet (extended). Technical report, USC/ISI, 2018. 135 [55] Hang Guo and John Heidemann. Detecting iot devices in the Internet (extended). Technical Report ISI-TR-726, USC/Information Sciences Institute, July 2018. [56] Hang Guo and John Heidemann. IP-based IoT device detection. 
In Proceedings of Workshop on IoT Security and Privacy, 2018. [57] Hang Guo and John Heidemann. IP-based IoT device detection. In Proceedings of the 2nd Workshop on IoT Security and Privacy, aug 2018. [58] Ayyoob Hamza, Hassan Habibi Gharakheili, Theophilus A. Benson, and Vijay Sivaraman. Detecting volumetric attacks on IoT devices via SDN-based moni- toring of MUD activity. In Proceedings of the 2019 ACM Symposium on SDN Research, SOSR ’19, page 36–48, New York, NY , USA, 2019. Association for Computing Machinery. [59] Ayyoob Hamza, Hassan Habibi Gharakheili, and Vijay Sivaraman. Combining MUD policies with SDN for IoT intrusion detection. In Proceedings of the 2018 Workshop on IoT Security and Privacy, IoT S&P ’18, pages 1–7, New York, NY , USA, 2018. ACM. [60] John Heidemann, Yuri Pradkin, Ramesh Govindan, Christos Papadopoulos, Genevieve Bartlett, and Joseph Bannister. Census and survey of the visible Inter- net. In Proceedings of the 8th ACM SIGCOMM Conference on Internet Measure- ment, IMC ’08, pages 169–182, New York, NY , USA, 2008. ACM. [61] Stephen Herwig, Katura Harvey, George Hughey, Richard Roberts, and Dave Levin. Measurement and analysis of Hajime, a peer-to-peer IoT botnet. In Net- work and Distributed System Security Symposium, 2019. [62] Stephen Herwig, Katura Harvey, George Hughey, Richard Roberts, and Dave Levin. Measurement and analysis of Hajime, a peer-to-peer IoT botnet. In Net- work and Distributed System Security Symposium (NDSS), Feb 2019. [63] hping. Hping project. http://www.hping.org/hping3.html. [64] S. Hsu, T. Chen, Y . Chang, S. Chen, H. Chao, T. Lin, and W. Shih. Design a hash-based control mechanism in vSwitch for software-defined networking envi- ronment. In IEEE International Conference on Cluster Computing, pages 498– 499, Sep. 2015. [65] Danny Yuxing Huang, Noah Apthorpe, Gunes Acar, Frank Li, and Nick Feamster. IoT inspector: Crowdsourcing labeled network traffic from smart home devices at scale, 2019. 136 [66] Danny Yuxing Huang, Noah Apthorpe, Gunes Acar, Frank Li, and Nick Feamster. IoT inspector: Crowdsourcing labeled network traffic from smart home devices at scale, 2019. [67] Bradley Huffaker, Daniel Plummer, David Moore, and k claffy. Topology discov- ery by active probing. In Proceedings of the IEEE Symposium on Applications and the Internet, pages 90–96. IEEE, January 2002. [68] internetworldstats.com. Internet world stats. https://www. internetworldstats.com/stats.htm. [69] IPVM. Dahua OEM directory. https://ipvm.com/reports/dahua- oem. [70] Uzair Javaid, Ang Kiang Siang, Muhammad Naveed Aman, and Biplab Sikdar. Mitigating IoT device based DDoS attacks using blockchain. In Proceedings of the 1st Workshop on Cryptocurrencies and Blockchains for Distributed Systems, CryBlock’18, pages 71–76, New York, NY , USA, 2018. ACM. [71] Q. Jia, K. Sun, and A. Stavrou. MOTAG: Moving target defense against Internet denial of service attacks. In 22nd International Conference on Computer Com- munication and Networks (ICCCN), pages 1–9, July 2013. [72] K. Kalkan, G. G¨ ur, and F. Alag¨ oz. SDNScore: a statistical defense mechanism against DDoS attacks in SDN environment. In IEEE Symposium on Computers and Communications (ISCC), pages 669–675, July 2017. [73] Srikanth Kandula, Dina Katabi, Matthias Jacob, and Arthur Berger. Botz-4-sale: Surviving organized DDoS attacks that mimic flash crowds. In Proceedings of the 2Nd Conference on Symposium on Networked Systems Design & Implemen- tation - Volume 2, NSDI’05, pages 287–300, Berkeley, CA, USA, 2005. 
USENIX Association. [74] S. M. Khattab, C. Sangpachatanaruk, R. Melhem, D. l Mosse, and T. Znati. Proac- tive server roaming for mitigating denial-of-service attacks. In International Con- ference on Information Technology: Research and Education. ITRE, pages 286– 290, Aug 2003. [75] Brian Krebs. Krebs hit with DDoS. https://krebsonsecurity.com/ 2016/09/krebsonsecurity-hit-with-record-ddos/. [76] Brian Krebs. KrebsOnSecurity hit with record DDoS. https: //krebsonsecurity.com/2016/09/krebsonsecurity-hit- with-record-ddos/. 137 [77] Paul Krzyzanowski. Understanding autonomous systems. https://www.cs. rutgers.edu/ ˜ pxk/352/notes/autonomous_systems.html. [78] John Kurkowski. python library tldextract. https://pypi.python.org/ pypi/tldextract. [79] Franck Le, Mudhakar Srivatsa, and Dinesh Verma. Unearthing and exploiting latent semantics behind DNS domains for deep network traffic analysis. In Work- shop on AI for Internet of Things, 2019. [80] E. Lear, R. Droms, and D. Romascanu. Manufacturer usage description specifi- cation. https://tools.ietf.org/html/rfc8520. [81] Youndo Lee and Neil Spring. Identifying and aggregating homogeneous IPv4 /24 blocks with hobbit. In Proceedings of the ACM Internet Measurement Con- ference, Santa Monica, CA, USA, November 2016. ACM. [82] Derek Leonard and Dmitri Loguinov. Demystifying service discovery: Imple- menting an internet-wide scanner. In Proceedings of the ACM Internet Mea- surement Conference, pages 109–123, Melbourne, Victoria, Australia, November 2010. ACM. [83] Derek Leonard and Dmitri Loguinov. Demystifying service discovery: Imple- menting an Internet-wide scanner. In Proceedings of the ACM Internet Mea- surement Conference, pages 109–123, Melbourne, Victoria, Australia, November 2010. ACM. [84] S. Lim, S. Yang, Y . Kim, S. Yang, and H. Kim. Controller scheduling for contin- ued SDN operation under DDoS attacks. Electronics Letters, 51(16):1259–1261, 2015. [85] Peter Loshin. Details emerging on Dyn DDoS attack. http:// searchsecurity.techtarget.com/news/450401962/Details- emerging-on-Dyn-DNS-DDoS-attack-Mirai-IoT-botnet, 2016. [86] Peter Loshin. Details emerging on Dyn DNS DDoS attack, Mirai IoT botnet. blog http://searchsecurity.techtarget.com/news/450401962/ Details-emerging-on-Dyn-DNS-DDoS-attack-Mirai-IoT- botnet, October 2016. [87] Allot Ltd. New research shows that consumers are willing to pay for IoT services. https://www.allot.com/blog/new-research-shows- that-consumers-are-willing-to-pay-for-iot-services/. 138 [88] Matthew Luckie, Amogh Dhamdhere, Bradley Huffaker, David Cla rk, and kc claffy. bdrmap: Inference of borders between IP networks. In Proceedings of the ACM Internet Measurement Conference, Santa Monica, CA, USA, November 2016. ACM. [89] macaddress.io. MAC address vendor lookup library.https://macaddress. io/. [90] Harsha V . Madhyastha, Tomas Isdal, Michael Piat ek, Colin Dixon, Thomas Anderson, Arvind Krishnamurthy, and Arun Venka taramani. iPlane: An infor- mation plane for distributed services. In Proceedings of the 7th USENIX Sympo- sium on Operating Systems Design and Implementation, pages 367–380, Seattle, WA, USA, November 2006. USENIX. [91] Ratul Mahajan, Steven M. Bellovin, Sally Floyd, John Ioannidis, Vern Paxson, and Scott Shenker. Controlling high bandwidth aggregates in the network. SIG- COMM Comput. Commun. Rev., 32(3):62–73, July 2002. [92] Alexander Marder and Jonathan M. Smith. MAP-IT: Multipass accurate passive inferences from traceroute. 
In Proceedings of the ACM Internet Measurement Conference, Santa Monica, CA, USA, November 2016. ACM. [93] Y . Meidan, M. Bohadana, Y . Mathov, Y . Mirsky, A. Shabtai, D. Breitenbacher, and Y . Elovici. N-BaIoT—network-based detection of IoT botnet attacks using deep autoencoders. IEEE Pervasive Computing, 17(3):12–22, Jul 2018. [94] Yair Meidan, Michael Bohadana, Asaf Shabtai, Juan David Guarnizo, Martin Ochoa, Nils Ole Tippenhauer, and Yuval Elovici. ProfilIoT: A machine learn- ing approach for IoT device identification based on network traffic analysis. In Proceedings of SAC, 2017. [95] Yair Meidan, Michael Bohadana, Asaf Shabtai, Martin Ochoa, Nils Ole Tippen- hauer, Juan Davis Guarnizo, and Yuval Elovici. Detection of unauthorized IoT devices using machine learning techniques, 2017. [96] Trend Micro. Hide N Seek botnet uses Peer-to-Peer infrastructure to com- promise IoT devices. https://www.trendmicro.com/vinfo/us/ security/news/internet-of-things/-hide-n-seek-botnet- uses-peer-to-peer-infrastructure-to-compromise-iot- devices. [97] A. Mirian, Z. Ma, D. Adrian, M. Tischer, T. Chuenchujit, T. Yardley, R. Berthier, J. Mason, Z. Durumeric, J. A. Halderman, and M. Bailey. An Internet-wide view of ICS devices. In Annual Conference on Privacy, Security and Trust (PST), 2016. 139 [98] A. Mirian, Z. Ma, D. Adrian, M. Tischer, T. Chuenchujit, T. Yardley, R. Berthier, J. Mason, Z. Durumeric, J. A. Halderman, and M. Bailey. An Internet-wide view of ICS devices. In Annual Conference on Privacy, Security and Trust (PST), Dec 2016. [99] J. Mirkovic, G. Prier, and P. Reiher. Attacking DDoS at the source. In 10th IEEE International Conference on Network Protocols, pages 312–321, Nov 2002. [100] Kevin Monahan, Cynthia McFadden, and Didi Martinez. ’online and vul- nerable’: Experts find nearly three dozen u.s. voting systems connected to internet. https://www.nbcnews.com/politics/elections/ online-vulnerable-experts-find-nearly-three-dozen-u- s-voting-n1112436. [101] Motherboard. 1.5 million hijacked cameras make an unprecedented bot- net. https://motherboard.vice.com/en_us/article/8q8dab/ 15-million-connected-cameras-ddos-botnet-brian-krebs. [102] Motherboard. 15 million hijacked cameras make an unprecedented bot- net. https://motherboard.vice.com/en_us/article/8q8dab/ 15-million-connected-cameras-ddos-botnet-brian-krebs. [103] G. C. M. Moura, C. Ga˜ n´ an, Q. Lone, P. Poursaied, H. Asghari, and M. van Eeten. How dynamic is the ISPs address space? towards Internet-wide DHCP churn estimation. In 2015 IFIP Networking Conference (IFIP Networking), pages 1–9, May 2015. [104] Mozilla. Public suffix list. https://www.publicsuffix.org/. [105] Mozilla. Public suffix list from Mozilla foundation. https://www. publicsuffix.org/. [106] Moritz M¨ uller, Giovane C. M. Moura, Ricardo de O. Schmidt, and John Hei- demann. Recursives in the wild: Engineering authoritative DNS servers. In Proceedings of ACM Internet Measurement Conference, 2017. [107] M. M. Najafabadi, T. M. Khoshgoftaar, C. Calvert, and C. Kemp. User behavior anomaly detection for application layer DDoS attacks. In IEEE International Conference on Information Reuse and Integration (IRI), pages 154–161, Aug 2017. [108] Lily Hay Newman. Medical devices are the next security night- mare. https://www.wired.com/2017/03/medical-devices- next-security-nightmare/. 140 [109] Thien Duc Nguyen, Samuel Marchal, Markus Miettinen, Minh Hoang Dang, N. Asokan, and AhmadReza Sadeghi. D¨ ıot: A crowdsourced self-learning approach for detecting compromised IoT devices. 
CoRR, abs/1804.07474, 2018. [110] No-IP. Domain names provided by No-IP. http://www.noip.com/ support/faq/free-dynamic-dns-domains/. [111] Openwrt. Openwrt project. https://openwrt.org/. [112] Openwrt. Universal Plug’n’Play and NAT-PMP on OpenWrt. https:// openwrt.org/docs/guide-user/firewall/upnp/upnp_setup/. [113] OVH. DDoS didn’t break V AC. https://www.ovh.com/us/news/ articles/a2367.the-ddos-that-didnt-break-the-camels- vac. [114] OVH. OVH news - the DDoS that didn’t break the camel’s V AC. https://www.ovh.com/us/news/articles/a2367.the-ddos- that-didnt-break-the-camels-vac. [115] M. Ozc ¸elik, N. Chalabianloo, and G. Gur. Software-defined edge defense against IoT-based DDoS. In 2017 IEEE International Conference on Computer and Information Technology (CIT), pages 308–313, Aug 2017. [116] Ramakrishna Padmanabhan, Amogh Dhamdhere, Emile Aben, kc cla ffy, and Neil Spring. Reasons dynamic addresses change. In Proceedings of the ACM Internet Measurement Conference, Santa Monica, CA, USA, November 2016. ACM. [117] A. F. M. Piedrahita, S. Rueda, D. M. F. Mattos, and O. C. M. B. Duarte. Flowfence: a denial of service defense system for software defined networking. In Global Information Infrastructure and Networking Symposium (GIIS), pages 1–6, Oct 2015. [118] Lin Quan, John Heidemann, and Yuri Pradkin. Trinocular: Understanding Inter- net reliability through adaptive probing. In Proceedings of the ACM SIGCOMM Conference, pages 255–266, Hong Kong, China, August 2013. ACM. [119] Lin Quan, John Heidemann, and Yuri Pradkin. When the Internet sleeps: Corre- lating diurnal networks with external factors. In Proceedings of the ACM Inter- net Measurement Conference, pages 87–100, Vancouver, BC, Canada, November 2014. ACM. 141 [120] Jothi Rangasamy, Douglas Stebila, Colin Boyd, and Juan Gonz´ alez Nieto. An integrated approach to cryptographic mitigation of denial-of-service attacks. In Proceedings of the 6th ACM Symposium on Information, Computer and Commu- nications Security, ASIACCS ’11, pages 114–123, New York, NY , USA, 2011. ACM. [121] S. Ranjan, R. Swaminathan, M. Uysal, A. Nucci, and E. Knightly. DDoS-Shield: DDoS-resilient scheduling to counter application layer attacks. IEEE/ACM Transactions on Networking, 17(1):26–39, Feb 2009. [122] B. Rashidi, C. Fung, and E. Bertino. A collaborative DDoS defence framework using network function virtualization. IEEE Transactions on Information Foren- sics and Security, 12(10):2483–2497, Oct 2017. [123] R. Ravaioli, G. Urvoy-Keller, and C. Barakat. Characterizing ICMP rate limi- tation on routers. In 2015 IEEE International Conference on Communications (ICC), pages 6043–6049, June 2015. [124] TREND MICRO RESEARCH. MQTT and CoAP: Security and pri- vacy issues in IoT and IIoT communication protocols. https: //www.trendmicro.com/vinfo/us/security/news/internet- of-things/mqtt-and-coap-security-and-privacy-issues- in-iot-and-iiot-communication-protocols. [125] Philipp Richter, Florian Wohlfart, Narseo Vallina-Rodriguez, Mark Allman, Randy Bush, Anja Feldmann, Christian Kreibich, Nicholas Weaver, and Vern Paxson. A multi-perspective analysis of carrier-grade NAT deployment. In Proceedings of the ACM Internet Measurement Conference, Santa Monica, CA, USA, November 2016. ACM. [126] JOS ´ E MANUEL ROMERO. Hackers broadcast live stream of police camera at Podemos leaders’ home. https://elpais.com/elpais/2019/04/08/ inenglish/1554706393_909358.html. [127] Bruce Schneier. The Internet of Things is wildly insecure—and often unpatch- able. 
https://www.schneier.com/essays/archives/2014/01/ the_internet_of_thin.html. [128] Aaron Schulman and Neil Spring. Pingin’ in the rain. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, IMC ’11, pages 19–28, New York, NY , USA, 2011. ACM. [129] SCIP. Belkin Wemo switch communications analysis. https://www.scip. ch/en/?labs.20160218. 142 [130] Shodan. Shodan search engine front page. https://www.shodan.io/. [131] Shodan. Shodan search engine front page. https://www.shodan.io/. [132] Sandra Siby, Rajib Ranjan Maiti, and Nils Ole Tippenhauer. IoTscanner: Detect- ing privacy threats in IoT neighborhoods. In Workshop on IoT Privacy, Trust, and Security, 2017. [133] A. Sivanathan, H. H. Gharakheili, F. Loi, A. Radford, C. Wijenayake, A. Vish- wanath, and V . Sivaraman. Classifying IoT devices in smart environments using network traffic characteristics. IEEE Transactions on Mobile Computing, 18(8):1745–1759, Aug 2019. [134] Arunan Sivanathan, Daniel Sherratt, Hassan Habibi Gharakheili, Adam Radford, Chamith Wijenayake, Arun Vishwanath, and Vijay Sivaraman. Characterizing and classifying IoT traffic in smart cities and campuses. In Workshop on Smart Cities and Urban Computing, 2017. [135] Arunan Sivanathan, Daniel Sherratt, Hassan Habibi Gharakheili, Adam Radford, Chamith Wijenayake, Arun Vishwanath, and Vijay Sivaraman. Characterizing and classifying IoT traffic in smart cities and campuses. In Proceedings of the IEEE Infocom Workshop on Smart Cities and Urban Computing, pages 559–564, May 2017. [136] Robin Sommer and Vern Paxson. Outside the closed world: On using machine learning for network intrusion detection. In Proceedings of the 2010 IEEE Sym- posium on Security and Privacy, SP ’10, pages 305–316, Washington, DC, USA, 2010. IEEE Computer Society. [137] Neil Spring, Ratul Mahajan, and David Wetherall. Measuring ISP topologies with Rocketfuel. In Proceedings of the ACM SIGCOMM Conference, pages 133–145, Pittsburgh, Pennsylvania, USA, August 2002. ACM. [138] Tom Spring. Smart lock turns out to be not so smart, or secure. https://threatpost.com/smart-lock-turns-out-to-be- not-so-smart-or-secure/146091/. [139] ThousandEyes. What is an ISP? https://www.thousandeyes.com/ learning/glossary/isp-internet-service-provider. [140] S. Torabi, E. Bou-Harb, C. Assi, M. Galluscio, A. Boukhtouta, and M. Debbabi. Inferring, characterizing, and investigating Internet-scale malicious IoT device activities: A network telescope perspective. 2018. 143 [141] S. Torabi, E. Bou-Harb, C. Assi, M. Galluscio, A. Boukhtouta, and M. Debbabi. Inferring, characterizing, and investigating Internet-scale malicious IoT device activities: A network telescope perspective. In 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pages 562–573, June 2018. [142] USC/LANDER. FRGP (www.frgp.net) Continuous Flow Dataset, traces taken 2015-05-10 to 2015-05-19. provided by the USC/LANDER project (http://www.isi.edu/ant/lander). [143] USC/LANDER project. Internet address census dataset, predict id usc-lander/internet_address_census_it71w-20160803/ rev5468 and usc-lander/internet_address_census_it70w- 20160602/rev5404 and usc-lander/internet_address_ census_it56j-20130917/rev3704 and usc-lander/internet_ address_census_it57j-20131127/rev3745 and usc-lander/ internet_address_census_it58j-20140122/rev3912. web page http://www.isi.edu/ant/lander. [144] USC/LANDER project. 
Internet address survey dataset, predict id USC-LANDER//internet_address_survey_reprobing_it70w- 20160602/rev5417 and usc-lander/internet_address_ survey_reprobing_it71w-20160803/rev5462. web page http://www.isi.edu/ant/lander. [145] Haining Wang, Danlu Zhang, and Kang G. Shin. Detecting SYN flooding attacks. In Joint Conference of the IEEE Computer and Communications Societies, vol- ume 3, pages 1530–1539, June 2002. [146] R. Wang, Z. Jia, and L. Ju. An entropy-based distributed DDoS detection mech- anism in software-defined networking. In IEEE Trustcom/BigDataSE/ISPA, vol- ume 1, pages 310–317, Aug 2015. [147] Dean Webe. Why it’s so hard to implement IoT security. https: //www.securityweek.com/why-its-so-hard-implement- iot-security. [148] G. Wicherski, F. Weingarten, and U. Meyer. IP agnostic real-time traffic filtering and host identification using TCP timestamps. In IEEE Conference on Local Computer Networks, 2013. [149] Wikipedia. Autonomous system (internet).https://en.wikipedia.org/ wiki/Autonomous_system_(Internet). 144 [150] P. Wood, C. Gutierrez, and S. Bagchi. Denial of Service Elusion (DoSE): Keep- ing clients connected for less. In IEEE 34th Symposium on Reliable Distributed Systems (SRDS), pages 94–103, Sep. 2015. [151] Corp Xerox. 5 reasons why IoT security is difficult. https://www.xerox. com/en-us/insights/iot-security. [152] A. Zaalouk, R. Khondoker, R. Marx, and K. Bayarou. OrchSec: an orchestrator- based architecture for enhancing network-security using network monitoring and SDN control functions. In IEEE Network Operations and Management Sympo- sium (NOMS), pages 1–9, May 2014. [153] Sebastian Zander, Lachlan L. H. Andrew, and Grenville Armitage. Capturing ghosts: Predicting the used IPv4 space by inferring unobserved addresses. In Proceedings of the ACM Internet Measurement Conference, pages 319–332, Van- couver, BC, Canada, November 2014. ACM. [154] Sarah Zhang. Creepy website is streaming from 73,000 private secu- rity cameras. https://gizmodo.com/a-creepy-website-is- streaming-from-73-000-private-secur-1655653510. [155] ZMap. ZMap 443 HTTPS SSL full IPv4 datasets. https://censys.io/ data/443-https-ssl_3-full_ipv4. 145