AI-Enabled DDoS Attack Detection in IoT Systems
by
Arvin Hekmati
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
May 2024
Dedication
To my beloved family, for their constant support.
Acknowledgements
This dissertation stands as a testament to the collective wisdom, encouragement, and support I have been
fortunate enough to receive during my academic journey. At the forefront of this endeavor has been my
advisor, Professor Bhaskar Krishnamachari, whose guidance has been nothing short of transformative. His
profound insights, unwavering support, and remarkable patience have not only shaped this dissertation
but have also been pivotal in my growth as a researcher. The intellectual environment fostered by
Professor Krishnamachari has been both inspiring and nurturing.
I extend my profound gratitude to my dissertation committee members, Professors Cauligi
Raghavendra and Aiichiro Nakano, whose expertise and constructive critiques have greatly enhanced the
depth and breadth of this research. I am equally grateful to Prof. Cyrus Shahabi, Prof. Maja Mataric, Prof.
Mohammad Rostami, Prof. Murali Annavaram, and Prof. Mitul Luhar for their invaluable contributions
during my Qualification Exams and Thesis Proposal. Their rigorous assessments and thoughtful feedback
have immensely contributed to the refinement of this work.
My gratitude also goes out to my fellow researchers and colleagues, especially my lab mates at the
Autonomous Networks Research Group (ANRG), whose constructive feedback, stimulating discussions,
and friendship have been fundamental in the evolution of the ideas encapsulated in this dissertation. Their
unique perspectives and collective wisdom have been crucial in navigating the complexities of this
research journey.
The support and resources provided by the University of Southern California have been instrumental
in facilitating my research endeavors. I am particularly thankful for the financial assistance I received from
the Defense Advanced Research Projects Agency (DARPA, Contract Number HR001120C0160), which has
been essential in the progression and development of my work.
I owe an immeasurable debt of gratitude to my family, whose endless love, encouragement, and faith in
me have been the bedrock of my resilience and determination. Their constant support has been my source
of strength and inspiration throughout this journey.
I also wish to extend my heartfelt thanks to all participants and contributors who dedicated their time
and insights to this research. Their involvement has been crucial to the success of this project.
In closing, I am thankful to everyone who has played a part in this academic endeavor. Their collective
contributions have been indispensable to the completion of this dissertation. Their support has not only
contributed to my growth as a scholar but has also enriched this work in countless ways.
Arvin Hekmati
University of Southern California
May 2024
Table of Contents
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Chapter 2: Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Internet of Things . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Security Vulnerabilities in IoT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Distributed Denial of Service (DDoS) Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 DDoS Attacks Leveraging IoT Systems . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Machine Learning Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.1 Multi-Layer Perceptrons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.2 Convolutional Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.3 Long Short-Term Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.4 Autoencoders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.5 Transformers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.6 Graph Convolutional Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.7 Large Language Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Chapter 3: Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.1 Traditional Methods For DDoS Attack Detection . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Machine Learning Based Methods For DDoS Attack Detection . . . . . . . . . . . . . . . . 20
3.2.1 Flooding DDoS Attack Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.2 Slow-rate and Stealthy DDoS Attacks Detection . . . . . . . . . . . . . . . . . . . . 21
3.2.3 Application-Layer-Based DDoS attack on IoT . . . . . . . . . . . . . . . . . . . . . 22
3.2.4 GCN Based Methods in DDoS Detection . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.5 LLM Based DDoS detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Differentiating Characteristics of Our Work . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Chapter 4: Tunable Futuristic DDoS Attacks In IoT Systems . . . . . . . . . . . . . . . . . . . . . . 27
4.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Modeling IoT Network Traffic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2.1 Addressing Data Bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2.2 Dataset Augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.2.3 Large-Scale Urban IoT Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.3 Tunable Futuristic DDoS Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.3.1 DDoS Traffic Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.3.2 DDoS Attack Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.4 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.4.1 Large Scale Urban IoT Dataset Evaluation . . . . . . . . . . . . . . . . . . . . . . . 36
4.4.2 Tunable Futuristic DDoS Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Chapter 5: Correlation-Aware Neural Networks for DDoS Attack Detection In IoT Systems . . . . . 42
5.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.2 Correlation-Aware Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2.2 Neural Network Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.3 IoT Nodes With Constrained Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.4 Real Test-Bed Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.4.1 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.5.1 Preparing General Training Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.5.2 Experiment Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.5.3 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.5.4 Binary Classification Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.5.5 DDoS Attack Prediction Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.5.6 DDoS Attacks With Different Tunable Parameters . . . . . . . . . . . . . . . . . . 69
5.5.7 Real Test-Bed Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.5.8 IoT Nodes with Constrained Resources . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.5.9 DDoS Detection Performance Analysis over All Nodes In The Dataset . . . . . . . 75
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Chapter 6: Graph Convolutional Networks for DDoS Attack Detection in a Lossy Network . . . . . 78
6.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.2 GCN-Based Defense Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.2.1 Defining The Graph Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.2.2 Defining The Graph Edges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.2.3 Model Lossy Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.2.4 Define GCN Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.3 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.3.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.3.2 Attack Mechanism Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.3.3 Detection Mechanism Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.3.4 Results Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Chapter 7: DDoS Detection Reasoning With Large Language Models . . . . . . . . . . . . . . . . . 103
7.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.2 LLM-Based DDoS Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2.1 Few-shot LLM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2.2 Fine-tuning LLM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.2.3 General Prompt Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.2.4 Neural Network Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.3 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.3.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.3.2 Simulation Results for CIC-IDS2017 Dataset . . . . . . . . . . . . . . . . . . . . . . 109
7.3.3 Simulation Results For Urban IoT DDoS Dataset . . . . . . . . . . . . . . . . . . . . 112
7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Chapter 8: Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.1 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
List of Tables
3.1 Selected papers with ML-based methods for detecting DDoS attacks in IoT systems . . . . 26
4.1 Sample data points in the introduced large-scale enhanced dataset containing the activity
and packet volume of urban IoT devices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 Related papers with IoT datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.1 Sample data points in general training dataset used in correlation aware architectures . . . 56
6.1 Sample data points in the training dataset used in GCN-based detection model. . . . . . . . 90
7.1 Evaluating the LLM capability for DDoS detection reasoning/explanation using the
CIC-IDS2017 dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
7.2 Evaluating the LLM capability for DDoS detection reasoning/explanation using the urban
IoT dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
List of Figures
1.1 Denial of Service (DoS) Attack Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Distributed Denial of Service (DDoS) Attack Schematic . . . . . . . . . . . . . . . . . . . . 3
1.3 AI-Enabled DDoS Attack Detection in IoT Systems Schematic . . . . . . . . . . . . . . . . 4
4.1 Benign/attack packet volume (figure left) and mean flow inter-arrival time (figure right)
probability density of real DDoS attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.2 Large-scale urban IoT dataset active nodes percentage vs time . . . . . . . . . . . . . . . . 37
4.3 Large-scale urban IoT dataset nodes activity mean correlation vs distances . . . . . . . . . 38
4.4 Large-scale urban IoT dataset histograms of mean activity and inactivity time per node . . 39
4.5 Real packet volume distribution vs truncated Cauchy distribution . . . . . . . . . . . . . . 40
5.1 Neural network models deployed in correlation aware architectures . . . . . . . . . . . . . 50
5.2 Using all nodes vs selected nodes for sharing information in correlation-aware architectures 51
5.3 Structure of the LSTM/MM-WC model for IoT DDoS Detection . . . . . . . . . . . . . . . . 54
5.4 Real testbed network architecture and information flow for evaluating correlation aware
architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.5 Comparing the percentage of active nodes vs time for the whole urban IoT dataset and 50
random IoT devices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.6 Comparing the probability density function (PDF) of nodes' mean activity/inactivity for day
and night for the whole urban IoT dataset and 50 random IoT devices. . . . . . . . . . . . . 59
5.7 Compare different neural network models’ performance by using the multiple models with
correlation (MM-WC) architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.8 Compare different neural network models’ performance by using the multiple models
without correlation (MM-NC) architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.9 Compare different neural network models’ performance by using the one model with
correlation (OM-WC) architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.10 Compare different neural network models' performance by using the one model without
correlation (OM-NC) architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.11 Compare different architectures’ performance by using the LSTM neural network model . 66
5.12 Compare different architectures’ performance by using the TRF neural network model . . 67
5.13 Attack prediction vs time on the testing dataset for the case that attack starts at 12 pm for 8
hours over all IoT nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.14 Evaluating the performance of the LSTM/MM-WC model by considering various ratios of
the nodes under attack (ar) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.15 Evaluate the LSTM/MM-WC model performance by varying three tunable DDoS attack
parameters k1, k2, and k3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.16 Evaluating the proposed DDoS detection mechanism in a real testbed of Raspberry Pi (RP)
cluster. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.17 Evaluating the performance of the LSTM/MM-WC model in a real test-bed of Raspberry Pis 73
5.18 Feature importance analysis based on the LSTM/MM-WC model for a specific node . . . . 74
5.19 Compare the performance of different methods for selecting a subset of the nodes for
training the LSTM based correlation aware models . . . . . . . . . . . . . . . . . . . . . . . 75
5.20 Compare the performance of LSTM/MM-WC DDoS detection methods over all nodes in the
urban IoT dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.1 Peer to peer (p2p) graph topology for GCN-based DDoS attack detection in IoT systems. . 83
6.2 Network graph topology for GCN-based DDoS attack detection in IoT systems. . . . . . . 84
6.3 Hybrid graph topology for GCN-based DDoS attack detection in IoT systems. . . . . . . . 85
6.4 Undirected graph topology for GCN-based DDoS attack detection in IoT systems. . . . . 86
6.5 Directed node-to-neighbors graph topology for GCN-based DDoS attack detection in IoT
systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.6 Directed neighbors-to-node graph topology for GCN-based DDoS attack detection in IoT
systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.7 Modeling lossy connections in the graphs for GCN-based DDoS attack detection in IoT
systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.8 Graph Convolutional Network (GCN) model schematic . . . . . . . . . . . . . . . . . . . . 88
6.9 Compare the performance of GCN-based DDoS detection model using the Peer-to-Peer
topology directed graph with node-to-neighbors edges for graph construction . . . . . . . 94
6.10 Compare the performance of GCN-based DDoS detection model using the Peer-to-Peer
topology directed graph with neighbors-to-node edges for graph construction . . . . . . . 95
6.11 Compare the performance of GCN-based DDoS detection model using the Peer-to-Peer
topology undirected graph for graph construction . . . . . . . . . . . . . . . . . . . . . . . 97
6.12 Compare the performance of GCN-based DDoS detection model using the Hybrid/Network
topology undirected graph for graph construction . . . . . . . . . . . . . . . . . . . . . . . 98
6.13 Compare the performance of GCN-based DDoS detection model using the Peer-to-Peer
topology undirected graph for graph construction and evaluating the number of edges per
node. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.1 Performance evaluation of the few-shot/fine-tuned LLM and MLP methods on the
CIC-IDS2017 dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.2 Performance evaluation of the few-shot LLM and MLP models using urban IoT dataset. . . 113
7.3 Performance evaluation of the fine-tuned LLM and MLP models using urban IoT dataset. . 114
Abstract
We develop AI-enabled mechanisms for detecting Distributed Denial of Service (DDoS) attacks in Internet
of Things (IoT) systems. We introduce a novel, tunable DDoS attack model that emulates benign IoT device
behavior. We investigate these futuristic DDoS attacks that use large numbers of IoT devices and
camouflage their attack by having each node transmit at a volume typical of benign traffic. We propose
correlation-aware, learning-based frameworks that leverage IoT node correlation data for enhanced
detection accuracy. We extensively analyze the proposed architectures by evaluating various neural
network models. We observe that long short-term memory (LSTM) and a transformer-based model, in
conjunction with the architectures that use correlation information of the IoT nodes, provide higher
detection performance than the other models. We evaluate our findings through practical implementation
on a Raspberry Pi-based testbed. In order to address the challenge of leveraging massive IoT device arrays
for DDoS attacks, we introduce heuristic solutions for selective correlation information sharing among IoT
devices. To overcome the challenge of fixed input limitations in conventional machine learning, we
propose a model based on the Graph Convolutional Network (GCN) to manage incomplete data in IoT
devices caused by network losses. We introduce and evaluate various IoT device graph topologies. Our
simulations reveal that the graph topology employing correlation-based peer-to-peer undirected edges,
along with network topology edges, achieves the highest detection performance. Finally, we explore the
application of Large Language Models (LLMs) for explaining the detection rationale, demonstrating the
potential of LLMs to provide insightful detection reasoning.
Chapter 1
Introduction
The emergence of the Internet of Things (IoT) represents a paradigm shift in the digital field, driven by the
rapid advancement of technology [1]. IoT devices, characterized by their ability to connect and exchange
data with the internet and other devices, are increasingly becoming an integral part of our daily lives,
promising to outpace mobile devices in terms of pervasiveness [2]. According to recent projections, the
global network of IoT devices is expected to encompass approximately 29 billion units by the end of this
decade, underscoring the monumental scale of this digital ecosystem [3]. However, this explosive growth
in IoT device deployment is paralleled by a corresponding escalation in security vulnerabilities, posing
significant challenges to ensuring the cybersecurity of these devices[4]. Investigations conducted by
leading security research teams have revealed a concerning array of vulnerabilities across a sample of
popular IoT devices, including, but not limited to, the absence of transport encryption, the prevalence of
insecure software/firmware, and susceptibility to cross-site scripting attacks [5, 6]. These findings
highlight the urgent need for robust cybersecurity measures to keep pace with the rapid expansion of IoT
networks, thereby safeguarding the integrity and security of digital infrastructures [7]. In light of these
concerns, this work addresses one of the most dangerous types of attacks involving IoT systems, namely
distributed denial of service (DDoS) attacks [8, 9].
Figure 1.1: Denial of Service (DoS) Attack Schematic
Denial of Service (DoS) attacks represent a formidable challenge in cybersecurity, aiming to disrupt the
normal functioning of a victim server. Figure 1.1 presents a schematic of this attack. These attacks are
executed with the intention of denying legitimate users access to the server’s resources, effectively
diminishing its availability to serve its intended users [10]. DoS attacks come in various forms, including
User Datagram Protocol (UDP) flood attacks, where the attacker overwhelms the server with traffic
volumes beyond its processing capacity [11], and Synchronize (SYN) flood attacks, characterized by the
attacker initiating TCP connection requests with spoofed source addresses but never sending back the
acknowledgment [12]. These tactics exemplify the diverse methodologies employed to compromise the
availability and reliability of targeted servers.
As we can see in figure 1.2, Distributed Denial of Service (DDoS) attacks leverage multiple
compromised devices, often referred to as zombies, to amplify the scale and impact of DoS attacks on
victim servers [13, 14]. By exploiting the vast number of IoT devices connected to the internet—coupled
with their inherent security vulnerabilities—attackers can orchestrate widespread disruptions, significantly
magnifying the damaging effects of DoS attacks [15]. The susceptibility of IoT devices to compromise,
which renders them readily available participants in DDoS attacks, underscores the criticality of advancing
our understanding and mitigation strategies against such cyber threats. It is vitally important that we
continue to develop and implement comprehensive security protocols to safeguard the integrity of IoT
devices and to detect DDoS attacks emanating from them.
Figure 1.2: Distributed Denial of Service (DDoS) Attack Schematic
DoS and DDoS attacks can be carried out using protocols at different layers of the Open Systems
Interconnection (OSI) model [16]. DDoS attacks that use the transport layer are the most common type,
where the attacker sends as many packets as possible to the victim server through UDP flooding, SYN
flooding, and similar techniques [17]. Other DDoS attacks use the application layer of the OSI model:
attackers disrupt the victim server by tying up each of its threads with slow requests. Such slow-rate
DDoS attacks send data to the victim server at a very slow rate, yet fast enough to prevent the connection
from timing out [18, 19].
Among all the various types of cybersecurity threats in the DDoS field, the Mirai botnet, first identified
in August 2016, has emerged as a notoriously effective mechanism for IoT-based DDoS attacks,
exemplifying the severe vulnerabilities inherent in IoT devices. This botnet, leveraging a vast network of
compromised IoT devices, has demonstrated the capacity to inundate victim servers with unprecedented
volumes of network traffic, peaking in the order of Terabits per second (Tbps). Such volumetric attacks
have not only crippled target servers but have also severely disrupted services for millions of end-users
worldwide [20, 21]. The Mirai botnet’s operation involves exploiting weak security measures—such as
default usernames and passwords—on IoT devices, thereby co-opting them into a botnet army capable of
executing coordinated and large-scale DDoS attacks. Following the Mirai botnet, we witnessed the arrival
of the Reaper botnet, a sophisticated variant that further refined the attack strategy by infecting an
estimated 2.7 million IoT devices [22]. The Reaper botnet leveraged a more diverse set of vulnerabilities and
exhibited advanced targeting capabilities, underscoring the evolving sophistication of threats facing IoT
infrastructures.
Figure 1.3: AI-Enabled DDoS Attack Detection in IoT Systems Schematic
In light of the escalating and increasingly sophisticated DDoS attacks, our research delves into the
application of machine learning (ML) techniques as a pivotal strategy for detecting Distributed Denial of
Service (DDoS) attacks emanating from IoT devices [23]. As depicted in figure 1.3, each IoT device could be
running an ML-based model to provide detection and protection against DDoS attacks. A critical
observation underpinning our study is the discernible disparity in traffic characteristics between benign
and malicious (DDoS) activities originating from IoT devices. This discrepancy presents a significant
opportunity for ML models to effectively distinguish between benign and DDoS traffic patterns, thereby
enhancing the accuracy and reliability of DDoS detection mechanisms.
However, our study does not merely focus on conventional detection methodologies. We address a
more advanced challenge: the detection of tunable futuristic DDoS attacks wherein adversaries can
emulate the benign traffic patterns of IoT nodes [24, 25]. Such sophisticated attack vectors
exploit the vast number of IoT devices, leveraging their sheer numbers to orchestrate a DDoS attack that
blends seamlessly with legitimate traffic. By modulating the attack to dispatch fewer packets from each
compromised device at intervals mimicking benign traffic, attackers can significantly obfuscate their
malicious intent, thereby complicating the detection process.
This futuristic form of DDoS attack, which forms the essence of our study, represents a substantial
evolution from traditional DDoS strategies. Traditional approaches, characterized by the volumetric
inundation of target servers or the noticeable disruption of data transmission rates, inadvertently reveal
the attacker’s presence. In contrast, the futuristic DDoS attacks that we explore are designed to remain
under the radar, presenting a formidable challenge to existing detection systems. Through our research, we
aim to address this gap by leveraging the analytical capabilities of Machine Learning (ML) models to detect
these subtle anomalies, thereby contributing to the development of more resilient and adaptive
cybersecurity measures for IoT ecosystems[26, 27, 28].
In chapter 4, we introduce a novel, anonymized Urban IoT DDoS dataset capturing packet volume
activity from 4,060 IoT devices located within urban environments [24]. Furthermore, we introduce our
tunable futuristic DDoS attack that is capable of mimicking the benign behavior of the IoT devices while
performing the attack from a vast number of IoT devices. The volumes of the attacks are regulated by
adjusting parameters of the truncated Cauchy distribution, which our analysis identified as the most
suitable option for modeling the packet volumes of IoT devices, consistent with prior research [29].
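To illustrate how such packet volumes could be generated, the sketch below draws samples from a truncated Cauchy distribution via inverse-CDF sampling. It is a minimal Python example with illustrative location, scale, and truncation bounds; these are assumptions for illustration only, not the parameters fitted to the dataset introduced in Chapter 4.

```python
# Minimal sketch: sampling packet volumes from a truncated Cauchy distribution
# by inverse-CDF sampling. The loc/scale/bounds are illustrative assumptions,
# not the parameters fitted to the urban IoT dataset.
import numpy as np

def truncated_cauchy(loc, scale, low, high, size, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    cdf = lambda x: 0.5 + np.arctan((x - loc) / scale) / np.pi  # Cauchy CDF
    ppf = lambda u: loc + scale * np.tan(np.pi * (u - 0.5))     # Cauchy inverse CDF
    u = rng.uniform(cdf(low), cdf(high), size)                  # restrict mass to [low, high]
    return ppf(u)

# Example: per-interval packet volumes for one node, bounded to a plausible range.
volumes = truncated_cauchy(loc=5.0, scale=2.0, low=0.0, high=50.0, size=10)
print(np.round(volumes, 1))
```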
In chapter 5, we enhance the proposed Urban IoT DDoS dataset by incorporating correlation data for
the traffic activity of each IoT node. Furthermore, we introduce a variety of machine learning-based
detection mechanisms that leverage the correlation data of IoT nodes and employ diverse training
methodologies. Our empirical findings underscore the critical role of node correlation data in the enhanced
detection of DDoS attacks, especially in scenarios where attackers disguise their malicious traffic as benign.
To validate our detection methodology in a tangible real-world context, we employed a network of
Raspberry Pis (RPs) as a testbed, demonstrating the practical applicability and effectiveness of our
approach. Considering the vast number of features generated by the extensive array of IoT devices, we
introduce and evaluate methodologies for the selective incorporation of node correlation data during the
training and inference phases of model development.
A significant challenge in deploying conventional machine learning models for DDoS attack detection
within IoT systems lies in their intrinsic limitation in managing incomplete data. Given the dynamic and
distributed nature of IoT environments, it is plausible that correlation data shared among IoT
nodes—critical for making informed DDoS detection decisions—might be lost or disrupted during network
transmission[30, 31]. This scenario underscores a critical vulnerability in the effectiveness of conventional
detection models under conditions of data uncertainty. To address this challenge, in chapter 6, we propose
robust DDoS detection mechanisms based on Graph Convolutional Networks (GCNs) [32]. GCNs represent
a cutting-edge advancement in deep learning, offering a robust framework for handling data structured in
graphs. By conceptualizing the IoT device network as a graph, where devices are represented as nodes
interconnected by their relationships and data flows, GCNs enable the incorporation of relational
information directly into the detection mechanism.
Finally, in chapter 7, we propose mechanisms for providing reasoning and explanation about the IoT
network traffic properties. We utilize Large Language Models (LLMs) [33] to both detect and reason about
the IoT network traffic properties and evaluate their performance[34]. This chapter highlights the potential
of LLMs to enhance IoT network security by offering a novel method for detecting and reasoning about
network traffic anomalies.
1.1 Contributions
In this dissertation, we present several pioneering contributions to the field of DDoS attack detection
within IoT networks. Our research is based on a comprehensive analysis and development of advanced
methodologies aimed at enhancing the resilience of IoT systems against sophisticated DDoS threats. The
main contributions of our work are enumerated as follows:
• Futuristic DDoS Attack Model: We propose a refined model of DDoS attacks that surpasses the
complexity of those addressed in the existing literature. In this tunable futuristic DDoS attack, we
propose mechanisms allowing attackers to calibrate their approach to achieve a balance between
aggression and stealth while attacking the victim server.
• Correlation-Aware DDoS Detection Model: We propose an innovative approach leveraging
machine learning to detect DDoS attacks in IoT networks. This strategy utilizes correlation data
from IoT nodes, enhancing the DDoS detection capability by incorporating relational dynamics
among IoT devices.
• Large-Scale IoT DDoS Dataset: We introduce an extensive dataset comprising packet volume
activity from over 4,000 IoT nodes. This dataset is substantially larger than those utilized in prior
research, providing a robust foundation for training and evaluating our proposed detection models.
• Real-World Validation: We evaluate the effectiveness of our detection mechanisms through
practical experimentation using a Raspberry Pi cluster. This testbed mirrors real-world IoT
environments, verifying the feasibility and reliability of our proposed solutions.
• Resource Management in Constrained IoT Nodes: Acknowledging the resource constraints
inherent in IoT devices, we devise a strategy for selective information sharing among IoT devices.
This method improves the use of limited computational and network resources, enhancing the
scalability of our detection framework.
• GCN-Based Detection in Lossy Networks: Recognizing the challenges posed by lossy network
environments, we propose a model based on Graph Convolutional Networks (GCNs) to forge a
robust detection mechanism. Through the design and evaluation of various graph topologies, we
demonstrate the potential of GCN-based models to maintain high detection accuracy even in adverse
network conditions.
• LLM-based Detection and Reasoning: We introduce innovative methodologies for leveraging
large language models (LLMs) to analyze and interpret the underlying characteristics of IoT network
traffic. This approach enhances our comprehension of the detection outcomes by providing a deeper
insight into the causative factors and patterns within the network data.
Collectively, these contributions represent a significant advancement in the detection of DDoS attacks
within IoT networks. Through the integration of sophisticated attack models, innovative machine learning
techniques, and practical validation methods, our research aims to strengthen the cybersecurity
infrastructure of IoT systems against the evolving landscape of DDoS attacks.
Chapter 2
Background
The background chapter of this dissertation lays the foundation for understanding the primary concepts
and technologies at the heart of our study, focusing on the application of machine learning in enhancing the
security of Internet of Things (IoT) systems against Distributed Denial of Service (DDoS) attacks. Initially,
we delve into the IoT, outlining its significance, architecture, and the inherent security vulnerabilities that
make it susceptible to cyberattacks. Subsequently, we explore the nature of DoS and DDoS attacks,
characterizing their impact on IoT systems and the challenges they pose to maintaining network integrity
and availability. The core of this chapter is dedicated to a comprehensive examination of various machine
learning and deep learning models, including Multi-Layer Perceptrons (MLP), Convolutional Neural
Networks (CNN), Long Short-Term Memory (LSTM) networks, Autoencoders (AEN), Transformers (TRF),
Graph Convolutional Networks (GCN), and Large Language Models (LLM). For each model, we discuss its
foundational principles and relevance to detecting DDoS attacks within IoT frameworks. This discussion
aims to equip the reader with a robust understanding of the current state of machine learning applications
in IoT security, setting the stage for our investigation into innovative strategies for DDoS detection.
2.1 Internet of Things
The Internet of Things (IoT) signifies a transformative development in the realm of technology and digital
connectivity, marking the arrival of a new era where physical objects are integrated into the networked
world, enabling them to collect, exchange, and analyze data autonomously [35]. This technological
revolution extends the boundaries of the internet beyond traditional devices like computers and
smartphones to encompass a vast array of objects such as household appliances, industrial equipment,
wearable devices, and even vehicles, all embedded with electronics, software, sensors, actuators, and
connectivity [36]. The essence of IoT lies in its ability to bridge the physical and digital worlds, allowing
for a seamless interaction that enhances human life and work through increased connectivity, efficiency,
and intelligence [1].
The IoT ecosystem is characterized by its diverse and complex structure, which accommodates a wide
range of functionalities from sensing and data collection to processing and application-specific services.
The foundation of IoT devices is the perception layer that consists of sensors and devices equipped to
capture environmental data and statistics such as temperature, motion, humidity, etc. These devices are
interconnected through the network layer that employs various communication protocols—including
Wi-Fi, Bluetooth, Zigbee, and cellular networks—to ensure seamless data transmission. This network,
essential for the IoT’s functionality, facilitates the flow of data to the cloud or local computing devices,
where it undergoes processing and analysis. The final step of this process occurs in the
application layer, where the analyzed data is harnessed to deliver tailored services and applications [37].
The application of IoT technologies has the potential to revolutionize various sectors by providing
previously unachievable insights. For example, in smart homes, IoT devices can automate tasks, enhance
security, and improve energy efficiency. In healthcare, wearable devices can monitor patient health in
real-time, enabling proactive management of patients’ health status. In agriculture, IoT sensors can
optimize water usage, and collect important environmental properties such as temperature, humidity, etc.
In industrial settings, IoT devices can streamline operations, automate tasks, enhance safety, and reduce
maintenance costs through predictive maintenance strategies[38, 36, 39].
However, along with the many applications and automation that IoT devices provide, we must also
address the inherent security vulnerabilities and risks accompanying IoT infrastructure[40]. In the
following section, we delve into these vulnerabilities, highlighting the challenges in securing the IoT
ecosystem against DDoS attacks.
2.1.1 Security Vulnerabilities in IoT
This rapid expansion and integration of IoT devices present a myriad of security vulnerabilities, making IoT
devices prime targets for cyberattackers. Specifically, these vulnerabilities make IoT devices susceptible to
being co-opted into botnets, which can then be utilized to perform DDoS attacks. In the following, we will
enumerate the key vulnerabilities and security issues inherent in IoT devices that contribute to their
exploitation in botnet formations.
• Default Credentials: A significant number of IoT devices are deployed with default usernames and
passwords, which are often easily guessable and widely known. Furthermore, consumers usually do
not pay attention to this matter and do not change the default credentials [41]. This practice makes it
trivial for attackers to gain unauthorized access and enlist devices into botnets.
• Limited Computation Power: The constrained processing capabilities and memory of many IoT
devices restrict the implementation of robust security measures. Furthermore, the design of IoT
devices often prioritizes energy efficiency over security, leading to the omission of resource-intensive
security features [42]. This makes them easy targets for DDoS attackers looking to build botnets.
• Lack of Firmware Updates: Many IoT devices lack the capability for automatic firmware updates,
requiring manual intervention for security patches[43]. This leads to prolonged periods where
devices remain vulnerable to exploitation due to unpatched security flaws.
• Insecure Communication Protocols: IoT devices frequently communicate over insecure channels,
without encryption or secure authentication mechanisms [44]. Furthermore, IoT devices are
designed to be constantly connected to the internet, without proper monitoring of the network, thus
providing a persistent attack surface. This vulnerability allows attackers to intercept
communications, steal sensitive information, and inject malicious commands to take control of the
devices, thus enabling them to become botnets.
• Heterogeneous Ecosystem: The IoT device ecosystem contains very diverse and heterogeneous
devices [45]. This results in inconsistent security standards, which complicate the deployment
of unified security solutions and therefore increase the attack surface for botnets.
The inherent vulnerabilities and security issues of IoT devices discussed above significantly contribute
to their exploitation in botnet formations for DDoS attacks. In this dissertation, we develop novel
frameworks for detecting IoT devices that are performing DDoS attacks.
2.2 Distributed Denial of Service (DDoS) Attacks
Denial of Service (DoS) attacks are cyber-attacks aimed at disrupting the normal operation and behavior of
a targeted/victim server, service, or network. These attacks are executed by inundating the target with an
overwhelming volume of internet traffic, far beyond what the system is designed to handle. DoS attacks
achieve this by exploiting various vulnerabilities in the network or system, thereby denying service to
legitimate users. The simplicity and effectiveness of DoS attacks make them a common tool for
cybercriminals seeking to disrupt services [46].
Distributed Denial of Service (DDoS) attacks represent an evolution in complexity and scale. Unlike
their DoS counterparts, DDoS attacks originate from multiple, often globally distributed, compromised
devices. This army of compromised devices, also known as zombies, may include a wide array of
internet-connected devices such as computers, IoT devices, and servers. Malware infections are typically
used to seize control of these devices, transforming them into bots or zombies that carry out the attacker’s
command without the owners’ knowledge [47, 48]. The use of botnets significantly amplifies the volume
and impact of the attacks, making DDoS attacks far more challenging to mitigate and trace back to their
origins.
DDoS attacks are primarily executed through two methods. One approach involves attackers
dispatching improperly formatted packets to the victim server, leading to confusion in either a protocol or
an application on the receiving end[13]. The alternative and more common approach for the attacker is to
overwhelm the connectivity of legitimate users by exhausting bandwidth and network infrastructures
which targets the network/transport layer [13] or overload the victim servers by draining its resources,
characterizing application-level flooding attacks [14, 49].
In contemporary settings, DDoS attacks are frequently orchestrated through a coordinated network of
remotely controlled, systematically organized, and geographically dispersed zombies or botnets by sending
vast quantities of traffic and/or service requests to the victim server, causing it to either operate very
slowly or to cease functioning altogether [14, 49, 50].
2.2.1 DDoS Attacks Leveraging IoT Systems
The proliferation of IoT devices has introduced new vulnerabilities into the cyber threat landscape, making
IoT systems particularly susceptible to being exploited for DDoS attacks. The inherent security weaknesses of many IoT
devices, such as inadequate authentication mechanisms, unencrypted data transmission, and the use of
default passwords as discussed in section 2.1.1, make them easy targets for attackers seeking to build
botnets. Once compromised, these devices can be used to launch massive DDoS attacks that can
overwhelm not just single websites but entire portions of the internet infrastructure.
DDoS attacks leveraging IoT systems pose significant challenges for several reasons:
• Scale and Dispersity: The vast number of IoT devices distributed globally offers a broad attack
surface [51]. A botnet comprising thousands or millions of IoT devices can generate an enormous
volume of traffic, far exceeding the capabilities of traditional network defense mechanisms. This
global distribution of devices not only amplifies the attack’s reach but also complicates efforts to
pinpoint and neutralize the source.
• Device Vulnerability and Compromise: As discussed in Section 2.1.1, many IoT devices are built
with minimal security considerations, making them easy targets for hijacking. Once compromised,
these devices can silently participate in DDoS attacks without the owner’s knowledge [52]. The
simplicity of converting these devices into bots for a botnet makes IoT ecosystems attractive to
attackers looking to maximize the impact of their attacks.
• Complexity of Mitigation: Mitigating DDoS attacks originating from IoT devices is complicated by
the devices’ diversity and the distributed nature of the attack [45]. Traditional DDoS mitigation
techniques, such as traffic filtering and rate limiting, may not be effective against large-scale,
multi-vector attacks.
• Impact on Critical Services: IoT devices are increasingly used in critical infrastructure and services,
such as healthcare, transportation, and utilities [53, 54]. DDoS attacks leveraging these IoT devices
can have severe consequences: even successfully detecting the IoT botnet and disconnecting it from
the network would disrupt the essential services those devices provide.
The fight against DDoS attacks in IoT systems is ongoing, with researchers and cybersecurity
professionals continuously exploring new strategies and technologies to protect against these threats. As
IoT continues to grow and evolve, so too will the methods to secure it against the ever-present danger of
DDoS attacks.
2.3 Machine Learning Models
In the context of enhancing the security of IoT systems against DDoS attacks, machine learning models
offer a promising avenue for the detection and mitigation of such threats [55]. This section delves into
several primary machine learning models used in this dissertation by discussing their principles,
advantages, and applicability to DDoS detection in IoT environments.
2.3.1 Multi-Layer Perceptrons
Multi-Layer Perceptrons (MLP) are a class of feedforward artificial neural networks that consist of at least
three layers: an input layer, one or more hidden layers, and an output layer. Each layer is composed of
neurons that use nonlinear activation functions, except for the input nodes. MLPs are capable of modeling
complex relationships in data by adjusting the weights between neurons during the training process, using
techniques such as backpropagation[56].
In DDoS detection, MLPs can be trained on features derived from network traffic, such as packet rates,
sizes, and protocol types, to classify traffic as normal or malicious [57]. Their strength lies in the ability to
learn nonlinear models, making them effective in identifying subtle patterns indicative of DDoS activities.
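As a concrete illustration of this setup, the following minimal PyTorch sketch classifies per-window traffic features with an MLP. The feature count, layer widths, and training loop are illustrative assumptions rather than the configuration used later in this dissertation.

```python
# Minimal sketch of an MLP traffic classifier (benign vs. DDoS) in PyTorch.
# Feature dimensionality and layer widths are illustrative assumptions.
import torch
import torch.nn as nn

class TrafficMLP(nn.Module):
    def __init__(self, num_features=8, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # single logit: benign (0) vs. DDoS (1)
        )

    def forward(self, x):
        return self.net(x)

model = TrafficMLP()
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch: 64 traffic windows, each summarized by 8 hand-crafted features
# (e.g., packet rate, mean packet size, protocol indicators).
features = torch.randn(64, 8)
labels = torch.randint(0, 2, (64, 1)).float()

optimizer.zero_grad()
loss = criterion(model(features), labels)
loss.backward()
optimizer.step()
```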
2.3.2 Convolutional Neural Networks
Convolutional Neural Networks (CNN) have demonstrated exceptional proficiency in handling data
characterized by temporal sequences and time series. By leveraging convolutional layers, pooling layers,
and fully connected layers, CNNs adeptly learn temporal features in the dataset [58].
This capability makes them particularly suitable for applications requiring the analysis of sequential
data patterns. In the context of DDoS attack detection, CNNs can be employed to interpret network traffic
as temporal sequences, facilitating the extraction of distinctive features indicative of malicious
activities[59]. This approach underscores the versatility of CNNs in processing and analyzing temporal
datasets for the identification of complex patterns.
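A minimal sketch of this idea, treating a window of per-interval packet volumes as a one-dimensional sequence, is shown below; the window length, filter counts, and kernel sizes are illustrative assumptions.

```python
# Minimal sketch of a 1D CNN over a packet-volume time series; hyperparameters
# (window length, filter counts, kernel sizes) are illustrative assumptions.
import torch
import torch.nn as nn

class TrafficCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # pool over the temporal dimension
        )
        self.classifier = nn.Linear(32, 1)

    def forward(self, x):              # x: (batch, 1, window_length)
        z = self.features(x).squeeze(-1)
        return self.classifier(z)      # one logit per window: benign vs. DDoS

model = TrafficCNN()
x = torch.randn(16, 1, 60)             # 16 windows of 60 volume samples each
print(model(x).shape)                  # torch.Size([16, 1])
```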
2.3.3 Long Short-Term Memory
Long Short-Term Memory (LSTM) networks are a type of recurrent neural network (RNN) [60] designed to
address the limitations of traditional RNNs in learning long-term dependencies. LSTMs are capable of
learning order dependence in sequence prediction problems with long sequences, making them ideal for
applications where the context or temporal sequence of events is crucial [61].
In the context of DDoS detection, LSTMs can analyze long time-series data of network traffic, learning
to recognize patterns of normal and attack traffic over time [62].
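The sketch below shows one way such a sequence classifier could be assembled in PyTorch; the per-step features (e.g., packet volume and device activity), sequence length, and hidden size are illustrative assumptions.

```python
# Minimal sketch of an LSTM classifier over a traffic time series; feature count,
# sequence length, and hidden size are illustrative assumptions.
import torch
import torch.nn as nn

class TrafficLSTM(nn.Module):
    def __init__(self, num_features=2, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, seq_len, num_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # classify from the last hidden state

model = TrafficLSTM()
x = torch.randn(8, 120, 2)                # 8 sequences of 120 time steps each
print(torch.sigmoid(model(x)))            # per-sequence attack probability
```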
2.3.4 Autoencoders
Autoencoders are a type of neural network model used to compress (encode) the input into a
lower-dimensional representation and then reconstruct (decode) it back to the original input. The
reconstruction error can indicate anomalies when the model is exposed to data that significantly deviates
from the training data[63].
For DDoS detection, autoencoders can be trained on normal traffic data, allowing the model to learn a
representation of regular traffic patterns. When presented with new data, significant deviations in the
reconstruction error can signal the presence of DDoS attack traffic [64].
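A minimal sketch of this anomaly-detection workflow follows: the autoencoder is fit on benign feature windows only, and a window is flagged when its reconstruction error exceeds a threshold. The dimensions and threshold are illustrative assumptions; in practice the threshold would be chosen on validation data.

```python
# Minimal sketch of autoencoder-based anomaly detection: train on benign traffic
# only and flag windows whose reconstruction error is unusually large.
import torch
import torch.nn as nn

class TrafficAE(nn.Module):
    def __init__(self, num_features=20, latent=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(num_features, 10), nn.ReLU(),
                                     nn.Linear(10, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 10), nn.ReLU(),
                                     nn.Linear(10, num_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TrafficAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
benign = torch.randn(256, 20)                     # stand-in for benign feature windows

for _ in range(100):                              # learn to reconstruct benign traffic
    optimizer.zero_grad()
    loss = ((model(benign) - benign) ** 2).mean()
    loss.backward()
    optimizer.step()

new_window = torch.randn(1, 20)
error = ((model(new_window) - new_window) ** 2).mean().item()
is_attack = error > 0.5                           # threshold is an assumed value
```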
2.3.5 Transformers
Transformers are a type of model that uses self-attention mechanisms to weigh the significance of different
parts of the input data. Unlike RNNs and LSTMs, transformers can process entire sequences of data
simultaneously, making them highly efficient and effective for tasks involving sequential data, such as
natural language processing[65].
In DDoS detection, transformers can analyze sequences of network traffic data, identifying patterns
and relationships that are indicative of an attack. Their ability to handle long-range dependencies and
process data in parallel allows for rapid and accurate detection of DDoS activities.
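The sketch below assembles such a sequence classifier from PyTorch's built-in transformer encoder; the embedding size, number of attention heads, depth, and mean-pooling over the sequence are illustrative assumptions.

```python
# Minimal sketch of a transformer encoder over traffic sequences; model
# dimensions, depth, and mean-pooling over the sequence are illustrative choices.
import torch
import torch.nn as nn

class TrafficTransformer(nn.Module):
    def __init__(self, num_features=2, d_model=32, nhead=4, layers=2):
        super().__init__()
        self.embed = nn.Linear(num_features, d_model)
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                                   batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                  # x: (batch, seq_len, num_features)
        z = self.encoder(self.embed(x))    # self-attention over the whole sequence
        return self.head(z.mean(dim=1))    # pool over time, emit one logit

model = TrafficTransformer()
x = torch.randn(4, 120, 2)
print(model(x).shape)                      # torch.Size([4, 1])
```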
2.3.6 Graph Convolutional Networks
Graph Convolutional Networks (GCN) extend the concept of convolution from grid data to graph data,
enabling the analysis of data that is structured as graphs. GCNs are particularly useful for tasks where the
data exhibits an irregular structure, such as social networks, knowledge graphs, and network traffic [32].
For detecting DDoS attacks, GCNs can model the complex relationships and interactions between
different network entities. This is especially useful in the case of unstable and lossy network connections,
where the graph structure changes over time. By learning the patterns of normal and malicious traffic flows
within the network graph, GCNs can effectively identify anomalies associated with DDoS attacks.
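A minimal sketch of per-node classification with a two-layer GCN is given below, written against a plain normalized adjacency matrix A_hat = D^{-1/2}(A + I)D^{-1/2} so that no graph library is required. The toy graph, feature dimension, and layer sizes are illustrative assumptions; in our setting the adjacency would be derived from node correlations and the network topology.

```python
# Minimal sketch of a two-layer GCN for per-node DDoS classification using the
# normalized adjacency A_hat = D^{-1/2} (A + I) D^{-1/2}. The toy 5-node graph
# and feature sizes are illustrative assumptions.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, x):            # a_hat: (N, N), x: (N, in_dim)
        return self.linear(a_hat @ x)       # aggregate neighbors, then project

class TrafficGCN(nn.Module):
    def __init__(self, num_features=4, hidden=16):
        super().__init__()
        self.gc1 = GCNLayer(num_features, hidden)
        self.gc2 = GCNLayer(hidden, 1)

    def forward(self, a_hat, x):
        return self.gc2(a_hat, torch.relu(self.gc1(a_hat, x)))  # one logit per node

# Toy symmetric adjacency for 5 IoT nodes.
adj = torch.tensor([[0, 1, 1, 0, 0],
                    [1, 0, 1, 0, 0],
                    [1, 1, 0, 1, 0],
                    [0, 0, 1, 0, 1],
                    [0, 0, 0, 1, 0]], dtype=torch.float)
a = adj + torch.eye(5)                       # add self-loops
d_inv_sqrt = torch.diag(a.sum(dim=1).pow(-0.5))
a_hat = d_inv_sqrt @ a @ d_inv_sqrt

model = TrafficGCN()
node_features = torch.randn(5, 4)            # per-node traffic features
print(torch.sigmoid(model(a_hat, node_features)))  # per-node attack probability
```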
2.3.7 Large Language Models
Large Language Models (LLMs) represent a significant advancement in the field of artificial intelligence,
particularly within natural language processing (NLP). LLMs, such as Generative Pre-trained Transformer
(GPT)[66], are designed to understand, generate, and interact with human language at an unprecedented
scale. Large Language Models leverage the transformer architecture, which utilizes self-attention
mechanisms [65] to analyze the entire input data simultaneously. This approach allows LLMs to efficiently
process long sequences of text, making them highly effective for a wide range of NLP tasks, including text
generation, translation, summarization, and sentiment analysis [33].
In the context of IoT security, LLMs offer innovative approaches to detecting DDoS attacks. LLMs can
be utilized to analyze network traffic data, identifying patterns and anomalies that signify DDoS activities.
By training or fine-tuning these models on datasets that include both normal and DDoS traffic, LLMs learn
to discern between benign and malicious traffic. This capability enables the deployment of dynamic and
intelligent DDoS detection systems that can adapt to evolving attack methodologies. Furthermore, LLMs
can assist in explaining the DDoS detection results by offering insights into the input network traffic
properties. They can also help automate the response to identified
threats, thereby enhancing the resilience of IoT ecosystems against DDoS attacks.
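As a rough illustration of the few-shot variant, the sketch below builds a prompt containing labeled traffic summaries and asks a chat-completion model for a label and a rationale. It uses the OpenAI Python SDK (v1+) call signature; the model name, feature summaries, and few-shot examples are assumptions made purely for illustration, not the prompts used in Chapter 7.

```python
# Minimal sketch of few-shot LLM-based classification with an explanation.
# Model name, feature summaries, and few-shot examples are illustrative
# assumptions; requires OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

few_shot_examples = (
    "Example 1: packet volume = 3 packets/min, steady diurnal pattern -> BENIGN\n"
    "Example 2: packet volume = 4800 packets/min, sustained spike -> ATTACK\n"
)
window_summary = ("packet volume = 5 packets/min, but the node is active at an hour "
                  "when it is normally idle")

prompt = (
    "You are analyzing IoT network traffic for DDoS activity.\n"
    + few_shot_examples
    + "Classify the following window as BENIGN or ATTACK and explain your reasoning:\n"
    + window_summary
)

response = client.chat.completions.create(
    model="gpt-4",  # assumed model name; any chat-completion model can be used
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # label plus a natural-language rationale
```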
Each of these machine learning models offers unique advantages in the detection of DDoS attacks
within IoT systems. The choice of model depends on the specific characteristics of the network traffic, the
type of DDoS attack, and the computational resources available. By leveraging these models, researchers
and cybersecurity professionals can develop more sophisticated and effective DDoS detection systems that
enhance the security and resilience of IoT environments.
Chapter 3
Related Works
In this chapter, we provide a comprehensive examination of the evolving field of DDoS attack detection
methodologies, spanning from traditional to cutting-edge machine learning (ML) approaches. Initially, we
delve into conventional strategies that have served as the bedrock of DDoS detection mechanisms. These
foundational methods, while effective against known threats, exhibit limitations in adapting to the novel
and sophisticated attack vectors prevalent in today’s DDoS attacks. Transitioning into the realm of
ML-based methods, we explore the paradigm shift towards employing advanced algorithms capable of
learning from network behavior to identify and counteract DDoS attacks. This exploration includes a
detailed analysis of various ML techniques applied to detect traditional flooding, slow-rate, and stealthy
DDoS attacks, alongside innovative applications of Graph Convolutional Networks (GCNs) and the
emerging role of Large Language Models (LLMs) in enhancing network security.∗
∗This chapter is adapted from [26, 30, 34].
3.1 Traditional Methods For DDoS Attack Detection
Traditional methods for detecting and mitigating DDoS attacks primarily focus on identifying abnormal
traffic patterns and filtering out malicious packets. These techniques often involve the deployment of
Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS), which monitor network traffic
for signs of an attack and take actions to block malicious data. Signature-based detection [67], one of the
earliest approaches, relies on a database of known attack patterns to identify and mitigate attacks.
Although effective against known threats, it struggles with unknown attacks due to its reliance on
previously identified signatures. Rate-based detection, another conventional method, monitors the volume
of traffic to identify sudden spikes that may indicate a DDoS attack, employing thresholds to trigger
defensive mechanisms [68]. However, traditional methods often face challenges in distinguishing between
legitimate traffic surges and DDoS attacks, leading to potential false positives and the inadvertent blocking
of genuine users.
3.2 Machine Learning Based Methods For DDoS Attack Detection
In light of the increasingly sophisticated DDoS attacks being orchestrated globally, conventional detection
methodologies are proving to be inadequate in identifying and mitigating these evolved threats. This
insufficiency stems from the traditional systems’ reliance on static and predefined attack signatures, which
are easily circumvented by modern attackers employing dynamic and complex attack vectors.
Consequently, there is a pressing need for innovative approaches capable of evolving with these threats.
Machine Learning (ML) offers a promising avenue for the development of adaptive and intelligent DDoS
detection mechanisms that can learn from data patterns and predict potential attacks before they escalate.
In this chapter, we delve into the current field of ML-based DDoS detection methods.
3.2.1 Flooding DDoS Attack Detection
Prior works have significantly focused on traditional DDoS attacks where the primary objective is to
disrupt service by overwhelming it with malicious traffic. Doshi et al. [69] set up an environment with
three IoT devices, simulating the Mirai Botnet attack. They tested multiple classifiers, including random
forests and neural networks, on network middleboxes. Chen et al. [70] gathered data from 9 IoT devices on
a campus and transmitted it via SDN switches, implementing a decision tree on IoT gateways and SDN
controllers. Mohammed et al. [71] employed a naive Bayes approach on the NSL-KDD dataset [72],
validating their model with traffic from four SDN controllers. Furthermore, Meidan et al. [64] gathered data
from nine infected commercial IoT devices and presented the N-BaIoT anomaly detection technique, which
uses autoencoders to differentiate between benign and attacked IoT traffic.
Almost all of the presented detection mechanisms in the above papers could achieve a very high accuracy
of above 0.98. However, their methods primarily handle DDoS attacks where malicious traffic largely
deviates from benign traffic, exposing the botnets. In contrast, we introduce futuristic DDoS attacks that
closely mimic IoT nodes’ benign traffic, which is a more challenging scenario for detection. Furthermore,
contrary to these works, our approach advocates for a peer-to-peer mechanism that is capable of running
the DDoS detection on IoT devices, eliminating central node dependencies such as running the detection
mechanism on servers or routers.
3.2.2 Slow-rate and Stealthy DDoS Attacks Detection
Slow-rate DDoS attacks subtly deplete target resources by sending malicious requests slowly over
prolonged durations. Several researchers have delved into detecting such attacks. Yungaicela-Naula et al.
[18] utilized datasets CICDDoS 2017 [73] and CICDDoS 2019 [74], applying machine and deep learning
techniques like Support Vector Machines (SVM) and MLP. Nugraha and Murthy [75] proposed a hybrid
CNN-LSTM model and evaluated its performance on a self-developed dataset. Cheng et al. [76] evaluated
several classifiers, like random forest and naive Bayes. The proposed slow-rate DDoS detection
mechanisms could achieve high accuracies of above 95%. Muraleedharan and Janet [77] used a random
forest (RF) classifier to detect slow-rate DDoS attacks, and their model could achieve up to 99% accuracy.
However, these prior works often assume that the malicious traffic of slow-rate DDoS attacks deviates
significantly from benign traffic, especially in time-related network features such as inter-arrival time.
Stealthy DDoS attacks subtly induce service slowness and network delays, posing challenges for
detection [78]. Doshi et al. [79] introduced an IDS using the Online Discrepancy Test (ODIT) for detecting
stealthy DDoS in IoT networks. The stealthy attack mechanism used in their work generates DDoS attacks
through IoT nodes whose packet volume is at least 20% higher than their benign behavior. Their
mechanism operates cooperatively, sharing node statistics centrally, while we propose peer-to-peer
architectures for DDoS detection. Feng et al. [80] developed a lightweight, unsupervised classifier
combined with reinforcement learning for DDoS detection, defining a low-rate DDoS attack as one that
increases the data rate of IoT devices by 20% to 100% relative to the device's benign behavior. In contrast,
our DDoS attack exactly mimics the genuine behavior of IoT nodes, our detection mechanism is capable of
detecting such attacks, and we utilize peer-to-peer designs rather than centralized ones.
3.2.3 Application-Layer-Based DDoS attack on IoT
In the field of IoT security, considerable attention has been directed towards application-layer DDoS
attacks. Particularly, protocols intrinsic to IoT communication, such as Message Queuing Telemetry
Transport (MQTT) and the Constrained Application Protocol (CoAP), have emerged as potential targets for
such intrusions [81]. Noteworthy works have explored the vulnerabilities and proposed countermeasures
for these application-layer protocols. For instance, researchers have unveiled potential weaknesses within
CoAP due to its lightweight nature tailored for IoT devices, which is particularly susceptible to DDoS
attacks [82, 83]. Syed [84] focuses on Application Layer DoS attacks on the MQTT protocol, specifically
targeting its authentication and authorization mechanisms. The study introduces a machine learning-based
framework to detect these attacks. Husnain et al. [85] introduce an MQTT parsing engine, designed to
integrate with network-based IDSs, for thorough inspection against IoT protocol vulnerabilities; it
validates packet fields meticulously during the parsing phase. Syed et al. [86] introduced a framework for
detecting application-layer DoS attacks on MQTT brokers using machine learning, employing methods
such as AODE, C4.5 trees, and MLP on three virtual machines. While these studies illuminate the necessity
of securing application-layer protocols, our research navigates a different trajectory, concentrating on
transport-layer DDoS attacks, especially those similar to UDP floods emanating from IoT devices.
3.2.4 GCN Based Methods in DDoS Detection
Graph Convolutional Networks (GCNs) represent a cutting-edge approach in machine learning, designed
to work directly with graph-structured data. By considering the graph’s structure, GCNs can capture
complex patterns and relationships that classic ML methods might overlook [87]. They hold considerable
promise in a variety of fields, including network security, where they can leverage the interconnected
nature of devices to more effectively identify and mitigate threats such as DDoS attacks, a potential notably
explored in this study. A recent study by Cao et al. [88] utilized Spatial-Temporal Graph Convolutional
Networks (ST-GCN) in Software-Defined Networking (SDN) for DDoS attack detection. In contrast to such
studies, which have primarily tested their methodologies against rudimentary DDoS attack mechanisms in
which the attacking traffic attributes of IoT devices differ markedly from benign traffic, our work leverages
a sophisticated, future-oriented DDoS attack mechanism with adjustable packet
volume, enhancing the robustness and relevance of our findings. Furthermore, their method has exhibited
insufficient resilience in unstable network environments characterized by lossy data connections. A
noteworthy limitation of their approach is the assumption of a centralized router node in the system
topology, inherently introducing a single point of failure. In comparison, our GCN-based method uses a
peer-to-peer topology approach which offers enhanced resilience amidst fluctuating network conditions,
thereby promising more versatile applicability in real-world scenarios.
3.2.5 LLM Based DDoS detection
Recent research suggests that LLMs possess an inherent capability for reasoning. However, the exact
bounds and depth of this capability remain subjects of intensive research [89]. In the area of artificial
intelligence and network security, LLMs hold the promise of a formidable defense against cyber threats. By
leveraging models like GPT-4, we can significantly augment the resilience of cybersecurity systems,
granted they are correctly implemented [90]. Despite the potential advantages, it is important to note the
evolving stage of research in employing LLMs specifically for network security. A recent work in this
domain by Ferrag et al. [91] proposed SecurityLLM, an integrated model that addresses cybersecurity
threats. This model comprises two distinct elements: SecurityBERT, which focuses on threat detection, and
FalconLLM, designed for incident response. They fine-tuned an LLM based on transformer models for the
purpose of detecting potential threats. Furthermore, they prompt engineered FalconLLM to craft responses
to these detected threats. However, a significant gap in their work is the absence of reasoning behind
identifying an attack. Moreover, the responses generated by FalconLLM tend to be overarching and lack
the specificity required for individual systems. Contrasting this, our approach aims to harness a
pre-trained LLM, not only for the purpose of detection but also to explain the reasoning behind identifying
a network security incident.
3.3 Differentiating Characteristics of Our Work
Our research offers unique features:
• Futuristic DDoS Attacks - Our DDoS attacks can mimic the exact benign behavior of IoT nodes with
tunable parameters to have higher or lower aggressiveness toward the victim server.
• Peer-to-peer Architectures - We propose architectures that run in a complete peer-to-peer
mechanism on IoT devices, removing the need for edge/router/central devices.
• Correlation Awareness - We propose architectures that account for correlation information between
IoT nodes, an aspect that many studies overlook.
• Comprehensive Dataset - Our dataset incorporates more than 4000 IoT nodes, giving a richer
perspective.
• Transport-layer-based DDoS attack - We focus on transport-layer DDoS attacks, especially those
similar to UDP floods emanating from IoT devices.
• Robust DDoS Detection in Lossy networks - Our proposed GCN model is capable of detecting DDoS
attacks in the IoT system with lossy networks.
• Providing Insightful Information For Network Admins - Our proposed LLM-based mechanism
provides insightful information and reasoning for the detected DDoS attack in the network.
In summary, while prior studies have made considerable advancements in the field, our research
addresses gaps and introduces novel methods and perspectives to further the understanding and detection
capabilities for DDoS attacks in IoT networks. Many other works have studied DDoS detection on IoT
devices, mainly using machine learning methods; a selection of these papers is presented in Table 3.1.
Table 3.1: Selected papers with ML-based methods for detecting DDoS attacks in IoT systems
Reference | Dataset | Detection Method | Centralized/Distributed | Inference Device
Doshi et al. [69] | Self-developed | RF, KNN, SVM, MLP | Centralized | Network middlebox
Chen et al. [70] | Self-developed | DT | Centralized | SDN controller
Syed et al. [86] | Self-developed | MLP, AODE, DT | Centralized | MQTT broker
Meidan et al. [64] | N-BaIoT [64] | AEN | Distributed | IoT node
Liu et al. [92] | Self-developed | RL | Centralized | SDN controller
Roopak et al. [93] | Self-developed | MLP, CNN, LSTM | Centralized | Unclear
Gurulakshmi et al. [94] | Self-developed | SVM | Centralized | Unclear
Mohammed et al. [71] | NSL-KDD [72] | NB | Centralized | SDN controller
Zekri et al. [95] | Self-developed | DT | Centralized | Cloud
Blaise et al. [96] | CTU-13 [97] | MLP, SVM, RF, LR | Centralized | Unclear
Soe et al. [98] | N-BaIoT [64] | MLP, DT, NB | Centralized | Unclear
Nugraha et al. [99] | CTU-13 [97] | MLP, CNN, LSTM | Centralized | Unclear
Kumar et al. [100] | Self-developed | RF, KNN, GNB | Centralized | IoT gateway
Yungaicela et al. [18] | CICDDoS [73], [74] | RF, SVM, KNN, MLP, CNN, GRU, LSTM | Centralized | Server
Cheng et al. [76] | Self-developed | SVM, NP, KNN, RF | Centralized | Controller/Switch
Where MLP: Multilayer Perceptron, CNN: Convolutional Neural Network, LSTM: Long Short-Term Memory,
AEN: Autoencoder, RL: Reinforcement Learning, DT: Decision Tree, RF: Random Forest, KNN: K-Nearest
Neighbors, SVM: Support Vector Machine, AODE: Averaged One-Dependence Estimator, NB: Naive Bayes,
GNB: Gaussian Naive Bayes, LR: Logistic Regression, GRU: Gated Recurrent Unit
Chapter 4
Tunable Futuristic DDoS Attacks In IoT Systems
Distributed Denial of Service (DDoS) attacks present a formidable challenge in the field of cybersecurity,
especially with the rapid rise of Internet of Things (IoT) devices. In this chapter, we introduce our tunable
futuristic DDoS attack that is capable of mimicking the benign behavior of IoT devices while performing
the attack from a vast number of IoT devices. This futuristic DDoS attack presents a serious challenge for
the DDoS detection models introduced in prior research. Furthermore, in this chapter, we introduce a
novel, anonymized Urban IoT DDoS dataset capturing packet volume activity from 4,060 IoT devices
located within urban environments. This large-scale dataset provides a benchmark for fellow researchers
developing ML-based DDoS detection methods in IoT systems.∗†
4.1 Problem Definition
A critical examination of existing literature reveals a significant oversight: the majority of research on
DDoS detection mechanisms predominantly focuses on quantifiable metrics such as the volume of packets
transmitted from IoT nodes or the timing of packet flows during an attack. These metrics are assumed to be
orders of magnitude higher in attack scenarios compared to benign traffic conditions.
∗This chapter is adapted from [24, 25, 28].
†The dataset and source code used in this chapter are available in an open-source repository online at https://github.com/ANRGUSC/Urban_IoT_DDoS_Data.
For instance, Meidan et al. [64] presented an ML-based methodology for the detection of DDoS attacks,
utilizing a dataset encompassing both benign and DDoS traffic from nine IoT nodes. As depicted in Figure
4.1a, the analysis of packet volume from a specific IoT node (ID XCS7_1003) under attack conditions
revealed an average packet volume that was 1604 times higher than during benign operation. Similarly,
Sharafaldin et al. [73] introduced a dataset that includes a slow-rate DDoS attack variant, accompanied by
an ML-based detection technique. Their findings, illustrated in Figure 4.1b, highlighted that the mean flow
inter-arrival time (IAT) for attack traffic is roughly 11 times higher than that of benign traffic, indicating
significant temporal disparities between attack and normal traffic flows.
Figure 4.1: Probability density of benign/attack traffic for real DDoS attacks: (a) packet volume, Mirai botnet DDoS attack; (b) mean flow inter-arrival time, slow-rate DDoS attack
Such pronounced differences in traffic characteristics during attack scenarios have traditionally granted
a substantial advantage to machine learning models tasked with detecting DDoS attacks. Nonetheless, in
this study, we introduce more sophisticated DDoS attacks that adeptly mimic the benign traffic patterns
of IoT devices. In an era where the IoT ecosystem encompasses millions of devices, attackers could
potentially utilize vast networks of these devices. By strategically moderating the packet output to mirror
benign traffic patterns, both in volume and timing, attackers could obfuscate their malicious activities. This
form of attack, which constitutes the primary point of our investigation, signifies a dramatic evolution
from conventional DDoS strategies that are easily detectable due to their significant deviation from benign
traffic patterns. By adopting a more stealthy approach, attackers could significantly undermine the efficacy
of current DDoS detection mechanisms, thereby posing a serious threat to cybersecurity infrastructures.
4.2 Modeling IoT Network Traffic Properties
The foundation of our study on introducing futuristic DDoS attacks is based on a comprehensive analysis of
real-world IoT network traffic properties. In order to model the IoT network traffic, we will first process the
raw data of the network activity of the real IoT devices. The dataset utilized is sourced from deployments
across diverse urban locales and offers invaluable insights into the operational dynamics of IoT devices.
The raw dataset is characterized by a rich array of features, including the unique identifier (node ID)
for each IoT device, its geographical coordinates (latitude and longitude), and a binary indicator of its
activity status. An activity status alteration—switching from inactive to active or vice versa—triggers the
generation of a new record within the dataset. This compilation of data spans an entire month,
incorporating records from 4060 distinct IoT nodes without any omissions.
4.2.1 Addressing Data Bias
A critical examination of the raw dataset revealed a notable bias: nodes exhibiting more frequent
alterations in their activity status are disproportionately represented, resulting in an uneven distribution of
records. This imbalance poses a significant challenge, as it can bias the training of neural network
models toward overfitting, leading to a scenario where the models exhibit a preference for learning from
nodes with heightened activity variability.
To mitigate this bias, we implement a preprocessing script that systematically normalizes the dataset.
Utilizing a predetermined time step, denoted as ts, this script iteratively processes the raw data, yielding a
“balanced dataset” that records the activity status of each IoT node at regular intervals of ts seconds. This
approach not only addresses the skewness inherent in the original dataset but also ensures a uniform
distribution of learning samples across all nodes. By doing so, we facilitate a more balanced and
comprehensive training process for the neural network models, enabling them to derive insights from a
broader spectrum of IoT device behaviors without bias towards those with more frequent status changes.
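A minimal sketch of such a balancing step is shown below, assuming the raw records are loaded into a pandas DataFrame with columns NODE, TIME (parsed as datetimes), and ACTIVE; the column names, the forward-fill choice, and the example step size are illustrative assumptions rather than the released preprocessing script.

```python
import pandas as pd

def balance_activity(raw: pd.DataFrame, ts_seconds: int) -> pd.DataFrame:
    """Turn event-driven activity records (one row per status change) into a
    'balanced' dataset with one row per node per ts-second time step.

    Assumes columns NODE, TIME (datetime of the status change), and ACTIVE (0/1)."""
    frames = []
    for node_id, events in raw.sort_values("TIME").groupby("NODE"):
        series = (events.set_index("TIME")["ACTIVE"]
                        .resample(f"{ts_seconds}s")
                        .last()       # status observed at the end of each step
                        .ffill()      # carry the last known status forward
                        .fillna(0))   # before the first recorded event: inactive
        frames.append(series.rename("ACTIVE").reset_index().assign(NODE=node_id))
    return pd.concat(frames, ignore_index=True)

# Example: 10-minute (600 s) steps, matching the sample rows shown in Table 4.1.
# balanced = balance_activity(raw_events, ts_seconds=600)
```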
4.2.2 Dataset Augmentation
The balanced dataset introduced above primarily encapsulates the binary activity status of IoT nodes. In an
endeavor to enhance this dataset and render it more reflective of real-world IoT network behaviors, we
incorporated an additional dimension: the packet volume transmitted by active IoT nodes at each time step.
Based on the research conducted by Meidan et al. [64], which delves into DDoS attack detection across a
small set of nine IoT nodes, we leveraged their dataset, which records IoT nodes' packet volumes at ten-second
intervals. Specifically, we extracted packet volume data of the security camera IoT node, identified as ID
XCS7_1003, from Meidan et al.’s study. This data served as a basis for simulating both benign and attack
packet volumes within our enhanced dataset.
In order to model the packet volume of the IoT devices, we extensively evaluated 80 distinct statistical
distributions, utilizing the packet volume data from the security camera node, in both its benign and DDoS
activity, as a reference point. This rigorous analysis, carried out with the Fitter library [101] in Python,
aimed to ascertain the most fitting distribution model for the IoT nodes’ packet volume. The truncated
Cauchy distribution emerged as the optimal choice, distinguished by its minimal square error. This
analysis highlighted the truncated Cauchy distribution’s efficacy in capturing the properties of urban IoT
data traffic, a choice supported by prior studies in network traffic modeling [29]. The adoption of a
truncated Cauchy distribution ensures an accurate representation of packet volumes in our dataset. In this
model, active IoT nodes generate packet volumes based on this distribution, while inactive nodes are
denoted with a packet volume of zero, underscoring the event-driven nature of our dataset. Table 4.1
shows a sample of the enhanced dataset.
Table 4.1: Sample data points in the introduced large-scale enhanced dataset containing the activity and
packet volume of urban IoT devices.
NODE | LAT | LNG | TIME | ACTIVE | PACKET
1 | 33 | 40 | 2021-01-01 23:00:00 | 0 | 0
1 | 33 | 40 | 2021-01-01 23:10:00 | 0 | 0
1 | 33 | 40 | 2021-01-01 23:20:00 | 1 | 9
1 | 33 | 40 | 2021-01-01 23:30:00 | 1 | 11
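The distribution-fitting step described above can be sketched with the Fitter library as follows; the placeholder benign samples and the short candidate list (the study evaluated 80 distributions) are illustrative assumptions rather than the actual analysis code.

```python
import numpy as np
from fitter import Fitter

# Placeholder benign packet-volume samples; in the dissertation these come from
# the security camera node (ID XCS7_1003) in the N-BaIoT data.
benign_volumes = np.random.lognormal(mean=2.0, sigma=0.5, size=10_000)

# Fitter fits each candidate distribution to the data and ranks them by
# sum-of-squares error; only a handful of candidates are listed here.
f = Fitter(benign_volumes, distributions=["cauchy", "norm", "lognorm", "gamma", "expon"])
f.fit()
best = f.get_best(method="sumsquare_error")
print(best)  # e.g. {"cauchy": {"loc": ..., "scale": ...}}
```

In the study, this procedure selected the (truncated) Cauchy family as the best fit; the truncation to a non-negative, bounded packet volume is applied on top of the fitted location and scale parameters.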
It is worth mentioning that the concept of ‘urban IoT benign traffic’ in this study is grounded in the
understanding of the dynamic nature of urban environments. This encompasses the data generated from a
myriad of smart devices and sensors deployed in urban landscapes to monitor and control traffic flow,
weather conditions, and home appliances among others, aiming to enhance the quality of urban life. It is
essential to highlight that this characterization could potentially vary from the traffic patterns observed in
other IoT environments such as industrial and agricultural IoT settings. The industrial IoT domain
generally witnesses more uniform and predictable traffic, originating from machinery and systems
operational in manufacturing units, while agricultural IoT is associated with sporadic traffic patterns
reflecting the data from various agricultural equipment and weather monitoring systems.
4.2.3 Large-Scale Urban IoT Dataset
Prior researchers mostly introduced and worked on datasets that contained a very small number of IoT
devices. Table 4.2 presents an overview of datasets in this field with the respective number of IoT nodes
used in creating the dataset. As a major contribution in the field of DDoS detection in IoT systems, we have
released our enhanced dataset [28], named large-scale Urban IoT Dataset. This dataset contains the packet
volume information of 4060 IoT devices across one month of activity, along with their anonymized
locations. This large-scale dataset will help fellow researchers to analyze the performance of their DDoS
detection methodology on a large-scale IoT network.
Table 4.2: Related papers with IoT datasets
Reference | Date | Number of Nodes | IoT Specific/General | Binary Activity/Traffic Volume | Benign/Attack Traffic
DARPA 2000 [102] | 2000 | 60 | General | Traffic Volume | Both
CAIDA UCSD DDoS Attack 2007 [103] | 2007 | Unclear | General | Traffic Volume | Attack
Shiravi et al. [104] | 2012 | 24 | General | Traffic Volume | Both
CICDDoS 2017 [73] | 2017 | 25 | General | Traffic Volume | Both
Meidan et al. [64] | 2018 | 9 | IoT Specific | Traffic Volume | Both
CSE-CIC-IDS 2018 on AWS [105] | 2018 | 450 | General | Traffic Volume | Attack
CICDDoS 2019 [74] | 2019 | 25 | General | Traffic Volume | Attack
The Bot-IoT Dataset (Univ. of NSW) [106] | 2019 | Unclear | IoT Specific | Traffic Volume | Both
Ullah et al. [107] | 2020 | 42 | IoT Specific | Traffic Volume | Both
Erhan et al. [108] | 2020 | 4000 | General | Traffic Volume | Both
Hekmati et al. [24] | 2021 | 4060 | IoT Specific | Binary Activity | Both
Hekmati et al. [28] | 2021 | 4060 | IoT Specific | Traffic Volume | Both
4.3 Tunable Futuristic DDoS Attacks
In the pursuit of understanding and mitigating the threat posed by DDoS attacks in the context of IoT
networks, in this chapter we present a novel approach for simulating such attacks.
4.3.1 DDoS Traffic Distribution
Building upon the foundational packet volume distribution framework established in the previous section,
our methodology introduces a mechanism to synthesize DDoS attacks emanating from IoT nodes. To
simulate the packet volume of IoT nodes under attack, we formulate a truncated Cauchy distribution
characterized by
xa = (1 + k1) · xb (4.1)
γa = (1 + k2) · γb (4.2)
ma = (1 + k3) · mb (4.3)
In the equations above, xb, γb, and mb represent the benign traffic parameters—location, scale, and
maximum packet volume, respectively, as characterized by the truncated Cauchy distribution. Conversely,
xa, γa, and ma represent the corresponding parameters under the simulated attack scenario. The
parameters k1, k2, and k3 are tunable parameters aiming to augment the location, scale, and peak packet
volume in relation to benign traffic. This would empower the DDoS attackers to modulate the attack
traffic’s location, scale, and peak packet volume relative to the benign baseline.
This tunability is not merely a mathematical abstraction but a critical tool in the emulation of a
spectrum of attack intensities. It’s noteworthy that during a DDoS attack, IoT nodes become active and
transmit packets irrespective of their prior activity status. As compared to prior studies scenarios where
attackers either flood packets or intentionally reduce data transfer rates, our approach, introduces a DDoS
attacker that seamlessly integrates malicious traffic with benign patterns.
By adjusting the values of k1, k2, and k3, we can simulate attackers who either subtly blend in with
benign traffic or opt for a more aggressive attack. Lower values of these parameters suggest an attacker’s
intent to mimic benign traffic patterns closely, thereby obscuring malicious activities and complicating the
detection process. In this scenario, the attacker would be able to utilize a huge number of IoT devices to
perform an effective DDoS attack. On the other end of the spectrum, higher parameter values result in an
amplified packet volume directed towards the target server, increasing the attacker’s visibility. In this
scenario, the attacker would be able to utilize a lower number of IoT devices to perform an effective DDoS
attack. This approach facilitates a deeper understanding of the dynamics at play in futuristic DDoS attacks
and enhances the development of detection mechanisms capable of identifying and mitigating such
sophisticated threats.
The dataset leveraged in this research is grounded in real urban IoT activity data. In synthesizing the
DDoS attacks for the dataset, we have emulated a scenario similar to a UDP flood attack with tunable packet
volume, which typifies a transport-layer attack within the network layer of the IoT system. This choice of
attack vector not only allows for a realistic representation of potential vulnerabilities but also aligns with
the prevalent communication protocols at the transport layer, thereby providing an opportunity to evaluate
the effectiveness of the proposed DDoS detection mechanisms in real-world scenarios. It is important to
note that the attack is initiated at the network layer, exploiting transport-layer protocols to produce a
tunable number of packets targeting the victim server.
4.3.2 DDoS Attack Properties
IoT systems can generally be characterized into three distinct layers: the device layer, the network layer,
and the application layer. These layers employ a range of TCP/IP-based protocols, and consequently, they
can be perceived as potential sources in a DDoS attack scenario. In this research, we have studied the
vulnerabilities and potential attack vectors present at the transport layer, which functions within the
network layer. Specifically, we have explored DDoS attack dynamics similar to a UDP flood attack, wherein
a tunable number of UDP packets are sent to the victim server with the intent to disrupt its normal
functioning. It’s important to highlight that our emphasis is on DDoS attacks emanating from IoT systems,
particularly IoT devices that are utilized as a botnet. Our study is not centered on attacks aimed directly at
IoT servers or IoT applications themselves, but rather on the potential of IoT devices being weaponized as
part of a DDoS attack.
In this study, DDoS attacks are replicated by a strategic activation of IoT nodes, which are tasked with
executing the attack over a specified duration. We simulate DDoS attacks by activating all attacking IoT
nodes for the attack duration and by determining the packet volume each node dispatches in
every time slot. Given our event-driven IoT dataset, nodes performing an attack start dispatching packets
regardless of their prior activity. To specify the packet volume transmitted during the attack, we utilize the
equations (4.1), (4.2), and (4.3) to characterize the truncated Cauchy distribution. With tunable parameters
k1, k2, and k3 in place, we draw samples from this distribution independently and identically. The DDoS
attack performed by the attacker can have various parameters to be set:
• Tunable packet distribution parameters (k1, k2, and k3): These pivotal parameters dictate the
characteristics of the truncated Cauchy distribution, directly influencing the packet volume
dispatched to the victim server. Elevated values of these parameters orchestrate a more aggressive
attack profile, whereas lower values facilitate a subtler attack, enabling the malicious traffic to
closely mimic the benign behavior of IoT nodes.
• Attack’s start time (as): This parameter allows the attacker to pinpoint the commencement of the
attack, offering strategic flexibility in timing the onset of the DDoS attack.
• Attack’s duration (ad): Determining the length of the attack, this parameter sets the temporal bounds
for the DDoS activity, dictating its persistence and potential impact.
• Attacking node ratios (ar): This parameter helps the attacker to set what percentage of the
compromised IoT devices to be utilized for performing the attack. It is noteworthy that a lower
setting of the distribution parameters (k1, k2, and k3) necessitates a higher ratio of attacking nodes
to ensure the efficacy of the DDoS onslaught, and vice versa.
By integrating these diverse parameters into the simulation framework, the attacker is equipped to
execute a wide array of attack profiles. This adaptability not only enhances the realism of the simulated
attacks but also provides a diverse set of samples for neural network models to discern node behaviors
across various attack scenarios. Consequently, this facilitates the development of sophisticated prediction
models capable of accurately identifying the IoT nodes implicated in the attack and the timing of the attack.
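The following minimal sketch illustrates how these attack knobs could be bundled into a single scenario description and how the attacking subset of nodes could be drawn; the field names, random node selection, and example values are illustrative assumptions rather than the released simulation code.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class AttackScenario:
    """Container for the tunable attack parameters described above."""
    k1: float             # location scaling factor
    k2: float             # scale scaling factor
    k3: float             # maximum-packet-volume scaling factor
    attack_start: int     # a_s: index of the first attacked time step
    attack_duration: int  # a_d: number of attacked time steps
    attack_ratio: float   # a_r: fraction of compromised nodes that participate

def choose_attacking_nodes(node_ids, scenario, seed=0):
    """Randomly pick the subset of compromised nodes that take part in the attack."""
    rng = np.random.default_rng(seed)
    n_attackers = int(round(scenario.attack_ratio * len(node_ids)))
    return rng.choice(node_ids, size=n_attackers, replace=False)

# Example: a stealthy attacker (k close to 0) compensating with a high node ratio.
stealthy = AttackScenario(k1=0.0, k2=0.0, k3=0.0,
                          attack_start=100, attack_duration=24, attack_ratio=0.9)
attackers = choose_attacking_nodes(list(range(4060)), stealthy)
```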
4.4 Simulation Results
In the following, we show our simulation results explaining the properties of the proposed large-scale
urban IoT dataset. Furthermore, we present the effectiveness of the truncated Cauchy distribution in
modeling the IoT network traffic, and also its effect on the tunable futuristic DDoS attacks.
4.4.1 Large Scale Urban IoT Dataset Evaluation
In this subsection, some statistics of the introduced dataset are presented to illustrate the dataset’s
properties.
Figure 4.2 presents the daily cycle of IoT node activity within our dataset, illustrating the mean
percentage of active nodes as it varies throughout the day. The graph indicates a significant uptick in
activity, peaking at around 65% in the middle of the day, which may be reflective of increased operational
requirements during typical business hours. This period of heightened activity gradually diminishes as the
day progresses, reaching the lowest level of roughly 20% activity during midnight hours. Such trends
underscore the need for temporal context in interpreting IoT node behavior, as well as in configuring
network resources and security measures.
In Figure 4.3, we examine the mean correlation of node activity against the spatial distance separating
nodes, with pairwise distances calculated via the Euclidean distance. Utilizing Pearson correlation
coefficients computed over a dataset spanning an entire month, we aggregate the data into 1000 bins based
on node distances and compute the average correlation for node pairs within each bin. The resulting trend
Figure 4.2: Large-scale urban IoT dataset active nodes percentage vs time
suggests a noticeable decay in correlation with increasing distance, with mean correlation values declining
from approximately 0.4 for proximal nodes to 0.1 or lower for nodes separated by distances greater than 10
units. This inverse relationship may be indicative of the spatial distribution of node interactions and the
localized nature of certain IoT activities.
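A minimal sketch of the binning procedure behind Figure 4.3 is shown below, assuming one packet-volume column per node and a dictionary of node coordinates; the data layout and the brute-force pairwise loop are illustrative rather than the exact analysis script (at 4060 nodes the pairing would need a vectorized implementation).

```python
import numpy as np
import pandas as pd
from itertools import combinations

def correlation_vs_distance(volume_df: pd.DataFrame, coords: dict, n_bins: int = 1000):
    """For every node pair, compute the Pearson correlation of their activity and
    their Euclidean distance, then average the correlation within distance bins."""
    corr = volume_df.corr(method="pearson")
    rows = []
    for a, b in combinations(volume_df.columns, 2):
        dist = np.linalg.norm(np.array(coords[a]) - np.array(coords[b]))
        rows.append((dist, corr.loc[a, b]))
    pairs = pd.DataFrame(rows, columns=["distance", "correlation"])
    pairs["bin"] = pd.cut(pairs["distance"], bins=n_bins)
    return pairs.groupby("bin", observed=True)["correlation"].mean()
```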
Further, we explore the temporal characteristics of node activity through histograms in Figure 4.4.
Here, we present the distribution of mean active and inactive times for IoT nodes, segmented into day (8
AM to 8 PM) and night (8 PM to 8 AM) periods. Figures 4.4a and 4.4c detail the average duration of node
activity during day and night, respectively, while Figures 4.4b and 4.4d convey the corresponding average
inactivity durations. The histograms reveal that both active and inactive periods tend to be prolonged
during the night compared to the daytime. This observation could be reflective of less frequent but more
sustained node activities during the night.
Figure 4.3: Large-scale urban IoT dataset nodes activity mean correlation vs distances
4.4.2 Tunable Futuristic DDoS Attacks
In our comprehensive analysis aimed at understanding the tunable parameters within the framework of
futuristic DDoS attacks, we delve into the empirical examination of packet volume distributions. This
exploration is pivotal for validating the efficacy of the truncated Cauchy distribution model in simulating
attack scenarios with varying intensities. To this end, we illustrate the behavior of the probability density
function (PDF) and the complementary cumulative distribution function (CCDF) through a series of
graphical representations. In this experiment, we assumed k1 = k2 = k3 = k for simplicity. Figures 4.5a
and 4.5b present the empirical data of benign traffic (depicted in blue) against the modeled distributions
under different settings of the tunable parameter k, with specific instances where k = 0, k = 1, and k = 2.
The scenario where k = 0 is of particular interest, as it shows a situation wherein the synthetic packet
volume is intended to mirror the empirical packet volume distributions observed in real-world data. The
alignment of the truncated Cauchy distribution with k = 0 (represented in red) with the empirical data (in
Figure 4.4: Large-scale urban IoT dataset histograms of mean activity and inactivity time per node: (a) mean activity time, 8 AM to 8 PM; (b) mean inactivity time, 8 AM to 8 PM; (c) mean activity time, 8 PM to 8 AM; (d) mean inactivity time, 8 PM to 8 AM
(a) PDF (b) CCDF
Figure 4.5: Real packet volume distribution vs truncated Cauchy distribution
blue) within figures 4.5a and 4.5b is a testament to the model’s accuracy in replicating benign traffic
patterns.
As we increment the value of k, there is a noticeable shift in the dynamics of the distribution. The
escalation of k inherently amplifies the location, scale, and maximum packet volume parameters of the
distribution. This augmentation is visually captured in the aforementioned figures, where higher values of
k correspond to an increased probability of larger packet volumes. Such a trend underscores the model’s
capacity to simulate varying degrees of attack intensity, from subtle to severe, by modulating the k
parameter. High k values facilitate scenarios where immense volumes of packets are transferred during an
attack, thereby enabling a smaller number of IoT devices to potentially overwhelm the victim server. On
the other hand, lower k values result in reduced packet volumes during attack simulations, necessitating a
more extensive network of IoT nodes to perform an effective DDoS attack.
4.5 Summary
In this chapter, we presented a comprehensive analysis of IoT node behavior within the context of detecting
and understanding the dynamics of DDoS attacks. Our investigation commenced with the formulation of
an urban IoT dataset, predicated on the activity status of nodes, which was then augmented with packet
volume information derived from real-world IoT device interactions. The dataset was subjected to rigorous
preprocessing to mitigate inherent bias, ensuring uniformity across all nodes. Subsequently, we introduced
a novel approach to simulating tunable futuristic DDoS attacks by employing a truncated Cauchy
distribution, parameterized by tunable factors to reflect varying attack intensities. This methodological
innovation enables the simulation of attacks that either subtly blend with benign traffic or severely contrast
with it, enhancing our understanding of potential cybersecurity threats. Finally, we presented simulations
and analyses on the new large-scale IoT dataset that we introduced. The findings underscore the importance of
considering temporal and spatial dimensions when developing DDoS detection and mitigation strategies.
Chapter 5
Correlation-Aware Neural Networks for DDoS Attack Detection In IoT
Systems
Recent analyses reveal a worrying trend of vulnerabilities in IoT ecosystems, highlighting the urgency for
robust cybersecurity measures [109, 5]. Traditional DDoS detection mechanisms, primarily based on
analyzing traffic volume and flow, are increasingly ineffective against sophisticated attacks that mimic
benign network behavior [64, 73]. Recognizing this, in this chapter, we target the detection of futuristic,
tunable DDoS attacks that cleverly disguise themselves within benign IoT traffic patterns. We propose a
novel approach leveraging the correlation information among IoT nodes, which could significantly
enhance the detection of such stealthy attacks. This chapter presents the innovative architectures and
neural network models developed for this purpose, laying the groundwork for a new paradigm in DDoS
attack detection within IoT systems.∗†
5.1 Problem Definition
In this chapter, we significantly refine our approach towards generating a comprehensive training dataset,
pivotal for the development of robust neural network models aimed at detecting DDoS attacks in IoT
∗This chapter is adapted from [26, 27, 110].
†The dataset and source code used in this chapter are available in an open-source repository online at https://github.com/ANRGUSC/correlation_aware_ddos_iot.
systems. Our refined methodology incorporates the correlation information of IoT nodes’ activities,
enriching the dataset with comprehensive insights into the collective behavior of IoT devices under
varying conditions. The enhanced “dataset+script” framework we introduce not only facilitates the
injection of sophisticated, truncated Cauchy distributed attacks but also offers the flexibility to adjust the
characteristics of benign and attack traffic through customizable parameters. This adaptability enables the
creation of a wide array of training scenarios, thereby augmenting the models’ ability to recognize and
respond to diverse attack patterns. To precisely tailor the generated attack traffic, as discussed in chapter 4,
we employ three tunable parameters denoted as k1, k2, and k3. These parameters are instrumental in
defining the location, scale, and maximum packet volume within the truncated Cauchy distribution,
thereby allowing for a granular control over the simulation of attack scenarios.
Building on this foundation, we propose and evaluate four distinct architectural strategies for training
neural network models on IoT devices. These strategies consider all combinations of either using
the correlation information of the IoT devices or not, and either training one central model for all nodes or
training distributed individual models for each IoT device. Specifically, we introduce a) Multiple Models
with Correlation (MM-WC), b) Multiple Models without Correlation (MM-NC), c) One Model with
Correlation (OM-WC), and d) One Model without Correlation (OM-NC). These architectures are designed
to explore the benefits of leveraging correlation data across IoT nodes and the efficacy of centralized versus
distributed learning paradigms. By incorporating correlation information, we enable each IoT node to
consider not only its own traffic data but also the traffic characteristics of peer devices, facilitating a
peer-to-peer sharing mechanism for enhanced situational awareness for DDoS detection in IoT systems.
Furthermore, we conduct a thorough investigation into the performance of these architectures using
various neural network models, including Multi-Layer Perceptron (MLP), Convolutional Neural Networks
(CNN), Long Short-Term Memory (LSTM), Transformers (TRF), and Autoencoders (AEN). This extensive
analysis aims to identify the most effective architecture and neural network model combination for the
detection of DDoS attacks on IoT devices, thereby contributing to the development of more secure and
resilient IoT ecosystems. Notably, the scope of our work is primarily focused on transport layer DDoS
attacks, particularly those based on UDP flood. This is relevant for IoT systems, given that UDP is often
preferred due to its lower overhead compared to TCP. Our emphasis is on DDoS attacks emanating from
IoT networks, which are vulnerable to being used as a Botnet, rather than DDoS attacks targeting IoT
application-layer protocols.
Our simulation results demonstrate that incorporating the correlation information among IoT devices
significantly enhances the detection of DDoS attacks. This is particularly true in scenarios where attackers
disguise their activities to mimic the volume distribution of benign traffic. Moreover, the efficacy of our
proposed detection model has been validated through implementation in a test-bed comprising a cluster of
Raspberry Pis (RPs), underscoring its practical applicability and effectiveness in real-world settings.
Given the vast array of IoT devices susceptible to exploitation in DDoS attacks, leveraging correlation
information across all nodes introduces a considerable challenge due to the extensive number of features
that neural network models are required to process and learn from. In light of this, we have delved into
strategies for the selective inclusion of nodes based on their correlation information, aiming to streamline
the training process for neural network models. Our results indicate that using the Pearson correlation to
select the nodes whose information is used for training and prediction results in good performance in terms
of binary accuracy and F1 score for detecting DDoS attacks, comparable to using the correlation
information of all nodes. This targeted approach to feature selection exemplifies a strategic refinement in
the development of machine learning models for IoT security, paving the way for more efficient and
effective DDoS detection mechanisms.
5.2 Correlation-Aware Models
In this chapter, we present the architectures and neural network models utilized for DDoS attack detection.
Section 5.5 provides a comparative performance analysis of these models. Before delving into the specific
architectures, it is essential to clarify the concept of “correlation information” as it is central to this study.
In the context of DDoS detection from IoT devices, correlation information refers to the interdependencies
and behavioral similarities or dissimilarities in packet transmission volumes across different IoT nodes over
time. By analyzing these correlations, our models can identify anomalous patterns indicative of a
coordinated DDoS attack, which might not be apparent when considering data from individual nodes in
isolation, given our proposed tunable DDoS attack. This approach considers the scenario that nodes
involved in a DDoS attack will exhibit coordinated or correlated behaviors, distinct from their normal,
independent operational patterns. Thus, incorporating correlation information allows our models to
leverage these collective behavioral insights, enhancing the detection accuracy and robustness against
sophisticated attacks that aim to mimic benign traffic patterns. In our study, the sharing and utilization of
correlation information are operationalized through a peer-to-peer mechanism among IoT nodes. At each
timestamp, every node transmits its packet volume data to other nodes in the network. Consequently, each
node, equipped with this aggregated data, analyzes not only its own packet volume but also the volumes
reported by its peers. This collective dataset enables each node to make a more informed, correlation-aware
decision about the presence of a DDoS attack. This method ensures that all nodes are uniformly informed
and capable of identifying potential attacks through a comprehensive analysis of network-wide traffic
patterns, enhancing the system’s overall predictive accuracy and response to potential threats.
5.2.1 Architecture
We propose four architectures for training neural network models, differentiated by their engagement with
node activity correlation and the decision to deploy single versus multiple models. For the architecture
employing a singular model to identify DDoS attacks across all IoT devices, we incorporate a one-hot
encoding mechanism within the training dataset to distinctly identify data pertaining to each IoT node.
Moreover, in strategies that leverage the correlation information of IoT nodes, we record the packet
volumes from all nodes at each instance. This approach ensures that, for every data point, information
about the packet volumes from other nodes supplements the data for the target node. Here we present the
details of each architecture:
• Multiple Models without Correlation (MM-NC): This architecture involves training individual
neural network models for each IoT node, focusing solely on the packet volume of the respective
node over time. The exclusion of correlation information means that each model is optimized for
detecting anomalies in the node’s activity without considering broader network behaviors. This
approach simplifies the training process but may overlook complex attack patterns that exploit the
interconnected nature of IoT devices.
• Multiple Models with Correlation (MM-WC): Similar to MM-NC, this architecture trains
individual models for each IoT node. However, it incorporates the correlation information of nodes’
activities by including the packet volumes of other nodes in the training data. This enables the
models to recognize coordinated attack patterns across the network. The integration of peer-to-peer
shared correlation information enhances detection capabilities by allowing models to leverage
collective network insights.
• One Model without Correlation (OM-NC): In contrast to the multiple models approach, OM-NC
trains a single neural network model to serve all IoT nodes. This centralized model utilizes packet
volume data from each node, treated independently, to detect DDoS attacks. The use of one-hot
encoding facilitates the differentiation of data originating from various nodes. While this unified
model simplifies deployment, it may not fully capture the specific behaviors of individual nodes.
• One Model with Correlation (OM-WC): This architecture extends OM-NC by training a single
neural network model that incorporates correlation information from across the IoT network. The
model uses packet volumes from all nodes, along with their correlation data, to identify potential
DDoS attacks. The centralized approach, supported by peer-to-peer sharing of correlation
information, enables comprehensive analysis of network-wide patterns, significantly enhancing the
detection of sophisticated attacks that mimic benign behaviors. One-hot encoding is incorporated to
distinguish nodes in the training dataset.
Employing correlation information in the MM-WC and OM-WC architectures significantly improves
the models’ ability to predict DDoS attacks. This is especially true in scenarios where attackers orchestrate
the attack using multiple IoT devices, often disguising malicious activities as benign. The strategic use of
correlation data allows for the detection of subtle and coordinated anomalies, providing a robust defense
mechanism against complex DDoS threats. Note that, in correlation-aware architectures like MM-WC and
OM-WC, nodes share correlation information directly through a peer-to-peer approach, eliminating the
necessity for a centralized node or server to gather data and perform DDoS detection inference.
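As an illustration of how the four architectures differ at the input level, the sketch below assembles a per-node feature vector from the peer-shared packet volumes, with a one-hot node identifier appended for the single-model variants; the feature ordering and helper names are illustrative assumptions, not the released implementation.

```python
import numpy as np

def build_features(volumes: np.ndarray, node_index: int,
                   use_correlation: bool, one_model: bool, n_nodes: int) -> np.ndarray:
    """Assemble the input vector for one node at one time step.

    volumes: shape (n_nodes,), the packet volume reported by every node at this
    time step (shared peer-to-peer among the IoT devices)."""
    if use_correlation:
        # MM-WC / OM-WC: the node's own volume plus the volumes of all peers.
        features = volumes.copy()
    else:
        # MM-NC / OM-NC: only the node's own packet volume.
        features = volumes[node_index:node_index + 1]
    if one_model:
        # OM-*: append a one-hot node identifier so the shared model can
        # distinguish which node the sample belongs to.
        one_hot = np.zeros(n_nodes)
        one_hot[node_index] = 1.0
        features = np.concatenate([features, one_hot])
    return features

# Example: correlation-aware, per-node model (MM-WC) input for node 7.
x = build_features(np.random.poisson(5, size=4060), node_index=7,
                   use_correlation=True, one_model=False, n_nodes=4060)
```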
5.2.2 Neural Network Models
For each architecture presented in section 5.2.1, we use five different neural network models to do binary
classification for detecting the DDoS attack on the IoT nodes:
• Multilayer Perceptron (MLP): This is the simplest model of the feed-forward artificial neural
network that consists of one input layer, one output layer, and one or more hidden layers [56]. In
this paper, the input layer is followed by one dense layer with 5 neurons and Rectified Linear Unit
(ReLU) activation. In the MM-WC and OM-WC architectures, the dense layer also has a 30% dropout
rate, while in the MM-NC and OM-NC architectures, the dense layer does not perform dropout. The
output is a single neuron with the Sigmoid activation function.
• Convolutional Neural Network (CNN): CNNs are similar to MLP models but use mathematical
convolution operation in at least one of their hidden layers instead of general matrix multiplication
[58]. In this paper, the model’s input layer is followed by one 1-dimensional convolution layer with 5
filters, kernel size of 3, and ReLU activation function. In the MM-WC and OM-WC architectures, the
CNN layer also has a 30% dropout rate, while in the MM-NC and OM-NC architectures, the CNN
layer does not have a dropout rate. The CNN layer is followed by a 1-dimensional max-pooling layer
with a pool size of 2, and then a flattened layer. Finally, the output is a single neuron with the
Sigmoid activation function.
• Long Short-Term Memory (LSTM): This is a neural network model that can process a sequence of
data and is a suitable choice for time-series datasets [61]. In this paper, the model’s input layer is
followed by one LSTM layer with 4 units and a hyperbolic tangent activation function. In the
MM-WC and OM-WC architectures, the LSTM layer also has an L2 regularization factor of 0.3, while
in the MM-NC and OM-NC architectures, the LSTM layer does not have an L2 regularization. Finally,
the output is a single neuron with the Sigmoid activation function.
• Transformer (TRF): Transformer models are neural networks that rely solely on the attention
mechanism to calculate dependencies between the input and output [65]. This model has
an encoder and decoder, but for the purpose of binary classification, we will only need the encoder
part. In this model, the input layer is followed by a multi-head attention layer with one attention
head, and the size of each attention head for query and key is 1. In the MM-WC and OM-WC
architectures, the multi-head attention layer also has an L2 regularization factor of 0.3, while the
MM-NC, and OM-NC architectures, do not have L2 regularization. The multi-head attention layer is
followed by the global average pooling layer. Finally, the output is a single neuron with the Sigmoid
activation function.
• Autoencoder (AEN): Autoencoders are a type of neural network model that consists of two parts.
The first part tries to encode the input into a compressed representation and the second part tries to
decode the compressed representation into the original input [63]. In fact, this model tries to learn a
meaningful compressed representation of the input. In this paper, the input layer is followed by an
encoder that consists of five dense layers with 256, 128, 64, 32, and 16 neurons with a ReLU
activation function, each followed by batch normalization. The decoder consists of five dense layers
with 16, 32, 64, 128, and 256 neurons with a ReLU activation function, each followed by batch
normalization. Finally, the output dimension will be the same as the input layer. For the binary
classification, the encoder is followed by a dense layer with 8 neurons and a ReLU activation
function, and the output will be a single neuron with a Sigmoid activation function. By using the
AEN model, we essentially train the encoder to encode our input dataset into a latent space. Then,
we design a classification model that gets this latent space as input and predicts whether the latent
features show malicious behavior or not.
Figure 5.1 illustrates the aforementioned neural network models. For correlation-aware models, we
employed regularization (dropout and L2) to mitigate observed overfitting during training. It is worth
mentioning that model parameters were determined via a performance-optimized grid search.
Furthermore, due to the unbalanced nature of our dataset, class weighting was implemented following the
methodology in [111], ensuring proper representation in model training. Class weights and initial layer
biases were defined as:
wn = (pos + neg) / (2 · neg) (5.1)

wp = (pos + neg) / (2 · pos) (5.2)

b0 = log(pos / neg) (5.3)

where wn and wp represent the weights for the negative (not attacked) and positive (attacked) classes,
respectively, b0 represents the initial bias of the layers, and neg and pos represent the total numbers of
negative and positive samples, respectively.

Figure 5.1: Neural network models deployed in correlation-aware architectures
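As a concrete illustration, the following sketch builds the MLP variant described above (one dense layer with 5 ReLU units, 30% dropout for the correlation-aware case, and a sigmoid output) in Keras and applies the class weights and initial bias of Eqs. (5.1)-(5.3); the optimizer, loss, metric, and the placeholder pos/neg counts are assumptions, not the dissertation's exact training configuration.

```python
import numpy as np
import tensorflow as tf

def build_mlp(n_features: int, dropout_rate: float = 0.3, initial_bias: float = 0.0):
    """MLP variant from Section 5.2.2: Dense(5, ReLU) -> Dropout -> Dense(1, sigmoid),
    with the output bias initialised to b0 of Eq. (5.3)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(5, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),  # used in the correlation-aware architectures
        tf.keras.layers.Dense(1, activation="sigmoid",
                              bias_initializer=tf.keras.initializers.Constant(initial_bias)),
    ])

# Class weights from Eqs. (5.1)-(5.2); the pos/neg counts below are placeholders.
pos, neg = 10_000, 490_000
class_weight = {0: (pos + neg) / (2 * neg), 1: (pos + neg) / (2 * pos)}
b0 = float(np.log(pos / neg))

model = build_mlp(n_features=4060, initial_bias=b0)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["binary_accuracy"])
# model.fit(X_train, y_train, class_weight=class_weight, epochs=10)
```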
5.3 IoT Nodes With Constrained Resources
Architectures discussed in section 5.2.1, particularly MM-WC and OM-WC, necessitate using information
from all nodes to train neural network models. This is because the packet volume of all nodes serves as
input for these models. Utilizing these correlation-aware architectures introduces a vast array of input
features for the neural networks, resulting in a significant network overhead for IoT nodes. This becomes
particularly impractical for real-world scenarios characterized by a myriad of IoT devices with limited
resources and computational capabilities. To mitigate this challenge, we suggest each IoT node
incorporates only a subset of the information from other IoT nodes for their neural network model inputs,
enabling data sharing amongst specific IoT nodes in a peer-to-peer manner. As illustrated in figure 5.2,
nodes can either incorporate traffic information from all nodes (figure 5.2a) or from a select group of nodes
(figure 5.2b). This selective approach significantly enhances efficiency during neural network training and
DDoS attack prediction for resource-constrained IoT nodes.
(a) All Nodes (b) Selected Nodes
Figure 5.2: Using all nodes vs selected nodes for sharing information in correlation-aware architectures
We present five heuristic solutions for choosing important IoT nodes to share data with specific nodes
during the neural network model training for the correlation-aware architectures:
• All nodes: This approach leverages the packet volume data from all nodes when training the neural
network models. As a default method discussed in earlier sections, it is anticipated to offer superior
performance. However, this comes at the expense of increased computational demands.
• Pearson correlation: This approach employs the Pearson correlation to assess node activity
correlations based on their transmitted packet volume throughout the day. Specifically, for each node
i, its Pearson correlation with every other node is calculated. Subsequently, the top n nodes exhibiting
the highest correlation with node i are determined. During the training of correlation-aware
architectures for each node, information is sourced only from the top n nodes and node i itself (a
brief sketch of this selection step follows the list).
• SHAP: This method utilizes SHapley Additive exPlanations (SHAP) [112] to identify the most
influential features when deciding if a node is performing the attack. After training a neural network
model using data from all nodes for node i, the SHAP method is employed to identify the top n
features integral to the decision-making process. A new model is subsequently trained, focusing only
on these top n features. Notably, this method demands an initial training phase involving all nodes,
thus, might not be the most practical solution for real-world scenarios, but serves as a comparative
measure against other methods.
• Nearest Neighbor: Here, nodes’ Euclidean distances are used to estimate activity correlations based
on their proximities. Specifically, for each node i, its Euclidean distance to all other nodes is
determined, and the top n closest nodes to node i are selected. Only data from these proximate n
nodes and node i are used when training correlation-aware architectures.
• Random: For node i, n nodes are randomly selected to provide data to train the neural network
models. This process is repeated a set number of times, with the evaluation metrics’ average
calculated across all iterations. Serving as a baseline, the random method’s results can be compared
against the outcomes of the aforementioned solutions.
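A minimal sketch of the Pearson-correlation heuristic referenced above is shown below, assuming a DataFrame with one packet-volume column per node; the data layout and the synthetic example are illustrative rather than the released code.

```python
import numpy as np
import pandas as pd

def top_correlated_peers(volume_df: pd.DataFrame, node: str, n: int) -> list:
    """Return the n peer nodes whose packet-volume time series correlates most
    strongly (Pearson) with the given node; their data is then shared with that
    node when training its correlation-aware model."""
    corr = volume_df.corr(method="pearson")[node].drop(node)
    return corr.sort_values(ascending=False).head(n).index.tolist()

# Example with synthetic traffic for 5 nodes over 1000 time steps.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.poisson(5, size=(1000, 5)),
                  columns=[f"node_{i}" for i in range(5)])
peers = top_correlated_peers(df, "node_0", n=2)
```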
5.4 Real Test-Bed Implementation
In addition to the extensive simulation that we designed, we also provide a real-world emulation of IoT
networks to detect futuristic DDoS attacks by the proposed mechanisms. Using a carefully designed
testbed of 51 Raspberry Pi (RPi) devices, we mimic the complexities of a large-scale IoT system. To the best
of our knowledge, this work is the first to deploy Edge Machine Learning for DDoS detection in a testbed
of such a size, replicating some of the complexities and intricacies of real-world IoT networks. We have
designed the flow-level traffic and complexities of networked devices to closely mirror real-world scenarios.
Furthermore, we introduce real-time inference capabilities by deploying the trained detection model on
each RPi device. This enables each node to autonomously detect DDoS attacks using real-time traffic,
further enhancing the practicality of our framework.
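As a rough sketch of the per-node inference step (the released testbed code is not reproduced here), each RPi keeps a sliding window of its own packet rate together with the rates shared by peer nodes and feeds it to the deployed model; the Keras runtime, model file name, and decision threshold below are assumptions.

```python
import numpy as np
import tensorflow as tf  # assuming the trained LSTM was exported as a Keras model

WINDOW = 10  # nt: number of past time slots the model looks at

# Hypothetical file name for the per-node LSTM/MM-WC model deployed on the RPi.
model = tf.keras.models.load_model("lstm_mm_wc_node.h5")

def detect_attack(window_rates: np.ndarray, threshold: float = 0.5) -> bool:
    """window_rates: (WINDOW, n_features) array holding this node's packet rate
    and the packet rates received from peer nodes over the last nt time slots."""
    x = window_rates.astype(np.float32)[None, ...]          # add a batch dimension
    attack_prob = float(model.predict(x, verbose=0)[0, 0])  # per-node attack probability
    return attack_prob > threshold
```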
We will demonstrate through our testbed the reliable performance of the LSTM/MM-WC framework in
identifying DDoS attacks, even when the attackers employ sophisticated camouflage techniques. The
demonstrated system allows users to experiment with the futuristic DDoS attack and to observe the detection framework's performance in real time through a GUI. Additionally, for those interested in delving deeper, our code and dataset can be accessed at https://github.com/ANRGUSC/ddos_demo.
5.4.1 System Overview
As we will observe in the extensive simulation results discussed in section 5.5, the combination of multiple
models with correlation (MM-WC) architecture along with the LSTM neural network model performs the
best. In order to validate its effectiveness in a real-world environment, we also use the MM-WC/LSTM framework for detecting futuristic DDoS attacks. This model utilizes the traffic information of all nodes
for making the DDoS detection inference. Figure 5.3 illustrates the structure of this model. Each IoT node
is trained with its dedicated LSTM model and conducts inference autonomously. We disseminate the
packet rate (packets exchanged over a fixed time interval) from each individual RPi node to other nodes,
utilizing the correlation characteristics inherent in each node’s activity.
Figure 5.3: Structure of the LSTM/MM-WC model for IoT DDoS Detection
Figure 5.4 presents the workflow of the real testbed. The network architecture can be divided into
three parts: the control server (CS), the IoT nodes, and the victim/idle server. Our design leverages a
Behavior Dataset (BD) sourced from a dataset consisting of real IoT nodes situated in an urban
environment to emulate complex IoT node activities and advanced DDoS threats [26]. Each node maintains
a BD, which serves as a benchmark for simulating benign IoT node activities, consequently obfuscating
DDoS traffic within benign traffic. As we can see in this figure, upon system initialization, the CS takes
multiple actions to establish communication and synchronize operations. First, it broadcasts a UDP packet
containing configuration information to all IoT nodes. After the startup packet is received, each IoT node
creates threads to monitor its network traffic and shares traffic features with peer nodes to enable the
correlation-aware MM-WC architecture. Then, each node establishes a UDP server, awaiting subsequent
control signals from the CS. If an IoT node is scheduled to be active according to its benign behavior, the CS transmits a control signal to the corresponding node to dispatch benign traffic to the idle server. Furthermore, upon
attack initiation, the CS sends a signal containing the attack properties to the involved nodes to perform
the attack on the victim node.
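The control-server signaling described above can be sketched with plain UDP sockets as follows; the port number and JSON payload format are assumptions made for illustration, not the exact protocol of the released testbed code.

```python
import json
import socket

CONTROL_PORT = 5005  # assumed port on which IoT nodes listen for CS signals

def broadcast_startup(config: dict) -> None:
    """Broadcast the startup/configuration packet to all IoT nodes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(json.dumps(config).encode(), ("255.255.255.255", CONTROL_PORT))

def send_control_signal(node_ip: str, signal: dict) -> None:
    """Tell one node to emit benign traffic to the idle server or to start an
    attack on the victim with the given properties (start time, duration, k)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(json.dumps(signal).encode(), (node_ip, CONTROL_PORT))
```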
Figure 5.4: Real testbed network architecture and information flow for evaluating correlation aware architectures. (Figure elements: control server, botnet IoT nodes with their UDP servers, idle server, and victim node; labeled actions include starting UDP servers, dispatching UDP control signals, running network traffic threads on botnets, and assigning attack parameters; traffic flows comprise benign traffic and the UDP flood.)
5.5 Simulation Results
In the following, we evaluate the proposed neural network models and architectures through extensive simulations, analyzing each architecture/model's performance across various scenarios.
5.5.1 Preparing General Training Dataset
This subsection outlines the strategy for creating the labeled general training dataset, encompassing all
features potentially employed by the architectures presented in section 5.2.1. Each data point concerning a
specific IoT node i incorporates the node ID, timestamp, and associated packet volume. Additionally, each
entry records the packet volumes of all nodes. Hence, for any given timestamp, one can perceive the packet
volume relayed via node i and all other nodes. Furthermore, in order to distinguish the records for each
node, we employ one-hot encoding in our dataset. The label indicates whether a node is performing the
attack or not. Capturing packet volume information from all nodes for each sample in the dataset aids in
understanding nodes’ correlation behaviors, which is instrumental in shaping correlation-aware
architectures. As further discussed in section 5.2.1, the proposed architectures differ based on their
utilization of correlation data and one-hot encoding. Table 5.1 showcases a sample of the training dataset,
spotlighting data from nodes 1 and 2 (N1 and N2) and their respective packet volumes (P1 and P2). Note
that, in order to detect the DDoS attack at each timestamp, we also need the information of the previous
time slots to understand the behavior of the node through time. Therefore, the training dataset could be
considered a time-series dataset. Consequently, we stack the preceding nt entries for each data point,
enabling predictions about the attack status of that specific entry. The accompanying Python script offers
flexibility in adjusting nt based on IoT device resources. Larger nt values enrich the neural network’s
comprehension of IoT device behaviors, albeit demanding more computational power for inference.
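The stacking of the preceding nt entries can be sketched as follows, assuming the features for one node are already ordered by timestamp in a NumPy array; the array and function names are illustrative.

```python
import numpy as np

def stack_time_windows(features: np.ndarray, labels: np.ndarray, nt: int = 10):
    """features: (T, F) matrix of per-timestamp features; labels: (T,) attack labels.
    Returns X of shape (T - nt + 1, nt, F), where each sample holds the nt most
    recent time slots, and y holding the label of each sample's last slot."""
    X = np.stack([features[t - nt + 1: t + 1] for t in range(nt - 1, len(features))])
    y = labels[nt - 1:]
    return X, y
```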
Table 5.1: Sample data points in general training dataset used in correlation aware architectures
Node   Time                  N_1   N_2   P_1   P_2   Attacked
1      2021-01-01 23:20:00   1     0     12    9     0
1      2021-01-01 23:30:00   1     0     13    8     1
2      2021-01-01 23:20:00   0     1     9     12    0
2      2021-01-01 23:30:00   0     1     8     13    1
5.5.2 Experiment Methodology
The training of the neural network models is conducted on a central server, not on the IoT devices
themselves. This approach does not put the computational burden on the IoT devices, as they are only
tasked with running the inference to detect DDoS attacks. Furthermore, to mitigate the communication
overhead due to peer-to-peer packet volume sharing among a large number of IoT nodes, we employ the
specific mechanisms discussed in Section 5.3 for selecting a subset of nodes for correlation information
sharing. For this purpose, computation-intensive tasks, such as calculating Pearson correlation, SHAP
values, and nearest neighbor determinations among IoT nodes, are also performed on the central server
during the model training phase. The outcome of this process is a trained model that is deployed on IoT
devices, ensuring minimal computational demand during its operational phase. This architecture ensures
that our approach is not only effective in detecting DDoS attacks but also feasible for implementation in
real-world IoT systems with limited computational resources.
5.5.3 Experiment Setup
For our simulations, we adopted a timestep of 10 minutes, i.e., ts = 600 seconds, to create the benign
dataset following the methodology outlined in section 4.2. Though we used this specific timestep, the
provided script allows for adjustments as per specific requirements. The introduced enhanced dataset
encompasses 4060 IoT nodes. In this subsection, 50 nodes were chosen randomly for the simulations. We
present the activity analysis of these 50 nodes against the full dataset of 4060 nodes. Figure 5.5 presents the
average number of active nodes within the benign dataset against the time of day, considering both the full
dataset and the sampled 50 nodes. It is evident that up to 70% of the nodes are activated around midday.
However, by midnight, the activity diminishes to about 20%. The activity patterns of the 50 randomly
sampled IoT nodes closely mirror those of the complete set.
In Figure 5.6, the probability density function showcases nodes’ mean activity and inactivity patterns
during daytime (8 AM to 8 PM) and nighttime (8 PM to 8 AM). For the entire dataset and the selected 50
nodes, nodes tend to be active for approximately 100 minutes and largely inactive for around 80 minutes
during the day. Nighttime sees nodes active for about 80 minutes and predominantly inactive for around
220 minutes. A key observation from this figure suggests that the sampled 50 nodes decently represent the
broader behavioral tendencies of the entire dataset. Thus, any conclusions drawn from simulations
employing the 50 random nodes can be safely generalized for the full dataset.
Figure 5.5: Comparing the percentage of active Nodes vs time for the whole urban IoT dataset and 50
random IoT devices.
Our dataset encompasses a period of one month, from which we selected data spanning four, one, and
three distinct days for the training, validation, and testing phases of our neural network models,
respectively. As presented in Section 4.3.2, DDoS attacks can be artificially generated through the use of a
dedicated Python script, which accepts specific attack attributes as input parameters. For the purposes of
these simulations, we utilized the following set of attack properties:
• Attack initiation times were scheduled for 2 AM, 6 AM, and 12 PM. This timing ensures that the
models are adept at predicting attacks across different network traffic volumes, including peak
periods, thereby mitigating time-based biases in detection capabilities.
• The duration of attacks was diversified to include 4 hours, 8 hours, and 16 hours. This variation aims
to evaluate the models’ effectiveness in accurately predicting both short-lived and prolonged DDoS
attacks, testing their adaptability across varying attack lengths.
(a) PDF of nodes mean activity in the day (b) PDF of nodes mean inactivity in the day
(c) PDF of nodes mean activity in the night (d) PDF of nodes mean inactivity in the night
Figure 5.6: Comparing the probability density function (PDF) of nodes mean activity/inactivity for day and
night for the whole urban IoT dataset and 50 random IoT devices.
• The proportion of nodes involved in the attack was set to either 0.5 or 1, reflecting a realistic
scenario where attackers may choose not to utilize all compromised IoT devices in a DDoS attack.
This introduces variability in the simulation, mirroring the unpredictability of real-world attacks.
• Attack-specific tunable parameters, denoted as k1, k2, and k3, were employed to modulate the packet
output volume from compromised IoT nodes. These parameters were chosen from a set including 0,
0.1, 0.3, 0.5, 0.7, and 1. For the sake of simplicity, most experiments were conducted with
k1 = k2 = k3 = k. Nevertheless, we also investigated scenarios with varying k values to assess the
models’ detection performance under different attack intensities. A k value approaching 0 indicates
the attacker is mimicking benign traffic patterns, which complicates the detection process. In
contrast, a k value nearing 1 signifies an aggressive attack strategy, marked by a high volume of
packet transmissions, which, while causing significant network disruption, may facilitate easier
detection due to anomalous traffic patterns.
To generate the attack and general training dataset, we considered all possible attack scenarios derived
from the attack properties discussed above. This comprehensive approach ensures that our neural network
models are trained on all possible combinations of these scenarios. For the training dataset, a time window
of nt = 10 was used. This implies that predictions by the neural network models are based on the
information from the preceding 10 time slots for each sample. Once the datasets were generated, they were
shuffled to mitigate any potential prediction biases related to specific times of the day. Training was
performed on all proposed neural network models for 3 epochs, using batch sizes of 32.
5.5.4 Binary Classification Metrics
In this subsection, we evaluate various architectures and neural network models based on binary accuracy,
F1 score, and AUC, particularly focusing on the impact of the tunable parameter k in the testing dataset.
We also analyze the ROC curve for k = 0, providing insights into model performance under scenarios
where DDoS attacks resemble benign traffic. Performance metrics for each k value are averaged over
variations in attack start time, duration, and node attack ratio.
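For reference, the per-k metrics reported below can be computed with scikit-learn as in this minimal sketch (variable names are illustrative):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, roc_curve

def binary_metrics(y_true, y_prob, threshold: float = 0.5) -> dict:
    """Binary accuracy, F1 score, AUC, and the ROC curve for one test setting."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    y_pred = (y_prob >= threshold).astype(int)
    fpr, tpr, _ = roc_curve(y_true, y_prob)  # used for the ROC plots at k = 0
    return {
        "binary_accuracy": accuracy_score(y_true, y_pred),
        "f1_score": f1_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_prob),
        "roc_curve": (fpr, tpr),
    }
```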
Figure 5.7 shows the MM-WC architecture’s performance as detailed in section 5.2. The LSTM model
stands out, achieving F1 scores between 0.82 and 0.87, followed by the TRF model. CNN and MLP models
rank lower, while the AEN model performs the least effectively, with F1 scores ranging from 0.39 to 0.45.
This suggests the AEN's difficulty in understanding benign traffic patterns given the MM-WC architecture's extensive input features. This matter is elaborated further at the end of this subsection.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.7: Compare different neural network models’ performance by using the multiple models with
correlation (MM-WC) architecture
Figure 5.8 presents the performance of the multiple models without correlation architecture, i.e.
MM-NC. As k increases, a general improvement in the performance of all neural network models is
observed. The LSTM model stands out, demonstrating superior performance with an F1 score ranging from
0.35 to 0.85. Following closely, the CNN, MLP, and TRF models exhibit closely matched performance
metrics. Interestingly, AEN demonstrates improved efficacy under the MM-NC (with F1 score between 0.51
and 0.65) compared to MM-WC, especially notable at k = 0 for the F1 score metric. This divergence in
AEN’s performance, especially its enhancement under the MM-NC architecture compared to the MM-WC
architecture, suggests that fewer input features might be more beneficial for the AEN model.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.8: Compare different neural network models’ performance by using the multiple models without
correlation (MM-NC) architecture
Figure 5.9 depicts the performance of the One Model with Correlation (OM-WC) architecture. Notably,
the TRF model outperforms its counterparts, registering an F1 score ranging from 0.82 to 0.85. Subsequent
in rank is the LSTM model, achieving an F1 score in the range of 0.79 to 0.81. Both MLP and CNN models
yield comparable results, with F1 scores spanning from 0.76 to 0.78. In contrast, AEN presents the lowest
performance for the OM-WC architecture. This trend echoes observations from the MM-WC architecture,
reinforcing the notion that AEN struggles with handling a large number of input features. Moreover, the
superior performance of the TRF model suggests its ability to manage complex architectures like OM-WC,
which demands the handling of a significant number of input features, as opposed to other models.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.9: Compare different neural network models’ performance by using the one model with correlation
(OM-WC) architecture
Figure 5.10 illustrates the performance of the One Model Without Correlation (OM-NC). A clear trend
is evident that as k values rise, there is a corresponding improvement in the performance of all the neural
network models. Among them, the LSTM stands out, registering an F1 score that ranges from 0.4 to 0.86.
Subsequently, the CNN, TRF, and MLP architectures demonstrate comparable results, each with an F1 score
spanning from 0.39 to 0.85. In contrast, the AEN consistently lags, rendering it the least effective under the
OM-NC architecture.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.10: Compare different neural network models performance by using the one model without
correlation (OM-NC) architecture
The main takeaways from Figures 5.7, 5.8, 5.9, and 5.10 are the following:
• MM-WC and OM-WC architectures, which include correlation data, notably excel in DDoS detection,
especially at lower k values.
• The LSTM model demonstrates superior performance in MM-WC, MM-NC, and OM-NC
architectures, primarily due to its effective temporal data processing, aiding in identifying
anomalous node behaviors during attacks.
• In the OM-WC architecture, the TRF model outperforms others, highlighting its proficiency in
handling temporal information within complex architectures with extensive input features like
OM-WC.
• MM-NC and OM-NC architectures, lacking correlation data, show inferior performance compared to
their correlation-aware counterparts. However, detection improves with higher k values, implying
that more intense DDoS attacks with larger packet volumes are easier to identify without correlation
data.
• Across most scenarios, the AEN model is less effective, particularly due to its training on benign
traffic over a short four-day period. This limits its capability to understand benign behaviors, most
notably in the MM-WC architecture with its vast input features. Extending the training period is
expected to improve AEN’s performance by exposing it to a wider range of benign traffic patterns.
Since LSTM and TRF perform better than the other models, we also present the performance of using only the LSTM or TRF model across the different architectures.
Figure 5.11 presents the performance of the LSTM model across different architectures. As we can see,
the architectures that do not use correlation information, i.e. MM-NC and OM-NC, show poor performance
with an F1 score around 0.35 for low values of k. Their performance increases with higher values of k with
an F1 score around 0.85 which is similar to the performance of architectures that use correlation
information, i.e. MM-WC and OM-WC. Furthermore, we can also observe that MM-WC architecture is
performing the best in almost all values of k as compared to other architectures with an F1 score between
0.82 and 0.87.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.11: Compare different architectures’ performance by using the LSTM neural network model
Figure 5.12 illustrates the TRF model’s performance across various architectures. Notably, the
architectures not using the correlation information, i.e. MM-NC and OM-NC, exhibit subpar performance
with an F1 score approximating 0.4 for smaller values of k. However, their performance improves for larger
k values, reaching an F1 score of about 0.82. Interestingly, this score is comparable to those architectures
incorporating correlation information, such as MM-WC and OM-WC. Moreover, it is evident that both
MM-WC and OM-WC consistently demonstrate good performance across all k values, as compared to other
architectures.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.12: Compare different architectures’ performance by using the TRF neural network model
In summary, we observe that both the LSTM/MM-WC and TRF/OM-WC architectures yield
comparable, top-tier performance. The choice between them depends on the specific application. We lean
towards the LSTM/MM-WC architecture in this work, primarily due to its design, where individual models
are trained for each IoT node. In contrast, the TRF/OM-WC architecture relies on a singular central model.
This centralized approach poses a vulnerability; if the central model were to malfunction, it could
jeopardize the entire IoT system. Conversely, individual models of the LSTM/MM-WC ensure resilience;
even if some nodes falter in DDoS detection, the overall system remains operational. For these reasons, we
will focus exclusively on the LSTM/MM-WC architecture in the subsequent sections of this chapter.
5.5.5 DDoS Attack Prediction Analysis
In this subsection, we present the performance of the LSTM model combined with the MM-WC
architecture in predicting DDoS attacks in real-world scenarios.
Figures 5.13a and 5.13b depict the attack predictions for cases when k = 0 and k = 1, respectively. We
model a situation where the attacks start at 12 PM—a peak activity hour—and persist for 16 hours. Each
figure presents the ratio of nodes performing the attack as the day progresses. The blue line (“True”)
represents the actual ratio of nodes performing the attack, while the orange ("TP") and green ("FP") lines present the model's correct and incorrect predictions of nodes performing the attack, respectively. Notably,
the model shows remarkable accuracy in both k = 0 and k = 1 cases. Enhanced prediction accuracy for
k = 1 is attributed to attackers dispatching larger packet volumes than when k = 0, making the attacker
more exposed.
(a) k = 0 (b) k = 1
Figure 5.13: Attack prediction vs time on the testing dataset for the case that attack starts at 12 pm for 8
hours over all IoT nodes.
Based on our analysis, employing the LSTM/MM-WC successfully detected all the attack sessions. In
fact, the model could identify abnormal behaviors in at least one of the truly attacked IoT nodes. In a
broader view, 88% of truly attacked nodes were accurately predicted across all attack scenarios.
Furthermore, 92% of nodes that were not under attack were correctly classified as benign. False positive
detections had a duration averaging 5.39 time slots, while false negatives spanned around 4.41 time slots on
average.
For a deeper insight into the detection mechanism’s performance, we created a distinct testing dataset
by modulating the ratio of nodes performing the attack between 0.1 to 1. Figure 5.14 presents the
LSTM/MM-WC model’s F1 score against the ratio of the attacked nodes, i.e. ar. Four scenarios—determined
by the combinations of k = 0/1 and ad = 4/16 hours—are presented. Expectedly, the scenario with k = 1
and ad = 16 hours achieved the highest F1 score, due to the dispatching of a large volume of packets
making detection easier. However, when k = 0 for both ad = 4/16 durations, the model’s performance
suffers—attributed to the attacker’s strategy of blending malicious traffic with benign, especially when
using fewer IoT devices (ar = 0.1). It’s essential to highlight that this specific scenario—low attack ratio
(ar = 0.1) and low attack packet volume (k = 0)—is rather unrealistic in a real-world context since such a
modest attack barely impacts the victim server’s behavior. Although this scenario might not serve as a
robust case for assessing our model, it’s included for the sake of comprehensive analysis.
5.5.6 DDoS Attacks With Different Tunable Parameters
In this subsection, we assess the robustness of the LSTM/MM-WC model in detecting DDoS attacks
characterized by three distinct tunable parameters: k1, k2, and k3. We examined all possible combinations
of these parameters taking values from the set {0, 0.3, 0.5, 0.7, 1}. Figure 5.15 depicts the F1 score
performance of the LSTM/MM-WC model under different parameter configurations. In each subplot within
the figure, the x-axis represents a fixed tunable parameter while the y-axis displays the corresponding F1
Figure 5.14: Evaluating the performance of the LSTM/MM-WC model by considering various ratios of the
nodes under attack (ar)
score. Every data point illustrates the F1 score for one fixed parameter with the other two being varied, and
each subplot presents a regression line complemented by its coefficient value, along with the average F1 score line. From Figure 5.15a, an increase in k1 (the location parameter of the truncated Cauchy distribution) significantly boosts the F1 score. Conversely, Figures 5.15b and 5.15c suggest that elevations in k2
and k3 lead to only marginal improvements. The regression coefficients for k1, k2, and k3 are observed as
0.0463, 0.0085, and 0.0024, respectively. A larger value of k1 is associated with attacks having potentially
higher packet volumes, making attackers more detectable. Across varying tunable parameter scenarios, the
LSTM/MM-WC model consistently showcased robust detection, achieving an F1 score ranging between 0.8
and 0.88.
5.5.7 Real Test-Bed Evaluation
To evaluate our findings in a real environment, we implemented the proposed futuristic DDoS attack and
the proposed detection mechanism of LSTM/MM-WC in a real testbed [110]. The testbed comprises a
Raspberry Pi (RP) cluster of 52 nodes. One node served as the control server (CS), two others as the victim
(a) F1 Score vs k1 (b) F1 Score vs k2
(c) F1 Score vs k3
Figure 5.15: Evaluate the LSTM/MM-WC model performance by varying three tunable DDoS attack
parameters k1, k2, and k3.
and idle server, while the remaining 49 nodes functioned as botnets used by the attacker to target the
victim. The CS governs the behavior of IoT nodes. Each node maintains its own instance of a behavior
dataset, which corresponds to the benign dataset outlined in Section 4.2.
Figure 5.16: Evaluating the proposed DDoS detection mechanism in a real testbed of Raspberry Pi (RP)
cluster.
An overview of our system is illustrated in Figure 5.16. We conducted this real-world
experiment by considering various k values of 0, 2, 4, 6, and 8, attack durations of 10 and 20 minutes, along
with attack participation rates of 50% and 100%. Figure 5.17 presents the F1 scores obtained in our
experiments, revealing a positive correlation between higher F1 scores and increased k values. This not
only confirms that more intense DDoS attacks render botnets easier to detect but also highlights the
robustness of our proposed system. Notably, the LSTM/MM-WC model achieves an F1-score ranging from
0.8 to 0.86, indicating its robust performance in practical scenarios, and validating our simulation results
presented above.
Figure 5.17: Evaluating the performance of the LSTM/MM-WC model in a real test-bed of Raspberry Pis
5.5.8 IoT Nodes with Constrained Resources
Based on the simulation results discussed in Section 5.5.4, it is evident that utilizing the correlation data
from IoT nodes enhances the predictive capability of neural network models. This is especially true when
the attacker disguises the attack within benign traffic, notably when k = 0. However, as highlighted in
Section 5.3, using the correlation data from all nodes is impractical in real-world scenarios due to the large
volume of potentially vulnerable IoT devices. To address this challenge, we proposed five methods in
Section 5.3 to analyze various techniques for incorporating the correlation data from the most important
IoT nodes in DDoS attack prediction. In this subsection, we detail the performance of these methods, using
only five nodes for correlation information, inclusive of the node executing the model’s predictions. In this
simulation, we exclusively consider the LSTM/MM-WC model as our chosen detection method.
One notable method from section 5.3 utilizes the SHAP values to determine the five most influential
nodes from the training dataset. Figure 5.18 illustrates these important nodes for the LSTM/MM-WC model
with respect to node 1159. The left vertical axis lists the feature names, ranked by their significance,
whereas the right axis represents a heatmap corresponding to the feature values. This heatmap presents
high (in red) and low (in blue) feature values for each test sample. The horizontal axis quantifies the
influence of each sample on the model’s prediction, where a higher value represents the occurrence of an
attack and a lower value its absence. It is evident from Figure 5.18 that a higher packet volume for node
1159 correlates with a heightened likelihood of a DDoS attack. Moreover, data from nodes 3167, 33989,
23093, and 19159 have the most important information in predicting the attack on node 1159.
Consequently, for the SHAP method outlined in section 5.3, training the LSTM model on the MM-WC
architecture for node 1159 mandates correlation data solely from nodes 1159, 3167, 33989, 23093, and 19159.
This process is replicated for other nodes to pinpoint their top five influential features.
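As a hedged illustration of this SHAP-based selection (the exact explainer used in this work is not reproduced here), a model-agnostic KernelExplainer can rank nodes by their mean absolute SHAP value; predict_fn is assumed to wrap the trained all-nodes model, for example by reshaping flattened windows before calling it.

```python
import numpy as np
import shap  # SHapley Additive exPlanations

def top_n_nodes_by_shap(predict_fn, background, samples, node_ids, n=5):
    """predict_fn: callable mapping a (batch, n_nodes) array to attack probabilities.
    background, samples: small arrays of flattened training/test rows.
    Returns the n node IDs with the largest mean |SHAP| contribution."""
    explainer = shap.KernelExplainer(predict_fn, background)
    shap_values = np.asarray(explainer.shap_values(samples))
    if shap_values.ndim == 3:           # some SHAP versions return one array per output
        shap_values = shap_values[0]
    importance = np.abs(shap_values).mean(axis=0)  # mean |SHAP| per input node
    ranked = np.argsort(importance)[::-1]
    return [node_ids[i] for i in ranked[:n]]
```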
Figure 5.18: Feature importance analysis based on the LSTM/MM-WC model for a specific node
Figure 5.19 compares the various methodologies presented in section 5.3, using the LSTM/MM-WC
model and prioritizing the five most important nodes. This comparison shows that leveraging the data
from all nodes yields superior performance, reflected by an F1 score ranging from 0.82 to 0.87. Remarkably,
the Pearson correlation outperforms the SHAP, random, and nearest neighbor methods, albeit with a marginal decline in F1 score of at most 5% compared to the all-nodes approach. The SHAP
technique, while effective, lags slightly behind the Pearson correlation. Furthermore, the nearest neighbor
method underperforms relative to other methods. These insights suggest that the Pearson correlation
method offers a promising approach for detecting DDoS attacks across a vast IoT network while relying on
data from only a subset of nodes, with a slight degradation in detection performance.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.19: Compare the performance of different methods for selecting a subset of the nodes for training
the LSTM based correlation aware models
5.5.9 DDoS Detection Performance Analysis over All Nodes In The Dataset
In this subsection, we analyze the performance of the proposed DDoS detection mechanism across all
nodes of the dataset. In this simulation, nodes are randomly distributed into 81 groups, each containing 50
nodes. Within each group, every node has access to the information of its peer nodes. The LSTM neural
network model combined with the MM-WC architecture is employed for this assessment. Moreover, two
distinct methods, All Nodes and Pearson Correlation, which were previously introduced in Section 5.3, are
considered for the training and testing phases of the neural network models. Figure 5.20 depicts the
average performance of the mentioned methods against varying the attack packet distribution parameter,
i.e. k. The results highlight that when the correlation information of all 50 nodes within a group is
employed, the F1 score ranges between 0.81 and 0.85. However, when using the Pearson Correlation
method, there is a slight decline of at most 5% in the F1 score compared to the All Nodes approach.
(a) Binary Accuracy (b) F1 Score
(c) AUC (d) ROC Curve - k = 0
Figure 5.20: Compare the performance of LSTM/MM-WC DDoS detection methods over all nodes in the
urban IoT dataset
5.6 Summary
In this chapter, we introduced a machine learning-based framework designed to detect DDoS attacks
within expanding IoT infrastructures. Utilizing the dataset discussed in Chapter 4, which presents real urban IoT activities and simulates both benign and attack traffic through the truncated Cauchy distribution, we introduced a novel general training dataset. This dataset encapsulates correlation
data amongst IoT nodes, aiming to enhance model training within correlation-aware architectural designs.
We proposed four architectures—MM-WC, MM-NC, OM-WC, and OM-NC—each defined by whether a single model serves all IoT devices or each device trains its own model, and by whether individual node data or correlation information across nodes is leveraged. Furthermore, we analyzed five neural
network models: MLP, CNN, LSTM, TRF, and AEN, assessing their performance across all of the proposed
architectures. The results from extensive simulations revealed superior performance in the LSTM/MM-WC
and TRF/OM-WC configurations, outperforming other architectural and model combinations. Notably,
architectures integrating node correlation data, specifically MM-WC and OM-WC, demonstrated superior
performance. This advantage was particularly evident in the scenarios in which attackers attempt to blend
malicious traffic with benign activities by emulating normal traffic patterns. The efficacy of these models
was further evaluated through deployment on a real-world testbed of Raspberry Pi clusters.
Given the impracticality of employing correlation data from an extensive array of IoT devices due to
potential large-scale DDoS attacks, we proposed heuristic methodologies for the selective inclusion of IoT
nodes in model training. Among the proposed solutions, choosing IoT nodes with the highest Pearson
correlation coefficients for training correlation-aware models was shown to outperform alternative
strategies. This chapter not only advances the field of DDoS detection in IoT systems but also sets a
foundation for future research in employing machine learning techniques within these increasingly
complex networks.
Chapter 6
Graph Convolutional Networks for DDoS Attack Detection in a Lossy
Network
A significant challenge in deploying conventional machine learning models for DDoS attack detection
within IoT systems lies in their intrinsic limitation in managing incomplete data. Given the dynamic and
distributed nature of IoT environments, it is plausible that correlation data shared among IoT
nodes—critical for making informed DDoS detection decisions—might be lost or disrupted during network
transmission. This scenario underscores a critical vulnerability in the effectiveness of conventional
detection models under conditions of data uncertainty. To address this challenge, in this chapter, we
propose robust DDoS detection mechanisms based on Graph Convolutional Networks (GCNs) [32]. GCNs
represent a cutting-edge advancement in deep learning, offering a robust framework for handling data
structured in graphs. By conceptualizing the IoT device network as a graph, where devices are represented
as nodes interconnected by their relationships and data flows, GCNs enable the incorporation of relational
information directly into the detection mechanism. ∗†
∗This chapter is adapted from [30, 31].
†The dataset and source code used in this chapter are available in an open-source repository online at https://github.com/ANRGUSC/ddos_gcn_paper.
6.1 Problem Definition
In previous chapters, our research centered on the detection of tunable, futuristic DDoS attacks within IoT
ecosystems by using correlation-aware neural network models. A critical limitation identified in
conventional neural network frameworks such as MLP, CNN, LSTM, Autoencoders, and Transformers is
their dependency on fully formed input data. The architectures we introduced, namely MM-WC and
OM-WC, necessitate comprehensive node information to accurately predict DDoS attacks emanating from
IoT devices. This requirement persists even when a subset of IoT nodes is selected for intercommunication;
the neural network models mandate complete input data for effective function. However, within the
dynamic and interconnected realm of IoT systems, the potential loss of node correlation information
during network transmission poses significant challenges. This scenario underscores the necessity for a
mechanism adept at compensating for these missing data points in neural network inputs, a capability that
traditional neural networks inherently lack.
Transitioning to the focus of this chapter, we delve into the utilization of Graph Convolutional
Networks (GCNs) [32] for DDoS attack detection in IoT environments. GCNs emerge as a potent solution,
leveraging their robust representation learning capabilities to interpret the complex web of relationships
among IoT devices. By conceptualizing IoT devices as nodes within a graph, GCNs inherently
accommodate the relational dynamics inherent in IoT networks, thus facilitating the detection process. A
standout feature of GCN models lies in their exceptional ability to manage incomplete information,
particularly when edges within the graph—representative of the correlation information among IoT
nodes—experience disruption or alteration over time. This attribute is especially important in the context
of IoT systems, where network connectivity and information flow may be intermittently compromised or
altered, leading to potential gaps in correlation data.
The adaptability of GCNs to handle missing edges contrasts sharply with the limitations of classical
neural network models, which struggle in the absence of complete input datasets. GCNs, through their
graph-based approach, are adept at inferring missing or incomplete relational data, thereby maintaining
the integrity of the detection process even when direct correlation information is unavailable. This
capability is invaluable in ensuring the resilience of IoT systems against DDoS attacks, particularly in
scenarios where attackers may exploit the network’s inherent vulnerabilities or disruptions to camouflage
their activities. By integrating GCNs into our detection framework, we aim to enhance the robustness and
accuracy of DDoS detection in IoT systems, addressing the critical challenges posed by incomplete data and
evolving network dynamics.
In this chapter, we delve into the construction of various graph topologies, leveraging the network of
IoT devices as the foundational framework. We conceptualize IoT devices as nodes within a graph and
introduce three distinct methodologies for edge construction to shape the graph’s structure. These
methodologies encompass: (a) a peer-to-peer approach, where connections are established directly
between IoT devices; (b) a network topology-based approach, reflecting the physical or logical structure of
the IoT network; and (c) a hybrid approach, which combines elements of both peer-to-peer and network
topology strategies. Additionally, we explore various configurations of edge construction, including the
consideration of both directed and undirected edges, to accurately model the dynamics of IoT interactions.
Our investigation rigorously evaluates the efficacy of each proposed graph topology in the context of
DDoS attack detection within IoT systems. The objective is to ascertain the most effective graph topology
for enhancing the accuracy and reliability of DDoS detection mechanisms. Through comprehensive
analysis, we aim to identify optimal configurations that not only facilitate accurate modeling of IoT device
interactions with the least possible overhead but also significantly improve the detection of DDoS attacks. This
evaluation is critical for developing robust defense mechanisms against DDoS attacks, ensuring the
security and resilience of IoT networks.
We make the dataset and the attack emulation script, along with our proposed GCN-based detection
model, available as an open-source repository online at https://github.com/ANRGUSC/ddos_gcn_paper.
6.2 GCN-Based Defense Mechanism
To counteract DDoS attacks efficiently in IoT systems, we propose a graph convolutional network (GCN) [32] based algorithm as a robust detection mechanism, leveraging a graph representation where the
nodes represent IoT devices. This approach, detailed herein, advances from prior chapters where we
initiated a correlation-aware detection strategy grounded on the concept of sharing the traffic information
of IoT nodes with each other, facilitating the detection of futuristic DDoS attacks with low k values. Although
correlation-based models proposed in chapter 5 showed promising results in detecting the futuristic DDoS
attacks, the method showed vulnerability, predominantly failing in scenarios of network information loss or IoT node disconnection. These issues could be addressed by utilizing a GCN-based approach for the
detection mechanism.
Leveraging GCNs for DDoS attack detection in IoT systems offers a strong solution for several reasons. Firstly, GCNs are inherently adept at handling relational data, which is central in network settings,
thus providing a natural and efficient framework to capture the complex patterns and dependencies in the
network traffic data of IoT devices. Furthermore, they can effectively model and maintain the relational structures even when some nodes are missing or data is lost, enhancing the robustness of the detection
mechanism in volatile network environments typical of many IoT systems. Moreover, GCNs facilitate the
extraction of localized and topological features that are pivotal in identifying sophisticated attack patterns,
thereby potentially offering a higher detection accuracy compared to traditional machine learning models.
Lastly, they allow for the integration of diverse node features (e.g., different node types) in the learning
process, thereby enabling a more comprehensive understanding of the underlying patterns and aiding in
the timely and accurate detection of DDoS attacks.
The initial and critical phase in developing a GCN-based DDoS detection model involves the
meticulous definition of the graph structure. It is essential to highlight that our scenario is characterized by
a temporal dimension, with the graph’s features undergoing evolution over time. Consequently, there will
be a continuous update of the graph properties, including both the edges and node features, to reflect the
dynamic nature of the IoT environment. In the subsequent sections, we will delve into the methodology
employed for constructing the graph, detailing the approaches for adapting its structure and attributes to
accommodate the temporal variations inherent in our problem space.
6.2.1 Defining The Graph Nodes
To represent the nodes within our graph, we identify the IoT devices present within the network as the
fundamental nodes. Additionally, when incorporating routers into the graph topology, these, too are
classified as nodes. The process of defining node attributes involves capturing the network traffic
properties of IoT devices, which are subsequently utilized as features for the graph nodes. In scenarios
where routers form part of the graph topology, a distinct approach is employed to specify router features.
Initially, we focus on routers directly connected to IoT nodes. For a given router i, the features of node i
are determined by summing the features of IoT devices connected to router i at each time stamp. For
routers at higher levels within the graph, this methodology is replicated; the features of a higher-level
router are derived by summing the features of directly connected routers, thereby defining the attributes of
the higher-level router in a hierarchical manner.
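A minimal sketch of this hierarchical aggregation, assuming per-node features at one timestamp are stored in a pandas DataFrame indexed by node ID and that a dictionary maps each node to its parent router (both assumptions for illustration):

```python
import pandas as pd

def aggregate_router_features(node_features: pd.DataFrame, parent: dict) -> pd.DataFrame:
    """node_features: rows indexed by IoT node ID, columns are traffic features.
    parent: maps each node ID to the router it is directly connected to.
    Router features are the element-wise sum of their children's features; the
    same call can be applied again to the result for higher-level routers."""
    # groupby with a dict groups index labels by their mapped router ID
    return node_features.groupby(parent).sum()
```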
6.2.2 Defining The Graph Edges
To accurately reflect the dynamics of IoT systems in our graph model, we propose several mechanisms to
define the graph edges. These edges are dynamically established at each time stamp, capturing the network
topology, which represents the interconnectivity of nodes and their capacity for mutual traffic information
exchange.
• Peer-to-Peer: In this strategy, as we can see in figure 6.1, each IoT node in the graph is connected to
n other IoT nodes. The selection of connections for each IoT node to other nodes within the graph is
determined through the following mechanisms:
Figure 6.1: Peer to peer (p2p) graph topology for GCN-based DDoS attack detection in IoT systems.
– Distance-based: In this approach, we create edge connections by aligning each IoT node i
with its n geographically closest nodes in the system, thereby leveraging spatial closeness as
the defining parameter for connection establishment.
– Correlation-based: In this approach, we connect each IoT node i with n other nodes in the
graph that demonstrate the highest Pearson correlation with node i according to their benign
activity behavior, advocating a connection grounded on statistical resemblance rather than
physical proximity.
• Network Topology: As presented in figure 6.2, this approach incorporates routers as nodes within
the graph, recognizing a hierarchical structure where IoT nodes connect to certain routers, which in
turn connect to higher-level routers. This method allows for capturing a graph topology that mirrors
the actual network topology, facilitating information flow within the graph.
Figure 6.2: Network graph topology for GCN-based DDoS attack detection in IoT systems.
• Hybrid: By merging the Peer-to-Peer and Network Topology methods, as presented in figure 6.3, this
approach constructs graph edges to create a composite model for edge definition, leveraging both
direct device connections and hierarchical network structures.
In the process of designing the topology for the graph representation of the IoT system, a critical
decision involves determining the connectivity amongst nodes, specifically the orientation of the edges. To
this end, we explore the implications of employing both directed and undirected edges within the graph.
Figure 6.4 represents the undirected edges of the graph in which IoT nodes and routers get connected to
each other with undirected edges, facilitating information flow in both directions.
When employing directed edges within the graph, a critical consideration is the designation of source
and target nodes for each edge. In the context of the graph topologies previously outlined, particularly
within the Peer-to-Peer construction method, we explore two distinct scenarios for structuring directed
edges:
Figure 6.3: Hybrid graph topology for GCN-based DDoS attack detection in IoT systems.
• Node-to-neighbors: In this configuration, as presented in figure 6.5, we establish connections from
each node i to n other nodes within the graph. These connections are based on spatial proximity or
the highest correlation to node i, with node i acting as the source and the n selected nodes serving
as targets. This approach facilitates the directional flow of information from a central node to its
neighbors.
• Neighbors-to-node: Conversely, as depicted in figure 6.6, this strategy inverts the direction of
information flow by establishing n nodes within the graph as sources and node i as the target.
Connections are again determined by either spatial closeness or the correlation to node i. This setup
models the aggregation of information from multiple nodes towards a single central node.
Figure 6.4: Undirected graph topology for GCN-based DDoS attack detection in IoT systems.
Figure 6.5: Directed node-to-neighbors graph topology for GCN-based DDoS attack detection in IoT systems.
These scenarios provide a flexible framework for directed edge construction, allowing for tailored
information flow models that can be optimized based on the specific dynamics and requirements of the IoT
system under investigation. It is important to note that in employing the Network Topology approach, we
will only utilize undirected edges. This decision stems from the observation that within the Network
Topology framework, using only the directed connection of IoT nodes to routers—or vice versa—does not
facilitate the exchange of correlation information among the IoT nodes. Consequently, the utilization of
directed edges becomes redundant when adopting the Network Topology method, as it does not align with
the primary goal of sharing correlation information across nodes. In scenarios where the Hybrid method is
applied, incorporating both routers and IoT nodes within the graph, the preference also shifts towards
undirected edges to maintain consistency in information-sharing protocols. An alternative strategy for
graph design under the Hybrid method could involve employing undirected edges for router-to-IoT node
Figure 6.6: Directed neighbors-to-node graph topology for GCN-based DDoS attack detection in IoT systems.
connections, while reserving directed edges for Peer-to-Peer connections among IoT nodes. However, this
particular configuration has not been explored in the current research and is marked for future
investigation to assess its viability and effectiveness in enhancing DDoS detection mechanisms within IoT
networks.
It is important to note that while undirected edges facilitate broader dissemination of network traffic
information among IoT devices, thereby enriching the graph’s informational content, such an approach
may not be entirely feasible in real-world applications. Specifically, in the deployment of a distributed GCN
framework for DDoS detection, the extensive flow of information inherent to a graph populated with
undirected edges could lead to substantial network overhead. Consequently, in scenarios characterized by
constrained network resources, opting for directed graphs may be advantageous. Directed graphs can
effectively streamline the flow of information, preventing potential bottlenecks in the network and thus,
maintaining the efficiency and responsiveness of the GCN-based DDoS detection mechanism.
6.2.3 Model Lossy Connections
To accurately emulate the phenomenon of connection losses within the network, we introduce a variable, l,
which specifies the percentage of edges in the graph subject to disconnection at each time stamp as
presented in figure 6.7. It is important to remember that edges represent the potential for information
exchange between IoT nodes and routers. Thus, dropping an edge signifies a failed attempt at information
sharing among IoT nodes and routers, due to its loss in the network. This mechanism is crucial for
simulating a lossy environment within the IoT system, enabling us to assess the resilience of our proposed
GCN model. By incorporating this variable, we aim to evaluate the model’s capability to continue detecting
DDoS attacks originating from IoT devices despite the challenges posed by intermittent network
connectivity.
Figure 6.7: Modeling lossy connections in the graphs for GCN-based DDoS attack detection in IoT systems.
6.2.4 Define GCN Model
As illustrated in Figure 6.8, the GCN model that we developed encapsulates the process of feature
representation learning on graphs. The model operates in three distinct stages:
Figure 6.8: Graph Convolutional Network (GCN) model schematic
• Input: Initially, the graph is composed of nodes with their respective feature vectors, represented as
squares containing smaller rectangles. These feature vectors serve as the initial state for each node
in the graph.
• Graph Convolutional Operations:
– Feature Propagation: Here, each node aggregates features from its neighbors, facilitating the
diffusion of information across the graph. This step is essential for capturing the graph
structure in the feature space.
– Linear Transformation: The linear transformation refines the feature representation for the
downstream tasks by multiplying the features by the weight matrix.
– Nonlinearity: A nonlinear activation function (usually ReLU) is applied to introduce non-linearity, enabling the model to learn complex patterns from the graph data.
• Output: The final output is a graph with nodes that have undergone a transformation, depicted as
circles, which represent binary classifications of the IoT nodes predicting which IoT nodes are
performing the DDoS attack.
6.3 Simulation Results
We evaluate the performance of the proposed GCN-based DDoS detection model by utilizing the various
topologies discussed in the previous section to find the best configuration for detecting DDoS attacks in IoT
systems with lossy networks.
6.3.1 Experiment Setup
In the following sections we will discuss the methods we used for designing our experiments.
6.3.1.1 Dataset
In order to evaluate the performance of the proposed GCN model for detecting DDoS attacks in IoT
systems with lossy networks, we utilize the dataset derived from real event-driven IoT nodes operative in
an urban area, presented in chapter 4. As discussed previously, this dataset contains several features,
including node ID, geographical coordinates (latitude and longitude), time stamps, and packet volumes
registered at respective timestamps.
To enrich the dataset, additional features were engineered, presenting the average packet volume
transmitted from IoT devices over varying time intervals — 30 minutes, 1 hour, 2 hours, and 4 hours —
preceding each timestamp. Furthermore, the packet volume features both in benign and attack states, were
synthesized leveraging a truncated Cauchy distribution following the proposed method in section 4.3.
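These engineered averages can be computed with a per-node rolling mean, as in the sketch below; the 10-minute timestep and the underscore-style column names are assumptions mirroring Table 6.1.

```python
import pandas as pd

def add_average_packet_features(df: pd.DataFrame, ts_minutes: int = 10) -> pd.DataFrame:
    """df: columns NODE, TIME, PACKET, ordered by TIME within each node. Adds the
    mean packet volume over the preceding 30 min, 1 h, 2 h, and 4 h per node."""
    df = df.sort_values(["NODE", "TIME"]).copy()
    for minutes, name in [(30, "PACKET_30MIN_AVG"), (60, "PACKET_1HR_AVG"),
                          (120, "PACKET_2HR_AVG"), (240, "PACKET_4HR_AVG")]:
        window = minutes // ts_minutes          # number of time slots in the interval
        df[name] = (df.groupby("NODE")["PACKET"]
                      .transform(lambda s, w=window: s.rolling(w, min_periods=1).mean()))
    return df
```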
Table 6.1 presents some sample data points in the dataset. It is worth mentioning that the number of
features used for performing the model training/inference could also be adjusted in the script we provided
for the simulations.
Table 6.1: Sample data points in the training dataset used in GCN-based detection model.
NODE   LAT   LNG   TIME                  ACTIVE   PACKET   PACKET 30 MIN AVG   PACKET 1 HR AVG   PACKET 2 HR AVG   PACKET 4 HR AVG
1      10    20    2021-01-01 23:00:00   0        0        0                   0                 0                 0
1      10    20    2021-01-01 23:10:00   1        9        3                   1.5               0.75              0.38
1      10    20    2021-01-01 23:20:00   1        11       6.67                3.33              1.67              0.83
1      10    20    2021-01-01 23:30:00   1        10       10                  5                 2.5               1.25
6.3.2 Attack Mechanism Setup
The configuration of the DDoS attack mechanism contains an orchestration of attack parameters, including
different initiation times at 2 AM, 6 AM, and 12 PM, coupled with varying durations extending for 4, 8, and
16 hours. Furthermore, we set the ratio of IoT nodes participating in the DDoS attack to either 0.5 or 1.
A critical element in this setup is the tunable parameter k, which dictates the aggression level of the
attack, translating to the volume of packets dispatched to the target server. In this experiment, we utilized
six distinct values for k, ranging from 0 to 1, thereby analyzing attacks mirroring benign traffic (at k = 0)
to those that are far more aggressive (at k = 1).
In order to prepare a rich training, validation, and testing dataset, we accounted for different attack
variations, leveraging diverse combinations of attack properties presented above including initiation times,
durations, node participation ratios, and k values.
6.3.3 Detection Mechanism Setup
We constitute the graphs for the GCN-based detection mechanism based on the methodologies presented
in section 6.2. In order to evaluate the Peer-to-Peer method, we connect each node in the graph to four
other nodes, i.e., n = 4, dictated by either spatial proximity or Pearson correlation methods. Furthermore,
in the case of using the Network Topology method, we assume the simple scenario of having one router in
the network and connecting all IoT nodes to it. It is worth mentioning that although we consider only
having one router in the network, the proposed GCN-based defense mechanism utilizing the Network
Topology is capable of handling detection in the case of using hierarchical router topology. Additionally,
we also evaluate the Hybrid method by constructing the graph using both Peer-to-Peer and Network
Topology methods. Finally, for all these combinations, we also evaluate the use of directed or undirected
edges, except in the case of having routers in the graph which we will only use undirected edges.
To realistically simulate scenarios of connection losses prevalent in real-world networks, we utilize the parameter l to dictate the percentage of edge drops at individual timestamps, thereby examining network resilience
under 0%, 30%, and 50% connection loss scenarios.
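A condensed sketch of the Peer-to-Peer edge construction (n = 4) and the connection-loss emulation, written in PyTorch-style edge_index form; this is an illustrative sketch under those assumptions, not the released graph-building script. The same pattern applies to the Correlation-based variant by ranking peers with a Pearson correlation matrix instead of distances.

```python
import numpy as np
import torch

def distance_based_edges(coords: np.ndarray, n: int = 4) -> torch.Tensor:
    """Directed node-to-neighbors edges: node i points to its n geographically
    closest nodes (swap the two rows to obtain neighbors-to-node edges)."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)              # exclude self-loops
    targets = np.argsort(dist, axis=1)[:, :n]   # (N, n) nearest neighbors per node
    sources = np.repeat(np.arange(len(coords)), n)
    edge_index = np.stack([sources, targets.reshape(-1)])
    return torch.as_tensor(edge_index, dtype=torch.long)

def drop_edges(edge_index: torch.Tensor, l: float) -> torch.Tensor:
    """Emulate a lossy network by dropping a fraction l of edges at a time stamp."""
    keep = torch.rand(edge_index.size(1)) >= l
    return edge_index[:, keep]
```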
The GCN model used in this chapter comprises two GCN convolution layers with 1024 hidden channels.
The input dimension to the model is 50 × 5, which represents 50 nodes in the graph, each node having 5
features which have been described in section 6.3.1.1. The node features represent the average packet
volume transferred from each node in various time intervals. The edges of the graph do not have any
features. The output dimension is 50 × 1 since we are doing node-level binary classification. During the
forward pass, the input graph data undergoes two convolution operations with a ReLU activation function
and a dropout layer with a 0.4 probability to help prevent overfitting. The output is obtained via a Sigmoid
activation function, making the GCN model compatible with binary cross-entropy-based loss functions for
training. We used a customized loss function method for the training which addresses class imbalance in
the training dataset by determining class weights from the class frequencies, and then incorporating these
weights into a binary cross-entropy loss function. This strategy aims to counteract the influence of class
imbalance during training, fostering a model that can more robustly handle imbalanced datasets. The GCN
model has been trained for 100 epochs and a batch size of 1024.
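Based on the description above, the model can be sketched in PyTorch Geometric roughly as follows; this is a sketch consistent with the stated hyperparameters (two GCNConv layers, 1024 hidden channels, dropout 0.4, sigmoid output, class-weighted binary cross-entropy), not necessarily identical to the released code.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class DDoSGCN(torch.nn.Module):
    """Two graph-convolution layers with 1024 hidden channels and a sigmoid
    output giving a per-node probability of participating in the DDoS attack."""
    def __init__(self, in_features: int = 5, hidden: int = 1024):
        super().__init__()
        self.conv1 = GCNConv(in_features, hidden)
        self.conv2 = GCNConv(hidden, 1)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=0.4, training=self.training)
        x = self.conv2(x, edge_index)
        return torch.sigmoid(x).squeeze(-1)

def weighted_bce(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Class-weighted binary cross-entropy: weights derived from class
    frequencies to counteract the benign/attack label imbalance."""
    pos_frac = target.float().mean().clamp(1e-6, 1 - 1e-6)
    weights = torch.where(target > 0.5, 1.0 / pos_frac, 1.0 / (1.0 - pos_frac))
    return F.binary_cross_entropy(pred, target.float(), weight=weights)
```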
Finally, in order to comprehensively analyze the detection performance with a high confidence level in
the efficacy of the GCN-based models, we selected ten separate groups of IoT nodes, each containing 50
nodes, to run independent tests, with the subsequent analysis offering 95% confidence intervals for
performance metrics including binary accuracy, F1 score, area under curve (AUC), and recall, thereby
providing a comprehensive evaluation of the model’s performance.
6.3.4 Results Discussion
We analyze the efficacy of the proposed graph convolutional network (GCN) model in detecting DDoS
attacks, particularly within the context of lossy network environments. The performance metrics adopted
for this examination include binary accuracy, F1 score, area under curve (AUC), and recall, systematically
analyzed against variations in the k parameter, which dictates the attack severity by determining the
volume of packets dispatched to the victim server during the DDoS attack.
We present the aggregated performance metrics over diverse attack scenarios in each plot. Each plot
contains various curves portraying the implications of using various graph topologies discussed in section
6.2 and varying degrees of connection loss, represented by the parameter l. The legends, encoded as "XY",
denote the type of Peer-to-Peer topology (X) and the corresponding value of connection loss (Y),
represented by the variable l, which indicates the fraction of edges dropped at each time stamp. At most
three graph topologies are compared in each plot: 1) Distance, 2) Correlation, and 3) No_p2p. The value of
the connection loss at each time stamp, i.e., l, is varied among 0, 0.3, and 0.5.
Figure 6.9 demonstrates the performance of the Peer-to-Peer topology utilizing directed edges and a
node-to-neighbors edge construction approach, across varied attack packet volume distribution parameters
(k). As depicted in Figure 6.9b, for both Distance-based and Correlation-based Peer-to-Peer topologies, the
F1 score escalates with the increment of k. This phenomenon indicates that higher k values, despite
intensifying the attack, concurrently elevate the detectability of the attacker. The Distance-based topology
outperforms the Correlation-based approach, where, in scenarios without connection loss (i.e., l = 0), the
Distance-based methodology achieves an F1 score ranging from 0.78 to 0.85, surpassing the
Correlation-based technique, which scores between 0.67 and 0.81. The resilience of the proposed GCN
model is evident as the F1 score declines with an increase in connection loss (l), with the Distance-based
method showing a maximum decrease of 3% from no connection loss to 50% connection loss, highlighting
its robustness against connection disruptions.
Figure 6.10 presents the performance of the Peer-to-Peer topology with directed edges and a
neighbors-to-node edge construction strategy under various attack packet volume distribution parameters
(k). As illustrated in Figure 6.10b, for both Distance-based and Correlation-based topologies, the F1 score
augments with increasing k values, underscoring the enhanced detectability of more aggressive attacks.
The Distance-based topology exhibits superior performance compared to the Correlation-based strategy.
Notably, in the absence of connection loss (i.e., l = 0), the Distance-based approach records an F1 score
from 0.78 to 0.84, marginally higher than the Correlation-based method’s 0.76 to 0.81. In terms of the
connection loss, the F1 score differential becomes more pronounced with connection loss, where the
Distance-based method’s F1 score decreases by a maximum of 6% with 50% connection loss, reflecting the
Figure 6.9: Performance of the GCN-based DDoS detection model using the Peer-to-Peer topology with a directed graph and node-to-neighbors edges for graph construction. Panels: (a) Binary Accuracy, (b) F1 Score, (c) Area Under Curve, (d) Recall.
GCN model’s resilience. The directed graph utilizing neighbors-to-node edges showcases an enhanced
performance for the Correlation-based algorithm. In particular, for k = 0, the Correlation-based method
achieved an F1 score of 0.67 with node-to-neighbors edges, compared to the 0.76 F1 score it achieved with
neighbors-to-node edges. However, the performance of the Distance-based method
remains very similar across both node-to-neighbors and neighbors-to-node configurations. A particularly
noteworthy finding is the enhanced resilience of the node-to-neighbors approach to connection losses.
Specifically, within the Distance-based method, the degradation in F1 score was limited to a maximum of
3% when transitioning from a scenario with no connection loss to one with 50% connection loss, using
node-to-neighbors edges. This contrasts with a more pronounced reduction, up to 6%, observed in the
neighbors-to-node edges. In conclusion, when employing directed graphs based on the Peer-to-Peer
topology, the Distance-based method not only outperforms alternative approaches in terms of achieving a
superior F1 score but also demonstrates great robustness against connection losses, evidenced by a minimal
decline in F1 score.
Figure 6.10: Performance of the GCN-based DDoS detection model using the Peer-to-Peer topology with a directed graph and neighbors-to-node edges for graph construction. Panels: (a) Binary Accuracy, (b) F1 Score, (c) Area Under Curve, (d) Recall.
Transitioning to undirected graphs, Figure 6.11 presents the Peer-to-Peer topology’s performance with
undirected edges under variable attack packet volume distribution parameters (k). As shown in Figure
6.11b, for both Distance-based and Correlation-based topologies, the F1 score increments with rising k
values, affirming that higher aggression levels in attacks improve detectability. Contrary to directed graph
scenarios, the Correlation-based topology excels over the Distance-based approach in the undirected
configuration, especially when there is no connection loss (i.e., l = 0), achieving F1 scores from 0.85 to 0.89
compared to the latter’s 0.82 to 0.87. This observation suggests that employing undirected graphs,
characterized by enhanced information sharing among IoT nodes, leads to improved performance in DDoS
detection. However, it is crucial to acknowledge that while undirected graphs exhibit superior
performance, they concurrently increase the network overhead due to the augmented sharing of
information between IoT nodes. This elevation in network overhead could potentially result in a network
bottleneck, particularly in scenarios where the IoT nodes operate under conditions of limited bandwidth.
Lastly, Figure 6.12 depicts the comparative performance of both Network and Hybrid topologies
employing the undirected edges approach under various attack packet volume distribution scenarios,
parameterized by k. Curves labeled Correlation or Distance in the legend correspond to the Hybrid method,
while the curve labeled No_p2p represents the Network topology without any peer-to-peer connections. As
illustrated in Figure 6.12b, an increase in k uniformly enhances
the F1 score across all evaluated topologies, suggesting that higher k values, despite intensifying the attack,
facilitate the attacker’s detection. Notably, the Hybrid Correlation-based topology outperforms its
counterparts, achieving an F1 score ranging from 0.89 to 0.91 in scenarios without connection loss (l = 0),
in contrast to the Hybrid Distance-based topology’s F1 score, which spans from 0.88 to 0.90. The absence of
peer-to-peer connections in the Network topology under similar conditions yields an F1 score between 0.86
and 0.88. A universal decline in F1 score is observed with increasing connection loss (l), with the Hybrid
Correlation-based topology demonstrating a resilience of up to a 2% reduction in F1 score from no
connection loss to 50% connection loss. This resilience underscores the proposed GCN model’s robustness
against connection disruptions. It is interesting to observe that the exclusion of peer-to-peer connections
in the Network topology results in a significant F1 score reduction of approximately 26% at k = 0,
Figure 6.11: Performance of the GCN-based DDoS detection model using the Peer-to-Peer topology with an undirected graph for graph construction. Panels: (a) Binary Accuracy, (b) F1 Score, (c) Area Under Curve, (d) Recall.
highlighting the impact of neglecting peer-to-peer edges on DDoS detection, particularly when attackers
mimic benign IoT node behavior. Comparative analysis of the GCN model’s efficacy with undirected
graphs, both with and without a router node, as depicted in Figures 6.12 and 6.11, respectively, reveals
superior performance when incorporating a router node within the graph.
Figure 6.12: Performance of the GCN-based DDoS detection model using the Hybrid/Network topology with an undirected graph for graph construction. Panels: (a) Binary Accuracy, (b) F1 Score, (c) Area Under Curve, (d) Recall.
An essential aspect to analyze within the GCN-based DDoS detection model pertains to the impact of
the graph’s edge density on detection efficacy. Previously, our analysis was confined to a graph structure
incorporating an average of four edges per node. In this simulation, we extend our investigation to
encompass graphs varying from one to forty-nine edges per node. Recall that, given the simulation’s
framework of fifty nodes, attaining forty-nine edges per node results in the formation of a fully connected
graph. Figure 6.13 presents the GCN-based DDoS detection model’s performance, employing a
Peer-to-Peer topology as the foundational structure for the undirected graph, whilst assessing the influence
of edge density on node performance. As we can see in Figure 6.13b, the F1 score initially increases as
the edge count per node transitions from one to four, signifying enhanced detection capability. Conversely,
a subsequent escalation in edge quantity per node marginally diminishes the F1 score, substantiating the
fact that an excessive number of edges fosters feature over-smoothing across nodes, thereby impairing the
GCN model’s expected performance.
The main key takeaways from these simulations are the following:
• Across all graph topologies, the performance of the GCN model improves with increasing values of k.
This observation underscores the principle that higher k values, despite leading to more aggressive
attacks, enhance the detectability of the attackers.
• For all graph topologies, connection losses reduce the detection F1 score.
• Among the evaluated topologies, the Hybrid Correlation-based topology with undirected edges
emerged as the most effective in detecting DDoS attacks originating from IoT devices, achieving an
F1 score range of 0.89 to 0.91. Notably, this topology exhibited the minimal reduction in F1 score due
to network connection losses, with a maximum decrease of only 2% under a 50% connection loss
scenario.
Figure 6.13: Performance of the GCN-based DDoS detection model using the Peer-to-Peer topology with an undirected graph for graph construction, evaluated for varying numbers of edges per node. Panels: (a) Binary Accuracy, (b) F1 Score, (c) Area Under Curve, (d) Recall.
• Within directed graph configurations utilizing the Peer-to-Peer topology, the Distance-based method
incorporating node-to-neighbor edges achieved the highest F1 score, ranging from 0.78 to 0.85. It is
important to note that directed graphs, owing to their reduced network overhead for sharing IoT
device correlation information during GCN predictions, are preferable under network constraints
affecting IoT devices.
• Employing the Network topology without peer-to-peer connections significantly compromises the
GCN model's performance in the presence of connection losses. This effect is particularly evident
at k = 0, where a 50% connection loss led to a dramatic F1 score reduction of approximately 26%,
highlighting the critical importance of peer-to-peer connections in maintaining detection accuracy.
• In terms of edge density, we observed that having too few or too many edges per node results in
lower performance. In our simulations, using four edges per node in a graph of 50 nodes yielded the
best performance.
6.4 Summary
In this chapter, we proposed a GCN-based model for the detection of DDoS attacks within IoT
environments. Our investigation highlighted the adaptability of GCNs in managing the complex relational
dynamics of IoT devices, conceptualized as nodes within a graph, particularly under conditions of
incomplete information. We proposed a robust detection framework that leverages the inherent strengths
of GCNs to infer missing or incomplete relational data, thereby maintaining detection integrity even
amidst network disruptions.
We introduced various methodologies for constructing graph topologies, including Peer-to-Peer,
Network Topology-based, and Hybrid approaches, to model the IoT system. Our proposed GCN-based
detection mechanism builds upon the premise of sharing traffic information among IoT nodes to facilitate
the detection of tunable futuristic DDoS attacks.
Our experimental setup, utilizing a dataset derived from real event-driven IoT nodes, was designed to
simulate a range of DDoS attack scenarios. We examined the effects of different graph construction
methodologies, including directed and undirected edges, on the model’s detection capabilities. The
resilience of our GCN model was put to the test in environments characterized by lossy connections, where
we introduced a variable to simulate connection losses within the network.
We extensively analyzed the performance of all proposed graph topologies to find the best
configuration for detecting DDoS attacks in IoT systems with lossy networks. We observed that the model’s
performance improved with higher values of the k parameter, which indicates the volume of packets
dispatched during an attack. The Hybrid Correlation-based topology with undirected edges emerged as the
most effective configuration, showcasing minimal performance degradation even in scenarios of significant
connection loss. Conversely, the exclusive use of a Network topology without peer-to-peer connections
markedly diminished the model’s efficacy, particularly when the attacker mimicked benign traffic patterns.
In summary, our findings underscore the potential of GCNs in enhancing DDoS detection within IoT
systems, especially in the face of network disruptions and incomplete data. The adaptability of our
proposed model to various graph topologies and its resilience against connection losses affirm the viability
of GCNs as a cornerstone of future DDoS defense mechanisms in IoT environments.
Chapter 7
DDoS Detection Reasoning With Large Language Models
In this chapter, we propose mechanisms for providing reasoning and explanation about the IoT network
traffic properties. We utilize Large Language Models (LLMs) [33] to both detect and reason about the IoT
network traffic properties and evaluate their performance. This chapter highlights the potential of LLMs to
enhance IoT network security by offering a novel method for detecting and explaining network traffic
anomalies. ∗†
7.1 Problem Definition
Recent advancements in artificial intelligence (AI) have prompted the development of innovative
technologies such as OpenAI's ChatGPT, one of the largest large language models (LLMs) [33]. These
models have demonstrated remarkable performance in a variety of natural language processing (NLP) tasks,
including language translation, text summarization, and question answering, given that they have been
pre-trained on enormous quantities of text data [113]. Due to their remarkable model parameterization, data
analysis and interpretation, scenario generation, and model evaluation capabilities, LLMs such as ChatGPT
play a vital role in software development, education, healthcare, and even the environment [114, 115, 116].
∗This chapter is adapted from [34].
†The dataset and source code used in this chapter are available in an open-source repository online at https://github.com/ANRGUSC/llm_cybersecurity.
In the previous chapters of this dissertation, we delved into the challenges associated with
detecting DDoS attacks within IoT systems, employing a variety of machine learning models for this
purpose. Building upon this foundation, the current chapter seeks to illuminate the potential role of Large
Language Models (LLMs) within the cybersecurity domain, with a special emphasis on their application in
the realm of DDoS attack detection in IoT systems. This exploration is not merely a comparison of LLMs’
detection capabilities relative to those of conventional neural network models; rather, it extends to an
in-depth analysis of the LLMs’ unique ability to provide explanatory insights into the detection outcomes.
The use of LLMs in this context is based on their advanced natural language processing capabilities,
which enable them not only to identify the occurrence of DDoS attacks but also to explain the underlying
reasons for these detections. This dual functionality represents a significant leap forward in cybersecurity
measures for IoT systems. By integrating LLMs into the detection framework, network administrators are
empowered with a tool that goes beyond mere identification; it offers an understanding of whether
network traffic is benign or indicative of a DDoS attack, and it provides detailed information about the
characteristics of the traffic in question.
Utilizing OpenAI’s GPT-3.5, GPT-4, and Ada models, we will assess LLMs’ capabilities in identifying
DDoS threats across two distinct datasets: CICIDS 2017 [73] and the more complex Urban IoT Dataset
presented in chapter 4. By supplying context in a few-shot manner or through fine-tuning, LLMs can analyze
network data, detect potential DDoS attacks, and provide insights into their reasoning. Our evaluations
revealed that on the CICIDS 2017 dataset, few-shot LLM methods with only 10 prompt samples approached
an accuracy of 90%, whereas fine-tuning with 70 samples achieved about 95%. On the challenging Urban
IoT Dataset, few-shot techniques attained a 70% accuracy with only 10 samples in the prompt, while
fine-tuning reached nearly 84% accuracy. When compared to a multi-layer perceptron (MLP) model trained
with a similar number of samples, LLMs outperformed the MLP. Notably, LLMs demonstrated the ability to
articulate the basis of their DDoS detections in few-shot learning and showed great potential. However,
they were prone to hallucination in the fine-tuning method.
The importance of this capability cannot be overstated. In the complex and dynamic ecosystem of IoT,
where the sheer volume and variety of devices can obscure the origins and nature of network threats, the
ability to not only detect but also reason about DDoS attacks provides critical insights. These insights
enable administrators to make informed decisions about safeguarding the network, enhancing the overall
security posture of IoT systems. Through a comprehensive evaluation of LLMs’ performance in detecting
and reasoning about DDoS attacks, this chapter aims to highlight how these models can serve as an
invaluable asset in the ongoing effort to secure IoT environments against the evolving threat of DDoS
attacks.
7.2 LLM-Based DDoS Detection
In this study, our primary approach employs both few-shot and fine-tuned Large Language Models (LLMs)
for the detection of DDoS attacks. This section offers a comprehensive feasibility analysis on the efficacy of
providing limited context to LLMs in the few-shot approach or leveraging fine-tuned LLMs for DDoS
attack detection. Furthermore, we explain the methods for selecting optimal input data as context and
provide guidelines on training the fine-tuned model using specific architectures.
7.2.1 Few-shot LLM
Given the extensive pre-training of LLMs and their proficiency in reasoning from language-based data, our
aim is to evaluate their performance in a few-shot setting for detecting DDoS attacks. We hypothesize that
LLMs could draw inferences from minimal data, relying primarily on the semantic content presented. The
constrained context size inherent to LLMs does not pose significant challenges in a few-shot context.
OpenAI’s research has already highlighted the potency of LLMs in few-shot learning [117], further
strengthening our inclination towards this approach. In the following, we explain the methods we used for
the few-shot LLM approach:
• LLM Random: In this approach, we utilize the gpt-3.5-turbo model via the OpenAI API, executed
from a Python script. We introduce the model to n random samples of few-shot data before
prompting it to classify an unlabeled sample as either "Benign" or "DDOS". We varied n between 0
and 70 to observe performance variations as the model is exposed to increasing amounts of data. We
have termed this methodology “LLM Random”.
• LLM Top K: A subsequent strategy involved the establishment of a Pinecone index containing every
labeled sample from the training data. During inference on a specific test data sample, we retrieved
the top k training data samples for each label from Pinecone. These samples then served as the
labeled examples in the prompt context. By focusing on the “most relevant” data subset, this method
effectively addresses the challenge posed by restricted context lengths; a minimal sketch of this retrieval step is shown below.
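The implementation described above used a Pinecone index for this retrieval step; to avoid depending on that service's API, the sketch below substitutes a simple in-memory cosine-similarity search over numeric feature vectors. It returns, for each label, the indices of the k training samples most similar to the test sample, which are then formatted as the labeled examples in the prompt. All names are illustrative.

import numpy as np

def top_k_per_label(query, train_x, train_y, k=10):
    # For each label, return the k training rows most cosine-similar to the query sample.
    normalize = lambda a: a / (np.linalg.norm(a, axis=-1, keepdims=True) + 1e-9)
    sims = normalize(train_x) @ normalize(query)        # similarity of every training row to the query
    picks = {}
    for label in set(train_y):
        idx = [i for i, y in enumerate(train_y) if y == label]
        idx.sort(key=lambda i: sims[i], reverse=True)   # most similar first
        picks[label] = idx[:k]
    return picks

# Usage: the selected rows are rendered as labeled example lines in the few-shot prompt, so the
# context contains only the "most relevant" samples for the test row at hand.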
7.2.2 Fine-tuning LLM
LLMs are initially pre-trained on vast and diverse public datasets, enabling them to generate responses to a
wide array of queries [118]. For specific tasks, fine-tuning these pre-trained LLMs with smaller, task-centric
datasets can notably elevate their performance and response precision. In our research, we focus on
fine-tuning OpenAI’s Ada model to enhance its capacity to understand and assess the traffic data from IoT
devices, and to predict with greater accuracy whether these devices face DDoS attacks.
In this approach, we explore the performance of a fine-tuned Ada model in detecting DDoS attacks
when exposed to only a limited data subset. This method stands in contrast to the gpt-3.5-turbo strategies
explained in section 7.2.1, i.e., LLM Random and LLM Top K: rather than presenting the training data
within the context at inference, the model undergoes fine-tuning on a pre-selected data subset before
inference. The training process involves pairs of prompts and responses, where each prompt represents an
unlabeled training data sample, and the response is its associated label.
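As an illustration, the sketch below prepares such prompt/response pairs in the JSONL format used by OpenAI's legacy fine-tuning endpoint for base models such as Ada (one JSON object per line with "prompt" and "completion" fields); the feature formatting, separator, and file name are assumptions.

import json

def to_prompt(features):
    # One unlabeled sample per prompt: feature names before values, pipe-separated (see section 7.2.3).
    return " | ".join(f"{name}: {value}" for name, value in features.items()) + " ->"

def write_finetune_file(samples, labels, path="ddos_finetune.jsonl"):
    # Each line pairs an unlabeled training sample (prompt) with its label (completion).
    with open(path, "w") as f:
        for feats, label in zip(samples, labels):
            record = {"prompt": to_prompt(feats), "completion": " " + label}
            f.write(json.dumps(record) + "\n")

# The resulting file is uploaded to the fine-tuning API, and the fine-tuned model is then
# queried with the same prompt format at inference time.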
7.2.3 General Prompt Engineering
In general, over several tests, certain additions to our prompting seemed to yield better results, so they
were used when collecting results. These include the following (a short sketch applying them appears after the list):
• Writing each feature’s name before its value on every row: Instead of presenting the rows in tabular
form, in each row, each feature’s label is repeated before its value (e.g. Destination Port: 80).
• Using specific strings as separators and explaining their use in the prompt: For example, each feature
is separated by a pipe symbol, and each row is separated by a new line. The training data and the test
prompt are separated by three consecutive # symbols. All of these symbols are explicitly defined at
the beginning of the prompt so that the model understands their use as separators.
• Asking the model to explain its reasoning based on the data before outputting its predicted label.
This allows the model to output observations of the data and then “reason” on these observations
before outputting a prediction. With the inverse approach, the model tended to pick an output and
then hallucinate reasoning for its output, often lying about the data.
• Asking for the output to follow a specific format every time. For example, in the prompts, we told
the model to “surround the predicted label with ‘$$$’ on each side”. This made it more likely for the
model to output a prediction as opposed to before where it occasionally refused to make a prediction.
Giving it a specific format to follow seems to ensure that a prediction is made because it attempts to
follow the format. Another benefit of including this in the prompt is that it facilitates programmatic
extraction of the predicted label, as well as making the location of the prediction clear within the
response.
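The sketch below applies these conventions together: named features separated by pipes, rows separated by new lines, "###" between the labeled examples and the test row, a request for reasoning before the label, and extraction of the label surrounded by "$$$". The exact instruction wording is illustrative, not the prompt text used in the experiments.

import re

SEPARATOR_NOTE = ("Features are written as 'name: value' and separated by '|'. Rows are separated by "
                  "new lines. Three '#' symbols separate the labeled examples from the row to classify.")

def build_prompt(labeled_examples, test_row):
    instructions = ("First explain your reasoning based on the data, then output the predicted label "
                    "surrounded by '$$$' on each side.")
    return "\n".join([SEPARATOR_NOTE, *labeled_examples, "###", test_row, instructions])

def extract_label(response):
    # Pull the predicted label out of a response such as "... $$$Benign$$$".
    match = re.search(r"\$\$\$(.*?)\$\$\$", response)
    return match.group(1).strip() if match else None

print(extract_label("The values match the benign examples, so: $$$Benign$$$"))  # -> Benign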
7.2.4 Neural Network Model
To verify whether LLMs have an advantage over conventional neural network models in DDoS attack
detection, we trained a basic Multi-Layer Perceptron (MLP) [56] model as a benchmark on the identical
few-shot tasks.
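A sketch of such a baseline is given below using scikit-learn; the hidden-layer size (matching the 20-neuron baseline used later for the urban IoT dataset) and the other hyperparameters are illustrative rather than the exact benchmark configuration.

from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def train_mlp_baseline(train_x, train_y):
    # Few-shot baseline: a small MLP fit on the same handful of samples given to the LLM as context.
    model = make_pipeline(
        StandardScaler(),  # feature scaling matters with so few training samples
        MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0),
    )
    model.fit(train_x, train_y)
    return model

# Usage: accuracy = train_mlp_baseline(X_hints, y_hints).score(X_test, y_test)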
7.3 Simulation Results
We present the results of our rigorous testing of LLMs across various tasks and datasets, including binary
DDoS detection and reasoning. Our comparison of different methods
demonstrates the significant effectiveness of LLMs on these tasks and highlights their enormous potential
for future applications.
7.3.1 Datasets
7.3.1.1 CIC-IDS 2017 Dataset
The CIC-IDS2017 dataset [73] encompasses both benign and common attacks, mirroring real-world data
accurately. Our research focuses on the “Friday-WorkingHours-Afternoon-DDOS" pcap file from this
dataset, which comprises samples of labeled data. Each sample is characterized by 85 features and a label
indicating either “Benign” or “DDoS”. Given the constraints imposed by the limited context size of LLMs,
we have updated the dataset to include only 4 features per sample. This reduction leverages findings from
prior research [119], which identified the most informative features for this task based on information gain.
Consequently, our study utilizes the following features: Packet Length Standard Deviation, Total Length of
Backward Packets, Subflow Backward Bytes, and Destination Port. This refinement enables the training of
models on a larger dataset within the confines of LLMs’ context length limitations. The primary objective
of this selection process is to preserve features that are critical for the classification task and possess
significant linguistic interpretability for model inferences. Employing the four most significant features has
facilitated the use of up to 70 samples in the few-shot approach.
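A minimal pandas sketch of this feature reduction is shown below; the CSV file name and the exact column spellings vary across published copies of the dataset (some have leading spaces in the headers), so the names used here are assumptions.

import pandas as pd

KEEP = [
    "Packet Length Std",             # packet length standard deviation
    "Total Length of Bwd Packets",
    "Subflow Bwd Bytes",
    "Destination Port",
    "Label",                         # "BENIGN" or "DDoS"
]

def load_reduced(path="Friday-WorkingHours-Afternoon-DDos.csv"):
    df = pd.read_csv(path)
    df.columns = df.columns.str.strip()   # published CSVs often carry leading spaces in headers
    return df[KEEP]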
7.3.1.2 Urban IoT DDoS Dataset
In this study, we extended our evaluation of LLM performance to the tunable futuristic DDoS attack model,
as introduced in chapter 4. This dataset poses a greater challenge for reasoning and classifying DDoS
attacks, primarily due to the introduction of a tunable parameter, k. With lower k values, the attacker’s
strategy closely mimics benign IoT node behavior, significantly complicating detection efforts.
Each sample in the training dataset documents the node ID, timestamp, packet volume transmitted by
the node over a 10-minute span, and the average packet volume across intervals ranging from 30 minutes
to 4 hours. To capture the correlation information among IoT nodes, we also record the packet volumes of
all nodes within each sample. Thus, for every timestamp in the training dataset, we acquire comprehensive
packet transfer data for node i and its counterparts, facilitating a robust analysis of network behavior. Each
dataset sample is subsequently labeled to indicate whether it is associated with a DDoS attack. The dataset
that we used for our simulations contains information on 50 IoT nodes.
7.3.2 Simulation Results for CIC-IDS2017 Dataset
7.3.2.1 DDoS Detection Performance
Figure 7.1 illustrates the comparative analysis of five distinct methodologies applied to few-shot learning,
fine-tuning, and Multilayer Perceptron (MLP) utilizing the CIC-IDS2017 dataset [73]. This comparison
focuses on binary accuracy relative to the number of hints (samples) employed within the context of the
few-shot approach. For both MLP and fine-tuning scenarios, the term ’hints’ denotes the number of
samples utilized for MLP training and LLM fine-tuning, respectively. The LLM Top K strategy demonstrated
superior performance across most few-shot scenarios, consistently outperforming MLP-based approaches.
It is important to note that the simulation maintains an equal sample count for the few-shot context and
MLP training, ensuring a balanced comparison between the few-shot techniques and the MLP model. The
fine-tuned LLM model initially exhibited the poorest performance, only surpassing alternative methods
upon utilizing 40 data samples. This analysis further reveals that fine-tuning an LLM with 70 samples
achieves the best accuracy rate of 95%, whereas the LLM Top K method attains a 90% accuracy with merely
10 samples. These findings suggest that while fine-tuning an LLM offers enhanced performance over
few-shot methodologies, it necessitates a substantial volume of training data to reach its full potential.
Figure 7.1: Performance evaluation of the few-shot/fine-tuned LLM and MLP methods on the CIC-IDS2017
dataset.
7.3.2.2 DDoS Detection Reasoning
Table 7.1 presents some samples of the DDoS detection reasoning obtained from the few-shot LLM Top
K and fine-tuned LLM methods. An important observation from this comparison is that the LLM
prompting methods tended to produce interesting and useful explanations behind their predictions, as
well as explicitly stating their lack of confidence in certain predictions. In contrast, the
fine-tuned model, despite answering correctly more often, was more prone to adding hallucinations that
made little sense after its answer such as the ones shown in table 7.1.
Table 7.1: Evaluating the LLM capability for DDoS detection reasoning/explanation using the CIC-IDS2017
dataset
Prompt Destination Port: 53 | Packet Length Std: 104.5003987 | Total Length of Bwd
Packets: 232 | Subflow Bwd Bytes: 232
Few-shot LLM Top K
Based on the labeled data provided, it’s noticeable that the benign behavior usually
occurs when the destination port is 53, the packet length standard deviation is around
104.5, and both the total length of Bwd packets and Subflow Bwd bytes are less than 240.
Given that the values in the unlabeled row match these characteristics, it is probable
that this unlabeled row represents benign behavior.
Hence, the prediction would be:
Destination Port: 53 | Packet Length Std:
104.5003987 | Total Length of Bwd Packets: 232 | Subflow Bwd Bytes: 232 |
Label: $$$Benign$$$
Fine-tune LLM
"text": " BENIGN BATCH LINES
u00d7 BENIGN BENIGN BENIGN BENIGN BEN"
"text": " DDOS BENIGN SOFTWARE BIRD STAYS ON OBSOFTWARE"
"text": " BENIGN BOS DDOS Advisory ID: peek@benigndos."
7.3.2.3 GPT-4 LLM Model
Because of the prohibitive cost of GPT-4, we ran only a single few-shot test with that model, based on the
top-k approach with k = 20 samples, as we observed that a value of k in this range produces the best
results with GPT-3.5. This experiment yielded an accuracy of 0.92 and an F1 score of 0.93. Looking more
closely at some of the incorrect predictions it made, GPT-4 justified its answer by
correctly pointing out that the training data had a similar sample to the one it was predicting for both
labels, and saying that because of this it was unable to make a real prediction and would choose a label
arbitrarily. In this case, it was unable or unwilling to take into account the fact that there were more
identical samples of one label than the other, so it struggled with weighting the frequency of certain
features in the training data it was shown.
7.3.3 Simulation Results For Urban IoT DDoS Dataset
In this subsection, we analyze the DDoS detection and reasoning performance of the fine-tuned and
prompt-engineered LLMs and the MLP model on the urban IoT DDoS dataset. The performance of
these models is shown in terms of their binary accuracy versus the attack parameter k.
7.3.3.1 DDoS Detection Performance
In this section, we leverage the GPT-3.5 model to analyze the Urban IoT DDoS dataset, with the objective of
identifying the presence of DDoS attacks emanating from IoT devices. Our methodology incorporates a
few-shot learning paradigm utilizing LLMs, specifically tailored to process the correlation information
across IoT nodes comprehensively. Given the dataset’s extensive feature set, we evaluate two distinct
scenarios: one where feature names are included alongside their corresponding values (referred to as
“tagged data”), and another where feature names are omitted (referred to as “untagged data”). Subsequent
to the prompt preparation, we perform a comparative analysis of the DDoS detection capabilities of GPT-3.5
under both tagged and untagged conditions. It is imperative to note that the dataset employed in our
few-shot learning approach maintains a balance between benign and DDoS samples, ensuring a balanced
representation for a robust evaluation.
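The sketch below illustrates how a single timestamp could be rendered in the tagged and untagged forms, following the row layout shown later in Table 7.2; the dictionary structure and function name are illustrative, and the example values are rounded.

def render_row(time, nodes, target, tagged=True):
    # Format one timestamp for the prompt; nodes[i] holds the packet-volume features of node i.
    parts = [f"Time: {time};"]
    for i, feats in enumerate(nodes):
        if tagged:
            body = ", ".join(f"{name}: {value}" for name, value in feats.items())
        else:
            body = ", ".join(str(value) for value in feats.values())   # untagged: values only
        parts.append(f"node {i}: {body};")
    parts.append(f"Predict node {target}")
    return " ".join(parts)

node0 = {"packet": 260, "packet_30_min": 243.3, "pkt_1_hr": 253.8, "pkt_2_hr": 276.8, "pkt_4_hr": 271.3}
print(render_row(7, [node0], target=0, tagged=True))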
To construct the prompt for the few-shot methodology, we incorporate between 2 to 10 samples within
the prompt. Binary accuracy serves as the metric for assessing the performance of the few-shot LLM
method. As depicted in Figure 7.2, an increase in the number of samples within the context correlates with
a significant enhancement in GPT-3.5’s ability to detect DDoS attacks. Specifically, performance metrics
reveal that tagged data exhibits an improvement in accuracy, escalating from 0.51 to 0.71, in contrast to
untagged data, which demonstrates an accuracy range from 0.39 to 0.55. This experiment underscores that
with a mere 10 samples incorporated into the prompt, the model’s accuracy peaks at 0.71. Considering the
Urban IoT dataset’s extensive feature set and the context limitation inherent to GPT-3.5, our analysis was
Figure 7.2: Performance evaluation of the few-shot LLM and MLP models using urban IoT dataset.
constrained to a maximum of 10 samples within the few-shot approach. The outcomes of this experiment
suggest a positive correlation between the number of context samples—particularly tagged data—and the
model’s detection performance, indicating the potential for further accuracy improvements with an
increased sample count in the context.
In our investigation, we further assess the fine-tuned LLM performance for detecting DDoS attacks
within the urban IoT dataset. The fine-tuning process utilized OpenAI's Ada model. Recall that utilizing
the correlation information of all IoT nodes for DDoS detection results in numerous input features.
Because of the budget limitations associated with the prohibitive cost of fine-tuning, we only choose
5 IoT nodes in the system and the corresponding samples with attack volume parameter k = 0.5, a packet
volume that is neither too easy nor too difficult to detect as an attack. For comparative analysis, we
employ a baseline MLP model equipped with a hidden layer of 20 neurons.
The fine-tuning of the LLM was executed incrementally, with a sample intake range from 300 to 900.
As illustrated in Figure 7.3, the LLM’s performance, pre-trained on vast corpora of data, surpasses the MLP
in terms of binary accuracy, even with a sample size below 1,000. Remarkably, the LLM, fine-tuned with
Figure 7.3: Performance evaluation of the fine-tuned LLM and MLP models using urban IoT dataset.
merely 900 samples, achieves an accuracy of 0.84. This empirical evidence suggests that LLMs, with their
pre-trained knowledge base, exhibit a robust adaptability to new data, thus providing an effective means
for DDoS detection in IoT environments. However, it is noteworthy that the LLM’s performance
demonstrates variability, as evidenced by the fluctuation in accuracy, which indicates that while the overall
trend is positive, the model’s performance is not uniformly consistent across different sample sizes.
7.3.3.2 DDoS Detection Reasoning
For the urban IoT dataset, we examined the capability of the GPT-3.5 model to explain the reasoning
behind its detection responses. Table 7.2 presents the prompts and responses from the LLM. Both
the “User” and “Assistant” messages in table 7.2a are generated according to the urban IoT dataset, while in
table 7.2b and 7.2c, only the contents in “Prompt” are from the dataset, and the “Response” messages are
the messages generated by GPT-3.5, which demonstrate its reasoning capability. We observed that although
we provide identical formatting in the context explanations, such as the "Assistant" message shown in
table 7.2a, GPT-3.5 sometimes generates explanations that diverge from the provided context, as shown in
table 7.2b. This demonstrates the explanation ability of the GPT-3.5 model with just a few-shot context,
rather than simply "remembering the answers". Additionally, table
7.2c indicates GPT-3.5 could also express a sense of ambiguity regarding its prediction. When the quantity
of samples inside the context increases, the range and ambiguity of the provided explanations diminish
correspondingly.
7.4 Summary
In this chapter, we explored the use cases of large language models (LLMs) in the area of DDoS attack
detection, along with reasoning in IoT systems. We proposed methodologies encompassing few-shot and
fine-tuning LLM approaches. A comparative analysis was drawn between traditional multi-layer
perceptron (MLP) models and the advanced capabilities of LLMs, leveraging LLM models such as OpenAI’s
GPT-3.5, GPT-4, and Ada models.
We analyzed the proposed few-shot and fine-tuning approaches by utilizing two distinct datasets,
namely CICIDS 2017 and the Urban IoT Dataset. Our evaluations showed that LLMs, with the proper
context and training, could achieve impressive accuracies in DDoS detection. Specifically, using few-shot
methods on the CICIDS 2017 dataset, LLMs approached a 90% accuracy with merely 10 prompt samples.
This surged to around 95% when fine-tuned with 70 samples. The more challenging Urban IoT Dataset
showcased a similar trend, where the fine-tuned LLM model with 900 samples achieved an accuracy of 84%.
One of the most notable contributions of our study was the capability of LLMs to articulate the basis
behind their DDoS detections, especially in few-shot learning scenarios. However, it is essential to note
their tendency for hallucination, indicating that while LLMs promise significant advances, careful
application and ongoing examination are of the greatest importance.
Table 7.2: Evaluating the LLM capability for DDoS detection reasoning/explanation using the urban IoT
dataset
(a) Two examples of provided context
User Time: 7; node 0: packet: 260, packet_30_min: 243.3333333, pkt_1_hr:
253.8333333, pkt_2_hr: 276.75, pkt_4_hr: 271.2916667;node 1: packet: 167,
packet_30_min: 230, pkt_1_hr: 253.1666667, pkt_2_hr: 246, pkt_4_hr:
251;node 2: packet: 266, packet_30_min: 185, pkt_1_hr: 283.8333333, pkt_-
2_hr: 262.6666667, pkt_4_hr: 263.0416667;node 3: packet: 288, packet_-
30_min: 263.3333333, pkt_1_hr: 272.5, pkt_2_hr: 208.3333333, pkt_4_hr:
230.2083333;node 4: packet: 0, packet_30_min: 0, pkt_1_hr: 0, pkt_2_hr:
0, pkt_4_hr: 0; Predict node 3
Assistant Node 3 is attacked since it has a high packet volume.
User Time: 23; node 0: packet: 0, packet_30_min: 0, pkt_1_hr: 0, pkt_2_hr: 0,
pkt_4_hr: 0;node 1: packet: 0, packet_30_min: 0, pkt_1_hr: 0, pkt_2_hr: 0,
pkt_4_hr: 0;node 2: packet: 0, packet_30_min: 0, pkt_1_hr: 0, pkt_2_hr: 0,
pkt_4_hr: 82.20833333;node 3: packet: 186, packet_30_min: 164.6666667, pkt_-
1_hr: 172.6666667, pkt_2_hr: 168.5, pkt_4_hr: 168.125;node 4: packet: 186,
packet_30_min: 177.3333333, pkt_1_hr: 164.8333333, pkt_2_hr: 154.75, pkt_4_-
hr: 177.4583333; Predict node 2
Assistant Node 2 is not attacked since it has a low packet volume in 4 hours.
(b) Two responses from test with its corresponding prompts
Prompt Time: 17; node 0: packet: 0, packet_30_min: 52, pkt_1_hr: 26, pkt_2_hr: 13,
pkt_4_hr: 21.20833333;node 1: packet: 0, packet_30_min: 57, pkt_1_hr: 86,
pkt_2_hr: 135.5833333, pkt_4_hr: 129.0416667;node 2: packet: 169, packet_-
30_min: 156.6666667, pkt_1_hr: 154.6666667, pkt_2_hr: 164.3333333, pkt_4_-
hr: 157.8333333;node 3: packet: 123, packet_30_min: 161.3333333, pkt_1_hr:
173.1666667, pkt_2_hr: 159.8333333, pkt_4_hr: 167.0833333;node 4: packet: 0,
packet_30_min: 0, pkt_1_hr: 0, pkt_2_hr: 0, pkt_4_hr: 0; Predict node 2
Response Node 2 is not attacked because its packet volume is within normal range.
Prompt Time: 6; node 0: packet: 141, packet_30_min: 136, pkt_1_hr: 123.1666667, pkt_-
2_hr: 144.3333333, pkt_4_hr: 163.375;node 1: packet: 0, packet_30_min: 0, pkt_-
1_hr: 0, pkt_2_hr: 0, pkt_4_hr: 0;node 2: packet: 0, packet_30_min: 0, pkt_1_hr:
0, pkt_2_hr: 0, pkt_4_hr: 0;node 3: packet: 265, packet_30_min: 263.3333333,
pkt_1_hr: 281.8333333, pkt_2_hr: 334.25, pkt_4_hr: 295.9166667;node 4: packet:
0, packet_30_min: 0, pkt_1_hr: 0, pkt_2_hr: 0, pkt_4_hr: 0; Predict node 3
Response Node 3 is attacked because it has a high packet volume compared to the other
nodes.
(c) Some other responses
Response Node 3 is likely to be attacked because it has a significant increase in packet
volume compared to the previous time period.
Response Node 0 is possibly attacked, but the prediction would require more information
to be more accurate.
Chapter 8
Conclusion
In this dissertation, we have developed AI-enabled mechanisms for the detection of Distributed Denial of
Service (DDoS) attacks in Internet of Things (IoT) systems. We pioneered a novel approach by introducing
a tunable, futuristic DDoS attack model capable of emulating the benign behavior of IoT devices through
the application of a truncated Cauchy distribution to model their benign packet volume distribution. This
model is characterized by three parameters that define the aggression level of the DDoS attack towards the
victim server.
To counteract such sophisticated DDoS attacks, we proposed innovative correlation-aware,
learning-based models. These models excel in detecting DDoS activities by leveraging the correlation data
among IoT nodes. We presented four distinct architectures: multiple models with correlation (MM-WC),
multiple models without correlation (MM-NC), one model with correlation (OM-WC), and one model
without correlation (OM-NC). These frameworks are evaluated based on their ability to utilize either
individual or multiple neural network models and to incorporate or disregard correlation information
among IoT nodes. We thoroughly assessed the efficacy of five neural network models—Multi-Layer
Perceptrons (MLP), Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks,
Autoencoders (AEN), and Transformers—across all proposed architectures. Our comprehensive simulation
results revealed that the LSTM/MM-WC and Transformer/OM-WC configurations outperform other
architectural and model combinations. Notably, architectures that integrate node correlation data,
specifically MM-WC and OM-WC, demonstrated superior performance. This advantage was particularly
evident when attackers attempted to disguise their actions by mimicking legitimate traffic patterns. We
evaluated our findings through practical implementation on a Raspberry Pi-based testbed. In order to
address the challenge of leveraging massive IoT device arrays for DDoS attacks, we introduced heuristic
approaches for selective correlation information sharing among IoT devices. Our analyses highlighted the
effectiveness of selecting IoT nodes for model training based on their Pearson correlation.
Addressing the limitations of conventional machine learning models related to fixed input
requirements, we proposed a solution for handling incomplete data due to network losses through a Graph
Convolutional Network (GCN) model. We explored various configurations for representing IoT device
graph topologies, including Network, Peer-to-Peer, and Hybrid topologies, along with scenarios featuring
both directed and undirected edges. Our extensive simulations demonstrated the superior performance of
the Hybrid topology, which utilizes correlation-based peer-to-peer undirected edges, maintaining its F1
score within at most a 2% reduction even with a 50% network connection loss. This underscores the GCN model's
robustness in detecting DDoS attacks under lossy network conditions.
Moreover, we analyzed the application of Large Language Models (LLMs) for both detecting DDoS
attacks in IoT networks and explaining the underlying logic of detection outcomes. Through the
application of fine-tuning and few-shot prompt engineering methods, we evaluated the LLMs’ detection
capabilities. Our findings indicate that while fine-tuning with an extensive dataset achieves high accuracy,
the few-shot approach also presents a viable alternative, especially when the sample size is limited.
Furthermore, we observed that the few-shot method provides a decent understanding of the rationale behind
the DDoS detection outcomes.
In summary, this dissertation contributes to the cybersecurity field by advancing the detection of
DDoS attacks in IoT environments through a comprehensive approach that includes novel AI-enabled
mechanisms. Our work lays a foundation for future research in securing IoT systems against increasingly
sophisticated cyber threats.
8.1 Future Directions
In this dissertation, we have contributed to the advancement of ML methodologies for DDoS detection in
IoT environments. However, the dynamic nature of cyber threats necessitates ongoing research. The
following directions are proposed for future work:
• Policy Module Design for DDoS Mitigation: Future studies could focus on developing a policy
module that operates upon DDoS attack detection. This module should prioritize IoT device
criticality and implement mitigation strategies accordingly. Research could explore adaptive policy
algorithms that adjust in real-time based on the attack’s characteristics and the affected devices’
importance.
• Proactive DDoS Attack Detection: Investigating methodologies for early detection of DDoS
attacks by analyzing traffic patterns associated with botnet scanning activities. This entails
developing predictive models that leverage traffic anomalies to foresee potential attacks, enabling
preemptive defensive measures.
• Hybrid Detection Mechanisms: Combining the strengths of various ML detection models could
enhance the accuracy and speed of DDoS detection. Future work might involve the development of a
framework that dynamically selects the optimal detection model based on the current distribution of
attack traffic, thereby improving overall system resilience.
• Adaptive Unsupervised Learning Mechanisms: There is a need for designing unsupervised
learning models that adapt to new and evolving traffic patterns in IoT networks without requiring
labeled data. Such mechanisms could continuously learn from the network’s traffic, identifying novel
attack vectors autonomously.
• Federated Learning for Decentralized Security: Exploring federated learning approaches could
address privacy concerns and enhance the scalability of DDoS detection models. Research could
focus on developing decentralized learning protocols that enable collaborative model training across
multiple IoT devices or networks while minimizing data centralization.
• Reinforcement Learning for Evolving Threats: Utilizing reinforcement learning to develop
models capable of dynamically adapting to changing attack patterns presents a promising research
avenue. These models could learn optimal defensive strategies over time, improving their
effectiveness against complex and adaptive DDoS attacks.
• LLM-Based Network Admin Assistant: Designing a Large Language Model (LLM)-based assistant
for network administrators could streamline the process of responding to DDoS attacks. Such a
system could analyze the nature and severity of ongoing attacks, offering real-time, context-aware
recommendations for mitigation strategies, thus enhancing decision-making during critical incidents.
As DDoS attacks continue to evolve in complexity and scale, the continuous exploration and
adaptation of advanced ML models will be paramount in safeguarding the future of IoT infrastructures.
Bibliography
[1] Shancang Li, Li Da Xu, and Shanshan Zhao. “The internet of things: a survey”. In: Information
systems frontiers 17 (2015), pp. 243–259.
[2] Pallavi Sethi, Smruti R Sarangi, et al. “Internet of things: architectures, protocols, and applications”.
In: Journal of electrical and computer engineering 2017 (2017).
[3] Number of Internet of Things (IoT) connected devices worldwide from 2019 to 2021, with forecasts from
2022 to 2030. https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide/.
Accessed: 11-27-2022.
[4] Fadele Ayotunde Alaba, Mazliza Othman, Ibrahim Abaker Targio Hashem, and Faiz Alotaibi.
“Internet of Things security: A survey”. In: Journal of Network and Computer Applications 88 (2017),
pp. 10–28.
[5] Internet of things research study - 2014 report.
https://d-russia.ru/wp-content/uploads/2015/10/4AA5-4759ENW.pdf. Accessed: 11-27-2022.
[6] Internet of things research study - 2015 report.
https://www.alain-bensoussan.com/wp-content/uploads/2017/08/34794474.pdf. Accessed:
11-27-2022.
[7] Wan Haslina Hassan et al. “Current research on Internet of Things (IoT) security: A survey”. In:
Computer networks 148 (2019), pp. 283–294.
[8] Constantinos Kolias, Georgios Kambourakis, Angelos Stavrou, and Jeffrey Voas. “DDoS in the IoT:
Mirai and Other Botnets”. In: Computer 50.7 (2017), pp. 80–84.
[9] Yizhen Jia, Fangtian Zhong, Arwa Alrawais, Bei Gong, and Xiuzhen Cheng. “Flowguard: An
intelligent edge defense mechanism against IoT DDoS attacks”. In: IEEE Internet of Things Journal
7.10 (2020), pp. 9552–9562.
[10] Zhang Chao-Yang. “DOS attack analysis and study of new measures to prevent”. In: 2011
International Conference on Intelligence Science and Information Engineering. IEEE. 2011, pp. 426–429.
[11] Karan Verma, Halabi Hasbullah, and Ashok Kumar. “An efficient defense method against UDP
spoofed flooding traffic of denial of service (DoS) attacks in VANET”. In: 2013 3rd IEEE International
Advance Computing Conference (IACC). 2013, pp. 550–555.
[12] C.L. Schuba, I.V. Krsul, M.G. Kuhn, E.H. Spafford, A. Sundaram, and D. Zamboni. “Analysis of a
denial of service attack on TCP”. In: Proceedings. 1997 IEEE Symposium on Security and Privacy (Cat.
No.97CB36097). 1997, pp. 208–223.
[13] Jelena Mirkovic and Peter Reiher. “A taxonomy of DDoS attack and DDoS defense mechanisms”. In:
ACM SIGCOMM Computer Communication Review 34.2 (2004), pp. 39–53.
[14] Saman Taghavi Zargar, James Joshi, and David Tipper. “A survey of defense mechanisms against
distributed denial of service (DDoS) flooding attacks”. In: IEEE communications surveys & tutorials
15.4 (2013), pp. 2046–2069.
[15] Marcos VO de Assis, Luiz F Carvalho, Joel JPC Rodrigues, Jaime Lloret, and Mario L Proença Jr.
“Near real-time security system applied to SDN environments in IoT networks using convolutional
neural network”. In: Computers & Electrical Engineering 86 (2020), p. 106738.
[16] Jasek Roman, Benda Radek, Vala Radek, and Sarga Libor. “Launching distributed denial of service
attacks by network protocol exploitation”. In: Proceedings of the 2nd international conference on
Applied informatics and computing theory. 2011, pp. 210–216.
[17] Zi-Yang Shen, Ming-Wei Su, Yun-Zhan Cai, and Meng-Hsun Tasi. “Mitigating SYN Flooding and
UDP Flooding in P4-based SDN”. In: 2021 22nd Asia-Pacific Network Operations and Management
Symposium (APNOMS). IEEE. 2021, pp. 374–377.
[18] Noe Marcelo Yungaicela-Naula, Cesar Vargas-Rosales, and Jesus Arturo Perez-Diaz. “SDN-based
architecture for transport and application layer DDoS attack detection by using machine and deep
learning”. In: IEEE Access 9 (2021), pp. 108495–108512.
[19] Noe M Yungaicela-Naula, Cesar Vargas-Rosales, Jesús Arturo Pérez-Díaz, and
Diego Fernando Carrera. “A flexible SDN-based framework for slow-rate DDoS attack mitigation
by using deep reinforcement learning”. In: Journal of Network and Computer Applications 205 (2022),
p. 103444.
[20] Joel Margolis, Tae Tom Oh, Suyash Jadhav, Young Ho Kim, and Jeong Neyo Kim. “An In-Depth
Analysis of the Mirai Botnet”. In: 2017 International Conference on Software Security and Assurance
(ICSSA). 2017, pp. 6–12.
[21] Christopher D McDermott, Farzan Majdani, and Andrei V Petrovski. “Botnet detection in the
internet of things using deep learning approaches”. In: 2018 international joint conference on neural
networks (IJCNN). IEEE. 2018, pp. 1–8.
[22] Tim Kelley and Eoghan Furey. “Getting Prepared for the Next Botnet Attack : Detecting
Algorithmically Generated Domains in Botnet Command and Control”. In: 2018 29th Irish Signals
and Systems Conference (ISSC). 2018, pp. 1–6.
[23] Christopher D. McDermott, Farzan Majdani, and Andrei V. Petrovski. “Botnet Detection in the
Internet of Things using Deep Learning Approaches”. In: 2018 International Joint Conference on
Neural Networks (IJCNN). 2018, pp. 1–8.
[24] Arvin Hekmati, Eugenio Grippo, and Bhaskar Krishnamachari. “Large-Scale Urban IoT Activity
Data for DDoS Attack Emulation”. In: Proceedings of the 19th ACM Conference on Embedded
Networked Sensor Systems. SenSys ’21. Coimbra, Portugal: Association for Computing Machinery,
2021, pp. 560–564.
[25] Arvin Hekmati, Eugenio Grippo, and Bhaskar Krishnamachari. “Neural Networks for DDoS Attack
Detection using an Enhanced Urban IoT Dataset”. In: 2022 International Conference on Computer
Communications and Networks (ICCCN). 2022, pp. 1–8.
[26] Arvin Hekmati, Jiahe Zhang, Tamoghna Sarkar, Nishant Jethwa, Eugenio Grippo, and
Bhaskar Krishnamachari. “Correlation-Aware Neural Networks for DDoS Attack Detection In IoT
Systems”. In: arXiv preprint arXiv:2302.07982 (2023).
[27] Arvin Hekmati. “PhD Forum Abstract: DDoS attack detection in IoT systems using Neural
Networks”. In: Proceedings of the 22nd International Conference on Information Processing in Sensor
Networks. 2023, pp. 340–341.
[28] Arvin Hekmati, Nishant Jethwa, Eugenio Grippo, and Bhaskar Krishnamachari. Large-Scale Urban
IoT Dataset. 2023.
[29] Tony Field, Uli Harder, and Peter Harrison. “Network traffic behaviour in switched Ethernet
systems”. In: Proceedings. 10th IEEE International Symposium on Modeling, Analysis and Simulation
of Computer and Telecommunications Systems. IEEE. 2002, pp. 33–42.
[30] Arvin Hekmati and Bhaskar Krishnamachari. “Graph Convolutional Networks for DDoS Attack
Detection in a Lossy Network”. In: IEEE International Conference on Machine Learning for
Communication and Networking (IEEE ICMLCN) (2024).
[31] Arvin Hekmati and Bhaskar Krishnamachari. “Graph-Based DDoS Attack Detection in IoT Systems
with Lossy Network”. In: arXiv preprint arXiv:2403.09118 (2024).
[32] Thomas N Kipf and Max Welling. “Semi-supervised classification with graph convolutional
networks”. In: arXiv preprint arXiv:1609.02907 (2016).
[33] Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen,
Jiakai Tang, Xu Chen, Yankai Lin, et al. “A Survey on Large Language Model based Autonomous
Agents”. In: arXiv preprint arXiv:2308.11432 (2023).
[34] Michael Guastalla, Yiyi Li, Arvin Hekmati, and Bhaskar Krishnamachari. “Application of Large
Language Models to DDoS Attack Detection”. In: International Conference on Security and Privacy in
Cyber-Physical Systems and Smart Vehicles. Springer. 2023, pp. 83–99.
[35] Jayavardhana Gubbi, Rajkumar Buyya, Slaven Marusic, and Marimuthu Palaniswami. “Internet of
Things (IoT): A vision, architectural elements, and future directions”. In: Future generation computer
systems 29.7 (2013), pp. 1645–1660.
[36] Rafiullah Khan, Sarmad Ullah Khan, Rifaqat Zaheer, and Shahid Khan. “Future internet: the internet
of things architecture, possible applications and key challenges”. In: 2012 10th international
conference on frontiers of information technology. IEEE. 2012, pp. 257–260.
[37] Miao Wu, Ting-Jie Lu, Fei-Yang Ling, Jing Sun, and Hui-Ying Du. “Research on the architecture of
Internet of Things”. In: 2010 3rd international conference on advanced computer theory and
engineering (ICACTE). Vol. 5. IEEE. 2010, pp. V5–484.
[38] Debasis Bandyopadhyay and Jaydip Sen. “Internet of things: Applications and challenges in
technology and standardization”. In: Wireless personal communications 58 (2011), pp. 49–69.
[39] Ala Al-Fuqaha, Mohsen Guizani, Mehdi Mohammadi, Mohammed Aledhari, and Moussa Ayyash.
“Internet of things: A survey on enabling technologies, protocols, and applications”. In: IEEE
communications surveys & tutorials 17.4 (2015), pp. 2347–2376.
[40] Sabrina Sicari, Alessandra Rizzardi, Luigi Alfredo Grieco, and Alberto Coen-Porisini. “Security,
privacy and trust in Internet of Things: The road ahead”. In: Comput. Networks 76 (2015),
pp. 146–164.
124
[41] Shalitha Wijethilaka and Madhusanka Liyanage. “Survey on Network Slicing for Internet of Things
Realization in 5G Networks”. In: IEEE Communications Surveys & Tutorials 23.2 (2021), pp. 957–994.
[42] Felisberto Sequeira Pereira, Ricardo Correia, Pedro Pinho, Sérgio Ivan Lopes, and
Nuno Borges de Carvalho. “Challenges in Resource-Constrained IoT Devices: Energy and
Communication as Critical Success Factors for Future IoT Deployment”. In: Sensors (Basel,
Switzerland) 20 (2020).
[43] Koen Zandberg, Kaspar Schleiser, Francisco Acosta, Hannes Tschofenig, and Emmanuel Baccelli.
“Secure Firmware Updates for Constrained IoT Devices Using Open Standards: A Reality Check”. In:
IEEE Access 7 (2019), pp. 71907–71920.
[44] Muhammad Husnain, Khizar Hayat, Enrico Cambiaso, Ubaid Ullah Fayyaz, Maurizio Mongelli,
Habiba Akram, Syed Ghazanfar Abbas, and Ghalib A. Shah. “Preventing MQTT Vulnerabilities
Using IoT-Enabled Intrusion Detection System”. In: Sensors (Basel, Switzerland) 22 (2022).
[45] Nathaniel Gyory and Mooi Choo Chuah. “IoTOne: Integrated platform for heterogeneous IoT
devices”. In: 2017 International Conference on Computing, Networking and Communications (ICNC)
(2017), pp. 783–787.
[46] Georgios Loukas and Gülay Öke Günel. “Protection Against Denial of Service Attacks: A Survey”.
In: Comput. J. 53 (2010), pp. 1020–1037.
[47] Meisam Eslahi, Rosli Salleh, and Nor Badrul Anuar. “Bots and botnets: An overview of
characteristics, detection and challenges”. In: 2012 IEEE International Conference on Control System,
Computing and Engineering (2012), pp. 349–354.
[48] Jing Liu, Yang Xiao, Kaveh Ghaboosi, Hongmei Deng, and Jingyuan Zhang. “Botnet: Classification,
Attacks, Detection, Tracing, and Preventive Measures”. In: EURASIP Journal on Wireless
Communications and Networking 2009 (2009), pp. 1–11.
[49] Supranamaya Ranjan, R. P. Swaminathan, Mustafa Uysal, and Edward W. Knightly. “DDoS-Resilient
Scheduling to Counter Application Layer Attacks Under Imperfect Detection”. In: Proceedings IEEE
INFOCOM 2006. 25TH IEEE International Conference on Computer Communications (2006), pp. 1–13.
[50] Rocky K. C. Chang. “Defending against flooding-based distributed denial-of-service attacks: a
tutorial”. In: IEEE Commun. Mag. 40 (2002), pp. 42–51.
[51] Muhammad Burhan, Rana Asif Rehman, Bilal Khan, and Byung-Seo Kim. “IoT elements, layered
architectures and security issues: A comprehensive survey”. In: sensors 18.9 (2018), p. 2796.
125
[52] Ihsan Ali, Abdelmuttlib Ibrahim Abdalla Ahmed, Ahmad Almogren, Muhammad Ahsan Raza,
Syed Attique Shah, Anwar Khan, and Abdullah Gani. “Systematic literature review on IoT-based
botnet attack”. In: IEEE Access 8 (2020), pp. 212220–212232.
[53] Yi Sun, Jie Liu, K. Yu, Mamoun Alazab, and Kaixiang Lin. “PMRSS: Privacy-Preserving Medical
Record Searching Scheme for Intelligent Diagnosis in IoT Healthcare”. In: IEEE Transactions on
Industrial Informatics 18 (2022), pp. 1981–1990.
[54] Fotios Zantalis, Grigorios E. Koulouras, Sotiris Karabetsos, and Dionisis Kandris. “A Review of
Machine Learning and IoT in Smart Transportation”. In: Future Internet 11 (2019), p. 94.
[55] Tong Anh Tuan, Hoang Viet Long, Le Hoang Son, Raghvendra Kumar, Ishaani Priyadarshini, and
Nguyen Thi Kim Son. “Performance evaluation of Botnet DDoS attack detection using machine
learning”. In: Evolutionary Intelligence 13 (2020), pp. 283–294.
[56] S.K. Pal and S. Mitra. “Multilayer perceptron, fuzzy sets, and classification”. In: IEEE Transactions on
Neural Networks 3.5 (1992), pp. 683–697.
[57] Mouhammd Alkasassbeh, Ghazi Al-Naymat, Ahmad Hassanat, and Mohammad Almseidin.
“Detecting Distributed Denial of Service Attacks Using Data Mining Techniques”. In: International
Journal of Advanced Computer Science and Applications 7 (2016).
[58] Saad Albawi, Tareq Abed Mohammed, and Saad Al-Zawi. “Understanding of a convolutional neural
network”. In: 2017 International Conference on Engineering and Technology (ICET). 2017, pp. 1–6.
[59] Roberto Doriguzzi Corin, Stuart Millar, Sandra Scott-Hayward, Jesús Martínez del Rincón, and
Domenico Siracusa. “Lucid: A Practical, Lightweight Deep Learning Solution for DDoS Attack
Detection”. In: IEEE Transactions on Network and Service Management 17 (2020), pp. 876–889.
[60] Alex Sherstinsky. “Fundamentals of recurrent neural network (RNN) and long short-term memory
(LSTM) network”. In: Physica D: Nonlinear Phenomena 404 (2020), p. 132306.
[61] Sepp Hochreiter and Jürgen Schmidhuber. “Long Short-term Memory”. In: Neural computation 9
(Dec. 1997), pp. 1735–80.
[62] Yan Li and Yifei Lu. “LSTM-BA: DDoS Detection Approach Combining LSTM and Bayes”. In: 2019
Seventh International Conference on Advanced Cloud and Big Data (CBD) (2019), pp. 180–185.
[63] Dor Bank, Noam Koenigstein, and Raja Giryes. Autoencoders. 2020.
126
[64] Yair Meidan, Michael Bohadana, Yael Mathov, Yisroel Mirsky, Asaf Shabtai, Dominik Breitenbacher,
and Yuval Elovici. “N-BaIoT—Network-Based Detection of IoT Botnet Attacks Using Deep
Autoencoders”. In: IEEE Pervasive Computing 17.3 (2018), pp. 12–22.
[65] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
Łukasz Kaiser, and Illia Polosukhin. “Attention is All you Need”. In: Advances in Neural Information
Processing Systems. Ed. by I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus,
S. Vishwanathan, and R. Garnett. Vol. 30. Curran Associates, Inc., 2017.
[66] Gokul Yenduri, Manju Ramalingam, G. ChemmalarSelvi, Y Supriya, Gautam Srivastava,
Praveen Kumar Reddy Maddikunta, G DeeptiRaj, Rutvij H. Jhaveri, B. Prabadevi, Weizheng Wang,
Athanasios V. Vasilakos, and Thippa Reddy Gadekallu. “Generative Pre-trained Transformer: A
Comprehensive Review on Enabling Technologies, Potential Applications, Emerging Challenges,
and Future Directions”. In: ArXiv abs/2305.10435 (2023).
[67] Lersak Limwiwatkul and Arnon Rungsawangr. “Distributed denial of service detection using
TCP/IP header and traffic measurement analysis”. In: IEEE International Symposium on
Communications and Information Technology, 2004. ISCIT 2004. 1 (2004), 605–610 vol.1.
[68] Chen-Mou Cheng, H. T. Kung, and Koan-Sin Tan. “Use of spectral analysis in defense against DoS
attacks”. In: Global Telecommunications Conference, 2002. GLOBECOM ’02. IEEE 3 (2002), 2143–2148
vol.3.
[69] Rohan Doshi, Noah Apthorpe, and Nick Feamster. “Machine Learning DDoS Detection for
Consumer Internet of Things Devices”. In: 2018 IEEE Security and Privacy Workshops (SPW). 2018,
pp. 29–35.
[70] Yi-Wen Chen, Jang-Ping Sheu, Yung-Ching Kuo, and Nguyen Van Cuong. “Design and
Implementation of IoT DDoS Attacks Detection System based on Machine Learning”. In: 2020
European Conference on Networks and Communications (EuCNC). 2020, pp. 122–127.
[71] Saif Saad Mohammed, Rasheed Hussain, Oleg Senko, Bagdat Bimaganbetov, JooYoung Lee,
Fatima Hussain, Chaker Abdelaziz Kerrache, Ezedin Barka, and Md Zakirul Alam Bhuiyan. “A New
Machine Learning-based Collaborative DDoS Mitigation Mechanism in Software-Defined
Network”. In: 2018 14th International Conference on Wireless and Mobile Computing, Networking and
Communications (WiMob). 2018, pp. 1–8.
[72] NSL-KDD dataset. https://www.unb.ca/cic/datasets/nsl.html. Accessed: 11-27-2022.
127
[73] Iman Sharafaldin, Arash Habibi Lashkari, and Ali Ghorbani. “Toward Generating a New Intrusion
Detection Dataset and Intrusion Traffic Characterization”. In: 4th International Conference on
Information Systems Security and Privacy (ICISSP). Jan. 2018, pp. 108–116.
[74] University of New Brunswick, “DDoS Evaluation Dataset (CICDDoS2019)”,unb.ca, 2019.
https://www.unb.ca/cic/datasets/ddos-2019.html. Accessed: 09-13-2021.
[75] Beny Nugraha and Rathan Narasimha Murthy. “Deep Learning-based Slow DDoS Attack Detection
in SDN-based Networks”. In: 2020 IEEE Conference on Network Function Virtualization and Software
Defined Networks (NFV-SDN). 2020, pp. 51–56.
[76] Haosu Cheng, Jianwei Liu, Tongge Xu, Bohan Ren, Jian Mao, and Wei Zhang. “Machine learning
based low-rate DDoS attack detection for SDN enabled IoT networks”. In: International Journal of
Sensor Networks 34.1 (2020), pp. 56–69. eprint:
https://www.inderscienceonline.com/doi/pdf/10.1504/IJSNET.2020.109720.
[77] N. Muraleedharan and B. Janet. “Flow-based machine learning approach for slow HTTP distributed
denial of service attack classification”. In: International Journal of Computational Science and
Engineering 24.2 (2021), pp. 147–161. eprint:
https://www.inderscienceonline.com/doi/pdf/10.1504/IJCSE.2021.115101.
[78] Xiaoyu Liang and Taieb Znati. “On the performance of intelligent techniques for intensive and
stealthy DDos detection”. In: Computer Networks 164 (2019), p. 106906.
[79] Keval Doshi, Yasin Yilmaz, and Suleyman Uludag. “Timely detection and mitigation of stealthy
DDoS attacks via IoT networks”. In: IEEE Transactions on Dependable and Secure Computing 18.5
(2021), pp. 2164–2176.
[80] Yuming Feng, Weizhe Zhang, Shujun Yin, Hao Tang, Yang Xiang, and Yu Zhang. “A Collaborative
Stealthy DDoS Detection Method based on Reinforcement Learning at the Edge of the Internet of
Things”. In: IEEE Internet of Things Journal (2023).
[81] Sedat Görmüş, Hakan Aydın, and Güzin Ulutaş. “Security for the internet of things: a survey of
existing mechanisms, protocols and open research issues”. In: Journal of the Faculty of Engineering
and Architecture of Gazi University 33.4 (2018), pp. 1247–1272.
[82] Sultan M Almeghlef, Abdullah AL-Malaise AL-Ghamdi, Muhammad Sher Ramzan, and
Mahmoud Ragab. “Application Layer-Based Denial-of-Service Attacks Detection against IoT-CoAP”.
In: Electronics 12.12 (2023), p. 2563.
128
[83] S Arvind and V Anantha Narayanan. “An overview of security in CoAP: attack and analysis”. In:
2019 5th international conference on advanced computing & communication systems (ICACCS). IEEE.
2019, pp. 655–660.
[84] Naeem Firdous Syed. “IoT-MQTT based denial of service attack modelling and detection”. In: (2020).
[85] Muhammad Husnain, Khizar Hayat, Enrico Cambiaso, Ubaid U Fayyaz, Maurizio Mongelli,
Habiba Akram, Syed Ghazanfar Abbas, and Ghalib A Shah. “Preventing mqtt vulnerabilities using
iot-enabled intrusion detection system”. In: Sensors 22.2 (2022), p. 567.
[86] Naeem Firdous Syed, Zubair Baig, Ahmed Ibrahim, and Craig Valli. “Denial of service attack
detection through machine learning for the IoT”. In: Journal of Information and Telecommunication
4.4 (2020), pp. 482–503. eprint: https://doi.org/10.1080/24751839.2020.1767484.
[87] Si Zhang, Hanghang Tong, Jiejun Xu, and Ross Maciejewski. “Graph convolutional networks: a
comprehensive review”. In: Computational Social Networks 6.1 (2019), pp. 1–23.
[88] Yongyi Cao, Hao Jiang, Yuchuan Deng, Jing Wu, Pan Zhou, and Wei Luo. “Detecting and Mitigating
DDoS Attacks in SDN Using Spatial-Temporal Graph Convolutional Network”. In: IEEE
Transactions on Dependable and Secure Computing 19.6 (2022), pp. 3855–3872.
[89] Jie Huang and Kevin Chen-Chuan Chang. “Towards reasoning in large language models: A survey”.
In: arXiv preprint arXiv:2212.10403 (2022).
[90] Andrew Johnson. Leveraging Large Language Models for Network Security. [accessed 08-07-2023].
[91] Mohamed Amine Ferrag, Mthandazo Ndhlovu, Norbert Tihanyi, Lucas C Cordeiro,
Merouane Debbah, and Thierry Lestable. “Revolutionizing Cyber Threat Detection with Large
Language Models”. In: arXiv preprint arXiv:2306.14263 (2023).
[92] Yandong Liu, Mianxiong Dong, Kaoru Ota, Jianhua Li, and Jun Wu. “Deep Reinforcement Learning
based Smart Mitigation of DDoS Flooding in Software-Defined Networks”. In: 2018 IEEE 23rd
International Workshop on Computer Aided Modeling and Design of Communication Links and
Networks (CAMAD). 2018, pp. 1–6.
[93] Monika Roopak, Gui Yun Tian, and Jonathon Chambers. “Deep Learning Models for Cyber Security
in IoT Networks”. In: 2019 IEEE 9th Annual Computing and Communication Workshop and
Conference (CCWC). 2019, pp. 0452–0457.
129
[94] K. Gurulakshmi and A. Nesarani. “Analysis of IoT Bots Against DDOS Attack Using Machine
Learning Algorithm”. In: 2018 2nd International Conference on Trends in Electronics and Informatics
(ICOEI). 2018, pp. 1052–1057.
[95] Marwane Zekri, Said El Kafhali, Noureddine Aboutabit, and Youssef Saadi. “DDoS attack detection
using machine learning techniques in cloud computing environments”. In: 2017 3rd International
Conference of Cloud Computing Technologies and Applications (CloudTech). 2017, pp. 1–7.
[96] Agathe Blaise, Mathieu Bouet, Vania Conan, and Stefano Secci. “Botnet Fingerprinting: A
Frequency Distributions Scheme for Lightweight Bot Detection”. In: IEEE Transactions on Network
and Service Management 17.3 (2020), pp. 1701–1714.
[97] The CTU-13 Dataset. A Labeled Dataset with Botnet, Normal and Background traffic.
https://www.stratosphereips.org/datasets-ctu13. Accessed: 11-27-2022.
[98] Yan Naung Soe, Yaokai Feng, Paulus Insap Santosa, Rudy Hartanto, and Kouichi Sakurai. “Machine
Learning-Based IoT-Botnet Attack Detection with Sequential Architecture”. In: Sensors 20.16 (2020).
[99] Beny Nugraha, Anshitha Nambiar, and Thomas Bauschert. “Performance Evaluation of Botnet
Detection using Deep Learning Techniques”. In: 2020 11th International Conference on Network of
the Future (NoF). 2020, pp. 141–149.
[100] Ayush Kumar and Teng Joon Lim. “EDIMA: Early Detection of IoT Malware Network Activity
Using Machine Learning Techniques”. In: 2019 IEEE 5th World Forum on Internet of Things (WF-IoT).
2019, pp. 289–294.
[101] Python Fitter Library. https://fitter.readthedocs.io/en/latest/. Accessed: 10-08-2023.
[102] DARPA 2000 Intrustion Detection Scenario Specific Data Sets. https://www.ll.mit.edu/r-d/datasets/.
Accessed: 10-08-2021.
[103] The CAIDA UCSD ”DDoS Attack 2007” Dataset, 2007.
https://www.caida.org/catalog/datasets/ddos-20070804_dataset/. Accessed: 10-08-2021.
[104] Ali Shiravi, Hadi Shiravi, Mahbod Tavallaee, and Ali A Ghorbani. “Toward developing a systematic
approach to generate benchmark datasets for intrusion detection”. In: computers & security 31.3
(2012), pp. 357–374.
130
[105] University of New Brunswick, “CSE-CIC-IDS2018 on AWS”, 2018.
https://www.unb.ca/cic/datasets/ids-2018.html. Accessed: 09-13-2021.
[106] University of New South Wales, The Bot-IoT Dataset.
https://research.unsw.edu.au/projects/bot-iot-dataset. Accessed: 09-12-2021.
[107] A Scheme for Generating a Dataset for Anomalous Activity Detection in IoT Networks.
https://sites.google.com/view/iot-network-intrusion-dataset. Accessed: 09-12-2021.
[108] Derya Erhan and Emin Anarım. “Boğaziçi University distributed denial of service dataset”. In: Data
in brief 32 (2020), p. 106187.
[109] Sajjad Hussain Shah and Ilyas Yaqoob. “A survey: Internet of Things (IoT) technologies, applications
and challenges”. In: 2016 IEEE Smart Energy Grid Engineering (SEGE). IEEE. 2016, pp. 381–385.
[110] Jiahe Zhang, Tamoghna Sarkar, Arvin Hekmati, and Bhaskar Krishnamachari. “Demo Abstract:
CUDDoS - Correlation-aware Ubiquitous Detection of DDoS in IoT Systems”. In: Proceedings of the
21th ACM Conference on Embedded Networked Sensor Systems (Demo Paper). 2023.
[111] Classification on imbalanced data.
https://www.tensorflow.org/tutorials/structured_data/imbalanced_data. Accessed: 11-27-2022.
[112] Scott M Lundberg and Su-In Lee. “A Unified Approach to Interpreting Model Predictions”. In:
Advances in Neural Information Processing Systems. Ed. by I. Guyon, U. Von Luxburg, S. Bengio,
H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett. Vol. 30. Curran Associates, Inc., 2017.
[113] Yiheng Liu, Tianle Han, Siyuan Ma, Jiayue Zhang, Yuanyuan Yang, Jiaming Tian, Hao He,
Antong Li, Mengshen He, Zhengliang Liu, et al. “Summary of chatgpt/gpt-4 research and
perspective towards the future of large language models”. In: arXiv preprint arXiv:2304.01852 (2023).
[114] Som S Biswas. “Potential use of chat gpt in global warming”. In: Annals of biomedical engineering
51.6 (2023), pp. 1126–1127.
[115] Som S Biswas. “Role of chat gpt in public health”. In: Annals of biomedical engineering 51.5 (2023),
pp. 868–869.
[116] Nigar M Shafiq Surameery and Mohammed Y Shakor. “Use chat gpt to solve programming bugs”.
In: International Journal of Information Technology & Computer Engineering (IJITC) ISSN: 2455-5290
3.01 (2023), pp. 17–22.
131
[117] Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal,
Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal,
Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh,
Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler,
Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish,
Alec Radford, Ilya Sutskever, and Dario Amodei. Language Models are Few-Shot Learners. 2020.
arXiv: 2005.14165 [cs.CL].
[118] Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A Inan, Gautam Kamath,
Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, et al. “Differentially private
fine-tuning of language models”. In: arXiv preprint arXiv:2110.06500 (2021).
[119] Kurniabudi, Deris Stiawan, Darmawijoyo, Mohd Yazid Bin Idris, Alwi M. Bamhdi, and
Rahmat Budiarto. “CICIDS-2017 Dataset Feature Analysis With Information Gain for Anomaly
Detection”. In: IEEE Access 8 (2020), pp. 132911–132921.
132