Game Theoretic Deception and Threat Screening for Cyber Security
by
Aaron Schlenker
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Computer Science)
December 2018
Copyright 2019 Aaron Schlenker
Acknowledgments
Completing the doctorate has been one of the most arduous, enjoyable, and bittersweet experi-
ences in my life. I have learned an enormous amount about myself which has led to personal
growth in unexpected ways. My personal feeling is that most Ph.D. students will share similar
sentiments about completing the doctorate and that taking this path has not always led them to
the place they expect. However, given the struggles faced during the Ph.D. it is impossible not
to come out of it feeling a deeper appreciation of the personal learning process. For me this has
been a beautiful, unexpected consequence. I feel that I have grown immensely in these few short
years, but even more exciting, it feels as though the Ph.D. is a launchpad to bigger and better
things. Coming to Los Angeles to study at USC after completing my undergrad at Butler Uni-
versity was a huge shift in lifestyle and culture, and exposed me to many incredible individuals who
have positively shaped my world views. My experience will always be something that I treasure,
and is something that I would not trade for a different career path. It is also important to say that
– and I imagine nearly all Ph.D. students agree – completing the doctorate is not possible without
the support and encouragement from a small village of people.
The many advisors I have had throughout my years as a college student – given the privilege
of learning under such wise, intelligent teachers – have given me the intellectual rigor to ask basic
questions about the world. This has been a guiding principle used to drive my research and will
be a focus of my future work in general. Dr. Bill Johnston (or Dr. J as everyone knows you) and
Dr. Jon Sorenson, you are the first two great mentors that have helped shape me into the person I
am today. From taking your classes in my first college years, I learned to gain a true appreciation
of technical work and exploring abstract concepts and principles. Those valuable experiences
first led me to pursue research and a career oriented towards solving basic problems. It was
your time and effort that opened doors for me that I never knew existed and provided me with the
foundation to build a wonderful career.
Professor Milind Tambe, my Ph.D. advisor and good mentor, thank you for taking me un-
der your wing and guiding me through the difficult and rewarding Ph.D. life. You have given
me invaluable experience in regards to both research and identifying the most significant prob-
lems in the world today. In part, it is my time in the Teamcore research group that has shaped
me into the person I am today. I want to thank all of the wonderful people and colleagues I
have had the privilege of getting to know during my time in Teamcore, Amulya Yadav, Sara
Mc Carthy, Matthew Brown, Debarun Kar, Shahrzad Gholami, Arunesh Sinha, Haifeng Xu, Ben
Ford, Thanh Nguyen, Fei Fang, Eric Shieh, Bryan Wilder, Elizabeth Orrico, Elizabeth Bondi,
Chao Zhang, Aida Rahmattalabi, Yundi Qian, Yasaman Abbasi, Omkar Thakoor, Han Ching Ou,
Biswarup Bhattacharya, Donnabell Dmello, Aaron Ferber, Kai Wang, Subasree Sengupta, and
Sarah Cooney. To my close friends in Teamcore (you know who you are), thank you for all
of the wonderful conversations about nothing in particular, for embracing my excitement about
seemingly irrelevant things, and spending your time and effort to help build each other up to
create an incredibly warm, challenging environment to grow as a person. It was learning about
everyone’s vastly different cultures and being exposed to the intelligent people at Teamcore and
USC that has taught me tremendously valuable life lessons that I will keep with me. I feel this
has permanently altered my life for the better and has opened my mind to considerations which
I would have never perceived otherwise. Lastly, I want to thank the many talented, exceptional
people that have been my co-authors who I have been fortunate enough to collaborate with and
learn a great deal from, Solomon Sonya, Chris Kiekintveld, Phebe Vayanos, Eugene Vorobeychik,
Tran Thanh-Long, Ruta Mehta, Mina Guirguis, Noah Dunstatter, Darryl Balderas. It was
working with you all that I have gained many precious insights into the fundamental challenges
in the problem domains we were looking into and the creative solutions that were devised to solve
those problems. All of these experiences I will carry with me and use them in new situations that
will inevitably arise in the future where they can provide important direction about the best path
forward.
To my rock during both the most difficult times of the Ph.D., and the best of times, Rosemary
Feregrino. It is hard to put into words the impact you have had on me and how incredibly grateful I
am to have shared my life with you these past few years. You have been a source of overwhelming
love and support and I could not imagine completing the Ph.D. without you. Your entire family
has been warm and welcoming to me, and you all have ensured that these last few years were
exciting and full of adventure!
Finally, I want to thank my family who has been a constant source of support throughout my
early years until I finished the doctorate. I feel grateful and fortunate to have all of them in my
life and their positive effect on my life is immeasurable. First of all, my brothers David, Kevin,
Chris and Scott have all been sources of inspiration giving me the fuel needed to finish those long
nights in the office toiling away on research. It is especially helpful when you spend all of those
hours working on a project, completing interesting research (at least to you) and then seeing that
project never amount to something tangible – besides the skills and lessons learned along the
way. My parents, Susan and Steve, have guided and encouraged me from the time I was just a
young elementary school kid to a seasoned grad student. My grandmother, Shirley, and all of
my extended family who gave strength and love from afar. I cannot begin to thank you enough
for providing such a positive influence through these last 26 years. Finishing a Ph.D. becomes
exponentially harder without individuals like you all in someone’s life.
Table of Contents
Acknowledgments ii
List Of Figures viii
List Of Tables x
Abstract xi
Chapter 1: Introduction 1
1.1 Cyber Deception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Cyber Threat Screening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Chapter 2: Background 11
2.1 Security Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Threat Screening Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Chapter 3: Related Work 16
3.1 Stackelberg Security Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Cyber-alert Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3 Cyber Deception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Chapter 4: Cyber Deception Games 22
4.1 Problem Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.2 Cyber Deception Game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.3 Optimal Defender Strategy against Powerful Adversary . . . . . . . . . . . . . . 31
4.3.1 Computational Complexity . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.3.2 The Defender’s Optimization Problem . . . . . . . . . . . . . . . . . . . 34
4.3.3 MILP Bisection Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.3.4 Greedy-Minimax Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 37
4.3.5 Solving for an Optimal Marginal Assignment n . . . . . . . . . . . . . . 42
4.4 Optimal Defender Strategy against Naive Adversary . . . . . . . . . . . . . . . . 43
4.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.5.1 Powerful Adversary - Scalability and Solution Quality Loss . . . . . . . 48
4.5.2 Comparing Solutions for Different Types of Adversaries . . . . . . . . . 51
4.6 Real World Applicability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.6.1 OS and Application Fingerprinting & Obfuscation . . . . . . . . . . . . 53
4.6.2 Deceptive Network Topologies . . . . . . . . . . . . . . . . . . . . . . . 57
4.6.3 Honeypots and Network Tools . . . . . . . . . . . . . . . . . . . . . . . 58
4.6.4 Leveraging the CDG Model . . . . . . . . . . . . . . . . . . . . . . . . 60
4.7 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Chapter 5: Cyber-alert Allocation Games 63
5.1 Problem Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.2 Cyber-alert Allocation Games . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.3 Defender’s Optimal Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.3.1 Defender’s Optimal Marginal Strategy . . . . . . . . . . . . . . . . . . . 72
5.4 CAG Algorithmic Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.1 Constraint Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.4.2 Branch-and-Bound Search . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.4.2.1 Heuristic Search . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.4.2.2 Convex Hull Extension . . . . . . . . . . . . . . . . . . . . . 81
5.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.5.1 Full vs Heuristic Search . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.5.2 Solving large CAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.5.3 Allocation Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.6 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Chapter 6: General-sum Threat Screening Games 88
6.1 Problem Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.3 GATE: Solving Bayesian General-Sum TSG . . . . . . . . . . . . . . . . . . . . 92
6.3.1 Hierarchical Type Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.3.2 Advantages of GATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.4 Scaling Up GATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.4.1 GATE-H: GATE with Heuristics . . . . . . . . . . . . . . . . . . . . . . 97
6.4.2 Tuning Leaf Node Computation . . . . . . . . . . . . . . . . . . . . . . 100
6.4.3 Tuning Non-leaf Node Computation . . . . . . . . . . . . . . . . . . . . 102
6.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.5.1 Scaling Up and Solution Quality . . . . . . . . . . . . . . . . . . . . . 104
6.5.2 Moving Towards Zero Sum . . . . . . . . . . . . . . . . . . . . . . . . 106
6.6 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Chapter 7: Conclusion 109
7.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.2 Future Work and Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Reference List 113
List Of Figures
1.1 The Cyber Kill Chain developed by Lockheed Martin. . . . . . . . . . . . . . . 2
1.2 Domains of application for game theory in varying security domains and opera-
tions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 TSG Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.1 The network reconnaissance domain in which an adversary scans the defender’s
enterprise network and the defender deceptively alters the system’s responses. . 24
4.2 Simple example of an enterprise network. . . . . . . . . . . . . . . . . . . . . . 28
4.3 Runtime Comparison and Solution Quality Comparison (20 Observables) - Re-
formulated MILP (MILP), the bisection algorithm with ε = 0.0001 (Bisection) and
Greedy MaxiMin (GMM) with 1000 random shuffles. . . . . . . . . . . . . . . . 49
4.4 Solution Quality Comparison (20 systems and 20 OCs) - Comparison of Hard-
GMM (GMM - H) and Soft-GMM (GMM -l) varying the number of shuffles. . 50
4.5 Solution Quality Comparison (10 OCs) - In (a) the solution quality of the two
types of defender strategies is shown against a powerful adversary. In (b) the
solution quality of the strategies is shown against a naive adversary. . . . . . . . 52
4.6 The alteration of an outgoing packet to mimic a certain desired deceptive signa-
ture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.7 Network views for a host connected to the defender’s network. In 4.7(a) is the
true network state while in 4.7(b) is an altered state with additional network con-
nections and honeypots. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.8 The HoneyD server initializes network daemons to respond to pings and scans
for various IP addresses not used by a network. . . . . . . . . . . . . . . . . . . 59
5.1 To protect against cyber intrusions, enterprise networks deploy Intrusion Detec-
tion and Prevention Systems across their network that work at both a host and
network level. The alerts generated are given a risk classification and aggregated
into a central repository called a SIEM. The network administrator then must de-
termine how to allocate the alerts to analysts for investigation and remediation if
necessary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2 CAG Strategies for the defender. . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.3 Conversion of Column Constraints on CAG . . . . . . . . . . . . . . . . . . . . 77
5.4 Geometric view of the defender’s strategy space. . . . . . . . . . . . . . . . . . 82
5.5 Experimental Results for CAG instances. . . . . . . . . . . . . . . . . . . . . . 84
5.6 Allocation Approach Comparison. . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.7 Scaling Number of Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.1 Bayesian Adversary Strategy Space . . . . . . . . . . . . . . . . . . . . . . . . 92
6.2 Branch-and-Guide Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.3 Runtime and Solution Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4 Scaling Up to Larger TSG Instances . . . . . . . . . . . . . . . . . . . . . . . . 106
6.5 Solution Quality - Moving to Zero Sum . . . . . . . . . . . . . . . . . . . . . . 106
List Of Tables
4.1 Solution Quality % loss and number of optimal instances for GMM versus MILP. 49
Abstract
Protecting an organization’s cyber assets from intrusions and breaches due to attacks by mali-
cious actors is an increasingly challenging and complex problem. Companies and organizations
(hereafter referred to as the defender) who operate enterprise networks employ numerous
protection measures to intercept these attacks, such as Intrusion Detection Systems (IDS),
along with dedicated Cyber Emergency Readiness Teams (CERT) composed of cyber ana-
lysts tasked with the general protection of an organization's cyber assets. In order to optimize
the use of the defender’s limited resources and protection mechanisms, we can look to game
theory which has been successfully used to handle complex resource allocation problems and
has several deployed real-world applications in physical security domains. Applying previous
research on security games to cybersecurity domains introduces several novel challenges which
I address in my thesis to create models that deceive cyber adversaries and provide the defender
with an alert prioritization strategy for IDS. My thesis provides three main contributions to the
emerging body of research in using game theory for cyber and physical security, namely (i) the
first game theoretic framework for cyber deception of a defender’s network, (ii) the first game-
theoretic framework for cyber alert allocation and (iii) algorithms for extending these frameworks
to general-sum domains.
In regards to the first contribution, I introduce a novel game model called the Cyber Deception
Game (CDG), which captures the interaction between the defender and adversary during the
recon phase of a network attack. The CDG model provides the first game-theoretic framework
for deception in cybersecurity and allows the defender to devise strategies that deceptively
alter system responses on the network. I study two different models of cyber
adversaries and provide algorithms to solve CDGs that handle the computational complexities
stemming from the adversary’s static view of the defender’s network and the varying differences
between adversary models.
The second major contribution of my thesis is the first game theoretic model for cyber alert
prioritization for a network defender. This model, the Cyber-alert Allocation Game (CAG), provides
an approach which balances intrinsic characteristics of incoming alerts from an IDS with the
defender’s analysts that are available to resolve alerts. Additionally, the aforementioned works
assume the games are zero-sum which is not true in many real-world domains. As such, the third
contribution in my thesis extends CAGs to general-sum domains. I provide scalable algorithms
that have additional applicability to other physical screening domains (e.g., container screening,
airport passenger screening).
Chapter 1
Introduction
Protecting an organization’s cyber assets from intrusions and breaches due to attacks by malicious
actors is an increasingly challenging and complex problem. This challenge is highlighted by
several recent major breaches which have caused severe damage, such as the Equifax breach in
2017 and Yahoo in 2016.[1] To protect from cyber breaches, companies and organizations employ
anti-virus software, Intrusion Detection Systems (IDS), and Cyber Emergency Readiness Teams
(CERT) composed of cyber analysts tasked with the general protection of an organization's
network and cyber assets. Modern day cyber adversaries are persistent, targeted and sophisticated.
This highlights the tremendous need for organizations protecting against such attacks to model
these adversaries in order to optimize the application of the network defender's resources to protect
targets and systems across the enterprise network. The Cyber Kill Chain [55, 8] encapsulates
the necessary steps an adversary must complete to successfully breach the defender’s enterprise
network.[2] Figure 1.1 shows the Cyber Kill Chain.[3]

[1] 24th Air Force - AFCYBER: http://www.24af.af.mil; 688th Cyberspace Wing: http://www.24af.af.mil/Units/688th-Cyberspace-Wing
[2] The Cyber Kill Chain is a common framework for post analysis of a cyber breach. Recently, it was used to analyze the methods used by Russian actors targeting energy and other critical U.S. infrastructure sectors. https://www.us-cert.gov/ncas/alerts/TA18-074A
[3] Source: https://www.eventtracker.com/tech-articles/siemphonic-cyber-kill-chain/

In the first phase of the Cyber Kill Chain
the adversary spends a significant amount of time completing reconnaissance of the defender’s
enterprise network to learn about the vulnerabilities present and potential points of compromise.
After recon, the adversary’s next phases consist of weaponizing his exploit or malware, delivering
it to the network through some medium and then exploiting a vulnerability in a system connected
to the defender's network. To finish out his attack, the adversary installs additional malware to
ensure persistence and then establishes a command and control channel so he is able to act on his
objectives, i.e., exfiltrate sensitive information, and complete his ultimate goal from breaching
the network.
Figure 1.1: The Cyber Kill Chain developed by Lockheed Martin.
Stopping a cyber breach crucially depends on thwarting an adversary’s attack during one of
the phases occurring in the Cyber Kill Chain. To impede the adversary during the phases of his
attack, network administrators use techniques such as the whitelisting of applications, locking
down permissions, and immediately patching vulnerabilities [42]. An interesting direction of re-
search is the use of deception as a framework to improve cybersecurity defenses [4]. In particular,
deception is extremely useful as a defense mechanism to impede an adversary during the recon
phase of his attack. Criminals who target a network first map it out by using network scanning
tools to ascertain network information through a suite of requests using tools such as NMap [53].
Deception is useful here as the adversary relies on the information received from network scan-
ning tools to learn about the vulnerabilities he can exploit to compromise the defender’s network.
Instead of directly stopping an attack, deceptive techniques concentrate on diverting an adversary
to attack non-critical systems or honeypots using deceptive views of the network state. Essen-
tially, approaches for deception focus on making it difficult for the adversary to ascertain the true
state of the network using network scanning tools like NMap. However, one drawback of most
of these previous approaches is that they do not adequately model the adversarial nature of the
cybersecurity domain.
After the recon phase, the adversary must complete phases 2 through 6 of his attack which in-
clude the delivery and exploitation phases of his attack. During these phases, automated intrusion
detection and prevention systems (IDS) generate alerts for potentially malicious activity occur-
ring on the defender's network, where the generated alerts are aggregated into a central repository
by security information and event management (SIEM) tools. To resolve the alerts generated by the
IDS, human cybersecurity analysts on the CERT must investigate the alerts to assess whether they
were generated by malicious activity, and if so, how to respond. Unfortunately, these automated
systems are notorious for generating high rates of false positives [78]. Compounding this prob-
lem is the fact that expert analysts are in short supply, so organizations face a key challenge in
investigating and managing the enormous volume of alerts they receive using the limited time of
analysts. Failing to solve this problem can render the entire system insecure, e.g., in the 2013 at-
tack on Target, IDS raised alarms, but they were missed in the deluge of alerts [69]. It is important
to note these alerts give the defender the ability to stop the adversary during the later phases of
the cyber kill chain, e.g., the delivery, exploitation, or installation phases, and prevent a network
breach from occurring or catch the adversary earlier in his attack, reducing the damage from
a network breach.
An overriding consideration throughout the Cyber Kill Chain is the strategic reasoning taking
place between the network defender and a motivated adversary. Unfortunately, the defender is
limited in her security resources and cannot protect all systems from an attack; further, the ad-
versary could be conducting surveillance to learn about the defender’s deceptive strategies and
resolution strategies. A drawback of previous approaches to deception and the prioritization of
cyber alert resolution is that they do not sufficiently consider the response of a strategic adversary.
Failing to consider the actions of a strategic adversary can have detrimental effects on the effec-
tiveness of the defender’s deception and protection strategies as deterministic (non-randomized)
strategies are open to exploitation by smart adversaries. Game theory provides a foundation for
modeling the strategic interactions between two opposing parties. In this sense, my thesis looks to
game theory as a foundation to improve cyber defense for enterprise networks and to provide an-
other tool to the network defender to harness for protecting her network from constantly evolving
cyber adversaries.
In recent years, the use of game theory has seen tremendous success in security domains for
handling the complex scheduling and resource allocation problems of security resources.
One model of interest for the cybersecurity problems studied in my thesis is the Threat Screening
Game (TSG) model which was developed for the airport passenger screening domain. Airport
passenger screening relates nicely to the problem of choosing which incoming alerts to screen
with cyber analysts. However, it and other game theory models fail in three significant ways
when being applied to cyber deception and alert prioritization. Namely, (1) previous game theory
models consider an adversary who observes a mixed strategy from the defender, (2) they do
not model defender resources with heterogeneous screening times or attacks which present as
probabilistic distributions over alert types, and (3) previous work assumes a game with zero-sum
payoffs.
Figure 1.2: Domains of application for game theory in varying security domains and operations: (a) Network Reconnaissance, (b) Cyber Security Operations, (c) TSA Screening.
My thesis addresses these issues and advances the state of the art in the application of game
theory to cybersecurity and the field of security games. The first major contribution of my the-
sis is the Cyber Deception Game (CDG) model which captures the interaction between defender
and adversary during the reconnaissance phase of an enterprise network attack. The second ma-
jor contribution of my thesis is the Cyber-alert Allocation Game model which provides a game
theoretic framework for prioritization of cyber alerts coming from IDS placed throughout the
defender's network and tackles the second limitation of previous game theoretic models. The final
major contribution of my thesis extends CAGs to domains with general-sum payoffs, which is
additionally applicable to physical security domains as it applies to TSGs as well.
1.1 Cyber Deception
Experienced attackers attempting to infiltrate a network spend a significant amount of time during
the reconnaissance phase of their attack to find vulnerabilities throughout the network by map-
ping out the network through NMap scans, stealth SYN scans, TCP connections scans along with
others [54, 42]. After gathering all of this information, the attacker then mounts their attack on
a network. In the cyber domain, the network administrator has asymmetric information as she
knows the true state of the network, i.e., properties of systems such as its operating system and
applications running, and further, she can deceptively alter responses to network scans sent by
an adversary [18, 2]. By hiding or lying about part of each system’s configuration, the defender
makes it significantly harder for the adversary to determine the true vulnerabilities present in
systems on the network. Since exploits generally rely on specific vulnerabilities and versions of
software, incorrectly identifying a system’s software information decreases the likelihood of a
successful attack and increases the amount of time it takes an adversary to compromise the de-
fender’s network. This type of interaction introduces an opportunity for the defender to employ
deceptive techniques at a network level to increase uncertainty during an adversary’s reconnais-
sance activities.
The first contribution of my thesis concentrates on how the defender can maximize the benefit
from deceiving cyber adversaries with a mix of true, false and obscure responses to network
scans. To highlight the defender’s advantage, consider a network with 1 system running Nginx
and 2 running Tomcat. Suppose the adversary has a specific exploit for Nginx. The adversary
uses nmap to determine the webserver for each system and deploys his exploit to the one running
Nginx. However, if the defender can lie about the webserver, the adversary potentially has to
deliver his exploit to all systems before infiltrating the network. This process increases the time
spent by the adversary to infiltrate the network (which gives the defender time to mount a better
defense) and increases the chances the defender identifies an ongoing attack. The main problem
for the defender then is to determine how to alter the adversary’s perception of the network to
minimize her expected loss from an attack.
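To make the Nginx/Tomcat intuition concrete, here is a small back-of-the-envelope simulation (my own illustrative sketch, not from the thesis): with truthful scan responses the attacker delivers his exploit straight to the Nginx host, while with deceptive responses the vulnerable host is found, on average, only after probing roughly half the network.

```python
# Back-of-the-envelope simulation of the Nginx/Tomcat example above (illustrative only).
import random

systems = ["nginx", "tomcat", "tomcat"]            # true webservers on the network

def expected_deliveries(deceived, trials=10_000):
    """Average number of exploit deliveries until the Nginx host is hit."""
    total = 0
    for _ in range(trials):
        order = systems[:]
        if deceived:
            random.shuffle(order)                  # responses lie, so every system looks alike
        else:
            order.sort(key=lambda s: s != "nginx") # truthful scans reveal Nginx; attack it first
        total += 1 + order.index("nginx")
    return total / trials

print("no deception:", expected_deliveries(False))  # 1.0 delivery
print("deception   :", expected_deliveries(True))   # about 2.0 deliveries on average
```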
In this regard, I introduce a novel game theoretic model called the Cyber Deception Game
(CDG) [72] which captures the interaction between the defender and adversary during the recon
phase of network attacks and the cyber kill chain. The CDG model introduces a framework for a
defender to deploy randomized deceptive strategies in a holistic fashion at a network wide level.
I first consider a powerful adversary where I make a robust assumption about the information
the adversary has to determine his optimal response to the defender's deceptive strategy. In this
domain, the adversary is assumed to view a static version of the network state, i.e., one observed
configuration for each system on the network, which in turn causes the problem of computing
the defender’s optimal strategy to be NP-hard. To alleviate these issues, I explore two separate
algorithms for solving CDGs. First, I use a reformulation approach to convert the defender’s non-
linear optimization problem into a MILP. Second, I make use of a Bisection algorithm framework
which solves a sequence of Mixed Integer Linear Program (MILP) feasibility problems to obtain
an ε-optimal approximate strategy for the defender. The second adversary model I explore in this
research focuses on a naive adversary who is not aware of the deception and has a fixed set of
preferences for systems (given an observed configuration) on the network. Extensive experimen-
tal evaluation is provided comparing the two approaches and the trade-off between solving
the optimal reformulated MILP and the bisection algorithm.
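The bisection idea itself is generic and worth sketching. The toy Python sketch below is not the thesis's actual formulation; the feasibility oracle is a stand-in for the MILP feasibility problem. It binary-searches the defender's achievable value, querying the oracle at each midpoint until the interval is smaller than ε.

```python
# Generic sketch of the bisection idea (the feasibility oracle is a stub, not the thesis's MILP).
def bisection(feasible, lo, hi, eps=1e-4):
    """Binary-search the best achievable objective value in [lo, hi] to within eps."""
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        if feasible(mid):      # in the CDG case: solve a MILP feasibility problem at value mid
            lo = mid           # some defender strategy achieves at least mid
        else:
            hi = mid
    return lo

# Toy oracle: pretend any value up to 3.7 is achievable.
print(bisection(lambda v: v <= 3.7, lo=0.0, hi=10.0))   # ~3.7
```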
1.2 Cyber Threat Screening
Many approaches for mitigating the problem of sifting through the deluge of alerts generated fo-
cus on reducing the number of overall alerts. IDS can be carefully configured, alert thresholds can
be tuned, and the classification methods underlying the detections can be improved [77, 13, 48].
Other techniques include aggregating alerts [91], and visualizing alerts for faster analysis [61].
Even when using all of these techniques, there is still a large volume of alerts which makes it
infeasible for analysts to investigate all of them in depth. My work focuses on the remaining
problem of assigning limited analysts to investigate alerts after automated pre-processing meth-
ods have been applied.
The typical approach to managing alerts is either ad-hoc or first investigates alerts with the
highest priority (e.g., risk). However, this fails to account for the adversarial nature of the cy-
ber security setting. An attacker who can learn about a predictable alert management policy can
exploit this knowledge to launch a successful attack and reduce the likelihood that they will be
detected. For example, if the defender had a policy that only inspects alerts from high valued as-
sets in her network, an attacker who learns this can evade detection indefinitely by only attacking
lower valued assets.
To address shortcomings of the previous methods for cyber alert allocation, I introduce the
Cyber-alert Allocation Game (CAG) [73] model which provides a game theoretic framework for
the defender to prioritize and assign alerts for resolution raised by IDPS placed on systems and
nodes across the network. Game theory allows us to explicitly model the strategies an attacker
could take to avoid detection. By following a randomized, unpredictable assignment strategy the
defender can improve the effectiveness of alert resolution strategies against strategic attackers.
The CAG model considers the characteristics of the alerts (e.g., criticality of origin system), as
well as the capabilities of the analysts in determining the optimal policy for the defender. I de-
velop techniques to find the optimal allocation of alerts to analysts on the Computer Emergency
Readiness Teams (CERT) in general CAGs and identify special cases where the computation be-
comes easy. The main algorithm takes advantage of a compact marginal representation of the
defender’s strategy space that leverages a special type of constraint structure called a bihierar-
chy which provides special conditions for when optimizing over the defender’s marginal strategy
space is equivalent to the mixed strategy space. Further heuristics are developed to achieve sig-
nificant scale-up in CAGs without significant trade-offs in solution quality.
Both the CDG and CAG models assume games in which the payoffs are zero-sum, however,
in many real world security domains this assumption may not hold. Hence, the last contribution
of my thesis extends the CAG to domains with general-sum payoffs which I show makes the com-
putation of the defender’s optimal strategy NP-hard [71]. This work has additional applicability
to physical security domains (e.g., shipping container screening or airport passenger screening)
as it extends the work done on Bayesian Threat Screening Games (TSGs), which previously assumed
the payoffs to be zero-sum. To deal with these issues, I provide the GATE algorithm,
which combines branch-and-bound search with the Marginal Guided Algorithm (introduced in [22]
to solve zero-sum TSGs) to optimally solve general-sum TSGs, along with additional heuristics that
improve the scalability of GATE to real-world domains.
1.3 Thesis Outline
In Section 2, I discuss two relevant game theoretic models which inspire the development of the
game theoretic models in this thesis. In Section 3, related works are discussed for background
in security games and cybersecurity research. Next, in Section 4, I introduce the Cyber Decep-
tion Game (CDG) model and discuss the associated algorithms for solving for optimal deceptive
defender strategies. Section 5 introduces the Cyber-alert Allocation Game (CAG) model and
presents algorithms for determining defender alert allocation strategies to her cyber analysts on
the floor. In Section 6, I present the GATE algorithm which solves General-sum Bayesian Threat
Screening Games. Finally, in Section 7 I discuss relevant future work for the use of deception in
cybersecurity domains, future work for using the alert prioritization strategies in the real world
and conclude my thesis.
Chapter 2
Background
2.1 Security Games
A security game [70, 27, 60, 81, 43] is a two-player game played between a defender and an
adversary, where the defender protects a set of $N$ targets from the attacker. The defender is
assumed to have $K$ security resources at her disposal to prevent an attack. A pure strategy for
the defender is an assignment of the $K$ resources to either targets or patrols (which can consist
of multiple targets) while an adversary's pure strategy consists of choosing a target to attack.
Denote the $k$-th defender pure strategy as $P_k$, which is an assignment of all security resources.
$P_k$ is represented as a column vector $P_k = [P_{ki}]^T$, where $P_{ki} = 1$ determines whether target $i$ is
protected by $P_k$. For instance, consider a security game with 3 targets and 2 resources; then
$P_k = [1, 0, 1]$ represents the pure strategy where the defender protects targets 1 and 3 while 2
is left uncovered. Each target $i \in N$ has a corresponding set of payoffs $\{U^c_d, U^u_d, U^c_a, U^u_a\}$: if
an adversary attacks target $i$ and it is protected by a resource, then he receives utility $U^c_a$ and the
defender receives utility $U^c_d$; if target $i$ is not protected, then the adversary receives utility $U^u_a$ while
the defender receives $U^u_d$. For security games, it is assumed that $U^c_a < U^u_a$ and $U^c_d > U^u_d$, which
means the defender strictly benefits from covering a target more often with a resource while it is
disadvantageous to the adversary. In security domains, it is often true that there are significantly
more targets to protect than security resources available to protect them, i.e., $K < N$, and hence,
it is of tremendous importance to the defender to carefully design a protection strategy.
A major assumption in most work on security games is that the interaction between the de-
fender and adversary can be captured via the Stackelberg assumption, i.e., the defender commits
to a strategy first. The adversary is then assumed to conduct surveillance and learn the defender's
strategy before selecting his best response. This type of game is called the Stackelberg Security
Game (SSG), where the standard solution concept is the Strong Stackelberg Equilibrium (SSE).
For SSE, the defender is assumed to select an optimal strategy based on the assumption that an
adversary selects his best response, breaking ties in favor of the defender. In an SSG, the de-
fender's optimal strategy is generally a mixed (randomized) strategy $q$, which is a distribution
over pure strategies $\mathcal{P}$, as an adversary can typically exploit any deterministic (pure) strategy
played by the defender. The mixed strategy is represented as a vector $q = [q_k]^T$, where $q_k \in [0,1]$
is the probability of choosing a pure strategy $P_k$ and $\sum_k q_k = 1$. The defender's strategy can also
be represented in a compact manner using a "marginal" representation. Let $n$ be the marginal
strategy; then $n_i = \sum_{P_k \in \mathcal{P}} q_k P_{ki}$ is the probability that target $i$ is protected. Hence, work on SSGs
typically focuses on solving for the defender's optimal mixed strategy $q$ or the marginal strategy $n$.
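As a concrete illustration of these definitions, the short sketch below (illustrative Python; the pure strategies, mixed strategy, and payoffs are made-up numbers, not taken from the thesis) builds the 3-target, 2-resource example, mixes three pure strategies, and recovers the marginal coverage $n_i = \sum_k q_k P_{ki}$ along with the expected utilities at a target.

```python
# Illustrative SSG sketch: mixed strategy q over pure strategies P_k and the induced marginals.
import numpy as np

# Row k is pure strategy P_k for 3 targets and 2 resources; P_ki = 1 iff target i is covered.
P = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])
q = np.array([0.5, 0.3, 0.2])            # mixed strategy: probability of each pure strategy

n = q @ P                                 # marginal coverage: n_i = sum_k q_k * P_ki
print("coverage per target:", n)          # [0.8, 0.5, 0.7]

# Example payoffs at target i = 0, chosen so that U^c_d > U^u_d and U^c_a < U^u_a.
U_c_d, U_u_d = 0.0, -5.0
U_c_a, U_u_a = -2.0, 5.0
i = 0
print("defender EU:", n[i] * U_c_d + (1 - n[i]) * U_u_d)   # expected utility if i is attacked
print("attacker EU:", n[i] * U_c_a + (1 - n[i]) * U_u_a)
```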
2.2 Threat Screening Games
A threat screening game (TSG) is a Stackelberg game played between the screener (leader) and an
adversary (follower) in the presence of a set of non-player screenees that pass through a screening
checkpoint operated by the screener. While TSGs are applicable to many domains, given that
readers may be familiar with passenger screening at airports, I use examples from that domain.
A TSG is composed of several features beyond the traditional SSG. A TSG is played over time
windows $w \in W$ that capture the temporal aspect of screening. The incoming passengers are grouped
together by implicit characteristics, e.g., risk level and flight, into screenee categories $c \in C$.
$N_c$ gives the total number of screenees arriving in a category while $N^w_c$ denotes the number of
screenees in category $c$ arriving in time window $w$. The adversary is assumed to have an adversary
type $\theta \in \Theta$ that defines implicit characteristics of an adversary which cannot be chosen, e.g., a TSA-assigned
risk level. This in turn restricts the adversary to select from screenee categories $C_\theta \subseteq C$.
The adversary is assumed to know his type, but the defender only knows a prior distribution $z$
over the types. The adversary then chooses an attack method $m \in M$ to attack with, e.g., an on-body
explosive. The defender has resource types $r \in R$ that can be used to screen at most $L^w_r$ screenees in
a given time window. Each resource has effectiveness $E^r_m$, which is the probability of detecting
attack method $m$. These resources can be combined into team types $t \in T$ that are used to screen
a passenger, e.g., a walk-through metal detector and an x-ray machine. Each team has effectiveness
$E^t_m = 1 - \prod_{r \in t} (1 - E^r_m)$, which defines the probability of detecting $m$.
Pure strategy. A pure strategy $P$ for the screener can be represented by $|W| \cdot |C| \cdot |T|$ non-negative
integer-valued numbers $P^w_{c,t}$, where each $P^w_{c,t}$ is the number of screenees in $c$ assigned to
be screened by team type $t$ during time window $w$. Pure strategy $P$ must assign every screenee
to a team type while satisfying the resource type capacity constraints for each time window, via
the following constraints: $\sum_{t \in T} I^t_r \sum_{c \in C} P^w_{c,t} \leq L^w_r \;\; \forall w \in W, \forall r \in R$ and
$\sum_{t \in T} P^w_{c,t} = N^w_c \;\; \forall w \in W, \forall c \in C$, where $I^t_r$ is an indicator function returning 1 if team type $t$
contains resource type $r$ and 0 otherwise. $\hat{P}$ denotes the set of all valid pure strategies and it is
assumed that $\hat{P} \neq \emptyset$, i.e., it is possible to assign every screenee to a team type. The pure
strategies for the adversary types are denoted as $a^{\theta,w}_{c,m}$, which specifies that adversary type $\theta$
selects time window $w$, screenee category $c$, and attack method $m$.

[Figure 2.1: TSG Strategies. (a) Pure Strategy: category $c_1$'s nine screenees are split 4/2/3 across teams $t_1/t_2/t_3$, $c_2$'s are split 1/4/4, and $c_3$'s are split 3/5/1. (b) Marginal Strategy: the corresponding fractional splits are 4.3/1.5/3.2 for $c_1$, 0.7/4.1/4.2 for $c_2$, and 3.0/5.4/0.6 for $c_3$.]
For example, consider a game with one time window $w_1$ and two screening resources $r_1$, $r_2$
with capacity constraints $L_{r_1} = L_{r_2} = 20$, respectively. The resources are combined into three
screening teams $t_1 = \{r_1\}$, $t_2 = \{r_1, r_2\}$ and $t_3 = \{r_2\}$. There are three categories $c_1$, $c_2$ and $c_3$,
and each has $N_c = 9$ passengers. Figure 2.1(a) shows an example of a pure strategy allocation in
this game.
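The two constraint families above are straightforward to check mechanically. The sketch below (Python; the numbers loosely follow the Figure 2.1(a) example as reconstructed here, so treat them as illustrative) verifies that a candidate pure strategy assigns every screenee to a team and respects each resource's per-window capacity.

```python
# Sketch: validating a TSG pure strategy against the assignment and capacity constraints.
teams = {"t1": {"r1"}, "t2": {"r1", "r2"}, "t3": {"r2"}}   # team composition
L = {"r1": 20, "r2": 20}                                   # L^w_r: per-window resource capacity
N = {"c1": 9, "c2": 9, "c3": 9}                            # N^w_c: screenees per category

# P^w_{c,t}: screenees of category c screened by team t in this (single) time window.
P = {"c1": {"t1": 4, "t2": 2, "t3": 3},
     "c2": {"t1": 1, "t2": 4, "t3": 4},
     "c3": {"t1": 3, "t2": 5, "t3": 1}}

# Assignment constraint: sum_t P^w_{c,t} = N^w_c for every category c.
assert all(sum(P[c].values()) == N[c] for c in N)

# Capacity constraint: sum_{t contains r} sum_c P^w_{c,t} <= L^w_r for every resource r.
for r, cap in L.items():
    load = sum(P[c][t] for c in P for t in teams if r in teams[t])
    assert load <= cap, f"resource {r} over capacity: {load} > {cap}"
print("pure strategy is valid")
```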
Marginal strategy. A marginal strategy $n$ for the screener can be represented by $|W| \cdot |C| \cdot |T|$
non-negative real-valued numbers $n^w_{c,t}$, where $n^w_{c,t}$ is the number of screenees in $c$ assigned
to be screened by team type $t$ during time window $w$. We would want this marginal strategy to
be a valid mixed strategy (implementable), i.e., there should exist a probability distribution over
$\hat{P}$ given by $q_P$ (i.e., $\sum_{P \in \hat{P}} q_P = 1$, $0 \leq q_P \leq 1$) such that $n^w_{c,t} = \sum_P q_P P^w_{c,t}$. An example marginal
screener strategy is shown in Figure 2.1(b) with the same game parameters described previously.
Utilities. Since all screenees in category $c$ are screened equally in expectation, we can inter-
pret $n^w_{c,t} / N^w_c$ as the probability that a screenee in category $c$ arriving during time window $w$ will
be screened by team type $t$. Then, the probability of detecting an adversary type in category $c$
during time window $w$ using attack method $m$ is given by $x^w_{c,m} = \sum_t E^t_m \, n^w_{c,t} / N^w_c$. The payoffs for
the screener are given in terms of whether adversary type $\theta$ chooses screenee category $c$ and is
either detected during screening, denoted as $U^d_{s,c}$, or is undetected during screening, denoted as
$U^u_{s,c}$. Similarly, the payoffs for adversary type $\theta$ are given in terms of whether $\theta$ chooses scree-
nee category $c$ and is either detected during screening, denoted as $U^d_{\theta,c}$, or is undetected during
screening, denoted as $U^u_{\theta,c}$. Given adversary type $\theta$'s pure strategy $a^{\theta,w}_{c,m}$, the screener's expected
utility is given by $U_s(a^{\theta,w}_{c,m}) = x^w_{c,m} U^d_{s,c} + (1 - x^w_{c,m}) U^u_{s,c}$ and the expected utility for adversary type
$\theta$ is given by $U_\theta(a^{\theta,w}_{c,m}) = x^w_{c,m} U^d_{\theta,c} + (1 - x^w_{c,m}) U^u_{\theta,c}$.
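Putting these pieces together, the sketch below (illustrative Python; the effectiveness values, the marginal assignment, and the payoffs are assumed numbers, not the thesis's) computes a team's effectiveness $E^t_m$, the detection probability $x^w_{c,m}$ for one category, and the resulting expected utilities for screener and adversary.

```python
# Sketch: TSG team effectiveness, detection probability, and expected utilities.
def team_effectiveness(team, E_r, m):
    """E^t_m = 1 - prod_{r in t} (1 - E^r_m)."""
    p_miss = 1.0
    for r in team:
        p_miss *= 1.0 - E_r[r][m]
    return 1.0 - p_miss

E_r = {"r1": {"m1": 0.6}, "r2": {"m1": 0.3}}               # E^r_m: assumed detection probabilities
teams = {"t1": {"r1"}, "t2": {"r1", "r2"}, "t3": {"r2"}}

# Marginal assignment n^w_{c,t} for one category c with N^w_c = 9 screenees (cf. Figure 2.1(b)).
n_ct = {"t1": 4.3, "t2": 1.5, "t3": 3.2}
N_wc = 9.0

# x^w_{c,m} = sum_t E^t_m * n^w_{c,t} / N^w_c: probability an attacker in c using m is caught.
x = sum(team_effectiveness(teams[t], E_r, "m1") * n_ct[t] / N_wc for t in n_ct)

U_d_s, U_u_s = 0.0, -10.0       # screener payoff if detected / undetected
U_d_a, U_u_a = -5.0, 10.0       # adversary payoff if detected / undetected
print(f"x = {x:.3f}")
print("screener EU :", x * U_d_s + (1 - x) * U_u_s)
print("adversary EU:", x * U_d_a + (1 - x) * U_u_a)
```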
Chapter 3
Related Work
3.1 Stackelberg Security Games
Stackelberg Security Games (SSG) have been well studied in previous literature [45, 50, 47, 34,
80, 41, 51, 46]. The early work in this area of research concentrated on providing the necessary
theoretic and algorithmic foundation needed to solve general Stackelberg games and not on the
problem of security. [82] is the first work to explore the concept of commitment to mixed (i.e.,
randomized) strategies in Stackelberg games. The earliest approach to solving general Stackel-
berg games termed Multiple LPs is introduced in [27] which shows that you can compute the
optimal commitment by solving a single LP for each adversary action. The DOBSS algorithm
presented in [60] improves on the Multiple LPs approach by showing the Strong Stackelberg
Equilibrium (SSE) in Bayesian Stackelberg Games, i.e., where the leader may face multiple fol-
lower types, can be found using a single Mixed Integer Linear Program (MILP).
The formal Stackelberg Security Game model was introduced in [43] along with the
ORIGAMI and ERASER algorithms for solving these games. ORIGAMI gives a polynomial time
algorithm for solving security games without resource constraints, e.g., scheduling constraints in
patrolling domains. ERASER, on the other hand, compactly represents the defender’s strategy
space which allows for significantly more scalability in security games with multiple resources.
Additionally, ERASER handles resource constraints that are not considered in ORIGAMI. The
first optimal approach to solving SSG with arbitrary resource constraints is ASPEN [37] which
uses a branch-and-price framework that considers only the most relevant pure strategies to incre-
mentally build up the defender’s optimal mixed strategy. The HSBA algorithm [39] advanced
the state-of-the-art in solving Bayesian SSG. HSBA breaks down the Bayesian game into smaller
restricted games, i.e., games with a subset of the adversary types from the original problem, and
solves the restricted games by combining an efficient branch and bound search with column gen-
eration. The solution information from these restricted games is then used to solve the original
game more effectively.
Research on SSGs has led to several successfully deployed decision-support applications in-
cluding ARMOR [64], IRIS [81], GUARDS [65], PROTECT [75], and TRUSTS [87]. These de-
cision aids provide algorithms that generate strategies and patrols of defender security resources
for the protection of physical targets such as airports, ports and metro systems. All of these works
assume that an adversary observes a mixed strategy from the defender and assume the security
resources are 100% effective at protecting a target if allocated to it. This is in contrast to the
work in this thesis that accounts for resources that are not 100% effective and an adversary who
observes a pure strategy from the defender, which are crucial features of the cyber domain.
The problem of threat screening has been explored extensively in literature. The Threat
Screening Game (TSG) model was introduced in [22] and looks to SSG for inspiration to de-
vise a game theoretic model for the threat screening domain. As [22] points out, the TSG ap-
proach provides a significant improvement on prior non-game-theoretic models in domains such
as screening for shipping containers [6], stadium patrons [68], and airport passengers [57, 58], or
simple game-based models such as [83]. Specifically, previous non-game-theoretic approaches
fail to model a rational adversary who aims to take advantage of vulnerabilities in the screening
strategies whereas TSGs take this into account when devising screening strategies. Furthermore,
previous solution methods for solving security games fail to apply directly to airport passenger
screening. This happens because TSGs (i) include a group of non-player screenees that all must
be screened while a single adversary tries to pass through screening undetected, (ii) model screen-
ing resources with different efficacies and capacities that can be combined to work in teams and
(iii) do not have an explicitly modeled set of targets. These are fundamental differences from
traditional security games [43, 64, 75, 81] where the defender protects a set of targets against a
single adversary. [22] provides MGA to solve for the defender’s optimal screening strategy in
Bayesian zero-sum TSGs, but these techniques fail to apply to TSGs with non-symmetric (i.e.,
general-sum) payoffs. HSBA represents an interesting approach for solving Bayesian general-
sum TSGs, but as shown in [22] column generation is not a scalable approach for solving TSGs.
Hence, novel algorithms are needed to tackle the challenges of scaling up solution methods to
Bayesian general-sum TSGs.
3.2 Cyber-alert Allocation
Intrusion detection has been studied for over three decades, beginning with the early work
of [7, 29]. A major concentration of research on intrusion detection has focused on developing au-
tomated techniques (e.g., machine learning) for identifying malicious activity [36, 35, 84, 30, 79].
However, these methods have significant detection error and suffer from generating a plethora of
alerts for the defender to investigate [77]. As the volume of alerts increased significantly due to
Intrusion Detection Systems across a network, later research [12, 77] focused on reducing the
number of false positive alerts by developing automated alert reduction techniques. In this regard,
there are both open source [21] and commercially available [91] Security Information and Event
Management (SIEM) tools that take raw data input from sensors, aggregate and correlate them, and provide
a central repository of alerts from the enterprise network that can then be assigned for investiga-
tion by the cyber security analysts. Most modern cybersecurity operations of large organizations
house a team of human cyber analysts (e.g., Computer Emergency Response Team (CERT)[1])
who are tasked with investigating these alerts generated by the automated detectors [28]. A re-
cent line of work [33] uses decision theory to optimize the scheduling of cyber-security analysts
over 2 week periods to minimize the risk associated with investigating and needing to remediate a
critical attack, but this approach does not consider the response of a strategic attacker. In contrast
with this previous work, my thesis focuses on the problem of how to assign the alerts to analysts
which are currently on the operations floor given an adversary attempting to exploit weaknesses
in the defender’s allocation strategy.
My approach for cyber alert allocation draws on the principles and modeling techniques of
the large body of work that applies game theory to security problems [80]. The existing work
on security games focuses heavily on applications to physical security (e.g., patrolling), with
some exceptions (e.g., [48, 31]). However, Cyber-alert Allocation Games (CAGs) significantly
differ from SSGs due to the absence of an explicit set of targets, a large number of benign alerts
and varying time requirements for inspections.

[1] As an example, the United States Air Force employs the 688th Cyberspace Wing to protect Department of Defense networks and systems from priority threats. http://www.afcyber.af.mil/About-Us/688th-Cyberspace-Wing/

TSGs relate nicely to CAGs, but there are some
crucial differences with the cybersecurity domain: (1) Screening in airports is a quick scan of a
passenger; in CAGs, investigating a threat may take varying amounts of time leading to a different
“non-implementability” [45] issue for CAG as compared to TSG and other security games which
require novel techniques to resolve, (2) CAG does not consider teams of resources, and (3) in
cybersecurity attacks result in a distribution of potential alert types.
3.3 Cyber Deception
The use of game theory has been studied before in the context of cybersecurity problems [52, 5,
74, 49, 73]. [44, 63, 32, 31] study a honeypot selection game where the defender chooses the
properties of the network, a sequel where the adversary can probe machines to ascertain the true
state and a final version which models the adversary’s actions as attack graphs. [24] studies a
signaling game where the defender signals to an adversary if a system is either real or a honeypot
when the adversary performs a scan. [62] extends the signaling game to account for an adversary
who can gain evidence about the true state of a system. In my work, I consider a game scenario in
which the defender determines the optimal way to respond to scans sent by a potential adversary
given a set of possible responses. Further, I explore different types of adversaries with varying
awareness of deception.
Deception has also been widely studied as a means to improve the protection of enterprise
networks from potential hackers and intruders [3, 1]. [2] uses a graph theoretic approach to
confuse a potential attacker by manipulating his view of systems on the network. However, this
work focuses on finding a view which is measurably different from the true state and does not
adequately model the response of a strategic adversary. [40] is closely related to my work on
20
deception. The authors study how to respond to an attacker’s scan queries using an annotated
probabilistic logic model. My research provides a complementary view using game theory to
determine how a defender manipulates scan responses to confuse an attacker’s view of systems
on the network. I also study varying adversary models, which can have significant impact on the
defender’s optimal strategy which is not explored in [40].
Chapter 4
Cyber Deception Games
4.1 Problem Domain
The first stage of the cyber kill chain, the reconnaissance phase, is one of the most critical stages
for an adversary attempting to breach an enterprise network, and hence, a critical phase in which
the defender can protect against network intrusions. Motivated adversaries spend a signifi-
cant amount of time completing reconnaissance on the enterprise network to learn about system
vulnerabilities and the best points of compromise in the defender’s network. Reconnaissance
is completed through a suite of scan requests, such as Stealth SYN scans, OS scans, and ser-
vices scans by using network scanning tools such as NMap. Importantly, the type of interaction
between the defender and an adversary during the recon phase provides an opportunity for the
defender to deceive a cyber adversary by altering and lying about the system information which
is returned from scanning activity.
This chapter explores a game theoretic model, the Cyber Deception Game (CDG), which
provides a framework for the defender to deceptively set scan responses for systems across the
defender’s enterprise network to deceive potential cyber adversaries. Two adversary models are
explored that represent the various types of real-world adversaries that can be encountered in
the network security domain. The first is a powerful adversary who is assumed to have a robust
amount of information about the defender’s deceptive strategy, e.g., nation-state level adversaries
who may have access to insider information. The second is a naive adversary assumed to not
be aware of the deception and who has a fixed set of preferences over observed systems on the
network, e.g., script-kiddies who are not as advanced. I show that computing the optimal strategy
for the network defender is NP-hard when facing a powerful adversary due to either masking
or cost constraints imposed on the defender’s deceptive response strategy space. Additionally, I
show computing the optimal strategy against the naive adversary to be NP-hard as well when both
the masking and cost constraints are imposed on the defender. I develop three solution approaches
to find the defender’s optimal deceptive response scheme against a powerful adversary: (1) a
Mixed Integer Linear Programming (MILP) approach, (2) a bisection algorithmic approach, and
(3) a greedy heuristic approach to quickly generate defender deceptive strategies, along with
approaches to find the defender's optimal strategy against naive adversaries. I also present experimental
results showing the scalability of the algorithms to larger scale CDGs. Finally, technical deceptive
techniques and approaches are discussed which could leverage the use of the CDG model to
generate deceptive network views.
4.2 Cyber Deception Game
The Cyber Deception Game (CDG) is a zero-sum Stackelberg game between the defender (e.g.,
network administrator) and an adversary (e.g., hacker). The defender moves first and chooses how
the systems should respond to scan queries from an adversary, and the adversary subsequently
Figure 4.1: The network reconnaissance domain in which an adversary scans the defender’s
enterprise network and the defender deceptively alters the system’s responses.
moves by choosing a system to attack based on the responses. Despite the similarities with game-
theoretic models in security domains, such as [80, 15, 16], there are two key differences. First,
the defender can only commit to a pure strategy and not an arbitrary mixed strategy. This is
because, in these domains, network administrators modify the network very infrequently, and
thus, the attackers’ view of the network is static. Second, there are no explicit security resources
for the defender in CDGs. Consequently, the existing approaches for solving standard Stackelberg
games in security domains cannot be directly applied. The various components of the game and
the aforementioned model characteristics are described in detail as follows:
Systems and True Configurations
The defender aims to protect a set $K$ of systems from possible exploits and intrusions. Each sys-
tem has certain attributes, e.g., an operating system, an anti-virus protection mechanism, services
hosted, etc. These attributes altogether constitute the true configuration (TC) of the system. The
set of all possible TCs is denoted by $F$. Each system has an associated utility, which captures how
much the adversary would get by attacking it. This utility solely depends on the TC of the system
— each $f \in F$ induces a utility denoted by $U_f$ to any system that is assigned $f$. $U_f$ can be negative
if the security level of the system is so high that the attacker's efforts end in vain or the attacker
gets fake data from a seemingly successful attack, leading to a loss in the end. It follows that
the true state of the network (TSN) can be represented as a vector $N = (N_f)_{f \in F}$, where $N_f \in \mathbb{Z}_{>0}$
denotes the number of systems on the network which have a TC $f$, and $\sum_{f \in F} N_f = |K|$ (assume
$N_f \neq 0$, since such a TC simply need not be considered).
Observed Configurations
The adversary attempts to gain information about every system on the network, via probes and
scans. By scanning a system, the adversary observes certain attributes, which constitute the
observable configuration (OC) of the system. The set of possible OCs is denoted by $\tilde{F}$. It is
assumed that it is possible for the defender to make some of the observable attributes of a system
appear different than what they truly are (e.g., altering the TCP/IP stack of a system, spoofing a
running service on a port). By means of such alterations at her disposal, the defender controls
the OC an attacker sees when probing a system. Note that it may not be possible for an arbitrary
TC $f$ to be made to appear as an arbitrary OC $\tilde{f} \in \tilde{F}$ — such a constraint is called a feasibility
constraint, and these are denoted by a (0,1)-matrix $\pi$. Iff $\pi_{f,\tilde{f}} = 1$, then $f$ can be covered, or
masked, with $\tilde{f}$. The set of OCs which can mask a TC $f$ is denoted by $\tilde{F}_f = \{\tilde{f} \in \tilde{F} \mid \pi_{f,\tilde{f}} = 1\}$,
and similarly, the set of TCs which can be masked by an OC $\tilde{f}$, by $F_{\tilde{f}} = \{f \in F \mid \pi_{f,\tilde{f}} = 1\}$.
From the adversary's perspective, two systems having the same $\tilde{f}$ as their OC are indistinguishable,
and hence, his observed state of the network (OSN) can be represented as a vector
$\tilde{N} = (\tilde{N}_{\tilde{f}})_{\tilde{f} \in \tilde{F}}$, where $\tilde{N}_{\tilde{f}} \in \mathbb{Z}_{\geq 0}$ denotes the number of systems which have an OC $\tilde{f}$. As is the case
with the TSN $N$, we must have $\sum_{\tilde{f} \in \tilde{F}} \tilde{N}_{\tilde{f}} = |K|$.
It is assumed that masking a TC $f$ with an OC $\tilde{f}$ has a cost of $c(f, \tilde{f})$ incurred by the defender,
which typically captures the monetary costs for deploying the network modifications necessary for
such a deception. It is also useful to note that although honeypots are not explicitly discussed in
the model, they can be represented with a true configuration $f_{honey}$ and an observable configuration
$\tilde{f}_{honey}$. Systems on the network which should not appear to the adversary can also be modeled with
an observable configuration $\tilde{f}_{hidden}$ which does not have a utility, and further, any systems masked
with $\tilde{f}_{hidden}$ do not appear in the optimization problem formulated in Section 4.3.2.
Defender Strategies
Naturally, $F$, $\tilde{F}$, $\pi$, $c$ and $N$ are known to the defender. Given all this information, the defender
must decide her strategy — for each TC $f$, she must decide how many of the $N_f$ systems having
TC $f$ should be assigned the OC $\tilde{f}$, where $\tilde{f} \in \tilde{F}_f$. Thus, any possible strategy can be represented
as a $|F| \times |\tilde{F}|$ matrix $\phi$ having non-negative integer entries, with $\phi_{f,\tilde{f}}$ representing the number of
systems having TC $f$ and OC $\tilde{f}$. Hence, $\phi$ must satisfy

$$\phi_{f,\tilde{f}} \in \mathbb{Z}_{\geq 0} \quad \forall f \in F, \forall \tilde{f} \in \tilde{F} \quad (4.1)$$

Since the TSN $N$ is fixed, $\phi$ must also satisfy

$$\sum_{\tilde{f} \in \tilde{F}} \phi_{f,\tilde{f}} = N_f \quad \forall f \in F \quad (4.2)$$
Since feasibility constraints $\pi$ are specified, $\phi$ must also satisfy

$$\phi_{f,\tilde{f}} \leq \pi_{f,\tilde{f}} \, N_f \quad \forall f \in F, \forall \tilde{f} \in \tilde{F} \quad (4.3)$$
Finally, since setting any OC on a system has an associated cost, the defender's total cost cannot
exceed a limit $B$, which is called the budget constraint. Formally, $\phi$ must also satisfy

$$\sum_{f \in F} \sum_{\tilde{f} \in \tilde{F}} \phi_{f,\tilde{f}} \, c(f, \tilde{f}) \leq B \quad (4.4)$$

The set of strategies $\phi$ which satisfy the constraints (4.1), (4.2), (4.3), and (4.4) is denoted by $\Phi$.$^1$
When the defender plays $\phi \in \Phi$, the resulting OSN $\tilde{N}$ is given by $\tilde{N}_{\tilde{f}} = \sum_{f \in F} \phi_{f,\tilde{f}} \;\; \forall \tilde{f} \in \tilde{F}$.
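To make the strategy representation concrete, the following is a small illustrative sketch (not part of the
formal model) that checks a candidate strategy against constraints (4.1)-(4.4) and computes the induced
OSN; the dictionary-based encoding and function names are assumptions made for readability.

def is_feasible(phi, N, pi, cost, budget):
    """phi, pi, cost are dicts keyed by (TC, OC); N maps each TC to its system count."""
    # (4.1) non-negative integer entries
    if any(v < 0 or v != int(v) for v in phi.values()):
        return False
    # (4.2) every system with TC f is assigned some OC
    for f, n_f in N.items():
        if sum(v for (tc, oc), v in phi.items() if tc == f) != n_f:
            return False
    # (4.3) only feasible (TC, OC) pairs may be used
    if any(v > 0 and pi.get(key, 0) == 0 for key, v in phi.items()):
        return False
    # (4.4) total masking cost stays within the budget B
    if sum(v * cost.get(key, 0) for key, v in phi.items()) > budget:
        return False
    return True

def observed_state(phi):
    """Resulting OSN: number of systems shown with each OC."""
    osn = {}
    for (tc, oc), v in phi.items():
        osn[oc] = osn.get(oc, 0) + v
    return osn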
Adversary Strategies
Depending on the defender's strategy, the adversary observes $\tilde{N}$ as described above. All systems
having the same OC $\tilde{f}$ are indistinguishable to the adversary, and hence, he must be indifferent
between all such $\tilde{N}_{\tilde{f}}$ systems when deciding which system to attack. As a result, the adversary
is assumed to choose the OC $\tilde{f}$ which gives him the highest expected utility (described momentarily),
and attack all the $\tilde{N}_{\tilde{f}}$ systems having this OC with an equal probability. In short, we say
"the adversary attacks an OC $\tilde{f}$" to mean he attacks all the systems having OC $\tilde{f}$ with an equal
probability. A general mixed strategy for the adversary is to attack the set of OCs with any probability
distribution. However, since there always exists a pure best-response strategy in any game,
it suffices to consider the adversary's strategies as simply attacking a particular $\tilde{f}$.
$^1$ The feasibility constraints can simply be captured via the budget constraint by setting the costs of infeasible
assignments to be higher than the budget. However, they are essential in the model, since, in some cases, having no
budget constraint allows an efficient solution to the problem (e.g., Section 4.4), while still having the very practical
feasibility constraints keeps the problem non-trivial.
Utilities
When the defender plays a strategy $\phi$, the adversary's expected utility on attacking an OC $\tilde{f}$ with
$\tilde{N}_{\tilde{f}} > 0$, denoted by $\tilde{U}_{\tilde{f}}(\phi)$ — or, as $\tilde{U}_{\tilde{f}}$ for simplicity, when the underlying $\phi$ is unambiguously
understood — is given by

$$\tilde{U}_{\tilde{f}} = \mathbb{E}[U_f \mid \phi, \tilde{f}] = \sum_{f \in F_{\tilde{f}}} P(f \mid \phi, \tilde{f}) \, U_f = \sum_{f \in F} \frac{\phi_{f,\tilde{f}}}{\tilde{N}_{\tilde{f}}} \, U_f \quad (4.5)$$

(4.5) follows from computing $P(f \mid \phi, \tilde{f})$ using the fact that out of the $\tilde{N}_{\tilde{f}}$ systems having an OC $\tilde{f}$,
$\phi_{f,\tilde{f}}$ have a TC $f$. Since the game is zero-sum, the defender's expected utility is $-\tilde{U}_{\tilde{f}}$ when $\tilde{f}$
is attacked. Note the attacker cannot attack an OC $\tilde{f}$ with $\tilde{N}_{\tilde{f}} = 0$, or equivalently, his expected
utility is $-\infty$ if he does so.
Next, the model is illustrated using a simple example.
Figure 4.2: Simple example of an enterprise network.
Example Game 1: Figure 4.2 shows a simple example enterprise network which will be used as
a running example. We have a set of systems $K = \{k_1, k_2, k_3\}$, set of TCs $F = \{f_1, f_2, f_3\}$ (shown
in Figure 4.2 as the green boxes) and set of OCs $\tilde{F} = \{\tilde{f}_1, \tilde{f}_2\}$ (shown in Figure 4.2 as the yellow
boxes). Let the feasibility constraints be given by the sets $F_{\tilde{f}_1} = \{f_1, f_2\}$ and $F_{\tilde{f}_2} = \{f_2, f_3\}$. The
TCs are as follows:

$f_1$ = [[os] L, [web] T, [ssh] O, [files] S]
$f_2$ = [[os] L, [web] N, [ssh] O, [files] P]
$f_3$ = [[os] W, [web] N, [ssh] O, [files] I]

For the TCs, the utilities are $U_{f_1} = 10$, $U_{f_2} = 0$, and $U_{f_3} = 6$. The OCs are as follows:

$\tilde{f}_1$ = [[os] L, [web] T]
$\tilde{f}_2$ = [[os] W, [web] T]

For simplicity, let all the costs $c(f, \tilde{f})$ be 0, so that there is essentially no budget constraint.
Based on the TCs assigned as shown, the state of the network $(N_f)_{f \in F}$ is $(1,1,1)$. When the
defender assigns OCs as shown in Figure 4.2, her strategy $\phi$ is given by

$$\phi = \begin{array}{c|cc} & \tilde{f}_1 & \tilde{f}_2 \\ \hline f_1 & 1 & 0 \\ f_2 & 1 & 0 \\ f_3 & 0 & 1 \end{array}$$

The expected utility of the adversary (loss of the defender) when he attacks $\tilde{f}_1$ or $\tilde{f}_2$ is respectively
given by $\tilde{U}_{\tilde{f}_1} = (10+0)/2 = 5$ and $\tilde{U}_{\tilde{f}_2} = 6/1 = 6$. Thus, attacking $\tilde{f}_2$ leads to the highest expected
utility for the attacker.
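These two values can be checked directly from Equation (4.5); the short sketch below (illustrative only,
with hypothetical identifiers) recomputes them for the strategy shown in Figure 4.2.

U = {"f1": 10, "f2": 0, "f3": 6}                 # utilities of the TCs
phi = {("f1", "of1"): 1, ("f2", "of1"): 1,       # defender strategy from Figure 4.2
       ("f3", "of2"): 1}                         # (omitted entries are 0)

def expected_utility(oc):
    """Adversary's expected utility (4.5) for attacking observable configuration oc."""
    n_oc = sum(cnt for (f, o), cnt in phi.items() if o == oc)
    return sum(cnt * U[f] for (f, o), cnt in phi.items() if o == oc) / n_oc

for oc in ("of1", "of2"):
    print(oc, expected_utility(oc))   # of1 -> 5.0, of2 -> 6.0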
Adversary Knowledge and Utility Estimation
The attacker's awareness of the deception and the understanding of the defender's strategy may
vary. Note that if the adversary is always able to find the OC with highest expected utility, it is the
worst case scenario for the defender given the game is zero-sum. An attacker who is fully aware
of how the defender sends the false responses to scan requests (via insider threats, information
leakage, etc.) would have such an ability. Formally, a powerful attacker is defined as someone
who knows $F$, $\tilde{F}$, $\pi$, $U$ and $\phi$ and chooses to attack the OC with the (correct) highest expected
utility $\tilde{U}_{\tilde{f}}$ computed through Equation (4.5). If the defender chooses a strategy that minimizes the
expected utility of a powerful attacker, she gets a robust strategy, as the defender can be assured
that no matter the extent of the adversary's knowledge, no strategy he plays can lead to a greater
loss for the defender, in alignment with the minimax principle.

However, the attacker may not be so powerful. On the other end of the spectrum, if the attacker
is unaware of the defender's precise deception scheme or has a very limited understanding
of the situation such that he cannot make any meaningful inference, his decision making would
be completely dependent on the observed configurations of the systems and some fixed preferences
over OCs in terms of the estimated expected utility. Formally, a naive attacker is defined
to be someone who chooses to attack an existing OC $\tilde{f}$ (i.e., one which has at least one system
configured to it) with the highest $\bar{U}_{\tilde{f}}$, where $\bar{U}_{\tilde{f}}$ is not dependent on the defender's strategy and
is known to the defender. This is also equivalent to the case where the attacker just has a fixed
preference over the OCs. CDGs with powerful attackers are analyzed in Section 4.3, and CDGs
with naive attackers in Section 4.4.
4.3 Optimal Defender Strategy against Powerful Adversary
In this section, it is shown how to compute the defender's optimal strategy in a CDG assuming
a powerful adversary. The adversary attacks an OC from the set $\arg\max_{\tilde{f} \in \tilde{F}} \tilde{U}_{\tilde{f}}$ and gets an
expected utility of $\max_{\tilde{f} \in \tilde{F}} \tilde{U}_{\tilde{f}}$, denoted in short as $\tilde{U}(\phi)$; the negative of this value is the defender's
expected utility. Hence, the defender aims to minimize her loss by choosing her $\phi$ from the set
$\arg\min_{\phi \in \Phi} \tilde{U}(\phi)$.
4.3.1 Computational Complexity
The problem of finding the optimal defender strategy against a powerful adversary in a CDG is called
CDG-Robust.
First, it is useful to investigate a special case. The following lemma provides a tight
lower bound on $\min_{\phi \in \Phi} \tilde{U}(\phi)$.
Lemma 4.1 The expected loss of the defender when playing her optimal strategy is no lower
than the average utility of the systems, i.e.,

$$\min_{\phi} \tilde{U}(\phi) \;\geq\; U_{Ave}(K) = \frac{\sum_{f \in F} N_f U_f}{|K|}$$
Proof 4.1 Equivalently, it can be shown that $\tilde{U}(\phi) \geq \frac{\sum_{f \in F} N_f U_f}{|K|}$ for all $\phi$. Fix any $\phi \in \Phi$. We
then have,

$$\begin{aligned}
\tilde{U}(\phi) &\geq \tilde{U}_{\tilde{f}}(\phi) \quad \forall \tilde{f} && \text{(by definition of } \tilde{U}(\phi))\\
\Rightarrow \;\; \sum_{\tilde{f} \in \tilde{F}} \tilde{N}_{\tilde{f}} \, \tilde{U}(\phi) &\geq \sum_{\tilde{f} \in \tilde{F}} \tilde{N}_{\tilde{f}} \, \tilde{U}_{\tilde{f}} && \\
\Rightarrow \;\; |K| \, \tilde{U}(\phi) &\geq \sum_{\tilde{f} \in \tilde{F}} \sum_{f \in F} \phi_{f,\tilde{f}} \, U_f && \text{(using (4.5))}\\
&= \sum_{f \in F} U_f \Big( \sum_{\tilde{f} \in \tilde{F}} \phi_{f,\tilde{f}} \Big) && \text{(re-ordering terms)}\\
&= \sum_{f \in F} U_f \, N_f && \text{(by definition of } \phi, N_f)\\
\Rightarrow \;\; \tilde{U}(\phi) &\geq \frac{\sum_{f \in F} N_f U_f}{|K|} &&
\end{aligned}$$

Since the choice of $\phi$ was arbitrary, the claim follows.
Thus, even when the defender plays her optimal strategy, the attacker's expected utility is
at least $U_{Ave}(K)$. Consequently, if the inequality becomes tight for a strategy $\phi$, it must be an
optimal strategy. It is easy to see that the bound becomes tight if and only if $\tilde{U}(\phi) = \tilde{U}_{\tilde{f}}(\phi), \; \forall \tilde{f}$.
Clearly, this is true if and only if $\tilde{U}_{\tilde{f}}$ is the same for each $\tilde{f}$ set on any system, and trivially so, if
only a single OC is set on all the systems. Thus,

Corollary 4.1 If it is feasible for the defender to set the same OC on all the systems, making
them all indistinguishable to the adversary, doing so is an optimal strategy. Formally, if $\exists \tilde{f}$ s.t.
$\exists \phi^* \in \Phi$ where $\phi^*_{f,\tilde{f}} = N_f \;\forall f$, then $\phi^* \in \arg\min_{\phi \in \Phi} \tilde{U}(\phi)$.

It is possible to efficiently check if such an OC exists, by enumeration. However, it may not
exist, and next it is shown that CDG-Robust is NP-hard in general.
Proposition 4.1 CDG-Robust is NP-hard.

Proof 4.2 The result is proven via a reduction from the Partition problem (PART), which is known
to be NP-complete. Given a multiset $S$ of $n$ positive integers that sum up to $2r$, PART is the
decision problem to determine if $S$ can be partitioned into two subsets $S_1$ and $S_2$ such that the
sums of integers in $S_1$ and $S_2$ are each $r$. It can be reduced to CDG-Robust as follows.

Let the input to PART be a set of integers $S = \{s_1, \ldots, s_n\}$ whose elements sum to $2r$. To
construct a CDG, let the set of TCs be $F = \{f_1, \ldots, f_n\} \cup \{f_{n+1}, f_{n+2}\}$, with utilities $U_{f_i} = s_i$
for each $i \in \{1, \ldots, n\}$ and $U_{f_{n+1}} = U_{f_{n+2}} = -r$. Next, let there be $n+2$ systems, each having a
different TC. Let the set of OCs be $\tilde{F} = \{\tilde{f}_1, \tilde{f}_2\}$, with $\tilde{F}_{f_i} = \tilde{F}$ for each $i \in \{1, \ldots, n\}$, and
$\tilde{F}_{f_{n+1}} = \{\tilde{f}_1\}$, $\tilde{F}_{f_{n+2}} = \{\tilde{f}_2\}$. Let all the costs be 0 so that the budget constraint can be ignored. Assuming
the adversary to be powerful, these components completely define a CDG-Robust problem.

Note that, by Lemma 4.1 and the fact that $\sum_f U_f = 0$, the optimal strategy $\phi$ must have
$\tilde{U}(\phi) \geq 0$. Now, suppose $S$ can be partitioned into subsets $S_1$ and $S_2$ such that the numbers
in each sum to $r$. Then, consider the strategy $\phi$ which masks the TCs in $\{f_i \mid s_i \in S_1\}$ and $f_{n+1}$
with $\tilde{f}_1$, and masks the TCs in $\{f_i \mid s_i \in S_2\}$ and $f_{n+2}$ with $\tilde{f}_2$. It is easy to check that
$\tilde{U}_{\tilde{f}_1}(\phi) = \tilde{U}_{\tilde{f}_2}(\phi) = 0 = \tilde{U}(\phi)$, making $\phi$ an optimal strategy. On the other hand, suppose the defender's
optimal $\phi$ yields $\tilde{U}(\phi) = 0$. Since $\tilde{f}_1$ must mask $f_{n+1}$, and $\tilde{f}_2$ must mask $f_{n+2}$, neither of the
OCs is unused. Since $\tilde{U}(\phi) = 0$, w.l.o.g., assume $\tilde{U}_{\tilde{f}_1} = 0$. Hence, the sum of utilities of the
TCs masked with $\tilde{f}_1$ must be 0. Therefore, the sum of utilities of TCs masked by $\tilde{f}_2$ is also 0.
Then, $S_1 = \{s_i \mid \phi_{f_i,\tilde{f}_1} = 1\}$ and $S_2 = \{s_i \mid \phi_{f_i,\tilde{f}_2} = 1\}$ form a partition of $S$, each having sum of the
elements $r$. It follows that PART should output YES iff CDG-Robust finds an optimal strategy $\phi$
with $\tilde{U}(\phi) = 0$. This reduction, being polynomial-time, proves the claim.
4.3.2 The Defender’s Optimization Problem
The defender's optimal strategy $\phi$ can be computed by solving the optimization problem given
below.

$$\min_{u,\phi} \;\; u \quad (4.6a)$$
$$\text{s.t.} \;\; u \sum_{f \in F} \phi_{f,\tilde{f}} \geq \sum_{f \in F} \phi_{f,\tilde{f}} \, U_f \quad \forall \tilde{f} \in \tilde{F} \quad (4.6b)$$
$$\text{Constraints (4.1)-(4.4)}$$

The objective function in Equation (4.6a) minimizes the utility $u$ the adversary receives for
the game. Equation (4.6b) enforces that the adversary chooses a best response to the defender's
strategy $\phi$, where the expected utility for attacking a given $\tilde{f}$ is given by (4.5). Constraints
(4.1)-(4.4) represent a feasible defender strategy.
This optimization problem is non-convex due to constraint (4.6b), which can be linearized
to convert the optimization problem to a MILP as follows. First, an alternate representation is
devised for the defender's strategy $\phi$, as a $|K| \times |\tilde{F}|$ (0,1)-matrix $\sigma$, where $\sigma_{k,\tilde{f}} = 1$ denotes that system
$k$ is masked with $\tilde{f}$. Further, the TSN $N$ is represented via a vector $x$, where $x_k \in F$ represents
the TC for system $k$. Then, for each TC $f$, we have $N_f = |K_f|$ where $K_f = \{k \in K \mid x_k = f\}$,
and $\phi_{f,\tilde{f}} = \sum_{k \in K_f} \sigma_{k,\tilde{f}} \;\; \forall f, \forall \tilde{f}$. Hence, the alternate representations are indeed equivalent. Then,
constraints equivalent to (4.1)-(4.4) can be easily formulated for $\sigma$ and $x$, with an additional
constraint $\sum_{\tilde{f} \in \tilde{F}} \sigma_{k,\tilde{f}} = 1 \;\; \forall k \in K$ to ensure feasibility. More importantly, equation (4.6b) can be
reformulated as

$$u \sum_{k \in K} \sigma_{k,\tilde{f}} \geq \sum_{k \in K} \sigma_{k,\tilde{f}} \, U_{x_k} \quad \forall \tilde{f} \in \tilde{F} \quad (4.7)$$
The left-hand side of (4.7) can be seen as the sum of a set of terms $u\sigma_{k,\tilde{f}}$, each of which is
the product of the binary variable $\sigma_{k,\tilde{f}}$ and the continuous variable $u$. Such an expression can be
linearized by introducing variables $z_{k,\tilde{f}}$ for each $k \in K$ and $\tilde{f} \in \tilde{F}$, and enforcing $z_{k,\tilde{f}} = u\sigma_{k,\tilde{f}}$.
Consequently, (4.7) can be rewritten as:

$$\sum_{k \in K} z_{k,\tilde{f}} \geq \sum_{k \in K} \sigma_{k,\tilde{f}} \, U_{x_k} \quad (4.8)$$

To enforce $z_{k,\tilde{f}} = u\sigma_{k,\tilde{f}}$, consider $u \in [U_{min}, U_{max}]$ where $U_{min} = \min_{f \in F} U_f$ and $U_{max} =
\max_{f \in F} U_f$. With these bounds on $u$, we then include the constraints for each $z$ variable in the
optimization problem as follows:

$$U_{min} \, \sigma_{k,\tilde{f}} \leq z_{k,\tilde{f}} \leq U_{max} \, \sigma_{k,\tilde{f}} \quad (4.9)$$
$$u - (1-\sigma_{k,\tilde{f}}) U_{max} \leq z_{k,\tilde{f}} \leq u - (1-\sigma_{k,\tilde{f}}) U_{min} \quad (4.10)$$

After this conversion the optimization problem becomes a MILP. The complete formulation is
given below for clarity.
$$\min_{u,\sigma,z} \;\; u \quad (4.11a)$$
$$\text{s.t.} \;\; \sum_{k \in K} z_{k,\tilde{f}} \geq \sum_{k \in K} \sigma_{k,\tilde{f}} \, U_{x_k} \quad \forall \tilde{f} \in \tilde{F} \quad (4.11b)$$
$$\sum_{\tilde{f} \in \tilde{F}} \sigma_{k,\tilde{f}} = 1 \quad \forall k \in K \quad (4.11c)$$
$$\sigma_{k,\tilde{f}} \leq \pi_{x_k,\tilde{f}} \quad \forall k \in K, \forall \tilde{f} \in \tilde{F} \quad (4.11d)$$
$$\sum_{\tilde{f} \in \tilde{F}} \sum_{k \in K} \sigma_{k,\tilde{f}} \, c(x_k, \tilde{f}) \leq B \quad (4.11e)$$
$$U_{min} \, \sigma_{k,\tilde{f}} \leq z_{k,\tilde{f}} \leq U_{max} \, \sigma_{k,\tilde{f}} \quad \forall k \in K, \forall \tilde{f} \in \tilde{F} \quad (4.11f)$$
$$u - (1-\sigma_{k,\tilde{f}}) U_{max} \leq z_{k,\tilde{f}} \quad \forall k \in K, \forall \tilde{f} \in \tilde{F} \quad (4.11g)$$
$$z_{k,\tilde{f}} \leq u - (1-\sigma_{k,\tilde{f}}) U_{min} \quad \forall k \in K, \forall \tilde{f} \in \tilde{F} \quad (4.11h)$$
$$\sigma_{k,\tilde{f}} \in \{0,1\} \quad \forall k \in K, \forall \tilde{f} \in \tilde{F} \quad (4.11i)$$
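To make the reformulation concrete, below is a minimal sketch of MILP (4.11) written for the open-source
PuLP solver and instantiated on the data of Example Game 1; the solver choice, identifiers, and toy
instance encoding are assumptions for illustration and not the implementation used in this dissertation.

import pulp

# Toy instance: the data of Example Game 1 (illustrative encoding).
K = ["k1", "k2", "k3"]                      # systems
TC = {"k1": "f1", "k2": "f2", "k3": "f3"}   # true configuration x_k of each system
U = {"f1": 10.0, "f2": 0.0, "f3": 6.0}      # utility of each TC
OC = ["of1", "of2"]                         # observable configurations
feasible = {("f1", "of1"): 1, ("f1", "of2"): 0,
            ("f2", "of1"): 1, ("f2", "of2"): 1,
            ("f3", "of1"): 0, ("f3", "of2"): 1}
cost = {(f, o): 0.0 for f in U for o in OC} # zero costs, so the budget is slack
B = 100.0

U_min, U_max = min(U.values()), max(U.values())

prob = pulp.LpProblem("CDG_Robust", pulp.LpMinimize)
u = pulp.LpVariable("u", lowBound=U_min, upBound=U_max)
s = pulp.LpVariable.dicts("s", (K, OC), cat="Binary")                  # sigma_{k,of}
z = pulp.LpVariable.dicts("z", (K, OC), lowBound=U_min, upBound=U_max)

prob += u                                                              # objective (4.11a)
for o in OC:                                                           # best response (4.11b)
    prob += pulp.lpSum(z[k][o] for k in K) >= pulp.lpSum(s[k][o] * U[TC[k]] for k in K)
for k in K:
    prob += pulp.lpSum(s[k][o] for o in OC) == 1                       # (4.11c)
    for o in OC:
        prob += s[k][o] <= feasible[(TC[k], o)]                        # (4.11d)
        prob += z[k][o] >= U_min * s[k][o]                             # (4.11f)
        prob += z[k][o] <= U_max * s[k][o]
        prob += z[k][o] >= u - (1 - s[k][o]) * U_max                   # (4.11g)
        prob += z[k][o] <= u - (1 - s[k][o]) * U_min                   # (4.11h)
prob += pulp.lpSum(s[k][o] * cost[(TC[k], o)] for k in K for o in OC) <= B   # (4.11e)

prob.solve()
print("adversary utility:", pulp.value(u))                             # 6.0 for this instance
for k in K:
    print(k, "->", [o for o in OC if pulp.value(s[k][o]) > 0.5])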
4.3.3 MILP Bisection Algorithm
The reformulated MILP presented requires the addition of $|K| \cdot |\tilde{F}|$ variables and $4|K| \cdot |\tilde{F}|$ constraints
to solve for $\phi$. This conversion significantly increases the size of the optimization problem
from the original $|F| \cdot |\tilde{F}|$ decision variables in the original optimization problem and
can create issues when solving larger CDG instances. The second approach developed for CDGs
does not require the reformulation and instead solves a sequence of smaller MILPs (the same size as
(4.6a)) to find an $\epsilon$-approximate solution for the defender [89]. This is done via a bisection
algorithmic framework. The algorithm is initially given an interval that the optimal objective value
$\tilde{U}^*$ lies in, which for CDGs is $\tilde{U}^* \in [U_{LB}, U_{UB}]$ where $U_{LB} = U_{Ave}(K)$ and $U_{UB} = \max_{f \in F} U_f$.
For the algorithm, two variables $l = U_{LB}$ and $u = U_{UB}$ are introduced, with the initial width
$\epsilon_0 = u - l$ containing the optimal value $\tilde{U}^*$ of the optimization problem. The main loop of the
algorithm is repeated until the width $u - l \leq \epsilon$. The main loop has the following two steps:

1. Take $t = (u + l)/2$ and solve the feasibility problem in Equations (4.12a)-(4.12b) to find if there
exists a solution $\phi$ that satisfies the constraints.

2. If feasible, take $u := t$; if not feasible, take $l := t$.

The algorithm is guaranteed to converge, as after each update the interval $[l, u]$ contains the
optimal $\tilde{U}^*$ and the width is halved. The number of steps needed to find the $\epsilon$-approximate
optimal solution is $\lceil \log_2(\epsilon_0/\epsilon) \rceil$.
$$\max_{\phi} \;\; 1 \quad (4.12a)$$
$$\text{s.t.} \;\; t \sum_{f \in F} \phi_{f,\tilde{f}} \geq \sum_{f \in F} \phi_{f,\tilde{f}} \, U_f \quad \forall \tilde{f} \in \tilde{F} \quad (4.12b)$$
$$\text{Constraints (4.1)-(4.4)}$$
While the bisection algorithm may need to solve on the order of a dozen MILPs to arrive
at the approximate solution, as is shown in the experiments it can significantly outperform the
reformulated MILP in computational speed.
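For concreteness, a minimal sketch of the bisection loop is shown below; the feasible_at callback is a
placeholder standing in for solving the feasibility MILP (4.12a)-(4.12b) with an external solver, and all
identifiers are illustrative.

def bisection_solve(u_lb, u_ub, feasible_at, eps=1e-4):
    """Shrink [l, u] around the optimal adversary utility until its width is <= eps."""
    l, u = u_lb, u_ub
    while u - l > eps:
        t = (u + l) / 2.0
        if feasible_at(t):   # is there a feasible strategy with max expected utility <= t?
            u = t
        else:
            l = t
    return u                 # an eps-approximate optimal adversary utility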
4.3.4 Greedy-Minimax Algorithm
Although the optimal $\phi$ can be found via a MILP, it can still be computationally expensive for
large instances. Hence, heuristic algorithms can be preferable as they may be suboptimal but run
fast and perform well on average. In this section, a simple approach is described to sequentially
assign OCs to the systems, by greedily minimizing the attacker's maximum expected utility for the
partially built strategy at each stage. Algorithm 1 gives the pseudo-code.
Algorithm 1: Greedy-Minimax
1:  minIndCost[] $\leftarrow (\min_{\tilde{f}} c(f,\tilde{f}))_{f \in F}$
2:  minTotCost $\leftarrow \sum_f N_f \cdot$ minIndCost[$f$]
3:  initialize minu, $\sigma_{best}$
4:  For iter $= 1 \ldots$ numIter
5:      $K_{list}[] \leftarrow$ shuffle($K$)
6:      initialize remB $\leftarrow B$, reqB $\leftarrow$ minTotCost
7:      initialize $\sigma[]$, $\bar{N}[]$, $\bar{U}[]$
8:      For $i = 1 \ldots |K|$
9:          $k \leftarrow K_{list}[i]$, $f \leftarrow x[k]$
10:         $\sigma[k] \leftarrow$ GMMAssign($f$, $\sigma[]$, $\bar{N}$, $\bar{U}[]$)
11:         $\bar{N}[\sigma[k]] \leftarrow \bar{N}[\sigma[k]] + 1$
12:         update($\bar{U}[\sigma[k]]$)
13:         remB $\leftarrow$ remB $- \, c(f, \sigma[k])$
14:         reqB $\leftarrow$ reqB $-$ minIndCost[$f$]
15:     compute $u = \max_{\tilde{f}} \bar{U}[\tilde{f}]$
16:     update(minu, $u$, $\sigma_{best}$, $\sigma$)
17: return $\sigma_{best}$
18: Procedure GMMAssign($f$, $\sigma[]$, $\bar{N}$, $\bar{U}[]$)
19:     initialize newU[]
20:     For $\tilde{f} \in \tilde{F}_f$
21:         If (reqB $-$ minIndCost[$f$] $+ \, c(f,\tilde{f}) >$ remB) Then
22:             Continue
23:         $\sigma[k] \leftarrow \tilde{f}$
24:         newU[$\tilde{f}$] $\leftarrow \tilde{U}(\sigma)$
25:     $\tilde{F}_{best} \leftarrow \arg\min_{\tilde{f}}$ newU[$\tilde{f}$]
26:     generate $\tilde{f}_{best} \leftarrow$ uniRand($\tilde{F}_{best}$)
27:     return $\tilde{f}_{best}$
Greedy-Minimax starts by computing, for each $f \in F$, the minimum cost of masking $f$ with
any feasible OC, and subsequently, the minimum total cost of masking all the systems (Lines 1-2).
Next, $\sigma_{best}$ and minu are initialized, which respectively denote the final output strategy of the
algorithm and the corresponding utility (Line 3). Subsequently, the algorithm is conducted in a
number of iterations. In each iteration, a random shuffle of the set of systems is obtained, referred
to as $K_{list}$ above. Subsequently, the strategy $\sigma$ which is a candidate solution corresponding to this
shuffle, the corresponding observed state of the network $(\bar{N}_{\tilde{f}})_{\tilde{f} \in \tilde{F}}$, and the corresponding utilities
$(\bar{U}_{\tilde{f}})_{\tilde{f} \in \tilde{F}}$ are all initialized. These are constantly maintained as the algorithm loops through $K_{list}$,
building the solution by assigning an OC to a system one by one (Lines 8-10). The OC to be
assigned for a system is determined via the function GMMAssign(), which is the essence of this
heuristic algorithm. The input to this function is the TC $f$ of the system in question, and the
currently built solution in terms of $\sigma$, $\bar{N}$, $\bar{U}$, remB, reqB. Given these, the function considers the
candidate OCs in $\tilde{F}$ one by one, and refutes those which lead to the violation of the budget constraint
(i.e., make the resultant minimum required budget exceed the resultant remaining budget). For
every other $\tilde{f}$, it computes the resultant $\bar{U}_{\tilde{f}}$ if the system is masked with $\tilde{f}$, and stores it in the array
newU (Lines 19, 23-24). Finally, based on these, it uniformly randomly chooses an OC from
those which minimize the resultant utility newU() (Lines 25, 26). Once GMMAssign() returns
an OC $\tilde{f}$, it is assigned to the system in question, $\bar{N}_{\tilde{f}}$, $\bar{U}_{\tilde{f}}$ are updated accordingly, as well as the
remaining budget and the minimum required budget (Lines 11-14). Once the loop through $K_{list}$ is over
and the full strategy $\sigma$ is built, its utility $u$ is computed and compared with minu, to update
minu and $\sigma_{best}$ appropriately (Lines 15-16).
It is possible to conceive examples where this heuristic approach does not yield a good solution
on an arbitrary shuffle, even for problem instances with small parameters. Such an example
with 4 systems, 4 TCs and 2 OCs is discussed next. Further, there are examples where the solution
value is $\Theta(|K|)$ times as bad as the optimal, on exponentially many shuffles. This motivates
getting candidate solutions for a large number of shuffles and choosing the best among them,
as described above. Since the greedy choice does not guarantee optimality, Soft-GMM is also
proposed, a slight modification of GMM which makes assignments probabilistically, and not deterministically.
It works exactly as GMM, except Lines 25, 26 — it draws $\tilde{f}_{best}$ from a distribution
$P(\tilde{F})$ where $P(\tilde{f}) \propto \exp(-\lambda \, \text{newU}[\tilde{f}])$.
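As a sketch, the soft-min choice could be implemented as follows, assuming the sampling rule
$P(\tilde{f}) \propto \exp(-\lambda \cdot \text{newU}[\tilde{f}])$ over the candidate OCs that fit the budget; the function and variable
names are illustrative.

import math
import random

def soft_min_choice(new_u, lam):
    """new_u: dict mapping each candidate OC to the resulting worst-case utility."""
    ocs = list(new_u)
    # subtract the minimum for numerical stability before exponentiating
    base = min(new_u.values())
    weights = [math.exp(-lam * (new_u[o] - base)) for o in ocs]
    r = random.random() * sum(weights)
    for o, w in zip(ocs, weights):
        r -= w
        if r <= 0:
            return o
    return ocs[-1]

As lam grows, the choice concentrates on the utility-minimizing OC (recovering the hard greedy rule),
while small lam approaches a uniformly random assignment.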
Note that the adversary's utility $\tilde{U}(\phi)$ for any strategy $\phi$ can be at most $|K|$ times the optimal
value $\min_{\phi} \tilde{U}(\phi)$. This follows from observing that for any strategy $\phi$, we have
$\tilde{U}_{\tilde{f}} \leq \max_{f \mid N_f > 0} U_f \;\; \forall \tilde{f}$
by definition, and thus, $\tilde{U}(\phi) \leq \max_{f \mid N_f > 0} U_f$, whereas $\min_{\phi} \tilde{U}(\phi)$ is at least the average of all the
system utilities by Lemma 4.1. Since any choice a greedy heuristic makes can be potentially
suboptimal, one may intuitively expect its performance to be worse for a higher number of choices
to be made, that is, for larger sized inputs, and relatively better for smaller inputs. However, an
example instance is shown next of a CDG where, despite the input size $(|F|, |K|, |\tilde{F}|)$ being very
small, the (hard-)GMM algorithm in a particular iteration (i.e., when conducted on a particular
shuffle of the systems) gives a highly suboptimal solution.
Consider the set of systems $K = \{k_1, k_2, k_3, k_4\}$, the set of TCs $F = \{f_1, f_2, f_3, f_4\}$ and the set
of OCs $\tilde{F} = \{\tilde{f}_1, \tilde{f}_2\}$. Let the feasibility constraints be given via the sets $F_{\tilde{f}_1} = \{f_1, f_2, f_3\}$ and
$F_{\tilde{f}_2} = \{f_2, f_3, f_4\}$. Let each system $k_i$ have the TC $f_i$, so that the TSN $(N_f)_{f \in F}$ is $(1,1,1,1)$. For
the TCs, let the utilities be $U_{f_1} = 1$, $U_{f_2} = 2$, $U_{f_3} = 30$, and $U_{f_4} = 40$. For simplicity, let all the
costs $c(f, \tilde{f})$ be 0, so that there is essentially no budget constraint.
Consider the ordering of the systems on which GMM is performed to be $\{k_1, k_2, k_3, k_4\}$.
Then, the strategy $\sigma$ computed by GMM on this ordering is as follows:

$$\sigma = \begin{array}{c|cc} & \tilde{f}_1 & \tilde{f}_2 \\ \hline k_1 & 1 & 0 \\ k_2 & 1 & 0 \\ k_3 & 1 & 0 \\ k_4 & 0 & 1 \end{array}$$

Accordingly, the expected utilities of the OCs are $\tilde{U}_{\tilde{f}_1} = (1+2+30)/3 = 11$ and $\tilde{U}_{\tilde{f}_2} = 40/1 = 40$,
and thus, the adversary's utility is 40 for this strategy. The optimal solution, however, masks $k_1$, $k_3$
with $\tilde{f}_1$ and $k_2$, $k_4$ with $\tilde{f}_2$, giving the expected utilities of the OCs $\tilde{U}_{\tilde{f}_1} = (1+30)/2 = 15.5$ and
$\tilde{U}_{\tilde{f}_2} = (2+40)/2 = 21$, thus, the optimal being just 21.
Consider the CDG instance with the set of systems $K = \{k_1, \ldots, k_m\}$, so that $|K| = m$. Let
the set of TCs be $F = \{f_1, f_2, f_3\}$ and the set of OCs $\tilde{F} = \{\tilde{f}_1, \tilde{f}_2\}$. Let the true state of the network
be $x = (f_1, f_2, f_3, \ldots, f_3)$, i.e., system $k_1$ has TC $f_1$, system $k_2$ has TC $f_2$, and the remaining
systems $k_3, \ldots, k_m$ have TC $f_3$. Let the feasibility constraints be given by the sets $F_{\tilde{f}_1} = \{f_1, f_3\}$ and
$F_{\tilde{f}_2} = \{f_2, f_3\}$. For the TCs, the utilities are $U_{f_1} = 1$, $U_{f_2} = 2000$, and $U_{f_3} = \epsilon$. For simplicity, let all the
costs $c(f, \tilde{f})$ be 0, so that there is essentially no budget constraint.

The optimal solution to this CDG is to assign systems $k_2, \ldots, k_m$ to be masked by $\tilde{f}_2$, with
$k_1$ being masked with $\tilde{f}_1$. This gives the following expected utilities: $\tilde{U}_{\tilde{f}_1} = 1/1 = 1$ and
$\tilde{U}_{\tilde{f}_2} = \frac{2000 + (m-2)\epsilon}{m-1} = \frac{2000}{m-1} + \frac{(m-2)\epsilon}{m-1}$.
Consider any shuffle which orders the systems such that $k_1$ is first
and $k_2$ is last (of which there are $(m-2)!$). Given any ordering of this type, GMM assigns
systems $k_3, \ldots, k_m$ to be masked with $\tilde{f}_1$ and would assign $k_2$ to be masked with $\tilde{f}_2$. The
expected utilities given this assignment are the following: $\tilde{U}_{\tilde{f}_1} = \frac{1 + (m-2)\epsilon}{m-1} = \frac{1}{m-1} + \frac{(m-2)\epsilon}{m-1}$
and $\tilde{U}_{\tilde{f}_2} = 2000/1 = 2000$. The loss in this case is $2000 \big/ \frac{2000}{m-1} = m-1$ (ignoring the $\epsilon$ terms),
which is a $\Theta(|K|)$ loss.
4.3.5 Solving for an Optimal Marginal Assignment n
The prior analysis focuses on finding the optimal pure strategy $\phi$ for the defender to commit to in
the game. This is due to the assumption that adversaries view a fixed (static) version of the network
when completing reconnaissance. However, it can also be useful to find the optimal mixed
strategy $q$ for the defender in the game. Formally, a mixed strategy is defined as a probability
distribution over all possible defender pure strategies $\phi \in \Phi$, where $\sum_{\phi \in \Phi} q_{\phi} = 1$ and $0 \leq q_{\phi} \leq 1$.
For this game, enumerating the set of pure strategies is infeasible, but it is possible to find the
defender's optimal marginal strategy $n = \sum_{\phi \in \Phi} q_{\phi} \phi$ due to the compact representation of the defender's
strategy space. The optimal marginal strategy can be found using the same optimization as (4.6a),
replacing all instances of $\phi_{f,\tilde{f}}$ with $n_{f,\tilde{f}}$. The optimization problem for finding the defender's
optimal marginal strategy can be seen as a generalized linear fractional program.
As in Section 4.3.3, generalized linear fractional programs are solved efficiently using a bisection
algorithmic approach which solves a sequence of linear programming feasibility problems
to get an $\epsilon$-approximate optimal solution [10]. Similarly to the MILP bisection algorithm, this
algorithm is given an interval that $U^*$ lies in, which is $U^* \in [U_{LB}, U_{UB}]$. The variables $l = U_{LB}$
and $u = U_{UB}$ are introduced, with the initial width $\epsilon_0 = u - l$ containing the optimal value $U^*$
of the optimization problem. The main loop is repeated until the width $u - l \leq \epsilon$. The main loop
has the following two steps:

1. Take $t = (u + l)/2$ and solve the feasibility problem in Equations (4.13)-(4.18) to find if
there exists a solution $n$ that satisfies the constraints.

2. If feasible, take $u := t$; if not feasible, take $l := t$.

The algorithm is guaranteed to converge, as after each update the interval $[l, u]$ contains the
optimal $U^*$ and the width is halved. The number of steps needed to find the $\epsilon$-approximate
optimal solution is $\lceil \log_2(\epsilon_0/\epsilon) \rceil$.
$$\max_{n} \;\; 1 \quad (4.13)$$
$$\text{s.t.} \;\; t \sum_{f \in F} n_{f,\tilde{f}} \geq \sum_{f \in F} n_{f,\tilde{f}} \, U_f \quad \forall \tilde{f} \in \tilde{F} \quad (4.14)$$
$$\sum_{\tilde{f} \in \tilde{F}} n_{f,\tilde{f}} = N_f \quad \forall f \in F \quad (4.15)$$
$$\sum_{\tilde{f} \in \tilde{F}} \sum_{f \in F} n_{f,\tilde{f}} \, c(f, \tilde{f}) \leq B \quad (4.16)$$
$$n_{f,\tilde{f}} \leq \pi_{f,\tilde{f}} \, N_f \quad \forall f \in F, \forall \tilde{f} \in \tilde{F} \quad (4.17)$$
$$n_{f,\tilde{f}} \geq 0 \quad \forall f \in F, \forall \tilde{f} \in \tilde{F} \quad (4.18)$$
4.4 Optimal Defender Strategy against Naive Adversary
The robust approach to solving CDGs, i.e., assuming a powerful adversary with knowledge of
$\phi$, can cause the defender to not fully realize the benefit of her informational advantage when
faced with a less powerful attacker. In particular, the adversary may value OCs in a fixed manner
that is known to the defender.$^2$ In this case, the values $\bar{U}_{\tilde{f}}$ are fixed and the defender's strategy
does not affect the adversary's expected utility for attacking some $\tilde{f}$. Importantly, if there is no
budget constraint, one can solve for the defender's optimal strategy $\phi$ in polynomial time using
Algorithm 2. W.l.o.g. it is assumed the adversary has a strict preference ordering over $\tilde{F}$, as if
$\bar{U}_{\tilde{f}}$ is equal for any two OCs, the sets could be merged from the defender's perspective, with the
feasibility constraints and costs adjusted accordingly.
Algorithm 2 begins by initializing $\phi$, $\Gamma^*$ (which stores the TCs the adversary attacks) and $\tilde{f}^*$
(the OC the adversary attacks given $\phi$). In Line 3 the array minUtil[] is computed, which
stores the lowest utility achievable for each TC, i.e., $\min_{\tilde{f} \in \tilde{F}_f} \bar{U}_{\tilde{f}}$. The For loop in Line 4
iterates over all $\tilde{f} \in \tilde{F}$, which is sorted descending by $\bar{U}_{\tilde{f}}$ (Line 2), and determines for each $\tilde{f}$ the
best set of TCs to mask if $\tilde{f}$ is attacked by the adversary in Lines 5 through 13. To do this, $F$
is split into 4 separate sets $P_1$, $P_2$, $P_3$ and $P_4$, and the set of TCs to be masked with $\tilde{f}_i$ is stored in
$\Gamma'$. Note that for each $f$, $N_f$ copies are enumerated by the algorithm. $P_1$ contains all TCs which
cannot be masked with an $\tilde{f}$ that has $\bar{U}_{\tilde{f}} < \bar{U}_{\tilde{f}_i}$. Intuitively, if this set is non-empty it means the
defender is not able to devise a strategy $\phi$ such that the adversary prefers to attack $\tilde{f}_i$, and hence,
all subsequent $\tilde{f}_i$ will never be preferred by the adversary. $P_2$ ($P_4$) contains TCs $f$ which must be
masked (cannot be masked) with $\tilde{f}_i$. $P_3$ then contains all TCs $f$ which can be masked with $\tilde{f}_i$
but may also be masked with another OC $\tilde{f}_j \neq \tilde{f}_i$. The function update($\Gamma'$, $P_3$) sorts the TCs in
ascending order of utility and iterates over the TCs $f \in P_3$, masking a TC $f$ with $\tilde{f}_i$ iff $U_f \leq EU(\Gamma')$.
In Line 14, update($\Gamma^*$, $\Gamma'$, $\tilde{f}^*$, $\tilde{f}_i$) sets $\Gamma^* = \Gamma'$ and $\tilde{f}^* = \tilde{f}_i$ if $EU(\Gamma') < EU(\Gamma^*)$. Finally, the
function update($\phi$, $\Gamma^*$, $\tilde{f}^*$) in Line 15 determines the OCs $\tilde{f}'$ for all $f \notin \Gamma^*$ given $\bar{U}_{\tilde{f}'} < \bar{U}_{\tilde{f}^*}$, and
the strategy $\phi$ is returned.
$^2$ As an example, the adversary could estimate his utility according to values derived from the NIST National
Vulnerability Database [59].
Algorithm 2: Compute defender's optimal $\phi$ with fixed $\bar{U}_{\tilde{f}}$.
1:  initialize $\phi$, $\Gamma^*$, $\tilde{f}^*$
2:  sort($\tilde{F}$) // descending by utility $\bar{U}_{\tilde{f}}$
3:  minUtil[] $:= (\min_{\tilde{f} \in \tilde{F}_f} \bar{U}_{\tilde{f}})_{f \in F}$
4:  For $i = 1, \ldots, |\tilde{F}|$
5:      initialize $\Gamma'$
6:      $P_1 := \{f \mid \text{minUtil}[f] > \bar{U}_{\tilde{f}_i}\}$
7:      If $P_1 \neq \emptyset$
8:          break
9:      $P_2 := \{f \mid \text{minUtil}[f] = \bar{U}_{\tilde{f}_i}\}$
10:     $P_3 := \{f \mid \text{minUtil}[f] < \bar{U}_{\tilde{f}_i}$ and $\tilde{f}_i \in \tilde{F}_f\}$
11:     $P_4 := \{f \mid \text{minUtil}[f] < \bar{U}_{\tilde{f}_i}$ and $\tilde{f}_i \notin \tilde{F}_f\}$
12:     $\Gamma' := P_2$
13:     update($\Gamma'$, $P_3$)
14:     update($\Gamma^*$, $\Gamma'$, $\tilde{f}^*$, $\tilde{f}_i$)
15: update($\phi$, $\Gamma^*$, $\tilde{f}^*$)
16: return $\phi$
Proposition 4.2 Given fixed utilities $\bar{U}_{\tilde{f}}$ and no budget constraint, Algorithm 2 computes the
optimal strategy $\phi$ in $O(|F| \cdot |\tilde{F}|)$ time.
Proof 4.3 First, I show that for each $\tilde{f} \in \tilde{F}$, Lines 5 through 13 in Algorithm 2 compute the set
$\Gamma'$ with the minimum average value. To see this, note that all TCs $f \in P_2$ must be in $\Gamma'$, while all
TCs $f \in P_4$ cannot be included. In update($\Gamma'$, $P_3$) (note $P_3$ is given in sorted order) the defender
decides for each $f \in P_3$ to include the $N_f$ TCs in $\Gamma'$ iff $U_f \leq EU(\Gamma')$. At the end of this update,
it follows that $\Gamma'$ must be the minimum average set for $\tilde{f}_i$. Given that the for loop in Line 4 iterates
through all $\tilde{f} \in \tilde{F}$, it must be the case that the optimal $\Gamma^*$ is returned for some $\tilde{f}$.

In Line 2, sorting $\tilde{F}$ takes $O(|\tilde{F}| \log |\tilde{F}|)$ time and calculating minUtil[] takes $O(|F| \cdot |\tilde{F}|)$ time.
For each iteration of the for loop in Line 4, it takes $O(|F|)$ time to split $F$ into the four sets
$P_1$, $P_2$, $P_3$ and $P_4$. It takes the function update($\Gamma'$, $P_3$) at most $|F|$ operations to update $\Gamma'$, while
update($\Gamma^*$, $\Gamma'$, $\tilde{f}^*$, $\tilde{f}_i$) takes $O(1)$ time. Hence, each iteration takes $O(|F|)$ time and hence,
the for loop takes $O(|F| \cdot |\tilde{F}|)$ time. Lastly, update($\phi$, $\Gamma^*$, $\tilde{f}^*$) takes at most $O(|F| \cdot |\tilde{F}|)$ time to return the
defender's strategy $\phi$, as it must find an OC $\tilde{f}_j$ for each $f \notin \Gamma^*$ with $\bar{U}_{\tilde{f}_j} < \bar{U}_{\tilde{f}^*}$.
It is possible to efficiently compute the defender’s optimal strategy when there is no budget
constraint. When the defender has a budget constraint, however, the question arises if her optimal
strategy can be found efficiently as well. This problem is called CDG-Fixed and next it is shown
to be NP-Hard.
Proposition 4.3 CDG-Fixed is NP-hard.
Proof 4.4 The proposition is proved via a reduction from the 0-1 Knapsack problem (0-1 KP),
which is a classical NP-hard problem. Given a budget $B$ and a set of $m$ items, each with a weight $w_i$
and value $v_i$, 0-1 KP is the optimization problem of finding the subset of items $Y$ which maximizes
$\sum_{i \in Y} v_i$ subject to the budget constraint $\sum_{i \in Y} w_i \leq B$. Now I show that 0-1 KP can be reduced to
CDG-Fixed. For convenience, $[m]$ is used to denote the set $\{1, 2, \ldots, m\}$ and $S = \sum_{i \in [m]} v_i$ denotes
the sum of all values.

Given a 0-1 KP instance as described above, construct a CDG instance as follows. Let the set
of TCs be $F = \{f_1, \ldots, f_m\} \cup \{f_{m+1}\}$, with utilities $U_{f_i} = v_i, \forall i \in [m]$ and $U_{f_{m+1}} = -V$ for some
fixed constant $V$. Note $N_f = 1 \;\forall f \in F$. Let the set of OCs be $\tilde{F} = \{\tilde{f}_1, \tilde{f}_2\}$, with $\tilde{F}_{f_i} = \tilde{F} \;\forall i \in [m]$
and $\tilde{F}_{f_{m+1}} = \{\tilde{f}_1\}$. Set the costs as $c(f_i, \tilde{f}_1) = 0$, $c(f_i, \tilde{f}_2) = w_i$ for all $i \in [m]$, and $c(f_{m+1}, \tilde{f}_1) = 0$.
Set $\bar{U}_{\tilde{f}_1} > \bar{U}_{\tilde{f}_2}$. Assuming a naive adversary, these components completely define a CDG-Fixed
problem. Since $f_{m+1}$ is bound to be masked by $\tilde{f}_1$, and $\bar{U}_{\tilde{f}_1} > \bar{U}_{\tilde{f}_2}$, attacking $\tilde{f}_1$ is a dominant
strategy for the adversary.

Observe that $\sum_{f \in F} U_f$ is $\sum_{i \in [m]} v_i - V = S - V$. It is claimed that the optimal objective of the
0-1 KP instance is greater than $S - V$ if and only if the optimal adversary utility in the constructed
CDG-Fixed problem, i.e., $\tilde{U}(\phi^*)$, is negative. The $\Leftarrow$ direction is proven next, as the $\Rightarrow$ direction has a
similar proof. Let $\phi^*$ be the optimal solution to the CDG-Fixed problem. By definition, the set
$Y = \{i : \phi^*_{f_i, \tilde{f}_2} = 1\}$ is a feasible solution to the 0-1 KP, since the cost of mapping $f_i$ to $\tilde{f}_2$ is $w_i$. The
sum of the utilities of all systems is $S - V$, whereas $\tilde{U}(\phi^*) < 0$ means the total utility of systems
mapped to $\tilde{f}_1$ is less than 0; this implies that the total utility of systems mapped to $\tilde{f}_2$ is greater than
$S - V$. Note each system mapped to $\tilde{f}_2$ corresponds to an item and hence, the optimal objective
of the 0-1 KP is also greater than $S - V$.

The above claim shows that for any constant $V$, one can check whether the optimal objective
of the 0-1 KP is greater than $S - V$ by solving a CDG-Fixed instance. Using this procedure as a
black-box, a binary search can be performed to find the exact optimal objective of the 0-1 KP with
integer values within $O(\text{poly}(\log(S)))$ steps (both $S$ and the weights are machine numbers with input
size $O(\log(S))$). As a result, a polynomial-time reduction has been constructed from computing
the optimal objective of any given 0-1 KP to solving the CDG-Fixed problem. This implies the
NP-hardness of the CDG-Fixed problem.
CDG-Fixed can be solved with Algorithm 2 via a modification to the function update($\Gamma'$, $P_3$)
in Line 13. Given $\Gamma'$, one computes the minimum budget $B'$ required to mask all TCs $f \in \Gamma'$ with
$\tilde{f}_i$ and to mask all TCs $f \in P_3$ and $f \in P_4$ with an OC $\tilde{f}_j$ such that $\bar{U}_{\tilde{f}_j} < \bar{U}_{\tilde{f}_i}$. If $\Gamma' = \emptyset$,
then for $f \in P_3$, mask $f$ with $\tilde{f}_i$ if $c(f, \tilde{f}_i) < B'$. Assuming $P_3$ is sorted ascending, once the defender
assigns $\tilde{f}_i$ to a TC $f$ she is done. If $\Gamma' \neq \emptyset$, the defender must solve multiple MILPs, with
$n = n_{\Gamma'}, \ldots, |K|$, to find the best $\Gamma'$. Denote $u_{\Gamma'} = EU(\Gamma')$.
$$\min_{\phi} \;\; n_{\Gamma'} u_{\Gamma'} + \sum_{f} \phi_{f,\tilde{f}_i} \, U_f \quad (4.19a)$$
$$\text{s.t.} \;\; \sum_{f} \phi_{f,\tilde{f}_i} = n - n_{\Gamma'} \quad (4.19b)$$
$$\text{Constraints (4.1)-(4.4)}$$
4.5 Experiments
The CDG model and solution techniques are evaluated using synthetically generated game instances.
The game payoffs are set to be zero-sum, and for each TC, the payoffs $U_f$ are uniformly
distributed in $[1, 10]$. Each OC $\tilde{f}$ is randomly assigned a set of TCs it can mask, while ensuring
each TC can be masked with at least one OC. To generate the TSN, each system is assigned a TC
uniformly at random. The costs $c(f, \tilde{f})$ are uniformly distributed in $[1, 100]$, with
the budget $B$ uniformly distributed between the minimum cost assignment and the maximum cost
assignment. All experiments are averaged over 30 randomly generated game instances.
4.5.1 Powerful Adversary - Scalability and Solution Quality Loss
When solving for the defender's optimal strategy $\phi$ for enterprise networks, it is important
to have solution techniques which can scale to large instances of CDGs. The first experiment
compares the scalability of the reformulated MILP, the bisection algorithm and the Greedy
Minimax (GMM) algorithm with 1000 random shuffles, along with the solution quality of the
approaches.
(Figure 4.3(a) plots runtime in seconds against the number of systems; Figure 4.3(b) plots adversary utility against the number of systems.)
Figure 4.3: Runtime Comparison and Solution Quality Comparison (20 Observables) - Reformulated MILP (MILP), the bisection algorithm with $\epsilon = 0.0001$ (Bisection) and Greedy MaxiMin
(GMM) with 1000 random shuffles.
# OCs 2 4 6 8 10
10 systems 0 0.092% 0.015% 0.028% 0.512%
Optimal Instances 30 29 29 29 25
20 systems 0 0.028% 0.615% 1.91% 3.18%
Optimal Instances 30 28 17 12 9
Table 4.1: Solution Quality % loss and number of optimal instances for GMM versus MILP.
In Figure 4.3(a) the runtime results are shown, with the runtime in seconds on the
y-axis and the number of systems varied on the x-axis. As can be seen, the runtime for solving the
reformulated MILP increases dramatically as the number of systems increases, while both GMM
and the bisection algorithm finish in under 10 seconds in all cases. The results from the bisection
algorithm compared to the reformulated MILP are quite surprising, given it provides the $\epsilon$-optimal
solution, and highlight the benefit of solving smaller MILPs for larger CDG instances.
While GMM is much faster than the reformulated MILP (but comparable to the bisection
algorithm), it is not guaranteed to provide the optimal solution or an $\epsilon$-approximate solution.
However, the experimental results show that empirically the solution quality loss is very small.
In Figure 4.3(b) the solution quality of the MILP is compared to GMM, where the attacker's
utility is given on the y-axis and the number of systems is varied on the x-axis. Importantly, GMM
(Figure 4.4 plots adversary utility against the number of shuffles for GMM-H, GMM-0.01, GMM-1, and GMM-10.)
Figure 4.4: Solution Quality Comparison (20 systems and 20 OCs) - Comparison of Hard-GMM
(GMM-H) and Soft-GMM (GMM-$\lambda$) varying the number of shuffles.
shows a low solution quality loss for the defender compared to the MILP, with a minimum loss of
1.68% for 12 systems and a maximum loss of 5.93% for 16 systems. This experiment highlights
the scalability of GMM and shows the loss in solution quality from GMM can give a reasonable
trade-off between computational efficiency and solution quality.

An interesting feature of GMM is how often it returns the optimal solution for the defender as
the CDG game size changes. Table 4.1 compares the solution quality of GMM (with 1000 random
shuffles) versus the MILP for several game sizes with 10 and 20 systems, where the number of
OCs is varied from 2 to 10. Interestingly, for CDGs with 10 systems, Hard-GMM is able to
find the optimal solution in a vast majority of instances (142 out of 150 instances). However,
for CDGs with 20 systems, GMM fails to recover the optimal solution in about a third of the
instances (it is optimal in 96 out of 150). Nevertheless, the loss of solution quality still remains low (3.18%)
even when GMM returns the optimal solution only about a third of the time.
The solution quality of a variation of GMM, called Soft-GMM or GMM-$\lambda$, was tested as
well. Instead of greedily choosing the OC with minimax expected utility at each stage, a soft-min
function [19] is applied with parameter $\lambda$ controlling the greediness of the next choice. Figure 4.4
shows the solution quality of GMM (denoted as GMM-H) and GMM-$\lambda$ with varying $\lambda$
values. GMM-0.01 is very close to randomly choosing OCs for the systems and performs poorly
compared to larger $\lambda$ values, indicating that GMM is an effective heuristic and performs much
better than random assignment. Importantly, the randomness in GMM-$\lambda$ leads to the potential of
finding better strategies than GMM, since Hard-GMM is restricted to a limited strategy space and
GMM-$\lambda$ is not. This can be seen by comparing the results for Hard-GMM and GMM-10, where
the latter outperforms the solution quality achieved with Hard-GMM at 8000 and 16000 shuffles.
Further investigation is deferred to future work.
4.5.2 Comparing Solutions for Different Types of Adversaries
The last experiment compares how the optimal strategies for the two adversary models (powerful
versus naive) perform in the opposite case. Figure 4.5(a) compares the solution quality of the
MILP in Section 4.3 to Algorithm 2 when the adversary is assumed to know $\phi$, with the attacker's
utility on the y-axis and the number of systems varied on the x-axis. This figure highlights that
for the powerful adversary the MILP performs significantly better than Algorithm 2 (except for 5
systems) and shows the risk of underestimating the adversary's information when devising the defender's
strategy $\phi$. In Figure 4.5(b) the solution quality of Algorithm 2 is compared to the MILP
when the adversary is assumed to have fixed utilities. As the figure shows, the improvement in
utility is dramatically higher for Algorithm 2 compared to the MILP. The reason for this difference
lies in Algorithm 2 leveraging the adversary's fixed preferences over OCs and minimizing the
value of systems masked with the OC the adversary will attack. The MILP, however, minimizes
the worst case utility given the adversary may attack any OC and hence, fails to leverage the
defender’s advantage to a high benefit.
(Figures 4.5(a) and 4.5(b) plot adversary utility against the number of systems for the "Powerful" and "Naive" defender strategies.)
Figure 4.5: Solution Quality Comparison (10 OCs) - In (a) the solution quality of the two types
of defender strategies is shown against a powerful adversary. In (b) the solution quality of the
strategies is shown against a naive adversary.
4.6 Real World Applicability
In this section, I will highlight several recent works that have provided technical solutions to
achieving the deception outlined in this chapter. This section is not meant to be a complete refer-
ence of all technical deceptive techniques available, but rather, a catalog of applicable deceptive
methods for a network administrator using the Cyber Deception Game model. There are three
main areas of active research in developing deceptive techniques to thwart adversarial network
reconnaissance: (1) Operating System and application/service obfuscation against fingerprinting,
(2) deceptive network topology alteration and (3) honeypots and honey-* defenses.
4.6.1 OS and Application Fingerprinting & Obfuscation
One of the first pieces of information an adversary must gain when completing reconnaissance on
a defender’s network is determining the operating systems deployed on systems in the network.
This important first step provides crucial information about the types of exploits available to be
used and potentially the difficulty in attacking the defender’s network. Operating System (OS)
fingerprinting is an area of significant interest as it provides the tools necessary for an adversary to
determine the OSs that systems across the enterprise network are running. OS fingerprinting can
be accomplished in two ways: (i) active fingerprinting, which involves sending carefully crafted
packets to the target system and analyzing the results, and (ii) passive fingerprinting, which sniffs
and analyzes network traffic traveling between systems.
Active fingerprinting techniques are generally more sophisticated than passive fingerprinting.
In some cases, an adversary can disregard stealthier approaches and simply attempt to connect to
the host system to learn about the OS of the system by establishing a connection via the Telnet
or SSH protocol which sends the OS version as part of the welcome message. For network recon
tools, active fingerprinting techniques trigger a target system to send a series of responses that
are then analyzed by the adversary’s network tools to determine the type and version of the OS
a system is running. The ICMP, TCP and UDP packets sent to a system are specially crafted to
observe how the system responds to both valid and invalid packets. For example, some features
of TCP probes received from a system can distinguish between different operating systems
(e.g., order of the TCP options, the TCP sequence number).
Passive fingerprinting consists of using a packet sniffer that passively collects and analyzes
packets traveling between systems in a network. One simple method of passive fingerprinting
uses the Time To Live (TTL) field in the IP header and the TCP Window Size of the SYN or
SYN+ACK packet in a TCP session to determine the OS of a system. This is due to the values for
TTL and the TCP Window Size depending on the OS implementation as the RFC specifications
only define intervals of values and recommended values, and do not mandate specific values to
be used.
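As a toy illustration of this idea (not a technique from the dissertation), the sketch below guesses an OS
family from an observed TTL, assuming the common default initial TTL values of 64 for Linux/Unix-like
systems, 128 for Windows, and 255 for many network appliances; real passive fingerprinting tools such as
p0f combine many more header fields, including the TCP window size.

def guess_initial_ttl(observed_ttl):
    """Round the observed TTL up to the nearest common initial value (64, 128, 255)."""
    for initial in (64, 128, 255):
        if observed_ttl <= initial:
            return initial
    return None

def guess_os_family(observed_ttl):
    """Very rough OS guess from a sniffed packet's TTL; illustrative values only."""
    initial = guess_initial_ttl(observed_ttl)
    if initial == 64:
        return "Linux/Unix-like"
    if initial == 128:
        return "Windows"
    if initial == 255:
        return "Network appliance / legacy Unix"
    return "unknown"

# A SYN observed with TTL 57 most likely started at 64, i.e., 7 hops away.
print(guess_os_family(57))   # -> Linux/Unix-like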
There are numerous tools available for fingerprinting today. Due to the variety of the network
mapping tools, no single deception approach can be used to defeat all of them, but with a com-
bination of approaches it is possible to significantly increase the difficulty in the reconnaissance
efforts of an adversary. Below is a list of some of the network scanning tools available to give
the reader a small overview of this area. A more in-depth and exceptional technical breakdown
of network scanning tools and their specifications can be found in [1].
1. Nmap [53]: Nmap is a network security scanner used to discover hosts and services on
a computer network that builds a “map” of the network. It works as an active tool by
sending specially designed packets to a host which it then analyzes to identify the OS and
applications running on different ports.
2. SinFP3: SinFP was developed in order to complete OS fingerprinting of a host under the
worst-case network conditions [9]. This includes a remote host having only one port or the
traffic to all other TCP and UDP ports is dropped by filtering services. Once the response
packets have been received SinFP uses a matching algorithm to determine the OS of a
particular system.
3. Xprobe: Xprobe is an active OS fingerprinting tool that uses fuzzy signature matching,
probabilistic guesses and a signature database to determine the OS for a given host [86].
Many of the techniques in Xprobe have been built into nmap over time.
4. p0f3: p0f v3 [90] is a tool that utilizes passive traffic analysis to fingerprint hosts behind
any TCP/IP communications in a network without interfering in any way. The techniques
it uses for the fingerprinting are sophisticated and this tool can be used when Nmap probes
may be blocked or the adversary wants to be stealthy.
5. amap: amap [67] is an application mapper tool which determines the applications a host on a
network is running and the specific versions of those applications by interrogating network
services and sockets.
6. Nessus: Nessus [17] provides a wide-variety of network scanning tools in order to map
out a network. It is extremely useful as a vulnerability scanner by looking for types of
vulnerabilities such as misconfigurations, default passwords, and denials of service against the
TCP/IP stack using malformed packets. Additionally, it uses many of the other network
and application fingerprinting tools in conjunction to provide results to a Nessus user.
For the fingerprinting of applications running on a port of a host system, a typical approach
relies on retrieving the service banner to gather information on the application and version along
with using the port number (e.g., webservers typically have HTTP on port 80). In this regard,
approaches to combat application fingerprinting rely on altering the service banner sent back in
a packet for a given probe. There are a few potential issues from altering the banner of a service
which make this approach inapplicable to some services. For example, services that use the banner
information during the connection process (like SSH) require a non-transparent approach.
(Diagram: an outgoing packet's IP and TCP headers are modified by a kernel module to match a desired deceptive signature, e.g., TCP flags, window size, options format, MSS and WScale values.)
Figure 4.6: The alteration of an outgoing packet to mimic a certain desired deceptive signature.
A recent paper [1] provides a technical approach for OS and application obfuscation against
nmap reconnaissance efforts. Their approach works by altering important information contained
in the TCP/IP header’s in order to fool nmap scans into misclassifying the OS of a particular
system, and potentially, the application and services running on that system as well. Figure 4.6
shows how an outgoing packet is modified in order to provide a deceptive signature of the host
machine through the alteration of specific properties in the packet header. This approach is useful
as it can thwart both active and passive OS fingerprinting tools, such as Nmap and p0f3, while
also altering the reported services by altering the header files sent back from a particular service
scan (achieved by scanning the socket). For the obfuscation of a service hosted on a system, the
techniques presented in [1] rely on altering the banner message sent back when establishing a
connection or in the header of each application-level protocol data unit.
4.6.2 Deceptive Network Topologies
Another area of interest for recent research focuses on altering the network view each host system
on a network observes from reconnaissance efforts. A recent line of work [26] provides the
Adaptive Cyber Deception System (ACyDS) that gives the network administrator an incredibly
powerful tool. ACyDS provides each network host a unique virtual view of the enterprise network
that alters the subnet topology and IP address assignments of reachable hosts and servers and does
not reflect the physical network configurations.
ACyDS works by altering the network view for each host (system) on the network which con-
sists of the network entities and network topology. The network entities viewable from a system
are those which it is permitted to communicate with on the network. The network topology view
of a system consists of the routers connecting the host and the other systems present
in its network entities view. For a given system, the network view is then the
composite of the network entities and network topology views. An example is given in Figure 4.7
which shows Host1’s network view.
(a) True Network (b) Deceptive Network
Figure 4.7: Network views for a host connected to the defender’s network. In 4.7(a) is the true
network state while in 4.7(b) is an altered state with additional network connections and honey-
pots.
ACyDS works by deceptively setting the network view for a system via altering the network
entities or network topology views. This is achieved with the use of Software-Defined Networking
(SDN) along with OpenFlow to correctly manage the traffic in the network. ACyDS leverages
SDN controllers, SDN switches, and other components to create the individual deceptive network
views for systems. It is recommended the reader reference [26] for an in-depth breakdown of the
incredible ACyDS tool.
4.6.3 Honeypots and Network Tools
Honeypots are used ubiquitously in cybersecurity as a means of identifying adversary attacks and
learning about adversary behavior. Although I do not talk about the use of honeypots extensively
in this chapter, they can be incorporated into the deception model along with their behaviors.
Additionally, software applications have been developed which can mimic network personalities
of host systems to increase the perceived number of host machines on a network to an adver-
sary. HoneyD [66] establishes virtual network daemons that listen for requests sent to certain IP
addresses on a defender’s network and can answer these requests for the virtualized machines.
HoneyD provides the ability to set the network topology, host machines and each machine's
configuration on a network. This provides the network administrator with the ability to fake ad-
ditional services on the real hosts while also emulating additional honeypots that may appear the
same as the real host systems. Using an approach of this type creates vastly more uncertainty
for an adversary and forces them to expend significant effort in order to identify real versus fake
systems on a network. Figure 4.8 gives an example of how HoneyD can be used within a
network.
Figure 4.8: The HoneyD server initializes network daemons to respond to pings and scans for
various IP addresses not used by a network.
Beyond HoneyD, another recent tool which has been developed to identify adversarial net-
work reconnaissance is Project Nova.$^3$
. This application works by first analyzing a defender’s
network through the use of nmap and then developing a deceptive ‘haystack’ that is deployed
using HoneyD. The haystack is essentially a set of virtual systems to create in addition to the
current real systems operating on a network to increase the difficulty to an adversary learning
about systems on the network. In addition to increasing the effort expended by an adversary,
the haystack is useful in helping determine adversarial nodes that have been compromised in a
network which are being used for reconnaissance or attacking efforts. The CDG model can be
leveraged with the use of HoneyD or inside of an application like Project Nova to determine the
network state to show and to autonomously change the views shown to the adversary.
$^3$ http://www.novanetworksecurity.com/index.html
4.6.4 Leveraging the CDG Model
The technical approaches presented in this section highlight much of the current state-of-the-art
in developing network technologies to fool adversarial reconnaissance efforts. The CDG model
represents a high-level approach to reasoning about the deception schemes a network administra-
tor can employ when given an enterprise network environment and available technical deception
tools. One question that arises for a network administrator from this research is, “Why respond
at all?” It turns out that many organizations’ enterprise networks must be open to outside users
through the DMZ portion of their network. These situations require systems on the network to
respond to requests for connections and scanning activity which is part of their function on the
network. Indeed, the CDG model is particularly useful as a tool to protect the perimeter of an
organization’s network, but it is general enough to also capture reconnaissance activities in the
organization’s intranet. In order to gain a concrete understanding of the applicability of CDGs, it
is useful to go through an example similar to the example network in Section 4.2.
Consider an example scenario where the network administrator has 4 systems, e.g., acting as
webservers in this scenario, on the network, each of which has an operating system and webserver
software. For the systems, assume systems $k_1$ and $k_2$ have the Windows Server 2012 OS, with $k_3$ and
$k_4$ having the Ubuntu Linux 12.04 (Linux 3.2.x distribution) OS. Assume all are running the Apache
webserver, with $k_1$ running version 2.2, $k_2$ and $k_3$ running version 2.3, and $k_4$ running version
2.4. The TCs for these systems are then as follows: $k_1 = \{[\text{os}]\ WS2012, [\text{web}]\ Apache2.2\}$,
$k_2 = \{[\text{os}]\ WS2012, [\text{web}]\ Apache2.3\}$, $k_3 = \{[\text{os}]\ Linux3.2.x, [\text{web}]\ Apache2.3\}$ and
$k_4 = \{[\text{os}]\ Linux3.2.x, [\text{web}]\ Apache2.4\}$. For this scenario, the technical approach presented in [1]
is considered, which alters outgoing packets to obfuscate the OS and applications for a system.
Assume the network administrator has developed several obfuscation personas for the oper-
ating systems which allows them to make an OS of a system to appear as Windows Server 2008,
Windows Server 2012, Linux 2.6.x, and Linux 3.2.x. Additionally, the network administrator
can obfuscate the webserver version of Apache to appear as 2.2, 2.3 or 2.4. The combination of
all obfuscation personas makes up the set of OCs available for the CDG model to optimize over,
given constraints on which personas can be used for a given system, e.g., Windows Server 2012 can
only be obfuscated as Windows Server 2008.
A possible deception scheme for the network administrator - from optimizing the CDG model - is
deploying the following personas for the systems: k_1 = {[os] WS2008; [web] Apache 2.2},
k_2 = {[os] WS2008; [web] Apache 2.2}, k_3 = {[os] Linux 2.6.x; [web] Apache 2.4} and
k_4 = {[os] Linux 2.6.x; [web] Apache 2.4}. In this scheme, k_1 and k_2 are made to appear the
same, and in a similar way k_3 and k_4 are made to appear the same. From here the network
administrator only needs to tell the OS and service obfuscator to alter packets to make each
system appear as the desired persona. Switching to new personas is also easy, as it only requires
changing the attached personas in the network switch (kernel module in Figure 4.6) responsible
for altering the packets sent out from systems on the network, i.e., one could have k_1 and k_2
present Apache 2.3 as their webserver without much overhead.
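To make the mapping from true configurations (TCs) to observed configurations (OCs) concrete, the following Python sketch encodes the four example systems and checks a candidate deception scheme against per-system masking constraints. It is illustrative only: the feasibility rules and dictionary structure are assumptions for the example, not the interface of any particular obfuscation tool.

```python
# Hypothetical sketch: TCs, OCs, and a candidate deception scheme for the
# 4-system example. The masking rules below are assumptions for illustration,
# not the actual constraints of any specific obfuscation tool.

# True configurations (TCs) of the real systems.
true_configs = {
    "k1": {"os": "WS2012",     "web": "Apache2.2"},
    "k2": {"os": "WS2012",     "web": "Apache2.3"},
    "k3": {"os": "Linux3.2.x", "web": "Apache2.3"},
    "k4": {"os": "Linux3.2.x", "web": "Apache2.4"},
}

# Assumed masking constraints: which OS personas may cover which true OS.
allowed_os_masks = {
    "WS2012":     {"WS2012", "WS2008"},
    "Linux3.2.x": {"Linux3.2.x", "Linux2.6.x"},
}
allowed_web_masks = {"Apache2.2", "Apache2.3", "Apache2.4"}  # any Apache persona

# Candidate deception scheme (the OC assigned to each system).
deception_scheme = {
    "k1": {"os": "WS2008",     "web": "Apache2.2"},
    "k2": {"os": "WS2008",     "web": "Apache2.2"},
    "k3": {"os": "Linux2.6.x", "web": "Apache2.4"},
    "k4": {"os": "Linux2.6.x", "web": "Apache2.4"},
}

def is_feasible(tc, oc):
    """Check one system's observed configuration against its true configuration."""
    return (oc["os"] in allowed_os_masks[tc["os"]]
            and oc["web"] in allowed_web_masks)

for system, oc in deception_scheme.items():
    assert is_feasible(true_configs[system], oc), f"infeasible persona for {system}"
print("Deception scheme is feasible for all systems.")
```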
As mentioned in Section 4.6.1, the deception is not always transparent to legitimate users.
Presenting false personas for systems requires the alteration of all outgoing packets in some
cases (given a passive network scanner); hence, latency can become an issue, along with reductions
in data transfer speed caused by altering values such as the Maximum Segment Size (MSS) or Window
Size (WS). The reader is referred to [1] for more in-depth details and analysis of the cost to a
legitimate user under this technical deception approach.
4.7 Chapter Summary
In this chapter, I study the problem of how a network administrator should respond to scan
requests from an adversary attempting to infiltrate her network. I show that computing the optimal
defender strategy against a powerful adversary is NP-hard and provide two main techniques, a
MILP approach and a bisection algorithmic approach, to solve for the defender’s optimal strategy
against a powerful adversary. Additionally, a greedy algorithm is provided which quickly finds
good defender strategies and performs well empirically. I then show that computing the optimal
strategy against a naive attacker is still NP-hard given a budget constraint. Extensive experimen-
tal analysis is given demonstrating the effectiveness of the approaches. Finally, a section covering
technical deception approaches shows applications and tools that could be used in leveraging the
CDG model and algorithms to generate deceptive network views in real-world networks.
Chapter 5
Cyber-alert Allocation Games
5.1 Problem Domain
While many organizations face the challenge of cyber alert allocation, this chapter highlights
a scenario developed in consultation with experts at the United States Air Force (USAF). The
USAF relies on extensive global cyber systems to support its missions, which are monitored by
IDPS to prevent attacks on the network by intelligent adversaries. The Air Force Cyber defense
unit (AFCYBER) is responsible for investigating and resolving alerts generated by these IDPS
(24th Air Force - AFCYBER: http://www.24af.af.mil).
Due to the global scale of USAF computer systems, millions of alerts are generated every day,
associated with different types of events. Prescreening of the alerts eliminates a large fraction of
insignificant events, but thousands remain to be investigated. Any of these remaining alerts could
indicate a malicious attack, but a large fraction are false positives.
Two primary features are used to prioritize the most critical alerts to investigate. First, each
alert has a risk classification (e.g., high, medium, low) based on the type of event detected by the
IDPS. Second, each alert has an origin location within the global network (e.g., a specific host,
system); some locations (e.g., headquarters) are more critical to operations.
Figure 5.1: To protect against cyber intrusions, enterprise networks deploy Intrusion Detection
and Prevention Systems across their network that work at both a host and network level. The
alerts generated are given a risk classification and aggregated into a central repository called a
SIEM. The network administrator then must determine how to allocate the alerts to analysts for
investigation and remediation if necessary.
The AFCYBER has a limited number of Incident Response Team (IRT) cyber analysts who
investigate significant alerts after prescreening (688th Cyberspace Wing:
http://www.24af.af.mil/Units/688th-Cyberspace-Wing). Each analyst has different areas of expertise,
and may therefore be more effective and/or faster at investigating certain types of incidents. The
USAF also must protect against an adaptive adversary who can observe strategies through bea-
coning and other techniques. The problem AFCYBER faces is an excellent real-world example of the
central analyst assignment problem covered in this chapter.
In this chapter, I will discuss the Cyber-alert Allocation Game (CAG) model which captures
the assignment problem a network administrator is faced with in this domain. This model helps
prioritize the resolution of alerts by accounting for a strategic adversary and constraints on the
cyber analysts investigating the alerts. Next, the CAG is computationally analyzed and techniques
are developed to solve for the defender’s optimal assignment policy. Finally, experiments are
shown which demonstrate the benefit of the game theoretic approach versus ad-hoc assignment
policies while the scalability of the algorithms is also analyzed.
5.2 Cyber-alert Allocation Games
The Cyber-alert Allocation Game (CAG) is modeled as a (zero-sum) Stackelberg game played
between the defender (e.g., AFCYBER) and an adversary (e.g., hacker). The defender commits
to a mixed strategy to assign alerts to cyber analysts. A worst-case assumption is made such that
the attacker moves with complete knowledge of the defender’s strategy and plays a best-response
attack strategy [43]. However, in a zero-sum game the optimal strategy for the defender is the
same as the Nash equilibrium (i.e., when the attacker moves simultaneously) [88], so the order of
the moves is not consequential in the model.
Systems and Alerts: The defender responds to alerts originating from a set of systems k ∈ K.
A "system" in this model could represent any level of abstraction, ranging from a specific server
to a complete network. IDPS for each system generate alerts of different types, a ∈ A. The alert
types correspond to levels of severity (e.g., high, medium, and low), reflecting the likelihood of
a malicious event. The combination of the alert type and the origin system is represented as an
alert category, c ∈ C, where c = (k, a). The alerts in a given category are not differentiable, so
the defender must investigate all alerts within a category with the same probability. The total
number of alerts for a given category c is denoted by N_c. It is assumed that both the defender
and attacker know the typical value of N_c from historical averages (similar to [33]).
Attack Methodologies: Attackers can choose from many attack methodologies. These fall into
high-level categories such as denial of service attacks, malware, web exploitation, or social
engineering. These broad classes of attacks are represented as attack methods m ∈ M. For every
attack method there is a corresponding probability distribution β^m_a which represents the
probability that the IDPS generates an alert of type a for an attack method m. For example, if the
attacker chooses m = DoS the corresponding alert probabilities could be β^DoS_High = 0.8,
β^DoS_Medium = 0.15 and β^DoS_Low = 0.05.
Cybersecurity Analysts: Cybersecurity analysts R are assigned to investigate alerts. The time
required for an analyst to resolve an alert of type a varies, and is represented by T^r_a.
Intuitively, T^r_a represents the portion of a time period that an analyst needs to resolve an
alert of type a. A time period may be a shift, an hour or another fixed scheduling period. For
example, if an analyst needs half a time period to resolve a, then T^r_a = 0.5. In the model,
T^r_a ≤ 1 ∀ a ∈ A, i.e., an analyst can address multiple alerts within a time period. In addition
to T^r_a, the effectiveness of an analyst against an attack method, representing her expertise, is
captured via a parameter E^r_m.
Defender Strategies: A pure strategy P for the defender is a non-negative matrix of integers of
size |C| × |R|. Each (c, r) entry is the number of alerts in category c assigned to be investigated
by cyber analyst r, denoted by P_{c,r}. The set of all pure strategies P̂ is all allocations that
satisfy the following constraints, where C_a denotes all categories with the alert type a:

  ∑_{a∈A} ∑_{c∈C_a} T^r_a · P_{c,r} ≤ 1    ∀ r ∈ R    (5.1)

  ∑_{r∈R} P_{c,r} ≤ N_c    ∀ c ∈ C    (5.2)

  P_{c,r} are integers    (5.3)
(a) Pure Strategy (b) Marginal Strategy
Figure 5.2: CAG Strategies for the defender.
Inequality (5.1) ensures that each analyst is assigned a valid number of alerts, while inequality
(5.2) ensures the number of alerts assigned is not more than the total in a category.
Example CAG. Consider a CAG with two systems K = {k_1, k_2}, two alert levels A = {a_1, a_2},
and two analysts R = {r_1, r_2}. There are four alert categories C = {c_1, c_2, c_3, c_4}, where
c_1 = (k_1, a_1), c_2 = (k_1, a_2), c_3 = (k_2, a_1) and c_4 = (k_2, a_2). For the alert categories
we have N_{c_1} = 3, N_{c_2} = 2, N_{c_3} = 0, and N_{c_4} = 1. For r_1, assume T^{r_1}_{a_1} = 1
and T^{r_1}_{a_2} = 0.5; for r_2, assume T^{r_2}_{a_1} = 0.4 and T^{r_2}_{a_2} = 0.2. The analyst
capacity constraint (Inequality (5.1)) for r_1 is instantiated as follows (the other columns are
similar):

  P_{c_1,r_1} + 0.5 · P_{c_2,r_1} + P_{c_3,r_1} + 0.5 · P_{c_4,r_1} ≤ 1

For c_1 the alert capacity constraint (Inequality (5.2)) is (the other rows are similar):

  P_{c_1,r_1} + P_{c_1,r_2} ≤ 3

An example of a pure strategy P is given in Figure 5.2(a). The dashed boxes in Figure 5.2(a)
represent the set of variables in the analyst capacity constraints, i.e., constraints of type (5.1).
An example marginal strategy is shown in Figure 5.2(b). This drops constraint (5.3), but satisfies
constraints (5.1) and (5.2).
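As a quick sanity check on the example, the short sketch below instantiates the N_c and T^r_a values above and verifies constraints (5.1) and (5.2) for one candidate integer assignment. The candidate pure strategy is a hypothetical choice made for illustration, not the strategy shown in Figure 5.2.

```python
# Illustrative check of constraints (5.1) and (5.2) for the example CAG.
# The pure strategy P below is a hypothetical assignment chosen for illustration.

N = {"c1": 3, "c2": 2, "c3": 0, "c4": 1}           # alerts per category
alert_type = {"c1": "a1", "c2": "a2", "c3": "a1", "c4": "a2"}
T = {("r1", "a1"): 1.0, ("r1", "a2"): 0.5,          # time per alert type
     ("r2", "a1"): 0.4, ("r2", "a2"): 0.2}

# Candidate pure strategy P[c][r]: number of alerts of category c given to r.
P = {"c1": {"r1": 1, "r2": 2}, "c2": {"r1": 0, "r2": 1},
     "c3": {"r1": 0, "r2": 0}, "c4": {"r1": 0, "r2": 0}}

# Constraint (5.1): each analyst's workload fits in one time period.
for r in ("r1", "r2"):
    load = sum(T[(r, alert_type[c])] * P[c][r] for c in N)
    assert load <= 1 + 1e-9, f"analyst {r} over capacity: {load}"

# Constraint (5.2): no category is assigned more alerts than it contains.
for c in N:
    assert sum(P[c].values()) <= N[c], f"category {c} over-assigned"

print("Candidate pure strategy satisfies (5.1) and (5.2).")
```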
Define a mixed strategy q over pure strategies P ∈ P̂ (∑_{P∈P̂} q_P = 1, 0 ≤ q_P ≤ 1). From the
mixed strategy one can calculate the marginal (expected) number of alerts of category c assigned
to each analyst r, denoted by n_{c,r} = ∑_P q_P · P_{c,r}. The marginal allocation is denoted by n,
with component n_{c,r} representing the expected number of alerts in category c assigned to
analyst r. The adversary plays a best response to the defender's marginal strategy n, which
amounts to choosing a system k to attack and an attack method m.
Utilities: Since the alerts in a category are indistinguishable they are all investigated with the
same probability n_{c,r}/N_c, which is the probability that an alert in category c is investigated
by analyst r. The probability of detecting an attack of method m that results in an alert in
category c is calculated as x_{c,m} = ∑_{r∈R} E^r_m · n_{c,r}/N_c. The payoffs for the defender
depend on the system k that is attacked, the attack method m, and whether the adversary is
detected (or undetected) during investigation. These are denoted by U^d_{d,c} and U^u_{d,c},
respectively, where c refers to the category (k, a) and d denotes the defender. A CAG is
formulated as a zero-sum game, hence the payoffs for the adversary (θ) are U^d_{θ,c} = −U^d_{d,c}
and U^u_{θ,c} = −U^u_{d,c}. If the adversary chooses k and m, then given β^m_a the defender's
utility is:

  U_s = ∑_{a∈A} β^m_a [ x_{c,m} · U^d_{d,c} + (1 − x_{c,m}) · U^u_{d,c} ]    (5.4)
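A small numeric sketch of Equation (5.4) follows; the β^m_a, E^r_m, N_c, marginal values, and payoffs are hypothetical numbers chosen only to show the computation for one adversary choice (k, m).

```python
# Illustrative evaluation of Equation (5.4) for one adversary choice (k, m).
# All numbers below are made up for the example.

A = ["a1", "a2"]
R = ["r1", "r2"]
beta = {"a1": 0.7, "a2": 0.3}                        # beta^m_a for the chosen m
E = {"r1": 0.9, "r2": 0.6}                           # E^r_m for the chosen m
N = {"a1": 3, "a2": 2}                               # N_c for categories (k, a)
n = {("a1", "r1"): 1.0, ("a1", "r2"): 1.5,           # marginal assignment n_{c,r}
     ("a2", "r1"): 0.0, ("a2", "r2"): 2.0}
U_det = {"a1": 0.0, "a2": 0.0}                       # U^d_{d,c}: attack detected
U_undet = {"a1": -8.0, "a2": -5.0}                   # U^u_{d,c}: attack undetected

U_s = 0.0
for a in A:
    # Detection probability x_{c,m} for category c = (k, a).
    x = sum(E[r] * n[(a, r)] for r in R) / N[a]
    U_s += beta[a] * (x * U_det[a] + (1 - x) * U_undet[a])
print(f"Defender expected utility U_s = {U_s:.3f}")
```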
Bayesian Game: It is possible to extend the CAG formulation to allow for varying adversary types.
The motivation behind this extension is to handle the situation where a defender may be protecting
against adversaries that value targets in the network differently, e.g., a nation-state versus a
script-kiddie. In this case, denote the set of adversary types as Θ and assume the defender knows
a prior z over the chance of encountering the varying adversary types. Define the utilities for
adversary type θ as U^d_{d,c}(θ) and U^u_{d,c}(θ). The defender's utility given θ, k, m and β^m_a
is then U_{s,θ} = ∑_{a∈A} β^m_a [ x_{c,m} · U^d_{d,c}(θ) + (1 − x_{c,m}) · U^u_{d,c}(θ) ]. Although
the Bayesian formulation can easily be handled, for clarity the rest of the analysis is presented
without considering adversary types.
5.3 Defender’s Optimal Strategy
The defender's optimal mixed strategy (maximin strategy) can be computed with a linear program,
denoted as MixedStrategyLP:

  max_{n,v}  v    (5.5)

  s.t.  v ≤ U_s    ∀ k ∈ K, ∀ m ∈ M    (5.6)

        x_{c,m} = ∑_{r∈R} E^r_m · n_{c,r} / N_c    ∀ c ∈ C, ∀ m ∈ M    (5.7)

        n_{c,r} = ∑_{P∈P̂} q_P · P_{c,r}    ∀ c ∈ C, ∀ r ∈ R    (5.8)

        ∑_{P∈P̂} q_P = 1,  q_P ≥ 0    (5.9)

This LP requires exponentially many pure strategies P ∈ P̂. The objective function in Equation (5.5)
maximizes the defender's utility, v. Equation (5.6), which uses Equation (5.4), ensures the
adversary selects a best response over all m ∈ M and k ∈ K. Equation (5.7) calculates the detection
probabilities x from the marginal strategy n, which is computed by Equation (5.8). Equation (5.9)
ensures the mixed strategy is valid.
Computing the maximin mixed strategy for the defender was shown to be NP-hard in the
case of TSGs [22]. The computational hardness arises from the underlying team formation of
applying a group of screening resources to screen incoming passengers. However, in CAGs there
are not teams of analysts, the defender only needs to assign the alerts to individual analysts. Thus,
one might hope that this could simplify the problem and admit a polynomial time algorithm.
Unfortunately, this turns out not to be the case. Specifically, in Theorem 5.1 it is shown that the
problem is still NP-hard, where the hardness arises from a different domain feature, i.e., the
time values T^r_a for the analysts.
Theorem 5.1 Computing the defender maximin strategy is weakly NP-hard when there is only
one resource, and is strongly NP-hard with multiple resources.
Proof 5.1: From [85], it is known that the computational complexity of computing a minimax
equilibrium is equivalent to that of finding the best response. Here, I show that the best
response problem in CAGs is weakly NP-hard even when there is only one resource via a reduction
from the Knapsack problem, and becomes strongly NP-hard when there are multiple resources via a
reduction from the Generalized Assignment Problem (GAP). This, together with the results of [85],
yields the claimed conclusion. First, the weak NP-hardness is proven when there is a single
resource. In the knapsack problem, we have N items, each item i having a weight w_i and value v_i,
and the aim is to pick items of maximum possible value subject to a weight-capacity budget B.

Now create a CAG instance with one system k_1, one attack method m_1, and one resource r_1. Set
E^{r_1}_{m_1} = 1.0 and L_{r_1} = 1. Also, N alert levels are created, thus |A| = |C| = N. For each
alert level a_i ∈ A, set T^{r_1}_{a_i} = w_i / B ≤ 1. Each category c_i ∈ C also corresponds to a_i
since there is only one system. Set U^d_{d,c_i} = v_i and U^u_{d,c_i} = 0. Also set
β^{m_1}_{a_1} = ... = β^{m_1}_{a_N} = 1/N. Further, set N_{c_i} = 1 for all i, i.e., each category
has precisely one alert. The adversary only has one choice, (k_1, m_1). In this constructed
instance, the defender's best response is a pure strategy that picks alerts of maximum value
subject to a total time limit constraint. Let n_i ∈ {0,1} denote whether the resource resolves
category c_i; the best response problem then solves the following optimization program:

  max_n  ∑_{i=1}^{N} n_i · v_i    (5.10)

  s.t.  ∑_{i=1}^{N} n_i · w_i / B ≤ 1    (5.11)

        n_i ∈ {0,1}    ∀ i = 1,...,N    (5.12)

It is easy to see that this is precisely the Knapsack problem described above, yielding the weak
NP-hardness. To prove that the problem is strongly NP-hard with multiple resources, the reduction
is from the following Generalized Assignment Problem (GAP), a well-known NP-hard problem: given R
machines and A jobs, assigning job a to machine r costs T^r_a time units and achieves utility
E^r_a; each machine r has a limit of 1 time unit. The goal is to assign these jobs to machines to
maximize the total utility subject to each machine's time capacity. It is easy to verify that this
corresponds to the best response problem of a particular CAG as follows: one system, R resources,
alert set A, and |A| attack methods, with method m_a triggering alert a with probability 1.
Trivial details are omitted here.
In some special cases, it is possible to compute the optimal marginal strategy in polynomial time.
Specifically, if all T^r_a for a given analyst r are identical ∀ a ∈ A, then the optimal marginal
strategy can be found with an LP, as stated in Proposition 5.1. This result is discussed further
in Section 5.4.

Proposition 5.1 When T^r_{a_i} = T^r_{a_j} ∀ a_i, a_j ∈ A for each resource, then there is a
polynomial time algorithm for computing the maximin strategy.
Proof 5.2: When all T^r_a for a given analyst r are equal, the constraint
∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} ≤ 1 can be converted into ∑_{a∈A} ∑_{c∈C_a} n_{c,r} ≤ 1/T^r_a.
WLOG, 1/T^r_a → ⌊1/T^r_a⌋, as any marginal assignment over ⌊1/T^r_a⌋ will not be implementable;
this holds because all pure strategies have integer assignments. Hence, for all constraints of
type (5.16) a new constraint ∑_{c∈C_a} n_{c,r} ≤ ⌊1/T^r_a⌋ is introduced. This new set of analyst
capacity constraints forms a hierarchy H_2, and therefore the set of constraints forms a
bihierarchy. The defender's optimal marginal strategy n can then be found by solving the MSLP.
5.3.1 Defender’s Optimal Marginal Strategy
In the security games literature, two approaches are commonly used to handle scale-up: marginal
strategies [43, 50] and column generation [37]. A marginal strategy based approach is adopted
which finds the defender's marginal strategy n and does not need to explicitly enumerate the
exponential number of pure strategies. A relaxed version of LP (5.5)-(5.9) is given in LP
(5.13)-(5.17). LP (5.13)-(5.17) is similar to LP (5.5)-(5.9) except that equations (5.8) and (5.9)
are replaced with equations (5.16) and (5.17) to model the relaxed marginal space. Recall that
marginal strategies satisfy constraints (5.1)-(5.2) (which lead to Equations (5.16) and (5.17))
but drop constraint (5.3). The optimal marginal strategy n for the defender can then be found by
solving the following MarginalStrategyLP (MSLP); note that for the Bayesian formulation, the
objective function would be ∑_{θ∈Θ} z_θ · v_θ while the second constraint becomes
v_{s,θ} ≤ U_{s,θ} ∀ k ∈ K, ∀ m ∈ M, ∀ θ ∈ Θ:
  max_{n,v}  v    (5.13)

  s.t.  v ≤ U_s    ∀ k ∈ K, ∀ m ∈ M    (5.14)

        x_{c,m} = ∑_{r∈R} E^r_m · n_{c,r} / N_c    ∀ c ∈ C, ∀ m ∈ M    (5.15)

        ∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} ≤ 1    ∀ r ∈ R    (5.16)

        ∑_{r∈R} n_{c,r} ≤ N_c ;  n_{c,r} ≥ 0    ∀ r ∈ R, ∀ c ∈ C    (5.17)
Though MarginalStrategyLP computes the optimal marginal strategy n, it may not correspond to any
valid mixed strategy q, i.e., there may not exist a corresponding mixed strategy q such that
n = ∑_{P∈P̂} q_P · P with ∑_{P∈P̂} q_P = 1. Marginal strategies of this type are called
non-implementable. However, when the T^r_a have a particular structure, one can show the marginal
strategy returned is optimal for the defender. The intuition is that when T^r_a = 1/w_a where
w_a ∈ Z^+, the extreme points of the marginal polytope are all integer. In these cases, the
defender's optimal implementable marginal strategy can be efficiently computed using the MSLP.
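To make the MSLP concrete, the following minimal sketch builds LP (5.13)-(5.17) for a tiny instance, assuming the PuLP modeling library (with its bundled solver) is available; the instance data are hypothetical numbers, not drawn from the AFCYBER domain, and the code is only meant to show how the constraints translate into an LP.

```python
# Sketch: solving the MarginalStrategyLP (5.13)-(5.17) for a tiny hypothetical CAG.
# Requires the PuLP library; all game data below are illustrative assumptions.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

K = ["k1", "k2"]; A = ["a1", "a2"]; R = ["r1", "r2"]; M = ["m1"]
C = [(k, a) for k in K for a in A]
N = {("k1", "a1"): 3, ("k1", "a2"): 2, ("k2", "a1"): 2, ("k2", "a2"): 1}
T = {("r1", "a1"): 1.0, ("r1", "a2"): 0.5, ("r2", "a1"): 0.4, ("r2", "a2"): 0.2}
E = {("r1", "m1"): 0.9, ("r2", "m1"): 0.7}
beta = {("m1", "a1"): 0.7, ("m1", "a2"): 0.3}
U_det = {c: 0.0 for c in C}                          # U^d_{d,c}
U_undet = {c: -6.0 for c in C}                       # U^u_{d,c}

prob = LpProblem("MSLP", LpMaximize)
v = LpVariable("v")
n = {(c, r): LpVariable(f"n_{c[0]}_{c[1]}_{r}", lowBound=0) for c in C for r in R}
prob += v                                            # objective (5.13)

def x(c, m):                                         # detection probability (5.15)
    return lpSum(E[(r, m)] * (1.0 / N[c]) * n[(c, r)] for r in R)

for k in K:                                          # best-response bound (5.14)
    for m in M:
        U_s = lpSum(beta[(m, a)] * (x((k, a), m) * U_det[(k, a)]
                    + (1 - x((k, a), m)) * U_undet[(k, a)]) for a in A)
        prob += v <= U_s

for r in R:                                          # analyst capacity (5.16)
    prob += lpSum(T[(r, a)] * n[((k, a), r)] for k in K for a in A) <= 1
for c in C:                                          # category totals (5.17)
    prob += lpSum(n[(c, r)] for r in R) <= N[c]

prob.solve()
print("defender maximin value:", value(v))
```

Note that, as discussed above, the marginal returned by such an LP need not be implementable; the constraint conversion of Section 5.4 addresses that gap.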
Theorem 5.2 For any feasible marginal strategy n to MSLP, there is a corresponding mixed strategy
q that implements n whenever T^r_a = 1/w_a with w_a ∈ Z^+, ∀ r ∈ R, ∀ a ∈ A, and
N_c ≥ ∑_{r∈R} 1/T^r_a ∀ c ∈ C for a given CAG.
Proof 5.3: Let Q be the polytope defined by constraints (5.16) and (5.17). Notice that since
N_c ≥ ∑_{r∈R} 1/T^r_a, constraint (5.17), ∑_{r∈R} n_{c,r} ≤ N_c, is trivially satisfied. Now,
because (5.16) is independent across r ∈ R, Q can be written as Q = Q_{r_1} × Q_{r_2} × ... × Q_{r_|R|},
where Q_r = { n_{c,r} | ∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} ≤ 1, n_{c,r} ≥ 0 }.

To show that any feasible marginal strategy n from the MSLP has a valid mixed strategy q, it needs
to be shown that the extreme points of Q belong to Δ_P̂, the convex hull of the pure strategies.
Using a result from [20] it is known that n ∈ Q is an extreme point iff n_r is an extreme point of
Q_r, ∀ r ∈ R. Hence, it only needs to be shown that the extreme points of Q_r belong to Δ_P̂.

Take an arbitrary extreme point of Q_r; then |C| linearly independent constraints must be active.
If the capacity constraint ∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} = 1 is active, then |C|−1 of the
n_{c,r} ≥ 0 constraints must be active, meaning |C|−1 entries of n_{c,r} are 0 for the given
analyst r. Hence, n_{c,r} > 0 for only one entry, and ∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} = 1 implies
that n_{c,r} = w_a. (If the capacity constraint is not active, all |C| entries are 0, which is also
integer.) As these points are integer, they are also pure strategies. Therefore, the extreme
points of Q_r belong to Δ_P̂, and by extension so do those of Q. In a similar way, one can argue
the opposite direction: if MarginalStrategyLP has a valid solution, then a corresponding mixed
strategy can be found.
The intuition behind Theorem 5.2 is that when T^r_a = 1/w_a with w_a ∈ Z^+, the extreme points of
the defender's strategy space become integer. This can be seen from the maximum number of alerts
each resource is able to resolve: whenever T^r_a = 1/w_a, the number of alerts of a given type a
the resource can resolve is w_a, which corresponds to an integer assignment. Hence, the defender's
marginal strategy space is the same as the defender's mixed strategy space when these conditions
hold, and the MSLP returns the optimal marginal strategy for the defender.
5.4 CAG Algorithmic Approach
The problem of non-implementability of marginals in security games has been studied in previous
research [50, 22], but there the non-implementability arose because of spatio-temporal resource
constraints and constraints from combining resources into teams. For CAGs, non-implementability
arises from the presence of the T^r_a coefficients (an example is discussed later). In this
section, an algorithm is presented that takes the initial constraints of a CAG and converts them
to ensure the implementability of the marginal strategy. To that end, [23] presents a useful
approach, as they define a special condition on the constraints on the marginals called a
bihierarchy. A bihierarchy captures a sufficient condition needed to guarantee the
implementability of the defender's marginal strategy n. Unfortunately, the constraints of a CAG
rarely satisfy the conditions for a bihierarchy and must be converted to achieve the bihierarchy
condition.
Definitions and Notation: The marginal assignments n for the defender form a |C| × |R| matrix.
The assignment constraints on the defender's marginal strategy, namely Equations (5.16) and
(5.17), are each a summation of n_{c,r} over a set S ⊆ |C| × |R| with an integral upper bound. For
example, based on Equation (5.17), {{c_1, r_1}, {c_1, r_2}} forms a constraint subset for the
example CAG. The collection of all such S forms a constraint structure H when all coefficients in
the constraints are unitary, as they are in Equation (5.17).

A marginal strategy n is said to be implementable with respect to H if there exists a distribution
(a.k.a., mixed strategy) q such that n = ∑_{P∈P̂} q_P · P. A constraint structure H is said to be a
hierarchy if, for any two constraint sets in H, either one is a subset of the other or they are
disjoint. More concretely: ∀ S_1, S_2 ∈ H, S_1 ⊆ S_2, S_2 ⊆ S_1, or S_1 ∩ S_2 = ∅. H is said to be
a bihierarchy if there exist hierarchies H_1 and H_2 such that H = H_1 ∪ H_2 and H_1 ∩ H_2 = ∅.

For any CAG, the row constraints ∑_{r∈R} n_{c,r} ≤ N_c form a hierarchy H_1. However, the column
constraints, one for each resource r ∈ R, ∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} ≤ 1, do not form a
hierarchy. As mentioned earlier, the culprit lies in the T^r_a coefficients, as they can be
non-unitary; to achieve a hierarchy H_2 on the column constraints, and thus obtain a bihierarchy,
all T^r_a coefficients must be removed.
5.4.1 Constraint Conversion
The T^r_a coefficients admit possibly non-implementable marginal strategies. For instance, the
marginal strategy in Figure 5.2(b) is non-implementable, because it is impossible to get
n_{c_1,r_2} = 2.5 by mixing pure assignments: constraints (5.1) and (5.3) force the relevant pure
strategy entry P_{c_1,r_2} ≤ ⌊1/0.4⌋ = 2. The column constraints, namely
∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} ≤ 1, can be converted into a hierarchy by removing the T^r_a
coefficients. The conversion is completed by grouping together all n_{c,r} which share the same
T^r_a and introducing a new constraint on each such set of n_{c,r}. Specifically, each column
constraint (Equation (5.16)) is replaced with |A| constraints:

  ∑_{c∈C_a} n_{c,r} ≤ L^{C_a}_r    (5.18)

This conversion must be done for all analysts r ∈ R for the column constraints to form a hierarchy
H_2. L^{C_a}_r gives an upper bound on the number of alerts of type a that an analyst can resolve.
The choices of L^{C_a}_r must satisfy the original capacity constraint, namely
∑_{a∈A} T^r_a · L^{C_a}_r ≤ 1 with L^{C_a}_r ∈ Z.
Figure 5.3: Conversion of Column Constraints on CAG
Conversion Example: This example refers to the example CAG, where the marginal strategy is given
in Figure 5.3. The column constraints must be converted to a hierarchy. The conversion is shown
for r_1 (r_2 is converted in the same manner). Initially, for r_1 we have the following constraint:

  T^{r_1}_{a_1} · n_{c_1,r_1} + T^{r_1}_{a_2} · n_{c_2,r_1} + T^{r_1}_{a_1} · n_{c_3,r_1} + T^{r_1}_{a_2} · n_{c_4,r_1} ≤ 1
The T^{r_1}_a coefficients can be removed by grouping together all n_{c,r_1} which share T^{r_1}_a
and introducing two new constraints of the form (5.18). This leads to:

  n_{c_1,r_1} + n_{c_3,r_1} ≤ L^{C_{a_1}}_{r_1}

  n_{c_2,r_1} + n_{c_4,r_1} ≤ L^{C_{a_2}}_{r_1}
These new constraints are shown for r_1 in Figure 5.3 on the right of the arrow. Next, the
L^{C_a}_r values must be set. One possible combination is
H_2 = { n_{c_1,r_1} + n_{c_3,r_1} ≤ 0, n_{c_2,r_1} + n_{c_4,r_1} ≤ 2 } (H_2 also includes
constraints on r_2 which are not shown). This satisfies the original analyst capacity constraint,
as L^{C_{a_1}}_{r_1} + 0.5 · L^{C_{a_2}}_{r_1} ≤ 1. However, there is another choice for the
L^{C_a}_r values: H_2 = { n_{c_1,r_1} + n_{c_3,r_1} ≤ 1, n_{c_2,r_1} + n_{c_4,r_1} ≤ 0 }. Given
either of the two hierarchies H_2, the constraint structure is now a bihierarchy. The original
marginals shown in Figure 5.3 do not satisfy these new constraints; but solving the MSLP with
these additional constraints in H_2 is guaranteed to give an implementable marginal.
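The valid settings of the L^{C_a}_r values for an analyst can be enumerated directly. The short sketch below is illustrative only; it uses the T^{r_1}_a values from the example and lists every integer pair that satisfies the original capacity constraint, recovering the two hierarchies above among the candidates.

```python
# Enumerate integer (L_{a1}, L_{a2}) choices for analyst r1 that satisfy
# the original capacity constraint T_{a1}*L_{a1} + T_{a2}*L_{a2} <= 1.
T = {"a1": 1.0, "a2": 0.5}       # T^{r1}_a from the example CAG

max_a1 = int(1 // T["a1"])       # largest L_{a1} that could fit alone
max_a2 = int(1 // T["a2"])       # largest L_{a2} that could fit alone
valid = [(l1, l2)
         for l1 in range(max_a1 + 1)
         for l2 in range(max_a2 + 1)
         if T["a1"] * l1 + T["a2"] * l2 <= 1 + 1e-9]
print(valid)    # [(0, 0), (0, 1), (0, 2), (1, 0)]
```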
Rounding T^r_a Values: In the conversion process, a hierarchy H_2 is created on the column
constraints by introducing |A| values L^{C_a}_r for each resource. The conversion process then
allows combinatorially many configurations of the L^{C_a}_r values which satisfy the original
capacity constraint for a resource, i.e., Constraint (5.16). To avoid this search, an algorithm
could take advantage of Theorem 5.2 and round each T^r_a up to the nearest value 1/w_a ≥ T^r_a
with w_a ∈ Z^+. The marginal strategy n returned for this modified CAG is then guaranteed to be
implementable. However, as is shown next, this can lead to a loss of up to 1/2 of the defender's
utility in the worst case.
Counter Example: Consider a CAG with one system K = {k_1}, two alert levels A = {a_1, a_2}, and
one analyst R = {r_1}. There are two alert categories C = {c_1, c_2}, where c_1 = (k_1, a_1) and
c_2 = (k_1, a_2). For the alert categories we have N_{c_1} = 1 and N_{c_2} = 1. For r_1, assume
T^{r_1}_{a_1} = 0.5 + ε and T^{r_1}_{a_2} = 0.5 − ε. If one rounds the T^r_a values up to the
nearest 1/w_a, we would have T^{r_1}_{a_1} = 1 and T^{r_1}_{a_2} = 0.5.

Now assume the adversary has one attack method m_1 with E^{r_1}_{m_1} = 1 and where
β^{m_1}_{a_1} = 0.5 + ε and β^{m_1}_{a_2} = 0.5 − ε. Assume U^d_{d,c_1} = U^d_{d,c_2} and
U^u_{d,c_1} = U^u_{d,c_2}, where U^d_{d,c_1} > U^u_{d,c_1} ≥ 0. The adversary has one choice and
hence chooses to use m_1 to attack system k_1. In the modified CAG the defender can only assign
the alert in c_1 to r_1 and receives a utility of v' = (0.5 − ε) · U^u_d + (0.5 + ε) · U^d_d. In the
unmodified CAG, however, the defender would be able to assign both alerts to r_1 and therefore
achieve a utility v* = U^d_d. The worst possible loss from the modification of the CAG happens
when U^u_d = 0. This results in the following:

  v'/v* = [ (0.5 − ε) · U^u_d + (0.5 + ε) · U^d_d ] / U^d_d ≥ (0.5 + ε) · U^d_d / U^d_d

In the worst case, v' = (0.5 + ε) · v*. Hence, rounding the T^r_a values means the defender can
lose up to 1/2 of the optimal utility. This amount of loss is not acceptable in cyber security
domains which have highly sensitive targets; therefore, algorithms must be devised which provide
better solutions that mitigate this loss.
5.4.2 Branch-and-Bound Search
So far, it has been seen that a marginal strategy n for a CAG output from the MSLP may be
non-implementable. The goal is to ensure that the marginal strategy output by MSLP is
implementable by adding new column constraints, i.e., by realizing a bihierarchy. The addition of
new constraints as outlined above gives a bihierarchy, but there are multiple ways to set the
values of the L^{C_a}_r variables (as shown in the above example), creating a choice of which
bihierarchy to create. Indeed, one may need to search through the combinatorially many ways to
convert the constraints of a CAG to a bihierarchy. Previous work [22] proposed the Marginal Guided
Algorithm (MGA) for creating bihierarchies, but MGA does not apply to CAGs as it does not deal
with the non-unitary coefficients present in CAGs.

Here a novel branch-and-bound search is proposed: out of the set of constraints that could be
added to MSLP, find the set that gives the defender the optimal utility v*. At the root node are
the original constraints (5.16) and (5.17); running MSLP potentially yields a non-implementable
marginal strategy n. Then branches are created from this root, where at each level in the tree new
constraints are added for an analyst r, and the children are expanded with the following rules:
1. Substitute ∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} ≤ 1 with |A| constraints:
∑_{c∈C_a} n_{c,r} ≤ L^{C_a}_r for all a ∈ A. The |A| new constraints form a set H_2(r). A branch is
created for every combination of L^{C_a}_r values which satisfies ∑_{a∈A} T^r_a · L^{C_a}_r ≤ 1.

2. Solve the MarginalStrategyLP at each node with the modified constraints.
Thus, at each level of the tree, the capacity constraints of some analysts have been substituted,
and for these we have constraints of type (5.18), while for the others we still have constraints
of type (5.16). This set of constraints does not form a hierarchy H_2, as T^r_a coefficients are
present in some analyst constraints. Still, each intermediate node gives an upper bound on the
defender's utility v, which is stated in Proposition 5.2, as each conversion from (5.16) to (5.18)
introduces new constraints on the defender's strategy space.
Proposition 5.2 Each intermediate node in the tree gives an upper bound on the defender’s utility
v for all subsequent conversions for the remaining analyst capacity constraints.
Proof 5.4: In an intermediate node there are two types of column constraints present:
(1) ∑_{a∈A} ∑_{c∈C_a} T^r_a · n_{c,r} ≤ 1 and (2) ∑_{c∈C_a} n_{c,r} ≤ L^{C_a}_r. At the next level
of the tree a constraint of the first type is replaced with constraints of the second type. These
new constraints restrict the defender's marginal strategy space and hence the defender's utility v
will either stay the same or decrease.
A leaf in the search tree has column constraints only of the form ∑_{c∈C_a} n_{c,r} ≤ L^{C_a}_r.
Hence, they form a hierarchy H_2, as all n_{c,r} have unitary coefficients and an integer upper
bound. At a leaf, the MSLP can be solved with the resulting bihierarchical constraints to find a
lower bound on the defender's utility v. Combining this with Proposition 5.2 gives the components
needed for a branch-and-bound search tree which returns the optimal bihierarchy for the defender.
5.4.2.1 Heuristic Search
The full branch-and-bound procedure struggles with large CAGs. To find good bihierarchies, one can
take advantage of the optimal marginal strategy n* returned from MSLP at an intermediate node to
reduce the amount of branching done. The intuition for this strategy is that the optimal
bihierarchy either contains, or is near, n*. For example, in the conversion done in Figure 5.3,
the L^{C_a}_r values can be set close to n: set L^{C_{a_1}}_{r_2} = ⌊1/0.4⌋ = 2, while the leftover
capacity for r_2 is used to set L^{C_{a_2}}_{r_2} = 1. L^{C_{a_1}}_{r_2} could be set to another
value, but the choice must stay close to n*.
For the heuristic search, the following rules are used to expand child nodes which must set the
L^{C_a}_r values for an analyst r: (1) L^{C_a}_r = ⌈n*_{C_a,r}⌉, (2) L^{C_a}_r = ⌊n*_{C_a,r}⌋ or
(3) L^{C_a}_r = ⌊(1 − ∑_{a'≠a} T^r_{a'} · L^{C_{a'}}_r) / T^r_a⌋, where n*_{C_a,r} = ∑_{c∈C_a} n*_{c,r}.
The third rule is used whenever an L^{C_a}_r value cannot be set to the ceiling or floor of n*, and
sets it to the maximum value allowed by the leftover analyst capacity. These choices are made in
an attempt to capture the optimal marginal strategy n*. The set of all valid combinations of the
L^{C_a}_r values using the above rules which satisfy ∑_{a∈A} T^r_a · L^{C_a}_r ≤ 1 constitutes the
search space at each intermediate node. These rules significantly reduce the branching at
intermediate nodes in the search tree.
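The sketch below mirrors the heuristic expansion rules for a single analyst. It is illustrative only: the T^r_a values and the marginal masses n*_{C_a,r} are hypothetical inputs, and the combination/repair logic is a simplified stand-in for the actual node-expansion procedure.

```python
import math
from itertools import product

# Illustrative sketch of the heuristic child-expansion rules for one analyst r.
# The T^r_a and marginal masses n*_{C_a,r} below are hypothetical inputs.
T = {"a1": 0.4, "a2": 0.2}
n_star = {"a1": 2.5, "a2": 1.2}     # n*_{C_a,r} = sum over c in C_a of n*_{c,r}
alerts = sorted(T)

candidates = []
# Rules (1) and (2): try the ceiling and floor of each marginal mass.
for values in product(*({math.ceil(n_star[a]), math.floor(n_star[a])} for a in alerts)):
    L = dict(zip(alerts, values))
    load = sum(T[a] * L[a] for a in alerts)
    if load > 1 + 1e-9:
        # Rule (3): give the last alert type only the leftover capacity.
        leftover = 1 - sum(T[a] * L[a] for a in alerts[:-1])
        if leftover < 0:
            continue
        L[alerts[-1]] = int((leftover + 1e-9) // T[alerts[-1]])
    if L not in candidates:
        candidates.append(L)

print(candidates)   # capacity-feasible L^{C_a}_r settings that stay near n*
```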
5.4.2.2 Convex Hull Extension
Figure 5.4: Geometric view of the defender's strategy space ((a) individual bihierarchies; (b)
their convex hull).

The above searches return a set of good bihierarchies for obtaining a high value of v* for the
defender when solving MSLP, as each leaf contains a bihierarchy H_i. Each bihierarchy H_i contains
a portion of the defender’s mixed strategy space (due to new constraints). Thus, taking a convex
hull over these bihierarchies increases the size of the defender’s strategy space and hence, will
only improve the defender’s utility. In Figure 5.4 a geometric representation of the defender’s
strategy space is shown. Individual points represent the defender’s pure strategies and the region
contained in the convex hull of these points is the defender’s mixed strategy space, while the
outer region represents the defender’s relaxed marginal strategy space. Figure 5.4(a) shows how
individual bihierarchies capture portions of the defender’s mixed strategy space represented by
the shaded regions enclosed by the dashed lines. Figure 5.4(b) shows that by taking the convex
hull of the two bihierarchies H_1 and H_2, the size of the defender's strategy space can be
increased without generating any new bihierarchies. Note that, as each bihierarchy is
implementable, the convex hull will also be implementable [22].
To take the convex hull, first notice that each bihierarchy H_i is a set of linear constraints and
can be written as D_i · n ≤ b_i for a matrix D_i and vector b_i. Hence, by definition
n(H_i) = { n | D_i · n ≤ b_i }. Using a result from [11] that represents the convex hull using
linear constraints, one can write:

  conv(n(H_1), ..., n(H_l)) = { n | n = ∑_i n_i,  D_i · n_i ≤ λ_i · b_i,  λ_i ≥ 0,  ∑_i λ_i = 1 }

This allows the convex hull of the bihierarchies to be computed efficiently using an LP similar to
MSLP.
In terms of the convex hull there are two options available: (1) take the convex hull of all
bihierarchies or (2) build the convex hull iteratively. In some cases, the set of bihierarchies
available to the defender can be very large, and hence optimizing over all bihierarchies is not
feasible. To alleviate this issue, the convex hull can be built iteratively. This is done by first
sorting the bihierarchies by the defender utility v. Next, the convex hull of the top two
bihierarchies is taken, which gives a utility v' to the defender. Bihierarchies are added to the
convex hull while the utility v' returned increases by at least some ε, and the process stops
otherwise.
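As a concrete illustration of the disjunctive formulation above, the following sketch (assuming the PuLP library is installed) encodes two toy bihierarchies as constraint rows D_i n_i ≤ λ_i b_i over a two-dimensional marginal and optimizes a made-up linear objective over their convex hull. The constraint data and objective are hypothetical and stand in for the real MSLP objective and bihierarchy constraints.

```python
# Sketch: optimizing over the convex hull of two bihierarchies using the
# formulation n = sum_i n_i, D_i n_i <= lambda_i b_i, sum_i lambda_i = 1.
# The constraint data and objective below are hypothetical.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

dims = ["x1", "x2"]                        # two marginal components for illustration
# Each bihierarchy is a list of (coefficient dict, rhs) rows, i.e., D_i n <= b_i.
H1 = [({"x1": 1, "x2": 0}, 0), ({"x1": 0, "x2": 1}, 2)]
H2 = [({"x1": 1, "x2": 0}, 1), ({"x1": 0, "x2": 1}, 0)]
bihierarchies = [H1, H2]

prob = LpProblem("ConvexHull", LpMaximize)
lam = [LpVariable(f"lam_{i}", lowBound=0) for i in range(len(bihierarchies))]
n_i = [{d: LpVariable(f"n{i}_{d}", lowBound=0) for d in dims}
       for i in range(len(bihierarchies))]
n = {d: lpSum(n_i[i][d] for i in range(len(bihierarchies))) for d in dims}

prob += 2 * n["x1"] + 1 * n["x2"]          # toy objective standing in for (5.13)
prob += lpSum(lam) == 1                    # convex combination weights
for i, H in enumerate(bihierarchies):
    for coeffs, rhs in H:                  # scaled copy of each bihierarchy
        prob += lpSum(coeffs[d] * n_i[i][d] for d in dims) <= rhs * lam[i]

prob.solve()
print({d: value(n[d]) for d in dims}, [value(l) for l in lam])
```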
5.5 Evaluation
The CAG model and solution algorithms are evaluated with experiments inspired by the operations of
the AFCYBER. The game payoffs are set to be zero-sum, i.e., U^u_{d,c} = −U^u_{θ,c}, and the
defender's payoffs are randomly generated with U^u_{d,c} uniformly distributed in [−10, −1]. The
rest of the game payoffs, U^d_{d,c} and U^d_{θ,c}, are set to zero. Each experiment is averaged
over 30 randomly generated game instances.
5.5.1 Full vs Heuristic Search
Whether the heuristic approach of staying close to n
would yield the right solution quality-speed
tradeoff remains to be seen. To test this, the performance of the full branch-and-bound search
(Full) is compared to the heuristic search (Heur). For this experiment two variations of the full
search are tested: Full-1, which uses the full convex hull, and Full-2, which uses the iterative
convex hull. For the heuristic search the same two variations are tested, labeled Heur-1 and
Heur-2. Each instance has 20 systems, 3 attack methods, and 3 alert types.

Figure 5.5: Experimental Results for CAG instances ((a) runtime and (b) defender utility for
Full-1, Full-2, Heur-1, and Heur-2 as the number of resources grows; (c) runtime and (d) defender
utility for Heur-1 and Heur-2 as the number of analysts grows, with the relaxed MSLP value as a
baseline in (d)).
In Figure 5.5(a) the number of resources are varied on the x-axis and the runtime in seconds
is shown on the y-axis. As can be seen the runtime of the full search explodes exponentially as
the number of resources is increased. However, the average runtime of the heuristic approach is
under 1 second in all cases and provides up to a 100x runtime improvement for 5 resources. In
Figure 5.5(b) the number of resources are on the x-axis while the y-axis shows the defender’s ex-
pected utility. This graph shows that all variations perform similarly, with the heuristic
suffering less than a 1% loss in defender utility compared to the full search for all game sizes.
Hence, these results show that the heuristic significantly improves runtime without sacrificing
solution quality.

Figure 5.6: Allocation Approach Comparison ((a) defender utility versus the number of alert types
for Heur-1, Heur-2, Greedy, and Random; (b) per-system comparison of Heur-1, Greedy, and Random
for a specific CAG instance).
Figure 5.7: Scaling Number of Systems ((a) runtime and (b) defender utility versus the number of
systems for Heur-1 and Heur-2, with the relaxed MSLP value as a baseline in (b)).
5.5.2 Solving large CAG
Another important feature of real-world domains is the large number of cybersecurity analysts
available to investigate alerts. Accordingly, the next experiment tests the scalability of the
heuristic approach on large CAG instances. The parameters for these experiments are 100 systems,
10 attack methods, and 3 alert levels.
In Figure 5.5(c) the runtime results are shown with the number of analysts on the x-axis and
the runtime in seconds on the y-axis. For example, Heur-1 takes an average of 40 seconds to
solve a CAG with 10 analysts. This graph shows the heuristic runs in under a minute, even as
the number of analysts is increased from 6 to 14. In Figure 5.5(d) the solution quality results
are shown with the number of analysts on the x-axis and the defender’s expected utility on the
y-axis. The solution quality is compared to the (potentially non-implementable) MSLP solution.
This graph highlights that the heuristic approach achieves a utility close to the theoretical optimal
value. Therefore, this experiment shows that game theoretic approaches scale to large CAG
without sacrificing much solution quality.
In the next experiment, the number of systems that have to be protected are varied. For this
experiment the defender has 5 cyber experts to assign. Figure 5.7 shows the runtime and solution
quality results. In Figure 5.7(a) the number of systems are varied on the x-axis and the runtime
in seconds is shown on the y-axis. For instance, for 50 systems Heur-1 takes an average of 1.78
seconds to finish running. This graph shows that Heur-1 and Heur-2 have no issues scaling to a
larger number of systems. In Figure 5.7(b) the x-axis shows the number of systems while the
y-axis gives the defender’s expected utility. The solution quality is again compared to the MSLP
solution. In all cases, the heuristic approaches suffer only a small loss in defender expected utility
compared to the MSLP value. As can be seen from the results, the heuristic approaches scale to
CAG with a larger number of systems without sacrificing much in the way of solution quality.
5.5.3 Allocation Approach
The last experiment aims to show that the game theoretic approach for CAGs outperforms approaches
used in practice. In addition to the heuristic approach, a greedy approach, which investigates the
highest priority alerts from the most critical bases first, and a random allocation approach are
used for comparison. The parameters for this experiment are 20 systems, 5 attack methods, and 10
analysts. In Figure 5.6(a) the solution quality results are shown with the number of alert types
on the x-axis and the defender's utility on the y-axis. For example, with 4 alert types the
heuristics achieve a utility of -7.52 while the greedy and randomized allocations give -9.09 and
-9.65, respectively. This difference is statistically significant (p < 0.05). In Figure 5.6(b), a
solution comparison for a specific CAG instance is shown. This graph gives intuition for why the
game theoretic approach performs so well: the greedy and random approaches tend to overprotect
some systems (system 4) while leaving others without adequate protection (system 2).
5.6 Chapter Summary
In this chapter I address the pressing problem in cyber security operations of how to allocate
cyber alerts to a limited number of analysts. The Cyber-alert Allocation Game (CAG) is introduced
to analyze this problem, and I show that computing optimal strategies for the defender is NP-hard.
I then give special cases in which the optimal strategy for a CAG is efficiently computable. To
solve CAGs, a novel approach is presented to address implementability issues in computing the
defender's optimal marginal strategy. Finally, I give heuristics to solve large CAGs, and provide
an empirical evaluation of the CAG model and solution algorithms.
Chapter 6
General-sum Threat Screening Games
6.1 Problem Domain
Extending the CAG and TSG models to general-sum domains is important for real-world appli-
cability of these models as security domains commonly have non-symmetric payoffs between
defenders and adversaries. In this chapter, I investigate Bayesian general-sum Threat Screening
Games (TSG) and present an approach to solve these games which has additional applicability to
the CAG model presented in Chapter 5. It is suggested that the reader familiarize themselves with
the details of the TSG model given in the background section. To highlight the usefulness of
modeling general-sum payoffs, consider that an adversary attacking an airport may receive a
positive payoff even when getting caught, as he gains notoriety for his attack. The defender,
however, would not receive any payoff given that she is only able to stop the attack and no damage
occurs. In this chapter, I present an approach that handles the computational complexities arising
from solving TSGs with non zero-sum payoffs, which leverages a hierarchical decomposition method
for handling the adversary action tree and uses MGA [22] to compute approximate marginal
strategies for the defender.
6.2 Approach
While it has been shown that Bayesian Stackelberg games are hard to solve [27], it is interesting
to observe that Bayesian general-sum TSGs are hard to solve even in the marginal strategy space
as is shown next. This result shows that finding the optimal marginal strategy n for Bayesian
general-sum TSGs is fundamentally more complex than the zero-sum case which can be solved
in polynomial time as an LP. Thus, it is not surprising that the solution approaches used in [22]
are not directly applicable to the general-sum case.
Theorem 6.1 Finding the optimal solution in Bayesian general-sum TSGs is NP-hard even in the
relaxed marginal strategy space.
Proof 6.1: The reduction is from the knapsack problem to the TSG problem. Assume n items with
weights w_i and values v_i and a sack of capacity K. WLOG, assume the w_i and K are integers.
Construct a game with n adversary types, |Θ| = n. Each type of adversary has two flights to board,
f_0 and f_1. Thus, C = { (θ_i, f_j) | 0 ≤ i ≤ n, j ∈ {0,1} }. Choose the other parameters of the
game as follows: two resources r_1, r_2 with capacities L_{r_1} = K and L_{r_2} = ∞. There are two
teams, t_1 = {r_1} and t_2 = {r_2}. There is only one attack method m_1. For this game
E^{t_1}_{m_1} = 1 and E^{t_2}_{m_1} = 0, i.e., t_2 is not effective at all at detecting m_1. The
number of screenees for each screenee category is N_{(θ_i, f_0)} = w_i and N_{(θ_i, f_1)} = 1 for
all θ_i ∈ Θ. Each type θ_i occurs with probability v_i / ∑_{θ_i} v_i.

The utilities for the screener are U^d_{s,(θ_i, f_0)} = 0, U^u_{s,(θ_i, f_0)} = 0 ∀ θ_i, and
U^d_{s,(θ_i, f_1)} = 2, U^u_{s,(θ_i, f_1)} = 1 ∀ θ_i. Thus, the screener strictly prefers the
adversary of every type to choose f_1. The utilities for the adversary are set as follows:
U^d_{Θ,(θ_i, f_0)} = 1, U^u_{Θ,(θ_i, f_0)} = 2 ∀ θ_i, and U^d_{Θ,(θ_i, f_1)} = 0,
U^u_{Θ,(θ_i, f_1)} = 1 ∀ θ_i. Thus, the adversary will choose (θ_i, f_1) only when the detection
probabilities are x_{(θ_i, f_0)} = 1 and x_{(θ_i, f_1)} = 0 (breaking ties in favor of the
screener). This happens only when all of the screenees N_{(θ_i, f_0)} = w_i are screened by t_1.
Thus, we have from the capacity constraints that ∑_{θ_i chooses f_1} w_i ≤ K. Therefore, for the
choice of f_1 by an adversary of type θ_i, the screener earns v_i / ∑_{θ_i} v_i, and otherwise the
screener earns 0. Given that the optimization problem maximizes this utility,
∑_{θ_i chooses f_1} v_i / ∑_{θ_i} v_i, the optimization provides a solution for the knapsack
problem.

The resulting TSG instance from the knapsack problem has two flights, two resources, two teams,
and as many adversary types as the number of items n in the knapsack problem. The screenee types
assigned to t_1 give the knapsack solution. Therefore, the reduction from the knapsack problem is
overall polynomial in the number of items n.
The optimal marginal strategy for a Bayesian general-sum TSG can be obtained by solving
the mixed integer linear program MarginalStrategyMILP, provided below.
  max_{n,s,x,a}  ∑_{θ∈Θ} z_θ · s_θ    (6.1)

  s.t.  s_θ ≤ U_s(a^{θ,w}_{c,m}) + (1 − a^{θ,w}_{c,m}) · Z    ∀ θ, w, c, m    (6.2)

        0 ≤ k_θ − U_θ(a^{θ,w}_{c,m}) ≤ (1 − a^{θ,w}_{c,m}) · Z    ∀ θ, w, c, m    (6.3)

        x^w_{c,m} = ∑_{t∈T} E^t_m · n^w_{c,t} / N^w_c    ∀ w, c, m    (6.4)

        ∑_{t∈T} I^t_r · ∑_{c∈C} n^w_{c,t} ≤ L^w_r    ∀ w, r    (6.5)

        ∑_{t∈T} n^w_{c,t} = N^w_c    ∀ w, c    (6.6)

        n^w_{c,t} ≥ 0    ∀ w, c, t    (6.7)

        a^{θ,w}_{c,m} ∈ {0,1}    ∀ θ, w, c, m    (6.8)

        ∑_{w,c,m} a^{θ,w}_{c,m} = 1    ∀ θ    (6.9)
Equations (6.8) and (6.9) force each adversary type θ to choose a pure strategy. Equation (6.1) is
the objective function, which maximizes the screener's expected utility as a weighted summation of
the screener's expected utility against adversary type θ, s_θ, multiplied by the probability of
encountering adversary type θ, z_θ. Equation (6.2) defines the screener's expected payoff against
each adversary type, contingent on the choice of pure strategies by the adversary types. The
constraint places an upper bound of U_s(a^{θ,w}_{c,m}) on s_θ, but only if a^{θ,w}_{c,m} = 1, as Z
denotes an arbitrarily large constant. For all other pure strategies of adversary type θ, the RHS
is arbitrarily large. Similarly, Equation (6.3) places an upper bound of U_θ(a^{θ,w}_{c,m}) on the
utility of adversary type θ, k_θ, for a^{θ,w}_{c,m} = 1. Additionally, Equation (6.3) also lower
bounds k_θ by the largest U_θ(a^{θ,w}_{c,m}) over all a^{θ,w}_{c,m}. Taken together these upper and
lower bounds ensure that the pure strategy selected by adversary type θ is a best response to the
screener marginal strategy n. Equations (6.5)-(6.7) require that n satisfies the resource type
capacity constraints and screenee category assignment constraints.
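To show how the big-Z best-response constraints (6.2) and (6.3) look in code, the following sketch builds a stripped-down model with PuLP (assumed installed). The instance is a hypothetical toy with one adversary type and two screenee categories, and the screening side is collapsed into a single capacity constraint as a stand-in for (6.4)-(6.7); it illustrates the encoding rather than the full MarginalStrategyMILP.

```python
# Sketch of the best-response encoding (6.2)-(6.3), (6.8)-(6.9) on a toy instance.
# One adversary type, two screenee categories; the screening side is collapsed to
# a single capacity constraint as a stand-in for (6.4)-(6.7). Data are hypothetical.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary, value

C = ["c1", "c2"]
Z = 1000                                    # arbitrarily large constant
U_s_det, U_s_undet = {"c1": 0, "c2": 0}, {"c1": -5, "c2": -9}   # screener payoffs
U_q_det, U_q_undet = {"c1": 1, "c2": 2}, {"c1": 4, "c2": 8}     # adversary payoffs

prob = LpProblem("ToyGeneralSumTSG", LpMaximize)
x = {c: LpVariable(f"x_{c}", lowBound=0, upBound=1) for c in C}  # detection probs
a = {c: LpVariable(f"a_{c}", cat=LpBinary) for c in C}           # adversary choice
s = LpVariable("s"); k = LpVariable("k")

prob += s                                   # objective: screener utility (6.1)
prob += lpSum(x[c] for c in C) <= 1         # stand-in screening capacity
prob += lpSum(a[c] for c in C) == 1         # adversary picks one category (6.9)
for c in C:
    U_s = x[c] * U_s_det[c] + (1 - x[c]) * U_s_undet[c]
    U_q = x[c] * U_q_det[c] + (1 - x[c]) * U_q_undet[c]
    prob += s <= U_s + (1 - a[c]) * Z       # (6.2): binds only if a_c = 1
    prob += k >= U_q                        # (6.3): k upper-bounds all responses
    prob += k <= U_q + (1 - a[c]) * Z       # (6.3): k equals the chosen response

prob.solve()
print([(c, value(a[c]), value(x[c])) for c in C], value(s))
```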
Even though MarginalStrategyMILP represents an NP-hard problem, solving it does not necessarily
produce an implementable marginal screener strategy, i.e., the marginal strategy may not map to a
probability distribution over pure strategies. The issue of implementability was addressed in [22]
for TSGs by introducing the Marginal Guided Algorithm (MGA), which uses the (potentially
non-implementable) marginal strategy to additionally restrict the constraints of a TSG to obtain a
provably implementable marginal strategy, though it can possibly lose solution quality in the
process.
6.3 GATE: Solving Bayesian General-Sum TSG
Despite operating in the marginal screener strategy space, solving the MarginalStrategyMILP is
computationally expensive due to the presence of the integer variables that encode the adversary
strategy space. Therefore, to solve Bayesian general-sum TSGs a clever algorithmic approach is needed
to explore the adversary strategy space. An example of the adversary strategy space for two
adversary types and two actions per type is shown in Figure 6.1(a). The leaf nodes in this tree
represent all possible joint pure strategy combinations for the two adversary types.
Figure 6.1: Bayesian Adversary Strategy Space ((a) game tree for two adversary types; (b)
hierarchical adversary type tree with binary partitioning).
As mentioned previously a standard algorithmic approach that could be used to solve
Bayesian general-sum TSGs is Multiple LPs [27]. This technique exploits the fact that, by fixing
the joint adversary pure strategy, the underlying optimization problem is converted from a MILP
to an LP. Thus, an LP can be solved for each leaf node in the adversary strategy tree to obtain
the best marginal screener strategy which induces that joint adversary pure strategy as a best
response. The marginal strategy with the highest screener utility is returned as the solution for
the TSG. However, the number of joint adversary pure strategies is exponential in the number of
adversary types |Θ| and thus Multiple LPs is not scalable for large problem instances. Therefore,
the General-sum Algorithm for Threat screening game Equilibria (GATE) is proposed; in this section
an intuitive, high-level description is provided. In Section 6.4 a detailed algorithm is given, as
the experimental results show that the basic GATE procedure does not scale, leading to the need
for heuristics to further speed up computation.
6.3.1 Hierarchical Type Trees
At a high level GATE seeks to exploit the structure of TSGs and reduce the number of joint
adversary pure strategies that need to be evaluated. GATE achieves this reduction by building
off intuition from the HBSA algorithm [39], which involves constructing a hierarchical type tree.
Such a tree decomposes the game with each node in the tree corresponding to a restricted game
over a subset of adversary types. The idea is to solve these smaller, restricted games to efficiently
obtain (1) infeasibility information to eliminate large sets of joint adversary pure strategies, and
(2) utility upper bound information that can be used to terminate the evaluation of joint adversary
pure strategies.
GATE operates on restricted TSGs, where TSG(Θ') is defined to be a TSG with a subset of adversary
types Θ' ⊆ Θ. It is important to note that, despite not including all adversary types, TSG(Θ')
does not ignore the screenee categories associated with the adversary types Θ \ Θ'. Indeed,
TSG(Θ') must still satisfy the constraint that all screenees in each category must be assigned to
a screening team type, i.e., Equation (6.6). By continuing to enforce these constraints, the upper
bounds generated will be tighter, as the screener cannot focus all the screening resources on just
a subset of screenee categories, helping to improve the ability of GATE to prune out joint
adversary pure strategies.
The subsets of adversary types are decomposed such that each level in the hierarchical type tree
forms a partition over Θ satisfying Θ'_i ∩ Θ'_j = ∅, ∀ i, j, i ≠ j, as well as ∪_i Θ'_i = Θ.
Additionally, the set of adversary types in each parent node is the union of the sets of adversary
types of all of its children, with the root node of the hierarchical type tree corresponding to
the full problem. Figure 6.1(b) shows an example of a hierarchical tree structure with full binary
partitioning for a game with four adversary types. The root node is the parent of two restricted
games with two adversary types each, and each of those is a parent of two restricted games for
individual adversary types.
The evaluation of the hierarchical tree starts at the leaf nodes and works up the tree such that
all child nodes are evaluated before the parent nodes are evaluated. Every node is processed by
evaluating the pure strategies of the restricted game and propagating up only the feasible pure
strategies (i.e., pure strategies inducible as an adversary best response). [39] proved that if a
pure strategy a_{θ'} can never be a best response for adversary type θ' in a restricted game
TSG(Θ') with Θ' = {θ'}, then any joint pure strategy containing a_{θ'} can never be a best response
in any TSG(Θ'') with Θ' ⊆ Θ''. Thus, at a given node, it is only necessary to consider joint pure
strategies in the cross product of the sets of feasible pure strategies passed up from the child
nodes.
For each pure strategy to be propagated, the corresponding utility with respect to the restricted
game is also passed up. [39] also proved that it is possible to upper bound the screener utility
for a joint adversary pure strategy a_Θ in TSG(Θ) by ∑_{θ∈Θ} z_θ · b(a_θ), where z_θ is the
normalized probability of adversary type θ in TSG(Θ') and b(a_θ) is the upper bound on the
screener utility for adversary pure strategy a_θ in the restricted game TSG({θ}). These upper
bounds can be used to determine the order in which to evaluate joint pure strategies as well as
when it is no longer necessary to evaluate joint pure strategies. This propagation of pure
strategies and upper bounds continues until the root node is solved to obtain the best solution
for the game.
Example: Consider a game with four adversary types Θ = {θ_1, θ_2, θ_3, θ_4}, as in Figure 6.1(b),
where each adversary type has two actions available, a^1_{θ_i} and a^2_{θ_i}. The full game is
broken down into the restricted games and the leaf nodes in the hierarchical tree are solved
first. Suppose that after all of the leaf nodes are solved we get the following feasible action
sets for the adversaries: θ_1 = {a^1_{θ_1}}, θ_2 = {a^1_{θ_2}, a^2_{θ_2}}, θ_3 = {a^2_{θ_3}},
θ_4 = {a^1_{θ_4}} (as the other actions are found to be infeasible). Then, in the node {θ_1, θ_2}
we evaluate [a^1_{θ_1}, a^2_{θ_2}] and [a^1_{θ_1}, a^1_{θ_2}], while for the node {θ_3, θ_4} we
evaluate [a^2_{θ_3}, a^1_{θ_4}]. Now, at the root node (Θ) one only has to evaluate two joint
adversary pure strategies (i.e., [a^1_{θ_1}, a^1_{θ_2}, a^2_{θ_3}, a^1_{θ_4}] and
[a^1_{θ_1}, a^2_{θ_2}, a^2_{θ_3}, a^1_{θ_4}]) instead of 16 (the cross product of all adversary
type strategy sets) in order to find the optimal strategy for the game.
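The pruning arithmetic from the example can be mirrored directly in code. The sketch below is purely illustrative: the per-type upper bounds b(a_θ) and the type probabilities z_θ are hypothetical numbers, and the snippet simply builds the surviving joint strategies from the feasible per-type sets and computes the weighted upper bound ∑_θ z_θ · b(a_θ) for each.

```python
from itertools import product

# Feasible actions per adversary type after solving the leaf restricted games,
# following the example above. The upper bounds b(a_theta) and the type
# probabilities z_theta are hypothetical numbers for illustration.
feasible = {"theta1": ["a1"], "theta2": ["a1", "a2"], "theta3": ["a2"], "theta4": ["a1"]}
b = {("theta1", "a1"): -2.0,
     ("theta2", "a1"): -3.0, ("theta2", "a2"): -1.5,
     ("theta3", "a2"): -4.0,
     ("theta4", "a1"): -2.5}
z = {"theta1": 0.25, "theta2": 0.25, "theta3": 0.25, "theta4": 0.25}

types = sorted(feasible)
joint_strategies = list(product(*(feasible[t] for t in types)))
print(f"{len(joint_strategies)} joint strategies survive (out of 2**4 = 16)")

# Weighted upper bound for each surviving joint strategy, used to order (and
# possibly cut off) their evaluation at the root node.
bounds = {joint: sum(z[t] * b[(t, act)] for t, act in zip(types, joint))
          for joint in joint_strategies}
for joint, ub in sorted(bounds.items(), key=lambda kv: -kv[1]):
    print(joint, round(ub, 3))
```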
6.3.2 Advantages of GATE
As mentioned earlier, security games techniques do not apply to TSGs directly, and thus, HBSA
is not well suited for this problem. In particular, HBSA utilizes a Branch-and-Price framework
[14] which requires running column generation (given the large number of defender strategies in
complex domains) to evaluate every single adversary joint pure strategy in the hierarchical type
tree. While Branch-and-Price is a general approach frequently used for Bayesian Stackelberg
games, column generation has been shown to be incapable of scaling for large-scale TSGs even
in the zero-sum case due to the massive number of screener pure strategies [22]. Thus, having to
repeatedly run column generation for the Bayesian general-sum TSGs is a non-starter.
To efficiently evaluate the joint adversary pure strategies in the nodes of the hierarchical
tree, Branch-and-Guide is introduced, which combines branch-and-bound search with MGA to
simultaneously mitigate the challenges of both scalability and implementability when solving
Bayesian general-sum TSGs. Branch-and-Guide may be run at the root node of the hierarchical
type tree, so that given a large number of joint adversary pure strategies, a large portion of them can
be pruned using upper-bounds, speeding up the computation. Furthermore, Branch-and-Guide
exploits the fact that for a fixed joint adversary pure strategy, an implementable marginal screener
strategy can be obtained quickly (even if it is not necessarily optimal), thereby avoiding having to
rely on column generation.
An example of the adversary strategy tree explored in Branch-and-Guide is shown in Figure
6.2, with the size and ordering of the tree based on the feasible joint adversary pure strategies
and corresponding upper bounds propagated up by the child nodes. Branches to the left fix the
joint adversary pure strategy, converting MarginalStrategyMILP into a linear program which can
be solved efficiently. However, the resulting marginal strategy may not be implementable and
thus MGA is run on the marginal while ensuring that the selected joint adversary pure strategy
is still a best response. This two-step process produces an implementable marginal strategy that
gives a lower bound on the overall solution quality. Branches to the right represent the upper
bound on the screener utility for the next best joint adversary pure strategy which is calculated
using the solution quality information passed up from the child nodes. If the screener utility for
the best solution found thus far is higher than the upper bound for the next best joint adversary
pure strategy, then the execution of Branch-and-Guide can be terminated without exploring the
remaining joint adversary pure strategies.
6.4 Scaling Up GATE
While GATE incorporates state-of-the-art techniques to solve MarginalStrategyMILP, it fails to
scale up to real world problem sizes for TSGs (see comparison in Evaluation). Thus, intuitive
heuristics are employed that further narrow down the search space for GATE, thereby enabling up to 10X run time improvement with only 5-10% solution quality loss.

Figure 6.2: Branch-and-Guide Tree. Left branches are MGA (lower bound) leaves that fix a single joint adversary pure strategy; right branches are upper bound nodes over the remaining strategies. The upper bounds are ordered $UB_1 \geq UB_2 \geq \ldots \geq UB_{|a|}$, while the lower bounds $LB_1, \ldots, LB_{|a|}$ are possibly not ordered.

There are two distinct steps in GATE where additional heuristics can help: the processing step at the leaves of the hierarchical type tree and the processing step at the intermediate and root nodes. In this section
GATE-H (GATE with heuristics) is presented first formally and then the heuristics used to speed
up the computation are described.
6.4.1 GATE-H: GATE with Heuristics
GATE-H solves TSGs efficiently by limiting both the number of adversary pure strategies passed
up the hierarchical adversary type tree from restricted games and by limiting the number of adver-
sary strategies evaluated in the individual nodes. Each node in the hierarchical tree is solved using
Algorithm 3, beginning at the leaf nodes. The feasible adversary pure strategy set, denoted as $A'$,
is passed up to the parent nodes as each child is solved. Notice that not all strategies need be eval-
uated at a given node for the computation to terminate as either the Branch-and-Guide heuristic
or K cutoff heuristic, both introduced later in this section, can end the computation early.
Algorithm 3: GATE_H_NODE($\Theta$, $A^i$, $B^i$, $U_s$, $U_\Theta$, $K$)
// $A^i_\Theta$: Pruned feasible pure strategy set for all adversary types
1. $A''$ := all-Joint-Adversary-Pure-Strategies()
2. $B'(a_\Theta)$ := getBound($a_\Theta$, $B^i$) $\;\forall a_\Theta \in \prod_{\Theta} A^i_\Theta$
3. sort($A''$, $B'(a_\Theta)$)  // sort $a_\Theta$ in descending order of $B'(a_\Theta)$
4. $a_\Theta$ := $[A^1_\Theta(1), A^2_\Theta(1), \ldots, A^{|\Theta|}_\Theta(1)]$
5. $r^*$, $r'$ := $-\infty$  // Save best and iterative solutions
repeat
    6. (feasible, $n$, $r$) := MGA($a_\Theta$)
    if feasible then
        7. $A'$ := $A' \cup a_\Theta$
        if $r > r^*$ then
            8a. $r^*$ := $r$
            8b. $n^*$ := $n$
        9. $B'(a_\Theta)$ := $r$
    else
        10. $A''$ := $A'' \setminus a_\Theta$
    11. Every $K$ iterations:
        if $r' = r^* \neq -\infty$ then
            12. break
        else
            13. $r'$ := $r^*$
    14. $a_\Theta$ := getNextStrategy($a_\Theta$, $r^*$, $A^i_\Theta$, $B^i$)
until $a_\Theta$ = NULL
return ($n^*$, $r^*$, $A'$, $B'$)
When solving a given node in GATE-H, the adversary set ($\Theta$) is given along with the screener's and adversaries' utilities ($U_s$ and $U_\Theta$, respectively), while the feasible strategies ($A^i_\Theta$) and bound information ($B^i$) are acquired from the children of that node, as shown in Algorithm 3. In the case
of solving a leaf node in the tree all of that adversary type’s strategies are enumerated with bound
information given by solving an upper bound LP, described in Section 6.4.2. After constructing
the joint adversary pure strategy set (Line 1), the set is ordered by their upper bound values (Line
3) and each strategy is evaluated one by one (main loop starts after Line 5). Heuristics are used
inside of this loop, leading to GATE-H.
Algorithm 4: getNextStrategy($a_\Theta$, $r^*$, $A^i$, $B^i$)
for $i = |\Theta|$ to 1 Step $-1$ do
    $j$ := index-of($a_\Theta$, $A^i_\Theta$)
    // Set each adversary type strategy equal to the left-most leaf
    $a^i_\Theta$ := $[A^1_\Theta(1), \ldots, A^{|\Theta|-1}_\Theta(1), A^{|\Theta|}_\Theta(j+1)]$
    if $r^* <$ getBound($a_\Theta$, $B$) then
        return $a_\Theta$
return NULL
The first of the two heuristics used is the Branch-and-Guide approach (Line 14 - getNextStrat-
egy()) which ends the computation early if the value of the current best strategy is greater than
the next highest upper bound. The second heuristic, discussed in more detail in Section 6.4.3, is the
K cutoff (Line 11) which ends the computation early if the current best solution is not improving
after K iterations.
Algorithm 4 describes the getNextStrategy function in detail. Essentially the function builds
the next strategy to be evaluated by iterating through all of the adversary types and grabbing the
highest valued strategy in their respective pure strategy lists.
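As a rough illustration of this control flow, the Python sketch below combines the main loop of Algorithm 3 with the K cutoff and the Branch-and-Guide termination; Algorithm 4's tree traversal is simplified here to a linear scan over the pre-sorted joint strategies, and the helper callables `get_bound` and `mga` are hypothetical stand-ins for the dissertation's bound and MGA routines.

```python
import math
from itertools import product

def gate_h_node(strategy_sets, get_bound, mga, K=30):
    """Sketch of one GATE-H node evaluation (cf. Algorithm 3).

    strategy_sets: per-type lists of feasible pure strategies passed up by the children.
    get_bound:     joint adversary pure strategy -> upper bound on screener utility.
    mga:           joint adversary pure strategy -> (feasible, marginal, screener_utility).
    """
    joints = list(product(*strategy_sets))
    joints.sort(key=get_bound, reverse=True)      # Line 3: descending upper bounds

    best_r, prev_r = -math.inf, -math.inf         # Line 5: best and iterative values
    best_n, feasible_joints = None, []

    for it, joint in enumerate(joints, start=1):
        # Branch-and-Guide cutoff: stop once the incumbent already beats the
        # upper bound of the next strategy to be explored (cf. Algorithm 4).
        if best_r > get_bound(joint):
            break

        feasible, n, r = mga(joint)               # Line 6: implementable lower bound via MGA
        if feasible:
            feasible_joints.append(joint)         # Line 7: keep as a feasible pure strategy
            if r > best_r:                        # Lines 8a-8b: new incumbent
                best_r, best_n = r, n

        if it % K == 0:                           # Line 11: K cutoff heuristic
            if prev_r == best_r != -math.inf:
                break                             # Line 12: no improvement over K strategies
            prev_r = best_r                       # Line 13

    return best_n, best_r, feasible_joints
```

A parent node would call this with the feasible sets and bounds returned by its children; the surviving feasible joint strategies and the incumbent's value are then propagated further up the tree.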
6.4.2 Tuning Leaf Node Computation
At the leaf nodes in the hierarchical type tree the restricted game for each adversary type is
solved. For GATE to be exact, all feasible adversary strategies from each leaf node must be
returned. Returning a portion of the feasible pure strategies gives a heuristic approach, which
may run well in practice but is not guaranteed to be optimal. Nonetheless, in some cases even
for optimality it might suffice for us to return only some promising strategies. Below a special
condition is highlighted under which it is optimal to just consider one adversary strategy at each
leaf in the hierarchical tree.
Lemma 6.1 Let $n_\theta$ represent the optimal allocation against type $\theta$ at the leaf. Let $n_\theta[C_\theta]$ be the part of the allocation that is assigned to screenees in screenee category $C_\theta$. If the strategy $n$ formed by putting together all $n_\theta[C_\theta]$, i.e., $n = \sum_\theta n_\theta[C_\theta]$, is feasible, then $n$ is the optimal defender strategy and the single adversary best response for each single type is the adversary best response in the overall game.
Proof 6.2 (Proof Sketch) First, note that the strategy $n$ achieves the payoff $\sum_\theta z_\theta s_\theta$, where $s_\theta$ is the defender utility in the restricted game with just the type $\theta$. Also, clearly $s_\theta$ is an upper bound on the defender utility for the restricted game. By the result from [39], the upper bound on the defender utility is $\sum_\theta z_\theta s_\theta$, which is achieved by $n$.
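A minimal sketch of the check behind Lemma 6.1, assuming a single time window and hypothetical per-type leaf allocations (the dictionaries and capacities below are purely illustrative): stitch together each type's optimal allocation restricted to its own screenee category and test whether the combined marginal respects the team capacities.

```python
# Hypothetical leaf solutions: for each adversary type, the part of its optimal
# marginal allocation assigned to that type's screenee category, per team type.
leaf_alloc = {
    "theta1": {"cat_A": {"team1": 6.0, "team2": 4.0}},
    "theta2": {"cat_B": {"team1": 3.0, "team2": 7.0}},
}
team_capacity = {"team1": 10.0, "team2": 12.0}   # assumed team capacities

# n = sum over types of n_theta[C_theta]: total screening load placed on each team.
load = {team: 0.0 for team in team_capacity}
for cat_alloc in leaf_alloc.values():
    for team_loads in cat_alloc.values():
        for team, amount in team_loads.items():
            load[team] += amount

feasible = all(load[team] <= team_capacity[team] + 1e-9 for team in team_capacity)
print("combined allocation feasible:", feasible)
# If feasible, Lemma 6.1 says this combined marginal is already the optimal
# defender strategy for the full game and no further joint search is needed.
```

If the check fails, the leaf solutions cannot simply be stitched together, and the Branch-and-Guide search over joint strategies proceeds as usual.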
Branch-and-Guide can be used as described earlier to return a smaller set of promising adversary strate-
gies, but the question that arises when using Branch-and-Guide at the leaf nodes is how to com-
pute the upper bounds for the nodes on the right of the tree (Recall for non-leaf nodes this upper
bound is computed from the upper bounds that are propagated up from each child). One ap-
proach to compute this upper bound is to adapt the ORIGAMI [43] approach; ORIGAMI is the
fastest technique for solving non-Bayesian general-sum security games without resource schedul-
ing constraints. The underlying idea is to solve an LP to minimize the utility of the adversary by
inducing the largest possible attack set (set of targets that are equally and most attractive for the
attacker) and [43] shows this provides the optimal defender utility in games without scheduling
constraints. However, even for a TSG with a single time window and a single adversary type,
ORIGAMI may not provide the optimal solution.
Therefore, ORIGAMI cannot be applied directly to find upper bounds on the adversary pure
strategies at the leaf nodes. Thus, UpperBoundLP is provided, shown below, to calculate the
upper bound at the leaf nodes in the hierarchical tree.
$\min_{n,q,s,x} \;\; k_{\theta'}$   (6.10)

s.t. $\; k_{\theta'} \geq x^w_{c,m} U^d_{\theta',c} + (1 - x^w_{c,m})\, U^u_{\theta',c} \quad \forall w, \forall c, \forall m$   (6.11)

$\; x^w_{c,m} = \sum_{t \in T} E^t_m \, n^w_{c,t} / N^w_c \quad \forall w, \forall c, \forall m$   (6.12)

$\; \sum_{t \in T} I^t_r \sum_{c \in C} n^w_{c,t} \leq L^w_r \quad \forall r, \forall w$   (6.13)

$\; \sum_{t \in T} n^w_{c,t} \leq N^w_c, \quad n^w_{c,t} \geq 0 \quad \forall c, \forall w$   (6.14)
The objective function (6.10) minimizes the attacker's utility for the adversary type $\theta'$. Equa-
tion (6.11) enforces that the adversary payoff be the maximal payoff for the adversary given a
marginal strategy n. Equations (6.12)-(6.14) enforce the resource constraints from the original
game. This LP uses a slightly modified formulation when enforcing the capacity constraints.
Namely, replace Equation (6.5) with (6.14), i.e., the constraint that every passenger in a given
category c for a time window w must be screened is relaxed.
Theorem 6.2 UpperBoundLP provides an upper bound on the screener utility $d_{\theta'}$ for a non-Bayesian TSG.
Proof 6.3 By relaxing constraint (6.5) to constraint (6.14), the attack set of the adversary can only
be expanded from the original formulation. This happens as Equation 6.14 allows for an adver-
sary to not be screened which can only increase their utility for targets, thus possibly increasing
their attack set. Since the adversary breaks ties in the screener’s favor this can only increase the
screener’s possible utility.
The above approach serves two purposes: it enormously reduces the set of strategies that are
sent up to the parent even if all the adversary strategies are evaluated in the Branch-and-Guide
tree, as the attack set is much smaller than the set of all feasible strategies. The running time is
also reduced, given the small attack set and the efficient LP to obtain this attack set and upper
bounds.
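To make UpperBoundLP concrete, here is a minimal sketch using scipy for a single time window and small, entirely hypothetical dimensions and payoffs; substituting Equation (6.12) directly into (6.11) is an implementation convenience of this sketch, and the dissertation's experiments instead solve the LPs with CPLEX.

```python
import numpy as np
from scipy.optimize import linprog

# Assumed toy instance: 2 screenee categories, 2 team types, 2 attack methods, 1 resource.
C, T, M = 2, 2, 2
N = np.array([10.0, 20.0])            # N_c: number of screenees in each category
E = np.array([[1.0, 0.0],             # E[t, m]: effectiveness of team t against method m
              [0.5, 1.0]])
U_u = np.array([5.0, 8.0])            # adversary utility if the attack goes undetected
U_d = np.array([0.0, 0.0])            # adversary utility if detected
I = np.array([1.0, 1.0])              # both team types consume the single resource
L = 15.0                              # capacity of that resource

# Variable vector: [k, n_{c,t} for all categories c and team types t].
n_vars = 1 + C * T
obj = np.zeros(n_vars)
obj[0] = 1.0                          # minimize k (Eq. 6.10)

def idx(c, t):
    return 1 + c * T + t              # position of n_{c,t} in the variable vector

A_ub, b_ub = [], []

# Eq. 6.11 with Eq. 6.12 substituted:
# k >= U^u_c + (U^d_c - U^u_c) * sum_t E[t, m] * n_{c,t} / N_c, for all c, m.
for c in range(C):
    for m in range(M):
        row = np.zeros(n_vars)
        row[0] = -1.0
        for t in range(T):
            row[idx(c, t)] = E[t, m] * (U_d[c] - U_u[c]) / N[c]
        A_ub.append(row)
        b_ub.append(-U_u[c])

# Eq. 6.13: resource capacity summed over all categories and team types.
row = np.zeros(n_vars)
for c in range(C):
    for t in range(T):
        row[idx(c, t)] = I[t]
A_ub.append(row)
b_ub.append(L)

# Eq. 6.14: sum_t n_{c,t} <= N_c (screening every screenee is no longer required).
for c in range(C):
    row = np.zeros(n_vars)
    for t in range(T):
        row[idx(c, t)] = 1.0
    A_ub.append(row)
    b_ub.append(N[c])

bounds = [(None, None)] + [(0, None)] * (C * T)   # k is free, n_{c,t} >= 0
res = linprog(obj, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
print("minimized adversary utility (basis for the upper bound):", res.fun)
```

Per Theorem 6.2, the relaxation in (6.14) can only enlarge the adversary's attack set, so the value obtained this way yields a valid upper bound for the leaf node.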
6.4.3 Tuning Non-leaf Node Computation
Section 6.4.2 focused on reducing the number of adversary pure strategies returned from the leaf
nodes. However, another very important area to prune the search space is at the interior nodes
in the binary tree. One way to approach this problem is using Branch-and-Guide in these nodes
and stopping the search once the current best solution is better than the next highest upper bound.
This can possibly provide significant speed-ups in terms of the computation speed of GATE but
it is not quite enough. Unfortunately, in the experiments it turned out that most of the joint
adversary pure strategies were evaluated in the interior nodes by MGA as the stopping condition
for Branch-and-Guide was almost never met.
Thus, inspiration is taken from column generation and security games literature [76] where
column generation is stopped if the solution quality does not change much with new columns
being added. This heuristic almost always provides a very good approximation. This same prin-
ciple is adopted: when evaluating adversary strategies in the Branch-and-Guide approach, if the current solution quality does not change over the next K strategies, then the computation is stopped, the current solution is declared the final solution, and the adversary strategies evaluated so far are propagated to the parent. This approach can also be adopted at the root. Here
K is a parameter that can be varied, but K = 30 is used for the experiments as it seems to work
the best for this problem. This approach serves two purposes: (1) it reduces the run time at each
intermediate node and the root node and (2) it reduces the number of adversary pure strategies
propagated up the tree.
GATE-H then allows us to solve large-scale Bayesian general-sum TSGs. Further, empiri-
cally these heuristics maintain high solution quality while decreasing the runtime by an order of
magnitude.
6.5 Evaluation
GATE-H is evaluated using synthetic examples from the passenger screening domain as real
world data is not available. The LPs and MILPs are solved using CPLEX 12.5 with the bar-
rier method, as this was found to work the best, on USC’s HPC Linux cluster limited to one
Hewlett-Packard SL230 node with 2 processors. The adversary and screener payoffs are gener-
ated uniformly at random with $U^u_a \in [2,11]$ and $U^u_s \in [-10,-1]$. For both the adversary and the screener, $U^d = 0$. The default values for the experiments are 6 adversary risk levels, 5 screening
resource types, 10 screening teams, 2 attack methods and 3 time windows unless otherwise spec-
ified. For all experiments the capacity resource constraints also remain constant. All results are
averaged over 20 randomly generated game instances.
6.5.1 Scaling Up and Solution Quality
The first experiment tests the scalability of each approach and provides solution quality informa-
tion for GATE-H relative to the MarginalStrategyMILP. This experiment provides information
about the trade-off between runtime and solution quality for the heuristic algorithm. The four
different variations of GATE that were tested are: (1) GATE which evaluates all adversary pure
strategies in each of the restricted games and only uses Branch-and-Guide at the root, (2) GATE-
H-BG which uses the Branch-and-Guide heuristic in all nodes, (3) GATE-H-K which uses the
K= 30 cutoff in all nodes and (4) GATE-H-BG-K which uses both Branch-and-Guide and the K
cutoff in all nodes. In Figure 6.3(a) the runtime results are shown for all of the algorithms. On
the x-axis the number of flights are varied from 60 to 120 in increments of 20. On the y-axis is
the runtime in seconds. For example, for 80 flights MarginalStrategyMILP takes almost 10,000
seconds to finish. GATE did not finish in any of the instances showing it cannot scale to large
TSG instances. GATE-H-BG also does not perform well as it fails to beat the average runtime of
the MILP over all of the instances. As can be seen, GATE-H-BG-30 and GATE-H-30 significantly reduce the runtime, with an average 10-fold improvement over all cases and GATE-H-BG-30 providing a 20-fold speed-up at 120 flights.
In Figure 6.3(b) the solution quality of the MILP is compared with GATE-H-BG-30 and
GATE-H-30. GATE and GATE-H-BG are not included as they did not finish in a majority of
instances, making it difficult to compare the solution quality.

Figure 6.3: (a) Runtime comparison (runtime in seconds vs. number of flights) between the MILP and the GATE variants. (b) Solution quality comparison (screener utility vs. number of flights) between the MILP, GATE-H-BG-30, and GATE-H-30.

On the x-axis the number of flights is varied from 60 to 120 in increments of 20. On the y-axis is the screener's utility.
for 60 flights GATE-H-BG-30 returns an average screener utility of -0.5974. As the graph shows
the average solution quality loss over these game instances is always less than .0411 for both
GATE-H-BG-30 and GATE-H-30 compared to the MILP. These results show that both GATE-H-
BG-30 and GATE-H-30 provide good approximations for large scale TSGs.
The next experiment aimed to test the ability of GATE-H-BG-30 and GATE-H-30 to scale up
to much larger TSG instances. The results are shown in Figure 6.4. On the x-axis the number of
flights is increased from 160 to 220. On the y-axis the average runtime in seconds is shown to
solve each of the TSG instances. For example, GATE-H-BG-30 took on average 2,396 seconds to
solve a game with 200 flights. An interesting trend here is that the runtime peaks at 180 flights and
starts to decrease afterward. This trend could be related to the resource saturation problem as seen
in other security games [38], where the observation is that resource optimization is easiest when
the resources available are comparatively small or equal to the number of targets and becomes
difficult when this is not the case.
Figure 6.4: Scaling Up to Larger TSG Instances
Figure 6.5: Solution Quality - Moving to Zero Sum
6.5.2 Moving Towards Zero Sum
The last experiment aimed to test what happens to the solution quality of GATE-H-BG-30 and
GATE-H-30 as the game payoffs move toward zero-sum. For this experiment, 40 flights are used
and there is one time window as the MILP does not scale in these instances. An r-coefficient is
varied from 0 to -0.9 in increments of -0.1, where r= 0 means there is no correlation between the
attacker's and screener's payoffs and r = -1 means the game is zero-sum (Note: r = -1 is not tested
as there is a specialized algorithm to deal with that case where the techniques in this chapter are
not useful). In previous experiments a correlation between the adversary’s and screener’s payoffs
is not explicitly set. On the x-axis is the r-coefficient and the y-axis shows the screener’s utility.
For example, when r= 0 the MarginalStrategyMILP returns a screener utility -0.889, GATE-H-
BG-30 returns a screener utility -0.917 and GATE-H-30 returns a solution quality of -0.931. In
this experiment an interesting trend appears. As the game moves toward zero-sum payoffs the
relative performance of both GATE-H-BG-30 and GATE-H-30 progressively worsens. However,
until the game payoffs are nearly zero-sum (r = -0.9), neither GATE-H variation has a solution
quality loss greater than 11.385%. This experiment again shows that both GATE-H-BG-30 and
GATE-H-30 provide good approximations in general-sum TSGs. (A careful reader might notice
in Figures 6.3(b) and 6.5 that there are slight differences in solution quality between GATE-
H-BG-30 and GATE-H-30, however these are not statistically significant.)
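The dissertation does not spell out the exact payoff generator behind the r-coefficient; one simple illustrative construction (purely an assumption for this sketch) draws correlated Gaussians and maps them onto the payoff ranges used in the experiments, so that r = 0 yields independent payoffs and r = -1 yields perfectly negatively correlated, effectively constant-sum payoffs (for intermediate values the correlation of the transformed payoffs is only approximately r).

```python
import numpy as np
from scipy.stats import norm

def correlated_payoffs(n, r, rng):
    # Draw a pair of standard normals with correlation r, then push each marginal
    # through the normal CDF onto the experiment's payoff ranges.
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    zc = r * z1 + np.sqrt(1.0 - r ** 2) * z2      # corr(z1, zc) = r by construction
    u_att = 2.0 + 9.0 * norm.cdf(z1)              # attacker undetected payoffs in [2, 11]
    u_scr = -10.0 + 9.0 * norm.cdf(zc)            # screener undetected payoffs in [-10, -1]
    return u_att, u_scr

rng = np.random.default_rng(0)
u_att, u_scr = correlated_payoffs(1000, -0.9, rng)
print("empirical correlation:", round(float(np.corrcoef(u_att, u_scr)[0, 1]), 3))
```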
6.6 Chapter Summary
The TSG model provides an extensible and adaptable model for game-theoretic screening in the
real world. It improves upon previous models in security games that fail to capture important
properties of the screening domain, e.g., the presence of non-player screenees in the game and
complex team capacity constraints. The model also improves on work done on threat screen-
ing, such as screening stadium patrons [68], cargo container screening [6], and screening airport
passengers [58, 57].
Previous work done on TSGs [22] focused on the Bayesian zero-sum case, and in this chap-
ter zero-sum TSGs are extended to the Bayesian general-sum case. Four contributions are
provided to accomplish this task: (1) the GATE algorithm which efficiently solves large scale
Bayesian general-sum TSGs, (2) the Branch-and-Guide approach which combines branch-and-
bound search and MGA in order to efficiently solve nodes in the hierarchical tree, (3) heuristics
that speed up the computation of GATE, and (4) experimental evaluation of GATE showing the
scalability of the game theoretic algorithm.
Using the aforementioned contributions this chapter presents a practical approach for solving
Bayesian general-sum TSGs that scales up to problem sizes encountered in the real world. Thus,
with this research I hope to increase the applicability of TSGs by providing techniques for solving
large scale Bayesian general-sum TSGs.
Chapter 7
Conclusion
7.1 Contributions
Protecting an organization’s cyber assets from malicious actors is one of the most significant and
pressing challenges for cybersecurity in the coming years. Previous research completed in cyber-
security has presented numerous approaches for resolving and mitigating problems the defender
faces throughout the phases of the Cyber Kill Chain. As alluded to earlier, however, these ap-
proaches do not adequately consider the adversarial component of cybersecurity which is crucial
in devising strong protection strategies against motivated and intelligent adversaries. My thesis
explores the use of game theory in the cybersecurity domain and resolves deficiencies in previous
game theoretic approaches for security domains to handle the complex challenges in allocation
problems faced by a network administrator. Crucially, my thesis advances the state of the art in
game theoretic approaches and algorithms for deceiving cyber adversaries and prioritizing cyber
alert resolution which provides a network administrator with more tools to thwart cyber adver-
saries.
First, to deceive cyber adversaries I introduce the novel Cyber Deception Game model that
provides a basis for strategic deception of an intelligent and motivated adversary conducting re-
connaissance on an enterprise network. This model gives an additional tool to the network admin-
istrator which allows them to thwart an adversary in the initial phase - the reconnaissance phase
- of the Cyber Kill Chain and increases the difficulty of successfully breaching the defender’s
network. To solve this problem, I show how to formulate the adversary’s knowledge acquisition
phase in a game framework. This gives us a way to then strategically interact with an adversary
attempting to hack our network, and most importantly, provides a way to strategically deceive
the adversary. I take (1) a robust approach to solve this problem by making the assumption that
we are facing a powerful adversary, e.g., a nation-state, who has distributional information
about the defender’s deceptive response scheme and (2) I consider a naive adversary, e.g., a script
kiddie, with a fixed set of preferences over the set of possible responses from systems on the net-
work. This initial formulation provides valuable insight into future work which will investigate
the interesting problem of varied adversarial knowledge states of the defender’s network.
Second, for prioritizing alerts raised from IDS across a defender’s network I provide the
novel Cyber-alert Allocation Game (CAG) model. This model gives a framework for how to
randomize the resolution of alerts by cyber analysts and makes it more difficult for a stealthy
adversary’s attack to go undetected. To achieve this advance, I solve fundamental problems in
the field of game theory for threat screening. An important previous model, the Threat Screening
Game (TSG), introduced a zero-sum Stackelberg game formulation to solve a problem where a
defender must allocate a set of “screening” resources to detect a possible attack coming into a
secure area that attempts to pose as a regular, i.e., "non-adversarial", object. The CAG model makes two important advances: it considers screening resources with heterogeneous screening times for the incoming objects, and it allows an attack by the adversary to appear as a probability distribution over alert types. The CAG framework provides a network administrator with a
useful tool in impeding cyber adversaries and protecting from massive breaches of sensitive user
information.
Lastly, I extend the applicability of the CAG framework to non zero-sum domains with the in-
troduction of the GATE algorithm. This work additionally applies to physical security domains as
it significantly improves the scalability of computing optimal SSE equilibria in Bayesian general-
sum TSGs. The GATE algorithm leverages the Marginal Guided Algorithm (MGA) given in [22]
which solves for an optimal marginal strategy for the defender in Bayesian zero-sum TSGs. From
here, I use a branch and bound algorithmic approach and a hierarchical decomposition adversary
type tree to prune a large portion of the adversary’s strategy space and solve larger-scale Bayesian
general-sum TSGs. Importantly, GATE allows for the application of Bayesian general-sum TSG
to problems with non-symmetric payoffs between the defender and an adversary which applies to
a wide array of screening domains. Looking to future work, it is important to solve other difficult
challenges arising for varying screening domains and problems in cybersecurity along with other
threat screening domains, and my work provides significant results to broaden the applicability
of the current state-of-the-art.
7.2 Future Work and Directions
One of the most interesting areas for future work beyond my thesis lies in the application of
the deceptive techniques to several other domains in cybersecurity. In this thesis, I explored the
use of deception for thwarting adversaries in the reconnaissance phase of a cyber attack. These
ideas, however, could be extended to conceal the true features of a network to adversaries who
may have already compromised systems in the defender’s enterprise network. In this situation,
the adversary would leverage the compromised nodes to complete further recon of the defender’s
network to identify machines which may be hidden from the outside. Deception here then could
be used to alter the network views [26] which are observed through network reconnaissance
conducted from the compromised nodes. In this way, the defender increases the uncertainty
throughout all phases of an adversary’s attack from recon to moving throughout the defender’s
network to identify important systems to compromise and subnets in the enterprise network.
In the future, it will also be important to better quantify the advantage gained by the defender
utilizing deceptive algorithms and techniques. This can be achieved through the development
of a test-bed network scenario where different deceptive algorithms can be tested against human
players. To this end, there has been some initial ground work on developing a network scenario
with a realistic network recon experience using the CyberVAN test-bed [25]. This environment also makes it easier to study how an adversary plays the game of network reconnaissance.
With this information, future research can learn strategies employed by cyber adversaries during
reconnaissance and gives better insight into quantifying the informational gain for adversaries
from network recon and attacks.
Beyond network defense, another interesting area of application lies in applying deceptive
principles to mitigate social engineering attacks which represents an area of significant impor-
tance for cybersecurity defenses. Organizations that suffer from targeted social engineering at-
tacks could leverage the use of deception to make it difficult for an adversary to ascertain the
true employees within an organization. This allows a defender to recognize when a particular
department, e.g., financial department, is under attack from an adversary’s campaign and mount
a better defense to protect against breaches by sending out timely warnings to its employees.
My work on cyber alert allocation is a crucial first step in applying game theory to real world
cyber security settings, but there remain significant challenges which need to be addressed in
future work. Firstly, I assume the time to resolve an alert is known exactly, but in the real world
there is uncertainty about how long it takes to resolve an alert. Second, the CAG model
assumes that attacks show up as known alert categories, but it is possible that in the real-world
some attacks may show up as “unknown” categories. The question then is how to assign these
alerts to analysts given we do not know which expert may have an expertise in dealing with this
type of attack. Lastly, in CAG’s there is not an overflow of alerts from one time period to the
next. Overflow of passengers for screening has been explored in the context of TSGs [56], but in
the context of cybersecurity, overflow also includes the possibility of unexpected alerts flooding
the system along with the expected alerts. In the real world, resolving alerts in a timely manner is
crucially important to limit the possible damage from a network intrusion.
Reference List
[1] Massimiliano Albanese, Ermanno Battista, and Sushil Jajodia. A deception based approach
for defeating os and service fingerprinting. In Communications and Network Security
(CNS), 2015 IEEE Conference on, pages 317–325. IEEE, 2015.
[2] Massimiliano Albanese, Ermanno Battista, and Sushil Jajodia. Deceiving attackers by cre-
ating a virtual attack surface. In Cyber Deception, pages 169–201. Springer, 2016.
[3] Mohammed H Almeshekah and Eugene H Spafford. Planning and integrating deception
into computer security defenses. In Proceedings of the 2014 Workshop on New Security
Paradigms Workshop, pages 127–138. ACM, 2014.
[4] Mohammed H Almeshekah and Eugene H Spafford. Cyber security deception. In Cyber
Deception, pages 25–52. Springer, 2016.
[5] Tansu Alpcan and Tamer Başar. Network security: A decision and game-theoretic approach.
Cambridge University Press, 2010.
[6] Saket Anand, David Madigan, Richard Mammone, Saumitr Pathak, and Fred Roberts. Ex-
perimental analysis of sequential decision making algorithms for port of entry inspection
procedures. In Intelligence and Security Informatics, pages 319–330. Springer, 2006.
[7] James P Anderson. Computer security threat monitoring and surveillance. Technical Re-
port, James P . Anderson Company, 1980.
[8] Michael J Assante and Robert M Lee. The industrial control system cyber kill chain. SANS
Institute InfoSec Reading Room, 1, 2015.
[9] Patrice Auffret. Sinfp, unification of active and passive operating system fingerprinting.
Journal in computer virology, 6(3):197–205, 2010.
[10] Erik B Bajalinov. Linear-Fractional Programming Theory, Methods, Applications and Soft-
ware, volume 84. Springer Science & Business Media, 2013.
[11] Egon Balas. Disjunctive programming and a hierarchy of relaxations for discrete optimiza-
tion problems. SIAM Journal on Algebraic Discrete Methods, 6(3):466–486, 1985.
[12] Daniel Barbara, Julia Couto, Sushil Jajodia, and Ningning Wu. An architecture for anomaly
detection. In Applications of Data Mining in Computer Security, pages 63–76. Springer,
2002.
[13] Daniel Barbara and Sushil Jajodia. Applications of data mining in computer security, vol-
ume 6. Springer Science & Business Media, 2002.
[14] Cynthia Barnhart, Ellis L Johnson, George L Nemhauser, Martin WP Savelsbergh, and
Pamela H Vance. Branch-and-price: Column generation for solving huge integer programs.
Operations research, 46(3):316–329, 1998.
[15] Nicola Basilico and Nicola Gatti. Automated abstractions for patrolling security games. In
AAAI, 2011.
[16] Nicola Basilico, Nicola Gatti, and Francesco Amigoni. Patrolling security games: Defi-
nition and algorithms for solving large instances with single patroller and single intruder.
Artificial Intelligence, 184:78–123, 2012.
[17] Jay Beale, Renaud Deraison, Haroon Meer, Roelof Temmingh, and Charl Van Der Walt.
Nessus network auditing. Syngress Publishing, 2004.
[18] David Barroso Berrueta. A practical approach for defeating nmap os- fingerprinting. Re-
trieved March, 12:2009, 2003.
[19] Christopher M Bishop. Pattern recognition and machine learning. springer, 2006.
[20] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge university
press, 2004.
[21] Vern Paxson. Bro: A system for detecting network intruders in real-time. In Proc. 7th USENIX
Security Symposium, 1998.
[22] Matthew Brown, Arunesh Sinha, Aaron Schlenker, and Milind Tambe. One size does not
fit all: A game-theoretic approach for dynamically and effectively screening for threats. In
AAAI conference on Artificial Intelligence (AAAI), 2016.
[23] Eric Budish, Yeon-Koo Che, Fuhito Kojima, and Paul Milgrom. Designing random alloca-
tion mechanisms: Theory and applications. The American Economic Review, 103(2):585–
623, 2013.
[24] Thomas E Carroll and Daniel Grosu. A game theoretic investigation of deception in network
security. Security and Communication Networks, 4(10):1162–1172, 2011.
[25] Ritu Chadha, Thomas Bowen, Cho-Yu J Chiang, Yitzchak M Gottlieb, Alex Poylisher, An-
gello Sapello, Constantin Serban, Shridatt Sugrim, Gary Walther, Lisa M Marvel, et al.
Cybervan: A cyber security virtual assured network testbed. In Military Communications
Conference, MILCOM 2016-2016 IEEE, pages 1125–1130. IEEE, 2016.
[26] Cho-Yu J Chiang, Yitzchak M Gottlieb, Shridatt James Sugrim, Ritu Chadha, Constantin
Serban, Alex Poylisher, Lisa M Marvel, and Jonathan Santos. Acyds: An adaptive cyber
deception system. In Military Communications Conference, MILCOM 2016-2016 IEEE,
pages 800–805. IEEE, 2016.
[27] Vincent Conitzer and Tuomas Sandholm. Computing the optimal strategy to commit to. In
Proceedings of the 7th ACM conference on Electronic commerce, pages 82–90. ACM, 2006.
[28] Anita D'Amico and Kirsten Whitley. The real work of computer network defense analysts.
In VizSEC 2007, pages 19–37. Springer, 2008.
[29] Dorothy E Denning. An intrusion-detection model. IEEE Transactions on software engi-
neering, (2):222–232, 1987.
[30] Roberto Di Pietro and Luigi V Mancini. Intrusion detection systems, volume 38. Springer
Science & Business Media, 2008.
[31] Karel Durkota, Viliam Lisý, Branislav Bošanský, and Christopher Kiekintveld. Approx-
imate solutions for attack graph games with imperfect information. In GameSec, pages
228–249. Springer, 2015.
[32] Karel Durkota, Viliam Lisý, Branislav Bošanský, and Christopher Kiekintveld. Optimal
network security hardening using attack graph games. In IJCAI, pages 526–532, 2015.
[33] Rajesh Ganesan, Sushil Jajodia, Ankit Shah, and Hasan Cam. Dynamic scheduling of cy-
bersecurity analysts for minimizing risk using reinforcement learning. ACM Transactions
on Intelligent Systems and Technology (TIST), 8(1):4, 2016.
[34] William B Haskell, Debarun Kar, Fei Fang, Milind Tambe, Sam Cheung, and Elizabeth
Denicola. Robust protection of fisheries with compass. In AAAI, pages 2978–2983, 2014.
[35] Steven Andrew Hofmeyr and Stephanie Forrest. An immunological model of distributed
detection and its application to computer security. PhD thesis, Citeseer, 1999.
[36] Wenjie Hu, Yihua Liao, and V Rao Vemuri. Robust anomaly detection using support vector
machines. In Proceedings of the international conference on machine learning, pages 282–
289, 2003.
[37] Manish Jain, Erim Kardes, Christopher Kiekintveld, Fernando Ordóñez, and Milind Tambe.
Security games with arbitrary schedules: A branch and price approach. In AAAI, 2010.
[38] Manish Jain, Kevin Leyton-Brown, and Milind Tambe. The deployment-to-saturation ratio
in security games. Target, 1(5):5, 2012.
[39] Manish Jain, Milind Tambe, and Christopher Kiekintveld. Quality-bounded solutions for
finite bayesian stackelberg games: Scaling up. In International Conference on Autonomous
Agents and Multiagent Systems, 2011.
[40] Sushil Jajodia, Noseong Park, Fabio Pierazzi, Andrea Pugliese, Edoardo Serra, Gerardo I
Simari, and VS Subrahmanian. A probabilistic logic of cyber deception. IEEE Transactions
on Information Forensics and Security, 12(11):2532–2544, 2017.
[41] Albert Xin Jiang, Zhengyu Yin, Chao Zhang, Milind Tambe, and Sarit Kraus. Game-
theoretic randomization for security patrolling with dynamic execution uncertainty. In
Proceedings of the 2013 international conference on Autonomous agents and multi-agent
systems, pages 207–214. International Foundation for Autonomous Agents and Multiagent
Systems, 2013.
[42] Rob Joyce. Disrupting nation state hackers. San Francisco, CA, 2016. USENIX Associa-
tion.
[43] Christopher Kiekintveld, Manish Jain, Jason Tsai, James Pita, Fernando Ordóñez, and
Milind Tambe. Computing optimal randomized resource allocations for massive security
games. AAMAS, 2009.
[44] Christopher Kiekintveld, Viliam Lisý, and Radek Píbil. Game-theoretic foundations for the
strategic use of honeypots in network security. In Cyber Warfare, pages 81–101. Springer,
2015.
[45] Dmytro Korzhyk, Vincent Conitzer, and Ronald Parr. Complexity of computing optimal
stackelberg strategies in security resource allocation games. In AAAI, 2010.
[46] Dmytro Korzhyk, Vincent Conitzer, and Ronald Parr. Security games with multiple attacker
resources. In IJCAI Proceedings-International Joint Conference on Artificial Intelligence,
volume 22, page 273, 2011.
[47] Dmytro Korzhyk, Zhengyu Yin, Christopher Kiekintveld, Vincent Conitzer, and Milind
Tambe. Stackelberg vs. nash in security games: An extended investigation of interchange-
ability, equivalence, and uniqueness. Journal of Artificial Intelligence Research, 2011.
[48] Aron Laszka, Jian Lou, and Yevgeniy Vorobeychik. Multi-defender strategic filtering
against spear-phishing attacks. In AAAI, 2016.
[49] Aron Laszka, Yevgeniy Vorobeychik, and Xenofon D Koutsoukos. Optimal personalized
filtering against spear-phishing attacks. In AAAI, pages 958–964, 2015.
[50] Joshua Letchford and Vincent Conitzer. Solving security games on graphs via marginal
probabilities. In AAAI, 2013.
[51] Joshua Letchford, Liam MacDermed, Vincent Conitzer, Ronald Parr, and Charles L Isbell.
Computing optimal strategies to commit to in stochastic games. In AAAI, 2012.
[52] Kong-wei Lye and Jeannette M Wing. Game strategies in network security. International
Journal of Information Security, 4(1-2):71–86, 2005.
[53] Gordon Fyodor Lyon. Nmap network scanning: The official Nmap project guide to network
discovery and security scanning. Insecure, 2009.
[54] Mandiant. APT1: Exposing one of China's cyber espionage units, 2013.
[55] Lockheed Martin. Cyber Kill Chain®. URL: http://cyber.lockheedmartin.com/hubfs/Gaining the Advantage Cyber Kill Chain.pdf, 2014.
[56] Sara Marie Mc Carthy, Phebe Vayanos, and Milind Tambe. Staying ahead of the game:
adaptive robust optimization for dynamic allocation of threat screening resources. In Pro-
ceedings of the 26th International Joint Conference on Artificial Intelligence, pages 3770–
3776. AAAI Press, 2017.
[57] Laura A McLay, Adrian J Lee, and Sheldon H Jacobson. Risk-based policies for airport
security checkpoint screening. Transportation science, 44(3):333–349, 2010.
[58] Xiaofeng Nie, Rajan Batta, Colin G Drury, and Li Lin. Passenger grouping with risk levels
in an airport security system. European Journal of Operational Research, 194(2):574–584,
2009.
[59] NIST. National Vulnerability Database, 2017. https://nvd.nist.gov/.
[60] Praveen Paruchuri, Jonathan P Pearce, Janusz Marecki, Milind Tambe, Fernando Ordonez,
and Sarit Kraus. Playing games for security: An efficient exact algorithm for solving
bayesian stackelberg games. In Proceedings of the 7th international joint conference on
Autonomous agents and multiagent systems-Volume 2, pages 895–902. International Foun-
dation for Autonomous Agents and Multiagent Systems, 2008.
[61] Robert M Patton, Justin M Beaver, Chad A Steed, Thomas E Potok, and Jim N Tread-
well. Hierarchical clustering and visualization of aggregate cyber data. In 2011 7th Inter-
national Wireless Communications and Mobile Computing Conference, pages 1287–1291.
IEEE, 2011.
[62] Jeffrey Pawlick and Quanyan Zhu. Deception by design: evidence-based signaling games
for network defense. arXiv preprint arXiv:1503.05458, 2015.
[63] Radek Píbil, Viliam Lisý, Christopher Kiekintveld, Branislav Bošanský, and Michal Pěchouček. Game theoretic model of strategic honeypot selection in computer networks. De-
cision and Game Theory for Security, 7638:201–220, 2012.
[64] James Pita, Manish Jain, Janusz Marecki, Fernando Ordóñez, Christopher Portway, Milind
Tambe, Craig Western, Praveen Paruchuri, and Sarit Kraus. Deployed armor protection: the
application of a game theoretic model for security at the los angeles international airport. In
Proceedings of the 7th international joint conference on Autonomous agents and multiagent
systems: industrial track, pages 125–132. International Foundation for Autonomous Agents
and Multiagent Systems, 2008.
[65] James Pita, Milind Tambe, Chris Kiekintveld, Shane Cullen, and Erin Steigerwald. Guards:
game theoretic security allocation on a national scale. In The 10th International Confer-
ence on Autonomous Agents and Multiagent Systems-Volume 1, pages 37–44. International
Foundation for Autonomous Agents and Multiagent Systems, 2011.
[66] Niels Provos. Honeyd-a virtual honeypot daemon. In 10th DFN-CERT Workshop, Ham-
burg, Germany, volume 2, page 4, 2003.
[67] Antonia Rana. What is amap and how does it fingerprint applications. SANS Institute, 2014.
[68] Brian C Ricks, Brian Nakamura, Alper Almaz, Robert DeMarco, Cindy Hui, Paul Kantor,
Alisa Matlin, Christie Nelson, Holly Powell, Fred Roberts, et al. Modeling the impact of
patron screening at an nfl stadium. In IIE Annual Conference. Proceedings, page 3086.
Institute of Industrial Engineers-Publisher, 2014.
[69] Michael Riley, Ben Elgin, Dune Lawrence, and Carol Matlock. Missed alarms and 40 mil-
lion stolen credit card numbers: How target blew it. http://www.zdnet.com/article/anatomy-
of-the-target-data-breach-missed-opportunities-and-lessons-learned/, 2014. Accessed:
2016-11-10.
[70] Rahul Savani and Bernhard von Stengel. Exponentially many steps for finding a Nash equi-
librium in a bimatrix game. In Foundations of Computer Science, 2004. Proceedings. 45th
Annual IEEE Symposium on, pages 258–267. IEEE, 2004.
[71] Aaron Schlenker, Matthew Brown, Arunesh Sinha, Milind Tambe, and Ruta Mehta. Get me
to my gate on time: Efficiently solving general-sum bayesian threat screening games. In
ECAI, pages 1476–1484, 2016.
[72] Aaron Schlenker, Omkar Thakoor, Haifeng Xu, Milind Tambe, Phebe Vayanos, Fei Fang,
Long Tran-Thanh, and Yevgeniy Vorobeychik. Deceiving cyber adversaries: A game theo-
retic approach. In International Conference on Autonomous Agents and Multiagent Systems,
2018.
[73] Aaron Schlenker, Haifeng Xu, Mina Guirguis, Chris Kiekintveld, Arunesh Sinha, Milind
Tambe, Solomon Sonya, Darryl Balderas, and Noah Dunstatter. Don't bury your head
in warnings: a game-theoretic approach for intelligent allocation of cyber-security alerts.
In Proceedings of the 26th International Joint Conference on Artificial Intelligence, pages
381–387. AAAI Press, 2017.
[74] Edoardo Serra, Sushil Jajodia, Andrea Pugliese, Antonino Rullo, and VS Subrahmanian.
Pareto-optimal adversarial defense of enterprise systems. ACM Transactions on Informa-
tion and System Security (TISSEC), 17(3):11, 2015.
[75] Eric Shieh, Bo An, Rong Yang, Milind Tambe, Craig Baldwin, Joseph DiRenzo, Ben Maule,
and Garrett Meyer. Protect: A deployed game theoretic system to protect the ports of the
united states. In Proceedings of the 11th International Conference on Autonomous Agents
and Multiagent Systems-Volume 1, pages 13–20. International Foundation for Autonomous
Agents and Multiagent Systems, 2012.
[76] Eric Shieh, Albert Xin Jiang, Amulya Yadav, Pradeep Varakantham, and Milind Tambe.
Unleashing dec-mdps in security games: Enabling effective defender teamwork. In Euro-
pean Conference on Artificial Intelligence (ECAI), 2014.
[77] Robin Sommer and Vern Paxson. Outside the closed world: On using machine learning
for network intrusion detection. In 2010 IEEE symposium on security and privacy, pages
305–316. IEEE, 2010.
[78] Georgios Spathoulas and Sokratis Katsikas. Methods for post-processing of alerts in intru-
sion detection: A survey. 2013.
[79] VS Subrahmanian, Michael Ovelgonne, Tudor Dumitras, and B Aditya Prakash. The Global
Cyber-Vulnerability Report. Springer, 2015.
[80] Milind Tambe. Security and game theory: algorithms, deployed systems, lessons learned.
Cambridge University Press, 2011.
[81] Jason Tsai, Christopher Kiekintveld, Fernando Ordonez, Milind Tambe, and Shyamsunder
Rathi. Iris-a tool for strategic security allocation in transportation networks. 2009.
[82] Bernhard von Stengel and Shmuel Zamir. Leadership with commitment to mixed strategies.
Technical report, Technical Report LSE-CDAM-2004-01, CDAM Research Report, 2004.
[83] Xiaowen Wang, Cen Song, and Jun Zhuang. Simulating a multi-stage screening network: A
queueing theory and game theory application. In Game Theoretic Analysis of Congestion,
Safety and Security, pages 55–80. Springer, 2015.
[84] Jianfa Wu, Dahao Peng, Zhuping Li, Li Zhao, and Huanzhang Ling. Network intrusion
detection based on a general regression neural network optimized by an improved artificial
immune algorithm. PloS one, 10(3):e0120976, 2015.
[85] Haifeng Xu. The mysteries of security games: Equilibrium computation becomes combi-
natorial algorithm design. EC '16, New York, NY, USA, 2016. ACM.
[86] Fyodor Yarochkin, Meder Kydyraliev, and Ofir Arkin. Xprobe project, 2014.
[87] Zhengyu Yin, Albert Xin Jiang, Matthew Paul Johnson, Christopher Kiekintveld, Kevin
Leyton-Brown, Tuomas Sandholm, Milind Tambe, and John P Sullivan. Trusts: Scheduling
randomized patrols for fare inspection in transit systems. In IAAI, 2012.
[88] Zhengyu Yin, Dmytro Korzhyk, Christopher Kiekintveld, Vincent Conitzer, and Milind
Tambe. Stackelberg vs. nash in security games: Interchangeability, equivalence, and
uniqueness. In AAMAS, pages 1139–1146. International Foundation for Autonomous
Agents and Multiagent Systems, 2010.
[89] Dajun Yue, Gonzalo Guillén-Gosálbez, and Fengqi You. Global optimization of large-
scale mixed-integer linear fractional programming problems: A reformulation-linearization
method and process scheduling applications. AIChE Journal, 59(11):4255–4272, 2013.
[90] Michal Zalewski. p0f v3 (version 3.08 b), 2014.
[91] Carson Zimmerman. Ten strategies of a world-class cybersecurity operations center.
MITRE corporate communications and public affairs. Appendices, 2014.