Thwarting adversaries with unpredictability: massive-scale game-theoretic algorithms for real-world security deployments
(USC Thesis Other)
Thwarting Adversaries with Unpredictability: Massive-scale Game-Theoretic Algorithms for Real-world Security Deployments

by

Manish Jain

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)

August 2013

Copyright 2013 Manish Jain

Acknowledgments

Today, I stand on the advice and support of many people. My research style, skills and direction have been forged after many discussions with my advisor, colleagues, friends and family. I would like to give a special thanks to my advisor Milind Tambe without whom I would not be the researcher I am today and this thesis would probably be non-existent. Milind, I thank you for your unending dedication to my progress and success, and for the countless hours you have spent on having discussions with me, advising me, debating with me, correcting my unpolished drafts and advertising my finished results. I have thoroughly enjoyed my PhD process, and a whole lot of credit goes to just you. You taught me what it meant to do research. You gave me the opportunity to work on a problem where I could see my research in action, a thought that still gives me goosebumps. I appreciate and thank you for how you sought out opportunities for awards, presentations and collaborations for me. I admire you for helping me grow as a presenter of my work. I still remember the rather embarrassing presentation at the Preference Handling Workshop at AAAI, 2008 when I was so nervous that I ran through my slides in about 10 minutes in a 25 minute slot; it has been a journey of progress since then. I also praise you for the efforts you put in to maintain an active social life in the group with our coffees, lunches, dinners, boat cruises and retreats, and how you keep engaged with the alumni of our research group. I cannot even begin to enumerate the things that I have learnt from you and would hopefully be emulating in my research career as I move forward.

I would also like to thank other members of my thesis committee, namely, Fernando Ordóñez, Vincent Conitzer, Bhaskar Krishnamachari and Mathew McCubbins for their helpful feedback and guidance. A special thanks to Fernando, since he was a tremendous guide to me when I started my research on developing scalable algorithms. You introduced me to large-scale optimization techniques of Operations Research, a tool that I have used extensively throughout my thesis work. We have collaborated since 2008 and I hope that our collaboration extends much further. Similarly, I would like to especially thank Vince for all his help through the years. You have been a part of multiple research efforts that have constituted my work, and this thesis would definitely not be in its current shape without you.

I would also like to thank the many collaborators I have had over the years. Apart from Milind, Fernando and Vince, I have been fortunate enough to work along with many researchers. I thank all of them whole-heartedly. This list includes Sarit Kraus, Makoto Yokoo, Kevin Leyton-Brown, Christopher Kiekintveld, Matthew E. Taylor, Bo An, Albert Xin Jiang, James Pita, Jason Tsai, Eric Shieh, Ondřej Vaněk, Zhengyu Yin, Branislav Bošanský, Michal Pěchouček, Dmytro Korzhyk, Rong Yang, Erim Kardes and Jun-young Kwak, among others. I also thank all the students who worked on developing the software assistants ARMOR and IRIS, including Christopher Portway, Craig Western, Shyamsundar Rathi, Parth Parimal Shah, You Zhou and Ripple Goyal.

I would also like to thank CREATE and the Federal Air Marshals Service for help and support over the years. They provided me not just with real-world problems of research interest, but also with the unique opportunity of deploying my research. My special thanks goes to James Curren of the Federal Air Marshals Service for his continued support of my research. I also want to thank Isaac Maya and Erroll Southers for their tireless promotion of game-theoretic randomization as the preferred approach for real-world security scheduling. I also thank my colleagues at USC and the entire Teamcore family. You have all helped make my PhD experience unique and fun for these many years.
I especially thank James Pita, Jason Tsai, Jun-young Kwak, Zhengyu Yin, Rong Yang, Matthew Brown and Eric Shieh, for you have all made my stay at USC special. I would also like to thank all the students I have advised for their inputs on my advising style and shortcomings.

Finally, I would like to thank my family and friends who have been a tremendous support for all these years. To my father Sudhir Jain, my mother Nilam Jain, my sister Mahima Jain and my grandma Nirmala Jain, you have all been always supportive of all my endeavors. Special thanks to my cousin Ripple Goyal for being there at USC for the past two years; you helped me especially in times when I was stressed and also treated me with great home-cooked food. I also would like to thank my friends William Yeoh, Nilesh Mishra, Vivek Kumar Singh and Megha Gupta who have all been a part of my life at USC. Thank you everyone for assisting me in pushing my limits and exploring the world around me. Without the unconditional love and support of all these people I would not have been able to get to where I am today.

Table of Contents

Acknowledgments
List of Figures
Abstract
Chapter 1: Introduction
  1.1 Problem Addressed
  1.2 Contributions
    1.2.1 Aspen
    1.2.2 Rugged and Snares
    1.2.3 Hbgs and Hbsa
    1.2.4 d:s and Phase Transition
    1.2.5 Real World Applications: ARMOR and IRIS
  1.3 Guide to thesis
Chapter 2: Background
  2.1 Motivating Domains
    2.1.1 Los Angeles International Airport (LAX):
    2.1.2 United States Federal Air Marshals Service (FAMS):
    2.1.3 United States Transportation Security Agency (TSA):
    2.1.4 United States Coast Guard:
  2.2 Stackelberg Games
  2.3 Strong Stackelberg Equilibrium (SSE)
  2.4 Stackelberg Security Games
  2.5 Security Problems with Arbitrary Scheduling Constraints (SPARS)
  2.6 Security Problems with Patrolling Constraints (SPPC)
    2.6.1 Payoff Structure
  2.7 Game Model 1: Multiple Attackers
  2.8 Network Security Domain
  2.9 Baseline Algorithms
    2.9.1 Multiple LPs approach
    2.9.2 Dobss: Mixed Integer Linear Program
    2.9.3 Eraser Mixed Integer Linear Program
    2.9.4 Ranger solution approach
Chapter 3: Strategy Generation for One Player
  3.1 SPARS domain
    3.1.1 ASPEN Column Generation
      3.1.1.1 Master Problem:
      3.1.1.2 Slave Problem:
    3.1.2 Improving Branching and Bounds
  3.2 Column generation for joint patrolling schedules
  3.3 SPPC Domain
    3.3.1 Column Generation:
      3.3.1.1 Master Formulation:
      3.3.1.2 Slave Formulation:
  3.4 Experimental Results
    3.4.1 Comparison Results
    3.4.2 ASPEN on Large SPARS Instances:
Chapter 4: Strategy generation for both players
  4.1 Ranger Counterexample
  4.2 Rugged
    4.2.1 Algorithm Description
    4.2.2 Core LP
    4.2.3 Defender Oracle
    4.2.4 Attacker Oracle
  4.3 Evaluation of Rugged
    4.3.1 Comparison with RANGER
    4.3.2 Scale-up and analysis
    4.3.3 Algorithm Dynamics Analysis
  4.4 Snares
    4.4.1 Algorithm Description
    4.4.2 Warm-starting using mincut-fanout
    4.4.3 Using "Better" Responses
      4.4.3.1 Better Response for the Defender
      4.4.3.2 Better Response for the Attacker
    4.4.4 Evaluation of Snares
      4.4.4.1 Analysis of Components of Snares
      4.4.4.2 Scalability in Simulation:
      4.4.4.3 Real Data
Chapter 5: Solving Bayesian Games
  5.1 Solving Bayesian-SPNSC Problem Instances
    5.1.1 Bayesian Game Computation
    5.1.2 Hbgs Solution Methodology
      5.1.2.1 Hierarchical Type Trees
      5.1.2.2 Pruning a Bayesian Game
      5.1.2.3 Hbgs Description
    5.1.3 Experimental Results
      5.1.3.1 Hbgs Scale-up
      5.1.3.2 Approximations
  5.2 Solving Bayesian-SPARS Problem Instances
    5.2.1 Bayesian-Aspen
      5.2.1.1 Bayesian-Aspen Column Generation
      5.2.1.2 Improving Branching and Bounds
    5.2.2 Experimental results
      5.2.2.1 Scale-up in number of targets:
      5.2.2.2 Scale-up in number of types:
      5.2.2.3 Effects of using different hierarchies
      5.2.2.4 Different arrangement of types in Hbsa
  5.3 Bayesian SPPC Domain
    5.3.1 Master Formulation
    5.3.2 Slave
Chapter 6: d:s and Phase Transition
  6.1 Deployment to Saturation Ratio
    6.1.1 SPNSC Domain
      6.1.1.1 General sum representation.
      6.1.1.2 Security game compact representation.
    6.1.2 SPARS Domain
    6.1.3 SPPC Domain
  6.2 Implications of the findings
  6.3 Phase Transitions
    6.3.1 Phase Transitions in Security Games
Chapter 7: Related Work
  7.1 Computation in Stackelberg Security Games
    7.1.1 Efficient computation of Stackelberg equilibria
    7.1.2 Stackelberg equilibria modeling humans
    7.1.3 Computation of robust solutions
  7.2 Stackelberg games in other security scenarios
  7.3 Other optimization techniques for security domains
Chapter 8: Conclusions
  8.1 Contributions
  8.2 Future Plans
    8.2.1 Scalable behavioral game theory
    8.2.2 Stochastic coalitional game theory
    8.2.3 Spatiotemporal game theory
Bibliography

List of Figures

1.1 The image shows examples of three domains which have inspired my research: in all these three domains, security agencies need to deploy limited resources to protect from potential adversaries.
1.2 The image shows examples of three security domains where my research has been successfully applied.
1.3 The figure shows the screenshots of ARMOR and IRIS, the deployed software assistants.
2.1 Example 1
3.1 Working of Branch and Price for a non-Bayesian SPARS problem instance.
3.2 Example Minimum Cost Network Flow Graph for a SPARS problem instance.
3.3 This figure shows an example network-flow based slave formulation. There are as many levels in the graphs as the number of targets. Each node represents a specific target. A path from the source to the sink maps to a tour taken by the defender.
3.4 Comparison between ERASER-C, Aspen and BnP for SPARS problem instances with 10 resources. The number of schedules for these experiments was two times the number of targets. In this figure and others with y-axis on the log scale, error bars are not shown since they are not prominent because of the logarithmic axis. In this experiment, the difference in runtime for all pair-wise comparisons was statistically significant except between ERASER-C and Aspen for 40 and 50 targets.
3.5 These figures present results comparing the runtime required by Aspen, Aspen without the ORIGAMI-S branch-and-bound heuristic (BnP) and ERASER-C with column generation on SPARS instances with 2 schedules per target. The y-axis shows the runtime in seconds in log scale, whereas the x-axis varies an input parameter, as specified in the figure. Again, we do not show the error bars because of the log scale of the y-axis, but Aspen was the fastest algorithm with statistical significance.
3.6 These figures present the runtime results for Aspen when the size of the input problem is scaled. The y-axis shows the runtime in seconds, whereas the x-axis varies an input parameter, as specified in the figure.
4.1 This example is solved incorrectly by Ranger. The variables $a, b$ are the coverage probabilities on the corresponding edges.
4.2 The possible allocations of two resources to the four edges. The blocked edges are shown in bold. The probabilities ($x$ or $y$) are shown next to each allocation.
4.3 A defender oracle problem instance corresponding to the SET-COVER instance with $U = \{1,2,3\}$, $S = \{\{1\},\{2\},\{3\},\{1,2\},\{1,3\}\}$.
Here, the attacker's mixed strategy uses three paths: $(e_1, e_{1,2}, e_{1,3}, e'_1)$, $(e_2, e_{1,2}, e'_2)$, $(e_3, e_{1,3}, e'_3)$. Thus, the SET-COVER instance has a solution of size 2 (for example, using $\{1,2\}$ and $\{1,3\}$); correspondingly, with 2 resources, the defender can always capture the attacker (for example, by covering $e_{1,2}$, $e_{1,3}$).
4.4 An example graph corresponding to the CNF formula $(x_1 \lor \neg x_2 \lor \neg x_3) \land (x_1 \lor x_2 \lor x_4)$
4.5 Example graph of Southern Mumbai with 455 nodes. Sources are depicted as green arrows and targets are red bulls-eyes. Best viewed in color.
4.6 Results. Figure (a) shows the scale-up analysis on WFC graphs of different sizes. Figure (b) shows the convergence of oracle values to the final game value and the anytime bounds. Figure (c) compares the runtimes of oracles and the core LP.
4.7 Comparison between Snares and state-of-the-art: Snares can now scale to solve problems the size of full cities where previous work could only scale to the southern tip of Mumbai.
4.8 Flow Chart for the Snares Algorithm
4.9 The contributions of individual components of Snares. Rugged is used as a baseline.
4.10 The runtime required by Snares as the input problem size is varied.
4.11 Varying Number of targets.
5.1 Example tree representing the pure strategies for the attacker in a Bayesian Stackelberg security game.
5.2 Examples of possible hierarchical type trees generated in Hbgs. The root node is the original Bayesian-SPARS problem instance, $G(\Theta, \Lambda)$. Every other node is a restricted Bayesian game. Figure 5.2(a) shows a depth-one partitioning, where $G(\Theta, \Lambda)$ is decomposed into four restricted games $G(\Theta, \Lambda_i)$, $i \in \{1..4\}$.
Figure 5.2(b), on the other hand, shows the full binary partitioning where the original $G(\Theta, \Lambda)$ is decomposed into two restricted games, which are then further re-decomposed into two smaller restricted games.
5.3 This plot shows the comparisons in performance of the four algorithms when the size of the input problem is scaled.
5.4 This plot shows the comparisons of Hbgs and its approximation variants.
5.5 This plot shows the comparisons in performance of the three algorithms when the input problem is scaled.
5.6 A Depth-Two tree for 8 attacker types.
5.7 The figure shows the impact in runtime when the distribution of trees is varied in the hierarchy used in Hbsa. These results are for 8 attacker types. While I again do not show the error bars in this figure, depth-two hierarchy was faster for 10 and 15 targets whereas full binary hierarchy was faster for 20 targets with statistical significance.
5.8 The figure shows the impact in runtime when the distribution of types is varied in the hierarchy used in Hbsa. We, again, do not show the error bars because of the log scale of the y-axis, but the similar type arrangement was faster than the dissimilar type arrangement for 10, 15 and 20 targets with statistical significance.
6.1 Average running time of DOBSS, ASPEN and multiple-attacker branch-and-price algorithm for SPNSC, SPARS and SPPC domains respectively.
6.2 Average runtime of computing the optimal solution for a SPNSC problem instance. The vertical dotted line shows d:s = 0.5.
6.3 Average runtime of computing the optimal solution for a SPARS game using ASPEN. The vertical dotted line shows d:s = 0.5.
6.4 Average runtime for computing the optimal solution for a patrolling domain. The vertical dotted line shows d:s = 0.5.
6.5 The difference between expected defender utilities from ERASER and a naïve randomization policy. The vertical line shows d:s = 0.5.
6.6 Average runtime of computing the optimal solution for a SPNSC problem instance. The vertical dotted line shows d:s = 0.5.
6.7 Average runtime of computing the optimal solution for a SPARS game using ASPEN. The vertical dotted line shows d:s = 0.5.
6.8 Average runtime for computing the optimal solution for a patrolling domain. The vertical dotted line shows d:s = 0.5.
6.9 ERASER results with varying number of attacker types.
6.10 Probability that the decision problem $\mathrm{SSE}(D^*)$ is soluble for SPNSC instances of three problem sizes. The phase transition gets sharper as the problem size increases.

Abstract

Protecting critical infrastructure and targets such as airports, transportation networks, power generation facilities as well as critical natural resources and endangered species is an important task for police and security agencies worldwide. Securing such potential targets using limited resources against intelligent adversaries in the presence of the uncertainty and complexities of the real-world is a major challenge. My research uses a game-theoretic framework to model the strategic interaction between a defender (or security forces) and an attacker (or terrorist adversary) in security domains.

Game theory provides a sound mathematical approach for deploying limited security resources to maximize their effectiveness. While game theory has always been popular in the arena of security, unfortunately, the state of the art algorithms fail either to scale or to provide a correct solution for large problems with arbitrary scheduling constraints. For example, US carriers fly over 27,000 domestic and 2,000 international flights daily, presenting a massive scheduling challenge for the Federal Air Marshal Service (FAMS).

My thesis contributes to a very new area that solves game-theoretic problems using insights from large-scale optimization literature towards addressing the computational challenge posed by real-world domains. I have developed new models and algorithms that compute optimal strategies for scheduling defender resources in large real-world domains. My thesis makes the following contributions. First, it presents new algorithms that can solve for trillions of actions for both the defender and the attacker. Second, it presents a hierarchical framework that provides orders of magnitude scale-up in attacker types for Bayesian Stackelberg games. Third, it provides an analysis and detection of a phase-transition that identifies properties that make security games hard to solve.

These new models have not only advanced the state of the art in computational game-theory, but have actually been successfully deployed in the real-world. My work represents a successful transition from game-theoretic advancements to real-world applications that are already in use, and it has opened exciting new avenues to greatly expand the reach of game theory. For instance, my algorithms are used in the IRIS system: IRIS has been in use by the Federal Air Marshals Service (FAMS) to schedule air marshals on board international commercial flights since October 2009.

Chapter 1: Introduction

Protecting critical infrastructure and targets such as airports, transportation networks, power generation facilities as well as critical natural resources and endangered species is an important task for police and security agencies worldwide. Securing such potential targets using limited resources against intelligent adversaries in the presence of the uncertainty and complexities of the real-world is a major challenge. For example, in 2001, the 9/11 attack on the World Trade Center in New York City via commercial airliners resulted in $27.2 billion of direct short term costs [Looney, 2002] as well as a loss of 2,974 lives.
The 2004 Madrid commuter train bombings resulted in 191 lives lost, 1755 wounded, and an estimated cost of 212 million Euros [Blanco et al., 2007]. Finally, the 2008 terrorist attacks in Mumbai resulted in 195 lives lost and nearly 300 wounded [Chandran and Beitchman, 29 November 2008]. Measures for protecting potential target areas include monitoring entrances and inbound roads, checking inbound traffic and patrolling aboard transportation vehicles. While there are many more important security scenarios, the common problem among them is that security agencies have limited security resources to protect their critical infrastructure against an adaptive adversary who conducts surveillance and responds to the strategy followed by the security personnel.

Figure 1.1: The image shows examples of three domains which have inspired my research: in all these three domains, security agencies need to deploy limited resources to protect from potential adversaries. Panels: (a) Transportation Networks, (b) Cyber-security, (c) Sustainable Fishing.

1.1 Problem Addressed

Game theory provides a sound mathematical approach for deploying limited security resources to maximize their effectiveness. Indeed, my research uses game-theoretic techniques for computing optimal strategies for deploying defender resources. While this connection between game theory and security has existed for the last several decades, research on computational approaches to game theory in the past two decades has enabled very large-scale problems to be cast in game-theoretic contexts, thus providing the computational tools to address problems of security allocations.

My contributions have been at the forefront of this research applying game theory for security. I have been involved in the development of actual deployed applications of game theory for security. The ARMOR (Assistant for Randomized Monitoring Over Routes) software scheduling assistant [Pita et al., 2008] has been successfully deployed at the Los Angeles International Airport (LAX) since 2007; in particular, ARMOR uses game theory to randomize allocation of police checkpoints and canine units.
I also led the development and deployment of IRIS (Intelligent Randomization in Scheduling) [Jain et al., 2010c], which has been used by the US Federal Air Marshal Service since 2009 to deploy air marshals on US air carriers. The success of ARMOR and IRIS has paved the way for more deployed applications: GUARDS for the US Transportation Security Administration is being evaluated for a national deployment across all airports [Pita et al., 2011b]. Similarly, PROTECT [Shieh et al., 2012], for the United States Coast Guard, is under deployment at the ports of Boston, New York and Los Angeles Long Beach and is expected to be taken nation-wide; TRUSTS [Yin et al., 2012b] is being used by the Los Angeles Sheriff's department for conducting patrols on metro trains; and many other agencies around the globe are now looking to deploy these techniques.

This set of applications and associated algorithms has added to the already significant interest in game theory for security. They use the framework of Stackelberg games, which were first introduced to model leadership and commitment [von Stackelberg, 1934], to model the problem faced by the security agencies. While I provide a formal definition for Stackelberg games in Chapter 2, they are a natural model for such security problems because they model the commitment a defender (e.g., security agency) must make in allocating her resources before an attacker can conduct surveillance and choose his best attack strategy, considering the action chosen by the defender¹. Beyond the deployed real-world applications, Stackelberg games have been used to study security problems ranging from "police and robbers" scenarios [Gatti, 2008; Paruchuri et al., 2008b; Basilico et al., 2009] to computer network security [Lye and Wing, 2005], to missile defense systems [Brown et al., 2005a] and anti-terrorism policies [Sandler and M., 2003].
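
The commitment structure described above can be illustrated with a toy example. The following is a minimal sketch, not the formulation used in the thesis: the two targets, all payoff numbers and the helper names `expected_utilities` and `attacker_best_response` are assumptions made up for illustration, and ties are assumed to be broken in the defender's favor (the strong Stackelberg convention). It suggests why committing to a randomized coverage can leave the defender better off than committing to guarding a single target.

```python
# Toy Stackelberg security game: the defender commits to coverage probabilities,
# the attacker observes them and attacks the target that is best for him.
# All payoff numbers below are hypothetical.

# target -> (defender if covered, defender if uncovered,
#            attacker if covered, attacker if uncovered)
PAYOFFS = {
    "A": (0.0, -10.0, -5.0, 10.0),
    "B": (0.0, -4.0, -5.0, 4.0),
}

def expected_utilities(target, coverage):
    """Return (defender, attacker) expected utility if `target` is attacked."""
    d_cov, d_unc, a_cov, a_unc = PAYOFFS[target]
    c = coverage[target]
    return c * d_cov + (1 - c) * d_unc, c * a_cov + (1 - c) * a_unc

def attacker_best_response(coverage, tol=1e-9):
    """Attacker maximizes his own utility; ties go to the defender (assumed convention)."""
    best_target, best_def, best_att = None, None, None
    for target in PAYOFFS:
        d, a = expected_utilities(target, coverage)
        strictly_better = best_target is None or a > best_att + tol
        tie_for_defender = (
            best_target is not None and abs(a - best_att) <= tol and d > best_def
        )
        if strictly_better or tie_for_defender:
            best_target, best_def, best_att = target, d, a
    return best_target, best_def

# Compare two deterministic commitments with one randomized commitment.
for name, cov in [
    ("always guard A", {"A": 1.0, "B": 0.0}),
    ("always guard B", {"A": 0.0, "B": 1.0}),
    ("randomize 5/8 : 3/8", {"A": 0.625, "B": 0.375}),
]:
    target, value = attacker_best_response(cov)
    print(f"{name:>20}: attacker hits {target}, defender gets {value}")
```

Under these made-up payoffs, always guarding A leaves the defender with -4 and always guarding B with -10, while the 5/8 : 3/8 mix makes the attacker indifferent and yields -2.5 for the defender.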

¹ By convention, I refer to the defender (leader) as she and the attacker (follower) as he.

While Stackelberg games have been successfully used to model the security resource allocation problem, newer and more complex real-world application domains make it challenging for existing techniques for Stackelberg games to be applied, and thus novel techniques to solve large games are required. Real world problems, like scheduling air marshals to protect flights or protecting computer networks, present trillions of pure strategies to either one player or to both the defender and the attacker. Such large problem instances cannot even be represented in modern computers, let alone solved using previous techniques. I have provided new models and algorithms that compute optimal defender strategies for massive real-world security domains. In particular, I have addressed the following problems:

1. Compute optimal security scheduling strategies in domains with a very large number of pure strategies (up to trillions of actions) for the defender. Let us use the problem faced by the Federal Air Marshals Service (FAMS) as an example domain where the number of pure strategies for the defender is very large. The FAMS schedules armed officers on-board passenger aircraft. The enormity of the challenge faced by the FAMS can be revealed by a small example: an instance with 100 flights and 10 officers would have more than a billion possible assignments of air marshals to flights; in reality, there are an estimated 3,000–4,000 officers and about 30,000 flights [Keteyian, 2010].

2. Compute optimal security scheduling strategies in domains with a very large number of pure strategies (up to billions of actions) for the defender as well as the attacker. An example domain requiring algorithms that can scale in number of pure strategies for both players is the security resource allocation problem faced when protecting targets in a city. In response to the attacks in 2008, the Mumbai police have started to schedule a limited number of inspection checkpoints on the road network throughout the city.
Since the police could schedule any combination of checkpoints on the roads, they have exponentially many choices. Similarly, the attacker has exponentially many choices: a path from any source to any target is a feasible attacker strategy. Similar problems are faced in other domains like cyber-security as well, especially when there are multiple defender resources to be scheduled.

3. Compute optimal security scheduling strategies in domains with a very large number of attacker types: A large number of attacker types are required to model uncertainty in the real-world. For example, the police may be facing either a well-funded hard-lined terrorist or criminals from local gangs. These two groups may have different capabilities, and the police may not know which attacker group they will be facing on any given day. These different attacker preferences, or simply uncertainty over the attacker, are modeled as different attacker types using a Bayesian Stackelberg game. Computing optimal strategies in Bayesian Stackelberg games has already been proven to be NP-hard [Conitzer and Sandholm, 2006].

4. Finally, broadly identifying properties that make Stackelberg game problem instances computationally challenging is an important problem. The objective is to identify properties that are invariant to different domains, algorithms, solvers and even equilibrium computation methods, but impact the runtime required to compute solutions for a problem instance. An understanding of the runtime required by game-theoretic algorithms is critical to furthering the application of game theory to other real-world domains.

In summary, existing techniques to compute optimal defender strategies in security domains fail to scale to real-world domains. My Ph.D. thesis has focused on addressing these challenges, and has made contributions by developing scalable game theoretic algorithms.

1.2 Contributions

Game theoretic models for real-world problems can have trillions of pure strategies for both players. Such large models cannot even be stored in the memory of computers today, let alone be solved by existing algorithms.
I have provided algorithms especially designed to scale up to exponentially many pure strategies for both players, as required to compute solutions for large real-world domains. These algorithms avoid representing the entire game in memory, while computing optimal solutions for the entire large problem. These algorithms are built on the following insights:

(i) Real-world domains have exponentially many pure strategies for the defender (e.g. a combination of checkpoints), and so, an incremental approach of generating pure strategies of the defender is required. This will avoid enumerating all the pure strategies, and will only add a pure strategy if the pure strategy would help increase defender payoff.
(ii) In domains with exponentially many attacker pure strategies, an incremental approach to generate pure strategies for the attacker (e.g. attack paths) should be used to avoid enumeration of the pure strategy set of the attacker.
(iii) A Bayesian Stackelberg game can be decomposed into hierarchically-organized smaller games, each with a smaller number of attacker types, providing heuristics which can be used to eliminate the never-best-response (that is, dominated) pure strategies of the attacker.

These insights provide speed-ups by reducing the size of the game: while insights (i) and (ii) restrict the game size by efficiently generating sub-games that include a pure strategy only if it improves the player's payoff, insight (iii) pre-processes the input Bayesian Stackelberg game instance and removes the attacker pure strategies that cannot be part of the optimal solution. Additionally, all these techniques provide mathematical guarantees and can generate optimal as well as approximate solutions efficiently. Furthermore, I have investigated what properties of Stackelberg security game instances make them hard to solve in practice, across different input sizes and different security domains. The algorithms developed in my thesis have also been deployed in the real-world.
Specifically, the contributions of my thesis are as follows:

Figure 1.2: Examples of three security domains where my research has been successfully applied. (a) Airport security domain: placement of vehicular checkpoints at Los Angeles International Airport. (b) FAMS security domain: all the international flights leaving Chicago O'Hare on a day. (c) Network security domain: the entire road network of the city of Mumbai.

1.2.1 Aspen

The Aspen algorithm computes optimal strategies for the defender in domains where the number of pure strategies of the defender can be prohibitively large. It provides the scale-up using the first insight mentioned above, namely strategy generation for the defender [Jain et al., 2010b]. As an illustrative example of a real-world domain with exponentially many defender strategies, let us consider the problem faced by the Federal Air Marshals Service (FAMS). There are currently tens of thousands of commercial flights flying each day, and public estimates state that there are thousands of air marshals that are scheduled daily by the FAMS [Keteyian, 2010]. Air marshals must be scheduled on tours of flights that obey logistical constraints (e.g., the time required to board, fly, and disembark). An example of a valid schedule is an air marshal assigned to a round-trip tour from Los Angeles to New York and back.

The scale of the domain is massive, since there are billions of possible assignments of air marshals to flight tours. For example, in our illustrative example of the FAMS, the number of ways in which 10 air marshals can be scheduled over 100 flights is $\binom{100}{10} \approx 1.7 \times 10^{4}$ billion. Simply finding schedules for the marshals is a computational challenge. The task is made more difficult by the need to find an optimal strategy over these schedules that meets the scheduling constraints of the domain, while also accounting for an adaptive attacker and the different payoff values of each flight.

I cast this problem as a security game, described in Section 2.4, where the attacker can choose any of the flights to attack, and each air marshal can cover one schedule.
Each schedule here is a feasible set of targets that can be covered together; in the FAMS example, each schedule would represent a flight tour that an air marshal could fly while satisfying all the logistical constraints. A joint schedule then would assign every air marshal to a flight tour, and there could be exponentially many joint schedules in the domain. A pure strategy for the defender in this security game is a joint schedule.

Since all the defender pure strategies (or joint schedules) cannot be enumerated for such massive problems, Aspen starts by representing only a few pure strategies of the defender and then incrementally generates the required pure strategies [Jain et al., 2010b]. Aspen decomposes the problem into a master problem and a slave problem, which are then solved iteratively, as described in Chapter 3. Given a limited number of pure strategies, the master solves for the defender and the attacker optimization constraints, while the slave is used to generate a new pure strategy for the defender in every iteration.

The master in Aspen operates on the pure strategies (joint schedules) generated thus far; its objective is to compute the optimal mixed strategy of the defender over these pure strategies. The objective of the slave problem is to generate the best joint schedule to add to the set of generated pure strategies. The best joint schedule is identified using the concept of reduced costs [Bertsimas and Tsitsiklis, 1994], which measure whether a pure strategy can potentially increase the defender's expected utility (the details of the approach are provided in Chapter 3). While a naive approach would be to iterate over all possible pure strategies to identify the pure strategy with the maximum potential, Aspen uses a novel minimum-cost integer flow problem to efficiently identify the best pure strategy to add. Aspen always converges on the optimal mixed strategy for the defender, as described in Chapter 3.

Employing strategy generation for large optimization problems is not an "out-of-the-box" approach; the problem has to be formulated in a way that allows for domain properties to be exploited.
The novel contribution of Aspen is to provide a linear formulation for the master and a minimum-cost integer flow formulation for the slave, which enable the application of strategy generation techniques. Additionally, Aspen also provides a branch-and-bound heuristic to reason over attacker actions. This branch-and-bound heuristic provides a further order of magnitude speed-up, allowing Aspen to handle the massive sizes of real-world problems. Indeed, Aspen is currently being used by the FAMS to schedule air marshals on international flights.

1.2.2 Rugged and Snares

Rugged is designed for domains where the number of pure strategies of each player is exponentially large, and it does strategy generation for both the defender and the attacker [Jain et al., 2011b]. Rugged then serves as a building block for Snares, which improves on Rugged by making novel use of two approaches: warm starts and greedy responses [Jain et al., 2013]. Snares can now efficiently compute solutions for extremely large problems with trillions of pure strategies for both players. I also evaluate the performance of Snares in real-world networks, illustrating a significant advance over state-of-the-art algorithms.

Let us consider the urban network security game as an illustrative example. In this domain, the pure strategies of the defender correspond to allocations of resources to edges in the network, for example an allocation of police checkpoints to roads in the city. The pure strategies of the attacker correspond to paths from any source node to any target node, for example a path from a landing spot on the coast to the airport. The pure strategy space of the defender grows exponentially with the number of available resources, whereas the pure strategy space of the attacker grows exponentially with the size of the network. For example, in a fully connected graph with 20 nodes and 190 edges, the number of defender pure strategies for only 5 resources is $\binom{190}{5} \approx 2$ billion, while the number of possible attacker paths without any cycles is $\approx 6.6 \times 10^{18}$.
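The strategy-space sizes quoted in Sections 1.2.1 and 1.2.2 can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the helper name `simple_paths_complete_graph` is hypothetical, and the attacker-path count assumes (an interpretation that is not stated in the text) that cycle-free paths are counted over every ordered source-target pair of the fully connected 20-node graph. Under that assumption the count lands at roughly 6.6e18.

```python
from math import comb, factorial


def simple_paths_complete_graph(n: int) -> int:
    """Count cycle-free paths over all ordered (source, target) pairs of K_n.

    Assumption (not from the thesis): every ordered pair of distinct nodes
    is a possible source/target pair.
    """
    # For one fixed pair, a path visits k of the remaining n - 2 nodes,
    # in some order, for k = 0 .. n - 2.
    per_pair = sum(factorial(n - 2) // factorial(n - 2 - k) for k in range(n - 1))
    return n * (n - 1) * per_pair


# Ways to place 10 air marshals over 100 flights (Section 1.2.1).
fams_assignments = comb(100, 10)
# Ways to place 5 checkpoints on 190 edges (Section 1.2.2).
defender_allocations = comb(190, 5)
# Cycle-free attacker paths in the 20-node complete graph.
attacker_paths = simple_paths_complete_graph(20)

print(f"FAMS assignments:     {fams_assignments:.2e}")
print(f"Defender allocations: {defender_allocations:.2e}")
print(f"Attacker paths:       {attacker_paths:.2e}")
```

Even at this toy scale neither strategy set can be enumerated comfortably, which is the motivation for generating strategies incrementally.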
The graphs representing real-world scenarios are significantly larger; e.g., a simplified graph representing the road network in the southern tip of Mumbai has more than 250 nodes (intersections) and 700 edges (streets), and the security forces can deploy tens of resources.

Here, strategy generation is required for both the defender and the attacker, since the number of pure strategies of both players is prohibitively large. Rugged models the domain as a zero-sum game and computes the minimax equilibrium, since the payoff of the minimax strategy is equivalent to the SSE payoff in zero-sum games [Yin et al., 2010a]. Rugged decomposes the computation into three modules: a minimax module and best response modules for both the defender and the attacker. The minimax module solves the game restricted to the strategies generated so far, whereas the two best response modules generate new strategies for the two players respectively.

The contribution of Rugged is to provide the mixed integer formulations for the best response modules which enable the application of such a strategy generation approach. These mixed integer formulations are provided in Chapter 4. Rugged can compute the optimal solution for deploying up to 4 resources in a real city network with as many as 250 nodes within a reasonable time frame of 10 hours (the complexity of this problem can be estimated by observing that both the best response problems are NP-hard themselves [Jain et al., 2011b]).

The Rugged algorithm serves as a building block for Snares, which makes the following contributions: (1) It defines and uses mincut-fanout, a novel method for efficient warm-starting of the computation; (2) It exploits the submodularity property of the defender optimization in a greedy heuristic, which is used to generate "better-responses"; Snares also uses a better-response computation for the attacker. I also evaluate the performance of Snares in real-world networks, illustrating a significant advance: for example, whereas state-of-the-art algorithms could only schedule resources of the Mumbai police on just the southern tip of Mumbai, Snares can compute the optimal strategy for the entire urban road network of Mumbai.
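The control flow of strategy generation for both players (a minimax module over the strategies generated so far, plus one best-response module per player) can be sketched on a toy zero-sum matrix game. The sketch below is not Rugged: as loudly stated assumptions, it swaps in fictitious play for the minimax module (so the restricted game is solved only approximately), it replaces the mixed integer best-response formulations of Chapter 4 with brute-force enumeration, and the payoff matrix `TOY_GAME` and all function names are hypothetical.

```python
def fictitious_play(payoff, rows, cols, iterations=20000):
    """Approximate minimax strategies for the game restricted to rows x cols.

    Stand-in for the minimax module; the row player maximizes payoff[r][c]
    and the column player minimizes it.
    """
    row_counts = {r: 0 for r in rows}
    col_counts = {c: 0 for c in cols}
    i, j = rows[0], cols[0]
    for _ in range(iterations):
        row_counts[i] += 1
        col_counts[j] += 1
        # Each player best-responds to the opponent's empirical play so far.
        i = max(rows, key=lambda r: sum(payoff[r][c] * col_counts[c] for c in cols))
        j = min(cols, key=lambda c: sum(payoff[r][c] * row_counts[r] for r in rows))
    x = {r: row_counts[r] / iterations for r in rows}
    y = {c: col_counts[c] / iterations for c in cols}
    return x, y


def strategy_generation(payoff):
    """Grow both players' strategy sets until neither best response is new."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # start from one arbitrary pure strategy each
    while True:
        x, y = fictitious_play(payoff, rows, cols)
        # Best-response modules: search the FULL strategy sets (brute force).
        best_row = max(range(n_rows), key=lambda r: sum(payoff[r][c] * y[c] for c in cols))
        best_col = min(range(n_cols), key=lambda c: sum(payoff[r][c] * x[r] for r in rows))
        grew = False
        if best_row not in rows:
            rows.append(best_row)
            grew = True
        if best_col not in cols:
            cols.append(best_col)
            grew = True
        if not grew:
            value = sum(x[r] * payoff[r][c] * y[c] for r in rows for c in cols)
            return x, y, value


# Hypothetical game: row 2 and column 2 are dominated and should never be generated.
TOY_GAME = [
    [1, -1, 2],
    [-1, 1, 2],
    [-2, -2, 0],
]

x, y, value = strategy_generation(TOY_GAME)
print(sorted(x), sorted(y), round(value, 3))
```

In this toy run the loop stops with two strategies per player, so the dominated third row and column are never added to the restricted game. That is the point of the approach: only strategies that can improve a player's payoff are ever represented.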
1.2.3 Hbgs and Hbsa

The different preferences of different attacker types are modeled through Bayesian Stackelberg games. Computing the optimal leader strategy in a Bayesian Stackelberg game is NP-hard [Conitzer and Sandholm, 2006], and polynomial time algorithms cannot achieve approximation ratios better than O(types) [Letchford et al., 2009]. I have developed a new technique for solving large Bayesian Stackelberg games that decomposes the entire game into many hierarchically-organized restricted Bayesian Stackelberg games, as suggested in insight (iii); it then utilizes the solutions of these restricted games to more efficiently solve the larger Bayesian Stackelberg game [Jain et al., 2011a].

The overarching idea of the hierarchical structure is to improve the performance of branch-and-bound on the attacker action tree by pruning the search space. It decomposes the Bayesian Stackelberg game into many hierarchically-organized smaller games, as explained in detail in Chapter 5. Each of the restricted games considers only a few attacker types, and is thus exponentially smaller than the Bayesian Stackelberg game at the 'parent'. The solutions obtained for the restricted games at the child nodes of the hierarchical game tree are used to provide: (i) pruning rules, (ii) tighter bounds, and (iii) efficient branching heuristics to solve the bigger game at the parent node faster.

Such hierarchical techniques have seen little application towards obtaining optimal solutions in Bayesian games, while Stackelberg settings have not seen any application of such hierarchical decomposition. I provide Hbgs, which applies the hierarchical framework to general Stackelberg games, and Hbsa, which combines the hierarchical decomposition with the strategy generation of Aspen. I have shown that both Hbgs and Hbsa are orders of magnitude faster than other Bayesian Stackelberg algorithms for the respective problem settings.
Additionally, these algorithms are naturally designed for obtaining quality-bounded approximations since they are based on branch-and-bound, and they provide a further order of magnitude scale-up without any significant loss in quality if approximate solutions are allowed.

1.2.4 d:s and Phase Transition

I have also identified the properties that make a problem instance computationally challenging. I formalized the concept of the deployment-to-saturation (d:s) ratio in Stackelberg security games, and showed that problem instances for which d:s = 0.5 are computationally harder than instances with other d:s ratios for a wide range of different domains, algorithms, solvers and equilibrium computation methods [Jain et al., 2012]. This work also provides evidence that the computationally hard region is also one where optimization is most beneficial to real-world security agencies, thereby corroborating the need for algorithmic advances.

Furthermore, I use the concept of phase transitions to better understand this computationally hard region. I define a decision problem related to security games, and show that the probability that this problem has a solution exhibits a phase transition as the d:s ratio crosses 0.5. This provides evidence that security games with d:s = 0.5 are indeed computationally more challenging.

1.2.5 Real World Applications: ARMOR and IRIS

Game-theoretic approaches for security scheduling have been successfully deployed in the real world, with applications like ARMOR and IRIS in use by the Los Angeles airport police and the FAMS since August 2007 and October 2009 respectively [Jain et al., 2010c]. I led the development of IRIS, and I also made significant contributions to the development of ARMOR. While the algorithm of choice in ARMOR has been Dobss [Paruchuri et al., 2008b] (the LAX domain is small compared to FAMS or urban network security, since only 8 terminals need to be protected at LAX), IRIS uses the Aspen algorithm mentioned before and described in Chapter 3. Currently, IRIS is used to schedule air marshals on board international flights; FAMS is working towards increasing the scope of IRIS to domestic and other sectors.
Furthermore, the success of the IRIS and ARMOR systems has led to newer deployments of such algorithms in other real-world security domains, like the PROTECT system for the U.S. Coast Guard and the TRUSTS system for the LA Sheriff's Department.

Figure 1.3: Screenshots of ARMOR and IRIS, the deployed software assistants. (a) Screenshot of the ARMOR software assistant in use by the Los Angeles Airport Police. (b) Screenshot of the IRIS software assistant in use by the Federal Air Marshals Service.

1.3 Guide to thesis

This thesis is organized in the following way. Chapter 2 introduces necessary background for the research presented in this thesis and Chapter 7 presents related work. Chapter 3 presents the algorithm Aspen and corresponding experimental results. Chapter 4 presents the algorithms Rugged and Snares and corresponding experimental results. Chapter 5 describes the hierarchical decomposition technique for Bayesian Stackelberg games, whereas Chapter 6 describes the d:s ratio and examines properties that make problem instances hard to solve. Finally, Chapter 8 concludes the thesis and presents issues for future work.

Chapter 2: Background

This chapter begins by introducing motivating examples of real-world security applications in Section 2.1. The work in this thesis builds on Stackelberg games for modeling security domains, and so I then provide the background on the general Stackelberg game model and its Bayesian extension in Section 2.2. Section 2.3 introduces the standard solution concept known as the Strong Stackelberg Equilibrium (SSE). Section 2.4 then discusses the security game representation for Stackelberg games, whereas Section 2.5 describes security game models with arbitrary scheduling constraints. Section 2.6 then introduces the security game model with patrolling constraints, followed by a description of network security domains in Section 2.8. Finally, Section 2.9 overviews previous algorithms of relevance to this thesis.
2.1 Motivating Domains

Security scenarios addressed in this work exhibit the following important characteristics: there is a leader/follower dynamic between the security forces and terrorist adversaries, since the security forces commit to a security policy first while the adversaries conduct surveillance to exploit any weaknesses or patterns in the security strategies [Tambe, 2011]. A security policy here refers to some schedule to patrol, check or monitor the area under protection. There are limited security resources available to protect a very large space of possible targets, so it is not possible to provide complete coverage at all times. Moreover, the targets in the real world clearly have different values and vulnerabilities in each domain. Additionally, there is uncertainty over many adversary types. For example, the security forces may not know whether they would face a well-funded terrorist or a local gang member or some other threat. Typically, the security forces are interested in a randomized schedule, so that surveillance does not yield predictable patterns; yet they wish to ensure that more important targets have a higher protection and that they guard against an intelligent adversary's adaptive response to their randomized schedule. I now describe some concrete security domains which have motivated my research.

2.1.1 Los Angeles International Airport (LAX):

LAX is the fifth busiest airport in the United States, the largest destination airport in the United States, and serves 60-70 million passengers per year [LAWA, 2007; Stevens et al., 2006]. The LAX police use diverse measures to protect the airport, which include vehicular checkpoints, police units patrolling the roads to the terminals, patrolling inside the terminals (with canines), and security screening and bag checks for passengers. The application of the game-theoretic approach is focused on two of these measures: (1) placing vehicle checkpoints on inbound roads that service the LAX terminals, including both location and timing, and (2) scheduling patrols for bomb-sniffing canine units at the different LAX terminals.
The eight different terminals at LAX have very different characteristics, like physical size, passenger loads, foot traffic or international versus domestic flights. These factors contribute to the differing risk assessments of these eight terminals. The numbers of available vehicle checkpoints and canine units are limited by resource constraints, so the key challenge is to apply game-theoretic algorithms to intelligently allocate these resources, typically in a randomized fashion, to improve their effectiveness while avoiding patterns in the scheduled deployments.

2.1.2 United States Federal Air Marshals Service (FAMS):

The FAMS places undercover law enforcement personnel aboard flights of US air carriers originating in and departing the United States to dissuade potential aggressors and prevent an attack should one occur [TSA, 2008]. The exact methods used to evaluate the risks posed by individual flights are not made public by the service, and many factors might influence such an evaluation. For example, flights have different numbers of passengers, and some fly over densely populated areas while others do not [TSA, 2008]. International flights also serve different countries, which may pose different risks. Special events can also change the risks for particular flights at certain times [Wiki, 2008]. The scale of the domain is massive. There are currently tens of thousands of commercial flights scheduled each day, and public estimates state that there are thousands of air marshals [CNN, 2008]. Air marshals must be scheduled on tours of flights that obey various constraints (e.g., the time required to board, fly, and disembark). Simply finding schedules for the marshals that meet all of these constraints is a computational challenge. The task is made more difficult by the need to find a randomized policy that meets these scheduling constraints, while also accounting for the different values of each flight.

2.1.3 United States Transportation Security Administration (TSA):

The TSA is tasked with protecting the nation's transportation systems [TSA, 2011a].
One set of systems in particular is the over 400 airports [TSA, 2011a] which service approximately 28,000 commercial flights and up to approximately 87,000 total flights [ATC, 2011] per day. To protect this large transportation network, the TSA employs approximately 48,000 Transportation Security Officers [TSA, 2011a], who are responsible for implementing security activities at each individual airport. While many people are aware of common security activities, such as individual passenger screening, this is just one of many security layers TSA personnel implement to help prevent potential threats [TSA, 2011b,a]. These layers can involve hundreds of heterogeneous security activities executed by limited TSA personnel, leading to a complex resource allocation challenge. While activities like passenger screening are performed for every passenger, the TSA cannot possibly run every security activity all the time. Thus, while the resources required for passenger screening are always allocated by the TSA, it must also decide how to appropriately allocate its remaining security officers among the layers of security to protect against a number of potential threats, while facing challenges such as surveillance and an adaptive adversary as mentioned before.

2.1.4 United States Coast Guard:

The US Coast Guard patrols harbors to safeguard the maritime and security interests of the country. The Coast Guard continues to face a challenging future with an evolving asymmetric threat within the maritime environment, both within the Maritime Global Commons and within the ports and waterways that make up the United States Maritime Transportation System (MTS). The Coast Guard can cover any subset of patrol areas in any patrol schedule. They can also perform many security activities at each patrol area. The challenge for the Coast Guard again is to design a randomized patrolling strategy, given that they need to protect a diverse set of targets along the harbor and that the attacker conducts surveillance and is adaptive.
2.2 Stackelberg Games

In a Stackelberg game, a leader commits to a strategy first, and then followers selfishly optimize their own rewards, considering the action chosen by the leader. For the remainder of this thesis, I will refer to the leader as 'her' and the follower as 'him'. To see the advantage of being the leader in a Stackelberg game, consider a simple game with the payoff table shown in Table 2.1, which was first presented by [Conitzer and Sandholm, 2006]. The leader is the row player and the follower is the column player. The only pure-strategy Nash equilibrium for this game is when the leader plays a and the follower plays c, which gives the leader a payoff of 2; in fact, for the leader, playing b is strictly dominated. However, if the leader can commit to playing b before the follower chooses his strategy, then the leader will obtain a payoff of 3, since the follower would then play d to ensure a higher payoff for himself. If the leader commits to a uniform mixed strategy of playing a and b with equal (0.5) probability, then the follower will play d, leading to a payoff for the leader of 3.5.

         c       d
  a     2,1     4,0
  b     1,0     3,2

Table 2.1: Payoff table for example Stackelberg game. Rows are the leader's actions (a, b), columns are the follower's actions (c, d), and each entry lists the leader's payoff followed by the follower's payoff.

The Stackelberg games I consider in this thesis have two agents, the leader (defender), $\Theta$, and the follower (attacker/adversary), $\Psi$. Each player has a set of possible pure strategies, denoted $\sigma_\Theta \in \Sigma_\Theta$ and $\sigma_\Psi \in \Sigma_\Psi$. A mixed strategy allows a player to play a probability distribution over pure strategies, denoted $x$ and $a$ for the leader and the follower respectively. Payoffs for each player are defined over all possible joint pure-strategy outcomes: $U_\Theta : \Sigma_\Theta \times \Sigma_\Psi \to \mathbb{R}$ for the defender, and similarly for each attacker. The payoff functions are extended to mixed strategies in the standard way by taking the expectation over pure-strategy outcomes. The follower can observe the leader's strategy, and then act in a way that optimizes his own payoffs. Formally, the attacker's strategy in a Stackelberg security game becomes a function that selects a strategy for each possible leader strategy: $F_\Psi : x \to a$.
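The commitment advantage in the example above can be checked numerically. The following is a minimal sketch, assuming the payoffs of Table 2.1 with leader actions a, b and follower actions c, d as used in the text; the function names are hypothetical, and ties between follower actions are broken in the leader's favor using exact comparison, which is adequate only for this small example.

```python
# Payoffs from Table 2.1, stored as (leader payoff, follower payoff).
PAYOFFS = {
    ("a", "c"): (2, 1), ("a", "d"): (4, 0),
    ("b", "c"): (1, 0), ("b", "d"): (3, 2),
}
LEADER_ACTIONS = ("a", "b")
FOLLOWER_ACTIONS = ("c", "d")


def expected(x, follower_action, player):
    """Expected payoff of `player` (0 = leader, 1 = follower) when the leader
    plays mixed strategy x and the follower plays `follower_action`."""
    return sum(x[l] * PAYOFFS[(l, follower_action)][player] for l in LEADER_ACTIONS)


def follower_best_response(x):
    """Follower maximizes his own payoff; ties go to the leader's preferred action."""
    return max(FOLLOWER_ACTIONS, key=lambda f: (expected(x, f, 1), expected(x, f, 0)))


def leader_payoff(x):
    """Leader's expected payoff once the follower has observed x and responded."""
    return expected(x, follower_best_response(x), 0)


for label, x in [("commit to a", {"a": 1.0, "b": 0.0}),
                 ("commit to b", {"a": 0.0, "b": 1.0}),
                 ("uniform mix", {"a": 0.5, "b": 0.5})]:
    print(label, follower_best_response(x), leader_payoff(x))
```

The three commitments yield leader payoffs of 2, 3 and 3.5 respectively, matching the discussion of Table 2.1.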
The Bayesian extension to Stackelberg games allows for each player (leader or follower) to be one of multiple possible types, with each type associated with its own payoff values. In this thesis, the defender (leader), $\Theta$, only has one type since she is considering her own personal resources. However, the attacker (follower) can be one of a set of possible types denoted by $\lambda \in \Lambda$. For example, a security force may be interested in protecting against potential terrorist attacks and catching potential drug smugglers, which represent two different types of adversaries. Each type is represented by a different and possibly uncorrelated payoff matrix for both the leader and follower. That is, the leader's payoffs will vary along with the follower's payoffs for each type of follower. At any time the leader does not know what follower type she will face; however, she is aware of the probability distribution over follower types (i.e., she knows how frequently she will face each follower type). The probability with which follower type $\lambda \in \Lambda$ appears is denoted by $p^\lambda$. The follower is always aware of his own type and thus always has perfect information about the leader's payoffs and his own payoffs. The notation used in this thesis when referring to Bayesian Stackelberg games is given in Table 2.2.

  Variable                        Definition
  $\Theta$                        Refers to the leader
  $\Psi$                          Refers to the follower
  $\Lambda$                       Set of follower types
  $\Sigma_\Theta$                 Set of pure strategies of the leader
  $\Sigma_\Psi$                   Set of pure strategies of the follower
  $p^\lambda$                     Probability of follower type $\lambda$
  $G(\Theta, \Psi^\Lambda)$       Bayesian Stackelberg game
  $U^\lambda_\Theta$              Payoffs for the leader
  $U^\lambda_\Psi$                Payoffs for the follower
  $x$                             Leader's strategy
  $a^\lambda$                     Follower's strategy

Table 2.2: Notation

Given this formal model, the leader's goal is to determine the mixed strategy $x$ such that her expected value is maximized, given that each follower type will choose his expected-value-maximizing action with complete knowledge of the leader's mixed strategy. Such a commitment to a mixed strategy models a real-world situation where security forces commit to a randomized patrolling strategy first. Given this commitment, an adversary can conduct as much surveillance of this mixed strategy as he desires.
Even with knowledge of this mixed strategy, however, the adversary has no specific knowledge of what the security force may do on a particular day. He only has knowledge of the mixed strategy the security force will use to decide her resource allocations for that day. In this model, predictable defense strategies are vulnerable to exploitation by a determined adversary.

2.3 Strong Stackelberg Equilibrium (SSE)

The most common solution concept in game theory is a Nash equilibrium, which is a profile of strategies for each player in which no player can gain by unilaterally changing to another strategy [Osbourne and Rubinstein, 1994]. Stackelberg equilibrium is a refinement of Nash equilibrium specific to Stackelberg games. It is a form of sub-game perfect equilibrium in that it requires that each player select the best response in any subgame of the original game (where subgames correspond to partial sequences of actions). The effect is to eliminate equilibrium profiles that are supported by non-credible threats off the equilibrium path. Subgame perfection is a natural requirement, but it does not guarantee a unique solution in cases where the follower is indifferent among a set of strategies. The literature contains two forms of Stackelberg equilibria that identify unique outcomes, first proposed by Leitmann [Leitmann, 1978], and typically called "strong" and "weak" after Breton et al. [Breton et al., 1988]. The strong form assumes that the follower will always choose the optimal strategy for the leader in cases of indifference, while the weak form assumes that the follower will choose the worst strategy for the leader. Unlike the weak form, strong Stackelberg equilibria are known to exist in all Stackelberg games [Basar and Olsder, 1995]. A standard argument suggests that the leader is often able to induce the favorable strong form by selecting a strategy arbitrarily close to the equilibrium which causes the follower to strictly prefer the desired strategy [von Stengel and Zamir, 2004].
We adopt strong Stackelberg equilibrium as our solution concept in part for these reasons, but also because it is the most commonly used in related literature [Osbourne and Rubinstein, 1994; Conitzer and Sandholm, 2006; Paruchuri et al., 2008b].

Definition 1. A set of strategies $(x, F_\Psi)$ form a Strong Stackelberg Equilibrium (SSE) if they satisfy the following:

1. The leader plays a best response: $U_\Theta(x, F_\Psi(x)) \ge U_\Theta(x', F_\Psi(x')) \ \forall x'$.

2. The follower plays a best response: $U_\Psi(x, F_\Psi(x)) \ge U_\Psi(x, a) \ \forall a$.

3. The follower breaks ties optimally for the leader: $U_\Theta(x, F_\Psi(x)) \ge U_\Theta(x, a) \ \forall x, a \in \Sigma^*_\Psi(x)$, where $\Sigma^*_\Psi(x)$ is the set of follower best responses, as above.

In the case of Bayesian games with many follower types $\Lambda$, the leader's best response is a weighted best response to the followers' responses, where the weights are based on the probability of occurrence of each type ($p^\lambda$). The strategy of each attacker type becomes $F^\lambda_\Psi$, which still satisfies constraints 2 and 3 in Definition 1.

2.4 Stackelberg Security Games

In a security game, a defender must perpetually defend the site in question, whereas the attacker is able to observe the defender's strategy and attack when success seems most likely. This fits neatly into the description of a Stackelberg game if we map the attackers to the follower's role and the defender to the leader's role [Avenhaus et al., 2002; Brown et al., 2006; Kiekintveld et al., 2009]. The actions for the security forces represent the action of scheduling a patrol or checkpoint, e.g., a checkpoint at the LAX airport or a federal air marshal scheduled to a flight. The actions for an adversary represent an attack at the corresponding infrastructural entity. The strategy for the leader is a mixed strategy spanning the various possible actions.

There are two major problems with using conventional methods to represent security games in normal form. First, many solution methods require the use of a Harsanyi transformation when dealing with Bayesian games [Harsanyi and Selten, 1972].
The Harsanyi transformation converts a Bayesian game into a normal-form game, but the new game may be exponentially larger than the original Bayesian game. The compact representation of security games avoids this Harsanyi transformation, and instead directly operates on the Bayesian game. Operating directly on the Bayesian representation is possible in security games because the evaluation of the leader strategy against a Harsanyi-transformed game matrix is equivalent to its evaluation against each of the game matrices for the individual follower types [Kiekintveld et al., 2009]. The second problem arises because the defender has many possible resources to schedule in the security policy. This can also lead to a combinatorial explosion in a standard normal-form representation. For example, if the leader has $m$ resources to defend $n$ entities, then normal-form representations model this problem as a single leader with $\binom{n}{m}$ rows, each row corresponding to a leader action of covering $m$ targets with security resources. However, in the security game representation, the game representation would only include $n$ rows, each row corresponding to whether the corresponding target was covered or not. Such a representation is equivalent to the normal-form representation if the defender does not face any scheduling constraints (examples of domains with scheduling constraints will be given in Sections 2.5 to 2.8; for the same reason, such problems are also referred to as Security Problems with No Scheduling Constraints (SPNSC) in this thesis). This compactness in SPNSC is possible because the payoffs for the leader in these games simply depend on whether the attacked target was covered or not, and not on what other targets were covered (or not covered). The representation we use here avoids both of these potential problems, using methods similar to other compact representations for games [Koller and Milch, 2003; Jiang and Leyton-Brown, 2006].

I now introduce this compact representation for security games. Let $T = \{t_1, \ldots, t_n\}$ be a set of targets that may be attacked, corresponding to pure strategies for the attacker.
The defender has a set of resources available to cover these targets, $R = \{r_1, \ldots, r_m\}$ (for example, in the FAMS domain, targets could be flights and resources could be federal air marshals). Associated with each target are four payoffs defining the possible outcomes for an attack on the target, as shown in Table 2.3. Similarly, Table 2.4 shows an example for an entire Bayesian security game with no scheduling constraints and two targets. There are two cases, depending on whether or not the target is covered by the defender. The defender's payoff for an uncovered attack when facing an adversary of type $\lambda$ is denoted $U^{\lambda,u}_\Theta(t)$, and for a covered attack $U^{\lambda,c}_\Theta(t)$. Similarly, $U^{\lambda,u}_\Psi(t)$ and $U^{\lambda,c}_\Psi(t)$ are the payoffs of the attacker.

               Covered   Uncovered
  Defender        5         -20
  Attacker      -10          30

Table 2.3: Example payoffs for an attack on a target.

  Attacker Type 1
               Defender            Attacker
  Target     Cov.   Uncov.      Cov.   Uncov.
  $t_1$       10       0         -1       1
  $t_2$        0     -10         -1       1

  Attacker Type 2
               Defender            Attacker
  Target     Cov.   Uncov.      Cov.   Uncov.
  $t_1$        5      -4         -2       1
  $t_2$        4      -5         -1       2

Table 2.4: Example Bayesian security game with two targets and two attacker types.

A crucial feature of the model is that payoffs depend only on the target attacked, and whether or not it is covered by the defender. The payoffs do not depend on the remaining aspects of the schedule, such as whether any unattacked target is covered or which specific defense resource provides coverage. For example, if an adversary succeeds in attacking Terminal 1, the penalty for the defender is the same whether the defender was guarding Terminal 2 or 3. Therefore, from a payoff perspective, many resource allocations by the defender are identical. We exploit this by summarizing the payoff-relevant aspects of the defender's strategy in a coverage vector, $C$, that gives the probability that each target is covered, $c_t$. The analogous attack vector $A^\lambda$ gives the probability of attacking a target by a follower of type $\lambda$. We restrict the attack vector for each follower type to attack a single target with probability 1.
This is without loss of generality because a strong Stackelberg equilibrium (SSE) solution still exists under this restriction [Paruchuri et al., 2008b]. Thus, the follower of type $\lambda$ can choose any pure strategy $\sigma_\Psi \in \Sigma_\Psi$, that is, attack any one target from the set of targets.

The payoff for a defender when a specific target $t$ is attacked by an adversary of type $\lambda$ is given by $U^\lambda_\Theta(t, C)$ and is defined in Equation 2.1. Thus, the expectation of $U^\lambda_\Theta(t, C)$ over $t$ gives $U^\lambda_\Theta(C, A^\lambda)$, which is the defender's expected payoff given coverage vector $C$ when facing an adversary of type $\lambda$ whose attack vector is $A^\lambda$. $U^\lambda_\Theta(C, A^\lambda)$ is defined in Equation 2.2. The same notation applies for each follower type, replacing $\Theta$ with $\Psi$. Thus, $U^\lambda_\Psi(t, C)$ gives the payoff to the attacker when a target $t$ is attacked by an adversary of type $\lambda$. We also define the useful notion of the attack set in Equation 2.3, $\Lambda^\lambda(C)$, which contains all targets that yield the maximum expected payoff for the attacker type $\lambda$ given coverage $C$. This attack set is used by the adversary to break ties when calculating a strong Stackelberg equilibrium. Moreover, in these security games, exactly one adversary is attacking in one instance of the game; however, the adversary could be of any type and the defender does not know the type of the adversary faced.

$$U^\lambda_\Theta(t, C) = c_t \, U^{\lambda,c}_\Theta(t) + (1 - c_t) \, U^{\lambda,u}_\Theta(t) \qquad (2.1)$$

$$U^\lambda_\Theta(C, A^\lambda) = \sum_{t \in T} a^\lambda_t \cdot \left( c_t \cdot U^{\lambda,c}_\Theta(t) + (1 - c_t) \, U^{\lambda,u}_\Theta(t) \right) \qquad (2.2)$$

$$\Lambda^\lambda(C) = \{ t : U^\lambda_\Psi(t, C) \ge U^\lambda_\Psi(t', C) \ \forall t' \in T \} \qquad (2.3)$$

In a strong Stackelberg equilibrium, the attacker selects the target in the attack set with maximum payoff for the defender. Let $t^*$ denote this optimal target. Then the expected SSE payoff for the defender when facing this adversary of type $\lambda$ with probability $p^\lambda$ is $\hat{U}^\lambda_\Theta(C) = U^\lambda_\Theta(t^*, C) \times p^\lambda$, and for the attacker $\hat{U}^\lambda_\Psi(C) = U^\lambda_\Psi(t^*, C)$.

2.5 Security Problems with Arbitrary Scheduling Constraints (SPARS)

I now introduce the SPARS domain, which includes arbitrary scheduling constraints for the defender. A SPARS game is a security game where each defender resource can be assigned to a schedule covering multiple targets, $s \subseteq T$, so the set of all legal schedules is defined as $S \subseteq \mathcal{P}(T)$.
The defender has $R$ resources, such that each defender resource $r$ is restricted to a set of legal schedules, $S_r \subseteq S$ (note that this implies that defender resources are no longer identical). The defender's pure strategies are the set of joint schedules that assign each resource to at most one schedule. Additionally, we assume that a target may be covered by at most one resource in a joint schedule (though this can be generalized). A joint schedule $j$ can be represented by the vector $P_j = \langle P_{jt} \rangle \in \{0,1\}^n$, where $P_{jt}$ represents whether or not target $t$ is covered in joint schedule $j$. The set of all feasible joint schedules is denoted by $J$. We define a mapping $M$ from $j$ to $P_j$ as $M(j) = \langle P_{jt} \rangle$, where $P_{jt} = 1$ if $t \in \bigcup_{s \in j} s$, and 0 otherwise. The defender's mixed strategy $x$ specifies the probabilities of playing each $j \in J$, where each individual probability is denoted by $x_j$. Let $c = \langle c_t \rangle$ be the vector of coverage probabilities corresponding to $x$, where $c_t = \sum_{j \in J} P_{jt} x_j$ is the marginal probability of covering $t$.

A Bayesian-SPARS instance is an extension of the SPARS problem to many attacker types. Thus, each target in Bayesian-SPARS now has many sets of payoffs, one for each attacker type. For reference, Table 2.5 summarizes all of the notation introduced here, and builds on the notation presented in Table 2.2 for SPNSC problems.

Example: Consider a FAMS game with 5 targets (flights), $T = \{t_1, \ldots, t_5\}$, and three marshals of the same type, $R = 3$, $S_1 = S_2 = S_3 = S$. Let the set of feasible schedules be $S = \{\{t_1,t_2\}, \{t_2,t_3\}, \{t_3,t_4\}, \{t_4,t_5\}, \{t_1,t_5\}\}$. The set of feasible joint schedules is shown below, where column $j_1$ represents the joint schedule $\{\{t_1,t_2\},\{t_3,t_4\}\}$.

\[
P = \begin{array}{c|ccccc}
      & j_1 & j_2 & j_3 & j_4 & j_5 \\ \hline
t_1 & 1 & 1 & 1 & 1 & 0 \\
t_2 & 1 & 1 & 1 & 0 & 1 \\
t_3 & 1 & 1 & 0 & 1 & 1 \\
t_4 & 1 & 0 & 1 & 1 & 1 \\
t_5 & 0 & 1 & 1 & 1 & 1
\end{array}
\]

Each joint schedule in $J$ assigns only 2 air marshals in this example, since no more than 1 air marshal is allowed on any flight. Thus, the third air marshal will remain unused.
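The joint schedules of this example can be enumerated mechanically. The following is a minimal sketch rather than part of any algorithm in this thesis: the function name `maximal_joint_schedules` is hypothetical, and it assumes that only maximal joint schedules (those whose covered targets are not a strict subset of another feasible joint schedule's) are kept, which is what yields the five columns listed above. It also computes the marginal coverage $c_t = \sum_j P_{jt} x_j$ under a uniform mixed strategy, chosen here purely for illustration.

```python
from fractions import Fraction
from itertools import combinations

targets = ["t1", "t2", "t3", "t4", "t5"]
schedules = [
    frozenset({"t1", "t2"}), frozenset({"t2", "t3"}), frozenset({"t3", "t4"}),
    frozenset({"t4", "t5"}), frozenset({"t1", "t5"}),
]
num_resources = 3  # three identical air marshals


def maximal_joint_schedules(schedules, num_resources):
    """Return the covered-target sets of all maximal feasible joint schedules."""
    covered_sets = []
    for size in range(1, num_resources + 1):
        for family in combinations(schedules, size):
            covered = [t for s in family for t in s]
            # Feasible only if no target is covered by two resources.
            if len(covered) == len(set(covered)):
                covered_sets.append(frozenset(covered))
    covered_sets = list(dict.fromkeys(covered_sets))  # drop duplicates, keep order
    # Assumption: discard joint schedules dominated by a strictly larger one.
    return [f for f in covered_sets if not any(f < g for g in covered_sets)]


joint = maximal_joint_schedules(schedules, num_resources)
# One 0/1 vector per joint schedule: P[j][i] = 1 if target i is covered in j.
P = [[1 if t in f else 0 for t in targets] for f in joint]

# Marginal coverage under a uniform mixed strategy x over the joint schedules.
x = [Fraction(1, len(P))] * len(P)
c = [sum(col[i] * xj for col, xj in zip(P, x)) for i in range(len(targets))]
print(len(P), [float(v) for v in c])
```

The enumeration order of the columns may differ from the order $j_1, \ldots, j_5$ used in the matrix above; only the set of columns matters for the coverage computation.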
Suppose all of the targets have identical payoffs $U_\Theta^c(t) = 1$, $U_\Theta^u(t) = -5$, $U_\Psi^c(t) = -1$ and $U_\Psi^u(t) = 5$. In this case, the optimal strategy for the defender randomizes uniformly across the joint schedules, $x = \langle .2, .2, .2, .2, .2 \rangle$, resulting in coverage $c = \langle .8, .8, .8, .8, .8 \rangle$. All pure strategies have equal payoffs for the attacker, given this coverage vector. Furthermore, for this example, ERASER-C incorrectly outputs the coverage vector $c = \langle 1, 1, 1, 1, 1 \rangle$. [Footnote 1: This coverage vector is incorrect since there does not exist a mixed strategy over joint schedules for the defender that can achieve these marginals.]

To summarize, a solution to the SPARS problem instance is the optimal mixed strategy over joint schedules, such that each joint schedule assigns every defender resource to a schedule (or a set of targets). This mixed strategy is indexed over joint schedules, where the number of joint schedules is combinatorial in the number of schedules $S$ and the number of defender resources. In Chapter 3, I introduce the ASPEN algorithm, which uses branch-and-price to compute solutions to SPARS instances with just one attacker type (non-Bayesian). This algorithm is extended to handle the Bayesian case in Chapter 5.

2.6 Security Problems with Patrolling Constraints (SPPC)

I now introduce a new domain: Security Problems with Patrolling Constraints (SPPC). This is a generalized security domain that allows us to consider many different facets of the patrolling problem. The defender needs to protect a set of targets, located geographically on a plane, using a limited number of resources. These resources start at a given target and then conduct a tour that can cover an arbitrary number of additional targets; the constraint is that the total tour length must not exceed a given parameter $L$. I consider two variants of this domain featuring different attacker models.
1. There are multiple independent attackers, and each target can be attacked by a separate attacker. Each attacker can learn the probability that the defender protects a given target, and can then decide whether or not to attack it.
2. There is a single attacker with many types, modeled as a Bayesian game. The defender does not know the type of attacker she faces. The attacker attacks a single target.

Table 2.5: SPARS Notation

  Variable                  Definition
  $\Theta$                  Refers to the defender
  $\Psi$                    Refers to the attacker
  $\Lambda$                 Set of attacker types
  $p^\lambda$               Probability of attacker type $\lambda$
  $G(\Theta, \Psi^\Lambda)$ Bayesian Stackelberg game
  $T$                       Targets
  $R$                       Defender resources
  $S$                       Scheduling constraints for the defender
  $S_r$                     Set of feasible schedules for resource $r$
  $J$                       Joint schedules
  $P$                       Mapping between targets $T$ and joint schedules $J$
  $x$                       Distribution over $J$ (mixed strategy of the defender)
  $a^\lambda$               Attack vector (pure strategy of the attacker type $\lambda$)
  $d^\lambda$               Defender reward against type $\lambda$, when the defender has committed to mixed strategy $x$ and the attacker of type $\lambda$ is playing his best response $a^\lambda$
  $k^\lambda$               Reward of attacker type $\lambda$, when the defender has committed to mixed strategy $x$ and the attacker of type $\lambda$ is playing his best response $a^\lambda$
  $\mathbf{d}^\lambda$      Column vector of $d^\lambda$
  $\mathbf{k}^\lambda$      Column vector of $k^\lambda$
  $q^\lambda$               Attack set of type $\lambda$
  $U_\Theta^{u,\lambda}$    Utility for defender when target is uncovered
  $U_\Theta^{c,\lambda}$    Utility for defender when target is covered
  $D^\lambda$               Diagonal matrix of $U_\Theta^{c,\lambda}(t) - U_\Theta^{u,\lambda}(t)$
  $A^\lambda$               Diagonal matrix of $U_\Psi^{c,\lambda}(t) - U_\Psi^{u,\lambda}(t)$
  $\mathbf{U}_\Theta^{u,\lambda}$  Vector of values $U_\Theta^{u,\lambda}(t)$, similarly for $\Psi$
  $c$                       Marginal coverage over targets
  $M$                       Huge positive constant

Table 2.6: Notation for the SPPC Domain

  Variable    Definition
  $S$         Set of sites (targets)
  $T$         Set of tours
  $L$         Upper bound on the length of a defender tour
  $x$         Probability distribution over $T$
  $q$         Attack vector
  $z_{st}$    Binary value indicating whether or not $s \in T$
  $d$         Defender reward
  $k$         Adversary reward
  $P$         Penalty to attacker
  $R_s$       Reward to attacker at site $s$
  $\tau_s$    Defender reward for catching attacker on site $s$
  $M$         Huge positive constant

These variants were designed to capture properties of patrolling problems studied by researchers across many real-world domains [An et al., 2011; Bosansky et al., 2011; Vanek et al., 2011].
An example of the Bayesian single-attacker setting is the US Coast Guard patrolling a set of targets along the port to protect against potential threats. The defender's objective is to find the optimal mixed strategy $x$ over tours $T$ for all its resources in order to maximize her expected utility $d$. The notation used for this domain is described in Table 2.6.

2.6.1 Payoff Structure

Each target in the domain has associated payoffs, which specify the payoff to both the defender and the attacker in case of a successful or an unsuccessful attack. The attacker pays a high penalty for getting caught, whereas the defender gets a reward for catching the attacker. On the other hand, if the attacker succeeds, the attacker gets a reward whereas the defender pays a penalty. Both players get a payoff of 0 if the attacker chooses not to attack. The payoff matrix for each target is given in Table 2.7. Thus, the defender gets a reward of $\tau_s$ units if she succeeds in preventing the attack on target $s$, i.e., if the defender is covering target $s$ when it is attacked. On the other hand, the attacker pays a penalty of $P$ on being caught. Similarly, the reward to the attacker is $R_s$ for a successful attack on site $s$, whereas the corresponding penalty to the defender for leaving the target uncovered is $R_s$.

Table 2.7: Payoff structure for each target (entries are defender payoff, attacker payoff). The defender gets a reward of $\tau_s$ units for successfully preventing an attack, while the attacker pays a penalty $P$. Similarly, on a successful attack, the attacker gains $R_s$ and the defender loses $R_s$. Both players get 0 in case there is no attack.

               No Attack    Attack
  Covered        0, 0       $\tau_s$, $-P$
  Uncovered      0, 0       $-R_s$, $R_s$

2.7 Game Model 1: Multiple Attackers

In this game model, there are as many attackers as the number of targets in the domain. Each attacker can choose to attack or not attack a distinct target. Each attacker can observe the net coverage, or probability of the target being on a defender's patrol, for the target that the attacker is interested in. In our formulation, we assume that the attackers are independent and do not coordinate or compete. Figure 2.1 shows an example problem and its solution.
There are just two targets, A and B, which are placed 5 units away from the home (starting) location of the defender's resources. There are two attackers, one for each target. The tour length allowed in this example is 10 units, that is, the defender can patrol exactly one target in each patrol route. The penalty $P$ is set to 70 units, whereas the reward $R$ to the attacker for a successful attack is 100 units. For this particular example, the defender cannot prevent the attacks on both sites, and the optimal defender strategy is to cover one target with probability 0.588 and the other target with probability 0.412, with the optimal defender reward being $-50.588$.

[Figure 2.1: Example 1. Two targets A and B, each 5 time units from the inspector's home location; attacker penalty P = 70; time bound = 10 time units; number of inspectors = 1; R = 100 and λ = 20 at each target; inspector strategy A: 0.588, B: 0.412; inspector reward = -50.588.]

2.8 Network Security Domain

I now introduce a network security game, where the number of actions for both the defender and the attacker can be exponential. A network security domain, as introduced by Tsai et al. [2010], is modeled using a graph $G = (N, E)$. The attacker starts at one of the source nodes $s \in S \subset N$ and travels along a path of his choosing to any one of the targets $t \in T \subset N$. The attacker's pure strategies are thus all the possible $s$-$t$ paths from any source $s \in S$ to any target $t \in T$. The defender tries to catch the attacker before he reaches any of the targets by placing $k$ available (homogeneous) resources on edges in the graph. The defender's pure strategies are thus all the possible allocations of $k$ resources to edges, so there are $\binom{|E|}{k}$ in total. Assuming the defender plays allocation $X_i \subseteq E$ and the attacker chooses path $A_j \subseteq E$, the attacker succeeds if and only if $X_i \cap A_j = \emptyset$. Additionally, a payoff $T(t)$ is associated with each target $t$, such that the attacker gets $T(t)$ for a successful attack on $t$ and 0 otherwise. The defender receives $-T(t)$ in case of a successful attack on $t$ and 0 otherwise. The payoff structure used in the network security domains of interest in this thesis is given in Table 2.8.
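The pure-strategy payoffs of the network security game can be illustrated on a toy graph. This is only a sketch under stated assumptions: the four-edge graph, the value $T(t) = 5$ and the helper name `payoffs` are all hypothetical and chosen for illustration; the success rule (the attack succeeds exactly when the allocation and the path share no edge) and the zero-sum payoffs follow the description above.

```python
from itertools import combinations
from math import comb

# Hypothetical toy graph: two edge-disjoint routes from source s to target t.
edges = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")]
k = 1                     # number of (homogeneous) defender resources
target_payoff = {"t": 5}  # hypothetical value of T(t)

# Defender pure strategies: every way to place k resources on edges.
allocations = [frozenset(X) for X in combinations(edges, k)]

# Attacker pure strategies: s-t paths, each represented by its set of edges.
paths = {
    "via_a": frozenset({("s", "a"), ("a", "t")}),
    "via_b": frozenset({("s", "b"), ("b", "t")}),
}


def payoffs(allocation, path, target):
    """Return (defender, attacker) payoffs for one pure-strategy pair."""
    if allocation.isdisjoint(path):  # no checkpoint on the path: attack succeeds
        return (-target_payoff[target], target_payoff[target])
    return (0, 0)                    # attacker is intercepted


print(len(allocations), comb(len(edges), k))
print(payoffs(frozenset({("s", "a")}), paths["via_a"], "t"))
print(payoffs(frozenset({("s", "a")}), paths["via_b"], "t"))
```

Even in this tiny instance the defender's strategy count is $\binom{|E|}{k}$; for realistic graphs both players' strategy spaces are far too large to enumerate in this way.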
The network security domain is modeled as a complete-information zero-sum game, where the set $S$ of sources, the set $T$ of targets, the payoffs $U_\Psi^{u}(t) = T(t)$ for all the targets $t$, and the number of defender resources $R$ are known to both players a priori. The objective is to find the mixed strategy $x$ of the defender corresponding to a Stackelberg equilibrium (equivalently, a Nash equilibrium strategy or a minimax strategy, since in zero-sum games Stackelberg equilibrium strategies are equivalent to Nash and minimax strategies [Yin et al., 2010b]). The notation used to describe a network security game is given in Table 2.9.

Table 2.8: Payoff structure for a network security game considered in this thesis.

              Covered   Uncovered
  Defender       0       $-T(t)$
  Attacker       0       $T(t)$

Table 2.9: Notation

  $G(N, E)$           Network
  $N$                 Nodes ($n$ iterates over $N$)
  $E$                 Edges ($e$ iterates over $E$)
  $T$                 Target payoffs
  $R$                 Defender resources
  $X$                 Set of defender allocations, $X = \{X_1, X_2, \ldots, X_n\}$
  $X_i$               $i$-th defender allocation. $X_i = \{X_{ie}\} \; \forall e$, $X_{ie} \in \{0, 1\}$
  $A$                 Set of attacker paths, $A = \{A_1, A_2, \ldots, A_m\}$
  $A_j$               $j$-th attacker path. $A_j = \{A_{je}\} \; \forall e$, $A_{je} \in \{0, 1\}$
  $z_{ij}$            Boolean variable indicating whether $A_j$ intersects with $X_i$
  $x$                 Defender's mixed strategy over $X$
  $a$                 Adversary's mixed strategy over $A$
  $U_\Theta(x, A_j)$  Defender's expected utility playing $x$ against $A_j$
  $\Lambda_\Theta$    Defender's pure strategy best response
  $\Lambda_\Psi$      Attacker's pure strategy best response

2.9 Baseline Algorithms

There is previous work on computing solutions for Bayesian Stackelberg games. These approaches range from solving the normal-form Stackelberg game, to solving security games, to solving network security games. In this section, I briefly describe (i) Multiple LPs [Conitzer and Sandholm, 2006] and (ii) DOBSS [Paruchuri et al., 2008b] for solving generic Bayesian Stackelberg games; (iii) ERASER [Kiekintveld et al., 2009] for Bayesian security games (SPNSC); and (iv) RANGER [Tsai et al., 2010] for solving network security games.
These approaches either do not scale or do not compute optimal solutions for the problems of interest in my thesis; however, I describe them here since they provide background for algorithm design and experimental evaluation.

2.9.1 Multiple LPs approach

The leader's strategy in the SSE is considered the optimal leader's strategy, as it maximizes the leader's expected utility assuming the follower best responds. This section explains the baseline algorithms for finding the optimal leader's strategy of a Bayesian Stackelberg game. As shown by Conitzer and Sandholm [2006], the problem of computing the optimal leader's strategy $x$ is equivalent to finding a leader's mixed strategy $x$ and a follower's pure strategy response $a = g(x)$ such that the three SSE conditions (see Definition 1) are satisfied. Mathematically, $x$ can be found by solving the following maximization problem:

\[ (x^*, \sigma^*) = \arg\max_{x, \sigma^*} \{ U_\Theta(x, \sigma^*) \mid U_\Psi(x, \sigma^*) \ge U_\Psi(x, \sigma'), \; \forall\, \sigma' \in \Sigma \}. \tag{2.4} \]

Here, $\sigma^*$ is the pure strategy for the Bayesian follower and consists of $\sigma^* = \langle \sigma^{\lambda *} \rangle$, i.e., pure strategy best responses for each follower type. $x^*$ refers to the optimal mixed strategy of the leader.

Equation (2.4) suggests the multiple linear program (LP) approach for finding $x^*$, as given in [Conitzer and Sandholm, 2006]. The idea is to enumerate all possible pure strategy responses of the follower, $\sigma^\lambda$, $\lambda \in \Lambda$, $\sigma^\lambda \in \Sigma^\lambda$. Furthermore, for each $\sigma = \langle \sigma^\lambda \rangle$, the optimal mixed strategy of the leader is chosen such that $\sigma$ is a best response of the follower, and can be found by solving the following LP [footnote 2]:

\[
\begin{aligned}
\max_x \quad & U_\Theta(x, \sigma) \\
\text{s.t.} \quad & x \ge 0 \\
& \mathbf{1}^T x = 1 \\
& U_\Psi^\lambda(x, \sigma^\lambda) \ge U_\Psi^\lambda(x, \sigma'^{\lambda}), \quad \forall\, \lambda \in \Lambda, \; \forall\, \sigma'^{\lambda} \in \Sigma^\lambda
\end{aligned}
\tag{2.5}
\]

Some of the LPs may be infeasible, but it can be shown that at least one LP will return a feasible solution. The optimal leader's strategy $x^*$ is then the optimal solution of the LP that has the highest objective value (i.e., the leader's expected utility) among all feasible LPs.
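The enumeration behind the multiple-LPs approach can be sketched without an LP solver in a deliberately restricted setting. The sketch below assumes the leader has exactly two pure strategies, so that the mixed strategy is $x = (p, 1-p)$ and each LP of the form (2.5) collapses to maximizing a linear function of $p$ over an interval, whose optimum lies at an endpoint. The function names and the payoff numbers are illustrative assumptions, not taken from the thesis or from any library.

```python
from fractions import Fraction
from itertools import product


def br_interval(F, j):
    """Interval of p on which follower action j is a best response.

    F[i][j] is the follower payoff when the leader plays pure strategy i
    (0 or 1) and the follower plays j; p is the probability of strategy 0.
    """
    lo, hi = Fraction(0), Fraction(1)
    for k in range(len(F[0])):
        # U(p, j) - U(p, k) = a * p + b must be non-negative.
        b = Fraction(F[1][j] - F[1][k])
        a = Fraction(F[0][j] - F[0][k]) - b
        if a == 0:
            if b < 0:
                return None
        elif a > 0:
            lo = max(lo, -b / a)
        else:
            hi = min(hi, -b / a)
    return (lo, hi) if lo <= hi else None


def multiple_lps(types):
    """types: list of (prior, L, F); L and F are leader/follower payoffs."""
    best = None
    # One "LP" per combination of follower pure strategies, one per type.
    for sigma in product(*(range(len(F[0])) for _, _, F in types)):
        lo, hi = Fraction(0), Fraction(1)
        intervals = [br_interval(F, j) for (_, _, F), j in zip(types, sigma)]
        if any(iv is None for iv in intervals):
            continue  # this LP is infeasible
        for iv in intervals:
            lo, hi = max(lo, iv[0]), min(hi, iv[1])
        if lo > hi:
            continue  # no p makes every type's choice a best response
        for p in (lo, hi):  # a linear objective is maximized at an endpoint
            value = sum(prior * (p * L[0][j] + (1 - p) * L[1][j])
                        for (prior, L, _), j in zip(types, sigma))
            if best is None or value > best[0]:
                best = (value, p, sigma)
    return best


# Illustrative single-type game (rows: leader strategies, columns: follower).
L = [[2, 4], [1, 3]]
F = [[1, 0], [0, 2]]
value, p, sigma = multiple_lps([(1, L, F)])
print(value, p, sigma)
```

With $n$ follower types the outer loop visits every combination of follower pure strategies, which is the exponential growth that motivates the more compact formulations that follow.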
2.9.2 DOBSS: Mixed Integer Linear Program

Since the followers of different types are mutually independent of each other, there can be at most $\prod_{\lambda \in \Lambda} |\Sigma^\lambda|$ possible combinations of follower best response actions over the follower types. The multiple LPs approach will then have to solve $\prod_{\lambda \in \Lambda} |\Sigma^\lambda|$ LPs, and therefore its runtime complexity grows exponentially in the number of follower types. In fact, the problem of finding the optimal strategy for the leader in a Bayesian Stackelberg game with multiple follower types is NP-hard [Conitzer and Sandholm, 2006]. Nevertheless, researchers have continued to provide practical improvements. DOBSS is an efficient general Stackelberg solver [Paruchuri et al., 2008b] and is in use for security scheduling at the Los Angeles International Airport in the ARMOR system [Pita et al., 2008]. DOBSS obtains a decomposition scheme by exploiting the property that follower types are independent of each other, and solves the entire problem as one mixed-integer linear program (MILP):

[Footnote 2: Note that the formulation here is slightly different from, and has fewer constraints in each LP than, the original multiple LPs approach in [Conitzer and Sandholm, 2006], where a Bayesian game is transformed to a normal-form one using the Harsanyi transformation [Harsanyi and Selten, 1972].]

\[
\begin{aligned}
\max_{x, d, k, a} \quad & \sum_{\lambda} p^\lambda d^\lambda \\
\text{s.t.} \quad & \mathbf{1}^T x = 1, \quad x \ge 0 \\
& \sum_{j=1}^{|\Sigma^\lambda|} a_j^\lambda = 1, && \forall\, \lambda \\
& a_j^\lambda \in \{0, 1\}, && \forall\, \lambda, \; \forall\, j = 1..|\Sigma^\lambda| \\
& d^\lambda \le U_\Theta^\lambda(x, j) + (1 - a_j^\lambda) \cdot M, && \forall\, \lambda, \; \forall\, j \\
& 0 \le k^\lambda - U_\Psi^\lambda(x, j) \le (1 - a_j^\lambda) \cdot M, && \forall\, \lambda, \; \forall\, j
\end{aligned}
\tag{2.6}
\]

DOBSS effectively reduces the problem of solving an exponential number of LPs to a compactly represented MILP, which can be solved much more efficiently via modern techniques in Operations Research. The key idea of the DOBSS MILP is to represent the strategy of each follower type $\lambda$ as a binary vector $a^\lambda = [a_1^\lambda, a_2^\lambda, \ldots]$, where $a_j^\lambda$ is 1 if the follower type $\lambda$ chooses pure strategy $j$ and 0 otherwise. Here, $U_\Theta^\lambda(x, j)$ is the expected utility for the leader when the leader is playing mixed strategy $x$ and the follower is playing its $j$-th pure strategy. $U_\Psi^\lambda(x, j)$ is defined similarly for the follower. $M$ is (conceptually) an infinitely large constant.
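The role of the big-$M$ terms in (2.6) can be made explicit by fixing a type $\lambda$ and a pure strategy $j$ and considering the two possible values of the binary variable. The following short derivation is an illustrative reading of the constraints as written above, not an additional result.

```latex
\begin{aligned}
a_j^\lambda = 1:\quad
  & d^\lambda \le U_\Theta^\lambda(x, j),
  && 0 \le k^\lambda - U_\Psi^\lambda(x, j) \le 0
  \;\Rightarrow\; k^\lambda = U_\Psi^\lambda(x, j), \\
a_j^\lambda = 0:\quad
  & d^\lambda \le U_\Theta^\lambda(x, j) + M \;\;(\text{non-binding}),
  && 0 \le k^\lambda - U_\Psi^\lambda(x, j) \le M
  \;\Rightarrow\; k^\lambda \ge U_\Psi^\lambda(x, j).
\end{aligned}
```

Hence $k^\lambda$ equals the follower's payoff for the selected strategy and is at least the payoff of every other strategy, so the selected strategy is a best response, while $d^\lambda$ is bounded only by the leader's payoff against that selected strategy.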
2.9.3 ERASER Mixed Integer Linear Program

The ERASER MILP is based on the same idea as DOBSS; however, it operates on security games. Thus, it exploits the compact representation of security games and avoids the enumeration of the exponentially many pure strategy combinations for the defender. The ERASER MILP is given as follows:

\[
\begin{aligned}
\max \quad & \sum_{\lambda} p^\lambda d^\lambda \\
& a_t^\lambda \in \{0, 1\} && \forall\, t \in T, \; \lambda \in \Lambda \\
& \sum_{t \in T} a_t^\lambda = 1 && \forall\, \lambda \in \Lambda \\
& c_t \in [0, 1] && \forall\, t \in T \\
& \sum_{t \in T} c_t \le R \\
& d^\lambda - U_\Theta^\lambda(t, C) \le (1 - a_t^\lambda) \cdot M && \forall\, t \in T, \; \lambda \in \Lambda \\
& 0 \le k^\lambda - U_\Psi^\lambda(t, C) \le (1 - a_t^\lambda) \cdot M && \forall\, t \in T, \; \lambda \in \Lambda
\end{aligned}
\tag{2.7}
\]

Here, the defender's strategy is represented as the coverage vector $C$ instead of the traditional probability distribution $x$, and it represents the marginal coverage of the defender on each target. As in DOBSS, the strategy of the attacker of type $\lambda$, $a^\lambda$, is an indicator vector and is 1 for target $t$ only if $t$ is the best response of attacker type $\lambda$ to the coverage $C$ of the defender. $U_\Theta^\lambda(t, C)$ and $U_\Psi^\lambda(t, C)$ represent the expected utilities to the defender and the attacker respectively, given attacker type $\lambda$, when the defender is playing the coverage vector $C$ and the attacker of type $\lambda$ attacks target $t$. $R$ is the total number of available defender resources. $M$, again, is (conceptually) an infinitely large constant.

2.9.4 RANGER solution approach

RANGER [Tsai et al., 2010] provides approximate solutions to the network security game. It approximates the strategy space of the players, using an efficient shortest-path based linear program to compute the optimal defender allocation in the network. Tsai et al. [2010] then provide two sampling schemes to sample from the mixed strategy computed by RANGER to schedule the actual allocation of defender resources. The RANGER linear program is given as follows:

\[
\begin{aligned}
\max \quad & d^* \\
\text{s.t.} \quad & d^* \le -(1 - d_t) \cdot T(t) \\
& d_s = 0 \\
& d_v \le \min(1, d_u + x_e) && \forall\, e = (u, v) \in E \\
& 0 \le x_e \le 1 && \forall\, e \in E \\
& \sum_{e \in E} x_e \le R
\end{aligned}
\tag{2.8}
\]

In this RANGER linear program, $d^*$ is the defender reward computed by RANGER, and $x_e$ is the marginal probability of placing a checkpoint on edge $e$. The $d_v$ are, for each vertex $v$, the minimum sum of checkpoint probabilities along any path from the source $s$ to vertex $v$.
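For a fixed vector of edge marginals $x_e$, the quantities $d_v$ in (2.8) can be evaluated with a capped shortest-path computation. The sketch below is a hedged illustration, not the RANGER implementation: the graph, the marginals, the value $T(t) = 10$ and the function names are hypothetical, and the reward estimate assumes the sign convention of Table 2.8, in which the defender receives $-T(t)$ on a successful attack.

```python
def ranger_distances(nodes, edges, x, source):
    """d_v = min(1, cheapest sum of x_e over any path from source to v)."""
    d = {v: 1.0 for v in nodes}  # every d_v is capped at 1
    d[source] = 0.0
    # Bellman-Ford style relaxation; |N| - 1 passes suffice.
    for _ in range(len(nodes) - 1):
        for (u, v) in edges:
            d[v] = min(d[v], min(1.0, d[u] + x[(u, v)]))
    return d


def ranger_reward(d, target_payoff):
    """Estimated defender reward: worst case of -(1 - d_t) * T(t) over targets."""
    return min(-(1.0 - d[t]) * payoff for t, payoff in target_payoff.items())


# Hypothetical instance: two routes from s to t with different total coverage.
nodes = ["s", "a", "t"]
edges = [("s", "a"), ("a", "t"), ("s", "t")]
x = {("s", "a"): 0.5, ("a", "t"): 0.25, ("s", "t"): 0.5}

d = ranger_distances(nodes, edges, x, "s")
print(d["t"], ranger_reward(d, {"t": 10.0}))
```

The estimate treats $d_t$ as if it were the probability of intercepting the attacker, which is the approximation RANGER makes; the checkpoint marginals along a path do not in general combine this way under an actual joint allocation.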
The RANGER solution overestimates the optimal defender reward and, as I show in Chapter 4, the error in the solution quality of RANGER is unbounded.

Chapter 3: Strategy Generation for One Player

In this chapter, I show how strategy generation for one player can be used in domains where the number of pure strategies for one player is extremely large. I will use two domains to demonstrate the point: the SPARS domain as well as the SPPC domain. I will also show how strategy generation can be used in conjunction with previously published algorithms to compute optimal defender strategies in the SPARS domain. I conclude this chapter with experimental results in Section 3.4.

3.1 SPARS domain

In this section, I present Accelerated SPARS ENgine (ASPEN), an algorithm that solves an instance of the SPARS problem without having to enumerate all the pure strategies of the defender. ASPEN focuses on the non-Bayesian case (and so I omit the type superscript $\lambda$ in this section); I will describe the Bayesian extension in Chapter 5. ASPEN formulates the SPARS problem as a mixed-integer program in which the pure strategies of the attacker are represented by integer variables $a$, with $a_t = 1$ if target $t$ is attacked and 0 otherwise. Two key computational challenges arise in such a formulation. First, the space of possible strategies (joint schedules) for the defender suffers from combinatorial explosion: a FAMS problem with 100 flights, schedules with 3 flights, and 10 air marshals has up to 100,000 schedules and $\binom{100000}{10}$ joint schedules. Second, integer variables are a well-known challenge for optimization, since linear problems without integer variables can be solved in polynomial time, while versions with integer variables are NP-hard.

ASPEN overcomes these challenges using the framework of Branch and Price [Barnhart et al., 1994], which is used to solve very large optimization problems. It combines branch and bound search with column generation to mitigate both of the problems described above.
The use of column generation decomposes the problem in such a way that one pure strategy (joint schedule) of the defender is generated in each iteration, thereby allowing the computation of the optimal strategy without explicitly enumerating the entire pure strategy space of the defender. This method operates on joint schedules (and not marginal probabilities, like ERASER-C), so it is able to handle scheduling constraints directly.

[Figure 3.1: Working of Branch and Price for a non-Bayesian SPARS problem instance. The first node relaxes all $a_t \in [0,1]$; the first leaf fixes $a_{t_1} = 1$ and the rest to 0; the second node fixes $a_{t_1} = 0$ with the rest in $[0,1]$; the second leaf fixes $a_{t_1} = 0$, $a_{t_2} = 1$ and the rest to 0; the third node fixes $a_{t_1} = a_{t_2} = 0$ with the rest in $[0,1]$; the last leaf fixes $a_{t_T} = 1$ and the rest to 0. ORIGAMI nodes yield upper bounds with $UB_1 \ge UB_2 \ge \ldots \ge UB_T$; column generation nodes yield lower bounds $LB_1, \ldots, LB_T$, which are not necessarily ordered.]

An example of branch and price for our problem is shown in Figure 3.1, with the root node representing the original problem. Every branch of this tree represents a pure strategy choice for the attacker. Branches to the left (gray nodes, labeled 'Column Generation Node') set exactly one variable $a_{t_i}$ in $a$ to 1 and the rest to zero, resulting in a linear program that gives a lower bound on the overall solution quality. Branches to the right fix that same variable $a_{t_i}$ to zero, leaving the remaining variables unconstrained. An upper bound on solution quality computed for each white node (labeled 'ORIGAMI Node') can be used to terminate execution without exploring all of the possible integer assignments. This upper bound can be computed naïvely as well as using heuristics. In Section 3.1.2 we describe a heuristic based on the ORIGAMI algorithm that we use in ASPEN.

Solving the linear programs in each gray node normally requires enumerating all possible joint schedules for the defender. Column generation (i.e., pricing) avoids this by iteratively solving a restricted master problem, which includes only a small subset of the variables, and a slave problem that identifies new variables to include in the master problem.
The new variables added by the slave problem are the ones that will maximally improve the objective value in the master problem. The algorithm terminates when the slave problem cannot identify any new variables that will improve the objective for the master problem.

Branch and price is not an "out of the box" approach, and it has only recently begun to be applied in game-theoretic settings [Halvorson et al., 2009a]. To apply the technique here, we design a novel master-slave decomposition to facilitate column generation for SPARS, including a network flow formulation of the slave problem. A standard technique for generating upper bounds is to use a linear programming relaxation of the problem, but this approach performs poorly for these problems based on our experimental results. Instead, we propose new methods for bounding and selecting branches based on fast algorithms for security games without scheduling constraints.

3.1.1 ASPEN Column Generation

The linear programs at each leaf in Figure 3.1 are decomposed into master and slave problems for column generation (see Algorithm 1). The master solves for the defender strategy $x$, given a restricted set of columns (i.e., joint schedules) $P$, such that the attacker's best response $a$ is consistent with the leaf node from Figure 3.1. The objective function for the slave is updated in each iteration based on the current solution of the master problem. The slave problem is then solved to identify the best new column to add to the master problem, using reduced costs (explained later). If no column can improve the solution, the algorithm terminates.

Algorithm 1 Column generation
  1. Initialize P
  2. Solve Master Problem
  3. Calculate reduced cost coefficients from solution
  4. Update objective of slave problem with coefficients
  5. Solve Slave Problem
  if Optimal solution obtained then
    6. Return (x, P)
  else
    7. Extract new column and add to P
    8. Repeat from Step 2

3.1.1.1 Master Problem:

The master problem (Equations 3.1 to 3.6) solves for the probability vector $x$ that maximizes the defender reward (Table 2.5 in Chapter 2 succinctly lists the notation).
[Footnote 1: The actual algorithm minimizes the negative of the defender reward for correctness of the reduced cost computation used in the Slave Problem; we show maximization of defender reward for expository purposes.]

This master problem operates directly on columns of $P$, and the coverage vector $c$ is computed from these columns as $Px$. $d$ represents the expected defender utility that is to be maximized, whereas $\mathbf{d}$ is a column vector of $d$ of dimension $|T|$. Similarly, $k$ represents the attacker's expected utility, whereas $\mathbf{k}$ is a column vector of $k$ of dimension $|T|$. The vectors $\mathbf{d}$ and $\mathbf{k}$ are introduced for dimensional consistency in the matrix notation of the mixed integer linear program described below. While the payoffs $U_\Theta^u$ and $U_\Theta^c$ for the defender are defined as before, $D$ here is a diagonal matrix of $U_\Theta^c(t) - U_\Theta^u(t)$ of dimension $|T| \times |T|$. Similarly, $U_\Psi^u$ and $U_\Psi^c$ represent the payoffs for the attacker, and $A$ is a diagonal matrix of $U_\Psi^c(t) - U_\Psi^u(t)$, again of dimension $|T| \times |T|$. Constraints 3.2–3.4 enforce the SSE conditions that the players choose mutual best responses.

Constraints 3.3 and 3.4 ensure that the assignment $a_t = 1$ can generate a feasible solution if and only if target $t$ is the best response of the attacker (note that this assignment is done when resolving the branch of the branch-and-bound tree given in Figure 3.1). The defender's expected payoff (Equation 2.1) for target $t$ is given by the $t$-th component of the column vector $DPx + U_\Theta^u$ and is denoted $(DPx + U_\Theta^u)_t$. Similarly, the attacker payoff for target $t$ is given by $(APx + U_\Psi^u)_t$. Constraints 3.2 and 3.3 are active only for the single target $t^*$ attacked ($a_{t^*} = 1$). This target must be a best response, due to Constraint 3.4. The pure strategy of the attacker $a$ is an input to the column generation procedure and not a variable in the formulation below.

\[ \max \; d \tag{3.1} \]
\[ \text{s.t.} \quad \mathbf{d} - DPx - U_\Theta^u \le (\mathbf{1} - a) M \tag{3.2} \]
\[ \mathbf{k} - APx - U_\Psi^u \le (\mathbf{1} - a) M \tag{3.3} \]
\[ APx + U_\Psi^u \le \mathbf{k} \tag{3.4} \]
\[ \sum_{j \in J} x_j = 1 \tag{3.5} \]
\[ x \ge 0 \tag{3.6} \]

3.1.1.2 Slave Problem:

The slave problem finds the best column to add to the current columns in $P$.
This is done using reduced costs, which capture the rate of change in the defender payoff if a candidate column is added to $P$. The candidate column with minimum reduced cost improves the objective value the most [Bertsimas and Tsitsiklis, 1994]. The reduced cost $\bar{c}_j$ of variable $x_j$ (associated with column $P_j$) is given in Equation 3.7, where $w$, $y$, $z$ and $h$ are dual variables of master constraints 3.2, 3.3, 3.4 and 3.5 respectively. The dual variable measures the influence of the associated constraint on the objective, and can be calculated using standard techniques.

\[ \bar{c}_j = w^T (D P_j) + y^T (A P_j) - z^T (A P_j) - h \tag{3.7} \]

An inefficient approach would be to iterate through all of the columns and calculate each reduced cost to identify the best column to add. Indeed, this would defeat the whole point of column generation. Instead, we formulate a minimum cost network flow (MCNF) problem that efficiently finds the optimal column. Feasible flows in the network map to feasible joint schedules in the SPARS problem, so the scheduling constraints are captured by this formulation. The procedure to construct the MCNF graph for a given SPARS instance is given below, followed by an example.

We start by creating a sink node with demand $|R|$. A source node $source_r$ with supply 1 is created for each defender resource $r \in R$. [Footnote 2: In fact, resources $r$ with the same set of scheduling constraints $S_r$ are combined and connected to one source node with supply equal to the number of these resources with the same set of scheduling constraints $S_r$.] The rest of the MCNF graph is constructed such that the flow from $source_r$ to the sink will provide the schedule $s \in S_r$ that this resource shall undertake.

Next, we construct a set of paths for each resource $r$ from the source $source_r$ to the sink node. A path is added between $source_r$ and the sink for each schedule $s_r \in S_r$. We represent targets in schedule $s$ for resource $r$ using a pair of nodes $(a_{s_r,t}, b_{s_r,t})$ with a forward directed edge. Each target $t$ is represented multiple times in the MCNF graph by a pair of nodes $(a_{s_r,t}, b_{s_r,t})$ for every schedule $s_r$ it appears in.
Then, for the schedule $s_r \in S_r$, we add a path from the source to the sink: $\langle source_r, a_{s_r,t_{i_1}}, b_{s_r,t_{i_1}}, a_{s_r,t_{i_2}}, \ldots, b_{s_r,t_{i_L}}, sink \rangle$. The capacities on all edges are set to 1, and the default costs to 0. A dummy flow with infinite capacity is added to represent the possibility that some resources are unassigned. The number of resources assigned to $t$ in a column $P_j$ is computed as:

\[ assigned(t) = \sum_{s \in S} flow[edge(a_{s,t}, b_{s,t})]. \]

Constraints are added to this slave problem so that $assigned(t) \le 1$ for all targets $t$. Thus, a value of 1 for $assigned(t)$ represents that the target $t$ is covered by the defender, while 0 represents that no defender resource was allocated to target $t$.

A partial MCNF graph for our earlier example is shown in Figure 3.2, showing paths for 3 of the 5 schedules. The paths correspond to schedules $\{t_1,t_2\}$, $\{t_2,t_3\}$ and $\{t_1,t_5\}$. The supply and demand are both 3, corresponding to the number of available FAMS. Double-bordered boxes mark the flows used to compute $assigned(t_1)$ and $assigned(t_2)$. Every joint schedule corresponds to a feasible flow in $G$. For example, the joint schedule $\{\{t_2,t_3\},\{t_1,t_5\}\}$ has a flow of 1 unit each through the paths corresponding to schedules $\{t_2,t_3\}$ and $\{t_1,t_5\}$, and a flow of 1 through the dummy. Similarly, any feasible flow through the graph $G$ corresponds to a feasible joint schedule, since all resource constraints are satisfied.

[Figure 3.2: Example Minimum Cost Network Flow graph for a SPARS problem instance. A single source node ($source_1$, supply 3) is connected to the sink (demand 3) through paths over target nodes $t_1$, $t_2$, $t_3$ and $t_5$ with edge capacities of 1, and through a dummy target and path with infinite capacity; the three paths shown each carry a flow of 1.]

It remains to define link costs such that the cost of a flow is the reduced cost for the joint schedule. We decompose $\bar{c}_j$ into a sum of cost coefficients per target, $\hat{c}_t$, so that $\hat{c}_t$ can be placed on links $(a_{s,t}, b_{s,t})$ for all targets $t$. $\hat{c}_t$ is defined as $w_t \cdot D_t + y_t \cdot A_t - z_t \cdot A_t$, where $w_t$, $y_t$ and $z_t$ are the $t$-th components of $w$, $y$ and $z$. $D_t$ is equal to $U_\Theta^c(t) - U_\Theta^u(t)$ and $A_t = U_\Psi^c(t) - U_\Psi^u(t)$.
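The per-target decomposition of the reduced cost can be checked numerically on the five joint schedules of the earlier FAMS example. This is a sketch under stated assumptions: the dual values $w$, $y$, $z$ and $h$ are made-up numbers rather than the output of a master problem, the function names are hypothetical, the payoffs are those of the example ($D_t = 1 - (-5) = 6$ and $A_t = -1 - 5 = -6$), and pricing is done by brute force over the five columns instead of via the network flow formulation.

```python
# Columns of P for the five-target example (one 0/1 vector per joint schedule).
columns = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 1],
    [0, 1, 1, 1, 1],
]
D = [6] * 5    # diagonal of D: U_Theta^c(t) - U_Theta^u(t)
A = [-6] * 5   # diagonal of A: U_Psi^c(t) - U_Psi^u(t)

# Hypothetical dual values for constraints 3.2, 3.3, 3.4 and 3.5.
w = [-1, 0, 0, 0, 0]
y = [0, 1, 0, 0, 0]
z = [0, 0, 0, 0, 1]
h = -3


def reduced_cost_matrix_form(Pj):
    """Equation 3.7: w^T(D P_j) + y^T(A P_j) - z^T(A P_j) - h."""
    n = len(Pj)
    return (sum(w[t] * D[t] * Pj[t] for t in range(n))
            + sum(y[t] * A[t] * Pj[t] for t in range(n))
            - sum(z[t] * A[t] * Pj[t] for t in range(n))
            - h)


# Per-target cost coefficients: c_hat_t = w_t D_t + y_t A_t - z_t A_t.
c_hat = [w[t] * D[t] + y[t] * A[t] - z[t] * A[t] for t in range(5)]


def reduced_cost_per_target(Pj):
    """Sum the coefficients of the covered targets, then subtract h."""
    return sum(c for c, covered in zip(c_hat, Pj) if covered) - h


costs = [reduced_cost_per_target(Pj) for Pj in columns]
best = min(range(len(columns)), key=lambda j: costs[j])
print(c_hat, costs, best)
```

Because the cost of a column is a sum over covered targets, it can be placed on the target links of the flow network, which is what lets a minimum cost flow stand in for the brute-force minimum taken here.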
The overall objective given below for the MCNF problem sums the contributions of the reduced cost from each individual flow and subtracts the dual variable $h$. If this is non-negative, no column can improve the master solution; otherwise, the optimal column (identified by the flow) is added to the master and the process iterates.

\[ \min_{flow} \sum_{(a_{s,t}, b_{s,t})} \hat{c}_t \cdot flow[(a_{s,t}, b_{s,t})] - h \]

3.1.2 Improving Branching and Bounds

ASPEN uses branch and bound to search over the space of possible attacker strategies. A standard technique in branch and price is to use LP relaxation, which allows the integer variables to take on arbitrary values (rather than just integers) to give an optimistic bound on the objective value of the original MIP. Our experimental results (Section 3.4) show that this generic method is ineffective in our domain. We introduce ORIGAMI-S, a novel branch and bound heuristic for SPARS based on ORIGAMI [Kiekintveld et al., 2009], which is an efficient solution method for security games with neither scheduling constraints nor heterogeneous resources. We use ORIGAMI-S to solve a relaxed version of SPARS that optimistically ignores scheduling constraints, and integrate this with ASPEN.

The ORIGAMI-S model is given in Equations 3.8–3.16. It minimizes the attacker's maximum payoff (Equations 3.8–3.10). The vector $q$ represents the attack set, and is 1 for every target that gives the attacker maximal expected payoff (Equation 3.10). The remaining non-trivial constraints restrict the coverage probabilities. ORIGAMI-S defines a set of probabilities $\tilde{c}_{t,s}$ that represent the coverage of each target $t$ in each schedule $s \in S_r$. The total coverage $c_t$ of target $t$ is the sum of coverage on $t$ across individual schedules (Equation 3.11). We define a set $T_r$ which contains one target from each schedule $s \in S_r$ (e.g., choosing the first target from every schedule). The total coverage assigned for every resource $r$ is bounded from above by 1 (Equation 3.12), analogous to the constraint that the total flow from $source_r$ in the MCNF network flow graph cannot be greater than the available supply of 1 resource.
[Footnote 3: Again, as in the MCNF graph, resources with the same set of constraints $S_r$ are combined into one constraint, with the available supply equal to the number of equivalent resources.] Total coverage is also bounded by multiplying the number of resources by the maximum size of any schedule ($L$) in Equation 3.13. The defender can never benefit by assigning coverage to nodes outside of the attack set, so these are constrained to 0 (Equation 3.14).

\[ \min \; k \tag{3.8} \]
\[ U_\Psi(c) = Ac + U_\Psi^u \tag{3.9} \]
\[ 0 \le k - U_\Psi(c) \le (1 - q) \cdot M \tag{3.10} \]
\[ c_t = \sum_{s \in S} \tilde{c}_{t,s} \quad \forall\, t \in T \tag{3.11} \]
\[ \sum_{s \in S_r} \tilde{c}_{T_r(s),s} \le 1 \quad \forall\, r \in R \tag{3.12} \]
\[ \sum_{t \in T} c_t \le L \cdot |R| \tag{3.13} \]
\[ c \le q \tag{3.14} \]
\[ q \in \{0, 1\} \tag{3.15} \]
\[ c, \tilde{c}_{t,s} \in [0, 1] \quad \forall\, t \in T, \; s \in S \tag{3.16} \]

ORIGAMI-S is solved once at the start of ASPEN, and targets in the attack set are sorted by expected defender reward. The maximum value is an initial upper bound on the defender reward. The first leaf node that ASPEN evaluates corresponds to this maximum-valued target (i.e., setting its attack value to 1), and a solution is found using column generation. This solution is a lower bound on the optimal solution, and the algorithm stops if this lower bound meets the ORIGAMI-S upper bound. Otherwise, a new upper bound from the ORIGAMI-S solution is obtained by choosing the second-highest defender payoff from targets in the attack set, and ASPEN evaluates the corresponding leaf node. This process continues until the upper bound is met, or the available nodes in the search tree are exhausted.

Theorem 1. The defender payoff computed by ORIGAMI-S is an upper bound on the defender's payoff for the corresponding SPARS problem. For any target not in the attack set of ORIGAMI-S, the restricted SPARS problem in which this target is attacked is infeasible.

Proof Sketch: ORIGAMI and ORIGAMI-S both minimize the maximum attacker payoff over a set of feasible coverage vectors. If there are no scheduling constraints, this also maximizes the defender's payoff [Kiekintveld et al., 2009]. Briefly, the size of the attack set in the solution is maximized, and the coverage probability on each target in the attack set is also maximal.
Both of these weakly improve the defender's payoff because adding coverage to a target is strictly better for the defender and worse for the adversary.

ORIGAMI-S makes optimistic assumptions about the coverage probability the defender can allocate by taking the maximum that could be achieved by any legal joint schedule and allowing it to be distributed arbitrarily across the targets, ignoring the scheduling constraints. To see this, consider the marginal probabilities $c^*$ of any legal defender strategy for SPARS. There is at least one feasible coverage strategy for ORIGAMI-S that gives the same payoff for the defender. Constraints 3.11 and 3.16 are satisfied by $c^*$, because they are also constraints of SPARS. Each variable $\tilde{c}_{T_r(s),s}$ in the set defined for Constraint 3.12 belongs to a single schedule associated with resource $r$, and at most 1 of these can be selected in any feasible joint schedule, so this constraint must also hold for $c^*$. Constraint 3.13 must be satisfied because it assumes that each available resource covers the largest possible schedule, so it generally allows excess coverage probability to be assigned. Finally, Constraint 3.14 may be violated by $c^*$ for some target $t$. However, the coverage vector with coverage identical to $c^*$ for all targets in the ORIGAMI-S attack set and 0 coverage outside the attack set has identical payoffs, since these targets are never attacked.

3.2 Column generation for joint patrolling schedules

Several previous algorithms, most notably ERASER-C [Kiekintveld et al., 2009], directly reason about the marginal coverage probabilities on targets $C$ as opposed to mixed strategies of the defender (refer Section 2.9.3). Using marginals avoids the need to represent the entire pure-strategy space of the defender, which can be prohibitively large. However, marginal probabilities cannot be used directly to sample a specific schedule for the defender to implement. Computing a full mixed strategy for the defender that can be used to sample a specific schedule in the general case has not been addressed in previous work (some limiting cases are addressed by [Korzhyk et al., 2010]).
Furthermore, it may not be feasible to compute mixed strategies from the marginal distributions computed by an algorithm since some constraints cannot be represented in a formulation based only on marginals. For example, ERASER-C only correctly computes solutions to SPARS problem instances when each schedule in the problem instance contains only 2 targets, and they can be organized in a bipartite graph, like a federal air marshal taking a departure flight and an arrival flight. We show here how the same column generation approach used in Aspen can also be used to generate joint schedules from the marginals output of such algorithms. This provides a competing approach to Aspen, which we will evaluate experimentally in the following section.

The objective of the following program is to find joint schedules $P$ and a probability distribution $x$ such that the coverage vector $Px$ is as close as possible to the marginals $c$ provided as input. We use the L1 norm to measure the distance between the coverage vector obtained by this approach and the marginals, since it may not always be feasible to convert marginals into implementable mixed strategies. The procedure below will return an objective of 0 only when the marginals $c$ are implementable as a mixed strategy $x$ over joint schedules $P$ by the defender.

$$\min_{x} \; \|Px - c\|_1 \quad \text{s.t.} \quad \sum_j x_j = 1, \quad x_j \ge 0 \tag{3.17}$$

The associated master problem is:

$$\min_{x,\,\gamma} \; \sum_{t \in T} \gamma_t \tag{3.18}$$
$$\text{s.t.} \quad Px - \gamma \le c \tag{3.19}$$
$$Px + \gamma \ge c \tag{3.20}$$
$$\sum_{j} x_j = 1 \tag{3.21}$$
$$x \ge 0 \tag{3.22}$$

The slave problem in this case is the same as the one used in Aspen, where the reduced cost of a joint schedule is:

$$\bar{c}_j = (w_1 - w_2)^T P_j - \sigma \tag{3.23}$$

where $w_1$, $w_2$ and $\sigma$ are the optimal dual variables of the current master problem associated with Constraints 3.19, 3.20, and 3.21. Again, the reduced cost $\bar{c}_j$ can be decomposed into reduced cost coefficients per target $\hat{c}_t$, which can be computed using Equation 3.24 as follows:

$$\hat{c}_t = (w_{1t} - w_{2t}) \tag{3.24}$$

Thus, the column generation approach of Aspen can be combined with any algorithm that computes marginal probabilities, producing a mixed strategy for the defender consistent with scheduling constraints in the domain.
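As a small illustration of the objective in Equation 3.17, the following sketch evaluates the L1 gap between a coverage vector $Px$ and input marginals $c$ for a fixed set of joint schedules. It does not perform the column generation itself; the toy instance (three targets, two joint schedules) and the function names `coverage` and `l1_gap` are assumptions made for this example only.

```python
from fractions import Fraction

def coverage(P, x):
    """Marginal coverage vector Px.

    P is a list of joint schedules (the columns), each a 0/1 list over targets;
    x holds one probability per joint schedule."""
    n_targets = len(P[0])
    return [sum(x[j] * P[j][t] for j in range(len(P))) for t in range(n_targets)]

def l1_gap(P, x, c):
    """L1 distance ||Px - c||_1 between achieved coverage and target marginals."""
    return sum(abs(px - ct) for px, ct in zip(coverage(P, x), c))

# Hypothetical instance: schedule 0 covers targets 0 and 1, schedule 1 covers 1 and 2.
P = [[1, 1, 0], [0, 1, 1]]
x = [Fraction(1, 2), Fraction(1, 2)]

# These marginals are implementable by x, so the gap is zero.
c_ok = [Fraction(1, 2), Fraction(1), Fraction(1, 2)]
gap_ok = l1_gap(P, x, c_ok)

# Target 1 is covered by both schedules, so coverage 1/2 on every target
# cannot be reproduced with these two columns; the gap stays positive.
c_bad = [Fraction(1, 2), Fraction(1, 2), Fraction(1, 2)]
gap_bad = l1_gap(P, x, c_bad)
```

In the second case every distribution over the two schedules covers target 1 with probability 1, so a gap of 1/2 is the best that can be achieved with this column set; a positive optimum of this kind is what signals non-implementable marginals.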
3.3 SPPC Domain

I propose a branch-and-price based formulation to compute optimal defender strategies in this domain as well. Again, branch and bound search is used to address the integer variables: each branch sets the values for some integer variable, whereas column generation is used to scale up the computation to very large input problems. There is a binary variable associated with each attacker: either an attacker chooses to attack or he does not. The binary variables are assigned values using a branch and bound tree, where each branch of this tree assigns a specific value to an attacker variable. Thus, each leaf of this tree assigns a value for every attacker, that is, for every binary variable. I now present the algorithm for the case with multiple attackers; the formulation for a Bayesian single attacker is presented in Chapter 5. As mentioned above, I use branch-and-price to compute solutions for this problem, and the column generation procedure used here is presented next.

3.3.1 Column Generation

Column generation is used to solve each node of the above branch and bound tree. The problem at each leaf is formulated as a linear program, which is then decomposed into a master problem and a slave problem. The master solves for the defender strategy $x$, given a restricted set of tours $T$. The objective function for the slave is updated based on the solution of the master, and the slave is solved to identify the best new column to add to the master problem, using reduced costs (explained later). If no tour can improve the solution further, the column generation procedure terminates.

3.3.1.1 Master Formulation

The master problem solves for the probability distribution $x$ that maximizes the defender's expected utility given a limited number of patrol tours $T$. The defender's expected utility is a sum of defender utilities $d_s$ over all the targets $s$. The master formulation is given in Equations (3.25) to (3.31). The notation is described in Table 2.6. Equations (3.27) and (3.28) capture the payoff of the defender.
They ensure that $d_s$ is upper bounded by the payoff to the defender at target $s$: Equation (3.27) captures the payoff when the attacker chooses to attack $s$ (i.e., $q_s = 1$), whereas (3.28) captures the defender's payoff when the attacker chooses not to attack $s$ (i.e., $q_s = 0$). Similarly, Equations (3.29) and (3.30) capture the payoff of the attacker. They ensure that the assignment $q_s = 1$ is feasible if and only if the payoff to the attacker for attacking the target $s$, $\left(\sum_{t \in T} x_t z_{st}\right)(-P - R_s) + R_s$, is greater than 0, the attacker's payoff for not attacking target $s$. Equations (3.26) and (3.31) ensure that the strategy $x$ is a valid probability distribution.

$$\min_{x,\,y,\,d,\,q} \; -\sum_{s \in S} d_s \tag{3.25}$$
$$\text{s.t.} \quad \sum_{t \in T} x_t \le 1 \tag{3.26}$$
$$d_s - \sum_{t \in T} x_t z_{st} (\tau_s + R_s) + M q_s \le M - R_s \tag{3.27}$$
$$d_s - M q_s \le 0 \tag{3.28}$$
$$M q_s (P + R_s) + \sum_{t \in T} x_t z_{st} (P + R_s) \le M + R_s \tag{3.29}$$
$$-M q_s (P + R_s) - \sum_{t \in T} x_t z_{st} (P + R_s) \le -R_s \tag{3.30}$$
$$x_t \in [0,1] \tag{3.31}$$

3.3.1.2 Slave Formulation

The slave problem finds the best patrol tour to add to the current set of tours $T$. This is done using reduced cost, which captures the total change in the defender payoff if a tour is added to the set of tours $T$. The candidate tour with the minimum reduced cost improves the objective value the most [Bertsimas and Tsitsiklis, 1994]. The reduced cost $c_t$ of variable $x_t$, associated with tour $t$, is given in Equation 3.32, where $w$, $y$, $v$ and $h$ are dual variables of master constraints (3.27), (3.29), (3.30) and (3.26) respectively. The dual variable measures the influence of the associated constraint on the objective, and can be calculated using standard techniques:

$$c_t = \sum_{s \in S} \big( w_s (\tau_s + R_s) + (v_s - y_s)(P + R_s) \big) z_{st} - h \tag{3.32}$$

One approach to identify the tour with the minimum reduced cost would be to iterate through all possible tours, compute their reduced costs, and then choose the one with the least reduced cost. However, we propose a minimum-cost integer network flow formulation that efficiently finds the optimal column (tour). Feasible tours in the domain map to feasible flows in the network flow formulation and vice versa.
The minimum cost network flow graph is constructed in the following manner. A virtual source and virtual sink are constructed to mark the beginning and ending locations, i.e., the home base, of a defender tour. These two virtual nodes are directly connected by an edge signifying the "Not attack" option for the attacker. As many levels of nodes are added to the graph as the number of targets. Each level contains nodes for every target. There are links from every node on level $i$ to every node on level $i+1$. Each node on every level $i$ is also directly connected to the sink. Additionally, the length of the edge between any two nodes is the Euclidean distance between the two corresponding targets. Constraints are added to the slave problem to disallow a tour that covers two nodes corresponding to the same target (i.e., a network flow going through nodes (1,1) and (2,1) in the figure would be disallowed since both these nodes correspond to target 1). An additional constraint is added to the slave to ensure that the total length of every flow (i.e., the sum of lengths of edges with a non-zero flow) is less than the specified upper bound $L$. Thus, the slave is set up such that there exists a one-to-one correspondence between a flow generated by the slave problem and a patrol route that the defender can undertake. Figure 3.3 shows an example graph for the slave.

Figure 3.3: This figure shows an example network-flow based slave formulation. There are as many levels in the graph as the number of targets. Each node represents a specific target. A path from the source to the sink maps to a tour taken by the defender. (Graph with a virtual source, a virtual sink, a "Not Attack" edge between them, and levels 1 to N of nodes (i, 1), ..., (i, N), one per target.)

Each node representing a target is split into two dummy nodes with an edge between them. Link costs are put on these edges. The costs on these graphs are defined by decomposing the reduced cost of a tour, $c_t$, into reduced costs over individual targets, $\hat{c}_s$. We decompose $c_t$ into a sum of cost coefficients per target $\hat{c}_s$, so that $\hat{c}_s$ can be placed on the edges between the two dummy nodes of each target. The $\hat{c}_s$ are defined as follows:

$$c_t = \sum_{s \in S} \hat{c}_s z_{st} - h \tag{3.33}$$
$$\hat{c}_s = \big( w_s (\tau_s + R_s) + (v_s - y_s)(P + R_s) \big) \tag{3.34}$$

3.4 Experimental Results

3.4.1 Comparison Results

This section presents detailed results of using strategy generation for computing solutions for domains with a large number of pure strategies for one player. The results in this section focus solely on the SPARS domain (the trends for the SPPC domain are similar and are analyzed in more depth in Chapter 6). Here, I compare the performance of Aspen (Section 3.1.1), Aspen without the branch-and-bound heuristic of ORIGAMI-S (henceforth referred to as BnP), and the total runtime of ERASER-C [Kiekintveld et al., 2009]. In our results, we present runtimes for ERASER-C that include the generation of the defender's mixed strategy using our column generation approach (refer Section 3.2) in order to provide a fair comparison with ASPEN. For this experiment, we generate random instances of SPARS problems [Kiekintveld et al., 2009] with schedules of size two organized in a bipartite graph, e.g., with one departure flight and one arrival flight drawn from disjoint sets (this ensures that ERASER-C can find a correct optimal solution). We vary the number of targets, defender resources, and schedules.

All experiments are based on 30 sample games, and problem instances that took longer than 30 minutes to run were terminated. The results of the first experiment are shown in Figure 3.4. The y-axis shows the runtime in seconds on the log scale. The x-axis shows the number of targets. These results show that for a low number of targets (less than 100), ERASER-C is the fastest algorithm. However, ERASER-C starts to lose its performance advantage when the number of targets is increased. BnP is the slowest (as expected), and scales much more poorly than the other algorithms. To further examine the advantage offered by Aspen, we increased the number of targets to 200 and beyond in the second set of experiments.
The results of varying the number of defender resources for 200 targets are shown in Figure 3.5(a). As seen in Figure 3.5, Aspen is the fastest of the three algorithms for a large number of targets. The effectiveness of the ORIGAMI-S bounds and branching is clear in the comparison with the standard BnP method. Since Aspen solves a far more general set of security games (SPARS), we would not expect it to be competitive with ERASER-C in its specialized domain. However, Aspen was 6 times faster than ERASER-C in some instances. This improvement over ERASER-C was an unexpected trend, and can be attributed to the number of columns generated by the two approaches (Table 3.1). We observe similar results in the second and third data sets presented in Figures 3.5(b) and 3.5(c).

Figure 3.4: Comparison between ERASER-C, Aspen and BnP for SPARS problem instances with 10 resources (runtime in seconds, log scale, against the number of targets). The number of schedules for these experiments was two times the number of targets. In this figure and others with the y-axis on the log scale, error bars are not shown since they are not prominent because of the logarithmic axis. In this experiment, the difference in runtime for all pair-wise comparisons was statistically significant except between ERASER-C and Aspen for 40 and 50 targets.

Resources    ASPEN    ERASER-C    BnP (max. 30 mins)
10           126      204         1532
20           214      308         1679
30           263      314         1976
40           227      508         1510
50           327      426         1393

Table 3.1: This table shows the average number of columns (pure strategies for the defender) generated by the column generation approach for all three algorithms: Aspen, BnP, and column generation on marginals generated by ERASER-C. These results are averaged over 30 SPARS problem instances, with 200 targets and 600 schedules where each schedule is of size 2. These results confirm the presence of small support sets in such problem instances: the number of targets in these experiments was 200, which implies that the maximum number of pure strategies can be of the order of $10^{16}$ for 10 resources up to $10^{47}$ for 50 resources.
3.4.2 ASPEN on Large SPARS Instances

We also evaluate the performance of Aspen on arbitrary scheduling problems as the size of the problem is varied to include very large instances. No comparisons could be made because ERASER-C does not handle arbitrary schedules and the only correct algorithms known, DOBSS [Paruchuri et al., 2008a] and BnP (Aspen without the ORIGAMI-S branch-and-bound heuristic), do not scale to these problem sizes. We vary the number of resources, schedules, and targets as before. In addition, we vary the number of targets per schedule for each of the three cases to test more complex scheduling problems. Figure 3.6(a) shows the runtime results with 1000 feasible schedules and 200 targets, averaged over 10 samples. The x-axis shows the number of resources, and the y-axis shows the runtime in seconds. Each line represents a different number of targets per schedule. The number of joint schedules in these instances can be as large as $10^{23}$ ($\binom{1000}{10} \approx 2.6 \times 10^{23}$).

Figure 3.5: These figures present results comparing the runtime required by Aspen, Aspen without the ORIGAMI-S branch-and-bound heuristic (BnP) and ERASER-C with column generation on SPARS instances with 2 schedules per target. The y-axis shows the runtime in seconds in log scale, whereas the x-axis varies an input parameter, as specified in the figure. Again, we do not show the error bars because of the log scale of the y-axis, but Aspen was the fastest algorithm with statistical significance. Panels: (a) Resources (200 targets, 600 schedules); (b) Schedules (200 targets, 10 resources); (c) Targets (1000 schedules, 10 resources).
Interestingly, the runtime does not increase much when the number of resources is increased from 10 to 20 when there are 5 targets per schedule. Column 4 of Table 3.2 illustrates that the key reason for the constant runtime is that the average number of generated columns remains similar. For a fixed number of resources, we observe an increase in runtime for more complex schedules that corresponds with an increase in the number of columns generated. The other two experiments, Figures 3.6(b) and 3.6(c), also show similar trends.

Resources    3 Targets/schedule    4 Targets/schedule    5 Targets/schedule
5            456                   518                   658
10           510                   733                   941
15           649                   920                   1092
20           937                   1114                  1124

Table 3.2: This table shows the average number of columns (pure strategies for the defender) generated by the column generation approach for Aspen. These results are averaged over 30 SPARS problem instances, with 200 targets and 1000 schedules.

Figure 3.6: These figures present the runtime results for Aspen when the size of the input problem is scaled. The y-axis shows the runtime in seconds, whereas the x-axis varies an input parameter, as specified in the figure. Each panel plots one line for 2, 3, 4 and 5 targets per schedule. Panels: (a) Resources (200 targets, 1000 schedules); (b) Schedules (200 targets, 10 resources); (c) Targets (1000 schedules, 10 resources).
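The strategy-space sizes quoted in this section are binomial coefficients and can be checked directly. The short sketch below does so with Python's standard library. It reads the quoted figures as "choose 10 (or 50) of 200 targets" and "choose 10 of 1000 schedules", which is an interpretation of the text rather than something the text spells out; the helper name `order_of_magnitude` is made up for the example.

```python
from math import comb

def order_of_magnitude(n):
    """Exponent of the leading power of ten of a positive integer."""
    return len(str(n)) - 1

# Pure-strategy counts quoted with Table 3.1: 200 targets, 10 and 50 resources.
strategies_10_resources = comb(200, 10)
strategies_50_resources = comb(200, 50)

# Joint schedules quoted in Section 3.4.2: 10 resources over 1000 schedules.
joint_schedules = comb(1000, 10)

magnitudes = (
    order_of_magnitude(strategies_10_resources),
    order_of_magnitude(strategies_50_resources),
    order_of_magnitude(joint_schedules),
)
```

Against these counts, the few hundred to two thousand columns reported in Tables 3.1 and 3.2 are a vanishing fraction of the full strategy space, which is the small-support observation the tables are meant to support.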
Chapter 4: Strategy generation for both players

This chapter focuses on the network security domain (NSD), and shows how strategy generation for both players can be used to compute solutions for extremely large problems with exponentially many pure strategies for both players. However, before I proceed to describe my algorithms, Rugged and Snares, I first describe why Ranger, the best previous approach, failed to generate correct solutions for NSD problem instances. The Ranger algorithm as used in this chapter is described in Section 2.9.4.

4.1 Ranger Counterexample

Ranger was introduced by Tsai et al. [Tsai et al., 2010] and was designed to obtain approximate solutions for the defender for the network security game. Its main component is a polynomial-sized linear program that, rather than solving for a distribution over allocations, solves for the marginal probability with which the defender covers each edge. It does this by approximating the capture probability as the sum of the marginals along the attacker's path. It further presents some sampling techniques to obtain a distribution over defender allocations from these marginals. What was known before was that the Ranger solution (regardless of the sampling method used) is suboptimal in general, because it is not always possible to find a distribution over allocations such that the capture probability is indeed the sum of marginals on the path. In this section, we show that Ranger's error can be arbitrarily large.

Let us consider the example graph shown in Figure 4.1. This multi-graph¹ has a single source node, $s$, and two targets, $t_1$ and $t_2$; the defender has 2 resources. Furthermore, the payoffs $\mathcal{T}$ of the targets are defined to be 1 and 2 for targets $t_1$ and $t_2$ respectively.

Figure 4.1: This example is solved incorrectly by Ranger. The variables $a$, $b$ are the coverage probabilities on the corresponding edges. (Three parallel edges from $s$ to $t_1$, each labeled $a$, and one edge from $t_1$ to $t_2$ labeled $b$.)

Ranger solution:

Suppose Ranger puts marginal coverage probability $a$ on each of the three edges between $s$ and $t_1$,² and probability $b$ on the edge between $t_1$ and $t_2$, as shown in Figure 4.1.
Ranger estimates that the attacker gets caught with probability $a$ when attacking target $t_1$ and probability $a+b$ when attacking target $t_2$. Ranger will attempt to make the attacker indifferent between the two targets to obtain the minimax equilibrium. Thus, Ranger's output is $a = 3/5$, $b = 1/5$, obtained from the following system of equations:

$$1(1 - a) = 2(1 - (a + b)) \tag{4.1}$$
$$3a + b = 2 \tag{4.2}$$

¹ We use a multi-graph for simplicity. This counterexample can easily be converted into a similar counterexample that has no more than one edge between any pair of nodes in the graph.
² We can assume without loss of solution quality that symmetric edges will have equal coverage.

However, there is no distribution over allocations of 2 resources to the edges such that the probability of the attacker being caught on his way to $t_1$ is 3/5 and the probability of the attacker being caught on his way to $t_2$ is 4/5. (The reason is that in this example, the event of there being a defensive resource on the second edge in the path cannot be disjoint from the event of there being one on the first edge.) Every allocation that covers the edge between $t_1$ and $t_2$ also covers one of the three edges between $s$ and $t_1$, so the three events in which both edges of a path to $t_2$ are covered have total probability $b = 1/5$, and at least one of them has probability at least 1/15. On that path the attacker is caught with probability at most $a + b - 1/15 = 11/15$. Hence, for this Ranger solution, the defender utility cannot be greater than $-2(1 - 11/15) = -8/15$; if the sampled distribution concentrates the coverage of the second edge on a single path, the capture probability on that path drops to 3/5 and the defender utility to $-2(1 - 3/5) = -4/5$.

Optimal solution:

Figure 4.2 shows the six possible allocations of the defender's two resources to the four edges. Three of them block some pair of edges between $s$ and $t_1$. Suppose that each of these three allocations is played by the defender with probability $x$.³ Each of the other three allocations blocks one edge between $s$ and $t_1$ as well as the edge between $t_1$ and $t_2$. Suppose the defender chooses these allocations with probability $y$ each (refer Figure 4.2).

Figure 4.2: The possible allocations of two resources to the four edges. The blocked edges are shown in bold. The probabilities ($x$ or $y$) are shown next to each allocation.

³ Again, this can be assumed without loss of generality for symmetric edges.

The probability of the attacker being caught on his way to $t_1$ is $\frac{2}{3} \cdot 3x + \frac{1}{3} \cdot 3y$, or $2x + y$. Similarly, the probability of the attacker being caught on his way to $t_2$ is $2x + 3y$. Thus, a minimax strategy for this problem is the solution of Equations (4.3) and (4.4), which make the attacker indifferent between targets $t_1$ and $t_2$.

$$1(1 - 2x - y) = 2(1 - 2x - 3y) \tag{4.3}$$
$$3x + 3y = 1 \tag{4.4}$$

The solution to the above system is $x = 2/9$, $y = 1/9$, so that the expected attacker utility is 4/9. Thus, the expected defender utility is $-4/9$, which is higher than the expected defender utility of at most $-8/15$ resulting from using Ranger.

Ranger sub-optimality:

Suppose the payoff $\mathcal{T}(t_2)$ of target $t_2$ in the example above was $H$, $H > 1$. The Ranger solution in this case, again obtained using Equations 4.1 and 4.2, would be $a = \frac{H+1}{2H+1}$, $b = \frac{H-1}{2H+1}$. Then, consider an attacker who attacks the target $t_2$ by first going through one of the three edges from $s$ to $t_1$ uniformly at random (and then on to $t_2$). The attacker will fail to be caught on the way from $t_1$ to $t_2$ with probability $(1 - b)$, given that the defender's strategy is consistent with the output of Ranger. Even conditional on this failure, the attacker will fail to be caught on the way from $s$ to $t_1$ with probability at least 1/3, because the defender has only 2 resources. Thus, the probability of a successful attack on $t_2$ is at least $(1 - b)(1/3)$, and the attacker's best-response utility is at least:

$$\frac{H(1 - b)}{3} = \frac{H(H + 2)}{3(2H + 1)} > \frac{H(H + 0.5)}{3(2H + 1)} = \frac{H}{6} \tag{4.5}$$

Thus, the true defender utility for any strategy consistent with Ranger is at most $-\frac{H}{6}$.

Now, consider another defender strategy in which the defender always blocks the edge from $t_1$ to $t_2$, and also blocks one of the three edges between $s$ and $t_1$ uniformly at random. For such a defender strategy, the attacker can reach $t_1$ with probability 2/3, but cannot reach target $t_2$ at all. Thus, the attacker's best-response utility in this case is 2/3. Therefore, the optimal defender utility is at least $-2/3$. It follows that any solution consistent with Ranger is suboptimal by a factor of at least $\frac{H}{6} / \frac{2}{3} = \frac{H}{4}$. Since $H$ is arbitrary, Ranger solutions can be arbitrarily suboptimal.
This motivates our exact, double-oracle algorithm, Rugged.

4.2 Rugged

In this section, I present Rugged, a double-oracle based algorithm for network security games. We also analyze the computational complexity of determining best responses for both the defender and the attacker, and, to complete the Rugged algorithm, we give algorithms for computing the best responses.

4.2.1 Algorithm Description

The algorithm Rugged is presented as Algorithm 2. $X$ is the set of defender allocations generated so far, while $A$ is the set of attacker paths generated so far. CoreLP($X$, $A$) finds an equilibrium (and hence, minimax and maximin strategies) of the two-player zero-sum game consisting of the sets of pure strategies, $X$ and $A$, generated so far. CoreLP returns $x$ and $a$, which are the current equilibrium mixed strategies for the defender and the attacker over $X$ and $A$ respectively. The defender oracle (DO) generates a defender allocation $\Lambda$ that is a best response for the defender against $a$. (This is a best response among all allocations, not just those in $X$.) Similarly, the attacker oracle (AO) generates an attacker path $\Gamma$ that is a best response for the attacker against $x$.

Algorithm 2 Double Oracle for Urban Network Security
    1. Initialize X by generating arbitrary candidate defender allocations.
    2. Initialize A by generating arbitrary candidate attacker paths.
    repeat
        3. (x, a) ← CoreLP(X, A).
        4a. Λ ← DO(a).
        4b. X ← X ∪ {Λ}.
        5a. Γ ← AO(x).
        5b. A ← A ∪ {Γ}.
    until convergence
    7. Return (x, a)

The double oracle algorithm thus starts with a small set of pure strategies for each player, and then grows these sets in every iteration by applying the best-response oracles to the current solution. Execution continues until convergence is detected. Convergence is achieved when the best-response oracles of both the defender and the attacker do not generate a pure strategy that is better for that player than the player's strategy in the current solution (holding the other player's strategy fixed). In other words, convergence is obtained if, for both players, the reward given by the best-response oracle is no better than the reward for the same player given by the CoreLP.
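The loop of Algorithm 2 can be sketched in a few lines. The version below is an illustration under several stated assumptions, not the implementation used in this chapter: the restricted game is solved approximately by fictitious play instead of a linear program (a stand-in for CoreLP), both oracles simply enumerate all pure strategies (the actual oracles are the optimization problems described later), and the loop stops when neither oracle returns a strategy outside the current sets. The function names and the edge labels are hypothetical; the test game is the counterexample graph of Section 4.1.

```python
from itertools import combinations

def fictitious_play(U, iters=20000):
    """Approximate equilibrium of the matrix game U (row player maximizes).
    Stand-in for CoreLP; returns empirical mixed strategies (x, a)."""
    m, n = len(U), len(U[0])
    row_counts, col_counts = [0] * m, [0] * n
    row_score, col_score = [0.0] * m, [0.0] * n
    i = j = 0
    for _ in range(iters):
        row_counts[i] += 1
        col_counts[j] += 1
        for r in range(m):
            row_score[r] += U[r][j]   # payoff of row r against the column history
        for c in range(n):
            col_score[c] += U[i][c]   # payoff conceded by column c to the row history
        i = max(range(m), key=row_score.__getitem__)
        j = min(range(n), key=col_score.__getitem__)
    return [v / iters for v in row_counts], [v / iters for v in col_counts]

def double_oracle(all_X, all_A, utility):
    """utility(allocation, attack) is the defender's payoff (zero-sum game)."""
    X, A = [all_X[0]], [all_A[0]]
    while True:
        U = [[utility(Xi, Aj) for Aj in A] for Xi in X]
        x, a = fictitious_play(U)
        # Defender oracle: best allocation overall against the attacker mix a.
        best_X = max(all_X, key=lambda L: sum(p * utility(L, Aj) for p, Aj in zip(a, A)))
        # Attacker oracle: best path overall against the defender mix x.
        best_A = min(all_A, key=lambda G: sum(p * utility(Xi, G) for p, Xi in zip(x, X)))
        grew = False
        if best_X not in X:
            X.append(best_X)
            grew = True
        if best_A not in A:
            A.append(best_A)
            grew = True
        if not grew:
            return X, x, A, a

# Counterexample graph: three s-t1 edges (assumed labels e1..e3), one t1-t2 edge f.
EDGES = ["e1", "e2", "e3", "f"]
PAYOFF = {"t1": 1.0, "t2": 2.0}
all_X = [frozenset(c) for c in combinations(EDGES, 2)]
all_A = [("t1", frozenset({e})) for e in EDGES[:3]] + \
        [("t2", frozenset({e, "f"})) for e in EDGES[:3]]

def utility(allocation, attack):
    target, path = attack
    return 0.0 if allocation & path else -PAYOFF[target]

X, x, A, a = double_oracle(all_X, all_A, utility)
# Attacker's best-response payoff against the returned defender strategy.
attacker_value = max(sum(p * -utility(Xi, G) for p, Xi in zip(x, X)) for G in all_A)
```

Because fictitious play is only approximate, `attacker_value` should land slightly above the exact game value of 4/9 rather than on it; replacing `fictitious_play` with an exact LP solver would recover the guarantee discussed in the text.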
The correctness of best-response-based double oracle algorithms for two-player zero-sum games has been established by McMahan et al. [McMahan et al., 2003]; the intuition for this correctness is as follows. Once the algorithm converges, the current solution must be an equilibrium of the game, because each player's current strategy is a best response to the other player's current strategy. This follows from the fact that the best-response oracle, which searches over all possible strategies, cannot find anything better. Furthermore, the algorithm must converge, because at worst, it will generate all pure strategies.

4.2.2 CoreLP

The purpose of CoreLP is to find an equilibrium of the restricted game consisting of defender pure strategies $X$ and attacker pure strategies $A$. Below is the standard formulation for computing a maximin strategy for the defender in a two-player zero-sum game.

$$\max_{U_d^*,\, x} \; U_d^* \tag{4.6}$$
$$\text{s.t.} \quad U_d^* \le U_d(x, A_j) \quad \forall j = 1, \ldots, |A| \tag{4.7}$$
$$\mathbf{1}^T x = 1 \tag{4.8}$$
$$x \in [0,1]^{|X|} \tag{4.9}$$

The defender's mixed strategy $x$, defined over $X$, and utility $U_d^*$ are the variables for this problem. Inequality (4.7) is a family of constraints; there is one constraint for every attacker path $A_j$ in $A$. The function $U_d(x, A_j)$ is the defender's expected utility when the attacker plays path $A_j$. Given $A_j$, the probability that the attacker is caught is the sum of the probabilities of the defender allocations that would catch the attacker. (We can sum these probabilities because they correspond to disjoint events.) More precisely, let $z_{ij}$ be an indicator for whether allocation $X_i$ intersects with path $A_j$, that is,

$$z_{ij} = \begin{cases} 1 & \text{if } X_i \cap A_j \neq \emptyset \\ 0 & \text{otherwise} \end{cases} \tag{4.10}$$

These $z_{ij}$ are not variables of the linear program; they are parameters that are determined at the time the best responses are generated. Then, the probability that an attacker playing path $A_j$ is caught is $\sum_i z_{ij} x_i$, and the probability that he is not caught is $\sum_i (1 - z_{ij}) x_i$.
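The indicator $z_{ij}$ and the two probabilities just defined translate directly into code. The following sketch uses a made-up restricted game with two allocations and two paths; the edge labels and function names are assumptions for illustration only.

```python
from fractions import Fraction

def intersects(allocation, path):
    """Indicator z_ij of Equation (4.10): 1 if the allocation covers some edge of the path."""
    return 1 if set(allocation) & set(path) else 0

def capture_probability(x, allocations, path):
    """Sum of z_ij * x_i over the defender's pure strategies."""
    return sum(x_i * intersects(X_i, path) for x_i, X_i in zip(x, allocations))

def escape_probability(x, allocations, path):
    """Sum of (1 - z_ij) * x_i over the defender's pure strategies."""
    return sum(x_i * (1 - intersects(X_i, path)) for x_i, X_i in zip(x, allocations))

# Hypothetical restricted game: two allocations of two resources, two attacker paths.
allocations = [("e1", "e2"), ("e3", "f")]
x = [Fraction(1, 4), Fraction(3, 4)]
path_to_t1 = ("e1",)
path_to_t2 = ("e2", "f")

capture_t1 = capture_probability(x, allocations, path_to_t1)
escape_t1 = escape_probability(x, allocations, path_to_t1)
capture_t2 = capture_probability(x, allocations, path_to_t2)
```

Since each pure allocation either intersects a path or does not, the capture and escape probabilities of any path sum to 1, which is why summing over allocations is legitimate here.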
Thus, the payoff function $U_d(x, A_j)$ for the defender for choosing a mixed strategy $x$ when the attacker chooses path $A_j$ is given by Equation (4.11), where $\mathcal{T}(t_j)$ is the attacker's payoff for reaching $t_j$.

$$U_d(x, A_j) = -\mathcal{T}(t_j) \cdot \Big( \sum_i (1 - z_{ij}) x_i \Big) \tag{4.11}$$

The dual variables corresponding to Inequality (4.7) give the attacker's mixed strategy $a$, defined over $A$. The expected utility for the attacker is given by $-U_d^*$.

4.2.3 Defender Oracle

This section concerns the best-response oracle problem for the defender. The Defender Oracle problem is stated as follows: generate the defender pure strategy (resource allocation) $\Lambda$ allocating $k$ resources over the edges $E$ that maximizes the defender's expected utility against a given attacker mixed strategy $a$ over paths $A$.

Defender Oracle problem is NP-hard: We show this by reducing the set cover problem to it.

The Set-Cover problem: Given are a set $U$, a collection $S$ of subsets of $U$ (that is, $S \subseteq 2^U$), and an integer $k$. The question is whether there is a cover $C \subseteq S$ of size $k$ or less, that is, $\bigcup_{c \in C} c = U$ and $|C| \le k$. We will use a modification of this well-known NP-hard problem so that $S$ always contains all singleton subsets of $U$, that is, $x \in U$ implies $\{x\} \in S$. This modified problem remains NP-hard.

Theorem 2. The Defender Oracle problem is NP-hard, even if there is only a single source and a single target.

Proof. Reduction from Set-Cover to Defender Oracle: We convert an arbitrary instance of the set cover problem to an instance of the defender oracle problem by constructing a graph $G$ with just 3 nodes, as shown in Figure 4.3. The graph $G$ is a multi-graph⁴ with just three nodes, so that $N = \{s, v, t\}$, where $s$ is the only source and $t$ is the only target (with arbitrary positive value). There are up to $|S|$ loop edges adjacent to node $v$; each loop edge corresponds to a unique non-singleton subset in $S$. There are $|U|$ edges between $s$ and $v$, each corresponding to a unique element in $U$. There are also $|U|$ edges between $v$ and $t$, each corresponding to a unique element in $U$. The attacker's paths correspond to the elements in $U$.
A path that corresponds to $u \in U$ starts with the edge between $s$ and $v$ that corresponds to $u$, then loops through all the edges that correspond to non-singleton subsets in $S$ that contain $u$, and finally ends with the edge between $v$ and $t$ that corresponds to $u$. Hence, any two paths used by the attacker can only intersect at the loop edges. The probabilities that the attacker places on these paths are arbitrary positive numbers.

Figure 4.3: A defender oracle problem instance corresponding to the SET-COVER instance with $U = \{1,2,3\}$, $S = \{\{1\},\{2\},\{3\},\{1,2\},\{1,3\}\}$. Here, the attacker's mixed strategy uses three paths: $(e_1, e_{1,2}, e_{1,3}, e'_1)$, $(e_2, e_{1,2}, e'_2)$, $(e_3, e_{1,3}, e'_3)$. Thus, the SET-COVER instance has a solution of size 2 (for example, using $\{1,2\}$ and $\{1,3\}$); correspondingly, with 2 resources, the defender can always capture the attacker (for example, by covering $e_{1,2}$, $e_{1,3}$).

We now show that set $U$ can be covered with $k$ subsets in $S \subseteq 2^U$ if and only if the defender can block all of the attacker's paths with $k$ resources in the corresponding defender oracle problem instance.

The "if" direction: If the defender can block all the paths used by the attacker with $k$ resources, then the set $U$ can be covered with $C \subseteq S$, where $|C| = k$ and $C$ is constructed as follows. If the defender places a resource on a loop edge, then $C$ includes the non-singleton subset in $S$ that corresponds to that loop edge. If the defender blocks any other edge, then $C$ includes the corresponding singleton subset.

The "only if" direction: If there exists a cover $C$ of size $k$, then the defender can block all the paths by placing a defensive resource on every loop edge that corresponds to a non-singleton subset in $C$, and placing a defensive resource on the corresponding edge out of $s$ for every singleton subset in $C$. ∎

⁴ Having a multi-graph is not essential to the NP-hardness reduction.
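The correspondence used in the reduction can be checked by brute force on the small instance of Figure 4.3. The sketch below builds the attacker paths from a set-cover instance and compares the smallest blocking allocation with the smallest cover. The edge encoding (`("sv", u)`, `("vt", u)`, `("loop", subset)`) and the function names are assumptions of this example, and exhaustive enumeration is only sensible for toy inputs.

```python
from itertools import combinations

def build_paths(universe, subsets):
    """One attacker path per element u: the s-v edge for u, every loop edge whose
    non-singleton subset contains u, and the v-t edge for u."""
    loops = [frozenset(S) for S in subsets if len(S) > 1]
    return {u: {("sv", u), ("vt", u)} | {("loop", S) for S in loops if u in S}
            for u in universe}

def min_blocking_size(paths):
    """Fewest edges whose removal intersects every attacker path (exhaustive search)."""
    edges = list(set().union(*paths.values()))
    for k in range(len(edges) + 1):
        for allocation in combinations(edges, k):
            if all(set(allocation) & path for path in paths.values()):
                return k
    return None

def min_cover_size(universe, subsets):
    """Fewest subsets whose union is the whole universe (exhaustive search)."""
    sets = [frozenset(S) for S in subsets]
    for k in range(len(sets) + 1):
        for cover in combinations(sets, k):
            if set().union(*cover) == set(universe):
                return k
    return None

# Instance of Figure 4.3.
universe = {1, 2, 3}
subsets = [{1}, {2}, {3}, {1, 2}, {1, 3}]

paths = build_paths(universe, subsets)
resources_needed = min_blocking_size(paths)
cover_size = min_cover_size(universe, subsets)
```

On this instance both quantities equal 2: covering the loop edges for {1, 2} and {1, 3} blocks all three paths, and no single edge or subset suffices.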
Formulation: The defender oracle problem, described below, can be formulated as a mixed integer linear program (MILP). The objective of the MILP is to identify the allocation that covers as many attacker paths as possible, where paths are weighted by the product of the payoff of the target attacked by the path and the probability of the attacker choosing it. (In this formulation, the probabilities $a_j$ are not variables; they are provided by CoreLP.) In the formulation, $\lambda_e = 1$ indicates that we assign a resource to edge $e$, and $z_j = 1$ indicates that path $A_j$ (refer Table 2.9) is blocked by the allocation.

$$\max_{z,\,\lambda} \; -\sum_j (1 - z_j)\, a_j\, \mathcal{T}_{t_j} \tag{4.12}$$
$$\text{s.t.} \quad z_j \le \sum_e A_{je} \lambda_e \tag{4.13}$$
$$\sum_e \lambda_e \le k \tag{4.14}$$
$$\lambda_e \in \{0,1\} \tag{4.15}$$
$$z_j \in [0,1] \tag{4.16}$$

Theorem 3. The MILP described above correctly computes a best-response allocation for the defender.

Proof. The defender receives a payoff of $-\mathcal{T}(t_j)\, a_j$ if the attacker successfully attacks target $t_j$ using path $A_j$, and 0 in the case of an unsuccessful attack. Hence, if we make sure that $1 - z_j = 1$ if path $A_j$ is not blocked, and 0 otherwise, then the objective function (4.12) correctly models the defender's expected utility. Inequality (4.13) ensures this: its right-hand side will be at least 1 if there exists an edge on the path $A_j$ that the defender is covering, and 0 otherwise. $z_j$ need not be restricted to take an integer value because the objective is increasing with $z_j$, and if the solver can push it above 0, it will choose to push it all the way up to 1. Therefore, if we let $\Lambda$ correspond to the set of edges covered by the defender, $z_j$ will be set by the solver so that:

$$z_j = \begin{cases} 1 & \text{if } \Lambda \cap A_j \neq \emptyset, \text{ i.e., } \exists e \mid \lambda_e = A_{je} = 1 \\ 0 & \text{otherwise} \end{cases} \tag{4.17}$$

Inequality (4.14) enforces that the defender covers at most as many edges as the number of available resources $k$, and thus ensures feasibility. Hence, the above MILP correctly captures the best-response oracle problem for the defender. ∎

Proposition 1. For any attacker mixed strategy, the defender's expected utility from the best response provided by the defender oracle is no worse than the defender's equilibrium utility in the full zero-sum game.

Proof.
In any equilibrium, the attacker plays a mixed strategy that minimizes the defender's best-response utility; therefore, if the attacker plays any other mixed strategy, the defender's best-response utility can be no worse. □

4.2.4 Attacker Oracle

This section concerns the best-response oracle problem for the attacker. The Attacker Oracle problem is to generate the attacker pure strategy (path) from some source $s \in S$ to some target $t \in T$ that maximizes the attacker's expected utility given the defender mixed strategy $x$ over defender allocations $\mathcal{X}$.

Attacker Oracle is NP-hard: We show that the attacker oracle problem is also NP-hard by reducing 3-SAT to it.

Theorem 4. The Attacker Oracle problem is NP-hard, even if there is only a single source and a single target.

Proof. Reduction from 3-SAT to Attacker Oracle: We convert an arbitrary instance of 3-SAT to an instance of the attacker oracle problem as follows. Suppose the 3-SAT instance contains $n$ variables $x_i$, $i = 1, \ldots, n$, and $k$ clauses. Each clause is a disjunction of three literals, where each literal is either a variable or the negation of a variable. Consider the following example:

$$E = (x_1 \lor \lnot x_2 \lor \lnot x_3) \land (x_1 \lor x_2 \lor x_4) \tag{4.18}$$

The formula $E$ contains $n = 4$ variables and $k = 2$ clauses.

We construct a multi-graph $G$ (footnote 5) with $n + k + 1$ nodes, $v_0, \ldots, v_{n+k}$, so that the source node is $s = v_0$ and the target node is $t = v_{n+k}$. Every edge connects some pair of nodes with consecutive indices, so that every simple path from $s$ to $t$ contains exactly $n + k$ edges. Each edge corresponds to a literal in the 3-SAT expression (that is, either $x_i$ or $\lnot x_i$). There are exactly three edges that connect nodes $v_{i-1}$ and $v_i$ for $i = 1, \ldots, k$. Those three edges correspond to the three literals in the $i$-th clause. There are exactly two edges that connect nodes $v_{k+j-1}$ and $v_{k+j}$ for $j = 1, \ldots, n$. Those two edges correspond to the literals $x_j$ and $\lnot x_j$. An example graph that corresponds to the expression (4.18) is shown in Figure 4.4.

Footnote 5: We use a multi-graph for simplicity; having a multi-graph is not essential for the NP-hardness reduction.
[Figure 4.4 graphic: nodes $s = v_0, v_1, \ldots, v_6 = t$, with parallel edges between consecutive nodes labeled by literals.]

Figure 4.4: An example graph corresponding to the CNF formula $(x_1 \lor \lnot x_2 \lor \lnot x_3) \land (x_1 \lor x_2 \lor x_4)$.

There are $2n$ defender pure strategies (allocations of resources), each played with equal probability $1/(2n)$. Each defender pure strategy corresponds to a literal, and the edges that correspond to that literal are blocked in that pure strategy. In the example shown in Figure 4.4, the defender plays 8 pure strategies, each with probability 1/8. Three edges are blocked in the pure strategy that corresponds to the literal $x_1$ (namely, the top edge between $v_0$ and $v_1$, the top edge between $v_1$ and $v_2$, and the top edge between $v_2$ and $v_3$); only one edge is blocked in the pure strategy that corresponds to the literal $\lnot x_4$ (the bottom edge between $v_5$ and $v_6$). (If it is desired that the defender always use the same number of resources, this is easily achieved by adding dummy edges.) We now show that there is an assignment of values to the variables in the 3-SAT instance such that the formula evaluates to true if and only if there is a path from $s$ to $t$ in the corresponding attacker oracle problem instance which is blocked with probability at most 1/2.

The "if" direction: Suppose there is a path from $s$ to $t$ that is blocked with probability at most 1/2. Note that any path from $s$ to $t$ is blocked by at least one of the strategies $\{x_i, \lnot x_i\}$, for all $i = 1, \ldots, n$, so the probability that the path is blocked is at least $n/(2n) = 1/2$. Moreover, if for some $i$ the path passes through both an edge labeled $x_i$ and one labeled $\lnot x_i$, then the probability that the path is blocked is at least $(n+1)/(2n) > 1/2$, so this cannot be the case for the path under consideration. Hence, we can assign the value true to the literals that correspond to the edges on the path, and false to all the other literals. This must correspond to a solution to the 3-SAT instance, because each clause must contain a literal that corresponds to an edge on the path, and that literal is thus assigned the value true.
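The construction can be checked by brute force on small formulas. The following is a minimal Python sketch, not part of the original proof: the literal encoding (+i for x_i, -i for its negation) and all function names are assumptions made for illustration. It builds the layered multi-graph and tests whether some path from s to t is blocked with probability at most 1/2.

```python
from fractions import Fraction
from itertools import product

def build_layers(clauses, n):
    """Return the layered multi-graph as a list of layers.

    Layer i holds the literal labels of the parallel edges between
    v_{i-1} and v_i: k clause layers first, then n variable layers.
    Encoding (an assumption of this sketch): +i is x_i, -i is NOT x_i.
    """
    layers = [list(clause) for clause in clauses]
    layers += [[j, -j] for j in range(1, n + 1)]
    return layers

def blocked_probability(path, n):
    """Probability that a path (one literal label per layer) is blocked.

    The defender plays 2n pure strategies uniformly; the strategy for a
    literal blocks every edge carrying that label, so the path is blocked
    by exactly as many strategies as it has distinct labels.
    """
    return Fraction(len(set(path)), 2 * n)

def has_half_blocked_path(clauses, n):
    """True iff some s-t path is blocked with probability at most 1/2."""
    layers = build_layers(clauses, n)
    return any(blocked_probability(p, n) <= Fraction(1, 2)
               for p in product(*layers))

# Example (4.18): (x1 or not x2 or not x3) and (x1 or x2 or x4)
example = [(1, -2, -3), (1, 2, 4)]
print(has_half_blocked_path(example, 4))
```

For the example formula the check succeeds, matching any satisfying assignment with x_1 set to true. Exact fractions are used so that the comparison with 1/2 does not depend on floating-point rounding. The enumeration is exponential in the number of layers, so this is only a sanity check for toy instances.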
The "only if" direction: Suppose there is an assignment of values to the variables such that the 3-SAT formula evaluates to true. Consider a simple path that goes from $s$ to $t$ through edges that correspond to literals with true values in the assignment. Such a path must exist because, by assumption, the assignment satisfies every clause. Moreover, this path is blocked only by the defender strategies that correspond to true literals, of which there are exactly $n$. So the probability that the path is blocked is $n/(2n) = 1/2$. □

Formulation: The attacker oracle problem can be formulated as a set of mixed integer linear programs, as described below. For every target in $T$, we solve for the best path to that target; then we take the best solution overall. Below is the formulation when the attacker is attacking target $t_m$. (In this formulation, the probabilities $x_i$ are not variables; they are values produced earlier by CoreLP.) In the formulation, $\gamma_e = 1$ indicates that the attacker passes through edge $e$, and $z_i = 1$ indicates that the allocation $X_i$ blocks the attacker path. Equations (4.20) to (4.22) represent the flow constraints for the attacker for every node $n \in N$.

$$\max_{z,\gamma} \; T_{t_m} \sum_i x_i (1 - z_i) \tag{4.19}$$
$$\text{s.t.} \quad \sum_{e \in \mathrm{out}(n)} \gamma_e = \sum_{e \in \mathrm{in}(n)} \gamma_e \qquad \forall n \neq s, t_m \tag{4.20}$$
$$\sum_{e \in \mathrm{out}(s)} \gamma_e = 1 \tag{4.21}$$
$$\sum_{e \in \mathrm{in}(t_m)} \gamma_e = 1 \tag{4.22}$$
$$z_i \ge \gamma_e + X_{ie} - 1 \qquad \forall e \; \forall i \tag{4.23}$$
$$z_i \ge 0 \tag{4.24}$$
$$\gamma_e \in \{0,1\} \tag{4.25}$$

Theorem 5. The MILP described above correctly computes a best-response path for the attacker.

Proof. The flow constraints are represented in Equations (4.20) to (4.22). The sink for the flow is the target $t_m$ that we are currently considering for attack. To deal with the case where there is more than one possible source node, we can add a virtual source ($s$) to $G$ that feeds into all the real sources. $\mathrm{in}(n)$ represents the edges coming into $n$, and $\mathrm{out}(n)$ represents those going out of $n$. The flow constraints ensure that the chosen edges indeed constitute a path from the (virtual) source to the sink. The attacker receives a payoff of $T(t_m)$ if he attacks target $t_m$ successfully, that is, if the path does not intersect with any defender allocation.
Hence, if we make sure that $1 - z_i = 1$ if allocation $X_i$ does not block the path, and 0 otherwise, then the objective function (4.19) correctly models the attacker's expected utility. Inequality (4.23) ensures this: if the allocation $X_i$ covers some $e$ for which $\gamma_e = 1$, then it will force $z_i$ to be set to at least 1; otherwise, $z_i$ only needs to be set to at least 0 (and in each case, the solver will push it all the way down to this value, which also explains why the $z_i$ variables do not need to be restricted to take integer values). Therefore, if we let $\Gamma$ correspond to the path chosen by the attacker, $z_i$ will be set by the solver so that

$$z_i = \begin{cases} 1 & \text{if } X_i \cap \Gamma \neq \emptyset \;\Leftrightarrow\; \exists e \mid \gamma_e = X_{ie} = 1 \\ 0 & \text{otherwise} \end{cases} \tag{4.26}$$

It follows that the MILP objective is correct. Hence, the above MILP captures the best-response oracle problem for the attacker. □

Proposition 2. For any defender mixed strategy, the attacker's expected utility from the best response provided by the attacker oracle is no worse than the attacker's equilibrium utility in the full zero-sum game.

Proof. In any equilibrium, the defender plays a mixed strategy that minimizes the attacker's best-response utility; therefore, if the defender plays any other mixed strategy, the attacker's best-response utility can be no worse. □

Figure 4.5: Example graph of Southern Mumbai with 455 nodes. Sources are depicted as green arrows and targets are red bulls-eyes. Best viewed in color.

4.3 Evaluation of Rugged

In this section, we describe the results achieved with Rugged. We conducted experiments on graphs obtained from road network GIS data for the city of Mumbai (inspired by the 2008 Mumbai incidents [Chandran and Beitchman, 29 November 2008]), as well as on artificially generated graphs. We provide two types of results: (1) First, we compare the solution quality obtained from Rugged with the solution quality obtained from Ranger. These results are shown in Section 4.3.1. (2) Second, we provide runtime results showing the performance of Rugged when the input graphs are scaled up.
(Footnote 6: All experiments were run on a standard desktop 2.8 GHz machine with 2 GB of main memory.)

The following three types of graphs were used for the experimental results:

(1) Weakly fully connected (WFC) graphs, denoted $G_{WFC}(N, E)$, are graphs where $N = \{n_1, \ldots, n_m\}$ is an ordered set of nodes with $S = \{n_1\}$ and $T = \{n_m\}$. For each node $n_i$, there exists a set of directed edges, $\{(n_i, n_j) \mid n_i < n_j\}$, in $E$. These graphs were chosen because of the extreme size of the strategy spaces for both players. Additionally, there are no bottleneck edges, so these graphs are designed to be computationally challenging for Rugged.

(2) Braid-type graphs, denoted $G_B(N, E)$, are graphs where $N$ is a sequence of nodes $n_1$ to $n_m$ such that each pair $n_{i-1}$ and $n_i$ is connected by 2 to 3 edges. Node $n_1$ is the source node. Any following node is a target node with probability 0.2, with payoff $T$ randomly chosen between 1 and 100. These graphs have a similar structure to the graph in Figure 4.1, and were motivated by the counterexample in Section 4.1.

(3) City graphs of different sizes were extracted from the southern part of Mumbai using the GIS data provided by OpenStreetMap. The placement of 2-4 targets was inspired by the Mumbai incidents from 2008 [Chandran and Beitchman, 29 November 2008]; 2-4 sources were placed on the border of the graph (footnote 7), simulating an attacker approaching from the sea. We ran the test for graphs with the following numbers of nodes: 45, 129 and 252. Figure 4.5 shows a sample Mumbai graph with 252 nodes, 4 sources and 3 targets.

4.3.1 Comparison with RANGER

This section compares the solution quality of Rugged and Ranger. Although we have already established that Ranger solutions can be arbitrarily bad in general, the objective of these tests is to compare the actual performance of Ranger with Rugged. The results are given in Table 4.1, which shows the average and maximum error from Ranger. We evaluated Ranger on the three types of graphs (city graphs, braid graphs and weakly fully connected graphs of different sizes), fixing the number of defender resources to 2 and placing 3 targets, with varied values from the interval [0,1000].
The actual defender utility from the solution provided by Ranger (footnote 8) is computed by using the best-response oracle for the attacker with the Ranger defender strategy as input. The error of Ranger is then expressed as the difference between the defender utilities in the solutions provided by Ranger and by Rugged.

Table 4.1 shows the comparison results between Ranger and Rugged, summarized over 30 trials. It shows the percentage of trials in which Ranger gave an incorrect solution (denoted "pct"). It also shows the average and maximum error of Ranger (denoted avg and max respectively) over these trials. It shows that while Ranger was wrong only about 1/3 of the time for Braid graphs, it gave the wrong answer in all the runs on the fully connected graphs. Furthermore, it was wrong 90% of the time on the 45-node city graph, with an average error of 215 units and a maximum error of 721 units. Given an average target value of 500, these are high errors indeed, indicating that Ranger is unsuitable for deployment in real-world domains.

Footnote 7: We placed more sources and targets into larger graphs.

Footnote 8: Because Ranger provides a solution in the form of marginal probabilities of defender allocations along edges, we used Comb sampling [Tsai et al., 2010] to convert this into a (joint) probability distribution over defender allocations.

                City           Braid          WFC
    nodes       45     129     10     20      10     20
    avg error   215    250     210    259     191    80
    max error   721    489     472    599     273    117
    pct         90%    100%    30%    37%     100%   100%
    avg T       500    500     500    500     500    500

Table 4.1: Ranger average and maximum error and percent of samples where Ranger provided a suboptimal solution. Target values $T$ were randomly drawn from the interval [1,1000].

[Figure 4.6 graphic: three panels (a), (b), (c).]

Figure 4.6: Results. Figure (a) shows the scale-up analysis on WFC graphs of different sizes. Figure (b) shows the convergence of oracle values to the final game value and the anytime bounds. Figure (c) compares the runtimes of the oracles and the CoreLP.

4.3.2 Scale-up and analysis

This section concerns the performance of Rugged when the input problem instances are scaled up. The experiments were conducted on graphs derived directly from portions of Mumbai's road network.
The runtime results are shown in Table 4.2, where the rows represent the size of the graph and the columns represent the number of defender resources that need to be scheduled. As an example of the complexity of the graph, the number of attacker paths in the Mumbai graph with 252 nodes is at least $10^{12}$, while the number of defender allocations is approximately $10^{10}$ for 4 resources. The game matrix for this problem cannot even be represented, let alone solved. The ability of Rugged to compute optimal solutions in such situations, while overcoming NP-hardness of both oracles, marks a significant advance in the state of the art in deploying game-theoretic techniques.

    nodes \ resources    1        2         3          4
    45                   0.91     6.43      22.58      33.42
    129                  6.63     32.55     486.48     3140.23
    252                  17.19    626.25    2014.14    34344.70

Table 4.2: Runtime (in seconds) of Rugged when the input problem instances are scaled up. These tests were done on graphs extracted from the road network of Mumbai. The rows correspond to the number of nodes in the graph whereas the columns correspond to the number of defender resources.

Figure 4.6(a) examines the performance of Rugged when the size of the strategy spaces for both players is increased. These tests were conducted on WFC graphs, since they are designed to have large strategy spaces. These problems have 20 to 100 nodes and up to 5 resources. The x-axis in the figure shows the number of nodes in the graph, while the y-axis shows the runtime in seconds. Different numbers of defender resources are represented by different curves in the graph. For example, for 40 nodes and 5 defender resources, Rugged took 108 seconds on average.

To speed up the convergence of Rugged, we tried to warm-start the algorithm with an initial defender allocation such as min-cut-based allocations, target- and source-centric allocations, RANGER allocations and combinations of these. No significant improvement in runtime was measured; in some cases, the runtime increased because of the larger strategy set for the defender.
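To give a feel for the strategy-space sizes discussed in this section, the counts can be reproduced for the WFC graphs used in the scale-up experiment. The sketch below is illustrative only. It assumes the WFC definition given earlier (a directed edge from n_i to n_j whenever i < j, source n_1, target n_m) and that a defender allocation is any choice of k distinct edges; the function name is hypothetical.

```python
from math import comb

def wfc_strategy_counts(m, k):
    """Count pure strategies in a weakly fully connected graph.

    Nodes are n_1..n_m with an edge (n_i, n_j) for every i < j.
    Returns (number of n_1 -> n_m paths, number of k-edge allocations).
    """
    # paths[j] = number of directed paths from n_1 to n_j
    paths = [0] * (m + 1)
    paths[1] = 1
    for j in range(2, m + 1):
        paths[j] = sum(paths[1:j])   # any earlier node can precede n_j
    num_edges = m * (m - 1) // 2
    return paths[m], comb(num_edges, k)

for m in (20, 40, 100):
    n_paths, n_allocs = wfc_strategy_counts(m, 5)
    print(m, n_paths, n_allocs)
```

Under these assumptions the number of attacker paths doubles with every added node (2^(m-2) paths for m nodes), which illustrates why the full game matrix cannot be enumerated even for moderately sized graphs.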
4.3.3 Algorithm Dynamics Analysis

This section analyzes the anytime solution quality and the performance of each of the three components of Rugged: the defender oracle, the attacker oracle, and the CoreLP. When we solve the best-response oracle problems, they provide lower and upper bounds on the optimal defender utility, as shown in Propositions 1 and 2. Figure 4.6(b) shows the progress of the bounds and the CoreLP solution for a sample problem instance scheduling 2 defender resources on a fully connected network with 50 nodes. The x-axis shows the number of iterations and the y-axis shows the expected defender utility. The graph shows that a good solution (i.e., one where the difference in the two bounds is less than $\epsilon$) can be computed reasonably quickly, even though the algorithm takes longer to converge to the optimal solution. For example, a solution with an allowed approximation of 10 units (footnote 9) can be computed in about 210 iterations, whereas 310 iterations are required to find the optimal solution. The difference between these two bounds gives an upper bound on the error in the current solution of the CoreLP; this also provides us with an approximation variant of Rugged.

Footnote 9: 10 units is 1% of the maximum target payoff (1000).

Figure 4.6(c) compares the runtime needed by the three modules in every iteration. The x-axis shows the iteration number and the y-axis shows the runtime in seconds on a logarithmic scale. As expected, CoreLP, which solves a standard linear program, needs considerably less time in each iteration than both the oracles, which solve mixed-integer programs. The figure also shows that the modules scale well as the number of iterations increases.

4.4 Snares

The focus of this section is to present Snares (Securing Networks by Applying a Randomized Emplacement Strategy), a new algorithm for computing the optimal solution for much larger network security domains. Snares builds on the double-oracle approach of Rugged, and has two novel features. First, Snares uses greedy heuristics to compute "better" (rather than "best") responses for both players.
I show that the best-response problem for the defender has a sub-modularity property, and exploiting this provides a bound for the solution quality of our better response. Second, Snares uses a novel "mincut-fanout" technique to warm-start the computation by initializing the game matrix with pure strategies for both the players. We show that naïve ways of warm-starting the computation can actually increase the runtime, and that our novel technique is effective. We extensively analyze these individual components of Snares in this paper. Finally, I demonstrate Snares's significant advance over the state of the art (i.e., Rugged): whereas the state of the art was restricted to the southern tip of Mumbai, with Snares, optimal strategies for the entire road network of the city of Mumbai (refer to Figure 4.7) can be obtained in a reasonable amount of time.

4.4.1 Algorithm Description

I now present Snares in detail. The flow chart for Snares is presented in Figure 4.8. The boxes with double lines are the three novel modules in Snares (mincut-fanout and the two better-response oracles), and I will describe them in detail in this section, while modules also present in Rugged are shown in boxes with single-line borders. The formal algorithm is given as Algorithm 3.

[Figure 4.7 graphic: map of Mumbai with two regions labeled "Southern Tip: State-of-the-art" and "Full City: This paper".]

Figure 4.7: Comparison between Snares and the state of the art: Snares can now scale to solve problems the size of full cities where previous work could only scale to the southern tip of Mumbai.

[Figure 4.8 graphic: flow chart with boxes including mincut-fanout, Minimax, Better Response Defender, Better Response Attacker and Best Response Defender, with decision branches labeled "Useful: Yes" and "Useful: No".]

Figure 4.8: Flow chart for the Snares algorithm.

Snares warm-starts the computation with the pure strategies obtained using the mincut-fanout procedure, which will be explained next. DBO and ABO in Lines 4 and 8 refer to the defender better response and attacker better response oracles respectively, while DO and AO in Lines 6 and 10 are the best response oracles for the defender and the attacker respectively.
In Line 4, DBO returns a heuristic response $X$ of the defender (we call this heuristic response a "better" response as opposed to the "best" response; here $X$ may not be strictly better than existing strategies). Line 5 checks whether the utility $U_d(X, a)$ is higher than the utility $U_d(x, a)$ obtained from minimax by at least $\epsilon$, i.e., whether $X$ is at least $\epsilon$-better than all strategies present in $\mathcal{X}$ against $a$. If $U_d(X, a)$ is not at least $\epsilon$ higher than $U_d(x, a)$, then the best response DO is invoked (Line 6). Line 7 guarantees that each iteration adds an improving pure strategy to $\mathcal{X}$, should one exist. A similar computation is performed for the attacker in Lines 8-11. Line 12 states that the computation proceeds until convergence, which is obtained when the utility obtained from the best response of each player is not better than the corresponding player's utility from the minimax strategy. In other words, convergence is obtained when $U_d(X, a) - U_d(x, a) \le \tau$ and $U_a(A, x) - U_a(a, x) \le \tau$, where $\tau$ defines the tolerance (footnote 10). Finally, Snares is guaranteed to converge to the globally optimal solution, since convergence can be obtained only when the best responses for both the players are unable to generate an improving strategy.

Footnote 10: In all our experiments, we fix both $\epsilon$ and $\tau$ to 0.001.

Algorithm 3 Snares($G$, $k$)
 1: Initialize $\mathcal{X}$, $\mathcal{A}$ using mincut-fanout
 2: repeat
 3:   $(x, a) \leftarrow$ CoreLP($\mathcal{X}$, $\mathcal{A}$)
 4:   $X \leftarrow$ DBO($a$)
 5:   if $U_d(X, a) - U_d(x, a) \le \epsilon$ then
 6:     $X \leftarrow$ DO($a$)
 7:   $\mathcal{X} \leftarrow \mathcal{X} \cup \{X\}$
 8:   $A \leftarrow$ ABO($x$)
 9:   if $U_a(A, x) - U_a(a, x) \le \epsilon$ then
10:     $A \leftarrow$ AO($x$)
11:   $\mathcal{A} \leftarrow \mathcal{A} \cup \{A\}$
12: until convergence
13: return $(x, a)$

4.4.2 Warm-starting using mincut-fanout

The objective of warm-starting is to generate pure strategies for both the defender and the attacker before the computation of the minimax strategy is started. Given the setup of the game, the best response of the attacker will choose to attack the highest-valued target $\hat{t}$ with probability 1.0, if there exists an $s$-$\hat{t}$ path in $G(N, E)$ that does not intersect with any of the defender's allocations. Consequently, strategies considering the most-valued target get generated by the iterative procedure of the double-oracle based algorithm in the first few iterations.
The objective of warm-starts here is to generate such strategies for both players and add them before the start of the double-oracle iterative procedure. This will reduce the number of iterations that are required by the algorithm.

Thus, we construct a game with just one target: $\hat{t}$. The solution for this game can be computed in polynomial time: it is to uniformly distribute the defender's resources on the $s$-$\hat{t}$ min-cut [Washburn and Wood, 1995]. Thus, mincut-fanout will sample pure strategies for the defender, or defender allocations, from the $s$-$\hat{t}$ min-cut, such that each allocation covers $k$ edges. We then compute pure strategies for the attacker, or attacker paths, which are best responses to each individual defender allocation. This is done using Dijkstra's shortest path computation, such that each edge $e$ in the defender allocation $X$ has weight $\infty$, while the other edges have a weight of 1. This also prefers short attacker paths over long ones, the intuition being that shorter paths should be preferred by the attacker since longer paths are more likely to be intercepted.

We also generalized the above idea to consider all the targets and not just the highest-valued target $\hat{t}$. However, as we show in Section 4.4.4.1, our choice of using only the highest-valued target in mincut-fanout performs better than considering all the targets. We also show that it performs better than other potential strategies for warm-starting the computation.

4.4.3 Using "Better" Responses

Snares presents heuristics to compute "better responses" for both players. Each better response module aims to compute a strategy for the corresponding player that is better than any strategy already in the mix against the other player's current equilibrium strategy. If successful, this guarantees that CoreLP will compute a different equilibrium when this strategy is added. If the better response heuristic fails to generate such a strategy, then the best-response module is called. The objective is to reduce the number of invocations of the best response modules, both of which solve an NP-hard problem [Jain et al., 2011b] and consume significant runtime when the problem size gets bigger.
On the other hand, the better response solutions used by Snares can be computed in polynomial time. Furthermore, even with the use of better responses, Snares still converges to the global equilibrium, since the best response modules are called if the better response modules do not generate an improving strategy. We now present the better response algorithms for both players.

4.4.3.1 Better Response for the Defender

In this section, we first present a greedy approach to generate the better response $X_g$ of the defender. This approach greedily maximizes a "normalized" defender utility function $f_a(X)$. We next show that this utility function is a non-negative sub-modular function, and then establish a bound on the solution quality of our greedy better response solution. This bound suggests that our greedy approach generates good solutions.

The defender payoffs are non-positive in our domain: the defender gets a payoff of $-T(t)$ when the attacker successfully attacks target $t$ and 0 otherwise. To facilitate analysis, we define a non-negative normalized utility function $f$ for the defender, where

$$U_d(X, a) = -\sum_{A_j \in \mathcal{A} \mid X \cap A_j = \emptyset} a_j\, T(t_j) \tag{4.27}$$
$$f_a(X) = U_d(X, a) - U_d(\emptyset, a) \tag{4.28}$$

More specifically, $f$ gives the added benefit of the defender allocation $X$ over the defender not protecting any edges. Furthermore, using Equation 4.27,

$$f_a(X) = -\sum_{A_j \in \mathcal{A} \mid X \cap A_j = \emptyset} a_j\, T(t(j)) - U_d(\emptyset, a) = \sum_{A_j \in \mathcal{A} \mid X \cap A_j \neq \emptyset} a_j\, T(t_j) \tag{4.29}$$

The greedy better response algorithm is described as Algorithm 4, where in each iteration, the objective is to add an edge $e$ to the greedy defender allocation $X_g$ that maximizes $f_a(X \cup \{e\})$. Here, $X_g$ (Line 1) is the computed better response of the defender. $a$ is the mixed strategy of the attacker over the set of paths $\mathcal{A}$ that is input to the better response oracle. $\mathcal{A}_r$ in Line 2 is used to keep track of attacker paths that are not already covered by the defender's allocation. $w_e$ (initialized in Line 4) represents the weight of each edge.
These weights are updated in Lines 5-7, such that $w_e$ represents the total marginal usage of edge $e$ in the attacker's mixed strategy $a$, weighted by the payoff of the target attacked by the attacker when traversing through $e$. The defender, following the greedy approach, chooses an edge $e^*$ with the highest weight in Line 8. Finally, all the paths intersecting with edge $e^*$ are removed from the set of paths considered in subsequent iterations, i.e., from $\mathcal{A}_r$ (Lines 10-12). Lines 13-15 are invoked only if the allocation $X_g$ already intersects with all the attacker paths $A_j \in \mathcal{A}$, and ensure that the defender chooses $k$ edges in its allocation.

Algorithm 4 Defender Better Response: DBO($\mathcal{A}$, $a$)
 1: Initialize $X_g \leftarrow \emptyset$ (defender allocation)
 2: Initialize $\mathcal{A}_r \leftarrow \mathcal{A}$
 3: while $|X_g| < k$ and $\mathcal{A}_r \neq \emptyset$ do
 4:   $w_e = 0 \;\; \forall e \in E$
 5:   for all $A_j \in \mathcal{A}_r$ do
 6:     for all $e \in A_j$ do
 7:       $w_e \leftarrow w_e + a_j\, T(t(j))$
 8:   $e^* \leftarrow \arg\max_e w_e$
 9:   $X_g \leftarrow X_g \cup \{e^*\}$
10:   for all $A_j \in \mathcal{A}_r$ do
11:     if $e^* \in A_j$ then
12:       $\mathcal{A}_r \leftarrow \mathcal{A}_r \setminus \{A_j\}$
13: while $|X_g| < k$ do
14:   Choose $e$ arbitrarily from $E \setminus X_g$
15:   $X_g \leftarrow X_g \cup \{e\}$
16: return $X_g$

Theorem 6. The normalized defender utility, $f_a(X)$, is sub-modular in $X$.

Proof. We show here that the gain in defender utility when adding an edge $e$ to an existing defender allocation $X$ exhibits diminishing returns. Let $X_1$ and $X_2$ be two defender allocations such that $X_1 \subseteq X_2 \subseteq E$. Furthermore, let $\mathcal{A}_1 = \{A_j \mid A_j \cap X_1 = \emptyset\}$, i.e., the set of attacker paths that are not covered by $X_1$. Similarly, let $\mathcal{A}_2 = \{A_j \mid A_j \cap X_2 = \emptyset\}$. Since $X_1 \subseteq X_2$, every path that is covered by $X_1$ is also covered by $X_2$, and therefore $\mathcal{A}_2 \subseteq \mathcal{A}_1$. Moreover, using Equation 4.29,

$$f_a(X_i) = -\sum_{A_j \in \mathcal{A}_i} a_j\, T(t(j)) - U_d(\emptyset, a), \qquad i \in \{1, 2\} \tag{4.30}$$

Let us now consider the addition of an edge $e$. Let $e$ intersect with attacker paths $\mathcal{A}^e_{12} \cup \mathcal{A}^e_2 \cup \mathcal{A}^e_\emptyset$, where $\mathcal{A}^e_{12} = \{A_j \mid e \in A_j, A_j \in \mathcal{A}, A_j \cap X_1 \neq \emptyset\}$ (and thus, paths in $\mathcal{A}^e_{12}$ also intersect with $X_2$), $\mathcal{A}^e_2 = \{A_j \mid e \in A_j, A_j \in \mathcal{A}, A_j \cap X_1 = \emptyset, A_j \cap X_2 \neq \emptyset\}$, and $\mathcal{A}^e_\emptyset = \{A_j \mid e \in A_j, A_j \in \mathcal{A}, A_j \cap X_2 = \emptyset\}$ (and thus, paths in $\mathcal{A}^e_\emptyset$ also do not intersect with $X_1$).
Here, $\mathcal{A}$ represents all of the exponentially many attacker paths possible in the input network security game. Therefore,

\begin{align}
& f_a(X_1 \cup \{e\}) - f_a(X_1) \tag{4.31}\\
&= \sum_{A_j \in \mathcal{A}^e_2 \cup \mathcal{A}^e_\emptyset} a_j\, T(t(j)) \tag{4.32}\\
&= \sum_{A_j \in \mathcal{A}^e_2} a_j\, T(t(j)) + \sum_{A_j \in \mathcal{A}^e_\emptyset} a_j\, T(t(j)) \tag{4.33}\\
&= \sum_{A_j \in \mathcal{A}^e_2} a_j\, T(t(j)) + \big(f_a(X_2 \cup \{e\}) - f_a(X_2)\big) \tag{4.34}\\
&\ge f_a(X_2 \cup \{e\}) - f_a(X_2) \tag{4.35}
\end{align}

Hence, the normalized defender utility $f_a(X)$ is a non-negative sub-modular function. □

Letting $X^*$ be the best response of the defender,

$$f_a(X_g) \ge \left(1 - \tfrac{1}{e}\right) f_a(X^*) \tag{4.36}$$

since we compute an incrementally maximizing greedy solution to a non-negative sub-modular function [Nemhauser et al., 1978].

We now establish the relationship between a global minimax equilibrium solution $\langle x^*, a^* \rangle$ and the solution $\langle x_c, a_c \rangle$ obtained when using just this greedy response DBO to compute pure strategies for the defender, i.e., we never call DO but we do call AO, by taking out Lines 5 and 6 from Algorithm 3 to arrive at $\langle x_c, a_c \rangle$.

Theorem 7. The defender utility $U_d(x_c, a_c)$ is lower bounded by $\left(1 - \tfrac{1}{e}\right) U_d(x^*, a^*) + \tfrac{1}{e}\, U_d(\emptyset, a_c)$.

Proof. Firstly, given that $\langle x^*, a^* \rangle$ is a minimax solution,

$$U_d(x^*, a^*) \ge U_d(x, a^*) \qquad \forall x \tag{4.37}$$
$$U_d(x^*, a) \ge U_d(x^*, a^*) \qquad \forall a \tag{4.38}$$

Furthermore, defining

$$f_a(x) = \sum_{X_i \in \mathcal{X}} x_i\, f_a(X_i) \tag{4.39}$$

and using Equations (4.28), (4.37), (4.38) and (4.39), we have:

$$f_{a^*}(x^*) \ge f_{a^*}(x) \qquad \forall x \tag{4.40}$$
$$f_a(x^*) \ge f_{a^*}(x^*) + \big(U_d(\emptyset, a^*) - U_d(\emptyset, a)\big) \qquad \forall a \tag{4.41}$$

Therefore, using Equations (4.36) and (4.41),

\begin{align}
f_{a_c}(x_c) &= f_{a_c}(\mathrm{DBO}(a_c)) \tag{4.42}\\
&\ge \left(1 - \tfrac{1}{e}\right) f_{a_c}(\mathrm{DO}(a_c)) \tag{4.43}\\
&\ge \left(1 - \tfrac{1}{e}\right) f_{a_c}(x^*) \quad \text{(by definition of DO)} \tag{4.44}\\
&\ge \left(1 - \tfrac{1}{e}\right) \big[f_{a^*}(x^*) + \big(U_d(\emptyset, a^*) - U_d(\emptyset, a_c)\big)\big] \tag{4.45}
\end{align}

Substituting $f_a(x) = U_d(x, a) - U_d(\emptyset, a)$ from Equation (4.28) on both sides and rearranging, we obtain $U_d(x_c, a_c) \ge \left(1 - \tfrac{1}{e}\right) U_d(x^*, a^*) + \tfrac{1}{e}\, U_d(\emptyset, a_c)$. □

Thus, not only is the better response defender utility bounded in each iteration, but the solution quality for using just the better response is also bounded at convergence. Furthermore, $U_d(x_c, a_c) \le U_d(x_c, a) \;\; \forall a$. Thus, $U_d(x_c, a_c)$ is the utility that $x_c$ guarantees the defender. Also, naturally, $U_d(\emptyset, a_c) \ge -\max_{t \in T} T(t)$.
So the greedy solution does at least as well as doing nothing $\tfrac{1}{e}$ of the time and playing optimally the rest of the time. This proof suggests that the better response oracle can produce good solutions efficiently, a hypothesis which we experimentally validate in Section 4.4.4.

4.4.3.2 Better Response for the Attacker

We now describe the better response heuristic for the attacker. We use a shortest-path-based approach to generate better responses for the attacker, which is given as Algorithm 5. This algorithm is designed to accurately determine the defender's coverage probability when estimating the attacker's utility of the better response, even if the attacker chooses two edges in his path which are covered in the same defender allocation. For example, if in the attacker's better response the attacker traverses through the edges $e_1$ and $e_2$, and there exists a defender allocation $X_i$ with $e_1, e_2 \in X_i$, then Algorithm 5 will not double count the probability $x_i$ associated with allocation $X_i$ when computing the attacker's reward. This is different from previous greedy approaches for computing the attacker's response, which defined the cost of the edge $e$ as the defender's marginal coverage of $e$ [Tsai et al., 2010], and suffered from over-estimation of the defender's coverage. Algorithm 5 below assumes the presence of only one source $s$; if there are multiple sources, then a virtual source is added which then connects to all the existing sources.

Algorithm 5 Attacker Better Response: ABO($\mathcal{X}$, $x$)
 1: for all $n \in N$ do
 2:   caught[$n$] $\leftarrow 1$
 3: caught[$s$] $\leftarrow 0$
 4: $\mathcal{X}_s \leftarrow \mathcal{X}$
 5: Add $s$ to PriorityQueue PQ
 6: while PQ is not empty do
 7:   $u \leftarrow \arg\min_{n \in PQ}$ caught[$n$]
 8:   Remove $u$ from PQ
 9:   for $e(u, v) \in$ out-edges($u$) do
10:     $c_e \leftarrow \sum_{X_i \in \mathcal{X}_u \mid e \in X_i} x_i$
11:     if caught[$u$] $+\, c_e <$ caught[$v$] then
12:       caught[$v$] $\leftarrow$ caught[$u$] $+\, c_e$
13:       prev[$v$] $\leftarrow u$
14:       $\mathcal{X}_v \leftarrow \mathcal{X}_u \setminus \{X_i \mid e \in X_i\}$
15:       Add $v$ to PQ
16: for $t \in T$ do
17:   payoff[$t$] $\leftarrow (1 -$ caught[$t$]$) \cdot T(t)$
18: $t^* \leftarrow \arg\max_t$ payoff[$t$]
19: $A_g \leftarrow$ path($s \to t^*$) constructed using prev[$t^*$]
20: return $A_g$

Lines 1-15 of Algorithm 5 follow Dijkstra's algorithm. Here, path distances are computed using caught[$u$], which gives the probability of the attacker's $s$-$u$ path getting intercepted by the defender. To ensure the correct computation of caught[$u$], the algorithm keeps track of $\mathcal{X}_u$, the set of allocations that the attacker has not encountered along the $s$-$u$ path. $\mathcal{X}_v$ is updated once the attacker moves to node $v$ using the edge $e:(u, v)$ (Line 14), by removing all the allocations from $\mathcal{X}_u$ that contributed to the cost $c_e$ of edge $e$. $c_e$ (Line 10) gives the probability of the attacker getting intercepted by the defender on this particular edge, only considering allocations in $\mathcal{X}_u$. Lines 16 to 18 then find the target with the highest expected utility for the attacker, and the greedy path $A_g$ is then constructed using the stored predecessor information (Lines 13 and 19). As opposed to the defender's pure strategies, the edges for the attacker cannot be chosen independently, because they should define a valid path from $s$ to a target $t$. Such a restriction prevents us from conducting a sub-modularity analysis similar to the defender case, so we focus on experimental validation of this approach. The results are presented in Section 4.4.4.

4.4.4 Evaluation of Snares

I now experimentally evaluate the performance of Snares, both on simulated graphs as well as on real urban road networks. I first present the analysis of the components of Snares, followed by scalability results.

4.4.4.1 Analysis of Components of Snares

In this section, we evaluate the performance boost provided by each component of the Snares algorithm. Specifically, we compare the performance with and without the use of mincut-fanout strategies for warm starts as well as with and without the better responses. These experiments were conducted on a machine with 16 GB main memory and a 2.3 GHz processor.
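The two better-response heuristics whose effect is measured in this section (Algorithms 4 and 5) can be sketched compactly. What follows is a minimal Python sketch under assumptions that are not from the original text: edges are (u, v) tuples in a simple directed graph, attacker paths are lists of edges, and all function and parameter names are hypothetical. Unlike the listing of Algorithm 5, unreached nodes are treated as having an infinite caught value, and a set of finished nodes is kept as in textbook Dijkstra.

```python
import heapq
from itertools import count

def defender_better_response(paths, a, payoff, edges, k):
    """Greedy defender allocation in the spirit of Algorithm 4.

    paths[j] is a list of edges, a[j] its probability and payoff[j]
    the value T(t(j)) of the target that path j attacks.
    """
    chosen = []
    uncovered = list(range(len(paths)))
    while len(chosen) < k and uncovered:
        weight = {}
        for j in uncovered:
            for e in paths[j]:
                weight[e] = weight.get(e, 0.0) + a[j] * payoff[j]
        best = max(weight, key=weight.get)          # heaviest edge
        chosen.append(best)
        uncovered = [j for j in uncovered if best not in paths[j]]
    for e in edges:                                  # pad up to k edges
        if len(chosen) >= k:
            break
        if e not in chosen:
            chosen.append(e)
    return chosen

def attacker_better_response(out_edges, source, targets, allocations, x):
    """Dijkstra-style attacker path in the spirit of Algorithm 5.

    out_edges: dict node -> list of successors; targets: dict t -> T(t);
    allocations: list of sets of edges; x: matching probabilities.
    Returns (path as a list of nodes, expected attacker payoff).
    """
    caught = {source: 0.0}
    prev = {}
    remaining = {source: frozenset(range(len(allocations)))}
    tie = count()                                    # avoids comparing nodes
    heap = [(0.0, next(tie), source)]
    done = set()
    while heap:
        c_u, _, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v in out_edges.get(u, []):
            if v in done:
                continue
            # only allocations not yet met on the s-u path add to the cost
            hit = {i for i in remaining[u] if (u, v) in allocations[i]}
            cand = c_u + sum(x[i] for i in hit)
            if cand < caught.get(v, float("inf")):
                caught[v] = cand
                prev[v] = u
                remaining[v] = remaining[u] - hit
                heapq.heappush(heap, (cand, next(tie), v))
    reached = [t for t in targets if t in caught]
    if not reached:
        return None, 0.0
    best = max(reached, key=lambda t: (1.0 - caught[t]) * targets[t])
    path = [best]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], (1.0 - caught[best]) * targets[best]
```

In the attacker sketch, an allocation that covers two edges of the same path contributes its probability only once, because it is removed from the remaining set after the first covered edge. This is the double-counting safeguard that the description of Algorithm 5 emphasizes.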
[Figure 4.9 graphic: three bar charts. (a) Effect of warm starts: runtime (in secs) for Rugged, Def: Random, Def: MinCut, Def: Ranger, Att: Ranger, Att: Shortest and mincut-fanout. (b) Effect of using better responses: runtime (in secs) for Rugged, Defender, Attacker and SNARES (without mincut-fanout). (c) Analyzing better responses: percentage of calls to best-response (Att: AO, Def: DO) for Rugged, Defender, Attacker and Both.]

Figure 4.9: The contributions of individual components of Snares. Rugged is used as a baseline.

[Figure 4.10 graphic: three plots of runtime (in seconds, log-scale) for Rugged, Snares without better responses, Snares without mincut-fanout, and Snares. (a) Varying number of nodes. (b) Varying density d. (c) Varying number of defender resources.]

Figure 4.10: The runtime required by Snares as the input problem size is varied.

Warm Starts without Better Responses: We compare the performance of choosing the mincut-fanout warm-start methodology with 5 other methodologies, ranging from random selection to using previously published algorithms for this domain. We also establish the baseline using Rugged. All data points are averaged over 30 samples for random geometric graphs. We use random geometric graphs since they have been shown to mimic the connectivity properties of real road networks [Eppstein and Goodrich, 2008]. A random geometric graph is generated as follows: nodes or vertices are placed at random uniformly and independently on a 2-D region, and two vertices $u, v$ are connected by an edge if and only if the distance between them is at most a threshold $d$, i.e., $\|x - y\|_2 \le d$.
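The generation procedure just described can be sketched in a few lines. This is a minimal Python sketch, assuming a unit-square region and Euclidean distance; the function name and the seed parameter are hypothetical and only there to make runs repeatable.

```python
import math
import random

def random_geometric_graph(n, d, seed=None):
    """Place n nodes uniformly at random in the unit square and connect
    two nodes whenever their Euclidean distance is at most d."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [(u, v)
             for u in range(n) for v in range(u + 1, n)
             if math.dist(pos[u], pos[v]) <= d]
    return pos, edges

pos, edges = random_geometric_graph(50, 0.2, seed=0)
print(len(edges))
```

The pairwise distance check is quadratic in the number of nodes, which is adequate for graphs with a few hundred nodes; a spatial grid would be the usual refinement for larger instances.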
In all our experiments, our 2-D region is normalized to a unit square, and the value of d varies from 0 to 1.

These results are shown in Figure 4.9(a). The y-axis shows the runtime in seconds and the x-axis shows the different methodologies for warm starts. These results are averaged over 30 random geometric graphs with 50 nodes, 5 targets, 3 defender resources and d = 0.2. The target payoffs were randomly generated between 0 and 100. The first bar of the graph represents the runtime of Rugged. The second, third and fourth bars are results when warm starts are used for the defender. They represent, respectively, a random choice of k edges, sampling of defender pure strategies from the union of min-cuts between the source $s$ and each target $t \in T$, and pure strategies obtained by sampling the solution obtained using Ranger. The fifth and the sixth bars are results when warm starts are used for the attacker. They use, respectively, the Ranger algorithm and shortest paths from the source $s$ to each target $t \in T$. The seventh bar in the graph represents the results of using mincut-fanout as in Snares. Here, Rugged took 329.69 seconds, random selection of edges for the defender increased the runtime to 435.24 seconds, whereas mincut-fanout reduced it to 76.67 seconds. These results show that while mincut-fanout is effective, other approaches are not as effective and may even perform worse than Rugged.

Better Responses without Warm Starts: We now present the results of using better responses in Snares. We evaluate the use of better responses for each player independently as well as in conjunction. However, no warm starts were used in these experiments, that is, the full Algorithm 3 was used but with Line 1 disabled. These experiments are also done on the same graphs as before: 30 samples of random geometric 50-node graphs with 5 targets, 3 defender resources and d = 0.2. Again, Rugged forms the baseline. These results are shown in Figure 4.9(b). The y-axis shows the runtime in seconds whereas the x-axis shows the different configurations for the experiment.
For example, Rugged took 329.69 seconds whereas using better responses for just the defender reduced the runtime to 33.58 seconds. Moreover, Snares required just 4.46 seconds, an improvement of 98%. These results also show that using better responses for any one player provides a boost in performance, and using them for both players simultaneously makes Snares even more effective.

Figure 4.9(c) shows, on the y-axis (on log-scale), the percentage of iterations in which calls to the best response module have to be made, and varies the configuration on the x-axis. The results are shown for all four configurations from the previous experiment. Each result shows the percentage of iterations in which the better response did not generate an improving strategy, for each of the two players (i.e., in Algorithm 3, the check in Line 5 failed and Line 6 was called for the defender, and the check in Line 9 failed and Line 10 was called for the attacker). The lower the bar, the lower the percentage of times the best response module has to be called. Rugged has no better responses, and is hence plotted at 100%. For example, the best responses of the attacker and the defender were computed in only 15.81% and 1.69% of the iterations in Snares, respectively.

4.4.4.2 Scalability in Simulation

This section evaluates the performance of Snares as the input problem size is varied. These experiments are also conducted on random geometric graphs and the results are averaged over 30 samples. We do not plot error bars in our graphs because the y-axis is on a log-scale; however, Snares was statistically significantly (with p < 0.05) faster than Rugged for all experiments with more than 100 nodes, or $d \ge 0.2$, or $k \ge 4$, or $|T| \ge 4$.

Vary Number of Nodes: Figure 4.10(a) presents the results when varying the number of nodes in the input problem. The x-axis shows the number of nodes in the graph, whereas the y-axis shows the runtime in seconds (on a log-scale). The four bars in the graph compare the performance of Rugged with (i) Snares without better responses, (ii) Snares without mincut-fanout, and (iii) Snares.
These experiments were conducted on random graphs generated with density d = 0.1, 3 targets, 3 source nodes for the attacker, and 3 defender resources. These results show that all configurations are better than the baseline of Rugged, and that Snares is the most efficient. Rugged ran out of memory in 26 out of 30 samples for 200-node graphs, and was killed in another 2 samples since it did not finish execution in 3 hours. For example, for 150 nodes, Rugged took 6,021.08 seconds and Snares without better responses took 4,152.80 seconds. Additionally, Snares without mincut-fanout reduced the runtime to 19.58 seconds and Snares required only 6.53 seconds. Thus, the experiment shows that the combination provides the most significant boost in performance, with Snares taking only about 0.11% of the time required by Rugged (6.53 versus 6,021.08 seconds).

Vary Graph Density: Figure 4.10(b) presents the results when varying the distance d used when generating the random geometric graphs. The x-axis shows the value of d, whereas the y-axis gives the runtime. These experiments were conducted on random graphs generated with 50 nodes, 3 targets, 3 source nodes for the attacker, and 3 defender resources. These results show that as the density of the graph is increased, the runtime required by all the algorithms increases. Experiments that did not finish in 10,800 seconds (3 hours) were terminated (as in the case of most experiments with graph density higher than 0.3 when run with Rugged). These results also show that Snares performs much better than Rugged. For example, for d = 0.3, most experiments with Rugged did not finish in 3 hours, whereas Snares without better responses took 537.19 seconds. Furthermore, Snares without mincut-fanout required 32.74 seconds, whereas Snares required only 9.04 seconds.

Vary Number of Resources: We now present the results when varying the number of defender resources in Figure 4.10(c). These results are shown for random graphs with 50 nodes, 3 targets, 3 sources and graph density d = 0.1. The x-axis in this graph shows the number of defender resources, whereas the y-axis shows the runtime in seconds on log-scale.
The results show that the performance trends remain the same: Snares scales much better than Rugged when the problem size is increased, and the use of better responses provides a more significant performance boost. For example, for 4 resources, Rugged required 83.90 seconds whereas Snares only required 0.72 seconds on average.

[Figure 4.11: Varying number of targets. x-axis: targets; y-axis: runtime in seconds (log-scale); series: Rugged, Snares without better responses, Snares without mincut-fanout, Snares.]

Vary Number of Targets: Figure 4.11 presents the results when varying the number of targets for random graphs with 50 nodes, 3 sources, 3 resources and d = 0.1. The x-axis shows the number of targets, whereas the y-axis shows the runtime in seconds on log-scale. We observe similar performance trends: Snares scales better than Rugged in the number of targets as well. For example, for 4 targets, where Rugged took 10.97 seconds, Snares only required 0.77 seconds.

4.4.4.3 Real Data

We also tested the performance of Snares on real urban road network data. We downloaded the OSM information for the road network of Mumbai [Haklay and Weber, 2008]. We extracted this information for the entire city of Mumbai, that is, the road network existing between latitudes 18.84° and 19.36° and longitudes 72.75° and 73.16°.

We first present results in Table 4.3 comparing Snares with Rugged on the southern part of the Mumbai road network. Rugged was only able to solve for a subset of the Mumbai road network, and thus we can only use a subset of the Mumbai map for this experiment. These experiments were done with 3 targets and 3 sources, whose locations on the map were motivated by the Mumbai attacks of November 2008. These results show that Snares has a significant improvement over Rugged; for example, for a subset of the graph with 252 nodes, Rugged took 34,344.70 seconds to place 4 resources whereas Snares took only 30.77 seconds.

While Rugged could only compute solutions for the southern tip of the road network of Mumbai, Snares was able to solve the entire road network. These results are shown in Table 4.4.
We ran Snares varying the number of targets and number of defender resources for 3 sources. The easy-hard-easy pattern in the runtime with the increase in resources is expected based on the runtime properties of security games [Jain et al., 2012]. For example, it took Snares only 101.09 seconds to find the optimal solution for placing 15 checkpoints when considering 8 targets (Table 4.4). Thus, Snares is now scalable enough to be applied and used in the real world.

Table 4.3: Runtime (in seconds) of Rugged [6] and Snares on real Mumbai map. Results of Snares are averaged over 30 runs.

    Map size                 Resources
    (nodes)    Algorithm     1         2          3           4
    45         Rugged        0.91      6.43       22.58       33.42
    45         Snares        1.08      2.62       7.53        7.71
    129        Rugged        6.63      32.55      486.48      3,140.23
    129        Snares        2.23      2.99       10.99       21.06
    252        Rugged        17.19     626.25     2,014.14    34,344.70
    252        Snares        3.83      4.85       16.19       30.77

Table 4.4: The image is a map of the Mumbai road network comprising 9,503 nodes and 20,416 edges. The sources (blue dots) and targets (red dots) are placed arbitrarily for these tests. Runtime (in seconds) required by Snares; all results are averaged over 30 samples.

                 Number of targets
    Resources    4            8
    1            18.23        27.12
    5            1,209.37     7,289.03
    10           27.26        129.51
    15           12.64        101.09

Chapter 5: Solving Bayesian Games

In this chapter, I present the hierarchical approach to solving general Bayesian games, Bayesian games with arbitrary scheduling constraints as well as Bayesian games with patrolling constraints.

5.1 Solving Bayesian-SPNSC Problem Instances

I now present Hierarchical Bayesian solver for General Stackelberg games, or Hbgs, an algorithm that computes solutions to Bayesian-SPNSC instances. Two main approaches have been proposed in prior work to compute the equilibrium for such Bayesian Stackelberg games, namely DOBSS [Paruchuri et al., 2008a] and Multiple-LPs [Conitzer and Sandholm, 2006] (refer to Section 2.9 for details). DOBSS solves the Bayesian game by solving a mixed-integer linear program that internally decomposes the problem by individual follower types, whereas Multiple-LPs works on the Harsanyi-transformed version of the game.
The exponential number of linear programs that are solved by the Multiple-LPs approach, and the exponential number of possible assignments to the integer variables in DOBSS, do not allow these algorithms to scale well with an increasing number of follower types. Indeed, if the optimal solution could be obtained by solving only a few of these linear programs, the performance could be improved significantly, even beyond that of DOBSS. Specifically, if I could construct a smaller tree of the follower's action choices in the first place, or obtain bounds on solution quality to perform branch and bound search, significant speed-ups would be obtained. This is the intuition behind Hbgs: Hbgs reduces the number of linear programs that need to be solved using two main insights: (1) feasibility rules that help eliminate infeasible follower strategies in the Bayesian game; and (2) bounds that help prune the follower action space using branch and bound search. Hbgs constructs a hierarchical tree of restricted games, the solutions of which provide such feasibility and bounds information. I first discuss the hierarchical structure of Hbgs, and then describe the feasibility and bounding techniques.

5.1.1 Bayesian Game Computation

Computing solutions to a Bayesian Stackelberg security game has been shown to be NP-hard [Conitzer and Sandholm, 2006]. This is chiefly because of the exponentially large attacker action space in a Bayesian Stackelberg security game. The entire action space of the attacker is shown in Figure 5.1. Figure 5.1 shows an attacker action tree, i.e., a depiction that represents possible combinations of actions for each attacker type in a tree. A solution to a Bayesian Stackelberg security game requires specifying an action choice for each possible attacker type in the game, since the defender does not know which attacker type it is playing against. In the tree representation shown here, every level corresponds to a different attacker type, whereas every branch corresponds to one of the possible actions for that type (in our case, the possible targets that can be attacked).
Every leaf node represents a complete strategy for the attacker, with an action for each type defined by the choices along the path from the root to the leaf. For example, the pure strategy $[t^1_2, t^2_1]$, where the attacker of type 1 attacks target $t_2$ and the attacker of type 2 attacks target $t_1$, is represented by the leaf [2,1] in Figure 5.1. In a Bayesian game there are an exponential number of leaves in this tree (specifically, $|T|^{|\Lambda|}$, where $\Lambda$ is the set of attacker types). A game with 10 attacker types and just 5 targets would have 9,765,625 leaves.

[Figure 5.1: Example tree representing the pure strategies for the attacker in a Bayesian Stackelberg security game. Root [*,*]; the first level (type 1) has nodes [1,*] and [2,*]; the second level (type 2) has leaves [1,1], [1,2], [2,1], [2,2].]

One approach to solving a Bayesian Stackelberg game is to consider the optimal solution for the defender in each of the leaf nodes, and take the maximum over all of these possibilities (this is the approach taken in Multiple-LPs [Conitzer and Sandholm, 2006]). The solution for each leaf node is an optimal mixed strategy for the defender, subject to the additional constraints that the attacker choices represented in the path to the leaf node are best responses for each attacker type to the given defender strategy. For example, the solution for leaf [2,1] for our example in Figure 5.1 would be the optimal mixed strategy $x$ for the defender such that the best response of attacker type 1 is target $t_2$ and that of attacker type 2 is target $t_1$. Note that it is possible that a leaf has no solution, i.e., there may not exist a mixed strategy for the defender such that the best responses of the attacker are consistent with the leaf node. These nodes are infeasible. The optimal solution for the entire problem is the solution among all feasible leaf nodes that gives the defender the highest expected utility.

Any solution algorithm for a Bayesian Stackelberg game must consider all possible leaf nodes to find an optimal solution. The Multiple-LPs method [Conitzer and Sandholm, 2006] does this explicitly, enumerating the leaf nodes and solving a linear program to find the optimal strategy for each one.
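To make the size of the attacker action tree concrete, the following minimal Python sketch enumerates its leaves and counts them. The function names `attacker_leaves` and `num_leaves` are hypothetical, and targets are assumed to be numbered from 1; the sketch only illustrates the explicit enumeration that a Multiple-LPs style method faces, not the linear program solved at each leaf.

```python
from itertools import product

def attacker_leaves(num_targets, num_types):
    """Iterate over every leaf of the attacker action tree: one target index
    (1-based, an assumed convention) per attacker type."""
    targets = range(1, num_targets + 1)
    return product(targets, repeat=num_types)

def num_leaves(num_targets, num_types):
    """Number of leaves, |T| ** |Lambda|."""
    return num_targets ** num_types

# The two-type, two-target tree of Figure 5.1 has four leaves.
print(list(attacker_leaves(2, 2)))  # [(1, 1), (1, 2), (2, 1), (2, 2)]
# 10 attacker types and 5 targets.
print(num_leaves(5, 10))            # 9765625
```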
DOBSS [Paruchuri et al., 2008a] has an implicit formulation based on a mixed-integer program that maps each leaf of the attacker action tree to one unique assignment of values to the integer variables. Depending on the solver used for this optimization problem, this may allow for some pruning of leaf nodes in the solution process. In the following sections, I will present Hbgs as using the Multiple-LPs linear program for solving each of the leaf nodes; however, alternate methods could be employed.

5.1.2 Hbgs Solution Methodology

The intuition behind Hbgs is based on two main insights: (1) feasibility rules that help eliminate infeasible attacker strategies in the Bayesian game (a target is deemed infeasible for an attacker type if attacking it cannot be a best response of that type to any possible mixed strategy of the defender); and (2) generation of tighter bounds that help prune the attacker pure strategy space during branch and bound search. Hbgs constructs a hierarchical tree of restricted games with fewer attacker types to provide this information. I first discuss the hierarchical structure of Hbgs, and then the rules that help in pruning the attacker action space for the Bayesian-SPNSC game. Finally, I describe how all of these techniques are combined in the full Hbgs algorithm.

5.1.2.1 Hierarchical Type Trees

Hbgs first constructs a hierarchical structure of restricted games. A restricted game is an instantiation of the input Bayesian-SPNSC instance in which only a subset of the attacker types is represented, ignoring the rest. The probability of facing each attacker type is re-normalized in the restricted game. These restricted games are exponentially smaller than the original input instance, since the problem size scales exponentially with the number of attacker types. The objective of constructing such restricted games is twofold: solutions to these games provide (i) the set of infeasible pure strategies for each attacker type $\lambda$ in the original problem, which can then be pruned out; and (ii) corresponding upper bounds $B^\lambda$ for the set $T^\lambda$ of feasible pure strategies for each attacker type in the restricted game.
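As a small illustration of how a restricted game is formed, the sketch below keeps a subset of the attacker types and re-normalizes their probabilities. It assumes that re-normalization means dividing each retained probability by the total probability of the subset; the function name `restrict_types` and the example type labels are hypothetical.

```python
def restrict_types(type_probs, subset):
    """Return the type distribution of a restricted game that keeps only the
    attacker types in `subset`, re-normalized to sum to one (assumed rule)."""
    total = sum(type_probs[lam] for lam in subset)
    if total <= 0:
        raise ValueError("the chosen subset has zero total probability")
    return {lam: type_probs[lam] / total for lam in subset}

prior = {"type1": 0.5, "type2": 0.3, "type3": 0.2}
restricted = restrict_types(prior, ["type1", "type2"])
print(restricted)
```

With a subset containing a single type, the restricted game is simply the non-Bayesian game against that type, whose probability becomes one.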
For the purposes of computing this feasible set of pure strategies $T^\lambda$ and the upper bounds $B^\lambda$, the Bayesian Stackelberg game $G(\Theta, \Lambda)$ is decomposed into many smaller restricted games, $G(\Theta, \Lambda_i)$, by partitioning the set of types, $\Lambda$, into subsets $\Lambda_i$. Any partition of $\Lambda$ into subsets $\Lambda_i$ is applicable, such that:

$\bigcup_i \Lambda_i = \Lambda$    (5.1)

$\Lambda_i \cap \Lambda_j = \emptyset \quad \forall i, \forall j,\ j \neq i$    (5.2)

Once a partition has been established, a hierarchical type tree is constructed where the root node corresponds to the entire Bayesian-SPARS game $G(\Theta, \Lambda)$, and its children correspond to the restricted games, $G(\Theta, \Lambda_i)$. Note that this is a different tree from the one introduced previously to represent the attacker actions; this tree instead represents a partitioning of the types and not the attacker actions. I present and experimentally evaluate three possible partitioning schemes in this article: (1) a depth-one partition, (2) a depth-two partition, and (3) a fully branched binary tree (where children can then be hierarchically decomposed into even more restricted games). An example game of depth-one partitioning with 4 types is shown in Figure 5.2(a). Here, each restricted game has exactly one type and the total depth of the tree is one. Figure 5.2(b) shows a fully branched binary partitioning, where the entire problem is decomposed into two restricted games of two types each, which are in turn decomposed into two sub-games.

[Figure 5.2: Examples of possible hierarchical type trees generated in Hbgs. Panels: (a) Depth-One Partitioning; (b) Full Binary Partitioning.] The root node is the original Bayesian-SPARS problem instance, $G(\Theta, \Lambda)$. Every other node is a restricted Bayesian game. Figure 5.2(a) shows a depth-one partitioning, where $G(\Theta, \Lambda)$ is decomposed into four restricted games $G(\Theta, \Lambda_i)$, $i \in \{1..4\}$.
Figure 5.2(b), on the other hand, shows the full binary partitioning where the original $G(\Theta, \Lambda)$ is decomposed into two restricted games, which are then further decomposed into two smaller restricted games.

The evaluation of the hierarchical tree is bottom-up, so all children are evaluated before the parent nodes are evaluated. Every node is evaluated using Algorithm 6 (discussed later), and the feasible pure strategies $T_{\Lambda_i}$ with corresponding bounds $B_{\Lambda_i}$ obtained at the $i$-th child are propagated up to the parent. These are used when the parent is evaluated, again using Algorithm 6. This process continues until the root node is solved to obtain the optimal solution for the original game $G(\Theta, \Lambda)$.

5.1.2.2 Pruning a Bayesian Game

(1) Feasibility: Hbgs applies the following theorem to reduce the strategy space $T_\Lambda$ of the attacker.

Theorem 8. The attacker's pure strategy $t = [t^\lambda]$ is infeasible in the Bayesian game $G(\Theta, \Lambda)$ if the strategy $t^\lambda$ is infeasible for the attacker of type $\lambda$ in a restricted game, $G(\Theta, \Lambda')$, where the attacker can only be of type $\lambda$ (that is, $\Lambda' = \{\lambda\}$).

Proof. I prove the theorem by showing the contrapositive. Suppose that the pure strategy $t$ containing $t^\lambda$ is feasible in the original game $G(\Theta, \Lambda)$, with $\delta$ being the corresponding defender optimal mixed strategy. Thus, by the definition of SSE and optimal defender strategy, the best response of the attacker of type $\lambda$ to the defender strategy $\delta$ is $t^\lambda$ even in the restricted game $G(\Theta, \Lambda')$, which proves the theorem. □

Theorem 8 states that if $t^\lambda$ can never be the best response of attacker type $\lambda$ in the restricted game $G(\Theta, \Lambda')$, $\Lambda' = \{\lambda\}$ (that is, a game with only the attacker of type $\lambda$), then a pure strategy containing $t^\lambda$ can never be the best response of the attacker in any Bayesian game $G(\Theta, \Lambda)$, $\Lambda = \{\lambda_1, \lambda_2, \ldots\}$. Thus, any pure strategy containing $t^\lambda$ is infeasible when computing solutions for the original game $G(\Theta, \Lambda)$ and can be omitted. In other words, if some branches in the attacker action tree (Figure 5.1) are infeasible, no leaves in the subtree connected by that branch need to be evaluated.
The theorem can be extended to restricted games with $\Lambda' \subseteq \Lambda$ by considering $\Lambda'$ as one hyper-type. This implies that a pure strategy $t$ can be removed from the Bayesian game if any of its components $t^\lambda$ is infeasible in the corresponding restricted game.

Consider the possible performance improvement for a sample problem with five attacker types ($|\Lambda| = 5$) and ten pure strategies for the attacker of each type ($|T^\lambda| = 10$, $\lambda \in \Lambda$). The total number of pure strategies for the attacker in the Bayesian Stackelberg game is $10^5$. If an oracle could inform us a priori that two particular pure strategies can be discarded for every type of the attacker, the strategy space would reduce to $8^5$ pure strategies, which is roughly 33% of the size of the initial problem.

Hbgs identifies the infeasible strategies in each of the restricted games and applies Theorem 8 to prune out infeasible strategies from $G(\Theta, \Lambda)$. This process is applied recursively in the hierarchical tree (refer to Figure 5.2) to obtain effective pruning at the root node.

(2) Bounds: A pure strategy for the attacker need not be evaluated if an upper bound on the maximum defender expected utility for the corresponding pure strategy is available, and if this upper bound is not better than the best solution known so far. A naive upper bound is +inf, which leads to no pruning and would lead to the conventional Multiple-LPs approach. Hbgs uses techniques for obtaining tighter upper bounds on the defender utility based on Theorem 9.

Theorem 9. The maximal defender payoff is upper bounded by $\sum_{\lambda \in \Lambda} p^\lambda B(t^\lambda)$ when the attacker chooses a pure strategy $t = \langle t^\lambda \rangle$, where $B(t^\lambda)$ is the upper bound on the defender utility in the restricted game $G'(\Theta, \Lambda') \mid \Lambda' = \{\lambda\}$ when the attacker of type $\lambda$ is induced to choose pure strategy $t^\lambda$.

Proof. $B(t^\lambda)$ upper-bounds the maximum utility of the defender for any strategy that induces the attacker of type $\lambda$ to choose $t^\lambda$ as the best response. Thus, the defender utility against the attacker of type $\lambda$ for any strategy $\delta$ is no more than $B(t^\lambda)$. Therefore, $U_\Theta(\delta, t^\lambda) \le B(t^\lambda)$. Applying Equation (??),

$U_\Theta(\delta, t) \le \sum_{\lambda \in \Lambda} p^\lambda B(t^\lambda) \quad \forall \delta$    (5.3)

which proves the theorem.¹ □

¹ This theorem can also be generalized to restricted games where $\Lambda' \subseteq \Lambda$, just like Theorem 8.
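The bound of Theorem 9 is a probability-weighted sum of per-type bounds, which the following minimal Python sketch computes for each leaf of a two-type, two-target tree and compares against a best-known reward. The function name `leaf_bound`, the numeric values and the data layout (a list of per-type dictionaries mapping a target index to its bound) are all assumptions made purely for illustration.

```python
def leaf_bound(leaf, type_probs, per_type_bounds):
    """Theorem 9 style bound for one leaf: sum over attacker types of
    p[lam] * B(t_lam), with B(t_lam) taken from the restricted games."""
    return sum(type_probs[lam] * per_type_bounds[lam][t]
               for lam, t in enumerate(leaf))

# Hypothetical inputs: two types, two targets, made-up per-type bounds.
type_probs = [0.6, 0.4]
per_type_bounds = [{1: 10.0, 2: 4.0},   # bounds for type 1's targets
                   {1: 7.0, 2: 2.0}]    # bounds for type 2's targets
best_so_far = 5.5                        # best defender reward found so far

for leaf in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    b = leaf_bound(leaf, type_probs, per_type_bounds)
    # A leaf needs evaluation only if its bound beats the incumbent.
    print(leaf, round(b, 2), "evaluate" if b > best_so_far else "prune")
```

In this made-up instance only the leaves (1, 1) and (1, 2) have bounds above the incumbent, so the other two leaves could be skipped without solving their linear programs.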
The bounds $B(t^\lambda)$ are generated for all children and then propagated up the hierarchical tree (Figure 5.2), where they are used by the parent to prune out branches from its own Bayesian game (Figure 5.1).

5.1.2.3 Hbgs Description

Hbgs solves each node of the hierarchical tree (refer to Figure 5.2) using Algorithm 6. This algorithm takes as input a set of attacker types $\Lambda$, the defender $\Theta$, the set of feasible pure strategies of the attacker $T_\Lambda$,² upper bounds $B_\Lambda$ on the defender payoff for every attacker pure strategy in $T_\Lambda$, and the utilities $U_\Theta$ and $U_\Psi$ for both the players. The output of the algorithm is the optimal defender mixed strategy $\delta^*$ for this game, the optimal defender reward $r^*$, the updated set of feasible pure strategies $T_\Lambda$ for the attacker, and the updated corresponding upper bounds $B_\Lambda$ on the defender's payoff. These upper bounds $B_\Lambda$ are the exact payoffs $r$ (Line 7, Algorithm 6) for all the attacker pure strategies that are explicitly evaluated by the algorithm, and equal the input upper bounds $B_\Lambda$ for the attacker pure strategies that the algorithm was able to prune without explicit computation. Because the hierarchical tree is evaluated in a bottom-up sequence, each parent node receives updated information from its children.

Once Hbgs starts its computation for a node of the hierarchical tree (refer to Figure 5.2), it constructs a tree representing the attacker actions, as in Figure 5.1. This attacker action tree is then solved using the Multiple-LPs linear programming approach. Only the pure strategies in the cross-product of the feasible sets of strategies of individual types need to be evaluated for the attacker (Theorem 8). $T_\Lambda$ represents this maximal set, as given in Line 2 (and updated later in Line 10). $B_\Lambda$ represents the bounds for all these strategies, and is obtained in Line 3 (and updated later in Line 9).

² $T_\Lambda$ could contain some infeasible pure strategies of the attacker, but no feasible pure strategy will be omitted. For instance, the algorithm starts by considering the exhaustive set of pure strategies.
Lines 2 to 5 are initialization; $T^\lambda(i)$ represents the $i$-th pure strategy in the set $T^\lambda$. The main loop of the algorithm starts after Line 6, where one pure strategy (leaf) is evaluated after another. The function solve (Line 7) in Hbsa solves the column generation procedure of Bayesian-Aspen (described in Section 5.2.1). The attacker pure strategy $t$ is feasible if this Bayesian-Aspen has a feasible solution. The maximal defender reward $r$ and the corresponding defender mixed strategy $\delta$ are also obtained from Bayesian-Aspen (Line 7). If the pure strategy is feasible, the bounds $B_\Lambda$ are updated (Line 9). Otherwise, the strategy $t$ is removed from the pure strategy set $T_\Lambda$ of the attacker (Line 10).

Algorithm 6 Hbgs($\Lambda$, $\Theta$, $T_\Lambda$, $B_\Lambda$, $U_\Theta$, $U_\Psi$)
    // initialize
    // $T_\Lambda$: pruned feasible pure strategy set for all attacker types
    // $B_\Lambda$: bounds for all pure strategies for all attacker types
 1. FT := construct-attacker-Action-Tree($T_\Lambda$)
 2. $T_\Lambda$ := leaves-of(FT)    // feasible pure strategies of $\Lambda$
 3. $B_\Lambda(t)$ := getBounds($t$, $B_\Lambda$)  $\forall t \in \prod_\lambda T^\lambda$
 4. sort($T_\Lambda$, $B_\Lambda(t)$)    // sort $t$ in descending order of $B_\Lambda(t)$
 5. $t$ := $[T^1(1), T^2(1), \ldots, T^{|\Lambda|}(1)]$    // left-most leaf
 6. $r^*$ := $-$inf    // $r^*$: current best known solution
    // start
    repeat
 7.     (feasible, $\delta$, $r$) := solve($\Theta$, $t$)    // using Bayesian-Aspen
        if feasible then
            if $r > r^*$ then    // update current best solution
8a.             $r^*$ := $r$
8b.             $\delta^*$ := $\delta$
 9.         $B_\Lambda(t)$ := $r$    // update bound
        else
10.         $T_\Lambda$ := $T_\Lambda - t$    // remove infeasible strategy
11.     $t$ := getNextStrategy($t$, $r^*$, $T_\Lambda$, $B_\Lambda$)
    until $t$ = NULL
    return ($\delta^*$, $r^*$, $T_\Lambda$, $B_\Lambda$)

The function getNextStrategy() moves from one leaf (pure strategy) to another of this attacker action tree: this is the branching heuristic (Line 11). For example, it would iterate through all the 4 leaves in Figure 5.1 one by one if no leaf was pruned. The defender strategy $\delta^*$ corresponding to the maximal defender reward $r^*$ is the optimal defender strategy for this Bayesian game. Algorithm 6 also returns the set of feasible pure strategies, $T_\Lambda$, and the corresponding bounds, $B_\Lambda$.
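The main loop of Algorithm 6 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the thesis implementation: leaves are visited in globally descending order of their Theorem 9 bound rather than through getNextStrategy, the leaf solver is passed in as a caller-supplied function `solve` (a stand-in for the linear program or column generation step), and all names and numbers in the toy instance are hypothetical.

```python
from itertools import product

def hbgs_node(per_type_strategies, type_probs, per_type_bounds, solve):
    """Branch and bound over the leaves of one attacker action tree.

    solve(leaf) is an assumed oracle returning (feasible, strategy, reward).
    Returns (best_strategy, best_reward, feasible_leaves, leaf_bounds)."""
    def bound(leaf):
        # Probability-weighted sum of per-type bounds (Theorem 9).
        return sum(p * per_type_bounds[lam][t]
                   for lam, (p, t) in enumerate(zip(type_probs, leaf)))

    leaves = sorted(product(*per_type_strategies), key=bound, reverse=True)
    best_reward, best_strategy = float("-inf"), None
    feasible_leaves, leaf_bounds = [], {}
    for leaf in leaves:
        if bound(leaf) <= best_reward:
            # Pruned without evaluation: keep the leaf and its old bound.
            feasible_leaves.append(leaf)
            leaf_bounds[leaf] = bound(leaf)
            continue
        feasible, strategy, reward = solve(leaf)
        if not feasible:
            continue                      # Line 10: drop infeasible leaf
        feasible_leaves.append(leaf)
        leaf_bounds[leaf] = reward        # Line 9: exact payoff replaces bound
        if reward > best_reward:          # Lines 8a-8b: new incumbent
            best_reward, best_strategy = reward, strategy
    return best_strategy, best_reward, feasible_leaves, leaf_bounds

# Toy instance: a lookup table stands in for the leaf solver.
toy = {(1, 1): None, (1, 2): 6.0, (2, 1): 5.0, (2, 2): 3.0}
calls = []
def toy_solve(leaf):
    calls.append(leaf)
    r = toy[leaf]
    return (r is not None, "x%s" % (leaf,), r)

result = hbgs_node([[1, 2], [1, 2]], [0.6, 0.4],
                   [{1: 10.0, 2: 4.0}, {1: 7.0, 2: 2.0}], toy_solve)
print(result[1], calls)
```

In the toy run the solver is called on only two of the four leaves: the first turns out to be infeasible, the second yields the incumbent, and the remaining two are pruned because their bounds do not exceed it.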
This feasible strategy set $T_\Lambda$ is a subset of the cross-product of $T^\lambda$, the feasible strategies per type, since it does not contain the strategies that were computed and found to be infeasible.³ $T_\Lambda$ and $B_\Lambda$ are the feasibility sets and bounds that are propagated up the hierarchical tree; however, I first discuss the branch and bound heuristic used in Algorithm 6.

Branch and Bound Heuristics: Hbgs sorts $t^\lambda \in T^\lambda$, the pure strategies per type, in decreasing order of their bounds $B(t^\lambda)$ before the tree in Figure 5.1 is constructed. The branching heuristic prefers the leaf that can generate the highest defender expected utility. The bounds on each leaf are a direct application of Theorem 9. The function getBounds computes the weighted sum of the bounds per attacker type $B(t^\lambda)$⁴ to generate the bound $B(t)$ for this leaf using Theorem 9.

Tree Traversal and Pruning: Algorithm 7 formally defines the tree-traversal strategy. The algorithm traverses the leaves of the attacker action tree from left to right (lexicographic tree traversal) with the objective of finding the first leaf (pure strategy) whose bound is higher than the current best solution $r^*$. If no such leaf exists, the optimal solution has been achieved and Hbgs can be successfully terminated. This tree is constructed keeping the child nodes sorted in descending order from left to right in every sub-tree. For example, in Figure 5.1, $B(T^1(1)) \ge B(T^1(2))$ (children of root) and $B(T^2(1)) \ge B(T^2(2))$, where $T^\lambda(i)$ represents the $i$-th pure strategy for attacker type $\lambda$. The leaves are evaluated from left to right, that is, the leaf [1,1] is evaluated first and leaf [2,2] last.

If the bound $B$ for any leaf (pure strategy) $t$ is smaller than the best solution obtained thus far, that leaf need not be evaluated. Additionally, right siblings of this leaf $t$ need not be evaluated either, given the sorted nature of every sub-tree.

³ Some of the strategies in $T_\Lambda$ that were not computed may still be infeasible; Algorithm 6 ensures no feasible strategy is removed.

⁴ The bounds are weighted by the distribution $p$ over types.
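The left-to-right traversal with sibling pruning described above can be sketched as follows. The sketch assumes that each type's pure strategies are already sorted in descending order of their bounds and that leaf bounds are the probability-weighted sums of Theorem 9; the names `get_bounds`, `get_next_strategy` and `sorted_strategies`, and the numbers in the example, are hypothetical.

```python
def get_bounds(leaf, type_probs, per_type_bounds):
    """Probability-weighted sum of the per-type bounds of a leaf."""
    return sum(p * per_type_bounds[lam][t]
               for lam, (p, t) in enumerate(zip(type_probs, leaf)))

def get_next_strategy(leaf, best_reward, sorted_strategies,
                      type_probs, per_type_bounds):
    """Return the next leaf worth evaluating, or None if none remains.

    sorted_strategies[lam] lists type lam's pure strategies in descending
    order of their bound (assumed to have been sorted beforehand)."""
    num_types = len(leaf)
    for lam in range(num_types - 1, -1, -1):      # deepest type first
        j = sorted_strategies[lam].index(leaf[lam])
        if j + 1 >= len(sorted_strategies[lam]):
            continue                               # no right sibling; go up
        # Parents stay fixed, type lam moves one step right, and all deeper
        # types restart at their best (left-most) strategy.
        candidate = (list(leaf[:lam])
                     + [sorted_strategies[lam][j + 1]]
                     + [sorted_strategies[i][0]
                        for i in range(lam + 1, num_types)])
        if best_reward < get_bounds(candidate, type_probs, per_type_bounds):
            return tuple(candidate)
        # Otherwise every remaining leaf under this parent is bounded by the
        # candidate's bound, so the search moves up one level.
    return None

probs = [0.6, 0.4]
bounds = [{1: 10.0, 2: 4.0}, {1: 7.0, 2: 2.0}]
order = [[1, 2], [1, 2]]
print(get_next_strategy((1, 1), float("-inf"), order, probs, bounds))
print(get_next_strategy((1, 2), 6.0, order, probs, bounds))
```

In the second call the bound of leaf (2, 1) does not exceed the incumbent reward of 6.0, so neither it nor its right sibling (2, 2) is returned and the traversal ends.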
For example, in Figure 5.1, if the bound of leaf [2,1] is worse than the solution at [1,2], then the leaf [2,2] does not need to be evaluated either. Algorithm 7 accomplishes this type of pruning of branches as well.

Algorithm 7 getNextStrategy($t$, $r^*$, $T_\Lambda$, $B_\Lambda$)
    for $\lambda = |\Lambda|$ to 1 Step $-1$ do
        $j$ := index-of($T^\lambda$, $t^\lambda$)
        // Fix the pure strategies of parents: $t^i$, $i < \lambda$
        // Update the pure strategy of type $\lambda$: $T^\lambda(j+1)$
        // Children choose their best pure strategy: $T^i(1)$, $i > \lambda$
        $t$ := $[t^1, \ldots, t^{\lambda-1}, T^\lambda(j+1), T^{\lambda+1}(1), \ldots, T^{|\Lambda|}(1)]$
        if $r^* <$ getBounds($t$, $B_\Lambda$) then
            return $t$
    return NULL

Hbgs Summary: The leaves of the hierarchical type tree are solved to identify infeasible strategies and obtain upper bounds on every attacker strategy. This information is propagated up the tree, and the procedure is repeated for every node until the optimal solution is obtained at the root. While Hbgs does incur the overhead of solving many smaller restricted games, it outperforms all existing techniques in overall performance, as shown in the experimental results.

5.1.3 Experimental Results

I provide two sets of experimental results. First, I compare the performance of DOBSS, Multiple-LPs and Hbgs for Bayesian-SPNSC games. Second, I show speedups via approximations. The payoffs for both players for all test instances were randomly generated, and were consistent with the definition of security games as defined earlier in the thesis. Results were obtained on a standard 2.8 GHz machine with 2 GB main memory, and are averaged over 30 trials.

[Figure 5.3: This plot shows the comparisons in performance of the four algorithms when the size of the input problem is scaled. Panels: (a) Scaling Up Pure Strategies (5 types), x-axis: number of pure strategies, y-axis: runtime in secs (log-scale); (b) Scaling Up Types (30 pure strategies), x-axis: number of types, y-axis: runtime in secs; (c) Initialization Time versus Total Time. Series: DOBSS, Multiple-LPs, Hbgs-D, Hbgs-F.]
5.1.3.1 Hbgs Scale-up

I compare the runtime of Hbgs against the runtime of DOBSS and Multiple-LPs, the two chief algorithms for general Bayesian Stackelberg games. I use two variants of Hbgs: (1) the first variant, denoted Hbgs-D, constructed a hierarchical tree of a fixed depth of one, where as many restricted games were generated as the number of follower types; (2) the second variant, Hbgs-F, constructed maximally branched binary trees such that each Bayesian game was decomposed into two restricted games with half as many types, until the leaves solved a restricted game with exactly one type. I compared the performance of these algorithms when the number of targets and the number of types were increased. I also show the speed-ups obtained when approximation was allowed.

Scale-up of number of strategies: Figure 5.3(a) shows how the performance of the four algorithms scales when the strategy spaces are increased. These tests were done for 5 types. The x-axis shows the number of pure strategies for both players, while the y-axis shows the runtime in seconds on a log scale. For example, for 30 actions and 5 types, Multiple-LPs would solve $30^5 = 2.43 \times 10^7$ linear programs. The experiments had a time cut-off of 24 hours.

The figure shows that while both variants of Hbgs can successfully compute solutions for 5 types and 30 pure strategies, DOBSS and Multiple-LPs cannot. Furthermore, Hbgs-F with its fully balanced binary tree scales better than Hbgs-D. This is because it solves a much smaller problem at the root node, even though it solves many more restricted problems. Each restricted game provides more pruning (infeasible combinations of follower actions will not be propagated up the tree) and potentially tighter bounds.

Figure 5.3(c) shows an analysis of the time required by Hbgs-D and Hbgs-F in solving all the restricted Bayesian games before the root node of the hierarchical type tree is solved. The x-axis shows the number of pure strategies for both the players and the y-axis shows the percentage of runtime.
It shows that while Hbgs-D spends almost no time in initialization ('Init'), Hbgs-F spends almost 40% of its runtime solving the restricted games. This is because Hbgs-F decomposes the problem more finely and therefore generates, and must solve, more restricted games than Hbgs-D. However, the total time required by Hbgs-F is considerably smaller (Figure 5.3(a)), which shows that deeper hierarchical decompositions obtain more pruning and generate better bounds than depth-one hierarchical trees.

Scale-up of number of types: These results are shown in Figure 5.3(b). For these experiments, both the row and the column player had 30 pure strategies. The x-axis shows the number of types, whereas the y-axis shows the runtime in seconds. Again, the experiments were terminated after a cut-off time of 24 hours. We can see that Hbgs-F scales extremely well compared to the other algorithms; for example, Hbgs-F solved a problem with 6 types in an average of 231 seconds whereas DOBSS took an average of 12,593.8 seconds for the same problem instances. The other two algorithms did not even finish their execution in 24 hours.

While DOBSS and Multiple-LPs do not scale beyond a small number of types, Hbgs-F provides scale-up by an order of magnitude. In Table 5.1, I present the runtime results of Hbgs-F for up to 50 types. The experiments in this case had 5 pure strategies for both players (the other algorithms cannot solve any instance with more than 20 types in 24 hours). This shows that DOBSS is no longer the fastest Bayesian Stackelberg game solution algorithm, and Hbgs-F provides scale-up by an order of magnitude.

Table 5.1: Scaling up types (5 pure strategies per type)

    Types    Follower pure strategy combinations    Runtime (secs)
    10       9.8e6                                  0.41
    20       9.5e13                                 16.33
    30       9.3e20                                 239.97
    40       9.1e27                                 577.49
    50       8.9e34                                 3321.681

5.1.3.2 Approximations

This section discusses the performance scale-ups that can be achieved when the algorithm is allowed to return approximate solutions. Three parameter settings of approximations were allowed: 1 unit, 5 units and 10 units.⁵
The approximations were tried on HBGS-F (with fully branched binary trees), since the prior experiments had shown it to be the fastest algorithm.

The number of types was fixed to 6 and the number of pure strategies was varied for the results shown in Figure 5.4(a). The number of targets here is shown on the x-axis, whereas the y-axis shows the runtime in seconds. Similarly, Figure 5.4(b) shows the results when the number of types was increased while fixing the strategy space to 50 pure strategies for the leader and all follower types. These figures show that the approximation variants of HBGS scale significantly better. For example, while HBGS-F took 43,727 seconds to solve a problem instance with 50 pure strategies and 6 types, the 1, 5 and 10 unit approximations were able to solve the same problem in 10,639, 3,131 and 2,409 seconds respectively, which is up to 18 times faster.

I also analyzed the difference in solution quality when the approximations were allowed, which is shown in Figure 5.4(c). The y-axis shows the percentage error in the actual solution quality of the approximate solution, while the x-axis shows the number of targets. A lower bar implies lower error. For example, the maximum error in all settings for HBGS with an allowed approximation of five units was less than two percent. These results show that allowing for approximate solutions can dramatically increase the scalability of the algorithms without significant loss in solution quality.

Footnote 5: The maximum reward in the matrix was 100 units, and these were chosen as 1%, 5% and 10% of the maximum possible payoff.

[Figure 5.4: This plot shows the comparisons of HBGS and its approximation variants (Approx 1, Approx 5, Approx 10). (a) Scaling Up Targets: runtime (secs) against pure strategies. (b) Scaling Up Types: runtime (secs) against types. (c) Solution Quality: percentage error against targets.]
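The speed-up figures quoted in Section 5.1.3.2 can be checked with a few lines of arithmetic. The sketch below uses only the runtimes and the maximum reward stated in the text; the constant and function names are illustrative choices, not identifiers from the thesis.

```python
# Runtimes (seconds) quoted in the text for 50 pure strategies and 6 types.
EXACT_RUNTIME = 43_727
# Allowed approximation (in payoff units) -> runtime in seconds.
APPROX_RUNTIMES = {1: 10_639, 5: 3_131, 10: 2_409}

# Maximum reward in the payoff matrix, as stated in footnote 5.
MAX_REWARD = 100


def speedup(exact_seconds: float, approx_seconds: float) -> float:
    """How many times faster the approximate run is than the exact run."""
    return exact_seconds / approx_seconds


def tolerance_as_fraction(units: float, max_reward: float = MAX_REWARD) -> float:
    """Allowed approximation error as a fraction of the maximum reward."""
    return units / max_reward


if __name__ == "__main__":
    for units, seconds in APPROX_RUNTIMES.items():
        print(
            f"{units:>2} units ({tolerance_as_fraction(units):.0%} of max reward): "
            f"{speedup(EXACT_RUNTIME, seconds):.1f}x faster"
        )
```

The largest ratio, for the 10-unit setting, is the source of the "up to 18 times faster" statement.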
5.2 Solving Bayesian-SPARS Problem Instances

I now present the Hierarchical Bayesian Solver for Security games with Arbitrary schedules, or HBSA, an algorithm that computes solutions to Bayesian-SPARS instances. First, I extend the basic ASPEN problem formulation to include multiple attacker types so it can be applied to instances of Bayesian-SPARS. HBSA uses the same hierarchical framework as described above for HBGS; however, it uses my new Bayesian-ASPEN formulation to solve each node in the attacker action tree. I now describe the Bayesian-ASPEN approach in more detail.

5.2.1 Bayesian-ASPEN

Bayesian-ASPEN computes solutions to instances of the Bayesian-SPARS problem. It uses column generation to evaluate leaves of the attacker action tree from Figure 5.1, and I first describe this procedure. Then I describe the methods for using branching and bounding heuristics to reduce the overall computation done on the tree.

5.2.1.1 Bayesian-ASPEN Column Generation

The column generation procedure used in Bayesian-ASPEN is similar to the methods used in ASPEN, as described in Section 3.1.1. The primary differences are extensions to the master problem formulation to include multiple attacker types, and updated computations for the reduced cost calculations used in the slave network flow problem.

Master Problem for Bayesian-ASPEN: The defender and the adversary optimization constraints from ASPEN need to be extended over all adversary types, in accordance with the definition of SSE. The master problem, given in Equations (5.4) to (5.8), solves for the probability vector x that maximizes the defender reward (footnote 6). Equations (5.5) and (5.6) enforce the SSE conditions for the defender and the adversary of each type, such that the attacker of each type chooses his best response. Additionally, the chosen defender strategy is one that maximizes the defender's payoff, as given by the objective in Equation (5.4). Here, $d^{\lambda}$ refers to the expected utility of the defender when the defender has committed to the mixed strategy x and the attacker of type $\lambda$ has chosen the best response $a^{\lambda}$.
The defender expected utility for protecting target t against adversary type $\lambda$ is given by the t-th component of the column vector $D^{\lambda} P x + U_{\Theta}^{u,\lambda}$ (the adversary payoff is defined analogously). As before, the leaf node (attacker pure strategy a) is fixed before column generation is executed and is not a variable in this formulation.

\begin{align}
\max \quad & \sum_{\lambda \in \Lambda} d^{\lambda} p^{\lambda} \tag{5.4} \\
\text{s.t.} \quad & d^{\lambda} - (D^{\lambda} P x + U_{\Theta}^{u,\lambda}) \le (1 - a^{\lambda}) M \quad \forall \lambda \in \Lambda \tag{5.5} \\
& 0 \le k^{\lambda} - (A^{\lambda} P x + U_{\Psi}^{u,\lambda}) \le (1 - a^{\lambda}) M \quad \forall \lambda \in \Lambda \tag{5.6} \\
& \sum_{j \in J} x_j = 1 \tag{5.7} \\
& x \ge 0 \tag{5.8}
\end{align}

Footnote 6: Just like in ASPEN, the actual algorithm minimizes the negative of the defender reward for correctness of reduced cost computation, and I show maximization of defender reward only for expository purposes.

Slave Problem: The slave problem for Bayesian-ASPEN is the same as for ASPEN except that the reduced cost of a joint schedule is updated to take into account all the attacker types. The reduced cost $\bar{c}_j$ of variable $x_j$, associated with column $P_j$, for Bayesian-ASPEN is given in Equation (5.9), where $w^{\lambda}$, $y^{\lambda}$, $z^{\lambda}$ and $h$ are dual variables of master constraints (5.5), (5.6-rhs), (5.6-lhs) and (5.7) respectively.

\begin{align}
\bar{c}_j = \sum_{\lambda \in \Lambda} \left( (w^{\lambda})^{T} (D^{\lambda} P_j) + (y^{\lambda})^{T} (A^{\lambda} P_j) - (z^{\lambda})^{T} (A^{\lambda} P_j) \right) - h \tag{5.9}
\end{align}

Again, the reduced costs $\bar{c}_j$ are decomposed into $\hat{c}_t$, reduced costs per target:

\begin{align}
\hat{c}_t = \sum_{\lambda \in \Lambda} \left( w^{\lambda}_{t} D^{\lambda}_{t} + y^{\lambda}_{t} A^{\lambda}_{t} - z^{\lambda}_{t} A^{\lambda}_{t} \right) \tag{5.10}
\end{align}

The column with the minimum reduced cost is identified using the same minimum cost network flow slave formulation as ASPEN, using the new calculation for $\hat{c}_t$.

5.2.1.2 Improving Branching and Bounds

ASPEN used the ORIGAMI heuristic to arrange the leaves representing the attacker pure strategies for a SPARS problem instance, as shown in Figure 3.1. ORIGAMI-S is designed only for security games with one attacker type, and hence is not directly applicable to Bayesian-ASPEN. However, we can adapt the method for Bayesian-ASPEN by solving ORIGAMI-S for each attacker type independently and then using the weighted combination of the solutions as branch and bound heuristics. The solutions are weighted by the probability $p^{\lambda}$ of facing the attacker type $\lambda$.
For example, if the ORIGAMI-S solution for type 1 provides a defender upper bound of $B(t_2^1)$ for target $t_2$, and the ORIGAMI-S solution for type 2 provides a defender upper bound of $B(t_1^2)$ for target $t_1$, then the upper bound used by Bayesian-ASPEN for the leaf [2,1] in Figure 5.1 would be $B(t_2^1) p^1 + B(t_1^2) p^2$, where $p^1$ and $p^2$ are the probabilities of attacker types 1 and 2 respectively. This upper bound is valid because if the defender cannot perform better than $B(t_i^{\lambda})$ against each attacker type $\lambda$ independently, then it cannot perform better than $B(t_i^{\lambda}) p^{\lambda}$ when it is facing the type $\lambda$ with probability $p^{\lambda}$. Bayesian-ASPEN computes the upper bound for all the leaves, sorts them in decreasing order and then uses column generation to compute the exact solution for each leaf until (i) the exact solution for all the leaves has been computed, or (ii) a solution has been obtained that is better than the upper bound for all remaining leaves.

I now introduce HBSA, which builds on Bayesian-ASPEN and further reduces the number of leaves (refer to Figure 5.1) that must be evaluated. This section will also clarify the use of Bayesian-ASPEN in HBSA. I will then provide an experimental comparison of HBSA and Bayesian-ASPEN with ORIGAMI-S in the following experimental results section.

5.2.2 Experimental results

In this section, I compare the performance of HBSA for Bayesian-SPARS games. Since no previous algorithms exist to solve Bayesian security games with scheduling constraints, I compare the performance between Bayesian-ASPEN and variants of HBSA. I tested three different algorithms: (1) The first, Bayesian-ASPEN, as explained in Section 5.2.1. (2) The second, HBSA-D, uses a hierarchical tree with a depth of one (e.g., as shown in Figure 5.2(a)), such that each leaf solves a restricted game with exactly one follower type. (3) The third, HBSA-F, uses a fully branched binary tree, as the one shown in Figure 5.2(b).
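The leaf-ordering rule of Section 5.2.1.2 (prior-weighted per-type bounds, leaves sorted in decreasing order, evaluation stopped once the incumbent is at least as good as every remaining bound) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the per-type bounds are supplied as plain lists rather than computed by ORIGAMI-S, `solve_leaf` is a hypothetical stand-in for the column generation step, targets are 0-indexed, and all leaves are enumerated explicitly, which is only viable for tiny instances.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple

Leaf = Tuple[int, ...]  # leaf[i] is the target attacked by attacker type i


def leaf_upper_bound(
    leaf: Leaf,
    per_type_bounds: Sequence[Sequence[float]],
    priors: Sequence[float],
) -> float:
    """Prior-weighted sum of the per-type defender upper bounds."""
    return sum(
        prior * bounds[target]
        for prior, bounds, target in zip(priors, per_type_bounds, leaf)
    )


def best_first_leaves(
    per_type_bounds: Sequence[Sequence[float]],
    priors: Sequence[float],
    solve_leaf: Callable[[Leaf], Optional[float]],
) -> Tuple[float, Optional[Leaf], int]:
    """Evaluate leaves in decreasing bound order and prune the rest.

    `solve_leaf` (hypothetical) returns the exact defender value of a
    leaf, or None if the leaf is infeasible.
    Returns (best value, best leaf, number of leaves evaluated).
    """
    num_targets = len(per_type_bounds[0])
    leaves = product(range(num_targets), repeat=len(priors))
    ranked = sorted(
        ((leaf_upper_bound(leaf, per_type_bounds, priors), leaf) for leaf in leaves),
        reverse=True,
    )
    best_value, best_leaf, evaluated = float("-inf"), None, 0
    for bound, leaf in ranked:
        if best_value >= bound:
            break  # no remaining leaf can beat the incumbent
        value = solve_leaf(leaf)
        evaluated += 1
        if value is not None and value > best_value:
            best_value, best_leaf = value, leaf
    return best_value, best_leaf, evaluated


if __name__ == "__main__":
    # Toy data: two attacker types, three targets (all values made up).
    bounds = [[4.0, 2.0, 1.0], [3.0, 5.0, 0.5]]
    priors = [0.6, 0.4]

    def toy_solver(leaf: Leaf) -> float:
        # Pretend the exact value is always 0.5 below the bound.
        return leaf_upper_bound(leaf, bounds, priors) - 0.5

    print(best_first_leaves(bounds, priors, toy_solver))
```

In the toy run only one of the nine leaves is solved exactly, because its value already exceeds the bound of every other leaf.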
[Figure 5.5: This plot shows the comparisons in performance of the three algorithms when the input problem is scaled. (a) Scaling Up Targets: runtime against targets. (b) Scaling Up Types: runtime against types.]

5.2.2.1 Scale-up in number of targets:

In these experiments, the number of targets was varied while keeping the number of adversary types fixed to 5. The number of defender resources was set so that the defender is able to cover 10% of the total number of targets. The results are shown in Figure 5.5(a), where the x-axis shows the number of targets and the y-axis shows the runtime in seconds. The graph shows that HBSA-F is the fastest, and scales much better than the Bayesian-ASPEN and HBSA-D variants. The simulations were terminated if they did not finish in 24 hours. For example, HBSA-D and Bayesian-ASPEN did not finish in 24 hours for the case with 70 targets, while HBSA-F was able to solve the problem instance in less than 5 hours.

5.2.2.2 Scale-up in number of types:

These experiments varied the number of types, while keeping the number of targets fixed to 50. The number of resources was set to 5, so as to cover 10% of the total number of targets. The x-axis shows the number of types, whereas the y-axis shows the runtime in seconds. The graph again shows that HBSA-F is the fastest algorithm. Again, the cut-off time for the experiments was 24 hours and, for example, HBSA-D and Bayesian-ASPEN could not solve for 6 types in 24 hours.

5.2.2.3 Effects of using different hierarchies

The objective of these experiments is to further explore the impact of using different tree arrangements on the runtime performance of HBSA.
While my previous experiments showed that the full binary tree arrangement was better than using a depth-one arrangement, this experiment is designed to investigate the trade-off between pre-processing and actual computation: a greater depth in the hierarchical tree implies more pre-processing but also better bounds, which are conflicting factors contributing to the total runtime. Specifically, I explore the runtime performance of an intermediate level tree, the depth-two tree. An example of a depth-two tree for 8 attacker types is shown in Figure 5.6. I generate random Bayesian-SPARS instances with 8 attacker types and compare the runtime performance of the depth-two tree with the full binary tree (depth-three for 8 attacker types). I do not show results for the depth-one tree since that took substantially more time (more than 5 hours for some instances) when the number of targets was more than 10.

[Figure 5.6: A Depth-Two tree for 8 attacker types. The root {1, 2, 3, 4, 5, 6, 7, 8} has two children, {1, 2, 3, 4} and {5, 6, 7, 8}, whose children are the single-type leaves 1 to 8.]

The results are shown in Figure 5.7. The attacker types for these experiments were randomly generated, and were randomly arranged to form the hierarchical structure (in other words, there was no preferential selection of attacker type 1 over attacker type 2 when constructing the hierarchical tree). The results are averaged over 30 problem instances. The runtime performance is shown in Figure 5.7(a). The y-axis in the figure shows the time in seconds on the log scale, whereas the x-axis shows the number of targets in the problem instance. Each cluster has two sets of results: the first two bars correspond to the results for the depth-two hierarchy, whereas the next two bars show the results for a full binary tree hierarchy. The first bar in each set is the time required for pre-processing, i.e. the time required to solve the sub-trees below the root node in the hierarchical type tree (refer to Figure 5.2). The second bar shows the total average runtime required by the algorithm.
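The three tree arrangements discussed in this chapter (depth-one, depth-two and full binary) can be produced by one recursive routine. The sketch below is illustrative only: the class and function names are hypothetical, and it assumes that every level above the last one splits its type set into two halves, which matches the depth-two tree of Figure 5.6 but is otherwise my generalization.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TypeNode:
    """A node of the hierarchical type tree and the attacker types it covers."""
    types: List[int]
    children: List["TypeNode"] = field(default_factory=list)


def build_type_tree(types: List[int], depth: int) -> TypeNode:
    """Build a type tree with at most `depth` levels below the root.

    depth == 1 gives the depth-one tree (one leaf per type); larger
    depths halve the type set at each level until single types remain.
    """
    node = TypeNode(list(types))
    if len(types) <= 1:
        return node  # a single type is always a leaf
    if depth <= 1:
        node.children = [TypeNode([t]) for t in types]
        return node
    mid = (len(types) + 1) // 2
    node.children = [
        build_type_tree(types[:mid], depth - 1),
        build_type_tree(types[mid:], depth - 1),
    ]
    return node


def count_restricted_games(root: TypeNode) -> int:
    """Number of nodes strictly below the root (one restricted game each)."""
    return sum(1 + count_restricted_games(child) for child in root.children)


if __name__ == "__main__":
    attacker_types = list(range(1, 9))  # 8 attacker types
    for depth, label in ((1, "depth-one"), (2, "depth-two"), (3, "full binary")):
        tree = build_type_tree(attacker_types, depth)
        print(label, count_restricted_games(tree))
```

Counting the nodes below the root makes the pre-processing trade-off concrete: deeper trees solve more restricted games before the root is reached.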
For example, for 20 targets, the pre-processing and total runtime required for the depth-two hierarchy are 107 and 2,288 seconds respectively, whereas the corresponding times for the full binary hierarchy are 15 and 1,054 seconds respectively.

The results show that shallower hierarchies are better when the number of targets is small, whereas the full binary hierarchy starts to dominate when the number of targets is increased. This result makes intuitive sense since the pre-processing overhead is a more significant fraction of the total runtime for problem instances with a smaller number of targets. I now show the number of pure strategies of the attacker required to be explored at the root node of both the depth-two and full binary hierarchical type trees (refer to Figure 5.2) in Figure 5.7(b). Remember that the number of pure strategies evaluated refers to the number of leaves explored in the tree representing the attacker's pure strategies (refer to Figure 5.1), and that this directly corresponds to the amount of computation required after the pre-processing has been completed. The y-axis of Figure 5.7(b) shows the number of pure strategies evaluated for the attacker, whereas the x-axis shows the number of targets. For example, the number of pure strategies expanded by the depth-two hierarchy for 20 targets is 184,763, whereas for the full binary hierarchy it is 88,422. Similarly, for 15 targets, the number of pure strategies expanded by the depth-two hierarchy is 51,096, whereas for the full binary hierarchy it is 46,108. Thus, in combination with Figure 5.7(a), the experiment shows (i) the number of pure strategies evaluated for the full binary hierarchy is never greater than the number evaluated for the depth-two hierarchy (the difference in the number of pure strategies evaluated is statistically significant only for 15 and 20 targets in Figure 5.7(b)), and (ii) the depth-two hierarchy is faster for smaller numbers of targets (for 5, 10 and 15 targets in Figure 5.7(a)) and slower for 20 targets.
The results confirm that there exists a trade-off in the choice of the hierarchy: deeper trees are faster only when the number of targets is large; otherwise the overhead of pre-processing dominates the total runtime required to solve the problem.

[Figure 5.7: The figure shows the impact on runtime when the tree hierarchy used in HBSA is varied. These results are for 8 attacker types. I do not show the error bars in this figure; the depth-two hierarchy was faster for 10 and 15 targets, whereas the full binary hierarchy was faster for 20 targets, with statistical significance. (a) Runtime: runtime (in secs, log scale) against targets, with pre-processing and total times for the depth-two and full binary hierarchies. (b) Evaluated Pure Strategies of the Attacker: number of attacker pure strategies evaluated (log scale) against targets.]

5.2.2.4 Different arrangement of types in HBSA

The objective of these experiments is to identify the impact of the distribution of types on the runtime performance of HBSA as the number of targets is increased. I test two arrangements in these experiments: one that puts similar types in the same sub-tree, and another that combines dissimilar types. The optimal coverage vectors $c^{\lambda} = P x^{\lambda}$ are computed per type by solving a restricted SPARS game just for type $\lambda$, and then the K-L divergence [Kullback and Leibler, 1951] between the resulting solutions $c^{\lambda}$ is used as a similarity metric. A smaller K-L distance implies similar marginal coverage vectors; the larger this distance, the more dissimilar the marginals. For this experiment, I generated random instances of the Bayesian-SPARS problem with 8 attacker types, and always arranged them in a full binary tree hierarchy. I conducted experiments with different tree hierarchies such as depth-one, depth-two and full binary tree; however, I present results only for the full binary tree hierarchy since it was found to perform (and scale) the best in all my experiments.
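The similarity metric described above can be sketched as follows. The text does not say how the K-L divergence is applied to marginal coverage vectors, so several choices here are assumptions of mine: each coverage vector is smoothed and rescaled to sum to one, the divergence is symmetrized by adding both directions, and similar types are paired greedily. All names are illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def normalize(coverage: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Smooth a coverage vector and rescale it to sum to one (assumption)."""
    smoothed = [c + eps for c in coverage]
    total = sum(smoothed)
    return [c / total for c in smoothed]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """K-L divergence D(p || q) for strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(cov_a: Sequence[float], cov_b: Sequence[float]) -> float:
    """Symmetrized K-L distance between two coverage vectors (assumption)."""
    p, q = normalize(cov_a), normalize(cov_b)
    return kl_divergence(p, q) + kl_divergence(q, p)


def pair_similar_types(
    coverage_by_type: Dict[int, Sequence[float]]
) -> List[Tuple[int, int]]:
    """Greedily pair the two closest remaining types until none are left."""
    remaining = sorted(coverage_by_type)
    pairs: List[Tuple[int, int]] = []
    while len(remaining) > 1:
        candidates = [
            (a, b) for i, a in enumerate(remaining) for b in remaining[i + 1:]
        ]
        first, second = min(
            candidates,
            key=lambda pair: symmetric_kl(
                coverage_by_type[pair[0]], coverage_by_type[pair[1]]
            ),
        )
        pairs.append((first, second))
        remaining.remove(first)
        remaining.remove(second)
    return pairs


if __name__ == "__main__":
    # Made-up coverage vectors over three targets for four attacker types.
    coverage = {
        1: [0.50, 0.45, 0.05],
        2: [0.45, 0.50, 0.05],
        3: [0.05, 0.10, 0.85],
        4: [0.10, 0.05, 0.85],
    }
    print(pair_similar_types(coverage))
```

Each resulting pair would share a sub-tree in the 'similar' arrangement; a 'dissimilar' arrangement could be obtained by maximizing the distance instead.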
The results are shown in Figure 5.8. These results are averaged over 30 problem instances. The runtime performance is shown in Figure 5.8(a). The y-axis in the figure shows the time in seconds on the log scale, whereas the x-axis shows the number of targets in the problem instance. Each cluster has two sets of results: the first two bars correspond to the results when similar types are combined, whereas the next two bars show the results when dissimilar types are combined. The first bar in each set is the time required for pre-processing, i.e. the time required to solve the sub-trees below the root node in the hierarchical type tree (refer to Figure 5.2). The second bar shows the total average runtime required by the algorithm. For example, for 20 targets, the pre-processing and total runtime required for the 'similar' arrangement are 12 and 1,028 seconds respectively, whereas the corresponding times for the 'dissimilar' arrangement are 21 and 2,578 seconds respectively.

The results show that combining similar types is always more helpful than combining dissimilar types. This result makes intuitive sense since combining similar types is expected to generate tighter bounds (since the defender is better able to optimize against two similar opponents), resulting in more pruning at every subsequent parent node. To test this hypothesis, I also plot the number of pure strategies of the attacker (refer to Figure 5.1) that are expanded at the root node of the hierarchical type tree (refer to Figure 5.2). As in the previous experiment, the number of pure strategies of the attacker represents the number of leaves from Figure 5.1 that are expanded by the HBSA algorithm after the pre-processing has been completed. These results are shown in Figure 5.8(b). The y-axis shows the number of pure strategies evaluated for the attacker, whereas the x-axis shows the number of targets. For example, the number of pure strategies expanded by the 'similar' arrangement for 20 targets is 80,781, whereas for the 'dissimilar' arrangement it is 160,502.
The results confirm my hypothesis that combining similar types results in fewer pure strategies of the attacker being evaluated.

[Figure 5.8: The figure shows the impact on runtime when the distribution of types is varied in the hierarchy used in HBSA. We, again, do not show the error bars because of the log scale of the y-axis, but the similar type arrangement was faster than the dissimilar type arrangement for 10, 15 and 20 targets with statistical significance. (a) Runtime: runtime (in secs, log scale) against targets, with pre-processing and total times for the similar and dissimilar arrangements. (b) Evaluated Pure Strategies of the Attacker: number of attacker pure strategies evaluated (log scale) against targets.]

5.3 Bayesian SPPC Domain

In this section, I present the solution methodology for solving the Bayesian extension of the SPPC domain. Again, the defender does not know the type of the attacker she would be facing; however, the defender does know a prior probability of facing each type. The attacker knows his type as well as the defender strategy, and then computes his best response.

I again present a branch-and-price formulation to compute optimal solutions for the Bayesian extension of the SPPC domain. Here, again, the branch-and-price formulation is composed of a branch and bound module and a column generation module. Again, the actions of the attacker are modeled as an integer variable. The branch and bound assigns a value (i.e. a specific target to attack) to this integer in every branch. The solution at each node of this tree is computed using the column generation procedure. The master and the slave problems for this column generation procedure are described below.

5.3.1 Master Formulation

The objective of the master formulation is to compute the probability distribution x over the set of tours T such that the expected defender utility is maximized. The master formulation is given in Equations (5.11) to (5.16). $\Lambda$ specifies the set of adversary types, and is indexed by $\lambda$.
Again, Equation (5.13) computes the payoff of the defender. Equations (5.14) and (5.15) compute the payoff of the attacker, while ensuring that $q_s^{\lambda} = 1$ is feasible if and only if attacking target s is the best response of the attacker of type $\lambda$. Equations (5.12) and (5.16) ensure that x is a valid probability distribution.

\begin{align}
\min_{x,d,q} \quad & -\sum_{\lambda \in \Lambda} d^{\lambda} \tag{5.11} \\
\text{s.t.} \quad & \sum_{t \in T} x_t \le 1 \tag{5.12} \\
& d^{\lambda} - \sum_{t \in T} x_t z_{st} (\tau_s + R_s) + M q_s^{\lambda} \le M - R_s \tag{5.13} \\
& -k^{\lambda} - \sum_{t \in T} x_t z_{st} (P^{\lambda} + R_s) + R_s \le 0 \tag{5.14} \\
& k^{\lambda} + \sum_{t \in T} x_t z_{st} (P^{\lambda} + R_s) + M q_s^{\lambda} - R_s \le M \tag{5.15} \\
& x_t \in [0, 1] \tag{5.16}
\end{align}

5.3.2 Slave

The objective of the slave formulation is to compute the next best tour to add to the set of tours T. This is again done using a minimum cost integer network flow formulation. The network flow graph is constructed in the same way as before. The updated reduced costs for this variant of the domain are computed using the same standard techniques and are given in Equation (5.17). Here, $w^{\lambda}$, $y^{\lambda}$, $v^{\lambda}$ and $h$ represent the duals of Equations (5.13), (5.14), (5.15) and (5.12) respectively.

\begin{align}
c_t = \sum_{\lambda \in \Lambda} \sum_{s \in S} \left( w_s^{\lambda} (\tau_s + R_s) + (v_s^{\lambda} - y_s^{\lambda})(P^{\lambda} + R_s) \right) z_{st} - h \tag{5.17}
\end{align}

This reduced cost of a tour $c_t$ is again decomposed into reduced costs per target in the following manner:

\begin{align}
c_t &= \sum_{s \in S} \hat{c}_s z_{st} - h \tag{5.18} \\
\hat{c}_s &= \sum_{\lambda \in \Lambda} \left( w_s^{\lambda} (\tau_s + R_s) + (v_s^{\lambda} - y_s^{\lambda})(P^{\lambda} + R_s) \right) \tag{5.19}
\end{align}

These reduced costs per target, $\hat{c}_s$, are then put as the costs on the links of the minimum cost network flow formulation.

The results of the runtime of this domain are presented with detailed analysis in Chapter 6.

Chapter 6: d:s and Phase Transition

Software security assistants built on the framework of Stackelberg security games [Kiekintveld et al., 2009] have been deployed by a variety of real-world security agencies. For example, ARMOR [Jain et al., 2010c] has been in use by the police at Los Angeles International Airport since 2007. Similarly, IRIS [Jain et al., 2010c] and PROTECT [An et al., 2011] have been in use by the US Federal Air Marshals Service and the US Coast Guard since 2009 and 2011 respectively.
Many different algorithms have been proposed for computing solutions to such problems [Conitzer and Sandholm, 2006; Paruchuri et al., 2008b; Gatti, 2008; Kiekintveld et al., 2009; Jain et al., 2010b; Dickerson et al., 2010b; Letchford and Vorobeychik, 2011; Bosansky et al., 2011] with a focus on scalability to enable the application of these models to newer and more complex real-world domains.

In this chapter, I investigate what properties of Stackelberg security game instances make them hard to solve in practice, across different input sizes and different security domains. I show that this question can be answered using the novel concept of the deployment-to-saturation (d:s) ratio. This ratio, which we denote d:s, is defined as the number of deployed defender resources divided by the number of resources beyond which additional deployments would not provide any benefit to the defender. I show that the hardest computational instances arise when this ratio is 0.5, independent of the domain representation, model and solver. Specifically, I consider three different classes of security domains, eight different MIP algorithms (including two algorithms actually deployed in practice), five different underlying MIP solvers, two different variations on the Stackelberg equilibrium concept, and a variety of domain sizes and conditions.

I identify two important implications of my findings. First, new algorithms should be compared on the hardest problem instances; I show that most previous research has compared the runtime performance of algorithms only at low d:s ratios, where problems are comparatively easy. Second, I argue that this computationally hard region is the point where optimization offers the greatest benefit to security agencies, implying that problems in this region deserve increased attention from researchers. Finally, I leverage the concept of phase transitions to better understand this algorithm-independent, computationally hard region. This approach to understanding algorithm-independent structural properties was pioneered by [Cheeseman et al., 1991] and [Mitchell et al., 1992].
They showed that the probability that a uniform-random 3-SAT instance will be satisfiable exhibits a phase transition when the number of variables is fixed and the number of clauses crosses roughly 4.3 times the number of variables, which corresponds to the point where the probability that the instance will be solvable crosses 0.5. Phase transitions have also been used to understand the computational impact of problem structure in optimization problems, such as MAX-SAT [Slaney and Walsh, 2002] and TSP [Gent and Walsh, 1996; Frank et al., 1998]. The approach taken in this work is to identify a phase transition in the decision version of the optimization problem (i.e., asking whether or not a solution exists with a given objective function value). I show that security games exhibit a phase transition at 0.5 for random Stackelberg security game instances, and that this phase transition corresponds to the computationally hardest instances at the d:s ratio of 0.5.

[Figure 6.1: Average running time of DOBSS, ASPEN and the multiple-attacker branch-and-price algorithm for the SPNSC, SPARS and SPPC domains respectively. (a) DOBSS runtime, 2 types, for 10 and 15 targets: runtime (seconds) against resources. (b) ASPEN, 500 schedules, 2 targets per schedule, for 50 and 100 targets: runtime (seconds) against resources. (c) Multiple attackers, 8 targets, for 1 and 2 resources: runtime (seconds) against tour length.]

As mentioned previously, this chapter focuses on the runtime required by the different algorithms that compute solutions for instances of security games for the three domains described above. Figure 6.1 shows the runtime for computing solutions using DOBSS, ASPEN and the multiple-attacker branch-and-price algorithm for the SPNSC, SPARS and SPPC domains respectively. (Recall that these three configurations are particularly interesting, as they are the ones that have been deployed in practice.)
The x-axis in each figure shows the number of resources available to the defender, and the y-axis shows the runtime in seconds. In the SPNSC and the SPARS domains I define the number of resources as the number of officers available to the defender; in the SPPC domain, I define it as the maximum feasible tour length. These graphs show that there is no unified value of the number of defender resources which makes security game instances hard to solve. I also tried normalizing the number of resources by the number of targets; however, I found similar inconsistencies across different domains even with that normalization.

6.1 Deployment to Saturation Ratio

I now propose the novel concept of the deployment-to-saturation (d:s) ratio, a concept that unifies the domain independent properties of problem instances across different security domains. Specifically, I consider the three security domains introduced before, eight different MIP algorithms (including two deployed algorithms), five different underlying MIP solvers, two different equilibrium concepts in Stackelberg security games, and a variety of domain sizes and conditions. I show experimentally that the hardest problem instances occur when the d:s ratio is about 0.5. (Specifically, in my experiments, the hardest problem instances occurred at d:s ratios between 0.48 and 0.54. However, because of discretization, I was not able to test all d:s ratios in all domains. Without exception, the hardest value was the feasible value closest to 0.5. This fact, in combination with my phase transition arguments presented at the end of the chapter, leads me to conclude that the hardest region is precisely d:s = 0.5.)

More specifically, the deployment-to-saturation (d:s) ratio is defined in terms of defender resources, a concept whose precise definition differs from one domain to another. Given this definition, deployment denotes the number of defender resources available to be allocated, and saturation denotes the minimum number of defender resources such that the addition of further resources beyond this point yields no increase in the defender's expected utility.
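The definition above reduces to a single division once deployment and saturation are known. The minimal sketch below applies it to resource counts that appear in the experiments of Section 6.1. For the SPNSC cases it assumes that saturation equals the number of targets, which is my reading of the reported ratios rather than a formula stated in the text; for the SPARS cases the saturation values are the ones quoted in the text. The function name is illustrative.

```python
def ds_ratio(deployment: float, saturation: float) -> float:
    """Deployment-to-saturation ratio: deployed resources over saturation."""
    if saturation <= 0:
        raise ValueError("saturation must be positive")
    return deployment / saturation


if __name__ == "__main__":
    # SPNSC: (resources, targets); saturation assumed equal to #targets.
    spnsc_cases = [(8, 15), (37, 75), (5, 10)]
    for resources, targets in spnsc_cases:
        ratio = ds_ratio(resources, targets)
        print(f"SPNSC {resources} resources, {targets} targets: d:s = {ratio:.2f}")

    # SPARS: (deployed resources, resources required for saturation).
    spars_cases = [(20, 40), (10, 20)]
    for deployed, saturation in spars_cases:
        print(f"SPARS {deployed}/{saturation}: d:s = {ds_ratio(deployed, saturation):.2f}")
```

Because resources and targets are integers, some instance sizes cannot hit 0.5 exactly, which is why values such as 0.53 and 0.49 appear in the results.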
For the SPNSC and the SPARS domains, deployment denotes the number of available security personnel, whereas saturation refers to the minimum number of officers required to cover all targets with a probability of 1. For the SPPC domain, deployment denotes the maximum feasible tour length, while saturation denotes the minimum tour length required by the team of defender resources such that the team can tour all the targets with a probability of 1.

I now present results for all three security domains. All the results shown below are averaged over 100 samples, and were collected on a machine with a 2.7GHz Intel Core i7 processor and 8GB main memory. In all graphs, the x-axis shows the d:s ratio and the y-axis shows the runtime in CPU-seconds. Experiments were conducted using CPLEX 12.2 unless otherwise noted.

6.1.1 SPNSC Domain

For this domain, I considered both the general-sum and security game compact representations.

6.1.1.1 General sum representation.

I conducted experiments varying the algorithm, number of attacker types and number of targets. The results for the general sum representation are plotted in Figures 6.2(a), 6.2(b) and 6.2(c). The payoffs for the two players were selected uniformly at random: the payoffs for success and failure were selected from the ranges [1, 10] and [-10, -1] respectively.

Figure 6.2(a) shows that the runtime required by all three algorithms (Multiple LPs, DOBSS and HBGS) peaks at d:s = 0.53. (While I would like to have observed peaks at d:s = 0.5 in all of my experiments, I typically observed values that were slightly different. The explanation that might come first to mind is variance due to having measured an insufficient number of samples. However, there is also a more critical issue: because the numbers of resources and targets are discrete, I was not able to measure every d:s value. Here, 8 resources and 15 targets corresponded
Here, 8 resources and 15 targets corresponded 135 0.001$ 0.01$ 0.1$ 1$ 10$ 100$ 0$ 0.2$ 0.4$ 0.6$ 0.8$ 1$ Run,me$(seconds)$$ d:s$ra,o$ Varia,on$in$Algorithms:$$ 2$Types,$15$Targets$ Mul,ple$LPs$ DOBSS$ HBGS$$ (a) 0.001$ 0.01$ 0.1$ 1$ 10$ 100$ 0$ 0.2$ 0.4$ 0.6$ 0.8$ 1$ Run,me$(seconds)$$ d:s$ra,o$ DOBSS$Run,me:$2$Types$ 10$Targets$ 15$Targets$ (b) 0.001$ 0.01$ 0.1$ 1$ 10$ 0$ 0.2$ 0.4$ 0.6$ 0.8$ 1$ Run,me$(seconds)$$ d:s$ra,o$ DOBSS$Run,me:$10$Targets$ 2$Types$ 3$Types$ (c) 0.001$ 0.01$ 0.1$ 1$ 10$ 100$ 1000$ 0.00$ 0.20$ 0.40$ 0.60$ 0.80$ 1.00$ Run,me$(seconds)$$ d:s$ra,o$ Brass:$2$Types$ 15$Targets$ 10$Targets$ (d) 0" 0.4" 0.8" 1.2" 1.6" 2" 0" 0.2" 0.4" 0.6" 0.8" 1" Run,me"(seconds)" d:s$ra,o" Eraser"Run,me:"2"Types" 50"Targets" 75"Targets" (e) 0" 0.5" 1" 1.5" 2" 0" 0.2" 0.4" 0.6" 0.8" 1" Run-me"(seconds)" d:s"ra-o" Varia-on"in"Solu-on"Mechanism:"" 2"Types,""50"Targets" CPLEX:"Primal"Simplex" CPLEX:"Dual"Simplex" CPLEX:"Network"Simplex" CPLEX:"Barrier" GLPK"Simplex" (f) Figure6.2: AverageruntimeofcomputingtheoptimalsolutionforaSPNSCprobleminstance. Theverticaldottedlineshowsd:s = 0.5. tod:s = 0.53.) Thissetofexperimentsconsidered2attackertypesand15targets. Moreover,all the three algorithms required the maximum runtime when the d:s ratio was 0.53. For example, MultipleLPsrequired27.9secondstocomputetheoptimalsolution. Thisexperimentshowsthat computationishardestfortheSPNSCprobleminstanceswhenthed:sratioisabout0.5. The next two experiments study runtime variation when the number of targets and the number oftypesinthedomainvary. Figure6.2(b)variesthenumberoftargetsinthedomainfrom10to 136 15. Similarly, Figure 6.2(c) varies the number of types from 2 to 3 in the security domain. For example,DOBSStook1.8secondsonaverageforinstanceswith3attackertypesatthed:sratio of0.5(5resourcesand10targets), Ialso ranexperimentswithBRASS, which computes an✏-Stackelberg equilibrium [Pita etal., 2010]. TheseresultsareshowninFigure6.2(d). 
The hardest instances for BRASS in a domain with 15 targets corresponded to a d:s ratio of 0.53 (8 resources), with BRASS taking 17 seconds.

6.1.1.2 Security game compact representation.

I conducted experiments with ERASER that varied the number of targets and the number of attacker types. I generated payoffs for the players as before. Figure 6.2(e) shows the results for problem instances with 2 attacker types and both 50 and 75 targets. For example, the runtime required for 75 targets for the d:s ratio of 0.49 (37 resources) was 1.8 seconds. This was the computationally hardest point for 75 targets. I also varied the number of types, and those results also confirmed my claim.

Our next experiment investigated the effect on ERASER's runtime of changing its underlying solver and solution mechanism. I plot the runtime required by ERASER with the CPLEX Primal Simplex, CPLEX Dual Simplex, CPLEX Network Simplex, CPLEX Barrier and GLPK Simplex methods. Again, I generated the payoffs and game instances as before. A sample result is that, for the d:s ratio of 0.50 (25 resources), the runtime required by CPLEX Dual Simplex was 0.9 seconds and the runtime required by GLPK Simplex was 1.6 seconds. These experiments again show that the d:s ratio of 0.5 corresponds with the computationally hardest instances across a range of underlying solvers and solution mechanisms for the SPNSC domain.

[Figure 6.3: Average runtime of computing the optimal solution for a SPARS game using ASPEN. The vertical dotted line shows d:s = 0.5. In every panel the x-axis is the d:s ratio and the y-axis is the runtime (seconds). (a) ASPEN runtime, 100 targets, 500 schedules: |S| = 2 and |S| = 4. (b) ASPEN, 100 targets, 2 targets per schedule: 400 and 500 schedules. (c) ASPEN, 500 schedules, 2 targets per schedule: 50 and 100 targets.]

6.1.2 SPARS Domain

I conducted experiments varying the length of a schedule, the number of targets and the number of schedules in SPARS problem instances. ASPEN was used to compute solutions.
As before, payoffs for the two players were selected uniformly at random; the rewards for success were selected uniformly at random from the interval [1, 10], and the penalty for failure uniformly at random from the interval [−10, −1].

Figure 6.3(a) shows the results for SPARS problem instances with 100 targets and 500 schedules, when the number of targets covered by each schedule, denoted by |S|, was varied. For example, the runtime required for |S| = 2 at the d:s ratio of 0.50 (20 deployed resources, 40 required for saturation) was 6.4 seconds. This was computationally the hardest point for |S| = 2. For |S| = 4, the computationally hardest point required 28.9 seconds at d:s = 0.5 (10 available resources, 20 required for saturation).

Figure 6.4: Average runtime for computing the optimal solution for a patrolling domain. The vertical dotted line shows d:s = 0.5. All panels plot runtime (seconds) against the d:s ratio. (a) Multiple attackers, 1 resource: 8 and 6 targets. (b) Multiple attackers, 8 targets: 1 and 2 resources. (c) Bayesian single attacker, 8 targets, 1 resource: 1 and 2 types.

Figure 6.3(b) shows the results for SPARS problem instances with 100 targets and 2 targets per schedule, considering 400 and 500 schedules. For example, the runtime required for 500 schedules for the d:s ratio of 0.50 (20 resources, 40 required for saturation) was 6.4 seconds. Figure 6.3(c) shows the results for SPARS problem instances with 500 schedules and 2 targets per schedule, for 50 and 100 targets. For example, the runtime required for 50 targets for the d:s ratio of 0.50 (10 resources, 20 required for saturation) was 1.9 milliseconds which, again, was the computationally hardest point for 50 targets.

6.1.3 SPPC Domain

I present results for both the multiple attacker and the Bayesian single attacker variants in Figure 6.4.
Figure 6.4(a) shows the results for the multiple attacker SPPC domain with 1 defender resource, and with the number of targets varying from 6 to 8. For example, for the d:s ratio of 0.50, the algorithm took 108.2 seconds to compute the optimal solution. Figure 6.4(b) shows the runtime required to compute the optimal solution for the multiple attacker SPPC domain with 8 targets. It varies the number of defender resources from 1 to 2. For example, for the d:s ratio of 0.50, the algorithm took 144.0 seconds to compute the optimal solution for 2 resources. This was again the computationally hardest point.

Similarly, Figure 6.4(c) shows the runtime required for our branch-and-price based algorithm to compute an optimal solution for the Bayesian single attacker SPPC domain with 8 targets and 1 defender resource, along with the probability p that the decision problem is solvable. It varies the number of types from 1 to 2. For example, for the d:s ratio of 0.50, the algorithm took 2.0 seconds to compute the optimal solution for 1 type.

6.2 Implications of the findings

I have provided evidence that the hardest random Stackelberg game instances occur at a deployment-to-saturation ratio of 0.5. This finding has two key implications.

First, it is important to compare algorithms on hard problems. If random data is used to test algorithms for security domains, it should be generated at a d:s ratio of 0.5. There has indeed been a significant research effort focusing on the design of faster algorithms for security domains. Random data has often been used; unfortunately, I find that it tends not to have come from the d:s = 0.5 region. (Of course, the concept of a deployment-to-saturation ratio did not previously exist; nevertheless, I can assess previous work in terms of the d:s ratio at which data was generated.) For example, [Jain et al., 2011a] compared the performance of HBGS with DOBSS and Multiple LPs, but they only compared d:s ratios between 0.10 and 0.20.
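The d:s values quoted throughout this chapter can be reproduced with simple arithmetic. The sketch below assumes, based on the parenthetical examples in the text (for instance, 20 deployed resources with 40 required for saturation giving 0.50), that the ratio is the number of deployed resources divided by the number required for saturation. It further assumes that in the SPNSC examples the saturation level equals the number of targets, which matches 8 resources and 15 targets giving 0.53. The function name `ds_ratio` is hypothetical and is not taken from the benchmark generator.

```python
def ds_ratio(deployed: int, saturation: int) -> float:
    """Deployment-to-saturation ratio: deployed resources over resources needed to saturate."""
    if saturation <= 0:
        raise ValueError("saturation must be positive")
    return deployed / saturation


# Examples taken from the parenthetical figures in this chapter.
spnsc_15_targets = ds_ratio(8, 15)    # assumed saturation = number of targets
spnsc_75_targets = ds_ratio(37, 75)   # assumed saturation = number of targets
spars_s2 = ds_ratio(20, 40)           # 20 deployed, 40 required for saturation

print(round(spnsc_15_targets, 2), round(spnsc_75_targets, 2), spars_s2)
```

Under these assumptions the three ratios round to 0.53, 0.49 and 0.5, matching the values reported for those experiments.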
Similarly, [Pita et al., 2010] presented runtime comparisons between different algorithms, varying the number of attacker types in the security domain; all of their experiments were fixed at d:s = 0.30 (10 targets, 3 resources). [Jain et al., 2011b] showed scalability results for RUGGED, testing at d:s ratios of 0.10 and (mostly) 0.20. Runtime results have also been presented in other security settings [Dickerson et al., 2010b; Vanek et al., 2011; Bosansky et al., 2011]; these algorithms compute defender strategies for networked domains. Their experiments keep the number of resources fixed and increase the size of the underlying network; however, none of these papers provides enough detail about how instances were generated to allow us to accurately compute the d:s ratio.

To make it easier for future researchers to test their algorithms on hard problems, I have written a benchmark generator for security games that generates instances from d:s = 0.5. This generator is written in Java, and is available for download at http://teamcore.usc.edu/DTS. It allows users to generate instances for all domains described above, as well as to compute solutions using all the algorithms mentioned in this research. It also allows switching between GLPK and CPLEX.

Second, I observe that the region of intermediate d:s values, the computationally hard region, is also the region where optimization is most valuable and hence where security officials are most likely to seek help in optimizing their resource deployment. If the d:s ratio is large, there are enough resources to protect almost all targets, and performing a rigorous optimization offers little additional benefit. If d:s is small, there are not enough resources for optimized deployment to have a significant impact. To make this intuition concrete, I show an example from random data from the SPNSC domain in Figure 6.5. I present results for 50 and 75 targets, each averaged over 100 random instances. The x-axis plots the d:s ratio, and the y-axis shows the difference between the defender utilities obtained by the optimal strategy and a naïve randomization strategy.
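One possible reading of the naïve randomization baseline used in this comparison, which ranks targets by the attacker's payoff for a successful attack and spreads the resources uniformly over twice as many top targets as there are resources, is sketched below. Several details are assumptions rather than the thesis implementation: the four-payoff `Target` record, the cap applied when twice the number of resources exceeds the number of targets, and the attacker best-responding with ties broken in the defender's favor. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    # Hypothetical four-payoff layout, one record per target.
    def_covered: float    # defender payoff if the attacked target is covered
    def_uncovered: float  # defender payoff if the attacked target is uncovered
    att_covered: float    # attacker payoff if the attacked target is covered
    att_uncovered: float  # attacker payoff for a successful attack


def naive_coverage(targets, num_resources):
    """Spread resources uniformly over the top 2 * num_resources targets."""
    n = len(targets)
    k = min(n, 2 * num_resources)  # assumption: cap at the number of targets
    coverage = [0.0] * n
    if k == 0:
        return coverage
    ranked = sorted(range(n), key=lambda i: targets[i].att_uncovered, reverse=True)
    share = min(1.0, num_resources / k)  # 0.5 whenever 2 * num_resources <= n
    for i in ranked[:k]:
        coverage[i] = share
    return coverage


def defender_utility(targets, coverage):
    """Defender's expected utility when the attacker best-responds to coverage."""
    best_att, best_def = None, None
    for t, c in zip(targets, coverage):
        att = c * t.att_covered + (1 - c) * t.att_uncovered
        dfn = c * t.def_covered + (1 - c) * t.def_uncovered
        # Assumption: ties are broken in the defender's favor.
        if best_att is None or att > best_att or (att == best_att and dfn > best_def):
            best_att, best_def = att, dfn
    return best_def


# Illustrative payoffs only; these are not data from the experiments.
example = [
    Target(2.0, -10.0, -2.0, 10.0),
    Target(2.0, -8.0, -2.0, 8.0),
    Target(2.0, -6.0, -2.0, 6.0),
    Target(2.0, -4.0, -2.0, 4.0),
]
example_coverage = naive_coverage(example, num_resources=1)
example_utility = defender_utility(example, example_coverage)
print(example_coverage, example_utility)
```

In this toy instance the single resource is split over the two most attractive targets, and the attacker moves to the best uncovered target. The utility difference plotted in Figure 6.5 would then be the optimal defender utility (from a solver, not shown here) minus the value returned by `defender_utility`.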
A low utility difference implies that the naïve strategy is almost as good as the optimal strategy, whereas a high difference shows that it is worthwhile to invest in computing the optimal strategy. The naïve strategy I use here prioritizes targets based on the attacker's payoff for successfully attacking a target, and then uniformly distributes its resources over twice as many top targets as the number of resources. For example, at a d:s ratio of 0.5, for 50 targets, the difference in utilities between the optimal solution and the solution from the randomized strategy was 5.07 units, whereas it was 0.21 units at a d:s ratio of 1. This suggests that computationally hard settings are also those where security forces would benefit the most from adopting nontrivial strategies; hence, researchers should concentrate on these problems.

6.3 Phase Transitions

All the runtime results show an easy-hard-easy computational pattern as the d:s ratio increases from 0 to 1, with the hardest problems at d:s = 0.5. Such easy-hard-easy patterns have also been observed in other NP-complete problems, most notably 3-SAT [Cheeseman et al., 1991; Mitchell et al., 1992]. In 3-SAT, the hardness of the problems varies with the clause-to-variable (c/v) ratio, with the hardest instances occurring at about c/v = 4.26. The SAT community has used the concept of phase transitions to better understand this hardness peak.

Figure 6.5: The difference between expected defender utilities from ERASER and a naïve randomization policy. The vertical line shows d:s = 0.5. The plot shows the utility difference against the d:s ratio for 50 and 75 targets.

Phase transitions have also been used to study properties of optimization problems. Specifically, one can derive the decision version of an optimization problem by asking "does there exist a solution with objective function value ≥ k?", and then looking for a phase transition in the decision problem.
This approach was pioneered by Gent et al. [Gent and Walsh, 1995], who studied the properties of four different optimization problems including TSP and Boolean circuit synthesis. They computed the optimal tour length from the TSP optimization problem, and then used this value as k to define the decision version of the TSP instance. They showed that a phase transition in the solubility of this problem corresponded to the computationally hardest region. The same approach was later also used by Gent and Walsh [Gent and Walsh, 1996].

In the context of game theory, phase transitions have been used to analyze the efficiency of markets in adaptive games [Savit et al., 1999] and the probability of cooperation in evolutionary game theory [Hauert and Szabó, 2005]. To my knowledge, this work is the first that identifies such structural properties in the context of Stackelberg security games.

6.3.1 Phase Transitions in Security Games

I begin by defining the decision version of the SSE optimization problem, which I denote SSE(D). SSE(D) asks whether there exists a defender strategy that guarantees expected utility of at least the given value D. I want to claim that a phase transition in the decision problem correlates with the hardest random problem instances. However, I obtain different phase transitions for different values of D. Following Gent et al. [Gent and Walsh, 1995], I define D* as the median objective function value achieved in the SSE optimization problem when the d:s ratio is set to 0.5. (Observe that this definition guarantees that at a d:s ratio of 0.5, exactly 50% of problem instances will have a feasible solution. On the other hand, it does not guarantee that there will be a phase transition, i.e., a sharp change, in the probability of solvability.) I estimated D* by sampling 100 random problem instances at d:s = 0.5, and computing the sample median of their objective function values.

Claim 1. As the d:s ratio varies from 0 to 1, the probability p that a solution exists to SSE(D*) exhibits a phase transition at d:s = 0.5. This phase transition is independent of the security domain or its representation.
Furthermore, this phase transition aligns with the computationally hardest instances.

To support this claim, I computed the probability of solvability of the decision problem SSE(D*) for all the security domains and algorithms mentioned above. The phase transition results are shown in Figure 6.6 for the SPNSC domain, in Figure 6.7 for the SPARS domain, and in Figure 6.8 for the SPPC domain. In all figures, the x-axis shows the d:s ratio as before, the primary y-axis shows the runtime in seconds, and the secondary y-axis shows the probability p of finding a solution to SSE(D*). I plot runtimes using solid lines, as before, and plot p using a dashed line. Figure 6.6(c) presents results for the DOBSS algorithm for the SPNSC domain for 10 targets and 2 and 3 attacker types. As expected, the d:s ratio of 0.5 corresponds with p = 0.51 as well as the computationally hardest instances; more interestingly, I observe that p undergoes a phase transition as the d:s ratio grows. Similarly, Figure 6.7(b) shows results for the ASPEN algorithm for the SPARS domain with 100 targets, 2 targets per schedule and 400 and 500 schedules, and Figure 6.8(a) shows results for the multiple attacker SPPC domain for 1 defender resource. In both cases, I again observe a phase transition in p. The runtime results of ERASER varying the number of attacker types are shown in Figure 6.9. The x-axis shows the d:s ratio, whereas the y-axis shows the runtime in seconds.

Figure 6.6: Average runtime of computing the optimal solution for a SPNSC problem instance. The vertical dotted line shows d:s = 0.5. All panels plot runtime (seconds) and probability p against the d:s ratio. (a) Variation in algorithms, 2 types, 15 targets: Multiple LPs, DOBSS, HBGS. (b) DOBSS runtime, 2 types: 10 and 15 targets. (c) DOBSS runtime, 10 targets: 2 and 3 types. (d) BRASS, 2 types: 15 and 10 targets. (e) ERASER runtime, 2 types: 50 and 75 targets. (f) Variation in solution mechanism, 2 types, 50 targets: CPLEX Primal Simplex, CPLEX Dual Simplex, CPLEX Network Simplex, CPLEX Barrier, GLPK Simplex.

Figure 6.7: Average runtime of computing the optimal solution for a SPARS game using ASPEN. The vertical dotted line shows d:s = 0.5. All panels plot runtime (seconds) and probability p against the d:s ratio. (a) ASPEN runtime, 100 targets, 500 schedules: |S| = 4 and |S| = 2. (b) ASPEN, 100 targets, 2 targets per schedule: 400 and 500 schedules. (c) ASPEN, 500 schedules, 2 targets per schedule: 100 and 50 targets.

One might wonder how I can decide that the increase in p is steep enough to be called a phase transition. I know from both the experimental and theoretical literature on phase transitions in decision problems that the transition becomes steeper as problem size increases, approaching a step function in the limit [Kirkpatrick and Selman, 1994; Friedgut, 1998].
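The construction of the decision problem used in this section (estimate D* as the sample median of optimal objective values at d:s = 0.5, then report the fraction of instances at each d:s ratio whose optimum reaches D*) can be sketched as follows. This is a minimal illustration under stated assumptions: the optimal defender utilities are taken as given, since the solvers are not reproduced here, and the numbers in `values_by_ratio` are synthetic placeholders rather than experimental data. The function names are hypothetical.

```python
from statistics import median


def estimate_d_star(values_at_half):
    """Sample median of the optimal objective values observed at d:s = 0.5."""
    return median(values_at_half)


def solvability(values, d_star):
    """Fraction of instances for which SSE(D*) has a solution (optimum >= D*)."""
    return sum(v >= d_star for v in values) / len(values)


# Synthetic placeholder optima, keyed by d:s ratio (not data from the thesis).
values_by_ratio = {
    0.3: [-6.0, -5.5, -5.0, -4.5],
    0.5: [-4.0, -3.0, -2.0, -1.0],
    0.7: [-1.5, -0.5, 0.5, 1.0],
}

d_star = estimate_d_star(values_by_ratio[0.5])
p_by_ratio = {ratio: solvability(vals, d_star) for ratio, vals in values_by_ratio.items()}
print(d_star, p_by_ratio)
```

With an even number of distinct sample values the median falls between the two middle values, so p is exactly 0.5 at d:s = 0.5 in this toy example; in the experiments p is estimated on separate instances, which is consistent with values such as p = 0.51 being reported.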
The steepening of the transition with problem size is a property that I can check experimentally. I conducted experiments in the SPNSC domain, since it is the easiest to solve at widely varying problem sizes; the results are shown in Figure 6.10. The x-axis shows the d:s ratio, and the y-axis shows the probability p of finding a feasible solution to the decision version of a corresponding SPNSC problem; I plot results for problem instances with 50, 100 and 150 targets. I observe the desired result: the phase transition indeed becomes sharper as the number of targets increases.

Figure 6.8: Average runtime for computing the optimal solution for a patrolling domain. The vertical dotted line shows d:s = 0.5. All panels plot runtime (seconds) and probability p against the d:s ratio. (a) Multiple attackers, 1 resource: 8 and 6 targets. (b) Multiple attackers, 8 targets: 1 and 2 resources. (c) Bayesian single attacker, 8 targets, 1 resource: 1 and 2 types.

Figure 6.9: ERASER results with varying number of attacker types. The plot shows ERASER runtime (seconds) and probability p against the d:s ratio for 50 targets, with 2 and 3 types.

Figure 6.10: Probability that the decision problem SSE(D*) is soluble for SPNSC instances of three problem sizes. The phase transition gets sharper as the problem size increases. The plot shows probability p against the d:s ratio (0.3 to 0.7) for 50, 100 and 150 targets.

Chapter 7: Related Work

Related work studying the security domain exists in both the game-theoretic and the non-game-theoretic literature.
Much of this work is theoretical analysis of hypothetical scenarios, while my thesis focuses on algorithms designed for use in real-world security operations. There are three main areas of related work:

1. The first area is on computation for Stackelberg security games.
2. The second area of related work applies Stackelberg games to other security scenarios.
3. The third area of related work applies other optimization techniques to model the security domain.

I now explore these three areas in detail.

7.1 Computation in Stackelberg Security Games

The first area of related work is on efficient algorithms for Stackelberg security games, such as the ones studied in this thesis. Game-theoretic models have been applied in a variety of homeland security settings, such as protecting critical infrastructure [Brown et al., 2006; Pita et al., 2008; Nie et al., 2007; Kiekintveld et al., 2009] and computer network security [Lye and Wing, 2005; Srivastava et al., 2005], and different algorithms have been developed for such models. The related work on these game-theoretic approaches for security can be broadly sub-divided into three categories:

1. Algorithms on computing Stackelberg equilibria,
2. Algorithms to account for human behavior, and
3. Algorithms for computing robust defender strategies.

7.1.1 Efficient computation of Stackelberg equilibria

Within this area of efficient algorithms, the Multiple LPs approach [Conitzer and Sandholm, 2006] was the first to propose optimal computation of leader strategies in a Stackelberg game. While this paper did not envision a security scenario, security domains led to a requirement of Bayesian Stackelberg games. A Bayesian Stackelberg game can be converted to a Stackelberg game by using the Harsanyi transformation [Harsanyi and Selten, 1972]; however, the size of the Harsanyi transformed matrix would grow exponentially with the number of attacker types, inhibiting application for large input problems.
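To see why this growth is prohibitive, the following minimal sketch counts the attacker's pure strategies after the Harsanyi transformation. It assumes that each attacker type independently picks one target to attack, so that a pure strategy in the transformed game is one target choice per type; the function name is hypothetical.

```python
def harsanyi_attacker_strategies(num_targets: int, num_types: int) -> int:
    """Number of attacker pure strategies in the transformed game: one target per type."""
    return num_targets ** num_types


# Growth for a 10-target game as the number of attacker types increases.
for num_types in (1, 2, 3, 5, 10):
    print(num_types, harsanyi_attacker_strategies(10, num_types))
```

Under this assumption a 10-target game already has 1000 transformed attacker strategies with 3 types and ten billion with 10 types, which is the exponential blow-up the text refers to.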
The Multiple LPs approach [Conitzer and Sandholm, 2006] computed optimal defender strategies by operating on the Harsanyi transformed matrix [Harsanyi and Selten, 1972]; while it did not explicitly represent the Harsanyi transformed matrix in memory, it required computing solutions to exponentially many linear programs. DOBSS [Paruchuri et al., 2008b] avoided this Harsanyi transformation, and was the first mixed-integer linear program proposed for computing optimal defender strategies for Stackelberg security games. While DOBSS was much faster than Multiple LPs [Conitzer and Sandholm, 2006], it still requires the input to explicitly represent all the pure strategies of the defender and hence does not scale to domains with thousands of targets and tens of resources.

Continuing with the first area of related work, Origami is a polynomial time algorithm that computes optimal defender strategies for a security game when no scheduling constraints are present (an example of scheduling constraints is the logistical constraints faced by a federal air marshal). ERASER-C, a mixed-integer linear program [Kiekintveld et al., 2009], was then developed for larger and more complex Stackelberg security games. ERASER-C focuses on games with large numbers of defenders with scheduling constraints, handling the combinatorial explosion in the defender's joint schedules. Unfortunately, as the authors note, ERASER-C may fail to generate a correct solution in cases where arbitrary schedules with more than two flights (e.g., multi-city flight tours) are allowed in the input, or when the set of flights cannot be partitioned into distinct sets for departure and arrival flights. ERASER-C avoids enumerating joint schedules to gain efficiency, but loses the ability to correctly model arbitrary schedules. Furthermore, ERASER-C only outputs a coverage vector c, where c_t is the probability of the defender protecting target t. It does not provide the mixed strategy x over joint schedules J of all defender resources, which is necessary to implement the coverage c in practice.
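The relationship between the two objects can be made concrete in the easy direction: given a mixed strategy x over joint schedules, the coverage vector c follows by summing the probability of every joint schedule that covers a target. The sketch below illustrates only this forward computation, with hypothetical names and toy numbers; recovering x from c, which is the hard direction discussed in the text, is not attempted here.

```python
def coverage_from_mixed_strategy(joint_schedules, probabilities, num_targets):
    """Marginal coverage c_t = sum of x_J over joint schedules J that cover target t."""
    coverage = [0.0] * num_targets
    for schedule, prob in zip(joint_schedules, probabilities):
        for target in set(schedule):  # count each target once per joint schedule
            coverage[target] += prob
    return coverage


# Toy example: three joint schedules over three targets (illustrative values only).
toy_schedules = [{0, 1}, {1, 2}, {0, 2}]
toy_x = [0.5, 0.25, 0.25]
toy_c = coverage_from_mixed_strategy(toy_schedules, toy_x, num_targets=3)
print(toy_c)
```

Many different mixed strategies can induce the same coverage vector, and some coverage vectors may not be implementable at all under scheduling constraints, which is why an algorithm that outputs only c leaves an implementation gap.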
Additionally, no other algorithm has yet been provided to convert marginals to probabilities over joint schedules, and in Section 3.2, we present the first efficient algorithm that addresses this challenge for arbitrary scheduling constraints.

The most recent algorithm employing large-scale optimization methods to solve Bayesian Stackelberg games is HUNTER [Yin and Tambe, 2012]. It employs Benders decomposition [Bertsimas and Tsitsiklis, 1994] and cut generation; however, it is not designed to compute an optimal strategy for a defender with scheduling constraints. Double-oracle based approaches that use both cut and column generation have also been proposed for security games [Halvorson et al., 2009a; Jain et al., 2011b]; however, they have only been applied to zero-sum settings. We present algorithms that are not restricted to the strict zero-sum requirement of the previous algorithms, but instead operate on Stackelberg security games [Kiekintveld et al., 2009]. In a security game, protecting a target is always beneficial for the defender and worse for the attacker; however, the payoffs need not be zero-sum.

In contrast, Aspen employs the technique of branch-and-price [Barnhart et al., 1994] and computes solutions for a Stackelberg security game with arbitrary scheduling constraints. Additionally, hierarchical decomposition methods like the one employed by Hbsa have also not seen much application towards obtaining optimal solutions in Bayesian Stackelberg games, although similar techniques have been proposed to obtain approximate Nash equilibria for symmetric games [Wellman et al., 2005].

Game theory has also been applied to a wide range of problems where one player (the evader) tries to minimize the probability of detection by and/or encounter with the other player (the patroller); the patroller wants to thwart the evader's plans by detecting and/or capturing him. The formalization of this problem led to a family of games, often called pursuit-evasion games [Adler et al., 2002].
As there are many potential applications of this general idea, more specialized game types have been introduced, e.g., hider-seeker games [Flood, 1972; Halvorson et al., 2009b] and infiltration games [Alpern, 1992] with mobile patrollers and mobile evaders; search games [Gal, 1980] with mobile patrollers and immobile evaders; and ambush games [Ruckle et al., 1976] with the mobility capabilities reversed. In the game model proposed in this thesis, the evader is mobile whereas the patroller is not, just like in ambush games. However, in contrast with ambush games, we consider targets (termed destinations in ambush games) of varying importance.

The network security game model described in this thesis is most similar to that of interdiction games [Washburn and Wood, 1995], where the evading player (the attacker) moves on an arbitrary graph from one of the origins to one of the destinations (a.k.a. targets), and the interdicting player (the defender) inspects one or more edges in the graph in order to detect the attacker and prevent him from reaching the target. As opposed to interdiction games, we do not consider the detection probability on edges, but we allow different values to be assigned to the targets, which is crucial for real-world applications.

Recent work has also considered scheduling multiple defender resources using cooperative game theory, as in path disruption games [Bachrach and Porat, 2010], where the attacker tries to reach a single known target. In contrast with the static asset protection problem [Dickerson et al., 2010a], we attribute different importance to individual targets, and unlike its dynamic variant [Dickerson et al., 2010a], we consider only static target positions. Recent work in security games and robotic patrolling [Basilico et al., 2009; Jain et al., 2010b] has focused on concrete applications. However, they have not considered the scale-up for both defender and attacker strategies. For example, in ASPEN, the attacker's pure strategy space is polynomially large, since the attacker is not following any path and just chooses exactly one target to attack.
Our game model was introduced by Tsai et al. [Tsai et al., 2010]; however, their approximate solution technique can be suboptimal. We discuss the shortcomings of their approach in Section 4.1, and provide an optimal solution algorithm for the general case.

Finally, algorithms that model security domains using Stackelberg games have seen many real-world deployments. ARMOR [Pita et al., 2008; Jain et al., 2010c] is a software assistant deployed at the Los Angeles International Airport since August 2007 that casts the scheduling problem of vehicular checkpoints and canine patrols as a Stackelberg game and uses the DOBSS algorithm to compute these schedules. Similarly, the IRIS [Jain et al., 2010c] scheduling system, in use by the United States Federal Air Marshal Service since October 2009, generates schedules over international flights for the air marshals using the Aspen algorithm presented in this thesis. GUARDS [Pita et al., 2011b] is also a game-theoretic scheduling system, in use by the United States Transportation Security Administration to randomize behind-the-scenes activities conducted by the security agency. In the same way, PROTECT [An et al., 2011], another software scheduling system based on Stackelberg games, has been in use by the Boston Coast Guard since April 2011 to generate patrol routes for the boats and other resources available to the Boston Coast Guard. Finally, TRUSTS [Yin et al., 2012c] is another software scheduling system based on Stackelberg games that has been in use by the Los Angeles Sheriff's Department since May 2012 to generate strategies for officers inspecting passengers, with the aim of reducing crime and suppressing fare evasion on the Los Angeles metro transit system.

7.1.2 Stackelberg equilibria modeling humans

The second category of related work that applies game theory to security domains computes defender strategies against human adversaries. This work takes into account the biases and bounded rationality of a human opponent.
Cobra is one such algorithm that considers the anchoring bias of humans [Pita et al., 2009], as well as attacker indifference between pure strategies that are at most ε away from the optimal. Alternate solution techniques grounded in psychological concepts like Quantal Response Equilibrium (QRE) have also been proposed [Yang et al., 2011]. The focus of this line of work has been to develop algorithms that perform well against humans; indeed they have been evaluated against human subjects. My research is complementary to this work, and my techniques of column generation are now being combined with quantal response and other models of human decision making.

7.1.3 Computation of robust solutions

The third category of related work aims at computing robust solutions. Kiekintveld et al. [Kiekintveld et al., 2011] model distributions over preferences of an attacker using infinite Bayesian games, and propose an algorithm to generate approximate solutions for such games. Yin et al. [Yin et al., 2012a] also provide robust solutions in the presence of execution and observation errors. The Cobra algorithm [Pita et al., 2009] mentioned before focuses on ε-rationality of the attacker, specifically for human subjects. In contrast, Yin et al. [Yin et al., 2010a] consider the limiting case where an attacker has no observations and thus investigate the equivalence of Stackelberg vs. Nash equilibria. Even earlier investigations have emphasized the value of commitment to mixed strategies in Stackelberg games in the presence of noise [van Damme and Hurkens, 1997]. Secrecy and deception have also been modeled for Stackelberg games [Zhuang and Bier, 2011]. Outside of Stackelberg games, models for execution uncertainty in game theory have been separately developed [Archibald and Shoham, 2009]. Robust solution methods for simultaneous move games have also been studied [Aghassi and Bertsimas, 2006; Porter et al., 2002].

7.2 Stackelberg games in other security scenarios

The second area of related work applies Stackelberg games to other security scenarios. Lawrence et al. [Wein, 2008] apply Stackelberg games in the context of screening visitors entering the US.
In their work, they model the U.S. Government as the leader who specifies the biometric identification strategy to maximize the detection probability using fingerprint matches, and the follower is the terrorist who can manipulate the image quality of the fingerprint. Stackelberg games have also been used for studying missile defense systems [Brown et al., 2005a] and for studying the development of an adversary's weapons systems [Brown et al., 2005b]. A family of Stackelberg games known as inspection games is closely related to the security games we are interested in and includes models of arms inspections and border patrols [Avenhaus et al., 2002]. Another line of recent work is on randomized security and robotic patrolling using Stackelberg games for generic "police and robbers" scenarios [Gatti, 2008; Amigoni et al., 2008; Basilico et al., 2009]. These models use extensive form games to model the domain and present alternative algorithms for patrolling domains.

7.3 Other optimization techniques for security domains

The third area of related work applies other optimization techniques to model the security domain, but does not address the strategic aspects of the problem. These methods provide a randomization strategy for the defender, but they do not take into account the fact that the adversaries can observe the defender's actions and then adjust their behavior. Examples of such approaches include [Ruan et al., 2005; Paruchuri et al., 2006], which are based on learning, Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs). As part of this work, the authors model the patrolling problem with locations and varying incident rates in each of the locations and solve for optimal routes using an MDP framework. Another example is the "Hypercube Queueing Model" [Larson, 1974], which is based on queueing theory and depicts the detailed spatial operation of urban police departments and emergency medical services. It has found application in police beat design, in allocation of patrolling time, etc.
Babu et al. [Babu et al., 2006] have worked on modeling passenger security systems at US airports using linear programming approaches; however, their objective is to classify the passengers into various groups and then screen them based on the group they belong to. Similarly, Agmon et al. [Agmon et al., 2008] analyze the performance of an attacker with zero knowledge, and an attacker with partial knowledge, in a perimeter patrolling domain. Such frameworks can address many of the problems we raise, including different target values and increasing uncertainty by using many possible patrol routes. However, they fail to account for the possibility that an intelligent attacker will observe and exploit patterns in the security policy. If a policy is based on the historical frequency of attacks, it is essentially a reactive policy and an intelligent attacker will always be one step ahead.

Chapter 8: Conclusions

Game-theoretic approaches have shown their usefulness in deployed security applications such as ARMOR for the Los Angeles International Airport [Pita et al., 2008], IRIS for the Federal Air Marshal Service [Jain et al., 2010a], GUARDS for the Transportation Security Administration [Pita et al., 2011a], PROTECT for the Boston Coast Guard [Shieh et al., 2012], and TRUSTS for the Los Angeles Metro Rail System [Yin et al., 2012b]. At the core of these applications is the Stackelberg game model. A solution to these Stackelberg games yields the optimal mixed strategy, or the optimal allocation of the defender's resources.

However, game-theoretic models for real-world problems can have trillions of pure strategies for both players. Scalable algorithms to solve these game-theoretic models are thus required to obtain solutions for real-world domains: e.g., scheduling 10 air marshals to protect 100 flights yields $\binom{100}{10} \approx 1.7 \times 10^{13}$ unique assignments (or pure strategies) for the defender. Such large models cannot even be stored in the memory of computers today, let alone be solved by existing algorithms.
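The count quoted above can be checked directly. The short sketch below computes the number of ways to choose which 10 of 100 flights are covered, and adds a rough storage estimate that assumes, purely for illustration, one 8-byte number per defender pure strategy; that per-strategy size is an assumption and not a figure from the thesis.

```python
import math


def num_assignments(num_flights: int, num_marshals: int) -> int:
    """Number of ways to pick which flights the marshals cover (one marshal per flight)."""
    return math.comb(num_flights, num_marshals)


count = num_assignments(100, 10)
# Illustrative assumption: storing a single 8-byte value per pure strategy.
terabytes = count * 8 / 1e12

print(f"{count:.2e} assignments, roughly {terabytes:.0f} TB at 8 bytes each")
```

Even a single vector with one entry per defender pure strategy would therefore be on the order of a hundred terabytes, before any attacker strategies or payoffs are represented, which motivates approaches that never enumerate the full strategy set.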
I have provided algorithms especially designed to scale up to exponentially many pure strategies for both players, as required to compute solutions for large real-world domains. My research on computational game theory provides algorithms that have enabled optimal solutions to be computed for domains with: (i) trillions of strategies for the defender, (ii) trillions of strategies for the attacker, and (iii) Bayesian Stackelberg games with many adversary types.

These algorithms avoid representing the entire game in memory, while computing solutions for the entire large problem. They are built on the following insights: (i) Real-world domains have exponentially many pure strategies for the defender (e.g., a combination of checkpoints), and so an incremental approach of generating pure strategies of the defender is required. This avoids enumerating all the pure strategies, and only adds a pure strategy if it would help increase the defender's payoff. (ii) In domains with exponentially many attacker pure strategies, an incremental approach to generate pure strategies for the attacker (e.g., attack paths) should be used to avoid enumeration of the pure strategy set of the attacker. (iii) A Bayesian Stackelberg game can be decomposed into hierarchically-organized smaller games, each with a smaller number of attacker types, providing heuristics which can be used to eliminate the never-best-response (that is, dominated) pure strategies of the attacker. I also provide mathematical guarantees for the obtained solutions: the algorithms can compute the globally optimal solution and can also be tuned to return an approximate solution with quality guarantees.

These insights provide speed-ups by reducing the size of the game: while insights on strategy generation restrict the game size by efficiently generating sub-games that include a pure strategy only if it improves the player's payoff, the hierarchical approach pre-processes the input Bayesian Stackelberg game instance and removes the attacker pure strategies that cannot be part of the optimal solution.
Furthermore, I have investigated what properties of Stackelberg security game instances make them hard to solve in practice, across different input sizes and different security domains. The algorithms developed in my thesis have also been deployed in the real world.

8.1 Contributions

While previous algorithms could not scale to domains with more than 1000 pure strategies in Stackelberg games, my thesis provides novel algorithms built on large-scale optimization techniques that scale to billions of pure strategies for both players. My thesis made the following contributions:

• Aspen: I designed Aspen [Jain et al., 2010b] for large security problems with arbitrary scheduling constraints, e.g., the problem of scheduling air marshals on board flights. Aspen builds on the large-scale optimization approach of branch-and-price, which uses a combination of branch-and-bound, linear programming and network flows to efficiently compute the solution. In fact, Aspen is the algorithm that powers the IRIS system.

• Rugged and Snares: I designed Rugged for domains where both the defender and the attacker have billions of pure strategies and they operate on a network [Jain et al., 2011b], e.g., when considering cyber-security or protection of an urban road network. The Rugged algorithm uses a double-oracle methodology, which is similar to column/cut generation approaches proposed in operations research. The Rugged algorithm laid the foundation for Snares, which builds on Rugged, exploits submodularity properties in the problem, and provides orders-of-magnitude speed-ups [Jain et al., 2013]. We are, in fact, in discussions with the police force of Mumbai, India to see how Snares may be made available to them for deploying checkpoints on the roads of Mumbai.

• Hbgs and Hbsa: Furthermore, modeling uncertainty is an important challenge when focusing on real-world domains. Uncertainty is modeled through the Bayesian extension of Stackelberg games. I thus developed an efficient computational approach for solving large Bayesian Stackelberg games, or Stackelberg games with many attacker types.
I proposed a hierarchical methodology of decomposing large Bayesian Stackelberg games into many smaller Bayesian Stackelberg games, and provided a framework to use the solutions to these smaller games to efficiently apply branch-and-bound on the original large Bayesian Stackelberg game. Using this hierarchical methodology, I have provided two algorithms, Hbgs and Hbsa, tailored to compute efficient solutions with and without arbitrary scheduling constraints respectively [Jain et al., 2011a].

• Deployment-to-Saturation ratio: I have studied problem instances for Stackelberg games broadly to identify the properties that make a problem instance computationally challenging. I formalized the concept of the deployment-to-saturation (d:s) ratio in Stackelberg security games, and showed that problem instances for which d:s = 0.5 are computationally harder than instances with other d:s ratios for a wide range of different domains, algorithms, solvers or equilibrium computation methods [Jain et al., 2012]. This work also provides evidence that the computationally hard region is also one where optimization is most beneficial to the real-world security agencies, corroborating the need for algorithmic advances. Furthermore, I use the concept of phase transitions to better understand this computationally hard region.

• Deployments, ARMOR and IRIS: My work has also been deployed in the real world: I have developed algorithms for and subsequently led the building of the IRIS system that has been used by the Federal Air Marshals Service (FAMS) to schedule air marshals on board international commercial flights since October 2009 [Jain et al., 2010c]. I also worked on the ARMOR system, which is in use by the Los Angeles airport police [Pita et al., 2008]. Furthermore, the success of the IRIS and ARMOR systems has led to newer deployments of such algorithms in other real-world security domains, like the PROTECT system for the U.S. Coast Guard [Shieh et al., 2012], and the TRUSTS system for the LA Sheriff Department [Yin et al., 2012b].
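The double-oracle idea mentioned for Rugged in the list above can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the Rugged algorithm itself: the game is a hypothetical 4-target zero-sum game in which the defender covers one target and the attacker attacks one, the target values are invented, and the restricted game is solved only approximately with Hedge (multiplicative-weights) self-play so that nothing beyond the standard library is needed. The function names `solve_restricted` and `double_oracle` are likewise illustrative.

```python
import math

def solve_restricted(A, rows, cols, iters=10000):
    """Approximate equilibrium of the zero-sum game restricted to rows x cols.

    A[i][j] is the defender's payoff (the attacker receives -A[i][j]).
    Both players run Hedge; the time-averaged strategies approach an
    equilibrium of the restricted game.
    """
    span = (max(map(max, A)) - min(map(min, A))) or 1.0

    def hedge(cum):
        # Exponential weights over cumulative payoffs (shifted for stability).
        eta = math.sqrt(8.0 * math.log(len(cum)) / iters) / span
        top = max(cum)
        w = [math.exp(eta * (c - top)) for c in cum]
        total = sum(w)
        return [v / total for v in w]

    cum_d = [0.0] * len(rows)  # cumulative defender payoff of each row
    cum_a = [0.0] * len(cols)  # cumulative attacker payoff of each column
    avg_x = [0.0] * len(rows)
    avg_y = [0.0] * len(cols)
    for _ in range(iters):
        x, y = hedge(cum_d), hedge(cum_a)
        for a, i in enumerate(rows):
            cum_d[a] += sum(A[i][j] * y[b] for b, j in enumerate(cols))
        for b, j in enumerate(cols):
            cum_a[b] -= sum(A[i][j] * x[a] for a, i in enumerate(rows))
        avg_x = [p + q / iters for p, q in zip(avg_x, x)]
        avg_y = [p + q / iters for p, q in zip(avg_y, y)]
    return avg_x, avg_y

def double_oracle(A, iters=10000):
    """Grow both strategy sets until neither oracle finds a new best response."""
    m, n = len(A), len(A[0])
    rows, cols = [0], [0]  # start from one arbitrary pure strategy each
    while True:
        x, y = solve_restricted(A, rows, cols, iters)
        # Defender payoff of every pure strategy of the FULL game against the
        # opponent's current mixed strategy from the restricted game.
        vs_y = [sum(A[i][j] * y[b] for b, j in enumerate(cols)) for i in range(m)]
        vs_x = [sum(A[i][j] * x[a] for a, i in enumerate(rows)) for j in range(n)]
        best_row = max(range(m), key=vs_y.__getitem__)  # defender oracle
        best_col = min(range(n), key=vs_x.__getitem__)  # attacker oracle
        upper, lower = vs_y[best_row], vs_x[best_col]   # bounds on the game value
        new_row, new_col = best_row not in rows, best_col not in cols
        if new_row:
            rows.append(best_row)
        if new_col:
            cols.append(best_col)
        if not (new_row or new_col):
            return rows, cols, x, y, lower, upper

values = [1.0, 2.0, 3.0, 4.0]  # hypothetical target values
# Defender covers target i, attacker attacks target j: the defender loses
# values[j] unless the attacked target is the covered one.
A = [[0.0 if i == j else -values[j] for j in range(4)] for i in range(4)]
rows, cols, x, y, lower, upper = double_oracle(A)
print(sorted(rows), sorted(cols), round(lower, 2), round(upper, 2))
```

In this toy game the full matrix is tiny, so nothing is saved; the point is only the control flow. Each round solves a small restricted game, asks each player's oracle for a best response over the full strategy set, and stops once both best responses are already in the restricted sets. The returned `lower` and `upper` bracket the value of the full game, which for these assumed values is -24/13.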
8.2 Future Plans

The research described in this thesis is driven by the vision of a world where agent technologies are widespread, and working to assist decision makers in critical domains, with emphasis on (i) security, e.g., protecting cyber-networks and transportation systems, (ii) sustainability, e.g., forest and wildlife conservation, and (iii) safety, e.g., healthcare and disaster response.

The mathematical framework of game theory is appropriate for analyzing decision making for all these domains. However, there are three significant research areas that need to be developed towards my vision: (i) Scalable behavioral game theory: computing solutions for large real-world domains in the presence of humans, who may be boundedly rational or have limited computational capability; (ii) Stochastic coalitional game theory: analyzing decision making for a team of agents in an uncertain environment; and (iii) Spatiotemporal game theory: performing computations over continuous space and time. My current research has focused on addressing these challenges, with the focus of my Ph.D. thesis being scalable game-theoretic algorithms. I briefly describe these pillars of game theory below.

8.2.1 Scalable behavioral game theory

Scalable algorithms that incorporate human decision making are required to obtain game-theoretic solutions for real-world domains. This is required because real human adversaries do not necessarily act according to abstract models of rationality: they may have cognitive biases, limited computational ability and uncertain or incomplete information. Instead, humans may rely on heuristics and simplified abstractions for their decision making, which is not captured in traditional game-theoretic models. The insights proposed in my thesis have led to algorithms that can scale to real-world sizes; now these algorithms are being extended with models of human decision making.
I have earlier co-developed game-theoretic algorithms that exploit insights from behavioral models like anchoring bias [Pita et al., 2009], and algorithms using strategy generation for achieving scalability and using quantal response for human decision making are forthcoming. However, much more remains to be done in developing scalable algorithms that perform better against humans individually or in groups.

8.2.2 Stochastic coalitional game theory

Many real-world domains have multiple teams of agents, where agents in one team cooperate and coordinate to jointly achieve a common goal. This coordination between agents leads to coalition structures; however, in real-world domains, such coalitions may face uncertainty and adversaries. Payoffs to the team depend on the combination of actions taken by the individual team members, and these payoffs may also vary over time.

The Distributed Constraint Optimization Problems (DCOPs) framework has been proposed in the literature for coordinating between multiple agents, and provides a basis for coalition games. However, DCOPs are limited since they do not handle stochasticity. Thus, I have defined a new class of Distributed Coordination of Exploration and Exploitation (DCEE) [Jain et al., 2009; Taylor et al., 2010] problems, which balance the exploration and exploitation of multiple mobile agents (e.g., for robot navigation in buildings for post-disaster recovery); I developed novel algorithms and implemented them on actual iRobot Creates.

Coalition formation among agents is a key aspect of real-world decision making, in both non-adversarial and adversarial domains. A coalition of adversaries who may not have perfectly-aligned goals must negotiate with each other on who pays which cost or who receives what benefits. Similarly, a coalition of defenders may have teams with different skill sets and goals. It is also critical to understand the dynamic nature of how these relationships form and change.
I plan to use DCEE for analyzing defender coalitions in adversarial settings by researching equilibrium analysis of such games, as well as investigating strategic adaptation over time, based on both sound mathematical principles and human behavioral models.

8.2.3 Spatiotemporal game theory

Adversarial domains in the real world require both the defenders and the attackers to act in a geographical space, and in continuous time. Incorporating spatiotemporal modeling in game theory requires novel research: current work often does not model continuous spaces and continuous time. Incorporating spatiotemporal reasoning will improve our ability to address domains such as protection of large forest areas or fisheries of the world or protecting endangered species, and help us generate more effective strategies for the defender. This field has not seen much focus, a situation that I would like to redress.

In summary, for my future work, inspired by real-world challenges of security, sustainability and safety, my vision is to take on fundamental challenges that arise in taking computational game theory into the real world, in problem domains that range from physical security to cybersecurity to forest protection to conservation of wildlife to disaster rescue and management. To address these fundamental challenges, I would like to conduct basic research built on foundations of interdisciplinary connections to economics, operations research, psychology, health, and law among other disciplines.

Bibliography

Air Traffic Control: By the Numbers. http://www.natca.org/mediacenter/bythenumbers.msp, 2011.
Micah Adler, Harald Räcke, Naveen Sivadasan, Christian Sohler, and Berthold Vöcking. Randomized pursuit-evasion in graphs. In ICALP, pages 901–912, 2002.
Michele Aghassi and Dimitris Bertsimas. Robust Game Theory. Math. Program., 107:231–273, June 2006.
Noa Agmon, Vladimir Sadov, Gal A. Kaminka, and Sarit Kraus. The Impact of Adversarial Knowledge on Adversarial Planning in Perimeter Patrol. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), volume 1, pages 55–62, 2008.
S. Alpern.
Infiltration Games on Arbitrary Graphs. Journal of Mathematical Analysis and Applications, 163:286–288, 1992.
Francesco Amigoni, Nicola Gatti, and Antonio Ippedico. A game-theoretic approach to determining efficient patrolling strategies for mobile robots. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02, WI-IAT ’08, pages 500–503, Washington, DC, USA, 2008. IEEE Computer Society. ISBN 978-0-7695-3496-1. doi: 10.1109/WIIAT.2008.324. URL http://dx.doi.org/10.1109/WIIAT.2008.324.
Bo An, James Pita, Eric Shieh, Milind Tambe, Chris Kiekintveld, and Janusz Marecki. Guards and protect: next generation applications of security games. SIGecom Exch., 10(1):31–34, March 2011. ISSN 1551-9031. doi: 10.1145/1978721.1978729. URL http://doi.acm.org/10.1145/1978721.1978729.
Christopher Archibald and Yoav Shoham. Modeling Billiards Games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2009.
Rudolf Avenhaus, Bernhard von Stengel, and Shmuel Zamir. Inspection Games. In Robert J. Aumann and Sergiu Hart, editors, Handbook of Game Theory, volume 3, chapter 51, pages 1947–1987. North-Holland, Amsterdam, 2002.
L. Babu, L. Lin, and R. Batta. Passenger Grouping Under Constant Threat Probability in an Airport Security System. In European Journal of Operational Research, volume 168, pages 633–644, 2006.
Yoram Bachrach and Ely Porat. Path Disruption Games. In AAMAS, pages 1123–1130, 2010.
C. Barnhart, E.L. Johnson, G.L. Nemhauser, M.W.P. Savelsbergh, and P.H. Vance. Branch and Price: Column Generation for Solving Huge Integer Programs. In Operations Research, volume 46, pages 316–329, 1994.
Tamer Basar and Geert Jan Olsder. Dynamic Noncooperative Game Theory. Academic Press, San Diego, CA, 2nd edition, 1995.
N. Basilico, N. Gatti, and F. Amigoni. Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topologies. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 500–503, 2009.
Dimitris Bertsimas and John N. Tsitsiklis. Introduction to Linear Optimization. Athena Scientific, 1994.
Mikel Blanco, Aurelia Valino, Joost Heijs, Thomas Baumert, and Javier Gonzalez Gomez. The Economic Cost of March 11: Measuring the direct economic cost of the terrorist attack on March 11, 2004 in Madrid. Terrorism and Political Violence, 19(4):489–509, 2007.
Branislav Bosansky, Viliam Lisy, Michal Jakob, and Michal Pechoucek. Computing Time-Dependent Policies for Patrolling Games with Mobile Targets. In Tenth International Conference on Autonomous Agents and Multiagent Systems, pages 989–996, 2011.
Michele Breton, A. Alg, and Alain Haurie. Sequential Stackelberg Equilibria in Two-Person Games. Optimization Theory and Applications, 59(1):71–97, 1988.
G. Brown, M. Carlyle, J. Kline, and K. Wood. A Two-Sided Optimization for Theater Ballistic Missile Defense. In Operations Research, volume 53, pages 263–275, 2005a.
G. Brown, M. Carlyle, J. Royset, and K. Wood. On The Complexity of Delaying an Adversary’s Project. In B. Golden, S. Raghavan, and E. Wasil, editors, The Next Wave in Computing, Optimization and Decision Technologies, pages 3–17. Springer, 2005b.
Gerald Brown, Matthew Carlyle, Javier Salmeron, and Kevin Wood. Defending Critical Infrastructure. In Interfaces, volume 36, pages 530–544, 2006.
Rina Chandran and Greg Beitchman. Battle for Mumbai Ends, Death Toll Rises to 195. Times of India, 29 November 2008.
Peter Cheeseman, Bob Kanefsky, and William M. Taylor. Where the Really Hard Problems are. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 331–337, 1991.
CNN. Sources: Air Marshals missing from almost all flights. 2008. http://articles.cnn.com/2008-03-25/travel/siu.air.marshals_1_air-marshals-federal-air-flights?_s=PM:TRAVEL/.
Vincent Conitzer and Tuomas Sandholm. Computing the Optimal Strategy to Commit to. In Proceedings of the ACM Conference on Electronic Commerce (ACM-EC), pages 82–90, 2006.
John Dickerson, Gerardo Simari, V.S. Subrahmanian, and Sarit Kraus. A Graph-Theoretic Approach to Protect Static and Moving Targets from Adversaries. In AAMAS, pages 299–306, 2010a.
J.P. Dickerson, G.I. Simari, V.S. Subrahmanian, and Sarit Kraus. A Graph-Theoretic Approach to Protect Static and Moving Targets from Adversaries. In Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2010), pages 299–306, 2010b.
David Eppstein and Michael T. Goodrich. Studying (Non-Planar) Road Networks Through an Algorithmic Lens. CoRR, abs/0808.3694, 2008.
Merrill M. Flood. The Hide and Seek Game of Von Neumann. Management Science, 18(5-Part-2):107–109, 1972.
Jeremy Frank, Ian P. Gent, and Toby Walsh. Asymptotic and Finite Size Parameters for Phase Transitions: Hamiltonian Circuit as a Case Study. Inf. Process. Lett., 65:241–245, March 1998. ISSN 0020-0190. doi: http://dx.doi.org/10.1016/S0020-0190(97)00222-6. URL http://dx.doi.org/10.1016/S0020-0190(97)00222-6.
Ehud Friedgut. Sharp thresholds of graph properties, and the k-sat problem. J. Amer. Math. Soc, 12:1017–1054, 1998.
Shmuel Gal. Search Games. Academic Press, New York, 1980.
Nicola Gatti. Game Theoretical Insights in Strategic Patrolling: Model and Algorithm in Normal-Form. In Proceedings of the European Conference on Artificial Intelligence (ECAI), pages 403–407, 2008.
Ian P. Gent and Toby Walsh. Phase transitions from real computational problems. In Proceedings of the 8th International Symposium on Artificial Intelligence, pages 356–364, 1995.
Ian P. Gent and Toby Walsh. The TSP Phase Transition. Artificial Intelligence, 88(12):349–358, 1996.
M. Haklay and P. Weber. Openstreetmap: user-generated street maps. Pervasive Computing, IEEE, 7(4):12–18, 2008.
E. Halvorson, V. Conitzer, and R. Parr. Multi-step multi-sensor hider-seeker games. In IJCAI, pages 336–341, 2009a.
Erik Halvorson, Vincent Conitzer, and Ronald Parr. Multi-step Multi-sensor Hider-Seeker Games. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pages 159–166, 2009b.
J.C. Harsanyi and R. Selten. A Generalized Nash Solution for Two-person Bargaining Games with Incomplete Information. In Management Science, volume 18, pages 80–106, 1972.
Christoph Hauert and György Szabó. Game theory and Physics. Am. J. Phys., 73(5):405–414, 2005. doi: 10.1119/1.1848514.
M. Jain, J. Tsai, J. Pita, C. Kiekintveld, S. Rathi, M. Tambe, and F. Ordóñez. Software assistants for randomized patrol planning for the LAX Airport Police and the Federal Air Marshals Service. Interfaces, 40:267–290, 2010a.
Manish Jain, Matthew E. Taylor, Milind Tambe, and Makoto Yokoo. Dcop meets the real world: Exploring unknown reward matrices with applications to mobile sensor nets. In IJCAI, 2009.
Manish Jain, Erim Kardes, Christopher Kiekintveld, Fernando Ordóñez, and Milind Tambe. Security Games with Arbitrary Schedules: A Branch and Price Approach. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2010b.
Manish Jain, Jason Tsai, James Pita, Christopher Kiekintveld, Shyamsunder Rathi, Milind Tambe, and Fernando Ordóñez. Software Assistants for Randomized Patrol Planning for the LAX Airport Police and the Federal Air Marshals Service. Interfaces, 40:267–290, 2010c.
Manish Jain, Christopher Kiekintveld, and Milind Tambe. Quality-bounded Solutions for Finite Bayesian Stackelberg Games: Scaling up. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2011a.
Manish Jain, Dmytro Korzhyk, Ondrej Vanek, Vincent Conitzer, Michal Pechoucek, and Milind Tambe. A Double Oracle Algorithm for Zero-Sum Security Games on Graphs. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2011b.
Manish Jain, Kevin Leyton-Brown, and Milind Tambe. The deployment-to-saturation ratio in security games. In AAAI, 2012.
Manish Jain, Vincent Conitzer, Michal Pechoucek, and Milind Tambe. Security scheduling for real-world networks. In AAMAS, 2013.
Albert Jiang and Kevin Leyton-Brown. A polynomial-time algorithm for action-graph games. In Artificial Intelligence, pages 679–684, 2006.
Armen Keteyian. TSA: Federal Air Marshals. 2010. http://www.cbsnews.com/stories/2010/02/01/earlyshow/main6162291.shtml, retrieved Feb 1, 2011.
Christopher Kiekintveld, Manish Jain, Jason Tsai, James Pita, Milind Tambe, and Fernando Ordóñez. Computing Optimal Randomized Resource Allocations for Massive Security Games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 689–696, 2009.
Christopher Kiekintveld, Janusz Marecki, and Milind Tambe. Approximation Methods for Infinite Bayesian Stackelberg Games: Modeling Distributional Payoff Uncertainty. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2011.
Scott Kirkpatrick and Bart Selman. Critical behavior in the satisfiability of random boolean formulae. Science, 264:1297–1301, 1994.
Daphne Koller and Brian Milch. Multi-agent influence diagrams for representing and solving games. Games and Economic Behavior, 45(1):181–221, 2003.
D. Korzhyk, V. Conitzer, and R. Parr. Complexity of Computing Optimal Stackelberg Strategies in Security Resource Allocation Games. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 805–810, 2010.
Solomon Kullback and Richard A. Leibler. On Information and Sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, 1951.
R.C. Larson. A Hypercube Queueing Modeling for Facility Location and Redistricting in Urban Emergency Services. In Journal of Computers and Operations Research, volume 1, pages 67–95, 1974.
George Leitmann. On Generalized Stackelberg Strategies. Optimization Theory and Applications, 26(4):637–643, 1978.
Joshua Letchford and Yevgeniy Vorobeychik. Computing Randomized Security Strategies in Networked Domains. In Proceedings of the Workshop on Applied Adversarial Reasoning and Risk Modeling (AARM) at AAAI, 2011.
Joshua Letchford, Vincent Conitzer, and Kamesh Munagala. Learning and Approximating the Optimal Strategy to Commit To. In Proceedings of the International Symposium on Algorithmic Game Theory (SAGT), pages 250–262, 2009.
Robert Looney. Economic Costs to the United States Stemming From the 9/11 Attacks. Strategic Insights, 1(6), August 2002.
Kong-wei Lye and Jeannette M. Wing. Game strategies in network security. International Journal of Information Security, 4(1–2):71–86, 2005.
H. Brendan McMahan, Geoffrey J. Gordon, and Avrim Blum. Planning in the Presence of Cost Functions Controlled by an Adversary. In Proceedings of the International Conference on Machine Learning (ICML), pages 536–543, 2003.
D. Mitchell, B. Selman, and H. Levesque. Hard and Easy Distributions of SAT Problems. In Proceedings of the American Association for Artificial Intelligence, pages 459–465, 1992.
G.L. Nemhauser, L.A. Wolsey, and M.L. Fisher. An Analysis of Approximations for Maximizing Submodular Set Functions–I. Mathematical Programming, 14(1):265–294, Dec 1978.
Nie, R. Batta, Drury, and Lin. Optimal Placement of Suicide Bomber Detectors. In Military Operations Research, volume 12, pages 65–78, 2007.
Martin J. Osborne and Ariel Rubinstein. A Course in Game Theory. MIT Press, 1994.
Praveen Paruchuri, Milind Tambe, Fernando Ordóñez, and Sarit Kraus. Security in Multiagent Systems by Policy Randomization. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2006.
Praveen Paruchuri, Jonathan P. Pearce, Janusz Marecki, Milind Tambe, Fernando Ordóñez, and Sarit Kraus. Playing games with security: An efficient exact algorithm for Bayesian Stackelberg games. In AAMAS-08, pages 895–902, 2008a.
Praveen Paruchuri, Jonathan P. Pearce, Janusz Marecki, Milind Tambe, Fernando Ordóñez, and Sarit Kraus. Playing Games with Security: An Efficient Exact Algorithm for Bayesian Stackelberg games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 895–902, 2008b.
J. Pita, C. Kiekintveld, M. Tambe, E. Steigerwald, and S. Cullen. GUARDS - game theoretic security allocation on a national scale. In AAMAS, pages 37–44, 2011a.
James Pita, Manish Jain, Craig Western, Christopher Portway, Milind Tambe, Fernando Ordóñez, Sarit Kraus, and Praveen Paruchuri. Deployed ARMOR Protection: The Application of a Game-theoretic Model for Security at the Los Angeles International Airport. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Industry Track, pages 125–132, 2008.
James Pita, Manish Jain, Fernando Ordóñez, Milind Tambe, Sarit Kraus, and Reuma Magori-Cohen. Effective solutions for real-world Stackelberg games: When agents must deal with human uncertainties. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2009.
James Pita, Manish Jain, Fernando Ordóñez, Milind Tambe, and Sarit Kraus. Robust Solutions to Stackelberg Games: Addressing Bounded Rationality and Limited Observations in Human Cognition. Artificial Intelligence Journal, 174(15):1142–1171, 2010.
James Pita, Christopher Kiekintveld, Milind Tambe, Erin Steigerwald, and Shane Cullen. GUARDS - Game Theoretic Security Allocation on a National Scale. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2011b.
R. Porter, A. Ronen, Y. Shoham, and M. Tennenholtz. Mechanism Design with Execution Uncertainty. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2002.
S. Ruan, C. Meirina, F. Yu, K.R. Pattipati, and R.L. Popp. Patrolling in a Stochastic Environment. In 10th Intl. Command and Control Research and Tech. Symp., 2005.
William Ruckle, Robert Fennell, Paul T. Holmes, and Charles Fennemore. Ambushing Random Walks I: Finite Models. Operations Research, 24:314–324, 1976.
Todd Sandler and Daniel G. Arce M. Terrorism and Game Theory. Simulation and Gaming, 34(3):319–337, 2003.
Robert Savit, Radu Manuca, and Rick Riolo. Adaptive Competition, Market Efficiency, Phase Transitions and Spin-Glasses. University of Michigan, 82:2203–2206, 1999.
E. Shieh, B. An, R. Yang, M. Tambe, C. Baldwin, J. DiRenzo, B. Maule, and G. Meyer. PROTECT: An application of computational game theory for the security of the ports of the United States. In AAAI, pages 2173–2179, 2012.
John Slaney and Toby Walsh. Phase Transition Behavior: from Decision to Optimization. In Proceedings of the 5th International Symposium on the Theory and Applications of Satisfiability Testing, SAT, 2002.
Vivek Srivastava, James Neel, Allen B. MacKenzie, Rekha Menon, Luiz A. Dasilva, James E. Hicks, Jeffrey H. Reed, and Robert P. Gilles. Using Game Theory to Analyze Wireless Ad Hoc Networks. IEEE Communications Surveys and Tutorials, 7(4), 2005.
D. Stevens et al. Implementing Security Improvement Options at Los Angeles International Airport. 2006. http://www.rand.org/pubs/documented_briefings/2006/RAND_DB499-1.pdf.
Milind Tambe. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press, 2011.
Matthew E. Taylor, Manish Jain, Yanqin Jin, Milind Tambe, and Makoto Yokoo. When should there be a "me" in "team"? distributed multi-agent optimization under uncertainty. In AAMAS, 2010.
TSA. TSA: Federal Air Marshals. 2008. http://www.tsa.gov/lawenforcement/programs/fams.shtm.
Jason Tsai, Zhengyu Yin, Junyoung Kwak, David Kempe, Christopher Kiekintveld, and Milind Tambe. Urban Security: Game-Theoretic Resource Allocation in Networked Physical Domains. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2010.
LAWA. General Description: Just the Facts. 2007. http://www.lawa.org/lax/justTheFact.cfm.
TSA. Transportation Security Administration - U.S. Department of Homeland Security. 2011a.
TSA. Layers of Security: What We Do. 2011b.
Eric van Damme and Sjaak Hurkens. Games with Imperfectly Observable Commitment. Games and Economic Behavior, 21(1-2):282–308, 1997.
Ondrej Vanek, Michal Jakob, Viliam Lisy, Branislav Bosansky, and Michal Pechoucek. Iterative Game-theoretic Route Selection for Hostile Area Transit and Patrolling. In Tenth International Conference on Autonomous Agents and Multiagent Systems, pages 1273–1274, 2011.
H. von Stackelberg. Marktform und Gleichgewicht. Springer, 1934.
Bernhard von Stengel and Shmuel Zamir. Leadership with Commitment to Mixed Strategies. Technical Report LSE-CDAM-2004-01, CDAM Research Report, 2004.
Alan Washburn and Kevin Wood. Two-person Zero-sum Games for Network Interdiction. Operations Research, 43(2):243–251, 1995.
Lawrence M. Wein. Homeland Security: From Mathematical Models to Policy Implementation. In Operations Research, 2008.
Michael P. Wellman, Daniel M. Reeves, Kevin M. Lochner, Shih-Fen Cheng, and Rahul Suri. Approximate Strategic Reasoning through Hierarchical Reduction of Large Symmetric Games. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2005.
Wiki. Federal Air Marshal Service. 2008. http://en.wikipedia.org/wiki/Federal_Air_Marshal_Service.
Rong Yang, Christopher Kiekintveld, Fernando Ordóñez, Milind Tambe, and Richard John. Improved Computational Models of Human Behavior in Security Games. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Extended Abstract, 2011.
Z. Yin and M. Tambe. A unified method for handling discrete and continuous uncertainty in Bayesian Stackelberg games. In AAMAS, pages 855–862, 2012.
Z. Yin, M. Jain, M. Tambe, and F. Ordóñez. Risk-averse strategies for security games with execution and observational uncertainty. In AAAI, pages 758–763, 2012a.
Z. Yin, A. Jiang, M. Johnson, M. Tambe, C. Kiekintveld, K. Leyton-Brown, T. Sandholm, and J. Sullivan. TRUSTS: Scheduling randomized patrols for fare inspection in transit systems. In IAAI, 2012b.
Zhengyu Yin, Dmytro Korzhyk, Christopher Kiekintveld, Vincent Conitzer, and Milind Tambe. Stackelberg vs. Nash in security games: interchangeability, equivalence, and uniqueness. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2010a.
Zhengyu Yin, Dmytro Korzhyk, Christopher Kiekintveld, Vincent Conitzer, and Milind Tambe. Stackelberg vs. Nash in Security Games: Interchangeability, Equivalence, and Uniqueness. In AAMAS, pages 1139–1146, 2010b.
Zhengyu Yin, Albert Jiang, Matthew Johnson, Milind Tambe, Christopher Kiekintveld, Kevin Leyton-Brown, Tuomas Sandholm, and John Sullivan. Trusts: Scheduling randomized patrols for fare inspection in transit systems. In Conference on Innovative Applications of Artificial Intelligence (IAAI), 2012c.
Jun Zhuang and Vicki Bier. Secrecy and Deception at Equilibrium, with Applications to Anti-Terrorism Resource Allocation. Defence and Peace Economics, 22:43–61, 2011.
Asset Metadata
Creator: Jain, Manish (author)
Core Title: Thwarting adversaries with unpredictability: massive-scale game-theoretic algorithms for real-world security deployments
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Computer Science
Publication Date: 07/28/2013
Defense Date: 07/27/2013
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: artificial intelligence, game theory, large scale optimization, multi-agent systems, OAI-PMH Harvest, Security
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Tambe, Milind (committee chair), Conitzer, Vincent (committee member), Krishnamachari, Bhaskar (committee member), McCubbins, Mathew D. (committee member), Ordóñez, Fernando (committee member)
Creator Email: manish@armorway.com, manishja@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-302749
Unique identifier: UC11293951
Identifier: etd-JainManish-1860.pdf (filename), usctheses-c3-302749 (legacy record id)
Legacy Identifier: etd-JainManish-1860-1.pdf
Dmrecord: 302749
Document Type: Dissertation
Rights: Jain, Manish
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA