Topics in Algorithms for New Classes of Non-cooperative Games

Tianyu Hao

Supervisor: Jong-Shi Pang
Committee Members: Rahul Jain, Jong-Shi Pang, Meisam Razaviyayn, Suvrajeet Sen

Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Daniel J. Epstein Department of Industrial and Systems Engineering
University of Southern California
August 2018

Acknowledgement

This research thesis was conducted in the Daniel J. Epstein Department of Industrial and Systems Engineering at the University of Southern California, under the supervision of Professor Jong-Shi Pang. I would like to express my deepest gratitude to my advisor Dr. Pang for his advice, inspiration, support, and patience throughout my PhD studies. He has set an example of excellence as a researcher, mentor, instructor, and role model. It would have been impossible to finish this thesis without his help. I am very fortunate to have him as my mentor, not only in academics but also in daily life.

I am also happy to give special thanks to all my thesis committee members, Professor Rahul Jain, Professor Meisam Razaviyayn, and Professor Suvrajeet Sen, for all of their guidance and help with my research and dissertation.

I was lucky to have the chance to get to know many wonderful graduate students in the ISE and EE departments; thanks to their kindness and support, I enjoyed my time at USC very much. I am also very grateful to have met my partner and soulmate Yunxiao Deng at USC.

Finally, I would like to thank my parents Mingjie Hao and Meijing Huang, as well as my sister Jiashu Hao, for their love and support.

Contents

1 Introduction
  1.1 Technical Material
2 Improved Convergence of Best-Response Algorithm of Nash Games
  2.1 Best-Response Algorithm
    2.1.1 Assumptions for Convergence in Previous Work
  2.2 Improved Convergence Assumptions
    2.2.1 Assumptions on Function $u_f(x^f, x^{-f})$
    2.2.2 Four-Point Condition
    2.2.3 Four-Point Condition for Differentiable Functions
    2.2.4 Diagonal Dominance
    2.2.5 Proof of Convergence
  2.3 Special Cases
    2.3.1 Bi-Lipschitz Continuous Games
    2.3.2 Poly-Matrix Games
    2.3.3 Non-cooperative Games with Minmax Objectives
    2.3.4 Two-Stage Stochastic Game
3 Difference-Convex Parameterized Value-Function Based Non-cooperative Games
  3.1 Game Formulation
    3.1.1 Finite-Scenario SP Games
    3.1.2 Network Interdiction Games
    3.1.3 Max-Flow Enhancing Game
    3.1.4 Quasi-Nash Equilibria of the Games $\mathcal{G}_{\rm VF}$
  3.2 The Pull-out Games
    3.2.1 The Game $\mathcal{G}^{+}_{\rm VF}$
    3.2.2 The Game $\mathcal{G}^{-}_{\rm VF}$
    3.2.3 Existence of Equilibria
  3.3 Solution by Linear Complementarity Formulations
    3.3.1 The Game $\mathcal{G}^{+,{\rm pull}}_{\rm VF}$
    3.3.2 The Game $\mathcal{G}^{-,{\rm pull}}_{\rm VF}$
    3.3.3 The LCP$^{+,{\rm pull}}_{\rm VF}$
    3.3.4 The LCP$^{-,{\rm pull}}_{\rm VF}$
  3.4 No Restriction on $g_{f,j,\ell}$
  3.5 Conclusion
4 DC Potential Games
  4.1 Game Formulation
  4.2 Potential Games
    4.2.1 Common Pointwise Maximum Function
    4.2.2 Separable Pointwise Maximum Function
    4.2.3 Stochastic Separable Recourse Function
    4.2.4 Stochastic Common Recourse Function
    4.2.5 Stochastic Quadratic Recourse Function
  4.3 Algorithm for Differentiable DC Potential Game
  4.4 Algorithm for Pointwise Maximum DC Potential Game
  4.5 DC Potential Games with Coupling Constraints
5 Augmented Lagrangian Based Algorithm for Monotone Generalized Nash Games
  5.1 Problem Statement
  5.2 Augmented Lagrangian Based Algorithms
  5.3 Convergence of Augmented Lagrangian Based Algorithm
6 Numerical Results
  6.1 Lemke's Method on Network Interdiction Games
    6.1.1 Multiple Solutions with Different Initial Points
    6.1.2 Stochastic Network Interdiction Games
    6.1.3 Another Example
  6.2 Best-Response Algorithm
    6.2.1 Two-Stage Stochastic Game
    6.2.2 Stochastic Min-Cost Enhancement Game
    6.2.3 Nash Cournot Game
References
1 Introduction

Introduced by John Nash [63, 64], modern-day game theory for non-cooperative games has developed into a very fruitful research discipline. Extending a single optimization problem, a non-cooperative game is a multi-agent optimization problem wherein a finite number of selfish players seek to optimize their individual objectives in the face of competition from their rivals and subject to limited resource availabilities. Among the variety of models and solution concepts in game theory, the Nash equilibrium problem (NEP) and its solution concept, the Nash equilibrium (NE), play central roles and have mostly been used to model interactions among individuals competing for higher utilities under scarce resources. A Nash equilibrium [63, 64] is by definition a tuple of strategies, one for each player, such that no player is better off by unilaterally deviating from his/her equilibrium strategy while the rivals keep executing their equilibrium strategies. A detailed survey of current approaches to the study of Nash equilibrium problems can be found in the book chapter [29].

In recent years, there has been growing interest in the use of non-cooperative games to model and solve resource allocation problems in which several agents interact and centralized approaches are not applicable. There are plenty of applications in the areas of communications [74, 85, 95, 94, 87, 114, 4, 5, 17, 59, 99, 89, 90, 75, 91, 92, 93, 97, 88, 81], big data [114, 15, 39, 84, 48], and electricity markets [22, 44, 45, 70, 115, 83, 7].

Since the concept of Nash equilibrium is so important to a non-cooperative game, representing a set of stable strategies for the players, we are interested in algorithms that can compute Nash equilibrium solutions of non-cooperative games. Among such algorithms, the best-response algorithm is the most natural to design and implement. In a game, the best response of a player is defined as his/her optimal strategy conditioned on the strategies of the other players. In a best-response algorithm, players take turns in some order adapting their strategies based on the most recently known strategies of the other players (without considering the effect on future play in the game). The benefit of a best-response algorithm is that it can be implemented in a distributed, decentralized manner. For models where no central controller is present and a centralized coordination scheme is lacking, it becomes essential to distribute the computation among the competing players, and that is where the best-response algorithm works.

Though the best-response algorithm is simple to describe, its theoretical convergence requires certain properties of the players' objective functions. There have been plenty of papers addressing iterative algorithms for player-convex NEPs under different settings and assumptions. Extending the pioneering work of Ortega and Rheinboldt [66] on the convergence of iterative methods for systems of nonlinear equations and the subsequent treatment of the linear complementarity problem [21], an extensive analysis of the convergence of such methods for solving finite-dimensional variational inequalities (VIs) [29] has been carried out in the papers [68, 67].
We note that the latter paper [67], which deals with a partitioned VI defined on a Cartesian product of lower-dimensional sets, yields results that can be directly applied to certain best-response algorithms for computing Nash equilibria. In addition, the monograph [13] provides a comprehensive and rigorous treatment of parallel and distributed numerical methods for optimization problems and VIs. Although these early publications built a good foundation for analyzing the convergence of best-response algorithms, they are not tailored to game problems, which contain special features not generally taken into account in the more general context. As we can see from these references, especially [21], there are in general three approaches to establishing the convergence of iterative algorithms: contraction of the iterates, a potential descent argument, and monotonicity of the iterates. The contraction-based convergence proof of best-response algorithms originated from the analysis of a special resource allocation problem in dynamic spectrum management [59], was then extended to several enhanced versions of this problem [74, 87, 75, 76], and was further generalized in [29]. Among the convergence conditions, a spectral radius condition on the second derivatives of the players' objective functions plays a central role in the contraction-based convergence proof of best-response algorithms. When a potential function exists, the game admits a single optimization formulation, and the paper [33] has analyzed the convergence of Gauss-Seidel-type algorithms in this setting. To date, these two approaches, contraction and potential, provide the state-of-the-art convergence analysis of best-response algorithms for solving non-cooperative games.

The research on best-response algorithms has been further advanced in many recent papers, much of it motivated by specific applications [114, 59, 99, 89, 90, 91, 92, 93, 97, 88], in which different resource allocation problems in the communication area are modeled as non-cooperative games and solved by iterative algorithms. One common feature of all these models is that the best response of each player (i.e., the optimal solution of each player's optimization problem) is unique, which simplifies the application of standard fixed-point arguments to the study of the convergence of best-response algorithms. Our goal, therefore, is to seek weaker convergence conditions for best-response algorithms that apply to more general classes of games. A need arises to derive less restrictive convergence conditions.

The book chapter [71] analyzed the convergence of a unified best-response algorithm for games with non-smooth, non-convex player objective functions, based on a contraction approach and a potential approach. The algorithm employs a family of surrogate objective functions to deal with the non-convexity and non-smoothness of the original objective functions and solves subgames in parallel. Several assumptions were made on the objective function of each player; among them, the objective function of each player is required to consist of two parts, where the first part is $C^2$ and the second part is separable. One contribution of our work on the convergence of the best-response algorithm is that we relax the assumptions on the objective functions from $C^2$ to $C^1$, and from a separable second part to a non-separable one.
Specifically, inspired by the work [71], we extend the assumption on the second part of the objective function from a separable function to a non-separable function satisfying certain conditions, while the other part of the objective function is required to be $C^1$ instead of $C^2$, and we show that under those conditions the sequence generated by the best-response algorithm converges to a Nash equilibrium of the game. We establish a contraction-based convergence proof of the Jacobi-type best-response algorithm. Thus, the best-response algorithm extends to a wider class of non-cooperative games with guaranteed convergence. For example, two-stage stochastic games with regularizations are shown to satisfy convergence conditions under which the sequence generated by the best-response algorithm converges to a Nash equilibrium solution.

Depending on whether each player's feasible set is independent of or dependent on the other players' decision variables, non-cooperative games are classified into standard games and generalized games. In a generalized game, not only the objective function of each player but also his/her feasible set depends on the other players' decision variables; thus, some players' constraints are of the coupling type. Generalized games are of substantial practical relevance, since they are natural models for many real-world settings. For example, a game where players compete for portions of a resource, such as total pollution, can be modeled as a generalized game [52]; there, the agents face a coupling constraint on the total pollution. Another example is a game in power markets where players decide the power to be sold at various nodes of a network while being subject to a common set of transmission constraints [16]. Generally speaking, any setting in which there is competition between players under an overarching system-level requirement binding the choices of all players is, or can be, modeled as a shared-constraint game. See [28] for a detailed survey of generalized games.

As discussed in several papers (e.g., [6, 86] for optimization and [73, 96] for games), one way to extend algorithms for standard games to generalized games (with coupled constraints) is to employ a sequential pricing scheme that converts the problem with coupled constraints into a sequence of problems with private constraints only, resulting in a doubly iterative loop wherein the outer loop updates the prices (i.e., the multipliers of the coupled constraints) and each inner loop updates the players' strategies at the current prices. This pricing scheme is somewhat like the primal-dual algorithm in optimization. However, diagonally dominant Jacobian matrices of the objective functions are required to guarantee the convergence of pricing scheme algorithms.
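To fix ideas, the following is a minimal sketch of such a doubly iterative pricing scheme, under illustrative assumptions of our own: a single shared affine constraint $\sum_f A_f x^f \le b$, a price-ascent step size `tau`, and a hypothetical inner solver `solve_nep` that returns a Nash equilibrium of the standard game played at the current prices. None of these specifics are taken from the cited references.

```python
import numpy as np

def pricing_scheme(solve_nep, A, b, tau=0.1, tol=1e-6, max_outer=500):
    """Doubly iterative pricing scheme (sketch).

    Outer loop: projected ascent on the price (multiplier) `lam` of the
    shared constraint sum_f A[f] @ x[f] <= b.
    Inner loop: `solve_nep(lam)` (hypothetical) returns the strategies
    x = [x^1, ..., x^F] of the standard game in which each player f
    minimizes theta_f(x^f, x^{-f}) + lam @ (A[f] @ x^f).
    """
    lam = np.zeros(b.shape)
    for _ in range(max_outer):
        x = solve_nep(lam)                                 # inner loop
        residual = sum(Af @ xf for Af, xf in zip(A, x)) - b
        lam_new = np.maximum(0.0, lam + tau * residual)    # price update
        if np.linalg.norm(lam_new - lam) <= tol:
            return x, lam_new
        lam = lam_new
    return x, lam
```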
One contribution of our work on generalized games is that we design an augmented version of the pricing scheme algorithms and prove its convergence. Inspired by the augmented Lagrangian method in optimization [14, 78], we propose an augmented Lagrangian based algorithm to compute a generalized Nash equilibrium of generalized games. Different from the pricing scheme of [96], the convergence condition of the augmented Lagrangian based algorithm is monotonicity of the objective functions, instead of diagonally dominant Jacobian matrices [96]. Thus, using augmented Lagrangian based algorithms, we can compute Nash equilibrium solutions of generalized games with monotone objective functions without requiring the Jacobian matrices to be diagonally dominant. This also provides an alternative way of computing generalized Nash equilibrium solutions.

In this thesis, all the games we discuss have convex constraints. Thus, depending on whether the objective function of each player is convex or non-convex in his/her own decision variable, there are two classes of games: convex games and non-convex games. In convex games, a Nash equilibrium exists under mild conditions, and we develop algorithms to compute an NE. In non-convex games, an NE may not exist, so we introduce a relaxed version of the NE, the quasi-Nash equilibrium (QNE). A QNE is a solution of a variational inequality obtained from the first-order optimality conditions of the players' problems, while retaining the convex constraints in the variational inequality problem. A QNE of a game is analogous to a stationary point of an optimization problem and is of interest for non-convex games. However, unlike stationary points in optimization problems, in a game it is usually not easy to show the existence of a QNE.

An important class of non-convex games is the class of DC games. A DC game is a multi-agent extension of a DC program. A general difference-of-convex (DC) program refers to the minimization/maximization of an objective function that is the difference of two convex functions, subject to constraints defined by functions of the same kind. DC programs are very common in communications, signal processing, and networking. For example, the power control problems in cellular systems [79, 3, 109]; MIMO relay optimization [50]; sum-rate maximization, proportional-fairness and max-min optimization of SISO/MISO/MIMO ad-hoc networks [51, 100, 46, 86]; and dynamic spectrum management in DSL systems [111, 107] all belong to the class of DC programs.

There has been much effort in the design of algorithms to compute solutions of DC programs. In particular, Pham Dinh Tao and Le Thi Hoai An [54, 24, 47, 55, 106] have made significant contributions to this important subfield of contemporary optimization and are responsible for much of the development of the theory, algorithms, and applications of DC programming to date. In terms of algorithms, the DCA (difference-of-convex algorithm) has been the principal algorithm for computing a critical point of the problem. The DCA essentially linearizes the non-convex part of the objective function at each iteration. One concern with a critical point is that it is not a strong notion: a critical point of a DC program is not necessarily a directional stationary (d-stationary) point, which is the notion of interest. In the paper [72], a novel algorithm for computing a d-stationary solution of DC programs with convex constraints was developed and its convergence established.
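As a minimal illustration of the DCA's linearization step, consider the toy one-dimensional DC program $\min_x x^4 - 2x^2$, with $g(x) = x^4$ and $h(x) = 2x^2$ both convex (the example is ours; this sketch is not the cited authors' implementation):

```python
import numpy as np

# Toy sketch of the DCA: minimize f(x) = g(x) - h(x) with g, h convex, by
# replacing h with its linearization at the current iterate and solving the
# resulting convex surrogate exactly. Here g(x) = x^4 and h(x) = 2x^2.

def dca(x, steps=50):
    for _ in range(steps):
        s = 4.0 * x   # s = h'(x), the (sub)gradient of h at the iterate
        # Convex subproblem: minimize g(y) - s*y = y^4 - s*y.
        # Its first-order condition 4y^3 = s has the closed-form root below.
        x = np.sign(s) * (abs(s) / 4.0) ** (1.0 / 3.0)
    return x

print(dca(0.5))   # iterates x -> x^(1/3) converge to the critical point x = 1
```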
As an extension of the DC program, a DC game is a game in which the objective function of each player is a DC function of his/her own decision variables, and the constraints of each player are also defined by DC functions. Since we only discuss games with convex constraints in this thesis, the DC games mentioned here are games with DC objective functions and convex constraints. We discuss two special classes of DC games: DC potential games, and DC parameterized value-function based games.

A DC potential game is a game that is both a DC game and a potential game. Potential games, as considered by Monderer and Shapley [62], are games admitting potential functions. A potential function is defined as a function of the strategy profiles such that the difference induced by a single deviation equals the corresponding difference in the deviator's payoff function. The potential function is a useful tool for analyzing equilibrium properties of games, since the incentives of all players are mapped into one function, and Nash equilibria can be found by locating optimal solutions of the potential function. It is known that a global optimal solution of the potential function, viewed as an optimization problem over all players' decision variables jointly, is a Nash equilibrium solution of the potential game. The potential game is an important concept and has been widely applied in communications [43], smart grid [110], etc. By the definition of a potential game, it has good properties that can be used to prove the convergence of algorithms [33]. An algorithm to compute a QNE of non-convex generalized potential games was also developed in [33], but under a strong assumption that requires solving certain non-convex optimization problems.

In DC potential games, the objective function of each player is a difference-of-convex (DC) function and a potential function exists. One contribution of our work on DC potential games is an extension of the DCA to DC potential games, which provably generates subsequences converging to a QNE of the DC potential game. Inspired by the DCA for DC programs [6] and the way convergence is derived for algorithms on potential games [33], we develop a linearized best-response algorithm to calculate a QNE of such games and show the subsequential convergence of this algorithm. This work applies to general network interdiction games with a potential function. It extends the current work on DC programs to DC games with potential functions, and designs an algorithm to compute a QNE of a specific class of non-convex games.

Another special class of DC games is the class of difference-convex parameterized value-function based non-cooperative games. Unifying the (single-)leader multi-follower game introduced by Stackelberg [104] in economics and the bilevel optimization problem in hierarchical modeling, the class of mathematical programs with equilibrium constraints (MPECs) was given a systematic study in [60] and has since been researched extensively. Generalizing an MPEC, an equilibrium program with equilibrium constraints (EPEC) [26, 105] offers a mathematical framework for the study of multi-leader multi-follower games [69] and non-cooperative multi-agent bilevel programs. While both are two-level multi-agent problems, the difference is that the former involves equilibrium problems in the lower level and the latter involves single-agent optimization problems in the lower level. It has been known since the beginning that all these two-stage equilibrium problems are highly challenging to solve computationally, in spite of the initial works and subsequent efforts [20, 56].

Recently, the authors of the paper [101] studied a class of network interdiction games wherein there are multiple interdictors with differing objectives, each of whose goal is to disrupt the intended operations of an adversary who is solving a network flow subproblem. The interdiction is through modifications of certain elements of the network, such as the link capacities.
The cited reference studies the interdiction of a shortest path problem (thus of the minimization kind) and derives a linear complementarity problem (LCP) [21] formulation that is shown to be solvable by the renowned Lemke algorithm. In a subsequent note [102], the interdiction game is extended to both the maximum flow and the minimum cost network flow problems, with limited analysis offered. As we shall see, part of the challenge of the maximum-flow interdiction game is that it is not of the standard convex, differentiable kind, and therefore cannot be treated by known approaches such as those described in the survey [29].

Generalizing the interdiction games described in [101, 102] in several directions, we study a game which can be considered as an EPEC in which each player's optimization problem is a bilevel program. Moreover, each lower-level optimization problem enters the first level only through its optimal objective value. As such, the overall objective function in the upper level is the sum of a first-stage objective plus the value function of a second-stage linear program that is parameterized by the first-stage variables through a (possibly piecewise) linear function that upper bounds the second-stage decision variables. The latter linear program can be dualized, leading to a formulation in which the second-stage constraints are independent of the first-stage variables, which now appear only in the (dualized) second-stage objective function. This dualization of the second-stage problem becomes the basis of a class of single-stage games which we term value-function (VF) based games with DC piecewise parameterization. Such a game is immediately connected to two existing classes of games: the family of games with min-max objectives studied in [32], and the class of two-stage stochastic games with recourse [77]. There are significant differences, however. The most important one is that a VF-based game is not necessarily of the standard convex kind; in fact, the players' resulting optimization problems may turn out to be of the difference-of-convex kind. This is one non-standard feature of this class of two-stage non-cooperative games. An immediate consequence of the non-convexity of the players' combined first-stage and second-stage objective functions is that there is no longer a guarantee that a Nash equilibrium solution (in the well-known sense) of the game exists. As a remedy, we employ the first-order optimality conditions of the players' optimization problems to define the concept of a quasi-Nash equilibrium and study its existence and computation. In turn, this is accomplished by applying the pull-out idea introduced in the reference [32]; this idea offers an effective way to overcome the non-differentiability and non-convexity of the value function and suggests a constructive solution approach to the VF-based game. Furthermore, in the case where the first-stage objective is quadratic and the value function is piecewise linear, a linear complementarity formulation provides a constructive approach for computing a QNE of the game.
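Schematically, the dualization step can be pictured as follows, for a generic second-stage linear program with data $(c, A, b)$ and a first-stage-dependent capacity vector $u(x)$ (this generic form is our illustration; the precise formulations appear in Chapter 3):

$$V(x) \;\triangleq\; \min_{y}\ \big\{\, c^{\top} y \;:\; Ay = b,\ 0 \le y \le u(x) \,\big\} \;=\; \max_{\lambda,\;\mu \ge 0}\ \big\{\, b^{\top}\lambda \;-\; u(x)^{\top}\mu \;:\; A^{\top}\lambda - \mu \le c \,\big\}.$$

In the dual, the feasible set no longer depends on $x$; the first-stage variables enter only through the term $-u(x)^{\top}\mu$, and when $u(\cdot)$ is piecewise linear this parameterization is, in general, only of the difference-of-convex kind in $x$, which is the source of the DC structure described above.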
The structure of this thesis is as follows. In Chapter 2, we discuss the convergence of the best-response algorithm for convex $C^1$ games. In Chapter 3, we focus on non-convex, non-differentiable value-function based non-cooperative games and show that Lemke's method can compute a QNE of such games. In Chapter 4, we introduce a linearized best-response algorithm and show that it generates a sequence converging (subsequentially) to a QNE of non-convex DC potential games. In Chapter 5, an augmented Lagrangian based algorithm is introduced to compute a GNE of monotone generalized games, and the convergence of this algorithm is proved. In Chapter 6, numerical experiments are conducted to test the performance of the best-response algorithm and Lemke's method.

1.1 Technical Material

In a non-cooperative game, there are $F$ selfish players; each player optimizes a rival-dependent objective by choosing feasible strategies satisfying certain constraints. In a Nash equilibrium problem (NEP), each player's feasible set is independent of the rival players' decision variables, while the objective function of each player may depend on both his/her own and the other players' decision variables. Formally speaking, in an $F$-player non-cooperative game, anticipating the rival players' decision variables $x^{-f}$, each player $f$ chooses the value of his/her own decision variable $x^f \in X^f$ by solving the optimization problem

$$\min_{x^f \in X^f} \ \theta_f(x^f, x^{-f}). \tag{1}$$

Here, $x \triangleq (x^f, x^{-f})$ denotes the decision variables of all players, $X^f$ is the feasible set of player $f$, and $X \triangleq \prod_{f=1}^F X^f$ is the feasible set of all players' decision variables. We assume that $X^f$ is a closed convex set for each player $f$.

In a convex game, where the objective function $\theta_f(x^f, x^{-f})$ of each player $f$ is convex with respect to $x^f$ and each player's feasible set $X^f$ is convex and compact, a Nash equilibrium of the game must exist. A formal definition of the Nash equilibrium is as follows: a player profile $x^* = (x^{f,*})_{f=1}^F$ is a Nash equilibrium if, for every player $f = 1, \dots, F$,

$$\theta_f(x^{f,*}, x^{-f,*}) \;\le\; \theta_f(x^f, x^{-f,*}) \qquad \forall\, x^f \in X^f.$$

A Nash equilibrium is a set of strategies, one for each player, such that each player's strategy is optimal for his/her objective function and no player would deviate from this solution.

When $\theta_f(x^f, x^{-f})$ is not convex with respect to $x^f$, a Nash equilibrium may not exist, so we introduce a relaxed equilibrium concept, the quasi-Nash equilibrium. A player profile $x^* = (x^{f,*})_{f=1}^F$ is a quasi-Nash equilibrium if, for every player $f = 1, \dots, F$,

$$\theta_f(\cdot, x^{-f,*})'(x^{f,*};\, x^f - x^{f,*}) \;\ge\; 0 \qquad \forall\, x^f \in X^f,$$

where the objective function $\theta_f(x^f, x^{-f})$ is directionally differentiable with respect to $x^f$. If a player profile $x^*$ is an NE of the game, it must be a QNE of this game; the reverse, however, is not always true. For convex games, quasi-Nash equilibria and Nash equilibria are equivalent.

When the feasible set of each player depends on the other players' decision variables $x^{-f}$, the game is called a generalized Nash equilibrium problem (GNEP). Then both the objective function and the feasible set of each player depend on the other players' decision variables. The GNEP is the problem of finding a vector $x^*$ such that each player's strategy $x^{f,*}$ solves the player's problem (given $x^{-f,*}$):

$$\theta_f(x^{f,*}, x^{-f,*}) \;\le\; \theta_f(x^f, x^{-f,*}) \qquad \forall\, x^f \in X^f(x^{-f,*}).$$

Such a vector $x^*$ is called a generalized Nash equilibrium (GNE) of the GNEP.
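As a small illustration of these definitions (our example, not one of the applications above), consider two players with scalar variables $x^1, x^2 \in [0, 1]$ and convex objectives

$$\theta_1(x^1, x^2) = (x^1)^2 - x^1 x^2, \qquad \theta_2(x^2, x^1) = (x^2)^2 - x^1 x^2.$$

The best responses are $x^1 = x^2/2$ and $x^2 = x^1/2$; the unique fixed point of this pair of maps is $(x^{1,*}, x^{2,*}) = (0, 0)$, which is therefore the unique NE and, the game being convex, also the unique QNE.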
2 Improved Convergence of Best-Response Algorithm of Nash Games

In this chapter, we establish the convergence of the best-response algorithm under non-separable $C^1$ objective functions, extending the results of [71]. The chapter is organized as follows. In Section 2.1, we describe the framework of the best-response algorithm and the convergence conditions established in the previous work [71]. In Section 2.2, we develop the relaxed convergence conditions, including the corresponding conditions for games with differentiable objective functions and with non-differentiable objective functions. In Section 2.3, we discuss some specific classes of games satisfying these conditions, including bi-Lipschitz games, poly-matrix games, minmax games, and two-stage stochastic games.

2.1 Best-Response Algorithm

The best-response algorithm is widely used to compute a solution of a Nash equilibrium problem (NEP); it mimics the natural behavior of the players. At each iteration, every player solves an optimization problem with respect to his/her own decision variable, given the other players' decision variables, and we hope that the sequence generated by this algorithm converges to a Nash equilibrium solution. There are two major types of best-response algorithms: the Jacobi type and the Gauss-Seidel type. In the Jacobi-type best-response algorithm, every player updates his/her own decision variable simultaneously, given the information about the other players from the last iteration; such a Jacobi-type algorithm can be implemented in parallel. The Gauss-Seidel-type best-response algorithm, by contrast, cannot be implemented in parallel: at each iteration, every player updates his/her decision variable sequentially, given the most recently updated information about the other players. In this section, we discuss the Jacobi-type best-response algorithm.

Recall that the objective function of each player $f$ is $\theta_f(x^f, x^{-f})$, where $\theta_f(x^f, x^{-f})$ is convex in $x^f$ for fixed $x^{-f}$. We have the following algorithm to compute an NE solution of such a game.

Algorithm 1: Jacobi-type best-response algorithm
1: Choose any feasible starting point $x_0 = (x^1_0, \dots, x^F_0)$; set $k = 0$.
2: At iteration $k + 1$, solve the optimization problem of each player $f = 1, \dots, F$:
$$\underset{x^f \in X^f}{\text{minimize}}\ \theta_f(x^f, x^{-f}_k),$$
and set $x^f_{k+1} \in \operatorname*{argmin}_{x^f \in X^f} \theta_f(x^f, x^{-f}_k)$.
3: If the stopping criterion $\|x^f_{k+1} - x^f_k\| \le \varepsilon$ for all $f$ is satisfied, stop; otherwise, set $k \leftarrow k + 1$ and go to Step 2.

The above algorithm is parallel: starting from $x_0 = (x^1_0, \dots, x^F_0)$, it solves a convex problem for each player $f$ simultaneously, meaning that all players use the other players' information from the last iteration to update their own decision variables at the same time. This process is executed iteratively.
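For concreteness, the following is a minimal runnable sketch of Algorithm 1 on a toy game of our own choosing, in which player $f$ minimizes $\tfrac12\|x^f\|^2 - (x^f)^{\top}\sum_{f' \ne f} B_{ff'}\, x^{f'}$ over a box, so the exact best response is available in closed form as a projected linear map; the quadratic objectives, box sets, and small coupling matrices are illustrative assumptions, not part of the general setting analyzed below:

```python
import numpy as np

def best_response(f, x, B, lo=-1.0, hi=1.0):
    """Closed-form best response of player f for the toy quadratic objective:
    minimize 0.5*||x^f||^2 - (x^f)^T t over the box, with t the coupling term,
    whose solution is the projection of t onto the box."""
    target = sum(B[f][g] @ x[g] for g in range(len(x)) if g != f)
    return np.clip(target, lo, hi)

def jacobi_best_response(x0, B, tol=1e-8, max_iter=1000):
    x = [xf.copy() for xf in x0]
    for k in range(max_iter):
        # Jacobi update: all players move simultaneously using the last iterate.
        x_new = [best_response(f, x, B) for f in range(len(x))]
        if max(np.linalg.norm(a - b) for a, b in zip(x_new, x)) <= tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter

rng = np.random.default_rng(0)
n, F = 3, 3
# Small couplings: sigma_ff = 1 from the strongly convex quadratic, and the
# cross constants ~ ||B[f][g]|| are small, so the diagonal dominance condition
# developed in Section 2.2 typically holds and the iterates contract.
B = [[0.1 * rng.standard_normal((n, n)) if g != f else None for g in range(F)]
     for f in range(F)]
x0 = [rng.standard_normal(n) for _ in range(F)]
x_star, iters = jacobi_best_response(x0, B)
print("converged in", iters, "iterations")
```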
2.1.1 Assumptions for Convergence in Previous Work

A concern with the Jacobi-type best-response algorithm is that it may fail to converge without additional conditions. In order for the algorithm to converge, we need assumptions under which convergence can be guaranteed. In the paper [71], four assumptions were imposed on the game to guarantee the convergence of the best-response algorithm. These are:

(A.1) The objective function of each player $f$ is $\theta_f(x^f, x^{-f}) \triangleq u_f(x^f, x^{-f}) + v_f(x^f)$, with $v_f(x^f)$ separable and convex on $X^f$, and $u_f(x^f, x^{-f})$ being $C^2$ on $X$.

(A.2) Each function $u_f(\cdot, z^{-f})$ is strongly convex on $X^f$ with a strong convexity modulus $\sigma_{ff} > 0$ independent of $z^{-f} \in X^{-f}$, i.e., $(x^f - y^f)^{\top} \nabla^2_{z^f z^f} u_f(z^f, z^{-f})\, (x^f - y^f) \ge \sigma_{ff}\, \|x^f - y^f\|^2$ for all $z^f, x^f, y^f \in X^f$ and all $z^{-f} \in X^{-f}$.

(A.3) Each function $u_f$ has bounded mixed second partial derivatives on $X$; specifically, $\|\nabla^2_{x^{f'} x^f} u_f(x^f, y^{-f})\|$ is bounded above by $\sigma_{ff'} > 0$ on $X$ for all $x^f \in X^f$, $y^{-f} \in X^{-f}$, and any $f' \ne f$.

(A.4) There exist positive scalars $d_f$ such that $\sigma_{ff}\, d_f > \sum_{f' \ne f} \sigma_{ff'}\, d_{f'}$ holds for all $f$.

In [71, Theorem 4], it is proved that under the assumptions (A.1)-(A.4) the best-response algorithm converges to a Nash equilibrium solution. However, in the first assumption (A.1), the non-smooth function $v_f(x^f)$ is separable, depending only on $x^f$. We wish to extend this result to the more general case where the non-smooth part $v_f(x^f, x^{-f})$ is non-separable, i.e., it depends not only on player $f$'s own decision variable but also on the other players' decision variables. Thus, we establish a theory that extends the work in [71] to games where each player's objective function contains a convex, non-smooth, non-separable part $v_f(x^f, x^{-f})$, convex in $x^f$ for fixed $x^{-f}$. Besides this, we are also concerned with the requirement in (A.1) that $u_f(x^f, x^{-f})$ be a $C^2$ function; we wish to weaken this to $u_f$ being $C^1$. In this chapter, the reason we keep the two parts of the objective function, $u_f$ and $v_f$, separated is that the assumptions on them are different. We will prove that under these relaxed assumptions the best-response algorithm generates a sequence converging to a Nash equilibrium solution.

2.2 Improved Convergence Assumptions

2.2.1 Assumptions on Function $u_f(x^f, x^{-f})$

The first extension of the assumptions (A.1)-(A.4) is to replace the assumption that $u_f$ is $C^2$ by the more general assumption that $u_f$ is $C^1$. Accordingly, the assumptions on $u_f$ in (A.2) and (A.3) are replaced by a strongly monotone partial gradient $\nabla_{x^f} u_f(\cdot, x^{-f})$ with modulus $\sigma_{ff}$, and a Lipschitz continuous partial gradient $\nabla_{x^f} u_f(x^f, \cdot)$ whose Lipschitz constants $\sigma_{ff'}$ are independent of $x^f \in X^f$.

Formally speaking, we assume that for every player $f$ there exists a positive constant $\sigma_{ff}$ such that

$$(x^f - y^f)^{\top} \big( \nabla_{x^f} u_f(x^f, x^{-f}) - \nabla_{x^f} u_f(y^f, x^{-f}) \big) \;\ge\; \sigma_{ff}\, \|x^f - y^f\|^2$$

for all $x^f, y^f \in X^f$ and all $x^{-f} \in X^{-f}$. This is the counterpart of assumption (A.2), which requires $u_f$ to be strongly convex with modulus $\sigma_{ff}$. Also, we assume that the partial gradient $\nabla_{x^f} u_f(x^f, \cdot)$ is Lipschitz continuous in the sense that

$$\big\| \nabla_{x^f} u_f(x^f, x^{-f}) - \nabla_{x^f} u_f(x^f, y^{-f}) \big\| \;\le\; \sum_{f' \ne f} \sigma_{ff'}\, \|x^{f'} - y^{f'}\|$$

for all $x^f \in X^f$ and all $x^{-f}, y^{-f} \in X^{-f}$. This is the counterpart of assumption (A.3), which requires $u_f$ to have bounded second-order mixed partial derivatives.

2.2.2 Four-Point Condition

As mentioned earlier, the function $v_f(x^f)$ is assumed to be separable in (A.1), and we want to relax this to a non-separable function $v_f(x^f, x^{-f})$ satisfying a more general assumption, which we call the "four-point condition" in this section. The four-point condition states that for each player $f$ there exist positive scalars $L_{ff'} > 0$ for $f' \ne f$ such that, for all $x^f, y^f \in X^f$ and all $x^{-f}, y^{-f} \in X^{-f}$,

$$\Big| \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \Big| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|.$$

The four-point condition is the crucial condition used in this section.
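A simple concrete instance (our illustration): a bilinear coupling $v_f(x^f, x^{-f}) = (x^f)^{\top} \sum_{f' \ne f} B_{ff'}\, x^{f'}$ satisfies the condition, since

$$\big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \;=\; (x^f - y^f)^{\top} \sum_{f' \ne f} B_{ff'}\, (x^{f'} - y^{f'}),$$

so by the Cauchy-Schwarz inequality the four-point condition holds with $L_{ff'} = \|B_{ff'}\|$.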
We can see that when $v_f(x^f)$ is a separable function, $v_f(x^f, x^{-f}) = v_f(x^f, y^{-f})$ and $v_f(y^f, x^{-f}) = v_f(y^f, y^{-f})$, so $|(v_f(x^f, x^{-f}) - v_f(x^f, y^{-f})) - (v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}))| = 0$, meaning that the four-point condition always holds for a separable function. We can thus regard the separable function $v_f(x^f)$ as a special case of a function $v_f(x^f, x^{-f})$ satisfying the four-point condition.

Many functions satisfy the four-point condition; among them we distinguish two major classes, differentiable functions and non-differentiable functions. We now look at the corresponding properties of these two classes that ensure the four-point condition is satisfied.

2.2.3 Four-Point Condition for Differentiable Functions

We have relaxed the assumption on $v_f$ from a separable function $v_f(x^f)$ to a non-separable function $v_f(x^f, x^{-f})$ satisfying the four-point condition. A natural question is whether the four-point condition is consistent with stronger assumptions, such as $v_f(x^f, x^{-f})$ being differentiable in $x^f$. To answer this question, we investigate the relation between smoothness and the four-point condition.

Consider the case where each player $f$'s objective function has the form $u_f(x^f, x^{-f}) + v_f(x^f, x^{-f})$, where both $u_f$ and $v_f$ are differentiable. Then we can show that if the partial gradient $\nabla_{x^f} v_f(x^f, \cdot)$ is Lipschitz continuous, the function $v_f$ must satisfy the four-point condition, as described in the following lemma.

Lemma 1. The partial gradient $\nabla_{x^f} v_f(x^f, \cdot)$ is Lipschitz continuous uniformly for all $x^f$ in an open convex set $X^f$, in the sense that

$$\big\| \nabla_{x^f} v_f(x^f, x^{-f}) - \nabla_{x^f} v_f(x^f, y^{-f}) \big\| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|$$

for all $x^{-f}, y^{-f} \in X^{-f}$, if and only if the four-point condition

$$\Big| \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \Big| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|$$

is satisfied.

Proof. We begin by proving necessity. First, construct the univariate function $\varphi(t) \triangleq v_f(y^f + t(x^f - y^f), x^{-f}) - v_f(y^f + t(x^f - y^f), y^{-f})$ for a scalar $t \in [0, 1]$. Substituting $t = 1$ and $t = 0$, we obtain $\varphi(1) = v_f(x^f, x^{-f}) - v_f(x^f, y^{-f})$ and $\varphi(0) = v_f(y^f, x^{-f}) - v_f(y^f, y^{-f})$. By the mean value theorem, there exists a scalar $\bar{t} \in (0, 1)$ such that $\varphi(1) - \varphi(0) = \varphi'(\bar{t})$. Applying the mean value theorem to $\varphi$ and the Cauchy-Schwarz inequality, we derive the following:

$$
\begin{aligned}
& \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \\
&\quad = \varphi(1) - \varphi(0) = \varphi'(\bar{t}) \\
&\quad = \big[ \nabla_{x^f} v_f(y^f + \bar{t}(x^f - y^f), x^{-f}) - \nabla_{x^f} v_f(y^f + \bar{t}(x^f - y^f), y^{-f}) \big]^{\top} (x^f - y^f) \\
&\quad \le \|x^f - y^f\|\, \big\| \nabla_{x^f} v_f(y^f + \bar{t}(x^f - y^f), x^{-f}) - \nabla_{x^f} v_f(y^f + \bar{t}(x^f - y^f), y^{-f}) \big\|.
\end{aligned}
$$

By the lemma's Lipschitz assumption, the latter factor satisfies

$$\|x^f - y^f\|\, \big\| \nabla_{x^f} v_f(y^f + \bar{t}(x^f - y^f), x^{-f}) - \nabla_{x^f} v_f(y^f + \bar{t}(x^f - y^f), y^{-f}) \big\| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^f - y^f\|\, \|x^{f'} - y^{f'}\|.$$

Combining the above two inequalities, we obtain

$$\big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|. \tag{2}$$

This proves one side of the absolute value. The other side follows by switching the roles of $x^f$ and $y^f$, noticing that the left side of (2) changes sign while the right side does not.
Thus we get

$$\big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) - \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|. \tag{3}$$

From the inequalities (2) and (3), we conclude that

$$\Big| \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) - \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) \Big| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|.$$

Next we prove sufficiency. Suppose the four-point condition holds. By the definition of the gradient, for any direction $d$,

$$\nabla_{x^f} v_f(x^f, x^{-f})^{\top} d = \lim_{\tau \downarrow 0} \frac{v_f(x^f + \tau d, x^{-f}) - v_f(x^f, x^{-f})}{\tau}, \qquad \nabla_{x^f} v_f(x^f, y^{-f})^{\top} d = \lim_{\tau \downarrow 0} \frac{v_f(x^f + \tau d, y^{-f}) - v_f(x^f, y^{-f})}{\tau}.$$

Combining these two equations and applying the four-point condition to the pair of points $x^f + \tau d$ and $x^f$, we have

$$\big| \nabla_{x^f} v_f(x^f, x^{-f})^{\top} d - \nabla_{x^f} v_f(x^f, y^{-f})^{\top} d \big| = \lim_{\tau \downarrow 0} \frac{\Big| \big( v_f(x^f + \tau d, x^{-f}) - v_f(x^f, x^{-f}) \big) - \big( v_f(x^f + \tau d, y^{-f}) - v_f(x^f, y^{-f}) \big) \Big|}{\tau} \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|d\|.$$

Since $d$ is arbitrary, we may choose $d$ parallel to $\nabla_{x^f} v_f(x^f, x^{-f}) - \nabla_{x^f} v_f(x^f, y^{-f})$, for which the Cauchy-Schwarz inequality holds with equality:

$$\big| \nabla_{x^f} v_f(x^f, x^{-f})^{\top} d - \nabla_{x^f} v_f(x^f, y^{-f})^{\top} d \big| = \big\| \nabla_{x^f} v_f(x^f, x^{-f}) - \nabla_{x^f} v_f(x^f, y^{-f}) \big\|\, \|d\|,$$

which implies

$$\big\| \nabla_{x^f} v_f(x^f, x^{-f}) - \nabla_{x^f} v_f(x^f, y^{-f}) \big\| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|.$$

This proves sufficiency. $\square$

Thus, we have established the equivalence between Lipschitz continuity of the partial gradient $\nabla_{x^f} v_f(x^f, \cdot)$ and the four-point condition: a function whose partial gradient $\nabla_{x^f} v_f(x^f, \cdot)$ is Lipschitz continuous as stated in Lemma 1 must satisfy the four-point condition.

When the function $v_f(x^f, x^{-f})$ is not necessarily differentiable but only directionally differentiable, there is a corresponding directional-derivative four-point condition, which is a necessary but not sufficient condition for the four-point condition on the function itself.

Lemma 2. If the function $v_f(x^f, x^{-f})$ is directionally differentiable in $x^f$ and the four-point condition

$$\Big| \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \Big| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|$$

holds, then the following directional-derivative four-point condition is satisfied:

$$\big| v_f(\cdot, x^{-f})'(x^f;\, x^f - y^f) - v_f(\cdot, y^{-f})'(x^f;\, x^f - y^f) \big| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|$$

for all $x^f, y^f \in X^f$ and $x^{-f}, y^{-f} \in X^{-f}$.

Proof. By the definition of directional derivatives,

$$
\begin{aligned}
& \big| v_f(\cdot, x^{-f})'(x^f;\, y^f - x^f) - v_f(\cdot, y^{-f})'(x^f;\, y^f - x^f) \big| \\
&\quad = \Big| \lim_{\tau \downarrow 0} \frac{\big( v_f(x^f + \tau(y^f - x^f), x^{-f}) - v_f(x^f, x^{-f}) \big) - \big( v_f(x^f + \tau(y^f - x^f), y^{-f}) - v_f(x^f, y^{-f}) \big)}{\tau} \Big| \\
&\quad \le \sum_{f' \ne f} L_{ff'}\, \|x^f - y^f\|\, \|x^{f'} - y^{f'}\| \qquad \text{by the four-point condition.}
\end{aligned}
$$

Thus the directional-derivative four-point condition is a necessary condition for the four-point condition. $\square$

There are cases where a function $v_f$ satisfies the directional-derivative four-point condition but not the four-point condition itself. As a counterexample, consider two players with one-dimensional variables $x^1$ and $x^2$, and let player 1's function be

$$v_1(x^1, x^2) = \begin{cases} \tfrac{1}{2}\, x^1 x^2, & x^1 \ge 0, \\ x^1 x^2, & x^1 \le 0. \end{cases}$$

Then, for $x^1 \ge 0$ and $y^1 < 0$, the directional derivatives are

$$v_1(\cdot, x^2)'(x^1;\, y^1 - x^1) = (y^1 - x^1)\, x^2 \qquad \text{and} \qquad v_1(\cdot, y^2)'(x^1;\, y^1 - x^1) = (y^1 - x^1)\, y^2.$$

Obviously, the directional-derivative four-point condition is satisfied.
However, the four-point condition is not satisfied, as the following computation for $x^1 \ge 0$ and $y^1 < 0$ shows:

$$\Big| \big( v_1(x^1, x^2) - v_1(x^1, y^2) \big) - \big( v_1(y^1, x^2) - v_1(y^1, y^2) \big) \Big| = \Big| \tfrac{1}{2}\, x^1 x^2 - \tfrac{1}{2}\, x^1 y^2 - \big( y^1 x^2 - y^1 y^2 \big) \Big| = \Big| \big( \tfrac{1}{2}\, x^1 - y^1 \big) \big( x^2 - y^2 \big) \Big|.$$

So the directional-derivative four-point condition is not a sufficient condition for the four-point condition.

2.2.4 Diagonal Dominance

We have discussed the relaxed assumptions on the functions $u_f$ and $v_f$ in the previous sections. Among the assumptions (A.1)-(A.4) from previous work, assumption (A.4) involves only information about the $u_f$ part of the objective function; it must now be extended to involve the four-point condition. Thus, we discuss the corresponding statement connecting the functions $u_f$ and $v_f$ under the relaxed assumptions.

First, we collect the scalars corresponding to the function $u_f$ in two matrices: $\Gamma \triangleq \mathrm{diag}[\sigma_{ff}]_{f=1}^F$ and $\bar{\Gamma}$, whose components are

$$\bar{\Gamma}_{ff'} \triangleq \begin{cases} \sigma_{ff'} & \text{if } f' \ne f, \\ 0 & \text{if } f' = f. \end{cases}$$

Similarly to the notation $\bar{\Gamma}$, we collect the scalars $L_{ff'}$ (of the four-point condition) corresponding to $v_f$ in the matrix $L$, where

$$L_{ff'} \triangleq \begin{cases} L_{ff'} & \text{if } f' \ne f, \\ 0 & \text{if } f' = f. \end{cases}$$

Besides the four-point condition, another key assumption we make is that the matrix $M \triangleq \Gamma - \bar{\Gamma} - L$, which is a Z-matrix of order $F$, is also a P-matrix, i.e., $M$ is a Minkowski matrix. There are many characterizations of a Minkowski matrix; for our purposes, the following Lemma 3 will be used.

Lemma 3.
(a) The matrix $M$ is invertible and has a nonnegative inverse.
(b) The spectral radius of the (nonnegative) matrix $\hat{M} \triangleq \Gamma^{-1} (L + \bar{\Gamma})$ is less than unity.
(c) Statements (a) and (b) are equivalent, and each is equivalent to the existence of positive scalars $d_f$, $f = 1, \dots, F$, such that

$$\sigma_{ff}\, d_f \;>\; \sum_{f' \ne f} L_{ff'}\, d_{f'} + \sum_{f' \ne f} \sigma_{ff'}\, d_{f'} \qquad \forall\, f = 1, \dots, F. \tag{4}$$

The inequality (4) is similar to assumption (A.4) in that it also requires a certain matrix to be a Minkowski matrix. In (A.4), the matrix $\Gamma - \bar{\Gamma}$ is assumed to be a Minkowski matrix, but here we assume that $M \triangleq \Gamma - \bar{\Gamma} - L$ is a Minkowski matrix, with the additional term $L$, i.e., the matrix composed of the four-point-condition constants.

This property of the matrix $M$ is the generalized diagonal dominance property: if the diagonal terms $\sigma_{ff}$ are large and the off-diagonal terms $L_{ff'} + \sigma_{ff'}$ are small, the above inequality holds; in this sense the diagonal elements of the matrix dominate the off-diagonal elements, and we call $M$ a generalized diagonally dominant matrix. Intuitively speaking, generalized diagonal dominance means that each player $f$'s objective function is impacted mostly by his/her own decision variable, or choice of strategies, rather than by the influence of the other players. Thus the assumption that $M$ is a generalized diagonally dominant matrix is reasonable.
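The conditions of Lemma 3 are easy to test numerically. The sketch below, with hypothetical constants $\sigma_{ff}$, $\sigma_{ff'}$, and $L_{ff'}$ of our own choosing, checks both the spectral-radius characterization (b) and the positive-vector characterization (c):

```python
import numpy as np

# Hypothetical constants: Gamma holds the moduli sigma_ff; Gamma_bar and L
# hold the off-diagonal Lipschitz and four-point constants, respectively.
Gamma     = np.diag([4.0, 5.0, 3.0])
Gamma_bar = np.array([[0.0, 0.5, 0.4],
                      [0.3, 0.0, 0.6],
                      [0.2, 0.5, 0.0]])
L         = np.array([[0.0, 0.6, 0.3],
                      [0.4, 0.0, 0.2],
                      [0.5, 0.3, 0.0]])

# Part (b): spectral radius of M_hat = Gamma^{-1} (L + Gamma_bar) below one.
M_hat = np.linalg.inv(Gamma) @ (L + Gamma_bar)
rho = max(abs(np.linalg.eigvals(M_hat)))
print(f"spectral radius = {rho:.3f}  (< 1 => M is Minkowski, contraction holds)")

# Part (c): a positive vector d with M d > 0; if M is Minkowski then
# M^{-1} is nonnegative and d = M^{-1} @ 1 is such a vector.
M = Gamma - Gamma_bar - L
d = np.linalg.solve(M, np.ones(3))
print("d =", d, "all positive:", bool(np.all(d > 0)))
```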
To summarize the assumptions we have made: in the game $\mathcal{G}$, the objective function of each player $f$ is $u_f(x^f, x^{-f}) + v_f(x^f, x^{-f})$, satisfying the following conditions.

(B.1) Both $u_f(\cdot, x^{-f})$ and $v_f(\cdot, x^{-f})$ are convex; $u_f(x^f, x^{-f})$ is $C^1$; and $v_f(x^f, x^{-f})$ satisfies the four-point condition: for each player $f$, there exist positive scalars $L_{ff'} > 0$ for $f' \ne f$ such that for all $x^f, y^f \in X^f$ and $x^{-f}, y^{-f} \in X^{-f}$,

$$\Big| \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \Big| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^{f'} - y^{f'}\|\, \|x^f - y^f\|.$$

(B.2) $\nabla_{x^f} u_f(\cdot, x^{-f})$ is strongly monotone with a modulus $\sigma_{ff} > 0$, uniformly for all $x^{-f} \in X^{-f}$.

(B.3) $\nabla_{x^f} u_f(x^f, \cdot)$ is Lipschitz continuous in the sense that

$$\big\| \nabla_{x^f} u_f(x^f, x^{-f}) - \nabla_{x^f} u_f(x^f, y^{-f}) \big\| \;\le\; \sum_{f' \ne f} \sigma_{ff'}\, \|x^{f'} - y^{f'}\|,$$

uniformly for all $x^f \in X^f$, with Lipschitz constants $\sigma_{ff'} > 0$.

(B.4) There exists a vector $d$ with positive components such that

$$\sigma_{ff}\, d_f \;>\; \sum_{f' \ne f} L_{ff'}\, d_{f'} + \sum_{f' \ne f} \sigma_{ff'}\, d_{f'}.$$

2.2.5 Proof of Convergence

Having made these assumptions on the objective function of each player, we can show the convergence of the Jacobi-type best-response algorithm to a Nash equilibrium of the game $\mathcal{G}$.

Theorem 4. Consider the game $\mathcal{G}$ with feasible set $X = \prod_{f=1}^F X^f$, where $X^f$ is a nonempty closed convex set for each $f$. If the assumptions (B.1)-(B.4) hold for each player $f$, then the sequence of iterates $\{x_k\}$ generated by the Jacobi-type best-response algorithm is well defined and contracts to a limit $x_{\infty}$, which is a Nash equilibrium of the game $\mathcal{G}$.

Proof. We begin the convergence analysis by invoking the variational principle applied to the optimization problem $\min_{x^f \in X^f} u_f(x^f, x^{-f}_k) + v_f(x^f, x^{-f}_k)$, obtaining at iteration $k$, for each player $f$,

$$v_f(x^f, x^{-f}_k) - v_f(x^f_{k+1}, x^{-f}_k) + (x^f - x^f_{k+1})^{\top} \nabla_{x^f} u_f(x^f_{k+1}, x^{-f}_k) \;\ge\; 0 \qquad \forall\, x^f \in X^f.$$

Substituting $x^f = x^f_k$, we deduce

$$v_f(x^f_k, x^{-f}_k) - v_f(x^f_{k+1}, x^{-f}_k) + (x^f_k - x^f_{k+1})^{\top} \nabla_{x^f} u_f(x^f_{k+1}, x^{-f}_k) \;\ge\; 0.$$

Considering two consecutive iterations, we likewise have, from the optimality condition at iteration $k - 1$ evaluated at $x^f = x^f_{k+1}$,

$$v_f(x^f_{k+1}, x^{-f}_{k-1}) - v_f(x^f_k, x^{-f}_{k-1}) + (x^f_{k+1} - x^f_k)^{\top} \nabla_{x^f} u_f(x^f_k, x^{-f}_{k-1}) \;\ge\; 0.$$

Summing the above two inequalities and bounding the $v_f$-differences by the four-point condition, we obtain

$$\sum_{f' \ne f} L_{ff'}\, \|x^f_{k+1} - x^f_k\|\, \|x^{f'}_k - x^{f'}_{k-1}\| \;\ge\; \big( v_f(x^f_k, x^{-f}_k) - v_f(x^f_{k+1}, x^{-f}_k) \big) - \big( v_f(x^f_k, x^{-f}_{k-1}) - v_f(x^f_{k+1}, x^{-f}_{k-1}) \big) \;\ge\; (x^f_k - x^f_{k+1})^{\top} \big( \nabla_{x^f} u_f(x^f_k, x^{-f}_{k-1}) - \nabla_{x^f} u_f(x^f_{k+1}, x^{-f}_k) \big).$$

Inserting the intermediate term $\nabla_{x^f} u_f(x^f_k, x^{-f}_k)$ into the last expression, we get

$$(x^f_k - x^f_{k+1})^{\top} \big( \nabla_{x^f} u_f(x^f_k, x^{-f}_{k-1}) - \nabla_{x^f} u_f(x^f_{k+1}, x^{-f}_k) \big) = (x^f_k - x^f_{k+1})^{\top} \big( \nabla_{x^f} u_f(x^f_k, x^{-f}_k) - \nabla_{x^f} u_f(x^f_{k+1}, x^{-f}_k) \big) + (x^f_k - x^f_{k+1})^{\top} \big( \nabla_{x^f} u_f(x^f_k, x^{-f}_{k-1}) - \nabla_{x^f} u_f(x^f_k, x^{-f}_k) \big). \tag{5}$$

By the strong monotonicity of $\nabla_{x^f} u_f(\cdot, x^{-f}_k)$, stated in assumption (B.2), the first term on the right-hand side of (5) satisfies

$$(x^f_k - x^f_{k+1})^{\top} \big( \nabla_{x^f} u_f(x^f_k, x^{-f}_k) - \nabla_{x^f} u_f(x^f_{k+1}, x^{-f}_k) \big) \;\ge\; \sigma_{ff}\, \|x^f_k - x^f_{k+1}\|^2. \tag{6}$$

By the Lipschitz continuity of $\nabla_{x^f} u_f(x^f, \cdot)$ stated in (B.3) and the Cauchy-Schwarz inequality, the second term on the right-hand side of (5) satisfies

$$(x^f_k - x^f_{k+1})^{\top} \big( \nabla_{x^f} u_f(x^f_k, x^{-f}_{k-1}) - \nabla_{x^f} u_f(x^f_k, x^{-f}_k) \big) \;\ge\; -\sum_{f' \ne f} \sigma_{ff'}\, \|x^f_k - x^f_{k+1}\|\, \|x^{f'}_k - x^{f'}_{k-1}\|. \tag{7}$$

Combining the inequalities (5), (6), and (7) with the four-point bound above and dividing by $\|x^f_k - x^f_{k+1}\|$ (the inequality being trivial when this norm is zero), we obtain

$$\sigma_{ff}\, \|x^f_k - x^f_{k+1}\| \;\le\; \sum_{f' \ne f} \sigma_{ff'}\, \|x^{f'}_k - x^{f'}_{k-1}\| + \sum_{f' \ne f} L_{ff'}\, \|x^{f'}_k - x^{f'}_{k-1}\|.$$

For each iteration $k$ and player $f$, denote $e^f_k \triangleq \|x^f_k - x^f_{k-1}\|$ and $E_k \triangleq (e^f_k)_{f=1}^F$.
Since the above inequality holds for each player $f$, we can express it compactly as the matrix inequality

$$\Gamma\, E_{k+1} \;\le\; (L + \bar{\Gamma})\, E_k.$$

Assumption (B.4) implies that the matrix $M \triangleq \Gamma - \bar{\Gamma} - L$ is a Minkowski matrix; thus, recalling the definition $\hat{M} \triangleq \Gamma^{-1} (L + \bar{\Gamma})$, we derive

$$E_{k+1} \;\le\; \hat{M}\, E_k.$$

Since $\hat{M}$ has spectral radius less than unity, it follows that the sequence $\{x_k\}$ is a contracted sequence and therefore converges to $x_{\infty} \triangleq (x^f_{\infty})_{f=1}^F$, which must necessarily satisfy the following optimality conditions: for all $f$,

$$x^f_{\infty} \in \operatorname*{argmin}_{x^f \in X^f}\ u_f(x^f, x^{-f}_{\infty}) + v_f(x^f, x^{-f}_{\infty}).$$

Consequently, if (B.1)-(B.4) hold, the sequence $\{x_k\}$ converges to a Nash equilibrium of the game $\mathcal{G}$. $\square$

The above theorem shows that for games satisfying the assumptions (B.1)-(B.4), the sequence generated by the best-response algorithm converges to a Nash equilibrium solution. Since we established the corresponding four-point condition for games with differentiable $v_f$ in Lemma 1, we obtain the convergence of the best-response algorithm for games with differentiable $v_f$ as well.

Besides the convergence of the best-response algorithm, we are also interested in the uniqueness of the Nash equilibrium solution. In the following proposition, we show the uniqueness of the Nash equilibrium under the proposed assumptions (B.1)-(B.4).

Proposition 5. Consider the game $\mathcal{G}$ with feasible set $X = \prod_{f=1}^F X^f$, where $X^f$ is a nonempty closed convex set for each $f$. If the assumptions (B.1)-(B.4) hold for each player $f$, then the game has a unique Nash equilibrium solution.

Proof. Suppose there are two different Nash equilibrium solutions, $x_1$ and $x_2$. As in the proof of Theorem 4, invoking the variational principle applied to the optimization problem $\min_{x^f \in X^f} u_f(x^f, x^{-f}) + v_f(x^f, x^{-f})$ at the two Nash equilibria $x_1$ and $x_2$, we obtain, for each player $f$,

$$v_f(x^f_2, x^{-f}_1) - v_f(x^f_1, x^{-f}_1) + (x^f_2 - x^f_1)^{\top} \nabla_{x^f} u_f(x^f_1, x^{-f}_1) \;\ge\; 0$$

and

$$v_f(x^f_1, x^{-f}_2) - v_f(x^f_2, x^{-f}_2) + (x^f_1 - x^f_2)^{\top} \nabla_{x^f} u_f(x^f_2, x^{-f}_2) \;\ge\; 0.$$

Summing the above two inequalities and bounding the $v_f$-differences by the four-point condition, we obtain

$$\sum_{f' \ne f} L_{ff'}\, \|x^f_1 - x^f_2\|\, \|x^{f'}_1 - x^{f'}_2\| \;\ge\; \big( v_f(x^f_2, x^{-f}_1) - v_f(x^f_2, x^{-f}_2) \big) - \big( v_f(x^f_1, x^{-f}_1) - v_f(x^f_1, x^{-f}_2) \big) \;\ge\; (x^f_1 - x^f_2)^{\top} \big( \nabla_{x^f} u_f(x^f_1, x^{-f}_1) - \nabla_{x^f} u_f(x^f_2, x^{-f}_2) \big).$$

Inserting the intermediate term $\nabla_{x^f} u_f(x^f_1, x^{-f}_2)$ into the last expression, we get

$$(x^f_1 - x^f_2)^{\top} \big( \nabla_{x^f} u_f(x^f_1, x^{-f}_1) - \nabla_{x^f} u_f(x^f_2, x^{-f}_2) \big) = (x^f_1 - x^f_2)^{\top} \big( \nabla_{x^f} u_f(x^f_1, x^{-f}_2) - \nabla_{x^f} u_f(x^f_2, x^{-f}_2) \big) + (x^f_1 - x^f_2)^{\top} \big( \nabla_{x^f} u_f(x^f_1, x^{-f}_1) - \nabla_{x^f} u_f(x^f_1, x^{-f}_2) \big). \tag{8}$$

By the strong monotonicity of $\nabla_{x^f} u_f(\cdot, x^{-f}_2)$, stated in assumption (B.2), the first term on the right-hand side of (8) satisfies

$$(x^f_1 - x^f_2)^{\top} \big( \nabla_{x^f} u_f(x^f_1, x^{-f}_2) - \nabla_{x^f} u_f(x^f_2, x^{-f}_2) \big) \;\ge\; \sigma_{ff}\, \|x^f_1 - x^f_2\|^2. \tag{9}$$

By the Lipschitz continuity of $\nabla_{x^f} u_f(x^f, \cdot)$ stated in (B.3) and the Cauchy-Schwarz inequality, the second term on the right-hand side of (8) satisfies

$$(x^f_1 - x^f_2)^{\top} \big( \nabla_{x^f} u_f(x^f_1, x^{-f}_1) - \nabla_{x^f} u_f(x^f_1, x^{-f}_2) \big) \;\ge\; -\sum_{f' \ne f} \sigma_{ff'}\, \|x^f_1 - x^f_2\|\, \|x^{f'}_1 - x^{f'}_2\|. \tag{10}$$

Combining the inequalities (8), (9), and (10) with the four-point bound above, we obtain the inequality

$$\sigma_{ff}\, \|x^f_1 - x^f_2\| \;\le\; \sum_{f' \ne f} \sigma_{ff'}\, \|x^{f'}_1 - x^{f'}_2\| + \sum_{f' \ne f} L_{ff'}\, \|x^{f'}_1 - x^{f'}_2\|.$$
For each player $f$, denote $e^f \triangleq \|x^f_1 - x^f_2\|$ and $E \triangleq (e^f)_{f=1}^F$. Since the above inequality holds for each player $f$, we can express it as the matrix inequality

$$(\Gamma - L - \bar{\Gamma})\, E \;\le\; 0.$$

Assumption (B.4) implies that the matrix $M \triangleq \Gamma - \bar{\Gamma} - L$ is a Minkowski matrix, and for a Minkowski matrix, $M E \le 0$ together with $E \ge 0$ implies $E = 0$. Thus the uniqueness of the Nash equilibrium is proved. $\square$

2.3 Special Cases

Having shown the convergence of the Jacobi-type best-response algorithm to a Nash equilibrium of the game $\mathcal{G}$ under the assumptions (B.1)-(B.4), we are interested in identifying classes of games whose objective functions have the form $u_f(x^f, x^{-f}) + v_f(x^f, x^{-f})$ with $v_f(x^f, x^{-f})$ satisfying the four-point condition. Four special cases are discussed in this section: bi-Lipschitz continuous games, poly-matrix games, min-max games, and stochastic games. Among these types of games, the $v_f$ function of a bi-Lipschitz continuous game is non-differentiable, while the $v_f$ functions of poly-matrix games, min-max games, and stochastic games are differentiable.

2.3.1 Bi-Lipschitz Continuous Games

The bi-Lipschitz continuous game is a generalization of bi-matrix games: the $v_f(x^f, x^{-f})$ function contains a product of an affine function of $x^f$ and a Lipschitz continuous function of $x^{-f}$, instead of a product of two linear functions. The function $v_f$ may be non-differentiable and has the form

$$v_f(x^f, x^{-f}) = (A_f x^f + a_f)^{\top} B_f(x^{-f}) + \varphi_f(x^f),$$

where $B_f(x^{-f})$ is a vector-valued function and $\varphi_f(x^f)$ is a separable function, convex in $x^f$; thus $v_f(\cdot, x^{-f})$ is convex in $x^f$ when $x^{-f}$ is fixed. "Bi-Lipschitz" means that both the function $A_f x^f + a_f$ (being affine) and $B_f(x^{-f})$ are Lipschitz continuous, in the sense that

$$\big\| B_f(x^{-f}) - B_f(y^{-f}) \big\| \;\le\; \sum_{f' \ne f} l_{ff'}\, \|x^{f'} - y^{f'}\| \qquad \forall\, x^{-f}, y^{-f} \in X^{-f};$$

that is, $B_f(x^{-f})$ is Lipschitz continuous in $x^{-f}$ with Lipschitz constants $l_{ff'}$. The matrix $A_f$ is assumed to be bounded in norm: $\|A_f\| \le b_f$. Thus, applying the Cauchy-Schwarz inequality, we can derive the following inequalities:

$$
\begin{aligned}
& \Big| \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \Big| \\
&\quad = \Big| \big( (A_f x^f + a_f)^{\top} B_f(x^{-f}) - (A_f x^f + a_f)^{\top} B_f(y^{-f}) \big) - \big( (A_f y^f + a_f)^{\top} B_f(x^{-f}) - (A_f y^f + a_f)^{\top} B_f(y^{-f}) \big) \Big| \\
&\quad = \Big| \big( A_f (x^f - y^f) \big)^{\top} \big( B_f(x^{-f}) - B_f(y^{-f}) \big) \Big| \\
&\quad \le \big\| A_f (x^f - y^f) \big\|\, \big\| B_f(x^{-f}) - B_f(y^{-f}) \big\| \\
&\quad \le b_f\, \|x^f - y^f\| \sum_{f' \ne f} l_{ff'}\, \|x^{f'} - y^{f'}\| \;=\; \sum_{f' \ne f} L_{ff'}\, \|x^f - y^f\|\, \|x^{f'} - y^{f'}\|,
\end{aligned}
$$

where we denote the constants $L_{ff'} \triangleq l_{ff'}\, b_f$. So the function $v_f(x^f, x^{-f})$ satisfies the four-point condition

$$\Big| \big( v_f(x^f, x^{-f}) - v_f(x^f, y^{-f}) \big) - \big( v_f(y^f, x^{-f}) - v_f(y^f, y^{-f}) \big) \Big| \;\le\; \sum_{f' \ne f} L_{ff'}\, \|x^f - y^f\|\, \|x^{f'} - y^{f'}\|.$$

Similarly to the assumptions (B.1)-(B.4), we can state the assumptions for bi-Lipschitz continuous games:

(B.1.1) $u_f(x^f, x^{-f})$ is convex in $x^f$ for fixed $x^{-f}$, and $v_f(\cdot, x^{-f})$ is convex in $x^f$ for fixed $x^{-f}$. $u_f(x^f, x^{-f})$ is $C^1$, while $B_f(x^{-f})$ is a Lipschitz continuous function, in the sense that

$$\big\| B_f(x^{-f}) - B_f(y^{-f}) \big\| \;\le\; \sum_{f' \ne f} l_{ff'}\, \|x^{f'} - y^{f'}\| \qquad \forall\, x^{-f}, y^{-f} \in X^{-f}.$$

The matrix $A_f$ is bounded in norm: $\|A_f\| \le b_f$.

(B.2.1) $\nabla_{x^f} u_f(\cdot, x^{-f})$ is strongly monotone with a modulus $\sigma_{ff} > 0$, uniformly for all $x^{-f} \in X^{-f}$.

(B.3.1) $\nabla_{x^f} u_f(x^f, \cdot)$ is Lipschitz continuous in the sense that

$$\big\| \nabla_{x^f} u_f(x^f, x^{-f}) - \nabla_{x^f} u_f(x^f, y^{-f}) \big\| \;\le\; \sum_{f' \ne f} \sigma_{ff'}\, \|x^{f'} - y^{f'}\|,$$

uniformly for all $x^f \in X^f$.
2.3 Special Cases

Having shown the convergence of the Jacobi-type best-response algorithm to a Nash equilibrium of the game $\mathcal{G}$ under assumptions (B.1)–(B.4), we now identify classes of games whose objective functions have the form $u_f(x^f,x^{-f}) + v_f(x^f,x^{-f})$ with $v_f$ satisfying the "four-point condition". Four special cases are discussed in this section: bi-Lipschitz continuous games, poly-matrix games, min-max games, and two-stage stochastic games. Among these, the $v_f$ function of a bi-Lipschitz continuous game is non-differentiable, whereas the $v_f$ functions of the poly-matrix, min-max, and stochastic games are differentiable.

2.3.1 Bi-Lipschitz continuous games

The bi-Lipschitz continuous game is a generalization of bimatrix games: the function $v_f(x^f,x^{-f})$ contains a product of an affine function of $x^f$ and a Lipschitz continuous function of $x^{-f}$, instead of a product of two linear functions. The function $v_f$ may be non-differentiable and has the form
$$v_f(x^f,x^{-f}) \;=\; (A^f x^f + a^f)^T B^f(x^{-f}) + \zeta_f(x^f),$$
where $B^f(x^{-f})$ is a vector-valued function and $\zeta_f(x^f)$ is a separable function, convex in $x^f$; thus $v_f(\cdot,x^{-f})$ is convex in $x^f$ for fixed $x^{-f}$. "Bi-Lipschitz" means that both $x^f \mapsto A^f x^f + a^f$ (being affine) and $B^f$ are Lipschitz continuous, the latter in the sense that
$$\|B^f(x^{-f}) - B^f(y^{-f})\| \;\le\; \sum_{f'\neq f} l_{ff'}\,\|x^{f'} - y^{f'}\| \qquad \forall\, x^{-f}, y^{-f} \in X^{-f},$$
for some Lipschitz constants $l_{ff'}$. The matrix $A^f$ is assumed bounded in norm, $\|A^f\| \le b_f$. Applying the Cauchy–Schwarz inequality, we can derive the following inequalities:
$$\begin{aligned}
&\big|\,[v_f(x^f,x^{-f}) - v_f(x^f,y^{-f})] - [v_f(y^f,x^{-f}) - v_f(y^f,y^{-f})]\,\big| \\
&\quad= \big|\,[(A^f x^f + a^f)^T B^f(x^{-f}) - (A^f x^f + a^f)^T B^f(y^{-f})] - [(A^f y^f + a^f)^T B^f(x^{-f}) - (A^f y^f + a^f)^T B^f(y^{-f})]\,\big| \\
&\quad= \big|\,\big(A^f(x^f - y^f)\big)^T \big(B^f(x^{-f}) - B^f(y^{-f})\big)\,\big| \;\le\; \|A^f(x^f - y^f)\|\;\|B^f(x^{-f}) - B^f(y^{-f})\| \\
&\quad\le\; b_f\,\|x^f - y^f\| \sum_{f'\neq f} l_{ff'}\,\|x^{f'} - y^{f'}\| \;=\; \sum_{f'\neq f} L_{ff'}\,\|x^f - y^f\|\,\|x^{f'} - y^{f'}\|,
\end{aligned}$$
where we denote $L_{ff'} \triangleq l_{ff'}\, b_f$. Hence $v_f(x^f,x^{-f})$ satisfies the "four-point condition"
$$\big|\,[v_f(x^f,x^{-f}) - v_f(x^f,y^{-f})] - [v_f(y^f,x^{-f}) - v_f(y^f,y^{-f})]\,\big| \;\le\; \sum_{f'\neq f} L_{ff'}\,\|x^f - y^f\|\,\|x^{f'} - y^{f'}\|.$$
In parallel with assumptions (B.1)–(B.4), we can state the assumptions for bi-Lipschitz continuous games:

(B.1.1) $u_f(x^f,x^{-f})$ is convex in $x^f$ for fixed $x^{-f}$ and is in $C^1$; $B^f(x^{-f})$ is Lipschitz continuous in the sense that $\|B^f(x^{-f}) - B^f(y^{-f})\| \le \sum_{f'\neq f} l_{ff'}\,\|x^{f'} - y^{f'}\|$ for all $x^{-f}, y^{-f} \in X^{-f}$; and the matrix $A^f$ is bounded in norm, $\|A^f\| \le b_f$.

(B.2.1) $\nabla_{x^f} u_f(\cdot,x^{-f})$ is strongly monotone with a modulus $\gamma_{ff} > 0$, uniformly for all $x^{-f}\in X^{-f}$.

(B.3.1) $\nabla_{x^f} u_f(x^f,\cdot)$ is Lipschitz continuous in the sense that
$$\|\nabla_{x^f} u_f(x^f,x^{-f}) - \nabla_{x^f} u_f(x^f,y^{-f})\| \;\le\; \sum_{f'\neq f} \sigma_{ff'}\,\|x^{f'} - y^{f'}\|$$
uniformly for all $x^f \in X^f$.

(B.4.1) There exists a vector $d$ with positive components such that
$$\gamma_{ff}\, d_f \;>\; b_f \sum_{f'\neq f} l_{ff'}\, d_{f'} + \sum_{f'\neq f} \sigma_{ff'}\, d_{f'}.$$

Since the "four-point condition" holds for $v_f(x^f,x^{-f})$ under assumption (B.1.1), the convergence of the best-response algorithm is guaranteed under assumptions (B.1.1)–(B.4.1). We have the following proposition:

Proposition 6. Consider the above game with feasible set $X = \prod_{f=1}^F X^f$, where each $X^f$ is a nonempty closed convex set. If assumptions (B.1.1)–(B.4.1) hold for each player $f$, then the sequence of iterates $\{x^t\}$ generated by the Jacobi-type best-response algorithm is well defined and contracts to a limit $x^\infty$, which is a Nash equilibrium.
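The four-point inequality for a bi-Lipschitz $v_f$ can be sampled numerically. The sketch below uses a hypothetical componentwise-sine rival map $B$ (which is 1-Lipschitz per rival block) and a random $A^f$; it verifies that the observed left-hand side never exceeds the bound $b_f\,\|x^f-y^f\|\sum_{f'} l_{ff'}\|x^{f'}-y^{f'}\|$ with $l_{ff'}=1$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2)); a = rng.normal(size=3)
b_f = np.linalg.norm(A, 2)                 # operator-norm bound on A^f

def B(x1, x2):                             # illustrative rival map, 1-Lipschitz per block
    return np.sin(np.concatenate([x1, x2]))[:3]

def v(xf, x1, x2):                         # v_f(x^f, x^{-f}) = (A x^f + a)^T B(x^{-f})
    return (A @ xf + a) @ B(x1, x2)

worst = 0.0
for _ in range(1000):
    xf, yf = rng.normal(size=2), rng.normal(size=2)
    X1, Y1 = rng.normal(size=2), rng.normal(size=2)
    X2, Y2 = rng.normal(size=2), rng.normal(size=2)
    lhs = abs((v(xf, X1, X2) - v(xf, Y1, Y2)) - (v(yf, X1, X2) - v(yf, Y1, Y2)))
    rhs = b_f * np.linalg.norm(xf - yf) * (np.linalg.norm(X1 - Y1) + np.linalg.norm(X2 - Y2))
    worst = max(worst, lhs / (rhs + 1e-12))
print("max observed LHS/RHS ratio:", worst)   # stays <= 1 up to rounding
```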
2.3.2 Poly-Matrix Games

There are many types of non-cooperative games with objective functions of the form $u_f(x^f,x^{-f}) + v_f(x^f,x^{-f})$ in which both $u_f$ and $v_f$ are convex and differentiable with respect to $x^f$ for fixed $x^{-f}$; we give some examples of such smooth games. A poly-matrix game is a game in which each player's objective contains a product of multiple linear functions [25]. For simplicity of exposition and notation, we discuss 3-player poly-matrix games. In a 3-player poly-matrix game, each player $f\in\{1,2,3\}$ optimizes its own objective by choosing its decision variable $x^f$ while anticipating the rival players' decisions. The feasible set $X^f$ of each player is assumed to be bounded. Each player $f$ solves
$$\min_{x^f\in X^f}\; u_f(x^f,x^{-f}) + v_f(x^f,x^{-f}),$$
where both $u_f$ and $v_f$ are convex and differentiable in $x^f$ for fixed $x^{-f}$, and $v_f$ has the trilinear form
$$v_1(x^1,x^2,x^3) = \sum_{i_1,i_2,i_3} a_{i_1,i_2,i_3}\, x^1_{i_1} x^2_{i_2} x^3_{i_3}, \qquad v_2(x^1,x^2,x^3) = \sum_{i_1,i_2,i_3} b_{i_1,i_2,i_3}\, x^1_{i_1} x^2_{i_2} x^3_{i_3}, \qquad v_3(x^1,x^2,x^3) = \sum_{i_1,i_2,i_3} c_{i_1,i_2,i_3}\, x^1_{i_1} x^2_{i_2} x^3_{i_3}.$$
Here $x^f_{i_f}$ denotes the $i_f$-th component of $x^f$, and the dimensions of the players' decision variables are $n_1$, $n_2$, and $n_3$. Each $v_f$, $f = 1,2,3$, is a poly-matrix function; it is differentiable and, as shown next, the partial gradient $\nabla_{x^f} v_f(x^f,\cdot)$ is Lipschitz continuous. The partial derivatives are
$$\frac{\partial v_1(x^1,x^2,x^3)}{\partial x^1_{i_1}} = \sum_{i_2,i_3} a_{i_1,i_2,i_3}\, x^2_{i_2} x^3_{i_3}, \qquad i_1 = 1,\dots,n_1,$$
and similarly
$$\frac{\partial v_2(x^1,x^2,x^3)}{\partial x^2_{i_2}} = \sum_{i_1,i_3} b_{i_1,i_2,i_3}\, x^1_{i_1} x^3_{i_3}, \quad i_2 = 1,\dots,n_2; \qquad \frac{\partial v_3(x^1,x^2,x^3)}{\partial x^3_{i_3}} = \sum_{i_1,i_2} c_{i_1,i_2,i_3}\, x^1_{i_1} x^2_{i_2}, \quad i_3 = 1,\dots,n_3.$$
Lemma 1 showed that if the partial gradient $\nabla_{x^f} v_f(x^f,\cdot)$ is Lipschitz continuous, then $v_f$ satisfies the "four-point condition". Denote by $A_{i_1}$ the matrix collecting the coefficients $a_{i_1,i_2,i_3}$ over all $i_2,i_3$ with $i_1$ fixed, i.e., $(A_{i_1})_{i_2,i_3} \triangleq a_{i_1,i_2,i_3}$; then $\sum_{i_2,i_3} a_{i_1,i_2,i_3}\, x^2_{i_2} x^3_{i_3} = (x^2)^T A_{i_1}\, x^3$ is a vector-matrix-vector product. By the boundedness of each $X^f$, define $L_{13} \triangleq \max_{x^2\in X^2} \sum_{i_1} \|(x^2)^T A_{i_1}\|$ and $L_{12} \triangleq \max_{y^3\in X^3} \sum_{i_1} \|A_{i_1}\, y^3\|$. We now show that the partial gradient $\nabla_{x^1} v_1(x^1,\cdot)$ is Lipschitz continuous:
$$\begin{aligned}
\|\nabla_{x^1} v_1(x^1,x^2,x^3) - \nabla_{x^1} v_1(x^1,y^2,y^3)\|
&\le \sum_{i_1} \Big|\sum_{i_2,i_3} a_{i_1,i_2,i_3}\big( x^2_{i_2} x^3_{i_3} - y^2_{i_2} y^3_{i_3} \big)\Big| \\
&\le \sum_{i_1} \Big|\sum_{i_2,i_3} a_{i_1,i_2,i_3}\big( x^2_{i_2} x^3_{i_3} - x^2_{i_2} y^3_{i_3} \big)\Big| + \sum_{i_1} \Big|\sum_{i_2,i_3} a_{i_1,i_2,i_3}\big( x^2_{i_2} y^3_{i_3} - y^2_{i_2} y^3_{i_3} \big)\Big| \\
&= \sum_{i_1} \big| (x^2)^T A_{i_1} x^3 - (x^2)^T A_{i_1} y^3 \big| + \sum_{i_1} \big| (x^2)^T A_{i_1} y^3 - (y^2)^T A_{i_1} y^3 \big| \\
&\le L_{13}\, \|x^3 - y^3\| + L_{12}\, \|x^2 - y^2\|.
\end{aligned}$$
Thus, for the poly-matrix function $v_1$, the partial gradient with respect to $x^1$ is Lipschitz continuous, and by Lemma 1 the "four-point condition" holds for $v_1$. Since $v_2$ and $v_3$ have the same structure, the same argument shows that they satisfy the "four-point condition" as well. Applying Lemma 1 and Theorem 4, we obtain:

Proposition 7. Suppose the following conditions are satisfied:
1. The feasible set $X^f$ of each player $f = 1,2,3$ is bounded. Denote $(A_{i_1})_{i_2,i_3} \triangleq a_{i_1,i_2,i_3}$, $(A_{i_2})_{i_1,i_3} \triangleq b_{i_1,i_2,i_3}$, $(A_{i_3})_{i_1,i_2} \triangleq c_{i_1,i_2,i_3}$, and let $L_{13} \triangleq \max_{x^2\in X^2}\sum_{i_1}\|(x^2)^T A_{i_1}\|$, $L_{12} \triangleq \max_{y^3\in X^3}\sum_{i_1}\|A_{i_1} y^3\|$, $L_{21} \triangleq \max_{y^3\in X^3}\sum_{i_2}\|A_{i_2} y^3\|$, $L_{23} \triangleq \max_{x^1\in X^1}\sum_{i_2}\|(x^1)^T A_{i_2}\|$, $L_{31} \triangleq \max_{y^2\in X^2}\sum_{i_3}\|A_{i_3} y^2\|$, $L_{32} \triangleq \max_{x^1\in X^1}\sum_{i_3}\|(x^1)^T A_{i_3}\|$; then $L_{12}$, $L_{13}$, $L_{21}$, $L_{23}$, $L_{31}$, $L_{32}$ are all finite.
2. The function $u_f$ satisfies assumptions (B.1)–(B.4).
Then the Jacobi-type best-response algorithm generates a sequence converging to a Nash equilibrium of the above 3-player poly-matrix game.

Now consider a poly-matrix game with $F$ players, where for each player $f$,
$$v_f(x^1,x^2,\dots,x^F) \;=\; \sum_{j_1,j_2,\dots,j_F} s^f_{j_1,j_2,\dots,j_F}\; x^1_{j_1} x^2_{j_2} \cdots x^F_{j_F},$$
with $x^f_{j_f}$ the $j_f$-th component of $x^f$. The same ideas and methods show that, for $F$-player poly-matrix games, the partial gradient of $v_f$ with respect to $x^f$ is Lipschitz continuous; hence by Lemma 1 the "four-point condition" holds for each $v_f$. Thus, if $u_f$ satisfies assumptions (B.1)–(B.4), the Jacobi-type best-response algorithm computes a Nash equilibrium of the $F$-player poly-matrix game.
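The trilinear structure makes the partial gradients one-line tensor contractions. The sketch below, with an illustrative random coefficient tensor, computes $\nabla_{x^1} v_1$ via `einsum` and checks one component against a finite difference.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, n3 = 4, 3, 5
a = rng.normal(size=(n1, n2, n3))        # coefficient tensor a_{i1,i2,i3} of v_1

def v1(x1, x2, x3):
    return np.einsum('ijk,i,j,k->', a, x1, x2, x3)

def grad_v1(x2, x3):
    # component i1 of the partial gradient equals (x2)^T A_{i1} x3
    return np.einsum('ijk,j,k->i', a, x2, x3)

# sanity check against a central finite difference
x1, x2, x3 = rng.normal(size=n1), rng.normal(size=n2), rng.normal(size=n3)
eps, i = 1e-6, 2
e = np.zeros(n1); e[i] = eps
fd = (v1(x1 + e, x2, x3) - v1(x1 - e, x2, x3)) / (2 * eps)
print(abs(fd - grad_v1(x2, x3)[i]))      # ~1e-9
```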
2.3.3 Non-cooperative games with min-max objectives

As a special case of the differentiable "four-point condition", we discuss min-max games in this section. Consider the non-cooperative game in which each player's objective is $\theta_f(x^f,x^{-f}) \triangleq u_f(x^f,x^{-f}) + v_f(x^f,x^{-f})$, with $v_f$ the pointwise maximum of $h_f(x^f,x^{-f},y^f)$ over $y^f\in Y^f$:
$$v_f(x^f,x^{-f}) \;=\; \max_{y^f\in Y^f}\; h_f(x^f,x^{-f},y^f).$$
We assume $h_f(x^f,x^{-f},\cdot)$ is strongly concave in $y^f$ for fixed $x^f$ and $x^{-f}$. Since Lemma 1 shows that $v_f$ satisfies the "four-point condition" whenever it is differentiable with a Lipschitz continuous partial gradient in $x^f$, we need to check that the pointwise maximum function $v_f$ has these properties. Because $h_f(x^f,x^{-f},\cdot)$ is strongly concave for fixed $x$, its maximizer over $y^f$ is unique, and hence $v_f$ is differentiable. It remains to check the Lipschitz continuity of the partial gradient $\nabla_{x^f} v_f(x^f,x^{-f})$. By Danskin's theorem,
$$\nabla_{x^f} v_f(x^f,x^{-f}) \;=\; \nabla_{x^f} h_f\big(x^f,x^{-f},y^f(x^f,x^{-f})\big),$$
where $y^f(x^f,x^{-f}) \triangleq \operatorname*{argmax}_{y^f\in Y^f} h_f(x^f,x^{-f},y^f)$ denotes the maximizer. Thus, to prove the Lipschitz continuity of $\nabla_{x^f} v_f$, we first establish the Lipschitz continuity of the maximizer $y^f(x^f,\cdot)$ in the following lemma.

Lemma 8. Suppose the following assumptions are satisfied:
(1) The set $Y^f$ is nonempty, closed, and convex.
(2) $h_f(x^f,x^{-f},\cdot)$ is strongly concave with a uniform modulus $\gamma_f$.
(3) $\nabla_{y^f} h_f(x^f,\cdot,y^f)$ is Lipschitz continuous in the sense that
$$\|\nabla_{y^f} h_f(x^f,x^{-f},y^f) - \nabla_{y^f} h_f(x^f,z^{-f},y^f)\| \;\le\; \sum_{f'\neq f} L_2^{ff'}\,\|x^{f'} - z^{f'}\|$$
for all $x^{-f}, z^{-f}\in X^{-f}$, where the Lipschitz constants are independent of $x^f$ and $y^f$.
Then the maximizer $y^f(x^f,\cdot) \triangleq \operatorname{argmax}_{y^f\in Y^f} h_f(x^f,\cdot,y^f)$ is Lipschitz continuous in the sense that
$$\|y^f(x^f,x^{-f}) - y^f(x^f,z^{-f})\| \;\le\; \sum_{f'\neq f} L_3^{ff'}\,\|x^{f'} - z^{f'}\| \quad \forall\, z^{-f}\in X^{-f}, \qquad \text{where } L_3^{ff'} \triangleq \frac{L_2^{ff'}}{\gamma_f}.$$

Proof. Applying the variational inequality to the problem $\max_{y^f\in Y^f} h_f(x^f,x^{-f},y^f)$ at the points $x \triangleq (x^f,x^{-f})$ and $x' \triangleq (x^f,z^{-f})$, we have
$$(y^f - y^f(x))^T\, \nabla_{y^f} h_f(x,y^f(x)) \;\le\; 0 \qquad \forall\, y^f\in Y^f \tag{11}$$
and
$$(y^f - y^f(x'))^T\, \nabla_{y^f} h_f(x',y^f(x')) \;\le\; 0 \qquad \forall\, y^f\in Y^f, \tag{12}$$
where $y^f(x)$ and $y^f(x')$ are the optimal solutions of $\max_{y^f\in Y^f} h_f(x^f,x^{-f},y^f)$ and $\max_{y^f\in Y^f} h_f(x^f,z^{-f},y^f)$, respectively. Substituting $y^f = y^f(x')$ in (11) and $y^f = y^f(x)$ in (12) and summing, we get
$$(y^f(x) - y^f(x'))^T\big( \nabla_{y^f} h_f(x,y^f(x)) - \nabla_{y^f} h_f(x',y^f(x')) \big) \;\ge\; 0.$$
Inserting the intermediate term $\nabla_{y^f} h_f(x,y^f(x'))$, we derive
$$(y^f(x) - y^f(x'))^T\big( \nabla_{y^f} h_f(x,y^f(x)) - \nabla_{y^f} h_f(x,y^f(x')) \big) \;\ge\; (y^f(x) - y^f(x'))^T\big( \nabla_{y^f} h_f(x',y^f(x')) - \nabla_{y^f} h_f(x,y^f(x')) \big). \tag{13}$$
Since $h_f(x^f,x^{-f},\cdot)$ is strongly concave with modulus $\gamma_f$, the left-hand side of (13) satisfies
$$(y^f(x) - y^f(x'))^T\big( \nabla_{y^f} h_f(x,y^f(x)) - \nabla_{y^f} h_f(x,y^f(x')) \big) \;\le\; -\gamma_f\, \|y^f(x) - y^f(x')\|^2.$$
On the other hand, since $\nabla_{y^f} h_f(x^f,\cdot,y^f)$ is Lipschitz continuous, the Cauchy–Schwarz inequality bounds the right-hand side of (13):
$$(y^f(x) - y^f(x'))^T\big( \nabla_{y^f} h_f(x',y^f(x')) - \nabla_{y^f} h_f(x,y^f(x')) \big) \;\ge\; -\sum_{f'\neq f} L_2^{ff'}\,\|x^{f'} - z^{f'}\|\;\|y^f(x) - y^f(x')\|.$$
Combining the three inequalities, we deduce
$$\gamma_f\,\|y^f(x) - y^f(x')\|^2 \;\le\; \|y^f(x) - y^f(x')\| \sum_{f'\neq f} L_2^{ff'}\,\|x^{f'} - z^{f'}\|,$$
which can be rewritten as
$$\|y^f(x^f,x^{-f}) - y^f(x^f,z^{-f})\| \;\le\; \sum_{f'\neq f} \frac{L_2^{ff'}}{\gamma_f}\,\|x^{f'} - z^{f'}\|.$$
Denoting $L_3^{ff'} \triangleq L_2^{ff'}/\gamma_f$ gives the claimed Lipschitz continuity of $y^f(x^f,\cdot)$, which completes the proof. $\Box$

We have thus shown that if $h_f(x^f,x^{-f},y^f)$ is strongly concave in $y^f$ and its partial gradient with respect to $y^f$ is Lipschitz continuous in the rivals' variables, then the maximizer $y^f(x^f,x^{-f})$ of $\max_{y^f} h_f(x^f,x^{-f},y^f)$ is Lipschitz continuous.
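Danskin's gradient formula is easy to exercise numerically. A minimal sketch, assuming the illustrative inner function $h(x,y) = x^T y - \|y\|^2$ (strongly concave in $y$ with modulus 2), for which $y^*(x) = x/2$, $v(x) = \|x\|^2/4$, and hence $\nabla v(x) = y^*(x)$:

```python
import numpy as np
from scipy.optimize import minimize

def y_star(x):
    # inner maximization of h(x, y) = x.y - ||y||^2, solved numerically
    res = minimize(lambda y: -(x @ y - y @ y), np.zeros_like(x))
    return res.x

def grad_v_danskin(x):
    # Danskin: grad v(x) = grad_x h(x, y) at y = y*(x); here grad_x h = y
    return y_star(x)

x = np.array([1.0, -2.0, 0.5])
print(grad_v_danskin(x), x / 2)   # the two agree up to solver tolerance
```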
Using this result, we can show that the Jacobi-type best-response algorithm computes a Nash equilibrium of games with min-max objectives, as stated in the following proposition.

Proposition 9. Suppose the following conditions are satisfied:
1. The set $Y^f$ is nonempty, closed, and convex.
2. There exist Lipschitz constants $L_1^{ff'}$ and $s_f$ such that
$$\|\nabla_{x^f} h_f(x^f,x^{-f},y^f_1) - \nabla_{x^f} h_f(x^f,z^{-f},y^f_2)\| \;\le\; \sum_{f'\neq f} L_1^{ff'}\,\|x^{f'} - z^{f'}\| + s_f\,\|y^f_1 - y^f_2\|$$
for all $x^f\in X^f$, $x^{-f}, z^{-f}\in X^{-f}$, and $y^f_1, y^f_2\in Y^f$.
3. The function $h_f(x^f,x^{-f},\cdot)$ is strongly concave on $Y^f$ with a uniform modulus $\gamma_f$.
4. $\nabla_{y^f} h_f(x^f,\cdot,y^f)$ is Lipschitz continuous in the sense that
$$\|\nabla_{y^f} h_f(x^f,x^{-f},y^f) - \nabla_{y^f} h_f(x^f,z^{-f},y^f)\| \;\le\; \sum_{f'\neq f} L_2^{ff'}\,\|x^{f'} - z^{f'}\|$$
for all $x^{-f}, z^{-f}\in X^{-f}$, where the Lipschitz constants are independent of $x^f$ and $y^f$.
5. The function $u_f$ satisfies the assumptions on $u_f$ in (B.1)–(B.3), and assumption (B.4) holds with $L_{ff'} \triangleq L_1^{ff'} + \dfrac{s_f\, L_2^{ff'}}{\gamma_f}$.
Then the Jacobi-type best-response algorithm generates a sequence converging to a Nash equilibrium of the game $\mathcal{G}$ with $v_f(x^f,x^{-f}) = \max_{y^f\in Y^f} h_f(x^f,x^{-f},y^f)$.

Proof. As noted earlier, since $h_f(x^f,x^{-f},\cdot)$ is strongly concave, Danskin's theorem gives
$$\nabla_{x^f} v_f(x^f,x^{-f}) = \nabla_{x^f} h_f\big(x^f,x^{-f},y^f(x^f,x^{-f})\big) \quad\text{and}\quad \nabla_{x^f} v_f(x^f,z^{-f}) = \nabla_{x^f} h_f\big(x^f,z^{-f},y^f(x^f,z^{-f})\big).$$
By the second assumption, $\nabla_{x^f} h_f(x^f,\cdot,\cdot)$ is Lipschitz continuous, so
$$\|\nabla_{x^f} v_f(x^f,x^{-f}) - \nabla_{x^f} v_f(x^f,z^{-f})\| = \big\|\nabla_{x^f} h_f\big(x^f,x^{-f},y^f(x^f,x^{-f})\big) - \nabla_{x^f} h_f\big(x^f,z^{-f},y^f(x^f,z^{-f})\big)\big\| \;\le\; \sum_{f'\neq f} L_1^{ff'}\,\|x^{f'} - z^{f'}\| + s_f\,\|y^f(x^f,x^{-f}) - y^f(x^f,z^{-f})\|. \tag{16}$$
By the conclusion of Lemma 8,
$$\|y^f(x^f,x^{-f}) - y^f(x^f,z^{-f})\| \;\le\; \sum_{f'\neq f} \frac{L_2^{ff'}}{\gamma_f}\,\|x^{f'} - z^{f'}\|.$$
Plugging this bound into (16) yields
$$\|\nabla_{x^f} v_f(x^f,x^{-f}) - \nabla_{x^f} v_f(x^f,z^{-f})\| \;\le\; \sum_{f'\neq f} \Big( L_1^{ff'} + \frac{s_f\, L_2^{ff'}}{\gamma_f} \Big)\,\|x^{f'} - z^{f'}\| \;=\; \sum_{f'\neq f} L_{ff'}\,\|x^{f'} - z^{f'}\|,$$
where the Lipschitz constants of $v_f$ are $L_{ff'} \triangleq L_1^{ff'} + s_f L_2^{ff'}/\gamma_f$ for $f'\neq f$. By Lemma 1, the "four-point condition" is satisfied by $v_f$, and Theorem 4 then shows that the sequence generated by the Jacobi-type best-response algorithm converges to a Nash equilibrium of the pointwise-maximum (min-max) game. $\Box$

Remark 10. The min-max non-cooperative game can also be formulated as a pull-out game with $2F$ players $x^1,x^2,\dots,x^F,y^1,y^2,\dots,y^F$. Each of the $F$ players $x^1,\dots,x^F$ solves
$$\min_{x^f\in X^f}\; u_f(x^f,x^{-f}) + h_f(x^f,x^{-f},y^f),$$
while each of the remaining $F$ players $y^1,\dots,y^F$ solves
$$\max_{y^f\in Y^f}\; h_f(x^f,x^{-f},y^f).$$
Lemma 1 can be applied to show that this pull-out game satisfies the "four-point condition", so the sequence generated by the Jacobi-type best-response algorithm converges to a Nash equilibrium of the pull-out game. Since a Nash equilibrium of the pull-out game can be shown to be equivalent to a Nash equilibrium of the original min-max game, the sequence generated by the Jacobi-type best-response algorithm converges to a Nash equilibrium of the min-max non-cooperative game.
2.3.4 Two-Stage Stochastic Games

We have discussed non-cooperative games with min-max objectives and shown that the "four-point condition" holds for min-max functions under suitable assumptions. Each player's objective in a two-stage stochastic game is also a min-max function, so we now examine the two-stage stochastic game as a special case of the general min-max non-cooperative game discussed above. In the first stage, the objective of each player $f = 1,\dots,F$ is $\theta_f(x^f,x^{-f}) = u_f(x^f,x^{-f}) + v_f(x^f,x^{-f})$, where $v_f(x^f,x^{-f}) = \sum_j p_j\,\psi_f(x^f,x^{-f};\omega_j)$. Here $p_j$ is the probability of scenario $j$ and $\psi_f(x^f,x^{-f};\omega_j)$ is the second-stage objective under the random outcome $\omega_j$; it is a pointwise minimum function of the form
$$\psi_f(x^f,x^{-f};\omega_j) \;\triangleq\; \min_{z^f}\; \Big[ c^f(\omega_j) + \sum_{f'\neq f} G^{ff'}(\omega_j)\, x^{f'} \Big]^T z^f + \tfrac{1}{2}\,(z^f)^T Q^{ff} z^f \quad\text{subject to}\quad \sum_{f'=1}^F A^{ff'}(\omega_j)\, x^{f'} + D^f z^f \;\ge\; b^f(\omega_j),$$
where $z^f$ is player $f$'s second-stage decision variable, and the matrix $Q^{ff}$ is assumed to be positive definite with smallest eigenvalue $\lambda^f_{\min} > 0$. Introducing the dual variables $y^f \ge 0$ and applying Lagrangian duality, we can reformulate the second-stage problem as a maximization problem:
$$\max_{z^f,\; y^f\ge 0}\; \Big( b^f(\omega_j) - \sum_{f'=1}^F A^{ff'}(\omega_j)\, x^{f'} \Big)^T y^f - \tfrac{1}{2}\,(z^f)^T Q^{ff} z^f \quad\text{subject to}\quad c^f(\omega_j) + \sum_{f'\neq f} G^{ff'}(\omega_j)\, x^{f'} + Q^{ff} z^f - (D^f)^T y^f = 0.$$
Now both $y^f$ and $z^f$ are decision variables of the second-stage dual problem. By the assumption that $Q^{ff}$ is positive definite, we can conclude that the objective function of the dual problem is strongly concave in $(z^f,y^f)$.
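The second-stage problem and its multipliers can be computed with an off-the-shelf convex solver. Below is a minimal cvxpy sketch under the $\ge$-constraint form used above, with entirely illustrative data (the offset in `b` just guarantees feasibility); the multiplier read off the constraint plays the role of $y^f(x^f,x^{-f};\omega_j)$, and the Danskin-type sensitivity $-A^T y$ anticipates equation (17) below.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(2)
nz, m, nx = 3, 4, 2
Q = 2.0 * np.eye(nz)                       # positive definite Q^{ff}
D = rng.normal(size=(m, nz))
A = rng.normal(size=(m, nx))               # own first-stage block A^{ff}(w_j)
c = rng.normal(size=nz)
x = rng.normal(size=nx)                    # first-stage decision, fixed here
b = A @ x + D @ rng.normal(size=nz) - 1.0  # chosen so the problem is feasible

z = cp.Variable(nz)
cons = [A @ x + D @ z >= b]                # second-stage feasibility
prob = cp.Problem(cp.Minimize(c @ z + 0.5 * cp.quad_form(z, Q)), cons)
prob.solve()

y = cons[0].dual_value                     # multiplier y^f(x; w_j) >= 0
print("psi:", prob.value, "grad wrt x:", -A.T @ y)
```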
Since the objective function of the dual problem is strongly concave with respect to $(z^f,y^f)$, it has a unique maximizer. Denoting by $\big(z^f(x^f,x^{-f};\omega_j),\, y^f(x^f,x^{-f};\omega_j)\big)$ the maximizer of the dual problem and applying Danskin's theorem, we conclude that the gradient of the objective function $\psi_f(x^f,x^{-f};\omega_j)$ exists, with partial gradient
$$\nabla_{x^f}\,\psi_f(x^f,x^{-f};\omega_j) \;=\; -\big(A^{ff}\big)^T(\omega_j)\; y^f(x^f,x^{-f};\omega_j). \tag{17}$$
Having proved the Lipschitz continuity of the maximizer $y^f(x^f,\cdot)$ in Lemma 8 under strong concavity, equation (17) yields the Lipschitz continuity of $\nabla_{x^f}\psi_f(x^f,\cdot;\omega_j)$:
$$\begin{aligned}
\|\nabla_{x^f}\psi_f(x^f,x_1^{-f};\omega_j) - \nabla_{x^f}\psi_f(x^f,x_2^{-f};\omega_j)\|
&= \big\|(A^{ff})^T(\omega_j)\,\big( y^f(x^f,x_1^{-f};\omega_j) - y^f(x^f,x_2^{-f};\omega_j) \big)\big\| \\
&\le \|A^{ff}(\omega_j)\|\;\|y^f(x^f,x_1^{-f};\omega_j) - y^f(x^f,x_2^{-f};\omega_j)\| \\
&\le \|A^{ff}(\omega_j)\| \sum_{f'\neq f} t^j_{ff'}\,\|x_1^{f'} - x_2^{f'}\| \;=\; \sum_{f'\neq f} \sigma^j_{ff'}\,\|x_1^{f'} - x_2^{f'}\|
\end{aligned} \tag{18}$$
for all $x^f\in X^f$ and $x_1^{-f}, x_2^{-f}\in X^{-f}$, where $t^j_{ff'} \triangleq \dfrac{\|A^{ff'}(\omega_j)\|}{\lambda^f_{\min}}$ and $\sigma^j_{ff'} \triangleq t^j_{ff'}\,\|A^{ff}(\omega_j)\| = \dfrac{\|A^{ff'}(\omega_j)\|\;\|A^{ff}(\omega_j)\|}{\lambda^f_{\min}}$ for all $f'\neq f$. The first-stage function $v_f(x^f,x^{-f})$ is an expectation of $\psi_f(x^f,x^{-f};\omega_j)$ over the random outcomes $\omega_j$, i.e., $v_f(x^f,x^{-f}) = \sum_j p_j\,\psi_f(x^f,x^{-f};\omega_j)$; hence its partial gradient with respect to $x^f$ is the corresponding expectation of partial gradients:
$$\nabla_{x^f} v_f(x^f,x^{-f}) \;=\; \sum_j p_j\,\nabla_{x^f}\psi_f(x^f,x^{-f};\omega_j) \;=\; -\sum_j p_j\,\big(A^{ff}\big)^T(\omega_j)\, y^f(x^f,x^{-f};\omega_j).$$
Using the result of inequality (18), we derive the following inequalities for $v_f$:
$$\|\nabla_{x^f} v_f(x^f,x_1^{-f}) - \nabla_{x^f} v_f(x^f,x_2^{-f})\| \;\le\; \sum_j p_j \sum_{f'\neq f} \sigma^j_{ff'}\,\|x_1^{f'} - x_2^{f'}\| \;=\; \sum_{f'\neq f} L_{ff'}\,\|x_1^{f'} - x_2^{f'}\|$$
for all $x^f\in X^f$ and $x_1^{-f}, x_2^{-f}\in X^{-f}$, where $L_{ff'} \triangleq \sum_j p_j\,\sigma^j_{ff'}$. We have thus shown that the partial gradient $\nabla_{x^f} v_f(x^f,\cdot)$ is Lipschitz continuous; by Lemma 1, the function $v_f(x^f,x^{-f})$ satisfies the "four-point condition". If assumptions (B.1)–(B.4) hold for the function $u_f(x^f,x^{-f})$, the sequence generated by the Jacobi-type best-response algorithm converges to a Nash equilibrium of the regularized two-stage stochastic game.

3 Difference-Convex Parameterized Value-function Based Non-cooperative Games

Unifying the (single-)leader multi-follower game introduced by Stackelberg [104] in economics and the bilevel optimization problem in hierarchical modeling, the class of mathematical programs with equilibrium constraints (MPECs) was given a systematic study in [60] and has since been researched extensively. Generalizing an MPEC, an equilibrium program with equilibrium constraints (EPEC) [26, 105] offers a mathematical framework for the study of multi-leader multi-follower games [69] and non-cooperative multi-agent bilevel programs. While both are two-level multi-agent problems, the difference is that the former involves equilibrium problems in the lower level while the latter involves single-agent optimization problems in the lower level. It has been known since the beginning that these two-stage equilibrium problems are highly challenging to solve computationally, in spite of the initial works and subsequent efforts [20, 56]. Recently the authors of [101] studied a class of network interdiction games in which multiple interdictors with differing objectives each aim to disrupt the intended operations of an adversary who is solving a network flow subproblem. The interdiction is through modifications of certain elements of the network, such as the link capacities. The cited reference studies the interdiction of a shortest-path problem (thus of the minimization kind) and derives a linear complementarity problem [21] formulation that is shown to be solvable by the renowned Lemke algorithm. In a subsequent note [102], the interdiction game is extended to both the maximum-flow and minimum-cost network flow problems, with limited analysis offered. As we shall see, part of the challenge of the maximum-flow interdiction game is that it is not of the standard convex, differentiable kind, and therefore cannot be treated by known approaches such as those described in the survey [29]. Generalizing the interdiction games described in [101, 102] in several directions, we study a game which can be considered as an EPEC in which each player's optimization problem is a bilevel program. Moreover, each lower-level optimization problem enters the first level only through its optimal objective value.
As such, the overall objective function in the upper level is the sum of a first-stage objective and the value function of a second-stage linear program that is parameterized by the first-stage variables through a (possibly piecewise) linear function that upper-bounds the second-stage decision variables. The latter linear program can be dualized, leading to a formulation in which the second-stage constraints are independent of the first-stage variables, which now appear only in the (dualized) second-stage objective function. This dualization of the second-stage problem becomes the basis of a class of single-stage games which we term value-function (VF) based games with dc piecewise parameterization. Such a game is immediately connected to two existing classes of games: the family of games with min-max objectives studied in [32], and the two-stage stochastic games with recourse of [77]. There are significant differences, however. The most important one is that a VF-based game is not necessarily of the standard convex kind; in fact, the players' resulting optimization problems may turn out to be of the difference-of-convex kind. This is one non-standard feature of this class of two-stage non-cooperative games. An immediate consequence of the non-convexity of the players' combined first-stage and second-stage objective functions is that a Nash equilibrium (in the well-known sense) of the game is no longer guaranteed to exist. As a remedy, we employ the first-order optimality conditions of the players' optimization problems to define the concept of a quasi-Nash equilibrium and study its existence and computation. In turn, this is accomplished by applying the pull-out idea introduced in [32]; this idea offers an effective way to overcome the non-differentiability and non-convexity of the value function and suggests a constructive solution approach to the VF-based game. Furthermore, in the case when the first-stage objective is quadratic and the value function is piecewise linear, a linear complementarity formulation provides a constructive approach for computing a QNE of the game.

The structure of this chapter is as follows. In Section 3.1, we present the formulations of the value-function based games and discuss two examples of such games: a two-stage stochastic game (Subsection 3.1.1) and some network interdiction games (Subsection 3.1.2). We also define the concept of a quasi-Nash equilibrium of these non-convex games in Subsection 3.1.4. In Section 3.2, we introduce the pull-out reformulation of the value-function based games and establish the relationship between their respective solutions. In Section 3.3, we present the LCP formulations of the pull-out games and establish the solvability of these LCPs by Lemke's method [21, Section 4.4] under appropriate assumptions. In Section 3.4, we remove some restrictions on the parameters in the previously introduced assumptions and describe an iterative bounding procedure to broaden the applicability of Lemke's method. In the last Section 3.5, we summarize the contributions of this paper and draw some conclusions about our study.

3.1 Game Formulation

There are two versions of the non-cooperative value-function based game to be studied in this paper, depending on whether it is of the interdiction or enhancement type; their basic setting is as follows.
The game consists of $F$ selfish players, each having an objective $\theta_f(x^f,x^{-f})$ (to be minimized) that is a function jointly of his/her own decision variable $x^f$ and the rivals' decisions $x^{-f} \triangleq (x^{f'})_{f'\neq f}$. Let $x \triangleq (x^f)_{f=1}^F$ be the collective strategy profile of all players. Each player's variable $x^f$ is constrained by a private closed and convex strategy set $X^f \subseteq \mathbb{R}^{n_f}$ for some positive integer $n_f$. The player's overall objective $\theta_f(x)$ is the sum or difference of a first-level objective $\varphi_f(x)$ and a second-level value function $\psi_f(x)$ of a linear program parameterized by $x$ in the objective function:
$$\psi_f(x) \;\triangleq\; \underset{\lambda^f\in\Lambda^f}{\text{maximum}}\;\; \big(U^f(x)\big)^T \lambda^f \;=\; \sum_{j=1}^{m_f} U^f_j(x)\,\lambda^f_j, \tag{19}$$
where $\Lambda^f \triangleq \big\{\lambda^f\in\mathbb{R}^{m_f}_+ \mid G^f\lambda^f \le e^f\big\} \neq \emptyset$ is a (private) polyhedron defined by the matrix $G^f\in\mathbb{R}^{\ell_f\times m_f}$ and vector $e^f\in\mathbb{R}^{\ell_f}$. The most important feature of the parameterization in the value function $\psi_f(x)$ is the difference-convexity of each function $U^f_j(x)$; i.e., $U^f_j(x) = u^{f,+}_j(x) - u^{f,-}_j(x)$ for some player-dependent convex functions $u^{f,\pm}_j(x)$. Subsequently, we show in Proposition 11 that with this dc property of each $U^f_j(x)$, the value function $\psi_f(x)$ is dc, thus non-convex, on $X \triangleq \prod_{f=1}^F X^f$, provided that $\psi_f(x)$ is finite for all $x\in X$. Let $n \triangleq \sum_{f=1}^F n_f$ be the dimension of the players' strategy profile $x\in X$. The combined single-level optimization problem of player $f$ is given by: anticipating $x^{-f}\in X^{-f} \triangleq \prod_{f'\neq f} X^{f'}$,
$$\underset{x^f\in X^f}{\text{minimize}}\;\; \theta_f(x) \;\triangleq\; \varphi_f(x) \pm \psi_f(x), \qquad X^f \triangleq \big\{ x^f\in\mathbb{R}^{n_f}_+ \mid A^f x^f \le b^f \big\}, \tag{20}$$
where $A^f\in\mathbb{R}^{k_f\times n_f}$ and $b^f\in\mathbb{R}^{k_f}$. The $\pm$ sign allows us to treat the two cases of a maximization or a minimization second-level problem. Problem (20) is a bilevel program in the pair of variables $(x^f,\lambda^f)$:
$$\underset{x^f,\,\lambda^f}{\text{minimize}}\;\; \varphi_f(x) \pm \big(U^f(x)\big)^T\lambda^f \quad\text{subject to}\quad x^f\in X^f \;\text{ and }\; \lambda^f \in \operatorname*{argmax}_{\widehat{\lambda}^f\in\Lambda^f}\; \big(U^f(x)\big)^T \widehat{\lambda}^f.$$
Nevertheless, unlike a general bilevel program, the lower-level variable $\lambda^f$ enters the first-level optimization only through the optimal objective value of the lower-level problem. Collecting the $F$ optimization problems (20), we obtain the $F$-player value-function based games, which we denote $\mathcal{G}^{\pm}_{\rm VF}$ and formally state as
$$\mathcal{G}^{+}_{\rm VF} \triangleq \left\{ \begin{array}{l} \text{parameterized by } x^{-f}\in X^{-f}, \\[2pt] \underset{x^f\in X^f}{\text{minimize}}\;\big[\varphi_f(x) + \psi_f(x)\big] \end{array} \right\}_{f=1}^F \qquad\text{and}\qquad \mathcal{G}^{-}_{\rm VF} \triangleq \left\{ \begin{array}{l} \text{parameterized by } x^{-f}\in X^{-f}, \\[2pt] \underset{x^f\in X^f}{\text{minimize}}\;\big[\varphi_f(x) - \psi_f(x)\big] \end{array} \right\}_{f=1}^F,$$
where each $\psi_f(x)$ is given by (19). Employing the implicitly defined value function $\psi_f(x)$, which hides the second-level variables $\lambda^f$, each game $\mathcal{G}^{\pm}_{\rm VF}$ is a special EPEC amenable to treatment as a one-level game. Up to here we have not formally defined a solution concept for this game, as each player's optimization problem (20) may be non-convex; until we do, we speak of the games $\mathcal{G}^{\pm}_{\rm VF}$ to mean the collection of these optimization problems for all $f$ without referring to their solutions. From the perspective of a two-level decision-making problem, a minus second-level value function, i.e., $-\psi_f(x)$ in (20), provides a model of goal consistency in both levels, enhancing the activities in the second level via the aid of the first-level decision $x$; a plus second-level value function is applicable to a context of interdiction, where the first-level players work to oppose the objective of a second-level economic agent (perhaps an adversary). Examples of each case will be illustrated below.
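Evaluating $\psi_f(x)$ in (19) amounts to solving one LP per player. A minimal sketch with hypothetical data (a single simplex-like constraint $G\lambda \le e$, and a toy dc coefficient map $U$ made of max-affine pieces, as formalized in (32) below):

```python
import numpy as np
from scipy.optimize import linprog

G = np.array([[1.0, 1.0, 1.0]])            # one coupling row: sum(lam) <= 2
e = np.array([2.0])

def U(x):
    # dc coefficients U_j(x) = u_j^+(x) - u_j^-(x), illustrative choices only
    uplus  = np.array([max(x[0], 0.5 * x[0] + 1.0), 1.0 + x[1], abs(x[1])])
    uminus = np.array([np.max(x), 0.0, 0.0])
    return uplus - uminus

def psi(x):
    # psi(x) = max { U(x)^T lam : lam >= 0, G lam <= e }; linprog minimizes
    res = linprog(-U(x), A_ub=G, b_ub=e, bounds=[(0, None)] * 3)
    assert res.success
    return -res.fun

print(psi(np.array([0.3, -1.2])))
```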
3.1.1 Finite-Scenario SP Games

As a first source problem for the VF games, we mention the case where the value function $\psi_f(x)$ is a discretized expected-value recourse function, the building block of standard two-stage stochastic programming (SP) with finitely many scenarios. For each player $f$, such a recourse function is given by $\psi^{f,\min/\max}_{\rm SP}(x) \triangleq \mathbb{E}_\omega\big[\psi^{f,\min/\max}_{\rm SP}(x,\omega)\big]$, where $\omega$ is a random variable assumed to be discretely distributed with values $\{\omega_s\}_{s=1}^S$ and associated probabilities $\{p_s\}_{s=1}^S$ for some integer $S>0$, $\mathbb{E}$ is the expectation operator over these discrete scenarios, and for each pair $(x,\omega_s)$,
$$\psi^{f,\min}_{\rm SP}(x,\omega_s) \;\triangleq\; \underset{y^f\ge 0}{\text{minimum}}\;\; b^f(\omega_s)^T y^f \;\;\text{subject to}\;\; C^f(\omega_s)\,x + D^f y^f \;\ge\; \xi^f(\omega_s),$$
and
$$\psi^{f,\max}_{\rm SP}(x,\omega_s) \;\triangleq\; \underset{y^f\ge 0}{\text{maximum}}\;\; b^f(\omega_s)^T y^f \;\;\text{subject to}\;\; C^f(\omega_s)\,x + D^f y^f \;\ge\; \xi^f(\omega_s), \tag{21}$$
where $C^f(\omega_s)$, $b^f(\omega_s)$, and $\xi^f(\omega_s)$ are given scenario-dependent matrices and vectors of appropriate orders and $D^f$ is a constant matrix. We refer the reader to the recent article [77] for a comprehensive study of two-stage non-cooperative games under uncertainty in which the randomness is not required to be discrete but the value function $\psi_f$ is that of a minimization problem. In the case of continuously distributed randomness, the discretized model discussed here is applicable to a sampling approach for solving the resulting game, wherein finite samples are drawn to approximate the continuous distribution.

Consider first a minimization recourse function, the basic framework of standard two-stage SP. As mentioned above, this models a situation in which the first-stage players, who are the primary (or perhaps the only) decision makers of the game, minimize a combined first-stage and second-stage objective to enhance their individual goals in the two stages: a deterministic decision is made in the first stage and is supplemented in the second stage once the uncertainty is realized. We have
$$\begin{aligned}
\psi^{f,\min}_{\rm SP}(x) &= \sum_{s=1}^S p_s\,\psi^{f,\min}_{\rm SP}(x,\omega_s) \\
&= \underset{y^{f,s}\ge 0,\; s=1,\dots,S}{\text{minimum}}\;\; \sum_{s=1}^S p_s\, b^f(\omega_s)^T y^{f,s} \;\;\text{subject to}\;\; C^f(\omega_s)\,x + D^f y^{f,s} \ge \xi^f(\omega_s),\; s=1,\dots,S \\
&= \underset{\lambda^{f,s}\ge 0,\; s=1,\dots,S}{\text{maximum}}\;\; \sum_{s=1}^S \big[\,\xi^f(\omega_s) - C^f(\omega_s)\,x\,\big]^T \lambda^{f,s} \;\;\text{subject to}\;\; (D^f)^T\lambda^{f,s} \le p_s\, b^f(\omega_s),\; s=1,\dots,S,
\end{aligned}$$
which is a convex function of $x$. When this recourse function is embedded in a value-function game, the resulting game is a special case of $\mathcal{G}^{+}_{\rm VF}$ and provides an instance where the parameterized coefficient function $U^f(x)$ in (19) is affine. In this case, each player's combined objective $\varphi_f(x^f,x^{-f}) + \psi^{f,\min}_{\rm SP}(x)$ is convex in $x^f$ provided that the first-stage objective $\varphi_f(x^f,x^{-f})$ is convex in $x^f$ for fixed $x^{-f}$. This convexity fails, however, in the non-standard case of a maximization recourse function, which, as mentioned above, can be interpreted as an interdiction game in which the first-level players aim to disrupt the operations of an adversary under uncertainty realized subsequently. We have
$$\psi^{f,\max}_{\rm SP}(x) \;=\; \sum_{s=1}^S p_s\,\psi^{f,\max}_{\rm SP}(x,\omega_s) \;=\; -\,\widehat{\psi}^{f,\max}_{\rm SP}(x),$$
where
$$\widehat{\psi}^{f,\max}_{\rm SP}(x) \;\triangleq\; \underset{\lambda^{f,s}\ge 0,\; s=1,\dots,S}{\text{maximum}}\;\; \sum_{s=1}^S \big[\,\xi^f(\omega_s) - C^f(\omega_s)\,x\,\big]^T \lambda^{f,s} \;\;\text{subject to}\;\; -(D^f)^T\lambda^{f,s} \ge p_s\, b^f(\omega_s),\; s=1,\dots,S,$$
is reformulated to conform to the maximizing value function (19). In this case a player's combined objective, such as $\varphi_f(x^f,x^{-f}) + \psi^{f,\max}_{\rm SP}(x) = \varphi_f(x^f,x^{-f}) - \widehat{\psi}^{f,\max}_{\rm SP}(x)$, is no longer convex in $x^f$ even if the first-stage objective $\varphi_f(x^f,x^{-f})$ is convex in $x^f$ for fixed $x^{-f}$; the resulting game is of the kind $\mathcal{G}^{-}_{\rm VF}$.
3.1.2 Network Interdiction Games

Network interdiction games can be used to formulate real-world problems in which one or more agents and interdictors operate on a network. Before introducing the mathematical formulations, we discuss the relevance of these models to smuggling interdiction, where a malicious agent attempts to transport illegal materials, such as drugs, from some sources to some destinations in a network. The agent aims to maximize the amount of flow from the sources to the destinations or to minimize the transportation costs. If the network covers a geographical area spanning several nations or regions, the interdiction resources may be spread across different jurisdictions. The overall objective of these interdictors is to counter the agent's objective. Although the interdictors have a common agent to interdict, it is appropriate to formulate this problem as a non-cooperative game because the interdiction resources are spread across multiple interdictors and collaboration between different interdictors may be difficult, perhaps owing to their geographical separation. Such resources may include monitoring mechanisms such as patrols or remote sensing equipment. As mentioned in [103], applications of network interdiction games are also found in infectious disease control and air-strike targeting. We consider two network interdiction games described in [103]: (a) a capacitated minimum-cost network flow problem; and (b) a maximal flow problem with arc capacities. Both problems are of the kind $\mathcal{G}^{-}_{\rm VF}$. In both, there is an underlying network with node set $\mathcal{N}$ and arc set $\mathcal{A}$. For each node $i\in\mathcal{N}$, let $\mathcal{A}_{{\rm out},i}$ and $\mathcal{A}_{{\rm in},i}$ be, respectively, the sets of arcs in $\mathcal{A}$ with $i$ as start and end node. We assume there are $F$ interdictors whose goal is to inflict the most adverse effect on the network agent's objective by changing the link capacities. Thus, for problem (a) each interdictor's objective is to maximize the minimum cost, while for problem (b) each interdictor's objective is to minimize the maximum flow. It is assumed that each interdictor $f\in\{1,\dots,F\}$ has a set of arcs, denoted $\mathcal{A}^f$, whose capacities (s)he can affect. Our formulations below differ slightly from the ones in the cited reference. It is worth emphasizing that the games considered below have multiple interdictors and one single adversary, whom we call the network agent. In the language of leader-follower games, the interdictors are the leaders and the agent is a (single) follower whose decision variables are not the primary variables of the resulting one-level games $\mathcal{G}^{-}_{\rm VF}$; the network interdiction games considered here are thus multi-leader single-follower games. Common to all the interdictors, the network agent's minimum-cost problem has the following formulation. Given net supplies $d_i$ at the nodes $i\in\mathcal{N}$ (a positive $d_i$ denotes supply, a negative $d_i$ denotes demand, and $d_i = 0$ means transshipment), and anticipating the interdiction $x$, the agent determines an action $y$ by solving:
The sum-form interdiction models a situation where the interdiction occurs all at once and thus each arc capacity is reduced by the sum of the interdiction amounts. The max-form interdiction means the agent is concerned only with the largest interdiction on each arc. This may happen in the situation where interdictions occur at dierent epochs, and only the most interdiction matters. The agent may feel that as long as (s)he can deal with the largest of the interdictions, the lesser interdictions are not essential. In both cases, it is the network agent who is taking into account the eect of the interdiction; the interdictors do not take into account how their respective interdiction aects the agent's strategy. Interdictors make their decisions based on the pairs (' f ;X f ) and their information on the agent's optimal objective values f (x) as a result of the interdiction tuple x. Anticipating the other interdictors' strategies x f , interdictor f's optimization problem is: minimize x f 2X f X a2A f c f a (x f a ) f cost min (x); (24) wherex f , x f a a2A f is interdictorf's strategy constrained by the setX f containing for instance budget or technological restrictions and f > 0 is this interdictor's weight between his (her) cost of interdiction (the rst summand) and the negative of the network agent's shipment cost; thus this interdictor aims 37 to maximize the latter shipment cost of the network agent weighed against his/her cost of interdiction. Notice that the agent's decision variable y enters into the problem (24) only through the value function cost min (x). It is worth repeating that the resulting game is one of multiple interdictors against a common network agent, rendering this a Nash problem, albeit with a (nonconvex, non-dierentiable) objective function in (24). This is dierent from some common network interdiction problems that have one single interdictor and multiple adversaries, rendering the problem either a bilevel program if the adversaries have a central objective or a MPEC if the adversaries are non-cooperative. Introducing dual variables i for i2N and a for a2A, and using linear programming duality, we can write the minimum cost (22) as a maximization problem: cost min (x), maximum i ;a X i2N i i X a2A u a (x) a subject to i j a c a ; 8a2A with start node i and end node j and a 0; 8a2A: (25) Similar to the ow variables y a , the dual variables a and i belong to the network agent and are not the primary variables of the game. Provided that the net supplies sum to zero, i.e., X i2N i = 0, the ow conservation equations in (22) can be equivalently written as inequalities. In this equivalent reformulation, the corresponding dual variables i are constrained to be nonnegative. This sign restriction facilitates the application of the iterative bounding procedure described in Section 3.4 for solving the game. With each u a (x) given by (23), the above value function cost min (x) is neither convex or concave. Similar to the recourse functions SP min= max (x), the feasible region of cost min (x) is a constant polyhedron that is clearly unbounded in both the nodal variables i and link variables a . In contrast, the max- ow interdiction game can be formulated as follows. There is a set of origin- destination (OD) pairsWNN . Each interdictor f has a subset, denotedW f , of such OD-pairs that (s)he wishes to minimize associated maximal ows between them. 
In contrast, the max-flow interdiction game can be formulated as follows. There is a set of origin-destination (OD) pairs $\mathcal{W}\subseteq\mathcal{N}\times\mathcal{N}$. Each interdictor $f$ has a subset $\mathcal{W}^f$ of OD pairs whose associated maximal flows (s)he wishes to minimize. For each OD pair $w\in\mathcal{W}^f$ joining source $s_w$ to destination $t_w$, the maximum flow problem is
$$\psi^{w,\max}_{\rm flow}(x) \;\triangleq\; \underset{y,\,\zeta}{\text{maximum}}\;\; \zeta \quad\text{subject to}\quad \sum_{a\in\mathcal{A}_{{\rm out},i}} y_a - \sum_{a\in\mathcal{A}_{{\rm in},i}} y_a = \begin{cases} 0 & \text{if } i\in\mathcal{N}\setminus\{s_w,t_w\} \\ \zeta & \text{if } i = s_w \\ -\zeta & \text{if } i = t_w \end{cases} \qquad\text{and}\qquad 0 \le y_a \le u_a(x)\;\;\forall\, a\in\mathcal{A}, \tag{26}$$
where $u_a(x)$ is given similarly to (23), i.e., it is one of two kinds:
$$u^{\rm sum}_a(x) = \max\Big( 0,\; u^0_a - \sum_{f:\,a\in\mathcal{A}^f} x^f_a \Big) \qquad\text{or}\qquad u^{\max}_a(x) = \max\Big( 0,\; u^0_a - \max_{f:\,a\in\mathcal{A}^f} x^f_a \Big), \tag{27}$$
with $u^0_a > 0$ the capacity of link $a$ without interdiction and $x^f_a$ the amount of interdiction $f$ applies to arc $a$'s capacity. Interdictor $f$'s min-sum optimization problem is then
$$\underset{x^f\in X^f}{\text{minimize}}\;\; \sum_{a\in\mathcal{A}^f} c^f_a(x^f_a) \;+\; \tau_f \sum_{w\in\mathcal{W}^f} \psi^{w,\max}_{\rm flow}(x). \tag{28}$$
To write $\psi^{w,\max}_{\rm flow}(x)$ in the form of (19), we use linear programming duality to deduce
$$\psi^{w,\max}_{\rm flow}(x) \;=\; \underset{\pi_i,\,\mu_a}{\text{minimum}}\;\; \sum_{a\in\mathcal{A}} u_a(x)\,\mu_a \quad\text{subject to}\quad \pi_i - \pi_j + \mu_a \ge 0\;\;\forall\, a\in\mathcal{A} \text{ with start node } i \text{ and end node } j, \quad \pi_{t_w} - \pi_{s_w} = 1, \quad \mu_a \ge 0\;\;\forall\, a\in\mathcal{A} \;=\; -\,\widehat{\psi}^{w,\max}_{\rm flow}(x),$$
where
$$\widehat{\psi}^{w,\max}_{\rm flow}(x) \;\triangleq\; \underset{\pi_i,\,\mu_a}{\text{maximum}}\;\; -\sum_{a\in\mathcal{A}} u_a(x)\,\mu_a \quad\text{subject to}\quad \pi_i - \pi_j + \mu_a \ge 0\;\;\forall\, a\in\mathcal{A} \text{ with start node } i \text{ and end node } j, \quad \pi_{t_w} - \pi_{s_w} = 1, \quad \mu_a \ge 0\;\;\forall\, a\in\mathcal{A}. \tag{29}$$
Thus problem (28) can be written as
$$\underset{x^f\in X^f}{\text{minimize}}\;\; \sum_{a\in\mathcal{A}^f} c^f_a(x^f_a) \;-\; \tau_f \sum_{w\in\mathcal{W}^f} \widehat{\psi}^{w,\max}_{\rm flow}(x).$$
To motivate the discussion in the next section, we point out that the capacity $u^{\rm sum}_a(x)$ is a convex function of $x$, while $u^{\max}_a(x)$ is a difference of two convex piecewise affine functions; namely,
$$u^{\max}_a(x) \;=\; \underbrace{\max\Big( \max_{f:\,a\in\mathcal{A}^f} x^f_a,\; u^0_a \Big)}_{\text{convex in } x} \;-\; \underbrace{\max_{f:\,a\in\mathcal{A}^f} x^f_a}_{\text{convex in } x}.$$
These properties of the capacities translate into the dc assumption on the functions $U^f_j(x)$ defining the value function $\psi_f(x)$ in the general formulation (19); see Proposition 11 below.

To close the discussion of the above interdiction games, we mention one extension that leads to a game with coupling constraints in the players' optimization problems. For either problem (24) or (28), it is plausible that the system has upper limits on the total amounts of interdiction, imposed by a central system authority; these are expressed by the constraints: for $a\in\mathcal{A}$, $\sum_{f:\,a\in\mathcal{A}^f} x^f_a \le \kappa_a$, where $\kappa_a > 0$ is the upper bound on the interdiction of arc $a\in\mathcal{A}$. These constraints, which couple all interdictors' decision variables, are included in the respective first-level problems. With these coupling constraints present, the overall game becomes one of the generalized kind [29] that can be converted to a standard problem by setting marginal prices on them. Take problem (28) for instance. The resulting game with prices on the arc interdictions is then defined by $F+1$ optimization problems, wherein problem $f = 1,\dots,F$ is
$$\underset{x^f\in X^f}{\text{minimize}}\;\; \sum_{a\in\mathcal{A}^f} c^f_a(x^f_a) \;-\; \tau_f \sum_{w\in\mathcal{W}^f} \widehat{\psi}^{w,\max}_{\rm flow}(x) \;+\; \sum_{a\in\mathcal{A}} p_a \Big( \sum_{f':\,a\in\mathcal{A}^{f'}} x^{f'}_a - \kappa_a \Big),$$
and the $(F+1)$-st problem is
$$\underset{p_a\ge 0}{\text{minimize}}\;\; \sum_{a\in\mathcal{A}} p_a \Big( \kappa_a - \sum_{f':\,a\in\mathcal{A}^{f'}} x^{f'}_a \Big),$$
which is equivalent to the complementarity conditions
$$0 \le p_a \;\perp\; \kappa_a - \sum_{f':\,a\in\mathcal{A}^{f'}} x^{f'}_a \;\ge\; 0, \qquad a\in\mathcal{A},$$
where $\perp$ is the perpendicularity notation, which in this context describes the complementarity between $p_a$ and the corresponding constraint slack and can be interpreted as a market-clearing condition. Formulated as $F+1$ optimization problems, this extended game has been called a game with side constraints in [73] and a Multiple Optimization Problems with Equilibrium Constraints (MOPEC) in [34]. The value functions in the players' individual optimization problems are a novel feature not considered in these references.
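For a single OD pair, the inner value $\psi^{w,\max}_{\rm flow}(x)$ is an ordinary max-flow computation once the interdicted capacities are formed. A minimal sketch on a toy network using networkx, with sum-form capacities from (27); all arc data and interdiction amounts are illustrative.

```python
import networkx as nx

u0 = {('s', 'a'): 3.0, ('s', 'b'): 2.0, ('a', 't'): 2.0,
      ('b', 't'): 3.0, ('a', 'b'): 1.0}                      # base capacities
x  = {1: {('s', 'a'): 0.5},                                  # interdictor 1
      2: {('s', 'a'): 0.7, ('b', 't'): 1.0}}                 # interdictor 2

def flow_max(u0, x, s='s', t='t'):
    G = nx.DiGraph()
    for arc, cap in u0.items():
        hit = sum(xf.get(arc, 0.0) for xf in x.values())
        G.add_edge(*arc, capacity=max(0.0, cap - hit))       # u_a^sum(x)
    return nx.maximum_flow_value(G, s, t)

print(flow_max(u0, x))
```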
3.1.3 Max-Flow Enhancing Game

To illustrate the activity-enhancing nature of the VF game $\mathcal{G}^{+}_{\rm VF}$, we discuss a variation of the max-flow game in which, instead of interdicting, the first-level players aim to enhance the network agent's flow-maximization objective by increasing (rather than decreasing) the link capacities. Specifically, the max-flow problem is the same as (26) except that the link capacities are given by
$$u^{\rm sum}_a(x) = \max\Big( u^0_a,\; \sum_{f:\,a\in\mathcal{A}^f} x^f_a \Big) \qquad\text{or}\qquad u^{\max}_a(x) = \max\Big( u^0_a,\; \max_{f:\,a\in\mathcal{A}^f} x^f_a \Big). \tag{30}$$
Player $f$'s optimization problem is
$$\underset{x^f\in X^f}{\text{minimize}}\;\; \sum_{a\in\mathcal{A}^f} c^f_a(x^f_a) \;+\; \tau_f \sum_{w\in\mathcal{W}^f} \widehat{\psi}^{w,\max}_{\rm flow}(x);$$
the resulting game is easily seen to be of type $\mathcal{G}^{+}_{\rm VF}$. Yet, unlike the game derived from a two-stage SP game with convex minimization recourse functions $\psi^{\min}_{\rm SP}(x)$, the value functions $\widehat{\psi}^{w,\max}_{\rm flow}(x)$ are not convex in $x$, owing to the nonlinearity of the sum-capacity and max-capacity functions in (30). In summary, the games $\mathcal{G}^{\pm}_{\rm VF}$ can model a competitive two-level decision-making problem in which the players minimize their first-level activity costs while aiming either to degrade or to enhance the objective of an economic agent, who is either an adversary or an ally, through their adverse or supportive actions on that agent's problem.

3.1.4 Quasi-Nash Equilibria of the Games $\mathcal{G}^{\pm}_{\rm VF}$

The value function $\psi_f(x)$ given by (19) belongs to a class of pointwise maximum functions whose dc property was established in [72, Appendix A] under the boundedness of the set $\Lambda^f$. Since the feasible regions of the dual network flow problems are not bounded, we give a separate proof of the result below, employing instead the polyhedrality of $\Lambda^f$.

Proposition 11. Let $\Lambda^f$ be a polyhedron and let each $U^f_j(x)$ be dc on the set $X$. Suppose $\psi_f(x)$ is finite for every $x\in X$. Then the value function $\psi_f(x)$ is dc on $X$.

Proof. Let $\{\lambda^{f,t}\}_{t=1}^T$ be the finite set of extreme points of $\Lambda^f$ for some integer $T>0$. We have
$$\psi_f(x) \;=\; \max_{1\le t\le T}\; \big(U^f(x)\big)^T \lambda^{f,t}. \tag{31}$$
Each maximand $x\mapsto \big(U^f(x)\big)^T\lambda^{f,t}$ is a dc function of $x$ for fixed $\lambda^{f,t}$. Since the pointwise maximum of finitely many dc functions is dc, it follows that $\psi_f(x)$ is dc on $X$. $\Box$

A consequence of Proposition 11 is that if the first-level objective $\varphi_f(\cdot,x^{-f})$ is dc on $X^f$, then the combined objective $\theta_f(\cdot,x^{-f}) = \varphi_f(\cdot,x^{-f}) \pm \psi_f(\cdot,x^{-f})$ is dc, and hence directionally differentiable. Since $\psi_f(\cdot,x^{-f})$, and thus $\theta_f(\cdot,x^{-f})$, is in general not differentiable, the first-order optimality conditions of player $f$'s optimization problem cannot be stated as a standard variational inequality [30, Chapter 1]. Instead, with a convex first-level objective $\varphi_f(\cdot,x^{-f})$, we may define the following concept of a first-order Nash solution of the value-function based games, relying on the directional derivatives of the players' objective functions, which we recall:
$$\theta_f(\cdot,x^{-f})'(x^f; v^f) \;\triangleq\; \lim_{\tau\downarrow 0}\; \frac{\theta_f(x^f + \tau v^f,\, x^{-f}) - \theta_f(x)}{\tau}.$$
The rest of the paper is devoted to the proof of existence, and the computation, of such a solution of the two value-function based games $\mathcal{G}^{\pm}_{\rm VF}$.

Definition 12. A tuple $x^* \triangleq \big(x^{*,f}\big)_{f=1}^F \in X$ is a
(a) quasi-Nash equilibrium (QNE) of the value-function based game $\mathcal{G}^{+}_{\rm VF}$ if, for all $f = 1,\dots,F$,
$$\theta^+_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) \;\ge\; 0 \qquad \forall\, x^f\in X^f$$
(alternatively, this may be termed a Nash stationary solution; throughout the paper we use the QNE terminology);
(b) local Nash equilibrium (LNE) of $\mathcal{G}^{+}_{\rm VF}$ if there exists a neighborhood $\prod_{f=1}^F \mathcal{N}^f$ of $x^*$ such that, for all $f = 1,\dots,F$,
$$\theta^+_f(x^f,\,x^{*,-f}) \;\ge\; \theta^+_f(x^*) \qquad \forall\, x^f\in X^f\cap\mathcal{N}^f.$$
Similar definitions apply to the game $\mathcal{G}^{-}_{\rm VF}$.
Motivated by the capacity functions in the network flow games, we assume that each function $U^f_j(x)$ is given by the difference of two pointwise maxima of finitely many affine functions; i.e.,
$$U^f_j(x) \;=\; \underbrace{\max_{1\le\ell\le K^{f,+}_j}\; \big\{\, \underbrace{g^{f,+}_{j,\ell} + C^{f,+}_{j,\ell}\, x}_{\triangleq\, u^{f,+}_{j,\ell}(x)} \,\big\}}_{\triangleq\; u^{f,+}_j(x)} \;-\; \underbrace{\max_{1\le\ell\le K^{f,-}_j}\; \big\{\, \underbrace{g^{f,-}_{j,\ell} + C^{f,-}_{j,\ell}\, x}_{\triangleq\, u^{f,-}_{j,\ell}(x)} \,\big\}}_{\triangleq\; u^{f,-}_j(x)} \tag{32}$$
for some given scalars $g^{f,\pm}_{j,\ell}$, $n$-dimensional row vectors $C^{f,\pm}_{j,\ell}$, and positive integers $K^{f,\pm}_j$. Note that the $u^{f,\pm}_j(x)$ are convex, albeit non-differentiable, functions. In terms of the individual player variables $x^f$, we may write $C^{f,\pm}_{j,\ell}\, x = \sum_{f'=1}^F C^{f,\pm}_{j,\ell,f'}\, x^{f'}$, where each $C^{f,\pm}_{j,\ell,f'}$ is an $n_{f'}$-dimensional row vector. By the expression (31), each second-level value function $\psi_f(x)$ is then a piecewise affine function of $x$. As such, we have the following result asserting that every QNE of the games $\mathcal{G}^{\pm}_{\rm VF}$ is an LNE.

Proposition 13. Suppose each $\varphi_f(\cdot,x^{-f})$ is convex for every $x^{-f}\in X^{-f}$ and all $f = 1,\dots,F$. With each $U^f_j(x)$ given by (32), every QNE of the games $\mathcal{G}^{\pm}_{\rm VF}$ is an LNE.

Proof. Let $x^*$ be a QNE of the game $\mathcal{G}^{\pm}_{\rm VF}$. By the convexity of $\varphi_f(\cdot,x^{*,-f})$, which yields
$$\varphi_f(x^f,x^{*,-f}) \;\ge\; \varphi_f(x^*) + \varphi_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) \qquad \forall\, x^f\in X^f,$$
and by the piecewise affine property of $\psi_f(\cdot,x^{*,-f})$, which yields [30, expression (4.2.7)]
$$\psi_f(x^f,x^{*,-f}) \;=\; \psi_f(x^*) + \psi_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) \qquad \text{for all } x^f \text{ sufficiently close to } x^{*,f},$$
it follows that
$$\theta^\pm_f(x^f,x^{*,-f}) \;\ge\; \theta^\pm_f(x^*) + \theta^\pm_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) \;\ge\; \theta^\pm_f(x^*)$$
for all $x^f$ sufficiently close to $x^{*,f}$. Thus $x^*$ is an LNE of the respective game. $\Box$

In conclusion, supported by realistic applications, the two classes of games $\mathcal{G}^{\pm}_{\rm VF}$ have the following novel feature: the combined objective functions $\varphi_f(x) \pm \psi_f(x)$ are non-convex and non-differentiable in the players' own variables. These games may be considered as special EPECs amenable to constructive treatment by the pull-out idea, combined with linear complementarity methods in the linear-quadratic case, to be discussed starting in the next section.

3.2 The Pull-out Games

The idea of pull-out was introduced in [32] as a way to convert a non-cooperative game with min-max, thus non-differentiable, objectives into one with smooth objectives, so that provably convergent distributed algorithms can be applied. This idea turns out to be useful in the context of the value-function based games $\mathcal{G}^{\pm}_{\rm VF}$ because of their non-differentiability and non-convexity. For these games, two kinds of pull-out are needed: the first for the maximization in the value functions $\psi_f(x)$, and the second for the pointwise maxima in the functions $U^f_j(x)$. We remark that this paper does not deal with the design of distributed algorithms for these VF-based games; instead, we subsequently discuss their solution by linear complementarity methods in the linear-quadratic case.
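The piecewise structure (32) is straightforward to evaluate. A minimal sketch with illustrative coefficients: each of `g_plus`/`C_plus` and `g_minus`/`C_minus` stacks the affine pieces of $u^{f,+}_j$ and $u^{f,-}_j$, respectively, one row per piece.

```python
import numpy as np

g_plus,  C_plus  = np.array([0.0, 1.0]), np.array([[1.0, -1.0], [0.5, 0.0]])
g_minus, C_minus = np.array([0.0]),      np.array([[0.2, 0.3]])

def U_j(x):
    u_plus  = np.max(g_plus  + C_plus  @ x)   # convex piecewise affine
    u_minus = np.max(g_minus + C_minus @ x)   # convex piecewise affine
    return u_plus - u_minus                   # dc: neither convex nor concave

print(U_j(np.array([0.4, 2.0])))
```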
3.2.1 The Game $\mathcal{G}^{+}_{\rm VF}$

For the game $\mathcal{G}^{+}_{\rm VF}$, we first pull out the value function $\psi_f(x)$, resulting in a $2F$-player game:
$$\left\{ \begin{array}{l} \text{parameterized by } (x^{-f},\lambda^{-f})\in X^{-f}\times\Lambda^{-f}, \\[2pt] \underset{x^f\in X^f}{\text{minimize}}\;\; \underbrace{\varphi_f(x) + U^f(x)^T\lambda^f}_{\text{remains dc in } x^f} \end{array} \right\}_{f=1}^F \qquad\text{together with}\qquad \left\{ \begin{array}{l} \text{parameterized by } x\in X, \\[2pt] \underset{\lambda^f\in\Lambda^f}{\text{maximize}}\;\; \underbrace{U^f(x)^T\lambda^f}_{\text{a linear program}} \end{array} \right\}_{f=1}^F.$$
To convexify the player problem $\displaystyle\min_{x^f\in X^f}\Big[ \varphi_f(x) + \sum_{j=1}^{m_f}\big( u^{f,+}_j(x) - u^{f,-}_j(x) \big)\lambda^f_j \Big]$, we apply the pull-out idea to the function $u^{f,-}_j(x)$, writing it as a convex combination of the affine maximands that define it; namely,
$$u^{f,-}_j(x) \;=\; \sum_{\ell=1}^{K^{f,-}_j} \mu^{f,-}_{j,\ell}\,\big( g^{f,-}_{j,\ell} + C^{f,-}_{j,\ell}\,x \big) \quad\text{for some } \mu^{f,-}_{j,\ell}\ge 0 \text{ such that } \underbrace{\sum_{\ell=1}^{K^{f,-}_j} \mu^{f,-}_{j,\ell} = 1}_{\text{set denoted } \mathcal{M}^{f,-}_j}.$$
This results in three sets of optimization problems:
$$\left\{ \begin{array}{l} x\text{-players: parameterized by } (x^{-f},\lambda^{-f})\in X^{-f}\times\Lambda^{-f} \text{ and } \mu^{f,-}_{j,\ell}\ge 0 \text{ summing (over }\ell\text{) to unity}, \\[4pt] \underset{x^f\in X^f}{\text{minimize}}\;\; \theta^+_f(x;\lambda^f,\mu^{f,-}) \;\triangleq\; \underbrace{\varphi_f(x) + \displaystyle\sum_{j=1}^{m_f}\Big[\, u^{f,+}_j(x) - \sum_{\ell=1}^{K^{f,-}_j}\mu^{f,-}_{j,\ell}\big( g^{f,-}_{j,\ell} + C^{f,-}_{j,\ell}\,x \big) \Big]\,\lambda^f_j}_{\text{convex in } x^f \text{ if } \varphi_f(\cdot,x^{-f}) \text{ is convex}} \end{array} \right\}_{f=1}^F;$$
$$\left\{ \begin{array}{l} \lambda\text{-players: parameterized by } x\in X, \\[4pt] \underset{\lambda^f\in\Lambda^f}{\text{maximize}}\;\; \widehat{\theta}_f(\lambda^f;x) \;\triangleq\; U^f(x)^T\lambda^f = \displaystyle\sum_{j=1}^{m_f}\big( u^{f,+}_j(x) - u^{f,-}_j(x) \big)\,\lambda^f_j \end{array} \right\}_{f=1}^F;$$
$$\left\{ \begin{array}{l} \text{auxiliary players (not needed unless } K^{f,-}_j\ge 2\text{): parameterized by } x\in X, \text{ for every } j = 1,\dots,m_f, \\[4pt] \underset{\mu^{f,-}_j\in\mathcal{M}^{f,-}_j}{\text{maximize}}\;\; \widehat{u}^{f,-}_j(\mu^{f,-}_j;x) \;\triangleq\; \displaystyle\sum_{\ell=1}^{K^{f,-}_j}\mu^{f,-}_{j,\ell}\big( g^{f,-}_{j,\ell} + C^{f,-}_{j,\ell}\,x \big) \end{array} \right\}_{f=1}^F.$$
Together, these optimization problems define a game which we denote $\mathcal{G}^{+,\rm pull}_{\rm VF}$. Since each of these problems is a convex program, we can speak of a Nash equilibrium (NE) of the latter pull-out game. Specifically, a triple $(x^*,\lambda^*,\mu^{*,-}) \triangleq \big( (x^{*,f})_{f=1}^F,\, (\lambda^{*,f})_{f=1}^F,\, (\mu^{*,f,-})_{f=1}^F \big)$ with $\mu^{*,f,-} \triangleq \big(\mu^{*,f,-}_j\big)_{j=1}^{m_f}$ is an NE of $\mathcal{G}^{+,\rm pull}_{\rm VF}$ if for every $f = 1,\dots,F$:
$$x^{*,f} \in \operatorname*{argmin}_{x^f\in X^f}\; \theta^+_f(x^f,x^{*,-f};\lambda^{*,f},\mu^{*,f,-}), \qquad \lambda^{*,f} \in \operatorname*{argmax}_{\lambda^f\in\Lambda^f}\; \widehat{\theta}_f(\lambda^f;x^*), \qquad \mu^{*,f,-}_j \in \operatorname*{argmax}_{\mu^{f,-}_j\in\mathcal{M}^{f,-}_j}\; \widehat{u}^{f,-}_j(\mu^{f,-}_j;x^*) \;\;\forall\, j = 1,\dots,m_f.$$
The result below connects a QNE of the VF-based game $\mathcal{G}^{+}_{\rm VF}$ with an NE of the pull-out game $\mathcal{G}^{+,\rm pull}_{\rm VF}$.

Proposition 14. Suppose each $\varphi_f$ is continuous on $X$ and $\varphi_f(\cdot,x^{-f})$ is convex on $X^f$ for every fixed $x^{-f}\in X^{-f}$. The following two statements hold.
(a) If $x^*$ is a QNE of the VF-based game $\mathcal{G}^{+}_{\rm VF}$, then for every pair $(\lambda^*,\mu^{*,-})$ such that
$$\left\{ \begin{array}{l} \mu^{*,f,-}_j \in \operatorname*{argmax}_{\mu^{f,-}_j\in\mathcal{M}^{f,-}_j}\; \widehat{u}^{f,-}_j(\mu^{f,-}_j;x^*) \;\;\forall\, j = 1,\dots,m_f \\[4pt] \lambda^{*,f} \in \operatorname*{argmax}_{\lambda^f\in\Lambda^f}\; \widehat{\theta}_f(\lambda^f;x^*) \end{array} \right\} \quad\text{for every } f = 1,\dots,F, \tag{33}$$
the triple $(x^*,\lambda^*,\mu^{*,-})$ is an NE of the game $\mathcal{G}^{+,\rm pull}_{\rm VF}$.
(b) Conversely, if $(x^*,\lambda^*,\mu^{*,-})$ is an NE of the game $\mathcal{G}^{+,\rm pull}_{\rm VF}$ such that for every $f = 1,\dots,F$ and every $j = 1,\dots,m_f$ the set $\mathcal{I}^{*,f,-}_j \triangleq \operatorname*{argmax}_{1\le\ell\le K^{f,-}_j} u^{f,-}_{j,\ell}(x^*)$ is a singleton, then $x^*$ is a QNE of the game $\mathcal{G}^{+}_{\rm VF}$.
Proof. (a) By linear programming duality, we deduce that $\mu^{*,f,-}_{j,\ell} = 0$ for all $\ell\notin\mathcal{I}^{*,f,-}_j \triangleq \operatorname{argmax}_{1\le\ell\le K^{f,-}_j} u^{f,-}_{j,\ell}(x^*)$, which yields $\sum_{\ell\in\mathcal{I}^{*,f,-}_j}\mu^{*,f,-}_{j,\ell} = 1$. We have, for every $x^f\in X^f$,
$$\begin{aligned}
0 &\le \theta^+_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) \\
&= \varphi_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) + \sum_{j=1}^{m_f}\Big[ u^{f,+}_j(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) - u^{f,-}_j(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) \Big]\,\lambda^{*,f}_j \\
&= \varphi_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) + \sum_{j=1}^{m_f}\Big[ u^{f,+}_j(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) - \max_{\ell\in\mathcal{I}^{*,f,-}_j} C^{f,-}_{j,\ell,f}\,(x^f - x^{*,f}) \Big]\,\lambda^{*,f}_j \\
&\le \varphi_f(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) + \sum_{j=1}^{m_f}\Big[ u^{f,+}_j(\cdot,x^{*,-f})'(x^{*,f};\, x^f - x^{*,f}) - \sum_{\ell\in\mathcal{I}^{*,f,-}_j}\mu^{*,f,-}_{j,\ell}\, C^{f,-}_{j,\ell,f}\,(x^f - x^{*,f}) \Big]\,\lambda^{*,f}_j \\
&= \theta^+_f(\cdot,x^{*,-f};\lambda^{*,f},\mu^{*,f,-})'(x^{*,f};\, x^f - x^{*,f}).
\end{aligned}$$
Since $\theta^+_f(\cdot,x^{*,-f};\lambda^{*,f},\mu^{*,f,-})$ is convex, it follows that $x^{*,f}\in\operatorname{argmin}_{x^f\in X^f}\theta^+_f(x^f,x^{*,-f};\lambda^{*,f},\mu^{*,f,-})$. Together with the optimizing choices (33) of $\mu^{*,f,-}_j$ and $\lambda^{*,f}$, part (a) follows.
(b) This can be proved by reversing the above derivation, using the singleton assumption on $\mathcal{I}^{*,f,-}_j$. $\Box$

3.2.2 The Game $\mathcal{G}^{-}_{\rm VF}$

The analysis of this game is similar; we need only identify the corresponding pull-out components. We begin with the expression
$$u^{f,+}_j(x) \;=\; \sum_{\ell=1}^{K^{f,+}_j}\mu^{f,+}_{j,\ell}\,\big( g^{f,+}_{j,\ell} + C^{f,+}_{j,\ell}\,x \big) \quad\text{for some } \mu^{f,+}_{j,\ell}\ge 0 \text{ such that } \underbrace{\sum_{\ell=1}^{K^{f,+}_j}\mu^{f,+}_{j,\ell} = 1}_{\text{set denoted } \mathcal{M}^{f,+}_j}.$$
The pull-out version of the game $\mathcal{G}^{-}_{\rm VF}$, which we denote $\mathcal{G}^{-,\rm pull}_{\rm VF}$, consists of three sets of optimization problems:
$$\left\{ \begin{array}{l} x\text{-players: parameterized by } (x^{-f},\lambda^{-f})\in X^{-f}\times\Lambda^{-f} \text{ and } \mu^{f,+}_{j,\ell}\ge 0 \text{ summing to unity}, \\[4pt] \underset{x^f\in X^f}{\text{minimize}}\;\; \theta^-_f(x;\lambda^f,\mu^{f,+}) \;\triangleq\; \underbrace{\varphi_f(x) + \displaystyle\sum_{j=1}^{m_f}\Big[ -\sum_{\ell=1}^{K^{f,+}_j}\mu^{f,+}_{j,\ell}\big( g^{f,+}_{j,\ell} + C^{f,+}_{j,\ell}\,x \big) + u^{f,-}_j(x) \Big]\,\lambda^f_j}_{\text{convex in } x^f \text{ if } \varphi_f(\cdot,x^{-f}) \text{ is convex}} \end{array} \right\}_{f=1}^F;$$
$$\left\{ \begin{array}{l} \lambda\text{-players: parameterized by } x\in X, \\[4pt] \underset{\lambda^f\in\Lambda^f}{\text{maximize}}\;\; \widehat{\theta}_f(\lambda^f;x) \;\triangleq\; U^f(x)^T\lambda^f = \displaystyle\sum_{j=1}^{m_f}\big( u^{f,+}_j(x) - u^{f,-}_j(x) \big)\,\lambda^f_j \end{array} \right\}_{f=1}^F;$$
$$\left\{ \begin{array}{l} \text{auxiliary players (not needed unless } K^{f,+}_j\ge 2\text{): parameterized by } x\in X, \text{ for } j = 1,\dots,m_f, \\[4pt] \underset{\mu^{f,+}_j\in\mathcal{M}^{f,+}_j}{\text{maximize}}\;\; \widehat{u}^{f,+}_j(\mu^{f,+}_j;x) \;\triangleq\; \displaystyle\sum_{\ell=1}^{K^{f,+}_j}\mu^{f,+}_{j,\ell}\big( g^{f,+}_{j,\ell} + C^{f,+}_{j,\ell}\,x \big) \end{array} \right\}_{f=1}^F.$$
We have the following result for the two games $\mathcal{G}^{-}_{\rm VF}$ and $\mathcal{G}^{-,\rm pull}_{\rm VF}$; its proof parallels that of Proposition 14 and is omitted.

Proposition 15. Suppose each $\varphi_f$ is continuous on $X$ and $\varphi_f(\cdot,x^{-f})$ is convex on $X^f$ for every fixed $x^{-f}\in X^{-f}$. The following two statements hold.
(a) If $x^*$ is a QNE of the VF-based game $\mathcal{G}^{-}_{\rm VF}$, then for every pair $(\lambda^*,\mu^{*,+})$ such that
$$\left\{ \begin{array}{l} \mu^{*,f,+}_j \in \operatorname*{argmax}_{\mu^{f,+}_j\in\mathcal{M}^{f,+}_j}\; \widehat{u}^{f,+}_j(\mu^{f,+}_j;x^*) \;\;\forall\, j = 1,\dots,m_f \\[4pt] \lambda^{*,f} \in \operatorname*{argmax}_{\lambda^f\in\Lambda^f}\; \widehat{\theta}_f(\lambda^f;x^*) \end{array} \right\} \quad\text{for every } f = 1,\dots,F,$$
the triple $(x^*,\lambda^*,\mu^{*,+})$ is an NE of the game $\mathcal{G}^{-,\rm pull}_{\rm VF}$.
(b) Conversely, if $(x^*,\lambda^*,\mu^{*,+})$ is an NE of the game $\mathcal{G}^{-,\rm pull}_{\rm VF}$ such that for every $f = 1,\dots,F$ and every $j = 1,\dots,m_f$ the set $\mathcal{I}^{*,f,+}_j \triangleq \operatorname*{argmax}_{1\le\ell\le K^{f,+}_j} u^{f,+}_{j,\ell}(x^*)$ is a singleton, then $x^*$ is a QNE of the game $\mathcal{G}^{-}_{\rm VF}$.

3.2.3 Existence of equilibria

The key to the proof of existence of an NE of the games $\mathcal{G}^{\pm,\rm pull}_{\rm VF}$ hinges on the boundedness of the variables $\lambda^f$. For this purpose, we impose the following condition, which is an equivalent way of postulating that the value function $\psi_f(x)$ is finite for all $x\in X$:

For every $x\in X$: $\big[\,\lambda^f\in\Lambda^f_\infty \;\Rightarrow\; U^f(x)^T\lambda^f \le 0\,\big]$, where $\Lambda^f_\infty \triangleq \{\lambda^f\in\mathbb{R}^{m_f}_+ \mid G^f\lambda^f \le 0\}$ is the recession cone of the polyhedron $\Lambda^f$.

Under this assumption, letting $\widehat{\Lambda}^f$ denote the convex hull of the extreme points of the polyhedron $\Lambda^f$, we then have
$$-\infty \;<\; \psi_f(x) \;=\; \underset{\lambda^f\in\Lambda^f}{\text{maximum}}\;\widehat{\theta}_f(\lambda^f;x) \;=\; \underset{\lambda^f\in\widehat{\Lambda}^f}{\text{maximum}}\;\widehat{\theta}_f(\lambda^f;x) \;<\; \infty \qquad \forall\, x\in X. \tag{34}$$
The significance of this equality is that $\widehat{\Lambda}^f$ is a (nonempty) polytope while $\Lambda^f$ may be unbounded. The equality does not imply, however, that $\operatorname{argmax}_{\lambda^f\in\Lambda^f}\widehat{\theta}_f(\lambda^f;x)$ is bounded.
Theorem 16. Suppose that for every $f = 1,\dots,F$, $X^f$ is a compact convex set and $-\infty < \psi_f(x) < \infty$ for all $x\in X$. Let each $U^f_j(x)$ be given by (32). Suppose each $\varphi_f$ is continuous on $X$ and $\varphi_f(\cdot,x^{-f})$ is convex on $X^f$ for every fixed $x^{-f}\in X^{-f}$. Then Nash equilibria of both pull-out games $\mathcal{G}^{\pm,\rm pull}_{\rm VF}$ exist; these are QNEs of the original games $\mathcal{G}^{\pm}_{\rm VF}$ under the assumptions in part (b) of Propositions 14 and 15, respectively.

Proof. All the assumptions needed to apply a standard existence theorem for a Nash equilibrium of the games $\mathcal{G}^{\pm,\rm pull}_{\rm VF}$ are in place, namely convexity of the players' objective functions in their own variables and compactness and convexity of their respective feasible sets. $\Box$

3.3 Solution by Linear Complementarity Formulations

With each $X^f$ a polyhedron given by $\{x^f\in\mathbb{R}^{n_f}_+ \mid A^f x^f \le b^f\}$ for some matrix $A^f$ and vector $b^f$ of appropriate orders, we are interested in the application of Lemke's algorithm [21] to the respective linear complementarity problem (LCP) formulations of the games $\mathcal{G}^{\pm,\rm pull}_{\rm VF}$ in which each first-level objective
$$\varphi_f(x) \;=\; \Big( q^f + \sum_{f'\neq f} Q^{ff'} x^{f'} \Big)^T x^f \;+\; \tfrac{1}{2}\,(x^f)^T Q^{ff} x^f$$
is a quadratic function. Subsequently, for the successful termination of Lemke's algorithm, we will make an assumption on the matrix
$$Q \;\triangleq\; \begin{bmatrix} Q^{11} & Q^{12} & \cdots & Q^{1,F-1} & Q^{1F} \\ Q^{21} & Q^{22} & \cdots & Q^{2,F-1} & Q^{2F} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ Q^{F-1,1} & Q^{F-1,2} & \cdots & Q^{F-1,F-1} & Q^{F-1,F} \\ Q^{F1} & Q^{F2} & \cdots & Q^{F,F-1} & Q^{FF} \end{bmatrix},$$
which is not necessarily symmetric. The pull-out formulation in Subsection 3.2.1 suffices for converting the games $\mathcal{G}^{\pm}_{\rm VF}$ to standard ones with player-convex optimization problems, from which the existence of a QNE can be established. Nevertheless, because of the bilinear products $\mu^{f,-}_{j,\ell}\,\lambda^f_j$ appearing explicitly in the objective functions of the $x$-players' problems and implicitly in the $\lambda$-players' problems, the pull-out games $\mathcal{G}^{\pm,\rm pull}_{\rm VF}$ are not amenable to a finitely terminating solution method even when the players' first-level objectives $\varphi_f(x)$ are linear or quadratic, which is the case of focus in this section. To derive a suitable formulation for this purpose, we make the change of variables $\widehat{\mu}^{f,-}_{j,\ell} \triangleq \mu^{f,-}_{j,\ell}\,\lambda^f_j$ and redefine the auxiliary players' problems in terms of the new variables $\widehat{\mu}^{f,-}_{j,\ell}$. We treat the two resulting redefined games separately in the following two subsections. To prepare for this treatment, let
$$\widehat{\mathcal{M}}^{f,-} \;\triangleq\; \Big\{\; \widehat{\mu}^{f,-} \triangleq \big(\widehat{\mu}^{f,-}_j\big)_{j=1}^{m_f} \in \prod_{j=1}^{m_f}\mathbb{R}^{K^{f,-}_j}_+ \;\Big|\; \sum_{j=1}^{m_f} G^f_j \underbrace{\sum_{\ell=1}^{K^{f,-}_j}\widehat{\mu}^{f,-}_{j,\ell}}_{\lambda^f_j} \;\le\; e^f \;\Big\}, \qquad f = 1,\dots,F,$$
be the feasible sets of the new variables, where $G^f_j$ denotes the $j$-th column of $G^f$.

3.3.1 The Game $\widehat{\mathcal{G}}^{+,\rm pull}_{\rm VF}$

We first consider the game $\mathcal{G}^{+,\rm pull}_{\rm VF}$. Two steps are needed in the following derivation: Step 1 is as described above and is not needed when $K^{f,-}_j = 1$; Steps 2a and 2b are not needed when $K^{f,+}_j = 1$.

Step 1. With the substitution of variables $\widehat{\mu}^{f,-}_{j,\ell} \triangleq \mu^{f,-}_{j,\ell}\,\lambda^f_j$, the redefined game $\widehat{\mathcal{G}}^{+,\rm pull}_{\rm VF}$ consists of the following two sets of optimization problems:
$$\left\{ \begin{array}{l} x\text{-players: parameterized by } x^{-f}\in X^{-f} \text{ and } \widehat{\mu}^{f,-}\in\widehat{\mathcal{M}}^{f,-}, \\[4pt] \underset{x^f\in X^f}{\text{minimize}}\;\; \theta^+_f(x;\widehat{\mu}^{f,-}) \;\triangleq\; \varphi_f(x) + \displaystyle\sum_{j=1}^{m_f}\sum_{\ell=1}^{K^{f,-}_j}\widehat{\mu}^{f,-}_{j,\ell}\Big[ u^{f,+}_j(x) - \big( g^{f,-}_{j,\ell} + C^{f,-}_{j,\ell}\,x \big) \Big] \end{array} \right\}_{f=1}^F;$$
$$\left\{ \begin{array}{l} \text{auxiliary players (not needed unless } K^{f,-}_j\ge 2\text{): parameterized by } x\in X, \\[4pt] \underset{\widehat{\mu}^{f,-}\in\widehat{\mathcal{M}}^{f,-}}{\text{maximize}}\;\; \widehat{u}^{f,-}(\widehat{\mu}^{f,-};x) \;\triangleq\; \displaystyle\sum_{j=1}^{m_f}\sum_{\ell=1}^{K^{f,-}_j}\widehat{\mu}^{f,-}_{j,\ell}\Big[ u^{f,+}_j(x) - \big( g^{f,-}_{j,\ell} + C^{f,-}_{j,\ell}\,x \big) \Big] \end{array} \right\}_{f=1}^F.$$
We have the following result connecting the two games $\mathcal{G}^{+,\rm pull}_{\rm VF}$ and $\widehat{\mathcal{G}}^{+,\rm pull}_{\rm VF}$.
We have the following result connecting the two games G^{+,pull}_VF and Ĝ^{+,pull}_VF.

Lemma 17. The following two statements hold:

(A) if (x^*, μ^*, λ^{*,-}) is a NE of the game G^{+,pull}_VF, then (x^*, λ̂^{*,-}) is a NE of the game Ĝ^{+,pull}_VF, where λ̂^{*,f,-}_{j,ℓ} ≜ λ^{*,f,-}_{j,ℓ} μ^{*,f}_j for all f = 1,…,F, j = 1,…,m_f, and ℓ = 1,…,K^{f,-}_j; conversely,

(B) if (x^*, λ̂^{*,-}) is a NE of the game Ĝ^{+,pull}_VF, then (x^*, μ^*, λ^{*,-}) is a NE of the game G^{+,pull}_VF, where for every f = 1,…,F and j = 1,…,m_f, μ^{*,f}_j ≜ Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{*,f,-}_{j,ℓ}; if μ^{*,f}_j > 0 then λ^{*,f,-}_{j,ℓ} ≜ λ̂^{*,f,-}_{j,ℓ} / μ^{*,f}_j for all ℓ = 1,…,K^{f,-}_j, while if μ^{*,f}_j = 0 then { λ^{*,f,-}_{j,ℓ} }_{ℓ=1}^{K^{f,-}_j} can be arbitrary scalars maximizing Σ_{ℓ=1}^{K^{f,-}_j} λ^{f,-}_{j,ℓ} ( g^{f,-}_{j,ℓ} + C^{f,-}_{j,ℓ} x^* ) over all { λ^{f,-}_{j,ℓ} }_{ℓ=1}^{K^{f,-}_j} ∈ Λ^{f,-}_j.

Step 2a. Introduce a single variable t^{f,+}_j for u^{f,+}_j(x) = max_{1≤ℓ≤K^{f,+}_j} ( g^{f,+}_{j,ℓ} + C^{f,+}_{j,ℓ} x ), resulting in player f's problem becoming one with additional variables and constraints but remaining a convex program:

  minimize_{x^f∈X^f, t^{f,+}_j} φ_f(x) + Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} [ t^{f,+}_j − ( g^{f,-}_{j,ℓ} + C^{f,-}_{j,ℓ} x ) ]
  subject to t^{f,+}_j ≥ g^{f,+}_{j,ℓ} + C^{f,+}_{j,ℓ} x, ∀ℓ = 1,…,K^{f,+}_j, j = 1,…,m_f,

and the corresponding auxiliary problem becoming:

  maximize_{λ̂^{f,-}∈Λ̂^{f,-}} Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} [ t^{f,+}_j − ( g^{f,-}_{j,ℓ} + C^{f,-}_{j,ℓ} x ) ].

Step 2b. Let s^{f,+}_j ≜ t^{f,+}_j − ( g^{f,+}_{j,1} + C^{f,+}_{j,1} x ) ≥ 0. We may then substitute t^{f,+}_j = s^{f,+}_j + g^{f,+}_{j,1} + C^{f,+}_{j,1} x into the optimization problems in Step 2a, obtaining

  minimize_{x^f∈X^f, s^{f,+}_j≥0} φ_f(x) + Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} [ s^{f,+}_j + g^{f,+}_{j,1} − g^{f,-}_{j,ℓ} + ( C^{f,+}_{j,1} − C^{f,-}_{j,ℓ} ) x ]
  subject to s^{f,+}_j ≥ g^{f,+}_{j,ℓ} − g^{f,+}_{j,1} + ( C^{f,+}_{j,ℓ} − C^{f,+}_{j,1} ) x, ∀ℓ = 2,…,K^{f,+}_j, j = 1,…,m_f,

and

  maximize_{λ̂^{f,-}∈Λ̂^{f,-}} Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} [ s^{f,+}_j + g^{f,+}_{j,1} − g^{f,-}_{j,ℓ} + ( C^{f,+}_{j,1} − C^{f,-}_{j,ℓ} ) x ].

3.3.2 The game Ĝ^{-,pull}_VF

Applying a similar derivation, we can obtain a redefined game Ĝ^{-,pull}_VF and two modified families of optimization problems similar to those in Step 2b of the game Ĝ^{+,pull}_VF: for f = 1,…,F,

  minimize_{x^f∈X^f, s^{f,-}_j≥0} φ_f(x) + Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} [ s^{f,-}_j + g^{f,-}_{j,1} − g^{f,+}_{j,ℓ} + ( C^{f,-}_{j,1} − C^{f,+}_{j,ℓ} ) x ]
  subject to s^{f,-}_j ≥ g^{f,-}_{j,ℓ} − g^{f,-}_{j,1} + ( C^{f,-}_{j,ℓ} − C^{f,-}_{j,1} ) x, ∀ℓ = 2,…,K^{f,-}_j, j = 1,…,m_f,

and

  maximize_{λ̂^{f,+}∈Λ̂^{f,+}} Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} [ −s^{f,-}_j − g^{f,-}_{j,1} + g^{f,+}_{j,ℓ} + ( C^{f,+}_{j,ℓ} − C^{f,-}_{j,1} ) x ].

We will return to this redefined game Ĝ^{-,pull}_VF after the discussion of its counterpart Ĝ^{+,pull}_VF.

3.3.3 The LCP^{+,pull}_VF

Introducing multipliers ν^f, η^f, and λ̂^{f,+}_{j,ℓ} for the constraints A^f x^f ≤ b^f in X^f, Σ_{j=1}^{m_f} G^f_j ( Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ) ≤ e^f in Λ̂^{f,-}, and s^{f,+}_j ≥ g^{f,+}_{j,ℓ} − g^{f,+}_{j,1} + ( C^{f,+}_{j,ℓ} − C^{f,+}_{j,1} ) x, respectively, we may write down the optimality conditions of the two sets of optimization problems in Step 2b of the game Ĝ^{+,pull}_VF, obtaining its linear complementarity formulation, denoted LCP^{+,pull}_VF, as follows:

variables:
  z^+ ≜ { x^f, ( s^{f,+}_j, ( λ̂^{f,-}_{j,ℓ} )_{ℓ=1}^{K^{f,-}_j} )_{j=1}^{m_f}, ν^f, η^f, ( ( λ̂^{f,+}_{j,ℓ} )_{ℓ=2}^{K^{f,+}_j} )_{j=1}^{m_f} }_{f=1}^F;

complementarity conditions: for all f = 1,…,F,

  0 ≤ x^f ⊥ q^f + Σ_{f'=1}^F Q^{ff'} x^{f'} + (A^f)^T ν^f + Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ( C^{f,+}_{j,1,f} − C^{f,-}_{j,ℓ,f} )^T + Σ_{j=1}^{m_f} Σ_{ℓ=2}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} ( C^{f,+}_{j,ℓ,f} − C^{f,+}_{j,1,f} )^T ≥ 0
  0 ≤ ν^f ⊥ b^f − A^f x^f ≥ 0
  0 ≤ s^{f,+}_j ⊥ Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} − Σ_{ℓ=2}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} ≥ 0, j = 1,…,m_f
  0 ≤ λ̂^{f,+}_{j,ℓ} ⊥ g^{f,+}_{j,1} − g^{f,+}_{j,ℓ} + s^{f,+}_j − Σ_{f'=1}^F ( C^{f,+}_{j,ℓ,f'} − C^{f,+}_{j,1,f'} ) x^{f'} ≥ 0, ℓ = 2,…,K^{f,+}_j, j = 1,…,m_f
  0 ≤ λ̂^{f,-}_{j,ℓ} ⊥ −g^{f,+}_{j,1} + g^{f,-}_{j,ℓ} − s^{f,+}_j − Σ_{f'=1}^F ( C^{f,+}_{j,1,f'} − C^{f,-}_{j,ℓ,f'} ) x^{f'} + (G^f_j)^T η^f ≥ 0, ℓ = 1,…,K^{f,-}_j, j = 1,…,m_f
  0 ≤ η^f ⊥ e^f − Σ_{j=1}^{m_f} G^f_j Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ≥ 0.  (35)

The LCP^{+,pull}_VF can be written in the compact form 0 ≤ z^+ ⊥ q^+ + M^+ z^+ ≥ 0, for some N×N matrix M^+ and N-vector q^+.
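To fix ideas about the method applied to this compact form, the following is a minimal, textbook-style sketch of Lemke's complementary pivoting with the covering vector of ones. It is a generic illustration, not the implementation analyzed in this section, and it omits the anti-cycling safeguards and sparse data handling a production code would need.

```python
import numpy as np

def lemke(M, q, max_iter=1000, tol=1e-12):
    """Lemke's method for the LCP  0 <= z  _|_  q + M z >= 0  with covering
    vector of ones.  Returns z, or None on (secondary) ray termination."""
    n = len(q)
    if np.all(q >= 0):
        return np.zeros(n)                       # trivial solution
    # tableau columns: [w (n) | z (n) | z0 | rhs], encoding w - M z - z0*1 = q
    T = np.hstack([np.eye(n), -M, -np.ones((n, 1)), np.reshape(q, (-1, 1))])
    basis = list(range(n))                       # the w_i are basic initially
    r = int(np.argmin(q))                        # row of most negative q_i
    entering = 2 * n                             # the artificial z0 enters
    for _ in range(max_iter):
        T[r] /= T[r, entering]                   # pivot on (r, entering)
        for i in range(n):
            if i != r:
                T[i] -= T[i, entering] * T[r]
        leaving, basis[r] = basis[r], entering
        if leaving == 2 * n:                     # z0 left the basis: solved
            z = np.zeros(n)
            for i, b in enumerate(basis):
                if n <= b < 2 * n:
                    z[b - n] = T[i, -1]
            return z
        # complementary pivot rule: the complement of the leaving variable enters
        entering = leaving + n if leaving < n else leaving - n
        col = T[:, entering]                     # minimum-ratio test on the rhs
        ratios = [T[i, -1] / col[i] if col[i] > tol else np.inf for i in range(n)]
        r = int(np.argmin(ratios))
        if ratios[r] == np.inf:
            return None                          # ray termination
    return None

M = np.eye(2); q = np.array([-1.0, 2.0])         # tiny hypothetical instance
print(lemke(M, q))                               # [1. 0.]: w = q + Mz = (0, 2)
```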
Our goal is to apply [21, Theorem 4.4.13], which provides a sufficient condition for Lemke's algorithm to successfully compute a solution of a general LCP; this condition is as follows: M^+ is copositive on the nonnegative orthant, i.e., z^T M^+ z ≥ 0 for all z ≥ 0, and the following implication holds:

  0 ≤ z ⊥ M^+ z ≥ 0 ⇒ z^T q^+ ≥ 0.  (36)

The latter implication is equivalent to the membership of q^+ in the dual cone of the solution set of the homogeneous LCP (0, M^+). A special case where the implication (36) holds easily is when M^+ is strictly copositive on the nonnegative orthant, i.e., z^T M^+ z > 0 for all nonzero z ≥ 0. Yet this strict copositivity does not hold for the LCP on hand; see instead Remark 21 for a strict copositivity assumption on Q.

Returning to the copositivity requirement of the matrix M^+, we impose the following matrix-theoretic assumption on the functions u^{f,±}_j(x) as given by (32):

Assumption C^+ for (32) in Ĝ^{+,pull}_VF:
(a) C^{f,-}_{j,ℓ,f'} ≥ C^{f,+}_{j,1,f'} for all f, f' = 1,…,F with f' ≠ f, j = 1,…,m_f, and ℓ = 1,…,K^{f,-}_j;
(b) C^{f,+}_{j,1,f'} ≥ C^{f,+}_{j,ℓ,f'} for all f, f' = 1,…,F with f' ≠ f, j = 1,…,m_f, and ℓ = 2,…,K^{f,+}_j.

Condition (b) holds vacuously if K^{f,+}_j = 1; this is the case when u^{f,+}_j is a linear function. The two conditions hold trivially when the u^{f,±}_j(x) are private functions. An example of a function U^f_j(x) with coupled variables satisfying both conditions is the following:

  U^f_j(x) = max_{1≤ℓ≤K^{f,+}_j} ( g^{f,+}_{j,ℓ} + C^{f,+}_{j,ℓ,f} x^f )  [private to player f]
           − max_{1≤ℓ≤K^{f,-}_j} ( g^{f,-}_{j,ℓ} + Σ_{f'=1}^F C^{f,-}_{j,ℓ,f'} x^{f'} )  [rival dependent, with C^{f,-}_{j,ℓ,f'} ≥ 0 ∀f' ≠ f],

which is in general neither convex nor concave in x^f for given x^{-f}, but is concave in x^{-f} for given x^f. This includes the following function, after absorbing the first linear term into the pointwise maximum in the second term:

  U^f_j(x) = g^{f,+}_{j,1} + Σ_{f'=1}^F C^{f,+}_{j,1,f'} x^{f'}  [linear in x, with C^{f,+}_{j,1,f'} ≥ 0 ∀f' ≠ f]
           − max_{1≤ℓ≤K^{f,-}_j} ( g^{f,-}_{j,ℓ} + C^{f,-}_{j,ℓ,f} x^f )  [private to player f],

which is concave in x^f for given x^{-f} and linear in x^{-f} for given x^f. We further illustrate these functions in the source games discussed in Subsection 3.1.1.

Example 18. Consider the SP game with standard minimizing recourse functions that consists of the optimization problems, parameterized by x^{-f} ∈ X^{-f} for f = 1,…,F:

  minimize_{x^f∈X^f} φ_f(x) + ψ^{min,f}_{SP}(x), where
  ψ^{min,f}_{SP}(x) ≜ max_{π_s≥0, s=1,…,S} Σ_{s=1}^S [ C^f(ω_s) x − ξ^f(ω_s) ]^T π_s  (the bracketed term being U^{f,s}(x))
  subject to (D^f)^T π_s ≤ p_s b^f(ω_s), s = 1,…,S.

With each C^f(ω_s) x = Σ_{f'=1}^F C^{ff'}(ω_s) x^{f'} for some matrices C^{ff'}(ω_s) with rows C^{ff'}_j(ω_s), and identifying each u^{s,f,+}_j(x) = Σ_{f'=1}^F C^{ff'}_j(ω_s) x^{f'} − ξ^f_j(ω_s) and u^{s,f,-}_j = 0, Assumption C^+ holds if C^{ff'}(ω_s) ≤ 0 for all f' ≠ f and all s; in particular, this holds in the case of private recourse functions, i.e., where C^{ff'}(ω_s) = 0 for all f' ≠ f and all s = 1,…,S.
Separately, the max-flow enhancement game consists of the optimization problems, parameterized by x^{-f} ∈ X^{-f} for f = 1,…,F:

  minimize_{x^f∈X^f} Σ_{a∈A_f} c^f_a(x^f_a) − τ_f Σ_{w∈W_f} f̂^{flow}_{w,max}(x), where f̂^{flow}_{w,max}(x) ≜ max_{π∈M^f} [ −Σ_{a∈A} u_a(x) π_a ];

see (29) for the details of the set M^f. With each function u^{f,+}_j(x) = 0 and u^{f,-}_j(x) identified as a function u_a(x) that is given by either one of the two functions in (30), we can easily deduce that Assumption C^+ holds for this class of games.

Under Assumption C^+, we have the following copositivity result.

Proposition 19. Suppose that the matrix Q is copositive on the nonnegative orthant in the x-space and Assumption C^+ holds for (32) of the game Ĝ^{+,pull}_VF. Then the matrix M^+ in the LCP^{+,pull}_VF is copositive on the nonnegative orthant in the z-space.

Proof. This follows easily from multiplying out the product z^T M^+ z using the displayed block-wise expressions (35) of the LCP^{+,pull}_VF. Indeed, we have

  z^T M^+ z = x^T Q x + Σ_{f=1}^F Σ_{f'≠f} Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,-}_j} ( C^{f,-}_{j,ℓ,f'} − C^{f,+}_{j,1,f'} ) x^{f'} λ̂^{f,-}_{j,ℓ}
            + Σ_{f=1}^F Σ_{f'≠f} Σ_{j=1}^{m_f} Σ_{ℓ=2}^{K^{f,+}_j} ( C^{f,+}_{j,1,f'} − C^{f,+}_{j,ℓ,f'} ) x^{f'} λ̂^{f,+}_{j,ℓ},

which is nonnegative by the copositivity of the matrix Q and Assumption C^+. □

Next we address the implication (36). The homogeneous system 0 ≤ z ⊥ M^+ z ≥ 0, which we denote HLCP^{+,pull}_VF, is obtained from (35) by setting the constant vectors q^f, b^f, and e^f to zero; the implication (36) then requires every solution of the HLCP^{+,pull}_VF to satisfy

  Σ_{f=1}^F { (x^f)^T q^f + (ν^f)^T b^f + (η^f)^T e^f + Σ_{j=1}^{m_f} [ Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1} ) + Σ_{ℓ=2}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} ( g^{f,+}_{j,1} − g^{f,+}_{j,ℓ} ) ] } ≥ 0.  (37)

With λ̂^{f,±}_{j,ℓ} ≥ 0, a simple condition for Σ_{j} [ Σ_ℓ λ̂^{f,-}_{j,ℓ} ( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1} ) + Σ_{ℓ≥2} λ̂^{f,+}_{j,ℓ} ( g^{f,+}_{j,1} − g^{f,+}_{j,ℓ} ) ] ≥ 0 to hold is

  g^{f,-}_{j,ℓ} ≥ g^{f,+}_{j,1} ∀ℓ = 1,…,K^{f,-}_j and g^{f,+}_{j,1} ≥ g^{f,+}_{j,ℓ} ∀ℓ = 2,…,K^{f,+}_j.  (38)

More generally, with Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ≥ Σ_{ℓ=2}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} (which is part of the homogeneous system), we have

  Σ_{f=1}^F Σ_{j=1}^{m_f} { Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1} ) + Σ_{ℓ=2}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} ( g^{f,+}_{j,1} − g^{f,+}_{j,ℓ} ) }
  ≥ Σ_{f=1}^F Σ_{j=1}^{m_f} { ( Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ) [ min_{1≤ℓ≤K^{f,-}_j} max( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1}, 0 ) + min_{1≤ℓ≤K^{f,-}_j} min( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1}, 0 ) ]
      + ( Σ_{ℓ=2}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} ) [ min_{2≤ℓ≤K^{f,+}_j} max( g^{f,+}_{j,1} − g^{f,+}_{j,ℓ}, 0 ) + min_{2≤ℓ≤K^{f,+}_j} min( g^{f,+}_{j,1} − g^{f,+}_{j,ℓ}, 0 ) ] }
  ≥ Σ_{f=1}^F Σ_{j=1}^{m_f} { ( Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ) [ min_{1≤ℓ≤K^{f,-}_j} max( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1}, 0 ) + min_{1≤ℓ≤K^{f,-}_j} min( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1}, 0 ) + min_{2≤ℓ≤K^{f,+}_j} min( g^{f,+}_{j,1} − g^{f,+}_{j,ℓ}, 0 ) ] }.
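As an aside on Proposition 19: certifying copositivity exactly is co-NP-hard in general, but the property can at least be screened numerically by searching for a violating nonnegative vector. The sketch below uses a hypothetical 2×2 matrix chosen to illustrate that copositivity is weaker than positive semidefiniteness.

```python
import numpy as np

def copositive_screen(M, trials=20000, seed=0, tol=1e-12):
    """Randomized necessary test for copositivity on the nonnegative orthant:
    search for z >= 0 with z^T M z < 0.  Returning True is evidence only;
    returning False yields a certificate that M is NOT copositive."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    for _ in range(trials):
        z = rng.random(n)
        if z @ M @ z < -tol:
            return False
    return True

M = np.array([[1.0, 3.0], [3.0, 1.0]])   # indefinite, yet copositive on R^2_+
print(copositive_screen(M))              # True
```

Based on the above expression, we can establish the following solvability result for the LCP^{+,pull}_VF, and thus for the redefined game Ĝ^{+,pull}_VF and the pull-out game G^{+,pull}_VF, by Lemke's algorithm.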
Theorem 20. Suppose that the matrix Q is copositive on R^n_+ and Assumption C^+ holds for (32) of the game Ĝ^{+,pull}_VF. Suppose further that the set X contains the origin, and that

  [ for all f = 1,…,F: A^f x^f ≤ 0, x^f ≥ 0, Σ_{f'=1}^F ( Q^{ff'} + (Q^{f'f})^T ) x^{f'} ≥ 0 ]
  implies [ Σ_{f=1}^F (x^f)^T q^f ≥ 0, and Σ_{f'=1}^F ( C^{f,+}_{j,1,f'} − C^{f,-}_{j,ℓ,f'} ) x^{f'} ≥ 0 for all f, j, ℓ ].  (39)

Assume further that for all f = 1,…,F:

• μ^f ≥ 0 and G^f μ^f ≤ 0 imply
  Σ_{j=1}^{m_f} μ^f_j [ min_{1≤ℓ≤K^{f,-}_j} max( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1}, 0 ) + min_{1≤ℓ≤K^{f,-}_j} min( g^{f,-}_{j,ℓ} − g^{f,+}_{j,1}, 0 ) + min_{2≤ℓ≤K^{f,+}_j} min( g^{f,+}_{j,1} − g^{f,+}_{j,ℓ}, 0 ) ] ≥ 0;

• η^f ≥ 0 and (G^f)^T η^f ≥ 0 imply (η^f)^T e^f ≥ 0.

Then Lemke's algorithm will successfully compute a solution of the LCP^{+,pull}_VF in a finite number of iterations.

Proof. We resume the above derivation. Since X contains the origin, it follows that b^f ≥ 0 for all f = 1,…,F. Furthermore, from 0 = z^T M^+ z, with the latter quadratic form equal to

  Σ_{f=1}^F { (x^f)^T Σ_{f'=1}^F Q^{ff'} x^{f'} + Σ_{j=1}^{m_f} [ Σ_{ℓ=2}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} Σ_{f'≠f} ( C^{f,+}_{j,1,f'} − C^{f,+}_{j,ℓ,f'} ) x^{f'} + Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} Σ_{f'≠f} ( C^{f,-}_{j,ℓ,f'} − C^{f,+}_{j,1,f'} ) x^{f'} ] },

in which each parenthesized difference is ≥ 0 by Assumption C^+, and since each summand within the brackets is nonnegative, we deduce in particular that x^T Q x = 0. Since Q is by assumption copositive on the nonnegative orthant in the x-space, it follows that [ Q + Q^T ] x ≥ 0. Hence (39) applies, and for all j = 1,…,m_f and ℓ = 1,…,K^{f,-}_j,

  (G^f_j)^T η^f ≥ s^{f,+}_j + Σ_{f'=1}^F ( C^{f,+}_{j,1,f'} − C^{f,-}_{j,ℓ,f'} ) x^{f'} ≥ 0,

where the last inequality follows from (39). Thus (η^f)^T e^f ≥ 0. Since the tuple ( μ^f_j )_{j=1}^{m_f}, with μ^f_j ≜ Σ_{ℓ=1}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ≥ 0, satisfies G^f μ^f ≤ 0, the desired inequality (37) follows readily. □

Remark 21. If Q is copositive on R^n_+ and strictly copositive on X_∞ (the recession cone of X), then any vector x satisfying the left-hand side of the implication (39) must be zero; hence this implication holds easily in this case.

3.3.4 The LCP^{-,pull}_VF

The LCP of the modified game Ĝ^{-,pull}_VF is defined by the variables

  z^- ≜ { x^f, ( s^{f,-}_j, ( λ̂^{f,+}_{j,ℓ} )_{ℓ=1}^{K^{f,+}_j} )_{j=1}^{m_f}, ν^f, η^f, ( ( λ̂^{f,-}_{j,ℓ} )_{ℓ=2}^{K^{f,-}_j} )_{j=1}^{m_f} }_{f=1}^F

and the complementarity conditions: for all f = 1,…,F,

  0 ≤ x^f ⊥ q^f + Σ_{f'=1}^F Q^{ff'} x^{f'} + (A^f)^T ν^f + Σ_{j=1}^{m_f} Σ_{ℓ=1}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} ( C^{f,-}_{j,1,f} − C^{f,+}_{j,ℓ,f} )^T + Σ_{j=1}^{m_f} Σ_{ℓ=2}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ( C^{f,-}_{j,ℓ,f} − C^{f,-}_{j,1,f} )^T ≥ 0
  0 ≤ ν^f ⊥ b^f − A^f x^f ≥ 0
  0 ≤ s^{f,-}_j ⊥ Σ_{ℓ=1}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} − Σ_{ℓ=2}^{K^{f,-}_j} λ̂^{f,-}_{j,ℓ} ≥ 0, j = 1,…,m_f
  0 ≤ λ̂^{f,-}_{j,ℓ} ⊥ g^{f,-}_{j,1} − g^{f,-}_{j,ℓ} + s^{f,-}_j − Σ_{f'=1}^F ( C^{f,-}_{j,ℓ,f'} − C^{f,-}_{j,1,f'} ) x^{f'} ≥ 0, ℓ = 2,…,K^{f,-}_j, j = 1,…,m_f
  0 ≤ λ̂^{f,+}_{j,ℓ} ⊥ g^{f,-}_{j,1} − g^{f,+}_{j,ℓ} + s^{f,-}_j + Σ_{f'=1}^F ( C^{f,-}_{j,1,f'} − C^{f,+}_{j,ℓ,f'} ) x^{f'} + (G^f_j)^T η^f ≥ 0, ℓ = 1,…,K^{f,+}_j, j = 1,…,m_f
  0 ≤ η^f ⊥ e^f − Σ_{j=1}^{m_f} G^f_j Σ_{ℓ=1}^{K^{f,+}_j} λ̂^{f,+}_{j,ℓ} ≥ 0.  (40)

Corresponding to Assumption C^+ is the following:

Assumption C^- for (32) in Ĝ^{-,pull}_VF:
(a) C^{f,+}_{j,ℓ,f'} ≥ C^{f,-}_{j,1,f'} for all f, f' = 1,…,F with f' ≠ f, j = 1,…,m_f, and ℓ = 1,…,K^{f,+}_j;
(b) C^{f,-}_{j,1,f'} ≥ C^{f,-}_{j,ℓ,f'} for all f, f' = 1,…,F with f' ≠ f, j = 1,…,m_f, and ℓ = 2,…,K^{f,-}_j.

The following is the analog of Theorem 20 for the LCP^{-,pull}_VF.

Theorem 22. Suppose that the matrix Q is copositive on R^n_+ and Assumption C^- holds for (32) of the game Ĝ^{-,pull}_VF.
Suppose further that the set X contains the origin, and that

  [ for all f = 1,…,F: A^f x^f ≤ 0, x^f ≥ 0, Σ_{f'=1}^F ( Q^{ff'} + (Q^{f'f})^T ) x^{f'} ≥ 0 ] implies Σ_{f=1}^F (x^f)^T q^f ≥ 0.  (41)

Assume further that for all f = 1,…,F:

• μ^f ≥ 0 and G^f μ^f ≤ 0 imply
  Σ_{j=1}^{m_f} μ^f_j [ min_{1≤ℓ≤K^{f,+}_j} max( g^{f,-}_{j,1} − g^{f,+}_{j,ℓ}, 0 ) + min_{1≤ℓ≤K^{f,+}_j} min( g^{f,-}_{j,1} − g^{f,+}_{j,ℓ}, 0 ) + min_{2≤ℓ≤K^{f,-}_j} min( g^{f,-}_{j,1} − g^{f,-}_{j,ℓ}, 0 ) ] ≥ 0;  (42)

• M^f contains the origin, or equivalently e^f ≥ 0.

Then Lemke's algorithm will successfully compute a solution of the LCP^{-,pull}_VF in a finite number of iterations.

3.4 No Restriction on g^{f,-}_{j,ℓ}

Theorem 22 is applicable to broad classes of functions U^f(x); an example is

  U^f_j(x) = max_{1≤ℓ≤K^{f,+}_j} ( g^{f,+}_{j,ℓ} + Σ_{f'=1}^F C^{f,+}_{j,ℓ,f'} x^{f'} )  [rival dependent, with C^{f,+}_{j,ℓ,f'} ≥ 0 ∀f' ≠ f]
           − max_{1≤ℓ≤K^{f,-}_j} ( g^{f,-}_{j,ℓ} + C^{f,-}_{j,ℓ,f} x^f )  [private to player f],

with the constants g^{f,±}_{j,ℓ} satisfying the following condition, the analog of (38):

  g^{f,+}_{j,ℓ} ≥ g^{f,-}_{j,1} ∀ℓ = 1,…,K^{f,+}_j and g^{f,-}_{j,1} ≥ g^{f,-}_{j,ℓ} ∀ℓ = 2,…,K^{f,-}_j.

Nevertheless, as the example below shows, the source condition (42) is not satisfied in the min-cost network interdiction game; this motivates us to extend the analysis and seek an alternative condition so that the result can be applied without restriction on the constants g^{f,-}_{j,ℓ} in general and to the network model in particular.

Example 23. The min-cost interdiction game consists of the optimization problems, parameterized by x^{-f} ∈ X^{-f} for f = 1,…,F:

  minimize_{x^f∈X^f} Σ_{a∈A_f} c^f_a(x^f_a) − τ_f cost_min(x), where
  cost_min(x) ≜ max_{π_i, π^0_a} Σ_{i∈N} d_i π_i − Σ_{a∈A} u_a(x) π^0_a
  subject to π_i − π_j − π^0_a ≤ c_a, ∀a ∈ A with start node i and end node j, and π^0_a ≥ 0 ∀a ∈ A, π_i ≥ 0 ∀i ∈ N,

where we have required, without loss of generality, that the nodal variables π_i are all nonnegative, due to the row sum property of the node-arc incidence matrix of a network and the feasibility condition of the net supplies d_i over all nodes summing to zero. Each u_a(x) is given by (23); i.e.,

  u^{sum}_a(x) = max( 0, u^0_a − Σ_{f: a∈A_f} x^f_a ) or u^{max}_a(x) = max( 0, u^0_a − max_{f: a∈A_f} x^f_a ),

each of which can be identified as a u^{f,-}_j(x) with K^{f,-}_j = 2; we also have u^{f,+}_j(x) = 0. For the sum capacity, under the identification u^{f,-}_1(x) = 0 and u^{f,-}_2(x) = u^0_a − Σ_{f: a∈A_f} x^f_a, it can be seen that the implication (42) becomes:

  [ π_i − π_j − π^0_a ≤ 0 ∀a ∈ A with start node i and end node j; π^0_a ≥ 0 ∀a ∈ A; π_i ≥ 0 ∀i ∈ N ] ⇒ Σ_{i∈N} d_i π_i − Σ_{a∈A} π^0_a u^0_a ≥ 0,

which clearly cannot hold even in some simple cases.

The key to the successful applicability of Lemke's algorithm to the min-cost network interdiction game, and more generally to the games Ĝ^{±,pull}_VF, is to rely on the representation (31), which shows that the value function f_f(x) is equal to that of a linear program over a polytope. Assuming that an upper bound B > 0 is known such that Σ_{j=1}^{m_f} μ^{f,t}_j ≤ B for all f = 1,…,F and all t = 1,…,T, we may append the sum constraint

  Σ_{j=1}^{m_f} μ^f_j ≤ B  (43)

to the set M^f, resulting in an augmented matrix Ĝ^f that satisfies the implication

  [ μ^f ≥ 0, Ĝ^f μ^f ≤ 0 ] ⇒ μ^f = 0.

This implication holds because the augmented inequality Ĝ^f μ^f ≤ 0 contains the constraint Σ_{j=1}^{m_f} μ^f_j ≤ 0, which together with the nonnegativity of μ^f forces μ^f = 0.
This leads to the following corollary of Theorem 22, where we continue to use the same matrix G^f with the understanding that it is appended with the bound constraint (43) if needed. No proof of the corollary is required. A similar corollary can be stated for Theorem 20, which we omit.

Corollary 24. Suppose that the matrix Q is copositive on R^n_+ and Assumption C^- holds for (32) of the game Ĝ^{-,pull}_VF. Suppose further that the set X contains the origin and that (41) holds. Assume further that for all f = 1,…,F:

• μ^f ≥ 0 and G^f μ^f ≤ 0 imply μ^f = 0, and
• M^f contains the origin, or equivalently e^f ≥ 0.

Then Lemke's algorithm will successfully compute a solution of the LCP^{-,pull}_VF in a finite number of iterations.

An alternative way to handle the case of unbounded sets M^f is via a sequence of LCPs. Specifically, for each scalar B > 0, we let M^f_B denote the set M^f appended with the bound constraint (43), and let LCP^{-,pull}_{B,VF} denote the resulting LCP. We then have the following proposition, wherein a priori knowledge of B is not needed. Instead, a (finite) sequence of LCPs with bounded M^f_B can be solved to obtain a desired solution of the LCP^{-,pull}_VF by systematically increasing the bound B; see the procedure following the proposition. For simplicity, we state the proposition only with respect to Ĝ^{+,pull}_VF; a similar result holds for the game Ĝ^{-,pull}_VF as well.

Proposition 25. Suppose that the matrix Q is copositive on R^n_+ and Assumption C^+ holds for (32) of the game Ĝ^{+,pull}_VF. Suppose further that the set X contains the origin and that (39) holds. Assume further that for all f = 1,…,F, η^f ≥ 0 and (G^f)^T η^f ≥ 0 imply (η^f)^T e^f ≥ 0. Then for every scalar B > 0, Lemke's algorithm will successfully compute a solution of the LCP^{+,pull}_{B,VF} in a finite number of iterations; moreover, there is a finite B̄ > 0 such that every solution of the LCP^{+,pull}_{B,VF} for B ≥ B̄ is a solution of the LCP^{+,pull}_VF, provided that −∞ < f_f(x) < ∞ for all x ∈ X.

One practical way to convert Proposition 25 and the similar results for Ĝ^{-,pull}_VF into a constructive method for computing Nash equilibria of the pull-out games G^{±,pull}_VF, and thus, per Propositions 14 and 15 respectively, quasi-Nash equilibria of the original games G^±_VF with quadratic first-level objective functions satisfying the required copositivity condition, is as follows (a code sketch of the loop is given at the end of this section). Start with an arbitrary scalar B > 0, solve the respective LCP^{±,pull}_{B,VF}, and check whether the obtained solution z^{B,*} has the property that μ^{B,*,f} ∈ argmax_{μ^f∈M^f} U^f(x^{B,*})^T μ^f for all f = 1,…,F. If so, a desired solution to the respective game is at hand and the procedure stops. Otherwise, double the scalar B and repeat the procedure. By doubling B a finite number of times, this procedure must stop with a desired equilibrium solution, provided that the last condition of Proposition 25 is in place.

To close this section, we mention that the above bounding technique can be applied to the extended games discussed at the end of Subsection 3.1.2, with coupling constraints on the first-level variables x^f and associated price determination, where no a priori bounds are available for the prices p_a, a ∈ A. We omit the details and refer the reader to [73, Section 5], where the technique was termed price truncation.
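The doubling procedure above is little more than a loop around an LCP solver and a certification test; the skeleton below records it explicitly. Both callables are assumed interfaces introduced only for this sketch, not objects defined in the text.

```python
def solve_by_doubling(solve_lcp, certifies, B0=1.0, rounds=30):
    """Sketch of the B-doubling procedure under hypothetical interfaces.
    solve_lcp(B): solve the LCP with the appended bound sum_j mu^f_j <= B
                  (e.g. via Lemke's algorithm) and return the solution z_B.
    certifies(z): check mu^{B,f} in argmax_{mu in M^f} U^f(x^B)^T mu for all f,
                  e.g. by solving one LP per player."""
    B = B0
    for _ in range(rounds):
        z = solve_lcp(B)
        if certifies(z):
            return z          # per Propositions 14/15, a QNE of the original game
        B *= 2.0              # the bound was active: double B and re-solve
    raise RuntimeError("no certified equilibrium within the allotted doublings")
```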
3.5 Conclusion

This work has studied a special class of non-cooperative multi-agent bilevel games in which each agent selfishly optimizes an objective that is the sum of a first-level function and a value function derived from a second-level optimization problem parameterized by the first-level decision variables in a difference-convex manner. We distinguish two versions of such a game: interdiction versus enhancement. While each version has its own practical significance, we have offered a unified treatment with minor variations. Due to the non-convexity of the resulting combined objectives, a quasi-Nash equilibrium is sought, which is in essence a solution of the first-order optimality conditions of the players' single-stage optimization problems. This is achieved by a pull-out idea that leads to a kind of convexification of the non-convex games. Existence of a Nash equilibrium of such a pull-out game is proved under mild assumptions, and a computational procedure for such an equilibrium based on Lemke's algorithm is proposed in the affine case. Overall, the class of games studied herein represents a significant family of EPECs with realistic applications that can be analyzed rigorously and solved by a well-known algorithm with a convergence guarantee.

4 DC Potential Games

In this chapter, we introduce the DC potential games, in which the objective function of each player is a DC function, i.e., the difference of two convex functions. We propose a linearized best-response algorithm to compute a quasi-Nash equilibrium of such games, and show the convergence of the proposed algorithm. This chapter is organized as follows. In Section 4.1, we describe the formulation of the games to be discussed, especially the structure of the objective function. In Section 4.2, we discuss potential games and some special cases of potential games. In Section 4.3, we develop a linearized best-response algorithm to compute a solution of DC potential games with differentiable objective functions, and discuss the convergence of the algorithm. In Sections 4.4 and 4.5, algorithms to compute a quasi-Nash equilibrium of DC potential games with pointwise maximum objective functions and sums of pointwise maximum objective functions are discussed, respectively. In Section 4.6, generalized DC potential games are discussed.

4.1 Game Formulation

Among the general class of non-cooperative games, or NEPs, we are interested in DC games, i.e., games in which each player's objective function is a DC function of the player's own decision variable when the other players' decision variables are fixed ("DC" is short for "difference of convex"). Formally speaking, each player f solves the following optimization problem:

  min_{x^f∈X^f} θ_f(x^f, x^{-f}),

where the objective function θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − v_f(x^f, x^{-f}) is the difference of two functions u_f and v_f, each of which is convex in x^f for fixed x^{-f}. Since a DC game is not a convex game, meaning that the objective function of each player is not convex in its own decision variable, a Nash equilibrium may not exist, and the sequence generated by the best-response algorithm may not converge to a solution. Thus, we will introduce a specific type of DC game, the DC potential game.
One example of a DC game in which a Nash equilibrium does not exist is the following two-player game: player 1 solves

  min_{x∈[1,3]} θ_1(x, y) ≜ −x²(y + 2) − (x − 2)²,

and player 2 solves

  min_{y∈[−3,−1]} θ_2(x, y) ≜ −y²(x − 2) − (y + 2)²,

where x and y are the one-dimensional decision variables of players 1 and 2, respectively. Each player's objective function is a DC function of its own decision variable when the other player's decision variable is fixed. The optimal strategies x^*, y^* are

  x^*(y) = 1 if −3 ≤ y < −2; 1 or 3 if y = −2; 3 if −2 < y ≤ −1,
  y^*(x) = −3 if 2 < x ≤ 3; −1 or −3 if x = 2; −1 if 1 ≤ x < 2,

for players 1 and 2, respectively. The optimal strategies of both players are depicted in Figure 1, where the red lines represent x^* and the green lines represent y^*; since the two graphs do not intersect, a Nash equilibrium of this game does not exist. However, we can show that a quasi-Nash equilibrium of the game exists. Indeed, the pair of strategies x = 2, y = −2 is a QNE: given y = −2, the point x = 2 is a stationary point of player 1's problem, and given x = 2, the point y = −2 is a stationary point of player 2's problem. Thus (2, −2) is a quasi-Nash equilibrium of this DC game, represented by the black point in Figure 1.

[Figure 1: Optimal strategies of the two players (red for player 1 and green for player 2).]

An example of a DC game from information theory can be found in [72, Eq. 1]. Under a single-input-single-output framework, there are several pairs of users and friendly jammers, and one eavesdropper. Each pair of user and jammer wishes to increase its secrecy capacity by allocating the user's and jammer's power across channels so as to increase the signal-to-interference-plus-noise ratio (SINR) at the receiver's position and reduce the SINR at the eavesdropper's position. If we regard each user-jammer pair as a player, then the player's objective function, its secrecy capacity, is a DC function, and such a game is a DC game. Similar examples of DC games, obtained by regarding each user as a player, can be found in [6, 86] in the areas of signal processing, communications, and networking.

4.2 Potential Games

In game theory, a game is said to be a potential game if the incentive of all players to change their strategies (decision variables) can be expressed using a single global function called the potential function. There are many different classes of potential games; the strictest is the exact potential game, and a more relaxed class is the generalized potential game. To define the generalized potential game, we need to introduce a forcing function σ: R_+ → R_+, which satisfies the condition

  lim_{k→∞} σ(t_k) = 0 ⇒ lim_{k→∞} t_k = 0.

The formal definition of a generalized potential game is as follows:

Definition 26. A Nash equilibrium problem is a generalized potential game if there exists a continuous function P(x): R^n → R such that for each player f, all x^{-f} ∈ X^{-f}, and all y^f, z^f ∈ X^f,

  θ_f(y^f, x^{-f}) − θ_f(z^f, x^{-f}) > 0 implies P(y^f, x^{-f}) − P(z^f, x^{-f}) ≥ σ( θ_f(y^f, x^{-f}) − θ_f(z^f, x^{-f}) ),

where σ: R_+ → R_+ is a forcing function.

By definition, the exact potential game is the special case of the generalized potential game with a potential function such that P(y^f, x^{-f}) − P(z^f, x^{-f}) = θ_f(y^f, x^{-f}) − θ_f(z^f, x^{-f}) for all y^f, z^f ∈ X^f when x^{-f} is fixed.
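The exact-potential identity is easy to check numerically for a candidate potential. The sketch below does so on a hypothetical two-player quadratic game (the game and its potential are illustrative choices, not taken from the text).

```python
import numpy as np

# hypothetical game: theta_f(x) = x_f**2 + x_1*x_2 with candidate potential
# P(x) = x_1**2 + x_2**2 + x_1*x_2
theta = lambda f, x: x[f] ** 2 + x[0] * x[1]
P = lambda x: x[0] ** 2 + x[1] ** 2 + x[0] * x[1]

rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.standard_normal(2)
    y_f, z_f, f = rng.standard_normal(), rng.standard_normal(), int(rng.integers(2))
    xy, xz = x.copy(), x.copy()
    xy[f], xz[f] = y_f, z_f
    # unilateral deviation of player f changes theta_f and P by the same amount
    assert np.isclose(theta(f, xy) - theta(f, xz), P(xy) - P(xz))
print("exact-potential identity verified on random samples")
```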
The DC potential games form a broad class, and there are many examples of them. Let us look at several classes of DC potential games.

4.2.1 Common Pointwise Maximum Function

The first type of DC potential game that we discuss is the case in which the function v_f is a pointwise maximum function common to all players, denoted v. Suppose each player's objective function is θ_f(x) = u_f(x) − v(x), where v(x) = max_{1≤i≤I} φ_i(x) is a pointwise maximum function. Each component function φ_i(x) is convex in x^f for fixed x^{-f}, and this holds for all players f and all components i = 1,…,I. Now assume that the u_f part has an exact potential function P_u(x). Then the objective function θ_f(x) has the exact potential function P_u(x) − max_{1≤i≤I} φ_i(x). The reason is that u_f has potential function P_u, and the function v itself is a potential function for the v part common to all players; thus the objective u_f − v has a potential function by the following derivation:

  θ_f(y^f, x^{-f}) − θ_f(z^f, x^{-f})
  = u_f(y^f, x^{-f}) − u_f(z^f, x^{-f}) − ( max_{1≤i≤I} φ_i(y^f, x^{-f}) − max_{1≤i≤I} φ_i(z^f, x^{-f}) )
  = P_u(y^f, x^{-f}) − P_u(z^f, x^{-f}) − ( max_{1≤i≤I} φ_i(y^f, x^{-f}) − max_{1≤i≤I} φ_i(z^f, x^{-f}) )
  = ( P_u(y^f, x^{-f}) − max_{1≤i≤I} φ_i(y^f, x^{-f}) ) − ( P_u(z^f, x^{-f}) − max_{1≤i≤I} φ_i(z^f, x^{-f}) ).

4.2.2 Separable Pointwise Maximum Function

The second type of DC potential game is the case in which v_f is a separable function, namely a pointwise maximum function of player f's own decision variable, v_f(x^f). Suppose θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − v_f(x^f), where v_f(x^f) = max_{1≤i≤I_f} φ^i_f(x^f) is a separable pointwise maximum function, and each component φ^i_f(x^f) is convex in x^f. Now assume u_f(x^f, x^{-f}) has an exact potential function P_u(x). Then the objective θ_f(x) has the exact potential function P_u(x) − Σ_{f=1}^F max_{1≤i≤I_f} φ^i_f(x^f). The reason is that u_f has potential function P_u, and since each v_f is separable, the sum of the v_f is a potential function for the v part; thus u_f − v_f has a potential function by the following derivation:

  θ_f(y^f, x^{-f}) − θ_f(z^f, x^{-f})
  = u_f(y^f, x^{-f}) − u_f(z^f, x^{-f}) − ( max_{1≤i≤I_f} φ^i_f(y^f) − max_{1≤i≤I_f} φ^i_f(z^f) )
  = P_u(y^f, x^{-f}) − P_u(z^f, x^{-f}) − ( max_{1≤i≤I_f} φ^i_f(y^f) − max_{1≤i≤I_f} φ^i_f(z^f) )
  = ( P_u(y^f, x^{-f}) − max_{1≤i≤I_f} φ^i_f(y^f) − Σ_{f'≠f} max_{1≤j≤I_{f'}} φ^j_{f'}(x^{f'}) ) − ( P_u(z^f, x^{-f}) − max_{1≤i≤I_f} φ^i_f(z^f) − Σ_{f'≠f} max_{1≤j≤I_{f'}} φ^j_{f'}(x^{f'}) ).

4.2.3 Stochastic Separable Recourse Function

As an example of the above separable case, when v_f is a stochastic recourse function that depends only on player f's own decision variable, it is also a separable pointwise maximum function. Suppose that θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − v_f(x^f), where v_f(x^f) = (1/L_f) Σ_{j=1}^{L_f} ϕ_f(x^f, ω_j), and the component function is the stochastic recourse value function

  ϕ_f(x^f, ω) ≜ min_{y^f} (d^f)^T y^f subject to C^f(ω) x^f + D^f y^f ≥ ξ^f(ω)
            = max_{π^f} ( ξ^f(ω) − C^f(ω) x^f )^T π^f subject to (D^f)^T π^f = d^f, π^f ≥ 0.
Note that the second equality above is derived by formulating the dual of the inner optimization problem; π^f is the multiplier of the primal inner problem. Thus, we can reformulate the objective function of each player f and get

  θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − (1/L_f) Σ_{j=1}^{L_f} max_{π^f} { ( ξ^f(ω_j) − C^f(ω_j) x^f )^T π^f : (D^f)^T π^f = d^f, π^f ≥ 0 }
                  = u_f(x^f, x^{-f}) − (1/L_f) Σ_{j=1}^{L_f} max_{π^f∈Π^f} ( ξ^f(ω_j) − C^f(ω_j) x^f )^T π^f.

Here we denote by Π^f the set of vertices of the feasible set { π^f : (D^f)^T π^f = d^f, π^f ≥ 0 }. The second equality holds because the inner maximization is a linear program in π^f, whose optimal value is attained at one of the vertices of the feasible polyhedron. So, by the property of separable pointwise maximum functions, a potential function for the objective θ_f(x^f, x^{-f}) is P_u(x) − Σ_f (1/L_f) Σ_{j=1}^{L_f} max_{π^f∈Π^f} ( ξ^f(ω_j) − C^f(ω_j) x^f )^T π^f.

4.2.4 Stochastic Common Recourse Function

Different from the previous stochastic separable recourse function, when the concave part of the objective is a stochastic recourse function common to all players, i.e., v = v_1 = v_2 = ⋯ = v_F, the function v is a special case of the functions discussed in 4.2.1, i.e., common pointwise maximum functions. Suppose that θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − v(x), where v(x) = (1/L) Σ_{j=1}^L ϕ(x, ω_j), and the component function is the stochastic recourse value function

  ϕ(x, ω) ≜ min_y d^T y subject to C(ω) x + D y ≥ ξ(ω)
          = max_π ( ξ(ω) − C(ω) x )^T π subject to D^T π = d, π ≥ 0.

Again, the second equality is derived by formulating the dual of the inner problem, with π the multiplier of the primal inner problem. Thus, we can reformulate the objective function of each player f and get

  θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − (1/L) Σ_{j=1}^L max_{π∈Π} ( ξ(ω_j) − C(ω_j) x )^T π,

where Π denotes the set of vertices of the feasible set { π : D^T π = d, π ≥ 0 }, and the equality again holds because the inner maximization is a linear program whose optimal value is attained at a vertex. So, by the property of common pointwise maximum functions, a potential function for θ_f(x^f, x^{-f}) is P_u(x) − (1/L) Σ_{j=1}^L max_{π∈Π} ( ξ(ω_j) − C(ω_j) x )^T π.

4.2.5 Stochastic Quadratic Recourse Function

Consider the case in which the objective function of each player f is θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) + v(x), where the function v(x) is a quadratic recourse function [65] common to all players. Suppose that v(x) = (1/L) Σ_{j=1}^L ϕ(x, ω_j), and the component function is the quadratic recourse value function

  ϕ(x, ω) ≜ min_z ( d(ω) + G(ω) x )^T z + (1/2) z^T Q z subject to C(ω) x + D z ≥ ξ(ω),

where the matrix Q is assumed to be positive definite. It has been shown in [65] that when Q is positive definite, the value function ϕ(x, ω) is equal to

  ϕ(x, ω) = −(1/2) ( d(ω) + G(ω) x )^T Q^{-1} ( d(ω) + G(ω) x ) + min_y { (1/2) y^T Q y : C(ω) x + D y ≥ ξ(ω) + D Q^{-1} ( d(ω) + G(ω) x ) },

and the minimum value function is convex with respect to x. Thus, we can reformulate the objective function of each player f and get

  θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) + (1/L) Σ_{j=1}^L [ −(1/2) ( d(ω_j) + G(ω_j) x )^T Q^{-1} ( d(ω_j) + G(ω_j) x ) + min_{y_j∈Y(x,ω_j)} (1/2) y_j^T Q y_j ],

where the feasible set is Y(x, ω) ≜ { y : C(ω) x + D y ≥ ξ(ω) + D Q^{-1} ( d(ω) + G(ω) x ) }. We can see that the objective function θ_f is a DC function in x^f for fixed x^{-f}: both u_f(·, x^{-f}) and the minimum value functions are convex in x^f for fixed x^{-f}, while the quadratic terms are concave in x^f for fixed x^{-f}. Moreover, suppose the function u_f has a potential function P_u. Then, by the property of common pointwise maximum functions, a potential function for θ_f(x^f, x^{-f}) is

  P_u(x) + (1/L) Σ_{j=1}^L [ −(1/2) ( d(ω_j) + G(ω_j) x )^T Q^{-1} ( d(ω_j) + G(ω_j) x ) + min_{y_j∈Y(x,ω_j)} (1/2) y_j^T Q y_j ].

Thus, the stochastic quadratic recourse game is a DC potential game. The formulation of stochastic quadratic recourse functions is consistent with the formulation of the QP-based recourse functions in [77]. As mentioned in [77], such models have been widely used in the area of electricity markets, including a scenario-based dynamic oligopolistic problem under uncertainty [36], capacity expansion in electricity markets [41, 27, 40], and two-settlement markets with uncertainties in the future market [49, 112].
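The dual reformulation used in Subsections 4.2.3 and 4.2.4, strong LP duality with the optimum attained at a vertex of the dual polyhedron, can be checked numerically. Below is a minimal sketch on hypothetical recourse data (the matrices C, D and vectors d, ξ, x are illustrative only).

```python
import numpy as np
from scipy.optimize import linprog

# hypothetical recourse data for  phi(x) = min_y { d^T y : C x + D y >= xi }
D = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
C = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
d = np.array([1.0, 2.0])
xi = np.array([1.0, 1.0, 3.0])
x = np.array([0.5, -0.5])
rhs = xi - C @ x

# primal recourse LP in the free variable y:  D y >= xi - C x
primal = linprog(c=d, A_ub=-D, b_ub=-rhs, bounds=[(None, None)] * 2)
# dual LP: max { (xi - C x)^T pi : D^T pi = d, pi >= 0 }; linprog minimizes,
# so the objective is negated.  Its optimum sits at a vertex of the set Pi.
dual = linprog(c=-rhs, A_eq=D.T, b_eq=d, bounds=[(0, None)] * 3)
print(primal.fun, -dual.fun)   # both 4.5: the two values coincide
```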
4.3 Algorithm for Differentiable DC Potential Games

The best-response algorithm is widely used for computing a solution of Nash equilibrium problems (NEPs). We would therefore like to develop an algorithm, similar to the best-response algorithm, that computes a solution of DC potential games. One concern with the classical best-response algorithm is that it requires the objective function of each player to be convex in its own decision variable, so that the optimization problem of each player is convex and easy to solve. The objective function of each player in a DC potential game, however, is not convex in the player's own decision variable; it is DC (difference of convex). Thus, we wish to linearize the concave part, i.e., the v_f function, where possible, so as to obtain a convex objective during the iterations of the algorithm. A second modification of the classical best-response algorithm is the addition of a regularization term, in order to guarantee convergence. Since the optimization problem of each player is not convex, a Nash equilibrium may not exist; accordingly, our algorithm computes a quasi-Nash equilibrium of the DC potential game.

Suppose the objective function of each player f is θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − v_f(x^f, x^{-f}), where u_f(·, x^{-f}) is convex in x^f for fixed x^{-f}, and v_f(·, x^{-f}) is convex in x^f and differentiable with respect to x^f for fixed x^{-f}. Assume the game is a generalized potential game with potential function P(x^f, x^{-f}). We develop the following algorithm to compute a QNE of the DC potential game.

Algorithm 2 Linearized Best-Response Algorithm for DC Potential Games
1: Choose any feasible starting point x_0 = (x^1_0,…,x^F_0) and a positive regularization parameter τ_0 > 0; set k = 0.
2: If x_k satisfies a suitable termination criterion, stop; otherwise, set x_{k,1} = x_k.
3: For f = 1,…,F, compute a solution x^f_{k+1} of
   min_{x^f∈X^f} u_f(x^1_{k+1},…,x^{f-1}_{k+1}, x^f, x^{f+1}_k,…,x^F_k) − v_f(x^1_{k+1},…,x^{f-1}_{k+1}, x^f_k,…,x^F_k)
               − ∇_{x^f} v_f(x^1_{k+1},…,x^{f-1}_{k+1}, x^f_k,…,x^F_k)^T ( x^f − x^f_k ) + τ_k ‖x^f − x^f_k‖².
   (The value of v_f at the linearization point is constant in x^f and does not affect the minimizer.) End for.
4: Set τ_{k+1} = max{ min[ τ_k, max_{f=1,…,F} ‖x^f_{k+1} − x^f_k‖ ], 0.1 τ_k }, x_{k+1} = (x^1_{k+1},…,x^F_{k+1}), k ← k + 1; go to Step 2.

The above algorithm is a sequential (Gauss-Seidel) algorithm. We start from x_0 = (x^1_0,…,x^F_0) and solve a convex problem for each player f in turn, so that each later player is updated with the information of the previous players' new decision variables; this process is executed iteratively. Notice that the algorithm updates the value of the coefficient τ_k. By this updating rule [33], the sequence {τ_k} is non-increasing, and we wish it to converge to 0.
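The following is a minimal Python sketch of Algorithm 2. The interfaces are assumptions made for the sketch (they are not defined in the text): u[f](x) evaluates player f's convex part at the full profile x, grad_v[f](x) returns the partial gradient of v_f with respect to x^f, and bounds[f] describes a box X^f; each convex subproblem is handed to a generic bound-constrained solver rather than a tailored one.

```python
import numpy as np
from scipy.optimize import minimize

def linearized_br(u, grad_v, bounds, x0, tau0=1.0, outer=50, tol=1e-8):
    """Sketch of Algorithm 2 under the assumed interfaces described above."""
    x = [np.asarray(xf, dtype=float) for xf in x0]
    tau = tau0
    for _ in range(outer):
        x_old = [xf.copy() for xf in x]
        for f in range(len(x)):
            # linearize v_f at (x^1_{k+1},...,x^{f-1}_{k+1}, x^f_k,...,x^F_k);
            # at this point x holds exactly that partially updated profile
            g = grad_v[f](x)

            def obj(xf, f=f, g=g, xf_k=x_old[f]):
                trial = list(x)
                trial[f] = xf
                return (u[f](trial) - g @ (xf - xf_k)
                        + tau * np.sum((xf - xf_k) ** 2))

            x[f] = minimize(obj, x[f], bounds=bounds[f]).x
        step = max(np.linalg.norm(x[f] - x_old[f]) for f in range(len(x)))
        tau = max(min(tau, step), 0.1 * tau)   # the Step 4 update rule
        if step < tol:
            break
    return x
```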
The following theorem shows that the sequence {τ_k} converges to 0.

Theorem 27. Suppose that the function v_f is differentiable and convex in x^f for fixed x^{-f}. If there exists a cluster point of the sequence {x_k} generated by Algorithm 2, then (i) lim_{k→∞} τ_k = 0; and (ii) there exists an infinite index set K of iterations such that τ_{k+1} < τ_k.

This theorem shows that, without ever setting τ_k to zero, τ_k is guaranteed to be reduced an infinite number of times, and is thus forced to converge to 0.

Proof. This proof is similar to that of [33, Lemma 5.1]. From Step 4 of Algorithm 2, we have τ_{k+1} ≤ τ_k. To prove that point (i) holds, assume by contradiction that lim_{k→∞} τ_k = 0 is not true; then there exists δ̄ > 0 such that τ_k ≥ δ̄ for all k. For convenience in tracking the players' decision variables within an iteration, denote x_{k,f+1} ≜ (x^1_{k+1},…,x^f_{k+1}, x^{f+1}_k,…,x^F_k); thus x_{k,1} ≜ (x^1_k,…,x^F_k) = x_k and x_{k,F+1} ≜ (x^1_{k+1},…,x^F_{k+1}) = x_{k+1}. By the updating rule of the algorithm, we can derive the following inequalities:

  θ_f(x^f_{k+1}, x^{-f}_{k,f}) + δ̄ ‖x^f_{k+1} − x^f_k‖²
  ≤ θ_f(x^f_{k+1}, x^{-f}_{k,f}) + τ_k ‖x^f_{k+1} − x^f_k‖²  [since τ_k ≥ δ̄]
  = u_f(x^f_{k+1}, x^{-f}_{k,f}) − v_f(x^f_{k+1}, x^{-f}_{k,f}) + τ_k ‖x^f_{k+1} − x^f_k‖²
  ≤ u_f(x^f_{k+1}, x^{-f}_{k,f}) − v_f(x^f_k, x^{-f}_{k,f}) − ∇_{x^f} v_f(x^f_k, x^{-f}_{k,f})^T ( x^f_{k+1} − x^f_k ) + τ_k ‖x^f_{k+1} − x^f_k‖²  [by the convexity of v_f]
  ≤ u_f(x^f_k, x^{-f}_{k,f}) − v_f(x^f_k, x^{-f}_{k,f}) = θ_f(x^f_k, x^{-f}_{k,f})  [by the optimality of x^f_{k+1} in Step 3].

The above inequalities imply that

  θ_f(x^f_{k+1}, x^{-f}_{k,f}) ≤ θ_f(x^f_k, x^{-f}_{k,f}) − δ̄ ‖x^f_{k+1} − x^f_k‖², ∀k, f.  (44)

By the definition of the generalized potential game, this relationship implies that

  P(x^f_k, x^{-f}_{k,f}) − P(x^f_{k+1}, x^{-f}_{k,f}) ≥ σ( θ_f(x^f_k, x^{-f}_{k,f}) − θ_f(x^f_{k+1}, x^{-f}_{k,f}) ) ≥ 0, ∀k, f,  (45)

where P is the potential function for this game. Noting that x_{k,f} = (x^f_k, x^{-f}_{k,f}) and x_{k,f+1} = (x^f_{k+1}, x^{-f}_{k,f}), we can reformulate (45) as

  P(x_{k,f+1}) ≤ P(x_{k,f}).  (46)

From (46), recalling that x_k = x_{k,1} and x_{k+1} = x_{k,F+1}, we get

  P(x_{k+1}) = P(x_{k,F+1}) ≤ P(x_{k,f}) ≤ P(x_{k,1}) = P(x_k).  (47)

Let K̂ ⊆ {0, 1,…} be an infinite index set such that lim_{k→∞, k∈K̂} x_k = x̄. By the continuity of the potential function P and by (47), the full sequence {P(x_k)} converges to a finite value P̄; therefore, by (47), lim_{k→∞} P(x_{k,f}) = P̄ for all f. In turn, taking into account (45), this implies

  lim_{k→∞} σ( θ_f(x^f_k, x^{-f}_{k,f}) − θ_f(x^f_{k+1}, x^{-f}_{k,f}) ) = 0,

and hence, by the definition of the forcing function,

  lim_{k→∞} [ θ_f(x^f_k, x^{-f}_{k,f}) − θ_f(x^f_{k+1}, x^{-f}_{k,f}) ] = 0.

Combining that with (44) gives

  lim_{k→∞} ‖x^f_{k+1} − x^f_k‖ = 0.  (48)

From (48) we get max_{f=1,…,F} ‖x^f_{k+1} − x^f_k‖ < δ̄ for k sufficiently large, which means, together with Step 4, that τ_{k+1} < δ̄, contradicting τ_k ≥ δ̄. Point (ii) can be easily derived from point (i) by using the updating rule in Step 4. □

Having shown that τ_k converges to 0, we next prove the convergence of the algorithm. In the following Theorem 28 we show that every limit point of {x_k}_K is a quasi-Nash equilibrium.

Theorem 28. Suppose that the objective function θ_f(x^f, x^{-f}) is continuous and DC in x^f for fixed x^{-f}, and assume that there exists a cluster point of the sequence {x_k} generated by Algorithm 2. Let K be the infinite subset of iterations defined in point (ii) of Theorem 27. Then any cluster point of the subsequence {x_k}_K is a quasi-Nash equilibrium of the DC potential game.

Proof. The proof is similar to that of [33, Theorem 5.2].
Let K ⊆ {0, 1,…} be the infinite subset of iterations defined in point (ii) of Theorem 27, where τ_{k+1} < τ_k. Then for all k ∈ K we have

  max_{f=1,…,F} { ‖x^1_{k+1} − x^1_k‖,…, ‖x^F_{k+1} − x^F_k‖ } < τ_k.

From the above inequality and from point (i) of Theorem 27, we obtain

  lim_{k→∞, k∈K} ‖x^f_{k+1} − x^f_k‖ = 0, ∀f.  (49)

Let x̄ be any cluster point of the subsequence {x_k}_K. By (49) and the definition of x_{k,f} in the proof of Theorem 27, we also have

  lim_{k→∞, k∈K} x_{k,f} = x̄, ∀f.  (50)

We now show that θ_f(·, x̄^{-f})′(x̄^f; x^f − x̄^f) ≥ 0 for all x^f ∈ X^f. Recalling the definition of x^f_{k+1} in Step 3 and the optimality conditions of the problem solved there, we can derive that

  u_f(·, x^{-f}_{k,f})′(x^f_{k+1}; x^f − x^f_{k+1}) − v_f(·, x^{-f}_{k,f})′(x^f_k; x^f − x^f_{k+1}) + 2 τ_k ( x^f − x^f_{k+1} )^T ( x^f_{k+1} − x^f_k ) ≥ 0  (51)

for all x^f ∈ X^f. Note that the directional derivative v_f(·, x^{-f}_{k,f})′(x^f_k; x^f − x^f_{k+1}) is equal to the product ∇_{x^f} v_f(x^f_k, x^{-f}_{k,f})^T ( x^f − x^f_{k+1} ) and is thus continuous. Taking the limit k → ∞, k ∈ K in the inequality (51), and using [12, Proposition 4.1.2], we can derive

  u_f(·, x̄^{-f})′(x̄^f; x^f − x̄^f) − v_f(·, x̄^{-f})′(x̄^f; x^f − x̄^f) ≥ limsup_{k→∞, k∈K} u_f(·, x^{-f}_{k,f})′(x^f_{k+1}; x^f − x^f_{k+1}) − v_f(·, x̄^{-f})′(x̄^f; x^f − x̄^f) ≥ 0,

and this concludes the proof. □

By proving Theorem 28, we have shown that the linearized best-response algorithm computes a QNE of the DC potential game when v_f is differentiable with respect to x^f for fixed x^{-f}: any accumulation point of the sequence generated by Algorithm 2 is a QNE of the DC potential game. Since the above algorithm and theorem require the v_f part to be differentiable, the natural question is: when v_f is instead a pointwise maximum of several functions differentiable in x^f for fixed x^{-f}, will the linearized best-response algorithm, or some modification of it, compute a QNE of the corresponding DC potential game? To answer this question, we develop a modified linearized best-response algorithm and discuss its convergence in the following Section 4.4.

4.4 Algorithm for Pointwise Maximum DC Potential Games

Now consider the DC potential game in which the objective function of each player f is θ_f(x^f, x^{-f}) = u_f(x^f, x^{-f}) − v_f(x^f, x^{-f}), where u_f(·, x^{-f}) is convex in x^f for fixed x^{-f} and v_f(x^f, x^{-f}) = max_{1≤i≤I} φ_i(x^f, x^{-f}). Each function φ_i(x) is convex and differentiable with respect to x^f for fixed x^{-f}, and this holds for all players f. By this assumption on v_f, we have v_1(x) = v_2(x) = ⋯ = v_F(x), meaning that the function v_f is common to all players. From the discussion in Subsection 4.2.1, we know that if the u_f part admits an exact potential function, then the overall game is an exact potential game. Without loss of generality, we assume this game is a generalized potential game with potential function P(x^f, x^{-f}).

Notice that v_f, being a pointwise maximum of finitely many C¹ functions convex in x^f, is a convex piecewise-smooth function with directional derivative at a point (x^f, x^{-f}) along a direction d^f (the direction with respect to x^f) given by

  v_f(·, x^{-f})′(x^f; d^f) = max_{i∈M(x)} ∇_{x^f} φ_i(x^f, x^{-f})^T d^f, where M(x) ≜ argmax_{1≤i≤I} φ_i(x).

For a given scalar ε > 0, let M_ε(x) ≜ { i : φ_i(x) ≥ v_f(x) − ε }, which is a superset of M(x).
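Computing the ε-active index set is a one-line operation; the following small sketch records it, with a few hypothetical pieces φ_i chosen only for illustration.

```python
import numpy as np

def eps_active(phis, x, eps):
    """M_eps(x) = { i : phi_i(x) >= v(x) - eps }: the indices whose pieces are
    within eps of the pointwise maximum v(x) = max_i phi_i(x)."""
    vals = np.array([phi(x) for phi in phis])
    return np.flatnonzero(vals >= vals.max() - eps)

# hypothetical pieces of v
phis = [lambda x: x[0], lambda x: 2 * x[0] - 1, lambda x: 0.5]
print(eps_active(phis, np.array([1.0]), eps=0.1))   # [0 1]: both pieces active
```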
Inspired by the idea of generating a linearized optimization problem at a point for each index i in the set M_ε(x) [72, Algorithm I], we develop the following algorithm to compute a QNE of this DC potential game.

Algorithm 3 Linearized Best-Response Algorithm for Nondifferentiable DC Potential Games
1: Choose any feasible starting point x_0 = (x^1_0,…,x^F_0) and a positive regularization parameter τ_0 > 0; set k = 0.
2: If x_k satisfies a suitable termination criterion, stop; otherwise, set x_{k,1} = x_k.
3: For f = 1,…,F, and for each index i ∈ M_ε(x^1_{k+1},…,x^{f-1}_{k+1}, x^f_k,…,x^F_k), compute a solution x̂^f_{k+1,i} of
   min_{x^f∈X^f} u_f(x^1_{k+1},…,x^{f-1}_{k+1}, x^f, x^{f+1}_k,…,x^F_k) − ∇_{x^f} φ_i(x^1_{k+1},…,x^{f-1}_{k+1}, x^f_k,…,x^F_k)^T ( x^f − x^f_k ) + τ_k ‖x^f − x^f_k‖².
   Let î ∈ argmin_{i∈M_ε(x^1_{k+1},…,x^{f-1}_{k+1},x^f_k,…,x^F_k)} θ_f(x^1_{k+1},…,x^{f-1}_{k+1}, x̂^f_{k+1,i}, x^{f+1}_k,…,x^F_k) + τ_k ‖x̂^f_{k+1,i} − x^f_k‖², and set x^f_{k+1} = x̂^f_{k+1,î}. End for.
4: Set τ_{k+1} = max{ min[ τ_k, max_{f=1,…,F} ‖x^f_{k+1} − x^f_k‖ ], 0.1 τ_k }, x_{k+1} = (x^1_{k+1},…,x^F_{k+1}), k ← k + 1; go to Step 2.

Notice that in this algorithm each player solves several optimization problems, one for each index in the set M_ε(x). The player then selects the best among the resulting candidate solutions as its strategy at this iteration.

Theorem 29. Suppose that each function φ_i is differentiable and convex in x^f for fixed x^{-f}. If there exists a cluster point of the sequence {x_k} generated by Algorithm 3, then (i) lim_{k→∞} τ_k = 0; and (ii) there exists an infinite index set K of iterations such that τ_{k+1} < τ_k.

This theorem shows that, without ever setting τ_k to zero, τ_k is guaranteed to be reduced an infinite number of times, and is thus forced to converge to 0.

Proof. The proof is similar to [72, Proposition 11]. From Step 4 of Algorithm 3, we have τ_{k+1} ≤ τ_k. To prove that point (i) holds, assume by contradiction that τ_k → 0 is not true; then there exists δ̄ > 0 such that τ_k ≥ δ̄ for all k. Using the same notation as before, denote x_{k,f+1} ≜ (x^1_{k+1},…,x^f_{k+1}, x^{f+1}_k,…,x^F_k); thus x_{k,1} = x_k and x_{k,F+1} = x_{k+1}. By the updating rule of Algorithm 3, we can derive the following inequalities:

  θ_f(x^f_{k+1}, x^{-f}_{k,f}) + δ̄ ‖x^f_{k+1} − x^f_k‖²
  ≤ θ_f(x^f_{k+1}, x^{-f}_{k,f}) + τ_k ‖x^f_{k+1} − x^f_k‖²  [since τ_k ≥ δ̄]
  = u_f(x^f_{k+1}, x^{-f}_{k,f}) − max_i φ_i(x^f_{k+1}, x^{-f}_{k,f}) + τ_k ‖x^f_{k+1} − x^f_k‖²  [by the definition of θ_f]
  ≤ u_f(x̂^f_{k+1,i}, x^{-f}_{k,f}) − max_j φ_j(x̂^f_{k+1,i}, x^{-f}_{k,f}) + τ_k ‖x̂^f_{k+1,i} − x^f_k‖², ∀i ∈ M(x^f_k, x^{-f}_{k,f})
  ≤ u_f(x̂^f_{k+1,i}, x^{-f}_{k,f}) − φ_i(x̂^f_{k+1,i}, x^{-f}_{k,f}) + τ_k ‖x̂^f_{k+1,i} − x^f_k‖², ∀i ∈ M(x^f_k, x^{-f}_{k,f})
  ≤ u_f(x̂^f_{k+1,i}, x^{-f}_{k,f}) − φ_i(x^f_k, x^{-f}_{k,f}) − ∇_{x^f} φ_i(x^f_k, x^{-f}_{k,f})^T ( x̂^f_{k+1,i} − x^f_k ) + τ_k ‖x̂^f_{k+1,i} − x^f_k‖², ∀i ∈ M(x^f_k, x^{-f}_{k,f})
  ≤ u_f(x^f_k, x^{-f}_{k,f}) − φ_i(x^f_k, x^{-f}_{k,f}) = θ_f(x^f_k, x^{-f}_{k,f}), ∀i ∈ M(x^f_k, x^{-f}_{k,f}).

The second inequality is obtained by the selection rule of x^f_{k+1} among the candidate solutions x̂^f_{k+1,i} (note that M(x^f_k, x^{-f}_{k,f}) ⊆ M_ε(x^f_k, x^{-f}_{k,f})), while the fourth inequality is derived from the convexity of φ_i(·, x^{-f}_{k,f}); the last step uses the optimality of x̂^f_{k+1,i} in Step 3 together with φ_i(x^f_k, x^{-f}_{k,f}) = max_j φ_j(x^f_k, x^{-f}_{k,f}) for i ∈ M(x^f_k, x^{-f}_{k,f}).
The above inequalities imply that

  θ_f(x^f_{k+1}, x^{-f}_{k,f}) ≤ θ_f(x^f_k, x^{-f}_{k,f}) − δ̄ ‖x^f_{k+1} − x^f_k‖², ∀k, f.  (52)

By the definition of the generalized potential game, this relationship implies that

  P(x^f_k, x^{-f}_{k,f}) − P(x^f_{k+1}, x^{-f}_{k,f}) ≥ σ( θ_f(x^f_k, x^{-f}_{k,f}) − θ_f(x^f_{k+1}, x^{-f}_{k,f}) ) ≥ 0, ∀k, f.  (53)

Noting that x_{k,f} = (x^f_k, x^{-f}_{k,f}) and x_{k,f+1} = (x^f_{k+1}, x^{-f}_{k,f}), we can reformulate (53) as

  P(x_{k,f+1}) ≤ P(x_{k,f}).  (54)

From (54), recalling that x_k = x_{k,1} and x_{k+1} = x_{k,F+1}, we get

  P(x_{k+1}) = P(x_{k,F+1}) ≤ P(x_{k,f}) ≤ P(x_{k,1}) = P(x_k).  (55)

Let K̂ ⊆ {0, 1,…} be an infinite index set such that lim_{k→∞, k∈K̂} x_k = x̄. By the continuity of P and by (55), the full sequence {P(x_k)} converges to a finite value P̄; therefore, again by (55), lim_{k→∞} P(x_{k,f}) = P̄ for all f. In turn, taking into account (53), this implies

  lim_{k→∞} σ( θ_f(x^f_k, x^{-f}_{k,f}) − θ_f(x^f_{k+1}, x^{-f}_{k,f}) ) = 0,

and hence, by the definition of the forcing function,

  lim_{k→∞} [ θ_f(x^f_k, x^{-f}_{k,f}) − θ_f(x^f_{k+1}, x^{-f}_{k,f}) ] = 0.

Combining that with (52) gives

  lim_{k→∞} ‖x^f_{k+1} − x^f_k‖ = 0.  (56)

From (56) we get max_{f=1,…,F} ‖x^f_{k+1} − x^f_k‖ < δ̄ for k sufficiently large, which means, together with Step 4, that τ_{k+1} < δ̄, contradicting τ_k ≥ δ̄. Point (ii) can be easily derived from point (i) by using the updating rule in Step 4. □

Having shown that τ_k converges to 0, we next prove the convergence of the algorithm. In the following Theorem 31 we show that every limit point of {x_k}_K is a quasi-Nash equilibrium. Before discussing convergence, however, we specify a necessary and sufficient condition for a directional stationary point of each player f's optimization problem in the DC potential game with pointwise maximum function v_f.

Proposition 30. A vector x̄^f ∈ X^f is a d-stationary point of player f's optimization problem

  min_{x^f∈X^f} θ_f(x^f, x̄^{-f}) = u_f(x^f, x̄^{-f}) − max_i φ_i(x^f, x̄^{-f})

if and only if, for every i ∈ M(x̄), x̄^f ∈ argmin_{x^f∈X^f} [ u_f(x^f, x̄^{-f}) − ∇_{x^f} φ_i(x̄^f, x̄^{-f})^T ( x^f − x̄^f ) ], or equivalently x̄^f ∈ argmin_{x^f∈X^f} [ u_f(x^f, x̄^{-f}) − ∇_{x^f} φ_i(x̄^f, x̄^{-f})^T ( x^f − x̄^f ) + ‖x^f − x̄^f‖² ].
Let Kf0; 1;:::g be the innite subset of iterations dened at point (ii) of Theorem 29 where k+1 < k : Then for all k2K we have max 1;:::;F fjjx 1 k+1 x 1 k jj;:::;jjx f k+1 x f k jj;:::;jjx F k+1 x F k jjg< k : From the above inequality and from point (i) of Theorem 29, we obtain lim k!1;k2K jjx f k+1 x f k jj = 0; 8f: (57) Let x be any cluster point of the subsequencefx k g K . By (57) and the denition of x k;f in the proof of Theorem 29, we can also have lim k!1;k2K x k;f = x; 8f: (58) It can be derived that M(x f k ;x f k;f ) M( x) M " (x f k ;x f k;f ) for all suciently large k2 K. Therefore, using the updating rule of Algorithm 3 for all i2M( x), we have f (x f k+1 ;x f k;f ) + k jjx f k+1 x f k jj 2 f (^ x f k+1;i ;x f k;f ) + k jj^ x f k+1;i x f k jj 2 u f (x f ;x f k;f ) (' i (x f ;x f k;f ) +r x f' i (x f ;x f k;f ) T (x f x f k )) + k jjx f x f k jj 2 for all x f 2X f . Taking the limit k2K!1, yields f ( x f ; x f )u f (x f ; x f ) (' i ( x f ; x f ) +r x f' i ( x f ; x f ) T (x f x f )) 8x f 2X f ; 8i2M( x): By the convexity of the function u f (x f ; x f ) (' i ( x f ; x f ) +r x f' i ( x f ; x f ) T (x f x f )) in x f , x f is a d-stationary point. And from the result of the Proposition 30, it can be concluded that x is a quasi-Nash equilibrium solution of the potential DC game. 74 By proving the Theorem 31, we have shown that the Algorithm 3 can compute a QNE solution to the DC potential game with a pointwise maximum function v f , and any accumulation point of the sequence generated by algorithm is a QNE of DC potential game. We can extend the case of v f to be a pointwise maximum function one step further to be the sum of several pointwise maximum functions. 4.5 DC Potential Games with Coupling Constraints In the previous sections, we mainly focus on the cases where the games are Nash Games, meaning that each player's feasible set is independent of the other players' decision variables. Inspired by the work in [33], we want to extend the results to the games where there are some coupling constraints. Suppose that the feasible set of player f is X f (x f ),fx f 2X f jg(x f ;x f ) 0g given the other players' strategies x f . Then the optimization problem of each player f is min x f 2X f (x f ) u f (x f ;x f )v f (x f ;x f ); where both u f (x f ;x f ) and v f (x f ;x f ) are convex in x f for xed x f . We can see that the constraint g(x f ;x f ) 0 is a common constraint for all players, and the set X f contains private constraints onx f . To extend the results to the DC potential games with coupling constraints, we will need extend the results from [33, Lemma 4:1] which guarantees the feasibility ofx k;f at each iterationk and each playerf. Note that the dierence between the games discussed here and the games in [33] is the objective function of each player is DC in its own decision variable instead of convex. Thus, we can have the following Lemma 32 which is similar to [33, Lemma 4:1]. Lemma 32. For every k and every f, x k;f is feasible. Proof. Assume that x k;f is feasible. We will rst show that x k;f+1 is feasible. By the denition, x k;f = (x 1 k+1 ; ;x f1 k+1 ;x f k ; ;x F k ) x k;f+1 = (x 1 k+1 ; ;x f k+1 ;x f+1 k ; ;x F k ) The feasibility ofx k;f implies thatx i k+1 2X i for alli2f1; ;f1g,x j k 2X j for allj2ff; ;Fg and that (x 1 k+1 ; ;x f1 k+1 ;x f k ; ;x F k )2 X. By the denition of x f k+1 we have that x f k+1 2X f (x f k;f ), that is x f k+1 2X f , and (x 1 k+1 ; ;x f k+1 ;x f+1 k ; ;x F k )2 X, namely x k;f+1 is feasible. 
This fact, together with x^f_0 ∈ X_f(x^{-f}_0) for all f = 1,…,F, x_{k,1} = x_k, and x_{k,F+1} = x_{k+1}, completes the proof. □

Using this fact, we can extend the results of Sections 4.3 and 4.4 to generalized Nash games. Thus, for DC potential games with coupling constraints, we can apply the linearized best-response algorithm to compute a QNE, with subsequential convergence guaranteed.

5 Augmented Lagrangian Based Algorithm for Monotone Generalized Nash Games

In this chapter, we present and prove the convergence of an augmented Lagrangian based algorithm for computing a solution of the generalized Nash equilibrium problem. A generalized Nash equilibrium problem is an extension of the standard Nash equilibrium problem in which both the objective function and the feasible set of each player depend on the other players' decision variables. This chapter is organized as follows. In Section 5.1, we describe the formulation of the games to be discussed, i.e., monotone games with linear coupling constraints. In Section 5.2, we develop an augmented Lagrangian based algorithm to compute a solution of such games. In Section 5.3, we establish the convergence of the algorithm.

5.1 Problem Statement

Consider an F-player game consisting of the following optimization problems:

  { min_{x^f∈X^f} θ_f(x^f, x^{-f}) subject to Ax = b }_{f=1}^F.  (59)

Here the feasible set of each player is X_f(x^{-f}) ≜ { x^f ∈ X^f ⊆ R^{n_f} : Σ_{f'≠f} A^{f'} x^{f'} + A^f x^f = b }, where X^f contains the private constraints of player f, does not depend on the other players' decision variables, and is a closed convex set. The matrix A is the aggregation of the submatrices A^f, f = 1,…,F. Assuming that each player's objective function θ_f(·, x^{-f}) is convex and differentiable in x^f when x^{-f} is fixed, we collect all the players' partial gradients into the vector function

  G(x) = ( ∇_{x^1} θ_1(x); ∇_{x^2} θ_2(x); …; ∇_{x^F} θ_F(x) ).  (60)

Inspired by the work on monotone games [85], we further assume that the function G(x) is monotone, meaning that for all vectors y and z of the same dimension as x, the inner product

  ( G(y) − G(z) )^T ( y − z ) ≥ 0

is nonnegative. Such games are called monotone games; see [85] for more details. As the objective function θ_f(·, x^{-f}) of each player f is convex with respect to its own decision variable x^f for fixed x^{-f}, and the feasible set X_f(x^{-f}) is a closed convex set for fixed x^{-f}, solutions that satisfy the first-order conditions must be generalized Nash equilibria. Thus, in this chapter we are interested in developing an algorithm to compute a GNE of such games.
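Monotonicity of G can at least be screened numerically by testing the defining inequality on sampled pairs. The sketch below does so on a hypothetical two-player quadratic game; passing the check is evidence, not a proof.

```python
import numpy as np

def is_monotone_sample(G, dim, trials=1000, seed=0, tol=1e-12):
    """Necessary randomized check: (G(y) - G(z))^T (y - z) >= 0 on samples."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        y, z = rng.standard_normal(dim), rng.standard_normal(dim)
        if (G(y) - G(z)) @ (y - z) < -tol:
            return False       # certificate that G is NOT monotone
    return True

# hypothetical quadratic game: G(x) = B x; B + B^T is positive semidefinite,
# hence the affine map G is monotone
B = np.array([[2.0, 1.0], [-1.0, 2.0]])
print(is_monotone_sample(lambda x: B @ x, dim=2))   # True
```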
We introduce a multiplier $\lambda$ (common to all players), corresponding to the shared linear equality constraint, and compute a Nash equilibrium of a subgame that takes the value of the multiplier as input; after that, the value of $\lambda$ is updated. The algorithm is thus a two-loop algorithm: in the inner loop a NE of the subgame is computed, while in the outer loop the value of $\lambda$ is updated. We introduce a constant scalar $c > 0$ as the penalty parameter, $\rho > 0$ as a regularization coefficient, and $\gamma > 0$ as the step size; the augmented Lagrangian based algorithm is then as follows.

Algorithm 4 Augmented Lagrangian based algorithm for monotone games
1: At iteration $k$, compute a Nash equilibrium $x_{k+1} \triangleq (x^f_{k+1})_{f=1}^F$ of the subgame
$$\left\{ \min_{x^f \in X^f} \; \theta_f(x^f, x^{-f}) - \lambda_k^T(Ax - b) + \frac{c}{2}\|Ax - b\|^2 + \frac{\rho}{2}\|x^f - x^f_k\|^2 \right\}_{f=1}^{F}$$
given the values of $\lambda_k$ and $x_k$.
2: Update the multiplier $\lambda$:
$$\lambda_{k+1} = \lambda_k - \gamma\left(\sum_{f=1}^F A^f x^f_{k+1} - b\right)$$
given the value of $x_{k+1}$.
3: Check whether the stopping criterion $\|\lambda_{k+1} - \lambda_k\| \le \varepsilon$ is satisfied for an error threshold $\varepsilon > 0$. If satisfied, stop; otherwise, go to Step 1 with $k \leftarrow k+1$.

We can see that the subgame at Step 1 is a convex game: the objective function of each player $f$ is convex in $x^f$ for fixed $x^{-f}$. Thus a Nash equilibrium of the subgame can be computed by efficient algorithms (subroutines). At Step 1, the algorithm computes a Nash equilibrium of the subgame in which each player's objective is an augmented Lagrangian function; here the multiplier $\lambda$ is fixed at its value from the previous iteration. At Step 2, we update the value of the multiplier $\lambda$ after all players' decision variables $(x^f)_f$ have been updated.
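To make the two-loop structure concrete, here is a minimal Python sketch of the outer loop of Algorithm 4. The inner solve is abstracted into a hypothetical subroutine `solve_subgame` that returns a Nash equilibrium of the regularized convex subgame of Step 1; everything else follows the steps above, with the step size tied to the penalty as $\gamma = c$, the setting used in the convergence analysis below.

```python
import numpy as np

def augmented_lagrangian(solve_subgame, A_blocks, b, x0, lam0,
                         c=1.0, rho=1.0, eps=1e-6, max_iter=1000):
    """Outer loop of Algorithm 4 with step size gamma = c.

    solve_subgame(lam, x_prev) -> list of per-player blocks x^f forming a
        Nash equilibrium of the convex subgame of Step 1 for the given
        multiplier lam and proximal center x_prev (it also uses c and rho).
    A_blocks[f] is the matrix A^f multiplying player f's block in Ax = b.
    """
    gamma = c
    x, lam = x0, lam0
    for _ in range(max_iter):
        x_new = solve_subgame(lam, x)                            # Step 1
        residual = sum(Af @ xf for Af, xf in zip(A_blocks, x_new)) - b
        lam_new = lam - gamma * residual                         # Step 2
        if np.linalg.norm(lam_new - lam) <= eps:                 # Step 3
            return x_new, lam_new
        x, lam = x_new, lam_new
    return x, lam
```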
5.3 Convergence of Augmented Lagrangian Based Algorithm

Now the question is: supposing we have an algorithm that computes a NE of the subgame efficiently, will the augmented Lagrangian based algorithm generate a sequence that converges to a GNE of the original game (59)? To answer this question, we need to examine the convergence of the algorithm. First, we show its subsequential convergence: if a GNE of the generalized Nash equilibrium problem exists, and if the sequence generated by the algorithm has accumulation points, then there is a subsequence converging to a GNE solution. The following Theorem 33 proves the subsequential convergence of the augmented Lagrangian based algorithm.

Theorem 33. If we set $\gamma = c > 0$ in Algorithm 4, and there exists a solution to the generalized Nash equilibrium problem (59), then the sequence $\{y_k\}_k \triangleq \{(x_k, \lambda_k)\}_k$ generated by the algorithm satisfies $\|x_{k+1} - x_k\| \to 0$ and $\|\lambda_{k+1} - \lambda_k\| \to 0$. Moreover, for any GNE solution $y^* \triangleq (x^*, \lambda^*)$ of the monotone game we have $\|y_{k+1} - y^*\|_C \le \|y_k - y^*\|_C$, i.e., the iteration is non-expansive with respect to $y^*$ in terms of the positive definite matrix
$$C \triangleq \begin{bmatrix} \rho I & 0 \\ 0 & \frac{1}{\gamma} I \end{bmatrix},$$
where the $C$-norm is defined by $\|y\|_C \triangleq \sqrt{y^T C y}$. In addition, there exists a subsequence of $\{y_k\}_k$ that converges to a GNE of (59).

Proof. By the necessary condition of a NE of the subgame and the updating rule of $\lambda$, we obtain the inequalities
$$\left[\nabla_{x^f}\theta_f(x^f_{k+1}, x^{-f}_{k+1}) - A^{fT}\lambda_k + cA^{fT}(Ax_{k+1} - b) + \rho(x^f_{k+1} - x^f_k)\right]^T(x^f - x^f_{k+1}) \ge 0 \quad \forall x^f \in X^f,$$
$$\left(\lambda_{k+1} - \lambda_k + \gamma(Ax_{k+1} - b)\right)^T(\lambda - \lambda_{k+1}) \ge 0 \quad \forall \lambda \in \mathbb{R}^m. \tag{61}$$
Now define the vector $\bar{b} \triangleq \begin{bmatrix} 0 \\ b \end{bmatrix}$ and the vector function
$$H(y) = \begin{bmatrix} G(x) - A^T\lambda \\ Ax \end{bmatrix},$$
which includes the function $G(x)$ in (60) and a component with respect to the multiplier $\lambda$. Substituting $\lambda_{k+1} = \lambda_k - \gamma(Ax_{k+1} - b)$ and writing $y \triangleq (x, \lambda)$, we can derive an equivalent formulation of the inequalities (61):
$$(y - y_{k+1})^T\left[H(y_{k+1}) + C(y_{k+1} - y_k) - \bar{b}\right] \ge 0 \quad \forall y \in Y, \tag{62}$$
where $Y$ is the feasible set of $y$, namely $Y \triangleq \prod_{f=1}^F X^f \times \mathbb{R}^m$. If there exists a GNE solution $y^*$ of the monotone game (59), we have the variational inequality
$$(y - y^*)^T\left[H(y^*) - \bar{b}\right] \ge 0 \quad \forall y \in Y. \tag{63}$$
Substituting $y = y^*$ into (62) and $y = y_{k+1}$ into (63), we get
$$(y^* - y_{k+1})^T\left[H(y_{k+1}) + C(y_{k+1} - y_k) - \bar{b}\right] \ge 0 \quad \text{and} \quad (y_{k+1} - y^*)^T\left[H(y^*) - \bar{b}\right] \ge 0.$$
Summing these two inequalities gives
$$(y_{k+1} - y^*)^T\left[H(y^*) - H(y_{k+1}) - C(y_{k+1} - y_k)\right] \ge 0,$$
which implies
$$(y_{k+1} - y^*)^T C (y_{k+1} - y_k) \;\le\; -(y_{k+1} - y^*)^T\left(H(y_{k+1}) - H(y^*)\right) \;\le\; 0. \tag{64}$$
The right-hand bound in (64) follows from the monotonicity of $H(y)$: for any $y_1 = (x_1, \lambda_1)$ and $y_2 = (x_2, \lambda_2)$, $(H(y_1) - H(y_2))^T(y_1 - y_2) = (G(x_1) - G(x_2))^T(x_1 - x_2)$, since the cross terms in $\lambda$ cancel, so the monotonicity of $G(x)$ implies that of $H(y)$. Adding and subtracting the vector $y^*$ on the left-hand side of (64), we get
$$(y_{k+1} - y^*)^T C (y_{k+1} - y^* + y^* - y_k) \le 0,$$
which can be reformulated as
$$(y_{k+1} - y^*)^T C (y_{k+1} - y^*) \;\le\; (y_{k+1} - y^*)^T C (y_k - y^*). \tag{65}$$
As the matrix $C$ is positive definite, applying the Cauchy-Schwarz inequality yields
$$\|y_{k+1} - y^*\|_C \le \|y_k - y^*\|_C.$$
Since $\|y_{k+1} - y^*\|_C$ is bounded below by 0, the sequence $\{\|y_{k+1} - y^*\|_C\}$ converges; in particular, $\{y_k\}_k$ is bounded and an accumulation point exists. To further show that $\|y_{k+1} - y_k\|_C \to 0$, we expand the square and apply (65):
$$\|y_{k+1} - y_k\|_C^2 = \|y_{k+1} - y^*\|_C^2 + \|y_k - y^*\|_C^2 + 2(y_{k+1} - y^*)^T C (y^* - y_k) \le \|y_k - y^*\|_C^2 - \|y_{k+1} - y^*\|_C^2 \to 0,$$
where the convergence to 0 holds because $\{\|y_k - y^*\|_C\}$ converges. So $\|y_{k+1} - y_k\|_C \to 0$, and by the definition of the $C$-norm,
$$\rho\|x_{k+1} - x_k\|^2 + \frac{1}{\gamma}\|\lambda_{k+1} - \lambda_k\|^2 \to 0,$$
meaning that $\|x_{k+1} - x_k\| \to 0$ and $\|\lambda_{k+1} - \lambda_k\| \to 0$.

We have thus shown that, under the monotonicity assumption, if a GNE exists for the original game then the distance between two consecutive iterates $y_k$ decreases to 0 as $k \to \infty$. By the fact that $\|y_{k+1} - y^*\|_C \le \|y_k - y^*\|_C$, we know $\|y_k - y^*\|_C \le \|y_0 - y^*\|_C$ for all $k$, and the sequence $\{y_k\}_k$ is bounded, so an accumulation point of $\{y_k\}_k$ exists. Suppose $y^\infty$ is an accumulation point and $\{y_k\}_{k \in \kappa}$ is a subsequence converging to it. From the inequality (62) we have
$$(y - y_k)^T\left[H(y_k) + C(y_k - y_{k-1}) - \bar{b}\right] \ge 0 \quad \forall y \in Y.$$
Letting $k \in \kappa$ go to infinity and using $\|y_k - y_{k-1}\|_C \to 0$ together with the continuity of $H$, we obtain
$$(y - y^\infty)^T\left[H(y^\infty) - \bar{b}\right] \ge 0 \quad \forall y \in Y,$$
which means that the accumulation point $y^\infty$ of the sequence $\{y_k\}_k$ is a GNE of the original game. This completes the proof of subsequential convergence.

After establishing the subsequential convergence of Algorithm 4, we would like to know whether a limit point exists for the whole sequence generated by Algorithm 4, i.e., whether Algorithm 4 converges sequentially.
Theorem 34. If we set $\gamma = c > 0$ in the algorithm, and there exists a solution to the generalized Nash equilibrium problem (59), then the sequence $\{y_k\}_k \triangleq \{(x_k, \lambda_k)\}_k$ generated by Algorithm 4 converges to a GNE solution.

Proof. We use the same notation as in the proof of Theorem 33. From that proof we know that $\lim_{k\to\infty}\|y_k - y^*\|_C$ exists and $\|y_k - y^*\|_C$ is bounded, so at least one accumulation point of the sequence $\{y_k\}_k$ exists. Suppose $y^\infty$ is an accumulation point and $\{y_k\}_{k\in\kappa}$ is a subsequence converging to it, so that $\lim_{k(\in\kappa)\to\infty}\|y_k - y^\infty\|_C = 0$. By the fact that $\lim_{k\to\infty}\|y_{k+1} - y_k\|_C = 0$ derived in the proof of Theorem 33, $y^\infty$ is a solution of the original generalized Nash equilibrium problem (59). Indeed, for an element $y_{k+1}$ of the subsequence, Step 1 of Algorithm 4 gives
$$(y - y_{k+1})^T\left[H(y_{k+1}) + C(y_{k+1} - y_k) - \bar{b}\right] \ge 0 \quad \forall y \in Y, \tag{66}$$
and letting $k$ go to infinity this converges to
$$(y - y^\infty)^T\left[H(y^\infty) - \bar{b}\right] \ge 0 \quad \forall y \in Y. \tag{67}$$
By the fact that $\|\lambda_{k+1} - \lambda_k\| \to 0$ from Theorem 33 and by Step 2 of Algorithm 4, the iterates approach feasibility as $k \to \infty$, i.e., $y^\infty$ is feasible. The feasibility of $y^\infty$ together with (67) imply that $y^\infty$ is a solution of the generalized Nash equilibrium problem. Substituting $y = y^\infty$ into (66) and $y = y_{k+1}$ into (67), we have
$$(y^\infty - y_{k+1})^T\left[H(y_{k+1}) + C(y_{k+1} - y_k) - \bar{b}\right] \ge 0 \quad \text{and} \quad (y_{k+1} - y^\infty)^T\left[H(y^\infty) - \bar{b}\right] \ge 0.$$
Summing these two inequalities, we get
$$(y_{k+1} - y^\infty)^T\left[H(y^\infty) - H(y_{k+1}) - C(y_{k+1} - y_k)\right] \ge 0,$$
which implies
$$(y_{k+1} - y^\infty)^T C (y_{k+1} - y_k) \;\le\; -(y_{k+1} - y^\infty)^T\left(H(y_{k+1}) - H(y^\infty)\right) \;\le\; 0. \tag{68}$$
The right-hand bound in (68) follows from the monotonicity of the function $H(y)$. Adding and subtracting the vector $y^\infty$ on the left-hand side of (68), we get
$$(y_{k+1} - y^\infty)^T C (y_{k+1} - y^\infty + y^\infty - y_k) \le 0,$$
which can be reformulated as
$$(y_{k+1} - y^\infty)^T C (y_{k+1} - y^\infty) \le (y_{k+1} - y^\infty)^T C (y_k - y^\infty).$$
Thus, applying the Cauchy-Schwarz inequality,
$$\|y_{k+1} - y^\infty\|_C \le \|y_k - y^\infty\|_C.$$
Since $\|y_k - y^\infty\|_C$ is bounded below by 0, the sequence $\{\|y_k - y^\infty\|_C\}_k$ converges and $\lim_{k\to\infty}\|y_k - y^\infty\|_C$ exists. As $\lim_{k(\in\kappa)\to\infty}\|y_k - y^\infty\|_C = 0$, the overall sequence must satisfy $\lim_{k\to\infty}\|y_k - y^\infty\|_C = 0$. So $\|y_k - y^\infty\|_C \to 0$, and by the definition of the $C$-norm,
$$\rho\|x_k - x^\infty\|^2 + \frac{1}{\gamma}\|\lambda_k - \lambda^\infty\|^2 \to 0,$$
meaning that $\|x_k - x^\infty\| \to 0$ and $\|\lambda_k - \lambda^\infty\| \to 0$.

Having proved the convergence, we now relate Algorithm 4 to the generalized proximal point algorithm in [31]. To establish the relationship, we formulate the variational inequality of problem (59): for a GNE solution $(x^*, \lambda^*)$,
$$\begin{bmatrix} G(x^*) - A^T\lambda^* \\ Ax^* - b \end{bmatrix}^T \begin{bmatrix} x - x^* \\ \lambda - \lambda^* \end{bmatrix} \ge 0 \quad \forall x \in \prod_{f=1}^F X^f,\; \forall \lambda \in \mathbb{R}^m.$$
Thus Algorithm 4 is a special case of the generalized proximal point algorithm applied to the generalized Nash equilibrium problem (59). Finally, we show that if the original generalized Nash equilibrium problem has no solution, then the sequence generated by Algorithm 4 is unbounded; this is stated in the following Proposition 35.

Proposition 35. If a GNE does not exist for the generalized Nash equilibrium problem (59), then the sequence $\{y_k\}_k \triangleq \{(x_k, \lambda_k)\}_k$ generated by Algorithm 4 is unbounded.

Proof. This follows by applying part (b) of [31, Theorem 12.3.9] to Algorithm 4, which completes the proof.

6 Numerical Results

In this chapter, we report numerical experiments on Lemke's method and the best-response algorithm.
Lemke's method is applied to network interdiction games in order to examine the computation speed and convergence of the method. We also apply the best-response algorithm to two-stage stochastic games, Nash-Cournot games and network enhancement games.

6.1 Lemke's Method on Network Interdiction Games

We have applied Lemke's method [21, Section 4.4] to compute quasi-Nash equilibria of the min-cost network interdiction game described in Subsection 3.1.2, with two choices of the matrix Q: the zero matrix and the identity matrix. With the former choice, each first-level objective function $\varphi_f$ is linear; with the latter, the first-level objective function $\varphi_f(x^f, x^{-f})$ is strongly convex in $x^f$. The LCP formulation of the latter game with sum-form interdiction is solved using the NEOS interface to the complementarity solver PATH, with AMPL as the programming language (https://neos-server.org/neos/solvers/cp:PATH/AMPL.html). We set the appropriate options in the solver so that Lemke's method is applied.

The network is given in Figure 2, where the nodes are labelled by letters and the arcs by numbers. There are 5 interdictors and a common agent, who ships a commodity from a source node s to a destination node t. Sum-form interdiction means that the amount of capacity removed from each arc is the sum of all interdictors' amounts of interdiction applied to that arc. The initial arc capacities and the agent's transportation costs on the arcs are listed in Table 1; the interdiction costs of each interdictor on the arcs are listed in Table 2. The interdictors' budgets are 1.0, 1.1, 0.8, 0.6 and 0.6, respectively. The supply at the source node s is 10, and the demand at the sink node t is 10.

Figure 2: Nodes and arcs in the network

Arc                        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Initial capacity           4  18   8  10  15  10  12   8  10  12  11   8  10  10   5  18   5
Unit transportation cost   1   6   1   7   2   2   3   5   4   1   2   3   3   3   3   6   4

Table 1: Initial arc capacities and agent's shipment costs

Arc        1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17
Player 1   1    2    2    3    5    3    2    1    5    3    2    4    2    3    1    6    3
Player 2   1    1    3    2    5    4    1    2    3    4    3    3    2    1    3    8    2
Player 3   1    0.5  2    3    5    3    2    1    5    3    2    4    2    3    1    6    3
Player 4   0.5  1    3    2    5    4    1    2    3    4    3    3    2    1    3    8    2
Player 5   1    0.5  2    3    5    3    2    1    5    3    2    4    2    3    1    6    3

Table 2: Interdiction costs on different arcs

In order not to be concerned with the restrictions on the parameters $g^f_{j,l}$, we implemented the iterative bounding procedure of Section 3.4. As mentioned following (25), starting with the agent's (primal) min-cost problem (22), we can formulate the flow conservation constraints as inequalities provided that the nodal supplies/demands sum to zero; the corresponding dual variables $\pi_i$ in (25) are therefore nonnegative. Thus the bound constraint that we add to the latter dual, with $\mu_a$ denoting the arc dual variables, is
$$\sum_{a \in \mathcal{A}} \mu_a + \sum_{i \in \mathcal{N}} \pi_i \le B.$$
With this constraint added to each player's dual feasible set, we obtain an LCP of dimension 720, which is not a trivial size. Since we had no prior knowledge of the bound B, we started with the small value B = 25 and doubled it iteratively until Lemke's method computed a solution that also satisfies the original LCP, i.e., until the solution of the LCP with bound B is also a solution of the LCP without the bound; this successive increase of B terminates in a finite number of steps, as asserted by Proposition 25. In the numerical run, this occurred after doubling the bound B twice, to 100.
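The iterative bounding procedure is straightforward to automate. Below is a minimal Python sketch of the doubling loop, where `solve_lcp_with_bound` (Lemke's method applied to the LCP with bound B) and `satisfies_unbounded_lcp` (the check that the computed point also solves the LCP without the bound) are hypothetical placeholders for the PATH/AMPL calls used in the experiments.

```python
def solve_with_iterative_bounding(solve_lcp_with_bound, satisfies_unbounded_lcp,
                                  B0=25.0, max_doublings=20):
    """Double the artificial dual bound B until the solution of the bounded
    LCP also solves the original LCP; finite termination is asserted by
    Proposition 25."""
    B = B0
    for _ in range(max_doublings):
        z = solve_lcp_with_bound(B)       # Lemke's method on the bounded LCP
        if satisfies_unbounded_lcp(z):    # bound inactive: genuine solution
            return z, B
        B *= 2.0                          # bound may be binding: enlarge it
    raise RuntimeError("bound doubling did not terminate")
```

In the run reported above, this loop returns at B = 100, after two doublings.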
In the obtained solution with Q being the zero matrix, the values of the interdictors' decision variables are given in Table 3; these values are consistent with the network structure.

Arc        1     2  3  4  5  6  7  8  9  10  11  12        13  14  15  16  17
Player 1   1     0  0  0  0  0  0  0  0   0   0  0          0   0   0   0   0
Player 2   0.4   0  0  0  0  0  0  0  0   0   0  0.233333   0   0   0   0   0
Player 3   0.8   0  0  0  0  0  0  0  0   0   0  0          0   0   0   0   0
Player 4   1.2   0  0  0  0  0  0  0  0   0   0  0          0   0   0   0   0
Player 5   0.6   0  0  0  0  0  0  0  0   0   0  0          0   0   0   0   0

Table 3: Amount of arc capacity reduced by each interdictor in the solution for Q = 0

The task of the agent is to transport a certain amount of flow from the source node s to the destination node t so as to satisfy the demand at node t while minimizing the total transportation cost. Starting from s, there is an intermediate node c through which every attractive path passes, since the minimum cost from the source node s to node c is 2, and from s to c there are only two paths: s -> a -> c and s -> b -> c. As the left part of Figure 3 shows, the unit transportation cost for the agent on the path s -> a -> c is 2, while on the path s -> b -> c it is 13. Thus one reasonable strategy for the interdictors is to reduce the capacities on the path s -> a -> c in order to increase the agent's min cost. As the right part of Figure 3 shows, the capacity of arc s -> a is 4 and the capacity of arc a -> c is 8, so interdiction is most effective when applied to the bottleneck capacity of this path, namely the capacity of arc s -> a.

Figure 3: Unit transportation costs and initial arc capacities on the network

In the solution, the capacity of arc s -> a is reduced by interdictor 1 by 1, by interdictor 2 by 0.4, by interdictor 3 by 0.8, by interdictor 4 by 1.2, and by interdictor 5 by 0.6, so the total capacity reduced by all interdictors is exactly 4.0; this fully interdicts arc s -> a (its capacity is reduced to 0), and the path s -> a -> c can no longer be used. Figure 4 shows the remaining capacities after the interdictions. This strategy increases the agent's min cost: though the interdictors do not cooperate with each other, they still achieve a good tuple of strategies at the solution, in the sense of increasing the agent's min cost.

Figure 4: Initial and reduced arc capacities in the network for Q = 0

The results for Q being the identity matrix are displayed in Table 4. In this case the total amount of resources allocated to arc s -> a remains the same as in the case of a zero matrix Q, so arc s -> a is again fully interdicted. However, the resources allocated by each player are spread more widely across arcs than in the solution of Table 3. This is consistent with intuition: with the introduction of the quadratic terms, each player's cost increases when too much resource is allocated to a single arc, so the players tend to spread their resources.

Arc        1          2  3  4  5  6  7  8  9  10  11  12         13  14  15  16  17
Player 1   0.857002   0  0  0  0  0  0  0  0   0   0  0.0357494   0   0   0   0   0
Player 2   0.723259   0  0  0  0  0  0  0  0   0   0  0.12558     0   0   0   0   0
Player 3   0.8        0  0  0  0  0  0  0  0   0   0  0           0   0   0   0   0
Player 4   1.01974    0  0  0  0  0  0  0  0   0   0  0.0300435   0   0   0   0   0
Player 5   0.6        0  0  0  0  0  0  0  0   0   0  0           0   0   0   0   0

Table 4: Amount of arc capacity reduced by each interdictor for the game with identity matrix Q
6.1.1 Multiple Solutions with Different Initial Points

There may be different quasi-Nash equilibrium solutions of the network interdiction game, and one factor that influences the solution obtained by Lemke's method is the initial point given to the PATH solver. For the above example, we selected different starting points and obtained different solutions; one solution that differs from that in Table 3 is listed in Table 5.

Arc        1          2          3  4  5  6  7  8  9  10  11  12  13  14  15  16  17
Player 1   0.999986   0          0  0  0  0  0  0  0   0   0   0   0   0   0   0   0
Player 2   1.09998    0          0  0  0  0  0  0  0   0   0   0   0   0   0   0   0
Player 3   0.349545   0.900885   0  0  0  0  0  0  0   0   0   0   0   0   0   0   0
Player 4   1.19994    0          0  0  0  0  0  0  0   0   0   0   0   0   0   0   0
Player 5   0.35056    0.498855   0  0  0  0  0  0  0   0   0   0   0   0   0   0   0

Table 5: Amount of arc capacity reduced by each interdictor in the solution

One possible explanation for the existence of multiple quasi-Nash equilibrium solutions is the lack of convexity of the mathematical model and the decentralized behavior of the interdictors: though the interdictors have a common agent to interdict, they are modeled as non-cooperative players who take independent actions without communicating with each other. The hierarchical nature of the game, wherein the network agent is modeled as a follower who responds to the interdictors' actions, could be another cause of the multiplicity of solutions; in general, for a leader-follower game, uniqueness of an equilibrium solution, even one of Nash type if it exists, is unlikely. The numerical runs of Lemke's method confirm that different solutions can be reached from different starting points in the PATH solver.

6.1.2 Stochastic Network Interdiction Games

Next, we extend the numerical study of the above network interdiction game to a stochastic setting where the supplies and demands are random. The common agent ships a commodity from node s to node t, and from node b to node g. The initial arc capacities and the agent's transportation costs are listed in Table 6; the interdiction costs of each interdictor on the arcs are those listed in Table 2, and the interdictors' budgets are the same as before.

Arc                        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Initial capacity           9  18  18  20  15  15  14  18  15  12  16  18  16  14  15  18  15
Unit transportation cost   4   6   4   2   2   2   3   5   4   1   2   3   3   3   3   6   4

Table 6: Initial arc capacities and agent's transportation costs

The supplies at the source nodes s and b have normal distributions with means 8 and 2, respectively, and a common standard deviation of 0.1. The demands at the destination nodes t and g equal the supplies at their corresponding source nodes s and b, respectively. For this game with continuously distributed supplies and demands, we solve several sample-average approximated games by taking samples of the demands and supplies from the normal distribution, starting with a problem with just one sample. For this 1-sample problem, the supplies at the source nodes s and b are 8 and 2, respectively, and the demands at the destination nodes t and g equal the supplies of their corresponding source nodes; Lemke's method yields a solution in just 0.006998 seconds. We then take different numbers of samples to approximate the distribution of demands and supplies, and use Lemke's method to compute a solution of each such approximated game.
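As a small illustration of how the sample-average games are generated, the sketch below draws supply scenarios from the stated normal distributions, with the demands at t and g mirroring the supplies at s and b; the scenario dictionaries are an illustrative stand-in for the data passed to the AMPL model.

```python
import numpy as np

def sample_scenarios(n_samples, std=0.1, seed=0):
    """Draw supply scenarios for the source nodes s and b (means 8 and 2);
    the demands at t and g mirror the supplies at s and b, and each
    scenario gets equal probability 1/n_samples in the sample average."""
    rng = np.random.default_rng(seed)
    supply_s = rng.normal(8.0, std, size=n_samples)
    supply_b = rng.normal(2.0, std, size=n_samples)
    return [{"s": ss, "b": sb, "t": ss, "g": sb, "prob": 1.0 / n_samples}
            for ss, sb in zip(supply_s, supply_b)]
```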
The computation time and the common agent's expected cost are listed in Table 7. From this table we can see that as the number of samples increases, the number of variables in the LCP increases, and the computation time increases as well.

Number of samples              1          2          5          10         20
Number of variables in LCP     720        1350       3240       5540       12690
Computation time (seconds)     0.006998   0.017998   0.064989   0.583912   3.821419
Agent's expected cost          122        122.306    120.938    122.945    122.339

Table 7: Agent's transportation cost with standard deviation 0.1

When the number of samples increases from 10 to 20, the computation time grows from under 1 second to several seconds, but it is still acceptable for the 20-sample approximated game. Another interesting observation is that the common agent's expected cost is always around 122, which is the agent's cost in the deterministic network interdiction game. Thus, by taking a relatively small number of samples, we obtain an approximate solution of the stochastic network interdiction game whose objective value is close to that of the stochastic game.

To see the influence of the standard deviation on the computational results, we took two more groups of samples, where the supplies at the source nodes s and b again have normal distributions with means 8 and 2, respectively, but with larger standard deviations than before. The demands at the destination nodes t and g still equal the respective supplies at their corresponding source nodes s and b. The computation time and the common agent's expected cost are listed in Table 8 for the games with standard deviation 0.5 and in Table 9 for the games with standard deviation 1. We can see that Lemke's method handles all these cases with different standard deviations, and that as the standard deviation increases, the agent's expected cost tends to be less stable, which is consistent with our expectation.

Number of samples              1          2          5          10         20
Number of variables in LCP     720        1350       3240       5540       12690
Computation time (seconds)     0.006998   0.017997   0.049993   0.460931   0.939857
Agent's expected cost          122        119.623    119.423    122.374    124.9

Table 8: Agent's transportation cost with standard deviation 0.5

Number of samples              1          2          5          10         20
Number of variables in LCP     720        1350       3240       5540       12690
Computation time (seconds)     0.006998   0.026995   0.045994   0.459930   3.722434
Agent's expected cost          122        114.288    122.794    119.963    119.949

Table 9: Agent's transportation cost with standard deviation 1
6.1.3 Another Example

In the previous examples, the solutions computed by Lemke's method were well explained and conformed with our intuition. However, there are cases in which the obtained solution is not consistent with intuition, since the obtained solutions are only guaranteed to be quasi-Nash equilibria rather than Nash equilibria. Consider another example with exactly the same nodes and arcs as the network in Figure 2, again with 5 interdictors and a common agent; the only difference from the previous example is that the agent's transportation costs on the arcs are no longer those of Table 1 but are instead listed in Table 10.

Arc                        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Unit transportation cost   4   6   4   2   2   2   3   5   4   1   2   3   3   3   3   6   4

Table 10: Agent's transportation costs

One thing we can observe from the network in Figure 5 is that the agent's transportation cost on the path s -> a -> c equals the transportation cost on the path s -> b -> c. Thus it is hard to tell from intuition whether the interdictors will use their resources to interdict s -> a -> c or not. We then use Lemke's method to compute a solution of this example; the values of the interdictors' decision variables are given in Table 11.

Figure 5: Unit transportation costs of the arcs in the network

Arc        1          2           3  4  5  6  7          8          9  10  11  12  13  14  15  16  17
Player 1   0.175792   0.266332    0  0  0  0  0          0.291041   0   0   0   0   0   0   0   0   0
Player 2   0          0.991286    0  0  0  0  0.107449   0          0   0   0   0   0   0   0   0   0
Player 3   0          1.59083     0  0  0  0  0          0          0   0   0   0   0   0   0   0   0
Player 4   1.04575    0.0760611   0  0  0  0  0          0          0   0   0   0   0   0   0   0   0
Player 5   0          1.19054     0  0  0  0  0          0          0   0   0   0   0   0   0   0   0

Table 11: Amount of arc capacity reduced by each interdictor in the solution

It is hard to explain the interdictors' decisions directly from intuition. Figure 6 shows the remaining arc capacities after the interdictions. Thus, the solutions computed by Lemke's method are not always consistent with intuition, especially when it is hard to tell directly from the network which arc should be interdicted.

Figure 6: Initial and remaining arc capacities in the network

6.2 Best-Response Algorithm

We use the best-response algorithm presented in Chapter 2 to study several instances of Nash non-cooperative games. The best-response algorithm is implemented in MATLAB R2016b with CVX version 2.1 [37, 38] as the optimization solver. Computational experiments were carried out on a laptop with a 1.1 GHz Intel Core M processor and 8 GB of memory running macOS Sierra. We apply the best-response algorithm to compute Nash equilibrium solutions of three classes of games: two-stage stochastic games, Nash-Cournot games and network enhancement games; all three classes satisfy assumptions (B.1)-(B.4) made in Theorem 4.
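Before turning to the individual case studies, the following minimal Python sketch shows the best-response loop and the stopping rule $e_t = \|x_t - x_{t-1}\| \le$ tol used in all the experiments reported below; the per-player solve `best_response` is a placeholder for the CVX call.

```python
import numpy as np

def best_response_algorithm(best_response, x0, tol=1e-4, max_iter=100):
    """Cyclic best-response iteration with stopping rule e_t <= tol.

    best_response(f, x) -> minimizer of player f's objective given the
    current profile x (a list of per-player numpy vectors); in the
    experiments this solve is performed by CVX.
    """
    x = [xf.copy() for xf in x0]
    errors = []
    for _ in range(max_iter):
        x_prev = np.concatenate(x)
        for f in range(len(x)):
            x[f] = best_response(f, x)       # update player f in place
        e_t = np.linalg.norm(np.concatenate(x) - x_prev)
        errors.append(e_t)                   # e_t = ||x_t - x_{t-1}||
        if e_t <= tol:
            break
    return x, errors
```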
6.2.1 Two-Stage Stochastic Game

We implement Algorithm 1 to compute a Nash equilibrium solution of a two-stage stochastic game with strong convexity. As an example, we consider the two-stage stochastic game of Section 2.3.4. There are 5 players; each player f = 1, ..., 5 computes an optimal solution of its own optimization problem while anticipating the other players' decision variables. Player f's optimization problem is
$$\min_{x^f \in X^f} \; q_f^T x^f + \frac{1}{2}x^{fT}E^{ff}x^f + x^{fT}\sum_{f' \ne f}E^{ff'}x^{f'} + \sum_j p_j\,\psi_f(x^f, x^{-f}, \omega_j),$$
where $x^f$ is the player's first-stage decision variable. The first-stage objective contains a quadratic function $q_f^T x^f + \frac{1}{2}x^{fT}E^{ff}x^f + x^{fT}\sum_{f'\ne f}E^{ff'}x^{f'}$ and the expectation $\sum_j p_j\,\psi_f(x^f,x^{-f},\omega_j)$ of the second-stage objective. A random variable $\omega_j$ is associated with each scenario j in the second-stage objective $\psi_f(x^f,x^{-f},\omega_j)$, which is a pointwise minimum function of the form
$$\psi_f(x^f,x^{-f},\omega_j) \;\triangleq\; \min_{z^f}\;\left[c^f(\omega_j) + \sum_{f'\ne f}G^{ff'}(\omega_j)\,x^{f'}\right]^T z^f + \frac{1}{2}z^{fT}Q^{ff}z^f \quad \text{s.t.}\;\; A^{ff}x^f + D^f z^f \ge b^f(\omega_j).$$
In this second-stage optimization problem, the random variable $\omega_j$ appears in both the objective function and the constraints, and $z^f$ is the second-stage decision variable. The second-stage objective depends on the other players' first-stage decision variables $x^{-f}$, while the constraint depends only on the player's own first-stage decision variable. Note that in the above inner optimization problem, the constraint does not depend on the other players' decision variables $x^{-f}$, which differs from the problem in Section 2.3.4; we benefit from this difference in the implementation of the best-response algorithm. Since Algorithm 1 is an algorithmic framework requiring an efficient solver (subroutine) for the optimization problems arising at each iteration, and the problem we face here is a stochastic program, we use a technique similar to sample average approximation: we pull out the second-stage decision variables $z^f$ and compute the solutions for $x^f$ as well as for $z^f$ across the scenarios, which reformulates the problem as
$$\min_{x^f \in X^f,\,(z^f_j)_j} \; q_f^Tx^f + \frac{1}{2}x^{fT}E^{ff}x^f + x^{fT}\sum_{f'\ne f}E^{ff'}x^{f'} + \sum_j p_j\left[\Big(c^f(\omega_j) + \sum_{f'\ne f}G^{ff'}(\omega_j)x^{f'}\Big)^T z^f_j + \frac{1}{2}z^{fT}_jQ^{ff}z^f_j\right]$$
$$\text{s.t.}\;\; A^{ff}x^f + D^f z^f_j \ge b^f(\omega_j) \quad \forall j.$$

Consider the following instance: there are 20 scenarios and 5 players, and each player must prepare goods in two stages in order to meet a stochastic demand at the end of the second stage. The amounts of goods bought in the first stage are the first-stage decision variables, while the amounts bought in the second stage are the second-stage decision variables. At the first stage, each player only knows the costs of buying those goods, which depend on both his/her own and the other players' first-stage decision variables. There are 20 possibilities (scenarios) at the second stage; each scenario may carry different costs for the goods bought in the second stage and different demands to be satisfied at the end of the second stage, and the second-stage costs also depend on the amounts bought by the other players in the second stage. Thus, at the first stage, each player chooses his/her first-stage decision variable while anticipating the other players' decision variables and the expectation of the second-stage cost, in order to minimize the sum of the first-stage cost and the expected second-stage cost, while ensuring that the amount of goods prepared over the two stages meets the demand. At the second stage, given the goods prepared in the first stage, the other players' first-stage decision variables and the realized scenario, each player solves a second-stage optimization problem and chooses the amount of goods to buy in the second stage.
One interpretation of this instance is the preparation of electricity contracts for future demand in an electricity market. There are 5 factories in the market, and they need to sign contracts with electricity providers, both renewable and conventional, in order to cover their own electricity demands. We consider the procurement of electricity as a two-stage process. In the first stage, each factory signs contracts with renewable electricity providers; the price of electricity at this stage is usually cheaper than in the second stage. In the second stage, once the actual demands are known to the factory, it can sign short-term contracts with conventional electricity providers to meet the demands, which may cost more than the renewable contracts signed in the first stage.

To compute a solution of this instance, we apply the best-response algorithm to the sample-average formulation, i.e., we compute the optimal first-stage and second-stage decision variables simultaneously. As the constraint $A^{ff}x^f + D^fz^f \ge b^f(\omega_j)$ does not involve the other players' decision variables $x^{-f}$, there are no coupling constraints and the overall sample-average problem is a Nash problem; thus we can use the best-response algorithm to compute a Nash equilibrium solution of this sample-average game.

The feasible set of the first-stage decision variable $x^f$ is $X^f$, which contains a private budget constraint and a nonnegativity constraint on $x^f$. The first-stage budgets of the players are positive, which makes the feasible set of $x^f$ nonempty. The feasible sets of the second-stage decision variables $z^f_j$, on the other hand, depend on the player's own first-stage decision variable $x^f$. We set the constraint matrix $A^{ff}$ to the identity matrix, the components of the demands $b^f(\omega_j)$ to positive values, and the matrix $D^f$ to the identity matrix; this is because the amount of goods prepared across the first and second stages should exceed the demand $b^f(\omega_j)$. The feasible sets of the second-stage decision variables $z^f_j$ are therefore nonempty. In the objective function, the vector $q_f$ and the off-diagonal matrices $E^{ff'}$ have positive components, and the matrices satisfy $E^{ff'} = E^{f'f}$ for all $f' \ne f$. This is consistent with the goods-preparation example: the amounts of goods bought by the other players in the first stage influence the player's first-stage cost, and the more goods the other players buy, the higher the cost faced by the player, since the players compete in the market for the goods. The matrix $E^{ff}$ is a diagonal matrix with positive diagonal elements, so $E^{ff}$ is positive definite. As $p_j$ represents the probability of scenario j, it satisfies the nonnegativity constraint $p_j \ge 0$ and the sum-to-one constraint $\sum_j p_j = 1$. For the second-stage cost, the matrices $G^{ff'}$ are random matrices with positive elements, and $Q^{ff}$ is a diagonal matrix with positive diagonal elements; these settings are again consistent with the example, since the more goods a player purchases in the second stage, the more he/she pays. By the positive definiteness of the matrices $E^{ff}$ and $Q^{ff}$, the objective function is convex.

In our example, assumptions (B.1) to (B.4) are satisfied. Since the objective function is $C^1$, the "four-point condition" in (B.1) holds, and since the objective is convex in each player's own decision variable for fixed rival decisions, assumption (B.2) holds as well. Lipschitz continuity follows from the objective being quadratic, so assumption (B.3) is satisfied. Among assumptions (B.1)-(B.4) of Theorem 4, the diagonal dominance of the matrix, assumption (B.4), plays a special role: it determines how fast the sequence generated by the algorithm converges. Since the diagonal matrices $E^{ff}$ and $Q^{ff}$ have positive diagonal elements that are large compared with the elements of the off-diagonal matrices $E^{ff'}$ and $G^{ff'}$, the overall matrix has the diagonal dominance property.
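For concreteness, here is a minimal sketch of one player's pulled-out subproblem written with Python/CVXPY rather than the MATLAB/CVX used in the actual experiments; all array names are illustrative placeholders for the data described above, nonnegative purchase amounts are assumed for both stages, and the budget constraint is written in the simple sum form used in this instance.

```python
import cvxpy as cp

def player_subproblem(q, E_ff, E_cross, G_lin, Q_ff, b_scen, p, budget):
    """Best response of one player in the pulled-out game: first-stage x
    together with one second-stage block z_j per scenario.

    E_cross   : the vector sum_{f' != f} E^{ff'} x^{f'} (rivals fixed)
    G_lin[j]  : the vector c^f(w_j) + sum_{f' != f} G^{ff'}(w_j) x^{f'}
    b_scen[j] : demand vector of scenario j, with probability p[j]
    A^{ff} and D^f are identity matrices here, as in the instance above.
    """
    n, J = len(q), len(p)
    x = cp.Variable(n, nonneg=True)
    z = [cp.Variable(n, nonneg=True) for _ in range(J)]
    first = q @ x + 0.5 * cp.quad_form(x, E_ff) + E_cross @ x
    second = sum(p[j] * (G_lin[j] @ z[j] + 0.5 * cp.quad_form(z[j], Q_ff))
                 for j in range(J))
    cons = [x + z[j] >= b_scen[j] for j in range(J)]  # A^{ff}x + D^f z >= b
    cons.append(cp.sum(x) <= budget)                  # private budget
    prob = cp.Problem(cp.Minimize(first + second), cons)
    prob.solve()
    return x.value, [zj.value for zj in z]
```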
The algorithm terminates after just 15 iterations with an error threshold of $10^{-4}$, where the error $e_t$ at iteration t is defined as $e_t \triangleq \|x_t - x_{t-1}\|$, the norm of the difference between the decision variables of two consecutive iterations. The runtime is 113.4435 seconds, which is fairly fast for a problem with 20 scenarios, 5 players, and 17-dimensional first-stage and second-stage decision variables: there are 1785 decision variables in total. Figure 7 shows how the sequence of errors decreases over the iterations.

Figure 7: Log error vs. number of iterations

In the obtained solution, the values of the players' first-stage decision variables are given in Table 12. Though these values are harder to explain than those of the network interdiction games, we can still extract some insights from them. The task of each player is to prepare goods in two stages so as to meet the demand at the end of the second stage while minimizing the total cost. As the demands are uncertain, while the prices of goods in the first stage are cheaper than those of the second stage, a player may purchase some goods in the first stage and purchase the remaining amount needed to meet the realized demand in the second stage.

Type of goods  1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16    17
Player 1       0   5.1   0    0   2.6   0   1.1  1.1  1.1   0    0    0    0    0    0   19     0
Player 2       0   7.5   0    0   5.4  0.2  4.1  4.1  4.1  0.2  1.1  0.2  1.9  1.9   0   19.1   0
Player 3       0   8.8  1.6  2.2  7.1  2.8  6.0  6.0  6.0  2.8  3.5  2.8  4.2  4.2  1.0  18.6  2.2
Player 4       0   6.8   0    0   4.3   0   2.8  2.8  2.8   0    0    0   0.2  0.2   0   20     0
Player 5       0   7.0  0.9  1.4  5.5  1.9  4.6  4.6  4.6  1.9  2.5  1.9  3.1  3.1  0.4  15.2  1.4

Table 12: Solutions of the first-stage decision variables

To simplify the interpretation of the solution, we look at the values and parameters associated with player 1. In our example, the first-stage budget for player 1 is 30, meaning that player 1 can purchase at most 30 units of goods in the first stage. Player 1's first-stage cost consists of a linear term $q_1^Tx^1$ and a quadratic term $\frac{1}{2}x^{1T}E^{11}x^1$; the linear parameter $q_f$ is set to the zero vector in our example, and the quadratic parameter $E^{11}$ is the identity matrix multiplied by 1.25. The second-stage unit costs, consisting of the linear term $c^{1T}z^1$ and the quadratic term $\frac{1}{2}z^{1T}Q^{11}z^1$, are more expensive than the first-stage ones: the quadratic parameter $Q^{11}$ equals the identity matrix multiplied by 5, and the components of the linear parameter $c^1$ are listed in Table 13.
Type of goods     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Linear unit cost 10  13  20  15  20  20  30  50  40  10  20  30  30  30  30  60  40

Table 13: Second-stage linear unit costs of player 1

The demands for the various goods in the second stage are stochastic. In our example, the demands $b^1(\omega_j)$ for the different types of goods in scenario 1 are listed in Table 14; in the other scenarios, the demand for each type of goods differs from its demand in scenario 1.

Type of goods   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Demand          4  18   8  10  15  10  12   8  10  12  11   8  10  10   5  18   5

Table 14: Demands of player 1 for the first scenario

We can see that player 1 prepares many different types of goods in the first stage and uses up the entire budget, as the sum of the components of $x^1$ equals 30. Among the components of $x^1$, the 16th is the largest and the 2nd is the second largest. This is reasonable: the 16th type of goods has the highest unit linear cost of 60 among all goods, and both the 16th and the 2nd types have the highest demand of 18, as shown in Table 14. By allocating most of the first-stage budget to these two types of goods, player 1 reduces his/her second-stage costs and achieves the lowest total cost while satisfying the demands.

The spectral radius of the overall matrix characterizes the diagonal dominance of the problem and determines the convergence rate of our algorithm. To verify this, we tested the algorithm on examples with different spectral radii and recorded the number of iterations needed to reach an error of $10^{-5}$. Figure 8 describes the relationship between the spectral radius and the number of iterations. The results indicate that for two-stage stochastic games satisfying assumptions (B.1)-(B.4), the best-response algorithm computes a Nash equilibrium solution with fast convergence after a small number of iterations, and that the convergence is faster for problems with stronger diagonal dominance, i.e., a smaller spectral radius of the coupling matrix.

Figure 8: Number of iterations required to achieve convergence vs. spectral radius

6.2.2 Stochastic Min-Cost Enhancement Game

After applying the best-response algorithm to two-stage stochastic games, we are interested in whether it can compute solutions of stochastic network-type games, i.e., network games with uncertain demands; we therefore look at stochastic network enhancement games. Unlike the network interdiction games, in a network enhancement game each player attempts to help (enhance) a common agent, in order to minimize the agent's min transportation cost while the demands at certain nodes are satisfied.

Consider the following instance, based on the network in Figure 2, where the letters in the nodes are the node labels and the numbers on the arrows are the arc labels. There are 5 enhancers and a common agent with source node s and destination node t. The enhancement is continuous and in sum-form. The initial arc capacities, denoted $u^0_{edge}$, are listed in Table 15; the enhancement costs of each enhancer on the arcs, denoted $A^f_{edge}$, are listed in Table 16; and the agent's transportation costs on the arcs, denoted $c^f_{edge}(\omega_j)$, are listed in Table 17.
The enhancement budgets of enhancers 1 to 5 are 3, 5, 8, 4 and 6, respectively. The demand at the destination node t is stochastic: there are 20 possibilities, each with probability 0.05, and the demand in each possibility (scenario) is listed in Table 18.

Arc               1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Initial capacity  4  18   8  10  15  10  12   8  10  12  11   8  10  10   5  18   5

Table 15: Initial arc capacities

Arc        1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17
Player 1   1    2    2    3    5    3    2    1    5    3    2    4    2    3    1    6    3
Player 2   1    1    3    2    5    4    1    2    3    4    3    3    2    1    3    8    2
Player 3   1    0.5  2    3    5    3    2    1    5    3    2    4    2    3    1    6    3
Player 4   0.5  1    3    2    5    4    1    2    3    4    3    3    2    1    3    8    2
Player 5   1    0.5  2    3    5    3    2    1    5    3    2    4    2    3    1    6    3

Table 16: Enhancement costs on different arcs

Arc                        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Unit transportation cost   1   6   1   7   2   2   3   5   4   1   2   3   3   3   3   6   4

Table 17: Agent's transportation costs

Scenario   1     2     3     4     5     6     7     8     9    10
Demand    15    14.5  14    13.5  13    12.5  12    11.5  11    10.5
Scenario  11    12    13    14    15    16    17    18    19    20
Demand    10     9.5   9     8.5   8     7.5   7     6.5   6     5.5

Table 18: Agent's demand under different scenarios

The difference between the network enhancement games here and the network interdiction games solved with Lemke's method is that the enhancement enters linearly rather than through a pointwise maximum. Thus the network enhancement game is a convex game, and a Nash equilibrium of such a game exists. In a min-cost enhancement game, all players share a common agent and wish to use their resources to increase the arc capacities of the network in order to reduce the min cost of their common agent. The formulation of the network enhancement game is
$$\min_{x^f \in X^f}\; q_f^Tx^f + \frac{1}{2}x^{fT}E^{ff}x^f + x^{fT}\sum_{f'\ne f}E^{ff'}x^{f'} + \sum_j p_j\,\psi(x^f,x^{-f},\omega_j),$$
where $x^f$ is the enhancer's decision variable, representing how much capacity the enhancer is willing to add to each arc. The feasible set of $x^f$ is $X^f$, which includes a nonnegativity constraint as well as a budget constraint on $x^f$. The sum $q_f^Tx^f + \frac{1}{2}x^{fT}E^{ff}x^f + x^{fT}\sum_{f'\ne f}E^{ff'}x^{f'}$ is the enhancement cost of enhancer f, and $\sum_j p_j\,\psi(x^f,x^{-f},\omega_j)$ is the expected value of the agent's min cost. The function $\psi(x^f,x^{-f},\omega_j)$ is a pointwise minimum function representing the agent's min cost if scenario j is realized, and $p_j$ is the probability of scenario j:
$$\psi(x^f,x^{-f},\omega_j) \;\triangleq\; \min_z\; c^Tz + \frac{1}{2}z^TQz \quad \text{s.t.}\;\; z \le u^0 + \sum_{f'}x^{f'},\quad Dz = b(\omega_j),\quad z \ge 0.$$
The above inner optimization problem is the agent's min-cost problem with decision variable z; the first constraint is the arc capacity constraint and the second is the demand constraint. As with the two-stage stochastic game, this stochastic min-cost enhancement game is a bi-level game, and we can apply the sample average approximation technique to it: we pull out the decision variables z of the inner problem and compute the solutions of $x^f$ and z for the optimization problem
$$\min_{x^f,\{z_j\}_j}\; q_f^Tx^f + \frac{1}{2}x^{fT}E^{ff}x^f + x^{fT}\sum_{f'\ne f}E^{ff'}x^{f'} + \sum_j p_j\left[c^Tz_j + \frac{1}{2}z_j^TQz_j\right]$$
$$\text{s.t.}\;\; z_j \le u^0 + \sum_{f'}x^{f'} \;\;\forall j, \qquad Dz_j = b(\omega_j) \;\;\forall j, \qquad z_j \ge 0 \;\;\forall j, \qquad x^f \ge 0, \qquad A^{fT}x^f \le \text{budget}_f.$$
Here the variables $z_j$ are the decision variables of the common agent under the different scenarios, and $x^f$ is the decision variable of enhancer f.
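For a fixed total enhancement, the agent's inner problem is a small convex quadratic min-cost program. A minimal CVXPY sketch, with illustrative array names, is:

```python
import cvxpy as cp

def agent_min_cost(c, Q, D, u0, x_total, b_j):
    """Agent's min-cost problem psi(., w_j): ship at quadratic cost subject
    to the enhanced capacities u0 + x_total and the demand b(w_j)."""
    z = cp.Variable(len(c), nonneg=True)
    cons = [z <= u0 + x_total,     # enhanced arc capacity constraint
            D @ z == b_j]          # flow conservation / demand constraint
    prob = cp.Problem(cp.Minimize(c @ z + 0.5 * cp.quad_form(z, Q)), cons)
    prob.solve()
    return prob.value, z.value
```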
In our example, the algorithm terminates after 17 iterations with an error threshold of $10^{-5}$, where the error $e_t$ at iteration t is defined as $e_t \triangleq \|x_t - x_{t-1}\|$, the norm of the difference between the decision variables of two consecutive iterations. The runtime is 99.2240 seconds, which is fairly fast for a problem with 20 scenarios, 5 players, and 17-dimensional first-stage and second-stage decision variables: there are 1785 decision variables in total. Figure 9 shows how the sequence converges over the iterations. In the obtained solution, the values of the enhancers' decision variables are given in Table 19.

Figure 9: Log error vs. number of iterations

These values can be well explained. The task of the agent is to transport a certain amount of goods from the source node s to the destination node t so as to satisfy the demand at node t while minimizing the total transportation cost. Starting from s, there is an intermediate node c through which every attractive path passes, since the minimum cost from the source node s to node c is 2, and from s to c there are only two paths: s -> a -> c and s -> b -> c. One thing we can observe from the network in Figure 10 is that the transportation cost on the path s -> a -> c is 2, while on the path s -> b -> c it is 13. Thus one reasonable strategy for the enhancers is to increase the capacities on the path s -> a -> c in order to decrease the agent's min cost. As Figure 10 shows, the capacity of arc s -> a is 4 and the capacity of arc a -> c is 8, so the enhancers are more willing to increase the bottleneck capacity of this path, namely the capacity of arc s -> a.

Figure 10: Unit transportation costs and initial arc capacities on the network

Arc        1        2  3  4  5  6  7  8  9  10  11  12       13  14  15  16  17
Player 1   1.0090   0  0  0  0  0  0  0  0   0   0  0.0024    0   0   0   0   0
Player 2   0.4779   0  0  0  0  0  0  0  0   0   0  0.0012    0   0   0   0   0
Player 3   0.3131   0  0  0  0  0  0  0  0   0   0  0         0   0   0   0   0
Player 4   0.1853   0  0  0  0  0  0  0  0   0   0  0         0   0   0   0   0
Player 5   1.0090   0  0  0  0  0  0  0  0   0   0  0.0024    0   0   0   0   0

Table 19: Amount of arc capacity increased by each enhancer in the solution

In the solution, the capacity added to arc s -> a by enhancer 1 is 1.0090, by enhancer 2 is 0.4779, by enhancer 3 is 0.3131, by enhancer 4 is 0.1853, and by enhancer 5 is 1.0090, so the total capacity added by all enhancers is 2.9943, which increases the capacity of arc s -> a to 6.9943; the capacity of the path s -> a -> c is thereby increased to 6.9943. Figure 11 shows the augmented arc capacities after the enhancement. This strategy of the enhancers decreases the agent's min cost: though the enhancers do not cooperate with each other, they still achieve a good tuple of strategies at the solution, in the sense of decreasing the min cost of the agent.

Figure 11: Initial and augmented arc capacities in the network

As mentioned in the previous section, diagonal dominance and the spectral radius play an important role in determining not only whether the sequence generated by the best-response algorithm converges to a Nash equilibrium solution, but also the rate of convergence.
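A quick numerical check of this condition is to compute the spectral radius of the matrix coupling the players' quadratic terms. The sketch below does this with NumPy for a hypothetical coupling matrix; how that matrix is assembled depends on the particular game and is assumed given.

```python
import numpy as np

def spectral_radius(M):
    """Spectral radius of the coupling matrix in the convergence condition:
    a value below 1 reflects the diagonal dominance that makes the
    best-response map contractive."""
    return max(abs(np.linalg.eigvals(M)))

# toy illustration: 5 players with uniform off-diagonal coupling 0.1
F = 5
M = 0.1 * (np.ones((F, F)) - np.eye(F))   # hypothetical coupling weights
print(spectral_radius(M))                  # 0.4 < 1, so the sweep contracts
```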
We have therefore applied the best-response algorithm to compute Nash equilibrium solutions of stochastic network enhancement games with different spectral radii, obtaining Figure 12, which describes the relationship between the spectral radius and the number of iterations needed to reach an error of $10^{-5}$. The results indicate that for a stochastic network enhancement game we can use the best-response algorithm to compute a Nash equilibrium solution and expect fast convergence after a small number of iterations, with faster convergence for problems with stronger diagonal dominance, i.e., a smaller spectral radius.

Figure 12: Number of iterations required to achieve convergence vs. spectral radius

To compare the performance of the best-response algorithm on network enhancement games with that of Lemke's method, we computed a Nash equilibrium solution of a deterministic min-cost enhancement game using both the best-response algorithm and Lemke's method. The two algorithms return the same solution; the number of iterations required by Lemke's method to reach an error of $10^{-5}$ is 30, while the best-response algorithm requires 16, which suggests that the best-response algorithm may outperform Lemke's method on network-type games.

6.2.3 Nash Cournot Game

The Nash-Cournot game is an economic model describing an industry structure in which companies compete on the amount of output they produce, while the price of the product depends on the production amounts of all the companies. The classical Nash-Cournot game is a single-stage decision problem in which the only decision the companies make is the production amount. We are instead interested in a two-stage Nash-Cournot game, which differs from the classical one in that the objective function of each player (company) contains both a first-stage cost and a second-stage expected profit. The decision variables of the companies include both the production amounts of the second stage and the investment of the first stage, which influences the company's production capacity. In the first stage, each player decides how much resource to invest in the factory's facilities and technologies in order to increase the production capacities of the second stage. In the second stage, each player decides how much of each product to produce, where the price of each product depends on all players' production amounts of that product, and the maximum production amount of each player depends on the first-stage investment. The formulation of the game is
$$\max_{x^f \in X^f}\; -q_f^Tx^f - \frac{1}{2}x^{fT}Q^{ff}x^f - x^{fT}\sum_{f'\ne f}Q^{ff'}x^{f'} + \sum_j p_j\,\psi_f(x^f,x^{-f},\omega_j),$$
where $x^f$ is the player's first-stage decision variable, i.e., the amount of resource invested in the factory's facilities and technologies, and the objective function contains a quadratic function and the expectation $\sum_j p_j\,\psi_f(x^f,x^{-f},\omega_j)$ of the second-stage objective. The quadratic function represents the first-stage investment cost incurred to improve the second-stage production capacity. The feasible set of $x^f$ is $X^f$, a convex compact set containing a budget constraint and a nonnegativity constraint. A random variable $\omega_j$ is associated with each scenario j in the second-stage objective $\psi_f(x^f,x^{-f},\omega_j)$, and $\psi_f(x^f,x^{-f},\omega_j)$
is a pointwise maximum function of the form
$$\psi_f(x^f,x^{-f},\omega_j) \;\triangleq\; \max_{z^f_j}\; z^{fT}_j\,P\Big(\sum_f z^f_j,\,\omega_j\Big) - c^f(z^f_j) \quad \text{s.t.}\;\; z^f_j \le b^f(\omega_j) + x^f, \quad A^fz^f_j \le a^f, \quad z^f_j \ge 0.$$
In this second-stage optimization problem, the random variable $\omega_j$ appears in both the objective function and the constraints. The vector $z^f_j$ is the second-stage decision variable for scenario j, representing the amounts produced in the second stage; its components are the production amounts of the individual products. The constraints on the second-stage decision variable consist of a production capacity constraint $z^f_j \le b^f(\omega_j) + x^f$ and a production activity constraint $A^fz^f_j \le a^f$. The price function $P$ is a linear decreasing function of the total production $\sum_f z^f_j$ of all factories, written as $P(\sum_f z^f_j, \omega_j) \triangleq d(\omega_j) - G(\omega_j)\sum_f z^f_j$; this reflects the law of supply. The production cost function $c^f(z^f_j)$ is a quadratic function of $z^f_j$, namely $c^f(z^f_j) \triangleq \frac{1}{2}z^{fT}_jE^{ff}z^f_j + c_f^Tz^f_j$.

Since Algorithm 1 requires an efficient solver (subroutine) for the optimization problems at each iteration, and each iteration here involves a bi-level optimization problem, we use a technique similar to sample average approximation, as in Section 6.2.1: we pull out the second-stage decision variables $z^f_j$ and compute the solutions of $x^f$ as well as of $z^f_j$ across the scenarios $\omega_j$, which reformulates the Nash-Cournot game as
$$\max_{x^f,(z^f_j)_j}\; -q_f^Tx^f - \frac{1}{2}x^{fT}Q^{ff}x^f - x^{fT}\sum_{f'\ne f}Q^{ff'}x^{f'} + \sum_j p_j\left[z^{fT}_j\Big(d(\omega_j) - G(\omega_j)\sum_f z^f_j\Big) - \frac{1}{2}z^{fT}_jE^{ff}z^f_j - c_f^Tz^f_j\right]$$
$$\text{s.t.}\;\; z^f_j \le b^f(\omega_j) + x^f \;\;\forall j, \qquad A^fz^f_j \le a^f \;\;\forall j, \qquad z^f_j \ge 0 \;\;\forall j, \qquad r_f^Tx^f \le \text{budget}_f, \qquad x^f \ge 0.$$
The formulation of the two-stage Nash-Cournot game is similar to that of the pulled-out two-stage stochastic game, except that the second-stage objective function depends on the other players' second-stage decision variables rather than on their first-stage decision variables.
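To see the role of diagonal dominance in this setting, it helps to strip the game down to a single-product, single-stage Cournot model with price $P(q) = d - g\sum_f q_f$ and cost $cq_f + \frac{e}{2}q_f^2$, for which the best response has a closed form. The following sketch (all parameter values illustrative) iterates the best responses to a fixed point; the iteration contracts precisely when $(F-1)g < 2g + e$, the diagonal dominance condition in miniature.

```python
import numpy as np

# toy single-product Cournot game: player f maximizes
#   q_f * (d - g * total) - c * q_f - (e / 2) * q_f ** 2
d, g, c, e, F = 50.0, 1.0, 2.0, 3.0, 5    # illustrative parameters

def best_response(q, f):
    # first-order condition: d - c - g * (total - q_f) - (2g + e) * q_f = 0
    others = q.sum() - q[f]
    return max(0.0, (d - c - g * others) / (2 * g + e))

q = np.zeros(F)
for _ in range(100):
    q_new = np.array([best_response(q, f) for f in range(F)])
    if np.linalg.norm(q_new - q) < 1e-10:
        break
    q = q_new
print(q)   # converges since (F - 1) * g = 4 < 2 * g + e = 5
```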
Consider the following instance: there are 20 scenarios, each with probability 0.05, and 5 players. In the budget constraint of $x^f$, the budget of each player is positive, so the feasible set of $x^f$ is nonempty. The constraint $z^f_j \le b^f(\omega_j) + x^f$ involves both the first-stage decision variable $x^f$ and the second-stage decision variable $z^f_j$ and represents the capacity constraint: each player (factory) cannot produce more of a product than its production capacity, which is the sum of the initial capacity and the first-stage investment, as the first-stage investment increases the second-stage production capacity. The production activity constraint $A^fz^f_j \le a^f$ is a private constraint involving only the second-stage decision variable; the components of the matrix $A^f$ and the vector $a^f$ are nonnegative, since the more products the factory produces, the more production activities there are. Thus the feasible set of the second-stage decision variable $z^f_j$ is nonempty. In the objective function, the components of the matrices $Q^{ff'}$ are nonnegative, and the matrix $Q^{ff}$ is a diagonal matrix with positive components, hence positive definite; the components of these matrices are positive because they correspond to the first-stage investment costs. The components of the vector $d(\omega_j)$ and the matrix $G(\omega_j)$ are positive, since the product $z^{fT}_j(d(\omega_j) - G(\omega_j)\sum_f z^f_j)$ represents the second-stage revenue of factory f: the more of a product that it and the other factories produce, the lower the price obtained for that product. The matrix $E^{ff}$ is a diagonal matrix with positive diagonal elements and is therefore positive definite, reflecting the fact that $\frac{1}{2}z^{fT}_jE^{ff}z^f_j$ represents the second-stage production cost. Thus the overall objective function is concave, and since in our example the matrices $E^{ff}$ and $Q^{ff}$ have diagonal elements that are large compared with the elements of the other matrices, the overall objective has the diagonal dominance property. As in the two-stage stochastic game discussed earlier, assumptions (B.1)-(B.4) are satisfied; among them, the key assumption is diagonal dominance, which is critical for the contraction.

In our example, the algorithm terminates after just 16 iterations with an error threshold of $10^{-4}$, where the error $e_t$ at iteration t is defined as $e_t \triangleq \|x_t - x_{t-1}\|$, the norm of the difference between the decision variables of two consecutive iterations. The runtime is 202.8677 seconds, a fast convergence for our example with 5 players, 20 scenarios, and 17-dimensional first-stage and second-stage decision variables: there are 1785 decision variables in total. Figure 13 shows how the sequence converges during the iterations. In the obtained solution, the values of the players' first-stage decision variables are given in Table 20.

Figure 13: Log error vs. number of iterations

Product type   1     2  3  4  5  6  7     8     9     10  11  12    13    14    15    16    17
Player 1       0     0  0  0  0  0  0     2.17  0      0   0  0     0.10  0     1.15  0     0.82
Player 2       0     0  0  0  0  0  0.40  1.57  0.57   0   0  0.07  0.19  0.64  0.23  0     1.43
Player 3       0     0  0  0  0  0  0     1.23  0      0   0  0     0     0     0.37  0     0
Player 4       0.04  0  0  0  0  0  0.33  1.46  0.40   0   0  0     0.07  0.58  0.05  0     1.32
Player 5       0     0  0  0  0  0  0.12  2.30  0.16   0   0  0.03  0.40  0     1.28  0.46  1.24

Table 20: Values of the first-stage decision variables in the solution

We can draw some insights from the solution. The objective of each player is to increase production capacity in the first stage and to maximize the total profit in the second stage by choosing appropriate production amounts, while the product prices in the market depend on all players' second-stage decision variables (production amounts). As the price functions are uncertain, and the initial production capacities may be too small for some types of products, the players invest in facilities and technologies in the first stage in order to improve the expected second-stage profit.

To simplify the interpretation of the solution, we look at the values and parameters associated with player 1. In our example, the budget for player 1 is 6, meaning that player 1 can invest at most 6 units of resource into increasing production capacity in the first stage. The first-stage investment cost of player 1 includes a linear term $q_1^Tx^1$ and a quadratic term $\frac{1}{2}x^{1T}Q^{11}x^1$; the linear parameter $q_1$ is set to the zero vector in our example, and the quadratic parameter $Q^{11}$ is the identity matrix multiplied by 0.1. The second-stage price function includes an initial price parameter $d(\omega_j)$ and a price elasticity parameter $G(\omega_j)$. The values of $G(\omega_j)$ under the different scenarios are set to $0.1j$, where j is the scenario number. The values of $d(\omega_1)$ for scenario 1 are listed in Table 21.
In the obtained solution, the values of the different players' first-stage decision variables are given in Table 20. We can draw some insights from the solution. The objective of each player is to increase production capacity in the first stage and to maximize total profit in the second stage by choosing appropriate production quantities, while the market prices of the products depend on all players' second-stage decision variables (production quantities). Since the product price functions are uncertain, and the initial production capacities may be too small for some product types, the players may invest in facilities and technologies in the first stage in order to improve the expected profit in the second stage.

Product Type   Player 1   Player 2   Player 3   Player 4   Player 5
 1             0          0          0          0.04       0
 2             0          0          0          0          0
 3             0          0          0          0          0
 4             0          0          0          0          0
 5             0          0          0          0          0
 6             0          0          0          0          0
 7             0          0.40       0          0.33       0.12
 8             2.17       1.57       1.23       1.46       2.30
 9             0          0.57       0          0.40       0.16
10             0          0          0          0          0
11             0          0          0          0          0
12             0          0.07       0          0          0.03
13             0.10       0.19       0          0.07       0.40
14             0          0.64       0          0.58       0
15             1.15       0.23       0.37       0.05       1.28
16             0          0          0          0          0.46
17             0.82       1.43       0          1.32       1.24

Table 20: Values of first-stage decision variables in the solution

To simplify the interpretation of the solution, we can look at the solution values and parameters associated with player 1. In our example, the budget $\text{budget}^f$ for player 1 is 6, meaning that player 1 can invest at most 6 units of resource to increase production capacity in the first stage. The first-stage investment cost for player 1 consists of two terms, a linear term ${q^1}^T x^1$ and a quadratic term $\frac{1}{2}{x^1}^T Q^{11} x^1$. The linear parameter $q^1$ is set to the zero vector in our example, and the quadratic parameter $Q^{11}$ is the identity matrix multiplied by 0.1. The second-stage price function involves an initial price parameter $d(\omega_j)$ and a price elasticity parameter $G(\omega_j)$. The value of $G(\omega_j)$ under scenario $j$ is set to $0.1j$, where $j$ is the scenario number. The values of $d(\omega_1)$ for scenario 1 are listed in Table 21.

Product Type            1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
Initial price d(ω_1)   10  13  10  15  20  20  30  50  40  10  20  30  30  30  30  60  40

Table 21: Initial price parameter d(ω_1) for scenario 1

The initial production capacities of the different products in the second stage differ as well; they are listed in Table 22 for player 1 and scenario 1.

Product Type                           1    2    3    4    5    6    7    8    9
Initial production capacity b^1(ω_1)  0.4  1.8  0.8  1    1.5  1    1.2  0.8  1

Product Type                          10   11   12   13   14   15   16   17
Initial production capacity b^1(ω_1)  1.2  1.1  0.8  1    1    0.5  1.8  0.5

Table 22: Initial production capacity b^1(ω_1) for player 1 and scenario 1

We can see that player 1 invests in the production capacities of many different types of goods in the first stage and uses up its entire budget, as the value of the product ${r^1}^T x^1$ equals the budget of 6. Among the components of $x^1$, the 8th component is the largest. This is reasonable: the 8th product type has the second-largest initial price parameter, and its initial production capacity is relatively small compared with that of the 16th product type, which has the largest initial price parameter but a first-stage decision variable of 0. The values of the second-stage decision variables for the first scenario are listed in Table 23; they are consistent with the values of the initial prices. Thus, by allocating the first-stage budget to production capacities and deciding production quantities in the second stage, player 1 can increase its expected profit in the second stage.

Product Type   Player 1   Player 2   Player 3   Player 4   Player 5
 1             0.40       0.40       0.40       0.44       0.40
 2             1.23       1.23       1.23       1.23       1.23
 3             0.80       0.80       0.80       0.80       0.80
 4             1.00       1.00       1.00       1.00       1.00
 5             1.50       1.50       1.50       1.50       1.50
 6             1.00       1.00       1.00       1.00       1.00
 7             1.20       1.60       1.20       1.53       1.32
 8             2.97       2.37       2.03       2.26       3.09
 9             1.00       1.57       1.00       1.40       1.16
10             0.94       0.94       0.94       0.94       0.94
11             1.10       1.10       1.10       1.10       1.10
12             0.80       0.87       0.80       0.80       0.83
13             1.10       1.19       1.00       1.07       1.40
14             1.00       1.64       1.00       1.58       1.00
15             1.65       0.73       0.87       0.55       1.78
16             1.80       1.80       1.80       1.80       2.26
17             1.32       1.93       0.50       1.82       1.74

Table 23: Values of second-stage decision variables in the solution (scenario 1)

We have also tested another example without the diagonal dominance property; it does not converge within 100 iterations and shows no evidence of converging. Beyond determining whether the iteration converges at all, the degree of diagonal dominance should also influence how fast it converges: intuitively, the stronger the diagonal dominance, the faster the convergence. We have therefore computed Nash equilibrium solutions of Nash-Cournot games with different degrees of diagonal dominance. One quantity that captures the degree of diagonal dominance is the spectral radius, and Figure 14 shows the relationship between the spectral radius and the number of iterations needed to reach an error of $10^{-5}$.

Figure 14: Number of iterations required to achieve convergence vs. spectral radius
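As a rough illustration of how such a spectral radius can be computed, the sketch below assumes that per-player strong-concavity moduli $\zeta_f$ and cross-player Lipschitz constants $\xi_{ff'}$ are available and forms the matrix of their ratios, whose spectral radius being below 1 is a standard contraction condition for best-response schemes; the constants used here are hypothetical placeholders, not the data behind Figure 14:

```python
import numpy as np

F = 5                                     # number of players
rng = np.random.default_rng(1)
zeta = 10.0 * np.ones(F)                  # hypothetical strong-concavity moduli
xi = rng.uniform(0.5, 1.5, size=(F, F))   # hypothetical cross-player Lipschitz constants
np.fill_diagonal(xi, 0.0)                 # no self-coupling term

Gamma = xi / zeta[:, None]                # Gamma[f, g] = xi_{fg} / zeta_f
rho = max(abs(np.linalg.eigvals(Gamma)))
print(f"spectral radius = {rho:.3f}",
      "(contraction)" if rho < 1 else "(no contraction guarantee)")
```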
Abstract
Introduced by John Nash, modern-day game theory in non-cooperative games has developed into a very fruitful research discipline. Depending on the properties of the players' objective functions and constraints, there are various classes of non-cooperative games. In this thesis, we study a variety of non-cooperative games, including Nash games with non-differentiable, non-separable objective functions, value-function based games, DC potential games, and generalized Nash equilibrium problems with coupling constraints. In order to compute solutions to these different classes of games, we design algorithms and prove their convergence. We establish the convergence of the best-response algorithm under non-separable C¹ objective functions, extending results of Jong-Shi Pang and Meisam Razaviyayn, and develop relaxed convergence conditions for games with both differentiable and non-differentiable objective functions. Several specific classes of games satisfy these conditions, including bi-Lipschitz games, poly-matrix games, minmax games, and two-stage stochastic games. We also study value-function based games. Generalizing certain network interdiction games communicated to us by Andrew Liu and his collaborators, we study a bilevel, non-cooperative game wherein the objective function of each player's optimization problem contains the value function of a second-level linear program parameterized by the first-level variables in a non-convex manner. In the applied network interdiction games, this parameterization is through a piecewise linear function that upper bounds the second-level decision variable. In order to give a unified treatment to the overall two-level game, in which the second-level problems may be minimizations or maximizations, we formulate it as a one-level game of a particular kind: each player's objective function is the sum of a first-level objective function ± the value function of a second-level maximization problem whose objective function involves a difference-of-convex (dc), specifically piecewise affine, parameterization by the first-level variables. We investigate the existence of a first-order stationary solution of such a game, which we call a quasi-Nash equilibrium, and study the computation of such a solution in the linear-quadratic case by Lemke's method applied to a linear complementarity formulation. We also introduce DC potential games, in which the objective function of each player is a DC function, i.e., the difference of two convex functions; a linearized best-response algorithm for computing a solution of DC potential games with differentiable objective functions is discussed, and its convergence is proved. Finally, we use an augmented Lagrangian based algorithm to compute a solution of the generalized Nash equilibrium problem and prove its convergence. A generalized Nash equilibrium problem is an extension of the standard Nash equilibrium problem in which both the objective function and the feasible set of each player depend on the other players' decision variables.