Smarter markets for a smarter grid: pricing randomness, flexibility and risk
(USC Thesis Other)
Smarter Markets for a Smarter Grid: Pricing Randomness, Flexibility and Risk

by Nathan Dahlin

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (Electrical Engineering)

December 2021

Copyright 2021 Nathan Dahlin

Dedication

For my Grandma Pat, who had the courage to leave a small town in northern Minnesota, and the dedication to push through the Cadet Nurse Corps instructors’ warnings that most of her incoming class would not make it to graduation, and for my Grandpa Bob, who always wanted to hear about what I’d been learning in math class.

Acknowledgements

I must begin by expressing my sincere thanks and appreciation to my advisor Prof. Rahul Jain. Returning to USC for graduate study, I sought in part an advisor and lab with significant focus on problems related to the transition to cleaner, more sustainable energy systems, and I found both of those things in Prof. Jain and the Stochastic Systems & Learning Laboratory. Beyond this alignment in research interests, I had the good fortune to find in Prof. Jain an advisor willing to take a chance on a student coming from an audio signal processing background far afield from the topics I voiced interest in pursuing as a Ph.D. student. Of course, all of this was just the tip of the iceberg of deep technical expertise, research acumen, intellectual rigor and unending curiosity and enthusiasm in pursuit of new questions and problems that I have gleaned over the past several years working with Prof. Jain. It is no exaggeration to say that Prof. Jain has given me a new perspective on how to ask and answer questions, and generally operate as a critical thinker. Simply put, Prof. Jain has taught me the value of slowing down and simplifying problems, and then carefully assembling the bits of insight gained in that process to work through seemingly intractable sticking points. Perhaps even more crucially, he has taught me that some of the greatest limitations one faces in academic, professional and intellectual growth are the ones we place on ourselves. No challenge is too great if we are willing to sit down and start the work. I especially thank Prof. Jain for his unfailing patience in nudging me to learn these important lessons for myself.

Returning to the lab I entered in Fall 2015, over the last several years I have had the great privilege to work amongst a cohort of talented, inspiring and supportive labmates: Krishna Chaitanya Kalagarla, Mehdi Jafarnia-Jahromi, and Hiteshi Sharma. Our lunch breaks, group outings and daily office banter constituted the glue holding my composure together through the challenges I faced on the way to my Ph.D. In particular, I am not sure how I would have gotten through the math department courses for my MA without the many lengthy discussions and study sessions Krishna and I crammed into EEB 335.

I would like to thank each of the members of my defense committee: Prof. Ashutosh Nayyar, Prof. Pierluigi Nuzzo and Prof. Suvrajeet Sen. Prof. Nuzzo has been kind enough to involve me in his lab’s ongoing work over the last couple of years, and he has provided invaluable advice and guidance on research, technical writing and presentation over that period of time, reflected in the final chapter of this dissertation. I was lucky enough to have Prof. Sen as my instructor for ISE 630, where I first encountered in depth the convex analysis and optimization tools forming the backbone of many of the following chapters. I also would like to thank Prof. Ketan Savla and Prof. Edmund Jonckheere for serving on my qualifying exam committee along with the other members of my defense committee.

I am also incredibly grateful for the day in, day out hard work of the electrical engineering department staff, in particular Diane Demetras and Annie Yu.
I always found their doors open when I needed assistance in navigating the Ph.D. requirements and procedures.

Between my graduation from USC with a bachelor’s degree in electrical engineering and my return in 2015, I worked as a research and development engineer at Audyssey Laboratories in Los Angeles. I would be remiss not to thank my supervisor Dr. Shiva Sundaram (also a USC EE Ph.D. graduate), who encouraged me to seriously consider pursuing a Ph.D., and has since regularly taken time to meet with me whenever he finds himself back in LA. I am also incredibly thankful to my previous supervisors at Audyssey, Dr. Sunil Bharitkar and Phil Hilmes (again both USC alumni) for providing letters of recommendation as I applied to return to USC.

Last but not least I would like to thank my family, in particular my parents. I truly cannot imagine reaching this accomplishment without the love, support and encouragement they have provided since long before I started my Ph.D.

Table of Contents

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract

Chapter 1: Introduction
  1.1 Background and Market Structure of Electric Power Systems
  1.2 Document Organization

Chapter 2: A Two-Stage Stochastic Mechanism for Selling Random Power
  2.1 Introduction
  2.2 Preliminaries
    2.2.1 The Setting
    2.2.2 Mechanism Design
  2.3 The Generator’s Problem
  2.4 Stochastic VCG Mechanism for Selling Random Power
  2.5 Proof of Theorem 1
    2.5.1 The Selection I −(i)
    2.5.2 Proof of Theorem 1
  2.6 Conclusion

Chapter 3: Two-Stage Electricity Markets with Renewable Energy Integration: Market Mechanisms and Equilibrium Analysis
  3.1 Introduction
  3.2 Electricity Market Model
  3.3 Sequential Competitive Equilibrium and Efficient Allocations
    3.3.1 Social Welfare Theorems
  3.4 Two-Stage Network Mechanism for Electricity Market with Renewable Generation
  3.5 Dynamic Economic Dispatch Game and Efficient Bids
    3.5.1 LSE Utility Functions and SPP-P Reformulation
    3.5.2 DED Game and Efficient Bids
  3.6 Sequential Nash Equilibria
    3.6.1 Existence of Efficient Sequential Nash Equilibria
  3.7 Conclusions

Chapter 4: A Risk Aware Two-Stage Market Mechanism for Electricity with Renewable Generation
  4.1 Introduction
  4.2 Preliminaries
    4.2.1 Risk Measures
    4.2.2 Conditional value at risk
  4.3 Risk Aware Stochastic Economic Dispatch Formulation
    4.3.1 Generator’s Problem
    4.3.2 ISO’s Problem
  4.4 Sequential Competitive Equilibrium
  4.5 Two-Stage Mechanism for Risk Aware Electricity Market with Renewable Generation
  4.6 Conclusion

Chapter 5: Scheduling Flexible Non-Preemptive Loads in Smart-Grid Networks
  5.1 Introduction
    5.1.1 Related Work
    5.1.2 Statement of Contributions
  5.2 Problem Formulation
    5.2.1 The Social Planner’s Problem
    5.2.2 Consumer’s Problem
    5.2.3 Generator and ISO Problems
  5.3 Competitive Equilibrium and Theorems of Welfare Economics
  5.4 Replicated and Large Economies
  5.5 Market Mechanism for Large Population Economy
  5.6 Case Study: EV Charging
  5.7 Conclusion

Chapter 6: Designing Interpretable Approximations to Deep Reinforcement Learning with Soft Decision Trees
  6.1 Introduction
  6.2 Background
    6.2.1 Related Work
      6.2.1.1 Knowledge distillation
      6.2.1.2 Evaluation metrics
    6.2.2 Imitation Learning
    6.2.3 Soft Decision Trees
  6.3 Controller Characterization Metrics
  6.4 Case Study
    6.4.1 Problem description
    6.4.2 Expert DQN
    6.4.3 Hard and soft decision trees
    6.4.4 Discussion
    6.4.5 Further Case Studies
  6.5 Conclusion and Future Work

Bibliography

Appendices
  A Proof of Lemma 4
  B Proof of Lemma 11
  C Proof of Lemma 16

List of Tables

  2.1 Payment function for LSE (i) ∈ I

List of Figures

  1.1 Conventional wholesale electricity market structure.
  2.1 Case 1 RT transfer for $i = 3$ and $n = 6$.
  2.2 Case 2 RT transfer for $i = 3$, $n = 6$ and $r_3 = 4$.
  2.3 Case 3 RT transfer for $i = 3$, $n = 6$ and $r_3 = 2$.
  3.1 Network diagram for Example 1. Generators are represented by circles, LSEs are represented by triangles.
  5.1 Example disutility vectors.
  5.2 Scheduled aggregate load with and without flexibility.
  5.3 Scheduled generation (equal to aggregate load less renewable generation) with and without flexibility.
  5.4 Proportions of loads served with and without flexibility.
  5.5 Social welfare achieved with and without flexibility.
  6.1 SDT with a single inner node and two leaf nodes [31]
  6.2 Generation and L2 norm comparison of empirical value functions $\hat{V}^{\tilde{\pi}}$ and $\hat{V}^{\hat{\pi}}$.
  6.3 Soft Decision Tree for Mountain Car
  6.4 Normalized RMS L2 Error for HDT/SDTs depth 2-9 (statespace discretized into 20 steps per dimension).
  6.5 Percentage Policy Accuracy for HDT/SDTs depth 2-9 (statespace discretized into 100 steps per dimension).
  6.6 Performance evaluation for HDT/SDTs and reference DQN for 100 episodes.
  6.7 Number of parameters for HDT/SDTs and reference DQN.
  6.8 Trained SDT and DQN policies, for statespace S discretized to 10000 points.
  6.9 Trained HDT and DQN policies, for statespace S discretized to 10000 points.

Abstract

The current moment finds the world’s energy infrastructure at the threshold of rapid transformation. Long heralded as the most complex human-technological system ever realized, in view of a looming climate crisis, as well as ever accelerating technological advances, today’s power systems are defined by increasing penetration of renewable and distributed energy resources on the supply side, and the emergence of flexible loads on the demand side.
While these developments present opportunities for improved system operation and outcomes, they come with significant challenges as well, which cannot be solved solely through continued development and adoption of technology. Consider the case of Germany’s Energiewende program. Government-imposed above-market rates for solar and wind generation grew the renewable share of electricity consumption from 5% in 1999 to 27% in 2014. Nevertheless, these same rates, together with mandated prioritization of renewables, proved high enough to raise consumer prices, and low enough to undercut natural gas, turning many utilities back to coal, and producing an uptick in German carbon dioxide emissions. Clearly, market institutions have a crucial role to play in this transformation as well. Through market analysis, design and optimization, this work explores how such institutions can evolve to address three key characteristics of modern power systems: uncertainty, risk, and flexibility.

Central to the potential of markets to mitigate the uncertainty posed by reliance on renewables is a revision of the current two-settlement market structure common to deregulated energy markets across the world. Rather than arrange advance supply and account for real time imbalances in supply and demand separately, this work considers a two-stage, stochastic market clearing paradigm. Probabilistic information regarding renewable generation is used to couple day ahead and expected real time recourse decisions, increasing efficiency. Within this framework, an incentive compatible two-stage market mechanism is designed for a renewable generator selling its random power to strategic customers. Next, a two-stage mechanism is developed for a two-sided exchange with primary and ancillary generation and demand response, implementing a sequential competitive equilibrium (SCEq).
As renewables assume a larger share of electricity generation and consumption, market participants on both sides are exposed to higher levels of risk, both in terms of price and quantity variability. Many of these participants are risk averse, so that optimization of the expectation of a stochastic objective is not sufficient. Therefore, optimization of conditional value at risk of the cost of addressing shortfall in renewable generation is considered. A two-stage market mechanism implementing an SCEq is developed in this risk-aware setting.

Given that user flexibility is considered one of the most valuable, yet still untapped resources available for accommodating the transition to renewables, an explicit market for flexibility is designed and analyzed. Users report preferences for service over a finite time horizon to a scheduler which shapes the aggregate demand profile to the output of a renewable generator, while minimizing the cost of resorting to thermal generation. Social welfare properties of competitive equilibria and an accompanying mechanism are studied.

Chapter 1

Introduction

1.1 Background and Market Structure of Electric Power Systems

Electricity is in many respects a unique commodity. For example, it is inseparable from a physical system which operates at a time scale much faster than almost any other market. That is, supply and demand must be balanced from moment to moment in order to maintain power grid stability. Furthermore, although electricity storage has recently attracted attention as one of the potential keys to a transition to greater levels of renewable adoption, storage of large quantities of electricity remains challenging. These idiosyncrasies of electricity have made and continue to make design of functional electricity markets difficult. In the wake of the deregulation over the last few decades, two market structures dominate electricity markets: bilateral trading and competitive electricity pools [54].
Bilateral trades involve direct negotiation over delivery of power at a future point in time, while electricity pools offer a centralized platform for pricing and wholesale market clearing between buyers and sellers. The buyers are utilities, distributors and load serving entities (LSEs), while the sellers are generators. These entities submit supply and demand curves to the independent system operator (ISO), which then solves an economic dispatch problem, determining the lowest cost dispatch of generation which satisfies grid constraints such as power flow balance at all network nodes and capacity limits of transmission lines. The buyers then distribute the purchased energy to consumers via retail markets.

[Figure 1.1: Conventional wholesale electricity market structure.]

Most electricity markets have multi-settlement structures. These markets operate at varying time scales, e.g., week-ahead, day-ahead, hour-ahead, minutes-ahead and real time, but the most common implementation takes on a two-stage structure. For example, in North America, supply is arranged in advance in day-ahead markets in order for slower adjusting thermal generators to be available and running on time to meet estimated demand. As this demand cannot be predicted exactly, the day-ahead markets are supplemented by real-time markets, which balance out any variations in load or generator availability [35].

Historically this arrangement has been sufficient for addressing load variation, as prediction errors have fallen under 5%. In the context of this work, this arrangement is notable for the fact that these decisions are often made in an uncoupled fashion, i.e., the day-ahead decisions do not take into account the real-time balancing decisions that may occur in the following market stage.
Later, stochastic market clearing models will be presented which use distribution information regarding uncertain renewable generation, which exhibits a much higher degree of variation than that found on the demand side.

Any effort to substantially mitigate the risks of climate change must include a drastic reduction in global greenhouse gas (GHG) emissions. In 2018, the Intergovernmental Panel on Climate Change (IPCC) stated that in order to meet the Paris Climate Change Agreement’s goal of limiting global temperature increase to 1.5 °C above pre-industrial levels, carbon emissions worldwide must be reduced by 45% by 2030. Additionally, as the IPCC AR5 report [55] detailed, electricity and heat production together accounted for 25% of total global GHG emissions (the largest percentage of any sector) in 2010. Therefore, such a reduction must involve the power grid, and indeed the movement away from fossil fuels to renewable sources of energy has already begun. For the past four years, installation of renewable energy capacity has outpaced combined new fossil fuel and nuclear capacity. Overall, renewables now account for more than one-third of global installed power capacity.

At its outset, this growth in renewable adoption was facilitated by policy interventions such as feed-in tariffs and renewable portfolio standards. However, it is often the case that such policies amount to payment without any guarantee of supply, an arrangement which does not scale with increasing penetration of renewable energy sources. On the other hand, in terms of per unit cost, renewables now represent the cheapest sources of power available worldwide. Thus, it seems both necessary and plausible for renewable energy generators and associated service providers to participate directly in competitive electricity pools.

But what does such participation look like? How would such markets function? Electricity has always been a somewhat unique commodity in a number of aspects.
For example, demand and supply must be matched from instant to instant. Power generation decisions must be made in view of the physical laws and constraints governing the grid. Despite receiving increased attention as of late, mass storage of electricity and its integration into the power grid remains a challenge. Heavier reliance on renewables further complicates this picture due to their intrinsic randomness. While uncertainty due to demand fluctuations and infrastructure malfunction, for example, has always been an issue facing grid and market operators, the magnitude of variability introduced by renewables far exceeds that which was addressed in the past.

The general interest of this work is to explore market and mechanism designs which further the global turn towards renewable energy. Adopting the view of renewable energy as a random good, i.e., a good supplied or generated only with some probability, the issue of uncertainty addressed above can be equivalently considered as an issue of reliability. How can market designs contribute to the reliability of renewable energy? What does it mean for a random good to be reliable? How does the reliability of renewable sources affect the price of conventionally generated, thermal energy? Can complementary goods or services be offered and exchanged in order to enhance the reliability of renewables?

1.2 Document Organization

Chapter 2 addresses an issue relevant to all real world markets, including that for electricity: competitive behavior. It considers a setting in which a single renewable generator seeks to sell its power to strategic load serving entities. These strategic agents each demand a single unit of energy, and are characterized by $v_i$, the value that they will derive from receiving that unit, and $c_i$, the cost they will incur if promised a unit that is not ultimately delivered.
The generator seeks to maximize the summed expected utilities of all consumers, i.e., distribute any available units to those consumers who will benefit most. This is only possible if each LSE $i$ submits its true information $(v_i, c_i)$. Given the random nature of the generator’s output, a stochastic extension to the well known VCG mechanism is presented which elicits the true parameters from all agents, ensures participation, and ultimately leads to the socially optimal outcome.

Chapter 3 examines the dynamic aspect of renewable integration. Across the world, the sale and distribution of electricity is settled via a sequence of markets, including several forward markets, closed days or hours ahead of delivery, and a real-time balancing market, which opens (and closes) just minutes prior to delivery. Currently, the common practice is to make generation and reserve capacity procurement quantity decisions in each of these markets independently. However, assuming that probability distributions are available for the renewable energy production outputs, it seems reasonable to make earlier decisions in view of the expected recourse actions that will be taken in later stages. In Chapter 3, a two-stage market is considered, which includes primary conventional generation sources and more expensive ancillary generation on the supply side, along with consumer participation in demand response and planned blackouts on the demand side. The operation of this market can be modeled as a two-stage stochastic optimization problem, which admits a decomposition into individual generator and load serving entity (LSE) problems. It is shown that the optimal, centralized solution can be constructed from the solution to these individual problems via a mechanism based upon sequential competitive equilibrium prices. Allowing for strategic behavior, an analogous sequential Nash equilibrium concept is introduced.
Assuming that LSEs bid directly on their marginal valuation for electricity in both market stages, under either a congestion free condition on the optimal dispatch or monopoly free condition on the network topology, an efficient sequential Nash equilibrium is shown to exist.

Of course, a given realization of a random variable may be quite different from its expected value. Given this observation, as well as the fact that a significant amount of empirical evidence suggests that real world economic decision makers are risk-averse, Chapter 4 considers a two-stage market setting wherein a risk-averse social planner schedules the generation of $N$ thermal generators, and a single renewable generator. In particular, the expectation over random cost found in the setting of Chapter 3 is replaced with conditional value at risk (CVaR), a risk measure common in finance which essentially expresses the mean value of a random variable, given that it has taken a value above some threshold which is exceeded with low probability. A decomposition of the posed social planner’s problem into individual generator problems is detailed, and the existence of decentralizing sequential equilibrium concepts is demonstrated. Noteworthy in this case is the fact that the generators are assumed to be risk-neutral, i.e., expected payoff maximizers, while the social planner is risk-averse. Despite this discrepancy in attitude towards risk, given the sequential equilibrium prices, the individual generator solutions together combine to give the social planner’s optimal, centralized solution.

Again, one of the defining operational requirements of the power grid is that it balances demand and supply instantaneously. Clearly this becomes more challenging as the supply side of that equation becomes more variable. But what about the demand side? Over the past decade, demand response has become one of the most studied tools for adapting to the uncertainties of renewable generation.
A blanket term encompassing a variety of approaches, demand response programs can broadly be differentiated into direct methods, where, for example, a utility is able to directly schedule appliances and other loads for participating end users, and indirect methods, where a signal such as time varying prices is transmitted to users in the hope that they will shift their consumption, thus implementing the intended schedule of a utility or social planner. Chapter 5 studies a setting where end consumers of electricity provide a social planner with disutility functions $u^{dS}_i$ and $u^{dE}_i$ describing discomfort for service prior to, and following, a desired service time window within a finite time horizon $T$. The social planner then minimizes the summed cumulative disutilities and (thermal) generation costs corresponding to its chosen schedule for all submitted loads. The solution to the planner’s centralized problem yields prices not only for per unit energy consumption, but also for the inflexibility posed by the end users via their utility functions. As in prior settings, a mechanism based upon equilibrium energy and inflexibility prices is described which implements the socially optimal equilibrium in a decentralized manner. The equilibrium described follows from a convex relaxation of the scheduling decision variables, and therefore does not directly yield an optimal schedule. However, considering the large economy limit under infinite symmetrical replication of each load type, the equilibrium can be seen as providing an optimal probability distribution over start times for a given load. These equilibrium probabilities can then be used to determine which proportion of each load type should be activated at each time slot, yielding an optimal schedule which respects the integer constraints of the original setting.

Finally, Chapter 6 presents preliminary findings on a separate track of research concerning interpretable artificial intelligence.
While deep neural networks (DNNs) continue to constitute the state of the art in a broad set of applications, adoption of DNNs in critical applications is currently hindered by their opaqueness. One class of techniques for addressing the problem of interpretability in DNNs is known as knowledge distillation, a specific form of imitation learning, which extracts knowledge from a trained DNN to a simpler or more legible model [18]. Although alternative model architectures, e.g., decision trees, trained from DNN outputs can achieve comparable performance to the original target DNN performance, there is still a lack of consensus as to what metrics can be used to quantify interpretability or otherwise measure similarity between distilled models and their targets. Chapter 6 presents a quantitative framework for assessing the outcome of the distillation process from DNNs to conventional decision trees, as well as soft decision trees [31]. The framework, which includes metrics such as 0-1 loss between model policies, seeks to identify which reduced models not only preserve a desired level of performance, but also compactly explain the knowledge embedded in the original DNN. The utility of the framework is demonstrated in the context of benchmark reinforcement learning (RL) tasks implemented as part of the OpenAI Gym software package.

Chapter 2

A Two-Stage Stochastic Mechanism for Selling Random Power

2.1 Introduction

Remarkable strides have been made in meeting aggressive renewable energy portfolio standards worldwide. For example, California has gone from under 5% of energy from renewable sources in 2010 to nearly 25% today and is expected to reach 33% in 2020. In fact, the state has mandated 100% of energy to be derived from renewable sources by 2045. Other states and countries are following suit [29]. This presents immense technological and economic challenges.
Renewable energy sources such as wind and solar are inherently variable, meaning that efforts to bridge such gaps must address the challenge of efficiently mitigating the impact of uncertainty in generation. For some time, renewable resources have been considered more akin to negative loads than firm block generating units, and their growth has been spurred by feed-in tariffs [1], [28]. In order to maintain grid stability, it is the responsibility of an independent system operator (ISO) to procure adequate reserves to compensate for potential shortfalls in generation. While feasible at relatively low levels of usage, at higher levels of penetration such arrangements hamper the net benefits of renewable energy [8]. Therefore, prevailing approaches to renewable integration have shifted from so-called supply-push-type mechanisms [1] to those which place more of the burden on the generators themselves as well as their end users.

This shift has manifested in a couple of ways, first via what can be categorized as technological regulations. For example, in Spain, both new and existing generators are required to equip themselves with fault-ride-through capability. Secondly, economic regulations have evolved such that generators are more directly exposed to the risks associated with their variability. Again turning to the Spanish example, mandatory hourly forecasting requirements were implemented along with penalties of 10% of estimated system cost for deviations of over 20%.

This example typifies two-settlement systems: renewable generators are forced to participate in conventional energy markets with ex-post monetary penalties for deviations from ex-ante contracts. The latter are settled in day-ahead (DA) markets, while the former are determined in real-time (RT) spot markets. Formally, the determination of committed quantities and allocations can be captured in the framework of optimization problems [53].
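To make the two-settlement idea concrete, the following is a minimal sketch of a renewable generator's revenue under a day-ahead contract with an ex-post shortfall penalty. The settlement rule, the function names, and all prices are illustrative assumptions and are not taken from any particular market described in this chapter.

```python
def two_settlement_revenue(q_da, w_rt, price_da, price_rt, penalty_rate):
    """Revenue for one delivery period under a hypothetical two-settlement rule.

    q_da: quantity contracted day-ahead (ex-ante contract)
    w_rt: generation actually realized in real time
    Assumption: deviations are settled at the RT price, and shortfalls
    additionally incur a per-unit penalty.
    """
    deviation = w_rt - q_da
    # The ex-ante contract is paid at the day-ahead price.
    revenue = price_da * q_da
    # Surplus is sold, and shortfall is bought back, at the real-time price.
    revenue += price_rt * deviation
    # Ex-post penalty applies only when delivery falls short of the contract.
    shortfall = max(-deviation, 0.0)
    revenue -= penalty_rate * shortfall
    return revenue


def expected_revenue(q_da, scenarios, price_da, price_rt, penalty_rate):
    """Expected revenue over (generation, probability) scenarios."""
    return sum(
        prob * two_settlement_revenue(q_da, w, price_da, price_rt, penalty_rate)
        for w, prob in scenarios
    )


if __name__ == "__main__":
    # Two equally likely generation levels (illustrative numbers only).
    scenarios = [(8, 0.5), (12, 0.5)]
    print(expected_revenue(10, scenarios, price_da=30, price_rt=40, penalty_rate=5))
```

Choosing the contracted quantity that maximizes this expectation is one simple instance of the optimization problems referred to above, under the stated assumptions.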
Adopting the view of wind or solar energy as a random good, much work has focused on how the manner in which generators bring their product to market can affect social welfare. In [72] and [8], given the probability distribution of power generation scenarios, contracts of the form $(\rho_w, p_w)$ must be designed, signifying a price of $p_w$ per unit of the contract, which will be fulfilled with probability $\rho_w$. Thus, load serving entities (LSEs) are given a choice of how much risk due to insufficient generation to take on, given the variable-reliability options presented by the supplier. Such a market is shown in [8] to operate more efficiently than a firm-electricity market, which incurs additional costs including the provisioning of reserves. In [10] the problem is to set a price $p$ and offered generation quantity $C$ in order to maximize expected profit, rather than social welfare. Still, it is shown that the optimal expected shortfall is nondecreasing in $p$ and $C$, demonstrating the need to curtail offerings to reduce necessary reserve capacities.

The studies above assume that the distribution used in determining optimal offerings is reported truthfully. [73] investigates the aggregator's task of selecting a subset of available generators to maximize a given objective (e.g., maximize expected generation) as well as the ISO's problem of pricing wind energy, given a set of available generators, allowing for strategic behavior by generators in reporting their generation distributions. A stochastic VCG mechanism, as well as a commitment with penalty type mechanism, are proposed which elicit truthful distribution reporting, i.e., incentive compatibility, satisfy generator individual rationality (or voluntary participation) constraints and achieve efficient outcomes.

In this work, we examine the situation wherein a renewable generator, or an aggregator, allocates generated electricity among a set of LSEs via an auction mechanism.
We design a two stage auction. In Stage 1 (day ahead), the LSEs make bids that specify their value $v$ for each unit. Furthermore, they also specify their cost $c$ of real-time fulfillment (e.g., from the spot market) in case there is a shortfall and the generator cannot meet the commitment it made in Stage 1. In Stage 2, the random generation level $W$ is realized. In case there is a shortfall relative to the commitment already made in Stage 1, the auctioneer "de-allocates" some of the LSEs but pays them a compensation that depends on the real-time fulfillment costs $c$. Since the LSEs are strategic and need not report their values $v$ and costs $c$ truthfully, we have to devise the allocation, de-allocation and payment rules such that they leave the LSEs no incentive to be untruthful. Furthermore, we would also like such a mechanism to be efficient in the sense of maximizing the expected social welfare even in the absence of knowledge about the true valuations and costs of the LSEs.

We note that the problem posed here is a two stage auction mechanism for allocation of a random good with two part payments. Literature on multi-stage and dynamic mechanisms is sparse since they are usually regarded as rather difficult problems. The reader may refer to [47] for the standard game theoretic terminology that we use throughout this work.

2.2 Preliminaries

2.2.1 The Setting

Consider a renewable energy generator with random integer valued generation $W \in [0, \bar{w}]$, where $\bar{w} \in \mathbb{Z}_+$ gives the maximum generation amount. Let $p = (p_0, \ldots, p_{\bar{w}})$ give the probability mass function for the generator's output. We will consider a two stage setting. In stage 1, the generator conducts an auction to sell the random renewable energy generation, in which $N$ load serving entities (LSEs) participate, and determines an allocation. Each LSE demands a single unit of energy. In stage 2, $W$ is realized, say $W = w$ with probability $p_w$. This is known to both the generator and the LSEs.
This may result in a deficit over the allocation in stage 1, in which case some of the LSEs do not receive any power but may incur some cost to fulfill demand from the spot market. Denote the value of receiving a unit by $v_i$ and the cost of unfulfilled demand (after allocation in stage 1) by $c_i$ for LSE $i$. Further, denote $\sigma_i = (v_i, c_i)$, which is LSE $i$'s private information, and $\sigma = (\sigma_1, \ldots, \sigma_N)$.

The generator thus faces a two-stage stochastic optimization problem. In stage 1, it decides which of the bidding LSEs to commit to (the allocation). Once renewable energy generation $W$ has been realized, if there is a shortfall, in stage 2 it decides which LSEs will receive energy units, and which ones will be de-allocated. Let us denote the stage 1 decision by $x = x(\sigma, p)$ and the stage 2 decision by $z = z(x, w)$. $x_i = 1$ denotes allocation to LSE $i$ in stage 1, and $x_i = 0$ otherwise. $z_i = 1$ if LSE $i$ is deallocated in stage 2, and $z_i = 0$ otherwise. Thus, given a first stage decision $x$ and realized generation level $W = w$, the generator's second stage decision problem can be formulated as

$$\min_{z} \left\{ \sum_{i=1}^{N} (v_i + c_i) z_i \;:\; z_i \le x_i,\ \sum_{i=1}^{N} z_i = \Big( \sum_{i=1}^{N} x_i - w \Big)^{+} \right\} \tag{2.1}$$

where $x^{+} = \max(x, 0)$. Let $Q(x, w; \sigma)$ denote the minimum cost achievable in (2.1), given first stage decision $x$, renewable realization $w$, and LSE information $\sigma$. Now define the social welfare (SW) achieved when allocation $x$ is chosen and generation level $w$ is realized as

$$SW(x, w; \sigma) = \sum_{i=1}^{N} v_i x_i - Q(x, w; \sigma). \tag{2.2}$$

The deallocation $z$ depends on the shortfall in generation $W$, which is random. Thus, in stage 1 the generator maximizes expected social welfare to determine the allocation $x$:

$$\max_{x} \; \mathbb{E}[SW(x, W; \sigma)]. \tag{2.3}$$

Note that (2.3) may also be written

$$\max_{x} \; \sum_{i=1}^{N} v_i x_i - \mathbb{E}[Q(x, W; \sigma)]. \tag{2.4}$$

2.2.2 Mechanism Design

The generator now holds an auction in which it asks the various LSEs to submit bids. LSE $i$ submits a bid $\hat{\sigma}_i = (\hat{v}_i, \hat{c}_i)$, which may be different from its private information $\sigma_i$.
The generator will use the bids $\hat{\sigma} = (\hat{\sigma}_1, \ldots, \hat{\sigma}_N)$ to make allocation (and deallocation) decisions. But the allocation can only be expected to be efficient, i.e., social welfare maximizing, if the bidders truthfully report $\sigma_i$. Let $\gamma_i = v_i + c_i$ and similarly $\hat{\gamma}_i = \hat{v}_i + \hat{c}_i$. But bidders are strategic, and unless provided proper incentives need not be truthful. Thus, we would like to design an auction mechanism that aligns the incentives of the LSEs such that they are indeed truthful.

To specify such a mechanism, we need to specify the allocation rule (based on the bids), the deallocation rule (based on the bids and the generation realization $w$) and the payments to be made. Let $\mathcal{I} \subseteq \mathcal{N}$ be the subset of LSEs selected to receive a potential unit of energy. Those LSEs which are not selected in $\mathcal{I}$ leave the auction with zero payoff. Each selected LSE $i$ makes a payment $t^d_i(\hat{\sigma})$ to the generator, prior to realization of $W$. Then in the RT market, after realization of $W$, say $W = w$, a final subset $\mathcal{I}_w \subseteq \mathcal{I}$ is selected to receive any available units. Additionally, each LSE $i \in \mathcal{I}$ receives a payment $t^r_i(\hat{\sigma}, w)$ from the generator. Both $t^r_i$ and $t^d_i$ are allowed to take on negative values, indicating a payment in the direction opposite to that specified. Together these selection and payment schemes define a direct revelation mechanism $\Gamma = (\mathcal{I}, \mathcal{I}_w, t^d_i, t^r_i)$.

In order to specify the desired properties of such an auction mechanism it is necessary to first define the payoffs each LSE will receive. Therefore we define the payoff of LSE $i$, given generator decisions $(x, z)$, generation realization $w$ and bids $\hat{\sigma}$, as

$$\pi_i(x, z, w; \hat{\sigma}) = v_i x_i - (v_i + c_i) z_i - t^d_i + t^r_i. \tag{2.5}$$

The first goal in designing our auction mechanism will be to ensure LSE participation.
We assume that this can be accomplished if no LSE can expect to receive a negative payoff, and say that a mechanism achieves individual rationality (IR) in expectation when

$$\mathbb{E}_W[\pi_i(x, z, W; (\sigma_i, \hat{\sigma}_{-i}))] \ge 0 \quad \forall i, \hat{\sigma}_{-i}, \tag{2.6}$$

where $\hat{\sigma}_{-i} = (\hat{\sigma}_1, \ldots, \hat{\sigma}_{i-1}, \hat{\sigma}_{i+1}, \ldots, \hat{\sigma}_N)$ gives the bid profile aside from $\hat{\sigma}_i$. (2.6) states that each LSE can expect to receive a nonnegative payoff when participating in the mechanism.

Given that the LSEs choose to participate in the mechanism, it is further desired that they bid truthfully. This will occur if it is in their interest to do so, and we say that a mechanism achieves incentive compatibility (IC) in expectation when

$$\mathbb{E}_W[\pi_i(x, z, W; (\sigma_i, \hat{\sigma}_{-i}))] \ge \mathbb{E}_W[\pi_i(x, z, W; (\hat{\sigma}_i, \hat{\sigma}_{-i}))] \tag{2.7}$$

for all $i$ and $\hat{\sigma}_{-i}$. (2.7) states that for each LSE $i$, regardless of the bids of the other LSEs, bidding truthfully yields at least as high an expected value as any other strategy.

Finally, given that LSEs choose to participate truthfully, it is desired that the auction mechanism makes LSE selections which maximize the expected social welfare in (2.3). We say that an auction mechanism is efficient if it selects an allocation such that (2.3) is maximized.

Again, the generator needs to know the valuations and costs of the LSEs in order to select the best set of LSEs, in the sense of social welfare maximization. As the LSEs are strategic, they may not truthfully reveal their valuations and costs. Thus the goal of this work is to design an auction mechanism which is individually rational and incentive compatible in expectation, as well as efficient. As the mechanism to be designed is one-sided, we do not attempt to achieve budget balance.

2.3 The Generator's Problem

Given that our mechanism will be given in the form $\Gamma = (\mathcal{I}, \mathcal{I}_w, t^d_i, t^r_i)$, we now give a reformulation of the generator's problem.
In stage 1, it decides which LSEs to allocate to, and in stage 2, with knowledge of the generation realization, it decides which of the LSEs allocated in stage 1 need to be deselected (and compensated) when there is a generation shortfall. Defining $x$ as previously, we have that

$$\mathcal{I} = \{ i : x_i = 1 \}. \tag{2.8}$$

The number of LSEs selected will be denoted $n = |\mathcal{I}|$. Let $X_{-i}$ denote the set of allocations for which $x_i = 0$. Note that functions of $x$ can be considered equivalently as functions of $\mathcal{I}$, so that occasionally we will interchange them as arguments. For example, $Q(x, w; \sigma) \equiv Q(\mathcal{I}, w; \sigma)$.

Upon realization of the value of $W$, the generator makes a second stage selection of the final subset of LSEs $\mathcal{I}_w \subseteq \mathcal{I}$ to receive the available units, where the subscript $w$ reflects that $W$ has assumed the value $w \in \{0, 1, \ldots, \bar{w}\}$, with $\bar{w}$ the maximum generation amount. This leaves a subset $\bar{\mathcal{I}}_w = \mathcal{I} \setminus \mathcal{I}_w$ of deselected LSEs. Therefore, defining $z$ as in the previous discussion, we have

$$\mathcal{I}_w = \{ i \in \mathcal{I} : z_i = 0 \}. \tag{2.9}$$

Let $\mathcal{N}$ denote the set $\{1, \ldots, N\}$. Let $\hat{\gamma}_i := \hat{v}_i + \hat{c}_i$. Then the generator's two stage problem can be rewritten as

$$\max_{\mathcal{I} \subseteq \mathcal{N}} \left\{ \sum_{i \in \mathcal{I}} \hat{v}_i - \sum_{w=0}^{|\mathcal{I}|-1} p_w \min_{\bar{\mathcal{I}}_w \subseteq \mathcal{I}} \Big\{ \sum_{i \in \bar{\mathcal{I}}_w} \hat{\gamma}_i \;:\; |\bar{\mathcal{I}}_w| = |\mathcal{I}| - w \Big\} \right\} \tag{2.10}$$

Note that assuming truthful bids, the inner minimization problem in (2.10) is equivalent to (2.1), and the outer maximization problem is equivalent to (2.4). Therefore, in solving (2.4), the generator determines its selection $\mathcal{I}$.

Given selection $\mathcal{I}$ with $n = |\mathcal{I}|$, only when the generation level $w < n$ will the minimum cost in (2.1) be positive. Otherwise, each LSE the generator made a commitment to in stage 1 will receive a unit, yielding $Q(x, w; \hat{\sigma}) = 0$. When the realization $w < n$, the generator will solve (2.1) to determine which LSEs are to be deselected, with decision $z$ and associated $\mathcal{I}_w$.

Note that the selection of a particular $\mathcal{I}$ induces a ranking on the selected LSEs. For simplicity, assume that the values of $\hat{\gamma}$ are unique. Suppose that $n = |\mathcal{I}|$ LSEs have been selected in the first stage, and $w < n$ units are generated.
The deselected set $\bar{\mathcal{I}}_w = \mathcal{I} \setminus \mathcal{I}_w$ will thus include the $n - w$ LSEs selected in $\mathcal{I}$ with the lowest $\hat{\gamma}_i$. This implies that for each LSE $i \in \mathcal{I}$ there is a maximum level of generation at which they will not receive a unit. For example, the LSE with rank 1 will not receive a unit if zero units are generated. The LSE with rank 2 will not receive a unit if one or fewer units are generated, and so on. If this maximum level is denoted $w_i$ for $i \in \mathcal{I}$, then let $r_i(\mathcal{I}) := r_i := w_i + 1$ denote the rank of LSE $i$ for $i \in \mathcal{I}$. We use the notation $(\cdot)$ to indicate indexing with reference to the selection $\mathcal{I}$. For example, the reported valuation of the LSE with rank 1 under $\mathcal{I}$ is denoted $\hat{v}_{(1)}$. Thus, when $|\mathcal{I}| = n$,

$$\hat{\gamma}_{(1)} > \hat{\gamma}_{(2)} > \cdots > \hat{\gamma}_{(n-1)} > \hat{\gamma}_{(n)}. \tag{2.11}$$

For $j \notin \mathcal{I}$ let $r_j := \bar{w} + 1$, where $\bar{w}$ is the maximum generation level. We will say that LSE $(1)$ occupies the highest rank and LSE $(n)$ occupies the lowest rank, given $\mathcal{I}$.

2.4 Stochastic VCG Mechanism for Selling Random Power

We introduce a stochastic VCG mechanism for selling random power. To specify our mechanism $\Gamma$ we need to specify selection schemes $\mathcal{I}$ and $\mathcal{I}_w$ and payment schemes $t^d_i$ and $t^r_i$. The solutions to (2.4) and (2.1) give $\mathcal{I}$ and $\mathcal{I}_w$. We now give the payment schemes $t^r_i$ and $t^d_i$.

First, define LSE $i$'s utility under allocation $x$ given production level $w$ as

$$u_i(\mathcal{I}, \hat{\sigma}, w) := u_i(\hat{\sigma}, w) = x_i \big( v_i - \gamma_i 1_{\{w < r_i\}} \big). \tag{2.12}$$

Note that the valuation and cost terms in (2.12) are the true values for LSE $i$, not the reported ones. LSE $i$'s payoff is given as

$$\pi_i(\hat{\sigma}, w) = u_i(\hat{\sigma}, w) - t_i(\hat{\sigma}, w), \tag{2.13}$$

where $t_i(\hat{\sigma}, w) = t^d_i(\hat{\sigma}) - t^r_i(\hat{\sigma}, w)$. Taking expectation over all generation scenarios gives

$$\begin{aligned} \mathbb{E}[\pi_i(\hat{\sigma}, W)] &= \mathbb{E}[u_i(\hat{\sigma}, W)] - \mathbb{E}[t_i(\hat{\sigma}, W)] \\ &= \mathbb{E}\big[ x_i \big( v_i - \gamma_i 1_{\{W < r_i\}} \big) \big] - \mathbb{E}[t_i(\hat{\sigma}, W)] \\ &= x_i \Big( v_i - \gamma_i \sum_{w=0}^{r_i - 1} p_w \Big) - \mathbb{E}[t_i(\hat{\sigma}, W)]. \end{aligned}$$

Now, to specify the payment rules $t^r_i$ and $t^d_i$ we need a couple of additional definitions. First, fixing a first stage decision $\mathcal{I}$ and LSE $(i) \in \mathcal{I}$, for $j \notin \mathcal{I}$ denote

$$\theta^i_j := \hat{v}_j - \hat{\gamma}_j p_0 - \sum_{w=1}^{i-1} p_w \min(\hat{\gamma}_{(w)}, \hat{\gamma}_j) - \sum_{w=i}^{n-1} p_w \min(\hat{\gamma}_{(w+1)}, \hat{\gamma}_j) \tag{2.14}$$

and $j(i) := \arg\max_{j \notin \mathcal{I}} \theta^i_j$. We assume that $j(i)$ is unique. $\theta^i_j$ represents the contribution that a particular LSE $j \notin \mathcal{I}$ would make to the expected social welfare if it were selected along with LSEs $\mathcal{I} \setminus \{(i)\}$. If $\theta^i_j > 0$ for at least one LSE $j \notin \mathcal{I}$, then $j(i)$ is the LSE that is selected if LSE $(i)$ is disregarded, forming selection $\mathcal{I} \setminus \{(i)\} \cup \{j(i)\}$. Denote $\theta^i_{j(i)} := \theta^i$, and similarly $\hat{v}_{j(i)} := v^i$ and $\hat{\gamma}_{j(i)} := \gamma^i$.

The motivation for $j(i)$ is as follows. Having fixed $\mathcal{I}$, we will want to determine the externality that LSE $(i)$ imposes on the generator and other LSEs. This externality will be a function of the best selection possible given that $x_{(i)} = 0$, i.e., the best solution which disregards LSE $(i)$. Denote this selection $\mathcal{I}_{-(i)}$. As will be shown, $\mathcal{I}_{-(i)}$ will select all other LSEs in $\mathcal{I}$, i.e., $\mathcal{I} \setminus \{(i)\}$, as well as up to one additional LSE $j \notin \mathcal{I}$, depending upon whether $\theta^i > 0$. This LSE is $j(i)$.

As described in Section 2.3, the selection $\mathcal{I}_{-(i)}$ will induce a ranking on the included LSEs. The notation $(\cdot)_{-(i)}$ will be used to refer to this ranking. In particular, $r_{j(i)}(\mathcal{I}_{-(i)}) := r^i$ gives the rank of LSE $j(i)$ amongst LSEs $\mathcal{I}_{-(i)}$.

The payments $t^d_{(i)}$ and $t^r_{(i)}$ for LSE $(i) \in \mathcal{I}$ will depend upon $\theta^i$ and $r^i$ and are specified in Table 2.1. A positive $t^d_{(i)}$ indicates a payment from LSE $(i)$ to the generator in the DA market, and a positive $t^r_{(i)}$ indicates a payment from the generator to LSE $(i)$ in the RT market. In either case negative values indicate transfers in the opposite direction. If LSE $j \notin \mathcal{I}$ then $t^d_j = t^r_j = 0$.

Table 2.1 Payment function for LSE $(i) \in \mathcal{I}$

Case | $t^d_{(i)}(\hat{\sigma})$ | $t^r_{(i)}(\hat{\sigma}, w)$
1. $\theta^i \le 0$ | $0$ | $0$ for $0 \le w \le i-1$; $-\hat{\gamma}_{(w+1)}$ for $i \le w \le n-1$; $0$ for $w \ge n$
2. $\theta^i > 0$, $r^i > i$ | $v^i$ | $\gamma^i$ for $0 \le w \le i-1$; $\gamma^i - \hat{\gamma}_{(w+1)}$ for $i \le w \le r^i - 1$; $0$ for $w \ge r^i$
3. $\theta^i > 0$, $r^i \le i$ | $v^i$ | $\gamma^i$ for $0 \le w \le r^i - 1$; $\hat{\gamma}_{(w)}$ for $r^i \le w \le i-1$; $0$ for $w \ge i$

Case 1: ($\theta^i \le 0$) Since $\theta^i \le 0$, adding any LSE $j \notin \mathcal{I}$ to the selection $\mathcal{I} \setminus \{(i)\}$ will not increase the expected social welfare.
Thus LSE $(i)$ does not cause an externality in the first stage and $t^d_{(i)} = 0$. In the second stage, when $w \le i - 1$, LSE $(i)$ is deselected and $t^r_{(i)} = 0$. If $i \le w \le n - 1$, then LSE $(i)$ receives a unit, but there is a shortfall and LSE $(i)$ pays $\hat{\gamma}_{(w+1)}$, the sum of value lost and cost incurred by LSE $(w+1)$, the "first" deselected LSE. When $w \ge n$, LSE $(i)$ and all other selected LSEs receive a unit and $t^r_{(i)} = 0$.

[Figure 2.1: Case 1 RT transfer for $i = 3$ and $n = 6$. Horizontal axis: $w$; vertical axis: $t^r_{(3)}$.]

Case 2: ($\theta^i > 0$, $r^i > i$) $\theta^i > 0$ indicates that if LSE $(i)$ were not present, then some other LSE $j \notin \mathcal{I}$ could be added to the selection $\mathcal{I} \setminus \{(i)\}$ to increase the expected SW. In particular, LSE $j(i)$ could be added to maximize this increase, so in the DA market LSE $(i)$ pays $t^d_{(i)} = v^i$, the value term that LSE $j(i)$ would have added to the expected SW. If $w \le i - 1$, then LSE $(i)$ does not receive a unit, but receives compensation $\gamma^i$. If $i \le w \le r^i - 1$, then LSE $(i)$ receives a unit, and the RT transfer is $t^r_{(i)} = \gamma^i - \hat{\gamma}_{(w+1)}$, which amounts to a payment by LSE $(i)$. To see that this difference is nonpositive, observe that given $w \in [i, r^i - 1]$, since $\hat{\gamma}_{(j)}$ is decreasing in $j$ by (2.11), $\hat{\gamma}_{(w+1)} \ge \hat{\gamma}_{(r^i)} \ge \gamma^i$. The last inequality here follows from the case assumption $r^i > i$, which implies that LSE $(i+1)$ has rank $i$ in selection $\mathcal{I} \setminus \{(i)\} \cup \{j(i)\}$, i.e., $r_{(i+1)}(\mathcal{I}_{-(i)}) = i < r^i$.

If $w \ge r^i$, then $t^r_{(i)} = 0$. This is because while LSE $(i)$ is omitted, they are replaced by LSE $j(i)$, so that all LSEs with rank higher than $r^i$ in selection $\mathcal{I}$ will have the same rank in selection $\mathcal{I}_{-(i)}$. Therefore the presence of LSE $(i)$ in the RT market does not affect whether or not they experience shortfall, i.e., LSE $(i)$ imposes no externality on them.

[Figure 2.2: Case 2 RT transfer for $i = 3$, $n = 6$ and $r^3 = 4$. Horizontal axis: $w$; vertical axis: $t^r_{(3)}$.]

[Figure 2.3: Case 3 RT transfer for $i = 3$, $n = 6$ and $r^3 = 2$. Horizontal axis: $w$; vertical axis: $t^r_{(3)}$.]
Case 3: ($\theta^i > 0$, $r^i \le i$) The scheme and explanation for $t^d_{(i)}$ in this case is the same as in the previous case. When $0 \le w \le r^i - 1 < i$, LSE $(i)$ receives no unit, but they are compensated with a payment $t^r_{(i)} = \gamma^i$. Alternatively, this may be interpreted as LSE $(i)$ receiving a credit of $\gamma^i$, as their presence prevents the selection of LSE $j(i)$, and therefore the loss of $\gamma^i$ when $w < r^i$.

If $r^i \le w \le i - 1$, then LSE $(i)$ again receives no unit and is compensated with $t^r_{(i)} = \hat{\gamma}_{(w)}$. Again this can be interpreted in terms of hypothetical loss. Since $w \ge r^i$, LSE $j(i)$ would receive a unit. Since $r^i \le i$, LSE $j(i)$ would force LSEs $(r^i)$ through $(i-1)$ to one higher rank, i.e., $r_{(w)}(\mathcal{I}_{-(i)}) = w + 1$ for $r^i \le w \le i - 1$. Therefore, if $r^i \le w \le i - 1$, LSE $(w)$ would be the "first" LSE deselected under $\mathcal{I}_{-(i)}$, and LSE $(i)$ imposes a positive externality of $\hat{\gamma}_{(w)}$.

Finally, if $w \ge i$ then LSE $(i)$ receives a unit and $t^r_{(i)} = 0$. In this case, no RT transfer occurs between LSE $(i)$ and the generator because, as in Case 2, LSE $(i)$ is omitted and then replaced in $\mathcal{I}_{-(i)}$, with $r^i$ […] $0$, and LSE 1 contributes $\hat{v}_1 - \hat{\gamma}_1 p_0 = 3 - (2)(1/2) > 0$. However, if either LSE 1 or 2 were not selected, and LSE 3 were selected, then it would achieve rank 2 and make a contribution of

$$\theta^1 = \theta^2 = \hat{v}_3 - \hat{\gamma}_3 (p_0 + p_1) = 13/32 - (1/2)(3/4) > 0$$

to the expected social welfare. Thus LSE 1 and LSE 2 fall under Cases 2 and 3, respectively, of Table 2.1, and their DA payments are $t^d_1 = t^d_2 = 13/32$. For LSE 1, $t^r_1(\hat{\sigma}, 0) = 1/2$, $t^r_1(\hat{\sigma}, 1) = 1/2 - 1 = -1/2$ and $t^r_1(\hat{\sigma}, w) = 0$ for $w \ge 2$. For LSE 2, $t^r_2(0) = t^r_2(1) = 1/2$, and $t^r_2(w) = 0$ for $w \ge 2$.

We call the mechanism consisting of selection scheme (2.8), deselection scheme (2.9) and the payments listed in Table 2.1 for $i \in \mathcal{I}$, with $t^d_i := 0$, $t^r_i := 0$ for $i \notin \mathcal{I}$, a Stochastic VCG Mechanism for Random goods (SVCG-RANDOM).

Theorem 1. The SVCG-RANDOM mechanism is individually rational and incentive compatible in expectation, as well as efficient.
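The three cases of the payment table can be checked mechanically against the numbers quoted in the example above. The following is a minimal sketch, not the author's implementation: the function name `svcg_payments` and its argument names are hypothetical, ranks are 1-based as in the text, and `v_alt`, `gamma_alt` and `r_alt` stand for the reported value, the reported $\hat{\gamma}$ and the rank of the replacement LSE $j(i)$.

```python
from fractions import Fraction as F


def svcg_payments(i, gamma_ranked, w, theta, v_alt=None, gamma_alt=None, r_alt=None):
    """Return (t_d, t_r) for the LSE with rank i at generation level w.

    gamma_ranked[k - 1] is the reported gamma of the LSE with rank k, so the
    list is strictly decreasing. theta, v_alt, gamma_alt and r_alt describe
    the replacement LSE j(i); the last three are only used when theta > 0.
    """
    n = len(gamma_ranked)

    def gam(k):
        # 1-based lookup of the reported gamma of the LSE with rank k
        return gamma_ranked[k - 1]

    if theta <= 0:                       # Case 1: no replacement LSE exists
        t_d = 0
        t_r = -gam(w + 1) if i <= w <= n - 1 else 0
    elif r_alt > i:                      # Case 2: replacement ranks below (i)
        t_d = v_alt
        if w <= i - 1:
            t_r = gamma_alt
        elif w <= r_alt - 1:
            t_r = gamma_alt - gam(w + 1)
        else:
            t_r = 0
    else:                                # Case 3: replacement ranks at or above (i)
        t_d = v_alt
        if w <= r_alt - 1:
            t_r = gamma_alt
        elif w <= i - 1:
            t_r = gam(w)
        else:
            t_r = 0
    return t_d, t_r


# Numbers quoted in the example: gamma_(1) = 2, gamma_(2) = 1, and the
# replacement LSE 3 has value 13/32, gamma 1/2 and rank 2.
gamma_ranked = [F(2), F(1)]
alt = dict(theta=F(1, 32), v_alt=F(13, 32), gamma_alt=F(1, 2), r_alt=2)
lse1_rt = [svcg_payments(1, gamma_ranked, w, **alt)[1] for w in range(3)]
lse2_rt = [svcg_payments(2, gamma_ranked, w, **alt)[1] for w in range(3)]
print(lse1_rt, lse2_rt)
```

With these inputs the RT transfers for LSE 1 are 1/2, -1/2, 0 and for LSE 2 are 1/2, 1/2, 0 at $w = 0, 1, 2$, matching the values stated in the example.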
This appears to be the first two stage mechanism for random goods such as renewable energy. While we focus on LSEs wanting single units, it can be extended to the multi-unit domain and bids. IR and IC properties are achieved in expectation ex ante. Ex post IR and IC are unlikely to be achievable.

2.5 Proof of Theorem 1

Section 2.5.2 presents a proof of the claimed mechanism properties in each of the three cases listed in Table 2.1. The proof relies on a lemma, from Section 2.5.1, concerning the optimal selection that the generator makes when $x_{(i)} = 0$, denoted $\mathcal{I}_{-(i)}$.

2.5.1 The Selection $\mathcal{I}_{-(i)}$

In order to show that the proposed selection and payment schemes constitute a mechanism achieving the properties stated in Theorem 1, an explicit form of $\mathcal{I}_{-(i)}$ for $(i) \in \mathcal{I}$ is needed. Below, Lemma 3 enumerates the possible forms of $\mathcal{I}_{-(i)}$, as well as conditions for when each form applies. Lemma 2 is necessary in the proof of Lemma 3. Proofs of both, as well as a more detailed proof of Theorem 1, may be found in the appendix of [23].

Lemma 2. For any LSE $j \notin \mathcal{I}$ with $n = |\mathcal{I}|$,

$$\hat{v}_j - \hat{\gamma}_j p_0 \le \sum_{w=1}^{n} \min(\hat{\gamma}_j, \hat{\gamma}_{(w)}) p_w \le \hat{\gamma}_j \sum_{w=1}^{n} p_w. \tag{2.15}$$

Lemma 3. For $(i) \in \mathcal{I}$, the optimal selection given $x_{(i)} = 0$, $\mathcal{I}_{-(i)}$, is

$$\mathcal{I}_{-(i)} = \begin{cases} \mathcal{I} \setminus \{(i)\} & \text{if } \theta^i \le 0 \\ \mathcal{I} \setminus \{(i)\} \cup \{j(i)\} & \text{if } \theta^i > 0 \end{cases} \tag{2.16}$$

2.5.2 Proof of Theorem 1

Let $x_{-i}$ denote the first stage allocation corresponding to the solution of (2.10) when $x_i = 0$. Let $r^{-i}_j$ give the ranking of LSE $j$ under selection $x_{-i}$, and $V_{-i}$ denote the maximum expected SW achieved when $x_i = 0$, given bids $\hat{\sigma}$.

For some LSE $i$, fix $\hat{\sigma}_{-i}$, the bids of the other LSEs, and assume that LSE $i$ bids truthfully, so that the bid vector is $\hat{\sigma} = (\sigma_i, \hat{\sigma}_{-i})$. Also assume that given this $\hat{\sigma}$, the generator makes selection $\mathcal{I}$ with $n = |\mathcal{I}|$, and that LSE $i \in \mathcal{I}$ with rank $i$, so that in this context $(i) = i$. Let $V^*$ be the optimal expected SW associated with selection $\mathcal{I}$.
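For small instances, the quantities $V^*$ and $V_{-i}$ used in the proof can be computed by brute-force enumeration of the generator's problem (2.10). The sketch below is illustrative only and assumes 0-based LSE indices. The bid values $\hat{v}_1 = 3$, $\hat{\gamma}_1 = 2$, $\hat{\gamma}_2 = 1$, $\hat{v}_3 = 13/32$, $\hat{\gamma}_3 = 1/2$, $p_0 = 1/2$ and $p_1 = 1/4$ are those quoted in the example of Section 2.4, while $\hat{v}_2 = 2$ and $p_2 = 1/4$ are hypothetical choices made here to complete the instance. The function names are also assumptions.

```python
from fractions import Fraction as F
from itertools import combinations


def expected_sw(selection, v_hat, gamma_hat, p):
    """Objective of (2.10) for a fixed selection of LSE indices."""
    n = len(selection)
    gammas = sorted(gamma_hat[i] for i in selection)   # ascending order
    value = sum(v_hat[i] for i in selection)
    # With w < n units, the n - w selected LSEs with the lowest gamma are deselected.
    shortfall = sum(p[w] * sum(gammas[: n - w]) for w in range(min(n, len(p))))
    return value - shortfall


def best_selection(v_hat, gamma_hat, p, excluded=()):
    """Enumerate all selections avoiding `excluded`; return (value, selection)."""
    lses = [i for i in range(len(v_hat)) if i not in excluded]
    candidates = (c for k in range(len(lses) + 1) for c in combinations(lses, k))
    return max(((expected_sw(c, v_hat, gamma_hat, p), c) for c in candidates),
               key=lambda pair: pair[0])


v_hat = [F(3), F(2), F(13, 32)]        # v_hat[1] = 2 is a hypothetical value
gamma_hat = [F(2), F(1), F(1, 2)]
p = [F(1, 2), F(1, 4), F(1, 4)]        # p[2] = 1/4 is a hypothetical value

v_star, selection = best_selection(v_hat, gamma_hat, p)
v_minus_first, selection_minus_first = best_selection(v_hat, gamma_hat, p, excluded=(0,))
print(v_star, selection, v_minus_first, selection_minus_first)
```

Under these assumed inputs the first two LSEs are selected, and excluding the first LSE brings the third one into the selection. The resulting gain over selecting the second LSE alone is 1/32, which agrees with the value $13/32 - (1/2)(3/4)$ of $\theta^1$ in the example. Enumeration is exponential in $N$ and is meant only as a check on small cases.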
Case 1: ($\theta^i \le 0$) Since LSE $i \in \mathcal{I}$, $x_i = 1$ and $\mathbb{E}[\pi_i(\mathcal{I}, W; \hat{\sigma})] = v_i - \gamma_i \mathbb{E}[1\{W$ […] $i)$ $\mathbb{E}[\pi_i(\mathcal{I}, W; \hat{\sigma})] = v_i - \gamma_i \mathbb{E}[1\{W$ […] $0$ for all $k \in \mathcal{L}_i$ and $i \in \mathcal{N}$.

Together, the renewable outputs of the LSEs form the random vector $W$. This vector has associated probability mass function $p = (p_0, \ldots, p_w)$, assumed to be common knowledge amongst all market participants, with the probability of scenario (realization) $W = w \in \mathbb{R}^{|\mathcal{L}|}_+$ given by $p_w$, where $\mathcal{L}$ is the set of all LSEs in the network. In all scenarios and at all buses, renewable generation incurs zero marginal cost. Given realized renewable generation $W = w$, we denote the quantity of renewable energy produced by LSE $(i,k)$ as $w_k$ for all $k \in \mathcal{L}_i$ and $i \in \mathcal{N}$.

At each node $i \in \mathcal{N}$, LSE $i$ may purchase energy in both the DA and RT markets, at nodal prices $P_{1,i}$ and $P_{2,i}(w)$ (given $W = w$), respectively, where $P_{2,i} : \mathbb{R}^{|\mathcal{L}|}_+ \to \mathbb{R}$. Note that in our finite scenario setting, each $P_{2,i}(\cdot)$ may be considered as a finite length vector. We denote the quantities purchased by LSE $(i,k)$ in the first and second stages, given $W = w$ and $i \in \mathcal{N}$, as $y^L_{1,k}$ and $y^L_{2,k}(w)$ for all $k \in \mathcal{L}_i$.

Each LSE $i$'s constituent consumers are assumed to participate in demand response programs, where they are compensated for reducing their consumption at the LSE's request. In more detail, having secured first stage energy quantity $y^L_{1,k}$ and observed $W = w$, LSE $i$ may request its consumer population to reduce their aggregate energy consumption by amount $x^L_{2,k}(w)$, incurring consumer compensation cost $c_{\mathrm{dr},k}(x^L_{2,k}(w))$.

While the demand response programs are intended to provide LSEs with a cushion against second stage energy prices, in cases of extreme underproduction in renewables, LSEs may effectively schedule a shortfall in energy offered to their consumers by selecting energy and demand reduction quantities in the RT market which sum to less than the residual demand $D_k - y^L_{1,k}$.
Such an action incurs blackout cost $c_{\mathrm{bo},k}(z^L_{2,k}(w))$, with $z^L_{2,k}(w) = D_k - w_k - y^L_{1,k} - y^L_{2,k}(w) - x^L_{2,k}(w)$.

It is assumed that each LSE $i$ is price taking, so that its consumption decisions $y^L_{1,k}$ and $y^L_{2,k}(w)$ cannot affect energy prices in either market stage. Thus, given $P_{1,i}$ and $P_{2,i}(\cdot)$, renewable generation vector $W = w$, as well as service decisions $(y^L_{1,k}, y^L_{2,k}(w), x^L_{2,k}(w), z^L_{2,k}(w))$, the utility enjoyed by LSE $(i,k)$ is

$$\pi^L_{i,k}(y^L_{1,k}, y^L_{2,k}(w), x^L_{2,k}(w), z^L_{2,k}(w)) := -P_{1,i}\, y^L_{1,k} - P_{2,i}(w)\, y^L_{2,k}(w) - c_{\mathrm{dr},k}(x^L_{2,k}(w)) - c_{\mathrm{bo},k}(z^L_{2,k}(w)). \tag{3.1}$$

Each LSE $(i,k)$ seeks to maximize (3.1) with its first and second stage decisions. Specifically, given $y^L_{1,k}$, renewable generation $W = w$, and second stage nodal price schedule $P_{2,i}$, LSE $(i,k)$'s second stage optimization problem is

$$(\text{LSE2}_{i,k}) \quad \max_{y^L_{2,k}(w),\, x^L_{2,k}(w),\, z^L_{2,k}(w)} \; -P_{2,i}(w)\, y^L_{2,k}(w) - c_{\mathrm{dr},k}(x^L_{2,k}(w)) - c_{\mathrm{bo},k}(z^L_{2,k}(w)) \tag{3.2}$$
$$\text{s.t.} \quad y^L_{1,k} + y^L_{2,k}(w) + x^L_{2,k}(w) + z^L_{2,k}(w) \ge D_k - w_k \tag{3.3}$$
$$y^L_{2,k}(w) \ge 0, \quad x^L_{2,k}(w) \ge 0, \quad z^L_{2,k}(w) \ge 0 \tag{3.4}$$

Note that constraint (3.3) is an inequality in order to maintain feasibility when $D_k - y^L_{1,k} - w_k < 0$, i.e., when renewable generation exceeds residual demand $D_k - y^L_{1,k}$.

Denote as $\pi^L_{2,k}(y^L_{1,k}; w, P_{2,i})$ the maximum utility achievable in $(\text{LSE2}_{i,k})$, given LSE $(i,k)$'s first stage decision, realized renewable generation and prices. Then, given prices $P_{1,i}$ and $P_{2,i}$, LSE $(i,k)$'s first stage optimization problem is to maximize the sum of its first stage utility $-P_{1,i}\, y^L_{1,k}$ and its expected maximum second stage utility with respect to uncertainty in renewable generation:

$$(\text{LSE1}_{i,k}) \quad \max_{y^L_{1,k}} \; -P_{1,i}\, y^L_{1,k} + \mathbb{E}[\pi^L_{2,k}(y^L_{1,k}; W, P_{2,i})] \tag{3.5}$$
$$\text{s.t.} \quad y^L_{1,k} \ge 0. \tag{3.6}$$

Each generator is equipped with two different sources of power generation.
First, generator $(i,k)$ owns a primary, dispatchable, nonrenewable power station, which can be scheduled to produce quantity $y^G_{1,k} \in \mathbb{R}_+$ at cost $c_{1,k}(y^G_{1,k})$. This primary plant is assumed to be inflexible for the purposes of our market, i.e., once scheduled in the first stage its generation level must remain fixed in the second stage.

Each generator $(i,k)$ also owns and operates a secondary or ancillary station, e.g., a gas turbine, which can be dispatched quickly in the second stage market. Specifically, having scheduled its primary plant to produce energy amount $y^G_{1,k}$ and observed renewable generation $W = w$, the generator may schedule secondary generation quantity $y^G_{2,k}(w) \in \mathbb{R}_+$, incurring cost $c_{2,k}(y^G_{2,k}(w))$. Throughout, we assume that dispatchable or renewable energy produced in excess of consumer demand may be disposed of at zero cost, or placed in a separate spot market not considered here.

Generator $(i,k)$ is compensated for first stage generation $y^G_{1,k}$ at rate $P_{1,i}$ and, given $W = w$, compensated for second stage generation $y^G_{2,k}(w)$ at rate $P_{2,i}(w)$. As with the LSEs, we assume that each generator $(i,k)$ is price taking, so that given prices $P_{1,i}$, $P_{2,i}$ and $W = w$, along with dispatch decisions $y^G_{1,k}$ and $y^G_{2,k}(w)$, generator $(i,k)$ enjoys profit

$$\pi^G_{i,k}(y^G_{1,k}, y^G_{2,k}(w)) := P_{1,i}\, y^G_{1,k} - c_{1,k}(y^G_{1,k}) + P_{2,i}(w)\, y^G_{2,k}(w) - c_{2,k}(y^G_{2,k}(w)). \tag{3.7}$$

Given $P_{2,i}$ and renewable generation scenario $W = w$, generator $(i,k)$ maximizes its second stage profit by solving

$$(\text{GEN2}_{i,k}) \quad \max_{y^G_{2,k}(w)} \; P_{2,i}(w)\, y^G_{2,k}(w) - c_{2,k}(y^G_{2,k}(w)) \tag{3.8}$$
$$\text{s.t.} \quad y^G_{2,k}(w) \ge 0. \tag{3.9}$$

Let $\pi^G_{2,k}(w, P_{2,i})$ denote generator $(i,k)$'s maximum achievable second stage profit, given $W = w$ and $P_{2,i}$. Then, in the first stage, generator $(i,k)$ solves

$$(\text{GEN1}_{i,k}) \quad \max_{y^G_{1,k}} \; P_{1,i}\, y^G_{1,k} - c_{1,k}(y^G_{1,k}) + \mathbb{E}[\pi^G_{2,k}(W, P_{2,i})] \tag{3.10}$$
$$\text{s.t.} \quad y^G_{1,k} \ge 0. \tag{3.11}$$

Note here that the expected second stage profit $\mathbb{E}[\pi^G_{2,k}(W, P_{2,i})]$ is a constant when optimizing over $y^G_{1,k}$, reflecting the fact that from the view of each generator $(i,k)$, the two market stages are completely decoupled. We separate the two generator optimization problems to emphasize that the generator observes $W = w$ before it decides $y^G_{2,k}(w)$.

Finally, the ISO is responsible for enforcing the safe operation of the power grid, which we describe using the DC power flow model [71]. The model characterizes the network lines with susceptance matrix $B$, where $B_{ij} = B_{ji}$ gives the susceptance of the line connecting nodes $i$ and $j$. Denote the voltage phase angle at node $i$ in the DA stage as $\theta_{1,i}$, and in the RT stage, given scenario $w$, as $\theta_{2,i}(w)$. Then, the active power flows from node $i$ to node $j$ in each stage are

$$f_{1,ij} = B_{ij}(\theta_{1,i} - \theta_{1,j}), \qquad f_{2,ij}(w) = B_{ij}(\theta_{2,i}(w) - \theta_{2,j}(w)),$$

and the power balance equations at node $i$ in each stage are

$$\sum_{k \in \mathcal{G}_i} y^G_{1,k} - \sum_{k \in \mathcal{L}_i} y^L_{1,k} = \sum_j f_{1,ij}, \qquad \sum_{k \in \mathcal{G}_i} y^G_{2,k}(w) - \sum_{k \in \mathcal{L}_i} y^L_{2,k}(w) = \sum_j f_{2,ij}(w) - \sum_j f_{1,ij}. \tag{3.12}$$

Note that summing each equality in (3.12) over all nodes $i$, and using $f_{ij} = -f_{ji}$, gives that

$$\sum_i \sum_{k \in \mathcal{G}_i} y^G_{1,k} = \sum_i \sum_{k \in \mathcal{L}_i} y^L_{1,k} \quad \text{and} \quad \sum_i \sum_{k \in \mathcal{G}_i} y^G_{2,k}(w) = \sum_i \sum_{k \in \mathcal{L}_i} y^L_{2,k}(w), \quad \forall w.$$

Letting $f^{\max}_{ij} = f^{\max}_{ji} \ge 0$ denote the flow limit of line $(i,j)$, the ISO also ensures that power flows do not exceed line capacities in either market stage:

$$f_{1,ij} \le f^{\max}_{ij} \quad \text{and} \quad f_{2,ij}(w) \le f^{\max}_{ij}, \quad \forall i, j, w.$$

In the following sections we will introduce a sequential competitive equilibrium definition for our setting. In order to assess the welfare properties of the allocations included in such equilibria, we specify here a two-stage social planner's problem (SPP) which corresponds to our two settlement market model. The social planner seeks to maximize the aggregate welfare of all market participants.
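The DC power flow relations used by the ISO can be illustrated numerically. The sketch below is not taken from the thesis: the three-node susceptance matrix, the phase angles and the function names are hypothetical values chosen for illustration. It computes the line flows $f_{ij} = B_{ij}(\theta_i - \theta_j)$, the nodal net injections $\sum_j f_{ij}$ appearing on the right-hand side of the first stage balance in (3.12), and checks a set of line limits.

```python
def dc_flows(B, theta):
    """Line flows f[i][j] = B[i][j] * (theta[i] - theta[j]) for symmetric B."""
    n = len(theta)
    return [[B[i][j] * (theta[i] - theta[j]) for j in range(n)] for i in range(n)]


def net_injections(flows):
    """Net injection at node i (generation minus load), i.e., sum_j f[i][j]."""
    return [sum(row) for row in flows]


def within_limits(flows, f_max):
    """True if every directed flow respects the symmetric limit f_max[i][j]."""
    n = len(flows)
    return all(flows[i][j] <= f_max[i][j] for i in range(n) for j in range(n))


# Hypothetical three-node network: susceptances, phase angles (radians), limits.
B = [[0.0, 10.0, 5.0],
     [10.0, 0.0, 8.0],
     [5.0, 8.0, 0.0]]
theta = [0.0, -0.02, -0.05]
f_max = [[0.3] * 3 for _ in range(3)]

flows = dc_flows(B, theta)
injections = net_injections(flows)
print(flows[0][1], injections, within_limits(flows, f_max))
```

Because $B$ is symmetric, the flows are antisymmetric and the net injections sum to zero across nodes, which is the reason total generation equals total load in each stage.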
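Given renewable generation scenario $W = w$, the aggregate welfare is defined as the sum of LSE utilities and generator profits given in (3.1) and (3.7):

$$\begin{aligned} \pi^{SPP}(w) &:= \sum_i \Big( \sum_{k \in \mathcal{G}_i} \pi^G_{i,k}(\hat{y}^G_{1,k}, \hat{y}^G_{2,k}(w)) + \sum_{k \in \mathcal{L}_i} \pi^L_{i,k}(\hat{y}^L_{1,k}, \hat{y}^L_{2,k}(w), \hat{x}^L_{2,k}(w), \hat{z}^L_{2,k}(w)) \Big) \\ &= -\sum_i \sum_{k \in \mathcal{G}_i} \Big( c_{1,k}(\hat{y}^G_{1,k}) + c_{2,k}(\hat{y}^G_{2,k}(w)) \Big) - \sum_i \sum_{k \in \mathcal{L}_i} \Big( c_{\mathrm{dr},k}(\hat{x}^L_{2,k}(w)) + c_{\mathrm{bo},k}(\hat{z}^L_{2,k}(w)) \Big), \end{aligned} \tag{3.13}$$

where $(\hat{y}^G_{1,k}, \hat{y}^L_{1,k}, \hat{y}^G_{2,k}(w), \hat{y}^L_{2,k}(w), \hat{x}^L_{2,k}(w), \hat{z}^L_{2,k}(w))$ for all $k \in \mathcal{G}_i$ and $\mathcal{L}_i$ and all $i \in \mathcal{N}$ are the planner's first and second stage decisions, the second stage decisions being made with knowledge of the realized scenario $W = w$.

Let $\hat{y}^G_1 \in \mathbb{R}_+$ denote the vector collecting the planner's first stage generation dispatch decisions for all generators. Similarly defining $\hat{y}^L_1$ and $\hat{\theta}_1$, given $\hat{y}_1 := (\hat{y}^G_1, \hat{y}^L_1, \hat{\theta}_1)$ and $W = w$, the social planner's RT optimization problem is

$$(\text{SPP2}) \quad \max_{\hat{y}^G_2(w),\, \hat{y}^L_2(w),\, \hat{x}^L_2(w),\, \hat{z}^L_2(w),\, \hat{\theta}_2(w)} \; -\sum_i \sum_{k \in \mathcal{G}_i} c_{2,k}(\hat{y}^G_{2,k}(w)) - \sum_i \sum_{k \in \mathcal{L}_i} \Big( c_{\mathrm{dr},k}(\hat{x}^L_{2,k}(w)) + c_{\mathrm{bo},k}(\hat{z}^L_{2,k}(w)) \Big) \tag{3.14}$$
$$\text{s.t.} \quad \sum_{k \in \mathcal{G}_i} \hat{y}^G_{2,k}(w) - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{2,k}(w) = \sum_j B_{ij} \big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) - \hat{\theta}_{1,i} + \hat{\theta}_{1,j} \big), \quad \forall i \tag{3.15}$$
$$B_{ij} \big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) \big) \le f^{\max}_{ij}, \quad \forall i, j \tag{3.16}$$
$$\hat{y}^L_{1,k} + \hat{y}^L_{2,k}(w) + \hat{x}^L_{2,k}(w) + \hat{z}^L_{2,k}(w) \ge D_k - w_k, \quad \forall i, k \in \mathcal{L}_i \tag{3.17}$$
$$\hat{y}^L_{2,k}(w) \ge 0, \quad \hat{x}^L_{2,k}(w) \ge 0, \quad \hat{z}^L_{2,k}(w) \ge 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.18}$$
$$\hat{y}^G_{2,k}(w) \ge 0, \quad \forall i, k \in \mathcal{G}_i. \tag{3.19}$$

Define $\pi^{SPP}_2(\hat{y}^G_1, \hat{y}^L_1, \hat{\theta}_1; w)$ as the maximum aggregate welfare achievable in the second stage, given the planner's first stage decisions and $W = w$. Then, the planner's first stage problem is

$$(\text{SPP1}) \quad \max_{\hat{y}^G_1,\, \hat{y}^L_1,\, \hat{\theta}_1} \; -\sum_i \sum_{k \in \mathcal{G}_i} c_{1,k}(\hat{y}^G_{1,k}) + \mathbb{E}[\pi^{SPP}_2(\hat{y}^G_1, \hat{y}^L_1, \hat{\theta}_1; W)] \tag{3.20}$$
$$\text{s.t.} \quad \sum_{k \in \mathcal{G}_i} \hat{y}^G_{1,k} - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{1,k} = \sum_j B_{ij} (\hat{\theta}_{1,i} - \hat{\theta}_{1,j}), \quad \forall i \tag{3.21}$$
$$B_{ij} (\hat{\theta}_{1,i} - \hat{\theta}_{1,j}) \le f^{\max}_{ij}, \quad \forall i, j \tag{3.22}$$
$$\hat{y}^G_{1,k} \ge 0 \;\; \forall i, k \in \mathcal{G}_i, \qquad \hat{y}^L_{1,k} \ge 0 \;\; \forall i, k \in \mathcal{L}_i. \tag{3.23}$$

Let $\hat{y}_2(w) := (\hat{y}^G_2(w), \hat{y}^L_2(w))$ for all $w$, and similarly for $\hat{x}_2(w)$. We refer to optimal solutions $\hat{y}^*_1$ and $(\hat{y}^*_2(\cdot), \hat{x}^*_2(\cdot), \hat{z}^*_2(\cdot))$ to (SPP1) and (SPP2) as efficient sequential allocations.

The following assumptions are made throughout the following sections.

Assumption 1. $c_{1,k}$, $c_{2,k}$, $c_{\mathrm{dr},k}$ and $c_{\mathrm{bo},k}$ are strictly convex, increasing, differentiable and nonnegative over $\mathbb{R}_+$ for all generators and LSEs.

First, we argue that problems (SPP1) and (SPP2) can be combined into a single-stage optimization problem.

Lemma 4. The two-stage problem (SPP1)-(SPP2) is equivalent to the following primal single stage problem:

$$(\text{SPP-P}) \quad \max_{\hat{y}^G_1, \hat{y}^G_2, \hat{y}^L_1, \hat{y}^L_2, \hat{x}^L_2, \hat{z}^L_2, \hat{\theta}_1, \hat{\theta}_2} \; -\sum_i \sum_{k \in \mathcal{G}_i} \Big( c_{1,k}(\hat{y}^G_{1,k}) + \sum_w c_{2,k}(\hat{y}^G_{2,k}(w))\, p_w \Big) - \sum_i \sum_{k \in \mathcal{L}_i} \sum_w \Big( c_{\mathrm{dr},k}(\hat{x}^L_{2,k}(w)) + c_{\mathrm{bo},k}(\hat{z}^L_{2,k}(w)) \Big) p_w \tag{3.24}$$
$$\text{s.t.} \quad \sum_{k \in \mathcal{G}_i} \hat{y}^G_{1,k} - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{1,k} = \sum_j B_{ij} (\hat{\theta}_{1,i} - \hat{\theta}_{1,j}), \quad \forall i \tag{3.25}$$
$$\sum_{k \in \mathcal{G}_i} \hat{y}^G_{2,k}(w) - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{2,k}(w) = \sum_j B_{ij} \big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) - \hat{\theta}_{1,i} + \hat{\theta}_{1,j} \big), \quad \forall i, w \tag{3.26}$$
$$B_{ij} (\hat{\theta}_{1,i} - \hat{\theta}_{1,j}) \le f^{\max}_{ij}, \quad \forall i, j \tag{3.27}$$
$$B_{ij} \big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) \big) \le f^{\max}_{ij}, \quad \forall i, j, w \tag{3.28}$$
$$\hat{y}^L_{1,k} + \hat{y}^L_{2,k}(w) + \hat{x}^L_{2,k}(w) + \hat{z}^L_{2,k}(w) \ge D_k - w_k, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.29}$$
$$\hat{y}^L_{1,k} \ge 0, \quad \hat{y}^L_{2,k}(w) \ge 0, \quad \hat{x}^L_{2,k}(w) \ge 0, \quad \hat{z}^L_{2,k}(w) \ge 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.30}$$
$$\hat{y}^G_{1,k} \ge 0, \quad \hat{y}^G_{2,k}(w) \ge 0, \quad \forall i, k \in \mathcal{G}_i, w \tag{3.31}$$

Proof. See Appendix A.

By "equivalent" in Lemma 4, we mean that (SPP-P) and (SPP1) have the same optimal objective value. Moreover, if $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot), \hat{z}^{L*}_2(\cdot), \hat{\theta}^*_1, \hat{\theta}^*_2(\cdot))$ is optimal for (SPP-P), then $(\hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot), \hat{\theta}^*_2(\cdot))$ is optimal for (SPP2).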
Conversely, if $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{\theta}^*_1)$ is optimal for (SPP1) and $(\hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot), \hat{z}^{L*}_2(\cdot), \hat{\theta}^*_2(\cdot))$ is optimal for (SPP2), given the optimal solution to (SPP1), then $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot), \hat{z}^{L*}_2(\cdot), \hat{\theta}^*_1, \hat{\theta}^*_2(\cdot))$ is optimal for (SPP-P).

Similar results hold for the generator and LSE two-stage problems, giving

(GEN-P$_{i,k}$)
\begin{align}
\max_{y^G_{1,k},\, y^G_{2,k}} \quad & P_{1,i}\, y^G_{1,k} - c_{1,k}(y^G_{1,k}) + \sum_w \Big( P_{2,i}(w)\, y^G_{2,k}(w) - c_{2,k}(y^G_{2,k}(w)) \Big) p_w \tag{3.32} \\
\text{s.t.} \quad & y^G_{1,k} \geq 0, \quad y^G_{2,k}(w) \geq 0, \quad \forall w, \tag{3.33}
\end{align}

and

(LSE-P$_{i,k}$)
\begin{align}
\max_{y^L_{1,k},\, y^L_{2,k},\, x^L_{2,k},\, z^L_{2,k}} \quad & -P_{1,i}\, y^L_{1,k} - \sum_w \Big( P_{2,i}(w)\, y^L_{2,k}(w) + c_{dr,k}(x^L_{2,k}(w)) + c_{bo,k}(z^L_{2,k}(w)) \Big) p_w \tag{3.34} \\
\text{s.t.} \quad & y^L_{1,k} + y^L_{2,k}(w) + x^L_{2,k}(w) + z^L_{2,k}(w) \geq D_k - w_k, \quad \forall w \tag{3.35} \\
& y^L_{1,k} \geq 0, \quad y^L_{2,k}(w) \geq 0, \quad x^L_{2,k}(w) \geq 0, \quad z^L_{2,k}(w) \geq 0, \quad \forall w. \tag{3.36}
\end{align}

Note that while the stated problem setting includes finitely many scenarios, under suitable conditions on the first and second stage objective functions and second stage feasibility set (given first stage decisions and uncertainty realization), the interchangeability principle stated above holds more generally for two stage convex optimization problems involving infinitely many scenarios. Additionally, the interchangeability principle extends beyond problems optimizing expectation to those involving risk measures. See Chapter 4, as well as [69] for further details.

3.3 Sequential Competitive Equilibrium and Efficient Allocations

In single-stage markets involving a single good, a competitive equilibrium is given by a price $P$ and quantity $x$ such that, given $P$, producers find it optimal to produce, and consumers find it optimal to purchase, quantity $x$ of the good [47]. In such a situation, it is said that the market clears, i.e., demand equals supply.
Here, in order to assess the outcome of the two-stage market described in the previous section, we define a two-stage, sequential version of competitive equilibrium, similar to that found in [24].

Definition 1. A sequential competitive equilibrium (SCEq) is a tuple $(y^{G*}_1, y^{L*}_1, y^{G*}_2(\cdot), y^{L*}_2(\cdot), x^{L*}_2(\cdot), P^*_1, P^*_2(\cdot))$ such that, for all $(i,k)$, given $P^*_{1,i}$ and $P^*_{2,i}(\cdot)$, $y^{G*}_{1,k}$ is optimal for (GEN1$_{i,k}$), $y^{L*}_{1,k}$ is optimal for (LSE1$_{i,k}$), and, given $W = w$ and $P^*_2(\cdot)$, $y^{G*}_{2,k}(w)$ is optimal for (GEN2$_{i,k}$) and $(y^{L*}_{2,k}(w), x^{L*}_{2,k}(w))$ is optimal for (LSE2$_{i,k}$), and the markets clear in both stages in all scenarios:

\begin{equation}
\sum_i \sum_{k \in \mathcal{G}_i} y^{G*}_{1,k} = \sum_i \sum_{k \in \mathcal{L}_i} y^{L*}_{1,k} \quad \text{and} \quad \sum_i \sum_{k \in \mathcal{G}_i} y^{G*}_{2,k}(w) = \sum_i \sum_{k \in \mathcal{L}_i} y^{L*}_{2,k}(w), \quad \forall w. \tag{3.37}
\end{equation}

Note that in the SCEq definition, $P^*_2(\cdot)$ and $y^{G*}_{2,k}(\cdot)$, $y^{L*}_{2,k}(\cdot)$ and $x^{L*}_{2,k}(\cdot)$ for each $k$ are functions. Also, in problems involving shortfall decisions, a solution consists only of ancillary generation and demand response decisions, as they are enough to determine shortfall decisions, e.g., $(\hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot))$ solves (SPP2).

We now study the existence of an SCEq in the two-stage market. Let $\hat{\lambda}^*_1 = (\hat{\lambda}^*_{1,1}, \ldots, \hat{\lambda}^*_{1,N})^\top$ and $\hat{\lambda}^*_2(\cdot) = (\hat{\lambda}^*_{2,1}(\cdot), \ldots, \hat{\lambda}^*_{2,N}(\cdot))^\top$ denote dual optimal variables associated with constraints (3.25) and (3.26) in (SPP-P).

Theorem 5. Under Assumption 1, a sequential competitive equilibrium exists, and is given by $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot), \hat{\lambda}^*_1, \hat{\lambda}^*_2(\cdot))$, where $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot))$ is the primal solution to (SPP-P), and $(\hat{\lambda}^*_1, \hat{\lambda}^*_2(\cdot))$ is an optimal dual solution to (SPP-P).

Proof.
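In addition to feasibility constraints (3.25)-(3.31), the optimal solution to (SPP-P), denoted as $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot), \hat{\theta}^*_1, \hat{\theta}^*_2(\cdot))$, satisfies the following KKT conditions:

\begin{align}
c'_{1,k}(\hat{y}^{G*}_{1,k}) - \hat{\lambda}^*_{1,i} &\geq 0, \quad \forall i, k \in \mathcal{G}_i \tag{3.38} \\
\hat{y}^{G*}_{1,k} \big( c'_{1,k}(\hat{y}^{G*}_{1,k}) - \hat{\lambda}^*_{1,i} \big) &= 0, \quad \forall i, k \in \mathcal{G}_i \tag{3.39} \\
c'_{2,k}(\hat{y}^{G*}_{2,k}(w)) - \hat{\lambda}^*_{2,i}(w) &\geq 0, \quad \forall i, k \in \mathcal{G}_i, w \tag{3.40} \\
\hat{y}^{G*}_{2,k}(w) \big( c'_{2,k}(\hat{y}^{G*}_{2,k}(w)) - \hat{\lambda}^*_{2,i}(w) \big) &= 0, \quad \forall i, k \in \mathcal{G}_i, w \tag{3.41} \\
\hat{\lambda}^*_{1,i} - \sum_w \hat{\mu}^*_{2,k}(w)\, p_w &\geq 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.42} \\
\hat{y}^{L*}_{1,k} \Big( \hat{\lambda}^*_{1,i} - \sum_w \hat{\mu}^*_{2,k}(w)\, p_w \Big) &= 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.43} \\
\hat{\lambda}^*_{2,i}(w) - \hat{\mu}^*_{2,k}(w) &\geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.44} \\
\hat{y}^{L*}_{2,k}(w) \big( \hat{\lambda}^*_{2,i}(w) - \hat{\mu}^*_{2,k}(w) \big) &= 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.45} \\
c'_{dr,k}(\hat{x}^{L*}_{2,k}(w)) - \hat{\mu}^*_{2,k}(w) &\geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.46} \\
\hat{x}^{L*}_{2,k}(w) \big( c'_{dr,k}(\hat{x}^{L*}_{2,k}(w)) - \hat{\mu}^*_{2,k}(w) \big) &= 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.47} \\
c'_{bo,k}(\hat{z}^{L*}_{2,k}(w)) - \hat{\mu}^*_{2,k}(w) &\geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.48} \\
\hat{z}^{L*}_{2,k}(w) \big( c'_{bo,k}(\hat{z}^{L*}_{2,k}(w)) - \hat{\mu}^*_{2,k}(w) \big) &= 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.49} \\
\hat{\mu}^*_{2,k}(w) \big( D_k - \hat{y}^{L*}_{1,k} - \hat{y}^{L*}_{2,k}(w) - \hat{x}^{L*}_{2,k}(w) - \hat{z}^{L*}_{2,k}(w) - w_k \big) &= 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.50} \\
\sum_j B_{ij} \Big( \hat{\lambda}^*_{1,i} - \hat{\lambda}^*_{1,j} - \sum_w \big( \hat{\lambda}^*_{2,i}(w) - \hat{\lambda}^*_{2,j}(w) \big) p_w + \hat{\gamma}^*_{1,ij} - \hat{\gamma}^*_{1,ji} \Big) &= 0, \quad \forall i \tag{3.51} \\
\sum_j B_{ij} \big( \hat{\lambda}^*_{2,i}(w) - \hat{\lambda}^*_{2,j}(w) + \hat{\gamma}^*_{2,ij}(w) - \hat{\gamma}^*_{2,ji}(w) \big) &= 0, \quad \forall i, w \tag{3.52} \\
\hat{\gamma}^*_{1,ij} \big( B_{ij}(\hat{\theta}^*_{1,i} - \hat{\theta}^*_{1,j}) - f^{\max}_{ij} \big) &= 0, \quad \forall i,j \tag{3.53} \\
\hat{\gamma}^*_{2,ij}(w) \big( B_{ij}(\hat{\theta}^*_{2,i}(w) - \hat{\theta}^*_{2,j}(w)) - f^{\max}_{ij} \big) &= 0, \quad \forall i,j,w \tag{3.54} \\
\hat{\mu}^*_{2,k}(w) &\geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.55} \\
\hat{\gamma}^*_{1,ij} \geq 0, \quad \hat{\gamma}^*_{2,ij}(w) &\geq 0, \quad \forall i,j,w. \tag{3.56}
\end{align}

Note that, due to Assumption 1, the optimal solution to (SPP-P) is unique when it exists. We show here that this solution also gives optimal solutions to (GEN-P$_{i,k}$) and (LSE-P$_{i,k}$) for all $(i,k)$.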
Aside from the nonnegativity constraints given in (3.33), the optimal solution to (GEN-P$_{i,k}$) satisfies

\begin{align}
c'_{1,k}(y^{G*}_{1,k}) - P_{1,i} &\geq 0 \tag{3.57} \\
y^{G*}_{1,k} \big( c'_{1,k}(y^{G*}_{1,k}) - P_{1,i} \big) &= 0 \tag{3.58} \\
c'_{2,k}(y^{G*}_{2,k}(w)) - P_{2,i}(w) &\geq 0, \quad \forall w \tag{3.59} \\
y^{G*}_{2,k}(w) \big( c'_{2,k}(y^{G*}_{2,k}(w)) - P_{2,i}(w) \big) &= 0, \quad \forall w. \tag{3.60}
\end{align}

In addition to the feasibility constraints (3.35)-(3.36), the optimal solution for (LSE-P$_{i,k}$) satisfies

\begin{align}
P_{1,i} - \sum_w \mu^{L*}_{2,k}(w)\, p_w &\geq 0 \tag{3.61} \\
y^{L*}_{1,k} \Big( P_{1,i} - \sum_w \mu^{L*}_{2,k}(w)\, p_w \Big) &= 0 \tag{3.62} \\
P_{2,i}(w) - \mu^{L*}_{2,k}(w) &\geq 0, \quad \forall w \tag{3.63} \\
y^{L*}_{2,k}(w) \big( P_{2,i}(w) - \mu^{L*}_{2,k}(w) \big) &= 0, \quad \forall w \tag{3.64} \\
c'_{dr,k}(x^{L*}_{2,k}(w)) - \mu^{L*}_{2,k}(w) &\geq 0, \quad \forall w \tag{3.65} \\
x^{L*}_{2,k}(w) \big( c'_{dr,k}(x^{L*}_{2,k}(w)) - \mu^{L*}_{2,k}(w) \big) &= 0, \quad \forall w \tag{3.66} \\
c'_{bo,k}(z^{L*}_{2,k}(w)) - \mu^{L*}_{2,k}(w) &\geq 0, \quad \forall w \tag{3.67} \\
z^{L*}_{2,k}(w) \big( c'_{bo,k}(z^{L*}_{2,k}(w)) - \mu^{L*}_{2,k}(w) \big) &= 0, \quad \forall w \tag{3.68} \\
\mu^{L*}_{2,k}(w) \big( D_k - w_k - y^{L*}_{1,k} - y^{L*}_{2,k}(w) - x^{L*}_{2,k}(w) - z^{L*}_{2,k}(w) \big) &= 0, \quad \forall w \tag{3.69} \\
\mu^{L*}_{2,k}(w) &\geq 0, \quad \forall w, \tag{3.70}
\end{align}

where $\mu^{L*}_{2,k}(\cdot)$ is the optimal dual vector corresponding to constraint (3.35) in (LSE-P$_{i,k}$). Now, we define candidate prices

\begin{equation}
P_{1,i} = \hat{\lambda}^*_{1,i} \quad \text{and} \quad P_{2,i}(w) = \hat{\lambda}^*_{2,i}(w), \quad \forall i, w, \tag{3.71}
\end{equation}

and claim the following.

Claim 1. $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot), P_1, P_2(\cdot))$ is an SCEq, where $P_1$ and $P_2(\cdot)$ are defined in (3.71), and $(\hat{y}^{G*}_1, \hat{y}^{L*}_1, \hat{y}^{G*}_2(\cdot), \hat{y}^{L*}_2(\cdot), \hat{x}^{L*}_2(\cdot))$ is the unique solution to (SPP-P).

Proof. Starting with (GEN-P$_{i,k}$), substituting for $P_{1,i}$ and $P_{2,i}(w)$, and selecting $y^G_{1,k} = \hat{y}^{G*}_{1,k}$ and $y^G_{2,k}(w) = \hat{y}^{G*}_{2,k}(w)$ for all $w$ in (3.57)-(3.60) yields expressions identical to (3.38)-(3.41). This shows that, given $P_1$ and $P_2(\cdot)$ as defined in (3.71), $(\hat{y}^{G*}_{1,k}, \hat{y}^{G*}_{2,k}(\cdot))$ is optimal for (GEN-P$_{i,k}$), and therefore $\hat{y}^{G*}_{1,k}$ is optimal for (GEN1$_{i,k}$) and $\hat{y}^{G*}_{2,k}(\cdot)$ for (GEN2$_{i,k}$).
Continuing to the LSEs' problems, substituting for $P_1$ and $P_2(w)$ in (3.61)-(3.70), and selecting $y^L_{1,k} = \hat{y}^{L*}_{1,k}$, $y^L_{2,k}(w) = \hat{y}^{L*}_{2,k}(w)$, $x^L_{2,k}(w) = \hat{x}^{L*}_{2,k}(w)$, $z^L_{2,k}(w) = \hat{z}^{L*}_{2,k}(w)$ for all $w$ yields expressions which are the same as (3.42)-(3.50) and (3.55), except that $\mu^{L*}_{2,k}(w)$ appears instead of $\hat{\mu}^*_{2,k}(w)$. Therefore, setting $\mu^{L*}_{2,k}(w) = \hat{\mu}^*_{2,k}(w)$ for all $w$ makes (3.61)-(3.70) identical to (3.42)-(3.50) and (3.55), showing that $(\hat{y}^{L*}_{1,k}, \hat{y}^{L*}_{2,k}(\cdot), \hat{x}^{L*}_{2,k}(\cdot), \hat{z}^{L*}_{2,k}(\cdot))$ from the (SPP-P) optimal solution gives an optimal solution for (LSE-P$_{i,k}$). Thus, given prices $P_1$ and $P_2(\cdot)$ as defined in (3.71), $\hat{y}^{L*}_{1,k}$ is optimal for (LSE1$_{i,k}$), and $(\hat{y}^{L*}_{2,k}(w), \hat{x}^{L*}_{2,k}(w), \hat{z}^{L*}_{2,k}(w))$ is optimal for (LSE2$_{i,k}$), for all $w$, given $\hat{y}^{L*}_{1,k}$.

The market clearing condition is satisfied due to feasibility constraints (3.25) and (3.26). Therefore, the tuple in the claim is a sequential competitive equilibrium, and we have proven Theorem 5.

3.3.1 Social Welfare Theorems

There exists an important connection between competitive equilibria and efficient allocations, described by the two fundamental theorems of welfare economics. Here we give statements of the first and second theorems of welfare economics for our two-stage market setting. If an allocation is included in an SCEq, we say that the equilibrium supports the allocation.

Theorem 6. (i) Every sequential competitive equilibrium supports an efficient sequential allocation. (ii) Conversely, an efficient sequential allocation can be supported by a sequential competitive equilibrium.

Proof. To prove statement (i), note that per Definition 1, under a sequential competitive equilibrium, the market clears in both the DA and RT stages, in all scenarios.
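This condition is equivalent to posing the following ISO problem [77]:

(ISO)
\begin{align}
\max_{\tilde{y}^G_1,\, \tilde{y}^G_2,\, \tilde{y}^L_1,\, \tilde{y}^L_2,\, \tilde{\theta}_1,\, \tilde{\theta}_2} \quad & \sum_i P_{1,i} \Big( \sum_{k \in \mathcal{L}_i} \tilde{y}^L_{1,k} - \sum_{k \in \mathcal{G}_i} \tilde{y}^G_{1,k} \Big) + \sum_w \sum_i P_{2,i}(w) \Big( \sum_{k \in \mathcal{L}_i} \tilde{y}^L_{2,k}(w) - \sum_{k \in \mathcal{G}_i} \tilde{y}^G_{2,k}(w) \Big) p_w \tag{3.72} \\
\text{s.t.} \quad & \sum_{k \in \mathcal{G}_i} \tilde{y}^G_{1,k} - \sum_{k \in \mathcal{L}_i} \tilde{y}^L_{1,k} = \sum_j B_{ij}\big( \tilde{\theta}_{1,i} - \tilde{\theta}_{1,j} \big), \quad \forall i \tag{3.73} \\
& \sum_{k \in \mathcal{G}_i} \tilde{y}^G_{2,k}(w) - \sum_{k \in \mathcal{L}_i} \tilde{y}^L_{2,k}(w) = \sum_j B_{ij}\big( \tilde{\theta}_{2,i}(w) - \tilde{\theta}_{2,j}(w) - \tilde{\theta}_{1,i} + \tilde{\theta}_{1,j} \big), \quad \forall i, w \tag{3.74} \\
& B_{ij}\big( \tilde{\theta}_{1,i} - \tilde{\theta}_{1,j} \big) \leq f^{\max}_{ij}, \quad \forall i,j \tag{3.75} \\
& B_{ij}\big( \tilde{\theta}_{2,i}(w) - \tilde{\theta}_{2,j}(w) \big) \leq f^{\max}_{ij}, \quad \forall i,j,w \tag{3.76} \\
& \tilde{y}^G_{1,k} \geq 0, \quad \tilde{y}^G_{2,k}(w) \geq 0, \quad \forall i, k \in \mathcal{G}_i, w \tag{3.77} \\
& \tilde{y}^L_{1,k} \geq 0, \quad \tilde{y}^L_{2,k}(w) \geq 0, \quad \forall i, k \in \mathcal{L}_i, w, \tag{3.78}
\end{align}

and then requiring that $(y^{G*}_1, y^{L*}_1, y^{G*}_2(\cdot), y^{L*}_2(\cdot))$ as given in the SCEq definition also solves (ISO).

Summing the objectives of all agents, i.e., (GEN-P$_{i,k}$) and (LSE-P$_{i,k}$) over $(i,k)$, along with the objective of (ISO), recovers the objective of (SPP-P). Similarly, collecting the constraints from the individual (GEN-P$_{i,k}$) and (LSE-P$_{i,k}$) problems, along with those from (ISO), recovers the full set of constraints found in (SPP-P). Together, therefore, (ISO), along with all (GEN-P$_{i,k}$) and (LSE-P$_{i,k}$), represents a decomposition of (SPP-P).

Denote the Lagrange multipliers corresponding to constraints (3.73) and (3.74) as $\tilde{\lambda}_1$ and $\tilde{\lambda}_2$. Note that the KKT conditions corresponding to $\tilde{y}^G_{1,k}$ and $\tilde{y}^L_{1,k}$ are

\begin{align}
P_{1,i} - \tilde{\lambda}^*_{1,i} &\geq 0, \quad \forall i, k \in \mathcal{G}_i \tag{3.79} \\
\tilde{y}^{G*}_{1,k} \big( P_{1,i} - \tilde{\lambda}^*_{1,i} \big) &= 0, \quad \forall i, k \in \mathcal{G}_i \tag{3.80} \\
\tilde{\lambda}^*_{1,i} - P_{1,i} &\geq 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.81} \\
\tilde{y}^{L*}_{1,k} \big( \tilde{\lambda}^*_{1,i} - P_{1,i} \big) &= 0, \quad \forall i, k \in \mathcal{L}_i. \tag{3.82}
\end{align}

This implies that we can take $\tilde{\lambda}^*_{1,i} = P_{1,i}$ for all $i$, and it can similarly be shown that $\tilde{\lambda}^*_{2,i}(w) = P_{2,i}(w)$ for all $i$ and $w$. The remaining KKT conditions for (ISO) are identical in form to (SPP-P) KKT conditions (3.51)-(3.54).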
Therefore, associating Lagrange multipliers $\tilde{\gamma}^*_1$ and $\tilde{\gamma}^*_2(\cdot)$ with (ISO) constraints (3.75) and (3.76), the tuple $(P_1, P_2(\cdot), \tilde{\gamma}^*_1, \tilde{\gamma}^*_2(\cdot), \tilde{\theta}^*_1, \tilde{\theta}^*_2(\cdot))$ satisfies (3.51)-(3.54), and therefore gives a candidate $(\hat{\lambda}^*_1, \hat{\lambda}^*_2(\cdot), \hat{\gamma}^*_1, \hat{\gamma}^*_2(\cdot), \hat{\theta}^*_1, \hat{\theta}^*_2(\cdot))$ for (SPP-P). Taking $\mu^{L*}_{2,k}(w) = \hat{\mu}^*_{2,k}(w)$ for all LSEs as in the proof of Theorem 5, and choosing the primal allocation quantities in (SPP-P) as the equilibrium quantities (together with the implied shortfall decisions $z^{L*}_{2,k}(w)$ for all LSEs in all scenarios), the (SPP-P) KKT conditions are satisfied. Therefore, the sequential competitive equilibrium supports an efficient allocation.

The proof of the second statement follows directly from the constructive proof of Theorem 5.

In price taking settings, the first fundamental theorem of welfare economics (part (i) of Theorem 6) holds under local nonsatiation of consumer preferences [47]. Local nonsatiation holds when, for any individual's allocation of goods $x$ and any $\epsilon > 0$, there exists another allocation $x'$ (not necessarily feasible) such that $\|x - x'\| \leq \epsilon$ and $x'$ is preferred to $x$. For the setting in this chapter, local nonsatiation follows from the assumptions placed on the cost functions for power and ancillary services. In addition to local nonsatiation, the second fundamental theorem of welfare economics requires that all production sets are convex, and that all consumer preference relations are convex, i.e., the allocations $x'$ which are preferred to a given allocation $x$ form a convex set. These conditions are satisfied in our setting due to the convexity of all entity and centralized optimization problems. Therefore, in a sense, having transformed the two-stage stochastic programs posed in this chapter to single stage optimization problems, Theorem 6 can be seen as implied by the general equilibrium theory presented in [47].
Still, this framework is presented for single shot economies rather than two-stage decisions involving recourse action. Thus we find it illuminating to present the more direct proof for this setting detailed above.

3.4 Two-Stage Network Mechanism for Electricity Market with Renewable Generation

We showed in the proof of Theorem 5 that SCEq prices arise from the dual solution to (SPP-P). If we make the additional assumptions that all market participant cost functions can be finitely parametrized (for example, taking quadratic form), and further that all market participants are non-strategic, the following market mechanism implements the SCEq, and clears the market following the RT stage:

1. Each generator and LSE $(i,k)$ submits parameters $(\xi_{1,k}, \xi_{2,k})$ and $(\xi_{dr,k}, \xi_{bo,k})$, respectively.
2. The ISO solves (SPP-P), and announces DA prices $P^*_1 = \hat{\lambda}^*_1$, along with RT price schedule $P^*_2(\cdot) = \hat{\lambda}^*_2(\cdot)$.
3. Generator $(i,k)$ solves (GEN1$_{i,k}$) and LSE $(i,k)$ solves (LSE1$_{i,k}$). LSE $(i,k)$ pays $P^*_{1,i}\, y^{L*}_{1,k}$ and generator $(i,k)$ receives $P^*_{1,i}\, y^{G*}_{1,k}$.
4. At the start of the RT stage, the renewable generation output $W = w$ is observed by both the generators and LSEs. Generator $(i,k)$ solves (GEN2$_{i,k}$), and LSE $(i,k)$ solves (LSE2$_{i,k}$). LSE $(i,k)$ pays $P^*_{2,i}(w)\, y^{L*}_{2,k}(w)$, and generator $(i,k)$ receives $P^*_{2,i}(w)\, y^{G*}_{2,k}(w)$.
5. Generator $(i,k)$ produces $y^{G*}_{1,k} + y^{G*}_{2,k}(w)$, and LSE $(i,k)$ receives $y^{L*}_{1,k} + y^{L*}_{2,k}(w)$.

3.5 Dynamic Economic Dispatch Game and Efficient Bids

Previous sections assumed that the ISO has full knowledge of the cost functions associated with each generator and LSE. In this section, we relax that assumption, instead allowing market participants to report information related to their respective costs to the ISO, which it then uses to make dispatch decisions. Additionally, we allow that all entities may behave strategically, so that the submitted information may not reflect their true costs.
In practice, the bid formats typically implemented in power markets are not expressive enough to capture the strictly convex cost functions we posed in previous sections. For example, the California ISO uses 10-segment piecewise linear bids for supply-side bids [43]. Therefore, in this section we reformulate (SPP-P) as a dynamic economic dispatch (DED) game and study the outcomes of that game.

3.5.1 LSE Utility Functions and SPP-P Reformulation

In the following development, we will assume that generators and LSEs submit linear bids for the cost of energy production and value of energy consumption, respectively, for both the first and second stages of the market. For the RT market, both types of agents are allowed to submit bids corresponding to each scenario $W = w$.

Given the objective of (SPP-P), it would be natural to allow the LSEs to submit bids on demand response and blackout costs. However, in order to provide sufficient conditions for the existence of Nash equilibria for our market in later sections, it is necessary instead to work with equivalent LSE valuation functions, i.e., functions giving the benefit LSEs derive from consuming quantities of primary and ancillary energy. In essence, this is due to the fact that under our market formulation, the ISO can only allocate demand response or planned blackouts to a given LSE $(i,k)$'s consumer population via the offerings of LSE $(i,k)$ itself. This is true even if multiple LSEs exist at a given node, or if multiple LSEs exist in a network with no congestion. In contrast, to the extent allowed by the network transmission constraints, the planner can route the least expensive electricity to serve loads. As we will show in a later example, if LSEs directly bid on their own service costs, this creates opportunities for them to increase their payoffs (i.e., decrease their overall operating costs).
On the other hand, when LSEs provide bids on their valuation for electricity in both stages, they compete directly with one another for the same service, and the planner is able to allocate flows to the highest bidders.

To start, consider the following problem, given $y^L_{1,k}$ and $y^L_{2,k}(w)$ for all scenarios $W = w$:

(LSE$_{i,k}(y^L_{1,k}, y^L_{2,k})$)
\begin{align}
\min_{x_{2,k},\, z_{2,k}} \quad & \sum_w \big( c_{dr,k}(x_{2,k}(w)) + c_{bo,k}(z_{2,k}(w)) \big) p_w \tag{3.83} \\
\text{s.t.} \quad & y^L_{1,k} + y^L_{2,k}(w) + x_{2,k}(w) + z_{2,k}(w) \geq D_k - w_k, \quad \forall w. \tag{3.84}
\end{align}

This problem can be decomposed into $|\mathcal{W}|$ convex problems (LSE$_{i,k}(y^L_{1,k}, y^L_{2,k}(w), w)$), each corresponding to a single scenario $w$, where $|\mathcal{W}|$ gives the total number of possible second stage scenarios. Problem (LSE$_{i,k}(y^L_{1,k}, y^L_{2,k})$) has KKT conditions

\begin{align}
c'_{dr,k}(x^*_{2,k}(w)) - \mu^*_{2,k}(w) &\geq 0, \quad \forall w \tag{3.85} \\
x^*_{2,k}(w) \big( c'_{dr,k}(x^*_{2,k}(w)) - \mu^*_{2,k}(w) \big) &= 0, \quad \forall w \tag{3.86} \\
c'_{bo,k}(z^*_{2,k}(w)) - \mu^*_{2,k}(w) &\geq 0, \quad \forall w \tag{3.87} \\
z^*_{2,k}(w) \big( c'_{bo,k}(z^*_{2,k}(w)) - \mu^*_{2,k}(w) \big) &= 0, \quad \forall w \tag{3.88} \\
\mu^*_{2,k}(w) \big( D_k - w_k - y^L_{1,k} - y^L_{2,k}(w) - x^*_{2,k}(w) - z^*_{2,k}(w) \big) &= 0, \quad \forall w \tag{3.89} \\
\mu^*_{2,k}(w) &\geq 0, \quad \forall w. \tag{3.90}
\end{align}

Let $u_{i,k}(y^L_{1,k}, y^L_{2,k})$ denote the negation of the optimal value of (LSE$_{i,k}(y^L_{1,k}, y^L_{2,k})$), and $u_{i,k}(y^L_{1,k}, y^L_{2,k}(w), w)$ denote the negation of the optimal value of (LSE$_{i,k}(y^L_{1,k}, y^L_{2,k}(w), w)$) for each $w$ (without scaling by $p_w$). Thus these functions give the benefit of consuming electricity acquired in the first and second stages of the market in terms of the associated negative costs of demand response and scheduled blackouts. Note that the definitions of $u_{i,k}(y^L_{1,k}, y^L_{2,k})$ and $u_{i,k}(y^L_{1,k}, y^L_{2,k}(w), w)$ imply the following equality:

\begin{equation*}
u_{i,k}(y^L_{1,k}, y^L_{2,k}) = \sum_w u_{i,k}(y^L_{1,k}, y^L_{2,k}(w), w)\, p_w.
\end{equation*}

As each (LSE$_{i,k}(y^L_{1,k}, y^L_{2,k}(w), w)$) is a convex problem, its optimal value function is convex in both $y^L_{1,k}$ and $y^L_{2,k}(w)$, implying that its negation $u_{i,k}(\cdot, \cdot, w)$ is concave for all $w$ in both arguments [14].
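As a rough illustration of the per-scenario problem above, the following sketch evaluates $u_{i,k}(y^L_{1,k}, y^L_{2,k}(w), w)$ numerically by searching for a multiplier $\mu$ that satisfies the KKT system (3.85)-(3.90). It is only a sketch under stated assumptions: costs are taken to be quadratic, $c(q) = a q^2 + b q$ with $a > 0$ and $b \geq 0$, the decisions $x$ and $z$ are taken to be nonnegative, and the function name `lse_scenario_utility` and all numerical parameter values are hypothetical choices made for illustration.

```python
def lse_scenario_utility(y1, y2, w, D, a_dr, b_dr, a_bo, b_bo):
    """Solve min c_dr(x) + c_bo(z) s.t. x + z >= D - w - y1 - y2, x, z >= 0.

    Assumes quadratic costs c(q) = a*q**2 + b*q with a > 0, b >= 0 (an
    assumption of this sketch, not a requirement of the text).
    Returns (u, x, z, mu): u is the negated optimal cost and mu is the
    multiplier on the demand constraint.
    """
    def cost(a, b, q):
        return a * q * q + b * q

    def qty(a, b, mu):
        # Quantity whose marginal cost 2*a*q + b equals mu (zero if mu <= b).
        return max((mu - b) / (2.0 * a), 0.0)

    residual = max(D - w - y1 - y2, 0.0)
    if residual == 0.0:
        # Demand is already covered: no demand response or blackout needed.
        return 0.0, 0.0, 0.0, 0.0

    # Bisection on mu: total quantity is nondecreasing in mu.
    lo = min(b_dr, b_bo)
    hi = max(2.0 * a_dr * residual + b_dr, 2.0 * a_bo * residual + b_bo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if qty(a_dr, b_dr, mid) + qty(a_bo, b_bo, mid) < residual:
            lo = mid
        else:
            hi = mid
    mu = hi
    x = qty(a_dr, b_dr, mu)
    z = qty(a_bo, b_bo, mu)
    return -(cost(a_dr, b_dr, x) + cost(a_bo, b_bo, z)), x, z, mu


# Hypothetical parameters: blackouts are made expensive enough to go unused.
params = dict(w=0.0, D=30.0, a_dr=10.0, b_dr=20.0, a_bo=1000.0, b_bo=1000.0)
u_lo = lse_scenario_utility(y1=5.0, y2=0.0, **params)[0]
u_mid = lse_scenario_utility(y1=10.0, y2=0.0, **params)[0]
u_hi = lse_scenario_utility(y1=15.0, y2=0.0, **params)[0]
# Midpoint check: the utility at the midpoint is at least the average.
print(u_mid >= 0.5 * (u_lo + u_hi))
```

The midpoint comparison at the end is a spot check of the shape of $u_{i,k}(\cdot,\cdot,w)$ for these particular parameters, not a proof. It is the behavior one expects from the negation of a convex optimal value function.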
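Since (3.85)-(3.90) is a subset of the (SPP-P) KKT conditions, we can use the specified utility functions to rewrite (SPP-P) without explicit reference to $x^L_{2,k}(w)$ and $z^L_{2,k}(w)$:

(SPP-U)
\begin{align}
\max_{\hat{y}^G_1,\, \hat{y}^G_2,\, \hat{y}^L_1,\, \hat{y}^L_2,\, \hat{\theta}_1,\, \hat{\theta}_2} \quad & \sum_i \sum_{k \in \mathcal{L}_i} \sum_w u_{i,k}(\hat{y}^L_{1,k}, \hat{y}^L_{2,k}(w), w)\, p_w - \sum_i \sum_{k \in \mathcal{G}_i} \Big( c_{1,k}(\hat{y}^G_{1,k}) + \sum_w c_{2,k}(\hat{y}^G_{2,k}(w))\, p_w \Big) \tag{3.91} \\
\text{s.t.} \quad & \sum_{k \in \mathcal{G}_i} \hat{y}^G_{1,k} - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{1,k} = \sum_j B_{ij}\big( \hat{\theta}_{1,i} - \hat{\theta}_{1,j} \big), \quad \forall i \tag{3.92} \\
& \sum_{k \in \mathcal{G}_i} \hat{y}^G_{2,k}(w) - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{2,k}(w) = \sum_j B_{ij}\big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) - \hat{\theta}_{1,i} + \hat{\theta}_{1,j} \big), \quad \forall i, w \tag{3.93} \\
& B_{ij}\big( \hat{\theta}_{1,i} - \hat{\theta}_{1,j} \big) \leq f^{\max}_{ij}, \quad \forall i,j \tag{3.94} \\
& B_{ij}\big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) \big) \leq f^{\max}_{ij}, \quad \forall i,j,w \tag{3.95} \\
& \hat{y}^L_{1,k} \geq 0, \quad \hat{y}^L_{2,k}(w) \geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.96} \\
& \hat{y}^G_{1,k} \geq 0, \quad \hat{y}^G_{2,k}(w) \geq 0, \quad \forall i, k \in \mathcal{G}_i, w. \tag{3.97}
\end{align}

While $u_{i,k}(\cdot, \cdot, w)$ may not be differentiable under our original cost function assumptions in all cases, we can still make use of the subdifferential versions of the KKT conditions to describe optimal solutions to (SPP-U).

3.5.2 DED Game and Efficient Bids

In the DED, each generator $(i,k)$ bids $b^G_{1,k}(\cdot)$ and $b^G_{2,k}(\cdot, w)$ for each scenario, representing its reported cost for producing electricity in both stages. Similarly, the LSEs bid $b^L_{1,k}(\cdot)$ and $b^L_{2,k}(\cdot, w)$ for each scenario, giving their utility for consuming primary and ancillary generation. Here it is assumed that all participants submit nonnegative scalar bids in each case, specifying linear cost and utility functions, e.g., $b^G_{1,k}(x) = b^G_{1,k}\, x$ for all $x \geq 0$. Given the generator and LSE bids, the ISO solves the following problem:

(DED)
\begin{align}
\max_{\hat{y}^G_1,\, \hat{y}^G_2,\, \hat{y}^L_1,\, \hat{y}^L_2,\, \hat{\theta}_1,\, \hat{\theta}_2} \quad & \sum_i \sum_{k \in \mathcal{L}_i} \Big( b^L_{1,k}\, \hat{y}^L_{1,k} + \sum_w b^L_{2,k}(w)\, \hat{y}^L_{2,k}(w)\, p_w \Big) - \sum_i \sum_{k \in \mathcal{G}_i} \Big( b^G_{1,k}\, \hat{y}^G_{1,k} + \sum_w b^G_{2,k}(w)\, \hat{y}^G_{2,k}(w)\, p_w \Big) \tag{3.98} \\
\text{s.t.} \quad & \sum_{k \in \mathcal{G}_i} \hat{y}^G_{1,k} - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{1,k} = \sum_j B_{ij}\big( \hat{\theta}_{1,i} - \hat{\theta}_{1,j} \big), \quad \forall i \tag{3.99} \\
& \sum_{k \in \mathcal{G}_i} \hat{y}^G_{2,k}(w) - \sum_{k \in \mathcal{L}_i} \hat{y}^L_{2,k}(w) = \sum_j B_{ij}\big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) - \hat{\theta}_{1,i} + \hat{\theta}_{1,j} \big), \quad \forall i, w \tag{3.100} \\
& B_{ij}\big( \hat{\theta}_{1,i} - \hat{\theta}_{1,j} \big) \leq f^{\max}_{ij}, \quad \forall i,j \tag{3.101} \\
& B_{ij}\big( \hat{\theta}_{2,i}(w) - \hat{\theta}_{2,j}(w) \big) \leq f^{\max}_{ij}, \quad \forall i,j,w \tag{3.102} \\
& \hat{y}^L_{1,k} \geq 0, \quad \hat{y}^L_{2,k}(w) \geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.103} \\
& \hat{y}^G_{1,k} \geq 0, \quad \hat{y}^G_{2,k}(w) \geq 0, \quad \forall i, k \in \mathcal{G}_i, w. \tag{3.104}
\end{align}

First, we show by construction that there exists a bid profile $b = (b^G_{1,k}, b^G_{2,k}, b^L_{1,k}, b^L_{2,k})$ for (DED) that induces an allocation which is efficient, i.e., optimal for (SPP-P).

Proposition 7. For each generator and LSE $(i,k)$, let

\begin{equation}
b^G_{1,k} = b^L_{1,k} = \lambda^*_{1,i}, \qquad b^G_{2,k}(w) = b^L_{2,k}(w) = \lambda^*_{2,i}(w), \quad \forall i, w, \tag{3.105}
\end{equation}

where $\lambda^*_{1,i}$ and $\lambda^*_{2,i}(w)$ give optimal Lagrange multipliers associated with (SPP-P) constraints (3.25) and (3.26) for all $i$ and $w$. Then, there exists a solution to (DED) that is also a solution to (SPP-P).

Proof. In addition to feasibility, the KKT conditions for (DED) are:

\begin{align}
b^G_{1,k} - \hat{\lambda}^*_{1,i} &\geq 0, \quad \forall i, k \in \mathcal{G}_i \tag{3.106} \\
\hat{y}^{G*}_{1,k} \big( b^G_{1,k} - \hat{\lambda}^*_{1,i} \big) &= 0, \quad \forall i, k \in \mathcal{G}_i \tag{3.107} \\
b^G_{2,k}(w) - \hat{\lambda}^*_{2,i}(w) &\geq 0, \quad \forall i, k \in \mathcal{G}_i, w \tag{3.108} \\
\hat{y}^{G*}_{2,k}(w) \big( b^G_{2,k}(w) - \hat{\lambda}^*_{2,i}(w) \big) &= 0, \quad \forall i, k \in \mathcal{G}_i, w \tag{3.109} \\
\hat{\lambda}^*_{1,i} - b^L_{1,k} &\geq 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.110} \\
\hat{y}^{L*}_{1,k} \big( \hat{\lambda}^*_{1,i} - b^L_{1,k} \big) &= 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.111} \\
\hat{\lambda}^*_{2,i}(w) - b^L_{2,k}(w) &\geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.112} \\
\hat{y}^{L*}_{2,k}(w) \big( \hat{\lambda}^*_{2,i}(w) - b^L_{2,k}(w) \big) &= 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.113} \\
\sum_j B_{ij} \Big( \hat{\lambda}^*_{1,i} - \hat{\lambda}^*_{1,j} - \sum_w \big( \hat{\lambda}^*_{2,i}(w) - \hat{\lambda}^*_{2,j}(w) \big) p_w + \hat{\gamma}^*_{1,ij} - \hat{\gamma}^*_{1,ji} \Big) &= 0, \quad \forall i \tag{3.114} \\
\sum_j B_{ij} \big( \hat{\lambda}^*_{2,i}(w) - \hat{\lambda}^*_{2,j}(w) + \hat{\gamma}^*_{2,ij}(w) - \hat{\gamma}^*_{2,ji}(w) \big) &= 0, \quad \forall i, w \tag{3.115} \\
\hat{\gamma}^*_{1,ij} \big( B_{ij}(\hat{\theta}^*_{1,i} - \hat{\theta}^*_{1,j}) - f^{\max}_{ij} \big) &= 0, \quad \forall i,j \tag{3.116} \\
\hat{\gamma}^*_{2,ij}(w) \big( B_{ij}(\hat{\theta}^*_{2,i}(w) - \hat{\theta}^*_{2,j}(w)) - f^{\max}_{ij} \big) &= 0, \quad \forall i,j,w \tag{3.117} \\
\hat{\gamma}^*_{1,ij} \geq 0, \quad \hat{\gamma}^*_{2,ij}(w) &\geq 0, \quad \forall i,j,w. \tag{3.118}
\end{align}

Under the bid profile given in (3.105), the planner can select $\hat{\lambda}^* = \lambda^*$ to satisfy conditions (3.106)-(3.113).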
Given this selection, the planner can select $(\hat{\theta}^*, \hat{\gamma}^*) = (\theta^*, \gamma^*)$ to satisfy the remaining conditions, and then select $\hat{y}^* = y^*$, where $(y^*, \theta^*, \gamma^*)$ are part of an optimal solution to (SPP-U).

3.6 Sequential Nash Equilibria

While the previous section demonstrated the existence of an efficient bid profile for the dynamic economic dispatch game, given that the market participants are free to bid any nonnegative scalar values, it remains to show whether generators or LSEs might find it in their own interest to deviate from such a profile. To begin, we define individual outcomes, i.e., the payoffs for the generators and LSEs under the DED with linear bids, assuming the locational marginal pricing (LMP) scheme [53] is enforced by the ISO.

For a given bid profile $b$, and the resulting DED solution allocation $(\hat{y}^{G*}_1(b), \hat{y}^{G*}_2(b), \hat{y}^{L*}_1(b), \hat{y}^{L*}_2(b))$, the expected payoff for generator $(i,k)$ is

\begin{equation}
\mathbb{E}[\pi^G_{i,k}(b)] = \hat{\lambda}^*_{1,i}(b)\, \hat{y}^{G*}_{1,k}(b) - c_{1,k}(\hat{y}^{G*}_{1,k}(b)) + \sum_w \Big( \hat{\lambda}^*_{2,i}(b,w)\, \hat{y}^{G*}_{2,k}(b,w) - c_{2,k}(\hat{y}^{G*}_{2,k}(b,w)) \Big) p_w. \tag{3.119}
\end{equation}

The expected payoff for LSE $(i,k)$ is

\begin{equation}
\mathbb{E}[\pi^L_{i,k}(b)] = -\hat{\lambda}^*_{1,i}(b)\, \hat{y}^{L*}_{1,k}(b) + \sum_w \Big( u_{i,k}(\hat{y}^{L*}_{1,k}(b), \hat{y}^{L*}_{2,k}(b,w), w) - \hat{\lambda}^*_{2,i}(b,w)\, \hat{y}^{L*}_{2,k}(b,w) \Big) p_w. \tag{3.120}
\end{equation}

For a bid profile $b$, denote the collection of bids aside from $b_{i,k}$ as $b_{-(i,k)}$. Through the remainder of this section, we will omit the expectation notation $\mathbb{E}[\cdot]$ and denote the expected payoff of individual generators and LSEs, given bid profile $b$, as $\pi^G_{i,k}(b)$ and $\pi^L_{i,k}(b)$, respectively.

Definition 2. A sequential Nash equilibrium is a bid profile $b^*$ such that for all generators, it holds that

\begin{equation}
\pi^G_{i,k}(b^*) \geq \pi^G_{i,k}(b_{i,k}, b^*_{-(i,k)}) \quad \forall b_{i,k} \in \mathbb{R}_+ \times \mathbb{R}^{|\mathcal{W}|}_+ \tag{3.121}
\end{equation}

and for all LSEs it holds that

\begin{equation}
\pi^L_{i,k}(b^*) \geq \pi^L_{i,k}(b_{i,k}, b^*_{-(i,k)}) \quad \forall b_{i,k} \in \mathbb{R}_+ \times \mathbb{R}^{|\mathcal{W}|}_+. \tag{3.122}
\end{equation}

3.6.1 Existence of Efficient Sequential Nash Equilibria

This section explores two conditions, under either of which an efficient Nash equilibrium exists for (DED). For both conditions, the efficient bid profile specified in the previous section coincides with such a Nash equilibrium.

First, the following assumption is necessary to preclude the possibility that any single generator has the market power to ask for arbitrarily high prices in its bid.

Assumption 2. The system problem (SPP-U) is feasible in the absence of any one of the generators in either market stage.

The first sufficient condition is as follows.

Definition 3. (Congestion-Free Condition) No branch power flow constraint (3.94) or (3.95) is binding in the optimal dispatch for (SPP-U).

With the previous section's existence result in hand, it is possible to show that the congestion-free condition guarantees the existence of an efficient Nash equilibrium.

Theorem 8. Under Assumption 2 and the congestion-free condition, there exists an efficient Nash equilibrium for (DED).

Proof. Consider bid profile $b^*$ as in (3.105). Note that under the congestion-free condition, the bids of all generators and LSEs are identical, as the LMPs become uniform across all nodes in the network [53]. Denote these uniform prices as $\lambda^*_1(b^*)$ and $\lambda^*_2(b^*, w)$ for all $w$. By Proposition 7, the bid profile is efficient, inducing optimal dispatch $\{y^{G*}_1, y^{L*}_1, y^{G*}_2(\cdot), y^{L*}_2(\cdot), \theta^*_1, \theta^*_2(\cdot), \gamma^*_1, \gamma^*_2(\cdot)\}$.

To show that this bid profile is an efficient Nash equilibrium, we start with the generators. Collecting constraints (3.106) and (3.108) across all nodes and generators gives that

\begin{align}
\hat{\lambda}^*_{1,i}(b^*) &\leq \min_{j,\, k \in \mathcal{G}_j} b^{G*}_{1,k} = \lambda^*_1(b^*) \tag{3.123} \\
\hat{\lambda}^*_{2,i}(b^*, w) &\leq \min_{j,\, k \in \mathcal{G}_j} b^{G*}_{2,k}(w) = \lambda^*_2(b^*, w). \tag{3.124}
\end{align}

Note that when any electricity is generated anywhere in the network for a given stage, the inequalities above are tight.
If a generator $(i,k)$ deviates in its bid from $b^{G*}_{i,k}$ to $b^G_{i,k}$, then these inequalities become

\begin{align}
\hat{\lambda}^*_{1,i}\big( b^G_{i,k}, b^*_{-(i,k)} \big) &\leq \min\{ b^G_{1,k}, \lambda^*_1(b^*) \} \tag{3.125} \\
\hat{\lambda}^*_{2,i}\big( b^G_{i,k}, b^*_{-(i,k)}, w \big) &\leq \min\{ b^G_{2,k}(w), \lambda^*_2(b^*, w) \}. \tag{3.126}
\end{align}

Inequalities (3.125) and (3.126) show that the generator can only bring its nodal price down by deviating in its bid. Setting $b = (b^G_{i,k}, b^*_{-(i,k)})$, its expected profit when deviating unilaterally becomes

\begin{align}
\pi^G_{i,k}(b) &= \hat{\lambda}^*_{1,i}(b)\, \hat{y}^{G*}_{1,k}(b) - c_{1,k}(\hat{y}^{G*}_{1,k}(b)) + \sum_w \Big( \hat{\lambda}^*_{2,i}(b,w)\, \hat{y}^{G*}_{2,k}(b,w) - c_{2,k}(\hat{y}^{G*}_{2,k}(b,w)) \Big) p_w \tag{3.127} \\
&\leq \hat{\lambda}^*_{1,i}(b^*)\, \hat{y}^{G*}_{1,k}(b) - c_{1,k}(\hat{y}^{G*}_{1,k}(b)) + \sum_w \Big( \hat{\lambda}^*_{2,i}(b^*,w)\, \hat{y}^{G*}_{2,k}(b,w) - c_{2,k}(\hat{y}^{G*}_{2,k}(b,w)) \Big) p_w. \tag{3.128}
\end{align}

Given $\hat{\lambda}^*_{1,i}(b^*)$ and $\hat{\lambda}^*_{2,i}(b^*, w)$ for all $w$, (SPP-P) KKT conditions (3.38)-(3.41) constitute first order conditions for maximization of the right hand side of (3.128). This implies that out of all possible allocations, $\{\hat{y}^{G*}_{1,k}(b^*), \hat{y}^{G*}_{2,k}(b^*, w)\}$ maximizes the right hand side of (3.128), and

\begin{align}
\pi^G_{i,k}(b) &\leq \hat{\lambda}^*_{1,i}(b^*)\, \hat{y}^{G*}_{1,k}(b^*) - c_{1,k}(\hat{y}^{G*}_{1,k}(b^*)) + \sum_w \Big( \hat{\lambda}^*_{2,i}(b^*,w)\, \hat{y}^{G*}_{2,k}(b^*,w) - c_{2,k}(\hat{y}^{G*}_{2,k}(b^*,w)) \Big) p_w \tag{3.129} \\
&= \pi^G_{i,k}(b^*). \tag{3.130}
\end{align}

Therefore, no generator can increase its payoff by unilaterally deviating from $b^{G*}_{i,k}$ in its bid.

Turning to the LSEs, collecting (DED) KKT conditions (3.110) and (3.112) across all nodes and LSEs gives that

\begin{align}
\lambda^*_1(b^*) = \max_{j,\, k \in \mathcal{L}_j} b^{L*}_{1,k} &\leq \hat{\lambda}^*_{1,i} \tag{3.131} \\
\lambda^*_2(b^*, w) = \max_{j,\, k \in \mathcal{L}_j} b^{L*}_{2,k}(w) &\leq \hat{\lambda}^*_{2,i}(w). \tag{3.132}
\end{align}

If an LSE $(i,k)$ deviates in its bid from $b^{L*}_{i,k}$ to $b^L_{i,k}$, then these inequalities become

\begin{align}
\max\{ b^L_{1,k}, \lambda^*_1(b^*) \} &\leq \hat{\lambda}^*_{1,i} \tag{3.133} \\
\max\{ b^L_{2,k}(w), \lambda^*_2(b^*, w) \} &\leq \hat{\lambda}^*_{2,i}(w). \tag{3.134}
\end{align}

Thus, any LSE $(i,k)$ can only drive its nodal electricity prices up when deviating from $b^{L*}_{i,k}$.
Setting $b = (b^L_{i,k}, b^*_{-(i,k)})$, its expected payoff when deviating unilaterally becomes

\begin{align}
\pi^L_{i,k}(b) &= -\hat{\lambda}^*_{1,i}(b)\, \hat{y}^{L*}_{1,k}(b) + \sum_w \Big( u_{i,k}(\hat{y}^{L*}_{1,k}(b), \hat{y}^{L*}_{2,k}(b,w), w) - \hat{\lambda}^*_{2,i}(b,w)\, \hat{y}^{L*}_{2,k}(b,w) \Big) p_w \tag{3.135} \\
&\leq -\hat{\lambda}^*_{1,i}(b^*)\, \hat{y}^{L*}_{1,k}(b) + \sum_w \Big( u_{i,k}(\hat{y}^{L*}_{1,k}(b), \hat{y}^{L*}_{2,k}(b,w), w) - \hat{\lambda}^*_{2,i}(b^*,w)\, \hat{y}^{L*}_{2,k}(b,w) \Big) p_w. \tag{3.136}
\end{align}

The KKT conditions for (SPP-U) concerning the LSE utility function terms are as follows:

\begin{align}
\hat{\lambda}^*_{1,i} - \sum_w u'_{i,k}(\hat{y}^{L*}_{1,k}, \hat{y}^{L*}_{2,k}(w), w)\, p_w &\geq 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.137} \\
\hat{y}^{L*}_{1,k} \Big( \hat{\lambda}^*_{1,i} - \sum_w u'_{i,k}(\hat{y}^{L*}_{1,k}, \hat{y}^{L*}_{2,k}(w), w)\, p_w \Big) &= 0, \quad \forall i, k \in \mathcal{L}_i \tag{3.138} \\
\hat{\lambda}^*_{2,i}(w) - u'_{i,k}(\hat{y}^{L*}_{1,k}, \hat{y}^{L*}_{2,k}(w), w) &\geq 0, \quad \forall i, k \in \mathcal{L}_i, w \tag{3.139} \\
\hat{y}^{L*}_{2,k}(w) \Big( \hat{\lambda}^*_{2,i}(w) - u'_{i,k}(\hat{y}^{L*}_{1,k}, \hat{y}^{L*}_{2,k}(w), w) \Big) &= 0, \quad \forall i, k \in \mathcal{L}_i, w. \tag{3.140}
\end{align}

Note that since it is the sum of $\hat{y}^L_{1,k}$ and $\hat{y}^L_{2,k}(w)$ that appears in (LSE$_{i,k}(\hat{y}^L_{1,k}, \hat{y}^L_{2,k}(w), w)$), the set of subgradients of $u_{i,k}(\cdot, \cdot, w)$ with respect to $\hat{y}^L_{1,k}$ or $\hat{y}^L_{2,k}(w)$ is the same, and wherever $u_{i,k}(\cdot, \cdot, w)$ is differentiable, the derivative with respect to either argument is the same.

As in the proof for the generators, these conditions show that given $\hat{\lambda}^*_{1,i}$ and $\hat{\lambda}^*_{2,i}(w)$ for all $w$, out of all possible allocations, $\{\hat{y}^{L*}_{1,k}(b^*), \hat{y}^{L*}_{2,k}(b^*, w)\}$ maximizes the right hand side of (3.136), and

\begin{align}
\pi^L_{i,k}(b) &\leq -\hat{\lambda}^*_{1,i}(b^*)\, \hat{y}^{L*}_{1,k}(b^*) + \sum_w \Big( u_{i,k}(\hat{y}^{L*}_{1,k}(b^*), \hat{y}^{L*}_{2,k}(b^*,w), w) - \hat{\lambda}^*_{2,i}(b^*,w)\, \hat{y}^{L*}_{2,k}(b^*,w) \Big) p_w \tag{3.141} \\
&= \pi^L_{i,k}(b^*). \tag{3.142}
\end{align}

Therefore, no LSE can increase its payoff by unilaterally deviating from $b^{L*}_{i,k}$ in its bid.

The second sufficient condition for existence of an efficient Nash equilibrium for (DED) is the following.

Definition 4. (Monopoly-Free Condition) At each bus, there are either at least two generators, or no generators at all. Also, at each bus, there are either at least two LSEs, or no LSEs at all.
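Since Definition 4 depends only on how many generators and LSEs sit at each bus, it can be checked mechanically. The following is a minimal sketch of such a check, assuming the network is described simply by a list giving the bus index of each generator and of each LSE; the function names and the sample bus assignments are hypothetical and used only for illustration.

```python
from collections import Counter


def side_is_monopoly_free(buses):
    """True if every bus hosting a participant hosts at least two of them.

    `buses` lists the bus index of each participant of one type
    (generators or LSEs); buses with no participants simply do not appear.
    """
    counts = Counter(buses)
    return all(n >= 2 for n in counts.values())


def is_monopoly_free(generator_buses, lse_buses):
    """Check Definition 4 for both generators and LSEs."""
    return side_is_monopoly_free(generator_buses) and side_is_monopoly_free(lse_buses)


# Hypothetical two-bus networks.
print(is_monopoly_free([1, 1, 2, 2], [1, 1]))  # two generators per bus, two LSEs at bus 1
print(is_monopoly_free([1, 2], [1, 1]))        # each bus has a lone generator
```

Keeping the generator and LSE checks separate is a deliberate choice in this sketch, since it allows the condition to be examined for one side of the market at a time.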
Before proving the sufficiency of the monopoly-free condition for ensuring the existence of an efficient Nash equilibrium, we provide a counterexample to demonstrate the necessity of our reformulation of the (SPP-P) problem in terms of LSE utility functions. The following example demonstrates that if LSEs bid marginal costs for demand response, then the monopoly-free condition fails to ensure the existence of an efficient Nash equilibrium.

Example 1. For this counterexample, we set aside planned blackouts and uncertainty in generation, so that (SPP-P) reduces to a single stage problem, and consider the network illustrated in Figure 3.1.

Figure 3.1 Network diagram for Example 1. Generators are represented by circles, LSEs are represented by triangles.

Suppose that both the generator cost functions and LSE demand response cost functions are quadratic, with no constant terms. There are two buses, the first with a single generator and two LSEs, the second with a single generator. For this example the generators are assumed to be nonstrategic, as we focus on the strategic behavior of the LSEs. We simplify the subscripts here to refer only to specific generators and LSEs:

\begin{align*}
c_1(y) &= 80 y^2 + 40 y \\
c_2(y) &= 40 y^2 + 20 y \\
c_{dr,1}(x) &= 10 x^2 + 20 x, \quad D_1 = 30 \\
c_{dr,2}(x) &= 10 x^2 + 30 x, \quad D_2 = 20.
\end{align*}

For the network characteristics, we take $f^{\max}_{12} = f^{\max}_{21} = 2$, and $B_{12} = B_{21} = 1$. The optimal primal solution for (SPP-U) with these parameters is as follows:

\begin{equation}
\hat{y}^{G*}_1 = 3, \quad \hat{y}^{G*}_2 = 2, \quad (\hat{x}^{L*}_1, \hat{y}^{L*}_1) = (25, 5), \quad (\hat{x}^{L*}_2, \hat{y}^{L*}_2) = (20, 0), \quad \hat{\theta}^* = 2, \tag{3.143}
\end{equation}

with optimal dual solution

\begin{equation}
\hat{\lambda}^*_1 = 520, \quad \hat{\lambda}^*_2 = 180, \quad \hat{\mu}^*_1 = 520, \quad \hat{\mu}^*_2 = 430, \quad \hat{\gamma}^*_{12} = 0, \quad \hat{\gamma}^*_{21} = 340. \tag{3.144}
\end{equation}

Here LSE 2 covers its entire demand $D_2 = 20$ through demand response, consistent with $\hat{\mu}^*_2 = c'_{dr,2}(20) = 430$.

If the generators each bid the LMP for their respective node, and the LSEs bid $\hat{\mu}^*$, corresponding to their respective marginal costs for demand response at the optimal allocation (3.143), i.e., if each entity bids its respective term from (3.144), then the (SPP-P) primal and dual solutions are optimal for the corresponding (DED) game. It can be shown that LSE 1 will not benefit from increasing its bid, which could only cause an increase in $\hat{\lambda}^*_1$, the price it pays for electricity. However, due to the KKT conditions of the (DED) game, it can unilaterally deviate and bid in the range

\begin{equation}
b^{L*}_2 = \hat{\mu}^*_2(b^*) = 430 \leq b^L_1 \leq 520 = \hat{\lambda}^*_1(b^*) = b^{G*}_1, \tag{3.145}
\end{equation}

where the dependence of the (DED) dual variables on the bid profile is made explicit, and $b^*$ is the efficient bid profile given in (3.105). Under bid profile $b^*$, the payoff for LSE 1 is

\begin{align}
\pi^L_1(b^*) &= -\hat{\lambda}^*_1(b^*)\, \hat{y}^{L*}_1(b^*) - c_{dr,1}(\hat{x}^{L*}_1(b^*)) \tag{3.146} \\
&= -520 \cdot 5 - 10 \cdot 25^2 - 20 \cdot 25 \tag{3.147} \\
&= -9350. \tag{3.148}
\end{align}

Suppose that LSE 1 deviates and bids $b^L_1 = 440$. Then, denoting the bid profile including LSE 1's deviation as $b$, the optimal (DED) primal solution becomes

\begin{equation}
\hat{y}^{G*}_1(b) = 0, \quad \hat{y}^{G*}_2(b) = 2, \quad (\hat{x}^{L*}_1(b), \hat{y}^{L*}_1(b)) = (28, 2), \quad (\hat{x}^{L*}_2(b), \hat{y}^{L*}_2(b)) = (20, 0), \quad \hat{\theta}^*(b) = 2. \tag{3.149}
\end{equation}

Note that since the reported cost of generator 2 remains lower than the cost of demand response for either LSE, the planner still finds it optimal to dispatch generator 2 to the extent that the transmission line constraint $f^{\max}_{12} = 2$ allows. Since both LSEs now underbid generator 1, generator 1 is not dispatched. The two units provided by generator 2 go to LSE 1, since it still reports a higher demand response cost than LSE 2, and LSE 1's reported demand response cost now sets the price of energy at node 1.
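The arithmetic of Example 1 can be checked directly. The short sketch below, whose function names are hypothetical, evaluates the marginal costs implied by the stated quadratic cost functions at the allocation (3.143) and compares LSE 1's payoff at that allocation with its payoff at the deviation allocation (3.149), under the assumption stated in the example that LSE 1's bid of 440 sets the node 1 price after the deviation.

```python
def c_dr1(x):
    """Demand response cost of LSE 1 from Example 1."""
    return 10 * x**2 + 20 * x


def marginal(a, b, q):
    """Derivative of the quadratic cost a*q**2 + b*q."""
    return 2 * a * q + b


def lse1_payoff(price, y, x):
    """Payoff of LSE 1: negative energy payment minus demand response cost."""
    return -price * y - c_dr1(x)


# Marginal costs at the allocation (3.143).
gen1_mc = marginal(80, 40, 3)   # generator 1 producing 3 units
gen2_mc = marginal(40, 20, 2)   # generator 2 producing 2 units
lse1_mc = marginal(10, 20, 25)  # LSE 1 with 25 units of demand response
lse2_mc = marginal(10, 30, 20)  # LSE 2 covering its full demand D_2 = 20
print(gen1_mc, gen2_mc, lse1_mc, lse2_mc)  # 520 180 520 430

# LSE 1's payoff before and after deviating to a bid of 440.
payoff_efficient = lse1_payoff(price=520, y=5, x=25)
payoff_deviation = lse1_payoff(price=440, y=2, x=28)
print(payoff_efficient, payoff_deviation)  # -9350 -9280
```

The printed marginal costs match the dual values in (3.144), and the two payoffs differ by 70 in favor of the deviation.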
The corresponding dual solution is

$$\hat{\lambda}^*_1(b) = 440, \quad \hat{\lambda}^*_2(b) = 180, \quad \hat{\mu}^*_1(b) = 440, \quad \hat{\mu}^*_2(b) = 430, \quad \hat{\gamma}^*_{12}(b) = 0, \quad \hat{\gamma}^*_{21}(b) = 260, \tag{3.150}$$

and LSE 1's payoff under bid profile $b$ is

\begin{align}
\pi^L_1(b) &= -\hat{\lambda}^*_1(b)\,\hat{y}^{L*}_1(b) - c_{dr,1}(\hat{x}^{L*}_1(b)) \tag{3.151}\\
&= -440 \cdot 2 - 10 \cdot 28^2 - 20 \cdot 28 \tag{3.152}\\
&= -9280 > -9350 = \pi^L_1(b^*). \tag{3.153}
\end{align}

Thus, while LSE 1 consumes less energy at the reduced price, this reduction in the cost of energy consumption outweighs the increased cost of demand response it must take on in the outcome under bid profile $b$. Note that if LSE 2 were to also consume energy, then its bid $b^{L*}_2$ would be equal to 520 as well, and LSE 1 would not be able to bring the cost of electricity at node 1 down with its demand response bid.

In essence, our reformulation of (SPP-P) in terms of LSE utility functions for electricity consumption, together with the prescribed bid format and efficient bid (3.105), makes the monopoly-free condition sufficient because it ensures that for each LSE in the network at any given node, there will always exist another LSE bidding the equilibrium price at the same node. Therefore, LSEs which underbid will be forced to fulfill their demand entirely through demand response or planned blackouts, which can be shown to yield a payoff no better than their equilibrium payoff. The details can be found in the following proof.

Theorem 9. Under Assumption 2 and the monopoly-free condition, there exists an efficient Nash equilibrium for (DED).

Proof. The proof for the sufficiency of the monopoly-free condition is similar to that of the congestion-free condition. Since there is no guarantee on congestion conditions, the optimal dual variables corresponding to the (SPP-U) power balance constraints (3.92) and (3.93) are $\lambda^*_1 = (\lambda^*_{1,1}, \ldots, \lambda^*_{1,N})^\top$ and $\lambda^*_2(w) = (\lambda^*_{2,1}(w), \ldots, \lambda^*_{2,N}(w))^\top$ for all $w$. In general the individual entries of these vectors may take on different values.
Under the monopoly-free condition, the bounds on the nodal prices, i.e., the Lagrange multipliers corresponding to the (DED) power balance constraints (3.99)-(3.100), can only be aggregated per node. In the case of the generators, for an individual node $i$ this gives

\begin{align}
\hat{\lambda}^*_{1,i}(b^*) &\le \min_{k \in G_i} b^{G*}_{1,k} = \lambda^*_{1,i}(b^*) \tag{3.154}\\
\hat{\lambda}^*_{2,i}(b^*, w) &\le \min_{k \in G_i} b^{G*}_{2,k}(w) = \lambda^*_{2,i}(b^*, w). \tag{3.155}
\end{align}

Under unilateral deviation by generator $(i,k)$ to bid $b^G_{i,k}$, these inequalities become

\begin{align}
\hat{\lambda}^*_{1,i}\left(b^G_{i,k}, b^*_{-(i,k)}\right) &\le \min\left\{ b^G_{1,k},\ \lambda^*_{1,i}(b^*) \right\} \tag{3.156}\\
\hat{\lambda}^*_{2,i}\left(b^G_{i,k}, b^*_{-(i,k)}, w\right) &\le \min\left\{ b^G_{2,k}(w),\ \lambda^*_{2,i}(b^*, w) \right\}. \tag{3.157}
\end{align}

Again, the nodal prices can only decrease as a result of the generator's unilateral deviation, so that the remainder of the previous proof can be used in this case. Similarly, the LSEs can only drive their nodal prices up, and it can be shown that their payoff will not increase via unilateral deviation either. Thus, the bid profile given in (3.105) constitutes a Nash equilibrium under the monopoly-free condition as well.

The key idea behind both of these conditions is that there are other market participants in the network bidding such that individual generators or LSEs can only cause prices to change in a direction which does not lead to increased payoff. In the congestion-free case, these other participants can be located anywhere in the network, while in the monopoly-free case, they are colocated at the same node. Note that while the monopoly-free condition can be checked from the network topology alone, the congestion-free condition requires knowledge of the optimal solution, given complete transparency on the part of all market participants. Even if such information is available, the specifics of a given problem instance, i.e., existence of congestion at the optimal dispatch, may still preclude application of the congestion-free condition for guaranteeing existence of an efficient Nash equilibrium.
3.7 Conclusions

In this paper, we have proposed a two-stage market mechanism that integrates renewable energy generators, as an alternative to the extant multi-settlement markets, which are operated independently even though the decision-making of the market participants is coupled. We formulate the two-stage economic dispatch problem as a two-stage stochastic program with recourse. We first show that a sequential competitive equilibrium indeed exists in such a two-stage market. We show that every sequential competitive equilibrium supports an efficient allocation, and conversely that every efficient allocation can be supported by a sequential competitive equilibrium. We then design a market mechanism for such settings. We show that when market participants act strategically, and if either a congestion-free or a monopoly-free condition is satisfied, a Nash equilibrium of the two-stage market mechanism exists and is efficient. We also give a counterexample showing that if LSEs bid marginal costs, then these conditions are not enough to guarantee the existence of an efficient Nash equilibrium.

We have ignored physical aspects of the network such as transmission loss. These could be incorporated at the cost of greater notational and proof complexity, and the essence of the results presented would be the same. Additional economic aspects such as demand elasticity are left to future work.

Chapter 4
A Risk Aware Two-Stage Market Mechanism for Electricity with Renewable Generation

4.1 Introduction

Electricity markets covering the majority of the US, and most of the industrialized world, are operated as multi-settlement markets. These markets are organized in the sense that demand for and supply of energy and ancillary services are matched via a centralized auction mechanism, as opposed to bilateral negotiations over individual transactions [35], [54].
An independent system operator (ISO) runs a given market as a series of multi-stage forward markets and a real time or spot market. Depending upon the region, forward markets are settled days or hours ahead of the intended time of delivery, allowing for provision of cheaper, but relatively inflexible generation. Spot markets open and are settled minutes before delivery in order to balance supply and demand in real time.

While the clearing mechanisms in electricity markets are designed with the objective of maximizing welfare of both producers and consumers, the imperative to increase penetration of renewables and reduce reliance on fossil fuels now strains the existing multi-settlement paradigm. In the past, the primary source of uncertainty in market clearing came from demand side deviations, and such errors were reduced to under 5% by the opening of spot markets [41]. Current levels of renewable adoption have already exacerbated typical levels of uncertainty for net demand (demand minus renewable generation), and under official mandates to increase the proportion of energy sourced from renewables this trend will only continue. For example, in California renewables constitute nearly 34% of total retail sales, while a recently passed state bill legislates that 100% of power come from renewables by 2045 [21], [29]. Reliance on renewables to this degree introduces an order of magnitude greater uncertainty in net demand, and necessitates novel market designs to address this challenge.

As a starting point, given that economic dispatch is a multi-settlement process, it makes sense to couple markets across forward and real time stages, and then allow for recourse decisions, rather than settle each stage independently. If a probability distribution for the stochastic generation is known, then maximization of expected welfare is a reasonable objective.
This type of problem can be formulated as a two-stage stochastic program, and in fact it is possible to show that stochastic clearing is more efficient than two-settlement systems [35], [53].

There are a couple of issues with the use of expected social welfare as an objective function. In purely mathematical terms, a given realization of a random variable can be quite different from its expectation. Thus optimization of an expectation guarantees little in terms of variation over possible outcomes. Further, real-world observations indicate that economic decision makers are risk averse, or at least act as if they were [16]. Therefore, given increasing levels of generation variability, it is of both theoretical and practical interest to incorporate some notion of risk into market objective functions.

In this work we study how the introduction of risk preferences into the central objective function affects market operation. We consider a setting with an ISO and multiple generators. The ISO owns a nondispatchable, renewable resource, and the market clears in two stages: a forward stage in which only a forecast for renewable generation is available, and a real time stage, wherein the exact realized renewable generation is known. The generators each own primary and ancillary plants, which may be dispatched in the forward and real time stages, respectively. In the forward market, the ISO schedules primary energy procurements from the generators, and in the real time market it purchases ancillary service where necessary. All participants are assumed to be non-strategic price takers. However, while we assume that the generators seek to maximize their expected profit, the ISO is risk averse and minimizes a weighted sum of the expectation and conditional value at risk (CVaR) of its costs. CVaR has over the past two decades become the most widely used risk measure, due to the fact that it is a coherent risk measure and can be calculated via a convex program [27].
Our main result is the proof of existence of a sequential competitive equilibrium (SCEq) in this risk aware, two-stage market with recourse. In particular, we demonstrate the existence of first and second stage prices such that, given these prices, the generation decisions of the generators in both stages achieve market clearance in stage two, thus balancing supply and demand. We then specify a two-stage market mechanism which implements the SCEq.

Related work. Numerous past works have studied market and mechanism design and equilibrium outcomes in the two-stage expected welfare maximization, or risk neutral, setting, e.g., [24], [76] and [78].

Turning to literature which incorporates risk preferences, several works consider settings in which agents may enter into contracts in order to hedge against risky outcomes. In [61] it is shown that a complete market, wherein all uncertainties can be addressed via a balanced set of contracts, involving agents equipped with coherent risk measures, is equivalent to one in which said agents are risk neutral and take actions based on a probability density function determined by a system risk agent. The work then investigates necessary and sufficient conditions for existence of an equilibrium consisting of allocations, prices and contracts. Assuming a similar setting in the context of hydrothermal markets, [58] shows that, given that a sufficiently rich set of securities is available to risk averse agents, a multi-stage competitive equilibrium may be derived from the solution to a risk averse social planning problem. [35] investigates difficulties that may arise when risk averse agents maximize their welfare in markets that are not complete, including the existence of multiple, potentially stable equilibria.
Our setting differs from these works in that we have one risk aware customer for multiple risk neutral producers, and we do not allow for transactions between agents outside of the quantities of energy purchased and consumed.

4.2 Preliminaries

4.2.1 Risk Measures

In stochastic optimization we are concerned with losses $Z(\omega) = L(x,\omega)$ that are a function of both a decision $x$ and some random outcome $\omega$, unknown when the decision is made. Generally speaking, a risk measure is a functional which accepts as input the entire collection of realizations $Z(\omega)$, $\omega \in \Omega$.

More specifically, consider a sample space $(\Omega, \mathcal{F})$ equipped with sigma algebra $\mathcal{F}$, on which random functions $Z = Z(\omega)$ are defined. A risk measure $\rho(Z)$ maps such random functions into the extended real line [69]. Often the domain of $\rho$, denoted $\mathcal{Z}$, is taken as $L_p(\Omega, \mathcal{F}, P)$ for some $p \in [1, +\infty)$ and reference probability measure $P$. The following characteristics of risk measures will become useful in later sections.

Definition 5. A proper risk measure satisfies $\rho(Z) > -\infty$ for all $Z \in \mathcal{Z}$ and $\mathrm{dom}(\rho) := \{Z \in \mathcal{Z} : \rho(Z) < \infty\} \neq \emptyset$.

We denote by $Z \succeq Z'$ the pointwise partial order, meaning $Z(\omega) \ge Z'(\omega)$ for a.e. $\omega \in \Omega$.

Definition 6. A risk measure is monotonic if $Z, Z' \in \mathcal{Z}$ and $Z \succeq Z'$ implies $\rho(Z) \ge \rho(Z')$.

Definition 7. A risk measure is coherent if it is monotonic, convex and satisfies translation equivariance and positive homogeneity (see [69] for details on these properties).

4.2.2 Conditional value at risk

In the following sections, we will focus in particular on conditional value at risk, or CVaR. CVaR is an example of a coherent risk measure [69]. Before defining CVaR, we introduce the related quantity, value at risk.

Suppose that random variable $Z$ is distributed according to Borel probability measure $P$, with associated sample space $(\Omega, \mathcal{F})$ and cdf $F$. When $Z$ represents losses, the $\alpha$-Value-at-Risk is defined as follows.

Definition 8.
For a given confidence level $\alpha \in (0,1)$, the $\alpha$-Value-at-Risk or $\mathrm{VaR}_\alpha$ of random loss $Z = Z(\omega)$ is

$$\mathrm{VaR}_\alpha(Z) = \min\{z : F(z) \ge \alpha\}. \tag{4.1}$$

Thus, $\mathrm{VaR}_\alpha(Z)$ is the lowest amount $z$ such that, with probability $\alpha$, $Z$ will not exceed $z$. In the case where $F$ is continuous and strictly increasing, $\mathrm{VaR}_\alpha(Z)$ is the unique $z$ satisfying $F(z) = \alpha$. Otherwise, it is possible that the equation $F(z) = \alpha$ has no solution, or an interval of solutions, depending upon the choice of $\alpha$. This, among other difficulties, motivates the use of alternative risk measures such as CVaR [63].

Informally, $\mathrm{CVaR}_\alpha$ of $Z$ gives the expected value of $Z$, given that $Z \ge \mathrm{VaR}_\alpha(Z)$. The precise definition is as follows. Let $[x]_+ = \max\{0, x\}$.

Definition 9. [63] Let

$$\phi_\alpha(Z, \zeta) = \zeta + \frac{1}{1-\alpha} E[Z - \zeta]_+. \tag{4.2}$$

Then

$$\mathrm{CVaR}_\alpha(Z) = \min_\zeta \phi_\alpha(Z, \zeta), \quad \text{and} \quad \mathrm{VaR}_\alpha(Z) = \text{lower endpoint of } \arg\min_\zeta \phi_\alpha(Z, \zeta). \tag{4.3}$$

It follows from the joint convexity of $\phi_\alpha$ in $Z$ and $\zeta$ that $\mathrm{CVaR}_\alpha$ is convex over $\mathcal{Z}$. Restricting attention to random losses $Z(\omega) = L(x,\omega)$ which depend upon a decision $x$, we have the following result.

Theorem 10. Let $Z(\omega) = L(x,\omega)$. If the mapping $x \mapsto Z$ is convex in $x$, then $\mathrm{CVaR}_\alpha(Z)$ is convex in $x$ [63].

Theorem 10 will later ensure that optimization problems with objectives including a $\mathrm{CVaR}_\alpha$ term are convex.

4.3 Risk Aware Stochastic Economic Dispatch Formulation

We consider a setting with $N$ conventional generators and a single renewable generator. An additional entity, the independent system operator (ISO), operates the power grid and plays the role of the social planner (from this point we use the terms interchangeably). For simplicity we consider a single bus network. We consider a two-stage setting, where generation is dispatched in the first stage (also referred to as day-ahead or DA) and then adjusted in the second stage (real time or RT) to match demand. Let $D \ge 0$ denote the aggregate demand. This demand is assumed inelastic, i.e., it is not affected by changes in first or second stage prices.
The renewable generator's output is modeled as a nonnegative random variable $W$, upper bounded by $\overline{W} > 0$. We make the following additional assumption on the distribution of $W$.

Assumption 3. Random variable $W$ is distributed according to pdf $f_W$ (and cdf $F_W$), which is continuous and positive on $[0, \overline{W}]$.

The probability distribution of $W$ is assumed to be known to all market participants. The marginal cost of renewable generation is zero. The quantity of renewable generation scheduled is denoted $y$.

Conventional generator $i$ has access to a primary plant and an ancillary plant. Generator $i$ schedules its primary plant to produce quantity $x^G_i \in \mathbb{R}_+$ prior to realization of $W$, at cost $a_i (x^G_i)^2$ where $a_i > 0$. We assume the primary plant is inflexible, so that its generation level must remain fixed once it is scheduled. After realization of $W$, generator $i$ can activate its ancillary plant to produce level $z^G_i \in \mathbb{R}_+$ at cost $\tilde{a}_i (z^G_i)^2$ where $\tilde{a}_i > 0$. Any ancillary generation produced in excess of aggregate demand $D$ can be disposed of or sold in a separate spot market, which we do not consider. We assume that $a_i < \tilde{a}_i$ for all $i$, $a_i \neq a_j$ for $i \neq j$, and that $\max_i a_i < \min_i \tilde{a}_i$ and $\tilde{a}_i \neq \tilde{a}_j$ for $i \neq j$.

The generator is compensated for its first stage production $x^G_i$ at price $P_1$. In the second stage, given $W = w$, the generator is compensated for second stage generation $z^G_i(w)$ at price $P_2(w)$.

4.3.1 Generator's Problem

We assume that each generator $i$ is price taking, i.e., its decisions $x^G_i$ and $z^G_i(w)$ do not affect prices in either stage. Therefore, generator $i$'s profit is given by

$$\pi^G_i(x^G_i, z^G_i(w)) := P_1 x^G_i - a_i (x^G_i)^2 + P_2(w) z^G_i(w) - \tilde{a}_i (z^G_i(w))^2. \tag{4.4}$$

Each generator is risk neutral, and so makes its first and second stage decisions to maximize the expectation of (4.4). In stage 2, given renewable production level $w$ and price $P_2(w)$, generator $i$ solves the following problem

$$(\text{GEN2}_i) \qquad \max_{z^G_i(w) \ge 0} \; P_2(w) z^G_i(w) - \tilde{a}_i (z^G_i(w))^2.$$
(4.5)

Let $\pi^2_i(w, P_2)$ be the maximum objective value obtained in solving (4.5), given $w$ and $P_2$. Then in the first stage, given price $P_1$, generator $i$ solves the following problem

$$(\text{GEN1}_i) \qquad \max_{x^G_i \ge 0} \; P_1 x^G_i - a_i (x^G_i)^2 + E[\pi^2_i(W, P_2)]. \tag{4.6}$$

The term $E[\pi^2_i(W, P_2)]$ is a constant when optimizing over $x^G_i$, as generator $i$'s DA and RT decisions can be made independently. In order to emphasize the fact that generator $i$ observes $W = w$ prior to selecting $z^G_i(w)$, we separate generator $i$'s two optimization problems.

4.3.2 ISO's Problem

In Section 4.4, our definition of a sequential competitive equilibrium includes a tuple of allocations, i.e., generation levels. For the purposes of examining the welfare properties of these allocations, we now introduce a two-stage social planner's problem (SPP), corresponding to our two-settlement market. As in the static case, the SPP involves maximizing the social welfare of all market participants. We take the welfare of generator $i$ to be the negation of its generation costs from stages 1 and 2. Given $W = w$, the aggregate welfare is the negation of the summation of these costs over all generators:

$$c^{SPP}(w) := \sum_i \left( a_i \hat{x}_i^2 + \tilde{a}_i \hat{z}_i^2(w) \right), \tag{4.7}$$

where $(\hat{x}_i, \hat{z}_i(w))$ for all $i$ and $w$ are the social planner's decisions in stages 1 and 2. Define $\hat{x} := (\hat{x}_1, \ldots, \hat{x}_N)$, and similarly $\hat{z}(w) := (\hat{z}_1(w), \ldots, \hat{z}_N(w))$.

We assume that the social planner is risk averse. That is, instead of seeking to minimize the expectation of (4.7), the planner seeks to minimize a weighted combination of $E[c^{SPP}(W)]$ and $\mathrm{CVaR}_\alpha(c^{SPP}(W))$. The level $\alpha \in [0, 1)$ signifies that the ISO considers worst case or tail events with cumulative probability $1 - \alpha$ to be "risky", and therefore weights them more heavily.
We now introduce the additional parameter $\epsilon \in [0, 1]$, which gives the social planner's relative weighting of overall expectation and $\mathrm{CVaR}_\alpha$ of the first and second stage generation costs, and define the social planner's risk measure as

$$\rho^{SPP}(\cdot) = (1 - \epsilon) E[\cdot] + \epsilon\, \mathrm{CVaR}_\alpha(\cdot). \tag{4.8}$$

It can be shown that $\rho^{SPP}(\cdot)$ is a coherent risk measure [69].

Given that $\hat{y}$ is the amount of renewable generation scheduled by the social planner in stage 1, and $W = w$, the social planner's second stage problem is

\begin{align}
(\text{SPP2}) \qquad \min_{\hat{z}(w) \ge 0} \quad & \sum_i \tilde{a}_i \hat{z}_i^2(w) \tag{4.9}\\
\text{s.t.} \quad & \sum_i \hat{z}_i(w) \ge \hat{y} - w. \tag{4.10}
\end{align}

Note that constraint (4.10) is an inequality in order to accommodate scenarios in which renewable generation exceeds residual demand $D - \sum_i \hat{x}_i = \hat{y}$.

Define $c^{SPP}_2(\hat{x}, w)$ as the minimum aggregate social cost achieved in the second stage, given $\hat{x}$ and $W = w$. Then the social planner's first stage problem is

\begin{align}
(\text{SPP1}) \qquad \min_{\hat{x}, \hat{y} \ge 0} \quad & \sum_i a_i \hat{x}_i^2 + \rho^{SPP}\left( c^{SPP}_2(\hat{x}, W) \right) \tag{4.11}\\
\text{s.t.} \quad & \sum_i \hat{x}_i + \hat{y} = D, \tag{4.12}
\end{align}

where we have used translation equivariance of $\mathrm{CVaR}_\alpha$ to move the summed first stage costs outside of $\rho^{SPP}$. We now argue that problems (SPP1) and (SPP2) can be combined into the following single stage optimization problem.

Lemma 11. The two-stage problem (SPP1)-(SPP2) is equivalent to the following single stage problem:

\begin{align}
(\text{SPP}) \qquad \min_{\hat{x}, \hat{y}, \hat{z}(\cdot) \ge 0} \quad & \sum_i a_i \hat{x}_i^2 + \rho^{SPP}\left( \sum_i \tilde{a}_i \hat{z}_i^2(W) \right) \tag{4.13}\\
\text{s.t.} \quad & \sum_i \hat{x}_i + \hat{y} = D \tag{4.14}\\
& \sum_i \hat{z}_i(w) \ge \hat{y} - w \quad \forall w, \tag{4.15}
\end{align}

where $\hat{z}(\cdot) : \mathbb{R}_+ \to \mathbb{R}_+$.

Proof. See Appendix B.

Here we use the term "equivalent" in the sense that (SPP) and (SPP1) have the same optimal objective value. Additionally, if $(\hat{x}^*, \hat{z}^*(\cdot))$ is optimal for (SPP), then $\hat{x}^*$ is optimal for (SPP1), and $\hat{z}^*(w)$ is optimal for (SPP2) for all $w$, given $\hat{x}^*$. Conversely, if $\hat{x}^*$ is optimal for (SPP1) and $\hat{z}^*(\cdot)$ collects the optimal solutions to (SPP2) for all $w$, given $\hat{x}^*$, then $(\hat{x}^*, \hat{z}^*(\cdot))$ is optimal for (SPP).
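The planner's risk measure (4.8) can be evaluated on a finite sample of costs. The sketch below is a minimal illustration under the assumption of equally likely sample losses: it computes the sample CVaR through the minimization in Definition 9 and then forms the weighted combination of expectation and CVaR. The function names `cvar` and `rho_spp`, the argument name `eps` (the planner's relative weight on CVaR) and the sample data are hypothetical and not taken from the source.

```python
def cvar(losses, alpha):
    """Sample CVaR via Definition 9: min over zeta of zeta + E[(Z - zeta)_+] / (1 - alpha).

    For equally likely samples the objective is convex and piecewise linear in zeta,
    with breakpoints at the sample values, so a minimizer can be found among them.
    """
    n = len(losses)

    def phi(zeta):
        excess = sum(max(z - zeta, 0.0) for z in losses)
        return zeta + excess / (n * (1.0 - alpha))

    return min(phi(zeta) for zeta in losses)


def rho_spp(losses, alpha, eps):
    """Weighted combination (1 - eps) * E[Z] + eps * CVaR_alpha(Z), as in (4.8)."""
    mean = sum(losses) / len(losses)
    return (1.0 - eps) * mean + eps * cvar(losses, alpha)


# Hypothetical sample of ten equally likely costs.
losses = [float(k) for k in range(1, 11)]
for eps in (0.0, 0.5, 1.0):
    print(eps, rho_spp(losses, alpha=0.8, eps=eps))
```

With `eps = 0` the measure reduces to the sample mean and with `eps = 1` to the sample CVaR, so intermediate weights interpolate between the risk neutral and the fully tail-focused objective.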
Similar to the equivalence demonstrated for the ISO's problem in Lemma 11, it can be shown that the following single stage problem is equivalent to (GEN1$_i$) and (GEN2$_i$):

$$(\text{GEN}_i) \qquad \max_{x^G_i \ge 0,\ z^G_i(\cdot) \ge 0} \; P_1 x^G_i - a_i (x^G_i)^2 + E\left[ P_2(W) z^G_i(W) - \tilde{a}_i (z^G_i(W))^2 \right], \tag{4.16}$$

where $z^G_i(\cdot) : \mathbb{R}_+ \to \mathbb{R}_+$.

4.4 Sequential Competitive Equilibrium

In a single stage market for a single good, a competitive equilibrium is specified by a price $P$ and quantity $x$ such that, given $P$, producers find it optimal to produce, and consumers find it optimal to purchase, quantity $x$ of the good. Thus, the market clears, i.e., demand equals supply. To understand the outcome of the two-stage market, we consider a sequential version of competitive equilibrium.

Definition 10. A sequential competitive equilibrium (SCEq) is a tuple $(x^*, z^*(\cdot), P^*_1, P^*_2(\cdot))$ such that, for all $i$, given $P^*_1$ and $P^*_2(\cdot)$, $x^*_i$ is optimal for (GEN1$_i$), $z^*_i$ is optimal for (GEN2$_i$), and there exists a $y^*$ such that

$$\sum_i x^*_i + y^* = D, \qquad \sum_i z^*_i(w) \ge y^* - w \quad \forall w. \tag{4.17}$$

Note that in the SCEq definition, $z^*_i(\cdot)$ and $P^*_2(\cdot)$ are functions.

We now investigate the existence of an SCEq in our two-stage, risk aware setting. Let $\hat{\mu}(w)$ be the Lagrange multiplier corresponding to constraint (4.10). The Lagrangian for (SPP2) is

$$L = \sum_i \tilde{a}_i \hat{z}_i^2(w) + \hat{\mu}(w) \left( \hat{y} - w - \sum_i \hat{z}_i(w) \right), \tag{4.18}$$

giving, in addition to feasibility, the following optimality conditions for problem (SPP2):

\begin{align}
2 \tilde{a}_i \hat{z}^*_i(w) - \hat{\mu}^*(w) &\ge 0 \quad \forall i \tag{4.19}\\
\hat{z}^*_i(w) \left( 2 \tilde{a}_i \hat{z}^*_i(w) - \hat{\mu}^*(w) \right) &= 0 \quad \forall i \tag{4.20}\\
\hat{\mu}^*(w) \left( \hat{y} - w - \sum_i \hat{z}^*_i(w) \right) &= 0 \quad \forall w \tag{4.21}\\
\hat{\mu}^*(w) &\ge 0 \quad \forall w. \tag{4.22}
\end{align}

Assuming $\hat{y} > w$, we have $\hat{z}^*_i(w) > 0$ for all $i$, and in particular

$$\hat{z}^*_i(w) = \frac{\hat{\mu}^*(w)}{2 \tilde{a}_i}. \tag{4.23}$$

If $\hat{y} \le w$ then $\hat{z}^*_i(w) = 0$ for all $i$. Summing (4.23) over $i$, applying constraint (4.15), and rearranging gives

$$\hat{\mu}^*(w) = 2 \tilde{a} \cdot [\hat{y} - w]_+, \tag{4.24}$$

where the constant $\tilde{a}$ is defined as $\tilde{a} := \left( \sum_i \frac{1}{\tilde{a}_i} \right)^{-1}$.
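The closed form in (4.23) and (4.24) is easy to check numerically. The following sketch, with a hypothetical function name and made-up coefficient values, computes the multiplier and the resulting ancillary dispatch for a given scheduled renewable quantity and realization, and confirms that the dispatch covers exactly the shortfall whenever the shortfall is positive.

```python
def second_stage_dispatch(a_tilde, y_hat, w):
    """Solve (SPP2) in closed form for ancillary costs a_tilde[i] * z_i**2.

    Returns the dispatch list z and the multiplier mu of the balance constraint.
    """
    shortfall = max(y_hat - w, 0.0)               # [y_hat - w]_+
    a_agg = 1.0 / sum(1.0 / a for a in a_tilde)   # aggregate coefficient (sum_i 1/a_i)^(-1)
    mu = 2.0 * a_agg * shortfall                  # multiplier, as in (4.24)
    z = [mu / (2.0 * a) for a in a_tilde]         # dispatch, as in (4.23)
    return z, mu


# Hypothetical data: two ancillary plants, 10 units scheduled, 4 units realized.
a_tilde = [2.0, 4.0]
z, mu = second_stage_dispatch(a_tilde, y_hat=10.0, w=4.0)
total_cost = sum(a * zi**2 for a, zi in zip(a_tilde, z))
print(z, mu, total_cost)
```

Each plant's marginal cost `2 * a_tilde[i] * z[i]` equals `mu` at the solution, so cheaper plants carry a proportionally larger share of the shortfall.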
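Therefore $\hat{z}^*_i(w) = \frac{\tilde{a} \cdot [\hat{y} - w]_+}{\tilde{a}_i}$. Summing over $i$ gives the optimal second stage objective value (i.e., the minimum recourse cost given $\hat{x}$)

$$c^{SPP}_2(\hat{x}, w) = \sum_{i \in I} \tilde{a}_i \left( \frac{\tilde{a} \cdot [\hat{y} - w]_+}{\tilde{a}_i} \right)^2 = \tilde{a} \cdot [\hat{y} - w]_+^2. \tag{4.25}$$

Therefore, $\mathrm{VaR}_\alpha(c^{SPP}_2(\hat{x}, W))$ may be expressed as

\begin{align}
\mathrm{VaR}_\alpha\left( \tilde{a} \cdot [\hat{y} - W]_+^2 \right) &= \inf\left\{ t : P\left( \tilde{a} \cdot [\hat{y} - W]_+^2 \le t \right) \ge \alpha \right\} \tag{4.26}\\
&= \inf\left\{ t \ge 0 : P\left( W < \hat{y} - \sqrt{t / \tilde{a}} \right) \le 1 - \alpha \right\} \notag\\
&= \begin{cases} 0 & \text{if } \hat{y} < F_W^{-1}(1-\alpha) \\ \tilde{a} \cdot \left( \hat{y} - F_W^{-1}(1-\alpha) \right)^2 & \text{if } \hat{y} \ge F_W^{-1}(1-\alpha). \end{cases} \tag{4.27}
\end{align}

Given this expression for $\mathrm{VaR}_\alpha$, the following lemma gives an explicit expression of $\mathrm{CVaR}_\alpha$ for our quadratic cost function setting.

Lemma 12. Assuming first and second stage generation cost functions of the form $a x^2$ and $\tilde{a} z(w)^2$, $a, \tilde{a} > 0$, $\mathrm{CVaR}_\alpha(c^{SPP}_2(\hat{x}, W))$ can be expressed as

$$\mathrm{CVaR}_\alpha\left( c^{SPP}_2(\hat{x}, W) \right) = \mathrm{CVaR}_\alpha\left( \tilde{a} \cdot [\hat{y} - W]_+^2 \right) = \frac{1}{1-\alpha} \int_0^{\min\{F_W^{-1}(1-\alpha),\, \hat{y}\}} \tilde{a} \cdot (\hat{y} - w)^2 f_W(w)\, dw. \tag{4.28}$$

Proof. Given Assumption 3, the cdf $F_{c^{SPP}_2}$ of losses $c^{SPP}_2(\hat{x}, W)$ will be continuous everywhere except possibly at zero, since $P(c^{SPP}_2(\hat{x}, W) = 0) = P(W \ge \hat{y})$. By Theorem 6.2 of [69], when $\mathrm{VaR}_\alpha(c^{SPP}_2(\hat{x}, W)) > 0$, we may write

$$\mathrm{CVaR}_\alpha(c^{SPP}_2(\hat{x}, W)) = \frac{1}{1-\alpha} \int_{\tilde{a} \cdot (\hat{y} - F_W^{-1}(1-\alpha))^2}^{\tilde{a} \hat{y}^2} q\, f_{c^{SPP}_2}(q)\, dq = \frac{1}{1-\alpha} \int_0^{F_W^{-1}(1-\alpha)} \tilde{a} \cdot (\hat{y} - w)^2 f_W(w)\, dw, \tag{4.29}$$

where $f_{c^{SPP}_2}$ gives the pdf corresponding to $F_{c^{SPP}_2}$. If $\mathrm{VaR}_\alpha(c^{SPP}_2(\hat{x}, W)) = 0$, then using Definition 9, we have that

\begin{align}
\mathrm{CVaR}_\alpha(c^{SPP}_2(\hat{x}, W)) &= \zeta^* + \frac{1}{1-\alpha} E[c^{SPP}_2(\hat{x}, W) - \zeta^*]_+ \notag\\
&= 0 + \frac{1}{1-\alpha} E[c^{SPP}_2(\hat{x}, W) - 0]_+ \notag\\
&= \frac{1}{1-\alpha} \int_0^{\hat{y}} c^{SPP}_2(\hat{x}, w) f_W(w)\, dw. \tag{4.30}
\end{align}

Substituting for $c^{SPP}_2(\hat{x}, w)$ and then combining (4.29) and (4.30) completes the proof.

While $\mathrm{CVaR}_\alpha(c^{SPP}_2(\hat{x}, W))$ is convex in the first stage decision $\hat{x}$ due to (4.25) and Theorem 10, the upper limit $\hat{\theta} = \min\{F_W^{-1}(1-\alpha), \hat{y}\}$ of the integral in (4.28) is not a differentiable function of $\hat{y}$, so that the Leibniz integral rule does not directly apply. The next lemma addresses this issue.

Lemma 13.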
Given Assumption 3, expression (4.28) is continuously differentiable with respect to $\hat{y}$, with derivative

$$\mathrm{CVaR}'_\alpha(c^{SPP}_2(\hat{x}, W)) = \frac{2\tilde{a}}{1-\alpha} \int_0^{\hat{\theta}} (\hat{y} - w) f_W(w)\, dw. \tag{4.31}$$

Proof. We consider two cases, depending on the two possible values of $\hat{\theta}(\hat{y})$. When $\hat{\theta}(\hat{y}) = F_W^{-1}(1-\alpha)$, applying the Leibniz integral rule to (4.28) gives

$$\mathrm{CVaR}'_\alpha(c^{SPP}_2(\hat{x}, W)) = \frac{2\tilde{a}}{1-\alpha} \int_0^{F_W^{-1}(1-\alpha)} (\hat{y} - w) f_W(w)\, dw.$$

When $\hat{\theta}(\hat{y}) = \hat{y}$, application of the Leibniz integral rule gives

$$\mathrm{CVaR}'_\alpha(c^{SPP}_2(\hat{x}, W)) = \frac{2\tilde{a}}{1-\alpha} \int_0^{\hat{y}} (\hat{y} - w) f_W(w)\, dw,$$

since the boundary term $\tilde{a} (\hat{y} - w)^2 f_W(w)$ vanishes at $w = \hat{y}$. Combining the last two equations gives the expression in the lemma statement. When $\hat{y} \ge F_W^{-1}(1-\alpha)$, the upper limit of integration is fixed and $\mathrm{CVaR}'_\alpha(c^{SPP}_2(\hat{x}, W))$ is an affine function of $\hat{y}$, and when $\hat{y} < F_W^{-1}(1-\alpha)$, $\mathrm{CVaR}'_\alpha(c^{SPP}_2(\hat{x}, W))$ is continuous by the continuity of $f_W(w)$. The two expressions agree at $\hat{y} = F_W^{-1}(1-\alpha)$, so that $\mathrm{CVaR}'_\alpha(c^{SPP}_2(\hat{x}, W))$ is continuous.

Let $\hat{\theta} = \hat{\theta}(\hat{y}) = \min\{F_W^{-1}(1-\alpha), \hat{y}\}$. Then, problem (SPP) may be written as

\begin{align}
\min_{\hat{x}, \hat{y}, \hat{z}(\cdot) \ge 0} \quad & \sum_i a_i \hat{x}_i^2 + (1-\epsilon) \int_0^{\hat{y}} \sum_i \tilde{a}_i \hat{z}_i^2(w) f_W(w)\, dw + \frac{\epsilon}{1-\alpha} \int_0^{\hat{\theta}} \sum_i \tilde{a}_i \hat{z}_i^2(w) f_W(w)\, dw \tag{4.32}\\
\text{s.t.} \quad & \sum_i \hat{x}_i + \hat{y} = D \tag{4.33}\\
& \sum_i \hat{z}_i(w) \ge \hat{y} - w \quad \forall w. \tag{4.34}
\end{align}

Locational marginal pricing (LMP) is a commonly used settlement scheme for economic dispatch problems, and previous work has examined extensions of LMPs to problems including two-stage markets with recourse. In such models, the LMPs arise as the dual variables to power balance constraints for each stage (in our setting (4.33) and (4.34) in (SPP)). Previous work ([24], [76]) has demonstrated that such LMPs support a competitive equilibrium when the ISO or social planner is risk neutral, i.e., when $\epsilon = 0$. We state this formally in terms of our setting in the following theorem. Let $\hat{\lambda}^*$ and $\hat{\mu}^*(w)$ denote the optimal Lagrange multipliers for constraints (4.33) and (4.34), given $W = w$, respectively.

Theorem 14. When $\epsilon = 0$, there exists an SCEq.
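In particular, $(x^*, z^*)$ are given by $(\hat{x}^*, \hat{z}^*)$ in the optimal solution to (SPP), and the equilibrium prices are given by

$$P^*_2(w) = \hat{\mu}^*(w), \qquad P^*_1 = \hat{\lambda}^*. \tag{4.35}$$

Proof. Our setting with $\epsilon = 0$ can be seen as a special case of that in [24]. The proof then follows from Theorem 1 in [24].

Theorem 15. If $0 \le \epsilon < 1$, then there exists a competitive equilibrium. In particular, $(x^*, z^*)$ are given by $(\hat{x}^*, \hat{z}^*)$, the optimal solution to problem (SPP), and the equilibrium prices are given by

$$P^*_2(w) = \begin{cases} \dfrac{\hat{\mu}^*(w)}{1 - \epsilon + \frac{\epsilon}{1-\alpha}} & 0 \le w \le \hat{\theta}^* \\[2ex] \dfrac{\hat{\mu}^*(w)}{1 - \epsilon} & \hat{\theta}^* \le w < \hat{y}^* \\[2ex] 0 & \hat{y}^* \le w \end{cases}, \qquad P^*_1 = \hat{\lambda}^*, \tag{4.36}$$

where $\hat{\theta}^* = \min\{F_W^{-1}(1-\alpha), \hat{y}^*\}$.

Proof. By Lemma 13, the objective and all constraints in (SPP) are continuously differentiable. Problem (4.32)-(4.34) has Lagrangian

\begin{align*}
L = {} & \sum_i a_i \hat{x}_i^2 + (1-\epsilon) \int_0^{\hat{y}} \sum_i \tilde{a}_i \hat{z}_i^2(w) f_W(w)\, dw + \frac{\epsilon}{1-\alpha} \int_0^{\hat{\theta}} \sum_i \tilde{a}_i \hat{z}_i^2(w) f_W(w)\, dw \\
& + \hat{\lambda} \left( D - \sum_i \hat{x}_i - \hat{y} \right) + \int \hat{\mu}(w) \left( \hat{y} - w - \sum_i \hat{z}_i(w) \right) f_W(w)\, dw.
\end{align*}

Let

$$\hat{c}_{\epsilon,\alpha}(w) = \begin{cases} 1 - \epsilon + \frac{\epsilon}{1-\alpha} & 0 \le w \le \hat{\theta}^* \\ 1 - \epsilon & \hat{\theta}^* < w < \hat{y}^* \\ 0 & \hat{y}^* \le w. \end{cases} \tag{4.37}$$

Then, in addition to feasibility, the optimality conditions for (4.32)-(4.34) are [69]:

\begin{align}
2 a_i \hat{x}^*_i - \hat{\lambda}^* &\ge 0 \quad \forall i \tag{4.38}\\
\hat{x}^*_i \left( 2 a_i \hat{x}^*_i - \hat{\lambda}^* \right) &= 0 \quad \forall i \tag{4.39}\\
-\hat{\lambda}^* + \int \hat{\mu}^*(w) f_W(w)\, dw &\ge 0 \tag{4.40}\\
\hat{y}^* \left( -\hat{\lambda}^* + \int \hat{\mu}^*(w) f_W(w)\, dw \right) &= 0 \tag{4.41}\\
2 \tilde{a}_i \hat{c}_{\epsilon,\alpha}(w) \hat{z}^*_i(w) - \hat{\mu}^*(w) &\ge 0 \quad \forall i, w \tag{4.42}\\
\hat{z}^*_i(w) \left( 2 \tilde{a}_i \hat{c}_{\epsilon,\alpha}(w) \hat{z}^*_i(w) - \hat{\mu}^*(w) \right) &= 0 \quad \forall i, w \tag{4.43}\\
\hat{\mu}^*(w) \left( \hat{y}^* - w - \sum_i \hat{z}^*_i(w) \right) &= 0 \quad \forall w \tag{4.44}\\
\hat{\mu}^*(w) &\ge 0 \quad \forall w. \tag{4.45}
\end{align}

In addition to feasibility, the optimality conditions for (GEN$_i$) are

\begin{align}
2 a_i x^{G*}_i - P_1 &\ge 0 \tag{4.46}\\
x^{G*}_i \left( 2 a_i x^{G*}_i - P_1 \right) &= 0 \tag{4.47}\\
2 \tilde{a}_i z^{G*}_i(w) - P_2(w) &\ge 0 \quad \forall w \tag{4.48}\\
z^{G*}_i(w) \left( 2 \tilde{a}_i z^{G*}_i(w) - P_2(w) \right) &= 0 \quad \forall w. \tag{4.49}
\end{align}

In view of optimality conditions (4.42) and (4.43), we choose the following price schedule:

$$P_2(w) = \begin{cases} \dfrac{\hat{\mu}^*(w)}{(1 - \epsilon) + \frac{\epsilon}{1-\alpha}} & 0 \le w \le \hat{\theta}^* \\[2ex] \dfrac{\hat{\mu}^*(w)}{1 - \epsilon} & \hat{\theta}^* \le w < \hat{y}^* \\[2ex] 0 & \hat{y}^* \le w \end{cases}, \qquad P_1 = \hat{\lambda}^*.$$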
Given these choices, for each $i$, the optimality conditions for (GEN$_i$) become

\begin{align}
2 a_i x^{G*}_i - \hat{\lambda}^* &\ge 0 \quad \forall i \tag{4.50}\\
x^{G*}_i \left( 2 a_i x^{G*}_i - \hat{\lambda}^* \right) &= 0 \quad \forall i \tag{4.51}\\
2 \tilde{a}_i \hat{c}_{\epsilon,\alpha}(w) z^{G*}_i(w) - \hat{\mu}^*(w) &\ge 0 \quad \forall w \tag{4.52}\\
z^{G*}_i(w) \left( 2 \tilde{a}_i \hat{c}_{\epsilon,\alpha}(w) z^{G*}_i(w) - \hat{\mu}^*(w) \right) &= 0 \quad \forall w. \tag{4.53}
\end{align}

Choosing $x^{G*}_i = \hat{x}^*_i$ for all $i$ and $z^{G*}_i(w) = \hat{z}^*_i(w)$ for all $i$ and $w$, (4.50) and (4.51) become identical to (4.38) and (4.39), and (4.52) and (4.53) become identical to (4.42) and (4.43). Therefore $x^{G*}_i = \hat{x}^*_i$ for all $i$ and $z^{G*}_i(w) = \hat{z}^*_i(w)$ for all $i$ and $w$, and the selected prices, together with $(\hat{x}^*_i, \hat{z}^*_i(w))$ for all $i$ and $w$, constitute an SCEq. We have thus shown by construction the existence of an SCEq.

Assuming $\hat{z}^*_i(w) > 0$ for any $i$ (and therefore for all $i$), the second stage price given in (4.36) can be rewritten in terms of the social planner's primal decision variables and the level of renewable generation. Rearranging the term in parentheses in (4.43) gives

$$\hat{z}^*_i(w) = \frac{\hat{\mu}^*(w)}{2 \tilde{a}_i \hat{c}_{\epsilon,\alpha}(w)} \quad \forall w \le \hat{y}^*. \tag{4.54}$$

Summing both sides of (4.54) over $i$ and using constraint (4.15) gives

$$\hat{y}^* - w = \frac{\hat{\mu}^*(w)}{2 \tilde{a} \cdot \hat{c}_{\epsilon,\alpha}(w)} \implies \frac{\hat{\mu}^*(w)}{\hat{c}_{\epsilon,\alpha}(w)} = 2 \tilde{a} (\hat{y}^* - w). \tag{4.55}$$

Thus when $0 \le \epsilon < 1$, we have $P^*_2(W) = 2 \tilde{a} \cdot [\hat{y}^* - W]_+$. Given that $\hat{x}^*_i > 0$ for any $i$ (and therefore for all $i$), a similar calculation gives $\hat{\lambda}^* = P^*_1 = 2a(D - \hat{y}^*)$, where $a = \left( \sum_i \frac{1}{a_i} \right)^{-1}$.

We now address the case where $\epsilon = 1$, as the prices given in the statement of Theorem 15 cannot be applied directly in the case where $\hat{\theta}^* < \hat{y}^*$. Consider a sequence $\{\epsilon(k)\}$, where $\lim_{k \to \infty} \epsilon(k) = 1$. Then, suppressing the dependence of $\hat{\mu}(w)$ on $\epsilon$, and taking the limit as $k \to \infty$ on both sides of (4.55) gives

$$\lim_{k \to \infty} \frac{\hat{\mu}^*(w)}{\hat{c}_{\epsilon(k),\alpha}(w)} = 2 \tilde{a} \cdot \left( \lim_{k \to \infty} \hat{y}^*(\epsilon(k)) - w \right). \tag{4.56}$$

The limit $\lim_{k \to \infty} \hat{y}^*(\epsilon(k))$ exists, as (SPP) may be solved for the case where $\epsilon = 1$, and the optimal solution is unique given our assumptions on the generator cost function form.
Therefore, it still holds in the case where $\epsilon = 1$ that $P^*_2(W) = 2 \tilde{a} \cdot [\hat{y}^* - W]_+$, and in turn a competitive equilibrium is given by $(\hat{x}^*, \hat{z}^*(\cdot), P^*_1, P^*_2(\cdot))$, where $P^*_1 = \hat{\lambda}^*$ and $P^*_2(W) = 2 \tilde{a} \cdot [\hat{y}^* - W]_+$.

Finally we give the following lemma on continuity of the equilibrium prices in $\epsilon$.

Lemma 16. The equilibrium prices given in Theorem 15 are continuous in $\epsilon \in [0, 1]$.

Proof. See Appendix C.

4.5 Two-Stage Mechanism for Risk Aware Electricity Market with Renewable Generation

In the proof of Theorem 15, it was shown that the SCEq prices arise as optimal dual solutions to (SPP). If we assume that the generators are not strategic, and that all participants know the distribution of $W$, then the following mechanism implements the SCEq:

(1) Each generator $i$ submits cost function coefficients $a_i$ and $\tilde{a}_i$.
(2) The ISO solves (SPP), and announces stage 1 price $P^*_1$ and stage 2 price schedule $P^*_2(\cdot)$ as given by (4.36).
(3) Generator $i$ solves (GEN1$_i$) and receives $P^*_1 x^{G*}_i$.
(4) At the start of stage 2, the renewable generation output $W = w$ is observed by the generators. Generator $i$ solves (GEN2$_i$) and receives $P^*_2(w) z^{G*}_i(w)$.
(5) Generator $i$ produces $x^{G*}_i + z^{G*}_i(w)$.

4.6 Conclusion

In this work we consider a two-stage electricity market model with a single customer and multiple generators, taking into account the risk preferences of the customer while assuming that the generators are risk neutral. Our goal has been to determine if a sequential competitive equilibrium exists in such a market, given this discrepancy in risk attitude. We show that such an equilibrium does exist by formulating the risk aware stochastic economic dispatch market as a two-stage stochastic program, and solving this problem to determine equilibrium energy procurements and prices. The equilibrium prices directly reflect the social planner's risk attitude.
Given these prices, we specify a market mechanism for implementation of the equilibrium, assuming that the generators are not strategic. In future work we will incorporate network topology, multiple consumers, strategic behavior by both the generators and the consumers, and general convex cost functions.

Chapter 5

Scheduling Flexible Non-Preemptive Loads in Smart-Grid Networks

5.1 Introduction

Over the roughly century-long history of the electrical power grid, the situation facing both grid managers and end users has remained largely the same: electricity available on demand. For end users, operation of lightbulbs, television sets and other appliances has been just the flip of a switch away, while for grid managers, the available controls and actions have been on the supply side, i.e., which generators to activate, how much to generate, and when [62]. Managers have consistently succeeded in providing an adequate supply to meet the demand of end users from second to second, largely because over the past century demand forecasting has reached day-ahead accuracy within 5% [41].

Recently, circumstances have changed on both the supply and demand sides of the grid. Increased adoption of renewables means that the available power supply is becoming less controllable. Thus, even in the presence of relatively predictable aggregate load, forecasting errors in excess load can be significant. Meanwhile, the rise of networked appliances, homes and buildings is now facilitating synchronization and coordination of consumption to the extent that demand side flexibility stands to become one of the most important assets available to grid operators [60]. Water heaters and electric vehicles (EVs) typify loads characterized by such flexibility.
A newly published report from the Brattle Group estimates that load flexibility could be expanded to satisfy nearly 20 percent of US peak demand, and avoid nearly $18 billion in annual generation capacity, energy, transmission and ancillary service costs [40].

Currently, aggregate flexibility is leveraged through demand response programs. Typically these programs are used to reduce peaks in demand, either by indirect load control via real-time pricing or by direct control, where utilities have the ability to turn devices on or off. Moving forward, much of the additional benefit is expected to come from expanding the use of demand response to applications such as load shifting and building, e.g., to track a time varying supply of renewable energy, and services such as frequency regulation and voltage control [40].

This work considers a population of non-preemptive loads, i.e., loads which must be served continuously for a predetermined amount of time without interruption once service has started. Examples of such loads are household appliances like dishwashers, and EV charging with tight deadlines [36]. Users report their level of discomfort for being served at each time slot of a finite time horizon. The social planner tasked with serving these loads has access to a thermal generator with convex generation cost, as well as a renewable generator with zero marginal cost. Given the users' preferences, the thermal generator's cost function, and knowledge of the renewable generator's output, the scheduler determines an efficient, cost minimizing schedule. We seek to answer the following questions: How can these flexible loads be scheduled over the available time slots? Once a schedule has been determined, how should users be compensated for their flexibility? What is the "price of inflexibility" in this setting? In particular, the problem of monetizing flexibility has proven quite challenging thus far due to a lack of suitable optimization formulations [60].
5.1.1 Related Work

In [34], a continuous time setting similar to the one presented here is examined. Prices for load consumption and inflexibility are derived as dual variables to the scheduler's convex optimization problem, and a competitive equilibrium with respect to the loads' reported consumption level and duration is studied. The paper also studies a discretized time setting, and describes approximately optimal scheduling and pricing heuristics. More recently, [60] details a power exchange platform allowing for random arrivals of buy and sell orders, as well as flexible consumers. A fluid relaxation of the discrete time flexible scheduling problem, together with a projection method for deriving a feasible schedule, is presented, and the fluid solution is shown to be asymptotically optimal as the number of flexible consumers tends to infinity. Marginal pricing, given a schedule of the flexible loads, is shown to be inadmissible with respect to incoming offers, and modification for the nonconvex discrete time scheduling problem is left to future work. The scheduling and pricing of deadline differentiated loads is studied in [9], wherein the longer a consumer is willing to defer, the lower the price they pay for energy. The derived pricing scheme is shown to yield a competitive equilibrium between the consumers and supplier and, when used in tandem with earliest-deadline-first scheduling, is incentive compatible in terms of reported deadlines.

5.1.2 Statement of Contributions

In this paper, we propose a tractable optimization formulation for the problem of scheduling and pricing of non-preemptive loads. Such problems are typically formulated as mixed integer optimization problems, which are NP-hard in general, and do not directly yield prices for services and commodities [60].
Additionally, the inclusion of non-preemptive loads necessitates specification of constraints and optimization variables which capture the fact that, once activated, a given load must remain in service until completion without interruption. In order to make the development and statement of our formulation and key results more concrete, we frame the discussion in the EV charging application.

We address these challenges via a three step approach. First, we give mixed binary optimization problems for the non-preemptive loads, a profit maximizing thermal generator, and an independent system operator (ISO), which is tasked with ensuring supply/demand balance. Next we consider the convex relaxations of the load and ISO problems, which allow for definition of the desired prices. Finally, we show that when the population of loads served in the mixed binary optimization model is infinite, the prices derived from the relaxed optimization problems result in a competitive equilibrium outcome amongst market participants.

The key contributions of this paper are summarized as follows. (i) We propose a novel specification of load flexibility and a decentralized optimization formulation for the scheduling of non-preemptive loads. (ii) We formulate a corresponding centralized welfare maximization problem, and prove the existence of a competitive equilibrium in the relaxed version of this setting with finitely many loads, i.e., we show that there exist prices for per unit energy consumption and inflexibility such that the thermal generator produces efficient levels at each time step, and a social planner schedules loads such that demand equals supply while respecting the loads' flexibility preferences. (iii) We prove that the competitive equilibrium determined for the finite population setting is also a competitive equilibrium in the original mixed binary problem, when each load is interpreted as representing an infinite population of loads with appropriately scaled demand.
Thus, the prices derived via convex relaxation are suitable for use in the binary constrained setting. Such a result is currently absent in the related literature. (iv) We specify a market mechanism for implementing the competitive equilibrium. (v) We present a case study demonstrating the utility of our formulation, based on real world electric vehicle charging data drawn from the Adaptive Charging Network (ACN) project [44].

5.2 Problem Formulation

The market consists of $M$ non-preemptive loads (or consumers) and a single thermal generator. Additionally, an ISO (independent system operator) ensures safe grid operation. Let $\mathcal{T} = [1, \cdots, T]$ denote the discrete time horizon over which loads are scheduled and served. For simplicity, we assume a single bus network model. We assume throughout that all entities are price taking, i.e., their actions do not affect market prices.

Each load $i$ is characterized by a tuple $(\tau_i, l_i, U_i, u^{dS}_{i\cdot}, u^{dE}_{i\cdot})$, where $\tau_i$ gives the duration in time slots, $l_i$ gives the consumption level, and $u^{dS}_{i\cdot}$ and $u^{dE}_{i\cdot}$ give the disutility functions of consumer $i$ due to service starting prior to or ending after a desired service window, respectively. Consumer $i$ demands $l_i$ MW of electricity for $\tau_i$ consecutive time slots, and derives utility $U_i$ as their load is fulfilled. Figure 5.1 plots an example pair of disutility vectors. If consumer $i$ is allowed to be served in time slot $t$, it suffers disutility $u^{dS}_{it} + u^{dE}_{it} \ge 0$. For example, an EV commuter may prefer that their vehicle be charged for a five hour period at 5 kW between 9AM and 5PM, while they work, rather than arrive early or stay late in order to charge their vehicle. Thus, the consumer's overall utility is a function of the flexibility that it allows for in the scheduling of its load.

Figure 5.1 Example disutility vectors.

Denote $x^C_i := (x^C_{i1}, \ldots, x^C_{iT}) \in \{0, 1\}^T$. We will similarly define vector and matrix valued quantities throughout.
Given flexibility incentives $p^S_i \in \mathbb{R}^T_+$ and $p^E_i \in \mathbb{R}^T_+$, consumer $i$ solves the following optimization problem:

\begin{align}
(\text{CONS}_i) \quad \min_{x^C_i, y^C_i, z^C_i \in \{0,1\}^T} \quad & \sum_t p^{con}_{it} x^C_{it} - U_i \sum_{t=1}^{T-\tau_i+1} x^C_{it} + \sum_t \left[ (1 - y^C_{it})(u^{dS}_{it} - p^S_{it}) + (1 - z^C_{it})(u^{dE}_{it} - p^E_{it}) \right] \tag{5.1}\\
\text{s.t.} \quad & \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} x^C_{ir} \le \tau_i (1 - y^C_{it}) \quad \forall t \tag{5.2}\\
& \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} x^C_{ir} \le \tau_i (1 - z^C_{it}) \quad \forall t. \tag{5.3}
\end{align}

In the context of EV charging, if $x^C_{it} = 1$, then consumer $i$ chooses to begin charging their vehicle at time slot $t$ and pay activation price $p^{con}_{it}$. The inner sums on the left hand side in constraints (5.2) and (5.3) give the charging/idle status of load $i$ at each time slot $s$. That is, the sums will be equal to 1 if consumer $i$'s vehicle started charging in any time slot in $\{\max\{1, s-\tau_i+1\}, \ldots, s\}$, and 0 otherwise. The term $(1 - y^C_{it}) = 1$ when consumer $i$'s EV has started charging prior to or at time slot $t$. In such a case, consumer $i$ incurs disutility $u^{dS}_{it} \ge 0$ for having started by time $t$, but is compensated at early start rate $p^S_{it}$. Similarly, $(1 - z^C_{it}) = 1$ indicates that consumer $i$'s EV will be charging at or after time slot $t$, with $u^{dE}_{it}$ and late charge ending rate $p^E_{it}$ analogous to $u^{dS}_{it}$ and $p^S_{it}$.

The generator is characterized by its thermal generation cost function $c(\cdot) : \mathbb{R}_+ \to \mathbb{R}_+$, which is assumed to be strictly convex, increasing and twice differentiable on $\mathbb{R}_+$. In addition to the generator's thermal plant, we assume that it also owns a renewable generator which produces energy at zero marginal cost. The output of the renewable generator, $g : \mathcal{T} \to (0, \infty)$, is assumed to be known to all market participants at time $t = 0$. Given prices $p^{gen} \in \mathbb{R}^T_+$, the generator chooses generation levels $q^G \in \mathbb{R}^T_+$ to solve the following profit maximization problem:

\[
(\text{GEN}) \quad \max_{q^G \ge 0} \quad \sum_t \left[ p^{gen}_t (q^G_t + g_t) - c(q^G_t) \right].
\]
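The role of the inner sums in constraints (5.2) and (5.3) can be checked on a small example. The following Python sketch is an illustration only, not part of the formulation above: the function names are hypothetical, time slots are 0-indexed, and for a binary start vector it returns the smallest binary values of $(1 - y^C_{it})$ and $(1 - z^C_{it})$ that satisfy the two inequalities.

```python
def active_status(x, tau):
    """Inner sum of (5.2)/(5.3): 1 at slot s if the load started in the
    last tau slots (s - tau + 1, ..., s), else 0. Slots are 0-indexed here."""
    return [sum(x[max(0, s - tau + 1): s + 1]) for s in range(len(x))]


def started_by(x, tau):
    """Smallest binary (1 - y_t) satisfying (5.2): 1 once any active slot
    has occurred at or before slot t."""
    status = active_status(x, tau)
    return [1 if sum(status[: t + 1]) > 0 else 0 for t in range(len(x))]


def active_at_or_after(x, tau):
    """Smallest binary (1 - z_t) satisfying (5.3): 1 while some active
    slot remains at or after slot t."""
    status = active_status(x, tau)
    return [1 if sum(status[t:]) > 0 else 0 for t in range(len(x))]


# Hypothetical load: horizon of 6 slots, duration 3, start in the second slot.
x_example = [0, 1, 0, 0, 0, 0]
print(active_status(x_example, 3))
print(started_by(x_example, 3))
print(active_at_or_after(x_example, 3))
```

In this example the load is active in the second through fourth slots, so the early start disutility applies from the second slot onward and the late ending disutility applies through the fourth slot.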
Finally, the ISO collects all load profiles and determines the set of admissible load and generation schedules by solving

\begin{align*}
(\text{ISO}) \quad \min_{q^I \ge 0,\; x^I \in \{0,1\}^{M \times T}} \quad & \sum_t p^{gen}_t \left( q^I_t + g_t - \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} x^I_{is} \right)\\
\text{s.t.} \quad & \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} x^I_{is} - g_t \le q^I_t \quad \forall t.
\end{align*}

Note that the ISO incurs positive cost at any time slot $t$ where thermal generation $q_t$ exceeds residual demand (aggregate demand less renewable generation), and thus will find optimal those schedules which balance thermal generation and residual demand.

5.2.1 The Social Planner's Problem

In order to study the welfare properties of the competitive equilibrium given later, we introduce a social planning problem. The social planner is concerned with maximizing the combined welfare of all market participants, while ensuring safe operation of the power grid. Specifically, the social planner collects the profiles of each load $i$, and schedules them so that each is served without interruption for its entire duration. In practice, either the ISO or an equivalent market participant, or a government organization, often assumes responsibility for these tasks [30]. Here we introduce the social planner as a distinct entity for clarity as we investigate properties of our market formulation.

Let $\hat{x}_{it} \in \{0, 1\}$ denote the social planner's decision as to whether load $i$ will begin service in time slot $t$, where $\hat{x}_{it} = 1$ denotes that load $i$ will start at time slot $t$. A schedule is then defined as $\hat{x} \in \{0, 1\}^{M \times T}$. The social planner selects a schedule, auxiliary load status variables $\hat{y}$ and $\hat{z}$, and corresponding generation levels $\hat{q} := (\hat{q}_1, \ldots, \hat{q}_T)$ in order to solve the following problem:

\begin{align}
(\text{SPP}) \quad \min_{\hat{q} \ge 0,\; \hat{x}, \hat{y}, \hat{z} \in \{0,1\}^{M \times T}} \quad & \sum_t c(\hat{q}_t) + \sum_i \sum_t u^{dS}_{it} (1 - \hat{y}_{it}) + \sum_i \sum_t u^{dE}_{it} (1 - \hat{z}_{it}) - \sum_i U_i \sum_{t=1}^{T-\tau_i+1} \hat{x}_{it} \tag{5.4}\\
\text{s.t.} \quad & \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \hat{x}_{is} - g_t \le \hat{q}_t \quad \forall t \tag{5.5}\\
& \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{ir} \le \tau_i (1 - \hat{y}_{it}) \quad \forall i, t \tag{5.6}\\
& \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{ir} \le \tau_i (1 - \hat{z}_{it}) \quad \forall i, t. \tag{5.7}
\end{align}

In order to develop prices for electricity consumption and load inflexibility, as well as a competitive equilibrium concept, we relax the binary constraints on matrices $\hat{x}$, $\hat{y}$ and $\hat{z}$, and consider the following problem:

\begin{align}
(\text{SPP-R}) \quad \min_{\hat{q}, \hat{x}, \hat{y}, \hat{z} \ge 0} \quad & \sum_t c(\hat{q}_t) + \sum_i \sum_t u^{dS}_{it} (1 - \hat{y}_{it}) + \sum_i \sum_t u^{dE}_{it} (1 - \hat{z}_{it}) - \sum_i U_i \sum_{t=1}^{T-\tau_i+1} \hat{x}_{it} \tag{5.8}\\
\text{s.t.} \quad \hat{\lambda}_t : \quad & \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \hat{x}_{is} - g_t \le \hat{q}_t \quad \forall t \tag{5.9}\\
\hat{\nu}^S_{it} : \quad & \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{ir} \le \tau_i (1 - \hat{y}_{it}) \quad \forall i, t \tag{5.10}\\
\hat{\nu}^E_{it} : \quad & \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{ir} \le \tau_i (1 - \hat{z}_{it}) \quad \forall i, t, \tag{5.11}
\end{align}

where $\hat{\lambda}_t$, $\hat{\nu}^S_{it}$ and $\hat{\nu}^E_{it}$ denote the dual variables corresponding to constraints (5.9)-(5.11). It can be shown that constraints (5.10) and (5.11) ensure that all entries of $\hat{x}$, $\hat{y}$ and $\hat{z}$ are at most 1, and also that

\[
\sum_{t=1}^{T-\tau_i+1} \hat{x}_{it} \le 1 \quad \forall i. \tag{5.12}
\]

Under the relaxation, since in addition to (5.12) each schedule decision variable $\hat{x}_{it}$ satisfies $0 \le \hat{x}_{it} \le 1$, $\hat{x}$ may be interpreted as a matrix specifying the probability that a given load of type $i$ will be scheduled at time slot $t$, for all $i$ and $t$. That is, for each $i$, the planner will choose $\hat{x}_{i\cdot} \in \mathbb{R}^T$ equal to $e_t$, the $t$th standard basis vector, with probability $\hat{x}_{it}$. Therefore, $\hat{x}$ in (SPP-R) gives a probabilistic schedule for the loads and if, for a given $i$, (5.12) holds with equality, then load $i$ is certain to be activated at some time slot $t \in \mathcal{T}$. Otherwise, the load only has a chance of ever being activated. Fixing a matrix of probabilities $\hat{x}$, $(1 - \hat{y}_{it})$ and $(1 - \hat{z}_{it})$ give the probabilities that load $i$ has been activated up to time $t$, and will be active from time slot $t$ onward, respectively, for all $i$ and $t$.
Consequently, the (SPP-R) objective may be viewed as the expectation of overall social welfare, and the constraints as being met in expectation. This interpretation is key to the competitive equilibrium definition and properties we detail in later sections.

Note that due to the nonnegativity of $u^{dS}_{it}$ and $u^{dE}_{it}$ for all $i, t$, for any fixed $\hat{x}$ it is always optimal to choose each entry of the matrices $\{\mathbf{1} - \hat{y}\}$ and $\{\mathbf{1} - \hat{z}\}$ as small as possible, where $\mathbf{1}$ denotes the matrix of size $M \times T$ with each entry equal to 1. Therefore, constraints (5.10) and (5.11) may be replaced with equalities, and matrices $\hat{y}$ and $\hat{z}$ are completely determined given a particular $\hat{x}$.

Having relaxed the binary constraints on matrices $\hat{x}$, $\hat{y}$ and $\hat{z}$, we may employ Lagrangian analysis in order to arrive at a solution to (SPP-R). Problem (SPP-R)'s Lagrangian is given by

\begin{align*}
L = {} & \sum_t c(\hat{q}_t) + \sum_i \sum_t u^{dS}_{it} (1 - \hat{y}_{it}) + \sum_i \sum_t u^{dE}_{it} (1 - \hat{z}_{it}) - \sum_i U_i \sum_{t=1}^{T-\tau_i+1} \hat{x}_{it}\\
& + \sum_t \hat{\lambda}_t \left( \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \hat{x}_{is} - g_t - \hat{q}_t \right)\\
& + \sum_i \sum_t \hat{\nu}^S_{it} \left( \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{ir} - \tau_i (1 - \hat{y}_{it}) \right)\\
& + \sum_i \sum_t \hat{\nu}^E_{it} \left( \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{ir} - \tau_i (1 - \hat{z}_{it}) \right).
\end{align*}

Let

\begin{align}
p^{\hat{\lambda}}_{it} &= l_i \sum_{s=t}^{\min\{T, t+\tau_i-1\}} \hat{\lambda}_s \tag{5.13}\\
p^{\hat{\nu}}_{it} &= \sum_{s=t}^{T} \hat{\nu}^S_{is} \min\{s - t + 1, \tau_i\} + \sum_{s=1}^{\min\{T, t+\tau_i-1\}} \hat{\nu}^E_{is} \min\{T - t + 1, \tau_i, \tau_i - (s - t)\}. \tag{5.14}
\end{align}

(See Appendix A of [25] for the derivation of $p^{\hat{\lambda}}$ and $p^{\hat{\nu}}$.) Then, the (SPP-R) Lagrangian can be rearranged as

\begin{align}
L = {} & \sum_t \left( c(\hat{q}_t) - \hat{\lambda}_t (\hat{q}_t + g_t) \right) + \sum_i \sum_t \left( p^{\hat{\lambda}}_{it} + p^{\hat{\nu}}_{it} \right) \hat{x}_{it} - \sum_i U_i \sum_{t=1}^{T-\tau_i+1} \hat{x}_{it} \notag\\
& + \sum_i \sum_t (1 - \hat{y}_{it})(u^{dS}_{it} - \hat{\nu}^S_{it} \tau_i) + \sum_i \sum_t (1 - \hat{z}_{it})(u^{dE}_{it} - \hat{\nu}^E_{it} \tau_i), \tag{5.15}
\end{align}

and in addition to feasibility, the KKT optimality conditions for (SPP-R) are

\begin{align}
c'(\hat{q}^*_t) - \hat{\lambda}^*_t &\ge 0 \quad \forall t \tag{5.16}\\
\hat{q}^*_t \left( c'(\hat{q}^*_t) - \hat{\lambda}^*_t \right) &= 0 \quad \forall t \tag{5.17}\\
p^{\hat{\lambda}^*}_{it} + p^{\hat{\nu}^*}_{it} - U_i &\ge 0 \quad \forall i,\ t \le T - \tau_i + 1 \tag{5.18}\\
\hat{x}^*_{it} \left( p^{\hat{\lambda}^*}_{it} + p^{\hat{\nu}^*}_{it} - U_i \right) &= 0 \quad \forall i,\ t \le T - \tau_i + 1 \tag{5.19}\\
p^{\hat{\lambda}^*}_{it} + p^{\hat{\nu}^*}_{it} &\ge 0 \quad \forall i,\ t > T - \tau_i + 1 \tag{5.20}\\
\hat{x}^*_{it} \left( p^{\hat{\lambda}^*}_{it} + p^{\hat{\nu}^*}_{it} \right) &= 0 \quad \forall i,\ t > T - \tau_i + 1 \tag{5.21}\\
\tau_i \hat{\nu}^{S*}_{it} - u^{dS}_{it} &\ge 0 \quad \forall i, t \tag{5.22}\\
\hat{y}^*_{it} \left( \tau_i \hat{\nu}^{S*}_{it} - u^{dS}_{it} \right) &= 0 \quad \forall i, t \tag{5.23}\\
\tau_i \hat{\nu}^{E*}_{it} - u^{dE}_{it} &\ge 0 \quad \forall i, t \tag{5.24}\\
\hat{z}^*_{it} \left( \tau_i \hat{\nu}^{E*}_{it} - u^{dE}_{it} \right) &= 0 \quad \forall i, t \tag{5.25}\\
\hat{\lambda}^*_t \left( \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \hat{x}^*_{is} - g_t - \hat{q}^*_t \right) &= 0 \quad \forall t \tag{5.26}\\
\hat{\lambda}^*_t &\ge 0 \quad \forall t. \tag{5.27}
\end{align}

Note that due to condition (5.19), load $i$ may only be activated with nonzero probability during time slots when the sum $p^{\hat{\lambda}^*}_{it} + p^{\hat{\nu}^*}_{it}$ is equal to the constant marginal utility term $U_i$. Additionally, condition (5.17) implies that for time slots in which it is optimal to produce a positive quantity of electricity, we have that $\hat{\lambda}^*_t = c'(\hat{q}^*_t) > 0$, the marginal cost of production. In turn, condition (5.26) implies that generated quantities of electricity in such time slots will be equal to demand less forecast renewable generation.

In view of our interpretation of the (SPP-R) objective as the expected value of social welfare, and the constraints as being met in expectation, the solution to (SPP-R) yields a set of admissible load activation schedules which may be randomly selected by the social planner in a single shot of the original, binary constrained problem. We will further examine this correspondence in later sections. Fixing such an activation schedule and taking into account renewable generation output $g_t$, the optimal generation schedule follows from constraints (5.5) and (5.9), and condition (5.26).
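To make the last step concrete, the following Python sketch fixes a binary activation schedule, recovers thermal generation as demand less renewable output (truncated at zero), sets the energy price to marginal cost as in condition (5.17), and aggregates these prices into the activation price component of (5.13). It is a sketch under stated assumptions rather than a solver for (SPP-R): the quadratic cost $c(q) = a q^2$, all function names, and the numerical values are hypothetical choices made for illustration, and time slots are 0-indexed.

```python
def active_status(x_i, tau_i):
    """Charging/idle status per slot for one load, given its start indicators."""
    return [sum(x_i[max(0, s - tau_i + 1): s + 1]) for s in range(len(x_i))]


def thermal_generation(x, levels, durations, g):
    """Thermal output per slot: aggregate demand less renewable output,
    truncated at zero (constraint (5.5) held with equality when positive)."""
    demand = [0.0] * len(g)
    for x_i, l_i, tau_i in zip(x, levels, durations):
        for t, active in enumerate(active_status(x_i, tau_i)):
            demand[t] += l_i * active
    return [max(0.0, d - r) for d, r in zip(demand, g)]


def marginal_cost_prices(q, a):
    """Energy price per slot under the ASSUMED cost c(q) = a * q**2,
    i.e. lambda_t = c'(q_t) = 2 * a * q_t (zero when q_t = 0)."""
    return [2.0 * a * q_t for q_t in q]


def activation_energy_price(lam, l_i, tau_i):
    """Equation (5.13): energy cost of starting load i at slot t, summing
    lambda over the slots t, ..., min(T, t + tau_i - 1)."""
    T = len(lam)
    return [l_i * sum(lam[t: min(T, t + tau_i)]) for t in range(T)]


# Hypothetical data: two loads over four slots.
x = [[1, 0, 0, 0],   # load 0 starts in the first slot
     [0, 1, 0, 0]]   # load 1 starts in the second slot
levels = [2.0, 1.0]          # l_i
durations = [2, 3]           # tau_i
g = [1.0, 1.0, 1.0, 1.0]     # renewable output per slot

q = thermal_generation(x, levels, durations, g)
lam = marginal_cost_prices(q, a=0.5)
print(q, lam, activation_energy_price(lam, levels[0], durations[0]))
```

The prices computed this way are the optimal duals of (SPP-R) only if the fixed schedule is itself optimal; the sketch shows how the quantities are linked, not how the schedule is chosen.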
5.2.2 Consumer's Problem

The second (SPP-R) Lagrangian expression (5.15) suggests the following decomposition of the relaxed social planner's problem into relaxed versions of the individual entity problems presented above. See [25] for the optimality conditions corresponding to each of these problems. Starting with the consumer problems, we have

\begin{align}
(\text{CONS-R}_i) \quad \min_{x^C_i, y^C_i, z^C_i \ge 0} \quad & \sum_t p^{con}_{it} x^C_{it} - U_i \sum_{t=1}^{T-\tau_i+1} x^C_{it} + \sum_t \left[ (u^{dS}_{it} - p^S_{it})(1 - y^C_{it}) + (u^{dE}_{it} - p^E_{it})(1 - z^C_{it}) \right] \tag{5.28}\\
\text{s.t.} \quad \theta^S_{it} : \quad & \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} x^C_{ir} = \tau_i (1 - y^C_{it}) \quad \forall t \tag{5.29}\\
\theta^E_{it} : \quad & \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} x^C_{ir} = \tau_i (1 - z^C_{it}) \quad \forall t. \tag{5.30}
\end{align}

Again, under the relaxation of the binary constraints on $x^C_i$, $y^C_i$ and $z^C_i$, we may interpret the consumer's problem as selecting probabilities of activation for each time slot $t$, in the interest of maximizing their expected net utility (here written in minimization form).

5.2.3 Generator and ISO Problems

The generator's problem remains the same as before:

\[
(\text{GEN-R}) \quad \max_{q^G \ge 0} \quad \sum_t \left[ p^{gen}_t (q^G_t + g_t) - c(q^G_t) \right]. \tag{5.31}
\]

Finally, the relaxed ISO problem is given by

\begin{align*}
(\text{ISO-R}) \quad \min_{q^I \ge 0,\; x^I \ge 0} \quad & \sum_t p^{gen}_t \left( q^I_t + g_t - \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} x^I_{is} \right)\\
\text{s.t.} \quad \alpha_t : \quad & \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} x^I_{is} - g_t \le q^I_t \quad \forall t.
\end{align*}

5.3 Competitive Equilibrium and Theorems of Welfare Economics

The competitive or Walrasian equilibrium is a standard reference point in economic analysis for assessing market outcomes. A competitive equilibrium is specified by an allocation of goods and prices, with the defining characteristic that, taking the equilibrium prices as given, every market participant finds it optimal to select the corresponding equilibrium allocation [47]. At equilibrium prices, the quantity of goods demanded by consumers is equal to the quantity produced by suppliers, i.e., the market clears. Therefore, equilibrium prices provide a coordinating signal for markets to operate in a decentralized fashion.
Assuming that a competitive equilibrium exists for a particular market setting, it is natural to compare the equilibrium allocation to allocations which directly maximize the aggregate welfare of all market participants. The latter allocations are called efficient, and in our setting are given by solutions to (SPP-R).

We now give the competitive equilibrium definition for our setting, and explore existence, as well as welfare properties, of such an equilibrium. As related above, competitive equilibria are typically specified in two-sided settings involving consumers and producers maximizing their individual well being and profits, respectively. Similar to the analysis found in [65] and [76], we augment the standard definition to include a nonprofit entity, i.e., the ISO.

Definition 11. (Competitive Equilibrium). A tuple $(q^*, x^*, y^*, z^*, p^{con*}, p^{S*}, p^{E*}, p^{gen*})$ with $p^{gen*} \ge 0$ is said to be a competitive equilibrium if, given $(p^{con*}_i, p^{S*}_i, p^{E*}_i)$, $(x^*_i, y^*_i, z^*_i)$ solves (CONS-R$_i$) for each $i$; given $p^{gen*}$, $q^*$ solves (GEN-R); and given $p^{gen*}$, $(q^*, x^*)$ solves (ISO-R).

As noted in the previous section, since solutions to (CONS-R$_i$) will, in general, give values of $x^C_{it} \in [0, 1]$, the quantities $(x^*, y^*, z^*)$ in the competitive equilibrium in Definition 11 have probabilistic interpretations: consumers select probabilities $x^C_{it}$ of being scheduled at each time slot $t \in \mathcal{T}$, in order to maximize their expected net utility.

Our first result addresses the existence of the competitive equilibrium defined above.

Theorem 17. There exists a competitive equilibrium, given by an optimal solution to (SPP-R), $(\hat{q}^*, \hat{x}^*, \hat{y}^*, \hat{z}^*)$, and the following prices derived from an optimal dual solution to (SPP-R):

\[
p^{con*}_{it} = p^{\hat{\lambda}^*}_{it} + p^{\hat{\nu}^*}_{it}, \quad p^{gen*}_t = \hat{\lambda}^*_t, \quad p^{S*}_{it} = \tau_i \hat{\nu}^{S*}_{it}, \quad p^{E*}_{it} = \tau_i \hat{\nu}^{E*}_{it}, \tag{5.32}
\]

for all $i$ and $t$.

Proof. See proof of Theorem 1 in [25].
The two fundamental theorems of welfare economics describe the relationship between competitive equilibria and efficient allocations. The first fundamental theorem states that competitive equilibria lead to, or support, efficient allocations [47]. The second fundamental theorem states that the converse also holds, and in our setting corresponds to Theorem 17.

We now state and prove the first fundamental theorem for our setting, given by Theorem 18. Whereas proofs of the efficiency of competitive equilibria often require that the balance of supply and demand be included in the definition of such equilibria, in our development this equality arises from the given formulation of the ISO's problem (ISO-R). That is, facing equilibrium prices, the ISO will act to balance supply and demand as it optimizes (ISO-R).

Theorem 18. Any competitive equilibrium forms an optimal solution for (SPP-R).

Proof. By definition, the competitive equilibrium $(q^*, x^*, y^*, z^*, p^{con*}, p^{S*}, p^{E*}, p^{gen*})$ satisfies

\begin{align}
c'(q^*_t) - p^{gen*}_t &\ge 0 \quad \forall t \tag{5.33}\\
q^*_t \left( c'(q^*_t) - p^{gen*}_t \right) &= 0 \quad \forall t \tag{5.34}\\
p^{con*}_{it} + p^{\theta^*}_{it} - U_i &\ge 0 \quad \forall i,\ t \le T - \tau_i + 1 \tag{5.35}\\
x^*_{it} \left( p^{con*}_{it} + p^{\theta^*}_{it} - U_i \right) &= 0 \quad \forall i,\ t \le T - \tau_i + 1 \tag{5.36}\\
p^{con*}_{it} + p^{\theta^*}_{it} &\ge 0 \quad \forall i,\ t > T - \tau_i + 1 \tag{5.37}\\
x^*_{it} \left( p^{con*}_{it} + p^{\theta^*}_{it} \right) &= 0 \quad \forall i,\ t > T - \tau_i + 1 \tag{5.38}\\
p^{S*}_{it} + \tau_i \theta^{S*}_{it} - u^{dS}_{it} &\ge 0 \quad \forall i, t \tag{5.39}\\
y^*_{it} \left( p^{S*}_{it} + \tau_i \theta^{S*}_{it} - u^{dS}_{it} \right) &= 0 \quad \forall i, t \tag{5.40}\\
p^{E*}_{it} + \tau_i \theta^{E*}_{it} - u^{dE}_{it} &\ge 0 \quad \forall i, t \tag{5.41}\\
z^*_{it} \left( p^{E*}_{it} + \tau_i \theta^{E*}_{it} - u^{dE}_{it} \right) &= 0 \quad \forall i, t \tag{5.42}\\
p^{gen*}_t - \alpha^*_t &\ge 0 \quad \forall t \tag{5.43}\\
q^*_t \left( p^{gen*}_t - \alpha^*_t \right) &= 0 \quad \forall t \tag{5.44}\\
-p^{p^{gen*}}_{it} + p^{\alpha^*}_{it} &\ge 0 \quad \forall i, t \tag{5.45}\\
x^*_{it} \left( -p^{p^{gen*}}_{it} + p^{\alpha^*}_{it} \right) &= 0 \quad \forall i, t \tag{5.46}\\
\alpha^*_t \left( \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} x^*_{is} - g_t - q^*_t \right) &= 0 \quad \forall t \tag{5.47}\\
\alpha^*_t &\ge 0 \quad \forall t \tag{5.48}
\end{align}

for some $\theta^{S*}$, $\theta^{E*}$ and $\alpha^* \ge 0$, as well as the feasibility conditions for each of the individual entity problems.
Therefore, observing that for any $p^{gen*} \ge 0$ the form of the objective in (ISO-R) ensures that complementary slackness condition (5.26) will be satisfied at the competitive equilibrium, selecting $(\hat{q}^*, \hat{x}^*, \hat{y}^*, \hat{z}^*) = (q^*, x^*, y^*, z^*)$ as the primal variables, and dual variables $\hat{\lambda}^* = p^{gen*} = \alpha^*$ and $(\hat{\nu}^{S*}_{it}, \hat{\nu}^{E*}_{it}) = (p^{S*}_{it}/\tau_i + \theta^{S*}_{it},\ p^{E*}_{it}/\tau_i + \theta^{E*}_{it})$ for all $i, t$, forms optimal primal and dual solutions to (SPP-R).

5.4 Replicated and Large Economies

In general, a competitive equilibrium is not guaranteed to exist when the social planner's problem is a mixed integer programming problem. Nevertheless, our competitive equilibrium definition allows for probabilistic allocation to consumers, and thus the existence of a competitive equilibrium is related to the existence of a primal and dual solution to the relaxed problem (SPP-R). In this section we justify the study of this relaxed problem by demonstrating its equivalence to the original, binary constrained (SPP) when each load $i$ is interpreted as representing an infinite population of identical loads, with scaled demand.

Thus far, our development has crucially relied on the assumption that market participants are price taking, i.e., presented with market prices, they make decisions in view of their own preferences and constraints, revealing their true demand without consideration of how their choices might influence these prices. But why should they act in this manner? In economic theory, the notion of large economies provides one justification for adoption of this assumption. The essence of the argument is as follows. As the number of market participants increases, any influence that an individual participant might have on market prices diminishes. When that number grows to infinity, that influence vanishes entirely, and the price taking assumption becomes reasonable [4].
Following this intuitive argument, the question of how to add individuals to the market still remains. A special method, known as replication, is to introduce participants with preferences and constraints identical to existing types, in the same proportion as existing ones [37]. In the context of electric vehicle charging, this could mean scaling up the number of drivers with the same model vehicle and desired charging schedule, with demand scaled down so as to avoid infeasible aggregate demands as the population of each type grows. In general it can be shown that as participants are added in this way, those of the same type will receive the same allocation.

Suppose that each load $i$ is replicated $N$ times, and that the resulting loads have demand, utility and disutility scaled by $1/N$. Indexing the replicas of each type $i$ with the index $n$, the binary constrained SPP with $N$ replication is

\begin{align}
(\text{SPP}(N)) \quad \min_{\hat{q} \ge 0,\; \hat{x}, \hat{y}, \hat{z} \in \{0,1\}^{M \times N \times T}} \quad & \sum_t c(\hat{q}_t) - \sum_i \sum_n \frac{U_i}{N} \sum_{t=1}^{T-\tau_i+1} \hat{x}_{int} + \sum_i \sum_n \sum_t \frac{u^{dS}_{it}}{N} (1 - \hat{y}_{int}) + \sum_i \sum_n \sum_t \frac{u^{dE}_{it}}{N} (1 - \hat{z}_{int}) \tag{5.49}\\
\text{s.t.} \quad \hat{\lambda}_t : \quad & \sum_i \sum_n \frac{l_i}{N} \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \hat{x}_{ins} - g_t \le \hat{q}_t \quad \forall t \notag\\
\hat{\nu}^S_{int} : \quad & \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{inr} = \tau_i (1 - \hat{y}_{int}) \quad \forall i, n, t \tag{5.50}\\
\hat{\nu}^E_{int} : \quad & \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{inr} = \tau_i (1 - \hat{z}_{int}) \quad \forall i, n, t. \tag{5.51}
\end{align}

We refer to the problem with $N$ replication which relaxes the binary constraint on $\hat{x}$ as SPP($N$)-R (instead of SPP(1)-R, we will still refer to the original relaxed problem as SPP-R). When we wish to emphasize the dependence of decision variables on the replication factor $N$, we will append $(N)$, e.g., $\hat{x}_{int}(N)$.

Proposition 19. Let $(\hat{q}^*, \hat{x}^*, \hat{y}^*, \hat{z}^*, \hat{\lambda}^*, \hat{\nu}^{S*}, \hat{\nu}^{E*})$ denote an optimal solution to SPP-R.
Then for any $N$, an optimal solution to SPP($N$)-R can be formed by setting $\hat{x}^*_{int}(N) = \hat{x}^*_{it}$, $\hat{y}^*_{int}(N) = \hat{y}^*_{it}$, $\hat{z}^*_{int}(N) = \hat{z}^*_{it}$ for all $i, n, t$, $\hat{\lambda}^*_t(N) = \hat{\lambda}^*_t$ for all $t$, and $\hat{\nu}^{S*}_{int}(N) = \hat{\nu}^{S*}_{it}/N$ and $\hat{\nu}^{E*}_{int}(N) = \hat{\nu}^{E*}_{it}/N$ for all $i$, $n$ and $t$.

Proof. SPP($N$)-R has the following KKT conditions. For all $t$,

\begin{align}
c'(\hat{q}^*_t(N)) - \hat{\lambda}^*_t(N) &\ge 0 \tag{5.52}\\
\hat{q}^*_t(N) \left( c'(\hat{q}^*_t(N)) - \hat{\lambda}^*_t(N) \right) &= 0, \tag{5.53}
\end{align}

for all $i$, $n$ and $t \le T - \tau_i + 1$,

\begin{align}
\frac{p^{\hat{\lambda}^*}_{it}(N)}{N} + p^{\hat{\nu}^*}_{int}(N) - \frac{U_i}{N} &\ge 0 \tag{5.54}\\
\hat{x}^*_{int}(N) \left( \frac{p^{\hat{\lambda}^*}_{it}(N)}{N} + p^{\hat{\nu}^*}_{int}(N) - \frac{U_i}{N} \right) &= 0, \tag{5.55}
\end{align}

for all $i$, $n$ and $t > T - \tau_i + 1$,

\begin{align}
\frac{p^{\hat{\lambda}^*}_{it}(N)}{N} + p^{\hat{\nu}^*}_{int}(N) &\ge 0 \tag{5.56}\\
\hat{x}^*_{int}(N) \left( \frac{p^{\hat{\lambda}^*}_{it}(N)}{N} + p^{\hat{\nu}^*}_{int}(N) \right) &= 0, \tag{5.57}
\end{align}

for all $i$, $n$ and $t$,

\begin{align}
\tau_i \hat{\nu}^{S*}_{int}(N) - \frac{u^{dS}_{it}}{N} &\ge 0 \tag{5.58}\\
\hat{y}^*_{int}(N) \left( \tau_i \hat{\nu}^{S*}_{int}(N) - \frac{u^{dS}_{it}}{N} \right) &= 0 \tag{5.59}\\
\tau_i \hat{\nu}^{E*}_{int}(N) - \frac{u^{dE}_{it}}{N} &\ge 0 \tag{5.60}\\
\hat{z}^*_{int}(N) \left( \tau_i \hat{\nu}^{E*}_{int}(N) - \frac{u^{dE}_{it}}{N} \right) &= 0, \tag{5.61}
\end{align}

and for all $t$,

\begin{align}
\hat{\lambda}^*_t(N) \left( \sum_i \frac{l_i}{N} \sum_n \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \hat{x}^*_{ins}(N) - g_t - \hat{q}^*_t(N) \right) &= 0 \tag{5.62}\\
\hat{\lambda}^*_t(N) &\ge 0. \tag{5.63}
\end{align}

The proposition follows from making the selections specified in its statement, substituting into (5.52)-(5.63), and comparing with (5.16)-(5.27).

Proposition 19 states that an optimal probabilistic schedule $\hat{x}^*(N)$ in the problem with $N$ replication can be derived from an optimal probabilistic schedule $\hat{x}^*$ for (SPP-R), and specifies how to do so. In the limit as $N \to \infty$ we can use $\hat{x}^*$ to generate an optimal deterministic, binary constrained schedule if we interpret $\hat{x}^*_{it}$ as the proportion of the population of type $i$ to be activated at time $t$. This is stated formally in the following theorem.

Theorem 20. An optimal solution to SPP($\infty$) is given by activating proportion $\hat{x}^*_{it}$ of the type $i$ population at time $t$ for each $i$ and $t$, where $\hat{x}^*$ is an optimal solution to SPP-R.

Proof.
Note that constraints (5.50) and (5.51) may be rewritten

\begin{align}
\hat{y}_{int} &= 1 - \frac{1}{\tau_i} \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{inr} \quad \forall i, n, t \notag\\
\hat{z}_{int} &= 1 - \frac{1}{\tau_i} \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}_{inr} \quad \forall i, n, t, \tag{5.64}
\end{align}

so that overall SPP($N$) can be written as

\begin{align}
\min_{\hat{q} \ge 0,\; \hat{x} \in \{0,1\}} \quad & \sum_t c(\hat{q}_t) - \sum_i U_i \sum_{t=1}^{T-\tau_i+1} \frac{1}{N} \sum_n \hat{x}_{int} + \sum_i \sum_t u^{dS}_{it} \left( \frac{1}{\tau_i} \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \frac{1}{N} \sum_n \hat{x}_{inr} \right) \notag\\
& + \sum_i \sum_t u^{dE}_{it} \left( \frac{1}{\tau_i} \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \frac{1}{N} \sum_n \hat{x}_{inr} \right) \notag\\
\text{s.t.} \quad \hat{\lambda}_t : \quad & \sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \frac{1}{N} \sum_n \hat{x}_{ins} - g_t \le \hat{q}_t \quad \forall t. \tag{5.65}
\end{align}

Now, if $\hat{x}_{int}(N)$ is considered as a Bernoulli random variable with $P(\hat{x}_{int}(N) = 1) = \hat{x}^*_{it}$, and $\hat{q}^*_t(N)$ is chosen as $\hat{q}^*_t(1) = \hat{q}^*_t$ for all $t$, then by the Law of Large Numbers, constraint (5.65) converges to

\[
\sum_i l_i \sum_{s=\max\{1, t-\tau_i+1\}}^{t} \hat{x}^*_{is} - g_t \le \hat{q}^*_t \quad \forall t.
\]

Similarly, the objective function converges to

\[
\sum_t c(\hat{q}^*_t) - \sum_i U_i \sum_{t=1}^{T-\tau_i+1} \hat{x}^*_{it} + \sum_i \sum_t u^{dS}_{it} \left( \frac{1}{\tau_i} \sum_{s=1}^{t} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}^*_{ir} \right) + \sum_i \sum_t u^{dE}_{it} \left( \frac{1}{\tau_i} \sum_{s=t}^{T} \sum_{r=\max\{1, s-\tau_i+1\}}^{s} \hat{x}^*_{ir} \right).
\]

Since the optimal objective of the relaxed problem provides a lower bound for the binary constrained problem, and the power balance constraint is satisfied in the limit as $N \to \infty$, the solution produced by randomly activating loads according to $\hat{x}^*$ converges to an optimal binary constrained solution as $N \to \infty$.

5.5 Market Mechanism for Large Population Economy

Market mechanism design is an approach in economic theory which, rather than taking economic institutions as fixed and predicting the outcomes generated by such institutions, starts with an outcome identified as desirable and attempts to construct a mechanism by which it may be delivered [48]. In this section we consider the competitive equilibrium concept discussed in prior sections as the target outcome for our market, and specify a mechanism by which it can be achieved.
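The Law of Large Numbers argument in the proof of Theorem 20 can be illustrated numerically. In the Python sketch below, each replica of a single load type draws its start slot at random according to a probabilistic schedule, and the empirical proportion of replicas starting in each slot is compared with that schedule as the number of replicas grows. The probability vector, the function names and the random seed are hypothetical, and each replica draws at most one start slot (the basis vector interpretation of Section 5.2.1) rather than independent per-slot Bernoulli variables.

```python
import random


def sample_start_slots(x_hat_i, n_replicas, rng):
    """Draw a start slot for each replica of one load type. Slot t is chosen
    with probability x_hat_i[t]; None means the replica is never activated."""
    never = max(0.0, 1.0 - sum(x_hat_i))
    outcomes = list(range(len(x_hat_i))) + [None]
    weights = list(x_hat_i) + [never]
    return rng.choices(outcomes, weights=weights, k=n_replicas)


def empirical_proportions(starts, horizon):
    """Fraction of replicas that start in each slot, i.e. (1/N) * sum_n x_int."""
    counts = [0] * horizon
    for slot in starts:
        if slot is not None:
            counts[slot] += 1
    return [count / len(starts) for count in counts]


# Hypothetical probabilistic schedule for one load type over four slots.
x_hat_i = [0.5, 0.3, 0.2, 0.0]
rng = random.Random(0)  # fixed seed so the run is reproducible

for n_replicas in (10, 1000, 100000):
    starts = sample_start_slots(x_hat_i, n_replicas, rng)
    props = empirical_proportions(starts, len(x_hat_i))
    gap = max(abs(p - x) for p, x in zip(props, x_hat_i))
    print(n_replicas, [round(p, 3) for p in props], round(gap, 4))
```

The largest gap between the empirical proportions and the schedule shrinks as the number of replicas increases, which is the sense in which the randomly generated binary schedule approaches the relaxed solution.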
Mechanism design plays a crucial role for market in which participants may misreport preferences, costs or other information when it is in their individual best interest to do so. Therefore, the mechanism presented in this section may be viewed as a starting point for future work in which market participants are allowed to behave strategically. The competitive equilibrium definition given in the previous section allows for non-binary activation schedule x ∗ . As mentioned, since 0≤ x ∗ it ≤ 1, and P t x ∗ it ≤ 1, each x ∗ it may be interpreted as giving the portion of load i activated at timet under relaxation of the binary constraints on the activation schedule orthe probability that an individual load of type i in the infinite replication setting is fully activated at time t. 119 Let us explore the infinitely replicated setting from the perspective of an individual load n of type i. First, note that for finite N, (SPP(N)-R) has Lagrangian L = X t c (ˆ q t (N))− ˆ λ t (N)(ˆ q t (N) +g t ) + X i,n,t u dS it N (1− ˆ y int (N)) + u dE it N (1− ˆ z int (N)) ! − X i,n,t ˆ ν S int (N)(1− ˆ y int (N))− X i,n,t τ i ˆ ν S int (N)(1− ˆ z int (N)) + X i,n,t p ˆ λ int (N) N +p ˆ ν int (N) ˆ x int (N)− X i U i N X n T−τ i +1 X t=1 ˆ x int (N) where p ˆ λ int (N) and p ˆ ν int (N) are defined analogously to (5.13) and (5.14). Thus, under N replication and relaxation, the optimization problem for consumer n of type i is given by (CONS in (N)-R) min x C in ,y C in z C in ≥0 X t p con int (N)x C int − U i N T−τ i +1 X t=1 x C int + X t u dS it N −p S int ! (1−y C int ) + X t u dE it N −p E int ! (1−z C int ) s.t. θ S int : t X s=1 s X r=max{1,s−τ i +1} x C inr =τ i (1−y C int ) ∀t θ E int : T X s=t s X r=max{1,s−τ i +1} x C inr =τ i (1−z C int ) ∀t. Multiplying by N, the (CON in (N)-R) objective function can be written as X t Np con int x C int + X t u dS it −Np S int (1−y C int ) + X t u dE it −Np E int (1−z C int )−U i T−τ i +1 X t=1 x C int . 
(5.66)

As in Theorem 17, set

\[
p^{con}_{int}(N) = \frac{p^{\hat{\lambda}*}_{int}(N)}{N} + p^{\hat{\nu}*}_{int}(N), \qquad
p^{gen*}_t(N) = \hat{\lambda}^*_t(N), \qquad
p^{S}_{int}(N) = \tau_i \hat{\nu}^{S*}_{int}(N), \qquad
p^{E}_{int}(N) = \tau_i \hat{\nu}^{E*}_{int}(N),
\]

and as in Proposition 19, choose

\[
\hat{\lambda}^*_t(N) = \hat{\lambda}^*_t(1) = \hat{\lambda}^*_t, \qquad
\hat{\nu}^{S*}_{int}(N) = \hat{\nu}^{S*}_{it} / N, \qquad
\hat{\nu}^{E*}_{int}(N) = \hat{\nu}^{E*}_{it} / N.
\]

Then letting $N \to \infty$ gives

\[
\lim_{N \to \infty} N \tau_i \hat{\nu}^{S*}_{int}(N) = \tau_i \hat{\nu}^{S*}_{it}, \qquad
\lim_{N \to \infty} N \tau_i \hat{\nu}^{E*}_{int}(N) = \tau_i \hat{\nu}^{E*}_{it}.
\]

This implies that

\[
\lim_{N \to \infty} N p^{con}_{int}(N) = p^{\hat{\lambda}*}_{it} + p^{\hat{\nu}*}_{it}.
\]

Therefore, posing the prices described in Proposition 19 in the limit as $N \to \infty$, the objective functions for each (CON$_{in}$) converge to

\[
\sum_t \big( p^{\hat{\lambda}*}_{it} + p^{\hat{\nu}*}_{it} \big) x^C_{int} - U_i \sum_{t=1}^{T-\tau_i+1} x^C_{int}
+ \sum_t \Big( \big( u^{dS}_{it} - \tau_i \hat{\nu}^{S*}_{it} \big)\big(1 - y^C_{int}\big) + \big( u^{dE}_{it} - \tau_i \hat{\nu}^{E*}_{it} \big)\big(1 - z^C_{int}\big) \Big).
\]

Thus, the pricing facing each load of type $i$ is identical, and in fact the problem facing each is the same as that of the single load of type $i$ in the decomposition with relaxation but not replication. Further, each will select the same $x^{C*}_{in\cdot} = \hat{x}^*_{i\cdot} \in \mathbb{R}^T_+$, where $x^{C*}_{int} = \hat{x}^*_{it}$ gives the probability that the load will be scheduled at time $t$. Therefore the equal allocation for individuals of the same type mentioned earlier holds in our setting.

The following mechanism (FLEX-SCHED(N)) uses the probability values selected by the continuum of consumers to generate a binary constrained schedule in the setting with $N$ replication. Note that since the generator’s problem does not involve consumer utility and disutility functions, nor consumer scheduling variables, its problem is not affected by replication (or relaxation). Therefore (GEN(N)) is the same as (GEN) for all $N$, including $N = \infty$.

1. Each consumer $(i, n)$ submits $u^{dS}_{i\cdot}$ and $u^{dE}_{i\cdot}$, and the generator submits $c$ to the social planner (i.e., the entity taking on this role, such as the government or ISO).
2. The social planner solves (SPP-R), and announces $(p^{con*}, p^{S*}, p^{E*}, p^{gen*}, p^{bal*})$ as specified in Theorem 17.
3. Each consumer $i$ solves (CON$_{in}$(∞)-R), the generator solves (GEN(∞)), and $(x^*_i, y^*_i, z^*_i)$ for all $i$, as well as $q^*$, are submitted to the social planner.
4. The social planner randomly assigns proportion $x^*_{it}$ of loads of type $i$ to start at time $t$, for each $i$ and $t$. The generator produces $q^*$ over the finite horizon. Combined with the renewable generation output $g$, this generated power is allocated to the consumers according to $x^*_i$ and demands $l_i$ for each $i$.

In the large population setting, the following result regarding (FLEX-SCHED(∞)) holds.

Theorem 21. The mechanism (FLEX-SCHED(∞)) is ex-post individually rational, budget balanced and efficient.

Proof. See proof of Theorem 5 in [25].

5.6 Case Study: EV Charging

Electric vehicle charging constitutes one of the most important and challenging applications of load scheduling optimization currently facing power grid operators. Today, the transportation sector accounts for approximately 64% of global consumption of oil, a resource which has been linked to increasing CO2 emissions, and which is further expected to run out in about 50 years. In contrast, transportation sector operations comprise just 1.5% of worldwide electricity usage [30]. Reliance on electricity is more amenable to a shift towards renewable sources of energy such as solar and wind, which in total are expected to make up approximately one third of all power generation by 2040. From a market perspective, demand for electric vehicles increases each year. According to the International Energy Agency, 740,000 vehicles were produced in total in 2014, and that figure is expected to reach 20 million by 2020 [30]. Charging process scheduling is now recognized as one of the key technologies for integration of electricity based mobility into existing power grids.
In order to demonstrate the utility of our flexible scheduling problem formulation, we simulated its performance on real world load and renewable generation data in the context of electric vehicle (EV) charging. The input load parameters $(\tau_i, l_i, u^{dS}_i, u^{dE}_i)$ are derived from data included in the ACN-Data dataset, a dynamic dataset of workplace EV charging [44]. In particular, we take as our base set of loads the recorded vehicle arrivals for May 28, 2018. For each vehicle charging session, the ACN dataset includes vehicle connection and disconnection times, as well as a charging completion time. In these simulations each time index represents a 15 minute period. We take $\tau_i$ as the difference between charging completion time and the first time period where the EV drew a positive amount of current. We then divide the total kWh delivered to the vehicle by $\tau_i$ to arrive at $l_i$.

We design disutility functions for each load in the following manner. Let $t_C$ denote the vehicle’s recorded connection time, and $t_D$ denote its recorded disconnection time. Then for each load $i$, we let

\[
u^{dS}_{it} = \begin{cases} \alpha (t_C - t)^2 & 0 \le t \le t_C - 1 \\ 0 & t \ge t_C \end{cases}
\quad \text{and} \quad
u^{dE}_{it} = \begin{cases} 0 & t \le t_D \\ \alpha (t - T)^2 & t_D + 1 \le t \le T, \end{cases}
\tag{5.67}
\]

where $\alpha$ is a scaling parameter. Thus $u^{dS}_{it} + u^{dE}_{it} = 0$ for $t \in [t_C, t_D]$, and elsewhere increases quadratically away from this desired service window. The sample disutility curves pictured in Figure 5.1 are generated according to this quadratic form. We set $U_i = U$ for some nonnegative scalar $U$.

We take our renewable generation profile $g_t$ from data generated by NREL’s SAM tool [11]. Specifically, we draw on solar power generation time series estimated for downtown Los Angeles, also for May 28, 2018.

In our simulations we compare the performance of our flexible load scheduling approach to a schedule which naively begins charging loads as soon as they arrive.
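As a rough illustration of how the load parameters $\tau_i$ and $l_i$ described above might be derived from one charging session, consider the following minimal sketch. The function and argument names are hypothetical and are not the ACN-Data field names; flooring timestamps to 0-based 15 minute period indices and enforcing a minimum duration of one period are assumptions made here for illustration only.

```python
from datetime import datetime

PERIOD_MINUTES = 15  # each time index represents a 15 minute period


def period_index(timestamp, day_start):
    """Map a timestamp to a 0-based 15 minute period index (assumed convention)."""
    seconds = (timestamp - day_start).total_seconds()
    return int(seconds // (PERIOD_MINUTES * 60))


def load_parameters(first_draw, done_charging, kwh_delivered, day_start):
    """Return (tau, l) for one session.

    tau: number of periods between the first positive current draw and
         charging completion.
    l:   total kWh delivered divided by tau, i.e. energy per period.
    """
    tau = period_index(done_charging, day_start) - period_index(first_draw, day_start)
    tau = max(tau, 1)  # assumption: a session occupies at least one period
    return tau, kwh_delivered / tau


if __name__ == "__main__":
    day = datetime(2018, 5, 28)
    tau, l = load_parameters(
        first_draw=datetime(2018, 5, 28, 9, 0),
        done_charging=datetime(2018, 5, 28, 11, 30),
        kwh_delivered=12.5,
        day_start=day,
    )
    print(tau, l)
```

Under these conventions, a session that draws current from 9:00 to 11:30 spans ten periods, so 12.5 kWh delivered corresponds to 1.25 kWh per period.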
We implemented the naive approach by setting

\[
u^{dS'}_{it} = \begin{cases} \alpha \max\{t_C^2, (T - t_C)^2\} & 0 \le t \le t_C - 1 \\ 0 & t \ge t_C \end{cases}
\qquad
u^{dE'}_{it} = \begin{cases} 0 & t \le t_C \\ \alpha \max\{t_C^2, (T - t_C)^2\} & t_C + 1 \le t \le T, \end{cases}
\]

i.e., loads are essentially inflexible, with 0 disutility at time $t_C$ and the maximum disutility in (5.67) for all other time periods $t \ne t_C$.

We consider the social welfare objective value achieved, as well as the percentage of loads served. We adjusted $\alpha$, $U$, and a scale on the quadratic cost $c(z_t)$ in order to ensure that both scheduling approaches successfully scheduled all loads. In particular we let $\alpha = 0.01$, $U = 100$ and scaled the cost by a factor of 0.5. As shown in Figures 5.2 and 5.3, when users report disutility functions, the scheduler shifts loads such that overall demand moves towards the midday period of high renewable generation, thus relying less on the generator and therefore incurring less cost and achieving higher social welfare overall. For the base load case examined in Figure 4, the peak generation falls from 4.67 kW to 3.46 kW, a reduction of roughly 29%, while the peak demand falls from 7.23 kW to 5.46 kW, a reduction of roughly 24%.

Figure 5.2 Scheduled aggregate load with and without flexibility.

To demonstrate the robustness of the disutility function approach to a surge in demand, we randomly sample loads served during the other weekdays of May 2018 in order to increase overall power demand. Specifically, we study demand scaled up in increments of 25% of the base load, up to a 100% increase in overall demand. Loads are randomly added to the base until the total power demand exceeds the desired level of increase, and the same random sets of loads are added in the cases with and without flexibility. The performance of both approaches is shown in Figures 5.4 and 5.5. The flexibility enabled schedule, which makes use of the disutility functions, continues to offer increased social welfare over on-demand scheduling.
Additionally, while disutility based scheduling still includes all loads, the on-demand scheduler finds it optimal to exclude between 25% and 33% of loads as the number of loads increases to double the base.

Figure 5.3 Scheduled generation (equal to aggregate load less renewable generation) with and without flexibility.

5.7 Conclusion

In this work, we study how to schedule and price service for a population of flexible, but non-preemptive loads, in the presence of renewable generation, as well as a dispatchable thermal generator. Formulating a collection of mixed integer optimization programs for the consumers and generator, we then study a centralized version of our setting with relaxed integer constraints, allowing for use of Lagrangian analysis and derivation of prices. A solution of this centralized problem yields a competitive equilibrium, and conversely a competitive equilibrium yields an efficient solution. Finally, we present a case study involving electric car charging data to demonstrate the efficacy of our approach.

There are several directions for future work in this area. First, in terms of the scheduling aspect, it is desirable to determine a method for deriving at least an approximately optimal solution to the original integer constrained setting, given an efficient solution to the relaxed social planner’s problem presented here. In terms of pricing, properties such as fairness should be examined. For example, assuming that the disutility functions of each user can be at least partially ordered from less to more restrictive, is the compensation offered to more flexible users more than to those which are not as flexible? It will also be of interest to explore other types of loads, such as those which may be interrupted, as well as those which might accept less than an upper bound of total energy delivered. Strategic behavior amongst market participants should also be taken into account, as well as more detailed network modeling and constraints.

Figure 5.4 Proportions of loads served with and without flexibility.

Figure 5.5 Social welfare achieved with and without flexibility.

Chapter 6 Designing Interpretable Approximations to Deep Reinforcement Learning with Soft Decision Trees

6.1 Introduction

Deep neural network (DNN)-driven algorithms now stand as the state of the art in a variety of domains, from perceptual tasks such as computer vision, speech and language processing to, more recently, control tasks such as robotics [50], [7]. Nevertheless, there is often reason to avoid direct use of DNNs. For example, the training from scratch or hyperparameter tuning of such networks can be prohibitively expensive or time consuming [68]. For some applications, the size or complexity of such DNNs precludes their use in real time, or employment in edge devices with limited processing resources [19], [50]. In other areas such as flight control or self driving cars, DNNs are sidelined (at least for mass deployment) by their opaqueness or lack of decision making interpretability [6], [38].

In light of such issues, a vast body of work has focused in recent years on developing simpler or more structured controllers which retain desirable properties of a given DNN based controller [7]. Additionally, multiple studies in this area, e.g., [7], [5], attribute the efficacy of DNNs across such a broad range of problems not to an inherently richer representative capacity over other architectures, or even over shallower neural networks, but rather to the many regularization techniques which currently facilitate DNN training. Therefore, while it may not always be clear how to precisely obtain alternative controllers with performance similar to DNNs, it is theoretically possible, and therefore well-motivated, to do so.

A related challenge is how to select an alternative controller, given multiple objectives.
For example, given a reference DNN, one may wish to design an alternative controller with fewer parameters and comparable performance that is also easier to understand.

In this work, we propose a collection of metrics constituting a framework for evaluating how well decision tree controllers distilled from a target DNN “match” the original, as well as how to make comparisons amongst competing alternatives. We focus on reinforcement learning (RL) tasks, and in particular on decision tree alternatives to DNNs trained via the DQN algorithm [51]. We study standard hard decision trees based on thresholding of input attributes at inner nodes as well as “soft” decision trees as described in [31] and in later sections of this work.

We consider metrics such as the average reward, policy accuracy percentage, number of parameters, and normalized root mean square (NRMS) error between what we term the empirical value functions (EVFs) associated with each controller (including the reference DQN). To obtain the EVFs, we use sample trajectories starting from a collection of points spanning a given RL environment’s state space, since neither hard nor soft decision tree controllers directly provide such an estimate. Based on these metrics, we assess collections of hard and soft trees, demonstrating how a designer can tailor a controller to specific design criteria, in particular by tuning the tree depth.

In summary, our contributions are as follows:
• We propose a collection of metrics including a novel, empirical value function based RMS distance metric for evaluating the quality of distilled hard and soft decision tree controllers;
• We train hard and soft decision trees of varying depth, and examine the impact of this parameter on our proposed distillation quality metrics.
• We demonstrate via our framework that in our test environment, approximately 80% accuracy in policy matching yields performance comparable to the original DNN, and that the SDT achieving these thresholds does so with fewer parameters and NRMS error than the HDTs which reach the same accuracy and performance level.

6.2 Background

6.2.1 Related Work

Related literature can be broken into two primary areas: knowledge distillation and model evaluation metrics.

6.2.1.1 Knowledge distillation

In essence, distillation is the transfer of behavior or learned knowledge for a given problem or task from one model or controller to another. Briefly, this is related to, but distinct from model compression, which seeks to quantize, code or otherwise process reference network weights in a way that leads to a reduced complexity model of the same structure [59]. Our framework could be used to evaluate the models resulting from such processes as well, but we focus here instead on distillation.

In [17], modestly sized models are trained on “pseudo-data” generated by large ensembles of base level classifiers. This teacher-student paradigm is central to the distillation literature. Shallow neural networks are trained in [5] to achieve comparable performance to state-of-the-art DNNs with the same number of parameters when the shallow networks are trained to mimic the DNNs instead of learning directly from the original labeled training data. Concentrating on edge devices such as smart phones, [50] demonstrate that a student of fixed size or complexity will perform poorly if the teacher is too large, and propose a “teaching-assistant” facilitated process involving multiple distillation steps between the original teacher and target student.
[39] argues that training a student neural network using a weighted combination of the correct labels and the output soft labels (class probabilities) generated by a teacher network helps transfer knowledge to the student regarding relative likelihoods of different classes, therefore improving student performance. This approach is extended in [19] to more complex, multi-class object tasks by incorporating training loss functions accounting for class imbalance as well as “hints” from intermediate teacher layers, amongst other techniques.

Introducing soft decision trees (SDTs), which essentially feature a single layer perceptron at each inner node, [31] demonstrated that the target student need not be a smaller neural network. Training these SDTs on data generated by an expert DNN improves performance over training directly on the labeled data. See Section 6.2.3 for a more detailed overview of this approach.

Distillation has also been extended from the realm of supervised learning to RL. Using a technique termed policy distillation, [66] shows the policy of an RL agent can be extracted to train a smaller, more efficient network to perform at expert level. [7] augment the Dagger algorithm [64] by making use of the expert network’s Q-function (see Section 2.1) to extract a series of policies, the best of which is selected based upon a cross-validation procedure. [45] extend mimic learning to RL settings via Linear Model U-Trees (LMUTs), a variant of the U-Tree representation of a Q function [49], placing linear models at each leaf node. Turning to a different objective, [68] proffer the kickstarting approach to speed up the training of RL students in the presence of trained experts. Finally, nearest to our experimental work, [22] studied how the SDTs described in [31] can be used to explain the behavior of expert DNNs in an RL setting.
6.2.1.2 Evaluation metrics

The collection of controller evaluation metrics presented here is most closely related to [3], which proposes four metrics to measure the quality of rules extracted from a neural network [26]: accuracy, fidelity, consistency and comprehensibility. For a classifier $\hat{c}$, accuracy is defined in [3] as

\[
P(\hat{c}(X) = C),
\]

where $C$ represents a previously unseen problem instance. Fidelity, the extent to which the decisions of the classifier $\hat{c}$ correspond to those of the original neural network ($nn$), is defined as the probability

\[
P(\hat{c}(X) = nn(X)).
\]

Consistency refers to the stability of the extracted policy across multiple $nn$ training sessions, and comprehensibility is the number of rules extracted from the network, along with the number of antecedents or conditions per rule.

Our EVF based NRMS calculation is most closely related to the examination of mean absolute error (MAE) and root mean square error (RMSE) between reference DNN and LMUT Q function representations in [45]. We instead examine the distance between empirical estimates for the value functions corresponding to each type of controller, as this captures in some sense the policy actually followed by each controller, given the system dynamics.

6.2.2 Imitation Learning

We use the methodology of imitation learning (IL) [64] to train our simpler and more interpretable decision trees (students) to imitate the complex trained DQN (expert). IL trains a classifier or regressor to predict the behavior of an expert based on a dataset of observation (state) and action trajectories produced by the expert. While extensions of this basic approach which account for discrepancies between the distributions of states visited by the expert and student are available, e.g., [64], we employ the basic IL approach here and leave exploration of more advanced techniques to future research.
6.2.3 Soft Decision Trees

While DNNs perform well in automatically learning policies for control problems, this efficacy is often offset by a lack of clarity as to how specific actions are selected. The root of this difficulty lies in the distributed nature of the representations embedded in the hidden layers of DNNs [31]. Beyond the first or last few layers, it is usually hard to explain the functional role of a given node as layers repeatedly aggregate latent features in order to form subsequent representations.

Decision trees offer an alternative, often more legible decision making paradigm. Decision tree selected actions can be traced back through sequences of decisions based directly on input data. In particular, we focus in this work on soft decision trees (SDTs), as described in [31]. For this class of binary decision trees, each inner node $i$ learns a filter $w_i$ and bias $b_i$, in order to output a probability of taking the right branch

\[
p_i(x) = \sigma\big(\beta (x^T w_i + b_i)\big).
\]

The parameter $\beta$, termed the ‘inverse temperature’, is introduced in order to avoid very soft decisions. The inverse temperature may be a learned parameter or static hyperparameter. Each leaf node $\ell$ learns a distribution over the possible output selections

\[
Q^{\ell}_k = \frac{\exp(\phi^{\ell}_k)}{\sum_{k'} \exp(\phi^{\ell}_{k'})}.
\]

A two-leaf tree with this architecture is shown in Figure 6.1.

Learning occurs via mini-batch gradient descent. In our setting, the SDT learns based on environment state observations paired with actions from the expert DQN. The training makes use of a loss function which minimizes the cross entropy between each leaf and the target distribution $T$, weighted by its path probability

\[
L(x) = \log\Big( -\sum_{\ell \in \text{leaf nodes}} P^{\ell}(x) \sum_k T_k \log Q^{\ell}_k \Big),
\]

where $T$ is the target distribution, and $P^{\ell}(x)$ is the probability of arriving at node $\ell$, given input $x$.
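The quantities defined above can be made concrete with a small sketch. This is not the implementation used in this work; it is a minimal pure-Python illustration that assumes a complete binary tree with inner nodes stored in heap order (node $i$ has children $2i+1$ and $2i+2$), a single shared $\beta$, and hypothetical function names. It computes the branch probabilities $p_i(x)$, the leaf path probabilities $P^{\ell}(x)$, the leaf distributions $Q^{\ell}$, and the path-probability-weighted cross entropy between the leaves and the target $T$.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(phi):
    """Leaf distribution Q over outputs from leaf parameters phi."""
    shift = max(phi)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in phi]
    total = sum(exps)
    return [e / total for e in exps]


def leaf_path_probabilities(x, weights, biases, beta, depth):
    """Return P^l(x) for each of the 2**depth leaves (left-to-right order)."""
    # p_right[i] is the probability of taking the right branch at inner node i.
    p_right = [
        sigmoid(beta * (sum(xj * wj for xj, wj in zip(x, w)) + b))
        for w, b in zip(weights, biases)
    ]
    probs = []
    for leaf in range(2 ** depth):
        node, prob = 0, 1.0
        for level in range(depth):
            # The bits of the leaf index, most significant first, encode the path.
            go_right = (leaf >> (depth - 1 - level)) & 1
            prob *= p_right[node] if go_right else 1.0 - p_right[node]
            node = 2 * node + 1 + go_right
        probs.append(prob)
    return probs


def weighted_cross_entropy(x, target, weights, biases, phis, beta, depth):
    """Cross entropy between each leaf and the target, weighted by path probability."""
    path = leaf_path_probabilities(x, weights, biases, beta, depth)
    total = 0.0
    for p_leaf, phi in zip(path, phis):
        q = softmax(phi)
        cross_entropy = -sum(t * math.log(qk) for t, qk in zip(target, q))
        total += p_leaf * cross_entropy
    return total


if __name__ == "__main__":
    # Two-leaf tree (one inner node) on a 2-dimensional state with 3 actions.
    value = weighted_cross_entropy(
        x=[-0.5, 0.01], target=[1.0, 0.0, 0.0],
        weights=[[0.0, 0.0]], biases=[0.0],
        phis=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
        beta=1.0, depth=1,
    )
    print(value)
```

With all parameters at zero, both leaves are reached with probability one half and each leaf distribution is uniform, so the weighted cross entropy against a one-hot target equals $\log 3$.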
Figure 6.1 SDT with a single inner node and two leaf nodes [31]

Unlike standard decision trees, this architecture allows for decisions to be made at each inner node based on aggregated input characteristics, rather than splitting based upon value ranges of input characteristics. Note that the input is fed directly into each inner node simultaneously, with the path from root to leaf for a given input being determined by left/right decisions at the inner nodes encountered. Having arrived at a leaf node, the output selection can be made either randomly according to the learned distribution, or by selecting the output with the highest probability at that leaf. In our experiments below we use the latter method.

Empirically it has been shown [31] that on supervised learning tasks such as MNIST digit recognition, SDTs trained on the outputs of a neural net expert exhibit better generalization than trees trained on data directly, though they do not match the performance of the expert. However, it is often easier to explain the output decisions of the SDTs than those of the neural network, as SDTs learn to make hierarchical decisions.

Our SDT implementation is based on the GitHub repository “Hierarchical mixture of Bigots” [46] associated with publication [31]. The case study environment was from an off-the-shelf implementation available in the OpenAI Gym software package [15]. All implementations of the hard and soft decision trees for this test environment were carried out on a MacBook Pro with a Quad-Core Intel Core i5 processor and 8GB RAM.

6.3 Controller Characterization Metrics

Given a reference expert controller and a collection of distilled models, it is desirable to identify which of the models represents the “best” approximation to the expert.
Precise characterization of the quality of a distilled representation of the expert (i.e., which metrics should be used to evaluate the learned policies) remains application specific, and in many cases an open problem, particularly in applications focused on interpretability. Much recent work on distillation techniques and their application focuses on controller performance [7], [22], [31]. In particular, [45] explicitly asserts that play performance, the average return achieved by the distilled controller, is the most relevant metric. Indeed, in practice many of the studies cited in Section 2.1 take the highest performing controller as best, though not all, e.g., [6]. Still, controller performance alone does not indicate the extent to which the distilled controller “matches” or explains the target DNN policy, particularly in cases where fitted models have lower performance than the expert [22]. Aside from performance, fidelity, or the extent to which fitted models match the predictions of the expert, is another natural evaluation metric for distilled models [45].

We evaluate decision trees distilled from a reference, DQN trained DNN on the basis of controller performance means and confidence bounds, as well as a pair of fidelity metrics. Taking the view of both the reference DNN and distilled trees as function approximators for the true value function of a given environment, we examine the normalized RMS error between the empirical value functions associated with each model.

Figure 6.2 illustrates the generation and comparison of these empirical value functions. The reference DNN can be represented by the function $\tilde{Q}$. Given distillation training observation set (states) $s_{train} = \{s_1, \ldots, s_n\}$, with $n \in \mathbb{Z}$, the DNN outputs corresponding labels (actions) $\tilde{a} = \{\tilde{a}_1, \ldots, \tilde{a}_n\}$. Together, labeled data $(s, \tilde{a})$ is used to train a soft decision tree as detailed in Section 6.2.3, and we infer policies $\tilde{\pi}$ and $\hat{\pi}$ corresponding to the DNN and SDT, respectively.
Starting from randomly seeded starting states $s_{test} = \{s'_1, \ldots, s'_m\}$ with $m \in \mathbb{Z}$, trajectories for both the DNN and SDT are generated in order to obtain empirical estimates of the value functions $\hat{V}^{\tilde{\pi}}$ and $\hat{V}^{\hat{\pi}}$ associated with $\tilde{\pi}$ and $\hat{\pi}$, respectively. We then take the empirical L2 norm of the difference between $\hat{V}^{\tilde{\pi}}$ and $\hat{V}^{\hat{\pi}}$, evaluated at $s_{test}$ and normalized by the maximum absolute value of $\hat{V}^{\hat{\pi}}$:

\[
\begin{aligned}
\mathrm{RMSE}(\hat{V}^{\tilde{\pi}}, \hat{V}^{\hat{\pi}}, s_{test}) &= \sqrt{ \frac{1}{m} \sum_{s' \in s_{test}} \Big( \hat{V}^{\tilde{\pi}}(s') - \hat{V}^{\hat{\pi}}(s') \Big)^2 } \\
\mathrm{NRMSE}(\hat{V}^{\tilde{\pi}}, \hat{V}^{\hat{\pi}}, s_{test}) &= \frac{\mathrm{RMSE}(\hat{V}^{\tilde{\pi}}, \hat{V}^{\hat{\pi}}, s_{test})}{\max_s |\hat{V}^{\hat{\pi}}(s)|}.
\end{aligned}
\tag{6.1}
\]

We focus here on value function approximation, rather than Q-value function approximation, as the value function reflects the actions that are actually taken under a given policy, and therefore captures in some sense the closed loop behavior of the controllers under comparison.

The second fidelity metric that we consider is policy accuracy, or percent 0-1 loss, which we later report as the percentage

\[
\%\mathrm{ACC}(\tilde{\pi}, \hat{\pi}, s_{grid}) = \frac{|\{s \in s_{grid} : \tilde{\pi}(s) = \hat{\pi}(s)\}|}{|\{s \in s_{grid}\}|},
\]

where $s_{grid}$ is a discretization of $S$. Policy accuracy is meant to assess how well the actions taken under a distilled tree policy $\hat{\pi}$ match the actions of the reference DNN. Intuitively, fixing a tree depth $n$, a more accurate distilled tree should provide a better explanation of the latent knowledge captured in the DNN.

Aside from performance and fidelity, we also consider the complexity of candidate controllers in terms of the number of included trainable parameters.

Figure 6.2 Generation and L2 norm comparison of empirical value functions $\hat{V}^{\tilde{\pi}}$ and $\hat{V}^{\hat{\pi}}$.

6.4 Case Study

In this section, we detail the MountainCar-v0 environment case study.

6.4.1 Problem description

The mountain car problem is a well-known benchmark problem in reinforcement learning, prevalent in the literature since the 1990s [52].
The problem involves an underpowered mountain car trying to reach a hilltop, starting out from a valley. As the car lacks sufficient power to drive directly uphill to the goal, the best policy is to reverse up the hill on the opposite side and use the acquired momentum in addition to the engine to reach the goal. In this problem, the state space is continuous with two states: the position of the car along the x-axis, contained in the interval [−1.2, 0.6], and the velocity of the car, contained in the interval [−0.07, 0.07]. Three actions are allowed: the car can either choose to go left, go right or do nothing. The task is declared successfully completed if the car reaches the goal state within 200 time steps. For every time step that the goal is not reached, the car receives a reward of −1 point. The episode is terminated as a failure if the time limit is exceeded and the car has not reached its goal.

6.4.2 Expert DQN

For MountainCar-v0, our expert neural net consisted of two hidden layers with 24 and 48 neurons, and three output neurons to account for the three possible actions. Discount factor $\gamma$ was set to 0.99 for training, and this architecture was trained using the DQN algorithm on 400 episodes of MountainCar-v0. The output reference model has a total of 1419 trainable parameters.

The DQN learned the optimal control policy successfully. We tested the DQN controller over 100 episodes, and observed that the controller succeeded in achieving the goal in every episode. The DQN completed the control task over all 100 episodes with a mean cumulative reward and 95% confidence interval of −154.05 ± 0.79 units.

Both soft and hard trees were trained on a set of labeled data generated by the expert DQN over 1500 MountainCar-v0 episodes. The data was preprocessed such that the final distillation data contained equal numbers of state/action pairs for each of the three available actions.
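The text does not state how the equal per-action counts were obtained. One common way to achieve such balance, shown here purely as an assumed illustration with hypothetical names, is to randomly downsample every action class to the size of the least frequent one.

```python
import random
from collections import defaultdict


def balance_by_action(pairs, seed=0):
    """Downsample (state, action) pairs so every action appears equally often.

    Assumption: balancing is done by random downsampling to the size of the
    rarest action; the preprocessing actually used may differ.
    """
    by_action = defaultdict(list)
    for state, action in pairs:
        by_action[action].append((state, action))

    n_keep = min(len(group) for group in by_action.values())
    rng = random.Random(seed)

    balanced = []
    for action in sorted(by_action):
        balanced.extend(rng.sample(by_action[action], n_keep))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Toy data: states are (position, velocity), actions are 0, 1 or 2.
    data = [((0.0, 0.0), 0)] * 5 + [((0.1, 0.0), 1)] * 3 + [((0.2, 0.0), 2)] * 7
    print(len(balance_by_action(data)))
```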
In all, the distillation training set contained 167,085 data points.

6.4.3 Hard and soft decision trees

Soft decision trees of depths 2 through 9 were trained using the technique discussed in Section 6.2.3. Hard trees of depths 2 through 9 were trained on the DQN generated labeled data using the sklearn Python package [57].

Figure 6.3 shows a trained SDT of depth 3. The colored squares in the inner layer nodes display the trained weights $w_i$. At each inner node, the left half panel gives the weight applied to the car position, while the right half panel gives the weight applied to the car velocity. Thus, for example, one may observe that the decision made at the root node depends most heavily on the (signed) car velocity. The tricolor panels at the leaf nodes represent the learned probability distribution over potential actions, with the letter corresponding to the highest probability action given for each leaf node in the bottom row. For this particular tree, there is a clear highest probability action corresponding to each leaf node.

Figures 6.4 through 6.7 display the results of the application of the metrics introduced in Section 6.3 to the sets of hard and soft trees, and reference DQN. Starting with our fidelity metrics, Figure 6.4 compares the NRMS values as calculated in (6.1) for each tree type and depth. For the purposes of this plot, the state space was discretized to 20 steps in each dimension, giving 400 test trajectories overall for each controller. As can be seen, the L2 error does not vary considerably with the tree depth for trees of either type, though the error increases slightly with depth for hard trees.

Figure 6.5 shows the policy accuracy percentage for each of the distilled controllers. In this experiment, the state space was discretized more finely, as each data point requires a call to the controller’s prediction function, rather than a complete episode trajectory.
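The two fidelity metrics can be sketched in a few lines. The following is a minimal illustration rather than the code used for the figures: the function names are hypothetical, the grid is assumed to be evenly spaced with both interval endpoints included, and the maximum in the NRMSE normalization is taken over the test states. The empirical values are passed in as plain lists, one entry per test state.

```python
import math

POSITION_RANGE = (-1.2, 0.6)    # car position interval
VELOCITY_RANGE = (-0.07, 0.07)  # car velocity interval


def linspace(lo, hi, steps):
    """Evenly spaced points including both endpoints (assumed convention)."""
    return [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]


def state_grid(steps):
    """Discretize the two-dimensional state space into steps**2 states."""
    return [(p, v)
            for p in linspace(*POSITION_RANGE, steps)
            for v in linspace(*VELOCITY_RANGE, steps)]


def nrmse(v_expert, v_student):
    """NRMSE as in (6.1): RMSE normalized by the largest |student value|."""
    m = len(v_expert)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(v_expert, v_student)) / m)
    return rmse / max(abs(b) for b in v_student)


def policy_accuracy(expert_policy, student_policy, grid):
    """Fraction of grid states on which the two policies choose the same action."""
    matches = sum(1 for s in grid if expert_policy(s) == student_policy(s))
    return matches / len(grid)


if __name__ == "__main__":
    grid = state_grid(20)
    print(len(grid))  # number of test states for 20 steps per dimension

    # Toy policies for illustration only: push in the direction of motion.
    expert = lambda s: 0 if s[1] < 0 else 2
    student = lambda s: 2
    print(policy_accuracy(expert, student, grid))
    print(nrmse([-100.0, -150.0], [-110.0, -140.0]))
```

A grid of 20 steps per dimension yields the 400 starting states mentioned above. Since policy accuracy only requires one prediction per state, a much finer grid is affordable for that metric than for the trajectory based NRMSE.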
In particular, the range of each input state feature was discretized into 100 steps. For the tree depths tested, depth 5 gives the optimal accuracy for SDTs, as deeper, more complex trees appear to overfit to the DQN labeled data. On the other hand, the percentage policy accuracy of HDTs increases monotonically with depth over the range tested.

Turning to performance, Figure 6.6 plots the empirical mean and 95% confidence intervals across 100 test trajectories for each controller. The DQN outperforms all SDTs in terms of mean reward, though the 95% confidence bounds of the depth 5 SDT overlap with those of the DQN. HDTs of depth 7 or greater either slightly outperform or essentially match the DQN.

Finally, Figure 6.7 compares the complexity of each distilled tree and the reference DQN in terms of tunable parameters. For each SDT, this number represents the learned weights and biases of each inner node, along with the probability distributions of each leaf node. For each HDT, this number is twice the number of inner nodes, as each inner node splits the data based upon an input attribute and threshold. For tree depth larger than 7, the SDTs actually become more complex than the DQN in the sense just described. The highest performing, depth 5 SDT is specified by 190 parameters, meaning this tree requires about 13% of the parameters needed to specify the DQN. The HDTs of depth 7, 8 and 9, which outperform this depth 5 SDT, require 330, 446, and 570 parameters, respectively.

For reference, Figures 6.8 and 6.9 show the policies associated with depth 2, 5 and 8 SDTs and HDTs, respectively, along with the DQN policy. Each policy plot again reflects discretization of the state space $S$ into 100 points along both position and velocity dimensions. As these plots demonstrate, while SDTs approximate the DQN policy via an increasing number of hyperplanes, the HDTs approximate the DQN policy via increasingly finer simple functions.
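The parameter counts quoted above can be checked with a short calculation. The sketch below uses hypothetical helper names. The DQN count assumes fully connected layers with biases. The SDT count assumes a complete binary tree with one weight per input feature plus a bias at each inner node, one parameter per action at each leaf, and a single additional scalar (for instance a shared learned inverse temperature); this last item is an assumption chosen because it reproduces the reported figure of 190, and is not confirmed by the text.

```python
def dense_params(layer_sizes):
    """Weights plus biases of a fully connected network (assumed architecture)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def sdt_params(depth, n_features, n_actions, extra_scalars=1):
    """Parameter count of a complete soft decision tree of the given depth.

    extra_scalars=1 is an assumption (e.g. one shared inverse temperature).
    """
    inner_nodes = 2 ** depth - 1
    leaves = 2 ** depth
    return inner_nodes * (n_features + 1) + leaves * n_actions + extra_scalars


def hdt_params(n_inner_nodes):
    """Each inner node of a hard tree stores a split attribute and a threshold."""
    return 2 * n_inner_nodes


if __name__ == "__main__":
    dqn = dense_params([2, 24, 48, 3])  # 2 state inputs, 24 and 48 hidden, 3 actions
    sdt5 = sdt_params(depth=5, n_features=2, n_actions=3)
    print(dqn, sdt5, round(100 * sdt5 / dqn, 1))
    print([sdt_params(d, 2, 3) for d in range(2, 10)])
    print(hdt_params(165))
```

Under these assumptions the network has 72 + 1200 + 147 = 1419 parameters, the depth 5 SDT has 190 (about 13% of 1419), and the SDT count first exceeds the DQN count at depth 8, in line with the statements above. A hard tree with 330 parameters corresponds to 165 inner nodes.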
See the Supplementary Materials for a similar analysis in the context of the CartPole-v0 environment, another benchmark problem available as part of the OpenAI Gym toolkit.
Figure 6.3 Soft Decision Tree for Mountain Car
Figure 6.4 Normalized RMS L2 Error for HDT/SDTs depth 2-9 (state space discretized into 20 steps per dimension).
Figure 6.5 Percentage Policy Accuracy for HDT/SDTs depth 2-9 (state space discretized into 100 steps per dimension).
Figure 6.6 Performance evaluation for HDT/SDTs and reference DQN for 100 episodes.
Figure 6.7 Number of parameters for HDT/SDTs and reference DQN.
Figure 6.8 Trained SDT and DQN policies, for state space S discretized to 10000 points.
6.4.4 Discussion
Overall, we identify three primary takeaways from our case study. First, larger tree depth does not always result in improved performance or fidelity. Second, given our choice of decision tree architecture and training methodology, as well as RL environment, it seems that the percentage accuracy is a better predictor of performance than the proposed L2 error metric. Third, soft decision trees with a fraction of the tunable parameters of the original DNN can achieve similar performance in the MountainCar-v0 task.
6.4.5 Further Case Studies
Future work will also include case studies of more complex environments. In particular, we have begun work on a case study of the CarRacing-v0 environment available in OpenAI Gym, considered the easiest continuous control task to learn from image pixel input. The version under study includes a discretized action space of cardinality 12 (each action is a tuple of turn direction, acceleration and braking), and input observations of 96 × 96 pixel frames. Using the DQN algorithm, it is possible to train a controller which “solves” the environment in terms of performance. At this point, HDTs of at least depth 13 trained on DQN trajectories via IL achieve performance approaching, but not matching, that of the DQN.
Thus far, however, IL has not been effective in training SDTs, necessitating further investigation. Tractable discretization of this larger state space for calculation of EVFs and policy accuracy also remains a challenge. Please see the supplementary document for more details.
Figure 6.9 Trained HDT and DQN policies, for state space S discretized to 10000 points.
6.5 Conclusion and Future Work
We introduced an evaluation framework and metrics to assess the quality of distilled decision tree controllers in RL settings against a variety of design objectives. Specifically, we consider normalized root mean square error between empirical value functions, policy match accuracy, mean performance with confidence bounds, and number of tunable parameters to guide designers and researchers toward assessing tradeoffs between performance, fidelity, and complexity as tree depths and types vary.
Apart from exploring more complex RL environments, future directions for our work include: considering alternative architectures beyond trees, such as kernel machines; incorporating more sophisticated learning algorithms such as DAgger [64] into the training of our alternative architectures; and developing metrics which focus more on the “closed-loop” behavior of controllers in a given environment.
Bibliography
[1] Abbad, J. R. (2010). Electricity market participation of wind farms: the success story of the Spanish pragmatism. Energy policy 38(7), 3174–3179.
[2] Amelang, S. (2018, January). Renewables cover about 100% of German power use for first time ever. [Online; posted 05-January-2018].
[3] Andrews, R., J. Diederich, and A. B. Tickle (1995). Survey and critique of techniques for extracting rules from trained artificial neural networks. Knowledge-based systems 8(6), 373–389.
[4] Aumann, R. J. (1964). Markets with a continuum of traders. Econometrica: Journal of the Econometric Society, 39–50.
[5] Ba, J. and R. Caruana (2014). Do deep nets really need to be deep?
In Advances in neural information processing systems, pp. 2654–2662.
[6] Bastani, O., C. Kim, and H. Bastani (2017). Interpretability via model extraction. arXiv preprint arXiv:1706.09773.
[7] Bastani, O., Y. Pu, and A. Solar-Lezama (2018). Verifiable reinforcement learning via policy extraction. In Advances in neural information processing systems, pp. 2494–2504.
[8] Bitar, E., K. Poolla, P. Khargonekar, R. Rajagopal, P. Varaiya, and F. Wu (2012). Selling random wind. In System Science (HICSS), 2012 45th Hawaii International Conference on, pp. 1931–1937. IEEE.
[9] Bitar, E. and Y. Xu (2016). Deadline differentiated pricing of deferrable electric loads. IEEE Transactions on Smart Grid 8(1), 13–25.
[10] Bitar, E. Y., R. Rajagopal, P. P. Khargonekar, K. Poolla, and P. Varaiya (2012). Bringing wind energy to market. IEEE Transactions on Power Systems 27(3), 1225–1235.
[11] Blair, N., A. P. Dobos, J. Freeman, T. Neises, M. Wagner, T. Ferguson, P. Gilman, and S. Janzou (2014). System Advisor Model, SAM 2014.1.14: General description. Technical report, National Renewable Energy Lab. (NREL), Golden, CO (United States).
[12] Borenstein, S. (2002). The trouble with electricity markets: understanding California’s restructuring disaster. Journal of economic perspectives 16(1), 191–211.
[13] Bose, S., D. W. Cai, S. Low, and A. Wierman (2014). The role of a market maker in networked Cournot competition. In 53rd IEEE Conference on Decision and Control, pp. 4479–4484. IEEE.
[14] Boyd, S. and L. Vandenberghe (2004). Convex optimization. Cambridge university press.
[15] Brockman, G., V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540.
[16] Bublitz, A., D. Keles, F. Zimmermann, C. Fraunholz, and W. Fichtner (2019). A survey on electricity market design: Insights from theory and real-world implementations of capacity remuneration mechanisms. Energy Economics 80, 1059–1078.
[17] Buciluǎ, C., R. Caruana, and A. Niculescu-Mizil (2006). Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 535–541.
[18] Carvalho, D. V., E. M. Pereira, and J. S. Cardoso (2019). Machine learning interpretability: A survey on methods and metrics. Electronics 8(8), 832.
[19] Chen, G., W. Choi, X. Yu, T. Han, and M. Chandraker (2017). Learning efficient object detection models with knowledge distillation. In Advances in Neural Information Processing Systems, pp. 742–751.
[20] Clarke, E. H. (1971). Multipart pricing of public goods. Public choice 11(1), 17–33.
[21] California Energy Commission (2019, January). Renewable energy.
[22] Coppens, Y., K. Efthymiadis, T. Lenaerts, and A. Nowe (2019). Distilling deep reinforcement learning policies in soft decision trees. In Proceedings of the IJCAI 2019 Workshop on Explainable Artificial Intelligence, pp. 1–6.
[23] Dahlin, N. and R. Jain (2018). A two stage mechanism for selling random power. CoRR abs/1809.09873.
[24] Dahlin, N. and R. Jain (2019). A two-stage market mechanism for electricity with renewable generation. In 2019 IEEE 58th Conference on Decision and Control (CDC). IEEE.
[25] Dahlin, N. and R. Jain (2020). Scheduling of flexible non-preemptive loads. arXiv preprint arXiv:2003.13220.
[26] Dancey, D., Z. A. Bandar, and D. McLean (2007). Logistic model tree extraction from artificial neural networks. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 37(4), 794–802.
[27] De Maere d’Aertrycke, G., A. Ehrenmann, and Y. Smeers (2017). Investment with incomplete markets for risk: The need for long-term contracts. Energy Policy 105, 571–583.
[28] DeMeo, E. A., W. Grant, M. R. Milligan, and M. J. Schuerger (2005). Wind plant integration [wind power plants]. IEEE Power and Energy Magazine 3(6), 38–46.
[29] Dillon, L. (2018, September).
California to rely on 100 percent clean electricity by 2045 under bill signed by Gov. Jerry Brown. [Online; posted 10-September-2018].
[30] El-Bayeh, C. Z. A review on charging strategies of plug-in electric vehicles.
[31] Frosst, N. and G. Hinton (2017). Distilling a neural network into a soft decision tree. arXiv preprint arXiv:1711.09784.
[32] Groves, T. and M. Loeb (1975). Incentives and public inputs. Journal of Public economics 4(3), 211–226.
[33] Gupta, A., R. Jain, K. Poolla, and P. Varaiya (2015). Equilibria in two-stage electricity markets. In 54th IEEE Conference on Decision and Control, CDC 2015, Osaka, Japan, December 15-18, 2015, pp. 5833–5838.
[34] Gupta, A., R. Jain, and R. Rajagopal (2015). Scheduling, pricing, and efficiency of non-preemptive flexible loads under direct load control. In 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1008–1015. IEEE.
[35] Gérard, H., V. Leclère, and A. Philpott (2018). On risk averse competitive equilibrium. Operations Research Letters 46(1), 19–26.
[36] Hashmi, M. U. (2018). Load flexibility for price based demand response.
[37] Hildenbrand, W. (1970). On economies with many agents. Journal of economic theory 2(2), 161–188.
[38] Hind, M., D. Wei, M. Campbell, N. C. F. Codella, A. Dhurandhar, A. Mojsilovic, K. N. Ramamurthy, and K. R. Varshney (2019). TED: teaching AI to explain its decisions. In V. Conitzer, G. K. Hadfield, and S. Vallor (Eds.), Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, AIES 2019, Honolulu, HI, USA, January 27-28, 2019, pp. 123–129. ACM.
[39] Hinton, G., O. Vinyals, and J. Dean (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
[40] Hledik, R., A. Faruqui, T. Lee, and J. Higham (2019, 06). The National Potential for Load Flexibility. Technical report, The Brattle Group.
[41] Hoff, S. (2016, July).
Electricity grid operators forecast load shapes to plan electricity supply. [Online; posted 22-July-2016].
[42] Hoium, T. (2017, September). 4 Utilities Betting Billions on Renewable Energy. [Online; posted 28-September-2017].
[43] California ISO (2021, March). Business practice manual for market instruments. [Online; posted 30-Mar-2021].
[44] Lee, Z. J., T. Li, and S. H. Low (2019). ACN-Data: Analysis and applications of an open EV charging dataset. In Proceedings of the Tenth ACM International Conference on Future Energy Systems, pp. 139–149.
[45] Liu, G., O. Schulte, W. Zhu, and Q. Li (2018). Toward interpretable deep reinforcement learning with linear model U-trees. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 414–429. Springer.
[46] Martak, L. (2020). Distill-nn-tree.
[47] Mas-Colell, A., M. D. Whinston, J. R. Green, et al. (1995). Microeconomic theory, Volume 1. Oxford university press New York.
[48] Maskin, E. (2019). Introduction to mechanism design and implementation.
[49] McCallum, A. K. et al. (1996). Learning to use selective attention and short-term memory in sequential tasks. In From animals to animats 4: proceedings of the fourth international conference on simulation of adaptive behavior, Volume 4, pp. 315. MIT Press.
[50] Mirzadeh, S.-I., M. Farajtabar, A. Li, N. Levine, A. Matsukawa, and H. Ghasemzadeh (2019). Improved knowledge distillation via teacher assistant. arXiv preprint arXiv:1902.03393.
[51] Mnih, V., K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
[52] Moore, A. W. (1990). Efficient memory-based learning for robot control.
[53] Morales, J. M., A. J. Conejo, H. Madsen, P. Pinson, and M. Zugno (2013). Integrating renewables in electricity markets: operational problems, Volume 205. Springer Science & Business Media.
[54] Moye, R. and S. P. Meyn (2018).
The use of marginal energy costs in the design of U.S. capacity markets. In 51st Hawaii International Conference on System Sciences, HICSS 2018, Hilton Waikoloa Village, Hawaii, USA, January 3-6, 2018.
[55] Intergovernmental Panel on Climate Change (2014). Summary for Policymakers, pp. 1–30. Cambridge University Press.
[56] Paterakis, N. G., O. Erdinç, and J. P. Catalão (2017). An overview of demand response: Key-elements and international experience. Renewable and Sustainable Energy Reviews 69, 871–891.
[57] Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830.
[58] Philpott, A., M. Ferris, and R. Wets (2016). Equilibrium, uncertainty and risk in hydro-thermal electricity systems. Mathematical Programming 157(2), 483–513.
[59] Polino, A., R. Pascanu, and D. Alistarh (2018). Model compression via distillation and quantization. arXiv preprint arXiv:1802.05668.
[60] Qin, J., J. Mether, J.-Y. Joo, R. Rajagopal, K. Poolla, and P. Varaiya (2018). Automatic power exchange for distributed energy resource networks: Flexibility scheduling and pricing. In 2018 IEEE Conference on Decision and Control (CDC), pp. 1572–1579. IEEE.
[61] Ralph, D. and Y. Smeers (2015). Risk trading and endogenous probabilities in investment equilibria. SIAM Journal on Optimization 25(4), 2589–2611.
[62] Roberts, D. (2019, October). Using electricity at different times of day could save us billions of dollars. [Online; posted 24-October-2019].
[63] Rockafellar, R. T. and S. Uryasev (2002). Conditional value-at-risk for general loss distributions. Journal of banking & finance 26(7), 1443–1471.
[64] Ross, S., G. Gordon, and D. Bagnell (2011). A reduction of imitation learning and structured prediction to no-regret online learning.
In Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp. 627–635.
[65] Rossi, F., R. Iglesias, M. Alizadeh, and M. Pavone (2019). On the interaction between autonomous mobility-on-demand systems and the power network: Models and coordination algorithms. IEEE Transactions on Control of Network Systems 7(1), 384–397.
[66] Rusu, A. A., S. G. Colmenarejo, C. Gulcehre, G. Desjardins, J. Kirkpatrick, R. Pascanu, V. Mnih, K. Kavukcuoglu, and R. Hadsell (2015). Policy distillation. arXiv preprint arXiv:1511.06295.
[67] Ruszczyński, A. and A. Shapiro. 6. Risk Averse Optimization, pp. 253–332.
[68] Schmitt, S., J. J. Hudson, A. Zidek, S. Osindero, C. Doersch, W. M. Czarnecki, J. Z. Leibo, H. Kuttler, A. Zisserman, K. Simonyan, et al. (2018). Kickstarting deep reinforcement learning. arXiv preprint arXiv:1803.03835.
[69] Shapiro, A., D. Dentcheva, and A. Ruszczyński (2009). Lectures on Stochastic Programming. Philadelphia, PA, USA: SIAM.
[70] Still, G. (2018). Lectures on parametric optimization: An introduction. Optimization Online.
[71] Stott, B., J. Jardim, and O. Alsaç (2009). DC power flow revisited. IEEE Transactions on Power Systems 24(3), 1290–1300.
[72] Tan, C.-W. and P. Varaiya (1993). Interruptible electric power service contracts. Journal of Economic Dynamics and Control 17(3), 495–517.
[73] Tang, W. and R. Jain (2015). Market mechanisms for buying random wind. IEEE Transactions on Sustainable Energy 6(4), 1615–1623.
[74] Tang, W. and R. Jain (2016). Dynamic economic dispatch game: The value of storage. IEEE Transactions on Smart Grid 7(5), 2350–2358.
[75] Wang, G., A. Kowli, M. Negrete-Pincetic, E. Shafieepoorfard, and S. Meyn (2011). A control theorist’s perspective on dynamic competitive equilibria in electricity markets. IFAC Proceedings Volumes 44(1), 4933–4938.
[76] Wang, G., M. Negrete-Pincetic, A. Kowli, E. Shafieepoorfard, S. Meyn, and U. Shanbhag (2011).
Dynamic competitive equilibria in electricity markets. In A. Chakrabortty and M. Ilic (Eds.), Control and Optimization Theory for Electric Smart Grids. Springer.
[77] Wang, G., M. Negrete-Pincetic, A. Kowli, E. Shafieepoorfard, S. Meyn, and U. V. Shanbhag (2012). Dynamic competitive equilibria in electricity markets. In Control and optimization methods for electric smart grids, pp. 35–62. Springer.
[78] Xu, Y. and S. H. Low (2015). An efficient and incentive compatible mechanism for wholesale electricity markets. IEEE Transactions on Smart Grid 8(1), 128–138.
Appendices
A Proof of Lemma 4
Proof. Given the problem setting of Chapter 3, and defining $\hat{y}_1 := \{\hat{y}_1^G, \hat{y}_1^L, \hat{\theta}_1\}$ and $\hat{y}_2(w) := \{\hat{y}_2^G(w), \hat{y}_2^L(w), \hat{x}_2^L(w), \hat{z}_2^L(w), \hat{\theta}_2(w)\}$, where $(w)$ denotes that $\hat{y}_2(w)$ is chosen for a realization $W = w$ (not necessarily optimal), we may write (SPP2) as
$$\min_{\hat{y}_2 \in \mathcal{G}(\hat{y}_1, w)} g(\hat{y}_2(w)).$$
Here $g(\cdot)$ gives the (SPP2) objective function, and $\mathcal{G}$ is a multifunction giving the constraint set for (SPP2) for first stage decision $\hat{y}_1$ and $W = w$. We may then further rewrite (SPP2) as
$$\min_{\hat{y}_2} \bar{g}(\hat{y}_1, \hat{y}_2(w), w),$$
where
$$\bar{g}(\hat{y}_1, \hat{y}_2(w), w) := \begin{cases} g(\hat{y}_2(w)) & \text{if } \hat{y}_2(w) \in \mathcal{G}(\hat{y}_1, w), \\ +\infty & \text{otherwise.} \end{cases}$$
Since $g(\cdot)$ is convex and does not itself depend upon $w$, it is a Carathéodory function and therefore random lower semicontinuous [69]. Note that $\bar{g}(\cdot)$ may be viewed as the sum of $g(\cdot)$ and $I_{\mathcal{G}(\hat{y}_1, w)}(\cdot)$, where
$$I_{\mathcal{G}(\hat{y}_1, w)}(\hat{y}_2(w)) = \begin{cases} 0 & \text{if } \hat{y}_2(w) \in \mathcal{G}(\hat{y}_1, w), \\ +\infty & \text{otherwise.} \end{cases}$$
As the constraint functions in (SPP2) are all linear, $\mathcal{G}(\cdot, w)$ is closed, and $\mathcal{G}(\cdot, \cdot)$ is measurable with respect to the sigma algebra of $\mathcal{Y}_1 \times \Omega$, where $\hat{y}_1 \in \mathcal{Y}_1$ and $w \in \Omega$. Therefore $I_{\mathcal{G}(\hat{y}_1, w)}$ is also random lower semicontinuous, so that $\bar{g}$ is random lower semicontinuous. Define $\mathcal{N}$ as the space of mappings $y : \Omega \to \mathcal{Y}_2$, where $\mathcal{Y}_2$ gives the unconstrained space of possible second stage decisions, i.e., $\mathcal{G}(\hat{y}_1, w) \subset \mathcal{Y}_2$ for all $\hat{y}_1$ and $w$. Since $\mathcal{N}$ is a finite dimensional Euclidean space, it is linearly decomposable, so that by Theorem 7.92 of [69], we have
$$\mathbb{E}\left[\inf_{\hat{y}_2(w)} \bar{g}(\hat{y}_1, \hat{y}_2(w), w)\right] = \inf_{\hat{y} \in \mathcal{N}} \mathbb{E}[\bar{g}(\hat{y}_1, \hat{y}, w)].$$
Thus, in the problem setting of Chapter 3, minimization and expectation may be interchanged, and the lemma follows from Theorems 2.20 and 2.21 of [69].
B Proof of Lemma 11
From the definition of $\mathrm{CVaR}_\alpha$, it is clear that $\mathrm{CVaR}_\alpha$ is convex over $\mathcal{L}_1(\Omega, \mathcal{F}, P)$, and therefore also continuous over $\mathcal{L}_1(\Omega, \mathcal{F}, P)$. $\mathrm{CVaR}_\alpha$ is also clearly a proper, monotonic risk function.
For every possible realization $w$ and first stage decision $(x, y)$, there always exists a feasible solution to the second stage problem (4.9)-(4.15), so $\pi^{\mathrm{SPP}}_2(x, W)$ is finite with probability 1 for all $x$. Together with the quadratic forms of the first and second stage cost functions, this implies that $\pi^{\mathrm{SPP}}_2(x, W) \in \mathcal{L}_1(\Omega, \mathcal{F}, P)$ for all $x$, and further, that $\mathrm{CVaR}_\alpha$ is continuous at $\pi^{\mathrm{SPP}}_2(x, W)$. Therefore by Proposition 6.37 of [67] we can write
$$\mathrm{CVaR}_\alpha(Q(x, W)) = \inf_{z(\cdot) \in \mathcal{G}(x, \cdot)} \mathrm{CVaR}_\alpha\left(\sum_i \tilde{a}_i(z_i(\cdot))\right), \qquad (2)$$
where $z(\cdot) : \Omega \to \mathbb{R}^N$, and $z(w) := [z_1(w), \ldots, z_N(w)]$ for all $w$. The notation $z(\cdot) \in \mathcal{G}(x, \cdot)$ denotes that $z(w)$ is a feasible choice for the constraint set in problem (4.9)-(4.15), given first stage decision $(x, y)$, for any $w$. The argument for interchanging expectation and minimization over $z(\cdot)$ follows from the interchangeability principle (Theorem 7.80) in [69].
C Proof of Lemma 16
Let $F(\epsilon)$ denote the feasible set of (SPP), given parameter $\epsilon \in [0, 1]$. From [70], the local compactness of $F$ at some $\bar{\epsilon}$ is satisfied if there exist a $\delta > 0$ and a compact set $C_0$ such that
$$\bigcup_{\|\epsilon - \bar{\epsilon}\| \le \delta} F(\epsilon) \subset C_0.$$
Observing that (SPP) is equivalent to a problem with the same objective and constraints, with the additional constraints that $0 \le \sum_i \hat{x}_i \le D$, $0 \le \hat{y} \le D$ and $\sum_i \hat{z}_i(w) \le W$, and that the feasible set of (SPP) does not depend upon $\epsilon$, local compactness is satisfied for any $\epsilon \in [0, 1]$.
From [70], the constraint qualification holds for $F(\epsilon)$ at some $(\hat{x}, \hat{z}(\cdot))$ with $(\hat{x}, \hat{z}(\cdot)) \in F(\epsilon)$ if there is a sequence $(\hat{x}_\nu, \hat{z}_\nu(\cdot)) \to (\hat{x}, \hat{z}(\cdot))$ such that (4.15) is satisfied with strict inequality. Clearly this condition holds for all $\epsilon \in [0, 1]$.
Define the optimal solution set of (SPP), given $\epsilon$, as $S(\epsilon)$. Then, since local compactness and the constraint qualification are satisfied for all $\epsilon \in [0, 1]$, by Lemma 5.6 in [70], $S(\epsilon)$ is outer semicontinuous, meaning that for all sequences $(\hat{x}^*_\nu, \hat{z}^*_\nu(\cdot), \epsilon_\nu)$, $\nu \in \mathbb{N}$, with $\epsilon_\nu \to \epsilon$ and $(\hat{x}^*_\nu, \hat{z}^*_\nu(\cdot)) \in S(\epsilon_\nu)$, there exists an $(\hat{x}^*, \hat{z}^*(\cdot)) \in S(\epsilon)$ such that $\|\hat{x}^*_\nu - \hat{x}^*\| \to 0$ and $\|\hat{z}^*_\nu(\cdot) - \hat{z}^*(\cdot)\| \to 0$ for $\nu \to \infty$.
Due to the strict convexity of the first and second stage cost functions, the objective of (SPP) is strictly convex, so that when an optimal solution $(\hat{x}^*, \hat{z}^*(\cdot))$ exists, it is unique. Therefore, outer semicontinuity of the optimal primal solutions as $\epsilon$ varies is equivalent to continuity. Since the equilibrium prices depend continuously on the primal solutions to (SPP), the prices themselves are continuous at any $\epsilon \in [0, 1]$.
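As a numerical companion to the risk measure appearing in Appendix B, the sketch below evaluates the conditional value at risk of a finite sample of losses using the minimization formula of Rockafellar and Uryasev [63], CVaR_alpha(Z) = min over t of t + E[(Z - t)^+] / (1 - alpha). It assumes that alpha denotes a confidence level in [0, 1), so that 1 - alpha is the tail mass; the convention adopted in Chapter 4 is not restated in this excerpt and may differ. The function name `empirical_cvar` is introduced here purely for illustration.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR of a loss sample at confidence level alpha in [0, 1).

    Assumption: alpha is a confidence level, so the worst (1 - alpha)
    fraction of outcomes is averaged. Uses the Rockafellar-Uryasev form
    min_t { t + E[(Z - t)^+] / (1 - alpha) }.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    samples = sorted(losses)
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")

    def objective(t):
        # Sample average of the excess loss above threshold t.
        excess = sum(max(z - t, 0.0) for z in samples) / n
        return t + excess / (1.0 - alpha)

    # The objective is convex and piecewise linear in t with kinks at the
    # sample values, so its minimum is attained at one of the samples.
    return min(objective(t) for t in samples)
```

For the equally likely losses 1, 2, 3, 4 and alpha = 0.5, the function averages the two worst outcomes; setting alpha = 0 recovers the ordinary sample mean, which corresponds to the risk-neutral expectation objective.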
Abstract
The current moment finds the world’s energy infrastructure at the threshold of rapid transformation. Long heralded as the most complex human-technological system ever realized, in view of a looming climate crisis, as well as ever accelerating technological advances, today’s power systems are defined by increasing penetration of renewable and distributed energy resources on the supply side, and the emergence of flexible loads on the demand side.
While these developments present opportunities for improved system operation and outcomes, they come with significant challenges as well, which cannot be solved solely through continued development and adoption of technology. Consider the case of Germany’s Energiewende program. Government-imposed above-market rates for solar and wind generation grew the renewable share of electricity consumption from 5% in 1999 to 27% in 2014. Nevertheless, these same rates, together with mandated prioritization of renewables, proved high enough to raise consumer prices, and low enough to undercut natural gas, turning many utilities back to coal, and producing an uptick in German carbon dioxide emissions.
Clearly, market institutions have a crucial role to play in this transformation as well. Through market analysis, design and optimization, this work explores how such institutions can evolve to address three key characteristics of modern power systems: uncertainty, risk, and flexibility.
Central to the potential of markets to mitigate the uncertainty posed by reliance on renewables is a revision of the current two-settlement market structure common to deregulated energy markets across the world. Rather than arrange advance supply and account for real time imbalances in supply and demand separately, this work considers a two-stage, stochastic market clearing paradigm. Probabilistic information regarding renewable generation is used to couple day ahead and expected real time recourse decisions, increasing efficiency.
Within this framework, an incentive compatible two-stage market mechanism is designed for a renewable generator selling its random power to strategic customers. Next, a two-stage mechanism is developed for a two-sided exchange with primary and ancillary generation and demand response, implementing a sequential competitive equilibrium (SCEq).
As renewables assume a larger share of electricity generation and consumption, market participants on both sides are exposed to higher levels of risk, both in terms of price and quantity variability. Many of these participants are risk averse, so that optimization of the expectation of a stochastic objective is not sufficient. Therefore, optimization of conditional value at risk of the cost of addressing shortfall in renewable generation is considered. A two-stage market mechanism implementing an SCEq is developed in this risk-aware setting.
Given that user flexibility is considered one of the most valuable, yet still untapped resources available for accommodating the transition to renewables, an explicit market for flexibility is designed and analyzed. Users report preferences for service over a finite time horizon to a scheduler which shapes the aggregate demand profile to the output of a renewable generator, while minimizing the cost of resorting to thermal generation. Social welfare properties of competitive equilibria and an accompanying mechanism are studied.
Asset Metadata
Creator
Dahlin, Nathan John (author)
Core Title
Smarter markets for a smarter grid: pricing randomness, flexibility and risk
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Degree Conferral Date
2021-12
Publication Date
09/30/2021
Defense Date
06/02/2021
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
electricity markets, game theory, mechanism design, OAI-PMH Harvest, smart grid, stochastic optimization
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Jain, Rahul (committee chair), Nayyar, Ashutosh (committee member), Nuzzo, Pierluigi (committee member), Sen, Suvrajeet (committee member)
Creator Email
dahlin@usc.edu,nathan.dahlin@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC16010301
Unique identifier
UC16010301
Legacy Identifier
etd-DahlinNath-10117
Document Type
Dissertation
Rights
Dahlin, Nathan John
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu