PREDICTION MODELS FOR DYNAMIC DECISION MAKING IN SMART GRID

by

Saima Aman

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)

January 2016

Copyright 2016 Saima Aman

Dedication

To the successful women in STEM.

Acknowledgments

I would like to express my sincere gratitude to my advisor, Prof. Viktor K. Prasanna, for the continuous guidance and support through my Ph.D. years. His guidance and immense knowledge have helped me remain motivated and appreciate the “big picture” of my research. I am also grateful to Prof. Cauligi Raghavendra and Prof. Cyrus Shahabi for serving on my thesis committee and providing valuable feedback.

My sincere thanks also go to Dr. Yogesh Simmhan for mentoring me during the initial years of my Ph.D. and teaching me the skills of doing and communicating research, and to Dr. Charalampos Chelmis for mentoring, guidance, and collaboration on several research projects and publications. Their guidance has been invaluable in my research career.

I would also like to thank all the members of the P-group at USC, especially Dr. Marc Frincu and Dr. Anand Panangadan, for stimulating discussions and help on numerous occasions.

Finally, I would like to thank my family, especially my husband, for his unwavering support during my Ph.D. years.

Contents

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract

1 Introduction
  1.1 Motivation
  1.2 Research Contributions

2 Background
  2.1 Smart Grid
    2.1.1 USC Campus Microgrid
  2.2 Big Data Sources in Smart Grid
  2.3 Smart Grid Applications
    2.3.1 Planning
    2.3.2 Customer Education
    2.3.3 Demand Response
  2.4 Prediction Models used in Smart Grid
    2.4.1 Averaging Models
    2.4.2 Regression Models
    2.4.3 Time Series Models
    2.4.4 AI and Machine Learning Models
  2.5 Real-world Datasets
    2.5.1 Electricity Consumption
    2.5.2 Weather
    2.5.3 Schedule

3 Dynamic Demand Response
  3.1 Introduction
  3.2 Dynamic Decision Making
  3.3 Prediction Models for Dynamic Demand Response
  3.4 Requirements for Prediction Models for D²R
    3.4.1 Feature Selection
    3.4.2 Data Collection
    3.4.3 Computational Complexity
    3.4.4 Cost vs. Benefits Trade-offs
  3.5 Dynamic Demand Response in USC Microgrid

4 Prediction Evaluation Measures
  4.1 Introduction
  4.2 Related Work
  4.3 Performance Measure Dimensions
  4.4 Application Independent Measures
    4.4.1 Mean Absolute Percentage Error (MAPE)
    4.4.2 Coefficient of Variation of Root Mean Square Error (CVRMSE)
    4.4.3 Relative Improvement (RIM)
    4.4.4 Volatility Adjusted Benefit (VAB)
    4.4.5 Computation Cost (CC)
    4.4.6 Data Collection Cost (CD)
  4.5 Application Dependent Measures
    4.5.1 Domain Bias Percentage Error (DBPE)
    4.5.2 Reliability Threshold Estimate (REL)
    4.5.3 Total Compute Cost (TCC)
    4.5.4 Cost-Benefit Measure (CBM)
  4.6 Experiments
    4.6.1 Datasets
    4.6.2 Candidate Prediction Models
    4.6.3 Model Configurations
  4.7 Analysis of Independent Measures
    4.7.1 24-hour Campus Predictions
    4.7.2 24-hour Building Predictions
    4.7.3 15-min Campus Predictions
    4.7.4 15-min Building Predictions
    4.7.5 Cost Measures
  4.8 Analysis of Dependent Measures
    4.8.1 Planning
    4.8.2 Customer Education
    4.8.3 Demand Response
  4.9 Summary

5 Prediction with Partial Data
  5.1 Introduction
  5.2 Related Work
  5.3 Preliminaries
  5.4 Methodology
    5.4.1 Influence Discovery
    5.4.2 Influence Model (IM)
    5.4.3 Local Influence Model (LIM)
    5.4.4 Global Influence Model (GIM)
  5.5 Cost-efficiency of Influence Models
  5.6 Experiments
    5.6.1 Datasets
    5.6.2 Evaluation
    5.6.3 Influence Variation
    5.6.4 Results
  5.7 Summary

6 Prediction of Reduced Consumption
  6.1 Introduction
  6.2 Related Work
  6.3 Preliminaries
    6.3.1 Consumption Sequences
    6.3.2 Contextual Attributes
    6.3.3 Problem Definition
  6.4 Base Models
    6.4.1 IDS: In-DR Sequence Model
    6.4.2 PDS: Pre-DR Sequence Similarity Model
    6.4.3 DSS: Daily Sequence Similarity Model
  6.5 Reduced Electricity Consumption Ensemble (REDUCE)
    6.5.1 Computational Complexity of REDUCE
  6.6 Big Data Ensemble for Reduced Consumption (BiDER)
    6.6.1 Computational Complexity of BiDER
  6.7 Experimental Setup
    6.7.1 Dataset
    6.7.2 Model Parameters
    6.7.3 Evaluation
  6.8 Analysis of REDUCE
    6.8.1 Effect of Schedule
    6.8.2 Effect of Training Data Size
    6.8.3 Effect of Variance in Consumption
  6.9 Analysis of BiDER
    6.9.1 Selecting the Training Period
  6.10 Summary

7 Conclusions
  7.1 Contributions
  7.2 Future Work

Reference List

List of Tables

2.1 Description of campus microgrid dataset
3.1 Key characteristics of DR and D²R
4.1 Summary of campus microgrid dataset
4.2 Application-independent cost measures
4.3 Application-specific parameters
4.4 Application-specific cost parameters and measures
6.1 Key characteristics of normal consumption, reduced consumption, and DR baseline
6.2 Notations used in reduced consumption modeling

List of Figures

1.1 Conceptual diagram of electricity consumption and reduction during DR
2.1 Big Data Sources in Smart Grid
2.2 Demand Response: duration and depth of reduction
2.3 PDF of electricity consumption in the USC campus buildings
3.1 The transition to Dynamic Demand Response
3.2 Dynamic Decision Making for D²R
4.1 Measures (CVRMSE and MAPE) for coarse-grained predictions
4.2 Measures (RIM and VAB) for coarse-grained predictions
4.3 Measures (CVRMSE and MAPE) for fine-grained predictions
4.4 Measures (RIM and VAB) for fine-grained predictions
4.5 Measures (DBPE and CBM) for coarse-grained predictions
4.6 Measure (REL) for coarse-grained predictions
4.7 Measures (DBPE and CBM) for fine-grained predictions
4.8 Measure (REL) for fine-grained predictions
5.1 Partial Data Problem
5.2 Cost-effectiveness of IM model
5.3 Influence/dependency with respect to time
5.4 Influence/dependency with respect to size
5.5 Influence/dependency with respect to distance
5.6 Performance of IM with respect to ART
5.7 Prediction performance of LIM models
5.8 Performance of LIM with respect to IM and ART
5.9 Prediction performance of GIM models
5.10 Performance of GIM with respect to IM and ART
5.11 Lift in performance of LIMs with respect to IM and ART
5.12 Lift in performance of GIMs with respect to IM and ART
6.1 Normal consumption, reduced consumption, and DR baseline vis-à-vis a DR event
6.2 Reduced consumption during DR
6.3 Distribution of DR events over 3 years
6.4 Distribution of DR events across buildings
6.5 Performance of REDUCE using CDF plot for MAPE
6.6 Performance of REDUCE across buildings
6.7 Performance of REDUCE using CDF plot for REL
6.8 Performance of REDUCE for a building with labs and offices
6.9 Performance of REDUCE for a campus center building
6.10 Performance of REDUCE for an academic building
6.11 Performance of REDUCE with respect to training data size
6.12 Performance of REDUCE with respect to average consumption
6.13 Performance of BiDER with respect to training window size
6.14 Performance of IDS, ITER, ICER, and BiDER models
6.15 Performance of IDS, ITER, ICER, and BiDER using CDF plots

Abstract

The widespread use of smart meters and other sensors in smart grid has resulted in unprecedented amounts of data being generated at high spatial and temporal resolutions. This high volume data is being generated at a high velocity and comes from a variety of sources, and is designated as “big data” by researchers and practitioners in various domains, including the smart grid domain. Predictive modeling can be used to learn from this data how electricity consumption patterns change over time and when peak demand periods occur. Utilities routinely face the challenge of ensuring uninterrupted electric supply during peak demand periods. The widespread practice to address this challenge is demand response (DR), whereby utilities ask customers to reduce their consumption during peak demand periods according to a-priori agreements. For DR, utilities need to make decisions about when, by how much, and how to reduce consumption.
While day-ahead predictions have long been used to make these decisions, in this dissertation we address the problem of making predictions and decisions at a few hours’ advance notice whenever necessitated by the changing conditions of the grid. In particular, we formulate and address the problem of dynamic demand response (D²R) in smart grids, which involves balancing supply and demand in real-time and adapting to dynamically changing conditions by automating and transforming the DR planning process. We also focus on the requirements and challenges of prediction modeling of electricity consumption data and its evaluation to enable D²R. For example, the prediction models for D²R must satisfy often conflicting requirements of high accuracy and low computational complexity for fast predictions.

First, we identify the limitations of existing measures for evaluating the performance of electricity consumption prediction models in smart grid and propose a suite of performance measures that address accuracy, reliability, and cost. For example, while common error measures only consider the absolute difference between the predicted and observed values, the sign of the difference is very useful in determining whether it was an under-prediction or over-prediction in applications concerned with predicting peaks during D²R. Our application dependent measures with parametrized coefficients set by domain experts allow model comparison that is meaningful for specific smart grid applications. While our measures have been proposed in the context of smart grid, their scope and the analysis of their use are relevant for applications beyond the smart grid domain. Our analysis of the measures offers deeper insight into models’ behavior and their impact on real applications, enables intelligent cost-benefit trade-offs between models, and offers a comprehensive goodness of fit for picking the “right” model.
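The role of the error sign noted above can be sketched in a few lines. This is an illustrative aside only: `mean_signed_pe` below is a generic signed measure, not the dissertation’s DBPE, whose parametrized coefficients are set by domain experts.

```python
def mape(observed, predicted):
    """Mean Absolute Percentage Error (percent): discards the error sign."""
    return 100.0 * sum(abs((o - p) / o)
                       for o, p in zip(observed, predicted)) / len(observed)

def mean_signed_pe(observed, predicted):
    """Mean signed percentage error (percent): positive on average means
    under-prediction, negative means over-prediction.
    Illustrative only; NOT the dissertation's DBPE formulation."""
    return 100.0 * sum((o - p) / o
                       for o, p in zip(observed, predicted)) / len(observed)

observed = [100.0, 100.0, 100.0, 100.0]   # observed consumption (kWh)
under    = [ 90.0,  95.0,  90.0,  95.0]   # a model that under-predicts
over     = [110.0, 105.0, 110.0, 105.0]   # a model that over-predicts

# Both models are indistinguishable under MAPE...
assert abs(mape(observed, under) - 7.5) < 1e-9
assert abs(mape(observed, over) - 7.5) < 1e-9
# ...but the signed measure separates under- from over-prediction.
assert mean_signed_pe(observed, under) > 0
assert mean_signed_pe(observed, over) < 0
```

For peak prediction during D²R, over-prediction (triggering unneeded curtailment) and under-prediction (missing a peak) have very different consequences, which is why the sign matters.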
We formulate and address the partial data problem that arises when real-time data from all sensors is not available at the utilities, making it impossible to do reliable predictions for D²R using traditional time series based models. We propose a novel approach that extends the notion of time series dependency to discover a small subset of “influential” sensors, and uses real-time data only from them to enable accurate predictions for all sensors. Next, we address the problem of predicting reduced consumption during DR, which is required for planning for D²R, as well as for selecting customers for participation in D²R and for calculating their compensation. The abrupt change in consumption profiles at the beginning and end of the DR period and the usually short durations of DR make it impossible to reliably use time series based models for predictions. To address the unique challenges of reduced consumption prediction, we propose an ensemble model that uses different sequences of daily consumption on DR event days and contextual attributes for prediction. In particular, we leverage big data on reduced consumption to learn a single ensemble model for diverse customers over different time intervals, thus achieving a high cost reduction in terms of the number of models trained. The reduction is of the order of n × L, where n is the number of customers and L is the number of intervals in the DR period. Also, the low computational complexity of our model makes it ideal for the dynamic decision making required for D²R.

The prediction modeling problems addressed in this dissertation are motivated by real-world applications within the University of Southern California (USC) campus microgrid. For training and evaluating our models, we use data from a variety of sources: fine-grained electricity consumption data from the USC campus microgrid collected every 15 minutes for over 7 years; hourly weather data collected from the NOAA weather station on campus; as well as schedule data from the campus.
Our models have been implemented and integrated with the USC Facilities Management Services’ (FMS) solution for D²R on the USC campus.

Chapter 1

Introduction

1.1 Motivation

Energy derived from fossil fuels is the single largest driver of industrialization and development worldwide. However, there is a growing realization that increasing demand for energy will eventually deplete natural fossil fuel reserves [85], and alternative measures must be found to achieve energy sustainability. One of the key efforts in this regard is transforming the electric grids worldwide into smart grids [85] as they adopt smart meters, advanced metering infrastructure¹, and sensing devices to monitor and collect data about electricity consumption and grid conditions in near real time. Sensors installed under Internet of Things (IoT) frameworks at commercial and residential sites also collect data from which electricity consumption related information can be derived [12]. This has resulted in unprecedented amounts of data being generated at high spatial and temporal resolutions [91]. However, translating this data into actionable insights for decision making requires novel data modeling methods [85] as well as relevant evaluation measures [14]. For example, predictive modeling can be used by electric utilities to learn how electricity consumption patterns will change over a span of a few hours, to anticipate periods of peak demand, and to make dynamic real-time decisions on how to address a potential mismatch between supply and demand by optimizing resources. Traditional modeling methods may not work well for these applications due to various factors, such as lack of real-time data, abrupt changes in consumption profiles, and high computational complexity.

¹ Smart Meter deployments continue to rise, US Energy Info. Admin., 2012. http://www.eia.gov/todayinenergy/detail.cfm?id=8590
Smart grids are transforming the decision making process for the management of peak demand periods. The peak periods are determined based on historical consumption profiles. When the demand for electricity approaches supply limits, typically in the afternoon and especially in summer, it may cause power interruptions. Building enough power plants to meet increasing demand is not feasible from a cost and environment perspective. Several approaches have been proposed to mitigate the problem of keeping the grid up and running during peak demand periods without adding new generation units. Some utilities use load shifting to move some electric loads to off-peak periods; this requires deciding which loads to shift. Many utilities worldwide now use demand response (DR) (Figure 1.1) to manage peak load periods. DR refers to the changes in electricity use effected by the demand side in response to a need indicated by the utility. Usually, utilities send out signals to customers to reduce consumption during anticipated peak periods as per a-priori agreements with the customers, thus making sure that there is no mismatch between the supply and the demand. The main advantage of demand response is that the load curtailment task is shared between the utility and the customers. The main benefits for the utility include prevention of blackouts, reduced need for adding new generation units, and increased system reliability. The customers generally get incentives for participation.

The DR programs currently used by the utilities can be broadly categorized into two: 1) Direct DR, where the utility remotely controls electricity use in buildings through embedded controls and building energy management systems; this is generally pre-programmed and pre-authorized. 2) Voluntary DR, whereby the customers voluntarily turn off equipment when sent a signal by the utility in response to incentives given by the utility.
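The load-shifting decision mentioned above can be sketched as a simple greedy selection: keep shifting the largest movable loads to off-peak periods until the anticipated peak falls below the supply limit. The load names and numbers here are hypothetical, not drawn from any utility’s actual program.

```python
def pick_loads_to_shift(loads, peak_kw, limit_kw):
    """Greedy sketch: choose shiftable loads, largest first, until the
    anticipated peak is at or below the supply limit.
    loads: {name: shiftable kW}. Returns the list of chosen load names."""
    chosen = []
    for name, kw in sorted(loads.items(), key=lambda kv: -kv[1]):
        if peak_kw <= limit_kw:
            break
        chosen.append(name)
        peak_kw -= kw  # shifting this load lowers the anticipated peak
    return chosen

# Hypothetical building loads (kW that could be moved off-peak).
loads = {"hvac_zone_a": 120.0, "ev_charging": 80.0, "water_heating": 40.0}
assert pick_loads_to_shift(loads, peak_kw=700.0, limit_kw=550.0) == \
    ["hvac_zone_a", "ev_charging"]
```

A real scheduler would also weigh occupant comfort and contractual constraints, but the greedy core captures the decision the text describes.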
Most of the current DR programs are static in that the timing and requests are statically defined a-priori. From a research perspective, static DR is considered a solved problem. The next development in the field of DR is dynamic DR, as described in the following paragraph.

In traditional DR, the curtailment response for peak demand periods is planned statically, much in advance of the actual DR event. These responses are generally based on either schedule-based or incentive-based DR approaches. In both cases, the decisions about the DR period and the DR response are made days and weeks in advance. As mentioned previously, this does not work reliably in the event of variable peak demand. Also, the demand patterns change more dynamically at fine spatio-temporal scales. In the context of DR, deciding how much the customers need to reduce their consumption, and for how long, as well as deciding which customers to include in a DR event, are important considerations.

The decision making process in smart grids relies on long term and short term predictions of consumption and reduction in consumption. While utilities have long used day-ahead predictions for decision making about demand response, recent advancements in smart grids require decision making at a few hours’ advance notice whenever necessitated by dynamically changing conditions of the grid, such as intermittent generation from renewable energy sources [11]. Recognizing this need, in this dissertation we formulate and propose the problem of dynamic demand response (D²R) (Chapter 3), whereby the planning for DR is done a few minutes to a few hours before the beginning of the DR event. This planning for DR involves predictions, decision making, and notification for DR.
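The D²R planning flow just described (predict, decide, notify) can be sketched as a small decision loop. Everything here is an illustrative assumption: the naive averaging forecast stands in for the prediction models of later chapters, and the threshold and names are made up, not taken from the dissertation’s implementation.

```python
SUPPLY_LIMIT_KWH = 500.0   # assumed per-interval supply limit (hypothetical)
NOTICE_HOURS = 2           # advance notice given to customers (hypothetical)

def forecast_next_hours(history, hours):
    """Naive averaging forecast: mean of the most recent `hours` readings,
    repeated for each future interval. A real D2R system would use far
    richer prediction models."""
    recent = history[-hours:]
    return [sum(recent) / len(recent)] * hours

def plan_dr(history):
    """Return (trigger, per-interval shortfall) for the next NOTICE_HOURS:
    trigger is True when the forecast exceeds the supply limit, i.e. when
    a DR notification should be sent a few hours ahead."""
    predicted = forecast_next_hours(history, NOTICE_HOURS)
    shortfall = [max(0.0, p - SUPPLY_LIMIT_KWH) for p in predicted]
    return any(s > 0 for s in shortfall), shortfall

# Rising hourly consumption pushes the forecast above the supply limit.
trigger, shortfall = plan_dr([420.0, 460.0, 505.0, 535.0])
assert trigger
assert shortfall == [20.0, 20.0]
```

The shortfall per interval is what would then drive the “by how much” and “which customers” decisions of D²R.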
We consider dynamic demand response as the motivating and prime example of dynamic decision making in smart grid that involves balancing supply and demand in real-time and adapting to dynamically changing conditions by the DR process. D²R involves decisions about when, by how much, and how to reduce consumption² [98, 6] during DR.

Figure 1.1: Conceptual diagram of electricity consumption and reduction during DR

Prediction models used for D²R need to take into consideration the dynamically changing conditions of the grid, such as fluctuating production from renewable generation sources and the dynamic addition and removal of distributed energy generation and storage sources, as well as newly available streaming data. These models must satisfy the challenging requirements of high accuracy for optimal predictions and low computational complexity for doing real-time predictions for a large number of end users. Also, the models should be able to deal with less than ideal conditions, such as sudden changes in patterns, missing data, or the unavailability of data in real-time. It is also important to evaluate the performance of a model in the right context with appropriately selected performance measures. We address these prediction challenges in this dissertation.

² In this work, we address energy consumption prediction, which deals with average energy over an interval (i.e., kWh). This is different from demand (or load) prediction (measured in kW).

1.2 Research Contributions

The key contributions of this dissertation are as follows:

• We propose and define the problem of Dynamic Demand Response (D²R) (Chapter 3). We identify the factors necessitating the transition towards D²R and the unique characteristics and challenges of D²R vis-à-vis DR. Utilities have long used demand response (DR) for achieving consumer-driven reduction in consumption during anticipated peak demand periods to maintain grid reliability [11].
Traditionally, planning and notification for DR is done one day ahead of the day when DR is to be performed. However, for D²R, the utility provider needs to perform DR at a few hours’ advance notice whenever necessitated by dynamically changing conditions of the grid.

• We propose a suite of performance measures for evaluation of prediction models in Smart Grids along dimensions that go beyond just the magnitude of errors, and explore scale independence, reliability, and cost. It has been well recognized that model performance is often evaluated based on abstract metrics, detached from their meaningful evaluation for the end-use domain [105]. In Chapter 4, we discuss how, for many smart grid domain applications, error measures such as MAPE alone are inadequate to correctly evaluate prediction models; for example, in some cases, the frequency with which a model outperforms the baseline is an important consideration. Also, the cost of collecting data and running models cannot be ignored, especially for models dealing with “big data”. Our novel application dependent measures with parametrized coefficients set by domain experts allow comparison that is meaningful for that scenario. We analyze the usefulness of our proposed measures with respect to three Smart Grid applications: planning, customer education, and DR. Our study based on real world Smart Grid data is the first of its kind in defining holistic measures and evaluating candidate consumption models for emerging microgrid and utility applications.

• We formulate and address the problem of prediction with partial data (Chapter 5) that arises when real-time data from only some sensors is available at the utilities. Sensors in smart grid are located at geographically dispersed locations and periodically send back acquired data to the utilities [34] via wireless links and the Internet [33].
Prediction at the utilities must be performed under conditions where data from all sensors is not available in real-time, leading to the partial data problem, where only partial data from sensors is available in real-time and complete high resolution data is available only periodically. Without addressing this problem, traditional methods for prediction risk degradation in performance and inaccurate interpretation of generated insights. We propose a novel solution to do reliable predictions for all sensors, even in the absence of real-time data from all of them, by using real-time data from a small subset of “influential” sensors. First, we learn the dependencies among the time series of different sensors; then, we use data from a small subset of sensors to make predictions for all sensors. While time series dependencies have been studied and utilized previously, the novelty of our work is in extending the notion of dependencies to discover influential sensors and using real-time data only from them to do predictions for all sensors. Using real-world electricity consumption data, we demonstrate that despite the lack of real-time data, our prediction models perform comparably to the baseline model that uses real-time data, thus indicating their usefulness for dynamic decision making scenarios in smart grid.

• We identify key characteristics and challenges of predicting reduced consumption during DR (Figure 1.1). In Chapter 6, we describe how this problem is different from the more widely studied problem of consumption prediction outside the DR event window and identify unique challenges associated with this problem. We leverage big data to learn a single ensemble model for diverse customers over different time intervals, thus achieving high efficiency in terms of the number of models trained. The reduction in the number of models is of the order of n × L, where n is the number of customers and L is the number of intervals in the DR period.
The low computational complexity of our model makes it ideal for the real-time dynamic decision making required for D²R.

Chapter 2

Background

In this chapter, we provide a background on smart grids and key applications in the domain. We also discuss big data sources and prediction models used in smart grid. The content presented in this chapter is aimed at providing relevant context to the discussion in the following chapters.

2.1 Smart Grid

According to the International Energy Agency, the global energy demand is set to grow by 37% by 2040 (compared to the 2014 levels), of which electricity is the fastest growing final form of energy [58]. Thus, meeting increased demand for electricity is considered one of the most critical challenges facing modern societies. Efforts initiated to meet this challenge include generating electricity from renewable energy sources, such as solar and wind; developing energy storage systems; modernizing and optimizing the electric grids; and increasing consumer awareness and participation in achieving energy sustainability.

The modern electric grid, also referred to as the “smart grid”, is equipped with advanced instrumentation for monitoring, control, and communication. Among the key characteristics of the smart grid are widespread deployment of smart meters to collect fine grained electricity consumption information, integration of renewable energy sources, rise in distributed energy generation, increased use of electric vehicles, customer engagement using demand response and time of use pricing, microgrids, electricity markets, etc. The smart grid is formally described by the U.S. Department of Energy as: A fully automated power delivery network that monitors and controls every customer and node, ensuring a two-way flow of electricity and information between the power plant and the appliance, and all points in between.
Its distributed intelligence, coupled with broadband communications and automated control systems, enables real-time market transactions and seamless interfaces among people, buildings, industrial plants, generation facilities, and the electric network [103].

2.1.1 USC Campus Microgrid

The University of Southern California (USC) campus microgrid provides a fully operational smart grid environment. The USC campus encompasses many of the features that make up a diverse city like Los Angeles, making it suitable as a microgrid. The USC campus is the largest private customer of the Los Angeles Department of Water and Power (LADWP), with an annual consumption of 155 GWh and an average load of 20 MW. The campus is diverse, both in terms of demographics and buildings. With 33,000 students and 13,000 faculty and staff spread over 300 acres containing classrooms, residence halls, offices, labs, hospitals, restaurants, public transit, electric vehicles, and even a gas station, it forms a “city within a city”. The 100+ major buildings are between 2 and 90 years old, with disparate electrical and heating/cooling facilities. Two power vaults route power from LADWP, and a co-generation chiller is available for energy storage [90].

The USC Facilities and Management Services (FMS) maintains a relatively “smart” electrical and equipment infrastructure. It has the ability to measure energy usage per building at 1 minute intervals, with the possibility of zone or room level measurement for a third of the buildings and indirect calculation of
The control center aggregates data across all buildings, and can centrally control or override HVAC (heating, ventilation and air conditioning) equipment that consumes up to 50% of the total campus power. However, many of these features are exercised only through manual intervention when demand optimization is required, and automated intelligence for decision making is lacking [90].

These features make the USC campus a ready, instrumented Smart Grid environment for conducting controlled and calibrated experiments. Besides the available data collection and control facilities, there is also the flexibility of trying emerging Smart Grid sensors and instruments from third-party vendors on the campus for fine-grained and richer sources of data, and points of control. The FMS has more than 7 years' worth of historical and real-time kWh consumption data aggregated at 15-minute intervals. Combined with detailed information on class schedules, building occupancy, and weather data, it offers a unique opportunity to investigate the main challenges and possible solutions to adopting an efficient and reliable controlled demand response (DR) program in complex dynamic environments. Currently, FMS' focus is on HVAC-based DR, but upgrades to extend it to other DR techniques, such as those based on the lighting system, have been planned. The microgrid serves as a test-bed for the Los Angeles Smart Grid demonstration project to experiment with and evaluate demand response and other smart grid technologies.

Figure 2.1: Big Data Sources in Smart Grid

2.2 Big Data Sources in Smart Grid

One of the key features of the smart grid is the use of a variety of sensing devices for tracking and collecting fine-grained data about electricity consumption, weather, indoor environment and occupancy, etc. While some of these sensors are deployed specifically under the smart grid paradigm, such as the smart meters for collecting electricity data, many others are being installed under other initiatives such as the Internet of Things (IoT).
Data is also being generated in high-rate streams by social media. Some of these data can be considered indirect indicators of electricity consumption [12]. For example, a prediction of high temperatures can be an indicator of increased electricity consumption. Similarly, occupancy in a building could be an indicator of its energy consumption. Figure 2.1 shows some of the sources of big data in the smart grid. The magnitude of data collected is shown in the context of the USC campus and the Los Angeles (LA) city.

The increase in the availability of real-time data from varied sensors allows researchers to develop and apply data mining techniques to predict peak demand periods for buildings. Electric utilities can use these insights for planning purposes, for educating customers about their consumption behavior, and to ask building occupants and facility managers to reduce consumption during anticipated peak demand periods, a practice popularly known as Demand Response (described in Section 2.3.3). We describe these application areas in detail in the next section.

2.3 Smart Grid Applications

We introduce three fundamental application areas in the smart grid domain that can benefit from prediction modeling. These are planning, customer education, and demand response.

2.3.1 Planning

Planning capital infrastructure, such as building remodeling and power system upgrades for energy efficiency, involves a trade-off between investment and electric power savings. Medium- to long-term electricity consumption predictions at coarse (24-hour) granularity for the campus and individual buildings can help in this decision making. This application involves long-term planning, and therefore such models are required to run infrequently, for example, every couple of months [14].

2.3.2 Customer Education

Educating energy customers on their energy usage can enhance their participation in energy sustainability by curtailing demand and meeting monthly budgets [85].
One form of education is giving consumption forecasts to customers in a building on web and mobile apps¹. For this application, building-level predictions at both 24-hour and 15-min granularities are useful. As customers are expected to check their energy consumption generally during the daytime, it is considered adequate to make predictions for customer education during the day, i.e., from 6 AM - 10 PM [14].

2.3.3 Demand Response

To maintain reliability in the grid and avoid blackouts, it has been proposed that the electricity demand should be made adaptive to supply conditions [88]. The standard approach used by electric utilities to achieve this is by means of "demand response", whereby utilities ask consumers to decrease their consumption for the durations when the utility anticipates peak demands [72]. Historically, these high peaks occur between 1-5 PM (Figure 2.2) on weekdays², and predictions during these periods over the short time horizon at 15-min granularity are vital for utilities to decide when to initiate curtailment requests from customers or change their pricing. Often, the predictions are performed before, at the beginning of, and during the high peak period [14].

Figure 2.2: Demand Response: duration and depth of reduction

Demand Response (DR) is formally defined by the Federal Energy Regulatory Commission as: Changes in electric usage by end-use customers from their normal consumption patterns in response to changes in the price of electricity over time, or to incentive payments designed to induce lower electricity use at times of high wholesale market prices or when system reliability is jeopardized [22]. Traditionally, the utilities do planning and notification for a DR event one day before the event is performed, based on predictions of electricity consumption, weather, and other relevant features.

¹ USC SmartGrid Portal. http://smartgrid.usc.edu
² DWP TOU Pricing. http://bp.ladwp.com/energycredit/energycredit.htm
Buildings account for about 40% of the energy consumption worldwide [102]; therefore, novel energy optimization measures adopted in buildings can significantly contribute to global energy sustainability efforts. While these measures include both engineering solutions as well as those involving human participation such as DR, the latter is more cost-effective compared to the more expensive engineering solutions.

2.4 Prediction Models used in Smart Grid

Many smart grid applications, such as those described in Section 2.3, require predictions of future energy consumption. For many decades, utilities have used averaging models and univariate time series models, such as Holt-Winters Exponential Smoothing and Box-Jenkins Auto-Regressive Integrated Moving Average models, for making predictions. In the last decade, Neural Network models have also been used in the prediction of long-term and short-term electricity demand. However, many of these approaches treat electric demand as simple time-series data, or as being consistent with the time of day and week, and ignore phenomena that particularly affect electricity demand. For example, the demand may change suddenly due to the use of an intermittent renewable energy source, an electric vehicle, a social event, or a consumer-initiated reduction in consumption. In the next chapter, we will describe the specific needs and challenges associated with modeling for dynamically changing conditions in the grid.

Here, we briefly describe the commonly used prediction modeling approaches in the smart grid. While the models described here are not exhaustive, they represent the most commonly used algorithms in smart grids. Electricity consumption prediction models can be broadly categorized into three groups [6]: 1) simple averaging models; 2) statistical models like regression and time series models; and 3) artificial intelligence and machine learning models (AI/ML) like neural networks and support vector machines [6], [71], [85], [66].
2.4.1 Averaging Models

Averaging models are popular among utilities and ISOs [2], [1], [3] due to their simplicity [36]. Averaging models make predictions based on linear combinations of consumption values from limited historical data. They have been shown to perform as well as advanced machine learning and time-series models [66] while considerably reducing the computational need for complex predictive modeling, thus offering high cost-efficiency. We consider three popular averaging models and a Time of Week (ToW) model, as described below:

• New York ISO Model (NYISO): It predicts for the next day by taking hourly averages of the five days with the highest average consumption value among a pool of ten previous days, starting from two days prior to prediction [2]. It excludes data from weekends, holidays, past DR event days, or days with a sharp drop in the energy consumption [11].

• California ISO Model (CAISO): It predicts for the next day by taking hourly averages of the three days with the highest average consumption value among a pool of ten previous days, excluding weekends, holidays, and past DR event days [1], [11].

• Southern California Edison Model (CASCE): It predicts for the next day by taking hourly averages across the past ten immediate or similar days, excluding weekends, holidays, and past DR event days [36], [3].

• Time of Week Average Model (ToW): It predicts for each 15-min interval in a week by taking the average over all weeks in the training dataset. It captures consumption variations over the duration of a day, i.e., from day to night, and across different days of the week [11]. Time-related features are important for electricity consumption [47] as it is closely tied to human schedules and activities.

2.4.2 Regression Models

Regression models combine several independent features to form a linear function.
Commonly used regression models for electricity consumption prediction are regression tree models [12], probabilistic linear regression, and Gaussian process regression models [61]. Hybrid methods that combine regression-based models with other models have also been used for short-term load prediction [76]. A multiple linear regression model for load prediction was presented in [54]. A non-linear and non-parametric regression model for next-day half-hourly load prediction was employed in [45] for stochastic planning and operations decision making. In other studies, Support Vector Machines have also been used for load forecasting [32], [18].

We use regression trees [27] in our study. A regression tree recursively partitions data into smaller regions until each region can be represented by a constant or a linear regression model. Its key advantage is its flowchart or tree representation that enables domain users to interpret the impact of different features on predicted values [12]. Also, once a regression tree is trained, predictions are fast to compute by a tree look-up [14].

2.4.3 Time Series Models

A Time Series model predicts future values based on recent observations. One of the early reviews of time series based models for load forecasting is given in [52]. A comparison of time series models for load forecasting with other models is presented in [99]. A time series model for short- to medium-term load forecasting (few hours to few weeks ahead) of hourly loads was proposed in [15].

In our study, we use the Auto-Regressive Integrated Moving Average (ARIMA) model [52]. ARIMA is defined in terms of three parameters: d, the number of times a time series needs to be differenced to make it stationary; p, the auto-regressive order, which denotes the number of past observations included in the model; and q, the moving average order, which denotes the number of past white noise error terms included in the model. These parameters are derived from the Box-Jenkins test [25].
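To make the (p, d, q) parameters concrete, the following is a minimal, self-contained ARIMA(1, 1, 0) sketch: difference once (d = 1), fit a single auto-regressive coefficient (p = 1) by least squares, and integrate the forecast back. This is illustrative only, not the Box-Jenkins fitting procedure; in practice a library such as statsmodels would be used.

```python
def difference(series, d=1):
    """Apply d-th order differencing to make a series stationary."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} (AR order p=1)."""
    x, y = series[:-1], series[1:]
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def arima_110_forecast(series):
    """One-step forecast with ARIMA(p=1, d=1, q=0):
    difference once, apply AR(1) to the differences, integrate back."""
    diff = difference(series, d=1)
    phi = fit_ar1(diff)
    return series[-1] + phi * diff[-1]

# A trending load series: first differences are constant (2.0), so phi = 1
load = [10.0, 12.0, 14.0, 16.0, 18.0]
assert abs(arima_110_forecast(load) - 20.0) < 1e-9
```

The moving-average term (q) is omitted here for brevity; it would add a weighted sum of past forecast errors to the AR term.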
2.4.4 AI and Machine Learning Models

This group of prediction models comprises varied approaches, such as artificial neural networks (ANNs), fuzzy systems, support vector machines (SVM), and expert systems [6], [71]. Among these, ANNs have been the most popular approach and have seen renewed interest from the research community due to the recent success and popularity of deep learning systems. Some researchers have also used a hybrid approach of combining models. For example, [46] have proposed a hybrid model based on Bayesian classifiers and SVM. In other studies, pattern matching approaches have been used for prediction. For example, in [71], a novel approach based on the similarity of pattern sequences prior to the day to be predicted is proposed. Thus, this group of models has scope for many novel ideas and is of increasing interest to researchers.

2.5 Real-world Datasets

We now describe the real-world datasets used in our research.

2.5.1 Electricity Consumption

We use building-level data from the USC campus microgrid [95, 91] provided by the USC Facilities Management Services (FMS). It comprises 15-min electricity consumption values (measured in kWh) from 170 USC campus buildings, collected between Jul 2007 and Dec 2014 [95, 91]. The buildings represent large customers of diverse types: teaching and office spaces, residential, and administrative buildings. We excluded buildings with major discontinuities in data, and used linear interpolation for minor gaps. Key properties of the dataset are summarized in Table 2.1, and its distribution is shown in Figure 2.3. More details on the dataset are available in [9].

Table 2.1: Description of campus microgrid dataset.
Number of participants: 170
Data collection period: 7.5 years
Data points: 7.5 years × 365 days × 96 intervals ≈ 26 × 10⁴ points per building; ≈ 44 × 10⁶ points total
Client type: academic, residential, and administrative buildings
Mean consumption (kWh): large (30.52 ± 7.65)
Average variance (kWh): 122.56
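The linear interpolation used for minor gaps can be sketched as follows. This is an illustrative helper, not the dissertation's preprocessing code; the `fill_gaps` name and the None-for-missing convention are our own assumptions, and gaps are assumed to be interior (bounded by valid readings on both sides), matching the "minor gaps" case described above.

```python
def fill_gaps(readings):
    """Linearly interpolate missing 15-min kWh readings (None values).

    Assumes every gap is interior, i.e., bounded by valid readings,
    as in the minor-gap case handled in our preprocessing.
    """
    filled = list(readings)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            # find the end of this run of missing values
            j = i
            while filled[j] is None:
                j += 1
            lo, hi = filled[i - 1], filled[j]
            step = (hi - lo) / (j - i + 1)
            for k in range(i, j):
                filled[k] = lo + step * (k - i + 1)
            i = j
        i += 1
    return filled

# a two-interval gap is filled on the straight line between its neighbors
assert fill_gaps([10.0, None, None, 16.0]) == [10.0, 12.0, 14.0, 16.0]
```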
2.5.2 Weather

Two commonly used sources for weather data are NOAA [77] and Weather Underground [4]. We obtained curated weather data from NOAA [77], [5], processed under a quality control system [77]. We used hourly temperature observations, which were interpolated to 15-min values, in alignment with the granularity of our electricity consumption datasets.

Figure 2.3: Probability density function (PDF) of average electricity consumption per 15-min interval in the USC campus buildings.

2.5.3 Schedule

Schedule data for the campus dataset comprised information on working days, holidays, and semester durations. We used this information to compare the workday versus non-workday performance of the models.

Chapter 3

Dynamic Demand Response

In this chapter, we propose "dynamic demand response" as an advancement over the state-of-the-art practice of "demand response" by the utilities, and as a prime example of dynamic decision making in smart grids.

3.1 Introduction

Electricity consumption optimization is critical to enhance electric grid reliability and to avoid supply-demand mismatches. Utilities have long used demand response (DR) for achieving customer-driven reduction in consumption during peak demand periods to maintain reliability [94]. Traditionally, planning and notification for DR is done a day ahead, that is, one day prior to the day when reduction using DR is to be performed [108]. However, with the Smart Grid transitioning towards more and more dynamic operations, it is no longer possible for electric utilities and system operators to perform planning and decision making about grid operations one or more days ahead. They are required to perform DR at a few hours' advance notice whenever necessitated by dynamically changing conditions of the grid, such as intermittent generation from renewable energy sources, or the occurrence of special events.
Recognizing this need, we introduce the novel concept of "Dynamic Demand Response" (D²R). Figure 3.1, reproduced from the Lawrence Berkeley National Lab, shows the transition towards increasing levels of granularity from day-ahead to fast, real-time DR. The state-of-the-art in the utilities is currently day-ahead (slow) DR. Eventually, with adequate infrastructure and processing power, the goal is to move towards fast DR. In between, we identified a scope for dynamic demand response, whereby the planning for DR is done a few minutes to a few hours before the beginning of a DR event. Here, the planning for DR involves predictions, decision making, and notification for DR. We formally define D²R as follows:

Figure 3.1: The transition to Dynamic Demand Response (D²R). (Original figure from the Lawrence Berkeley National Lab)

Definition 3.1. Dynamic Demand Response (D²R) is the process of balancing supply and demand in real-time and adapting to dynamically changing conditions by automating and transforming the demand response planning process.

Several factors drive the transition towards D²R; most notably the integration of renewable energy sources, which, due to their intermittent, non-dispatchable, and uncertain nature, result in supply instability ([108]).
The need to curtail at any time as a result of such instabilities, including time periods traditionally considered non-peak, such as weekends, is beyond existing DR policies, which are traditionally defined for workdays, and usually for hot summer afternoons ([80], [14]). Besides, plug-in electric vehicles (PEVs) can introduce spikes in consumption at arbitrary times during the course of a day ([35]), whereas special events can result in increased load on weekends. Thus, in the case of D²R, we consider the possibility of anytime DR (even on weekends and holidays, and in traditionally non-peak periods).

Table 3.1: Key characteristics of DR and D²R
Goal: DR - advance planning; D²R - dynamic adaptation
Horizon: DR - day ahead; D²R - hours ahead
Data/control granularity: DR - coarse; D²R - fine
Data rate: DR - monthly billing; D²R - real-time data from smart meters
Timing and duration: DR - fixed and pre-defined; D²R - flexible, dynamically determined
Extent of curtailment: DR - fixed; D²R - dynamically determined/adjustable
Customer selection: DR - selected a-priori; D²R - dynamically selected
Challenges: DR - labor intensive, data unavailability, inability to adapt; D²R - small latency requirements, computational complexity, data deluge

In DR, the focus has been on large industrial and commercial customers ([108]), selected a-priori, who are expected to contribute large-sized curtailment. With the increasing adoption of smart meters [93], [91], and home energy management and automation systems ([13], [108]), however, the participation of small customers in demand-side management is increasing. The electricity demand of such small customers might be easier to regulate (i.e., shift or shave) compared to the load of commercial entities; however, consumption prediction for small customers and at high temporal granularity is challenging ([66, 78]).
One of the key implications of involving small customers would be to dynamically and optimally select customers for participation in curtailment ([108]) and to request only the minimum curtailment required, to avoid fatigue and loss of interest among the customers ([14]). The key differences between DR and D²R are summarized in Table 3.1.

Given this background on the conditions necessitating "dynamic" DR, or D²R, we identify the following key challenges in performing D²R:

• Address both small, highly variable, individual consumers as well as relatively larger and more stable consumers, and intelligently target them for participation in D²R;

• Handle data at very small consumption granularity (i.e., at 15-min intervals) for appropriately timing the requests for dynamic demand response (D²R) [14];

• Focus on short-term (few minutes to few hours ahead) planning, including predictions and corresponding decision making required for D²R [14], as opposed to day-ahead planning for DR;

• Also consider planning for weekends and holidays, when the consumption patterns are different from normal days, and which were traditionally considered non-peak and thus not addressed in DR planning.

3.2 Dynamic Decision Making

We described above how DR events will become necessary not only at fixed time intervals and on weekdays predetermined by static policies, but also at any time and on weekends to react to fluctuating demand. Unique challenges arise in this context vis-à-vis automated and efficient dynamic decision making. Consider the D²R schematic shown in Figure 3.2. It illustrates the process of dynamic decision making for D²R that we address in this dissertation. Individual consumers and buildings are sources of a variety of data being generated in high volumes and at high granularity. This data can be mined to get insights into the electricity consumption of individual consumers and buildings.
We consider two kinds of prediction models: 1) the reduced consumption prediction model, for predicting consumption during a demand response event; and 2) the (normal) consumption prediction model, for predicting consumption outside a demand response event. Based on the outputs from these prediction models, the D²R policy engine makes the following decisions:

• When to reduce: dynamic predictions for short horizons, from a few minutes to a few hours ahead, enable decision making about when to start a D²R event and for how long to keep it active.

• How much to reduce: dynamic predictions also help the utilities in estimating how much reduction is required during a D²R event.

• How to reduce: dynamic prediction results help in making decisions about which consumers to target for participation in a D²R event and which reduction strategy to use. Only a subset of consumers are selected, to avoid exhausting the incentives that are given out by the utility to the consumers in return for reduction. It also helps in maintaining consumer engagement, which would otherwise wane when consumers are frequently asked to reduce consumption.

Figure 3.2: Dynamic Decision Making for D²R

Thus, it is evident how predictions are used for dynamic decision making for D²R.

3.3 Prediction Models for Dynamic Demand Response

As shown in Figure 3.2, consumption prediction models are used to support automated dynamic decision making for D²R. While there is existing research work on prediction models for DR, D²R is a newly introduced concept and needs more research given its unique challenges and requirements. Specifically, we focus on prediction models that can operate at a very small data granularity (here, 15-min intervals), and for both weekdays and weekends, all conditions that characterize scenarios for D²R. Prediction models used for D²R should balance the conflicting requirements of high prediction accuracy and low costs in building models and making predictions using them in real-time for diverse customers.
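The "when" and "how much" decisions driven by the prediction output can be sketched as a toy threshold check. This is entirely illustrative and not the dissertation's actual policy engine; the `d2r_decision` helper, its threshold logic, and the numbers are our own assumptions.

```python
def d2r_decision(predicted_kw, threshold_kw):
    """Toy D2R policy check: trigger an event when the predicted demand
    exceeds a capacity threshold ("when to reduce"), and request the
    excess as the curtailment target ("how much to reduce")."""
    if predicted_kw <= threshold_kw:
        return {"trigger": False, "target_kw": 0.0}
    return {"trigger": True, "target_kw": predicted_kw - threshold_kw}

# a 19.5 MW forecast against a 19 MW threshold triggers a 500 kW curtailment
decision = d2r_decision(19500.0, 19000.0)
assert decision["trigger"] and decision["target_kw"] == 500.0
assert not d2r_decision(18000.0, 19000.0)["trigger"]
```

The "how to reduce" step (selecting a subset of consumers and a strategy) would sit on top of such a check, allocating the `target_kw` across dynamically chosen customers.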
Our proposed models and results in this regard are useful for both researchers and practitioners in the smart grid domain.

3.4 Requirements for Prediction Models for D²R

In this section, we address the requirements for prediction models for D²R. There are several considerations in selecting a prediction model, especially in terms of the number of models to be built (for individual consumers or consumer groups), as well as the effort and computation time required for data collection, training, and making predictions. The number of times a trained model can be reused before requiring retraining is also a factor to consider. Data cost is a particularly important consideration in this age of "big data", since quality checking, maintaining, and storing large feature-sets can be untenable. Compute costs can even be intractable when prediction models are used millions of times within short periods. A model built or used at high cost would be impractical even if designed to give high accuracy. Thus, it is critical that the models selected for a real-time¹ application such as D²R satisfy the often conflicting requirements of accuracy, efficiency, and cost.

We discuss below the main considerations and requirements in selecting prediction models for D²R:

¹ In this dissertation's context, real-time implies at 15-minute intervals.

3.4.1 Feature Selection

Feature selection is a major challenge in building prediction models for D²R, which is further compounded by the availability of a large number of data sources ("big data") in the smart grid (as described in Section 2.2). Not all features are relevant to a given prediction problem, and the selection of insignificant features could affect the predictive power of the prediction model. It is desirable to have a parsimonious model with intelligent selection of relevant features. Many approaches, such as the lasso [100] and LARS [44], have been proposed for efficient feature selection.
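To illustrate how the lasso performs feature selection, the following is a minimal coordinate-descent sketch on standardized features: the L1 penalty's soft-thresholding step drives the coefficients of irrelevant features exactly to zero. This is a didactic sketch under the stated assumptions, not production code; scikit-learn's Lasso or a LARS implementation would be used in practice.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: the L1-penalty update at the heart of lasso."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso(X, y, lam, n_iter=200):
    """Coordinate-descent lasso minimizing (1/2n)||y - Xw||^2 + lam*||w||_1.

    X: list of rows; assumes each column already has unit mean square
    (standardized features), so no per-feature rescaling is needed.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # partial residual that excludes feature j's contribution
            r = [y[i] - sum(w[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            w[j] = soft_threshold(rho, lam)
    return w

# feature 0 drives y; feature 1 is pure noise and is zeroed out by the penalty
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
w = lasso(X, y, lam=0.1)
assert abs(w[0] - 1.9) < 1e-6 and w[1] == 0.0
```

Note how the relevant coefficient is shrunk slightly (1.9 rather than 2.0) while the irrelevant one is eliminated; this sparsity is what makes the lasso useful for picking "influential" inputs.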
In Chapter 5, we address the situation that arises when only some smart meters can transmit their data in real-time, leading to a partial data problem. There, we propose a lasso-based method for intelligent selection of "influential" smart meters that transmit data in real-time.

3.4.2 Data Collection

While the availability of "big data" in the smart grid can be gainfully leveraged in prediction models, it is critical to consider the associated data collection cost for practical applications such as D²R. Rather than examine the raw size of data used for training or predicting using a model, a more useful measure is the effort required to acquire and assemble the data. Size can be managed through cheap storage, but collecting the necessary data often requires human and organizational effort and time. It may also require specialized infrastructure and equipment, such as sensors, as well as support for transmission of data over networks or the Internet. It may not always be possible to collect and transmit data in real-time, as required for dynamic decision making for D²R, due to network limitations or due to consumers refusing to allow real-time transmission of their data over security and privacy concerns.

Data collection cost is defined in terms of the number of unique values of features involved in a prediction model. These features could be static (time-invariant), requiring a one-time collection effort, or dynamic, in the form of time series, requiring continuous or periodic data acquisition. For example, a univariate model such as the ARIMA time series model requires data from just one source and may be more cost-efficient from a data collection perspective compared to multivariate models. In Chapter 4, we propose cost measures that account for data collection cost.
3.4.3 Computational Complexity

The time required for training and prediction using a model can prove important when it is used at large scale or in real-time applications, such as dynamic demand response, that are sensitive to prediction latency. For D²R, decisions about when and how much to reduce electricity consumption have to be made in real-time, a few hours or a few minutes before the DR event begins. This puts a serious limitation on the use of prediction models with high computational complexity for D²R.

3.4.4 Cost vs. Benefits Trade-offs

It is of practical importance for dynamic decision making applications such as D²R to select a model which not only provides good prediction accuracy but is also cost-efficient. In Chapter 4, we propose a cost-benefit measure that considers how good the accuracy is per unit of compute cost. In Chapter 6, we leverage big data on reduced consumption in a single ensemble model to predict for diverse customers over different time intervals. This way, we achieve a huge cost reduction compared to the case of individual ensembles learned for each customer, without sacrificing much prediction accuracy.

3.5 Dynamic Demand Response in USC Microgrid

Dynamic demand response is a novel problem that we introduce. As described previously in Section 2.1.1, the USC campus microgrid provides a conducive environment to test our proposed prediction models for dynamic decision making. Not many integrated DR systems for complex environments such as a campus microgrid exist. A recent paper discussed a research platform deployed on the University of California San Diego (UCSD) microgrid for developing large-scale, predictive analytics for real-time energy management [20]. Contrary to our work, which deals with DR by focusing on both (normal) consumption and reduced consumption prediction, the UCSD smart grid is focused on consumption prediction only, to improve operational efficiency. As such, they do not consider the complexity of D²R and the challenges it raises.
A survey of previous works reveals that there have been numerous attempts to deal with consumption demand. Utility providers can either compensate by buying extra power at high prices [23] or employ DR strategies. The latter is a well-known concept divided into two categories, direct control and voluntary participation, addressing residential and commercial buildings as well as large industrial facilities and data centers [48]. In this dissertation, we focus on a heterogeneous campus microgrid which includes a mixture of various building types, including residential, offices, libraries, and mixed spaces. This environment offers a more realistic scenario for concepts such as smart cities. Our work is based on directly controlling the building equipment to achieve and sustain a specified reduction [48].

Chapter 4

Prediction Evaluation Measures

In this chapter, we focus on meaningful evaluation of prediction models that goes beyond accuracy and considers cost-effectiveness as well as application-relevant evaluation.

4.1 Introduction

Recently, in the machine learning domain, it has been acknowledged that the performance of prediction models is often evaluated based on abstract metrics, detached from their meaningful evaluation for the end-use domain [105]. Generally, popular performance measures like Mean Absolute Error (MAE) or Root Mean Square Error (RMSE) are used for evaluating and selecting a prediction model for an application, without consideration of their relevance to the application. For many smart grid domain applications, we postulate that these measures alone are inadequate to evaluate prediction models. The following discussion is relevant to many applied domains beyond just smart grids. We list below the motivation for the selection of additional performance measures:

• The impact of under- and over-predictions, or prediction bias, can be asymmetric for some applications, and measures like RMSE are insensitive to prediction bias.
For example, under-prediction is more deleterious to Smart Grid applications that respond to peak demand.

• Scale-dependent metrics are unsuitable for comparing prediction models applied to different customer sizes.

• The focus on the magnitude of errors overlooks the frequency with which a model outperforms a baseline model or predicts within an error tolerance. Reliable prediction is key for certain domain applications.

• Volatility is a related factor that is ignored in common measures, wherein a less volatile prediction model performs consistently better than a baseline model.

• Lastly, given the "Big Data" consequences of smart grid applications, the cost of collecting data, building models, and running them cannot be disregarded. The extra cost for improved accuracy from a model may be impractical at large scales in a Smart Grid with millions of customers [57], or the latency of a prediction can make it unusable for operational decisions. Thus, it is critical that the cost-efficiency of the models is taken into account before deployment for real-world applications.

These gaps highlight the need for holistic performance measures to meaningfully evaluate and compare prediction models by domain practitioners. We make the following novel contributions in this chapter:

• We propose a suite of performance measures for evaluating prediction models in Smart Grids, defined along three dimensions: scale independence, reliability, and cost (Section 4.3). These include two existing measures and eight innovative ones (Section 4.4, Section 4.5), and also encompass parameterized measures that can be customized for the domain.
Not all our measures are novel; some extend from other disciplines, which offers completeness and a firm statistical grounding. Our novel application-dependent measures, with parameterized coefficients set by domain experts, allow apples-to-apples comparisons that are meaningful for a given scenario [39]. A model that seems good using common error metrics may behave poorly or prove inadequate for a given application; this intuition is validated by our analysis. All our measures are reusable by other domains, though they are inspired by and evaluated for the Smart Grid domain.

As Smart Grid data becomes widely available, data mining and machine learning research can provide immense societal benefits to this under-served domain [85]. Our study, based on real Smart Grid data collected over 3 years, is among the first of its kind in defining holistic measures and evaluating candidate consumption models for emerging microgrid and utility applications. Our analysis of the measures underscores their key ability to offer: deeper insight into models' behavior that can help improve their performance, better understanding of prediction impact on real applications, intelligent cost-benefit trade-offs between models, and a comprehensive, yet accessible, goodness of fit for picking the right model. Our work offers a common frame of reference for future researchers and practitioners, while also exposing gaps in existing predictive research for this new domain.

1 Los Angeles Department of Water and Power: Smart Grid Regional Demonstration, US Department of Energy, 2010.

The remainder of this chapter is organized as follows: Section 4.2 presents related work and Section 4.3 introduces our proposed dimensions for model evaluation. In Section 4.4, we propose application-independent measures, while in Section 4.5, we propose application-specific measures. Section 4.6 describes the experiments, and the analysis of the results is presented in Sections 4.7 and 4.8. Finally, the conclusion is given in Section 4.9.
4.2 Related Work

The performance evaluation of predictive models often involves a single dimension, such as an error measure, which is simple to interpret and compare, but does not necessarily probe all aspects of a model's performance or its goodness for a given application [105]. A new metric is proposed in [39] based on aggregating performance ratios across time series for fair treatment of over- and under-forecasting. [83] emphasizes the importance of treating predictive performance as a multi-dimensional problem for a more reliable evaluation of the trade-offs between various aspects. [51] introduces a measure to reduce the double penalty effect in forecasts whose features are displaced in space or time, compared to point-wise metrics. Further, [104] identifies the need for cost-benefit measures to help capture the performance of a prediction model by a single profit-maximization metric which can be easily interpreted by practitioners. [16] highlights the importance of scale, along with measures like cost, sensitivity, reliability, understandability, and relationship for decision making using forecasts. Other studies [96] also go beyond standard error measures to include dimensions of sensitivity and specificity. Our effort is in a similar vein. We propose a holistic set of measures along multiple dimensions to assist domain users in intelligent model selection for their application, with empirical validation for the emerging Smart Grid domain.

Existing approaches for consumption prediction include our and other prior work on regression trees [12, 76], time series models [15], and artificial neural networks and expert systems [71, 62]. In practice, utilities use simpler averaging models based on recent consumption [2, 36]. In this spirit, our baseline models for comparative evaluation consider Time of the Week (ToW) and Day of the Week (DoW) averages.
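As a minimal illustration (not code from the dissertation), a DoW-style averaging baseline predicts, for each day of the week, the mean daily consumption observed for that weekday over the training period; the list-of-tuples data layout here is a hypothetical simplification:

```python
from collections import defaultdict

def dow_baseline(history):
    """Day-of-Week baseline: for each weekday (0-6), predict the mean
    daily kWh observed for that weekday over the training period."""
    totals, counts = defaultdict(float), defaultdict(int)
    for weekday, kwh in history:  # history: [(weekday, daily_kwh), ...]
        totals[weekday] += kwh
        counts[weekday] += 1
    return {d: totals[d] / counts[d] for d in totals}

# Toy example: two training weeks of (weekday, kWh) observations.
baseline = dow_baseline([(0, 400.0), (0, 420.0), (1, 380.0), (1, 400.0)])
print(baseline[0])  # → 410.0
```

A ToW baseline is analogous, keyed on the 7 × 96 fifteen-minute slots of the week instead of the 7 weekdays.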
As elsewhere, Smart Grid literature often evaluates predictive model performance in terms of the magnitude of errors between the observed and predicted values [59, 29]. Common statistical measures for time series forecasting [16] are the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE), and the Mean Absolute Percent Error (MAPE), the latter given by:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{|p_i - o_i|}{o_i} \quad (4.1)$$

where o_i is the observed value at interval i, p_i is the model predicted value, and n is the number of intervals for which the predictions are made. The Mean Error Relative (MER) to the mean of the observed values has also been used to avoid the effects of observed values close to zero [71]. The RMSE normalized by the mean of the observed values, called the coefficient of variation of the RMSE (CVRMSE), is also used [42, 12]:

$$\mathrm{CVRMSE} = \frac{1}{\bar{o}}\sqrt{\frac{1}{n}\sum_{i=1}^{n}(p_i - o_i)^2} \quad (4.2)$$

where \bar{o} is the mean of the n observed values, and p_i, o_i and n are as before. While these measures offer a necessary statistical standard of model performance, they are by themselves inadequate due to the reasons listed before, viz., their inability to address prediction bias, reliability, scale independence, and the cost of building models and making predictions.

Some researchers have proposed application-specific metrics. [72] defines metrics related to Demand-Response. The Demand Shed Variability Metric (SVM) and Peak Demand Variability Metric (PVM) help reduce over-fitting and extrapolation errors that increase error variance or introduce prediction bias. Our application-dependent (rather than -specific) measures are defined more broadly, with measure parameters that can be tuned for diverse applications that span even beyond Smart Grids.

Relative measures help compare a prediction model with a baseline model [16]. Percent Better gives the fraction of forecasts by a model that are more accurate than a random walk model.
This is a unit-free measure that is also immune to outliers present in the series, since it discards information about the amount of change. The Relative Absolute Error (RAE), calculated as the ratio of the forecast error for a model to the corresponding error for the random walk, is simple to interpret and communicate to domain users. The prediction horizon also has an impact on model performance. Cumulative RAE [16] is defined as the ratio of the arithmetic sum of the absolute error for the proposed model over the forecast horizon to the corresponding error for the random walk model. Relative and horizon metrics have been used less often for smart grid prediction models.

Prediction error metrics have been categorized into scale-dependent measures, percentage errors, relative errors, and scaled errors [55]. RMSE and MAE are scale-dependent and applicable to datasets with similar values. Scale-independent percentage errors like MAPE and CVRMSE can be used to compare performance on datasets with different magnitudes. Relative measures are determined by dividing model errors by the errors of a baseline model, while scaled errors remove the scale of the data by normalizing the errors with the errors obtained from a baseline prediction method.

In summary, standard statistical measures of performance for predictive models may not be adequate or meaningful for domain-specific applications, while narrowly defined measures for a single application are not reusable or comparable across applications. This gap is particularly felt in the novel domain of Smart Grids. Our work is an attempt to address this deficiency by introducing a suite of performance measures along several dimensions, while also leveraging existing measures where appropriate.

4.3 Performance Measure Dimensions

Performance measures that complement standard statistical error measures for evaluating prediction models fall along several dimensions that we discuss here.
Application In/dependence: Application-independent measures are specified without knowledge of how predictions from the model are being used. These do not have any specific dependencies on the usage scenario and can be used as a uniform measure of comparison across different candidate models. On the other hand, application-dependent measures incorporate parameters that are determined by specific usage scenarios of the prediction model. The measure formulation itself is generic but requires users to set values of parameters (e.g., acceptable error thresholds) for the application. These allow a nuanced evaluation of prediction models that is customized for the application in question.

Scale-Independence: In defining error measures, the residual errors are measured as the difference between the observed and predicted values. So, if o_i is the i-th observed value and p_i is the i-th predicted value, then the scale-dependent residual, or prediction error, is e_i = p_i − o_i. MAE and RMSE are based on residual errors and suffer from being highly dependent on the range of observed values. Scale-independent errors, on the other hand, are usually normalized against the observed value and hence better suited for comparing model performance across different magnitudes of observations.

Reliability: Reliability offers an estimate of how consistently a model produces similar results. This dimension is important for understanding how well a model will perform on yet unseen data that the system will encounter in the future, relative to the data used while testing. A more reliable model provides its users with more confidence in its use. Most commonly used measures fail to consider the frequency of acceptable model performance over a period of time, which we address through the measures we introduce.

Cost-Efficiency: Developing a prediction model has a cost associated with it in terms of effort and time for data collection, training models, and using them to make predictions.
The number of times a trained model can be reused is also a factor. Data cost is a particularly important consideration in this age of "Big Data" since quality checking, maintaining, and storing large feature-sets can be untenable. Compute costs can even be intractable when prediction models are used millions of times within short periods. A model built or used at high cost would be impractical even if designed to give high accuracy. Thus, it is critical that models are cost-efficient when they are to be used in large-scale and real-time applications.

4.4 Application Independent Measures

Many standard measures with well understood theoretical properties fall in the category of application-independent measures. For completeness, we recognize two relevant, existing scale-independent measures, MAPE [59, 71] and CVRMSE [42, 12]. More importantly, we introduce novel application-independent measures along the reliability and cost dimensions, and discuss their properties.

4.4.1 Mean Absolute Percentage Error (MAPE)

This is a variant of MAE, normalized by the observed value 2 at each interval (4.1), thus providing scale independence. It is simple to interpret and commonly used for evaluating predictions in energy and related domains [71, 59, 56, 29].

4.4.2 Coefficient of Variation of Root Mean Square Error (CVRMSE)

It is the normalized version of the common RMSE measure, dividing it by the average 3 of the observed values (4.2) to offer scale independence. This is an unbiased estimator that incorporates both the prediction model bias and its variance, and gives a unit-less percentage error measure. CVRMSE is sensitive to infrequent large errors due to the squared term.

2 MAPE is not defined if there are zero values in the input, which is rare as energy consumption (kWh) values are generally non-zero due to the always-present base consumption (unless there is a black-out), and can be ensured by data pre-processing.
3 CVRMSE is not defined if this average is zero, which is rare as energy consumption (kWh) values are generally positive (unless there is net-metering), and can be ensured by data pre-processing.

4.4.3 Relative Improvement (RIM)

We propose RIM as a relative measure for reliability that is estimated as the frequency of predictions by a candidate model that are better than a baseline model. RIM is a simple, unit-less measure that complements error measures in cases where being accurate more often than a baseline is useful, and occasional large errors relative to the baseline are acceptable.

$$\mathrm{RIM} = \frac{1}{n}\sum_{i=1}^{n} C(p_i, o_i, b_i) \quad (4.3)$$

where o_i, p_i and b_i are the observed, model predicted, and baseline predicted values for interval i, and C(p_i, o_i, b_i) is a count function defined as:

$$C(p_i, o_i, b_i) = \begin{cases} 1, & \text{if } |p_i - o_i| < |b_i - o_i| \\ 0, & \text{if } |p_i - o_i| = |b_i - o_i| \\ -1, & \text{if } |p_i - o_i| > |b_i - o_i| \end{cases} \quad (4.4)$$

4.4.4 Volatility Adjusted Benefit (VAB)

VAB is another measure for reliability that captures how consistently a candidate model outperforms a baseline model, by normalizing the model's error improvements over the baseline by the standard deviation of these improvements. Inspired by the Sharpe Ratio, this relative measure offers a "risk adjusted", scale-independent error value. The numerator captures the relative improvement of the candidate model's MAPE over the baseline's (the benefit). If these error improvements 4 are consistent across i, then their standard deviation would be low (the volatility) and the VAB high. But with high volatility, the benefit shrinks, reflecting a lack of consistent improvements.

$$\mathrm{VAB} = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{|b_i - o_i|}{o_i} - \frac{|p_i - o_i|}{o_i}\right)}{\sigma\left(\frac{|b_i - o_i|}{o_i} - \frac{|p_i - o_i|}{o_i}\right)} \quad (4.5)$$

where o_i, p_i and b_i are the observed, model predicted, and baseline predicted values for interval i.

4 The error improvements offered by a given model over the baseline model are expected to have a normal distribution for VAB to be meaningful.
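As an informal sketch (not code from the dissertation), the application-independent error and reliability measures defined so far follow directly from Eqs. (4.1)–(4.5):

```python
import math

def mape(obs, pred):
    """Mean Absolute Percentage Error, Eq. (4.1)."""
    return sum(abs(p - o) / o for p, o in zip(pred, obs)) / len(obs)

def cvrmse(obs, pred):
    """Coefficient of Variation of RMSE, Eq. (4.2)."""
    mean_obs = sum(obs) / len(obs)
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))
    return rmse / mean_obs

def rim(obs, pred, base):
    """Relative Improvement over a baseline, Eqs. (4.3)-(4.4)."""
    def count(p, o, b):
        if abs(p - o) < abs(b - o): return 1
        if abs(p - o) > abs(b - o): return -1
        return 0
    return sum(count(p, o, b) for p, o, b in zip(pred, obs, base)) / len(obs)

def vab(obs, pred, base):
    """Volatility Adjusted Benefit, Eq. (4.5): mean per-interval
    improvement over the baseline, divided by its std. deviation."""
    imp = [abs(b - o) / o - abs(p - o) / o
           for p, o, b in zip(pred, obs, base)]
    mean_imp = sum(imp) / len(imp)
    sd = math.sqrt(sum((x - mean_imp) ** 2 for x in imp) / len(imp))
    return mean_imp / sd

# Toy series: the model halves the baseline's error at every interval.
obs  = [100.0, 100.0, 100.0, 100.0]
pred = [110.0, 95.0, 105.0, 90.0]
base = [120.0, 80.0, 115.0, 70.0]
print(round(mape(obs, pred), 4))   # → 0.075
print(rim(obs, pred, base))        # → 1.0 (better than baseline every time)
```

Note that RIM and VAB need the baseline's predictions alongside the model's, whereas MAPE and CVRMSE only need observed and predicted values.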
4.4.5 Computation Cost (CC)

The cost of training and predicting with a model can prove important when it is used at large scales and/or in real-time applications, such as dynamic demand response, that are sensitive to prediction latency. CC is defined in seconds as the sum of the wallclock time required to train a model, CC_t, and the wallclock time required to make predictions using the model, CC_p, for a given prediction duration with a certain horizon. Thus, CC = CC_t + CC_p.

4.4.6 Data Collection Cost (CD)

Rather than examine the raw size of data used for training or predicting with a model, a more useful measure is the effort required to acquire and assemble the data. Size can be managed through cheap storage, but collecting the necessary data often requires human and organizational effort. We propose a scale-dependent measure of data cost defined in terms of the number of unique values of features involved in a prediction model. CD is defined, for a particular training and prediction duration, over the n_s static (time-invariant) features that require a one-time collection effort and the n_d dynamic features that need periodic acquisition:

$$\mathrm{CD} = \sum_{i=1}^{n_s} [s_i] + \sum_{i=1}^{n_d} [d_i] \quad (4.6)$$

where [s_i] and [d_i] are the counts of the unique values for the features s_i and d_i, respectively.

4.5 Application Dependent Measures

Unlike the previous measures, application-dependent performance measures are parameterized to suit specific usage scenarios and can be customized by domain experts to fit their needs. The novel measures we propose here are themselves not narrowly defined for a single application (though they are motivated by needs observed in the smart grid domain). Rather, they are generalized through the use of coefficients that are themselves application specific.

4.5.1 Domain Bias Percentage Error (DBPE)

We propose DBPE as a signed percentage error measure that offers scale independence.
It indicates whether the predictions are positively or negatively biased compared to the observed values, which is important when over- or under-prediction errors, relative to the observed values, have a non-uniform impact on the application. We define DBPE as an asymmetric loss function based on the sign of the bias. Granger's linlin function [50] is suitable for this, as it is linear on both sides of the origin but with different slopes on each side. The asymmetric slopes allow different penalties for positive and negative errors.

$$\mathrm{DBPE} = \frac{1}{n}\sum_{i=1}^{n} \frac{L(p_i, o_i)}{o_i} \quad (4.7)$$

where L(p_i, o_i) is the linlin loss function defined as:

$$L(p_i, o_i) = \begin{cases} \alpha \cdot |p_i - o_i|, & \text{if } p_i > o_i \\ 0, & \text{if } p_i = o_i \\ \beta \cdot |p_i - o_i|, & \text{if } p_i < o_i \end{cases} \quad (4.8)$$

where o_i and p_i are the observed and model predicted values for the interval i, and α and β are penalty parameters associated with over- and under-prediction, respectively. α and β are configured for the specific application, and the ratio α/β measures the relative cost of over-prediction to under-prediction for that application [101]. Further, we introduce the constraint α + β = 2, which gives DBPE the interesting property of reducing to MAPE when α = β = 1.

4.5.2 Reliability Threshold Estimate (REL)

Often, applications may care less about the absolute errors of a model's predictions and prefer an estimate of how frequently the errors fall within a set threshold that the application can withstand. We define REL as the frequency of prediction errors that are less than an application-determined error threshold, e_t.
$$\mathrm{REL} = \frac{1}{n}\sum_{i=1}^{n} C(p_i, o_i) \quad (4.9)$$

where o_i and p_i are the observed and the model predicted values for the interval i, and C(p_i, o_i) is a count function defined as:

$$C(p_i, o_i) = \begin{cases} 1, & \text{if } \frac{|p_i - o_i|}{o_i} < e_t \\ 0, & \text{if } \frac{|p_i - o_i|}{o_i} = e_t \\ -1, & \text{if } \frac{|p_i - o_i|}{o_i} > e_t \end{cases} \quad (4.10)$$

4.5.3 Total Compute Cost (TCC)

In the context of an application, it is meaningful to supplement the data and compute costs (CD and CC) with an estimate of the total running cost of using a model for a duration of interest specific to that application. We define the parameters:

• τ, the number of times a model is trained within the duration, and
• π, the number of times a model makes predictions with a given horizon, in that duration.

These parameters are not just application specific but also vary by the candidate model, based on how frequently it needs to be trained and its effective prediction horizon. We define the total compute cost in seconds for a prediction duration based on τ and π, and the unit costs for training and prediction using the model, CC_t and CC_p, introduced in Section 4.4.5:

$$\mathrm{TCC} = CC_t \cdot \tau + CC_p \cdot \pi \quad (4.11)$$

4.5.4 Cost-Benefit Measure (CBM)

Rather than treat cost in a vacuum, it is worthwhile to consider the cost of a model relative to the gains it provides. CBM compares candidate models having different error measures and costs, to evaluate which provides a high reward for a unit compute cost spent.

$$\mathrm{CBM} = \frac{1 - \mathrm{DBPE}}{\mathrm{TCC}} \quad (4.12)$$

The numerator is an estimate of the accuracy (one minus the error measure), while the denominator is the compute cost. We use DBPE as the error measure and TCC as the cost, but these can be replaced by other application-dependent error measures (e.g., CVRMSE, MAPE) and costs (e.g., CD, CC_p). A model with high accuracy but prohibitive cost may be unsuitable.

4.6 Experiments

We validate the efficacy of our proposed performance measures for real-world applications.
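Before turning to the experiments, a brief illustrative sketch (not code from the dissertation) shows how the application-dependent measures follow from Eqs. (4.7)–(4.12):

```python
def dbpe(obs, pred, alpha, beta):
    """Domain Bias Percentage Error, Eqs. (4.7)-(4.8).
    alpha and beta penalize over- and under-prediction; alpha + beta == 2."""
    def linlin(p, o):
        if p > o: return alpha * abs(p - o)
        if p < o: return beta * abs(p - o)
        return 0.0
    return sum(linlin(p, o) / o for p, o in zip(pred, obs)) / len(obs)

def rel(obs, pred, e_t):
    """Reliability Threshold Estimate, Eqs. (4.9)-(4.10)."""
    def count(p, o):
        err = abs(p - o) / o
        return 1 if err < e_t else (-1 if err > e_t else 0)
    return sum(count(p, o) for p, o in zip(pred, obs)) / len(obs)

def tcc(cc_t, cc_p, tau, pi_):
    """Total Compute Cost, Eq. (4.11)."""
    return cc_t * tau + cc_p * pi_

def cbm(dbpe_val, tcc_val):
    """Cost-Benefit Measure, Eq. (4.12)."""
    return (1.0 - dbpe_val) / tcc_val

# With alpha == beta == 1, DBPE reduces to MAPE, as noted above.
obs, pred = [100.0, 100.0], [110.0, 95.0]
print(round(dbpe(obs, pred, 1.0, 1.0), 4))  # → 0.075
```

Setting, say, α = 1.5 and β = 0.5 penalizes over-prediction three times as heavily as under-prediction, raising DBPE for the same toy series.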
The USC campus microgrid [95] is a testbed for the DOE-sponsored Los Angeles Smart Grid Project. ARIMA and Regression Tree prediction models are used to predict energy consumption at 24-hour and 15-min granularities, for the entire campus and for 35 individual buildings. Here, we consider the campus and four representative buildings: DPT, a small department with teaching and office space; RES, a suite of residential dormitories with decentralized control of cooling and appliance power loads; OFF, hosting administrative offices and a telepresence lab; and ACD, a large academic teaching building. These buildings were chosen after several pilot studies to provide diversity in terms of floor size, age, end use, types of occupants, and net electricity consumption.

4.6.1 Datasets

Electricity Consumption Data 5: We used 15-min granularity electricity consumption data collected by the USC Facility Management Services between 2008 and 2010 (Table 4.1). These gave 3 × 365 × 96, or about 100K, samples per building. We linearly interpolated missing values (<3% of samples) and aggregated the 15-min data in each day to get the 24-hour granularity values (about 1K samples per building). Observations from 2008 and 2009 were used for training the models, while the predictions were evaluated against the out-of-sample observed values for 2010 6.

Weather Data 7: We collected historical hourly average and maximum temperature data curated by NOAA for Los Angeles/USC Campus for 2008–2010. These values were linearly interpolated to get 15-min values. We also collected daily maximum temperatures that were used for the 24-hour granularity models.

Schedule Data 8: We gathered campus information related to the semester periods, working days, and holidays from USC's Academic Calendar.
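The pre-processing described above, gap interpolation followed by daily aggregation, can be sketched as follows (a simplified illustration, not the dissertation's actual tooling; the list-based data layout is hypothetical):

```python
def interpolate_gaps(series):
    """Linearly fill None gaps in a list of 15-min kWh readings.
    Assumes the first and last readings are present."""
    out = list(series)
    i = 0
    while i < len(out):
        if out[i] is None:
            j = i
            while out[j] is None:  # find the next known reading
                j += 1
            step = (out[j] - out[i - 1]) / (j - i + 1)
            for k in range(i, j):
                out[k] = out[i - 1] + step * (k - i + 1)
            i = j
        i += 1
    return out

def daily_totals(series, slots_per_day=96):
    """Aggregate 15-min kWh readings into 24-hour totals."""
    return [sum(series[d:d + slots_per_day])
            for d in range(0, len(series), slots_per_day)]

# Toy example with a one-slot gap.
print(interpolate_gaps([4.0, None, 6.0, 4.0]))  # → [4.0, 5.0, 6.0, 4.0]
```

With 96 slots per day, three years of clean 15-min readings aggregate into roughly 1,096 daily totals, matching the sample counts quoted above.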
5 The electric consumption datasets used in this dissertation are available upon request for academic use.
6 The 24-hour data was available only till Nov 2010 at the time of the experiments, and hence the 24-hour models are tested over an 11-month period. The 15-min models span the entire 12 months.
7 NOAA Quality Controlled Local Climatological Data. http://cdo.ncdc.noaa.gov/qclcd/
8 USC Academic Calendar. http://academics.usc.edu/calendar/

Table 4.1: Electricity consumption dataset. Summary statistics of the campus microgrid consumption data for training years 2008–2009 and testing year 2010, at different spatial and temporal granularities.

                                Mean (kWh)              Std. Deviation (kWh)
  Entity                     Training    Testing        Training    Testing
  Campus  24-hour data       462,970     440,803        52,956      43,454
          15-min data        4,823       4,377          809         770
  DPT     24-hour data       405.64      405.56         112.39      108.14
          15-min data        4.23        4.16           1.93        1.98
  RES     24-hour data       4,220.30    3,670.56       1,809.00    1,460.08
          15-min data        43.97       37.79          22.36       17.93
  OFF     24-hour data       2,938.90    2,790.70       591.97      549.37
          15-min data        30.66       28.42          13.05       10.03
  ACD     24-hour data       4,466.40    4,055.85       640.92      552.64
          15-min data        46.65       41.30          14.08       13.09

4.6.2 Candidate Prediction Models

We used the following candidate models for evaluating our performance measures:

Time Series Model: A time series (TS) model predicts the future values of a variable based on its previous observations. The ARIMA (Autoregressive Integrated Moving Average) model, with parameters (p, d, q), is a commonly used TS prediction model. These parameters are determined using autocorrelation and partial autocorrelation functions, following the Box-Jenkins methodology [25]. ARIMA is simple to use as it does not require knowledge of the underlying domain [15]. However, estimating the model parameters d, p, and q requires human examination of the partial correlogram of the time series, though some automated functions perform a partial parameter sweep to select these values.
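For illustration only (not the dissertation's tooling), the sample autocorrelation function used in Box-Jenkins identification can be computed as below; a spike at lag 7 in daily data hints at the weekly (7-day lag) structure used later for the 24-hour models:

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

# A toy daily-kWh series with an exact weekly pattern (8 weeks):
weekly = [400, 380, 390, 410, 420, 300, 280] * 8
print(round(autocorrelation(weekly, 7), 3))  # → 0.875
```

The lag-7 value is far larger than the lag-1 value for this series, which is the kind of evidence that guides the choice of a 7-day lag in the (p, d, q) selection.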
Regression Tree Model: A regression tree (RT) model [27] is a kind of decision tree that recursively partitions the data space into smaller regions, until a constant value or a linear regression model can be fit to the smallest partition. Our earlier work on an RT model for campus microgrid consumption prediction identified several advantages [12]. Its flowchart-style tree structure helps interpret the impact of different features on consumption. Making predictions with a trained model is fast, though collecting feature data and training the model can be costly. It can be used to make predictions far into the future if the feature values are available.

4.6.3 Model Configurations

Regression Tree (RT) Models: For 24-hour (granularity) predictions, we used five features for the RT model: Day of the Week (Sun–Sat), Semester (Fall, Spring, Summer), Maximum and Average Temperatures, and a Holiday/Working day flag. For the 15-min (granularity) predictions, we used five features: Day of the Week, Time of Day (1–96, representing the 15-min slots in a day), Semester, temperature, and the Holiday/Working day flag. The RT model was trained once using MATLAB's classregtree function [27] to find an optimally pruned tree.

ARIMA Time Series (TS) Models: For 24-hour predictions, the ARIMA models are retrained and used to make predictions every week for four different prediction horizons: 1-week, 2-week, 3-week, and 4-week ahead. Unlike RT, the performance of time series models differs by the prediction horizon. We use a moving window over the past 2 years for training these models with (p, d, q) = (7, 1, 7), equivalent to a 7-day lag, selected after examining several variations. For 15-min predictions, we retrain models and predict every 2 hours for three different horizons: 2-hour, 6-hour, and 24-hour ahead. We use a moving window over the past 8 weeks for training, with (p, d, q) = (8, 1, 8), equivalent to a 2-hour lag. We used the arima function in the R forecast package [56] for constructing the time series models.
This function uses conditional sum of squares (CSS) as the fitting method.

Baseline Models: For 24-hour predictions, we selected the Day of Week mean (DoW) as the baseline, defined for each day of the week as the kWh value for that day averaged over the training period (i.e., 7 values from averaging over 2008 and 2009). DoW was chosen over the Day of Year (DoY) and Annual Means since it consistently out-performed them. For 15-min predictions, we selected the Time of the Week mean (ToW) as the baseline, defined for each 15-min interval in a week as the kWh value for that interval averaged over the training period (i.e., 7 × 96 values from averaging over 2008 and 2009). Here too, ToW out-performed the Time of the Year (ToY) and Annual Means.

4.7 Analysis of Independent Measures

We first examine the use and value of the six application-independent measures (Section 4.4) to evaluate the candidate models for predicting campus and building consumption at coarse and fine time granularities.

4.7.1 24-hour Campus Predictions

Fig. 4.1a presents the CVRMSE and MAPE measures for the DoW baseline, RT, and TS models, the latter at four different horizons, for campus 24-hour predictions. By these measures, TS models at the different horizons offer higher accuracy than the RT and DoW models. This is understandable, given the noticeable difference in means and standard deviations (Table 4.1) between the training and test periods. TS incrementally uses more recent data via a moving window, while the RT and DoW models are only trained on the two years of training data. Also, the errors for TS deteriorate as the prediction horizon increases. This is a consequence of their dependence on recent lag values, making them suited only for near-term predictions. RT models are independent of prediction horizons (assuming future feature values are known), and therefore preferable for predictions with long horizons. The DoW errors are marginally higher than RT's. This is quickly evident using our relative improvement (RIM) measure (Fig.
4.2a), which reports an improvement of 2.5% for RT and 58.39% for TS (1wk) over the baseline. However, when volatility is accounted for, this margin over DoW increases to a VAB of 11.45% and 74.42% for RT and TS (1wk), respectively, making them much more dependable.

4.7.2 24-hour Building Predictions

The CVRMSE and MAPE measures for DPT (Fig. 4.1b) diverge in their ranking of the RT and TS models; RT is best based on CVRMSE, while TS (1wk) is best on MAPE. This divergence highlights the value of having different error measures. In CVRMSE, residual errors are squared and thus large errors are magnified more than in MAPE. Our RIM measure offers another perspective as a relative measure independent of error values (Fig. 4.2b). TS (1wk) is clearly more favorable than RT, performing better than the baseline in 50% of predictions (RIM ≈ 0) compared to RT (RIM = −19.88%). When accounting for volatility with VAB, TS (1wk) outperforms the DoW (VAB = 17.62%) and even RT exhibits lesser relative volatility (VAB = 4.35%). These demonstrate why multiple measures offer a more holistic view of model performance.

RES has 100's of residential suites with independent power controls, and hence higher consumption variability. This accounts for the higher errors in predictions

[Bar charts for panels (a) Campus, (b) DPT, (c) RES, (d) OFF, (e) ACD omitted from transcript.]
Figure 4.1: Performance (CVRMSE and MAPE) for coarse-grained (24-hour) predictions for campus and four buildings. Lower values are better.
Day of Week (DoW) baseline, ARIMA Time Series (TS) with 1, 2, 3 and 4-week prediction horizons, and Regression Tree (RT) models are on the X-axis. Campus has the smallest errors, the RES residential building the largest, and, except for OFF, TS and RT outperform the baseline.

across models (Fig. 4.1c). Further, the building is unoccupied during summer and vacation periods.

[Bar charts for panels (a) Campus, (b) DPT, (c) RES, (d) OFF, (e) ACD omitted from transcript.]
Figure 4.2: Relative Improvement (RIM) and Volatility-Adjusted Benefit (VAB) values for coarse-grained (24-hour) predictions for campus and four buildings. Higher values indicate better performance relative to the DoW baseline; a zero value means performance similar to the baseline. DoW is more volatile for RES due to the summer vacation. VAB for RT and TS are high, showing resilience.
Hence, it is unsurprising to see DoW perform particularly worse. (We verified the impact of summer by comparing DoW with DoY. DoY did perform better, but, for consistency, we retain DoW as the baseline.)

[Bar charts for panels (a) Campus, (b) DPT, (c) RES, (d) OFF, (e) ACD omitted from transcript.]
Figure 4.3: CVRMSE and MAPE values for fine-grained (15-min) predictions for campus and four buildings. Lower errors indicate better model performance. Time of Week baseline (ToW), ARIMA Time Series (TS) at 2, 6 and 24-hour horizons, and Regression Tree (RT) models are shown. Errors usually increase as the prediction horizon is increased. RT is independent of the prediction horizon.

RT has lower
We inde- pendently verified if the consumption pattern of this building is highly-correlated with the DoW by examining the decision tree generated by RT, the best choice in terms of MAPE. We found the DoW feature to be present in the root node of the tree whiletheholidayflagwasatthesecondlevel. RTisalsotheonlymodelwhich (marginally) outperforms the baseline on RIM and VAB (Fig. 4.2d), thus deliver- ing the benefits of using a feature-based approach that subsumes DoW. TS fails to do well, possibly due to temporal dependencies that extend beyond the 7-day lag period. It is notable that while DoW is the preferred model based on CVRMSE (Fig. 4.1d) for OFF, measures we propose, such as RIM and VAB (Fig. 4.2d) that evaluate performance against the DoW baseline, indicate that RT is the better choice. For ACD (Fig. 4.1e), TS (1wk) and RT perform incrementally better than DoW on CVRMSE and MAPE, and VAB is positive for only these two models (Fig. 4.2e). The sharp change in standard deviation between the training and test dataaccountsforthehighersensitivitytovolatilityofthebaseline(Table4.1). But 54 we observe slightly negative values of RIM for all models, implying more frequent errors than the baseline. 4.7.3 15-min Campus Predictions The 15-min predictions for the campus shows TS (2hr) to fall closest to the observedvalues, basedonCVRMSE(6.88%)andMAPE(4.18%)(Fig.4.3a). This accuracy is validated relative to the baseline, with high RIM and VAB values (Fig. 4.4a). These reflect the twin benefits of large spatial granularity of the campus, which make its consumption slower changing, and the short horizon of TS (2hr), helping it capture temporal similarity. RT is the next best, performing similar to TS (6hr) and ToW baseline on CVRMSE, MAPE and RIM, though it is better with volatility (VAB=5.21%). 
4.7.4 15-min Building Predictions

For 15-min predictions for buildings, we see that TS (2hr) is the only candidate model that always does better than the ToW baseline on all four measures (Figs. 4.3b-4.3e & 4.4b-4.4e). TS (6hr) and RT are occasionally better than ToW on CVRMSE and MAPE, and TS (24hr) rarely. Their CVRMSE errors are also uniformly larger than MAPE, showing that the models suffer more from occasional large errors. The academic environment with weekly class schedules encourages a uniform energy use behavior based on ToW that is hard to beat. RES is the exception, where all candidate models are better than the baseline (Fig. 4.3c), given the aberrant summer months when it is unused.

However, when we consider the RIM and VAB measures, it is interesting to note that the candidate models are not significantly worse than the baseline (Figs. 4.4b-4.4e). In fact, TS (6hr) is better than ToW for all buildings but DPT, showing that it is more often accurate and more reliable under volatility. RT, however, is more susceptible to volatility and shows negative values for all buildings but RES.

Figure 4.4: RIM and VAB values for fine-grained (15-min) predictions for campus and four buildings. Higher values indicate better model performance with respect to the baseline; zero indicates similar performance as baseline. TS (2hr) usually offers highest reliability in all cases.
While TS (2hr) followed by TS (6hr) are obvious choices for a short horizon, ToW and RT have the advantage of being able to predict over a longer term. In the latter case, ToW actually turns out to be a better model.

Table 4.2: Application-independent cost measures. Prediction horizon is 4 weeks for 24-hour predictions, and 24 hours for 15-min predictions. CD measures the number of unique feature values used in training and testing. TS and the baseline do not have a training cost.

Model                              CD        CC_t (msec)   CC_p (msec)
DoW/ToW Baseline, 24-hour          1,096     -             -
DoW/ToW Baseline, 15-min           105,216   -             -
Time Series, 24-hour               1,096     -             101
Time Series, 15-min                105,216   -             933
Regression Tree, 24-hour           3,301     94            1.6
Regression Tree, 15-min            131,629   17,275        48

4.7.5 Cost Measures

The data and compute cost measures discussed here are orthogonal to the other application-independent measures, and their values are summarized in Table 4.2. Making a cost assessment helps grid managers ensure rational use of resources, including skilled manpower and compute resources, as well as informed selection of models for D2R.

Data Cost (CD): The baselines and TS are univariate models that require only the electricity consumption values for the training and test periods. Hence their data costs are smaller, and correspond to the number of intervals trained and tested over. The RT model has a higher cost due to the addition of several features (§ 4.6.3). However, the cost does not increase linearly with the number of features and instead depends on the number of unique feature values. As a result, its data cost is only ∼25% and ∼300% greater than TS for 24-hour and 15-min predictions respectively.

Compute Cost (CC): We train over 2 years and predict for 4 weeks (24-hour granularity) and 24 hours (15-min) on a Windows Server with an AMD 3.0GHz CPU and 64GB RAM, and report the average over 10 experiment runs.
The baseline's compute cost is trivial, as it is just an average over past values, and we ignore it. For the TS models, retraining is interleaved with prediction, and we report it as part of the prediction cost (CC_p). We found prediction times for TS to be identical across the campus and the four buildings, and the 15-min predictions to be ∼9× the cost of the 24-hour ones, understandable since there are ∼10× the data points. The horizons did not affect these times. For RT, we find the training and prediction times to be similar (but not the same) across the campus and the four buildings, and this is seen in the differences in the sizes of the trees constructed. We report their average time. While RT has a noticeable training time (17 secs for 15-min), its prediction time is an order of magnitude smaller than TS. As a result, its regular use for prediction is cheaper. It is more responsive, with a lower prediction latency, even as the number of buildings (or customers) increases to the thousands.

4.8 Analysis of Dependent Measures

The application-dependent measures (Section 4.5) enable model selection for specific application scenarios. We consider three applications used within the USC microgrid to evaluate our proposed application-dependent measures. These are: planning, customer education, and demand response (Section 2.3). For each application, the measures' parameter values are defined in consultation with the domain experts. These values are listed in Tables 4.3 & 4.4.

Table 4.3: Application-specific parameters. α, β are the over- and under-prediction penalties for DBPE, and e_t is the error tolerance for REL.

Application & Prediction Type             DBPE (α, β)   REL (e_t)
Planning
  24-hour Buildings                       0.50, 1.50    0.15
  24-hour Campus                          1.00, 1.00    0.10
Customer Education
  24-hour Building                        0.75, 1.25    0.15
  15-min Buildings (6AM-10PM)             1.50, 0.50    0.10
Demand Response
  15-min Campus (1PM-5PM)                 0.50, 1.50    0.05
  15-min Buildings (1PM-5PM)              0.50, 1.50    0.10
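To make the role of these parameters concrete, the sketch below shows one plausible formulation of DBPE and REL. The exact definitions are given in Section 4.5 and are not reproduced in this excerpt; what follows is an assumption for illustration. DBPE is written here as an asymmetrically weighted percentage error, which is consistent with the later note that DBPE reduces to MAPE when α = β = 1, and REL as a signed within-tolerance probability, consistent with the later observation that REL can fall below 0% when errors exceed the threshold more often than not (in the dissertation, REL is further evaluated relative to the baseline model).

```python
import numpy as np

def dbpe(actual, predicted, alpha, beta):
    """Domain Bias Percentage Error (assumed form): a percentage error that
    weights over-predictions by alpha and under-predictions by beta.
    With alpha = beta = 1 this reduces to MAPE."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    pct_err = np.abs(actual - predicted) / actual
    weights = np.where(predicted > actual, alpha, beta)
    return 100.0 * np.mean(weights * pct_err)

def rel(actual, predicted, e_t):
    """Reliability (assumed form): fraction of predictions within the error
    tolerance e_t minus the fraction outside it, so it goes negative when
    errors exceed the tolerance more often than they stay within it."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    within = np.abs(actual - predicted) / actual <= e_t
    return 100.0 * (np.mean(within) - np.mean(~within))
```

Setting α < β thus rewards a model for erring on the side of over-prediction, which is exactly how the Planning and Demand Response rows of Table 4.3 are configured.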
4.8.1 Planning

Planning requires medium- and long-term consumption predictions at 24-hour granularity for the campus and buildings, six times a year. The short horizon of TS (4 weeks) precludes its use. So we only consider the DoW and RT models, but do report TS results.

Campus: For campus-scale decisions, both over- and under-predictions can be punitive. The former will lead to over-provisioning of capacity with high costs, while the latter can cause reduced usability of capital investments. Hence, for DBPE, we equally weight α = 1 and β = 1, whereby DBPE reduces to MAPE. We set e_t = 10%, a relatively low tolerance, since even a small swing in error percentage for a large consumer like USC translates to large shifts in kWh.

Fig. 4.5a shows RT (and TS) to perform better than the DoW baseline on DBPE (6.87% vs. 7.56%, consistent with MAPE). The RT model's reliability is also higher than DoW's (Fig. 4.6a), with RT providing errors smaller than the threshold 60.87% of the time, a probabilistic measure for the planners to use. When we consider the total compute cost for training and running the model (Table 4.4), RT is trained once a year and used six times, with a negligible compute cost of 103 msec and a high CBM of 900%/sec (Fig. 4.5a). These make RT a better qualitative and cost-effective model for long-term campus planning.

Buildings: Buildings being upgraded for sustainability and energy efficiency favor over-prediction of consumption to ensure an aggressive reduction of carbon footprint. Reflecting this, we set α = 0.5 and β = 1.5 for DBPE. A higher error tolerance than for the campus is acceptable, at e_t = 15%. Cost parameters and measure values are the same as for the campus.

DBPE reflects a truer measure of error for the application, and we see that it is smaller than MAPE across all models and buildings (Figs. 4.5b-4.5e). Investigating the data reveals that the average kWh for the training period was higher than that for the test period, leading to over-predictions.
Here, the models' inclination to over-predict works in their favor. While RT is uniformly better than DoW on DBPE, it is less reliable for RES and ACD (Figs. 4.6c & 4.6e), even falling below 0%, indicating that predictions go over the error threshold more often than below it.

REL, unlike DBPE, treats over- and under-predictions similarly. While the baseline has fewer errors above the threshold, their magnitudes are much higher, causing DBPE (an average) to rise for smaller REL. The costs for RT are minimal, as for the campus, and their CBMs similar. So the model of choice depends on whether

Figure 4.5: Domain bias percentage error (DBPE), primary Y-axis, and Cost-Benefit Measure (CBM), secondary Y-axis, for coarse-grained (24-hour) predictions for Planning and Customer Education. Customer Education is not relevant for campus. Lower DBPE and higher CBM are better, as seen in RT.

the predictions need to be below the threshold more often (DoW) or the biased errors need to be lower (RT). Particularly, for OFF, REL (Fig.
4.6d) shows RT is best for Planning even as DoW was the better model based on CVRMSE (Fig. 4.1d).

Figure 4.6: Reliability (REL) values for coarse-grained (24-hour) predictions for Planning and Customer Education. Both have the same value of the error tolerance parameter, and are shown by a single graph. Higher values indicate better performance than the baseline; zero matches the baseline.

Similarly, for ACD, REL (Fig.
4.6e) recommends DoW for Planning even as CVRMSE (Fig. 4.1e) suggests RT and MAPE (Fig. 4.1e) suggests TS (1wk). These highlight the value of defining application-specific performance measures like REL for meaningful model selection.

Figure 4.7: Domain bias percentage error (DBPE), primary Y-axis, and Cost-Benefit Measure (CBM), secondary Y-axis, for fine-grained (15-min) predictions for Demand Response and Customer Education. Customer Education is not relevant for campus. Lower DBPE and higher CBM are desirable, and provided by TS (2hr) and TS (6hr) for DR.
Figure 4.8: Reliability (REL) values for fine-grained (15-min) predictions for Demand Response and Customer Education. Higher is better.

4.8.2 Customer Education

This application uses 24-hour and 15-min predictions at the building level made during the daytime (6AM-10PM), and provides them to residents/occupants for monthly budgeting and daily energy conservation.

24-hour predictions: 24-hour predictions impact monthly power budgets, and over-predictions are better to avoid slippage. We pick α = 0.75 and β = 1.25 for DBPE and an error tolerance e_t = 15% for REL. We use a 4-week prediction duration for costing, with one 24-hour prediction done each day by RT and TS. RT is trained once in this period. We report TCC (Table 4.4), DBPE & CBM (Figs. 4.5b-4.5e), and REL (Figs. 4.6b-4.6e).

As for Planning, which preferred over-predictions, the DBPE here is smaller than MAPE for all models, and it is mostly smaller for the RT and TS models than for DoW. But for a building like ACD, while one may have picked TS (1wk) based on the application-independent MAPE measure (Fig. 4.1e), both RT and DoW are better for Customer Education on DBPE (Fig. 4.5e). Similarly, for both DPT and RES, TS (1wk) was the best option based on MAPE (Figs. 4.1b, 4.1c) as well as on DBPE for Customer Education (Figs. 4.5b, 4.5c).
However, for a different application, such as Planning, RT is the recommended model based on DBPE (Figs. 4.5b, 4.5c). This highlights how a measure that is tailored for a specific application by setting tunable parameters can guide the effective choice of models for it.

When considering reliability, REL for RT is marginally (DPT) or significantly (OFF) better than DoW even as the application-independent RIM showed RT to be worse than or as bad as DoW, respectively, yet another benefit of measures customized for the application. RT also equals or outperforms TS (1wk) on both DBPE and REL for all buildings but RES. The TCC cost for TS, while being ∼20× more than RT, is still small given the one-month duration. This is also reflected in the CBM being much lower for TS.

15-min predictions: This application engages customers by giving periodic forecasts during the day to encourage efficiency. Over-predicting often, or more frequent errors, will dampen a customer's interest. So we set α = 1.5 and β = 0.5 for DBPE, and we have a lower error tolerance at e_t = 10% for REL. The prediction duration is 4 weeks for cost parameters, with 8 uses per day at 2-hour horizons. RT is trained once.

Table 4.4: Application-specific cost parameters and measures (TCC). τ is the number of trainings per duration, and π is the model usage with a prediction horizon per duration.

Application                     Trainings, τ   Uses, π (horizon)   TCC (msec)
Planning (duration = 1 year)
  24-hour RT                    1              6 (2mo)             103
Customer Education (duration = 4 weeks)
  24-hour RT                    1              28 (1dy)            139
  24-hour TS                    -              28 (1dy)            2,845
  15-min RT                     1              8·28 (2hr)          28,103
  15-min TS                     -              8·28 (2hr)          209,037
Demand Response (duration = 4 weeks)
  15-min RT                     4              5·3 (6hr)           69,824
  15-min TS                     -              4·5·3 (6hr)         55,992

For all buildings, both DBPE and REL rank TS (2hr) as the best model (Figs. 4.7b-4.7e & 4.8b-4.8e). These reaffirm the effectiveness of TS for short-term predictions. For many models, the (daytime) DBPE for this application is higher than the (all-day) MAPE due to higher variations in the day.
However, TS (2hr) bucks this trend for RES, OFF and ACD. RT is worse than even ToW on reliability, with REL below 0% for all buildings. For RES, all models but TS (2hr) have REL below 0%. So, qualitatively, TS (2hr) is by far the better model. However, on costs (Table 4.4), TS has TCC ≈ 209 secs. This may not seem much, but when used for tens of thousands of buildings in a utility, it can be punitive. At large scales, CBM (Figs. 4.7b-4.7e) may offer a better trade-off and suggests RT for DPT and OFF.

4.8.3 Demand Response

DR uses 15-min predictions to detect peak usage and preemptively correct it to prevent grid instability. Hence, over-predictions are favored over under-predictions to avoid missing peaks, and we set α = 0.5 and β = 1.5 for DBPE. The campus is a large customer with a tighter error threshold requirement of e_t = 5% for REL, while individual buildings with lower impact are allowed a wider error margin of e_t = 10%. The prediction duration is 4 weeks for cost parameters, with the models used thrice on a weekday (before, at the start of, and during the 1-5PM period), and RT trained weekly.

DBPE is uniformly smaller than MAPE for the campus and buildings (Figs. 4.7a-4.7e), sometimes even halving the errors. Thus the 4-hour DR periods on weekdays are more (over-)predictable than all-day predictions. TS (2hr) has significantly better DBPE than the other models, with even TS (6hr) outperforming RT and ToW. For the campus, RT is better than DoW, in part due to using temperature features that have a cumulative impact on energy use during midday.

We see TS (2hr) gives a high REL of 91% for the campus (Fig. 4.8a) and is the only model with positive REL for RES. Also, TS (6hr) and RT prove to be more reliable for DR for the campus and DPT than their poorer showing in the RIM and VAB independent measures (Fig. 4.4), making them competitive candidates. However, RT suffers in reliable predictions for the other buildings, with lower or negative REL (Figs. 4.8c-4.8e), while TS (2hr) and (6hr) continue to perform reliably.
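The TCC values in Table 4.4 appear consistent with composing the per-model costs of Table 4.2 as TCC ≈ τ·CC_t + π·CC_p. The quick arithmetic check below is our reading of the two tables, not a formula stated in this section:

```python
def tcc(trainings, uses, train_cost_ms, predict_cost_ms):
    """Total compute cost over an application's duration:
    tau trainings plus pi per-use prediction costs (all in msec)."""
    return trainings * train_cost_ms + uses * predict_cost_ms

# Planning, 24-hour RT: trained once a year, used six times
# (CC_t = 94 msec and CC_p = 1.6 msec taken from Table 4.2).
planning_rt = tcc(1, 6, 94, 1.6)    # 103.6 msec, close to the 103 msec reported

# Demand Response, 15-min RT: trained weekly over 4 weeks, 5*3 uses.
dr_rt = tcc(4, 15, 17275, 48)       # 69,820 msec, close to the 69,824 msec reported
```

The small residual differences are consistent with the CC values in Table 4.2 being rounded averages over experiment runs.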
Cost-wise, we see the RT and TS models are comparable on TCC (Table 4.4). For once, RT takes longer than TS due to the more aggressive retraining (every week), preferred for critical DR operations. But when seen through the CBM measure, all TS models beat RT in all cases but one (TS (24hr) on DPT). Thus, TS (2hr) and TS (6hr) are the best for DR on all measures.

4.9 Summary

The key consideration in evaluating prediction models is their performance for the task at hand. Traditionally, accuracy measures have been used as the sole measure of prediction quality. In this chapter, we examined the value of performance measures along the dimensions of scale independence, reliability and cost efficiency. In evaluating them for consumption prediction in smart grids, we observed that scale independence ensures that performance can be compared across models and applications and for different customers; reliability evaluates a model's consistency of performance with respect to the baseline models; while cost is a key consideration when deploying models for real-world applications such as dynamic decision making for D2R.

We used the existing scale-independent measures, CVRMSE and MAPE, and proposed four additional application-independent measures: RIM and VAB for measuring reliability, and CD and CC for data and compute costs. Further, our novel application-dependent measures can be customized by domain experts for meaningful model evaluation for applications of interest. These measures include DBPE for scale independence, REL for reliability, and TCC and CBM for cost. The value of these measures for scenario-specific model selection was empirically demonstrated using three smart grid applications that anchored our analysis even as they are generalizable to other domains. Through cross-correlation analysis, we found that only MAPE and CVRMSE show an absolute correlation > 0.9, indicating that all measures are individually useful.
Our results demonstrated the valuable insights that can be gleaned on models' behavior using holistic measures. These help to improve their performance, and provide an understanding of the predictions' real impact in a comprehensive yet accessible manner. As such, they offer a common frame of reference for model evaluation by future researchers and practitioners.

Chapter 5

Prediction with Partial Data

In this chapter, we address the problem of predicting fine-grained energy consumption for the next few hours in the absence of real-time data from all smart meters.

5.1 Introduction

Low-cost wireless sensors are increasingly being deployed in large numbers for monitoring and control in many sustainability domains, such as smart electric grids, transport networks, and natural resource and environment monitoring. These sensors are located at geographically dispersed locations and periodically send acquired data back to centrally located processing nodes [34] via wireless links and the Internet [33]. They include sensors for monitoring natural resources and the environment, such as biodiversity and the atmosphere [69]; smart meters for measuring energy consumption [92], [70]; loop detectors installed under pavements for recording traffic [79]; and meters on wind turbines that record wind speed and the turbines' power output [28].

Due to several factors, data from all sensors is not available at central nodes in real time or at the frequency required for fast, real-time modeling and decision-making. For example, wind turbines record data every few seconds, but transmit data every five minutes to far-off research centers for use in prediction algorithms [28]. Physical limitations of existing transmission networks, such as latency, bandwidth and high energy consumption [34], are key factors that limit the frequency of data transmission from sensors to central nodes [24].
Sometimes, consumers may also limit frequent transmission of information from sensors located at their premises due to security and privacy concerns [73]. For instance, in the smart grid domain, fine-grained electricity consumption data collected through smart meters can be used to infer activities of the consumers and also indicate their presence or absence [74]. Faults, outages, and unreliability or shadow fading of transmission links [33] may be other factors that make the data unavailable at central nodes.

All these situations reflect the partial data problem, where only partial data from sensors is available in real time, and complete high-resolution data is available only periodically, generally one or more times a day. Without addressing this problem, traditional solutions risk degradation in performance and inaccurate interpretation of generated insights. For instance, time series prediction methods are adversely affected by the prediction horizon length, and in the case of missing data, the effective prediction horizon becomes larger, leading to inferior prediction estimates. Thus, the time series approach cannot be used for accurate predictions, for example, for up to 8 hours ahead. An alternative approach, as we propose in this chapter, is to develop creative solutions using data from a small subset of sensors selected on the basis of some heuristics or learning methods, while minimizing the information loss that results from leaving out data from the remaining sensors. The intuition behind this approach is the fact that sensors located spatially close to each other, or sensing activities driven by similar schedules, such as those on an academic campus or traffic on high-density roads, are likely to be correlated. If this information can be leveraged, it will obviate the need for real-time transmission from all sensors to the central nodes, and thereby reduce the load on the transmission network. Also, it would make it simpler to add new sensors without straining the network.
In this chapter, we address the partial data problem in the context of smart electricity grids, where high-volume electricity consumption data is collected by smart meters at consumer premises and securely transmitted back to the electric utility over wireless or broadband networks. There, the data is used to predict electricity consumption and to initiate curtailment programs ahead of time by the utility to avoid potential supply-demand mismatch. The partial data problem arises when data from smart meters is only partially available in real time. To address this, we propose a two-stage solution: first, we learn the dependencies among the time series of different smart meters; then, we use data from a small subset of smart meters which are found to have high influence on others to make predictions for all meters. When using partial data from only ∼7% influential smart meters, we witness prediction error increase by only ∼0.5% over the baseline (Fig. 5.12b), thus demonstrating the usefulness of our method for practical scenarios [8].

Our main contributions are:

1) We leverage dependencies among time series sensor data for making short-term predictions with partial real-time data. While time series dependencies have been used previously, the novelty of our work is in extending the notion of dependencies to discover influential sensors and using real-time data only from them to make predictions for all sensors.

2) Using real-world smart grid data, we empirically demonstrate that despite the lack of real-time data, our models achieve performance comparable to the baseline model that uses real-time data, thus indicating the usefulness of influence-driven models for practical sustainability domain scenarios.

The remainder of this chapter is organized as follows: Section 5.2 presents related work and Section 5.3 provides the formal definitions and notations used in solving the partial data problem.
Section 5.4 explains our proposed models and Section 5.6 describes the experiments and presents an analysis of the results. Finally, the conclusion is given in Section 5.7.

5.2 Related Work

Many predictive modeling methods are designed for ideal scenarios where all required data is readily available. For example, time series prediction methods such as Auto-Regressive Integrated Moving Average (ARIMA) [25] and Auto-Regressive Trees (ART) [75] require observations from the recent past to be readily available in real time to make short-term future predictions. However, this assumption does not hold true for many sensor-based applications involving "big data" that is only partially available in real time. The solutions proposed to address this problem can be categorized into two types: 1) reduce the volume of transmitted data by techniques such as data compression [70], [86], data aggregation [60], model-driven data acquisition [41], and communication-efficient algorithms [87]; 2) estimate missing real-time data by techniques such as interpolation based on regression [63], or through transient time models that use differential equations to model system behavior [38]. The main challenge with these methods is that the estimates depend on the accuracy of the models, and interpolation errors get propagated to subsequent analysis and decision-making steps. Another method for estimation is spectral analysis of time series, though it is a more complex and involved process that is suitable only for periodic time series [19]. We use an orthogonal approach where, instead of trying to estimate missing real-time data, we first discover influential sensors and then do predictive modeling using real-time data from only these sensors.

Our approach involves learning dependencies among time series data from different sensors. Several techniques have been proposed to learn dependencies among time series data; the more popular among them are based on cross-correlations [25] and Granger causality [49].
The latter has gained popularity in many domains such as climatology, economics, and the biological sciences due to its simplicity and robustness [19]. It is, however, time consuming for evaluating pairwise dependencies when a large number of variables is involved. Lasso-Granger [17] has been proposed to provide a more scalable and accurate solution. In our work, we leverage the Lasso-Granger method to discover dependencies among time series from different sensors, and then identify influential sensors based on these dependencies.

Our work brings much-needed focus to efficient data collection methods for sustainability domains. In the smart grid, data streams from thousands of sensors are monitored for predictive analytics [21] and demand response [65]. With large-scale adoption of smart meters, most cities will soon have millions of smart meters recording electricity consumption data every minute. For utilities, real-time data collection from meters all over a city would be prohibitive due to the limited capacity of current transmission networks. Such scenarios necessitate the development of alternative methods, such as ours, that can work with only the partial data that is available in real time.

5.3 Preliminaries

In this section, we formulate the problem of prediction with partial data addressed in this chapter and give formal definitions of the key terms used in the context of the problem.

Consider a large set of sensors S = {s_1, ..., s_n} collecting real-time¹ data. Due to network bandwidth constraints, only some of these sensors can send data back to the central node in real time, while the rest send the collected data in batches every few hours (Fig. 5.1). The problem we address is to use this partial data to make predictions for all sensors.

Definition 5.1. A time series output of a sensor s_i is an ordered sequence of readings T_i = {x^i_j}, j = 1, ..., t, up to the current time stamp t.

Definition 5.2.
Given a set of sensor time series outputs {x^i_j}, j = 1, ..., t, i = 1, ..., n, short-term prediction² is to estimate {x^i_j}, j = t+1, ..., t+h, i = 1, ..., n, for a horizon h.

Problem Definition. Given a set of sensors S with time series outputs {x^i_j}, j = 1, ..., t, i = 1, ..., n, make short-term predictions {x^i_j}, j = t+1, ..., t+h, i = 1, ..., n, for each sensor s_i ∈ S, when readings {x^o_k}, k = t−r+1, ..., t, for o ∈ O are missing for a subset O of sensors, O ⊂ S. For simplicity, we assume all time series sensor outputs to be sampled at the same frequency and to be of equal length.

We hypothesize that we can learn dependencies in past time series outputs from sensors and use them to identify the set of sensors that are more helpful in making predictions for other sensors, so that we can collect real-time readings from only these sensors.

¹ In this dissertation, we consider data collected at 15-min intervals as real-time, even though our models would be applicable (with even greater impact) to data at smaller resolutions.
² In the context of the smart grid, for short-term predictions, the prediction horizon is 1 to 8 hours and the prediction intervals are 15-min, 30-min or 1-hour long.

Figure 5.1: Partial Data Problem: some sensors can send readings to a central node in real time, while the rest send every few hours, resulting in partial data being available in real time.

Definition 5.3. A dependency matrix M is an n × f matrix, where each element M[i,j] represents the dependence of time series T_i on time series T_j.

Definition 5.4. The influence I_k of a time series T_k is defined as the sum of all values in column k of the dependency matrix M:

    I_k = Σ_{j=1}^{n} M[j,k]                                    (5.1)

In the following, we propose a strategy towards modeling influence impact and corresponding prediction techniques.

Definition 5.5.
The Compression Ratio, CR, is defined as the ratio between the total number of sensor readings that would be required for real-time prediction and the number of readings actually transmitted from the selected influential sensors for prediction with partial data:

    CR = (Σ_{i=1}^{n} |P_i|) / (Σ_{i=1}^{n} |P_i| − Σ_{o∈O} |P_o|)        (5.2)

where P_i is the sequence of past values from sensor s_i used for prediction, |P_i| is the length of this sequence, O is the subset of sensors with missing real-time readings, and n is the total number of sensors. For simplicity, we consider the same length l of past values for all sensors. Hence, |P_i| = l, ∀i, and the above equation simplifies to CR = n / (n − |O|). For example, if only ∼7% of the sensors transmit in real time, i.e., |O| ≈ 0.93n, then CR ≈ 14.

5.4 Methodology

We propose a two-stage process, where we first learn dependencies from past data and determine influence for individual sensors, and then use this information to select influential sensors for regression tree based prediction.

5.4.1 Influence Discovery

We cast the problem of making predictions for a sensor s_i ∈ O in terms of recent real-time data from other sensors as a regression problem. In ordinary least squares (OLS) regression, given data (x_i, y_i), i = 1, 2, ..., n, the response y_i is estimated in terms of p predictor variables, x_i = (x_i1, ..., x_ip), by minimizing the residual squared error.

We identify sensors that show stronger influence on other sensors using a lasso-based approach. The lasso method is used in regression for shrinking some coefficients and setting others to zero by penalizing the absolute size of the coefficients [100]. The OLS method generally gives low bias due to over-fitting but has large variance. The lasso improves variance by shrinking coefficients and hence may reduce overall prediction errors [100].
Given n sensor outputs in the form of time series x^1, x^2, ..., x^n, with readings at timestamps t = 1, ..., T, for each series x^i we obtain a sparse solution for the coefficients w by minimizing the sum of squared errors plus a constant times the L1-norm of the coefficients:

    w_i = arg min_w  Σ_{t=l+1}^{T} ‖ x^i_t − Σ_{j=1}^{n} w^T_{i,j} P^j_t ‖²_2 + λ ‖w‖_1    (5.3)

where P^j_t is the sequence of the past l readings, i.e., P^j_t = [x^j_{t−l}, ..., x^j_{t−1}], w_{i,j} is the j-th vector of coefficients in w_i representing the dependency of series i on series j, and λ is a parameter which determines the sparseness of w_i and can be determined using cross-validation.

5.4.2 Influence Model (IM)

We first learn the dependency matrix M_j for each day as follows. Each sensor's data is split into a set of q daily series {D^i_j}, i = 1, ..., n, j = 1, ..., q. Longer time series data is usually non-stationary, i.e., the dependence on preceding values changes with time. Splitting into smaller day-long windows ensures stationarity for the time series data in each window. For the same reason, the dependency matrix M is also re-calculated daily using these daily series³. Weights for the daily series of each day are calculated using eqn. 5.3. The weight vectors w_i form the rows of the dependency matrix M. We set the diagonal of the dependency matrix to zero, i.e., M[i,i] = 0, in order to remove self-dependencies and simulate the case of making predictions without a sensor's own past real-time data. Given M, the influence I of all series can be calculated using eqn. 5.1.

³ As we show later, influence changes with time (Fig. 5.3), in our case on a daily basis; hence the dependency matrix M needs to be learned daily.

Predictions for a given day are based on training data from a previous similar day sim.
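The influence computation of Definitions 5.3 and 5.4, zeroing the diagonal of M and summing each column per eqn. 5.1, can be sketched as follows. The 3-sensor matrix values are illustrative only; in the dissertation M would come from the daily lasso fits.

```python
def influence_scores(M):
    """Influence I_k of series k (eqn. 5.1): the sum of column k of the
    dependency matrix M, after self-dependencies M[i][i] are removed."""
    n = len(M)
    for i in range(n):
        M[i][i] = 0.0  # remove self-dependencies (M[i,i] = 0)
    return [sum(M[j][k] for j in range(n)) for k in range(n)]

# Hypothetical dependency matrix: row i gives the dependence of series i
# on each other series (values are made up for illustration).
M = [[0.9,  0.25, 0.0],
     [0.5,  0.8,  0.125],
     [0.25, 0.0,  0.7]]
print(influence_scores(M))  # [0.75, 0.25, 0.125]
```

Here sensor 0 has the highest influence, so under the models below it would be a preferred candidate for real-time transmission.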
We consider two cases of similarity: 1) previous week, i.e., the same day in the preceding week, which captures similarity for sensor data related to periodic (weekly) human activities; 2) previous day, which captures similarity for sensor data related to human activities and natural environments on successive days.

We apply a windowing transformation to the daily series {D^i} in both training and test data to get a set of ⟨predictor, response⟩ tuples. Given a time series x with k values, the transformation of length l results in a set of tuples of the form ⟨(x_{t−l+1}, ..., x_t), x_{t+h}⟩ such that l ≤ t ≤ k − h.

The prediction model for a sensor s_i is a regression tree [27] that uses predictors from all sensors with non-zero coefficients in the dependency matrix learned from a similar day, i.e., predictors are taken from {D^k}, ∀k : M_sim[i,k] ≠ 0. Since M[i,i] = 0, sensor s_i's own past values are not used as predictors. Hence, a key benefit of this model is that we are able to make predictions for a sensor in the absence of its own past values by using the past values of its influential sensors.

5.4.3 Local Influence Model (LIM)

In the previous section, we discussed how IM resolves the problem of partial data availability by using influential sensors (a small subset of sensors) to transmit data in real time. However, without restricting the number of influential sensors, the subset of influential sensors considered by IM may grow to include the total number of sensors. Next, we discuss a policy to ensure that only a fraction of sensors is considered for real-time predictions. For each sensor s_i, we sort the corresponding row M[i,·] of the dependency matrix and consider only readings from the top τ_l sensors in this model.

5.4.4 Global Influence Model (GIM)

In LIM, because local influencers are selected for each sensor, overall it may still require real-time data from a large number of sensors, thus defeating the goal of obtaining real-time data from only a few influencers.
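The two selection policies, LIM's per-sensor top-τ_l row selection and the global top-τ_g ranking by influence used by GIM (described next), can be sketched as below. The matrix and influence values are hypothetical; only the selection logic mirrors the text.

```python
def local_influencers(M, tau_l):
    """LIM: for each sensor i, keep the tau_l sensors with the largest
    dependency values in row M[i] (sensor i itself is excluded)."""
    n = len(M)
    top = {}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: M[i][j], reverse=True)
        top[i] = others[:tau_l]
    return top

def global_influencers(influence, tau_g):
    """GIM: one shared set, the tau_g sensors with the highest influence."""
    order = sorted(range(len(influence)), key=lambda k: influence[k], reverse=True)
    return order[:tau_g]

# Hypothetical 3-sensor dependency matrix and influence scores.
M = [[0.0, 0.2, 0.6],
     [0.5, 0.0, 0.1],
     [0.4, 0.3, 0.0]]
print(local_influencers(M, 1))               # {0: [2], 1: [0], 2: [0]}
print(global_influencers([0.9, 0.5, 0.7], 2))  # [0, 2]
```

In this toy example LIM needs real-time data from the union of the local sets ({0, 2}), while GIM fixes a single global set for all sensors; the union can be larger than τ_l, which is exactly the cost concern that motivates GIM.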
Thus, we are interested in finding global influencers. Using the dependency matrices M_j, we calculate the daily influence I^i_j for each sensor s_i as described in eqn. 5.1. After sorting the sensors based on their influence values, we consider only readings from the top τ_g sensors in the influence model.

5.5 Cost-efficiency of Influence Models

We examine the cost of our proposed influence models in terms of the number of sensors required to transmit data in real time. For comparison, we use the Auto-Regressive Tree (ART) model as the baseline. ART uses a sensor's own recent observations as features in a regression tree model [75]. We implement a specialized ART(p,h) model that uses the recent p observations of a variable for making h-interval-ahead predictions. While ART uses a variable's own recent observations, our models only use influential sensors' recent observations to make predictions. We observe the following about the number of sensors selected for transmitting real-time data: 1) ART requires real-time data from all sensors; 2) IM requires real-time data from only the influential sensors; 3) LIM requires real-time data from influential sensors selected locally for each sensor; and 4) GIM uses real-time data from all influential sensors selected globally. Thus, each successive model requires fewer sensors, resulting in higher cost-efficiency.

Figure 5.2: Cost-effectiveness of the IM model vis-a-vis the number of sensors required to transmit data in real time.

5.6 Experiments

5.6.1 Datasets

We conducted experiments with real-world datasets to evaluate our proposed influence-based prediction models: 1) Electricity Consumption Data⁴: collected at 15-min intervals by over 170 smart meters installed in the USC campus microgrid [92] in Los Angeles. 2) Weather Data: temperature and humidity data taken from NOAA's [77] USC campus station, linearly interpolated to 15-min resolution.

⁴ Available from the USC Facilities Management Services.

5.6.2 Evaluation

We evaluate our models for up to 8-hour-ahead prediction⁵.
Given the short horizon, the length of previous values used was set to 1 hour. Of the two choices of similar day for training, previous week and previous day, we found previous week to perform better.

Baseline Model

We use the Auto-Regressive Tree (ART) model as the baseline. Our proposed models are also based on regression trees, so ART provides a natural baseline for comparing performance. ART uses recent observations as features in a regression tree model and has been shown to offer high predictive accuracy on a large range of datasets [75]. ART's main advantage is its ability to model non-linear relationships in data, which leads to a closer fit to the data than a standard autoregressive model. We implement a specialized ART(p,h) model that uses the recent p observations of a variable for making h-interval-ahead predictions. While ART uses a variable's own recent observations, our models only use other variables' observations to make predictions.

Evaluation Metric

We used MAPE (Mean Absolute Percentage Error) as the evaluation metric, as it is a relative measure and therefore scale-independent [14]:

    MAPE = (1/n) Σ_{i=1}^{n} |x_i − x̂_i| / x_i

where x_i is the observed value and x̂_i is the predicted value.

⁵ Smart Grid applications such as Demand Response usually require up to 6-hour-ahead predictions [14].

Figure 5.3: Influence/dependency with respect to time.

5.6.3 Influence Variation

Fig. 5.3 shows the influence variation for the top 4 influencer sensors. Given this variation, we decided to re-calculate influence for each day in our experiments, rather than use a static value calculated over a large number of days. Fig. 5.4 shows the distribution of influence for each sensor with the size of sensor readings. It is interesting to note that buildings with smaller consumption values have higher influence. Also, influence on weekdays is higher, possibly due to more activity and movement of people between buildings.

Fig. 5.5 shows average dependency decreasing with increasing distance between the sensors. This validates our intuition about greater dependency among closely located sensors, which can be attributed to greater movement of people between neighboring buildings, and hence greater dependency in their electricity consumption. Also, there is more movement on weekdays; hence we observe that average dependency is higher for weekdays than for weekends.
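The MAPE metric defined in Section 5.6.2 can be computed as below; the consumption readings are illustrative only, not taken from the USC dataset.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error: (1/n) * sum(|x_i - xhat_i| / x_i)."""
    return sum(abs(x - xh) / x for x, xh in zip(actual, predicted)) / len(actual)

# Two hypothetical 15-min consumption readings (kWh) and their predictions.
print(mape([100.0, 200.0], [90.0, 220.0]))  # 0.1, i.e., 10% average error
```

Being a relative measure, the same function gives comparable error figures for a small office meter and a large campus building, which is why it is used throughout the evaluation.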
Figure 5.4: Influence/dependency with respect to size. Higher values are observed for weekdays than for weekends.
Figure 5.5: Influence/dependency with respect to distance. It is found to decrease with distance.

Figure 5.6: Partial real-time data availability vs. complete real-time data availability: for ART, recent values used as predictors at the time of prediction become increasingly ineffective for longer horizons, when IM's use of more recent real-time values of other sensors becomes more useful.

5.6.4 Results

Fig. 5.6 shows the prediction errors of the influence model, averaged over all days and all sensors. ART performs well up to 6 intervals (1.5 hours): due to the very short prediction horizon, electricity consumption is not expected to change drastically from its previous 4 values, and ART has access to real-time data. Instead, IM achieves comparable accuracy despite the lack of real-time data. IM's accuracy also improves relative to ART with the prediction horizon, where it consistently outperforms ART. While the increase in IM's error is subdued, ART's error increases rapidly with increasing horizon, implying that the previous 4 values used as predictors at the time of prediction become increasingly ineffective for predicting values beyond 1.5 hours ahead in time. Here, more recent real-time values of other sensors become more useful predictors than a sensor's own relatively older values. That IM achieves good accuracy despite the lack of real-time data is an important result and its main advantage.

Figure 5.7: Prediction performance of LIM models.
For the local influence model (Fig. 5.7), we consider real-time values from the top τ influential sensors for each sensor (τ = 4, 8, 12, 16, 20). ART performs well initially due to the very short prediction horizon, but its errors increase rapidly with increasing horizon. The LIMs show performance comparable to IM, while using real-time values from fewer sensors. Using increasingly fewer predictors increases the prediction error for LIMs, but only slightly. LIM's performance deteriorates compared to IM in terms of percentage change (Fig. 5.8a) with increasing horizon. This can be the effect of very few sensors remaining influential over longer horizons. When averaged over all horizons, we observe a 4.71% increase in error compared to IM for the Top 4 model, which comes down to a 1.97% increase for Top 8 and less than a 1% increase for the Top 12, 16, and 20 models (Fig. 5.11a). We observe that beyond the 1-2 hour horizon, all LIMs outperform ART (Fig. 5.8b), since for ART the effective horizon now includes both the prediction horizon and the missing real-time data. When averaged over all horizons, we observe that for Top 4 there is an increase in error of 2.24%, but for Top 8 (and Top 12, 16, 20) the error actually decreases (Fig. 5.11b) with respect to ART. Thus, we conclude that for this dataset, we need at least 8 influential sensors for each sensor to improve performance over the baseline.

Figure 5.8: Percentage change in MAPE of LIM with respect to (a) IM and (b) ART. (Negative change implies increase in error.)

Figure 5.9: Prediction performance of GIM models.
The global influence model uses real-time values from only the top k influential sensors selected globally for all sensors (k = 4, 8, 12, 16, 20). GIM outperforms ART beyond 8 intervals (Fig. 5.9). However, as the number of predictors is reduced when moving from the Top 20 to the Top 4 model, we observe that the increase in errors is more pronounced for GIM (Fig. 5.9) than for LIM (Fig. 5.7), as the number of unique influential sensors is significantly lower in the case of GIM as compared to LIM and IM. While LIM used influential sensors selected separately for each sensor, GIM uses the same set of influential sensors for all sensors and still achieves comparable performance. Top 20 and Top 16 GIMs outperform IM (Fig. 5.10a) for 1 interval ahead and later for 28 and 32 intervals. This could be due to the large number (20 and 16) of predictors selected in these models overlapping with those of IM. This is further supported by the average result over all horizons, where both the Top 20 and Top 16 models show less than 1% increase in errors compared to IM (Fig. 5.12a). We also observe that all GIMs outperform ART beyond 12 intervals, i.e., the 3-hour horizon (Fig. 5.10b). When at least 12 influential sensors are available, improvements are observed over the baseline across all horizons (Fig. 5.12b).

Figure 5.10: Percentage change in MAPE of GIM with respect to (a) IM and (b) ART. (Negative change implies increase in error.)

Figure 5.11: Lift in MAPE for LIMs w.r.t. (a) IM and (b) ART. Positive lift is observed w.r.t. ART beyond Top 8. (Positive lift indicates reduction in MAPE.)

We also found that as the compression ratio (eqn.
5.2) was increased from 5 to 30, the increase in MAPE was only ~1%. GIM is able to provide a practical solution using real-time values from only a small fraction of sensors, thus achieving a high compression ratio.

5.7 Summary

We addressed the partial data problem in sustainability domain applications that arises when data from all sensors is not available at central nodes in real time, either due to network latency or data volume, or when transmission is limited by the consumers for security and privacy reasons. Standard models for short-term predictions are either unable to predict or perform poorly when trying to predict with partial data. We proposed novel influence-based models to make predictions using real-time data from only a few influential sensors, while still providing performance comparable to or better than the baseline. Thus, we provided a practical alternative to canonical methods, which assume real-time data availability for all sensors, for dealing with missing real-time readings in sensor streams. These models are generalizable to applications in several domains, and provide a simple and interpretable solution that is easy to understand and apply for domain experts.

Figure 5.12: Lift in MAPE for GIMs w.r.t. (a) IM and (b) ART. Positive lift is observed w.r.t. ART beyond Top 12. (Positive lift indicates reduction in MAPE.) In (b), only ~0.5% increase in prediction error over ART is witnessed while using just the top 8 (~7%) of smart meters.

Future extensions of this work are towards a two-stage process for influence discovery, guided by heuristics, and towards a combination of local and global selection of influential sensors to further improve prediction performance. Another direction of research is scenarios where time series data from different sensors is not sampled at equally spaced time intervals, resulting in irregular time series.
Chapter 6

Prediction of Reduced Consumption

In the previous chapter, we discussed the problem of predicting energy consumption at fine granularity (i.e., in 15-minute intervals) for the next few hours in the absence of real-time data from all smart meters. This was done for normal consumption periods, outside the DR event window (Figure 6.1). In this chapter, we address the problem of reduced consumption prediction during the DR event window, when consumers reduce their energy consumption according to a prior agreement with the utility.

6.1 Introduction

The electricity consumption during DR (Figure 6.1) is called reduced consumption, as it is normally lower than what the profile would have been in the absence of DR. A number of factors affect the amount of reduction achieved during DR, such as day of week, day of year, reduction strategy, time of day, and external characteristics such as weather, special events, etc. Figure 6.2 shows a sample of real-world reduced consumption data.

Electric utilities can mine electricity consumption data collected by smart meters installed in buildings to learn about peak demand periods in buildings and opportunities for consumption reduction using Demand Response (DR).

Figure 6.1: Normal consumption, reduced consumption, and DR baseline vis-a-vis a DR event. Normal consumption occurs outside a DR event, while reduced consumption occurs within a DR event window. The DR baseline is the consumption that would have occurred in the absence of a DR event.

When collected from hundreds of thousands of diverse customers in a city area at a high granularity of every 15 minutes or less, the reduced consumption data forms big data and offers a unique opportunity to learn from a large and diverse dataset. Despite its importance to the success of DR programs, there are few existing studies on reduced consumption prediction. To the best of our knowledge, we are the first to address this problem from a big data perspective.
Utilities use reduced consumption prediction during DR for the following tasks:
• planning: estimating the extent of potential reduction during DR before the DR event occurs [31];
• dynamic DR: performing DR at a few hours' advance notice whenever necessitated by the dynamically changing conditions of the grid, such as due to the integration of intermittent renewable generation sources [11];
• customer selection: intelligently targeting customers for participation in DR based on a prediction of their reduced consumption and modifying such selection in real-time as needed [108];
• customer compensation: estimating the amount of incentives to be given to the customers [106].

Figure 6.2: Reduced consumption for two buildings during Demand Response period (shown in grey).

Table 6.1: Key characteristics of normal consumption, reduced consumption, and DR baseline

Prediction Task | Normal Consumption | Counterfactual DR Baseline | Reduced Consumption
Goal | Planning, DR | Curtailment calculation | Planning, DR, dynamic DR
Prior work | Several | Several | None
Timing | Outside the DR event | During the DR event | During the DR event
Historical data | Readily available | Readily available | Sparse or non-existent
Compute requirements | Off-line or real-time | Off-line | Real-time for dynamic DR
Profile changes | Gradual | Gradual | Abrupt at the DR event boundaries

Techniques that work well for normal consumption prediction, such as time series models, are ineffective for reduced consumption prediction due to the following challenges:
• Abrupt changes at DR event boundaries: While time series methods use recent observations to predict short-term future values, this is not applicable in the case of reduced consumption prediction due to the abrupt change in consumption at the DR event boundaries.
• Insufficient recent observations within the DR window: Within the DR window, there are insufficient recent observations available for a time series model to be trained reliably.
Also, as we mentioned in Chapter 5, while smart metering data is collected widely, communication and computation challenges have so far limited its availability in real time, making the applicability of time-series models challenging.
• Effect of DR event length: The models for reduced consumption prediction are expected to make predictions for the entire length of a DR event before the event begins. A time-series model would be ineffective due to the long prediction horizon, as time series models are known to have lower accuracy for longer horizons. Neither would iterative application of time-series models for predicting shorter horizons work, due to the insufficient number of recent data points, as mentioned above.

We therefore use historical data from past DR events as predictors for reduced consumption. The motivation is that electricity consumption has an element of periodicity related to human behavior, and consumption is expected to be similar on the same days and times of the week. For example, a person is likely to be at the same place on all Mondays at 10 AM, and therefore an emerging behavior can be recorded; here this implies that the kWh consumption of a building will be similar on Monday mornings even if occupied by multiple tenants or when hosting hundreds of office spaces. The consumption on Mondays at 10 AM can therefore be estimated as the average of past observations at the same time and day. However, usage is likely to differ by several minutes earlier or later due to natural irregularities in behavior (e.g., someone leaving home at 9:30 AM instead of 9:15 AM), which can adversely affect historical averaging models.

One challenge in reduced consumption prediction is that it is affected by several factors, such as the time of day and day of week, DR factors such as curtailment strategy and human behavior, as well as environmental factors such as temperature, and data about all of them may not be readily available.
Due to limited data availability, the small number of DR events per customer, and the diversity of customer types (e.g., residential versus office buildings), reduced consumption estimation remains a challenging task. Also, automated DR programs are sometimes aborted or modified midway during the DR window if they violate occupants' thermal comfort limits. This makes the task of learning from previous DR events even harder.

Our work brings much needed focus to reduced energy consumption prediction models for automated and dynamic demand response and other sustainability applications. In the smart grid, customers are playing an increasingly active role in managing their consumption and reducing consumption in response to requests from the utility, as well as to benefit from dynamic pricing or time-of-use pricing mechanisms being adopted by the utilities. With large scale realization of smart grids worldwide, millions of customers would partner with their local utilities to participate in demand response for achieving supply-demand balance. Cost-wise, it would be prohibitive for the utilities to build models tuned for individual customers. Such scenarios necessitate the development of cost-efficient prediction models, such as the one we propose in this chapter. Our key contributions in this study are:
• We use diverse predictors in a novel Reduced Electricity Consumption Ensemble (REDUCE), which uses different sequences of daily electricity consumption on DR event days, as well as contextual attributes, for reduced consumption prediction. The low computational complexity of our model makes it ideal for real-time decision making tasks such as dynamic demand response [11].
• We also propose BiDER, a Big Data Ensemble for Reduced electricity consumption prediction, which combines predicted values from three base models. While the three base models are learned individually for each customer, our key contribution is in using big data on reduced consumption from a large
While the three base models are learned individually for each customer, our key contribution is in using big data on reduced consumption from a large 96 number of customers to predict for diverse customers over dierent time intervals by learning a single ensemble model. The benefit of using a single model is huge in terms of cost reduction of the order of n◊ L, where n is the number of customers and L is the number of intervals in the DR period. The remainder of this chapter is organized as follows: Section 6.2 presents relatedworkandSection6.3providesformaldefinitionsandnotationsusedinsolv- ingthereducedconsumptionpredictionproblem. Sections6.4,6.5,and6.6explain our proposed models and Section 6.7 describes the experiments and presents an analysis of the results. Finally, conclusion is given in Section 6.10. 6.2 Related Work Electricity consumption prediction is studied in three contexts: 1) normal consumption,2) reduced consumption, and 3) DR baselines, which dier greatlyintermsoftheirscopeandcharacteristics(Table6.1). Whileelectricitycon- sumption is a widely studied problem [71], [72], [7], the problem of reduced energy consumptionpredictionisanewandopenproblemwithlittleexistingresearch[31]. Thiscanbeattributedtofactorssuchastheunavailabilityofreducedconsumption data; the human factors causing variance in response to DR; and cancellation of DR when found violating occupants’ thermal comfort limits. The utilities have so farfocusedmoreonpredictingnormalconsumptionorDRbaselines. DRbaselines are calculated during a DR event [81], [37] and estimate the amount of electricity that would have been consumed in absence of a DR event (Figure 6.1). They are counterfactual in that they give a theoretical measure of what the customer did not do, but would have done in absence of a DR event. Utilities generally use simple averaging models for DR baseline predictions due to their simplicity and 97 reduced computational requirements [37]. 
DR baselines are used to measure the extent of curtailment achieved during a DR event. Muchresearchhasbeenpreviouslydoneintheareaofensemble learning,where base models are first used to produce their respective predicted values, and then combined to achieve a better prediction model. It has been shown that ensembles improve upon prediction accuracy when their base models are diverse [64], [30]. In the energy domain, ensemble approach has previously been used for applications such as electricity demand forecasting [89], [53], photovoltaic output prediction [84], [68], and wind power forecasting [40]. However, our work is dierent from these in that our goal is to use predictions from base models for individual cus- tomerstolearnasinglelargeensemblemodelthatbenefitsfrombigdataproperties of diverse customers and is also more practical and scalable than building individ- ual ensembles tuned for each customer. 6.3 Preliminaries Definition 6.1. A DR event for a building is the period during which the build- ing’selectricityconsumptionisreduced(fore.g. byturningdevicesoorbyturning them down to a lower consumption setting than normal operation). ADReventisinitiatedbytheutility,whichengagescustomersbysendingthem participation signals, such as time-of-use pricing, critical peak pricing, variable peak pricing, real time pricing, and critical peak rebates. In direct load control programs, customers provide power companies the ability to cycle non-critical loads such as air conditioners and water heaters on and o during DR in exchange for financial incentives. DR is designed to reduce peak demand or avoid system emergencies in response to supply conditions. Current demand response schemes 98 targetthereductionofservicesaccordingtopreplannedloadprioritizationschemes during critical time frames. The reduction in electricity consumption during a DR event is also sometimes achieved by voluntary action of the customers. A day in which a DR event occurs is called DR day [106]. 
6.3.1 Consumption Sequences

Definition 6.2. A daily sequence of electricity consumption observations for a building on the i-th DR day is E_i = {e_{i,1}, e_{i,2}, ..., e_{i,J}}, where e_{i,j} is the observation at time interval j, and J is the number of intervals in a day.

For simplicity, we assume all data to be sampled at the same frequency; hence all daily sequences are of the same length J. The set of all daily sequences from DR days for a building is an I × J matrix, E = (E_1, E_2, ..., E_I)^T, where I is the number of DR days observed for the given building.

Definition 6.3. The pre-DR sequence E_{i,1,d-1} = {e_{i,1}, e_{i,2}, ..., e_{i,d-1}}, with d > 1, is a subsequence of E_i beginning at interval 1 and ending just before d, the interval at which a DR event begins.

Definition 6.4. The in-DR sequence E_{i,d,L} = {e_{i,d}, e_{i,d+1}, ..., e_{i,d+L-1}}, with d > 1 and d + L − 1 ≤ J, is a subsequence of E_i that begins at time interval d, the interval at which a DR event begins, and is L intervals long.

6.3.2 Contextual Attributes

Electricity consumption is impacted by physical as well as human activity driven factors; hence in our models we consider two types of contextual attributes: time-series attributes that are defined for each time interval of the day, and static attributes that remain the same for all intervals of a day. Examples of the former include temperature, humidity, dynamic pricing, occupancy, etc., while examples of the latter include day of week, holidays, etc.

Table 6.2: Notations used in reduced consumption modeling

Symbol | Description
I | Number of DR days observed
J | Number of observations in a day
L | Length of the DR event window
e_{i,j} | Electricity consumed on day i in interval j
E_i | Daily DR sequence for day i
E_{i,s,l} | Subsequence of E_i starting at s of length l
C_i | Daily context for day i
C_{i,s,l} | Subsequence of C_i starting at s of length l
A_i[k] | Vector of the k-th time series attribute for day i
B_i[k] | The k-th static attribute for day i
For N_t distinct time-series attributes and N_s static attributes, we define the following:

Definition 6.5. The daily context for a building on the i-th DR day is a tuple, C_i = ⟨A_i[1], ..., A_i[N_t], B_i[1], ..., B_i[N_s]⟩, where A_i[k] = {a_{i,1}, a_{i,2}, ..., a_{i,J}} is the k-th time series attribute and B_i[k] is the k-th static attribute.

Definition 6.6. The pre-DR context for a building on the i-th DR day is a tuple, C_{i,1,d-1} = ⟨A_i[1], ..., A_i[N_t], B_i[1], ..., B_i[N_s]⟩, where A_i[k] = {a_{i,1}, a_{i,2}, ..., a_{i,d-1}} is a subsequence of the k-th time series attribute from interval 1 to just before interval d when the DR begins, and B_i[k] is the k-th static attribute.

Our definition of contextual attributes is generalizable to any number of arbitrary attributes. However, the selection of relevant attributes is a non-trivial problem, often dictated by the knowledge of domain experts or by empirical evidence [12]. Table 6.2 summarizes the notations used in this chapter. We are finally in a position to formally define the problem of reduced electricity consumption prediction.

6.3.3 Problem Definition

We formulate the problem of predicting reduced electricity consumption for a building during a DR event as the problem of calculating the values of the in-DR sequence E_{i,d,L} for DR day i, given the pre-DR sequence E_{i,1,d-1} and pre-DR context C_{i,1,d-1} for the day i, and the set of daily sequences E and daily contexts C from the historical data.

6.4 Base Models

We first describe the three base models used later in our ensemble models. These base models respectively account for in-DR similarity of electricity consumption sequences, pre-DR similarity of consumption sequences, and all-day similarity of consumption sequences. The motivation for using diverse base models is the intuition that each of them might perform better in different parts of the DR window. Also, it has been shown previously [30], [97], [64] that ensembles improve upon prediction accuracy when their base models are diverse.
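The sequence notation of Definitions 6.2-6.4 reduces to simple array slicing. A minimal sketch (using 0-based Python indexing for the 1-indexed intervals in the text; the data values are illustrative):

```python
def pre_dr_sequence(daily, d):
    # Intervals 1 .. d-1 (1-indexed): everything before the DR event
    # begins at interval d.
    return daily[:d - 1]

def in_dr_sequence(daily, d, L):
    # L intervals starting at the DR start interval d (1-indexed).
    return daily[d - 1:d - 1 + L]

# Toy day with J = 8 intervals; the DR event starts at d = 5 and lasts L = 3.
daily = [1.0, 1.1, 1.2, 1.3, 0.9, 0.8, 0.85, 1.2]
pre = pre_dr_sequence(daily, d=5)        # e_{i,1} .. e_{i,4}
in_dr = in_dr_sequence(daily, d=5, L=3)  # e_{i,5} .. e_{i,7}
```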
6.4.1 IDS: In-DR Sequence Model

IDS is similar to the approach used by the utility for predicting DR baselines. While utilities average over a set of past similar (non-DR) days, IDS averages all in-DR sequences from past DR days. Thus, the in-DR sequence for each building during the i-th DR day is given by:

[Ê_{i,d,L}]_IDS = (1/|E|) Σ_{ε=1}^{|E|} E_{ε,d,L}   (6.1)

This model offers two key advantages: 1) low computational complexity, as computation time is independent of the length of the DR event and the size of the historical data, making it ideal for real-time predictions for dynamic DR, and 2) it is a univariate model that only depends on electricity consumption values and does not require additional variables, which would increase data collection costs [14].

6.4.2 PDS: Pre-DR Sequence Similarity Model

This model considers the contextual attributes and the electricity consumption values on the DR day before the beginning of DR, for selecting "similar" DR days from the past data. Our hypothesis is that if two DR days have similar pre-DR sequences, their in-DR sequences will be similar. Thus, for each DR day ε in the historical data we first form a tuple of pre-DR sequence and pre-DR context, ⟨E_{ε,1,d-1}, C_{ε,1,d-1}⟩. Similarly, we form a tuple for the given DR day i. The similarity score between each DR day ε in the historical data and the given DR day i is given by

SimScore(ε, i) = sim(⟨E_{ε,1,d-1}, C_{ε,1,d-1}⟩, ⟨E_{i,1,d-1}, C_{i,1,d-1}⟩)   (6.2)

where sim can be any similarity measure. Next, we sort historical days based on their similarity score to DR day i in descending order. We then predict the in-DR sequence on a given day as a weighted average of historical in-DR sequences, such that higher weights are assigned to days with a higher similarity score. The weights are chosen to exponentially decrease with decreasing similarity score.
The predicted in-DR sequence is given by

[Ê_{i,d,L}]_PDS = (1/|E|) Σ_{ε=1}^{|E|} ω_ε × E_{ε,d,L}   (6.3)

where the weights ω_ε = exp(−λε), and 0 < λ ≤ 1 is the decay rate that determines the rate of decrease of weights with decreasing similarity score.

6.4.3 DSS: Daily Sequence Similarity Model

This model considers the entire daily sequences and contexts in the historical data to first discover clusters of daily profiles for each building. We define daily profiles P_ε = ⟨E_ε, C_ε⟩ for each building to consist of tuples of daily sequences E_ε and daily contexts C_ε for each DR day ε in the historical data. We cluster the daily profiles using k-means clustering [43] into N_k clusters, C = {C_1, C_2, ..., C_{N_k}}. The number of clusters N_k is estimated by minimizing the within-cluster sum of squares. The centroid c_m of each cluster C_m can be interpreted as the characteristic profile of the cluster:

c_m = (1/N_k) Σ_{ε=1}^{N_k} P_ε   (6.4)

For a given day i, we calculate the probability of i belonging to cluster C_m using the pre-DR part of the daily profile for the i-th day and finding its similarity to the pre-DR part of the centroid's profile:

P(i ∈ C_m) = (1/α) · 1/‖P_{i,1,d-1} − P_{c_m,1,d-1}‖²   (6.5)

where α is a constant used to normalize the probability values between 0 and 1:

α = Σ_{m=1}^{N_k} 1/‖P_{i,1,d-1} − P_{c_m,1,d-1}‖²   (6.6)

The in-DR sequence of DR day i can then be calculated as the weighted sum of characteristic in-DR sequences, with weights equal to the probability of DR day i belonging to the respective clusters, as follows:

[Ê_{i,d,L}]_DSS = (1/N_k) Σ_{m=1}^{N_k} P(i ∈ C_m) × E_{c_m,d,L}   (6.7)

The clustering step is performed only once for each building, on historical data whose size is small. However, the clustering step can be repeated periodically as more historical data is accumulated.
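Equations 6.1 and 6.3 can be sketched as follows on toy data (the PDS weights here decay with similarity rank, one plausible reading of the rank-ordered exponential weighting described above):

```python
import numpy as np

def ids_predict(in_dr_history):
    # IDS (eqn 6.1): plain average of the in-DR sequences of past DR days.
    return np.mean(in_dr_history, axis=0)

def pds_predict(in_dr_history, sim_scores, decay=0.5):
    # PDS (eqn 6.3, simplified): weighted average of historical in-DR
    # sequences, with weights decaying exponentially in similarity rank
    # so that the most similar past day contributes the most.
    order = np.argsort(sim_scores)[::-1]           # most similar first
    weights = np.exp(-decay * np.arange(len(order)))
    weights /= weights.sum()                       # normalize to sum to 1
    return weights @ np.asarray(in_dr_history)[order]

# Three past DR days, L = 4 in-DR intervals each (toy kWh values)
history = [[10.0, 9.0, 9.5, 10.5],
           [12.0, 11.0, 11.5, 12.5],
           [11.0, 10.0, 10.5, 11.5]]
sims = [0.9, 0.2, 0.6]  # pre-DR similarity of each past day to the target day
ids_hat = ids_predict(history)
pds_hat = pds_predict(history, sims)
```

With these inputs, the PDS estimate is pulled toward the first day (similarity 0.9), while IDS weights all three days equally.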
6.5 Reduced Electricity Consumption Ensemble (REDUCE)

We propose the Reduced Electricity Consumption Ensemble (REDUCE), which learns to combine outputs from the three base models using a Random Forest approach [26] (implemented using the randomForest R package [67]) to perform reduced consumption prediction. While a single regression tree can also be used to combine predictions from various base models, it is not found to provide the best performance on out-of-sample data. In contrast, a random forest ensemble combines fits from many trees to give an overall fit. Random Forests grow multiple decision trees using randomly selected subsets of features and samples. In particular, each tree is grown using samples drawn with replacement from the training data.

To train the ensemble, we form a set of predictor tuples ([Ê_{i,t,1}]_IDS, [Ê_{i,t,1}]_PDS, [Ê_{i,t,1}]_DSS), where [Ê_{i,t,1}]_m is the value predicted by base model m for interval t on day i, and a corresponding set of responses E_{i,t,1} from the observed values for each day i. The predictor tuples are taken from the training data, and separate random forest models are then trained with these predictors and their corresponding responses for each customer and for each time interval during the DR event window. Thus, if there are n customers and the DR event window is of length L, we train n × L ensemble models. The trained models are then used to make predictions corresponding to the predictor tuples in the test data.

6.5.1 Computational Complexity of REDUCE

The In-DR Sequence model, IDS, has O(1) run-time complexity. The Pre-DR Sequence Similarity model, PDS, involves a one-time step of sorting historical days based on similarity, with O(n log n) complexity, where n is the number of days in the historical data. Thereafter, prediction with PDS is of O(1) complexity. The Daily Sequence Similarity model, DSS, involves k-means clustering as a one-time step, while prediction is of O(1) complexity.
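The combination step can be sketched with scikit-learn's RandomForestRegressor standing in for the R randomForest package used in the dissertation. The data below is synthetic; real predictor tuples would come from the IDS, PDS, and DSS predictions for each (day, interval):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Synthetic stand-in: each row is a predictor tuple (IDS, PDS, DSS
# prediction) for one (day, interval); the response is the observed
# reduced consumption (kWh). Each base model is simulated as the truth
# plus noise of a different magnitude.
observed = rng.uniform(8.0, 12.0, size=200)
X = np.column_stack([observed + rng.normal(0.0, s, 200)
                     for s in (0.5, 0.8, 1.0)])

ensemble = RandomForestRegressor(n_estimators=100, random_state=0)
ensemble.fit(X, observed)
fitted = ensemble.predict(X)
```

In the REDUCE setting one such model would be trained per customer and per in-DR interval; BiDER, described next, pools all of these rows into a single model.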
REDUCE uses the random forest method, with time complexity of O(n log n) for the training step. Prediction is of O(1) complexity. Given this low time complexity, our proposed model is ideally suited for making real-time predictions.

6.6 Big Data Ensemble for Reduced Consumption (BiDER)

We now present BiDER, a Big Data Ensemble for Reduced consumption prediction. The motivation behind BiDER is to learn a single ensemble of predictions for all customers and for all intervals during the DR period. This is an improvement over the REDUCE ensemble presented in Section 6.5, which was built individually for each customer and for each time interval during DR. The big data ensemble approach offers several advantages:
• By utilizing predictions for all cases, our proposed ensemble learning model benefits from big data features of reduced electricity consumption across all customers and all time intervals.
• An ensemble model averages out the various errors from individual models and thus leads to lower variance and robustness.
• A major contribution of BiDER is in terms of cost, as it provides a huge cost reduction over ensembles built for individual cases while achieving similar prediction performance. The cost improvement of our ensemble is by a factor of n × L, where n is the number of consumers and L is the length of the DR event window.

While BiDER might not provide the best prediction for every single consumer, unlike a fine-tuned model for each consumer, it provides comparable predictions for a large number of consumers at a fraction of the cost. In practical real-world scenarios, involving hundreds of thousands of consumers in an electric utility service area, it is not feasible to build individual ensemble prediction models for each consumer, and herein lies the key contribution of our work. As with REDUCE, we again use Random Forest ensemble learning [26], implemented using the randomForest R package [67], to build the BiDER ensemble.
Given the different base prediction models, the task here is to aggregate the predicted values from each base model into the final prediction. For each interval t in day i in the training data, we form a matrix of predictor tuples ([Ê_{i,t,1}]_IDS, [Ê_{i,t,1}]_PDS, [Ê_{i,t,1}]_DSS), where [Ê_{i,t,1}]_m is the value predicted by base model m for interval t on day i, and a corresponding set of responses E_{i,t,1} from the observed values for each day i. Unlike REDUCE, we now learn a single ensemble model using predictors from all intervals and from all customers. Thus, it provides an efficient approach, with a cost reduction of n × L, where n is the number of consumers and L is the length of the DR event window. We compare BiDER's prediction performance against the following models:

Individual Customer Ensemble for Reduced Consumption Prediction (ICER)

Here, individual ensemble models are learned for each customer. The predictors for each ensemble are derived from a building or customer's own data. The number of ensembles equals n, the number of buildings/customers.

Individual Time-interval Ensemble for Reduced Consumption Prediction (ITER)

Here, we go a step further to learn an individual ensemble model for each time interval during the DR period for each customer. Thus, we have n × L ensemble models in this case. Each ensemble is tuned for a particular interval and for a particular customer to achieve the best performance. This is similar to the REDUCE model described earlier.

In-DR Sequence Averaging Model (IDS)

This model was described previously in 6.4.1 and is selected for comparing our proposed models as it is a representative averaging method that utilities commonly use for baseline prediction [37, 107].

6.6.1 Computational Complexity of BiDER

We examine the computational complexity and cost-efficiency of ensemble learning, and the benefits of learning a single model versus building individual models as in ICER and ITER described above.
The time complexity of the base models can be described as follows. IDS has O(1) run-time complexity. PDS involves a one-time step of sorting historical days based on similarity, with O(n log n) complexity, where n is the number of days in the historical data. After this initial step, prediction is of O(1) complexity. DSS involves k-means clustering as a one-time step; prediction is of O(1) complexity. The BiDER model uses the random forest method, with time complexity of O(n log n) for the training step. Prediction is of O(1) complexity. Thus, BiDER is well-suited for real-time predictions. The key cost-efficiency of BiDER comes from the fact that it involves building a single ensemble model as opposed to ICER and ITER, thus reducing cost by up to a factor of n × L, where n is the number of customers and L is the number of intervals in the DR period.

6.7 Experimental Setup

6.7.1 Dataset

Reduced consumption data was collected from 952 DR events (2012-2014) in the USC microgrid (Figure 6.4) [94]. The dataset contains a log of DR events that were performed for a diverse set of 32 buildings, including academic buildings with teaching and office space, residential dormitories, and administrative buildings. (Building names have been obfuscated for privacy.) Consumption reduction was achieved via DR strategies [82] that directly reduce the loads or alter temperature settings. The DR experiments were conducted while school was in session, allowing reduced consumption during DR to reflect the standard mode of operation. Apart from daily electricity consumption time-series and reduced consumption sequences, hourly temperature data was collected from the NOAA weather station located on the university campus. It was interpolated to 15-minute interval values in alignment with the electricity consumption time-series.

Figure 6.3 shows the DR events distribution between 2012 and 2014. The number of DR events across months and across buildings is not uniform.
The selection of buildings for putting under DR was done by the USC Facilities Management Services (FMS). Some buildings have had more than 40 DR events, while others were rarely selected. This results in a total of 952 events for different building-strategy pairs. Figure 6.4 shows the distribution of the number of DR events across buildings. For our experiments, we chronologically split the dataset per building in two parts: the initial 2/3 of events are used for training and the remaining 1/3 for testing.

Figure 6.3: Distribution of DR events over 3 years

6.7.2 Model Parameters

We use 15-minute granularity data, resulting in J = 96 intervals per day. The DR events occurred on weekdays during 1 PM (d = 54) to 5 PM, the peak load period designated by the local utility [Anonymized]. Thus, the length of DR events is set at L = 16. We use one time-series attribute (i.e., N_t = 1), temperature, and seven (N_s = 7) static attributes to represent day of week based on a simple 1-of-7 encoding scheme.

6.7.3 Evaluation

Our models are trained on the initial two-thirds of the dataset and tested on the remaining one-third, chronologically speaking. We use MAPE (Mean Absolute Percentage Error) for evaluation. As a relative measure, MAPE is independent of the scale of consumption of a building [14]. It is defined as MAPE = (1/n) Σ_{i=1}^{n} |(e_i − ê_i)/e_i|, where e_i and ê_i represent observed and predicted electricity consumption, respectively. IDS, being similar to the models popularly used by the utilities [37, 107], is used as the baseline for comparing the performance of the ensemble. We also use the reliability measure REL, as defined previously in Chapter 4, to evaluate the performance of our proposed ensemble model vis-a-vis the baseline.

Figure 6.4: Distribution of DR events across buildings

6.8 Analysis of REDUCE

Figure 6.6 shows the MAPE values for individual buildings, while Figure 6.5 shows the cumulative distribution function of the average MAPE of all buildings.
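The two evaluation measures used in this analysis can be sketched as follows (the REL shown here is a simplified fraction-below-threshold reading; the formal definition is in Chapter 4):

```python
def mape(actual, predicted):
    # Mean Absolute Percentage Error, as defined above.
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rel(actual, predicted, threshold=0.10):
    # Fraction of individual predictions whose absolute percentage error
    # falls at or below the threshold (10% per the domain experts' guidance).
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(e <= threshold for e in errors) / len(errors)

# Toy observed vs. predicted kWh values
actual = [10.0, 12.0, 8.0, 9.0]
predicted = [9.5, 13.0, 8.2, 10.5]
```

For these toy values, three of the four predictions fall within the 10% band, so REL is 0.75 while MAPE is about 8.1%.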
We observe that our ensemble REDUCE outperforms the baseline IDS for about 70% of the buildings. It also limits prediction error to ≤ 10% for over half of the buildings, which is considered highly reliable by domain experts [14]. Overall, it achieves an average error of 13.5% and standard deviation of 7.3%, which is an improvement of 8.8% over the baseline. The improvement can be attributed to the better prediction power of combining outputs from three heterogeneous base models, each of which exploits different properties of the historical data (Section 6.4).

Figure 6.5: Performance of REDUCE using CDF plot for MAPE

The baseline IDS performs reasonably well with 14.8% average error and 7.4% standard deviation (Figure 6.5), indicating that historical time-of-day averaging is a strong predictor for reduced electricity consumption, similar to normal electricity consumption prediction [10]. Although simple, it derives its predictive power from the errors being averaged out over the entire dataset, as well as from electricity consumption being strongly related to repetitive patterns of human activities. This repetition is more pronounced for campus buildings where activities are tightly coupled to class schedules. It is worth noting here that for residential customers, daily activity does not usually follow such strict schedules; hence, averaging models such as IDS may not perform well for them. We address this in the following section.

Figure 6.6: Performance of REDUCE across buildings

We also compare the performance of our ensemble model with the baseline IDS using the REL measure. We set the acceptable error threshold at 10%, as recommended for demand response by domain experts [14]. The REL measure essentially demonstrates how frequently the error values for individual predictions lie below the error threshold. Figure 6.7 shows the cumulative distribution function of average REL values of all buildings.
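The REL measure as described above, i.e., how frequently individual prediction errors fall below the acceptable threshold, can be sketched as a simple fraction. (Chapter 4 gives the formal definition; the error values below are hypothetical.)

```python
def rel(errors, threshold=0.10):
    """Fraction of individual prediction errors at or below the threshold.
    Higher is better: it measures how often predictions are acceptably reliable."""
    return sum(1 for e in errors if e <= threshold) / len(errors)

# Hypothetical per-event MAPE values for one building.
errors = [0.05, 0.08, 0.12, 0.09, 0.20, 0.04, 0.11, 0.07]
print(f"REL = {rel(errors)}")  # 5 of the 8 errors are <= 10%
```

Unlike average MAPE, REL is insensitive to how far the bad predictions miss; it only counts how often predictions stay within the tolerance that matters to the DR operator.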
As higher values are preferred, we conclude that our ensemble REDUCE outperforms the baseline IDS for a majority of the buildings.

6.8.1 Effect of Schedule

We examine two types of buildings: 1) scheduled buildings, where occupants' activities are driven by schedules, for example, buildings with classrooms on a university campus, and 2) non-scheduled buildings, where occupants' activities are not driven by schedules. For B15, B21, B28, and B29, which are non-scheduled, REDUCE gives superior performance (Figure 6.6). IDS does not perform well here as there are no significant repetitive patterns in human activity such as those found in classroom-only buildings where activities are tightly coupled to class schedules.

Figure 6.7: Performance of REDUCE using CDF plot for REL

To further assess this difference, we analyze three buildings: B21, a building with large computer labs and faculty and graduate student offices; B28, a campus center building with large meeting spaces and a grand ballroom with seating for over 1000 people; and B14, an academic building with classrooms and a few office spaces. Figures 6.8a and 6.9a show that REDUCE gives low error for both B21 and B28. The error for B21 is low for all days of the week but one (Figure 6.8b), while for B28 it is low for all days of the week (Figure 6.9b). We observe similar behavior across seasons (Figures 6.8c and 6.9c). On the contrary, IDS outperforms REDUCE for B14 (Figure 6.10a). REDUCE performs more consistently on Tuesdays, Thursdays, and Fridays, as compared to IDS (Figure 6.10b).
It is notable that in Fall, when classes are scheduled, IDS performs well; however, in Summer, when few classes are offered and a variety of events may be occurring, REDUCE outperforms IDS (Figure 6.10c).

Figure 6.8: MAPE values for B21, a building with large computer labs and faculty and graduate student offices: (a) for all test days, (b) by day of week, (c) by seasons

Figure 6.9: MAPE values for B28, campus center with meeting and event spaces and a grand ballroom: (a) for all test days, (b) by day of week, (c) by seasons

Insight 1: REDUCE gives superior performance when applied to buildings which do not follow a tight schedule. As a corollary, we expect REDUCE to achieve similar performance for residential buildings, where human activities do not follow strict schedules, and hence the performance of averaging models such as IDS will deteriorate.

6.8.2 Effect of Training Data Size

Contrary to our ensemble REDUCE, the performance of IDS deteriorates with increasing size of the training data, which can be attributed to noise being introduced into the training dataset (Figure 6.11).

Figure 6.10: MAPE values for B14, an academic building with a large proportion of classrooms and some faculty offices: (a) for all test days, (b) by day of week, (c) by seasons

Figure 6.11: Performance of REDUCE with respect to training data size

Insight 2: The performance of REDUCE is not sensitive to the training data size. As a corollary, REDUCE would allow accurate predictions to be made with less historical data, which is useful for new buildings as well as for reducing computational and storage requirements.

Figure 6.12: Performance of REDUCE with respect to average consumption

6.8.3 Effect of Variance in Consumption

We observe that prediction error decreases with increasing average consumption for our ensemble REDUCE model, while it does not change for IDS (Figure 6.12).
This could be attributed to more stable and predictable behavior for larger buildings, though further investigation is needed to understand this behavior. Also, for smaller buildings, the electricity consumption values are small; so even when the predicted value is offset by a small amount, it translates to a large percentage error.

Insight 3: The performance of REDUCE slightly improves for larger buildings.

Figure 6.13: The effect of training window on prediction accuracy for (a) the IDS model, (b) the PDS model, and (c) the DSS model: 1) fixed window, 2) expanding window (ew), 3) moving window (mw).

6.9 Analysis of BiDER

6.9.1 Selecting the Training Period

The time series electricity consumption data is known to exhibit trend changes over time. Also, reduced consumption profiles for a customer change over time due to a variety of reasons, such as the effect of seasons, a change in the set of electrical devices involved in achieving consumption reduction, or the customers getting accustomed to DR and thus showing a change in their response to DR. These trend changes can adversely affect the prediction accuracy of a model, especially when the model was trained on data points taken from a while ago. To address this problem, we examine three different training periods: 1) fixed window; 2) expanding window (ew); and 3) moving window (mw). For the fixed window case, we use the initial 2/3 of the data in chronological order as the training period. In the expanding window, we start with the initial 2/3 of the data as the training period and incrementally add data points to increase the training window size. In the moving window, we slide the training window as new data points are seen, and drop off older data points. The last two types are examples of incremental learning, where learning happens whenever new examples emerge and the model is adjusted according to what has been learned from the new examples.

Figure 6.13a shows the effect of training window size on the IDS model's performance.
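The three training regimes just described can be sketched as index generators over a chronologically ordered event list. This is a minimal illustration; the function names, the one-event step size, and the default window width are assumptions, not taken from the dissertation:

```python
def fixed_window(n_events, train_frac=2/3):
    """Train once on the initial chronological portion; test on the rest."""
    split = int(n_events * train_frac)
    yield list(range(split))                 # training indices never change

def expanding_window(n_events, train_frac=2/3):
    """Start with the initial portion, then grow the window one event at a time."""
    split = int(n_events * train_frac)
    for end in range(split, n_events):
        yield list(range(end))               # window start stays pinned at 0

def moving_window(n_events, width=None, train_frac=2/3):
    """Slide a fixed-width window forward, dropping the oldest events."""
    width = width or int(n_events * train_frac)
    for end in range(width, n_events):
        yield list(range(end - width, end))  # oldest indices fall off the back

# With 9 events: fixed trains on events 0-5; moving always keeps the latest 6.
print(next(fixed_window(9)))        # [0, 1, 2, 3, 4, 5]
print(list(moving_window(9))[-1])   # [2, 3, 4, 5, 6, 7]
```

The expanding window never forgets (more data, possibly stale trends), while the moving window trades history for recency; the experiments below test whether either trade-off pays off over the fixed window.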
We do not observe any significant improvement with the expanding window over the fixed window, so we select the fixed window model for subsequent ensemble learning. For PDS (Figure 6.13b) we observe a slight reduction in MAPE with the expanding window; however, PDS gives very low weight to dissimilar days, so in effect not all days are useful in the expanding window. So, we stick with the fixed window model. Finally, for the DSS model (Figure 6.13c) also, we do not observe performance improvement with the expanding or moving window, so we choose the simplest fixed window implementation. One implication of this result is that big data (from an expanding window) may not always yield better results. It is critical that big data learning models carefully decide how much training data is useful for a particular application.

6.9.2 BiDER Results

In Figure 6.14, we observe that the best results are achieved with the ITER model, which involves an ensemble model for each prediction interval of the DR period for each customer. While it gives the best performance, it is not a practical solution for real-world scenarios involving hundreds of thousands of customers. We observe that our proposed model BiDER is able to outperform both IDS and the ICER model. It is a single model that performs almost as well as ITER (shown in Figure 6.15), at a fraction of the cost. Thus, we have demonstrated how the use of big data in an ensemble learning model improves performance as well as reduces the cost of building the models.

6.10 Summary

We addressed the reduced consumption prediction problem that is relevant for successful implementation of dynamic demand response (DR) by the electric utilities. Standard models for electricity consumption prediction, such as time series models, are unsuitable for this problem due to abrupt changes in the consumption profile at the beginning and end of DR events.
We proposed novel ensemble models to make predictions using pre-DR, in-DR, and all-day consumption sequences, to provide superior performance. Also, our models provide a simple and generalizable approach allowing domain experts to integrate a variety of contextual attributes that could affect reduced electricity consumption.

Figure 6.14: Performance of IDS, ITER, ICER, and BiDER models

The REDUCE ensemble model achieved an average error of 13.5%, which is an improvement of 8.8% over the averaging-based baseline approach. With low computational complexity, REDUCE provides a practical solution that can be applied for real-time prediction. Results indicated that REDUCE is particularly relevant for: 1) buildings for which electricity consumption does not follow a strict schedule (i.e., absence of periodic activities), and 2) buildings with less historical DR data. We believe that our results and insights set the foundation for future modeling and practice of DR programs in the smart grid domain.

Figure 6.15: Performance of IDS, ITER, ICER, and BiDER using CDF plots

We also learned an ensemble, BiDER, using reduced consumption values from all buildings, thus forming a big dataset of reduced electricity consumption. BiDER delivered performance similar to local ensembles learned for individual consumers, while reducing cost by a factor of n × L, where n is the number of customers and L is the number of intervals in the DR period. The low computational complexity of BiDER makes it ideal for use in real-time predictions and decision making for dynamic DR applications in smart grids.

Chapter 7

Conclusions

By leveraging real-world big data in the smart grid domain, this thesis addressed the problem of developing prediction models to support dynamic decision making.
A prime example of dynamic decision making that we addressed in this thesis is dynamic demand response, whereby utilities must decide when, by how much, and how to reduce consumption in the next few minutes to few hours, as dictated by the dynamically changing conditions in the grid. Other examples of dynamic decision making in smart grids include the addition and removal of distributed energy generation and storage units as per the current state of the grid, and deciding prices in the electricity markets.

7.1 Contributions

The main contributions of this thesis are summarized as follows:

• Chapter 3 advanced the state-of-the-art in demand response research by introducing the notion of dynamic demand response (D²R) and discussed the need and unique challenges for developing prediction models to support dynamic decision making in smart grids.

• Chapter 4 emphasized meaningful evaluation of prediction models' performance that goes beyond accuracy of predictions, with special consideration of the applications they are used for. We proposed a suite of performance measures defined along the dimensions of scale independence, reliability, and cost. These include eight innovative measures. We analyzed the usefulness of these measures by evaluating time series and regression tree prediction models applied to three key smart grid applications: planning, customer education, and DR, which span an entire spectrum of long-term to short-term predictions, and involve both utility and customer participation.

• Chapter 5 addressed the problem of prediction with partial data. We leveraged dependencies among time series sensor data for making short-term predictions with partial data. While time series dependencies have been used previously, the novelty of our work is in extending the notion of dependencies to discover influential sensors and using real-time data only from them to make predictions for all sensors.
Using real-world smart grid data from the USC campus microgrid, we empirically demonstrated that using partial data from only ∼7% of sensors, we achieved performance comparable to that achieved using real-time data from all sensors, with prediction error increasing only marginally, by ∼0.5%, over the baseline, thus demonstrating the effectiveness of our method for practical scenarios.

• Chapter 6 addressed the problem of reduced consumption prediction during a DR event. We proposed a novel low-complexity ensemble model that used sequences of daily electricity consumption on DR event days, as well as contextual attributes, for reduced consumption prediction. The low computational complexity of our model makes it ideal for real-time decision making tasks such as D²R. We also proposed a cost-efficient Big Data Ensemble for Reduced electricity consumption prediction (BiDER) that combines predicted values from three base models. Here, our key contribution is in using big data on reduced consumption from a large number of consumers to predict for diverse consumers over different time intervals by learning a single ensemble model. This is an improvement over the earlier approach that required building a separate ensemble for each customer and for each interval in the DR period. The benefit of using a single model is huge in terms of the reduction in the number of models trained, of the order of n × L, where n is the number of consumers and L is the number of intervals in the DR period. We evaluated our models on a large real-world reduced consumption dataset from the USC microgrid, achieving an average error of 13.5%, which is an 8.8% improvement over the baseline. For a majority of the buildings, our proposed model kept the prediction error under 10%, which is considered highly reliable by domain experts [14]. Our work is the first to address the problem of reduced consumption prediction and provides a foundation for future work.
Our models and results have important implications for researchers and practitioners in the smart grid domain. The research work presented in this thesis has been published in, or is under submission to, leading computer science conferences and journals such as the AAAI Conference on Artificial Intelligence and the IEEE Transactions on Knowledge and Data Engineering (TKDE). Our novel work on D²R lays the foundation for future work on prediction models for generation DR and reduced consumption prediction. Our work on prediction with partial data is generalizable to several domains where decision making requires real-time data that is not always available from all sensors. Our work on evaluation measures is relevant for researchers working on similar problems in other domains; many of our proposed measures can be correspondingly defined for prediction problems in other domains.

The prediction models proposed in this dissertation have been deployed in the USC campus microgrid. They are being used by the USC Facilities Management Services (FMS) in their Demand Decision Support (DDS) module to support dynamic decision making for D²R [48]. The FMS is also working with the Los Angeles Department of Water and Power (LADWP) on deploying our models at a large scale in the city of Los Angeles.

7.2 Future Work

Our formulation of the problem of dynamic demand response provides the basis for defining next-generation DR technologies, such as real-time DR. With advancements in sensing methods, data integration techniques, and increasing processing power, utilities will transition towards real-time DR, where decisions about DR timing, duration, and participation will be made in almost real time. Future work in this direction can benefit greatly from our work, especially in identifying challenges and requirements for prediction models for real-time DR.
In the prediction with partial data problem, future extensions of our work could include using a two-stage process for influence discovery, the first stage being guided by heuristics to restrict the set of eligible sensors from which the influential ones are selected in the second stage. A combination of local and global selection of influential sensors could also be explored to improve prediction performance. Another direction of research is for scenarios when time series data from different sensors is not sampled at equally spaced time intervals, resulting in irregular time series. Some researchers have addressed the problem of causality in irregular time series [19], and those results could be leveraged for the partial data problem as well.

For the reduced consumption prediction problem, future work could involve building more robust models that take into account consumer behavior and preferences for participation in a DR event, as well as the effect of special events, such as sporting and entertainment events that involve large-scale participation of customers and alter electricity consumption considerably. Such work would however require relevant data to be available to researchers. Another potential research direction for extending our work could be to explore causal relations among pre-DR sequences of individual buildings and learn additional causality-based features to be used in ensemble models for reduced consumption prediction. Also, there is potential to develop efficient methods for updating the prediction models online using streaming real-time data.

Reference List

[1] CAISO Demand Response User Guide, Guide to Participation in MRTU Release 1, Version 3.0. Technical report, California Independent System Operator (CAISO), 2007.

[2] Emergency Demand Response Program Manual, Sec. 5.2: Calculation of Customer Baseline Load (CBL). Technical report, New York Independent System Operator (NYISO), 2010.
[3] 10-Day Average Baseline and "Day-Of" Adjustment. Technical report, Southern California Edison, 2011.

[4] Weather Underground. 2013.

[5] N. Addy, S. Kiliccote, J. Mathieu, and D. S. Callaway. Understanding the effect of baseline modeling implementation choices on analysis of demand response performance. In ASME 2012 International Mechanical Engineering Congress and Exposition, 2013.

[6] H. K. Alfares and M. Nazeeruddin. Electric load forecasting: literature survey and classification of methods. International Journal of Systems Science, 33(1), 2002.

[7] C. Alzate and M. Sinn. Improved electricity load forecasting via kernel spectral clustering of smart meters. In IEEE International Conference on Data Mining, 2013.

[8] S. Aman, C. Chelmis, and V. Prasanna. Influence-driven model for time series prediction from partial observations. In AAAI Conference on Artificial Intelligence, 2015.

[9] S. Aman, M. Frincu, C. Chelmis, M. Noor, Y. Simmhan, and V. Prasanna. Empirical Comparison of Prediction Methods for Electricity Consumption Forecasting. CS Department Technical Report 14-942, University of Southern California, 2014.

[10] S. Aman, M. Frincu, C. Chelmis, M. Noor, Y. Simmhan, and V. Prasanna. Empirical Comparison of Prediction Methods for Electricity Consumption Forecasting. CS Department Technical Report 14-942, University of Southern California, 2014.

[11] S. Aman, M. Frincu, C. Chelmis, M. Noor, Y. Simmhan, and V. Prasanna. Prediction models for dynamic demand response: Requirements, challenges, and insights. In IEEE International Conference on Smart Grid Communications, 2015.

[12] S. Aman, Y. Simmhan, and V. Prasanna. Improving energy use forecast for campus micro-grids using indirect indicators. In IEEE ICDM Workshop on Domain Driven Data Mining (DDDM), 2011.

[13] S. Aman, Y. Simmhan, and V. Prasanna. Energy Management Systems: State of the Art and Emerging Trends. IEEE Communications Magazine, Ultimate Technologies and Advances for Future Smart Grid (UTASG), 2013.

[14] S. Aman, Y.
Simmhan, and V. Prasanna. Holistic measures for evaluating prediction models in smart grids. IEEE Transactions on Knowledge and Data Engineering, 27(2), 2015.

[15] N. Amjady. Short-term hourly load forecasting using time-series modeling with peak load estimation capability. IEEE Transactions on Power Systems, 2001.

[16] J. Armstrong and F. Collopy. Error measures for generalizing about forecasting methods: Empirical comparisons. International Journal of Forecasting, 8(1):69–80, 1992.

[17] A. Arnold, Y. Liu, and N. Abe. Temporal causal modeling with graphical granger methods. In International Conference on Knowledge Discovery and Data Mining (KDD'07). ACM, 2007.

[18] Z. Aung, M. Toukhy, J. Williams, A. Sanchez, and S. Herrero. Towards accurate electricity load forecasting in smart grids. In DBKDA 2012, The Fourth International Conference on Advances in Databases, Knowledge, and Data Applications, pages 51–57, 2012.

[19] M. T. Bahadori and Y. Liu. Granger causality analysis in irregular time series. In SIAM International Conference on Data Mining (SDM 2012). SIAM, 2012.

[20] N. Balac, T. Sipes, N. Wolter, K. Nunes, B. Sinkovits, and H. Karimabadi. Large scale predictive analytics for real-time energy management. In Big Data, 2013 IEEE International Conference on, pages 657–664, 2013.

[21] N. Balac, T. Sipes, N. Wolter, K. Nunes, R. S. Sinkovits, and H. Karimabadi. Large scale predictive analytics for real-time energy management. In IEEE International Conference on Big Data, 2013.

[22] M. Balijepalli and K. Pradhan. Review of demand response under smart grid paradigm. In IEEE PES Innovative Smart Grid Technologies, 2011.

[23] A. Barbato, A. Capone, L. Chen, F. Martignon, and S. Paris. A power scheduling game for reducing the peak demand of residential users. In Online Conference on Green Communications (GreenCom), 2013 IEEE, pages 137–142, Oct 2013.

[24] F. Bouhafs, M. Mackay, and M. Merabti. Links to the future: communication requirements and challenges in the smart grid. IEEE Power and Energy Magazine, 10(1), 2012.
[25] G. E. P. Box and G. M. Jenkins. Time Series Analysis, Forecasting and Control. Holden-Day, 1970.

[26] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.

[27] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Chapman and Hall, 1984.

[28] K. Bullis. Smart wind and solar power. MIT Technology Review, 2014.

[29] P. Chakraborty, M. Marwah, M. Arlitt, and N. Ramakrishnan. Fine-grained photovoltaic output prediction using a bayesian ensemble. In AAAI Conference on Artificial Intelligence, 2012.

[30] N. Chawla, S. Eschrich, and L. O. Hall. Creating ensembles of classifiers. In IEEE International Conference on Data Mining, 2001.

[31] C. Chelmis, S. Aman, M. Saeed, M. Frincu, and V. Prasanna. Predicting reduced electricity consumption during dynamic demand response. In AAAI Workshop on Computational Sustainability, 2015.

[32] B.-J. Chen, M.-W. Chang, and C.-J. Lin. Load forecasting using Support Vector Machines: a study on EUNITE competition 2001. IEEE Transactions on Power Systems, 19(4):1821–1830, 2004.

[33] C. Chong and S. P. Kumar. Sensor networks: Evolution, opportunities, and challenges. Proceedings of the IEEE, 91, 2003.

[34] A. Ciancio and A. Ortega. A distributed wavelet compression algorithm for wireless multihop sensor networks using lifting. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'05), 2005.

[35] K. Clement-Nyns, E. Haesen, and J. Driesen. The impact of charging plug-in hybrid electric vehicles on a residential distribution grid. IEEE Transactions on Power Systems, 25(1), 2010.

[36] K. Coughlin, M. A. Piette, C. Goldman, and S. Kiliccote. Estimating demand response load impacts: Evaluation of baseline load models for non-residential buildings in California. Technical Report LBNL-63728, Lawrence Berkeley National Lab, 2008.

[37] K. Coughlin, M. A. Piette, C. Goldman, and S. Kiliccote. Statistical analysis of baseline load models for non-residential buildings. Energy and Buildings, 41(4), 2009.
[38] J. C. Cuevas-Tello, P. Tiňo, S. Raychaudhury, X. Yao, and M. Harva. Uncovering delayed patterns in noisy and irregularly sampled time series: an astronomy application. Pattern Recognition, 43(3), 2010.

[39] A. Davydenko and R. Fildes. Measuring forecasting accuracy: The case of judgmental adjustments to SKU-level demand forecasts. International Journal of Forecasting, 29(3), 2013.

[40] R. de Aquino, T. Ludermir, A. Ferreira, O. Nobrega Neto, R. Souza, M. Lira, and M. Carvalho. Improving reservoir based wind power forecasting with ensembles. In IEEE International Conference on Systems, Man and Cybernetics (SMC), 2014.

[41] A. Deshpande, C. Guestrin, S. R. Madden, J. M. Hellerstein, and W. Hong. Model-driven data acquisition in sensor networks. In International Conference on Very Large Data Bases (VLDB 2004), 2004.

[42] B. Dong, C. Cao, and S. E. Lee. Applying support vector machines to predict building energy consumption in tropical region. Energy and Buildings, 37:545–553, 2005.

[43] S. A. Dudani. The distance-weighted k-nearest-neighbor rule. IEEE Transactions on Systems, Man and Cybernetics, 4, 1976.

[44] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. The Annals of Statistics, 32(2), 2004.

[45] S. Fan and R. Hyndman. Short-term load forecasting based on a semi-parametric additive model. IEEE Transactions on Power Systems, 27(1), 2012.

[46] S. Fan, C. Mao, J. Zhang, and L. Chen. Forecasting electricity demand by hybrid machine learning model. Neural Information Processing, LNCS, 4233:952–963, 2006.

[47] E. A. Feinberg and D. Genethliou. Load forecasting. Applied Mathematics for Restructured Electric Power Systems: Optimization, Control, and Computational Intelligence, pages 269–285, 2005.

[48] M. Frincu, C. Chelmis, S. Aman, M. Saeed, V. Zois, and V. K. Prasanna. Enabling automated dynamic demand response: From theory to practice. In ACM International Conference on Future Energy Systems (e-Energy), 2015.

[49] C. W. J. Granger.
Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3):424–438, 1969.

[50] C. W. J. Granger. Prediction with a generalized cost of error function. Operational Research Quarterly, 20:199–207, 1969.

[51] S. Haben, J. Ward, D. V. Greetham, C. Singleton, and P. Grindrod. A new error measure for forecasts of household-level, high resolution electrical energy consumption. International Journal of Forecasting, 30(2), 2014.

[52] M. Hagan and S. M. Behr. The time series approach to short term load forecasting. IEEE Transactions on Power Systems, 2(3), 1987.

[53] S. Hassan, A. Khosravi, and J. Jaafar. Neural network ensemble: Evaluation of aggregation algorithms in electricity demand forecasting. In International Joint Conference on Neural Networks (IJCNN), 2013.

[54] T. Hong, M. Gui, M. Baran, and H. Willis. Modeling and forecasting hourly electric load by multiple linear regression with interactions. In Power and Energy Society General Meeting, 2010 IEEE, pages 1–8, 2010.

[55] R. Hyndman and A. Koehler. Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4), 2006.

[56] R. J. Hyndman and Y. Khandakar. Automatic time series for forecasting: The forecast package for R. Technical report, Monash University, 2007.

[57] IBM Software Information. Managing big data for smart grids and smart meters. Technical report, IBM Corporation, 2012.

[58] International Energy Agency. World Energy Outlook. 2014.

[59] N. Jewell, M. Turner, J. Naber, and M. McIntyre. Analysis of forecasting algorithms for minimization of electric demand costs for EV charging in commercial and industrial environments. In Transportation Electrification Conference and Expo, 2012.

[60] B. Karimi, V. Namboodiri, and M. Jadliwala. On the scalable collection of metering data in smart grids through message concatenation. In IEEE International Conference on Smart Grid Communications (SmartGridComm), 2013.

[61] J. Z. Kolter and J. F. Jr.
A large-scale study on predicting and contextualizing building energy usage. In AAAI Conference on Artificial Intelligence, 2011.

[62] O. Kramer, B. Satzger, and J. Lassig. Power prediction in smart grids with evolutionary local kernel regression. Hybrid Artificial Intelligence Systems, LNCS, 6076:262–269, 2010.

[63] D. M. Kreindler and C. J. Lumsden. The effects of the irregular sample and missing data in time series analysis. Nonlinear Dynamics, Psychology, and Life Sciences, 10(2), 2006.

[64] L. I. Kuncheva and C. J. Whitaker. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning, 51(2):181–207, 2003.

[65] J. Kwac and R. Rajagopal. Demand response targeting using big data analytics. In IEEE International Conference on Big Data, 2013.

[66] D. Lachut, N. Banerjee, and S. Rollins. Predictability of energy use in homes. In International Green Computing Conference, 2014.

[67] A. Liaw and M. Wiener. Classification and regression by randomForest. R News, 2(3):18–22, 2002.

[68] E. Lorenz, J. Hurka, D. Heinemann, and H. Beyer. Irradiance forecasting for the power prediction of grid-connected photovoltaic systems. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2(1), 2009.

[69] A. C. Lozano, H. Li, A. Niculescu-Mizil, Y. Liu, C. Perlich, J. Hosking, and N. Abe. Spatial-temporal causal modeling for climate change attribution. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'09). ACM, 2009.

[70] A. Marascu, P. Pompey, E. Bouillet, O. Verscheure, M. Wurst, M. Grund, and P. Cudre-Mauroux. Mistral: An architecture for low-latency analytics on massive time series. In IEEE International Conference on Big Data, 2013.

[71] F. Martinez-Alvarez, A. Troncoso, J. C. Riquelme, and J. S. A. Ruiz. Energy time series forecasting based on pattern sequence similarity. IEEE Transactions on Knowledge and Data Engineering, 23(8), 2011.

[72] J. L. Mathieu, D. S. Callaway, and S. Kiliccote.
Variability in Automated Responses of Commercial Buildings and Industrial Facilities to Dynamic Electricity Prices. Energy and Buildings, 2011.

[73] P. McDaniel and S. McLaughlin. Security and privacy challenges in the smart grid. IEEE Security and Privacy, 7, 2009.

[74] E. McKenna, I. Richardson, and M. Thomson. Smart meter data: Balancing consumer privacy concerns with legitimate applications. Energy Policy, 41(C), 2012.

[75] C. Meek, D. M. Chickering, and D. Heckerman. Autoregressive tree models for time-series analysis. In 2nd International SIAM Conference on Data Mining (SDM). SIAM, 2002.

[76] H. Mori and A. Takahashi. Hybrid intelligent method of relevant vector machine and regression tree for probabilistic load forecasting. In IEEE International Conference and Exhibition on Innovative Smart Grid Technologies (ISGT Europe), 2011.

[77] NOAA. Quality Controlled Local Climatological Data Improvements/Differences/Updates. 2013.

[78] H. Y. Noh and R. Rajagopal. Data-driven forecasting algorithms for building energy consumption. In SPIE 8692, Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems, 2013.

[79] B. Pan, U. Demiryurek, and C. Shahabi. Utilizing real-world transportation data for accurate traffic prediction. In IEEE International Conference on Data Mining (ICDM'12), 2012.

[80] S. Park, S. Ryu, Y. Choi, and H. Kim. A framework for baseline load estimation in demand response: Data mining approach. In IEEE International Conference on Smart Grid Communications, 2014.

[81] S. Park, S. Ryu, Y. Choi, and H. Kim. A framework for baseline load estimation in demand response: Data mining approach. In IEEE International Conference on Smart Grid Communications, 2014.

[82] M. A. Piette, S. Kiliccote, and J. H. Dudley. Field Demonstration of Automated Demand Response for Both Winter and Summer Events in Large Buildings in the Pacific Northwest. Lawrence Berkeley National Lab Technical Report, LBNL-6216E, 2012.

[83] R. C. Prati, G. E. A. P. A. Batista, and M. C. Monard.
Asurveyongraphical methods for classification predictive performance evaluation. IEEE Trans- actions on Knowledge and Data Engineering,23(11),2011. [84] M. M. M. A. Prithwish Chakraborty1, 2 and . Naren Ramakrishnan1. Fine- grained photovoltaic output prediction using a bayesian ensemble. In AAAI Conference on Artificial Intelligence,2012. [85] S. D. Ramchurn, P. Vytelingum, A. Rogers, and N. R. Jennings. Putting the ’Smarts’ Into the Smart Grid: A Grand Challenge for Artificial Intelligence. Communications of the ACM,55(4),2012. [86] M. A. Razzaque, C. Bleakley, and S. Dobson. Compression in wireless sen- sor networks: A survey and comparative evaluation. ACM Transactions on Sensor Networks,10(1),2013. [87] P. Sanders, S. Schlag, and I. Muller. Communication ecient algorithms for fundamental big data problems. In IEEE International Conference on Big Data,2013. [88] F. Schweppe, B. Daryanian, and R. Tabors. Algorithms for a Spot Price Responding Residential Load Controller. IEEE Power Engineering Review, 9(5), 1989. [89] W. Shen, V. Babushkin, Z. Aung, and W. L. Woon. An ensemble model for day-ahead electricity demand time series forecasting. In International Conference on Future Energy Systems (ACM e-Energy),2013. [90] Y. Simmhan, S. Aman, B. Cao, M. Giakkoupis, A. Kumbhare, Q. Zhou, D. Paul, C. Fern, A. Sharma, , and V. Prasanna. An informatics approach to demand response optimization in smart grids. Technical report, 2011. [91] Y. Simmhan, S. Aman, A. Kumbhare, R. Liu, S. Stevens, Q. Zhou, and V.Prasanna. Cloud-basedsoftwareplatformfordata-drivensmartgridman- agement. IEEE/AIP Computing in Science and Engineering,2013. [92] Y. Simmhan, S. Aman, A. Kumbhare, R. Liu, S. Stevens, Q. Zhou, and V.Prasanna. Cloud-basedsoftwareplatformfordata-drivensmartgridman- agement. IEEE Computing in Science and Engineering,15,2013. 135 [93] Y. Simmhan and M. Noor. Scalable prediction of energy consumption using incremental time series clustering. 
In Workshop on Big Data and Smarter Cities, 2013 IEEE International Conference on Big Data,2013. [94] Y. Simmhan, V. Prasanna, S. Aman, S. Natarajan, W. Yin, and Q. Zhou. Toward data-driven demand-response optimization in a campus microgrid. In Proceedings of the Third ACM Workshop on Embedded Sensing Systems for Energy-Eciency in Buildings , pages 41–42. ACM, 2011. [95] Y. Simmhan, V. Prasanna, S. Aman, S. Natarajan, W. Yin, and Q. Zhou. Towards data-driven demand-response optimization in a campus microgrid. In ACM Workshop On Embedded Sensing Systems For Energy-Eciency In Buildings (BuildSys),2011. [96] M. Sokolova, N. Japkowicz, and S. Szpakowicz. Beyond Accuracy, F-Score and ROC: A family of discriminant measures for performance evaluation. In Australian Joint Conference on Artificial Intelligence,2006. [97] A.StrehlandJ.Ghosh. Clusterensembles—aknowledgereuseframeworkfor combining multiple partitions. The Journal of Machine Learning Research, 3:583–617, 2003. [98] L. F. Sugianto and X. Lu. Demand forecasting in the deregulated market: a bibliography survey. Australasian Universities Power Engineering Confer- ence, pages 1–6, 2002. [99] J. W. Taylor, L. M. de Menezes, and P. E. McSharry. A comparison of univariate methods for forecasting electricity demand up to a day ahead. International Journal of Forecasting, 22:1–16, 2006. [100] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B,58(1),1996. [101] L. Torgo and R. Ribeiro. Precision and recall for regression. In International Conference on Discovery Science,2009. [102] UNDP. Promotingenergyeciencyinbuildings: Lessonslearnedfrominter- national experience. Release Date: 2010, 2010. [103] U.S. Department of Energy. Grid 2030: A National Vision for Electricity’s Second 100 Years. Technical Report. 2003. [104] T.Verbraken,W.Verbeke,andB.Baesens. 
Anovelprofitmaximizingmetric for measuring classification performance of customer churn prediction mod- els. IEEE Transactions on Knowledge and Data Engineering,25(5),2013. 136 [105] K. Wagsta. Machine learning that matters. In International Conference on Machine Learning (ICML), pages 529–536, 2012. [106] T.K.Wijaya, M.Vasirani, andK.Aberer. Whenbiasmatters: Aneconomic assessment of demand response baselines for residential customers. IEEE Transactions on Smart Grid,5(4),2014. [107] T. K. Wijaya, M. Vasirani, and K. Aberer. When bias matters: An eco- nomic assessment of demand response baselines for residential customers. IEE Transactions on Smart Grid,2014. [108] H. Ziekow, C. Goebel, J. Struker, and H.-A. Jacobsen. The potential of smart home sensors in forecasting household electricity demand. In IEEE International Conference on Smart Grid Communications,2013. 137
Conceptually similar
Data-driven methods for increasing real-time observability in smart distribution grids
A function-based methodology for evaluating resilience in smart grids
Adaptive and resilient stream processing on cloud infrastructure
Model-driven situational awareness in large-scale, complex systems
Dynamic graph analytics for cyber systems security applications
Discrete optimization for supply demand matching in smart grids
Customized data mining objective functions
Machine learning for efficient network management
The smart grid network: pricing, markets and incentives
Modeling and recognition of events from temporal sensor data for energy applications
Scalable exact inference in probabilistic graphical models on multi-core platforms
Spatiotemporal traffic forecasting in road networks
Probabilistic data-driven predictive models for energy applications
Learning the semantics of structured data sources
Defending industrial control systems: an end-to-end approach for managing cyber-physical risk
Novel and efficient schemes for security and privacy issues in smart grids
Responsible AI in spatio-temporal data processing
On efficient data transfers across geographically dispersed datacenters
A complex event processing framework for fast data management
Integration of energy-efficient infrastructures and policies in smart grid
Asset Metadata
Creator: Aman, Saima (author)
Core Title: Prediction models for dynamic decision making in smart grid
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Computer Science
Publication Date: 02/24/2016
Defense Date: 11/30/2015
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: demand response, evaluation measures, Models, OAI-PMH Harvest, smart grid
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Prasanna, Viktor K. (committee chair), Raghavendra, Cauligi (committee member), Shahabi, Cyrus (committee member)
Creator Email: saimasuhail@gmail.com, saman@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c40-216694
Unique Identifier: UC11278488
Identifier: etd-AmanSaima-4169.pdf (filename), usctheses-c40-216694 (legacy record id)
Legacy Identifier: etd-AmanSaima-4169.pdf
Dmrecord: 216694
Document Type: Dissertation
Rights: Aman, Saima
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA