Residential electricity demand in the context of urban warming:
Leveraging high resolution smart meter data to quantify spatial and temporal patterns
in electricity consumption, cooling demand, and heat vulnerability
by
McKenna Peplinski
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ENVIRONMENTAL ENGINEERING)
August 2024
Acknowledgements
I would like to first thank my advisor, Dr. Kelly Sanders, for being a wonderful mentor
during my five years at USC. When I applied to USC’s PhD program as an undergraduate, I was
inspired by your work and passion and enthusiasm for energy and policy. This has remained true
throughout my time at USC. I have appreciated your guidance, support, and advice, and I would
not be the same researcher or writer without your thoughtful feedback. I also want to remember
Dr. George Ban-Weiss, whom I was lucky enough to have as a co-advisor during my first three
years at USC. From my first interview, he made the PhD process feel less daunting and was
always a great example of how to work hard but have fun. I am so grateful that I was able to
learn from him and be mentored by him.
I would also like to thank my qualifying and defense committee members, Dr. Sam Silva,
Dr. Jiachen Zhang, Dr. Lucio Soibelman, Dr. Burcin Becerik, and Dr. Erika Garcia, for
contributing to discussions about my research, providing helpful feedback, and challenging my
understanding of engineering, all of which shaped this body of work.
I am extremely grateful for all the labmates and friends that I have met at USC, including
all the current and previous lab members of the S3 and GBW groups. You all made USC a
welcoming place to work and study, and I have learned so much from each one of you. I want to
give a special thanks to Maile, my roommate, and Andrew and Stepp; I can’t imagine doing the
last five years with anyone else. I have benefited so much from our joint class projects, office
conversations, brainstorming sessions, nights at the Prince, boba runs, zooies cookie pickups,
etc., etc.
Finally, I want to give a big thank you to my parents and each of my brothers who have
always been there to support me unconditionally. I wouldn’t be who I am today without you.
Table of Contents
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1: INTRODUCTION
1.1 Motivations
1.2 Research Gaps
1.2.1 Predicting residential electricity demand
1.2.2 Using humid heat metrics to make estimates of AC ownership
1.2.3 Quantifying residential cooling demand
1.2.4 Estimating how responsive residential customers were to Flex Alerts
1.3 Structure of Document and Research Questions
CHAPTER 2: A MACHINE LEARNING FRAMEWORK TO ESTIMATE RESIDENTIAL ELECTRICITY DEMAND BASED ON SMART METER ELECTRICITY, CLIMATE, BUILDING CHARACTERISTICS, AND SOCIOECONOMIC DATASETS
2.1 Introduction
2.2 Methods
2.2.1 Datasets
2.2.2 Data preparation
2.2.3 Model training and evaluation
2.2.4 Feature selection
2.2.5 Spatiotemporal resolution
2.2.5 Feature Importance
2.3 Results and discussion
2.3.1 Model performance
2.3.2 Sequential feature selection
2.3.3 Feature importance
2.4 Conclusion and future work
CHAPTER 3: INVESTIGATING WHETHER THE INCLUSION OF HUMID HEAT METRICS IMPROVES ESTIMATES OF AC PENETRATION RATES: A CASE STUDY OF SOUTHERN CALIFORNIA
3.1 Introduction
3.2 Methods
3.2.1 Electricity Records
3.2.2 Weather Data and Heat Metrics
3.2.3 Statistical Model
3.2.4 Spatial Analysis
3.3 Results and Discussion
3.3.1 Differences in estimated AC Penetration Rates
3.3.2 Improving confidence in AC estimates
3.4 Conclusion
CHAPTER 4: REVEALING SPATIAL AND TEMPORAL PATTERNS OF RESIDENTIAL COOLING IN SOUTHERN CALIFORNIA THROUGH COMBINED ESTIMATES OF AC OWNERSHIP AND USE
4.1 Introduction
4.2 Literature review
4.2.1 Studies on AC ownership
4.2.2 Studies on AC use
4.2.3 Research gaps in the literature
4.3. Methodology
4.3.1 Dataset information and preprocessing
4.3.2 AC Ownership Algorithm and computation of AC Penetration Rate (Step 1 in Figure 1)
4.3.3 AC State Algorithm and computation of AC Operation Rate (Step 2 in Figure 1)
4.3.4 Calculation of Net AC Utilization (Step 3 in Figure 1)
4.4. Results and Discussion
4.4.1 Comparison of AC Penetration Rates with other studies
4.4.2 Tracking temporal patterns of AC Operation Rate
4.4.3 Spatial trends in AC Penetration Rate, AC Operation Rate, and Net AC Utilization
4.4.4 Net AC Utilization considering climate
4.5. Conclusion
CHAPTER 5: RESIDENTIAL ELECTRICITY DEMAND ON CAISO FLEX ALERT DAYS: A CASE STUDY OF VOLUNTARY EMERGENCY DEMAND RESPONSE PROGRAMS
5.1 Introduction
5.2 Methods
5.2.1 Datasets
5.2.2 Response Metrics
5.3 Results and Discussion
5.3.1 Level of response across Flex Alert days
5.3.2 Load profiles on Flex Alert days
5.3.3 Variation in response across residential customers
5.4 Conclusion
CHAPTER 6: CONCLUSION
REFERENCES
APPENDICES
A - SUPPLEMENTAL INFORMATION FOR CHAPTER 2
Section A1. Temporal variability in model performance
Section A2. Summary of daily, monthly, annual model results for all 11 initial models
B - SUPPLEMENTAL INFORMATION FOR CHAPTER 3
Section B1. Weather station distances
Section B2. Heat metric definitions and equations
Section B3. Segmented linear regression criteria examples
Section B4. Comparison of AC penetration rates acquired from previous studies
Section B5. Distribution of the daily average heat metric values in each of the study’s climate zones
Section B6. Humid heat metric distribution across temperature bins and climate zones
C - SUPPLEMENTAL INFORMATION FOR CHAPTER 4
Section C.1 Electric Heating Penetration Rates
D - SUPPLEMENTAL INFORMATION FOR CHAPTER 5
Section D1. Methods Flow Diagram
Section D2. Summary of socioeconomic Indicators
Section D3. Hourly load profiles
Section D4. Hourly percent change in demand
Section D5. Hourly load profiles and socioeconomic indicators
Section D6. Ramping response by hour of the Flex Alert period
List of Tables
Table 2-1. Summary of machine learning studies for residential electric load prediction.
Table 2-2. Full feature set
Table 2-3. Annual, monthly, daily results before sequential feature selection.
Table 2-4. Results of the pre-aggregation and post-aggregation training.
Table 2-5. Feature importance with annual, monthly, and daily data by household.
Table 3-1. Description of heat metrics used in this study.
Table 3-2. Summary of the study region’s averaged regression results for each heat metric.
Table 4-1. Transition matrix summarizing AC Penetration Rates and Net AC Utilization
Table 5-1. Summary of CAISO’s Flex Alerts from 2015-2016 and 2018-2020
List of Figures
Figure 2-1. Machine Learning Model Development Framework
Figure 2-2. Graphical representation of bootstrap resampling methods
Figure 2-3. Two data aggregation methods
Figure 2-4. Model performance measured by r² across the overall top 5
Figure 2-5. Model performance of the best models for each combination
Figure 2-6. Final feature set for top performing annual, monthly, and daily models
Figure 3-1. An example set of segmented linear regressions for one home
Figure 3-2. Choropleth maps depicting the difference between AC penetration rates
Figure 3-3. (a): Percentage of homes identified as having an AC
Figure 4-1. Overview of methodology for finding AC Penetration Rates
Figure 4-2. Hourly electricity consumption versus hourly ambient temperature
Figure 4-3. Top: Scatterplot of hourly electricity consumption and temperature
Figure 4-4. Choropleth maps depicting the difference between AC Penetration Rates
Figure 4-5. Stacked bar chart showing the breakdown of AC Operation Rates
Figure 4-6. Heat map depicting the average AC Operation Rate
Figure 4-7. Choropleth maps depicting the a) AC Penetration Rate,
Figure 4-8. Scatter plot of normalized Net AC Utilization versus CDD
Figure 5-1. Flex Period Response of the a) the residential SCE load and b) total SCE load
Figure 5-2. a-b) Normalized hourly electricity load profile
Figure 5-3. a-b) Choropleth map of the census tract level Flex Period Response
Figure 5-4. Hourly residential electricity load by a) income percentile
Abstract
Analyzing the tensions between global climate change and the power system is critical as rising
temperatures and electrification trends are expected to drive large increases in overall electricity
demand. These tensions have been especially evident on extreme heat days when demand for
cooling has overwhelmed grid resources, and in some cases, pushed the grid to failure. In the
residential sector, where a significant portion of end-use electricity consumption is attributed to
space cooling, a robust understanding of energy-climate interactions is critical to accurately
forecast demand under changing population, demographic, and climate scenarios. In this body of
work, we develop frameworks to better understand residential electricity use that 1) predict
household electricity consumption using machine learning models, 2) characterize regional AC
ownership and operation through statistical techniques, and 3) evaluate electricity demand during
extreme heat and demand response events. In the past, researchers have relied on coarse data to
model the relationship between electricity demand and its driving factors, but these datasets fail to
capture the highly variable behavior of residential customers. This dissertation utilizes a high
spatiotemporal resolution dataset consisting of smart meter electricity records for approximately
200,000 homes in the greater Los Angeles area, which enables highly resolved estimates of
residential electricity consumption and cooling demand across a particularly diverse study
region. The studies’ results can be used by grid operators and utilities to plan for future energy
needs, and by public health officials and environmental justice advocates to increase cooling
access in vulnerable communities. As high-resolution smart meter data become more accessible,
researchers can adopt these frameworks in new regions to study the electricity demand and
cooling behavior of the residential sector in the context of a warming climate.
Chapter 1: Introduction
1.1 Motivations
The residential sector is a significant consumer of energy, accounting for one-fourth of final
electricity consumption globally [1]. In the upcoming decades, we can expect to see increased
electricity demand from residential customers due to a combination of electrification trends,
higher standards of living, and rising temperatures [2], [3], [4], [5]. At the same time,
transforming and decarbonizing the power grid will complicate the challenge of balancing supply
and demand [6]. Thus, a robust understanding of residential electricity consumption is critical to
plan and manage electricity infrastructure while meeting domestic and international climate
goals.
In countries with high rates of air-conditioning (AC) ownership, cooling demand is a major
driver of residential electricity consumption. AC is a key adaptation tool that protects
populations from the consequences of extreme heat events, but rapidly growing AC adoption and
the increasing use of existing cooling services will have significant implications for the energy
landscape [7]. On hot summer days, peak electricity demand, which often coincides with the peak
demand for cooling, can push the electric grid to the brink and pose a high risk of power system
failure [44]–[48]. Therefore, characterizing patterns of AC ownership and use and analyzing
residential electricity demand in response to extreme heat is critical to ensure the electricity
needs of a region are met and to identify which communities are most vulnerable to rising
temperatures under climate change.
This body of work builds frameworks to predict both total residential electricity
consumption and cooling demand and evaluate residential grid flexibility during extreme heat
events. Together, these studies contribute a robust analysis of residential electricity demand, and
the methodologies and results described can be used for improved grid management in a
warming climate.
1.2 Research Gaps
Developing frameworks to estimate the residential sector’s electricity needs is difficult due
to the widely varying energy habits among customers that are best captured with highly granular
datasets, which are typically unavailable. This body of work analyzes the magnitude and patterns
of residential electricity demand in several ways including 1) predictions of household electricity
consumption, 2) estimates of who has access to cooling, 3) characterization of AC behavior, and
4) evaluation of the impact of a system-wide demand response (DR) program on consumption. We use a high
spatiotemporal resolution dataset consisting of household level smart meter records for ~200,000
homes in Southern California at 15-minute intervals. Southern California is an ideal location for
this body of research because of its wide-ranging climates, building stocks, and socioeconomic
status across a relatively small spatial extent. With our dataset, we can address the research
gaps that stem from data limitations while also exploring how multiple factors contribute to the
differences in demand.
1.2.1 Predicting residential electricity demand
Machine learning (ML) has become a popular tool for predicting electricity demand
because it can capture the complex non-linear dynamics of electricity consumption with higher
accuracy than statistical models and more simplicity than physics-based models [11], [12], [13].
Still, most electric load forecasting ML studies use either coarse resolution data (e.g., state or
regional) that cannot capture the highly varying demand of the residential sector or electricity
records from a subset of homes that is not statistically representative of the regional demand
[14], [15], [16]. We address this limitation through access to a highly granular, large-scale
dataset of smart meter electricity records.
This dissertation begins in Chapter 2, where I develop a generalized, repeatable model to
predict household-level electricity consumption using ML models and a combination of site
weather, building characteristics, and socioeconomic data. To explore the impact of spatial and
temporal data resolution on model performance, ML models were trained with household and
census tract level data at daily, monthly, and annual time scales. Feature selection and feature
importance algorithms were also implemented to select the set of variables that leads to the best
estimates and improve the model explainability. This study can serve as a framework of best
practices for energy domain experts interested in utilizing ML models to forecast electric load.
This work resulted in the following publication:
McKenna Peplinski, M. Chen, B. Dilkina, G.A. Ban-Weiss, K.T. Sanders (2024). “A
machine learning framework to estimate residential electricity demand based on smart meter
electricity, climate, building characteristics, and socioeconomic datasets.” Applied Energy, 357,
122413.
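The spatial and temporal resolutions described above can be illustrated with a minimal sketch that sums 15-minute readings per household at a daily, monthly, or annual scale and then averages household totals within each census tract. The function name `aggregate`, the record layout, and the choice of a tract-level mean are assumptions made for illustration; they are not taken from the Chapter 2 data preparation.

```python
from collections import defaultdict
from datetime import datetime


def aggregate(readings, tract_of, scale="daily"):
    """Sum interval kWh readings per household at the given time scale,
    then average household totals within each census tract.

    readings: iterable of (home_id, timestamp, kwh) tuples (hypothetical layout)
    tract_of: dict mapping home_id to a census tract identifier
    """
    fmt = {"daily": "%Y-%m-%d", "monthly": "%Y-%m", "annual": "%Y"}[scale]

    # Household-level totals keyed by (home_id, period label).
    household = defaultdict(float)
    for home_id, timestamp, kwh in readings:
        household[(home_id, timestamp.strftime(fmt))] += kwh

    # Tract-level mean of household totals (an assumed aggregation choice).
    by_tract = defaultdict(list)
    for (home_id, period), kwh in household.items():
        by_tract[(tract_of[home_id], period)].append(kwh)
    tract = {key: sum(vals) / len(vals) for key, vals in by_tract.items()}

    return dict(household), tract


# Toy 15-minute readings for two homes in the same tract.
readings = [
    ("home_a", datetime(2019, 7, 1, 0, 0), 0.25),
    ("home_a", datetime(2019, 7, 1, 0, 15), 0.5),
    ("home_a", datetime(2019, 7, 2, 0, 0), 0.5),
    ("home_b", datetime(2019, 7, 1, 0, 0), 1.25),
]
tract_of = {"home_a": "tract_1", "home_b": "tract_1"}

household_daily, tract_daily = aggregate(readings, tract_of, "daily")
household_monthly, tract_monthly = aggregate(readings, tract_of, "monthly")
```

Either output can then be joined with weather, building, and socioeconomic features before model training, which is where the choice of resolution would start to affect model performance.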
1.2.2 Using humid heat metrics to make estimates of AC ownership
Despite recent evidence that cooling demand is dependent on humidity conditions [17],
[18], [19], [20], most research to date uses temperature-derived variables as the only climate
indicators in predicting electricity demand [21], [22], [23], [24]. In Chapter 3 of this body of
work, I estimate rates of AC ownership across Southern California using a variety of humid heat
metrics. This is done with a segmented linear regression model, previously developed by Chen et
al. [25], that I adapt to model the relationship between residential electricity use and each heat
metric. Quantifying AC penetration rates is difficult because we lack a ground truth of residential
AC ownership to validate against. A major advantage of this method is that we can increase our
confidence in estimating AC penetration by evaluating estimates with multiple metrics. The
work in this chapter resulted in the following publication:
McKenna Peplinski, Peter Kalmus, K.T. Sanders. (2023). “Investigating whether the
inclusion of humid heat metrics improves estimates of AC penetration rates: a case study of
Southern California.” Environmental Research Letters, 18(10), 104054.
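As a rough illustration of the segmented regression idea, the sketch below fits a two-segment model in which electricity use is flat below a breakpoint and rises linearly with the heat metric above it, then flags a home as likely having AC when the fitted slope is large enough. This is a simplified stand-in, not the model of Chen et al. [25]: the hinge form, the grid search over candidate breakpoints, and the `min_slope` threshold are all assumptions.

```python
def fit_hinge(x, y, breakpoint):
    """Least-squares fit of y = base + slope * max(0, x - breakpoint)."""
    h = [max(0.0, xi - breakpoint) for xi in x]
    n = len(x)
    mean_h = sum(h) / n
    mean_y = sum(y) / n
    var_h = sum((hi - mean_h) ** 2 for hi in h)
    if var_h == 0:
        return None  # no observations above the breakpoint
    slope = sum((hi - mean_h) * (yi - mean_y) for hi, yi in zip(h, y)) / var_h
    base = mean_y - slope * mean_h
    sse = sum((yi - (base + slope * hi)) ** 2 for hi, yi in zip(h, y))
    return base, slope, sse


def segmented_fit(x, y, candidates):
    """Return (breakpoint, base, slope, sse) for the lowest-error breakpoint."""
    best = None
    for bp in candidates:
        fit = fit_hinge(x, y, bp)
        if fit is not None and (best is None or fit[2] < best[3]):
            best = (bp, *fit)
    return best


def has_ac(x, y, candidates, min_slope=0.5):
    """Flag a home as having AC if use rises steeply above the breakpoint.

    min_slope is a hypothetical threshold (kWh per unit of heat metric).
    """
    fit = segmented_fit(x, y, candidates)
    return fit is not None and fit[2] >= min_slope


# Toy daily data: heat metric values and daily kWh for two homes.
heat = list(range(15, 36))
cooling_home = [10.0 + max(0, t - 24) for t in heat]  # use rises above 24
flat_home = [10.0 for _ in heat]                      # no heat response
candidates = range(18, 31)

bp, base, slope, sse = segmented_fit(heat, cooling_home, candidates)
```

Because the same routine accepts any heat metric as `x`, it can be rerun with temperature and with each humid heat metric, and the resulting AC flags compared across metrics.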
1.2.3 Quantifying residential cooling demand
Studies that utilize electricity records to estimate AC ownership are valuable but are
limited in their ability to characterize the nature of cooling demand [10], [26]. For example,
quantifying the cooling demand of a region requires both detailed estimates of who has access to
AC and how they use their AC. In a three-part framework in Chapter 4, I first develop a novel
method to identify who has AC using hourly smart meter records, as opposed to daily aggregated
consumption, which can provide insight into a household’s intraday patterns of electricity use. In
the second part of Chapter 4’s framework, I then adapt a linear model developed by Dyson et al.
[10] to predict the hours in which a household with AC has it turned on. In the final step
of this methodology, I combine the AC penetration values with the predictions of AC use to
create a new metric called Net AC Utilization, which better captures the cooling behavior of a
region. Through this framework, temporal, spatial, and climatic patterns of AC use can be
characterized.
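One way to picture how ownership and use estimates might be combined is sketched below. The exact definition of Net AC Utilization is given in Chapter 4; the product of a penetration rate and an operation rate used here is an assumption for illustration only, as are the function names and the per-home record layout.

```python
def ac_penetration_rate(homes):
    """Share of homes flagged as having AC."""
    return sum(1 for home in homes if home["has_ac"]) / len(homes)


def ac_operation_rate(homes):
    """Share of observed hours in which homes with AC are predicted to be cooling."""
    hours_on = 0
    hours_total = 0
    for home in homes:
        if home["has_ac"]:
            hours_on += sum(home["ac_on_by_hour"])
            hours_total += len(home["ac_on_by_hour"])
    return hours_on / hours_total if hours_total else 0.0


def net_ac_utilization(homes):
    """Assumed combination: penetration rate times operation rate."""
    return ac_penetration_rate(homes) * ac_operation_rate(homes)


# Toy census tract: four homes, two with AC, four observed hours each.
tract = [
    {"has_ac": True, "ac_on_by_hour": [1, 1, 0, 0]},
    {"has_ac": True, "ac_on_by_hour": [1, 0, 0, 0]},
    {"has_ac": False, "ac_on_by_hour": [0, 0, 0, 0]},
    {"has_ac": False, "ac_on_by_hour": [0, 0, 0, 0]},
]

penetration = ac_penetration_rate(tract)
operation = ac_operation_rate(tract)
net = net_ac_utilization(tract)
```

Under this assumed form, a tract with high ownership but infrequent use and a tract with low ownership but heavy use can yield similar values, which is why the two component rates are also reported separately.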
1.2.4 Estimating how responsive residential customers were to Flex Alerts
The effectiveness of unincentivized, voluntary DR programs has been underexplored in
the literature. My research in Chapter 5 addresses this gap through a case study of Flex Alerts, a
voluntary, emergency demand response tool implemented by CAISO that asks customers to
reduce their load during peak hours in an effort to maintain grid stability during emergency
events [27]. These alerts, and other DR programs, are a promising strategy to avert grid crises
and continue meeting peak load without building out new generation and transmission sources.
In general, quantifying the effectiveness of DR programs is challenging because no ground
truth value exists for what the demand would have been in the absence of a DR event.
In this work, I define two new metrics to evaluate the effectiveness of Flex Alerts using smart
meter data records and total CAISO load data. In this analysis, we capture the variation in
response across the study region and on different Flex Alert days. The results also indicate which
subpopulations were more responsive to Flex Alerts, lending insight into how to develop a robust
DR program that offers more reliable grid flexibility. This study led to the following publication:
McKenna Peplinski, K.T. Sanders. (2023). “Residential electricity demand on CAISO Flex
Alert days: A case study of voluntary emergency demand response programs.” Environmental
Research: Energy, 1(1), 015002.
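The counterfactual problem described above is often approached by comparing load on an event day with a baseline built from comparable non-event days. The sketch below shows that generic idea only; it is not one of the two metrics defined in Chapter 5, and the function name, the choice of baseline days, and the alert window are hypothetical.

```python
def percent_change_vs_baseline(alert_day, baseline_days, alert_hours):
    """Percent change in load over alert_hours relative to the baseline mean.

    alert_day: list of 24 hourly loads on the event day
    baseline_days: list of 24-hour load lists from comparable non-event days
    alert_hours: iterable of hour indices covered by the alert
    """
    hours = list(alert_hours)
    alert_load = sum(alert_day[h] for h in hours)
    baseline_load = sum(
        sum(day[h] for h in hours) for day in baseline_days
    ) / len(baseline_days)
    return 100.0 * (alert_load - baseline_load) / baseline_load


# Toy data: flat 2.0 kWh/hour baseline, with a drop to 1.5 during the alert.
baseline_days = [[2.0] * 24, [2.0] * 24]
alert_day = [2.0] * 24
for hour in range(17, 21):  # hypothetical alert window
    alert_day[hour] = 1.5

change = percent_change_vs_baseline(alert_day, baseline_days, range(17, 21))
```

A negative value indicates lower load than the baseline during the alert window. The result is only as credible as the baseline days chosen, which is the difficulty the paragraph above points to.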
1.3 Structure of Document and Research Questions
This document is organized into six chapters. Chapters 2, 3, 4, and 5 correspond to the
research questions listed below. Chapter 6 summarizes and concludes this body of work. The
following research questions are addressed:
Chapter 2.
1. How useful are machine learning models in predicting residential electricity demand?
2. How does the spatiotemporal resolution of the electricity data impact the accuracy of
predictions?
3. Which features are most useful for predicting residential electricity use?
Chapter 3.
1. Do humidity and temperature together better quantify the climate-sensitive portion of
electricity demand than temperature alone?
2. Are there heat measures beyond ambient air temperature that more tightly correlate with
electricity demand?
3. Can heat metrics that incorporate both humidity and dry bulb temperature identify which
households have AC with higher accuracy?
Chapter 4.
1. How do the results of a novel, hourly model that captures both electric heating and
AC compare to previous estimates of AC penetration?
2. Can we use hourly electricity data to make inferences about the frequency of AC
usage and improve our understanding of spatial and temporal trends of residential
cooling demand?
Chapter 5.
1. Are voluntary demand response programs effective at shedding load during
emergency grid events?
2. Can we identify factors that impact the level of customer response on Flex Alert
days?
3. Which subpopulations are most likely to change their behavior in response to Flex
Alerts?
Chapter 2: A machine learning framework to estimate residential electricity
demand based on smart meter electricity, climate, building characteristics,
and socioeconomic datasets
2.1 Introduction
The residential sector is a significant consumer of electricity, accounting for 39% of US
total end-use electricity consumption in 2020 [27]. Although per capita electricity consumption
flattened in recent years [28], there is an expectation that a warming climate coupled with
electrification trends will drive up electricity demand in the future [2]–[4]. Given the residential
sector’s significance in overall electricity demand, anticipating future household electricity
consumption will be essential to maintaining grid reliability, managing peak demand, and
planning for new power capacity investments.
In previous studies analyzing factors driving residential electricity consumption,
temperature has been found to play one of the most significant roles [29]–[31]. Additionally,
physical building characteristics (e.g., square footage, insulation, number of stories, number of
appliances) [29], [32], [33], socioeconomics (i.e., occupation, income, education, class) [34],
[35], and occupant behavior and preferences [29], [36]–[38] are significant factors influencing
electricity demand. While these studies provide some insight into the factors that shape
electricity use, the accuracy of residential electricity demand models remains limited by the
diverse and complex nature of the residential sector and the data available to capture that
diversity, as housing stock can vary significantly both across and within regions according to
home size, building materials, appliances, demographics, occupancy patterns, etc.
Residential energy modeling studies can be categorized into two distinct approaches: top-down [39]–[44] and bottom-up [45]–[52]. Top-down models rely on aggregate data to establish
relationships between variables and energy use and predict energy demand [53]. In top-down
studies, historical energy consumption is typically estimated at a city, state, or regional level and
regressed against macroeconomic indicators, such as GDP or unemployment [39], [40], energy
prices [41], [42], housing stock trends [42], [43], or weather variables [42], [44]. The focus of
many of these analyses is to capture how socioeconomic characteristics impact the electricity
sector [54]. For example, one study implemented two statistical methods, ordinary least squares
(OLS) and random coefficient (RC), to analyze the relationship between electricity consumption
and socioeconomic variables, including per capita GNP, GDP growth, structure of the economy,
urbanization, and level of literacy, using data from 93 countries and found that electricity
consumption increases with socioeconomic development [55]. Salari and Javid estimated
electricity and natural gas demand in 48 U.S. states while considering socioeconomic and
demographic variables, building stock characteristics, energy prices, and weather data. The
results from three different linear regression techniques, OLS, random effect (RE), and fixed
effects (FE), show that the socioeconomic and demographic variables of per capita income,
household size, and percentage of residents with a high school degree have a statistically
significant impact on the residential energy demand [56]. These top-down approaches are
advantageous because of model simplicity and the wide availability of data, but their lack of
detail makes it difficult to identify local demand patterns and areas for improvement.
In contrast, bottom-up models use microdata, i.e., highly detailed building and appliance
information, from an individual home or subset of homes to estimate energy demand and
extrapolate to the region, using either a physics-based [45]–[48] or statistical approach [49]–[51].
Physics-based models simulate a region's electricity demand by utilizing a set of building
archetypes, which are described based on an extensive selection of possible user-defined input
variables, to broadly represent the region's building stock [57], [58]. A representative building
stock model for Los Angeles County was used to estimate the region’s residential electricity and
natural gas demand in 2020-2060 under climate change scenarios and energy efficiency trends
[59]. The study found that under population growth and temperature increases, the total
residential electricity demand for the region could increase by 41-87% between 2020 and 2060.
However, the total increase in electricity demand could fall to 28% with aggressive energy
efficiency policies. Physics-based models are valuable because they describe current and
prospective technologies with high detail, including a breakdown of end-use consumption,
without requiring private residential electricity records and building-specific information that are often
not publicly available. Because simulations depend on physical characteristics and
thermodynamic principles, the impact of potential technological combinations and energy
efficiency measures can be quantified, and policies that more effectively target consumption can
be developed. The drawbacks of physics-based models are that many assumptions have to be
made regarding behavioral factors and their influence on energy [57], since the models do not
rely on historical data, and the building stock of a region must be coarsened to a few types of
buildings with estimations made for the number of buildings for each type.
Statistical models, a second type of bottom-up model, use historical data, such as energy
bills or smart meter data, from a subset of homes to relate physical building characteristics,
climate, and occupancy behavior to energy demand (see [53] for a survey). The benefit of using
actual energy data is that the effect of a homeowner’s individual behaviors and demographics
can be considered, unlike physics-based models which require many assumptions to estimate
behavior or top-down methods that apply broad socioeconomic indicators to their model. For
example, Min et al. performed linear regression analysis of four different residential end use
categories (space heating, water heating, cooling and appliance) to develop a mathematical
relationship between energy use and predictor variables, including energy price, household
characteristics, housing unit characteristics, regional fixed effects, and heating/cooling degree days [60]. The regression models were used to estimate residential energy by end use and fuel
type for every US zip code and provide an in-depth look into how energy use varies across
regions. In general, bottom-up models are advantageous because they reveal information about
end usage and finer-scale resolution energy patterns and predictions. However, both bottom-up
methods have higher complexity and computation time than top-down methods and require
detailed input data that are typically not readily available [61].
Machine learning has emerged more recently as a method to forecast energy usage that
can address the complexity, dynamics, and nonlinearity of building energy systems without
requiring detailed information on the building properties and energy system configurations [9]–
[11]. This approach has been proven effective in fast and accurate forecasting for building
energy prediction studies due to its relative simplicity, particularly in comparison to
physics-based models [62]. Models are trained with historical data to determine the relationship between
input parameters (e.g., weather, building characteristics, and socioeconomic data) and building
energy consumption [10]. Like linear regression models, machine learning models are data-driven but can be better equipped to model nonlinear and complex patterns [63], [64]. Machine
learning models are also advantageous because they require less detailed building characteristics
than physics-based methods, which can be expensive and time consuming to gather and therefore
difficult to extrapolate to a larger building stock [65]. Further, studies have shown that machine
learning models can forecast energy demand with higher accuracy than linear regression and
physics-based models [66], [67]. While there are advantages of using machine learning models
for energy forecasting, several gaps exist in the literature, mainly due to constraints of the
available data.
Machine learning models have been used to predict electricity demand for both
commercial [68]–[70] and residential buildings [71]–[73] as well as for a mixed building stock
[74]–[77], but substantially fewer studies have been conducted for residential buildings than other
building types. The lack of research in the residential sector is most likely due to two limitations:
fewer data are available from private residences than from commercial or industrial buildings,
and residential consumption is highly variable and greatly driven by occupancy patterns that are
difficult to model [78]–[80]. As the number of smart meter installations has increased in recent
years, electricity data for residential homes have become more widely accessible and used in a
growing number of machine learning studies. For example, one study used hourly consumption
data from 6,309 individual customers during the 2020 COVID mandates to predict how power
consumption patterns could change under a new remote work era using a machine learning
framework. The results showed that power consumption increased by 13% in the afternoon due
to COVID mandates [81]. However, most existing machine learning studies that use high
temporal resolution residential electricity data (i.e., 15-minute or hourly intervals) only use data
for one or a handful of buildings [12]–[14], [82], as few studies have had access to high volumes
of individual customer smart meter data [83], [84].
Very few machine learning electric load forecasting studies have incorporated weather
data, physical building characteristics, and socioeconomics together, and those studies that do
often use detailed occupant information for a select number of homes that are not publicly
available [63], [85], [86]. Instead of joining multiple datasets to build a diverse feature set, many
studies include only the historic electricity data of an individual building to forecast its short-term electricity load [87]–[89]. Studies that do incorporate a combination of characteristics are
often constrained by coarse spatial or temporal resolution. For example, a
study by Zhang et al. used household level information from the Residential Energy
Consumption Survey (RECS), Public Use Microdata Sample (PUMS), and American Community
Survey (ACS) datasets for ~2,000 residential homes, but the study was limited by the coarse
temporal granularity of annual consumption and dataset length of one year [90]. Another study
trained various machine learning models with population, building, and weather data from Dubai
to investigate the impact of different features on electricity demand, but predictions were made at
a monthly, community-wide scale [83].
Past machine learning energy forecasting studies have predicted large scale (e.g. regional
or national) energy demand at short [91], [92], medium [93]–[95], and long-term time horizons
[85], [96], [97]. Short-term load forecasts aid daily grid operations such as energy transfers and
load dispatch [98], while medium to long-term forecasts are necessary for infrastructure
investments and future capacity installments [99]. However, most studies focus on the short term, only forecasting load up to one day ahead. While there are studies that focus on long-term
prediction (e.g., months, years), they most often use data with coarse spatial resolution, such as at
a city-wide or countrywide scale [100]–[102]. Thus, building a more thorough understanding of
how input parameters might affect long term demand, especially under changing conditions (e.g.,
rising temperatures, higher AC adoption rates, growing incomes), is prudent for grid planning
over the longer term for aspects such as future grid capacity and storage investments.
The current body of electric load forecasting literature utilizing machine learning has
been constrained by limited access to 1) high resolution data that can capture both spatial and
temporal variations in energy consumption, 2) statistically representative data at a regional scale,
and 3) combinations of weather, physical building, occupancy, and sociodemographic data. To
our knowledge, no study has investigated how machine learning models perform under different
spatial and temporal resolutions for residential electricity demand projections across entire
regions, and because few machine learning studies in this field have used high resolution,
regionally representative data with a diverse feature set, there is little insight into how to best
optimize these models. To address these research gaps, we ask the following research questions:
1. To what extent can machine learning models accurately predict residential electricity
demand with publicly available climate, building, and socioeconomic data?
2. How does the spatiotemporal resolution of historical electricity consumption data
impact the ability of machine learning models to make precise predictions of
electricity demand?
3. Which features are most useful for predicting the target variable of electricity
consumption?
Here we develop a generalized, repeatable framework to predict household-level
electricity consumption for the residential sector. We train machine learning models using smart
meter electricity records for 58,537 households in the Greater Los Angeles region, as well as
feature sets derived from publicly available local site weather, building characteristics, and
socioeconomic data. The main contribution of our study is to use household-level smart meter
data to capture differences in electricity usage in households across different regions, as well as
differences across individual households within regions, to better understand the factors that
drive trends in residential electricity consumption. Our study improves upon previous methods of
load forecasting by leveraging a diversity of high spatiotemporal resolution datasets at a regional
scale, previously unavailable to researchers, to test model efficacy across a selection of ML
models, spatiotemporal aggregations, and feature sets.
The framework proposed here can serve as a guide for researchers in the energy domain
utilizing ML to estimate residential electricity consumption for a variety of applications.
Although our case study is performed in southern California, our framework utilizes
standardized smart meter data and publicly available climate, building, and socioeconomic
datasets so that it can be repeated in other regions that utilize smart meters. Southern California
serves as a valuable case study as it consists of widely varying microclimates with
socioeconomically diverse populations and building stocks, making it an ideal location to
develop a methodology that can be repeated in cities around the world. The heterogeneity of the
dataset contributes to this study’s novelty, as residential smart meter datasets used for electricity
consumption analyses typically represent a more uniform climate, set of buildings, or
demographics [29], [34], [90], [103]–[105]. In an era where electricity reliability will be
challenged by a changing climate, trends towards increasing electrification, and massive
decarbonization investments, anticipating future demand at more granular resolutions will be
important for informing decisions related to infrastructure investment, designing equitable
demand response programs, and offsetting the need for additional power plant capacity.
Table 2-1. Summary of machine learning studies for residential electric load prediction.
Model Type | Temporal resolution | Spatial resolution | Number of Buildings | Training features | Region | Citation
SVR | 10-min, Hourly, and Daily | Apartment Unit, Floor, Building | 1 | Weather Data | New York City | Jain et al. 2014 [80]
SVM | Hourly | Building | 1 | Weather Data, Building Characteristics, Occupant Behavior | France | Paudel et al. 2017 [86]
ANN, SVR, LS-SVM, GPR, GMM | Hourly | Building | 4 | Weather Data, Building Characteristics | San Antonio, Texas | Dong et al. 2016 [106]
ANN, SVR, GPR, BN | Hourly | Building | 4 | Weather Data | San Antonio, Texas | Rahman, Srikumar, and Smith 2017 [95]
SVR, MLP, LR | Hourly | Building | 782 | Weather Data | Ireland | Humeau et al. 2013 [107]
ANN | Hourly, Daily | Building | 93 | Building Characteristics, Occupant Behavior | Lisbon, Portugal | Rodrigues, Cardeira, and Calado 2014 [108]
ANN, SVM, Classification and Regression Tree, LR, ARIMA, Voting, Bagging, SARIMA-PSO-LSSVR, SARIMA-MetaFA-LSSVR | Daily | Building | 1 | Weather Data | New Taipei City, Taiwan | Chou and Tran 2018 [12]
SVR | Daily | Building | 1, 20, 50 | Weather Data, Building Characteristics, Occupant Behavior | France | Zhao and Magoules 2012 [109]
SVM, BPNN, RBFNN, GRNN | Annual | Building | 59 | Building Characteristics | Guangdong, China | Li, Ren, Meng 2010 [110]
ANN, GB, DNN, RF, Stacking, KNN, SVM, DT, LR | Annual | Building | 5000 | Weather Data, Building Characteristics | UK | Olu-Ajayi et al. 2022 [111]
ElasticNet, Lasso, Ridge, LR, Bagging, RF, GB, Adaboost, Extra Trees | Annual | Zip Code | 2,246 | Building Characteristics | Atlanta | Zhang et al. 2018 [90]
MLR, RF, MNN, GB | Annual | District |  | Building Characteristics, Socioeconomic Data | London | Gassar, Yun, and Kim 2019 [112]
ElasticNet, Lasso, Ridge, LR, Bagging, RF, GB, Adaboost, Extra Trees, MLP, KNN | Daily, Monthly, Annual | Building, Census Tract | 58,537 | Weather Data, Building Characteristics, Socioeconomics | Southern California | Our study
2.2 Methods
The main objectives of this study are to 1) develop a predictive machine learning model
that can be applied to new and changing scenarios (e.g., different regions, climates, and building
stocks) to predict residential electricity demand, 2) identify which variables are most useful in
predicting residential energy through feature selection and feature importance, and 3) optimize
model performance by training models with various combinations of spatial and temporal data
resolution. An overview of the methodology is depicted in Figure 2-1.
Figure 2-1. Machine Learning Model Development Framework
2.2.1 Datasets
Southern California Edison (SCE), an Investor-Owned Utility (IOU), provided household
electricity records for roughly 200,000 customers across Greater Los Angeles. These homes were
selected to be statistically representative of the 4.5 million homes that are in the region at a 99%
confidence level as described in [24] (note: following the data preparation steps in this analysis,
the dataset was no longer statistically representative of the region). Households within the SCE
dataset that were located in Orange County, roughly 50,000, were not included in the study as
there were no publicly available building property data to match to the records. After the
additional data processing steps (described in section 2.2.2), the final dataset utilized for our study
consisted of 58,537 unique single-family homes. The smart meter data were collected from each
household at 15-minute intervals over the course of two years from 2015 to 2016 and aggregated
to the daily, monthly, and annual level for model training. To conduct this study at high
geospatial resolution, the street addresses of each home were provided by the utility. Due to the
privacy concerns and security requirements of the IOU, the data were stored on the University of
Southern California Center for High Performance Computing (HPC) cluster in a
High Security Data Account.
To gain insight into the factors that influence electricity demand, site weather, building
characteristics, and socioeconomic data were also obtained. Weather datasets with similar
spatiotemporal resolution to the electricity data were necessary to accurately capture energy-climate interactions. Historical weather records were retrieved from two automated weather
networks: the California Irrigation Management Information System (CIMIS) and the National
Oceanic and Atmospheric Administration’s National Centers for Environmental Information
(NCEI) [113], [114]. Both networks consist of hundreds of automated, land-based stations across
California that record hourly observations of climatic indicators such as temperature,
precipitation, dew point, and windspeed. For this study, we use only the ambient near-surface air
temperature from 36 CIMIS stations and 43 NCEI stations. The stations were selected based on
their proximity to the households, with each household being matched to the nearest weather
station. The ambient temperature observations were used to calculate cooling degree days (CDD)
and heating degree days (HDD). Degree days are a measure of how cold or warm a location is.
CDD (HDD) is defined as the daily cumulative number of degrees above (below) a given
temperature threshold. This threshold is defined on an application-specific basis. Here, we used
18 degrees Celsius as the threshold (approximately the temperature at which air conditioning
(AC) is expected to be needed) to calculate the daily, monthly, and annual CDD and HDD [115].
We also computed a customized metric that we call “extreme cooling degree days” (ECDD) with
a threshold of 35 degrees Celsius as an indicator of extreme heat to further differentiate climates.
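As a minimal sketch of the degree-day metrics described above, the Python snippet below computes daily CDD, HDD, and ECDD from 24 hourly temperatures in degrees Celsius. The text does not spell out how the hourly observations are accumulated, so the convention used here (hourly excess over the threshold, averaged over the day), the function name `degree_days`, and the sample temperatures are assumptions made for illustration. Only the 18 and 35 degree thresholds come from the text.

```python
def degree_days(hourly_temps_c, base_c, kind="cooling"):
    """Daily degree days from hourly temperatures in degrees Celsius.

    Assumed convention: average the hourly excess above (cooling) or
    below (heating) the base temperature over the day.
    """
    if kind == "cooling":
        excess = [max(0.0, t - base_c) for t in hourly_temps_c]
    elif kind == "heating":
        excess = [max(0.0, base_c - t) for t in hourly_temps_c]
    else:
        raise ValueError("kind must be 'cooling' or 'heating'")
    return sum(excess) / len(hourly_temps_c)


# Hypothetical hot day: 18 hours at 30 C and 6 afternoon hours at 38 C.
hot_day = [30.0] * 18 + [38.0] * 6

cdd = degree_days(hot_day, base_c=18.0, kind="cooling")   # 18 C threshold
hdd = degree_days(hot_day, base_c=18.0, kind="heating")   # 18 C threshold
ecdd = degree_days(hot_day, base_c=35.0, kind="cooling")  # 35 C threshold
```

Under this convention, monthly and annual values could be obtained by summing the daily results.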
Various building characteristics for individual households were retrieved from the
Property Information Systems database, established by the Office of the Assessor, for Los
Angeles, San Bernardino, and Riverside Counties, which were the three counties containing the
households analyzed in this study [116]–[118]. The county databases contain public records for
all the properties in each of the three counties including square footage, number of bedrooms and
bathrooms, year of construction, address, and more, as shown in Table 2-2. To merge datasets, we
matched electricity records provided by SCE with each building’s physical characteristics using
the given street addresses.
Demographic information was collected to explore the role of population characteristics
on electricity use. Socioeconomic data were retrieved from CalEnviroScreen 3.0 [119], a
mapping tool developed by the Office of Environmental Health Hazard Assessment, on behalf of
California Environmental Protection Agency, that identifies which California communities are
subjected to higher pollution levels and are often most vulnerable to the effects. CalEnviroScreen
includes environmental, health, and socioeconomic information from state and federal
government sources for the approximately 8,000 census tracts in California. In this study, each
individual home within a census tract is matched with the corresponding census indicators. The
indicators used in this study are listed in Table 2-2.
2.2.2 Data preparation
Data preparation is an important step in machine learning that transforms the raw,
collected data into a quality dataset that is more suitable for model training [120]. A few of the
standard tasks that are commonly practiced include data cleaning, data transforms, and feature
engineering [121]–[125]. The methods and algorithms used in an ML study depend on the
specific dataset and modeling objectives, but broadly, the goal is to better uncover the underlying
nature of the data by removing erroneous data and producing a dataset that is suitable for the
desired analysis. Data preparation measures applied to the datasets utilized in this study are
outlined in Figure 2-1, step a.
Data cleaning is a practice that filters flawed points from a dataset. In some cases, model
performance improves by identifying and correcting for outliers and missing values in the data
[126]. For this application, we first screened out customers with less than a year of electricity
records and homes deemed uninhabited, defined as those with annual consumption below 20 kWh, which is
the average daily demand of a home in California [127]. Our analysis targets single-family detached
homes, so electricity customers with an apartment indicator in the address line (e.g., unit number)
or that were designated as an apartment in the County Assessor databases were removed from
the dataset. We adopted the method developed in our previous publication to identify homes with
onsite electricity generation (e.g. solar panels) or homes without AC because this information
was not provided by the utility [24]. With this method, any home with at least one hour of zero
electricity consumption between 10:00 and 16:00 and one or more hours of positive electricity
consumption between 17:00 and 23:00 on at least 5% of the days within the two-year time period
(i.e., 36 days), was identified as a household with onsite electricity generation [24]. The
electricity-temperature sensitivity of a home was also characterized to determine whether a
household utilized AC during the period of study based on our framework detailed in [24].
Homes with solar panels and/or without AC were filtered from the data so as not to distort the
electricity-temperature relationship of the single-family homes remaining in the dataset.
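The onsite-generation screen described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation from [24]: it assumes the 15-minute readings have already been summed to 24 hourly values per day, treats "between 10:00 and 16:00" as hours 10 through 15 and "between 17:00 and 23:00" as hours 17 through 22, and uses hypothetical function names and sample profiles.

```python
def is_generation_day(hourly_kwh):
    """True if one day's 24 hourly readings match the screening rule."""
    # At least one zero-consumption hour in the assumed midday window.
    midday_zero = any(hourly_kwh[h] == 0 for h in range(10, 16))
    # At least one hour of positive consumption in the assumed evening window.
    evening_use = any(hourly_kwh[h] > 0 for h in range(17, 23))
    return midday_zero and evening_use


def has_onsite_generation(daily_profiles, min_share=0.05):
    """Flag a household when at least min_share of its days match the rule."""
    flagged = sum(is_generation_day(day) for day in daily_profiles)
    return flagged >= min_share * len(daily_profiles)


# Hypothetical daily profiles (kWh per hour).
normal_day = [0.5] * 24
solar_day = [0.5] * 10 + [0.0] * 6 + [0.5] * 8  # zero net use from 10:00 to 16:00

mostly_normal = [normal_day] * 98 + [solar_day] * 2    # 2% of days match
frequent_zero = [normal_day] * 90 + [solar_day] * 10   # 10% of days match
```

Here `has_onsite_generation(frequent_zero)` flags the household, while `has_onsite_generation(mostly_normal)` does not, because only 2% of its days meet the rule.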
Outliers were removed based on the total square footage of the home and electricity
demand. The average daily electricity demand per square foot was calculated, and customers
with an electricity demand three times greater than the standard deviation for 10% of the time
period were filtered out to exclude possible multi-family units or very high consuming
households that might skew models. Using the same reasoning, homes with square footage above
20,000 square feet were identified as outliers and removed. The outliers were located throughout
the region and not biased towards certain areas. The features in each of the datasets were also
processed, and individual variables with more than 10% of records missing were excluded. The
number of stories in each building was the only feature omitted on this basis. Table 2-2
summarizes the features used in the study.
Table 2-2. Full feature set
Category | Feature | Type | Mean | Units | Number of Categories
Physical Building Property | Square footage | Continuous | 1808 | Square feet |
Physical Building Property | Bedrooms | Continuous | 3.3 | Bedrooms |
Physical Building Property | Bathrooms | Continuous | 3.2 | Bathrooms |
Physical Building Property | Presence of pool | Binary |  |  |
Physical Building Property | Building vintage | Continuous | 1971 |  |
Physical Building Property | Building vintage category | Categorical |  |  | 3
Climate | Climate zone | Categorical |  |  | 7
Climate | Annual cooling degree days | Continuous | 1293 | Degree days |
Climate | Annual heating degree days | Continuous | 863 | Degree days |
Climate | Annual extreme cooling degree days | Continuous | 141 | Degree days |
Climate | Monthly cooling degree days | Continuous | 101 | Degree days |
Climate | Monthly heating degree days | Continuous | 66.0 | Degree days |
Climate | Monthly extreme cooling degree days | Continuous | 9.81 | Degree days |
Climate | Monthly average temperature | Continuous | 19.1 | Degrees Celsius |
Climate | Monthly temperature delta | Continuous | 11.7 | Degrees Celsius |
Climate | Daily cooling degree days | Continuous | 3.58 | Degree days |
Climate | Daily heating degree days | Continuous | 2.32 | Degree days |
Climate | Daily extreme cooling degree days | Continuous | 0.39 | Degree days |
Climate | Daily average temperature | Continuous | 19.2 | Degrees Celsius |
Climate | Daily max temperature | Continuous | 25.6 | Degrees Celsius |
Climate | Daily min temperature | Continuous | 13.1 | Degrees Celsius |
Climate | Daily temperature delta | Continuous | 12.5 | Degrees Celsius |
Socioeconomic | Education | Continuous | 17.8 | Percent |
Socioeconomic | Linguistic Isolation | Continuous | 8.1 | Percent |
Socioeconomic | Poverty | Continuous | 32.5 | Percent |
Socioeconomic | Housing Burden | Continuous | 17.3 | Percent |
Socioeconomic | Unemployment | Continuous | 10.7 | Percent |
Temporal | Month | Categorical |  |  | 12
Temporal | Day of Week | Categorical |  |  | 7
Data preprocessing steps are performed to prepare the raw data for subsequent processing
steps. A few basic preprocessing steps include handling missing data, converting formats, and
data transformations. For the weather data, numerical imputation was used when hourly weather
data were missing to compute degree days and daily average values. Additionally, date formats
were converted to match weather data and electricity data. As stated, variables from the county
assessor databases were discarded if missing more than 10% of the time. Categorical encoding is
a key data preprocessing technique that converts categorical variables to numerical
representation so that they are machine readable [128]. In this feature set, the climate zone,
presence of a pool, building vintage category, month, and day of week are all unordered,
categorical variables that are categorically encoded prior to model training using OneHotEncoder
from the Python Scikit-learn library [129].
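To make the categorical encoding step concrete, the following sketch applies scikit-learn's `OneHotEncoder` to three of the unordered categorical features named above. The category labels and the three sample rows are hypothetical, and `handle_unknown="ignore"` is an illustrative choice rather than a setting reported in this study.

```python
from sklearn.preprocessing import OneHotEncoder

# Hypothetical rows: [climate zone, building vintage category, month].
rows = [
    ["zone_6", "pre_1950", "Jan"],
    ["zone_9", "1950_1978", "Jul"],
    ["zone_10", "post_1978", "Jul"],
]

# Each distinct category becomes its own 0/1 column; categories unseen
# during fitting are encoded as all zeros instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(rows).toarray()

# One list of learned categories per input column.
learned_categories = [list(c) for c in encoder.categories_]
```

With three climate zones, three vintage categories, and two months in the sample, `encoded` has eight columns, and each row contains exactly three ones (one per original feature).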
Data transformations are used to convert a dataset into a format that is more suitable for a
given machine learning model. The transformations might be mandatory, meaning that they are
necessary for data compatibility, or optional quality transformations, which help the model
perform better [130]. Transformations are commonly used to scale and standardize features to
the same range so that variables have equal influence [131]. Because there is a large difference in
scale across the input variables for this study, the StandardScaler transform from the Python
Scikit-learn library was selected to standardize the numerical features by subtracting the mean
and scaling to unit variance. This ensures that one feature with high variance does not dominate
the rest during training. A PowerTransformer, from the Scikit-learn library, and log transform
were also implemented prior to model training, but both reduced model performance and were
thus omitted in the final analysis.
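The standardization step can be illustrated with a short sketch. The feature values below are hypothetical; the snippet only shows that `StandardScaler` subtracts each column's mean and divides by its (population) standard deviation, so that features on very different scales become comparable.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical columns: square footage, annual CDD, percent in poverty.
features = np.array([
    [1200.0, 900.0, 40.0],
    [1800.0, 1300.0, 30.0],
    [2600.0, 1700.0, 20.0],
    [4000.0, 2100.0, 10.0],
])

scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# Equivalent manual calculation: z = (x - column mean) / column std (ddof=0).
manual = (features - features.mean(axis=0)) / features.std(axis=0)
```

In a train/test workflow, the scaler would typically be fit on the training rows only and then applied to the test rows with `scaler.transform`, so that no information from the test set enters the scaling parameters.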
2.2.3 Model training and evaluation
One of the main objectives of this study is to develop an optimized machine learning
framework that can predict the electricity demand of individual households using the variables
described in section 2.2.1. Machine learning models take a set of features, X, as input variables
and a target variable, Y. The models build mathematical functions that define Y in terms of X
based on the relationships in the training set. Using these functions, target variable predictions
are made on a test set based on the corresponding input variables. Machine learning models from
varying machine learning model classes, including linear, non-linear, ensemble, and tree models,
were selected to see which models and model types are best suited for this application. In step b
of Figure 2-1, we trained the following 11 machine learning models from the scikit-learn Python
library in our study: ridge regressor, linear regressor, elasticnet regressor, lasso regressor,
adaboost regressor, bagging regressor, gradient boosting regressor (XGBoost), random forest
regressor (RF regressor), extra trees regressor (ET regressor), multi-layer perceptron regressor
(MLP regressor), and k-nearest neighbor regressor (KNN regressor). It is important to note that
because the goal of this study is to build a repeatable framework that can be applied to other
regions (as opposed to creating an optimal model for our particular dataset and region), we focus
on machine learning models that are generalizable and easy to implement. However, a more
complex set of models, or combinations of models, might lead to improved model performance.
These models are optimized by finding the ideal coefficients, θ, that minimize the sum of losses
between each data point and the predicted value calculated by a cost function, L, which varies by
model. For each of the selected models, the minimized cost function was mean squared error
with some models having added regularization penalties that are built in.
In machine learning, hyperparameters are parameters explicitly defined by the user that
control a given model’s learning process. The values and configurations for the hyperparameters
can be adjusted prior to training in an effort to achieve optimal performance. However,
determining the best values is often completed through rule of thumb or trial and error, which are
both time intensive. For the scope of this study, the hyperparameters of the models were only
slightly adapted from the default settings of the scikit-learn Python library version 0.24.2 in
instances where the default settings might cause long run times. For the XGBoost regressor, the
max depth was adjusted to 4 and the number of estimators was reduced to 20. Changes made to
the RF regressor include setting the max depth to 3 and the number of estimators to 60. Lastly,
the max depth for ET regressor was set to 3. For these three models, the rest of the
hyperparameters remained as the default. For all other models, all hyperparameters were set to
the defaults. Additional hyperparameter tuning could improve model performance, and we do not
suggest that these are the optimal settings.
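The settings listed above can be sketched in scikit-learn as follows. This is a minimal illustration rather than the study's actual code: the function name `build_models`, the dictionary keys, and the `random_state` argument are assumptions, only a subset of the 11 models is shown, and scikit-learn's `GradientBoostingRegressor` is assumed to stand in for the gradient boosting regressor described in the text.

```python
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.neural_network import MLPRegressor


def build_models(random_state=0):
    """Return several regressors with the non-default settings stated in the text.

    Hypothetical helper: names and random_state are illustrative assumptions.
    """
    return {
        # Assumption: scikit-learn's gradient boosting stands in for the
        # gradient boosting regressor named in the text.
        "GB Regressor": GradientBoostingRegressor(
            max_depth=4, n_estimators=20, random_state=random_state
        ),
        "RF Regressor": RandomForestRegressor(
            max_depth=3, n_estimators=60, random_state=random_state
        ),
        # Only max_depth is changed for extra trees; n_estimators stays at its default.
        "ET Regressor": ExtraTreesRegressor(max_depth=3, random_state=random_state),
        # The remaining models keep the library defaults.
        "Ridge Regressor": Ridge(),
        "Linear Regressor": LinearRegression(),
        "MLP Regressor": MLPRegressor(random_state=random_state),
    }


models = build_models()
```

Keeping the models in a dictionary makes it straightforward to loop over them with the same training and scoring routine.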
Resampling methods are commonly used in machine learning studies to reduce bias in
the training set by repeatedly sampling from the original data [132]–[135]. The technique is used
to avoid overfitting, which happens when a model has learned the training data too well instead
of a generalizable relationship. Overfitting results in poor model performance when predicting
on new data [136]. In the model training steps shown in Figure 2, a bootstrap method was
implemented in which n number of equally sized subsets are extracted from the dataset with
replacement [137]. During training, data leakage can occur when information is shared between
the training and testing, leading to unrealistically high levels of measured model performance
[138]. To avoid data leakage in this study, all of a household’s data was
included in either the training set or the test set for each split (i.e., a household’s
two years of data never appear in both sets). The model was trained on each of the sampled
subsets (training set size of ~90,000, ~1,000,000, ~29,000,000 records, respectively, for the
annual, monthly, and daily models) and was evaluated on the test set of remaining data, as
illustrated in Figure 2. To evaluate the models, we set n equal to 10 and recorded the
bootstrapped mean score for the four error metrics described below.
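One way to combine bootstrap resampling with household-level grouping is sketched below. It assumes an out-of-bag scheme in which households are drawn with replacement for training and the households never drawn form the test set; the function name `household_bootstrap_splits` and the synthetic household IDs are hypothetical, and the study's exact sampling procedure may differ.

```python
import numpy as np


def household_bootstrap_splits(household_ids, n_splits=10, seed=0):
    """Yield (train_idx, test_idx) row indices that never split a household.

    Households are sampled with replacement; a household drawn more than once
    contributes its rows more than once to the training indices. Households
    that are never drawn make up the test set.
    """
    rng = np.random.default_rng(seed)
    household_ids = np.asarray(household_ids)
    unique_ids = np.unique(household_ids)
    rows_by_household = {h: np.flatnonzero(household_ids == h) for h in unique_ids}
    for _ in range(n_splits):
        sampled = rng.choice(unique_ids, size=unique_ids.size, replace=True)
        train_idx = np.concatenate([rows_by_household[h] for h in sampled])
        test_idx = np.flatnonzero(~np.isin(household_ids, sampled))
        yield train_idx, test_idx


# Hypothetical example: 50 households with two annual records each.
ids = np.repeat(np.arange(50), 2)
splits = list(household_bootstrap_splits(ids, n_splits=10))
```

Because the split is made on household IDs rather than on individual records, no household can contribute rows to both the training and the test set within a split.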
For this study, four accuracy metrics were explored: mean absolute error, median absolute
error, mean absolute percent error, and the r2 score. The mean absolute error (MAE) and median
absolute error (MdAE) are both scale-dependent error metrics, meaning they are expressed in the
units of the target variable. MAE measures the average magnitude of the absolute errors in a set of
predictions, while MdAE is the median of the absolute values of the residuals. Mean absolute
percent error (MAPE) is the average absolute difference between the forecasted value and the actual
value, given as a percentage of the actual value [139]. The r2 score, or the coefficient of
determination, measures the proportion of variance in the observed target values that is explained
by the model’s predictions. The
drawback of scale-based metrics for this application is that they cannot be directly compared
across temporal resolutions with differing magnitudes of electricity demand. Conversely,
percentage-based metrics are flawed because MAPE might be higher for values that tend towards
zero (e.g., some daily energy values) and could have different values for two predictions with the
same absolute error. Because it does not have these same interpretability limitations, we selected
the r2 metric to assess best model performance in step c of Figure 1 [140]. The remaining metrics
are still reported for completeness.
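The four metrics can be computed with scikit-learn as in the sketch below. The helper name `score_predictions` and the toy values (interpreted here as kWh purely for illustration) are assumptions. The example gives every record the same absolute error to show how MAPE is dominated by the record closest to zero.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    median_absolute_error,
    r2_score,
)


def score_predictions(y_true, y_pred):
    """Return the four metrics discussed in the text (hypothetical helper)."""
    return {
        "MAE": mean_absolute_error(y_true, y_pred),
        "MdAE": median_absolute_error(y_true, y_pred),
        # scikit-learn returns a fraction, so convert to percent.
        "MAPE_percent": 100.0 * mean_absolute_percentage_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
    }


# Toy data: each prediction is off by exactly 1 unit.
y_true = np.array([2.0, 10.0, 50.0])
y_pred = y_true + 1.0
scores = score_predictions(y_true, y_pred)

# Per-record percentage errors: 50%, 10%, and 2% for the same absolute error.
per_record_percent = 100.0 * np.abs(y_pred - y_true) / np.abs(y_true)
```

Here MAE and MdAE are both 1, while the per-record percentage errors span 2% to 50%, which is the sensitivity to small target values noted above.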
Figure 2-2. Graphical representation of bootstrap resampling methods used during model training, where n is equal
to 10.
2.2.4 Feature selection
Following the data preparation steps, an initial round of model training is performed to
determine the overall best models. Feature selection is conducted after the first round of model
training on the top five performing models to attain the final feature set (See step d of Figure 1).
Feature selection is the process of identifying and removing redundant or irrelevant variables that
are less useful in predicting the target variable. By removing extraneous or redundant features,
model performance and computational time for training can both be improved [141]–[143]. Most
commonly used feature selection algorithms can be broadly classified as filter or wrapper
methods. Filter methods rank each feature by evaluating the relationship between the input and
target variables and then select only the highly ranked features. Wrapper methods select the
feature subset that leads to best model performance based on a specified performance indicator
[144].
In this study we apply the wrapper method, using the sequential forward selection (SFS)
algorithm from the mlxtend library to reduce the d-dimensional dataset into a k-dimensional
dataset, where k < d [145]. SFS is a greedy search algorithm in which features are added one at a
time until the best feature subset of k features is determined based on the cross-validated r2 score.
In the first iteration, each feature is individually tested and the single feature, x, that leads to best
model performance is selected. In the subsequent iteration, every combination of feature x plus
an additional feature is tested to determine the two features that in combination achieve the
highest performance. Iterations are repeated until a combination of features of size k is found.
The value of k can be a specified number or range of numbers. For our model, we set the range
as 0 to k, with k equal to the total number of features, to attain the feature set size with the overall
best model performance. While feature selection is a valuable algorithm in the machine learning
process because of its ability to reduce computational time and improve model accuracy, it does
not aid in increasing the interpretability of models. Feature selection can inform hypotheses
between features and the target variable, but it does not provide causal understanding for why
specific features were selected or discarded from the final feature set.
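The greedy search described above can be sketched from scratch as follows. This is not the mlxtend implementation used in the study; it is a simplified illustration built on scikit-learn's `cross_val_score`, and the function name `forward_select`, the synthetic data, and the choice of five folds are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score


def forward_select(model, X, y, cv=5):
    """Greedy forward selection scored by cross-validated r2.

    Adds one feature per iteration and returns the best-scoring subset found
    across all subset sizes, together with its score.
    """
    remaining = list(range(X.shape[1]))
    selected, best_subset, best_score = [], [], -np.inf
    while remaining:
        # Score every candidate subset formed by adding one more feature.
        trial_scores = {
            j: cross_val_score(
                model, X[:, selected + [j]], y, cv=cv, scoring="r2"
            ).mean()
            for j in remaining
        }
        j_best = max(trial_scores, key=trial_scores.get)
        selected.append(j_best)
        remaining.remove(j_best)
        if trial_scores[j_best] > best_score:
            best_score, best_subset = trial_scores[j_best], list(selected)
    return best_subset, best_score


# Synthetic example: only features 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=200)
subset, score = forward_select(LinearRegression(), X, y)
```

Because every candidate at every step requires a full cross-validation, the cost grows quickly with the number of features and the size of the dataset, which is consistent with the computational burden the text attributes to this step.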
2.2.5 Spatiotemporal resolution
To explore the impact of spatiotemporal data resolution on model performance, models
were trained with daily, monthly, and annual electricity demand (step b of Figure 1), and model
performance was evaluated for each resolution (step c of Figure 1). Features with a temporal
dimension were averaged (e.g., daily average temperature) or aggregated (e.g., annual CDD)
depending on the variable. The ability of our model to predict on larger spatial scales was
evaluated using two different methods, illustrated in Figure 3, to gain insight into how the spatial
resolution of the electricity consumption dataset impacts the ability of the model to make
accurate predictions. In the first method, referred to as pre-aggregation, the models were trained
with household data, and the predicted electricity consumption for all the homes within a census
tract was averaged and compared to the true mean of the test set observations. Conversely, in the
second method, post-aggregation models were trained and tested with census tract averages of
electricity consumption and input variables.
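The two aggregation methods can be sketched with pandas and scikit-learn as shown below. The function names, column names (`tract`, `sqft`, `kwh`), and synthetic data are hypothetical, and a single feature with a linear model is used only to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score


def pre_aggregation_r2(model, train, test, features, target, tract_col):
    """Fit on household rows, then average predictions within each tract."""
    model.fit(train[features], train[target])
    scored = test.assign(pred=model.predict(test[features]))
    by_tract = scored.groupby(tract_col)[[target, "pred"]].mean()
    return r2_score(by_tract[target], by_tract["pred"])


def post_aggregation_r2(model, train, test, features, target, tract_col):
    """Average features and target within each tract first, then fit and score."""
    train_tract = train.groupby(tract_col)[features + [target]].mean()
    test_tract = test.groupby(tract_col)[features + [target]].mean()
    model.fit(train_tract[features], train_tract[target])
    return r2_score(test_tract[target], model.predict(test_tract[features]))


# Synthetic households: 40 tracts whose typical home size differs by tract.
rng = np.random.default_rng(0)
n = 2000
tract = rng.integers(0, 40, size=n)
tract_effect = rng.normal(0, 300, size=40)
sqft = 1800 + tract_effect[tract] + rng.normal(0, 300, size=n)
kwh = 3.0 * sqft + rng.normal(0, 1000, size=n)
df = pd.DataFrame({"tract": tract, "sqft": sqft, "kwh": kwh})

# Each row is one household, so a row-wise split keeps households separate.
train, test = df.iloc[:1000], df.iloc[1000:]
pre_r2 = pre_aggregation_r2(LinearRegression(), train, test, ["sqft"], "kwh", "tract")
post_r2 = post_aggregation_r2(LinearRegression(), train, test, ["sqft"], "kwh", "tract")
```

Both functions score against the same census tract means of the test households; they differ only in whether averaging happens after prediction or before training.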
2.2.6 Feature importance
Feature importance techniques are beneficial because they improve the explainability of
machine learning models, which are often complex and difficult to unpack, and reveal relationships
between features and target variables [146]–[148]. There is often overlap between the techniques
used for feature importance and feature selection; the key difference is that feature selection is a
preprocessing technique that is applied before a model is trained to detect the most relevant
features and discard the others. Feature importance algorithms are typically implemented
following model training to determine which features are most useful to the model and explain
the model behavior [149].
In step d of Figure 1, we selected the permutation feature importance algorithm from the
Python Scikit-learn library, because it has a fast calculation time, is easy to understand, and is
applicable for all models in this study [150]. Permutation importance measures the deterioration
of model performance after permuting each feature, which effectively breaks the relationships
between the feature and the target variable. Because permutation importance is calculated after a
model has been fitted, reordering the values of a feature does not impact the relationship learned
by the model. The process is as follows: 1) train model, 2) individually shuffle the values of a
single variable within the test set and compute the drop in performance score, and 3) return the
dataset to the original order and repeat for each of the remaining variables. A feature that
significantly impacts the target variable will greatly reduce model performance when shuffled,
while one that is less important will have a smaller impact on the accuracy.
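The procedure above can be sketched with scikit-learn's `permutation_importance` function. The synthetic data, the linear model, and the choice of 10 repeats are assumptions made for illustration and are not taken from the study.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression

# Synthetic data: feature 0 is a strong driver, feature 1 a weak one,
# and feature 2 is unrelated to the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

# Step 1: train the model once.
model = LinearRegression().fit(X_train, y_train)

# Steps 2 and 3: shuffle one test-set column at a time and record the drop in r2.
result = permutation_importance(
    model, X_test, y_test, scoring="r2", n_repeats=10, random_state=0
)

# Rank features from most to least important by mean drop in score.
ranking = np.argsort(result.importances_mean)[::-1]
```

The `importances_mean` and `importances_std` arrays returned here correspond to the kind of mean importance and standard deviation values reported for each feature.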
Figure 2-3. Two data aggregation methods were executed to test how the spatial resolution of input data impacts a
model’s ability to predict electricity demand.
2.3 Results and discussion
The goal of this study was to develop, evaluate, and optimize ML models for residential
electricity forecasting. Results from this study gauge the extent to which various machine
learning models can accurately predict residential electricity demand with publicly available
climate, building, and socioeconomic datasets and at differing spatiotemporal data resolutions.
The feature selection and feature importance steps also provide better understanding of the
models, giving insight into which features are most useful to make energy demand predictions at
various scales.
2.3.1 Model performance
We wanted to understand and compare a model’s ability to predict short-term (e.g., daily)
versus longer-term electricity consumption (e.g., annual), as well as household level versus more
aggregated scales (e.g., census tracts) of electricity consumption, as projections at each of these
various spatiotemporal resolutions offer different insights and utility based on application. Daily
household level data has the advantage of capturing day to day variations in energy use among
households, which would be important in situations such as understanding the impacts of
demand side management strategies or behind-the-meter generation or storage technologies
across different populations. For instance, if a utility wanted to anticipate how time-of-use rate
structures might affect wealthy versus marginalized communities within their service territories,
understanding household level variability on finer timescales would be advantageous. There are
other applications in which we might want to understand how electricity use is impacted at
broader regional scales or across longer time scales. For example, regional scale forecasts would
be most desirable for planning longer term investments in new utility-scale generation capacity.
Table 3 presents the model results for all models across the three temporal resolutions
(complete results can be found in the SI). The best performing models and temporal resolutions
are consistent across the four different performance metrics (e.g., MLP regressor is the best
performing model for each temporal resolution when evaluated by each performance metric). In
general, the MAE, MdAE, and MAPE are larger for the annual models than monthly and daily
models and larger for monthly models than daily models. The results for MAE and MdAE are
intuitive because the yearly electricity demand is greater than the monthly or daily demand and
will then likely have larger absolute prediction errors as well. The MAPE results suggest that the
models can more accurately predict monthly and daily electricity demand than annual demand.
For this study, the r2 value was selected as the main indicator of model performance to
allow for direct comparisons between the annual, monthly, and daily models. The results showed
that prediction accuracy varied significantly across the different ML models and varying
temporal resolutions. Figure 4 summarizes the results of the top five best performing models
across all three temporal resolutions. Their r2 values range from 0.25 to 0.45, suggesting that
while these models can likely be useful in informing broader trends in residential electricity use,
there is still a lot of behavioral variability across individual homes that cannot be captured by the
feature set utilized in this study, limiting the models’ performance above this r2 range. It is
important to note that there is temporal variation in the model performance (e.g., monthly model
performs better in certain months), and a time-series plot of the performance variation is shown
in the SI.
The results depicted in Figure 4 also show how the temporal resolution of data impacts
model performance; the r2 values are slightly higher for all the ML models trained with monthly
data, rather than annual or daily, except random forest regressor. The MLP Regressor model
trained with monthly electricity data has the highest overall r2 of 0.45, with the best r2 for annual
and daily data being 0.34 and 0.38 respectively, using MLP Regressor. Monthly models are
likely more accurate in predicting the target variable of electricity demand because the monthly
data average out some of the highly variable demand seen in the daily data but capture seasonal
weather trends more accurately than annual models. While the monthly MLP model has the
overall best model performance, in general, the linear models perform with similar accuracy to
the non-linear models (e.g., MLP, RFR) when comparing r2 values.
Direct comparisons of model performance between this study and studies in the literature
are limited. This is because studies either 1) utilize household data to make short term
predictions for a single household or set of households or 2) use aggregated data to make longer
term estimates at larger spatial scales. Typically, these studies have higher model performance as
the individual homeowner’s behavior is either more easily learned when a single household is
used or averaged out in aggregated demand loads. For example, a study that predicted household
daily electricity for one home using neural networks had r2 values ranging from 0.87-0.91 [13].
The results of our study are more consistent with the few studies that have access to large
samples of household electricity data. Zhang et al. used ML models to predict annual residential
electricity demand with r2 scores ranging from 0.78-0.88. While these values are higher than
those reported in our study, the household’s annual electricity bill was used to predict demand
and was shown to be most highly correlated [90]. A study by Williams and Gomez predicted
monthly, household residential electricity demand in Texas with three methods: linear
regression, regression trees, and multivariate adaptive regression splines and achieved r2 values
ranging from 0.41 to 0.48 [151].
Figure 2-4. Model performance measured by r2 across the overall top 5 best performing models trained with annual,
monthly, and daily data. The r2 values, represented by the bars, range from 0.25-0.45, with the monthly MLP model
achieving the highest score.
Table 2-3. Annual, monthly, daily results before sequential feature selection for top 5 models

| Temporal Resolution | Model | Mean Absolute Error | Median Absolute Error | r2 | Mean Average Percent Difference |
|---|---|---|---|---|---|
| Annual | Ridge Regressor | 2780 +/- 16.4 | 2120 +/- 13.45 | 0.31 +/- 0.01 | 112 +/- 2.33 |
| Annual | Linear Regressor | 2780 +/- 16.4 | 2120 +/- 13.48 | 0.31 +/- 0.01 | 112 +/- 2.33 |
| Annual | GB Regressor | 2780 +/- 15.1 | 2140 +/- 11.05 | 0.32 +/- 0.01 | 113 +/- 2.18 |
| Annual | RF Regressor | 2830 +/- 14.1 | 2180 +/- 14.30 | 0.30 +/- 0.01 | 113 +/- 2.20 |
| Annual | MLP Regressor | 2740 +/- 15.8 | 2090 +/- 8.31 | 0.34 +/- 0.01 | 111 +/- 2.21 |
| Monthly | Ridge Regressor | 250. +/- 1.02 | 184 +/- 0.69 | 0.38 +/- 0.0 | 83.7 +/- 9.65 |
| Monthly | Linear Regressor | 250. +/- 1.02 | 184 +/- 0.69 | 0.38 +/- 0.01 | 83.7 +/- 9.65 |
| Monthly | GB Regressor | 255 +/- 0.98 | 195 +/- 1.07 | 0.36 +/- 0.01 | 89.8 +/- 11.0 |
| Monthly | RF Regressor | 277 +/- 2.61 | 212 +/- 1.82 | 0.25 +/- 0.01 | 94.6 +/- 12.1 |
| Monthly | MLP Regressor | 235 +/- 1.47 | 171 +/- 1.52 | 0.45 +/- 0.01 | 81.5 +/- 9.01 |
| Daily | Ridge Regressor | 9.59 +/- 0.0266 | 6.96 +/- 0.0284 | 0.30 +/- 0.007 | 74.9 +/- 1.41 |
| Daily | Linear Regressor | 9.04 +/- 0.0255 | 6.47 +/- 0.0251 | 0.37 +/- 0.0074 | 69.9 +/- 1.28 |
| Daily | GB Regressor | 9.18 +/- 0.028 | 6.81 +/- 0.0308 | 0.35 +/- 0.0067 | 73.9 +/- 1.41 |
| Daily | RF Regressor | 9.76 +/- 0.0297 | 7.11 +/- 0.0401 | 0.26 +/- 0.0042 | 77.2 +/- 1.47 |
| Daily | MLP Regressor | 8.72 +/- 0.0508 | 6.13 +/- 0.143 | 0.38 +/- 0.026 | 67.4 +/- 2.32 |

Most prior electricity forecasting work has utilized coarser spatial scale data, which has
limited analysis of home-to-home variability. While the smart meter data utilized in this study
offer much better spatial resolution, there are applications where household level projections are
less desirable than projections for coarser regional spatial extents. With that in mind, this study
is the first to analyze how different techniques for aggregating data can affect the accuracy of
ML model performance across large spatial scales. In other words, no study has explored
whether high-resolution regional ML model projections will be more accurate if 1) models are
trained with household level data that preserve variations in demand, and results are then aggregated,
or 2) data are first aggregated to the spatial scale at which predictions are being made prior to
running the ML model.
Table 4 summarizes the model results of the two different spatial aggregation methods.
Again, the results are consistent across the four performance metrics within a specific spatial and
temporal resolution combination (e.g., the MLP regressor is the most accurate monthly
post-aggregation model regardless of the selected performance metric). For all three temporal
resolutions and each evaluation metric, the predictions from the post-aggregation method were
overall more accurate than the predictions from the pre-aggregation method.
For direct comparison of annual, monthly, and daily results, the overall best r2 values for
each combination of temporal resolution and spatial aggregation method are highlighted in
Figure 5. The results show that for models being trained with all three temporal resolutions of
data, predictions made at the census tract level with both methods are more accurate than at the
household level. Models trained with monthly data again perform better than models trained with
annual or daily data for both aggregation methods, with the highest r2 value being 0.81 for the
MLP Regressor trained with the post-aggregation method. The r2 values for annual and daily
models using the post-aggregation method are 0.63 and 0.69, respectively. The higher
performance of the post-aggregation method suggests that, when assessing prediction accuracy
for aggregated data, a model originally trained and optimized on data after aggregation will
perform better than one that is trained prior to data aggregation. This means that household level
data are not necessary for improving the accuracy of residential electricity projections for coarser
spatial scales, such as the census tract level.
Table 2-4. Results of the pre-aggregation and post-aggregation training for the top 5 models and all 3 temporal
resolutions

| Temporal/Spatial Resolution | Model | Mean Absolute Error | Median Absolute Error | r2 | Mean Average Percent Difference |
|---|---|---|---|---|---|
| Annual Pre-aggregation | Ridge Regressor | 1430 +/- 28.1 | 1020 +/- 24.2 | 0.47 +/- 0.01 | 30.6 +/- 2.60 |
| Annual Pre-aggregation | Linear Regressor | 1430 +/- 28.2 | 1020 +/- 24.2 | 0.47 +/- 0.01 | 30.6 +/- 2.60 |
| Annual Pre-aggregation | GB Regressor | 1410 +/- 30.3 | 1030 +/- 24.3 | 0.48 +/- 0.01 | 31.1 +/- 2.73 |
| Annual Pre-aggregation | RF Regressor | 1470 +/- 33.4 | 1090 +/- 29.1 | 0.44 +/- 0.01 | 32.0 +/- 2.73 |
| Annual Pre-aggregation | MLP Regressor | 1370 +/- 22.7 | 986 +/- 20.7 | 0.51 +/- 0.01 | 29.8 +/- 2.63 |
| Annual Post-aggregation | Ridge Regressor | 831 +/- 30.2 | 618 +/- 28.0 | 0.65 +/- 0.04 | 13.3 +/- 0.43 |
| Annual Post-aggregation | Linear Regressor | 829 +/- 30.5 | 618 +/- 28.3 | 0.66 +/- 0.04 | 13.3 +/- 0.43 |
| Annual Post-aggregation | GB Regressor | 824 +/- 40.8 | 600 +/- 17.4 | 0.65 +/- 0.04 | 13.7 +/- 0.77 |
| Annual Post-aggregation | RF Regressor | 888 +/- 48.5 | 638 +/- 20.5 | 0.60 +/- 0.04 | 14.6 +/- 0.73 |
| Annual Post-aggregation | MLP Regressor | 868 +/- 45.5 | 628 +/- 27.7 | 0.63 +/- 0.03 | 13.8 +/- 0.67 |
| Monthly Pre-aggregation | Ridge Regressor | 135 +/- 1.25 | 97.9 +/- 1.94 | 0.58 +/- 0.01 | 25.9 +/- 0.81 |
| Monthly Pre-aggregation | Linear Regressor | 135 +/- 1.25 | 97.9 +/- 1.94 | 0.58 +/- 0.01 | 25.9 +/- 0.81 |
| Monthly Pre-aggregation | GB Regressor | 147 +/- 1.37 | 113 +/- 1.37 | 0.53 +/- 0.01 | 29.4 +/- 1.00 |
| Monthly Pre-aggregation | RF Regressor | 175 +/- 1.82 | 138 +/- 1.88 | 0.36 +/- 0.02 | 34.5 +/- 1.07 |
| Monthly Pre-aggregation | MLP Regressor | 112 +/- 1.64 | 78.6 +/- 1.44 | 0.70 +/- 0.02 | 22.1 +/- 1.10 |
| Monthly Post-aggregation | Ridge Regressor | 104 +/- 2.02 | 75.2 +/- 1.29 | 0.68 +/- 0.02 | 18.1 +/- 0.44 |
| Monthly Post-aggregation | Linear Regressor | 104 +/- 2.02 | 75.2 +/- 1.29 | 0.68 +/- 0.02 | 18.1 +/- 0.44 |
| Monthly Post-aggregation | GB Regressor | 112 +/- 3.57 | 84.7 +/- 2.45 | 0.64 +/- 0.02 | 20.0 +/- 0.77 |
| Monthly Post-aggregation | RF Regressor | 144 +/- 3.16 | 106.9 +/- 1.49 | 0.40 +/- 0.03 | 25.4 +/- 0.84 |
| Monthly Post-aggregation | MLP Regressor | 75.2 +/- 2.83 | 51.6 +/- 1.77 | 0.82 +/- 0.02 | 13.5 +/- 0.63 |
| Daily Pre-aggregation | Ridge Regressor | 9.16 +/- 0.025 | 6.58 +/- 0.025 | 0.35 +/- 0.006 | 70.7 +/- 1.3 |
| Daily Pre-aggregation | Linear Regressor | 9.16 +/- 0.025 | 6.58 +/- 0.025 | 0.35 +/- 0.006 | 70.7 +/- 1.3 |
| Daily Pre-aggregation | GB Regressor | 9.18 +/- 0.028 | 6.81 +/- 0.031 | 0.35 +/- 0.006 | 73.9 +/- 1.40 |
| Daily Pre-aggregation | RF Regressor | 9.76 +/- 0.03 | 7.11 +/- 0.04 | 0.26 +/- 0.004 | 77.2 +/- 1.47 |
| Daily Pre-aggregation | MLP Regressor | 8.71 +/- 0.043 | 6.12 +/- 0.125 | 0.38 +/- 0.021 | 67.3 +/- 2.68 |
| Daily Post-aggregation | Ridge Regressor | 3.60 +/- 0.09 | 2.54 +/- 0.06 | 0.69 +/- 0.02 | 19.2 +/- 0.82 |
| Daily Post-aggregation | Linear Regressor | 3.60 +/- 0.09 | 2.54 +/- 0.06 | 0.69 +/- 0.02 | 19.2 +/- 0.82 |
| Daily Post-aggregation | GB Regressor | 3.70 +/- 0.13 | 2.67 +/- 0.07 | 0.67 +/- 0.02 | 20.4 +/- 0.79 |
| Daily Post-aggregation | RF Regressor | 4.49 +/- 0.12 | 3.23 +/- 0.06 | 0.53 +/- 0.01 | 24.4 +/- 0.76 |
| Daily Post-aggregation | MLP Regressor | 4.08 +/- 0.11 | 2.98 +/- 0.10 | 0.62 +/- 0.03 | 22.6 +/- 0.95 |

Figure 2-5. Model performance of the best models for each combination of temporal resolution and spatial
aggregation method measured by r2. The r2 values, represented by the bars, range from 0.34-0.81, with the monthly,
post-aggregation MLP model being the most accurate.
2.3.2 Sequential feature selection
After the initial round of model training, feature selection was completed to find the most
relevant subset of features. Sequential feature selection is commonly implemented in ML studies
both to optimize model performance and to cut down on run times by reducing the feature set.
The results from the sequential feature selection algorithm are shown in Figure 6 for annual,
monthly, and daily data resolutions. Certain variables were consistently selected across all
models and temporal resolutions, such as the home’s square footage and whether a home has a
pool. Climate related features, such as the climate zone the house belongs to, which month it is,
or differing temperature indicators, were also frequently selected. At the annual level, ECDD
was selected for all the top five models except one, while CDD and HDD were selected for three
and two of the models, respectively. For monthly models, which month it is (a proxy for
weather) was selected for all models, and ECDD and HDD were selected for all but one. The
CDD and monthly mean temperature were only kept in the final feature set for two of the top
five monthly models. Lastly, the month variable and daily max temperature were selected by
feature selection for all models at the daily level, followed by CDD, ECDD, daily mean
temperature, and daily min temperature, which were selected for all but one model.
Socioeconomic indicators, including education, linguistic isolation, poverty, housing
burden, and unemployment, were selected less frequently. These variables reflect census
tract-level data, so they are imprecise indicators in characterizing house to house variability. Across all
three temporal resolutions, education and linguistic isolation were the two socioeconomic
indicators that were most often kept in the final feature set. These features are often correlated to
household financial insecurity, which impacts electricity usage. However, all the demographic
variables are highly correlated, making it difficult to tease out their individual influences on
energy behavior or determine why one socioeconomic indicator is more useful to model training
than another.
*Feature selection algorithm was not completed for these combinations of model/temporal resolution because it was
too computationally expensive.
Figure 2-6. Final feature set for top performing annual, monthly, and daily models after performing sequential
feature selection. The features selected by the algorithm are filled in with color.
2.3.3 Feature importance
After training the models, permutation feature importance was used to examine which of
the features were most useful to predicting electricity consumption. The results of the algorithm
showed a decrease in model score (here, r2) when the records of a specific feature are randomly
shuffled within the dataset, breaking the relationship between the feature and the target and
revealing how much the model depends on that feature. Table 5 shows the permutation feature
importance of all the features in the final feature set for household level data with the top
performing ML model, which was MLP Regressor for annual, monthly, and daily trained
models. Total square footage was consistently one of the most useful features to the model,
ranking first for annual and daily, with feature importance values of 0.402 and 0.232
respectively, and second for monthly with a value of 0.272. The annual and daily feature
importance values for square footage are several times higher than those of the next-ranked
annual and daily features (0.075 and 0.087, respectively), which suggests that the rest of the
variables matter much less for these prediction cases.
These results show that feature importance varies significantly depending on the temporal
resolution. In general, weather indicators were much more important to the monthly model than
the daily and annual models. For example, the month of the year, which is strongly tied to
temperature, was the most important feature for the monthly model and had a value of 0.336.
Conversely, the month of the year is ranked second overall for the daily model, but with a much
lower mean importance of 0.087. The low importance of weather features for the daily model
could be because there are so many other uncaptured, highly variable daily occupancy factors
that outweigh the impact of weather. Similarly, ECDD was the highest ranked climate indicator
for the annual model with a low mean importance of 0.059. Because the model was only trained
with two years of data, there might not have been enough variation in annual degree days for the
model to learn, leaving building characteristics to be more useful in predicting annual electricity
use. The results show that socioeconomic indicators are generally less useful to the model than
climate and building characteristics, ranking low in importance across each of the temporal
resolutions. Of the five socioeconomic indicators, linguistic isolation is shown to have the
highest importance for annual, monthly, and daily models, with values of 0.018, 0.028, and 0.015
respectively. Since socioeconomic variables are only available at the census tract level, it follows
that they would not be the best predictors for household electricity demand.
In comparing feature selection and feature importance, the features that were most
frequently selected also had consistently higher values for feature importance. Accordingly,
those that were not often included in the final feature set typically had lower values of
importance in the instances that they were included. For example, the square footage and pool
ownership variables were selected for every model and temporal resolution, and their mean
importances also ranked in the top four for all three temporal resolutions.
Table 2-5. Feature importance with annual, monthly, and daily data by household.

| Annual MLP: Feature | Mean Importance | Std Deviation | Monthly MLP: Feature | Mean Importance | Std Deviation | Daily MLP: Feature | Mean Importance | Std Deviation |
|---|---|---|---|---|---|---|---|---|
| Total Sqft | 0.402 | 0.004 | Month | 0.336 | 0.001 | Total Square Feet | 0.232 | 0.049 |
| Pool | 0.075 | 0.002 | Total Sqft | 0.272 | 0.001 | Month | 0.087 | 0.019 |
| ECDD | 0.059 | 0.001 | Climate Zone | 0.081 | 0.001 | Pool | 0.025 | 0.008 |
| CDD | 0.053 | 0.002 | Pool | 0.063 | 0.001 | Daily Max Temp | 0.022 | 0.017 |
| Climate Zone | 0.042 | 0.002 | Linguistic Isolation | 0.028 | 0.001 | Linguistic Isolation | 0.015 | 0.009 |
| Vintage | 0.023 | 0.001 | Bathrooms | 0.026 | 0.001 | Bathrooms | 0.013 | 0.010 |
| Linguistic Isolation | 0.018 | 0.001 | Vintage | 0.026 | 0.000 | Climate Zone | 0.010 | 0.008 |
| Bathrooms | 0.011 | 0.001 | Monthly Mean Temp | 0.021 | 0.000 | Day of Week | 0.004 | 0.003 |
| Education | 0.007 | 0.000 | Education | 0.020 | 0.001 | Education | 0.004 | 0.008 |
| HDD | 0.006 | 0.000 | Bedrooms | 0.016 | 0.000 | Vintage Category | 0.003 | 0.003 |
| Vintage Category | 0.004 | 0.000 | Poverty | 0.015 | 0.000 | CDD | 0.003 | 0.007 |
| Unemployment | 0.001 | 0.000 | Vintage Category | 0.011 | 0.001 | Daily Mean Temp | 0.000 | 0.000 |
| | | | Monthly Delta Temp | 0.010 | 0.000 | ECDD | 0.000 | 0.000 |
| | | | ECDD | 0.009 | 0.000 | Daily Min Temp | 0.000 | 0.000 |
| | | | HDD | 0.007 | 0.000 | Daily Temp Delta | 0.000 | 0.000 |
| | | | CDD | 0.006 | 0.000 | HDD | 0.000 | 0.000 |
| | | | Unemployment | 0.005 | 0.000 | Poverty | 0.000 | 0.000 |
| | | | Housing Burden | 0.003 | 0.000 | Vintage | 0.000 | 0.000 |
| | | | | | | Bedrooms | 0.000 | 0.000 |
| | | | | | | Unemployment | 0.000 | 0.000 |
| | | | | | | Housing Burden | 0.000 | 0.000 |

2.4 Conclusion and future work
Machine learning models are capable of learning highly complex relationships between
electricity demand and its driving factors, making them a promising tool for energy load
forecasting. To date, studies utilizing ML models to predict residential electricity demand at a
regional scale have only had access to coarse spatial (e.g., city, state, regional) and temporal
(e.g., monthly or annual) electricity data.
The results show that ML models can predict household level electricity demand with a
significant degree of accuracy in certain cases; the best performing model, MLP regressor trained
with monthly data, achieves an r2 value of 0.45. Monthly trained models may have superior
performance to annual and daily models because some of the highly variable day to day
differences in energy demand behavior are averaged out while still providing a greater
distribution of training data than the annual model. Across all temporal resolutions, models
predicted census tract level residential electricity demand with higher accuracy than for
individual households. Using the post-aggregation training method for an MLP regressor model
trained with monthly data, the mean electricity demand of census tracts was predicted with an
r2 of 0.81. These results are promising because they show that residential
electricity demand can be predicted at relatively high-resolution spatial scales without needing
private customer electricity data and can provide insight into patterns of energy demand which
are necessary to understand for daily grid operation and future infrastructure investments.
The total square footage of a building as well as climate indicators were consistently
selected to be in the final feature set across all the models. These features typically were found to
be most important by the feature importance algorithm; total square footage, for example, was
ranked first for the annual and daily models and second for the monthly model. Socioeconomic
indicators did not rank as high but because they were reported at the census tract level, it is
harder to determine their influence on household demand. As this study serves as a framework
for future grid modeling studies, feature selection and feature importance results can also give
insight into where data retrieval efforts should be focused.
Certain limitations cap the extent to which predictions of residential electricity demand
can be made. First, individual behavior of homeowners is highly variable and unpredictable and
can vastly impact electricity demand. The socioeconomic data serves as a proxy to relate the
occupants to their possible energy behavior, but it is not informative enough to account for many
of their decisions pertaining to electricity use. As information about the demographics of
homeowners is highly private and occupancy patterns would be almost impossible to extract, it
would be difficult to surpass this limitation. Second, while the data is regionally representative, it
is not necessarily representative of the conditions that are the focus of the study. For example,
the annual models only have two years of data that may have, due to external factors, been easier
or more difficult for models to predict than other years.
The knowledge gained from this study can serve as a reference to optimize building
energy prediction studies, which are crucial to anticipate future energy needs and develop
climate adaptation and mitigation plans. Researchers can build off the framework presented in this
study and improve model performance through a number of techniques such as tuning models’
hyperparameters, using a combination of models based on the results of a more granular
performance assessment, and employing more complex ML models. Future work will
incorporate highly resolved estimates of future temperature across the region of study into the
optimized ML model to investigate how residential electricity demand might change due to
urban warming. Under a warming climate, the distribution of temperatures, and any other
weather data used in future studies, will be fundamentally different than the historical data that is
available for model training. Building properties and socioeconomics will also shift, meaning the
models will be trained with a feature distribution that no longer exists. The inconsistency
between the feature set for training and real-world data will be a limitation for future studies as
models will have to both interpolate and extrapolate well to make accurate predictions of
electricity demand decades into the future.
Chapter 3: Investigating whether the inclusion of humid heat metrics
improves estimates of AC penetration rates: A case study of Southern
California
3.1 Introduction
Global heat stress projections show significant growth in both exposure to and frequency of
dangerous heat conditions through the 21st century [152], [153]. As temperatures and humidity
rise, widespread access to air conditioning (AC) will be crucial to mitigate the health risks posed
by exposure to extreme heat events [154]–[156]. However, growth in AC adoption and use has
major implications for the world’s energy systems and, depending on the pace of decarbonization
efforts, on greenhouse gas emissions [157]. By 2050, it is estimated that AC will be the second
largest source of global electricity demand due in large part to the huge growth in cooling units
expected in developing countries, many of which are in the hottest regions of the world [7].
Increasing cooling demand will exacerbate the intensity and frequency of peak demand events,
putting even more strain on aging electricity systems [19], [158]; when these electricity systems
fail, power outages interrupt vital services and increase heat exposure, putting public health at
risk [159]–[163].
Although the relationship between electricity demand and temperature has been well
established [44], [164], [165], there are aspects of the thermal environment beyond air
temperature that influence human comfort levels [152], [160], [166], [167], and therefore,
energy-consuming behaviors. Heat stress, a physiological response to extreme humid heat
conditions that limit the body’s ability to regulate temperature, depends on a combination of both
temperature and humidity, among other factors [152], [160], [168]–[172]. Despite studies
showing cooling is impacted by humidity conditions as well as temperature [15]–[18], most
research to date uses temperature-derived variables as the only climate indicators in predicting
electricity demand [19]–[22], [173]. Because of the proven link between humid heat and energy
demand, it is likely that humidity levels impact both AC ownership and patterns of adoption.
However, the connection between AC ownership and humidity has not been explored in the
literature.
As the power sector transitions to a grid that more heavily relies on variable sources of
generation and demand side management, grid planners will benefit from more accurate
estimates of AC ownership at local scales to manage future cooling loads [174] and potentially
leverage these loads for demand side management strategies [175], [176]. Developing a thorough
understanding of spatial and temporal trends in cooling behavior would also help identify areas
with high growth potential for AC adoption, as well as communities that lack access to AC and
are most vulnerable to extreme heat. Developing estimates and projections of residential AC
ownership is difficult because detailed information on homeowner appliances and energy
behavior is rarely publicly available.
Most reported AC penetration rates are from appliance saturation surveys or residential
energy consumption surveys that are carried out by federal or state governments [177]–[179].
These studies are time intensive, expensive, and generally limited in spatial scale to larger
geographic regions (e.g., climate zones or groups of states). Some studies have used the survey
data to build predictive models of AC ownership [180]–[184]. For example, researchers used
responses from the American Housing Survey and American Community Survey to estimate the
probability of AC ownership in census tracts across 115 US metropolitan areas and found
patterns of inequality in AC access [184]. However, the empirical model constructed in the study
is based on nationwide trends that might not hold true in certain regions; specifically, the model
did not perform as well in relatively cool climates (e.g., the Northeast, Northwest, Midwest,
Colorado, and coastal California). The coarse resolution of survey data limits the ability of these
models to develop highly resolved estimates of AC ownership.
As smart meters have become increasingly common, their electricity data records have been
used in a variety of building energy studies to investigate historical energy behavior and demand
with a much higher level of detail than previously possible [29], [106], [185], [186]. Chen et al.
(2018) developed a methodology to determine whether or not a home has AC using household
level smart meter electricity records and local weather data, and then characterized AC
penetration rates at the census tract level across Southern California [23], [24]. This study was
novel in generating highly resolved estimates of AC ownership across a large geographic region
with widely varying microclimates, building stock, and socioeconomics. The methodology was
also used to identify populations that might be especially vulnerable to extreme heat events due
to the confluence of low rates of AC penetration and high poverty levels [187]. These studies
resulted in highly resolved estimates of AC ownership across a large geographical area but were
limited by their focus on dry bulb temperature (DBT) alone to characterize climate-energy
interactions.
More recently, studies that model electricity demand have included humidity-related indices
and found that humidity is a critical element in estimating both cooling and overall demand [15],
[173], [188]–[191]. In one study, models were developed using monthly, state-level electricity
from the United States and various climate indicators to project residential electricity demand
under climate change scenarios. The results showed that projections based solely on DBT can
underestimate electricity demand by as much as 10-15% [15]. A second study used electric load
data from EIA and hourly meteorological data for four electricity regions of southeastern United
States, and found that apparent temperature (AT), which captures both humidity and
temperature, was better for modeling historical electricity demand than DBT alone [190]. The
projected demand using AT was also higher for all four regions than when using DBT. These
studies are significant because they show that humid conditions will alter electricity demand for
space cooling, but they focus only on growth in demand from current units, ignoring potential
installations of new AC.
While previous studies have assessed the demand for cooling using DBT, to our knowledge,
no study has used humidity-related indices to identify patterns of AC ownership. We believe this
relationship warrants investigation, as the literature has shown that humidity impacts both human
perception of heat and overall demand for cooling. In this study, we compute a variety of humid
heat metrics (HHMs) from local weather station data that encompass both temperature and
humidity and build on the methodology developed by Chen et al. (2018) to test our hypothesis
that estimates of AC penetration rates, i.e., the percentage of homes in a defined area that have
AC, can be improved by considering humidity as well as temperature. Southern California is a
particularly interesting case study to develop high resolution estimates of AC penetration rates
because the building stock, socioeconomics, and microclimates, which greatly impact the
likelihood of a household having AC, all vary significantly across relatively small spatial extents
[192]. Further, in California, 75% of people have AC, which is roughly 16 percentage points lower than the
national average [193]. Therefore, it is especially prudent to uncover patterns and trends in AC
ownership in California to foresee where growth in electricity demand might occur and locate
communities that are most at risk during extreme heat events.
3.2 Methods
3.2.1 Electricity Records
Southern California Edison (SCE), an Investor-Owned Utility, provided hourly
residential electricity data from the years 2015 and 2016 for roughly 200,000 households
(including single family homes and apartment units within multifamily buildings) within their
service area. The customers were randomly selected so that the dataset is statistically
representative of Greater Los Angeles’s 4.5 million residential households at a 99% confidence
level. SCE also supplied the street level address of each customer, which allowed for a highly
detailed geospatial analysis. All electricity data were stored on USC’s center for High-Performance
Computing with a highly secure HPC Secure Data Account, to remain in line with
the security and confidentiality requirements of SCE.
Steps were taken to screen outliers in the data that might distort the relationship between
household electricity and the study’s heat metrics. Households with less than half a year of
electricity records were removed from the dataset, as well as homes that had less than 20 kWh of
annual electricity demand, the amount of electricity an average home in California consumes
each day [127]. We omit these homes because including unoccupied homes could distort estimates
of AC penetration rates. Homes with solar panels were removed from the dataset because
electricity demand met by solar panel generation is not measured by the smart meters. Thus, the
gap between measured and actual demand would obscure the relationship between the home’s
electricity consumption and outdoor weather conditions. The data provided by SCE does not
identify customers with residential solar, so a method developed by Chen et al. (2019) was
employed to detect these homes based on their hourly electricity consumption [24]. Only a small
fraction of the homes were identified as having solar panels (1-2%) so their omission should
introduce no significant bias during the period studied. As solar penetration increases over time,
this assumption would need to be reevaluated in future studies. After the screening steps,
158,114 households remained in the dataset.
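The two record-based screening rules above can be sketched as a simple filter. This is a minimal illustration, not the study's code: the input layout (a dict mapping a household ID to its list of daily kWh values), the function name, and the reading of "half a year" as 183 days are all assumptions, and the solar-detection step is omitted because it relies on the separate method of Chen et al.

```python
MIN_DAYS = 183          # assumed reading of "half a year" of records
MIN_ANNUAL_KWH = 20.0   # annual demand threshold stated in the text


def screen_households(daily_kwh_by_home):
    """Return the IDs of homes that pass the two screening rules.

    daily_kwh_by_home: dict mapping a household ID to a list of daily kWh
    values (hypothetical layout chosen for this sketch).
    """
    kept = []
    for home_id, daily_kwh in daily_kwh_by_home.items():
        if len(daily_kwh) < MIN_DAYS:
            continue  # too few records to fit a reliable regression
        # Scale the observed total to a 365-day year before thresholding.
        annual_kwh = sum(daily_kwh) / len(daily_kwh) * 365
        if annual_kwh < MIN_ANNUAL_KWH:
            continue  # demand this low suggests an unoccupied home
        kept.append(home_id)
    return kept
```

Scaling to a 365-day year is one way to treat homes with partial records consistently; the study may have handled this differently.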
3.2.2 Weather Data and Heat Metrics
Local weather data were collected at an hourly resolution for the years 2015-2016 from
three different sources of land-based weather stations: the California Irrigation Management
Information System (CIMIS), the National Oceanic and Atmospheric Administration’s National
Center for Environmental Information (NCEI), and the Environmental Protection Agency Air
Quality System (EPA AQS) [113], [114], [194]. In total, data from 102 stations were used. Each
of the sources contains data from land-based weather stations across the Southern California
region that are automated and quality controlled. Hourly ambient DBT, relative humidity (RH),
and wind speed were measured by all three sources.
Dew point (DP) temperature was also available from CIMIS and NCEI stations, and
NCEI stations measured wet bulb temperature (WBT). Using the DBT and RH, DP and WBT
were calculated for the stations that did not record their values. Effective temperature (ET),
apparent temperature (AT), and Steadman’s model of heat index (HI) were computed using the
weather data retrieved from the weather stations described above. These humid heat metrics
(HHM) were selected because they are commonly discussed in literature regarding human
perception of heat and heat related public health risk and incorporate humidity in their
calculations. There is also stronger consensus within the heat literature on their definitions and
how to calculate them, while many other heat metrics are not as well defined. Table 3-1 defines
both the measured and calculated heat metrics used in this study. The formulas and packages
used to compute the calculated metrics are outlined in the SI.
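As an illustration of how a metric can be derived from DBT and RH, the sketch below estimates dew point with the Magnus approximation. The formulas actually used in the study are given in its SI; the choice of the Magnus form and the coefficient values here are assumptions made for this example.

```python
import math

A = 17.625   # Magnus coefficient (dimensionless), assumed for this sketch
B = 243.04   # Magnus coefficient (degrees C), assumed for this sketch


def dew_point_c(dbt_c, rh_percent):
    """Approximate dew point (degrees C) from dry bulb temperature and RH."""
    if not 0 < rh_percent <= 100:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = math.log(rh_percent / 100.0) + A * dbt_c / (B + dbt_c)
    return B * gamma / (A - gamma)
```

At 100% relative humidity the expression reduces to the dry bulb temperature, which matches the definition of dew point given in Table 3-1.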
Table 3-1. Description of heat metrics used in this study.
Metric | Definition
Dry Bulb Temperature | The ambient temperature measured by a thermometer, referred to as air temperature [195].
Wet Bulb Temperature | The temperature of adiabatic saturation measured by a thermometer covered with a wet cloth. At 100% relative humidity*, the wet-bulb temperature is equal to the air temperature. At lower humidity, the wet-bulb temperature is lower than the dry-bulb temperature [195].
Dew Point Temperature | The temperature that air needs to be cooled to achieve 100% relative humidity*. The higher the relative humidity, the closer the dew point to the actual air temperature [195].
Heat Index | Human perceived equivalent temperature when considering air temperature and relative humidity* [195].
Apparent Temperature | Temperature equivalent perceived by humans (feels like) caused by combined effects of air temperature, relative humidity*, and wind speed [196]. According to the National Digital Forecast Database, the apparent temperature is equal to the dry bulb temperature between 50 and 80°F, the heat index above 80°F, and the wind chill below 50°F [197].
Effective Temperature | The temperature of saturated air that would incur the same level of discomfort for humans as the measured dry bulb temperature and relative humidity*. Thus, the equation for effective temperature includes terms for both the dry bulb temperature and relative humidity [198].
*Relative humidity: The amount of water vapor present in air expressed as a percentage of the amount needed for saturation at the same temperature [195].
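The piecewise definition of apparent temperature in Table 3-1 can be written as a small selection function. This is a sketch only: the function name is hypothetical, the heat index and wind chill are assumed to be computed elsewhere and passed in, and the treatment of the exact 50°F and 80°F boundaries (assigned here to the dry bulb range) is an assumption.

```python
def apparent_temperature_f(dbt_f, heat_index_f, wind_chill_f):
    """Select the apparent temperature (degrees F) by dry bulb temperature range."""
    if dbt_f > 80.0:
        return heat_index_f   # hot conditions: humidity matters
    if dbt_f < 50.0:
        return wind_chill_f   # cold conditions: wind matters
    return dbt_f              # mild conditions: dry bulb temperature is used
```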
3.2.3 Statistical Model
The segmented linear regression developed by Chen et al. (2018) is implemented in this
study to model the relationship between residential electricity use and each of the heat metrics
[23]. In that model, a household’s daily aggregated electricity demand was regressed against
daily average DBT to determine whether the household had AC during the period of study. To
test which heat metric best estimates AC ownership, we therefore aggregate hourly electricity
demand to daily electricity demand for each of the households and regress against the daily
average value of each respective heat metric. Figure 3-1 shows the segmented linear
regressions between daily aggregated electricity use and each of the six heat metrics for an
example household in the study region.
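The aggregation step can be sketched as follows: hourly electricity readings are summed to daily totals, and hourly heat metric values are averaged to daily means. The input layout (lists of timestamp and value pairs) and the function names are assumptions made for this illustration.

```python
from collections import defaultdict
from datetime import datetime


def daily_totals(hourly_records):
    """Sum hourly kWh readings into daily totals keyed by calendar date.

    hourly_records: iterable of (datetime, kwh) pairs (hypothetical layout).
    """
    totals = defaultdict(float)
    for timestamp, kwh in hourly_records:
        totals[timestamp.date()] += kwh
    return dict(totals)


def daily_means(hourly_values):
    """Average an hourly heat metric into daily means keyed by calendar date."""
    sums, counts = defaultdict(float), defaultdict(int)
    for timestamp, value in hourly_values:
        sums[timestamp.date()] += value
        counts[timestamp.date()] += 1
    return {day: sums[day] / counts[day] for day in sums}
```

Pairing the two dictionaries by date gives the daily points that are passed to the regression for each household and heat metric.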
A distance cutoff was implemented so that any household more than 20 miles away from
a weather station was removed from the analysis (refer to the SI for the distribution of distances from
households to weather stations). This distance was selected to keep as many homes as
possible in the dataset without matching homes to weather stations that would not accurately
represent the local conditions of the home. On days where weather station data were missing,
households were matched with weather data from the next closest weather station (so long as
that station was within 20 miles of the home).
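The matching rule with its fallback can be sketched as below. The text does not say how distances were computed, so the great-circle (haversine) distance, the Earth radius value, and the data layout are assumptions made for this example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, assumed for this sketch
MAX_DISTANCE_MILES = 20.0    # cutoff stated in the text


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def match_station(home, stations, has_data):
    """Return the ID of the nearest station within the cutoff that has data, else None.

    home: (lat, lon); stations: dict of station ID -> (lat, lon);
    has_data: set of station IDs reporting on the day in question.
    """
    ranked = sorted(
        (haversine_miles(*home, *coords), station_id)
        for station_id, coords in stations.items()
    )
    for distance, station_id in ranked:
        if distance > MAX_DISTANCE_MILES:
            break  # every remaining station is farther still
        if station_id in has_data:
            return station_id
    return None
```

A household for which the function returns None on every day corresponds to a home dropped by the distance cutoff.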
Figure 3-1. An example set of segmented linear regressions for one home in La Crescenta, CA that was identified
as having AC with all six heat metrics evaluated on each x-axis.
The segmented linear regression depicts two key pieces of information. The first is the
stationary point temperature (SPT) which is the inflection point on the plot and is regarded as the
temperature at which a household is expected to turn on their AC if they have it in their home.
The second takeaway is the electricity-temperature sensitivity (E-T sensitivity), the slope of the
line to the right of the SPT. The slope is the sensitivity of a household’s electricity consumption
to the ambient temperature and is impacted by occupant and household characteristics that are
not explicitly explored in this study due to data limitations (e.g., cooling preferences, occupancy
rates, insulation, AC efficiency). In this study, multiple measures of heat are used, and the
temperature refers to the heat metric used in a given regression (e.g., WBT, ET). The r² values,
which measure the goodness of fit of the segmented linear regression model, are recorded for the
values to the right of the SPT to compare the correlations between electricity and temperature
across the heat metrics.
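A two-segment fit of this kind can be sketched with a grid search over candidate breakpoints. This is not the implementation of Chen et al. (2018): the search strategy, the independent least-squares lines on each side (which are not forced to meet at the breakpoint), the minimum of five points per segment, and all names are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def segmented_fit(temps, demand, min_points=5):
    """Grid-search a single breakpoint and fit one line on each side of it."""
    pairs = sorted(zip(temps, demand))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for k in range(min_points, len(xs) - min_points + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total_sse = left[2] + right[2]
        if best is None or total_sse < best[0]:
            best = (total_sse, k, left, right)
    if best is None:
        raise ValueError("not enough points for a two-segment fit")
    _, k, left, right = best
    # Goodness of fit is reported for the right-hand segment only.
    right_ys = ys[k:]
    mean_right = sum(right_ys) / len(right_ys)
    sst_right = sum((y - mean_right) ** 2 for y in right_ys)
    r2_right = 1.0 - right[2] / sst_right if sst_right > 0 else 0.0
    return {"spt": xs[k], "slope_left": left[0], "slope_right": right[0], "r2_right": r2_right}
```

Here the SPT is taken as the first heat metric value in the right-hand segment, the E-T sensitivity is `slope_right`, and `r2_right` mirrors the choice to report goodness of fit only for points to the right of the SPT.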
A household is determined to have AC if two conditions in the segmented linear
regression are met. The first condition is that the slope to the right of the SPT (referred to as
slope-right) is greater than zero, because it is presumed that a household with AC would have
electricity consumption that positively correlates with increasing ambient temperatures. The
second condition is that the absolute value of the slope-right is greater than the absolute value of
the slope to the left of SPT (referred to as slope-left). A majority of homes in California are
heated with natural gas, meaning the slope-left should typically be near-zero for these homes
[199]. Thus, a household with an absolute slope-right value smaller than the absolute slope-left
value likely does not have AC, as the household’s electricity demand at temperatures above the
SPT is only nominally dependent on the temperature. This condition is set to rule out homes that
have near-zero E-T sensitivities caused by noise or slightly higher electricity consumption of
appliances on warmer days. If a household does not meet these criteria, it is assumed that the
household did not use an AC during the period of study. Examples of households that do and do
not meet these criteria are shown in the SI.
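The two conditions can be expressed as a short predicate. The function and argument names are hypothetical; the logic follows the criteria stated above.

```python
def has_ac(slope_left, slope_right):
    """Apply the two slope conditions described above to one household."""
    responds_to_heat = slope_right > 0                   # condition 1
    dominates_left = abs(slope_right) > abs(slope_left)  # condition 2
    return responds_to_heat and dominates_left
```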
The segmented linear regression is run for each of the households in the study, across
each of the heat metrics defined in Table 3-1. After running the regression for each individual
household, an AC penetration rate is computed for each census tract by dividing the number of
homes identified as having AC within a census tract by the total number of homes available in
our dataset in that census tract. Differences in the computed AC penetration rates, E-T
sensitivity, and SPT when separate heat metrics are used are discussed below.
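The tract-level calculation described above amounts to a grouped ratio. A minimal sketch, assuming a hypothetical input of (tract ID, AC flag) pairs with one pair per household:

```python
from collections import defaultdict


def ac_penetration_by_tract(households):
    """Percent of homes flagged as having AC in each census tract.

    households: iterable of (tract_id, has_ac) pairs (hypothetical layout).
    """
    with_ac = defaultdict(int)
    total = defaultdict(int)
    for tract_id, flagged in households:
        total[tract_id] += 1
        with_ac[tract_id] += 1 if flagged else 0
    return {tract: 100.0 * with_ac[tract] / total[tract] for tract in total}
```

Running this once per heat metric yields the per-metric tract rates that are compared in the results.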
3.2.4 Spatial Analysis
Maps were created to illustrate the geospatial variations in AC ownership across the
study region and differences in estimated AC penetration rate for each of the heat metrics used.
The results of the household regression for each of the six heat metrics were aggregated to the
census tract level to protect the privacy of the customer data. Then, estimates of AC penetration
rates were depicted using choropleth maps and census tract boundary shapefiles from the US
Census Bureau [200]. The climate zones as defined by the California Energy Commission were
also depicted to generate a better understanding of how AC ownership differs across the
microclimates of the region [201].
3.3 Results and Discussion
3.3.1 Differences in estimated AC Penetration Rates
The AC penetration rates from each HHM were compared against the AC penetration
rates found using DBT. Areas shown in red have lower rates of AC penetration (when the given
heat metric is used instead of DBT) and tend to be in inland and desert areas, which are hotter
and drier; areas shown in blue have higher estimates and are typically coastal, which tend to be
cooler and more humid. The estimates for AT, ET, and HI closely align with DBT, while more
significant differences are observed in the maps for WBT and DP.
Figure 3-2. Choropleth maps depicting the difference between AC penetration rates at census tract level estimated
with each HHM and DBT. The difference is found by subtracting the AC penetration rates computed with DBT
from the AC penetration rates computed using each of the HHMs: a) WBT, b) AT, c) ET, d) HI, and e) DP.
Generally, the AC penetration rate computed with an HHM is lower (red) in desert regions and higher (blue) in
coastal regions than when DBT is used.
A summary of the study region’s average regression results for each heat metric is shown
in Table 3-2. The estimated AC penetration rate ranges from 73% (DBT) to 83% (DP). In general,
there is agreement between the AC penetration rates estimated by the HHMs and DBT.
However, the regional estimates of AC penetration rates with AT, ET, and HI are closer to the
estimates produced with DBT than those with WBT or DP, a trend also depicted in the choropleth maps
in Figure 3-2. The regional average E-T sensitivity values computed by the regression models
range from 0.08 kW/degrees C for DP to 0.15 kW/degrees C for ET across the six heat metrics
evaluated (see Table 3-2).
Regional average r² values for each heat metric are also given in Table 3-2. The model is fit
to maximize the r² value for all data points, but the reported r² values only consider the set of
data points to the right of the SPT in the segmented regression model, as we are most interested
in how a home responds to temperature at the critical point at which a cooling system is turned
on. The r² values for the six heat metrics range from 0.15 to 0.40. DP represents the lower
boundary of this range, and HI and AT both have an r² of 0.40. In general, these results show that
heat metrics that include humidity have an r² value that is either lower than or similar to that of the
regression model analyzing DBT alone.
Table 3-2. Summary of the study region’s averaged regression results for each heat metric.
Metric | AC Penetration Rate (%) | SPT (degrees C) | E-T sensitivity (kW/degrees C) | r²
Dry Bulb Temperature | 73 | 19.4 | 0.10 | 0.39
Wet Bulb Temperature | 80 | 14.2 | 0.13 | 0.28
Dew Point Temperature | 83 | 10.6 | 0.08 | 0.15
Heat Index | 75 | 19.1 | 0.10 | 0.40
Apparent Temperature | 74 | 19.4 | 0.11 | 0.40
Effective Temperature | 77 | 17.9 | 0.15 | 0.39
These results contradicted our initial hypothesis that HHMs would be significantly better
suited for identifying whether a home has AC. We expected that household cooling demand
would be best correlated with heat indices that account for humidity, based on the understanding
that a person’s comfort level is impacted by both temperature and humidity. The weak
correlation between WBT and demand could be explained by the findings in Vecellio et al.
(2022), which show that WBT does not appropriately capture the nonlinear function of
temperature and humidity that matches human physiology [202].
Additionally, the other three HHMs performed no better than DBT. The results make sense from
an engineering perspective, given that the regional climate zones analyzed in this study generally
do not experience consistently high humidity. In an AC unit, the temperature and moisture
content of outdoor air are reduced upon interaction with the AC’s cooling coils, which are kept
below the air’s DP [203]. While there is an energy penalty associated with dehumidifying the air
(i.e., the latent load), the total energy load is dominated by the sensible load (i.e., the energy
required to reduce the air temperature) except in extremely humid climates [204]. Hence, it is
likely that the low humidity levels in Southern California do not cause an observable signal in
the overall electricity demand of a household (see SI for distribution of RH and DBT across
study region).
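The split between the two loads can be written with the standard psychrometric approximation. This is a general textbook relation offered for illustration, not a formula taken from the study; the symbols (air mass flow rate \dot{m}, specific heat of air c_p, latent heat of vaporization h_{fg}, humidity ratio w, and the subscripts for air entering and leaving the coil) are introduced here as assumptions.

```latex
\dot{Q}_{\text{total}}
  = \dot{Q}_{\text{sensible}} + \dot{Q}_{\text{latent}}
  \approx \dot{m}\, c_p \,\bigl(T_{\text{in}} - T_{\text{out}}\bigr)
  + \dot{m}\, h_{fg} \,\bigl(w_{\text{in}} - w_{\text{out}}\bigr)
```

When the entering air is relatively dry, the humidity ratio difference is small and the sensible term dominates the total, which is the situation described above for Southern California.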
Consequently, our results may be region specific; a city that is both hot and humid likely
demonstrates a stronger link between humidity metrics and overall demand. However, people
living in more humid climates are also more likely to be more tolerant of higher humidity levels
than those living in drier regions due to regional acclimatization [205], which might dilute an
observable relationship between cooling load and humidity. Conducting studies in regions with
diverse climates would provide insight into the interactions between humid heat, human behavior
and acclimatization, and electricity demand, but the lack of availability of household level
electricity data is a limiting factor.
3.3.2 Improving confidence in AC estimates
While the differences in r² results from the regression models are not definitive enough to
state which of the metrics should be used to determine AC ownership, evaluating AC penetration
with multiple metrics can provide higher confidence in the estimations. In Figure 3-3, the homes
were grouped by the number of heat metrics that identified the household as having AC, and the
breakdown of which heat metrics identified the households as having AC within each grouping is
shown in the bar chart. In Figure 3-3 a), 69% of households were determined to have AC using the
segmented linear regression methodology with all five of the heat metrics (note that DP is excluded
because preliminary results showed it was a poor predictor of AC ownership). Figure 3-3 offers
insight into our confidence in the total AC penetration rate across the region of study, which is
highest for the set of homes identified as having AC based on agreement between 5 metrics (69%)
and slightly less as we add the additional homes identified with at least 4 metrics (+3% of homes)
or 3 metrics (+2%), raising the overall AC penetration rate estimates to 72% and 74%, respectively.
We have low confidence for regional AC penetration rate estimates in the range of 76% to 81%,
which includes all homes identified with at least one metric.
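The consensus tally behind these figures can be sketched as a cumulative count of homes by the number of metrics that flag them. The input layout (one list of booleans per home, one entry per heat metric) and the function name are assumptions made for this example.

```python
from collections import Counter


def consensus_shares(flags_by_home):
    """Cumulative percent of homes flagged as having AC by at least n metrics.

    flags_by_home: list of per-home lists of booleans, one per heat metric
    (hypothetical layout). Returns {n: percent flagged by >= n metrics}.
    """
    n_homes = len(flags_by_home)
    n_metrics = len(flags_by_home[0])
    votes = Counter(sum(flags) for flags in flags_by_home)
    shares, running = {}, 0
    for n in range(n_metrics, 0, -1):
        running += votes.get(n, 0)
        shares[n] = 100.0 * running / n_homes
    return shares
```

With the shares reported above (69% of homes flagged by all five metrics, a further 3% by four, and a further 2% by three), the cumulative values are 69%, 72%, and 74%.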
These results align with regional estimates (table of estimates given in SI) conducted by
[178], [184], [206]–[209], suggesting that we can have high confidence in the 69% of homes that
were identified as having AC by all heat metrics. Although the most recent California Residential
Appliance Saturation Survey estimates that 86% of customers in SCE territory have AC, the
estimate is based on 2019 survey data rather than 2015 and 2016. Additionally, the survey includes
any household that reported owning an AC, regardless of how often they use it, and our method
might not capture households that use their AC infrequently (e.g., a vacation home with low
average occupancy throughout the year). Similarly, the study by Romitti et al. (2022) [184] reports
a higher average AC penetration rate, 81%, for the Los Angeles-Long Beach-Anaheim area but
also uses more recent survey data and would capture all AC ownership, regardless of use.
Figure 3-3. (a): Percentage of homes identified as having an AC with all five heat metrics (i.e., consensus across all
metrics). (b-e): The additional homes identified as having AC with a consensus of n metrics. (f): Summary of the
percentage of homes identified as having AC by n heat metrics. The transition from dark to light blue implies
diminishing confidence in the homes identified as having AC (e.g., we have more confidence in the homes identified
with 5 metrics, represented with dark blue, than the homes identified with 1 metric, represented with light blue).
3.4 Conclusion
Highly resolved estimates of AC ownership are essential to prepare for future cooling
demand and identify communities who will be most at risk during future extreme heat events.
However, determining AC penetration rates at fine scales is difficult due to the lack of
availability of household level data and limited understanding of how AC use behavior responds
to varying heat metrics. This study improved upon existing methods of predicting AC
penetration rates by incorporating a variety of humidity and temperature related heat metrics
with a robust dataset of electricity records for ~160,000 homes in Southern California.
In total, 81% of the households were identified as having AC by at least one heat metric
(when excluding DP), while 69% of the homes were determined to have AC with a consensus
across all five of the heat metrics. These results are aligned with the results from other studies of
the region (SI B4). A limiting factor of any method used to estimate AC penetration rates is that
there is no ground truth of residential AC ownership to validate against, particularly across small
spatial extents, which is important for understanding heat vulnerability across different socioeconomic groups. Hence, our method is advantageous because it provides insight into our
relative certainty in estimating if a home uses AC based on five analyses of electricity usage and
a respective heat metric. Accordingly, while this analysis suggests that between 69-81% of
households in SCE territory have AC, we have higher confidence that the true range is 69-74% of homes
for the years analyzed.
The computed regional AC penetration rates range from 73% for DBT to 83% for DP.
Maps of AC penetration rates show that there are geospatial variations in the prediction of AC
ownership. For DP and WBT, where regional estimates diverged from DBT more significantly,
the drier, hotter regions were estimated to have lower AC ownership than when DBT was used.
The opposite was true in the milder, more humid coastal regions (i.e., calculated AC penetration
rate was higher for DP and WBT than DBT). The regional average r² values vary from 0.15 to
0.40, and the highest values are from HI and AT. WBT performed worse than DBT (0.28 vs
0.39), suggesting that the demand for cooling is more dependent on air temperature than
humidity. While this contradicts our initial hypothesis, it is consistent with thermodynamic
principles, and results might be different in areas of very extreme humidity where the latent load
of AC units is much more pronounced.
While it is difficult to draw a conclusion as to which heat metric most accurately
predicts AC ownership from the results of this study, using DBT alone possesses several
advantages and performed similarly to or better than other metrics within this study region. DBT
is a well understood metric of heat, and DBT data can be easily retrieved from a variety of
historical weather sources, unlike other heat metrics. Additionally, regional meteorological
models and climate models can predict DBT with more accuracy than humidity and heat metrics
that include humidity [210]–[213]. We chose Southern California as our study region because it
is one of the only regions where researchers can gain access to smart meter data at a large scale
(through a formal process outlined by California’s Public Utilities Commission) [214] across
diverse climate zones, and it is expected to have relatively large increases in AC adoption in the
coming years when compared to other regions of the United States that already have high AC
penetration rates. While we acknowledge that the outcome of this study may be regionally
specific, the outlined methodology can serve as a framework that should be repeated in more
humid climates as smart meter data becomes available to confirm this conclusion. Furthermore,
repeating this study with higher resolution temperature and heat metrics would be desirable to
ensure that the distance to weather station, which can be as much as 20 miles in this analysis,
does not skew results.
Chapter 4: Revealing spatial and temporal patterns of residential cooling in
Southern California through combined estimates of AC ownership and use
4.1 Introduction
Rising temperatures associated with global climate change and urban warming coupled with
higher standards of living are set to drive huge increases in the electricity demand for cooling.
By 2050, the global capacity for air conditioning (AC) is expected to triple through both new AC
adoptions and increased use of existing units [7]. While it is prudent to ensure that people have
sufficient access to cooling resources, especially as extreme temperatures threaten public health
[215], doing so will have large implications for the power grid. Thus, a robust understanding of
residential cooling demand is necessary to identify communities without adequate AC access and
plan for future energy needs.
AC is a key adaptation tool to protect populations from the health effects of climate
change [216], especially as extreme heat events both intensify and become more frequent [217].
As such, equitable and resilient cooling access is a major objective for several stakeholders
including utilities, public health officials, and energy advocates. Although there has been huge
growth in AC adoption, there are still many communities and countries in warm climates with
low rates of AC ownership [187], [218]. Further, many households with AC are unable to meet
their energy needs because of the rising costs of electricity. For example, in a 2020 survey of US
household energy insecurity, 5% of respondents cited that financial circumstances prevented
them from using their AC and 11% reported keeping their home at an unhealthy temperature to
lower their electricity bill [177]. It is important to identify the communities that do not have
adequate access to AC (either because of the complete lack of AC or underutilization of cooling
appliances due to energy insecurity) to promote policies that will ensure vulnerable communities
will not be in danger during extreme heat events.
Meeting the increased demand for cooling could exacerbate the challenge of managing peak
loads across the power grid. The use of AC units and fans currently accounts for 20% of
electricity demand in buildings and 10% of all global electricity consumption [7]. With rising
temperatures and increasing AC adoption, this percentage is expected to grow, placing strain on
the electric grid through increases in both the overall and peak demand [19], [219], [220]. Grid
operators and utilities rely on accurate forecasts of electricity demand to ensure there is enough
power on the grid at any given time [221]. As AC units account for a significant portion of
electricity consumption in hot months, accurate estimates and projections of the cooling demand
are critical.
The challenge of quantifying a region’s demand for cooling is two-fold. Highly accurate,
high-resolution estimates of AC penetration are essential to determine the contribution of cooling
to a region’s electricity consumption, as well as make projections of how energy needs may
change in the future. However, residential customers have widely varying patterns of demand
due to different occupancy patterns, thermal comforts, building characteristics, and appliances
[33], [222]–[224]. Thus, merely knowing if a household has AC, or the number of households in
a region with AC, is not enough to model the energy demand of the house or region itself.
Instead, knowledge of how residential customers use their AC, in combination with who has AC,
is critical to evaluate the electricity demand that is required for cooling.
Exploring patterns of AC ownership and use is difficult due to the shortage of data. Data
regarding household appliances is rarely publicly available and information about AC ownership
and usage has typically been gathered through state or federal surveys which are both financially
expensive and time intensive [177]–[179]. Further, these efforts most often produce AC
estimates at large spatial extents, such as statewide or regionally, that do not provide insight into
local energy needs. More recently, studies have utilized large scale smart meter data records to
study cooling demand, but these methods have shortcomings [23]–[25], [187], [225]. First,
several of these studies ignore the impact of electric heaters on the electricity temperature
relationship by either excluding data at lower temperatures or assuming electric heaters are rarely
present in the dataset, which can lead to misidentification of AC households [24], [25], [225].
Second, most studies utilize daily electricity data, which does not capture intraday patterns of
electricity use, and therefore focus primarily on classifying if households have AC rather than
analyzing how ACs are used [24], [74], [225], [226].
In this study, we present a three-part framework to study spatial and temporal patterns of
cooling demand in Southern California. In the first part of this study, we develop a novel
methodology (referred to as the “AC Ownership Algorithm” for the remainder of the paper) that
utilizes hourly smart meter electricity records to identify the presence of AC and electric heat
appliances based on the relationship between electricity demand and outdoor temperature. In the
second part of this study, we then adapt and apply a linear regression method (referred to as the
“AC State Algorithm” for the remainder of the paper) to the identified AC households to make
predictions about their hourly AC on/off state. (Note: even though we identify electric heating in
this paper, doing so is only to estimate AC penetration more accurately. Hence, we focus our
analysis on AC ownership and use characterization, and we do not attempt to characterize
electric heating use). In the final step of the methodology, we aggregate and combine estimates
of AC ownership and hourly AC states to better understand the regional cooling demand.
Through this analysis we answer the following research questions:
1. How do AC penetration estimates produced with a model that utilizes hourly electricity
data and considers both electric heating and cooling compare to previous estimates in the
literature?
2. Can we use hourly electricity data to identify hours in which customers use their AC?
3. What are the aggregate trends in sub-daily cooling behavior across temporal, climatic,
and spatial extents?
4. Can a region’s residential cooling behavior be captured through combined estimates of
regional AC ownership and patterns of AC consumption?
This framework improves upon previous studies because it can identify electric heating in
homes, and hence, does not ignore or require the lack of electric heating to correctly identify AC.
The use of hourly data also gives insight into the intraday patterns of household AC
consumption, which can better inform grid system planning, energy equity policies, and demand
side management.
4.2 Literature review
As smart meter installations expand and providers make data more accessible, researchers
have used the electricity records to make inferences about residential electricity behavior.
Specifically, studies have used the relationship between electricity demand and heat metrics to
make estimates of which households in a dataset have AC [24], [25], [225], [227]. These
methods are based on the understanding that AC units will consume more energy to cool a space
as temperatures increase above a certain temperature threshold; thus, a positively correlated
relationship between demand and temperature at higher temperatures will indicate a household
with AC.
4.2.1 Studies on AC ownership
Several papers have employed a method that screens for whether a household’s electricity
demand has temperature dependence and determines whether a household has AC based on that
dependence. For example, in the first step of a multi-part methodology, Dyson et al. regressed the
daily electricity demand of a household against the ambient daily average temperature on days
above 55°F, calculated the slope of the linear model, and asserted that homes with a positive
slope (i.e., electricity demand increases with temperature) had AC [25]. The method used in this
study is a simple demonstration of the electricity temperature sensitivity concept that underlies
many of the studies regarding AC identification and behavior [24], [225]–[228].
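The slope-screening idea described above can be illustrated with a minimal Python sketch. It is not the code used by Dyson et al.; the function names, the guard for degenerate inputs, and the toy data are all hypothetical, and only the 55°F cutoff and the positive-slope rule come from the description above.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs (0.0 if undefined)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx if sxx > 0 else 0.0


def has_ac_by_slope(daily_temp_f, daily_kwh, min_temp_f=55.0):
    """Flag a home as having AC if daily demand rises with temperature on warm days."""
    warm = [(t, e) for t, e in zip(daily_temp_f, daily_kwh) if t > min_temp_f]
    temps = [t for t, _ in warm]
    loads = [e for _, e in warm]
    return ols_slope(temps, loads) > 0.0


# Toy data (hypothetical): daily average temperature (°F) and daily demand (kWh).
temps = [50, 58, 65, 72, 80, 88, 95]
ac_home = [18, 18, 19, 22, 27, 33, 38]    # demand climbs on hot days
flat_home = [18, 18, 18, 18, 18, 18, 18]  # no temperature dependence

ac_flag = has_ac_by_slope(temps, ac_home)
flat_flag = has_ac_by_slope(temps, flat_home)
```

Because the rule only checks the sign of the slope, any positive value, however small, flags the home as having AC.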
One limitation of this methodology is that homes without AC that have a very slight
temperature dependence (e.g., a home that uses fans during warmer temperatures) could be
misclassified as having AC, since there is no minimum slope threshold. Chen et al. developed a
more robust methodology to avoid misclassifying these homes that regressed daily average
electricity demand against daily average temperature with a segmented linear regression model
[23]. Then, a home was determined to have AC if a) the slope to the right of the stationary point
temperature, or SPT (the point at which a home is expected to turn on their AC if they have it),
was greater than zero and b) the sum of the slopes to the right and left of the SPT was greater
than zero. The second criterion was included to ensure that homes with a negligible temperature
dependence were not identified as having AC, but the rule assumes the household does not have
electric heating or rarely uses it. While this is a reasonable assumption in California where a
majority of homes are heated with natural gas, electric heating is more common in other regions
and will become more common on a future grid with high electrification [199], [229]. This
methodology was used to make census level estimates of residential AC ownership across the
Southern California region and identify communities that would be most vulnerable to extreme
heat, and was adapted in a later study to test whether humid heat metrics are better indicators of AC
ownership [24], [187], [225].
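The two-part decision rule can be sketched as follows. This is a simplification for illustration, not the segmented regression of Chen et al.: it assumes the SPT is already known and fits two independent least-squares lines on either side of it, and the helper names and toy data are hypothetical.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs (0.0 if undefined)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx if sxx > 0 else 0.0


def classify_ac(temps, loads, spt):
    """Apply the two criteria: right slope > 0 and (left + right slopes) > 0."""
    left = [(t, e) for t, e in zip(temps, loads) if t <= spt]
    right = [(t, e) for t, e in zip(temps, loads) if t > spt]
    slope_left = ols_slope([t for t, _ in left], [e for _, e in left])
    slope_right = ols_slope([t for t, _ in right], [e for _, e in right])
    return slope_right > 0 and (slope_left + slope_right) > 0


# Toy data (hypothetical): daily average temperature (°F) and demand.
temps = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90]
ac_only = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
heat_and_ac = [3.5, 3.0, 2.5, 2.0, 1.5, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5]

ac_only_flag = classify_ac(temps, ac_only, spt=65)
heat_and_ac_flag = classify_ac(temps, heat_and_ac, spt=65)
```

In this toy case the second home has a positive cooling slope but a steeper negative heating slope, so the sum of slopes is negative and the home is not flagged. This is the situation in which the no-electric-heating assumption matters.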
A study by Elmallah et al. investigated access to both heating and cooling in Northern
California through a dataset of ~60,000 households in PG&E territory, addressing the gap in the
literature regarding electric heat [230]. In this study, the segmented linear regression model
described by Chen et al. was adapted to detect electric or gas heating and electric cooling using
both gas and electricity usage records. In contrast to the model used by Chen et al., which used
one changepoint (referred to as the SPT), the researchers fit the data to three different linear
models in which there were no changepoints, one changepoint, and two changepoints. The
different models represent 1) a house without heating or cooling that has no temperature
dependence, 2) a house with either heating or cooling that has temperature dependence at either
low or high temperatures but not both, and 3) a house with both electric heating and cooling that
has temperature dependence at both low and high temperatures. Then, Bayesian Information
Criterion was used to select the best model for each household, informing whether the household
heats or cools. In all, the study detected gas or electric heating in 68% of households, and electric
cooling in 40% of households. This study is advantageous because it does not rely on the
absence of electric heating to categorize homes, and further explores the distribution of both
heating and cooling access.
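Model selection with the Bayesian Information Criterion can be illustrated with a short sketch. It assumes the common Gaussian least-squares form of BIC, n·ln(SSE/n) + k·ln(n); the parameter counts and error values below are hypothetical and are not taken from Elmallah et al.

```python
import math


def bic(sse, n_obs, n_params):
    """BIC for a least-squares fit with Gaussian errors (lower is better)."""
    return n_obs * math.log(sse / n_obs) + n_params * math.log(n_obs)


def select_model(fits, n_obs):
    """fits maps a model name to (sum of squared errors, number of parameters)."""
    scores = {name: bic(sse, n_obs, k) for name, (sse, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits for one household with 365 daily observations.
fits = {
    "no_changepoint": (400.0, 2),    # no heating or cooling dependence
    "one_changepoint": (250.0, 4),   # heating or cooling, not both
    "two_changepoints": (245.0, 6),  # both heating and cooling
}
best_model, bic_scores = select_model(fits, n_obs=365)
```

Here the two-changepoint model has the lowest raw error, but its extra parameters are not justified by the small improvement, so the one-changepoint model is selected.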
4.2.2 Studies on AC use
While these studies were novel in their ability to detect cooling across large spatial
extents at high resolutions, they only capture whether a household has an AC unit which is not
enough to quantify the cooling demand of a region. Information pertaining to how households
use their AC is critical to plan for electricity needs, but acquiring appliance-level data is
challenging, and thus research related to AC use is even more limited. Studies on household
AC use typically analyze data obtained through surveys on household energy use [231]–[233] or
smart meter trials and programs with sub-metered appliances [234]–[241]. For example, a study
related to AC usage in Hong Kong collected questionnaires from ~554 residents which included
questions about how many hours and in which months they turned on their AC at night as well as
their temperature settings [231]. The results of the study provided insights into the cooling
preferences of residents in Hong Kong, but studies that use survey data are only capable of
capturing general trends in AC use and are likely imprecise as they rely on customers to
accurately report their energy behavior.
Datasets consisting of sub-metered appliance electricity records are advantageous
because they can produce a more exact quantification of the cooling patterns of the studied
buildings. A study in Sydney analyzed the contribution of ACs to regional summer demand
peaks using the Smart Grid Smart City (SGSC) data set which includes appliance level data from
808 homes and found that residential AC contributes up to 9% of total peak demand
[240]. However, the authors acknowledge that the size of the dataset is a limitation of the study
and may not fully capture the variety of AC load profiles that exist in the study region. In
general, a limitation of appliance monitoring datasets is that they consist of a small number of
samples (e.g., less than 1000 homes). Thus, until large scale sub-metered electricity datasets
become available, using appliance monitoring to draw inferences about the cooling behavior of
an entire region is not feasible.
The granularity and size of smart meter datasets present an opportunity to gain insight
into patterns of cooling behavior within and across regions. The previously described AC
identification studies used smart meter data records but aggregated the records to daily data.
While it has been shown that the correlation between daily electricity records and temperatures is
stronger than the correlation between hourly electricity and temperatures [23], the coarse
resolution conceals intraday patterns of AC use. Conversely, many studies have used higher
resolution data and developed methods to non-intrusively disaggregate appliance level
consumption from overall electricity demand, but isolating the AC load from smart meter records
is challenging [242]. For example, one study implemented a three-stage load decomposition
method that relied on the hourly electricity temperature relationship and building characteristics
to separate the AC load from the household’s total load [185]. The method was able to accurately
estimate the AC load profiles of the households in the dataset, when compared with ground truth
appliance level data. However, researchers with large scale smart meter datasets typically do not
have access to the building characteristics that were utilized in this study.
In the second step of the study by Dyson et al., the authors used smart meter data from
30,000 customers in PG&E’s service territory to identify the hours in which a household turned
on their AC [25]. Similar to the studies that determine the presence of an AC unit based on
temperature and electricity, the household’s hourly electricity demand and outdoor temperature
were fit to a linear regression model. However, this study utilized hourly measurements to
analyze intraday electricity demand and make inferences about AC usage. Each pair of hourly
electricity and temperature measurements was fit to either a temperature-independent model or
a temperature-dependent model with hourly and weekend/weekday fixed effects and reassigned
iteratively until the model converged. The authors then calculated the impact that a 4-degree
change in the AC setpoint would have on each household’s power consumption and aggregated
the results to estimate the demand response capacity of customers in PG&E’s service territory.
This study provided meaningful insight into the extent of grid services and flexibility that
residential customers can provide, but the analysis of cooling behavior was limited.
4.2.3 Research gaps in the literature
Recent residential cooling demand studies have produced highly resolved estimates of
AC ownership across large spatial extents, offering unprecedented insight into regional patterns
of AC adoption. However, a majority of these studies rely on daily data to infer which
households have AC, thus limiting the knowledge of patterns of AC use. Conversely, studies
pertaining to the patterns of AC have thus far utilized small dataset samples that cannot be
extrapolated to understand regional cooling usage. Therefore, a research gap exists in the
literature as the heterogeneity of AC consumption across spatial, temporal, and climatic extents
has not been well explored. In this body of work, we use a large scale hourly smart meter dataset
to make highly resolved estimates of household AC ownership and use patterns across the study
region of Southern California.
4.3 Methodology
In this section, we describe the three-part framework that we develop to characterize cooling
behavior in Southern California using smart meter data from ~200,000 residential customers.
Before implementing the framework, we carry out several filtering methods and outlier checks to
ensure that our dataset only contains households with a sufficient amount of data, and that
erroneous values are removed for each separate household. In step 1 of our framework, we use a
novel AC identification methodology (the AC Ownership Algorithm) to determine which
households have AC. We compare our AC ownership results with the results from Chen et al.
[24] and survey data for the same study area [178], [208] to analyze the impact that household
location and technologies may have on AC identification for each method. In step 2, we employ
an AC state model (the AC State Algorithm) to determine in which hours the AC households
(determined in step 1) have their AC on. In step 3, we combine our estimates of AC penetration
and operation to describe the cooling demand of the study region. This series of steps is
summarized in Figure 4-1. In this three-part framework, we define and calculate three different
metrics that capture different aspects of a region’s cooling demand:
• AC Penetration Rate: The estimated percentage of homes in a defined region with AC.
• AC Operation Rate: The estimated fraction of hours out of a defined set of hours (e.g.,
the two-year study period or every hour 12 in a year) for which the AC is active. The rate
can be calculated for a single household or as an average of all households in a defined
region.
• Net AC Utilization: The product of a defined region’s AC Penetration Rate and AC
Operation Rate.
Figure 4-1. Overview of methodology for finding AC Penetration Rates, AC Operation Rates, and Net AC
Utilization.
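As a worked illustration of the three metrics defined above, the following Python sketch computes them for a toy region. The function names and data are hypothetical, and the sketch assumes that the regional AC Operation Rate is averaged over the households identified as having AC.

```python
def ac_penetration_rate(has_ac_flags):
    """Fraction of homes in a region identified as having AC."""
    return sum(has_ac_flags) / len(has_ac_flags)


def ac_operation_rate(ac_states):
    """Fraction of hours in a defined set for which one home's AC is on (1) vs off (0)."""
    return sum(ac_states) / len(ac_states)


def net_ac_utilization(penetration, operation):
    """Product of a region's AC Penetration Rate and AC Operation Rate."""
    return penetration * operation


# Toy region (hypothetical): four homes, three of which have AC.
has_ac = [True, True, True, False]
# Hourly on/off states over the same four hours for each AC home.
states_by_home = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]]

penetration = ac_penetration_rate(has_ac)
home_rates = [ac_operation_rate(states) for states in states_by_home]
regional_operation = sum(home_rates) / len(home_rates)
net_utilization = net_ac_utilization(penetration, regional_operation)
```

With these inputs the penetration rate is 0.75, the regional operation rate is 0.5, and the Net AC Utilization is their product, 0.375.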
4.3.1 Dataset information and preprocessing
The dataset used in this analysis consists of smart meter electricity records measured in
15-minute intervals at the household level for 2015 and 2016. The data was provided by
Southern California Edison (SCE), an investor-owned utility, and contains data from roughly
200,000 distinct customers identified by SCE as being single-family households. The customers
were selected at random to be statistically representative of the 4.5 million households located in
Greater Los Angeles at a 99% confidence level. The street address for each customer was also
provided, allowing for detailed spatial analysis (e.g., AC use at the census tract level). These
dwellings span over ~2,500 census tracts and 7 building climate zones, as defined by the
California Energy Commission [201], in the Southern California area. As this data is highly
confidential, the smart meter records were stored on a high security data account (HSDA)
provided by the University of Southern California to meet the security requirements of SCE.
Prior to applying the AC Ownership and AC State Algorithms, we perform outlier
analyses on the aggregate hourly and daily electricity data. A large portion of this outlier analysis
follows the steps performed by Chen et al. [24] and Peplinski et al. [225]. The goal of the
preprocessing step is to remove all homes for which there is insufficient smart meter data and
remove smart meter records that indicate missing or highly abnormal behavior. We aim to curate
a dataset that is both representative of the region and makes it possible to clearly establish the
relationship between electricity and temperature at the household level.
First, homes with less than 20 kWh of average annual electricity consumption, which is
approximately the daily electricity demand of an average California home, are removed as it is
likely these homes are uninhabited [127]. Additionally, homes with consumption falling 3
standard deviations above the mean annual electricity consumption are removed as outliers.
Next, we filter all homes that are suspected to have solar panels on site to avoid the
inconsistencies created by net metering, discussed in greater detail in the work by Chen et al. [24] (we
estimate that less than 2% of homes in our dataset have solar panels). For the remaining homes,
we aggregate the 15-minute smart meter data to the hourly level and drop all hours for which the
electricity consumption is zero. For a smart meter to give a reading of zero across an hour, the
home would have to either be disconnected from the grid due to long-term vacancy or power
failure or possess solar panels that cause a meter read of zero due to net-metering. In either case,
the hours in question would not reflect the customer’s typical consumption patterns, which may
interfere with the analysis of AC ownership and use. Note that temporary vacancy would be
highly unlikely to give a meter read of zero due to plug loads like refrigerators.
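The hourly aggregation and zero-hour filter can be sketched in a few lines of Python. This is an illustration under simple assumptions rather than the processing code used in this work: records are assumed to be (timestamp, kWh) pairs for one household, and the function name and sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_to_hourly(interval_records):
    """Sum 15-minute kWh readings into hourly totals and drop zero-consumption hours."""
    hourly = defaultdict(float)
    for timestamp, kwh in interval_records:
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        hourly[hour_start] += kwh
    # Keep only hours with non-zero consumption, in chronological order.
    return {hour: total for hour, total in sorted(hourly.items()) if total != 0.0}


# Three hours of hypothetical 15-minute readings; the middle hour reads zero.
start = datetime(2015, 7, 1)
readings = [0.2, 0.3, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5]
records = [(start + timedelta(minutes=15 * k), kwh) for k, kwh in enumerate(readings)]

hourly = aggregate_to_hourly(records)
```

The zero-reading hour is absent from the result, while the other two hours carry the sum of their four interval readings.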
Next, we match each individual household to weather stations within a 20-mile radius,
using data from 102 weather stations within three different land-based weather station systems
[113], [114], [194]. For each household and each hour, the temperature of the nearest weather
station that has a temperature reading in that hour is assigned to the household. If no weather
station within 20 miles has a temperature reading, the hour in question is removed from the
household’s data due to an inability to establish an electricity-temperature relationship. Next, to
eliminate hours with extreme levels of electricity consumption, we bin electricity data into ten
temperature quantiles and remove hours for which consumption exceeds 5 standard deviations
above the mean within said quantile. This eliminates hours with highly irregular electricity
consumption, caused by an unexpected load, that would distort the relationship between
electricity demand and ambient temperature. After performing this hourly filtering, we drop any
homes for which fewer than 4,380 hourly records remain (one half of a year) to ensure sufficient
data to perform the AC Ownership and State Algorithms. At the end of this outlier removal
process, we retain ~160,000 households from 2,439 census tracts and 4 counties across Southern
California Edison’s service territory.
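The station matching and the quantile-based outlier filter can be sketched as follows. The sketch is illustrative only: the data layout, function names, use of equal-count temperature bins, and use of the sample standard deviation are assumptions, since the text above does not specify them.

```python
from statistics import mean, stdev


def assign_temperature(stations, hour, max_miles=20.0):
    """Return the reading from the nearest station within range that reports this hour.

    stations: list of (distance_miles, {hour: temperature_f}) for one household.
    Returns None when no station within max_miles has a reading.
    """
    for distance, readings in sorted(stations, key=lambda s: s[0]):
        if distance <= max_miles and hour in readings:
            return readings[hour]
    return None


def drop_extreme_hours(hours, n_bins=10, n_sigma=5.0):
    """Drop hours whose load exceeds mean + n_sigma * std within their temperature bin.

    hours: list of (temperature_f, kwh). Bins are equal-count temperature quantiles.
    """
    ordered = sorted(hours, key=lambda h: h[0])
    n = len(ordered)
    kept = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if len(chunk) < 2:
            kept.extend(chunk)
            continue
        loads = [kwh for _, kwh in chunk]
        cutoff = mean(loads) + n_sigma * stdev(loads)
        kept.extend(h for h in chunk if h[1] <= cutoff)
    return kept


# Hypothetical stations: the third lies beyond the 20-mile radius.
stations = [(3.0, {0: 71.0}), (12.0, {0: 70.0, 1: 72.0}), (25.0, {0: 69.0, 2: 74.0})]
temp_hour0 = assign_temperature(stations, 0)
temp_hour1 = assign_temperature(stations, 1)
temp_hour2 = assign_temperature(stations, 2)

# 1,000 hypothetical hours with one extreme reading of 50 kWh.
hours = [(50 + 0.05 * i, 1.0 + 0.1 * (i % 5)) for i in range(1000)]
hours[500] = (75.0, 50.0)
kept = drop_extreme_hours(hours)
```

Hour 2 is unmatched because the only station reporting it is more than 20 miles away, and the single 50 kWh reading is the only hour removed by the filter.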
4.3.2 AC Ownership Algorithm and computation of AC Penetration Rate (Step 1 in Figure 4-1)
We determine whether each household in the filtered data set has an AC unit and/or
electric heater by examining the relationship between hourly electricity consumption and hourly
outdoor temperature. For households with an AC, we expect that there is a positive correlation
between hourly electricity consumption and hourly temperature above a certain temperature
threshold, but this relationship is dependent on the operational status of the AC in a specific
hour. For example, a household with an AC may turn it off when the home is unoccupied, so high-temperature hours will only display temperature dependence within the subset of hours for which
the AC unit was running. Similarly, homes with electric heating should display temperature
dependence at temperatures below a specific temperature threshold, but only for the fraction of
hours for which the electric heater was in use. Given that electricity consumption depends on
many loads that are not related to temperature (e.g., cooking, entertainment, household chores)
and will therefore depend on individual user behavior, we must account for this temperature-independent electricity consumption before analyzing the temperature-dependent AC and electric
heating loads. We do this by subtracting an estimate of the typical temperature-independent load
for each hour (that is, the expected electricity consumption of loads other than AC and electric heating).
For each home, we group the data by hour of the day and day type (weekend vs weekday) and
find the 25th percentile of electricity consumption for each group. We then subtract the
corresponding 25th percentile value from each hourly electricity record. We use the 25th
percentile, rather than the 50th percentile, as an estimate of the temperature-independent
consumption to account for the fact that some hours will feature a significant amount of AC
and/or electric heating use that skews the distribution.
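The baseline adjustment can be sketched as follows. This is an illustration rather than the code used in this work: the record layout, function names, and linear-interpolation percentile are assumptions, and the five sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def subtract_baseline(records, q=25.0):
    """Subtract the q-th percentile load of each (hour of day, day type) group.

    records: list of (timestamp, kwh) hourly readings for one household.
    """
    groups = defaultdict(list)
    for ts, kwh in records:
        groups[(ts.hour, ts.weekday() >= 5)].append(kwh)
    baseline = {key: percentile(vals, q) for key, vals in groups.items()}
    return [(ts, kwh - baseline[(ts.hour, ts.weekday() >= 5)]) for ts, kwh in records]


# Five weekday readings at 15:00 (6-10 July 2015); two days show extra load.
loads = [1.0, 1.0, 1.2, 3.0, 3.4]
records = [(datetime(2015, 7, day, 15), kwh) for day, kwh in zip(range(6, 11), loads)]
adjusted = [kwh for _, kwh in subtract_baseline(records)]
```

In this toy group the 25th percentile is 1.0 kWh, so the three ordinary days drop to roughly zero while the two high-load days retain about 2.0 and 2.4 kWh of excess consumption. The median (1.2 kWh) would have removed slightly more of that excess.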
We then fit each user’s adjusted data to four models that relate electricity consumption to
temperature with each model representing an AC and electric heating technology combination.
Examples of homes that demonstrate good fits for each of the models described below are shown in Figure
4-2. We refer to temperatures at which heating or cooling behaviors may change as stationary point
temperatures (SPT). For example, the cooling SPT is the temperature above which there is a
possibility of AC use.
• Model 1: All data is fit to one horizontal line, implying that electricity consumption is
independent of temperature. This model corresponds to no electric heating or AC (top left
quadrant).
• Model 2: A portion of the data is fit to one line representing the temperature-independent
portion and, at temperatures above a cooling SPT, a portion is fit to an additional line
representing the hours that demonstrated a temperature-dependent load due to AC usage.
This model corresponds to a residence with AC but no electric heating (top right quadrant).
• Model 3: A portion of the data is fit to one line representing the temperature-independent
portion of the load, and, at temperatures below a heating SPT, a portion is fit to an
additional line representing the hours that demonstrated a temperature-dependent load
due to electric heating usage. This model corresponds to a residence with electric heating
but no AC (bottom left quadrant).
• Model 4: A portion of the data is fit to one line representing the temperature-independent
portion of the load and the remaining data is fit to one of two temperature-dependent
lines, with one line for temperatures below the heating SPT and one for temperatures
above the cooling SPT. This model corresponds to a residence with electric heating and
AC (bottom right quadrant).
Figure 4-2. Hourly electricity consumption versus hourly ambient temperature for four example homes in Southern
California over the two-year period. Each plot depicts one of the four AC and electric heating (EH) technology
combinations: Model 1) no AC or EH, Model 2) AC no EH, Model 3) EH no AC, and Model 4) AC and EH.
For all the above models, every individual datapoint (i.e., hour) is fit to exactly one line.
For Model 1, there is a single temperature-independent line that has a constant y-value equal to
the mean electricity consumption of all points. However, for Models 2-4, we determine in which
hours the electricity demand exhibits temperature dependence and the lines of best fit for
temperature dependent and temperature-independent hours using a version of the expectation-
maximization (EM) algorithm. The EM algorithm involves iteratively classifying datapoints to
groups and then fitting models of those groups until a condition is met.
In Model 2, we assume that for temperatures above the cooling SPT there is a possibility
that the AC will be running and therefore that these hours can demonstrate temperature
dependent or temperature-independent electricity consumption. To begin the EM algorithm, we
first assume that all hours with temperature above the 70th percentile and electricity consumption
above the 70th percentile of this subset of hours are temperature dependent, and all other hours
are temperature independent (though the results of this algorithm were not noticeably sensitive to
different initial seedings). The temperature-independent line is then defined by the mean
electricity consumption for all points assigned to it, and the slope of the temperature-dependent
line is calculated via a non-negative linear regression of electricity consumption on temperature
for all hours assigned to it. All hours above the cooling SPT are then reassigned to the two lines
depending on error minimization, and the models are refit with the newly assigned hours. This
process continues iteratively until fewer than 1% of eligible points switch line assignment or
until fewer than 1% of the total hours are assigned to the temperature-dependent line. We test
potential cooling SPTs of integers ranging from 60 to 100°F to cover a large range of potential
cooling preferences and select the SPT that minimizes the total error. Model 3 proceeds
identically to Model 2, but the classification of points occurs at temperatures below the heating
SPT, and the search space for the heating SPT ranges from 40 to 70°F.
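The iterative procedure for Model 2 can be sketched in Python for a single candidate cooling SPT. This is a simplified illustration, not the implementation used in this work: the function names and synthetic data are hypothetical, the seed is restricted to hours above the candidate SPT, and "non-negative" regression is interpreted here as clamping the fitted slope at zero. The outer search over candidate SPTs would call this routine once per candidate.

```python
def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def fit_lines(temps, loads, dep):
    """Mean of independent hours, plus a slope >= 0 line through dependent hours."""
    ind = [e for e, d in zip(loads, dep) if not d]
    base = sum(ind) / len(ind) if ind else sum(loads) / len(loads)
    ts = [t for t, d in zip(temps, dep) if d]
    es = [e for e, d in zip(loads, dep) if d]
    mt, me = sum(ts) / len(ts), sum(es) / len(es)
    sxx = sum((t - mt) ** 2 for t in ts)
    sxy = sum((t - mt) * (e - me) for t, e in zip(ts, es))
    slope = max(0.0, sxy / sxx) if sxx > 0 else 0.0
    return base, slope, me - slope * mt


def em_cooling(temps, loads, spt, max_iter=100):
    """Classify hours above spt as temperature dependent or independent."""
    n = len(temps)
    eligible = [t > spt for t in temps]
    # Seed: hot hours (above the 70th percentile temperature) whose load is
    # above the 70th percentile load of those hot hours.
    t_cut = percentile(temps, 70)
    hot = [e for t, e in zip(temps, loads) if t > t_cut]
    e_cut = percentile(hot, 70) if hot else float("inf")
    dep = [ok and t > t_cut and e > e_cut for ok, t, e in zip(eligible, temps, loads)]
    for _ in range(max_iter):
        if sum(dep) < max(2, 0.01 * n):  # stop: under 1% of hours are dependent
            break
        base, slope, icpt = fit_lines(temps, loads, dep)
        # Reassign each eligible hour to the line with the smaller squared error.
        new_dep = [ok and (e - (slope * t + icpt)) ** 2 < (e - base) ** 2
                   for ok, t, e in zip(eligible, temps, loads)]
        switched = sum(a != b for a, b in zip(dep, new_dep))
        dep = new_dep
        if switched < 0.01 * sum(eligible):  # stop: under 1% switched lines
            break
    if sum(dep) < 2:  # no usable temperature-dependent line
        return [False] * n, sum(loads) / n, 0.0
    base, slope, _ = fit_lines(temps, loads, dep)
    return dep, base, slope


# Synthetic household: AC runs above 75°F in alternating 50-hour blocks.
temps, loads, ac_on = [], [], []
for i in range(500):
    t = 50 + i % 50
    on = t > 75 and (i // 50) % 2 == 0
    wiggle = 0.05 * ((i * 7) % 5 - 2)  # small deterministic variation
    temps.append(float(t))
    loads.append(wiggle + (1.0 + 0.1 * (t - 75) if on else 0.0))
    ac_on.append(on)

dep, base, slope = em_cooling(temps, loads, spt=70)
```

On this synthetic household the loop recovers the hours in which the AC was set to run and a cooling slope close to the 0.1 kWh/°F used to generate the data.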
For Model 4, a household’s data is split into two portions based on the midpoint of the
heating and cooling SPTs found by Models 2 and 3, and then the algorithm described above is
repeated for each portion of data with the midpoint serving as the lowest possible cooling SPT
and the highest possible heating SPT. The temperature-independent line is again set as the mean
of all points not assigned to the temperature-dependent lines, regardless of temperature.
For each of the four models, the model error is determined by the mean-squared error for the
lines of best fit multiplied by the number of lines fitted (one line for Model 1, two for Models 2
and 3, and three for Model 4). The multiplier on the mean-squared error penalizes more complex
models that would otherwise generally have lower error (similar to error terms used in
information criterion analysis [243]). For each home, we select the model that minimizes this
custom error function. Homes for which Models 2 or 4 were selected are considered to have AC,
and homes for which Models 3 or 4 were selected are considered to have electric heating. This
algorithm is designed to capture the general relationship between electricity consumption and
temperature of specific households, which gives insight to their space conditioning technologies,
and not to minimize model error or most-accurately describe their heating or cooling demand.
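The penalized model selection can be illustrated with a short sketch. The function names and the mean-squared-error values are hypothetical; the line counts and the mapping from selected model to technologies follow the description above.

```python
# Number of fitted lines for each model, as described in the text.
LINES_PER_MODEL = {1: 1, 2: 2, 3: 2, 4: 3}


def penalized_error(mse, model):
    """Mean-squared error multiplied by the number of lines in the model."""
    return mse * LINES_PER_MODEL[model]


def select_technologies(mse_by_model):
    """Return (selected model, has AC, has electric heating) for one home."""
    best = min(mse_by_model, key=lambda m: penalized_error(mse_by_model[m], m))
    return best, best in (2, 4), best in (3, 4)


# Hypothetical mean-squared errors for one home under the four models.
mse_by_model = {1: 0.90, 2: 0.30, 3: 0.85, 4: 0.28}
model, has_ac, has_electric_heat = select_technologies(mse_by_model)
```

In this example Model 4 has the lowest raw error (0.28), but after the multiplier its penalized error (0.84) exceeds that of Model 2 (0.60), so the home is classified as having AC and no electric heating.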
To characterize AC ownership across our region, we match each household to a census
tract and a California building climate zone using shapefiles from the US Census Bureau [200]
and California Energy Commission [201]. For each respective census tract and climate zone, we
calculate the AC Penetration Rate by dividing the number of homes in the area identified as
having AC by the total number of homes in the region present in our dataset. We compare the
results of this methodology to the results found by Chen et al. [24] at the census tract level for
Southern California and our aggregated results to survey data collected in the region. Following
the filtering steps, the remaining records are statistically representative of 1,534 census tracts.
4.3.3 AC State Algorithm and computation of AC Operation Rate (Step 2 in Figure 4-1)
For the subset of households designated as having AC, we proceed with a more fine-tuned algorithm to determine the cooling SPT and the hours during which the AC is on. While
the AC Ownership Algorithm (described in section 4.3.2) aimed to establish general electricity-
temperature relationships for the purpose of identifying the presence of electric heating and
cooling technologies, here we use the AC State Algorithm, adapted from Dyson et al., to
establish specific cooling behaviors and parameters [25]. This includes the SPT (i.e., the
temperature at which people begin to turn their AC on) and a more precise prediction of which
hours feature AC activity. The AC State Algorithm classifies every hour as being “AC on” or
“AC off” even though during an “AC on” hour the AC may not be running continuously
throughout the hour.
In this method, each home is fit to a multiple-linear regression model that regresses
electricity consumption on temperature and dummy variables that represent day type (weekend
vs weekday) and hour of day interactions. The temperature dependent portion of the model is
again conditional on the state of the heating and cooling technologies and is only defined for
specific temperature ranges.
$$E_h = \mathbf{D}_{h,w} + H_h \left( \beta_1 \left( SPT_H - T_h \right) + i \right) + C_h \left( \beta_2 \left( T_h - SPT_C \right) + j \right) \qquad [1]$$
In Equation 1, the electricity consumption during hour h (𝐸ℎ) is determined by the AC
state (𝐶ℎ), the electric heating state (𝐻ℎ), and a vector of dummy variables (Dh,w) that specify the
fixed impact of hour of the day and day type (w, weekday vs weekend) combination. If the AC is
classified as on during hour h, the electricity consumption depends on the electricity-temperature sensitivity for cooling (𝛽2) multiplied by the difference between the temperature and
the cooling SPT (𝑆𝑃𝑇𝐶), and an AC intercept (j). Similarly, at temperatures below the heating
SPT (𝑆𝑃𝑇𝐻) we assume that electric heating could be on, and that electricity consumption therefore
depends on the electricity-temperature sensitivity for heating (𝛽1) and a separate heating intercept
(𝑖). For homes that were classified as having AC but no electric heat, the heating state of all
hours was set to zero. The heating and cooling intercepts can be interpreted as the minimum
additional electricity consumed when the AC or electric heating is on, and the electricity-temperature sensitivities can be interpreted as the increase in electricity consumption that occurs
as outdoor temperature increases when the AC is on, or as temperature decreases when the
electric heating is on.
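To make the structure of Equation 1 concrete, the sketch below evaluates it for a single hour. The parameter values are invented for illustration, the units (kWh and °F) are an assumption, and the function name is hypothetical; the fitted coefficients in the study are household specific.

```python
def predicted_consumption(temp_f, fixed_effect, ac_on, heat_on,
                          beta1, beta2, spt_heat, spt_cool,
                          heat_intercept, ac_intercept):
    """Evaluate Equation 1 for one hour.

    fixed_effect is the D_{h,w} term for this hour-of-day/day-type pair;
    ac_on and heat_on are the 0/1 states C_h and H_h;
    heat_intercept and ac_intercept are i and j.
    """
    heating = heat_on * (beta1 * (spt_heat - temp_f) + heat_intercept)
    cooling = ac_on * (beta2 * (temp_f - spt_cool) + ac_intercept)
    return fixed_effect + heating + cooling


# Illustrative hour: 91 F outdoors, cooling SPT of 81 F, AC on, no heating.
e_h = predicted_consumption(temp_f=91, fixed_effect=0.6, ac_on=1, heat_on=0,
                            beta1=0.03, beta2=0.05, spt_heat=60, spt_cool=81,
                            heat_intercept=0.2, ac_intercept=0.4)
print(round(e_h, 3))  # 0.6 + (0.05 * 10 + 0.4) = 1.5
```

Setting `ac_on=0` for the same hour returns only the fixed effect (0.6), which is the sense in which the AC intercept represents the minimum additional consumption of an "AC on" hour.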
For each hour, we again determine the AC state (on/off) and electric heating state (on/off)
via the EM algorithm that was used in the AC Ownership Algorithm; we iteratively find lines of
best fit for each state, and then reassign hours based on minimizing the prediction error. We use
the same initial seeding from section 3.2 for AC on hours, and again terminate this algorithm
when fewer than 1% of eligible points switch AC state or when fewer than 1% of total points are
classified as AC on. Figure 3 illustrates the results of this model for an example home with a
SPT of 81°F. We show results for four hours of the day during the week with the hourly AC and
non-AC points indicated. Note that across all hours the AC intercept and temperature sensitivity
are constant, which assumes that AC consumption is linearly dependent on changes in
temperature regardless of time of day or current temperature (provided the temperature is above
the SPT). With hour of the day included as a variable in the regression, an 8am datapoint may be
classified as AC on despite having a lower electricity consumption than a 4pm datapoint that is
classified as AC off because the 8am datapoint represents unusually high electricity consumption
for that time of day and day type.
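The iterative fit-and-reassign loop can be sketched as follows. This is a deliberately simplified illustration and not the implementation used in this study: it handles a single hour-of-day and day-type bin (so the fixed effect reduces to one baseline), ignores the heating state, and uses an assumed seeding (eligible hours above the median eligible load start as "AC on") in place of the seeding from section 3.2. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def classify_ac_hours(temps, loads, spt, max_iter=50):
    """Hard-assignment EM-style loop; returns True for each "AC on" hour."""
    n = len(temps)
    eligible = [t > spt for t in temps]  # only hours above the SPT can be on
    n_eligible = sum(eligible)
    if n_eligible == 0:
        return [False] * n
    # Assumed seeding: eligible hours above the median eligible load.
    elig_loads = sorted(ld for ld, e in zip(loads, eligible) if e)
    median = elig_loads[len(elig_loads) // 2]
    on = [e and ld > median for ld, e in zip(loads, eligible)]

    for _ in range(max_iter):
        if sum(on) < 0.01 * n:
            break  # fewer than 1% of all points are classified as AC on
        off_loads = [ld for ld, s in zip(loads, on) if not s]
        if not off_loads:
            break
        # Fit step: baseline for "off" hours, line in temperature for "on".
        baseline = sum(off_loads) / len(off_loads)
        slope, intercept = fit_line([t for t, s in zip(temps, on) if s],
                                    [ld for ld, s in zip(loads, on) if s])
        # Reassignment step: pick the state with the smaller squared error.
        new_on = []
        for t, ld, e in zip(temps, loads, eligible):
            err_off = (ld - baseline) ** 2
            err_on = (ld - (slope * t + intercept)) ** 2
            new_on.append(e and err_on < err_off)
        switched = sum(a != b for a, b in zip(on, new_on))
        on = new_on
        if switched < 0.01 * n_eligible:
            break  # fewer than 1% of eligible points changed state
    return on


# Synthetic data: baseline near 0.5, "on" hours follow 0.9 + 0.05 * (T - 80).
temps = [60, 65, 70, 75, 82, 84, 86, 88, 90, 92, 94, 96]
loads = [0.50, 0.52, 0.48, 0.50, 0.51, 1.10, 0.49, 1.30, 1.40, 0.50, 1.60, 1.70]
states = classify_ac_hours(temps, loads, spt=80)
print(states)
```

In this synthetic example two of the true "AC on" hours (84°F and 88°F) are seeded as off and are recovered on the first reassignment, after which no points switch and the loop stops.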
Figure 4-3. Top: Scatterplot of hourly electricity consumption and temperature for the two-year period for one
household. Bottom: Results of the AC State Algorithm for four different weekday hours over the two-year period.
For this home, we determined a cooling SPT of 81°F, hence only hours with temperature above 81°F can be
classified as AC on.
To estimate the cooling SPT, we look for a temperature that both reduces error and
increases the probability of correctly classifying the state of the AC. A lower cooling SPT
generally reduces the error term because more of the data is fit to two lines instead of one.
Conversely, higher cooling SPTs generally lead to more confident predictions of the AC state,
since at higher temperatures there is typically a higher fraction of hours classified as AC on and a
larger magnitude difference between the electricity consumption of an AC-on versus AC-off
hour. We balance these two objectives through an error term that combines the prediction
likelihood and the probability of AC being on, which is defined as the fraction of hours with
temperature above the SPT that are classified as AC hours. We test each cooling SPT between 60
and 100°F and find one value for each household that minimizes the error term. More discussion
of the SPT selection method can be found in [25]. We used a fixed heating SPT of 60°F because
we are not interested in identifying specific electric heating behaviors. We note that it is
necessary to include this temperature dependence below 60°F to avoid the errors at low
temperature hours dominating the total error and therefore skewing the cooling SPT selection
process. With the optimal cooling SPT selected, we can make a final classification of the hours
during which a household’s AC is on.
We then determine a household’s AC Operation Rate, which is the fraction of hours that
a home has its AC on (the number of hours classified as AC on divided by the total hours).
Recall that an “AC on” hour is an hour that demonstrates clear temperature dependence, and the
classification does not capture the number or length of AC cycles that occur during the hour.
Following the same method of matching households to census tracts and climate zones as was
used for AC Penetration Rate, we also find the average AC Operation Rate for a region by taking
the mean of the AC Operation Rate for each household in the region. The household and average
AC Operation Rates can be calculated for the entire study period or a subset of time (e.g., the AC
Operation Rate in all hour 12s).
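The AC Operation Rate calculations described above can be sketched as follows, assuming the hourly AC states have already been produced by the AC State Algorithm. The function names and the example data are hypothetical.

```python
def ac_operation_rate(ac_on_by_hour, hour_filter=None):
    """Fraction of hours classified "AC on" for one household.

    ac_on_by_hour: list of (hour_of_day, is_on) pairs, one per observed hour.
    hour_filter: optional hour of day (0-23) to restrict the calculation to.
    """
    states = [on for hour, on in ac_on_by_hour
              if hour_filter is None or hour == hour_filter]
    return sum(states) / len(states) if states else 0.0


def regional_operation_rate(household_rates):
    """Mean of the household-level rates for the AC households in a region."""
    return sum(household_rates) / len(household_rates)


# Four observed hours for one hypothetical AC household.
home_a = [(11, False), (12, True), (13, False), (12, True)]
rate_all = ac_operation_rate(home_a)                    # all hours
rate_noon = ac_operation_rate(home_a, hour_filter=12)   # hour 12 only
region_rate = regional_operation_rate([rate_all, 0.1])
print(rate_all, rate_noon, region_rate)
```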
4.3.4 Calculation of Net AC Utilization (Step 3 in Figure 1)
Finally, we use our regional estimates of AC Penetration Rate and AC Operation Rate to
calculate the Net AC Utilization for a region as shown in Equation 3.
$$\text{Net\_AC\_Utilization}_R = \text{AC\_Penetration\_Rate}_R \times \text{AC\_Operation\_Rate}_R \qquad [3]$$
Equation 3 defines the Net AC Utilization of a region R as the product of the region’s AC
Penetration and Operation Rates. We find Net AC Utilization for the entire study region, as well
as for each census tract and climate zone within the study region. Net AC Utilization is directly
proportional to the number of per-household cooling on hours in a region and better describes the
AC use in a region than estimates of AC ownership or state in isolation.
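A short worked example may help show why Net AC Utilization is proportional to per-household cooling hours. The rates and the number of hours below are hypothetical round numbers chosen for illustration, and the function names are not from the study.

```python
def net_ac_utilization(penetration_rate, operation_rate):
    """Equation 3: product of a region's two rates, both given as fractions."""
    return penetration_rate * operation_rate


def expected_ac_on_hours(penetration_rate, operation_rate, total_hours):
    """Expected "AC on" hours for an average household in the region,
    counting households without AC as having zero such hours."""
    return net_ac_utilization(penetration_rate, operation_rate) * total_hours


# Hypothetical region: 80% of homes have AC, and those homes have AC on
# during 5% of hours, over a period of 8,760 hours.
utilization = net_ac_utilization(0.80, 0.05)
hours = expected_ac_on_hours(0.80, 0.05, 8760)
print(round(utilization, 4), round(hours, 1))  # 0.04 and 350.4
```

Because the total number of hours is the same for every region, ranking regions by Net AC Utilization is equivalent to ranking them by expected per-household "AC on" hours.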
4.4. Results and Discussion
4.4.1 Comparison of AC Penetration Rates with other studies
Across the entire study region, AC was detected in 79% of households. In the California
Residential Appliance Saturation Survey (RASS), 75% and 86% of customers surveyed in SCE
territory reported having central or room AC in the years 2009 and 2019, respectively. Our
region-wide estimate is in alignment with the survey results considering that the smart meter
records analyzed in this study (2015-2016) fell in between the survey years [178], [208].
This study’s estimate of the region’s overall AC Penetration Rate is significantly higher
than the value found in Chen et al. (69%) [24]. There are multiple explanations that account for
the difference. First, the methodology used in this study to classify AC households does not
make assumptions regarding the electric heating status, and thus, is more likely to correctly
identify homes that have and use both electric heating and cooling systems. Second, we expect
this methodology to better capture households that use their AC infrequently and/or have other
electric loads that contribute significantly to total demand, diluting the electricity temperature
signature at the daily level. This theory is in part validated by the breakdown of central versus
room AC units reported in the RASS (58%/18% in 2009 and 68%/18% in 2019), indicating that
the daily methodology utilized by Chen et al. might have accurately captured the central
air conditioners but failed to identify the room air conditioners with smaller loads.
To gain an understanding of spatial differences in the results from this study and Chen et
al. [24], the difference in AC Penetration Rate estimates in each census tract was plotted on a
choropleth map shown in Figure 4. The areas in blue were estimated to have higher AC
Penetration Rates when the proposed hourly method was used in place of the previous daily
method, while areas shown in red were estimated to have lower AC Penetration Rates. This
study’s method of detecting AC finds a higher penetration in the majority of census tracts across
the region. We note that the households studied in this analysis were not evenly distributed
across census tracts, and thus some of the census tracts that show large differences between
methods in Figure 4 are the result of having a small number of homes in that specific census
tract. (Census tracts that are not statistically represented are indicated in Figure 4 with cross
hatching.)
Through this methodology, we also estimated that 25% of households have electric
heating. In the 2009 RASS, only 4% and 1% of customers reported an electric heater as their
primary and auxiliary space heating appliance, respectively. These percentages
increased to 17% and 6% in the 2019 RASS. Since our smart meter records
fall in between the survey years, the methodology used in this study likely overestimates the
portion of electric heaters present in Southern California. One explanation for the discrepancy in
values is that households may be supplementing their natural gas heating with electric room
space heaters that were not surveyed in RASS. Elmallah et al. detected electric heating in 27% of
homes, which was higher than the value reported for RASS in some of the climate zones in
their dataset; similarly, the authors pointed to the use of room space heaters as an explanation
[230]. It is important to note that Southern California has more CDDs than HDDs [244], and
demand for space conditioning is driven by cooling needs rather than heating needs.
Figure 4-4. Choropleth maps depicting the difference between AC Penetration Rates at the census tract level
estimated with the proposed method in this study and the method developed by Chen et al. The difference is found
by subtracting the AC Penetration Rates computed with the new, hourly method from the AC Penetration Rates
computed using the daily method. Generally, the AC Penetration Rate computed with the new method is higher
(blue) than when the previous method was used.
4.4.2 Tracking temporal patterns of AC Operation Rate
One of the major advantages of the method described in this study is the ability to track
patterns of AC operation. While AC Penetration Rate is an important metric to characterize who
has access to AC in a community and inform where the power grid may experience spikes in
demand, information about how people use their AC is also necessary to quantify cooling
demand. Here, we analyze variations in customer AC Operation Rate, including how often and
when their AC unit is on, and explore how these behaviors create regional differences in cooling
behavior.
The results of this study found that customers with AC have an average AC Operation
Rate of 5.4% during the two-year study period. The bar chart shown in Figure 5 depicts how AC
Operation Rates vary across and within the climate zones in SCE’s territory. In the cooler,
coastal climate zones (e.g., climate zones 6 and 8) the AC Operation Rate across the full study
period is generally lower than for customers in the hot, desert climate zones (e.g., climate zones
14 and 15). For example, in climate zone 15 which is characterized by a hot, desert climate, 27%
of customers have an AC Operation Rate over 15%, compared to 1% of customers in the coastal
climate zone 6.
Figure 4-5. Stacked bar chart showing the breakdown of AC Operation Rates over the study period for each climate
zone. Each bin represents an AC Operation Rate range, with darker shades of blue indicating a higher AC Operation
Rate (e.g., AC is classified on for more hours). The summer mean temperature for each climate zone is shown to the
right of each bar.
In addition to knowing how often utility customers use their AC, we can capture the
timing of when customers use their AC and how that varies across the region. The heat maps in
Figure 6 show the average AC Operation Rate in each hour and month combination for each of
the study region’s seven climate zones. Across all climate zones, the AC Operation Rate is
higher in the afternoon and early evening, as well as in the hot, summer months. In climate zones
that experience relatively cool temperatures (e.g., climate zones 6 and 8) the range of hours and
months with notable AC Operation Rates is smaller, and the AC Operation Rate itself is, in those
time periods, generally lower than in the hotter, desert climate zones, such as 14 and 15.
Figure 4-6. Heat map depicting the average AC Operation Rate of each hour of the day and month of the year combination for
a-g) each climate zone and h) full study region. The AC Operation Rate is averaged across all pertinent customers
that were identified as having AC. The summer mean temperature for each climate zone is shown above each
subplot.
4.4.3 Spatial trends in AC Penetration Rate, AC Operation Rate, and Net AC Utilization
To observe how trends in cooling behavior vary across the study region, study results
were aggregated to the census tract level. In Figure 7, panel a) depicts AC Penetration Rates,
lending insight into which areas have higher rates of AC ownership. Panel b) displays AC
Operation Rates, which measure how often the average customer in each census tract uses their
AC during the study period. In general, the cooler, coastal and mountainous regions have lower
rates of ownership and use their AC less frequently than the hotter, inland and desert regions
(also shown in section 4.2).
While AC Penetration Rates and AC Operation Rates separately provide important
information about the cooling demand of a community, we can better estimate the locations that
are likely to have high cooling demand by combining these factors into one metric. Thus, Net
AC Utilization, which accounts for both the percentage of households in a specified area that
have AC and how often those customers have their AC on, was computed for the entire study
region by census tract, with the results shown in Figure 7, panel c). The Net AC Utilization of a
census tract is directly proportional to the expected number of hours of AC use that an average
household selected from our data in that census tract would have, and thus, is useful for
evaluating local cooling need and the location of demand surges during extreme heat events. In
Figure 7, we report Net AC Utilization by decile because there is not a clear physical meaning of
the metric as a percentage value (in contrast to AC Penetration and AC Operation Rates).
In general, the regional patterns are consistent across each of the panels shown in Figure
7, meaning areas with higher AC Penetration Rates also have higher AC Operation Rates and
Net AC Utilization. Although the regional trends are consistent, there are still census tracts
where the AC Penetration Rate is relatively high, but the AC Operation Rate is relatively low
(and vice versa), which demonstrates the limitation of relying on AC Penetration Rates alone
when evaluating cooling demand.
Figure 4-7. Choropleth maps depicting the a) AC Penetration Rate, b) AC Operation Rate, and c) Net AC
Utilization decile computed at the census tract level.
If we compare the AC Penetration Rates and Net AC Utilization, we can see how
incorporating the AC Operation Rates impacts our evaluation of cooling demand. Table 4-1
provides the percentage of census tracts at each quantile of AC Penetration Rate that fall into
each quantile of Net AC Utilization. For example, census tracts in the 20-40% quantile
of AC Penetration Rate are more likely to fall into a lower quantile of Net AC Utilization (32%)
than to remain in the 20-40% quantile (25%). This could be explained by the fact that these census
tracts experience cool enough temperatures that they rarely need to use their AC, or that they are
lower income census tracts within that quantile that are more conscious of their electricity
consumption.
A second interesting insight is that while most census tracts with high AC Penetration
Rates also have high AC Operation Rates, roughly 20% of census tracts in the top quantile of AC
Penetration Rate shift into the bottom two Net AC Utilization quantiles. This could be explained
by rich census tracts that own ACs despite living in relatively cooler climates, thus not requiring
cooling often, or poor census tracts in hot regions where households forgo cooling to lower
electricity costs despite high temperatures. The results of this section suggest that AC Penetration
and AC Operation Rates are not always tightly correlated and warrant further analysis of what
factors cause diverging results in some regions, as those populations may either be underserved
or consume a disproportionate amount of electricity, making them a target for grid flexibility
efforts.
Table 4-1. Transition matrix summarizing AC Penetration Rates and Net AC Utilization percentile ranks of the
census tracts in the study region, where 0-20% indicates the lowest and 80-100% indicates the highest AC
Penetration Rate/Net AC Utilization quantile. Each value represents the percent of census tracts that originally fell in
each AC Penetration Rate quantile (denoted by row) that shift into the specified Net AC Utilization quantile
(denoted by column), effectively showing the impact that including AC Operation Rates has on the cooling demand
evaluation.
AC Penetration Rate      Net AC Utilization Percentiles (columns)
Percentiles (rows)       0-20%    20-40%   40-60%   60-80%   80-100%
0-20%                    58%      24%      13%      4%       1%
20-40%                   32%      25%      21%      15%      7%
40-60%                   16%      23%      22%      22%      17%
60-80%                   10%      19%      22%      25%      25%
80-100%                  6%       13%      19%      25%      36%
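A transition matrix of this kind can be built by assigning each census tract a quintile for both metrics and counting the pairs. The sketch below is one possible way to do this, assuming rank-based quintiles with ties broken by input order; the tract values are invented and the function names are hypothetical.

```python
def quintile(values):
    """Map each value to a quintile index 0-4 based on its rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0] * len(values)
    for rank, k in enumerate(order):
        ranks[k] = min(4, rank * 5 // len(values))
    return ranks


def transition_matrix(penetration, net_utilization):
    """Row r, column c: percent of tracts in AC Penetration Rate quintile r
    that fall in Net AC Utilization quintile c."""
    rows = quintile(penetration)
    cols = quintile(net_utilization)
    counts = [[0] * 5 for _ in range(5)]
    for r, c in zip(rows, cols):
        counts[r][c] += 1
    return [[100 * x / sum(row) if sum(row) else 0.0 for x in row]
            for row in counts]


# Ten hypothetical census tracts (values invented for illustration).
penetration = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
net_util = [0.01, 0.05, 0.02, 0.03, 0.04, 0.06, 0.07, 0.08, 0.09, 0.10]
matrix = transition_matrix(penetration, net_util)
for row in matrix:
    print(row)
```

Each non-empty row sums to 100, so a row can be read as the distribution of Net AC Utilization quintiles among tracts that share an AC Penetration Rate quintile.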
4.4.4 Net AC Utilization considering climate
While Net AC Utilization provides a useful metric of existing cooling demand in a
region, we are also interested in the relationship between a household’s theoretical need for
cooling and their actual AC behaviors. We approximate a single household’s theoretical cooling
need by aggregating their hourly temperatures to the daily level and calculating their annual
CDDs. In Figure 8, we plot the mean household Net AC Utilization against the mean household
CDDs at the census tract level.
We see that for a given number of CDDs, there is a large variety in the degree of Net AC
Utilization across census tracts. This is of particular note for census tracts with a high number of
CDDs, and thus a high theoretical cooling need, but a low Net AC Utilization. For example,
there are 41 census tracts that rank above the 80th percentile of CDDs but fall below the 50th
percentile of Net AC Utilization. These census tracts may be experiencing energy insecurity due
to poor access to AC or lack the financial resources needed to use the AC that they do have
(although there are confounding factors unrelated to energy insecurity, such as AC efficiency and
a building’s thermal properties, that can influence AC use). Additional analysis is needed to
determine if these census tracts are particularly vulnerable to extreme heat. Lastly, a small
number of census tracts display high Net AC Utilization despite relatively low theoretical
cooling need, which may represent an opportunity for targeted demand response programs.
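The CDD aggregation can be sketched as below. The base temperature of 65°F and the use of the daily mean of hourly temperatures are assumptions made for this illustration, since neither is specified in this section; the function name and the sample temperatures are hypothetical.

```python
from collections import defaultdict

BASE_TEMP_F = 65.0  # assumed base temperature, not stated in the text


def cooling_degree_days(hourly_temps, base=BASE_TEMP_F):
    """hourly_temps: iterable of (day_key, temp_f) pairs.

    Averages the hourly temperatures within each day, then sums the
    positive excess of each daily mean over the base temperature.
    """
    by_day = defaultdict(list)
    for day, temp in hourly_temps:
        by_day[day].append(temp)
    return sum(max(0.0, sum(t) / len(t) - base) for t in by_day.values())


# Two illustrative days: a hot day (mean 80 F) and a cool day (mean 62 F).
sample = [("day1", 70), ("day1", 80), ("day1", 90),
          ("day2", 60), ("day2", 64)]
cdd = cooling_degree_days(sample)
print(cdd)  # day1 contributes 15, day2 contributes 0
```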
Figure 4-8. Scatter plot of normalized Net AC Utilization versus CDD experienced during the study period
averaged by census tract. Census tracts that are not statistically represented by the households in our dataset are not
included.
4.5. Conclusion
In this three-part framework, we first developed a novel methodology for identifying the
presence of AC from household-level smart meter data and used the model to compute regional
AC Penetration Rates. Unlike previous methods, our novel model used hourly, rather than daily,
electricity consumption data and directly modeled electric heating, which was a confounding or
ignored variable in several previous studies. We believe our focus on hourly data allowed us to
better identify homes with a variety of AC types and with intermittent AC use and find that our
results align well with survey data from similar years in the same region. In the second part of
this study, we predicted the hourly AC state at the household level using the AC State Algorithm
and aggregated the results to observe trends in AC Operation Rates across spatial, temporal, and
climatic ranges. Finally, we combined AC Penetration and AC Operation Rates to calculate each
census tract’s Net AC Utilization and better characterize regional residential cooling behavior.
Unsurprisingly, we find higher AC Operation Rates in the middle of the day and
afternoon of summer months. We also find that some census tracts have surprisingly low Net AC
Utilization when compared to adjacent areas and when compared to the amount we would expect
for an area with significant climatic need for cooling. This phenomenon may be explained by the
demographic or economic traits of the census tract (which is beyond the bounds of this analysis).
Regardless of the cause, these areas would likely benefit from programs designed to increase AC
access and/or address energy insecurity. In future work, we plan on conducting a more rigorous
analysis of the factors that drive disparities in the cooling demand. For areas that already have
high AC Penetration Rates and AC Operation Rates, these census-level estimates increase our
understanding of where surges in demand are likely to occur during extreme heat events and high
temperatures, which is useful information for utilities and grid planners.
The authors would like to acknowledge several limitations of this study. First, there is no
ground truth data of AC ownership or operation with which to validate our results, thus we
cannot determine the accuracy of our algorithms that were used to determine the AC Penetration
and AC Operation Rates. Furthermore, comparisons between methods also cannot speak to
whether one method is more or less accurate for our dataset. Instead, we focus on comparing our
AC Penetration estimates with relevant survey data for the same region. While our dataset
contains nearly 160,000 homes after filtering, the large spatial extent of the data spreads these
homes across many census tracts and creates a large range in the number of homes per census
tract. As a result, the samples of homes in this dataset are only statistically representative for
~63% of the census tracts. We believe our general method of relating electricity consumption
and ambient temperature at the hourly level with models that represent distinct electric space
conditioning technologies and usage patterns can be extrapolated to other regions. However, in
other regions, the different climatic factors and relative frequencies of a variety of space heating
and cooling technologies may require modifications to the methodology presented here. Lastly,
in this study we discuss Net AC Utilization as a way to characterize AC behavior, but we
acknowledge that a more complete study of cooling demand would consider the magnitude of
AC electricity consumption, which is beyond the bounds of this analysis.
Chapter 5: Residential electricity demand on CAISO Flex Alert days: A case
study of voluntary emergency demand response programs
5.1 Introduction
In California, the state’s largest balancing authority, California Independent System
Operator or CAISO, utilizes a voluntary demand response (DR) tool known as Flex Alerts. When
a Flex Alert is issued, electricity consumers are asked to voluntarily reduce their electricity usage
for a specified period in time to reduce strain on the grid during periods when grid reliability is
threatened [26]. The Flex Alert program has become an important tool for managing California’s
electric grid, which has been challenged by an increasing frequency and intensity of extreme heat
events, in conjunction with a rapid increase in variable renewable energy, both of which pose
challenges for reliable grid operation.
Flex Alerts have helped avoid rolling blackouts by reducing total system load during
times of grid stress, particularly on hot summer afternoons when AC use is high [26], [245].
During heat waves, electricity demand often surges as customers turn on air-conditioning (AC)
to stay cool [19]. To avoid disruption, grid operators must secure enough generation resources to
meet rising demand, at the same time that electricity generators themselves might experience
reduced capacity because of the high temperatures. For example, extreme heat can reduce
reliable thermal power plant operation via lower power plant efficiencies [246], or in extreme
cases, due to inadequate or disrupted cooling water resources [247]. The efficiency of solar
photovoltaic panels is also reduced on hot days [248]. When extreme heat occurs during
periods of drought, low hydropower resources can further exacerbate reductions in generation
capacity [249]. Heat may also cause losses in transmission lines through both lowered line
ratings and a need to de-energize in instances of wildfire risk [250], [251]. Extreme heat events
in the past have forced utilities to resort to involuntary load shedding (i.e., rolling blackouts)
when available power generation was inadequate to meet demand [252]–[254]. While rolling
blackouts can effectively stabilize the power grid, they pose significant public health risks and
have been shown to more adversely affect minorities and populations with lower socioeconomic
status [255].
In addition to extreme heat, high penetrations of variable renewable energy, namely solar,
on California’s electric grid have prompted a set of emerging challenges that can threaten
supply-demand balancing, particularly during periods of peak demand. These challenges are
especially acute on high electricity demand days in the early evenings when solar PV generation
falls at the same time that net demand (i.e., total demand less total variable renewable
generation) approaches its peak. To accommodate the rapid increase in net demand, electricity
generators using other (typically more polluting) resources have to quickly ramp up their generation,
which can threaten grid reliability as there are physical constraints on how fast
generators can be dialed up [256]. Conversely, in mid-day periods when solar resources are high
and net load is at a low, CAISO has been challenged by solar overgeneration (i.e., periods when
solar generation exceeds what the grid can accommodate), typically on sunny Spring days when
hydropower and/or wind resources are also high [257]. During these periods system operators
often curtail this generation to avoid damage to the grid [258].
Flex Alerts can alleviate some of the stress of fossil fuel power generator ramping and
solar overgeneration by encouraging customers to shift electricity demand from on-peak to off-peak hours. In CAISO, shifting load to off-peak hours when solar and wind output is high has the
added benefit of increasing the amount of load met with emissions-free renewable generation.
For some loads, this temporal shifting might cause a net increase in daily electricity usage, for
example if customers precool their homes by running their air-conditioners more
intensely in early afternoon in efforts to reduce cooling during on-peak hours [175]. (However,
since electricity is cleaner and cheaper prior to Flex Alerts, there are likely indirect emissions
and cost benefits in addition to grid reliability benefits, even in these cases [259].)
Most analyses of CAISO Flex Alerts have focused on impacts to systemwide or regional
demand, and therefore, do not capture how participation in each DR event may have varied
across sectors, spatial and temporal extents, or population demographics. CAISO itself has stated
that there have been significant drops in overall demand during typical ramping hours on Flex
Alert days, suggesting that they are a useful tool for shedding load [245], [260], [261]. However,
an energy consulting firm released a load impact evaluation of California’s Flex Alert program in
2014 using PG&E load data from three different Flex Alert days in 2013, roughly 10 years after
program deployment, and did not find a statistically significant difference in the load, compared
to reference days [262]. In recent years, there have been calls by the California Public Utilities
Commission (CPUC) and investor-owned utilities (IOUs) to study consumer awareness of the
program [263], but there has been very little analysis of the efficacy of or participation in CAISO’s
Flex Alert program, in part due to the lack of high-resolution, publicly available data at a
regional scale. As a result, our understanding of how different populations of electricity
consumers respond when Flex Alerts are issued is limited.
In this research we aim to both evaluate the effectiveness of Flex Alerts as well as define
what it means for a Flex Alert to be “effective”, as there exists no standard metric in the
literature. We use five years of hourly smart meter electricity data from approximately 200,000
homes in Southern California to analyze the energy demand of residential customers on Flex
Alert days. The following research questions are addressed: 1) How effective have Flex Alerts
been in reducing pressure on generation fleet ramping ("Ramping Response") and residential
sector demand ("Flex Period Response") during Flex Alert hours? 2) Do factors including daily
maximum temperature, weekend-weekday scheduling, and frequency of Flex Alert issuance
affect Flex Period Responses on Flex Alert days? and 3) What subpopulations of customers are
most likely to change their behavior during Flex Alert Periods? The results of this study will
provide insight into the efficacy of voluntary demand response programs and how they can be
tailored to better engage and motivate different groups of residential customers.
Background
While few studies have analyzed the CAISO Flex Alert campaign specifically, many
studies have attempted to quantify the full potential of residential DR by simulating or modeling
energy behavior and customer response to certain signals [25], [264]–[266]. These studies focus
on how “flexible” certain loads could be (i.e., how much of demand could potentially be shifted
to other hours), but the results of these analyses do not give insight into how actual electricity
users behave when DR events are called. Energy behavior, especially in the residential sector, is
driven by a multitude of factors and is highly variable and unpredictable [267]. In the case of DR
programs, users will not always respond to price signals or incentives consistently or in the most
economical way [268]. A review of residential engagement in DR trials and programs found that
modeling studies generally have optimistic assumptions about consumer engagement that are not
realized, meaning the real-world customer response is lower than estimates in the peer-reviewed
literature [269].
To gain better insight into how electricity users and different subpopulations will actually
respond to DR, many studies analyze the load data of customers who participated in DR pilot
programs and trials [270]–[274]. For example, one study evaluated the response of 483
residential customers to critical-peak pricing in California, finding that users with higher
electricity demand and users in cooler climate zones were more responsive to critical-peak
pricing and reduced their demand by larger percentages, compared to users with smaller loads
and in warmer locations [270]. However, these DR pilots typically enroll too few participants
(e.g., a couple hundred to a couple thousand homes [275]–[278]) to be representative of an entire
region, which limits insight into the overall efficacy of a program or how
responses differ within a region. Further, these studies focus on pricing- or incentive-driven programs
[274], [279], [280] rather than programs like Flex Alerts where customers are called on to
voluntarily reduce demand without incentive [281], [282].
Voluntary DR programs that offer no financial incentive for customers who shift or shed
their load, such as Flex Alerts, are far less common in practice and in the literature. However, it
has been shown that customers can be persuaded to conserve energy without incentive when
emergency situations occur. In 2009 when a transmission line was severed in Juneau, Alaska, the
town’s residents responded to conservation requests and decreased their electricity use by 30%
compared to the same time in the previous year [282]. Similarly, California developed a
statewide public information campaign, titled Flex Your Power, during its 2001 energy crisis,
that urged residents and businesses to reduce both their overall and peak demand. As a result of
the campaign, additional DR programs, and efficiency measures, the peak electricity demand
was reduced by an estimated 6,369 MW, with 2,616 MW reduction being credited to voluntary
conservation alone [283]. These examples indicate that voluntary DR programs can be an
effective tool to help avert energy emergencies, but to replicate their success, further analyses of
the factors that impact the level of response are needed.
Evaluating the efficacy of DR programs is difficult because there is no precise ground
truth of what energy demand would have been in the absence of the DR tool since variables such
as temperature and other meteorological conditions, day of the week, holidays, etc. influence
day-to-day and hour-to-hour demand. Most studies establish a reference case (e.g., predicted
demand [270], demand from similar, non-participating customers [274], or a constructed baseline
determined from demand during a reference period [272], [273], [279], [282], [284]) to serve as
a proxy that can be compared to the observed demand to estimate how successful a DR program
was. However, a more robust, standardized analytical method that assesses the extent to which
customers engaged in DR events would be useful to systematically quantify and compare the
efficacy of DR programs across time, space, and customer demographics and the role DR can
play in achieving system reliability.
5.2 Methods
5.2.1 Datasets
Flex Alert data were retrieved from CAISO’s Grid Emergencies History Report for the
years 2015-2016 and 2018-2020 [285]. (These time periods were selected to align to our
available hourly household-level electricity dataset described below.) Data regarding the CAISO
Flex Alert issuance date, targeted period (e.g., 4pm to 9pm), and regional coverage (e.g., all of
CAISO vs. Southern California) were extracted. In total, 18 Flex Alerts were issued during the
study period with regional coverage that included Southern California. We chose to omit one
Flex Alert (June 20, 2016) from the study because the length of the alert (issued from 10 am to 9
pm) was markedly longer than others. We kept another Flex Alert, issued on 9/7/2020, in the
analysis, although it occurred on a major holiday (Labor Day), which we acknowledge may have
impacted customer response level.
Residential sector data comprised hourly smart meter electricity records for roughly
200,000 unique homes across the entire Southern California Edison (SCE) Investor-Owned
Utility (IOU) service area. The homes were selected to be statistically representative of the SCE
region with 99% certainty. SCE provided the smart meter dataset with matching street-level addresses for the
years 2015-2016 and 2018-2020. Data were stored on a high security data account (HSDA) to
abide by the privacy requirements of the SCE. These household-level data were analyzed to
evaluate how responsive the residential sector and subpopulations within the residential sector
were to Flex Alerts. In this analysis, we use the normalized shape of our hourly smart meter
dataset (i.e., after aggregating customer load at each hourly increment) as a proxy for SCE’s
normalized residential sector load and refer to it as the “residential load”. Total hourly load data
for SCE, which encompasses all end-use sectors, on each Flex Alert day were retrieved from
CAISO as a reference to compare to the residential smart meter data (referred to as “total SCE
load”).
Local and regional temperatures were used in this analysis to investigate the impact of
temperature on customer response during Flex Alerts and to identify similarly hot days to use as
reference. Three land-based weather station networks were used in this study: CIMIS,
EPA AQS, and NOAA. In total, historical dry bulb temperature records were retrieved from 112
different stations throughout the study region [113], [114], [194], and each household was
matched to the nearest weather station. Daily average and maximum temperatures were
computed on Flex Alert days for every census tract in the study area, as well as for the study area
as a whole, by taking the average of each household’s assigned temperature in the area of
interest. The temperatures were used to compare weather conditions across Flex Alert days and
determine comparable days (described in the following section). Daily temperatures were
assumed to be equal for the residential load and total SCE load analyses. Each house was also
matched to its corresponding California climate zone, as defined by the California Energy
Commission, to gain insight into how consumer response varied across microclimates [201].
Census tract level data from CalEnviroScreen 3.0 and the U.S. Census Bureau were retrieved to
compare energy consumption behavior across different subpopulations [119], [179]. These data
include poverty percentile, education percentile (e.g., the percent of households in a census tract
where a resident has at least a high school education; shown in the SI), and income. Each household
is characterized by the census tract it is situated in, as household-level socioeconomic data are not
available.
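The station matching and tract-level temperature averaging described above can be sketched as follows. This is a minimal illustration, not the study's code: it assumes a great-circle (haversine) distance for "nearest", and the function names, coordinates, and temperatures are all hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def nearest_station(home, stations):
    """Return the id of the station closest to a (lat, lon) home location."""
    return min(stations, key=lambda sid: haversine_km(*home, *stations[sid]))

def tract_mean_temperature(homes, stations, station_tmax):
    """Average each household's assigned daily maximum temperature by census tract."""
    sums, counts = {}, {}
    for home in homes:
        sid = nearest_station((home["lat"], home["lon"]), stations)
        tract = home["tract"]
        sums[tract] = sums.get(tract, 0.0) + station_tmax[sid]
        counts[tract] = counts.get(tract, 0) + 1
    return {tract: sums[tract] / counts[tract] for tract in sums}

# Hypothetical stations and homes (coordinates and temperatures are made up).
stations = {"A": (34.05, -118.25), "B": (33.95, -117.40)}
station_tmax = {"A": 90.0, "B": 100.0}  # daily maximum temperature (F) at each station
homes = [
    {"tract": "T1", "lat": 34.06, "lon": -118.20},
    {"tract": "T1", "lat": 34.00, "lon": -117.50},
    {"tract": "T2", "lat": 33.90, "lon": -117.45},
]
tract_tmax = tract_mean_temperature(homes, stations, station_tmax)
```

Because the tract value is a mean over households rather than over stations, a station that serves more homes in a tract carries more weight in that tract's temperature.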
5.2.2 Response Metrics
To define how “responsive” a household was to the Flex Alerts, we needed a
reference scenario that estimates what hourly electricity usage might have been in the
absence of the Flex Alert. Since it is impossible to know what actual electricity use
would have been in the absence of a Flex Alert on a Flex Alert day, we assigned three
“comparable days” to each of the 17 Flex Alerts studied. The comparable days were
selected based on four criteria: they (1) occurred in the two weeks leading up to or
two weeks following the given Flex Alert, (2) shared the same “day type” (i.e., weekday or
weekend) as the Flex Alert, (3) were not classified as a Flex Alert day, and (4) represented the
three hottest days based on the region’s daily maximum temperature. Every census tract was also
assigned three comparable days for each Flex Alert day using the census tract’s daily maximum
temperature. Thus, the assigned comparable days may differ across census tracts but are always
the same for every household within a census tract.
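The four selection criteria can be sketched in code as below. This is only an illustrative sketch: the function and variable names are hypothetical, the temperatures are invented, and ties in temperature are broken arbitrarily by sort order, which the text does not specify.

```python
from datetime import date, timedelta

def is_weekend(day):
    """True for Saturday or Sunday (Monday is weekday 0)."""
    return day.weekday() >= 5

def select_comparable_days(flex_day, daily_tmax, flex_days, n_days=3, window=14):
    """Pick the n hottest non-Flex-Alert days of the same day type within +/- window days."""
    candidates = []
    for offset in range(-window, window + 1):
        day = flex_day + timedelta(days=offset)
        # Criterion 3: skip any Flex Alert day (this also skips flex_day itself).
        if day in flex_days or day not in daily_tmax:
            continue
        # Criterion 2: same day type (weekday vs. weekend) as the Flex Alert.
        if is_weekend(day) != is_weekend(flex_day):
            continue
        candidates.append(day)
    # Criterion 4: keep the hottest days by daily maximum temperature.
    candidates.sort(key=lambda d: daily_tmax[d], reverse=True)
    return candidates[:n_days]

# Hypothetical daily maximum temperatures (F) around the June 30, 2015 Flex Alert.
flex_days = {date(2015, 6, 30), date(2015, 7, 1)}
daily_tmax = {date(2015, 6, 16) + timedelta(days=k): 80.0 for k in range(29)}
daily_tmax.update({
    date(2015, 6, 18): 91.0,  # Thursday
    date(2015, 6, 25): 95.0,  # Thursday
    date(2015, 6, 27): 98.0,  # Saturday, so the wrong day type for a Tuesday alert
    date(2015, 7, 1): 99.0,   # also a Flex Alert day, so excluded
    date(2015, 7, 7): 93.0,   # Tuesday
})
comparable = select_comparable_days(date(2015, 6, 30), daily_tmax, flex_days)
```

Passing a census tract's own temperature series as `daily_tmax` would give tract-specific comparable days, in line with the tract-level assignment described above.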
The electricity demand of each unique customer i on each of the three comparable days,
$E_{i,h}^{(C1)}$, $E_{i,h}^{(C2)}$, and $E_{i,h}^{(C3)}$, respectively, is averaged to generate an estimate of the electricity
demand of each customer in hour h on the Flex Alert day had the Flex Alert not been issued. We
refer to this as the reference electricity demand, $E_{i,h}^{(R)}$, described in Equation 1. We use this
metric as a proxy to estimate hourly demand across the reference day in Equations 2-4.
\[ E_{i,h}^{(R)} = \frac{E_{i,h}^{(C1)} + E_{i,h}^{(C2)} + E_{i,h}^{(C3)}}{3} \tag{1} \]
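As a small worked instance of Equation 1, the sketch below averages three comparable-day profiles hour by hour for a single customer. The function name and the demand values are hypothetical.

```python
def reference_profile(comparable_profiles):
    """Hour-by-hour mean of one customer's comparable-day profiles (Equation 1)."""
    n_days = len(comparable_profiles)
    return [sum(hour_values) / n_days for hour_values in zip(*comparable_profiles)]

# Three made-up 24-hour demand profiles (kWh per hour) for one customer.
c1 = [1.0] * 24
c2 = [2.0] * 24
c3 = [3.0] * 24
c3[17] = 6.0  # an evening spike on the third comparable day
reference = reference_profile([c1, c2, c3])
```

Here the hour-17 reference demand is (1 + 2 + 6) / 3 = 3 kWh, while every other hour averages to 2 kWh.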
Two metrics were developed to evaluate how responsive customers were to each Flex Alert. The
metrics were calculated for the SCE’s entire service territory and at the census tract level to explore how
the response varied across spatial extents and subpopulations. (Note: In Equations 1, 2, 3 and 5 no
summation across customers i is required when calculating the Flex Period Response and Ramping
Response of SCE’s total load data as these data are already aggregated into hourly values representing all
customers and all sectors in SCE territory.)
The first metric, described by Equation 2, is referred to as the Flex Period Response and is a
proxy to estimate the change in the percent of total daily demand that took place during targeted Flex
Alert hours on the Flex Alert day, F, compared to the same day if a Flex Alert was not issued (i.e., the
reference day, R). Hence, to calculate the Flex Period Response, we calculate $P^{(F)}$, representing the
percentage of total daily demand that occurred within the Flex Alert period, by summing the total amount
of electricity demand by customers across the spatial extent of interest (i.e., census tract or total SCE
region) between hour s, defined to be the start of the Flex Alert period, and hour e, the end of the Flex
Alert period, and dividing by the total daily electricity usage of all customers in the dataset on the Flex Alert
day, as shown in Equation 2. Then we repeat the calculation in Equation 3 to calculate the same value for
the percentage of demand occurring within the Flex Alert hours on the reference day, referred to here as
$P^{(R)}$.
\[ P^{(F)} = \frac{\sum_{i} \sum_{h=s}^{e} E_{i,h}^{(F)}}{\sum_{i} \sum_{h=0}^{23} E_{i,h}^{(F)}} \times 100 \tag{2} \]
\[ P^{(R)} = \frac{\sum_{i} \sum_{h=s}^{e} E_{i,h}^{(R)}}{\sum_{i} \sum_{h=0}^{23} E_{i,h}^{(R)}} \times 100 \tag{3} \]
The Flex Period Response (units of percent change) represents the difference between $P^{(F)}$ and
$P^{(R)}$, as shown in Equation 4.
\[ \text{Flex Period Response} = P^{(F)} - P^{(R)} \tag{4} \]
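Equations 2 through 4 can be illustrated with a short sketch. The names and numbers below are hypothetical, and the code assumes the sum runs over hours s through e inclusive, as the summation limits in Equation 2 are written.

```python
def period_share(profiles, start, end):
    """Percent of total daily demand falling in hours start..end inclusive (Equations 2 and 3)."""
    in_period = sum(sum(profile[start:end + 1]) for profile in profiles)
    total = sum(sum(profile) for profile in profiles)
    return 100.0 * in_period / total

def flex_period_response(flex_profiles, reference_profiles, start, end):
    """Difference in period share between the Flex Alert day and the reference day (Equation 4)."""
    return period_share(flex_profiles, start, end) - period_share(reference_profiles, start, end)

# One made-up customer: flat 1 kWh per hour on the reference day, and half that
# during hours 14 through 20 on the Flex Alert day.
reference_profiles = [[1.0] * 24]
flex_profiles = [[1.0] * 14 + [0.5] * 7 + [1.0] * 3]
response = flex_period_response(flex_profiles, reference_profiles, start=14, end=20)
```

In this example the period share falls from 7/24 (about 29.2%) to 3.5/20.5 (about 17.1%), so the response is about -12.1. Because the metric compares shares of daily demand rather than absolute load, a household that cuts demand evenly across the whole day would show a response near zero.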
The second metric, described as the Ramping Response (units of percent change) in Equation 5, is
a proxy to estimate the difference in how demand changed across the first hour of the Flex Alert Period
(i.e., the hour spanning h=s to h=s+1) on the actual Flex Alert day versus the reference day.
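\[ \text{Ramping Response} = \left( \frac{\sum_{i} E_{i,h=s+1}^{(F)} - \sum_{i} E_{i,h=s}^{(F)}}{\sum_{i} E_{i,h=s}^{(F)}} - \frac{\sum_{i} E_{i,h=s+1}^{(R)} - \sum_{i} E_{i,h=s}^{(R)}}{\sum_{i} E_{i,h=s}^{(R)}} \right) \times 100 \tag{5} \]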
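A minimal sketch of Equation 5 follows, using two made-up customers so that the summation over i is visible. All names and values are hypothetical.

```python
def ramp_fraction(profiles, start):
    """Fractional change in aggregate load from hour start to hour start + 1."""
    load_start = sum(profile[start] for profile in profiles)
    load_next = sum(profile[start + 1] for profile in profiles)
    return (load_next - load_start) / load_start

def ramping_response(flex_profiles, reference_profiles, start):
    """Flex Alert day ramp minus reference day ramp, in percent (Equation 5)."""
    return 100.0 * (ramp_fraction(flex_profiles, start) - ramp_fraction(reference_profiles, start))

def make_profile(overrides, base=1.0):
    """Build a 24-hour profile at a base level with selected hours overridden."""
    profile = [base] * 24
    for hour, value in overrides.items():
        profile[hour] = value
    return profile

# Aggregate load rises from 4.0 to 5.0 on the Flex Alert day (+25%)
# but from 4.0 to 6.0 on the reference day (+50%).
flex_profiles = [make_profile({14: 1.0, 15: 1.5}), make_profile({14: 3.0, 15: 3.5})]
reference_profiles = [make_profile({14: 1.0, 15: 2.0}), make_profile({14: 3.0, 15: 4.0})]
response = ramping_response(flex_profiles, reference_profiles, start=14)
```

The result is -25, meaning demand ramped up 25 percentage points less steeply over the first Flex Alert hour than it did on the reference day.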
The Flex Alert period varied across Flex Alert days so the targeted period in Equations 2 through
4 reflects the period defined by the unique Flex Alert listed in CAISO’s Grid Emergencies History Report
[285]. For example, on June 30th, 2015 the alert was issued to begin at 2 pm and end at 9 pm (i.e., s=2pm
to e=9pm) and on August 14th, 2020, the alert was issued to begin at 3 pm and end at 10 pm (i.e., s=3pm
to e=10pm). For each set of comparable days, the s and e values are defined based on the corresponding
Flex Alert day.
5.3 Results and Discussion
5.3.1 Level of response across Flex Alert days
The Flex Period and Ramping Responses of both the residential load and the total SCE load were
calculated to understand the varying success of the program and identify factors that might have
motivated customer engagement. In Figure 1, the Flex Period Response of the a) residential load and b)
total SCE load on each Flex Alert day is plotted against the study region’s daily maximum temperature on
the day of the Flex Alert. In Figure 1 c), a timeline of the Flex Alerts across the five years of data is
shown with the corresponding Flex Period Response on those days (note: desired Flex Period Responses
are negative, representing reductions in load). The shape of the points on the scatter plot represents
whether the date was a weekday or weekend, and the shade of each point refers to the order within each
year that the Flex Alerts were issued. (We were interested in understanding the frequency and timing of
Flex Alerts throughout the year to see if there might be fatigue across the customer base when many Flex
Alerts were issued within a short duration of time.) Points that are outlined in red represent Flex Alert
days that were hotter in temperature than all three of their comparable days.
The scatter plots show that there is a correlation between the study region’s Flex Period
Responses and the region’s daily maximum temperature, and the slope and r value of the line of best fit
can give insight into how dependent and correlated the Flex Period Responses are to daily maximum
temperature. In general, the residential load and total SCE load were more likely to reduce demand during
Flex Alert hours on days with relatively cooler temperatures (i.e., higher magnitude, negative Flex Period
Response). When compared to total SCE load, the Flex Period Response of the residential load is more
strongly dependent on (slope of 0.52 versus 0.34) but less correlated with (r value of 0.58 versus 0.69) the daily
maximum temperature. No observable trends were noted across weekdays and weekends, the number of Flex
Alerts issued in the year, or whether the Flex Alert day was hotter than all of its comparable days.
Additionally, the timeline in Figure 1 does not show any trend of consumers being less responsive on
consecutive days of issued Flex Alerts (i.e., there was no response fatigue after multiple consecutive days
of Flex Alerts).
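The slope and r value of a line of best fit can be computed with ordinary least squares, as sketched below. This assumes a simple linear fit of response against temperature; the data points are invented for illustration and are not the values in Table 1.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope, intercept, and Pearson r for paired samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical daily maximum temperatures (F) and Flex Period Responses (%).
temps = [85.0, 90.0, 95.0, 100.0]
responses = [-10.0, -6.0, -4.0, 0.0]
slope, intercept, r = fit_line(temps, responses)
```

A positive slope here means the response becomes less negative (a smaller reduction) as the day gets hotter, while r describes how tightly the points follow that line. The two can diverge: a steep slope with scattered points gives a large slope but a modest r.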
Table 1 summarizes Flex Alert days and the responses of both the residential sector and all SCE
customers. Compared to the Flex Period Response, the Ramping Response was much less strongly
dependent on the region’s daily maximum temperature, with a slope of 0.01 for the residential load and
the total SCE load. On average, the Ramping Response also has a lower magnitude, negative value for
both the residential SCE and total SCE load. It is important to note that the shape of the load profile on
the three comparable days strongly impacts the Ramping Response, and if the demand peaks earlier or
later than is typical it could lead to an under or overestimation of the responsiveness. For example, on
June 11, 2019, the demand on one of the three comparable days is already declining before the start of the
Flex Alert period, which increases the value of the computed Ramping Response (i.e., positive or lower
magnitude, negative value). Because there is no ground truth to compare against, this is a difficult
limitation to avoid. (Note: the hourly load profiles of the residential load and total SCE load on all of the
Flex Alert days are available in the SI.)
Table 1 also highlights that the overall Flex Period Response of both the residential and total SCE
load is inconsistent, and certain Flex Alert days appear to be effective (i.e., a negative Flex Period
Response) while others are not. On the most effective Flex Alert days (apart from the Flex Alert that fell
on a major holiday), the Flex Period Response of the residential load is -11% (June 30th, 2015 and July
1st, 2015), but most days have more modest values, or even positive Flex Period Response values,
indicating that load actually increased during the targeted period when compared to the estimated
reference day. The average Flex Period Responses of the residential and total SCE load are -4% and
-2%, respectively. It is likely that large industrial and commercial loads are already participating in
existing SCE DR programs (e.g., [286]) that incentivize them to shift or shed their large loads (and
possibly on selected comparable days, in addition to Flex Alert days), which might partially explain why
these customers have a lower average Flex Period Response and less sensitivity to temperature.
While the days with positive Flex Period Responses may seem like days in which the program was
ineffective, our reference day demand is only an estimate of what demand would have been in the absence
of a Flex Alert, so we have no precise ground truth for comparison. Thus, it is possible that customers still
used less electricity during the Flex Alert period on these days than they would have in the absence of a
Flex Alert, particularly on days where the Flex Alert day was hotter than the three comparable days used
to calculate the reference.
Figure 5-1. Flex Period Response of a) the residential SCE load and b) total SCE load on each Flex Alert day
issued in 2015-2016 and 2018-2020 versus the region’s daily maximum air temperature. c) A timeline of when each
Flex Alert was issued with the corresponding Flex Period Response.
Table 5-1. Summary of CAISO’s Flex Alerts from 2015-2016 and 2018-2020 with the corresponding Flex Period
Response and Ramping Response of SCE’s residential load and total load.
                                                    Residential SCE Load   Total SCE Load
Date        Day of Week  Daily Max Temp (F)         Ramping   Flex Period  Ramping   Flex Period
                                                    Response  Response     Response  Response
6/30/2015   Tuesday      90.4                       -9%       -11%         0%        -5%
7/1/2015    Wednesday    86.5                       -1%       -11%         -2%       -7%
7/27/2016   Wednesday    93.3                       -2%       -6%          0%        -4%
7/28/2016   Thursday     92.2                       -1%       -5%          1%        -4%
7/24/2018   Tuesday      96.3                       1%        -1%          2%        0%
7/25/2018   Wednesday    93.4                       1%        -2%          2%        -1%
6/11/2019   Tuesday      92.5                       0%        2%           1%        0%
8/14/2020   Friday       97.5                       0%        -1%          3%        0%
8/16/2020   Sunday       94.9                       -3%       -6%          -2%       -3%
8/17/2020   Monday       93.9                       -1%       -2%          -1%       -2%
8/18/2020   Tuesday      102                        -2%       -6%          -1%       -4%
8/19/2020   Wednesday    97.3                       -1%       -6%          -1%       -4%
9/5/2020    Saturday     106.2                      -2%       3%           2%        4%
9/6/2020    Sunday       108.3                      -3%       -2%          -1%       0%
9/7/2020*   Monday       88                         -7%       -18%         0%        -7%
10/1/2020   Thursday     99.7                       -1%       -1%          2%        0%
10/15/2020  Thursday     94.2                       1%        -2%          5%        -1%
Average response                                    -2%       -4%          1%        -2%
Median response                                     -1%       -2%          0%        -2%
Slope of response metric to temperature metric      0.01      0.69         0.01      0.53
r value of response metric to temperature metric    0.14      0.38         0.27      0.37
*Flex Alert fell on a federal holiday
5.3.2 Load profiles on Flex Alert days
As observed in Figure 1, the Flex Period Response of both the residential and total SCE load
varies across the Flex Alert days. Figure 2 a-b) highlights the load profiles of the residential load and total
SCE load on two different Flex Alert days (June 30th, 2015 and July 1st, 2015), which had the largest flex
period responses in the sample of Flex Alerts studied (both -11%). The load profiles of the comparable
days, used to calculate the reference day hourly load, are also shown for context. As a note, the three
comparable days exhibited strong consistency in the shape of their load profiles for a corresponding Flex
Alert day (refer to the SI for the load profiles of the comparable days). The Flex Period Response
suggests that the DR event was effective on both of these days; overall, customers used less of their daily
load during Flex Alert hours than they did on comparable days, reducing the generation resources needed
during those periods.
While significant Flex Period Responses were observed during Flex Alert hours on both
days in Figure 2, the shape of the responses differs. On June 30th, 2015, the residential load
sharply fell in the hours leading up to and after the first hour of the Flex Alert and remained
relatively low before peaking again in the final hours of the Flex Alert. The heat maps in Figure
2 c-d) underscore this behavior, illustrating a significant decrease in demand occurring between
hours 14 and 15, followed by an increase between hours 18 and 19, which is markedly different
from the patterns of changes in hourly demand on the three comparable days occurring before or
after the June 30th Flex Alert. In contrast, on July 1st, 2015 there is no sharp drop in the
residential load; instead, the demand is consistently lower than the comparable days leading up
to and throughout the hours of the issued alert. On both Flex Alert days, there are
ramping benefits, meaning that hourly increases in load during the initial hours of the Flex Alert
are smaller than on comparable days. However, we observe slight penalties in the case of the June 30th Flex
Alert, when demand spikes in the final hours of the Flex Alert period.
The load profiles also highlight some differences in how the residential sector consumes
energy throughout the day versus all sectors. While both normalized load profiles show peaking
behavior in the late afternoon through evening hours, the residential load’s peak is significantly
higher than the total SCE load’s peak. Hence, the residential sector drives much of the peaking
behavior that occurs during typical Flex Alert hours, as much of the residential sector’s demand
is driven by AC use. While it might be difficult for some households to shift or shed their load
when temperatures are hot, the high percentage of demand that takes place during the hours of
interest underscores the opportunity that residential consumers present for participating in DR
programs aimed to reduce peak electricity consumption.
Figure 5-2. a-b) Normalized hourly electricity load profile of residential load (purple) and total SCE load (red) on
two different Flex Alert days (solid lines) compared to the hourly load profiles on the comparable days (dashed
lines). c-d) Hourly percent change in electricity demand on two Flex Alert days (outlined in black) and their
corresponding comparable days.
5.3.3 Variation in response across residential customers
The response to Flex Alerts across residential customers is not uniform. To observe
variations across both the microclimates of the study region and subpopulations, the residential
Flex Period Response of each census tract was mapped across SCE territory. The choropleth
maps of the residential load’s Flex Period Response on two different Flex Alert days are depicted
in Figure 3. Figure 3 a), represents June 30, 2015, a day where the alert prompted a relatively
large Flex Period Response of -11% for the residential load (compared to the average residential
load’s Flex Period Response across all Flex Alert days of -4%). On this day, the response in the
study region is relatively consistent with reductions in loads across most census tracts.
Conversely, on June 11, 2019, the average Flex Period Response for residential customers across
SCE’s service region was +2%, meaning that there was an average increase in load during the
Flex Alert compared to electricity consuming behavior on similar days. However, when the
responses are mapped at the census tract level, there are many census tracts that were responsive
to the Flex Alert (i.e., had a negative Flex Period Response value).
Figure 5-3. a-b) Choropleth map of the census tract level Flex Period Response of SCE’s residential load on two
different Flex Alert days. Areas in blue consumed a lower percentage of their total daily electricity demand during
Flex Alert hours than they did on reference days, while areas in red used a higher percentage of their total daily
electricity demand during Flex Alert hours than they did on reference days.
Figure 5-4. Hourly residential electricity load by a) income percentile and b) demand percentile and normalized
hourly residential electricity load by c) income percentile and d) demand percentile. Heat maps of hourly percent
change in demand by e) income percentile and f) demand percentile on a Flex Alert day, June 30, 2015. Note: The
10th percentile refers to the lowest income and demand percentiles, and the 100th percentile refers to the highest
income and demand percentiles.
We also investigated how socioeconomic factors influence a customer’s participation in
Flex Alerts. Figure 4 a) and b) depict the hourly residential load in MWh by electricity demand
percentile and income percentile, respectively, (binned according to total annual demand in each
year) on a Flex Alert day, June 30, 2015. Figure 4 c) and d) depict the percent of daily residential
electricity load split across each hour of the day by electricity demand percentile and income
percentile, respectively, on the same Flex Alert day. From the load profiles, we observe that
higher income and higher demand customers have larger demand reductions during Flex Alert
hours than customers with lower income and demand. The heat maps in Figure 4 e) and f) show
percent change in hourly demand on the same Flex Alert day, again by demand and income
percentiles. Here we see that high income, high demand customers have a steeper decline in
demand during the initial hours of the flex alert than low income, low demand customers. These
results can be explained in part by previous studies on energy poverty, which have found
differences in the energy behavior of low-income and high-income utility customers. Low-income
customers typically use significantly less energy than their high-income counterparts and
more consistently engage in energy limiting behaviors to reduce costs [287]. Thus, when DR
events occur, it is difficult for low-income customers to further reduce their demand.
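The percentile binning behind Figure 4 can be sketched as follows. This is a hypothetical illustration of rank-based decile bins (labeled 10 through 100, with 10 the lowest) and per-bin hourly aggregation; the exact binning procedure used in the study is not specified here, and all names and values are invented.

```python
def decile_bins(values):
    """Map each key to a decile label (10, 20, ..., 100) by rank, lowest values first."""
    ranked = sorted(values, key=values.get)
    n = len(ranked)
    return {key: 10 * (rank * 10 // n + 1) for rank, key in enumerate(ranked)}

def hourly_load_by_bin(profiles, bins):
    """Sum hourly load profiles across the customers in each bin."""
    totals = {}
    for customer, profile in profiles.items():
        current = totals.setdefault(bins[customer], [0.0] * len(profile))
        for hour, value in enumerate(profile):
            current[hour] += value
    return totals

# Ten made-up customers with annual demand from 1,000 to 10,000 kWh and
# flat hourly profiles that scale with their annual demand.
annual_demand = {f"c{k}": 1000.0 * (k + 1) for k in range(10)}
profiles = {f"c{k}": [float(k + 1)] * 24 for k in range(10)}
bins = decile_bins(annual_demand)
load_by_bin = hourly_load_by_bin(profiles, bins)
```

The same two functions would apply to income by passing tract-level income values in place of `annual_demand`. Dividing each bin's hourly totals by its daily total would give normalized profiles like those described for panels c) and d).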
These results illustrate that average households in some census tracts likely have greater
flexibility to respond to Flex Alerts than in others. Despite the wide disparities in climate,
housing stock, and population demographics across and within both the census tracts and
subpopulations studied, we can draw some conclusions on some common characteristics of the
households most likely to respond. For example, customers with larger demand (who also tend to
be higher income) are more likely to modify electricity consuming activities and might have
access to technology that enables flexibility [288]. In contrast, a customer that is already very
energy conscious and/or has few discretionary electric loads (who also tend to be lower income)
might not be able to shift much of their load to other periods of the day. These results, on one
hand, indicate that the households with the largest potential to contribute large reductions in peak
demand (i.e., high consumers) are typically the households that are most likely to respond to Flex
Alerts, which is valuable for program success. On the other hand, our census tract
results across the SCE region indicate that responsiveness to Flex Alerts on extreme heat days
(particularly in the hottest regions) tends to be lower than responsiveness to Flex Alerts issued on
comparatively cooler days (and in cooler regions). Hence, the Flex Alert program might not be very effective on
the most extreme heat days when grid resources tend to be most strained.
5.4 Conclusion
The results of this study show that voluntary DR programs, even without financial or
other incentives, can significantly influence energy behavior, especially in the residential sector,
where peak electricity usage tends to be high compared to other sectors. On the Flex Alert days
with the greatest percent reduction of daily consumption during the Flex Alert hours, the
residential load and total SCE load had Flex Period Responses of -11% and -7%, respectively.
These values represent meaningful decreases in the percent of daily demand used during Flex
Alert hours suggesting that Flex Alerts have provided a valuable tool to help maintain grid
reliability on days when the grid has been stressed. However, there are also days when
responsiveness appears to have been negligible, and regional results suggest that Flex
Alert responsiveness tends to be reduced on days and in locations experiencing extreme heat
(when electricity resources are presumably most strained).
Our investigation of differences across subpopulations and census tracts implies that
some households have more flexibility or ability to shift or shed their load and that this flexibility
can vary significantly across Flex Alert days. While we do not have full transparency into what
drives these differences, which are likely due to a variety of factors (e.g., outdoor temperature,
higher efficiency homes, more flexible schedules, more discretionary loads during Flex Alert
hours, and the health, wealth, and age of occupants, etc.), we can generate some broad insights.
From our analysis we see clear overlaps between household income, total electricity demand,
and Flex Period Response; customers with large electricity demand often also have high income,
and hence, more demand flexibility. However, future analyses should focus on teasing out the
other factors that drive differences in customer engagement and literacy in the program across
subpopulations to gain a better understanding of what factors influence customer participation so
that messaging for future events can be tailored to increase engagement. Future work might also
look into the efficacy of trying to drive larger Flex Alert responses in cooler regions on extreme
heat days, when responsiveness in the hottest regions is likely to be muted.
This study underscores some of the shortcomings of unincentivized, voluntary DR
programs, since it is difficult to predict how much additional capacity can be gained by issuing
Flex Alerts, limiting their benefits to grid operators for longer term planning. As the frequency
and intensity of extreme heat events grow, a viable, dynamic grid will likely require more
flexibility than a program such as Flex Alerts can provide, meaning robust, incentivized DR
programs with reliable participation are needed to ensure resources are available during periods
of high demand. Most incentivized DR programs in California over the past decade have been
geared towards industrial and commercial customers with large loads for load shedding and load
shifting, but the results of this study show that there is huge potential for demand-side
management programs through the aggregation of smaller residential load reductions. Currently,
DR programs are decentralized and administered by either the state’s three IOUs, CPUC
jurisdictional entities, or commercial DR providers, and residential customers can elect to
participate. However, the future of California’s electricity systems will likely include a more
formal approach to DR, such as universal opt-in access to real-time energy and capacity pricing,
as was outlined in the CPUC Energy Division’s 2022 “CalFUSE” proposal [289]. (Note: In May
202, the California Public Utility Commission launched a pilot program administered by a
customer’s IOU, called “Power Saver Rewards” that pays customers in bill credits when they
respond to Flex Alerts [290]. That incentivized pilot program was not implemented during the
period of study.)
As markets for incentivized DR are developed, attention should be given to how to best
optimize these programs as there are competing factors that can limit their overall success. For
example, although wealthier customers have the capability to load shift, studies have shown that
they are harder to incentivize financially [291]. And while lower income customers are more
likely to respond to pricing signals, their participation doesn’t offer as much flexibility to the grid
as higher income customers because their loads are generally smaller [292], [293]. Further, low-income,
energy insecure customers may already be engaging in energy limiting behaviors due to
their own financial constraints. Thus, it is important to design programs that distribute the
benefits to vulnerable households, whose ability to participate is limited [294] and who, hence, will
not benefit equally from financial incentives or lowered electricity prices in comparison to
wealthier, higher consuming users [295]. Still, regardless of who participates, successful DR
programs that shift load to off peak hours benefit the whole customer base through reductions in
overall electricity costs that reflect a transfer of wealth from electricity generators to electricity
consumers [296], [297].
Chapter 6: Conclusion
Extreme heat events and rising average temperatures will be a major challenge for grid
management in the upcoming decades; thus, a detailed understanding of climate energy
interactions is critical to plan for future energy needs. The research described in this dissertation
improves our understanding of residential electricity consumption in a warming climate through
electric load forecasting, cooling demand characterization, and grid flexibility analysis. Prior
research has lacked access to datasets with high spatiotemporal resolution that can sufficiently capture the
relationship between residential energy behavior and its driving factors, limiting the spatial
resolution of the studies. We address existing research gaps using a highly granular dataset of
smart meter electricity records that permits highly resolved analyses of the spatial and temporal
trends of residential electricity use.
In Chapter 2, residential electricity demand is predicted at multiple spatial (e.g.,
household and census tract) and temporal (e.g., daily, monthly, annual) resolutions with various
ML models. The results show that ML models can predict household-level electricity
demand with moderate accuracy in certain cases using smart meter, weather, building,
and socioeconomic data; the best performing model, an MLP regressor trained with monthly data,
achieved an r² value of 0.45. Models trained with census tract level data were more accurate; the
MLP regressor trained with monthly data and the post-aggregation method achieved an r² of
0.81. The census tract level results are promising because they indicate that residential electricity
demand can be predicted at relatively high-resolution spatial scales, which are informative for
grid planning and operation, without needing private customer electricity data.
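The gap between household-level and census tract level accuracy can be illustrated with a minimal sketch on synthetic data. This is not the dissertation's model or data: the tract means, noise levels, and the function names `r_squared` and `simulate` are assumptions made for illustration, and "post-aggregation" is interpreted here as averaging household-level predictions within each tract before scoring. The sketch shows how idiosyncratic household noise, which a model cannot explain, averages out once homes are grouped by tract.

```python
import random


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def simulate(n_tracts=50, homes_per_tract=40, seed=0):
    """Return (household r2, tract r2) for synthetic monthly demand."""
    rng = random.Random(seed)
    hh_actual, hh_pred = [], []
    tract_actual, tract_pred = [], []
    for _ in range(n_tracts):
        # Hypothetical tract-level mean demand in kWh/month.
        tract_mean = rng.uniform(400, 900)
        # Actual demand includes large household-specific variation.
        actual = [tract_mean + rng.gauss(0, 150) for _ in range(homes_per_tract)]
        # The stand-in "model" recovers the tract mean with a small error.
        pred = [tract_mean + rng.gauss(0, 50) for _ in range(homes_per_tract)]
        hh_actual += actual
        hh_pred += pred
        # Post-aggregation: average household values within the tract.
        tract_actual.append(sum(actual) / homes_per_tract)
        tract_pred.append(sum(pred) / homes_per_tract)
    return r_squared(hh_actual, hh_pred), r_squared(tract_actual, tract_pred)


r2_household, r2_tract = simulate()
```

Under these assumptions, the tract-level score exceeds the household-level score because the unexplained household variation shrinks roughly in proportion to the number of homes averaged per tract.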
In Chapter 3 of this dissertation, we tested how estimates of AC penetration made using
humid heat metrics would compare against estimates made using dry bulb
temperature alone. We hypothesized that humidity and temperature together would better capture
the climate sensitive portion of electricity demand since humidity is known to impact human
thermal comfort and recent energy modeling studies have shown it is an important predictor of
cooling demand. However, our study found that in the study region of Southern California,
humid heat metrics were either less or similarly correlated with electricity consumption when
compared to dry bulb temperature. While these results are not definitive enough to confirm
which heat metric is best suited for estimating AC ownership, particularly given that Southern
California does not have climate zones with extreme humidity, we found that combining multiple
heat metrics can increase confidence in our predictions.
Chapter 4 expands the AC analysis by estimating both AC Penetration Rates and AC
Operation Rates. In the first step of this three-part methodology, we developed a novel model to
detect AC units that we believe is better suited than other contributions in the peer-reviewed
literature to capture homes with electric heating and smaller AC loads. In the second step, we
adapted a linear regression model to determine the hours in which customers use their AC.
Finally, these results are combined in the third step to calculate Net AC Utilization, which
provides greater insight into the spatial and temporal patterns of cooling behavior than AC
Penetration Rates alone and allows us to identify communities with disproportionate AC use.
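A small sketch can show why a combined metric is more informative than AC Penetration Rates alone. The exact definition of Net AC Utilization is given in Chapter 4 and is not restated here; the product form below, the function name `net_ac_utilization`, and the tract values are all assumptions made purely for illustration.

```python
def net_ac_utilization(penetration_rate, operation_rate):
    """Assumed product form: share of homes with AC times share of hours AC runs."""
    for name, value in (("penetration_rate", penetration_rate),
                        ("operation_rate", operation_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return penetration_rate * operation_rate


# Hypothetical tracts: (AC penetration rate, AC operation rate).
tracts = {"tract_A": (0.90, 0.20), "tract_B": (0.45, 0.40)}
utilization = {name: net_ac_utilization(p, o) for name, (p, o) in tracts.items()}
```

In this hypothetical case, the two tracts have very different penetration rates but the same combined utilization, a pattern that penetration rates alone would not reveal.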
Chapter 5 explores residential electricity demand on extreme heat days,
specifically with regard to Flex Alerts. Flex Alerts are used to relieve strain on
CAISO’s grid during emergency events when demand threatens to outpace supply,
often on hot summer afternoons when cooling demand is driving peak demand. In this
study, we define two new metrics to quantify how much residential electricity demand
decreased across the Flex Alert period and how much ramping decreased in the first
hour of the alert, when compared to similar days. The study found that the response to
Flex Alerts was correlated with temperature and varied across days and
subpopulations. In general, the variability in the level of response suggests that a more
robust DR program with financial incentives would be better suited to ensure reliable
customer participation on days when flexibility is critical to maintain grid stability.
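The two response metrics summarized above can be sketched as follows. This is a hedged illustration, not the formulation used in Chapter 5: the function names, the hourly load values, the 4 to 9 pm alert window, and the definition of the first-hour ramp as the change in load from the hour before the alert to the first alert hour are all assumptions. The baseline profile stands in for an average over similar non-alert days, whose selection is not covered here.

```python
def mean_load(load_by_hour, hours):
    """Average load (kW) over the given hours."""
    return sum(load_by_hour[h] for h in hours) / len(hours)


def demand_reduction_pct(alert_day, baseline_day, alert_hours):
    """Percent drop in mean load over the alert window relative to baseline."""
    base = mean_load(baseline_day, alert_hours)
    return 100.0 * (base - mean_load(alert_day, alert_hours)) / base


def first_hour_ramp_reduction(alert_day, baseline_day, alert_start):
    """Baseline ramp minus alert-day ramp going into the first alert hour (kW)."""
    baseline_ramp = baseline_day[alert_start] - baseline_day[alert_start - 1]
    alert_ramp = alert_day[alert_start] - alert_day[alert_start - 1]
    return baseline_ramp - alert_ramp


# Hypothetical average household load (kW) keyed by hour of day.
baseline = {15: 1.8, 16: 2.2, 17: 2.5, 18: 2.6, 19: 2.4, 20: 2.0}
alert = {15: 1.8, 16: 2.0, 17: 2.3, 18: 2.4, 19: 2.3, 20: 1.9}

alert_hours = range(16, 21)  # assumed alert window, hours 16 through 20
reduction_pct = demand_reduction_pct(alert, baseline, alert_hours)
ramp_reduction_kw = first_hour_ramp_reduction(alert, baseline, alert_start=16)
```

Positive values of both quantities indicate that customers consumed less and ramped more slowly than on comparable days; values near zero or negative would indicate little or no response.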
The knowledge gained from these studies can serve as a reference to 1)
optimize building energy prediction studies, 2) improve understanding of the
residential sector’s cooling demand, and 3) aid in the design of successful DR
programs. As climate change drives increases in temperatures, certain populations at
the intersection of extreme heat and high poverty will be disproportionately impacted.
This body of work can also be used to better identify the most vulnerable communities
and craft targeted policies that can improve access to cooling services and alleviate some
of the energy burden associated with climate change. As smart meter data becomes
increasingly available, researchers can adopt the methods outlined in this study to gain
deeper insight into the electricity needs and cooling demand of their study regions,
informing grid management strategies, future infrastructure investments, and
household energy insecurity policy.
References
[1] International Energy Agency (IEA), “Key World Energy Statistics,” Paris, 2021. [Online].
Available: https://www.iea.org/reports/key-world-energy-statistics-2021.
[2] G. Franco and A. H. Sanstad, “Climate change and electricity demand in California,” Clim.
Change, vol. 87, no. 1 SUPPL, pp. 139–151, 2007, doi: 10.1007/s10584-007-9364-y.
[3] G. S. Eskeland and T. K. Mideksa, “Electricity demand in a changing climate,” Mitig. Adapt.
Strateg. Glob. Chang., vol. 15, no. 8, pp. 877–897, 2010, doi: 10.1007/s11027-010-9246-x.
[4] M. Sugiyama, “Climate change mitigation and electrification,” Energy Policy, vol. 44, pp. 464–
468, May 2012, doi: 10.1016/j.enpol.2012.01.028.
[5] M. Auffhammer, “Climate Adaptive Response Estimation: Short and long run impacts of climate
change on residential electricity and natural gas consumption,” J. Environ. Econ. Manage., vol.
114, 2022, doi: 10.1016/j.jeem.2022.102669.
[6] S. R. Sinsel, R. L. Riemke, and V. H. Hoffmann, “Challenges and solution technologies for the
integration of variable renewable energy sources—a review,” Renewable Energy, vol. 145. 2020,
doi: 10.1016/j.renene.2019.06.147.
[7] IEA, “The Future of Cooling,” Paris, 2018. [Online]. Available: https://www.iea.org/reports/the-future-of-cooling.
[8] H. T. Haider, O. H. See, and W. Elmenreich, “A review of residential demand response of smart
grid,” Renew. Sustain. Energy Rev., vol. 59, pp. 166–178, 2016, doi: 10.1016/j.rser.2016.01.016.
[9] L. Zhang et al., “A review of machine learning in building load prediction,” Appl. Energy, vol.
285, no. July 2020, p. 116452, 2021, doi: 10.1016/j.apenergy.2021.116452.
[10] K. Amasyali and N. M. El-gohary, “A review of data-driven building energy consumption
prediction studies,” Renew. Sustain. Energy Rev., vol. 81, no. March 2017, pp. 1192–1205, 2018,
doi: 10.1016/j.rser.2017.04.095.
[11] M. Bourdeau, X. qiang Zhai, E. Nefzaoui, X. Guo, and P. Chatellier, “Modeling and forecasting
building energy consumption: A review of data-driven techniques,” Sustainable Cities and
Society, vol. 48. 2019, doi: 10.1016/j.scs.2019.101533.
[12] J. S. Chou and D. S. Tran, “Forecasting energy consumption time series using machine learning
techniques based on usage patterns of residential householders,” Energy, vol. 165, pp. 709–726,
2018, doi: 10.1016/j.energy.2018.09.144.
[13] M. A. R. Biswas, M. D. Robinson, and N. Fumo, “Prediction of residential building energy
consumption: A neural network approach,” Energy, vol. 117, pp. 84–92, 2016, doi:
10.1016/j.energy.2016.10.066.
[14] R. E. Edwards, J. New, and L. E. Parker, “Predicting future hourly residential electrical
consumption: A machine learning case study,” Energy Build., vol. 49, pp. 591–603, 2012, doi:
10.1016/j.enbuild.2012.03.010.
[15] D. Maia-Silva, R. Kumar, and R. Nateghi, “The critical role of humidity in modeling summer
electricity demand across the United States,” Nat. Commun., vol. 11, no. 1, 2020, doi:
10.1038/s41467-020-15393-8.
[16] Z. Li, W. Chen, S. Deng, and Z. Lin, “The characteristics of space cooling load and indoor
humidity control for residences in the subtropics,” Build. Environ., vol. 41, no. 9, pp. 1137–1147,
Sep. 2006, doi: 10.1016/J.BUILDENV.2005.05.016.
[17] J. Winkler, J. Munk, and J. Woods, “Sensitivity of occupant comfort models to humidity and their
effect on cooling energy use,” Build. Environ., vol. 162, 2019, doi:
10.1016/j.buildenv.2019.106240.
[18] J. Woods et al., “Humidity’s impact on greenhouse gas emissions from air conditioning,” Joule,
vol. 6, no. 4. 2022, doi: 10.1016/j.joule.2022.02.013.
[19] M. Auffhammer, P. Baylis, and C. H. Hausman, “Climate change is projected to have severe
impacts on the frequency and intensity of peak electricity demand across the United States,” Proc.
Natl. Acad. Sci. U. S. A., vol. 114, no. 8, 2017, doi: 10.1073/pnas.1613193114.
[20] B. J. van Ruijven, E. De Cian, and I. Sue Wing, “Amplification of future energy demand growth
due to climate change,” Nat. Commun., vol. 10, no. 1, 2019, doi: 10.1038/s41467-019-10399-3.
[21] X. Li, J. Chambers, S. Yilmaz, and M. K. Patel, “A Monte Carlo building stock model of space
cooling demand in the Swiss service sector under climate change,” Energy Build., vol. 233, 2021,
doi: 10.1016/j.enbuild.2020.110662.
[22] L. T. Biardeau, L. W. Davis, P. Gertler, and C. Wolfram, “Heat exposure and global air
conditioning,” Nat. Sustain., vol. 3, no. 1, 2020, doi: 10.1038/s41893-019-0441-9.
[23] M. Chen, G. A. Ban-Weiss, and K. T. Sanders, “The role of household level electricity data in
improving estimates of the impacts of climate on building electricity use,” Energy Build., vol. 180,
pp. 146–158, 2018, doi: 10.1016/j.enbuild.2018.09.012.
[24] M. Chen, K. T. Sanders, and G. A. Ban-Weiss, “A new method utilizing smart meter data for
identifying the existence of air conditioning in residential homes,” Environ. Res. Lett., vol. 14, no.
9, 2019, doi: 10.1088/1748-9326/ab35a8.
[25] M. E. H. Dyson, S. D. Borgeson, M. D. Tabone, and D. S. Callaway, “Using smart meter data to
estimate demand response potential, with application to solar energy integration,” Energy Policy,
vol. 73, 2014, doi: 10.1016/j.enpol.2014.05.053.
[26] “What is a Flex Alert?” https://www.flexalert.org/what-is-flex-alert.
[27] “Electric Power Annual: Table 1.2. Summary statistics for the United States, 2010-2020.”
[Online]. Available: https://www.eia.gov/electricity/annual/html/epa_01_02.html.
[28] “Per capita U.S. residential electricity use was flat in 2020, but varied by state,” 2021.
https://www.eia.gov/todayinenergy/detail.php?id=49036.
[29] A. Kavousian, R. Rajagopal, and M. Fischer, “Determinants of residential electricity consumption:
Using smart meter data to examine the effect of climate, building characteristics, appliance stock,
and occupants’ behavior,” Energy, vol. 55, pp. 184–194, Jun. 2013, doi:
10.1016/j.energy.2013.03.086.
[30] B. E. Psiloglou, C. Giannakopoulos, S. Majithia, and M. Petrakis, “Factors affecting electricity
demand in Athens, Greece and London, UK: A comparative assessment,” Energy, vol. 34, no. 11,
pp. 1855–1863, Nov. 2009, doi: 10.1016/j.energy.2009.07.033.
[31] J. C. Lam, “Climatic and economic influences on residential electricity consumption,” Energy
Convers. Manag., vol. 39, no. 7, pp. 623–629, 1998, doi: 10.1016/S0196-8904(97)10008-5.
[32] T. Ahmad et al., “Supervised based machine learning models for short, medium and long-term
energy prediction in distinct building environment,” Energy, vol. 158, pp. 17–32, 2018, doi:
10.1016/j.energy.2018.05.169.
[33] C. Bartusch, M. Odlare, F. Wallin, and L. Wester, “Exploring variance in residential electricity
consumption: Household features and building properties,” Appl. Energy, vol. 92, pp. 637–643,
Apr. 2012, doi: 10.1016/j.apenergy.2011.04.034.
[34] E. McKenna et al., “Explaining daily energy demand in British housing using linked smart meter
and socio-technical data in a bottom-up statistical model,” Energy Build., vol. 258, p. 111845,
2022, doi: 10.1016/j.enbuild.2022.111845.
[35] G. Huebner, D. Shipworth, I. Hamilton, Z. Chalabi, and T. Oreszczyn, “Understanding electricity
consumption: A comparative contribution of building factors, socio-demographics, appliances,
behaviours and attitudes,” Appl. Energy, vol. 177, 2016, doi: 10.1016/j.apenergy.2016.04.075.
[36] T. Dietz, G. T. Gardner, J. Gilligan, P. C. Stern, and M. P. Vandenbergh, “Household actions can
provide a behavioral wedge to rapidly reduce US carbon emissions,” Proc. Natl. Acad. Sci. U. S.
A., vol. 106, no. 44, pp. 18452–18456, 2009, doi: 10.1073/pnas.0908738106.
[37] J. Schot, L. Kanger, and G. Verbong, “The roles of users in shaping transitions to new energy
systems,” Nat. Energy, vol. 1, no. May, 2016, doi: 10.1038/nenergy.2016.54.
[38] D. Yan et al., “Occupant behavior modeling for building performance simulation: Current state
and future challenges,” Energy Build., vol. 107, 2015, doi: 10.1016/j.enbuild.2015.08.032.
[39] S. Vojtovic, A. Stundziene, and R. Kontautiene, “The impact of socio-economic indicators on
sustainable consumption of domestic electricity in Lithuania,” Sustain., vol. 10, no. 2, 2018, doi:
10.3390/su10020162.
[40] E. Ziramba, “The demand for residential electricity in South Africa,” Energy Policy, vol. 36, no. 9,
pp. 3460–3466, 2008, doi: 10.1016/j.enpol.2008.05.026.
[41] A. Alberini and M. Filippini, “Response of residential electricity demand to price: The effect of
measurement error,” Energy Econ., vol. 33, no. 5, pp. 889–895, 2011, doi:
10.1016/j.eneco.2011.03.009.
[42] T. Dergiades and L. Tsoulfidis, “Estimating residential demand for electricity in the United States,
1965-2006,” Energy Econ., vol. 30, no. 5, pp. 2722–2730, 2008, doi:
10.1016/j.eneco.2008.05.005.
[43] D. Wiesmann, I. Lima Azevedo, P. Ferrão, and J. E. Fernández, “Residential electricity
consumption in Portugal: Findings from top-down and bottom-up models,” Energy Policy, vol. 39,
no. 5, pp. 2772–2779, 2011, doi: 10.1016/j.enpol.2011.02.047.
[44] C. L. Hor, S. J. Watson, and S. Majithia, “Analyzing the impact of weather variables on monthly
electricity demand,” IEEE Trans. Power Syst., vol. 20, no. 4, 2005, doi:
10.1109/TPWRS.2005.857397.
[45] J. Huang, H. Akbari, and L. Rainer, “481 Prototypical Commercial Buildings for 20 Urban
Market Areas,” Lawrence Berkeley National Laboratory, 1991. [Online]. Available:
https://escholarship.org/uc/item/1g90f5gj.
[46] M. W. Opitz, L. K. Norford, Y. A. Matrosov, and I. N. Butovsky, “Energy consumption and
conservation in the Russian apartment building stock,” Energy Build., vol. 25, no. 1, pp. 75–92,
1997, doi: 10.1016/s0378-7788(96)00995-4.
[47] D. Charlier and A. Risch, “Evaluation of the impact of environmental public policy measures on
energy consumption and greenhouse gas emissions in the French residential sector,” Energy
Policy, vol. 46, no. 2012, pp. 170–184, 2012, doi: 10.1016/j.enpol.2012.03.048.
[48] Y. Shimoda, T. Fujii, T. Morikawa, and M. Mizuno, “Residential end-use energy simulation at
city scale,” Build. Environ., vol. 39, no. 8 SPEC. ISS., pp. 959–967, 2004, doi:
10.1016/j.buildenv.2004.01.020.
[49] M. A. R. Lopes, C. H. Antunes, and N. Martins, “Towards more effective behavioural energy
policy: An integrative modelling approach to residential energy consumption in Europe,” Energy
Res. Soc. Sci., vol. 7, pp. 84–98, 2015, doi: 10.1016/j.erss.2015.03.004.
[50] T. Pukšec, B. V. Mathiesen, T. Novosel, and N. Duić, “Assessing the impact of energy saving
measures on the future energy demand and related GHG (greenhouse gas) emission reduction of
Croatia,” Energy, vol. 76, pp. 198–209, 2014, doi: 10.1016/j.energy.2014.06.045.
[51] J. P. Gouveia, P. Fortes, and J. Seixas, “Projections of energy services demand for residential
buildings: Insights from a bottom-up methodology,” Energy, vol. 47, no. 1, pp. 430–442, 2012,
doi: 10.1016/j.energy.2012.09.042.
[52] M. Ferrando, F. Causone, T. Hong, and Y. Chen, “Urban building energy modeling (UBEM)
tools: A state-of-the-art review of bottom-up physics-based approaches,” Sustainable Cities and
Society, vol. 62. 2020, doi: 10.1016/j.scs.2020.102408.
[53] L. G. Swan and V. I. Ugursal, “Modeling of end-use energy consumption in the residential sector:
A review of modeling techniques,” Renewable and Sustainable Energy Reviews, vol. 13, no. 8.
Pergamon, pp. 1819–1835, Oct. 01, 2009, doi: 10.1016/j.rser.2008.09.033.
[54] M. Gul and S. A. Qa, “Incorporating Economic and Demographic Variables for Forecasting
Electricity Consumption in Pakistan.”
[55] N. A. Burney, “Socioeconomic development and electricity consumption A cross-country analysis
using the random coefficient method,” Energy Econ., vol. 17, no. 3, pp. 185–195, 1995, doi:
10.1016/0140-9883(95)00012-J.
[56] M. Salari and R. J. Javid, “Residential energy demand in the United States: Analysis using static
and dynamic approaches,” Energy Policy, vol. 98, pp. 637–649, 2016, doi:
10.1016/j.enpol.2016.09.041.
[57] M. Kavgic, A. Mavrogianni, D. Mumovic, A. Summerfield, Z. Stevanovic, and M. Djurovic-Petrovic, “A review of bottom-up building stock models for energy consumption in the residential
sector,” Build. Environ., vol. 45, no. 7, pp. 1683–1697, 2010, doi: 10.1016/j.buildenv.2010.01.021.
[58] A. Uihlein and P. Eder, “Policy options towards an energy efficient residential building stock in
the EU-27,” Energy Build., vol. 42, no. 6, pp. 791–798, 2010, doi: 10.1016/j.enbuild.2009.11.016.
[59] J. L. Reyna and M. V. Chester, “Energy efficiency to reduce residential electricity and natural gas
use under climate change,” Nat. Commun., vol. 8, no. May 2017, pp. 1–12, 2017, doi:
10.1038/ncomms14916.
[60] J. Min, Z. Hausfather, and Q. F. Lin, “A high-resolution statistical model of residential energy end
use characteristics for the United States,” J. Ind. Ecol., vol. 14, no. 5, pp. 791–807, 2010, doi:
10.1111/j.1530-9290.2010.00279.x.
[61] H. X. Zhao and F. Magoulès, “A review on the prediction of building energy consumption,”
Renew. Sustain. Energy Rev., vol. 16, no. 6, pp. 3586–3592, 2012, doi: 10.1016/j.rser.2012.02.049.
[62] S. Seyedzadeh, F. P. Rahimian, I. Glesk, and M. Roper, “Machine learning for estimation of
building energy consumption and performance: a review,” Vis. Eng., vol. 6, no. 1, 2018, doi:
10.1186/s40327-018-0064-7.
[63] C. Robinson et al., “Machine learning approaches for estimating commercial building energy
consumption,” Appl. Energy, vol. 208, no. August, pp. 889–904, 2017, doi:
10.1016/j.apenergy.2017.09.060.
[64] B. Yildiz, J. I. Bilbao, and A. B. Sproul, “A review and analysis of regression and machine
learning models on commercial building electricity load forecasting,” Renew. Sustain. Energy
Rev., vol. 73, no. February, pp. 1104–1122, 2017, doi: 10.1016/j.rser.2017.02.023.
[65] A. Foucquier, S. Robert, F. Suard, L. Stéphan, and A. Jay, “State of the art in building modelling
and energy performances prediction: A review,” Renew. Sustain. Energy Rev., vol. 23, pp. 272–
288, 2013, doi: 10.1016/j.rser.2013.03.004.
[66] A. H. Neto and F. A. S. Fiorelli, “Comparison between detailed model simulation and artificial
neural network for forecasting building energy consumption,” Energy Build., vol. 40, no. 12, pp.
2169–2176, 2008, doi: 10.1016/j.enbuild.2008.06.013.
[67] C. Turhan, T. Kazanasmaz, I. E. Uygun, K. E. Ekmen, and G. G. Akkurt, “Comparative study of a
building energy performance software (KEP-IYTE-ESS) and ANN-based building heat load
estimation,” Energy Build., vol. 85, pp. 115–125, 2014, doi: 10.1016/j.enbuild.2014.09.026.
[68] R. Jing, M. Wang, R. Zhang, N. Li, and Y. Zhao, “A study on energy performance of 30
commercial office buildings in Hong Kong,” Energy Build., vol. 144, pp. 117–128, 2017, doi:
10.1016/j.enbuild.2017.03.042.
[69] H. Deng, D. Fannon, and M. J. Eckelman, “Predictive modeling for US commercial building
energy use: A comparison of existing statistical and machine learning algorithms using CBECS
microdata,” Energy Build., vol. 163, pp. 34–43, 2018, doi: 10.1016/j.enbuild.2017.12.031.
[70] R. Mohammadiziazi and M. M. Bilec, “Application of Machine Learning for Predicting Building
Energy Use at Different Temporal and Spatial Resolution under Climate Change in USA,” 2020.
[71] N. Bassamzadeh and R. Ghanem, “Multiscale stochastic prediction of electricity demand in smart
grids using Bayesian networks,” Appl. Energy, vol. 193, pp. 369–380, 2017, doi:
10.1016/j.apenergy.2017.01.017.
[72] J. Ma and J. C. P. Cheng, “Identifying the influential features on the regional energy use intensity
of residential buildings based on Random Forests,” Appl. Energy, vol. 183, pp. 193–201, 2016,
doi: 10.1016/j.apenergy.2016.08.096.
[73] J. Ma and J. C. P. Cheng, “Estimation of the building energy use intensity in the urban scale by
integrating GIS and big data technology,” Appl. Energy, vol. 183, pp. 182–192, 2016, doi:
10.1016/j.apenergy.2016.08.079.
[74] X. Xu, W. Wang, T. Hong, and J. Chen, “Incorporating machine learning
with building network analysis to predict multi-building energy use,” Energy Build., vol. 186, pp.
80–97, 2019, doi: 10.1016/j.enbuild.2019.01.002.
[75] E. Mocanu, P. H. Nguyen, W. L. Kling, and M. Gibescu, “Unsupervised energy prediction in a
Smart Grid context using reinforcement cross-building transfer learning,” Energy Build., vol. 116,
pp. 646–655, 2016, doi: 10.1016/j.enbuild.2016.01.030.
[76] S. Papadopoulos, B. Bonczak, and C. E. Kontokosta, “Pattern recognition in building energy
performance over time using energy benchmarking data,” Appl. Energy, vol. 221, no. March, pp.
576–586, 2018, doi: 10.1016/j.apenergy.2018.03.079.
[77] J. Z. Kolter and J. Ferreira, “A large-scale study on predicting and contextualizing building energy
usage,” in Proceedings of the National Conference on Artificial Intelligence, 2011, vol. 2.
[78] Y. Wei et al., “A review of data-driven approaches for prediction and classification of building
energy consumption,” Renew. Sustain. Energy Rev., vol. 82, no. August 2017, pp. 1027–1047,
2018, doi: 10.1016/j.rser.2017.09.108.
[79] Z. Wang and R. S. Srinivasan, “A review of artificial intelligence based building energy use
prediction: Contrasting the capabilities of single and ensemble prediction models,” Renew.
Sustain. Energy Rev., vol. 75, no. September 2015, pp. 796–808, 2017, doi:
10.1016/j.rser.2016.10.079.
[80] R. K. Jain, K. M. Smith, P. J. Culligan, and J. E. Taylor, “Forecasting energy consumption of
multi-family residential buildings using support vector regression: Investigating the impact of
temporal and spatial monitoring granularity on performance accuracy,” Appl. Energy, vol. 123, pp.
168–178, 2014, doi: 10.1016/j.apenergy.2014.02.057.
[81] A. L. Ku, Y. (Lucy) Qiu, J. Lou, D. Nock, and B. Xing, “Changes in hourly electricity
consumption under COVID mandates: A glance to future hourly residential power consumption
pattern with remote work in Arizona,” Appl. Energy, vol. 310, 2022, doi:
10.1016/j.apenergy.2022.118539.
[82] C. Fan, J. Wang, W. Gang, and S. Li, “Assessment of deep recurrent neural network-based
strategies for short-term building energy predictions,” Appl. Energy, vol. 236, no. July 2018, pp.
700–710, 2019, doi: 10.1016/j.apenergy.2018.12.004.
[83] M. Abdallah, M. Abu Talib, M. Hosny, O. Abu Waraga, Q. Nasir, and M. A. Arshad, “Forecasting
highly fluctuating electricity load using machine learning models based on multimillion
observations,” Adv. Eng. Informatics, vol. 53, p. 101707, Aug. 2022, doi:
10.1016/J.AEI.2022.101707.
[84] F. Burlig, J. Bushnell, D. Rapson, and C. Wolfram, “Low Energy: Estimating Electric Vehicle
Electricity Use,” AEA Pap. Proc., vol. 111, 2021, doi: 10.1257/pandp.20211088.
[85] Z. Ma, C. Ye, H. Li, and W. Ma, “Applying support vector machines to predict building energy
consumption in China,” Energy Procedia, vol. 152, pp. 780–786, 2018, doi:
10.1016/j.egypro.2018.09.245.
[86] S. Paudel et al., “A relevant data selection method for energy consumption prediction of low
energy building based on support vector machine,” Energy Build., vol. 138, pp. 240–256, 2017,
doi: 10.1016/j.enbuild.2016.11.009.
[87] J. W. Taylor and P. E. McSharry, “Short-term load forecasting methods: An evaluation based on
European data,” IEEE Trans. Power Syst., vol. 22, no. 4, pp. 2213–2219, 2007, doi:
10.1109/TPWRS.2007.907583.
[88] L. Hernandez, C. Baladrón, J. M. Aguiar, B. Carro, A. J. Sanchez-Esguevillas, and J. Lloret,
“Short-term load forecasting for microgrids based on artificial neural networks,” Energies, vol. 6,
no. 3, pp. 1385–1408, 2013, doi: 10.3390/en6031385.
[89] L. C. P. Velasco, C. R. Villezas, P. N. C. Palahang, and J. A. A. Dagaang, “Next day electric load
forecasting using Artificial Neural Networks,” 8th Int. Conf. Humanoid, Nanotechnology, Inf.
Technol. Commun. Control. Environ. Manag. HNICEM 2015, no. December, pp. 1–6, 2016, doi:
10.1109/HNICEM.2015.7393166.
[90] W. Zhang et al., “Estimating residential energy consumption in metropolitan areas: A
microsimulation approach,” Energy, vol. 155, pp. 162–173, 2018, doi:
10.1016/j.energy.2018.04.161.
[91] A. G. Bakirtzis, V. Petridis, S. J. Kiartzis, and M. C. Alexiadis, “A neural network short
term load forecasting model for the Greek power system,” Neural Networks, pp. 858–863, 1995.
[92] S. Ryu, J. Noh, and H. Kim, “Deep neural network based demand side short term load
forecasting,” Energies, vol. 10, no. 1, pp. 1–20, 2017, doi: 10.3390/en10010003.
[93] S. Masum, Y. Liu, and J. Chiverton, “Multi-step time series forecasting of electric load using
machine learning models,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell.
Lect. Notes Bioinformatics), vol. 10841 LNAI, pp. 148–159, 2018, doi: 10.1007/978-3-319-91253-
0_15.
[94] A. Azadeh, S. F. Ghaderi, and S. Sohrabkhani, “A simulated-based neural network algorithm for
forecasting electrical energy consumption in Iran,” Energy Policy, vol. 36, no. 7, pp. 2637–2644,
2008, doi: 10.1016/j.enpol.2008.02.035.
[95] A. Rahman, V. Srikumar, and A. D. Smith, “Predicting electricity consumption for commercial
and residential buildings using deep recurrent neural networks,” Appl. Energy, vol. 212, no.
December 2017, pp. 372–385, 2018, doi: 10.1016/j.apenergy.2017.12.051.
[96] J. Wang, L. Li, D. Niu, and Z. Tan, “An annual load forecasting model based on support vector
regression with differential evolution algorithm,” Appl. Energy, vol. 94, pp. 65–70, 2012, doi:
10.1016/j.apenergy.2012.01.010.
[97] A. Azadeh, S. F. Ghaderi, S. Tarverdian, and M. Saberi, “Integration of artificial neural networks
and genetic algorithm to predict electrical energy consumption,” Appl. Math. Comput., vol. 186,
no. 2, pp. 1731–1741, 2007, doi: 10.1016/j.amc.2006.08.093.
[98] S. Rahman and R. Bhatnagar, “An expert system based algorithm for short term load
forecast,” 1988. doi: 10.1109/59.192889.
[99] F. Kong and G. P. Song, “Middle-long power load forecasting based on dynamic grey prediction
and support vector machine,” Int. J. Adv. Comput. Technol., vol. 4, no. 5, 2012, doi:
10.4156/ijact.vol4.issue5.18.
[100] E. S. Mostafavi, S. I. Mostafavi, A. Jaafari, and F. Hosseinpour, “A novel machine learning
approach for estimation of electricity demand: An empirical evidence from Thailand,” Energy
Convers. Manag., vol. 74, pp. 548–555, 2013, doi: 10.1016/j.enconman.2013.06.031.
[101] M. Aydinalp, V. Ismet Ugursal, and A. S. Fung, “Modeling of the appliance, lighting, and space-cooling energy consumptions in the residential sector using neural networks,” Appl. Energy, vol.
71, no. 2, pp. 87–110, 2002, doi: 10.1016/S0306-2619(01)00049-6.
[102] W. C. Hong, “Electric load forecasting by support vector model,” Appl. Math. Model., vol. 33, no.
5, pp. 2444–2454, 2009, doi: 10.1016/j.apm.2008.07.010.
[103] H. Fan, I. F. MacGill, and A. B. Sproul, “Statistical analysis of driving factors of residential
energy demand in the greater Sydney region, Australia,” Energy Build., vol. 105, 2015, doi:
10.1016/j.enbuild.2015.07.030.
[104] S. Haben, C. Singleton, and P. Grindrod, “Analysis and clustering of residential customers energy
behavioral demand using smart meter data,” IEEE Trans. Smart Grid, vol. 7, no. 1, pp. 136–144,
2016, doi: 10.1109/TSG.2015.2409786.
[105] L. Czétány et al., “Development of electricity consumption profiles of residential buildings based
on smart meter data clustering,” Energy Build., vol. 252, p. 111376, Dec. 2021, doi:
10.1016/J.ENBUILD.2021.111376.
[106] B. Dong, Z. Li, S. M. M. Rahman, and R. Vega, “A hybrid model approach for forecasting future
residential electricity consumption,” Energy Build., vol. 117, pp. 341–351, 2016, doi:
10.1016/j.enbuild.2015.09.033.
[107] S. Humeau, T. K. Wijaya, M. Vasirani, and K. Aberer, “Electricity load forecasting for residential
customers: Exploiting aggregation and correlation between households,” 2013 Sustain. Internet
ICT Sustain. Sustain. 2013, 2013, doi: 10.1109/SustainIT.2013.6685208.
[108] F. Rodrigues, C. Cardeira, and J. M. F. Calado, “The daily and hourly energy consumption and
load forecasting using artificial neural network method: A case study using a set of 93 households
in Portugal,” Energy Procedia, vol. 62, pp. 220–229, 2014, doi: 10.1016/j.egypro.2014.12.383.
[109] H. X. Zhao and F. Magoulès, “Feature selection for predicting building energy consumption based
on statistical learning method,” J. Algorithms Comput. Technol., vol. 6, no. 1, pp. 59–77, 2012,
doi: 10.1260/1748-3018.6.1.59.
[110] Q. Li, P. Ren, and Q. Meng, “Prediction model of annual energy consumption of residential
buildings,” 2010 Int. Conf. Adv. Energy Eng. ICAEE 2010, pp. 223–226, 2010, doi:
10.1109/ICAEE.2010.5557576.
[111] R. Olu-Ajayi, H. Alaka, I. Sulaimon, F. Sunmola, and S. Ajayi, “Building energy consumption
prediction for residential buildings using deep learning and other machine learning techniques,” J.
Build. Eng., vol. 45, 2022, doi: 10.1016/j.jobe.2021.103406.
[112] A. A. Ahmed Gassar, G. Y. Yun, and S. Kim, “Data-driven approach to prediction of residential
energy consumption at urban scales in London,” Energy, vol. 187, 2019, doi:
10.1016/j.energy.2019.115973.
[113] California Irrigation Management Information System (CIMIS), “CIMIS Station Reports.”
[Online]. Available: https://cimis.water.ca.gov/Stations.aspx.
[114] National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental
Information (NCEI), “NOAA NCEI Local Climatological Data (LCD).” [Online]. Available:
https://www.ncei.noaa.gov/cdo-web/datatools/lcd.
[115] G. J. Schoenau and R. A. Kehrig, “Method for calculating degree-days to any base temperature,”
Energy Build., vol. 14, no. 4, 1990, doi: 10.1016/0378-7788(90)90092-W.
[116] “Assessor Parcel Data 2016,” County of Los Angeles Open Data.
https://data.lacounty.gov/browse?category=Property%2FPlanning&utf8=✓.
[117] “San Bernardino County Assessor’s Property Characteristics 2016,” Office of San Bernardino
County Assessor-Recorder-Clerk, 2016. https://sbcountyarc.org/services/property-information/.
[118] “Riverside County Assessor’s Property Characteristics 2016,” County of Riverside Assessor-County Clerk-Recorder, 2016. https://www.rivcoacr.org/obtaining-record-copies.
[119] “Office of Environmental Health Hazard Assessment, C.E.P.A. CalEnviroScreen 3.0,” 2018.
https://oehha.ca.gov/calenviroscreen/report/calenviroscreen-30.
[120] S. Zhang, C. Zhang, and Q. Yang, “Data preparation for data mining,” vol. 9514, no. 2003, 2010,
doi: 10.1080/713827180.
[121] S. B. Kotsiantis and D. Kanellopoulos, “Data preprocessing for supervised leaning,” Int. J. …, vol.
1, no. 2, pp. 1–7, 2006, doi: 10.1080/02331931003692557.
[122] I. Guyon, “An Introduction to Variable and Feature Selection,” vol. 3, pp. 1157–
1182, 2003.
[123] J. Luengo, S. García, and F. Herrera, On the choice of the best imputation methods for missing
values considering three groups of classification methods. 2012.
[124] S. A. N. Alexandropoulos, S. B. Kotsiantis, and M. N. Vrahatis, Data preprocessing in predictive
data mining, vol. 34, no. April 2020. 2019.
[125] T. Ahmad and M. N. Aziz, “Data preprocessing and feature selection for machine learning
intrusion detection systems,” ICIC Express Lett., vol. 13, no. 2, pp. 93–101, 2019, doi:
10.24507/icicel.13.02.93.
[126] D. C. Corrales, J. C. Corrales, and A. Ledezma, “How to address the data quality issues in
regression models: A guided process for data cleaning,” Symmetry (Basel)., vol. 10, no. 4, pp. 1–
20, 2018, doi: 10.3390/sym10040099.
[127] “Household Energy Use in California,” 2009. [Online]. Available:
https://www.eia.gov/consumption/residential/reports/2009/state_briefs/pdf/ca.pdf.
[128] K. Potdar, T. S., and C. D., “A Comparative Study of Categorical Variable Encoding Techniques
for Neural Network Classifiers,” Int. J. Comput. Appl., vol. 175, no. 4, pp. 7–9, 2017, doi:
10.5120/ijca2017915495.
[129] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python.” pp. 2825–2830, 2011.
[130] S. F. Crone, S. Lessmann, and R. Stahlbock, “The impact of preprocessing on data mining: An
evaluation of classifier sensitivity in direct marketing,” vol. 173, pp. 781–800, 2006, doi:
10.1016/j.ejor.2005.07.023.
[131] J. Huang, Y. Li, and M. Xie, “An empirical analysis of data preprocessing for machine learning-based
software cost estimation,” Inf. Softw. Technol., vol. 67, pp. 108–127, 2015, doi:
10.1016/j.infsof.2015.07.004.
[132] E. Dodangeh et al., “Integrated machine learning methods with resampling algorithms for
flood susceptibility prediction,” Sci. Total Environ., vol. 705, 2020, doi:
10.1016/j.scitotenv.2019.135983.
[133] A. M. Molinaro, R. Simon, and R. M. Pfeiffer, “Prediction error estimation: a comparison of
resampling methods,” vol. 21, no. 15, pp. 3301–3307, 2005, doi: 10.1093/bioinformatics/bti499.
[134] D. Anguita, A. Ghio, S. Ridella, and D. Sterpi, “K-Fold Cross Validation for Error Rate Estimate
in Support Vector Machines,” no. June 2014, 2009.
[135] R. Kohavi, “A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model
Selection,” no. June, 2013.
[136] I. Gonçalves, S. Silva, J. B. Melo, and J. M. Carreiras, “Random Sampling Technique for Overfitting Control
in Genetic Programming,” in Genetic Programming, Lecture No., Springer, Berlin, Heidelberg,
2012, pp. 218–229.
[137] J. L. Horowitz, “The bootstrap,” 2001, doi: 10.1016/S1573-4412(01)05005-X.
[138] S. Kaufman, S. Rosset, and C. Perlich, “Leakage in data mining: Formulation, detection, and
avoidance,” 2011, doi: 10.1145/2020408.2020496.
[139] M. V. Shcherbakov, A. Brebels, N. L. Shcherbakova, A. P. Tyukov, T. A. Janovsky, and V. A.
Kamaev, “A survey of forecast error measures,” World Appl. Sci. J., vol. 24, no. 24, 2013,
doi: 10.5829/idosi.wasj.2013.24.itmies.80032.
[140] D. Chicco, M. J. Warrens, and G. Jurman, “The coefficient of determination R-squared is more
informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation,”
PeerJ Comput. Sci., vol. 7, 2021, doi: 10.7717/PEERJ-CS.623.
[141] J. Cai, J. Luo, S. Wang, and S. Yang, “Feature selection in machine learning: A
new perspective,” Neurocomputing, vol. 300, pp. 70–79, 2018, doi:
10.1016/j.neucom.2017.11.077.
[142] P. Langley, “Selection of Relevant Features in Machine Learning,”
pp. 127–131, 1994.
[143] M. A. Hall and L. A. Smith, “Feature Selection for Machine Learning: Comparing a Correlation-based
Filter Approach to the Wrapper,” 1999.
[144] G. Chandrashekar and F. Sahin, “A survey on feature selection methods,” Comput. Electr. Eng.,
vol. 40, no. 1, pp. 16–28, 2014, doi: 10.1016/j.compeleceng.2013.11.024.
[145] S. Raschka, “Sequential Feature Selector,” 2014.
http://rasbt.github.io/mlxtend/user_guide/feature_selection/SequentialFeatureSelector/.
[146] S. Hooker, D. Erhan, P. J. Kindermans, and B. Kim, “A benchmark for interpretability methods in
deep neural networks,” Adv. Neural Inf. Process. Syst., vol. 32, no. NeurIPS, 2019.
[147] N. Huang, G. Lu, and D. Xu, “A permutation importance-based feature selection method for short-term
electricity load forecasting using random forest,” Energies, vol. 9, no. 10, 2016, doi:
10.3390/en9100767.
[148] A. Razmjoo, P. Xanthopoulos, and Q. P. Zheng, “Online feature importance ranking based on
sensitivity analysis,” Expert Syst. Appl., vol. 85, pp. 397–406, 2017, doi:
10.1016/j.eswa.2017.05.016.
[149] M. Saarela and S. Jauhiainen, “Comparison of feature importance measures as explanations for
classification models,” SN Appl. Sci., vol. 3, no. 2, 2021, doi: 10.1007/s42452-021-04148-9.
[150] “Permutation feature importance.”
https://scikit-learn.org/stable/modules/permutation_importance.html.
[151] K. T. Williams and J. D. Gomez, “Predicting future monthly residential energy consumption using
building characteristics and climate data: A statistical learning approach,” Energy Build., vol. 128,
2016, doi: 10.1016/j.enbuild.2016.06.076.
[152] E. D. Coffel, R. M. Horton, and A. De Sherbinin, “Temperature and humidity based projections of
a rapid rise in global heat stress exposure during the 21st century,” Environmental Research
Letters, vol. 13, no. 1. 2018, doi: 10.1088/1748-9326/aaa00e.
[153] L. R. Vargas Zeppetello, A. E. Raftery, and D. S. Battisti, “Probabilistic projections of increased
heat stress driven by climate change,” Commun. Earth Environ., vol. 3, no. 1, 2022, doi:
10.1038/s43247-022-00524-4.
[154] R. Burgess, O. Deschênes, D. Donaldson, and M. Greenstone, “Weather, Climate Change and
Death in India,” LSE Work. Pap., 2017.
[155] T. Carleton et al., “Valuing the Global Mortality Consequences of Climate Change Accounting for
Adaptation Costs and Benefits,” SSRN Electron. J., 2021, doi: 10.2139/ssrn.3665869.
[156] R. S. Kovats and S. Hajat, “Heat stress and public health: A critical review,” in Annual Review of
Public Health, 2008, vol. 29, doi: 10.1146/annurev.publhealth.29.020907.090843.
[157] “The Future of Cooling: Opportunities for energy-efficient air conditioning,” 2018.
[158] Z. Khan et al., “Impacts of long-term temperature change and variability on electricity
investments,” Nat. Commun., vol. 12, no. 1, 2021, doi: 10.1038/s41467-021-21785-1.
[159] M. P. Naughton et al., “Heat-related mortality during a 1999 heat wave in Chicago,” Am. J. Prev.
Med., vol. 22, no. 4, 2002, doi: 10.1016/S0749-3797(02)00421-X.
[160] C. Mora et al., “Global risk of deadly heat,” Nat. Clim. Chang., vol. 7, no. 7, 2017, doi:
10.1038/nclimate3322.
[161] J. A. Añel, M. Fernández-González, X. Labandeira, X. López-Otero, and L. de la Torre, “Impact
of cold waves and heat waves on the energy production sector,” Atmosphere (Basel)., vol. 8, no.
11, 2017, doi: 10.3390/atmos8110209.
[162] A. Agarwal and H. Samuelson, “Too hot to stay at home: Residential heat vulnerability in urban
India,” in Journal of Physics: Conference Series, 2021, vol. 2069, no. 1, doi:
10.1088/1742-6596/2069/1/012166.
[163] B. Stone et al., “Compound Climate and Infrastructure Events: How Electrical Grid Failure Alters
Heat Wave Risk,” Environ. Sci. Technol., vol. 55, no. 10, 2021, doi: 10.1021/acs.est.1c00024.
[164] A. Henley and J. Peirson, “Non‐Linearities in Electricity Demand and Temperature: Parametric
Versus Non‐Parametric Methods,” Oxf. Bull. Econ. Stat., vol. 59, no. 1, 1997, doi:
10.1111/1468-0084.00054.
[165] J. Moral-Carcedo and J. Vicéns-Otero, “Modelling the non-linear response of Spanish electricity
demand to temperature variations,” Energy Econ., vol. 27, no. 3, 2005, doi:
10.1016/j.eneco.2005.01.003.
[166] M. Prek, “Thermodynamical analysis of human thermal comfort,” in Energy, 2006, vol. 31, no. 5,
doi: 10.1016/j.energy.2005.05.001.
[167] E. Ng and V. Cheng, “Urban human thermal comfort in hot and humid Hong Kong,” in Energy
and Buildings, 2012, vol. 55, doi: 10.1016/j.enbuild.2011.09.025.
[168] S. C. Sherwood and M. Huber, “An adaptability limit to climate change due to heat stress,” Proc.
Natl. Acad. Sci. U. S. A., vol. 107, no. 21, 2010, doi: 10.1073/pnas.0913352107.
[169] C. D. Ashley, C. L. Luecke, S. S. Schwartz, M. Z. Islam, and T. E. Bernard, “Heat strain at the
critical WBGT and the effects of gender, clothing and metabolic rate,” Int. J. Ind. Ergon., vol. 38,
no. 7–8, 2008, doi: 10.1016/j.ergon.2008.01.017.
[170] J. M. Hanna and D. E. Brown, “Human heat tolerance: an anthropological perspective,” Annu.
Rev. Anthropol., vol. 12, 1983, doi: 10.1146/annurev.an.12.100183.001355.
[171] H. Lee, J. Holst, and H. Mayer, “Modification of human-biometeorologically significant radiant
flux densities by shading as local method to mitigate heat stress in summer within urban street
canyons,” Adv. Meteorol., vol. 2013, 2013, doi: 10.1155/2013/312572.
[172] A. Sobolewski, M. Młynarczyk, M. Konarska, and J. Bugajska, “The influence of air humidity on
human heat stress in a hot environment,” Int. J. Occup. Saf. Ergon., vol. 27, no. 1, 2021, doi:
10.1080/10803548.2019.1699728.
[173] R. Obringer et al., “Implications of Increasing Household Air Conditioning Use Across the United
States Under a Warming Climate,” Earth’s Future, vol. 10, no. 1, 2021.
[174] P. Sherman, H. Lin, and M. McElroy, “Projected global demand for air conditioning associated
with extreme heat and implications for electricity grids in poorer countries,” Energy Build., vol.
268, p. 112198, Aug. 2022, doi: 10.1016/J.ENBUILD.2022.112198.
[175] S. Mayes and K. Sanders, “Quantifying the electricity, CO2 emissions, and economic tradeoffs of
precooling strategies for a single-family home in Southern California,” Environ. Res. Infrastruct.
Sustain., vol. 2, no. 2, 2022, doi: 10.1088/2634-4505/ac5d60.
[176] H. Ren, Y. Sun, A. K. Albdoor, V. V. Tyagi, A. K. Pandey, and Z. Ma, “Improving energy
flexibility of a net-zero energy house using a solar-assisted air conditioning system with thermal
energy storage and demand-side management,” Appl. Energy, vol. 285, 2021, doi:
10.1016/j.apenergy.2021.116433.
[177] EIA (Energy Information Administration), “2020 Residential Energy Consumption Survey
(RECS),” 2020. [Online]. Available: https://www.eia.gov/consumption/residential/data/2020/.
[178] California Energy Commission (CEC), “2019 California Residential Appliance Saturation Survey
(RASS),” 2019. [Online]. Available:
https://www.energy.ca.gov/data-reports/surveys/2019-residential-appliance-saturation-study.
[179] United States Census Bureau, “2021 American Housing Survey (AHS),” 2019. [Online].
Available: https://www.census.gov/programs-surveys/ahs.html.
[180] M. A. McNeil and V. E. Letschert, “Modeling diffusion of electrical appliances in the residential
sector,” Energy Build., vol. 42, no. 6, 2010, doi: 10.1016/j.enbuild.2009.11.015.
[181] M. A. McNeil and V. E. Letschert, “Future Air Conditioning
Energy Consumption in Developing Countries and what can be done about it: The Potential of
Efficiency in the Residential Sector,” Lawrence Berkeley Natl. Lab., 2008.
[182] M. Goldsworthy and L. Poruschi, “Air-conditioning in low income households; a comparison of
ownership, use, energy consumption and indoor comfort in Australia,” Energy Build., vol. 203, p.
109411, Nov. 2019, doi: 10.1016/J.ENBUILD.2019.109411.
[183] C. J. Gronlund and V. J. Berrocal, “Modeling and comparing central and room air conditioning
ownership and cold-season in-home thermal comfort using the American Housing Survey,” J.
Expo. Sci. Environ. Epidemiol., vol. 30, no. 5, 2020, doi: 10.1038/s41370-020-0220-8.
[184] Y. Romitti, I. S. Wing, K. R. Spangler, and G. A. Wellenius, “Inequality in the availability of
residential air conditioning across 115 US metropolitan areas,” PNAS Nexus, vol. 1, no. 4, 2022,
[Online]. Available: https://doi.org/10.1093/pnasnexus/pgac210.
[185] N. Qi et al., “Smart meter data-driven evaluation of operational demand response potential of
residential air conditioning loads,” Appl. Energy, vol. 279, p. 115708, Dec. 2020, doi:
10.1016/J.APENERGY.2020.115708.
[186] M. Ghofrani, M. Hassanzadeh, M. Etezadi-Amoli, and M. S. Fadali, “Smart meter based short-term
load forecasting for residential customers,” NAPS 2011 - 43rd North Am. Power Symp., pp.
13–17, 2011, doi: 10.1109/NAPS.2011.6025124.
[187] M. Chen, G. A. Ban-Weiss, and K. T. Sanders, “Utilizing smart-meter data to project impacts of
urban warming on residential electricity use for vulnerable populations in Southern California,”
Environ. Res. Lett., vol. 15, no. 6, 2020, doi: 10.1088/1748-9326/ab6fbe.
[188] R. Kumar, B. Rachunok, D. Maia-Silva, and R. Nateghi, “Asymmetrical response of California
electricity demand to summer-time temperature variation,” Sci. Rep., vol. 10, no. 1, 2020, doi:
10.1038/s41598-020-67695-y.
[189] Y. Wang and J. M. Bielicki, “Acclimation and the response of hourly electricity loads to
meteorological variables,” Energy, vol. 142, pp. 473–485, Jan. 2018, doi:
10.1016/J.ENERGY.2017.10.037.
[190] D. Rastogi, F. Lehner, T. Kuruganti, K. J. Evans, K. Kurte, and J. Sanyal, “The role of humidity in
determining future electricity demand in the southeastern United States,” Environ. Res. Lett., vol.
16, no. 11, p. 114017, Nov. 2021, doi: 10.1088/1748-9326/ac2fdf.
[191] R. Gupta, A. Antony, V. Garg, and J. Mathur, “Investigating the relationship between residential
AC, indoor temperature and relative humidity in Indian dwellings,” J. Phys. Conf. Ser., vol. 2069,
no. 1, p. 012103, Nov. 2021, doi: 10.1088/1742-6596/2069/1/012103.
[192] L. W. Davis and P. J. Gertler, “Contribution of air conditioning adoption to future energy use
under global warming,” Proc. Natl. Acad. Sci. U. S. A., vol. 112, no. 19, 2015, doi:
10.1073/pnas.1423558112.
[193] L. Kim, J. Marlon, K. Lacroix, J. Carman, J. Kotcher, E. Maibach, S. Rosenthal, X. Wang,
and A. Leiserowitz, “Beat the Heat: Extreme Heat Risk Perceptions & Air Conditioning Ownership in
California.” Yale Program on Climate Change Communication, 2021, [Online]. Available:
https://climatecommunication.yale.edu/publications/beat-the-heat-extreme-heat-risk-perceptions-air-conditioning-ownership-in-california/.
[194] Environmental Protection Agency (EPA), “EPA Air Quality System (AQS).” [Online]. Available:
https://aqs.epa.gov/aqsweb/airdata/download_files.html.
[195] “ASHRAE Terminology.” https://terminology.ashrae.org/.
[196] R. G. Steadman, “A universal scale of apparent temperature,” J. Clim. Appl. Meteorol., vol. 23,
no. 12, 1984, doi: 10.1175/1520-0450(1984)023<1674:AUSOAT>2.0.CO;2.
[197] “National Digital Forecast Database Definitions,” National Weather Service.
https://digital.weather.gov/staticpages/definitions.php.
[198] M. Gregorczuk and K. Cena, “Distribution of Effective Temperature over the surface of the
Earth,” Int. J. Biometeorol., vol. 11, no. 2, 1967, doi: 10.1007/BF01426841.
[199] “U.S. households’ heating equipment choices are diverse and vary by climate region,” U.S. Energy
Information Administration. https://www.eia.gov/todayinenergy/detail.php?id=30672.
[200] “US Census Bureau 2016 Cartographic Boundary Shapefiles-Census Tracts.” [Online]. Available:
https://census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html.
[201] “California Energy Commission 2015 California Building Climate Zone Areas.” [Online].
Available: http://energy.ca.gov/maps/renewable/building_climate_zones.html.
[202] D. J. Vecellio, T. Wolf, R. M. Cottle, and W. L. Kenney, “Evaluating the 35°C wet-bulb
temperature adaptability threshold for young, healthy subjects (PSU HEAT Project),” J. Appl.
Physiol., vol. 132, no. 2, pp. 340–345, 2022.
[203] K. J. Chua, S. K. Chou, W. M. Yang, and J. Yan, “Achieving better energy-efficient air
conditioning - A review of technologies and strategies,” Applied Energy, vol. 104. 2013, doi:
10.1016/j.apenergy.2012.10.037.
[204] K. Zhao, X. H. Liu, T. Zhang, and Y. Jiang, “Performance of temperature and humidity
independent control air-conditioning system in an office building,” Energy Build., vol. 43, no. 8,
2011, doi: 10.1016/j.enbuild.2011.03.041.
[205] A. K. Mishra and M. Ramgopal, “Field studies on human thermal comfort - An overview,”
Building and Environment, vol. 64. 2013, doi: 10.1016/j.buildenv.2013.02.015.
[206] EIA (Energy Information Administration), “2015 Residential Energy Consumption Survey
(RECS).” EIA, 2015.
[207] L. Lutzenhiser, M. Moezzi, A. Ingle, and L. Wu, “Advanced Residential Energy and Behavior
Analysis Project Final Report,” 2016.
[208] C. Palmgren, N. Stevens, M. Goldberg, R. Bames, and K. Rothkin, “2009 California Residential
Appliance Saturation Survey,” 2009.
[209] S. Borgeson, J. A. Flora, J. Kwac, C. W. Tan, and R. Rajagopal, “Learning from hourly household
energy consumption: Extracting, visualizing and interpreting household smart meter data,” in
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), 2015, vol. 9188, doi: 10.1007/978-3-319-20889-3_32.
[210] N. Mölders, “Suitability of the Weather Research and Forecasting (WRF) model to predict the
June 2005 fire weather for interior Alaska,” Weather Forecast., vol. 23, no. 5, 2008, doi:
10.1175/2008WAF2007062.1.
[211] D. Yan et al., “Integrating remote sensing data with WRF model for improved 2-m temperature
and humidity simulations in China,” Dyn. Atmos. Ocean., vol. 89, 2020, doi:
10.1016/j.dynatmoce.2019.101127.
[212] S. C. Pryor and J. T. Schoof, “Evaluation of near-surface temperature, humidity, and equivalent
temperature from regional climate models applied in type II downscaling,” J. Geophys. Res., vol.
121, no. 7, 2016, doi: 10.1002/2015JD024539.
[213] A. Gossart, S. Helsen, J. T. M. Lenaerts, S. Vanden Broucke, N. P. M. van Lipzig, and N.
Souverijns, “An evaluation of surface climatology in state-of-the-art reanalyses over the Antarctic
Ice Sheet,” J. Clim., vol. 32, no. 20, 2019, doi: 10.1175/JCLI-D-19-0030.1.
[214] Order Instituting Rulemaking to Consider Smart Grid Technologies Pursuant to Federal
Legislation and on the Commission’s own Motion to Actively Guide Policy in California’s
Development of a Smart Grid System. 2011.
[215] G. Luber and M. McGeehin, “Climate Change and Extreme Heat Events,” American Journal of
Preventive Medicine, vol. 35, no. 5. 2008, doi: 10.1016/j.amepre.2008.08.021.
[216] A. Barreca, K. Clay, O. Deschenes, M. Greenstone, and J. S. Shapiro, “Adapting to climate
change: The remarkable decline in the US temperature-mortality relationship over the Twentieth
Century,” J. Polit. Econ., vol. 124, no. 1, 2016, doi: 10.1086/684582.
[217] G. A. Meehl and C. Tebaldi, “More intense, more frequent, and longer lasting heat waves in the
21st century,” Science, vol. 305, no. 5686, 2004, doi: 10.1126/science.1098704.
[218] A. Mastrucci, E. Byers, S. Pachauri, and N. D. Rao, “Improving the SDG energy poverty targets:
Residential cooling needs in the Global South,” Energy Build., vol. 186, 2019, doi:
10.1016/j.enbuild.2019.01.015.
[219] F. Pietro Colelli, I. S. Wing, and E. De Cian, “Air-conditioning adoption and electricity demand
highlight climate change mitigation–adaptation tradeoffs,” Sci. Rep., vol. 13, no. 1, 2023, doi:
10.1038/s41598-023-31469-z.
[220] M. Waite, E. Cohen, H. Torbey, M. Piccirilli, Y. Tian, and V. Modi, “Global trends in urban
electricity demands for cooling and heating,” Energy, vol. 127, 2017, doi:
10.1016/j.energy.2017.03.095.
[221] H. Hahn, S. Meyer-Nieberg, and S. Pickl, “Electric load forecasting methods: Tools for decision
making,” Eur. J. Oper. Res., vol. 199, no. 3, 2009, doi: 10.1016/j.ejor.2009.01.062.
[222] F. McLoughlin, A. Duffy, and M. Conlon, “Characterising domestic electricity consumption
patterns by dwelling and occupant socio-economic variables: An Irish case study,” Energy Build.,
vol. 48, 2012, doi: 10.1016/j.enbuild.2012.01.037.
[223] S. Akbari and F. Haghighat, “Occupancy and occupant activity drivers of energy consumption in
residential buildings,” Energy Build., vol. 250, 2021, doi: 10.1016/j.enbuild.2021.111303.
[224] Z. Wang et al., “Individual difference in thermal comfort: A literature review,” Building and
Environment, vol. 138. 2018, doi: 10.1016/j.buildenv.2018.04.040.
[225] M. Peplinski, P. Kalmus, and K. T. Sanders, “Investigating whether the inclusion of humid heat
metrics improves estimates of AC penetration rates: a case study of Southern California,” Environ.
Res. Lett., vol. 18, no. 10, p. 104054, 2023, doi: 10.1088/1748-9326/acfb96.
[226] P. Westermann, C. Deb, A. Schlueter, and R. Evins, “Unsupervised learning of energy signatures
to identify the heating system and building type using smart meter data,” Appl. Energy, vol. 264,
2020, doi: 10.1016/j.apenergy.2020.114715.
[227] R. Porteiro, S. Nesmachnow, P. Moreno-Bernal, and C. E. Torres-Aguilar, “Computational
intelligence for residential electricity consumption assessment: Detecting air conditioner use in
households,” Sustain. Energy Technol. Assessments, vol. 58, 2023, doi:
10.1016/j.seta.2023.103319.
[228] K. X. Perez, K. Cetin, M. Baldea, and T. F. Edgar, “Development and analysis of residential
change-point models from smart meter data,” Energy Build., vol. 139, pp. 351–359, Mar. 2017,
doi: 10.1016/j.enbuild.2016.12.084.
[229] J. Deason and M. Borgeson, “Electrification of Buildings: Potential, Challenges, and Outlook,”
Current Sustainable/Renewable Energy Reports, vol. 6, no. 4. 2019, doi:
10.1007/s40518-019-00143-2.
[230] S. Elmallah, C. Montanes, and D. Callaway, “Who accesses residential gas and electric heating
and cooling in Northern California, and what does that mean for a residential energy transition?,”
American Geophysical Union. San Francisco, 2023.
[231] Z. Lin and S. Deng, “A questionnaire survey on sleeping thermal environment and bedroom air
conditioning in high-rise residences in Hong Kong,” Energy Build., vol. 38, no. 11, 2006, doi:
10.1016/j.enbuild.2006.04.004.
[232] K. J. Chua and S. K. Chou, “Energy performance of residential buildings in Singapore,” Energy,
vol. 35, no. 2, 2010, doi: 10.1016/j.energy.2009.10.039.
[233] X. Meng, Y. Gao, C. Hou, and F. Yuan, “Questionnaire survey on the summer air-conditioning
use behaviour of occupants in residences and office buildings of China,” Indoor Built Environ.,
vol. 28, no. 5, 2019, doi: 10.1177/1420326X18793699.
[234] J. An, D. Yan, and T. Hong, “Clustering and statistical analyses of air-conditioning intensity and
use patterns in residential buildings,” Energy Build., vol. 174, 2018, doi:
10.1016/j.enbuild.2018.06.035.
[235] D. Xia, S. Lou, Y. Huang, Y. Zhao, D. H. W. Li, and X. Zhou, “A study on occupant behaviour
related to air-conditioning usage in residential buildings,” Energy Build., vol. 203, 2019, doi:
10.1016/j.enbuild.2019.109446.
[236] Y. Wang, J. Wu, F. Xie, C. Liu, H. Li, and J. Cheng, “Survey of residential air-conditioning-unit
usage behavior under south China climatic conditions,” 2011, doi:
10.1109/ICEICE.2011.5776982.
[237] M. Zhu, Y. Huang, S. N. Wang, X. Zheng, and C. Wei, “Characteristics and patterns of residential
energy consumption for space cooling in China: Evidence from appliance-level data,” Energy, vol.
265, 2023, doi: 10.1016/j.energy.2022.126395.
[238] M. Schweiker and M. Shukuya, “Comparison of theoretical and statistical models of air-conditioning-unit
usage behaviour in a residential setting under Japanese climatic conditions,”
Build. Environ., vol. 44, no. 10, 2009, doi: 10.1016/j.buildenv.2009.03.004.
[239] P. Ramapragada, D. Tejaswini, V. Garg, J. Mathur, and R. Gupta, “Investigation on air
conditioning load patterns and electricity consumption of typical residential buildings in tropical
wet and dry climate in India,” Energy Informatics, vol. 5, 2022, doi:
10.1186/s42162-022-00228-1.
[240] A. Malik, N. Haghdadi, I. MacGill, and J. Ravishankar, “Appliance level data analysis of summer
demand reduction potential from residential air conditioner control,” Appl. Energy, vol. 235, 2019,
doi: 10.1016/j.apenergy.2018.11.010.
[241] S. A. Zaki, A. Hagishima, R. Fukami, and N. Fadhilah, “Development of a model for generating
air-conditioner operation schedules in Malaysia,” Build. Environ., vol. 122, 2017, doi:
10.1016/j.buildenv.2017.06.023.
[242] S. Dash and N. C. Sahoo, “Electric energy disaggregation via non-intrusive load monitoring: A
state-of-the-art systematic review,” Electric Power Systems Research, vol. 213. 2022, doi:
10.1016/j.epsr.2022.108673.
[243] P. Stoica and Y. Selen, “Model-order selection: a review of information criterion rules,” IEEE
Signal Process. Mag., vol. 21, no. 4, 2004.
[244] Y. Petri and K. Caldeira, “Impacts of global warming on residential heating and cooling degree-days
in the United States,” Sci. Rep., vol. 5, 2015, doi: 10.1038/srep12427.
[245] California ISO, “Californians Do Conserve When Asked – Flex Alerts Are Vital,” 2021.
http://www.caiso.com/about/Pages/Blog/Posts/Californians-Do-Conserve-When-Asked-Flex-Alerts-Are-Vital.aspx.
[246] M. Meng and K. T. Sanders, “A data-driven approach to investigate the impact of air temperature
on the efficiencies of coal and natural gas generators,” Appl. Energy, vol. 253, 2019, doi:
10.1016/j.apenergy.2019.113486.
[247] K. T. Sanders, “Critical review: Uncharted waters? the future of the electricity-water nexus,”
Environmental Science and Technology, vol. 49, no. 1. 2015, doi: 10.1021/es504293b.
[248] S. Dubey, J. N. Sarvaiya, and B. Seshadri, “Temperature dependent photovoltaic (PV) efficiency
and its effect on PV production in the world - A review,” in Energy Procedia, 2013, vol. 33, doi:
10.1016/j.egypro.2013.05.072.
[249] N. Voisin et al., “Vulnerability of the US western electric grid to hydro-climatological conditions:
How bad can it get?,” Energy, vol. 115, pp. 1–12, 2016.
[250] D. Burillo, M. V. Chester, S. Pincetl, and E. Fournier, “Electricity infrastructure vulnerabilities
due to long-term growth and extreme heat from climate change in Los Angeles County,” Energy
Policy, vol. 128, 2019, doi: 10.1016/j.enpol.2018.12.053.
[251] T. K. Mideksa and S. Kallbekken, “The impact of climate change on the electricity market: A
review,” Energy Policy, vol. 38, no. 7, 2010, doi: 10.1016/j.enpol.2010.02.035.
[252] California ISO, “Rotating Power Outages.” [Online]. Available:
http://www.caiso.com/Documents/Rotating-Power-Outages-Fact-Sheet.pdf.
[253] S. Neumann, F. Sioshansi, A. Vojdani, and G. Yee, “How to Get More Response from Demand
Response,” Electr. J., vol. 19, no. 8, 2006, doi: 10.1016/j.tej.2006.09.001.
[254] G. Zhang, H. Zhong, Z. Tan, T. Cheng, Q. Xia, and C. Kang, “Texas electric power crisis of 2021
warns of a new blackout mechanism,” CSEE J. Power Energy Syst., vol. 8, no. 1, 2022, doi:
10.17775/CSEEJPES.2021.07720.
[255] A. X. Andresen, L. C. Kurtz, D. M. Hondula, S. Meerow, and M. Gall, “Understanding the social
impacts of power outages in North America: a systematic review,” Environmental Research
Letters, vol. 18, no. 5. 2023, doi: 10.1088/1748-9326/acc7b9.
[256] B. Jones-Albertus, “Confronting the Duck Curve: How to Address Over-Generation of Solar
Energy.” 2017, [Online]. Available:
https://www.energy.gov/eere/articles/confronting-duck-curve-how-address-over-generation-solar-energy.
[257] P. Denholm, M. O’Connell, G. Brinkman, and J. Jorgenson, “Overgeneration from Solar Energy in
California: A Field Guide to the Duck Chart (NREL/TP-6A20-65023),” Tech. Rep., no.
November, p. 46, 2015, [Online]. Available: http://www.nrel.gov/docs/fy16osti/65453.pdf.
[258] R. Golden and B. Paulos, “Curtailment of Renewable Energy in California and Beyond,” Electr.
J., vol. 28, no. 6, 2015, doi: 10.1016/j.tej.2015.06.008.
[259] S. Mayes, T. Zhang, and K. T. Sanders, “Residential precooling on a high-solar grid: Impacts on
CO2 emissions, peak period demand, and electricity costs across California,” Environ. Res.
Energy, vol. 1, no. 1.
[260] California Public Utilities Commission, “CPUC Proposals Ensure Electricity Reliability During
Extreme Weather for Summers 2022 and 2023.” 2021, [Online]. Available:
https://docs.cpuc.ca.gov/PublishedDocs/Published/G000/M419/K226/419226930.PDF.
[261] California ISO, “Consumer conservation helps avert outages for second straight day.” 2020,
[Online]. Available:
http://www.caiso.com/Documents/Consumer-Conservation-Helps-Avert-Outages-Second-Straight-Day.pdf.
[262] S. D. Braithwait, D. G. Hansen, and M. Hilbrink, “2013 Impact Evaluation of California’s Flex
Alert Demand Response Program,” 2014. [Online]. Available:
https://www.calmac.org/publications/2013_Flex_Alert_-_Impact_Eval_-_Final_20140228.pdf.
[263] Order Instituting Rulemaking to Establish Policies, Processes, and Rules to Ensure Reliable
Electric Service in California in the Event of an Extreme Weather Event in 2021. 2021.
[264] R. Yin et al., “Quantifying flexibility of commercial and residential loads for demand response
using setpoint changes,” Appl. Energy, vol. 177, 2016, doi: 10.1016/j.apenergy.2016.05.090.
[265] X. Chen, J. Wang, J. Xie, S. Xu, K. Yu, and L. Gan, “Demand response potential evaluation for
residential air conditioning loads,” IET Gener. Transm. Distrib., vol. 12, no. 19, 2018, doi:
10.1049/iet-gtd.2018.5299.
[266] M. Ali, A. Safdarian, and M. Lehtonen, “Demand response potential of residential HVAC loads
considering users preferences,” in IEEE PES Innovative Smart Grid Technologies Conference
Europe, 2015, vol. 2015-January, no. January, doi: 10.1109/ISGTEurope.2014.7028883.
[267] L. Lutzenhiser, “Social and Behavioral Aspects of Energy use,” Annu. Rev. Energy Environ., vol.
18, no. 1, 1993, doi: 10.1146/annurev.eg.18.110193.001335.
[268] G. D. Jacobsen and J. I. Stewart, “How do consumers respond to price complexity? Experimental
evidence from the power sector,” J. Environ. Econ. Manage., vol. 116, 2022, doi:
10.1016/j.jeem.2022.102716.
[269] B. Parrish, R. Gross, and P. Heptonstall, “On demand: Can demand response live up to
expectations in managing electricity systems?,” Energy Res. Soc. Sci., vol. 51, 2019, doi:
10.1016/j.erss.2018.11.018.
[270] K. Herter and S. Wayland, “Residential response to critical-peak pricing of electricity: California
evidence,” Energy, vol. 35, no. 4, 2010, doi: 10.1016/j.energy.2009.07.022.
[271] B. Parrish, P. Heptonstall, R. Gross, and B. K. Sovacool, “A systematic review of motivations,
enablers and barriers for consumer engagement with residential demand response,” Energy Policy,
vol. 138, 2020, doi: 10.1016/j.enpol.2019.111221.
[272] J. Crawley, C. Johnson, P. Calver, and M. Fell, “Demand response beyond the numbers: A critical
reappraisal of flexibility in two United Kingdom field trials,” Energy Res. Soc. Sci., vol. 75, 2021,
doi: 10.1016/j.erss.2021.102032.
[273] J. M. Potter, S. S. George, and L. R. Jimenez, “SmartPricing Options Final Evaluation,” 2014.
[Online]. Available:
https://efis.psc.mo.gov/mpsc/commoncomponents/viewdocument.asp?DocId=936276747.
[274] A. Faruqui and S. Sergici, “Household response to dynamic pricing of electricity: A survey of 15
experiments,” Journal of Regulatory Economics, vol. 38, no. 2. 2010, doi:
10.1007/s11149-010-9127-y.
[275] “Analysis of the residential time-of-day and energy watch pilot programs: Final Report,” 2006.
[Online]. Available:
https://puc.idaho.gov/fileroom/PublicFiles/elec/IPC/IPCE0502/company/20060329PILOT
PROGRAMS FINAL REPORT.PDF.
[276] “Ontario Smart Price Pilot Final Report,” 2007. [Online]. Available:
https://www.smartgrid.gov/files/documents/Ontario_Smart_Price_Pilot_Final_Report_200706.pdf.
[277] D. J. Hammerstrom, “Pacific Northwest GridWise™ Testbed Demonstration Projects,” 2007.
[Online]. Available: https://www.pnnl.gov/main/publications/external/technical_reports/PNNL-17167.pdf.
[278] “Residential Time-Of-Use (RTOU) Pilot Study Load Research Analysis Report,” 2004. [Online].
Available: https://www.oeb.ca/documents/cases/RP-2004-0203/2005-07-
submissions/cdm_trccomments_toronto_supplementary2.pdf.
[279] K. Herter, P. McAuliffe, and A. Rosenfeld, “An exploratory analysis of California residential
customer response to critical peak pricing of electricity,” Energy, vol. 32, no. 1, 2007, doi:
10.1016/j.energy.2006.01.014.
[280] K. Kessels, C. Kraan, L. Karg, S. Maggiore, P. Valkering, and E. Laes, “Fostering residential
demand response through dynamic pricing schemes: A behavioural review of smart grid pilots in
Europe,” Sustain., vol. 8, no. 9, 2016, doi: 10.3390/su8090929.
[281] “AGL trials impacts of emerging technologies on the grid and energy bills,” 2016.
https://www.agl.com.au/about-agl/media-centre/asx-and-media-releases/2016/march/agl-trialsimpacts-of-emerging-technologies-on-the-grid-and-energy-bills.
[282] A. Meier, “How one city cut its electricity use over 30% in six weeks,” 2009.
[283] S. L. Bender, M. Moezzi, M. H. Gossard, and L. Lutzenhiser, “Using mass media to influence
energy consumption behavior: California’s 2001 flex your power campaign as a case study,” Proc.
2002 ACEEE Summer Study, vol. 8, 2002.
[284] C. Bartusch and K. Alvehag, “Further exploring the potential of residential demand response
programs in electricity distribution,” Appl. Energy, vol. 125, 2014, doi:
10.1016/j.apenergy.2014.03.054.
[285] California ISO, “Grid Emergencies History Report.” [Online]. Available:
https://www.caiso.com/Documents/Grid-Emergencies-History-Report-1998-Present.pdf.
[286] Southern California Edison, “Demand Response Programs For Business.”
https://www.sce.com/business/demand-response.
[287] S. Cong, D. Nock, Y. L. Qiu, and B. Xing, “Unveiling hidden energy poverty using the energy
equity gap,” Nat. Commun., vol. 13, no. 1, 2022, doi: 10.1038/s41467-022-30146-5.
[288] G. Powells and M. J. Fell, “Flexibility capital and flexibility justice in smart energy systems,”
Energy Research and Social Science, vol. 54. 2019, doi: 10.1016/j.erss.2019.03.015.
[289] “Advanced Strategies for Demand Flexibility Management and Customer DER Compensation,”
2022. [Online]. Available: https://www.cpuc.ca.gov/-/media/cpuc-website/divisions/energydivision/documents/demand-response/demand-response-workshops/advanced-der---demandflexibility-management/ed-white-paper---advanced-strategies-for-demand-flexibilitymanagement.pdf.
[290] California Public Utilities Commission, “Power Saver Rewards (PSR) Fact Sheet.” [Online].
Available: https://www.cpuc.ca.gov/-/media/cpuc-website/divisions/energydivision/documents/summer-2021-reliability/emergency-load-reduction-program/psr-factsheet.pdf.
[291] F. Libertson, “(No) room for time-shifting energy use: Reviewing and reconceptualizing flexibility
capital,” Energy Res. Soc. Sci., vol. 94, 2022, doi: 10.1016/j.erss.2022.102886.
[292] S. Silva, I. Soares, and C. Pinho, “Electricity demand response to price changes: The Portuguese
case taking into account income differences,” Energy Econ., vol. 65, 2017, doi:
10.1016/j.eneco.2017.05.018.
[293] A. Drehobl and L. Ross, “Lifting the High Energy Burden in America’s Largest Cities: How
Energy Efficiency Can Improve Low-Income and Underserved Communities,” 2016.
[294] M. Kwon, S. Cong, D. Nock, L. Huang, Y. (Lucy) Qiu, and B. Xing, “Forgone summertime
comfort as a function of avoided electricity use,” Energy Policy, vol. 183, 2023, doi:
10.1016/j.enpol.2023.113813.
[295] S. Gyamfi, S. Krumdieck, and T. Urmee, “Residential peak electricity demand response -
Highlights of some behavioural issues,” Renewable and Sustainable Energy Reviews, vol. 25.
2013, doi: 10.1016/j.rser.2013.04.006.
[296] H. K. Trabish, “Real-time pricing, new rates and enabling technologies target demand flexibility to
ease California outages,” UtilityDive, 2022. https://www.utilitydive.com/news/real-time-pricingnew-rates-and-enabling-technologies-target-demand-flexib/631002/.
[297] N. G. Paterakis, O. Erdinç, and J. P. S. Catalão, “An overview of Demand Response: Key elements and international experience,” Renewable and Sustainable Energy Reviews, vol. 69.
2017, doi: 10.1016/j.rser.2016.11.167.
[298] J. R. Buzan, K. Oleson, and M. Huber, “Implementation and comparison of a suite of heat stress
metrics within the Community Land Model version 4.5,” Geosci. Model Dev., vol. 8, no. 2, 2015,
doi: 10.5194/gmd-8-151-2015.
[299] L. Rothfusz and NWS Southern Region Headquarters, “The heat index equation (or, more than
you ever wanted to know about heat index),” Fort Worth, Texas, 1990.
[300] R. G. Steadman, “The assessment of sultriness. Part I. A temperature-humidity index based on
human physiology and clothing science,” J. Appl. Meteorol., vol. 18, no. 7, 1979, doi:
10.1175/1520-0450(1979)018<0861:TAOSPI>2.0.CO;2.
[301] S. D. Borgeson, “Targeted Efficiency: Using Customer Meter Data to Improve Efficiency Program
Outcomes,” UC Berkeley, 2013.
Appendices
A - Supplemental information for Chapter 2
Section A1. Temporal variability in model performance
Figure A1.1. A time-series plot of the monthly model performance for the top five performing models.
Figure A1.2. A time-series plot of the daily model performance over an entire year for the top five performing
models.
Section A2. Summary of daily, monthly, and annual model results for all 11 initial models
Table A2-1. Annual training results for entire set of ML models
| Temporal Resolution | Model | Mean Absolute Error | Median Absolute Error | r² | Mean Average Percent Difference |
| --- | --- | --- | --- | --- | --- |
| Annual | Ridge Regressor | 2780 +/- 16.4 | 2120 +/- 13.45 | 0.31 +/- 0.01 | 112 +/- 2.33 |
| Annual | Linear Regressor | 2780 +/- 16.4 | 2120 +/- 13.48 | 0.31 +/- 0.01 | 112 +/- 2.33 |
| Annual | ElasticNet | 2870 +/- 16.1 | 2210 +/- 12.20 | 0.25 +/- 0.00 | 116 +/- 2.23 |
| Annual | Lasso | 2780 +/- 16.4 | 2120 +/- 14.27 | 0.31 +/- 0.01 | 112 +/- 2.32 |
| Annual | AdaBoost Regressor | 3420 +/- 246 | 2880 +/- 300. | 0.10 +/- 0.10 | 152 +/- 12.7 |
| Annual | Bagging Regressor | 3060 +/- 15.6 | 2360 +/- 20.31 | 0.19 +/- 0.01 | 114 +/- 2.18 |
| Annual | GB Regressor | 2780 +/- 15.1 | 2140 +/- 11.05 | 0.32 +/- 0.01 | 113 +/- 2.18 |
| Annual | RF Regressor | 2830 +/- 14.1 | 2180 +/- 14.30 | 0.30 +/- 0.01 | 113 +/- 2.20 |
| Annual | Extra Trees Regressor | 2870 +/- 15.1 | 2230 +/- 17.51 | 0.26 +/- 0.01 | 116 +/- 2.07 |
| Annual | MLPRegressor | 2740 +/- 15.8 | 2090 +/- 8.31 | 0.34 +/- 0.01 | 111 +/- 2.21 |
| Annual | KNN Regressor | 3070 +/- 18.6 | 2360 +/- 15.8 | 0.18 +/- 0.02 | 113 +/- 2.16 |
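As a rough illustration of how the four reported metrics could be computed for one model, the following sketch uses only the Python standard library. The function name and the definition of mean percent difference (mean of absolute error divided by the observed value, in percent) are assumptions, since the appendix does not define them.

```python
from statistics import mean, median


def regression_metrics(observed, predicted):
    """Return MAE, median absolute error, r2, and mean percent difference.

    Assumption: "mean percent difference" is taken as the mean of
    |predicted - observed| / observed, expressed in percent.
    """
    abs_err = [abs(p - o) for o, p in zip(observed, predicted)]
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return {
        "mae": mean(abs_err),
        "median_ae": median(abs_err),
        # Coefficient of determination; negative when the model is
        # worse than predicting the observed mean.
        "r2": 1 - ss_res / ss_tot,
        "mean_pct_diff": 100 * mean(e / o for e, o in zip(abs_err, observed)),
    }


# Hypothetical values, for illustration only.
metrics = regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

Under this definition r² can fall below zero, which is consistent with the negative r² entries reported for the AdaBoost Regressor at the monthly and daily resolutions.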
Supplementary Information
Table A2-2. Monthly training results for entire set of ML models
| Temporal Resolution | Model | Mean Absolute Error | Median Absolute Error | r² | Mean Average Percent Difference |
| --- | --- | --- | --- | --- | --- |
| Monthly | Ridge Regressor | 250. +/- 1.02 | 184 +/- 0.69 | 0.38 +/- 0.01 | 83.7 +/- 9.65 |
| Monthly | Linear Regressor | 250. +/- 1.02 | 184 +/- 0.69 | 0.38 +/- 0.01 | 83.7 +/- 9.65 |
| Monthly | ElasticNet | 277 +/- 0.92 | 215 +/- 0.98 | 0.24 +/- 0.01 | 92.3 +/- 10.2 |
| Monthly | Lasso | 250. +/- 0.98 | 184 +/- 0.75 | 0.38 +/- 0.01 | 83.9 +/- 9.85 |
| Monthly | AdaBoost Regressor | 568 +/- 102 | 567 +/- 125 | -1.04 +/- 0.56 | 209 +/- 38.8 |
| Monthly | Bagging Regressor | 262 +/- 1.11 | 189 +/- 1.29 | 0.31 +/- 0.02 | 86.1 +/- 8.55 |
| Monthly | GB Regressor | 255 +/- 0.98 | 195 +/- 1.07 | 0.36 +/- 0.01 | 89.8 +/- 11.0 |
| Monthly | RF Regressor | 277 +/- 2.61 | 212 +/- 1.82 | 0.25 +/- 0.01 | 94.6 +/- 12.1 |
| Monthly | Extra Trees Regressor | 276 +/- 1.07 | 212 +/- 0.67 | 0.25 +/- 0.00 | 96.7 +/- 13.3 |
| Monthly | MLPRegressor | 235 +/- 1.47 | 171 +/- 1.52 | 0.45 +/- 0.01 | 81.5 +/- 9.01 |
| Monthly | KNN Regressor | 255 +/- 1.53 | 183 +/- 1.24 | 0.33 +/- 0.02 | 84.0 +/- 8.94 |
Table A2-3. Daily training results for entire set of ML models
| Temporal Resolution | Model | Mean Absolute Error | Median Absolute Error | r² | Mean Average Percent Difference |
| --- | --- | --- | --- | --- | --- |
| Daily | Ridge Regressor | 9.59 +/- 0.0266 | 6.96 +/- 0.0284 | 0.30 +/- 0.007 | 74.9 +/- 1.41 |
| Daily | Linear Regressor | 9.04 +/- 0.0255 | 6.47 +/- 0.0251 | 0.37 +/- 0.0074 | 69.9 +/- 1.28 |
| Daily | ElasticNet | 9.59 +/- 0.0263 | 7.007 +/- 0.0258 | 0.30 +/- 0.0042 | 75.03 +/- 1.40 |
| Daily | Lasso | 9.67 +/- 0.0262 | 7.075 +/- 0.0268 | 0.29 +/- 0.007 | 75.6 +/- 1.41 |
| Daily | AdaBoost Regressor | 37.3 +/- 11.5 | 34.05 +/- 13.1 | -6.5 +/- 3.98 | 307 +/- 83.8 |
| Daily | Bagging Regressor | 9.67 +/- 0.0262 | 7.075 +/- 0.0268 | 0.29 +/- 0.007 | 75.6 +/- 1.41 |
| Daily | GB Regressor | 9.18 +/- 0.028 | 6.81 +/- 0.0308 | 0.35 +/- 0.0067 | 73.9 +/- 1.41 |
| Daily | RF Regressor | 9.76 +/- 0.0297 | 7.11 +/- 0.0401 | 0.26 +/- 0.0042 | 77.2 +/- 1.47 |
| Daily | Extra Trees Regressor | 9.68 +/- 0.0284 | 7.27 +/- 0.0286 | 0.27 +/- 0.0074 | 78.1 +/- 1.35 |
| Daily | MLPRegressor | 8.72 +/- 0.0508 | 6.13 +/- 0.143 | 0.38 +/- 0.026 | 67.4 +/- 2.32 |
B - Supplemental information for Chapter 3
Section B1. Weather station distances
Figure B1.1. A histogram depicting the distance in miles between each household and the closest (shown in blue)
and second closest (shown in green) matched weather station. On days where weather station data were missing,
homes are matched to the next closest weather station. A 20-mile cutoff is implemented so that the households are
not matched to weather stations that would not accurately represent the local conditions of the home.
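The matching rule in this caption can be sketched as follows: rank stations by distance, skip any station with missing data for the day, and stop at the 20-mile cutoff. The great-circle (haversine) distance, the function names, and the data layout are assumptions made for illustration; the text does not state how distances were computed.

```python
import math

EARTH_RADIUS_MILES = 3958.8   # mean Earth radius (assumed constant)
MAX_DISTANCE_MILES = 20.0     # cutoff stated in the caption


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def match_station(home, stations, available_ids):
    """Return (station_id, miles) for the nearest station with data that day.

    home: (lat, lon); stations: {station_id: (lat, lon)};
    available_ids: set of station ids reporting data on the day of interest.
    Returns None when no reporting station lies within the cutoff.
    """
    ranked = sorted(
        (haversine_miles(home[0], home[1], lat, lon), sid)
        for sid, (lat, lon) in stations.items()
    )
    for miles, sid in ranked:
        if miles > MAX_DISTANCE_MILES:
            break  # every remaining station is farther than the cutoff
        if sid in available_ids:
            return sid, miles
    return None
```

For example, a home whose closest station has a data gap on a given day falls through to its second closest station, provided that station is within 20 miles.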
Section B2. Heat metric definitions and equations
Historical weather data were retrieved from three networks of weather stations
(NOAA, EPA, CIMIS), as described in the Methods. Hourly observations of dry bulb
temperature, relative humidity, and wind speed were available from all three sources.
Hourly dew point and wet bulb temperature observations were also available in some
cases. The remaining heat metrics (DP, WBT, HI, ET, AT) were calculated using hourly
measurements from the weather stations and a combination of equations commonly
used in the literature and Python packages.
a) Wet Bulb Temperature
Wet bulb measurements were available from the NOAA weather stations. For the
EPA and CIMIS weather stations, Equation (2) from Buzan et al. [298] was used,
where T is hourly dry bulb temperature and RH is hourly relative humidity.
$$T_w = T \arctan\left(0.151977\,(RH + 8.3136)^{1/2}\right) + \arctan(T + RH) - \arctan(RH - 1.679331) + 0.00391838\,RH^{3/2} \arctan(0.023101\,RH) - 4.68035 \tag{6}$$
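A minimal sketch of the wet bulb expression above as a Python function. The constants are copied as printed in this appendix and should be checked against the cited source before reuse; the units (dry bulb temperature in °C, relative humidity in percent) and the function name are assumptions, since the text does not state them.

```python
import math


def wet_bulb_temperature(t_c, rh):
    """Wet bulb temperature from dry bulb temperature and relative humidity.

    Assumptions: t_c is in degrees Celsius and rh is in percent (0-100).
    Constants are transcribed from the equation as printed in this appendix.
    """
    return (
        t_c * math.atan(0.151977 * math.sqrt(rh + 8.3136))
        + math.atan(t_c + rh)
        - math.atan(rh - 1.679331)
        + 0.00391838 * rh ** 1.5 * math.atan(0.023101 * rh)
        - 4.68035
    )


# Illustrative call: a 20 degree dry bulb reading at 50% relative humidity.
example_wbt = wet_bulb_temperature(20.0, 50.0)
```

Because the expression only needs dry bulb temperature and relative humidity, it can be applied directly to the hourly EPA and CIMIS observations described above.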
b) Dew Point Temperature
Dew point measurements were available from the CIMIS and NOAA stations. For the
EPA weather stations, the dew point function in the Python metpy package was used.
The function is based on Equation (3), where e is the vapor pressure which can be
calculated with the dry bulb temperature and relative humidity.
$$T_d = \frac{243.5 \ln\left(\frac{e}{6.112}\right)}{17.67 - \ln\left(\frac{e}{6.112}\right)} \tag{7}$$
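The dew point expression above can be sketched in plain Python without calling metpy. The vapor pressure helper is an assumption: it uses a saturation vapor pressure form with the same constants (6.112, 17.67, 243.5) so that the two steps are mutually consistent, with temperature in °C and pressure in hPa. The text only states that e can be calculated from dry bulb temperature and relative humidity.

```python
import math


def vapor_pressure_hpa(t_c, rh):
    """Vapor pressure (hPa) from dry bulb temperature (deg C) and RH (%).

    Assumption: saturation vapor pressure follows
    6.112 * exp(17.67 * T / (T + 243.5)), matching the constants
    that appear in the dew point expression.
    """
    e_sat = 6.112 * math.exp(17.67 * t_c / (t_c + 243.5))
    return rh / 100.0 * e_sat


def dew_point_c(t_c, rh):
    """Dew point (deg C) from the natural log of e / 6.112."""
    log_ratio = math.log(vapor_pressure_hpa(t_c, rh) / 6.112)
    return 243.5 * log_ratio / (17.67 - log_ratio)
```

A quick sanity check on this pairing is that at 100% relative humidity the dew point equals the dry bulb temperature, and it falls below it as the air dries.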
c) Heat Index
Heat index values were calculated using the heat index function in the Python metpy
package. The function is based on Equation (4) from Rothfusz et al. [299], a multi-variable
least-squares regression of the values obtained by Steadman [300], where T is dry bulb
temperature (°F) and R is relative humidity (%).
$$HI = -42.379 + 2.04901523\,T + 10.14333127\,R - 0.22475541\,TR - 6.83783 \times 10^{-3}\,T^2 - 5.481717 \times 10^{-2}\,R^2 + 1.22874 \times 10^{-3}\,T^2 R + 8.5282 \times 10^{-4}\,T R^2 - 1.99 \times 10^{-6}\,T^2 R^2 \tag{8}$$
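For reference, the regression above can be written as a short standalone Python function. This is only a sketch of the polynomial itself, assuming temperature in °F and relative humidity in percent; it is not the metpy implementation, and any additional adjustments that package applies are not reproduced here.

```python
def heat_index_f(t_f, rh):
    """Heat index (deg F) from the nine-term regression polynomial.

    Assumptions: t_f is dry bulb temperature in deg F, rh is relative
    humidity in percent. No range checks or corrections are applied.
    """
    return (
        -42.379
        + 2.04901523 * t_f
        + 10.14333127 * rh
        - 0.22475541 * t_f * rh
        - 6.83783e-3 * t_f ** 2
        - 5.481717e-2 * rh ** 2
        + 1.22874e-3 * t_f ** 2 * rh
        + 8.5282e-4 * t_f * rh ** 2
        - 1.99e-6 * t_f ** 2 * rh ** 2
    )


# Illustrative call: a 90 deg F day at 50% relative humidity.
example_hi = heat_index_f(90.0, 50.0)
```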
d) Apparent Temperature
Apparent temperature values were calculated with the apparent temperature function in
the Python meteocalc package. The function is based on the National Digital Forecast
Database’s [197] definition of apparent temperature.
When:
T < 50 F, AT = Wind Chill
T > 80 F, AT = Heat Index
else, AT = Dry Bulb Temperature
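The piecewise rule above can be sketched as below. The thresholds come from the text; the wind chill formula (the standard NWS expression in °F and mph) and the heat index polynomial are included only to make the example self-contained, and they are assumptions about what the meteocalc function computes internally.

```python
def wind_chill_f(t_f, wind_mph):
    """NWS wind chill (deg F); assumed formula, intended for winds above 3 mph."""
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * t_f - 35.75 * v + 0.4275 * t_f * v


def heat_index_f(t_f, rh):
    """Heat index regression (deg F) with rh in percent."""
    return (
        -42.379 + 2.04901523 * t_f + 10.14333127 * rh
        - 0.22475541 * t_f * rh - 6.83783e-3 * t_f ** 2
        - 5.481717e-2 * rh ** 2 + 1.22874e-3 * t_f ** 2 * rh
        + 8.5282e-4 * t_f * rh ** 2 - 1.99e-6 * t_f ** 2 * rh ** 2
    )


def apparent_temperature_f(t_f, rh, wind_mph):
    """Apply the piecewise definition: wind chill, heat index, or dry bulb."""
    if t_f < 50:
        return wind_chill_f(t_f, wind_mph)
    if t_f > 80:
        return heat_index_f(t_f, rh)
    return t_f
```

In this form, apparent temperature differs from dry bulb temperature only on days colder than 50 °F or hotter than 80 °F.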
e) Effective Temperature
Effective temperature values were calculated with equation (5) from Gregorczuk
and Cena [198], where T is dry bulb temperature and RH is relative humidity.
$$ET = T - 0.4\,(T - 10)\left(1 - \frac{RH}{100}\right) \tag{9}$$
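As a small worked example of the effective temperature expression, the sketch below evaluates it directly. The function name is hypothetical, and the temperature unit is whatever unit T is supplied in; the text does not state it.

```python
def effective_temperature(t, rh):
    """Effective temperature from dry bulb temperature t and RH in percent."""
    return t - 0.4 * (t - 10.0) * (1.0 - rh / 100.0)


# With t = 30 and rh = 50: 30 - 0.4 * 20 * 0.5 = 26.
example_et = effective_temperature(30.0, 50.0)
```

At 100% relative humidity the correction term vanishes and the effective temperature equals the dry bulb temperature.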
Section B3. Segmented linear regression criteria examples
Figure B3.1. The segmented linear regression of the daily electricity consumption versus the daily average dry bulb
temperature for three example homes in the dataset. The stationary point temperature (SPT) and electricity-temperature
sensitivity (E-T sensitivity) are labeled in a), a home that meets the criteria and thus was identified as
having AC. The home represented in b) did not meet the criterion that the sum of slope-left and slope-right be
greater than zero, and the home represented in c) did not meet the criterion that slope-right be positive. Thus, the
homes in b) and c) were not identified as having AC. This figure was presented in a previous study that established
the segmented linear regression as a method to identify homes that have AC [24].
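The two criteria named in this caption can be illustrated with a simplified two-segment fit. This is a sketch under stated assumptions, not the procedure from [24]: it fits independent least-squares lines on either side of each candidate breakpoint, picks the split with the lowest total squared error, and then applies the slope checks. All function names are hypothetical.

```python
def _ols(xs, ys):
    """Least-squares slope and sum of squared errors for one segment."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = 0.0 if sxx == 0 else sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, sse


def fit_two_segments(temps, loads, min_points=5):
    """Return (breakpoint_temp, slope_left, slope_right) for the best split."""
    pairs = sorted(zip(temps, loads))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for i in range(min_points, len(xs) - min_points + 1):
        slope_l, sse_l = _ols(xs[:i], ys[:i])
        slope_r, sse_r = _ols(xs[i:], ys[i:])
        if best is None or sse_l + sse_r < best[0]:
            best = (sse_l + sse_r, xs[i], slope_l, slope_r)
    return best[1], best[2], best[3]


def has_ac(slope_left, slope_right):
    """Criteria from the caption: slope-right > 0 and slope-left + slope-right > 0."""
    return slope_right > 0 and (slope_left + slope_right) > 0
```

Here the breakpoint plays the role of the stationary point temperature and the right-hand slope the role of the E-T sensitivity, which is an interpretive assumption. A home like b) fails the sum check (for example, slopes of -2 and 1), and a home like c) fails the positive right slope check.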
Section B4. Comparison of AC penetration rates acquired from previous studies
Table B4-1. Summary of results from previous AC penetration rate studies.
| Source | AC Penetration Rate | AC types included | Investigated year | Investigated area | Citation |
| --- | --- | --- | --- | --- | --- |
| S3 lab study | 69% | All types* | 2015-2016 | SCE territory | Chen et al [24] |
| Residential Energy Consumption Survey | 68% | All types | 2015 | Pacific US | US EIA [206] |
| Advanced Residential and Behavior Analysis Project | >60% | All types | Not provided | California | Lutzenhiser et al [207] |
| California Residential Appliance Saturation Study | 75% | Central and Room | 2009 | SCE Territory | Palmgren et al [208] |
| California Residential Appliance Saturation Study | 86% | Central and Room | 2019 | SCE Territory | Palmgren et al [178] |
| Borgeson PhD Dissertation | 60% or 68-70% | All types | 2008-2011 | PG&E Territory | Borgeson [301] |
| Inequality in the availability of residential air conditioning across 115 US metropolitan areas | 81% | All types | 2015-2019 | Los Angeles-Long Beach-Anaheim | Romitti et al [184] |
*other than possibly evaporative cooling devices
Section B5. Distribution of the daily average heat metric values in each of the
study’s climate zones
Figure B5.1. Distribution of each of the six average daily heat metrics, over the study period (2015-2016), in each of the seven climate zones that are within the study region.
Section B6. Humid heat metric distribution across temperature bins and
climate zones
Figure B6.1. Distribution of the daily average value for each heat metric within a given DBT range. Each point
represents a census tract day, and the shade corresponds to the average electricity intensity (EI) percentile for all
the homes in a census tract on that day. EI refers to the average electricity intensity percentile in the temperature
bin, and n is the number of census tract days in the bin.
Figure B6.2. Scatter plot of daily average humidity versus daily average dry bulb temperature grouped by climate
zone. Each point represents a census tract day, and the shade represents the average EI percentile of all the
homes in the census tract on that day. Marginal histograms of the dry bulb temperature and relative humidity are
plotted on the x and y axis.
In Figure B6.1, the distributions of each of the heat metrics are plotted to gain insight into how
they relate to DBT. On each plot, each point represents the daily average value for each heat
metric for each census tract day. These daily values are plotted according to DBT range, and the
color of each value corresponds with the average electricity intensity (EI) percentile consumed
by all the households within the census tract. (EI percentile is defined as the daily percent rank of
each census tract day based on the average electricity demand per square foot of all homes in the
census tract, where data were available.) Note that the number of points, and each point’s
corresponding color for EI percentile, are exactly the same in each bin. However, the distribution
of those points according to heat metric value varies for each of the six plots.
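One way to read the EI percentile definition is as a percent rank of census tracts within each day. The sketch below follows that reading, which is an assumption, as are the record layout and the tie-handling rule (share of tracts that day with EI less than or equal to the tract's own value).

```python
from collections import defaultdict


def ei_percentiles(records):
    """Percent rank of each census tract's electricity intensity within a day.

    records: iterable of (date, tract_id, ei), where ei is the tract's
    average electricity demand per square foot on that date.
    Returns {(date, tract_id): percentile in (0, 100]}.
    """
    by_day = defaultdict(list)
    for date, tract, ei in records:
        by_day[date].append((tract, ei))

    percentiles = {}
    for date, rows in by_day.items():
        n = len(rows)
        for tract, ei in rows:
            # Count tracts on the same day with EI at or below this tract's EI.
            rank = sum(1 for _, other in rows if other <= ei)
            percentiles[(date, tract)] = 100.0 * rank / n
    return percentiles
```

Because the rank depends only on electricity intensity and not on the heat metric, each census tract day keeps the same shade in all six panels of Figure B6.1.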
Figure B6.2 depicts the relationship between daily average RH and daily average DBT within
each of the seven climate zones. (The colors again refer to EI percentile.) In general, there is a
negative relationship between temperature and relative humidity (i.e., as DBT increases, RH
decreases), which is expected because warmer air can hold more moisture, so a given amount of
water vapor corresponds to a lower RH at higher temperatures. Certain climate zones have a
steeper negative slope, meaning the decrease in RH for each unit increase in DBT is larger.
These climate zones tend to be the hotter, drier regions; for example, climate zone 14 has a
slope of -1.29 and is described as medium to high desert [201]. Climate zone 6,
which has milder temperatures and is influenced by its coastal location, has a slope of -0.83.
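The slopes quoted here are the kind of value an ordinary least-squares fit of daily average RH on daily average DBT would return. A minimal sketch follows, assuming a simple linear fit per climate zone (the text does not state the fitting method) and using synthetic data constructed to have a known slope.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


# Synthetic, illustrative data only: RH falls 1.29 points per unit of DBT.
dbt = [15.0, 20.0, 25.0, 30.0, 35.0]
rh = [80.0 - 1.29 * (t - 15.0) for t in dbt]
slope = ols_slope(dbt, rh)
```

A slope of -1.29 is steeper than -0.83 in the sense that its magnitude is larger, so RH drops more for each unit increase in DBT.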
The probability density curves of daily average DBT and RH are also shown in the marginal
plots. As is expected, the distributions of DBT for climate zones 14, 15, and 16 are shifted
further towards higher temperatures than the milder climate zones 6, 8, and 9. The distributions
of RH for the coastal climate zones 6, 8, and 9 are skewed towards higher RH values than the
dry, desert climate zones of 14 and 15.
C – Supplemental Information for Chapter 4
Section C1. Electric Heating Penetration Rates
Figure C1-1. Choropleth map of census tract level electric heating penetration rates for the study region.
D – Supplemental Information for Chapter 5
Section D1. Methods Flow Diagram
Figure D1.1 Flow diagram of the methodology
Section D2. Summary of socioeconomic indicators
Table D2.1 Summary of socioeconomic indicators* by climate zone for households in the dataset.
| Climate Zone | Number of Households | Mean Annual Household Electricity Demand (kWh) | Mean Income ($) | Mean Educational Attainment | Mean Linguistic Isolation | Mean Poverty | Mean Unemployment | Mean Housing Burden |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 6 | 36342 | 4635 | 129836 | 10% | 7% | 26% | 8% | 18% |
| 8 | 52439 | 5002 | 99779 | 22% | 12% | 36% | 9% | 21% |
| 9 | 39863 | 6097 | 106287 | 18% | 13% | 31% | 9% | 18% |
| 10 | 47602 | 6669 | 89969 | 19% | 7% | 37% | 12% | 18% |
| 14 | 14028 | 6713 | 69763 | 19% | 5% | 48% | 15% | 20% |
| 15 | 6994 | 8947 | 88658 | 15% | 6% | 41% | 12% | 19% |
| 16 | 2747 | 4761 | 67592 | 23% | 7% | 53% | 13% | 25% |
| All climate zones | 200015 | 6118 | 93126 | 18% | 8% | 39% | 11% | 20% |
Table D2.2. Number of households (in dataset) classified in each percentile bin by socioeconomic indicator*.
| Percentile | Educational Attainment Percentile | Linguistic Isolation Percentile | Poverty Percentile | Unemployment Percentile | Housing Burden Percentile |
| --- | --- | --- | --- | --- | --- |
| 10 | 20179 | 22312 | 17997 | 16105 | 16020 |
| 20 | 22062 | 17285 | 23375 | 18202 | 19747 |
| 30 | 18170 | 23686 | 24200 | 24198 | 22968 |
| 40 | 19427 | 21519 | 20389 | 21214 | 19653 |
| 50 | 22556 | 22234 | 20491 | 20549 | 20645 |
| 60 | 20814 | 20488 | 21082 | 21026 | 22285 |
| 70 | 21059 | 21199 | 19002 | 18910 | 21056 |
| 80 | 21265 | 19137 | 19772 | 20220 | 20649 |
| 90 | 17934 | 16646 | 19062 | 20674 | 20497 |
| 100 | 15961 | 13773 | 14436 | 17177 | 15734 |
*The socioeconomic data were retrieved from the U.S. Census Bureau (income) and CalEnviroScreen 3.0
(education, linguistic isolation, poverty, unemployment, housing burden). Socioeconomic information is available at
the census tract level only. Thus, to find the mean values reported in Table D2.1, each household was assigned the
value of the census tract it resides within, and the average value of all the households in the climate zone was
calculated. Each household was also assigned the percentile value of the census tract that it belongs to, and the
number of households in the dataset classified by each percentile bin for the socioeconomic indicators is reported in
Table D2.2. Refer to the CalEnviroScreen website for complete definitions and descriptions of the indicators
(CalEnviroScreen | OEHHA).
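The assignment described in this footnote can be sketched as two small aggregation steps. The data layout, function names, and the binning rule (percentiles grouped into ten-point bins labeled by their upper edge) are assumptions for illustration.

```python
import math
from collections import defaultdict
from statistics import mean


def mean_by_climate_zone(households, tract_values):
    """Average a tract-level indicator over households in each climate zone.

    households: list of (household_id, tract_id, climate_zone);
    tract_values: {tract_id: indicator value}. Each household inherits
    the value of the census tract it resides within.
    """
    by_zone = defaultdict(list)
    for _, tract, zone in households:
        by_zone[zone].append(tract_values[tract])
    return {zone: mean(values) for zone, values in by_zone.items()}


def count_by_percentile_bin(households, tract_percentiles, bin_width=10):
    """Count households per percentile bin, keyed by the bin's upper edge."""
    counts = defaultdict(int)
    for _, tract, _ in households:
        pct = tract_percentiles[tract]
        # Assumed binning: (0, 10] -> 10, (10, 20] -> 20, ..., (90, 100] -> 100.
        upper = min(100, max(bin_width, math.ceil(pct / bin_width) * bin_width))
        counts[upper] += 1
    return dict(counts)
```

Because every household in a tract inherits the same value, tracts with more households in the dataset carry more weight in the climate zone means.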
Section D3. Hourly load profiles
Figure D3.1 Hourly load profile of residential load (purple) and total SCE load (red) on two different Flex Alert days (solid lines) compared
to the hourly load profiles on the comparable days (dashed lines). The total SCE load data were retrieved from CAISO and include all end-use sectors.
Section D4. Hourly percent change in demand
Figure D4.1 Hourly percent change in demand on each Flex Alert and its corresponding comparable days. For hours shown in red,
the electricity demand increased from the previous hour. For hours shown in blue, the electricity demand decreased from the
previous hour.
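The quantity plotted in Figure D4.1 can be sketched as an hour-over-hour percent change, with the red and blue coloring following the sign. The function names and the load values are hypothetical.

```python
def hourly_percent_change(load):
    """Percent change in demand relative to the previous hour.

    load: list of hourly demand values in chronological order.
    Returns one value per hour starting from the second hour.
    """
    return [100.0 * (cur - prev) / prev for prev, cur in zip(load, load[1:])]


def change_color(pct):
    """Color convention from the caption: red for increases, blue for decreases."""
    if pct > 0:
        return "red"
    if pct < 0:
        return "blue"
    return "none"


# Hypothetical three-hour load series, for illustration only.
changes = hourly_percent_change([100.0, 110.0, 99.0])
colors = [change_color(c) for c in changes]
```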
Section D5. Hourly load profiles and socioeconomic indicators
Figure D5.1. Hourly load profiles of the SCE residential load by demand percentile, where the 10th percentile is the customers
with the lowest electricity demand and the 100th percentile is the users with the highest electricity demand.
Figure D5.2. Hourly load profiles of the SCE residential load by income percentile, where the 10th percentile is the customers in
the census tracts with the lowest income and the 100th percentile is the customers in the census tracts with the highest
income.
Figure D5.3. Hourly load profiles of the SCE residential load by educational attainment percentile, where the 10th percentile is
the customers in the census tracts with the highest educational attainment (i.e., percentage of residents over age 25 with at
least a high school education), and the 100th percentile is the customers in the census tracts with the lowest educational attainment.
Section D6. Ramping response by hour of the Flex Alert period
Table D6.1 Ramping Response on each Flex Alert day for the first three hours of the Flex Alert period.
| Date | Day of Week | First Hour | Last Hour | Daily Max Temp (F) | Ramping Response, Hour 1 | Ramping Response, Hour 2 | Ramping Response, Hour 3 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 6/30/2015 | Tuesday | 14 | 21 | 90.4 | -14% | -4% | 2% |
| 7/1/2015 | Wednesday | 14 | 21 | 86.5 | 0% | 1% | 2% |
| 7/27/2016 | Wednesday | 14 | 21 | 93.3 | 1% | -4% | 0% |
| 7/28/2016 | Thursday | 14 | 21 | 92.2 | 2% | -1% | -2% |
| 7/24/2018 | Tuesday | 17 | 21 | 96.3 | 1% | 0% | -1% |
| 7/25/2018 | Wednesday | 17 | 21 | 93.4 | -1% | -1% | -1% |
| 6/11/2019 | Tuesday | 16 | 22 | 92.5 | 0% | -2% | -1% |
| 8/14/2020 | Friday | 15 | 22 | 97.5 | -2% | 0% | 2% |
| 8/16/2020 | Sunday | 15 | 22 | 94.9 | -3% | -3% | -3% |
| 8/17/2020 | Monday | 15 | 22 | 93.9 | 1% | 2% | 1% |
| 8/18/2020 | Tuesday | 15 | 22 | 102 | -4% | -2% | 0% |
| 8/19/2020 | Wednesday | 15 | 22 | 97.3 | -3% | -1% | 0% |
| 9/5/2020 | Saturday | 15 | 21 | 106.2 | -1% | -1% | 1% |
| 9/6/2020 | Sunday | 15 | 21 | 108.3 | -2% | -3% | 2% |
| 9/7/2020 | Monday | 15 | 21 | 88 | -5% | -3% | 1% |
| 10/1/2020 | Thursday | 15 | 22 | 99.7 | -1% | -1% | -1% |
| 10/15/2020 | Thursday | 15 | 22 | 94.2 | -1% | -1% | 1% |
| Average Response | | | | | -2% | -1% | 0% |
| Median Response | | | | | -1% | -1% | 0% |
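The summary rows of Table D6.1 can be approximately recomputed from the rounded per-day values in the table, as in the sketch below. Because the inputs are already rounded to whole percentages, the recomputed means are only approximate checks on the reported averages.

```python
from statistics import mean, median

# Ramping response (%) for hours 1-3 of each Flex Alert day, copied from Table D6.1.
ramping = {
    "Hour 1": [-14, 0, 1, 2, 1, -1, 0, -2, -3, 1, -4, -3, -1, -2, -5, -1, -1],
    "Hour 2": [-4, 1, -4, -1, 0, -1, -2, 0, -3, 2, -2, -1, -1, -3, -3, -1, -1],
    "Hour 3": [2, 2, 0, -2, -1, -1, -1, 2, -3, 1, 0, 0, 1, 2, 1, -1, 1],
}

# (rounded mean, median) for each hour of the Flex Alert period.
summary = {hour: (round(mean(values)), median(values)) for hour, values in ramping.items()}
```

The Hour 1 average is pulled down noticeably by the single -14% response on 6/30/2015, which is why its mean and median differ.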
Asset Metadata
Creator: Peplinski, McKenna Shea (author)
Core Title: Residential electricity demand in the context of urban warming: leveraging high resolution smart meter data to quantify spatial and temporal patterns…
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Environmental Engineering
Degree Conferral Date: 2024-08
Publication Date: 07/26/2024
Defense Date: 04/05/2024
Publisher: Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag: OAI-PMH Harvest, smart meter, residential electricity, air-conditioning, demand response
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Sanders, Kelly (committee chair), Silva, Sam (committee member), Zhang, Jiachen (committee member)
Creator Email: kennapep23@gmail.com, peplinsk@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC113998G0X
Unique identifier: UC113998G0X
Identifier: etd-PeplinskiM-13279.pdf (filename)
Legacy Identifier: etd-PeplinskiM-13279
Document Type: Dissertation
Rights: Peplinski, McKenna Shea
Internet Media Type: application/pdf
Type: texts
Source: 20240730-usctheses-batch-1187 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu