USER-CENTRIC SMART SENSING FOR NON-INTRUSIVE ELECTRICITY CONSUMPTION
DISAGGREGATION IN BUILDINGS
By
Farrokh Jazizadeh Karimi
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(CIVIL ENGINEERING)
August 2015
Copyright 2015 Farrokh Jazizadeh Karimi
Table of Contents
Chapter One: Introduction and Motivation ........................................................................................... 1
1.1 Energy Consumption Facts ..................................................................................................... 1
1.2 Energy Consumption Management ........................................................................................ 2
1.3 Electricity Consumption Monitoring ...................................................................................... 5
1.4 Energy Awareness Vision....................................................................................................... 6
1.5 Dissertation Organization ....................................................................................................... 9
Chapter Two: Research Background and Scope .................................................................................. 10
2.1 Electricity Consumption Disaggregation Research Background.......................................... 10
2.1.1 Data Acquisition Systems ............................................................................................. 12
2.1.2 Features Extraction Approaches ................................................................................... 13
2.1.3 Pattern Recognition Algorithms .................................................................................... 17
2.2 Problem Statement ................................................................................................................ 18
2.2.1 A Closer Look at Appliances from a Training Point of View ........................................... 20
2.2.2 Feasibility of Pre-Installation Training ......................................................................... 24
2.2.3 Research Background in NILM Training ..................................................................... 26
2.3 Research Objectives and Scope ............................................................................................ 28
2.3.1 Reducing Interaction Requirement in Active Training ................................................. 29
2.3.2 Relaxing Response Time Constraint ............................................................................. 30
2.3.3 Research Questions ....................................................................................................... 31
2.4 Research Methodology ......................................................................................................... 32
2.4.1 Field Experiments and Data Sets .................................................................................. 32
2.4.2 Validation Metrics ......................................................................................................... 35
Chapter Three: Event Based Appliance State Transition Detection ................................................... 37
3.1 Load Monitoring Set Up ....................................................................................................... 37
3.1.1 Residential Electricity Infrastructure and Load Behavior ............................................. 37
3.1.2 Data Processing System ................................................................................................ 39
3.1.3 Appliance State Change Detection................................................................................ 43
3.1.4 Feature Extraction Methods .......................................................................................... 46
3.1.5 Classification Algorithms ............................................................................................. 49
3.2 Algorithm Evaluation and Tuning ........................................................................................ 50
3.2.1 Data Labeling Procedure ............................................................................................... 50
3.2.2 Classification Algorithms Performance Evaluation ...................................................... 56
3.3 Summary ............................................................................................................................... 61
Chapter Four: Unsupervised Feature Vector Clustering ..................................................................... 63
4.1 Hierarchical Clustering Heuristic Algorithm ........................................................................ 64
4.2 Performance Evaluation of the Algorithm ............................................................................ 74
4.3 Performance Evaluation Metrics .......................................................................................... 76
4.4 Algorithm Evaluation ........................................................................................................... 82
4.5 Summary ............................................................................................................................... 90
Chapter Five: User-Centric Smart Interaction Framework .................................................................. 92
5.1 User-Centric NILM System .................................................................................................. 92
5.2 Basic User-Centric Training ................................................................................................. 93
5.3 Smart User-Centric Training ................................................................................................ 96
5.3.1 Cluster Validation ......................................................................................................... 97
5.3.2 Anomaly Detection Algorithms .................................................................................. 105
5.3.3 SI Framework Description .......................................................................................... 110
5.4 Performance Evaluation of the SI Framework Components .............................................. 112
5.4.1 Cluster Validation Evaluation and Analysis ............................................................... 112
5.4.2 Anomaly Detection Algorithm Validation .................................................................. 130
5.5 Training Frameworks Performance Evaluation and Validation ......................................... 139
5.5.1 Internal Cluster Validation .......................................................................................... 142
5.5.2 Cluster Merging .......................................................................................................... 156
5.5.3 Rule-based Cluster Identification ................................................................................ 160
5.5.4 Ensemble of Anomaly Detectors ................................................................................ 165
5.5.5 User-Interaction Process Validation ........................................................................... 170
5.6 Summary ............................................................................................................................. 173
Chapter Six: Time Constraint Relaxation in Training ....................................................................... 175
6.1 Towards Enabling Games with a Purpose (GWAP) ........................................................... 177
6.2 Proposed Passive Training Approach ................................................................................. 178
6.2.1 Signature-Label Matching Algorithm ......................................................................... 179
6.3 Passive Training Evaluation ............................................................................................... 182
6.3.1 NILM System User Interface ...................................................................................... 182
6.3.2 Experimental Validation ............................................................................................. 184
6.4 Summary ............................................................................................................................. 195
Chapter Seven: Conclusion and Future Directions ............................................................................ 196
7.1 Addressing Research Objectives and Questions ................................................................. 197
7.2 Future Research Directions ................................................................................................. 202
References ...................................................................................................................................... 205
List of Figures
Figure 2-1 Examples of finite state machines representing appliance models: a) a light fixture
with four states; b) an On/Off appliance such as an electric kettle (originally presented by [36]) ...... 11
Figure 2-2 Samples of real power time series at different resolutions: a) a segment of the real
power time series at 60 Hz; b) the same segment of the power time series down-sampled to 1 Hz; c) a
sub-segment of the power time series in (a), highlighted by a dashed line, five seconds before and
five seconds after the event; d) a sub-segment of the power time series in (b), highlighted by a
dash-dot line, five seconds before and five seconds after the event ..................................................... 14
Figure 2-3 a) Distribution of electricity consumption in U.S. households for the major
consumers; EIA, Residential Energy Consumption Survey (RECS), 2001 [73]; b) variation of
energy consumption in the residential sector over the last two decades [74] ....................................... 21
Figure 3-1 Examples of appliance state change (event) feature vectors: a) the basic feature vector,
comprised of real and reactive power time series segments; b) the modeled feature vector obtained
through linear regression analysis ........................................................................................................ 47
Figure 3-2 The complete framework for event based non-intrusive load monitoring with appliance
state transition detection modules highlighted in hatched dark gray pattern ....................................... 49
Figure 3-3 Appliances’ feature vector labeling interface: a) the feature vector (signature) of
interest (fundamental frequency component of the real power representation) at 60 Hz; b) the feature
vector (signature) of interest (fundamental frequency component of the reactive power representation)
at 60 Hz; c) matching a longer segment of the fundamental frequency component of the real power
time series with signal (power or light intensity at 1 Hz) time series from ground truth sensors ......... 51
Figure 4-1 The agglomerative hierarchical binary cluster tree generation algorithm .......................... 65
Figure 4-2 Dendrogram of the binary cluster tree for feature vectors of all turn-on events in a data set
from a residential setting; the tree shows up to one hundred leaf nodes.............................................. 67
Figure 4-3 Variation of different metrics - used for detecting the structure of cluster tree - along
the cluster tree from leaf nodes to the root; a) between cluster distances along the tree; b) slope
measure representing distance growth rate (equation 12); c) slope growth measure representing
slope growth rate (equation 13) ........................................................................................................... 69
Figure 4-4 Distance growth rate histogram segmentation process: a) the histogram of the δ values
of the cluster tree; b) variation of δ values along the cluster tree; c) variation of |Δh_w| (the
absolute difference of the weighted sum of δ) along the tree; d) variation of δ_u values along the
cluster tree ............................................................................................................................................ 71
Figure 4-5 Dendrogram of the cluster tree at different scales through recursive clustering; sub-
graphs b to d represent the dendrogram of the cluster tree for the residual feature space ................... 73
Figure 4-6 Recursive hierarchical clustering (RH) algorithm.............................................................. 74
Figure 4-7 Feature vector classes for phase A turn-on events, manually labeled by a user, using
the ground truth sensors’ data .............................................................................................................. 75
Figure 4-8 Cluster quality (CQ) metric representation; the left side shows the results for all
clusters and the right side shows the congested area in a larger scale ................................................. 81
Figure 4-9 Variation of the F-Measure and CQI indices for different numbers of Fourier basis
functions and degrees of polynomial basis functions ........................................................................... 86
Figure 4-10 Feature vector clusters for phase A turn-off events, manually labeled by a user, using
the ground truth sensors’ data .............................................................................................................. 88
Figure 5-1 User-Centric NILM framework components and their relationships ................................. 93
Figure 5-2 Basic training process by leveraging the user-centric NILM framework depicted in
Figure 5-1 ............................................................................................................................................. 94
Figure 5-3 The resulting clusters generated by the autonomous clustering algorithm (presented in
Chapter 4) for the turn-on events on phase A of the apartment 1 data set for two weeks of data
(the same data that was used for evaluating the clustering algorithms) ............................................... 98
Figure 5-4 Selected clusters from the phase A turn-on events in apartment 1: a) cluster of
AC compressor turn-on events; b) cluster of kettle turn-on events; c) second cluster of AC
compressor turn-on events ................................................................................................................. 103
Figure 5-5 Covariance matrices of the signature samples from selected clusters: a-c) covariance
of the sample signatures in matrix format; d-e) covariance of the sample signatures in the
vectorized format ............................................................................................................................... 104
Figure 5-6 A two dimensional representation of the anomaly detection through application of
SVDD approach including the core data boundary and its radius and the slack variables for
representative data points ................................................................................................................... 107
Figure 5-7 The smart interaction framework including the autonomous labeling module, the
real-time decision-making module, and the user interaction interface .............................................. 111
Figure 5-8 Variation of the CCM approach second condition for different dimensionality in the
data set ............................................................................................................................................... 118
Figure 5-9 Variation of the CCM approach third condition for different dimensionality in the data
set ....................................................................................................................................................... 120
Figure 5-10 The effect of the cluster merging on the cluster quality indices (i.e., accuracy and
CQI metrics) using the CCM approach for different values of K_1 and K_3 for the original dimensions .... 122
Figure 5-11 The effect of the cluster merging on the cluster quality indices (i.e., accuracy and
CQI metrics) using the CCM approach for different values of K_1 and K_3 for the first five principal
dimensions ......................................................................................................................................... 122
Figure 5-12 Variation of the FCM similarity ratio (SR) metric for the turn-on events on phase A
of apartment 1; variation of the SR measure across different clusters (left side); histogram of the
SR values across different clusters (right side) .................................................................................. 123
Figure 5-13 The effect of the cluster merging on the cluster quality indices (i.e., accuracy and
CQI metrics) using the FCM approach for different values of τ using the original dimensions ......... 123
Figure 5-14 Variation of the cluster quality indices (F-Measure and CQI) after merging groups of
clusters formed based on the real power range variation of clusters .................................................. 126
Figure 5-15 Performance of the CCM algorithm for the apartment 1 data set: a) turn-on events on
phase A, original dimensions for K_1=0.7; b) turn-on events on phase A, original dimensions for
K_1=0.5; c) turn-on events on phase A, first 5 PCs for K_1=0.7; d) turn-on events on phase A, first
5 PCs for K_1=0.5 .............................................................................................................................. 127
Figure 5-16 Performance of the CCM algorithm for the apartment 1 data set using the first 5 PCs and
K_1=0.5: a) turn-on events on phase A, b) turn-off events on phase A, c) turn-on events on phase
B, and d) turn-off events on phase B ................................................................................................. 128
Figure 5-17 Performance of the FCM algorithm for apartment 1 data set: a) turn-on events on
phase A, b) turn-off events on phase A, c) turn-on events on phase B, and d) turn-off events on
phase B ............................................................................................................................................... 129
Figure 5-18 Performance of the CBM algorithm for apartment 1 data set: a) turn-on events on
phase A, b) turn-off events on phase A, c) turn-on events on phase B, and d) turn-off events on
phase B ............................................................................................................................................... 129
Figure 5-19 Variation of the outlier detection accuracy versus the change in the real power range
of signature clusters for both apartment 1 and apartment 2 data sets using RBF and second degree
polynomial kernel functions............................................................................................................... 134
Figure 5-20 The variation of real power range and the number of samples in different classes ....... 135
Figure 5-21 Variation of optimum C values versus characteristics of the signature classes ............. 136
Figure 5-22 The distribution of inlier and outlier detection accuracy with respect to the power range
of the signature class .......................................................................................................................... 138
Figure 5-23 The IDEF0 diagram of the training framework, which could be selectively used for
both basic and enhanced training procedures .................................................................................... 140
Figure 5-24 The NILM prototype set up in an apartment as part of the experimental test bed ......... 141
Figure 5-25 The user interface of our NILM prototype used for training by users ............................. 142
Figure 5-26 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 1 – Phase A – Turn-on events .......................................................................................... 147
Figure 5-27 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 1 – Phase A – Turn-off events ......................................................................................... 148
Figure 5-28 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 1 – Phase B – Turn-on events .......................................................................................... 149
Figure 5-29 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 1 – Phase B – Turn-off events.......................................................................................... 150
Figure 5-30 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 2 – Phase B – Turn-on events .......................................................................................... 151
Figure 5-31 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 2 – Phase B – Turn-off events.......................................................................................... 152
Figure 5-32 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 2 – Phase A – Turn-on events .......................................................................................... 153
Figure 5-33 Variation of the F-Measure versus SC_Index for different values of P_b on the data from
Apartment 2 – Phase A – Turn-off events ......................................................................................... 154
Figure 5-34 Variation of the average ΔCQI and ΔF-Measure over the eight data sets for both test
bed apartments (apartment 1 and apartment 2) for power ranges above 300 watts ........................... 158
Figure 5-35 Variation of the average ΔCQI and ΔF-Measure over the eight data sets for both test
bed apartments (apartment 1 and apartment 2) for power ranges below 300 watts ........................... 159
Figure 5-36 Sample histograms of the synthetic data ........................................................................ 161
Figure 5-37 Histograms and frequent events’ indices for clusters of the appliances’ signatures
from turn-on events on phase A of apartment 1 ................................................................................. 163
Figure 5-38 Histograms and frequent events’ indices for clusters of the appliances’ signatures
from turn-off events on phase A of apartment 1 ................................................................................ 164
Figure 5-39 Evaluation of the anomaly detection techniques on data sets from apartment 1 ........... 167
Figure 5-40 Evaluation of the anomaly detection techniques on data sets from apartment 2 ........... 168
Figure 6-1 Time difference between events for the data sets in apartment 1; the events were detected
on the real power time series at 20 Hz ............................................................................................... 176
Figure 6-2 Signature-Label Matching (SLM) algorithm for assigning the labels and feature
vectors in a given electricity measurement data set ........................................................................... 180
Figure 6-3 User interface screenshots of the passive training smartphone application, developed
on Android platform .......................................................................................................................... 184
Figure 7-1 Lighting fixtures activity detection in time and frequency domains ................................ 203
List of Tables
Table 2-1 Appliances contributing about 80% of the electricity consumption of households ............. 22
Table 2-2 List of Appliances and duration of the data acquisition in different field experimental
test beds used in addressing the research objectives in this dissertation.............................................. 34
Table 3-1 Appliances sub-labels, representing appliance state transitions in Apartment 1 test bed.... 53
Table 3-2 Appliances sub-labels, representing appliance state transitions in Apartment 2 test bed.... 54
Table 3-3 Appliances sub-labels, representing appliance state transitions in Apartment 3 test bed.... 55
Table 3-4 Performance evaluation of different classifier algorithms on the turn-on and turn-off
events signatures, collected in Apartment 1 test bed for two weeks .................................................... 56
Table 3-5 Performance of the 1NN algorithm using 10-fold cross validation for two weeks of the
data in Apartment 1 for turn-on and turn-off events on phase A ......................................................... 57
Table 3-6 Performance of the 1NN algorithm using 10-fold cross validation for two weeks of the
data in Apartment 1 for turn-on and turn-off events on phase B ......................................................... 58
Table 3-7 The confusion matrix for the KNN classifier performance using 10-fold cross validation on
the data from phase A for turn-on events ............................................................................................. 59
Table 3-8 The confusion matrix for the KNN classifier performance using 10-fold cross validation on
the data from phase B for turn-on events ............................................................................................. 60
Table 3-9 The confusion matrix for the KNN classifier performance using 10-fold cross validation on
the data from phase A for turn-off events ............................................................................................ 60
Table 3-10 The confusion matrix for the KNN classifier performance using 10-fold cross validation on
the data from phase B for turn-off events ............................................................................................ 61
Table 4-1 The matrix representing the association between cluster labels and ground truth labels
(association matrix) .............................................................................................................................. 78
Table 4-2 The mapped version of the association matrix in the form of a conventional confusion
matrix ................................................................................................................................................... 80
Table 4-3 Performance of the heuristic algorithm for turn-on events on phase A for different
linkage and distance metrics (average values across five 5-fold cross validation results) .................. 84
Table 4-4 Performance of the heuristic algorithm for turn-on events on phase A for different
feature extraction methods (average values across five 5-fold cross validation results) ..................... 86
Table 4-5 Performance of the proposed heuristic algorithm for turn-off events on phase A
(average values across five 5-fold cross validation results) ................................................................. 87
Table 4-6 Performance of the heuristic algorithm for turn-on and turn-off events on phase B for
different feature extraction methods (average values across five 5-fold cross validation results) ...... 89
Table 5-1 The clusters that could be merged, determined by visual observation (the clusters were
autonomously generated using two weeks of the turn-on events on phase A in apartment 1) ........... 114
Table 5-2 The CCM approach first condition (eq. 5-1) evaluation for clusters of the turn-on
events on phase A in apartment 1 ...................................................................................................... 116
Table 5-3 Evaluation of the CCM first condition for the data set with reduced dimensionality
using PCA for dimensions up to 30, as well as original dimensions ................................................. 117
Table 5-4 Evaluation of the CCM second condition for selected clusters of turn-on events on
phase A of Apartment 1 data set ........................................................................................................ 119
Table 5-5 Evaluation of the CCM third condition for selected clusters of turn-on events on phase
A of Apartment 1 data set .................................................................................................................. 121
Table 5-6 Variation of the CQI values for individual classes of appliance state transitions ............. 124
Table 5-7 Example of groups of clusters for the targeted data set and their associated cluster
labels .................................................................................................................................................. 125
Table 5-8 Evaluation of the SVDD algorithm for anomaly detection using the apartment 1
and apartment 2 data sets ................................................................................................................... 133
Table 5-9 Evaluation of the MD algorithm for anomaly detection using the apartment 1
and apartment 2 data sets ................................................................................................................... 138
Table 5-10 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 1 – Phase A – Turn-on events ............................................................................. 147
Table 5-11 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 1 – Phase A – Turn-off events ............................................................................ 148
Table 5-12 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 1 – Phase B – Turn-on events ............................................................................. 149
Table 5-13 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 1 – Phase B – Turn-off events ............................................................................ 150
Table 5-14 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 2 – Phase B – Turn-on events ............................................................................. 151
Table 5-15 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 2 – Phase B – Turn-off events ............................................................................ 152
Table 5-16 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 2 – Phase A – Turn-on events ............................................................................. 153
Table 5-17 Variation of clustering algorithm performance versus different internal evaluation
indices; Apartment 2 – Phase A – Turn-off events ............................................................................ 154
Table 5-18 Clustering configuration for apartment 1 and 2 data sets ................................................ 157
Table 5-19 Average variation of the CQI versus F-Measure metrics for merging over all the eight
data sets in two test bed apartments (1 and 2) .................................................................................... 158
Table 5-20 Detailed variation of ΔCQI and ΔF-Measure for selected threshold values ................... 159
Table 5-21 Effect of the cluster merging techniques on reducing the number of clusters ................. 160
Table 5-22 Variation of the anomaly detection algorithms’ performance metrics on the data sets
from apartment 1 and apartment 2 ..................................................................................................... 166
Table 5-23 SI framework validation results in apartment 1 and apartment 2 using user interaction
requirement and labeling accuracy metrics ........................................................................................ 171
Table 6-1 List of appliances and their associated labels that were targeted in the experiments in
apartment 1 and apartment 2 test beds ............................................................................................... 185
Table 6-2 Performance of the SLM algorithm on the data collected from Apartment 1 for turn-on
events ................................................................................................................................................. 187
Table 6-3 Performance of the SLM algorithm on the data collected from Apartment 1 for turn-off
events ................................................................................................................................................. 189
Table 6-4 Performance of the SLM algorithm on the data collected from Apartment 3 for turn-on
events ................................................................................................................................................. 190
Table 6-5 Performance of the SLM algorithm on the data collected from Apartment 3 for turn-off
events ................................................................................................................................................. 191
Table 6-6 Performance of the SLM algorithm on the data collected from Apartment 1 measured
over up to 50 combinations of the signature subsets ......................................................................... 193
Table 6-7 Performance of the SLM algorithm on the data collected from Apartment 3 measured
over 50 combinations of the signature subsets................................................................................... 194
Acknowledgements
I would like to express my gratitude to my advisors, Professor Burcin Becerik-Gerber and Professor
Lucio Soibelman, for their encouragement, advice, and support throughout my studies at the University
of Southern California. Their mentorship provided me with a great learning experience in all aspects of
my academic endeavors. I would also like to thank all the members of my guidance and PhD defense
committees, Professor Sami Masri, Professor Alexander Sawchuk, Professor Mario Berges, and
Professor Viktor Prasanna, for their advice, recommendations, and support. Moreover, I express my
sincere appreciation to Dr. Michael Orosz from the USC Information Sciences Institute for his
mentorship and support during my studies.
I would also like to thank all my friends and colleagues in the Sonny Astani Department of Civil and
Environmental Engineering at USC, especially all the members of the Innovation in Integrated
Informatics Lab (iLAB), for their friendship and support. Thank you all for making this experience
so memorable.
Finally, and most of all, I would like to express my sincere gratitude to all my family members,
especially my parents, for their unconditional love and support throughout different stages of my life.
Without their support and encouragement I would never have made it through this journey.
Abstract
This dissertation focuses on enabling energy aware facilities through the application of low cost
advanced sensing systems using non-intrusive load monitoring (NILM) for buildings. Acquiring
disaggregated electricity consumption information at the individual load level enables energy aware
facilities and provides the ground for various applications, such as modification of energy-affecting
human behavior, load shedding and demand-response applications, fault detection, and activity
detection. As a low-cost alternative to appliance level sensing, NILM uses a few sensing nodes to
measure aggregated electricity data, coupled with specialized machine-learning and signal processing
algorithms to infer the operational schedules of individual loads. The majority of NILM solutions are
based on supervised learning algorithms and thus require training data, for which user interaction
with the NILM system is required due to the diversity of appliance technologies and signatures. The
need for training is one of the obstacles to wide commercial adoption of NILM systems.
Accordingly, this research focuses on a user-centric intelligent sensing system, in which NILM is
framed as a problem of human-computer interaction in the context of load disaggregation. The main
goal of the research is to facilitate the adoption of NILM systems through two objectives:
relaxing user-interaction requirements and time constraints during the training period while
preserving accuracy. This dissertation contributes a framework for smart interaction
in the form of a user-centric NILM system, through which clusters of appliance signatures are
autonomously populated for intelligent communication (versus a brute-force approach) with users.
The objective was to enable clustering of appliance transient signatures (represented in a multi-
scale, high dimensional feature space) consistent with the appliances' operational modes in the physical
domain. Accordingly, a customized cluster validation approach, which uses a combination of cluster
splitting and merging, was developed. As part of this approach, an unsupervised recursive
hierarchical clustering based heuristic algorithm was devised and comprehensively evaluated. This
autonomous clustering algorithm was coupled with unsupervised cluster merging and internal
validation to enable accurate feature space partitioning. The autonomous clustering module was
integrated with data acquisition and processing, event detection, classifier, and anomaly detector
components to achieve efficient interactions while keeping a high level of accuracy. An ensemble of
anomaly detection algorithms was introduced to balance the trade-off between the inlier
detection rate and accuracy metrics. The framework enables efficient interactions through reduced
user-system communication for labeling populated clusters of data. The evaluation of the smart
interaction framework showed more than a 94% reduction in the number of interactions, while
achieving labeling accuracy of more than 96% on average. Moreover, we have coupled this
framework with a "passive" training algorithm, which enables users to interact with the NILM system
under relaxed time constraints instead of reacting to the operational schedule of appliances. A
signature-label matching algorithm was presented to enable the NILM system to match signatures
with corresponding labels with a user reaction time relaxation of about 30 minutes. These algorithms
have been validated in three residential settings through the implementation of a user-centric real-time
prototype with a single sensing point at the main circuit panel. The research also contributes to
the field by being the first to quantify the training interaction and by providing fully
labeled data sets (acquired at high frequency) in three real residential settings, where user interaction
with the NILM system for training has been recorded.
Chapter One: Introduction and Motivation
1.1 Energy Consumption Facts
In recent decades, energy-environmental sustainability has been one of the main priorities of
global efforts. In the last two decades, global energy consumption and, consequently, CO2
emissions have grown dramatically, by 49% and 43%, respectively [1, 2]. Current predictions
show a global annual increasing trend of 2% in energy consumption and 1.8% in CO2 emissions
[1, 2]. Fossil fuels are the source of about 81% of primary energy consumption [1]. These
energy resources are limited and are major contributors to CO2 emissions, which are correlated
with the global temperature rise [3]. Therefore, ongoing global efforts have concentrated on the
reduction of energy consumption and CO2 emissions. Among all consumers, buildings are one of
the major players. At the global level, it is estimated that buildings consume 20% to 40% of annual
energy [2]. In the United States, commercial and residential buildings together account for about
41% of annual energy consumption; the building sector consumes more than the industrial and
transportation sectors [4]. Moreover, buildings are the major consumer of electricity in the
United States, consuming 71 percent of total annual electricity in 2011 [5] and expected to
account for 75 percent of annual electricity consumption in 2025 [6]. Therefore, management of
electricity consumption in buildings plays an important role in energy conservation efforts.
1.2 Energy Consumption Management
Considering the increasing trend in energy demand, efforts have focused on reducing energy
consumption and the dependency on fossil fuels as the major energy source. The majority of these
efforts focus on improving the efficiency of buildings through retrofitting building components
and improving building systems' efficiency. In a 2010 article from Johnson Controls Inc. in
GreenBiz [7], ten ways of reducing energy use in buildings were listed: 1) assessing building
energy consumption and waste, 2) using more energy efficient equipment, 3) matching HVAC and
lighting to occupancy, 4) maintaining equipment for optimum efficiency, 5) maximizing lighting
efficiency, 6) measuring water usage and waste, 7) scheduling cleaning during regular work hours,
8) insulating thoroughly, 9) meeting LEED standards, and 10) making building occupants more
informed. An overview of the recommended approaches shows that improving building
component/system efficiency, incorporating human dynamism in managing energy, and increasing
knowledge of energy consumption in buildings have been highlighted as potential mitigating
strategies. Accordingly, occupants' interaction with building systems and appliances is an
important aspect of energy management in buildings and energy conservation endeavors, and
building occupants play a central role in defining building energy consumption demands and
patterns.
From a building energy consumption perspective, occupants' energy-affecting behavioral traits in a
general context could embrace a variety of variables such as preferences, habits, and decisions in
interacting with building systems and appliances. Margins of normal consumption are correlated
with and affected by these energy-affecting behavioral traits. Technological developments over
the course of decades have changed occupants' expectations of convenience; as a result, occupants
have adopted preferences and habits that may satisfy their expectations but might not be
environmentally friendly [8]. Satisfying these expectations (habits and social conventions), now
considered conveniences, requires unsustainable, resource intensive operation of building systems
[8].
The growth of research studies [9-15] on integrating human factors in building systems' design
and operation processes indicates acknowledgment of the importance of these factors in
buildings' energy management. Part of the research effort has focused on integrating occupants'
behavioral parameters, such as dynamic occupancy schedules, equipment and lighting system
densities, and occupants' preferences, into the design and operation processes [9, 11-16]. On the
other hand, many research efforts have focused on modifying energy-affecting human behavior
with the objective of energy conservation [10, 17-19]. The core idea of these efforts is to use
different interventions that provide information about energy consumption patterns in
buildings and their association with user behavior. Examples of these interventions include
creating energy awareness by providing occupants with their detailed energy consumption
information, providing monetary incentives, comparing energy consumption between peers, and
so forth.
In addition to energy conservation objectives, demand response management programs pursue the
objective of managing electricity consumption during peak hours by reducing consumption or
shifting it to off-peak times. Although demand side management (DSM) does not necessarily
reduce energy consumption, it can alleviate the need for expanding distribution networks and
power plants, which contributes to the sustainability of the power generation and distribution
system as a whole. Therefore, demand response provides an opportunity for consumers to actively
participate in sustainable practices. DSM, which uses time-based rates or other forms of financial
incentives, also emphasizes the importance of human behavior modification in sustainable
practices in the energy domain. Smart grid technologies have been developed as drivers and
facilitators of DSM by introducing two-way communication between utilities and end users for
sensing and actuating. A smart grid is an electric network that records the actions of its users to
deliver sustainable, economic, and secure electricity supplies [20].
Many studies have shown that occupants are impacted by information feedback, social norms,
social influence, and other aspects of relaying information [10, 17, 18, 21]. Moreover, it is
assumed that human behavior can be modified by restructuring the flow of information [22].
Energy-affecting behavior modification centers on the concept of energy consumption awareness.
In the context of human-building interaction, energy awareness is the recognition of energy-
affecting behaviors [23] that creates a tangible sense of end users' energy consumption [24]. In
order for building occupants, owners, or facility managers to make informed choices relating to
energy, they need to understand its consumption as well as the potential consequences of
overconsumption or inefficient consumption. Therefore, detailed energy consumption information
is essential for both utility companies and end users in order to efficiently manage the
supply-demand chain. Providing detailed spatiotemporal energy consumption information also
facilitates informed decisions for future investments in distribution systems and reduces costs
[25]. Moreover, the availability of detailed energy consumption information enables more accurate
load forecasts for power system planning. More importantly, the availability of this information
provides the ground for distributing energy demand uniformly on the grid through load scheduling
and peak shifting. In addition to energy management technologies, the application of
disaggregated electricity consumption has been shown to be effective in activity detection for
applications such as elderly care [26].
1.3 Electricity Consumption Monitoring
As noted, the provision of energy information can support energy sustainability efforts. Research
studies show that consumers are ready to change when presented with appropriate information;
however, they still lack the tools to obtain detailed and personal energy consumption information
[27]. Accordingly, sensing electricity and determining patterns of electricity use is one of the
stepping stones in managing human interaction with the electricity infrastructure. The smart grid
partially addresses the need for sensing by integrating smart meters (the more advanced version of
AMR (automatic meter reading) technology). Smart meters are one of the main components of the
smart grid for obtaining energy consumption patterns. These meters can provide almost real-time
remote access to energy consumption information. Smart meters facilitate the application of
dynamic demand management strategies such as dynamic pricing. In addition, users can be
informed of their inefficient energy consumption patterns at the aggregate level so that they can
take action toward reducing energy consumption. In addition to utility meters, a variety of
commercial electricity monitoring devices have been developed to address the need for increased
energy consumption awareness. These devices include solutions for building/unit level metering
such as TED [28] or eMonitor [29] that enable monitoring with higher granularity at the circuit
level (e.g., floor level). However, aggregate electricity consumption can obscure the causality of
observations. In other words, energy consumption in aggregate form does not provide information
about the contribution of individual loads to total consumption. The increase in granularity of the
electricity information
could be achieved by commercial solutions that enable plug level monitoring such as "Kill-a-watt"
[30], "Watts up?" or "Watts up? Pro" [31], Greenwave Reality PowerNodes [32], and Enmetric
Powerports [33]. These technologies can provide electricity consumption for individual appliances
that are plugged into their sensors. Although these technologies could make plug level
sub-metering for individual appliances in a building possible, their application for energy
awareness objectives is not a feasible solution. The first consideration is the cost associated with
plug level meters. Considering the number of appliances in a typical household, plug level
metering could be costly and consequently not justifiable given typical energy expenditures. More
detailed analyses of the cost associated with plug level sensing can be found in [34]. Moreover,
not all individual loads can be monitored using plug level sensors, considering loads such as
lighting fixtures or hard-wired appliances. Hard-wired appliances are those connected to the
electricity distribution infrastructure without plugs and receptacles.
1.4 Energy Awareness Vision
Considering the importance of energy awareness in sustainability practices, the vision in this
dissertation is to enable human-building interaction by increasing occupants' knowledge about the
building's energy consumption patterns while accounting for the trade-off between cost and level
of information. In other words, the idea is to enable a cost-effective flow of information between
building occupants and the building energy management system that monitors the operation of
building components, namely building systems, equipment, and appliances. Once the foundation
is set to convey the information, various applications and studies could use it. At the core, these
applications include improving the efficiency of building operations through increasing building
occupants' energy awareness, providing operational information at the grid level, and providing
energy-conserving solutions for building occupants or facility managers. Examples of applications
that could be developed upon obtaining disaggregated electricity consumption include: (1) sending
alerts for violations of a projected energy budget, (2) providing management solutions in which
the most energy consuming appliances are identified, (3) identifying malfunctioning
equipment/appliances, (4) providing detailed monthly reports, (5) providing comparative analyses
between similar buildings, (6) assigning energy scores to different subspaces of a building in order
to leverage social competition for energy conservation, and (7) predicting activity patterns toward
autonomous management of electricity consumption (e.g., by controlling and minimizing standby
power consumption). Therefore, the goal of this dissertation is to enable a framework that
facilitates sensing of electricity consumption (the major part of energy consumption) in buildings,
taking into account the information-cost trade-off, to provide the ground for human-building
interaction.
Pursuing the aforementioned goal, we are interested in obtaining information with a granularity
higher than what a building/unit level power meter can provide, to gain more insight into the
contribution of individual loads. Yet, we intend to avoid sub-metering (i.e., the application of
sensors at the appliance level), which could be prohibitively expensive. Therefore, we want the
best of both worlds by considering the trade-off between cost and information level. Specifically,
the vision is to enable high granularity load monitoring by reducing the number of sensing points
in a building while enabling the disaggregation of individual loads' contributions to the aggregated
power time series.
In this dissertation, the realization of this vision is pursued by facilitating the application of
non-intrusive load monitoring (NILM). NILM, as an alternative to individual load level
sensing/monitoring, enables the disaggregation of electricity consumption using a few sensing
points at the circuit breaker level (ideally one point at the main circuit breaker panel for a
building unit) by capturing aggregated signal time series such as power metrics (through sensing
and processing raw waveforms such as current and voltage) and leveraging machine learning and
signal processing algorithms.
However, reducing the number of sensing points in NILM applications, to avoid extensive
hardware installation, shifts the challenge to the computational side: decomposing the aggregated
signal time series into its constituents (i.e., the time series associated with individual loads).
Specialized learning algorithms are used to determine the identity of the loads
(appliances/equipment) that are responsible for the observations (of changes) on the aggregated
signal time series. The behavior of the loads (determined by the components and technologies
used in different appliances) is leveraged in order to recognize the association between appliances
and the observations. Accordingly, the performance of the algorithms in recognizing the loads
depends on various factors, including the behavior of the loads, the algorithm itself, the
characteristics of the signal used to represent individual loads, and the learning process through
which the algorithms associate the observations with appliances in the physical domain. Despite
several research studies in the electricity consumption disaggregation domain, the application of
this technology as a commercial product has remained very limited. Although many factors could
contribute to this limited adoption, a crucial factor in the success of the technology is the
performance of disaggregation algorithms in recognizing appliance operations. Learning is an
essential process in driving the performance of the algorithms. In this dissertation, the realization
of electricity consumption disaggregation is explored from the point of view of the algorithms'
learning process.
1.5 Dissertation Organization
The remaining chapters of the document present the problem statement, research scope, proposed
solutions, and validations. Chapter 2 of the document describes the research background of
electricity consumption disaggregation, expands on the problem statement, and presents the
research scope in this dissertation including the objectives and the research questions. Chapter 3
presents the base electricity disaggregation framework, adopted in this dissertation, as part of the
research methodology, including the experimental set up and test bed descriptions. Chapters 4 to 6
present the proposed solutions to address the research questions and the validation results. Chapter
7 concludes the dissertation by summarizing the conclusions, discussions, and future directions of
the study.
Chapter Two: Research Background and Scope
Non-intrusive load monitoring has been introduced as an alternative to individual load level
monitoring for disaggregating electricity consumption patterns. Individual load level monitoring
relies on monitoring individual equipment/appliances using one sensor per consumption node
(e.g., an appliance or a light fixture) to capture the consumption patterns. The concept is known
as sub-metering or intrusive load monitoring, as it requires access to all appliances and their
wiring, while NILM approaches seek to decompose the aggregated power time series into its
contributing individual loads without physically accessing the wiring of the appliances.
2.1 Electricity Consumption Disaggregation Research Background
Research studies on non-intrusive load monitoring were originally introduced in the late 1980s by
Hart [35, 36]. Since the introduction of the NILM concept, many research efforts have been made
over the last two decades to improve NILM for electricity disaggregation. The majority of these
efforts have focused on event-based techniques. Generally, in these techniques, variations in the
sensed signals are detected as events, which represent the operational state transitions of
equipment and appliances. Appliance state transitions are defined as changes in the operational
modes of the appliances, e.g., a light bulb going from the off state to the on state. Upon detection,
each event has to be classified as a specific state transition of a piece of equipment or an
appliance. For this purpose, the signal characteristics in the proximity of events or between events
(the appliance's signature or fingerprint) are used as representative features, which are determined
through feature extraction processes. Once the state changes associated with the observations on
the aggregated power time series are recognized (by assigning a label, which associates the event
signature with the operational states of an appliance), the disaggregation of electricity
consumption for each individual appliance can be carried out through an energy estimating
algorithm, which follows the changes in the operational states of the appliances through appliance
models. As proposed in some studies [34, 36], appliance state models could be represented by
finite state machines, which capture the logical relationship between the different states of an
appliance. Figure 2-1 illustrates examples of appliance models for a multi-state lighting fixture (a)
and for a typical on/off appliance (b) such as a single light bulb, a toaster, or an electric kettle.
[Figure 2-1 diagram: (a) a four-state lighting fixture with states Off (0 Watts), Low (25 Watts),
Medium (50 Watts), and High (75 Watts), linked by +25 Watt transitions, plus a -75 Watt
transition from High back to Off; (b) an on/off appliance with states Off (0 Watts) and On
(1000 Watts), linked by +1000 and -1000 Watt transitions.]
Figure 2-1 Examples of finite state machines representing the appliance models; a) a light
fixture with four states, b) an on/off appliance such as an electric kettle (originally presented by
[36])
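A finite state machine appliance model of this kind can be sketched in Python. This is a minimal illustration of our own, not the implementation used in [36]; the class API (`ApplianceModel`, `step`, `power`) is invented for this sketch, while the states and wattages follow Figure 2-1-a:

```python
# Minimal finite-state-machine sketch of an appliance model (cf. Figure 2-1).
# States map to steady-state power draws; transitions are keyed by the power
# step (delta P, in Watts) observed on the aggregate signal.
class ApplianceModel:
    def __init__(self, name, states, transitions):
        self.name = name
        self.states = states            # state -> steady-state power draw (W)
        self.transitions = transitions  # (state, delta_W) -> next state
        self.current = "Off"

    def step(self, delta_w):
        """Apply an observed power step; return the new state, or None if no match."""
        nxt = self.transitions.get((self.current, delta_w))
        if nxt is not None:
            self.current = nxt
        return nxt

    def power(self):
        return self.states[self.current]

# Figure 2-1-a: a four-state lighting fixture.
fixture = ApplianceModel(
    "light fixture",
    states={"Off": 0, "Low": 25, "Medium": 50, "High": 75},
    transitions={
        ("Off", +25): "Low", ("Low", +25): "Medium",
        ("Medium", +25): "High", ("High", -75): "Off",
    },
)

fixture.step(+25)  # Off -> Low
fixture.step(+25)  # Low -> Medium
print(fixture.current, fixture.power())  # Medium 50
```

In an event-based NILM pipeline, a model like this constrains which appliance a detected power step can plausibly belong to, which is how appliance models help resolve overlaps in the signature space.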
Therefore, the performance of event-based NILM approaches is strongly correlated with the
performance of the algorithms in detecting the events (i.e., state changes) and labeling their
signatures through pattern recognition algorithms. Accordingly, the majority of research efforts in
the past years have focused on improving the performance of the pattern recognition algorithms
by exploring different feature extraction approaches and their associated means and methods.
Steady state power metrics were used in Hart's research [36]. Steady state power variation refers
to the condition in which an appliance's operational mode changes from one almost stable
condition to another, without considering the transients in between. By detecting the state
transitions using an edge detection algorithm, the step changes in real and reactive power were
considered as appliance signatures, which describe the possible space of state transitions. The
signature space was represented as two dimensional clusters, in which negative and positive
clusters of similar magnitude were paired to represent one operational cycle. Appliance
operational models (the sequences of possible state transitions for a specific appliance, as
illustrated in Figure 2-1) were used to avoid possible overlaps in the signature space. Following
Hart's research, different researchers proposed variations of NILM approaches by introducing
different hardware and software setups, which are reviewed in the following sections.
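The steady state edge detection idea can be sketched as follows. This is a simplified illustration with synthetic data and an arbitrary threshold; Hart's actual algorithm [36] is more elaborate:

```python
# Simplified steady-state edge detector: flag samples where the real power
# changes by more than a threshold, and record the step size (delta P) as the
# event signature. The 20 W threshold and the toy data are illustrative only.
def detect_edges(power, threshold=20.0):
    """Return (index, delta_P) pairs where consecutive samples differ by more
    than `threshold` Watts."""
    events = []
    for i in range(1, len(power)):
        delta = power[i] - power[i - 1]
        if abs(delta) > threshold:
            events.append((i, delta))
    return events

# Toy aggregate real-power series (W): a ~100 W load turns on, then off.
series = [50, 50, 51, 150, 151, 150, 49, 50]
print(detect_edges(series))  # [(3, 99), (6, -101)]
```

Pairing the positive event (+99 W) with the negative event of similar magnitude (-101 W) recovers one operational cycle, which is the cluster-pairing idea described above.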
2.1.1 Data Acquisition Systems
Different data acquisition systems with low and high frequency sampling capability have been
used in research studies. Most of the low frequency systems use a 1 Hz sampling frequency, such
as The Energy Detective (TED) [28], Watts Up? Pro [31], and Google Power Meter [37]. Since the
fundamental frequency is either 50 or 60 Hz, using low frequency data acquisition systems
results in relatively coarse signal features for the pattern recognition algorithms. However, more
information can be captured using higher frequency data acquisition systems, which provide
microscopic features of the electricity signal. By increasing the sampling frequency of the data
acquisition system, the feature extraction algorithms can benefit from the harmonic content of
the waveforms, as well as the transients between steady states, which provide details of the
variation in the electric signal compared to low frequency data acquisition systems. In general, a
review of the research studies reveals that increasing the detail of the electric signal features, as
expected, improves the performance of the pattern recognition algorithms.
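The resolution loss that motivates higher frequency acquisition can be illustrated with a toy sketch. The data are synthetic, and the six-sample blocks stand in for averaging a full second of high-rate samples; real systems typically compute RMS power per line cycle rather than a plain average:

```python
# Illustration of resolution loss: a brief inrush transient that is visible at
# high resolution disappears when the series is averaged down to a coarser rate.
def downsample(series, factor):
    """Average consecutive blocks of `factor` samples (a coarse, 1 Hz-style view)."""
    return [sum(series[i:i + factor]) / factor
            for i in range(0, len(series) - factor + 1, factor)]

# High-rate series (W): steady at 100 W, a brief inrush transient peaking at
# 400 W, then a new steady state at 200 W.
high_rate = [100] * 6 + [400, 300, 200, 200, 200, 200] + [200] * 6

low_rate = downsample(high_rate, 6)
print(low_rate)  # [100.0, 250.0, 200.0] -- the 400 W transient peak is gone
```

The steady-state step (100 W to 200 W) survives down-sampling, but the transient shape, which is what transient-state feature extraction exploits, does not.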
2.1.2 Feature Extraction Approaches
An overview of the research studies in the field of NILM shows that many research efforts have
focused on feature extraction [38]. Features are the signal characteristics that represent the
behavior of loads. Therefore, the selection of features plays an important role in the performance
of the algorithms in recognizing appliance state transitions. Using different methods of data
acquisition and data processing, various types of features have been taken into account in
different studies. Considering the main body of NILM research, feature selection approaches can
be categorized as steady state, transient state, or a combination of the two. As described above, in
transient state based approaches, the information contained in the transient states of the power
system (between two steady states) is taken into account.
Figure 2-2 illustrates the difference in power time series resolutions. Figure 2-2-a shows a
segment of the (fundamental component) real power time series at 60 Hz resolution, containing
three events. The steady state and transient sections of the time series are highlighted in this
figure. Figure 2-2-b presents the same segment down-sampled to 1 Hz. As can be seen, reducing
the resolution of the power metrics can eliminate the information contained in the transitions
between steady states, which explains the rationale behind using different data acquisition
techniques. This difference is emphasized in Figure 2-2-c and d by illustrating the power time
series in the proximity of the second event. The signatures were extracted for five seconds before
and five seconds after the event.
Figure 2-2 Samples of the real power time series at different resolutions: a) a segment of the real power time series at 60 Hz; b) the same segment of the power time series down-sampled to 1 Hz; c) a sub-segment of the power time series in (a), highlighted by the dashed line, five seconds before and five seconds after the event; d) a sub-segment of the power time series in (b), highlighted by the dash-dot line, five seconds before and five seconds after the event
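To make the resolution effect concrete, the following minimal Python sketch down-samples a 60 Hz series to 1 Hz by window averaging; the synthetic power series, window length, and power levels are illustrative assumptions, not measured data.

```python
# Sketch: down-sampling a 60 Hz real-power series to 1 Hz by averaging
# each one-second window, illustrating how transient detail is lost.

def downsample(series, window=60):
    """Average consecutive windows of `window` samples (60 Hz -> 1 Hz)."""
    return [sum(series[i:i + window]) / window
            for i in range(0, len(series) - window + 1, window)]

# Synthetic signal: 1 s at 100 W, a 0.5 s transient at 450 W,
# then the new 300 W steady state.
signal = [100.0] * 60 + [450.0] * 30 + [300.0] * 30 + [300.0] * 60

print(downsample(signal))  # -> [100.0, 375.0, 300.0]
```

At 60 Hz the 450 W transient is clearly visible, whereas at 1 Hz its shape is smeared into a single intermediate average, which is why transient-state features require higher-resolution acquisition.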
Real power (P) variations (ΔP, calculated as the difference between the steady-state segments of the power time series before and after an event) can be considered the most intuitive feature for load disaggregation. However, due to the potential similarity of signatures across appliance state transitions, this approach could result in confusion of the algorithms in
disaggregating electricity consumption. It is also sensitive to concurrency of state transitions: in that case, the combined power change of multiple appliances is interpreted as a change for a single appliance. The addition of other types of features has been considered as a solution for increasing the performance of the pattern recognition algorithms. As stated above, the reactive power metric was used in Hart's research [36] to improve the performance of the NILM system for cases where appliances with similar power draw are fed by the same circuit. Reactive power is produced when an inductive or capacitive load is present in addition to a resistive load, causing a phase shift between the current and voltage waveforms. Since then, many studies have used different combinations of steady-state features to improve the performance of the algorithms. Real power, reactive power, power factor, RMS current, and RMS voltage are among the features used in some of the studies [39-41]. In addition, rule-based systems [42, 43], as well as the addition of the harmonic content of the current waveform [44-46], are other approaches that were used to enhance the application of steady-state features.
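As an illustration of the simplest steady-state signature described above, the sketch below computes a (ΔP, ΔQ) feature vector around a detected event; the window length, event index, and synthetic series are illustrative assumptions rather than any cited method's parameters.

```python
# Sketch: extracting a steady-state (delta-P, delta-Q) signature as the
# difference of mean real/reactive power before vs. after an event.

def mean(xs):
    return sum(xs) / len(xs)

def event_signature(p, q, event_idx, window=5):
    """(delta-P, delta-Q) computed over `window` samples on each side."""
    pre = slice(event_idx - window, event_idx)
    post = slice(event_idx + 1, event_idx + 1 + window)
    return (mean(p[post]) - mean(p[pre]),   # delta-P (W)
            mean(q[post]) - mean(q[pre]))   # delta-Q (VAR)

# A 150 W resistive load (no reactive component) turns on at sample 10.
p = [100.0] * 10 + [250.0] * 10
q = [5.0] * 20
print(event_signature(p, q, 10))  # -> (150.0, 0.0)
```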
Integration of the information contained in the transients has been used in some of the studies to
improve the performance of the algorithms. The information in transients mainly stems from the
load that is producing it. In fact, this information is an indicator of the physical behavior of the
appliance that produces it. The transient information affects the current draw related to a specific
appliance state transition, which in turn is reflected in the power signal [34]. Using transient features requires more advanced event detectors compared to systems that use steady-state features. Accordingly, a number of research studies have focused on developing transient event detectors [47-51]. Among transient-state features, the shape of the real and reactive power during appliance state transitions was used in [52, 53] as a feature for pattern recognition and was shown to be effective in improving the performance of the algorithms. As the third category, a number of studies combined the steady-state and transient approaches with different degrees of resolution for the transient features; examples include [54, 55].
A different approach to non-intrusive load monitoring was introduced by using the generated
electric noise as features for event detection and classification [56, 57]. In [56], Patel et al. used the unique voltage transient noise that is generated by the operation of electromechanical components in the circuit (i.e., switching appliances on/off) and can be sensed at a single outlet. The FFT of the noise was used as the feature for pattern recognition purposes. However, in [57], Gupta et al. (from the same research group) pointed to the need for training at each new installation, the computational complexity, and the dependency of the transient features on wiring as challenges of transient noise features, and introduced the application of steady-state voltage noise features. These features are extracted from the continuous electromagnetic interference (EMI) generated by appliances' switch-mode power supplies. Many modern appliances, which include electronic components as control mechanisms, generate high-frequency EMI. Their approach requires very high sampling frequencies, up to 500 kHz, in order to capture the EMI signal.
A number of other feature extraction approaches were also introduced including the application of
wavelet transform instead of FFT [58], application of the harmonic content of the current
waveform [44-46], the application of geometrical properties of I-V curves [59], and the use of raw waveforms as features [60]. Non-electrical data sources were also used as features or for indirect measurement of appliances' power consumption [61, 62].
2.1.3 Pattern Recognition Algorithms
In general, NILM approaches can be categorized into two major classes: event-based and non-event-based. As noted, in the event-based category, which includes the majority of the studies, state transitions (i.e., events) and their associated features or signatures, either around the events or for segments of the signal time series between the events, are treated individually for pattern recognition through a supervised learning problem. Classification algorithms have been used as the main category of supervised learning in event-based approaches. Nearest neighbor
[53, 57], neural networks [63, 64], and Bayes classifiers [53] are examples of the algorithms that
have been explored in a number of studies.
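A minimal sketch of the event-based classification idea, using a 1-nearest-neighbor rule over (ΔP, ΔQ) signatures; the signature library, labels, and query below are made-up illustrations, not data from the cited studies.

```python
# Sketch: 1-nearest-neighbor classification of an event signature against
# a labeled training library of (delta-P, delta-Q) vectors.
import math

def nearest_label(library, signature):
    """Return the label of the closest library signature (Euclidean)."""
    return min(library, key=lambda item: math.dist(item[0], signature))[1]

library = [
    ((150.0, 0.0), "kettle on"),
    ((-150.0, 0.0), "kettle off"),
    ((120.0, 60.0), "fridge compressor on"),
]
print(nearest_label(library, (118.0, 55.0)))  # -> fridge compressor on
```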
In the second category of approaches (i.e., non-event-based approaches), the time series of
operational state transitions of appliances (the power time series) is considered as a whole and the
disaggregation is defined as an optimization problem to determine the most probable sequence
[60, 65, 66] of appliances by using models of appliances. The appliances’ models are usually
defined as state machines as presented in Figure 2-1. In recent years, a number of studies have
been working on developing non-event based algorithms based on hidden Markov models [67-
69]. These non-event-based approaches usually use power time series with a low resolution of 1 Hz. The application of different unsupervised inference algorithms has been explored by Kim et al. [70] using different variations of hidden Markov models and different features, including contextual features such as when and how appliances have been used. They have shown that
conditional factorial hidden semi-Markov models and the contextual features outperform other
variations of the models. In their research, the appliances’ models were trained using data
collected from residential units and then the updated models were used in the inference process.
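The non-event-based view can be illustrated with a toy Viterbi decoder that recovers the most probable on/off sequence of a single two-state appliance from a 1 Hz power series; the power levels and cost weights are illustrative assumptions, and this is far simpler than the factorial hidden Markov models in the cited studies.

```python
# Sketch: Viterbi decoding of a single two-state (off/on) appliance model
# from a 1 Hz power series, treating disaggregation as finding the
# lowest-cost state sequence (absolute power mismatch + switching cost).

def viterbi_two_state(power, off_w=0.0, on_w=200.0, switch_cost=50.0):
    states = [off_w, on_w]
    # cost[s]: best accumulated cost ending in state s; path[s]: its sequence
    cost = [abs(power[0] - w) for w in states]
    path = [[0], [1]]
    for obs in power[1:]:
        new_cost, new_path = [], []
        for s, w in enumerate(states):
            # either stay in state s, or switch from the other state
            cands = [(cost[s], path[s]),
                     (cost[1 - s] + switch_cost, path[1 - s])]
            c, p = min(cands)
            new_cost.append(c + abs(obs - w))
            new_path.append(p + [s])
        cost, path = new_cost, new_path
    return path[cost.index(min(cost))]

series = [3.0, 1.0, 198.0, 205.0, 202.0, 4.0]
print(viterbi_two_state(series))  # -> [0, 0, 1, 1, 1, 0]
```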
Kolter et al. [68] proposed an approximate inference approach for additive factorial hidden Markov models to avoid local optima in separating the aggregated power time series. The appliance models in these approaches require knowledge of the appliances' different states and the power draw associated with each state (refer to Figure 2-1 for examples). Generating appliance models is a challenging task, which could limit the success of these approaches in commercial applications. Parson et al. [69] proposed an approach for using prior models of appliances that are trained using the aggregated power time series in a household. In their approach, the prior models of appliances are updated through a training process, which includes isolating an appliance's operation to update the model. In general, due to the diversity of appliances and their technologies, developing generalized appliance models that could alleviate the training process is a complex undertaking. The results of the aforementioned studies show that, with state-of-the-art non-event-based methods, the application of predefined appliance models does not eliminate the need for training, and further exploration is required. Accordingly, in this dissertation, the efforts have been focused on event-based electricity consumption disaggregation methods, in which individual state transitions of appliances are taken into account.
2.2 Problem Statement
As elaborated above, the majority of NILM research efforts in the last two decades were focused on the development of algorithms and feature extraction methods that could improve the performance of the algorithms in recognizing appliances' states. Improvement of event detection, along with the exploration and introduction of new feature extraction methods coupled with various classification algorithms for event-based approaches, comprises the main streams in non-intrusive electricity
disaggregation research. Despite all the improvements on the algorithm side, and although industry stakeholders have shown interest and invested in commercialization of NILM systems, the technology has not been widely adopted as a commercial product for residential use. One of the available commercial applications that was developed based on Hart's research [36] is the one from Enetics [71], which is mainly used for offline monitoring and auditing purposes. Since the concept of NILM emphasizes single-sensor data acquisition, the performance of the signal processing and machine learning algorithms is the bottleneck of the NILM approach. The algorithms that have been used so far for state transition recognition are commonly supervised machine learning algorithms, which depend on a collection of training data.
The training data set is required to have labels for the different states (or state transitions) of all the appliances in the building where the system has been deployed. For example, the training data set needs to have examples of labeled signatures for the refrigerator compressor turn-on event as well as the turn-off event. Moreover, the refrigerator's operational modes could include other state transitions, such as the defrosting turn-on and turn-off transitions, as well as the turn-on and turn-off events of the lighting fixture in its compartment.
However, due to the diversity of technologies used by different appliances and manufacturers, as well as the various sources of noise in the data acquisition process, in most cases in-situ training has to be carried out upon installation of a NILM system. Training is required for the provision of labeled information for events. In its simplest form, training could be carried out by triggering events (i.e., changing appliances' operational states), such as turning an appliance on or off, and labeling the captured change associated with the event on the aggregated signal time series. The process of training needs to be carried out by the end-users or by trained technicians. In other words, users need to associate the signatures of the aggregated time series
with the observations in the environment by providing labels for the signatures. This training process and its associated challenges, given the complexity of handling the different appliances in a household, could be one of the major barriers to wide adoption of the technology. As a result, users may not be interested in using the product, or the training process may remain incomplete. Therefore, the success of NILM as a practical tool for energy disaggregation partially depends on facilitating the interaction between the system and its users. Regardless of the features and algorithms that are used, a high-quality training process is required in order to provide the mapping between disaggregated electricity consumption and individual appliances and to achieve high performance in disaggregation.
Accordingly, in this dissertation, electricity consumption disaggregation is tackled as a human-
computer interaction problem with the objective of facilitating the training process to increase
user convenience while maintaining the quality of the training process. Thus, the core objective
of this dissertation is to explore solutions that relax the constraints of the training process. These constraints include the number of times that the NILM system calls for interaction and the response time (the time within which a user needs to respond to a call from the load monitoring system) required for user-system interaction. More explanation of the training process and the associated research background is presented in the following sections.
2.2.1 A Closer Look at Appliances from a Training Point of View
Although the framework for training could be used for different types of buildings, the focus of
this study is on residential buildings. This is due to the diversity of appliances in residential
settings, the complexity in appliance operations (e.g., a number of appliances go through multiple
modes of operation), and the fact that users in residential settings will be involved in the training process as part of their daily routine, so convenience plays an important role. To provide more insight into the dimensions of the training problem, this section presents the diversity of appliances and their operational modes.
[Figure 2-3-a panels: kitchen appliance, HVAC, and electronic equipment electricity consumption in U.S. households, 2001]
Figure 2-3 a) Distribution of electricity consumption in the U.S households for the major
consumers; EIA, Residential Energy Consumption Survey (RECS), 2001[72], b) the variation of
the energy consumption in residential sector for the last two decades [73]
Figure 2-3 shows the highest-consuming appliances and their contribution to energy consumption in U.S. households [72], along with the change in the consumption distribution [73]. The contribution of an appliance to electricity consumption depends on its power draw (in different modes of operation) and its operational duration. According to the U.S. Energy Information Administration [74], the appliances that contribute about 80 percent of household electricity consumption are listed in Table 2-1.
Table 2-1 Appliances contributing about 80% of household electricity consumption

Kitchen:          Refrigerator, Freezer, Coffee Maker, Electric Stoves (Range and Oven), Microwave, Dishwasher
Laundry:          Washer, Dryer
Electronics:      TV, Home Theater, Computer, Printer
Building Systems: Air Conditioner, Space Heater, Furnace Fans, Water Heaters, Pool Pump
Illumination:     Lighting
This list of appliances is representative of a typical household; it could differ given the diversity of lifestyles in different parts of the country. Considering the appliances on this list, their different operational modes are described to determine the training challenges from a user interaction perspective. In general, appliances can be categorized into three groups in terms of operational modes [36, 38]: 1) appliances that have two distinct states, namely off-on and on-off state transitions, such as lighting systems with Boolean switches, refrigerators with no defrosting module, and water pumps; 2) appliances that transition through multiple definite switch states, with a complete switching cycle repeated frequently; examples include refrigerators with an anti-frosting module, dishwashers, washing machines, and lighting fixtures with multiple switches for different levels of lighting; for instance, a washing machine, depending on its make and model, might have various operational modes in one cycle, including fill, empty, wash (agitate), rinse, spin, and heat (many washing machines work with cold water and heat the water to different degrees during the washing process); and 3) appliances that have a varying power draw each time they operate, with no specific periodic pattern; lighting systems with dimmers are an example.
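The second category can be viewed as a finite state machine that cycles through definite states; the washer state names and power levels in the sketch below are made-up illustrations, not measured values.

```python
# Sketch: a category-2 appliance modeled as a state machine that cycles
# through a fixed sequence of definite (state, power draw) pairs.

WASHER_CYCLE = [("fill", 30.0), ("wash", 500.0), ("rinse", 450.0),
                ("spin", 300.0), ("off", 0.0)]

def step(state_idx):
    """Advance the washer to the next definite state in its cycle."""
    return (state_idx + 1) % len(WASHER_CYCLE)

idx, trace = 0, []
for _ in range(len(WASHER_CYCLE)):
    name, watts = WASHER_CYCLE[idx]
    trace.append(name)
    idx = step(idx)
print(trace)  # -> ['fill', 'wash', 'rinse', 'spin', 'off']
```

Each transition between consecutive states produces one event on the aggregated signal, which is why such appliances contribute many distinct signatures to the training problem.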
From a user interaction point of view, some of these state transitions cannot be detected by typical users. As a closer example, the operational state transitions of refrigerators and freezers are typically triggered by the compressor module, the light in the refrigerator, and, in some cases, automated defrosting heating elements for the freezer compartment. Typically, the refrigerator/freezer appliance class involves six state transitions: compressor off-on/on-off, defrosting off-on/on-off, and off-on/on-off for the lights in the refrigerator and freezer compartments. The compressor and defrosting modules (in appliances that have this option) are set to operate in cycles automatically. Therefore, except for the lights in each compartment, the other state transitions cannot be triggered by a user for training the system. In generalized form, some appliances with only two state transitions (on-off and off-on) can be operated by users for training purposes; however, a number of other appliances have state transitions that are set to operate automatically, and it is therefore difficult for users to trigger them for training purposes. More importantly, a closer look at the number of loads (including appliances and lights) in a residential setting and the number of states for those appliances shows the scale of the training process and its challenges for the end-user. Considering these challenges, this study seeks approaches that could facilitate the deployment of electricity consumption disaggregation technology through the assessment of feasible approaches to technology-user interaction.
2.2.2 Feasibility of Pre-Installation Training
Considering the challenges associated with the training process, the first question that comes to
mind is whether it is possible to train the algorithms prior to installation of the electricity
consumption disaggregation system. One approach in achieving this objective is through training
across buildings, in which signatures from different installations are shared. The feasibility of
using the signatures from other buildings depends on many factors including the differences in the
manufacturers’ technology (and consequently the differences in load behavior), the combination
of loads that exist in that environment, the feature extraction approach, and the pattern recognition
algorithms.
As mentioned in the background section, the appliances' signatures depend on the hardware and software systems that are used for feature extraction and pattern recognition. Depending on the hardware set-up and the feature extraction approach, these feature vectors could contain data ranging from macroscopic power metrics, such as changes in real and reactive power from low-resolution power time series, to microscopic metrics, such as transient shape and the harmonic content of the current and voltage signals from high-resolution signal time series. Therefore, the data collection approach, its related equipment, and the signal processing methods could cause variation of the signatures across different settings and installations, which in turn calls for standardization of the data acquisition systems or for devising data analytics approaches that account for the variations in signatures from different data acquisition and processing technologies.
On the other hand, given that the signatures are captured with similar technology, other factors need to be taken into account. In one respect, differences in appliances' technologies result in differences in signatures. Even the signatures of two appliances with the same product
model and the same manufacturer could be different. This could be observed due to the fact that
manufacturers might use different vendors for different production batches. This diversity in the
technology and components results in variations in appliances’ signatures from the same class. In
another aspect, the performance of the pattern recognition algorithms depends on the structure of
the signature space. For example, in a building unit, where there are a number of appliances with
lower power draws (e.g. single light bulbs, LCD monitors, personal computer) along with one
appliance with larger power draw (e.g., a large resistive load such as an electric kettle) on the
same phase, detecting the larger load using a few examples from other buildings could be feasible. This detection could be realized not because the training data is sufficient, but because the training samples are more similar to the larger load's signature than to the other loads'. However, the same does not hold for loads with lower power draws in this example, due to the
multi-scale nature of the signature space and thus potential similarity of the signatures in the
smaller scales. The differences in noise level could compound the challenge since the presence of
the noise in one environment might alter the signatures of a certain class of appliances. Therefore,
cross-building training (by using the data from other buildings) could be highly context
dependent. Although the feasibility of cross-building training data provision could be explored by
using sample signatures from individual appliances (in an experimental set up by collecting data
from individual appliance or in a real world scenario, where the appliances are operating in
buildings in combination with other appliances), statistically significant conclusions on the feasibility of the approach call for a large-scale field experiment. Therefore, in this study, we focus on in-situ training solutions and explore the training of electricity disaggregation systems from a system-user interaction perspective, assuming that user involvement in training the
NILM system is required. Facilitating the training process could also enable large scale data
collection, which in turn could be used for exploring the feasibility of pre-installation training.
2.2.3 Research Background in NILM Training
As reviewed in the background section (Section 2.1), research studies in the last two decades have proposed different approaches to address the research questions in the field of NILM, and all of these studies (specifically those on event-based approaches) assume that training is a required part of the system that needs to be carried out at a certain stage. However, in the majority of these cases, the training requirements of the algorithms and methodologies remained undiscussed.
As mentioned, Gupta et al. [57] proposed an electricity consumption disaggregation approach based on the electromagnetic interference (EMI) noise that can be observed due to the mechanisms of SMPS devices. Modern appliances with electronic components generate high-frequency EMI, which can be captured using high sampling frequencies of about 500 kHz and higher. They argued that these features are transferable for a number of electronic appliances across different residential settings [57], and tested this argument by assessing a number of appliances in different residential settings, showing that those appliances exhibit the same signatures regardless of the electricity distribution infrastructure. However, this approach can be used only for appliances that are equipped with SMPS components for voltage regulation. Furthermore, the challenges related to variation in appliances' manufacturing technologies (which could result in variation of the signatures within the same appliance class) and the assessment of the temporal stability of the signatures are yet to be explored.
Moreover, in recent years, a number of studies have explored the training of NILM algorithms regardless of the feature extraction methods. These studies mainly focused on assisted training, in which non-electricity sensing systems are used to provide the NILM system with information about appliances' operational states. In one of these methods, a contactless electromagnetic sensor system for training on appliance state transitions was proposed and evaluated in a number of studies [75, 76]. The proposed sensor system uses fluctuations in the electromagnetic field (EMF) in the proximity of appliance wiring to infer changes in the power draw. The feasibility of this contactless sensor system for automated training of NILM
systems has been studied by Giri and Berges [77]. In another study, Kim et al. [78] used a
network of sensors for measuring non-electricity signals from appliances in addition to the
aggregate power consumption and proposed an optimization approach to autonomously calibrate
and disaggregate the electricity consumption. In their approach, magnetic sensors, light intensity
sensors, and temperature sensors were used, and each appliance required a sensor in its vicinity.
Similarly, a combination of non-electricity sensors for sensing the heat, vibration, sound, and light generated by appliances, as well as current variations, in combination with a calibration method, was used by Schoofs et al. [79] for automated electricity data annotation.
Following the same direction of research, Taysi et al. [80] explored the application of acoustic sensors for generating device-level power consumption reports based on acoustic signatures. Although this study did not directly include NILM training among its objectives, the approach could be extended for signature annotation purposes. These approaches provide the information for appliance signature labeling; however, they are still based on the extensive use of appliance-level sensors, which requires a certain level of expertise in deployment and processing. In another direction of these efforts, Berges et al. have proposed a user-centered NILM framework, which facilitates training by triggering user interaction through the application of event detection and
classification algorithms. In this approach, users are called to interact with the NILM system to label appliance state transitions as they are detected during the regular operation of the appliances. This approach helps users associate the changes in the environment with appliance state transitions and, more importantly, enables users to label the states that cannot be triggered by users themselves.
2.3 Research Objectives and Scope
Immature technology, its cost, the lack of proper cost-effective computing devices, and the need for user interaction can be enumerated as potential barriers to adoption of NILM technology as a commercial product over the past two decades. Taking into account the technological developments made on the computational side of the technology over these years, as well as the enhancement of algorithms for data processing and pattern recognition, this dissertation hypothesizes that it is possible to facilitate user interaction in the training process of electricity consumption disaggregation systems and to achieve an acceptable level of performance for the classification algorithms while reducing the constraints on user interaction. In this dissertation, user-centric smart sensing is considered as the base approach, in which end-users provide the labels for training over a period of time to ensure that the majority of the appliances' state transitions are covered. In this process, when events are detected, the NILM system calls users for interaction. Accordingly, the objectives of this dissertation include reducing the number of times that a user is called to interact with the NILM system to provide training labels, and relaxing the time constraint within which a user needs to respond to calls from the NILM system for training. These objectives are elaborated as follows.
2.3.1 Reducing Interaction Requirement in Active Training
In the context of this dissertation, the training process in which users provide labels for the signatures of appliances' state transitions is called active training, since users need to actively follow the events and respond in a timely manner. Accordingly, whenever a new event (i.e., a potential appliance state transition) is detected, the NILM system classifies it using its signature library. If the signature library is empty, the user is asked to provide the unknown label. Moreover, even when a label is found for a state transition, user interaction is required to add the labeled signature to the library with a high confidence level, until all the state transitions are labeled. Therefore, user interaction continues until all the appliances have been used and users complete the training process. During this process, every detected event (even false positives that do not correspond to an event of interest) results in a call to the user; considering the number of appliances and the different states associated with them (as described above), a considerable number of calls is required during the training process. Accordingly, the first objective of this dissertation is to alleviate the active training requirements by reducing the number of calls that a user receives during the training process. By shifting the training process to the computing side through anonymous (numeric) labeling, the need for numerous calls to users can be alleviated. Anonymous (numeric) labeling is the process of automatically populating the signature library using the detected events for each appliance class and assigning numeric labels to the feature vectors of each class. Once the signature library is populated and the system is confident that all feature vectors associated with each state transition are correctly labeled, the system sends a message to the user asking for the actual label associated with each group of state transitions.
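A minimal sketch of the anonymous (numeric) labeling idea: each incoming signature is merged into the nearest existing cluster or opens a new one, so that only the resulting groups need to be presented to the user for naming. The distance threshold and the signatures are illustrative assumptions.

```python
# Sketch: online grouping of event signatures under automatic numeric
# labels; the user is asked for a real name only once per group.
import math

def numeric_label(library, signature, threshold=30.0):
    """Assign the signature to an existing numeric cluster or open a new one."""
    for label, members in library.items():
        centroid = [sum(c) / len(members) for c in zip(*members)]
        if math.dist(centroid, signature) < threshold:
            members.append(signature)
            return label
    new_label = len(library)
    library[new_label] = [signature]
    return new_label

library = {}
events = [(150.0, 0.0), (148.0, 2.0), (-150.0, -1.0), (151.0, 1.0)]
labels = [numeric_label(library, e) for e in events]
print(labels)  # -> [0, 0, 1, 0]
```

Here four detected events collapse into two numeric clusters, so the user would be asked for only two names instead of four labels.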
Although anonymous labeling is intended to reduce user-system interaction, the populated signature library generated during the anonymous labeling process could also be used for other applications, such as inferring the behavior of certain loads. Specifically, some state transitions could be autonomously labeled by using the operational characteristics of the appliances, which could further reduce the need for user interaction. Autonomous labeling of this category of appliances could be achieved by considering the fact that the majority of appliances with automated schedules for state changes follow a periodic operational pattern. This periodic pattern could be used in addition to other rule-based information, such as the power draw range and the characteristics of the transients, for automated labeling. This approach could not only facilitate the training process by reducing the number of interactions, but also enable labeling the signatures of those appliance state transitions that cannot be triggered or observed by users. For example, a refrigerator's compressor and defrost modules generate periodic events that are difficult for users to relate to the physical domain.
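The periodicity cue can be sketched as a simple check on inter-event intervals within one numeric cluster; the timestamps (in seconds) and the tolerance are illustrative assumptions.

```python
# Sketch: flagging a cluster of events as an automatically cycling load
# (e.g., a compressor) when its inter-event intervals are nearly constant.

def is_periodic(timestamps, tolerance=0.1):
    """True if consecutive inter-event gaps deviate from their mean by
    less than `tolerance` (relative), suggesting an automated cycle."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return False
    mean_gap = sum(gaps) / len(gaps)
    return all(abs(g - mean_gap) / mean_gap < tolerance for g in gaps)

compressor_on = [0, 1805, 3598, 5410]   # roughly every 30 minutes
light_switch = [120, 400, 7200, 7300]   # user-driven, irregular
print(is_periodic(compressor_on), is_periodic(light_switch))  # -> True False
```

In practice such a periodicity test would be combined with the power draw range and transient characteristics mentioned above before a cluster is labeled autonomously.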
2.3.2 Relaxing Response Time Constraint
As noted, active training requires an immediate user response to calls from the NILM system when events are observed. If the user does not provide a label in a timely manner, other signatures might replace the pending one in the labeling queue. These signatures come either from other appliances' state transitions or from false positive events. For example, if a user enters the bathroom to wash their hands (assuming that the user turns the light on and then off), the user needs to label the turn-on event before the turn-off event is detected; otherwise, the turn-on event is replaced in the queue. Moreover, if multiple appliances are in operation at the same time, the events from these appliances might be confused with each other. Anonymous labeling prior to the active training stage could partially address this problem by calling the user only for those events that are not yet labeled. However, in addition to the above-mentioned challenges, the success of the training process could potentially be increased if users provide the labels for appliances' signatures indirectly, rather than as an obligation. For example, the training could be carried out through certain types of games, in which users provide information for training while playing. However, prior to achieving this goal, there are challenges that need to be addressed, one of the major ones being the time dependency of the labels. In other words, if labels are not provided immediately after the events, the association between signatures and labels will be distorted, which may result in low performance of the algorithms. Accordingly, this aspect of the research focuses on reducing the time dependency of the user input to facilitate indirect training. Therefore, the objective is to receive generalized information about appliances' operation (as opposed to exact information about the state changes) from users under relaxed constraints and still enable labeling. Examples of such generalized information include: an appliance was turned on and then off (i.e., completed a cycle of operation); an appliance has been started; or an appliance does not exist in a specific setting. By reducing the constraints in the physical domain, the challenges are again shifted to the computational domain.
2.3.3 Research Questions
Based on the aforementioned objectives, the following research questions are explored:
1. How could appliance state transition signatures be autonomously partitioned into
clusters? Which feature combinations could improve the performance of the partitioning process?
How can the quality of the clusters in representing appliance state transitions be evaluated?
2. How could the training process be improved to reduce the number of interactions by
leveraging the pre-populated clusters of signatures while preserving accuracy?
3. How could the response time constraint be relaxed, and for which appliances is this approach
feasible?
2.4 Research Methodology
To address the research objectives and questions, the following chapters of this dissertation
first describe the event-based appliance state transition detection approach, which is used as the
baseline in our experiments. Each specific research objective/question is then addressed in the
chapters that follow the baseline approach. Accordingly, each chapter briefly expands on the
problem statement, describes the approaches devised to address the research objectives, presents
the customized validation metrics for each solution, and presents the findings. Since this
dissertation addresses challenges of the training process, all of the validations are carried out using
data collected in field experimental studies.
2.4.1 Field Experiments and Data Sets
During the last two decades of research in the field of NILM, there has been a persistent lack of
common data sets that researchers could use to compare the performance of algorithms.
The findings in different studies have been based on limited experimental data (collected over
short durations in experimental setups) or synthetic data; this is an indicator of the complexity of
collecting labeled data for research validation. In recent years, the only publicly available fully
labeled data set acquired with a high frequency data acquisition system is the BLUED data set,
which was collected in a house in Pittsburgh, PA for a week and published by Anderson et al. [80].
Another publicly available data set is the REDD data set published by Kolter and Johnson [81];
although this data set includes high frequency data for two homes over a relatively long period of
data collection, it does not provide labels for individual loads; in fact, it has been mainly intended
for non-event-based NILM approaches, in which low frequency (1 Hz) power data is used.
Therefore, to explore and address the research objectives, we adopted field experimental
studies to collect fully labeled data using high frequency data acquisition systems over relatively
long periods of time in residential settings, where occupants use the appliances in their daily life
routines. This setup enables us to account for realistic scenarios, which reflect realistic user
interaction patterns and the challenges associated with the presence of noise. Three apartments
were selected as the test beds in this dissertation. The selection criteria included the feasibility of
data acquisition system installation and the occupants' interest in participating in the study. Details
of the data acquisition system are presented in Chapter 3. The appliances in these apartments are
listed in Table 2-2. The table also shows the distribution of the appliances across the different
feeding circuits (i.e., phases) on the circuit breaker panel in each apartment; Phase A and Phase B
represent the two separate circuits. The split phase system in U.S. buildings is explained in
Chapter 3.
Table 2-2 List of appliances and duration of data acquisition in the field experimental
test beds used in addressing the research objectives in this dissertation

Experimental Test Bed | Appliances on Phase A | Appliances on Phase B | Duration of Data Collection
Apartment 1 | Refrigerator; Television; Blu-ray Player; Toaster; Kettle; Bathroom Light and Fan; Kitchen Light Fixture 1; Kitchen Light Fixture 2; Bedroom Lamp; Kitchen Fan Light; Air Conditioning System | Iron; Hair Dryer; Washing Machine; Laptop Computer; LCD Monitor; Desktop Lamp; Air Conditioning System | Four Weeks
Apartment 2 | Laptop Computer 1; Laptop Computer 2; Desktop Lamp; Air Conditioning System; Closet Light | Refrigerator; Microwave; Coffee Maker; Sandwich Maker; Electric Range; Toaster; Television; Xbox; Cable Box; Living Room Light Fixture; Living Room Lamp | Three Weeks
Apartment 3 | Refrigerator; Microwave; Dish Washer; Kitchen Light Fixture; Living Room Light Fixture; Living Room Lamp; Television; Laptop Computer; Tablet; Bathroom Light; Bathroom Fan; Hair Dryer; Hair Iron 1; Hair Iron 2; Bedroom Lamp; Bedroom Light Fixture; Air Conditioning System | Closet Light; Toaster; Kettle; Washing Machine; Clothes Dryer; Air Conditioning System | Four Weeks
The data collected in these test beds are used selectively for addressing the different research
objectives/questions. The Apartment 1 data set is used for exploring the first research question.
The Apartment 1 and Apartment 2 data sets are used in exploring the second research question.
The Apartment 1 and Apartment 3 data sets are used for exploring the third research question. The
selection of the data sets for each specific research question is based on the experiment design in
each test bed apartment. Since the experiments required occupants' engagement for a relatively
long period of time, the experiments were divided across different apartments for different
questions. Details of the specific data collection methods and experiment setups are discussed in
the following chapters.
2.4.2 Validation Metrics
The training process directly affects the results of the classification algorithms in detecting the
identity of the state transitions. However, as noted, the main objective in this study is to increase
user convenience without compromising accuracy. The solutions proposed in Section 2.3
might cover training for all, or only a certain set of, appliances in a specific setting. The facilitated
active training is intended to cover all appliances; therefore, the validation metrics include label
accuracy and the interaction requirements. In other words, the number of calls for interaction and
the accuracy of the assigned labels with respect to the ground truth data are used in this
dissertation. For the objective of time constraint relaxation, the evaluation is carried out by
measuring the number of correctly labeled signatures (using the labels from ground truth data)
under the relaxed time constraints.
In evaluating the components of the solution, the accuracy of the classification/clustering
algorithms is measured using conventional metrics, including the confusion matrix and
accuracy metrics such as recall, precision, and F-measure, which are obtained as follows:
Precision = TP / (TP + FP)    (2-1)

Recall = TP / (TP + FN)    (2-2)

F-measure = 2 · Precision · Recall / (Precision + Recall)    (2-3)
In addition to these conventional metrics, addressing the research questions requires customized
evaluation metrics, which are described in the associated chapters.
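As a quick illustration of Equations 2-1 to 2-3, the metrics can be computed directly from confusion-matrix counts. The function and the example counts below are hypothetical, not results from this dissertation's experiments:

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F-measure from confusion-matrix counts
    (Equations 2-1 to 2-3)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f = (2 * precision * recall / (precision + recall)
         if (precision + recall) else 0.0)
    return precision, recall, f

# 80 correctly classified events, 20 false alarms, 10 missed events:
p, r, f = precision_recall_f(tp=80, fp=20, fn=10)
# p = 0.8, r ≈ 0.889, f ≈ 0.842
```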
Chapter Three: Event-Based Appliance State Transition
Detection
The research methodology in this dissertation is focused on event-based algorithms and
their performance. In event-based NILM methodologies, the accuracy of the disaggregated
power time series is highly dependent on the performance of the classification algorithm, which in
turn depends on the quality of the training process. Therefore, we focused on the event-based
appliance state transition detection approach and did not include the energy calculation module.
The following sections describe the baseline framework, including the data processing, event
detection, and classification algorithms.
3.1 Load Monitoring Setup
3.1.1 Residential Electricity Infrastructure and Load Behavior
In the United States, most residential loads are connected to a split phase circuit. Each phase is fed
with 120 V AC with a 180° phase difference. Appliances are usually connected to one phase,
except for two-phase appliances, which are fed from both phases. More details on the split phase
system can be found in [36]. Power measurements can be carried out at different levels of
granularity, from individual appliances/plugs to the building level. The circuit breaker panel is a
logical aggregation point for measuring the aggregated power of multiple appliances. As the
number of loads at the aggregation level increases, the complexity of the power decomposition
problem increases; however, the cost of metering devices decreases. Therefore, the criterion for
selecting the sensing point in a residential setting depends on the trade-off between the hardware
cost and the number of state transitions in the feature space. Depending on the size of the building,
the sensing could be carried out at the main feed to the unit.
In an AC circuit, different load behaviors can be observed depending on the characteristics of the
load. Appliance components could be purely resistive loads, such as incandescent light
bulbs, kettles, irons, electric water heaters, and electric cookers, or partially reactive loads, such as
refrigerators and washing machines, which in addition to resistive components have inductive or
capacitive components. Resistive loads consume all the energy that is provided, while reactive
loads consume part of the provided energy and release some of it back to the source.
Accordingly, power in an AC circuit is a complex quantity. For resistive loads, the current
waveform is in phase with the voltage waveform and the power is always positive. Loads with
reactive components cause a phase shift between the voltage and current waveforms, which
results in reactive power. Moreover, modern appliances contain nonlinear loads. A nonlinear
load is characterized by switching action, which causes current interruptions. Electronic appliances
such as printers, TVs, and personal computers, with components such as rectifiers and switched-mode
power supplies (SMPS), are examples of nonlinear loads on AC circuits. The presence of
nonlinear loads results in current waveforms with higher harmonics, which are integer
multiples of the fundamental frequency (60 Hz). For linear loads with sinusoidal and periodic
currents, the real and reactive power calculation is straightforward and the apparent power is as follows:
A = P + jQ = V_rms · I_rms · (cos Φ + j · sin Φ)    (3-1)
in which P is the real power, Q is the reactive power, V_rms and I_rms are the root mean squared
voltage and current, and Φ is the phase shift, which is determined by the load impedance. However,
in the general form, and when nonlinear loads exist, the power factor, which is the ratio between
the real power and the apparent power, is affected by the current harmonics.
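Equation 3-1 can be sketched numerically as follows. The helper function and the example values (a 120 V leg, a 10 A load, a 30° lag) are illustrative assumptions, not measurements from the test beds:

```python
import math

def apparent_power(v_rms, i_rms, phi):
    """Apparent power A = P + jQ for a linear sinusoidal load
    (Equation 3-1); phi is the current-voltage phase shift in radians."""
    p = v_rms * i_rms * math.cos(phi)   # real power (W)
    q = v_rms * i_rms * math.sin(phi)   # reactive power (VAR)
    return complex(p, q)

# A purely resistive 10 A load on a 120 V leg draws no reactive power:
a = apparent_power(120.0, 10.0, 0.0)            # 1200 + 0j
# An inductive load lagging by 30 degrees draws both components:
b = apparent_power(120.0, 10.0, math.radians(30.0))
power_factor = b.real / abs(b)                  # cos(30°) ≈ 0.866
```

For nonlinear loads, as noted above, this single-phase-shift picture breaks down and the harmonic content of the current must be taken into account.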
3.1.2 Data Processing System
Data Acquisition Systems
By using sensing devices with high frequency sampling capabilities, it is possible to obtain
features at different levels of granularity, including steady state and transient real and reactive
power metrics and harmonic contents. The harmonic contents of the load could potentially
provide useful information for pattern recognition since they (specifically the low order
odd-numbered harmonics, i.e., 180 Hz, 300 Hz, 420 Hz, and 540 Hz) include information about
the behavior of some appliances [34, 55]. For the studies in this dissertation, general purpose
National Instruments analog-to-digital (A/D) cards, with a high sampling frequency (≥ 100 kHz)
and the capability of simultaneous sampling on four channels, were used. Considering
the harmonic contents of the current waveform and the Nyquist sampling theorem, capturing the
first nine harmonics requires a sampling frequency of more than about 1.1 kHz (twice 540 Hz).
Therefore, the selected A/D cards allow a wide range of harmonic contents to be obtained.
To capture the signals, voltage and current transducers are used. Since the voltage waveforms in
the split phase system have a 180° phase difference, the voltage measurement is performed on one
phase only. For capturing the voltage waveform, a Pico TA041 25 MHz ±700 V differential probe
was used. The current waveforms are measured on each phase using Fluke i200 AC current
clamps. The analog signals captured by these sensors are digitized through the NI DAQ
and are stored and processed on a local PC.
Another important part of the data acquisition system is the ground truth data collection. One way
of acquiring ground truth data is to follow the activity of the appliances and manually prepare a
log of all the events. This process can be very complicated, especially when the number of
appliances is relatively large and appliances with multiple states of operation are used.
Moreover, events that are not detectable by users are missed in manual logging of the ground
truth. The most accurate approach to collecting ground truth data is through sensing at the
individual appliance (load) level in parallel with the aggregate measurements. In this way, an
isolated log of the operational modes of each appliance can be obtained. The isolated logs can then
be used for labeling individual state transitions and for collecting energy ground truth. Therefore,
the ground truth data are acquired through plug level metering for appliances that are
plugged into a receptacle and through ambient light sensors for lights. Since the aggregated data
acquisition system uses high frequency sampling rates, the plug level sensors need to provide
relatively high resolution time series to avoid errors when applying ground truth labels to the
aggregated signal. Therefore, in this study, Enmetric Powerports [33], off-the-shelf plug level
meters, are used. These sensors provide a 1 Hz power signal sampling rate through their integrated
Application Programming Interface (API). The ground truth for lighting is collected
using Linksprite DiamondBack™ microcontrollers, equipped with a WiFi module and AMBI™
light intensity sensors.
Data Processing Approach
Features could be extracted from different signal sources, including the raw current and voltage
waveforms or the signal processed into RMS metrics or power metrics. To ensure that high
accuracy ground truth data are obtained, we use the processed power metrics for feature extraction.
Once the current i(t) and voltage v(t) waveforms are sampled, in the absence of nonlinear
loads, the real power at any time t can be calculated by taking the product of the voltage and
current and averaging it over one period. The reactive power (at any time t) is calculated by
multiplying the voltage waveform with the quadrature component of the current waveform and
averaging over one period. However, when the load is nonlinear, due to the presence of harmonics,
the above-mentioned approach no longer holds. The definition of reactive power is a challenging
problem and there is no standard solution to it [82, 83]. However, approximate approaches have
been developed for this purpose [49, 55, 84]. In these approaches, based on the definitions of the
fundamental powers, a properly shifted harmonic voltage waveform can be used as a reference in
order to compute power at a harmonic frequency. However, the higher voltage harmonic contents
are negligible at the consumption site, and the current harmonic distortion is more important [55]
since it reflects the behavior of the appliances. Therefore, from the NILM perspective, the current
harmonic powers are computed. A periodic current waveform ĩ(t) can be represented in terms of
the continuous time Fourier series [55]:
ĩ(t) = a_0 + Σ_{k=1..∞} a_k cos(k · 2πt/T) + Σ_{k=1..∞} b_k sin(k · 2πt/T)    (3-2)
in which T is the period, k is the harmonic index, and a_k and b_k are the Fourier series coefficients
at time t:

a_k = (1/T) ∫[t, t+T] ĩ(τ) cos(k · 2πτ/T) dτ,   b_k = (1/T) ∫[t, t+T] ĩ(τ) sin(k · 2πτ/T) dτ,   k ≥ 1    (3-3)
Considering the fundamental component of the voltage waveform as V, the current harmonic
powers can be defined as:

P_k(t) = (V/2) · a_k,   Q_k(t) = (V/2) · b_k    (3-4)
The Fourier series coefficients are scaled versions of the Fourier transform evaluated at the
harmonic frequencies. In this study, considering the above definition, the approximation approach
for calculating the power metrics presented in [84], known as spectral envelope coefficient
computation, is adopted. Considering the digitally sampled current waveforms, the spectral
envelope coefficients are computed using a Short-Time Fourier Transform (STFT) on the current
and voltage waveforms, which results in I(t) and V(t). The number of samples per cycle is best set
to a power of two, as this improves the speed of the FFT algorithm [34]. The window size for the
STFT is defined based on the desired resolution of the power time series and depends on the
sampling frequency of the raw waveforms; the STFT window is usually an integer multiple of the
fundamental period. To account for the phase of the voltage relative to the STFT window, the
phase shift θ(t) between the first harmonic components of the current and voltage signals is
obtained after the application of the STFT, and the real and reactive power components are
computed as [34]:

P_k(t) = |I_k(t)| · sin(θ(t)) · |V_1(t)|    (3-5)

Q_k(t) = |I_k(t)| · cos(θ(t)) · |V_1(t)|    (3-6)
in which P_k and Q_k are the k-th real and reactive power quantities, I_k is the k-th harmonic
component of the transformed current waveform, and V_1 is the first harmonic component of the
transformed voltage.
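A simplified sketch of the per-cycle harmonic power computation (Equation 3-4) is shown below, using an FFT over one fundamental period as a scaled Fourier series. It omits the voltage phase correction of Equations 3-5 and 3-6 and approximates the voltage by its fundamental amplitude, so the function name, conventions, and synthetic waveform are illustrative assumptions rather than the implementation used in this dissertation:

```python
import numpy as np

def spectral_envelope(i_window, v_amplitude, harmonics=(1, 3, 5)):
    """Per-cycle harmonic powers (Equation 3-4) from one fundamental
    period of the current waveform. Returns {k: (P_k, Q_k)}."""
    n = len(i_window)
    spectrum = np.fft.rfft(i_window)
    powers = {}
    for k in harmonics:
        a_k = 2.0 * spectrum[k].real / n     # cosine (in-phase) coefficient
        b_k = -2.0 * spectrum[k].imag / n    # sine (quadrature) coefficient
        powers[k] = (v_amplitude / 2 * a_k, v_amplitude / 2 * b_k)
    return powers

# One 60 Hz cycle sampled 128 times: a 2 A cosine at the fundamental
# plus a 0.5 A sine at the third harmonic (a nonlinear-load caricature).
t = np.arange(128) / 128
i = 2.0 * np.cos(2 * np.pi * t) + 0.5 * np.sin(2 * np.pi * 3 * t)
env = spectral_envelope(i, v_amplitude=170.0)
# env[1] ≈ (170.0, 0.0); env[3] ≈ (0.0, 42.5); env[5] ≈ (0.0, 0.0)
```

Sliding this computation along the sampled current waveform, one window per cycle, yields the harmonic power time series from which events and features are extracted in the following sections.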
3.1.3 Appliance State Change Detection
One of the fundamental steps in an event-based NILM approach is to detect events, i.e., the
moments when an actual change in an appliance's state occurs. The change in the operational state
of an appliance is reflected in the signal time series and can be tracked for detecting events. These
changes can be observed in various signal sources. In our approach, events are defined as sharp
variations in the fundamental frequency component of the real power time series that are
associated with appliance state changes in a building. Since we intended to use the information in
the transients between steady states in the power time series, high resolution power time series
(≥ 20 Hz) are used. Our observations have shown that the presence of certain appliances and
(potentially) the characteristics of the building's electricity distribution infrastructure might result
in high levels of noise in the calculated power time series. Although increasing the resolution of
the power time series increases the information gain, at higher resolutions the presence of noise
can affect the event detection rate. In particular, an event detector can perform poorly on noisy
data when a naïve event detection algorithm is used. A naïve event detector tracks the power
changes between consecutive signal values and uses a power change threshold as the criterion for
event detection. Consequently, in this dissertation, a probabilistic event detection algorithm has
been adopted to avoid false positives due to the presence of noise. The event detection approach is
based on the Generalized Likelihood Ratio Test (GLRT) that was proposed in [85] and improved
in [53] in previous research studies. The GLRT examines two hypotheses, H0 and H1, which
represent the association of signal segments with a probability density function:

f(s|θ):   H0: θ = θ0;   H1: θ = θ1    (3-7)
The H0 and H1 hypotheses are composite hypotheses, which have unknown parameters as the
signal changes over time. Therefore, the unknown parameters of f_H0(s) and f_H1(s) are replaced
with the parameters obtained from maximum likelihood estimation under H0 and H1:
f_H0(s|θ̂0) and f_H1(s|θ̂1). The GLRT uses:

L(n) = f_H1(s|θ̂1) / f_H0(s|θ̂0), deciding H1 if L(n) > τ_glrt and H0 otherwise    (3-8)
where θ̂_i = arg max_{θ_i} f_Hi(s|θ_i) is the maximum likelihood estimate of θ_i. The test examines
the test statistic, which is the ratio between the likelihoods under H1 and H0, using τ_glrt as the
threshold. The electricity signal follows a Gaussian distribution (S ~ N(μ, σ²)) [85]; therefore, a
Gaussian distribution is used in the GLRT. The Gaussian distribution parameters are dynamically
estimated from the signal. Thus, the algorithm uses two contiguous moving windows, w_pre and
w_post, which slide along the signal samples, and calculates the mean and standard deviation of
the signal segments to be used in the log likelihood ratio calculation as follows:
L(n) = ln [ P(s_i | μ1, σ1²) / P(s_i | μ0, σ0²) ]    (3-9)

P(s_i | μ, σ²) = (1 / √(2πσ²)) · exp(−(s_i − μ)² / (2σ²))    (3-10)
where s_i is the signal sample, and μ0, σ0² and μ1, σ1² are the mean and variance in the windows
[i − w_pre − 1, i − 1] and [i + 1, i + w_post + 1], respectively. The mean and standard deviation of
the signal are calculated in the windows before and after each sample point. Once these values
are calculated for each data point of the signal time series, the natural log of the ratio of P(s|μ, σ²)
after and before each point is computed. The points with a ratio higher than the τ_glrt threshold
could be marked as events. However, to avoid false positives, as proposed in [53], a voting scheme
is used to improve the performance of the algorithm. Upon calculating L(n) for all of the sample
points of the power time series, a moving detection window w_e (which slides along the power
time series one point at a time) is used, and votes are assigned to each point as follows:
vote_index = arg max_{n ∈ w_e} L(n),   s.t. L(n) > 0    (3-11)
As shown in Equation 3-11, votes are assigned to sample points with L(n) > 0. As the GLRT
algorithm sweeps the power time series, a zero L(n) value is assigned to sample points for which
Δμ = μ1 − μ0 < τ_ΔP, in which τ_ΔP is a minimum power change threshold. The minimum power
change threshold is defined to ensure that relatively small power changes do not result in event
detection.
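The GLRT scoring and voting scheme (Equations 3-9 to 3-11) can be sketched as follows. The window sizes, the variance floor, and the step-signal example are illustrative choices, not the tuned parameters used in this dissertation:

```python
import math

def glrt_scores(signal, w_pre=5, w_post=5, min_delta_p=10.0):
    """Per-sample GLRT log-likelihood ratio (Equations 3-9 and 3-10).
    Samples whose pre/post mean change is below min_delta_p score 0."""
    def stats(seg):
        m = sum(seg) / len(seg)
        v = sum((x - m) ** 2 for x in seg) / len(seg) + 1e-6  # variance floor
        return m, v

    def log_pdf(x, m, v):
        return -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)

    scores = [0.0] * len(signal)
    for i in range(w_pre, len(signal) - w_post):
        m0, v0 = stats(signal[i - w_pre:i])          # window before sample i
        m1, v1 = stats(signal[i + 1:i + 1 + w_post])  # window after sample i
        if abs(m1 - m0) < min_delta_p:
            continue
        s = signal[i]
        scores[i] = log_pdf(s, m1, v1) - log_pdf(s, m0, v0)
    return scores

def vote_events(scores, w_e=5, tau=0.0):
    """Voting scheme (Equation 3-11): within each sliding detection
    window, the positive-score sample with the maximum ratio gets a vote."""
    votes = [0] * len(scores)
    for start in range(len(scores) - w_e + 1):
        window = scores[start:start + w_e]
        best = max(range(w_e), key=lambda j: window[j])
        if window[best] > tau:
            votes[start + best] += 1
    return votes

# A 500 W step at sample 10 collects all the votes in its neighborhood:
scores = glrt_scores([100.0] * 10 + [600.0] * 10)
votes = vote_events(scores)
```

Sample points whose vote count exceeds a decision threshold are then reported as events, which suppresses isolated noise spikes that a naïve threshold detector would flag.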
3.1.4 Feature Extraction Methods
As noted, signal characteristics in the vicinity of the event locations are extracted as feature
vectors. Although signatures could be extracted from the current or voltage waveforms, in this
dissertation only the power metric signatures (i.e., the signatures that are extracted from the real or
reactive power time series) are used as features. This is because considering all combinations of
feature extraction methods results in a large number of assessment cases, which is not the main
objective of the study. Moreover, ground truth labeling on the power time series gives higher
accuracy, since the sensors used for individual load monitoring present the signal in the form of
real power time series. Furthermore, in previous studies, features extracted from power time series
have shown promising results [53]. Since transient features provide more information that may
improve the performance of pattern recognition algorithms, these features (power time series with
resolutions of more than 20 Hz) are used for performance evaluation purposes. As described in the
Data Processing System section, due to the presence of nonlinear loads, the higher harmonic
components of the current waveform, and consequently of the power metrics, could potentially
provide more information to enhance the performance of pattern recognition algorithms [82, 83].
Therefore, appliance state signatures (called feature vectors hereafter) are taken as power time
series segments close to the event locations. These segments are extracted using two windows, one
before (w_b) and one after (w_a) the event index. Feature vectors are extracted as vectors whose
elements are the real power segment followed by the reactive power segment. In the context of
this document, the basic feature vector is the feature vector of real and reactive power for the
fundamental frequency. Figure 3-1-a illustrates a basic feature vector. The feature vectors in their
general form are as follows:
x_n = {p_1[n], q_1[n], …, p_k[n], q_k[n]},   k ∈ {1, 2, …, K}    (3-12)
where p and q are the real and reactive power components and k is the index of the harmonic
components used for feature extraction. K is the total number of harmonics, which in this study is
up to the first nine harmonics of the fundamental frequency. The feature vectors can comprise all,
only the even, or only the odd harmonics of the fundamental frequency.
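Equation 3-12 amounts to concatenating windowed power segments around the event index. The sketch below, with hypothetical window sizes and step values, illustrates the construction for the basic (fundamental-frequency) feature vector:

```python
def extract_feature_vector(power_series, event_idx, w_b=20, w_a=30):
    """Build the feature vector of Equation 3-12: for each harmonic k,
    the real power segment p_k around the event followed by the reactive
    power segment q_k. power_series maps k -> (p_k, q_k) sample lists."""
    x = []
    for k in sorted(power_series):
        p_k, q_k = power_series[k]
        x.extend(p_k[event_idx - w_b:event_idx + w_a])
        x.extend(q_k[event_idx - w_b:event_idx + w_a])
    return x

# Basic feature vector: fundamental (k = 1) real and reactive power only,
# around a hypothetical step event at sample 50.
p1 = [100.0] * 50 + [600.0] * 50   # real power samples (W)
q1 = [10.0] * 50 + [40.0] * 50     # reactive power samples (VAR)
x = extract_feature_vector({1: (p1, q1)}, event_idx=50)
# len(x) == 2 * (w_b + w_a) == 100
```

Passing additional harmonics in `power_series` yields the general form of Equation 3-12 with all, even, or odd harmonic components.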
Figure 3-1 Appliance state change (event) feature vector examples: a) the basic feature vector,
comprised of real and reactive power time series segments; b) the feature vector modeled through
regression analysis
In order to explore the effect of reducing the noise in the signatures, we also use regression
analysis to model the transients and obtain feature vectors with fewer dimensions than the original
feature vector. Due to the complexity of the transient shapes, higher order regression using basis
transformations is applied. As illustrated in Figure 3-1-b, the
combination of the polynomial and Fourier basis functions could be used to model the transients
with high accuracy:
f̂(x) = ω0 + Σ_{i=1..r} α_i · x^i + Σ_{j=1..s} [β_j · sin(2πjx/T) + γ_j · cos(2πjx/T)]    (3-13)
where T is the period, r is the highest polynomial degree, s is the number of Fourier basis
functions, and ω0 is the bias. The coefficient vector ω = {ω0, α, β, γ} is used as the transformed
feature vector. The vector ω is obtained by minimizing the residual sum of squares (RSS)
objective function:
RSS(ω) = (1/N) Σ_n [f(x_n) − f̂(x_n)]²    (3-14)
The vector ω is obtained through the closed form solution of the regularized (ridge) regression
optimization problem, in matrix form:

ω = (φ(x)ᵀ · φ(x) + λI)⁻¹ · φ(x)ᵀ · y    (3-15)

in which φ(x) is the matrix of basis-transformed sample indices, λ is the regularization coefficient,
and y is the vector of feature vector values (i.e., the variation of the power metric).
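A minimal sketch of the basis expansion of Equation 3-13 and the closed-form ridge solution of Equation 3-15 is given below, assuming NumPy. The basis sizes, regularization value, and synthetic transient are illustrative choices, not the settings used in the experiments:

```python
import numpy as np

def basis(x, r=3, s=2, period=1.0):
    """Design matrix phi(x): bias, polynomial terms x^1..x^r, and s pairs
    of Fourier terms sin/cos(2*pi*j*x/T), following Equation 3-13."""
    cols = [np.ones_like(x)]
    cols += [x ** i for i in range(1, r + 1)]
    for j in range(1, s + 1):
        cols.append(np.sin(2 * np.pi * j * x / period))
        cols.append(np.cos(2 * np.pi * j * x / period))
    return np.column_stack(cols)

def ridge_fit(x, y, lam=1e-6, **kw):
    """Closed-form ridge solution of Equation 3-15:
    w = (phi^T phi + lam I)^(-1) phi^T y."""
    phi = basis(x, **kw)
    return np.linalg.solve(phi.T @ phi + lam * np.eye(phi.shape[1]),
                           phi.T @ y)

# Fit a noisy transient-like segment (normalized sample index on [0, 1])
# and use the 8 coefficients w, rather than the 100 raw samples, as the
# reduced-dimension feature vector.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = 150.0 * (1.0 - np.exp(-10.0 * x)) + rng.normal(0.0, 2.0, 100)
w = ridge_fit(x, y, r=3, s=2)
```

The coefficient vector thus replaces the raw transient segment, trading some fitting error for a smoother, lower-dimensional signature.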
3.1.5 Classification Algorithms
Upon detecting an event and extracting the feature vector associated with it, classification
algorithms are used to determine the identity of the detected event. Different classification
algorithms were considered and evaluated, including the K-nearest neighbor (KNN) algorithm, the
naïve Bayes classifier, the support vector machine (SVM), and the decision tree classifier.
Evaluation of the algorithms on the Apartment 1 data set showed that the KNN algorithm with K
equal to 1 outperforms the other algorithms. The evaluation of the algorithms is presented in the
following section. The performance of different classification algorithms on feature vectors
extracted from appliance state change transients has been evaluated and reported in previous
research studies [53], and similar results were demonstrated. Accordingly, in this dissertation, the
1NN classifier was adopted.
Figure 3-2 The complete framework for event-based non-intrusive load monitoring, with the
appliance state transition detection modules highlighted in a hatched dark gray pattern. [The
diagram shows the pipeline Data Acquisition and Preprocessing → Event Detection and Feature
Extraction → Pattern Recognition (Classification) → Energy Calculation, with a Training Dataset
feeding the classifier and Decomposed Energy Consumption Per Appliance as the output.]
With the selection of the classifier algorithm, the components of the event-based framework for
appliance state transition detection are complete. Figure 3-2 illustrates the components of the
event-based NILM framework. The complete NILM framework includes the energy calculation
component, which is not the focus of this dissertation. Accordingly, the framework components
for state transition detection have been highlighted in a hatched dark gray pattern.
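The adopted 1NN classifier can be sketched in a few lines. The two-dimensional signatures and sub-labels below are hypothetical stand-ins for the full transient feature vectors described above:

```python
import math

def one_nn(train, query):
    """1-nearest-neighbor classification: return the sub-label of the
    training feature vector closest (in Euclidean distance) to the query."""
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    label, _ = min(((lbl, dist(vec, query)) for lbl, vec in train),
                   key=lambda t: t[1])
    return label

# Hypothetical labeled signatures: (sub-label, [P step (W), Q step (VAR)]).
train = [("11101", [120.0, 60.0]),   # refrigerator compressor on
         ("16301", [850.0, 5.0]),    # toaster on
         ("12901", [200.0, 30.0])]   # TV on
print(one_nn(train, [115.0, 55.0]))  # → 11101
```

Because 1NN stores the labeled signatures verbatim, the quality of the training process, i.e., the correctness of the labels in `train`, directly bounds the classification accuracy, which is the motivation for the training-focused objectives of this dissertation.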
3.2 Algorithm Evaluation and Tuning
In order to find the optimum parameters of the aforementioned algorithms, and to explore the
parameter sets that yield better performance in addressing our objectives, the algorithms were
evaluated on the Apartment 1 data set. In general, as described earlier, all the evaluations in this
dissertation are carried out using fully labeled data sets. Accordingly, Section 3.2.1 describes the
labeling process and, thus, the appliances and the labels for their different states of operation in
our data sets. This information is used for all evaluation/validation purposes throughout the
dissertation.
3.2.1 Data Labeling Procedure
Upon collection of the data in each residential setting, all the data instances for each appliance
state transition were labeled using the ground truth data in comparison with the aggregated time
series. Since the data are used for evaluating event-based NILM applications, the labeling process
was carried out using sub-labels that represent the different state transitions of each appliance.
Sub-labels are described in the following paragraphs. At the appliance level, the collected data
included the power time series from the plug level meters and the light intensity signals for the
lighting fixtures.
Figure 3-3 Appliance feature vector labeling interface: a) the feature vector (signature) of
interest (fundamental frequency component of real power) at 60 Hz; b) the feature vector
(signature) of interest (fundamental frequency component of reactive power) at 60 Hz;
c) matching a longer segment of the fundamental frequency component of the real power time
series with the signal (power or light intensity at 1 Hz) time series from the ground truth sensors
The signal time series collected at the appliance level were compared with the fundamental
frequency component of the real power time series obtained at the main electrical feed, and the
events were labeled. To make sure that the data represent realistic scenarios, the event detection
algorithm (described in the Appliance State Change Detection section) was used for detecting the
location of the events on the power time series. Figure 3-3 illustrates the interface that was used
for labeling the feature vectors.
As noted, since in event-based NILM the objective is to classify individual events, each
appliance’s state transition was labeled with a unique label. These unique labels represent
different possible state transitions for each appliance (or at least the state transitions that were
observed during our experiments). Although a user might observe only one physical state change for
an appliance, that physical state change might be represented by multiple events in the aggregated
power time series. This is especially true when using higher-resolution power time series. For
example, in the Apartment 1 data set, turning the television on generates multiple events in the
power time series, despite the fact that in the physical domain all of those events represent a
single change. The appliance labels and their associated sub-labels for all three test bed
apartments are presented in Table 3-1 to Table 3-3. Labels and sub-labels create a hierarchy of
names for events. In this dissertation, labels are three-digit numbers representing a specific
appliance class (e.g., refrigerator), and sub-labels are five-digit numbers representing all the
different clusters of feature vectors that a user visually observed and categorized. For example,
Table 3-1 shows the list of appliances in the Apartment 1 test bed, as well as the labels associated
with their different internal operational states. In this data set, the refrigerator has two turn-on
states (11101, 11103) and two turn-off states (11102, 11104); these states are related to the
operation of the compressor and the defrost components. However, as shown in Table 3-2 and
Table 3-3, the refrigerators in the other apartments have more states, which represent the
operation of their lighting fixtures.
Table 3-1 Appliances sub-labels, representing appliance state transitions in Apartment 1 test bed

Appliance         | Phase | Label Code | Sub-label Codes
Refrigerator      | A     | 111        | 11101, 11102, 11103, 11104
Air Conditioning  | A, B  | 180        | 18001, 18002, 18003, 18004, 18005, 18006
Toaster           | A     | 163        | 16301, 16302
Kettle            | A     | 162        | 16201, 16202
TV                | A     | 129        | 12900, 12901, 12902, 12903, 12904, 12905, 12906
Iron              | B     | 182        | 18201, 18202
Hair Dryer        | B     | 181        | 18101, 18102
LCD Monitor       | B     | 122        | 12201, 12202
Laptop            | B     | 120        | 12001, 12002
Desk Lamp         | B     | 151        | not used during experiments
Washing Machine   | B     | 183        | 18301, 18302, 18303, 18304, 18305, 18306, 18307, 18308, 18309, 18310
Bathroom Light    | A     | 145        | 14501, 14502
Bedroom Light     | A     | 144        | 14401, 14402
Kitchen Light 1   | A     | 141        | 14101, 14102
Kitchen Light 2   | A     | 143        | 14301, 14302
Kitchen Fan Light | A     | 142        | 14201, 14202
Unknown           | A, B  | 300        | -
Table 3-2 Appliances sub-labels, representing appliance state transitions in Apartment 2 test bed

Appliance         | Phase | Label Code | Sub-label Codes
Refrigerator      | B     | 111        | 11101, 11102, 11103, 11104, 11105, 11106
Grill             | B     | 200        | 20001, 20002
Toaster           | B     | 163        | 16301, 16302
Coffee Maker      | B     | 162        | 16201, 16202
Television        | B     | 129        | 12901, 12902, 12904
xBox              | B     | 123        | 12301, 12302
Microwave         | B     | 161        | 16101, 16102, 16103, 16105
Dishwasher        | B     | 182        | in progress
Bathroom Light    | B     | 120        | 12001, 12002
Bathroom Fan      | B     | 151        | 15101, 15102
Living Room Light | B     | 141        | 14101, 14102
Kitchen Light     | B     | 143        | 14301, 14302
Cable Box         | B     | 118        | not used during experiments
Electric Range    | A, B  | 100        | 10001, 10002
Laptop 1          | A     | 119        | 11901, 11902
Laptop 2          | A     | 120        | 12001, 12002
Closet Light      | A     | 144        | 14401, 14402
Desktop Lamp      | A     | 151        | 15101, 15102
Air Conditioning  | A     | 180        | 18001, 18002, 18003, 18004, 18005, 18006, 18007, 18008, 18009
Unknown           | A, B  | 300        | -
Table 3-3 Appliances sub-labels, representing appliance state transitions in Apartment 3 test bed

Appliance              | Phase | Label Code | Sub-label Codes
Refrigerator           | A     | 111        | 11101, 11102, 11103, 11104, 11105, 11106
Hair Dryer             | A     | 181        | 18101, 18102
Hair Iron 1            | A     | 165        | 16501, 16502
Hair Iron 2            | A     | 166        | 16601, 16602
Television             | A     | 129        | 12901, 12902, 12903, 12904
Surface Tablet         | A     | 123        | 12301, 12302
Microwave              | A     | 161        | 16101, 16102, 16103, 16104, 16105, 16106, 16107
Laptop                 | A     | 120        | 12001, 12002
Bathroom Light and Fan | A     | 145        | 14501, 14502, 14503, 14504, 14505, 14506
Living Room Light      | A     | 144        | 14401, 14402
Living Room Lamp       | A     | 147        | 14701, 14702
Kitchen Light          | A     | 143        | 14301, 14302
Bedroom Lamp           | A     | 146        | 14601, 14602
Closet Light           | A     | 142        | 14201, 14202
Kettle                 | B     | 162        | 16201, 16202
Toaster                | B     | 163        | 16301, 16302
Dishwasher             | B     | 167        | 16701, 16702, 16703
Air Conditioning       | A, B  | 180        | 18001, 18002, 18003, 18004, 18005, 18006
Unknown                | A, B  | 300        | -
3.2.2 Classification Algorithms Performance Evaluation
In order to select the best classification algorithm for our objectives, different classifier
algorithms were evaluated using the basic feature vectors extracted from the power time series at
60 Hz resolution, with a 40-sample window before and a 60-sample window after the event location.
Therefore, the power time series characteristics for about 1.7 seconds (two thirds of a second
before and one second after) around the event location were used. As noted earlier, four
well-known classifier algorithms (KNN, naive Bayes, SVM, and decision tree) were evaluated.
Table 3-4 presents the outcome of the evaluations of these four algorithms for turn-on and
turn-off events on phase A of the Apartment 1 data set. The results were obtained using 10-fold
cross validation on two weeks of collected data. As the results show, the best performance was
obtained using the KNN algorithm with K = 1 and the Euclidean distance function as the
similarity measure.
Table 3-4 Performance evaluation of different classifier algorithms on the turn-on and turn-off
event signatures, collected in the Apartment 1 test bed over two weeks

Event Type      | Classifier Algorithm | True Positive Rate | False Positive Rate | Precision | Recall | F-measure
Turn-On Events  | KNN                  | 0.957 | 0.010 | 0.956 | 0.957 | 0.956
Turn-On Events  | Naive Bayes          | 0.708 | 0.031 | 0.85  | 0.708 | 0.695
Turn-On Events  | SVM                  | 0.679 | 0.179 | 0.496 | 0.679 | 0.564
Turn-On Events  | Decision Tree        | 0.917 | 0.022 | 0.918 | 0.917 | 0.917
Turn-Off Events | KNN                  | 0.968 | 0.017 | 0.967 | 0.968 | 0.967
Turn-Off Events | Naive Bayes          | 0.493 | 0.016 | 0.936 | 0.493 | 0.575
Turn-Off Events | SVM                  | 0.768 | 0.517 | 0.597 | 0.768 | 0.67
Turn-Off Events | Decision Tree        | 0.953 | 0.024 | 0.954 | 0.953 | 0.953
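The comparison above can be sketched with scikit-learn stand-ins for the four classifiers (an illustration, not the dissertation's implementation; the feature matrix X of event signatures and the sub-label vector y are assumed to be available):

```python
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

def compare_classifiers(X, y):
    """Weighted F-measure of each classifier under 10-fold cross validation."""
    models = {
        "KNN (K=1, Euclidean)": KNeighborsClassifier(n_neighbors=1),  # best in Table 3-4
        "Naive Bayes": GaussianNB(),
        "SVM": SVC(),
        "Decision Tree": DecisionTreeClassifier(random_state=0),
    }
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    return {name: f1_score(y, cross_val_predict(model, X, y, cv=cv), average="weighted")
            for name, model in models.items()}
```

On our data, this kind of procedure favored the 1-nearest-neighbor classifier, consistent with Table 3-4.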
More detailed results for the performance of the KNN algorithm on the data for both phase A and
phase B turn-on and turn-off events in Apartment 1 are as follows. The evaluation metrics include
the accuracy measures (presented in Table 3-5 and Table 3-6 for phase A and phase B,
respectively), as well as the confusion matrices for all four evaluation cases (two event types,
turn-on and turn-off, on each phase). The confusion matrices are presented in Table 3-7 to Table 3-10.
Table 3-5 Performance of the 1NN algorithm using 10-fold cross validation for two weeks of the
data in Apartment 1 for turn-on and turn-off events on phase A

Turn-On Events:
Class Label | Precision | Recall | F-Measure
0     | 0.968 | 0.958 | 0.963
200   | 0     | 0     | 0
300   | 0.848 | 0.778 | 0.812
11101 | 0.977 | 0.992 | 0.984
11102 | 1     | 1     | 1
12900 | 0.918 | 0.928 | 0.923
12901 | 1     | 1     | 1
12902 | 1     | 1     | 1
12903 | 1     | 0.95  | 0.974
12906 | 1     | 0.917 | 0.957
14101 | 0.786 | 0.917 | 0.846
14201 | 0.864 | 0.864 | 0.864
14301 | 0.8   | 1     | 0.889
14401 | 0.938 | 0.968 | 0.952
14501 | 0.986 | 0.993 | 0.989
16201 | 1     | 1     | 1
16301 | 1     | 1     | 1
18001 | 0.929 | 0.951 | 0.94
18002 | 0.983 | 0.983 | 0.983
Weighted Average | 0.956 | 0.957 | 0.956

Turn-Off Events:
Class Label | Precision | Recall | F-Measure
0     | 0.99  | 0.989 | 0.989
200   | 0     | 0     | 0
300   | 0.538 | 0.368 | 0.438
11103 | 0.978 | 0.989 | 0.983
11104 | 1     | 1     | 1
12904 | 0.9   | 0.947 | 0.923
12905 | 0.857 | 0.545 | 0.667
14102 | 0.765 | 0.813 | 0.788
14202 | 0.682 | 0.75  | 0.714
14302 | 0     | 0     | 0
14402 | 1     | 1     | 1
14502 | 1     | 1     | 1
16202 | 1     | 1     | 1
16302 | 0.944 | 0.944 | 0.944
18004 | 0.951 | 0.983 | 0.967
18005 | 0.75  | 1     | 0.857
18006 | 0.8   | 0.8   | 0.8
Weighted Average | 0.967 | 0.968 | 0.967
Table 3-6 Performance of the 1NN algorithm using 10-fold cross validation for two weeks of the
data in Apartment 1 for turn-on and turn-off events on phase B

Turn-On Events:
Class Label | Precision | Recall | F-Measure
0     | 0.966 | 0.952 | 0.959
100   | 0.797 | 0.84  | 0.818
300   | 0.111 | 0.091 | 0.1
1801  | 0.75  | 0.818 | 0.783
12001 | 0.125 | 0.222 | 0.16
12201 | 0.3   | 0.214 | 0.25
18001 | 0.94  | 0.979 | 0.959
18002 | 1     | 1     | 1
18101 | 0.909 | 0.588 | 0.714
18201 | 0.75  | 0.923 | 0.828
18301 | 0.987 | 0.989 | 0.988
18303 | 0.976 | 1     | 0.988
18305 | 1     | 1     | 1
18307 | 0.967 | 1     | 0.983
Weighted Average | 0.944 | 0.939 | 0.941

Turn-Off Events:
Class Label | Precision | Recall | F-Measure
0     | 0.976 | 0.968 | 0.972
300   | 0.636 | 0.438 | 0.519
1802  | 0.455 | 0.455 | 0.455
12002 | 0.067 | 0.125 | 0.087
12202 | 0.565 | 0.591 | 0.578
18004 | 0.937 | 0.983 | 0.959
18102 | 0.643 | 0.474 | 0.545
18202 | 0.733 | 0.846 | 0.786
18302 | 0.996 | 1     | 0.998
18304 | 0.983 | 1     | 0.992
18306 | 1     | 1     | 1
18308 | 0.857 | 0.923 | 0.889
18310 | 0.875 | 0.875 | 0.875
Weighted Average | 0.965 | 0.963 | 0.963
In addition to serving as an evaluation aid, the confusion matrices also illustrate the class
populations and the distribution of the feature vectors in the feature space. As the evaluation
metrics for the different scenarios show, the KNN classifier performs with high accuracy for the
majority of both turn-on and turn-off events. Lower accuracy measures were obtained for some of
the lighting loads and, specifically, for the LCD monitor and the laptop computer. The confusion
of the algorithm on lighting loads stems from the fact that, in many cases, lighting loads could
be similar due to the similarity between the light bulbs. This also holds true for electronic
appliances: the technologies used in electronic appliances in some cases regulate the event
transients, resulting in similarity between the events' feature vectors.
Table 3-7 The confusion matrix for the KNN classifier using 10-fold cross validation on the data
from phase A for turn-on events (condensed: Correct = diagonal entry; Misclassified = sum of the
off-diagonal entries in each ground-truth row)

Ground Truth | Correct | Misclassified
0     | 454 | 20
200   | 0   | 2
300   | 28  | 8
11101 | 125 | 1
11102 | 37  | 0
12900 | 167 | 13
12901 | 14  | 0
12902 | 25  | 0
12903 | 19  | 1
12906 | 11  | 1
14101 | 11  | 1
14201 | 19  | 3
14301 | 8   | 0
14401 | 30  | 1
14501 | 138 | 1
16201 | 15  | 0
16301 | 20  | 0
18001 | 39  | 2
18002 | 57  | 1
Table 3-8 The confusion matrix for the KNN classifier using 10-fold cross validation on the data
from phase B for turn-on events (condensed: Correct = diagonal entry; Misclassified = sum of the
off-diagonal entries in each ground-truth row)

Ground Truth | Correct | Misclassified
0     | 2906 | 147
100   | 231  | 44
300   | 1    | 10
1801  | 9    | 2
12001 | 10   | 35
12201 | 6    | 22
18001 | 47   | 1
18002 | 66   | 0
18101 | 10   | 7
18201 | 12   | 1
18301 | 781  | 9
18303 | 120  | 0
18305 | 20   | 0
18307 | 29   | 0
Table 3-9 The confusion matrix for the KNN classifier using 10-fold cross validation on the data
from phase A for turn-off events (condensed: Correct = diagonal entry; Misclassified = sum of the
off-diagonal entries in each ground-truth row)

Ground Truth | Correct | Misclassified
0     | 1150 | 13
200   | 0    | 6
300   | 7    | 12
11103 | 87   | 1
11104 | 37   | 0
12904 | 36   | 2
12905 | 6    | 5
14102 | 13   | 3
14202 | 15   | 5
14302 | 0    | 4
14402 | 32   | 0
14502 | 138  | 0
16202 | 16   | 0
16302 | 17   | 1
18004 | 58   | 1
18005 | 6    | 0
18006 | 4    | 1
Table 3-10 The confusion matrix for the KNN classifier using 10-fold cross validation on the data
from phase B for turn-off events (condensed: Correct = diagonal entry; Misclassified = sum of the
off-diagonal entries in each ground-truth row)

Ground Truth | Correct | Misclassified
0     | 1104 | 36
300   | 7    | 9
1802  | 5    | 6
12002 | 1    | 7
12202 | 13   | 9
18004 | 59   | 1
18102 | 9    | 10
18202 | 11   | 2
18302 | 787  | 0
18304 | 118  | 0
18306 | 4    | 0
18308 | 12   | 1
18310 | 14   | 2
Moreover, in Apartment 1, the noise level on phase B is higher than on phase A. The laptop and
LCD monitor are both fed from phase B, and, as can be observed, a large number of false positive
events were detected on this phase. The noise level on phase B is very close to the variation in
the power draw of the laptop and LCD monitor, which explains the lower accuracy observed for
these two loads.
3.3 Summary
In summary, event-based NILM is used as the baseline approach in addressing the research
objectives of this dissertation. Although the current and voltage signals are acquired and stored
independently, the analyses are carried out using power metrics calculated through the spectral
envelope coefficient approach to account for non-linear loads and the higher harmonic content of
the current waveforms. A modified generalized likelihood ratio test and a KNN classifier with
K = 1 are used for event detection and classification, respectively. The experimental studies were
carried out in three apartment units with fully labeled data, obtained by matching the variations
in the aggregated power time series with individual load-level sensing.
Chapter Four: Unsupervised Feature Vector Clustering
As described in Section 2.3, one of our objectives is to facilitate active training by transferring
the burden of the labeling process to the computing side and then asking users to provide labels
for populated clusters of feature vectors. This process was described as anonymous labeling in
Chapter 2. Accordingly, the first step in addressing this objective is the provision of a
pre-populated training data set. Pre-populating a training data set requires grouping the events'
signatures into similar classes (or clusters). This could be achieved through clustering
algorithms, which group signatures into clusters such that the members of each cluster are more
similar to each other than to the signatures in other clusters, based on a predefined similarity
metric. Although clustering is an unsupervised approach, for the majority of clustering algorithms
the number of clusters is an input parameter. However, as noted, the objective is to reduce the
challenges associated with the training process. The number of appliance state changes, and thus
the number of event types, depends on the number of appliances and their operational states. As
shown in Chapter 3, given the differences in the number of appliances in each building, their
internal components, and the fact that not all operational states can be observed and detected by
users, determining the number of appliance state transitions (i.e., the number of clusters) is not
a trivial task and requires close monitoring of the appliances by trained users. Accordingly, as
described in one of the research questions, we are interested in the autonomous clustering of
similar event signatures associated with appliance operational state transitions; in this context,
these clusters are called the natural partition of the feature space.
Various clustering approaches could be used to define the natural partitions in a data set. One
well-known category of clustering approaches comprises centroid-based algorithms such as k-means
or C-means. In this class of algorithms, each cluster is defined by its central vector, and the
problem is solved as an optimization problem given that the number of clusters is a known
parameter. Moreover, due to the formulation of the optimization problem, these algorithms find
clusters of approximately the same size, since each feature vector is assigned to the nearest
centroid. The hierarchical clustering approach could help overcome the aforementioned issues by
finding clusters at different levels of granularity in the feature space; therefore, this is the
approach that we have adopted as the base approach in addressing the research question. Although
other clustering algorithms that are unsupervised (in the sense that the number of clusters is not
required as an input parameter), such as mean shift and spectral clustering, could also be used
for this purpose, all of these methods require determining a separation threshold at some stage of
the computation for autonomous clustering. We adopted hierarchical clustering, arguing that the
information contained in the structure of the hierarchical binary cluster tree could be used for
autonomous determination of the clustering threshold. Accordingly, in this chapter, the
development of a heuristic algorithm based on hierarchical clustering for the autonomous
clustering of similar event signatures is described and evaluated.
4.1 Hierarchical Clustering Heuristic Algorithm
Given that the set of event signatures (i.e., feature vectors), FV, has been extracted, the
problem is defined as finding the natural partition of the feature space. Hierarchical clustering
is a connectivity-based clustering approach that can provide clusters at different levels of
granularity in a feature space. The foundation of our heuristic lies in the application of
agglomerative clustering, which is a bottom-up approach [26]. Hierarchical clustering uses
distance measures between feature vectors and connectivity measures between clusters to find all
possible partitionings of a feature space in the form of a binary tree. The agglomerative
hierarchical clustering algorithm starts with all feature vectors as singleton clusters. The
pairwise distance matrix, D_pw, of all singleton clusters is generated from the distances between
the feature vectors. The closest clusters are merged to form binary clusters; the distance matrix
is updated, and the iterative merging continues until only one binary cluster remains, containing
all of the sub-clusters and singleton clusters. Figure 4-1 presents the algorithm for building the
binary cluster tree.
Input: FV
    C_b ← all x ∈ FV as singleton clusters
    N_C ← size(C)
    For each x_i:
        D_pw ← dist_x(x_i, x_j)
    End
    While no single cluster c ∈ C covers all N_C clusters:
        C_b ← merge the closest clusters using D_pw
        N_C ← size(C)
        D_pw ← dist_c(c_i, c_j)
    End
Output: Binary Cluster Tree

FV: the set of events' signatures
C_b: the binary cluster set
N_C: the number of clusters
D_pw: the pairwise distance matrix
Figure 4-1 The agglomerative hierarchical binary cluster tree generation algorithm
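The algorithm in Figure 4-1 corresponds to standard agglomerative clustering, so a minimal sketch of the tree-building step can rely on SciPy (an illustration, not the implementation used in this work; FV is assumed to be a matrix of event signatures, one per row):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

def build_binary_tree(FV, metric="euclidean", method="average"):
    """Build the agglomerative binary cluster tree for a set of signatures.

    FV     : (n_signatures, n_features) array of event feature vectors
    metric : distance between feature vectors, e.g. 'euclidean' (L2) or
             'cityblock' (L1), matching equation 4-1
    method : linkage between clusters ('single', 'complete', or 'average'),
             matching equations 4-2 to 4-4
    Returns the (n-1, 4) linkage matrix that encodes the binary cluster
    tree; column 2 holds the between-cluster merge distances.
    """
    D_pw = pdist(np.asarray(FV, dtype=float), metric=metric)  # pairwise distances
    return linkage(D_pw, method=method)  # iterative merging up to a single root
```

The returned linkage matrix plays the role of the binary cluster tree in the rest of this chapter: its rows record which clusters were merged and at what distance.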
For building the binary cluster tree, distance and linkage metrics are used to find the distances
between feature vectors, dist_x(x_i, x_j), and between clusters, dist_c(c_i, c_j), respectively.
The L_p-norm of the vector x_i − x_j can be used to evaluate different distance functions,
including the common distance metrics, namely the Euclidean (L_2-norm) and the Manhattan
(city block, L_1-norm) distances:
dist_x(x_i, x_j) = ‖x_i − x_j‖_p = ( ∑_{d=1}^{D} |x_{id} − x_{jd}|^p )^{1/p}        (4-1)
where D is the number of features in each feature vector and p can be any positive integer,
yielding different distance metrics. Since clusters include more than one feature vector, the
distance between clusters is defined in the form of linkage metrics. Linkage metrics consider the
distances between different members of the clusters, resulting in different definitions: the
single linkage, the distance between the two closest feature vectors in the two clusters; the
complete linkage, the distance between the two most distant feature vectors; and the average
linkage, the average of the pairwise distances between the feature vectors of the two clusters.
Depending on the nature of the data being clustered, one of these linkage metrics might result in
better performance. These linkage metrics are defined as follows:
dist_single(c_i, c_j) = min_{x_m ∈ c_i, x_n ∈ c_j} d(x_m, x_n)        (4-2)

dist_complete(c_i, c_j) = max_{x_m ∈ c_i, x_n ∈ c_j} d(x_m, x_n)        (4-3)

dist_average(c_i, c_j) = (1 / (N_{c_i} N_{c_j})) ∑_{x_m ∈ c_i} ∑_{x_n ∈ c_j} d(x_m, x_n)        (4-4)
where c_k is the k-th cluster and N_{c_k} is the number of feature vectors in cluster k. The
algorithm in Figure 4-1 generates a binary cluster tree. Figure 4-2 illustrates the generated
binary tree (i.e., the dendrogram of the tree) using the power signatures from phase A of the
Apartment 1 test bed. The visualization is based on the signature data obtained for all of the
turn-on events.
Figure 4-2. Dendrogram of binary cluster tree for feature vectors of all turn-on events in a data set
from a residential setting; the tree shows up to one hundred leaf nodes
As Figure 4-2 shows, the dendrogram represents the structure of the feature vector space. This
dendrogram presents the cluster tree with one hundred leaf nodes. Each leaf node may include a
number of sub-clusters. The inverted-U connections (highlighted for two clusters in Figure 4-2
with a dash-dot line) represent a binary cluster that results from merging two sub-clusters. The
root node covers all feature vectors and sub-clusters. The height of the connecting lines shows
the distance between two clusters. As noted, once the tree is generated, it contains clusters at
different levels of granularity. Grouping the feature vectors into clusters can be carried out by
pruning the tree using different criteria, including the number of clusters, an inconsistency
threshold, and a distance threshold. As discussed earlier, providing the number of clusters is not
a viable approach, since it requires knowledge of the unique appliance state
transitions. An inconsistency threshold could be used for pruning. When the heights of two
successive links are close together, the links are considered consistent and there is no clear
distinction between the clusters. Inconsistency can be measured by comparing the height of a link
with the average heights of the links below it. In order to determine the clusters, an
inconsistency threshold has to be set. Therefore, we use the global inconsistency at the
cluster-tree level to avoid pruning based on a local inconsistency threshold; local
inconsistencies can arise from noise in the signal and from the feature extraction approach.
As shown in Figure 4-2, a distance threshold gives a horizontal pruning level in the tree for
separating clusters. By moving the distance threshold (represented by the dashed grid lines in
Figure 4-2), different granularities of clusters can be obtained. We use the characteristics of
the cluster tree structure to determine the distance threshold. As the structure of the tree
indicates, the differences between distances are small close to the leaf nodes, the distances
increase when moving towards the root node, and, consequently, the rate of distance growth between
the clusters also increases towards the root. The distance between clusters represents
similarity/dissimilarity. Therefore, we use the distance growth rate as a metric for pruning the
tree. Figure 4-3.a illustrates the variation in the distance measure between the binary clusters
along the tree, moving from the leaf nodes to the root.
The distance measure, dist (Figure 4-3.a), is obtained using the distance and linkage metrics, as
described in equations 4-1 to 4-4. The slope metric, δ, is defined as follows:

δ_i = dist_{i+1} − dist_i        (4-5)
Figure 4-3 Variation of different metrics, used for detecting the structure of the cluster tree,
along the cluster tree from the leaf nodes to the root; a) between-cluster distances along the
tree; b) slope measure representing the distance growth rate (equation 4-5); c) slope growth
measure representing the slope growth rate (equation 4-6)
As illustrated in Figure 4-3.a, at a certain point on the cluster tree, the distances start
growing very fast. This is the point at which the tree should be pruned into its constituent
sub-clusters. In order to find the associated location on the tree, the slope of the distance
measure curve, δ, does not provide enough information; therefore, we use the slope growth rate,
defined as follows:

Δ_i = (δ_{i+1} − δ_i) / δ_i        (4-6)
The distance threshold for pruning the tree, τ_d^*, is obtained by:
τ_d^* = argmax_{l_t} Δ        (4-7)
where l_t represents the different levels of the tree. However, as noted for the inconsistency
pruning threshold, a local inconsistency in the tree structure might result in a τ_d^* value in
the lower parts of the cluster tree. Consequently, this results in a number of clusters that does
not represent the natural (i.e., compatible with the appliances' state transitions) separation of
the data in the feature space. As the structure of the tree in Figure 4-2 implies, to address this
issue, the pruning distance threshold needs to be sought in the upper part of the cluster tree (if
it exists), close to the root node. The question is how to find the border. In order to find the
upper part of the tree, the information contained in the structure of the tree can again be used.
Figure 4-4.a shows the histogram of the δ values for the cluster tree. In order to find the upper
segment of the tree, where the distance growth rate is higher, a histogram segmentation method is
used. As the histogram illustrates, the majority of the δ values are close to zero; histogram bins
with higher δ values are sparsely populated, while bins with lower δ values are highly populated.
Therefore, the segmentation point of the tree is selected as the point where the weighted sums of
the δ values on the two sides of the histogram balance. The δ segmentation threshold is obtained
by finding the minimum of the absolute difference of the weighted sums of δ values in the
histogram:
τ_δ^* = argmin_{c ∈ C} |Δ_hw| = argmin_{c ∈ C} | ∑_{i=1}^{p} c_i n_i − ∑_{j=1}^{q} c_j n_j |,  c_i ∈ w_l, c_j ∈ w_r        (4-8)
where c_k is the center of the k-th histogram bin, n_k is the number of elements in that bin, C is
the set of all histogram bin centers, p is the number of histogram bins in w_l, and q is the
number of histogram bins in w_r. w_l and w_r are two contiguous summation windows, sharing a
common edge, that move along the histogram of δ. Once τ_δ^* is obtained, the upper part of the
binary cluster tree is determined by moving upwards from the leaf nodes and finding the level at
which the median of the δ values from that point to the root is greater than τ_δ^*. The remaining
set is called δ_u and is used in equations 4-6 and 4-7. In this process, the number of histogram
bins is set equal to the size of the δ set. Figure 4-4 illustrates the histogram segmentation
process for the cluster tree. If the segmentation point does not exist (for example, when
clustering very similar feature vectors), δ_u is equal to δ.
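A compact sketch of the pruning-threshold heuristic in equations 4-5 to 4-8, under the simplifying assumption that the between-cluster distances are read directly from the merge heights of the binary cluster tree (function name and implementation details are illustrative):

```python
import numpy as np

def pruning_threshold(dist):
    """Distance threshold tau_d* for pruning the binary cluster tree.

    dist : 1-D array of between-cluster distances along the tree, ordered
           from the leaf-level merges to the root (e.g. linkage column 2).
    """
    delta = np.diff(dist)  # slope metric delta (eq. 4-5)
    # Histogram segmentation (eq. 4-8): one bin per element of delta,
    # balancing the weighted sums on the two sides of the histogram.
    n, edges = np.histogram(delta, bins=delta.size)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w = centers * n
    imbalance = [abs(w[:k].sum() - w[k:].sum()) for k in range(1, w.size)]
    tau_delta = centers[int(np.argmin(imbalance)) + 1]
    # Upper part of the tree: first level whose median delta-to-root
    # exceeds tau_delta; if none exists, delta_u falls back to delta.
    upper = next((i for i in range(delta.size)
                  if np.median(delta[i:]) > tau_delta), 0)
    delta_u = delta[upper:]
    if delta_u.size < 2:
        upper, delta_u = 0, delta
    # Slope growth rate (eq. 4-6) and its maximizing level (eq. 4-7).
    denom = np.where(delta_u[:-1] == 0.0, 1e-12, delta_u[:-1])
    growth = np.diff(delta_u) / denom
    level = upper + int(np.argmax(growth))
    return dist[level + 1]
```

The guard on the denominator avoids division by zero for flat leaf-level merges, a numerical detail the equations leave implicit.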
Figure 4-4 Distance growth rate histogram segmentation process; a) the histogram of the δ values
of the cluster tree; b) variation of the δ values along the cluster tree; c) variation of |Δ_hw|
(the absolute difference of the weighted sums of δ) along the tree; d) variation of the δ_u values
along the cluster tree
Figure 4-5.a shows the dendrogram of the cluster tree for a sample feature space. This is the same
dendrogram presented in Figure 4-2, but the number of leaf nodes has been increased so that the
multi-scale nature of the tree can be observed. Data processing showed that, due to the nature of
the signatures in the feature space, the cluster tree has a multi-scale structure; when the
dissimilarity measure is evaluated over the entire feature space, the information contained in the
sub-components of the tree (sub-trees) does not contribute to determining the pruning threshold.
Although the leaf nodes in Figure 4-5.a do not show all of the singleton clusters, the scale
effect in the structure of the tree can be observed. Accordingly, when the pruning threshold is
evaluated and the feature vectors are clustered over the entire feature space, part of the feature
space remains as one cluster of feature vectors that might not be associated with a unique
appliance state transition. This cluster is defined as the residual cluster. Accordingly, we use
recursion in the clustering process to account for the scale effect in the cluster tree. In other
words, once the clusters at each scale are determined, the residual cluster introduces a different
structure for the cluster tree, in which the natural separation of the feature vectors is
amplified, thus enabling the application of the proposed heuristics at different scales. Since the
pruning at each recursion is performed at a horizontal level, each recursion generates a number of
clusters with only one or two feature vectors; in order to control the number of clusters, these
clusters are eliminated in our approach. A dispersion measure over all of the clusters in the
feature space is used to determine the residual cluster: the cluster with the maximum dispersion
value is selected. The dispersion is measured using the dispersion index:
I_d = σ² / μ        (4-9)
Figure 4-5 Dendrogram of the cluster tree at different scales through recursive clustering;
sub-graphs b to d represent the dendrogram of the cluster tree for the residual feature space
in which σ² is the vector containing the element-wise variance of the feature vectors in each
cluster, and μ is the vector of the element-wise means of the feature vectors in each cluster.
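A sketch of the residual-cluster selection based on equation 4-9 (illustrative only; it assumes each cluster is stored as an array of its member feature vectors, and it summarizes the element-wise ratio by its mean over features, an aggregation the text leaves implicit):

```python
import numpy as np

def residual_cluster(clusters):
    """Index of the residual cluster: the one with maximum dispersion (eq. 4-9).

    clusters : list of (n_members, n_features) arrays of feature vectors
    """
    scores = []
    for c in clusters:
        mu = c.mean(axis=0)                  # element-wise mean vector
        var = c.var(axis=0)                  # element-wise variance vector
        safe_mu = np.where(np.abs(mu) < 1e-12, 1e-12, mu)  # guard division
        I_d = var / safe_mu                  # dispersion index, eq. 4-9
        scores.append(np.mean(np.abs(I_d)))  # scalar summary over features
    return int(np.argmax(scores))
```

A tightly packed cluster has near-zero element-wise variance and thus a small dispersion score, while a cluster mixing distinct signature shapes scores high and is selected as the residual.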
Figure 4-5.b-d illustrate the residual trees for different recursions. The amplification of the
scale through the successive recursions can be observed in these figures. The algorithm
recursively searches the tree for natural separations in the feature space until no further
separation in the residual tree can be achieved (i.e., the number of clusters in the recursion is
equal to 1). The pseudocode of the proposed heuristic algorithm is presented in Figure 4-6. In
this algorithm, the agglomerative hierarchical binary cluster tree generation algorithm, presented
in Figure 4-1, is used as the aggTree function.
Input: FV
    N_cr ← 0
    C ← ∅
    FV_r ← FV
    Function RH(FV_r)
        While N_cr ≠ 1
            T_c ← aggTree(FV_r)
            dist ← extract dist from T_c
            δ ← calculate δ using dist (equation 4-5)
            (N_h, C_h) ← histogram(δ)
            τ_δ^* ← argmin_{c ∈ C_h} | ∑_{i=1}^{p} c_i n_i − ∑_{j=1}^{q} c_j n_j |;  c ∈ C_h, n ∈ N_h
            δ_u ← update δ using τ_δ^*
            Δ ← calculate Δ using δ_u (equation 4-6)
            τ_d^* ← argmax_{l_t} Δ
            C_r ← prune the tree into clusters using τ_d^*
            C ← C ∪ C_r
            N_cr ← size(C_r)
            FV_r ← argmax_{c ∈ C} I_d(c)
            RH(FV_r)
        End
    End Function

FV: the set of events' signatures
FV_r: the set of residual feature vectors
N_cr: the number of clusters in each recursion
C: the set of clusters
T_c: the cluster tree
N_h: the number of elements in the histogram bins
C_h: the values of the histogram bin centers
C_r: the set of clusters in each recursion
Figure 4-6 Recursive hierarchical clustering (RH) algorithm
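The recursion in Figure 4-6 can be outlined with a simplified, self-contained stand-in (illustrative only: the pruning threshold here is taken at the largest jump in merge distances, a simplification of equations 4-5 to 4-8, and small or deep recursions are cut off by guards the original algorithm does not need):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def recursive_clustering(FV, min_size=3, max_depth=10):
    """Recursive hierarchical clustering in the spirit of Figure 4-6."""
    clusters = []

    def rh(points, depth):
        if len(points) < min_size or depth >= max_depth:
            clusters.append(points)
            return
        Z = linkage(points, method="average")
        jumps = np.diff(Z[:, 2])                # distance growth along the tree
        i = int(np.argmax(jumps))               # largest jump in merge distances
        t = (Z[i, 2] + Z[i + 1, 2]) / 2.0       # prune between the two levels
        ids = fcluster(Z, t=t, criterion="distance")
        groups = [points[ids == k] for k in np.unique(ids)]
        if len(groups) == 1:                    # no further natural separation
            clusters.append(points)
            return
        # dispersion index (variance-to-mean ratio, averaged over features)
        disp = [np.mean(np.abs(g.var(0) /
                               np.where(np.abs(g.mean(0)) < 1e-12, 1e-12, g.mean(0))))
                if len(g) > 1 else 0.0 for g in groups]
        r = int(np.argmax(disp))                # residual cluster
        clusters.extend(g for k, g in enumerate(groups) if k != r)
        rh(groups[r], depth + 1)                # re-cluster the residual set

    rh(np.asarray(FV, dtype=float), 0)
    return clusters
```

Each point ends up in exactly one output cluster: non-residual groups are emitted at each recursion, and the residual set is re-clustered until it can no longer be separated.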
4.2 Performance Evaluation of the Algorithm
The performance of the algorithm for the autonomous clustering of appliance signatures was
evaluated using the same two weeks of data that were used for the classification evaluation in
Chapter 3. The evaluation of the clustering algorithms sheds more light on the importance of the
labeling process and the need for sub-labels. As presented in Table 3-1, all appliance state
transitions have been uniquely labeled to ensure a fair evaluation of the clustering algorithm.
The appliances with two sub-labels (e.g., the toaster with its turn-on (16301) and turn-off
(16302) events) are the ones with binary operational states. However, these binary operational
states for an appliance such as a hair dryer could still have different signatures because of the
differences in power draw in its various modes of operation (i.e., low, medium, high). Since in a
NILM system the sign of a signature (whether it is related to a turn-on or a turn-off event) can
be evaluated autonomously and the analysis can be carried out independently on each phase, the
same data separation scenarios (turn-on and turn-off events on each phase) were considered for the
clustering evaluation.
Figure 4-7 Feature vector classes for phase A turn-on events, manually labeled by a user, using
the ground truth sensors’ data
To provide a more tangible sense of the feature space, Figure 4-7 illustrates all of the real
power components of the turn-on events on phase A in Apartment 1. As this figure shows, the
feature vectors were extracted using the event detection algorithm explained in Section 3.1.2. For
the feature extraction process, the before- and after-event windows, w_b and w_a, were set to 40
and 60 samples at a power resolution of 60 Hz (two thirds of a second before and one second after
the event), respectively. This is the same configuration as the data used for the classification
algorithm evaluation. Since the event detection algorithm was used for the labeling process, a
number of false positives were also detected. The label 0 in the data set is associated with those
false positive or insignificant events (changes in power draw of less than 20 watts) that do not
represent actual events in the power time series. The label 200 in Figure 4-7 is for the lighting
fixtures' feature vectors that could not be labeled with a unique label/sub-label.
4.3 Performance Evaluation Metrics
Different performance evaluation metrics have been introduced for clustering algorithms. In one category, the metrics are distance based, such as the Dunn index [86], which uses the ratio between the minimum distance between clusters (inter-cluster) and the maximum distance between feature vectors in each cluster (intra-cluster). In this dissertation, since the labels of the feature vectors are known, the evaluation is carried out following a procedure similar to the one used for supervised learning problems, by matching the ground truth labels with the labels of the clusters. However, since the number of clusters is not necessarily equal to the number of sub-labels (see Table 4-1 as an example), in order to use a confusion matrix and consequently F-measure metrics, we use a mapping approach in addition to a complementary metric for cluster quality evaluation. The outcome of the clustering algorithm can be represented in a matrix similar to a confusion matrix. The evaluation metrics are described through the presentation of the confusion matrix resulting from running the heuristic algorithm on the data shown in Figure 4-7. Table 4-1 presents the association matrix, which shows the association between the clusters assigned by the algorithm and the sub-labels of the feature vectors.
Table 4-1 The matrix representing the association between cluster labels and ground truth labels (association matrix)

[Rows: ground truth labels (0, 200, 300, 11101, 11102, 12900, 12901, 12902, 12903, 12906, 14101, 14201, 14301, 14401, 14501, 16201, 16301, 18001, 18002); columns: the 32 cluster labels autonomously assigned by the algorithm; final column N_o: the number of feature vectors associated with each ground truth label.]
In Table 4-1, the first row represents the numerical labels autonomously assigned to the clusters by the algorithm. The first column shows the labels of the ground truth data classes, and the last column shows the N_o values, which are the numbers of feature vectors associated with each ground truth label. The association matrix is mapped to a conventional confusion matrix as follows. First, each cluster is labeled with the class label associated with the majority of the feature vectors in that cluster; once all the clusters are labeled, the clusters with the same labels are merged and labeled with the associated class label, moving the entire column related to each cluster in the merge. Second, upon completion of the mapping, precision, recall, and the F-measures are calculated for the resulting confusion matrix. Table 4-2 illustrates the outcome of the mapping procedure to create a metric similar to the conventional confusion matrix.
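The two mapping steps above can be sketched directly; the association matrix is assumed to be a NumPy array with ground truth labels as rows and cluster labels as columns.

```python
import numpy as np

def map_to_confusion(assoc):
    """Map an association matrix to a confusion-matrix-like form:
    label each cluster with the ground truth class holding the majority
    of its feature vectors, then merge (sum) columns sharing a label."""
    owner = assoc.argmax(axis=0)                # majority class per cluster
    kept = sorted(set(owner))
    merged = np.zeros((assoc.shape[0], len(kept)), dtype=assoc.dtype)
    for j, lab in enumerate(kept):
        merged[:, j] = assoc[:, owner == lab].sum(axis=1)
    return merged

assoc = np.array([[5, 0, 1],
                  [0, 4, 3]])
conf = map_to_confusion(assoc)   # clusters 2 and 3 merge under class 1
```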
Another important factor in evaluating the algorithm's performance is the number of clusters associated with each ground truth label, as well as the density of the clusters. In this context, this factor is called the cluster quality. As noted in Section 3.3, in each recursion, the clusters with one or two feature vectors are removed in order to control the number of clusters. Accordingly, in evaluating the effect of the feature extraction methods and the distance and linkage metrics, the number of clusters and eliminated feature vectors play an important role. The ideal condition is to have the same number of clusters as the number of ground truth labels, with each cluster containing all of its original ground truth feature vectors. This condition is interpreted as the ideal cluster quality. Although this ideal situation might not occur (due to the variation of the signatures), it is used in this study as the benchmark for the cluster quality analyses.
Table 4-2 The mapped version of the association matrix in the form of a conventional confusion matrix

[Rows: ground truth labels (0 through 18002, as in Table 4-1); columns: the 19 merged cluster labels autonomously assigned by the clustering algorithm after the mapping procedure.]
The evaluation of cluster quality could be carried out qualitatively by inspecting the association matrix. However, to facilitate the comparison, we introduce a cluster quality (CQ) graph as a visual metric. Figure 4-8 shows the CQ graph for the data presented in Figure 4-7, using the basic feature vector with the city block distance and single linkage metrics. The CQ graph is presented on the left side; the x axis shows the number of clusters and the y axis shows the density of the clusters. On the right side, a portion of the CQ graph has been zoomed in for clarification. The CQ graph is interpreted by looking at the distance between the markers and the ideal point of (1,1): the farther the markers lie from the ideal point, the lower the quality of the clustering. Figure 4-8 shows a relatively high quality of clustering, since most of the classes have high densities with one or two clusters.

Figure 4-8 Cluster quality (CQ) metric representation; the left side shows the results for all clusters and the right side shows the congested area at a larger scale

In addition to the CQ graph for comparison, to provide a quantitative metric for comparing different features and distance metrics, we introduce the cluster quality index (CQI) to represent the information contained in the CQ graph.
CQI = (1/N_l) Σ_{l=1}^{N_l} (ρ_l / N_cl^l)     (22)

ρ_l = (N_r^FV / N_a^FV)_l     (23)

in which ρ_l is the density associated with the feature vectors of label l, defined as the ratio between N_r^FV, the remaining number of feature vectors, and N_a^FV, the actual number of feature vectors for that label; N_cl^l is the number of clusters for each label l; and N_l is the number of labels. Although the absolute value of the CQI could be used for evaluation, lower values of the CQI are not necessarily an indication of poor performance. For example, if in a clustering procedure ρ_l is equal to one for all labels and the algorithm returns one cluster for half of the labeled feature vectors and two clusters for the other half, a CQI value of 0.75 is obtained. Although this is a desirable condition, the absolute CQI is relatively low. Accordingly, relative CQI values are used for the comparison between different feature extraction methods and distance metrics.
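Equations (22) and (23) can be computed directly; the per-label tuples below are illustrative values reproducing the 0.75 example from the text.

```python
def cluster_quality_index(labels_info):
    """CQI per equations (22)-(23): for each label l, rho_l is the ratio
    of remaining to actual feature vectors, divided by the number of
    clusters N_cl for that label; CQI is the mean over all labels."""
    return sum((n_rem / n_act) / n_cl
               for n_rem, n_act, n_cl in labels_info) / len(labels_info)

# rho_l = 1 for all labels; one cluster for half of the labels and two
# clusters for the other half yields CQI = (1/1 + 1/2) / 2 = 0.75
cqi = cluster_quality_index([(10, 10, 1), (10, 10, 2)])
```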
4.4 Algorithm Evaluation
The evaluation of the algorithm was carried out for different feature extraction methods and for different distance and linkage functions. As noted in Section 3.1.3, the feature extraction was carried out using the event detection algorithm to account for realistic variations in the feature vectors, which arise from differences in the event indices detected by the algorithm. The feature vectors with label 0 could be eliminated from the feature space before clustering. In order to remove these feature vectors, ΔP, the change in first harmonic real power draw at the point of the event, was calculated; if |ΔP| is less than a certain threshold, the associated feature vector could be eliminated. However, because of the differences in appliances' power draw and the fact that the locations of the detected events on the real power time series do not necessarily coincide with the actual points of the events, finding a ΔP threshold is challenging for some of the events. Accordingly, although a large portion of the feature vectors with label 0 could be excluded from the cluster analysis, complete elimination of zero-labeled feature vectors could result in the loss of some important events. Consequently, a portion of these feature vectors remained in the feature space.
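The |ΔP| pre-filter described above can be sketched as follows; the 20-watt default echoes the "unimportant event" threshold mentioned for label 0, and the record layout is a hypothetical simplification.

```python
def prefilter_events(events, threshold=20.0):
    """Drop events whose absolute change in fundamental real power, |dP|,
    falls below the threshold; these are likely label-0 (false positive
    or unimportant) events. Events at or above the threshold are kept."""
    return [e for e in events if abs(e["delta_p"]) >= threshold]

events = [{"id": 1, "delta_p": 350.0},
          {"id": 2, "delta_p": -5.0},    # likely a false positive
          {"id": 3, "delta_p": -120.0}]
kept = prefilter_events(events)
```

As the text notes, any fixed threshold trades false positives against lost low-power events, which is why some zero-labeled vectors remain in the feature space.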
Since the dissimilarity measures between clusters, and therefore the structure of the feature space, play an important role in the performance of the algorithm, 5-fold cross validation was used in the evaluations. In addition, to avoid biased conclusions, the results were reported as average values across five 5-fold cross validations; the performance metrics presented in the following tables are therefore mean values. The combination of all distance and linkage functions (equations 4-1 to 4-4) with the different feature extraction methods would result in a large number of cases for algorithm evaluation. Therefore, the effect of the distance and linkage functions was evaluated using a basic feature vector (i.e., [P_1, Q_1]) on phase A turn-on events. The Euclidean and city block distance metrics, along with three linkage metrics, were taken into account. Accordingly, six combinations were evaluated, and the results are presented in Table 4-3. CQI_m is a modified CQI for which the data related to feature vectors with labels 0 and 300 were removed; the feature vectors in these two classes could be associated with multiple clusters, and thus their data could result in an inaccurate reduction of the CQI. In order to determine the optimum results, PI, an index of proximity to the ideal case, was defined as the distance between the pair (F-measure, CQI_m) and the point (1,1); the case with the minimum PI is the optimum condition. As the results in Table 4-3 show, single linkage along with the city block distance metric resulted in the best combination. Since the objective is to facilitate training by reducing the number of true clusters, which represent the appliance state transitions, a higher CQI value is desirable.
Table 4-3 Performance of the heuristic algorithm for turn-on events on phase A for different linkage and distance metrics (average values across five 5-fold cross validation results)

Case no. | Distance   | Linkage  | Precision | Recall | F-Measure | CQI  | CQI_m** | PI
1        | City Block | Single   | 0.91      | 0.93   | 0.90      | 0.55 | 0.60    | 0.41
2        | Euclidean  | Single   | 0.91      | 0.95   | 0.91      | 0.45 | 0.49    | 0.51
3        | City Block | Average  | 0.88      | 0.91   | 0.86      | 0.48 | 0.53    | 0.49
4        | Euclidean  | Average  | 0.92      | 0.92   | 0.89      | 0.41 | 0.45    | 0.56
5        | City Block | Complete | 0.90      | 0.89   | 0.85      | 0.39 | 0.42    | 0.60
6        | Euclidean  | Complete | 0.91      | 0.92   | 0.89      | 0.37 | 0.40    | 0.61

** The cluster quality index excluding the feature vectors related to labels 0 and 300
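The PI index is simply the Euclidean distance of the (F-measure, CQI_m) pair from the ideal point (1, 1); case no. 1 of Table 4-3 reproduces the tabulated value.

```python
import math

def proximity_index(f_measure, cqi_m):
    """PI: distance between the (F-measure, CQI_m) pair and the ideal
    point (1, 1); lower values indicate a better combination."""
    return math.hypot(1.0 - f_measure, 1.0 - cqi_m)

pi_case1 = proximity_index(0.90, 0.60)   # case no. 1: about 0.41
```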
The effects of the feature extraction methods were evaluated using the city block distance and single linkage metrics. Five cases were compared: (1) the real and reactive power of the fundamental frequency component (the basic feature vector), (2) the basic feature vector with a kernelized distance function, (3) the real and reactive power for the first nine harmonics, (4) the real and reactive power for the first five odd harmonics, and (5) the features representing the noise-reduced version of the transients using higher order linear regression (Equation 3-12). Case no. 4 was considered because the odd harmonic components of the current waveform contain more information, so eliminating the even harmonics could potentially improve the algorithm's performance. Since case no. 5 represents noise-reduced features for the transients, the dimension of the feature vectors is reduced, and different r (i.e., the degree of the polynomial basis functions) and s (i.e., the number of Fourier basis functions) values were used to extract features.
The regression coefficient vectors for the real power segment and the reactive power segment were combined and used as the feature vector. Figure 4-9 shows the sensitivity analysis of the algorithm's performance for different model parameters on the data from phase A turn-on events. As this figure shows, changes in the values of r and s do not change the results dramatically; however, slightly better results were obtained for r = 1 and s = 5-20. For case no. 2, the application of a kernelized distance, using a polynomial kernel function with degree 1, was investigated to explore whether it is effective in magnifying the dissimilarities. The kernelized distance was observed to improve the results in some individual runs, and we therefore explored its effect through cross validation.
d_k(x_m, x_n) = x_m^T x_m + x_n^T x_n − 2 x_m^T x_n = k(x_m, x_m) + k(x_n, x_n) − 2 k(x_m, x_n)     (4-10)
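Equation (4-10) expresses the squared distance entirely through kernel evaluations. The sketch below checks it against the explicit squared Euclidean distance; the degree-1 polynomial kernel is taken here as a plain inner product (any additive constant in the kernel cancels in the three-term combination).

```python
import numpy as np

def kernelized_sq_distance(xm, xn, k):
    """Squared distance computed purely from kernel evaluations,
    as in equation (4-10)."""
    return k(xm, xm) + k(xn, xn) - 2.0 * k(xm, xn)

# Degree-1 polynomial kernel, taken here as a plain inner product,
# recovers the ordinary squared Euclidean distance.
poly1 = lambda a, b: float(np.dot(a, b))
xm, xn = np.array([3.0, 0.0]), np.array([0.0, 4.0])
d2 = kernelized_sq_distance(xm, xn, poly1)   # ||xm - xn||^2 = 25.0
```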
Table 4-4 shows the results of the analyses for the turn-on events on phase A. The best results were obtained in the cases where the basic feature vectors and higher harmonic contents were used (i.e., cases no. 1, 3, and 4). For this part of the data, the addition of the higher harmonic contents did not improve the performance; however, using the odd-numbered harmonics showed better performance. Using the kernelized distance resulted in an equally high F-measure value; however, the highest CQI_m values were obtained in cases no. 1 and no. 3. The application of linear regression for modeling the transients resulted in relatively poor performance compared to the other cases. The association matrix and CQ graph for the entire set of turn-on events on phase A (for a single run on the entire data set using the basic feature vectors, case no. 1) are presented in Table 4-1 and Figure 4-8, respectively. As these illustrations show, the algorithm exhibited promising performance for case no. 1.
Table 4-4 Performance of the heuristic algorithm for turn-on events on phase A for different feature extraction methods (average values across five 5-fold cross validation results)

Case no. | Feature Vector                    | Precision | Recall | F-Measure | CQI  | CQI_m** | PI
1        | [P_1, Q_1]                        | 0.91      | 0.93   | 0.90      | 0.55 | 0.60    | 0.41
2        | [P_1, Q_1] (kernelized distance)  | 0.90      | 0.95   | 0.90      | 0.42 | 0.46    | 0.55
3        | [P_1, Q_1, …, P_9, Q_9]           | 0.92      | 0.93   | 0.90      | 0.53 | 0.58    | 0.43
4        | [P_1, Q_1, P_3, Q_3, …, P_9, Q_9] | 0.91      | 0.93   | 0.89      | 0.55 | 0.60    | 0.42
5        | [α, β, γ]_{P,Q}*                  | 0.87      | 0.79   | 0.76      | 0.29 | 0.31    | 0.73

* The set of coefficients in equation 18 for real and reactive power
** The cluster quality index excluding the feature vectors related to labels 0 and 300
Figure 4-9 Variation of the F-Measure and CQI indices for different numbers of Fourier basis functions and degrees of polynomial basis functions
The same cases were considered for the turn-off events on phase A. The problem could be more challenging for turn-off events, since the information related to the dynamics of the load variation is missing. Figure 4-10 illustrates the feature vectors for the turn-off events on phase A. As this figure shows, there are apparent similarities between some of the
feature vectors (e.g., labels 11103 and 14502). Table 4-5 shows the evaluation results. Similarly, the feature extraction methods in cases no. 1, 3, and 4 resulted in better performance, considering their lower PI values. The performance of the algorithm for these three cases was similar for both the turn-on and turn-off event data sets, with subtle changes in the accuracy and CQI values, which indicates that the addition of the higher harmonic contents did not affect the algorithm's performance for the feature space on this phase. Furthermore, as the results for both analyses show, the application of the kernelized distance using a polynomial kernel function with degree 1 did not improve performance. Although the F-measure values are almost identical for the turn-on and turn-off cases, the turn-on events on phase A have higher CQI values.
Table 4-5 Performance of the proposed heuristic algorithm for turn-off events on phase A (average values across five 5-fold cross validation results)

Case no. | Feature Vector                    | Precision | Recall | F-Measure | CQI  | CQI_m** | PI
1        | [P_1, Q_1]                        | 0.96      | 0.89   | 0.91      | 0.46 | 0.51    | 0.50
2        | [P_1, Q_1] (kernelized distance)  | 0.97      | 0.85   | 0.89      | 0.37 | 0.41    | 0.60
3        | [P_1, Q_1, …, P_9, Q_9]           | 0.97      | 0.87   | 0.90      | 0.45 | 0.49    | 0.52
4        | [P_1, Q_1, P_3, Q_3, …, P_9, Q_9] | 0.94      | 0.89   | 0.90      | 0.46 | 0.51    | 0.50
5        | [α, β, γ]_{P,Q}*                  | 0.90      | 0.84   | 0.83      | 0.31 | 0.35    | 0.68

* The set of coefficients in equation 18 for real and reactive power
** The cluster quality index excluding the feature vectors related to labels 0 and 300
The evaluation of the performance for phase B turn-on and turn-off events was also carried out. In general, in our experimental test bed, the noise level on phase B was higher. The results are presented in Table 4-6. As the results for both turn-on and turn-off events show, better performance was again observed in the cases where the basic feature vectors and higher harmonic contents were used. However, for the feature space on this phase (both turn-on and turn-off events), the algorithm was more sensitive to the choice of harmonic contents, and the use of the first nine harmonics showed relatively better performance. The relatively lower F-measure values for the turn-off events could be an indicator of the reduced information caused by the missing dynamics of load variation in turn-off events.
Figure 4-10 Feature vector clusters for phase A turn-off events, manually labeled by a user, using the ground truth sensors' data
The aforementioned observations for the data analyses on both phases show that the effectiveness of different feature extraction methods for autonomous clustering (using the approach proposed in this study) could depend on the structure of the feature space. Therefore, depending on the nature of the feature space, different features could result in different values of the performance metrics. However, in general, the results showed the capability of the heuristic algorithm for accurate partitioning of the feature space. As a general observation, based on the results of our analyses, the application of the basic feature vectors or higher harmonic contents, along with the city block distance and single linkage metrics, could result in an acceptable partitioning of the feature space.
Table 4-6 Performance of the heuristic algorithm for turn-on and turn-off events on phase B for different feature extraction methods (average values across five 5-fold cross validation results)

Type | Case no. | Feature Vector                    | Precision | Recall | F-Measure | CQI  | CQI_m** | PI
On   | 1        | [P_1, Q_1]                        | 0.89      | 0.92   | 0.87      | 0.35 | 0.41    | 0.60
On   | 2        | [P_1, Q_1] (kernelized distance)  | 0.95      | 0.92   | 0.91      | 0.29 | 0.34    | 0.67
On   | 3        | [P_1, Q_1, …, P_9, Q_9]           | 0.91      | 0.91   | 0.88      | 0.36 | 0.43    | 0.59
On   | 4        | [P_1, Q_1, P_3, Q_3, …, P_9, Q_9] | 0.92      | 0.93   | 0.89      | 0.30 | 0.35    | 0.65
On   | 5        | [α, β, γ]_{P,Q}*                  | 0.80      | 0.85   | 0.78      | 0.20 | 0.24    | 0.79
Off  | 6        | [P_1, Q_1]                        | 0.90      | 0.85   | 0.85      | 0.40 | 0.44    | 0.58
Off  | 7        | [P_1, Q_1] (kernelized distance)  | 0.92      | 0.82   | 0.85      | 0.38 | 0.39    | 0.62
Off  | 8        | [P_1, Q_1, …, P_9, Q_9]           | 0.89      | 0.85   | 0.85      | 0.45 | 0.49    | 0.53
Off  | 9        | [P_1, Q_1, P_3, Q_3, …, P_9, Q_9] | 0.88      | 0.86   | 0.84      | 0.40 | 0.43    | 0.59
Off  | 10       | [α, β, γ]_{P,Q}*                  | 0.80      | 0.79   | 0.74      | 0.14 | 0.15    | 0.88

* The set of coefficients in equation 18 for real and reactive power
** The cluster quality index excluding the feature vectors related to labels 0 and 300
As shown, depending on the structure of the feature space, the application of different harmonic contents might result in better partitioning of the feature space. To find the optimum feature extraction method that fits well with the structure of the feature space, an unsupervised method of cluster quality evaluation is required. Metrics such as inter-cluster and intra-cluster distance optimization could be a solution, and the author plans to investigate this as part of future research. Furthermore, more analysis on different data sets could provide the grounds for drawing statistically significant conclusions about the effect of different feature extraction methods.
4.5 Summary
A heuristic algorithm for autonomous clustering of the feature space, based on hierarchical clustering, was presented and evaluated to enable partitioning of the feature space into groups of similar samples related to each appliance state transition. In this heuristic algorithm, the characteristics of the binary cluster tree are used to determine the distance threshold for pruning the tree. To account for the multi-scale nature of the cluster tree, the algorithm finds the natural partitions of the feature space at different scales in a recursive fashion.
The algorithm was evaluated on two weeks of data from the Apartment 1 data set. The power time series in the data set was fully labeled with the different appliances' operational states using the ground truth sensor network. For the performance evaluation, accuracy metrics, as well as a customized metric for cluster quality, were used. The evaluation of the algorithm was carried out for different distance and linkage metrics and for different feature extraction methods. The evaluations demonstrated the potential of the proposed algorithm for accurate partitioning of the feature space, with high F-measure values (above 0.85 for the majority of the cases) across various evaluation scenarios. The assessment of different feature extraction methods showed that the application of the basic feature vector (real and reactive power for the fundamental frequency in proximity of the events) and higher harmonic contents of the power time series results in acceptable partitioning of the feature space, with high accuracy and a relatively high cluster quality index. However, depending on the structure of the feature space, each of these feature extraction methods could improve the quality of the clustering, and unsupervised determination of the better feature extraction method could enable an autonomous cluster quality evaluation approach. This algorithm is used as a component, in concert with the baseline event-based NILM approach, for developing a user-centric smart interaction framework to address the remaining research questions and achieve the objectives of this dissertation. The following chapter describes the framework and evaluates its performance in addressing the objectives.
Chapter Five: User-Centric Smart Interaction Framework
User-centric design focuses on problem solving with a specific emphasis on end-user requirements. In user-centric applications, the processes are modified to ensure that they are well suited to user requirements, rather than asking the user to accommodate the limitations of the processes. As noted, in-situ training is required to ensure that examples of labeled signatures, compatible with the appliances in a specific setting, are provided to the NILM system. This process requires continuous user interaction with the NILM system to ensure that examples of all possible signatures (or at least the signatures that are commonly generated in that specific setting) have been provided. However, as mentioned in the problem statement (Section 2.2), this process could result in multiple calls for user interaction until the training is completed. Furthermore, some appliances might need more than one example for the training to be complete. Accordingly, in this chapter, the user-centric NILM system is assessed from the user-NILM interaction perspective in order to quantify the interaction requirements of the in-situ training process, provide more insight into the challenges and their sources, and present an intelligent framework that facilitates training by reducing the user-system interaction requirements.
5.1 User-Centric NILM System
As described earlier, the process of labeling includes the provision of example labels for the signatures of the appliances operating in a specific setting. The characteristics of the power time series in proximity of the events of appliances' state changes are treated as signatures, and in this dissertation we use a feature vector of the real and reactive power components extracted from a high resolution (≥ 20 Hz) power time series to account for the transients (refer to Section 3.1.3). The training process, in its simplest form, could be carried out by triggering different appliances' operational modes (i.e., generating the signatures) and providing labels for those signatures. However, a user-centric NILM system facilitates this process by using the algorithms described in Chapter 3 in the form of a real-time NILM framework for power metrics processing, event detection, classification, and communication with the user. Accordingly, a user-centric NILM framework is defined as depicted in Figure 5-1. This framework is used as the baseline framework for the user-centric NILM system in this dissertation.
Figure 5-1 User-Centric NILM framework components and their relationship (data collection and processing: current and voltage (v and i) processed into power metrics (P_k, Q_k); edge (event) detection; event feature extraction; event classification; appliance state transition detection; training data library; user interaction through a user interface)
5.2 Basic User-Centric Training
Leveraging the user-centric framework illustrated in Figure 5-1, the basic training process is defined as a facilitated approach to NILM system-user interaction based on an active training concept. Active training, as noted earlier, is the method in which users react to the detected events and provide labels for them as appliances change their operational modes. The basic training process is illustrated in Figure 5-2.
Figure 5-2 Basic training process by leveraging the user-centric NILM framework depicted in Figure 5-1 (flow: real-time power metric calculation (P_k, Q_k) → event detection (e_i) → feature vector extraction (f_v(e_i)) → if the training data set FV_L is empty, the user provides a label l_i^u from the list of possible appliances L; otherwise, f_v(e_i) is labeled through classification as l* and the user is asked whether the label is correct; the confirmed or user-corrected labeled feature vector is added to FV_L)
In this process, upon installation of the NILM system in a new setting, every activity (induced either by building occupants or by the automated operational schedule of the appliances) is represented as a change in the power time series (in our case, the fundamental frequency component of the real power) and is detected using the aforementioned GLRT-based event detection algorithm, resulting in event e_i. Upon detection, a feature vector f_v(e_i) associated with that event is extracted and classified using the library of the training data set, FV_L, which contains the labeled feature vectors used as training examples. If the library is not yet populated (at the beginning of the training process), the user is prompted to provide a label, l_i^u, for the detected event. The event's feature vector is then labeled with l_i^u and added to the library. The classifier algorithm is used for detecting the labels of the events that follow. In our multiclass classification problem, the outcome of the classification is one of the labels, l_i^*, in the set of possible labels, L. Therefore, once a label is detected through classification, user confirmation is required to ensure that only correctly labeled examples are added to the training data set library. Due to the presence of noise, some appliance state transition signatures might need several feature vector examples to account for the variation of the signatures.
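One iteration of this basic training loop can be sketched as follows; the classify, confirm, and prompt_label interfaces are hypothetical stand-ins for the classifier over FV_L and the user interface, not the dissertation's actual implementation.

```python
def basic_training_step(fv, library, classify, confirm, prompt_label):
    """One pass of the basic active-training loop: classify the new
    event's feature vector against the labeled library, ask the user to
    confirm the candidate label, and store the confirmed example."""
    if not library:
        label = prompt_label()              # empty library: ask the user
    else:
        candidate = classify(fv, library)
        label = candidate if confirm(candidate) else prompt_label()
    library.append((fv, label))             # grow the training data set
    return label
```

In a deployment, classify would be the classifier trained on FV_L, while confirm and prompt_label would be calls to the user interface.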
The application of the event detection algorithm facilitates the labeling of signatures for those appliance state transitions that are triggered automatically; in most cases, these state transitions cannot be conveniently induced by the user (for example, the turn-on event of the compressor of an air conditioning unit or of a refrigerator). However, in this approach, the user needs to receive (and consequently respond to) all of the NILM system's calls for interaction. The interaction continues until the library of the training data set contains examples of all the signatures associated with the state transitions. The completion of the training process has an ad hoc nature and depends on the appliances and their use in a specific setting; for example, upon installation of a NILM prototype in an apartment, some of the appliances might not be used for a period of time after training begins. Therefore, the training process is by nature a continuous process, which requires the capability of handling changes in the environment while taking the user-centric design requirements into account. Accordingly, we propose a user-centric smart interaction framework, which accounts for the ad hoc nature of the training process while reducing the need for user interaction.
5.3 Smart User-Centric Training
The smart training framework is defined as a framework that communicates with users for label provision while minimizing the number of calls to users through the use of data mining techniques; it is therefore called the smart interaction (SI) framework hereafter. We propose to benefit from the anonymous labeling approach (described in Section 2.3.1) prior to the user interaction stage and then communicate with users. As noted in Chapter 4, anonymous labeling relies on autonomous clustering of the appliances' signatures to create the possible feature space of the appliances' state transitions. In the SI framework, given that the clusters of all observed signatures have been generated, once the association between a detected event's signature and one of the clusters is recognized, the user is called on to interact and provide the label. The recognition of the association between individual signatures and clusters is carried out using the KNN classifier algorithm. The outcome of this classifier depends on the structure of the observed feature space: the label of one of the existing classes in the observed feature space is returned. As emphasized, the feature space is an observed one, and new signatures may be introduced during the labeling stage. Moreover, the presence of noise in the signatures and the existence of overlapping events' signatures could cause confusion in the classification outcome, which in turn results in a wrong association between a detected event and one of the clusters. Therefore, an inspection mechanism is required to avoid associating a user-provided label with an unrelated feature vector cluster. In this dissertation, we adopted anomaly detection techniques to ensure that the classification outcome is indeed associated with the clustered signatures prior to interacting with the user.
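A minimal distance-based inspection of this kind is sketched below; the centroid-and-spread rule is an illustrative anomaly check under simple assumptions, not the dissertation's exact mechanism.

```python
import numpy as np

def validated_association(fv, clusters, max_ratio=2.0):
    """Return the nearest cluster's label only if the feature vector lies
    within a multiple of that cluster's own average spread (city block
    distance to the centroid); otherwise return None, flagging a possible
    new or overlapping signature that should not trigger a user call."""
    best_label, best_d, best_spread = None, np.inf, 0.0
    for label, members in clusters.items():
        centroid = members.mean(axis=0)
        d = np.abs(fv - centroid).sum()
        if d < best_d:
            spread = np.abs(members - centroid).sum(axis=1).mean()
            best_label, best_d, best_spread = label, d, spread
    if best_d <= max_ratio * max(best_spread, 1e-9):
        return best_label
    return None
```

A None result would defer the user interaction rather than risk attaching the user's label to an unrelated cluster.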
5.3.1 Cluster Validation
As noted, forming the clusters of the sample signatures plays a central role in relaxing the user interaction requirements. The SI framework relies on the application of the clustering algorithm presented in Chapter 4. As shown in the evaluation of the clustering algorithm, autonomous clustering partitions the feature space into its natural partitions. Due to the presence of noise, variations in detecting events on the signal, and the consequent variation of the signatures across instances, this "natural partitioning" does not necessarily result in groups of clusters that each represent a unique group of appliance state transitions. In other words, the samples associated with one class of appliance state transition, for example the air conditioning compressor turn-on event, might not be clustered into one single cluster. The number of clusters depends on the variation of the signatures for a specific appliance state transition and on the initial structure of the feature space. To clarify the meaning of natural partitioning in the data, the clusters generated by the heuristic clustering algorithm of Chapter 4, using the data from the Apartment 1 test bed for phase A turn-on events, are illustrated in Figure 5-3. In this clustering process, all the signatures, including the ones with label 0, were taken into account. The title of each sub-plot represents the cluster label-class label pair associated with each cluster. The natural partition of the data could be interpreted in different ways, as follows.
Figure 5-3 The clusters generated by the autonomous clustering algorithm (presented in Chapter 4) for the turn-on events on
phase A of the Apartment 1 data set, using two weeks of data (the same data that was used for evaluating the clustering algorithms)
[Figure 5-3 plot residue: 37 sub-plots of signature waveforms, one per cluster, each titled with its cluster label-class label pair: 1-0, 2-18002, 3-11101, 4-300, 5-18002, 6-300, 7-16201, 8-11101, 9-0, 10-11102, 11-18002, 12-16201, 13-18002, 14-0, 15-11101, 16-18002, 17-16301, 18-14501, 19-0, 20-14401, 21-18001, 22-12906, 23-18001, 24-0, 25-0, 26-300, 27-300, 28-0, 29-300, 30-0, 31-0, 32-12902, 33-12902, 34-14201, 35-14301, 36-12900, 37-0]
The natural partition of the data, from the SI framework perspective, means grouping the feature
vectors into a number of clusters exactly equal to the number of appliance state transition
classes. However, the mathematical representation of the signatures in the form of feature
vectors could result in partitioning the feature vectors of a specific appliance state transition
class into multiple clusters. As an example, the clusters in Figure 5-3 show that the signatures
associated with the AC compressor turn-on events (i.e., 18002 events) have been split into
multiple clusters (2, 5, 11, 13, 16). The dissimilarity measure, which strongly depends on the
feature space characteristics, plays an important role in the definition of natural partitions.
However, as noted, from the SI framework point of view a reduced number of clusters means fewer
user interactions and consequently a facilitated training process. Moreover, information about
the frequency of events in a cluster could be used as supplemental information, in the form of a
rule-based technique, to further enhance the SI framework. For example, a refrigerator defrost
module creates multiple events that users cannot detect and label. On the other hand, these
events usually follow a periodic trend. This periodic nature could also be observed in other
appliances such as washing machines, AC systems, and refrigerators. Therefore, a clustering
approach that partitions the signature space into its actual clusters (compatible with the
physical environment) is required to enable calculating each cluster’s event frequency.
Accordingly, the application of a cluster merging approach for cluster validation was explored.
As mentioned in Chapter 4, defining the exact number of clusters is a challenging task, and thus
a heuristic was presented to automate the process of clustering into natural partitions. Due to
the complexity of determining the true number of clusters, many research studies have proposed
different split-and-merge approaches for cluster analysis [87-90]. In other words, the clustering
algorithms start with a large number of clusters, and then similar, compatible clusters are
merged until the actual number of clusters (compatible with physical domain observations) is
obtained. Since the process aims at obtaining the actual number of clusters compatible with the
data partitioning in the physical space, it is called cluster validation. Considering the outcome
of the hierarchical clustering as the split of the feature space, in this section the problem
is reduced to a cluster merging problem.
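The split-and-merge procedure described above can be sketched generically: starting from the clusters produced by the hierarchical split, pairs of clusters are tested with a compatibility predicate and merged until no compatible pair remains. The following is a minimal Python sketch, not the dissertation's implementation; the predicate `compatible` is a placeholder for any pairwise merging criterion, and the distance-based predicate in the demo is an assumption for illustration only:

```python
from typing import Callable, List
import numpy as np

def merge_clusters(clusters: List[np.ndarray],
                   compatible: Callable[[np.ndarray, np.ndarray], bool]) -> List[np.ndarray]:
    """Greedily merge clusters until no pair satisfies the compatibility predicate.

    clusters:   list of (n_i, D) arrays of feature vectors (signatures).
    compatible: pairwise compatibility test (placeholder for a merging criterion).
    """
    clusters = [np.asarray(c, dtype=float) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if compatible(clusters[i], clusters[j]):
                    # Merge j into i and restart the scan over the new cluster set.
                    clusters[i] = np.vstack([clusters[i], clusters[j]])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters

# Toy illustration with a purely distance-based predicate (an assumption for the demo):
close = lambda a, b: np.linalg.norm(a.mean(0) - b.mean(0)) < 1.0
parts = [np.array([[0.0, 0.0]]), np.array([[0.5, 0.0]]), np.array([[10.0, 0.0]])]
print(len(merge_clusters(parts, close)))  # the two nearby clusters merge, the far one stays
```

The restart after each merge keeps the logic simple; the merged cluster is re-tested against all remaining clusters, which matches the "merge until the actual number of clusters is obtained" description above.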
Different cluster merging algorithms have been proposed in the literature, mainly targeted at
applications in image analysis. Therefore, for the purpose of this dissertation, two of these
cluster merging algorithms and a proposed algorithm are evaluated for merging the appliances’
signature clusters. Krishnapuram and Freg [88] introduced a compatible cluster merging (CCM)
algorithm in which a number of conditions are checked for compatibility analysis. In this
approach, it is argued that each cluster approximately lies on a hyper-plane in ℝ^n, and the
eigenvector with the smallest eigenvalue is the normal vector of the hyper-plane, representing
its orientation in the space. Therefore, the compatibility criteria are defined as follows.
Given that the outcome of the clustering algorithm is a set of clusters, that any two clusters
are centered at 𝒛_i and 𝒛_j, and that the eigenvalues and eigenvectors of the covariance matrix
are {λ_i1, …, λ_in} and {υ_i1, …, υ_in} for cluster i and {λ_j1, …, λ_jn} and {υ_j1, …, υ_jn}
for cluster j, sorted in descending order of eigenvalue, three conditions are considered for
compatibility:
|υ_in · υ_jn| ≥ K_1        5-1

|((υ_in + υ_jn)/2) · (𝒛_i − 𝒛_j)/‖𝒛_i − 𝒛_j‖| ≤ K_2        5-2

‖𝒛_i − 𝒛_j‖ / (√λ_i1 + √λ_j1) ≤ K_3        5-3
The condition defined by equation 5-1 states that the hyper-planes of the clusters should be
parallel; equation 5-2 states that the line joining the cluster centers should be approximately
orthogonal to the normals of the two hyper-planes; and equation 5-3 states that the cluster
centers should be close enough. The distance between two clusters is defined in terms of the
largest standard deviations of the clusters, as the largest eigenvalue equals the variance of the
cluster in the direction of the corresponding eigenvector (i.e., the principal component). The
K_1 and K_2 coefficients were considered to be close to 1 and 0, respectively. The algorithm was
originally evaluated on ℝ² and ℝ³ data in image analysis, where it showed acceptable results
[88]. For the ℝ² case, the value of K_3 was evaluated to be between 2 and 4. However, as noted,
the feasibility of using these conditions and values has to be explored for the appliance
signatures’ feature vectors (which is presented in the evaluation section).
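As a concrete sketch, the three CCM conditions (equations 5-1 to 5-3) can be checked directly from the cluster covariances. This is a minimal illustration under the definitions above, not the dissertation's implementation; the threshold values K_1 = 0.9, K_2 = 0.1, and K_3 = 4 are assumptions in the spirit of the ranges just discussed:

```python
import numpy as np

def ccm_compatible(ci: np.ndarray, cj: np.ndarray,
                   k1: float = 0.9, k2: float = 0.1, k3: float = 4.0) -> bool:
    """Check the Krishnapuram-Freg compatibility conditions (eqs. 5-1 to 5-3)."""
    zi, zj = ci.mean(axis=0), cj.mean(axis=0)
    # eigh returns eigenvalues in ascending order: column 0 is the eigenvector of the
    # smallest eigenvalue (the hyper-plane normal); index -1 is the largest eigenvalue.
    wi, vi = np.linalg.eigh(np.cov(ci, rowvar=False))
    wj, vj = np.linalg.eigh(np.cov(cj, rowvar=False))
    ni, nj = vi[:, 0], vj[:, 0]
    if ni @ nj < 0:              # eigenvector sign is arbitrary; align the normals
        nj = -nj
    d = zi - zj
    cond1 = abs(ni @ nj) >= k1                                            # eq. 5-1
    cond2 = abs(((ni + nj) / 2) @ (d / np.linalg.norm(d))) <= k2          # eq. 5-2
    cond3 = np.linalg.norm(d) / (np.sqrt(wi[-1]) + np.sqrt(wj[-1])) <= k3  # eq. 5-3
    return bool(cond1 and cond2 and cond3)

# Two elongated, collinear 2D clusters are compatible; a laterally offset one is not.
a = np.array([[-3.0, 0.0], [-1.0, 0.1], [1.0, -0.1], [3.0, 0.0]])
b = a + np.array([10.0, 0.0])   # same orientation, shifted along the cluster axis
c = a + np.array([0.0, 50.0])   # shifted along the normal: violates eqs. 5-2 and 5-3
```

For the pair (a, b) all three conditions hold (parallel normals, center line in the hyper-plane, centers within a few principal standard deviations), while (a, c) fails the second and third conditions.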
As the second approach, we adopted a modified version of the approach that was proposed by
Xiong et al. [89] for unsupervised fuzzy cluster merging. In this approach, the measure of
similarity between clusters is derived by taking the intersection of the clusters into account.
Therefore, given that the feature space 𝐅𝐕 = {𝒙 ∈ ℝ^D} has been clustered into m clusters
centered at 𝒛_1, …, 𝒛_m ∈ ℝ^D, the similarity ratio (SR) metric between any two clusters is
obtained as follows:

SR_ij = (dp_i + dp_j) / dz_ij        5-4

in which dp is the measure of dispersion for each cluster and is obtained as follows:

dp_i = √(r_i²)        5-5
where r_i² is calculated by measuring the maximum squared distance between the members of
cluster 𝒄_i and its center, and dz_ij is the distance between the two clusters’ centers,
‖𝒛_i − 𝒛_j‖, which are obtained as the median of the feature vectors in each cluster. The
decision to merge clusters is made by comparing the SR metric against a threshold for the
data set:

If SR_ij ≤ τ_SR → the clusters are separate
If SR_ij ≥ τ_SR → clusters 𝒄_i and 𝒄_j are merged        5-6
The theoretical boundary value for the threshold τ, under a spherical representation of the
clusters, could be taken as 1, where dp_i + dp_j = dz_ij. This equality corresponds to two
clusters that are tangent at their boundaries. Therefore, an SR value larger than 1 means that
the clusters intersect, since the distance between their centers is less than the sum of their
dispersions, and an SR value lower than 1 means that the clusters are separate, since the sum of
the clusters’ radii is smaller than the distance between their centers. A quantitative analysis
of the threshold determination is presented in the evaluation section. In the following sections,
we refer to this approach as fuzzy cluster merging (FCM).
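A minimal sketch of the SR test in equations 5-4 to 5-6, using median-based centers and a max-distance dispersion as described above; the exact dispersion definition and the demo threshold τ_SR = 1.0 are assumptions for illustration, not the dissertation's tuned values:

```python
import numpy as np

def similarity_ratio(ci: np.ndarray, cj: np.ndarray) -> float:
    """SR_ij = (dp_i + dp_j) / dz_ij (eq. 5-4), with median cluster centers."""
    zi = np.median(ci, axis=0)
    zj = np.median(cj, axis=0)
    dp_i = np.max(np.linalg.norm(ci - zi, axis=1))  # dispersion (radius) of cluster i
    dp_j = np.max(np.linalg.norm(cj - zj, axis=1))  # dispersion (radius) of cluster j
    dz_ij = np.linalg.norm(zi - zj)                 # distance between the centers
    return float((dp_i + dp_j) / dz_ij)

def fcm_merge_decision(ci, cj, tau_sr: float = 1.0) -> bool:
    """True when the clusters should be merged (eq. 5-6: SR >= tau)."""
    return bool(similarity_ratio(ci, cj) >= tau_sr)

# Overlapping clusters are merged; well-separated clusters are left apart.
a = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]])
b = a + np.array([1.0, 0.0])     # shifted by less than the cluster radius: SR = 2
c = a + np.array([20.0, 0.0])    # far away: SR = 0.1
```

With τ_SR at its theoretical tangency value of 1, the pair (a, b) intersects (SR > 1) and is merged, while (a, c) stays separate (SR < 1), matching the interpretation given above.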
As noted, the application of the aforementioned methodologies has mainly been evaluated on
low-dimensional data sets in the literature. Consequently, determining generalized threshold
values for a higher dimensional feature space could be a challenging task. Therefore, we proposed
a third approach, in which the covariance of the data in each cluster is used as the information
for evaluating similarity between clusters. In this approach, we conjecture that if the
covariance matrix of the merged clusters is compatible with the covariance matrix of either of
those clusters, the clusters are compatible. This compatibility criterion enables us to measure
similarity by accounting for both the distance and the shape of the signatures. Figure 5-4 shows
three clusters (namely, 2, 7, and 16), selected from the cluster set presented in Figure 5-3.
a) Cluster No. 2 b) Cluster No. 7 c) Cluster No. 16
Figure 5-4 Selected clusters from the clusters of phase A turn-on events in apartment 1; a) cluster of
AC compressor turn-on events; b) cluster of kettle turn-on events; c) second cluster of AC
compressor turn-on events
As this figure illustrates, clusters 2 and 16 are compatible, while cluster 7 is not compatible
with either of the other two. Figure 5-5 shows the covariance of the single and merged clusters
in both matrix and vectorized formats. As illustrated, the dissimilarity between clusters is
reflected in the covariance matrix of the merged clusters in Figure 5-5.d. Therefore, covariance
based merging (CBM) is proposed as follows. Given that the feature space 𝐅𝐕 = {𝒙 ∈ ℝ^D} has been
clustered into m clusters, 𝑪𝑳_1, …, 𝑪𝑳_m ∈ ℝ^D, the normalized (by maximum) covariance based
similarity (CS̄) metric between any two clusters is obtained as follows:

CS̄_ij = ‖ Vec(Σ_ĈL_i)/[Vec(Σ_ĈL_i)]_max − Vec(Σ_ĈL_{i+j})/[Vec(Σ_ĈL_{i+j})]_max ‖        5-7

Σ = (1/(n − 1)) Σ_{i=1}^{n} (𝒙_i − 𝒙̄)(𝒙_i − 𝒙̄)^T        5-8

in which Σ is the covariance matrix, Vec is the vectorization operator, and ĈL_{i+j} is the
merged cluster, comprised of signature samples from clusters i and j.
a) Cluster 2 covariance matrix; b) Cluster 2-7 covariance matrix; c) Cluster 2-16 covariance matrix;
d) Cluster 2 vectorized covariance matrix; e) Cluster 2-7 vectorized covariance matrix; f) Cluster 2-16 vectorized covariance matrix
Figure 5-5 Covariance matrices of the signature samples from the selected clusters; a-c) covariance
of the sample signatures in matrix format; d-f) covariance of the sample signatures in
vectorized format
The vectorization operator formats the covariance matrix as a vector, comprised of the rows of
the covariance matrix in sequence. ĈL represents a randomly selected subset of the signature
samples in each cluster. An unbalanced number of signature samples in the two clusters could
result in misrepresentation of the CS metric. For example, if a cluster with 200 samples is
merged with a cluster with 3 samples, the data in the large cluster dominates the covariance
matrix, and the merged clusters could erroneously be considered compatible. Therefore, ĈL is
used to ensure that both clusters are equally weighted. The decision to merge clusters is made
by comparing the CS metric against a threshold for the data set:
If CS_ij ≥ τ_CS → the clusters are separate
If CS_ij ≤ τ_CS → clusters 𝒄_i and 𝒄_j are merged        5-9
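The CBM test of equations 5-7 to 5-9 can be sketched as follows. The balanced subsampling (ĈL) uses a seeded generator, and the row-wise vectorization with max-normalization follows the definitions above; the threshold value in the demo is an illustrative assumption, not a tuned value:

```python
import numpy as np

def cbm_similarity(ci: np.ndarray, cj: np.ndarray, seed: int = 0) -> float:
    """Normalized covariance-based similarity CS (eqs. 5-7 and 5-8)."""
    rng = np.random.default_rng(seed)
    m = min(len(ci), len(cj))
    # Balanced random subsets (CL-hat) so neither cluster dominates the covariance.
    ci_hat = ci[rng.choice(len(ci), size=m, replace=False)]
    cj_hat = cj[rng.choice(len(cj), size=m, replace=False)]

    def norm_vec_cov(x):
        sigma = np.cov(x, rowvar=False)   # eq. 5-8 (n-1 denominator)
        v = sigma.reshape(-1)             # row-wise vectorization
        return v / np.max(v)              # normalize by the maximum entry

    merged = np.vstack([ci_hat, cj_hat])  # CL-hat_(i+j)
    return float(np.linalg.norm(norm_vec_cov(ci_hat) - norm_vec_cov(merged)))

def cbm_merge_decision(ci, cj, tau_cs: float = 0.5) -> bool:
    """Merge when CS_ij <= tau (eq. 5-9); tau here is an illustrative value."""
    return bool(cbm_similarity(ci, cj) <= tau_cs)

# A cluster merged with a copy of itself has an identical (up to scale) covariance,
# so CS is ~0 and the pair is judged compatible; a differently shaped cluster is not.
a = np.array([[-3.0, 0.0], [-1.0, 0.5], [1.0, -0.5], [3.0, 0.0]])
```

Because the normalization divides by the maximum entry, two covariance matrices that differ only by a positive scale factor produce identical normalized vectors, which is what makes the self-merge case land at CS ≈ 0.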
5.3.2 Anomaly Detection Algorithms
The objective of the anomaly detection problem is to determine the core of the regular signatures
of a state transition and separate the irregular data, which are outliers to a specific cluster
of signatures. Classifiers or regressors could be used for outlier detection; however, these
supervised approaches rely on training data labeled as inliers and outliers, and their
performance depends on the definition of representative outliers [91]. Estimating the probability
density of the signature clusters is another approach; however, it could result in only a partial
probability density distribution of the target data. As Vapnik [92] stated, a description of the
data could be achieved by focusing on its boundaries rather than on the entire data. Accordingly,
we use the support vector data description (SVDD) approach, introduced by Tax and Duin [91],
which models the boundary around each cluster and defines the rules for accepting the classified
signatures as part of a cluster during the training process.
The SVDD approach describes a boundary around the signatures in each specific cluster without
the need for labeled data. Assuming that a cluster contains the feature vectors 𝒙_1, 𝒙_2, …, 𝒙_N,
representing signatures (clustered using the approach in Section 4.1), the normal data is defined
by a closed boundary around the clustered data. The boundary is a hyper-sphere in ℝ^D, where D is
the dimensionality of the feature vectors. The normal data is then defined as the feature vectors
that are close to the center 𝒄 ∈ ℝ^D of the hyper-sphere within a radius r ∈ ℝ. This proximity to
the center is defined as follows:

‖φ(𝒙) − 𝒄‖₂² ≤ r²        5-10
All the feature vectors inside the hyper-sphere belong to the target cluster, whereas the feature
vectors outside the sphere are marked as outliers. The boundary between normal and outlier
signatures depends on the definition of r: small r values result in a large number of outliers,
and large r values result in describing outliers as part of the core of the data. Accordingly,
the problem is defined as the following minimization problem:

min_{r,𝒄,𝝃}  (1/2) r² + C Σ_k ξ_k        5-11

subject to the following constraint:

‖φ(𝒙_k) − 𝒄‖₂² ≤ r² + ξ_k  ∀k        5-12
This is a regularized minimization problem, which balances the trade-off between reducing the
hyper-sphere radius and the penalty term applied through the slack variables ξ_k ≥ 0. Figure 5-6
illustrates this trade-off between the radius and penalty terms in a two-dimensional
representation; it also shows the slack variables and the center and radius of the data
description boundary. Parameter C is the cost coefficient in the penalty term (C Σ_k ξ_k) that
controls the trade-off and balances the volume of the boundary in determining the outlier
feature vectors.
Figure 5-6 A two-dimensional representation of anomaly detection through application of the
SVDD approach, including the core data boundary, its radius, and the slack variables for
representative data points
To incorporate the constraints into the objective function of equation 5-11, Lagrange multipliers
are used as follows:

L(r, 𝒄, {ξ_k}, {α_k}, {λ_k}) = (1/2) r² + C Σ_k ξ_k + Σ_k α_k (‖φ(𝒙_k) − 𝒄‖₂² − r² − ξ_k) − Σ_k λ_k ξ_k        5-13
for which α_k ≥ 0 and λ_k ≥ 0 are the Lagrange multipliers. By setting the partial derivatives of
L with respect to r², 𝒄, and ξ_k to zero and carrying out the required algebraic operations, the
dual form of the optimization problem is obtained as follows:

max_𝜶  Σ_k α_k k(𝒙_k, 𝒙_k) − (1/Σ_j α_j) Σ_{j,k} α_j α_k k(𝒙_j, 𝒙_k)
s.t.  0 ≤ α_k ≤ C,   Σ_k α_k = 1/2        5-14
The k(𝒙_j, 𝒙_k) term is a kernel function of the feature vectors 𝒙_j and 𝒙_k. The outcome of the
optimization is the set of α_k. The center of the data description boundary, 𝒄, is obtained as
follows:

𝒄 = 2 Σ_k α_k φ(𝒙_k)        5-15

Therefore, the center of the hyper-sphere for a cluster is a linear combination of the feature
vectors with α_k > 0. These are the feature vectors needed to define the data description; they
are the support vectors of the description. The outlier test is performed by calculating the
distance of any given feature vector, 𝒙_t, from the center of the data description. The distance
of the test feature vector from the center is calculated as follows:
d² = ‖φ(𝒙_t) − 𝒄‖₂² = k(𝒙_t, 𝒙_t) − 4 Σ_k α_k k(𝒙_t, 𝒙_k) + 4 Σ_{j,k} α_j α_k k(𝒙_j, 𝒙_k)        5-16

The test is performed by comparing the distance of 𝒙_t with r², the measure of the data
description radius. r² is defined as follows:

r² = max ‖φ(𝒙_i) − 𝒄‖₂²  for ∀𝒙_i ∈ 𝑺𝑽_{<C}        5-17

in which 𝒙_i are the feature vectors in the support vector set for which α_i < C; this set of
feature vectors is represented by 𝑺𝑽_{<C}. The mathematical representation of the test criterion
could be defined as the following sign function:

g(𝒙_t) = sgn(d² − r²)        5-18

g(𝒙_t) = { −1: 𝒙_t belongs to the core of the cluster
           0: 𝒙_t lies on the boundary and belongs to the core of the cluster
          +1: 𝒙_t is an outlier to the cluster        5-19

The optimization problem presented in equation 5-14 is a quadratic programming problem, which is
solved for α_k.
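As a numerical illustration of equations 5-15 through 5-19, assume the quadratic program of equation 5-14 has already been solved. For a toy case with a linear kernel k(x, y) = x·y and training points (−1, 0), (1, 0), (0, 0), the multipliers α = (1/4, 1/4, 0) satisfy the dual constraints (each α_k ≤ C and Σα_k = 1/2); this hand-computed α is an assumption for the sketch, not the output of a solver:

```python
import numpy as np

X = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, 0.0]])  # training signatures
alpha = np.array([0.25, 0.25, 0.0])                   # assumed dual solution
C = 10.0
k = lambda u, v: float(np.dot(u, v))                  # linear kernel

c = 2 * (alpha[:, None] * X).sum(axis=0)              # eq. 5-15: description center

def dist2(xt: np.ndarray) -> float:
    """Squared distance from the data-description center (eq. 5-16)."""
    n = len(X)
    return (k(xt, xt)
            - 4 * sum(alpha[i] * k(xt, X[i]) for i in range(n))
            + 4 * sum(alpha[i] * alpha[j] * k(X[i], X[j])
                      for i in range(n) for j in range(n)))

# r^2: max squared distance over support vectors with alpha_i < C (eq. 5-17).
sv = [i for i in range(len(X)) if 0 < alpha[i] < C]
r2 = max(dist2(X[i]) for i in sv)

def g(xt) -> int:
    """Outlier test (eqs. 5-18 and 5-19): -1/0 core, +1 outlier."""
    return int(np.sign(dist2(np.asarray(xt)) - r2))

print(r2)               # 1.0: the boundary points (-1, 0) and (1, 0) are support vectors
print(g([0.5, 0.0]))    # -1: inside the description
print(g([3.0, 0.0]))    # +1: outlier
```

Here the center lands at the origin, the two boundary points carry all the weight, and the kernel expansion of equation 5-16 reproduces plain squared Euclidean distances, so the test of equation 5-18 behaves exactly as the geometric picture in Figure 5-6 suggests.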
The performance of the SVDD algorithm strictly depends on the inlier training data (i.e., the
feature vectors in the cluster of interest). Therefore, variation of the signatures during the
user-interaction stage, due to the presence of noise, could result in those signatures being
detected as outliers even if they come from the same distribution. This is not desirable, since
it could increase the number of interactions required for training. Therefore, as a second
approach, we adopted multivariate outlier detection using the Mahalanobis distance. The
Mahalanobis distance measures the distance of each observed signature from the center of the
reference cluster while taking the shape of the signatures into account. This distance measure is
then tested against a threshold, τ_od, to separate outliers from inliers.
ψ_cl(𝒙_t) = √((𝒙_t − μ_cl) Σ⁻¹ (𝒙_t − μ_cl)^T)        5-20

where ψ_cl is the Mahalanobis distance based compatibility measure for the feature vector of
interest (𝒙_t) with respect to the reference cluster, cl. μ_cl is the arithmetic mean and Σ is
the covariance matrix of the signatures in the reference cluster. The Moore-Penrose pseudoinverse
is used for the covariance inversion in equation 5-20. The outliers are detected using the
following set of rules:

h(𝒙_t) = sgn(ψ_cl(𝒙_t) − τ_od)        5-21

h(𝒙_t) = { −1: 𝒙_t belongs to the reference cluster
            0: 𝒙_t belongs to the reference cluster
           +1: 𝒙_t is an outlier to the reference cluster        5-22
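A minimal sketch of the Mahalanobis test in equations 5-20 to 5-22, using `numpy.linalg.pinv` for the Moore-Penrose pseudoinverse as described above; the threshold τ_od = 1.5 in the demo is an illustrative assumption, not the dissertation's tuned value:

```python
import numpy as np

def mahalanobis_outlier_test(xt, cluster, tau_od: float = 1.5) -> int:
    """h(x_t) per eqs. 5-20 to 5-22: -1/0 inlier to the reference cluster, +1 outlier."""
    cluster = np.asarray(cluster, dtype=float)
    mu = cluster.mean(axis=0)                 # arithmetic mean (mu_cl)
    sigma = np.cov(cluster, rowvar=False)     # covariance of the reference cluster
    sigma_inv = np.linalg.pinv(sigma)         # Moore-Penrose pseudoinverse
    d = np.asarray(xt, dtype=float) - mu
    psi = float(np.sqrt(d @ sigma_inv @ d))   # eq. 5-20
    return int(np.sign(psi - tau_od))         # eqs. 5-21 and 5-22

# Reference cluster centered at the origin:
ref = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
print(mahalanobis_outlier_test([0.5, 0.0], ref))   # -1: inlier
print(mahalanobis_outlier_test([2.0, 0.0], ref))   # +1: outlier
```

The pseudoinverse keeps the test well defined even when the cluster covariance is singular, which is the situation that motivates its use in equation 5-20.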
5.3.3 SI Framework Description
In the SI framework, the basic training framework is complemented by adding the anonymous
labeling stage and the anomaly detection component. Figure 5-7 illustrates the components of the
framework. The anonymous labeling component uses the event detection, feature extraction, and
clustering algorithms to generate clusters of observed signatures in a new setting. Therefore,
the SI framework calls for a period of buffering feature vectors prior to clustering. The
clusters are labeled with cardinal numeric labels. Once the clusters are obtained, the framework
uses a real-time decision-making process for associating the signatures with the appliances in
the physical domain. Similar to the basic training, the decision-making module uses the event
detection, feature extraction, and classification algorithms in real time. However, in the SI
framework, the outcome of the classification is the numeric label of one of the clusters.
Therefore, the anomaly detection test is used for decision making.
[Figure 5-7 flowchart residue; recoverable components: Anonymous Labeling module (real-time power metric calculation P_k, Q_k; detecting event e_i; extracting feature vector f_v(e_i); buffering feature vectors FV; feature space clustering {c_1, c_2, …, c_n}); Real-Time Decision-making Module (detecting event e_i; extracting feature vector f_v(e_i); classifying f_v(e_i) to c_k*; anomaly testing of f_v(e_i) against c_k*); User Interaction Interface (if f_v(e_i) belongs to c_k* and the cluster is unknown, label the cluster with the user-provided label l_i^u; otherwise label f_v(e_i) with the user-provided label l_i^u)]
Figure 5-7 The smart interaction framework, including the anonymous labeling module, the
real-time decision-making module, and the user interaction interface
If the detected feature vector belongs to the cluster and the cluster is unknown, the user is
asked to provide a label. However, if the cluster is a known cluster (i.e., the cluster has been
labeled in a previous interaction), there will be no interaction. On the other hand, if the
feature vector is detected as an outlier to the reference cluster, the feature vector is either a
signature that is affected by noise (due to the presence of overlap or random noise) or a
signature from a different distribution (e.g., the signature represents an appliance state
transition that is new to the NILM system). In this case, as an optional component, the user
could be asked to label the individual feature vector. This is a continuous process that always
looks for new signatures in the feature space domain. In other words, any change in the
characteristics of the physical domain (for example, the addition of a new appliance) will
trigger the user interaction interface.
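The real-time decision logic described above can be sketched as follows. The classifier and anomaly test are passed in as callables (stubs in the demo), and the function names and return conventions are assumptions for illustration, not the dissertation's implementation:

```python
from typing import Callable, Dict, Optional

def decide(fv,
           classify: Callable[[object], int],
           is_outlier: Callable[[object, int], bool],
           labels: Dict[int, str],
           ask_user: Callable[[], str],
           label_outliers: bool = False) -> Optional[str]:
    """Route one detected event's feature vector through the SI decision module.

    classify:   returns the numeric label c_k* of the closest cluster (e.g., KNN).
    is_outlier: anomaly test of fv against cluster c_k* (e.g., SVDD or Mahalanobis).
    labels:     numeric cluster label -> user-provided appliance label.
    ask_user:   triggers the user interaction interface.
    """
    ck = classify(fv)
    if is_outlier(fv, ck):
        # Noisy/overlapping signature, or a new appliance state transition.
        return ask_user() if label_outliers else None
    if ck not in labels:
        labels[ck] = ask_user()     # unknown cluster: one interaction labels it
    return labels[ck]               # known cluster: no interaction needed

# Stub illustration: every event maps to cluster 3 and passes the anomaly test.
labels = {}
out = decide("fv1", classify=lambda fv: 3, is_outlier=lambda fv, c: False,
             labels=labels, ask_user=lambda: "refrigerator")
again = decide("fv2", classify=lambda fv: 3, is_outlier=lambda fv, c: False,
               labels=labels, ask_user=lambda: "should not be asked")
print(out, again)   # the second event reuses the label without any interaction
```

The point of the sketch is the interaction economics described above: the user is consulted once per unknown cluster, while outliers are either held back or, optionally, labeled individually.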
5.4 Performance Evaluation of the SI Framework Components
Prior to evaluating the SI framework performance against the basic training approach, the
performance of the SI framework components is evaluated and the better set of algorithm
parameters is determined. The performance of the classification and heuristic hierarchical
clustering algorithms was evaluated in Chapters 3 and 4, respectively. Therefore, the evaluation
of the cluster merging and anomaly detection components is carried out first in the following
sections, followed by the SI framework evaluation. Based on the results obtained in Chapter 4,
the basic feature vectors (i.e., segments of the fundamental frequency components of the real and
reactive power time series) for power time series sampled at 60 Hz are used for the evaluations
in this chapter.
5.4.1 Cluster Validation Evaluation and Analysis
As described in the previous section, the cluster validation approach relies on cluster merging
algorithms, which in turn depend on tuning the threshold for cluster merging. Therefore, in this
section, the feasibility of using cluster merging algorithms for cluster validation is evaluated
through analysis of the data sets for both turn-on and turn-off events in apartment 1. Evaluating
the algorithms on the data from both phases and for both turn-on and turn-off events introduces a
high degree of variability in the feature space structure.
The evaluation process consists of clustering the data set (using the algorithm presented in
Chapter 4) and evaluating the aforementioned cluster merging algorithms on the outcome of the
clustering process. As described in Chapter 4, the accuracy and cluster quality (CQI) metrics are
used in evaluating the performance of the clustering algorithm. The CQI metric quantifies the
quality of the clustering process in terms of the number of feature vectors eliminated during
clustering and the number of clusters per class. Once the clustering process is complete, the
overall f-measure and CQI index are obtained for the clustered data, and these are used as the
benchmark metrics in the evaluation. A desirable performance for the cluster merging algorithms
is one in which the clusters associated with the same class of data (e.g., refrigerator turn-on
events) are merged together. In this way, the process is similar to the association matrix
mapping (to a conventional confusion matrix) process described in Section 4.2. Therefore, a
desirable performance does not change the accuracy metrics (i.e., precision, recall, and
f-measure), while it increases the density of clusters and consequently the CQI index.
The feasibility of the CCM and FCM approaches is first evaluated on the signatures of the turn-on
events for phase A in the apartment 1 data set. Note that, in this stage, the events labeled as
0, which are either false positives or unimportant events (i.e., events with a change in power
draw less than a threshold), are also considered, to account for realistic scenarios. The rest of
the data is used for comprehensive evaluation and tuning of the generalized parameters. Visual
observation of Figure 5-3 shows that a number of clusters originate from the same class of
signatures and could be merged to improve the clustering quality. Table 5-1 shows these clusters
and their corresponding signature classes. As this table shows, 15 clusters have been generated
for these classes; if they were all merged, the number of clusters would be reduced by 10, to a
total of 27 clusters, which is a scenario more compatible with the physical domain events.
Table 5-1 The clusters that could be merged, determined by visual observation (the clusters were
autonomously generated using two weeks of the turn-on events on phase A in apartment 1)
Signature Class
ID - Name
Clusters
0 – Refrigerator Compressor 1, 9, 14
18002 – AC Compressor 2, 5, 11, 13, 16
11101 – Refrigerator Compressor 3, 8, 15
16201 – Kettle 7, 12
12902 - TV 32, 33
Using equations 5-1 to 5-3, the CCM compatibility conditions for all the clusters in Figure 5-3
are calculated. The variations of the values for these three conditions are taken into account to
assess the feasibility of the approach, specifically for the clusters that were presented in
Table 5-1. Table 5-2 illustrates the variation of the first condition (eq. 5-1) for the data set,
with the ranges of variation highlighted in different colors. As noted, this value should be
close to 1 in order to show that the clusters’ hyper-planes are parallel. Table 5-2 shows that
the first condition criterion could feasibly be used for class 18002; however, for the rest of
the classes, low values (even close to 0.1 to 0.2) were obtained. This phenomenon could be
associated with the high dimensionality of the data and the presence of noise. Since the normal
vector of the hyper-plane is associated with the smallest eigenvalue in each cluster, the noise
of the signal could play a major role in this observation. As mentioned above, the CCM approach
(in its original publication) was applied to 2D and 3D image data and showed promising results.
Therefore, we also explored the effect of dimensionality reduction on the evaluation of the
compatibility conditions. Principal component analysis (PCA) was used for dimensionality
reduction, carried out on the entire data set covering all the samples in all the clusters; the
first condition was then evaluated for a selected number of principal components. Table 5-3
presents the values of the first compatibility condition for the targeted clusters (as presented
in Table 5-1) for different numbers of principal dimensions. As observed, by reducing the
dimensionality of the data to the first few dimensions (up to the first 5 principal components),
the first condition values get closer to one. However, reducing the dimensionality of the data
also increases the apparent compatibility between other clusters. Although the analysis on this
limited data set cannot be used for conclusive interpretation, it is used to assess the
feasibility of the cluster merging approaches for appliance signatures and to select the more
feasible approach before further comprehensive analyses. Considering that dimensionality
reduction could lead to feasible results for the first compatibility condition, the second
condition is evaluated for a selection of the principal components. The original data set, as
well as principal component dimensions of 4, 5, and 10, are used for the second and third
condition evaluations.
Table 5-2 The CCM approach first condition (eq. 5-1) evaluation for clusters of the turn-on events on phase A in apartment 1
(value ranges, originally color-coded in the table: 0.7-1, 0.4-0.7, 0.2-0.4, 0.0-0.2)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
1 0.0 0.0 0.3 0.0 0.0 0.0 0.8 0.5 0.2 0.7 0.0 0.1 0.0 0.6 0.3 0.0 0.0 0.3 0.0 0.0 0.7 0.1 0.7 0.9 0.8 0.0 0.0 0.2 0.8 0.0 0.0 0.3 0.9 0.0 0.4 0.7 0.0
2 0.0 0.0 0.1 0.6 0.9 0.6 0.0 0.2 0.6 0.0 0.9 0.3 0.7 0.3 0.4 0.8 1.0 0.0 0.8 0.2 0.1 0.9 0.6 0.1 0.1 0.7 0.6 0.0 0.1 0.9 0.5 0.8 0.0 0.1 0.1 0.6 0.0
3 0.0 0.0 0.0 0.1 0.1 0.0 0.3 0.1 0.1 0.3 0.0 0.1 0.0 0.1 0.2 0.1 0.1 0.2 0.1 0.2 0.3 0.0 0.3 0.4 0.4 0.0 0.1 0.0 0.3 0.1 0.1 0.1 0.3 0.2 0.0 0.3 0.0
4 0.0 0.0 0.0 0.0 0.4 0.4 0.0 0.2 0.4 0.0 0.5 0.1 0.3 0.2 0.2 0.4 0.6 0.0 0.4 0.1 0.0 0.5 0.4 0.1 0.0 0.5 0.3 0.0 0.0 0.5 0.3 0.4 0.0 0.1 0.1 0.4 0.0
5 0.0 0.0 0.0 0.0 0.0 0.6 0.0 0.2 0.5 0.0 0.8 0.2 0.8 0.3 0.3 0.8 0.8 0.1 0.6 0.1 0.1 0.7 0.5 0.0 0.0 0.6 0.3 0.0 0.1 0.7 0.3 0.6 0.1 0.0 0.0 0.5 0.0
6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.2 0.0 0.5 0.2 0.5 0.2 0.2 0.5 0.6 0.0 0.4 0.1 0.1 0.5 0.3 0.0 0.0 0.6 0.3 0.1 0.0 0.5 0.3 0.5 0.0 0.1 0.0 0.3 0.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0.1 0.7 0.0 0.0 0.0 0.6 0.2 0.0 0.0 0.3 0.0 0.1 0.7 0.2 0.5 0.7 0.7 0.0 0.0 0.1 0.7 0.0 0.0 0.2 0.7 0.1 0.5 0.5 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.3 0.2 0.1 0.2 0.3 0.3 0.2 0.2 0.1 0.1 0.0 0.4 0.1 0.4 0.4 0.4 0.1 0.2 0.1 0.4 0.1 0.1 0.4 0.5 0.1 0.2 0.4 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.5 0.2 0.3 0.5 0.3 0.5 0.7 0.1 0.5 0.3 0.1 0.5 0.5 0.3 0.2 0.4 0.4 0.0 0.2 0.6 0.3 0.6 0.2 0.6 0.4 0.5 0.0
10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.7 0.1 0.0 0.0 0.3 0.0 0.1 0.6 0.1 0.5 0.7 0.6 0.0 0.0 0.0 0.6 0.0 0.0 0.1 0.7 0.3 0.6 0.5 0.0
11 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0.7 0.3 0.3 0.9 0.9 0.0 0.7 0.2 0.1 0.8 0.5 0.0 0.0 0.6 0.5 0.0 0.1 0.8 0.4 0.7 0.0 0.0 0.0 0.5 0.0
12 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.1 0.2 0.3 0.3 0.0 0.2 0.2 0.1 0.3 0.3 0.1 0.1 0.2 0.1 0.1 0.1 0.3 0.2 0.3 0.1 0.2 0.1 0.2 0.0
13 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.3 0.6 0.7 0.0 0.6 0.0 0.1 0.6 0.4 0.0 0.0 0.6 0.2 0.0 0.0 0.6 0.3 0.5 0.1 0.1 0.1 0.4 0.0
14 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.4 0.4 0.2 0.3 0.5 0.4 0.2 0.5 0.5 0.2 0.2 0.1 0.5 0.3 0.1 0.1 0.6 0.6 0.8 0.2 0.0
15 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0.4 0.0 0.3 0.1 0.2 0.3 0.4 0.3 0.3 0.3 0.4 0.0 0.3 0.3 0.2 0.5 0.3 0.2 0.0 0.5 0.0
16 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.8 0.0 0.7 0.1 0.1 0.7 0.5 0.1 0.0 0.6 0.5 0.0 0.1 0.7 0.4 0.7 0.0 0.1 0.0 0.5 0.0
17 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.8 0.2 0.1 0.9 0.6 0.1 0.1 0.8 0.6 0.0 0.1 0.9 0.5 0.8 0.0 0.2 0.1 0.6 0.0
18 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0.0 0.3 0.3 0.3 0.1 0.0 0.0 0.3 0.1 0.1 0.1 0.3 0.4 0.5 0.3 0.0
19 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.1 0.7 0.5 0.0 0.1 0.6 0.6 0.1 0.1 0.8 0.4 0.7 0.1 0.0 0.0 0.5 0.0
20 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0.0 0.2 0.1 0.2 0.0 0.5 0.4 0.1 0.0
21 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.4 0.7 0.6 0.1 0.1 0.1 0.6 0.1 0.1 0.1 0.8 0.0 0.4 0.5 0.0
22 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0.1 0.1 0.7 0.6 0.1 0.1 0.8 0.5 0.7 0.2 0.1 0.1 0.4 0.0
23 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.7 0.6 0.4 0.4 0.2 0.6 0.6 0.3 0.6 0.6 0.1 0.2 0.8 0.0
24 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.8 0.0 0.0 0.3 0.8 0.0 0.0 0.3 0.9 0.1 0.4 0.7 0.0
25 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.2 0.7 0.1 0.1 0.3 0.8 0.0 0.4 0.6 0.0
26 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.6 0.1 0.0 0.7 0.5 0.6 0.1 0.0 0.1 0.5 0.0
27 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.6 0.4 0.5 0.1 0.1 0.1 0.4 0.0
28 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.1 0.1 0.0 0.1 0.1 0.0
29 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.3 0.8 0.0 0.4 0.6 0.0
30 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0.8 0.1 0.0 0.0 0.6 0.0
31 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0.1 0.0 0.0 0.3 0.0
32 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.1 0.1 0.7 0.0
33 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0.6 0.0
34 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.8 0.1 0.0
35 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.3 0.0
36 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
37 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Table 5-3 Evaluation of the CCM first condition for the data set with reduced dimensionality
using PCA for dimensions up to 30, as well as original dimensions
Cluster Labels
Cluster Labels
9 14 5 11 13 16 8 15 12 33 9 14 5 11 13 16 8 15 12 33
Dim PC dimension = 3 PC dimension = 10
1 0.8 0.9 0.3 0.3 0.0 0.4 0.7 0.3 0.9 0.2 0.1 0.1 0.4 0.7 0.1 0.1 0.6 0.4 0.2 0.2
2 0.1 0.2 1.0 0.8 0.5 0.8 0.9 0.7 0.2 0.9 0.0 0.1 0.5 0.2 0.1 0.8 0.3 0.0 0.4 0.6
3 0.0 0.0 0.9 1.0 0.1 1.0 1.0 0.7 0.0 0.6 0.0 0.0 0.4 0.3 0.8 0.0 0.0 0.9 0.6 0.1
7 0.0 0.0 1.0 0.9 0.3 0.9 1.0 0.7 0.0 0.7 0.0 0.1 0.2 0.0 0.1 0.3 0.6 0.0 0.2 0.2
32 0.7 0.7 0.8 0.8 0.0 0.8 0.5 1.0 0.7 0.4 0.1 0.1 0.7 0.4 0.0 0.4 0.3 0.1 0.4 0.4
Dim PC dimension = 4 PC dimension = 15
1 0.9 0.6 0.3 0.4 0.3 0.2 0.1 0.8 0.7 0.0 0.2 0.0 0.1 0.8 0.0 0.0 0.9 0.0 0.5 0.4
2 0.1 0.4 0.8 0.9 0.8 0.8 0.8 0.6 0.3 0.7 0.7 0.2 0.6 0.1 0.1 0.6 0.0 0.1 0.4 0.4
3 0.0 0.1 0.8 0.9 0.8 1.0 1.0 0.6 0.3 0.7 0.0 0.0 0.4 0.2 0.2 0.1 0.2 0.1 0.3 0.3
7 0.0 0.0 0.9 0.9 0.6 0.9 1.0 0.6 0.3 0.8 0.3 0.2 0.2 0.5 0.1 0.3 0.4 0.1 0.1 0.1
32 0.9 0.6 0.3 0.5 0.5 0.4 0.0 0.9 0.9 0.1 0.3 0.1 0.3 0.1 0.4 0.1 0.1 0.3 0.4 0.5
Dim PC dimension = 5 PC dimension = 20
1 0.7 0.6 0.2 0.1 0.6 0.3 0.4 0.9 0.7 0.1 0.2 0.4 0.2 0.2 0.4 0.0 0.1 0.3 0.3 0.1
2 0.1 0.7 0.9 0.8 0.9 0.8 0.4 0.7 0.6 0.4 0.5 0.3 0.1 0.2 0.4 0.5 0.5 0.1 0.1 0.1
3 0.0 0.2 0.4 0.0 0.7 0.2 0.9 0.7 0.3 0.5 0.0 0.0 0.0 0.0 0.1 0.0 0.0 1.0 0.0 0.2
7 0.1 0.1 0.6 0.8 0.2 0.1 0.2 0.3 0.1 0.8 0.1 0.0 0.6 0.5 0.0 0.7 0.1 0.1 0.5 0.3
32 0.9 0.5 0.2 0.1 0.5 0.3 0.1 0.8 0.9 0.3 0.0 0.2 0.2 0.2 0.2 0.1 0.0 0.3 0.1 0.2
Dim PC dimension = 6 PC dimension = 25
1 0.5 0.4 0.1 0.1 0.1 0.7 0.6 0.5 0.2 0.1 0.1 0.1 0.1 0.0 0.1 0.2 0.1 0.3 0.1 0.0
2 0.1 0.3 0.9 0.6 0.7 0.1 0.1 0.1 0.6 0.0 0.1 0.0 0.7 1.0 0.1 0.0 0.0 0.1 0.0 0.7
3 0.1 0.1 0.3 0.4 0.2 0.2 0.3 0.9 0.3 0.6 0.0 0.1 0.1 0.0 0.2 0.1 0.0 0.5 0.2 0.2
7 0.1 0.1 0.2 0.4 0.4 0.2 0.0 0.5 0.3 0.8 0.1 0.4 0.4 0.0 0.3 0.3 0.4 0.0 0.5 0.0
32 0.2 0.2 0.1 0.2 0.6 0.2 0.0 0.7 0.1 0.8 0.1 0.0 0.1 0.1 0.2 0.0 0.1 0.6 0.0 0.0
Dim PC dimension = 7 PC dimension = 30
1 0.7 0.6 0.5 0.3 0.2 0.9 0.4 0.2 0.5 0.3 0.1 0.1 0.1 0.1 0.2 0.1 0.0 0.3 0.2 0.2
2 0.2 0.1 0.2 0.2 0.4 0.2 0.5 0.3 0.0 0.2 0.0 0.2 0.3 0.7 0.0 0.1 0.6 0.1 0.4 0.1
3 0.0 0.0 0.8 0.7 1.0 0.3 0.3 1.0 0.7 0.1 0.1 0.1 0.0 0.0 0.1 0.1 0.0 0.3 0.1 0.1
7 0.1 0.2 0.6 0.6 0.8 0.0 0.3 0.8 0.2 0.3 0.4 0.2 0.1 0.4 0.1 0.3 0.3 0.1 0.4 0.1
32 0.4 0.2 0.5 0.3 0.1 0.6 0.3 0.1 0.6 0.6 0.0 0.2 0.3 0.6 0.3 0.3 0.4 0.0 0.4 0.0
Dim PC dimension = 8 Original dimensions
1 0.4 0.6 0.1 0.7 0.3 0.0 0.3 0.1 0.2 0.1 0.2 0.5 0.0 0.0 0.0 0.0 0.4 0.3 0.1 0.9
2 0.1 0.1 0.0 0.7 0.9 0.6 0.7 0.7 0.1 0.3 0.6 0.3 0.8 0.8 0.6 0.8 0.2 0.4 0.3 0.0
3 0.0 0.0 0.2 0.5 1.0 0.3 0.7 0.9 0.3 0.1 0.1 0.1 0.1 0.0 0.0 0.1 0.1 0.2 0.1 0.3
7 0.0 0.0 0.3 0.3 0.9 0.2 0.6 0.9 0.5 0.0 0.1 0.6 0.0 0.0 0.0 0.0 0.4 0.2 0.0 0.7
32 0.5 0.5 0.1 0.8 0.3 0.2 0.5 0.3 0.5 0.1 0.5 0.1 0.6 0.6 0.5 0.6 0.3 0.4 0.2 0.2
0-0.2 0.2-0.4 0.4-0.7 >0.7
Figure 5-8 shows the variation of the CCM's second condition over the entire set of clusters. The
threshold for this parameter should be close to zero. Evaluating the second condition with the
original dimensionality of the data indicates compatibility for almost the entire data set, which is
not desirable. However, as the dimensionality is reduced, the contrast between the second-condition
values increases. A closer look at these values for the targeted clusters is shown in Table 5-4. As
this table shows, even for the targeted clusters, the second-condition values vary by an order of
magnitude, which makes threshold selection a challenging task. Kaymak and Babuska [87] also showed
(on a two-dimensional sample data set) that the second condition does not hold for all compatible
clusters and could be ignored.
Figure 5-8 Variation of the CCM approach's second condition for different dimensionalities of the
data set (panels: 4 principal dimensions, 5 principal dimensions, 10 principal dimensions, and
original dimensions)
Finally, the last condition is the similarity between clusters, which is the distance between the
cluster centers normalized by the sum of the square roots of the largest eigenvalues of the
covariance matrix of each cluster. The largest eigenvalue represents the variance of the data along
the first principal component direction of the cluster. Including this measure of the data's
dispersion in the similarity metric amplifies the differences between clusters. If two clusters are
very close and dispersed, there is a higher probability that they come from the same distribution;
at the other end of the continuum, distant and dense clusters result in large values of the
similarity metric.
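Based on this description, the third-condition similarity metric can be written as follows. This is our reconstruction of the form referenced as equation 5-3, with v_i denoting the cluster centers and λ_max^(i) the largest eigenvalue of cluster i's covariance matrix (the notation is ours):

```latex
S_{ij} = \frac{\lVert \boldsymbol{v}_i - \boldsymbol{v}_j \rVert}
              {\sqrt{\lambda_{\max}^{(i)}} + \sqrt{\lambda_{\max}^{(j)}}}
```

With this form, close and dispersed cluster pairs yield small values of S_ij, while distant, dense pairs yield large values, consistent with the behavior described above and with the K3 = 3–5 merging thresholds used later.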
Table 5-4 Evaluation of the CCM second condition for selected clusters of turn-on events on
phase A of Apartment 1 data set
Cluster Labels
Cluster Labels
9 14 5 11 13 16 8 15 12 33 9 14 5 11 13 16 8 15 12 33
Dim | PC dimension = 4 | PC dimension = 5
1 | 0.04 0.06 0.58 0.11 0.13 0.13 0.57 0.42 0.18 0.01 | 0.01 0.09 0.56 0.44 0.07 0.56 0.54 0.06 0.31 0.02
2 | 0.14 0.37 0.14 0.03 0.08 0.15 0.09 0.29 0.45 0.04 | 0.59 0.08 0.05 0.11 0.03 0.13 0.06 0.10 0.30 0.12
3 | 0.52 0.34 0.14 0.25 0.35 0.24 0.02 0.40 0.11 0.32 | 0.48 0.38 0.19 0.12 0.33 0.22 0.11 0.20 0.07 0.01
7 | 0.52 0.57 0.10 0.11 0.11 0.15 0.08 0.02 0.50 0.18 | 0.10 0.12 0.04 0.01 0.17 0.03 0.11 0.02 0.15 0.09
32 | 0.04 0.00 0.24 0.51 0.40 0.44 0.13 0.02 0.18 0.36 | 0.10 0.10 0.43 0.27 0.36 0.43 0.15 0.06 0.12 0.54
Dim | PC dimension = 10 | Original Dimensions
1 | 0.07 0.05 0.12 0.06 0.31 0.35 0.04 0.02 0.10 0.10 | 0.01 0.05 0.01 0.01 0.00 0.01 0.07 0.01 0.11 0.01
2 | 0.21 0.31 0.11 0.07 0.11 0.18 0.18 0.02 0.17 0.06 | 0.00 0.02 0.10 0.09 0.02 0.15 0.15 0.01 0.08 0.02
3 | 0.12 0.09 0.05 0.14 0.04 0.15 0.09 0.03 0.03 0.01 | 0.01 0.01 0.03 0.00 0.02 0.03 0.01 0.01 0.14 0.05
7 | 0.08 0.09 0.10 0.21 0.15 0.18 0.16 0.14 0.32 0.02 | 0.01 0.01 0.00 0.01 0.04 0.04 0.06 0.02 0.06 0.01
32 | 0.00 0.02 0.00 0.07 0.04 0.15 0.18 0.04 0.01 0.30 | 0.00 0.02 0.00 0.02 0.01 0.03 0.18 0.00 0.05 0.00
(Shading legend: 0–0.1, 0.1–0.3, >0.3)
Figure 5-9 compares the variation of the third condition (i.e., the similarity metric between
clusters represented by equation 5-3) for different dimensionalities of the data, similar to the
cases that were evaluated for the second condition. Again, using lower dimensionality increases the
contrast between different clusters, which could help in determining the merging threshold.
However, a decrease in dimensionality could also affect the similarity measure between compatible
clusters, as presented in Table 5-5, which takes a closer look at these values for the targeted
clusters. For example, for clusters 1 and 9, which are compatible, the decrease in dimensionality
results in an increased normalized distance between the cluster centers. This distance value in the
low-dimensional feature space is comparable with the distance between non-compatible clusters; this
change in the distance measure complicates the selection of a generalized compatibility threshold
for detecting truly compatible clusters and may result in merging incompatible clusters.
Considering the variations of the values for these three conditions, cluster merging using the CCM
approach was carried out on the clusters of apartment 1 phase A turn-on events using K1 = 0.5–1
(with 0.1 increments) and K3 = 3–5 (with 0.5 increments), while ignoring the second condition.
Figure 5-10 and Figure 5-11 show the effect of the cluster merging on the cluster quality indices
(i.e., accuracy and CQI metrics) for different values of K1 and K3, for the original dimensions and
the first five principal dimensions, respectively.
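The merging decision implied by this setup can be sketched as follows. This is a minimal illustration with hypothetical condition values; the function name and the exact comparison directions are our reading of the text (first condition: larger means more compatible, threshold K1; third condition: smaller means more compatible, threshold K3; second condition ignored, following Kaymak and Babuska):

```python
def ccm_merge(cond1, cond3, k1, k3):
    """Decide whether two clusters are compatible under the CCM approach.

    cond1: first compatibility condition, in [0, 1]; larger values
           indicate more compatible clusters.
    cond3: third condition (normalized distance between cluster centers);
           smaller values indicate more compatible clusters.
    The second condition is ignored, as discussed in the text.
    """
    return cond1 >= k1 and cond3 <= k3

# hypothetical condition values for two cluster pairs
print(ccm_merge(0.8, 2.0, k1=0.5, k3=3.0))  # compatible pair
print(ccm_merge(0.3, 6.5, k1=0.5, k3=3.0))  # incompatible pair
```

The thresholds K1 = 0.5–1 and K3 = 3–5 swept in the analysis above correspond to the `k1` and `k3` arguments here.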
Figure 5-9 Variation of the CCM approach's third condition for different dimensionalities of the
data set (panels: 4 principal dimensions, 5 principal dimensions, 10 principal dimensions, and
original dimensions)
Table 5-5 Evaluation of the CCM third condition for selected clusters of turn-on events on phase
A of Apartment 1 data set
Cluster Labels
Cluster Labels
9 14 5 11 13 16 8 15 12 33 9 14 5 11 13 16 8 15 12 33
Dim | PC dimension = 4 | PC dimension = 5
1 | 8.6 1.7 18.6 27.5 21.4 18.4 19.3 13.7 28.1 53.6 | 8.01 1.79 17.48 24.63 19.41 17.21 18.02 11.51 24.72 44.59
2 | 26.6 15.5 1.3 1.4 2.6 0.6 23.8 14.5 27.3 33.7 | 18.71 9.49 1.23 1.62 2.59 0.67 23.39 12.51 26.03 32.75
3 | 9.5 7.6 11.2 14.4 12.2 11.3 3.1 2.2 9.2 19.8 | 7.75 5.18 11.21 14.28 11.95 11.20 3.12 2.02 9.00 19.84
7 | 25.9 16.1 23.1 34.1 27.1 22.8 15.6 12.0 3.9 70.5 | 17.11 9.27 22.99 33.44 26.01 22.62 15.57 10.27 3.98 70.11
32 | 50.9 21.6 20.2 35.0 23.3 20.3 33.4 25.1 52.5 3.7 | 28.58 11.45 19.79 33.46 21.76 19.83 32.59 20.43 48.46 3.76
Dim | PC dimension = 10 | Original Dimensions
1 | 4.3 1.6 11.7 17.3 10.8 11.3 12.9 9.4 14.6 31.4 | 3.7 1.6 9.1 12.3 8.6 8.9 8.7 7.6 8.9 19.6
2 | 9.2 6.6 0.8 1.9 1.7 0.8 14.3 8.9 13.6 17.8 | 8.7 6.4 0.8 1.9 1.7 0.8 11.5 8.2 10.0 15.4
3 | 4.7 4.0 8.0 10.7 7.6 7.9 3.0 2.2 6.2 15.2 | 4.6 4.0 7.3 9.6 7.0 7.2 2.7 2.1 4.9 14.1
7 | 8.8 7.0 14.9 22.6 13.9 14.4 11.2 8.6 3.6 46.4 | 8.2 6.7 12.8 18.0 12.1 12.4 8.4 7.7 2.7 35.9
32 | 14.5 9.0 13.4 24.9 11.7 13.2 24.7 18.1 27.7 4.9 | 14.2 9.0 11.9 20.1 10.4 11.7 18.3 17.2 17.5 5.0
(Shading legend: 0–2, 2–5, >5)
For both cases, the CCM approach does not show promising performance, although wide ranges of
threshold variations were taken into account. As noted, out of all the cases, those that do not
result in changes in the f-measure values are acceptable. Although in some cases the f-measure
values have increased, this increase is mainly related to cases where merging happens for clusters
with high true positive rates, so that the change in the mean f-measure value becomes positive. A
successful merging of truly compatible clusters is one that does not change the f-measure values in
either the positive or the negative direction. In the case of using the original dimensions, the
first compatibility condition (K1) drove the merging process, while in the reduced-dimensionality
case, the third compatibility condition (K3) drove it. However, in both cases, the increase in CQI
was at best about one percent (reaching up to 8% in isolated cases). Although the values in
Table 5-3 and Table 5-5 for the targeted clusters seem promising, so that a better performance
would be expected, the ranges of variation of the first and third compatibility conditions also
result in merging incompatible clusters.
The same analysis was carried out for the FCM approach. The cluster compatibility metric in this
approach is in fact a slightly different representation of the third compatibility condition in the
CCM approach. In the FCM approach, instead of the cluster dispersion along the first principal
component, the radii of the clusters are used, which could further amplify the compatibility
criterion.
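Under the stated assumption that cluster radii replace the principal-axis dispersion, a plausible form of the similarity ratio (SR) is sketched below. This is our reconstruction, not the dissertation's definition: the function name, argument names, and the exact normalization are assumptions, chosen so that close, dispersed pairs score high and distant, dense pairs score low, matching the behavior described for the third condition:

```python
import math

def similarity_ratio(center_i, center_j, radius_i, radius_j):
    """Assumed SR form: sum of the two cluster radii divided by the
    distance between the cluster centers. Close and dispersed pairs
    give large values; distant and dense pairs give small values."""
    dist = math.dist(center_i, center_j)  # Euclidean center distance
    return (radius_i + radius_j) / dist

# made-up values: a close, dispersed pair vs. a distant, dense pair
print(similarity_ratio((0.0, 0.0), (1.0, 0.0), 1.0, 1.0))   # 2.0
print(similarity_ratio((0.0, 0.0), (10.0, 0.0), 0.5, 0.5))  # 0.1
```

With this orientation of the metric, merging would be triggered when SR exceeds the τSR threshold, consistent with the 0.3–1 threshold sweep discussed below.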
Figure 5-10 The effect of the cluster merging on the cluster quality indices (i.e., accuracy and
CQI metrics) using the CCM approach for different values of K1 and K3, for the original dimensions
Figure 5-11 The effect of the cluster merging on the cluster quality indices (i.e., accuracy and
CQI metrics) using the CCM approach for different values of K1 and K3, for the first five principal
dimensions
Figure 5-12 Variation of the FCM similarity ratio (SR) metric for the turn-on events on phase A
of apartment 1; variation of the SR measure across different clusters (left side); histogram of the
SR values across different clusters (right side)
Figure 5-12 shows the variation of the similarity ratio (SR) measure across the different clusters.
The variation of the SR measure shows the potential of the FCM approach for merging similar
clusters. Based on the histogram of the SR measures, a range of threshold values from 0.3 to 1,
with 0.1 increments, was considered for the τSR parameter in the cluster merging.
Figure 5-13 The effect of the cluster merging on the cluster quality indices (i.e., accuracy and
CQI metrics) using the FCM approach for different values of τSR, using the original dimensions
Figure 5-13 presents the variation of the cluster quality indices for different values of the τSR
parameter. The best result is obtained with τSR = 1; the other cases are not acceptable. However,
to shed light on the class-wise changes of the CQI values, Table 5-6 presents the trend of the CQI
changes for different values of τSR. The merging process shows promising results for most of our
targeted classes (0, 18002, 11101, 16201): the CQI increases for all of these classes, specifically
when lower values of the τSR parameter are used. Despite the improvements for these classes, lower
values of τSR resulted in a dramatic reduction of the CQI for the rest of the classes, which is not
desirable.
Table 5-6 Variation of the CQI values for individual classes of appliance state transitions
Class Label | Original CQI | τSR: 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0 0.09 0.59 0.41 0.27 0.16 0.16 0.13 0.10 0.10
200 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
300 0.09 0.00 0.00 0.11 0.11 0.09 0.09 0.09 0.09
11101 0.32 0.97 0.96 0.48 0.32 0.32 0.32 0.32 0.32
11102 1.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00
12900 0.02 0.00 0.00 0.00 0.00 0.00 0.02 0.02 0.02
12901 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12902 0.50 0.00 0.00 0.00 0.00 0.50 0.50 0.50 0.50
12903 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
12906 0.46 0.00 0.00 0.00 0.00 0.00 0.00 0.83 0.46
14101 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
14201 0.59 0.00 0.00 0.00 0.00 0.00 0.00 0.59 0.59
14301 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00
14401 0.97 0.00 0.00 0.00 0.00 0.97 0.97 0.97 0.97
14501 1.00 0.00 0.00 0.00 0.00 0.00 1.00 1.00 1.00
16201 0.37 0.73 0.37 0.37 0.37 0.37 0.37 0.37 0.37
16301 0.95 0.00 0.95 0.95 0.95 0.95 0.95 0.95 0.95
18001 0.48 0.00 0.00 0.00 0.00 0.00 0.00 0.07 0.48
18002 0.09 0.46 0.46 0.46 0.46 0.46 0.46 0.23 0.15
However, our observations show that the appliance signatures with lower ranges of power variation
play an important role in the poor performance of the CCM and FCM algorithms. This phenomenon can
also be observed in Figure 5-9 and Figure 5-12, where the contrast between clusters decreases as we
approach the clusters with lower power ranges. As mentioned, the heuristic clustering algorithm
uses recursion to account for the multi-scale nature of the feature space, and thus the clusters
that are separated at the smaller scales are usually difficult for the CCM and FCM cluster merging
algorithms to differentiate. A closer look at the clusters in Figure 5-3 clarifies this concept.
Therefore, the power draw of each cluster is used as another important criterion for evaluating the
compatibility of clusters; this feature is represented by the real power variation range of each
cluster (P_r^cl). Based on these observations, we propose to enhance the performance of the cluster
merging algorithms by performing the merging process at different scales.
Accordingly, the cluster merging is carried out for groups of similar clusters, which are formed by
considering a minimum 20 percent change in the power variation ranges. In fact, the algorithm finds
considerable (20%) jumps in power ranges by searching through the sorted values of the clusters'
real power ranges. As an example, for the targeted data set in the above analyses, this results in
seven groups of clusters. Table 5-7 shows the groups and their corresponding clusters.
Table 5-7 Example of groups of clusters for the targeted data set and their associated cluster
labels
Group No. 1 2 3 4
Clusters [11, 2, 16, 5, 13] [9, 3, 8, 1, 15, 12, 7, 14] 17 20
Group No. 5 6 7
Clusters [31, 33, 32, 22, 19, 10, 24, 4, 21, 18] [25, 23, 28, 30, 27, 6, 29, 26, 35] [36, 34]
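The grouping step described above can be sketched as follows. This is a minimal illustration: the function and variable names are ours, and the exact jump rule in the dissertation's implementation may differ slightly (here a new group starts whenever the sorted power ranges jump by more than 20% relative to the previous cluster):

```python
def group_by_power_range(power_ranges, jump=0.2):
    """Group cluster labels whose real-power ranges (P_r^cl) are similar.

    power_ranges: dict mapping cluster label -> real power range (watts).
    A new group starts whenever the sorted power ranges jump by more
    than `jump` (20% by default) relative to the previous cluster.
    """
    ordered = sorted(power_ranges.items(), key=lambda kv: kv[1])
    groups = [[ordered[0][0]]]
    prev = ordered[0][1]
    for label, p in ordered[1:]:
        if prev > 0 and (p - prev) / prev > jump:
            groups.append([label])  # considerable jump -> new group
        else:
            groups[-1].append(label)
        prev = p
    return groups

# hypothetical power ranges (watts) for six clusters
ranges = {"a": 30, "b": 33, "c": 50, "d": 55, "e": 400, "f": 430}
print(group_by_power_range(ranges))  # → [['a', 'b'], ['c', 'd'], ['e', 'f']]
```

Merging is then evaluated only within each group, so clusters at very different power scales are never compared.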
Figure 5-14 illustrates the results of the cluster merging (using the FCM approach) for groups of
similar clusters. Each subplot shows the variation of the change in overall CQI (represented by
bars) and f-measure (represented by the line) for different groups of clusters and different values
of the SR threshold. For almost all of the τSR values, the cluster merging performance drops for
the last three groups, which contain the signatures at smaller scales (i.e., with lower power
ranges), supporting our argument. The best result was obtained for τSR = 0.4, where more than a
5 percent improvement in CQI was achieved without any change in the f-measure value.
Figure 5-14 Variation of the cluster quality indices (f-measure and CQI) when merging groups of
clusters formed based on the real power range variation of the clusters
Furthermore, the number above each bar shows the lowest power variation range (P_r^cl) among the
groups of clusters for which cluster merging could be carried out without a negative effect on the
quality of the original clusters. As Figure 5-15 indicates, the application of dimensionality
reduction improved the performance of the CCM approach, specifically for K1 = 0.5. Consequently, in
Figure 5-16, the performance of the CCM approach was evaluated using these specifications for
different K3 values.
Figure 5-15 Performance of the CCM algorithm for the apartment 1 data set: a) turn-on events on
phase A, original dimensions, K1 = 0.7; b) turn-on events on phase A, original dimensions,
K1 = 0.5; c) turn-on events on phase A, first 5 PCs, K1 = 0.7; d) turn-on events on phase A, first
5 PCs, K1 = 0.5
Figure 5-16 Performance of the CCM algorithm for the apartment 1 data set using the first 5 PCs and
K1 = 0.5: a) turn-on events on phase A, b) turn-off events on phase A, c) turn-on events on
phase B, and d) turn-off events on phase B
Figure 5-17 and Figure 5-18 present the results of the same analyses for the FCM and CBM
approaches, respectively. In these two methods, only one threshold parameter (i.e., τCS for CBM and
τSR for FCM) is adjusted for the performance evaluation. The results of the analyses show that the
CBM approach outperforms the other methodologies, yielding higher overall ΔCQI and lower P_r^cl. A
lower P_r^cl means that the CBM approach outperforms the other methods even when merging clusters
at smaller scales of the feature space, without a drop in the accuracy of the clusters. Between the
FCM and CCM approaches, the latter showed better stability against variation of the threshold
values, and therefore better generalizability of the threshold; however, the higher overall CQI
improvement and better performance at smaller scales of the feature space underline the better
performance of CBM. A comprehensive analysis of the effective threshold values for merging is
presented in Section 5.5.2.
a) Turn-on events, phase A b) Turn-off events, phase A
c) Turn-on events, phase B d) Turn-off events, phase B
Figure 5-17 Performance of the FCM algorithm for apartment 1 data set: a) turn-on events on
phase A, b) turn-off events on phase A, c) turn-on events on phase B, and d) turn-off events on
phase B
a) Turn-on events, phase A b) Turn-off events, phase A
c) Turn-on events, phase B d) Turn-off events, phase B
Figure 5-18 Performance of the CBM algorithm for apartment 1 data set: a) turn-on events on
phase A, b) turn-off events on phase A, c) turn-on events on phase B, and d) turn-off events on
phase B
Moreover, the results of the analysis show that, by using the CBM approach, the merging could be
carried out down to small scales, with the exception of the phase B turn-on events. A closer look
at the feature space and the original clusters for the data on this phase reveals that there are
two appliance states (i.e., one state from the washing machine (18305) and one from the hair dryer
(18101)) that generate very similar signatures, which in turn change the f-measure values. By
ignoring those two clusters, the algorithm correctly evaluates the merging of the rest of the
clusters down to the lowest scale. It can therefore be generally concluded that, using the CBM
approach, cluster merging can be carried out for all scales in the feature space, while this is not
the case for the other two methods. For example, for the turn-on events' clusters on phase A, both
the FCM and CCM approaches performed poorly for scales with P_r^cl lower than 350 watts.
5.4.2 Anomaly Detection Algorithm Validation
The anomaly detection algorithm builds a model for each cluster of interest by solving the
quadratic optimization problem presented in equation 5-14. Different kernel functions could be used
for building the model and performing the anomaly test. In addition to the polynomial kernel
function, the RBF kernel function (equation 5-23) and the Mahalanobis kernel function
(equation 5-24) are other kernel functions that could be taken into account.
k(x_n, x_m) = exp( −‖x_n − x_m‖² / (2σ²) )    (5-23)
k(x_n, x_m) = exp( −(1/(2σ²)) (x_n − x_m)ᵀ Q (x_n − x_m) )    (5-24)
In the above equations, σ is an adjustable hyperparameter whose value drives the performance of the
algorithm, and Q is the inverse of the covariance matrix of the training data set, which can be
computed over the entire data set or over a specific class of interest [93]. The Mahalanobis kernel
is an enhanced version of the RBF kernel and enables the use of the full covariance matrix of the
data set. For the Q matrix, the inverse of the covariance matrix of the training class of interest
can be used.
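Both kernels can be implemented directly from equations 5-23 and 5-24. The sketch below (written with NumPy; the function names are ours) also illustrates that the Mahalanobis kernel reduces to the RBF kernel when Q is the identity matrix:

```python
import numpy as np

def rbf_kernel(xn, xm, sigma):
    # Equation 5-23: exp(-||xn - xm||^2 / (2 sigma^2))
    diff = np.asarray(xn, dtype=float) - np.asarray(xm, dtype=float)
    return float(np.exp(-diff.dot(diff) / (2.0 * sigma ** 2)))

def mahalanobis_kernel(xn, xm, sigma, Q):
    # Equation 5-24: Q is the inverse covariance matrix of the training data
    diff = np.asarray(xn, dtype=float) - np.asarray(xm, dtype=float)
    return float(np.exp(-diff.dot(Q).dot(diff) / (2.0 * sigma ** 2)))

# with Q = I, the Mahalanobis kernel equals the RBF kernel
x, y = [1.0, 2.0], [0.0, 0.5]
print(abs(mahalanobis_kernel(x, y, 1.0, np.eye(2)) - rbf_kernel(x, y, 1.0)))
```

The extra matrix product in the Mahalanobis kernel is what makes it more expensive in practice, consistent with the computational-cost argument made later in this section.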
The evaluation of the anomaly detection algorithm is carried out by training the algorithm for each
class and testing the anomaly detection against the rest of the classes. The C parameter in
equation 5-14 is adjusted in an unsupervised way: it is initialized with a value larger than the
feasible values and is incrementally decreased until reaching a marginal value, at which one of the
samples in the training cluster is detected as an outlier. The rest of the feature vectors are used
as the test data set, which is evaluated against the trained model for each cluster. In this way,
the trained model for each cluster is evaluated for detecting both inliers and outliers. The
performance evaluation and parameter tuning were carried out on the turn-on and turn-off events on
both phases for apartment 1 and apartment 2 to account for the diversity of the feature space. In
evaluating the kernel functions and tuning their parameters, the ground-truth labeled data sets
were used. The first-, second-, and third-degree polynomial kernel functions and the RBF kernel
function were selected for the evaluations. Based on numerous evaluations, the intercept (the c
parameter) in the polynomial kernel functions was set to zero, and the degree of the polynomial was
the only parameter taken into account for the sensitivity analyses. For the RBF kernel function,
the 1/(2σ²) term was set to 0.00001; the feasibility of defining the anomaly detection as an
optimization problem was used to determine this value. Since this value resulted in good
performance of the SVDD approach, the sensitivity analyses focused on the kernel type and the
determination of the initial C value.
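The unsupervised C-tuning procedure above can be sketched as a simple decrement loop. The names and the exact stopping rule are our reading of the text; `n_outliers` stands in for training an SVDD model with a given C and counting how many training samples it flags as outliers:

```python
def tune_c(n_outliers, c_init, step):
    """Start from a C larger than the feasible values and decrease it
    in `step` increments until the next step would cause a training
    sample to be flagged as an outlier; return the last C with no
    training outliers (the marginal value described in the text)."""
    c = c_init
    while c - step > 0 and n_outliers(c - step) == 0:
        c -= step
    return c

# mock model: flags one outlier once C drops below ~0.045
mock = lambda c: 0 if c >= 0.045 else 1
print(round(tune_c(mock, c_init=0.15, step=0.01), 3))  # → 0.05
```

The ranges of effective C values reported in Table 5-8 (e.g., 0.099 and 0.149) are the kind of marginal values this loop would return.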
Table 5-8 shows the results of the analysis on the data from both apartment 1 and apartment 2. The
outlier detection accuracy metric was used for this evaluation; it measures the percentage of
outliers that were correctly detected. For each class of signatures, the outliers are the samples
in the other classes. As the table shows, the RBF kernel function resulted in higher outlier
detection accuracy, while the best performance among the polynomial kernel functions was obtained
with the second-order polynomial kernel. The higher accuracy of the RBF kernel function is
accompanied by an increase in the number of support vectors (as shown in Table 5-8). This increase
raises the computing time for tuning the C parameter, specifically when the number of signature
samples in a class/cluster and the number of variables are large. However, it was observed that a
subset of the samples in a class could be used for training to compensate for the reduced computing
efficiency; based on an evaluation of the algorithm's efficiency, a minimum of 150 samples was used
in our analyses. Since the clustering and data description are performed once, before the labeling
stage, the relatively lower efficiency of the RBF kernel function can be justified. Considering the
promising performance of the RBF kernel function, and the fact that the Mahalanobis kernel function
compounds the complexity of the algorithm (due to the computational cost of the covariance matrix
inversion for the kernel calculations), the Mahalanobis kernel was not pursued further.
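The outlier detection accuracy reported in Table 5-8 can be computed as sketched below (a minimal illustration; the function name is ours):

```python
def outlier_detection_accuracy(is_outlier, flagged):
    """Fraction of true outliers (samples from other classes) that the
    trained model correctly flags as outliers.

    is_outlier: booleans, True for samples outside the class of interest.
    flagged:    booleans, True where the model flags a sample as an outlier.
    """
    hits = sum(1 for o, f in zip(is_outlier, flagged) if o and f)
    total = sum(is_outlier)
    return hits / total if total else 0.0

# three true outliers, two of them flagged by the model
print(outlier_detection_accuracy([True, True, False, True],
                                 [True, False, False, True]))
```

Averaging this quantity over all classes (and all event types) gives the per-kernel mean accuracy values reported in the table.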
Table 5-8 Evaluation of the SVDD algorithm for anomaly detection using the data in apartment 1
and apartment 2 data sets
Data Set | Apartment 1 | Apartment 2 | Mean
Event Type – Phase | On-A  Off-A  On-B  Off-B | On-A  Off-A  On-B  Off-B | –
Number of Classes | 17  16  12  13 | 15  19  21  21 | –
Outlier Detection Accuracy
Pol (d1) *acc | 0.98  1.0  0.79  0.92 | 0.87  0.93  0.9  0.89 | 0.91
Pol (d1) **C_R | 0.099  0.149  0.099  0.147–0.149 | 0.149  0.14–0.149  0.149  0.149 | –
Pol (d1) ***SV_avg | 6.6  4.9  6.6  4.4 | 4.4  4.4  4.6  4.6 | 5.1
Pol (d2) acc | 0.99  1.0  0.89  0.98 | 0.9  0.92  0.91  0.91 | 0.93
Pol (d2) C_R | 0.099  0.149  0.099  0.13–0.149 | 0.149  0.127–0.149  0.149  0.149 | –
Pol (d2) SV_avg | 6.9  4.8  6.8  4.7 | 4.6  4.4  5.1  4.8 | 5.3
Pol (d3) acc | 0.97  1.0  0.84  0.95 | 0.8  0.92  0.9  0.9 | 0.91
Pol (d3) C_R | 0.099  0.149  0.099  0.141–0.149 | 0.149  0.119–0.149  0.149  0.149 | –
Pol (d3) SV_avg | 6.8  4.6  6.3  4.5 | 4.7  4.5  4.8  4.5 | 5.1
RBF acc | 0.99  1.0  0.94  0.99 | 0.96  0.98  0.98  0.98 | 0.98
RBF C_R | 0.01–0.099  0.024–0.149  0.009–0.099  0.028–0.149 | 0.006–0.149  0.009–0.149  0.017–0.149  0.017–0.149 | –
RBF SV_avg | 19.3  10  25.8  19.8 | 34.8  35.2  34.3  19.3 | 24.8
* acc shows the average percentage of correctly detected outlier signatures (for each class of
interest, the outlier signatures are the entire data set except the signatures in that class)
** C_R shows the range of C values that brought about the reported accuracy values
*** SV_avg shows the average number of support vectors (across all the classes) that resulted in
the obtained accuracy
For practical application of the anomaly detection approach and in an attempt to find patterns in
tuning the anomaly detection algorithm, the performance of the algorithm was also evaluated with
respect to the classes’ real power range variation and the number of samples in each class.
Figure 5-19 illustrates the variation of outlier detection accuracy versus the real power range of
the classes. The x-axis in this figure is logarithmic, and as the figure shows, the anomaly
detection performance degrades for classes with power ranges of less than 200 watts.
Figure 5-19 Variation of the outlier detection accuracy versus the change in the real power range
of signature clusters for both apartment 1 and apartment 2 data sets using RBF and second
degree polynomial kernel functions
Figure 5-20 shows the distribution of the signature classes in the power-range versus
number-of-samples space. The figure shows that the majority of the signature samples belong to
classes with larger power ranges; for this category of signatures, the polynomial kernel could be
an acceptable solution. For the signatures with power ranges below 200 watts, the polynomial
kernel also showed good performance in the majority of the cases. Therefore, as a practical
solution in the smart interaction (SI) framework, once the clusters are formed, the second-order
polynomial kernel function is used for the data description of each cluster and its performance is
evaluated against the rest of the clusters. For clusters with an outlier detection accuracy of
less than 0.95, the RBF kernel function is used instead.
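This fallback rule can be sketched as follows, again using the one-class SVM as a stand-in for SVDD. The 0.95 accuracy threshold is the one stated above, while the synthetic data and the nu value are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def outlier_accuracy(model, outliers):
    """Fraction of out-of-class signatures flagged as outliers (-1)."""
    return float(np.mean(model.predict(outliers) == -1))

def describe_cluster(inliers, outliers, acc_threshold=0.95):
    """Try the second-order polynomial kernel first; fall back to RBF if
    the outlier detection accuracy against the remaining clusters is low."""
    poly = OneClassSVM(kernel="poly", degree=2, gamma="scale", nu=0.05).fit(inliers)
    if outlier_accuracy(poly, outliers) >= acc_threshold:
        return "poly-2", poly
    rbf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(inliers)
    return "rbf", rbf

rng = np.random.default_rng(1)
cluster = rng.normal([60.0, 5.0], [4.0, 0.5], size=(200, 2))   # small-power class
others = rng.normal([90.0, 8.0], [4.0, 0.5], size=(80, 2))     # nearby class
kernel_used, model = describe_cluster(cluster, others)
print(kernel_used)
```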
Figure 5-20 The variation of real power range and the number of samples in different classes
Another important factor is the initial value of the regularization parameter, C. In the above
evaluations, the initial value of C was set to 0.1, and in cases where 0.1 did not result in a
solution, it was set to 0.15. C was then decreased in steps of 0.001 until a marginal value was
reached for which a tight data description is obtained. The question is whether the initial value
can be determined from the characteristics of the classes/clusters. For the polynomial kernel,
C = 0.15 could be considered an acceptable value. However, for the RBF kernel function,
considering the increased computing time, an informed selection of the initial C value is
preferred. Figure 5-21 shows the variation of the classes' power range and the number of samples
in each class versus the optimum C values when using the RBF kernel function. Although these
graphs do not demonstrate any significant trend, it could be concluded that for populated
classes/clusters (i.e., those with more than 200 samples) an initial value of 0.1 is acceptable;
for classes/clusters with more than 500 samples, this value could be further reduced to 0.08.
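The incremental search for C can be sketched as below. Since scikit-learn parameterizes the one-class SVM by nu rather than C, the sketch maps C to nu via nu = 1/(N*C); the stopping criterion (a target outlier accuracy as a proxy for a "tight" description) and the data are assumed simplifications.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def tune_C(inliers, outliers, c_init=0.1, c_step=0.001, c_min=0.005, target=0.95):
    """Decrease C from the initial value until the description is tight enough,
    i.e., the out-of-class detection accuracy reaches the target."""
    n = len(inliers)
    c = c_init
    while c >= c_min:
        # Map C to scikit-learn's nu parameter (clipped to a valid range).
        nu = min(max(1.0 / (n * c), 1e-3), 1.0)
        model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(inliers)
        acc = float(np.mean(model.predict(outliers) == -1))
        if acc >= target:
            return c, model
        c -= c_step
    return None, None

rng = np.random.default_rng(2)
inl = rng.normal([500.0, 40.0], [8.0, 1.0], size=(300, 2))
out = rng.normal([800.0, 60.0], [8.0, 1.0], size=(100, 2))
c_found, _ = tune_C(inl, out)
print(c_found)
```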
Figure 5-21 Variation of optimum C values versus characteristics of the signature classes (left: C value versus number of samples; right: C value versus power range of classes)
For the Mahalanobis distance (MD) based outlier detection, the evaluation is carried out for both
inliers and outliers. Similar to the SVDD approach, the inlier compatibility threshold, τ_od,
needs to be adjusted for effective outlier detection. Since in the SI framework the structure of
the signature space (i.e., the clusters of observed state transitions) is determined through the
cluster validation procedure, τ_od could be adjusted for each individual cluster in an
unsupervised way. The τ_od for each cluster is determined by measuring the accuracy of inlier and
outlier detection: the inlier detection accuracy is evaluated through a leave-one-out (LOO)
approach, and the outlier detection accuracy is tested using the signatures in all the remaining
clusters.
Several research efforts have been made to improve the performance of the multivariate outlier
detection and analytical determination of the detection threshold (some examples could be found
in [94-97]). The improvements are mainly based on using robust statistical measures for the core
data mean and covariance matrix calculation. Adopting robust statistical metrics reduces the
sensitivity to outliers of the core training data. The most common robust measure for calculating
the covariance matrix of the data is the minimum covariance determinant (MCD) method mainly
due to the availability of a computationally fast algorithm [94]. In this approach, the data that is
used for calculating the statistical measures is a subset of the training data that results in a
covariance matrix with minimum determinant. However, in the SI framework, the application of
robust statistical metrics could result in rigid representation of the core data (i.e., the signatures in
the reference cluster), which is not desirable as mentioned above. Therefore, in this dissertation,
we use maximum likelihood estimation over all the signatures in the reference cluster to obtain
the sample mean and covariance matrix used in calculating the Mahalanobis distance. As noted, obtaining
the knowledge about the possible states through cluster validation gives us the opportunity to
adjust the detection statistic threshold by taking the knowledge of the signature space into
account. Therefore, instead of finding generalized threshold values, the threshold values are
determined through an in-situ sensitivity analysis for each individual cluster. To evaluate the
performance of the approach on the appliances’ signatures, similar to the evaluation for SVDD
approach, the ground truth data collected from apartment 1 and 2 are used. In this evaluation, both
inlier and outlier detection accuracies are taken into account through the following formulation:
$$\underset{\tau_{od}}{\operatorname{argmax}}\; acc_{testing} \quad \text{s.t.}\; acc_{training} > 0.8 \qquad (5\text{-}25)$$
In this way, the maximum outlier detection accuracy on the test data is reported over the τ_od
values that yield more than 0.8 accuracy in detecting the inliers of the training data set. A
range of [1, 30] was used for evaluating the outlier detection performance. Unlike the SVDD
approach, the computing time in this approach is considerably reduced, since the mean, the
covariance matrix, and its inverse are calculated only once for each reference cluster. In these
evaluations, the unknown and false-positive signature classes (i.e., 0 and 300) and the classes
with large variations in signatures were not taken into account as reference clusters. However,
all the signatures were used as test data for each reference cluster.
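A minimal sketch of this threshold selection (Equation 5-25) on synthetic clusters is shown below; the cluster locations and sizes are illustrative assumptions, and the maximum likelihood mean and covariance follow the approach described above.

```python
import numpy as np

def mahalanobis_d(X, mean, cov_inv):
    """Mahalanobis distance of each row of X to the reference distribution."""
    diff = X - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

def tune_threshold(cluster, others, taus=range(1, 31), min_inlier_acc=0.8):
    """Eq. 5-25: maximize outlier detection accuracy subject to LOO inlier
    detection accuracy above 0.8 on the reference cluster."""
    mean = cluster.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(cluster, rowvar=False))
    # LOO inlier distances: each sample against a model fit on the rest.
    n = len(cluster)
    loo_d = np.empty(n)
    for i in range(n):
        rest = np.delete(cluster, i, axis=0)
        m = rest.mean(axis=0)
        ci = np.linalg.inv(np.cov(rest, rowvar=False))
        loo_d[i] = mahalanobis_d(cluster[i:i + 1], m, ci)[0]
    out_d = mahalanobis_d(others, mean, cov_inv)
    best_tau, best_acc = None, -1.0
    for tau in taus:
        inlier_acc = float(np.mean(loo_d <= tau))
        outlier_acc = float(np.mean(out_d > tau))
        if inlier_acc > min_inlier_acc and outlier_acc > best_acc:
            best_tau, best_acc = tau, outlier_acc
    return best_tau, best_acc

rng = np.random.default_rng(3)
ref = rng.normal([1500.0, 120.0], [20.0, 4.0], size=(120, 2))      # reference cluster
others = rng.normal([1430.0, 100.0], [20.0, 4.0], size=(120, 2))   # a nearby class
tau, acc = tune_threshold(ref, others)
print(tau, round(acc, 2))
```

Because the mean and inverse covariance of the reference cluster are computed once, the per-sample cost is a single quadratic form, which is the source of the reduced computing time noted above.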
Table 5-9 Evaluation of the MD algorithm for anomaly detection using the data in the apartment 1 and apartment 2 data sets

| | Apt. 1, On-A | Apt. 1, Off-A | Apt. 1, On-B | Apt. 1, Off-B | Apt. 2, On-A | Apt. 2, Off-A | Apt. 2, On-B | Apt. 2, Off-B | Mean |
|---|---|---|---|---|---|---|---|---|---|
| Number of Classes | 17 | 16 | 12 | 13 | 15 | 19 | 21 | 21 | - |
| Outlier Detection Acc | 1.00 | 0.98 | 0.97 | 0.94 | 0.99 | 0.98 | 0.98 | 0.98 | 0.98 |
| Inlier Detection Acc* | 1.00 | 1.00 | 0.99 | 0.99 | 0.97 | 0.97 | 0.98 | 0.98 | 0.99 |
| τ_od Range** | 3-14 | 2-12 | 4-16 | 2-16 | 2-19 | 2-19 | 2-15 | 2-16 | - |

* average percentage of correctly detected inlier signatures, using a LOO approach
** range of threshold values that brought about the reported accuracy values
To provide more information about the accuracy, Figure 5-22 illustrates the variation of the
inlier and outlier detection accuracy with respect to the power range of the signature classes.
Figure 5-22 The distribution of inlier and outlier detection accuracy with respect to the power
range of the signature classes
As expected, at smaller scales the outlier detection accuracy is relatively lower due to the
similarity of the signatures. However, in the majority of the cases the accuracy is higher than
0.9. When the approach is used as a component of the SI framework, the thresholds are identified
in an unsupervised way by using the outcome of the cluster validation process and the approach
presented in Equation 5-25.
5.5 Training Frameworks Performance Evaluation and Validation
As noted, the main objective of the SI framework is to reduce the number of calls for user
interaction until a populated training data set is collected for a new environment while
maintaining the accuracy of the data. Therefore, in this dissertation, a field experimental study
was adopted as the methodology for understanding the user interaction requirements and for
evaluating and validating the performance of the SI framework.
Figure 5-23 depicts the IDEF0 diagram of the training frameworks, representing the SI training
framework components and information flow. Since the SI framework is built on the basic training
framework, the same diagram also represents the basic training framework components. Our NILM
prototypes were developed using National Instruments LabVIEW and MATLAB. The prototypes enabled
real-time load monitoring and user interaction for training data provision. Figure 5-24 shows
one of the prototypes, used for the studies in this dissertation. The prototypes process the current
and voltage waveforms into power time series for each phase independently; the event detection
and feature extraction are then applied to buffered samples of the power time series. The parameters
of the event detector could be tuned for each phase independently. The classification is carried out
on the extracted feature vectors using the training data that is stored on a local computer in the
residential setting. The number of interactions of the user with the NILM system during training is
also recorded.
[Figure content: the IDEF0 diagram comprises four processes: A0 Power Metric Processing (inputs: current waveforms from both phases and the voltage waveform from one phase; output: power metric time series via the spectral envelope coefficient approach), A1 Anonymous Labeling Process (event detection, feature extraction, and clustering; output: clustered signatures {c_1, c_2, ..., c_n}), A2 Real-time Decision Making Module (event detection, feature extraction, classification, and anomaly detection on extracted feature vectors, using the training data set), and A3 User Interaction Interface (user input through the interaction interface, governed by a control policy and a rule set for user interaction). Legend: each process box carries inputs, outputs, mechanisms, and constraints.]
Figure 5-23 The IDEF0 diagram of the training framework, which could be selectively used for
both basic and enhanced training procedures
The user interface of the NILM prototype is depicted in Figure 5-25. The list shows the appliances
potentially available in each setting. In the experiments, this list was provided to the users of
the residential setting; in a real-world scenario, however, the list could also be generated by
the users as the system detects new appliance state transitions. As the figure shows, the options
for the user include “correct”, “not correct”, and “pass”. The “correct” button is selected if the
NILM system has detected the correct appliance state transition; the “not correct” button is used
along with the correct label for the appliance when the system does not detect the correct
appliance state transition. In cases where the user cannot associate the event with an event in
the physical environment, or is uncertain about an event, the “pass” button enables them to label
the event as unknown. The interface illustrated in Figure 5-25 is the only interface for end-user
interaction with the NILM system.
Figure 5-24 The NILM prototype set up in an apartment as part of the experimental test bed (components: voltage transformer, current clamps, DAQ card, user interaction interface, and a local computing device for data acquisition, processing, and storage)
The SI framework is evaluated by investigating its effect on the user interaction compared to the
basic training framework. Field experimental studies were carried out in apartment 1 and
apartment 2. To reduce the burden on the users in the experimental test-bed buildings and to
provide fair grounds for comparing the two frameworks (basic and SI training), the SI framework
performance was evaluated by simulating the training process on the data collected in the field
experiments with the basic training framework. The feature vectors, which were captured in the
field experiments and tagged with the field-provided labels, were fed to the SI framework in
chronological order. The framework is evaluated by comparing the interaction requirements (the
number of calls made to the users for
labeling) over the same period of time and the accuracy of the labels. The ground truth labels are
compared with the field-provided labels to calculate the accuracy of the labeling process.
Figure 5-25 The user interface of our NILM prototype used for training by users
The framework consists of a stage of possible appliance state identification and a
user-interaction stage. In our studies, we separated the data into two sets: one for clustering
and one for evaluating user interaction. The details are presented in the following sections.
Prior to evaluating the SI framework as a whole, further analyses on internal cluster validation,
cluster merging, and anomaly detection are presented.
5.5.1 Internal Cluster Validation
In Chapter 4, the performance of our heuristic clustering algorithm was evaluated externally by
using the ground truth data. It was discussed that no generalized conclusion could be made
regarding the feature configuration that yields optimum performance. As demonstrated, the
algorithm generally results in high F-measure values, which shows its effectiveness in accurately
partitioning the feature space; however, there is room for improvement if the optimum
configuration could be determined.
configuration includes the structure of the feature space and the feature extraction approach. As
noted, the dissimilarity measure between the feature vectors is the main factor that drives the
quality of the clustering. Specifically, for the electricity disaggregation problem, the
multi-scale nature of the feature space could mask the dissimilarity between some of the
signatures. For example, in apartment 1, the signatures of the refrigerator compressor turn-off
and the bathroom light turn-off events are closely similar, with only subtle differences. If the
entire signature space is taken into account, the dissimilarity between these two classes of
signatures is negligible, and they could be clustered together. Although the presented heuristic
algorithm recursively searches for clusters at different scales, such mixed clusters could be
formed in the first recursion. Therefore, two questions must be answered for optimum clustering:
1) how can different clustering configurations be autonomously evaluated, and 2) how can the scale
effect be addressed?
To address the first question, we adopted internal clustering evaluation, in which the goodness of
the clustering results is taken into account [98, 99]. The goodness metrics evaluate how compact
the data in each cluster is and how well-separated the clusters are: lower variance within
clusters indicates compactness, and larger distances between clusters indicate a well-separated
signature space. Various internal validation metrics have been proposed in different research
fields [99]. Our observations of the clustering in different configurations revealed that the
common situations to avoid are 1) clustering feature vectors from different classes together,
especially at larger scales, and 2) having a large residual cluster that cannot be clustered
further. To address both concerns, we adopted three different internal evaluation metrics to
account for the overall goodness of the clustering and the goodness of individual clusters. For global
goodness, we use two metrics. The first one is the Davies-Bouldin index [100], which is
calculated as follows:
$$DB_{index} = \frac{1}{N_C}\sum_{i}\max_{j\,(j\neq i)}\left\{\left[\frac{1}{n_i}\sum_{\boldsymbol{x}\in C_i} d(\boldsymbol{x},C_i) + \frac{1}{n_j}\sum_{\boldsymbol{x}\in C_j} d(\boldsymbol{x},C_j)\right] \Big/\, d(C_i,C_j)\right\} \qquad (5\text{-}26)$$
where N_C is the number of clusters, n_k is the number of feature vectors in cluster k, d(x, C_k)
is the distance between feature vector x and the center of cluster k, and d(C_i, C_j) is the
distance between the centers of clusters i and j. The optimum clustering configuration is obtained
by minimizing the DB index. In calculating this index, the similarity between cluster C and every
other cluster is calculated and the maximum value is assigned to cluster C; the mean value of the
cluster similarities is then used as the DB index. The idea of this metric is analogous to the
similarity ratio (SR) metric that was introduced for cluster merging in Section 5.3.1. As the
second metric, we use an index in the form of a compactness/separation ratio, which was adopted
based on our observations of the clustering results:
$$SC_{index} = \frac{1}{N_C}\sum_{i}\frac{\Delta_i}{\dfrac{1}{N_C-1}\sum_{j\,(j\neq i)} d(C_i,C_j)} \qquad (5\text{-}27)$$

$$\Delta_k = \max_{\boldsymbol{x}_n,\boldsymbol{x}_m\in C_k} d(\boldsymbol{x}_m,\boldsymbol{x}_n) \qquad (5\text{-}28)$$
The SC index measures the goodness of clustering by taking the quality of all clusters into
account. The metric assigns to each individual cluster a score representing the ratio of its
intra-cluster dispersion over its average inter-cluster distance, and evaluates the overall
clustering goodness by averaging these scores over all clusters. Therefore, the optimum clustering
configuration is obtained by minimizing the SC index. Moreover, we adopted the Dunn index [86] to
take individual cluster goodness into account. This index is calculated as follows:
$$D_{index} = \min_{1\le i\le N_C}\left\{\min_{\substack{1\le j\le N_C\\ j\neq i}}\left\{\frac{d(C_i,C_j)}{\max_{1\le k\le N_C}\Delta_k}\right\}\right\} \qquad (5\text{-}29)$$
The optimum clustering configuration is obtained by maximizing the D index. The denominator in
Equation 5-29 measures the maximum intra-cluster dispersion (the largest cluster diameter), which
enables us to account for the worst-case scenario and focus on the cluster with maximum
dispersion. If one cluster contains signatures from multiple classes, the cluster dispersion acts
as a penalty term in the Dunn index and reduces its value. Before calculating any of these
indices, the feature vectors are normalized by the maximum absolute values in the feature vectors
to ensure that large variations in power ranges do not drive the variation of these internal
validation indices. Basic feature vectors (fundamental frequency components of the real and
reactive power time series) are used in calculating these indices even if a different feature
extraction method is used for clustering.
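The three indices (Equations 5-26 to 5-29) can be sketched as below on synthetic, well-separated clusters; the sketch assumes Euclidean distances, cluster means as centers, and the mean point-to-center distance as the within-cluster scatter term of the DB index.

```python
import numpy as np
from itertools import combinations

def internal_indices(clusters):
    """DB (Eq. 5-26), SC (Eq. 5-27/5-28), and Dunn (Eq. 5-29) indices for a
    list of clusters, each given as an (n_k, d) array."""
    centers = [c.mean(axis=0) for c in clusters]
    nc = len(clusters)
    # Mean distance of each cluster's points to its own center (DB scatter term).
    spread = [np.mean(np.linalg.norm(c - ctr, axis=1))
              for c, ctr in zip(clusters, centers)]
    # Cluster diameters Delta_k (Eq. 5-28).
    diam = [max(np.linalg.norm(a - b) for a, b in combinations(c, 2))
            for c in clusters]
    cdist = np.array([[np.linalg.norm(ci - cj) for cj in centers]
                      for ci in centers])
    db = np.mean([max((spread[i] + spread[j]) / cdist[i, j]
                      for j in range(nc) if j != i) for i in range(nc)])
    sc = np.mean([diam[i] / (cdist[i].sum() / (nc - 1)) for i in range(nc)])
    dunn = min(cdist[i, j] for i in range(nc)
               for j in range(nc) if j != i) / max(diam)
    return db, sc, dunn

rng = np.random.default_rng(4)
clusters = [rng.normal(mu, 0.5, size=(40, 2)) for mu in ([0, 0], [10, 0], [0, 10])]
db, sc, dunn = internal_indices(clusters)
print(db < 1, sc < 1, dunn > 1)
```

For compact, well-separated clusters such as these, DB and SC are small and Dunn is large, matching the optimization directions stated above (minimize DB and SC, maximize Dunn).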
In addressing the second question, the masking effect of the multi-scale nature of the feature
space (specifically where the differences between signatures are subtle) should be taken into
account. Therefore, we proposed to introduce artificial variations in scale by dividing the
feature space into subsets, clustering the subsets, and combining the outcomes. We hypothesized
that dividing the feature space introduces variations in the feature space topology, which in turn
facilitates splitting the data according to the signature mapping in the physical domain.
Countless combinations could be used for dividing the feature space; however, as a practical
solution, we used a binary division of the data set at a signature power-range (fundamental
component of real power) threshold, P_b. Each data set, D, is divided into two subsets as follows:
$$\begin{aligned} D_u &\subseteq D = \{\boldsymbol{x}: \boldsymbol{x}\in D \ \text{and}\ PR_{\boldsymbol{x}}\ge P_b\}\\ D_l &\subseteq D = \{\boldsymbol{x}: \boldsymbol{x}\in D \ \text{and}\ PR_{\boldsymbol{x}}< P_b\}\\ D_u &\cup D_l = D \end{aligned} \qquad (5\text{-}30)$$
where PR_x is the power range of the real power component of the basic feature vector. The values
of P_b are determined based on a 10-bin histogram of the power ranges in each data set, ignoring
very large and very small power-range values.
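The binary division of Equation 5-30 and the histogram-based choice of P_b can be sketched as follows; the quantile-based trimming of the very large and very small power ranges is an assumed interpretation of the trimming described above.

```python
import numpy as np

def candidate_pb(power_ranges, n_bins=10, trim=(0.05, 0.95)):
    """Candidate P_b thresholds: interior edges of a 10-bin histogram of the
    power ranges, after trimming the extreme values (assumed via quantiles)."""
    lo, hi = np.quantile(power_ranges, trim)
    trimmed = power_ranges[(power_ranges >= lo) & (power_ranges <= hi)]
    _, edges = np.histogram(trimmed, bins=n_bins)
    return edges[1:-1]  # interior bin edges as thresholds

def binary_split(features, power_ranges, pb):
    """Eq. 5-30: split the data set D into D_u (PR_x >= P_b) and D_l (PR_x < P_b)."""
    upper = features[power_ranges >= pb]
    lower = features[power_ranges < pb]
    return upper, lower

rng = np.random.default_rng(5)
# Two synthetic power-range populations (small-load and large-load signatures).
pr = np.concatenate([rng.normal(150, 10, 80), rng.normal(1400, 50, 40)])
X = np.column_stack([pr, rng.normal(0, 1, pr.size)])
for pb in candidate_pb(pr):
    d_u, d_l = binary_split(X, pr, pb)
    assert len(d_u) + len(d_l) == len(X)  # the two subsets partition D
print(len(candidate_pb(pr)))
```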
The effectiveness of these indices was evaluated on the clustering of the apartment 1 and
apartment 2 data sets. Correlation analysis was used to determine the correlation between the
variation of the indices and the F-measure of the clustering. In order to account for appliances
with electronic components, clustering with the optimum configuration was also carried out for up
to the 5th harmonic components of the signatures. It is emphasized that numerous combinations of
feature space topology could be used in the analyses; the aforementioned one is one possible
solution. The results of these analyses are presented in the following pages. Table 5-10 to
Table 5-17 show the results, including all three internal evaluation indices along with the
cluster quality metrics, and Figure 5-26 to Figure 5-33 present the variation between the SC index
and the F-measure for the different data sets.
Table 5-10 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 1 – Phase A – Turn-on events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 200 | 0.00085 | 1.218 | 0.155 | 0.359 | 0.399 | 0.949 |
| 450 | 0.00198 | 1.170 | 0.173 | 0.439 | 0.488 | 0.892 |
| 600 | 0.00192 | 1.593 | 0.205 | 0.525 | 0.582 | 0.891 |
| 900 | 0.00218 | 1.227 | 0.193 | 0.494 | 0.548 | 0.891 |
| 1200 | 0.00830 | 0.964 | 0.169 | 0.430 | 0.474 | 0.939 |
| 1500 | 0.00319 | 1.180 | 0.206 | 0.464 | 0.516 | 0.954 |
| 1800 | 0.00085 | 1.200 | 0.153 | 0.422 | 0.470 | 0.926 |
| 2200 | 0.00085 | 1.200 | 0.153 | 0.422 | 0.470 | 0.926 |
| 1800-3rd-H | 0.00084 | 1.134 | 0.135 | 0.416 | 0.463 | 0.929 |
| 1800-5th-H | 0.00084 | 1.168 | 0.123 | 0.407 | 0.454 | 0.930 |
| Correlation | 0.17 | -0.51 | -0.31 | - | - | - |
Figure 5-26 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 1 – Phase A – Turn-on events
Table 5-11 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 1 – Phase A – Turn-off events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 200 | 0.00190 | 2.063 | 0.188 | 0.291 | 0.322 | 0.918 |
| 350 | 0.00477 | 1.797 | 0.204 | 0.492 | 0.547 | 0.904 |
| 500 | 0.00243 | 1.725 | 0.109 | 0.264 | 0.293 | 0.943 |
| 800 | 0.00126 | 1.773 | 0.122 | 0.264 | 0.293 | 0.918 |
| 900 | 0.00036 | 2.648 | 0.184 | 0.489 | 0.545 | 0.871 |
| 1150 | 0.00151 | 1.457 | 0.212 | 0.388 | 0.430 | 0.883 |
| 1500 | 0.00054 | 1.960 | 0.147 | 0.408 | 0.454 | 0.922 |
| 1800 | 0.00054 | 1.934 | 0.145 | 0.408 | 0.454 | 0.922 |
| 2100 | 0.00054 | 1.934 | 0.145 | 0.408 | 0.454 | 0.922 |
| 500-3rd-H | 0.00233 | 1.490 | 0.110 | 0.279 | 0.310 | 0.915 |
| 500-5th-H | 0.00124 | 1.867 | 0.145 | 0.299 | 0.333 | 0.909 |
| Correlation | 0.09 | -0.34 | -0.71 | - | - | - |
Figure 5-27 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 1 – Phase A – Turn-off events
Table 5-12 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 1 – Phase B – Turn-on events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 100 | 0.00159 | 1.609 | 0.267 | 0.283 | 0.321 | 0.806 |
| 250 | 0.00159 | 1.586 | 0.253 | 0.282 | 0.318 | 0.907 |
| 450 | 0.00159 | 1.625 | 0.266 | 0.283 | 0.318 | 0.907 |
| 650 | 0.00137 | 1.415 | 0.226 | 0.282 | 0.322 | 0.892 |
| 800 | 0.00137 | 1.415 | 0.226 | 0.282 | 0.322 | 0.892 |
| 1000 | 0.00166 | 1.422 | 0.226 | 0.284 | 0.324 | 0.889 |
| 1200 | 0.00159 | 1.440 | 0.224 | 0.285 | 0.325 | 0.917 |
| 1500 | 0.00159 | 1.440 | 0.224 | 0.285 | 0.325 | 0.917 |
| 1200-3rd-H | 0.00142 | 1.469 | 0.238 | 0.348 | 0.396 | 0.906 |
| 1200-5th-H | 0.00142 | 1.444 | 0.227 | 0.325 | 0.369 | 0.913 |
| Correlation | -0.13 | -0.41 | -0.53 | - | - | - |
Figure 5-28 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 1 – Phase B – Turn-on events
Table 5-13 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 1 – Phase B – Turn-off events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 300 | 0.00061 | 1.620 | 0.221 | 0.124 | 0.100 | 0.886 |
| 450 | 0.00061 | 1.732 | 0.247 | 0.279 | 0.312 | 0.779 |
| 650 | 0.00061 | 2.051 | 0.300 | 0.331 | 0.345 | 0.866 |
| 800 | 0.00061 | 1.922 | 0.254 | 0.331 | 0.345 | 0.876 |
| 1000 | 0.00061 | 1.922 | 0.254 | 0.331 | 0.345 | 0.876 |
| 1200 | 0.00061 | 1.755 | 0.234 | 0.274 | 0.279 | 0.851 |
| 1500 | 0.00061 | 1.755 | 0.234 | 0.274 | 0.279 | 0.851 |
| 1800 | 0.00143 | 1.483 | 0.220 | 0.373 | 0.415 | 0.857 |
| 1800-3rd-H | 0.00143 | 1.423 | 0.235 | 0.326 | 0.360 | 0.845 |
| 1800-5th-H | 0.00213 | 1.576 | 0.270 | 0.418 | 0.477 | 0.889 |
| Correlation | 0.24 | 0.09 | 0.13 | - | - | - |
Figure 5-29 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 1 – Phase B – Turn-off events
Table 5-14 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 2 – Phase B – Turn-on events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 160 | 0.00119 | 1.880 | 0.280 | 0.269 | 0.278 | 0.894 |
| 500 | 0.00091 | 1.649 | 0.242 | 0.344 | 0.358 | 0.829 |
| 870 | 0.00065 | 1.682 | 0.224 | 0.247 | 0.249 | 0.821 |
| 1150 | 0.00219 | 1.266 | 0.296 | 0.402 | 0.418 | 0.893 |
| 1500 | 0.00219 | 1.528 | 0.398 | 0.461 | 0.480 | 0.817 |
| 1700 | 0.00219 | 1.526 | 0.399 | 0.461 | 0.480 | 0.812 |
| 2000 | 0.00038 | 1.780 | 0.269 | 0.383 | 0.398 | 0.887 |
| 2500 | 0.00038 | 1.674 | 0.236 | 0.347 | 0.361 | 0.858 |
| 870-3rd-H | 0.00154 | 1.573 | 0.287 | 0.410 | 0.427 | 0.835 |
| 870-5th-H | 0.00154 | 1.622 | 0.305 | 0.447 | 0.464 | 0.834 |
| Correlation | 0.66 | -0.56 | -0.35 | - | - | - |
Figure 5-30 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 2 – Phase B – Turn-on events
Table 5-15 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 2 – Phase B – Turn-off events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 250 | 0.00066 | 2.771 | 0.231 | 0.388 | 0.404 | 0.755 |
| 450 | 0.00066 | 2.638 | 0.228 | 0.323 | 0.335 | 0.787 |
| 650 | 0.00079 | 2.034 | 0.173 | 0.256 | 0.265 | 0.823 |
| 900 | 0.00029 | 2.059 | 0.128 | 0.314 | 0.327 | 0.813 |
| 1100 | 0.00029 | 2.210 | 0.124 | 0.282 | 0.293 | 0.828 |
| 1450 | 0.00018 | 2.609 | 0.094 | 0.324 | 0.337 | 0.891 |
| 1950 | 0.00018 | 2.639 | 0.105 | 0.348 | 0.362 | 0.856 |
| 2400 | 0.00018 | 2.585 | 0.096 | 0.312 | 0.325 | 0.862 |
| 2400-3rd-H | 0.00028 | 2.023 | 0.101 | 0.271 | 0.282 | 0.864 |
| 2400-5th-H | 0.00018 | 2.705 | 0.102 | 0.314 | 0.327 | 0.863 |
| Correlation | -0.77 | -0.07 | -0.92 | - | - | - |
Figure 5-31 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 2 – Phase B – Turn-off events
Table 5-16 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 2 – Phase A – Turn-on events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 180 | 0.00051 | 1.685 | 0.216 | 0.149 | 0.151 | 0.942 |
| 500 | 0.00052 | 1.649 | 0.180 | 0.116 | 0.118 | 0.941 |
| 800 | 0.00043 | 1.743 | 0.176 | 0.114 | 0.116 | 0.942 |
| 1150 | 0.00043 | 1.743 | 0.176 | 0.114 | 0.116 | 0.942 |
| 1450 | 0.00043 | 1.696 | 0.172 | 0.106 | 0.108 | 0.942 |
| 1800 | 0.00043 | 1.696 | 0.172 | 0.106 | 0.108 | 0.942 |
| 2100 | 0.00043 | 1.738 | 0.190 | 0.107 | 0.108 | 0.942 |
| 2400 | 0.00058 | 1.780 | 0.301 | 0.186 | 0.192 | 0.897 |
| 1450-3rd-H | 0.00084 | 1.573 | 0.194 | 0.261 | 0.272 | 0.908 |
| 1450-5th-H | 0.00084 | 1.561 | 0.188 | 0.198 | 0.203 | 0.928 |
| Correlation | -0.77 | -0.61 | -0.77 | - | - | - |
Figure 5-32 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 2 – Phase A – Turn-on events
Table 5-17 Variation of clustering algorithm performance versus different internal evaluation indices; Apartment 2 – Phase A – Turn-off events

| P_b (Watts) | Dunn | DB Index | SC Index | CQI | CQI** | F-Measure |
|---|---|---|---|---|---|---|
| 175 | 0.00051 | 1.308 | 0.112 | 0.146 | 0.151 | 0.794 |
| 300 | 0.00051 | 1.314 | 0.111 | 0.146 | 0.151 | 0.794 |
| 800 | 0.00062 | 1.770 | 0.151 | 0.228 | 0.237 | 0.765 |
| 1100 | 0.00062 | 1.770 | 0.151 | 0.228 | 0.237 | 0.765 |
| 1450 | 0.00088 | 1.953 | 0.189 | 0.372 | 0.388 | 0.788 |
| 1750 | 0.00088 | 1.953 | 0.189 | 0.372 | 0.388 | 0.788 |
| 2100 | 0.00088 | 1.923 | 0.182 | 0.372 | 0.388 | 0.788 |
| 300-3rd-H | 0.00126 | 2.008 | 0.153 | 0.274 | 0.285 | 0.813 |
| 300-5th-H | 0.00097 | 1.947 | 0.159 | 0.220 | 0.229 | 0.757 |
| Correlation | 0.30 | -0.12 | -0.12 | - | - | - |
Figure 5-33 Variation of the F-Measure versus SC index for different values of P_b on the data from Apartment 2 – Phase A – Turn-off events
The results show that the SC index is the most effective metric for internal evaluation of the
clustering algorithm performance. This conclusion is supported by the correlation analyses
conducted for each data set: as shown in Table 5-10 to Table 5-17, correlation coefficients were
calculated between the variation of the internal metrics and the obtained F-measure. The average
correlation coefficients for the Dunn, DB, and SC indices are -0.03, -0.32, and -0.45,
respectively. The Dunn index shows no correlation with the trends of F-measure variation; its
average correlation coefficient is negative (close to zero) while a positive value is expected.
Therefore, it is not a reliable metric for overall cluster validation, although, as noted, it is
very effective in evaluating the quality of individual clusters (in the worst case). The DB index
performs better in detecting the trends of F-measure variation. However, as the correlation
coefficients indicate, the SC index is a more reliable metric.
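These correlation coefficients can be reproduced directly from the tabulated values; the sketch below uses the SC index and F-measure columns of Table 5-10 and recovers a Pearson coefficient close to the reported -0.31 (up to rounding).

```python
import numpy as np

# SC index and F-measure columns of Table 5-10
# (Apartment 1 - Phase A - turn-on events).
sc_index = np.array([0.155, 0.173, 0.205, 0.193, 0.169,
                     0.206, 0.153, 0.153, 0.135, 0.123])
f_measure = np.array([0.949, 0.892, 0.891, 0.891, 0.939,
                      0.954, 0.926, 0.926, 0.929, 0.930])

# Pearson correlation between the internal index and the external F-measure;
# a negative value is expected since a lower SC index indicates better clustering.
r = np.corrcoef(sc_index, f_measure)[0, 1]
print(round(r, 2))
```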
Upon selecting the optimum clustering configuration by using the basic feature vectors and
minimizing the SC index, the effect of the higher harmonic contents of the signatures is evaluated
through the Dunn index. Based on our observations, the higher harmonic contents are usually
effective in separating the signatures in the residual cluster, where the signatures at smaller
scales overlap; therefore, using the higher harmonic contents could potentially improve the
quality of the cluster with the largest dispersion. If the higher harmonic contents help separate
the overlapping signatures in the residual cluster, an increase in the Dunn index can be observed.
Therefore, in our approach, if the higher harmonic contents of the signatures improve the Dunn
index without a considerable increase in the SC index, the corresponding clustering is adopted.
For example, the results in Table 5-13 to Table 5-17 show how the higher harmonic contents of the
signatures improved the Dunn index while improving the F-measure in four cases. In all five cases,
visual inspection of the clusters showed that the higher harmonic contents
resulted in separation of the residual cluster. As these figures and tables illustrate, the application
of the minimum SC index resulted in the highest F-measure values in five of the data sets, the
application of the minimum DB index in four data sets, and the application of the Dunn index in
two data sets. This pattern is compatible with our observations in the correlation analysis.
However, a statistically significant conclusion requires further analyses on more data sets. The
minimization of the SC index is a practical approach to
unsupervised selection of the clustering configuration. Visual inspection of the clustering
outcomes showed that the minimum SC index is effective in separating the signatures of different
classes, although in some cases it does not yield the highest F-measure value. The only data set for
which the minimum SC index is not effective is the turn-off signature on Phase B of Apartment 2.
The minimum SC index in that data set resulted in P_b = 2400 watts, while the best clustering
outcome (separation of the majority of the classes) occurs at P_b = 1100 watts. It is important to
note that the application of the minimum SC index (following its definition) could result in an
increased number of clusters. In fact, this is a trade-off for increased accuracy of the
labeling versus number of interactions. However, this drawback is compensated since the SI
framework uses cluster merging techniques.
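The selection logic above (minimize the SC index over the basic feature vectors, then accept the higher-harmonic feature vectors when they improve the Dunn index without a considerable SC increase) can be sketched as follows; the candidate data structure and the `sc_tolerance` knob are our assumptions, not part of the dissertation's notation:

```python
def select_configuration(candidates, sc_tolerance=0.05):
    """Pick a clustering configuration: minimize the SC index over the
    basic-feature candidates, then accept a higher-harmonic candidate if
    it improves the Dunn index without a considerable SC increase."""
    basic = [c for c in candidates if not c["harmonics"]]
    base = min(basic, key=lambda c: c["sc"])  # minimum-SC baseline
    for c in (c for c in candidates if c["harmonics"]):
        if c["dunn"] > base["dunn"] and c["sc"] <= base["sc"] + sc_tolerance:
            return c  # harmonic contents help separate the residual cluster
    return base
```

With `sc_tolerance=0` the rule degenerates to pure SC minimization; a small positive tolerance encodes "without a considerable increase".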
5.5.2 Cluster Merging
Upon the evaluation of the cluster validation techniques in Section 5.4.1, cluster merging is carried
out for the optimum clustering configurations in both apartments 1 and 2, as obtained in
Section 5.5.1 and presented in Table 5-18.
Table 5-18 Clustering configuration for apartment 1 and 2 data sets

Case No.  Case Description (Location – Phase – Event Type)  P_b (Watts)  Feature Vector Type
1         Apartment 1 – Phase A – On                        1800         x_n = {p1, q1}
2         Apartment 1 – Phase A – Off                       500          x_n = {p1, q1}
3         Apartment 1 – Phase B – On                        1200         x_n = {p1, q1}
4         Apartment 1 – Phase B – Off                       1800         x_n = {p1, q1, p3, q3, p5, q5}
5         Apartment 2 – Phase B – On                        870          x_n = {p1, q1, p3, q3}
6         Apartment 2 – Phase B – Off                       2400         x_n = {p1, q1, p3, q3}
7         Apartment 2 – Phase A – On                        1450         x_n = {p1, q1, p3, q3, p5, q5}
8         Apartment 2 – Phase A – Off                       300          x_n = {p1, q1, p3, q3}
A comprehensive analysis of the CBM merging algorithm was conducted to evaluate the effective
threshold values and find the best set of parameters for merging. This analysis was
conducted on the optimum clusters (as obtained by using the configuration in Table 5-18) of all
the eight data sets from both test bed apartments. As mentioned in Section 5.4.1, the merging is
carried out for multiple power scales. Since we are using the transient signatures of appliances,
the power range has been used for grouping the clusters into different scales. Our observations
showed that the lower scales call for smaller threshold values, while larger threshold values could
be used for effective merging at larger scales. These observations led us to select 300 watts as an
effective border value for separating the power ranges. The outcome of merging has been
presented in Table 5-19 in terms of variation in CQI and F-Measure metrics as the result of
merging. Figure 5-34 and Figure 5-35 illustrate these variations for below and above the 300 watt
border. As these graphs show, a peak in the ΔCQI metric values can be observed at threshold
values between 7 and 10 for power ranges above 300 watts, while for the smaller scales (power
ranges below 300 watts) smaller threshold values (3-5) demonstrate better performance.
Table 5-19 Average variation of the CQI versus F-Measure metrics for merging over all the eight
data sets in two test bed apartments (1 and 2)
             > 300 Watts                      < 300 Watts
Threshold    ΔCQI     ΔFM      Threshold     ΔCQI     ΔFM
2            0.012    0.005    2             0.009    0.000
3            0.021    0.014    3             0.015    0.000
4            0.034    0.014    4             0.008   -0.001
5            0.037    0.011    5             0.011   -0.004
6            0.041    0.011    6             0.010   -0.006
7            0.047    0.011    7             0.005   -0.007
8            0.042    0.010    8             0.002   -0.007
9            0.048    0.016    9            -0.008   -0.010
10           0.054    0.016    10            0.004   -0.010
11           0.044    0.007    11           -0.007   -0.013
12           0.042    0.007    12           -0.003   -0.013
13           0.031   -0.008    13           -0.005   -0.012
14           0.030    0.000    14           -0.004   -0.021
15           0.024   -0.017    15           -0.003   -0.022

Figure 5-34 Variation of the average ΔCQI and ΔF-Measure over the eight data sets for both test
bed apartments (apartment 1 and apartment 2) for power ranges above 300 watts
Figure 5-35 Variation of the average ΔCQI and ΔF-Measure over the eight data sets for both test
bed apartments (apartment 1 and apartment 2) for power ranges below 300 watts
Table 5-20 Detailed variation of ΔCQI and ΔF-Measure for selected threshold values

Apt 1 Phase A Turn-On
PR:   Base   3845   2221   1432   1042   395    285    152    60     42
CQI:  0.422  0.437  0.437  0.507  0.529  0.529  0.529  0.529  0.529  0.529
FM:   0.926  0.926  0.926  0.926  0.926  0.926  0.926  0.926  0.926  0.926

Apt 1 Phase A Turn-Off
PR:   Base   3028   1460   527    275    140    105    65     49     39
CQI:  0.264  0.264  0.264  0.264  0.278  0.296  0.362  0.362  0.362  0.362
FM:   0.943  0.943  0.943  0.943  0.943  0.943  0.943  0.943  0.943  0.943

Apt 1 Phase B Turn-On
PR:   Base   3050   1059   631    204    162    129    72
CQI:  0.285  0.304  0.304  0.294  0.294  0.294  0.294  0.294
FM:   0.917  0.917  0.917  0.922  0.922  0.922  0.922  0.922

Apt 1 Phase B Turn-Off
PR:   Base   4679   1061   556    368    219    158    82
CQI:  0.418  0.418  0.418  0.455  0.464  0.464  0.464  0.464
FM:   0.889  0.889  0.889  0.878  0.864  0.864  0.864  0.864

Apt 2 Phase B Turn-On
PR:   Base   2286   1031   359    158    96
CQI:  0.410  0.464  0.407  0.430  0.430  0.430
FM:   0.835  0.869  0.916  0.916  0.916  0.916

Apt 2 Phase B Turn-Off
PR:   Base   4893   2194   1416   1029   665    401    294    174    138    76
CQI:  0.271  0.271  0.280  0.280  0.294  0.311  0.317  0.318  0.318  0.318  0.318
FM:   0.864  0.864  0.864  0.864  0.821  0.864  0.864  0.864  0.864  0.864  0.864

Apt 2 Phase A Turn-On
PR:   Base   4619   2816   1183   657    292    205
CQI:  0.198  0.192  0.192  0.192  0.217  0.217  0.217
FM:   0.928  0.950  0.950  0.950  0.950  0.950  0.950

Apt 2 Phase A Turn-Off
PR:   Base   4547   2173   1169   637    102    56
CQI:  0.274  0.297  0.297  0.297  0.297  0.297  0.297
FM:   0.813  0.859  0.859  0.859  0.859  0.859  0.859
Based on these analyses, we have selected two threshold values for the different scales: threshold
values of 7 and 3 were selected for cluster groups with power ranges above 300 watts and below
300 watts, respectively. The effect of the cluster merging for our eight data sets is presented
in detail in Table 5-20. The effect of the cluster merging process on reducing the number of
clusters is reflected in Table 5-21. As this table shows, the merging approach is very
effective in reducing the number of clusters while preserving the accuracy of the clusters.
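The scale-dependent threshold choice can be cross-checked against the averaged merging results. The sketch below flags, for each power scale, the thresholds whose ΔCQI lies within a margin of the peak while ΔF-Measure is not degraded; the margin-based rule and its value are our simplification, while the ΔCQI/ΔFM series are the Table 5-19 averages:

```python
import numpy as np

thresholds = np.arange(2, 16)
# Averaged merging results from Table 5-19 for the two power scales.
dcqi_hi = np.array([0.012, 0.021, 0.034, 0.037, 0.041, 0.047, 0.042,
                    0.048, 0.054, 0.044, 0.042, 0.031, 0.030, 0.024])
dfm_hi = np.array([0.005, 0.014, 0.014, 0.011, 0.011, 0.011, 0.010,
                   0.016, 0.016, 0.007, 0.007, -0.008, 0.000, -0.017])
dcqi_lo = np.array([0.009, 0.015, 0.008, 0.011, 0.010, 0.005, 0.002,
                    -0.008, 0.004, -0.007, -0.003, -0.005, -0.004, -0.003])
dfm_lo = np.array([0.000, 0.000, -0.001, -0.004, -0.006, -0.007, -0.007,
                   -0.010, -0.010, -0.013, -0.013, -0.012, -0.021, -0.022])

def candidate_thresholds(dcqi, dfm, margin=0.007):
    """Thresholds whose ΔCQI is within `margin` of the peak and whose
    ΔF-Measure is not degraded (ΔFM >= 0)."""
    ok = (dcqi >= dcqi.max() - margin) & (dfm >= 0)
    return set(thresholds[ok].tolist())
```

The selected values (7 and 3) fall inside the candidate sets this rule produces; the final pick in the text also weighed the peak region qualitatively.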
Table 5-21 Effect of the cluster merging techniques on reducing the number of clusters

Case No.  Case Description (Location – Phase – Event Type)  Original Number of Clusters  Final Number of Clusters
1         Apartment 1 – Phase A – On                        48                           23
2         Apartment 1 – Phase A – Off                       54                           34
3         Apartment 1 – Phase B – On                        34                           25
4         Apartment 1 – Phase B – Off                       28                           25
5         Apartment 2 – Phase B – On                        51                           36
6         Apartment 2 – Phase B – Off                       57                           22
7         Apartment 2 – Phase A – On                        30                           15
8         Apartment 2 – Phase A – Off                       42                           17
5.5.3 Rule-based Cluster Identification
Our observations of the collected data sets showed that events associated with certain appliances'
state transitions demonstrate a frequent pattern; very regular frequent events are commonly
associated with state transitions that cannot be detected by users in the physical domain. These
frequent events are triggered by the appliances either on predetermined schedules or based
on variations of environmental factors such as ambient temperature. Accordingly, various degrees
of frequency can be observed in the data sets, as further described in the following paragraphs. We
have observed that taking these frequent events into account (through a rule-based cluster
identification technique) provides the opportunity to reduce the interaction requirements even
further.
The cluster validation techniques, proposed and evaluated in the previous sections, provide the
basis for evaluating event frequency. The clustered feature vectors are all tagged with the time
stamps of the events; therefore, to evaluate the feature vectors in each cluster, they are sorted by
their time stamp tags, and the δt values between consecutive feature vectors are used for
histogram analysis. Figure 5-36 shows sample histograms of clusters with frequent events (a)
and without frequent events (b). In this illustration, synthetic data representing the time
differences between events were used; the data were randomly generated with different dispersion
values. As this figure shows, the histogram of the clustered feature vectors exhibits an
identifiable pattern for frequent event detection. In clusters with frequent events, the maximum
frequency, associated with the δt set of frequent events, is considerably higher than the sum of the
rest of the frequencies in the histogram. This pattern is not observed in clusters with
non-frequent events.
a) Histogram of frequent events b) Histogram of non-frequent events
Figure 5-36 Sample histograms of the synthetic data
Therefore, we introduce a frequent event index in our approach. The frequent event index is
assigned to each cluster and is calculated as follows:
𝜑_I,cl = M / Σ_{i=1}^{k⁺} m_i⁺ ,  m_i⁺ ≠ M        (5-31)

M = max(m_i)                                       (5-32)

n = Σ_{i=1}^{k} m_i                                (5-33)

where m_i is the number of observations in each histogram bin, m_i⁺ is the number of observations
for a bin with a non-zero count, k is the number of histogram bins, k⁺ is the number of histogram
bins with non-zero observations, n is the total number of observations, and M is the number of
observations in the bin with the maximum count. In this way, 𝜑_I,cl is the frequent events' index
for each cluster. In our approach, k is set equal to the number of δt values for each cluster.
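Equations 5-31 to 5-33 translate directly into code; a sketch with the bin count k set to the number of δt values, as in the text:

```python
import numpy as np

def frequent_event_index(dt_values):
    """Frequent-event index of Eq. 5-31: the count of the most populated
    delta-t histogram bin divided by the summed counts of the remaining
    non-zero bins."""
    dt_values = np.asarray(dt_values, float)
    k = len(dt_values)                       # number of histogram bins
    m, _ = np.histogram(dt_values, bins=k)   # m_i: observations per bin
    M = m.max()                              # Eq. 5-32
    rest = m[m > 0].sum() - M                # sum of m_i+ with m_i+ != M
    return float(M) / rest if rest > 0 else float("inf")
```

A cluster whose δt values pile into one dominant bin (a frequent, automated event) yields a large index, while evenly spread δt values yield an index well below 1.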
Figure 5-37 and Figure 5-38 illustrate the histograms of δt for clusters of turn-on and turn-off
events, respectively, on phase A of apartment 1. The label of each histogram shows the cluster
label and the 𝜑_I,cl value (cluster label - 𝜑_I,cl). Clusters 11102, 18002, and 12900 are examples
of clusters with frequent events; they correspond to refrigerator defrost, AC compressor, and TV
turn-on events (Figure 5-37). As can be seen for both turn-on and turn-off event clusters, 𝜑_I,cl
values greater than 10 are good indicators of clusters with frequent events. Therefore, the
following set of rules is defined for cluster identification:
Figure 5-37 Histograms and frequent events’ indices for clusters of the appliances’ signatures
from turn-on events on phase A of apartment 1
𝜑_I,cl > 10:   limit user interactions to 10 instances
otherwise:     limit user interactions to 20 instances        (5-34)
In these rules, indices larger than 10 are commonly associated with automated appliance
operations, such as window AC compressor cycles. Appliances that create frequent events in a
short period of time, such as an electric range or a washing machine, show very large 𝜑_I,cl
values, higher by orders of magnitude. Therefore, the number of interactions is limited to 10
instances. If the user can label the cluster within 10 instances, we assume it can be detected;
otherwise, no further interactions are made after 10 instances. For indices less than 10, the users
are asked for more instances (up to 20), since these clusters may or may not be related to
appliances with frequent events. For example, in the apartment 1 data set, the turn-off events of
the refrigerator defrost (which cannot be detected by a
user) have a frequency index of less than 10. The limitations on the number of calls enable the
NILM system to reduce the interaction requirements even further.
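Rule 5-34 itself reduces to a small helper; a sketch:

```python
def interaction_limit(phi_cluster):
    """Cap on labeling requests for a cluster, per rule 5-34: clusters
    with a frequent-event index above 10 get at most 10 user calls,
    all others at most 20."""
    return 10 if phi_cluster > 10 else 20
```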
Figure 5-38 Histograms and frequent events’ indices for clusters of the appliances’ signatures
from turn-off events on phase A of apartment 1
Another rule that can improve the accuracy of the labeling process is elimination of the residual
cluster. As noted, the recursive clustering process continues until the residual cluster cannot be
split further. Considering the multi-scale nature and the similarity of signatures at very small
scales (these signatures usually correspond to state transitions in lights and electronic appliances
such as personal computers, monitors, and televisions), in the majority of cases the residual
cluster contains overlapping signatures from different appliances. Including this residual cluster
in the labeling process could reduce the labeling accuracy. Therefore, a rule-based approach is
used for eliminating residual clusters with a
possibility of containing overlapping signatures. As noted, the final residual cluster is positioned
at the smaller scales of the feature space. Therefore, the residual cluster elimination rule looks for
the cluster with the minimum power range among all the clusters. If the cluster with the minimum
power range contains more than 20 percent of the feature vectors, the cluster is eliminated from
the labeling process. Evaluation of this rule over the eight data sets for both apartments 1 and 2
showed its effectiveness in eliminating the residual cluster with overlapping signatures.
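The elimination rule can be sketched as follows; the cluster representation (label mapped to power range and member count) is a hypothetical structure, not the dissertation's data model:

```python
def drop_residual(clusters, max_share=0.20):
    """Residual-cluster elimination rule: remove the minimum-power-range
    cluster if it holds more than `max_share` of all feature vectors
    (it then likely contains overlapping small-scale signatures).

    clusters: dict mapping label -> (power_range_watts, n_members)."""
    total = sum(n for _, n in clusters.values())
    smallest = min(clusters, key=lambda c: clusters[c][0])  # min power range
    if clusters[smallest][1] > max_share * total:
        return {c: v for c, v in clusters.items() if c != smallest}
    return clusters
```

A small low-power cluster survives the rule; only a low-power cluster that swallows a disproportionate share of the feature vectors is treated as the overlapping residual.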
5.5.4 Ensemble of Anomaly Detectors
The performance of the anomaly detection algorithms plays a major role in the performance of the
SI framework. Thus, in evaluating the anomaly detection algorithms, two factors are taken into
account: accuracy and inlier detection rate. An anomaly detection algorithm could achieve high
accuracy in inlier detection by increasing the rejection rate, and vice versa. Accordingly, in this
section, we present the findings of our investigation of the trade-off between these two factors. As
noted, we adopted SVDD and MD (Mahalanobis distance) based anomaly detection algorithms.
Different algorithms were adopted to account for flexibility in outlier detection. As noted, the
heuristic clustering algorithm eliminates clusters with one or two feature vectors in each
recursion. These eliminated feature vectors were used in our investigations as challenging cases
for the performance evaluation of anomaly detection. In this process, the eliminated feature
vectors were labeled through classification (using a KNN classifier) and anomaly detection, using
the reference clusters obtained through the cluster validation process. The performance was
measured through label accuracy (LAcc) and labeled samples ratio (LRatio). LAcc measures the
compatibility of the assigned labels with the ground truth labels.
LRatio measures the number of feature vectors detected as inliers to the total number of
eliminated feature vectors for each set of data.
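The two performance metrics can be sketched directly; the boolean acceptance mask stands in for the anomaly detectors' inlier decisions:

```python
import numpy as np

def labeling_metrics(predicted, ground_truth, accepted):
    """LRatio: fraction of the eliminated feature vectors accepted as
    inliers (and hence labeled). LAcc: label accuracy over those
    accepted samples."""
    predicted = np.asarray(predicted)
    ground_truth = np.asarray(ground_truth)
    accepted = np.asarray(accepted, dtype=bool)
    lratio = float(accepted.mean())
    if not accepted.any():
        return lratio, float("nan")
    lacc = float((predicted[accepted] == ground_truth[accepted]).mean())
    return lratio, lacc
```

Note the trade-off encoded here: rejecting more samples (lower LRatio) tends to keep only easy cases and so inflates LAcc.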
Table 5-22 Variation of the anomaly detection algorithms’ performance metrics on the data sets
from apartment 1 and apartment 2
Table 5-22 presents the variation of the LRatio and LAcc for different cases of evaluations for
data sets from apartment 1 and apartment 2. As this table shows, a variety of algorithmic
procedures have been used in this evaluation. Our observations during the data analysis showed
that an ensemble anomaly detection algorithm could be the optimum solution for the SI
framework. Therefore, different combinations of the algorithms and tuning procedures were taken
into account and evaluated on the data sets.
Apartment 1
                                         A-OFF          A-ON           B-ON           B-OFF
Case  Anomaly Detection Approach         LRatio  LAcc   LRatio  LAcc   LRatio  LAcc   LRatio  LAcc
1     Tight MD                           0.171   0.982  0.547   0.926  0.059   0.750  0.643   0.919
2     Tight SVDD                         0.183   1.000  0.486   0.931  0.103   1.000  0.504   0.937
3     Loose MD                           0.424   0.957  0.669   0.939  0.206   0.857  0.755   0.901
4     Loose SVDD                         0.223   1.000  0.500   0.932  0.103   1.000  0.622   0.940
5     mixed MD-SVDD|Tight                0.201   1.000  0.547   0.926  0.059   0.750  0.499   0.919
6     mixed MD-SVDD|Loose                0.366   1.000  0.669   0.939  0.206   0.857  0.663   0.909
7     mixed MD-SVDD|Loose|CBM|6-3        0.459   0.987  0.719   0.934  0.306   0.711  0.671   0.908
8     mixed MD-SVDD|Loose|CBM|5-3        0.456   0.987  0.708   0.933  0.288   0.704  0.668   0.909
9     mixed MD-SVDD|Loose|CBM|4-2        0.445   0.997  0.696   0.942  0.274   0.711  0.665   0.909
10    mixed MD-SVDD|Loose|CBM|3-1.5      0.415   1.000  0.689   0.941  0.235   0.812  0.663   0.909
11    mixed MD-SVDD|Loose|CBM|2-1        0.387   1.000  0.682   0.941  0.206   0.857  0.663   0.909

Apartment 2
                                         A-OFF          A-ON           B-ON           B-OFF
Case  Anomaly Detection Approach         LRatio  LAcc   LRatio  LAcc   LRatio  LAcc   LRatio  LAcc
1     Tight MD                           0.329   0.625  0.375   0.979  0.429   0.947  0.179   0.900
2     Tight SVDD                         0.379   0.797  0.359   0.957  0.549   0.940  0.102   0.750
3     Loose MD                           0.407   0.587  0.456   0.975  0.664   0.966  0.194   0.895
4     Loose SVDD                         0.514   0.801  0.394   0.961  0.571   0.941  0.148   0.793
5     mixed MD-SVDD|Tight                0.390   0.807  0.375   0.979  0.474   0.946  0.087   0.941
6     mixed MD-SVDD|Loose                0.554   0.785  0.456   0.975  0.708   0.964  0.141   0.909
7     mixed MD-SVDD|Loose|CBM|6-3        0.569   0.787  0.471   0.967  0.737   0.951  0.153   0.917
8     mixed MD-SVDD|Loose|CBM|5-3        0.569   0.787  0.471   0.967  0.736   0.953  0.153   0.917
9     mixed MD-SVDD|Loose|CBM|4-2        0.567   0.787  0.468   0.974  0.732   0.956  0.153   0.917
10    mixed MD-SVDD|Loose|CBM|3-1.5      0.559   0.784  0.466   0.975  0.728   0.959  0.150   0.915
11    mixed MD-SVDD|Loose|CBM|2-1        0.556   0.784  0.460   0.975  0.718   0.961  0.143   0.911
Figure 5-39 Evaluation of the anomaly detection techniques on data sets from apartment 1
As described, the generated clusters of signatures provide the opportunity for unsupervised tuning
of the algorithms' hyperparameters. In SVDD, the C parameter, and in MD, the detection threshold
τ_od, are tuned for each individual cluster through performance evaluation, as presented
in Section 5.4.2. Two modes of tuning were used in our analyses: loose and tight. Tight tuning
corresponds to tuning the parameters by finding the minimum parameter values that satisfy the
outlier/inlier detection constraints. On the other hand, loose tuning corresponds to finding the
maximum parameter values that satisfy the constraints. The first four rows in Table 5-22 present
the performance metrics for the SVDD and MD anomaly detection algorithms using loose and tight
tuning for each apartment.
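Tight versus loose tuning can be sketched generically; the constraint check is a stand-in for the outlier/inlier detection constraints, and the candidate grid is an assumption:

```python
def tune_parameter(candidates, satisfies_constraints, mode="tight"):
    """Unsupervised per-cluster tuning: tight = smallest candidate value
    satisfying the outlier/inlier constraints, loose = largest."""
    feasible = sorted(v for v in candidates if satisfies_constraints(v))
    if not feasible:
        raise ValueError("no candidate satisfies the constraints")
    return feasible[0] if mode == "tight" else feasible[-1]
```

Tight tuning draws the smallest acceptable boundary (higher rejection, higher LAcc), while loose tuning draws the largest (higher LRatio), matching the trade-off discussed above.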
Figure 5-40 Evaluation of the anomaly detection techniques on data sets from apartment 2
As Table 5-22 shows, the MD algorithm with loose tuning gives the highest LRatio values in the
majority of cases. However, in general, LAcc varies between approaches, with relatively higher
values for the SVDD approach. Our observations showed that signatures with smaller power range
values are more susceptible to error when using the MD algorithm. Therefore, a combination of
the two algorithms was used: SVDD for signatures with power ranges lower than 100 watts and
MD for signatures with power ranges higher than 100 watts. The results presented in the 5th and
6th rows of Table 5-22 show that the combined algorithms either keep
the same performance or improve it, maintaining the trade-off between inlier detection rate and
labeling accuracy. Further assessment of the data showed that, due to slight variations of the
event detection locations on the power time series, some of the signatures are classified as
outliers while they belong to the corresponding clusters. Therefore, we proposed to improve the
inlier detection rates by enhancing the anomaly detection with the CBM technique that we used
for cluster merging. In this approach, three random samples from each cluster are selected for
covariance matrix calculation. The test signature is then substituted for one of the signatures in
the sample, and the Euclidean distances between their vectorized covariance matrices are
compared (using the approach presented through Equations 5-7 and 5-8). Ten random
3-signature samples are selected from each cluster, and the average value is used for anomaly
detection. In the SVDD and MD methods, inliers are detected if they are inside the data
description boundary. Therefore, the CBM approach is used for the outliers that are outside the
data description radius (R) but inside a data description boundary of 2R. Rows 7 to 11 of
Table 5-22 present the results of the ensemble anomaly detection approach for different CBM
threshold values. Since the data collection during the user interaction experiments (more details
in the following section) was carried out using the power time series at 20 Hz, the anomaly
detection performance evaluation was carried out on signature samples at 20 Hz. Therefore, the
threshold values that were obtained in Section 5.5.2 do not hold for this case; in the analyses,
various pairs of thresholds for signatures with power ranges higher or lower than 300 watts were
taken into account. In Table 5-22, these threshold pairs are reflected in the name of the approach.
To account for the effect of random sample selection,
the results in Table 5-22 for the ensemble approach are reported as the average of five runs.
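The ensemble decision described above can be sketched as follows; distances and the CBM score are assumed precomputed, and the function and parameter names are ours:

```python
def pick_detector(power_range_watts):
    """Per-scale detector choice described above: SVDD for small-scale
    signatures, MD otherwise."""
    return "SVDD" if power_range_watts < 100 else "MD"

def ensemble_inlier(distance, radius, cbm_score, tau_cs):
    """Two-stage inlier test: accept inside the data-description radius R;
    for borderline points (between R and 2R) fall back on the CBM check."""
    if distance <= radius:
        return True
    if distance <= 2 * radius:
        return cbm_score <= tau_cs  # CBM-based second chance
    return False
```

The 2R band gives signatures shifted by slight event-location variations a second chance, which is exactly how the ensemble raises LRatio without sacrificing LAcc.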
The variations of the LRatio and LAcc metrics are illustrated in Figure 5-39 and
Figure 5-40 for apartment 1 and apartment 2, respectively. LAcc is depicted as bar graphs
and LRatio as line graphs. As these graphs show, the MD algorithm in loose tuning mode shows
higher inlier detection. However, combining the SVDD and MD algorithms improves the trade-off
balance between accuracy and inlier detection. The addition of the CBM-based anomaly detection
approach increases the inlier detection rate while preserving the accuracy of the labeling.
According to the presented graphs and the numerical results in Table 5-22, τ_CS values of 4 and 2
are used for signatures with power ranges higher and lower than 300 watts, respectively.
5.5.5 User-Interaction Process Validation
In the SI framework evaluation, the following steps have been taken. The clustering
of each data set has been carried out through the cluster validation procedure with the feature vector
configuration that resulted in best performance in terms of accuracy and cluster quality. The
parameters of the anomaly detection algorithms have been tuned in an unsupervised way (as
described in Section 5.3.2) using the clustered data obtained from each experimental setting. In
the clustering and cluster validation of pre user-interaction stage, the power time series at 60Hz
were used in event detection and feature extraction to ensure that all possible signatures could be
captured using the event detection algorithm described in Chapter 3. However, during the user
interaction stage, the power time series at 20Hz was used in event detection. Accordingly, once
the clustering process is complete, the clustered signatures are downsampled to make real-time
extracted signatures and clusters compatible. The parameters of the anomaly detection algorithms
have been tuned using the signatures, extracted from power time series at 20Hz as described in
Section 5.5.4. In the facilitated training process, the basic feature vectors (i.e., real and reactive
power components as presented in Figure 3-1.a) have been used in the analyses. As noted earlier,
the information contained in the power time series for 2/3 of a second before an event and one
second after the event was used in the basic feature vectors. The number of user interactions and
the accuracy of the labeled signatures were used to evaluate the performance of the SI framework.
A two-week period of user interaction was taken into account for evaluating the interaction
requirements. As noted, the users in our experimental test beds interacted with the NILM system
during this period by receiving calls from the NILM system. Therefore, the reduction in
interaction requirements was measured in terms of the number of interactions (i.e., calls made to
the user for training). The accuracy of labeling was measured by comparing the ground truth label
and the label assigned during the interaction process. The results of the evaluations are presented
in Table 5-23.
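The interaction-reduction metric follows directly from the interaction counts reported below; a sketch using the Case 1 numbers (61 calls over 1101 detected signatures):

```python
def interaction_reduction(calls, detected_signatures):
    """Fraction of detected signatures that did not require a user call."""
    return 1.0 - calls / detected_signatures

# Case 1, Apt1 - Phase A: 61 calls over 1101 signatures
r = interaction_reduction(61, 1101)  # about 0.945, reported as 94%
```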
Table 5-23 SI framework validation results in apartment 1 and apartment 2 using user interaction
requirement and labeling accuracy metrics
Case  Case            Interaction  Label accuracy  Interactions  Visited signatures  Classes user cannot
No.   description     reduction    (On / Off)      (On + Off)    (On / Off)          detect (On + Off)
1     Apt1 – Phase A  94%          0.992 / 0.996   61/1101       0.82 / 0.57         1+2
2     Apt1 – Phase B  97%          0.95 / 0.969    53/1750       0.83 / 0.88         0
3     Apt2 – Phase A  97.5%        0.996 / 0.958   71/2833       0.95 / 0.9          1+1
4     Apt2 – Phase B  96%          0.91 / 0.947    121/3069      0.96 / 0.92         2+2
As shown, a considerable reduction in interaction was achieved in both experimental set-ups. The
number of calls was reduced by more than 94% in all cases. The number of calls made to the users
is also presented for each case. In all cases, very high labeling accuracy was achieved. In the
majority of the cases (8 out of 9), the accuracy is above 95%. The lowest accuracy was obtained in
case No. 4. In this data set, in addition to the incorrect label that was provided by the user for one
cluster, the cluster merging process resulted in merging feature vectors from different classes of
signatures for turn-on events. Signatures from the coffee maker, toaster, and electric range were
merged during the merging process on the turn-on events data set. This is mainly because
generalized merging thresholds were used. Unsupervised merging validation could be added to the
approach to avoid this type of error, where the differences between signatures of different classes
are subtle. An example of such an approach is to monitor the change in the dispersion of
individual clusters and avoid merging when it results in a dramatic change in cluster dispersion.
In the training process some of the clusters were not visited or not labeled. Different factors
contribute to this phenomenon: 1) since the reference clusters were generated using event detector
and feature extraction algorithms on power time series at 60Hz and the real-time NILM system
uses a lower resolution (20Hz), some of the clusters do not have representative signatures in real-
time training; 2) some of the clusters are not visited due to the rejection by the anomaly detection
algorithms; and 3) some of the clusters represent appliance states that cannot be detected by users.
The last column of Table 5-23 shows the number of clusters that represent the latter. As this table
shows, the number of user interactions is correlated to the number of such clusters. In order to
measure the effect of the other two factors another parameter was used to measure the percentage
of feature vectors that were visited in the labeling process. The seventh and eighth columns of
Table 5-23 show these measures for both turn-on and turn-off events. This measure partially
shows the effect of the anomaly detector algorithms. Our observations showed that the classes that
were not trained during this process include signatures with low power range. Almost all these
classes contain signatures with power ranges less than 30-40 Watts. These signatures represent
light fixtures and electronic appliances such as personal computers, monitors, and television. As
noted in Section 5.5.3, part of these classes is eliminated to avoid erroneous training data. The
above mentioned appliances commonly generate very similar transient signatures, which are
either clustered together or are challenging for anomaly detection algorithms, specifically for
lower resolution of the power time series. Alternative solutions, such as specific algorithms for
lower scales or increasing the power time series resolution, need to be introduced to account for
very small scale signatures. Some of the future directions to address this problem are discussed in
Chapter 7. As Table 5-23 shows, the lowest percentage of visited signatures is observed for the
Case No. 1 turn-off events. In this data set, multiple events from the television and small light
fixtures were ignored due to rejection by the anomaly detection algorithms. However, overall, the
majority of the signatures were visited and labeled with high accuracy.
5.6 Summary
A framework for efficient interaction between a NILM system and its users was presented to
facilitate training of supervised learning algorithms. The framework combines the algorithms,
presented in previous chapters, and complements them by introducing cluster validation and
anomaly detection algorithms. The cluster validation approach introduces effective cluster
merging and internal cluster validation for the high-dimensional multi-scale feature space of the
electricity measurements. Additionally, an ensemble anomaly detection approach, which measures
the compatibility of the classification outcome with the reference training data, complements the
framework to account for increased accuracy of the labeling process. A rule-based cluster
identification module is also added to the framework to help identify clusters with frequent events
and limit the number of interactions for training.
The aforementioned components were evaluated on the data acquired in two experimental set ups
(as described in Chapter 3). For the cluster validation techniques a combination of the F-measure
and cluster quality index (CQI), introduced in Chapter 4, were used. The performance of the
anomaly detection algorithms was also measured through the accuracy of the inlier/outlier
detection, as well as the inlier detection rate. The validation of the framework was evaluated for
both apartments using two metrics, namely the number of user interactions and the accuracy of
the labeling. The validation results showed that the framework is considerably effective, reducing
the number of interactions by 96% on average while keeping a very high level of labeling
accuracy, with an average of 0.965.
Chapter Six: Time Constraint Relaxation in Training
As described in Section 2.3.2, one of the challenges in direct interaction for event-based NILM
training is the time dependency of the signatures and their corresponding labels. In other words,
once an event is detected, the label needs to be provided promptly to avoid mislabeling of the
signatures. If the user does not react promptly, there is always the possibility that the label is
assigned to a consecutive appliance state transition or a noise signature. The need for timely
interaction with the NILM system could be a major source of inconvenience for the user, and
delayed reaction to the NILM system's calls could result in an inaccurate and
incomplete training process. Figure 6-1 presents the time difference (Δt) between consecutive
events in apartment 1 for both phase A and phase B over two weeks of non-intrusive monitoring.
Margins of 20 and 60 seconds of Δt are also illustrated in this figure using the red and green
lines, respectively. As shown, the time difference between consecutive events is less than 20
seconds for a considerable number of events, which imposes a time difference constraint on user
interaction. The large difference in event counts between phase A and phase B is due to
the fact that the washing machine is powered on the phase B circuit and creates a large number of
consecutive events; moreover, the noise level in this apartment's phase B power time series is
higher than in phase A. The events in Figure 6-1 were obtained by applying the event detection
algorithm to the real power time series at 20 Hz resolution.
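The reaction-time constraint illustrated in Figure 6-1 can be quantified with a few lines of code. The following sketch (with made-up event time stamps; the function name is illustrative) computes the fraction of consecutive-event gaps that fall below a given margin:

```python
# Sketch: fraction of consecutive events separated by less than a
# reaction-time margin, as in the Figure 6-1 analysis (data is hypothetical).
def fraction_within_margin(event_times, margin=20.0):
    """event_times: sorted event time stamps in seconds."""
    deltas = [b - a for a, b in zip(event_times, event_times[1:])]
    if not deltas:
        return 0.0
    return sum(d < margin for d in deltas) / len(deltas)

# Example with made-up event times (seconds):
times = [0.0, 5.0, 12.0, 40.0, 45.0, 200.0]
print(fraction_within_margin(times, 20.0))  # 3 of 5 gaps under 20 s -> 0.6
```

A large value of this fraction indicates that prompt labeling would frequently be required, motivating the relaxation proposed in this chapter.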
User activities usually include a number of consecutive actions, which involve using appliances in
their different modes of operation. Consequently, interacting with the NILM system while the
user is involved in daily activities, even if it is through a mobile device such as a tablet or
smartphone, could result in inconvenience for users.
Figure 6-1 Time difference between events for data sets in apartment 1 (top panel: phase A;
bottom panel: phase B); the events were detected on the real power time series at 20 Hz
6.1 Towards Enabling Games with a Purpose (GWAP)
Another important factor in a human-driven training process for machine learning algorithms is to
turn the process into a pleasant experience rather than a burden for the user. Games with a purpose
(GWAP) were introduced to harness human brainpower for tasks that
computers are not able to perform. The main idea is to use the time and effort that users put into
games for training artificial intelligence algorithms [101]. Many of these games are used for
collecting labels for data that are used for training algorithms. For example, in the ESP game
[102], an online game, players tag images with words that describe the content of the images, to be
used for training artificial intelligence algorithms. A number of rules are used in designing these
games to make them more engaging for the user, such as imposing time limits, increasing the
quality of the data (by having two players play the same game), and making the game more fun
[101]. One of the main objectives of these games is to produce a large set of clean training data.
These games are commonly designed as online games that match random players, who provide
labels for a specific content (such as images or music). Players win when the inputs or outputs of
the two players match.
In the case of NILM algorithm training, the application of conventional GWAP is not feasible,
because in-situ NILM training is specific to its context (the setting in which the NILM system is
installed) and, as mentioned above, labeling of the appliances' signatures depends strictly on
time. However, games could still serve as a means of communication between users and the NILM
system to increase user interest in the process. In fact, energy awareness applications could be
coupled with the training process through a game interface. The aforementioned rules for GWAP
design (to increase the quality of the user-provided data and encourage users to participate) could
also be used to increase the likelihood of user participation. However, the use of GWAP for
NILM training introduces new challenges due to the need to match user input with time-dependent
appliance signatures. Consequently, the application of GWAP concepts also calls for
relaxing the time constraint in the training process. Following the objectives of this dissertation in
exploring solutions that could facilitate the training, we proposed and evaluated a passive training
approach; since in this approach users do not need to react to the NILM system's calls directly,
the approach is called passive training.
6.2 Proposed Passive Training Approach
The proposed passive training relies on receiving information about the appliances' operational
modes from a user (when the user is done with an activity) and using this information to
label feature vectors after the fact, without asking the user to label an event directly at the time
the event happens. In an active training scenario (as described in previous chapters of this
dissertation), the user is notified after every unknown event is detected. For example, consider an
activity like having breakfast; the activity could include events from multiple appliances/loads,
including the kitchen light, coffee maker, toaster, possibly the refrigerator compartment light,
possibly the electric range, or other appliances. In an active training approach, the user needs to
interact with a computer interface while he/she is having breakfast; if the user does not provide the
labels in a timely manner, new signatures replace the old ones in the signature queue. In the passive
training scenario, our objective is to enable postponing the label provision (within a time interval
of about 30 minutes), without the need to react to detected events, which requires immediate
attention from the user and therefore causes multiple interruptions of user activities.
Therefore, once the activity is completed and the user is free to interact, he/she provides the labels
for the observed events. For the breakfast activity, an example of the user report could be:
in the past 30 minutes, the coffee maker completed a cycle, the kitchen light was turned on, the
toaster completed a cycle, and so on. The NILM system is thereby informed that, in the specified
period of time, there is at least one event associated with the operation of each reported appliance.
The aggregate power time series contains the events of all state transitions during the period
for which user-provided data is available. Since the time window for observing the aggregate
power time series is limited, a limited number of events are associated with the user-provided
information. However, the sequence of events is unknown, and assigning the user-provided
information to events requires further inference.
6.2.1 Signature-Label Matching Algorithm
Given that the data is available and the user has provided proper information (reported the operation
of the same appliance at different times), a pattern-matching algorithm is required to determine
the feature vectors of the appliance signatures that correspond to the user-provided labels. In order to
associate the user-provided labels (L) with feature vectors (FV), a signature-label matching
algorithm (called SLM hereafter) is proposed in this chapter. It is assumed that the user-provided
data has been reported M times. Each of the M reported labels is associated with a time stamp,
which is offset from the corresponding appliance state transition event. The time stamps
({t_1, t_2, …, t_M}) are used to extract all the x ∈ FV in proximity of the reported labels within
a time interval limit (T_l) (a fraction of an hour) to make sure that the desired appliance state
transitions are covered. By extracting the signatures, M sets of FV are obtained
({s_1, s_2, …, s_M}). The SLM algorithm is presented in Figure 6-2.
Algorithm SLM(M, S)
    For each t_i:
        Extract the FV set s_i given T_l
    For each s_m:
        For each x_i in s_m:
            D_imn ← distance between x_i and all x in s_n, n ≠ m
            D_imn-sorted ← sort D_imn
            MA_imn ← match x_i with the nearest x in each s_n, n ≠ m
    For each s_m:
        SIM_mn ← similarity metric for each row of MA_mn
        SIM_mn-sorted ← sort SIM_mn
    If the similarity metrics for the rows with the highest similarity measure
    in SIM_mn-sorted (n, m: 1, …, M) are equal:
        Assign L to all x that contribute to the highest similarity
Figure 6-2 Signature-Label Matching (SLM) algorithm for assigning the labels and feature
vectors in a given electricity measurement data set
In the SLM algorithm, each x in each s_m ∈ {s_1, s_2, …, s_M} is compared to the vectors x in
s_n ≠ s_m (n, m: 1, …, M) using a distance metric:

‖x_j − x_k‖_p = ( Σ_{i=1}^{D} |x_{ji} − x_{ki}|^p )^{1/p}    (6-1)

where D is the number of features in each x and p can be any positive integer. The Euclidean
distance is obtained by setting p = 2 and has been used in our evaluations. Each x in each FV set is
matched with its nearest x in the other FV sets. Upon finding the matches between feature vectors,
the similarity measure between all matched feature vectors is computed. For this purpose, the
following similarity metric is used:

similarity = 1 / (1 + ‖x_j − x_k‖_p)    (6-2)

This similarity measure is obtained by averaging the pairwise similarity measures between
candidate signatures in each interval, s_i ∈ {s_1, s_2, …, s_M}. The sorted similarity array for each
matched signature set is used to find the row with the highest similarity measure. If the highest
similarity measure is equal across all M matched signature sets, the user-provided label (L) is
assigned to the feature vectors that contribute to the highest similarity measure.
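As a concrete illustration of Equations 6-1 and 6-2, the following Python sketch implements the distance and similarity computations together with a simplified version of the matching step: it anchors on the first signature set, matches each candidate to its nearest neighbour in every other set, and keeps the candidate with the highest average similarity (the full SLM algorithm additionally checks that the highest similarity is consistent across all M sets). The `slm` helper and the feature vectors are hypothetical.

```python
def minkowski(x, y, p=2):
    """Distance of Equation 6-1; p = 2 gives the Euclidean distance."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def similarity(x, y, p=2):
    """Similarity metric of Equation 6-2."""
    return 1.0 / (1.0 + minkowski(x, y, p))

def slm(sets, label):
    """Simplified signature-label matching sketch over M feature-vector sets.
    Returns (label, matched vectors) for the most similar match."""
    best_avg, best_match = -1.0, None
    for x in sets[0]:
        matches, sims = [x], []
        for other in sets[1:]:
            y = min(other, key=lambda v: minkowski(x, v))  # nearest neighbour
            matches.append(y)
            sims.append(similarity(x, y))
        avg = sum(sims) / len(sims)
        if avg > best_avg:
            best_avg, best_match = avg, matches
    return (label, best_match) if best_match else None

# Hypothetical feature vectors (real-power step, reactive-power step) for a
# kettle turn-on reported in three intervals, plus unrelated signatures:
s1 = [[1400, 5], [60, 2]]
s2 = [[1398, 5], [700, 3]]
s3 = [[1401, 5]]
print(slm([s1, s2, s3], "162-on"))
```

Averaging the pairwise similarities is used here as a proxy for the equal-similarity check in Figure 6-2; the kettle-like signatures are matched across all three intervals and receive the label.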
In selecting M, the uniqueness of the signatures associated with each label across the selected M
sets plays an important role in the SLM algorithm. Therefore, in the training process, it is important
that users provide non-overlapping signatures in at least one of the M data sets. For example, if
a user reports a kettle turn-on event three times and, in the same time intervals, also reports
another appliance, the SLM algorithm will assign the labels to the signature class with the higher
similarity. Selecting larger values of M reduces the probability of reporting overlapping
signatures; however, very large values of M also increase the user interaction effort. Our field
observations showed that larger values of M are feasible for appliances that are used more
frequently, which increases the chance of finding intervals in which only one appliance's
signatures appear as the common signature. In this context, we call these non-overlapping
intervals. Therefore, given that user(s) provided a collection of n_l labels, l_i ∈ {l_1, …, l_{n_l}},
in n_s intervals, s_i ∈ S = {s_1, …, s_{n_s}}, a search for non-overlapping intervals is carried out
to find a configuration of interval sets that satisfies the following constraints:
S_{l_i} = {s_1, …, s_M} ⊂ S    (6-3)

s_1 ∩ s_2 ∩ … ∩ s_M = {l_i}    (6-4)
where s_i is the set of appliance labels provided by users for a specific interval. In a training
period, n_s such sets are obtained. For the subsets of S that are reported for l_i, the intersections
between the sets are computed until the condition in Equation 6-4 is satisfied, unless no such
subset exists in S. This search enables the SLM algorithm to select non-overlapping intervals
based on user-provided data. Although overlaps between intervals could still occur (due to the
presence of signatures from appliances that the user cannot detect or does not see), the search
facilitates a more efficient selection of intervals. Moreover, this search provides the opportunity
for the NILM system to communicate with users to request additional training data.
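A minimal sketch of the interval search implied by Equations 6-3 and 6-4, under the assumption that each user report has been reduced to a set of appliance labels (the interval contents and the `find_interval_subset` name are illustrative):

```python
from itertools import combinations

def find_interval_subset(intervals, target_label, M=4):
    """Sketch of the Eq. 6-3/6-4 search: intervals is a list of label sets
    reported by the user. Return the first M-subset whose intersection is
    exactly {target_label}, or None if no such subset exists."""
    candidates = [s for s in intervals if target_label in s]
    for combo in combinations(candidates, M):
        if set.intersection(*map(set, combo)) == {target_label}:
            return combo
    return None

# Hypothetical user reports (one label set per reporting interval):
intervals = [
    {"kettle", "light"},
    {"kettle", "toaster"},
    {"kettle"},
    {"kettle", "light"},
    {"toaster"},
]
print(find_interval_subset(intervals, "kettle", M=3))
```

The exhaustive `combinations` scan is adequate for the small number of reporting intervals considered here; a larger deployment would prune candidates first.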
6.3 Passive Training Evaluation
6.3.1 NILM System User Interface
In the passive training process, the user is provided with a tool, for example, a smartphone
application, with an interface that enables quick user interaction. Figure 6-3 shows
screenshots of the first prototype developed for evaluating the approach. The application contains
images of typical appliances that exist in the majority of residential units. These appliances are
mainly the ones that are major contributors to energy consumption in a residential setting and
whose state transitions can be detected by users. For the purposes of this dissertation, the list of
appliances specific to each experimental set-up was provided to users.
The user is asked to provide information about appliances' operational modes over a limited period
of time, for example, the last 30 minutes. In many cases, the user probably cannot
remember (or observe) all the appliances that have changed state in the last hour, for a number of
reasons: 1) not all appliances can be observed by one person, since they are in different locations
in a residential setting; 2) not all appliances are operated by the user; and 3) of the appliances that
could be operated by the user, not all are operated by one person. Nonetheless, the options that a
user has (in this first prototype of the interface) include stating whether the appliance was turned
on, was turned off, or completed an entire cycle (i.e., off-on-off stages), as well as "Don't have it",
which eliminates the possibility of that appliance's signatures appearing in the signature space.
The Pass button helps the user navigate through the appliances and provide data only on the
appliances he/she is sure about, and the Back button helps the user modify previously provided
answers. As noted, this interface prototype has been designed to demonstrate the concept of
passive training, and more usability analyses are required to improve the quality of the input. The
interface and user experience with the interface are not the focus of this dissertation. As noted in
the SLM description, in the proposed passive training approach, users are asked to provide the
data at different times in a way that covers most of the visible operational modes of appliances.
The data collected through the interface is transferred to a server and is coupled with the
electricity signatures captured through the NILM system, as described in previous chapters. The
signature representation format is as presented in Section 3.1.3 and was used in our analyses in
previous chapters.
Figure 6-3 User interface screenshots of the passive training smartphone application, developed
on Android platform
As presented in the mathematical description, the user is required to report on the same appliance
state transition multiple times, represented as M in Figure 6-2. For example, if a user reports on
TV turn-on events, the user needs to provide the data several times, for example, four to five
times. Using this user interface, a user can provide the information for all the appliances in about
15 seconds, which facilitates fast and frequent data provision. Going back to the TV example, the
user-provided information shows that the TV was turned on in four different time periods; thus,
four similar feature vectors are also found in the feature space for those time periods. Assuming
that these four signatures are the only signatures that are matched for this example, the feature
vector can be labeled as a TV turn-on event.
6.3.2 Experimental Validation
In order to assess the performance of the algorithm and investigate the challenges associated with
this approach, as noted in Chapter 3, data collection was carried out in two experimental set-ups.
The data was collected in apartment 1 and apartment 3 using the abovementioned smartphone
application, as well as the real-time NILM system on the main feed of the building units. Data
collection was carried out for almost two weeks in each of the test-bed building units.
As described earlier, not all possible appliance states can be trained using the proposed
approach. Specifically, multi-state appliances could be a challenge for the algorithm, since one
cycle from a user's perspective could include multiple turn-on and turn-off events. In the
experiments, the users in these two buildings were provided with a list covering a subset of the
appliances in each unit. The list of appliances is presented in Table 6-1. The power ranges of the
signatures are also presented in this table to shed more light on the signature scales. These are the
average power ranges extracted using the histogram of the power ranges. Appliances listed with a
range of power ranges are the ones with multiple state transitions or different modes of operation.
Table 6-1 List of appliances and their associated labels that were targeted in the experiments in
apartment 1 and apartment 3 test beds

Test Bed: Apartment 1 (label – appliance: average signature power range, Watt)
181 – Hair Dryer: 700
141 – Light Fixture 1: 35
129 – Television: 15-60
163 – Toaster: 850
145 – Light Fixture 2: 90-125
183 – Washer: 30-500
144 – Light Fixture 3: 140
182 – Iron: 1050
120 – Laptop: 20-25
143 – Light Fixture: 30
180 – AC: 50-2900
111 – Refrigerator: 100-1400
162 – Kettle: 1400
122 – Monitor: 25

Test Bed: Apartment 3 (label – appliance: average signature power range, Watt)
181 – Hair Dryer: 1100-1800
166 – Hair Iron: 100
162 – Kettle: 600
163 – Toaster: 400
129 – Television: 60-120
167 – Dishwasher: 300-2000
161 – Microwave: 40-1800
120 – Laptop: 40
111 – Refrigerator: 90-150
Using the search in Equations 6-3 and 6-4, the data was extracted for each user-provided label,
and the data was separated into turn-on and turn-off events. Following Equation 3-12, in this
evaluation the basic signature representation (fundamental frequency (60 Hz) components of
real and reactive power for 40 samples before and 60 samples after each event) was used. Since
the assignment of labels to the different phases in the building is not known, the data for both
phases were combined within the selected intervals for each label. In evaluating the proposed
approach, M was set to 4 with a 25-30 minute interval before the reporting time.
Therefore, users were instructed to report on appliance operation in the last 20-30 minutes.
Although a sensitivity analysis could be carried out to determine the optimum value of M and the
feasibility of using longer reporting intervals, the analyses in this chapter were limited to
the performance of the signature-label matching algorithm; the sensitivity analyses remain
part of future investigation, as discussed in the next chapter. Our observations from the
field and unstructured discussions with the users involved in the field experiments showed that
longer reporting intervals could introduce more errors into the training process, since users
might not be able to remember all the events, and thus the quality of the training process could be
reduced. Since the SLM algorithm assigns labels to the signatures based on equal similarity
measures in different signature sets, the reduction in quality is mainly reflected as a failure to
complete the labeling for some of the appliance classes.
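A sketch of how the basic signature representation described above could be assembled; the window lengths (40 samples before, 60 after each event) follow the text, while the data layout and the `extract_signature` helper are assumptions for illustration:

```python
# Sketch (assumed layout): real and reactive power samples, 40 before and
# 60 after each event index, concatenated into one feature vector.
def extract_signature(real_power, reactive_power, event_idx,
                      pre=40, post=60):
    """real_power / reactive_power: lists of per-cycle power values."""
    if event_idx - pre < 0 or event_idx + post > len(real_power):
        return None  # event too close to the edge of the record
    window = slice(event_idx - pre, event_idx + post)
    return real_power[window] + reactive_power[window]

# 200-sample synthetic record with a step at index 100:
p = [100.0] * 100 + [950.0] * 100   # real power (W)
q = [10.0] * 100 + [40.0] * 100     # reactive power (var)
fv = extract_signature(p, q, 100)
print(len(fv))  # 2 * (40 + 60) = 200 features
```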
In apartment 1, signatures with power ranges of less than 25 watts were ignored to avoid false
positives caused by matching signatures of noisy data. The signature extraction interval was
set to 30 minutes. The results of the SLM algorithm for turn-on events in this test
bed are presented in Table 6-2. In this table, the last column indicates whether the
label and the signatures have been matched, and the columns headed S indicate the
subset from which the signature in that column has been selected.
Table 6-2 Performance of the SLM algorithm on the data collected from apartment 1 for turn-on
events

User Label   S1      S2      S3      S4      Similarity   Matching Status
182          14501   14501   14501   14501   0.001356     + (false)
182          14501   14501   14501   14501   0.001356
182          14501   14501   14501   14501   0.001356
182          14501   14501   14501   14501   0.001356
181          18101   18101   18101   18101   0.000294     −
181          18101   18101   18101   18101   0.000294
181          18101   18101   18101   18101   0.000294
181          18101   18101   18101   18101   0.000295
180          0       18001   18001   0       0.000252     −
180          0       18001   18001   18001   0.000280
180          0       18001   18001   18001   0.000278
180          0       18001   18001   18001   0.000270
163          16301   16301   16301   16301   0.000254     +
163          16301   16301   16301   16301   0.000254
163          16301   16301   16301   16301   0.000254
163          16301   16301   16301   16301   0.000254
162          16201   16201   16201   16201   0.000199     +
162          16201   16201   16201   16201   0.000199
162          16201   16201   16201   16201   0.000199
162          16201   16201   16201   16201   0.000199
145          14501   14501   14501   14501   0.000997     +
145          14501   14501   14501   14501   0.000997
145          14501   14501   14501   14501   0.000997
145          14501   14501   14501   14501   0.000997
144          14401   14401   14401   14401   0.001666     +
144          14401   14401   14401   14401   0.001666
144          14401   14401   14401   14401   0.001666
144          14401   14401   14401   14401   0.001666
122          0       0       12900   12001   0.00077      −
122          0       0       12201   12201   0.00067
122          0       0       12900   12201   0.00079
122          0       0       12900   12201   0.00081
143          14301   14301   14301   14301   0.00529      +
143          14301   14301   14301   14301   0.00529
143          14301   14301   14301   14301   0.00529
143          14301   14301   14301   14301   0.00529
141          14101   14101   14101   14101   0.002274     +
141          14101   14101   14101   14101   0.002274
141          14101   14101   14101   14101   0.002274
141          14101   14101   14101   14101   0.002274
129          0       0       12904   0       0.000390     −
129          0       0       0       0       0.000549
129          0       0       0       0       0.000549
129          0       0       0       0       0.000549

(S1-S4: labels of the matched signatures in each of the four subsets)
For turn-on events, out of the 11 targeted appliances (those with user-provided data), the label
was assigned successfully in six cases; in other words, the similarity measures matched. Two of
the signature classes have smaller scales, for which the algorithm's success rate could be reduced
due to the possibility of more frequent similar signatures. The signatures from the bathroom light
were incorrectly labeled as iron (182), since signatures from both appliances were present in all
reported intervals. In the case of the hair dryer (181), although the signature labels are the same,
the algorithm did not assign the label due to differences in the similarity measures. Although all
these signatures represent turn-on events for the hair dryer, differences in the mathematical
representation of the signatures (due to different modes of operation) and the repetition of
signatures in the same time interval brought about this result. This is one of the challenging
situations for this approach: appliances with repetitive signatures in an interval. Since the SLM
algorithm avoids labeling signatures with unmatched similarity measures, failure of the algorithm
to match the labels does not result in incorrect training data provision.
Table 6-3 presents the results for turn-off events in the apartment 1 test bed. Similar performance
is observed for the turn-off events. For the hair dryer, the same challenge can be observed. For the
iron (182), the similarity between the hair dryer and iron signatures, and the fact that they were
used in the same time interval, resulted in a mismatch. Although the algorithm seeks non-
overlapping intervals, similarity of the signatures can still result in a mismatch condition.
The validation was also carried out in the apartment 3 test bed. Analysis of the signatures in this
test bed showed that the refrigerator compressor events occur with a period of 25 to 30 minutes.
Accordingly, the interval for data extraction was set to 25 minutes so that the probability of
constantly observing refrigerator signatures in the signature subsets was reduced. Similarly,
signatures with power ranges of less than 25 watts were ignored.
Table 6-3 Performance of the SLM algorithm on the data collected from apartment 1 for turn-off
events

User Label   S1      S2      S3      S4      Similarity   Matching Status
182          18102   18202   18202   18102   0.000316     −
182          18102   18202   18202   18102   0.000316
182          18102   18202   18202   18202   0.0002991
182          18102   18202   18202   18202   0.0002991
181          18102   18102   18102   18102   0.0001287    −
181          18102   18102   18102   18102   0.0001414
181          18102   18102   18102   18102   0.0001414
181          18102   18102   18102   18102   0.0001414
180          18006   18006   18006   18006   0.0034271    −
180          18006   18006   18006   18006   0.0034271
180          18006   18006   18006   18006   0.0034271
180          18006   300     18006   18006   0.0016908
141          14102   14102   14102   14102   0.002204     +
141          14102   14102   14102   14102   0.002204
141          14102   14102   14102   14102   0.002204
141          14102   14102   14102   14102   0.002204
143          14302   14302   14302   14302   0.0028547    +
143          14302   14302   14302   14302   0.0028547
143          14302   14302   14302   14302   0.0028547
143          14302   14302   14302   14302   0.0028547
163          16302   16302   16302   16302   0.0002867    +
163          16302   16302   16302   16302   0.0002867
163          16302   16302   16302   16302   0.0002867
163          16302   16302   16302   16302   0.0002867
162          16202   16202   16202   16202   0.0001927    +
162          16202   16202   16202   16202   0.0001927
162          16202   16202   16202   16202   0.0001927
162          16202   16202   16202   16202   0.0001927
145          14502   14502   14502   14502   0.0023278    +
145          14502   14502   14502   14502   0.0023278
145          14502   14502   14502   14502   0.0023278
145          14502   14502   14502   14502   0.0023278
144          14402   14402   14402   14402   0.0046726    +
144          14402   14402   14402   14402   0.0046726
144          14402   14402   14402   14402   0.0046726
144          14402   14402   14402   14402   0.0046726
122          12202   12202   12202   12202   0.0011459    +
122          12202   12202   12202   12202   0.0011459
122          12202   12202   12202   12202   0.0011459
122          12202   12202   12202   12202   0.0011459

(S1-S4: labels of the matched signatures in each of the four subsets)
Table 6-4 Performance of the SLM algorithm on the data collected from apartment 3 for turn-on
events

User Label   S1      S2      S3      S4      Similarity   Matching Status
181          18101   18101   18101   18101   0.000126     −
181          18101   18101   18101   18101   0.000126
181          18101   18101   18101   18101   0.000131
181          18101   18101   18101   18101   0.000122
162          16201   16201   16201   16201   0.0002       +
162          16201   16201   16201   16201   0.0002
162          16201   16201   16201   16201   0.0002
162          16201   16201   16201   16201   0.0002
163          16301   16301   16301   16301   0.000351     +
163          16301   16301   16301   16301   0.000351
163          16301   16301   16301   16301   0.000351
163          16301   16301   16301   16301   0.000351
129          12901   12901   12901   12901   0.004627     +
129          12901   12901   12901   12901   0.004627
129          12901   12901   12901   12901   0.004627
129          12901   12901   12901   12901   0.004627
167          16701   16701   16701   16701   0.000270     −
167          16701   16701   16701   16701   0.000357
167          16701   16701   16701   16701   0.000357
167          16701   16701   16701   16701   0.000357
161          16103   16103   16103   16103   0.005714     +
161          16103   16103   16103   16103   0.005714
161          16103   16103   16103   16103   0.005714
161          16103   16103   16103   16103   0.005714
111          11103   11103   11103   11103   0.001741     +
111          11103   11103   11103   11103   0.001741
111          11103   11103   11103   11103   0.001741
111          11103   11103   11103   11103   0.001741

(S1-S4: labels of the matched signatures in each of the four subsets)
Table 6-4 shows the performance of the SLM algorithm in the apartment 3 test bed for turn-on
events. A similar phenomenon for the hair dryer (181) is observed in this table. Additionally, the
same phenomenon happens in the case of the dishwasher (167). The dishwasher creates repetitive
events during a working cycle, and the repetition of the signatures together with differences in
signature representations (due to the presence of noise) may result in similarity measure
differences. A dissimilarity tolerance threshold could be used to address this challenge.
Table 6-5 Performance of the SLM algorithm on the data collected from apartment 3 for turn-off
events

User Label   S1      S2      S3      S4      Similarity   Matching Status
181          14502   14502   14502   14502   0.001602     + (false)
181          14502   14502   14502   14502   0.001602
181          14502   14502   14502   14502   0.001602
181          14502   14502   14502   14502   0.001602
162          16202   16202   16202   16202   0.000302     +
162          16202   16202   16202   16202   0.000302
162          16202   16202   16202   16202   0.000302
162          16202   16202   16202   16202   0.000302
163          16302   16302   16302   16302   0.000325     +
163          16302   16302   16302   16302   0.000325
163          16302   16302   16302   16302   0.000325
163          16302   16302   16302   16302   0.000325
129          12902   12902   12902   12902   0.00055      +
129          12902   12902   12902   12902   0.00055
129          12902   12902   12902   12902   0.00055
129          12902   12902   12902   12902   0.00055
167          16702   16702   16702   16702   0.000374     −
167          16702   16702   16702   16702   0.000374
167          16702   16702   16702   16702   0.000374
167          16702   16702   16702   16702   0.000235
161          16104   16104   16104   16104   0.004704     +
161          16104   16104   16104   16104   0.004704
161          16104   16104   16104   16104   0.004704
161          16104   16104   16104   16104   0.004704
111          11104   11104   11104   11104   0.002099     +
111          11104   11104   11104   11104   0.002099
111          11104   11104   11104   11104   0.002099
111          11104   11104   11104   11104   0.002099

(S1-S4: labels of the matched signatures in each of the four subsets)
Table 6-5 shows the validation results for turn-off events in the apartment 3 test bed. A similar
phenomenon for the dishwasher (167) is observed. In the case of the hair dryer (181), all the
reported intervals included signatures from the bathroom light (turn-off events), which resulted in
a labeling error: the signatures for the bathroom light turn-off events were labeled as the hair
dryer. A similar phenomenon was observed in the apartment 1 validation experiments. This is the
main source of error for the SLM algorithm, as it can result in incorrect labeling of signatures.
This problem occurred because users in this test-bed building did not report on lighting load
operations, and therefore the search for selecting proper signature subsets was not effective.
Intelligent interaction with users, in which the data entry for different appliances is optimized for
a better outcome, could reduce the challenges associated with this phenomenon. In general, the
validation of the approach in these two apartments showed that the algorithm was successful
in assigning labels to signatures from appliances with two state transitions and with
approximately constant signature representations.
In order to evaluate the performance of the SLM algorithm under variations in the data set
conditions, the SLM algorithm was also evaluated for different combinations of the intervals, and
the average accuracy of the labeled signatures was measured. The accuracy metric, in this
context, is the percentage of correctly labeled signatures; in other words, the percentage of
correctly labeled signatures for the labels with positive label matching status (as shown in the
aforementioned validation experiments) was measured. In this evaluation, the subsets were
selected using the ground truth labels. Depending on the number of events, up to 50 different
combinations of 4 subsets were selected for each appliance operational state. Table 6-6 shows the
results of the analyses for the apartment 1 data set. The results include the ratio of labeled
signatures, which shows in what percentage of the cases the SLM algorithm matched the
signatures with the associated label. The accuracy of the labeled signatures shows the rate of
correctly labeled signatures; in other words, in what percentage of the matched signature-labels
the ground truth label matched the assigned label. These values are reported as averages over up
to 50 combinations of the signature subsets from different intervals.
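The two metrics reported in Tables 6-6 and 6-7 can be computed as follows. This is a sketch with hypothetical per-combination outcomes, where each combination of subsets either produces a matched label (correct or incorrect) or no match:

```python
def labeling_metrics(outcomes):
    """outcomes: list of (matched: bool, correct: bool) per subset combination.
    Returns (ratio of labeled signatures, accuracy of the labeled signatures),
    mirroring the Table 6-6/6-7 metrics."""
    matched = [correct for m, correct in outcomes if m]
    ratio = len(matched) / len(outcomes) if outcomes else 0.0
    accuracy = sum(matched) / len(matched) if matched else 0.0
    return ratio, accuracy

# Hypothetical run over 50 combinations: 32 matched, 31 of those correct.
outcomes = [(True, True)] * 31 + [(True, False)] + [(False, False)] * 18
print(labeling_metrics(outcomes))  # (0.64, 0.96875)
```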
Table 6-6 Performance of the SLM algorithm on the data collected from apartment 1, measured
over up to 50 combinations of the signature subsets

Turn-On Events                                     Turn-Off Events
Appliance   Ratio of labeled   Accuracy of the     Appliance   Ratio of labeled   Accuracy of the
Code        signatures†        labeled signatures  Code        signatures         labeled signatures
129         1                  1                   141         0.76               1
141         0.98               1                   143         1                  1
143         1                  1                   144         0.54               1
144         0.76               1                   145         0.42               1
145         0.6                1                   162         0.44               0.95
162         0.64               0.97                163         0.48               0.96
163         0.76               1                   181         0.22               0.1
181         0.3                0.13                182         0.22               0
182         0.38               0                   122         0.12               0.5
122         0.06               1
Average                        0.81                Average                        0.72

† This ratio shows in what percentage of the cases the SLM algorithm matched
signatures with the labels.
As Table 6-6 demonstrates, the algorithm showed promising performance in the majority of
cases. In the case of the hair dryer (181), as described in the experimental validation above, the
variation in signature representation (due to different modes of operation) results in a negative
outcome of the algorithm (a mismatch between label and signatures). In the 30% of cases where
a match was made, the matching mainly happened with other signature classes, which resulted in
a low accuracy measure. In the case of the iron (182), the number of times the appliance was used
is limited, and on all those occasions the appliance was used along with another appliance; in this
specific case, the bathroom light events occurred in the same time intervals. This concurrency
resulted in an accuracy of 0 for the iron signatures.
Table 6-7 Performance of the SLM algorithm on the data collected from Apartment 3 measured
over 50 combinations of the signature subsets
        All Events                     Refrigerator Events Removed‡
        Turn-On Events                 Turn-On Events                 Turn-Off Events
App.   Ratio of     Acc. of the    App.   Ratio of     Acc. of the    App.   Ratio of     Acc. of the
Code   labeled      labeled        Code   labeled      labeled        Code   labeled      labeled
       signatures†  signatures            signatures   signatures            signatures   signatures
162    0.90         0.71           162    0.82         0.88           162    0.94         0.96
181    0.90         0.11           181    0.48         0.79           181    0.64         0.25
166    0.96         0.71           166    0.90         0.60           166    1.00         1.00
163    0.48         0.75           163    0.32         0.88           163    0.42         0.86
129    1.00         1.00           129    1.00         1.00           129    1.00         1.00
144    0.64         1.00           144    0.60         1.00           144    0.32         0.44
145    0.66         1.00           145    0.64         0.97           145    0.52         1.00
146    0.44         0.86           146    0.44         0.95           146    0.36         0.61
147    0.42         0.71           147    0.60         0.77           147    0.78         0.82
143    0.72         0.83           143    0.58         0.93           143    0.58         0.86
       Average      0.77                  Average      0.88                  Average      0.78
‡ The signatures for the refrigerator compressor have been removed; the compressor events have a
period of about 25 minutes, which could cause a higher rate of errors in the SLM performance.
† This ratio shows in what percentage of the cases the SLM algorithm matched signatures with
the labels.
A similar approach was used for the data from the Apartment 3 test bed. In these analyses, the
events from lights were also included. Table 6-7 shows the outcome. The first three columns
represent the results for turn-on events evaluated considering all the events in the data set,
whereas on the right-hand side of the table the events generated by the refrigerator compressor
were eliminated to reduce their effect on the evaluation results; as the table shows, the results
improved. A promising performance was also observed in this test bed. As noted, the main source
of error is concurrency between the operations of two appliances in all the selected intervals.
6.4 Summary
In order to facilitate user interaction by reducing the time dependency of the training process,
a passive (indirect) training approach for NILM systems was proposed. Passive training provides
more flexibility compared to active training, in which users need to react to events as they are
detected by a NILM system. The proposed passive training approach relaxes the time constraints of
training by introducing a signature-label matching (SLM) algorithm that enables the NILM system
to match signatures with corresponding labels, provided through a user interface within a time
interval of 20 to 30 minutes after the events occurred. This algorithm is coupled with a
smartphone user interface that facilitates fast and frequent user input. The SLM algorithm is a
proximity-based algorithm that assigns a label (provided for M different intervals containing
related signatures) to the corresponding signatures using the equality of the similarity measure
between the nearest signatures in the M intervals. The algorithm was evaluated through
experimental validation in two test bed apartments. The validations were carried out using
prototypes of the NILM system and the smartphone application, with data collected over a period
of two weeks in each of the test bed buildings. The users in these buildings were asked to
provide label information 20 to 30 minutes after the events had occurred. The algorithm was
evaluated by measuring the accuracy of the assigned labels with respect to the ground truth data.
In general, the validation of the approach in these two apartments showed that the algorithm was
especially successful in assigning labels to signatures from appliances with two state
transitions and with signatures that have an approximately constant representation. The main
source of error is concurrency between the operations of two appliances in all the selected
intervals. By reducing this concurrency through intelligent, and thus efficient, selection of the
intervals, the performance of the algorithm could be improved.
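The matching step can be illustrated with a small sketch. This is a simplified reconstruction under several assumptions, not the implementation evaluated above: Euclidean distance stands in for the similarity measure, an exhaustive search over combinations replaces the nearest-signature scheme, and the function name is hypothetical:

```python
import itertools
import math

def match_label(intervals):
    """Toy proximity-based signature-label matching.

    `intervals` is a list of M lists of signature feature vectors; each
    interval is known (from user-provided labels) to contain one event of
    the labeled appliance among other, unrelated events.  The label is
    assigned to the combination of signatures, one per interval, with the
    smallest total pairwise distance -- i.e., the recurring signature that
    all M intervals share.
    """
    best, best_score = None, float("inf")
    for combo in itertools.product(*intervals):
        score = sum(math.dist(a, b)
                    for a, b in itertools.combinations(combo, 2))
        if score < best_score:
            best, best_score = combo, score
    return best
```

With two intervals that both contain a near-identical appliance signature among unrelated events, the function returns that pair; if a second appliance appears concurrently in every selected interval, this selection is defeated, which is exactly the error source observed in the validation.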
Chapter Seven: Conclusion and Future Directions
The main goal of this dissertation is to facilitate the training of electricity measurement
disaggregation (NILM) systems. An effective way to ensure that all the signatures (associated
with the corresponding appliance state transitions) in a building are introduced to the NILM
system is to facilitate the training process using a real-time NILM prototype that is capable of
communicating with users as new events are detected. Upon installation, as new events are
detected, the system communicates with the user(s), asking for confirmation of a classified event
or provision of a label for a new event. Considering the ad hoc nature of this process, due to
diversity in appliances' designs, technologies, and manufacturers, and the number of states of
different appliances in a specific setting, this process needs to continue until the training is
complete. A number of challenges are associated with this process, including user fatigue due to
an excessive number of calls, the difficulty for user(s) to detect the state transitions of some
appliances (e.g., refrigerator defrost), and the need for immediate user reaction to the calls
from the NILM system. In order to facilitate wide adoption of the technology, the main objectives
were to reduce the number of calls to user(s) for training data provision while keeping the
accuracy of the training data at a high level, and to relax the time constraint between the
occurrence of an event and the time at which the user(s) need to provide the data. In the
following section, the proposed solutions and the way they address the objectives and research
questions of this dissertation are discussed.
7.1 Addressing Research Objectives and Questions
A user-centric framework for event-based NILM techniques was presented to address the first
objective. This framework complements the core components of event-based NILM systems with
modules that are capable of identifying the possible signature space in a new setting. The
signature space is the collection of all signatures, captured by the NILM system over a period of
time, as the appliances operate in different states in reaction to the environment and the
user(s). Identification of the signature space is then the partitioning of the signature space
into clusters of signatures compatible with the separation of signatures corresponding to the
appliances' operational states in the physical domain. This brings us to the first research
question: how could the signature space be autonomously clustered, specifically considering the
ad hoc nature of the signature space topology and the variations observed in different settings?
In addressing the first question, the main objective was to present a solution that could enable
the partitioning of the signature space without any a priori information as input. Different
clustering algorithms could be used for this purpose; however, these algorithms require either
the number of clusters or a threshold for finding the partitions in a signature space.
Determining such a threshold is not desirable, specifically considering the ad hoc nature of the
problem. Accordingly, a heuristic clustering algorithm was developed (as presented in Chapter 4)
using hierarchical clustering as the base approach. In this heuristic algorithm, the
characteristics of the binary cluster tree were used to determine the distance threshold for
pruning the tree. To account for the multi-scale nature of the cluster tree, the algorithm finds
the natural partitions of the signature space at different scales in a recursive fashion. In this
way, the thresholds at different scales of the feature space are autonomously extracted by
learning from the data and without providing any additional information. The multi-scale nature
of the signature space stems from the fact that appliances in any setting have power draw
magnitudes that vary significantly, specifically when the transient signatures are taken into
account. The presented clustering algorithm acknowledges this multi-scale structure and finds the
partitioning thresholds autonomously.
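The idea of extracting a pruning threshold from the cluster tree itself can be sketched as follows. This is a simplified stand-in for the heuristic, under stated assumptions: it uses single-linkage merges, cuts at the largest gap between consecutive merge distances, and omits the recursive multi-scale refinement; the function names are illustrative:

```python
import math

def single_linkage_merges(points):
    """Agglomerative single-linkage clustering; returns the sequence of
    merge distances (the heights in the binary cluster tree)."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append(d)
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]
    return merges

def gap_threshold(merges):
    """Heuristic pruning threshold: cut the tree halfway across the
    largest jump between consecutive (sorted) merge distances."""
    ms = sorted(merges)
    gap, k = max((ms[i + 1] - ms[i], i) for i in range(len(ms) - 1))
    return (ms[k] + ms[k + 1]) / 2.0
```

On two well-separated groups of points, the within-group merges sit far below the single cross-group merge, so the threshold falls in the gap and pruning at it recovers the two natural partitions without any user-provided parameter.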
The algorithm was evaluated using the data sets from Apartment 1 (in Chapter 4) and Apartment 2
(in Chapter 5). For the performance evaluation, accuracy metrics were used, as well as the CQI
metric, a customized metric for cluster quality. The evaluation of the algorithm was carried out
for different distance and linkage metrics, using different feature extraction methods. The
evaluations demonstrated the consistent performance of the proposed algorithm in accurately
partitioning the signature space, with high F-measure values for various evaluation scenarios.
The assessment of different feature extraction methods showed that the application of the basic
feature vector (real and reactive power for the fundamental frequency in the proximity of the
events) and the higher harmonic contents of the power time series results in acceptable
partitioning of the signature space (with high accuracy and relatively high CQI). However, the
effective feature extraction depends on the structure (topology) of the signature space. Although
the evaluations in this dissertation were conducted using transient state features, the proposed
heuristic is independent of the mathematical representation of the signatures.
The clustered data is used for efficient interaction with users during the training process. Once
the signature space in a new setting is identified, the NILM system can use that information to
reduce the number of calls during the training. Once the association between a (detected event)
signature and one of the clusters is recognized, the user is called upon to interact and provide
the label. For successive events, the labeled clusters are eliminated from further interactions
until the labeling of the clusters is complete. In this process, certain clusters and their
associated signatures belong to appliances whose state transitions are not detectable by users.
These are the appliances that are controlled automatically and with frequent events. By detecting
the clusters corresponding to these appliances, the number of user interactions for them can be
limited. Therefore, besides the accuracy of the clustered data, an important parameter in
facilitating the training process for NILM systems is to ensure that the clustered data is
compatible with the separation of the signatures in the physical domain.
The last part of the first research question and the second research question were then answered
by introducing a framework for efficient interaction between a NILM system and its users in
Chapter 5. The framework incorporates the core event-based NILM algorithm and the autonomous
clustering algorithm, and complements them by introducing cluster validation and anomaly
detection algorithms. The cluster validation approach introduces effective cluster merging and
internal cluster validation for the high-dimensional, multi-scale feature space of the
electricity measurements. A customized cluster compatibility approach, which measures the
similarity between the covariance matrices of the targeted clusters to account for the shape of
the signatures, was demonstrated to outperform approaches that measure the distance between the
targeted clusters. It was shown that performing the cluster merging at different scales of the
signature space increases the accuracy and helps us determine robust threshold values for
merging. Perturbing the signature space topology, coupled with internal cluster validation
techniques, was shown to be an effective approach to improving the accuracy of the clustering
where the differences between signatures are subtle. Moreover, this approach enables the NILM
system to determine the optimum feature vector representation (the use of the basic feature
vector with the fundamental frequency component or the use of higher harmonic contents).
Similarly, the evaluations for the cluster validation were conducted using transient state
features, and promising results were obtained. However, the proposed methodologies are
independent of the signature representation.
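The covariance-based compatibility check can be illustrated with a small sketch. The Frobenius-norm distance used here is an illustrative stand-in for the similarity measure between covariance matrices, and the function names are hypothetical:

```python
def covariance(cluster):
    """Sample covariance matrix of a cluster of feature vectors."""
    n, d = len(cluster), len(cluster[0])
    mean = [sum(p[k] for p in cluster) / n for k in range(d)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in cluster) / (n - 1)
             for j in range(d)] for i in range(d)]

def shape_distance(c1, c2):
    """Frobenius-norm distance between the covariance matrices of two
    clusters: near zero when the clusters have the same shape and
    orientation, regardless of where they sit in the feature space."""
    s1, s2 = covariance(c1), covariance(c2)
    return sum((a - b) ** 2
               for r1, r2 in zip(s1, s2)
               for a, b in zip(r1, r2)) ** 0.5

def compatible(c1, c2, threshold):
    """Flag a pair of clusters as merge candidates when their covariance
    structures agree to within `threshold`."""
    return shape_distance(c1, c2) < threshold
```

Two translated copies of the same cluster have identical covariance matrices (distance zero) and are flagged for merging, whereas a purely distance-based compatibility check would keep them apart; this is the intuition behind comparing covariance structure rather than inter-cluster distance.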
In order to ensure the accuracy of the labeling process, an ensemble of anomaly detection
algorithms was combined into an anomaly detection module, which balances the trade-off between
accuracy and rejection rate. A low rejection rate is important for reducing the training effort;
however, reducing this rate should not affect the accuracy of the detector. The evaluation of the
anomaly detector module over eight data sets showed its consistent performance. A rule-based
cluster identification module was also added to the framework to help identify clusters with
frequent events and limit the number of interactions for training. The validation of the
framework in two test bed buildings demonstrated its performance, reducing the number of
interactions by more than 94% while achieving a labeling accuracy of more than 96% on average.
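The combination rule for such an ensemble can be sketched as follows. The voting scheme shown is a hypothetical example of how combining detectors trades rejection rate against accuracy, not the module's actual rule:

```python
def ensemble_reject(signature, detectors, min_votes):
    """Reject a labeled signature as anomalous only when at least
    `min_votes` of the individual anomaly detectors flag it.  Raising
    `min_votes` lowers the rejection rate (fewer user labels are
    discarded) at the risk of letting erroneous labels through; lowering
    it does the opposite."""
    votes = sum(1 for detect in detectors if detect(signature))
    return votes >= min_votes
```

For instance, with three detectors and `min_votes=2`, a signature flagged by only one detector is kept rather than rejected, reducing unnecessary re-labeling effort.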
User activities usually include a number of consecutive actions, which involve using appliances
in their different modes of operation. Consequently, interacting with the NILM system while the
user is involved in daily activities, even through a mobile device such as a tablet or
smartphone, could result in inconvenience for users. This brought us to the third research
question of relaxing the response time for labeling the events, which was addressed by
introducing a passive (indirect) training approach for NILM systems in the form of a
signature-label matching (SLM) algorithm. This algorithm is a proximity-based algorithm that
assigns a label (provided for M different intervals containing related signatures) to the
corresponding signatures using the equality of the similarity measure between the nearest
signatures in the M intervals. The algorithm was evaluated through experimental validation in two
test bed apartments. The validation of the approach in these two test bed buildings showed that
it could be used to relax the response time by 20 to 30 minutes. Moreover, these validations
showed that the algorithm was especially successful in assigning labels to signatures from
appliances with two state transitions and with signatures that have an approximately constant
representation. The main source of error is concurrency between the operations of two appliances
in all the selected intervals. A search approach for finding combinations of non-concurring
intervals was used and was shown to be effective in the validation studies.
The validation field studies demonstrated the promising performance of the proposed techniques
for smart user-centric NILM systems. However, the nature of the problem could pose limitations,
some of which are as follows:
- One limitation stems from the ad hoc nature of the problem. In order to draw significant
conclusions about the performance of the algorithms, these approaches need to be tested in
large-scale field studies to account for the diversity in appliance signatures.
- For appliances with a small power draw, the algorithms' performance could be diminished due to
the similarity between signatures. However, this is a common challenge in event-based NILM
techniques.
- Although the major components of the SI framework do not require a priori information, the
cluster merging algorithms call for threshold optimization and generalization, which requires
more comprehensive sensitivity analyses. Heuristic sanity check algorithms could be used to avoid
error propagation due to the outcome of the cluster merging algorithms.
- The training process depends on user-provided information and is therefore prone to human
error. In the passive training approach, information provision by multiple people in a setting
could pose additional challenges in the search for non-concurring intervals. Accordingly, the
passive training approach calls for user compliance with certain rules for data provision.
7.2 Future Research Directions
Our studies showed that it is possible to reduce the user interaction requirements without
compromising the accuracy of the labeling process. As noted, although the validation studies
showed the successful performance of the proposed solutions in three apartments over a relatively
long period of time, the diversity of appliance designs and manufacturers' technologies calls for
the evaluation of these techniques in a large number of settings. This is to ensure that the
algorithms and frameworks have been exposed to different challenging situations, which in turn
enables us to incorporate the lessons learned from those situations into these algorithms and
increase their robustness.
An ideal NILM system is one that can be trained ahead of installation in a new setting.
Potentially, this could be achieved by learning representative signatures or appliance models
across different buildings. It could be accomplished through optimum representation of the
signatures via feature extraction techniques, as presented in [34], to model the transient
signatures, or by introducing appliance models and applying Hidden Markov Models, as presented in
some of the studies [69, 70]. As is also noted in these references, developing a generalized
representation of appliance models or signatures is a challenging research question due to the
diversity of manufacturers and their technologies. An alternative future direction is the
application of graphical models, such as Bayesian networks, that incorporate the information from
observations in different buildings, coupled with the cluster validation techniques proposed in
this dissertation. The cluster validation techniques and the identification of the appliances
with frequent events, in addition to information about the shape, power ranges, frequency
contents, etc., could be used for developing a probabilistic reasoning model for detecting the
class of a specific signature cluster. Therefore, further exploration of cluster validation
techniques and the development of probabilistic reasoning in large-scale studies is another
future direction for this research.
Since the variation of ambient conditions in the environment is correlated with the activities of
users and the behavior of appliances, sensor-assisted training is another future direction that
could be pursued. This is a field of research that has been explored in some recent research
efforts [77, 79]. We have also started working in this field by introducing techniques for
sensor-assisted training for lighting systems. Light intensity sensors enable us to reduce the
number of sensing points while monitoring a number of light sources to track their operational
schedules. We have developed solutions in the time domain to reduce the challenges associated
with the separation of artificial versus natural light sources [62, 103]. We have also started to
develop an approach in the frequency domain, where event detection is carried out on the time
series of certain frequency components of the light signal to separate the artificial versus
natural light contributions (see Figure 7-1).
Figure 7-1 Lighting fixtures activity detection in time and frequency domains
[Figure: three panels showing (a) the noisy light intensity signal in the time domain, (b) the
amplitude of its 120 Hz frequency component over time, and (c) a zoomed-in view of the 120 Hz
frequency component.]
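The frequency-domain idea can be sketched with a one-bin discrete Fourier transform. This is a minimal illustration under stated assumptions (a sampling rate well above 240 Hz, a 60 Hz mains supply), not our implementation; the function name is hypothetical:

```python
import math

def component_amplitude(signal, fs, freq):
    """Amplitude of a single frequency component of `signal`, sampled at
    `fs` Hz, via a one-bin discrete Fourier transform.  Mains-powered
    lighting flickers at twice the line frequency (120 Hz on a 60 Hz
    supply), so tracking this bin over successive windows follows the
    artificial light contribution, while daylight contributes essentially
    nothing at that frequency."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * k / fs)
             for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * k / fs)
             for k, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n
```

Applying this to successive windows of the light-intensity signal yields the 120 Hz amplitude time series (as in the middle panel of Figure 7-1) on which event detection can then be performed.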
Although sensor-assisted training techniques are promising, the trade-off between using
additional sensors and the gain in information and energy savings should be taken into account.
Accordingly, this is another future direction that could be taken into consideration.
As noted, there is a correlation between the behavior of appliances, user activities, and the
variations observed in the ambient conditions of an environment. Moreover, one of the
applications of electricity disaggregation algorithms is to facilitate changes in energy-related
human behavior towards energy conservation. Therefore, activity detection could be considered a
core component of this vision. Activity detection techniques also follow similar concepts for
detecting and identifying changes in human behavior. Therefore, another potential future research
direction is to combine electricity disaggregation and activity detection techniques in the form
of a hybrid framework to achieve better performance in each individual component, optimize the
sensing infrastructure, and combine the training efforts. Human-computer interaction could also
play a major role in achieving the objectives in this direction by introducing improved
techniques for increasing the effectiveness of user input.
References
[1] IEA, Key World Energy Statistics (2006).
[2] L. Perez-Lombard, J. Ortiz, C. Pout, A review on buildings energy consumption information,
Energy Build. 40 (2008) 394-398.
[3] K. Dagobert G, Global warming — facts, assessment, countermeasures, Journal of Petroleum
Science and Engineering. 26 (2000) 157-168.
[4] BED-Book, Building Energy Data Book (2011).
[5] Energy Star, http://www.energystar.gov/index.cfm?c=business.EPA_BUM_CH6_Lighting,
Last Accessed at May 2013.
[6] US EPA, List of additional statistics on buildings and the environment, Accessed at:
http://www.epa.gov/greenbuilding/pubs/whybuild.htm, (September 23 2012) (2009).
[7] Johnson Controls Inc., 10 Ways to Reduce Energy Use and Costs in Your Building, Accessed
at: http://www.greenbiz.com/research/tool/2010/09/01/10-ways-reduce-energy-use-and-costs-
your-building, (October 16, 2012) (2010).
[8] E. Shove, Changing human behaviour and lifestyle: a challenge for sustainable consumption?,
Consumption - Perspectives from ecological economics (2005) 111-132.
[9] E. Azar, C. Menassa, Agent-Based Modelling of Occupants’ Impact on Energy Use in
Commercial Buildings., Journal of Computing in Civil Engineering. 26(4) (2011) 506-518.
[10] R.K. Jain, J.E. Taylor, G. Peschiera, Assessing eco-feedback interface usage and design to
drive energy efficiency in buildings, Energy Build. 48 (2012) 8-17.
[11] Z. Yang, N. Li, B. Becerik-Gerber and M. Orosz. A non-intrusive occupancy monitoring
system for demand driven HVAC operations, Construction Research Congress 2012: Construction
Challenges in a Flat World, May 21, 2012 - May 23 (2012) , pp. 828-837.
[12] L. Klein, J. Kwak, G. Kavulya, F. Jazizadeh, B. Becerik-Gerber, P. Varakantham, M. Tambe,
Coordinating occupant behavior for building energy and comfort management using multi-agent
systems, Autom. Constr. 22 (2012) 525-536.
[13] F. Jazizadeh, A. Ghahramani, B. Becerik-Gerber, T. Kichkaylo, M. Orosz, User-led
decentralized thermal comfort driven HVAC operations for improved efficiency in office
buildings, Energy Build. 70 (2014) 398-410.
[14] F. Jazizadeh, A. Ghahramani, B. Becerik-Gerber, T. Kichkaylo, M. Orosz, Human-Building
Interaction Framework for Personalized Thermal Comfort-Driven Systems in Office Buildings, J.
Comput. Civ. Eng. 28 (2013) 2-16.
[15] F. Jazizadeh, F.M. Marin, B. Becerik-Gerber, A thermal preference scale for personalized
comfort profile identification via participatory sensing, Build. Environ. 68 (2013) 140-149.
[16] L. Klein, G. Kavulya, F. Jazizadeh, J. Kwak, B. Becerik-Gerber and M. Tambe. Towards
optimization of building energy and occupant comfort using multi-agent simulation, International
Symposium on Automation and Robotics in Construction (2011) .
[17] G. Peschiera, J.E. Taylor, The impact of peer network position on electricity consumption in
building occupant networks utilizing energy feedback systems, Energy Build. 49 (2012) 584-590.
[18] F. Jazizadeh, G. Kavulya, J. Kwak, B. Becerik-Gerber, M. Tambe and W. Wood. Human-
building interaction for energy conservation in office buildings, Construction Research Congress
2012: Construction Challenges in a Flat World, May 21, 2012 - May 23 (2012) , pp. 1830-1839.
[19] V. L. Erickson, M. A. Carreira-Perpinan and A. E. Cerpa. OBSERVE: Occupancy-based
system for efficient reduction of HVAC energy, Information Processing in Sensor Networks
(IPSN), 2011 10th International Conference on, (2011) , pp. 258-269.
[20] O. A. Sianaki, O. Hussain, T. Dillon and A. R. Tabesh. Intelligent Decision Support System
for Including Consumers' Preferences in Residential Energy Consumption in Smart Grid, 2010
Second International Conference on Computational Intelligence, Modelling and Simulation
(CIMSiM 2010) (2010) , pp. 154-9.
[21] EPRI, Residential Electricity Use Feedback: A Research Synthesis and Economic
Framework, Tech. Rep. 1016844 (2009).
[22] P. Ekins, Environment and Human Behavior: A New Opportunities Programme, Accessed at:
http://www.psi.org.uk/ehb/; (June 29 2012) (2003).
[23] J. Dewaters and S. Powers. Work in progress - Energy education and energy literacy:
Benefits of rigor and relevance, 39th Annual Frontiers in Education Conference: Imagining and
Engineering Future CSET Education, FIE 2009, October 18, 2009 - October 21 (2009)
[24] J. DeWaters and S. Powers. Work in progress: A pilot study to assess the impact of a special
topics energy module on improving energy literacy of high school youth, 36th ASEE/IEEE
Frontiers in Education Conference, FIE, October 28, 2006 - October 31 (2006)
[25] Yu Yi-xin, L. Peng and Zhao Chun-liu. Non-intrusive method for on-line power load
decomposition, 2008 China International Conference on Electricity Distribution (CICED 2008)
(2008) , pp. 8.
[26] M. Berenguer, M. Giordani, F. Giraud-By and N. Noury. Automatic detection of activities of
daily living from detecting and classifying electrical events on the residential power line, e-health
Networking, Applications and Services, 2008. HealthCom 2008. 10th International Conference on
(2008) , pp. 29-32.
[27] J. Torriti, M.G. Hassan, M. Leach, Demand response experience in Europe: Policies,
programmes and implementation, Energy. 35 (2010) 1575-83.
[28] The Energy Detective, TED: The Energy Detective. 2011.
[29] PowerhouseDynamics, http://www.powerhousedynamics.com/residential-energy-efficiency/,
Last Accessed May 2013 (2013).
[30] P3 International, Kill-a-watt. 2011.
[31] Electronic Educational Devices, Electricity Meters. 2011.
[32] GreenWave Reality, http://www.greenwavereality.com/solutions/energymgmt/, Last
Accessed May 2013 (2013).
[33] Enmetric-Systems, http://www.enmetric.com/platform#Hardware, Accessed at December 10
2012 (2012).
[34] M. Berges, A framework for enabling energy-aware facilities through minimally-intrusive
approaches (2010).
[35] G.W. Hart, Residential energy monitoring and computerized surveillance via utility power
flows, IEEE Technol. Soc. Mag. 8 (1989) 12-16.
[36] G.W. Hart, Nonintrusive appliance load monitoring, Proc IEEE. 80 (1992) 1870-91.
[37] Google Inc., Engage your customers every day with Google PowerMeter (2009).
[38] M. Zeifman and K. Roth. Nonintrusive appliance load monitoring: Review and outlook, 2011
IEEE International Conference on Consumer Electronics, ICCE 2011, January 9, 2011 - January
12 (2011) , pp. 239-240.
[39] A. G. Ruzzelli, C. Nicolas, A. Schoofs and G. M. P. O'Hare. Real-Time Recognition and
Profiling of Appliances through a Single Electricity Sensor, Sensor Mesh and Ad Hoc
Communications and Networks (SECON), 2010 7th Annual IEEE Communications Society
Conference on (2010) , pp. 1-9.
[40] S. Drenker, A. Kader, Nonintrusive monitoring of electric loads, IEEE Comput. Appl. Power.
12 (1999) 47-51.
[41] A. Marchiori, D. Hakkarinen, Qi Han, L. Earle, Circuit-Level Load Monitoring for
Household Energy Management, Pervasive Computing, IEEE. 10 (2011) 40-48.
[42] L. Farinaccio, R. Zmeureanu, Using a pattern recognition approach to disaggregate the total
electricity consumption in a house into the major end-uses, Energy Build. 30 (1999) 245-59.
[43] M.L. Marceau, R. Zmeureanu, Nonintrusive load disaggregation computer program to
estimate the energy consumption of major end uses in residential buildings, Energy Conversion
and Management. 41 (2000) 1389-1403.
[44] H. Murata and T. Onoda. Estimation of power consumption for household electric
appliances, Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th
International Conference on (2002) , pp. 2299-2303 vol.5.
[45] M. Akbar and Z. A. Khan. Modified nonintrusive appliance load monitoring for nonlinear
devices, 11th IEEE International Multitopic Conference, INMIC 2007, December 28, 2007 -
December 30 (2007) .
[46] S. Inagaki, T. Egami, T. Suzuki, H. Nakamura, K. Ito, Nonintrusive appliance load
monitoring based on integer programming, Electrical Engineering in Japan. 174 (2011) 18-25.
[47] S. B. Leeb and J. L. Kirtley J. A multiscale transient event detector for nonintrusive load
monitoring, Proceedings of IECON '93 - 19th Annual Conference of IEEE Industrial Electronics
(1993) , pp. 354-9.
[48] S.B. Leeb, S.R. Shaw, Harmonic estimates for transient event detection. 1 (1994) 133-136.
[49] S.B. Leeb, S.R. Shaw, J.L. Kirtley Jr., Transient event detection in spectral envelope
estimates for nonintrusive load monitoring, IEEE Trans. Power Del. 10 (1995) 1200-1210.
[50] U.A. Khan, S.B. Leeb, M.C. Lee, Multiprocessor for transient event detection, IEEE Trans.
Power Del. 12 (1997) 51-60.
[51] D. Luo, L. K. Norford, S. R. Shaw, S. B. Leeb, R. Danks and G. Wichenko. Monitoring
HVAC equipment electrical loads from a centralized location - Methods and field test results,
2002 ASHRAE Winter Meeting, January 13, 2002 - January 16 (2002) , pp. 841-857.
[52] L.K. Norford, S.B. Leeb, Non-intrusive electrical load monitoring in commercial buildings
based on steady-state and transient load-detection algorithms, Energy Build. 24 (1996) 51-64.
[53] M. Berges, E. Goldman, H.S. Matthews, L. Soibelman, K. Anderson, User-centered
nonintrusive electricity load monitoring for residential buildings, J. Comput. Civ. Eng. 25 (2011)
471-480.
[54] M. Baranski and J. Voss. Genetic algorithm for pattern detection in NIALM systems, 2004
IEEE International Conference on Systems, Man and Cybernetics (2004) , pp. 3462-8.
[55] K.D. Lee, Electric Load Information system based on non-intrusive power monitoring,
Thesis, Massachusetts Institute of Technology, Ph.D. Thesis - Massachusetts Institute of
Technology, Dept. of Mechanical Engineering (2003).
[56] S. N. Patel, T. Robertson, J. A. Kientz, M. S. Reynolds and G. D. Abowd. At the flick of a
switch: detecting and classifying unique electrical events on the residential power line, 9th
International Conference, UbiComp 2007 (2007) , pp. 271-88.
[57] S. Gupta, M. S. Reynolds and S. N. Patel. ElectriSense: Single-point sensing using EMI for
electrical event detection and classification in the home, 12th International Conference on
Ubiquitous Computing, UbiComp 2010, September 26, 2010 - September 29 (2010) , pp. 139-
148.
[58] W. L. Chan, A. T. P. So and L. L. Lai. Harmonics load signature recognition by wavelets
transforms, Electric Utility Deregulation and Restructuring and Power Technologies, 2000.
Proceedings. DRPT 2000. International Conference on (2000) , pp. 666-671.
[59] H.Y. Lam, G.S.K. Fung, W.K. Lee, A novel method to construct taxonomy of electrical
appliances based on load signatures, IEEE Transactions on Consumer Electronics. 53 (2007) 653-
60.
[60] K. Suzuki, S. Inagaki, T. Suzuki, H. Nakamura and K. Ito. Nonintrusive appliance load
monitoring based on integer programming, SICE 2008 - 47th Annual Conference of the Society of
Instrument and Control Engineers of Japan (2008) , pp. 2742-7.
[61] J. Lifton, M. Feldmeier, Y. Ono, C. Lewis and J. A. Paradiso. A platform for ubiquitous
sensor deployment in occupational and domestic environments, IPSN 2007: 6th International
Symposium on Information Processing in Sensor Networks, April 25, 2007 - April 27 (2007) , pp.
119-127.
[62] F. Jazizadeh and B. Becerik-Gerber. A Novel Method for Non Intrusive Load Monitoring of
Lighting Systems in Commercial Buildings, ASCE Workshop of Computing in Civil Engineering
(2012) .
[63] J. Wang and S. Wang. Wireless Sensor Networks for Home Appliance Energy Management
based on ZigBee technology, 2010 International Conference on Machine Learning and
Cybernetics, ICMLC 2010, July 11, 2010 - July 14 (2010) , pp. 1041-1044.
[64] D. Srinivasan, W.S. Ng, A.C. Liew, Neural-network-based signature recognition for
harmonic source identification, IEEE Trans. Power Del. 21 (2006) 398-405.
[65] M. Baranski and J. Voss. Detecting patterns of appliances from total load data using a
dynamic programming approach, Fourth IEEE International Conference on Data Mining (2004) ,
pp. 327-30.
[66] J. Liang, S. Ng, G. Kendall and J. Cheng. Load signature study - Part I: Basic concept,
structure and methodology, 2010 IEEE Power & Energy Society General Meeting (2010).
[67] Hyungsul Kim, Manish Marwah, Martin F. Arlitt, Geoff Lyon and Jiawei Han. Unsupervised
Disaggregation of Low Frequency Power Measurements, In Proceedings of the Eleventh SIAM
International Conference on Data Mining, SDM 2011, April 28-30, 2011, Mesa, Arizona, USA.
pages 747-758, SIAM / Omnipress, 2011.
[68] J. Zico Kolter, Tommi Jaakkola, Approximate Inference in Additive Factorial HMMs with
Application to Energy Disaggregation, In Proceedings of the International Conference on
Artificial Intelligence and Statistics, 2012 (2012).
[69] O. Parson, S. Ghosh, M. Weal and A. Rogers. Non-Intrusive Load Monitoring Using Prior
Models of General Appliance Types., AAAI (2012) .
[70] H. Kim, M. Marwah, M. Arlitt, G. Lyon and J. Han. Unsupervised disaggregation of low
frequency power measurements, 11th SIAM International Conference on Data Mining, SDM
2011, April 28, 2011 - April 30 (2011) , pp. 747-758.
[71] Enetics, http://www.enetics.com/. Last accessed on 08/23/2012.
[72] EIA, Residential Energy Consumption Survey,
http://arizonaenergy.org/Data/U.S.%20Household%20Electricity%20Report.htm, (last accessed:
February 2014) (2005).
[73] EIA, Heating and cooling no longer majority of U.S. home energy use,
http://www.eia.gov/todayinenergy/detail.cfm?id=10271&src=%E2%80%B9%20Consumption%2
0%20%20%20%20%20Residential%20Energy%20Consumption%20Survey%20(RECS)-b1, (last
accessed: February 2014) (2013).
[74] Energy Information Administration, The Residential Energy Consumption Survey (RECS).
2012 (2009).
[75] A. Rowe, M. Berges and R. Rajkumar. Contactless sensing of appliance state transitions
through variations in electromagnetic fields, 2nd ACM Workshop on Embedded Sensing Systems
for Energy-Efficiency in Buildings, BuildSys'10, November 2, 2010 - November 2 (2010) , pp.
19-24.
[76] N. Rajagopal, S. Giri, M. Berges, A. Rowe, Demo abstract: a magnetic field-based appliance
metering system (2013), pp. 307-308.
[77] S. Giri and M. Berges. A study on the feasibility of automated data labeling and training
using an EMF sensor in NILM platforms, in Proceedings of the 2012 International EG-ICE
Workshop on Intelligent Computing, Herrsching, Germany, 2012. (2012) .
[78] Y. Kim, T. Schmid, Z. M. Charbiwala and M. B. Srivastava. ViridiScope: Design and
implementation of a fine grained power monitoring system for homes, 11th ACM International
Conference on Ubiquitous Computing, UbiComp'09, September 30, 2009 - October 3 (2009) , pp.
245-254.
[79] A. Schoofs, A. Guerrieri, D. T. Delaney, G. O'Hare and A. G. Ruzzelli. Annot: Automated
electricity data annotation using wireless sensor networks, 7th Annual IEEE Communications
Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON 2010),
pp. 1-9.
[80] K. Anderson, A. Ocneanu, D. Benitez, D. Carlson, A. Rowe, and M. Berges, BLUED: A
Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research, in
Proceedings of the 2nd KDD Workshop on Data Mining Applications in Sustainability
(SustKDD), Beijing, China (2012).
[81] J. Z. Kolter and M. J. Johnson. REDD: A public data set for energy disaggregation research,
in Proceedings of the SustKDD Workshop on Data Mining Applications in Sustainability (2011),
pp. 1-6.
[82] M. E. Balci and M. H. Hocaoglu. Comparison of power definitions for reactive power
compensation in nonsinusoidal conditions, 11th International Conference on Harmonics and
Quality of Power (2004), pp. 519-524.
[83] J. L. Wyatt Jr and M. Ilic. Time-domain reactive power concepts for nonlinear, nonsinusoidal
or nonperiodic networks, IEEE International Symposium on Circuits and Systems (1990),
pp. 387-390.
[84] S.R. Shaw, S.B. Leeb, L.K. Norford, R.W. Cox, Nonintrusive load monitoring and
diagnostics in power systems, IEEE Transactions on Instrumentation and Measurement. 57 (2008)
1445-54.
[85] D. Luo, L.K. Norford, S. Leeb, S. Shaw, Monitoring HVAC equipment electrical loads from
a centralized location- methods and field test results., ASHRAE Trans. 108 (2002) 841-857.
[86] J.C. Dunn, Well-separated clusters and optimal fuzzy partitions, Journal of Cybernetics. 4
(1974) 95-104.
[87] U. Kaymak and R. Babuska. Compatible cluster merging for fuzzy modelling, Proceedings of
the 1995 International Joint Conference of the Fourth IEEE International Conference on Fuzzy
Systems and the Second International Fuzzy Engineering Symposium (1995), pp. 897-904.
[88] R. Krishnapuram, C. Freg, Fitting an unknown number of lines and planes to image data
through compatible cluster merging, Pattern Recognit. 25 (1992) 385-400.
[89] X. Xiong, K. L. Chan and K. L. Tan. Similarity-driven cluster merging method for
unsupervised fuzzy clustering, Proceedings of the 20th conference on Uncertainty in artificial
intelligence (2004) , pp. 611-618.
[90] K. Younis, M. Karim, R. Hardie, J. Loomis, S. Rogers and M. DeSimio. Cluster merging
based on weighted Mahalanobis distance with application in digital mammography, Proceedings
of the IEEE 1998 National Aerospace and Electronics Conference (NAECON 1998), pp. 525-530.
[91] D.M. Tax, R.P. Duin, Support vector data description, Mach. Learning. 54 (2004) 45-66.
[92] V. Vapnik, Statistical Learning Theory, Wiley, New York (1998).
[93] S. Abe, Training of support vector machines with Mahalanobis kernels, in: Artificial Neural
Networks: Formal Models and Their Applications–ICANN 2005, Springer, 2005, pp. 571-576.
[94] P.J. Rousseeuw, K.V. Driessen, A fast algorithm for the minimum covariance determinant
estimator, Technometrics. 41 (1999) 212-223.
[95] P.J. Rousseeuw, B.C. Van Zomeren, Unmasking multivariate outliers and leverage points,
Journal of the American Statistical Association. 85 (1990) 633-639.
[96] K.I. Penny, Appropriate critical values when testing for a single multivariate outlier by using
the Mahalanobis distance, Applied Statistics (1996) 73-81.
[97] P. Filzmoser, A multivariate outlier detection method (2004).
[98] M. Halkidi, Y. Batistakis, M. Vazirgiannis, On clustering validation techniques, J Intell
Inform Syst. 17 (2001) 107-145.
[99] Y. Liu, Z. Li, H. Xiong, X. Gao and J. Wu. Understanding of internal clustering validation
measures, 2010 IEEE 10th International Conference on Data Mining (ICDM) (2010), pp. 911-916.
[100] D.L. Davies, D.W. Bouldin, A cluster separation measure, IEEE Transactions on Pattern
Analysis and Machine Intelligence (1979) 224-227.
[101] L. Von Ahn, L. Dabbish, Designing games with a purpose, Commun ACM. 51 (2008) 58-
67.
[102] L. Von Ahn and L. Dabbish. Labeling images with a computer game, 2004 Conference on
Human Factors in Computing Systems - Proceedings, CHI 2004, April 24, 2004 - April 29 (2004)
, pp. 319-326.
[103] F. Jazizadeh, S. Ahmadi-Karvigh, B. Becerik-Gerber, L. Soibelman, Spatiotemporal
lighting load disaggregation using light intensity signal, Energy Build. 69 (2014) 572-583.