USC Computer Science Technical Reports, no. 952 (2015)
Time Series Clustering for Demand Response
An Online Algorithmic Approach
Ranjan Pal, Student Member, IEEE, Charalampos Chelmis, Member, IEEE, Marc Frincu, Member, IEEE,
Viktor Prasanna, Fellow, IEEE
Abstract
The widespread monitoring of electricity consumption due to increasingly pervasive deployment of networked sensors in urban
environments has resulted in an unprecedentedly large volume of data being collected. To improve sustainability in Smart Grids, real-
time data analytics challenges induced by high volume and high dimensional context-based data need to be addressed. Particularly,
with the emerging Smart Grid technologies becoming more ubiquitous, analytics for discovering the underlying structure of high
dimensional time series data are crucial to convert the massive amount of fine-grained energy information gathered from residential
smart meters into appropriate Demand Response (DR) insights. In this paper, we propose an online time series clustering approach to
systematically and efficiently manage the energy consumption data deluge, and also capture specific behavior, i.e., identify households
with similar lifestyle patterns. Customers can in this way be segmented into several groups that can be effectively used to enhance
DR policies for real time automatic control in the cyberphysical Smart Grid system. Due to the inherent intractability of the ‘optimal
clustering’ problem, we propose a novel randomized approximation clustering scheme of electricity consumption data, aiming at
addressing three major issues: (i) designing a resource-constrained, online clustering technique for high volume, high dimensional
time series data, (ii) determining the optimal number of clusters that gives the best approximate clustering configuration, and (iii)
providing strong clustering performance guarantees. By the term ‘performance guarantees’, we imply algorithm performance with
respect to the best clustering possible for the given data. Our proposed online clustering algorithm is time efficient, achieves a
clustering configuration that is optimal within provable worst case approximation factors, scales to large data sets, and is extensible
to parallel and distributed architectures. The applicability of our algorithm goes beyond that of the Smart Grid and includes any
scenario where clustering needs to be done on high volume and in real-time under space and time constraints.
Keywords
time series, online clustering algorithm, real-time analytics, Smart Grid, demand response
I. INTRODUCTION
The emergence of Cyber-Physical Systems (CPS) such as smart grids is leading to extensive digital sampling of power
networks [42]. This has been possible due to the pervasive deployment of sensing instruments that generate large volumes of
data with high velocity. The growing prevalence of smart meters, deployed in large numbers by US utilities for example, has
enabled the collection of energy consumption data from residential and commercial customers at unprecedented scales [37].
Specifically, energy consumption is sampled and reported back to the utility at 15-min granularity, leading to a 3,000× jump in
the volume of data traditionally collected by power utilities [39]. Such growing availability of energy consumption data offers
unique opportunities in understanding the dynamics on both sides of the meter; customer behavior on the consumption side
and operating requirements, planning, and optimization on the utility side. The data deluge, however, burdens utilities with the
challenging tasks of managing massive sets of fine-grained electricity consumption data and applying data analytics to perform
data-driven coordination of energy resources and efficiently deal with peak demand (e.g., by Demand Response (DR) programs
[30].) The high dimensionality of electricity consumption data in particular makes traditional machine learning and data mining
techniques computationally intractable [34]. In this introductory section we first briefly introduce the demand response mechanism
as a representative Smart Grid application, and explain why the related ‘predictive analytics of energy consumption’ problem is
important but at the same time intractable and challenging. We then state our proposed research contributions.
A. Demand Response in a Nutshell
With the emerging Advanced Metering Infrastructure (AMI) becoming more ubiquitous, unique opportunities for intelligent
management of the power grid arise in order to improve efficiency and reliability, mainly due to the ability to extract many
features from electricity consumption shapes. One canonical Smart Grid application where such feature information is useful
is Demand Response (DR). In this application, the utility uses power consumption data (e.g., peak usage, duration and time of
R. Pal is with the Department of Computer Science, University of Southern California, CA, USA. E-mail: rpal@usc.edu.
C. Chelmis is with the Department of Computer Science, University of Southern California. E-mail: chelmis@usc.edu.
M. Frincu is with the Department of Electrical Engineering, University of Southern California. E-mail: frincu@usc.edu.
V. Prasanna is jointly with the Departments of Electrical Engineering and Computer Science, University of Southern California. E-mail: prasanna@usc.edu.
day, etc.) at a micro time scale from individual customers in the service area to forecast the future demand and initiate energy
curtailment programs that can avoid a supply-demand mismatch. A step further is Dynamic DR (D²R), which moves away
from static decision making, allowing utilities to dynamically choose when and how to shed consumption, often in real time
with a latency of a few seconds.
with a latency of few seconds. These curtailment programs target individual customers whose power usage is expected to increase
in the near future, and in turn offer them incentives to shift their impending demand in a bid to relieve the stress on the power
grid. In addition to energy consumption curtailment, predicting consumption demand of consumers for the next few hours helps
a utility plan for additional generation, say, by using backed up stored energy, or purchasing power from the energy market.
B. Challenges to DR Implementation
Although effective demand response hinges on finding “appropriate” customers for the right incentive program, due to
heterogeneity in customer behavior the implementation of such programs can turn out to be very costly. Modeling each user separately
has advantages (e.g., personalized models may capture rich, individual characteristics leading to more accurate forecasts); however,
the computational resources required to process such massive data sets would be prohibitive for most utilities operating in the
public sector as the data generated by sampling a large population at 15-minute intervals is enormous. For example, electricity
consumption values collected at 15-minute intervals by large utilities such as the Los Angeles Department of Water and Power
that has over a million consumers [39] amounts to 42 billion entries per year.
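These figures can be sanity-checked with a quick back-of-the-envelope computation; the 1.2 million meter count below is our assumption for illustration (the paper states only “over a million” consumers):

```python
# Back-of-the-envelope data volume for a large utility (illustrative figures).
METERS = 1_200_000          # assumed meter count ("over a million" consumers)
READINGS_PER_DAY = 24 * 4   # one reading every 15 minutes -> 96 per day

entries_per_year = METERS * READINGS_PER_DAY * 365
print(f"{entries_per_year:,} entries/year")  # roughly 42 billion
```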
While parallel techniques (e.g., MPI, MapReduce) could be used to model each individual in parallel, finding the best model
requires 54 s per customer in an ideal parallel scenario [19] if 6 different models are required. Combined with the latency
needed to get the data and make a decision, we obtain a duration that is too long for the real-time (on the order of a few
seconds) decisions of dynamic demand response (D²R).
Further, customers vary widely in their energy usage patterns causing prediction models to have high errors. Aggregating data
from multiple customers into “virtual consumers” instead reduces the variability of each virtual customer, and thus reduces
the prediction error of the aggregated load of each cluster of customers [37, 39]. Intuitively, it is expected that people sharing
certain characteristics (e.g., similar lifestyles or household appliances) will generally exhibit similarities in their consumption
behavior. This hypothesis has been hard to validate in the past because of insufficient consumption data. With the advent of smart
meters, however, it has become possible to analyze finely granular consumption at scale, at the individual and group level. More
importantly, splitting users in different groups by characterizing individual consumption facilitates the application/recommendation
of diverse consumption curtailment strategies across the user base.
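The variance-reduction intuition behind virtual consumers can be illustrated with a toy simulation (ours, not the paper's experiment): each household's load is a shared daily shape plus independent noise, and the aggregate's relative deviation from its expected shape shrinks as households are summed.

```python
import random

random.seed(7)
INTERVALS, HOUSEHOLDS = 96, 200
base = [1.0 + 0.5 * (t % 24) / 24 for t in range(INTERVALS)]  # shared daily shape (kWh)

# Each household: shared shape plus independent zero-mean noise.
loads = [[b + random.gauss(0, 0.4) for b in base] for _ in range(HOUSEHOLDS)]

def rel_error(series, expected):
    """Root-mean-square deviation from the expected shape, relative to its mean level."""
    mse = sum((s - e) ** 2 for s, e in zip(series, expected)) / len(series)
    return (mse ** 0.5) / (sum(expected) / len(expected))

single = rel_error(loads[0], base)
aggregate = [sum(h[t] for h in loads) for t in range(INTERVALS)]
virtual = rel_error(aggregate, [HOUSEHOLDS * b for b in base])

print(f"single household: {single:.3f}, virtual consumer: {virtual:.3f}")
```

Under this model the aggregate's relative error shrinks roughly as 1/√(number of households), which is the rationale for predicting the load of clusters rather than individuals.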
C. High-Dimensional Time Series Clustering
Research Motivation: Previous efforts on customer segmentation for DR have resorted to the well known unsupervised learning
technique of clustering as their primary tool, because of its simplicity and effectiveness. These efforts have reported interesting
findings [13, 16, 17, 41, 24, 25] (refer to Section 2.), but did not analyze the properties of their algorithms nor did they provide
guarantees on the quality of the clustering results. Most clustering techniques in the above mentioned efforts depend heavily on
initial seeding mechanisms which might drive these mentioned algorithms to output bad segmentations [35].
Solution Insight: We propose a methodology to systematically form virtual customers in Smart Grids in a provably optimal
and efficient manner. The notion of efficiency is with respect to both computational time and space. Specifically, we observe N
high dimensional time series X_n, n = 1, ..., N, each representing the energy consumption of customer n at regular intervals
t (here, every 15 minutes), which we intend to cluster according to similarities in usage patterns. The high dimensionality of
the time series data is a direct result of the granularity at which a smart meter reports energy consumption. For consumption
values collected every 15 minutes, a daily observation of meter data consists of 96 dimensions. Formally, a daily
energy consumption time series is defined as c = {c_t}, t = 1, ..., 96, where c_t is the energy in kWh consumed in the t-th
15-min interval. For consumption values collected every 1 minute, the length of c increases to 1,440 dimensions. As
AMI technology enables transmission of even finer-grained consumption data (e.g., every second), understanding consumption
behavior requires appropriate statistical analysis to be used. More importantly, such approaches need to be scalable with the
number of smart meters and granularity, as well as time and space efficient to constitute computationally viable solutions.
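The dimension counts above follow directly from the reporting granularity; a one-line check:

```python
# Dimensionality of one day of meter data at a given reporting granularity.
def daily_dimensions(minutes_per_reading: int) -> int:
    return 24 * 60 // minutes_per_reading

print(daily_dimensions(15))  # 96 dimensions at 15-min granularity
print(daily_dimensions(1))   # 1440 dimensions at 1-min granularity
```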
We assume that the compute environment has limited memory which cannot store the entire large-scale, high-dimensional
time series data set obtained from smart meter readings. Thus, our objective is to divide all time series data points into zones of
similar data points in the best possible online manner (see Section 4.) and in the fastest possible way, where the time component
wise sum of the data points in each zone forms the virtual consumer data point for that zone.
Finding K representative groups that yield the least within-cluster distance is a standard clustering problem. The K-means
algorithm is a popular statistical clustering approach and as such has been extensively applied for time series clustering [13,
24, 25]. However, the number of clusters needs to be predetermined in the classical K-means algorithm. As it is hard to set
an appropriate value for K without proper reasoning for adopting a certain value, adaptive K-means algorithms
[25] have been proposed. Even though the traditional clustering problem is computationally intractable [15], we posit that it
is feasible to organize a massive corpus of high dimensional time series data into clusters near optimally, in polynomial time.
Specifically, we propose an unsupervised method to calculate the optimal clustering configuration for the given data set within
a constant factor approximation gap under computationally constrained environments.
D. Research Contributions
Our main contributions are theoretical in nature and specifically suited to use cases geared towards effective Smart Grid
demand response. However, our contributions are extendible to multiple scenarios where clustering needs to be done on high
volume data, in real-time fashion, and under joint resource constraints of time and space.
We design and analyze a batch randomized approximation algorithm that clusters a given set of high dimensional smart
meter time series data points in polynomial time. For any given data set, our algorithmic output approximates a clustering
configuration that is within a constant factor gap, i.e., O(1) gap, of the optimal clustering configuration. (see Section 4.)
Using the above-mentioned batch randomized approximation algorithm, we propose and analyze an online randomized
approximation algorithm that clusters a given set of potentially high dimensional time series data points into k pre-
specified clusters in polynomial time. Our algorithm can be deployed on a commodity machine with minimal memory
requirements. More specifically, the space complexity to store n very high dimensional data points, where n is on the order
of millions for a large metropolitan locality, is large. As a result, we might not have the entire data set at our disposal for
cluster computations, and blocks of time series data need to be fetched into main memory from time to time and provided as
inputs to our algorithm, which in turn appropriately updates the clustering configuration. For any given data set, our proposed
online clustering algorithm is based on a divide-and-conquer approach and approximates a clustering configuration that is
within O(logk) gap of the optimal clustering configuration, where k is the pre-specified number of clusters. (see Section
5.)
We extend existing literature on time series clustering in Smart Grids to give provable theoretical guarantees to the customer
segmentation problem using clustering, for which only heuristic solutions are known till date (refer to Section 2.).
The rest of the paper is organized as follows. In Section 2, we review related work on predictive analysis of time series
data. We discuss our problem setting and model preliminaries in Section 3. In Sections 4 and 5, we propose and analyze our
algorithms for batch and online clustering. In Section 6, we present early experimental results regarding the practical utility of
our approach. We conclude our paper in Section 7.
II. RELATED WORK
Recent years have seen an increasing interest in energy consumption, motivated by both energy markets and environmental
aspects. Of particular emphasis has been accurate prediction models of energy usage, as one of the key challenges in Smart Grid
is to reliably, accurately and efficiently predict future supply and demand trends [37, 11] to allow utility providers and consumers
to prepare for peak periods as well as to plan for DR activities. Research on electricity demand forecasting considers long-term
and medium-term prediction for utility planning and maintenance purposes, and short-term forecast for economic scheduling [3].
As utilities move towards D²R, very short-term predictions are required for real-time control.
Currently, US utilities have deployed millions of smart meters that collect energy consumption data from residential and
commercial customers [38]. Such growing availability of energy consumption data offers unique opportunities in designing
segmentation strategies of household energy use to support demand response and energy efficiency programs design and customer
targeting [33]. The introduction of smart meters has driven studies on high resolution time series modeling and customer clustering
[13, 16, 17, 41, 24, 25]. For example, self-organizing maps (SOM) and k-means are used for load pattern mining [17]. A variety
of clustering algorithms to segment consumers with similar consumption behavior have been examined in [13]. More closely
related to our methodology, [24, 25] decompose daily electricity usage time series into representative load shapes by utilizing
adaptive k-means, summarized utilizing hierarchical clustering. The final outcome of most consumer segmentation techniques
is sensitive to initial seeding mechanisms. More importantly, to the best of our knowledge, none of the prior work has thus far
provided performance guarantees with respect to the optimal segmentation possible [26].
Extensive literature exists on clustering of time series data [6, 26]. Typical computational issues include achieving the optimal
clustering configuration, which is computationally intractable, i.e., NP-Hard [15]. Static techniques such as relocation and
agglomerative hierarchical clustering have been applied on time series data, but as the number of dimensions grows with the
length of the time series, these techniques become computationally expensive [26]. Feature-based techniques, such as wavelets
[27], attempt to reduce the problem space by extracting motifs from the time series and replacing the time series with (a much
smaller number of) motifs, thus achieving dimensionality reduction.
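One simple instance of such feature-based dimensionality reduction is piecewise aggregate approximation (PAA), which replaces each fixed-size window of a series with its mean; this is our illustrative stand-in, not necessarily the wavelet/motif method of [27]:

```python
def paa(series, segments):
    """Piecewise aggregate approximation: mean of each of `segments` equal windows."""
    n = len(series)
    out = []
    for i in range(segments):
        lo, hi = i * n // segments, (i + 1) * n // segments
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out

day = [float(t % 8) for t in range(96)]   # a 96-dimensional daily reading
print(paa(day, 8))                        # reduced to 8 features
```

With equal windows the representation preserves the total consumption (the sum of the 8 means times the window length equals the sum of the original 96 values), while cutting the dimensionality by a factor of 12.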
An exact k-means algorithm [21] has running time O(n^{kd}), where d is the dimensionality of the data. Numerous polynomial
time approximation schemes [18, 23, 31, 36] have been proposed. Such schemes are, however, highly exponential (or worse) in
k and therefore impractical even for relatively small values of n, k, and d. The most popular k-means clustering algorithm [29]
is known to converge to a local optimum. Motivated by the popularity of this algorithm in a variety of scientific and industrial
applications [6], we base our proposed solution on Lloyd’s algorithm. Kanungo et al. [22] proposed an O(n³ ε^{−d}) algorithm
that is (9 + ε)-competitive. However, n³ compares unfavorably with the almost linear running time of Lloyd’s method, and the
exponential dependence on d can also be problematic. A combination of Kanungo et al.’s and Lloyd’s algorithms has also been
proposed, but in order to avoid the exponential dependence on d, this approach sacrifices all approximation guarantees. A constant
probability O(1) approximation with running time O(nkd) (but only if k is sufficiently large and the spread is sufficiently
small) was achieved using successive sampling [32]. Other O(1) approximation algorithms assume that data is independent and
identically distributed (i.i.d.) [14], or that the data can be clustered with well separated means [12, 35]. Finally, an O(nkd)
algorithm for generic data sets that is O(logk) competitive was proposed in [5].
The main drawback of the above mentioned techniques is their inability to operate under memory limitations. For compute
environments with limited memory, simple to implement online heuristics that process small batches of data or one
data object in each iteration have been proposed [28, 39]. Such approaches however provide no guarantees on the optimality
gap of their produced clustering configurations. Our proposed randomized online clustering algorithm leverages the work of
[28, 39, 1, 5, 20], while at the same time providing provable worst case performance guarantees with respect to the optimality of
cluster configuration. Specifically, (i) our online clustering algorithm runs in polynomial time, and (ii) it provides a cluster
configuration that is within O(logk) factor of the optimal clustering configuration.
III. PROBLEM SETTING AND PRELIMINARIES
In this section, we first state our problem setting followed by a description of the model preliminaries.
A. Problem Setting
We consider the setting where a fixed, large number of customers in a metropolitan locality exist, whose time series data
of energy consumption for a given period of time, e.g., a month granularized into 15-minute intervals, is known to the
local utility. The utility wants to utilize each consumer’s data to make energy consumption predictions with the goal of optimizing
DR. As mentioned in Sections 1 and 2, the utility wants to adopt an aggregate clustering technique, whereby virtual consumers
are formed from the given data set, where each virtual consumer represents a cluster, and the time series data for the virtual
consumer is the component wise aggregate of the time series data of each consumer within the cluster [39]. The goal is to
minimize uncertainty induced by variability of predictions for individual customers. We are therefore interested in minimizing
the cumulative consumption prediction error by predicting energy consumption of virtual consumers.
The utility needs to take care of the following important aspects regarding regressive predictions at fine grained intervals: (i)
the data per consumer is of high dimension, and (ii) the utility has limited memory (much smaller than the number of data
points) to take into account all the data in one go in order to perform cluster formation computations. On the other hand, the
cluster formation process should be optimal, fast, and practically lightweight; as a result, the utility can at most afford to make
a constant number of passes over the data as per its memory limitations, and execute computations satisfying speed requirements.
Only after the clustering process with existing data is completed can consumer aggregation and time series prediction methods
be used.
One could argue here that with the power of modern hardware and advances in distributed computing, memory and computational
power will not be an issue for performing smart grid analytics and in supporting cyber-physical systems (CPS)
services in general. While this may be true, it is also evident that (i) the rate of increase of the data deluge, with cyber-physical
systems becoming more and more popular in our day-to-day lives, might just be greater than the pace at which advanced
hardware can smoothly support the corresponding data volumes, (ii) certain small/medium scale organizations providing smart grid
services (e.g., microgrids) may not have the economic advantages to possess advanced hardware, and (iii) computing services
that co-locate many tenants (utilities) providing CPS services (in our case smart grid related computing services) and provide
utilities computing resources may have aggregate workloads that exceed their current capacities. Our proposed model setting
targets such use cases.
B. Model Preliminaries
The k-means clustering problem is defined as follows: given n points X ⊂ R^d and a weight function w : X → R¹, the goal
is to find a subset C ⊂ R^d, |C| = k, such that the following quantity is minimized:

φ_C = Σ_{x ∈ X} w(x) · D(x, C)²,

where D(x, C) denotes the l₂ distance of x to the nearest point in C. When the subset C is clear from the context, we denote
this distance by D(x). In our problem context, each point is a time series of energy consumption values for a consumer. The
dimension of each data point is the number of intervals for which the time series information is recorded. The number of
intervals is assumed to be known in advance and is homogeneous across all data points. Also, for two points x, y, D(x, y)
denotes the l₂ distance between x and y. The subset C is alternatively called a clustering of X, and φ_C is called the potential
function corresponding to the clustering. We will use the term center to refer to any c ∈ C.
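As a small illustration (our hypothetical helper names, unweighted case), the potential φ_C can be computed directly from this definition:

```python
def sq_l2(x, y):
    """Squared l2 distance between two equal-length time series."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def potential(X, C):
    """phi_C = sum over x of D(x, C)^2, with D(x, C) the l2 distance
    of x to the nearest center in C. Unweighted case: w(x) = 1 for all x."""
    return sum(min(sq_l2(x, c) for c in C) for x in X)

X = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]   # three 2-dimensional "time series"
C = [[0.5, 0.0], [10.0, 0.0]]               # a clustering with k = 2 centers
print(potential(X, C))  # 0.25 + 0.25 + 0.0 = 0.5
```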
We now state some definitions related to the algorithmics of clustering which will be used throughout the paper.
(Competitive ratio, b-approximation). Given an algorithm B for the k-means problem, let φ_C be the potential of the
clustering C returned by B (on some input set which is implicit), and let φ_{C_OPT} denote the potential of the optimal clustering
C_OPT. Then the competitive ratio is defined to be the worst case ratio φ_C / φ_{C_OPT}. The algorithm B is said to be a
b-approximation algorithm if φ_C / φ_{C_OPT} ≤ b.
¹For unweighted cases, w(x) = 1 for all x ∈ X.
The previous definition might be too strong a requirement for some purposes. For example, a clustering algorithm may perform
poorly when it is constrained to output k centers but become competitive when it is allowed to output more centers. In order
to relax the strength of the definition, we state the following alternative definition.
((a, b)-approximation). We call a clustering algorithm B an (a, b)-approximation for the k-means problem if it outputs a
clustering C with at most ak centers whose potential φ_C satisfies φ_C / φ_{C_OPT} ≤ b in the worst case, where a > 1, b > 1.
(Online algorithm). We adopt the flavor of the traditional definition of an online algorithm in the computer science literature,
i.e., the algorithm is unaware of the data size and the order of arrival of data, but change the definition a little for the purposes
of this paper, without violating any fundamental aspects of an online algorithm. More specifically, we call an algorithm online
if, for a given data set, (i) the size of the data set is significantly larger than the memory available to hold the data set, (ii) the
algorithm divides the whole data set into parts² and each part (unseen by the algorithm before) arrives as input to the algorithm
in stages, (iii) the algorithm makes a single scan of each part, does some intermediary computations on the part, but does not
store the part at the end of the scan, and (iv) does final computations at the end of seeing the whole data set by accounting
for the intermediary computations done earlier for each part.
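Structurally, an algorithm meeting this definition reduces to a single-scan driver over parts; the sketch below is ours, with hypothetical `summarize`/`combine` callables standing in for the intermediary and final computations developed in Sections 4 and 5:

```python
def online_process(parts, summarize, combine):
    """Single-scan driver: each part is seen once; only its summary is retained."""
    summaries = []
    for part in parts:                      # parts arrive in stages
        summaries.append(summarize(part))   # (iii) intermediary computation
        del part                            # the raw part is not stored
    return combine(summaries)               # (iv) final computation over summaries

# Toy instance: summaries are (sum, count); combine yields the global mean.
parts = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
mean = online_process(parts,
                      summarize=lambda p: (sum(p), len(p)),
                      combine=lambda s: sum(a for a, _ in s) / sum(n for _, n in s))
print(mean)  # 3.5
```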
IV. COMPETITIVE BATCH CLUSTERING
In this section we design a batch clustering algorithm, BTSC, that is based on the O(log k)-competitive k-means algorithm
in [5], but generates a clustering configuration that is O(1)-competitive with potentially ak clusters, for a given
k (i.e., we have an ak-means algorithm), where a > 1. We will use this algorithm in Section 5 to develop an online k-means
clustering algorithm. The first part of this section describes the algorithm, and the second part analyzes the algorithm.
A. The Algorithm
Our algorithm is very similar to the one in [5], except that instead of picking one center in every round we choose O(log k)
centers.
Algorithm 1: BTSC finds a batch clustering
Input: (a) Point set X ⊂ R^d of consumer time series data, with n = |X|, and (b) number of desired clusters, k ∈ N.
Output: A clustering configuration C
1  Choose 3 log k centers independently and uniformly at random from X
2  Repeat k − 1 times:
3    Choose 3 log k centers independently, each point x′ chosen w.p. D(x′)² / Σ_{x ∈ X} D(x)²
4  for i ← 1 to k do
5    Set the cluster C_i to be the set of points in X that are closer to c_i than they are to c_j for all j ≠ i
6  for i ← 1 to k do
7    Set c_i to be the center of mass of all points in C_i, i.e., c_i = (1/|C_i|) Σ_{x ∈ C_i} x
8  Repeat Steps 4 to 7 until C no longer changes
9  return C
B. Algorithmic Analysis
Before we analyze BTSC, we will need two results from [5]. We first state these results.
Lemma 1. Let A be an arbitrary cluster in C_OPT, and let C be the clustering with just one center, chosen uniformly at random
from A. Then (i) E[φ_C(A)] = 2 φ_{C_OPT}(A), and (ii) Pr[φ_C(A) < 8 φ_{C_OPT}(A)] ≥ 3/4.
Lemma 2. Let A be an arbitrary cluster in C_OPT, and let C be an arbitrary clustering. If we add a random center to C from
A, chosen with D² weighting, to get C′, then (i) E[φ_{C′}(A)] ≤ 8 φ_{C_OPT}(A), and (ii) Pr[φ_{C′}(A) < 32 φ_{C_OPT}(A)] ≥ 3/4.
Let A = {A_1, ..., A_k} denote the set of clusters in the optimal clustering C_OPT. Let C_i denote the clustering after the ith
round of choosing centers. Let A_c^i denote the subset of clusters of A such that

∀ A ∈ A_c^i, φ_{C_i}(A) ≤ 32 φ_{C_OPT}(A).
²In many traditional online settings, each part is a single data point.
Let this subset of clusters be called the ‘covered’ clusters. Let A_u^i = A \ A_c^i be the subset of ‘uncovered’ clusters. The following
lemma shows that with constant probability, step (1) of BTSC picks a center such that at least one of the clusters gets covered,
i.e., |A_c^1| ≥ 1. Let us call this event E. Then we have the following lemma.
Lemma 3. Pr[E] ≥ (1 − 1/k).
Proof. The proof follows from Lemma 2.
Let X_c^i = ∪_{A ∈ A_c^i} A and let X_u^i = X \ X_c^i. Now, after the ith round, either φ_{C_i}(X_c^i) ≤ φ_{C_i}(X_u^i) or otherwise.
In the former case, using Lemma 2, we show that the probability of covering an uncovered cluster in the (i + 1)th round is large.
In the latter case, we will show that the current set of centers is already competitive with a constant approximation ratio. Let us
start with the latter case.
Lemma 4. If event E occurs, i.e., |A_c^1| ≥ 1, and for any i > 1, φ_{C_i}(X_c^i) > φ_{C_i}(X_u^i), then φ_{C_i} ≤ 64 φ_{C_OPT}.
Proof. We get this result using the following sequence of inequalities:

φ_{C_i} = φ_{C_i}(X_c^i) + φ_{C_i}(X_u^i) ≤ 2 φ_{C_i}(X_c^i) ≤ 2 · 32 φ_{C_OPT}(X_c^i) ≤ 64 φ_{C_OPT}.
Lemma 5. If for any i ≥ 1, φ_{C_i}(X_c^i) ≤ φ_{C_i}(X_u^i), then Pr[|A_c^{i+1}| ≥ |A_c^i| + 1] ≥ (1 − 1/k).
Proof. We note that in the (i + 1)th round, the probability that a center is chosen from a cluster A ∉ A_c^i is at least

φ_{C_i}(X_u^i) / (φ_{C_i}(X_u^i) + φ_{C_i}(X_c^i)) ≥ 1/2.

Conditioned on this event, with probability at least 3/4, any of the centers x chosen in round (i + 1) satisfies φ_{C_i ∪ {x}}(A) ≤
32 φ_{C_OPT}(A) for some uncovered cluster A ∈ A_u^i. This further implies that with probability at least (1 − 1/k), at least one
of the chosen centers x in round (i + 1) satisfies φ_{C_i ∪ {x}}(A) ≤ 32 φ_{C_OPT}(A) for some uncovered cluster A ∈ A_u^i.
We now use Lemmas 4 and 5 to prove the main result of this section.
Theorem 1. BTSC is an (O(log k), O(1))-approximation algorithm.
Proof. From Lemma 3, we know that event E, i.e., |A_c^1| ≥ 1, occurs (with probability at least 1 − 1/k). Given this, suppose
for some i > 1, after the ith round, φ_{C_i}(X_c^i) > φ_{C_i}(X_u^i). Then from Lemma 4 we have φ_C ≤ φ_{C_i} ≤ 64 φ_{C_OPT}. If
no such i exists, then from Lemma 5 we get that the probability that there exists a cluster A ∈ A such that A is not covered even
after k rounds is at most 1 − (1 − 1/k)^k ≤ 3/4. So, with probability at least 1/4, the algorithm covers all the clusters in A. In
this case, from Lemma 5, we have φ_C = φ_{C_k} ≤ 32 φ_{C_OPT}.
Thus, we have shown that BTSC is a randomized clustering algorithm which, with probability at least 1/4, gives a clustering
with competitive ratio 64.
C. Optimal Number of Clusters
It is in some sense the norm to pre-assume the number of clusters before running any time-series clustering algorithm [26],
simply because computing the optimal number of clusters for a batch data set is a difficult problem. In this section we state a
heuristic method to find the optimal number of clusters given a batch time series data set. The motivation behind stating this
method is as follows: we have already proven via Theorem 1 that O(k log k) clusters are enough to guarantee a near-optimal
clustering for a given input k. Thus, between k and k log k (both boundaries inclusive), there exists an integer that is
the theoretical optimum value for the number of clusters for a given data set. Using proper heuristics one could pin down this
optimal value within a good accuracy level. Our stated heuristic method is based on the Silhouette Index in [4]; it estimates the
cluster cohesion (within- or intra-variance) and the cluster separation (between- or inter-variance) and combines them to compute
a quality measure. Extensive experimental results conducted in [4] indicate that the Silhouette Index is the best performing index.
This fact motivates us to recommend this index for finding the optimal number of clusters for a given time series data set.
The Silhouette Index for a given clustering configuration C is defined as follows:
Sil(C) = (1/N) Σ_{c_k ∈ C} Σ_{x_i ∈ c_k} [b(x_i, c_k) − a(x_i, c_k)] / max{a(x_i, c_k), b(x_i, c_k)},   (1)

where

a(x_i, c_k) = (1/|c_k|) Σ_{x_j ∈ c_k} D(x_i, x_j),   (2)

and

b(x_i, c_k) = min_{c_l ∈ (C − c_k)} { (1/|c_l|) Σ_{x_j ∈ c_l} D(x_i, x_j) }.   (3)
For a given time series data set, by trying each candidate number of clusters in the interval [k, k log k], we can compare the resulting clustering configurations and choose the value that gives the best Sil score.
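The k-selection heuristic above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a restarted Lloyd's k-means (`lloyd`) stands in for the clustering routine (the paper would plug in BTSC/OTSC), and all function names are ours. The Silhouette computation follows Eqs. (1)-(3) literally (so the cohesion term a averages over the whole cluster, including the point itself), and candidate cluster counts range over [k, ⌈k log k⌉] per Theorem 1.

```python
import numpy as np

def lloyd(X, k, iters=50, restarts=5):
    """Plain Lloyd's k-means with random restarts -- a stand-in clustering
    routine (an assumption; any (a,b)-approximation could be used)."""
    best_labels, best_cost = None, np.inf
    for seed in range(restarts):
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), k, replace=False)].astype(float)
        for _ in range(iters):
            labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
            for j in range(k):
                if (labels == j).any():
                    centers[j] = X[labels == j].mean(0)
        cost = ((X - centers[labels]) ** 2).sum()
        if cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_labels

def silhouette(X, labels):
    """Silhouette Index per Eqs. (1)-(3): mean over all points of
    (b - a) / max(a, b). Assumes at least two clusters."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    ks = np.unique(labels)
    scores = []
    for i in range(len(X)):
        own = labels[i]
        a = D[i, labels == own].mean()                              # Eq. (2), cohesion
        b = min(D[i, labels == c].mean() for c in ks if c != own)   # Eq. (3), separation
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def best_k(X, k, cluster_fn):
    """Scan candidate counts in [k, ceil(k log k)] (Theorem 1's range)
    and return the count with the best Sil score."""
    k_hi = max(k, int(np.ceil(k * np.log(k))))
    scored = [(silhouette(X, cluster_fn(X, c)), c) for c in range(k, k_hi + 1)]
    return max(scored)[1]
```

On data with clear group structure, the scan recovers the natural number of groups within the [k, k log k] window.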
Fig. 1: Conceptualizing the Divide and Conquer Approach. (A huge data stream S is split into batches S_1, ..., S_l; each batch S_i is clustered by an (a,b)-approximation into centers T_i; the weighted instance S_w of roughly akl points, formed by the union of all the T_i, is clustered by an (a',b')-approximation into the final centers T, which are compared against the optimal k-means solution T*.)
V. ONLINE TIME SERIES CLUSTERING
In the previous section, we devised a batch clustering algorithm that takes as input k, the number of desired clusters, and outputs a clustering C with at most a·k clusters such that C/C_OPT ≤ b = O(1) in the worst case. In this section, building on our batch clustering algorithm, we propose our online clustering algorithm. The first part of this section describes the online algorithm in detail, and the second part analyzes it. We conclude the section by describing a variant of our proposed time series clustering algorithm.
A. The Algorithm
We now present our online time series clustering algorithm, OTSC. Its underlying principle is a simple batch divide-and-conquer scheme, analyzed in [20] with respect to the k-medoid objective, which we use here to approximate the k-means objective in an online setting. In Section 4, we designed BTSC, a variant of the batch k-means algorithm that competitively clusters a given data set into a configuration C with a·k clusters such that C/C_OPT ≤ b = O(1). OTSC uses BTSC on small batches to process chunks of incoming data points, and once the entire data set has been scanned, combines the results of the intermediate processing to output the final clustering configuration. The main rationale behind the divide-and-conquer scheme is to combine efficient batch algorithms on small data sets into an efficient online algorithm for the entire data set.
Algorithm 2: OTSC finds a time series clustering
Input: (a) Point set S ⊆ R^d of consumer time series data; let n = |S|. (b) Number of desired clusters, k ∈ N. (c) A, a BTSC-type (a,b)-approximation algorithm for the k-means objective. (d) A', a BTSC-type (a',b')-approximation algorithm for the k-means objective.
Output: A clustering configuration, C
1 Divide S into groups S_1, S_2, ..., S_l
2 for i ← 1 to l do
3   Run A on S_i to get a·k centers T_i = {t_i1, t_i2, ...}
4   Denote the induced clusters of S_i as S_i1 ∪ S_i2 ∪ ...
5 S_w ← T_1 ∪ T_2 ∪ ... ∪ T_l, with weights w(t_ij) ← |S_ij|
6 Run A' on S_w to get a'·k centers T
7 return T
Here, for A we use the (3 log k, 64) randomized approximation algorithm k-means# of [1]. For A' we use the (1, O(log k)) randomized approximation algorithm k-means++, also from [1]. The basic idea behind OTSC is that, for a huge data stream, we read as much of it as fits into memory (call this portion S_1), solve this sub-instance, then read the next batch S_2, solve this sub-instance, and so on. At the end, the partial solutions are combined to obtain the full solution. A figurative explanation is given in Figure 1. Note that when every group S_i has size √(nk), the algorithm takes just one pass and uses O(√(nk)) memory.
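The control flow of Algorithm 2 can be sketched as follows. This is a minimal single-machine illustration under stated assumptions, not the paper's implementation: a restarted weighted Lloyd's k-means (`weighted_lloyd`) stands in for both approximation algorithms A (the paper uses k-means#) and A' (k-means++), and the batch size is left to the caller (the paper suggests √(nk)).

```python
import numpy as np

def weighted_lloyd(X, w, k, iters=50, restarts=5):
    """Lloyd's k-means on weighted points, with restarts -- a stand-in
    for the (a,b)-approximation algorithms A and A' (an assumption)."""
    best, best_cost = None, np.inf
    for seed in range(restarts):
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), k, replace=False, p=w / w.sum())].astype(float)
        for _ in range(iters):
            labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
            for j in range(k):
                m = labels == j
                if m.any():
                    centers[j] = np.average(X[m], axis=0, weights=w[m])
        cost = float((w * ((X - centers[labels]) ** 2).sum(-1)).sum())
        if cost < best_cost:
            best_cost, best = cost, centers
    return best

def otsc(batches, k, a=1):
    """Divide-and-conquer skeleton of Algorithm 2: cluster each batch S_i
    into ~a*k centers, weight each center by its cluster size |S_ij|,
    then cluster the weighted union S_w down to k centers."""
    centers, weights = [], []
    for S in batches:                                        # one pass over the stream
        T = weighted_lloyd(S, np.ones(len(S)), a * k)        # A on batch S_i
        labels = ((S[:, None] - T[None]) ** 2).sum(-1).argmin(1)
        for j in range(len(T)):
            size = int((labels == j).sum())                  # w(t_ij) = |S_ij|
            if size:
                centers.append(T[j])
                weights.append(size)
    Sw = np.array(centers)                                   # weighted instance S_w
    return weighted_lloyd(Sw, np.array(weights, float), k)   # A' on S_w
```

Each batch is touched once, and only the weighted centers (about a·k per batch) are retained between batches, which is what keeps the memory footprint sublinear in n.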
B. Algorithm Analysis
We have the following result on the performance guarantee of algorithm OTSC.
Theorem 2. For any (a,b)-randomized approximation algorithm A and (a',b')-randomized approximation algorithm A', Algorithm OTSC outputs a clustering that is an (a', 2b + 4b'(b + 1))-approximation to the k-means objective.
Proof. Let T* = {t*_1, ..., t*_k} be the optimal set of k means for the data set S. Let t*(x) ∈ T* be the mean closest to point x. Likewise, let t(x) ∈ T be the point in T closest to x, and t_i(x) be the point in T_i closest to x. Since we divide the data into subsets S_i, we will need to talk about the costs of clustering these subsets as well as the overall cost of clustering S. We define cost(S', T') to be the cost of the means T' for the data S'. Thus, cost(S', T') equals Σ_{x∈S'} D(x, T')² when the data points are unweighted, and equals Σ_{x∈S'} w(x)·D(x, T')² when the points are weighted. Here, D(x, T') is the distance of x to the closest point in the set T'.
The a' part of the theorem statement is obvious. The rest of the theorem statement will be proved over the following lemmas.
Lemma 6. cost(S, T) ≤ 2·Σ_{i=1}^l cost(S_i, T_i) + 2·cost(S_w, T).

Proof. Recalling that S_w consists of all the means t_ij, with weights w(t_ij) = |S_ij|, we have

cost(S, T) = Σ_{i=1}^l Σ_{x∈S_i} D(x, T)² ≤ Σ_{i=1}^l Σ_{x∈S_i} (D(x, t_i(x)) + D(t_i(x), T))²,

or

cost(S, T) ≤ 2·Σ_{i=1}^l Σ_{x∈S_i} D(x, t_i(x))² + 2·Σ_{i=1}^l Σ_{x∈S_i} D(t_i(x), T)²,

or

cost(S, T) ≤ 2·Σ_{i=1}^l cost(S_i, T_i) + 2·Σ_{i=1}^l Σ_j |S_ij|·D(t_ij, T)²,

or

cost(S, T) ≤ 2·Σ_{i=1}^l cost(S_i, T_i) + 2·cost(S_w, T).

Hence, we have proved Lemma 6.
The next lemma says that when clustering a data set S', picking centers from S' itself is at most four times as bad as picking centers from the entire underlying metric space Ω.

Lemma 7. For any S' ⊆ Ω, we have

min_{T'⊆S'} cost(S', T') ≤ 4·min_{T'⊆Ω} cost(S', T').
Proof. Let T'* be the optimal solution chosen from Ω. For each induced cluster of S', replace its center t' ∈ T'* by the closest neighbor of t' in S'. This at most quadruples the cost: first apply the triangle inequality, and then use the fact that (α + β)² ≤ 2α² + 2β² (owing to the squared distances in the k-means objective). Hence we have proved Lemma 7.
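The snapping argument in this proof can be checked numerically. The sketch below (our illustration, not part of the paper) replaces arbitrary centers by their nearest neighbors in the data set and verifies the factor-4 bound on the k-means cost: D(x, nn(t')) ≤ D(x, t') + D(t', nn(t')) ≤ 2·D(x, t'), and squaring gives the factor 4.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # the data set S'
T = rng.normal(size=(5, 2)) * 2.0      # arbitrary centers from the ambient space

def cost(X, T):
    # k-means cost: sum of squared distances to the nearest center
    return float(((X[:, None] - T[None]) ** 2).sum(-1).min(1).sum())

# snap each center to its closest neighbor in S'
nn = X[((T[:, None] - X[None]) ** 2).sum(-1).argmin(1)]

assert cost(X, nn) <= 4 * cost(X, T)   # Lemma 7's bound holds
```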
Our final goal is to upper-bound cost(S, T), and we will do so by bounding the two terms on the right-hand side of Lemma 6. Let us start with the first of them. We would certainly hope that Σ_i cost(S_i, T_i) is smaller than cost(S, T*); after all, the former uses far more representatives (about a·k·l of them) to approximate the same set S. We now give a coarse upper bound to this effect.
Lemma 8.

Σ_{i=1}^l cost(S_i, T_i) ≤ b·cost(S, T*).

Proof. Each T_i is a b-approximation solution to the k-means problem for S_i. Thus

Σ_{i=1}^l cost(S_i, T_i) ≤ Σ_{i=1}^l b·min_{T'} cost(S_i, T'),

or

Σ_{i=1}^l cost(S_i, T_i) ≤ Σ_{i=1}^l b·cost(S_i, T*) = b·cost(S, T*).

Hence we have proved Lemma 8.
We now bound the second term on the right-hand side of Lemma 6.
Lemma 9.

cost(S_w, T) ≤ 2b'·(Σ_{i=1}^l cost(S_i, T_i) + cost(S, T*)).
Proof. Since T is the output of the b'-approximation algorithm A' on S_w, we have

cost(S_w, T) ≤ b'·min_{T'} cost(S_w, T') ≤ b'·cost(S_w, T*).

It is therefore enough to upper-bound cost(S_w, T*), which we do by applying the triangle inequality. We have

cost(S_w, T*) = Σ_{i,j} |S_ij|·D(t_ij, T*)²,

where

Σ_{i,j} |S_ij|·D(t_ij, T*)² ≤ Σ_{i,j} Σ_{x∈S_ij} (D(x, t_ij) + D(x, t*(x)))²,

or

Σ_{i,j} |S_ij|·D(t_ij, T*)² ≤ 2·Σ_{i,j} Σ_{x∈S_ij} D(x, t_ij)² + 2·Σ_{i,j} Σ_{x∈S_ij} D(x, t*(x))²,

or

cost(S_w, T*) ≤ 2·(Σ_{i=1}^l cost(S_i, T_i) + cost(S, T*)).

Combining the two bounds, cost(S_w, T) ≤ 2b'·(Σ_{i=1}^l cost(S_i, T_i) + cost(S, T*)), which proves Lemma 9.
The proof of Theorem 2 follows immediately by putting together Lemmas 6, 8, and 9: cost(S, T) ≤ 2·Σ_i cost(S_i, T_i) + 2·cost(S_w, T) ≤ 2b·cost(S, T*) + 4b'·(Σ_i cost(S_i, T_i) + cost(S, T*)) ≤ (2b + 4b'(b + 1))·cost(S, T*).
We now proceed to analyze the time and space complexity of our algorithm. Note from algorithm OTSC that A is the k-means# algorithm, run on the data 3 log n times independently, picking the clustering with the smallest cost; A' is a single run of the k-means++ algorithm.
Algorithmic Time and Space Analysis: From the analysis and results in Section 4 for algorithm BTSC, with probability at least 1 − (3/4)^{3 log n} ≥ 1 − 1/n, algorithm A is a (3 log k, 64)-approximation algorithm. Moreover, the space requirement remains logarithmic in the input size. In step (3) of algorithm OTSC, we run A on batches of data. Since each batch is of size √(nk), the number of batches is √(n/k), and the probability that A is a (3 log k, 64)-approximation algorithm for all the batches is at least (1 − 1/n)^{√(n/k)} ≥ 1/2. Conditioned on this event, the divide-and-conquer strategy gives an O(log k)-approximation algorithm. The memory required is O(log k · √(nk)) times the logarithm of the input size. Moreover, the algorithm has a running time of O(d·n·k·log n·log k).
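The amplification step above, repeating a success-probability-1/4 algorithm 3 log n times and keeping the cheapest clustering, can be checked numerically. This assumes a base-2 logarithm, which the inequality requires: (3/4)^{3 log₂ n} = n^{3 log₂(3/4)} ≈ n^{-1.245} ≤ 1/n.

```python
import math

# failure probability after 3*log2(n) independent repetitions of an
# algorithm that succeeds with probability >= 1/4
for n in [10, 100, 10_000, 10**6]:
    fail = 0.75 ** (3 * math.log2(n))
    assert fail <= 1 / n   # boosted success probability is at least 1 - 1/n
```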
C. A Variant to OTSC
In a particular variant of OTSC, we could incrementally output a k-clustering configuration after processing each batch of data; i.e., after seeing batch i, using the intermediate computations for batches 1 to i − 1, we could output a k-clustering configuration. Since (1 − 1/n)^p ≥ 1/2 for any p in the range [1, √(n/k)], the optimality guarantees for this variant of OTSC are the same as those for OTSC. The running time of this variant is O(l·d·n·k·log n·log k) for l batches of data. We have the following result for the variant algorithm, V-OTSC.
Theorem 3. For any (a,b)-randomized approximation algorithm A and (a',b')-randomized approximation algorithm A', Algorithm V-OTSC outputs a clustering in O(l·d·n·k·log n·log k) time that is an (a', 2b + 4b'(b + 1))-approximation to the k-means objective.
Proof. The proof is very similar to that of Theorem 2.
We now state the algorithm.
Algorithm 3: V-OTSC finds a time series clustering
Input: (a) Point set S ⊆ R^d of consumer time series data; let n = |S|. (b) Number of desired clusters, k ∈ N. (c) A, a BTSC-type (a,b)-approximation algorithm for the k-means objective. (d) A', a BTSC-type (a',b')-approximation algorithm for the k-means objective.
Output: A clustering configuration, C
1 Divide S into groups S_1, S_2, ..., S_l
2 for i ← 1 to l do
3   for j ← 1 to i do
4     Run A on S_j to get a·k centers T_j = {t_j1, t_j2, ...}
5     Denote the induced clusters of S_j as S_j1 ∪ S_j2 ∪ ...
6   S_w ← T_1 ∪ T_2 ∪ ... ∪ T_i, with weights w(t_ij) ← |S_ij|
7   Run A' on S_w to get a'·k centers T
8 return T
VI. EXPERIMENTAL EVALUATION
The theoretical analysis of the proposed online algorithm shows that, in the worst case, it performs better than existing heuristic approaches. However, this says nothing about its average behavior on real-life data sets. Hence, we empirically evaluate the performance of our online time series clustering approach (see Section V) in practice, for scalable prediction of energy consumption in Smart Grids. It is well known that while the consumption of a single consumer is hard to forecast [40], that of a large group of users can be accurately forecast [24, 25]. Aggregation can dramatically reduce the overall uncertainty in customers' consumption by splitting customers into groups whose aggregate consumption is represented by virtual customers [24, 25]. When forming the aggregates, customers can be grouped arbitrarily [39], based on geographical information [25, 16, 33], or based on some formal cost function [39, 25]. For example, PG&E divides Northern California into several zones [25]. As there might be significant behavioral variation even within a single zip code, it is important to optimize the clustering process to obtain the "best" results.
Here, we propose the use of our online clustering technique (see Section V) to group customers into virtual customers, with the goal of minimizing the cumulative consumption prediction error. Time series forecasting models such as ARIMA [40] can be trained on historical consumption data aggregated from multiple customers to predict the near future based on real-time data transmitted by smart meters. We use ARIMA for prediction as it has been shown to outperform other time series forecasting models for short-term prediction of electricity consumption [40]. The goal of our incremental clustering approach is to scale, computationally, the problem of grouping customers whose cumulative energy consumption behaves in a more predictable way and minimizes the ARIMA prediction error for the cluster as a whole.
Our numerical simulations are based on synthetic time series data generated according to the process described in [2]. Specifically, we produce representative data for 50 smart meters over a three-month period at 15-minute intervals. Each interval is a dimension; thus, each data point (an individual user's power consumption time series) has 3 × 30 × 96 = 8,640 dimensions. We used the first two months of data per smart meter as a training set for the standard Auto-Regressive Integrated Moving Average (ARIMA) method for time series predictive analysis, and the last month's data as an evaluation set for calculating the Mean Absolute Percentage Error (MAPE) of the ARIMA model. We show results for 3 sets of synthetic data of 50 smart meters each. Our experiments were run using MATLAB. We empirically analyze clustering performance and prediction accuracy. Specifically, (i) we compare the clustering performance (clustering cost) of our proposed algorithm with a traditional batch k-means method, and (ii) we compare the average MAPE obtained from our clustering method with that obtained by the method proposed in [39]. Our work, while validated for this vital CPS application domain, is applicable to scalable clustering of other large-scale time series data sets.
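The rationale for aggregation can be illustrated with a toy computation (ours, not the paper's experiment): a naive persistence forecast, standing in for ARIMA, is evaluated by MAPE on individual noisy consumers and on their aggregated "virtual customer". The consumer-level noise averages out in the aggregate, lowering its MAPE.

```python
import numpy as np

def mape(actual, forecast):
    # Mean Absolute Percentage Error, in percent
    return float(100.0 * np.mean(np.abs((actual - forecast) / actual)))

rng = np.random.default_rng(0)
t = np.arange(2 * 96)                                  # two days at 15-minute intervals
base = 5.0 + np.sin(2 * np.pi * t / 96)                # shared daily load shape (kW, illustrative)
consumers = base + rng.normal(0.0, 0.5, (50, len(t)))  # 50 noisy consumers

day1, day2 = consumers[:, :96], consumers[:, 96:]

# persistence forecast: predict day 2 with day 1's profile
individual_mape = float(np.mean([mape(day2[i], day1[i]) for i in range(50)]))
aggregate_mape = mape(day2.sum(axis=0), day1.sum(axis=0))  # one "virtual customer"

assert aggregate_mape < individual_mape   # aggregation reduces forecast error
```

Clustering serves the same purpose at a finer granularity: each cluster's aggregate is a virtual customer whose consumption is easier to forecast than that of its individual members.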
In the worst case, our algorithm will always perform better than the worst-case behavior of the heuristic (refer to Section 5.2, Theorem 2). However, the theory does not comment on non-worst-case behavior, and worst cases might rarely arise unless an intelligent adversary (refer to Section 7) is able to enforce them fairly often. One of our main motivations for the empirical evaluation is to gauge the extent of the predictive performance improvement of our proposed online time-series clustering algorithm over recent heuristic approaches [39] that performed well on large, general, non-synthetic data sets under non-worst-case scenarios. The rationale is that better performance of our algorithm over a heuristic that performed well on real data sets would indicate that our algorithm also outperforms the proposed heuristics on real data sets, irrespective of the initial seeding configuration.
Fig. 2: Clustering Performance Comparison in a 50-Smart Meter Scenario (a) Iteration 1 (left), (b) Iteration 2 (middle), and (c)
Iteration 3 (right)
Fig. 3: Average MAPE Performance Comparison in a 50-Smart Meter Scenario (a) Iteration 1 (left), (b) Iteration 2 (middle),
and (c) Iteration 3 (right)
Our results in Figure 2 show that the clustering performance of the OTSC algorithm is better than that of the traditional batch k-means algorithm in most cases, where performance is measured by the cost incurred by a clustering algorithm. Note that the TSC algorithm in the plots refers to the online clustering algorithm, i.e., the OTSC algorithm in our paper. In fact, with an increasing number of clusters, OTSC increasingly outperforms the batch k-means algorithm. Figure 3 shows that the average MAPE is lower with our proposed OTSC algorithm than with the incremental algorithm proposed in [39]. In addition, there is a linear increase of the average MAPE with the number of clusters under TSC, compared to the non-linear increasing pattern observed in [39]. The reason for the increase in MAPE with an increasing number of clusters is identical to the one given in [39]: with fewer consumers per cluster, the average MAPE is higher than when there are more consumers per cluster and fewer clusters.
VII. DISCUSSION
In this section we discuss two important topics related to our clustering work: (a) the role of an adversary in the clustering
process, and (b) the impact of parallel and distributed architectures on online clustering.
A. The Role of an Adversary
An Added Motivation: In Section 1, we stated the importance of quality guarantees with respect to clustering, and motivated our work on the grounds of the bad initial seeding possible with the clustering approaches of existing works (refer to Section 2). However, it is also quite possible for a malicious adversary to hack the mechanism of existing segmentation algorithms in a way that results in an unboundedly bad segmentation. Previous research efforts based on clustering heuristics fail to address adversarial issues, a major drawback given the increasing exposure of cyber-physical systems to adversarial attacks. Put more clearly, existing solutions work well in trustworthy environments, but looking forward they might not stand up to potent adversaries. Thus, an added motivation is to design clustering methods for cyber-physical applications that can theoretically guarantee worst-case performance in the presence of adversaries.
The Way Adversaries Work: In a recent series of works [7][8], the authors describe ways adversaries can manipulate data (in our problem, time series data) before it enters the clustering engine of a utility. The main idea in these works is that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated so as to be hidden within existing clusters without excessively altering the initial clustering. The analysis in those papers focuses on the single- and complete-linkage hierarchical clustering algorithms, but the attack methods extend to other clustering algorithms as well.
B. The Role of Parallel and Distributed Computing
While the focus of this paper is a polynomial-time online clustering algorithm for limited-memory systems, it can be argued that existing parallel and distributed systems such as clouds do not face this constraint and can be used as a viable alternative with any existing parallel clustering algorithm. The rationale lies in the fact that the processing of the time series batches S_i in Figure 1 can easily be done in parallel using state-of-the-art techniques such as streaming MapReduce [9, 43] for data streams. In addition, the possible bottleneck caused by aggregating the results into S_w could be addressed by elastic solutions that scale out depending on the system's load [10]. Our online algorithm can easily be integrated into such a system, providing an efficient means of processing data stream chunks in parallel that is faster than existing heuristics in a worst-case scenario.
VIII. CONCLUSION AND FUTURE WORK
In this paper, we proposed an efficient online time series clustering approach to perform predictive analysis of the energy consumption of consumers in the Smart Grid for the purposes of effective demand response. Compared to previous approaches, our clustering method provides provable guarantees on the optimality of clustering, and is applicable to Big Data pertaining to large metropolitan localities. Given that an adversary can maliciously manipulate the functioning of existing customer segmentation algorithms to induce poor clustering configurations, providing performance guarantees on the optimality of clustering is an important problem. We adopted tools from the theory of randomized approximation algorithms to design our clustering method, which also accounts for high-dimensional, high-volume data. We showed that for a given number of clusters it is possible to obtain a clustering configuration that is within O(log k) of the optimal clustering configuration. Through initial experiments, we showed that our algorithm outperforms existing algorithms in the literature in terms of reducing MAPE, even in situations where existing algorithms are not victim to an adversary or to bad initial seeding. In addition, we provided a mechanism to find the particular number of clusters that gives the best approximate clustering configuration for a given time series data set.
In this paper, we ran initial experiments on synthetically generated data sets to validate that our theory not only addresses worst-case adversarial scenarios but also outperforms existing heuristics in terms of predictive analysis under non-adversarial settings and when the initial seeding for existing heuristics is not bad. As part of future work, we plan to extend our experimental setup to include real data from a utility company, and to run more detailed experiments to characterize the impact of our algorithm on predictive performance. Given a time horizon, we provided a method to cluster data for a given number of consumers during that horizon. An interesting extension would be to design a smart way to update the clusters for incoming data on future intervals, for a given fixed set of consumers. This case arises much more frequently than the case of clustering new consumers for a given fixed time horizon. We plan to address this issue as part of future work.
ACKNOWLEDGEMENT
This material is based upon work supported by the United States Department of Energy under Award Number DE-OE0000192, and the Los Angeles Department of Water and Power (LA DWP). The views and opinions of the authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof, the LA DWP, or any of their employees.
REFERENCES
[1] N. Ailon, R. Jaiswal, and C. Monteleoni. Streaming k-means approximation. In NIPS, 2009.
[2] R. J. Alcock and Y. Manolopoulos. Time-series similarity queries employing a feature-based approach. In Hellenic Conference on Informatics, 1999.
[3] Hesham K. Alfares and Mohammad Nazeeruddin. Electric load forecasting: literature survey and classification of methods. International Journal of Systems Science, 33(1), 2002.
[4] O. Arbelaitz, I. Gurrutxaga, J. Muguerza, J. M. Perez, and I. Perona. An extensive comparative study of cluster validity indices. Pattern Recognition, 46, 2013.
[5] D. Arthur and S. Vassilvitskii. k-means++: The advantages of careful seeding. In SODA, 2007.
[6] P. Berkhin. Survey of Clustering Data Mining Techniques. Technical Report, Accrue Software, San Jose, 2002.
[7] B. Biggio, I. Pillai, S. R. Bulo, D. Ariu, M. Pelillo, and F. Roli. Is data clustering in adversarial settings secure? In AISec, 2013.
[8] B. Biggio, I. Pillai, S. R. Bulo, D. Ariu, M. Pelillo, and F. Roli. Poisoning behavioral malware clustering. In AISec, 2014.
[9] Andrey Brito, Andre Martin, Thomas Knauth, Stephan Creutz, Diogo Becker, Stefan Weigert, and Christof Fetzer. Scalable and low-latency data processing with stream mapreduce.
In Proceedings of the 2011 IEEE Third International Conference on Cloud Computing Technology and Science, CLOUDCOM ’11, pages 48–58. IEEE Computer Society, 2011.
[10] Raul Castro Fernandez, Matteo Migliavacca, Evangelia Kalyvianaki, and Peter Pietzuch. Integrating scale out and fault tolerance in stream processing using operator state
management. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, SIGMOD ’13, pages 725–736. ACM, 2013.
[11] S.C. Chan, K. M. Tsui, H. C. Wu, Yunhe Hou, Yik-Chung Wu, and F.F. Wu. Load/price forecasting and managing demand response for smart grids: Methodologies and challenges.
Signal Processing Magazine, IEEE, 29(5):68–85, 2012.
[12] K. Chaudhuri and S. Rao. Learning mixtures of product distributions using correlations and independence. In COLT, 2008.
[13] G. Chicco, R. Napoli, and F. Piglione. Comparisons among clustering techniques for electricity customer classification. IEEE Transactions on Power Systems, 21, 2006.
[14] S. B. David and U. Luxburg. Relating clustering stability to properties of cluster boundaries. In COLT, 2008.
[15] P. Drineas, A. Frieze, R. Kannan, S. Vempala, and V. Vinay. Clustering large graphs via the singular value decomposition. Machine Learning, 56(1-3), 2004.
[16] G. Chicco et al. Load pattern-based classification of electricity customers. IEEE Transactions on Power Systems, 21, 2006.
[17] S. Verdu et al. Classification, filtering, and identification of electrical customer load patterns through the use of self-organizing maps. IEEE Transactions on Power Systems, 21, 2006.
[18] W. Fernandez, M. Karpinski, C. Kenyon, and Y. Rabani. Approximation schemes for clustering problems. In ACM STOC, 2003.
[19] Marc Frincu, Charalampos Chelmis, Muhammad Usman Noor, and Viktor K. Prasanna. Accurate and efficient selection of the best consumption prediction method in smart grids. In Proc. IEEE International Conference on Big Data. IEEE, 2014.
[20] S. Guha, A. Meyerson, N. Mishra, R. Motwani, and L. O. Callaghan. Clustering data streams: Theory and practice. IEEE Transactions on Knowledge and Data Engineering,
2003.
[21] M. Inaba, N. Katoh, and H. Imai. Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering. In Symposium on Computational Geometry, 1994.
[22] T. Kanungo, D. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu. A local search approximation algorithm for k-means clustering. Computational Geometry, 28(2), 2004.
[23] A. Kumar, Y. Sabharwal, and S. Sen. A simple linear time (1+ε)-approximation algorithm for k-means clustering in any dimensions. In FOCS, 2004.
[24] J. Kwac, C-W.Tan, N. Sintov, J. Flora, and R. Rajagopal. Utility customer segmentation based on smart meter data: An empirical study. In IEEE SmartGridComm, 2013.
[25] J. Kwak, J. Flora, and R. Rajagopal. Household energy consumption segmentation using hourly data. IEEE Transactions on Smart Grid, 5(1), 2014.
[26] T. Warren Liao. Clustering of time series data - a survey. Pattern Recognition, 38(11), 2005.
[27] J. Lin, M. Vlachos, E. Keogh, and D. Gunopulos. Iterative Incremental Clustering of Time Series. Springer, 2004.
[28] Y. Liu, Q. Guo, L. Yang, and Y. Li. Research on incremental clustering. In IEEE CECNet, 2012.
[29] S. P. Lloyd. Least squares quantization in pcm. IEEE Transactions on Information Theory, 28(2), 1982.
[30] J. L. Mathieu, D. S. Callaway, and S. Kiliccote. Variability in automated responses of commercial buildings and industrial facilities to dynamic electricity prices. Energy and Buildings, 43(12), 2011.
[31] J. Matousek. On approximate geometric k-clustering. Discrete and Computational Geometry, 24(1), 2000.
[32] R. R. Mettu and C. G. Plaxton. Optimal Time Bounds for Approximate Clustering. Morgan Kaufmann, 2002.
[33] S. Moss. Market segmentation and energy efficiency program design. In CIEE Energy and Behavior Program, 2009.
[34] A. Mueen, E. Keogh, Q. Zhu, S. Cash, and B. Westover. Exact discovery of time-series motifs. In SIAM International Conference on Data Mining, 2009.
[35] R. Ostrovsky, Y. Rabani, L. J. Schulman, and C. Swamy. The effectiveness of Lloyd-type methods for the k-means problem. In FOCS, 2006.
[36] S. Har-Peled and S. Mazumdar. On coresets for k-means and k-median clustering. In ACM STOC, 2004.
[37] S. D. Ramchurn, P. Vytelingum, A. Rogers, and N. R. Jennings. Putting the smarts into the smart grid: A grand challenge for artificial intelligence. Communications of the ACM, 55(4), 2012.
[38] Juan Shishido. Smart meter data quality insights. In ACEEE Summer Study on Energy Efficiency in Buildings, 2012.
[39] Y. Simmhan and M. U. Noor. Scalable prediction of energy consumption using incremental time series clustering. In IEEE International Conference on Big Data, 2013.
[40] James W Taylor, Lilian M de Menezes, and Patrick E McSharry. A comparison of univariate methods for forecasting electricity demand up to a day ahead. International Journal
of Forecasting, 22(1):1–16, 2006.
[41] G. Tsekouras, N. Hatziargyriou, and E. Dialynas. Two-stage pattern recognition of load curves for classification of electricity customers. IEEE Transactions on Power Systems, 22, 2007.
[42] X. Yu, C. Cecati, T. Dillon, and M. G. Simoes. The new frontiers of smart grids. IEEE Industrial Electronics Magazine, 5(3), 2011.
[43] Matei Zaharia, Tathagata Das, Haoyuan Li, Scott Shenker, and Ion Stoica. Discretized streams: An efficient and fault-tolerant model for stream processing on large clusters. In Proceedings of the 4th USENIX Conference on Hot Topics in Cloud Computing, HotCloud'12, pages 10–10. USENIX Association, 2012.