Novel and efficient schemes for security and privacy issues in smart grids
(USC Thesis Other)
Novel and Efficient Schemes for Security and Privacy Issues in Smart Grids

by Xingze He

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

August 2013

Copyright 2013 Xingze He

Dedication

To my family.

Acknowledgements

I would like to thank my advisor, Professor C.-C. Jay Kuo, for his patient guidance and wisdom. I would also like to thank my mentor, Dr. Man-On Pun, for his vision and help.

Table of Contents

Dedication
Acknowledgements
List of Figures
List of Tables
Abstract
Chapter 1 Introduction
  1.1 Significance of the Research
  1.2 Review of Previous Work
  1.3 Contribution of the Research
  1.4 Organization of the Dissertation
Chapter 2 Background
  2.1 Overview of Smart Grid System
  2.2 Security and Privacy Issues in Smart Grid
    2.2.1 Cyber Security
    2.2.2 Data Privacy
  2.3 Guidelines and Standards
    2.3.1 NISTIR 7628
    2.3.2 IEEE STD 1711-2010
Chapter 3 Homomorphic Encryption-based Secure System
  3.1 Introduction
  3.2 Background
    3.2.1 Conventional Encryption Schemes
      3.2.1.1 Symmetric Key Encryption
      3.2.1.2 Public Key Encryption
    3.2.2 Homomorphic Encryption
  3.3 Proposed Homomorphic Encryption-based Secure System
    3.3.1 System Framework
    3.3.2 Secure Data Exchanging and Processing Mechanisms
      3.3.2.1 Key Generation and Distribution
      3.3.2.2 Uplink Communication
      3.3.2.3 Homomorphical Processing
      3.3.2.4 Downlink Communication
      3.3.2.5 Key Revocation
    3.3.3 Practical System with Extended Quadratic Encryption
      3.3.3.1 Extension of Goh's Quadratic Encryption Schemes
      3.3.3.2 Privacy-preserving Data Aggregation
      3.3.3.3 Privacy-preserving Statistical Analysis
  3.4 Properties of Proposed Secure System
    3.4.1 Security
    3.4.2 Efficiency
  3.5 Conclusion
Chapter 4 Privacy-preserving Metering Scheme
  4.1 Introduction
  4.2 Background
    4.2.1 AMI in smart grid
    4.2.2 Non-Intrusive Load Monitoring
  4.3 Privacy-preserving Metering Scheme for Smart Grid
    4.3.1 System Model and Assumptions
    4.3.2 Proposed Metering Schemes
      4.3.2.1 Reading Distortion
      4.3.2.2 Power Consumption Distribution Reconstruction
      4.3.2.3 Aggregated Billing
    4.3.3 Security
      4.3.3.1 Linear Filter Attack
      4.3.3.2 Non-local Mean Filter Attack
  4.4 Experimental Results
    4.4.1 Datasets
      4.4.1.1 Power Curve Dataset
      4.4.1.2 Smart Grid Simulator
    4.4.2 Results and Discussion
  4.5 Conclusion
Chapter 5 Power Quality Monitoring Using Change-Point Detection
  5.1 Introduction
  5.2 Background
    5.2.1 Power Quality Events
    5.2.2 Conventional Techniques
  5.3 Proposed Algorithm
    5.3.1 Problem Formulation
    5.3.2 Signal Statistical Modeling Methods
      5.3.2.1 Generic Modeling
      5.3.2.2 Event-Specific Modeling
    5.3.3 Uncertainty Modeling
    5.3.4 Weighted CUSUM-based Scheme
    5.3.5 Multiple-Sensor Detection
      5.3.5.1 MBQCUSUM Scheme
      5.3.5.2 MVWCUSUM Scheme
  5.4 Results and Performance Analysis
  5.5 Conclusion
Chapter 6 Conclusion and Future Work
  6.1 Summary of the Research
  6.2 Future Research Topics
Reference List

List of Figures

2.1 Illustration of the smart grid conceptual model [31].
2.2 Bump-in-the-wire SSPP deployment.
3.1 Major domains in smart grid
3.2 Public key cryptosystem.
3.3 Proposed data exchange scheme in smart grid.
3.4 Overview of data transmission and operation
3.5 Key generation and distribution procedure.
3.6 Operations in the uplink communication.
3.7 Extended packet format.
3.8 Data processing and analysis.
3.9 Operations in the downlink communication.
3.10 Signing process.
3.11 Verification process.
3.12 Data aggregation in a smart grid.
3.13 Statistical analysis conducted by utilities.
3.14 Statistical analysis conducted by the third party.
4.1 Power usage to personal activity mapping [34].
4.2 System Model
4.3 Disturbed Power Consumption Trace
4.4 Parallel Processing
4.5 Results of Denoising Attacks
4.6 Features extracted after linear filter
4.7 Features extracted after non-local mean filter
4.8 Power Consumption Distribution Reconstruction
4.9 Centric Processing vs Parallel Processing
4.10 MSE under different class numbers
4.11 MSE vs Customer Number
4.12 MSE under different SNR with a fixed number of customers
5.1 Examples of Power Quality Events
5.2 Multi-Sensor Scenario
5.3 FAR Improvement
5.4 SimPowerSystems Model for Power Quality Events
5.5 Measurements of Voltage Sags by 3 Sensors
5.6 Illustration of a voltage transient event.
5.7 The temporal-frequency plot using STFT w.r.t. a transient event.
5.8 Spectral estimates with MUSIC w.r.t. a transient event.
5.9 Sample-by-sample RMS of a transient event as a function of time (SNR = 20 dB).
5.10 The logarithm of the weighted likelihood ratio of a transient event using CUSUM (SNR = 20 dB).
5.11 The MSE versus the CUSUM threshold in a transient event (SNR = 20 dB).
5.12 The MSE as a function of the SNR value for CUSUM and RMS in a transient event.
5.13 Illustration of a voltage sag event.
5.14 Sample-by-sample RMS of a sag event as a function of time (SNR = 20 dB).
5.15 The logarithm of the weighted likelihood ratio of a sag event using CUSUM (SNR = 20 dB).
5.16 MSE performance as a function of SNR for CUSUM and RMS in a sag event.
5.17 Performance of different modeling methods
5.18 Performance of different uncertainty modeling methods
5.19 MBQCUSUM vs MVWCUSUM

List of Tables

2.1 Time Latency Requirements
2.2 Cipher Suites in SSPP
3.1 Comparison of two schemes
4.1 Sensitive Information Derived from Power Consumption Trace
4.2 Number of Features Extracted with NILM
5.1 Mathematical Models of Major Power Quality Events

Abstract

The past years have witnessed the fast development of smart grids all over the world. The introduction of digital communication technologies into the power system makes smart grids more efficient and intelligent. In the meantime, however, wide security and privacy concerns arise due to the increasing system complexity. Without proper protection, smart grids are extremely vulnerable to various attacks, such as conventional physical damage and emerging cyber attacks. On the other hand, even tiny system faults, if not detected and resolved in real time, could lead to large-scale power outages with unexpected losses. Besides, customers' privacy is also severely threatened by the provision of fine-grained power consumption data in smart grids. Motivated by these concerns, three novel schemes from different technical perspectives are proposed in this dissertation.

For the first topic, an efficient homomorphic encryption-based system was proposed for securing data transmission, data sharing and operations among different parties. In this work, we first proposed a system framework tailored for homomorphic encryption techniques, which have great potential to secure data, enable privacy-preserving data sharing and thereby improve the overall efficiency of smart grids.
Based on the proposed system framework, we then designed a practical system with an extended partially homomorphic encryption scheme. With homomorphic features, we prove that the designed system well supports privacy-preserving data aggregation and power consumption statistical analysis in smart grids.

For the second topic, a metering scheme was proposed to protect customers' privacy. In this work, a reading distortion scheme was first designed to distort smart meter data in a way that only data senders (i.e., customers) are able to access the original power consumption data. With distorted power consumption data, an aggregated billing mechanism was then proposed to guarantee accurate billing service. For power consumption analysis and prediction, we designed a distribution reconstruction algorithm to recover the original power consumption distribution from distorted power consumption data. To show the security of the scheme, two potential attacks were investigated theoretically. Experimental results on real-world power consumption data are discussed at the end.

For the third topic, a power quality monitoring scheme using change-point detection techniques was investigated. After modeling the pre-event and post-event power signals, we proposed a weighted CUSUM algorithm to detect common power quality events, i.e., sags, transients, swells and harmonics. With experimental results, we compared the proposed scheme with conventional power quality monitoring techniques and concluded that the proposed scheme is superior. We also extended the scheme to a distributed version for the multi-sensor scenario. The proposed MVWCUSUM scheme is compared with the recent MBQCUSUM scheme in terms of detection latency and robustness.

Chapter 1 Introduction

1.1 Significance of the Research

Compared with the legacy power system, much more emphasis has been placed on security and privacy issues in smart grid. The reason is straightforward: the more complex a system is, the more vulnerabilities it has.
Envisioned as a combination of physical power system, information system and communication system, smart grid is becoming an extraordinarily complex system with potentially numerous security vulnerabilities and privacy threats which are fairly easy to monetize. For instance, manipulation of energy costs is quite common in the legacy power system, with an estimated loss of $6 billion in the U.S. [33]. Without any doubt, this situation will become worse in smart grid, because customers can not only save money through energy cost manipulation, but can also earn money by trading fabricated electricity. More seriously, adversaries and terrorists can easily launch large-scale cyber attacks on smart grid with potentially unpredictable consequences. In addition, the physical power system has to face various power quality problems which may lead to cascading blackouts. Statistics have shown that power outages and power quality problems cost at least $150 billion each year in the U.S. Furthermore, with the increasing demand for real-time monitoring, control and management, customers are required to report a much more fine-grained power consumption profile (every second or minute), which could be exploited by malicious people to learn profiles of customers' daily activities. Therefore, customers' privacy becomes another concern.

In view of these concerns, security and privacy have been regarded as two of the most imperative issues in current and future deployment of smart grid. The goal of security and privacy in smart grid is to maintain confidentiality, integrity and availability (CIA) as well as to protect customers' data privacy. More specifically, the envisioned smart grid system must be capable of preventing sensitive data from being exposed to unauthorized parties, or from being tampered with by others. Timely access to and use of data by authorized parties must be guaranteed as well. For privacy, the future smart grid system must ensure that the use of data will not compromise individual privacy.
To achieve the above goal, comprehensive effort from different areas including policy, law and regulations, standardization, industry and academia has to be made. The research work discussed in this dissertation mainly focuses on security and privacy issues in smart grid.

1.2 Review of Previous Work

Security and privacy issues in smart grid have been well discussed in guidelines published by the NIST Cyber Security Working Group (CSWG) [39-41] in 2010. In [2], IEEE also puts an emphasis on the importance of data privacy in the smart grid interoperability issue. Besides the high-level guidelines, some specific technical schemes have been proposed to address security and privacy issues in smart grid.

In 2010, Metke and Ekl [29] discussed smart grid security technologies including public key infrastructure and trusted computing. In the same year, Khurana et al. also discussed smart grid security issues in [22] with an emphasis on the complexity and scalability of key management. They pointed out that a conventional key management scheme (i.e., PKI) is not able to meet the requirements of smart grid, in which there are typically millions of devices to manage.

In 2011, two privacy-preserving data aggregation schemes were proposed and discussed in detail [16,23]. With an advanced cryptographic scheme or a data masking method, customers' power consumption data can be well protected during data transmission. However, these schemes only address the privacy concern arising from data aggregators in the aggregation process. Data sharing is still unsolved. Also, the aggregated information is merely the sum of the total power consumption, which may not be sufficient for accurate dynamic pricing in smart grid. To solve the privacy concern about billing, in 2010, [30] proposed a protocol using zero-knowledge proofs to privately derive and prove the correctness of bills. However, without reporting customers' power consumption data, power consumption analysis becomes impossible.
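As a rough illustration of the data-masking style of aggregation mentioned in this paragraph, the following minimal Python sketch shows generic additive masking: each meter adds a random mask to its reading, and the masks are chosen to sum to zero, so the aggregator learns only the total. This is a textbook construction given under stated assumptions, not the specific scheme of [16] or [23]. The names `MODULUS`, `make_masks`, `mask_reading` and `aggregate` are hypothetical, and the sketch assumes integer readings and some trusted way of distributing the masks to the meters.

```python
import secrets

# Hypothetical working modulus; readings are assumed to be small
# non-negative integers (e.g. watt-hours) whose sum stays below it.
MODULUS = 2**32


def make_masks(num_meters, modulus=MODULUS):
    """Return one random mask per meter; the masks sum to 0 mod `modulus`.

    Assumption: a trusted dealer (or a pairwise key agreement between
    meters) hands each meter its mask without showing it to the aggregator.
    """
    masks = [secrets.randbelow(modulus) for _ in range(num_meters - 1)]
    masks.append((-sum(masks)) % modulus)  # last mask cancels the others
    return masks


def mask_reading(reading, mask, modulus=MODULUS):
    """What a single meter reports: its reading hidden by its mask."""
    return (reading + mask) % modulus


def aggregate(masked_readings, modulus=MODULUS):
    """What the aggregator computes: the masks cancel, leaving the sum."""
    return sum(masked_readings) % modulus


# Usage: four meters report masked readings; only the total is recovered.
readings = [120, 340, 95, 410]
masks = make_masks(len(readings))
masked = [mask_reading(r, m) for r, m in zip(readings, masks)]
total = aggregate(masked)
```

The sketch also makes the limitation noted above concrete: the aggregator recovers only `total`, so any analysis that needs more than the sum of consumption is not supported by this kind of scheme.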
In [13], an anonymization scheme was proposed in which each customer is assigned a pseudo-name. In the scheme, two databases are introduced: one is used to store customers' power consumption traces and their pseudo-names; the other one is used to store the pseudo-names and the real identities. The former is made public for data sharing while the latter must be confidential. The first problem of this scheme is which party could be trusted to manage the confidential database. In addition, in 2011, [20] showed that data mining and pattern recognition techniques can be used to break the anonymization scheme by building a connection from the customer's pseudo-name to the real identity.

In [28], published in 2011, a class of algorithms and systems called Non-Intrusive Load Leveling (NILL) was proposed to combat privacy invasion on metering data. In NILL, an in-residence battery is introduced to mask the variance of load on the grid. In this way, the exposure of the power consumption trace is eliminated. One of the major problems of this scheme is the cost of the battery, which is estimated at approximately $1000 per year. Also, with only smoothed power consumption data, real-time power consumption analysis is hard to accomplish. As a result, power consumption prediction and dynamic pricing may fail.

1.3 Contribution of the Research

In this dissertation, we proposed three novel schemes that address the security and privacy issues of the smart grid system.

1. Homomorphic encryption-based secure system

We proposed a system framework of using homomorphic encryption in smart grid to secure data transmission, data sharing and operations among different parties. Under the assumption that all parties except the newly introduced PKC-HE system are untrusted, the proposed scheme protects data privacy to the highest degree. On the basis of the proposed system framework, a practical secure system was designed based on an extended partially homomorphic encryption scheme.
Two important applications, privacy-preserving data aggregation and power consumption statistical analysis, are well supported.

2. Privacy-preserving metering scheme

We proposed a metering scheme in which smart meter data is distorted in a way that recovery becomes difficult. We proved that current load analysis technology (NILM) fails to derive any valuable information relating to customers' daily activities from the distorted power consumption trace. We also examined two potential denoising attacks. Both denoising attacks are proved ineffective, empirically and theoretically. We proposed a power consumption distribution reconstruction algorithm to estimate the original distribution of power consumption from the distorted power consumption data. For billing service, an aggregated billing mechanism was designed. The proposed scheme is validated by experimental results on real-world power consumption data.

3. Power quality monitoring using change-point detection

We statistically modeled the pre-event and post-event power signals for four common power quality events, namely voltage sags, transients, swells and harmonics, for detection. Two modeling methods, generic modeling and event-specific modeling, are proposed. Based on the statistical model, a weighted CUSUM algorithm is proposed to detect power quality events from a single sensor measurement. We extended the scheme to the multi-sensor scenario and proposed the MVWCUSUM scheme. With experimental results, we compared the performance of the proposed scheme with conventional detection schemes such as RMS, STFT and MBQCUSUM. We concluded that the proposed scheme is superior to conventional schemes in terms of detection delay.

1.4 Organization of the Dissertation

The rest of the dissertation is organized as follows. In Chapter 2, the background of this research is reviewed. It gives an overview of smart grid and an introduction to the security and privacy issues of the system.
In Chapter 3, we discuss the homomorphic encryption-based secure system. The privacy-preserving metering scheme is studied in Chapter 4. In Chapter 5, we investigate the power quality monitoring scheme based on change-point detection techniques. Finally, concluding remarks and future research issues are discussed in Chapter 6.

Chapter 2 Background

2.1 Overview of Smart Grid System

Smart grid, combining the physical power system with information and communication systems, is a large complex system. As shown in Fig. 3.1, a typical smart grid is composed of seven major domains: power generation, power transmission, power distribution, service provider, market, operation and customers. These domains closely interact with each other to make the overall power system more efficient and intelligent. In this section, we will briefly highlight the applications and actors in each domain.

Figure 2.1: Illustration of the smart grid conceptual model [31].

Power Generation: unlike conventional power plants, the power generation domain covers any equipment that generates and stores electricity utilizing conventional and renewable energy sources. In addition, the power generation domain is also responsible for monitoring the system condition and sending information to the operations domain. In the event of faults, the power generation domain will react to protect its equipment.

Power Transmission: the power transmission domain transfers bulk electricity at high voltages from generators to distribution over long distances. Transmission lines are connected through substations. Furthermore, substations equipped with devices such as transformers, circuit breakers, phasor measurement units (PMU) and relays are responsible for step-up/down voltage transformation, switching and voltage control. In particular, relays are employed to monitor circuit conditions and trip circuit breakers in the event of faults or overload.
In the smart grid, the power transmission domain is expected to maintain system-level stability by exploiting its fault-tolerance and self-healing capability.

Power Distribution: the power distribution domain bridges the power transmission domain with customers through distribution transformers and field capacitor banks. The importance of the distribution domain should not be overlooked since it is responsible for 90% of power outages [10]. Similar to the transmission domain, the distribution domain is equipped with protective devices such as relays. Furthermore, in contrast to the existing power grid where the electricity and information flows are unidirectional, the smart grid supports two-way electricity and information exchanges between the power distribution domain and customers.

Customers: customers are the end users of electricity. However, unlike conventional users who can only consume electricity and provide information, customers in the smart grid can more effectively manage their electricity consumption based on demand response by generating and storing electricity as well as feeding electricity back to the grid. Based on the energy required, the customer domain is further divided into three sub-domains, namely the residential, commercial and industrial sub-domains. The customer domain communicates with other domains via the smart meters.

Markets: the market domain provides a platform for operators and participants in electricity markets to trade electricity. A reliable market domain is critical in efficiently bridging the gap between energy supply and energy consumption.

Service Providers: the service provider domain provides services to both electrical customers and utilities. The scope of its services ranges from installation and maintenance to billing and account management.

Operations: the operation domain is the manager of the whole power system and ensures its smooth operation.
Thus, it interacts with all six other domains by exchanging information and control signals. Some of the most important applications of the operation domain include monitoring, control and fault management.

2.2 Security and Privacy Issues in Smart Grid

2.2.1 Cyber Security

In smart grid, each domain shown in Fig. 3.1 is composed of a certain number of actors, each of which may be a sub-system, application, device or other participant in the grid. Each actor exchanges data with other actors within the same domain or across different domains. In such a large and highly interconnected complex system, cyber security becomes a critical issue, especially when a public communication network (e.g., the Internet) is introduced and adopted in smart grid.

Compared with conventional physical attacks, a cyber attack, which is not constrained by distance, is generally less risky, cheaper and much easier to coordinate and replicate. With a little basic knowledge about the structure and operation of the network, adversaries are able to launch various attacks from wherever they are, using only an interconnected computer or even just a smart phone. Many incidents indicate the seriousness of the cyber security issue. In March 2007, researchers at the Department of Energy's Idaho lab launched an experimental cyber attack which caused a generator to self-destruct [21]. In 2008, evidence showed that hackers had penetrated power systems and caused a power outage affecting multiple cities [19]. Besides, viruses, worms and other malware form another group of cyber security issues in smart grid. In 2009, for example, a security consulting firm showed a simulation in which 15,000 out of 22,000 homes had their smart meters taken over by a worm within only 24 hours [15]. More recently, researchers have created a worm that spreads between smart meters [33].

In order to mitigate these concerns, cyber security in smart grid aims at maintaining the availability, integrity and confidentiality of the entire system.
It is worth noting that, unlike information and communication systems, the power system is more concerned with availability than with integrity and confidentiality. However, with the increasingly involved interactions among components of smart grid, massive amounts of sensitive data are produced and exchanged. Confidentiality is becoming more and more important in the development of smart grid. More detailed discussion on these factors is given below:

1. Availability

Simply put, availability requires the system to provide the right information to the right people within the right time period. Take the well-known blackout in 2003 as an example: although the incident was initially caused by an equipment problem, the ongoing and cascading failures were primarily due to availability issues. In smart grid, many factors may cause traffic congestion or message delay, thereby causing availability issues. Typical examples are Denial-of-Service attacks, malware and device malfunction. In view of these concerns, different requirements on communication time latency are listed in Table 2.1.

Table 2.1: Time Latency Requirements

Scenarios                                                               Time Latency Requirements
Protective Relaying                                                     4 ms
Wide-area Situational Awareness Monitoring                              Sub-seconds
Substation and Feeder SCADA Data                                        Seconds
Monitoring Noncritical Equipment and Some Market Pricing Information    Minutes
Meter Reading and Longer-term Market Pricing Information                Hours
Long-term Data (e.g. Power Quality Information)                         Days/Weeks/Months

2. Integrity

Message integrity is another critical factor in the reliable operation of the smart grid system. Without effective protection, it is not difficult for users to modify their power consumption data recorded by the smart meter to reduce their power bills. If malicious hackers take advantage of this to send modified power demand messages, the utility may generate and deliver more than enough electricity to the system.
This may either cause a waste of electricity or compromise many sensitive digital devices. More seriously, a large-scale attack might be mounted by fabricating control data to take control of smart meters. All controlled smart meters could then be switched off simultaneously.

3. Confidentiality

Confidentiality in smart grid prevents transmitted data from being exposed to unauthorized parties, which has two main benefits. First, it makes it difficult for adversaries to intercept, understand and analyze the network data. For example, with an effective encryption scheme, an adversary who does not have access to the secret key is not able to decrypt any intercepted ciphertext to get the corresponding plaintext. Second, it is critical to protecting users' privacy, which we will discuss later in detail.

For availability issues, a variety of monitoring tools and techniques can be adopted in the smart grid system to detect attacks and abnormal activities or conditions, such as intrusion detection systems, intrusion prevention systems, malicious code protection systems and network monitoring systems. Compared to enterprise systems, a control system has a relatively stable number of users, a limited number of protocols and regular communication patterns. All these features may simplify the design and implementation of the above monitoring tools and techniques.

Cryptography-based schemes are usually used to cope with integrity and confidentiality issues. Due to the expanding scale of the system and the limited resources (CPU, memory, bandwidth) of devices, techniques that conventionally work well in computer networks may not perform well in smart grid systems. Many challenges and issues have to be addressed as smart grid evolves. Specific adaptations or novel algorithms have to be designed for smart grid.

2.2.2 Data Privacy

To achieve high intelligence and efficiency, smart meters in smart grid are required to provide detailed customer power consumption data every minute or second.
This transformation from conventional aggregate data to granular data inevitably raises broad privacy concerns. Using energy signature analysis tools such as Non-intrusive Load Monitoring (NILM), one can easily figure out the exact running times of different appliances from the fine-grained power consumption trace. Malicious parties can even derive sensitive information about a customer's daily activities. For example, the number of cycles of the water heater indicates the number of people living in the house, the energy cycle of the TV shows whether the house is occupied or not, and the energy signature of the coffee pot or the toaster reflects when the homeowner wakes up. Besides, charging data of plug-in electric vehicles (PEVs) may be used to track an individual's travel times and the PEV owner's locations. Many third-party companies may be interested in this kind of sensitive data for commercial benefit. Such privacy concerns exist in various operations in the smart grid, such as data monitoring, aggregation, analysis and sharing.

For data privacy issues, relevant laws and regulations have to be enacted first. They must make clear who owns the data, who has the right to access users' data, how the utility or other parties may share users' data in a privacy-guaranteed way, how the huge amount of sensitive data is to be managed, and so forth. Secondly, technical methods must be designed to protect customers' privacy with efficiency and scalability in mind.

2.3 Guidelines and Standards

2.3.1 NISTIR 7628

NISTIR 7628, namely, Guidelines for Smart Grid Cyber Security, is a three-volume report which presents an analytical framework to help organizations develop effective cyber security strategies [39-41].
The guideline was developed in 2010 by the NIST Cyber Security Working Group (CSWG) of the Smart Grid Interoperability Panel (SGIP), which consists of more than 400 participants including federal agencies, regulatory organizations, standards organizations, vendors and service providers, manufacturers and academia.

The first volume of the guideline begins with a discussion of the cyber security strategy used by the SGIP-CSWG for the development of the guideline. It examines both domain-specific and common requirements of smart grid cyber security, covering a series of tasks such as risk assessment and requirements identification and modification. The analytical approach described in this part can serve as guidance for organizations to identify high-level cyber security requirements for their own systems. The first volume then proposes a smart grid conceptual model followed by a detailed logical reference model. Each interface in the logical reference model is analyzed and assigned an appropriate impact level (high, moderate or low) with regard to security. Based on that, high-level security requirements are thoroughly analyzed and discussed for all aspects of the system. The first volume concludes by addressing cryptographic and key management issues along with potential alternatives.

The second volume, focusing on privacy issues, first discusses privacy impact assessment and mitigating factors in the smart grid. Besides potential privacy issues, this volume also gives high-level recommendations for privacy solutions.

The third volume provides abundant supporting analyses and references, in which potential vulnerabilities of the smart grid are discussed in detail. A number of security problems that do not have specific solutions are also identified, and research and development themes for smart grid cyber security are discussed as well.
For information and communication systems, selected high-level cyber security requirements are listed below:

1. Communications Partitioning: communications partitioning requires the management communications to be physically or logically separated from the telemetry/data acquisition services communications.
2. Denial-of-Service Protection: denial-of-service protection requires the smart grid to mitigate or limit the impact of all kinds of DoS attacks.
3. Communication Integrity and Communication Confidentiality: this part requires organizations to employ cryptographic mechanisms to protect the integrity of information and to prevent its unauthorized disclosure during transmission.
4. Collaborative Computing: collaborative computing requires the system to design mechanisms including video and audio conferencing capabilities or instant messaging technologies.
5. Message Authenticity: message authenticity provides authenticity of the messages and devices in communication. It is used to protect against malformed traffic, misconfigured devices and malicious entities, and it is suggested that it be implemented at the protocol level.
6. Honeypots: to detect, analyze and track attacks, honeypots are designed to be the target of various attacks so that attack data can be collected for analysis.
7. Others

Table 2.2: Cipher Suites in SSPP

Number   Cipher Suite                    Comments
0x0004   AES + CTR mode + HMAC-SHA256    Used for dynamic sessions
0x0005   AES + PE mode + HMAC-SHA256     Used for dynamic sessions with a session clock
0x0006   cleartext + SHA256              No security provided
0x0008   cleartext + HMAC-SHA256         Used for dynamic sessions
0x000A   AES + CBC mode + HMAC-SHA256    Used for dynamic sessions

2.3.2 IEEE STD 1711-2010

IEEE STD 1711, namely, IEEE Trial-use Standard for a Cryptographic Protocol for Cyber Security of Substation Serial Links, defined a trial-use standard for a cryptographic protocol for SCADA substation serial communications in 2010 [3].
The standardized protocol, the serial SCADA protection protocol (SSPP), provides integrity and, optionally, confidentiality for the cyber security of substation serial links. With minimal overhead, SSPP encapsulates each SCADA or application message in a cryptographic envelope before passing it to the underlying communication protocol (e.g., DNP3, Modbus), so that messages are effectively authenticated and encrypted. SSPP's application domain is not limited to serial SCADA communications; it is also applicable to other types of serial communications (e.g., data concentrators and load management links). However, applications or systems are required to tolerate lost messages, since SSPP is designed to discard suspect messages. The cipher suites adopted in SSPP are shown in Table 2.2.

SSPP is designed to support three kinds of communication links: point-to-point, multi-drop and broadcast. In addition, mixed mode operation, in which some substations use SSPP to protect communication among each other whereas others communicate in clear text, is introduced for both multi-drop and broadcast links. The motivation is the flexibility of SSPP deployment in future smart grid systems. With mixed mode support, SSPP can be rolled out across the whole system in a phased fashion, which means that there is no need for all substations to implement SSPP at the same time.

Besides, SSPP has multiple implementation choices: it can be implemented in standalone security devices, integrated into communication modems, or embedded in applications or systems. The standalone bump-in-the-wire security device approach, shown in Fig. 2.2, needs little or no modification of existing systems and equipment when updating a legacy serial system. The only required devices are SSPP gateways deployed separately at both ends, i.e., at the SCADA control center and at the substations.
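The encapsulate-then-discard behavior described above can be sketched in a few lines. The block below is a minimal illustration of the "cleartext + HMAC-SHA256" suite (0x0008 in Table 2.2) only. The 6-byte header layout (suite number plus a sequence counter), the function names `wrap` and `unwrap`, and the key handling are assumptions made for illustration; they are not the frame format defined by IEEE STD 1711.

```python
import hashlib
import hmac
import struct

SUITE_CLEARTEXT_HMAC = 0x0008   # suite number taken from Table 2.2
HEADER_LEN = 6                  # hypothetical header: 2-byte suite + 4-byte sequence number
TAG_LEN = 32                    # HMAC-SHA256 output size in bytes


def wrap(key: bytes, seq: int, payload: bytes) -> bytes:
    """Sending gateway: place the SCADA payload between a header and an HMAC trailer."""
    header = struct.pack(">HI", SUITE_CLEARTEXT_HMAC, seq)
    trailer = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + trailer


def unwrap(key: bytes, frame: bytes):
    """Receiving gateway: return the payload, or None if the frame is suspect."""
    if len(frame) < HEADER_LEN + TAG_LEN:
        return None
    header = frame[:HEADER_LEN]
    payload = frame[HEADER_LEN:-TAG_LEN]
    trailer = frame[-TAG_LEN:]
    expected = hmac.new(key, header + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(trailer, expected):
        return None              # suspect message is discarded, not repaired
    return payload


if __name__ == "__main__":
    shared_key = b"k" * 32       # placeholder key; key management is out of scope here
    frame = wrap(shared_key, 1, b"breaker status")
    print(unwrap(shared_key, frame))
    corrupted = frame[:-1] + bytes([frame[-1] ^ 1])
    print(unwrap(shared_key, corrupted))
```

Because a failed check simply drops the frame, the application above the gateway sees a lost message, which is why the text requires applications to tolerate loss.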
Figure 2.2: Bump-in-the-wire SSPP deployment (components shown: SCADA master and SSPP gateway at the control center, RTU and SSPP gateway at the substation, connected through modems; SCADA data is unencrypted outside the gateways and is carried encrypted between an SSPP header and an SSPP trailer between them).

However, it is worth noting that the standard does not address the key management issue for SSPP, in order to limit the complexity of the protocol and to ease the adoption of SSPP in smart grid systems.

Chapter 3 Homomorphic Encryption-based Secure System

3.1 Introduction

As opposed to the legacy power system, the smart grid involves systems from different fields, such as the information and communication field, to provide reliable, efficient and intelligent electricity service. As discussed in Chapter 2, a typical smart grid system consists of seven major domains, as shown in Fig. 3.1: power generation, transmission, distribution, customers, service providers, operations and markets. To provide highly efficient and intelligent electricity services, systems within these domains must closely interact with each other. In this extremely complex collaboration, real-time customer power consumption data is one of the important pieces of information shared among different parties. Unlike the aggregate data in the legacy power system, the power consumption data in the smart grid is becoming more and more fine-grained for real-time applications (e.g., reported every minute or second). It is easy for malicious parties to derive further information related to customers' daily activities from their fine-grained power consumption profiles. Without proper protection, this data can be easily intercepted and exploited by hackers for malicious purposes against targeted households or to gain illegitimate advantages over business competitors. Data privacy thus becomes an important issue in the smart grid.
Figure 3.1: Major domains in smart grid.

In addition, malicious attackers may also use the two-way communication data to launch large-scale attacks on the electrical grid [33, 40]. For example, by learning from previous control data sent from operation centers, adversaries can falsify control messages to take control of a large number of smart meters and devices. They can then suddenly shut down all of these smart meters and devices and crash the entire power system. This situation is more severe when a wireless channel or a public network (e.g., the Internet) is used for data transmission. With the Internet, hackers may mount attacks from anywhere at any time. The reduced risk and cost of such attacks may induce more people to attack the power system, regardless of whether or not they have malicious purposes in mind. Security thus becomes another concern in the smart grid.

However, security and privacy protection were not carefully considered in legacy power systems, and they are indeed quite new to smart grids. Recently, the DOE allocated $20 million in research funding for cyber security of the US electrical grid, of which $3.1 million was earmarked for the development of a centralized cryptographic key management system. Many organizations have been working on the standardization of security and privacy issues in the smart grid. In 2010, the National Institute of Standards and Technology (NIST) discussed smart grid privacy and security issues in a series of documents [39-41] and provided some high-level solutions. An elaboration on smart grid security attributes was provided by the Department of Energy (DOE) of the US Government in 2009 [42]. Besides, some effort has been made to address smart grid security and privacy issues in the specific area of key management. For example, in 2010, Khurana et al. [22] described smart grid security issues with emphasis on the complexity and scalability of key management.
Metke and Ekl [29] investigated smart grid security technologies including the public key infrastructure and trusted computing. A secure information aggregation scheme for the smart grid was proposed by Li, Luo and Liu [25] using homomorphic encryption. Although it is secure and efficient, its application is limited to information aggregation only, and a more powerful homomorphic encryption scheme is needed to overcome this obstacle. A recent breakthrough in homomorphic cryptography made by Gentry [17] provides fully homomorphic encryption based on ideal lattices. Gentry's scheme supports both additive and multiplicative homomorphism and thereby has huge potential. Despite its powerful functionality, the complexity of Gentry's scheme is very high, and its applicability to a practical smart grid system is still in question.

Motivated by the above observation, we propose a secure and efficient cryptography-based system using homomorphic encryption. On the basis of an extended partially homomorphic encryption algorithm, a practical cryptosystem is designed specifically for the smart grid. The proposed cryptosystem supports an arbitrary number of additions and a single multiplication on encrypted data, which is sufficient for many privacy-preserving applications in the smart grid.

The rest of this chapter is organized as follows. We first introduce some background on cryptographic systems in Sec. 3.2. Then, we discuss the proposed homomorphic encryption-based secure system in Sec. 3.3. A system framework for using a homomorphic encryption scheme in the smart grid is first proposed. We then study a practical secure system designed with an extended partially homomorphic encryption scheme. On this practical secure system, two important privacy-preserving applications in the smart grid are investigated. Security and efficiency analysis is discussed in Sec. 3.4. Finally, concluding remarks are given in Sec. 3.5.
3.2 Background

3.2.1 Conventional Encryption Schemes

3.2.1.1 Symmetric Key Encryption

Characterized by low complexity and high efficiency, symmetric key encryption has been widely used in many real-world applications. Some symmetric key encryption schemes, such as the Advanced Encryption Standard (AES), have been considered as candidate cryptographic schemes for the smart grid. However, symmetric key encryption schemes require the sender and the recipient to share a secret (i.e., the secret key) before secure data transmission. The secret is then used for encryption at the sender side and for decryption at the recipient side. Sharing the secret requires the availability of secure communication channels in the system; otherwise, adversaries can easily intercept the secret key and break the cryptosystem. In many applications, such as the smart grid, building secure communication channels for millions of devices and systems is impossible. To solve this problem, public key encryption-based secret sharing is used.

3.2.1.2 Public Key Encryption

In a public key cryptosystem, as shown in Fig. 3.2, each recipient generates a pair of public and secret keys, where the public key is publicly available to other users while the secret key is kept confidential. If someone wants to send a message to the recipient, the public key can be used to encrypt the message. Once the recipient receives the encrypted message, the secret key is used to decrypt it so as to recover the original content. This process is called public key encryption. The public key cryptosystem also supports the authentication of a message. That is, a message signed with a digital signature, which is generated from the sender's private key, can be verified by anyone with access to the corresponding public key. Through this verification process, the recipient can confirm the sender's identity and check whether or not the message has been tampered with during transmission.
Figure 3.2: Public key cryptosystem.

Unlike the symmetric key cryptosystem, the public key cryptosystem does not require secure communication channels for secret sharing between the sender and the recipient. Therefore, the public key cryptosystem has been widely used in large-scale public networks. The security of this cryptosystem is generally built upon mathematical problems that cannot be solved efficiently, such as the discrete logarithm and integer factorization problems.

However, public key cryptosystems generally require a public key infrastructure (PKI) for key management. In a typical smart grid with millions of smart meters and other devices, the scalability of the conventional public key infrastructure becomes a severe problem. Moreover, a power grid usually covers a large geographical area, and many households located in distant areas have connectivity problems with the registration authority (RA) and certificate authority (CA), which makes the PKI system fail.

3.2.2 Homomorphic Encryption

Given messages $m_1, m_2 \in \mathbb{Z}_N$, a homomorphic encryption function $\mathrm{Enc}(\cdot)$ satisfies

$$\mathrm{Enc}_k(m_1 \odot m_2) = \mathrm{Enc}_k(m_1) \otimes \mathrm{Enc}_k(m_2), \tag{3.1}$$

where $\odot$ and $\otimes$ are two different operators. Homomorphic encryption is a family of cryptographic schemes with which users can delegate the processing of their private data (as indicated by the left-hand side of the above equation) to others without revealing the content (as shown by the right-hand side of the above equation). Homomorphic encryption has been used in secure voting, private information retrieval, cloud computing and other applications.

Based on the supported functionality, homomorphic encryption schemes can be classified into two types: partially homomorphic encryption schemes and fully homomorphic encryption schemes.
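As a concrete instance of Eq. (3.1), the short sketch below uses textbook RSA, for which both operators are multiplication modulo $n$. The tiny primes, the exponent and the helper names `enc` and `dec` are illustrative assumptions only; unpadded RSA with parameters of this size offers no security.

```python
# Textbook RSA with tiny primes (insecure, for illustration of Eq. (3.1) only).
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime to phi
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)


def enc(m: int) -> int:
    """Encrypt with the public key (n, e)."""
    return pow(m, e, n)


def dec(c: int) -> int:
    """Decrypt with the private exponent d."""
    return pow(c, d, n)


m1, m2 = 7, 9
# A third party multiplies ciphertexts without ever seeing m1 or m2 ...
c_product = (enc(m1) * enc(m2)) % n
# ... and the key holder recovers the product of the plaintexts.
assert dec(c_product) == (m1 * m2) % n
```

The party that multiplies the ciphertexts never holds `d`, which is the delegation property the paragraph above describes.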
A partially homomorphic encryption scheme has more restrictions on its supported operations (only addition, only multiplication, or polynomials up to a certain degree), while a fully homomorphic encryption scheme supports both additions and multiplications. The latter is more powerful and flexible. Examples of partially homomorphic encryption schemes include RSA [35] and Paillier [32]. Fully homomorphic encryption schemes were not well understood until recently, when they were reported in [17, 43]. However, they are still too computationally expensive to be used in practical applications today.

3.3 Proposed Homomorphic Encryption-based Secure System

In the conventional public key cryptosystem, each device on a two-way communication channel has to generate a pair of public and secret keys for secure data transmission. In a large-scale system such as the smart grid, efficient management of millions of keys is a very challenging problem. In addition, frequent encryption and decryption operations on every single message tremendously reduce the efficiency of the system. This is especially critical in the smart grid due to the limited bandwidth of the communication channel and the timeliness requirements. Conventional cryptographic schemes therefore have severe efficiency and scalability problems in the smart grid.

However, the smart grid is actually different from a general communication system. First, the smart grid is a many-to-one communication network (e.g., a number of users but only one service provider). Second, uplink communication (i.e., from smart meters to the service provider) and downlink communication (i.e., from the service provider to smart meters) in the power system have asymmetric security requirements, with more data security and privacy concerns placed on uplink communication. Based on these unique features, we propose a smart grid-specific secure system that exploits homomorphic encryption techniques.

3.3.1 System Framework

The proposed system framework is shown in Fig. 3.3.
In addition to the seven domains of the original smart grid framework shown in Fig. 3.1, another system called PKC-HE is introduced. The PKC-HE, which is the only system that has access to the unique secret key of the entire system, is responsible for key generation, distribution, decryption, signing and verification. In this system, operations, markets, service providers and the PKC-HE constitute a small communication network, called the H-Net. Owing to the small scale of this sub-network and the infrequent communication among its members, we use a conventional public key cryptosystem to secure the H-Net. On the other hand, operations, markets, service providers and customers constitute another sub-network, called the L-Net. We exploit homomorphic encryption techniques to protect L-Net communication in an efficient and scalable way.

Figure 3.3: Proposed data exchange scheme in smart grid (entities shown: Markets, Operations, Service Provider, PKC-HE and Customers; the L-Net carries encrypted data and signed control messages, and the H-Net carries encrypted and decrypted processing results and unsigned and signed control messages).

3.3.2 Secure Data Exchanging and Processing Mechanisms

An overview of data transmission and operations in the proposed system framework is shown in Fig. 3.4. A detailed discussion of the major procedures is given below.

[Figure 3.4 message sequence between Customer, Operation and PKC-HE: the PKC-HE generates one pair {PK, SK} and broadcasts PK; the customer periodically reports encrypted power consumption data C = Enc(m, PK); the operation domain processes the encrypted data, R = function(C), and sends the processing result for decryption; the PKC-HE sends back the decrypted control message and the corresponding digital signature, {r = Dec(C, SK), d = DS(r, SK)}; these are forwarded to the customer, who checks r = Verification(d, PK).]
Figure 3.4: Overview of data transmission and operation (stages: key distribution, reporting, homomorphic processing, decryption, control message transmission, verification and authentication).

3.3.2.1 Key Generation and Distribution

The key generation and distribution procedure is shown in Fig. 3.5. The PKC-HE sub-system generates a pair consisting of a public key (PK) and a secret key (SK) for the communication of the L-Net at time T. For key distribution, the PKC-HE simply broadcasts the generated public key to customers while keeping the secret key confidential. The keys remain active for a predefined period of time before the PKC-HE system regenerates and distributes new keys. In contrast to a conventional public key cryptosystem, where two pairs of public and private keys are generated for each two-way communication link, only one pair of public and private keys is required for communication between the customers and the other domains (i.e., operations, markets, service providers) in our proposed scheme. Note that extra public and secret keys are needed for communication in the H-Net, since we adopt a conventional public key cryptosystem there. For a network consisting of 500 customers, more than 1000 keys are needed in a conventional public key cryptosystem, while 10 keys are enough in our proposed scheme. Due to the small number of keys, key generation, distribution and other key management work becomes easy.

Figure 3.5: Key generation and distribution procedure (the PKC-HE keeps the secret key and broadcasts the public key).

3.3.2.2 Uplink Communication

The uplink communication is depicted in Fig. 3.6, where User A and User B share the same public key (PK), which is generated and distributed by the PKC-HE system. Users use the same public key for data encryption. The encrypted data is then sent to different domains through the public network. Since only the PKC-HE system can access the private key, no one else can decrypt data intercepted in transmission.
To prevent data forgery in uplink communication, we add extra bytes to the output message of smart meters, as shown in Fig. 3.7. A random number is generated and stored in these bytes right after a new key is generated and distributed. After the transmission of each message, the random number is increased by 1. With prior knowledge of the initial random number, it is convenient to verify whether a message has been forged based on the information embedded in the random number bytes.

Figure 3.6: Operations in the uplink communication (users A and B send $C_A = \mathrm{Enc}_{PK}(M_A)$ and $C_B = \mathrm{Enc}_{PK}(M_B)$ to the related domain).

Suppose that a smart meter sends power usage data with a period of $T_s$ seconds, the initial random number generated at time $t_i$ is $N_i$, the random number shown in the received message is $N_p$, and the receiving time is $t_p$. Then, we can check the authenticity using the following equation:

$$\frac{t_p - t_i}{T_s} = N_p - N_i. \tag{3.2}$$

Figure 3.7: Extended packet format (fields: IP Header, Account Number, Current Bill, Time Stamp, RN, ...).

3.3.2.3 Homomorphic Processing

With fully homomorphic encryption, domains such as markets, operations and service providers are able to perform all kinds of calculation and processing on encrypted customer data. The processed results are then sent for decryption to the PKC-HE system, the only party that can access the secret key, as shown in Fig. 3.8. After decryption, the PKC-HE system sends the decrypted results back to the corresponding domains for further analysis. Note that communication in the H-Net is protected by the conventional public key cryptosystem.

Figure 3.8: Data processing and analysis (the related domains send encrypted processing results to the PKC-HE, which holds the secret key and returns the decrypted results).

3.3.2.4 Downlink Communication

Operations in the proposed downlink communication are shown in Fig. 3.9, where the main concern is data authentication (i.e., verification of the sender's identity).
Without a proper authentication scheme, adversaries may forge and send fake messages to smart meters or other devices in order to control them. To avoid this threat, as shown in Fig. 3.9, the control message from a related domain (e.g., the service provider) is first sent to the PKC-HE, which generates the digital signature using the secret key. After receiving the digital signature, the related domain sends the signed message to the smart meters. With this signed message, customers can verify the sender's identity using the corresponding public key.

Figure 3.9: Operations in the downlink communication (the related domain sends an unsigned control message to the PKC-HE, receives the signed control message, and forwards it to users A and B, who hold the public key).

3.3.2.5 Key Revocation

Key revocation in the proposed system means a new round of key generation and distribution. Due to the small number of keys used in the proposed system, key revocation is not expensive. Efficient key distribution schemes can also be designed because of the key sharing feature. For example, the broadcasting scheme could be replaced by a flooding scheme in which each node forwards the new public key to its neighbors. A lost public key can also be retrieved from neighbors as long as the key is still valid.

3.3.3 Practical System with Extended Quadratic Encryption

In the previous section, we introduced a secure and efficient data exchange scheme in which homomorphic encryption is used. Fully homomorphic encryption, discussed in Sec. 3.2, is able to support any kind of processing on the encrypted data for different purposes. However, the current fully homomorphic encryption algorithms in [17, 43] are so computationally expensive that they are still not practical in real-world systems. Some partially homomorphic encryption schemes with lower complexity, such as RSA [35] and Paillier [32], have been proposed.
However, most of them support only either the addition or the multiplication operation on encrypted data, which limits their applications. To build a practical cryptosystem, we adopt in the proposed data exchange scheme a partially homomorphic encryption algorithm recently reported by Goh [18]. It allows an arbitrary number of additions and a single multiplication on the encrypted data. This property is generally enough for most applications in the context of a power grid, some of which will be discussed in this section.

3.3.3.1 Extension of Goh's Quadratic Encryption Schemes

Goh's partially homomorphic encryption scheme consists of the following procedures [18].

KeyGen($\tau$)
Given a security parameter $\tau \in \mathbb{Z}^+$, an algorithm $\mathcal{G}(\tau)$ is used to generate a tuple $(q_1, q_2, G, G_1, e)$, where $q_1, q_2$ are two random $\tau$-bit primes, $G$ and $G_1$ are groups of order $n = q_1 q_2$, and $e : G \times G \to G_1$ is a bilinear map. Then two random generators $g, u \xleftarrow{R} G$ are selected, and we set $h = u^{q_2} \in G$. Then $PK = (n, G, G_1, e, g, h)$ is the public key and $SK = q_1$ is the private key. Note that all the above operations can be computed in polynomial time [18].

Encrypt($PK$, $m$)
To encrypt a message $m$, a value $r \xleftarrow{R} [0, n-1]$ is first randomly selected. Assuming that the message space consists of the integers in the set $\{0, 1, \ldots, T\}$ with $T < q_2$, we compute the ciphertext as

$$CT = g^m h^r \in G.$$

Again, the group operations in $G$ can be computed in polynomial time in $\tau$.

Decrypt($SK$, $CT$)
To decrypt a ciphertext $CT$, we directly compute the discrete logarithm of $(CT)^{q_1}$ with base $\hat{g}$, where $\hat{g} = g^{q_1}$. This works because

$$(CT)^{q_1} = (g^m h^r)^{q_1} = (g^{q_1})^m \in G.$$

Since $0 \le m \le T$, this operation demands an expected time of $\hat{O}(\sqrt{T})$ using Pollard's lambda method [18].

This partially homomorphic algorithm can support an arbitrary number of additions and a single multiplication on ciphertexts. The homomorphism is explained below [18].
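The KeyGen, Encrypt and Decrypt procedures above can be exercised with a toy stand-in. The sketch below is an assumption-laden illustration, not the scheme of [18]: it replaces the pairing group $G$ with the order-$n$ subgroup of $\mathbb{Z}_p^*$ (so no bilinear map and no multiplication of ciphertexts), uses 10-bit primes that offer no security, and recovers $m$ by linear search instead of Pollard's lambda method. All function names are hypothetical.

```python
import random


def is_prime(k):
    """Trial division; adequate for the tiny toy parameters used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def keygen(q1=1009, q2=1013):
    """Return (public key, secret key) for a toy group of composite order n = q1*q2."""
    n = q1 * q2
    k = 2
    while not is_prime(k * n + 1):        # find a prime p with n | p - 1
        k += 2
    p = k * n + 1

    def element_of_order_n(start):
        x = start
        while True:
            c = pow(x, (p - 1) // n, p)   # order of c divides n
            if pow(c, q1, p) != 1 and pow(c, q2, p) != 1:
                return c                  # order of c is exactly n
            x += 1

    g = element_of_order_n(2)
    u = element_of_order_n(3)
    h = pow(u, q2, p)                     # h = u^{q2} has order q1
    return (n, p, g, h), q1


def encrypt(pk, m):
    n, p, g, h = pk
    r = random.randrange(n)               # r drawn from [0, n-1]
    return pow(g, m, p) * pow(h, r, p) % p


def decrypt(pk, sk, ct, T):
    """Raise to q1 to remove the h^r blinding, then search m in [0, T]."""
    n, p, g, h = pk
    target = pow(ct, sk, p)               # equals (g^{q1})^m
    g_hat = pow(g, sk, p)
    acc = 1
    for m in range(T + 1):
        if acc == target:
            return m
        acc = acc * g_hat % p
    raise ValueError("plaintext outside [0, T]")


def add(pk, a, b):
    """Multiply two ciphertexts and re-randomize: encrypts the sum of the plaintexts."""
    n, p, g, h = pk
    return a * b * pow(h, random.randrange(n), p) % p


if __name__ == "__main__":
    pk, sk = keygen()
    ct = add(pk, encrypt(pk, 300), encrypt(pk, 400))
    print(decrypt(pk, sk, ct, T=1000))
```

The bound `T` must stay below $q_2$, mirroring the condition $T < q_2$ in the Encrypt step; otherwise two different plaintexts would map to the same power of $\hat{g}$.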
Additive Homomorphism
Given $A = g^a h^r$ and $B = g^b h^s$, where $g, h \in G$ and $r$ and $s$ are randomly chosen from $[0, n-1]$, the encryption of $a + b$ takes the following form:

$$C = A B h^t = g^{a+b} h^{r+s+t} \in G, \tag{3.3}$$

where $t \in \mathbb{Z}_n$ is randomly selected.

Multiplicative Homomorphism
Given $A = g^a h^r$ and $B = g^b h^s$, where $g, h \in G$ and $r$ and $s$ are randomly chosen from $[0, n-1]$, the encryption of $ab$ can be computed as

$$C = e(A, B)\, h_1^{t} = g_1^{ab} h_1^{r'} \in G_1, \tag{3.4}$$

where $g_1 = e(g, g)$, $h_1 = e(g, h)$ and $t, r' \in \mathbb{Z}_n$ are randomly selected.

To prevent control message forgery, a digital signature is introduced in the proposed cryptosystem, as shown in Fig. 3.10. However, Goh only proposed an encryption algorithm, without discussing the authentication procedure. Here, we extend Goh's encryption algorithm to support both the signing and the verification process in the downlink communication under the same pair of public and secret keys.

Signing Process
For the signing process, shown in Fig. 3.10, a digital signature is needed. To generate a digital signature using the secret key $SK$, we propose to select $r \xleftarrow{R} [0, n-1]$ randomly and compute

$$DS(PK, SK, m) = u^{H(m)/q_1} g^{r},$$

where $u$ and $g$ are two public keys, $q_1$ is the secret key, $H(\cdot)$ is the hash function and $m$ is the message to be sent. Note that the group operations involved here can be computed in polynomial time [36].

Figure 3.10: Signing process (the control message is hashed, the signature generator takes the hash and SK = (u, q_1), and the resulting signature is attached to form the digitally signed control message).

Verification Process

Figure 3.11: Verification process (the hash of the received control message is compared with the value recovered from the signature using PK = (n, G, G_1, e, g, h)).

Every control message is authenticated using the digital signature generated in the above step. Specifically, after receiving the signed message,
To verify a message using the public key PK = (n;G;G 1 ;e;g;h), we compute (DS) n = (u m=q 1 g r ) n =u q 2 m = (u q 2 ) m =h m : Note that m can be recovered by computing the discrete logarithm of (DS) n with baseh. As mentioned above, this operation takes expected time ^ O( p T ) using Pollard's lambda method. This process is depicted in Fig. 3.11. As in the encryption scheme proposed by Goh, the signing and verication process is semantic secure. Increasing the frequency of key updates also strengthens the security of the authentication process. 3.3.3.2 Privacy-preserving Data Aggregation Information aggregation is an important operation in some proposed Smart Grid communication infrastructure (e.g. the wireless-wired multi-layer architecture) [9, 14,33,44]. In a network architecture as shown in Fig. 3.12, each neighborhood has a data collector to accumulate desired users data. Suppose that the service provider wants to know the average power consumption in the neighborhood. To do so, users send their power consumption data through an exclusive connection to the data collector. After receiving the data, the data collector calculates the mean power consumption and reports to the service provider. Without an eective protection technique, user data is vulnerable to interception either during the transmission or during the processing at data collector. The proposed system oers a secure and ecient solution to eliminate this concern, since all data transmitted within the L-Net is encrypted. Except for the PKC-HE sub-system, no other parties can access the original information. Further- more, the proposed secure system also supports in-network incremental aggregation to largely improve the eciency of aggregation [25]. 33 Neighborhood Figure 3.12: Data aggregation in a smart grid. 
3.3.3.3 Privacy-preserving Statistical Analysis

In addition to providing secure information aggregation for a smart grid, the proposed system offers a secure way to perform statistical analysis on encrypted data. It allows the utility or untrusted third-party companies to derive statistics from customers' encrypted power consumption data without privacy concerns. In the following, we discuss two secure statistical analysis scenarios: analysis by utilities and analysis by an untrusted third party.

Accurate statistical analysis of power data helps a utility company to evaluate the current grid status and plan for future power production and distribution. With the proposed system, a utility company directly processes customers' encrypted power consumption data and sends the processed results (in encrypted form) to the PKC-HE sub-system for decryption. The decrypted results (statistics) are then sent back to the utility company for further analysis and operation.

Figure 3.13: Statistical analysis conducted by utilities.

In many practical cases, third-party companies are also interested in statistics of customers' power consumption data for commercial purposes. By exploiting the additive and multiplicative homomorphism properties, a third-party company can extract useful statistical information from the underlying power consumption data without decrypting it. As shown in Fig. 3.14, this can be achieved in the following steps.

1. The third party registers with the public key infrastructure (PKI) of the conventional public key cryptosystem used in the H-Net.
2. The PKI authorizes the third party in the system.
3. The third party gathers encrypted data by collecting it directly from the public network.
4. The third party extracts useful statistical information. The results are in encrypted form.
5. The encrypted results are sent to the PKC-HE system for decryption.
6. The decrypted results are sent back to the third party for analysis.

Figure 3.14: Statistical analysis conducted by the third party.

Now, we show how to derive the mean and the variance of the underlying power consumption data from encrypted data. Without loss of generality, we assume \(N\) users in a neighborhood. Given encrypted power consumption data \(c_1, c_2, \ldots, c_N\), the mean and variance of the corresponding original power consumption data \(m_1, m_2, \ldots, m_N\) can be computed as follows.

Mean Value

The mean of the plain data is defined as \(\bar{m} = Dec_{sk}(C_m)\). By exploiting the additive homomorphism, the encrypted mean value \(C_m\) can be computed from the encrypted data as

\[ C_m = Enc_{pk}(\bar{m}) = \prod_{j=1}^{N} c_j \, h^{t}. \tag{3.5} \]

After decryption, we obtain the mean value of the plain data.

Variance

The variance of the plaintexts \(m_i, i = 1, \ldots, N\), can be obtained by

\[ v_m = \frac{1}{N-1} \sum_{i=1}^{N} (m_i - \bar{m})^2 \tag{3.6} \]
\[ \phantom{v_m} = \frac{1}{N-1} \sum_{i=1}^{N} m_i^2 - \frac{N}{N-1} \bar{m}^2. \tag{3.7} \]

With the homomorphic property, we can obtain \(v_m\) in terms of \(c_i, i = 1, \ldots, N\), in the following three steps.

STEP 1: Calculate \(C_1 = Enc_{pk}(\bar{m})\) from Eq. (3.5).

STEP 2: Calculate \(C_2 = Enc_{pk}\left(\sum_{i=1}^{N} m_i^2\right)\) by

\[ C_2 = Enc_{pk}\left(\sum_{i=1}^{N} m_i^2\right) \tag{3.8} \]
\[ \phantom{C_2} = \prod_{i=1}^{N} \left(e(c_i, c_i)\, h_1^{r_i}\right) h_1^{s}. \tag{3.9} \]

STEP 3: Calculate the variance of the plaintext data \(v_m\) by

\[ v_m = \frac{Dec_{sk}(C_2) - N \, Dec_{sk}(C_1)^2}{N-1}, \tag{3.10} \]

where \(s\) and \(\{r_i, i = 1, \ldots, N\}\) are randomly chosen from \([0, n-1]\), and \(g_1 = e(g, g)\), \(h_1 = e(g, h)\) as in Eq. (3.4).

3.4 Properties of Proposed Secure System

3.4.1 Security

The security of the proposed system relies on the security of the adopted public key algorithm as well as the confidentiality of the secret key.
Homomorphic encryption schemes in general are at least semantically secure. In our proposed practical application, Goh's quadratic encryption is a provably semantically secure scheme. Without knowing the secret key or having access to the decryption system, it is infeasible to break the system. Additionally, in the proposed public key cryptosystem, the introduction of the PKC-HE system isolates the decryption and signing processes from the others. As a result, the PKC-HE is the only party that can access the secret key of the whole system, so the confidentiality of the secret key is high. To further strengthen system security, the system can periodically or randomly update the keys within a relatively short time period. Since all customers share the same public key, key generation and distribution can be done efficiently.

3.4.2 Efficiency

As shown in Table 3.1, the efficiency of the proposed system is evidenced by the fact that only two keys are generated, while one key is broadcast to all users for L-Net communication. In addition, since all users share the same public key, key maintenance becomes efficient. Users who have lost the public key can retrieve it from their neighbors. In the proposed system, a public key infrastructure becomes unnecessary, so the corresponding scalability and efficiency concerns do not exist. Also, the PKC-HE is the only system responsible for key generation and distribution. Other devices (such as smart meters) or systems do not have to support these functions, thereby satisfying the hardware design requirements of the smart grid system. We admit that decrypting processed results in the PKC-HE system incurs extra delay, but the time saved by the reduced number of decryption operations on real-time power consumption data is far larger.

Table 3.1: Comparison of two schemes

Schemes                           NIST Proposed System   HE-based Solution
--------------------------------  ---------------------  -----------------
Key Generation                    All Devices            Only PKC-HE
Keys (Key Exchange Stage)         2N                     2
PKI (CA, RA, etc.)                Required               Not Required
Digital Certificates              N                      0
Certificate Exchange Operations   N                      0
Keys (Data Exchanging Stage)      N                      0
Decryption Operation              N                      1

3.5 Conclusion

A secure and efficient homomorphic encryption-based secure system was proposed for the smart grid system. A practical system was designed by adopting Goh's partially homomorphic encryption scheme and its extension. With the proposed system, a smart grid can protect both uplink and downlink communications efficiently. Moreover, the proposed system can adjust the security level of the whole system according to the corresponding security requirements. Thanks to the additive and multiplicative properties, all processing work on the utility side can be done on encrypted data without decrypting every single data item. With this feature, applications such as data aggregation and statistical analysis can be accomplished in a privacy-preserving way.

Chapter 4 Privacy-preserving Metering Scheme

4.1 Introduction

In order to provide highly efficient, automatic and intelligent electricity service, different domains in a smart grid are required to cooperate closely with each other. This feature is known as the interoperability requirement [2]. In this complex collaboration, customers' power consumption data is one of the key pieces of information shared among different parties. To help the service provider, operations and markets keep track of the present power consumption status, customers must frequently share their power consumption data. For the service provider, customers' power consumption data is also used for billing purposes. Unlike the aggregate (monthly) power consumption data in a legacy power grid, the shared data in a smart grid is now in granular form (every minute or even every second), which leads to broad customer privacy concerns. Fig. 4.1 shows a sample customer's daily power consumption trace obtained by a utility company.
Through trivial analysis, almost anyone with access to such data can figure out which electrical appliances are used at what time. With an advanced power signature analysis tool, even more valuable information about a customer's daily activities can be obtained [26, 27]. Some examples are shown in Table 4.1.

Figure 4.1: Power usage to personal activity mapping [34].

Various parties are interested in obtaining this sensitive personal information. For instance, thieves may exploit this information to study the behavior of potential victims living in the house. They can more easily break into the house at the right time with derived information such as "nobody is in the house every Friday night". Commercial companies are also interested in the power consumption behavior of customers. For example, advertising becomes more effective when a vendor can figure out which appliances customers use in their daily lives. Even utility companies are constantly considered to be a source of privacy abuse in a smart grid. Without proper protection schemes, customers' privacy or even security may be threatened, especially when a public communication network (e.g. the Internet) is used for data transmission. This privacy concern has become one of the main barriers to achieving smart grid interoperability. Customers may not want untrusted companies to have access to their sensitive power consumption data.

The privacy and security issues in a smart grid have been well discussed in recent guidelines published by the NIST Cyber Security Working Group (CSWG) [39-41]. In [2], the IEEE also emphasizes the importance of data privacy in the smart grid interoperability issue. Besides these high-level standards and discussions, researchers have proposed some specific schemes to address such privacy issues. In 2011, two privacy-preserving data aggregation schemes were proposed and discussed in detail [16, 23].
With an advanced cryptographic scheme or a data masking method, a customer's data can be well protected during data transmission. But these schemes only address the privacy concerns arising from data aggregators in the aggregation process. Problems with data sharing remain unaddressed. Also, the aggregated information is merely the sum of the total power consumption, which may not be sufficient for accurate dynamic pricing in a smart grid. To address the privacy concerns about billing, in 2010 [30] proposed a protocol using zero-knowledge proofs to privately derive and prove the correctness of bills. However, without reporting customers' power consumption data, power consumption analysis becomes impossible. In [13], an anonymization scheme was proposed in which each customer was assigned a pseudonym. In the scheme, two databases were introduced: one to store customers' power consumption traces and their pseudonyms, the other to store the pseudonyms and real identities. The former is made public for data sharing while the latter must be kept confidential. One problem with this scheme is deciding which party can be trusted to manage the confidential database. In addition, in 2011, [20] showed that data mining and pattern recognition techniques can be used to break the anonymization scheme by linking customers' pseudonyms to their real identities.

Table 4.1: Sensitive Information Derived from Power Consumption Trace

Power Usage                     Activity-related Information
------------------------------  -------------------------------------
Energy cycle of TV              People in the house or not
Energy cycle of coffee pot      Time to wake up
Energy cycle of water heater    Number of people living in the house
...                             ...

In [28], published in 2011, a class of algorithms and systems called Non-Intrusive Load Leveling (NILL) was proposed to combat privacy invasion on metering data. In NILL, an in-residence battery is introduced to mask the variance of the load on the grid. In this way, the exposure of power consumption traces is eliminated.
One of the major problems of this scheme is the cost of the battery, which is estimated to be approximately $1000 per year. Also, with only smoothed power consumption data, real-time power consumption analysis is hard to accomplish. As a result, power consumption prediction and dynamic pricing may fail.

The rest of this chapter is organized as follows. In Section 4.2, background knowledge about the advanced metering infrastructure and load signature analysis techniques is briefly discussed. The proposed privacy-preserving metering scheme is then presented in Section 4.3. In Section 4.4, experimental results are shown and discussed.

4.2 Background

4.2.1 AMI in smart grid

In a smart grid, an advanced metering infrastructure (AMI) is generally composed of a large set of smart meters, communication hardware and software, and a data management system. Each smart meter collects the power consumption information from the different electrical appliances within the house and reports the information to a data management system (such as an MDMS) through a data transmission network. Data can be transmitted using broadband over power line (BPL), power line communications (PLC), or public networks (Internet, cellular), among others. The data transmission is also stage-dependent. For instance, a well-known AMI infrastructure adopts a wireless mesh network for communication between smart meters and the data collector, while the Internet is used for transmission between the data collector and the utility company. In this way, the AMI builds up a two-way communication channel between customers and utilities. In a smart grid, the AMI is also responsible for sending real-time electricity prices back to smart meters, which can help the utility reduce the load. Dynamic pricing is mainly based on analysis of present and previous power consumption.
4.2.2 Non-Intrusive Load Monitoring

In a power system, Non-Intrusive Load Monitoring (NILM) is a class of algorithms that can decompose load profiles into composite appliance profiles. Customer power consumption traces can be mapped to ON/OFF events associated with identifiable appliances. The original motivation of NILM was to help a power company know the status of system loads. However, with more and more fine-grained metering data provided by the AMI system, NILM has become a potential privacy threat, since it can be used to assemble a detailed profile of consumer appliance use and indirectly derive profiles of consumers' daily activities.

NILM uses the relative change in energy use, along with the time instant, as features. For example, a 3-second load profile \(\{(t_0, 0\,\mathrm{W}), (t_1, 100\,\mathrm{W}), (t_2, 200\,\mathrm{W}), (t_3, 100\,\mathrm{W})\}\) generates the features \(\{(t_1, +100\,\mathrm{W}), (t_2, +100\,\mathrm{W}), (t_3, -100\,\mathrm{W})\}\). The more features a load profile has, the more ON/OFF events it contains. To further figure out which appliances are used, pairs of symmetric ON/OFF "sister" features are extracted and analyzed. Previous work [28], mentioned in Section 4.1, introduces rechargeable batteries to smooth the load profile in order to reduce the potential features. Results show that the scheme can remove up to 95% of potential features, which is very effective.

4.3 Privacy-preserving Metering Scheme for Smart Grid

Inspired by McLaughlin's work in 2011 [28], we propose a privacy-preserving metering scheme in which a software module, instead of expensive hardware, is introduced to mask a customer's power consumption trace. For an individual customer, it is impossible to recover the original power consumption trace from the masked one. For the utility or other parties, it is still possible to reconstruct enough of the distribution of the original power consumption from the distorted data for useful power consumption analysis and prediction.
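As a point of reference for the features that the masking is meant to obscure, the NILM feature extraction of Section 4.2.2 can be sketched as follows. This is a minimal illustration rather than an actual NILM tool: the function names, the `threshold_w` and `tolerance_w` parameters, and the first-come matching rule for "sister" features are assumptions made here.

```python
def extract_features(profile, threshold_w=0.0):
    """Turn a load profile [(time, watts), ...] into (time, delta_watts) features.

    A feature is emitted whenever the change between two consecutive
    readings exceeds threshold_w in absolute value.
    """
    features = []
    for (_, p_prev), (t_cur, p_cur) in zip(profile, profile[1:]):
        delta = p_cur - p_prev
        if abs(delta) > threshold_w:
            features.append((t_cur, delta))
    return features


def pair_sister_features(features, tolerance_w=10.0):
    """Match each OFF (negative) feature with the earliest unmatched ON
    (positive) feature of similar magnitude (assumed matching rule)."""
    open_on = []
    pairs = []
    for time, delta in features:
        if delta > 0:
            open_on.append((time, delta))
            continue
        for k, (t_on, d_on) in enumerate(open_on):
            if abs(d_on + delta) <= tolerance_w:
                pairs.append(((t_on, d_on), (time, delta)))
                del open_on[k]
                break
    return pairs


# The 3-second example from Section 4.2.2, with times 0..3 standing for t0..t3.
profile = [(0, 0), (1, 100), (2, 200), (3, 100)]
features = extract_features(profile)
pairs = pair_sister_features(features)
```

On the example profile this yields the three features listed in the text, and one ON/OFF pair: the 100 W step at t1 matched with the 100 W drop at t3. Adding noise to the trace, as proposed in this chapter, creates many spurious steps that such a pairing stage cannot distinguish from real appliance events.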
4.3.1 System Model and Assumptions

The system model of our proposed scheme is shown in Fig. 4.2. After aggregating power consumption data from all appliances used in the house, each smart meter sends the aggregated data to a data collector located in the center of the neighborhood through a wireless network. The data collector then forwards the received power consumption data to the service provider, operations, the market or other third-party companies. Data sharing is also permitted among different parties in the system. Untrusted parties involved in data sharing and data interception during the reporting period are the two potential privacy threats.

The proposed scheme is based on the following assumptions. First, we assume that smart meters are secure enough to prevent illegal manipulation of readings. Second, we assume that all smart meters report their power consumption readings in a synchronous (or nearly synchronous) manner. We also assume that all involved parties are untrusted, including the service provider. All parties will try their best to obtain as much information as possible about a customer's private usage history.

Figure 4.2: System Model (smart meters (SM) in neighborhoods 1 to K report distorted readings \([d_{i1}, d_{i2}, d_{i3}, \ldots]\) to their data collectors, which forward them over the public network to the service provider, operation center, market and third-party companies).

4.3.2 Proposed Metering Schemes

The proposed scheme comprises three procedures: reading distortion, power consumption distribution reconstruction, and an aggregated billing mechanism.

4.3.2.1 Reading Distortion

Rather than smoothing the power consumption trace, the proposed scheme adds more variations, or features, to the original power consumption trace. To provide the greatest degree of privacy protection, reading distortion is implemented inside the smart meter. Additive white Gaussian noise (AWGN) is selected as the disturbance for its simplicity.
Mathematically, suppose each smart meter \(SM_i\) reports household power consumption every \(T\) seconds. The sequence \((r_i[0], r_i[1], r_i[2], \ldots, r_i[j], \ldots)\) represents the smart meter readings of the \(i\)th customer from time \(t = 0\) to \(t = mT\). The disturbed readings \(d_i[j]\) can be written as

\[ d_i[j] = r_i[j] + n_i[j], \tag{4.1} \]

where \(n_i[j]\) is AWGN with zero mean and a relatively large variance \(\sigma_n^2\), which is at least as large as the variance of the power consumption trace. The variance of the added noise significantly affects the overall performance of the proposed scheme. The selection of the optimal variance will be discussed in detail. Fig. 4.3 shows an example of a distorted power consumption trace for one day, in which the data is reported every 10 minutes. The red curve represents the original power consumption trace while the blue one shows the distorted one.

Figure 4.3: Disturbed Power Consumption Trace (real and disturbed power consumption versus sample index, one sample every 10 minutes).

4.3.2.2 Power Consumption Distribution Reconstruction

Mathematically, the distribution reconstruction problem is the following: given a cumulative distribution \(F_Y\) and the realization of \(n\) i.i.d. random samples \(X_1 + Y_1, X_2 + Y_2, X_3 + Y_3, \ldots, X_n + Y_n\), estimate \(F_X\). This problem has been well solved with basic Bayesian theory in [4]. In our problem, we can use a similar method to first derive the conditional probability of the original power consumption given each distorted sample as

\[ f_{R|D}(a \mid d_i) = \frac{f_N(d_i - a)\, f_R(a)}{\int_{-\infty}^{\infty} f_N(d_i - z)\, f_R(z)\, dz}, \tag{4.2} \]

where \(f_X(\cdot)\) indicates the probability density function of \(X\). The iterative algorithm to calculate \(f_R(a)\) is given below.

Algorithm 1 Power Consumption Distribution Reconstruction
Inputs: samples \(\{d_i\}\) and the probability density function \(f_N\) of the AWGN \(N\)
States: initialize the distribution of the original power consumption \(R\) as the uniform distribution, i.e. \(f_R^{0} = 1/L\), where \(L\) is the value range of the data
Procedure:
  for \(k = 1, 2, \ldots, \infty\) do
    \(f_{R|D}^{k}(a \mid d_i) = \dfrac{f_N(d_i - a)\, f_R^{k-1}(a)}{\int_{-\infty}^{\infty} f_N(d_i - z)\, f_R^{k-1}(z)\, dz}\)
    \(f_R^{k}(a) = \dfrac{1}{n} \sum_{i=1}^{n} f_{R|D}^{k}(a \mid d_i)\)
    if the stopping criterion is satisfied then
      \(f_R(a) = f_R^{k}(a)\); break
    end if
  end for
Declare the recovered distribution of power consumption as \(f_R(a)\).

In the algorithm, the averaged conditional probability density function \(f_{R|D}(a \mid d)\) shown in Eq. 4.3 is used as the update of \(f_R(a)\) for the next iteration. In fact, the averaging step is equivalent to multiplying every \(f_{R|D}(a \mid d_i)\) by a weight \(f_D(d_i)\):

\[
\begin{aligned}
f_R(a) &= \frac{1}{n} \sum_{i=1}^{n} \frac{f_N(d_i - a)\, f_R(a)}{\int_{-\infty}^{\infty} f_N(d_i - z)\, f_R(z)\, dz}
        = \sum_{i=1}^{n} f_{R|D}(a \mid d_i)\, f_D(d_i) \\
       &= \sum_{i=1}^{n} f_{R,D}(a, d_i)
        = \sum_{i=1}^{n} f_{R,N}(a, d_i - a)
        = \sum_{i=1}^{n} f_R(a)\, f_N(d_i - a)
        \approx f_R(a).
\end{aligned}
\tag{4.3}
\]

As we can see from Eq. 4.3, averaging has the same effect as estimating the probability density function \(f_R(a)\) from a conditional probability density function and the samples \(d_i\). For the stopping criterion, given the \(k\)th estimate of the distribution \(f_R^{k}(a)\) and the \((k-1)\)th estimate \(f_R^{k-1}(a)\), we use the \(\chi^2\) test to determine whether the two successive distributions agree with each other at a 95% confidence level.

The above sample-by-sample reconstruction algorithm works. However, it requires all customers' data to be available within a relatively short time (e.g. seconds), which leads to heavy traffic on the communication system. Real-time processing of the gigantic data set requires powerful computational capabilities. This problem becomes even more severe in a smart grid, since a typical system consists of millions of smart meters.

To solve this problem, we use parallel processing schemes instead of one-time centralized processing. As shown in Fig. 4.4, a reinforced data collector is capable of directly running Algorithm 1 to reconstruct the power consumption distribution of the corresponding neighborhood.
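A minimal sketch of Algorithm 1 on a discretized grid, followed by the per-neighborhood averaging used for parallel processing, is given below. It is an illustration under stated assumptions rather than the implementation used in the experiments: the grid range, bin count, synthetic data and all function names are hypothetical, and a fixed iteration count stands in for the chi-square stopping test.

```python
import math
import random


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density, used as the known noise density f_N."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def reconstruct_distribution(distorted, sigma, lo, hi, bins=25, iterations=20):
    """Estimate the probability mass of the original readings on a grid."""
    width = (hi - lo) / bins
    centers = [lo + (b + 0.5) * width for b in range(bins)]
    # f_N(d_i - a) for every sample d_i and grid point a, computed once.
    noise = [[gaussian_pdf(d - a, sigma) for a in centers] for d in distorted]
    f = [1.0 / bins] * bins                      # uniform initial estimate
    for _ in range(iterations):                  # fixed count replaces the chi-square test
        updated = [0.0] * bins
        for row in noise:
            denom = sum(fn * fa for fn, fa in zip(row, f))
            if denom == 0.0:
                continue                         # sample carries no information on this grid
            for b in range(bins):
                updated[b] += row[b] * f[b] / denom   # posterior mass, as in Eq. 4.2
        total = sum(updated)
        f = [v / total for v in updated]         # averaging step of Algorithm 1
    return centers, f


def average_distributions(estimates):
    """Utility side: average the locally reconstructed distributions."""
    k = len(estimates)
    return [sum(col) / k for col in zip(*estimates)]


# Synthetic example (assumed data): 70% low consumers, 30% high consumers.
random.seed(0)
SIGMA = 1.4
original = [random.uniform(0.5, 1.5) if random.random() < 0.7
            else random.uniform(3.5, 4.5) for _ in range(600)]
distorted = [r + random.gauss(0.0, SIGMA) for r in original]

centers, estimate = reconstruct_distribution(distorted, SIGMA, 0.0, 5.0)

# Parallel variant: three "neighborhoods" of 200 meters each, then averaging.
local = [reconstruct_distribution(distorted[i:i + 200], SIGMA, 0.0, 5.0)[1]
         for i in range(0, 600, 200)]
parallel_estimate = average_distributions(local)
```

Each data collector only needs the noise density and its own neighborhood's distorted readings, so the averaging at the utility operates on a few short vectors instead of every meter reading.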
Instead of forwarding the readings of every smart meter, the data collector now sends the reconstructed distribution of the neighborhood to the utility company. Once the utility receives the reconstructed distributions from the different neighborhoods, the overall distribution can be obtained by averaging all locally reconstructed distributions.

Figure 4.4: Parallel Processing (Algorithm 1 runs on the distorted readings \(d_1, d_2, \ldots, d_i\) of each of the K neighborhoods; the utility computes SUM( )/K of the results).

4.3.2.3 Aggregated Billing

In order to facilitate correct and efficient billing without invading privacy, an aggregated billing mechanism is used in the proposed metering scheme. Specifically, we introduce a storage unit in the smart meter device to accumulate electricity billing information. The initial value of the accumulator is set to 0. Given that the dynamic electricity price at time \(t = j\) is \(P[j]\), after each reporting time the aggregated bill \(Bill[j]\) is updated as

\[ Bill[j] = Bill[j-1] + r_i[j]\, P[j], \tag{4.4} \]

where \(r_i[j]\) denotes the power consumption during the time slot from \(t = j-1\) to \(t = j\). Whenever a smart meter receives a query from a utility company for the billing statement, the smart meter sends back the updated bill.

4.3.3 Security

The security of the proposed scheme mainly relies on the difficulty of recovering the original power consumption trace from a distorted one. In this part, we show the ineffectiveness of two kinds of attacks based on well-known de-noising techniques, which are the most obvious approaches to breaking this system's security.

4.3.3.1 Linear Filter Attack

The linear filter is a widely adopted de-noising technique. The fundamental idea is to attenuate the additive noise component at each sample by averaging over its neighborhood.

Given a sequence of customer power consumption trace samples \(\{r[i], i = 1, 2, \ldots, m\}\), AWGN \(N\) with mean 0 and variance \(\sigma_n^2\) is added before the data is sent to a utility company. The distorted power consumption trace, \(\{d[i], i = 1, 2, \ldots, m\}\), is represented as

\[ d[i] = r[i] + n[i], \quad i = 1, 2, \ldots, m. \tag{4.5} \]

With the linear filter de-noising technique, an attacker can derive the de-noised data sequence \(\{r'[i], i = 1, 2, \ldots, m\}\) in the following form:

\[ r'[i] = \sum_{i-L \le j \le i+L} w[j]\, d[j] = \sum_{i-L \le j \le i+L} w[j]\,(r[j] + n[j]), \quad i = 1, 2, \ldots, m, \tag{4.6} \]

where \(w[j]\) is the \(j\)th coefficient of the linear filter. The expectation and variance of \(r'[i]\) can then be derived as

\[ E[r'[i]] = E\Big[\sum_{i-L \le j \le i+L} w[j]\,(r[j] + n[j])\Big] = \sum_{i-L \le j \le i+L} w[j]\, E[r[j]], \tag{4.7} \]

\[
\begin{aligned}
Var[r'[i]] &= Var\Big[\sum_{i-L \le j \le i+L} w[j]\,(r[j] + n[j])\Big] \\
           &= \sum_{i-L \le j \le i+L} w[j]^2 \,\big(Var[r[j]] + Var[n[j]]\big) \\
           &= \sum_{i-L \le j \le i+L} w[j]^2 \,\big(Var[r[j]] + \sigma_n^2\big),
\end{aligned}
\tag{4.8}
\]

where \(2L + 1\) is the size of the sliding window.

In the proposed scheme, \(\sigma_n^2\) is generally larger than \(Var[r[j]]\). From Eq. 4.8, we see that the variance of the noise dominates the variance of the de-noised data. As a result, many artificial ON/OFF features will be fabricated. Meanwhile, real ON/OFF events may also be distorted by the noise, causing original features to become inaccurate or to disappear. With the NILM tool discussed in Sec. 4.2, it is then impossible to derive an accurate load profile of the customer. On the other hand, adjusting the window size does not help, since a smaller window size has a poorer de-noising effect while a larger window size over-smooths the power consumption trace. Relevant experimental results will be shown in Sec. 4.4.

4.3.3.2 Non-local Mean Filter Attack

The non-local mean (NLM) filter is one of the state-of-the-art de-noising techniques in the image processing field. The fundamental idea is to take advantage of global information to remove noise in a local area. In our scheme, we transformed the 2-D NLM filter into a 1-D NLM filter to see whether the original power consumption data can be recovered.
As before, given a power consumption trace \(r[i], i = 0, 1, 2, \ldots, m\), after distortion we get the distorted power consumption trace \(d[i], i = 0, 1, 2, \ldots, m\), in the form of

\[ d[i] = r[i] + n[i], \quad i = 0, 1, 2, \ldots, m, \tag{4.9} \]

where \(n[i]\) is a sample of the AWGN with zero mean and variance \(\sigma_n^2\). With NLM, the recovered consumption at time \(i\) is given as

\[ r'[i] = \sum_{j \in \Omega_d} w[i, j]\, d[j], \tag{4.10} \]

where \(\Omega_d\) is a sliding window of size \(s \times s\) centered around time \(i\). The weight \(w(i, j)\) is given by

\[ w(i, j) = \frac{1}{C_i} \exp\left\{ -\frac{\| d(N_i) - d(N_j) \|_{2,a}^{2}}{h} \right\}, \tag{4.11} \]

where \(C_i\) is a normalizing factor such that \(\sum_j w(i, j) = 1\), \(d(N_i)\) denotes a neighborhood, \(N_i\), centered around time \(i\) in the disturbed power consumption trace, \(\| \cdot \|_{2,a}^{2}\) is the Euclidean norm weighted by a Gaussian kernel of standard deviation \(a\), and \(h\) is a parameter that adjusts the decay of the weight.

We can derive the mean of the weighted sum of squared differences in Eq. 4.11 as follows. The mean value can be written as

\[ E\, \| d(N_i) - d(N_j) \|_{2,a}^{2} = \| r(N_i) - r(N_j) \|_{2,a}^{2} + 2\sigma_n^2. \tag{4.12} \]

Then, we can rewrite Eq. 4.11 as

\[ E[w(i, j)] = \frac{1}{C_i} \exp\left\{ -\frac{2\sigma_n^2}{h} \right\} \exp\left\{ -\frac{\| r(N_i) - r(N_j) \|_{2,a}^{2}}{h} \right\}. \tag{4.13} \]

In our scheme, we choose \(\sigma_n^2\) larger than \(Var[r(j)]\), which means that the weight \(w(i, j)\) is mainly determined by \(\exp\{-2\sigma_n^2/h\}\) rather than by the signal similarity function \(\exp\{-\| r(N_i) - r(N_j) \|_{2,a}^{2}/h\}\). As a result, similar samples may not be assigned high weights and distinct samples may be assigned high weights because of the noise. Therefore, NLM will also fail to recover the original signal from the distorted one. Experimental results will be shown in Sec. 4.4.

4.4 Experimental Results

4.4.1 Datasets

Two sets of power consumption data are used to verify the proposed scheme. One is the real power consumption data of a single household, while the other was synthesized by a software tool called smart grid Simulator, which was designed according to the power consumption statistics of various household appliances.
4.4.1.1 Power Curve Dataset

This dataset was collected by the Business Intelligence Lab of Telecom ParisTech [1]. It contains 349 days of electric power consumption data recorded in a single household in 2007. Each daily power consumption trace contains 144 data points, reported every 10 minutes from 00:00 to 23:50. This fine-grained dataset is used to verify the performance of privacy protection.

4.4.1.2 Smart Grid Simulator

The smart grid Simulator project is a cooperation between AIFB and Wechselfunchs [5]. The simulator can generate realistic random electricity usage data for a smart grid emulator. It generates data from statistics (mean and variance) learned from 3 weeks of real-world power consumption data. The generated data is in the n3 format and includes the detailed power consumption of each electrical device in the household for every hour. Since we can use this software to generate power consumption data for a large number of customers, it is mainly used to verify the power consumption reconstruction.

4.4.2 Results and Discussion

1. De-noising Attacks

Figure 4.5: Results of Denoising Attacks. Panels (power consumption in kW versus sample sequence): (a) distorted trace with \(\sigma_n^2 = Var(CT)\), (b) mean filter result, (c) NLM filter result, (d) distorted trace with \(\sigma_n^2 = 1.5\,Var(CT)\), (e) mean filter result, (f) NLM filter result.

As shown in Fig. 4.5, both the linear mean filter and the NLM filter fail to recover the detailed power consumption traces of customers. The missed detections and false alarms make the mapping to a customer's daily activities unreliable.
We adopted NILM techniques to further confirm this. As shown in Table 4.2, after distortion the number of features extracted with NILM increases substantially with the added noise. The threshold for filtering out noise is chosen to be 0.3 kW. After the linear filter and the NLM filter, the number of features does decrease compared to the noisy data, but it generally remains higher than in the original trace. Fig. 4.6 and Fig. 4.7 show the extracted features over time for the differently processed data. It is obvious that the features extracted from the de-noised data look almost like noise and cannot provide any valuable information about the customer's profile of daily activities.

From Fig. 4.5, we can also see that the larger the noise we add, the worse the reconstruction result. This does not mean we can choose arbitrarily large noise, since we have to make sure that the power consumption distribution can be accurately reconstructed. For a fixed number of users, the larger the noise we choose, the more difficult it is to get an accurate estimate of the power consumption distribution.

Table 4.2: Number of Features Extracted with NILM

Distortion                     Original   Distorted   After LM   After NLM
-----------------------------  ---------  ----------  ---------  ---------
\(\sigma_n^2 = Var(PCT)\)       32         117         51         21
\(\sigma_n^2 = 1.5\,Var(PCT)\)  32         132         77         47
\(\sigma_n^2 = 2\,Var(PCT)\)    32         129         89         60

Figure 4.6: Features extracted after linear filter (feature values versus time of day, for the original PCT and after the LM filter).

Figure 4.7: Features extracted after non-local mean filter (feature values versus time of day, for the original PCT and after the NLM filter).

2. Power Consumption Distribution Reconstruction

Fig. 4.8 shows the distribution reconstruction results of the proposed scheme without distributed processing. The dataset is generated by the smart grid Simulator discussed in Section 4.4.1.2.
It consists of 10000 households' power consumption data at the same time instant. The variance of the original power consumption data is 7892:2. The disturbance we added to the original power consumption data is a AWGN with mean 100 and the same variance as the original power consumption data. The number of bins for data splitting was set to be 53 and the algorithm stopped at the 20 th iteration. 57 −400 −200 0 200 400 600 800 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 Power Consumption (W) Probability Original Distribution Distorted Distribution Reconstructed Distribution Figure 4.8: Power Consumption Distribution Reconstruction The red curve shows the distribution of the original power consumption of all customers. The blue curve shows distribution of disturbed power con- sumption data. The green curve is the recovered distribution using our pro- posed scheme. The averaged mean square error (MSE) of the disturbed power consumption distribution is 0:1608 while that of recovered distribution is en- hanced to 0:0232. Under the same setting, as shown in Fig. 4.9, the blue curve shows a distribu- tion recovery result of 10-class parallel processing. The MSE of the parallel processing method is 0:0251, almost the same as centric processing. Accord- ing to the averaged performance, we see that the parallel processing method indeed achieves the same level of accuracy of distribution recovery as shown in Fig. 4.10. The red dashed line is the average MSE of centric processing result. 58 −400 −200 0 200 400 600 800 1000 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 Power Consumpton (W) Probability Original Distribution Centric Processing Parallel Processing (5 Groups) Figure 4.9: Centric Processing vs Parallel Processing 10 20 30 40 50 60 70 80 10 −3 10 −2 10 −1 Number of Groups/Neighborhoods Mean Square Error MSE under different number of classes MSE = 0.0069 Figure 4.10: MSE under dierent class numbers 59 Fig. 4.11 and Fig. 
4.12 show the accuracy of the proposed power consumption distribution recovery under different conditions. Specifically, Fig. 4.11 is the MSE curve for different numbers of households. The accuracy of the proposed scheme increases as the number of households grows. On the other hand, Fig. 4.12 is the MSE curve for different disturbance levels, and we can see that the disturbance level does not affect the final accuracy of the proposed scheme. With a fixed number of households, the maximum recovery accuracy is fixed as well. The cost of larger noise is an increase in the number of iterations.

[Figure 4.11: MSE vs Customer Number. Axes: number of households vs. average MSE.]

[Figure 4.12: MSE under different SNR with a fixed number of customers. Axes: number of iterations vs. mean square error. Curves: SNR = 3 dB, 0 dB, -3 dB, -6 dB.]

4.5 Conclusion
In this work, we proposed a privacy-preserving metering scheme for the smart grid. In our scheme, a customer's power consumption data is distorted before it is sent to the public network. Any party receiving the distorted data is unable to restore the original fine-grained power consumption trace, which protects privacy. With the distorted data, a utility or other third-party companies are still able to derive the distribution of the original power consumption data of a certain area. Accurate billing services are also guaranteed through an aggregated billing mechanism. Two attack models were examined to verify the security of the proposed scheme. Experimental results based on real-world data demonstrate the feasibility of our scheme.

Chapter 5 Power Quality Monitoring Using Change-Point Detection

5.1 Introduction
It is estimated that power outages and power quality problems could cost at least $150 billion each year in the U.S. [12].
Motivated by this concern, one of the defining characteristics of the emerging smart grids is their capability of supporting a more stable and higher-quality power supply by leveraging state-of-the-art information technology. To assess power quality (PQ), it is common practice to monitor the quality of voltage and current waveforms by analyzing the real-time information acquired by sensors installed in power distribution networks. In contrast with the sinusoidal power waveform generated by electric utilities, power waveforms over transmission lines are often distorted by PQ problems, which can be classified into two categories: PQ variations and PQ events [7,8]. While PQ variations are characterized by small and gradual deviations from the sinusoidal voltage/current waveforms, PQ events incur large waveform deviations. PQ events are more detrimental to the power distribution network since they may inflict more severe damage such as power outages. Consequently, the occurrence of PQ events has to be detected accurately and in a timely manner to allow appropriate corrective actions. For simplicity of presentation, we concentrate on voltage-based PQ events in this work; the extension to current-based PQ events is straightforward.

In practice, PQ event monitoring consists of two steps: 1) detection and 2) classification. In the first step, the occurrence of a PQ event is declared when the waveform change is detected to exceed a pre-defined threshold. In the second step, the distorted waveforms are fed into a classifier to identify the cause of the PQ event before further analysis is performed. In this work, we focus on developing novel detection schemes for the first step. Readers interested in the classification step of PQ event monitoring are referred to [8] for a comprehensive treatment.

In this work, we study the PQ monitoring problem in a change-point detection theoretic framework.
More specifically, we propose a sequential detection scheme [6] that examines the difference between the statistical distributions of power waveforms before and after the occurrence of a PQ event. Although the pre-change signal statistics can be well characterized, the post-change signal statistics are usually unknown, since they depend on the nature of the underlying PQ event(s). To circumvent this obstacle, we propose to first transform the received signal such that the transformed post-change signal can be modeled as a sum of multiple statistically independent signals. After invoking the central limit theorem, we devise a robust change-point detection approach for PQ monitoring by exploiting change-point detection theory with unknown parameters after change. Since the proposed scheme performs sample-by-sample evaluation, it can achieve the detection task with the finest time resolution.

The rest of this chapter is organized as follows. In Section 5.2, background information about power quality events and conventional techniques for detecting PQ events is introduced. The proposed algorithm is discussed in detail in Section 5.3. Results and performance analysis are presented in Section 5.4. Finally, we conclude this chapter in Section 5.5.

5.2 Background

5.2.1 Power Quality Events
Power quality events, which occur only occasionally, generally have larger deviations than power quality variations. Transients, sags, swells, and harmonics are four major types of power quality events that are harmful to electronic devices, as shown in Fig. 5.1. Transients are power quality events with a very short-duration increase of the voltage (e.g., milliseconds). The main causes of transients are lightning strokes to the wires or to the ground and component switching in the power system. In contrast to transients, voltage sags are short-duration reductions of the voltage, commonly caused by motor starting, transformer energizing, and faults.
Specifically, a sag is declared when the root mean square (RMS) voltage falls below the nominal voltage by 10 to 90% for 0.5 cycle to 1 minute. The majority of equipment problems are caused by voltage sags. Swells are the opposite of voltage sags: when the RMS voltage exceeds the nominal voltage by 10 to 80% for 0.5 cycle to 1 minute, the event is called a swell. Harmonics are signals at integer multiples of the fundamental power system frequency. They are usually created by non-linear devices in the power system.

[Figure 5.1: Examples of Power Quality Events. Panels: (a) Sags, (b) Swells, (c) Transients, (d) Harmonics. Axes: time (s) vs. amplitude (V).]

5.2.2 Conventional Techniques
Three conventional PQ event detection methods have been proposed in the current literature. The first one tracks the RMS value of the voltage waveform over a moving window. The likelihood of a PQ event occurrence is evaluated based on the RMS change across windows. Despite its simplicity, the RMS-based method is effective in detecting amplitude-related distortions. The second one detects the distortion in the frequency domain by transforming the time waveform into the frequency domain using either the wavelet transform or the short-time Fourier transform (STFT) [8]. The third one decomposes the waveform into a sum of damped sinusoids using super-resolution spectral analysis techniques such as estimation of signal parameters via rotational invariance techniques (ESPRIT) or multiple signal classification (MUSIC) [11]. The distorted waveform is detected by comparing the decomposed frequency-domain components of a monitored waveform with those of the normal one. The latter two methods are more sensitive to frequency distortions.
Note that a sliding window is also required in the last two methods to segment the waveform into blocks before any transformation or decomposition is applied [8]. As a result, the time resolution of all three methods is restricted by the sliding window size. Unfortunately, the sliding window has to be sufficiently large to meet the detection rate and false alarm rate requirements.

5.3 Proposed Algorithm
In the proposed algorithm, we model the received signal and apply change-point detection to achieve prompt and accurate PQ event detection. We first discuss the scheme for the single-sensor scenario. Then we extend the scheme to the multi-sensor situation with improved detection latency.

5.3.1 Problem Formulation
Without loss of generality, a PQ event is assumed to take place at time $t = t_e$. The goal is to detect the PQ event with the minimum delay and the highest detection accuracy. Note that the proposed technique can be extended in a straightforward manner to detect the end of a PQ event. Here, we focus on detecting the occurrence of a PQ event.

The continuous-time waveform signal before the PQ event is measured and sampled. The $k$-th sample can be modeled as

$$ y[k] = s_0[k] + n[k], \tag{5.1} $$

where $n[k]$ is additive white Gaussian noise (AWGN) with zero mean and variance $\sigma_n^2$, denoted by $N(0, \sigma_n^2)$, and

$$ s_0[k] = a_0 \sin(2\pi f_0 T_s k + \phi_0) \tag{5.2} $$

is the undistorted power waveform, with $T_s$ being the sampling period and $\theta_0 \overset{def}{=} [a_0, f_0, \phi_0]^T$, where $a_0 = 1$ is the signal amplitude gain, and $f_0$ and $\phi_0$ are the fundamental frequency and the initial phase of the power waveform, respectively. Note that we have implicitly assumed that the variance of $n[k]$ is independent of $k$. Similarly, we can model the power waveform after the PQ event as

$$ y[k] = s_1[k] + n[k], \quad t \ge t_e. \tag{5.3} $$

We use $p_0(y)$ and $p_1(y)$ to denote the probability density functions (PDF) of $y$ before and after the PQ event, respectively.
Clearly, $p_0(y)$ can be well estimated because $\{a_0, f_0, \phi_0\}$ are deterministic and $\sigma_n^2$ can be accurately measured. In contrast, $p_1(y)$ depends on the specific type of PQ event under consideration. It is generally difficult to fully characterize $p_1(y)$ before the occurrence of the PQ event, which handicaps conventional statistical hypothesis testing methods such as the Neyman-Pearson test.

As a result, most conventional PQ event detection methods are designed to directly exploit the instantaneous changes in the amplitude gain, fundamental frequency, or phase without utilizing their long-term statistics. For instance, the conventional RMS method concentrates on amplitude changes by sampling the voltage waveform and computing its RMS. Let $y_k$ be the $k$-th sample of the voltage waveform. The conventional RMS method tracks the sample RMS over a sliding window of size $N$, where $N$ usually covers one cycle of the power-system frequency [8]. Mathematically, the $q$-th RMS is given by

$$ Y_{rms}(q) = \sqrt{\frac{1}{N} \sum_{k=q-N+1}^{q} y_k^2}. \tag{5.4} $$

A PQ event is detected if the change of the current RMS value is larger than a pre-defined threshold, expressed as a fraction of the nominal voltage. Besides the time-resolution problem associated with the sliding window size, conventional methods are sub-optimal because they do not exploit the statistical distributions before and after the PQ event.

5.3.2 Signal Statistical Modeling Methods
In this section, we derive a PQ event detection scheme from the cumulative sum (CUSUM) algorithm, one of the best-known algorithms in change-point detection theory. The pre-event PDF, $p_0(y)$, is assumed to be known, while the post-event PDF, $p_1(y)$, is unknown. To circumvent the uncertainty of the post-event PDF, the weighted CUSUM algorithm is employed to replace the conventional log-likelihood ratio (LLR) test. In the following, we assume prior knowledge of $a_0 = 1$, $f_0$ (i.e., either 50 or 60 Hz), $\phi_0$, and $\sigma_n^2$.
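As a point of reference for the conventional baseline, the sliding-window RMS of Eq. (5.4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the 10% relative threshold, and the comparison against a fixed nominal RMS are all hypothetical choices made for the example.

```python
import numpy as np


def sliding_rms(y, window):
    """RMS over every trailing window of `window` samples, as in Eq. (5.4)."""
    y = np.asarray(y, dtype=float)
    if window < 1 or window > y.size:
        raise ValueError("window must be between 1 and len(y)")
    # A cumulative sum of squares yields every window sum in O(n).
    csum = np.concatenate(([0.0], np.cumsum(y ** 2)))
    return np.sqrt((csum[window:] - csum[:-window]) / window)


def detect_rms_event(y, window, nominal_rms, rel_threshold=0.1):
    """Index of the last sample of the first window whose RMS deviates from
    `nominal_rms` by more than `rel_threshold` (assumed rule), else None."""
    rms = sliding_rms(y, window)
    deviation = np.abs(rms - nominal_rms) / nominal_rms
    hits = np.flatnonzero(deviation > rel_threshold)
    if hits.size == 0:
        return None
    return int(hits[0]) + window - 1


if __name__ == "__main__":
    n_per_cycle = 64                      # assumed samples per cycle
    k = np.arange(10 * n_per_cycle)
    wave = np.sin(2 * np.pi * k / n_per_cycle)
    wave[5 * n_per_cycle:] *= 0.5         # synthetic sag to half amplitude
    print(detect_rms_event(wave, n_per_cycle, 1 / np.sqrt(2)))
```

Because the detector must wait until enough distorted samples have entered the window, the reported index lags the true onset, which illustrates the time-resolution limitation discussed above.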
5.3.2.1 Generic Modeling

1. Pre-event Signal Modeling
Since $\{a_0, f_0, \phi_0\}$ are known, $s_0$ becomes deterministic. We begin by transforming $y[k]$ in (5.1) into $z[k]$ as

$$ z[k] = y[k] - s_0[k] = n[k], \quad 0 \le t < t_e. \tag{5.5} $$

Thus, the PDF of $z$ is simply $p_0(z) = N(0, \sigma_n^2)$.

2. Post-event Signal Modeling
According to Eq. (5.3), the post-event signal is regarded as the sum of the distorted power signal $s_1[k]$ and the additive Gaussian noise $n[k]$. We further model $s_1[k]$ as

$$ s_1[k] = a_1 \sin(2\pi f_1 T_s k + \phi_1) + \epsilon_{\varphi}[k], \tag{5.6} $$

with $\epsilon_{\varphi}[k]$ being the additive distortion parameterized by $\varphi$, and $\theta_1 \overset{def}{=} [a_1, f_1, \phi_1, \varphi^T]^T$.

Next, we derive the post-event PDF using results from change-point detection theory with unknown parameters after change. Two solutions have been developed in change-point detection theory [6]: the weighted CUSUM method and the generalized likelihood ratio (GLR) CUSUM method. In this work, the weighted CUSUM method is adopted due to its simplicity. Similar to (5.5), we transform Eq. (5.3) as

$$ z[k] = y[k] - s_0[k] = x[k] + w[k], \tag{5.7} $$

where

$$ x[k] = a_1 \sin(2\pi f_1 T_s k + \phi_1), \tag{5.8} $$
$$ w[k] = \epsilon_{\varphi}[k] - s_0[k] + n[k]. \tag{5.9} $$

Since $\theta_1$ is unknown, rather than evaluating the LLR $\ln \frac{p_1(z_i)}{p_0(z_i)}$ directly, we compute the logarithm of the weighted likelihood ratio with the weighted CUSUM method as

$$ s_i = \ln \int_{\Theta_1} \frac{p_1(z_i)}{p_0(z_i)} \, dF_{\Theta_1}(\theta_1), \tag{5.10} $$

where $F_R(r)$ is the cumulative distribution function (CDF) of the enclosed random variable $R$.

By invoking the central limit theorem, we can approximate the PDF of $w$ as $N(0, \sigma_w^2)$, where $\sigma_w^2 = \sigma_{\epsilon}^2 + \sigma_n^2 + \frac{1}{2} a_0^2$. Furthermore, recall that $x[k]$ is approximately uniformly distributed over $[-|a_1|, +|a_1|]$. Thus, it is straightforward to show that

$$ p_1(z) = \frac{1}{4|a_1|} \left[ \mathrm{erf}\!\left( \frac{z + |a_1|}{\sqrt{2}\,\sigma_w} \right) - \mathrm{erf}\!\left( \frac{z - |a_1|}{\sqrt{2}\,\sigma_w} \right) \right]. \tag{5.11} $$

With the assumption that $x[k]$ and $w[k]$ are statistically independent, we can express $F_{\Theta_1}(\theta_1)$ as

$$ F_{\Theta_1}(\theta_1) = F_{A_1}(a_1) \, F_{\Sigma_w}(\sigma_w). \tag{5.12} $$

As a result, Eq.
(5.10) becomes

$$ s_i = \ln \int_{A_1} \int_{\Sigma_w} \frac{p_1(z_i)}{p_0(z_i)} \, dF_{\Sigma_w}(\sigma_w) \, dF_{A_1}(a_1). \tag{5.13} $$

The most commonly used distributions for $F(\cdot)$ include the uniform and Gaussian distributions [6]. Unless otherwise specified, the Gaussian distribution is employed in our simulation.

5.3.2.2 Event-Specific Modeling

1. Pre-event Signal Modeling
We use the same pre-event signal model as in generic modeling. Therefore,

$$ z[k] = y[k] - s_0[k] = n[k], \quad 0 \le t < t_e. \tag{5.14} $$

The PDF of $z$ is again $p_0(z) = N(0, \sigma_n^2)$.

2. Post-event Signal Modeling
For the post-event signal, we consider the specific properties of different power quality events. Based on the mathematical models of the common power quality events shown in Table 5.1, we classify the power quality events into two categories: sags/swells and transients/harmonics.

Table 5.1: Mathematical Models of Major Power Quality Events
Event Category   Mathematical Model
Pure Signal      $s(t) = A \sin(wt)$
Sag              $s(t) = A (1 - k (u(t_2) - u(t_1))) \sin(wt)$
Swell            $s(t) = A (1 + k (u(t_2) - u(t_1))) \sin(wt)$
Transient        $s(t) = A [\sin(wt) + k \exp(-(t - t_1)/\tau) \sin(w_n (t - t_1)) (u(t_2) - u(t_1))]$
Harmonics        $s(t) = A \sin(wt) + h_2 \sin(2wt) + h_3 \sin(3wt) + h_5 \sin(5wt) + \dots$

For sags/swells, we model the power waveform after the PQ event as

$$ y[k] = s_0[k] + u[k] + n[k], \quad t \ge t_e, \tag{5.15} $$

where $u[k]$, which is uniformly distributed over $[-|b|, +|b|]$, indicates the additive deviation caused by sag or swell events. Specifically, $u[k] \le 0$ if a sag event occurs, while $u[k] \ge 0$ if a swell event occurs.

Similar to Eq. (5.5), we transform Eq. (5.15) as

$$ z[k] = y[k] - s_0[k] = u[k] + n[k]. \tag{5.16} $$

We can then derive the PDF of $z$ as

$$ p_1(z) = \frac{1}{4|b|} \left[ \mathrm{erf}\!\left( \frac{z + |b|}{\sqrt{2}\,\sigma_n} \right) - \mathrm{erf}\!\left( \frac{z - |b|}{\sqrt{2}\,\sigma_n} \right) \right]. \tag{5.17} $$

For transients/harmonics, we approximately model the power waveform after the PQ event as

$$ y[k] = s_0[k] + e[k] + n[k], \quad t \ge t_e, \tag{5.18} $$

where $e[k] \sim N(0, \sigma_e^2)$ indicates the additive deviation caused by transient or harmonic events. Similar to Eq. (5.5), we transform Eq.
(5.18) as

$$ z[k] = y[k] - s_0[k] = e[k] + n[k]. \tag{5.19} $$

We can then derive the PDF of $z$ as $p_1(z) = N(0, \sigma_n^2 + \sigma_e^2)$.

5.3.3 Uncertainty Modeling
No matter which statistical modeling method we choose, we have to handle the unknown parameters of the post-event model. Specifically, the generic model has two unknown parameters, $\sigma_w$ and $a_1$. The event-specific models have only one unknown parameter per category: $\sigma_e$ for transients and harmonics, or $b$ for sags and swells. In the proposed scheme, we consider all possible values of the unknown parameters and calculate the weighted sum of the likelihood ratio. The uncertainty modeling problem is to decide which distribution function $F_{\Theta_1}(\theta_1)$ best matches the real distribution of the unknown parameters. We investigate four different uncertainty modeling methods: the uniform, Gaussian, Gamma, and inverse Gamma distributions. The differences in their performance are shown in Section 5.4.

5.3.4 Weighted CUSUM-based Scheme
Given the sequence of samples $z[k]$, $k = 1, 2, \dots$, the weighted likelihood ratio for the samples from $j$ to $k$ is expressed as

$$ \Lambda_j^k = \int_{-\infty}^{\infty} \frac{p_1(z[j], \dots, z[k])}{p_0(z[j], \dots, z[k])} \, dF(\theta_1). \tag{5.20} $$

For generic modeling, $dF(\theta_1) = dF(\sigma_w) \, dF(a_1)$. For event-specific modeling, $dF(\theta_1) = dF(b)$ for sags and swells, and $dF(\theta_1) = dF(\sigma_e)$ for transient and harmonic events. The stopping time is then determined as

$$ t_a = \min\{ k : \max_{1 \le j \le k} \ln \Lambda_j^k \ge h \}. \tag{5.21} $$

As an example, the proposed weighted CUSUM-based PQ event detection scheme for generic modeling is summarized in Algorithm 2. A larger threshold $h$ increases the detection latency, while a smaller one raises more false alarms. The selection of $h$ is mainly determined by the pre-defined false alarm rate and detection rate. According to Wald's inequality, the pre-defined false alarm rate $\alpha$ and the threshold $h$ are related by

$$ e^{-h} = \alpha. \tag{5.22} $$

Then, we can choose $h = -\ln \alpha$.
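The per-sample form of the weighted CUSUM recursion, together with the threshold choice $h = -\ln \alpha$, can be sketched in Python as follows. The sketch makes several assumptions that are not part of the scheme as stated: it uses the event-specific transient/harmonic model ($p_0 = N(0, \sigma_n^2)$, $p_1 = N(0, \sigma_n^2 + \sigma_e^2)$), it replaces the integral over $\sigma_e$ with a finite sum over a hypothetical grid with hypothetical weights, and all function and variable names are illustrative.

```python
import numpy as np


def weighted_llr(z, sigma_n, sigma_e_grid, weights):
    """Per-sample log weighted likelihood ratio
    ln sum_j w_j * p1(z; sigma_e_j) / p0(z), evaluated in the log domain.

    p0 = N(0, sigma_n^2) and p1 = N(0, sigma_n^2 + sigma_e^2).
    """
    z = np.asarray(z, dtype=float)[:, None]
    var_0 = float(sigma_n) ** 2
    var_1 = var_0 + np.asarray(sigma_e_grid, dtype=float)[None, :] ** 2
    log_w = np.log(np.asarray(weights, dtype=float))[None, :]
    # log p1 - log p0 for two zero-mean Gaussians
    log_ratio = (-0.5 * z ** 2 * (1.0 / var_1 - 1.0 / var_0)
                 - 0.5 * np.log(var_1 / var_0))
    return np.logaddexp.reduce(log_w + log_ratio, axis=1)


def cusum_detect(s, h):
    """Run g_k = S_k - min_{j<=k} S_j and stop at the first k with g_k >= h.

    Returns (index of first alarm or None, decision statistic g).
    """
    S = np.cumsum(np.asarray(s, dtype=float))
    g = S - np.minimum.accumulate(S)
    hits = np.flatnonzero(g >= h)
    k_hat = int(hits[0]) if hits.size else None
    return k_hat, g


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma_n = 0.1
    # Residual z[k] = y[k] - s0[k]: noise only, then noise plus distortion.
    z = np.concatenate((rng.normal(0.0, sigma_n, 500),
                        rng.normal(0.0, np.hypot(sigma_n, 1.0), 100)))
    grid = [0.5, 1.0, 2.0]                # hypothetical sigma_e values
    w = [1 / 3, 1 / 3, 1 / 3]             # hypothetical uniform weights
    h = -np.log(1e-6)                     # threshold from h = -ln(alpha)
    k_hat, _ = cusum_detect(weighted_llr(z, sigma_n, grid, w), h)
    print("alarm at sample", k_hat)
```

Working in the log domain avoids the underflow of $p_0(z)$ that a direct ratio of densities would suffer when a post-event sample lies many noise standard deviations from zero.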
Further, according to Lorden's theory [], the worst mean detection delay is related to the threshold as

$$ \bar{\tau}^{*} = \frac{h}{K(\theta_1, \theta_0)} + O(1), \tag{5.23} $$

where $K(\theta_1, \theta_0)$ is the Kullback information between the post-event distribution $p_1$ and the pre-event distribution $p_0$.

Algorithm 2 Weighted CUSUM-based PQ-event detection
Inputs: samples $\{y_k\}$ and a preset threshold $h$
States: Initialize $\hat{t}_e = 0$
Procedure:
  for $k = 1, 2, \dots, \infty$ do
    $z_k = y_k - s_0(t_k)$;
    $s_k = \ln \left[ \int_{A_1} \int_{\Sigma_w} \frac{p_1(z_k)}{p_0(z_k)} \, dF_{\Sigma_w}(\sigma_w) \, dF_{A_1}(a_1) \right]$;
    $S_k = \sum_{i=1}^{k} s_i$;
    $m_k = \min_{1 \le j \le k} S_j$;
    $g_k = S_k - m_k$;
    if $g_k \ge h$ then
      $\hat{t}_e = t_k$; break;
    end if
  end for
Declare the detection of a PQ event at time $\hat{t}_e$ if $\hat{t}_e \ne 0$.

5.3.5 Multiple-Sensor Detection
We further extend the scheme to the multi-sensor scenario, as shown in Fig. 5.2. Given a set of $L \ge 1$ geographically separated sensors, $S_1, \dots, S_L$, our aim is to quickly detect the change based on the observations $X_1(k), \dots, X_L(k)$, $k \ge 1$, from the distributed sensors. Two scenarios, centralized and decentralized, are possible. In the centralized case, the original data are sent to the fusion center for the final decision, while in the decentralized case, compressed data are sent. The methods investigated below both fall into the decentralized category, in view of the communication bandwidth constraints.

[Figure 5.2: Multi-Sensor Scenario]

5.3.5.1 MBQCUSUM Scheme
Proposed by Tartakovsky [37, 38], the MBQCUSUM scheme was proven to be asymptotically optimal at the reference points. The scheme first defines an interval $[\underline{\theta}, \overline{\theta}]$ for the post-change parameters. Then, $M \ge 2$ reference points $\theta_m \in [\underline{\theta}, \overline{\theta}]$, $m = 1, \dots, M$, are randomly selected. At each of those points, the sensors perform binary quantization of the observations using LLR quantizers, as shown in Eq. (5.24):

$$ U_{m,l}(k) = \begin{cases} 1, & \text{if } L_{m,l}(X_l(k)) > t_{m,l}, \\ 0, & \text{otherwise,} \end{cases} \tag{5.24} $$

where $L_{m,l}(n) = \log[ g_l^{\theta_m}(X_l(n)) / f_l^{0}(X_l(n)) ]$ is the LLR tuned to $\theta_m$.
To achieve the optimal performance, the quantization threshold $t_{m,l}$ is chosen so that the K-L divergence is maximized for the corresponding point, i.e.,

$$ I^{BQ}_{l,m}(t^{0}_{l,m}) = \max_{t_{l,m} \ge 0} I^{BQ}_{l,m}(t_{l,m}). \tag{5.25} $$

The stopping time of each reference point is similarly defined as

$$ \tau_{BQ,m}(h_m) = \min\{ n \ge 1 : W^{BQ}_m(n) \ge h_m \}, \tag{5.26} $$

where

$$ W^{BQ}_m(n) = \max\Big\{ 0, \; W^{BQ}_m(n-1) + \sum_{l=1}^{L} L^{BQ}_{m,l}(n) \Big\}. \tag{5.27} $$

The ultimate stopping time is then the minimum of $\tau_{BQ,m}(h_m)$, $1 \le m \le M$:

$$ \tau_{BQ}(h) = \min_{1 \le m \le M} \tau_{BQ,m}(h_m), \tag{5.28} $$

where $h = (h_1, h_2, \dots, h_M)$, $h_m \ge 0$.

5.3.5.2 MVWCUSUM Scheme
Based on the proposed weighted CUSUM-based scheme, we extend the scheme to multi-sensor scenarios using majority voting. With majority voting, the overall false alarm rate (FAR) can be further improved. In other words, at the same FAR level, the system can reduce the detection delay by introducing multiple sensors for monitoring.

Consider a sequence of local decisions from sensors distributed in different places, $d_1(k), d_2(k), d_3(k), \dots, d_L(k)$, where $d_i(k) \in \{1, 0\}$ and $L$ denotes the total number of sensors. With a pre-defined FAR $\alpha$, the fusion center has to decide between the following two hypotheses:

$H_1$: a certain type of power quality event has just happened
$H_0$: normal power signal, no event has happened

With majority voting, the final decision $D$ can be expressed as in Eq. (5.29):

$$ D(k) = \begin{cases} 1, & \text{if } \sum_{i=1}^{L} d_i(k) > \frac{L}{2}, \\ 0, & \text{otherwise.} \end{cases} \tag{5.29} $$

[Figure 5.3: FAR Improvement. Axes: FAR of single detection vs. FAR of majority voting. Curves: L = 1, 3, 10, 20, 30, 40, 50.]

The stopping time is then determined as

$$ \tau = \min\{ k \ge 1 : D(k) = 1 \}. \tag{5.30} $$

Assuming that $d_1(k), d_2(k), \dots, d_L(k)$ are independent, the FAR of the MVWCUSUM scheme is derived as in Eq. (5.31):

$$ \alpha_{MVWCUSUM} = P(D(k) = 1 \mid H_0) = P\Big( \sum_{i=1}^{L} d_i(k) > \frac{L}{2} \,\Big|\, H_0 \Big) = \sum_{q = \lfloor L/2 \rfloor + 1}^{L} \binom{L}{q} \alpha^{q} (1 - \alpha)^{L - q}. \tag{5.31} $$

Fig.
5.3 shows the relationship between $\alpha$ and $\alpha_{MVWCUSUM}$ for different numbers of sensors. When the FAR of single-sensor detection is smaller than 0.4, the more sensors are used, the smaller the FAR that can be achieved with majority voting. In practice, the FAR of a single sensor is generally smaller than 0.1. Taking $\alpha = 0.05$ as an example, the FAR of the MVWCUSUM scheme is reduced to 0.0073 in the 3-sensor scenario and to 2.7e-6 in the 10-sensor scenario. Majority voting also enables the system to detect power quality events faster. For a single sensor, a larger detection threshold leads to a larger detection delay but a smaller FAR. With multiple sensors, to achieve the same FAR requirement, we are able to use a smaller threshold for each sensor and thus achieve a smaller detection delay.

5.4 Results and Performance Analysis
In this section, simulation results are provided to compare the performance of the proposed scheme with several previous detection schemes. We use ATP to simulate two types of PQ events, i.e., voltage transients and dips, generated with the IEEE 14-bus test setup specified in [45]. For a fair comparison, we employ the same sampling period of $T_s = 10$ ms for all detection methods under consideration without individually optimizing $T_s$ for each method. The PQ event is set to take place at $t_e = 0.06$ s in the simulation. Furthermore, we define the signal-to-noise ratio (SNR) as $1/\sigma_n^2$ while fixing $a_0 = 1$.

For the multi-sensor scenario, we use the Matlab SimPowerSystems toolbox to build the physical model of the power transmission line, with the power quality event occurrence time fixed at $t_e \approx 1/60$ s (0.0168 s). As shown in Fig. 5.4, VM1, VM2, and VM3 indicate the three sensors geographically located along the power transmission line. The distances between VM1 and VM2 and between VM2 and VM3 are both 150 km. Fig. 5.5 shows the measurements of each sensor.

Fig. 5.6 depicts the power waveform distorted by a PQ transient event at $t = 0.06$ s, caused by switching in the capacitor at bus 9. Figs.
5.7 and 5.8 show the temporal-frequency plot using STFT and the spectral estimates using MUSIC over multiple windows, respectively. As shown in Figs. 5.7 and 5.8, it is difficult to detect the PQ event directly from these plots.

[Figure 5.4: SimPowerSystems Model for Power Quality Events. Block diagram: a 6 x 350 MVA, 13.8 kV machine initialized for P = 1500 MW and Vt = 13.8 kV; transmission lines Line 1 (150 km), Line 2 (150 km), and Line 3 (300 km); voltage measurements VM1, VM2, and VM3; discrete sample time Ts = 50e-6 s.]

[Figure 5.5: Measurements of Voltage Sags by 3 Sensors. Axes: time (second) vs. voltage (Volt). Curves: power signal at VM1, VM2, and VM3.]

[Figure 5.6: Illustration of a voltage transient event. Axes: time vs. normalized voltage.]

[Figure 5.7: The temporal-frequency plot using STFT w.r.t. a transient event.]

[Figure 5.8: Spectral estimates with MUSIC w.r.t. a transient event. Axes: normalized frequency (rad) vs. power (dB). Curves: windows [0, 0.03], [0.03, 0.06], [0.05, 0.07], [0.06, 0.08].]

In contrast, Figs. 5.9 and 5.10 show the sample-by-sample RMS generated by the conventional RMS scheme and the logarithm of the weighted likelihood ratio generated by the proposed CUSUM scheme at an SNR of 20 dB.
Evidently, the proposed CUSUM scheme gives a much stronger indication of the PQ event occurrence at $t = 0.06$ s. To compare the performance of the RMS scheme and the proposed CUSUM scheme quantitatively, we define the following mean squared error (MSE) of the event detection as the performance metric:

$$ MSE = E\left\{ \left| \hat{t}_e - 0.06 \right|^2 \right\}. \tag{5.32} $$

Note that a more systematic evaluation can be performed in terms of the false alarm rate and the detection delay, as shown in [24].

To set the threshold employed in the RMS and the proposed CUSUM schemes, we determine the optimal threshold for each scheme by exhaustive search. Fig. 5.11 shows an example of the MSE performance as a function of the threshold $h$ for the CUSUM scheme at SNR = 20 dB, which suggests that the optimal threshold for the CUSUM scheme is about 10 for this SNR value. Fig. 5.12 compares the MSE performance of the CUSUM and RMS schemes as a function of SNR. As shown in this figure, the CUSUM scheme outperforms the RMS scheme by a large margin. We would also like to point out that the RMS scheme shown in Fig. 5.12 is performed in a sample-by-sample fashion (rather than the typical cycle-by-cycle fashion). Thus, the MSE performance depicted in Fig. 5.12 is the best performance that the RMS scheme can achieve.

Voltage sags are another type of PQ event. Fig. 5.13 illustrates a voltage sag event at bus 9 due to a temporary ground fault. We can evaluate its temporal-frequency plot using STFT and its spectral estimates using the MUSIC scheme over multiple windows. As in Figs. 5.7 and 5.8, it is difficult to detect the event occurrence using these two schemes.

[Figure 5.9: Sample-by-sample RMS of a transient event as a function of time (SNR = 20 dB). Axes: time, k vs. RMS.]
[Figure 5.10: The logarithm of the weighted likelihood ratio of a transient event using CUSUM (SNR = 20 dB). Axes: time, k vs. S_k.]

[Figure 5.11: The MSE versus the CUSUM threshold in a transient event (SNR = 20 dB). Axes: CUSUM threshold vs. MSE.]

[Figure 5.12: The MSE as a function of the SNR value for CUSUM and RMS in a transient event. Axes: SNR (dB) vs. MSE.]

[Figure 5.13: Illustration of a voltage sag event. Axes: time vs. normalized voltage.]

Figs. 5.14 and 5.15 show the sample-by-sample RMS generated by the RMS scheme and the logarithm of the weighted likelihood ratio generated by the CUSUM scheme. As in Figs. 5.9 and 5.10, the occurrence of the sag event is easier to identify from the waveforms shown in Figs. 5.14 and 5.15.

Fig. 5.16 shows the MSE performance of the proposed CUSUM and the conventional RMS schemes as a function of SNR. Inspection of Fig. 5.16 reveals that the CUSUM scheme outperforms the RMS scheme by a significant margin.

Fig. 5.17 shows the detection latency of the WCUSUM scheme using different statistical modeling methods. As shown in the figure, event-specific modeling has a slight advantage over generic modeling. In addition, when SNR $\le$ 10 dB, generic modeling fails to detect the event, while the event-specific model still works well. Therefore, the event-specific model is also more robust in power quality event detection.

Fig. 5.18 shows the results of the WCUSUM scheme under different uncertainty modeling methods. From (a), we see that each of the modeling methods generates a good result for change-point detection. The difference among them is hardly
The difference among them is hardly noticeable in the WCUSUM curves. However, in (b), we see that the Gaussian modeling method achieves better performance than the other three. In particular, it still works well when the SNR becomes small.

[Figure 5.14: Sample-by-sample RMS of a sag event as a function of time (SNR = 20 dB). Axes: time, k vs. RMS.]

[Figure 5.15: The logarithm of the weighted likelihood ratio of a sag event using CUSUM (SNR = 20 dB). Axes: time, k vs. S_k.]

[Figure 5.16: MSE performance as a function of SNR for CUSUM and RMS in a sag event. Axes: SNR (dB) vs. MSE.]

[Figure 5.17: Performance of different modeling methods. Axes: SNR (dB) vs. MMSE. Curves: generic model, event-specific model.]

[Figure 5.18: Performance of different uncertainty modeling methods. Panels: (a) WCUSUM curves, time (second) vs. amplitude, for Gaussian modeling (m = 1, sd = 1), Gamma modeling (k = 1, th = 2), and inverse Gamma modeling (alpha = 3, beta = 0.5); (b) detection delay, SNR (dB) vs. MMSE, for uniform, Gaussian, Gamma, and inverse Gamma.]

[Figure 5.19: MBQCUSUM vs MVWCUSUM. Panels: (a) detection delay (threshold = 10) vs. SNR (dB); (b) false alarm rate vs. SNR (dB).]

Lastly, Fig. 5.19 shows the averaged detection results of the MBQCUSUM scheme and the MVWCUSUM scheme. From Fig. 5.19(a), we see that the MBQCUSUM scheme can detect the event faster than the MVWCUSUM scheme. However, from Fig. 5.19(b), we see that the false alarm rate of the MBQCUSUM scheme is much higher than that of the MVWCUSUM scheme when the SNR is smaller than 28 dB.
When SNR $\le$ 26 dB, the MBQCUSUM scheme generally fails due to its large false alarm rate. Therefore, the MVWCUSUM scheme is more robust than the MBQCUSUM scheme.

5.5 Conclusion
A change-point detection approach to power quality event detection in smart grids was examined in this research. The proposed MVWCUSUM scheme computes the logarithm of the weighted likelihood ratio by exploiting both the instantaneous and the long-term information of the power waveform. The superior performance of the proposed MVWCUSUM scheme in the presence of major power quality events, such as voltage transients, sags, swells, and harmonics, was shown by computer simulation.

Chapter 6 Conclusion and Future Work

6.1 Summary of the Research
In this dissertation, we proposed three novel schemes that address the security and privacy concerns of the smart grid system.

A secure smart grid system was studied in Chapter 3. We proposed a system framework that uses homomorphic encryption in the smart grid to eliminate privacy concerns over metering data transmission, sharing, and operations among different parties. Under the assumption that all parties except the newly introduced PKC-HE system are untrusted, the proposed scheme protects data privacy to the highest degree. Detailed data exchange and key management issues were discussed. On the basis of the proposed system architecture, a partially homomorphic encryption scheme was implemented. Two important smart grid applications, privacy-preserving data aggregation and statistical analysis, were well supported by the proposed system.

A privacy-preserving metering scheme was studied in Chapter 4. We proposed a metering scheme in which smart meter data is first distorted by Gaussian noise with large variance before being sent to the utility company. An aggregated billing mechanism was designed to report accurate billing as demanded. With the distorted power consumption trace, we proved that current load analysis technology (NILM) fails to
With the distorted power consumption trace, we proved that current load analysis technology (NILM) fails to derive any valuable information about a customer's daily activities. As potential attacks, two well-known de-noising schemes, the linear mean filter and the non-local mean filter, were also investigated. The ineffectiveness of both de-noising schemes was shown empirically and theoretically. In addition, a distribution reconstruction algorithm was proposed to estimate the original distribution of power consumption from the distorted power consumption data. To be more efficient and scalable, a distributed processing scheme was proposed and studied.

Power quality monitoring issues were studied in Chapter 5. Targeting power quality events, we proposed a detection scheme based on the change-point detection technique. After modeling the pre-event and post-event power signals, a weighted CUSUM algorithm was designed and implemented to detect the exact time of the change point. We then extended the scheme to the multi-sensor scenario for distributed change-point detection. Compared with conventional techniques such as RMS, STFT, MUSIC and MBQCUSUM, the superiority of the proposed algorithm was shown in terms of detection delay and robustness.

6.2 Future Research Topics

Several research problems deserve further investigation to make our current work more complete:

Homomorphic Encryption-based Secure System
- Smart grid-specific partially homomorphic encryption scheme: because the data operations in the smart grid are relatively simple, a powerful fully homomorphic encryption scheme is unnecessary. A partially homomorphic encryption scheme could be powerful enough to support all kinds of data operations in the smart grid. Designing a smart grid-specific partially homomorphic encryption scheme is one of our future topics.

Privacy-preserving Metering Scheme
- Optimum disturbance problem: for a fixed number of customers, the power of the added noise is closely related to the distortion effect and to the accuracy of distribution reconstruction. Specifically, larger noise makes recovery of the original power consumption trace more difficult, but it also requires more data to achieve the same accuracy of distribution reconstruction. How to select the noise power to balance security against distribution reconstruction accuracy deserves further investigation.

Power Quality Monitoring Using Change-point Detection
- Power event characterization and classification: characterize each type of power quality event for automatic classification.
Abstract
Recent years have witnessed the fast development of smart grids all over the world. The introduction of digital communication technologies into the power system makes smart grids more efficient and intelligent. In the meantime, however, widespread security and privacy concerns arise from the increasing system complexity. Without proper protection, smart grids are extremely vulnerable to various attacks, such as conventional physical damage and emerging cyber attacks. On the other hand, even tiny system faults, if not detected and resolved in real time, can lead to large-scale power outages with unexpected losses. In addition, customers' privacy is severely threatened by the provision of fine-grained power consumption data in smart grids. Motivated by these concerns, three novel schemes from different technical perspectives are proposed in the dissertation.

For the first topic, an efficient homomorphic encryption-based system was proposed for securing data transmission, data sharing and operations among different parties. In this work, we first proposed a system framework tailored for homomorphic encryption techniques, which have great potential to secure data, enable privacy-preserving data sharing and thereby improve the overall efficiency of smart grids. Based on the proposed system framework, we then designed a practical system with an extended partially homomorphic encryption scheme. With homomorphic features, we prove that the designed system supports privacy-preserving data aggregation and statistical analysis of power consumption in smart grids.

For the second topic, a metering scheme was proposed to protect customers' privacy. In this work, a reading distortion scheme was first designed to distort smart meter data in such a way that only data senders (i.e., customers) are able to access the original power consumption data. With the distorted power consumption data, an aggregated billing mechanism was then proposed to guarantee accurate billing service.
For power consumption analysis and prediction, we designed a distribution reconstruction algorithm to recover the original power consumption distribution from the distorted power consumption data. To demonstrate the security of the scheme, two potential attacks were investigated theoretically. Experimental results on real-world power consumption data were discussed at the end.

For the third topic, a power quality monitoring scheme using change-point detection techniques was investigated. After modeling the pre-event and post-event power signals, we proposed a weighted CUSUM algorithm to detect common power quality events, i.e., sags, transients, swells and harmonics. With experimental results, we compared the proposed scheme with conventional power quality monitoring techniques and concluded that the proposed scheme is superior. We also extended the scheme to a distributed version for the multi-sensor scenario. The proposed MVWCUSUM scheme is compared with the recent MBQCUSUM scheme in terms of detection latency and robustness.
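For readers unfamiliar with CUSUM, a minimal sketch of the textbook (unweighted) recursion for a known shift in the mean of Gaussian samples is given below. It is not the weighted or multi-sensor variant proposed in the dissertation; the function name, the per-unit RMS values, the noise level and the threshold are illustrative assumptions.

```python
def cusum_detect(samples, mu0, mu1, sigma, threshold):
    """Return the index of the first CUSUM alarm, or None if none is raised.

    Assumes i.i.d. Gaussian samples with known standard deviation `sigma`
    whose mean shifts from `mu0` to `mu1` at an unknown time.
    """
    gain = (mu1 - mu0) / sigma ** 2
    midpoint = (mu0 + mu1) / 2.0
    g = 0.0  # cumulative log-likelihood ratio, clamped at zero
    for k, x in enumerate(samples):
        llr = gain * (x - midpoint)  # log-likelihood ratio of sample k
        g = max(0.0, g + llr)
        if g >= threshold:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical noise-free per-unit RMS trace: nominal 1.0, sag to 0.6 at k = 50.
    trace = [1.0] * 50 + [0.6] * 50
    print(cusum_detect(trace, mu0=1.0, mu1=0.6, sigma=0.1, threshold=10.0))  # prints 51
```

Raising `threshold` in this sketch lowers the chance of a false alarm at the cost of a longer detection delay, which is the same trade-off used above to compare the schemes.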
Asset Metadata
Creator
He, Xingze (author)
Core Title
Novel and efficient schemes for security and privacy issues in smart grids
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
07/31/2013
Defense Date
05/06/2013
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
change-point detection,OAI-PMH Harvest,power quality monitoring,privacy,Security,smart grids
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Kuo, C.-C. Jay (committee chair), Huang, Ming-Deh (committee member), Hwang, Kai (committee member)
Creator Email
xingzehe@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-309209
Unique identifier
UC11294904
Identifier
etd-HeXingze-1913.pdf (filename),usctheses-c3-309209 (legacy record id)
Legacy Identifier
etd-HeXingze-1913.pdf
Dmrecord
309209
Document Type
Dissertation
Rights
He, Xingze
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA