Building occupancy modeling and occupancy-loads relationships for building heating/cooling energy efficiency
Building Occupancy Modeling and Occupancy-Loads Relationships for Building Heating/Cooling Energy Efficiency

By Zheng Yang

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
Submitted in Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(CIVIL ENGINEERING)

August 2016

Copyright 2016 Zheng Yang

Table of Contents

ABSTRACT
ACKNOWLEDGMENTS
Chapter 1: Motivation
Chapter 2: Background and Introduction
Chapter 3: Scope and Definitions
Chapter 4: Literature Review and Research Gaps
4.1. Building Occupancy Modeling
4.1.1. Real-time Occupancy Modeling (Detection and Estimation)
4.1.2. Long-term Occupancy Modeling (Profiling)
4.2. Integration of Occupancy and Heating/Cooling Controls
4.2.1. Real-time Occupancy and Heating/Cooling Controls
4.2.2. Long-term Occupancy and Heating/Cooling Controls
4.3. Energy Model Calibration for Energy Simulation
4.3.1. Comparisons of Simulation Programs
4.3.2. Simulation Calibration Methodologies
Chapter 5: Research Objectives and Questions
Chapter 6: Testbed Building and Reference Buildings
6.1. Testbed Building
6.2. Customized Ambient Sensors
6.3. Reference Buildings
6.4. Virtual Reference Buildings
Chapter 7: Framework for Modeling Building Occupancy
7.1. Data Collection and Preparation
7.2. Real-time Occupancy Modeling
7.2.1. Methodology for Real-time Occupancy Modeling
7.2.2. Findings for Real-time Occupancy Modeling
7.2.3. Primary Effects of Ambient Factors
7.2.4. Joint Effects of Ambient Factors
7.2.5. Global Occupancy Modeling
7.2.6. Summary
7.3. Long-term Occupancy Modeling
7.3.1. Methodology for Long-term Occupancy Modeling
7.3.2. Long-term Occupancy Modeling
7.3.3. Findings for Long-term Occupancy Modeling
7.3.4. Evaluation of Representativeness
7.3.5. Occupant Number Profiles
7.3.6. Summary
Chapter 8: Occupancy and Heating/Cooling Loads for Efficiency
8.1. Occupancy Transitions and Loads
8.1.1. Multi-zone Occupancy Transitions Analysis
8.1.2. Occupancy Transitions based Setpoint Control
8.1.3. Algorithm Validation Results
8.1.4. Transitions-Loads Analysis Results
8.1.5. Summary
8.2. Occupancy Diversity and Loads
8.2.1. Methodology for Eliminating Occupancy Diversity
8.2.2. Validation for Quantifying Load Implications of Occupancy Diversity
8.2.3. Generalizability Analysis
8.2.4. Summary
Chapter 9: Multi-level Building Energy Model Calibration
9.1. Selection of Simulation Programs
9.1.1. DOE-2
9.1.2. EnergyPlus
9.1.3. IES-Virtual Environment
9.1.4. ESP-r
9.1.5. TRNSYS
9.1.6. Comparison Discussions
9.2. Methodology for Multi-level Energy Model Calibration
9.2.1. Initial Energy Modeling
9.2.2. Sensitivity Analysis for Parameters
9.2.3. Parameter Estimation
9.2.4. Simulation Discrepancy Analysis
9.2.5. Simulation Discrepancy Minimization
9.3. Energy Model Calibration Evaluation
9.3.1. Case Study Description
9.3.2. Results of Initial Energy Modeling
9.3.3. Results of Sensitivity Analysis for Parameters
9.3.4. Results of Parameter Estimation
9.3.5. Results of Simulation Discrepancy Analysis
9.3.6. Results of Simulation Discrepancy Minimization
9.3.7. Validation Findings and Discussions
9.4. Summary
Chapter 10: Conclusions
Chapter 11: Limitations and Future Research Directions
REFERENCES

List of Figures

Figure 1: Importance of occupant to building HVAC system energy efficiency
Figure 2: Occupancy schedules that are used to recommend HVAC schedules by the ASHRAE 90.1 2004 for an office building
Figure 3: The relationship between occupancy and building HVAC energy simulation
Figure 4: Case study building located on the University of Southern California campus
Figure 5: BLEMS sensor box used for collecting ambient data
Figure 6: Deployment of sensors and zoning on the third floor of the testbed building (the floor plan is modified for privacy purposes; all room features (e.g., size, orientation, etc.) are kept the same)
Figure 7: Reference building developed by Department of Energy
Figure 8: Basic building shapes for mass modeling
Figure 9: Computer application program interface for layout planning
Figure 10: Three typical offices to collect occupancy ground truth using ceiling-mounted cameras (sensor boxes are marked with red circles for their placements)
Figure 11: Process for occupancy detection and personalized occupancy profiling
Figure 12: Locations of the six rooms in the two testbed buildings
Figure 13: Information Gains of Ambient Factors in Single Occupancy Rooms (light colors mean the ranges of information gains in different rooms)
Figure 14: Global occupancy modeling process
Figure 15: Principal Component Analysis (PCA) results for different rooms
Figure 16: Comparison of actual and estimated occupancy
Figure 17: Comparison between daily F-measures and RMSEs
Figure 18: Comparison between daily occupied/unoccupied detection accuracy and daily number estimation accuracy
Figure 19: Daily modeling accuracy and daily occupancy variation level
Figure 20: Daily modeling accuracy and daily occupancy profile difference degree
Figure 21: Daily modeling accuracy and daily average indoor/outdoor temperature difference
Figure 22: Locations of three offices for validating the proposed occupancy profiling methodology
Figure 23: Comparison of occupancy profile from different modeling methods for Room 1
Figure 24: Comparison of occupancy profile from different modeling methods for Room 2
Figure 25: Comparison of occupancy profile from different modeling methods for Room 3
Figure 26: The range of presence probability at each time point for Room 1
Figure 27: The range of presence probability at each time point for Room 2
Figure 28: The range of presence probability at each time point for Room 3
Figure 29: Rooms and zones selected for evaluating the representativeness of occupancy profile in terms of heating/cooling estimation
Figure 30: Deviations of energy consumption, simulated for four months
Figure 31: Comparison of number profile from different modeling methods for Room 2
Figure 32: Comparison of number profile from different modeling methods for Room 2
Figure 33: Comparison of number profile from different modeling methods for Room 3
Figure 34: Relationship between occupancy transitions and heating/cooling load transitions
Figure 35: Energy implication of different combinations of setpoint/setback schedules and distances
Figure 36: Comparison of energy implications between actual and repeated occupancy
Figure 37: Relationship between average temperature and difference of coverage percentage
Figure 38: Neighborhood of solutions for the small reference building with five zones
Figure 39: Heuristics for solutions to update
Figure 40: Trace progress of the maximum heating/cooling load reduction in 5 independent trials of search with different occupancy assignments
Figure 41: Trace progress of the maximum heating/cooling load reduction in 5 independent trials of search with different initial solutions
Figure 42: Load differences between random solutions and calculated global optimum (denoted as the line: Y=0)
Figure 43: Setpoint/setback schedules and distances of all zones in the real-world building for the month of April
Figure 44: Monthly load differences of implementing the EVNS (yellow dots) and shared combinations (blue dots) compared to baseline control in the office building
Figure 45: Load differences between random solutions and calculated global optimum (denoted as the line: Y=0)
Figure 46: Iterative evaluation algorithm for hierarchical clustering and elimination of occupancy diversity
Figure 47: Occupancy profiles for the 28 rooms on the third floor of the testbed building
Figure 48: Original occupancy profile and logic operations applied to form an updated occupancy profile
Figure 49: Iterative evaluation algorithm for eliminating the occupancy diversity (different colors represent connected zones defined by the third level constraints)
Figure 50: Load increments (%) of different trials of eliminating occupancy diversity (the dot with red circle is the trial resulting from the proposed framework)
Figure 51: Load increments (%) of different number of factors for determining connected zones as the third constraints
Figure 52: Load increments (%) of different combinations
Figure 53: Comparison of results for four types of simulations across 100 virtual reference buildings
Figure 54: Energy simulation in DOE-2 (arrow shows the flow of information)
Figure 55: Energy simulation in EnergyPlus (arrow shows the flow of information)
Figure 56: Energy simulation in IES-VE (arrow shows the flow of information)
Figure 57: Energy simulation in IES-VE (arrow shows the flow of information)
Figure 58: Energy simulation in TRNSYS (arrow shows the flow of information)
Figure 59: Categorization of input parameters
Figure 60: Proposed energy model calibration framework
Figure 61: Hierarchy and sequence for observable parameter determination
Figure 62: Architectural model (back side), one typical zone (three offices) and HVAC model
Figure 63: Mean and standard deviation of the elementary effects on the energy simulation for the influential parameters at building level, HVAC load response level and zone level
Figure 64: Multi-regression analysis results at the building level (F=57.24) - See Table 18 for parameter IDs (Estimable parameters and HVAC load response related parameters are shaded)
Figure 65: Multi-regression analysis results at the HVAC load response level (F=32.16) - See Table 19 for parameter IDs (Estimable parameters and HVAC load response related parameters are shaded)
Figure 66: Multi-regression analysis results at the zone level (F=41.59) - See Table 20 for parameter IDs (Estimable parameters and HVAC load response related parameters are shaded)
Figure 67: Data collection periods for energy model calibration and evaluation
Figure 68: MBE values for the calibrated model
Figure 69: CV(RMSE) values for the calibrated model
Figure 70: MBE values and CV(RMSE) values for the calibrated model at the HVAC load response level
Figure 71: Comparison of simulated temperature and actual temperature for randomly selected four zones
Figure 72: Summary of building occupancy modeling and investigations of occupancy-loads relationships

List of Tables

Table 1: Potential energy reduction ranges for HVAC systems in commercial buildings by implementing energy conservation measures in different cities
Table 2: Four levels of building controls under EN 15232 standard
Table 3: Efficiency factors for different levels of building controls under EN 15232
Table 4: Samples collected from six rooms for global occupancy modeling
Table 5: Real-time Occupancy Detection Results
Table 6: Real-time Occupancy Estimation Results
Table 7: Ambient Factor Combination Analysis for Room 2, Using the DT Algorithm (CO2: CO2 Concentration; D: Door Status; DCN: Door Count Net; H: Humidity; L: Light; M: Motion; MCN: Motion Count Net; P: PIR; PCN: PIR Count Net; T: Temperature; AS: Average Sound)
Table 8: Feature Selection for training global occupancy models
Table 9: F-measure for global occupancy modeling
Table 10: Error of each classifier in the ensembling method
Table 11: Three influential factors on global occupancy modeling performance
Table 12: Percentage of area enclosed by actual occupancy probability that is within the confidence interval by modeled occupancy probability
Table 13: Coverage percentages of energy efficiency for all combinations
Table 14: Statistical analysis of coverage percentage differences between periods
Table 15: Influence of occupancy diversity on HVAC system energy efficiency for five building shapes and four types of simulations
Table 16: Program comparisons for coupling occupancy information with HVAC energy simulation
Table 17: Acceptable tolerances for monthly building energy simulation
Table 18: Influential parameters and their parameter ranges and default values for the building level energy simulation (blue-shaded parameters are for controlling the HVAC system under different load responses, brown-shaded parameters are estimated based on observable evidence, red-shaded parameters are found to be statistically insignificant in discrepancy analysis)
Table 19: Influential parameters and their parameter ranges and default values for the load response level energy simulation (blue-shaded parameters are for controlling the HVAC system under different load responses, brown-shaded parameters are estimated based on observable evidence, red-shaded parameters are found to be statistically insignificant in discrepancy analysis)
Table 20: Influential parameters and their parameter ranges and default values for the zone level energy simulation (blue-shaded parameters are for controlling the HVAC system under different load responses, brown-shaded parameters are estimated based on observable evidence, red-shaded parameters are found to be statistically insignificant in discrepancy analysis)
Table 21: The convergence curves for different combinations of preferred weights

ABSTRACT

Buildings account for 40% of energy consumption in the United States. In commercial buildings, more than 40% of energy is used by HVAC (Heating, Ventilation, and Air Conditioning) systems to maintain comfortable and healthy indoor thermal environments, making HVAC systems prime targets for improving energy efficiency. There is a significant difference between the energy consumed in buildings on the supply side and the energy actually required for heating and cooling on the demand side, which results in energy inefficiency. Occupancy is one of the most important factors affecting actual demands on HVAC systems; however, designed occupancy in commercial buildings rarely represents actual occupancy. To improve HVAC energy efficiency based on occupancy, this dissertation focuses on non-intrusive building occupancy modeling, including real-time occupancy (time-sequenced occupancy status changes) and long-term occupancy (typical-weekday/weekend presence/number probability). Ambient sensors are deployed, and the relationships between ambient implications and occupancy are explored to model occupancy. Global occupancy awareness is also improved for building-level applications: a model trained in one space is used in other geometrically similar spaces. Occupancy-loads relationships are then systematically investigated.
Heating/cooling loads are associated with the actual energy demands for HVAC systems. Loads represent heating/cooling requirements at the terminal level and determine the responses of HVAC system components at the system level. Despite the high volume of research on demand-driven HVAC system controls, it is still not clear how and when occupancy should be linked with heating and cooling controls for sustained and maximum energy efficiency. This is a complex problem because occupancy is stochastic in nature, heat is transferred and balanced among the zones of a building, and heat is gained and lost through the building's envelope. Since there is no systematic understanding of the relationships between occupancy and loads to achieve energy efficiency, this dissertation focuses on the types and ways of modeling the relationships that significantly influence HVAC system energy efficiency. Specifically, two relationships have been identified from two perspectives: occupancy transitions, which represent the switch between real-time occupied and unoccupied statuses, and occupancy diversity, which represents the difference in long-term occupancy. Energy efficiency, as defined in this dissertation, incorporates both the conditioning miss, which is the length of time a space is occupied while the temperature is outside the setpoint range, and energy reduction, which is the absolute amount of energy savings. Conditioning miss is considered equally important because it compromises the basic function of the HVAC system, which is to maintain a comfortable and desired thermal environment. Setpoint is specifically used as the medium to investigate the relationships between occupancy and heating/cooling loads, since setpoint actually controls the interactions among occupancy, thermal conditions, and HVAC system responses.
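The two components of energy efficiency defined above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the dissertation's implementation: the Sample record, the function names, the 15-minute sampling interval, the 21-24 degree setpoint range, and the energy figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One time step of zone data (hypothetical record layout)."""
    minutes: float      # duration covered by this sample
    occupied: bool      # True if the space is occupied during the sample
    temperature: float  # indoor temperature during the sample


def conditioning_miss(samples, low, high):
    """Total minutes in which the space is occupied while the
    temperature lies outside the setpoint range [low, high]."""
    return sum(
        s.minutes
        for s in samples
        if s.occupied and not (low <= s.temperature <= high)
    )


def energy_reduction(baseline_energy, controlled_energy):
    """Absolute energy savings relative to a baseline control."""
    return baseline_energy - controlled_energy


# Hypothetical one-hour trace sampled every 15 minutes.
samples = [
    Sample(15, True, 22.0),   # occupied, within range: no miss
    Sample(15, True, 25.5),   # occupied, too warm: miss
    Sample(15, False, 28.0),  # unoccupied: never counts as a miss
    Sample(15, True, 19.0),   # occupied, too cold: miss
]

miss = conditioning_miss(samples, low=21.0, high=24.0)
saved = energy_reduction(baseline_energy=120.0, controlled_energy=95.0)
print(miss, saved)
```

Keeping the two quantities separate reflects the definition above: a control strategy that saves energy but increases conditioning miss is not treated as more efficient on savings alone.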
Building energy simulation, the virtual representation and reproduction of energy processes for either an entire building or a specific space, is a more feasible and reliable venue for these investigations than field experiments, and it is used to implement and validate the methods for modeling occupancy-loads relationships. It is widely accepted that a building energy model must be well calibrated before any simulation, but there is no generally adopted calibration methodology in either academia or industry that achieves high energy simulation accuracy at multiple levels. This dissertation introduces a novel multi-level calibration framework to calibrate a building energy model at multiple levels (e.g., at the building level, the energy conservation measure level, and the zone level) simultaneously. A classification schema is designed to classify all of the input parameters into hierarchical categories for analyzing and determining the values of the parameters. The proposed calibration framework does not need retraining when changes are made to building systems and conservation measures; it also avoids the trial-and-error process, which requires significant time, effort, and expertise. A comparative study of different energy simulation programs is also conducted, based on which the EnergyPlus simulation program is chosen for the validation of the calibration framework.

ACKNOWLEDGMENTS

Five years of Ph.D. research life have been a fast-paced, amazing, and fruitful journey. Many people have made invaluable contributions to this dissertation. I could never have made it without the kind help and strong support of professors, colleagues, and friends. I am blessed to have had the opportunity to work with these great people throughout the journey. My deepest gratitude goes first to my dear advisor, Dr. Burcin Becerik-Gerber.
She taught me how to do scientific research when I was a novice in academia, guided me to explore the unknown world of human-building interactions, and encouraged me to challenge existing research and discover the truth. She is patient, energetic, and knowledgeable. To me, Dr. Becerik is not only an academic adviser but also a family member who corrected my mistakes and shaped my character as a good scholar. She helped me overcome every obstacle and backed me up as I moved towards completion and success. What I learned from her will be an inestimable asset in my life. I would like to express my most sincere gratitude to her. I am also deeply grateful to Dr. Lucio Soibelman, Professor and Chair of the Civil and Environmental Engineering department. Thanks to his continuous support, inspiration, and encouragement, I could keep pace with the rapid evolution of research that accompanies advanced information technologies. His profound and immense knowledge materially widened my vision. Dr. Soibelman is one of the best researchers, teachers and leaders that I have ever met. Thanks to his high standards and strict requirements for Ph.D. students, I could grow into an independent and confident researcher. I would like to acknowledge Dr. Michael Orosz, Dr. Jin Yan, Dr. Viktor Prasanna, and Dr. George Ban-Weiss, who served on my qualifying exam and defense committees. They gave me unconditional support and valuable instructions to enrich my research, provided significant and well-timed engineering guidance, and helped me work out my problems. My sincere thanks must go to my beloved parents, wife, family, and friends for backing me up. They gave me every convenience I needed to complete my research, and stayed with me through these busy years. I sincerely appreciate their understanding of and trust in every decision I made.
Lastly, I would like to acknowledge the financial support from the Viterbi School of Engineering at the University of Southern California and the Chinese Scholarship Council, and the research funding from the Department of Energy and the National Science Foundation. I also appreciate the efforts of all the colleagues, professors, and staff in the department to build the big CEE family full of love. Thank you for letting me feel the warmth of home although I am thousands of miles away abroad.

Chapter 1: Motivation

People spend more than 90% of their time indoors [1]. According to statistics from the International Energy Agency (IEA), buildings accounted for one-third of the globe's total energy consumption in 2013 [2]. In the United States, approximately 40% of energy consumption is attributed to the 120 million buildings [3], with 45.8% of that consumed by commercial buildings [4], or approximately 18.3% of the nation's total energy consumption. Only 4% of the energy sources for the commercial buildings sector are renewable [5], and as much as 90% of the environmental impacts caused by commercial buildings (e.g. CO2 emissions) come from energy use [6]. In addition to the steady growth of new buildings through 2035 (an increase in energy consumption of 15.7% compared to 2013) [4], existing buildings typically have a lifespan of 50-100 years over which they continually consume energy [7, 8]. It is expected that total commercial sector energy consumption will experience a rapid growth rate of more than 4% until 2035 [9], much faster than the increase in the transportation and industrial sectors in the United States. Sustainability and energy conservation have become increasingly important topics, as nearly 30%-50% of the energy consumption in buildings is wasted [10]. Reducing building energy consumption can lead to decreased environmental pollution, reduced energy expenses, and improved economic security [11].
Given the continuing trend of high energy intensity per square foot that the U.S. commercial building sector has been experiencing for decades [12], this dissertation has focused on commercial buildings to tackle both the challenges and opportunities for energy reduction. In commercial buildings, energy is consumed to provide services such as air conditioning, ventilation, lighting, and power, and more than 80% of it is used during the post-occupancy period [13] to maintain indoor environments and provide building-based functions. More than 40% of this energy is consumed by HVAC (Heating, Ventilation, and Air Conditioning) systems [4], which keep a comfortable and healthy thermal condition in the built environment [14] by responding to the loads imposed by the outside weather, envelope characteristics, lighting design, appliance/equipment usage and occupant activities through heating, ventilation and air conditioning [15]. An HVAC system supplies a building with heated air, cooled air, and fresh air when heating, cooling, and ventilation are required, respectively. In general, heating accounts for approximately 16.0% of energy consumption in the commercial buildings sector, more than cooling (14%) and ventilation (9%) [4]. However, it is estimated that 90% of building HVAC systems are inefficient [16], resulting in 700M unnecessary energy cost per year. Most traditional HVAC system operations assume that the demand for heating, cooling and ventilation is based on the maximum design occupancy of a building during its entire operational hours, and use only temperature (dry bulb or wet bulb) as the indicator of the environment to adjust HVAC system operations, without analyzing the actual demands of the space. This results in significant HVAC system energy inefficiency [17] through overcooling or overheating of vacant spaces.
Accordingly, there is great potential for HVAC system energy efficiency in existing buildings by minimizing the difference between the energy actually consumed and the energy required to satisfy heating/cooling requirements. Researchers have found that an average of 38% of energy could be saved if HVAC systems adopted more advanced responses to loads [18, 19]. The data shown in Table 1 indicate the potential energy saving ranges for commercial building HVAC systems in different cities (representing different geographical and climate zones). Moreover, since buildings account for 40% of carbon dioxide (CO2) emissions, over 3 million metric tons of CO2 could potentially be saved annually if half of the energy consumption in the commercial building sector could be eliminated [20].

Table 1: Potential energy reduction ranges for HVAC systems in commercial buildings by implementing energy conservation measures in different cities

City             Energy Reduction    City           Energy Reduction
Seattle          28%-48%             Las Vegas      41%-43%
Helena           30%-44%             Albuquerque    40%-48%
Duluth           30%-42%             Denver         36%-49%
Chicago          32%-41%             Minneapolis    34%-43%
San Francisco    43%-67%             Houston        42%-49%
Los Angeles      55%-64%             Baltimore      39%-46%
Fairbanks        32%-39%             Atlanta        38%-43%
Phoenix          39%-44%             Miami          38%-51%

Although the influences of buildings’ physical characteristics and thermal properties on HVAC energy efficiency are significant, their effects have been weakened because regulations and policies worldwide are gradually being enforced and continuously updated to improve building energy performance [21]. In addition, installing new technologies and systems in existing buildings to save energy might not always be feasible and might even consume more energy. A more effective and reliable way of improving HVAC energy efficiency is to control the HVAC system based on actual demands [22-26].
Demand-driven controls operate HVAC systems based on the actual space loads, which are usually calculated from the amount of heat that needs to be delivered into or out of the building to keep the temperature as desired [27-29]. As shown in Table 2, the European Standard EN15232 categorizes building controls into four levels [30, 31]. The first two classes of controls (Class A and Class B) are based on actual demands, the third (Class C) is the standard control level provided by a standard building automation and control system in current practice, and the fourth (Class D) represents inefficient control.

Table 2: Four levels of building controls under EN15232 standard

Class A
  Heating/Cooling control: Individual room control with communication and requirements; Pressure controlled pumps
  Air conditioning: Automatic airflow control; Air supply control with load-dependent reference value
  Illumination: Automatic lighting with light level control; Automatic lighting with presence detection
  Shading: Combined control of blinds and temperature control
Class B
  Heating/Cooling control: Individual room control with communication; Stage controlled pumps
  Air conditioning: Multi-level fan control; Air supply control with weather-compensated reference value
  Illumination: Automatic lighting with light level control; Automatic lighting with presence detection
  Shading: Automatic control of blinds
Class C
  Heating/Cooling control: Individual room control with thermostatic valve or electronic regulator; On-off controlled pumps
  Air conditioning: On-off fan control; Air supply control with constant reference value
  Illumination: Manual light control; Manual switch with central “off” signal
  Shading: Manual operation of motor-driven blinds
Class D
  Heating/Cooling control: No individual room control; Pumps are not regulated
  Air conditioning: No airflow control; No air supply control
  Illumination: Manual light control with manual switch
  Shading: Manual operation of blinds

To be more explicit, the energy efficiency performance of Classes A, B and D for different levels of building controls and different building types is expressed as Efficiency Factors relative to the energy
consumption of Class C (the ratios between the energy consumption of a certain class and that of Class C) in Table 3.

Table 3: Efficiency factors for different levels of building controls under EN 15232

Efficiency Factors (Gas)
Building Types      D       C       B       A
Offices             1.51    1.00    0.80    0.70
Lecture Rooms       1.24    1.00    0.75    0.50
School Buildings    1.20    1.00    0.88    0.80

Efficiency Factors (Electricity)
Building Types      D       C       B       A
Offices             1.10    1.00    0.93    0.87
Lecture Rooms       1.06    1.00    0.94    0.89
School Buildings    1.07    1.00    0.93    0.86

The EN results demonstrate that demand-driven HVAC control is a convincing avenue for significantly improving building system control efficiency and reducing energy consumption by using energy only in the amount and at the time it is required [32]. This motivates the dissertation to investigate, deeply and systematically, the actual demands on building HVAC systems in order to reduce energy consumption while maintaining a comfortable and healthy environment.

Chapter 2: Background and Introduction

It has been widely accepted that building occupants are one of the most important factors that impact actual demands for active heating and cooling [33-37] and that they contribute significantly to HVAC system energy consumption [33,38-40]. However, instead of basing HVAC control on actual occupancy, most current HVAC systems are operated under the assumption that buildings are occupied for a fixed period of time during the day, for example, from 8:00 to 20:00. It has been demonstrated that this assumption deviates considerably from actual building occupancy, as spaces of a building are often vacant during part of the day or only used discontinuously [37, 41-45]. Therefore, matching HVAC heating/cooling control to actual occupancy is a necessary avenue for demand-driven response that improves energy efficiency without sacrificing comfort or HVAC system functionality.
In general, the importance of occupants in a building’s HVAC system energy efficiency can be broken down into two categories (Figure 1): 1) occupancy, i.e. how occupants occupy the building, and 2) actions, i.e. how occupants behave in the building [46-48]. The first category includes occupant presence (when they occupy the building) and occupant number (how many occupants there are). Occupancy results in heat gains due to occupants’ metabolisms and activities, and is associated with the use of other building systems (e.g. the lighting system) and appliances (e.g. computers), which generate and transfer heat to the environment as well [49,50]. Occupancy is closely related to thermal preferences and schedules, and it also determines the active heating and cooling periods and the consequences of running the HVAC system, which is called the conditioning effect in this dissertation. For example, when a space is occupied, the HVAC system is required to work and maintain static and desirable thermal conditions. The second category covers occupants’ actions on building elements such as blinds, shadings, windows, and doors, which impact the net heat gain and alter the heating/cooling loads of a space. For example, occupants operate windows and doors to change indoor thermal conditions and bring in fresh cold air [36]. Although occupants may have different permissions and means to control building elements in different situations, the fact that occupant actions influence the loads remains the same.

[Figure 1: Importance of occupant to building HVAC system energy efficiency. Diagram labels: Occupant, Action, Occupancy, Number, Presence, Conditioning Effect, Effective Loads, Ineffective Loads, Loads, Interior Flux, Exterior Flux, Space Thermal Process and Heat Balance, Heat Gain, HVAC Terminal, HVAC Secondary, HVAC Primary, HVAC System, Setpoint]

Occupancy (the presence and number of occupants in a zone) is an objective condition that does not depend on occupants' subjective initiative.
It directly determines the active heating/cooling periods and effects on the demand side, and acts as the basis for occupant actions. This dissertation has therefore focused on the improvement of building occupancy modeling, in which occupancy for a specific space is further divided into real-time occupancy, representing the instant occupancy status, and long-term occupancy, representing the occupant presence pattern. More importantly, occupancy has the potential to differentiate the actual demands from all the loads resulting from the space thermal process and heat balance [51-53]. The majority of the energy consumed by an HVAC system is used to meet the heating/cooling loads added by interior and exterior heat flux. Besides the heat generated within the space, there are significant loads from adjacent spaces and the outside environment. No matter how good the construction is, energy is conducted through the envelope if a temperature difference exists between interior and exterior or between adjacent spaces. There is also infiltration at the intersections of surfaces such as windows and walls, convection between air and surfaces, and radiation through translucent surfaces. It is important to note that, regardless of the amount of loads, whether they count as actual demands for HVAC system heating/cooling depends on occupancy. In this dissertation, occupant presence is applied to divide the loads into two parts: ineffective loads and effective loads. When a zone is occupied, all loads are effective and the HVAC system works to maintain static and desirable thermal comfort and air quality. If a zone is unoccupied, some of the loads are ineffective and the HVAC system is not required to deliver the full loads.
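The division of loads by occupant presence can be illustrated with a minimal Python sketch. It is not the dissertation's model: the function name `split_loads` and the parameter `unoccupied_effective_fraction` are hypothetical, and the share of load that stays effective in an unoccupied zone is an assumption chosen only for illustration, since the text says only that "some" of the loads become ineffective.

```python
from typing import List, Sequence, Tuple


def split_loads(
    loads_kw: Sequence[float],
    occupied: Sequence[bool],
    unoccupied_effective_fraction: float = 0.0,
) -> Tuple[List[float], List[float]]:
    """Split a load time series into effective and ineffective parts.

    loads_kw: heating/cooling load at each time step (kW).
    occupied: occupant presence at each time step.
    unoccupied_effective_fraction: ASSUMED share of the load that still
        has to be met while the zone is vacant (hypothetical parameter).
    """
    if len(loads_kw) != len(occupied):
        raise ValueError("loads_kw and occupied must have the same length")
    effective, ineffective = [], []
    for load, is_occupied in zip(loads_kw, occupied):
        # Occupied: the whole load is effective. Vacant: only a share is.
        share = 1.0 if is_occupied else unoccupied_effective_fraction
        effective.append(load * share)
        ineffective.append(load * (1.0 - share))
    return effective, ineffective
```

For instance, `split_loads([2.0, 4.0], [True, False], 0.25)` treats the full 2.0 kW as effective while the zone is occupied, and only a quarter of the 4.0 kW once it is vacant.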
Since the effects of occupant number on heating/cooling loads can be described as a time-variant coefficient applied to the effects of occupant presence, and the operation of HVAC terminals is triggered as a response to temperature changes no matter how many occupants are in a space, a multi-occupancy space is similar to a single-occupancy space in terms of system identification. Therefore, the second part of this dissertation explores the relationships between occupant presence and heating/cooling loads on the demand side, in order to understand the actual demands for HVAC system heating/cooling and the implications of different presence characteristics for energy efficiency. In order to evaluate and validate the energy significance of different occupancy-loads relationships, the loads based on occupancy and the energy efficiency under different occupancy-driven controls should be estimated and compared with conventional scenarios. Simulation and field experiments are the two avenues commonly applied. Simulation, the virtual representation and reproduction of energy processes for either an entire building or a specific space, is used for integrating heat and mass transfer, environmental data, and load-HVAC interaction processes, as well as for generating periodical energy performance estimates for building systems [54-56]. Field experiments bear the difficulty of controlling the various factors that may impact the experiment, such as outdoor temperature, because even a slight difference in some parameters (e.g. annual average outdoor temperature) can result in remarkable deviations in the energy efficiency outcomes. Compared to field experiments, simulation has several advantages: 1) simulation can evaluate the performance of an implementation when a field experiment is infeasible; 2) simulation enables the investigation of various alternative controls before they are implemented; 3) simulation has a computational advantage and is much less expensive and less time consuming; 4) simulation can be reversed for adjustment after being implemented; 5) simulation can control factors that cannot be controlled in a field experiment (e.g., outside weather conditions); 6) simulation can isolate and control one parameter, such as setpoint, and evaluate the direct consequences; 7) simulation is non-intrusive to both buildings and building occupants; 8) simulation can output different levels of indicators for evaluation that are hard to meter in field experiments; 9) simulation makes it easier for analysts to interpret results. However, only if a simulation model can generate outcomes that closely match the measured energy data of a building can it be reliable and representative enough to accurately estimate the loads and energy efficiency that result from incorporating occupancy into HVAC system heating and cooling. In general, building level energy simulation describes the overall energy performance of a building or a set of building systems. System control level energy simulation indicates the consequence and effect of running a specific occupancy-based control for energy efficiency. A zone or a room is the basic unit for heat balance and load calculations. Zoning segments the building into spaces that can be individually controlled and adjusted. Zone level energy simulation is closely related to occupant comfort and building system functionality, and it determines the demands on the system.
Since accuracy at all of these levels is important, energy calibration should not be quantified at a single level only; it should achieve equivalent performance at all three levels at least [57, 58]. In addition, to estimate the potential energy implications when different occupancy-loads relationships are evaluated, the model has to be robust to the changes that result from the building being operated differently [59]. Therefore, the third part of this dissertation develops a novel calibration framework for preparing reliable and representative building energy models that support the investigation of occupancy-loads relationships for HVAC system energy efficiency. The remaining chapters of the dissertation are organized as follows: Chapter 3 introduces the scope and definitions of building occupancy modeling, occupancy-loads relationships, and multi-level energy simulation calibration; Chapter 4 presents the literature review and identifies the gaps in existing research; Chapter 5 discusses the objectives and research questions; Chapter 6 describes the testbed building, reference buildings, and virtual reference buildings used in this dissertation for implementation and validation; Chapters 7-9 present the proposed analysis and validation for building occupancy modeling, occupancy-loads relationships, and multi-level energy simulation calibration.

Chapter 3: Scope and Definitions

There are a large number of building types in the commercial building sector. This dissertation has focused on office buildings, as they are the largest energy consumers and have the largest total floor area among all commercial building types [4]. Approximately 40% of office spaces are served by centrally controlled HVAC systems [60, 61].
Since the centrally controlled HVAC system is the least efficient type, and it is estimated that an average of 30% (up to 56%) of its energy can potentially be saved [32,62], this dissertation has chosen centrally controlled HVAC systems to validate the investigations of the relationships between occupancy and loads for energy efficiency, in which loads are quantitative measures describing the demands on HVAC systems to maintain thermal conditions in buildings. Since this dissertation analyzes loads on the demand side at the building level, the type of HVAC system actually installed in the building matters little. This is because demand is determined by the physical and thermal conditions of the building and prescribes the outcome of running the HVAC system. It is different from the energy consumed on the supply side, which is determined by the coordination of all control parameters (e.g. fan speed and damper angle) to meet the loads on the demand side, and which is closely related to the specific HVAC system type and structure. Considering that HVAC systems in existing buildings cannot easily be upgraded or changed, load optimization on the demand side minimizes the actual requirements for heating/cooling and is a much more practical and convenient way to improve energy efficiency than system parameter control. In a building controlled by an HVAC system, a zone is defined as an individual room or a set of adjacent rooms served by the same HVAC terminal (e.g. a Variable Air Volume box). A terminal is the primary control unit that manages load transfer and heat balance among HVAC response, indoor thermal conditions and occupancy. Terminals exist in all types of HVAC systems. Based on a thermostat, a terminal responds to heating/cooling loads on the demand side in its specific space and determines the loads for the HVAC system on the supply side at the system level, by controlling the supply air temperature and the airflow entering the space.
There are four commonly used terminal types in centrally controlled HVAC systems: the VAV (variable air volume) terminal, the CAV (constant air volume) terminal, the DD (dual duct) terminal, and the MZ (multi zone) terminal [14, 63-65]. A VAV terminal supplies air at a constant temperature and meets changing loads by varying the supply airflow rate through different damper positions. A CAV terminal supplies air at a constant flow rate and varies the temperature to meet the thermal loads of a space. A DD terminal mixes heated and cooled air right before it reaches each zone, while an MZ terminal mixes heated and cooled air for each zone at the air handler. By conditioned floor space in office buildings, central VAV systems (23.5%) account for 58.8% of centrally controlled HVAC systems and are thus chosen for validation. However, since this dissertation does not aim to provide specific control solutions for a specific building but explores a generalizable framework for analyzing and modeling the relationships between occupancy and heating/cooling loads, the proposed methodologies are applicable to other types of terminals and other kinds of HVAC systems, such as packaged and individual HVAC systems. Although the control mechanisms might be different, the underlying principle is the same: optimizing the amount and schedule of heating/cooling loads on the demand side based on occupancy for energy efficiency. As of 2011, 90% of the spaces in office buildings in the U.S. were equipped with thermostats [66], which provide the possibility of terminal level control over the setpoint of individual zones. Setpoint is the terminal level temperature setting of an HVAC system on the demand side. A setpoint regulates the desired temperature range and is the primary parameter in HVAC systems for controlling the interactions among HVAC response, indoor thermal conditions and occupancy.
Since a typical HVAC system responds to heating/cooling loads through the control of setpoints, setpoint is used in this dissertation as the medium to investigate the occupancy-loads relationships on the demand side. In order to incorporate occupancy with heating/cooling loads and improve energy efficiency, a setback is programmed in thermostats as a threshold that keeps the temperature from passing a certain degree, while a float allows the setpoint to change autonomously. To be clear, the function of ventilation is not the focus of this dissertation because, in spaces dominated by active conditioning, ventilation either comes along with heating and cooling or only provides the minimum airflow per the ASHRAE (American Society of Heating, Refrigerating and Air Conditioning Engineers) requirement [67]. In the context of this dissertation, a time point is a specific time in one day, while a time step represents a certain period of time. Time interval is defined as the granularity of data collection, which is set to every three minutes due to limitations related to the hardware and the database-BAS communication. The granularity of the data collection is not related to the energy simulation time step, which is the period of time over which the thermal differential equations for heat balance are solved. During each simulation time step, the simulation model lists all heat excitations in the space, estimates the HVAC system loads needed to remove/add the heat, converges the response functions, and calculates the amount of energy required to maintain the setpoint. An ambient environment describes the temporal contextual environmental and physical conditions of the space; it has several dimensions including lighting, CO2, sound, and so on, while an ambient factor is one dimension of the ambient environment (e.g., temperature). Variables extracted from ambient factors are called contextual information.
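The definitions of setpoint and time interval above are enough to sketch how conditioning miss, described in the abstract as the length of time a space is occupied while its temperature is outside the setpoint range, might be tallied from logged data, alongside energy reduction as an absolute saving. The record layout `Reading`, the function names, and the example temperatures are illustrative assumptions; only the three-minute interval comes from the text.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Reading:
    """One logged sample for a zone (hypothetical record layout)."""
    occupied: bool  # real-time occupancy status at this time point
    temp_c: float   # measured zone temperature, degrees Celsius


def conditioning_miss_minutes(
    readings: Sequence[Reading],
    low_c: float,
    high_c: float,
    interval_min: float = 3.0,  # data-collection interval from the text
) -> float:
    """Minutes during which the zone was occupied but its temperature
    was outside the setpoint range [low_c, high_c]."""
    missed = sum(
        1 for r in readings
        if r.occupied and not (low_c <= r.temp_c <= high_c)
    )
    return missed * interval_min


def energy_reduction_kwh(baseline_kwh: float, controlled_kwh: float) -> float:
    """Absolute energy saving of a control strategy against a baseline."""
    return baseline_kwh - controlled_kwh


# Four samples: two are occupied and outside the assumed 21-24 C range.
log = [Reading(True, 22.0), Reading(True, 25.5),
       Reading(False, 28.0), Reading(True, 19.0)]
miss = conditioning_miss_minutes(log, low_c=21.0, high_c=24.0)
```

The vacant sample at 28.0 C does not count toward the miss, which reflects the definition: an out-of-range temperature only matters while the space is occupied.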
Ambient sensors are devices for monitoring the ambient factors; they are deployed inside a space to collect periodical measurements and forward them to the server. Occupancy is defined as the time-sequenced occupancy changes of a specific space, rather than of a specific occupant, and represents how the space is occupied. It includes occupant number (how many occupants there are) and occupant presence (when they occupy the space), such as arrival time, duration of stay, and departure time. Occupant presence for a specific space is further divided into real-time occupancy and long-term occupancy (the typical presence pattern). Real-time occupancy represents the instant occupancy status at each time point; it may vary as time passes and depicts the one-time occurrences of occupant presence/absence changes. Occupancy detection is defined as detecting real-time occupancy, which takes one of two values, "occupied" or "unoccupied", while occupancy estimation is defined as the estimation of the specific number of occupants at each time point. A presence sample is defined as the actual time-series presence of an occupant for one day, and the presence population is the collection of all presence samples during a predefined period. Zone level real-time occupancy is obtained by aggregating the occupancy status of each room over time. Long-term occupancy, also called a personalized occupancy profile, indicates the typical weekday/weekend presence probability as a function of time. An occupancy profile is for a specific space and represents the space’s long-term habitual presence patterns (occupancy patterns) for a given day of the week. Personalized occupancy profiles can be used to group occupants with similar patterns, find typical occupancy patterns, and design terminal start/stop schedules. Zone level long-term occupancy is formulated by comparing the room level profiles at each time point and taking the highest probability among them as the probability for the zone.
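The two zone level aggregations just defined can be sketched in a few lines of Python. The max-over-rooms rule for long-term profiles follows the text directly; treating a zone as occupied in real time whenever any of its rooms is occupied is an assumed reading of "aggregating", and both function names are hypothetical.

```python
from typing import List, Sequence


def zone_realtime_occupancy(
    room_statuses: Sequence[Sequence[bool]],
) -> List[bool]:
    """Zone status per time point from per-room statuses.

    ASSUMPTION: the zone counts as occupied when any room served by the
    same terminal is occupied at that time point.
    """
    return [any(step) for step in zip(*room_statuses)]


def zone_longterm_profile(
    room_profiles: Sequence[Sequence[float]],
) -> List[float]:
    """Zone presence probability per time point: the highest probability
    among the room level profiles, as described in the text."""
    return [max(step) for step in zip(*room_profiles)]


# Two rooms sharing one terminal, three time points each.
rooms_now = [[False, True, False],
             [False, False, False]]
rooms_profile = [[0.1, 0.8, 0.3],
                 [0.4, 0.2, 0.3]]

zone_now = zone_realtime_occupancy(rooms_now)
zone_profile = zone_longterm_profile(rooms_profile)
```

Taking the maximum is the conservative choice for scheduling: the terminal is treated as likely needed whenever any one of its rooms is likely to be occupied.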
One zone/room may have more than one profile for different days of the week. In the process of modeling long-term occupancy, an occupancy group is the category into which presence samples are grouped based on their similarities. Raw occupancy status is the output of the occupancy detection model, while expected occupancy status is the status with the highest probability calculated from certain ambient conditions and historical occupancy data. Occupancy prediction forecasts future occupancy status based on current and previous statuses. Occupancy observation models occupancy from large-scale surveys or observations, while designed occupancy is the occupancy estimated by analyzing the occupancy in similar or prototype buildings, as well as in standards such as ASHRAE compliance. It should be emphasized that in this dissertation occupancy means the occupancy of a specific space regardless of building type; it involves nothing related to the identity of specific occupants and no inferences about the occupants’ genders, ages, or professions.

Chapter 4: Literature Review and Research Gaps

Generally, occupancy modeling includes both real-time occupancy modeling and long-term occupancy modeling, both of which are important for building related applications. For example, real-time occupancy can improve lighting control, security monitoring, aid for the disabled, and emergency evacuation and rescue, while long-term occupancy can help with space management, event planning, retrofit design and behavior study [68-71]. Since detailed information on building occupancy is often difficult to observe readily or assume accurately, extensive research has been carried out to monitor and model actual building occupancy [37, 72-76].
Also, since occupancy is crucial in determining the demands for HVAC heating and cooling, identifying and modeling the relationships between occupancy and effective heating/cooling loads are essential for controlling the HVAC system based on actual demands, using the precise amount of energy only when it is required, which can reduce energy consumption without sacrificing comfort or building functionality. In addition, building energy simulation is a more feasible and representative avenue for investigating the occupancy-loads relationships. However, there are hundreds of simulation programs to select from but no generally accepted way of calibrating a building energy model at multiple levels simultaneously. This chapter thoroughly reviews related work and analyzes the gaps in current research and practice to support the significance of the research objectives and questions presented later in this dissertation.

4.1. Building Occupancy Modeling

4.1.1. Real-time Occupancy Modeling (Detection and Estimation)

Considering that occupant presence and occupant number are both important components of occupancy, real-time occupancy modeling is divided into two categories based on the level of precision, namely occupancy detection (binary detection of occupants’ presence) and occupancy estimation (counting the number of occupants). Several techniques have been developed to model real-time occupancy. Since occupants are always accompanied by indoor energy use, energy consumption data have been analyzed to correlate end use with occupancy [77, 78]. However, monitoring all the end uses in a commercial building is not feasible, and decomposing space loads into device loads, through techniques such as NILM (non-intrusive load monitoring) and ILM (intrusive load monitoring), requires a significant amount of time and effort for training.
Radar-based techniques have also been widely used to model occupants by sending waves, receiving the reflections and analyzing the difference [79-81]. For example, a microwave motion detector responds to microwave electromagnetic signals reflected from occupants, and an ultrasonic detector applies ultrasonic waves. However, radar-based sensing is easily influenced by surrounding interference and does not work at short distances. Proximity-based sensing is another avenue for occupancy modeling; it creates a proximity field and links changes in the field with changes in occupancy [82-85]. For example, capacitive detectors rely on the coupling between human body capacitance and its surroundings, and triboelectric detectors are capable of detecting electric field changes caused by occupants. However, proximity-based sensing requires a great amount of information about the target space. The change of the proximity field is limited to a small range and is significantly affected by the electromagnetic fields generated by electric power equipment and lighting, so the sensors have to be deployed at every potential place where occupants might stay. More importantly, a proximity field might be harmful to health: it generates electric and magnetic fields, and occupants should keep as far as possible from the proximity field source according to the National Institute of Environmental Health Sciences of the National Institutes of Health. Researchers have also proposed various video-based systems [72-75], which estimate the occupancy of a monitored space by using image-processing techniques. These video-based solutions are sensitive to sunlight and generally suffer from the requirement for a line of sight, which compromises their usability, especially in heavily partitioned spaces. In addition, the use of video cameras usually causes privacy concerns for occupants.
Optoelectronic methods, which detect light from occupants against the surroundings using optical contrast, include visible motion detectors, passive infrared detectors, laser system detectors, and photoelectric detectors [86]. The drawback of optoelectronic methods is that only moving occupants can be detected. Height sensors and depth sensors were sometimes also applied to detect and distinguish between individuals based on differences in their heights and motions [87]. However, these sensors are not widely deployed in buildings and only work over small distances.

Besides, existing information technology infrastructure has been considered for modeling building occupancy: MAC and IP addresses, as well as mouse and keyboard activities, were monitored for occupancy estimation [88]. Accuracies of 80% at the building level and 40% at the floor level were reported in field tests in two buildings. A major limitation of the system is the assumption that occupants are continuously connected to the Internet, which may not be applicable in the real world; moreover, occupants who do not use a computer cannot be detected.

Ambient sensors are widely deployed in commercial buildings to monitor the temporal contextual environment [89-95]. For example, with the implementation of Title 24's 2013 building codes [96], which have been in effect in California since July 2014, all new buildings must have the essential sensing system installed. Advances in ambient sensing, including non-intrusiveness, wireless communication, low power requirements, easy installation, low cost, off-the-shelf availability, and high precision, enable the wide deployment of ambient sensors in commercial buildings and continuous access to data for modeling real-time occupancy [97]. However, a single ambient factor generally cannot reliably model occupancy, because a single sensor has false positive/negative issues and is greatly influenced by the surrounding environment and noise.
Occupancy detectors, comprising a PIR sensor and an ultrasonic sensor, were most commonly applied to detect moving occupants. However, they can only detect the presence or absence of occupants and fail when occupants stay stationary [80, 98]. Light sensors were also used to reflect occupancy [99]; however, occupants may keep the artificial lights on when the space is not occupied, and the readings of light sensors are easily influenced by natural light. Door switch sensors were used to complement other sensors [100], but their status is determined by both occupant presence and shelters close to the door. [75] Since temperature sensors are closely related to occupant comfort [101,102], their readings of thermal conditions reflect the joint effects of occupancy and HVAC conditioning. CO2 sensors have also been widely used for this purpose [103-105], as higher numbers of occupants in a space usually result in higher CO2 concentrations. However, it takes time for the CO2 concentration to build up (10-20 minutes), and the CO2 concentration is affected not only by occupancy but also by other factors such as the status of windows and doors, passive ventilation, and sensor placement.

Multiple sensors offer noise immunity and reliability, and can accommodate different kinds of uncertainties. The synergistic use of ambient sensors could capture any possible interaction between occupants and the environment. Therefore, to overcome the limitations of single ambient sensors, combinations of various ambient sensors have also been tested. Sensor networks, physical configurations comprising a suite of sensor nodes, are becoming more and more common in commercial buildings to periodically measure the ambient environment with occupant intervention [106] and capture any interaction between occupants and the environment. Agarwal et al.
[107] used a magnetic reed switch door sensor and a passive infrared (PIR) sensor for occupancy detection, and reported accurate occupancy status most of the time. However, their occupancy modeling algorithm was designed only for single-occupancy offices, and it was built on the assumption that occupants always keep their doors open when they are in their offices or somewhere nearby. The performance depended highly on the precision and range of the PIR sensor. Meyn et al. [108] used measurements from cameras, PIR sensors, and CO2 sensors, as well as historical building utilization data, to estimate the number of occupants at the building level. The reported accuracy was 89%. Limitations included that the system was not able to estimate the number of occupants at the room level, and that the error tended to accrue over time. Henze et al. [109] proposed an occupancy detection system that was built on three PIR sensors and one telephone sensor for each room. The system relied on a belief network algorithm and detected whether any occupant was present, with an accuracy of 76%. Hailemariam et al. [110] built an occupancy detection system that used light sensors, motion sensors, CO2 sensors, and sound sensors. A decision tree algorithm was used to estimate the occupancy of cubicles in an office. An accuracy of 98.4% was achieved using the motion sensor alone, and a decline in accuracy was reported when other sensors were integrated. Dong et al. [111] proposed a system that estimated the occupancy of a space by sensing CO2 concentration, acoustics, and motion in the space. Field tests were carried out in two rooms with three methods: a support vector machine, an artificial neural network, and a hidden Markov model. All three methods yielded an accuracy of up to 75% in detecting occupant numbers in spaces that had up to four occupants. The authors mentioned that the reported accuracy could be further improved by adding more ambient sensors. Hutchins et al.
[112] proposed an approach that could recover missing or corrupted sensor data in occupancy estimation. Their approach consisted of an inhomogeneous Poisson process and a hidden Markov process. The system was not validated with field tests and was only able to provide results at the building level. Lam et al. combined sensors for CO2 concentration, acoustics, illumination, temperature, humidity, and motion [106]. However, the selection of sensor types and algorithms was arbitrary and based on availability.

Although different combinations of ambient sensors were tested in different studies, few arguments were provided with regard to why certain sensor combinations were selected, and what impact the selection of sensors had on the performance of the occupancy models. Another influential factor that determines modeling accuracy is the selection of algorithms representing the relationship between ambient factors and occupancy, which also deserves further examination. Besides, the occupancy models reported in previous research were mostly used to detect and estimate occupancy in the same spaces where the models were trained. Nevertheless, collecting precise and continuous real-time occupancy data for each individual space for training can be time-consuming and intrusive. In a considerable number of cases it is unrealistic to have access to occupancy ground truth for all spaces, making it impossible to retrain the models [113-116]. In other words, previous research on occupancy modeling focused only on local modeling, which bears a severe limitation: occupancy ground truth must be collected, and occupancy models must be tuned, for each space where real-time occupancy detection/estimation is needed.

To summarize, current research has gaps in the following ways:

1. Few arguments were provided with regard to why certain algorithms were selected to model occupancy.
Improving real-time occupancy modeling means not only analyzing the patterns of ambient factors to detect/estimate occupancy, but also understanding what the actual relationships between real-time occupancy and ambient factors are.

2. The performance of real-time occupancy models depends greatly on the selection of sensors. Different combinations of ambient sensors were tested in different studies without explaining why such sensor combinations were selected, and what impact the selection of sensors had on the performance of the occupancy models. Although ambient sensing was applied to model real-time occupancy, the primary and joint effects of different ambient factors in reflecting occupancy changes have not been systematically studied.

3. It has not been studied whether the relationships are space-specific and how they can be transferred to geometrically similar spaces. In most cases, it is unrealistic to gather occupancy ground truth data for all spaces. This limitation calls for global occupancy modeling, where occupancy models are trained in one space and are applicable to other spaces where training data collection is difficult or not possible.

4.1.2. Long-term Occupancy Modeling (Profiling)

Long-term occupancy is another important component of building occupancy modeling since it reflects the typical occupant presence patterns. The most practical method to model long-term occupancy is using fixed design profiles (so-called schedules, diversity profiles, or diversity factors) [117-119] that are defined by organizations such as ASHRAE, or the typical occupancy from prototype and similar buildings. The same profiles are used for spaces with similar types and functions. These methods are based on experience and on analyses of large-scale occupant surveys and/or observations from a number of similar buildings, which are not only labor- and time-intensive to complete, but also inaccurate in reflecting the stochastic variations of occupancy in time and space [12,120].
For example, a profile for an office building assumes there are different periods of use (e.g., occupied from 8:00 to 18:00 on weekdays and not occupied at night and on weekends), but the periods during which spaces are partially occupied are not considered. Studies have shown that the actual occupancy of buildings differs significantly from the designed occupancy (sometimes actual occupancy in office buildings is only a third of the designed occupancy [37, 42]) and varies widely from one building to another [120-124]. Fine-grained long-term occupancy should be derived from real-time occupancy, with the irregular occupancy eliminated, to represent the habitual occupancy patterns.

Agent-based simulation is commonly used to model the typical patterns of occupants [125]. However, it suffers from its reliance on predefined rules and a high degree of complexity, which restricts its application to large buildings with multiple zones and complicated interactions among occupants. Considering the stochastic nature of occupancy, probabilistic modeling is another common method to produce occupancy patterns based on a period of historical data [126]. Probabilistic methods apply a probabilistic process such as a Monte Carlo process [127], a Poisson process [121], or a Markov process [124] to generate a time series of occupancy states with certain probabilities. For example, Newsham and Reinhart developed a simple stochastic model for occupancy patterns that reproduced more realistic times of arrival and departure of occupants to and from their offices through a light switch analysis and modified the fixed design profiles [128]. However, this model may overestimate occupancy because it cannot detect long absences or eliminate irregular occupancy. Wang et al. examined the statistical properties of occupancy profiles for single-occupancy offices of a large office building.
Based on the observations from infrared sensors, the intervals of intermediate presence and absence (the periods in which presence and absence are mixed together: sometimes the person is there, other times not) were proposed to be exponentially distributed over time [121], and the vacancy intervals of individual occupants were modeled through a non-homogeneous Poisson process; however, this hypothesis has not been tested in a real-world environment, as the coefficient of the modeled exponential distribution varies over a day. Later, Page et al. introduced a generalized inhomogeneous Markov chain to model occupancy [124]. The method did not provide knowledge of the model structure or an explanation of how it works. Additionally, the Markov chain cannot detect irregular presence and was computationally expensive when applied to multiple zones. Yamaguchi et al. developed a stochastic model for occupancy by dividing the state of presence into different states of activity (e.g., using a PC or being out of the room) and then simulating the working states using a Markov chain [129]. Similar to Wang et al., Yamaguchi et al. assumed that the time an occupant spends in a working state is independent (i.e., does not depend on the time of day), an assumption that has been refuted by Wang et al. The transition distributions were related only to empirical observations, making Yamaguchi et al.'s method inapplicable to more variable activities and irregular presence. Generally, probabilistic modeling derives occupancy patterns by analyzing similar situations and chains of circumstances in historical data. However, these methods were not reliable and practical, as they cannot really provide a long-term habitual pattern, only a relatively probable occupancy status at each time point.
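To make the probabilistic approach concrete, the following is a minimal sketch of a two-state (absent/present) inhomogeneous Markov chain, in which the transition probabilities depend on the hour of the day. It is not the model of any cited study; the function names and all probability values are hypothetical assumptions chosen only for illustration.

```python
import random

def transition_probs(hour):
    """Hypothetical time-dependent probabilities (assumed, not from cited work).

    Returns (p_arrive, p_leave): the chance of absent->present and of
    present->absent within one time step at the given hour.
    """
    if 8 <= hour < 18:
        return 0.30, 0.05   # working hours: arrivals likely, departures rare
    return 0.01, 0.50       # off hours: arrivals rare, departures likely

def simulate_day(steps_per_hour=4, seed=None):
    """Generate one day of binary occupancy states (0 = absent, 1 = present)."""
    rng = random.Random(seed)
    state = 0
    states = []
    for step in range(24 * steps_per_hour):
        hour = step // steps_per_hour
        p_arrive, p_leave = transition_probs(hour)
        if state == 0 and rng.random() < p_arrive:
            state = 1
        elif state == 1 and rng.random() < p_leave:
            state = 0
        states.append(state)
    return states

day = simulate_day(seed=1)
print(sum(day), "occupied steps out of", len(day))
```

Each call returns only one probable realization of a day rather than a habitual profile, which is the limitation noted above for probabilistic models.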
These probabilistic methods were built on strong assumptions that might not hold in reality, such as considering the coefficients of the exponential distributions for presence and absence to be constant over the day. Besides, occupancy patterns may change over time with seasonal and personal factors; however, probabilistic models are difficult to update.

Since ambient sensors have been demonstrated to be related to occupancy [98,106], they were also applied to model long-term occupancy. Motion sensors are commonly used as indicators of occupancy ground truth [48,121,130]. However, the signal produced by a motion sensor is intrinsically a digital signal, which is sensitive to the movement of objects. Further, in order to analyze the relationship between the ambient environment and occupancy patterns, Dong and Andrews used sensors for acoustics, lighting, motion, CO2, temperature, and relative humidity [111,115]. Their proposed approach showed the sequential occupancy; however, it lacked adequate measurements for differentiating irregular presence (such as long absences, meetings, short visits, scheduled events, occasional working at home, intermediate walk-arounds, and short breaks), which was demonstrated to be important for modeling long-term occupancy [131]. More importantly, the occupancy ground truth used to model the occupancy patterns was estimated from sensor data and has not been verified.

Another direction for research on modeling long-term occupancy is conducting building-specific surveys or observations that average the occupancy status at each time point over a period of time [132]. The underlying assumption was that, at the same time on different days, occupancy is an occurrence repeated a large number N of times.
The proportion of occurrences of presence converges to a specific value as the number of samples becomes larger and larger; eventually the ratio (presence samples/total samples) converges to a constant limit as the total sample size increases [133,134]. The most commonly used approach for building-specific observation was averaging survey data. Researchers counted the number of presences at each time point during a period of days and divided it by the maximum occupancy for the period to produce the occupancy profile for each space. This type of observation-based model could roughly model occupancy patterns; however, the averaging process did not consider the influence of the occupancy status at adjacent time points, which made the statistical results fluctuate frequently and excessively. Generalizing the survey data makes it difficult to validate the reliability of the ground truth, and the statistical results are easily influenced by unknown irregular and atypical presence (e.g., early departures from a zone). Previous research has shown that significant discrepancies emerge from averaging the results across different periods of time [120]. Surveying every occupant every day is infeasible and intrusive, and even the data obtained cannot have high resolution and might be inaccurate, because in general it comes from memory and experience [108]. Besides, the periods over which to summarize occupancy patterns are difficult to decide. Since sensor networks are becoming more common in commercial buildings, ambient sensor data has become available to help understand differences in the ambient context and assist long-term occupancy modeling [135,136].

To summarize, current research has gaps in the following areas:

1. Although long-term occupancy could be derived from real-time occupancy, the exact relationship between them is actually unknown.
There are different characteristics (time-series, statistically stable, and stochastic in nature) in real-time occupancy that could be considered in modeling long-term occupancy. Different techniques that represent different characteristics of occupancy profiles should be tested.

2. The occupancy ground truth used to model long-term occupancy contains errors and outliers, especially for large-scale applications in commercial buildings. A learning method to preprocess and improve the quality of real-time occupancy data is required. Since ambient factors are related to occupancy, it is worth exploring the use of ambient factors to improve the quality of real-time occupancy ground truth.

3. Occupants may have different patterns on different days of the week, thus resulting in different profiles to represent the long-term occupancy patterns. There is no systematic way to decide the number of required profiles. Besides, occupancy profiles may change with time, season, and personal schedules. Occupancy profiles should have the potential to be easily updated as new data become available.

4. It is difficult to eliminate irregular occupancy in long-term occupancy modeling. Ambient factors are related to occupancy and have their own patterns over the long term. Data acquired by ambient sensors could provide high-resolution and accurate information for describing irregular indoor ambient variations influenced by irregular occupancy, which brings the potential to differentiate irregular occupancy from regular occupancy by analyzing the patterns of ambient factors.

5. How to evaluate the representativeness of modeled occupancy profiles with respect to actual occupancy patterns has not been investigated. Since there are significant discrepancies in occupancy across different periods of time, the validation of long-term occupancy modeling is difficult and is missing from the literature.
The representativeness of an occupancy profile should be evaluated through comparison with actual occupancy in terms of both the degree of statistical approximation and the heating/cooling loads (energy consumption) over a period of time.

4.2. Integration of Occupancy and Heating/Cooling Controls

In current building management, HVAC systems are programmed to assume buildings are fully occupied, and to run based on this assumption. On the demand side, heating and cooling loads are considered to exist all the time [114], and temperature is used as the indicator to adjust operations, which often results in excess energy consumption for conditioning unnecessary loads [107]. For example, the office of an adjunct professor that is used only two days a week might be heated and cooled unnecessarily for the whole week. Deviations from the actual occupancy can lead HVAC systems to operate on inefficient schedules. To address this issue, occupancy-based heating/cooling schedules replace the designed schedules with actual demands, based on real-time sensing or long-term generalizations regarding occupancy, and significantly improve energy efficiency. Put simply, occupancy-based heating/cooling schedules require terminals to be programmed to switch off when the zone is unoccupied (e.g., overnight, during lunch breaks, etc.), and to work intermittently during occupied periods and effective vacancies based on the actual demands. Additionally, the morning start-up time should be as close as possible to the time the space is first occupied.

4.2.1. Real-time Occupancy and Heating/Cooling Controls

Motivated by the significance of such inefficient energy consumption, a range of research initiatives have been undertaken to optimize HVAC controls based on real-time occupancy [17,137]. The basic principle is that energy efficiency could be improved by not fully running HVAC systems in vacant zones and providing only the minimum airflow required by standards/codes [109].
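The basic principle above can be sketched as a per-zone rule. This is an illustrative sketch only, not a control strategy from the cited work: the function name, the setpoints, and the airflow fractions are hypothetical placeholders, and the minimum airflow fraction stands in for whatever the applicable standard or code requires.

```python
def zone_command(occupied, occupied_setpoint_c=23.0, setback_setpoint_c=26.0,
                 design_airflow=1.0, min_airflow=0.3):
    """Return a cooling-mode command for one zone.

    All default values are hypothetical placeholders (assumptions), not
    values taken from a standard or from the studies cited in the text.
    """
    if occupied:
        # Occupied zone: hold the comfort setpoint at full design airflow.
        return {"setpoint_c": occupied_setpoint_c, "airflow": design_airflow}
    # Vacant zone: relax the setpoint and keep only the minimum airflow.
    return {"setpoint_c": setback_setpoint_c, "airflow": min_airflow}

# Real-time occupancy status per zone (made-up example input).
zone_states = {"office_a": True, "office_b": False}
commands = {zone: zone_command(occ) for zone, occ in zone_states.items()}
print(commands)
```

A rule of this kind reacts only to the current occupancy state; it does not anticipate arrivals or account for interactions among zones.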
Substantial energy savings have been reported by prior research through not maintaining static setpoints in unoccupied zones [138-142]. Instead, zone temperatures were allowed to float within a certain range, depending on whether a zone was occupied or not [143], and setpoint/setback switch schedules were optimized between occupied and vacant modes to reduce frequent system startups while maintaining occupant comfort [144]. For instance, the NEST thermostat is a self-programming thermostat that can create setback schedules by learning the changes occupants make to a setpoint, sensing the presence of occupants, and estimating the time required to reach the setpoint [145]. Another example was the Telkonet SmartEnergy control system [146], which allowed temperatures to vary within a relatively wide range in a vacant residence. After occupants returned, the temperature was adjusted to its normal range within a lag time. The lag was determined based on the tolerable recovery time set by the occupants, building type, HVAC type, and outside weather. Agarwal et al. proposed different temperature setpoints for vacant and occupied zones. In a simulation, temperature was maintained at 22.9 °C and 26.1 °C for occupied and unoccupied zones, respectively. The authors reported a 15% reduction in HVAC energy consumption in a mid-sized office building [107]. Optimizing the switch time step between occupied and vacant modes, in order to avoid frequent system startups while maintaining comfort, has also been explored by previous work [144]. Most of the control strategies found in the literature applied feedback-based mechanisms, which reacted only to occupancy changes [147,148], with reported energy savings of 34% for a typical summer day in Switzerland and of 37% for a typical winter day in a city in the southeastern United States. As more fine-grained occupancy information would lead to more complex control algorithms, Goyal et al.
discussed the tradeoffs between occupancy complexity and the performance of a control algorithm [149] and found that a significant amount of energy could be saved through simple feedback-based control; however, more energy could be saved if prediction of occupancy is integrated into HVAC control loops. A practical attempt at occupancy-based room-level heating control, called PreHeat, aimed to save energy by using occupancy sensing and occupancy prediction to control heating in residential buildings and to reduce the duration when an occupied house is conditioned [150]. However, these research efforts were implemented only in residential buildings or single-occupancy zones for individual terminal control optimization [40,107]. Occupancy-based HVAC control for mid-size or large commercial buildings requires consideration of synergies among adjacent zones and subsystems (e.g., terminals and air distribution systems) and of dynamic and uncertain interactions among HVAC, occupancy, and environmental conditions, and it requires global optimization for energy efficiency.

4.2.2. Long-term Occupancy and Heating/Cooling Controls

Similar to the real-time occupancy-based control strategies, occupancy patterns (long-term occupancy) were found to be driving factors in designing fixed occupancy-based HVAC control strategies [117-119]. According to the existing research, long-term occupancy was typically used to determine the start/stop schedules of terminal controls on a personalized basis and to assign typical loads to each terminal [120]. During the design phase of a building, engineers designed fixed HVAC system schedules and load capacities that account for the maximum possible occupancy [151], based on standards such as the ASHRAE 90.1 2004 and ANSI/ASHRAE 90.1-2007 [152,153], or on large-scale building/occupant survey data and observations [132,133]. The results were for different types of buildings (e.g., office or education) and different types of occupants (e.g., office workers).
The values were displayed hourly with a range from zero to one, corresponding to a fraction of the maximum occupancy for that room (Figure 2). By using fixed design profiles, interior heating and cooling loads can be estimated from standard heat gain statistics. Based on the ASHRAE standard, HVAC systems should be switched on from 6:00 until 24:00, whenever there is a possibility of occupancy.

Figure 2: Occupancy schedules that are used to recommend HVAC schedules by the ASHRAE 90.1 2004 for an office building

The ANSI/ASHRAE 90.1-2007 recommends that HVAC schedules follow the schedules defined for typical building types [154] and be customized according to occupancy assumptions made during a building's design phase, based on engineers' previous experience and large-scale occupant surveys. Based on these recommendations, HVAC systems should run continuously when a space is designed to be occupied and be cycled to meet heating and cooling loads during unoccupied hours simply based on setpoint or setback. However, studies have shown that occupancy patterns are personalized in nature; the actual occupancy of buildings differs significantly from the projected occupancy patterns that are used to determine HVAC schedules, varying with location, building function, season, and even engineers' expectations. Thus occupant arrival/departure times and typical loads are difficult to generalize and predetermine [47,121,124].
For example, it has been found that occupants of an eleven-story business office building in Boise, Idaho, in the northwestern United States, stayed in the office approximately from 7:00 to 18:00, with a maximum occupancy load (heat generated by occupants) of 0.55 [120]; occupants of a twenty-one-story office building in San Francisco, California, in the western United States, arrived at 8:00 and left before 21:00, with a maximum occupancy load of 0.75 [121]; occupants' presence in a two-story university building in Fayetteville, North Carolina, in the mid-Atlantic United States, spanned from 6:00 to 18:00, with a maximum occupancy load of 0.7 [119]; the presence of occupants in a high-rise office building in Vienna, Austria, ranged from 8:00 to 20:00, with a maximum occupancy load of 0.6 [122]; and occupants of a laboratory building in Lausanne, Switzerland, were present from 8:00 to 18:00, with a maximum occupancy load of 0.65 [124]. These findings demonstrate that considerable variations exist among occupancy patterns for different buildings in different locations. To effectively improve HVAC energy efficiency, space-specific occupancy patterns must be taken into account.

The commonly applied method for designing personalized terminal control was to collect long-term occupancy information (samples), average the presence of all occupants in a room, and aggregate all profiles from a zone [132,155]. The most common approach was to count the number of presences at each time point during a period of days and divide it by the maximum occupancy for the period to produce the occupancy profile for each room. The aggregated room occupancy profiles were then used to create zone-level HVAC schedules. However, generalizing historical data made it difficult to validate the reliability of the ground truth, and the statistical results were easily influenced by unknown and atypical schedules (e.g., early departures from a zone).
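The averaging approach described above can be sketched as follows. This is a minimal illustration under one reading of the description: the occupant counts at each time point are summed over the days and normalized by the maximum occupancy and the number of days, so that each profile value is a fraction between zero and one. The function name and the sample counts are hypothetical.

```python
def average_profile(daily_counts, max_occupancy):
    """Average occupancy profile for one room.

    daily_counts: list of days; each day is a list of occupant counts,
                  one per time point (all days must have equal length).
    max_occupancy: maximum occupancy of the room for the period.
    Returns one fraction in [0, 1] per time point.
    """
    n_days = len(daily_counts)
    n_points = len(daily_counts[0])
    profile = []
    for t in range(n_points):
        total = sum(day[t] for day in daily_counts)
        profile.append(total / (n_days * max_occupancy))
    return profile

# Three observed days, four time points each (made-up counts).
counts = [
    [0, 2, 4, 1],
    [0, 3, 4, 0],   # an early departure lowers the last time point
    [0, 1, 4, 2],
]
print(average_profile(counts, max_occupancy=4))  # [0.0, 0.5, 1.0, 0.25]
```

Because every day carries equal weight, a single atypical day (such as the early departure in the second row) shifts the averaged value directly, which illustrates the sensitivity to irregular schedules noted above.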
Within a zone, if one room is considered occupied, the zone is considered occupied so as not to compromise occupant comfort. However, in reality occupants in the same zone may have different or even inverse presence patterns [47,125,156]. Aggregating disparate profiles would create an inaccurate representation of how each room is actually occupied and might offset the energy savings from personalized HVAC schedules. Besides, a commercial building's supply air is handled by one or more air handling units (AHUs), which distribute heating, cooling, and ventilation to individually controlled zones [24,57,157]. Thus the energy efficiency is also affected by the heat transfer and balance among zones. Considering the occupancy patterns of only one zone is likely to result in the energy savings of that zone being compromised by other zones, and thus in energy inefficiency.

Prediction was also widely applied to drive pre-control by analyzing similar situations and chains of circumstances using historical data. However, if the output of a previous prediction is used as the input for the next prediction, the errors may accumulate. In addition, the variability and fluctuation of prediction outcomes are also interference factors when controlling HVAC terminals [126]. Therefore, prediction-based control is not reliable due to the high uncertainty of occupancy and the lack of practical validation.

In summary, current research has gaps in the following areas:

1. It is still not clear how and when occupancy should be integrated with HVAC systems for energy efficiency. This is a complex problem, as occupancy is stochastic in nature and there are heat transfer and balance among the zones of a building, as well as heat gain and loss through a building's envelope.

2. Most of the previous research efforts were implemented only in residential buildings or small commercial buildings, and in single-occupancy zones for individual terminal control optimization.
How to incorporate occupancy with heating/cooling loads for energy efficiency in large-scale commercial buildings is not clear.

3. Existing occupancy-based HVAC control strategies generate case-specific control solutions and may not be scalable to other types of buildings. There is no systematic understanding of the relationships between occupancy and loads to improve energy efficiency. As there is no solution to date that can be applied directly in different buildings with different HVAC systems and occupancies, it is necessary to identify and model the relationships between occupancy characteristics and heating/cooling loads that significantly influence HVAC system energy efficiency.

4. Long-term occupancy and real-time occupancy are not well integrated in determining the heating/cooling loads. Although occupancy-based HVAC system controls have the potential to bring about energy efficiency, they may also cause discomfort to occupants and consume more energy.

4.3. Energy Model Calibration for Energy Simulation

As discussed in Chapter 2, energy simulation has several advantages compared to field experiments. In the last two decades, building energy simulation has begun to play an increasingly important role in evaluating and selecting building energy performances [54, 56]. However, building simulation can be error-prone because of the complex correlations and dynamic changes in envelope thermal conditions, exterior impacts (e.g., solar radiation), and interior impacts (e.g., light-related heat gain) [158], as well as because of the large number of independent and interdependent input parameters, which cannot all be obtained empirically [159]. Despite its advantages, the expected energy efficiency reported in simulations does not usually match that measured in actual buildings, due to the discrepancies between actual buildings and their virtual representations [160,161].
Empirical studies have revealed noticeable differences between simulation results and actual measurements [162,163]. Simulated results sometimes deviate significantly from the measured ones, by 30% to 90% [164]. Only if a simulation model can generate outcomes that closely match the measured energy performance of a building does it have the potential to be reliable and representative in estimating occupancy-loads relationships for HVAC system energy efficiency.

The model calibration for building energy simulation is commonly defined as an inverse process. The accuracy of a simulation model largely depends on how well the outputs agree with available measured data, which in turn depends on how accurately the inputs empirically reproduce the properties of the building the model simulates [165-167]. Two sources are recognized to be generally responsible for discrepancies in building energy simulation. One is the simplification of the building and building systems, assumptions about thermal processes, and algorithmic differences among simulation programs [159,168]; the other is the uncertainty in input parameters. This dissertation first focuses on the first source of error, related to the simulation program chosen, and then discusses the second source of error, the discrepancies in simulation outputs caused by the uncertainty of input parameters.

4.3.1. Comparisons of Simulation Programs

There are over one hundred building energy simulation programs used in current research and practice, and large discrepancies also exist in simulated results when different simulation programs are used to model the same building under the same conditions.
In order to provide an efficient way to examine the effects of occupancy on a building’s HVAC system energy consumption and a cost-effective and non-intrusive solution to test occupancy-loads relationships, simulation programs should be evaluated according to their methods and sequences for considering heat transfer and balance, load calculation, occupancy-HVAC system connection, HVAC system modeling, and the HVAC system simulation process. Extensive research has been done to validate the accuracy and reliability of simulation results using different simulation programs [118,169-174] based on the requirements of ASHRAE Standard 140 [175]. However, these efforts focused on the differences in results, instead of systematically analyzing the coupling of occupancy information with building HVAC energy simulation. Several other studies have compared the advantages of different simulation programs [176-178]. However, to the authors’ knowledge, no work to date has specifically compared the principles used in modeling the effects of occupancy on a building’s HVAC energy efficiency, or evaluated HVAC system responses to occupancy. Widely used base case buildings and reference buildings for validation lack actual occupancy information and cannot reflect the actual HVAC energy performance from the occupancy coupling perspective.

All energy simulation programs are built on first-principle mathematical modeling of the heat balance within or around a building. An HVAC system is virtually run, and the energy needed to keep a constant and comfortable thermal environment under both exterior and interior impacts is calculated. Exterior impacts are the loads added to the building from the outside environment, such as solar radiation. Buildings have envelopes, which are surfaces with heat transfer and balance. No matter how good the construction is, energy is conducted through the envelope if a temperature difference exists between a building’s interior and exterior.
There is also infiltration occurring at the intersections of surfaces such as windows and walls, as well as radiation through translucent surfaces by visible and invisible light. Interior impacts result from the loads caused by equipment/appliances or by users being present, which are closely related to occupancy. In general, energy simulation programs take occupancy and the related internal gains as hourly inputs; these gains, such as lighting (which injects heat into the space) and appliances, constitute the main parts of the gained loads. Typically, there are two types of occupancy input: (1) diversity factors, numbers between zero and one representing the multipliers of nominal loads for occupant metabolic heat gains, such as the ones used in EnergyPlus; and (2) actual loads expressed in W or W/m². ESP-r, for instance, requires the exact amount of heat given off by occupants. In an energy simulation program, exterior impacts are set by hourly meteorological data from standard databases, such as the DOE (Department of Energy) database, while the interior impacts are controlled by the settings related to occupants, lights, appliances, and schedules. The interactions between exterior impacts and interior impacts are represented by the thermal properties of surfaces, such as wall thermal capacitance. Since different simulation programs use different methods and sequences to model occupancy related heat gain and the HVAC system response to exterior and interior impacts, a comparison is necessary to evaluate which simulation programs are capable of coupling occupancy with building HVAC energy simulation (Figure 3).
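As a minimal sketch of the two occupancy input types described above, the following Python snippet turns hourly diversity factors (type 1) into occupant heat gains and then expresses them as a load density in W/m² (type 2). The function names, the 24-value schedule, the 120 W per person figure and the 40 m² floor area are illustrative assumptions; they are not taken from EnergyPlus, ESP-r, or this dissertation.

```python
from typing import List, Sequence


def occupant_gain_w(diversity: Sequence[float],
                    nominal_occupants: int,
                    watts_per_person: float) -> List[float]:
    """Hourly occupant heat gain in W from diversity factors in [0, 1]."""
    gains = []
    for factor in diversity:
        if not 0.0 <= factor <= 1.0:
            raise ValueError("diversity factors must lie between zero and one")
        # A diversity factor scales the nominal (full-occupancy) load.
        gains.append(factor * nominal_occupants * watts_per_person)
    return gains


def gain_density_w_per_m2(gains_w: Sequence[float],
                          floor_area_m2: float) -> List[float]:
    """Express the same gains as an actual load density in W/m^2."""
    return [g / floor_area_m2 for g in gains_w]


# Hypothetical office: 4 nominal occupants, 120 W each (assumed), 40 m^2.
schedule = [0.0] * 8 + [0.5, 1.0, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0, 0.5] + [0.0] * 7
hourly_gain = occupant_gain_w(schedule, nominal_occupants=4, watts_per_person=120.0)
hourly_density = gain_density_w_per_m2(hourly_gain, floor_area_m2=40.0)
```

The two representations carry the same information once the nominal load and floor area are known, which is why a comparison of programs has to look at how each one consumes the input rather than at the input format alone.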
Figure 3: The relationship between occupancy and building HVAC energy simulation

4.3.2. Simulation Calibration Methodologies

After a specific simulation program is selected, the quality of the calibration is limited by the determination of input parameter values, which represent the building as an abstraction in a simulation. Therefore, simulation is a context-related process. The time and effort required to collect data and determine input parameters make energy model calibration a challenge for large-scale applications in commercial buildings [179]. Since occupancy-loads relationships are specific to individual buildings, each building has to be modeled and calibrated individually. Using typical/standard values for input parameters, or estimating energy performance based on data from similar buildings, does not provide accurate energy model calibration for another specific building [180]. A review of current calibration work has revealed that there is no generally adopted methodology for building energy model calibration [158,166], due to the different requirements for simulations, different purposes of simulations, different configurations of building systems, different available evidence, and different levels of knowledge and experience of analysts. Below is a review of the major calibration approaches found in recent literature.

Statistical learning based calibration methods apply a simple or multivariable mathematical/analytical analysis to find relationships among actual measurements, simulation outputs and input parameters.
A simple way to express a statistical relationship is by an objective function or a penalty function [181,182], in which input parameters or output variables are assigned certain weights, and a mathematical/analytical algorithm is then used to map them to actual energy measurements. Input parameters could be either static parameters (e.g., chiller coefficient of performance) or dynamic parameters (e.g., room temperature). The weights are determined using measured data, such that the corresponding formulation is able to predict acceptable energy performance with no direct link to physical building properties [183]. Statistical relationships could also be established by machine learning techniques [184-186]. Once the learning models are built and tuned, they can be applied to process new model inputs and estimate the corresponding energy performances. Supervised machine learning techniques, such as Artificial Neural Networks (ANN) [187] and fuzzy logic models [188], which are capable of modeling complex relationships between inputs and outputs, are commonly used to learn sophisticated non-linear and joint effects of input parameters [189]. In view of the large number of parameters to be considered, training a learning model is computationally expensive and may not provide acceptable solutions because of overfitting (the model becomes overspecialized and does not generalize). In addition, energy model calibration is a case-by-case process; machine-learning models generated from reference buildings may not be applicable to other buildings in the same climate. In general, statistical learning based calibration requires a short development time and provides an accurate estimation of energy consequences given the availability of prior training data. However, it is data-driven and requires extensive data for retraining if there are any system operational changes.
Moreover, a statistical learning based calibration method abstracts the calibration process to a pure mathematical fitting problem, which may not be able to represent the real function or contribution of each input parameter to building energy performance. Even though the net effects of all parameters can generate outputs that closely match the measured energy performances, the individual parameters may still be incorrectly or unreasonably tuned; thus it is difficult to simultaneously achieve high simulation accuracies at multiple levels.

Analytical calibration methods use available evidence, such as zone size and window height, to iteratively adjust input parameters based on analysis, experience and trial until simulation outputs match actual measurements. This calibration method is mostly a manual, iterative and pragmatic intervention, which requires significant time, effort and expertise [190,191]; however, it is capable of modeling a building and building systems under previously unobserved conditions. Standard steps (e.g., simulation plan, data collection, evidence input, calibration, model refinement) have been widely used in previous research [192-194], and the process typically requires conducting interviews and collecting drawings, specifications, field measurements, logs, system manuals and system settings [195-197]. To improve model calibration, the use of continuous field measurements and observations has been proposed, and the calibration accuracy was significantly increased [198]. Sensors could also be installed to get the necessary information for calibration [199]. For parameters that cannot be determined directly from evidence, expert knowledge and experience are required. Most simulation exercises employ heuristic methods for parameter estimation. Specifically, an expert first selects a set of parameters that are likely to significantly influence the outputs of a simulation model on a building-to-building basis [165].
Different combinations of values are then tested until the differences between the simulated and measured energy performance are reasonably small. Analytical calibration methods require a trial and error process over a large number of parameters, which may not be reliable as the complexity of the simulated building increases. Even if each input parameter is empirically validated, the simulation output of a building may still be far from the measured building performance, since buildings do not always behave as initially designed. Continuous updates of input parameters are required for calibration. The quality of the analytical calibration model relies heavily on the subjective judgment of an analyst on building systems and thermal processes, especially in the choice of parameters to be calibrated, the quantification of their prior distributions, best guesses in parameter estimation, and the interrelations among parameters. High accuracy at multiple levels is difficult to achieve solely with this method.

More recently, a combination of analytical calibration methods and sensitivity analyses or analytical optimization approaches has been introduced [200]. Sensitivity analysis is widely used to reduce the number of parameters to be adjusted [201]. A matrix of possible values of each input parameter is created using sampling algorithms, and imported to a quasi-deterministic approach, such as Monte-Carlo (MC) methods. After thousands of simulation trials by a commercial simulation program, a set of promising vector solutions, rather than a single one, is found based on goodness-of-fit [202], or an analytic meta-model is obtained to minimize the difference between simulated and measured data [203]. However, the performance of integrated calibration methods depends heavily on whether the solution exists among the trials. Integrated calibration is computationally efficient and could provide comparable results [159].
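The sampling-and-ranking idea described above can be sketched as follows. This is only an illustration under stated assumptions: parameter values are drawn by plain uniform random sampling, goodness-of-fit is measured with CV(RMSE) (one common choice, not necessarily the one used in the cited studies), and `toy_simulate` is a hypothetical stand-in for a run of a real energy simulation program.

```python
import math
import random
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, float]


def cv_rmse(simulated: Sequence[float], measured: Sequence[float]) -> float:
    """Coefficient of variation of the RMSE, relative to the measured mean."""
    n = len(measured)
    mean = sum(measured) / n
    rmse = math.sqrt(sum((s - m) ** 2 for s, m in zip(simulated, measured)) / n)
    return rmse / mean


def monte_carlo_calibrate(simulate: Callable[[Params], Sequence[float]],
                          bounds: Dict[str, Tuple[float, float]],
                          measured: Sequence[float],
                          trials: int = 1000,
                          keep: int = 5,
                          seed: int = 0) -> List[Tuple[float, Params]]:
    """Sample parameter vectors, score each trial, return the best `keep`."""
    rng = random.Random(seed)
    scored = []
    for _ in range(trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        scored.append((cv_rmse(simulate(params), measured), params))
    # Keep a set of promising solutions rather than a single one.
    scored.sort(key=lambda item: item[0])
    return scored[:keep]


def toy_simulate(params: Params) -> List[float]:
    """Hypothetical simulator: energy use grows linearly over four periods."""
    return [params["base_load"] + params["gain"] * step for step in range(4)]


measured_energy = [10.0, 12.0, 14.0, 16.0]
candidates = monte_carlo_calibrate(
    toy_simulate,
    {"base_load": (5.0, 15.0), "gain": (0.0, 4.0)},
    measured_energy,
    trials=500,
)
```

Because the candidates come only from the sampled trials, the best score can be no better than the closest sample, which mirrors the dependence on the solution existing among the trials.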
With integrated calibration, a group of solutions is usually selected such that the actual energy measurement falls within the range of the simulated values. However, there is no work to date that specifically focuses on a solution with high accuracy at multiple levels. Sometimes zone level energy consumption is simply added up to represent building level or system level energy consumption [204,205]. These sequential calibrations cannot achieve simultaneous high accuracy for multi-level simulations when there are several zones and multiple HVAC units.

To summarize, current research has gaps in the following areas:

1. Previous studies have compared energy simulation programs; however, to date there is no study focusing on the comparison of programs for coupling occupancy with building HVAC energy simulation. A systematic review is required to identify which simulation program could accurately and reliably model the effects of occupancy on heating/cooling and HVAC system energy efficiency.
2. A review of current literature on this topic has revealed that there is no generally adopted method for calibrating building energy models. A flexible and robust framework is needed to tackle the general calibration process for a range of commercial buildings.
3. In general, energy model calibration is an over-parameterized and context-related inverse process. Current calibration methods focus on single-level simulation accuracy, either at the building/system level or at the zone/room level. An energy model that has high accuracy at one level does not necessarily provide precise simulation at another level. Single-level accuracy would make the energy model unreliable and inaccurate for estimating the energy efficiency gained by incorporating occupancy with HVAC system heating/cooling loads.
4. The possibility of integrating the advantages of statistical learning based calibration and evidence based calibration has not been considered.
To obtain an acceptable match between the outcomes of a simulation model and measured energy consumption, calibration is done based on evidence collected from actual building data or on a mathematical transformation of simulated results. However, analysts typically have limited information about a given building (e.g., as-built documents), while the number of parameters to adjust is generally large and cannot all be investigated. Building energy modeling differs from case to case, and analysts generally do not have identical knowledge with which building energy models could be calibrated. Meanwhile, a pure mathematical approximation simply distributes the errors among different input parameters and cannot deliver consistent performance. A more integrated framework that takes advantage of both energy simulation model calibration methods is needed.
5. Moreover, all building energy models used in studying occupancy based HVAC terminal controls were calibrated with energy ground truth collected under a building’s normal control. These models are not reliable for revealing and reflecting a building’s energy performance under other controls. Conservatively, the energy efficiency of new controls under different occupancy-loads relationships can only be evaluated after they are implemented. Otherwise, the model has to be robust to the changes resulting from the building being operated under different controls.

Chapter 5: Research Objectives and Questions

The two main objectives of this dissertation are: (1) to improve non-intrusive building occupancy modeling, including both real-time occupancy and long-term occupancy; and (2) to investigate the relationships between occupancy and heating/cooling loads on the demand side for improving HVAC system energy efficiency.
In the context of this dissertation, energy efficiency is the weighted sum of conditioning miss, which is the length of time a space is occupied while its temperature is outside the setpoint range, and energy reduction, which is the absolute amount of energy savings. Conditioning miss is considered because it compromises the basic function of an HVAC system, which is to maintain a desired thermal environment that is comfortable for occupants.

Objective 1: Learn the relationships between occupancy and ambient factors and develop a framework for occupancy modeling to improve non-intrusive occupancy awareness (real-time occupancy detection and estimation, as well as long-term occupancy profiling).

Question 1-1: Why is it necessary to include ambient factors in the modeling of building occupancy? What are the advantages of applying multiple ambient factors to model building occupancy?
Question 1-2: What is the relationship between ambient factors and real-time occupancy status (instant occupancy at each time point), and how can real-time occupancy be detected/estimated by learning the patterns in ambient factors?
Question 1-3: What is the relationship between ambient factors and occupancy patterns (habitual occupancy for a certain period of time), and how can an occupancy profile (a typical weekday/weekend presence/number probability as a function of time) be modeled for a certain period of time by learning the patterns in ambient factors and real-time occupancy?

Objective 2: Learn the relationships between occupancy and heating/cooling loads on the demand side to improve HVAC system energy efficiency.

Question 2-1: Why does building occupancy significantly influence the HVAC system heating/cooling energy efficiency?
Question 2-2: How are the occupancy characteristics quantitatively associated with HVAC system heating/cooling loads?
Question 2-3: What are the types and ways of modeling the relationships between occupancy and heating/cooling loads that are important for HVAC system energy efficiency?
Question 2-4: How can the occupancy-loads relationships be incorporated in HVAC system setpoint controls to improve energy efficiency?

A well-calibrated and robust energy model for simulation is a necessary tool in this dissertation for accurately and reliably modeling and validating the occupancy-loads relationships. However, current calibration methods focus on single-level simulation accuracy. Single-level calibration considers the accuracy for one scale of output in an energy simulation, such as building level gas consumption or zone level electricity consumption. Since there are a large number of input parameters but few output variables (depending on the required resolution and the length of simulation), it is usually relatively easy to approach high accuracy for a single level of simulation. However, simultaneous accuracy for multiple levels of simulation is crucial. Although different levels of energy consumption are interconnected and reflect how closely the simulation results approximate the measured energy performance, accurate simulation at a single level does not necessarily mean accurate simulation at other levels, especially when there are several zones and multiple HVAC units in a building. It becomes more difficult to achieve high accuracies for multiple levels of simulation simultaneously as the complexity increases, due to the complicated and dynamic correlations and interactions among envelope thermal conditions, HVAC responses, exterior impacts and interior impacts. In sum, research on HVAC system energy efficiency in a building affects more than one level of energy performance and might require other levels of energy simulation for analysis and exploration [57].
Therefore, a multi-level calibration framework is necessary to achieve multiple calibration objectives simultaneously.

Objective 3: Provide a calibration framework for energy modeling to evaluate the energy implications of different occupancy-loads relationships at multiple levels.

Question 3-1: Why is a well-calibrated building energy model important for quantifying the occupancy-loads relationships?
Question 3-2: How can a building energy model be calibrated with high simulation accuracies at multiple levels simultaneously?
Question 3-3: How can the robustness of energy simulation be improved for testing different occupancy-loads relationships, that is, how can consistent performance be kept when the HVAC system is operated under different occupancy and controls?

Chapter 6: Testbed Building and Reference Buildings

6.1. Testbed Building

The testbed building in this dissertation for occupancy modeling and exploration of occupancy-loads relationships is a typical educational office building located on the University of Southern California campus, 4 miles away from downtown Los Angeles. The testbed building (Figure 4) is a three-story office building with a gross area of 3,735 m², and contains 89 mechanically ventilated rooms of varying sizes and functions. Most of the rooms in the building are enclosed single occupancy offices; other rooms are classrooms, conference rooms, and auditoriums. The building hosts approximately 50 permanent occupants (i.e., staff, faculty, graduate students) throughout the year and more than 2000 temporary occupants (i.e., undergraduate and graduate students) per semester.

Figure 4: Case study building located on the University of Southern California campus

The testbed building is equipped with a state-of-the-art Building Management System (BMS) and a central HVAC system with air handling units (AHUs) serving a total of 64 variable air volume (VAV) boxes and 3 fan-coil units (FCUs).
A VAV box is responsible for regulating the ventilation in its thermal zone with conditioned air, and reheating the air with boiler-supplied hot water if the zone needs heating instead of cooling. The conditioned air is supplied to the VAV boxes by the AHUs using fans and ductwork. There are two AHUs in the building, each servicing one side of the building with service areas of similar size. The AHUs take in outside air, mix it with return air from the building, and cool or heat the mixed air with chilled or hot water supplied by chillers or boilers. Each VAV box serves a mechanical zone; it regulates the demand for the volume of conditioned air, determined by the volume of the zone and the difference between the zone’s actual temperature and set point. A VAV box reheats the air using hot water, if necessary, before it discharges the air into the room. The third floor of the building comprises 16 thermal zones, each of which is serviced by one VAV box. The HVAC energy consumption in the building can be decomposed by fuel type into cooling and heating energy consumption, used by chillers and boilers to generate chilled and hot water, respectively, and ventilation energy consumption, used by the AHUs and their embedded fans to distribute conditioned air in the building. The conventional HVAC system control implemented in the testbed building runs in an on-hour mode during the daytime (6:30 - 21:30 on workdays, and 7:00 - 21:30 on weekends). In this mode, all thermal zones in the building are assumed to be always occupied, and a constant temperature set point (73 °F) is maintained on the demand side by dynamically adjusting the airflow damper and reheating valve of each zone.
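The daily schedule just described can be written as a small lookup. The sketch below is a hedged illustration, not the building's actual control code: it assumes that workdays are Monday to Friday, that the end of each window is exclusive, that holidays are ignored, and that every time outside the on-hour window belongs to the off-hour mode with no setpoint enforced. The names `conventional_mode` and `setpoint_f` are hypothetical.

```python
from datetime import datetime, time
from typing import Optional

SETPOINT_F = 73.0                             # constant on-hour setpoint
WORKDAY_ON = (time(6, 30), time(21, 30))      # on-hour window, workdays
WEEKEND_ON = (time(7, 0), time(21, 30))       # on-hour window, weekends


def conventional_mode(ts: datetime) -> str:
    """Return "on-hour" or "off-hour" for a timestamp."""
    # weekday() is 0 for Monday; 5 and 6 are Saturday and Sunday (assumed weekend).
    start, end = WEEKEND_ON if ts.weekday() >= 5 else WORKDAY_ON
    return "on-hour" if start <= ts.time() < end else "off-hour"


def setpoint_f(ts: datetime) -> Optional[float]:
    """Setpoint in degrees Fahrenheit, or None when no setpoint is enforced."""
    return SETPOINT_F if conventional_mode(ts) == "on-hour" else None
```

Under this baseline the mode depends on the clock alone, so a zone that is empty at 10:00 on a workday is conditioned exactly like an occupied one, which is the behavior occupancy based control aims to improve on.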
The conventional control also has an off-hour mode, in which the HVAC system is shut off during the nighttime and no cooling or heating services are provided. Only minimum airflow is maintained to satisfy ASHRAE compliance.

6.2. Customized Ambient Sensors

The building has 64 existing wired sensors for the VAV boxes (part of the legacy EMS/HVAC system) plus three for the FCUs. A total of 50 customized sensor boxes (BLEMS Augmented Sensor Devices, referred to as “sensor boxes”; Figure 5) were constructed and deployed in the testbed building, covering 50 of the 89 mechanically ventilated rooms, with 28 of them on the third floor (Figure 6), as part of a $121 million program funded by the Department of Energy and the Los Angeles Department of Water and Power [206]. BLEMS, which stands for Building Level Energy Management System, studies the behaviors of a building and its occupants, and aims to optimize the building energy consumption while maintaining occupant comfort through computing, sensor monitoring and online control. Through real world experiments, inputs from a wide range of modalities (including mobile phones and various wireless and wired ambient sensors) are integrated, fused and evaluated in order to measure and track indoor climate, occupancy, and preferences. The sensor network is part of the BLEMS project (http://i-lab.usc.edu/sites.html).

The customized sensor box is a stand-alone platform for gathering ambient information at the room level. It is a single-board computer that measures various environmental parameters and makes the parameter values available through a simple web page accessible over Wi-Fi. A sensor box consists of an Arduino Black Widow single-board microcontroller based on the Arduino Duemilanove processor, with integrated support for 802.11 Wi-Fi communications.
Each sensor box hosts a number of ambient sensors, including a light sensor, a sound sensor, a motion sensor, a CO₂ sensor, a temperature sensor, a relative humidity sensor, a PIR sensor (which detects objects as they pass through a door), and a door switch sensor (the main reason for placing the sensor boxes next to the doors). The brief specifications of each sensor are as follows:

Light sensor: the light sensor detects the average diffuse visible light level; it is an analog sensor that outputs a voltage from 0-5V corresponding to the visible light sensed.
Sound sensor: the sound sensor detects average sound; it is an analog sensor that outputs a voltage from 0-5V corresponding to the sonic air pressure sensed. A continuous output of the sound sensor yields a signal showing the sinusoidal sound waveform.
Motion sensor: the motion sensor detects motion in a broad field; it is a digital sensor that outputs a high (digital ‘1’) when motion is detected within a beam of around 30 degrees. The motion sensor is triggered by motion, and reset only in the absence of motion for over approximately 10 seconds.
CO₂ sensor: the CO₂ sensor detects the carbon dioxide level; it is a self-calibrated digital sensor that outputs positive pulses (starting at a low-high transition and ending at a high-low transition) whose duration in seconds corresponds to the level of carbon dioxide sensed.
Temperature sensor: the temperature sensor detects air temperature; it is a digital sensor that outputs a count of hundredths of degrees Celsius, e.g., 5583 means 55.83 °C.
Humidity sensor: the humidity sensor detects air humidity; it is a digital sensor that outputs percentage points of relative humidity, e.g., 85 means 85% (or 0.85) relative humidity.
PIR sensor: the PIR sensor detects the presence of an object in a narrow field; it is a digital sensor that emits a light beam and detects its reflectance within a narrow beam of around 15 degrees and a distance of around 3 feet. It outputs a 1 when an object is present and a 0 otherwise.
Door switch sensor: the door switch sensor detects the door open/closed status; it is a normally open 2-contact magnetic reed switch that is closed in the presence of a magnet.

The CO₂ sensors are tested and pre-calibrated at the factory. There is an Automatic Baseline Correction mechanism inside the CO₂ sensors to compensate for sensor aging and to adapt to possible transportation/installation damage. Since this process lasts a couple of weeks and does not support further calibration, the CO₂ sensors may produce noise and error, which are then alleviated by a moving average method. All other sensor values are used as relative values. The sound sensor outputs a voltage from 0-5V corresponding to the sonic air pressure sensed. Since the sensor is installed close to the door, its sensed waveforms are easily influenced by the corridor environment. Sound averaged over 5 seconds is therefore used to alleviate noise and error. The PIR sensor emits a light beam and detects its reflectance within a narrow beam of around 15 degrees. The beam is sometimes blocked by items hung on the wall or door, which causes error and noise. The error and noise from the PIR sensor are manually examined and corrected.

The sensor box is installed on the wall of each room, next to the door, at a height of around 1.5 m. Since all of the sensors are integrated in a single sensor box, the individual sensor locations could not be changed. The impact of sensor locations on occupancy modeling is a study in itself but not the focus of this dissertation, because all sensor boxes are in relatively similar locations. The authors have tested different locations for placing the temperature sensor by considering the interior temperature stratification.
For a future testbed building equipped with a more advanced sensing system that allows each sensor to be installed at its own location, the placement of each sensor could be studied, and it is expected that both the individual performance and the joint effects of the sensors would be improved.

Figure 5: BLEMS sensor box used for collecting ambient data

Figure 6: Deployment of sensors and zoning on the third floor of the testbed building (the floor plan is modified for privacy purposes; all room features (e.g., size, orientation, etc.) are kept the same)

Data reported by each sensor box include 11 sensor variables, which can be categorized into three types: (1) instant variables that show the instant output of a sensor at the time the data are queried, including light level, binary motion, CO₂ concentration, temperature, humidity, binary PIR, and door status (open/closed); (2) count variables that sum the number of times a sensor's output changes in the last minute (motion count net, PIR count net, and door count net); and (3) average variables that show the average value of a sensor's output over a certain period of time (sound average, every 5 seconds). Data are automatically queried with one-minute granularity, time stamped, and stored in an SQL database. To be clear, the binary motion variable detects any motion in the interior area close to the door within a beam of around 30 degrees. The motion sensor is triggered by motion, and reset only in the absence of motion for over approximately 10 seconds. Binary PIR detects an object passing through a door in a narrow field. The PIR sensor emits a light beam and detects its reflectance within a narrow beam of around 15 degrees and a distance of around 3 feet. It outputs a 1 when an object is present, and a 0 otherwise. The binary motion is used to detect an activity in an interior area, while the binary PIR is used to detect an object passing through the door. Therefore, the detection area and purpose of the two variables are different.
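To make the variable types concrete, the sketch below decodes two of the raw digital outputs using the encodings stated in the sensor specifications, and derives a count variable and an average variable from a sequence of samples. The function names are hypothetical, and the assumption that a box exposes its samples as plain Python lists is for illustration only; the actual firmware and database schema are not described here.

```python
from typing import Sequence


def decode_temperature_c(raw_hundredths: int) -> float:
    """Raw count of hundredths of a degree Celsius, e.g. 5583 -> 55.83."""
    return raw_hundredths / 100.0


def decode_relative_humidity(raw_percent: int) -> float:
    """Raw percentage points of relative humidity, e.g. 85 -> 0.85."""
    return raw_percent / 100.0


def count_changes(samples: Sequence[int]) -> int:
    """Count variable: number of times a binary output changes in a window."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def window_average(samples: Sequence[float]) -> float:
    """Average variable: mean of a sensor's output over a window."""
    return sum(samples) / len(samples)


# Hypothetical samples collected within one reporting window.
door_states = [0, 1, 1, 0, 0, 1]       # door closed/open readings
sound_volts = [1.0, 2.0, 3.0]          # sound sensor voltages
door_count = count_changes(door_states)
sound_avg = window_average(sound_volts)
```

Keeping instant, count and average variables separate matters for occupancy modeling because a door that opened and closed within the minute leaves no trace in the instant door status but does show up in the count.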
In addition, the motion sensor can count the number of times an activity is detected within a certain period of time, while the PIR sensor can count the number of times an object passes through the door within a certain period of time.

The energy consumption of various building systems such as HVAC, lighting, receptacle and mechanical is metered and recorded in a building energy management system (BEMS), a centralized server sending and controlling information to the system actuators.

6.3. Reference Buildings

In order to provide a benchmark for whole-building energy analysis, the Department of Energy has assigned national energy research laboratories (Pacific Northwest National Laboratory, National Renewable Energy Laboratory, and Lawrence Berkeley National Laboratory) to develop standard building energy models for reference. The most common commercial buildings are divided into 16 different types in 16 main cities representing 16 climate zones based on ANSI/ASHRAE/IES Standard 90.1. For each type, individual prototype building models are created for research and industrial use. These reference buildings could represent 70% of the commercial building area in the United States [207]. Office reference buildings are chosen in this dissertation and fall into three categories based on the number of floors: one-floor small office building, three-floor medium office building, and twelve-floor large office building (Figure 7) [208]. For each category, there are three kinds of construction (new construction after 2004, post-1980 construction, and pre-1980 construction), which comply with the ANSI/ASHRAE/IES Standard 90.1 requirements of different years.

Figure 7: Reference building developed by Department of Energy

Office buildings with new construction are chosen as test subjects in this dissertation, complementary to the campus testbed building described in Chapter 6.1.
This is because these models could be more representative of new and existing buildings for investigating the relationship between occupancy and heating/cooling loads and for assessing advanced HVAC system control based on occupancy characteristics for energy efficiency. Occupancy of the reference buildings is randomly chosen and assigned from the actual occupancy on the third floor of the testbed building. Other default model inputs are determined by the national laboratories.

6.4. Virtual Reference Buildings

The reference office buildings only have rectangular shapes and a fixed number of zones, which might deviate from real-world office buildings. In order to increase the generalizability and consistency of the methodologies and results for occupancy-loads relationships, virtual reference buildings are created based on the reference buildings provided by the Department of Energy [209] and architectural logic/shape grammar [210]. Five basic building shapes, I, L, H, U, and T, are considered [211,212] (Figure 8).

Figure 8: Basic building shapes for mass modeling

It is assumed that the virtual reference buildings have the same occupancy and number of rooms as the third floor of the testbed office building. The original occupancy for each virtual reference building is formed by a Space Planning Process (SPP) that preserves the pairing of occupants with their specific spaces in the testbed building. The spatial requirements of occupants (group requirement) are kept; for example, occupants from the same department should be next to each other. The original HVAC system design remains the same to eliminate possible influences from different HVAC system parameters. All the primary and secondary HVAC systems (e.g., the same two air handling units) follow the same settings as in the testbed building. Zoning also keeps the same number of zones and occupancy capacities.
A computer application program is developed in Java to assist the layout planning process for different building shapes (Figure 9).

Figure 9: Computer application program interface for layout planning

The user interface of the program is divided into two parts. The left part is the setup menu, in which the existing layout plans can be viewed, edited, locked, and deleted. Before creating new plans, the user decides whether the program should generate the plan automatically based on the specific constraints and random number coding, and whether the group requirement and HVAC zoning information are shown during the planning process. The right part of the GUI is the configuration window, in which the user can drag a room from the top-right repository and drop it into a vacant space in the building. The number of unassigned rooms can be tracked by the progress bar. There are three options: TEST, to check whether the current plan is the same as an existing one; CLEAR, to erase the current plan and start over; and FINISH, to save the current plan. For each building shape, there are four additional unassigned spaces for two staircases and two restrooms. If the group requirement checkbox is checked, the room numbers under the same group are color coded to point out where the next rooms should be placed. When the original zoning is selected, the default zoning information based on the testbed building is shown by enclosing the rooms of the same zone. Although the zoning is not exactly the same, the number of zones and the occupancy capacity of each zone remain the same. Using this program, twenty different plans are generated for each building shape by locating the original 28 rooms of the testbed building.

Then one-story virtual test buildings are built from the layout plans using SketchUp. Corridors of the same size as in the testbed building are added.
Other building features, such as window fraction, construction materials (e.g., wall U-factor), and internal loads (e.g., average lighting power density), are set following the reference office building models described in Chapter 6.3 and ASHRAE 90.1 [15,207].

Chapter 7: Framework for Modeling Building Occupancy

Occupancy data is difficult to collect, and there is no standard way to non-intrusively sense occupancy throughout a building with high accuracy and precision. In this dissertation, it is assumed that the designed occupancy does not necessarily represent the actual occupancy and that finding the historical occupancy could improve the accuracy of modeling actual occupancy. Ambient information acts as an important indicator of interior environmental changes; it is influenced by occupancy changes, has its own patterns over a period of time, and can support occupancy modeling. A framework is proposed to model both real-time occupancy and long-term occupancy by analyzing the relationships between occupancy and the ambient environment. The framework is generalized to be applicable to other office buildings and does not intend to provide a solution for a specific building. No inferences are made about the gender, age, or profession of occupants, their location in the building, or the orientation of the room. These attributes are associated with occupancy, but the relation is covariant rather than directly causal.

7.1. Data Collection and Preparation

For local occupancy modeling, in which the occupancy model is trained and tested in the same space, the ground truth was collected using ceiling-mounted cameras for only three single-occupancy rooms (Figure 10) because of the associated privacy issues and cost. However, the rooms in the experiment are representative of typical faculty offices in the testbed building. All three offices have a size of approximately 200 sq. ft. and accommodated up to four occupants (including visitors) during the tests.
Images were captured automatically and sent wirelessly to a database every 10 seconds. Occupancy in the images was then manually labeled to obtain the ground truth in the rooms during the data collection period. IRB (institutional review board) approval was obtained for the study. Since the selected rooms are on different sides of the building, and the occupants have different positions, schedules and preferences, the selection has sufficient variability and diversity.

Figure 10: Three typical offices to collect occupancy ground truth using ceiling-mounted cameras (sensor boxes are marked with red circles for their placements)

A total of seven months' data were collected in three rooms in the testbed building during two periods of data collection. The first period spanned two months from May to July 2012, and the second period spanned four months from January to April 2013. Data collection was sometimes interrupted for several reasons, including interrupted Wi-Fi connection, loss of electricity in the building, physical damage to sensor boxes, and data corruption. Moreover, as all of the sensor boxes in the building were communicating concurrently with the same server at a short interval (i.e., 1 minute), data loss due to limited bandwidth was inevitable.

The data from the first period were used to model real-time occupancy based on the analysis of variations in the ambient environment. Both ambient sensor data and occupancy ground truth were collected to build the supervised learning model for understanding the mathematical/statistical relationship between real-time occupancy status and the ambient context. As occupancy profiles are derived from periodical occupancy data, the data from the second period were processed for modeling the long-term occupancy profile. The output of the real-time occupancy model was used as the ground truth for the occupancy profiling.
Based on the sensor variable impact analysis and the assumption made in this dissertation, CO2 concentration, door status, light level, binary motion, and temperature were selected as ambient context to eliminate irregular presence in personalized long-term occupancy modeling. Figure 11 illustrates the process for data collection and analysis.

Figure 11: Process for occupancy detection and personalized occupancy profiling (first period, May to July 2012: sensor dataset and occupancy ground truth feed the real-time occupancy model; second period, January to April 2013: sensor dataset and model-derived occupancy ground truth feed the long-term occupancy model)

Global occupancy modeling, in which an occupancy model is trained in one space and used in other geometrically similar spaces, bears enormous potential to extend the applicability of ambient-sensing-based occupancy modeling, especially for real-time occupancy modeling (as long-term occupancy can be derived from real-time occupancy). Geometrically similar spaces are spaces with similar geometry, similar area (no more than 20% larger or smaller), similar window and door sizes, and similar permissions and means to control building elements and systems. In order to further increase the diversity and variability, besides the three rooms in the testbed building, another room in the same building and two rooms in another building on the main campus of the University of Southern California were added to implement and test global occupancy modeling (Figure 12). The six rooms chosen serve occupants with different job titles, schedules and preferences (Table 4).

Figure 12: Locations of the six rooms in the two testbed buildings

Table 4: Samples collected from six rooms for global occupancy modeling

Room No. | Occupant  | Periods
A        | Staff     | May-Aug 2012; Jan-Apr 2013; Sep-Dec 2013
B        | Staff     | May-Aug 2012; Jan-Apr 2013; Sep-Dec 2013
C        | Professor | May-Aug 2012; Jan-Apr 2013
D        | Professor | Jan-Apr 2013; Aug-Dec 2013; Jan-Mar 2014
E        | Professor | Jan-Mar 2013; May-July 2013; Sep-Dec 2013
F        | Professor | Jan-Apr 2013; Aug-Nov 2013; Jan-Mar 2014

Among the six rooms in the two office buildings, two rooms were occupied by staff and the remaining four were the offices of full-time professors. For each room, at least eight months of samples were collected between 2012 and 2014 (Table 4). The data collected covered the three academic semesters (Fall, Spring and Summer) and all seasons in Los Angeles. Admittedly, analyzing data from only six rooms may limit generalization, but the lack of available occupancy ground truth is a common issue not only for this research but also for practical, scalable implementations of any occupancy-related study. Theoretically, data-driven occupancy modeling requires a long period (even several years) of occupancy data from a large range of buildings. However, such data are not available to any research group. The occupancy modeling framework in this dissertation was validated with the only source available to the research team. In the future, if the community could establish a standard to collect, store, and share such datasets, the results of this dissertation could be validated more convincingly.

7.2. Real-time Occupancy Modeling

In order to improve real-time occupancy modeling through processing ambient sensor data, the hypothesis to be tested is: occupancy regularly influences and interferes with the ambient environment; occupancy status and ambient factors can be mathematically or statistically bridged through supervised learning; and future ambient data can then be analyzed to output corresponding occupancy outcomes [113].
Although each individual ambient factor is closely related to occupancy, it carries too much uncertainty [112] to represent occupancy comprehensively on its own. Therefore, a multitude of ambient factors were analyzed together rather than single factors. To be clear, this dissertation does not develop a system or solution to detect/estimate occupancy for a specific building/space. It provides a framework to explore how to learn the patterns of ambient factors in terms of occupancy, explore the relationship between occupancy and the ambient environment, and evaluate the performance of applying the relationship to detect/estimate occupancy locally and globally.

7.2.1. Methodology for Real-time Occupancy Modeling

This process was based on learning representation and generalization. To find the relationship between occupancy and the ambient environment, a fundamental question is whether the modeling process should be treated as a regression problem or a classification problem. In this dissertation, occupancy modeling was treated as a classification problem for the following reasons: 1) the possible values of the occupancy number in a space are known, so the classes can be set in advance; 2) the range of the occupancy number, namely the number of classes, is limited (0 to 3 in single-occupancy rooms and 0 to 9 in multi-occupancy rooms); 3) if regression algorithms are used, the output is continuous and the accuracy depends on the boundaries between neighboring classes, which are usually set arbitrarily and are hence error-prone.

When multi-class classification is used in this dissertation, the one-versus-the-rest classification approach is applied. Following this approach, a single classifier is trained per class to distinguish that class from all other classes. Then, the class with the greatest margin (highest probability) is selected as the estimated occupancy class.
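The one-versus-the-rest selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the dissertation: the per-class scorers are hypothetical stand-ins for trained binary classifiers, and the name `one_vs_rest_predict` is introduced here only for illustration.

```python
from typing import Callable, Dict, Sequence

# Hypothetical per-class scorer: returns a confidence that a sample belongs
# to its class (higher means more confident). It stands in for a trained
# binary "this class vs. the rest" classifier.
Scorer = Callable[[Sequence[float]], float]


def one_vs_rest_predict(scorers: Dict[int, Scorer], sample: Sequence[float]) -> int:
    """Return the occupancy class whose binary classifier is most confident."""
    return max(scorers, key=lambda cls: scorers[cls](sample))


# Toy scorers keyed by occupancy count (0 to 3). Each one peaks at a different
# normalized CO2 level (sample[0]); the centres are made up for illustration.
toy_scorers: Dict[int, Scorer] = {
    n: (lambda s, centre=n / 3.0: 1.0 - abs(s[0] - centre)) for n in range(4)
}

predicted_class = one_vs_rest_predict(toy_scorers, [0.7])
```

With real classifiers, the score would be a margin or a probability produced by each trained model; only the final arg-max step is shown here.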
Accuracy, defined as the percentage of correctly classified instances out of total instances, was used to evaluate the performance of occupancy detection (binary classification). Root mean square error (RMSE) was used to evaluate the performance of occupancy estimation (multi-class classification). RMSE measures the difference between estimated results and actual observations, providing the standard deviation of the model estimation error, where smaller values indicate better model performance. A space was considered fully occupied if the ambient sensor data from that space were missing.

This dissertation used the following machine learning classification algorithms, which are most commonly used in the literature for representing statistical relationships: Support Vector Machine (SVM) [42], K Nearest Neighbors (KNN), Artificial Neural Network (ANN), Naïve Bayesian (NB), Tree Augmented Naïve Bayes Network (TAN), and Decision Tree (DT). The commonly used 10-fold cross validation was implemented to reduce the variability and randomness caused by partitioning the data into a training set and a test set [213]: data were divided into 10 parts (9 parts for training and 1 part for testing), the model training and validation processes were repeated ten times, and the results were averaged. The 10-fold cross validation was applied to all of the algorithms used in this dissertation to evaluate whether the relationships built from training data could be applied to detect/estimate occupancy from testing data. The algorithms were implemented with the WEKA v.3.7 package [214], a collection of machine learning algorithms and data processing tools.

Specifically in this dissertation, SVM investigated whether ambient factors can be mapped to a higher dimensional space and divided by hyperplanes to classify occupancy classes.
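The mapping to a higher dimensional space mentioned above is done implicitly through a kernel function. As a minimal sketch, assuming the standard Gaussian (radial basis function) form k(x, z) = exp(-gamma * ||x - z||^2), the kernel value for two vectors of ambient readings can be computed as below. The function name `rbf_kernel` and the sample vectors are illustrative assumptions; the default gamma of 0.1 mirrors the value reported for this study.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float = 0.1) -> float:
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Two made-up, already normalized feature vectors (e.g., CO2, light, door).
similarity = rbf_kernel([0.2, 0.5, 1.0], [0.3, 0.4, 1.0])
```

A larger gamma makes the kernel value fall off faster with distance, which gives the classifier a more flexible decision boundary.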
The C-SVM with a Gaussian (radial basis function) kernel was applied based on its strong performance on non-linear datasets [215]. The first parameter tuned was the cost parameter C: for a small value of C, the classifier had a large margin even if the hyperplane misclassified several instances, while for a large value of C, the classifier obtained a small margin that could classify the instances correctly. Therefore, parameter C acts as a regularization controller that keeps the balance between the error and the weight. There was no fixed range for the value of C, thus cross calculation was used to find a suitable value. The second parameter was gamma in the radial basis function, which affects the flexibility of the decision boundary and was set to 0.1 in this dissertation. The dual function can be expressed in terms of the variables α_i as follows:

$$L(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j \alpha_i \alpha_j \, k(x_i, x_j)$$

KNN investigated whether ambient factors have equal effects and whether the occupancy class is decided by the classes of neighbors with similar ambient factors. Two parameters were tuned for the KNN algorithm: the value of K (the number of nearest neighbors) and the distance function. K was set to 5, which outperformed all other values between 1 and 10 when validated with the sensor data used in this dissertation. As for the distance function, the default Euclidean distance was used. Based on the contribution of each feature to the results and the different ranges of values for each feature, a weight on each neighbor is applied to predict the occupancy number for a new data point. The weight for each neighbor is measured by 1/distance.
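The inverse-distance weighting described above can be sketched as follows. This is an illustrative example under stated assumptions, not the WEKA implementation used in the dissertation: the function name, the toy training points, and the handling of an exact match (distance zero) are all assumptions made here. It returns the weighted average of the neighbors' occupancy numbers, so a classifier would still need to map that value to a class.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, occupancy number)


def weighted_knn_estimate(train: List[Sample], query: Sequence[float], k: int = 5) -> float:
    """Average the k nearest occupancy numbers, each weighted by 1/distance."""
    nearest = sorted((math.dist(x, query), y) for x, y in train)[:k]
    # Assumption: an exact match (distance 0) simply returns its own label,
    # which avoids a division by zero.
    for distance, label in nearest:
        if distance == 0.0:
            return float(label)
    weights = [1.0 / distance for distance, _ in nearest]
    weighted_sum = sum(w * label for w, (_, label) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Toy one-feature training set (values are made up for illustration).
toy_train: List[Sample] = [([0.0], 0), ([1.0], 1), ([2.0], 2)]
estimate = weighted_knn_estimate(toy_train, [0.5], k=2)
```

In this toy case the two nearest neighbors are equally distant, so their labels 0 and 1 contribute equally.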
The prediction function can be shown as follows:

$$f'(x_p) = \frac{\sum_{i=1}^{k} w_i \, f(x_i)}{\sum_{i=1}^{k} w_i}$$

ANN investigated whether the relationship between occupancy classes and ambient factors is a statistically reliable black-box relationship that can be approximated by artificial processing units. A Multilayer Perceptron was used to implement the ANN [216], in which the learning rate and momentum parameters controlled the speed of model modification and were set to 0.3 and 0.2, respectively, via cross calculation. The number of hidden units was calculated as (number of attributes + number of classes)/2. The number of attributes is the total number of features (e.g., the number of different ambient variables) and the number of classes is the total number of classes present in the data. The input of the Multilayer Perceptron is the feature values, and the output is the occupancy number for a given data point.

NB investigated whether ambient factors are independent and whether the probability of an occupancy class is proportional to the conditional probability of ambient factors with similar occupancy. NB applied Bayes' rule with the strong assumption that each feature is independent. The probability model for the NB classifier can be expressed as follows:

$$P(y \mid x_1, \dots, x_D) = \frac{1}{Z} \, P(y) \prod_{i=1}^{D} P(x_i \mid y)$$

in which y is the occupancy number, x_i is the i-th feature in the dataset, and Z is the scaling factor. Based on Bayes' rule and the independence of each feature, the occupancy number can be obtained by the function:

$$\#occupant = \arg\max_{y} \; P(y) \prod_{i=1}^{D} P(x_i \mid y)$$

TAN investigated whether the probability of an occupancy class depends on the conditional probability and correlation of ambient factors. TAN augmented the Naive Bayesian Network structure with edges among features that are dependent, and dispensed with the strong assumption in NB that all features are independent of each other.
In a TAN structure, the class variable has no parents, and each attribute has the class variable as a parent plus at most one other attribute as a parent [217]. TAN searches for the set of edges that maximizes the network's likelihood by constructing a maximum weighted spanning tree. The process can be summarized as follows: 1) compute a mutual information function for each pair of variables; 2) build an undirected graph in which each vertex represents a variable and the weight of each edge is given by the mutual information function; 3) find the maximum weighted spanning tree; and 4) transform the undirected tree into a directed one by designating the root variable and setting the direction of the edges [218,219].

DT investigated whether each ambient factor is responsible for classifying part of the instances and whether occupancy classes are determined by the sequential consideration of ambient factors. The first step was to build the tree. Information gain was used for choosing the features to split the data. In each iteration, the feature with the highest information gain was selected to make a decision. After obtaining the tree, the second step in the DT algorithm was tree pruning, with the purpose of reducing the complexity of the model and avoiding overfitting. An error rate was used in deciding which part of the tree should be pruned. There are several ways to calculate the error rate, including reduced error pruning and C4.5 pruning. As reduced error pruning might be unavailable for a small dataset, C4.5 pruning was used in this dissertation [220], and the subtree raising method was applied.

7.2.2. Findings for Real-time Occupancy Modeling

The six algorithms were applied to three rooms selected from the testbed building. The data collection period spanned from May to July 2012.
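The two evaluation measures defined in Section 7.2.1 (accuracy for detection and RMSE for estimation), together with a 10-fold partition, can be sketched as follows. This is a minimal illustration, not the WEKA evaluation used in the dissertation; the function names are introduced here, and the interleaved fold assignment is an assumption made for simplicity (WEKA partitions the data differently).

```python
import math
from typing import Iterator, List, Sequence, Tuple


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Percentage of correctly classified instances."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between estimated and observed occupancy."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def k_fold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; every sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


detection_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
estimation_rmse = rmse([0, 1, 2, 3], [0, 1, 2, 1])
```

In a full evaluation, a model would be trained on each `train` index list, scored on the matching `test` list, and the ten scores averaged.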
As can be seen in Tables 5 and 6, SVM, KNN, TAN, and DT reported accuracies no less than 95.1% for binary occupancy detection and RMSE no larger than 0.166 for occupancy number estimation. In both occupancy detection and estimation, DT yielded the best results, while NB produced the worst performance.

Table 5: Real-time Occupancy Detection Results

Algorithm | Accuracy in Room 1 (%) | Accuracy in Room 2 (%) | Accuracy in Room 3 (%)
ANN       | 94.6                   | 97.1                   | 96.4
DT        | 96.0                   | 98.2                   | 98.8
KNN       | 95.4                   | 97.5                   | 96.8
NB        | 92.2                   | 94.3                   | 91.5
TAN       | 95.3                   | 98.0                   | 97.3
SVM       | 95.1                   | 97.5                   | 96.4

Table 6: Real-time Occupancy Estimation Results

Algorithm | RMSE in Room 1 | RMSE in Room 2 | RMSE in Room 3
ANN       | 0.184          | 0.139          | 0.149
DT        | 0.144          | 0.109          | 0.121
KNN       | 0.162          | 0.121          | 0.134
NB        | 0.195          | 0.189          | 0.179
TAN       | 0.151          | 0.117          | 0.128
SVM       | 0.166          | 0.141          | 0.141

The results show that the algorithms yielded slightly better results in Room 2 and Room 3 than in Room 1. Such differences may have resulted from unbalanced data distributions in the three rooms (e.g., the class of 3 occupants in Room 2 includes many more samples than that in Room 1), and ultimately from the differences in occupant schedules and room use patterns. Insufficient training data points in certain classes are likely to increase the chances of inaccurate estimation in these classes.

7.2.3. Primary Effects of Ambient Factors

Since each ambient factor may make its own contribution to the relationship established, information gain was used to measure the uncertainty of occupancy that one factor can reduce, in order to better understand the effects of ambient factors. In information theory, entropy expresses the uncertainty of results: higher entropy indicates higher uncertainty, and information gain measures how much of the entropy of the result a variable reduces.
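As a sketch of the entropy-based measure just described, the information gain of one discrete ambient factor with respect to the occupancy classes can be computed as the entropy of the classes minus the weighted entropy remaining after splitting on the factor. The function names and the toy data are assumptions made for illustration; continuous factors such as CO2 concentration would first need to be discretized, which is not shown.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(factor: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus the entropy left after splitting on the factor."""
    n = len(labels)
    remaining = 0.0
    for value, count in Counter(factor).items():
        subset = [y for x, y in zip(factor, labels) if x == value]
        remaining += (count / n) * entropy(subset)
    return entropy(labels) - remaining


# Toy data: the door status perfectly separates occupied from vacant,
# while the second factor carries no information about occupancy.
occupancy = [0, 0, 1, 1]
gain_door = information_gain([0, 0, 1, 1], occupancy)
gain_noise = information_gain([0, 1, 0, 1], occupancy)
```

A factor that fully determines occupancy recovers the whole label entropy, while an unrelated factor yields a gain of zero.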
The higher the information gain of an ambient factor, the more impact that factor has on the occupancy modeling:

$$IG(Y, X) = H(Y) - \sum_{X} p(X) \, H(Y \mid X)$$

where X is the ambient factor, Y is the occupancy class, H(Y) is the entropy of Y, and H(Y|X) is the entropy of Y given a certain value of X.

A total of 11 ambient factors were used in modeling real-time occupancy. If the effect of each factor is differentiated from the others, ambient factors with major effects can be re-grouped to reduce the number of sensors used in occupancy modeling with a minimum decrease in accuracy. Taking all three rooms into consideration, the information gains of each ambient factor in Room 1, Room 2 and Room 3 were reported as ranges to represent the general information gain of that variable. It can be seen from the results (Figure 13) that the CO2, door status, light, and motion count net variables played primary roles in determining the modeling results, while the PIR, PIR count net, average sound, door count net, and humidity variables contributed significantly less. The outcomes of the effect analysis were consistent with the results from the literature review and can be used to streamline the evaluation of ambient factor combinations. Ambient factors with higher effects were given more attention, as explained in the following section.

Figure 13: Information Gains of Ambient Factors in Single Occupancy Rooms (light colors mean the ranges of information gains in different rooms)

7.2.4. Joint Effects of Ambient Factors

When different ambient factors are used together, there may be mutual effects among them, meaning some ambient factors may strengthen or weaken the effect of another. To examine these joint effects, different combinations of ambient factors were used to model occupancy. The DT algorithm was used to conduct the combination analysis, as it yielded better performance in previous tests in terms of accuracy and consistency across rooms.
Combinations were grouped based on the number of ambient factors, and the combination that yielded the best accuracy and RMSE in each group is bolded. Table 7 presents the results from Room 2. Results from Rooms 1 and 3 were very similar to those from Room 2.

First, each ambient factor was evaluated independently (combinations #1-6). The results showed that CO2 concentration, door status, and motion count net, when used alone, yielded better results than the others (up to an accuracy of 89.86% and an RMSE of 0.191). When ambient factor pairs were examined (combinations #7-16), the combination that included CO2 and door status yielded better results (up to an accuracy of 93.86%) than other pairs in detecting occupancy presence. In general, the accuracy was higher when two sensors were used than when a single variable was used. Lastly, other ambient factors were gradually added to the analysis until all 11 ambient factors were included (combinations #17-35). Among the combinations that included three ambient factors, the CO2 concentration, door status and light level combination yielded better results (accuracy of 95.21% and RMSE of 0.155). As the number of ambient factors increased, the overall accuracy generally increased and the RMSE decreased. The best performance (accuracy of 98.20% and RMSE of 0.109) was achieved when all ambient factors were used. The PIR count net, PIR, average sound and door count net were less useful than the other ambient factors in modeling real-time occupancy. This was discussed in the previous work [113]. For others attempting a similar study, the following factors are recommended: CO2 concentration, door status, light level, motion count net (or binary motion, depending on the sensor quality) and temperature.
Table 7: Ambient Factor Combination Analysis for Room 2, Using the DT Algorithm (CO2: CO2 Concentration; D: Door Status; DCN: Door Count Net; H: Humidity; L: Light; M: Motion; MCN: Motion Count Net; P: PIR; PCN: PIR Count Net; T: Temperature; AS: Average Sound)

# | Ambient Factor (CO2, D, L, MCN, M, T, H, PCN, P, AS, DCN) | Accuracy (%) | RMSE
1 | √ | 89.86 | 0.191
2 | √ | 79.83 | 0.200
3 | √ | 69.52 | 0.223
4 | √ | 75.97 | 0.215
5 | √ | 67.11 | 0.232
6 | √ | 65.32 | 0.255
7 | √ √ | 93.86 | 0.155
8 | √ √ | 92.31 | 0.168
9 | √ √ | 91.21 | 0.176
10 | √ √ | 90.09 | 0.183
11 | √ √ | 89.12 | 0.208
12 | √ √ | 89.36 | 0.191
13 | √ √ | 90.76 | 0.195
14 | √ √ | 90.89 | 0.185
15 | √ √ | 89.85 | 0.200
16 | √ √ | 89.37 | 0.213
17 | √ √ √ | 95.21 | 0.144
18 | √ √ √ | 94.40 | 0.146
19 | √ √ √ | 94.28 | 0.149
20 | √ √ √ | 94.17 | 0.159
21 | √ √ √ | 91.35 | 0.174
22 | √ √ √ | 91.04 | 0.177
23 | √ √ √ √ | 96.01 | 0.139
24 | √ √ √ √ | 95.77 | 0.141
25 | √ √ √ √ | 92.77 | 0.167
26 | √ √ √ √ | 91.95 | 0.171
27 | √ √ √ √ | 94.99 | 0.167
28 | √ √ √ √ | 95.65 | 0.165
29 | √ √ √ √ √ | 96.81 | 0.128
30 | √ √ √ √ √ √ | 97.19 | 0.119
31 | √ √ √ √ √ √ √ | 97.59 | 0.112
32 | √ √ √ √ √ √ √ √ | 97.99 | 0.112
33 | √ √ √ √ √ √ √ √ √ | 98.02 | 0.111
34 | √ √ √ √ √ √ √ √ √ √ | 98.19 | 0.109
35 | √ √ √ √ √ √ √ √ √ √ √ | 98.20 | 0.109

7.2.5. Global Occupancy Modeling

Although previous studies that used ambient sensing can reliably model occupant numbers locally, they bear an inherent limitation: the collection of training sensor data and the tuning of occupancy models have to be done for each space for which an occupancy model is needed. This process can be costly and time consuming. In addition, it may be unrealistic to access occupancy ground truth data for each space, since the installation of live-stream cameras or other occupancy ground truth collectors in each space might not be feasible and might have privacy implications. Surveying every occupant is also impractical and intrusive. Even if it could be done, the obtained data might not have high resolution and might be inaccurate, as surveys generally use self-reporting, which relies on memory.
Global occupancy modeling, where a model is trained in one space and used in other spaces by generalizing the relationships between occupancy and contextual information, especially the ambient factors, has been explored in this dissertation to improve the applicability of occupancy models in terms of both presence detection and number estimation.

7.2.5.1. Global Modeling Methodologies

The proposed global occupancy model used supervised machine learning to learn the general relationships between occupancy and the contextual information of one space and applied them to other spaces, in which contextual information was defined as the ambient factors, the time, and the modeled occupancy at the previous time step. The underlying assumption was similar to that of local occupancy modeling: occupancy regularly influences the ambient environment and has time continuity; therefore, future contextual information could be analyzed to output corresponding occupancy outcomes by bridging occupancy ground truth and contextual information through supervised learning.

Original features for supervised learning were extracted from the sensors. However, based on our previous analysis [113], these features cannot effectively capture the general relationships between occupancy changes and contextual information, and cannot be used to globally model occupancy in other spaces, because the different classes of occupancy do not have similar boundaries given the original features. Thus the original features should be transformed into a more representative format. To do so, a feature selection procedure was followed [221].

(1) Add/remove features. Based on domain knowledge and our previous investigations, features associated with the PIR sensor and the motion sensor were excluded. Sound-related features were also excluded because of their low information gain ratios with respect to the modeling results.
Since occupancy is time sensitive and has continuity, the time of the day and the class of occupancy at the previous time step are also informative; therefore they were added as new features.

(2) Pre-process features. The K-Nearest Neighbors (KNN) method (K=10) was used to impute the values missing due to interrupted Wi-Fi connection, loss of electricity in the building, physical damage to sensor boxes, and data corruption. The performance of KNN was examined by artificially removing some values of the same features and comparing the differences of quartiles between actual values and imputed values. All features were then normalized to reduce the variability and the influences of geometric properties (CO2 concentration has the additional step of being divided by the volume of the space before the maximum-minimum normalization):

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \left( x^{*}_{\max} - x^{*}_{\min} \right) + x^{*}_{\min}$$

(3) Transform features. Since some of the ambient factors are continuous and dynamic, their trends and variations during a certain period are more important than the absolute sensed values at specific time points. Therefore, the sequence of values for continuous ambient factors was segmented by a time window (10-minute windows were used in this dissertation) and statistics were extracted from each window to form the new features. Further, their trends at each time point were maintained and represented by both the normalized absolute values and the normalized derivatives. Similarly, for binary ambient factors, the state switch frequency and the average time length of each state within a time window are potentially important for occupancy modeling, thus these statistics were also taken into account. In addition, some of the ambient factors may have synergistic effects with each other; therefore their products were also considered, such as that of CO2 concentration and temperature.

(4) Filter features.
Considering that too many features could cause overfitting, redundant features were removed in this step. First, the information gain ratio of each feature was calculated and ranked. An exploratory analysis was conducted to understand the patterns of variation in the datasets and how they were related to each other. Features with little variability and values close to zero, or features highly correlated with other features, were removed from the data. After the filtering process, updated samples for training and testing were formed by matching the selected features with the occupancy ground truth. PCA (Principal Component Analysis) was then applied to examine the feature distributions and whether there were similar boundaries of occupancy classes among the samples from different spaces.

After the feature selection, the next step was to input these features to supervised machine learning algorithms to build a general relationship model between the classes of occupancy and the selected features for both presence detection and number estimation. In order to balance the potential bias and variance of the model, bootstrap aggregating was applied. The basic idea was to resample the training data with replacement and repeat the supervised learning on the resampled data. The final modeling results were determined by majority vote. By doing so, the modeling bias stayed almost the same, but the variance could be significantly reduced. If there were N samples in the training dataset and the number of resampling rounds was set to M, each resampled set was formed by randomly drawing one sample from the N samples, with replacement, N times, and this was repeated until M sets of N samples were formed.
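The resampling and voting procedure described above can be illustrated with a minimal sketch. It is not the implementation used in this dissertation: the base learner (`fit_nearest`, a one-feature nearest-neighbor rule), the function names, and the toy CO2 readings are all hypothetical and serve only to show how M bootstrap sets of N samples are drawn and how the individual predictions are combined by majority vote.

```python
import random
from collections import Counter


def bootstrap_sets(samples, m, rng):
    """Draw m resampled sets, each of size N = len(samples), with replacement."""
    n = len(samples)
    return [[samples[rng.randrange(n)] for _ in range(n)] for _ in range(m)]


def majority_vote(predictions):
    """Return the most frequent class label among the predictions."""
    return Counter(predictions).most_common(1)[0][0]


def fit_nearest(train):
    """Hypothetical base learner: predict the class of the nearest training value."""
    def predict(x):
        return min(train, key=lambda sample: abs(sample[0] - x))[1]
    return predict


def bagged_predict(train, x, m, fit, rng):
    """Train one model per bootstrap set and combine the predictions by voting."""
    models = [fit(resampled) for resampled in bootstrap_sets(train, m, rng)]
    return majority_vote([model(x) for model in models])


# Toy data (assumed): (normalized-free CO2 reading in ppm, occupancy class).
train = [(400, 0), (420, 0), (450, 0), (800, 2), (820, 2), (850, 2)]
rng = random.Random(0)
print(bagged_predict(train, 810, 25, fit_nearest, rng))
```

Because every model is trained on a different resampled set, a single unusual sample affects only some of the votes, which is the variance-reduction effect the text refers to.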
Supervised learning was implemented by running five classifiers:

Support Vector Machine (SVM), which investigates whether contextual information can be mapped to a higher dimensional space through a kernel function and divided by hyperplanes to classify the classes of occupancy, after training the classifier using SMO (sequential minimal optimization);

Naive Bayesian (NB), which investigates whether the contextual information is independent and the probability of each class of occupancy is proportional to the conditional probability of the contextual information given that class, computed from the conditional probability of each feature within each class based on Bayes' theorem;

Tree Augmented Naive Bayesian (TAN), which investigates whether the probability of each class of occupancy depends on the conditional probability and correlation of contextual information, by applying entropy to learn the Bayesian conditional tree;

Artificial Neural Network (ANN), which investigates whether the relationships between the classes of occupancy and contextual information can be statistically approximated in a feedforward backpropagation network using a large number of highly interconnected processing units (neurons) on different layers;

Random Forest (RF), which investigates whether each piece of contextual information is responsible for classifying part of the samples, so that the classes of occupancy can be determined by averaging the sequential consideration of contextual information over a group of decision trees. Its training process is similar to bagging, and the trees are trained independently.

Since there were five classifiers and four occupancy classes, an ensemble method was applied to combine the results of the five classifiers by voting. Generally, this integrated process could improve the modeling performance, keeping the same bias but reducing the variance compared to using individual models.

7.2.5.2.
Global Modeling Performance Evaluation

The samples from six rooms in the two testbed buildings were used for training the occupancy models, which were then applied to the samples from the other geometrically similar rooms for testing. The daily modeled occupancy was compared with the actual occupancy from 6:00 AM to 9:00 PM (weekdays only) to calculate the performance of global modeling. Four metrics were used to evaluate the modeling performance and to analyze the trends and correlations. The first metric was the daily F-measure, which combines precision and recall to examine the capability of the model to estimate the exact classes of occupancy. For each model trained in one room, the daily F-measures for the other five rooms were calculated. The F-measures were then compared with the second metric, the daily RMSE (root mean square error), to investigate the correlation between modeling variability and the overall performance. The third metric was the occupied/unoccupied detection accuracy, which combines all of the occupied classes to evaluate the capability of global modeling to differentiate the occupied classes from the unoccupied class. The fourth metric was the number estimation accuracy, which focuses on the occupied period and tests the capability of global modeling to classify a specific occupied class (occupant number). The daily occupied/unoccupied detection accuracy was then compared with the daily number estimation accuracy to investigate whether binary presence detection and multi-level number estimation were correlated.

In addition, possible influential factors on global occupancy modeling were analyzed. First, real-time occupancy variation was considered. Since occupancy is stochastic in nature [222], occupants may have different real-time status changes on different days, such as arrival time, duration of stay, intermittent absence, intermittent presence and departure time, which therefore influence the global modeling performance.
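As an illustration of the four metrics, the sketch below computes them for one day of class labels (0 = unoccupied, 1 and above = occupied classes). The source does not state how the multi-class F-measure was averaged, so macro-averaging over the classes present in the ground truth is an assumption here, as are the function names and the toy label sequences.

```python
import math


def macro_f_measure(actual, modeled):
    """Macro-averaged F-measure over the classes in the ground truth (assumed averaging)."""
    scores = []
    for c in sorted(set(actual)):
        tp = sum(a == c and m == c for a, m in zip(actual, modeled))
        fp = sum(a != c and m == c for a, m in zip(actual, modeled))
        fn = sum(a == c and m != c for a, m in zip(actual, modeled))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)


def rmse(actual, modeled):
    """Root mean square error between actual and modeled class indices."""
    return math.sqrt(sum((a - m) ** 2 for a, m in zip(actual, modeled)) / len(actual))


def detection_accuracy(actual, modeled):
    """Occupied/unoccupied accuracy: all occupied classes are merged into one."""
    hits = sum((a > 0) == (m > 0) for a, m in zip(actual, modeled))
    return hits / len(actual)


def number_accuracy(actual, modeled):
    """Exact-class accuracy restricted to time points that were actually occupied."""
    occupied = [(a, m) for a, m in zip(actual, modeled) if a > 0]
    if not occupied:
        return None  # undefined for a day without any occupied time point
    return sum(a == m for a, m in occupied) / len(occupied)


# Toy day (assumed labels, one per time point).
actual = [0, 0, 1, 2, 2, 0]
modeled = [0, 1, 1, 2, 1, 0]
print(macro_f_measure(actual, modeled), rmse(actual, modeled))
print(detection_accuracy(actual, modeled), number_accuracy(actual, modeled))
```

In this toy day the detection accuracy is higher than the number estimation accuracy, because a wrong occupant count still counts as a correct presence detection.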
In this dissertation, the level of real-time occupancy variation for each day was defined as the ratio of the number of occupancy class transitions (between the arrival time and the departure time) for that day to the average number of occupancy class transitions over the data collection period. The level of real-time occupancy variation was compared with the daily modeling accuracy, that is, the daily percentage of correctly modeled samples, to investigate the influence of occupancy variation on global modeling performance.

Second, there exist differences in long-term occupancy patterns between the room used for training and the rooms used for testing. Long-term occupancy, or the personalized occupancy profile, gives the probabilities of the classes of occupancy as a function of time, representing long-term habitual occupancy patterns [223]. In this dissertation, the degree of long-term occupancy difference was defined as the Euclidean distance between the actual daily occupancy of a room used for testing and the occupancy profile (described in Chapter 7.3) of the room used for training the model. Its influence on global occupancy modeling was then analyzed by comparing the daily modeling accuracy with the degree of long-term occupancy difference.

Lastly, as an HVAC system was working during the data collection and directly influenced the ambient factors, such as temperature and CO2 concentration, the global modeling performance might be undermined by different operations of the HVAC systems. The effects of HVAC operations were quantified by averaging the hourly indoor/outdoor temperature differences on a daily basis. This quantity was then compared with the daily modeling accuracy to understand the patterns of global modeling performance under different HVAC operations. The overall process of the proposed global occupancy modeling approach is illustrated in Figure 14.

Figure 14: Global occupancy modeling process

7.2.5.3.
Global Modeling Results and Discussions

After following the feature selection methodology explained in Chapter 7.2.5.1, a preliminary test was conducted to determine the smallest number of features required to learn the relationship between the classes of occupancy and the contextual information. The percentages of incorrect modeling results for the training dataset and the testing dataset of the six rooms were calculated and compared. According to the results, the training accuracy improved continuously as the number of feature types increased, while the testing accuracy began to decrease when the number of feature types exceeded 15, indicating that 15 feature types are ideal for training the models in order to avoid overfitting. Table 8 shows the final features.

[Figure 14 diagram labels: Occupancy Ground Truth; Ambient Factors; Feature Selection; Prepared Data; Bootstrap Aggregating; Model Training (Support Vector Machine, Naive Bayesian, Tree Augmented NB, Neural Network); Performance Evaluation (F-Measure, RMSE, Detection Accuracy, Estimation Accuracy, Overall Accuracy)]

Table 8: Feature Selection for training global occupancy models

Category | Type | Source
10-min sliding window (Continuous) | Stagnation point; Turning point; Mean; Standard deviation; Range; Mean crossing rate | CO2 sensor, temperature sensor, humidity sensor, light sensor
Instant value (Continuous) | Normalized absolute value; Normalized derivative | CO2 sensor, temperature sensor, humidity sensor, light sensor
10-min sliding window (Binary) | Number count 1-0; Average 1 length; Average 0 length | Door switch
Instant value (Binary) | Current status | Door switch
Relation | CO2-temperature correlation | CO2 sensor, temperature sensor
Time continuity related context | Modeled class of occupant at previous time point; Hour of the day | Occupancy ground truth and time recorder

Considering that each feature type could have different sources, resulting in different features within each feature type, the contribution of each feature to the global occupancy modeling performance was ranked using information gain. According to the results, the most important features are the mean of the CO2 concentration, the stagnation point of the CO2 concentration, the mean of the temperature, the correlation between temperature and CO2 concentration, the average length of the 0-door status, the modeled class of occupant at the previous time point, the hour of the day, the normalized absolute value of the temperature, the normalized derivative of the CO2 concentration, and the standard deviation of the light level.

Since the difference in the distributions of contextual information (selected features) within the same occupancy class is the most likely factor responsible for the failure of global modeling, PCA was implemented to combine the high-dimensional features into independent principal components through orthogonal transformation, and all of the updated samples were then represented in 2D with different colors for different occupancy classes (Figure 15). It was found that there were similar boundaries among occupancy classes from different rooms given the processed contextual information, indicating a reduction of data heterogeneity among different rooms and the possibility of using a model trained in one room to estimate the occupancy classes of the other geometrically similar rooms.

Figure 15: Principal Component Analysis (PCA) results for different rooms

Figure 16 illustrates the comparison between the measured occupancy ground truth and the occupancy estimated by the proposed occupancy model in one of the six rooms for a typical day. From the comparison, it can be seen that the global occupancy model could generally model the presence and number of occupants accurately, although there was a slight delay before the model recognized a change in the occupancy status.
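Several of the window statistics listed in Table 8 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the feature extraction code of this dissertation: the function names are hypothetical, the mean crossing rate is assumed to be the fraction of consecutive pairs that straddle the window mean, and the stagnation and turning points are omitted because their exact definitions are not given in this section.

```python
import statistics


def min_max(values, new_min=0.0, new_max=1.0):
    """Maximum-minimum normalization of a sequence onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def continuous_window_features(window):
    """Statistics of one time window of a continuous ambient factor."""
    mean = statistics.mean(window)
    crossings = sum((a - mean) * (b - mean) < 0 for a, b in zip(window, window[1:]))
    rate = crossings / (len(window) - 1) if len(window) > 1 else 0.0
    return {
        "mean": mean,
        "std": statistics.pstdev(window),
        "range": max(window) - min(window),
        "mean_crossing_rate": rate,
    }


def binary_window_features(window):
    """Switch count and average run lengths of a binary factor (e.g. door status)."""
    runs = []  # list of [state, run length]
    for state in window:
        if runs and runs[-1][0] == state:
            runs[-1][1] += 1
        else:
            runs.append([state, 1])
    ones = [length for state, length in runs if state == 1]
    zeros = [length for state, length in runs if state == 0]
    return {
        "count_1_to_0": sum(a == 1 and b == 0 for a, b in zip(window, window[1:])),
        "avg_1_length": statistics.mean(ones) if ones else 0.0,
        "avg_0_length": statistics.mean(zeros) if zeros else 0.0,
    }


print(min_max([400, 600, 800]))
print(continuous_window_features([1, 3, 1, 3]))
print(binary_window_features([0, 1, 1, 0, 0, 0, 1, 0]))
```

Each 10-minute window of sensor readings would be passed through functions of this kind, and the resulting values concatenated into one feature vector per time point.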
Figure 16: Comparison of actual and estimated occupancy

The daily F-measure, which evaluates the overall performance of global modeling in estimating the exact occupancy status, was calculated, and the averages are presented in Table 9. It can be seen that the F-measures for Room F were all at or above 0.80, while the F-measures for Room C were all below 0.80, no matter which rooms were used for training. One possible reason could be that the ambient sensors in different rooms were calibrated differently; they may have different accuracies, hysteresis, dynamic characteristics and uncertainties. Another finding was that interchanging the room for training and the room for testing could generate different modeling results. This again could be because of the differences in sensor characteristics between the rooms and the fact that a model generated from the selected features of one room might be more generalizable than a model generated from another room. However, there was no significant difference between the two buildings, indicating that the general relationship between the selected features and the classes of occupancy was not significantly influenced by different building systems, different internal loads, etc.

The F-measures were then compared with the daily RMSE (root mean square error), which indicates the deviations of incorrect modeling results from the actual classes (Figure 17). The daily F-measure ranged from 0.71 to 0.94 with a mean of 0.80, while the daily RMSE ranged from 0.42 to 1.72 with a mean of 1.23. Generally, if the F-measure is higher than 0.80, the framework can roughly detect the different occupancy statuses. An RMSE below 1.5 indicates that even if the framework wrongly models the occupancy, the results will not deviate much from reality. In addition, the comparison showed that the magnitude of modeling deviations decreased as the overall modeling performance increased.
The two metrics were approximately linearly related, showing that the distances between incorrect modeling results and actual classes were reduced as the overall performance improved.

Table 9: F-measure for global occupancy modeling

Training \ Testing | A | B | C | D | E | F
A | - | 0.81 | 0.77 | 0.82 | 0.79 | 0.86
B | 0.84 | - | 0.76 | 0.79 | 0.82 | 0.81
C | 0.79 | 0.87 | - | 0.83 | 0.76 | 0.82
D | 0.81 | 0.77 | 0.79 | - | 0.81 | 0.80
E | 0.82 | 0.78 | 0.76 | 0.84 | - | 0.85
F | 0.77 | 0.80 | 0.78 | 0.77 | 0.83 | -

Figure 17: Comparison between daily F-measures and RMSEs

Then, the daily occupied/unoccupied detection accuracies were calculated to evaluate the binary modeling performance. The results were compared with the daily number estimation accuracies, which focused only on occupied periods, to quantify the capability of global modeling to classify specific occupant numbers (Figure 18). From the results, it can be seen that the daily occupied/unoccupied detection accuracy ranged from 0.79 to 0.99 with a mean of 0.91, while the daily number estimation accuracy for the occupied period ranged from 0.63 to 0.78 with a mean of 0.71. Clearly, global modeling performed better for binary occupancy detection than for occupant number estimation. This can also be concluded from the PCA results, in which the boundaries between occupied and unoccupied classes were more consistent across different rooms than the boundaries among different occupied classes. Generally, an occupied/unoccupied accuracy above 0.85 can be considered capable of detecting presence/absence changes, while a number estimation accuracy above 0.70 indicates that the model can effectively differentiate the occupied classes. The comparison illustrated that occupied/unoccupied detection was logarithmically related to multi-level number classification. Since they had similar patterns, highly accurate binary occupancy detection could imply high accuracy in occupant number estimation, and vice versa.
However, as the presence detection accuracy increased, the rate of increase of the number estimation accuracy decreased, indicating that some samples had irregular contextual information and could not be correctly classified into the exact occupied classes simply by improving the features and learning algorithms.

Figure 18: Comparison between daily occupied/unoccupied detection accuracy and daily number estimation accuracy

Since the final classification results were determined by the majority vote of the five classifiers, the daily error rate of each single classifier was also calculated and averaged to provide an overall comparison of the performance of the selected data-driven classifiers (Table 10).

Table 10: Error of each classifier in the ensembling method

Classifier | SVM | NB | TAN | ANN | RF
Error (%) | 10.4 | 18.2 | 12.5 | 15.3 | 9.2

Compared to the existing research [111,224,225], the proposed occupancy modeling framework yielded more accurate results and was more feasible and more robust. First, it focused only on possibly occupied time slots and used the period from 6:00 AM to 9:00 PM every day to calculate the modeling performance instead of using 24 hours. Since there is typically no occupant in the office during the early morning (12:00 AM to 6:00 AM) and late evening (9:00 PM to 12:00 AM), including these periods would increase the modeling accuracy; this increase is not reflected in the reported modeling performance. Even so, the global modeling performance of the proposed framework exceeded that of the existing research. Second, the agent-based simulation approach was widely used in previous research; however, it suffered from rigid rules and a high degree of complexity, and it required a large amount of data for training. In the proposed framework, the global occupancy model could be trained in any one of the six rooms and used for all the other rooms, and can potentially be generalized to other geometrically similar rooms.
Third, the proposed framework overcame the drawbacks of the Markov model, which does not yield accurate results if there are unexpected variations of real-time occupancy in the training data set or if the real-time occupancy deviates significantly from the long-term occupancy.

In addition, three potentially influential factors on global occupancy modeling, namely the level of real-time occupancy variation, the degree of long-term occupancy difference between the rooms, and the operation of the HVAC systems, were systematically analyzed as presented below (Table 11). These three factors were chosen because they comprehensively represent the direct influences of instant occupancy properties, long-term occupancy evolution, outside temperature and active conditioning effects.

Table 11: Three influential factors on global occupancy modeling performance

Factor | Explanation | Calculation method
Level of real-time occupancy variation | Quantification of real-time occupancy status variations among different days | Ratio of the daily number of occupancy status transitions to the average number of occupancy status transitions
Degree of long-term occupancy difference between rooms | Deviation of daily occupancy from the occupancy patterns of different rooms | Euclidean distance between the actual daily occupancy of the tested room and the occupancy profile of the trained room
Indoor/outdoor temperature difference | Effects of different HVAC system operations for different rooms | Daily average of the hourly indoor/outdoor temperature differences

To account for the fact that occupancy varies by nature and occupants may have different real-time status changes on different days, the level of real-time occupancy variation was calculated as the ratio of the number of occupancy class transitions (between the arrival time and the departure time) for a day to the average number of occupancy class transitions over the data collection period.

Figure 19: Daily modeling accuracy and daily occupancy variation level

Based on the results, the daily occupancy variation level ranged from 0.5 to 1.6, and the majority of the values were within the interval [0.9, 1.2]. By comparing the occupancy variation level with the daily modeling accuracy (Figure 19), it could be concluded that the level of real-time occupancy variation was negatively related to the global occupancy modeling performance, because more variation brought more disturbance and noise to the contextual information used as features to model occupancy. The two had an approximately logarithmic relationship: when the daily occupancy transitions began to deviate significantly from the average daily occupancy transitions, the decrease in modeling accuracy slowed down. One possible reason might be that some features vary instantly with occupancy, so the model can still capture certain contextual changes caused by occupancy.

Second, daily occupancy commonly deviates from the occupancy patterns of different spaces.
Considering that global occupancy modeling might be influenced accordingly by long-term occupancy changes, the degree of long-term occupancy difference between the rooms was calculated as the Euclidean distance between the actual daily occupancy of the tested room (class of occupancy at each time point) and the occupancy profile (class probability at each time point) of the trained room, and compared with the daily global modeling accuracy (Figure 20).

Figure 20: Daily modeling accuracy and daily occupancy profile difference degree

Based on the results, the degrees of long-term occupancy difference ranged from 2 to 28, with the majority located in the interval [5, 15]. Although the degrees of difference were not significantly related to the performance of global occupancy modeling, there was still a slight negative trend in daily modeling accuracy as the degree of long-term occupancy difference increased. This might be because of the definition of long-term occupancy difference, which is actually the deviation of the occupancy to be tested from the occupancy used for training the model. The larger the deviation is, the less likely it is that the trained model covers the information necessary to model the tested samples.

Lastly, since all six rooms were served by HVAC systems, which directly influenced the ambient factors during the data collection periods, the consequences of different HVAC system operations for global occupancy modeling performance were analyzed. The daily average of the hourly indoor/outdoor temperature differences represented the impact of the HVAC systems on the results. Outdoor temperatures were gathered from the online records of a weather station near the Los Angeles airport, while the indoor temperatures were calculated by averaging the data from the temperature sensors. The results showed that the global occupancy modeling performance had an approximately logistic relation with the effects of HVAC (Figure 21).
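The three influential factors summarized in Table 11 can be sketched as follows. This is a simplified illustration rather than the analysis code of this dissertation: the function names and toy values are hypothetical, transitions are counted over the whole sequence instead of only between arrival and departure, occupancy is treated as binary presence when compared with the profile, and the temperature difference is assumed to be the absolute hourly difference.

```python
import math


def transitions(day):
    """Number of occupancy class changes between consecutive time points."""
    return sum(a != b for a, b in zip(day, day[1:]))


def variation_level(day, all_days):
    """Ratio of one day's transitions to the average over the collection period."""
    average = sum(transitions(d) for d in all_days) / len(all_days)
    return transitions(day) / average


def profile_difference(day, profile):
    """Euclidean distance between a day's occupancy and an occupancy profile."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(day, profile)))


def mean_temperature_difference(indoor, outdoor):
    """Daily average of the absolute hourly indoor/outdoor temperature differences."""
    return sum(abs(i - o) for i, o in zip(indoor, outdoor)) / len(indoor)


# Toy data (assumed): three days of binary presence at four time points.
days = [[0, 1, 1, 0], [0, 1, 0, 1], [0, 0, 0, 0]]
profile = [0.0, 0.5, 1.0, 0.0]  # presence probability per time point
print(variation_level(days[0], days))
print(profile_difference(days[0], profile))
print(mean_temperature_difference([22.0, 23.0], [20.0, 26.0]))
```

A variation level of 1 corresponds to a day with an average number of transitions; values above 1 indicate more status changes than usual.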
Figure 21: Daily modeling accuracy and daily average indoor/outdoor temperature difference

When the indoor temperature was close to the outdoor temperature, the daily modeling accuracies were high, since there was not much interference from active conditioning with the ambient factors. When the hourly temperature differences were 3-5 K (the setpoint is generally closer to the outside temperature), the daily modeling accuracies dropped significantly, as the HVAC system began to work and influenced the features intermittently depending on the actual heating/cooling loads. When there were large temperature differences between the indoor and outdoor environments, the rate of decrease in modeling accuracy slowed down, as the HVAC systems worked continuously (the setpoint is generally far from the outside temperature) and other features then played more important roles in occupancy modeling. The decrease in daily modeling accuracy finally converged because, no matter which room was used for training, the training data contained samples collected while the HVAC system was on; thus, the relationships between the class of occupancy and the contextual information under active conditioning were already considered and partially encapsulated in the features and models. In addition, there was no significant difference between the two buildings, as all the rooms used setpoint control: if the room temperature was outside the deadband of the heating/cooling setpoints, the HVAC system was triggered to work.

7.2.6. Summary

This chapter presents a systematic approach to modeling real-time occupancy by using ambient sensing and relationship learning. The findings supported the hypothesis that occupancy regularly influences ambient environments and that there thus exist general relationships between changes in occupancy and variations in the ambient environment. Large-scale occupancy awareness at the building level could be improved by analyzing the patterns of the ambient factors.
For local real-time occupancy modeling, this chapter carried out a systematic evaluation of the following:

(1) Six different algorithms, namely SVM, ANN, NB, TAN, KNN and DT, for possible relationships between occupancy and ambient factors. Specifically, SVM, KNN, TAN and DT yielded satisfactory results in estimating all of the occupancy classes in all rooms. However, SVM was unstable in local modeling and global modeling, while NB yielded lower accuracies in estimating specific classes when a room was occupied, indicating that, though they can accurately tell whether a room is occupied or not, they are less accurate in differentiating the actual number of occupants in a room.

(2) The impact of each individual ambient factor on the occupancy modeling accuracy. The analysis showed that CO2, door status and light were the most influential sensor variables, whereas average sound, door count net, PIR and PIR count net played less influential roles in occupancy modeling.

(3) The performance of different ambient factor combinations. The results showed that although some sensor variables could provide binary occupancy detection when used alone, they could hardly provide satisfactory estimates of the actual number of occupants. The accuracy in estimating all occupancy classes generally increased as the number of sensor variables increased, although the factors with low information gain contributed little to the modeling results.

To eliminate the need for repetitive, time-consuming and labor-intensive data collection and model training, this chapter also presented a systematic framework for global occupancy modeling. Features related to contextual information were processed to improve the global modeling performance, so that a model trained in one space was scalable to geometrically similar spaces.
Four metrics, namely the daily F-measure, daily RMSE, daily occupied/unoccupied detection accuracy and number estimation accuracy, were used to comprehensively analyze the trends and correlations of the modeling results. It was found that the daily F-measure ranged from 0.71 to 0.94 with a mean of 0.80, and the daily RMSE ranged from 0.42 to 1.72 with a mean of 1.23, for all occupancy classes; the daily occupied/unoccupied detection accuracy ranged from 0.79 to 0.99 with a mean of 0.91; and the daily number estimation accuracy, which focused only on the occupied period, ranged from 0.63 to 0.78 with a mean of 0.71. The comparisons of the metrics illustrated that the daily F-measure and daily RMSE had an approximately negative linear relationship, indicating that the magnitude by which the modeling results deviated from the actual classes decreased as the overall modeling accuracy improved. On the other hand, the daily occupied/unoccupied detection accuracy had an approximately positive logarithmic relationship with the daily number estimation accuracy, indicating that highly accurate binary occupancy detection could imply highly accurate occupant number estimation, and vice versa.

Three potentially influential factors on global modeling performance, namely the level of real-time occupancy variation, the degree of long-term occupancy difference and the operation of the HVAC systems, were also investigated. Based on the results, the level of real-time occupancy variation logarithmically reduced the global occupancy modeling performance, indicating that the decrease in modeling accuracy slowed down as the daily occupancy transitions began to deviate significantly from the average daily occupancy transitions. The degree of long-term occupancy difference between the rooms had a slightly negative impact on the global occupancy modeling performance, indicating that the daily modeling accuracy decreased as the difference between the daily occupancy in the tested room and the occupancy profile in the trained room increased.
The operation of the HVAC systems and the global occupancy modeling performance were negatively and logistically related, indicating that there were three different influences on the daily modeling accuracy resulting from three levels of HVAC operation: no active heating/cooling, intermittent active heating/cooling, and continuous active heating/cooling.

7.3. Long-term Occupancy Modeling

This chapter proposes a framework to systematically model personalized occupancy profiles that represent occupants' long-term presence patterns. It is based on the assumptions that occupants have certain patterns, at least for the same day of different weeks during a certain period of time, and that fine-grained long-term occupancy can be derived from real-time occupancy once the irregular occupancy is eliminated. The exact period of "long-term" depends on the specific application of the occupancy patterns and varies with different spaces. Different lengths of the long term might influence the results of modeling but do not influence the way of modeling. Since this dissertation focuses on the energy efficiency of building HVAC systems, the twelve months of the year are categorized into three groups based on the conditioning requirements: mixed (heating and cooling) period I (January - April), cooling dominant period (May - August) and mixed period II (September - December). The framework presented in this chapter can be applied to other office spaces. Since ambient factors are related to occupancy and have their own patterns, it is worth exploring the use of ambient factors to improve the quality of long-term occupancy modeling. Although occupancy profiles describe the presence probability at each time point, as the sampling rate increases in the future, occupancy profiles might become a continuous representation of presence that precisely reflects occupancy status changes.

7.3.1. Methodology for Long-term Occupancy Modeling

7.3.1.1.
Occupancy Profile Number

To start with, it was assumed that there are significant differences in presence patterns among different days of the week, so separate profiles for different days (up to seven profiles, one for each day) were assumed. Whenever two of the n profiles (n ≤ 7) were shown to be similar, they were combined, reducing the number of profiles to n-1. Therefore, the differences in presence patterns among different days of the week were examined first, to check how many profiles were required to describe the presence patterns of a certain space. The following steps were designed:

1) for each occupancy sample, the sequence of values was segmented by a time window (15 minutes was used in this dissertation);
2) the mean and standard deviation were extracted from each window to form a feature vector, as the mean and standard deviation are statistically more reliable for comparing differences than the minimum and maximum values;
3) all samples were categorized into seven groups based on the day of the week to which they belonged, and the Minkowski distance between every pair of feature vectors was calculated;
4) the inner-group distance sums and inter-group distance sums were compared to decide whether there were significant differences among groups;
5) if there was a significant difference between two groups, the occupancy profiles were created separately for these two days of the week. If there was no significant difference, the occupancy profiles could be combined and represented by a single profile.

For example, for one space there might be three profiles for weekdays: one for Monday, one for Tue/Thu and one for Wed/Fri; for another space there might be one profile for Monday through Friday. Although the method explained above could eliminate the influence of occupancy detection errors and outliers on the comparison process, it did not take irregular presences and random schedules into consideration.
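The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this dissertation: the function names and toy presence sequences are hypothetical, the Minkowski order p = 2 is an arbitrary choice, and mean distances are used in place of distance sums so that groups with different numbers of samples stay comparable. Each group needs at least two samples for the inner-group distance to be defined.

```python
import statistics


def window_features(day, window):
    """Mean and standard deviation of each time window, concatenated into one vector."""
    features = []
    for start in range(0, len(day), window):
        chunk = day[start:start + window]
        features.append(statistics.mean(chunk))
        features.append(statistics.pstdev(chunk))
    return features


def minkowski(u, v, p=2):
    """Minkowski distance of order p between two feature vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)


def mean_inner_distance(group, p=2):
    """Average distance over all pairs of samples within one day-of-week group."""
    pairs = [(a, b) for i, a in enumerate(group) for b in group[i + 1:]]
    return statistics.mean(minkowski(a, b, p) for a, b in pairs)


def mean_inter_distance(group_a, group_b, p=2):
    """Average distance over all pairs drawn from two different groups."""
    return statistics.mean(minkowski(a, b, p) for a in group_a for b in group_b)


# Toy presence samples (assumed): four time points per day, window of two points.
monday = [window_features(d, 2) for d in ([0, 0, 1, 1], [0, 0, 1, 1])]
friday = [window_features(d, 2) for d in ([1, 1, 0, 0], [1, 1, 0, 0])]
print(mean_inner_distance(monday), mean_inter_distance(monday, friday))
```

If the inter-group distance is clearly larger than the inner-group distances, as in this toy case, the two days would keep separate profiles; otherwise they would be merged.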
Another method, based on comparing raw occupancy, was designed to calculate the differences among groups and decide the number of profiles needed to represent occupancy in one space. Since a presence sample consisted of only two statuses, the unoccupied state and the occupied state, each day's occupant presence could be expressed as a string of 1s and 0s with the number of digits equal to the number of time points. For each time point of the day, 1 represented the occupied state, while 0 represented the unoccupied state. The Hamming distance [226,227], commonly used in computer science for comparing the difference between two strings, was used to compare the difference between any two occupancy samples.

First, all presence samples were categorized into seven groups based on the day of the week, from Monday to Sunday. Within each group, the Hamming distances between every pair of samples were calculated and formed an inner-Hamming matrix. There were seven inner-Hamming matrices. For example, if the Tuesday group has n presence samples, its inner-Hamming matrix is:

\[
\begin{bmatrix}
0 & N_{T12} & \cdots & N_{T1(n-1)} & N_{T1n} \\
N_{T21} & 0 & \cdots & N_{T2(n-1)} & N_{T2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
N_{Tn1} & N_{Tn2} & \cdots & N_{Tn(n-1)} & 0
\end{bmatrix}
\]

Second, each presence sample within a group was compared with every presence sample from the other groups, and the Hamming distances formed inter-Hamming matrices. For example, given that the Tuesday group has n presence samples, its inter-Hamming matrix with the Wednesday group (with m presence samples) is:

\[
\begin{bmatrix}
N_{T1W1} & N_{T1W2} & N_{T1W3} & \cdots & N_{T1W(m-1)} & N_{T1Wm} \\
N_{T2W1} & N_{T2W2} & N_{T2W3} & \cdots & N_{T2W(m-1)} & N_{T2Wm} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
N_{TnW1} & N_{TnW2} & N_{TnW3} & \cdots & N_{TnW(m-1)} & N_{TnWm}
\end{bmatrix}
\]

The third step is to extract features (mean and standard deviation) from each matrix for comparison and to decide whether there is a significant difference between the inner-Hamming matrix and the inter-Hamming matrix.
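The construction of the inner- and inter-Hamming matrices and the extraction of their mean and standard deviation can be sketched as below. The function names and the toy 1/0 presence strings are hypothetical and only illustrate the procedure; the significance test used to compare the resulting statistics is not specified in this section and is therefore left out.

```python
import statistics


def hamming(a, b):
    """Number of time points at which two equal-length presence strings differ."""
    if len(a) != len(b):
        raise ValueError("presence strings must cover the same time points")
    return sum(x != y for x, y in zip(a, b))


def inner_matrix(group):
    """Pairwise Hamming distances within one day-of-week group (zero diagonal)."""
    return [[hamming(a, b) for b in group] for a in group]


def inter_matrix(group_a, group_b):
    """Hamming distances between every sample of one group and every sample of another."""
    return [[hamming(a, b) for b in group_b] for a in group_a]


def matrix_stats(matrix, skip_diagonal=False):
    """Mean and standard deviation of the matrix entries."""
    values = [
        value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if not (skip_diagonal and i == j)
    ]
    return statistics.mean(values), statistics.pstdev(values)


# Toy presence strings (assumed): eight time points per day.
tuesday = ["00111100", "00111000"]
wednesday = ["01111110", "00011000"]
print(matrix_stats(inner_matrix(tuesday), skip_diagonal=True))
print(matrix_stats(inter_matrix(tuesday, wednesday)))
```

The diagonal of an inner matrix is always zero, so it is skipped when summarizing; otherwise the inner-group mean would be biased downward relative to the inter-group mean.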
It is not important to know the absolute distance between two samples; what matters is whether there are differences among groups (different days of the week). If there is a significant difference between two groups, the occupancy profiles should be created separately for these two days of the week. If there is no significant difference, the profiles can be combined and represented by a single profile. The implementation of both methods could effectively demonstrate whether there is a significant difference among the days of the week.

7.3.1.2. Processing of Real-time Occupancy

Since occupancy profiles are derived from real-time occupancy, the real-time occupancy detection model (binary occupancy model: occupied/unoccupied) introduced in Chapter 7.2 was first applied as a non-intrusive way to process sensor data and produce occupancy ground truth for modeling the occupancy profile. Although the presence of one occupant may change temporarily, it was assumed that habitual occupancy within similar groups (Monday to Sunday in this dissertation) was stable over a longer period of time and was likely to reflect personalized profiles. Instead of directly using raw occupancy, the expected occupancy for each time point was calculated for statistically summarizing the occupancy profile because: 1) raw occupancy status relies heavily on the quality of the sensor data and is significantly influenced by sensor noise; 2) raw occupancy status contains outliers (random presence and one-time presence), for example when an occupant forgets a notebook and comes back to pick it up; 3) raw occupancy status does not consider the interrelations with adjacent statuses. Raw occupancy status cannot be used to analyze occupancy trends or patterns meaningfully; it only estimates the instant occupancy status at each time point, sometimes showing spontaneous occupancy changes within a very short period of time.
In order to improve the quality of the real-time occupancy ground truth, the characteristics of long-term occupancy should be considered when calculating the expected occupancy as the new ground truth. Three characteristics of occupancy were considered: (1) its time-series characteristic: occupancy changes over time and is an ordered sequence of probabilities at equally spaced time intervals; (2) its statistical characteristic: occupancy evolves regularly and can be learned from previous cases; and (3) its stochastic characteristic: occupancy is influenced by non-deterministic factors, and the status at each time point is determined probabilistically according to the previous status. Therefore, three modeling techniques, one for each of the three characteristics, plus a linear regression approach as a baseline, were explored. Linear regression was chosen because it is the simplest relationship analysis method for estimating the expectations of dependent variables, and because it was used extensively in prior research.

Since occupants tend to maintain accustomed and consistent ambient conditions in the long term, another assumption was made: the ambient factors most sensitive to occupancy (i.e., CO2 concentration, door status, light level, binary motion, and temperature) should show similar patterns at the same time point on different days, given typical occupancy. Since the primary ambient factors for real-time occupancy modeling had already been identified, it was assumed that CO2 concentration, door status, light level, binary motion (used in place of the net motion count, which may suffer from a reset issue), and temperature should be used to differentiate regular and irregular occupancy.
Afterwards, the occupancy profile was linked with real-time expected occupancy based on the underlying assumption that, at the same time on different days, occupancy is an occurrence repeated a large number N of times. The proportion of times that presence occurs converges to a specific value as the number of samples grows; that is, the ratio (presence samples/total samples) converges to a constant limit as the total sample size increases.

7.3.2. Long-term Occupancy Modeling

In the preparation of expected occupancy for each time point, the data was divided into 10 parts (9 parts for the training set and 1 part for the modeling set, taken sequentially), and the modeled results were combined to produce the expected presence status for each time point. The following sections introduce the details of the four modeling techniques tested and compared in this dissertation, each addressing a different characteristic of occupancy for profiling. Due to limitations related to database-BAS communication, the sampling rate was set to three minutes. Since one day has 1440 minutes, there are 480 time points in the occupancy profile to represent a whole day.

7.3.2.1. ARMA Time-Series Model (Time-Series Characteristic)

Several autoregressive techniques have been developed to analyze and forecast linear time series [228], such as the ARIMAX model for univariate data, multivariate models for stationary multivariate data, and exponential smoothing models for weighted data [229-231]. These diverse techniques were developed from a simple vector ARMA process (AutoRegressive Moving-Average model), which applies an autoregression (AR) sub-model and a moving-average (MA) sub-model to predict future values in the series. Therefore, ARMA was used in this dissertation to capture the time-series characteristic of the occupancy profile.
This method considers the relationship between time-series ambient factors and time-series raw occupancy, and is referred to as ARMA (p, q), where p is the order of AR and q is the order of MA. If the time-series occupancy $y_t$ satisfies

$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t$$

where $\varepsilon_t$ is a sequence of independent and identically distributed random variables that meets the requirement

$$E(\varepsilon_t) = 0, \quad \mathrm{Var}(\varepsilon_t) = \sigma_\varepsilon^2$$

then the occupancy series $y_t$ can be considered an autoregressive model of order p. Meanwhile, if the series also satisfies the moving-average condition

$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}$$

then the occupancy series $y_t$ can also be considered a moving-average model of order q. In order to determine the values of p for AR and q for MA, the censored orders of the sample partial autocorrelation and the sample autocorrelation were calculated, respectively. Based on the results, the sample partial autocorrelation was censored at order 10, thus ARMA (0, 10) was applied to model the time series of occupancy. By integrating ambient factors (i.e., CO2 concentration, door status, light level, binary motion, and temperature) into the modeling process, the final ARMA model is represented as:

$$y_t + a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_{10} y_{t-10} = b_1 X_1 + b_2 X_2 + \cdots + b_i X_i + e_t$$

$$X_i = (x_{i,t-1},\ x_{i,t-2},\ x_{i,t-3},\ x_{i,t-4},\ \ldots,\ x_{i,t-10})$$

in which $x_i$ is one ambient factor gathered by the deployed sensor network. In other words, at each time point the time-series occupancy is influenced by the occupancy and the ambient factors from the past 10 time points. The least squares fitting method was applied to estimate the parameters in the equation. As the sampling rate was 3 minutes, the expected and habitual occupancy was determined by the data from the past 30 minutes. A full day (24 hours) was then divided into 480 time intervals ($T_i$, i = 1, 2, …, 480).
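To make the data layout behind the final equation concrete, the sketch below assembles, for each time point, the ten lagged occupancy values and the ten lagged readings of each ambient factor. The function name and the dictionary layout are hypothetical; the fitting step itself is omitted and would be done with any standard least-squares routine (for instance `numpy.linalg.lstsq`, which is an assumption about tooling, not something stated in the text).

```python
def lagged_rows(y, factors, order=10):
    """Build (regressors, target) pairs for y[t] from the previous `order` samples.

    y       : list of raw occupancy values (0/1), one per time step
    factors : dict mapping a factor name to its list of readings, same length as y
    """
    rows = []
    for t in range(order, len(y)):
        # y_{t-1}, ..., y_{t-order}
        past_y = [y[t - k] for k in range(1, order + 1)]
        # x_{i,t-1}, ..., x_{i,t-order} for every ambient factor i
        past_x = [series[t - k]
                  for series in factors.values()
                  for k in range(1, order + 1)]
        rows.append((past_y + past_x, y[t]))
    return rows

# One day at a three-minute sampling rate: 480 time points, five ambient factors.
names = ["co2", "door", "light", "motion", "temperature"]
day_y = [0] * 480
day_x = {name: [0.0] * 480 for name in names}
rows = lagged_rows(day_y, day_x)
print(len(rows), len(rows[0][0]))
```

With five factors and ten lags each row carries 60 regressors, and a 480-point day yields 470 usable time points because the first ten have no complete history.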
At any time interval, if three or more of the five ambient factors complied with the similarity requirements defined above, the sample was considered valid and counted as one of the ith samples for modeling personalized occupancy profiles. The mean $\overline{E(y_i)}$ was defined as the occupancy probability for time i. Because the ARMA (0, 10) model is delayed by 10 orders, 470 time intervals were modeled, from 00:30 to 24:00. Since there was usually no occupant in a room in the early morning and late evening, the presence profile from 04:00 to 20:00 was presented and compared with the ones from the other modeling techniques discussed below.

7.3.2.2. Pattern Recognition Model (Statistical Characteristic)

Pattern recognition is the process of automated machine recognition of occupancy based on a given set of features from previous experiences and cases [232]. It aims to provide a reasonable matching between inputs and outputs by considering the statistical relationships and variations [233]. The artificial neural network (ANN) is a commonly used pattern recognition technique that models a large-scale nonlinear adaptive system by simulating a neural network [234]. It consists of a large number of simple processing units called neurons, and is often used for massively parallel data processing, iteratively adjusting the input and output of each neuron and eventually approximating a statistically reliable relationship. For each neuron, if the weighted sum of the inputs is equal to or above the predefined threshold value, it is processed by a transfer function. Generally, each neural network consists of several layers of neurons; the output of one layer is passed as the input to the following layer. In this dissertation, the network input was the historical occupancy data from the occupancy detection model and the ambient factors from the sensor network, while the output of the network was the expected occupancy at the current time point.
To keep this model consistent with the data segmentation in the ARMA modeling, it was assumed that expected occupancy was determined by the occupancy and context information for the last 10 time points (30 minutes). Therefore, for each occupancy output $y_i$, there were 60 inputs (5 ambient factors plus raw occupancy, each for the last 10 time points). A three-layer neural network, consisting of one input layer, one hidden layer and one output layer, can approximate any arbitrary non-linear function. In this dissertation, the input layer had 60 neurons, one for each input, and the hidden layer had 31 neurons according to an empirical formula: hidden neuron number = (input+output)/2. Upon receipt of the weighted sums from the input layer, each hidden neuron processed the input independently and transferred its output to the output layer. After the same process was performed for the output layer neuron, the output of the entire network was determined. Thus, a neural network is a method for finding a near-optimal matching between input and output. Although at each point in time the network may closely approximate the output, the trained weights and thresholds vary significantly.

7.3.2.3. Stochastic Process Model

A stochastic process treats occupancy as a random variable; at each time point the status is determined probabilistically according to the previous status. Occupancy may also be influenced by non-deterministic factors, so even under the same initial condition it evolves in different directions with different probabilities [235]. The Markov chain is a classic method for simulating stochastic processes by analyzing the transitions from one state to another [236]. The two important parameters considered in the process were the initial state and the transfer probability.
The initial state was the beginning status of occupancy for a space, denoted as $p_i = [a,\ 1-a]$, in which a was the probability of the unoccupied status at time 0. The transfer probability indicated the probability of status changes from time n to time n+1, denoted as

$$p(n) = \begin{bmatrix} 1-x & x \\ y & 1-y \end{bmatrix}$$

The entries of this matrix are the probabilities of the following transitions:

$$\begin{bmatrix} \text{unoccupied at time } n,\ \text{unoccupied at time } n+1 & \text{unoccupied at time } n,\ \text{occupied at time } n+1 \\ \text{occupied at time } n,\ \text{unoccupied at time } n+1 & \text{occupied at time } n,\ \text{occupied at time } n+1 \end{bmatrix}$$

According to the two parameters, the occupancy probability at a certain time point could be calculated through a recursion formula: $p_i(n+1) = p_i(n) \cdot p(n)$. The initial state was determined by observation of historical data, which showed that at 00:00 there was usually no occupant in a room; thus $p_0$ was taken as [1, 0]. In this dissertation, the change in occupancy was not a homogeneous Markov chain, so the transfer probability could not be treated as constant. By implementing image fitting and statistics for the historical occupancy data distribution, a piecewise function was calculated to represent the transfer probability as a function of the time of day.

7.3.2.4. Regression Model

Unlike the pattern recognition modeling techniques, regression modeling belongs to the family of methods called generalized linear models ("GLM"). It aims to build a linear link function between input and output. Considering that current occupancy is related to the occupancy at the previous time points, a link could be established as follows:

$$P\{y_n = 1 \mid y_{n-1}, y_{n-2}, y_{n-3}, \ldots, y_{n-10}\} = a_0 + a_1 y_{n-1} + a_2 y_{n-2} + \cdots + a_{10} y_{n-10}$$

However, in a linear regression model the marginal effects of previous occupancy remain constant, which may not reflect reality, and the outcomes may fall outside the range [0, 1], which makes the results unreasonable.
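The range problem can be seen in a small numeric sketch. The coefficients below are made up purely for illustration (they are not fitted values from the dissertation), and only two lagged terms are used instead of ten. The logistic function is included to show how a squashing transform keeps the output strictly between 0 and 1.

```python
import math

def linear_probability(coeffs, past):
    """Evaluate a_0 + a_1*y_{n-1} + a_2*y_{n-2} + ... directly; the result is unbounded."""
    a0, *a = coeffs
    return a0 + sum(ai * yi for ai, yi in zip(a, past))

def logistic(z):
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients: a_0 = -0.1, a_1 = a_2 = 0.6.
coeffs = [-0.1, 0.6, 0.6]
for past in ([0, 0], [1, 0], [1, 1]):
    z = linear_probability(coeffs, past)
    print(past, z, logistic(z))
```

For the all-unoccupied history the linear value is negative, and for the all-occupied history it exceeds 1, so neither can be read as a probability without a transform.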
In the case of logistic regression, a logit link function was applied to transform the distribution of $y_n$:

$$P'(y_n) = P(y_n) = \frac{1}{1 + e^{-y_n}}$$

The outputs from the logit regression model were within the range [0, 1].

7.3.3. Findings for Long-term Occupancy Modeling

The same three rooms in the testbed building that were selected for validating the real-time occupancy modeling were also used for implementing the long-term occupancy profiling methodology (Figure 22).

Figure 22: Locations of three offices for validating the proposed occupancy profiling methodology

Both methods of determining the number of occupancy profiles showed that, for the three rooms, there was no significant difference among weekdays or among weekend days, but there was a significant difference between weekdays and weekends. Therefore, two profiles were required for each room (one for weekdays and one for weekends). However, no occupant came to the offices during weekends in the data collection period, so the weekend occupancy probability was 0. Therefore, ASHRAE default schedules were used to represent weekends [153]. Considering weekdays only, and after eliminating the samples with missing values, 39,799 samples were selected from the second data collection period for modeling long-term occupancy. The modeled weekday profiles are plotted in Figures 23-25 for each room. Based on the visual results, the occupancy profiles modeled by the ARMA, neural network and Markov chain models had similar shapes. These results indicated that the three characteristics of occupancy profiles (i.e., time-series, statistical and stochastic) were closely associated with each other. In other words, the effects of the three characteristics on the occupancy profile were similar.
The regression modeling did not perform stably across rooms and sometimes generated profiles that deviated considerably from the results of the other modeling techniques.

Figure 23: Comparison of occupancy profile from different modeling methods for Room 1

Figure 24: Comparison of occupancy profile from different modeling methods for Room 2

Figure 25: Comparison of occupancy profile from different modeling methods for Room 3

7.3.4. Evaluation of Representativeness

The representativeness of an occupancy profile should be evaluated by comparing it with actual occupancy over a period of time, in terms of both the degree of statistical approximation and the approximation of heating/cooling loads on the demand side.

7.3.4.1. Degree of Statistical Approximation

To evaluate whether the modeled occupancy profiles reflect the occupancy patterns well in terms of the degree of statistical approximation, a bootstrapping-derivative method was used to compare the modeled profile with each day's actual occupancy and decide whether the modeled profile could represent the actual occupancy patterns [237,238].

1. Select all the sample combinations from $\binom{n}{1}$ to $\binom{n}{n}$ within a group. The number of groups is determined based on the difference of occupancy profiles among the days of the week;
2. For each combination i, the occupancy probability at each time point is calculated: $p_i(t_1) = \frac{1}{N_i} \sum_{j=1}^{N_i} x_j$, where $x_j = 1$ when occupied and $x_j = 0$ when unoccupied;
3. For each time point, the distribution of the occupancy probability is determined according to the Central Limit Theorem. When the sample size is sufficiently large, the mean of the occupancy probability can be approximated by a normal density with standard deviation $\sigma(t_1) = \sum \sigma_i / \sqrt{N_i}$;
4. Calculate the modeled occupancy probability p' based on the proposed method.
Given a certain confidence level (98% in this dissertation), the confidence interval of the occupancy probability at each time point is calculated:

$$\left[\, p'(t_1) - 1.96\,\frac{\sigma(t_1)}{\sum N_i},\ \ p'(t_1) + 1.96\,\frac{\sigma(t_1)}{\sum N_i} \,\right]$$

5. The percentage of the area of the actual occupancy probability from (3) that lies within the confidence interval from (4) is calculated and presented in Figures 26-28; the larger the percentage, the closer the modeled occupancy profile is to the actual occupancy.

Figure 26: The range of presence probability at each time point for Room 1

Figure 27: The range of presence probability at each time point for Room 2

Figure 28: The range of presence probability at each time point for Room 3

The results generated from the four modeling techniques are shown in Table 12. The calculated actual presence probabilities varied with the period of time selected. The evaluation results showed that, at the 98% level of confidence, more than 90% of the actual presence probability values were within the confidence intervals set by the modeled occupancy profiles. Specifically, the profiles from ARMA and the neural network closely approximated the daily actual occupancy (with percentages of 0.973 and 0.956, respectively). The four tested modeling techniques all outperformed the observation-based method (averaging the occupancy detection results), although the observation-based method was still more accurate than the fixed design profile, which is commonly used in current building simulation and system controls.

Table 12.
Percentage of area enclosed by actual occupancy probability that is within the confidence interval by modeled occupancy probability

| Presence Profile Modeling Technique | P(room1) | P(room2) | P(room3) | Mean P |
|---|---|---|---|---|
| ARMA (time-series) | 0.98 | 0.96 | 0.98 | 0.973 |
| Neural network (pattern recognition) | 0.95 | 0.95 | 0.97 | 0.956 |
| Markov chain (random process) | 0.92 | 0.94 | 0.91 | 0.923 |
| Regression (logit regression) | 0.91 | 0.90 | 0.90 | 0.903 |
| Observation-based method | 0.79 | 0.85 | 0.73 | 0.790 |
| Fixed design profiles | 0.63 | 0.71 | 0.68 | 0.673 |

The ARMA model was straightforward and easy to interpret; it also had the best relative performance in modeling personalized occupancy profiles. The ARMA structure incorporated both historical occupancy and the corresponding ambient factors. It focused on the changes in occupancy over time by treating the changes as an ordered sequence of probabilities at equally spaced time intervals. The ARMA model considered all possible influential factors in occupancy profiling and thus can reflect reality better than the structure of the other models. A drawback of this method was that the ARMA model treated the influences of historical occupancy and ambient factors as linear, so a change in the ambient factors must correspond to a constant change in occupancy. This assumption may not be satisfied in the real world. However, as there was very little information to determine what the actual relationship between occupancy and the ambient environment was, assuming a linear relationship might be a reasonable starting point. In contrast, the neural network was designed to approximate the non-linear statistical relationship between independent and dependent variables. However, the neural network model was difficult to interpret because it provided little information about the model structure and little explanatory insight into how the independent variables influence the modeling process. It may also lead to over-fitting.
Since the neural network adopted a trial-and-error process, the parameter values were not deterministic. A large part of the work involved determining appropriate parameter values through repeated and varied attempts. In addition, due to the random initial states and different ending conditions, each run produced a different result. Therefore, the modeling performance was not stable, as it showed significant divergences.

Unlike the ARMA model, the Markov chain model had no after-effect, meaning that the presence status at a future time point was assumed to be influenced only by the current status. This property made it possible to compute the probability that a presence status leads to a given sequence of follow-up statuses. However, the Markov chain was likely to favor the parts of the status that had high probabilities, which would result in eliminating sink states problems, and sometimes the sequence of modeled statuses may be unreasonable.

Lastly, regression modeling was used as a baseline modeling technique, as regression is commonly applied to compare different methods. The logit regression model can only return values between 0 and 1, which was suitable for modeling the probability of occupancy. However, similar to the ARMA model, the regression model also assumed a rigidly linear relationship between previous occupancy and current occupancy, an assumption which may deviate from reality as discussed earlier. Of the four techniques compared in this dissertation, the regression model gave the worst approximation of actual occupancy.

7.3.4.2. Degree of Load Approximation

It is important to determine whether good performance on the degree of statistical occupancy approximation correlates with good performance in analyzing the heating/cooling loads on the demand side from an energy perspective [239], as access to reliable occupancy information is expected to yield accurate energy performance results.
Simulation was applied to evaluate the representativeness of the modeled occupancy profile in terms of energy consumption. Two commonly used ways to incorporate and account for occupancy were selected as the control experiments for comparison: actual occupancy, which applies the instant status defined by motion sensors, and fixed design profiles, which are defined by organizations such as ASHRAE. This chapter evaluated the impact on energy simulation results of using personalized long-term occupancy as an input, instead of real-time occupancy or a fixed design profile, by simulating the ideal energy consumption (the amount of energy needed to meet the heating/cooling loads on the demand side, without considering system inefficiency) of four thermal zones in a building for four months using the calibrated energy model described in Chapter 9 (Figure 29).

Figure 29: Rooms and zones selected for evaluating the representativeness of occupancy profile in terms of heating/cooling estimation

The simulation was used to evaluate the performance of the different profiling methods for four zones (seven rooms), although 50 rooms of the testbed building were equipped with the sensor boxes. The reason is that the occupancy profile significantly influences the internal heat load and conditioning requirement of a zone, so analyzing zone-level energy consumption provides insight into how well an occupancy profile represents actual occupancy in terms of energy performance. Therefore, in order to analyze the consequences of implementing occupancy profiles for individual zones in simulation, the focus was on zone loads instead of system responses.
Four modeled occupancy profiles, fixed design profiles (for an office building, as defined by ASHRAE Standard 90.1-2004), observation-based profiles (averaged occupancy status at each time point over a period of time), and the building's actual occupancy information were implemented, and the energy required for meeting the heating/cooling loads of four zones was simulated. Sensor data from seven rooms (Zones 13, 14, 15, 16 in Figure 29) other than the three used for modeling occupancy were gathered and used to evaluate the performance of the different occupancy profiling methods in load estimation from January to April 2013 with a ten-minute time step. Seven scenarios were simulated using the energy model. The first through fourth scenarios used the modeled occupancy profiles. The profile for the fifth scenario was generated using the observation-based method. The sixth scenario used the fixed design profiles for an office building defined by ASHRAE Standard 90.1-2004. The seventh scenario used actual occupancy information. The simulation results were compared and are presented in Figure 30.

Figure 30: Deviations of energy consumption, simulated for four months

The energy consumption of the scenario in which actual occupancy information was applied as the input schedule was used as the baseline for the comparison. Half of January was the university's winter break; therefore a winter holiday was defined in the simulation. Since a holiday cannot be partitioned into weekdays and weekends, separate occupancy profiles for the holiday were required to run the simulation. During the winter break, no occupancy was observed in the seven rooms; therefore the holiday occupancy profiles generated by the different methods were all set to zero.
Thus, in January the energy use of the other six scenarios deviated less from that of the scenario with actual occupancy information than it did in the other months. Based on the simulation results shown in Figure 30, the fixed design profile for a typical office building defined by ASHRAE 90.1-2004 performed the worst, diverging significantly from the baseline energy consumption, which indicates that the fixed design profiles did not accurately represent specific and realistic occupancy in terms of energy consumption. The profile generated from the observation-based model was more accurate in comparison, but it was still unsatisfactory. Profiles generated from the ARMA, neural network and Markov chain models all produced better simulation results (with deviations less than 12%) than the fixed design profiles and observation-based profiles. More importantly, the comparisons from the simulation results were consistent with the validation results in Section 7.3.4.1, except for regression-based profiling. The ARMA model generated the best occupancy profiles for ideal energy analysis (with a deviation less than 6%), closely followed by the artificial neural network model (with a deviation less than 8%). The stochastic process modeling could also provide accurate occupancy profiles for simulation (with a deviation less than 12%). The performance of the profile modeled using the regression model was similar to that of the observation-based profile; both yielded an obvious gap between the simulated energy consumption and the baseline energy consumption (with deviations larger than 15%). However, they still outperformed the fixed design profiles, which are commonly used in current research.
In reality, using actual occupancy for energy performance analysis is time-consuming and labor-intensive, and access to actual occupancy data may not even be realistic; occupancy profiles modeled with ARMA and the neural network could serve as substitutes for actual occupancy in terms of both statistical approximation and heating/cooling load estimation on the demand side.

7.3.5. Occupant Number Profiles

The occupancy profile described above models the presence probabilities of a space at different time points of the day. It considers only the unoccupied and occupied statuses. In terms of the occupancy profiling method, having more than one occupant in a space is the same as having only one occupant. The only difference is that the probability at each time point is a joint probability over the different occupants. The regular presence of one occupant might have a long-term influence on the presence patterns of other occupants. In this case, the profiles of multiple occupants are connected and cannot be separated. Thus, all occupants should be considered as a group, and the number of occupants in the group is then modeled. Similar to the presence profile, another profile, called the number profile, is needed to comprehensively represent the occupant number in that space. A number profile can be defined as a set of time-sequenced probabilities of different occupant numbers at different time points of one day. It differs from the presence profiling above in how the expected occupancy statuses are calculated. For this step, the raw occupancy status should be replaced by the occupant number, leading to one profile with the number of segments equal to the maximum occupant number in that space. The other steps can be kept unchanged. The results for the three rooms using ARMA are shown in Figures 31-33.
Figure 31: Comparison of number profile from different modeling methods for Room 1

Figure 32: Comparison of number profile from different modeling methods for Room 2

Figure 33: Comparison of number profile from different modeling methods for Room 3

7.3.6. Summary

This chapter presented a framework for modeling personalized occupancy profiles by integrating sensor-based ambient factors and a real-time occupancy model to eliminate irregular occupancy. Occupancy was assumed to be stable over a longer period of time, which was verified by the analysis of the data. Hamming distance was used to determine the number of profiles required to represent one occupant's presence pattern. Four types of modeling techniques (time-series modeling, pattern recognition modeling, stochastic process modeling and regression modeling) were proposed to calculate the expected presence statuses, and their performances were compared in terms of the degree of statistical approximation to real occupancy. As the probability of occupancy at each time point was modeled using the data from previous time periods, it reflected the expected presence probability at that time instead of simply relying on survey or observation-based data. The results showed that the personalized profiles modeled by ARMA and the neural network closely approximated actual daily occupancy. The ARMA, neural network, Markov chain and regression models all outperformed the observation-based method, although the latter still yielded more accurate results than the fixed design profile. Ambient factors, including CO2 concentration, binary motion, door status, light level and temperature, were demonstrated to be important indicators for differentiating regular and irregular presences. By incorporating ambient factors into the profiling process, this framework was capable of generating a personalized occupancy profile for each occupant and eliminating irregular presence by treating it as a statistical outlier of typical patterns.
The dissertation also presented results from a building energy simulation in which personalized occupancy profiles were implemented to validate the degree of load approximation, and the building's energy use was simulated for four months using OpenStudio. Specifically, the impacts of different occupancy profiles (the four modeled profiles, fixed design profiles, observation-based profiles, and actual occupancy information) on energy simulation results were evaluated. The comparison demonstrated that personalized occupancy profiles were more accurate than fixed design profiles and could be used in their place for building energy simulation. The results of occupant number profiling were also presented. Since occupancy profiles are valid for a specific period and may change with time, season and personal factors, such as changes in personal schedules, the trends in the representativeness of an occupancy profile, in terms of the degree of statistical approximation and load estimation over a period of time, should be monitored at a weekly interval. A new profile is generated by adding the new instances of one week. If the percentage of area enclosed by the actual occupancy probability that is within the confidence interval of the modeled occupancy probability continuously becomes smaller, or the deviations of the ideal energy requirement obtained with the occupancy profile continuously become larger, the occupancy profile should be remodeled following the same methodology proposed above.

Chapter 8: Occupancy and Heating/Cooling Loads for Efficiency

Through the studies in this chapter, the types of occupancy characteristics that influence building HVAC system energy efficiency are identified, and ways of modeling the relationships between occupancy characteristics and heating/cooling loads on the demand side are explored.
As mentioned in Chapter 7, I build on and use real-time occupancy data and long-term occupancy profiles obtained from the proposed occupancy models. Although occupancy is a crucial factor in determining the effective loads for heating and cooling, until the occupancy-loads relationships are clearly understood it remains unknown when and how occupancy should be integrated with HVAC system load response, and whether occupancy-based HVAC system control can consistently improve energy efficiency [240-243]. This chapter focuses on the types of occupancy-loads relationships that significantly influence HVAC system energy efficiency and on the ways of modeling them. Specifically, two relationships have been identified: occupancy transitions, which represent the switch between real-time occupied and unoccupied statuses, and occupancy diversity, which represents the difference in long-term occupancy. The objective of this chapter is not to provide conclusions for any specific building, case, or amount of energy savings, but rather to investigate the relationships and evaluate the reliability of the proposed frameworks for modeling occupancy-load relationships for energy efficiency. Energy efficiency, as defined in this dissertation, incorporates both the conditioning miss, which is the length of the period when a space is occupied but the temperature is out of range from the setpoint, and the energy reduction, which is the absolute amount of energy savings. As introduced in Chapter 3, operation of an HVAC system is triggered as a response to temperature changes relative to the setpoints. The setpoint is the terminal-level temperature setting in each zone. It regulates the desired temperature range (i.e., the deadband) and is the primary parameter for terminal control.
Since HVAC terminals respond to the loads through the control of setpoints, the setpoint is used as the medium in this dissertation to implement occupancy-load relationship investigations.

8.1. Occupancy Transitions and Loads

For building HVAC systems, the amount of heating and cooling loads depends on the level of thermal condition that is required. As introduced earlier, occupancy transitions are the switches between different occupancy states, thus not all of the loads are effective for heating and cooling when the zone is not occupied. During an unoccupied period, allowing the setpoint to float to a different temperature, which is defined as the setback (higher than the setpoint when cooling is required, and lower than the setpoint when heating is required), could potentially reduce heating/cooling loads and improve an HVAC system's energy efficiency. Generally, heating/cooling loads of the setpoint control can be divided into four periods based on occupancy transitions (Figure 34):

(1) Setpoint period: heating/cooling is provided to maintain the setpoint, loads are considered effective;
(2) Float period: heating/cooling is not provided and the temperature is allowed to float from setpoint to setback, loads are considered not effective and only minimum air flow is supplied per the ASHRAE requirement;
(3) Setback period: heating/cooling is provided to prevent room temperature from exceeding the setback, loads are considered effective. To be clear, setback period does not mean the setback must be maintained. Setback acts as a threshold to keep temperature within the range between the setpoint and setback;
(4) Reconditioning period: heating/cooling is provided to restore the temperature from setback or the floating point to setpoint, loads are considered effective.
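The four periods above can be illustrated with a minimal sketch for the cooling case. This is not the dissertation's implementation: the function name, its arguments, and the rule used to detect reconditioning (occupied while the zone is still warmer than the setpoint plus deadband) are assumptions made for illustration only.

```python
from enum import Enum


class Period(Enum):
    SETPOINT = "setpoint"
    FLOAT = "float"
    SETBACK = "setback"
    RECONDITIONING = "reconditioning"


def classify_period(occupied, minutes_vacant, zone_temp,
                    setpoint, setback, wait_minutes, deadband=1.0):
    """Classify one time step for the cooling case (setback > setpoint).

    All thresholds are hypothetical inputs; temperatures share one unit.
    """
    if occupied:
        # Assumed rule: occupied but still above the comfort band means
        # the zone is being restored to the setpoint.
        if zone_temp > setpoint + deadband:
            return Period.RECONDITIONING
        return Period.SETPOINT
    if minutes_vacant < wait_minutes:
        # Waiting time (setpoint/setback schedule) has not elapsed yet.
        return Period.SETPOINT
    if zone_temp >= setback:
        # Cooling resumes only to keep the zone from exceeding the setback.
        return Period.SETBACK
    return Period.FLOAT


def loads_effective(period):
    """Only the float period is treated as not effective."""
    return period is not Period.FLOAT


print(classify_period(False, 20, 75.0, 73.0, 78.0, 15))  # Period.FLOAT
print(classify_period(True, 0, 77.0, 73.0, 78.0, 15))    # Period.RECONDITIONING
```

In this sketch the waiting time plays the role of the setpoint/setback schedule, and the gap between `setback` and `setpoint` plays the role of the setpoint/setback distance.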
Figure 34: Relationship between occupancy transitions and heating/cooling load transitions [diagram: setpoint, float, setback, and reconditioning periods mapped onto occupied/unoccupied/occupied states and effective/not effective loads, with the setpoint/setback schedule and setpoint/setback distance marked; no prediction]

Therefore, the relationship between occupancy transitions and heating/cooling loads could be reformulated as the relationship between transitions of occupied/unoccupied states and transitions between loads that are actual demands for heating/cooling and loads that do not need to be addressed by an HVAC system (Figure 34). Conceptually, the waiting time to start the float period after a zone becomes unoccupied (defined as the setpoint/setback schedule) and the absolute value of setback minus setpoint at which the setback period starts (defined as the setpoint/setback distance) determine the deviations between occupancy transitions and load transitions. Different combinations of setpoint/setback schedules and distances may lead to different levels of energy efficiency. To demonstrate this point, we simulated a testbed office building in a study [6], where we used 73 F as the static setpoint (based on the ASHRAE comfort compliance) when the zone was occupied. The interval for setpoint/setback distance was 1 K and the interval for setpoint/setback schedule was 5 minutes. The conditioning miss and energy reduction for the third floor of the building were simulated for a year, from March 2014 to March 2015. Conditioning miss was the length of time when the space was occupied but the temperature was outside the comfort range (defined as 73 F with a ±1 K deadband). Energy reduction was calculated as the absolute amount of energy savings. The results are visualized (Figure 35) using gray maps (percentage of energy reduction and percentage of conditioning miss compared to the baseline control, which maintains a setpoint of 73 F for the working hours of 6:30AM - 9:30PM on workdays and 7:00AM - 9:30PM on weekends).
The darker the color is, the more energy reduction and the less conditioning miss were achieved.

Figure 35: Energy implication of different combinations of setpoint/setback schedules and distances [two gray maps, Energy Reduction (%) and Conditioning Miss (%), each plotted over setpoint/setback schedule (min) and setpoint/setback distance (K)]

It can be seen from the simulated results that the setpoint/setback schedule and distance are important for reducing heating/cooling energy use while maintaining occupant comfort. Generally, when a zone becomes unoccupied, instant control adjustments may cause discomfort, as an occupant may reoccupy his/her office soon after leaving it. Sometimes reconditioning after a relatively short period of vacancy may consume more energy than just maintaining the setpoint. In particular, when the setback is far from the setpoint, the large transient loads could result in additional energy consumption and additional time to recover from the setback. We demonstrated through the analyses of different combinations in [7] that the transitions of loads do not necessarily follow the transitions between the occupied/unoccupied states. A portion of the loads during unoccupied periods should be considered effective for improving energy efficiency (Figure 34). The boundaries of these portions are represented by the setpoint/setback schedules and distances, and they determine the relationships between occupancy transitions and heating/cooling loads. Since errors might be introduced when prediction is used to pre-recondition the space before an occupant occupies his/her office again, the transitions between unoccupied/occupied states are considered the same as the transitions between setback and reconditioning periods (Figure 34), which eliminates that uncertainty from the analysis of the occupancy transitions-loads relationship.

8.1.1. Multi-zone Occupancy Transitions Analysis

As described above, the time lag before the temperature is allowed to float and the limit of the float (15 minutes and 78 F, respectively, in the previous simulation) determine the transitions between effective and ineffective loads. They are called setpoint/setback schedules (e.g., waiting time to trigger setback) and distances (difference between setpoint and setback) in this dissertation. Conceptually, different combinations of setpoint/setback schedules and distances may lead to significantly different levels of energy efficiency. The synergetic effects of setpoint/setback schedules and distances are investigated in order to link them with heating/cooling loads. Based on ASHRAE comfort compliance, 73 F is used as the static setpoint when the zone is occupied. The interval for setpoint/setback distance is 2 K and the interval for setpoint/setback schedule is 5 minutes. Both setpoint and setback have a deadband of ±1 K, and the conditioning miss is the length of time a zone is occupied but the temperature is outside the range of [72 F, 74 F], in order to be in compliance with the PMV. To analyze the effects of occupancy transitions on effective loads for a building with different occupants in multiple zones during a period of time, two sets of simulations were conducted for the testbed building: 1) using actual occupancy with its varying transitions (each day has different occupancy transitions), and 2) using repeated occupancy with constant transitions (each day has the same occupancy). There were N possible scenarios for the second set if the period had N days, and each scenario was called one sample, representing one possibility in which the daily occupancy transitions stayed the same for the whole period. Therefore, the first task was to test whether 1) and 2) differed significantly in the energy efficiency of different setpoint/setback schedules and distances. A process for comparison was designed and is shown in Figure 36.
First, the energy efficiency results of different setpoint/setback controls using actual occupancy transitions were obtained, and the estimation interval for each combination was calculated as within ±2.5% of the simulated energy efficiency, given the predefined 95% confidence level. Then the repeated occupancy transitions were used to replace actual occupancy transitions, and the simulation was run N times, each time using one day's occupancy for all N days. For each combination of setpoint/setback schedule and distance, the percentage of the energy efficiency levels of all samples (all possible scenarios using repeated occupancy without actual transitions) that fell within the interval obtained from the actual occupancy transitions was calculated (called the coverage percentage in this dissertation). The smaller the coverage percentage was, the more significant the influence of the occupancy transitions on the absolute energy efficiency of different schedule and distance combinations (occupancy transitions based control).

Figure 36: Comparison of energy implications between actual and repeated occupancy

It was then tested whether actual occupancy transitions and repeated occupancy transitions differed significantly in the relative performance of different setpoint/setback schedule and distance combinations. Relative performance was defined as the performance of one combination for energy efficiency compared to the best and worst combinations in the same scenario. The difference between absolute energy efficiency and relative performance was as follows: absolute energy efficiency showed the actual energy efficiency that can be improved (absolute amount of effective loads), while relative performance evaluated the differences of selected combinations from the remaining ones (relative amount of effective loads).
Even if the absolute energy efficiency levels from repeated occupancy and actual occupancy transitions were different, the combinations of setpoint/setback schedule and distance that outperformed other combinations could still be the same (e.g., [20%, 40%, 60%, 80%] and [17%, 29%, 41%, 54%] were significantly different, but they were similar after normalization). Therefore, the simulated energy implications of discrete setpoint/setback combinations were normalized by

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

to indicate the relative energy performance of different setpoint/setback controls. The same process of comparison analysis (Figure 36) was then applied to test whether there was a significant difference between actual and repeated occupancy transitions in terms of the normalized energy implications. Similarly, the coverage percentage was calculated, and the smaller the percentage was, the more significant the influence of the occupancy transitions on the relative performance of different schedule and distance combinations. The ideal loads for the third floor of the testbed building from May 2013 to April 2014 were simulated to implement the proposed methodology. One year was divided into four periods based on the outside temperature statistics (mean and standard deviation). 73 F (with 1 K as deadband) was used as the static setpoint to maintain a comfortable thermal environment when the zone was occupied. When there was no occupant, six different setback values (with 2 K as the interval) and six setback waiting times (with 5 min as the interval) were combined. In this dissertation, energy reduction and conditioning miss were considered equally important, thus the energy efficiency was expressed as 50%*(energy reduction)+50%*(conditioning miss). The period of N days for testing the energy significance of occupancy transitions was three months long, and therefore there were four periods with which to analyze the consistency of the findings.
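The normalization and the coverage percentage described above can be illustrated with a short sketch. The function names are hypothetical, and the interval is assumed here to be the reference value plus or minus 2.5 percentage points; the text does not state whether the ±2.5% band is absolute or relative.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] using x* = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all combinations perform identically.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def coverage_percentage(reference, samples, half_width=2.5):
    """Share (%) of sample values inside reference +/- half_width.

    Assumption: the band is expressed in the same units as the values.
    """
    inside = sum(1 for s in samples if abs(s - reference) <= half_width)
    return 100.0 * inside / len(samples)


# Example from the text: different absolute levels, similar relative ranking.
actual = [20, 40, 60, 80]
repeated = [17, 29, 41, 54]
print([round(v, 2) for v in min_max_normalize(actual)])    # [0.0, 0.33, 0.67, 1.0]
print([round(v, 2) for v in min_max_normalize(repeated)])  # [0.0, 0.32, 0.65, 1.0]
```

After normalization the two series differ by at most about 0.02, which is why the relative performance can agree even when the absolute energy efficiency levels do not.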
For each period, the influences of occupancy transitions on the energy efficiency of different setpoint/setback schedules and distances were first investigated. Actual occupancy transitions and repeated occupancy transitions were used, and the statistics of coverage percentages for all 36 combinations are shown in Table 13. The differences in energy efficiency caused by varying occupancy transitions were around 60%, indicating that the varying transitions of occupant presence had a significant influence on the absolute energy efficiency of different setpoint/setback schedules and distances. The normalized energy implications of different setpoint/setback controls from actual occupancy and repeated occupancy were then compared to investigate the influences of varying occupancy transitions on the relative performance of different setpoint/setback schedules and distances. From the statistics of coverage percentages for all 36 combinations in Table 13, it can be seen that the differences in normalized energy efficiency caused by actual occupancy transitions were below 15%, indicating that the actual occupancy transitions did not have a significant influence on the relative performance of different setpoint/setback schedules and distances. The rankings of combinations in terms of energy efficiency remained the same.

Table 13: Coverage percentages of energy efficiency for all combinations

                 Energy Efficiency              Relative Performance
Period           Mean   SD    Max.   Min.       Mean   SD    Max.   Min.
May-Jul (1)      45%    19%   75%    21%        94%    3%    98%    88%
Aug-Oct (2)      38%    15%   69%    20%        90%    4%    95%    85%
Nov-Jan (3)      42%    18%   71%    17%        91%    3%    97%    87%
Feb-Apr (4)      40%    16%   73%    19%        92%    3%    96%    88%

The coverage percentages of relative performance for the four periods were then statistically compared using paired t-tests. The results (Table 14) showed that the coverage percentages of relative performance were significantly different for different periods.
Besides, the four periods had different outside temperature statistics, and it was found that the difference of average temperature between the two periods was approximately linearly associated with the difference of coverage percentage between the same periods. The joint effects of outside temperature and the variant occupancy transitions on energy efficiency were more significant than on relative performance of setpoint/setback schedule and distance combinations.

Table 14: Statistical analysis of coverage percentage differences between periods

Relative Performance Comparison    Sig.
Period 1-2                         0.05
Period 1-3                         0.02
Period 1-4                         0.01
Period 2-3                         0.00
Period 2-4                         0.00
Period 3-4                         0.03

Figure 37: Relationship between average temperature and difference of coverage percentage

8.1.2. Occupancy Transitions based Setpoint Control

Finally, in order to quantify the relationship between occupancy transitions and loads, a data-driven approach is developed to minimize heating/cooling loads by searching the optimized setpoint control for each zone, based on the analysis of occupancy data and simulated energy data. The corresponding loads are then compared with the baseline control to understand the relationships between occupancy transitions and heating/cooling loads. To make it generalizable, we assumed that all of the zones have different combinations of setpoint/setback schedules and distances, and the scenario that all of the zones in a building share the same combination is considered as one possible solution among many. Generally, zones consist of more than one room and zone level occupancy transitions are usually obtained by aggregating the occupancy state of each room in that zone: (1) if at least one room in a zone is occupied, the zone is occupied; (2) if all rooms are vacant, the zone is unoccupied. In this chapter, the impact of different occupant numbers in each room is included as a time-variant coefficient to the state of presence for adjusting heat gains.
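The room-to-zone aggregation rule above can be sketched as follows. The function names and the dictionary layout are assumptions for illustration; the headcount series is only one possible way to represent the time-variant occupant-number coefficient, whose exact form is not specified here.

```python
def zone_states(room_states):
    """Aggregate room presence flags into a zone-level series.

    room_states: dict mapping a room name to a list of 0/1 flags,
    one per time step (hypothetical layout). A zone is occupied at a
    time step if at least one of its rooms is occupied.
    """
    return [int(any(step)) for step in zip(*room_states.values())]


def zone_headcount(room_counts):
    """Total occupants per time step, usable as a scaling coefficient
    for occupant heat gains (an assumed interpretation)."""
    return [sum(step) for step in zip(*room_counts.values())]


rooms = {"room_a": [0, 1, 0, 0], "room_b": [0, 1, 1, 0]}
print(zone_states(rooms))  # [0, 1, 1, 0]
```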
If there are N zones and T different combinations of setpoint/setback schedules and distances for each zone, there are T^N possible solutions in the (N-dimensional) solution space to simulate and evaluate. An exhaustive search is not feasible for large buildings when periodic (e.g., monthly or weekly) analysis is required, as the computational time and the number of simulations increase exponentially. For this combinatorial optimization problem, algorithms that use heuristics have the potential to search the solution space partially and find near-optimal solutions in a reasonable time and with reasonable computational power. However, in the building energy domain, the heating/cooling loads of the HVAC system have to be calculated for each solution through a simulation program or state-space thermal models, which are stand-alone processes in addition to the optimization process and are therefore time consuming. In addition, the loads of different solutions are discrete and independent, making the search speed and precision much lower than in equation-based optimization problems; thus the performance of commonly used optimization techniques is expected to be lower. Specifically, Basic Load Search and Hill Climbing, which search in a greedy way, may easily reach local optima and stop searching. The genetic algorithm is influenced by the selection of parameters, such as the mutation and crossover rates, and is ineffective for local search and inefficient for precise search. Simulated Annealing has little information about the search space and cannot target the subsets of the solution space that have a higher chance of containing optimal solutions, which may further reduce the convergence speed. Tabu Search could avoid repeated searches of local optima. However, the performance of the search also depends on the selection of the initial solution.
The speed of search, when Tabu Search is used, is also slow and not feasible for setpoint control optimization in multi-zone buildings. To solve these issues, we developed an Enhanced Variable Neighborhood Search (EVNS) algorithm for determining the optimal combinations of setpoint/setback schedules and distances for multiple zones. In the EVNS, the local search is conducted in the neighborhood of solutions, and the search is then moved to other subsets of the solution space based on the heuristics of the present global optima, to avoid repeated search and local optima. The EVNS incorporates the Variable Neighborhood Search (VNS) and a heuristic update. The basic assumption of the VNS is that the local optimum under all possible neighborhood structures is considered the global optimum. To be clear, one solution x here is one set of combinations of setpoint/setback schedules and distances for all zones, and X is the solution space of all possible solutions. f(x) is the fitness of the solution, which is the heating/cooling loads of one set of combinations, calculated using energy simulation. The neighborhood structure is denoted as N_k (k = 1, 2, …, k_max), and N_k(x) is the set of all solutions in the k-neighborhood of x. When there is no solution x' ∈ N_k(x) ⊆ X such that f(x') < f(x), x is considered the local optimal solution. Specifically, the 1-neighborhood is defined as one step (in either schedule or distance) from x, the 2-neighborhood is one step in both schedule and distance from x, the 3-neighborhood is two steps in either schedule or distance from x, and so on. A building with five zones is used as an example to illustrate the neighborhood structure (Figure 38). If the k-neighborhood of x is outside the boundaries of possible schedules or distances, it stays on the boundaries.
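As a minimal sketch of the 1-neighborhood and of the size of the solution space, the snippet below represents a solution as one (schedule index, distance index) pair per zone. The grid values, the function name, and the choice to move one zone at a time are assumptions for illustration; the text does not state whether a step applies to a single zone or to all zones at once.

```python
SCHEDULES_MIN = [0, 5, 10, 15, 20, 25, 30, 35]   # assumed waiting times (min)
DISTANCES_K = [1, 2, 3, 4, 5, 6, 7, 8]           # assumed setback distances (K)


def one_neighborhood(solution):
    """Solutions one grid step away in schedule or distance for one zone.

    solution: tuple of (schedule_idx, distance_idx) pairs, one per zone.
    Steps that would leave the grid are clamped to the boundary, and
    candidates identical to the input are dropped.
    """
    neighbors = set()
    for z, (s, d) in enumerate(solution):
        for ds, dd in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ns = min(max(s + ds, 0), len(SCHEDULES_MIN) - 1)
            nd = min(max(d + dd, 0), len(DISTANCES_K) - 1)
            candidate = solution[:z] + ((ns, nd),) + solution[z + 1:]
            if candidate != solution:
                neighbors.add(candidate)
    return neighbors


combos_per_zone = len(SCHEDULES_MIN) * len(DISTANCES_K)
print(combos_per_zone)        # 64 combinations per zone
print(combos_per_zone ** 5)   # 1073741824 solutions for five zones

start = ((3, 4),) * 5         # e.g. 15 min / 5 K shared by all five zones
print(len(one_neighborhood(start)))  # 20
```

With 64 combinations per zone, even five zones give more than a billion candidate solutions, which illustrates why an exhaustive search with one energy simulation per solution is impractical.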
Figure 38: Neighborhood of solutions for the small reference building with five zones [diagram: per-zone grid of setpoint/setback distance (K) versus setpoint/setback schedule (min), showing zone adjacency, the 1-neighborhood, and the 2-neighborhood]

The VNS process is then set as follows:

Step 1: Select an initial solution x_0 and set it as the current optimal solution: x* ← x_0, T = N_k(x*), k ← 1;
Step 2: Select a solution set S (S = (T+1)/2) from T and get the optimal solution x^+ until T\{x*} = Ø. If f(x^+) < f(x*), x* ← x^+ and T = N(x*); otherwise T ← T\S;
Step 3: Repeat Step 2 until all the solutions in the neighborhood are searched or the maximum time (including simulation time) is reached, then k ← k+1.

In this chapter, only the 1-neighborhood is considered for the local search, to reduce the time, the complexity, and the possibility of missing the optimum. However, this significantly increases the workload of the local search for the global optimum, as it covers one small neighborhood at a time and can easily end up as a random search because the properties of the solution space are unknown. To speed up the convergence, heuristics are integrated with the VNS to guide the search to other subsets of the solution space, which is called the heuristic update in this chapter. To conduct the heuristic update, m solutions are generated simultaneously and updated towards the current global optima based on the amount of heating/cooling loads. Specifically, the combination of setpoint/setback schedule and distance for each zone is updated towards both the combination with the minimum zone level loads for the same zone and the combination in the solution with the minimum building level loads (Figure 39). The ith (i ≤ m) solution can be denoted as X_i = (x_i1, x_i2, …, x_iN) to indicate the combination for each zone and its location in the solution space. The solution consisting of the best combinations for each zone with the minimum zone level loads is expressed as Z_i = (z_i1, z_i2, …, z_iN), abbreviated as zbest.
The solution consisting of the combinations with the minimum building level loads is expressed as B_i = (b_i1, b_i2, …, b_iN), abbreviated as bbest. In each iteration, the setpoint/setback schedule and distance for the kth zone (1 ≤ k ≤ N) of the ith solution are updated based on the equation:

$$x_{ik}^{+1} = a\,x_{ik} + c_1\,\mathrm{rand}()\,(zbest_{ik} - x_{ik}) + c_2\,\mathrm{rand}()\,(bbest_{ik} - x_{ik})$$

in which the superscript +1 marks the updated value, a, c_1, and c_2 are non-negative constants that adjust the step length towards zbest and bbest, and rand() is a random number in [0, 1].

Figure 39: Heuristics for solutions to update [two panels of setpoint/setback distance (K) versus setpoint/setback schedule (min): combination before update and combination after update, with the best combination at zone level and the best combination at building level marked]

The steps for the heuristic update are set as follows:

Step 1: Calculate the loads for each of the m solutions based on the energy simulation;
Step 2: Compare the loads of each combination in each solution with zbest; if its loads are lower, the combination in zbest is replaced by the combination in this solution;
Step 3: Analyze the trajectory of the heuristic update for each solution and compare the best one with bbest; if it is better, bbest is replaced by the solution;
Step 4: Calculate the update of each solution based on zbest and bbest;
Step 5: If the loads after the heuristic update reach the preset load level (e.g., 15% load reduction), stop the update; otherwise go back to Step 1 until the maximum number of iterations (e.g., 100) is reached.

The EVNS can then be formed by integrating the heuristic update with the variable neighborhood search to increase the convergence speed and accuracy of the global optimum search. The basic idea is first to conduct a local search in the 1-neighborhood of the initial solution. Then a number of solutions in the neighborhoods of the local optimum are selected for the heuristic update until an optimal solution is obtained. The two optima are compared: if the new optimum is better than the local optimum, the entire process is iterated for the new optimum; otherwise the neighborhood is varied to generate new solutions and repeat the heuristic update. The detailed steps for one iteration are set as follows:

Step 1: The initial solution x is set as the best combination shared by all zones (all zones have the same combination of setpoint/setback schedule and distance). The neighborhood structure is set as N_k (k = 1, 2, …, k_max). k_min and k_max are set as the minimum and maximum neighborhood, respectively, k_step is the step length for varying the neighborhood in each iteration, and t_max is the maximum computation time;
Step 2: Search for the local optimum x_best in the 1-neighborhood of x and consider it the current optimal solution;
Step 3: Set k = k_min;
Step 4: Randomly select m solutions x_i' (i = 1, 2, …, m) in the k-neighborhood of x_best, x' ∈ N_k(x_best), and implement the heuristic update to find the optimal solution x';
Step 5: Use x' as the initial solution and search its 1-neighborhood to find the local optimum x''. If f(x'') < f(x), set x = x'' and go back to Step 3; otherwise set k = k + k_step and go back to Step 4;
Step 6: Iterate the above steps until no better solution can be found or t_max is reached.

8.1.3. Algorithm Validation Results

To test and evaluate the proposed EVNS algorithm, the small-size reference office building, with actual occupancy data collected from the third floor of the real-world office building, was used to test the occupancy transitions based setpoint control. During the on-hour period (6:30AM - 9:30PM on workdays, and 7:00AM - 9:30PM on weekends), 73 F was used as the static setpoint to maintain a comfortable thermal environment when the zone was occupied. When the zone was vacant, the temperature was allowed to float until it reached the setback before the zone was occupied again. Eight different setback values (with 1 K as the interval, from 74 F to 81 F) for the setpoint/setback distance and eight waiting times (with 5 min as the interval) for the setpoint/setback schedule were used to create 64 combinations. During the off-hour period, no cooling or heating services were provided. Only the minimum airflow was maintained to satisfy the ASHRAE compliance [30]. The EVNS was implemented to determine the setpoint control for the five zones. For each solution in each iteration, the period from March 2014 to March 2015 was simulated to output the heating/cooling loads. The performance of the EVNS was evaluated by analyzing the convergence of the maximum heating/cooling load reduction over the iterations. As a proof of concept, the entire process was repeated independently five times (trials), with different occupants being randomly selected (among 28 possible occupants) and assigned to the reference building's zones (Figure 40). The initial condition for this analysis was a setpoint/setback schedule of 15 minutes and a setpoint/setback distance of 5 K for all five zones (73 F setpoint; 78 F setback).
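For reference, the heuristic update equation from Section 8.1.2 can be sketched as below. This is an illustrative reading, not the dissertation's code: solutions are stored as grid indices per zone, the updated values are rounded and clamped back onto an assumed 8 x 8 grid (a step the text does not describe), and the default values of a, c1, and c2 are hypothetical.

```python
import random

GRID_MAX = 7  # assumed: indices 0..7 for both schedule and distance


def heuristic_update(x, zbest, bbest, a=1.0, c1=0.5, c2=0.5, rng=random):
    """Move each zone's (schedule, distance) pair towards zbest and bbest.

    x, zbest, bbest: lists of (schedule_idx, distance_idx) pairs per zone.
    Implements x_new = a*x + c1*rand()*(zbest - x) + c2*rand()*(bbest - x)
    per coordinate, then rounds and clamps to the grid (an assumption).
    """
    updated = []
    for x_zone, z_zone, b_zone in zip(x, zbest, bbest):
        new_zone = []
        for xv, zv, bv in zip(x_zone, z_zone, b_zone):
            v = (a * xv
                 + c1 * rng.random() * (zv - xv)
                 + c2 * rng.random() * (bv - xv))
            new_zone.append(min(max(round(v), 0), GRID_MAX))
        updated.append(tuple(new_zone))
    return updated


rng = random.Random(0)  # seeded generator so the sketch is repeatable
current = [(0, 0), (3, 4)]
zone_best = [(4, 4), (3, 4)]
building_best = [(6, 6), (3, 4)]
print(heuristic_update(current, zone_best, building_best, rng=rng))
```

With a = 1 and c1 + c2 ≤ 1, each coordinate stays between its current value and the larger of the two targets, so a zone that already matches both zbest and bbest is left unchanged.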
Figure 40: Trace progress of the maximum heating/cooling load reduction in 5 independent trials of search with different occupancy assignments

Based on the results of plotting the maximum heating/cooling loads for all iterations, the optimal combinations of setpoint/setback schedules and distances for each zone were identified to minimize heating/cooling loads at the building level. Although the difference in occupancy transitions (resulting from different occupancy assignments) influences the speed of convergence, the global optimum (the solution with the global maximum heating/cooling load reduction) was found within 470 iterations. As a result, the required simulations can be completed within a reasonable time and with reasonable computational power. To further investigate how the initial solution affects the search performance, different initial solutions (5 trials) other than the shared combination for all zones were tested, and the results are plotted in Figure 41. The initial solutions for this analysis were five random combinations of setpoint/setback schedules and setpoint/setback distances for the five zones (including 15 min/5 K, 10 min/7 K, 5 min/7 K, 20 min/3 K, and 10 min/4 K). The occupancy information for all five trials was the same. It can be seen that the selection of the initial solution does not significantly influence the convergence speed, and the level of maximum heating/cooling load reduction based on occupancy transitions remains the same. The trajectories of search in the solution space were different for different initial solutions, but they all approached the neighborhood of the global optimum at around 200 iterations.
Figure 41: Trace progress of the maximum heating/cooling load reduction in 5 independent trials of search with different initial solutions

Lastly, the performance of the EVNS was compared with that of randomly formed combinations of setpoint/setback schedules and distances (defined as random solutions) for different zones, to examine whether the solution from the EVNS could be considered the global optimum or only a local optimum. 500 solutions were generated and the corresponding heating/cooling loads were compared with the loads after implementing the EVNS (Figure 42). The results are presented as the percentages of the load difference between the random solutions and the solution from the EVNS, which is denoted as the line Y=0. Random combinations resulted in increases of heating/cooling loads from 0.2% to 25.8%. The difference was approximately Gaussian distributed with a mean of 11.8% and a standard deviation of 5.1%, indicating that the majority of the solutions had a similar ability to reduce a certain amount of loads by not providing heating and cooling during some of the unoccupied periods. No randomly generated solution outperformed the solution generated by the EVNS in reducing the heating/cooling loads at the building level.

Figure 42: Load differences between random solutions and calculated global optimum (denoted as the line: Y=0)

In summary, it is demonstrated that the EVNS is effective in determining the optimal combinations of setpoint/setback schedules and distances for different zones based on the occupancy transitions. The maximum heating/cooling load reduction could be achieved at the building level, the amount of which represents the relationship between the occupancy transitions and heating/cooling loads for a certain period of time. Energy efficiency (energy reduction and conditioning miss) for a specific HVAC system (e.g.
HVAC VAV system) could then be improved by integrating the setpoint/setback control with specific terminal control parameters.

8.1.4. Transitions-Loads Analysis Results

The EVNS was then implemented in a simulation using the testbed building as a case study to quantify the relationships between occupancy transitions and heating/cooling loads at the building level. In other words, we performed an occupancy-loads analysis to determine the setpoint control for all individual zones that could minimize heating/cooling loads based on occupancy transitions. The EVNS was used to search for the optimal combinations of setpoint/setback schedules and distances for all 16 zones (28 rooms) on the third floor of the building. Since the EVNS could be used to quantify the occupancy transitions-loads relationship for any period of time, a monthly analysis was conducted in this case study, to differ from the reference building. The period of one year from March 2014 to March 2015 was divided into 12 months. For each month, the EVNS was run independently with the actual occupancy information (from the global occupancy modeling algorithm). As in the reference building, during the on-hour period (6:30AM - 9:30PM on workdays, and 7:00AM - 9:30PM on weekends), a setpoint (73 F) was maintained while the zone was occupied; when the zone had been unoccupied for more than a given number of minutes, the temperature was allowed to float until it reached the setback, and it returned to the setpoint whenever the space became occupied again. During the off-hour period, no cooling or heating services were provided. Only minimum airflow was maintained to satisfy the ASHRAE compliance [30]. Again, eight different setback values (with 1 K as the interval) for the setpoint/setback distance and eight waiting times (with 5 min as the interval) for the setpoint/setback schedule were combined. An example of the combinations for all of the zones during one month (e.g., April) is shown in Figure 43.
Figure 43: Setpoint/setback schedules and distances of all zones in the real-world building for the month of April

The heating/cooling loads after implementing the EVNS were compared with the loads under baseline control. The results are shown in Figure 44, in which the performance of baseline control is denoted as the line Y=0.

Figure 44: Monthly load differences of implementing the EVNS (yellow dots) and shared combinations (blue dots) compared to baseline control in the office building

It can be seen that a minimum of 10.4% and a maximum of 28.3% monthly load reduction were achieved by the EVNS. Since Los Angeles is in a cooling-dominant climate zone, more load reduction was expected during the warm seasons than the cold seasons. During the summer vacation (from mid-May to mid-August) and winter recess (from mid-December to mid-January), more loads were reduced as occupied periods were generally shortened, especially on hot days when more cooling was required to maintain the desired indoor thermal conditions while the spaces were not always occupied. The load reduction was greater than that from implementing a shared combination of setpoint/setback schedules and distances (selected from 64 combinations for each month) for all of the zones, demonstrating that the EVNS is effective in searching for combinations that could quantify the relationship between occupancy transitions and heating/cooling loads at the building level. Next, the zones on the third floor were assigned random solutions of setpoint/setback schedules and distances, and the loads were compared with the loads from implementing the EVNS for the entire year. This process was repeated 100 times, and the corresponding loads were all greater than the loads obtained by implementing the EVNS (Figure 45). The setpoint/setback schedules and distances selected by the EVNS for all individual zones minimized heating/cooling loads based on occupancy transitions, similar to the findings in Chapter 8.1.3.
Figure 45: Load differences between random solutions and calculated global optimum (denoted as the line: Y=0)

8.1.5. Summary

In this chapter, a data-driven approach was introduced using an enhanced variable neighborhood search algorithm to determine setpoint/setback schedules and distances for all individual zones based on occupancy transitions, by which heating/cooling loads could be minimized. Given certain occupancy data and weather, the difference in the loads resulting from the proposed approach and the baseline control shows the quantitative relationship between occupancy transitions and heating/cooling loads for a certain period of time. The small size reference building was used to validate the performance of integrating variable neighborhood search with a heuristic update for finding the optimal solution. It was demonstrated that the convergence of the search was not influenced by different occupancy assignments or initial solutions, and that no random solution could outperform the proposed approach in reducing the heating/cooling loads based on occupancy transitions at the building level. The testbed building was also simulated as a case study to show how to apply the EVNS to a real-world occupancy transitions-loads analysis. A minimum of 10.4% and a maximum of 28.3% load reduction were achieved compared to the baseline control by setting different zones with optimized combinations of setpoint/setback schedules and distances. It is important to note that, in this chapter, the setback was set to be higher than the setpoint, as cooling is dominant and heating is seldom provided in Los Angeles. Yet, this assumption does not influence the way the algorithm works, and the EVNS could also be used in heating-dominant areas as well as in heating/cooling mixed areas.

8.2. Occupancy Diversity and Loads

Since zones usually consist of more than one space, if only one space in a zone is occupied, heating/cooling is required for the entire zone, and the loads of the zone are the sum of the loads in all spaces of that zone. Considering that different spaces may have different or in some cases inverse occupancies [244], simply aggregating disparate occupancy information of different spaces might create an inaccurate representation of how each zone is occupied, which may lead to unnecessary heating/cooling loads and further reduce energy efficiency. In addition, a building usually has multiple zones, and there exist heat transfer and balance among the zones. Loads in one zone could increase because of the different thermal conditions of neighboring zones resulting from occupancy diversity. A quantitative study measuring the amount of heating/cooling loads associated with occupancy diversity is still needed [245], and it is still not clear how occupancy diversity at the building level quantitatively influences the energy efficiency of an HVAC system. In this chapter, a framework is introduced to analyze the energy implications of diversity at the building level, with the following factors being considered:

(1) HVAC layout: commercial buildings may be segmented and served by different sets of HVAC systems or secondary systems (e.g., air handling units). The HVAC layout determines the zones with shared supply air.

(2) Zone adjacency: adjacent zones share boundaries, and there are load exchanges through heat transfer and balance among the zones when there is a temperature difference or mutual ventilation. If the adjacent zones have distinct schedules and requirements for heating/cooling, excessive energy might be consumed due to the thermal circulation.

(3) Orientation: zones with the same orientation usually have similar boundary conditions and are therefore impacted similarly by the outside environment.

8.2.1. Methodology for Eliminating Occupancy Diversity

The objective of quantifying the energy implications of occupancy diversity is to rearrange heating/cooling loads by virtually rearranging occupancies until the diversity is eliminated. The occupancy profile, i.e., the typical presence probability as a function of time, which represents long term occupancy patterns, is used as a measure to calculate the level of diversity. There might be more than one profile representing one space (e.g., different profiles for an occupant for different days of the week), and there might be changes in a profile. If a space has more than one profile, for each time point, the highest probability among the profiles is chosen to account for the majority of the time. The process is restricted by hierarchical constraints, which represent how a building is utilized. After higher-level constraints are satisfied, the lower-level constraints are included. If there is a conflict between the two sets of constraints, higher-level constraints are given priority. In this chapter, primary constraints are individual requirements, e.g., different room sizes for different occupants, and group requirements, e.g., occupants of the same department should be spatially close. To be clear, individual requirements are for individual spaces and mainly consider physical conditions and preferences, such as orientation, while group requirements are for connections between spaces and the functionality of spaces, e.g., the administration offices of a department should be next to each other. Secondary constraints require occupants with similar occupancy profiles to be virtually placed in the same zones so that the occupants of a zone have similar presence patterns, eliminating the occupancy diversity at the zone level.
The third level constraints require similar occupancy profiles in the connected zones, including zones under the same (secondary) HVAC systems, zones adjacent to each other, and zones with the same orientation, to further eliminate the occupancy diversity at the building level. It is important to note that the consideration of these influential factors does not change the HVAC layout, zone adjacency, and orientation, but it guides the virtual change of occupant-space relationships for rearranging the loads associated with occupancy diversity.

An agglomerative hierarchical clustering algorithm is designed to cluster occupancy profiles based on their similarities while considering the connectivity between the clusters. Occupancy profiles are derived from real time occupancy information [29] by analyzing the patterns of observable contextual information, such as CO2 concentration, temperature and lighting levels [30]. In Chapter 7.3, four techniques, including the ARMA (AutoRegressive-Moving-Average) time series model, the ANN (Artificial Neural Network) pattern recognition model, the MC (Markov Chain) stochastic process model, and the GLM (Generalized Linear Model), were tested. ARMA yielded the best results for modeling personalized occupancy profiles by analyzing the ambient environment and previous occupancy information, and outperformed other methods commonly used in practice and research in terms of both statistical occupancy approximation and load approximation. Therefore, the ARMA algorithm is used to prepare occupancy profiles. Actual occupancy has time continuity, which could be undermined by outliers in the ground truth and by the impacts of irregular occupancies. In addition, the conditioning effects of heating/cooling are not instantaneous, and it takes time to reach the desired temperature. Therefore, occupancy profiles should be further adjusted on a time-window basis in order to be more representative.
Specifically, sliding windows are defined to segment the profiles by time windows with overlaps. The averaged presence probability within each window is then used as the feature for this window to form a new feature vector (updated profile) for calculating the level of occupancy diversity. This logic operation generalizes the original feature information while reducing the dimension of the feature vector, which improves the computational efficiency and the reliability of comparing the similarity among different occupancy profiles. The clustering process starts by assigning each updated profile to an individual cluster, each containing only one profile. The Minkowski distance is used to calculate the similarity between two profiles, as it is used as a general function to measure distance in clustering [31].

Minkowski distance:

$$d_{ij} = \left( \sum_{k=1}^{n} \left| x_{ik} - x_{jk} \right|^{r} \right)^{1/r}$$

In which, $d_{ij}$ is the distance between profile i and profile j; n is the vector dimension, depending on the length and overlap of the sliding window; $x_{ik}$ is the averaged probability of window k for profile i, and $x_{jk}$ is the averaged probability of window k for profile j. r is selected as 2 in this study to make the Minkowski distance the Euclidean distance. Since the primary constraints for group requirements may change the distances between profiles for addressing the second and third level constraints, the implicit information from the group requirements is used to improve the accuracy and reduce the computational complexity. A data structure is built using a heap to efficiently update the distance between the profiles. There are three parts in this structure: the first two are the names of the profiles or clusters to be paired, and the third one is the Minkowski distance of the pair. The data structure can be presented as a distance list with three columns for the three parts.
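The distance and the heap-based distance list described above can be sketched as follows. This is a minimal illustration, not the thesis implementation (which was written in Matlab): the function names and the three toy profiles are hypothetical, and the distance is stored as the first tuple element, rather than the third column, only because Python's `heapq` orders tuples by their first element.

```python
import heapq
from itertools import combinations


def minkowski(x, y, r=2):
    """Minkowski distance between two equal-length window-averaged profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of windows")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


def build_distance_heap(profiles, r=2):
    """Return a min-heap of (distance, name_i, name_j) for every profile pair."""
    heap = [
        (minkowski(profiles[a], profiles[b], r), a, b)
        for a, b in combinations(sorted(profiles), 2)
    ]
    heapq.heapify(heap)  # the smallest distance is always at heap[0]
    return heap


# Hypothetical three-window profiles, for illustration only.
profiles = {
    "room_a": [0.0, 0.5, 1.0],
    "room_b": [0.0, 0.5, 0.0],
    "room_c": [0.3, 0.9, 1.0],
}
distance_heap = build_distance_heap(profiles)
closest_distance, first, second = distance_heap[0]
print(first, second, round(closest_distance, 3))
```

With r = 2 the function reduces to the Euclidean distance used in this study. Keeping the pairs in a heap means the closest pair can be read without rescanning the whole list.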
First, based on the group requirements, if two profiles belong to the same group, which means they must be close to each other, the distance between them is set to 0 and the two profiles are combined in the distance list as a new cluster. Single linkage, also called nearest-neighbor linkage, is then applied to calculate the distances of the other profiles without group requirements and combine them into new clusters; under single linkage, the shortest distance of any inter-cluster profile pair is considered as the distance between the two clusters. All remaining pairs are searched to find the closest pair of clusters, which is merged into a single cluster. After each such merge, there is one fewer cluster. Each time a new cluster is formed, all the related clusters and distances on the distance list are updated. The distance list is updated iteratively until all of the profiles are clustered into a single cluster. The implementation of this data structure enables the calculation of the distances of all profile pairs in the first step and then the update of the distances between clusters in the subsequent clustering process.

Based on this hierarchical clustering structure, an iterative evaluation algorithm is then designed to complete the load rearrangement through virtually changing occupancies, depending on the hierarchical constraints, to eliminate diversity (Figure 46). First, all of the profiles are assigned into initial clusters based on the individual requirements (I cluster). This step might vary case by case. Meanwhile, all profiles are also gathered into initial clusters based on the group requirements (G cluster). The following steps are executed independently for the intersections of the I cluster and G cluster and for the remaining profiles. Initially, the profiles with distances of 0 are merged based on the group requirements and randomly placed in the zones that could satisfy both the individual requirements and group requirements.
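The single-linkage merging with group requirements described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names and toy profiles are hypothetical, and instead of rewriting entries in the distance list after each merge, the sketch skips heap entries whose two profiles already share a cluster. For single linkage this yields the same merge order, because the next valid smallest pair distance is exactly the nearest-neighbor distance between two clusters.

```python
import heapq
from itertools import combinations


def euclidean(x, y):
    """Minkowski distance with r = 2."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def single_linkage_merges(profiles, groups=()):
    """Return the merge sequence as (distance, members_a, members_b) tuples.

    profiles: dict mapping a profile name to its feature vector
    groups:   iterable of name sets that must stay close (distance forced to 0)
    """
    forced = {frozenset(p) for g in groups for p in combinations(sorted(g), 2)}
    heap = []
    for a, b in combinations(sorted(profiles), 2):
        same_group = frozenset((a, b)) in forced
        d = 0.0 if same_group else euclidean(profiles[a], profiles[b])
        heap.append((d, a, b))
    heapq.heapify(heap)

    cluster_of = {name: frozenset([name]) for name in profiles}
    merges = []
    while heap and len(set(cluster_of.values())) > 1:
        d, a, b = heapq.heappop(heap)
        ca, cb = cluster_of[a], cluster_of[b]
        if ca == cb:
            continue  # stale entry: both profiles were merged earlier
        merged = ca | cb
        for name in merged:
            cluster_of[name] = merged
        merges.append((d, sorted(ca), sorted(cb)))
    return merges


# Hypothetical profiles; "a" and "d" belong to the same department.
toy_profiles = {"a": [0, 0], "b": [0, 1], "c": [5, 5], "d": [9, 9]}
merge_order = single_linkage_merges(toy_profiles, groups=[{"a", "d"}])
for step in merge_order:
    print(step)
```

In this toy case the grouped pair is merged first at distance 0 even though its profiles are the least similar, which mirrors how the group requirements take priority over profile similarity.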
The heap based distance list is then used to merge two clusters one at a time according to their profile similarities. Since the capacity of each zone (numbers of spaces and occupants in the zone) does not change before or after the load rearrangement, if the new cluster contains the cluster merged in the previous step, the subsequent profiles are placed in the same zones as the existing profiles or in connected zones, depending on whether the zone capacity has been reached and how the third level constraints (zone adjacency, orientation, and HVAC layout) define the divisions of connected zones. Otherwise, the subsequent profiles are placed in the zones containing the profiles that are relatively most similar to them. Whenever two clusters are merged into one cluster, profiles within one zone are adjusted to ensure that the profiles on the zone boundaries are similar to the profiles on the boundaries of the connected zones. Finally, all occupancy profiles are virtually traversed and one trial of eliminating diversity is completed. Since the initial space for load rearrangement is randomly selected, the entire process described above is iterated with different initial selections until the ratio between the inter-zone distance (average profile distance from other zones) and the inner-zone distance (average profile distance within each zone) reaches the maximum.

Figure 46: Iterative evaluation algorithm for hierarchical clustering and elimination of occupancy diversity
[Flowchart elements: all profiles; select initial spaces; primary constraints (individual requirements, group requirements); initial clusters; heap distance list; agglomerative clustering, one cluster at a time; agglomerate with previous cluster?; secondary constraints (zone level similarity); zone capacity satisfied?; third constraints (HVAC layout, zone adjacency, orientation); adjustment to same zone, adjacent zone, or relatively most similar zone; adjustment of boundary profiles; unanalyzed profile?; calculate the ratio between inter-zone distance and inner-zone distance; unchanged?; finished]

A contribution is the integration of BIM, which provides a unique representation of the spatial relationships of occupants and improves the understanding of hierarchical constraints and load rearrangement for eliminating the diversity. The algorithm relies on BIM as a source of building information and spatial information. BIM provides a digital repository for information exchange and interoperability to facilitate design, construction, and facilities management, and nowadays more and more buildings have BIMs. Moreover, building information is important to feed into the building energy model and could further provide contextual interpretation of energy performance.

8.2.2. Validation for Quantifying Load Implications of Occupancy Diversity

In order to validate the effectiveness of the proposed framework in quantifying the HVAC system energy efficiency affected by occupancy diversity at the building level, the heating/cooling loads after implementing the framework were compared with other possible trials of eliminating diversity in the testbed building. The occupancy profiles were acquired using the global occupancy modeling algorithm and the ARMA profiling method introduced in Chapter 7. Since the sampling interval for the occupancy modeling was 3 minutes, the original occupancy profile was 480-dimensional (Figure 47). The logic operation was set to segment the 480 dimensions by a 30-minute time window with a 15-minute overlap, based on the analysis of the degree of statistical approximation for different window/overlap combinations. The period from 6:00 AM to 9:00 PM was chosen to form a 60-dimensional vector (Figure 48).
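The window averaging with these parameters can be sketched as follows. The function name and the example profile are hypothetical, and the boundary handling is an assumption: windows are taken to start every 15 minutes from 6:00 AM through 8:45 PM, so the last window extends 15 minutes past 9:00 PM. That choice is made here only because it yields the 60 values stated in the text; the source does not say how the final window is treated.

```python
def window_means(daily_profile, sample_min=3, window_min=30, overlap_min=15,
                 start_hour=6, end_hour=21):
    """Average a daily presence profile over overlapping time windows.

    daily_profile holds one presence probability per sample, starting at
    midnight (480 values for a 3-minute sampling interval).
    """
    window = window_min // sample_min                # samples per window
    step = (window_min - overlap_min) // sample_min  # samples between starts
    first = start_hour * 60 // sample_min            # sample index of 6:00 AM
    last = end_hour * 60 // sample_min               # sample index of 9:00 PM
    # Assumption: a window may start any time before 9:00 PM and run past it.
    return [
        sum(daily_profile[s:s + window]) / window
        for s in range(first, last, step)
    ]


# Hypothetical profile: certainly present from 9:00 AM to 5:00 PM, absent otherwise.
daily = [1.0 if 9 * 20 <= i < 17 * 20 else 0.0 for i in range(480)]
features = window_means(daily)
print(len(features))
```

Each entry of `features` is the mean presence probability of one 30-minute window, so a window that straddles the 9:00 AM arrival takes an intermediate value instead of jumping from 0 to 1.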
Each number in the vector was the averaged presence probability for the corresponding 30-minute time window.

Figure 47: Occupancy profiles for the 28 rooms on the third floor of the testbed building

Figure 48: Original occupancy profile and logic operations applied to form an updated occupancy profile

The spatial information was encoded for implementing the hierarchical clustering and elimination of diversity in Matlab. Whole building energy simulation was then used to calculate the heating/cooling loads before and after diversity elimination. Building geometry was built using Google SketchUp. Construction thermal properties and HVAC systems were added using OpenStudio. Equipment/appliance schedules were assumed to be the same as occupant presences. Lighting fixtures were assumed to be used if a space was occupied and artificial lighting was needed (based on the light sensor) during 6:30-10:00 and 15:30-18:00; after 18:00, lighting fixtures were assumed to be used whenever the space was occupied. All these inputs were written into the IDF file using Matlab for the EnergyPlus simulations. Based on the availability of occupancy data, only one floor was used to validate the proposed framework. Because heat transfer (conduction, convection and longwave radiation) and load exchange between two above-ground floors are relatively less significant than those through the walls, their influence on the elimination of diversity at the building level is assumed to be limited. Nevertheless, the same framework could be implemented to quantify the heating/cooling loads associated with diversity for both single-floor buildings and multi-floor buildings across floors. Primary constraints were set for individual requirements and group requirements.
The individual requirement was about space size, which cannot change by more than 20% for a given occupant, while the group requirement was that occupants of the same department be spatially close to each other. It is important to note that the selection of specific primary constraints does not influence the way of eliminating occupancy diversity, and can be varied case by case. The secondary constraint was to move similar occupancy profiles into the same zones. The occupancy profile for each room was calculated for the period of simulation (12 months from March 2014 to March 2015). The third level constraint required the occupancy profiles in the connected zones to be similar. Considering the zone adjacency, zone orientation (North, South, East and West; there is no core zone in this building), and HVAC layout, the third floor of the testbed building was divided into six connected zones (Figure 49). The capacity of each zone was not changed before or after implementing the evaluation algorithm.

Figure 49: Iterative evaluation algorithm for eliminating the occupancy diversity (different colors represent connected zones defined by the third level constraints; the floor is served by AHU 1 and AHU 2)

First, the proposed framework was implemented and compared with 100 other trials of eliminating diversity based simply on primary constraints and random occupant-space combinations, by running occupancy driven setpoint control. During the on-hour period (6:30 AM - 9:30 PM on workdays, and 7:00 AM - 9:30 PM on weekends), a setpoint (i.e., 73 °F) was maintained and allowed to float until reaching a setback (i.e., 78 °F) when the zone was unoccupied for more than 15 minutes (e.g., during lunch breaks). If the space was occupied again, the setting went back to the setpoint. During the off-hour period, no cooling or heating services were provided. Only minimum airflow was maintained to satisfy the ASHRAE compliance [32].
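The on-hour switching rule described above can be sketched as follows. This is only an illustration of the rule as stated in the text, not the control logic of the EnergyPlus model: the function name is hypothetical, the 3-minute step is taken from the occupancy sampling interval, and the sketch returns the active target temperature only. It does not model how the zone temperature floats between the two values, nor the off-hour period.

```python
def on_hour_targets(occupied, step_min=3, setpoint_f=73.0, setback_f=78.0,
                    wait_min=15):
    """Return the active temperature target (deg F) for each on-hour time step.

    occupied: sequence of booleans, one per time step, True if the zone
              is occupied during that step.
    """
    targets = []
    vacant_min = 0  # minutes the zone has been continuously unoccupied
    for present in occupied:
        if present:
            vacant_min = 0           # occupancy restores the setpoint at once
        else:
            vacant_min += step_min
        # The setback applies only after MORE than wait_min minutes of vacancy.
        targets.append(setback_f if vacant_min > wait_min else setpoint_f)
    return targets


# Two occupied steps, 21 minutes of vacancy, then the occupant returns.
presence = [True, True] + [False] * 7 + [True]
print(on_hour_targets(presence))
```

In this example the target stays at the setpoint through the first 15 minutes of vacancy, moves to the setback for the remaining vacant steps, and returns to the setpoint as soon as the zone is occupied again.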
The benchmarks for comparison were the heating/cooling loads for occupancy driven setpoint control without any occupancy-space rearrangement. The increment percentages in both heating and cooling loads after occupancy-space rearrangement, compared to the benchmarks, were calculated. Their relative performances (Figure 50) indicate whether the proposed framework could effectively eliminate diversity and quantify the energy implications related to diversity.

Figure 50: Load increments (%) of different trials of eliminating occupancy diversity (the dot with red circle is the trial resulting from the proposed framework; axes: cooling load increment percentage and heating load increment percentage)

It can be seen from the results (Figure 50) that no two trials coincided completely, indicating that, in general, the diversity was closely associated with heating/cooling loads. When heating and cooling were analyzed separately, only one trial generated slightly lower loads than the proposed framework. Considering heating and cooling together, all of the 100 random trials consumed more energy than the proposed framework (Figure 50). No random trial could outperform the iterative evaluation algorithm in eliminating the diversity. The loads for the HVAC system as actual demands at the building level were reduced (11.3% for cooling and 6.5% for heating). To further evaluate the importance of eliminating diversity at the building level, the three factors of HVAC system layout, thermal zone adjacency and orientation for defining the third constraints were added gradually when implementing the iterative evaluation algorithm. The number of factors being considered was varied, and at least fifty simulations were conducted for each number of factors to reduce the variability.
In Figure 51, the x-axis represents the load increment percentages compared to eliminating occupancy diversity at the zone level, while the y-axis displays the number of simulations for each percentage of load increment. It can be seen that the loads were generally reduced as more factors were added. It is therefore necessary to define connected zones as third constraints for eliminating occupancy diversity at the building level to improve HVAC system energy efficiency.

Figure 51: Load increments (%) of different number of factors for determining connected zones as the third constraints

For a more detailed analysis, eight possible combinations (“none”, “HVAC layout”, “zone adjacency”, “orientation”, “HVAC layout and zone adjacency”, “HVAC layout and orientation”, “zone adjacency and orientation” and “HVAC layout and zone adjacency and orientation”) were explored. Each time, one combination was considered as the third constraint to generate 20 trials of eliminating diversity at the building level. The average load increment percentage of each combination was then compared with the benchmark introduced previously. In Figure 52, the x-axis represents the combinations of factors considered as the third constraints, while the y-axis shows the load increment percentages compared to simply implementing occupancy driven setpoint control.

Figure 52: Load increments (%) of different combinations

“Zone adjacency” had the most significant influence on load increments associated with diversity. This is because adjacent zones share boundaries, there are load exchanges through heat transfer and balance among the zones, and adjacent zones are usually supplied by the same conditioned air. If two factors were considered, the combination of “zone adjacency” and “orientation” was more influential than any other two-factor combination.
One possible reason is that zones with the same orientation had similar boundary conditions and were impacted similarly by the outside environment. If all three factors were incorporated, the loads were significantly reduced (9.6%), demonstrating that all three factors were important in forming the third constraints for eliminating occupancy diversity and improving HVAC system energy efficiency.

8.2.3. Generalizability Analysis

In order to test the generalizability of the iterative evaluation algorithm, virtual reference buildings were used to test the energy performances after implementing the proposed framework compared to other possible trials for eliminating the occupancy diversity. The same weather data used for the testbed building, downloaded from the website of the Department of Energy, was used for the simulations. A total of 100 building models (20 different plans for each of the five building shapes) were generated for the validation. For each model, 4 types of simulations were performed and the results were compared to evaluate the generalizability of the iterative evaluation algorithm (Figure 53):

(1) Baseline control: The existing setpoint control without any change was simulated for each virtual reference building. The existing control assumes all thermal zones in the building to be always occupied under the on-hour mode (6:30 AM - 9:30 PM on workdays, and 7:00 AM - 9:30 PM on weekends), and a constant temperature setpoint of 73 °F is maintained.

(2) Occupancy driven setpoint control with occupancy diversity: The occupancy driven setpoint control without any change to occupancy was simulated for each virtual reference building. During the on-hour mode, a constant temperature setpoint of 73 °F is enforced for occupied zones. If a zone is vacant for a minimum of 15 minutes, the setback of 78 °F is triggered until it is occupied again.

(3) Occupancy driven setpoint control after eliminating occupancy diversity at the zone level: Occupancies were virtually changed based on primary constraints and secondary constraints for load rearrangement. The occupancy driven setpoint control was then simulated for each virtual reference building after occupancy diversity was eliminated at the zone level.

(4) Occupancy driven setpoint control after eliminating occupancy diversity at the building level: All of the primary, secondary and third constraints were considered for eliminating diversity. The occupancy driven setpoint control was simulated for each virtual reference building after occupancy diversity was eliminated at the building level.

Figure 53: Comparison of results for four types of simulations across 100 virtual reference buildings

Based on the results, 96% of the models had the same increasing trend of load increment percentage from simulation 1 to simulation 4, indicating that energy efficiency was consistently affected by diversity at both the zone level and the building level. Heating/cooling loads associated with diversity were reduced by load rearrangement. The ranking of the influence of diversity on heating/cooling loads for different building shapes was the following (from less to more): I shaped, L/T shaped, and H/U shaped. The more complicated the building layout was, the more connections the zones had and the larger the influence of the outside environment was; therefore, the influence of diversity on heating/cooling loads was larger (approximately 3.5% difference between I shaped and L/T shaped, and 3% difference between L/T shaped and H/U shaped).
Since the loads that are the actual demands for the HVAC system were significantly reduced for all of the models, the influence of diversity on HVAC system energy efficiency was consistent over different building shapes, different layouts and different occupancy diversities (Table 15). By running occupancy driven setpoint control, energy efficiency could be improved by 11.5-14.4% if occupancy diversity is eliminated at the zone level. The improvement range could move up to 16-18% if occupancy diversity is eliminated at the building level. In addition, the method was generalizable.

Table 15: Influence of occupancy diversity on HVAC system energy efficiency for five building shapes and four types of simulations

Shape      Simulation 1      Simulation 2   Simulation 3   Simulation 4
I Shape    0% (Benchmark)    7.27%          12.62%         16.67%
L Shape    0% (Benchmark)    7.76%          14.64%         18.76%
T Shape    0% (Benchmark)    6.91%          14.44%         18.84%
U Shape    0% (Benchmark)    6.23%          12.08%         18.31%
H Shape    0% (Benchmark)    6.18%          11.77%         18.27%

8.2.4. Summary

Occupancy diversity may increase the loads that are not actual demands for an HVAC system, leading to inefficiencies. In this chapter, an iterative evaluation algorithm, based on agglomerative hierarchical clustering, was introduced to eliminate occupancy diversity based on three levels of constraints. A testbed building and virtual reference buildings were used for validation. The agglomerative hierarchical clustering process was improved using a heap to reduce the time and computational complexity of updating distances among clusters.
When using a real-world testbed building, the iterative evaluation algorithm outperformed other possible trials of eliminating diversity, and was effective in quantifying the energy implications of diversity. The method could reduce the loads that are not the actual demands for HVAC systems to leverage the effects of occupancy driven setpoint control. The performance of the proposed framework was also consistent over different building geometries, different building layouts and different diversities when using virtual reference buildings. The more complicated the building geometries were, the more significant the influence of diversity on HVAC system energy efficiency was. The contribution of this research is to increase our knowledge about the impact of occupancy diversity on HVAC system energy efficiency, which could bring about potential energy savings and could facilitate better-informed decision making for energy-efficient HVAC system response control strategies. The investigations presented here do not aim to provide any specific solution to eliminate diversity for a specific building, for a specific HVAC system, or in a specific climate. Although centrally controlled HVAC VAV systems were used in the testbed building and virtual reference buildings for validation, the proposed framework could be applied to other buildings and HVAC systems, as the essence of the framework, the occupancy-space relationships, remains the same.

Chapter 9: Multi-level Building Energy Model Calibration

As part of our investigations, building energy simulation, i.e., the virtual representation and reproduction of energy processes for either an entire building or a specific space, has been used to implement and validate occupancy-loads relationships on the demand side for HVAC system energy efficiency.
According to the literature review and gap analysis, the three objectives of this chapter are: (1) to evaluate different programs for modeling the occupancy-loads relationships; (2) to introduce a multi-level calibration framework for simultaneously calibrating an energy model at multiple levels; (3) to test the hypothesis that a consistent simulation performance can be achieved at multiple levels when the HVAC system is operated differently, if the model is calibrated using ground truth energy data from mixed controls.

9.1. Selection of Simulation Programs

This chapter does not aim to compare the simulated results for a building using different simulation programs, as different inputs are required and it is demanding and error-prone to develop an identical energy model for different programs. Different input requirements of different programs may cause additional deviations and uncertainties. This chapter also does not use standard case buildings or reference buildings (such as models regulated by ANSI/ASHRAE Standard 140-2011) for validation, since it is impossible to get actual occupancy and energy related data for calibrating the standard cases. Instead, the chapter presents a systematic review for evaluating the applicability of each program in simulating 1) the effects of occupancy on building HVAC energy consumption and 2) the HVAC system response, for investigating occupancy-loads relationships. A qualified program should react effectively to occupancy, control settings and boundary conditions, and accurately calculate the thermodynamics, physical states and flows of energy. Based on the analysis provided in [252], there is a variety of building simulation programs specializing in different aspects of building energy behavior.
Each simulation program has its own methods and sequences to couple occupancy information with building HVAC energy simulation, which are summarized in this chapter from five perspectives: heat transfer and balance, load calculation, occupancy-HVAC system connection, HVAC system modeling, and HVAC system dynamic simulation.

9.1.1. DOE-2

DOE-2 is a whole-building simulation program developed by the Department of Energy (DOE) to perform hourly energy simulation for conceptual determination of total building energy consumption. It has been widely used for more than 30 years to guide building design and simulation developments [253]. DOE-2 mainly consists of four subprograms, LOADS (sensible and latent heating/cooling loads), SYSTEMS (secondary air-side equipment), PLANT (primary water-side system) and ECONS (cost of energy), plus one BDL Processor (input translation) [254]. LOADS, SYSTEMS, PLANTS and ECONS are simulated sequentially, with the output of each subprogram being the input of the next. There is no communication or interaction among the four subprograms. Considering the heat transfer and balance, DOE-2 assumes a static space temperature and cannot achieve a strict heat balance because it uses a weighted-coefficient approximation instead of calculating convection and radiation separately [255]. There are only four types of heat transfer surfaces: exterior wall, interior wall, window and underground wall. Regarding the load calculation, DOE-2 loads are actually reported as HVAC system component loads without incorporating system issues into the load calculation. Loads are determined through transfer functions with customized weighting factors [256]. The occupancy-HVAC connection is achieved on an hourly basis by the sequence of occupancy heat gain, space load, secondary system load, and primary system load. Limited feedback from HVAC system operation is considered to update space load and temperatures.
Regarding the HVAC modeling, a set of predesigned HVAC system types is available for selection. PLANTS allows a part-load setting to calculate energy demand. HVAC system simulation follows the LSPE (load, system, plant, economic) sequence and is not able to communicate simultaneously with building envelope thermal dynamics. In LOADS, DOE-2 first assumes a constant temperature to estimate the loads; then, in SYSTEMS and PLANTS, it successively adjusts for interior and exterior impacts (Figure 54). There is no backward feedback. Conditions of adjacent spaces from the previous step are used as the conditions at the current time step to avoid solving simultaneous equations. DOE-2 considers a building HVAC system as a linear system related to space temperature, and the coefficients for heat balance are kept constant during the entire simulation period. These significant limitations, compared to other programs, restrict the application of DOE-2 for analyzing the effects of occupancy on HVAC energy consumption and for testing HVAC system responses to different occupancy based control strategies. In addition, DOE-2 has advanced requirements for modeling with limited sources and programming; however, its computational efficiency is high and its learning curve is shallow.

Figure 54: Energy simulation in DOE-2 (arrow shows the flow of information). Blocks shown: Occupancy, BDL, LOADS, SYSTEMS, PLANTS, ECONS.

9.1.2. EnergyPlus

EnergyPlus was developed on the basis of DOE-2 and BLAST and uses a heat/mass balance approach. To consider the heat transfer and balance, EnergyPlus uses a state space method for combined heat and mass transfer [257] to ensure strict heat balances on each surface and on the space air [258]. It is specialized in thermal load calculation, using a predict-correct process with backward feedback to continuously update the loads [259]. Applied heat transfer coefficients are based on the effects of interior and exterior impacts.
Regarding occupancy-HVAC connection, at each time step, occupancy is incorporated by updating the surface heat balance and air heat balance of the previous time step. Regarding the HVAC modeling, modularity is applied to provide a flexible and robust approach to specifying system characteristics. Loops define the movements of mass and energy; air loop simulation and water loop simulation are the main functional parts of an EnergyPlus simulation, and both include a demand side and a supply side. Within each loop, the performance curves of equipment can be defined on a customized basis; however, convergence might not be achieved. To consider HVAC simulation, EnergyPlus also has four subprograms: LOADS, SYSTEMS, PLANTS and ECONS (Figure 55). Unlike DOE-2, it provides an integrated simultaneous solution. The time steps of SYSTEMS and PLANTS can be adjusted automatically to meet the calculated LOADS [260]. Although the conditions of adjacent spaces at the previous time step are still used to successively calculate current conditioning requirements, the short time step can effectively offset simulation deviations. After the loads are calculated for heat balance, SYSTEMS estimates the demands and the required responses of PLANTS, and sends feedback to update LOADS. All of the connections are controlled by the building simulation manager. EnergyPlus can take real-time variations of heat balance into consideration and provide accurate space temperature estimates. Simultaneous HVAC simulation enables EnergyPlus to analyze the effects of occupancy on HVAC energy consumption and to test HVAC system responses to different occupancy based control strategies. For example, EnergyPlus supports simulation of demand threshold controls (e.g., customized thermostats). In general, EnergyPlus has advanced requirements for managing simulation parameters and thermal-control dynamics; however, its learning curve is shallow.
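The predict-correct process with feedback described above for EnergyPlus can be illustrated with a deliberately simplified sketch. It assumes a single zone with a lumped air node, an explicit Euler update, and a capacity-limited system; the class, function and parameter names (`Zone`, `predict_load`, `q_max`, and so on) are hypothetical and are not taken from EnergyPlus, whose actual zone air heat balance algorithms are more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    capacitance: float  # J/K, lumped thermal capacitance of the zone air (assumed)
    ua: float           # W/K, envelope conductance to outdoors (assumed)
    temp: float         # degC, current zone air temperature


def passive_gain(zone, t_out, q_occ):
    """Heat rate (W) entering the zone from occupants and the envelope."""
    return q_occ + zone.ua * (t_out - zone.temp)


def predict_load(zone, t_set, t_out, q_occ, dt):
    """Predictor: HVAC heat rate (W, negative = cooling) needed to reach t_set in one step."""
    return zone.capacitance * (t_set - zone.temp) / dt - passive_gain(zone, t_out, q_occ)


def system_response(q_required, q_max):
    """System: deliver the requested rate, limited by the available capacity."""
    return max(-q_max, min(q_max, q_required))


def step(zone, t_set, t_out, q_occ, q_max, dt):
    """One predict-correct step; returns (requested load, delivered load)."""
    q_req = predict_load(zone, t_set, t_out, q_occ, dt)
    q_sys = system_response(q_req, q_max)
    # Corrector: update the zone temperature with what the system actually delivered.
    zone.temp += dt * (passive_gain(zone, t_out, q_occ) + q_sys) / zone.capacitance
    return q_req, q_sys


if __name__ == "__main__":
    zone = Zone(capacitance=5.0e5, ua=100.0, temp=26.0)
    # Ten occupants at an assumed 100 W each, 10-minute time step.
    q_req, q_sys = step(zone, t_set=24.0, t_out=32.0, q_occ=1000.0, q_max=2000.0, dt=600.0)
    print(q_req, q_sys, zone.temp)
```

When the system capacity is sufficient, the corrected temperature equals the setpoint; when it is not, the zone temperature carries the shortfall into the next step. That carry-over is the kind of backward feedback a purely sequential load-then-system calculation does not capture.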
Figure 55: Energy simulation in EnergyPlus (arrow shows the flow of information). Blocks shown: Occupancy, Manager, LOADS, SYSTEMS, PLANTS, ECONS.

9.1.3. IES-Virtual Environment

IES-Virtual Environment (VE) is an integrated building simulation program for design aid and detailed assessment. Modeling in IES-VE is realized by different modules. The core of IES-VE is an IDE (integrated development environment), shared by all of the analysis modules, for analyzing building energy consumption and other building performance indicators, such as comfort and CO2 concentration. The model can be built directly using the ModelIT module or imported from other programs, such as Revit and SketchUp. This eliminates the workload of building different models for different simulation programs and expedites the energy analysis cycle. Apache thermal analysis modules are used to analyze dynamic energy consumption and thermal conditions [261]. Four Apache modules are used in IES-VE: ApacheCalc calculates heat loss and gain, ApacheLoads calculates heating and cooling loads, ApacheSim is responsible for dynamic thermal simulation, and ApacheHVAC simulates HVAC plants (see Figure 56). The learning curve for IES-VE is shallow. Regarding the heat transfer and balance, Apache modules use a finite-difference method to model the heat transfer process based on the CIBSE (Chartered Institution of Building Services Engineers) standards (ApacheCalc) [262] and in accordance with ANSI/ASHRAE Standard 140 (ApacheLoads) [152]. Conduction is assumed to be one-dimensional in each building element, and the thermal properties of surfaces are assumed to be uniform. In addition, heat balance is calculated based on a stirred tank model, which assumes that the air temperature and humidity are uniform within a space.
Dynamic loads (ApacheSim) are integrated with fluid dynamics (Macroflo) by assigning an air node to each space, so that the mutual influences among spaces are considered in the load calculation (Figure 56). Regarding occupancy-HVAC connection, occupancy profiles are used to represent heat gain based on admittance techniques for conditioning requirements. Regarding the HVAC modeling, ApacheHVAC provides pre-defined HVAC wizards and autosizing of system prototypes, and also supports the creation of component-based HVAC systems, which requires detailed and complicated system settings. The multiplexing feature makes it possible to assign HVAC data to very large, complex system models. In IES-VE, HVAC simulation can provide detailed system simulation with airflow analysis but has to be set with appropriate time steps, in which ApacheSim, ApacheHVAC and Macroflo are simultaneously taken into account in thermal-control dynamics [263]. The advantage of IES-VE is that it uses an integrated model for performance analysis with high efficiency and does not require the user to have any prior knowledge of computer programming or of the mathematics and equations that govern building physics. IES-VE acts as more of an "off-the-shelf" program compared to the others. However, the heat balance and load calculation in IES-VE are conducted by different modules, and the relationship between loads and HVAC settings cannot be customized, resulting in inaccurate calculations for occupancy-associated demands and loads [263]. IES-VE has extensive capability for modeling customized systems; however, it is unable to specify settings for certain energy-efficient or sophisticated HVAC plants, such as GSHP (ground source heat pump), which also limits its applicability for analyzing the effects of occupancy on HVAC energy consumption and for testing HVAC system responses to different occupancy based control strategies.
Figure 56: Energy simulation in IES-VE (arrow shows the flow of information). Blocks shown: Occupancy, ApacheCalc, ApacheLoad, ApacheSim, ApacheHVAC, MacroFlo, VISTA.

9.1.4. ESP-r

ESP-r is an open-source simulation program that can be run on several operating systems, such as Linux and Windows. It is a rigorous program for modeling building physics, using multi-sided polygons to define constructive elements and CFD (computational fluid dynamics) to combine mass flow and HVAC system simulation [264]. Regarding the heat transfer and balance, ESP-r uses the Crank-Nicolson difference formulation to simulate heat and mass transfer processes. Nodes are defined to represent air volumes of the building, geometrical components, connections or HVAC system components. Heat balance is achieved by equating the energy flows from and to the control volumes (CV) around these finite difference nodes. ESP-r has a unique optimization method for solving block-tridiagonal implicit matrix equations by only calculating the non-zero components from thermal surfaces. This method can achieve high calculation accuracy and speed without iterations [265]. For load calculation, all nodes are interconnected due to the interdependencies among thermal related components. The collection of equations then forms an equation set (nodal network) describing the load state for the whole building. Regarding occupancy-HVAC connection, heat gain from occupants is added to the thermal network and integrated with air fluid dynamics. Data exchange is conducted with the HVAC network at each time step. Regarding HVAC modeling, an assembly of components can be selected from a library to form the HVAC network, each node of which is connected to the data from other networks [266]. Once the data are prepared, the simulator begins to solve a set of conservation equations using the finite difference method until convergence is reached, and outputs the energy consequences.
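As an illustration of the Crank-Nicolson difference formulation and the tridiagonal equation sets mentioned above, the following is a minimal sketch for one-dimensional conduction through a wall with fixed surface temperatures. It is not ESP-r's solver; the function names, the Dirichlet boundary treatment and the dimensionless ratio `r = alpha * dt / dx**2` are assumptions made for the example.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system directly (no iteration) with the Thomas algorithm.

    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(temps, r):
    """Advance node temperatures by one time step.

    temps: node temperatures; the first and last entries are held fixed.
    r: alpha * dt / dx**2 (dimensionless, assumed uniform across the wall).
    """
    n = len(temps)
    lower = [0.0] * n
    diag = [1.0] * n
    upper = [0.0] * n
    rhs = list(temps)  # boundary rows stay as identity equations
    for i in range(1, n - 1):
        # Implicit half of the average on the left-hand side ...
        lower[i] = -r / 2.0
        diag[i] = 1.0 + r
        upper[i] = -r / 2.0
        # ... explicit half on the right-hand side.
        rhs[i] = (r / 2.0) * temps[i - 1] + (1.0 - r) * temps[i] + (r / 2.0) * temps[i + 1]
    return thomas_solve(lower, diag, upper, rhs)


if __name__ == "__main__":
    wall = [20.0, 20.0, 20.0, 20.0, 30.0]  # inside surface ... outside surface, degC
    for _ in range(200):
        wall = crank_nicolson_step(wall, r=0.5)
    print([round(t, 3) for t in wall])
```

Because each node couples only to its two neighbours, the implicit system is tridiagonal and can be solved in a single forward and backward sweep, which is why such formulations need no iteration within a time step.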
The process is discretized into different networks, and each network possesses its own solver (Figure 57). In general, a global solution can be found simultaneously for each network, and loose couplings are then established among the networks. ESP-r is a research-oriented tool. It is more flexible and holistic than other programs, allowing researchers to analyze the interconnections among occupancy, building thermal conditions and HVAC energy consumption. However, it lacks flexibility when testing HVAC system response to different occupancy based control strategies, and it lacks the autosized and default values needed for input parameters in more complicated and tentative tasks, such as occupancy based dynamic setpoint schedules [267]. It does not support a trial-and-error process, and its steep learning curve means that analysts must have specific knowledge of thermal dynamics and experience in physics modeling to use this program.

Figure 57: Energy simulation in ESP-r (arrow shows the flow of information). Blocks shown: Occupancy, Airflow Network, Building Thermal Network, HVAC Network.

9.1.5. TRNSYS

TRNSYS is an extensible program for transient building mechanical and electrical system simulation. The essence of TRNSYS is to simulate the performance of the entire system by breaking it down into individual components. It has a DLL (Dynamic Link Library) based structure and can be co-simulated with other programs such as ESP-r and Simulink. Regarding the heat transfer and balance, it assigns multiple air nodes to the spaces, assumes that the entire building and its systems are formed by a collection of "energy system components" (such as an auxiliary heater), and calculates the heat balance using algebraic and differential equation solvers [268]. Heat balance is achieved through iteration within a component or a set of components. Utility components, building components and additional customized DLLs are used as load files for load calculations.
An HVAC system is connected with occupancy by the input-output link of corresponding types (a type is a category of components), such as connecting Type 56 (multizone building) with Type 516 and Type 520 (thermostat and heating/cooling behavior). Occupancy information can also be obtained through runtime calls to outside occupancy models during the simulation. Regarding HVAC modeling, HVAC components and controller components can be selected from standard libraries or developed using programming languages (C, C++, PASCAL, FORTRAN, etc.). Each component or group of components represents a process, such as piping hot water, and requires two types of information: inputs (time-dependent) and parameters (time-independent). Components are then connected to finish a task such as space cooling. In TRNSYS, HVAC simulation is equivalent to simulating the performance of individual HVAC related components. The output of one component can be used as the input of the next component (acyclic flow), or the loop within one set of components is iterated until nothing changes, and the energy use of all components is dynamically added up to represent the entire building energy use. At each time step, the TRNSYS kernel checks whether the change in the input exceeds the tolerance and chooses whether to run the related components or move to the next time step (Figure 58). Simulation is thus converted to the formulation of mathematical expressions and the analysis of interconnections among components and groups [269].

Figure 58: Energy simulation in TRNSYS (arrow shows the flow of information). Blocks shown: Occupancy, Lab Dll, Input, TRNDll, TRNExe, Call Component, Equation Solver, Simulated Output.

TRNSYS has a steep learning curve and advanced requirements for mechanical system modeling and configuration. It models the input/output relationships of components by instantaneous or differential equations. The solver either calls each component successively or uses Powell's method.
The overall system iterations are performed until all components (and all sets of components) converge simultaneously. The most powerful feature of TRNSYS, but also its main source of error, is that there is no assumption or range/default for components [270]. Any component related to the heating, ventilation and air conditioning of a building has to be specified by analysts, which provides flexibility in modeling a non-conventional strategy, system configuration and specification. TRNSYS can model, with high accuracy and precision, complex systems that cannot be accurately modeled by other programs (e.g., the simulation time step could be 0.1 second). Another key advantage of TRNSYS is its open source code and modular structure, which suits different research investigations on occupant based and energy-efficient controls. Any customized system can be tested by connecting different components. However, it is not applicable for analyzing the effects of occupancy on HVAC energy consumption, because differentiating the energy consequences caused by occupancy is difficult.

9.1.6. Comparison Discussions

Based on the analysis shown in Table 16, it can be concluded that EnergyPlus and IES-VE are both qualified to analyze the effects of occupancy on building HVAC energy consumption and to test HVAC system response for investigating occupancy-loads relationships, as they are both capable of connecting occupancy load with HVAC conditioning requirements and adjusting HVAC systems to respond to occupancy changes. However, EnergyPlus is more suitable, since IES-VE requires complex system settings and can only roughly model the energy consequences of occupancy. In addition, IES-VE is unable to model certain customized or sophisticated HVAC plants. TRNSYS is an ideal tool to test different control strategies but cannot specify the energy implications of occupancy.
In contrast, ESP-r lacks the flexibility to test HVAC system responses to different occupancy based control strategies; in particular, it lacks the necessary autosized and default mechanisms for input parameters in more complicated and tentative tasks. The principles of sequential HVAC response and unbalanced space heat transfer exclude DOE-2 from any occupancy coupled building energy simulation. Regarding operability and practical use, the requirements of each program in terms of requisite knowledge, learning curve and computational complexity are also considered, as analysts may have different skills and experience levels (Table 16). Therefore, EnergyPlus is selected as the simulation program for implementing the multi-level simulation calibration framework presented in the next section and for investigating the occupancy-loads relationships on the demand side for HVAC system energy efficiency.

Table 16: Program comparisons for coupling occupancy information with HVAC energy simulation

| | DOE-2 | EnergyPlus | IES-VE | ESP-r | TRNSYS |
|---|---|---|---|---|---|
| Specialization | Conceptual energy simulation | Load calculation | Integrated assessment and analysis | Physics modelling | Mechanical and electrical system control |
| Heat transfer and balance | Weighted coefficients based method; Four types of heat transfer surfaces; No strict space heat balance | State space method; Strict heat balances on each surface and space air | Finite-difference method; Stirred tank model; Uniform surface thermal properties | Finite difference nodes; Equate the control volumes of energy flow | Set multiple air nodes to the spaces; Iteration among thermal related components |
| Load calculation | HVAC system component loads | Predict-correct process; Backward feedback and update | Dynamic loads; Assign air node to each space | Interconnected nodal networks to represent interdependencies among the thermal related components | Load profiles consist of utility components, building components and additional customized DLLs |
| Occupancy-HVAC system connection | Occupant heat gain, space load, secondary system load, primary system load (in sequence); No feedback to space load and temperature | Occupant heat gain updates surface heat balance and air heat balance at previous time step | Occupant heat gain is incorporated by admittance techniques; Interact with computational fluid dynamics | Occupant heat gain is connected with the thermal network and integrated with air fluid dynamics | Input-output connection of corresponding components; Interactive runtime calls |
| HVAC system modeling | Predesigned HVAC system types | Modularity; Mass and energy loops | Pre-defined wizards; System prototypes autosizing; Detailed component based HVAC systems | An assembly of components could be selected from library to form the HVAC network | Components and controllers are from standard libraries or developed using programming languages |
| HVAC system simulation | LSPE sequence (load, system, plant, economic); No backward feedback | Simultaneous LSPE (load, system, plant, economic) | Detailed system simulation with airflow analysis; Integrate ApacheSim, ApacheHVAC and Macroflo | A global solution for each network and loose couplings among networks | Simulating the performance of individual components; Perform overall system iterations until all components converge simultaneously |
| Requisite knowledge | High requirements for modeling with limited sources and programming | High requirements for managing simulation parameters and thermal-control dynamics | High requirements for detailed and complicated system settings | High requirements for thermal dynamics knowledge and physics modeling experience | High requirements for mechanical system modeling and configuration |
| Learning curve | Shallow | Shallow | Shallow | Steep | Steep |
| Computational complexity | High computational efficiency | Medium computation time and modeling complexity | Medium computation time; Multiplexing | High computation time and system modeling complexity | High computation time and physics modeling complexity |

9.2.
Methodology for Multi-level Energy Model Calibration

This chapter introduces and validates a multi-level energy model calibration framework for simultaneously calibrating an energy model at multiple levels. To estimate potential energy savings when different occupancy-loads relationships are evaluated, the model has to be robust to the changes that result from the building being operated differently. This chapter uses ground truth energy data from implementations of two HVAC system responses to loads to calibrate the model, and demonstrates that the model has consistent performance for either scenario. The framework creates a classification schema for parameters (definitions and categorizations of parameters are introduced in Chapter 3) and integrates statistical learning-based calibration and analytic calibration. It consists of five steps: (1) initial energy modeling using available evidence, (2) sensitivity analysis to rank the influence of parameters, (3) parameter estimation for determining the values of estimable parameters, (4) discrepancy analysis to analyze the sources of discrepancies, and (5) multi-objective discrepancy minimization. The framework is evaluated using a case study. Simulated HVAC-related energy consumption is compared with the measured HVAC-related energy consumption to validate the proposed calibration framework. In the proposed calibration framework, input parameters are the parameters specified by an analyst and used by energy simulation programs to reproduce a building's thermal processes, while outputs are the energy performances simulated by energy simulation programs, given certain input parameters.
Since an energy simulation model typically has a large number of input parameters that cannot all be determined by available evidence, which may result in deviations and confusion when determining values, a classification schema for large, complex system modeling is created, and all of the input parameters are classified into hierarchical categories for calibration. First, the input parameters are classified into two categories: observable parameters and non-observable parameters (Figure 59). Observable parameters are the parameters, such as window sizes and equipment multipliers, whose values can be determined directly using available evidence, such as evidence gathered through archived documents or on-site visits. Non-observable parameters, such as material conductivity and fan efficiency, cannot be determined by the available evidence. They are analyzed by a sensitivity analysis, based on which the influential ones are differentiated and further categorized as estimable parameters and adjustable parameters. Estimable parameters are the non-observable parameters that are deterministic in nature (e.g., door open/close status) but whose values are difficult to collect due to a lack of feasible data collection approaches or due to privacy concerns; occupancy schedules are one example. In this dissertation, it is assumed that estimable parameters can be indirectly inferred or calculated from observable parameters by learning the relationships between estimable parameters and observable parameters. Adjustable parameters, such as light radiant fraction, are parameters that are stochastic in nature. The values of these parameters vary within their respective domains and cannot be measured exactly. Further, adjustable parameters are divided into significant adjustable parameters and insignificant adjustable parameters based on their statistical significance.
It is assumed that the significant adjustable parameters are mainly responsible for the discrepancy between the simulated and measured energy performances. Their values should be carefully determined for multi-level simulation calibration. The multi-level calibration is defined as a calibration that minimizes discrepancies at different levels of simulation (e.g., building level, HVAC load response level, and zone level). It is important to note that this classification only defines the characteristics and functions of each category of input parameters, since building energy model calibration is a unique process. The specific members vary case by case.

Figure 59: Categorization of input parameters. Input parameters are split into observable and non-observable parameters; non-observable parameters into influential and non-influential parameters; influential parameters into estimable and adjustable parameters; adjustable parameters into significant and insignificant parameters.

The proposed framework is built on the following tasks: gathering data, constructing the model, simulating the model, and analyzing and minimizing discrepancies between the simulated and measured energy performances. The framework has five consecutive steps (Figure 60): (1) initial energy modeling using available evidence-based data; (2) sensitivity analysis, which ranks and compares the influences of non-observable parameters on the energy simulation outputs at both the macro level and the micro level; (3) parameter estimation to determine the values of estimable parameters, such as occupancy schedule, and finalize base modeling; (4) discrepancy analysis, which helps to understand the sources of discrepancies between the simulated and measured energy performances; and (5) discrepancy minimization, the last step toward the calibrated energy model, which aims to reduce discrepancies at both the macro level and the micro level by determining the parameter values that cannot be obtained through evidence or estimation.
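The parameter categories of Figure 59 can be written down as a small decision procedure. The sketch below is only one way to encode the classification; the `Parameter` fields and the `Category` names are hypothetical, and the example assignments (for instance, treating light radiant fraction as statistically significant) are assumptions for illustration, since the specific members vary case by case.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    OBSERVABLE = "observable"
    NON_INFLUENTIAL = "non-influential (default or autosized)"
    ESTIMABLE = "estimable"
    SIGNIFICANT_ADJUSTABLE = "significant adjustable"
    INSIGNIFICANT_ADJUSTABLE = "insignificant adjustable"


@dataclass
class Parameter:
    name: str
    has_evidence: bool           # value can be read from documents or site visits
    influential: bool = False    # outcome of the sensitivity analysis
    deterministic: bool = False  # deterministic in nature, but hard to collect
    significant: bool = False    # statistically significant for the discrepancy


def classify(p: Parameter) -> Category:
    """Walk the hierarchy from the top: evidence, influence, nature, significance."""
    if p.has_evidence:
        return Category.OBSERVABLE
    if not p.influential:
        return Category.NON_INFLUENTIAL
    if p.deterministic:
        return Category.ESTIMABLE
    if p.significant:
        return Category.SIGNIFICANT_ADJUSTABLE
    return Category.INSIGNIFICANT_ADJUSTABLE


if __name__ == "__main__":
    examples = [
        Parameter("window size", has_evidence=True),
        Parameter("occupancy schedule", has_evidence=False, influential=True, deterministic=True),
        # Assumed significant for the sake of the example.
        Parameter("light radiant fraction", has_evidence=False, influential=True, significant=True),
    ]
    for p in examples:
        print(p.name, "->", classify(p).value)
```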
Figure 60: Proposed energy model calibration framework. The flowchart runs from initial energy modeling (observable evidence; default and autosized values) through sensitivity analysis (non-observable parameter recognition and range; ranking at each level; parametric comparison), parameter estimation (estimable evidence; regression-fitting; base modeling), discrepancy analysis (actual energy data input; energy discrepancy explanation; distribution analysis) and discrepancy minimization (random samples; multi-objective programming; calibration evaluation) to the calibrated energy model.

9.2.1. Initial Energy Modeling

The goal of this first step is to provide a basic description of the building geometry, construction elements and mechanical systems, using evidence-based data. The initial representation of the energy model is created through iterative model evolution, where each input is updated based on a source of evidence. Since there may be various available sources for determining parameter values, the hierarchy structure described in the literature [192,197,202] is used to rank evidence sources. The source with the higher priority is used first (Figure 61). In general, the first step is to evaluate the as-built and design documents, including architectural plans, electric lighting systems (e.g., lamps and ballasts), HVAC designs (e.g., zoning and connections), schedules (e.g., designed occupant schedules and light schedules), inventories (e.g., appliances and equipment), and HVAC specifications (e.g., fan nominal power). This step integrates both as-built data and as-designed assumptions.
The second step is to visit the site, survey and interview the technicians, engineers and building facility management personnel, and study the operation and maintenance (O&M) manuals, as well as to conduct some continuous measurements, such as lighting level, if possible (depending on specific building situations and available methods). In this step, the data collected in the first step are re-examined to check whether an update is needed or whether anything has changed since the building was built. The last step is to use default settings based on similar types of buildings in simulation programs and to use the codes and standards when needed. Although research has demonstrated that the data from ASHRAE standards and manufacturer handbooks are not reliable due to the substantial variability in buildings and building systems, values must be set for all input parameters to maintain model integrity. Here, the modeling steps do not necessarily follow the hierarchy direction (Figure 61). Since the available evidence for different buildings may have different levels of detail and accuracy, the initial modeling can typically be described as an ad-hoc procedure, requiring numerous iterations of input updates. It is difficult to determine a specific evidence source for a specific parameter; generally, however, the initial modeling accuracy increases if more high-level evidence is used for determining the values of input parameters.

Figure 61: Hierarchy and sequence for observable parameter determination (a. Sequence, Step 1 to Step 3; b. Hierarchy, high to low). Evidence sources shown: as-built documents, design documents, continuous measurement, survey and interview, O&M manuals, default settings for similar type of buildings, codes and standards.

In general, a lack of the evidence necessary for determining the input parameters is common in building energy simulation, and that is one of the motivations for this dissertation. The evidence should be used if possible. For those parameters that cannot be determined directly by evidence, in our calibration method, values are first assigned as default values or autosized by the simulation program, while the influential ones are classified into two categories, estimable parameters and adjustable parameters, for further analysis. Once the new values are calculated, we use them to update the temporary values set in the initial modeling.

9.2.2. Sensitivity Analysis for Parameters

The initial model is then used as a basis for the sensitivity analysis. Usually, there are several hundred non-observable parameters whose exact values are unknown, and it is usually infeasible to run millions of simulations to determine the values of all of them with equal priority. Therefore, the number of non-observable parameters to be studied should be reduced. Because the influences of certain input parameters on energy simulation results are more significant than those of others, these inputs should be prioritized for model calibration. In our calibration method, sensitivity analysis is used as a screening method to rank non-observable parameters based on how the simulated energy consumption would change in response to the changes made to each non-observable parameter. In order to achieve accurate energy simulation results at multiple levels, the sensitivity analysis is conducted n times to account for n levels of energy consumption. There is no established rule or procedure for sensitivity analysis, as each method has its own pros and cons [271].
We use the Morris method in our framework to identify the influential parameters, as it has been proven to be a valid method for screening building energy simulation parameters [272]. The Morris method makes no assumptions about the relationships (e.g., linearity and correlation) between parameters and model outputs [273]. It can process large numbers of parameters equally with a relatively limited number of simulation runs. This process is efficient and accurate as it does not require a predefined probability density function for each parameter [273]; assigning estimated probability density functions to hundreds of non-observable parameters is time-consuming and error-prone. Although the Morris method cannot provide the exact uncertainty that each parameter causes, it is sufficient for ranking the influences and selecting parameters for adjustment. In addition, different parameter types (e.g., discrete, continuous or multi-dimensional) can be considered equally. Morris sensitivity analysis is based on the factorial sampling technique, in which the influential input parameters are identified through a series of simulation runs by changing one parameter at a time and then comparing the corresponding simulated energy performances. A number of individual one-factor-at-a-time samples of input parameters are randomly generated within their ranges as input vectors for simulations. In our framework, we use larger ranges to be conservative, because the proper range of a parameter is determined by analysts' knowledge and experience, and the larger the range, the higher the probability that the actual value falls within it. The sensitivity of each parameter is expressed by a value called the "elementary effect" [273], which is defined as the measure of parameter influence: the change in the simulation output that results from a change in this parameter while all other parameters are kept constant.
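As an illustration of the elementary effect described above, the following is a minimal Python sketch, not the dissertation's MATLAB implementation. It uses a simplified one-at-a-time design (a random base point plus one step per parameter) rather than full Morris trajectory sampling, and the names `elementary_effects`, `toy_model`, the step size `delta` and the toy bounds are assumptions made only for this example.

```python
import random
import statistics


def elementary_effects(model, bounds, r=5, delta=0.25, seed=0):
    """Return a list of (mean, std) of elementary effects, one per parameter.

    model  : callable taking a list of parameter values and returning a float
    bounds : list of (low, high) tuples, one per parameter
    r      : number of independent samples per parameter
    delta  : step size as a fraction of each parameter's range (assumed value)
    """
    rng = random.Random(seed)
    k = len(bounds)
    effects = [[] for _ in range(k)]
    for _ in range(r):
        # Random base point, kept low enough that base + step stays in range.
        unit = [rng.uniform(0.0, 1.0 - delta) for _ in range(k)]
        base = [lo + u * (hi - lo) for u, (lo, hi) in zip(unit, bounds)]
        y_base = model(base)
        for i, (lo, hi) in enumerate(bounds):
            stepped = list(base)
            stepped[i] += delta * (hi - lo)  # change one parameter only
            effects[i].append((model(stepped) - y_base) / delta)
    return [(statistics.mean(e), statistics.stdev(e)) for e in effects]


def toy_model(x):
    """Stand-in for an energy simulation: x[0] is linear, x[1] and x[2]
    interact, and x[3] has no influence at all."""
    return 3.0 * x[0] + x[1] * x[2]


stats = elementary_effects(toy_model, [(0.0, 1.0)] * 4, r=5)
for index, (mean, std) in enumerate(stats):
    print(f"parameter {index}: mean={mean:.3f}, std={std:.3f}")
```

In this toy case the linear parameter has a constant effect (zero spread), the interacting parameters show a non-zero spread, and the unused parameter has zero effect, which is the kind of contrast the screening step relies on.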
As the value of each parameter is varied within its range, the mean value of the effect $m$ is compared to the standard deviation $S_d$ to provide a normalized criterion for ranking influential parameters. Parameters with a higher absolute mean-to-standard-deviation ratio are more influential. The boundary between influential and non-influential parameters should be determined case by case based on the goals, computational requirements and result distributions. Based on the demonstration by previous research [272], if the parameters are above the threshold set by the lines $m = \pm 2 S_d / \sqrt{r}$ (where $r$ is the number of independent samples for each parameter), they are considered to have non-linear or joint effects. When the sensitivity analysis is completed, the influential parameters are explored in the next steps, while the non-influential parameters are assigned default values or autosized by the simulation program.

9.2.3. Parameter Estimation

Parameter estimation is conducted if the influential non-observable parameters are deterministic in nature but difficult to collect due to the lack of feasible data collection approaches. These parameters could be related to building use (e.g., occupancy, lighting, and appliances) or to system operations (e.g., HVAC thermostat schedules). Traditionally, day-typing (typical days to characterize a period of time) and zone-typing (a typical space to represent a building) were adopted. Currently, observations, surveys, short-term measurements or real-time end-use monitoring methods are used to collect data instead. However, these methods are not practical due to the intrusion they cause to buildings and their occupants, and they do not satisfy the requirements for detailed building energy simulation because of the lack of precision, consistency and verification in auditing.
In this dissertation, the tested hypothesis is that some of the estimable parameters can be indirectly inferred or calculated from the observable parameters. This step is specific to each case; however, the underlying assumption remains the same: the parameters to be estimated either have regular influences on certain observable parameters or their values recur with some variation. Therefore, the relationships between estimable parameters and observable parameters can be established through statistical learning. Once this is completed, the values of the estimable parameters are learned from measurements of the observable parameters (depending on the sophistication of building management systems and the types of onsite metering systems). If there are common parameters shared by different levels, they are assigned the same values. Although it varies case by case, the parameter estimation step is a mining process that uses known parameters to find the patterns of unknown parameters related to building use and system operations. In order to test the hypothesis that a consistent simulation performance can be achieved when the model is calibrated using ground truth energy data from mixed HVAC load responses but is simulated for an individual load response, the response-related parameters should be controlled instead of being estimated.

9.2.4. Simulation Discrepancy Analysis

Since the influential parameters are the main sources of the discrepancies, and the estimable parameters can be indirectly estimated using evidence-related data, which are deterministic and based on facts, it is assumed in this dissertation that some of the adjustable parameters are responsible for the majority of the discrepancies between the simulated and measured energy performances. Therefore, we propose to explore the structural patterns of the discrepancies and screen out insignificant parameters.
Several methods could be developed to analyze the patterns; as a starting point, a linear relationship is explored to model the contribution of each adjustable parameter to the discrepancies using a regression fitting. All adjustable parameters are varied within their ranges, and the nominal values are their default values, which can be found in simulation programs. A probability density function (e.g., triangular, Gaussian, or uniform) of each continuous parameter, such as wind speed, is used to select the values based on its probability distribution, while discrete parameters, such as iteration number, are characterized by minimum, maximum and default values. If analysts cannot determine the ranges, or when there are time constraints, the preferred ranges of parameters provided by the simulation programs can be used instead. Otherwise, analysts can narrow down or specify the range of each input parameter. Each parameter is normalized for comparison by

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \left( x^{*}_{\max} - x^{*}_{\min} \right) + x^{*}_{\min}$$

Random sampling is used to select samples to form the independent variables, and multiple simulation runs are then completed to generate the output vector. The discrepancies between the simulated and measured energy performances, calculated by dividing the difference between the simulated results and the actual measurements by the actual measurements, are used as dependent variables. The values of the remaining parameters (after parameter estimation is completed) are considered as independent variables. The number of simulation runs is usually based on experience or trial; in our methodology, however, the actual number of simulation runs depends on when the regression model converges. Specifically, multiple regression is used to establish the linear model, whose outputs (discrepancies) are weighted sums of the parameter values, with a weight assigned to each parameter before the terms are added together.
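The normalization and the relative discrepancy defined above can be sketched in a few lines of Python. This is a minimal illustration only; the function names and the numeric inputs are assumptions chosen for the example, not values from the case study.

```python
def normalize(x, x_min, x_max, new_min=0.0, new_max=1.0):
    """Min-max scale x from [x_min, x_max] to [new_min, new_max]."""
    return (x - x_min) / (x_max - x_min) * (new_max - new_min) + new_min


def relative_discrepancy(simulated, measured):
    """(simulated - measured) / measured, the dependent variable of the
    regression; positive values mean the simulation is above the measurement."""
    return (simulated - measured) / measured


# Hypothetical inputs: a parameter with range [0.2, 1.2] and value 0.8,
# and a simulated consumption of 1100 against a measured 1000 (any unit).
scaled = normalize(0.8, 0.2, 1.2)
discrepancy = relative_discrepancy(1100.0, 1000.0)
print(scaled, discrepancy)
```

Scaling every adjustable parameter to a common interval makes the fitted regression coefficients comparable across parameters with very different physical units.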
The regression line is denoted by

$$y_{\text{level } n} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_k x_k + \varepsilon$$

where $a$ is the intercept, $\varepsilon$ is the random disturbance and $b_i$ is the coefficient of the $i$th parameter, indicating its contribution to the determination of the dependent variable. If some of the parameters are proven to be interrelated as a result of the sensitivity analysis, the linear model is modified into a non-linear one. For example, if $x_1$, $x_2$ and $x_3$ are all above the $m = \pm 2 S_d / \sqrt{r}$ lines (for each parameter, $m$ is the mean of its elementary effects, $S_d$ is the standard deviation of its elementary effects, and $r$ is the number of samples selected), the new multiple-regression model would be

$$y_{\text{level } n} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + b'_1 x_1 x_2 + b'_2 x_2 x_3 + b'_3 x_1 x_3 + b'_4 x_1 x_2 x_3 + b_4 x_4 + \dots + b_k x_k + \varepsilon$$

to take their interactions into consideration. In order to calibrate the energy model at multiple levels, $n$ multiple-regression models ($n$ equals the number of levels) are created simultaneously, in which some of the parameters might be shared. The same parameters could have different weights, or even opposite contributions (overestimation or underestimation), at different levels of simulation. The intercepts and coefficients of the variables are estimated with the least squares method, and adjusted coefficients of determination ($\bar{R}^2$) are calculated to interpret the proportions of the discrepancies that can be explained by the regression models. The F-statistic is used to test the significance of the regression models and to analyze whether the discrepancies are significantly influenced by the adjustable parameters. The t-statistic is then used to differentiate insignificant parameters from the significant parameters that account for the discrepancies.

9.2.5. Simulation Discrepancy Minimization

After identifying the contributions of parameters to the simulation discrepancies at different levels, the most important step is to determine the values of these parameters that minimize the discrepancies at multiple levels simultaneously. In current practice, the values of the significant adjustable parameters are usually determined as experts' best guesses or adjusted blindly. In order to make this step more efficient and repeatable, multi-objective programming, commonly used in optimizing energy efficient designs [274], is used for updating the values of adjustable parameters and minimizing the simulation discrepancies at multiple levels simultaneously. In our methodology, the multi-objective problem is denoted by

$$\min \left\{ y'_{\text{Level } 1},\; y'_{\text{Level } 2},\; y'_{\text{Level } 3},\; \dots,\; y'_{\text{Level } n} \right\}$$

subject to constraints such as bound limits and integrality requirements. Since the values of all parameters should be selected from their parameter ranges and are recommended not to be far from the default values set by the program, a penalty is introduced and the objective functions are expressed as

$$y'_{\text{level } n} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_k x_k + \varepsilon + \sum_{i=1}^{k} \left( x_i - x_{i,\text{default}} \right)^2$$

where both $x_i$ and $x_{i,\text{default}}$ are normalized by $x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \left( x^{*}_{\max} - x^{*}_{\min} \right) + x^{*}_{\min}$. At first, each single objective function is solved independently. Pareto optimal solution sets A* (for minimizing discrepancies at level 1), A** (for minimizing discrepancies at level 2), and so forth, are generated separately for the multiple objectives. The union of A*, A** and so forth is the solution set for the multi-objective programming. The relative importance of the n objectives should be carefully selected based on the purpose of the simulation, as the selected weights have significant influences on the final solution [275].
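To make the penalized objective and the role of the weights concrete, the following Python sketch evaluates two hypothetical level models on a coarse grid. It is an illustration under stated assumptions, not the dissertation's solver: the coefficients, defaults and weights are invented, a plain grid search stands in for the optimization, and the absolute value of the predicted discrepancy is minimized (an assumption of this example, so that a zero discrepancy is the target).

```python
from itertools import product


def penalized_objective(x, x_default, intercept, coeffs):
    """Magnitude of the linear discrepancy model plus the squared distance
    of the normalized parameters from their normalized defaults."""
    predicted = intercept + sum(b * xi for b, xi in zip(coeffs, x))
    penalty = sum((xi - di) ** 2 for xi, di in zip(x, x_default))
    return abs(predicted) + penalty


def grid_minimize(objective, n_params, steps=21):
    """Exhaustive search over an evenly spaced grid in [0, 1]^n_params."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(product(grid, repeat=n_params), key=objective)


# Hypothetical two-parameter regression models for two calibration levels.
defaults = (0.5, 0.5)


def level_1(x):
    return penalized_objective(x, defaults, 0.30, (-0.4, 0.1))


def level_2(x):
    return penalized_objective(x, defaults, -0.20, (0.1, 0.5))


def weighted(w1):
    """Weighted sum of the two level objectives; the weights sum to 1."""
    return lambda x: w1 * level_1(x) + (1.0 - w1) * level_2(x)


best_level_1_heavy = grid_minimize(weighted(0.9), 2)
best_level_2_heavy = grid_minimize(weighted(0.1), 2)
print(best_level_1_heavy, best_level_2_heavy)
```

Running the search with different weights allows the sensitivity of the selected parameter values to the relative importance of the levels to be inspected directly.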
If the weights are arbitrarily assigned, the programming may converge on a locally optimal solution. Therefore, in this dissertation the weight computation process is transformed into a synthetical fitness optimization problem, with the preference considered as a constraint. Analysts should determine the relative importance of $w_1^0$ (for level 1 accuracy), $w_2^0$ (for level 2 accuracy), and so forth (e.g., $w_1^0 > w_2^0 > w_3^0 > w_4^0 > \dots > w_n^0$). The gradient projection method [276] is then used to search for the optimal weights that maximize the weighted variance and differentiate each solution from the others. Once the weights are decided, they are assigned to all objectives, and the multi-objective problem is converted into a single weighted objective function

$$\min f = w_1^0\, y_{\text{Level } 1} + w_2^0\, y_{\text{Level } 2} + w_3^0\, y_{\text{Level } 3} + \dots + w_n^0\, y_{\text{Level } n}$$

where $w_1^0 + w_2^0 + w_3^0 + \dots + w_n^0 = 1$. Linear programming is then used to combine the weights of parameters at each level determined by the regression analysis and to find the initial solutions (seed vertex). An effective vertex near the seed vertex is also searched. As long as the actual disaggregated energy performance for detailed end uses, such as lighting, equipment, and HVAC, can be metered, the multi-objective discrepancy minimization method can be applied to any multiple-level calibration.

9.3. Energy Model Calibration Evaluation

9.3.1. Case Study Description

The HVAC energy consumption in the building can be decomposed into primary HVAC systems, such as the energy used by chillers and boilers to generate chilled and hot water; secondary HVAC systems, such as the energy used by AHUs and their embedded fans to distribute conditioned air in the building; and HVAC terminals, such as VAVs and FCUs.
To validate the results for multi-level energy calibration, the ground truth energy data used for calibration was obtained or calculated with the sampling rate of 1 minute from the information recorded by a Honeywell Building Management System (BMS), which provides central control over the chiller, boiler, AHUs and VAVs. OpenStudio and EnergyPlus programs were used for energy simulation. OpenStudio accounts for geometrical modeling and acts as a middleware to connect with EnergyPlus. Detailed energy modeling and calibration are done in EnergyPlus. HVAC system load response refers to a specific HVAC control strategy for satisfying the heating/cooling loads from interior and exterior impacts in this dissertation. Two different responses were implemented in the testbed building, in order to explore the second objective and test the hypothesis that an energy model calibrated using ground truth energy data from mixed responses could consistently simulate energy performance for either response. The first response (baseline HVAC control strategy, short as “baseline response”) ran at an on-hour mode during the daytime (6:30 - 21:30 on workdays, and 7:00 - 21:30 on weekends), all mechanical zones in the building were assumed to be always occupied, and a constant temperature setpoint (73F) was maintained, which considers all the loads outside the deadband of setpoint are effective and should be met by HVAC system. The second response (bimodal HVAC control strategy, short as “bimodal response”) was demand responsive and based on real-time occupancy introduced in Chapter 8.1. During the daytime, an occupied mode was enforced for occupied zones, where a constant temperature setpoint (73F) was maintained. If a zone was vacant for a minimum of 15 minutes, a vacant mode was enforced, where the temperature setpoint was immediately set back to 78F until the zone was occupied again, thus parts of the loads during unoccupied periods are ineffective and discarded by HVAC system. 
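The bimodal response described above can be expressed as a small state rule. The Python sketch below is a minimal illustration that assumes a per-minute occupancy series for one zone during daytime hours; the function and constant names are hypothetical, and off-hour operation is not modeled.

```python
OCCUPIED_SETPOINT_F = 73.0   # setpoint while the zone is in occupied mode
VACANT_SETPOINT_F = 78.0     # setback setpoint in vacant mode
VACANCY_DELAY_MIN = 15       # minutes of continuous vacancy before setback


def bimodal_setpoints(occupied_by_minute):
    """Map a per-minute occupancy series (True/False) to setpoints in deg F.

    The zone switches to the setback setpoint once it has been vacant for
    VACANCY_DELAY_MIN consecutive minutes and returns to the occupied
    setpoint as soon as it is occupied again.
    """
    setpoints = []
    vacant_minutes = 0
    for occupied in occupied_by_minute:
        vacant_minutes = 0 if occupied else vacant_minutes + 1
        if vacant_minutes >= VACANCY_DELAY_MIN:
            setpoints.append(VACANT_SETPOINT_F)
        else:
            setpoints.append(OCCUPIED_SETPOINT_F)
    return setpoints


# Example: 5 occupied minutes, 20 vacant minutes, then 5 occupied minutes.
occupancy = [True] * 5 + [False] * 20 + [True] * 5
schedule = bimodal_setpoints(occupancy)
print(schedule.count(VACANT_SETPOINT_F), "minutes at the setback setpoint")
```

Under the baseline response the same function would never be called, since every zone is treated as always occupied and the 73°F setpoint is held throughout the on-hour period.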
The bimodal response was implemented on the east side of the second and third floors, covering 37 rooms (14 zones of which were metered by the BMS). The rest of the building was operated using the baseline response. Both responses had off-hour modes, in which the HVAC system was shut off during nighttime and no cooling, heating or ventilation services were provided. Only minimum airflow was maintained to meet ASHRAE compliance. Four months of energy consumption data were collected during two periods. The first period spanned 82 days, from Jan 1st to Feb 21st 2013 and from Apr 1st to Apr 30th 2013, and the HVAC system during this period was operated under the baseline response. The second period spanned 38 days, from Feb 22nd to Mar 31st 2013, when the bimodal response was adopted for the 14 zones. HVAC-related energy consumption was simulated to validate the proposed calibration framework at multiple levels, as explained below. It is important to note that even though the proposed calibration method was evaluated at three different levels (i.e., building level, HVAC load response level and zone level energy simulation calibration), the discrepancy minimization could be applied to calibration at any set of levels, such as the floor level. A total of 22 zones and the HVAC chiller, boiler and AHUs were taken into account for validation. Metrics were defined to explore the discrepancies between the simulated and measured HVAC energy consumptions at the three levels. For the building level calibration, the sum of electricity and gas consumed by the entire HVAC system, including the air loops and plant loops, was compared with the measured consumption to indicate the percentage of building level discrepancy. The energy consumption of each zone for each day, including heating and cooling provided by the terminals for actual conditioning demands, was calculated by a heat formula (Eq. 1):

$$Q_i = \int \rho\, \dot{V}\, C_{pi} \left| T_{si} - T_{ri} \right| dt \qquad (1)$$

where $T_{si}$ and $T_{ri}$ are the supply air temperature and return air temperature for zone $i$, and $\dot{V}$ is the air volume flow rate (m³/s). Their actual values are metered and recorded by the BMS. $C_{pi}$ is the specific heat capacity of the zone air, taken as a constant 1000 J/(kg·°C), and $\rho$ is the zone air mass density, with a value of 1.29 kg/m³. Zone level ventilation was not considered in this dissertation due to the lack of metered data. The HVAC load response level energy consumption was calculated by adding the energy consumptions of the zones served by one AHU. The AHU takes in outside air, mixes it with return air from the building, and cools the mixed air down to 12.8 °C with chilled water supplied by the chillers. There are 14 zones on the east side of the second and third floors, where both the baseline response and the bimodal response were implemented:

$$\bar{Q} = \sum_{i=1}^{14} Q_i$$

The zone level energy consumption was calculated by averaging the energy consumptions of 8 zones on the west side of the third floor, where only the baseline response was implemented:

$$\bar{Q} = \frac{1}{8} \sum_{i=1}^{8} Q_i$$

The MBE (mean bias error) and CV(RMSE) were chosen as the two criteria for evaluating the calibrated energy model by checking whether there is acceptable agreement between the simulated and measured energy consumption. Hourly calibration was conducted. $N$ stands for the number of hours within a period, $E_{\text{actual}}$ is the actual energy consumption metered by the BMS, $E_{\text{simulated}}$ is the energy consumption simulated by the building energy model, and $\bar{E}$ is the mean measured energy consumption:

$$\text{MBE} = \frac{\sum_{j=1}^{N} \left( E_{\text{actual}}(j) - E_{\text{simulated}}(j) \right)}{N}$$

$$\text{CV(RMSE)} = \frac{\sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( E_{\text{actual}}(j) - E_{\text{simulated}}(j) \right)^2}}{\bar{E}}$$

MBE is a non-dimensional bias measure for overall deviation. With the sign convention of the formula above (actual minus simulated), a positive MBE value means the simulation model underestimates the energy consumption, while a negative MBE value represents an overestimation.
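The zone energy formula (Eq. 1) and the two agreement metrics can be sketched as follows. This is a minimal Python illustration under stated assumptions: the integral is approximated by a rectangle rule over evenly spaced samples, the MBE keeps the actual-minus-simulated ordering of the formula, the RMSE is taken as the square root of the mean squared error, and all names and sample values are hypothetical.

```python
import math

RHO_AIR = 1.29     # zone air mass density, kg/m^3
CP_AIR = 1000.0    # specific heat capacity of zone air, J/(kg*degC)


def zone_energy_joules(flow_m3_per_s, supply_temp_c, return_temp_c, dt_s=60.0):
    """Rectangle-rule approximation of Eq. 1 for evenly spaced samples."""
    return sum(
        RHO_AIR * v * CP_AIR * abs(ts - tr) * dt_s
        for v, ts, tr in zip(flow_m3_per_s, supply_temp_c, return_temp_c)
    )


def mbe(actual, simulated):
    """Mean bias error with the actual-minus-simulated ordering."""
    return sum(a - s for a, s in zip(actual, simulated)) / len(actual)


def cv_rmse(actual, simulated):
    """RMSE divided by the mean of the measured values."""
    n = len(actual)
    rmse = math.sqrt(sum((a - s) ** 2 for a, s in zip(actual, simulated)) / n)
    return rmse / (sum(actual) / n)


# Two one-minute samples at 1 m^3/s with a 10 degC supply/return difference.
energy = zone_energy_joules([1.0, 1.0], [12.8, 12.8], [22.8, 22.8])

# Hypothetical hourly values where the errors cancel in the MBE.
measured = [10.0, 12.0]
simulated = [11.0, 11.0]
print(energy, mbe(measured, simulated), cv_rmse(measured, simulated))
```

In the second example the bias is zero even though each hour is off by one unit, while CV(RMSE) remains positive, which is why both criteria are checked together.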
It can measure long-term model performance by analyzing the error between the simulated and measured energy consumptions; however, underestimation and overestimation might compensate for each other. The average of the squared errors is called the mean squared error (MSE), and its square root is the root mean squared error (RMSE). The coefficient of variation of the RMSE, CV(RMSE), is determined by dividing the RMSE by the mean measured energy consumption. It is not influenced by the compensation effect and can evaluate the variability of agreement between the simulated results and measured values over a period of time. In general, an energy simulation model is considered calibrated if the two criteria are satisfied at all three levels according to the acceptable tolerances set by ASHRAE Guideline 14, IPMVP or FEMP [277-279]. As there is no regulated daily tolerance in the literature, in this case study, hourly tolerances (Table 17) were used for evaluating daily MBE and CV(RMSE).

Table 17: Acceptable tolerances for monthly building energy simulation

| Metric | IPMVP (%) | FEMP (%) | ASHRAE (%) |
|---|---|---|---|
| MBE | ±20 | ±15 | ±5 |
| CV(RMSE) | ±5 | ±10 | ±15 |

9.3.2. Results of Initial Energy Modeling

The initial energy model for the case study incorporated information collected from archived documentation, such as as-built drawings, specifications, renovation logs and operating records, and information gathered from on-site visits, where the building's geometric characteristics, construction elements, associated mechanical systems and appliance specifications were collected (Figure 62). For the rest of the input parameters, which did not have available evidence, default settings were used or values were temporarily assigned according to standards and codes. Weather was modeled using TMY (Typical Meteorological Year) data specific to the building site, downloaded from the DOE website (Energy Efficiency and Renewable Energy) [280].
The data were collected from the station close to Los Angeles International Airport, which is about 10 miles away from the testbed building. The TMY data are sets of hourly values of meteorological elements and solar radiation for one year. The simulation period spanned from January 1st to April 30th, 2013, and from March 15th to November 15th, 2014.

Figure 62: Architectural model (back side), one typical zone (three offices) and HVAC model

9.3.3. Results of Sensitivity Analysis for Parameters

In total, there were 227 parameters whose values could not be determined during the initial energy modeling due to the lack of available evidence (non-observable parameters), such as surface albedo, fan total efficiency and air flow fraction. The Morris method was implemented three times, for the building level, HVAC load response level and zone level sensitivity analyses. The elementary effect was expressed as the percentage variation of the simulation result in response to a change in the value of the input parameter. For each parameter, five independent samples (r = 5) were randomly selected, and the elementary effects were simulated by EnergyPlus with 1,362 simulation runs in total for each level (number of runs = (r + 1) × number of parameters). The value of r is usually chosen between 5 and 15. Because the exact rank of parameter sensitivity does not affect the discrepancy analysis and minimization process, the minimum number of five samples was selected to make the sensitivity analysis efficient. The sensitivity analysis was implemented in MATLAB. The 1,362 IDF files were generated for the simulation runs, and the simulation results from EnergyPlus were collected to calculate the elementary effect for each parameter. The parameters were then ranked by their absolute mean-to-standard-deviation ratio, and those with higher ratios were considered more influential for the energy simulation.
A conservative boundary was set, by which parameters with an absolute mean-to-standard-deviation ratio greater than 0.1 are considered influential (the boundary could also be determined by experience or by using a visual plot). Figure 63 shows the mean and standard deviation of the elementary effects for each parameter. The 34 parameters for the building level simulation, 32 parameters for the HVAC load response level simulation and 33 parameters for the zone level simulation are presented in decreasing order of influence in Tables 18-20, respectively. The influential parameters for the three levels of energy simulation were not exactly the same, and even the common parameters had different orders of influence. As expected, building level influential parameters had a global influence on the building energy consumption and were mainly related to running controls and loads for HVAC systems, conditions and performance of HVAC plants, and envelope thermal characteristics. HVAC load response level influential parameters affected local thermal states, loads, control settings and conditions for HVAC terminals. Zone level influential parameters were mainly associated with end-use demands, material properties, and space heat transfer and balance. The three lists of parameters were given primary focus in the next steps by changing the values of the influential parameters within their plausible ranges. The non-influential parameters were left at their default values or autosized by EnergyPlus.
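The run count and the screening rule described above can be checked with a short Python sketch. The run-count expression follows the text, (r + 1) × number of parameters; the statistics passed to the ranking function are hypothetical placeholders, not results from the case study, and the function names are assumptions of this example.

```python
def morris_run_count(n_parameters, r):
    """Number of simulation runs as stated in the text: (r + 1) * parameters."""
    return (r + 1) * n_parameters


def rank_influential(effect_stats, boundary=0.1):
    """Rank parameters by |mean| / std of their elementary effects and keep
    those whose ratio exceeds the boundary.

    effect_stats maps a parameter name to a (mean, std) tuple.
    """
    ratios = {}
    for name, (mean, std) in effect_stats.items():
        if std > 0:
            ratios[name] = abs(mean) / std
        else:
            # No spread: treat any non-zero effect as maximally consistent.
            ratios[name] = float("inf") if mean != 0 else 0.0
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    return [name for name in ranked if ratios[name] > boundary]


runs = morris_run_count(227, 5)

# Hypothetical (mean, std) pairs, for illustration only.
hypothetical_stats = {
    "Chiller COP": (4.0, 2.0),
    "Wind Speed": (-1.5, 1.0),
    "Unused Input": (0.01, 1.0),
}
influential = rank_influential(hypothetical_stats)
print(runs, influential)
```

With r = 5 and 227 parameters the run count evaluates to 1,362, matching the number of IDF files reported for each level.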
Figure 63: Mean and standard deviation of the elementary effects on the energy simulation for the influential parameters at building level, HVAC load response level and zone level

Table 18: Influential parameters and their parameter ranges and default values for the building level energy simulation (blue-shaded parameters are for controlling the HVAC system under different load responses, brown-shaded parameters are estimated based on observable evidence, red-shaded parameters are found to be statistically insignificant in discrepancy analysis)

| ID | Influential Parameter | Parameter Range | Default Value |
|---|---|---|---|
| 1 | Chiller COP | 0 < X < 10 | 5.9 |
| 2 | Wind Speed | 0 <= X < 40 | 15 |
| 3 | Occupant Activity | 100 <= X <= 150 | 115 |
| 4 | Heating/Cooling Schedule | Load Response | TBD |
| 5 | Chilled Water Delta T | -50 <= X <= 50 | 14 |
| 6 | Solar Absorptance | 0 <= X <= 1 | 0.7 |
| 7 | Material Conductivity | 0 < X < 30 | 17 |
| 8 | Occupancy Schedule | Estimable | TBD |
| 9 | Lighting Fraction Radiant | 0 <= X <= 1 | 0.72 |
| 10 | Surface Albedo | 0 <= X <= 1 | 0.3 |
| 11 | Fan Total Efficiency | 0 <= X <= 1 | 0.7 |
| 12 | Light Schedule | Estimable | TBD |
| 13 | Temperature Sensor Height | 0 <= X <= 3 | 1.6 |
| 14 | Occupancy Time Interval | 0 <= X <= 60 | 10 |
| 15 | Solar Heat Gain Coefficient | 0.25 <= X <= 0.8 | 0.5 |
| 16 | Hot Water Sizing Factor | 0 < X < 5 | 1 |
| 17 | Wall U-Factor | 0.2 <= X <= 1.2 | 0.8 |
| 18 | Equipment/Appliance Schedule | Estimable | TBD |
| 19 | Average Ventilation Rate Range | 1 <= X <= 6 | 4 |
| 20 | Chiller Part Load Ratio | 0.3 <= X <= 0.9 | 0.7 |
| 21 | Boiler Efficiency | 0 <= X <= 1 | 0.8 |
| 22 | Heating/Cooling Time Interval | 0 <= X <= 60 | 10 |
| 23 | Occupant Heat Load | Estimable | TBD |
| 24 | AHU Minimum Airflow Rate | 0 < X < 25000 | 15000 |
| 25 | Heat Recovery Efficiency | 0 <= X <= 1 | 0.8 |
| 26 | Fresh Air Introduction Rate | 20 <= X <= 50 | 35 |
| 27 | Maximum Zone Wind Speed | 0 <= X < 40 | 20 |
| 28 | Minimum Outside Air Fraction | 0 <= X <= 1 | 0.3 |
| 29 | Airflow Convergence Tolerance | 0 < X < 1 | 0.0004 |
| 30 | Lighting Time Interval | 0 <= X <= 60 | 10 |
| 31 | Ground Temperature | 66 <= X <= 72 | 68 |
| 32 | Minimum Surface Convection Heat Transfer Coefficient | 0 <= X <= 5 | 3 |
| 33 | Ground Reflectance | 0 <= X <= 1 | 0.2 |
| 34 | Reference Barometric Pressure | X > 0 | 1×10^5 |

Table 19: Influential parameters and their parameter ranges and default values for the load response level energy simulation (blue-shaded parameters are for controlling the HVAC system under different load responses, brown-shaded parameters are estimated based on observable evidence, red-shaded parameters are found to be statistically insignificant in discrepancy analysis)

| ID | Influential Parameter | Parameter Range | Default Value |
|---|---|---|---|
| 1 | Zone Cooling Sizing Factor | 0 <= X <= 5 | 1.1 |
| 2 | Thermostat Setpoint | Load Response | TBD |
| 3 | Minimum Airflow Fraction | 0 <= X <= 1 | 0.2 |
| 4 | Heating/Cooling Schedule | Load Response | TBD |
| 5 | Temperature Sensor Height | 0 <= X <= 3 | 1.6 |
| 6 | Occupancy Schedule | Estimable | TBD |
| 7 | Heating Coil Efficiency | 0 <= X <= 1 | 0.8 |
| 8 | Zone Supply Air Temperature | 10 <= X <= 32 | 20 |
| 9 | Solar Heat Gain Coefficient | 0.25 <= X <= 0.8 | 0.5 |
| 10 | Wall U-Factor | 0.2 <= X <= 1.2 | 0.8 |
| 11 | Delta Adjacent Zone Temp | -20 <= X <= 20 | 10 |
| 12 | Occupant Number | Estimable | TBD |
| 13 | Heating/Cooling Time Interval | 0 <= X <= 60 | 10 |
| 14 | Supply/Zone Air Temperature Delta | -20 <= X <= 20 | 6 |
| 15 | Daily Temperature Range | -20 <= X <= 20 | 10 |
| 16 | Lighting Schedule | Estimable | TBD |
| 17 | Window Shading Coefficient | 0.2 < X < 1.0 | 0.8 |
| 18 | Fraction of Convective Internal Loads | 0 <= X <= 1 | 0.7 |
| 19 | Occupant Heat Load | Estimable | TBD |
| 20 | Airflow Convergence Tolerance | 0 < X < 1 | 0.0004 |
| 21 | Lighting Load | Estimable | TBD |
| 22 | Zone Flow Coefficient | 0 <= X <= 1 | 0.8 |
| 23 | Thermal Absorptance | 0 < X < 0.999 | 0.9 |
| 24 | Infiltration Rate | 0.1 < X <= 5 | 1.8 |
| 25 | Gross Rated Cooling Coil COP | 0 < X < 5 | 3 |
| 26 | Equipment/Appliance Schedule | Estimable | TBD |
| 27 | Effective Air Leakage Area | 0 < X < 50 | 20 |
| 28 | Light Fraction Radiant | 0 <= X <= 1 | 0.72 |
| 29 | Maximum Zone Wind Speed | 0 <= X <= 40 | 20 |
| 30 | Internal Loads Density | 1.5 <= X <= 3 | 1.8 |
| 31 | Window Solar Transmittance | 0 <= X < 1 | 0.7 |
| 32 | Visible Absorptance | 0 <= X <= 1 | 0.7 |

Table 20: Influential parameters and their parameter ranges and default values for the zone level energy simulation (blue-shaded parameters are for controlling the HVAC system under different load responses, brown-shaded parameters are estimated based on observable evidence, red-shaded parameters are found to be statistically insignificant in discrepancy analysis)

| ID | Influential Parameter | Parameter Range | Default Value |
|---|---|---|---|
| 1 | Solar Heat Gain Coefficient | 0.25 <= X <= 0.8 | 0.5 |
| 2 | Sensible Heat Ratio | 0.5 <= X <= 1 | 0.8 |
| 3 | Wall U-Factor | 0.2 <= X <= 1.2 | 0.8 |
| 4 | Supply-Air-to-Zone-Air Temperature Difference | -20 <= X <= 20 | 6 |
| 5 | Zone Flow Coefficient | 0 <= X <= 1 | 0.8 |
| 6 | Heating/Cooling Schedule | Load Response | TBD |
| 7 | Fresh Air Introduction Rate | 20 <= X <= 50 | 35 |
| 8 | Thermostat Setpoint | Load Response | TBD |
| 9 | Zone Cooling Sizing Factor | 0 <= X <= 5 | 1.1 |
| 10 | Temperature Sensor Height Above Ground | 0 <= X <= 3 | 1.6 |
| 11 | Occupancy Schedule | Estimable | TBD |
| 12 | Occupant Number | Estimable | TBD |
| 13 | Delta Adjacent Zone Temp | -20 <= X <= 20 | 10 |
| 14 | Lighting Schedule | Estimable | TBD |
| 15 | Minimum Surface Convection Heat | 0 <= X <= 5 | 3 |
| 16 | Airgap Thermal Resistance | X > 0 | 0.2 |
| 17 | Glazing Conductivity | X > 0 | 0.9 |
| 18 | Heating/Cooling Time Interval | 0 <= X <= 60 | 10 |
| 19 | Equipment/Appliance Schedule | Estimable | TBD |
| 20 | Minimum Airflow Fraction | 0 <= X <= 1 | 0.2 |
| 21 | Occupant Heat Load | Estimable | TBD |
| 22 | Zone Supply Air Temperature | 50 <= X <= 90 | 68 |
| 23 | Occupant Activity | 100 <= X <= 150 | 115 |
| 24 | Delta Adjacent Zone Air Temperature | -20 <= X <= 20 | 10 |
| 25 | Daily Temperature Range | -20 <= X <= 20 | 10 |
| 26 | Solar Transmittance | 0 < X < 1 | 0.7 |
| 27 | Airflow Convergence Tolerance | 0 < X < 1 | 0.0004 |
| 28 | Visible Reflectance | 0 <= X <= 1 | 0.08 |
| 29 | Infiltration Rate | 0.1 < X <= 5 | 1.8 |
| 30 | Light Fraction Radiant | 0 <= X <= 1 | 0.72 |
| 31 | Surface Albedo | 0 <= X <= 1 | 0.3 |
| 32 | Internal loads density | 1.5 <= X <= 3 | 1.8 |
| 33 | Lighting Time Interval | 0 <= X <= 60 | 10 |

9.3.4. Results of Parameter Estimation

For the case study, the tested hypothesis was that the parameters related to building use or system operations are associated with occupancy. The heating/cooling schedule, lighting schedule, equipment schedule, occupant number, lighting load and occupant heat load could be related to the occupancy schedule. First, two on-site visits were performed to collect information about the occupancy capacity of each zone and the specifications and number of computers and appliances in each zone. It has been demonstrated that occupancy schedules can be estimated by the real-time non-intrusive occupancy detection model using the observable ambient factors introduced in Chapter 7. The sampling rate for the occupancy detection model was 3 minutes. Schedules for rooms without ambient sensors were determined according to ANSI/ASHRAE/IES Standard 90.1-2013 [15]. Equipment and appliances were assumed to be used only when a space was occupied, so their schedules followed the occupants' schedules. Lighting levels were sensed by light sensors in each room. Light fixtures were assumed to be used if a space was occupied and when artificial lighting was needed, during 6:30-10:00 and 15:30-18:00 (after 18:00, lighting schedules were the same as the occupancy schedules). The parameter for occupant heat load was calculated based on the occupant number at each time point and on the standards specified by ASHRAE 2009 [62].
The exceptional parameters controlled in this dissertation were for the two responses of HVAC setpoint controls and are presented in Tables 18-20. The two parameters, HVAC setpoints and heating/cooling schedules, for the 14 zones during the bimodal response implementation period were programmed and driven by actual occupancy; otherwise they followed the baseline response. The occupancy of a particular zone was determined by aggregating the occupancy of its associated rooms. A zone was considered vacant only if all rooms within the zone were vacant.

9.3.5. Results of Simulation Discrepancy Analysis

MATLAB was used to implement the discrepancy analysis and discrepancy minimization for calculating the values of adjustable parameters. In this case study, there were 29 adjustable parameters for the building level simulation, 24 adjustable parameters for the HVAC system load response level simulation, and 26 adjustable parameters for the zone level simulation (see the unshaded and red-shaded parameters in Tables 18-20), which were assumed to be responsible for the majority of the discrepancy between the simulated and measured energy consumption. This step analyzed where the major errors lay, based on a regression fitting of the relations

$$\hat{Y}_{\text{building level}} = a_0 + a_1 X_1 + a_2 X_2 + \dots + a_{29} X_{29} + \varepsilon_{\text{building level}}$$

$$\hat{Y}_{\text{response level}} = b_0 + b_1 Z_1 + b_2 Z_2 + \dots + b_{24} Z_{24} + \varepsilon_{\text{response level}}$$

$$\hat{Y}_{\text{zone level}} = c_0 + c_1 Z_1 + c_2 Z_2 + \dots + c_{26} Z_{26} + \varepsilon_{\text{zone level}}$$

and on testing their statistical significance. Specifically, $\hat{Y}_{\text{level } i}$ is the simulation discrepancy at level $i$; $a_0$, $b_0$ and $c_0$ are the constants; $a_n$, $b_n$ and $c_n$ are the respective regression coefficients; and $\varepsilon_i$ is the residual of the regression function for the level $i$ simulation. Actual energy consumption data, collected from January 1st to March 10th, were used for the multiple-regression analysis, and the results are presented in Figures 64-66. Each adjustable parameter was normalized to [0,1].
Random sampling was used to select the samples within the predefined parameter ranges to form the independent variables. The partial regression coefficients and intercepts were calculated with the method of least squares. The larger the calculated coefficient is, the more a parameter contributes to the discrepancy. If a certain parameter had no significant influence in a multi-linear regression formula, its coefficient was given the value of 0. A positive coefficient (above blue lines) indicates an overestimation, while a negative coefficient (below blue lines) indicates an underestimation in the simulated results (Figures 64-66). The results showed that a parameter with high influence on simulation results in the sensitivity analysis does not necessarily have high influence on simulation discrepancy in the regression analysis.

Figure 64: Multi-regression analysis results at the building level (F=57.24); see Table 18 for parameter IDs (estimable parameters and HVAC load response related parameters are shaded)

Figure 65: Multi-regression analysis results at the HVAC load response level (F=32.16); see Table 19 for parameter IDs (estimable parameters and HVAC load response related parameters are shaded)

Figure 66: Multi-regression analysis results at the zone level (F=41.59); see Table 20 for parameter IDs (estimable parameters and HVAC load response related parameters are shaded)

Two criteria for analyzing the influences of parameters and their mutual relations were then investigated. The first criterion is the adjusted determination coefficient (Adjusted R Square), used to represent the percentage of the variation in a dependent variable (energy simulation discrepancy) that can be explained by the independent variables (adjustable parameters). In this case study, approximately 83.2% of the building level energy discrepancy, 81.8% of the HVAC load response level energy discrepancy, and 80.7% of the zone level energy discrepancy could be attributed to those adjustable parameters.
The second criterion is the tolerance for the multicollinearity between the parameters. The smaller the tolerance value is, the stronger the multicollinearity. The results also demonstrated that the significant adjustable parameters were all independent at all three levels (red-shaded areas in Figures 64 to 66). This is consistent with the conclusion of the sensitivity analysis that there was no parameter interaction and that all the adjustable parameters were linearly related to the simulation discrepancy. In addition, significance tests (at a 95% confidence level) for assessing the statistical significance of the multi-linear regression equations and coefficients were also explored. F-test results (Figures 64 to 66) showed that the three multi-linear regression equations all had F values larger than the critical values (F building=1.47, F response=1.52, F zone=1.82), demonstrating that the regression models were statistically significant and that the simulation discrepancies were mainly impacted by the adjustable parameters.
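The regression step and its diagnostics can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions, not the MATLAB code used in the case study: the two "parameters", the synthetic discrepancy and all function names are hypothetical, and the sketch only shows how least-squares coefficients, adjusted R Square, the F statistic and the tolerance (one minus the R Square obtained by regressing one parameter on the others) relate to each other.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(x_rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ...; returns [b0, b1, ...]."""
    design = [[1.0] + list(row) for row in x_rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def regression_stats(x_rows, y, coeffs):
    """Return (R^2, adjusted R^2, F statistic) for a fitted linear model."""
    n, p = len(y), len(coeffs) - 1
    pred = [coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row)) for row in x_rows]
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    f_stat = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return r2, adj_r2, f_stat


def tolerance(x_rows, j):
    """Tolerance of parameter j: 1 - R^2 from regressing it on the others."""
    target = [row[j] for row in x_rows]
    others = [row[:j] + row[j + 1:] for row in x_rows]
    r2, _, _ = regression_stats(others, target, fit_ols(others, target))
    return 1.0 - r2


# Hypothetical data: two parameters sampled at random in [0, 1] and a
# synthetic discrepancy with known coefficients plus small noise.
rng = random.Random(0)
samples = [[rng.random(), rng.random()] for _ in range(200)]
discrepancy = [1.0 + 2.0 * a - 3.0 * b + rng.gauss(0.0, 0.05) for a, b in samples]
coefficients = fit_ols(samples, discrepancy)
r2, adj_r2, f_stat = regression_stats(samples, discrepancy, coefficients)
```

A tolerance close to 1 for every parameter corresponds to the independence reported above; a value close to 0 would signal strong multicollinearity.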
Based on the t-test results, the coefficients of the individual parameters were all statistically significant (green lines in Figures 64 to 66), with the following exceptions: the parameters of Solar Heat Gain Coefficient (ID# 15), Fresh Air Introduction Rate (ID# 26), Maximum Zone Wind Speed (ID# 27), Ground Temperature (ID# 31), and Reference Barometric Pressure (ID# 34) in the building level regression; the parameters of Supply-Air-to-Zone-Air Temperature Delta (ID# 14), Fraction of Convective Internal Loads (ID# 18), Effective Air Leakage Area (ID# 27) and Visible Absorptance (ID# 32) in the HVAC load response level regression; and the parameters of Zone Cooling Sizing Factor (ID# 9), Minimum Surface Convection Heat (ID# 15), Minimum Airflow Fraction (ID# 20), Delta Adjacent Zone Air Temperature (ID# 24), Airflow Convergence Tolerance (ID# 27) and Surface Albedo (ID# 31) in the zone level regression. This indicates that the insignificant parameters were successfully differentiated from the significant parameters, which account for the simulation discrepancy.

9.3.6. Results of Simulation Discrepancy Minimization

Statistically significant parameters were processed by multi-objective programming to determine the values within the value ranges that could minimize the simulation discrepancies. The energy consumption data from January 1st to March 10th 2013 were used for model calibration, while the data from March 15th to November 15th 2014 were used for evaluating the performance of the multiple-level simulation calibration (objective 1), and the data from March 11th to April 28th were used for evaluating the model robustness by mixing the ground truth of two HVAC system load responses (objective 2) (Figure 67). In this case study, hourly simulation was conducted and hourly tolerances (from IPMVP, FEMP and ASHRAE) were used for evaluating MBE and CV(RMSE).
For objective 1, one month was used as the period for calculating the hourly MBE and CV(RMSE); for objective 2, seven days (one week) were considered as one period for calculating the hourly MBE and CV(RMSE). Specifically, the data for calibration (from January 1st to March 10th) were used for generating solutions and choosing the weights for discrepancies at three levels. With different preferences, different combinations of weight values were searched iteratively until the weighted discrepancies converged (Table 21).

Figure 67: Data collection periods for energy model calibration and evaluation. Calibration (multi-objective optimization and weight determination): Jan 1st to Feb 21st 2013 and Feb 22nd to Mar 10th 2013. Evaluation, objective 2: seven weekly periods from Mar 11th to Apr 28th 2013. Evaluation, objective 1: March 15th to November 15th 2014. Periods are labeled by baseline or bimodal response.

Table 21: The convergence curves for different combinations of preferred weights

Six possible combinations of weights were tested for building level preferred optimization, HVAC load response level preferred optimization and zone level preferred optimization. However, the different preferences did not converge to the same result in this case study. A possible reason could be that the simulation accuracy at one level conflicted with that at another. Since each objective function has its own parameters that do not exist in the other objective functions, different weight preferences assigned to the three functions may result in different solutions for those parameters. Solutions are the sets of parameter values that could minimize either the individual objective functions (building level, HVAC load response level, or zone level) or the weighted objective function. There might be one or multiple solutions, as different combinations of values may achieve the same results.
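How a weighted objective ranks candidate solutions can be shown with a small sketch. The weights are the ones this case study reports for the selected combination; the two candidate solutions and their level discrepancies are hypothetical numbers chosen only for illustration, and the function names are not from the dissertation.

```python
# Weights reported in the case study for the selected combination.
WEIGHTS = {"building": 0.27, "response": 0.39, "zone": 0.34}


def weighted_discrepancy(discrepancy, weights=WEIGHTS):
    """Weighted sum of absolute simulation discrepancies (%) over the levels."""
    return sum(weights[level] * abs(discrepancy[level]) for level in weights)


def best_solution(candidates, weights=WEIGHTS):
    """Name of the candidate with the smallest weighted discrepancy."""
    return min(candidates, key=lambda name: weighted_discrepancy(candidates[name], weights))


# Hypothetical candidates: discrepancy (%) at each level for two parameter sets.
candidates = {
    "A": {"building": 8.0, "response": 7.0, "zone": 9.0},
    "B": {"building": 6.0, "response": 9.0, "zone": 8.0},
}
```

Here candidate B wins even though A is better at the HVAC load response level, which illustrates why different weight preferences can lead to different solutions.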
Based on the results in Table 21, the HVAC load response level preferred solutions (3rd combination, with the relative preference of HVAC load response level > zone level > building level) could achieve lower simulation discrepancy and converged faster. The weight combination (w building=0.27, w response=0.39, w zone=0.34) was selected for further evaluation. Linear programming was then used to find the initial solution. The effective solutions were also searched, and the corresponding function outputs were then compared to conclude which solution could minimize the weighted simulation discrepancy. To illustrate the difference in the estimation of parameters at different levels and how they are eventually combined into one model, the following example is provided. After the initial modeling (Step 1), if, for example, Wall U-factor existed in the lists of influential parameters for all three levels based on the sensitivity analysis (Step 2), it was considered to significantly contribute to the accuracy of the final model at the building level, the HVAC load response level and the zone level. Since this parameter cannot be calculated by parameter estimation (Step 3), it is an adjustable parameter and should be determined by discrepancy analysis (Step 4) and discrepancy minimization (Step 5). After decomposing the discrepancies between the simulated and actual energy performances to the adjustable parameters, if Wall U-factor was statistically significant for contributing to the simulation discrepancies at all three levels, the weights for the three levels of simulation discrepancy can be recognized through a regression analysis. Multi-objective optimization was then conducted to find the final value of U-factor that could synergize with other parameters to minimize the weighted simulation discrepancies of the three levels. Its varying values should be limited within its parameter ranges and are recommended not to be far from the default values set in EnergyPlus.

9.3.7. Validation Findings and Discussions

To evaluate the performance of the calibrated building model at multiple levels, eight months of data from March 15th 2014 to November 15th 2014 were collected for validating multi-level simulation accuracy, and the results are shown in Figures 68 and 69.

Figure 68: MBE values for the calibrated model (x-axis: monthly periods from Mar-Apr to Oct-Nov 2014; y-axis: absolute value of simulation discrepancy, %)

Figure 69: CV(RMSE) values for the calibrated model (x-axis: monthly periods from Mar-Apr to Oct-Nov 2014; y-axis: simulation discrepancy, %)

In general, the calibrated model could simulate long-term energy consumption with an hourly error (MBE value) below 8.1% (6.9% on average) at the building level, below 7.8% (7.1% on average) at the HVAC load response level, and below 8.5% (7.7% on average) at the zone level for all of the tested months. MBE values were slightly lower than the CV(RMSE) values. One explanation could be that simulation overestimation might be compensated by underestimation. The variations of the simulation discrepancy were not significant (12% on average at the building level, 11.1% on average at the HVAC load response level and 12.8% on average at the zone level). All of the CV(RMSE) values were within the tolerances specified by ASHRAE, FEMP and IPMVP. As the evaluation results (test performance) were consistent with the calibration results (training performance), the calibrated model was not overfit. The results also demonstrated the consistency of the calibrated building energy model over time and season.
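The two accuracy metrics can be sketched as follows. The formulas are the commonly used forms and are stated here as assumptions, since the dissertation's exact definitions are not shown in this section: MBE is taken as the summed simulated-minus-measured difference over the summed measurement (positive meaning overestimation), and CV(RMSE) as the root mean square error over the mean measurement, both in percent and without a degrees-of-freedom adjustment. The function names and the hourly values are hypothetical.

```python
import math


def mbe_percent(simulated, measured):
    """Mean bias error in percent; positive values indicate overestimation."""
    bias = sum(s - m for s, m in zip(simulated, measured))
    return 100.0 * bias / sum(measured)


def cv_rmse_percent(simulated, measured):
    """Coefficient of variation of the root mean square error, in percent."""
    n = len(measured)
    rmse = math.sqrt(sum((s - m) ** 2 for s, m in zip(simulated, measured)) / n)
    return 100.0 * rmse / (sum(measured) / n)


# Hypothetical hourly energy use (kWh): errors alternate in sign.
measured = [10.0, 10.0, 10.0, 10.0]
simulated = [11.0, 9.0, 11.0, 9.0]
```

For these values the over- and underestimates cancel in MBE while CV(RMSE) still registers the hourly error, which is the compensation effect mentioned above as one reason MBE can be lower than CV(RMSE).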
Based on the results, the envelope thermal characteristics in the model may not fit the actual characteristics of the building well, as the energy model overestimated the energy consumption at all three levels when little cooling was required (March to May and October to November). However, during the seasons when cooling was required (May to October), the building level simulation tended to underestimate the energy consumption, possibly because the HVAC plants and systems were modeled as more energy efficient than they were in reality. Meanwhile, the HVAC load response level simulation may underestimate the control inefficiency, and it may also lack consideration of the thermal influences of adjacent spaces controlled by another AHU. The zone level simulation assumed lower space heat gains than the actual end-use demand and was influenced by the overestimated HVAC system performance as the outside temperature increased. Since not all the zones were selected for calibration and only average zone energy consumption was considered, simulation results for individual zones may deviate from the measured energy consumption, resulting in higher MBE and CV(RMSE) for the zone level simulation compared to the HVAC load response level and building level simulations. In order to explore whether the energy model calibrated using ground truth energy data from mixed HVAC system load responses could consistently simulate energy performance for each load response, the period from March 11th to April 30th was selected for evaluation of the second objective, during which three weeks were operated by the bimodal response in the 14 zones and the rest were operated by the baseline response. The corresponding MBE values and CV(RMSE) values were calculated at the HVAC load response level and presented for comparing the simulated results with actual energy performance (Figure 70).
It can be concluded that the hypothesis should be accepted: an energy model calibrated under two HVAC system load responses could consistently simulate actual energy consumption under either load response independently.

Figure 70: MBE values and CV(RMSE) values for the calibrated model at the HVAC load response level (baseline response and bimodal response over time)

At the HVAC load response level, the average difference in MBE between the baseline response and the bimodal response was 0.7%, and the average difference in CV(RMSE) was 0.4%, which indicated that the calibrated model was robust to the changes resulting from the building being operated differently. In order to evaluate the quality of simulation for thermal conditions, four zones in the case study building were randomly selected. For each zone, the comparisons between simulated average daily temperatures and actual average daily temperatures from March 15th to November 15th are presented in Figure 71. Since all the points were close to the line y=x (the average absolute deviation is 2.1 °C for Zone A, 2.7 °C for Zone B, 3.1 °C for Zone C and 1.9 °C for Zone D), the comparison results demonstrated that the thermal conditions were well simulated by the calibrated energy model.

Figure 71: Comparison of simulated temperature and actual temperature for four randomly selected zones

9.4. Summary

Building energy performance analysis using energy simulations could help researchers and practitioners to investigate the occupancy-loads relationships on the demand side and identify relatively optimal HVAC system responses to loads based on occupancy in existing buildings.
A well-calibrated model is crucial to accurately represent a building and provide confidence and reliability in potential energy performance estimation, especially when field experiments for testing all types of occupancy-loads relationships and HVAC responses to loads are infeasible. In particular, a building energy simulation model should have high accuracies at multiple levels for different purposes. In addition, multiple levels of accuracies are interconnected and they reflect the level of approximation of simulation results to measured energy performance. However, current calibration methods focus on single-level simulation accuracy, either at the building level or at the zone level. Accurate simulation of single level does not necessarily mean accurate simulations for other levels especially when there are several zones and multiple HVAC units. This dissertation introduced a multi-level calibration framework to improve the accuracy of building energy simulation models at multiple levels. Evidence-available parameters are identified and linked to the physical building characteristics, system properties and environmental conditions, while the remaining parameters are identified as main sources for discrepancies between the simulated and measured energy performances. A classification schema was created to classify all of the input parameters into hierarchical categories for analyzing and determining the values of the parameters without direct evidence. An optimization solution was used to minimize discrepancies and achieve simultaneously high calibration accuracy.
The calibration method followed five steps: (1) initial energy modeling using available evidence to reproduce building energy behavior; (2) sensitivity analysis for ranking the influence of each parameter on energy simulation results at multiple levels; (3) parameter estimation for determining the values of estimable parameters; (4) discrepancy analysis to explain the discrepancies between simulated and actual energy performances based on regression fitting; and (5) discrepancy minimization for determining the values of parameters to minimize the discrepancies between simulated and measured energy performance by multi-objective programming. There are more than one hundred building energy simulation programs currently used in research and practice, but no single program can simulate every pertinent aspect of a building’s behavior and conditions. Thus far, no study has specifically compared the methods and sequences in simulating the relationships and effects of occupancy on building HVAC energy consumption, as well as the HVAC system response to heating/cooling loads based on occupancy. This chapter analyzed the frequently used simulation programs, including DOE-2, EnergyPlus, IES-VE, ESP-r, and TRNSYS, to investigate their capabilities in coupling occupancy with building HVAC energy simulation. Since EnergyPlus is more suitable for simulating the energy implications of occupancy and the HVAC system response in occupancy-based HVAC system load response, EnergyPlus was used in the case study as the simulation program to validate the calibration framework. Considering the fact that energy simulation in this dissertation is mainly used to estimate expected energy efficiency by considering different occupancy-loads relationships, the energy model should be robust to the changes resulting from HVAC systems being operated differently to respond to loads injected by interior and exterior impacts.
In order to assess the credibility of expected energy efficiency resulting from different HVAC system controls, the ground truth energy data used for calibration were collected for the periods when two different kinds of HVAC response to loads were implemented. The results showed the robustness of the calibrated model to predict the performance of both responses. More importantly, the proposed calibration framework does not need retraining when changes are made to building conditions, operation and conservation measures; meanwhile it avoids the trial-and-error process, which requires significant time, effort and expertise. The presented framework is a generalizable method, which is not specific to any building type or building system type. Although EnergyPlus was used as the simulation program to validate the calibration method, the method is not designed for EnergyPlus and could be used with other simulation programs.

Chapter 10: Conclusions

Building occupancy modeling and occupancy-loads relationships were systematically investigated in this dissertation to identify the potential for building energy efficiency if occupancy is integrated with heating/cooling controls (Figure 72). The two main research objectives, (1) improving non-intrusive building occupancy modeling, including both real-time occupancy and long-term occupancy, and (2) exploring the relationships between occupancy and heating/cooling loads on the demand side for improving HVAC system energy efficiency, were both achieved.
Figure 72: Summary of building occupancy modeling and investigations of occupancy-loads relationships

First, this dissertation has presented a systematic approach to model real-time occupancy by using ambient sensing and relationship learning. The modeling procedure had a complete work structure, including hardware preparation, data collection, feature selection, supervised learning and performance evaluation. The findings supported the hypothesis that occupancy regularly influences ambient environments, and thus that there exist general relationships between changes of occupancy and variations of the ambient environment. The drawbacks of existing solutions for occupancy modeling have been resolved, and large-scale occupancy awareness at the building level could then be improved non-intrusively. Specifically, for real-time occupancy modeling, the decision tree yielded the best relative performance, indicating that each ambient factor is responsible for classifying part of the samples into different classes of occupancy. In previous work, occupancy models were mostly used to detect and estimate the occupancy in the same spaces where the models were trained; therefore, the generalizability of existing occupancy models to different spaces had not been explored. In general, collecting precise and continuous occupancy of each space for training can be time consuming and intrusive.
In most cases, it is not realistic to have access to actual occupancy for all spaces of a building, thus it is difficult to establish an individual model for each space and to improve large-scale occupancy awareness. This dissertation then presented a global occupancy modeling framework, by which the model was trained in one space and was scalable to geometrically similar spaces. Six rooms in two buildings were selected to validate the effectiveness of the framework by training different models for global testing. For long-term occupancy modeling, this dissertation presented a framework for modeling personalized occupancy profiles by integrating ambient factors and the real-time occupancy model for eliminating irregular occupancy. Occupancy was assumed to be stable over a longer period of time, which was verified by the analysis of the data. Hamming distance was used to determine the number of profiles required to represent one occupant’s presence pattern. Four types of modeling techniques (time-series modeling, pattern recognition modeling, stochastic process modeling and regression modeling) were proposed to calculate the expected presence statuses, and their performances were compared in terms of the degree of statistical approximation to real occupancy. As the probability of occupancy at each time point was modeled using the data from previous time periods, it reflected the expected presence probability of that time instead of simply using survey or observation-based data. The results showed that the personalized profiles modeled by ARMA and neural network closely approximated actual daily occupancy. ARMA, neural network, Markov chain and regression model all outperformed the observation-based method, although the latter still yielded more accurate results than the fixed design profile. Ambient factors were demonstrated to be important for differentiating regular and irregular presences.
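The role of the Hamming distance can be illustrated with a minimal sketch. Daily presence is represented as a binary vector (1 = present in a time slot), and days are grouped greedily so that each day lies within a distance threshold of some representative day. The greedy grouping rule, the threshold and the function names are assumptions made for illustration; the dissertation's actual procedure for choosing the number of profiles is not reproduced here.

```python
def hamming_distance(day_a, day_b):
    """Number of time slots in which two daily presence vectors differ."""
    if len(day_a) != len(day_b):
        raise ValueError("presence vectors must cover the same time slots")
    return sum(a != b for a, b in zip(day_a, day_b))


def count_profiles(days, max_distance):
    """Greedy estimate of how many profiles are needed to cover all days.

    A day starts a new profile only if it is farther than `max_distance`
    from every representative selected so far (hypothetical rule).
    """
    representatives = []
    for day in days:
        if not any(hamming_distance(day, rep) <= max_distance for rep in representatives):
            representatives.append(day)
    return len(representatives)


# Hypothetical presence vectors for four days, six time slots each.
days = [
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [0, 1, 1, 1, 1, 0],
]
```

With a threshold of one differing slot, the first, second and fourth days share one profile and the third day needs its own.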
The contributions of this part are exploring the relationship between the ambient environment and occupancy, and generating a non-intrusive framework to model occupancy. Regarding generalization, the occupancy modeling framework is not intended to provide a solution for a specific building but to serve as a generalized approach regardless of building types and systems. The modeling results might be data-driven, but the process is generalizable. There were no inferences about the gender or age of occupants or their profession, location in the building or the orientation of the room. These frameworks were validated on one testbed building because it was the only available source. It can be seen that the selected rooms have enough variability and diversity. Data collection covered all university semesters and seasons. Theoretically, occupancy modeling requires a long period (even several years) of occupancy data from a large range of buildings. However, such data are not available to any research group. The lack of available occupancy ground truth is a common issue not only for this research but also for any occupancy-related study. In the future, if the community could establish a standard to collect, store and share datasets, the results of this dissertation would be more convincing. Second, the relationships between occupancy and heating/cooling loads were investigated to improve building HVAC system energy efficiency. Heating/cooling loads are associated with the actual energy demands for HVAC systems. Loads represent heating/cooling requirements at the terminal level and determine the responses of HVAC system components at the system level. Despite the high volume of research activities in demand-driven HVAC system controls, it is still not clear how and when occupancy should be linked with heating and cooling controls for sustained and maximum energy efficiency.
This is a complex problem, as occupancy is stochastic in nature, and there exist heat transfer and balance among the zones of a building, as well as heat gain and loss through a building’s envelope. Since there is no systematic understanding of the relationships between occupancy and loads to achieve energy efficiency, this dissertation has focused on the types and ways of modeling the relationships that significantly influence HVAC system energy efficiency. Specifically, two relationships have been identified from two perspectives: occupancy transitions, which represent the switch between real-time occupied and unoccupied statuses, and occupancy diversity, which represents the difference in long-term occupancy. For occupancy transitions, a data-driven approach was introduced using an enhanced variable neighborhood search (EVNS) algorithm to determine setpoint/setback schedules and distances for all individual zones based on occupancy transitions, by which heating/cooling loads could be minimized. Given certain occupancy data and weather, the difference in the loads resulting from the proposed approach and the baseline control shows the quantitative relationship between occupancy transitions and heating/cooling loads for a certain period of time. The small-size reference building was used to validate the performance of integrating variable neighborhood search with a heuristic update for finding the optimal solution. It was demonstrated that the convergence of the search was not influenced by different occupancy assignments or initial solutions, and there was no random solution that could outperform the proposed approach in reducing the heating/cooling loads based on occupancy transitions at the building level. The testbed building was also simulated as a case study to show how to apply the EVNS to real-world occupancy transitions-loads analysis.
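For readers unfamiliar with the search strategy, the following is a sketch of a basic variable neighborhood search, not the enhanced variant with heuristic update used in the dissertation. Everything specific in it is a placeholder: the solution is a vector of integer setback distances per zone, the bounds are arbitrary, and the cost function is a toy stand-in for the simulated heating/cooling load, which in the actual approach would come from an energy simulation run.

```python
import random


def local_search(solution, cost, lower, upper):
    """Coordinate-wise descent: accept any +/-1 step that lowers the cost."""
    best = list(solution)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for step in (-1, 1):
                candidate = list(best)
                candidate[i] = min(upper, max(lower, candidate[i] + step))
                if cost(candidate) < cost(best):
                    best, improved = candidate, True
    return best


def shake(solution, k, lower, upper, rng):
    """Neighborhood k: re-draw k randomly chosen entries of the solution."""
    candidate = list(solution)
    for i in rng.sample(range(len(candidate)), k):
        candidate[i] = rng.randint(lower, upper)
    return candidate


def variable_neighborhood_search(initial, cost, lower, upper, k_max, iterations, seed=0):
    """Basic VNS: widen the neighborhood until an improvement is found."""
    rng = random.Random(seed)
    best = local_search(initial, cost, lower, upper)
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            candidate = local_search(shake(best, k, lower, upper, rng), cost, lower, upper)
            if cost(candidate) < cost(best):
                best, k = candidate, 1  # improvement: restart from the smallest neighborhood
            else:
                k += 1  # no improvement: try a larger neighborhood
    return best


# Toy stand-in for the simulated load; its minimum is at TARGET (hypothetical).
TARGET = [2, 0, 3, 1]


def toy_load(setbacks):
    return sum((s - t) ** 2 for s, t in zip(setbacks, TARGET))


result = variable_neighborhood_search([10, 10, 10, 10], toy_load, 0, 10, k_max=2, iterations=5)
```

Because the toy cost is separable and convex, the local search alone already reaches the minimum; with a simulation-based load the shaking step is what lets the search escape local minima.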
For occupancy diversity, an iterative evaluation algorithm, based on agglomerative hierarchical clustering, was introduced to eliminate occupancy diversity based on three levels of constraints. A testbed building and virtual reference buildings were used for validation. The agglomerative hierarchical clustering process was improved using a heap to reduce the time and computational complexity of updating the distances among clusters. When using a real-world testbed building, the iterative evaluation algorithm outperformed other possible trials of eliminating diversity, and was effective in quantifying the energy implications of diversity. The method could reduce the loads that are not the actual demands for HVAC systems to leverage the effects of occupancy-driven setpoint control. The performance of the proposed framework was also consistent over different building geometries, different building layouts and different diversities when using virtual reference buildings. The more complicated the building geometries were, the more significant the influence of diversity on HVAC system energy efficiency was. The contribution of these investigations is to increase our knowledge about the impacts of occupancy transitions and diversity on HVAC system energy efficiency, which could bring about potential energy savings and could facilitate better-informed decision making for energy-efficient HVAC system response control strategies. The investigations presented in this dissertation do not aim to provide any specific solution to account for variant transitions or eliminate diversity for a specific building, for a specific HVAC system or in a specific climate. Although centrally controlled HVAC VAV systems were used in the testbed building and virtual reference buildings for validation, the proposed framework could be applied to other buildings and HVAC systems.
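The heap-based clustering mentioned above for occupancy diversity can be sketched as follows. This is a generic agglomerative clustering with a priority queue and lazy deletion of stale entries, shown only to illustrate the idea; it does not implement the dissertation's iterative evaluation or its three levels of constraints. The Hamming distance between binary occupancy profiles, the average linkage and the example profiles are all assumptions.

```python
import heapq
from itertools import combinations


def profile_distance(a, b):
    """Hamming distance between two binary occupancy profiles."""
    return sum(x != y for x, y in zip(a, b))


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(i, j) for i in cluster_a for j in cluster_b]
    return sum(profile_distance(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)


def agglomerative_clusters(profiles, target_count):
    """Merge the closest clusters until `target_count` clusters remain."""
    clusters = {i: (i,) for i in range(len(profiles))}
    heap = [(average_linkage(clusters[a], clusters[b], profiles), a, b)
            for a, b in combinations(clusters, 2)]
    heapq.heapify(heap)
    next_id = len(profiles)
    while len(clusters) > target_count and heap:
        _, a, b = heapq.heappop(heap)
        if a not in clusters or b not in clusters:
            continue  # stale entry: one of the clusters was merged earlier
        merged = clusters.pop(a) + clusters.pop(b)
        # Only distances from the new cluster need to be pushed onto the heap.
        for other_id, members in clusters.items():
            heapq.heappush(heap, (average_linkage(merged, members, profiles), other_id, next_id))
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical profiles: occupants 0 and 1 are present early, 2 and 3 late.
profiles = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
```

Calling `agglomerative_clusters(profiles, 2)` groups occupants with similar presence patterns, which is the kind of grouping that could then inform how occupants are assigned to zones.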
It cannot be guaranteed that an exact percentage of energy could be saved; rather, these investigations are intended to identify and model the relationships between occupancy characteristics and heating/cooling loads. The unnecessary energy consumption that is associated with occupancy could be quantified and reduced by integrating occupancy transitions and diversity with setpoint control for heating and cooling. As part of the investigations, building energy simulation, the virtual representation and reproduction of energy processes for either an entire building or a specific space, is used to implement and validate the methods for modeling occupancy-loads relationships. Compared to field experiments, simulation has several advantages such as feasibility, controllability, reversibility and non-intrusion. It is widely accepted that a building energy model must be well calibrated before any simulations, but there is no generally adopted calibration methodology in either academia or industry to achieve high energy simulation accuracies at multiple levels. This dissertation has introduced a novel multi-level calibration framework to calibrate a building energy model at multiple levels (e.g., at the building level, at the energy conservation measure level, and at the zone level) simultaneously. A parameter classification schema was designed to classify all of the input parameters into hierarchical categories for analyzing and determining their values. By mixing the energy ground truth for calibration, the calibrated model was robust to the changes resulting from buildings and building systems being operated differently. The proposed calibration framework does not need retraining when changes are made to building systems and conservation measures; meanwhile it avoids the trial-and-error process, which requires significant time, effort and expertise.
A comparative study of different energy simulation programs was also conducted, by which the EnergyPlus simulation program was chosen for the validation of the calibration framework. The contribution of this part is to provide a multi-level calibration framework to improve building energy simulation accuracy at multiple levels simultaneously. It couples the advantages of both evidence-based calibration and statistical-learning-based calibration. To be specific, the classification schema only defines the functions and essences of categories, and the specific members vary case by case. Regarding generalization, the proposed framework provides a general method for multi-level simulation calibration. It is not tied to building or building system types. EnergyPlus is used to validate the framework; however, the steps are not restricted to EnergyPlus and could be used for other simulation programs. This framework is also not specific to building level or zone level energy simulation calibration; the discrepancy minimization could be applied to any multiple levels of calibration.

Chapter 11: Limitations and Future Research Directions

This dissertation bears certain limitations, which should be addressed in future research. For real-time occupancy modeling, the effects of each ambient factor on the modeling performance have not been analyzed. The quantitative contributions of each factor to the F-measure, RMSE, occupied/unoccupied detection accuracy and number estimation accuracy will be investigated in future studies. Their joint effects will also be taken into consideration. Since only a rapid feature selection was conducted to process the ambient factors, some representative and informative features may have been neglected and there might still be redundancy in the final feature set. In future work, more advanced feature construction methods, such as matrix factorization and supervised feature selection, could be applied to determine the best features for learning.
In addition, other factors that potentially influence the modeling performance, such as space orientation, will be investigated. Lastly, the results come from six geometrically similar rooms, which may make generalization difficult; more data should be collected to further examine the findings of this dissertation.

For long-term occupancy modeling, four months were defined as the long term based on the outside temperature statistics and the university calendar. Since the exact length of the long term is case specific and not fixed, online learning will be implemented to update occupancy profiles so that they adapt to changes in presence over time, including seasonal and personal changes. In addition, if a room accommodates more than one occupant or there are multiple rooms in one zone, the long-term occupancy should be modeled as an integrated whole of multiple occupants’ patterns instead of simply combining the occupancy profiles of individual occupants or rooms. How to model the joint occupancy profile will be systematically explored. Furthermore, the number profiles of different spaces are interrelated, as the number of occupants at the building level generally remains constant. A reduction in the number of occupants in one space usually indicates an increase in another space. This information could be further utilized to support building-level long-term occupancy modeling and dynamics analysis.

The importance of occupancy-loads relationships to HVAC system energy efficiency has been identified in this dissertation, as well as the ways of modeling them. Specifically, occupancy transitions and occupancy diversity were coupled with setpoint control to reduce heating/cooling loads. Several areas for improvement are outlined here for future exploration. For occupancy transitions, we will test the EVNS in different climate zones to analyze the patterns of heating loads and cooling loads separately.
Different types of commercial buildings will be used to explore the correlations among a building’s physical characteristics, heating/cooling loads, and occupancy transitions. The relationships between occupancy transitions and heating/cooling loads are time-variant and have to be calculated for periods of different lengths (e.g., one year, one month, or one week). Although the number of simulation runs is significantly reduced, hundreds of iterations still take a considerable amount of time. In our future work, we will use statistical analysis to find the relationships and patterns among the different levels of occupancy transitions, weather, and optimized combinations of setpoint/setback schedules and distances. Factors that may influence the occupancy transitions-loads relationships, including arrival time, departure time, average morning/afternoon absence length, and break lengths, will also be identified by computer learning. Lastly, in this dissertation the setpoint was not restored until the space became occupied again, as occupancy prediction has not been incorporated into the approach. Since additional errors might be introduced when prediction is used to recondition a space from setback to setpoint before occupants arrive, the energy implications of pre-control will be systematically analyzed in our future studies to determine the dynamic time point for switching from the setback period to the reconditioning period.

For occupancy diversity, the occupancy data from the testbed building was used for both validations, which might not be representative of all cases. We will continue to collect occupancy data from a wide range of buildings for larger-scale validations. Second, if occupants could be reassigned to spaces or moved to new buildings in the future, a more dynamic and robust least-cost match between occupants and spaces will be generated automatically for facility management and space planning.
Third, if changing the occupancy-space relationship is not feasible, occupancy-driven setpoint control should be optimized globally to reduce the loads caused by occupancy diversity, by allowing different zones to have different setback values and different waiting times to trigger the setback, as well as different supply air temperatures and flow rates.

REFERENCES

[1] U.S. Environmental Protection Agency. 1995. The inside story: A guide to indoor air quality. EPA 402K-93-007. Updated:1993. Available at: http://www.epa.gov/iaq/pubs/insidestory.html#IAQHome2. [2] International Energy Agency. 2013. Transition to sustainable buildings-strategies and opportunities to 2050. Paris. France. [3] U.S. Department of Energy. 2014. Building energy consumption and efficiency. Available at: http://www.eia.gov/consumption/commercial/. [4] U.S. Department of Energy. 2014. Building energy data book. Available at: http://buildingsdatabook.eren.doe.gov/ChapterIntro3.aspx. [5] U.S. Department of Energy. 2013. Energy efficiency & renewable energy. Available at: http://www.eere.energy.gov/. Accessed Nov,2013. [6] U.S. Department of Energy. 2013. Better buildings, brighter future. [7] U.S. Green Building Council. 2010. Buildings and climate change. [8] U.S. Green Building Council. 2009. LEED reference guide for green building design and construction. [9] Energy Information Administration. 2012. Annual energy outlook 2012 with projections to 2035. [10] Bartlett D. 2011. The top ten ways we waste energy and water in buildings. Available at: http://breakingenergy.com/2011/07/26/the-top-ten-ways-we-waste-energy-and-water-in-buildings/. [11] Kelly M.J. 2010. Energy efficiency, resilience to future climates and long-term sustainability: the role of the built environment. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences. 368 (1914):1083-1089. [12] Energy Information Administration. 2008.
Commercial buildings energy consumption survey (CBECS). [13] United Nations Environment Programme. 2007. Buildings can play key role in combating climate change. Available at: http://www.unep.org/Documents.Multilingual/Default.Print.asp. [14] American Society of Heating, Refrigerating, and Air Conditioning Engineers. 1996. Handbook of HVAC systems and equipment. [15] American Society of Heating, Refrigerating, and Air Conditioning Engineers. 2013. ANSI/ASHRAE/IES standard 90.1-2013: energy standard for buildings except low-rise residential buildings. [16] Carbon Trust. 2012. Building controls- realising savings through the use of controls. [17] Lo L.J. and Novoselac A. 2010. Localized air-conditioning with occupancy control in an open office. Energy and Buildings. 42 (7):1120-1128. [18] Wang W., Katipamula S., Huang Y., and Brambley M.R. 2011. Energy savings and economics of advanced control strategies for packaged air-conditioning units with gas heat. 2011. Pacific Northwest National Laboratory (PNNL). [19] White F. 2012. Save big on heating, cooling costs with efficiency controls. Available at: http://www.pnnl.gov/news/release.aspx?id=919. [20] U.S. Green Building Council. Buildings and climate change. 2011. Available at: http://www.documents.dgs.ca.gov/dgs/pio/facts/LA%20workshop/climate.pdf. [21] Guerra-Santin O. and Itard L. 2010. Occupants' behaviour: determinants and effects on residential heating consumption. Building Research and Information. 38 (3): 318-338. [22] Kiliccote S., Piette M.A., Watson D.S., Hughes G. 2006. Dynamic controls for energy efficiency and demand response: framework concepts and a new construction study case in New York. ACEEE Summary Study on Energy Efficiency in Buildings. Pacific Grove, CA. [23] Mathews E.G., Botha C.P., Arndt D.D., and Malan A. 2001. HVAC control strategies to enhance comfort and minimise energy usage. Energy and Buildings. 33 (8):853-863. [24] Wang S. and Ma Z. 2008. 
Supervisory and optimal control of building HVAC systems: a review. HVAC&R Research. 14 (1):03-32. [25] Motegi N., Piette M.A., Watson D.S., Kiliccote S., and Xu P. 2007. Introduction to commercial building control strategies and techniques for demand response. Lawrence Berkeley National Laboratory, Berkeley. [26] Salsbury T., Mhaskar P., and Qin S.J. 2013. Predictive control methods to improve energy efficiency and reduce demand in buildings. Computer & Chemical Engineering. 51: 77-85. [27] Tsao Y. and Hsu J.Y. 2013. Demand-driven power saving by multiagent negotiation for HVAC control. Joint Proceedings of the Workshop on AI Problems and Approaches for Intelligent Environments and Workshop on Semantic Cities. Beijing China. [28] Jazizadeh F., Ghahramani A., Becerik-Gerber B., Kichkaylo T., and Orosz M. 2014. User-led decentralized thermal comfort driven HVAC operations for improved efficiency in office buildings. Energy and Buildings. 70: 398-410. [29] Li N., Calis G., and Becerik-Gerber B. 2012. Measuring and monitoring occupancy with an RFID based system for demand-driven HVAC operations. Automation in Construction. 24: 89-99. [30] Schumacher M. 2012. Energy performance of buildings-Impact of Building Automation, Controls and Building Management. European Standard EN 15232. [31] Lütz H. 2012. The new version of EN15232 effects of building automation on building efficiency. 2012. Honeywell. [32] ABB Inc. 2010. The European Standard EN 15232: A key contribution to worldwide energy efficiency. [33] Yu Z., Fung B., Haghighat F., Yoshino H., and Morofsky E. 2011. A systematic procedure to study the influence of occupant behavior on building energy consumption. Energy and Buildings. 43 (6):1409-1417. [34] Wasilowski H. and Reinhart C. 2009. Modelling an existing building in DesignBuilder/E : Custom versus default inputs. Proceedings of Building Simulation Conference, Glasgow, Scotland. [35] Azar E., and Menassa C.C. 2011.
Agent-based modeling of occupants and their impact on energy use in commercial buildings. Journal of Computing in Civil Engineering. 26 (4): 506-518. [36] Hoes P., Hensen J., Loomans M., De Vries B., and Bourgeois D. 2009. User behavior in whole building simulation. Energy and Buildings. 41 (3):295-302. [37] Erickson V.L., Lin Y., Kamthe A., et al. Energy efficient building environment control strategies using real-time occupancy measurements. 2009. Proceedings of the First ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings. Berkeley, CA. [38] Johnson Controls Inc. 2010 Building energy efficiency and sustainability. [39] Pisello A.L., Bobker M., and Cotana F. 2012. A building energy efficiency optimization method by evaluating the effective thermal zones occupancy. Energies. 5 (12): 5257-5278. [40] Lu J., Sookoor T., Srinivasan V., et al. 2010. The smart thermostat: using occupancy sensors to save energy in homes. Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems. Zurich, Switzerland. [41] Brandemuehl M.J. and Braun J.E. 1999. The impact of demand-controlled and economizer ventilation strategies on energy use in buildings. Proceedings of ASHRAE Annual Meeting. Boulder, CO. [42] Lawrence T.M. and Braun J.E. 2007. A methodology for estimating occupant CO 2 source generation rates from measurements in small commercial buildings. Building and Environment. 42 (2): 623-639. [43] Masoso O.T. and Grobler L.J. 2010. The dark side of occupants’ behaviour on building energy use. Energy and Buildings. 42 (2):173-177. [44] Azar E. and Menassa C.C. 2011. A decision framework for energy use reduction initiatives in commercial buildings. Winter Simulation Conference. Phoenix.AZ. [45] Webber C.A., Roberson J.A., McWhinney M.C., Brown R.E., Pinckard M.J., Busch J.F. 2006. After-hours power status of office equipment in the USA. Energy. 31 (14):2823-2838. [46] Emery A. and Kippenhan C. 2006. 
A long term study of residential home heating consumption and the effect of occupant behavior on homes in the Pacific Northwest constructed according to improved thermal standards. Energy. 31 (5):677-693. [47] Wang C., Yan Y., and Jiang Y. 2011. A novel approach for building occupancy simulation. Building Simulation. 4: 149-167. [48] Yu T. 2010. Modeling occupancy behavior for energy efficiency and occupants comfort management in intelligent buildings. Ninth International Conference on Machine Learning and Applications. Washington, DC. [49] Kwok S.S., Yuen R.K., Lee E.W. 2011. An intelligent approach to assessing the effect of building occupancy on building cooling load prediction. Building and Environment. 46 (8):1681-1690. [50] Zhang R., Lam K.P., Chiou Y.S., and Dong B. 2012. Information-theoretic environment features selection for occupancy detection in open office spaces. 5: 179-188. [51] Tabak V., de Vries B. 2010. Methods for the prediction of intermediate activities by office occupants. Building and Environment. 45 (6):1366-1372. [52] Andersen R.V., Toftum J., Andersen K.K., and Olesen B.W. 2009. Survey of occupant behaviour and control of indoor environment in Danish dwellings. Energy and Buildings. 41 (1):11-16. [53] de Groot E., Spiekman M., and Opstelten I. 2008. Dutch research into user behaviour in relation to energy use of residences. Proceedings of PLEA Conference. Dublin, Ireland. [54] Široký J., Oldewurtel F., Cigler J., and Prívara S. 2011. Experimental analysis of model predictive control for an energy efficient building heating system. Applied Energy. 88 (9): 3079-3087. [55] Pisello A.L., Goretti M., Cotana F. 2012. A method for assessing buildings’ energy efficiency by dynamic simulation and experimental activity. Applied Energy. 97: 419-29. [56] Huang Y., Niu J., and Chung T. 2013. Study on performance of energy-efficient retrofitting measures on commercial building external walls in cooling-dominant cities. Applied Energy. 103: 97-108.
[57] Trčka M. and Hensen J.L. 2010. Overview of HVAC system simulation. Automation in Construction. 19(2): 93-99. [58] Yang Z. and Becerik-Gerber B. 2015. A model calibration framework for simultaneous multi-level building energy simulation. Applied Energy. 149: 415-431. [59] Lü X., Lu T., Kibert C.J., and Viljanen M. 2014. A novel dynamic modeling approach for predicting building energy performance. Applied Energy. 114: 91-103. [60] Westphalen D. and Koszalinski S. 2001. Energy consumption characteristics of commercial building HVAC systems. Volume I: chillers, refrigerant compressors, and heating systems. Arthur D.Little Reference No. 33745-00. [61] Energy Star. 2011. Buildings manual: heating and cooling system upgrades. [62] American Society of Heating, Refrigerating and Air-Conditioning Engineers. 2009. Standard A. Standard 189.1-2009, Standard for the design of high-performance green buildings except low-rise residential buildings. [63] Zheng G. and Zaheer-Uddin M. 1996. Optimization of thermal processes in a variable air volume HVAC system. Energy. 21 (5): 407-420. [64] Ardehali M.M. and Smith T.F. 1997. Evaluation of HVAC system operational strategies for commercial buildings. Energy Conversion and Management. 38 (3): 225-236. [65] Aktacir M.A., Büyükalaca O., and Yılmaz T. 2006. Life-cycle cost analysis for constant-air-volume and variable-air-volume air-conditioning systems. Applied Energy. 83 (6): 606-627. [66] Pritoni M., Meier A., and Perry D. 2011. Human factors in climate controls for small commercial buildings. Johnson Control. [67] American Society of Heating, Refrigerating and Air-Conditioning Engineers. 2010. 62.1 User's Manual: ANSI/ASHRAE Standard 62.1-2010: ventilation for acceptable indoor air quality. [68] Roth K., Westphalen D., Feng M., Llana P., and Quartararo L. 2005. Energy impact of commercial building controls and performance diagnostics: market characterization, energy impact of building faults and energy savings potential. 
Report for U.S. Department of Energy. [69] Garwin T.M., Pollard N.A., and Tuohy R.V. 2004. Project responder: national technology plan for emergency response to catastrophic terrorism. The National Memorial Institute for the Prevention of Terrorism. [70] National Fallen Firefighters Foundation. 2005. Report of the National Fire Service Research Agenda Symposium. [71] Tomastik R., Narayanan S., Banaszuk A., and Meyn S. 2010. Model-based real-time estimation of building occupancy during emergency egress. Pedestrian and Evacuation Dynamics. 2008: 215-224. [72] Wang H.T., Jia Q.S., Song C., Yuan R., and Guan X. 2010. Estimation of occupancy level in indoor environment based on heterogeneous information fusion. 49th IEEE Conference on Decision and Control (CDC). Atlanta, GA. [73] Benezeth Y., Laurent H., Emile B., and Rosenberger C. 2011. Towards a sensor for detecting human presence and characterizing activity. Energy and Buildings. 43 (2): 305-314. [74] Shih H. 2014. A robust occupancy detection and tracking algorithm for the automatic monitoring and commissioning of a building. Energy and Buildings. 77: 270-280. [75] Funiak S., Guestrin C., Paskin M., and Sukthankar R. 2006. Distributed localization of networked cameras. Proceedings of the 5th International Conference on Information Processing in Sensor Networks. Nashville, TN. [76] Chen T.H., Chen T.Y. and Chen Z.X. 2006. An intelligent people-flow counting method for passing through a gate. IEEE Conference on Robotics, Automation and Mechatronics. Bangkok, Thailand. [77] Abushakra B. and Claridge D.E. 2008. Modeling office building occupancy in hourly data-driven and detailed energy simulation programs. ASHRAE Transactions. 114 (2): 472-481. [78] Chen D., Barker S., Subbaswamy A., Irwin D., and Shenoy P. 2013. Non-intrusive occupancy monitoring using smart meters. Proceedings of the 5th ACM Workshop on Embedded Systems For Energy-Efficient Buildings. Rome, Italy. [79] Chen Y. and Oh H. 2014.
A survey of measurement-based spectrum occupancy modeling for cognitive radios. IEEE Communications Survey & Tutorials. 8 (1): 848-859. [80] Dodier R.H., Henze G.P., Tiller D.K., and Guo X. 2006. Building occupancy detection through sensor belief networks. Energy and Buildings. 38 (9):1033-1043. [81] Guo X., Tiller D.K., Henze G.P., Waters C. 2010. The performance of occupancy-based lighting control systems: A review. Lighting Research and Technology. 42 (4): 415-431. [82] Kim Y., Schmid T., Charbiwala Z.M., and Srivastava M.B. 2009. ViridiScope: design and implementation of a fine grained power monitoring system for homes. Proceedings of the 11th International Conference on Ubiquitous Computing. Orlando, Florida. [83] Hoeynck M. and Andrews B.W. 2008. Sensor-based occupancy and behavior prediction method for intelligently controlling energy consumption within a Building. Patent U.S. 20100025483. [84] Zeeman A.S, Booysen M.J., Ruggeri G., and Laganá B. 2013. Capacitive seat sensors for multiple occupancy detection using a low-cost setup. IEEE International Conference on Industrial Technology (ICIT). Cape Town, South Africa. [85] Fraden J. 2010. Occupancy and motion detectors. Handbook of Modern Sensors. Springer. 2010: 247-278. [86] Melikov A.K. and Pokora P. 2014. Occupant body movement and seat occupancy rate for design of desk micro-environment. 13th SCANVAC International Conference on Air Distribution in Rooms. San Paulo, Brazil. [87] Srinivasan V., Stankovic J., and Whitehouse K. 2010. Using height sensors for biometric identification in multi-resident homes. Pervasive Computing. Springer. 2010: 337-354. [88] Melfi R., Rosenblum B., Nordman B, and Christensen K. 2011. Measuring building occupancy using existing network infrastructure. International Green Computing Conference and Workshops (IGCC). Orlando, Florida. [89] So A. and Chan W. 2012. Intelligent building systems. International Series on Asian Studies in Computer and Information Science. Springer. 
[90] Barbato A., Borsani L., Capone A., and Melzi S. 2009. Home energy saving through a user profiling system based on wireless sensors. Proceedings of the First ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings. Berkeley, CA. [91] Doukas H., Patlitzianas K.D., Iatropoulos K., and Psarras J. 2007. Intelligent building energy management system using rule sets. Building and Environment. 42 (10): 3562-3569. [92] Jiang X., Dawson-Haggerty S., Dutta P., and Culler D. 2009. Design and implementation of a high-fidelity ac metering network. International Conference on Information Processing in Sensor Networks, 2009. San Francisco, CA. [93] Lifton J., Feldmeier M., Ono Y., Lewis C., and Paradiso J.A. 2007. A platform for ubiquitous sensor deployment in occupational and domestic environments. 6th International Symposium on Information Processing in Sensor Networks. Cambridge, MA. [94] Delaney D.T., O'Hare G.M.P, and Ruzzelli A.G. 2009. Evaluation of energy-efficiency in lighting systems using sensor networks. Proceedings of the First ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings. Berkeley, CA. [95] Schoofs A., Guerrieri A., Delaney D.T., O'Hare G.M.P, and Ruzzelli A.G. 2010. Annot: Automated electricity data annotation using wireless sensor networks. 7th Annual IEEE Communications Society Conference on Sensor Mesh and Ad Hoc Communications and Networks. Boston, MA. [96] California Energy Commission. 2014. California 2013 building energy efficiency standards. Available at: http://www.energy.ca.gov/title24/2013standards/index.html. [97] Kazmi A.H., O'grady M.J., Delaney D.T., Ruzzelli A.G., O'hare G.M.P. 2014. A review of wireless-sensor-network-enabled building energy management systems. ACM Transactions on Sensor Networks 10(4): 66-108. [98] Howard J. and Hoff W. 2013. Forecasting building occupancy using sensor network data.
Proceedings of the 2nd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. Chicago, IL. [99] Gunay H.B., O'Brien W., and Beausoleil-Morrison I. 2013. A critical review of observation studies, modeling, and simulation of adaptive occupant behaviors in offices. Building and Environment. 70: 31-47. [100] Agarwal Y., Balaji B, Dutta S., Gupta R., and Weng T. 2011. Duty-cycling buildings aggressively: The next frontier in HVAC control. 10th International Conference on Information Processing in Sensor Networks (IPSN). Chicago, IL. [101] Jazizadeh F., Ghahramani A., Becerik-Gerber B., Kichkaylo T., and Orosz M. 2013. Human- building interaction framework for personalized thermal comfort driven systems in office buildings. Journal of Computing in Civil Engineering. 28 (1): 2-16. [102] Jazizadeh F. and Becerik-Gerber B. 2012. Toward adaptive comfort management in office buildings using participatory sensing for end user driven control. Proceedings of the Fourth ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings. Toronto, Canada. [103] Sun Z., Wang S., and Ma Z. 2011. In-situ implementation and validation of a CO 2-based adaptive demand-controlled ventilation strategy in a multi-zone office building. Building and Environment. 46 (1): 124-33. [104] Leephakpreeda T., Thitipatanapong R., Grittiyachot T., and Yungchareon V. 2001. Occupancy- based control of indoor air ventilation: a theoretical and experimental study. Science Asia. 27(4): 279-284. [105] Nielsen T.R. and Drivsholm C. 2010. Energy efficient demand controlled ventilation in single family houses. Energy and Buildings. 42 (11): 1995-1998. [106] Lam K.P., Höynck M., Dong B., Andrews B., Chiou Y., Zhang R., et al. 2009. Occupancy detection through an extensive environmental sensor network in an open-plan office building. IBPSA Building Simulation. 145: 1452-1459. 
[107] Agarwal Y., Balaji B., Gupta R., Lyles J., Wei M. and Weng T. 2010. Occupancy-driven energy management for smart building automation. Proceedings of the 2nd ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Building. Zurich, Switzerland. [108] Meyn S., Surana A., Lin Y., Oggianu S.M., Narayanan S., and Frewen T.A. 2009. A sensor-utility-network method for estimation of occupancy in buildings. Proceedings of the 48th IEEE Conference Decision and Control. Shanghai, China. [109] Henze G.P., Felsmann C., and Knabe G. 2004. Evaluation of optimal control for active and passive building thermal storage. International Journal of Thermal Sciences. 43 (2): 173-183. [110] Hailemariam E., Goldstein R., Attar R., and Khan A. Real-time occupancy detection using decision trees with multiple sensor types. Proceedings of the 2011 Symposium on Simulation for Architecture and Urban Design. Society for Computer Simulation. Boston, MA. [111] Dong B., Andrews B., Lam K.P., Höynck M., Zhang R., Chiou Y., et al. 2010. An information technology enabled sustainability test-bed (ITEST) for occupancy detection through an environmental sensing network. Energy and Buildings. 42 (7): 1038-1046. [112] Hutchins J., Ihler A., and Smyth P. Modeling count data from multiple sensors: a building occupancy model. 2nd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing. St. Thomas, VI. [113] Yang Z., Li N., Becerik-Gerber B., and Orosz M. 2013. A systematic approach to occupancy modeling in ambient sensor–rich buildings. Simulation. 90 (8): 960-977. [114] Erickson V.L. and Cerpa A.E. 2010. Occupancy based demand response HVAC control strategy. Proceedings of the 2nd ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Building. Berkeley, CA. [115] Dong B. and Andrews B. 2009. Sensor-based occupancy behavioral pattern recognition for energy and comfort management in intelligent buildings.
The 11th International Building Performance Simulation Association Conference. Glasgow, Scotland. [116] Tachwali Y., Refai H., and Fagan J.E. 2007. Minimizing HVAC energy consumption using a wireless sensor network. 33rd Annual Conference of the IEEE on Industrial Electronics Society. Taipei, Taiwan. [117] Abushakra B., Sreshthaputra A., Haberl J., and Claridge D.E. 2001. Compilation of diversity factors and schedules for energy and cooling load calculations. Energy Systems Laboratory, Texas A&M University. [118] Yan D., Xia J., Tang W., Song F., Zhang X., and Jiang Y. 2008. DeST—An integrated building simulation toolkit Part I: Fundamentals. Building Simulation. 1 (2): 95-110. [119] Davis III J.A., Nutter D.W. 2010. Occupancy diversity factors for common university building types. Energy and Buildings. 42 (9): 1543-1551. [120] Duarte C., Van Den Wymelenberg K., and Rieger C. 2013. Revealing occupancy patterns in an office building through the use of occupancy sensor data. Energy and Buildings. 67: 587-595. [121] Wang D., Federspiel C.C., Rubinstein F. 2005. Modeling occupancy in single person offices. Energy and Buildings. 37 (2): 121-126. [122] Mahdavi A. and Pröglhöf C. User behaviour and energy performance in buildings. International Energy Economics Workshop. Venice, Italy. [123] Mahdavi A. 2009. Patterns and implications of user control actions in buildings. Indoor and Built Environment. 18 (5): 440-446. [124] Page J., Robinson D., Morel N., Scartezzini J. 2008. A generalised stochastic model for the simulation of occupant presence. Energy and Buildings. 40 (2): 83-98. [125] Goldstein R., Tessier A., and Khan A. 2010. Schedule-calibrated occupant behavior simulation. Proceedings of the 2010 Spring Simulation Multiconference: Symposium on Simulation for Architecture and Urban Design. Orlando, FL. [126] Richardson I., Thomson M., Infield D. 2008. A high-resolution domestic building occupancy model for energy demand simulations. Energy and Buildings. 
40 (8): 1560-1566. [127] Macdonald I. and Strachan P. 2001. Practical application of uncertainty analysis. Energy Build. 33 (3): 219-227. [128] Reinhart C.F. 2004. Lightswitch-2002: a model for manual and automated control of electric lighting and blinds. Solar Energy. 77 (1): 15-28. [129] Yamaguchi Y., Shimoda Y, and Mizuno M. 2003. Development of district energy system simulation model based on detailed energy demand model. Proceeding of Eighth International IBPSA Conference. Eindhoven, Netherland. [130] Akhlaghinia M.J., Lotfi A., Langensiepen C., and Sherkat N. 2008. A fuzzy predictor model for the occupancy prediction of an intelligent inhabited environment. IEEE International Conference on Fuzzy Systems. Hong Kong. [131] Stoppel C.M. and Leite F. 2014. Integrating probabilistic methods for describing occupant presence with building energy simulation models. Energy and Buildings. 68: 99-107. [132] Chang W.K., and Hong T. 2013. Statistical analysis and modeling of occupancy patterns in open-plan offices using measured lighting-switch data. Building Simulation. Springer. 6: 23-32. [133] Mahdavi A. and Pröglhöf C. 2008. Observation-based models of user control actions in building. Proceedings of PLEA–Passive and Low Energy Architecture 2008 Conference. Dublin, Ireland. [134] Mohammadi A., Kabir E., Mahdavi A., and Pröglhöf C. 2007. Modeling user control of lighting and shading devices in office buildings: an empirical case study. Building Simulation. Beijing, China. [135] Karp B. and Kung H.T. GPSR: Greedy perimeter stateless routing for wireless networks. Proceedings of the 6th Annual International Conference on Mobile Computing and Networking. Boston, MA. [136] Cook D.J., Das S.K. 2005. Smart environments: technologies, protocols, and applications. Wiley. [137] Zhu Y., Liu M., Batten T., Noboa H., Claridge D.E. and Turner W.D. 2000. Optimization of control strategies for HVAC terminal boxes.
Proceedings of 12th Symposium on Improving Building Systems in Hot and Humid Climates. San Antonio, TX. [138] Deng K., Barooah P., Mehta P.G., and Meyn S.P. 2010. Building thermal model reduction via aggregation of states. American Control Conference (ACC). Baltimore, MD. [139] Oldewurtel F., Parisio A., Jones C.N., et al. Energy efficient building climate control using stochastic model predictive control and weather predictions. American Control Conference (ACC). Baltimore, MD. [140] Ma Y., Anderson G., and Borrelli F. 2011. A distributed predictive control approach to building temperature regulation. American Control Conference (ACC). San Francisco, CA. [141] Nghiem T. and Pappas G.J. 2011. Receding-horizon supervisory control of green buildings. American Control Conference (ACC). San Francisco, CA. [142] Aswani A., Master N., Taneja J., Culler D., Tomlin C. 2012. Reducing transient and steady state electricity consumption in HVAC using learning-based model-predictive control. Proceedings of the IEEE. 100 (1): 240-253. [143] Gao G. and Whitehouse K. 2009. The self-programming thermostat: optimizing setback schedules based on home occupancy patterns. Proceedings of the First ACM Workshop on Embedded Sensing Systems for Energy-Efficiency in Buildings. Berkeley, CA. [144] Dong B., Lam K.P., and Neuman C. 2011. Integrated building control based on occupant behavior pattern detection and local weather forecasting. Twelfth International IBPSA Conference. Sydney, Australia. [145] NEST Lab. 2014. Life with Nest thermostat. Available at: https://nest.com/thermostat/inside-and- out/#nest-sense. [146] Telkonet SmartEnergy. 2010. Available at: http://www.telkonet.com/index.php. [147] Oldewurtel F., Sturzenegger D., and Morari M. 2013. Importance of occupancy information for building climate control. Applied Energy. 101: 521-32. [148] Goyal S., Ingley H.A., and Barooah P. 2012. 
Effect of various uncertainties on the performance of occupancy-based optimal control of HVAC zones. IEEE 51st Annual Conference on Decision and Control (CDC). Maui, Hawaii. [149] Goyal S., Ingley H.A., and Barooah P. 2012. Zone-level control algorithms based on occupancy information for energy efficient buildings. American Control Conference (ACC). Montreal, Canada. [150] Scott J., Brush A., Krumm J., et al. 2011. PreHeat: controlling home heating using occupancy prediction. Proceedings of the 13th international conference on Ubiquitous computing. Beijing, China. [151] University of California, Berkeley. 2013. Fundamental of HVAC controls. [152] American Society of Heating, Refrigerating and Air-Conditioning Engineers. 2007. Standard method of test for the evaluation of building energy analysis computer programs. [153] American Society of Heating, Refrigerating and Air-Conditioning Engineers. 2004. Standard 90.1-2004, energy standard for buildings except low rise residential buildings. [154] American Society of Heating, Refrigerating and Air-Conditioning Engineers. 2007. Energy standard for buildings except low-rise residential buildings. [155] Claridge D., Abushakra B., Haberl J., and Sreshthaputra A. 2004. Electricity diversity profiles for energy simulation of office buildings (RP-1093). ASHRAE Transactions. 110 (1): 365-377. [156] Hong T. and Lin H.W. 2013. Occupant behavior: impact on energy use of private offices. ASim 2012-1st Asia Conference of International Building Performance Simulation Association. Shanghai, China. [157] U.S General Service Administration. 2013. Building automation system (BAS). Available at: http://www.gsa.gov/portal/content/101302. [158] Fumo N. 2014. A review on the basics of building energy estimation. Renewable and Sustainable Energy Reviews. 31: 53-60. [159] Manfren M., Aste N., Moshksar R. 2013.
Calibration and uncertainty analysis for computer models–A meta-model based approach for integrated building energy simulation. Applied Energy. 103: 627-641. [160] Coakley D., Raftery P., and Molloy P. 2012. Calibration of whole building energy simulation models: detailed case study of a study of a naturally ventilated building using hourly measured data. First Building Simulation and Optimization Conference. Loughborough, UK. [161] Pan Y., Huang Z., Wu G., Chen C. 2006. The application of building energy simulation and calibration in two high-rise commercial buildings in Shanghai. Proceedings of SimBuild. 2006: 2-4. [162] Raftery P., Keane M., Costa A. 2011. Calibrating whole building energy models: detailed case study using hourly measured data. Energy and Buildings. 43 (12): 3666-3679. [163] Mustafaraj G., Marini D., Costa A., Keane M. 2014. Model calibration for building energy efficiency simulation. Applied Energy. 130: 72-85. [164] Ahmad M., Culp C.H. 2006. Uncalibrated building energy simulation modeling results. HVAC&R Research. 12(4): 1141-1155. [165] Heo Y., Choudhary R., Augenbroe G. 2012. Calibration of building energy models for retrofit analysis under uncertainty. Energy and Buildings. 47: 550-560. [166] Coakley D., Raftery P., Keane M. 2014. A review of methods to match building energy simulation models to measured data. Renewable and Sustainable Energy Reviews. 37:123-141. [167] Reddy T.A. 2006. Literature review on calibration of building energy simulation programs: uses, problems, procedures, uncertainty, and tools. ASHRAE Transactions. 112 (1): 226-240. [168] Menezes A.C., Cripps A., Bouchlaghem D., Buswell R. 2012. Predicted vs. actual energy performance of non-domestic buildings: using post-occupancy evaluation data to reduce the performance gap. Applied Energy. 97: 355-364. [169] Andolsun S., Charles H.C. 2008. A comparison of EnergyPlus to DOE-2.1 E: multiple cases ranging from sealed box to a residential building. Texas A&M University. 
[170] Witte M.J., Henninger R.H., Glazer J., Crawley D.B. 2001. Testing and validation of a new building energy simulation program. Proceedings of Building Simulation. Rio de Janeiro, Brazil. [171] Huang J., Bourassa N., Buhl F., Erdem E., Hitchcock R. 2006. Using EnergyPlus for California title-24 compliance calculations. Proceedings of SimBuild. Cambridge, MA. [172] Waddell C., Kaserekar S., Ten A. 2010. Solar gain and cooling load comparison using energy modeling software. 4 th National Conference of IBPSA-USA. New York, NY. [173] Henninger R.H. and Witte M.J. 2006. The report of LBNL DOE-2.1 E119 based on ANSI/ASHRAE standard 140-2004. Lawrence Berkeley National Laboratory. [174] Henninger R.H. and Witte M.J. 2011. EnergyPlus testing with HVAC equipment performance tests CE300 to CE545 from ANSI/ASHRAE Standard 140-2011. [175] American Society of Heating, Refrigerating and Air-Conditioning Engineers. 2007. 140: Standard method of test for the evaluation of building energy analysis computer program. - 211 - [176] Crawley D.B., Hand J.W., Kummert M., Griffith B.T. 2008. Contrasting the capabilities of building energy performance simulation programs. Building and Environment. 43 (4): 661-673. [177] Zhu D., Hong T., Yan D., Wang C. 2012. Comparison of building energy modeling programs: building loads. Lawrence Berkeley National Laboratory. Report E. 2012. 6034E. [178] Zhu D., Hong T., Yan D., and Wang C. 2013. A detailed loads comparison of three building energy modeling programs: EnergyPlus, DeST and DOE-2.1 E. Building Simulation. 6: 323-335. [179] Hong T., Yang L., Hill D., Feng W. 2014. Data and analytics to inform energy retrofit of high performance buildings. Applied Energy. 126: 90-106. [180] Rahman M.M., Rasul M., Khan M.M.K. 2010. Energy conservation measures in an institutional building in sub-tropical climate in Australia. Applied Energy. 87 (10): 2994-3004. [181] Magnier L. and Haghighat F. 2010. 
Multiobjective optimization of building design using TRNSYS simulations, genetic algorithm, and artificial neural network. Building and Environment. 45 (3):739-746. [182] Sun J. and Reddy T.A. 2006. Calibration of building energy simulation programs using the analytic optimization approach (RP-1051). HVAC&R Research. 12 (1): 177-196. [183] Liu S. and Henze G. 2005. Calibration of building models for supervisory control of commercial buildings. Proceedings of the 9th international building performance simulation association (IBPSA) conference. Montreal, Canada. [184] Kalogirou S.A. 2000. Applications of artificial neural-networks for energy systems. Applied Energy. 67 (1): 17-35. [185] Amjady N. 2001. Short-term hourly load forecasting using time-series modeling with peak load estimation capability. IEEE Transactions on Power Systems. 16 (3): 498-505. [186] Pao H. 2006. Comparing linear and nonlinear forecasts for Taiwan's electricity consumption. Energy. 31 (12): 2129-2141. [187] Neto A.H. and Fiorelli F.A.S. 2008. Comparison between detailed model simulation and artificial neural network for forecasting building energy consumption. Energy and Buildings. 40 (12): 2169-2176. [188] Karunakaran R., Iniyan S., and Goic R. 2010. Energy efficient fuzzy based combined variable refrigerant volume and variable air volume air conditioning system for buildings. Applied Energy. 87 (4):1158-1175. [189] Sanyal J., New J., and Edwards R. 2013. Supercomputer assisted generation of machine learning agents for the calibration of building energy models. Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery. San Diego, CA. [190] Shapiro I. 2009. Energy audits in large commercial office buildings. ASHRAE Journal. 51 (1): 18- 31. [191] Westphal F.S. and Lamberts R. 2005. Building simulation calibration using sensitivity analysis. Building Simulation. 9: 1331-1338. [192] Zhou D. and Park S.H. 2012. 
Simulation-assisted management and control over building energy efficiency–a case study. Energy Procedia. 14: 592-600. [193] Fumo N., Mago P., and Luck R. 2010 Methodology to estimate building energy consumption using EnergyPlus benchmark models. Energy and Buildings. 2010. 42.(12):2331-7. [194] Zhou X., Zhu Y., Xia C. and Chen H. 2007. Simulation-based method to assess building energy consumption level at operation stage. International Building Performance Simulation Association. Beijing, China. [195] Monfet D., Charneux R., Zmeureanu R., and Lemire N. 2009. Calibration of a building energy model using measured data. ASHRAE Transactions. 115 (1): 348-359. [196] Parker J., Cropper P., Shao L. 2012. A calibrated whole building simulation approach to assessing retrofit options for Birmingham Airport. Proceedings of the IBPSA-England. Loughborough, UK. [197] Pan Y., Huang Z., and Wu G. 2007. Calibrated building energy simulation and its application in a high-rise commercial building in Shanghai. Energy Build. 39 (6): 651-657. [198] Yoon J., Lee E., and Claridge D. 2003. Calibration procedure for energy performance simulation of a commercial building. Journal of Solar Energy Engineering. 125(3): 251-257. - 212 - [199] Bhamornsiri C, Gomez P., Wilson T., and Eisenhower B. 2013. Calibration of envelope parameters using control-based heat balance identification and uncertainty analysis. 13th Conference of International Building Performance Simulation Association. Chambéry, France. [200] Struck C., Hensen J., Kotek P. 2009. On the application of uncertainty and sensitivity analysis with abstract building performance simulation tools. Journal of Building Physics. 33 (1): 5-27. [201] Eisenhower B., O'Neill Z., Fonoberov V.A., Mezić I. 2012. Uncertainty and sensitivity decomposition of building energy models. Journal of Building Performance Simulation. 5(3): 171-84. [202] Reddy T.A., Maor I., and Panjapornpon C. 2007. 
Calibrating detailed building energy simulation programs with measured data—part I: general methodology (RP-1051). HVAC&R Research. 13 (2): 221- 241. [203] O'Neill Z., Eisenhower B., Fonoberov V., and Bailey T. 2012. Calibration of a building energy model considering parametric uncertainty. ASHRAE Transactions. 118 (2): 189-196. [204] Booth A., Choudhary R., and Spiegelhalter D. 2013. A hierarchical Bayesian framework for calibrating micro-level models with macro-level data. Journal of Building Performance Simulation. 6 (4): 293-318. [205] Stéphane B. 2012. Evidence-based model calibration for efficient building energy services. Université de Liège, Liège, Belgium. [206] University of Southern California. 2012. Smart grid: power, people and information technology. Available at: http://viterbi.usc.edu/news/news/2010/smart-grid-power.htm. [207] Deru M., Field K., Studer D., Benne K., Griffith B., Torcellini P., et al. 2011. U.S. Department of Energy commercial reference building models of the national building stock. [208] Michaels J. and Leckey T. 2003. Commercial buildings energy consumption survey. Available at: http://www.eia.doe.gov/emeu/cbecs. [209] Goel S., Rosenberg M., Athalye R., Xie Y., Wang W., Hart R., et al. 2014. Enhancements to ASHRAE Standard 90.1 prototype building models. Pacific Northwest National Laboratory. [210] Mitchell W.J. 1990. The logic of architecture: design, computation, and cognition. MIT press. [211] Müller P., Wonka P., Haegler S., Ulmer A., Van Gool L. 2006. Procedural modeling of buildings. ACM Transactions On Graphics (Tog). 25 (3): 614-623. [212] Schmitt G. 2013. Architectura et machina: computer aided architectural design und virtuelle Architektur. Springer-Verlag. [213] Kohavi R. 1995. A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of International Joint Conference on Artificial Intelligence. Montreal, Canada. 
[214] Hall M., Frank E., Holmes G., Pfahringer B., Reutemann P., Witten I.H. 2009. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter. 11 (1): 10-8. [215] Ben-Hur A. 2010. A user's guide to support vector machines. Data Mining Techniques for the Life Sciences. 609: 223-239. [216] Haykin S. 1998. Neural networks: a comprehensive foundation. Prentice Hall. [217] Friedman N., Goldszmidt M., and Lee T.J. 1998. Bayesian network classification with continuous attributes: getting the best of both discretization and parametric fitting. Proceedings of the Fifteenth International Conference on Machine Learning. Madison, WI. [218] Friedman N., Geiger D., and Goldszmidt M. 1997. Bayesian network classifiers. Machine Learning. 29 (2-3): 131-163. [219] Bouckaert R.R. 2008. Bayesian network classifiers in Weka for version 3-5-7. Artificial Intelligence Tools. 11 (3): 369-387. [220] Frank E. 2000. Pruning decision trees and lists. [221] Guyon I. and Elisseeff A. 2003. An introduction to variable and feature selection. The Journal of Machine Learning Research. 3: 1157-1182. [222] Yang Z., Ghahramani A. and Becerik-Gerber B. 2015. Iterative reassignment algorithm: leveraging occupancy based HVAC control for improved energy efficiency. The First International Symposium on Sustainable Human-Building Ecosystems (ISSHBE). Pittsburgh, PA. - 213 - [223] Yang Z. and Becerik-Gerber B. 2014. Modeling personalized occupancy profiles for representing long term patterns by using wireless sensor networks. Building and Environment. 78: 23-35. [224] Mamidi S., Chang Y., and Maheswaran R. 2012. Improving building energy efficiency with a network of sensing, learning and prediction agents. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems-Volume 1. International Foundation for Autonomous Agents and Multiagent Systems. Valencia, Spain. [225] Liao C. and Barooah P. 2010. 
An integrated approach to occupancy modeling and estimation in commercial buildings. American Control Conference (ACC). Baltimore, ML. [226] Forney Jr G. 1966. Generalized minimum distance decoding. IEEE Transactions on Information Theory. 12 (2):125-131. [227] Eigen M. and Biebricher C.K. 1988. Sequence space and quasispecies distribution. RNA Genetics. 3: 211-245. [228] Gorcin A., Celebi H., Qaraqe K.A., and Hüseyin A. 2011. An autoregressive approach for spectrum occupancy modeling and prediction based on synchronous measurements. IEEE 22nd International Symposium on Personal Indoor and Mobile Radio Communications (PIMRC). Toronto, Canada. [229] Hamilton J.D. 1994. Time series analysis. Cambridge University Press. [230] Brockwell P.J. and Davis R.A. 2009. Time series: theory and methods. Springer. [231] Chatfield C. 2003. The analysis of time series: an introduction. CRC press. [232] Polikar R. 2006. Pattern recognition. John Wiley & Sons Inc. [233] Duda R.O., Hart P.E., Stork D.G. 2001. Pattern classification. John Wiley & Sons. [234] Yang Z., Li N., Becerik-Gerber B., and Orosz M. 2012. A non-intrusive occupancy monitoring system for demand driven HVAC operations. Construction Research Congress. Lafayette, Indiana. [235] Pinsky M. and Karlin S. 2010. An introduction to stochastic modeling. Academic press. [236] Meyn S.S.P. and Tweedie R.L. 2009. Markov chains and stochastic stability. Cambridge University Press. [237] Efron B. 1987. Better bootstrap confidence intervals. Journal of the American statistical Association. 82 (397): 171-185. [238] Aitken A.C. 1957. Statistical mathematics. Oliver and Boyd Edinburgh. [239] Garrett A., New J.R., and Chandler T. 2014. Evolutionary tuning of building models to monthly electrical consumption. ASHRAE Annual Conference. Seattle, WA. [240] Woolley J., Pritoni M., Modera M., Center W.C.E. 2014. Why occupancy-responsive adaptive thermostats do not always save-and the limits for when they should. 
ACEEE Summer Study on Energy Efficiency in Buildings. Pacific Grove, CA. [241] Lopes J. and Agnew P. 2010. FPL residential thermostat load control pilot project evaluation. ACEEE Summer Study on Energy Efficiency in Buildings, Pacific Grove, CA. [242] Meier A. 2012. How people actually use thermostats. ACEEE Summer Study on Energy Efficiency in Buildings. Pacific Grove, CA. [243] Peffer T., Pritoni M., Meier A., Aragon C., and Perry D. 2011. How people use thermostats in homes: a review. Building and Environment. 46 (12): 2529-2541. [244] Yang Z. and Becerik-Gerber B. 2014. The coupled effects of personalized occupancy profile based HVAC schedules and room reassignment on building energy use. Energy and Buildings. 78: 113-122. [245] Zeng Y., Zhang Z., and Kusiak A. 2015. Predictive modeling and optimization of a multi-zone HVAC system with data mining and firefly algorithms. Energy. 86: 393-402. [246] Pérez-Lombard L., Ortiz J., Coronel J.F., Maestre I.R. 2011. A review of HVAC systems requirements in building energy regulations. Energy and Buildings. 43 (2): 255-268. [247] Du Z., Jin X., Fan B. 2015. Evaluation of operation and control in HVAC (heating, ventilation and air conditioning) system using exergy analysis method. Energy. 89: 372-381. [248] Korolija I., Marjanovic-Halburd L., Zhang Y., Hanby V.I. 2011. Influence of building parameters and HVAC systems coupling on building energy performance. Energy and Buildings. 43 (6):1247-1253. [249] Kim H., Stumpf A., Kim W. 2011 Analysis of an energy efficient building design through data mining approach. Automation in Construction. 20 (1): 37-43. - 214 - [250] Maimon O. and Rokach L. 2005. Data mining and knowledge discovery handbook. Springer. 2005. [251] Ichino M., Yaguchi H. 1994. Generalized Minkowski metrics for mixed feature-type data analysis. IEEE Transactions on Systems, Man and Cybernetics. 24 (4): 698-708. [252] Yang Z. and Becerik-Gerber B. 
Coupling occupancy information with whole building energy simulation: a Comparative study of simulation tools. Winter Simulation Conference. Savannah. [253] Kim S., Haberl J. and Liu Z. 2009. Development of DOE-2-based simulation models for the code- compliant commercial construction based on the ASHRAE Standard 90.1. Proceedings of the Ninth International Conference for Enhanced Building Operations, Austin, Texas. [254] Birdsall B., Buhl W., Ellington K., Erdem A., and Winkelmann F. 1990. Overview of the DOE-2 building energy analysis program. Report of Lawrence Berkeley Laboratory, Berkeley, CA. 1990. [255] Hong T., Mathew P., Sartor D., and Yazdanian M. 2009. Comparisons of HVAC simulations between EnergyPlus and DOE-2.2 for data centers. ASHRAE Transactions. 115 (1). [256] Stephens. D.G and Mitalas G.P. 1971. Calculation of heat conduction transfer functions for multi- layer slabs. ASHRAE Journal. 77 (1971). [257] Crawley D.B., Lawrie L.K., Winkelmann F.C., Buhl W.F., Huang Y.J., Pedersen C.O., et al. 2001. EnergyPlus: creating a new-generation building energy simulation program. Energy and Buildings. 33 (4): 319-331. [258] U.S. Department of Energy. 2010. EnergyPlus engineering reference. The Reference to EnergyPlus Calculations. [259] Crawley D.B., Lawrie L.K., Pedersen C.O., Winkelmann F.C. 2000. Energy plus: energy simulation program. ASHRAE Journal. 42 (4): 49-56. [260] Documentation E. 2007. EnergyPlus manual, Version 2. U.S. Department of Energy. [261] Pollock M. and Gough M. 2007. HVAC loads tests performed on ApacheSim in accordance with ANSI/ASHRAE Standard 140-2004. [262] Naser A. 2006. CIBSE guide A Environmental Design. Chartered Institution of Building Services Engineers. [263] Integrated Environmental Solutions. 2010. ApacheSim calculation methods. Integrated Environmental Solutions. Available at: http://www.iesve.com/downloads/help/Thermal/Refer ence/ApacheSimCalculationMethods.pdf. 
[264] Clarke J., Aasem A., Hand J., Hansen J., Pemot C., and Strachen P. 1993. ESP-r: A program for building energy simulation. [265] ESRU. 1999. ESP-r: a building and plant energy simulation environment, user guide version 9 series. University of Strathclyde. Glasgow. [266] Strachan P., Kokogiannakis G., Macdonald I. 2008. History and development of validation with the ESP-r simulation program. Building and Environment. 43 (4): 601-609. [267] Clarke J. 1988. The energy kernel system. Energy and Buildings. 10 (3): 259-266. [268] Klein S.A. 1979. TRNSYS, a transient system simulation program. Solar Energy Laborataory, University of Wisconsin-Madison. [269] Beausoleil-Morrison I., Macdonald F., Kummert M., McDowell T., and Jost R. 2014. Co- simulation between ESP-r and TRNSYS. Journal of Building Performance Simulation. 7 (2):133-151. [270] Duffy M.J., Hiller M., Bradley D.E., Keilholz W. and Thornton J.W. 2009. TRNSYS–features and functionalitity for building simulation 2009 conference. Eleventh International IBPSA Conference. Glasgow, Scotland. [271] Saltelli A., Tarantola S., and Campolongo F. 2000. Sensitivity analysis as an ingredient of modeling. Statistical Science. 15 (4): 377-395. [272] De Wit S. and Augenbroe G. 2002. Analysis of uncertainty in building design evaluations and its implications. Energy and Buildings. 34 (9): 951-958. [273] Morris M.D. 1991. Factorial sampling plans for preliminary computational experiments. Technometrics. 33(2): 161-174. [274] Nguyen A., Reiter S., Rigo P. 2014. A review on simulation-based optimization methods applied to building performance analysis. Applied Energy. 113: 1043-1058. - 215 - [275] Coello Coello C.A. 2006. Evolutionary multi-objective optimization: a historical view of the field. Computational Intelligence Magazine, 1 (1): 28-36. [276] Loris I., Bertero M., De Mol C., Zanella R., and Zanni L. 2009. Accelerating gradient projection methods for ℓ1-constrained signal recovery by steplength selection rules. 
Applied and Computational Harmonic Analysis. 27 (2): 247-254. [277] Guideline A. 2002. Guideline 14-2002, measurement of energy and demand savings. American Society of Heating, Ventilating, and Air Conditioning Engineers. [278] Thumann A. and Woodroof E. 2009. Energy, determining, and water savings: international performance measurement & verification protocol. Energy Project Financing: Resources and Strategies for Success. The Fairmont Press, Inc. [279] Federal Energy Management Program (FEMP). 2008. M&V guidelines: measurement and verification for federal energy projects. [280] Energy Efficiency and Renewable Energy. 2014. EnergyPlus energy simulation software: weather data. Available at: http://apps1.eere.energy.gov/buildings/energyplus/weatherdata_about.cfm.
Asset Metadata
Creator
Yang, Zheng (author)
Core Title
Building occupancy modeling and occupancy-loads relationships for building heating/cooling energy efficiency
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Civil Engineering
Publication Date
08/02/2016
Defense Date
05/26/2016
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
buildings, calibration, energy efficiency, HVAC systems, loads, OAI-PMH Harvest, occupancy, simulation
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Becerik, Burcin (committee chair), Jin, Yan (committee member), Orosz, Michael (committee member), Soibelman, Lucio (committee member)
Creator Email
mopoyangzheng@gmail.com,zhengyan@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-291575
Unique identifier
UC11281638
Identifier
etd-YangZheng-4709.pdf (filename), usctheses-c40-291575 (legacy record id)
Dmrecord
291575
Document Type
Dissertation
Rights
Yang, Zheng
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA