Multi-occupancy environmental control for smart connected communities
Multi-occupancy Environmental Control for Smart Connected Communities

By Yushi Wang

A THESIS PRESENTED TO THE FACULTY OF THE SCHOOL OF ARCHITECTURE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
MASTER OF BUILDING SCIENCE

May 2021

Copyright 2021 Yushi Wang

ACKNOWLEDGMENTS

This research acknowledges the National Science Foundation (NSF) for funding the project entitled “Human-Building Integration: Bio-Sensing Adaptive Environmental Control for Human Health and Sustainability” under grant number #1707068. The work was also partially sponsored by the USCA Research Program. I would like to thank all the participants for providing their valuable data for this research. I would also like to thank my family for supporting me throughout the thesis research, and my kind friends for being with me and encouraging me. I sincerely thank Professor Joon-Ho Choi, Professor Yao-Yi Chiang, and Professor Shrikanth Narayanan for their patient instruction and encouragement.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
1. INTRODUCTION
   1.1 Thermal comfort preference
      1.1.1 Importance of the indoor environment quality
      1.1.2 Importance of thermal comfort
      1.1.3 Thermal adaptation mechanisms
   1.2 Issues of thermal comfort control in multi-occupancy conditions
      1.2.1 Thermal comfort conflict in multi-occupancy conditions
      1.2.2 Impact of population mobility and non-specific population
      1.2.3 Gaps of ownership between building users and utility payers
   1.3 From PMV Model to Artificial intelligence in HVAC control systems
      1.3.1 History of PMV Model and Artificial intelligence in HVAC control systems
      1.3.2 Machine Learning techniques
   1.4 Smart connected communities
   1.5 Goals and objectives
   1.6 Summary
2. BACKGROUND AND LITERATURE REVIEW
   2.1 Personalized HVAC control based on thermal comfort preference
      2.1.1 Thermal comfort conflicts caused by individual differences
      2.1.2 Existing thermal comfort test methods
   2.2 Limitation and challenges of the thermal comfort control in multi-occupancy condition
      2.2.1 Limitation of the objective and the subjective influence
      2.2.2 Limitation of feedback needs
      2.2.3 Difficulty of integration challenges
      2.2.4 Privacy protection challenges
      2.2.5 Different effects of building use
   2.3 Smart connected communities
      2.3.1 Intelligent control techniques in HVAC systems
      2.3.2 Application of machine learning in HVAC system
   2.4 Summary
3. METHODOLOGIES
   3.1 Overview of methodology
      3.1.1 Project Design
      3.1.2 Hardware and software
      3.1.3 Experiment subjects and conditions
   3.2 Data Collection Phase
      3.2.1 Users’ background data
      3.2.2 Users’ real time physiological signals
      3.2.3 Real time environment data
      3.2.4 User’s feedback
   3.3 Machine learning Phase
      3.3.1 Input data
      3.3.2 Data clean
      3.3.3 Machine learning algorithm analysis
      3.3.4 Output data
      3.3.5 Optimization goal
   3.4 Operation Phase
      3.4.1 Simulation and forecasting in multi-occupancy condition
   3.5 Summary
4. SINGLE-OCCUPANCY MODEL RESULT AND ANALYSIS
   4.1 Database analysis
      4.1.1 Data preprocess
      4.1.2 Database information of single occupancy experiment
   4.2 Baseline Model
      4.2.1 Input and output
      4.2.2 Baseline Model result
   4.3 Individual Thermal Comfort Model
      4.3.1 Input and output
      4.3.2 Individual Thermal Comfort Model result
      4.3.3 Individual Thermal Comfort Model accuracy performance
   4.4 Compiled Thermal Comfort Model
      4.4.1 Improvement of algorithm
      4.4.2 Definition of the thermal comfort boundary
      4.4.3 Input and output
      4.4.4 Compiled Thermal Comfort Model result
      4.4.5 Compiled Thermal Comfort Model accuracy performance
      4.4.6 Performance comparison between Compiled Thermal Comfort Model and PMV model
   4.7 Summary
5. MULTI-OCCUPANCY MODEL RESULT AND ANALYSIS
   5.1 Individual Thermal Comfort Model tested in a multi-occupancy setting
      5.1.1 Overall Thermal Discomfort (OTD) Evaluation for group
      5.1.2 Result of Individual Thermal Comfort Model tested in a multi-occupancy setting
   5.2 Compiled Thermal Comfort Model tested in a multi-occupancy setting
      5.2.1 Automatic evaluation mechanism
      5.2.2 Result of Compiled Thermal Comfort Model tested in a multi-occupancy setting
   5.3 Compiled Thermal Comfort Model tested in a multi-occupancy setting for real-time controls
      5.3.1 Database information of multi-occupancy experiment
      5.3.2 Input and output
      5.3.3 Comparison between temperature setpoint prediction result and test participant’s feedback
      5.3.4 Performance comparison between Compiled Thermal Comfort Model tested in a multi-occupancy setting and PMV Model
      5.3.5 Accuracy performance of Compiled Thermal Comfort Model tested in a multi-occupancy setting for real-time control
   5.3 Summary
6. CONCLUSION
   6.1 Data-driven approach performance
      6.1.1 Single occupancy condition
      6.1.2 Multi-occupancy simulation
      6.1.3 Multi-occupancy condition
   6.2 Limitations
      6.2.1 Limitation of the number of subjects
      6.2.2 Limitation of multi-occupancy experiment condition
   6.3 Future work
      6.3.1 Model and experiment improvement
      6.3.2 Consideration of potential problems in the development of smart communities
   6.3 Conclusion
REFERENCES

LIST OF FIGURES

Figure 1. Project Design Planning Chart
Figure 2. Sensor Information Table
Figure 3. Single-occupancy experiment sensor equipment
Figure 4. Project Design Flowchart
Figure 5. Experimental data collection of test participants
Figure 6. User feedback chart
Figure 7. Model input and output conclusion chart
Figure 8. Clothing and activity arrangement plan for each test participant
Figure 9. Multi-occupancy test plan chart
Figure 10. Leverage detection chart of 30 test participant database
Figure 11. Regression chart of Thermal Sensation and Real-time Physiological Features (Heart rate, Stress Level, Skin temperature, and EDA)
Figure 12. Regression chart of Thermal Comfort and Real-time Physiological Features (Heart rate, Stress Level, Skin temperature, and EDA)
Figure 13. Regression chart of Thermal Sensation and Thermal Comfort with Clothing Level
Figure 14. Regression chart of Thermal Sensation and Thermal Comfort with Activity Level
Figure 15. Individual Thermal Sensation (TS) decision tree of test participant ID 1
Figure 16. TS decision tree example for each gender group
Figure 17. Individual Thermal Comfort (TC) decision tree of test participant ID 1
Figure 18. TC decision tree example for each gender group
Figure 19. Individual Thermal Comfort Model average accuracy chart based on train data, test data and total data
Figure 20. Visualization of the Individual Thermal Comfort Model and the Compiled Thermal Comfort Model
Figure 21. Two Different Comfort Zone Boundary Calculation Methods
Figure 22. Input and output training and testing data of Individual Thermal Comfort Model and Compiled Thermal Comfort Model
Figure 23. A random forest tree sample from the maximum value of Compiled Thermal Comfort Model
Figure 24. Group OTD value of Individual Thermal Comfort Model based on 6 subjects
Figure 25. The temperature automatic decision model
Figure 26. Comfort range result of the Compiled Thermal Comfort Model of 6 test participants
Figure 27. Group OTD value of Compiled Thermal Comfort Model based on 6 subjects
Figure 28. Prediction temperature setpoint for group by Compiled Thermal Comfort Model tested in a multi-occupancy setting compared with real-time environment temperature
Figure 29. Test participant thermal sensation feedback
Figure 30. Six test participants’ comfort range at 14:30 predicted by Compiled Thermal Comfort Model
Figure 31. CBE Thermal Comfort Tool Results for six test participants at 14:30

LIST OF TABLES

Table 1. Database information of 30 individual test participants
Table 2. P-value and R-value for all features to TC and TS (* indicates statistical significance)
Table 3. Data information sample of Test participant ID 1
Table 4. Individual TS decision tree model accuracy and layer feature for 30 test participants (percentage of feature in first three layers of each gender group)
Table 5. Individual TC decision tree model accuracy and layer feature for 30 test participants (percentage of feature in first three layers of each gender group)
Table 6. RMSE and R-square performance of Cross Validation of Compiled Thermal Comfort Model based on whole data and test data with 5 folds, 7 folds, and 10 folds
Table 7. RMSE and R-square value of Baseline Model, Individual Thermal Comfort Model (average RMSE) and Compiled Thermal Comfort Model
Table 8. Comfort zone prediction performance comparison between Compiled Thermal Comfort Model and PMV Model of Test participant ID 1
Table 9. Overall Thermal Discomfort (OTD) Evaluation
Table 10. Group test database for the Individual Thermal Comfort Model (6 test participants selected from 30 test participants)
Table 11. OTDmax value in different A value
Table 12. Thermal comfort and OTD result of Individual Thermal Comfort Model tested in group
Table 13. Group test database for the Compiled Thermal Comfort Model (6 test participants selected from 30 test participants)
Table 14. Comfort Zone under different PPD of ITC Model and ICZ Model (* △ means range difference)
Table 15. Database information of 6 new test participants in multi-occupancy
Table 16. Table of comparison between Compiled Thermal Comfort Model tested in a multi-occupancy setting and PMV model
Table 17. Accuracy performance (RMSE) of Single-occupancy Model and Multi-occupancy Model (The orange grid is the main body of comparison, the yellow grid is the comparison object, and the gray grid is the comparison percentage of the two ends of the grid)

ABSTRACT

As one of a community’s core infrastructure elements, the building is critical for environmental resilience, natural resource consumption, and the occupants’ environmental health and well-being. However, existing facility operation mechanisms of an educational community, whose administration and finances are managed by a third party, have not been effectively integrated with actual community dwellers’ time-varying environmental needs, even though communication and sensing infrastructures have become ubiquitous. Consequently, this underutilization lowers real-time adaptation of the community systems by failing to meet the dwellers’ environmental needs. The project proposes an integrative approach that develops a community-member-centered framework for determining the heating system control of the building through multi-standard decision-making driven by bio-sensing, which could establish a tailor-made building environment control system that reduces energy consumption and improves occupant comfort. The project collected environmental, thermal sensation, and physiological data from the daily lives of 30 subjects over 3 weeks.
Based on individual data, the Baseline Model, Individual Thermal Comfort Model, and Compiled Thermal Comfort Model were built with Weka and Python using linear regression, decision tree, and random forest algorithms. The Compiled Thermal Comfort Model shows 62.26% higher accuracy than the Baseline Model and 25.97% higher accuracy than the PMV Model. The Compiled Thermal Comfort Model also performs better than the Individual Thermal Comfort Model when simulating individual data in a group. The Compiled Thermal Comfort Model was then tested in a real-time condition, in which six new test participants changed clothing level, activity level, and location arrangement in a space with frequently changing environmental temperature. The results illustrate that the Compiled Thermal Comfort Model tested in a multi-occupancy setting achieves 40.32% higher accuracy than the PMV Model, and the predicted temperature setpoint closely matches the test participants’ thermal comfort and thermal sensation feedback. The paper also discusses the potential applications of data-driven methods in the establishment of smart communities.

Keywords: indoor environmental health; personal thermal comfort profiles; multi-standard decision-making; bio-sensing and data-driven; machine learning.

1. INTRODUCTION

1.1 Thermal comfort preference

1.1.1 Importance of the indoor environment quality

With the development of modern civilization and society, most people spend their lives in one artificial environment after another. Studies have shown that people spend nearly 80% of their time indoors (Al horr et al. 2016). It can be said that the trajectory of human life is to leave one building and then enter another. The indoor environment of a building is closely related to everyone’s life, and its connotation and the demands placed on it are expanding with the continuous development of human society.
Building on the most primitive requirement of security, people have developed multiple levels of requirements for the indoor environment, such as comfort, efficiency, and health (Seppänen and Fisk 2006). Among them, comfort is the most essential, because compared with other factors, the user experience brought by improved comfort is direct and fast.

1.1.2 Importance of thermal comfort

For humans, the meaning of thermal comfort is rich. The physical factors that affect the user's comfort include the thermal environment, humid environment, air environment, acoustic environment, and light environment (Kamaruzzaman et al. 2016). Among the effects on human comfort, temperature and humidity account for a large proportion, and temperature is the most critical, because the largest organ of the human body is the skin. Because of its large area and its position at the outermost layer of the body, the skin contains numerous and varied sensory cells, which makes it an important pathway for the body to regulate heat dissipation. A hot and humid environment influences human comfort mainly by acting on, and being transmitted through, the skin (Arens and Hui 2006). People can improve the thermal comfort of the environment by changing the temperature and humidity of the air, the airflow rate, and the clothing they wear (Streinu-Cercel et al. 2008).

The original driving force of human development is people's pursuit of comfort. People today are no longer satisfied with adjusting how much clothing they wear to control thermal comfort. They hope for an indoor HVAC system that controls it more accurately, in real time, and autonomously, eliminating uneven thermal comfort between outdoors and indoors, between rooms, and even between people. Therefore, the thermal comfort model has been established as a function of multiple human factors and environmental variables.
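One widely used instance of such a model is the PMV/PPD pair standardized in ISO 7730: PMV is computed from air temperature, mean radiant temperature, air speed, humidity, metabolic rate, and clothing insulation, and PPD is then derived from PMV. The following is a minimal sketch of only the PMV-to-PPD step, given for illustration. The function name `ppd_from_pmv` is hypothetical and is not taken from the thesis, and the sketch does not compute PMV itself.

```python
import math


def ppd_from_pmv(pmv: float) -> float:
    """Predicted Percentage of Dissatisfied (%) for a given PMV value,
    using the standard ISO 7730 relation."""
    return 100.0 - 95.0 * math.exp(-0.03353 * pmv**4 - 0.2179 * pmv**2)


# Illustrative votes on the 7-point scale (cool to warm).
for vote in (-2, -1, 0, 1, 2):
    print(f"PMV {vote:+d} -> PPD {ppd_from_pmv(vote):.1f}%")
```

Even at a neutral vote (PMV = 0) the relation predicts that about 5% of occupants are dissatisfied, which is one way to see why a single setpoint rarely suits everyone in a shared space.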
1.1.3 Thermal adaptation mechanisms

The thermal adaptation mechanism of the human body gradually weakens the stimulation caused by changes in the thermal environment through behavioral, physiological, and psychological adjustment (J. Liu, Yao, and McCloy 2012). Behavioral adaptation refers to all behaviors that people adopt, consciously or unconsciously, to change their heat balance, such as personal adjustment (dressing, weight reduction), technical adjustment (turning the air conditioner on and off, etc.), and living habits. Physiological adaptation refers to physiological changes, such as genetic adaptation or environmental adaptation, that make the human body gradually adapt to changes in the thermal environment. Psychological adaptation is a change in sensory response based on past thermal experience or expectations, which results in large differences between individuals' optimal comfortable temperatures and the corresponding temperature settings (Du et al. 2018). Therefore, the thermal comfort of the human body is a fuzzy set with unclear boundaries. To accurately evaluate a suitable thermal environment and the thermal comfort of the human body, it is necessary to understand not only the age, gender, clothing, and diet of the people in the environment and their cultural and social backgrounds, such as rest habits, behavior patterns, and clothing preferences, but also the thermal experience they are used to indoors and outdoors and their personal adaptability and changeability.

1.2 Issues of thermal comfort control in multi-occupancy conditions

A space occupied by a single person is usually a private house or apartment, where the user's thermal comfort state can usually be reached by setting or adjusting the HVAC system.
However, when the user is in a space occupied by multiple people, the heat demand changes continuously with each person's thermal comfort preference and with population flow, which is beyond the constant temperature range controlled by the traditional PMV model. When people of different genders, ages, and ethnicities share the same space, such as a university classroom, the differences in their thermal comfort preferences may prevent a consensus on the HVAC temperature setpoint. When a space contains both fixed and mobile occupants, such as a bank with staff and customers, it is unclear who should make the thermal comfort decisions, which is also a dilemma. When building owners or managers tend to save energy and barely satisfy thermal comfort models, as in old-fashioned office buildings, the thermal comfort of the staff is easily overlooked. Under these circumstances, it is a huge challenge to satisfy the thermal comfort needs of most users while sacrificing the comfort of as few users as possible and saving more energy.

1.2.1 Thermal comfort conflict in multi-occupancy conditions

When people are in a multi-occupied space with high-frequency mobility, such as a university, bank, or theater, controlling thermal comfort by setting the HVAC temperature involves more uncertainty and difficulty. Li proposed a personalized HVAC control framework that combines the thermal comfort preferences of different people and feedback from the experimenters with algorithms to calculate a thermal comfort environment suitable for multi-person spaces. However, from the analysis of the results on partial thermal comfort dissatisfaction, the study found that people in a group cannot predict whether the people next to them are similarly dissatisfied, or do not know how to change the thermal comfort settings.
Therefore, some of the experimenters ultimately chose not to adjust the thermal comfort of the room (Li, Menassa, and Kamat 2017a). In a space occupied by many people, some people who are dissatisfied with thermal comfort compromise psychologically because no one else raises objections; social-psychological variables affect people’s intention to share control in a public environment (D’Oca et al. 2017). It is therefore important to establish a thermal comfort adjustment mode that takes each person’s thermal comfort preferences and individual differences into account. An autonomous thermal comfort system that skips the process of passive feedback and adjustment, and thereby reduces the social-psychological impact on people, is the most intelligent option in multi-occupancy conditions.
1.2.2 Impact of population mobility and non-specific population
When the population of a space is basically fixed, such as an office with fixed employees or a classroom with fixed students, it is very meaningful and cost-effective to establish personal thermal comfort profiles. However, when the users of a multi-occupancy space are highly mobile, with uncertain residence times and uncertain identities, as in banks, supermarkets, or community activity centers, the thermal comfort data of a single user may have a minimal impact on all users, or may be affected by environmental changes. Mishra placed multiple sensors in a museum and collected more than 1,000 thermal comfort questionnaires at the same time. The analysis found that people entering the museum need a transition period of 20-30 minutes to adapt to the thermal environment inside, and that after staying for an hour, some people consider changing the current thermal environment or their clothing level (Mishra et al. 2016). Therefore, in multi-occupancy spaces, because of the uncertainty of population mobility and residence time, it is necessary to consider each person's transition period from the outdoor environment to the indoor environment. When faced with non-specific populations, it is a challenge to define individual thermal preference data and to integrate data sets to establish thermal conditions that maximize overall thermal comfort. In addition, because customers come and go, fixed spaces often have vacancy periods. Choosing a more optimized thermal setpoint during vacancy periods can bring substantial energy savings for the entire building. For example, in a museum energy-saving case that used different setpoint strategies, Kramer found that the building needs thermal comfort only during opening hours, which makes a different setting strategy possible for closed hours. Applying 100% recirculation during closing hours, that is, without heating and cooling, can save 65% of energy (Mishra et al. 2016). The open question is how to set up a reasonable procedure that adjusts the thermal environment of a space in real time according to the mobility of the people in it and the thermal comfort of each person, and that turns off the system at the appropriate time to save energy.
1.2.3 Gaps of ownership between building users and utility payers
Building users usually pursue the most comfortable experience, whereas utility payers pursue the cost reduction brought by energy-saving systems. Based on individual differences and thermal comfort preferences, the project monitored, predicted, and changed the HVAC settings of a space in real time. Can this approach save more energy for the building than simply installing a human body temperature sensor to adjust the temperature of the space?
Finding the best balance between maximizing personal thermal comfort and minimizing energy consumption in community-scale buildings is a great challenge.
1.3 From PMV Model to Artificial intelligence in HVAC control systems
1.3.1 History of PMV Model and Artificial intelligence in HVAC control systems
Gilani used the predicted mean vote (PMV) model to measure the energy-saving potential of commercial and residential buildings in terms of thermal comfort and found that the percentage deviation of the overestimation rate in HVAC buildings is higher than that calculated using the PMV equation. It shows that using Tony Gai's algorithm to determine the room's thermal comfort range can bring energy savings to the building (Gilani, Khan, and Pao 2015). Du used machine learning algorithms to better predict and evaluate human thermal comfort and temperature in local ventilation systems. The results show that machine learning techniques achieve a higher median prediction accuracy than the traditional PMV model (Du et al. 2019). The "ClimaCon" system proposed by Farag uses fuzzy logic to keep the experimenter's comfort, obtained from the predicted mean vote index, within a certain range, and implements different energy-saving solutions on this basis (Farag 2017). Chen developed a data-driven model based on experimental thermal comfort data and compared it with the predicted mean vote (PMV) model. In single- and multi-person environments, the data-driven method, compared with the PMV model, saved 42-45% of energy and improved thermal comfort by 41-44% (Zhong and Choi 2017). These studies show that the optimization goal of an artificial intelligence system is to keep improving prediction accuracy through different algorithms, thereby bringing higher environmental comfort and more energy saving.
1.3.2 Machine Learning techniques
a. Supervised learning, unsupervised learning, semi-supervised learning and reinforcement learning
Supervised learning uses data that has both features and labels. For new, unlabeled data, the label can be predicted by learning the relationship between features and labels (Frumosu and Kulahci 2018). For example, a function can be generated that maps inputs to appropriate outputs based on the existing correspondence between part of the input data and the output data. As an experimental example, the model is first given data in which the experimenter reports "cold" or "hot" sensations at different temperatures. When another temperature is input, the machine predicts whether the label is cold or hot.
The data set for unsupervised learning has only features and no labels. The data are divided into several categories according to their internal connections and similarity. Based on the characteristics of the data itself, patterns can be learned according to a certain metric, or hidden relationships can be found (Celebi and Aydin 2016). For example, the human body shows different physiological reactions in different temperature environments, such as an accelerated heartbeat or raised hairs, but people in different psychological states, such as tension or shock, may show similar reactions. If the physiological information recorded under different environments and mental states is provided to the computer, unsupervised learning can reveal the commonalities and differences between the two and so distinguish them.
In semi-supervised learning, part of the data is labeled and most of it is unlabeled. Compared with supervised learning, semi-supervised learning has a lower cost but can still achieve high accuracy; it makes combined use of class-labeled data and data without class labels to generate a suitable classification function (Frumosu and Kulahci 2018).
Reinforcement learning is similar to semi-supervised learning in that both use unlabeled data. However, reinforcement learning uses algorithms to learn whether each action moves closer to the target, which can be understood as a reward and penalty function (Szepesvári 2010). For example, the temperature setpoint of the HVAC system is adjusted continuously while the experimenter keeps giving feedback on comfort and discomfort, so that the most comfortable range is reached through repeated reward and penalty.
b. Traditional machine learning and deep learning
The main difference between deep learning and traditional machine learning is that the performance of deep learning continues to grow as the scale of data increases. When data is scarce, deep learning algorithms perform poorly, because they require a large amount of data to learn the underlying patterns well. Second, in traditional machine learning, most data features need to be determined by experts and then encoded as data-type features. This process is time-consuming and requires professional knowledge, but testing with these features is fast, taking from a few seconds to a few hours. Deep learning attempts to obtain high-level features directly from the data, but its testing and learning process is very long, for example two weeks (Wang, Fan, and Wang 2020).
f. Linear Regression
Linear regression (LR) is one of the simplest ML regression algorithms. For high-dimensional data sets, linear regression can find the intuitive relationship between input and output through coefficient adjustment (Grömping 2009). Linear regression analysis can be used to obtain the relationship between various factors and thermal comfort and can serve as a baseline method.
e. Random Forest
A random forest is composed of many decision trees, each built independently of the others.
When the project performs a classification task, each new input sample is judged and classified separately by every decision tree in the forest, and each tree produces its own classification result. The random forest takes the class that receives the most votes among the decision trees as the final result (Grömping 2009). For example, when the project inputs the thermal comfort states of the same user at different temperatures, together with other related factors, the random forest can, by classifying and fitting many data points, select the temperature range most suitable for this user.
1.4 Smart connected communities
The smart connected communities project proposes a comprehensive approach that uses machine learning algorithms to better predict and evaluate human thermal comfort and temperature, and then develops an intelligent environmental control framework centered on community members by making effective use of the technology infrastructure already available in the community and among its members. The framework aims to provide the most comfortable real-time thermal conditions for non-fixed occupants of fixed spaces. The project developed a cost-effective user interaction technology for individual residents, provided multiple auxiliary sensing functions, and installed data processing equipment as a test platform in a minority-dominated high school or community. This community-based research can bring powerful and tailored situation-aware building intelligence, enhance sustainability, and use information technology to improve the quality of life of community residents.
1.5 Goals and objectives
The discussion so far shows that existing HVAC control systems do not take into account individual thermal comfort preferences or the environmental factors of multi-person occupation.
Therefore, the goal of this study is to develop a new technology platform for intelligent and connected education communities that helps community residents understand and participate in the performance of their buildings, with control intelligence as a function of these members' participation. Through single-person and multi-person experiments, data from experimenters in their daily environment and in a deliberately arranged multi-person environment, including environmental data, physiological data, and user feedback data, are used to build machine learning models. The model determines the optimal temperature in a multi-residential building or space from physiological data collected by smartwatches, while optimizing indoor thermal comfort for groups of people at the multi-occupant scale, thereby developing a community-scale environmental control tool with maximum energy efficiency.
1.6 Summary
This chapter explains the necessity of collecting information on users' thermal comfort preferences, discusses the problems and conflicts that may occur under multi-person occupation, and emphasizes the importance of optimizing environmental control at the multi-person scale. On this basis, the project aims to develop intelligent control of thermal comfort in a multi-occupancy environment, and the project goals and objectives were determined. Chapter 2 explains the limitations and challenges encountered in thermal comfort control through a background introduction and literature review, and how current efforts have addressed them. Chapter 3 introduces the design of the experimental process and the machine learning algorithm model.
2. BACKGROUND AND LITERATURE REVIEW
2.1 Personalized HVAC control based on thermal comfort preference
2.1.1 Thermal comfort conflicts caused by individual differences
As the indoor environments of buildings receive more and more attention, people's requirements for indoor thermal comfort are gradually shifting from the group as a whole to individuals, which means taking everyone's feelings into account. In an office, however, the air conditioner temperature is usually fixed, which does not always satisfy everyone's needs. People may see a lady wearing a sweater and a man wearing short sleeves sitting in the same office; they may hear the elderly and children in the same movie theater complaining about being cold and hot at the same time; they may also see Americans and Asians living in the same apartment quarrel over the temperature setpoint of the central air conditioner. A survey of 34,000 questionnaires shows that only 37% of employees are 80% satisfied with the thermal comfort of buildings in the United States, Canada, and Finland (Huizenga et al. 2006). Davoodi found that individual differences between humans (such as weight, height, gender, age, and basal metabolic rate) can significantly affect body temperature and thermal adaptation mechanisms (Davoodi et al. 2017). The following discusses the factors affecting human thermal comfort that have been identified in existing surveys.
a. Age
Two Japanese researchers, Tsuzuki and Iwata, found that older people prefer a warmer environment (Tsuzuki and Iwata 2002). Thapa, however, found that the average comfort temperature of young people (20.4°C) is higher than that of middle-aged or elderly people (16-17°C), which may be affected by the geographical background, and he also showed that the acceptable temperature range of elderly subjects is narrower than that of young subjects (Thapa 2019). This is also found in the research of van Hoof and Hensen: because the elderly have a lower metabolic rate than the young and a reduced ability to constrict the blood vessels in the skin, their control of body temperature is limited, so they need a more precisely controlled ambient temperature (Hoof and Hensen 2006).
b. Gender
Gagnon and Kenny studied the effect of gender on the body's thermal response. They showed that, because women usually have higher heat loss, the difference in thermal response between men and women is evident even if the effects of their thermoregulatory mechanisms are similar (Gagnon and Kenny 2012). Another survey, based on 3094 respondents, shows that women prefer higher-temperature environments than men do and are more sensitive to cold and hot temperature changes (Karjalainen 2007). Chang and Kajackaite came to the same conclusion and found that, by setting the thermostat higher than current standards, gender-mixed workplaces may be able to increase productivity (Chang and Kajackaite 2019).
c. BMI
In Thapa's study, subjects were classified by BMI as underweight, normal, or overweight. He found that, because the excess body fat of overweight subjects provides thermal insulation similar to adding clothes, the comfort temperature of underweight subjects is higher (Thapa 2019).
d. Metabolic rate
Alfano found that when the measurement of the experimenter's metabolic rate has an error within ±10%, the impact on PMV accuracy reaches [-0.16; 0.14] units (d’Ambrosio Alfano, Palella, and Riccio 2011). Goto et al., by testing the thermal comfort of people under activities of different durations and intensities, found that the metabolic rate changes in real time and takes a long time to stabilize after a change (Goto et al. 2006). Revel et al. proposed integrating a metabolic rate estimated from combined environmental and personal data into the control of the thermal environment, thereby increasing the accuracy of PMV control (Revel, Arnesano, and Pietroni 2015). Luo compared different methods of measuring the metabolic rate and concluded that the sports bracelet is portable and low cost, but its accuracy suffers greatly if the algorithm is not matched to the scenario, whereas methods with higher accuracy are expensive or inconvenient to carry (Luo et al. 2018).
e. Race (cultural and social backgrounds)
Nicol analyzed and summarized the thermal comfort range of civil buildings in different countries through surveys and data collection: 90% of people in Japan are comfortable with indoor temperatures in the range of 11-25°C; British thermal comfort zones lie mainly between 15 and 22°C; and the average indoor temperature of residential buildings in Saudi Arabia is mainly kept within 22-32°C (Nicol 2017). The thermal comfort range not only varies greatly between people of different nationalities; the difference also exists between people in different regions of the same country. Luo recruited experimenters from southern and northern China and found through comparative experiments that the thermal comfort range and thermal adaptability of the two groups are very different (Luo et al. 2016).
It can be seen that various factors have a greater or lesser influence on the prediction and control of thermal comfort. In spite of the significance of the physiological impacts on thermal comfort reported in recent studies, most research still relies on small sample sizes for human subject experiments and adopts relatively outdated computational algorithms, even though modern techniques such as SVM modeling and deep learning have emerged.
Therefore, it is highly desirable to investigate the impacts of physiological conditions on thermal comfort, and it is necessary to analyze and integrate the various influencing factors comprehensively to establish a personal thermal comfort preference profile, thereby providing a more accurate algorithmic basis for indoor thermal comfort estimation.
2.1.2 Existing thermal comfort test methods
a. Skin temperature test
Davoodi proposed a new type of personalized thermal comfort model that estimates the thermal sensation of bare and clothed body parts separately. From the model, they found that in a cold environment, age, weight, height, and gender become important factors influencing thermal sensation (Davoodi et al. 2017). Cosma also established a real-time thermal comfort monitoring system by combining color extraction from multiple local body parts, sensor information extraction, and thermal imager temperature extraction; its accuracy exceeded 80% (Cosma 2011).
b. Wrist test
Choi and Yeom established an effective connection between overall thermal sensation and human skin temperature. They found that thermal sensation can be estimated from a combination of skin temperatures of different body parts, and that data from the sides of the wrist were accurate more than 94% of the time (Choi and Yeom 2017). This finding is very helpful for narrowing the target range of thermal sensation estimation.
c. Facial thermal information test
Cosma used the model to analyze the vasoconstriction and thermoregulation activity of the experimenter's facial skin, and then integrated the influence weight of skin color to predict skin temperature and thermal comfort, with an accuracy of 75% (Cosma 2011). In another experiment, thermal imaging cameras were used to collect skin temperature information from different areas of the exposed face.
According to the estimated thermal comfort of the experimenter, facial skin temperature is more sensitive to cooling stimuli than to warming stimuli, and the temperature changes of the ears, nose, and cheeks are more pronounced and therefore more representative of overall thermal comfort (Li, Menassa, and Kamat 2018). Cosma also confirmed, through an overfitting test and a comparison of classifiers, that facial temperature changes are of great significance for improving the accuracy of thermal comfort prediction (Cosma and Simha 2019).
d. Comparison of different methods
Yang et al. tested and compared different thermal comfort detection methods. They found that traditional contact measurement methods require questionnaires as feedback, and the information is subjective; semi-contact measurement methods, such as smart bracelets, interfere little with the human body and are highly accurate; non-contact measurement methods based on infrared camera technology are suitable for multi-person environments, but the cost is high; and non-contact measurement methods based on Eulerian video magnification are not suitable for multi-person, dynamic environments (Yang et al. 2020). In summary, the semi-contact measurement method using smart bracelets is the lowest in cost, the easiest to popularize, the highest in relative accuracy, and the most portable. However, how to use it in a multi-person occupation environment is a significant challenge.
2.2 Limitation and challenges of the thermal comfort control in multi-occupancy condition
2.2.1 Limitation of the objective and the subjective influence
From an objective point of view, user behavior, such as current activity level and clothing level, is always time-varying and unpredictable.
Therefore, when so-called energy-efficient green buildings use room usage and occupancy rate to predict the required temperature setpoint, the actual energy consumption is often much higher than the ideal assumption. From a subjective point of view, the user's psychological state may also be inconsistent with the current physiological state. One example is the herd mentality mentioned in Chapter 1: because most people around them do not complain about an uncomfortable temperature, users choose not to express their own discomfort. In addition, changes in a person's psychological state also have a great impact on comfort. For example, in an examination room, even at a normal and comfortable temperature, being too nervous can produce a feeling of heat. These subjective requirements cannot be met by traditional systems that rely on occupancy to predict thermal comfort requirements.
2.2.2 Limitation of feedback needs
Ordinary indoor environment or temperature control relies mainly on the user's active adjustment, and a smarter system selects a more comfortable range based on the user's voting feedback. Although a system that relies on the user's final control can achieve effective and direct changes, individual users may still have insufficient understanding of the control technology and environmental conditions, resulting in inefficient and energy-intensive adjustments. This problem becomes more obvious in a space occupied by multiple people. For example, a sweaty person enters an office with a comfortable temperature. Thermal sensation needs a certain transition period to adjust, but the user blindly lowers the air conditioner temperature by a large margin, which eventually causes other people's discomfort and wastes energy.
Truly intelligent adjustment should let the machine predict the user's wishes as far as possible and execute them automatically. It comprehensively analyzes personal thermal comfort preferences and current physiological state data, and considers which machine learning method to use to make the most accurate predictions. On this basis, direct user feedback is minimized, while the right and the channels for users to make active adjustments are retained, so that the comfort requirements of each user are accommodated in an unobtrusive and natural way and optimal control of the building performance system is developed. However, this level of intelligence poses certain challenges in technical implementation, for it may require the integration of environmental comfort, data collection, data processing, machine learning, and other knowledge.
2.2.3 Difficulty of integration challenges
If a thermal comfort preference profile based on individual users is to be established, data selection and data weighting are difficult to integrate. Human thermal comfort is affected by many factors, such as the physical environment, psychological environment, indoor environment, and outdoor environment. Each environment contains many smaller items, like the various physiological signals discussed at the beginning of this chapter. Building a complete database requires in-depth research on the influence of each factor. The most closely related factors with the greatest degree of influence must be chosen, while their proportions are weighed against factors that have a small degree of influence but great significance. This not only requires a lot of professional knowledge; the relationship between some factors and human comfort needs is also very subtle and difficult to explore, especially the impact of psychological changes.
Therefore, how to build the most complete and efficient data collection system is a huge challenge.
2.2.4 Privacy protection challenges
Since personal thermal comfort is largely affected by race, age, gender, weight, physiological signals, and other individual factors, collecting user background data presents certain difficulties. In the big data environment, people often worry about entering private information, such as their names, ages, and addresses, in exchange for convenient and efficient services. Overly cumbersome personal information surveys can therefore easily put users off. In particular, when the project establishes a data sharing platform to lay the foundation for further building indoor environment control, the user's real-time physiological data is likely to be leaked if data privacy protection is not robust. Users may then receive product advertisements from unknown sources and become fully exposed in the big data environment. What kind of data privacy protection system would allow users to opt in with peace of mind and enjoy this intelligent environmental control is a question that needs to be considered at a later stage.
2.2.5 Different effects of building use
The nature and purpose of a building also lead to different comfort requirements. Different buildings, and even different rooms in the same building, have different requirements and priorities in terms of thermal comfort and energy saving. For example, on the same campus, the health center prioritizes patient comfort, whereas the laboratory prioritizes the storage temperature of test products. Even when the priority group is the same, for example students, thermal comfort requirements differ between general classrooms and large conference halls because of the difference in population density.
Therefore, in the design of an intelligent thermal comfort control system, it is necessary to consider the different needs of buildings with different properties and uses.
2.3 Smart connected communities
2.3.1 Intelligent control techniques in HVAC systems
In the past, an HVAC system usually served the thermal comfort needs of everyone in a space according to the settings or adjustments made by a single person, so it was difficult to reach an optimal decision for either energy saving or thermal comfort satisfaction. Many researchers have tried different methods to transform this passive regulation into more intelligent active regulation. Murakami developed an interactive HVAC control system that, based on the thermal comfort feedback of 50 users during the experiment, sends HVAC control commands in minutes under a logic that balances user needs and energy consumption. Tests showed that this interactive system stays within the limits of users' thermal comfort requirements while saving 20% of energy compared with a constant temperature (Murakami et al. 2007). By studying the relationship between thermal environment data and the experimenter's thermal votes, Lim found that generating personal characteristics information is very important for predicting thermal demand, and used Bayesian-based models to reduce data collection costs and improve prediction accuracy (Lim et al. 2018). Menassa integrated the physiological and behavioral data of test participants collected by smart bracelets and mobile phone applications, and established a personalized HVAC control framework through voting models and collective decision algorithms that predicted test participants' thermal comfort with more than 80% accuracy (Li, Menassa, and Kamat 2017b). However, the above experiments all require feedback from experiment users for prediction and analysis, which is in fact still a passive adjustment strategy.
Active adjustment could instead rely on the real-time thermal comfort monitoring methods mentioned above, which let the test participant's body communicate thermal comfort information to the system quietly and unobtrusively, so that the system can control the thermal environment. Jazizadeh and Jung collected facial photos of the experimenter from an office computer and combined image magnification technology with the Eulerian video magnification (EVM) algorithm to observe subtle changes in the blood under the skin at different temperatures. They then isolated and adjusted for interference factors in the RGB video images of the skin to infer the current state of body temperature regulation, and proposed a non-intrusive evaluation scheme for regulating environmental thermal comfort (Jazizadeh and Jung 2018). Cosma also uses a non-invasive method to extract transient temperature information from multiple parts of the face and body, thereby improving the accuracy of thermal comfort prediction by 80% (Cosma 2011). Park uses wearable devices and Internet of Things sensors to collect environmental and user information, and builds the service layer, ontology layer, and information layer into a building energy autonomous control system. In the system structure, the HVAC equipment is controlled independently through different relationship chains in order to meet the needs of users and reduce energy consumption (Park et al. 2015).
2.3.2 Application of machine learning in HVAC system
In recent years, more and more researchers have tried different machine learning algorithms to improve the accuracy and efficiency of personal thermal comfort assessment. Based on thermal comfort data from offices in seven different climate zones, Rana established a prediction function set using support vector regression (Rana et al. 2013). Liu used a neural network-based thermal comfort assessment model to evaluate the thermal comfort level of a test subject in 20 different environments, and the results showed that the prediction accuracy was good (W. Liu, Lian, and Zhao 2007). Jazizadeh integrated the test participants' thermal comfort index into a system framework and used fuzzy rules to assess participants' thermal comfort in two areas of an office (Jazizadeh et al. 2014). Daum used logistic regression to convert the user's willingness to give thermal comfort feedback into the current thermal comfort state, so as to achieve intelligent control of the building's thermal environment (Daum, Haldi, and Morel 2011). Kim chose six machine learning algorithms that do not require strong data assumptions (Classification Tree, Gaussian Process Classification, Gradient Boosting Method, Kernel Support Vector Machine, Random Forest, and Regularized Logistic Regression) and used the training set and data set to verify the prediction accuracy of the models. The accuracy and variability were 20% higher than those of the traditional PMV model (Kim et al. 2018). Luo also verified with 9 different machine learning algorithms that the prediction accuracy of machine learning is 10–20% higher than that of the PMV model (Luo et al. 2020). In Luo's experiment, Random Forest had the highest accuracy. Katić verified that RUSBoosted trees produced the highest thermal comfort evaluation accuracy, 0.84, and Kim also considered the impact of calculation cost on the evaluation accuracy (Katić, Li, and Zeiler 2020) (Kim et al. 2018).
2.4 Summary
The first part of this chapter clarifies the influence of various factors on thermal comfort, explains the necessity of collecting everyone's thermal comfort preferences and data, and introduces existing efforts on personal thermal comfort evaluation methods.
The second part follows the historical development of thermal comfort control and presents the limitations and challenges of meeting thermal comfort needs in multi-occupancy conditions. The third part summarizes the intelligent HVAC control systems established in existing research to improve the prediction accuracy of thermal comfort, and the application of different types of machine learning methods.

3. METHODOLOGIES

3.1 Overview of methodology

3.1.1 Project Design

The project proposes an integrated approach: an intelligent environmental control framework centered on community members that makes effective use of the existing technological infrastructure in the community and among its members. It develops cost-effective user interaction technology for individual residents, provides multiple auxiliary sensing functions, and aims to establish a comprehensive thermal comfort prediction model suitable for multi-person environments based on the thermal comfort preferences of different residents. However, the experiment faced an unprecedented challenge this fall: COVID-19. To meet the data requirements of the experiment as far as possible while reducing the impact on participants' lives, the number of participants was greatly reduced, to 6.

Figure 1. Project Design Planning Chart

According to Figure 1, the project can be decomposed into the following steps:

Step 1: Distribute a HOBO logger and two smart watches to each participant. Participants wear them for at least 1 day, during which the connected sensors record real-time data on the environment around the body (including temperature, humidity, etc.) and on the body itself (which may include heart rate, electrodermal activity, etc.). At the same time, participants frequently record the moments when they feel uncomfortable, to the nearest minute, matching the resolution of the sensor records.
Step 2: Establish the thermal comfort profile of each participant through data analysis. Filter out effective changes from the data through data cleaning and integration. Each record in a participant's thermal comfort profile carries the label "comfortable" or "uncomfortable" together with the corresponding environmental temperature and physiological signals over time, forming an information database of thousands of records. Participants record the moments when they feel uncomfortable; all other times are labeled as comfortable. The longer a participant wears the sensors, the more complete the recorded data.

Step 3: Analyze the correlation between outdoor environment and human thermal comfort, and establish the characteristics of each personal thermal comfort zone. Through machine learning, the contours of multiple personal thermal comfort zones are superimposed and analyzed to establish a model that can predict the most comfortable temperature under multi-person environmental conditions. The method is to derive the thermal comfort zone of participant A when the detected ambient temperature is X and the physiological signal of participant A is Y. Then take the overlap of the thermal comfort intervals of all participants and adjust the predicted temperature in small increments according to the proportion of participants who are beyond or far from the overlap range.

Step 4: After the model is established, the same participants are gathered in classroom 212 of Watt Hall at the University of Southern California for verification testing. In the same classroom space, different optimal ambient temperatures are predicted as people gather and move in different situations. After adjusting the ambient temperature according to the prediction, the thermal comfort feedback of all test participants is collected to verify the accuracy of the prediction.
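The overlap-and-adjust idea in Step 3 can be illustrated with a minimal sketch. It is not the model built in this thesis: it assumes each occupant's comfort zone has already been reduced to a single (lower, upper) temperature interval, and the function names, the midpoint starting value, the 0.5-degree step, and the example intervals are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower, upper) comfortable temperature for one occupant


def overlap(intervals: List[Interval]) -> Interval:
    """Return (largest lower bound, smallest upper bound).

    If the first value exceeds the second, the occupants share no common range.
    """
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    return low, high


def suggest_setpoint(intervals: List[Interval], step: float = 0.5, max_iter: int = 20) -> float:
    """Start from the midpoint of the overlap bounds and nudge toward the majority."""
    low, high = overlap(intervals)
    setpoint = (low + high) / 2.0
    n = len(intervals)
    for _ in range(max_iter):
        too_cold = sum(1 for lo, _ in intervals if setpoint < lo)  # occupants who want it warmer
        too_warm = sum(1 for _, hi in intervals if setpoint > hi)  # occupants who want it cooler
        if too_cold == too_warm:
            break
        # Shift by a fraction of `step` proportional to the net share of dissatisfied occupants.
        setpoint += step * (too_cold - too_warm) / n
    return setpoint


# Hypothetical comfort intervals, for illustration only.
shared = suggest_setpoint([(21.0, 25.0), (22.0, 26.0), (23.0, 27.0)])
conflict = suggest_setpoint([(20.0, 22.0), (21.0, 23.0), (25.0, 27.0)])
```

When the intervals share a common range, the sketch returns the midpoint of that range. When they do not, it drifts toward the side where more occupants are dissatisfied and stops once the two sides balance.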
3.1.2 Hardware and software

In this experiment, the wearable sensors Garmin Vivosmart 3 and Empatica Embrace are used to collect real-time data on participants' thermal comfort. The Empatica Embrace measures EDA and skin temperature; the Garmin Vivosmart 3 measures heart rate and stress level. Both devices record data at 1-minute intervals. The experiment also uses the Onset HOBO and HOBO MX1102 loggers, which can measure temperatures in the range of 0 to 50 degrees Celsius. In the first stage, participants carry an Onset HOBO to measure the ambient temperature at different locations, also at 1-minute intervals. All environmental and physiological data from the first stage can be exported through the HOBO Mobile, Garmin Vivosmart 3, and Empatica Embrace applications, as shown in Figure 3. In the fourth stage, the multi-person environment test, the sensors can be connected via Bluetooth to read the HOBO and wearable sensor data in real time.

Figure 2. Sensor Information Table

Figure 3. Single-occupancy experiment sensor equipment

In terms of software, Python is used for data classification, data analysis, machine learning, and simulation prediction and adjustment, which reduces the format and compatibility problems caused by converting data between multiple programs.

3.1.3 Experiment subjects and conditions

Due to the impact of COVID-19, many schools have changed to online teaching, resulting in changes to the original experimental plan. The subjects of the experiment are graduate students of different genders from USC. In the early stage of establishing personal thermal comfort profiles, participants were asked to wear the smart watches and download the corresponding data collection software on their mobile phones. They experienced as many different comfort environments as possible, under different clothing levels and activity levels.
The environments covered are their main living places, such as apartment, bedroom, classroom, restaurant, etc. In the later stage of simulation and operation, all participants were called to one place, classroom 212 in Watt Hall, for the simulation test. The space for the test is fixed, but the occupancy and mobility of the test population change according to the design, so as to verify the accuracy and final effect of the test from different aspects.

Figure 4. Project Design Flowchart

3.2 Data Collection Phase

3.2.1 Users’ background data

The user's background data mainly includes the test participant's age, gender, body mass index, ethnicity, activity level, and clothing level. What these factors have in common is that they all require users to actively input data before or during the test. As discussed in the second chapter, age, gender, ethnicity, and body mass index all have some influence on the thermal comfort preference interval, but the extent of this influence is not yet certain. Therefore, these data were entered by participants in the mobile phone software before the start of the experiment, and their relationships were examined by regression analysis during data analysis, so as to establish a more accurate personal thermal comfort preference profile. The activity level and clothing level, however, changed regularly or irregularly during the test. Therefore, when the sensor detects a certain degree of change in the ambient temperature or the user's heart rate during the test, a pop-up message asks the user whether they are planning to do outdoor sports or change their clothes.

Figure 5. Experimental data collection of test participants

3.2.2 Users’ real time physiological signals

The user's real-time physiological signals include heart rate, stress, skin temperature, and EDA. The Garmin Vivosmart 3 and Empatica Embrace were chosen based on these data requirements.
These devices can track heart rate, skin temperature, and activity level, and provide an application programming interface (API), so the data can be processed and analyzed programmatically in real time. The test participant needs to install the software “Alert for Embrace watch” and “Garmin Connect” on the phone, and pair the software with their watches.

3.2.3 Real time environment data

Real-time indoor environment data mainly include indoor environmental temperature and indoor relative humidity. During the data collection stage, these data were collected using a smaller Onset HOBO, which can be carried by the participant and whose data can be downloaded to a computer after the test. In the later test phase, the HOBO MX1102 with a display screen was used for data collection. Its temperature display gives a more accurate reading of the current indoor ambient temperature. The sensors attached to the building's HVAC system were not used because, in a larger space, factors such as individuals, location, and density produce different microclimates. Controlling this microclimate is the best way to control real-time personal comfort.

3.2.4 User’s feedback

The collection of user feedback was challenging and has accuracy limitations. First, the project sent each test participant an Excel form divided into 30-minute intervals covering their waking hours. The fields to be filled in by the test participant are: Thermal Comfort (levels 1-7); Thermal Sensation (levels 1-7); clothing index (0.2-1); and activity index (0.5-2.5). A reminder was set every 30 minutes on each test participant’s mobile phone, and the feedback request appeared as a pop-up window. Participants were encouraged to fill in the form each time, but if their status and thermal comfort were the same as in the previous 30 minutes, there was no need to fill it in.

Figure 6.
User feedback chart

However, since users complete the form without supervision, they are likely to forget to record when they are comfortable. The influence of such omissions can be reduced by lengthening the test cycle and increasing the number of people. The data collection was divided into two rounds, with 10 people participating in each round, and each round of testing lasted three days. To ensure the randomness and generality of the model, some test participants were randomly selected for the final test, and some new test participants were added to test the model.

3.3 Machine learning Phase

3.3.1 Input data

Figure 7 summarizes the types of input and output features and data in the different models. The Individual Thermal Comfort model does not consider the background data of the participant. The Compiled Thermal Comfort Model takes all environmental data as input in the training phase; in the prediction phase the environmental temperature is excluded, because it is the quantity to be predicted. The output predicted by the Individual Thermal Comfort model in the group is the minimum value of OTD. The output of the Compiled Thermal Comfort Model tested in a multi-occupancy setting can be the predicted optimal temperature setpoint or the overall OTD value.

Figure 7. Model input and output conclusion chart

3.3.2 Data clean

The first step in cleaning data is to find missing data. While collecting user information, the bracelet may not have been turned on, or user feedback may be missing. First, a missing-data heat map is used to visualize the missing data, and then different strategies are chosen for different types of missing values.
For example, observations that are missing because the bracelet was not turned on can be discarded directly, whereas missing user feedback can be filled in when the gap is short, because the user's comfort has probably stayed the same as in the previous record. The second step of data cleaning is to find and delete unnecessary and irrelevant data. Based on the physically reasonable range of each experimental factor, the extremely large or extremely small abnormal values generated at the moments when a test participant put on or took off the watch were deleted.

3.3.3 Machine learning algorithm analysis

The processed data was divided into a training set and a test set, and several different machine learning algorithms were used to test the prediction performance of different models. At this stage, the user's thermal comfort preference feedback is used as the dependent variable and as the ground truth for verifying prediction accuracy. From the literature reviewed in the second chapter, the machine learning algorithms that achieved high test accuracy in other papers were selected for this test. Decision tree and random forest have high accuracy and can provide interpretable results for both the individual model and the multi-occupancy model, so they were selected as the main algorithms in this project. Simple linear regression was used as a baseline method, since it does not require parameter tuning and is easier to interpret. The decision tree model uses the J48 decision tree algorithm, and the random forest model uses the scikit-learn ensemble module (sklearn.ensemble) in Python.

3.3.4 Output data

There are two types of data output. One is a single temperature value (in Fahrenheit): if the current environment is adjusted to this value, it should meet the thermal comfort needs of all users.
The other is a temperature range, generated from the intersection of the comfort zones of all test participants. When the model is used in different settings, precise temperature control can be achieved by tightening the range. For example, in a hospital scenario, the precision required of the temperature range is much higher than in public places. By setting the range of the output in the model, the needs of different environments can be met.

3.3.5 Optimization goal

In the final multi-person verification test, the information and data collected during the test were imported into the model to predict, at every moment, a range that fits everyone's comfort range. The predicted range was then compared with the thermal comfort level and thermal sensation level entered by each participant. The goal is for the predicted temperature range to meet the expectations of all participants at the same time.

3.4 Operation Phase

3.4.1 Simulation and forecasting in multi-occupancy condition

In the final experiment, 6 participants occupied a classroom for 2 hours while the temperature was constantly changing. Each participant wore two smart watches, the Garmin and the Embrace, and carried a small HOBO in a pocket to collect micro-environment data around the body. Four large HOBOs were placed in different parts of the room to account for uneven ambient temperature. To reach the boundaries of each participant's comfort range as far as possible, the experiment deliberately created extreme temperatures and conditions that made participants feel uncomfortable. Over the two hours, each participant randomly changed clothing level every 20 minutes, choosing among trousers + long-sleeve shirt (Clothing level = 0.61); trousers + long-sleeve shirt + sweater/jacket (Clothing level = 0.74); and trousers + long-sleeve shirt + down jacket (Clothing level = 1).
In addition, in the first hour of the experiment, each participant was assigned a different activity every 10 minutes: sitting (Activity level = 1); playing a game (Activity level = 2.6); or sport (Activity level = 4). The entire two hours were arranged as follows, as shown in Figures 8 and 9:

First hour (3 groups of 20 minutes): All test participants are in the same space, and their positions are evenly arranged. Their clothing level and activity level change randomly every 20 minutes.

Fourth 20 minutes: All test participants are in the same space, but they are divided into two groups by gender; one group is close to the heater and the other is far away from it. After 10 minutes, the two groups swap positions. This arrangement aims to show the thermal comfort of different genders in a space with uneven temperature.

Fifth 20 minutes: Some test participants stay in the space, while two test participants leave the room and re-enter it every 10 minutes. This arrangement tests the time required for the temperature prediction to reach the comfort standard when a new person joins the room.

Sixth 20 minutes: Individual test participants change their physical state through exercise, so that their physiological signals reach maximum values, and then rejoin the multi-person environment.

Through these different arrangements, thermal comfort prediction performance can be explored in depth under different population density and mobility conditions.

Figure 8. Clothing and activity arrangement plan for each test participant

Figure 9. Multi-occupancy test plan chart

Social distance was maintained between test subjects and researchers, and masks were worn throughout the experiment. The wearable sensors were handed to test participants with verbal instructions so that they could put them on correctly themselves.
Researchers assisted test participants in putting on the sensors only when participants encountered technical problems with them. Such assistance brings researchers closer than social distancing allows, but the contact is short-lived and both test participants and researchers wear masks.

3.5 Summary

The third chapter describes the overall framework of the research, which includes the experimental content to be carried out in the network environment and in the physical environment. Each step of the experiment, the required conditions, and the methods used are then explained in detail. Chapter 4 explains in detail the process of data integration and model building and analyzes the results of the experiments.

4. SINGLE-OCCUPANCY MODEL RESULT AND ANALYSIS

4.1 Database analysis

4.1.1 Data preprocess

In data preprocessing, unreasonable maximum and minimum values were removed by checking them against physically plausible ranges. In Figure 10, outliers are values that are abnormal with respect to the dependent variable y. They lie in the direction of the red arrow in the figure, farther from the mean of y than the main cluster of data points. High-leverage values are abnormal with respect to the independent variable x. They lie in the direction of the yellow arrow in the figure, farther from the mean of x along the x-axis. The outliers and high-leverage points identified in the leverage detection chart were eliminated.

Figure 10. Leverage detection chart of 30 test participant database

In addition, missing data were filled in using the mean method, based on the overall data for the same feature. For example, the average of one subject’s EDA data is used to fill in the EDA values that this subject lost due to irregular wearing.
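The per-subject mean imputation described above can be sketched as follows. This is an illustrative reading of the mean method rather than the thesis's own code; the record layout (dictionaries with hypothetical keys `id` and `eda`, and `None` marking a missing value) and the sample values are assumptions.

```python
from statistics import mean


def impute_subject_mean(records, feature):
    """Fill missing (None) values of `feature` with the mean of the same subject's own values."""
    observed = {}
    for rec in records:
        if rec[feature] is not None:
            observed.setdefault(rec["id"], []).append(rec[feature])

    filled = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are left untouched
        if rec[feature] is None and rec["id"] in observed:
            rec[feature] = mean(observed[rec["id"]])
        filled.append(rec)
    return filled


# Hypothetical records: subject 1 lost one EDA reading, subject 2 lost one as well.
records = [
    {"id": 1, "eda": 0.2},
    {"id": 1, "eda": None},
    {"id": 1, "eda": 0.4},
    {"id": 2, "eda": 1.0},
    {"id": 2, "eda": None},
]
filled = impute_subject_mean(records, "eda")
```

A subject with no observed values for the feature is left with `None` in this sketch, so such rows would still need to be dropped or handled separately.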
4.1.2 Database information of single occupancy experiment

Thirty subjects participated in the single-occupancy human thermal comfort experiment. All experimental data were transferred to the personal human thermal comfort database. Data were collected from groups of ten people for 1 week each, with a recording interval of 30 minutes. A total of 3 cycles of experiments were carried out, and the total number of experimental records was 2,796. The experiment collected the subjects’ personal background information, real-time environmental information, real-time physiological information, and real-time comfort feedback information. The database information is summarized in Table 1.

Table 1. Database information of 30 individual test participants

Attributes | Type | Missing | Statistics
ID | - | 0 | 30 individuals: 2796 records
Age | Integer | 0 | Min: 23, Max: 33, Average: 24.53
Gender | Polynomial | 0 | Male: 16 (1582 records), Female: 14 (1214 records)
BMI | Real | 0 | Min: 18.51, Max: 28.72, Average: 21.78
Environmental Temperature | Real | 0 | Min: 19.22℃, Max: 32.39℃, Average: 24.99℃
Relative Humidity | Real | 0 | Min: 9.14%, Max: 77.17%, Average: 50.55%
Heart Rate | Real | 595 | Min: 70, Max: 149, Average: 77.91
Stress Level | Real | 948 | Min: 26.32, Max: 97, Average: 31.11
Skin Temperature | Real | 922 | Min: 30.0℃, Max: 35.04℃, Average: 31.96℃
EDA | Real | 400 | Min: -0.10, Max: 4.12, Average: 0.62
Clothing Level | Polynomial (ASHRAE-55) | 59 | Min: 0.17, Max: 1, Average: 0.51
Activity Level | Polynomial (ASHRAE-55) | 78 | Min: 0.39, Max: 2.5, Average: 1.10
Thermal Comfort Level | Polynomial | 84 | Min: 1, Max: 7, Average: 4.23
Thermal Sensation Level | Polynomial | 79 | Min: 2, Max: 7, Average: 3.75

4.2 Baseline Model

4.2.1 Input and output

For the Baseline Model, based on the linear regression algorithm, the input data includes time, age, gender, height, weight, BMI, environmental temperature, relative humidity, heart rate, stress level, skin temperature, EDA, clothing level, activity level, thermal comfort level (TC), and thermal sensation level (TS).
The output data is the linear regression chart between TC or TS and the different features.

4.2.2 Baseline Model result

We use a linear regression model to analyze the correlation of different features in the data, and use the accuracy of the linear regression model as the baseline for the single-occupancy decision tree model and random forest model. The linear regression model takes the standard form $Y = \beta_0 + \beta_1 X_1 + \dots + \beta_n X_n + \varepsilon$, where X denotes the independent variables, which are the features of the environment, the body signals, and the test participants’ feedback, and Y denotes the dependent variable, which is the environmental temperature. In the linear regression analysis, although the indoor environmental temperature setpoint is the final target of prediction, the target temperature needs to meet participants' expectations for Thermal Comfort and Thermal Sensation. The correlation of the other features with TC and TS, and their influence on them, can be analyzed through statistical regression line graphs. Figure 11 and Figure 12 show the regression of Thermal Sensation and Thermal Comfort, respectively, on four physiological signals (heart rate, stress level, skin temperature, and EDA), based on all participants’ data. Figure 11 shows that Thermal Sensation increases slightly as each of the four physiological signals increases. For example, as skin temperature increases, the thermal sensation of the ambient temperature also increases. However, the regression lines fitted to the points in Figure 12 are almost horizontal, which shows that the physiological data are widely scattered: an increase or decrease in the physiological values has no clear tendency to influence thermal comfort. Therefore, in this experiment, there is no need to aggregate and analyze physiological signal data.

Figure 11. Regression chart of Thermal Sensation and Real-time Physiological Features (Heart rate, Stress Level, Skin temperature, and EDA)

Figure 12.
Regression chart of Thermal Comfort and Real-time Physiological Features (Heart rate, Stress Level, Skin temperature, and EDA)

Figure 13 and Figure 14 illustrate the regression relationships of clothing level and activity level, respectively, with TC and TS. The clothing index has no obvious effect on thermal sensation, but as the clothing index increases, the thermal comfort index decreases. The data collection for this experiment took place in Los Angeles in October, so the increase in clothing index was possibly due to the decrease in environmental temperature during that period. An increase in the activity index leads to a small increase in the thermal sensation index, but it has no obvious effect on the thermal comfort index.

Figure 13. Regression chart of Thermal Sensation and Thermal Comfort with Clothing Level

Figure 14. Regression chart of Thermal Sensation and Thermal Comfort with Activity Level

Overall, the RMSE of the baseline linear regression model is 1.879, and its R-square is 0.275. Most of the regression lines in the above graphs have a slope close to 0, which shows that the correlation between the features involved in the experiment is weak. In these graphs, most of the points lie far from the regression line, which indicates a large random error in prediction by linear regression. This further suggests that there is little linear connection between the features, so aggregated data analysis is not necessary, and a model that does not rely on feature selection is expected to give better prediction results.

Table 2. P-value and R-value for all features to TC and TS.
(* indicates a p-value greater than 0.05, i.e. not statistically significant)

Target | Statistic | Heart Rate | Stress Level | Skin-temp | EDA | Clo | Act
TS | P-value | 0.019 | 0.000 | 0.000 | 0.199* | 0.133* | 0.001
TS | R-value | 0.050 | 0.127 | 0.172 | 0.026 | 0.029 | 0.065
TC | P-value | 0.467* | 0.002 | 0.002 | 0.747* | 0.000 | 0.467*
TC | R-value | -0.016 | 0.073 | 0.073 | 0.007 | -0.129 | -0.014

The p-value is used to judge whether the null hypothesis H0 holds. Because the expected value is derived under H0, the more closely the observed value agrees with the expected value, the closer the observed phenomenon is to the null hypothesis and the less reason there is to reject it. If the p-value is less than 0.05, the result for that feature is considered statistically significant. Table 2 shows that most of the p-values are less than 0.05, so most features have a statistically significant relationship with TC or TS, although the R-values discussed below show that these relationships are weak. In the TS model, the p-values of EDA and clothing level are greater than 0.05; in the TC model, the p-values of heart rate, EDA, and activity level are greater than 0.05. The reason may be that the sampling points of these features are scattered and irregular; for example, clothing level (Clo) and activity level (Act) are filled in by the test participants themselves. There may be other features that can replace EDA or heart rate. The R-value is the Pearson correlation coefficient, which measures the linear correlation between each feature and TS or TC, and its value lies between -1 and 1. The closer the value is to -1 or 1, the stronger the correlation between the two variables. All the R-values in Table 2 are close to 0, which means that the relationship between each individual variable and the target value is not obvious. Therefore, there is no need to aggregate and analyze physiological signal variables, and the next step is to build a model that considers the relationship between multiple variables and the target values.
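As a minimal sketch of how an R-value of this kind is computed, and of why a correlation close to 0 can still come with a small p-value when there are thousands of records, the following uses only the Python standard library. The function names and sample values are hypothetical; the t statistic shown is the usual test statistic for a Pearson correlation, and converting it to a p-value would additionally require the t distribution, which is left out here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """Test statistic for H0: no linear correlation, given r from n paired records."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical skin temperatures (deg C) and thermal sensation votes, for illustration only.
skin_temp = [31.0, 31.5, 32.0, 32.5, 33.0]
ts_votes = [3, 4, 3, 5, 4]
r = pearson_r(skin_temp, ts_votes)

# The same weak correlation tested on a small and on a large sample.
t_small = t_statistic(0.05, 50)
t_large = t_statistic(0.05, 2000)
```

With r = 0.05, the statistic is about 0.35 for 50 records but about 2.2 for 2,000 records, which illustrates how a weak correlation can be statistically significant in a large database while still explaining very little of the variation.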
4.3 Individual Thermal Comfort Model

4.3.1 Input and output

The database of the Individual Thermal Comfort Model is the same as in Table 1, that is, the cleaned data set used for the linear regression model. The data was divided into 30 files, one per test participant. Gaps were filled with the average value of each column, and 84 rows lacking TS information and 79 rows lacking TC information were deleted. The overall data of the 30 subjects for the TS model and the TC model comprises 2712 rows, but after the data is split into personal files each subject’s model has only about 90 rows. For the Individual Thermal Comfort Model, the input data includes environmental temperature, relative humidity, heart rate, stress level, skin temperature, EDA, clothing level, and activity level. The output data is thermal sensation (TS) or thermal comfort (TC).

4.3.2 Individual Thermal Comfort Model result

The “Weka Explorer” was used to pre-process and classify the data. The CSV data files of the 30 individuals were converted into ARFF files and imported into the Explorer application of the Waikato Environment for Knowledge Analysis (WEKA) for J48 decision tree analysis. The J48 decision tree follows a top-down, recursive divide-and-conquer strategy: select an attribute to place at the root node, generate a branch for each possible attribute value, and divide the instances into multiple subsets, each corresponding to one branch extending from the root; then repeat this process recursively on each branch. When all instances in a subset have the same classification, that subset is no longer divided. Decision tree algorithms handle noisy data well, have high accuracy, and can display the important decision attributes. The generated decision results are easy to understand, but they are usually only effective for smaller training sample sets.
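The attribute-selection step at the heart of this recursive strategy can be sketched as follows. J48 is WEKA's implementation of the C4.5 algorithm, which ranks candidate splits by gain ratio; the code below is a simplified illustration for a single numeric attribute, not WEKA's implementation, and it omits pruning and the other refinements of C4.5. The function names and the sample temperatures and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` versus `value > threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - w_left * entropy(left) - w_right * entropy(right)
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info


def best_threshold(values, labels):
    """Return the candidate threshold with the highest gain ratio, or None if no split exists."""
    candidates = sorted(set(values))[:-1]  # splitting at the largest value leaves the right side empty
    if not candidates:
        return None
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


# Hypothetical environment temperatures (deg C) and thermal sensation labels.
temps = [24.0, 24.5, 25.0, 25.5, 26.0, 26.5]
votes = ["neutral", "neutral", "neutral", "warm", "warm", "warm"]
split = best_threshold(temps, votes)
```

In this toy example the split at 25.0 separates the two labels perfectly, so it receives a gain ratio of 1 and is selected. A full tree builder would apply the same selection recursively to each resulting subset until the subsets are pure.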
The data files of the 30 test participants have the same structure as Table 3, which shows the minimum, maximum, mean, and standard deviation of ID 1’s input data together with this test participant’s actual thermal sensation and thermal comfort feedback. Since the content of each test participant’s data is similar, the file of test participant ID 1, which includes 78 rows of data, is taken as an example.

Table 3. Data information sample of Test participant ID 1.

ID 1 (78 records)

Feature | Min | Max | Mean | Std Dev | Thermal Sensation (TS) label | Count | Thermal Comfort (TC) label | Count
environment temperature | 24.22 | 26.744 | 25.67 | 0.54 | very cool | 0 | very uncomfortable | 0
relative humidity | 59.70 | 63.387 | 61.54 | 0.86 | slightly cool | 1 | uncomfortable | 0
heart rate | 55 | 127 | 75.71 | 12.14 | cool | 6 | slightly uncomfortable | 8
stress level | 0 | 97 | 33.22 | 25.12 | neutral | 57 | neutral | 61
skin temperature | 30.48 | 33.585 | 32.02 | 0.59 | warm | 9 | slightly comfortable | 4
EDA | 0.005 | 2.137 | 0.33 | 0.34 | slightly warm | 1 | comfortable | 5
clothing level | 0 | 1 | 0.73 | 0.14 | very warm | 4 | very comfortable | 0
activity level | 0.8 | 1.2 | 1.006 | 0.09 | | | |

After the data of test participant ID 1 is loaded, the decision tree model is run with the default J48 parameters and 5-fold cross-validation. Figure 15 shows the Thermal Sensation decision tree of test participant ID 1. The root feature is clothing level: 88% of the records have a value of 0.74 (sweatpants with a long-sleeve sweatshirt), so the data are easily divided at 0.74. Below the root there are 4 layers of branches, split by stress level, relative humidity, clothing level, environment temperature, and skin temperature. The output of each branch is a thermal sensation level: cool, neutral, warm, or very warm. The Mean Squared Error is 0.045, and the model's Correctly Classified Instances rate is 83.33%.

Figure 15. Individual Thermal Sensation (TS) decision tree of test participant ID 1.
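To put the 5-fold cross-validation and the Correctly Classified Instances figure in context, the sketch below runs a plain 5-fold split over the TS label counts listed in Table 3 and scores a trivial classifier that always predicts the most frequent training label. It is an illustrative baseline, not part of the thesis workflow: the helper names and the random seed are assumptions, and the folds here are not stratified, whereas WEKA stratifies its cross-validation folds by default.

```python
import random
from collections import Counter

# TS label counts of test participant ID 1, as listed in Table 3 (78 records in total).
TS_COUNTS = {
    "very cool": 0, "slightly cool": 1, "cool": 6, "neutral": 57,
    "warm": 9, "slightly warm": 1, "very warm": 4,
}


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_baseline_accuracy(labels, k=5, seed=0):
    """Pooled k-fold accuracy of always predicting the most frequent training label."""
    correct = 0
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train = [labels[i] for i in range(len(labels)) if i not in held_out]
        majority = Counter(train).most_common(1)[0][0]
        correct += sum(1 for i in test_idx if labels[i] == majority)
    return correct / len(labels)


labels = [label for label, count in TS_COUNTS.items() for _ in range(count)]
accuracy = majority_baseline_accuracy(labels)
```

Because 57 of the 78 records are labeled neutral, this baseline scores 57/78, or about 73.1%, whichever way the folds fall. That gives a reference point for reading the 83.33% reported for the decision tree of test participant ID 1.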
By summarizing the individual TS decision tree models of all test participants, Table 4 classifies the results according to gender and gives the average model accuracy and the proportion of different features in the first three layers. The average accuracy rate of female test participants is 9.16% higher than that of male test participants. This may be because women are usually more sensitive to temperature than men and have a larger range of TS. The dominant features of female test participants in the first three layers are relatively scattered: activity level, environmental temperature, and clothing level. For male test participants, the first three layers are dominated mainly by environmental temperature and humidity. Figure 16 shows two Thermal Sensation decision tree samples, one from each gender, which also illustrate that the first three layers of the male's decision tree all include environment temperature as a classification feature. This shows that women may be more likely than men to be affected by daily behaviors and clothing conditions. In addition, the proportions of EDA, stress level, and heart rate are very small, indicating that they do not have obvious characteristics for classifying the data.

Table 4. Individual TS decision tree model accuracy and layer feature for 30 test participants (percentage of feature in first three layers of each gender group) Gender (Test participants' ID Number) Correctly Classified Instances RMSE Layer Env-Temp RH Heart Rate Act Clo Skin-Temp EDA Stress Female (1; 10; 11; 12; 15; 16; 17; 20; 21; 22; 25; 26; 27; 30) 80.76% 0.207 1 15% 24% 30% 15% 8% 2 29% 12% 23% 12% 12% 12% 3 15% 22% 22% 36% 5% Male (2; 3; 4; 5; 6; 7; 8; 9; 13; 14; 18; 19; 23; 24; 28; 29) 71.67% 0.279 1 39% 7% 7% 20% 20% 7% 2 31% 23% 8% 19% 8% 3% 8% 3 24% 24% 12% 8% 8% 8% 4% 12%

Figure 16. TS decision tree example for each gender group

Figure 17 is the visualized Thermal Comfort tree of test participant ID 1. The root feature is still the clothing level.
But there are only three layers of branches, divided by stress level, relative humidity, and environment temperature. The output of each branch is a thermal comfort level: comfortable, neutral, or slightly uncomfortable. The Mean Squared Error is 0.049, and the Correctly Classified Instances of the model is 80.65%, which is slightly lower than that of the individual TS model (83.33%).

Figure 17. Individual Thermal Comfort (TC) decision tree of test participant ID 1.

By summarizing the personal TC decision tree models of all test participants, Table 5 classifies the results according to gender and gives the average model accuracy and the proportion of different features in the first three layers. The average accuracy rate of female test participants is 3.76% higher than that of male test participants. The dominant features of female and male test participants in the first three layers follow the same pattern as in the TS model: one is more scattered and the other is concentrated on the environmental temperature. Figure 18 shows two Thermal Comfort decision tree samples, one from each gender, which also illustrate that the first three layers of the male's decision tree all include heart rate and environment temperature as classification features. However, the proportion of EDA as a feature in the first three layers of the TC model has increased compared with the TS model, which indicates that the EDA data are more informative for classifying thermal comfort status than thermal sensation. Table 5.
Individual TC decision tree model accuracy and layer feature for 30 test participants (percentage of feature in first three layers of each gender group) Gender (Test participants' ID Number) Correctly Classified Instances RMSE Layer Env-Temp RH Heart Rate Act Clo Skin-Temp EDA Stress Female (1; 10; 11; 12; 15; 16; 17; 20; 21; 22; 25; 26; 27; 30) 71.34% 0.293 1 29% 14% 9% 21% 9% 9% 9% 9% 2 20% 10% 10% 20% 10% 10% 10% 10% 3 11% 5% 11% 26% 26% 5% 11% 5% Male (2; 3; 4; 5; 6; 7; 8; 9; 13; 14; 18; 19; 23; 24; 28; 29) 67.58% 0.319 1 40% 20% 20% 20% 2 29% 17% 4% 13% 8% 8% 13% 8% 3 24% 12% 9% 21% 17% 15% 2%

Figure 18. TC decision tree example for each gender group

4.3.3 Individual Thermal Comfort Model accuracy performance

Because the data themselves may be imbalanced, each individual TC model was retrained on 80% of the TC data and tested on the remaining 20%, with 5-fold cross-validation, to recalculate the accuracy of a single model. The results show that the average training and test accuracies of the 30 test participants, 65.74% and 62.98%, are only slightly lower than the original model accuracy. In fact, however, the difference between the accuracy of the training set and that of the test set fluctuates greatly: the training accuracy is often much higher or far lower than the test accuracy. The average RMSE of the 30 test participants' decision trees is 0.306. This result seems to indicate that the accuracy of the model is high, but it actually reflects overfitting, because a decision tree may have a small error on training data but a large error on unseen test data. Therefore, the model needs to be simulated in a group to test whether its accuracy is suitable for development into a multi-occupancy environment model.

Figure 19.
Individual Thermal Comfort Model average accuracy chart based on train data, test data and total data

4.4 Compiled Thermal Comfort Model

4.4.1 Improvement of algorithm

With the regression analysis model as a benchmark reference, all features are shown to be independent and non-exclusive in the analysis. However, traditional statistical methods (linear regression) are difficult to apply to the classification of information with multiple features or large amounts of data. The Individual Thermal Comfort Model has extremely high accuracy overall, but the accuracy difference between the training set and the test set of a single model is very large, which is caused by data imbalance and overfitting. Random forest is a more advanced algorithm based on decision trees. Like decision trees, random forests can be used for both regression and classification. As the name suggests, a random forest is a forest constructed in a random manner, composed of many unrelated decision trees. In fact, random forest belongs to a very important branch of machine learning called ensemble learning. Ensemble learning solves a single prediction problem by building a combination of several models. It works by generating multiple classifier models, each of which learns and makes predictions independently. These predictions are finally combined into a single prediction, which is generally better than the prediction of any single classifier. Therefore, in theory, the performance of a random forest is generally better than that of a single decision tree, because the random forest determines its result by voting on the results of multiple decision trees. Simply put, each decision tree in the random forest has its own result; the random forest counts the results of all decision trees and selects the result with the most votes as its final result.
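The voting step described above can be illustrated with a few lines of Python. This is a minimal sketch, not the implementation used in this study: the function names `majority_vote` and `average_vote` and the example predictions are hypothetical. Majority voting applies to classification; for regression, the tree outputs are typically averaged instead.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    counts = Counter(tree_predictions)
    label, _ = counts.most_common(1)[0]
    return label


def average_vote(tree_predictions):
    """Return the mean of the trees' numeric predictions (regression case)."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical outputs of five trees for one sample.
class_votes = ["neutral", "warm", "neutral", "cool", "neutral"]
temperature_votes = [25.0, 26.0, 27.0]

print(majority_vote(class_votes))
print(average_vote(temperature_votes))
```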
Many similar thermal comfort prediction experiments have confirmed that random forest has higher accuracy than other algorithms (Katić, Li, and Zeiler 2020) (Kim et al. 2018) (Luo et al. 2020). The existing experimental data set is small and contains a significant amount of missing data. Therefore, the project adopted the random forest algorithm to construct the comfort zone in the Compiled Thermal Comfort Model, which gives a more realistic accuracy rate than the Individual Thermal Comfort Model.

Figure 20. Visualization of the Individual Thermal Comfort Model and the Compiled Thermal Comfort Model

Figure 20 is the visualization of the two models. The Individual Thermal Comfort Model uses a decision tree algorithm, and each subject has their own decision tree, so there is a total of 30 Individual Thermal Comfort Models. The training data of each individual model come from the subject alone, the model can only be applied to that subject, and the predicted result is the Thermal Comfort Level of the subject. In contrast, the Compiled Thermal Comfort Model is a random forest trained with the data of all 30 subjects, which are combined into a single prediction model. Therefore, the Compiled Thermal Comfort Model can be applied to subjects who participated in data collection or to any new subject, and its predicted result is the temperature range of the subject's comfort zone, which is more conducive to establishing a multi-occupancy condition prediction model.

4.4.2 Definition of the thermal comfort boundary

To establish the Compiled Thermal Comfort Model, the first issue to be considered is how to define the boundary of the comfort zone based on the data. Two methods are considered, which are shown in Figure 21.
Method (a) is to extract, from the data reported by the test participants when Thermal Comfort=4 (TC=neutral), the maximum and minimum values of Thermal Sensation (TS), that is, the coldest and the hottest boundary values at which everyone still feels comfortable. If this experiment were supported by a huge amount of data, the comfort zone obtained in this way should be very accurate. However, the amount of test data is limited, and when the test participant fills in TC=4, most of the TS values are 4 (neutral) and only a few are 3 or 5 (slightly cool or slightly warm). Therefore, the temperature range of the comfort zone extracted from this type of data is restricted.

Figure 21. Two Different Comfort Zone Boundary Calculation Method

Method (b) is to define the comfort zone by measuring the boundaries of the discomfort area. First, extract the data with TC<4 (slightly uncomfortable, uncomfortable, very uncomfortable), on the cool side with 1≤TS≤3 (very cool to slightly cool) and on the warm side with 4<TS≤7 (slightly warm to very hot). The rest of the data represents the comfort zone of the test participant: regardless of whether the TS is high or low, the TC is greater than or equal to 4. This comfort zone determination scheme is preferable for limited experimental data. In a simple test with the linear regression model, the accuracy and explanatory power of method (b) are also improved compared with the first scheme, so subsequent models adopt this reverse elimination scheme to determine the comfort range.

4.4.3 Input and output

The database of the single-occupancy model is the same as Table 1, which is the cleaned data set from the linear regression model.

Figure 22. Input and output training and testing data of Individual Thermal Comfort Model and Compiled Thermal Comfort Model

Figure 22 shows the input and output training and testing data of the Individual Thermal Comfort Model and the Compiled Thermal Comfort Model.
Overall, compared with the Individual Thermal Comfort Model, the Compiled Thermal Comfort Model adds identification features (age, gender, BMI) and participant feedback on TC and TS (Thermal Comfort and Thermal Sensation) to the input data. The Individual Thermal Comfort Model does not include identification features because, if gender and age were used as features to categorize the data, they would easily be chosen as root features owing to their very few distinct values, which is meaningless. In the training phase of the Compiled Thermal Comfort Model, the input data include time, age, gender, height, weight, BMI, environmental temperature, relative humidity, heart rate, stress level, skin temperature, EDA, clothing level, activity level, thermal comfort level, and thermal sensation level. The output is the predicted comfort zone. In the testing phase of the random forest model, the input data include environmental temperature, relative humidity, heart rate, stress level, skin temperature, EDA, clothing level, activity level, thermal comfort level, and thermal sensation level. The output is the predicted comfort zone.

4.4.4 Compiled Thermal Comfort Model result

The random forest model uses "RandomForestRegressor" from "sklearn.ensemble". The model parameters are set by first using random search to determine the approximate range of the parameters and then using grid search to determine the optimal parameter values. Testing the accuracy of the model by cross-validation with 5 folds, 7 folds, and 10 folds shows that 5-fold cross-validation yields parameters with similar accuracy while also reducing the running time of the whole model. Table 6 summarizes the RMSE and R-square of the model with the different numbers of cross-validation folds.
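The two-stage search described above (random search for an approximate range, then grid search to refine it) can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions: the data are synthetic, and the candidate parameter values, the variable names, and the small forest sizes (chosen to keep the example fast) are illustrative and are not the settings of this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Synthetic stand-in for the training data (6 hypothetical features).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 6))
y = 24.0 + 4.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.2, 120)

# Stage 1: random search over a coarse, assumed parameter space.
coarse_space = {
    "n_estimators": [20, 40, 80],
    "max_depth": [None, 4, 8],
    "max_features": [2, 4, 6],
    "min_samples_leaf": [1, 2, 4],
    "min_samples_split": [2, 4, 8],
}
coarse = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=coarse_space,
    n_iter=5,
    cv=5,  # 5-fold cross-validation
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
coarse.fit(X, y)
best = coarse.best_params_

# Stage 2: grid search in a narrow neighbourhood of the stage-1 result.
fine_space = {
    "n_estimators": [best["n_estimators"], best["n_estimators"] + 20],
    "max_depth": [best["max_depth"]],
    "max_features": [best["max_features"]],
    "min_samples_leaf": [best["min_samples_leaf"]],
    "min_samples_split": [best["min_samples_split"]],
}
fine = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid=fine_space,
    cv=5,
    scoring="neg_root_mean_squared_error",
)
fine.fit(X, y)

best_rmse = -fine.best_score_  # the scorer returns the negated RMSE
print(fine.best_params_, round(best_rmse, 3))
```

Running the random search first keeps the number of fitted models small, since the exhaustive grid is only evaluated around the most promising region.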
Therefore, random search with 5-fold cross-validation is used to search for the best hyperparameters, and grid search is used to narrow down the range of each hyperparameter, which gives max depth=None, max features=5, min samples leaf=1, min samples split=2, and number of estimators=1500.

Table 6. RMSE and R-square performance of Cross Validation of Compiled Thermal Comfort Model based on whole data and test data with 5 folds, 7 folds, and 10 folds

Folds | RMSE (whole data) | R-square (whole data) | RMSE (test data) | R-square (test data)
5 folds | 0.839 | 0.448 | 0.880 | 0.586
7 folds | 0.841 | 0.445 | 0.863 | 0.601
10 folds | 0.876 | 0.401 | 0.937 | 0.5298

After the data of the 30 test participants (age, gender, height, weight, BMI, environmental temperature, relative humidity, heart rate, stress level, skin temperature, EDA, clothing level, activity level, TC, and TS) are input into the Compiled Thermal Comfort Model and "RandomForestRegressor" is run with the parameters found, the random forest model for the maximum thermal comfort value contains 1500 decision trees, and the random forest model for the minimum value contains 800 decision trees.

Figure 23. A random forest tree sample from the maximum value of Compiled Thermal Comfort Model

One randomly selected decision tree from the maximum value random forest model, shown in Figure 23, is taken as an example. It is composed of nodes and directed edges. Each internal node, which represents a feature or attribute, contains the feature value, the Mean Square Error, the number of observed samples, and the output temperature value. The node splits in different directions based on the value of the feature. For example, at the first internal node, a sample with EDA≤0.608 goes to the left branch; otherwise it goes to the right. Since this tree is produced by bagging, a single tree does not contain all the features. But the Bagging method adopts a divide-and-conquer strategy.
Through multiple sampling of the training samples, multiple independent base models are calculated separately, and the models are then integrated to reduce the variance of the ensemble. The greater the number of base models, the less the overall prediction is affected by any individual model, and the smaller the variance. At the same time, the features considered at each level of each tree are selected by random sampling. The goal is to minimize the RMSE of the prediction. Finally, for a random forest composed of thousands of decision trees, the final predicted temperature is obtained by averaging.

4.4.5 Compiled Thermal Comfort Model accuracy performance

Root Mean Squared Error (RMSE) is the square root of the mean of the squared errors between the predicted values and the true values. Linear regression uses RMSE as a loss function to evaluate the accuracy of the model. The smaller the value of RMSE, the more accurate the result predicted by the model. Beyond the size of the numerical error itself, it is also very important to judge whether the model captures enough of the pattern in the data. R-Squared is the ratio of the amount of information captured by the model to the amount of information contained in the real labels. Therefore, the closer the value of R-Squared is to 1, the better the explanatory power of the model. Table 7 compares the RMSE and R-square values of the single-person occupancy temperature prediction by the Baseline Model (linear regression algorithm), the Individual Thermal Comfort Model (decision tree), and the Compiled Thermal Comfort Model (random forest algorithm). R-square is only shown for the regression models. Only the Individual Thermal Comfort Model's RMSE is the average of the RMSE of the 30 test participants' decision trees, which is 0.082. Table 7.
RMSE and R-square value of Baseline Model, Individual Thermal Comfort Model (average RMSE) and Compiled Thermal Comfort Model

Single-occupancy condition | Baseline Model | Individual Thermal Comfort (ITC) Model | Compiled Thermal Comfort (CTC) Model | CTC Model accuracy compared to Baseline Model | CTC Model accuracy compared to ITC Model
Database | 30 test participants | 30 test participants | 30 test participants | |
Algorithm | Linear Regression | Decision Tree | Random Forest | |
RMSE | 1.879 (whole model's RMSE) | 0.306 (30 individual models' average RMSE) | 1.159 (whole model's RMSE) | +62.26% | -73.59%
R-square | 0.275 | / | 0.725 | +163.63% | /

Comparing the RMSE and R-squared value of the Compiled Thermal Comfort Model with those of the Baseline Model shows that the accuracy of the random forest algorithm is 62.26% higher than that of the linear regression algorithm, and the explanatory power of the model is improved by 163.63%. However, when the RMSE of the Compiled Thermal Comfort Model is compared with that of the Individual Thermal Comfort Model, the accuracy of the random forest is 73.59% lower than that of the decision tree. Although the Individual Thermal Comfort Model seems more accurate for individuals, it has a serious data imbalance problem, which has a great impact in multi-occupancy trials. Therefore, simulating both the ITC model and the CTC model in a group test could decide which one is more suitable for developing a multi-occupancy condition model.

4.4.6 Performance comparison between Compiled Thermal Comfort Model and PMV model

Table 8 compares the prediction results of the Compiled Thermal Comfort Model and the PMV model. The data in the table are based on the prediction data of test participant ID 1 in the single-occupancy database. The ratio of the predicted comfort range is the width of the temperature range of the Compiled Thermal Comfort Model divided by the width of the temperature range of the PMV model.
The average ratio of the predicted comfort range is 54.54%, which shows that the average comfort range of the Compiled Thermal Comfort Model is roughly half that of the PMV Model. The comparison value of the predicted set temperature is the predicted value of the random forest model minus the predicted value of the PMV model. On average, the setpoint predicted by the Compiled Thermal Comfort Model is 0.64 ℃ higher than that of the PMV Model, and the Compiled Thermal Comfort Model gives the higher setpoint in 86.6% of the cases. This result suggests that, while achieving the same comfortable state, the random forest model can set a value closer to the ambient temperature. Without reducing satisfaction levels, increasing the cooling setpoint by 1 °C saves an average of 10% of cooling energy (Hoyt, Arens, and Zhang 2015). The test participant in the table would therefore be able to save 6.4% of energy on average by using the Compiled Thermal Comfort Model to set the ambient temperature demand.

Table 8. Comfort zone prediction performance comparison between Compiled Thermal Comfort Model and PMV Model of Test participant ID1.

Columns: Real-time Temp (℃) | CTC Comfort Range Min (℃) | CTC Comfort Range Max (℃) | CTC Prediction Temp. Setpoint (℃) | PMV Temp. Min (℃) | PMV Temp. Max (℃) | PMV Temp. Setpoint (℃) | Temp. Range Ratio Comparison | Temp. Setpoint Comparison | Assessment | Potential Energy Saving

26.38 | 25.6 | 28.5 | 25.3 | 23.7 | 29.4 | 24.4 | 49.15% | 0.92 | Higher | 9.20%
27.17 | 26.1 | 29.7 | 25.9 | 24.1 | 30.7 | 23.8 | 53.77% | 2.09 | Higher | 20.90%
25.91 | 26.8 | 29.6 | 24.9 | 24.8 | 30.6 | 23.3 | 49.04% | 1.63 | Higher | 16.30%
27.24 | 27.7 | 29.6 | 26.6 | 25.7 | 30.6 | 23.9 | 37.83% | 2.70 | Higher | 27.00%
26.82 | 26.9 | 29.7 | 25.9 | 24.9 | 30.7 | 23.7 | 48.01% | 2.23 | Higher | 22.30%
22.61 | 23.3 | 29.7 | 21.5 | 25.5 | 30.6 | 22.2 | 125.46% | -0.66 | Lower | -6.60%
26.38 | 29.5 | 30.9 | 25.9 | 27.4 | 32 | 22.7 | 32.56% | 3.22 | Higher | 32.20%
26.38 | 27.8 | 29.7 | 25.7 | 25.8 | 30.7 | 23.3 | 38.57% | 2.39 | Higher | 23.90%
24.85 | 26.9 | 29.9 | 23.8 | 24.9 | 30.9 | 22.7 | 51.46% | 1.11 | Higher | 11.10%
25.84 | 26.9 | 29.3 | 24.9 | 25 | 30.2 | 23.9 | 44.32% | 0.99 | Higher | 9.90%
26.39 | 26.6 | 29.3 | 25.4 | 24.6 | 30.3 | 24.2 | 48.85% | 1.21 | Higher | 12.10%
24.64 | 28.4 | 30.7 | 23.7 | 26.5 | 31.6 | 22.8 | 43.38% | 0.92 | Higher | 9.20%
23.87 | 28.2 | 30.8 | 22.9 | 26.3 | 31.7 | 22.3 | 46.88% | 0.62 | Higher | 6.20%
24.74 | 27.5 | 29.8 | 23.8 | 25.6 | 30.7 | 23 | 43.52% | 0.81 | Higher | 8.10%
23.68 | 29.0 | 31.5 | 22.8 | 27 | 32.5 | 21.6 | 47.15% | 1.23 | Higher | 12.30%
24.55 | 28.5 | 31.4 | 23.6 | 26.6 | 32.3 | 22.9 | 49.58% | 0.71 | Higher | 7.10%
23 | 23.2 | 30.8 | 22.2 | 21.3 | 31.7 | 21.2 | 72.26% | 1.02 | Higher | 10.20%
24.65 | 23.3 | 31.2 | 23.7 | 25.2 | 30.3 | 23.2 | 154.90% | 0.45 | Higher | 4.50%
25.28 | 29.5 | 31.0 | 24.7 | 27.5 | 32 | 22.2 | 33.11% | 2.49 | Higher | 24.90%
25.02 | 27.0 | 29.2 | 24.0 | 25.1 | 30.1 | 23.5 | 42.70% | 0.53 | Higher | 5.30%
25.32 | 27.7 | 29.7 | 24.4 | 25.7 | 30.7 | 23.2 | 41.80% | 1.23 | Higher | 12.30%
23.77 | 29.0 | 30.4 | 23.0 | 27 | 31.4 | 21.6 | 33.77% | 1.43 | Higher | 14.30%
24.55 | 28.2 | 30.7 | 23.6 | 26.3 | 31.6 | 22.7 | 45.50% | 0.93 | Higher | 9.30%
23.97 | 25.9 | 30.4 | 22.4 | 24 | 31.3 | 22 | 60.31% | 0.41 | Higher | 4.10%
24.75 | 24.5 | 29.5 | 23.8 | 25.2 | 30.4 | 23.3 | 95.57% | 0.45 | Higher | 4.50%
23.71 | 27.1 | 29.2 | 22.7 | 25.2 | 30.1 | 22.8 | 42.28% | -0.14 | Higher | -1.40%
23.68 | 28.6 | 31.1 | 22.7 | 26.7 | 32 | 22.3 | 46.12% | 0.41 | Higher | 4.10%
24.55 | 30.5 | 32.6 | 23.9 | 28.5 | 33.6 | 21.5 | 41.05% | 2.45 | Higher | 24.50%
24.06 | 34.0 | 34.9 | 23.6 | 32 | 35.9 | 21.1 | 23.20% | 2.54 | Higher | 25.40%
21.87 | 19.3 | 23.5 | 21.2 | 17 | 24.7 | 26.4 | 54.55% | -5.19 | Lower | -51.90%
25.03 | 28.2 | 30.2 | 24.3 | 26.2 | 31.2 | 22.5 | 41.05% | 1.79 | Higher | 17.90%
23.87 | 24.6 | 31.0 | 22.9 | 26.4 | 31.9 | 22.7 | 115.97% | 0.18 | Higher | 1.80%
23.48 | 23.5 | 30.2 | 22.3 | 24.8 | 31.1 | 23 | 106.30% | -0.66 | Lower | -6.60%
25.67 | 33.0 | 33.7 | 25.4 | 31 | 34.7 | 22.2 | 17.12% | 3.16 | Higher | 31.60%
24.06 | 27.9 | 30.5 | 23.1 | 26 | 31.4 | 22.8 | 47.25% | 0.28 | Higher | 2.80%
23.87 | 29.8 | 31.6 | 23.1 | 27.8 | 32.6 | 21.7 | 39.29% | 1.43 | Higher | 14.30%
25.31 | 28.9 | 30.5 | 24.6 | 26.9 | 31.5 | 22.6 | 35.52% | 2.05 | Higher | 20.50%
22.63 | 20.9 | 24.8 | 22.1 | 18.6 | 26 | 25.9 | 52.70% | -3.83 | Lower | -38.30%
25.32 | 27.0 | 29.3 | 24.4 | 25.1 | 30.2 | 23.6 | 43.55% | 0.79 | Higher | 7.90%
23.68 | 28.5 | 30.2 | 22.8 | 26.6 | 31.1 | 22 | 36.09% | 0.78 | Higher | 7.80%
25.6 | 23.6 | 31.1 | 24.3 | 21.7 | 27.5 | 25.8 | 129.50% | -1.52 | Lower | -15.20%
24.58 | 36.6 | 29.0 | 23.5 | 34.7 | 29.9 | 23.6 | 159.00% | -0.06 | Higher | -0.60%
23.48 | 28.1 | 30.7 | 22.5 | 26.2 | 31.6 | 22.5 | 47.55% | -0.02 | Higher | -0.20%
24.84 | 32.0 | 32.7 | 24.1 | 30 | 33.7 | 22.8 | 21.53% | 1.34 | Higher | 13.40%
23.81 | 34.3 | 34.7 | 23.5 | 32.3 | 35.7 | 20.7 | 11.35% | 2.77 | Higher | 27.70%
25.21 | 26.5 | 28.8 | 24.2 | 24.6 | 29.7 | 23.9 | 44.09% | 0.30 | Higher | 3.00%
24.16 | 28.4 | 30.7 | 23.2 | 26.5 | 31.6 | 22.8 | 44.03% | 0.39 | Higher | 3.90%
23.2 | 28.1 | 30.8 | 22.2 | 26.2 | 31.7 | 22.4 | 48.68% | -0.22 | Higher | -2.20%
24.17 | 27.3 | 29.5 | 23.1 | 25.4 | 30.4 | 23.1 | 43.25% | 0.04 | Higher | 0.40%
23.55 | 20.6 | 24.4 | 23.0 | 18.3 | 25.6 | 26.4 | 52.05% | -3.41 | Lower | -34.10%
23.77 | 25.5 | 31.0 | 22.8 | 26.5 | 31.9 | 22.6 | 101.45% | 0.18 | Higher | 1.80%
24.55 | 31.8 | 32.8 | 23.8 | 29.9 | 33.7 | 22.7 | 23.99% | 1.13 | Higher | 11.30%
25.16 | 28.9 | 30.7 | 24.5 | 26.9 | 31.7 | 22.5 | 38.32% | 1.99 | Higher | 19.90%
21.92 | 20.5 | 24.2 | 21.4 | 18.2 | 25.4 | 25.9 | 51.39% | -4.54 | Lower | -45.40%
24.25 | 27.7 | 30.1 | 23.3 | 25.8 | 31 | 23 | 45.23% | 0.27 | Higher | 2.70%
24.25 | 29.7 | 31.5 | 23.5 | 27.7 | 32.5 | 21.9 | 38.96% | 1.64 | Higher | 16.40%
25.02 | 26.5 | 28.8 | 24.0 | 24.6 | 29.7 | 23.8 | 44.20% | 0.20 | Higher | 2.00%
24.12 | 26.7 | 29.1 | 23.1 | 24.8 | 30 | 23.3 | 45.70% | -0.23 | Higher | -2.30%
26.1 | 27.9 | 29.8 | 25.4 | 25.9 | 30.8 | 23.1 | 38.78% | 2.34 | Higher | 23.40%
24.55 | 24.5 | 31.3 | 23.2 | 22.6 | 28.7 | 24.8 | 111.67% | -1.58 | Lower | -15.80%
Average | | | | | | | AVE=54.54% | AVE=0.64 | | AVE=6.40%

In addition, a comparison of the RMSE values of the two models shows that the accuracy of the Compiled Thermal Comfort Model is 25.97% higher than that of the PMV model.
Therefore, compared with the PMV model, the Compiled Thermal Comfort Model has a smaller predicted comfort range and higher accuracy, and can also bring more energy savings.

4.7 Summary

In terms of the accuracy of the personal model, the Individual Thermal Comfort Model ranks above the Compiled Thermal Comfort Model, which ranks above the Baseline Model, which in turn ranks above the PMV Model. However, the model results also show that, because of the limited number of data samples, the results predicted by the Individual Thermal Comfort Model are imbalanced. Therefore, before deciding which individual model will be used as the basis for developing a multi-occupancy model, a simulation test of the group situation is also required.

5. MULTI-OCCUPANCY MODEL RESULT AND ANALYSIS

5.1 Individual Thermal Comfort Model tested in a multi-occupancy setting

5.1.1 Overall Thermal Discomfort (OTD) Evaluation for group

Generally, the thermal comfort of the occupants in a room is not uniform, because each person has a different sensitivity to thermal comfort. In a 2017 study, Chen proposed using the overall thermal discomfort (OTD) index to evaluate the thermal comfort of various living conditions (Chen, Zhong 2017). The OTD index equals the total evaluation score of the occupants under the current environmental condition, as shown in Table 9 and the formula below.

Table 9. Overall Thermal Discomfort (OTD) Evaluation

Thermal comfort condition | Evaluation score
Comfortable | 0
Slightly uncomfortable | 1
Uncomfortable | N+1
Very uncomfortable | (N+1)*2

OTD-max = (N-A) *1 + A * (N+1); N is the number of occupants. A is the number of uncomfortable occupants allowed.

By adding up the scores corresponding to the thermal comfort state of each person in the group, the model obtains the overall thermal discomfort score.
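The scoring rule of Table 9 and the OTD-max formula can be written as a short Python sketch. The function names `otd_score` and `otd_max` and the state abbreviations (N/C, SU, U, VU) are illustrative choices, and the example group of six is hypothetical.

```python
def otd_score(states):
    """Sum the Table 9 scores for a group.

    `states` holds one entry per occupant: "N/C" (neutral or comfortable),
    "SU" (slightly uncomfortable), "U" (uncomfortable), "VU" (very uncomfortable).
    """
    n = len(states)  # N, the number of occupants
    score_of = {"N/C": 0, "SU": 1, "U": n + 1, "VU": 2 * (n + 1)}
    return sum(score_of[state] for state in states)


def otd_max(n, allowed_uncomfortable):
    """OTD-max = (N - A) * 1 + A * (N + 1)."""
    return (n - allowed_uncomfortable) * 1 + allowed_uncomfortable * (n + 1)


# Hypothetical group of six: four slightly uncomfortable, two uncomfortable.
group = ["SU", "SU", "U", "SU", "U", "SU"]
print(otd_score(group))                  # total discomfort score of the group
print(otd_score(group) <= otd_max(6, 1)) # acceptable if only one may be uncomfortable?
```

Comparing `otd_score` against `otd_max` gives a simple acceptance check: the condition is acceptable when the group score does not exceed the limit for the chosen A.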
The higher the score, the more people feel uncomfortable, and the OTD-max formula defines the maximum discomfort (OTD total score) that a group can accept. A is the number of uncomfortable occupants allowed, which can be determined from the Predicted Percentage of Dissatisfied (PPD). For example, if PPD=10% and there are 10 people in a group, only one person is allowed to feel uncomfortable, so A=1. In this condition, OTD-max=20, which means that no one may be very uncomfortable (a score of (10+1) *2=22) and two people may not feel uncomfortable at the same time (also a score of 22). If the PPD is larger, the value of OTD-max is larger; conversely, if the value of OTD-max is smaller, the Predicted Percentage of Dissatisfied is smaller. Therefore, OTD is used to limit and define the overall comfort boundary in a multi-person environment.

5.1.2 Result of Individual Thermal Comfort Model tested in a multi-occupancy setting

To test the individual decision tree model under different environmental conditions, 6 people were randomly selected from the data as a group (test participants ID 4, 5, 10, 12, 14, and 22). Their physiological signals and state data (shown in Table 10) were input into the model, together with all temperatures from 15°C to 35°C in steps of 0.1°C. The Individual Thermal Comfort Model of each of the six test participants was then used to predict the TC index at the different temperatures. Since the project mainly studies the uncomfortable state, the original thermal comfort level of 1-7 is converted to 0-4 here, and the comfortable and neutral states are converted to 0. Table 10.
Group test database for the Individual Thermal Comfort Model (6 test participants selected from 30 test participants)

ID | E-temp | RH | Heart Rate | Stress | S-temp | EDA | Clo | Act
4 | 24.80 | 57.16 | 74.97 | 37.32 | 32.15 | 0.27 | 0.54 | 1.00
5 | 27.96 | 41.13 | 69.00 | 14.00 | 31.72 | 1.91 | 0.31 | 1.00
10 | 24.84 | 60.04 | 76.58 | 25.98 | 30.01 | 20.38 | 0.31 | 2.50
12 | 25.09 | 53.86 | 98.00 | 79.00 | 31.46 | 0.03 | 0.61 | 1.80
14 | 24.65 | 53.96 | 86.00 | 57.00 | 31.77 | 0.02 | 0.54 | 1.00
22 | 25.67 | 23.46 | 78.00 | 38.00 | 31.78 | 0.03 | 0.57 | 1.00

According to the formula OTD-max = (N-A) *1 + A * (N+1), the values of PPD and OTD-max can be calculated as the number of people who feel uncomfortable changes. The results are shown in Table 11.

Table 11. OTD-max value in different A value (OTD-max = (N-A) *1 + A * (N+1); N=6)

Number of uncomfortable occupants allowed | PPD | OTD-max
A=1 | 16% | 12
A=2 | 33% | 18
A=3 | 50% | 24
A=4 | 66% | 30
A=5 | 83% | 36
A=6 | 100% | 42

There are 6 test participants in the group. Table 11 shows the corresponding PPD value and OTD-max value when the number of uncomfortable occupants changes from 1 to 6, which means that if the required PPD is under 16.67%, the OTD value should be equal to or less than 12. Table 12.
Thermal comfort and OTD result of the Individual Thermal Comfort Model tested in a group

Legend: TC=1: Very Uncomfortable (VU); TC=2: Uncomfortable (U); TC=3: Slightly Uncomfortable (SU); TC>=4: Neutral or Comfortable (N/C)

Test Environment Temperature (℃) | ID 4 | ID 5 | ID 10 | ID 12 | ID 14 | ID 22 | OTD | Uncomfortable People
15 | SU | SU | U | SU | U | SU | 18 | 2
15.5 | SU | SU | U | SU | U | SU | 18 | 2
16 | SU | SU | U | SU | U | SU | 18 | 2
16.5 | SU | SU | U | SU | U | SU | 18 | 2
17 | SU | SU | U | SU | U | SU | 18 | 2
17.5 | SU | SU | U | SU | U | SU | 18 | 2
18 | SU | SU | U | SU | U | SU | 18 | 2
18.5 | SU | SU | U | SU | U | SU | 18 | 2
19 | SU | SU | U | SU | U | SU | 18 | 2
19.5 | SU | SU | U | SU | U | SU | 18 | 2
20 | SU | SU | U | SU | U | SU | 18 | 2
20.5 | SU | SU | U | SU | U | SU | 18 | 2
21 | SU | SU | U | SU | U | SU | 18 | 2
21.5 | SU | SU | U | SU | U | SU | 18 | 2
22 | SU | SU | U | SU | U | SU | 18 | 2
22.5 | SU | SU | U | SU | U | SU | 18 | 2
23 | SU | SU | U | SU | U | SU | 18 | 2
23.5 | SU | SU | U | SU | U | SU | 18 | 2
24 | SU | SU | U | SU | U | SU | 18 | 2
24.5 | SU | SU | U | SU | U | SU | 18 | 2
25 | SU | SU | U | SU | VU | SU | 25 | 4
25.5 | N/C | SU | U | SU | U | SU | 17 | 2
26 | N/C | SU | U | SU | U | SU | 17 | 2
26.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
27 | N/C | N/C | SU | SU | U | SU | 10 | 1
27.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
28 | N/C | SU | SU | SU | U | SU | 11 | 1
28.5 | N/C | SU | SU | SU | U | SU | 11 | 1
29 | N/C | N/C | SU | SU | U | SU | 10 | 1
29.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
30 | N/C | N/C | SU | SU | U | SU | 10 | 1
30.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
31 | N/C | N/C | SU | SU | U | SU | 10 | 1
31.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
32 | N/C | N/C | SU | SU | U | SU | 10 | 1
32.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
33 | N/C | N/C | SU | SU | U | SU | 10 | 1
33.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
34 | N/C | N/C | SU | SU | U | SU | 10 | 1
34.5 | N/C | N/C | SU | SU | U | SU | 10 | 1
35 | N/C | N/C | SU | SU | U | SU | 10 | 1

Table 12 converts the thermal comfort numbers of the 6 test participants at temperatures from 15℃ to 35℃, in 0.5℃ intervals, into comfort evaluations: Very Uncomfortable (VU), Uncomfortable (U), Slightly Uncomfortable (SU), and Neutral or Comfortable (N/C). A blue table cell represents a comfortable state. Yellow, orange, and red table cells represent the states of slight discomfort, discomfort, and severe discomfort, so the higher the saturation of a table cell, the more uncomfortable the participant.
The OTD value of the group at each temperature is calculated with the formula, and the corresponding number of people who feel uncomfortable is marked.

Figure 24. Group OTD value of Individual Thermal Comfort Model based on 6 subjects

Figure 24 shows the group OTD values of the 6 test participants at the different environmental temperatures, and the right side of the chart marks the number of people in the group who feel uncomfortable at each OTD value. The smaller the value of OTD, the fewer people feel uncomfortable. The table and the figure both illustrate that applying the Individual Thermal Comfort Model to the group test produces illogical results. For example, the OTD values are constant between 15°C and 24.5°C and again between 29°C and 35°C, although the number of uncomfortable people should increase when the environmental temperature decreases or increases greatly. This is possibly the result of insufficient data samples and data imbalance in the Individual Thermal Comfort Model. Because decision trees are very sensitive to the data they are trained on, small changes in the training set may result in significantly different tree structures. Test participant ID 14 in Table 10 is a subject who is extremely sensitive to temperature and is in an uncomfortable or very uncomfortable state regardless of the temperature. This is likely a consequence of the collected data not being comprehensive enough, but it has a great impact on the decision tree algorithm of the Individual Thermal Comfort Model and also on the value of OTD. For example, at 25°C the number of uncomfortable people reaches its maximum. Therefore, it is unreasonable to use the Individual Thermal Comfort Model to develop a multi-person occupancy prediction model.
However, random forests can instead take advantage of this sensitivity of decision trees: each tree is trained on a sample drawn at random, with replacement, from the data set, which generates a large number of different trees. Each decision tree receives randomly selected variables and predicts a result, and all the predicted results are stored. The prediction with the highest number of votes is then used as the final prediction of the random forest algorithm.

5.2 Compiled Thermal Comfort Model tested in a multi-occupancy setting

5.2.1 Automatic evaluation mechanism

The Overall Thermal Discomfort (OTD) evaluation can obtain acceptable boundary conditions based on the PPD (Predicted Percentage of Dissatisfied) that needs to be met. However, extreme situations may arise in a real environment: one person is extremely sensitive to temperature, and the environment must give priority to that person because of their special status, such as children in kindergartens, elderly people in hospitals, or a manager in an office. In this situation, it is inappropriate to choose an OTD solution that simply makes more people feel comfortable. An automatic evaluation mechanism was therefore established, which consists of two evaluation methods. The first is the OTD evaluation: the machine checks all candidate temperature values and takes the average of the temperatures that meet the Overall Thermal Discomfort requirement. The second is the Sensitive First evaluation: the width of each individual's comfort zone is recorded as △T, and each person's sensitivity weight is recorded as 1/(△T)^x. Each person's comfortable temperature is then multiplied by this weight, and the weighted sum is divided by the sum of the weights to obtain the final temperature, as in the following formula.
$$\text{OutputTemp} = \frac{\sum_{i} \text{PersonalTemp}_i \cdot \dfrac{1}{(\Delta T_i)^{x}}}{\sum_{i} \dfrac{1}{(\Delta T_i)^{x}}}$$

Here △T is the thermal comfort range, and x is the adjustable weight of sensitive people (generally 1; it can be set to a value greater than 1 in special environments where the weight of sensitive people is high). Depending on the situation, users can select the OTD mechanism or the Sensitive First mechanism directly. Alternatively, the automatic evaluation mechanism, which combines the OTD mechanism and the Sensitive First mechanism, uses the parameters "sensitive People Percentage Limit" and "sensitive Range" to decide which temperature evaluation mechanism is better suited, as shown in Figure 25. The OTD value of the group can also be calculated from the predicted temperature.

Figure 25. The temperature automatic decision model

5.2.2 Result of Compiled Thermal Comfort Model tested in a multi-occupancy setting

Table 13. Group test database for the Compiled Thermal Comfort Model (6 test participants selected from 30 test participants)

| ID | Gender | BMI | Age | E-temp | RH | Heart Rate | Stress | S-temp | EDA | Clo | Act |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | -1 | 21.71 | 24 | 24.80 | 57.16 | 74.97 | 37.32 | 32.15 | 0.27 | 0.54 | 1.00 |
| 5 | -1 | 23.15 | 33 | 27.96 | 41.13 | 69.00 | 14.00 | 31.72 | 1.91 | 0.31 | 1.00 |
| 10 | 1 | 18.82 | 23 | 24.84 | 60.04 | 76.58 | 25.98 | 30.01 | 20.38 | 0.31 | 2.50 |
| 12 | 1 | 21.34 | 24 | 25.09 | 53.86 | 98.00 | 79.00 | 31.46 | 0.03 | 0.61 | 1.80 |
| 14 | -1 | 21.71 | 24 | 24.65 | 53.96 | 86.00 | 57.00 | 31.77 | 0.02 | 0.54 | 1.00 |
| 22 | 1 | 21.34 | 24 | 25.67 | 23.46 | 78.00 | 38.00 | 31.78 | 0.03 | 0.57 | 1.00 |

To compare with the Individual Thermal Comfort Model, the data of the same 6 test participants as before (ID 4, 5, 10, 12, 14, and 22 in Table 13) were used for group testing. Unlike the ITC Model, the Compiled Thermal Comfort Model input data includes the test participant's background information, such as gender, BMI, and age. The predicted comfort range of each person is shown in Figure 26. The widest comfort range is 24.35℃ to 28.76℃, and the narrowest comfort range is 25.35℃ to 25.8℃.

Figure 26.
Comfort range result of the Compiled Thermal Comfort Model of 6 test participants

The OTD scores of these 6 individuals at different temperatures, together with the corresponding number of uncomfortable people, are shown in Figure 27. The OTD curve has a concave (U-shaped) trend, because temperatures of 25-26℃ in the middle satisfy everyone's thermal comfort needs: there the OTD value is smallest and no one in the group feels uncomfortable. As the temperature continues to rise or fall, it gradually passes the boundary of some people's comfort range, so the OTD value rises gradually on both sides. When the OTD exceeds 42, all test participants in the group feel uncomfortable.

Figure 27. Group OTD value of Compiled Thermal Comfort Model based on 6 subjects

The comfort zone and OTD values predicted by the Compiled Thermal Comfort Model are more reasonable than those of the Individual Thermal Comfort Model. Table 14 shows the comfort zone range and range difference under different PPD limits for the group simulation results of the Individual Thermal Comfort Model and the Compiled Thermal Comfort Model. The comfort zone changes predicted by the Individual Thermal Comfort Model are irregular and unreasonable, while those predicted by the Compiled Thermal Comfort Model are regular and approximately normally distributed as the PPD ratio increases.

Table 14.
Comfort Zone under different PPD of ITC Model and ICZ Model (* △ means range difference)

| Multi-occupancy simulation | Individual Thermal Comfort Model simulates in group | Compiled Thermal Comfort Model simulates in group |
|---|---|---|
| Database | 6 test participants (from 30) | 6 test participants (from 30) |
| Algorithm | Decision Tree | Random Forest |
| Evaluation | OTD | Automatic (OTD + Sensitive First) |
| PPD<16%, Comfort Zone | 26.5~35℃ (△=8.5℃) | 25.3~25.9℃ (△=0.6℃) |
| PPD<33%, Comfort Zone | 15~24.5℃; 25.5~26℃ (△=11℃) | 25~25.3℃; 25.9~26.2℃ (△=0.6℃) |
| PPD<50%, Comfort Zone | / | 24.8~25℃; 26.2~26.3℃ (△=0.3℃) |
| PPD<66%, Comfort Zone | 25℃ (△=0℃) | 24.6~24.8℃; 26.3~26.6℃ (△=0.5℃) |
| PPD<83%, Comfort Zone | / | 24.5~24.6℃; 26.6~27℃ (△=0.5℃) |
| PPD<100%, Comfort Zone | / | 15~24.6℃; 26.6~35℃ (△=18℃) |

5.3 Compiled Thermal Comfort Model tested in a multi-occupancy setting for real-time controls

5.3.1 Database information of multi-occupancy experiment

Six new test participants took part in a multi-occupancy human thermal comfort experiment. Data were collected from the group of six people for 2 hours in a small experimental room, with a recording interval of 10 minutes. The total number of experimental records was 78. The database information is shown in the following Table 15.

Table 15.
Database information of 6 new test participants in multi-occupancy

| Attributes | Type | Missing | Statistics |
|---|---|---|---|
| ID | - | 0 | 6 individuals: 78 records |
| Age | Integer | 0 | Min: 24, Max: 29, Average: 26.33 |
| Gender | Polynomial | 0 | Male: 4 (52 records), Female: 2 (26 records) |
| BMI | Real | 0 | Min: 18.5, Max: 28.7, Average: 21.8 |
| Environmental Temperature | Real | 0 | Min: 24.72℃, Max: 33.3℃, Average: 29.98℃ |
| Relative Humidity | Real | 0 | Min: 15.00%, Max: 45.92%, Average: 30.91% |
| Heart Rate | Real | 3 | Min: 56, Max: 147, Average: 98.22 |
| Stress Level | Real | 19 | Min: 2, Max: 90, Average: 54.89 |
| Skin Temperature | Real | 3 | Min: 30.00℃, Max: 34.12℃, Average: 32.16℃ |
| EDA | Real | 0 | Min: 0.18, Max: 19.21, Average: 6.35 |
| Clothing Level | Polynomial (ASHRAE-55) | 0 | Min: 0.61, Max: 1, Average: 0.78 |
| Activity Level | Polynomial (ASHRAE-55) | 0 | Min: 1, Max: 4, Average: 1.87 |
| Thermal Comfort Level | Polynomial | 0 | Min: 1, Max: 6, Average: 3.50 |
| Thermal Sensation Level | Polynomial | 0 | Min: 1, Max: 7, Average: 2.30 |

5.3.2 Input and output

In the multi-occupancy RF model, the input data includes environmental temperature, relative humidity, heart rate, stress level, skin temperature, EDA, clothing level, and activity level. Unlike the single-occupancy model, the input data does not include thermal comfort level and thermal sensation level, which are instead used to evaluate the accuracy of the model's results. The output of the RF model is the predicted indoor environmental temperature that satisfies all test participants.

5.3.3 Comparison between temperature setpoint prediction result and test participant's feedback

Figure 28 shows the trend of the real-time environment temperature (red line) and the temperatures predicted (other lines) by the Compiled Thermal Comfort Model tested in a multi-occupancy setting. The blue, orange, and green lines are the "Sensitive first evaluation model", the "OTD evaluation model", and the "Automatic evaluation model" respectively. After 14:10, the predicted trajectory of the automatically selected model coincides with that of the OTD model.
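The Sensitive First evaluation compared in Figure 28 is defined in Section 5.2.1 as a weighted average in which each occupant's comfortable temperature is weighted by 1/(△T)^x. A minimal sketch of that calculation is given below. It assumes that a person's comfortable temperature is the midpoint of their comfort range, which the text does not specify; the function name and the sample ranges are hypothetical.

```python
def sensitive_first_setpoint(comfort_ranges, x=1.0):
    """Weighted-average setpoint with weight 1 / (delta_T ** x) per occupant.

    comfort_ranges: list of (low, high) comfort bounds in degrees Celsius.
    x: weight exponent for sensitive people (1 by default, as in Section 5.2.1).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for low, high in comfort_ranges:
        delta_t = high - low
        if delta_t <= 0:
            raise ValueError("each comfort range must have positive width")
        # Assumption: the personal comfortable temperature is the range midpoint.
        personal_temp = (low + high) / 2.0
        weight = 1.0 / (delta_t ** x)
        weighted_sum += personal_temp * weight
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical group: one tolerant occupant and one temperature-sensitive occupant.
ranges = [(24.0, 28.0), (25.0, 26.0)]
print(round(sensitive_first_setpoint(ranges, x=1.0), 2))
print(round(sensitive_first_setpoint(ranges, x=2.0), 2))
```

A narrower comfort range gives a larger weight, so the setpoint moves toward the sensitive occupant's preferred temperature, and raising x strengthens that pull.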
Figure 29 shows the Thermal Sensation (TS) feedback of the experimenters in the multi-person environment. The values 1-7 represent the 7 levels of sensation from Cold to Hot. Generally, most people feel comfortable when TS is in the range of 3 to 5. In addition, the yellow dotted line is the data of a test participant who is extremely sensitive to temperature and prefers a warm environment, which can be used to verify the accuracy of the "Sensitive first" model.

Figure 28. Prediction temperature setpoint for group by Compiled Thermal Comfort Model tested in a multi-occupancy setting compared with real-time environment temperature

Figure 29. Test participant thermal sensation feedback (vertical axis: thermal sensation from 1 to 7; horizontal axis: time from 13:20 to 15:20 in 10-minute steps; series: Tester 1, Tester 2, Tester 3, Tester 4 (temperature sensitive), Tester 5, Tester 6)

a. Satisfaction of overall expectations

Comparing the above two graphs, the main trend of the test participants' feedback is roughly the same as the trend of the current environment temperature. When most experimenters reported a TS of 6-7, meaning they felt hot, the predicted environment temperature is lower than the real environment temperature; conversely, when most experimenters reported a TS of 1-3, the predicted environment temperature is higher than the real environment temperature. Overall, the predicted temperature is in line with people's expectations for the environment temperature.

b. Satisfaction of temperature-sensitive people's expectations

The yellow dotted line in the TS line chart is the designated temperature-sensitive experimenter, who is more likely than the others to feel cold. For example, at 14:00 and 15:00, she gave a lower thermal sensation rating than all other experimenters.
Comparing the predicted temperature lines of the Sensitive First model and the OTD model at these two moments shows that the Sensitive First prediction is higher than the OTD prediction at 14:00 and 15:00, which indicates that the Sensitive First model gives more weight to sensitive people and meets their need for higher temperatures. The line of the Sensitive First model is higher than those of the other models most of the time, but at 14:20 most people, including the sensitive participant, gave a hotter evaluation. Therefore, the Sensitive First model predicts a lower expected temperature than the OTD model at that moment. In addition, there are some conditions in which the real-time temperature lies between the predicted temperatures of the two models. For example, at 14:40, most of the experimenters rated the environment as overheated and wanted to lower the temperature, but the sensitive participant gave a Cool evaluation. Accordingly, the Sensitive First model predicts an expected temperature that is higher than the real temperature. The differences among the predictions of the 3 models, and between each prediction and the real temperature, therefore support the accuracy of the Sensitive First model.

c. Decrease in the rate of change during the environmental transition period

During the first hour of the experiment, all experimenters stayed in the room; in the second half, different experimenters were asked to leave and re-enter the experiment room at different times. When an experimenter returned to the room from the external environment, he or she felt uncomfortable because of the large temperature difference between inside and outside the room. The model's response to this situation can be seen in the increase of the predicted temperature.
The difference between the predicted temperature and the ambient temperature is significantly greater in the second hour than in the first hour, because the model, after analyzing the clearly abnormal physiological information of the returning experimenter, sets a predicted value that is very different from the current temperature, which can bring the temperature back to the comfort zone as soon as possible.

5.3.4 Performance comparison between Compiled Thermal Comfort Model tested in a multi-occupancy setting and PMV Model

Figure 30. Six test participants' comfort range at 14:30 predicted by Compiled Thermal Comfort Model

Figure 30 shows the comfort zone predicted by the Compiled Thermal Comfort Model based on the six test participants' data at 14:30. The overlapping range of the six people is 24-26℃, which is consistent with the environmental temperature of 25.4℃ predicted at 14:30 by the Compiled Thermal Comfort Model tested in a multi-occupancy setting. As shown in Figure 31 below, the 14:30 environmental information and the clothing levels and activity coefficients of the six test participants were input into the traditional PMV model to predict their comfort range. The results show that the comfort range predicted by PMV to satisfy everyone is around 20-27℃, and the average temperature predicted by PMV is 22.7℃.

Figure 31. CBE Thermal Comfort Tool results for six test participants at 14:30

Table 16 compares the prediction results of the Compiled Thermal Comfort Model tested in a multi-occupancy setting under real-time control with those of the PMV model over the whole 2 hours of the multi-occupancy experiment. Relative to the real-time environment temperature, the dispersion ("discrete value") of the difference between the set temperature and the real temperature is 3.13 for the RF model and 4.23 for the PMV model.
This demonstrates that the model sets the temperature setpoint range closer to the ambient temperature and can achieve a higher satisfaction rate. The set temperatures of the two models differ by 0.26 on average. The orange cells (Group TS Average) and grey cells (Group TC Average) mark the records where TC≤3.5 (uncomfortable thermal comfort), for which the Thermal Sensation ranges from 5.2 to 6.7 (hot to very hot). Comparing the set temperature predicted by the Compiled Thermal Comfort Model tested in a multi-occupancy setting with the set temperature predicted by the PMV model shows that, when the average thermal comfort level is uncomfortable, the set temperature predicted by the Compiled Thermal Comfort Model is lower than that of the PMV Model, which better satisfies people's desire to get rid of the hot sensation. As for energy saving, the multi-occupancy experiment used both cooling and heating systems, so energy saving cannot be discussed simply in terms of temperature increase or decrease.

Table 16. Table of comparison between Compiled Thermal Comfort Model (CTCM) tested in a multi-occupancy setting and PMV model

In addition, comparing the RMSE values of the two models shows that the accuracy of the Compiled Thermal Comfort Model tested in a multi-occupancy setting for real-time control is 40.32% higher than that of the PMV model. In summary, the Compiled Thermal Comfort Model tested in a multi-occupancy setting also has a smaller prediction range and higher accuracy than the PMV model.

5.3.5 Accuracy performance of Compiled Thermal Comfort Model tested in a multi-occupancy setting for real-time control

The Compiled Thermal Comfort Model tested in a multi-occupancy setting for real-time control was tested in a real environment occupied by 6 people. The test result has an RMSE of 0.925 and an R-square of 0.688.
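For readers who want to reproduce metrics of this kind, the sketch below uses the standard definitions of RMSE and R-square on hypothetical temperature data. The helper `relative_improvement` reflects an inference rather than a formula stated in the text: the reported 40.32% and 43.89% figures are consistent with taking the ratio of the comparison model's RMSE (1.298 and 1.331 in Table 17) to the real-time model's RMSE (0.925).

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def relative_improvement(reference_rmse, model_rmse):
    """Percentage by which reference_rmse exceeds model_rmse (assumed convention)."""
    return (reference_rmse / model_rmse - 1.0) * 100.0


# Hypothetical observed and predicted temperatures in degrees Celsius.
actual = [24.0, 25.0, 26.0, 27.0]
predicted = [24.5, 25.0, 25.5, 27.0]
print(round(rmse(actual, predicted), 3), round(r_squared(actual, predicted), 3))

# RMSE values taken from Table 17.
print(round(relative_improvement(1.298, 0.925), 2))  # versus the PMV multi model
print(round(relative_improvement(1.331, 0.925), 2))  # versus the group simulation
```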
The accuracy of the model is 43.89% higher than that of the previous Compiled Thermal Comfort Model simulated in the group condition and 40.32% higher than that of the PMV model. The R-square value of the model is 5.38% lower than that of the Compiled Thermal Comfort Model. In summary, the accuracy of the multi-occupancy model is higher, but the degree to which the independent variables explain the variation of the dependent variable is reduced. The increase in accuracy may be because the new decision-making mechanism introduces two constraints, OTD and PPD. However, the data of the model come from a two-hour test, and the total available test data is only 78 rows, which may be the main reason for the model's reduced explanatory power.

Table 17. Accuracy performance (RMSE) of Single-occupancy Model and Multi-occupancy Model (The orange grid is the main body of comparison, the yellow grid is the comparison object, and the gray grid is the comparison percentage of the two ends of the grid)

| | Baseline Model | Individual Thermal Comfort Model | Compiled Thermal Comfort Model | PMV Single Model | Compiled Thermal Comfort Model (test in group with self-defined temperature) | Compiled Thermal Comfort Model tested under real-time controls | PMV Multi Model |
|---|---|---|---|---|---|---|---|
| Condition | Single-occupancy | Single-occupancy | Single-occupancy | Single-occupancy | Multi-occupancy simulation | Real multi-occupancy | Real multi-occupancy |
| Database | 30 test participants | 30 test participants | 30 test participants | 30 test participants | 6 test participants (selected from 30) | New 6 test participants | New 6 test participants |
| Algorithm | Linear Regression | Decision Tree | Random Forest | / | Random Forest | Random Forest | / |
| RMSE | 1.879 | 0.306 | 1.159 | 1.460 | 1.331 | 0.925 | 1.298 |
| R-square | 0.275 | / | 0.725 | / | 0.537 | 0.688 | / |

Accuracy Comparison (RMSE), gray cells in source order: +62.26%, +43.89%, -73.59%, +40.32%, +25.97%

Interpretation Ability Comparison (R-square): compare to baseline model: +60.03%; compare to individual model: -5.38%; compare to simulation test: +21.94%

Table 17 shows the RMSE (accuracy
performance) of the Individual Thermal Comfort Model and the Compiled Thermal Comfort Model tested in single-occupancy and multi-occupancy settings, the PMV Model, and the Baseline Model. Although the Individual Thermal Comfort Model has the smallest RMSE, it does not perform well when tested in a group, which means the individual model is overfitted. The Compiled Thermal Comfort Model, however, performs stably in both model accuracy and model explanation.

5.3 Summary

In this chapter, the Individual Thermal Comfort Model and the Compiled Thermal Comfort Model were each used to conduct simulation tests on 6 out of 30 people. It was found that the Individual Thermal Comfort Model had extremely low accuracy in the group simulation test due to the imbalance of data. Therefore, based on the Compiled Thermal Comfort Model, a temperature decision mechanism was established that includes two types of assessment: OTD and Sensitive First. This mechanism can predict the best temperature setpoint for a multi-occupancy experiment under simulation conditions. The accuracy performance of the Compiled Thermal Comfort Model tested in a multi-occupancy setting under real conditions is illustrated by the interpretation ability and the satisfaction analysis of the TC table prediction. In addition, the chapter compares the model with the PMV model and summarizes the accuracy of all models.

6. CONCLUSION

6.1 Data-driven approach performance

In this study, single-person and multi-person test conditions were considered. The decision tree algorithm was used to establish the Individual Thermal Comfort Model, the random forest algorithm was used to establish the Compiled Thermal Comfort Model, and different experimental data were used to simulate and verify the performance of the models.

6.1.1 Single occupancy condition

a.
Individual Thermal Comfort Model

Using the thermal preference data of 30 subjects to develop the Individual Thermal Comfort Model, a decision tree model was generated for each subject to predict that subject's Thermal Comfort Level. The data were also analyzed by gender classification and feature selection. Two improvement methods were used to improve the SOC-ANN model: adding more training data and simplifying the output options. In single-occupancy conditions, the average accuracy of the Individual Thermal Comfort Model is 69.46%, and the RMSE is 0.306. However, verification with the training set and the test set showed that the single-person model suffers from overfitting and data imbalance.

b. Compiled Thermal Comfort Model

The thermal preference data of 30 subjects were used to develop the Compiled Thermal Comfort Model with the random forest algorithm, which can predict the upper and lower boundaries of the comfort range of previous or any new subjects. The RMSE of the model is 1.159; in terms of accuracy, this is 62.26% higher than the Baseline Model (linear regression model) and 73.59% lower than the Individual Thermal Comfort Model. The R-square of the model is 0.725, an increase of 163% compared to the Baseline Model. In addition, the accuracy of the Compiled Thermal Comfort Model is 25.97% higher than that of the PMV model, and the predicted set temperature can save 6.4% energy on average compared to the PMV model.

6.1.2 Multi-occupancy simulation

The Individual Thermal Comfort Model and the Compiled Thermal Comfort Model each used the data of 6 individuals from the 30 subjects to simulate and predict the multi-occupancy environment, and the OTD index was used to evaluate the overall thermal comfort satisfaction of the area. The 6 people were given different ambient temperatures from 15°C to 35°C.
The temperature setpoint predicted by the Compiled Thermal Comfort Model can achieve an OTD of 0, and the OTD value gradually increases as the temperature decreases or rises, while the OTD index predicted by the Individual Thermal Comfort Model shows that, under multiple temperature conditions, the thermal comfort needs of the 6 people cannot be met.

6.1.3 Multi-occupancy condition

Based on the Compiled Thermal Comfort Model, an automatic temperature setpoint decision-making mechanism was adopted, which includes the OTD mode and the sensitive-population priority mode, and a test in a real multi-occupancy condition was performed on 6 new subjects to verify the performance of the model. The analysis of the predicted temperature setpoint curve and the subjects' TS curves shows that the predicted results meet the overall thermal comfort requirements, meet the thermal comfort expectations of temperature-sensitive people, and reduce the duration of the environmental transition period. The RMSE of the model is 0.925; its accuracy is 43.89% higher than in the simulation test and 40.32% higher than that of the PMV model. The R-square of the model is 0.688; its explanatory power is 21.94% higher than in the simulation test but 5.38% lower than in the single-occupancy test.

6.2 Limitations

6.2.1 Limitation of the number of subjects

A total of 30 subjects participated in the single-occupancy experiment, and 6 subjects participated in the multi-occupancy experiment. First, the types of subjects sampled are limited. Since the subjects are mainly students from the USC School of Architecture, their age and BMI fall within a narrow range, and the subjects' nationalities are mainly Chinese.
The limited diversity of the subject sample makes it impossible to conduct a more in-depth comparative analysis of model results for subjects with large differences in background information, an analysis that might improve the accuracy and universality of the model. Secondly, although the total number of subjects in the single-occupancy experiment is 30, only 70-90 records are available for each individual subject. This had a great influence on the Individual Thermal Comfort Model, which adopts the decision tree algorithm: the imbalanced data resulted in reduced accuracy or overfitting of the individual models. For example, the accuracy of some subjects' individual models was about 70% during training but only about 55% during testing. The overfitted Individual Thermal Comfort Model shows high accuracy when used for a single subject, but when the model is used to simulate and predict a multi-person environment, the accuracy of the result decreases greatly. Finally, the total number of subjects in the multi-occupancy experiment is only 6, because the impact of COVID-19 limited the number of people who could occupy a single classroom. Although the results of the multi-occupancy experiment are good and meet the subjects' thermal requirements, the results may differ if the number of occupants of the room increases. Therefore, whether the results of the Compiled Thermal Comfort Model tested in a multi-occupancy setting can meet the expectations of more people still needs to be tested under different occupancy conditions, such as 10, 20, or 30 people, to verify the accuracy of the temperature setpoint decision-making mechanism.

6.2.2 Limitation of multi-occupancy experiment condition

For the single-occupancy experiment, the experiment condition has the problem of inaccurate test data.
Because the subjects needed to wear the smart watch and carry the HOBO for up to a week, the smart watch readings fluctuated irregularly when the watch was being put on or was worn loosely, and the HOBO may not have been kept near the subject at all times. Therefore, the values collected in the experiment may contain some error, which may affect the accuracy of the model when the number of subjects is small. For the multi-occupancy experiment, the temperature of the experiment classroom was regulated by two chillers and two heaters rather than by central air-conditioning, which caused uneven temperatures in the room and different thermal sensations among subjects under the same temperature conditions recorded by the HOBO. Consequently, the results and accuracy predicted by the Compiled Thermal Comfort Model tested in a multi-occupancy setting may be biased, and more subjects and a more controllable experiment environment are needed for verification.

6.3 Future work

6.3.1 Model and experiment improvement

For the model, the Individual Thermal Comfort Model can reduce overfitting and imbalance by adding more data samples; the Compiled Thermal Comfort Model can increase its universality by increasing the diversity of data samples, such as adding subjects of different nationalities and ages. Follow-up studies can also introduce different machine learning algorithms to compare the results and verify whether random forest is the best algorithm. For the experiment, if evenly arranged vents and a central air-conditioning system are used in the test environment, conditions will be more stable and the test data more accurate. The accuracy of the Compiled Thermal Comfort Model can also be tested by conducting multi-occupancy tests with different numbers of subjects.
In future work, a real case study can also be carried out by selecting an entire community or school building for simulation and analyzing the energy consumption of the building over one month.

6.3.2 Consideration of potential problems in the development of smart communities

The Compiled Thermal Comfort Model uses an automatic decision-making mechanism that considers both OTD and thermally sensitive people. If this mechanism, which uses machine learning algorithms to predict the thermal comfort requirements of a single person or a group, is fully applied to the development of smart communities in future work, will keeping people continuously in a comfortable temperature state make them more sensitive to temperature? Therefore, if the experiment is to be extended to the scale of a community or even a city, corresponding simulation and comparison experiments are needed to verify whether long-term exposure to a comfortable temperature has physical or health effects on the human body. In addition, the Compiled Thermal Comfort Model needs to collect background information of subjects. If the subject pool is expanded to the size of a community, the volume of these data will be huge. Therefore, personal data privacy must be effectively protected when establishing community-scale subject data collection.

6.3 Conclusion

Under the conditions of this research, compared with the Baseline Model using the linear regression algorithm and the Individual Thermal Comfort Model using the decision tree algorithm, the Compiled Thermal Comfort Model generated by the random forest algorithm has better thermal comfort performance under both single-occupancy and multi-occupancy conditions. The Compiled Thermal Comfort Model also performs better than the PMV model under both conditions: it not only meets the thermal comfort needs of more subjects but also saves more energy when the predicted temperature setpoint is used in single-person experiments.
REFERENCES

Arens, Edward, and Zhang Hui. 2006. “The Skin’s Role in Human Thermoregulation and Comfort.” In Thermal and Moisture Transport in Fibrous Materials, 560–602. Elsevier Ltd. https://doi.org/10.1533/9781845692261.3.560.

Celebi, M. Emre, and Kemal Aydin. 2016. Unsupervised Learning Algorithms. Springer International Publishing. https://doi.org/10.1007/978-3-319-24211-8.

Chang, Tom Y., and Agne Kajackaite. 2019. “Battle for the Thermostat: Gender and the Effect of Temperature on Cognitive Performance.” Edited by Valerio Capraro. PLOS ONE 14 (5): e0216362. https://doi.org/10.1371/journal.pone.0216362.

Chen, Zhong. 2017. “Data-Driven Approach for User-Centered Environmental Control.”

Choi, Joon Ho, and Dongwoo Yeom. 2017. “Study of Data-Driven Thermal Sensation Prediction Model as a Function of Local Body Skin Temperatures in a Built Environment.” Building and Environment 121 (August): 130–47. https://doi.org/10.1016/j.buildenv.2017.05.004.

Cosma, Andrei Claudiu. 2011. “Real-Time Individual Thermal Preferences Prediction Using Visual Sensors.” Romania M.S. in Computer Vision and Artificial Intelligence.

Cosma, Andrei Claudiu, and Rahul Simha. 2019. “Machine Learning Method for Real-Time Non-Invasive Prediction of Individual Thermal Preference in Transient Conditions.” Building and Environment 148 (January): 372–83. https://doi.org/10.1016/j.buildenv.2018.11.017.

d’Ambrosio Alfano, Francesca Romana, Boris Igor Palella, and Giuseppe Riccio. 2011. “The Role of Measurement Accuracy on the Thermal Environment Assessment by Means of PMV Index.” Building and Environment 46 (7): 1361–69. https://doi.org/10.1016/j.buildenv.2011.01.001.

D’Oca, Simona, Chien Fei Chen, Tianzhen Hong, and Zsofia Belafi. 2017. “Synthesizing Building Physics with Social Psychology: An Interdisciplinary Framework for Context and Occupant Behavior in Office Buildings.” Energy Research and Social Science 34 (December): 240–51. https://doi.org/10.1016/j.erss.2017.08.002.

Daum, David, Frédéric Haldi, and Nicolas Morel. 2011. “A Personalized Measure of Thermal Comfort for Building Controls.” Building and Environment 46 (1): 3–11. https://doi.org/10.1016/j.buildenv.2010.06.011.

Davoodi, Farzin, Hasan Hasanzadeh, Seyed Alireza Zolfaghari, and Mehdi Maerefat. 2017. “Developing a New Individualized 3-Node Model for Evaluating the Effects of Personal Factors on Thermal Sensation.” Journal of Thermal Biology 69 (October): 1–12. https://doi.org/10.1016/j.jtherbio.2017.05.004.

Du, Chenqiu, Baizhan Li, Yong Cheng, Chao Li, Hong Liu, and Runming Yao. 2018. “Influence of Human Thermal Adaptation and Its Development on Human Thermal Responses to Warm Environments.” Building and Environment 139 (July): 134–45. https://doi.org/10.1016/j.buildenv.2018.05.025.

Du, Chenqiu, Baizhan Li, Hong Liu, Yu Ji, Runming Yao, and Wei Yu. 2019. “Quantification of Personal Thermal Comfort with Localized Airflow System Based on Sensitivity Analysis and Classification Tree Model.” Energy and Buildings 194 (July): 1–11. https://doi.org/10.1016/j.enbuild.2019.04.010.

Farag, Wael A. 2017. “ClimaCon: An Autonomous Energy Efficient Climate Control Solution for Smart Buildings.” Asian Journal of Control 19 (4): 1375–91. https://doi.org/10.1002/asjc.1426.

Frumosu, Flavia D., and Murat Kulahci. 2018. “Big Data Analytics Using Semi-Supervised Learning Methods.” Quality and Reliability Engineering International 34 (7): 1413–23. https://doi.org/10.1002/qre.2338.

Gagnon, Daniel, and Glen P. Kenny. 2012. “Does Sex Have an Independent Effect on Thermoeffector Responses during Exercise in the Heat?” The Journal of Physiology 590 (23): 5963–73. https://doi.org/10.1113/jphysiol.2012.240739.

Gilani, Syed Ihtsham Ul Haq, Muhammad Hammad Khan, and William Pao. 2015. “Thermal Comfort Analysis of PMV Model Prediction in Air Conditioned and Naturally Ventilated Buildings.” In Energy Procedia, 75:1373–79. Elsevier Ltd. https://doi.org/10.1016/j.egypro.2015.07.218.

Goto, T., J. Toftum, R. De Dear, and P. O. Fanger. 2006. “Thermal Sensation and Thermophysiological Responses to Metabolic Step-Changes.” International Journal of Biometeorology 50 (5): 323–32. https://doi.org/10.1007/s00484-005-0016-5.

Grömping, Ulrike. 2009. “Variable Importance Assessment in Regression: Linear Regression versus Random Forest.” American Statistician 63 (4): 308–19. https://doi.org/10.1198/tast.2009.08199.

Hoof, J. Van, and J.L.M. Hensen. 2006. “Thermal Comfort and Older Adults.” Gerontechnology 4 (4). https://doi.org/10.4017/gt.2006.04.04.006.00.

horr, Yousef Al, Mohammed Arif, Martha Katafygiotou, Ahmed Mazroei, Amit Kaushik, and Esam Elsarrag. 2016. “Impact of Indoor Environmental Quality on Occupant Well-Being and Comfort: A Review of the Literature.” International Journal of Sustainable Built Environment. Elsevier B.V. https://doi.org/10.1016/j.ijsbe.2016.03.006.

Hoyt, Tyler, Edward Arens, and Hui Zhang. 2015. “Extending Air Temperature Setpoints: Simulated Energy Savings and Design Considerations for New and Retrofit Buildings.” Building and Environment 88 (June): 89–96. https://doi.org/10.1016/j.buildenv.2014.09.010.

Huizenga, C, S Abbaszadeh, L Zagreus, and E Arens. 2006. “Air Quality and Thermal Comfort in Office Buildings: Results of a Large Indoor Environmental Quality Survey.” Proceeding of Healthy Buildings 2006. Vol. 3. http://cbe.berkeley.edu.

Jazizadeh, Farrokh, Ali Ghahramani, Burcin Becerik-Gerber, Tatiana Kichkaylo, and Michael Orosz. 2014. “Human-Building Interaction Framework for Personalized Thermal Comfort-Driven Systems in Office Buildings.” Journal of Computing in Civil Engineering 28 (1): 2–16. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000300.

Jazizadeh, Farrokh, and Wooyoung Jung. 2018. “Personalized Thermal Comfort Inference Using RGB Video Images for Distributed HVAC Control.” Applied Energy 220 (June): 829–41. https://doi.org/10.1016/j.apenergy.2018.02.049.

Kamaruzzaman, S. N., Noor Ashiqin, E. M. Ahmad Zawawi, and Mike Riley. 2016. “Critical Aspects of the Inclusive Environmental for the Well-Being of Building Occupant-A Review.” MATEC Web of Conferences 66. https://doi.org/10.1051/matecconf/20166600114.

Karjalainen, Sami. 2007. “Gender Differences in Thermal Comfort and Use of Thermostats in Everyday Thermal Environments.” Building and Environment 42 (4): 1594–1603. https://doi.org/10.1016/j.buildenv.2006.01.009.

Katić, Katarina, Rongling Li, and Wim Zeiler. 2020. “Machine Learning Algorithms Applied to a Prediction of Personal Overall Thermal Comfort Using Skin Temperatures and Occupants’ Heating Behavior.” Applied Ergonomics 85 (May): 103078. https://doi.org/10.1016/j.apergo.2020.103078.

Kim, Joyce, Yuxun Zhou, Stefano Schiavon, Paul Raftery, and Gail Brager. 2018. “Personal Comfort Models: Predicting Individuals’ Thermal Preference Using Occupant Heating and Cooling Behavior and Machine Learning.” Building and Environment 129 (February): 96–106. https://doi.org/10.1016/j.buildenv.2017.12.011.

Li, Da, Carol C. Menassa, and Vineet R. Kamat. 2017a. “Personalized Human Comfort in Indoor Building Environments under Diverse Conditioning Modes.” Building and Environment 126 (October): 304–17. https://doi.org/10.1016/j.buildenv.2017.10.004.

———. 2017b. “Personalized Human Comfort in Indoor Building Environments under Diverse Conditioning Modes.” Building and Environment 126 (December): 304–17. https://doi.org/10.1016/j.buildenv.2017.10.004.

———. 2018. “Non-Intrusive Interpretation of Human Thermal Comfort through Analysis of Facial Infrared Thermography.” Energy and Buildings 176 (October): 246–61. https://doi.org/10.1016/j.enbuild.2018.07.025.

Lim, Jongyeon, Yasunori Akashi, Doosam Song, Hyokeun Hwang, Yasuhiro Kuwahara, Shinji Yamamura, Naoki Yoshimoto, and Kazuo Itahashi. 2018.
“Hierarchical Bayesian Modeling for Predicting Ordinal Responses of Personalized Thermal Sensation: Application to Outdoor Thermal Sensation Data.” Building and Environment 142 (June): 414–26. https://doi.org/10.1016/j.buildenv.2018.06.045. Liu, Jing, Runming Yao, and Rachel McCloy. 2012. “A Method to Weight Three Categories of Adaptive Thermal Comfort.” Energy and Buildings 47 (April): 312–20. https://doi.org/10.1016/j.enbuild.2011.12.007. Liu, Weiwei, Zhiwei Lian, and Bo Zhao. 2007. “A Neural Network Evaluation Model for Individual Thermal Comfort.” Energy and Buildings 39 (10): 1115–22. https://doi.org/10.1016/j.enbuild.2006.12.005. Luo, Maohui, Wenjie Ji, Bin Cao, Qin Ouyang, and Yingxin Zhu. 2016. “Indoor Climate and Thermal Physiological Adaptation: Evidences from Migrants with Different Cold Indoor Exposures.” Building and Environment 98 (March): 30–38. https://doi.org/10.1016/j.buildenv.2015.12.015. 72 Luo, Maohui, Zhe Wang, Kevin Ke, Bin Cao, Yongchao Zhai, and Xiang Zhou. 2018. “Human Metabolic Rate and Thermal Comfort in Buildings: The Problem and Challenge.” Building and Environment. Elsevier Ltd. https://doi.org/10.1016/j.buildenv.2018.01.005. Luo, Maohui, Jiaqing Xie, Yichen Yan, Zhihao Ke, Peiran Yu, Zi Wang, and Jingsi Zhang. 2020. “Comparing Machine Learning Algorithms in Predicting Thermal Sensation Using ASHRAE Comfort Database II.” Energy and Buildings 210 (March): 109776. https://doi.org/10.1016/j.enbuild.2020.109776. Mishra, A. K., R. P. Kramer, M. G.L.C. Loomans, and H. L. Schellen. 2016. “Development of Thermal Discernment among Visitors: Results from a Field Study in the Hermitage Amsterdam.” Building and Environment 105: 40–49. https://doi.org/10.1016/j.buildenv.2016.05.026. Murakami, Yoshifumi, Masaaki Terano, Kana Mizutani, Masayuki Harada, and Satoru Kuno. 2007. 
“Field Experiments on Energy Consumption and Thermal Comfort in the Office Environment Controlled by Occupants’ Requirements from PC Terminal.” Building and Environment 42 (12): 4022–27. https://doi.org/10.1016/j.buildenv.2006.05.012. Nicol, Fergus. 2017. “Temperature and Adaptive Comfort in Heated, Cooled and Free-Running Dwellings.” Building Research & Information 45 (7): 730–44. https://doi.org/10.1080/09613218.2017.1283922. Park, Sangmin, Sanguk Park, Jinsung Byun, Yeong Yu, and Sehyun Park. 2015. “Design of Building Energy Autonomous Control System with the Intelligent Object Energy Chain Mechanism Based on Energy-IoT.” International Journal of Distributed Sensor Networks 2015. https://doi.org/10.1155/2015/931792. Rana, Rajib, Brano Kusy, Raja Jurdak, Josh Wall, and Wen Hu. 2013. “Feasibility Analysis of Using Humidex as an Indoor Thermal Comfort Predictor.” Energy and Buildings 64 (September): 17–25. https://doi.org/10.1016/j.enbuild.2013.04.019. Revel, Gian Marco, Marco Arnesano, and Filippo Pietroni. 2015. “Integration of Real-Time Metabolic Rate Measurement in a Low-Cost Tool for the Thermal Comfort Monitoring in AAL Environments.” Biosystems and Biorobotics 11: 101–10. https://doi.org/10.1007/978-3-319-18374-9_10. Seppänen, Olli A., and William Fisk. 2006. “Some Quantitative Relations between Indoor Environmental Quality and Work Performance or Health.” HVAC and R Research 12 (4): 957–73. https://doi.org/10.1080/10789669.2006.10391446. Streinu-Cercel, Adrian, Sergiu Costoiu, Maria Mârza, Anca Streinu-Cercel, and Monica Mârza. 2008. “Models for the Indices of Thermal Comfort.” Journal of Medicine and Life 1 (2): 148–56. /pmc/articles/PMC5654073/?report=abstract. Szepesvári, Csaba. 2010. “Algorithms for Reinforcement Learning.” In Synthesis Lectures on Artificial Intelligence and Machine Learning, 9:1–89. Morgan & Claypool Publishers . https://doi.org/10.2200/S00268ED1V01Y201005AIM009. 73 Thapa, Samar. 2019. 
“Insights into the Thermal Comfort of Different Naturally Ventilated Buildings of Darjeeling, India – Effect of Gender, Age and BMI.” Energy and Buildings 193 (June): 267–88. https://doi.org/10.1016/j.enbuild.2019.04.003. Tsuzuki, K, and T Iwata. 2002. “THERMAL COMFORT AND THERMOREGULATION FOR ELDERLY PEOPLE TAKING LIGHT EXERCISE.” Undefined. Wang, Pin, En Fan, and Peng Wang. 2020. “Comparative Analysis of Image Classification Algorithms Based on Traditional Machine Learning and Deep Learning.” Pattern Recognition Letters, August. https://doi.org/10.1016/j.patrec.2020.07.042. Yang, Bin, Xiaojing Li, Yingzhen Hou, Alan Meier, Xiaogang Cheng, Joon-Ho Choi, Faming Wang, et al. 2020. “Non-Invasive (Non-Contact) Measurements of Human Thermal Physiology Signals and Thermal Comfort/Discomfort Poses -A Review.” Energy and Buildings 224: 110261. https://doi.org/10.1016/j.enbuild.2020.110261. Zhong, Chen, and Joon Ho Choi. 2017. “Development of a Data-Driven Approach for Human-Based Environmental Control.” In Procedia Engineering, 205:1665–71. Elsevier Ltd. https://doi.org/10.1016/j.proeng.2017.10.341.
Abstract
As one of a community’s core infrastructure elements, the building is critical for environmental resilience, natural resource consumption, and occupants’ environmental health and well-being. However, the existing facility operation mechanisms of an educational community, whose administration and finances are managed by a third party, have not been effectively integrated with the time-varying environmental needs of actual community dwellers, even though communication and sensing infrastructures have become ubiquitous. Consequently, this underutilization limits the real-time adaptation of community systems, which fail to meet the dwellers’ environmental needs.
The project proposes an integrative approach and develops a community member-centered framework for determining the heating system control of a building through multi-standard decision-making driven by bio-sensing. Such a framework could establish a tailor-made building environment control system that reduces energy consumption and improves occupant comfort. The project collected environmental, thermal sensation, and physiological data from the daily lives of 30 subjects over 3 weeks. Based on the individual data, the Baseline Model, Individual Thermal Comfort Model, and Compiled Thermal Comfort Model were established in Weka and Python using linear regression, decision tree, and random forest algorithms. The Compiled Thermal Comfort Model shows 62.26% higher accuracy than the Baseline Model and 25.97% higher accuracy than the PMV Model. It also performs better than the Individual Thermal Comfort Model when simulating individual data in a group.
The Compiled Thermal Comfort Model was then tested under real-time conditions, in which six new test participants changed their clothing level, activity level, and location in a space with a frequently changing environmental temperature. The results show that the Compiled Thermal Comfort Model, tested in a multi-occupancy setting, achieves 40.32% higher accuracy than the PMV Model, and that the predicted temperature setpoint closely matches the test participants’ thermal comfort and thermal sensation feedback. The thesis also discusses the potential applications of data-driven methods in establishing smart communities.
Asset Metadata
Creator: Wang, Yushi (author)
Core Title: Multi-occupancy environmental control for smart connected communities
School: School of Architecture
Degree: Master of Building Science
Degree Program: Building Science
Publication Date: 04/15/2021
Defense Date: 03/17/2021
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: bio-sensing and data-driven, indoor environmental health, machine learning, multi-standard decision-making, OAI-PMH Harvest, personal thermal comfort profiles
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Choi, Joon-Ho (committee chair), Chiang, Yao-Yi (committee member), Narayanan, Shrikanth (committee member)
Creator Email: yushiw@usc.edu, yushiw620@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-444217
Unique identifier: UC11668608
Identifier: etd-WangYushi-9464.pdf (filename), usctheses-c89-444217 (legacy record id)
Legacy Identifier: etd-WangYushi-9464.pdf
Dmrecord: 444217
Document Type: Thesis
Rights: Wang, Yushi
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA