DEVELOPMENT OF AI-DRIVEN ARCHITECTURAL DESIGN GUIDELINES:
ESTABLISHING HUMAN BIOMETRIC SIGNAL-DRIVEN ARCHITECTURAL
DESIGN GUIDELINE AS A FUNCTION OF PSYCHOLOGICAL PRINCIPLES
by
Xingbai Zhang
A Thesis Presented to the
FACULTY OF THE USC SCHOOL OF ARCHITECTURE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF BUILDING SCIENCE
MAY 2021
Copyright 2021 Xingbai Zhang
Acknowledgments
I would like to thank all the participants for helping me complete the experiment and for providing
their data for this research. I would also like to thank my family and my kind friends for their
assistance throughout the thesis research. Finally, I would like to thank my thesis chair, Professor
Joon-Ho Choi, and my thesis committee members, Professor Selwyn Ting and Professor Rafael
Ferreira Da Silva, for their guidance.
Table of Contents
Acknowledgments
List of Tables
List of Figures
Abstract
1. Introduction
1.1. Overview
1.2. Impacts of design on clients’ consideration
1.3. Architecture Design Process
1.4. Current issues and potential limitations of the conventional schematic design process
1.5. The potential use of human physiological signals in the design process
1.6. Machine learning
1.7. The need for client-centered design approaches
1.8. Summary
2. Literature Review
2.1. The potential conflict of design: Clients vs. Architects
2.1.1. Design intention and relationship between clients and architects
2.2. Architectural Design Process
2.3. Current issues and potential limitations of the conventional schematic design process
2.3.1. Misunderstanding of design concepts and intentions
2.4. The potential use of human physiological signals in the design process
2.4.1. EEG
2.4.2. Brain Waves
2.4.3. HR (Heart Rate) and HRV (Heart Rate Variability)
2.4.4. EDA (Electrodermal Activity)
2.5. Machine Learning
2.5.1. Machine learning algorithms
2.5.2. Supervised Learning
2.5.3. Support Vector Machine
2.5.4. Neural Network Learning
2.6. The need for user (i.e., client)-centered design approaches
2.6.1. Relationship between Human psychological perception and Visual aesthetic
2.6.2. Human physiological responses factors
2.6.3. Physiological Signals to estimate psychological factors
3. Methodology
3.1. Experiment procedure
3.2. Data Collection
3.2.1. User preference survey
3.2.2. Physiological Signals Measurement
3.3. Data analysis
3.3.1. Linear Regression (LR)
3.3.2. Artificial Neural Network (ANN)
3.3.3. Random Forest Classifier and Regressor (RF_C and RF_R)
3.3.4. Decision Tree (DT)
3.4. Analysis Tools
3.4.1. Minitab
3.4.2. Weka
3.5. The research of Joon Joo Kim (JJ) Description
3.5.1. Data Collection
3.5.2. The Selection of Building Façade
3.5.3. User Preference Survey on Building Façade Design
3.5.4. Survey Procedure
3.5.5. Algorithms for Data Analysis
3.5.6. Statistical Analysis
3.5.7. Machine Learning Technique
3.5.8. Data Analysis Tool
3.6. Best Prediction Model
4. Results and Discussion
4.1. Data Preprocessing
4.1.1. Raw Data
4.1.2. Conversion and Extraction
4.1.3. The Integration of Data
4.1.4. Data Cleansing
4.2. Data Analysis
4.2.1. Design Preference Impacted by Gender
4.2.2. Design Preference Impacted by Heart Rate
4.2.3. Design Preference Impacted by Stress Level
4.2.4. Design Preference Impacted by EDA
4.2.5. Design Preference Impacted by Skin Temperature
4.3. Summary
5. Development of Design Preference Prediction Model using Machine Learning Algorithms
5.1. General Design Preference Prediction Model using Linear Regression (LR)
5.2. General Design Preference Prediction Model using Artificial Neural Network (ANN)
5.3. General Design Preference Prediction Model using Decision Tree (DT)
5.4. General Design Preference Prediction Model using Random Forest (RF)
5.5. Comparison of Machine Learning Algorithms in General Design Preference Prediction Model
5.6. Individual Design Preference Prediction Models based on Each Subject’s Personal Data
5.7. Feature Importance Ranking by Using Each Subject’s Personal Data
5.7.1. Ranking by using Every Subject’s Datasets
5.7.2. Ranking by Gender
5.8. Summary
6. Conclusions and Future Work
6.1. Conclusions
6.2. Limitations
6.3. Future Work
References
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
List of Tables
Table 3-1. Personal Information Questionnaires
Table 3-2. Design Preference Questionnaires
Table 3-3. Building Façade Design Parameters Sample
Table 4-1. Raw Data Format Conversion
Table 4-2. The Scale Representing Design Preference
Table 4-3. P-Value in Two-Sample T-test
Table 4-4. ANOVA Test
Table 4-5. ANOVA Test
Table 4-6. ANOVA Test
Table 4-7. ANOVA Test
Table 5-1. Datasets by using 5 Point Scale in Numbers
Table 5-2. Datasets by using 3 Point Scale in Numbers
Table 5-3. Results in Linear Regression (LR)
Table 5-4. Results in Artificial Neural Network
Table 5-5. Datasets by using 5 Point Scale in Nominal
Table 5-6. Datasets by using 3 Point Scale in Nominal
Table 5-7. Results in Decision Tree
Table 5-8. Results in Random Forest Classifier (RF_Classifier)
Table 5-9. Results in Random Forest Regressor (RF_Regressor)
Table 5-10. Machine Learning Results Comparison in General Model
Table 5-11. P-Value of Paired T-Test (5 Point Scale in Numeric)
Table 5-12. P-Value of Paired T-Test (3 Point Scale in Numeric)
Table 5-13. P-Value of Paired T-Test (5 Point Scale in Nominal)
Table 5-14. P-Value of Paired T-Test (3 Point Scale in Nominal)
List of Figures
Figure 1-1. Overview of Client-centered Design Approaches
Figure 2-1. Overview of Design Process
Figure 2-2. EEG Headset with electrodes placed
Figure 2-3. The interactions of Parasympathetic and Sympathetic with HR and HRV
Figure 2-4. Types of Machine Learning
Figure 2-5. Supervised Learning Workflow
Figure 2-6. Decision Trees Workflow
Figure 2-7. Working of SVM
Figure 2-8. Working of RL
Figure 2-9. Structure of an ANN
Figure 3-1. Overview of Methodology
Figure 3-2. Overview of Experiment Procedure
Figure 3-3. Examples of selected Buildings for the survey
Figure 3-4. Specification of Bio-Signals Sensors
Figure 3-5. Specification of EEG Sensor
Figure 3-6. Sample Decision Tree
Figure 3-7. The methodology of JJ’s research
Figure 3-8. Survey Timeline Recommendation
Figure 4-1. Design Preference Survey Data File Sample
Figure 4-2. Sample of EDA and Skin Temperature Data Conversion
Figure 4-3. Sample of Heart Rate and Heart Rate Variability/Stress Level Data Conversion
Figure 4-4. Sample of EEG Data Conversion
Figure 4-5. Sample of Data File Per Subject After Integrating
Figure 4-6. Sample of Combined All Subjects Data in One Sheet
Figure 4-7. Overall Design Preference Impacted by Gender
Figure 4-8. Overall Design Preference Impacted by Gender
Figure 4-9. Design Preference Impacted by Heart Rate in Female Group
Figure 4-10. Design Preference Impacted by Heart Rate in Male Group
Figure 4-11. Design Preference Impacted by Stress Level in Female Group
Figure 4-12. Design Preference Impacted by Stress Level in Male Group
Figure 4-13. Design Preference Impacted by EDA in Male Group
Figure 4-14. Design Preference Impacted by EDA in Male Group
Figure 4-15. Design Preference Impacted by Skin Temperature in Female Group
Figure 4-16. Design Preference Impacted by Skin Temperature in Male Group
Figure 5-1. General Design Preference Prediction Results
Figure 5-2. Interval Plot of ANN (5 Scale Point in Numeric)
Figure 5-3. Artificial Neural Network Comparison
Figure 5-4. Interval Plot of Random Forest Regressor
Figure 5-5. Random Forest Regressor Comparison
Figure 5-6. Interval Plot of Artificial Neural Network
Figure 5-7. Artificial Neural Network Comparison
Figure 5-8. Interval Plot of Random Forest Regressor
Figure 5-9. Random Forest Regressor Comparison
Figure 5-10. Interval Plot of Decision Tree
Figure 5-11. Decision Tree Comparison
Figure 5-12. Interval Plot of Random Forest Classifier
Figure 5-13. Random Forest Classifier Comparison
Figure 5-14. Interval Plot of Decision Tree
Figure 5-15. Decision Tree Comparison
Figure 5-16. Interval Plot of Random Forest Classifier
Figure 5-17. Random Forest Classifier Comparison
Figure 5-18. Datasets by using Numeric Comparison
Figure 5-19. Model by using 5 Point Scale in Numeric Comparison
Figure 5-20. Datasets by using Nominal Comparison
Figure 5-21. Model by using 5 Point Scale in Nominal Comparison
Figure 5-22. Best Model Comparison
Figure 5-23. Decision Tree Sample
Figure 5-24. The 5 Point Scale in Datasets Feature Ranking
Figure 5-25. The 3 Point Scale in Datasets Feature Ranking
Figure 5-26. Female Group in 5 Point Scale in Nominal Feature Ranking
Figure 5-27. Male Group in 5 Point Scale in Nominal Feature Ranking
Figure 5-28. Female Group in 3 Point Scale in Nominal Feature Ranking
Figure 5-29. Male Group in 3 Point Scale in Nominal Feature Ranking
Abstract
In most architectural design projects, clients and architects are the main participants in the
design process. Architects and clients inevitably spend a large amount of time reaching design
agreements because of misunderstandings about the clients’ requirements and preferences.
The efficiency of the design process in an architecture firm depends on the architect. When the
client changes their mind about the project, all or part of the project must be altered
accordingly, and making these changes and redesigning the project, often more than once, takes
time. The objective of the proposed research is to develop architectural design guidelines based
on the use of advanced machine learning algorithms. Unlike the conventional architectural design
process, advanced sensing technologies provide information based on the clients’ reactions, such
as physiological signals and psychological factors. With this information, architects’ concept
designs may have more opportunities to achieve the clients’ satisfaction. In addition, a better
understanding of clients’ needs by architects is expected to substantially decrease processing
time and effort.
Keywords: Architects, Clients, Design Process, Physiological Signals, Machine Learning.
1. Introduction
1.1. Overview
In the architecture industry, design is an important step because it strongly affects the
financial merits of a project. The architecture firm designs the project based on the budget and
the clients’ needs. In most architectural design projects, clients and architects are the main
participants in the design process. In many cases, the clients deliver the task and the desired
program to the architects. Architects act as consultants who provide design services and assist
clients during the design process to help them reach their goals. Architects should write
agreements that include the important information about the project, and they need to educate
clients about the architect’s role and responsibilities on the project. An architect should plan,
design, and review the project based on the clients’ needs. Clients, in turn, should understand
the design as well as their own roles and responsibilities in the process. Conflicts and
misunderstandings can be avoided if roles and responsibilities are well established (American
Institute of Architects 2017).
1.2. Impacts of design on clients’ consideration
Clients have a major role to play in the design process. Architects who deal with clients must
face one situation, however uncomfortable and difficult: leading the client to agree that the
design is good. This means letting clients review the design to confirm that it satisfies their
requirements and includes everything they are looking for. Different designs will affect several
of the clients’ decisions. To complete the task on time and efficiently, it is important that
clients take part in the design reviews. Clients are the final decision makers, as their
acceptance can either make or break the design (American Institute of Architects 2017).
1.3. Architecture Design Process
Architectural services include five design phases: schematic design, design development,
construction documents, bidding, and construction administration. Pre-design is a necessary step
before any of these phases begins. It focuses on research on the property and its context.
Clients can do this themselves or hire an architect to help them, and many firms provide
pre-design architectural services. The pre-design step includes site analysis, zoning analysis,
project scope, project goals, building program, project budgeting, and project scheduling
(American Institute of Architects 2017).
The design process starts with schematic design, in which the architect and the client discuss
the project requirements provided by the client. The architect carries out preliminary research
and property analysis. Property zoning and building code issues are analyzed to inform the
further development of the project. Program organization is also part of this phase: the
architect provides the basic design of each space based on the client’s requirements. Once the
schematic design is done, the architect provides the drawings to the client for agreement
(American Institute of Architects 2017). The design development phase is next. In this phase,
the architect and client select the materials for the project together. The architect revises
the drawings to provide more detailed information and specific materials. Work on the project’s
systems, including structure, plumbing, electrical, and heating, is started by the engineering
team (American Institute of Architects 2017).
In the construction documents phase, the architect and engineer determine all the technical
details of the design, including the structural drawings provided by the engineer. All products
and materials are determined and put on a schedule. Bidding is the fourth phase of the project.
In this phase, the contractor is selected through a competitive bid, or the client chooses the
contractor directly without one. Construction administration is the fifth and final phase of the
project, and it is also the longest. The architect visits the site to oversee the construction
process and make sure that the contractor is following the plans. The architect is responsible
for providing clarification and ensuring that construction is performed according to the
documents until the project is completed (American Institute of Architects 2017).
1.4. Current issues and potential limitations of the conventional schematic design process
In the conventional design process, the architect and client discuss and analyze the project and
any requirements provided by the client. The architect researches the site and its surrounding
environment and analyzes property information, such as zoning and building codes, that might
affect later development. The client provides the architect with a program describing the kinds
of spaces to be built, and the architect designs the individual spaces. The architect proposes
the size, orientation, location, and relationships of the spaces. The overall shape and function
of the project are established at this point. Once the basic design is ready, the architect
provides the drawings to the client for review and approval. After obtaining the client’s
agreement to the design, the architect and client work together to decide which materials to
choose for the interior features of the project, such as windows, doors, and fixtures, until both
stakeholders are satisfied (American Institute of Architects 2017).
In many architectural schematic design scenarios, architects and clients spend a large amount of
time reaching design agreements because of misunderstandings about the clients’ requirements and
preferences. The efficiency of the design process at an architecture firm depends on the
architect. Although the architect usually expends much effort to ensure that a design direction
is agreed upon, design expectations are sometimes misunderstood, and redesign becomes
inevitable. This can be considered inherent to an organic design process, but here it is
identified as a potential area for savings. Therefore, the development of an architectural design
process that applies bio-sensing is proposed.
1.5. The potential use of human physiological signals in the design process
Designers have created a variety of architectural styles around the world. Because architecture
shapes clients’ experiences and expectations and plays an important role in everyone’s daily
life, it is important to understand clients’ aesthetic expectations (Ma, Hu, and Wang 2015). The
experience of a building differs from client to client, and different experiences produce
different physiological signals in the body. The potential use of human physiological signals in
the design process is to collect data from each participating client while they experience
different styles of architecture. The physiological signals are collected and analyzed to show
which design the clients prefer.
1.6. Machine learning
Machine learning algorithms are applied to the data analysis. The primary objective of machine
learning is to model the relationship between the inputs (physiological signals) and the outputs
(best design). Once the mathematical model is selected, it becomes possible to predict the value
of the desired variables from the collected data. Machine learning can build a computational
model of complex relationships. The automatic process of model building is “training”, and the
data used for training is “training data”. The trained model provides a learned description of how
the input variables are mapped to the output (Allmer 2014).
1.7. The need for client-centered design approaches
The objective of the proposed research is to develop architectural design guidelines based on the
use of advanced machine learning algorithms. The aim is to establish human biometric signal-driven
architectural design guidelines as a function of psychological principles. Unlike the conventional
architectural design process, advanced sensing technologies provide information based on the
clients’ reactions, such as physiological signals and psychological factors. With this
information, architects’ concept designs might have more opportunities to achieve the clients’
satisfaction. Also, a better understanding of clients’ needs by architects will significantly
decrease the processing time and effort.
Figure 1-1. Overview of Client-centered Design Approaches
1.8. Summary
To make architecture successful, the design process is an important part of the whole project. In
many cases, the clients deliver the task to the architects. Architects provide design services and
assist the clients during the design process to help them reach their goals. Every architect faces
the task of leading clients to agree that the design is good. The conventional design process
includes several phases: pre-design, schematic design, design development, construction
documents, and construction administration. In many architectural schematic design scenarios,
architects and clients spend a lot of time reaching design agreement because of misunderstandings
about the clients’ needs and expectations. To improve the situation, advanced sensing technologies
are used to obtain information based on the clients’ reactions, for example, physiological signals
and psychological factors. The data are collected and analyzed by machine learning algorithms to
find the best design for each client.
2. Literature Review
2.1. The potential conflict of design: Clients Vs Architects
Architects acquire their work through clients. Even a perfect design is meaningless until it is
agreed to and accepted by the clients. The clients are the ones who fund and support the project
and keep the architecture industry going. The architecture industry exists because of the
clients, and achieving the clients’ satisfaction is the goal of the architecture industry (Siva
and London 2011). Since the 1960s, a number of government and industry reports have indicated that
clients’ satisfaction with the architecture industry and architectural professions is low (Siva
and London 2011).
The construction process is a very complex system that consists of all the project elements and
the assembly of every single part at the temporary production facility, the site. The construction
process has been studied for over a year. But the production system is only part of the process.
The other part, and one which is widely investigated, is the clients. Bertelsen and Emmitt
indicated that, to improve the process and product quality, more study of the client’s system and
a better understanding of their basic requirements are essential (Emmitt, Prins, and Group 2005).
2.1.1. Design intention and relationship between clients and architects
J. P. S. Siva and K. London developed a sociological model of the relationship between architects
and clients during projects, concentrating on the clients’ view. Successful management of the
relationship depends on understanding the underlying social system, in which architects are of
primary significance to the management of the relationship. The concept, called habitus, comes
from sociological theory and helps in understanding the relationship between architects and
clients (Siva and London 2011). Architects are educated toward the mysteries of design practice,
learn the significance of peer review, and become separate from the public as they cross each
invisible professional line. Due to this socialization, the members of the architectural habitus
are set apart from non-members, including their clients (Siva and London 2011). Architects tend to
be peer-oriented rather than client-oriented, which upsets the balance of autonomy in the
relationship between them and their clients (Siva and London 2011). During the design process, the
difference in habitus between architects and clients can cause a mismatch in the design. This
happens when they enter a relationship in which the clients’ habitus conflicts with the
conditions it encounters. The clients’ habitus might be incompatible with the unfamiliar
architectural habitus, resulting in underlying discomfort (Siva and London 2011).
An architectural project starts with the design phase, which is supposed to convert and visualize
the requirements and needs of clients. The exchange of information and communication between
architects and clients is extremely important in this phase and directly affects it. Several
studies have shown that communication management is an important part of overall project
management (Taleb et al. 2017). To provide a better understanding of the architects’ ideas, a
briefing is recommended to present the architects’ findings related to their design process. The
briefing, or requirements program, plays an important role in communication between architects and
clients. Good briefs clearly explain the needs, desires, and expectations of the clients. The
information shown in briefs is extremely important for the design process (Bogers, Van Meel, and
Van Der Voordt 2008).
2.2. Architectural Design Process
The Design Process is an approach to separate the large project into manageable steps. Architects
use this design process to solve several problems (Chicago Architecture Center 2019).
There are 6 steps of the design process:
Figure 2-1. Overview of Design Process.
1. Problem detection
Understand the problem you are supposed to solve.
2. Information collection
Collect all information about the project, such as site analysis and clients’ information,
to start considering the solution.
3. Brainstorming and ideation
Start to sketch, research, and study related knowledge to understand how to use all the
information to shape the design.
4. Solution development
Bring the hypotheses and ideas together to form several small-scale design solutions.
5. Feedback collection
Show the ideas to as many people as possible, such as colleagues, friends, and professionals,
to get feedback from them.
6. Improvement
Based on the feedback, define how to extend the ideas to better solve the problem.
2.3. Current issues and potential limitations of the conventional schematic design process
In many architectural schematic design scenarios, architects and clients inevitably spend a large
amount of time reaching design agreements in order to gain an increasingly better understanding of
the clients’ requirements and preferences. The efficiency of the design process at an architecture
firm depends on the architect. Despite processes that might safeguard against late-process
alterations, if the client changes the idea of the project, the whole project or part of it will
be altered accordingly. Making the changes takes time, and the project might be redesigned more
than once.
2.3.1. Misunderstanding of design concepts and intentions
The changes and unknowns imposed on clients can create stressful conditions. This can happen
especially at phase completion, when clients are required to sign off on the design in order to
proceed to the next phase. These jumps make the design process unpredictable, but they can be good
for creating inspiring ideas. These unknowns and changes in the clients’ activities produce
emotions that clients might not be used to. Clients play an important role in the design process.
Nevertheless, the main problem is that clients might not have enough knowledge about design or
about the information they provide (Siva and London 2011). Normally, when clients enter into
relationships with architects, their expectations of the architects are uncertain. A state of
shock rooted in the clients’ habitus might be revealed when they meet architects whose habitus
does not match theirs, much as an individual experiences culture shock when encountering a
different culture (Siva and London 2011). The term habitus shock is defined as the confusion,
stress, or frustration experienced by clients. Experiencing an unfamiliar architectural habitus
and design process can result in a misunderstanding between clients and architects (Siva and
London 2011).
2.4. The potential use of human physiological signals in the design process
There are a variety of architectural styles around the world created by designers. Architecture
can shape clients’ experiences and expectations. As architecture plays an important role in
everyone’s daily life, it is important to understand clients’ aesthetic experiences and
expectations (Ma, Hu, and Wang 2015). The experience of a building differs from client to client,
and different experiences elicit different physiological signals from the body. The potential use
of human physiological signals in the design process is to collect data from each participating
client while they experience different design styles. The physiological signals are then analyzed
to show which design each client considers best.
2.4.1. EEG
The electroencephalogram (EEG) can be defined as a recording of the different types of electrical
activity at the scalp surface, measured through metal electrodes and conductive media (Teplan
2002). EEG is an imaging technique used in medical and other research areas, and it reads scalp
electrical activity generated by the brain (Teplan 2002). EEG measures electrical activity at the
scalp surface, and recordable activity can only be generated by a large number of active neurons.
EEG patterns can be recorded immediately after a stimulus has been triggered, and the positions of
electrical activity in different brain regions can be determined through EEG. EEG has been applied
to humans in several ways, such as in physiological research and the study of sleep disorders. EEG
shows the pattern of electrical activity of brainwaves and helps determine the physiological and
psychological state of humans. Therefore, EEG measurement and analysis are important for this
research.
Shan et al. explored the potential of using the neural-signal electroencephalogram (EEG) to
develop the interaction between humans and buildings under different indoor temperatures. The
relationship between EEG and both subjective perceptions and task performance was investigated by
experiment. The results showed that EEG frontal asymmetrical activity is well related to both the
subjective survey and objective task performance. It can be used as a more objective standard to
verify the traditional subjective survey-based and task-based methods. Machine-learning-based EEG
pattern recognition with a linear discriminant analysis (LDA) classifier can categorize various
mental states under different thermal conditions (Shan et al. 2018).
Figure 2-2. EEG Headset with electrodes placed (Teplan 2002)
2.4.2. Brain Waves
The EEG sensor on the scalp allows the measurement of the brainwave pattern, which represents the
state of a person’s electrical brain activity. Huang and Charyton mentioned that the different
ranges of brainwave frequencies can represent different states of human activity. The slower delta
frequencies (1-4 Hz) relate to deep sleep; theta frequencies (4-8 Hz) relate to light sleep,
insight, and creativity; and alpha frequencies (8-12 Hz) represent a peaceful and calm state. Beta
frequencies (13-21 Hz) relate to the state of concentration and consideration. The high beta
frequencies (20-32 Hz) are related to anxiety or nervousness (Yasui 2009). Therefore, analyzing
brainwave data can help architects understand the state of clients. However, some factors, such as
unstable emotion or the baseline EEG, might influence post-stimulus EEG changes. According to
Huang and Charyton, changes in emotion and cognition influence the EEG, but the current state of
the individual is the main cause of EEG changes (Huang and Charyton 2008).
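The band limits quoted above can be written down as a small lookup, sketched here in Python. This is only an illustration: the function name and the half-open treatment of each range are assumptions, not part of the cited sources. Note that the quoted ranges leave a gap between 12 and 13 Hz and overlap between 20 and 21 Hz, so a frequency can match no band or two bands.

```python
# Band limits in Hz as quoted in the text (Yasui 2009).
# Treating each range as half-open [low, high) is an assumption of this sketch.
BANDS = {
    "delta": (1.0, 4.0),       # deep sleep
    "theta": (4.0, 8.0),       # light sleep, insight, creativity
    "alpha": (8.0, 12.0),      # peaceful and calm state
    "beta": (13.0, 21.0),      # concentration and consideration
    "high beta": (20.0, 32.0)  # anxiety or nervousness
}


def bands_for(frequency_hz):
    """Return the names of all quoted bands that contain the given frequency."""
    return [name for name, (low, high) in BANDS.items() if low <= frequency_hz < high]


if __name__ == "__main__":
    for f in (2.0, 10.0, 12.5, 20.5):
        print(f, bands_for(f))
```

Returning a list rather than a single label keeps the gap and the overlap in the quoted ranges visible instead of silently resolving them.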
2.4.3. HR (Heart Rate) and HRV (Heart Rate Variability)
Heart rate variability (HRV) has been widely studied for many years in clinical situations and
can predict negative outcomes in many illnesses (Sacha 2014). The average heart rate (HR) is
another key factor that provides evidence for predicting cardiovascular diseases. It is well known
that HRV is significantly related to HR. Hence, HRV carries two important pieces of information:
HR and its variability. The relationship between HRV and HR is partly a physiological phenomenon
and partly a mathematical one. Physiologically, the dependence of HRV on HR is driven by the
activity of the autonomic nervous system, which means that the lower the parasympathetic nervous
system activity, the faster the HR and the lower the HRV (Sacha 2014).
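To make the link between HR and HRV concrete, the sketch below computes a mean heart rate and two commonly used time-domain HRV measures (SDNN and RMSSD) from a list of R-R intervals. The choice of these two measures and the sample interval values are assumptions made for illustration; the text above does not name a specific HRV metric.

```python
import math


def mean_heart_rate(rr_ms):
    """Mean heart rate in beats per minute from R-R intervals given in milliseconds."""
    mean_rr = sum(rr_ms) / len(rr_ms)
    return 60000.0 / mean_rr


def sdnn(rr_ms):
    """Sample standard deviation of the R-R intervals (ms)."""
    mean_rr = sum(rr_ms) / len(rr_ms)
    variance = sum((x - mean_rr) ** 2 for x in rr_ms) / (len(rr_ms) - 1)
    return math.sqrt(variance)


def rmssd(rr_ms):
    """Root mean square of successive R-R interval differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical recordings: a slower, more variable rhythm and a faster, steadier one.
SLOW = [1000, 1040, 960, 1020, 980]
FAST = [600, 605, 595, 602, 598]

if __name__ == "__main__":
    for name, rr in (("slow", SLOW), ("fast", FAST)):
        print(name, round(mean_heart_rate(rr), 1), round(sdnn(rr), 2), round(rmssd(rr), 2))
```

In these made-up series the faster rhythm also shows the smaller variability, which mirrors the direction of the relationship described above without claiming anything about real recordings.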
Figure 2-3. The interactions of Parasympathetic and Sympathetic with HR and HRV [11]
Human emotion varies from individual to individual and is a key factor of psychological state. It
is very hard to measure with any degree of accuracy. Heart rate variability (HRV) provides an
assessment of the autonomic nervous system (ANS) and of human emotions. Choi et al. examined the
validity of HRV as a tool to assess human emotions using the International Affective Picture
System (IAPS) (K. H. Choi et al. 2017). For the experiment, they selected five images from the
IAPS for each of the categories “happy”, “unhappy”, and “neutral”. The participants were asked to
complete the Self-Assessment Manikin (SAM) after being shown each image. The researchers extracted
the R-R interval (RRI) value for each image from the photoplethysmogram (PPG) and the valence,
arousal, and dominance values for each image from the SAM, and analyzed the relationships among
them. The results indicated that, under image stimulation, the relationship between valence and
“unhappy” emotion was significant and positive, and the relationship between dominance and
“unhappy” emotion was significant and negative. Therefore, the findings suggested that an
HRV-based assessment is possible only when a high level of emotion is triggered by visual
stimulation (K. H. Choi et al. 2017).
2.4.4. EDA (Electrodermal Activity)
Electrodermal Activity (EDA) is an electrical property of the skin whose variations can be
measured using sensors. It has also been known as skin conductance, skin conductance response
(SCR), sympathetic skin response (SSR), skin conductance level (SCL), electrodermal response
(EDR), etc. EDA varies with the state of the sweat glands in the skin, and these glands are
controlled by the sympathetic nervous system (Boucsein 2012). An EDA sensor can be a wearable
device placed on the wrist of the participant to detect the physical condition from the acquired
physiological signal (Zangróniz et al. 2017).
Liu and Du proposed a method for measuring psychological stress that explores the possibility of
using only a single physiological signal (EDA), rather than the multiple physiological signals
used in recent work, as a more practical choice for measuring psychological stress in humans. This
approach uses linear discriminant analysis (LDA) on the electrodermal activity (EDA) signal to
distinguish three levels of stress: low, medium, and high (Liu and Du 2018). Eighteen EDA features
were selected, with the three stress levels corresponding to three driving conditions: at rest, on
the highway, and city driving. Fisher projection and linear discriminant analysis (LDA) were used
to categorize the levels of stress. The results showed that these methods reached an 81.82%
recognition rate using a single EDA signal, which is an acceptable accuracy rate. Even though this
is lower than that of multiple-signal systems, it might offer a better balance between
computational load and recognition performance, which may be a promising research route toward
practical personal stress monitors (Liu and Du 2018).
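The idea behind Fisher projection can be sketched on a much smaller problem than the one in the cited study. The example below is a simplification under stated assumptions: it uses two classes and two made-up features instead of three stress levels and eighteen EDA features, and the sample values, feature meanings, and function names are hypothetical. It projects samples onto the direction w = Sw^-1 (m_high - m_low) and thresholds at the midpoint of the projected class means.

```python
def mean_vec(rows):
    """Mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]


def scatter(rows, m):
    """2x2 within-class scatter matrix of one class around its mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def fisher_direction(low, high):
    """Return the Fisher direction w and a midpoint threshold for two classes."""
    m_low, m_high = mean_vec(low), mean_vec(high)
    s_low, s_high = scatter(low, m_low), scatter(high, m_high)
    sw = [[s_low[i][j] + s_high[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m_high[0] - m_low[0], m_high[1] - m_low[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    threshold = 0.5 * (dot(w, m_low) + dot(w, m_high))
    return w, threshold


def classify(x, w, threshold):
    """Label a sample by which side of the threshold its projection falls on."""
    return "high" if dot(w, x) > threshold else "low"


# Hypothetical two-feature EDA summaries (for example, mean level and peak count).
LOW_STRESS = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.8, 2.1)]
HIGH_STRESS = [(3.0, 4.0), (3.5, 4.4), (2.8, 3.7), (3.2, 4.2)]
W, T = fisher_direction(LOW_STRESS, HIGH_STRESS)

if __name__ == "__main__":
    print(classify((1.1, 2.0), W, T), classify((3.1, 4.1), W, T))
```

A three-class, many-feature version would normally rely on a numerical library; the hand-written 2x2 inverse here only keeps the sketch self-contained.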
2.5. Machine Learning
Machine learning algorithms are applied to the data analysis. The primary objective of machine
learning is to model the relationship between the inputs (physiological signals) and the outputs
(best design) (Allmer 2014). Once the mathematical model is selected, it becomes possible to
predict the value of the desired variables from the collected data (Allmer 2014). Machine learning
can build a computational model of complex relationships. The automatic process of model building
is “training”, and the data used for training is “training data”. The trained model provides a
learned description of how the input variables are mapped to the output (Allmer 2014).
Machine learning has been widely used for the analysis of physiological signals such as EEG. It
gives a new approach to characterizing task-related brain states, and significant EEG data can be
extracted with it. It is important to select a suitable set of EEG features in machine learning
analysis (Hu and Zhang 2019). Researchers can choose between a hypothesis-driven and a data-driven
search. The objective of a machine learning classifier is to select the correct class for a given
feature vector based on the knowledge acquired during training (Hu and Zhang 2019).
2.5.1. Machine learning algorithms
Figure 2-4. Types of Machine Learning [26][27]
2.5.2. Supervised Learning
Supervised machine learning algorithms require external assistance: the input dataset is
separated into a training dataset and a test dataset. The training dataset contains an output
variable that has to be classified or predicted. Every algorithm learns patterns from the training
dataset and applies them to the test dataset for classification or prediction (Dey 2016).
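The train/test workflow can be sketched with a deliberately simple classifier. The example below is illustrative only: the nearest-centroid classifier, the feature values (a heart rate and an EDA level), and the "like"/"dislike" labels are assumptions chosen to keep the sketch short, not the models or data used in this research.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle labelled rows and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(round(len(rows) * test_fraction)))
    return rows[n_test:], rows[:n_test]


def fit_centroids(train):
    """Learn one mean feature vector (centroid) per label from the training set."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest to the feature vector."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(centroids, test):
    """Fraction of test rows whose predicted label matches the recorded label."""
    hits = sum(predict(centroids, features) == label for features, label in test)
    return hits / len(test)


# Hypothetical rows: ((heart rate in bpm, EDA level), preference label).
DATA = [
    ((68, 0.20), "like"), ((70, 0.30), "like"), ((72, 0.25), "like"), ((71, 0.20), "like"),
    ((88, 0.80), "dislike"), ((90, 0.70), "dislike"), ((92, 0.90), "dislike"), ((89, 0.85), "dislike"),
]
TRAIN, TEST = train_test_split(DATA)
MODEL = fit_centroids(TRAIN)

if __name__ == "__main__":
    print(len(TRAIN), len(TEST), accuracy(MODEL, TEST))
```

The point of the split is that accuracy is measured only on rows the model did not see while the centroids were being computed.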
Figure 2-5. Supervised Learning Workflow [28]
“Decision tree” is one of the algorithms of supervised learning. Decision trees group attributes
by sorting them according to their values. A decision tree consists of nodes and branches and is
mainly applied to classification. Each node represents an attribute in a group to be classified,
and each branch represents a value that the node can take (Dey 2016).
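The node-and-branch structure described above can be sketched as a nested dictionary. The attributes, values, and labels below are hypothetical and the tree is written by hand rather than learned from data; the sketch only shows how a sample is routed from the root node to a leaf.

```python
# Each internal node names the attribute it tests; each branch is one value of that
# attribute. Leaves carry the predicted label. All names here are hypothetical.
TREE = {
    "attribute": "window_size",
    "branches": {
        "large": {"label": "like"},
        "small": {
            "attribute": "material",
            "branches": {
                "glass": {"label": "like"},
                "concrete": {"label": "dislike"},
            },
        },
    },
}


def classify(tree, sample):
    """Follow the branch matching the sample's attribute value until a leaf is reached."""
    node = tree
    while "label" not in node:
        value = sample[node["attribute"]]
        node = node["branches"][value]
    return node["label"]


if __name__ == "__main__":
    print(classify(TREE, {"window_size": "small", "material": "concrete"}))
```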
Figure 2-6. Decision Trees Workflow [29]
2.5.3. Support Vector Machine
One of the most used machine learning algorithms is the support vector machine (SVM).
Classification is the main purpose of the SVM. Margins are drawn between the classes in such a way
that the distance between the margin and the classes is maximized, which improves classification
accuracy (Dey 2016).
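The margin idea can be illustrated without training an SVM. The sketch below only measures the geometric margin of two hand-picked linear separators on a made-up, linearly separable dataset; an SVM would search for the separator with the largest such margin. The data, weights, and function name are assumptions for illustration.

```python
import math


def geometric_margin(w, b, samples):
    """Smallest signed distance from any labelled sample to the line w.x + b = 0.

    Labels are +1 or -1; a positive result means every sample is on its correct side.
    """
    norm = math.sqrt(sum(c * c for c in w))
    return min(
        label * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, label in samples
    )


# Hypothetical 2-D samples with labels -1 and +1.
SAMPLES = [((1, 1), -1), ((2, 1), -1), ((4, 3), 1), ((5, 4), 1)]

# Two candidate separators; both classify every sample correctly.
MARGIN_A = geometric_margin((1, 1), -5.0, SAMPLES)   # diagonal line x + y = 5
MARGIN_B = geometric_margin((1, 0), -2.5, SAMPLES)   # vertical line x = 2.5

if __name__ == "__main__":
    print(round(MARGIN_A, 3), round(MARGIN_B, 3))
```

Both lines separate the classes, but the diagonal one keeps the nearest samples farther away, which is the property the margin criterion rewards.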
Figure 2-7. Working of SVM [30]
Reinforcement learning makes decisions about which actions to take such that the outcome is more
positive. The learner does not know which actions to take until it is given a situation. The
action taken by the learner might change the situation and its further actions. Reinforcement
learning is based on two criteria: trial-and-error search and delayed outcome (Dey 2016).
Figure 2-8. Working of RL [31]
2.5.4. Neural Network Learning
The algorithm of Neural Network Learning comes from the biological concept of neurons. It has
three layers: the input layer receives the inputs, the hidden layer processes the inputs, and the
output layer delivers the calculated output (Dey 2016).
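A forward pass through such a three-layer network (input, hidden, output) can be sketched as follows. The sigmoid activation, the layer sizes, and the weight values are assumptions made for illustration; the text does not specify them, and no training step is shown.

```python
import math


def sigmoid(z):
    """Logistic activation squashing any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron followed by the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


if __name__ == "__main__":
    # Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron.
    out = forward(
        [0.3, 0.7],
        hidden_w=[[0.5, -0.4], [0.1, 0.8]],
        hidden_b=[0.0, -0.2],
        output_w=[[1.2, -0.6]],
        output_b=[0.1],
    )
    print(out)
```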
Figure 2-9. Structure of an ANN [32]
2.6. The need of user (i.e. client)-centered design approaches
As discussed above, the objective of the proposed research is to develop an architectural design
guideline based on the use of advanced machine learning algorithms. The aim is to establish human
biometric signal-driven architectural design guidelines as a function of psychological principles.
Unlike the conventional architectural design process, advanced sensing technologies provide
information based on the clients’ reactions, such as physiological signals and psychological
factors. With this information, architects’ concept designs might have more opportunities to
achieve the clients’ satisfaction. Also, a better understanding of clients’ needs by architects
will significantly decrease the processing time and effort.
2.6.1. Relationship between Human psychological perception and Visual aesthetic
Visual aesthetics is the capacity to assign different degrees of beauty to colors, shapes,
movement, etc. (Cela-Conde et al. 2004). It is a characteristic of human beings. A different view
defines visual aesthetics as “the study of human minds and emotions related to the sense of
beauty.” This definition focuses more on recognizing the minds and emotions of humans (Palmer,
Schloss, and Sammartino 2013).
2.6.2. Human physiological responses factors
Choi and Yeom set up experiments to measure body skin temperature and heart rate in response to
changes in the surrounding temperature, in order to investigate the correlation between occupants’
thermal satisfaction and physiological responses in an office environment. Biosensors were used
for measurement at different local body spots. The experiment lasted 100 min; the temperature
started at 20 °C and ended at 30 °C, increasing by 1 °C every 10 min, and participants filled out
the thermal satisfaction survey simultaneously. The analysis of the collected data showed that
overall thermal comfort is negatively correlated with heart rate. Also, thermal comfort and
physiological signals differed between genders (J. H. Choi and Yeom 2019). The local skin
temperature of the female group was lower than that of the male group at the forehead, chest,
back, and neck. At the wrist (back), wrist (front), and belly, the male group had a lower skin
temperature than the female group, and the average heart rate for females was lower than for
males. The collected data can be used by architecture and engineering firms as a reference to
improve the building environment through the design of the indoor environmental control system
(J. H. Choi and Yeom 2019).
2.6.3. Physiological Signals to estimate psychological factors
Jang et al. examined physiological signals to recognize three different emotions: boredom, pain,
and surprise. The three emotions were triggered by emotional stimuli, and the physiological
signals (Electrocardiography (ECG), Electrodermal Activity (EDA), Skin Temperature (SKT), and
Photoplethysmography (PPG)) were measured to collect data on the emotional state of the subjects.
To classify the three emotions, twenty-seven physiological features were extracted. A statistical
method (discriminant function analysis (DFA)) and five machine learning algorithms (linear
discriminant analysis (LDA), classification and regression trees (CART), self-organizing map
(SOM), the Naïve Bayes algorithm, and support vector machine (SVM)) were used for classifying the
emotions. The results indicated that physiological signals including heart rate (HR), skin
conductance level (SCL), skin conductance response (SCR), mean skin temperature (meanSKT), blood
volume pulse (BVP), and pulse transit time (PTT) greatly affect the state of emotion (Jang et al.
2015).
3. Methodology
This thesis explores the relationship between human physiological signals and personal design
preferences. The goal of this research is to develop an architectural design guideline based on
the correlation between physiological signals and personal design preference. This study selected
the third floor of Watt Hall (MBS Corner) at USC as the experimental site for measurement and
survey. To explore the correlation between human physiological signals and personal design
preference, this research collected two types of data: physiological signals and questionnaires.
The physiological signal data consist of five aspects: EDA, skin temperature, heart rate (HR),
heart rate variability/stress level (HRV), and EEG signals (brainwaves). Three types of wearable
sensors were used to collect these data. The questionnaires were used to investigate participants’
design preferences while the physiological signals were being measured. After data collection,
statistical analysis was used to analyze the physiological datasets and the correlation between
these data and the personal design preference questionnaires. Figure 3-1 shows an overview of the
workflow in this research.
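As an illustration of what a correlation between a physiological signal and a design preference score looks like numerically, the sketch below computes a Pearson correlation coefficient. The choice of Pearson's coefficient and the sample values are assumptions for illustration; they do not represent the specific statistics or data of this study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical values: mean heart rate per building image and the
    # participant's preference score on the -2..2 scale.
    heart_rate = [72, 75, 80, 78, 70]
    preference = [1, 0, -2, -1, 2]
    print(round(pearson(heart_rate, preference), 3))
```

A value near -1 or +1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.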
Figure 3-1. Overview of Methodology
3.1. Experiment procedure
Figure 3-2. Overview of Experiment Procedure
The experiments are conducted to collect physiological signals and design preferences. The
volunteers are students at USC, with a total of 30 participants. Each participant is in good
physical condition without any particular health issues. Each participant is asked to provide
basic information, such as age, gender, and culture. Participants have 15 minutes to prepare for
the experiment, which includes waiting in the waiting room and fitting all the required sensors
properly. To help the participants understand the design satisfaction survey, a dummy test of
around 5 minutes is provided. Once the preparation and the dummy test are completed, the
experiment officially starts. The experiment collects design preference surveys and physiological
signals simultaneously. The user preference survey consists of 50 buildings, each with various
design features, such as façade surface, window shape, window size, materials, and aspect ratio.
Each building has 15 design satisfaction questions, and each question asks about one design
feature. While the participants answer the design satisfaction survey, their physiological
signals, consisting of EEG, EDA, HR, and HRV, are recorded. The design satisfaction survey takes
about 60 minutes to finish, and the whole experiment takes no longer than 90 minutes.
3.2. Data Collection
Two types of data were collected in this research: physiological signals and design preference. In
order to study the physiological condition of the human body, several parameters were collected:
EDA, skin temperature, heart rate (HR), heart rate variability/stress level, and EEG signals
(brainwaves). Also, digital design preference questionnaires were provided to every participant to
collect their design preference data.
3.2.1. User preference survey
In order to understand the participants’ satisfaction with the architectural design, this study
involved a questionnaire for the experimental subjects. The digital design preference survey was
distributed to every experimental participant. A dummy test and a diagram of the selected design
features precede the official questionnaire to help participants understand how to answer the
survey. Participants were asked to answer each question carefully. Before the survey, each
participant provided personal information such as age, gender, and culture, which helps in
classifying the data based on age or gender. The survey includes images of 50 existing buildings
with numerous and varied designs, such as façade shape, window shape, window size, materials, and
aspect ratio. Each building image has 15 questions, and every question asks about one design
feature. The survey uses a 5-point scale from the least preferred to the most preferred design for
each question. Table 3-1 and Table 3-2 show the personal information questionnaires and the sample
design preference survey.
Figure 3-3. Examples of selected Buildings for the survey
Table 3-1. Personal Information Questionnaires
Personal Information Questionnaires
1. Age
2. Gender
3. Cultural background (i.e., Nationality)
Table 3-2. Design Preference Questionnaires
Design Preference Questionnaires (rating options: -2, -1, 0, 1, 2)
1. Do you like the width x length ratio of the building?
2. Do you like the height of the building?
3. Do you like the overall proportion/form of the building?
4. Do you like the material selection(s) of the building?
5. Do you like the color(s) of the material?
6. Do you like the number of windows?
7. Do you like the wall-to-window ratio of the building?
8. Do you like the color(s) of the windows?
9. Do you like the transparency of the windows?
10. Do you like the reflectivity of the windows?
11. Do you like the module pattern of the façade?
12. Do you like the roughness of the façade?
13. Do you like the depth of the wall and window?
Overall Design Preference Questionnaires
14. Do you like the overall style of the façade?
15. Do you like the overall style of the building?
3.2.2. Physiological Signals Measurement
This study selected two wearable watches, Empatica Embrace and Garmin Vivosmart 3 (both shown in
Figure 3-4). Empatica Embrace was used to measure skin temperature and EDA. This device has
approximately 15 hours of battery life and takes around 2 hours to fully charge. Its data can be
stored in memory for up to 14 hours without syncing through Bluetooth. EDA is sampled at 4 Hz and
skin temperature at 1 Hz. The Garmin Vivosmart 3 was used to measure heart rate (HR) and heart
rate variability/stress level. This device’s charge lasts for approximately 7 days, and it takes
around 2 hours to reach full charge. The data can be stored for up to 14 days. Raw heart rate and
heart rate variability/stress level data are recorded at 1-minute intervals and can be exported
after the measurement. However, the Garmin Vivosmart 3 does not allow users to export raw data in
CSV format directly; FIT files are the only available export format. Therefore, the author
converted the FIT files into CSV files using a Python program that exports the required heart rate
(HR) and heart rate variability/stress level data. Figure 3-4 shows the specifications of these
two wearable watches.
Figure 3-4. Specification of Bio-Signals Sensors
This study selected one wearable EEG sensor, EMOTIV EPOC X, to measure the EEG signals
(brainwaves) of participants. The battery life of this device is up to 12 hours when using a USB
receiver and up to 6 hours when using Bluetooth Low Energy. This device has 14 channels of EEG
sensors (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4) that measure various
locations of the human brain. The sensors are saline-soaked felt pads, which need to be wet enough
to obtain good EEG signals before the experiment. The EMOTIVPRO software, provided by EMOTIV, can
be used to record the brainwaves and export the data in CSV and EDF formats. Figure 3-5 shows the
specification of the EEG sensor.
Figure 3-5. Specification of EEG Sensor
3.3. Data analysis
After the data collection was completed, the collected data were analyzed by using machine
learning algorithms: linear regression, artificial neural network, random forest regressor,
decision tree, and random forest classifier. Each algorithm was used to build a general
preference prediction model based on all subjects’ data and an individual preference prediction
model based on each subject’s own data. Four types of input datasets were used in this research:
a 5 point scale in numbers, a 3 point scale in numbers, a 5 point scale in nominal, and a 3 point
scale in nominal. The 5 point scale in numbers used -2, -1, 0, 1, and 2 to represent the options
strongly dislike, dislike, neutral, like, and strongly like. The 3 point scale in numbers used
-1, 0, and 1 to represent the options dislike, neutral, and like. The 5 point scale in nominal
used the text labels strongly dislike, dislike, neutral, like, and strongly like. The 3 point
scale in nominal used the text labels dislike, neutral, and like. 10-fold cross-validation was
applied to all five machine learning algorithms: the whole dataset is divided into 10 subsets,
and in each round one subset (10% of the dataset) is held out for testing while the remaining
nine subsets (90%) are used for training. The process repeats until every subset has been used
once for testing. This method has the advantage of validating the model 10 times.
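The 10-fold procedure can be sketched in a few lines of Python. This is only a minimal illustration and not the Weka implementation used in the study: the function name `k_fold_indices`, the fixed random seed, and the example size of 1,500 rows (30 subjects x 50 buildings) are assumptions made for illustration, and no stratification is applied.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Every fold except the held-out one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical dataset size: 30 subjects x 50 buildings = 1,500 rows.
splits = list(k_fold_indices(1500, k=10))
for train_idx, test_idx in splits:
    # Each round trains on 90% of the rows and tests on the remaining 10%.
    assert len(train_idx) == 1350 and len(test_idx) == 150
```

Each row appears in exactly one test fold, so after the ten rounds every row has been used for testing once and for training nine times.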
3.3.1. Linear Regression (LR)
Linear regression (LR) is a commonly used type of predictive analysis that is useful for finding
the relationship between continuous variables. The regression estimates are used to explain the
relationship between the dependent variable and one or more independent variables. In this
study, linear regression was used to identify the human physiological signal parameters that
affect the overall design preference of subjects. The datasets using the 5 point scale in
numeric and the 3 point scale in numeric were used for linear regression analysis.
3.3.2. Artificial Neural Network (ANN)
Artificial neural network (ANN) is one of the well-known machine learning algorithms that has
been used for classification. The idea of the artificial neural network is derived from the
biological concept of neurons. It consists of an input layer of nodes, one or two hidden layers
of nodes, and a final layer of output nodes. The values given to the input nodes are the
independent variables, and the values returned from the output nodes are the dependent
variables [35]. The datasets using the 5 point scale in numeric and the 3 point scale in numeric
were used for artificial neural network analysis.
3.3.3. Random Forest Classifier and Regressor (RF_C and RF_R)
The random forest algorithm has been used successfully for both classification and regression.
This approach consists of a large number of individual random decision trees that operate as an
ensemble. In classification, each tree in the random forest produces a class prediction, and the
class with the most votes becomes the model’s prediction; in regression, the predictions of the
trees are averaged. The datasets using the 5 point scale in numeric and the 3 point scale in
numeric were used for the random forest regressor analysis. The datasets using the 5 point
scale in nominal and the 3 point scale in nominal were used for random forest classifier analysis.
3.3.4. Decision Tree (DT)
The decision tree is a classifier expressed as a recursive partition of the instance space. The
decision tree consists of nodes that form a rooted tree, meaning it is a directed tree with a
node called the root that has no incoming edges. All other nodes have exactly one incoming edge.
A node with outgoing edges is an internal or test node. All other nodes are leaves or decision
nodes. In this algorithm, each internal (test) node splits the instance space into two or more
sub-spaces according to a certain discrete function of the input attributes (Rokach and Maimon
2006). The datasets using the 5 point scale in nominal and the 3 point scale in nominal were
used for decision tree analysis. In this study, the decision tree algorithm was also used to
select the most essential human physiological signal of each subject. Figure 3-6 shows a sample
decision tree for one subject, used to select the most essential human physiological signal. In
this example, EDA is the most important signal for this subject.
Figure 3-6. Sample Decision Tree
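One way to see why a single signal ends up at the root of a decision tree is to compute, for each signal, the best information gain obtainable from a single threshold split. The sketch below assumes information gain as the split criterion, which the text does not specify, and the signal values and labels are invented purely for illustration; the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def best_threshold_gain(values, labels):
    """Largest information gain over all binary splits 'value <= threshold'."""
    base = entropy(labels)
    best = 0.0
    # Skip the largest value so that both sides of the split are non-empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = max(best, base - weighted)
    return best


def most_informative_signal(signals, labels):
    """Return the signal name with the highest gain, plus all gains."""
    gains = {name: best_threshold_gain(vals, labels) for name, vals in signals.items()}
    return max(gains, key=gains.get), gains


# Invented example data for one hypothetical subject (not measured values).
example_signals = {
    "EDA": [0.10, 0.12, 0.30, 0.35, 0.11, 0.32],
    "HR": [72, 80, 75, 79, 81, 74],
}
example_labels = ["dislike", "dislike", "like", "like", "dislike", "like"]
top_signal, gains = most_informative_signal(example_signals, example_labels)
```

In this invented example the EDA values separate the two labels perfectly, so EDA would be chosen for the root split, mirroring the situation described for the sample subject.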
3.4. Analysis Tools
There are two analysis tools used in this study: Minitab and Weka.
3.4.1. Minitab
Minitab is a statistics software package developed in 1972 by three researchers at the
Pennsylvania State University as a teaching tool for statistics classes. Today, it is used not
only in statistics classes but also in advanced statistics courses at universities, and even in
government and industry (Alin 2010). In this study, Minitab was used to perform statistical
analysis of the collected data and to identify the relationships between different parameters.
It can also generate plots that show the relevant parameters of the data.
3.4.2. Weka
Weka stands for “Waikato Environment for Knowledge Analysis” and was developed at the
University of Waikato, New Zealand. It is free software that contains a collection of
algorithms for data analysis and predictive modeling, combined with graphical user interfaces
for easy use of these functions. In this study, the machine learning algorithms were run in
Weka.
3.5. Description of the Research of Joon Joo Kim (JJ)
This study is compared with Joon Joo Kim’s study. To simplify the description, XB refers to
Xingbai Zhang and JJ refers to Joon Joo Kim throughout the paper. The method of JJ’s study
focused on the development of a design preference prediction model based on the design
preference questionnaires (see Table 3-2). The method of the author’s (XB) study concentrated
on the development of a design preference prediction model based on human physiological
signals. The purpose of using JJ’s study is to compare these two methods and determine which
one provides better accuracy in design preference prediction. The author compared JJ’s data
with the individual preference prediction models based on each subject’s data to identify any
significant difference between the two methods.
The overall workflow of JJ’s research is aimed at achieving the proposed goal, which is to
develop a responsible building façade design guideline model. A user preference study for data
collection was conducted online in order to create a relevant database for preference
prediction. The goal of the user preference study was to collect as much data as possible
through a user preference survey in order to create a more responsible database with fewer
differences, and then to refine the database to support statistical analysis and predictive
machine learning.
Figure 3-7. The methodology of JJ’s research
3.5.1. Data Collection
A user preference survey on building façade design was conducted to collect the necessary data.
It is critical to develop an effective data collection system that generates high-quality data
and makes the entire data collection process iterative. User responses to the survey’s
questionnaires are directly transformed into useful data. The user preference survey combines
images of building façades with preference questions.
3.5.2. The Selection of Building Façade
A total of 50 mid- and high-rise buildings designed in modern times were deliberately chosen to
convey the most distinctive building façade designs and to be used as visual benchmarks for
measuring façade design preferences. The selection standard was mainly aimed at offering survey
participants a broad range of façade design features without bias toward a specific design
style. Figure 3-3 shows examples of the buildings selected for the survey.
Prior to the establishment of the preference survey questionnaire, the façade design evaluation
parameters were determined, such as the size of windows, module pattern, and glass reflectivity.
On the basis of these parameters, the façade design features of each building were categorized
and distributed as evenly as possible for the purposes of later preference data analysis.
Table 3-3 shows a sample of building façade design parameters.
Table 3-3. Building Façade Design Parameters Sample
Building: #7: 500 Capitol Mall Tower
Façade Design Parameters              Design Categories Feature
1. Aspect Ratio                       Rectangular
2. Height                             Tall
3. Proportion/form                    Thin
4. Material                           Glass, Concrete
5. Color                              Blue
6. No. of Window                      Many
7. WWR                                High
8. Color of Window                    Blue
9. Transparency of Window             Low
10. Reflectivity of Window            Low
11. Module Pattern                    Regular
12. Roughness                         Low
13. Depth of the Wall and Window      Low
3.5.3. User Preference Survey on Building Façade Design
A survey of user preference was performed to collect actual data on user preferences that could
represent the perspective of a broader population. In this research, the survey consists of a
set of questions asking for the preference level for features of the provided building façades,
consistent with the previously defined design evaluation parameters. The basic structure of the
survey uses the typical Likert-scale format with a 5 point scale system for quantitatively
measuring the preference level for data analysis purposes. Each question has 5 response
alternatives as follows: -2: strongly dislike, -1: dislike, 0: neutral, +1: like, +2: strongly
like. Table 3-2 shows the complete preference survey, which includes 15 questions on the 5 point
scale.
The survey consists of a total of 15 questions. The first 13 questions ask about preferences for
individual design features, while the other two questions ask about the overall preference
level for the chosen building and façade design. In addition, each participant was asked to
provide basic information on human factors so that all answers could be organized and examined
on the basis of demographic parameters, such as gender and age, as shown in Table 3-1. Apart
from building images and survey questions, no additional information about the functionality of
the façade was included in the survey, in order to suppress any potential impact on
participants’ preferences. The survey was created digitally by using Google Forms to make it
easier for prospective participants to access the survey, and Google Forms supports the data
collection process by displaying every survey response in an Excel format. Participants could
quickly launch an online survey with special assistance.
3.5.4. Survey Procedure
The survey participants filled out the Google Forms, taking as much time as they wanted to
complete the survey. Because of the virtual setup of the survey, there was no restriction on
time or place, but it was recommended that the survey be completed in 85 minutes with no more
than one break: 5 minutes for the introduction page, 1.5 minutes per building, and a 5-minute
break in the middle. Figure 3-8 illustrates the recommended survey timeline. All participants
were free to end the survey at any time.
Figure 3-8. Survey Timeline Recommendation
Participants started the Google Forms survey by agreeing to the consent form, which included an
overview of the research covering the study goals, participant rights, risks, and other
pertinent information. Before proceeding to the main preference questionnaire pages,
participants were asked to provide basic individual information such as name, age, and gender.
The survey began on the second page with an image of the first of the 50 buildings and the 15
questions. The survey repeated the same questions for each of the 50 building images. Each
participant thus answered the same 15 questions on 50 separate building façade designs during
the recommended 85-minute survey period.
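The recommended duration and the number of answers per participant follow directly from the figures quoted above. The short calculation below is only a worked check; the constant names are chosen here for illustration.

```python
# Figures taken from the survey description; the names are illustrative.
N_BUILDINGS = 50
N_QUESTIONS = 15
INTRO_MIN = 5.0          # introduction page
PER_BUILDING_MIN = 1.5   # recommended time per building
BREAK_MIN = 5.0          # single break in the middle

# 5 + 50 * 1.5 + 5 = 85 minutes in total.
total_minutes = INTRO_MIN + N_BUILDINGS * PER_BUILDING_MIN + BREAK_MIN

# 50 buildings * 15 questions = 750 answers per participant.
responses_per_participant = N_BUILDINGS * N_QUESTIONS
```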
The research concluded with a total of 30 volunteers participating in the survey. The
participants were a combination of graduate and undergraduate students, all current students at
the University of Southern California (USC) with a background in architecture. Participants
were mainly in their twenties, with a couple of exceptions. There was no target gender ratio for
this survey, although a balanced ratio would be preferable for impartial preference analysis
and prediction.
3.5.5. Algorithms for Data Analysis
After completion of the data collection, the collected data were analyzed using both statistical
and machine learning techniques. Stepwise regression was used for the statistical analysis, and
the two machine learning techniques were the artificial neural network and the decision tree.
3.5.6. Statistical Analysis
a) Stepwise regression analysis
In order to predict the preferences of each participant, the research concentrated on analyzing
the findings with stepwise regression. In this research, stepwise regression identified the
design parameters that affect the final design preference of participants. The stepwise
regression analysis was performed on the data of each participant and on the entire dataset to
compare the design preference parameters of each participant with the overall design
preferences.
3.5.7. Machine Learning Technique
a) Decision Tree
The decision tree is a tree-like machine learning classification technique. It normally uses
multiple independent variables to determine the dependent variable of a new sample. It offers a
decision guideline in tree form, which guides the decision-making process. The decision tree
algorithm was run in Weka.
b) Artificial Neural Network
Artificial neural network (ANN) is one of the well-known machine learning algorithms used for
classification. The theory of an artificial neural network derives from the biological principle
of neurons. It consists of an input layer of nodes, one or two hidden layers of nodes, and a
final layer of output nodes.
3.5.8. Data Analysis Tool
Responses to the survey need to be analyzed to reveal any useful findings for further study.
Two data analysis tools were used for statistical analysis and machine learning analysis:
Minitab and Weka.
3.6. Best Prediction Model
After each model was developed by using the above algorithms, the accuracy of each model, in
both the general preference prediction model and the individual preference prediction model,
needed to be compared with JJ’s results to determine the best prediction model. The root mean
squared error (RMSE) was used to determine the best individual preference prediction model. A
lower RMSE indicates better accuracy of the model.
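For reference, RMSE is the square root of the mean of the squared differences between the actual and predicted preference scores. The following is a minimal sketch of how models could be ranked by RMSE; the helper names and the example scores are hypothetical and are not results from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long score lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def best_model(actual, predictions_by_model):
    """Return the model name with the lowest RMSE, plus all RMSE values."""
    scores = {name: rmse(actual, preds) for name, preds in predictions_by_model.items()}
    return min(scores, key=scores.get), scores


# Hypothetical preference scores on the 5 point numeric scale.
actual_scores = [1, 0, -1, 2]
example_predictions = {
    "model_a": [1, 0, 0, 1],
    "model_b": [0, 1, 1, 0],
}
winner, rmse_by_model = best_model(actual_scores, example_predictions)
```

Because RMSE is expressed in the same units as the preference scale, an RMSE of 1 corresponds to an average miss of about one scale point.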
4. Results and Discussion
4.1. Data Preprocessing
The data collected from the experiment include the human physiological signals and the
questionnaires. Three wearable sensors were used in this study: Empatica Embrace, Garmin
Vivosmart 3, and EMOTIV EPOC X. The number of datasets is large, and each wearable sensor
provides a different format. Therefore, the raw data needed to be exported into a unified
format so that they could be integrated.
4.1.1. Raw Data
Each subject has four sets of raw data files: Empatica Embrace, Garmin Vivosmart 3, EMOTIV
EPOC X (EEG), and the questionnaire. The Empatica Embrace folder has 2 subfolders, which contain
the EDA and skin temperature files. The Garmin Vivosmart 3 folder also has 2 subfolders, which
contain the heart rate (HR) and stress level files. The EMOTIV EPOC X folder has one subfolder,
which contains the EEG file. There are 30 subjects in total, and the raw data of each subject
are put into one folder. These raw data are converted, extracted, and integrated by different
methods. The detailed steps are discussed below.
4.1.2. Conversion and Extraction
The formats of these four raw data files are different, so the processes for converting them to
the final unified format also differ. The author wrote and used several Python programs to
perform the extraction and conversion. Table 4-1 shows the corresponding Python programs used
to extract and convert all raw data into a unified format.
Table 4-1. Raw Data Format Conversion
Category                    Data Source                Raw Data Format   Extraction/Conversion   Final Data Format
Human Physiological Data    Empatica Embrace           .csv              Embrace.py              .csv and xlsx
Human Physiological Data    Garmin Vivosmart 3         .fit              Garmin-fit.py           .csv and xlsx
Human Physiological Data    EMOTIV EPOC X              .csv              EEG.py                  .csv and xlsx
Questionnaire               Design Preference Survey   .csv              Questionnire.py         .csv and xlsx
The author collected 750 questionnaire responses per subject (15 questions for each of the 50
buildings) from the design preference survey created on Google Forms. The data can be exported
in csv format and opened in Excel. Because the whole file contains a large amount of data, only
a sample of the combined questionnaire data of the 30 subjects is shown in Figure 4-1.
Figure 4-1. Design Preference Survey Data File Sample
The three wearable sensors measured human physiological signals, and the author used different
methods to convert the original files into the final unified format. The Empatica Embrace
measured the raw EDA and skin temperature data; the Time column in the original file is
represented as a Unix timestamp (UTC). In order to convert the UTC time into PST, the author
wrote a Python program named Empatica.py to make the time easily readable. The code of this
program is attached in Appendix A. A sample of the converted EDA and skin temperature file is
shown in Figure 4-2.
Figure 4-2. Sample of EDA and Skin Temperature Data Conversion
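The timestamp conversion can be illustrated with the Python standard library. This sketch is not the program in Appendix A: the function name `unix_to_pst` is hypothetical, the timestamp unit (seconds or milliseconds) is an assumption exposed as a parameter, and a fixed UTC-8 offset is used for PST, so daylight saving time is deliberately ignored.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PST is modeled as a fixed UTC-8 offset (no daylight saving).
PST = timezone(timedelta(hours=-8), name="PST")


def unix_to_pst(timestamp, in_milliseconds=False):
    """Convert a Unix timestamp (UTC) to a readable PST date-time string."""
    seconds = timestamp / 1000 if in_milliseconds else timestamp
    return datetime.fromtimestamp(seconds, tz=PST).strftime("%Y-%m-%d %H:%M:%S")


# 1609459200 is 2021-01-01 00:00:00 UTC, which is 16:00 on the previous day in PST.
example_time = unix_to_pst(1609459200)
```

The same conversion applies to any sensor export that stores its time column as a Unix timestamp.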
The heart rate and heart rate variability (HRV)/stress level were measured by the Garmin
Vivosmart 3. This wearable sensor can export raw data directly only in the fit format, a file
format owned by the Garmin company that cannot be read directly. In order to convert the fit
file into a csv file, the author wrote a Python program named Garmin_fit.py to make the file
readable. This program also converts the Unix timestamp into a readable date and time. The code
for this program is attached in Appendix B. A sample of the converted file is shown in
Figure 4-3.
Figure 4-3. Sample of Heart Rate and Heart Rate Variability/Stress Level Data Conversion
The EMOTIV EPOC X measured EEG signals, which include Theta, Alpha, Low Beta, High Beta, and
Gamma. The raw data can be exported in csv format, and the file can be opened in Excel, but its
contents are hard to read because the file contains many blank cells and data that are
unnecessary for this research. Because of the large amount of data, manual sorting would take a
very long time. The author therefore wrote a program named eeg.py to select the necessary data
for this research and write them to a new csv file. The code for this program is attached in
Appendix C. A sample of the converted file is shown in Figure 4-4.
Figure 4-4. Sample of EEG Data Conversion
4.1.3. The Integration of Data
After the format conversion of each file was completed, the files also had to be aligned in
time so that they could be integrated into one file. The integrated file was then imported into
each machine learning algorithm for data analysis.
The EDA acquisition frequency is 240 Hz, which means 240 x 60 (14,400) data points per minute.
The acquisition frequency of skin temperature is 60 Hz, which means 60 x 60 (3,600) data points
per minute. Heart rate and HRV/stress level have one data point per minute in the original raw
data file. The EEG is sampled internally at 2048 Hz and then downsampled to 128 Hz for each
channel. Each subject took a different duration to finish the design preference survey (50
buildings), and the time spent on each building (15 design preference questions) also varied.
The author therefore adopted an averaging method: for each signal, the average of all data
points recorded while the subject was evaluating a building is taken as the value for that
building. In this way, all the data are unified to one value per signal per building. Without
this averaging, the data file would contain a very large number of rows at mismatched sampling
rates and would be difficult to read. With averaging, the data file is clean and easy to read,
and every collected data point still contributes to the analysis. The combined master data file
per subject is shown in Figure 4-5. The combined file of all subjects’ data is shown in
Figure 4-6.
Figure 4-5. Sample of Data File Per Subject After Integrating
Figure 4-6. Sample of Combined All Subjects Data in One Sheet
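The averaging step can be sketched as follows: every sample of a signal that falls inside the time window a subject spent on a building is averaged into a single value for that building. The function name, the data layout (timestamp and value pairs, plus a start and end time per building), and the example numbers are assumptions for illustration and are not taken from the author's programs.

```python
from statistics import mean


def average_per_building(samples, windows):
    """Average one signal over each building's time window.

    samples: list of (timestamp, value) pairs for one signal.
    windows: list of (building_id, start, end) tuples; 'end' is exclusive.
    Returns {building_id: mean value, or None if no sample fell in the window}.
    """
    result = {}
    for building_id, start, end in windows:
        values = [v for t, v in samples if start <= t < end]
        result[building_id] = mean(values) if values else None
    return result


# Illustrative data: four samples, three building windows (times in seconds).
example_samples = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)]
example_windows = [(1, 0, 2), (2, 2, 4), (3, 4, 6)]
per_building = average_per_building(example_samples, example_windows)
```

Applying the same function to each signal yields one row per building, regardless of whether the signal was recorded at 240 Hz or once per minute.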
4.1.4. Data Cleansing
As described above, each subject has an integrated file, but before integration, data cleansing
is required to make sure all data are valid.
Data cleansing in this research means removing all incorrect data from the collected data,
including negative values. Negative values appeared in the data from the three wearable sensors
because the sensors were not attached tightly to the skin at some points during data
collection; the data at those times were therefore collected incorrectly and resulted in
negative values. The author selected these negative values by using the filter function in
Excel and then deleted them. Data outside the normal range also needed to be removed. For heart
rate, values less than 50 BPM or greater than 130 BPM were treated as incorrect data and
deleted. For EDA, values higher than 0 were considered correct. For skin temperature, data
measured on the wrist are expected to range between 26°C and 35°C when the air temperature is
between 20°C and 30°C (J. H. Choi, Loftness, and Lee 2012). Because the experiment site was an
indoor space with an automatic air conditioning system controlling the indoor temperature, the
indoor air temperature was between 20°C and 30°C. Thus, skin temperature values less than 26°C
or greater than 35°C were deleted.
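The cleansing rules above were applied with Excel filters in the study; the same rules can be expressed programmatically as a minimal sketch. The signal keys and function names below are hypothetical, while the thresholds are the ones stated in the text.

```python
# Valid ranges stated in the text (inclusive bounds).
VALID_RANGES = {
    "heart_rate": (50.0, 130.0),        # BPM
    "skin_temperature": (26.0, 35.0),   # degrees Celsius
}


def is_valid(signal, value):
    """Return True if a reading passes the cleansing rules for its signal."""
    if value < 0:
        return False                    # negative values indicate poor skin contact
    if signal == "eda":
        return value > 0                # EDA readings must be higher than 0
    if signal in VALID_RANGES:
        low, high = VALID_RANGES[signal]
        return low <= value <= high
    return True                         # no range rule stated for other signals


def clean(signal, values):
    """Keep only the readings that pass the rules."""
    return [v for v in values if is_valid(signal, v)]


cleaned_hr = clean("heart_rate", [45, 50, 72, 130, 131, -1])
```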
4.2. Data Analysis
Based on the collected data, which consist of human physiological signals (HR, HRV/Stress
Level, EDA, Skin Temperature, EEG) and questionnaire data, this study conducted statistical
analysis to explore and define the correlation between design preference and human
physiological signals. The design preference survey used a 5 point scale (-2, -1, 0, 1, 2) to
represent the design preference: -2 represents strongly dislike, -1 represents dislike, 0
represents neutral, 1 represents like, and 2 represents strongly like.
Table 4-2. Scale Representing Design Preference
Scale Design Preference
-2 Strongly Dislike
-1 Dislike
0 Neutral
1 Like
2 Strongly like
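The scale in Table 4-2 maps directly to a lookup table, which is also how the numeric and nominal versions of the datasets relate to each other. The sketch below additionally shows one possible way to derive the 3 point scale from the 5 point scale by merging the "strongly" options into their neighbors; this collapsing rule is an assumption, since the text does not state how the 3 point datasets were produced.

```python
FIVE_POINT_LABELS = {
    -2: "strongly dislike",
    -1: "dislike",
    0: "neutral",
    1: "like",
    2: "strongly like",
}
THREE_POINT_LABELS = {-1: "dislike", 0: "neutral", 1: "like"}


def to_three_point(score):
    """Collapse a 5 point score to 3 points (assumed rule: keep only the sign)."""
    if score not in FIVE_POINT_LABELS:
        raise ValueError(f"not a valid 5 point score: {score}")
    return (score > 0) - (score < 0)


def to_nominal(score, three_point=False):
    """Return the text label for a numeric score on either scale."""
    if three_point:
        return THREE_POINT_LABELS[to_three_point(score)]
    return FIVE_POINT_LABELS[score]


collapsed = [to_three_point(s) for s in (-2, -1, 0, 1, 2)]
```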
4.2.1. Design Preference Impacted by Gender
Figure 4-7 summarizes the design preference by gender. In the female group, the most frequently
selected design preferences for the provided buildings were the dislike (point=-1) and neutral
(point=0) options. The second most selected design preference was the like (point=1) option,
and the least selected design preferences were strongly dislike (point=-2) and strongly like
(point=2). The results indicate that the design preferences of the female group mostly ranged
between dislike (point=-1) and like (point=1). In the male group, the most selected design
preference for the provided buildings was like (point=1), and the second most selected design
preferences were dislike (point=-1) and neutral (point=0). The least selected design
preferences were strongly dislike (point=-2) and strongly like (point=2). As a result, the
design preferences of the male group also mostly ranged between dislike (point=-1) and like
(point=1). The comparison by gender indicates that the male group selected like (point=1) more
often than the female group and selected dislike (point=-1) and neutral (point=0) less often.
For the strongly dislike (point=-2) and strongly like (point=2) options, the two groups had a
similar number of selections.
Figure 4-7. Overall Design Preference Impacted by Gender
The mean design preference by gender is illustrated in the interval plot in Figure 4-8. The
mean design preference of the female group is slightly below 0, at an average of -0.04, which
means that their design preference is below neutral satisfaction. In contrast, the mean design
preference of the male group is slightly above 0, at an average of 0.08, which means that their
design preference is above neutral satisfaction. Therefore, the results show that the male
group liked the provided buildings more than the female group. A two-sample t-test confirmed
that the difference is marginally significant, with a P-Value of 0.064 (Table 4-3).
Figure 4-8. Overall Design Preference Impacted by Gender
Table 4-3. P-Value in Two-Sample T-test
Test
P-Value
0.064
4.2.2. Design Preference Impacted by Heart Rate
Table 4-4. ANOVA Test
P-Value
Female 0.004
Male 0.001
The relationship between design preference and heart rate in the female group is illustrated in
Figure 4-9. When strongly dislike (point=-2) was selected, the mean heart rate of the female
group was around 83 BPM. When strongly like (point=2) was selected, the mean heart rate was
around 78 BPM; the difference in mean heart rate between these two selections was small, only
around 5 BPM. For the dislike and like selections, the mean heart rate was around 81 BPM in
both cases. The mean heart rate was around 79 BPM when neutral (point=0) was selected.
Therefore, in the female group, a higher heart rate was associated with strongly dislike
(point=-2), and a lower heart rate was associated with strongly like (point=2). These heart
rate changes in the female group across the design preference scores yielded a P-Value of
0.004 (Table 4-4) in an ANOVA test, which is highly significant. The correlation index was
estimated at -0.098, which indicates a weak negative correlation between design preference and
heart rate.
Figure 4-9. Design Preference Impacted by Heart Rate in Female Group.
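The text does not state how the correlation index was computed; assuming it is a Pearson correlation coefficient, the calculation can be sketched as below. The example uses only the five approximate group means quoted above for the female group, so its magnitude is not comparable to the reported value of -0.098, which was computed on the full data; only the negative sign is expected to agree.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Preference scores and the approximate mean heart rates (BPM) quoted in the text.
preference = [-2, -1, 0, 1, 2]
mean_heart_rate = [83, 81, 79, 81, 78]
r_group_means = pearson_r(preference, mean_heart_rate)
```

Correlating group means discards the spread of the individual readings, which is why such a value is much larger in magnitude than a coefficient computed on raw data.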
The correlation between design preference and heart rate in the male group is illustrated in
Figure 4-10. The mean heart rate was around 79 BPM when strongly dislike (point=-2) was
selected and around 75 BPM when strongly like (point=2) was selected, a difference of only
around 4 BPM. The mean heart rates for the dislike (point=-1), neutral (point=0), and like
(point=1) options were almost the same, at around 73 BPM. Therefore, in the male group, the
higher the heart rate, the more often strongly like (point=2) was selected, and the lower the
heart rate, the more often strongly dislike (point=-2) was selected. These heart rate changes
in the male group across the design preference scores yielded a P-Value of 0.001 (Table 4-4) in
an ANOVA test, which is highly significant. The correlation index was estimated at 0.121, which
indicates a weak positive correlation between design preference and heart rate in the male
group.
Figure 4-10. Design Preference Impacted by Heart Rate in Male Group.
4.2.3. Design Preference Impacted by Stress Level
Table 4-5. ANOVA Test
P-Value
Female 0.018
Male 0.262
The impact of stress level on design preference in the female group is illustrated in
Figure 4-11. When strongly dislike (point=-2) was selected, the mean stress level value was
around 27, which means that the female group experienced more stress and was in the low-stress
zone when they strongly disliked (point=-2) the building. When strongly like (point=2) was
selected, the mean stress level value was around 23; compared to the strongly dislike
(point=-2) option, the female group experienced less stress. From the dislike (point=-1) to the
like (point=1) option, the mean stress level value increased slightly (18, 20, and 22) but
remained within the resting state range. Therefore, in the female group, the mean stress level
value was in the low-stress zone when strongly dislike (point=-2) was selected and in the
resting state zone when any other option was selected. The stress level changes in the female
group across the design preference scores yielded a P-Value of 0.018 (Table 4-5) in an ANOVA
test, which is statistically significant. The correlation index was estimated at -0.007, which
indicates a very weak negative correlation between design preference and stress level in the
female group.
Figure 4-11. Design Preference Impacted by Stress Level in Female Group.
The impact of stress level on design preference in the male group is illustrated in
Figure 4-12. The mean stress level value was around 11 when strongly dislike (point=-2) was
selected and around 13.5 when strongly like (point=2) was selected. The lowest value in the
male group was around 10.9, when dislike (point=-1) was selected. From the dislike (point=-1)
to the like (point=1) option, the mean stress level increased slightly. In the male group, the
mean stress level was therefore within the resting state zone for all options, which indicates
that the subjects in the male group were very relaxed while answering the questionnaires.
Compared to the female group, the mean stress level of the male group was well below that of
the female group; the male group experienced less stress than the female group while answering
the questionnaires. The stress level changes in the male group across the design preference
scores yielded a P-Value of 0.262 (Table 4-5) in an ANOVA test, which is not significant. The
correlation index was estimated at 0.077, which indicates a weak positive correlation between
design preference and stress level in the male group.
Figure 4-12. Design Preference Impacted by Stress Level in Male Group.
4.2.4. Design Preference Impacted by EDA
Table 4-6. ANOVA Test
P-Value
Female 0.031
Male 0.000
The impact of EDA on design preference in the female group is illustrated in Figure 4-13. The
lowest mean EDA value in the female group was around 0.09, when strongly dislike (point=-2) was
selected. The highest mean EDA value was around 0.14, when dislike (point=-1) was selected. The
second-lowest mean EDA value was also around 0.09, when neutral (point=0) was selected. The
like (point=1) and strongly like (point=2) options had similar mean EDA values of around 0.12.
Therefore, in the female group, the mean EDA value was lower when strongly dislike (point=-2)
or neutral (point=0) was selected, higher when dislike (point=-1) was selected, and similar for
the like (point=1) and strongly like (point=2) selections. These EDA changes in the female
group across the design preference scores yielded a P-Value of 0.031 (Table 4-6) in an ANOVA
test, which is statistically significant. The correlation index was estimated at 0.017, which
indicates a very weak positive correlation between design preference and EDA.
Figure 4-13. Design Preference Impacted by EDA in Female Group.
The impact of EDA on design preference in the male group is illustrated in Figure 4-14. The
lowest mean EDA value in the male group was around 0.2, when strongly dislike (point=-2) was
selected. The highest mean EDA value was around 0.8, when like (point=1) was selected. When
dislike (point=-1) and strongly like (point=2) were selected, the mean EDA values were the
same. Compared with the female group, the male group had its highest mean EDA value when like
(point=1) was selected, whereas the female group had its highest mean EDA value when dislike
(point=-1) was selected. In both groups, the lowest mean EDA value occurred when strongly
dislike (point=-2) was selected. These EDA changes in the male group across the design
preference scores yielded a P-Value of 0.000 (Table 4-6) in an ANOVA test, which is highly
significant. The correlation index was estimated at 0.118, which indicates a weak positive
correlation between design preference and EDA in the male group.
Figure 4- 14. Design Preference Impacted by EDA in Male Group.
4.2.5. Design Preference Impacted by Skin Temperature
Table 4- 7. ANOVA Test
P-Value
Female 0.258
Male 0.002
As illustrated in Figure 4-15, the highest mean skin temperature in the female group is around
29.6, observed when strongly like (point=2) was selected. The lowest mean skin temperature in the
female group is around 28.85, observed when dislike (point=-1) and neutral (point=0) were
selected. The second-highest mean skin temperature is around 29.25, observed when strongly dislike
(point=-2) was selected. Therefore, in the female group, skin temperature is higher when strongly
like (point=2) was selected and lower when dislike (point=-1) and neutral (point=0) were selected.
These skin temperature changes in the female group across the design preference scores yielded a
P-Value of 0.258 (Table 4-7) in an ANOVA test, which is not statistically significant. The
correlation index was estimated at 0.034, which indicates a weak positive correlation between
design preference and skin temperature in the female group.
Figure 4- 15. Design Preference Impacted by Skin Temperature in Female Group.
As illustrated in Figure 4-16, the highest mean skin temperature in the male group is around 30.3,
observed when strongly like (point=2) was selected, and the lowest is around 29.4, observed when
neutral (point=0) was selected. From strongly dislike (point=-2) through dislike (point=-1) to
neutral (point=0), the mean skin temperature decreases slightly, with values of around 30, 29.6,
and 29.4. When like (point=1) was selected, the mean skin temperature is around 30.2, slightly
lower than for the strongly like (point=2) option. Therefore, the highest mean skin temperature
occurs in both groups when strongly like (point=2) was selected. The lowest mean skin temperature
occurs when neutral (point=0) was selected in the male group, and when dislike (point=-1) and
neutral (point=0) were selected in the female group. These skin temperature changes in the male
group across the design preference scores yielded a P-Value of 0.002 (Table 4-7) in an ANOVA test,
which is highly significant. The correlation index was estimated at 0.082, which indicates a weak
positive correlation between design preference and skin temperature in the male group.
Figure 4- 16. Design Preference Impacted by Skin Temperature in Male Group.
4.3. Summary
Chapter 4 discussed data preprocessing and results, both of which are necessary for this research.
The raw data exported from the three wearable sensors are voluminous, hard to read, and unsuitable
for later machine learning analysis. Therefore, the author wrote Python programs to reorganize,
integrate, and clean the data. The data cleansing process removes incorrect data points, namely
negative values and values outside the normal range, which would otherwise affect the results.
After preprocessing, all the data were combined into one master file containing the datasets from
30 subjects. This chapter also provided statistical analysis in Minitab to show the correlation
between design preference and different human physiological signals. As discussed, heart rate is
statistically highly significant in both the female and male groups. Stress level is statistically
significant in the female group and not significant in the male group. EDA is statistically
significant in the female group and highly significant in the male group. Skin temperature is not
statistically significant in the female group and highly significant in the male group.
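The cleansing step described above can be sketched as a simple row filter. This is only an illustration: the field names and the numeric ranges below are hypothetical placeholders, since the thresholds the author actually used are not listed in this chapter.

```python
# Hypothetical plausible ranges per signal; NOT the thresholds used in the thesis.
VALID_RANGES = {
    "heart_rate": (30, 220),
    "eda": (0.0, 50.0),
    "skin_temp": (20.0, 45.0),
}


def is_valid(row):
    """Return True if every checked signal is present, non-negative and in range."""
    for name, (low, high) in VALID_RANGES.items():
        value = row.get(name)
        if value is None or value < 0 or not (low <= value <= high):
            return False
    return True


def clean(rows):
    """Keep only the rows whose signals pass the range checks."""
    return [row for row in rows if is_valid(row)]


# Example rows: the second has a negative reading, the third is out of range.
raw_rows = [
    {"heart_rate": 78, "eda": 0.028, "skin_temp": 29.27},
    {"heart_rate": -1, "eda": 0.029, "skin_temp": 29.26},
    {"heart_rate": 77, "eda": 0.029, "skin_temp": 99.0},
]
cleaned_rows = clean(raw_rows)
```

Dropping whole rows keeps the remaining signals time-aligned, at the cost of losing the valid readings that share a row with an invalid one.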
5. Development of Design Preference Prediction Model using Machine Learning
Algorithms
To develop the design preference prediction model, five machine learning algorithms were used to
analyze the datasets: Linear Regression, Artificial Neural Network, Random Forest Regressor,
Decision Tree, and Random Forest Classifier. In this study, the general design preference
prediction models used the datasets of all 30 subjects, which include 1500 instances. Each
individual design preference prediction model used a single subject's dataset, which includes 50
instances. All the machine learning algorithms were run in Weka, so all files had to be converted
into ARFF format to meet the Weka file format requirement. The author wrote a Python program to
convert all files into ARFF format; the code for this program is attached in Appendix D. After
developing the design preference prediction models with the different machine learning algorithms,
the author used root mean squared error (RMSE) to compare each model's error rate.
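The author's actual conversion program is in Appendix D and is not reproduced here. As a minimal sketch of what an ARFF file contains, the following hypothetical helper writes a relation name, one attribute declaration per column, and the data rows. The function name, relation name, and attribute names are assumptions made for illustration.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text.

    attributes: list of (name, kind) pairs, where kind is the string "numeric"
    or a list of nominal labels. rows: iterable of value tuples in column order.
    """
    lines = ["@relation %s" % relation, ""]
    for name, kind in attributes:
        if kind == "numeric":
            lines.append("@attribute %s numeric" % name)
        else:
            # Nominal attributes list their allowed labels in braces.
            lines.append("@attribute %s {%s}" % (name, ",".join(kind)))
    lines.extend(["", "@data"])
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical two-signal example with a 3 point nominal class.
arff_text = to_arff(
    "design_preference",
    [
        ("heart_rate", "numeric"),
        ("skin_temp", "numeric"),
        ("Q15", ["dislike", "neutral", "like"]),
    ],
    [(78, 29.27, "dislike"), (80, 29.31, "neutral")],
)
```

Declaring Q15 as numeric or as nominal in the header is what lets the same measurements feed either the regression or the classification algorithms.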
5.1. General Design Preference Prediction Model using Linear Regression (LR)
Linear regression is an approach that determines the relationship between two or more parameters
that have a cause-and-effect relation and uses that relationship to make predictions for a
selected parameter. In this study, question 15 of the questionnaire is the parameter to be
predicted. For this machine learning algorithm, the author used two types of input datasets: a 5
point scale in numbers and a 3 point scale in numbers. The 5 point scale in numbers used -2, -1,
0, 1, 2 to represent the options strongly dislike, dislike, neutral, like, and strongly like. The
3 point scale in numbers used -1, 0, 1 to represent the options dislike, neutral, and like. The
linear regression model was run in Weka with the 10-fold cross-validation test option.
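The study relied on Weka's built-in 10-fold cross-validation option. To make the idea concrete, the sketch below splits instance indices into ten folds so that each instance is used for testing exactly once. The function name and the seed are hypothetical, and Weka's own shuffling and fold assignment may differ in detail.

```python
import random


def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# The general model uses 1500 instances, so each fold holds 150 of them.
splits = list(k_fold_indices(1500, k=10))
```

Each of the ten models is trained on 1350 instances and evaluated on the remaining 150, and the reported error aggregates the ten evaluations.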
Table 5- 1. Datasets by using 5 Point Scale in Numbers
Heart rate  Stress level value  EDA    Skin temp  Theta  Alpha  BetaL  BetaH  Gamma  Q15
78          17                  0.028  29.27      2.59   0.93   0.54   0.3    0.18   -2
77          19                  0.029  29.26      2.41   1.05   0.56   0.31   0.21   -2
77          19                  0.029  29.26      2.41   1.05   0.56   0.31   0.21   -1
75          10                  0.029  29.15      1.82   0.98   0.58   0.35   0.23   -1
80          22                  0.027  29.31      1.96   0.9    0.6    0.39   0.42   0
80          21                  0.027  29.3       2.02   0.98   0.54   0.29   0.17   0
76          13                  0.028  29.22      2.52   0.96   0.56   0.36   0.32   1
74          18                  0.029  29.18      2.25   1.03   0.62   0.44   0.31   1
58          13                  0.071  29.74      1.26   0.65   0.46   0.28   0.15   2
56          13                  0.073  30.88      1.59   0.75   0.49   0.29   0.17   2
Table 5- 2. Datasets by using 3 Point Scale in Numbers
Heart rate  Stress level value  EDA    Skin temp  Theta  Alpha  BetaL  BetaH  Gamma  Q15
78          17                  0.028  29.27      2.59   0.93   0.54   0.3    0.18   -1
77          19                  0.029  29.26      2.41   1.05   0.56   0.31   0.21   -1
80          22                  0.027  29.31      1.96   0.9    0.6    0.39   0.42   0
80          21                  0.027  29.3       2.02   0.98   0.54   0.29   0.17   0
76          13                  0.028  29.22      2.52   0.96   0.56   0.36   0.32   1
74          18                  0.029  29.18      2.25   1.03   0.62   0.44   0.31   1
Table 5- 3. Results in Linear Regression (LR)
Linear Regression Model
RMSE
5 Point Scale in Numbers 1.2637
3 Point Scale in Numbers 0.8755
The simulation results of the linear regression model are divided into several parts for simple
analysis and evaluation. Table 5-3 shows that the root mean squared error (RMSE) using the 5 point
scale in numbers is 1.2637. There are several ways to check the accuracy of a linear regression
model; this study uses RMSE, where a lower RMSE indicates better model accuracy.
The linear regression model using the 3 point scale in numbers (Table 5-3) has an RMSE of 0.8755,
compared with 1.2637 for the 5 point scale in numbers. Therefore, in the linear regression model,
the 3 point scale in numbers has a lower RMSE value and performs better than the 5 point scale in
numbers.
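For reference, RMSE is the square root of the mean of the squared differences between the actual and predicted values. The values reported in this chapter come from Weka; the sketch below only illustrates the definition, using a made-up set of answers and predictions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative only: a model that always predicts "neutral" (0) on a 5 point scale.
answers = [-2, -1, 0, 1, 2]
predictions = [0, 0, 0, 0, 0]
baseline_error = rmse(answers, predictions)
```

Because RMSE is expressed in the units of the target, values obtained on the 5 point scale and on the 3 point scale are not directly on the same footing.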
5.2. General Design Preference Prediction Model using Artificial Neural Network (ANN)
An Artificial Neural Network (ANN) is a computational model capable of machine learning and
pattern recognition. It is presented as a system of interconnected parameters that computes a
value from its inputs. For this machine learning algorithm, the author used the same two types of
input datasets as for linear regression: the 5 point scale in numbers (Table 5-1) and the 3 point
scale in numbers (Table 5-2). The 5 point scale in numbers used -2, -1, 0, 1, 2 to represent the
options strongly dislike, dislike, neutral, like, and strongly like. The 3 point scale in numbers
used -1, 0, 1 to represent the options dislike, neutral, and like. The Artificial Neural Network
model was run in Weka with the 10-fold cross-validation test option.
Table 5- 4. Results in Artificial Neural Network
Artificial Neural Network
RMSE
5 Point Scale in Numbers 1.3671
3 Point Scale in Numbers 0.9503
The result of the ANN model using the 5 point scale in numbers, as illustrated in Table 5-4,
indicates that the root mean squared error (RMSE) is 1.3674. The result of the ANN using the 3
point scale in numbers indicates that the RMSE is 0.9503, which is lower than the value obtained
with the 5 point scale in numbers. Therefore, the ANN using the 3 point scale in numbers performs
better than the ANN using the 5 point scale in numbers.
5.3. General Design Preference Prediction Model using Decision Tree (DT)
A decision tree is an approach that can be used to visually represent decisions and decision
making. It uses a tree-like model of decisions. A decision tree is also a classifier that can be
expressed as a recursive partition of the instance space. For the decision tree algorithm, the
author used two types of input datasets: a 5 point scale in nominal and a 3 point scale in
nominal. The 5 point scale in nominal used the text labels strongly dislike, dislike, neutral,
like, and strongly like to represent the results. The 3 point scale in nominal used the text
labels dislike, neutral, and like. The decision tree was run in Weka with the 10-fold
cross-validation test option.
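The numeric and nominal datasets carry the same answers in two encodings. A minimal sketch of the correspondence, using the label spellings that appear in Tables 5-5 and 5-6, is shown below; the dictionary and function names are hypothetical.

```python
# Numeric score -> nominal label, following the scales described in the text.
FIVE_POINT_LABELS = {
    -2: "strongly_dislike",
    -1: "dislike",
    0: "neutral",
    1: "like",
    2: "strongly_like",
}
THREE_POINT_LABELS = {-1: "dislike", 0: "neutral", 1: "like"}


def to_nominal(score, labels):
    """Translate a numeric Q15 score into its nominal label."""
    if score not in labels:
        raise ValueError("score %r is not on this scale" % (score,))
    return labels[score]


nominal_answers = [to_nominal(s, FIVE_POINT_LABELS) for s in (-2, -1, 0, 1, 2)]
```

With nominal labels the algorithm treats Q15 as a class to be predicted rather than as a quantity, which is why the decision tree and random forest classifier use this encoding.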
Table 5- 5. Datasets by using 5 Point Scale in Nominal
Heart rate  Stress level value  EDA    Skin temp  Theta  Alpha  BetaL  BetaH  Gamma  Q15 (Building Design)
78          17                  0.028  29.27      2.59   0.93   0.54   0.3    0.18   strongly_dislike
77          19                  0.029  29.26      2.41   1.05   0.56   0.31   0.21   strongly_dislike
77          19                  0.029  29.26      2.41   1.05   0.56   0.31   0.21   dislike
75          10                  0.029  29.15      1.82   0.98   0.58   0.35   0.23   dislike
80          22                  0.027  29.31      1.96   0.9    0.6    0.39   0.42   neutral
80          21                  0.027  29.3       2.02   0.98   0.54   0.29   0.17   neutral
76          13                  0.028  29.22      2.52   0.96   0.56   0.36   0.32   like
74          18                  0.029  29.18      2.25   1.03   0.62   0.44   0.31   like
58          13                  0.071  29.74      1.26   0.65   0.46   0.28   0.15   strongly_like
56          13                  0.073  30.88      1.59   0.75   0.49   0.29   0.17   strongly_like
Table 5- 6. Datasets by using 3 Point Scale in Nominal
Heart rate  Stress level value  EDA    Skin temp  Theta  Alpha  BetaL  BetaH  Gamma  Q15 (Building Design)
78          17                  0.028  29.27      2.59   0.93   0.54   0.3    0.18   dislike
77          19                  0.029  29.26      2.41   1.05   0.56   0.31   0.21   dislike
80          22                  0.027  29.31      1.96   0.9    0.6    0.39   0.42   neutral
80          21                  0.027  29.3       2.02   0.98   0.54   0.29   0.17   neutral
76          13                  0.028  29.22      2.52   0.96   0.56   0.36   0.32   like
74          18                  0.029  29.18      2.25   1.03   0.62   0.44   0.31   like
Table 5- 7. Results in Decision Tree
Decision Tree
RMSE
5 Point Scale in Nominals 0.4834
3 Point Scale in Nominals 0.5112
The result of the decision tree using the 5 point scale in nominal, as illustrated in Table 5-7,
indicates that the root mean squared error (RMSE) is 0.4834. The result of the decision tree using
the 3 point scale in nominal indicates that the RMSE is 0.5112. Therefore, the decision tree using
the 5 point scale in nominal has a lower RMSE value and performs better than the one using the 3
point scale in nominal.
5.4. General Design Preference Prediction Model using Random Forest (RF)
The random forest algorithm, which has been extremely successful as a general-purpose
classification and regression method (Biau and Scornet 2016), was used in this study in two forms:
a random forest classifier and a random forest regressor. Both were used to analyze the data. For
the random forest classifier, the author used two types of input datasets: the 5 point scale in
nominal and the 3 point scale in nominal. The 5 point scale in nominal (Table 5-5) used the text
labels strongly dislike, dislike, neutral, like, and strongly like to represent the results. The 3
point scale in nominal (Table 5-6) used the text labels dislike, neutral, and like. For the random
forest regressor, the author used two types of input datasets: the 5 point scale in numbers (Table
5-1), which used -2, -1, 0, 1, 2 to represent strongly dislike, dislike, neutral, like, and
strongly like, and the 3 point scale in numbers (Table 5-2), which used -1, 0, 1 to represent
dislike, neutral, and like. Both the random forest classifier and the random forest regressor were
run in Weka with the 10-fold cross-validation test option.
Table 5- 8. Results in Random Forest Classifier (RF_Classifier)
Random Forest Classifier
RMSE
5 Point Scale in Nominals 0.4233
3 Point Scale in Nominals 0.5022
For the random forest classifier, the root mean squared error (RMSE) was again used to determine
the quality of the model. As illustrated in Table 5-8, the random forest classifier with the 5
point scale has a lower RMSE value than the random forest classifier with the 3 point scale.
Therefore, the random forest classifier with the 5 point scale performs better than the one with
the 3 point scale.
Table 5- 9. Results in Random Forest Regressor (RF_Regressor)
Random Forest Regressor
RMSE
5 Point Scale in Numbers 1.3713
3 Point Scale in Numbers 0.9531
The results of the random forest regressor, as illustrated in Table 5-9, show that the random
forest regressor with the 3 point scale has a lower root mean squared error (RMSE) value than the
random forest regressor with the 5 point scale. Therefore, the random forest regressor with the 3
point scale performs better than the one with the 5 point scale.
5.5. Comparison of Machine Learning Algorithms in General Design Preference Prediction
Model
The root mean squared error (RMSE) is the one metric reported by every machine learning algorithm
in the general model. Therefore, the RMSE was used to compare the accuracy of the algorithms. To
simplify the results, the RMSE values were normalized by the number of scale points and expressed
as a percentage error rate. For instance, the RMSE of the random forest regressor with the 5 point
scale is normalized as 1.3713/5=0.27=27%, and that of the random forest regressor with the 3 point
scale as 0.9531/3=0.32=32%. The author used this principle to convert all RMSE values into error
rate percentages.
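The normalization above can be written as a one-line helper. The function name is hypothetical, and rounding to the nearest whole percent is an assumption; the tabulated values in the thesis may have been rounded or truncated differently.

```python
def error_rate_percent(rmse, scale_points):
    """Normalize an RMSE by the number of scale points and return a whole percent."""
    return round(rmse / scale_points * 100)


# The two worked examples given in the text for the random forest regressor.
rf_regressor_5_point = error_rate_percent(1.3713, 5)
rf_regressor_3_point = error_rate_percent(0.9531, 3)
```

Dividing by the number of scale points puts the 5 point and 3 point results on a common footing, so that a model is not favored merely because its target spans a narrower range.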
Table 5- 10. Machine Learning Results Comparison in General Model
                 LR                  ANN                 RF_regressor
                 RMSE    Error Rate  RMSE    Error Rate  RMSE    Error Rate
5-Scale Numeric  1.2637  25%         1.3674  27%         1.3713  27%
3-Scale Numeric  0.8755  29%         0.9503  32%         0.9531  32%

                 DT                  RF_classifier
                 RMSE    Error Rate  RMSE    Error Rate
5-Scale Nominal  0.4834  9%          0.4223  8%
3-Scale Nominal  0.5112  17%         0.5022  17%
Figure 5- 1. General Design Preference Prediction Results
The error rates of each machine learning algorithm in the general design preference prediction
model are illustrated in Table 5-10 and Figure 5-1. The random forest classifier with the 5 point
scale has the lowest error rate, at 8%, among all these machine learning algorithms. Hence, in the
general preference prediction model, the random forest classifier with the 5 point scale in
nominal is the best model.
5.6. Individual Design Preference Prediction Models based on Each Subject’s Personal
Data
The discussion above covers the general design preference prediction models based on all subjects'
datasets. The author also built individual design preference prediction models based on each
subject's dataset and evaluated the same machine learning algorithms on them. Because each
subject's state varies, the results of the individual models may differ from those of the general
preference prediction model. The author used RMSE when comparing different algorithms and used
error rate (ER) when comparing different scales within the numeric group and the nominal group.
This section compares the individual design preference prediction models based on all 30 subjects'
data with JJ's survey-based study results. The accuracy results of each model are listed in
Appendix E.
Table 5- 11. P-Value of Paired T-Test (5 Point Scale in Numeric)
P-Value
Artificial Neural Network 0.000
Random Forest Regressor 0.000
Figure 5- 2. Interval Plot of ANN (5 Scale Point in Numeric)
Figure 5- 3. Artificial Neural Network Comparison
The Paired T-Test for the artificial neural network, as illustrated in Table 5-11, indicates a
highly significant difference with a P-Value of 0.000. The average RMSE of the artificial neural
network is 1.37 in XB's datasets and 0.93 in JJ's datasets, an average difference of 0.44, as
shown in Figure 5-2. Therefore, the artificial neural network in JJ's datasets provided better
accuracy in the individual design preference prediction models. Figure 5-3 compares the individual
subjects' RMSE results and shows higher RMSE in XB's datasets across all the test participants.
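The Paired T-Test compares the two RMSE values obtained for the same subject. As an illustration of the statistic behind it, the sketch below computes the paired t statistic from per-subject differences. The function name and the three sample values are hypothetical, and the P-Value lookup from the t distribution is omitted.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return the paired t statistic for two equal-length samples."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two samples of equal length with at least 2 pairs")
    differences = [a - b for a, b in zip(first, second)]
    # Mean difference divided by its standard error.
    return mean(differences) / (stdev(differences) / math.sqrt(len(differences)))


# Hypothetical per-subject RMSE values for two studies (not the thesis data).
rmse_study_a = [1.4, 1.3, 1.5]
rmse_study_b = [0.9, 1.0, 0.8]
t_value = paired_t_statistic(rmse_study_a, rmse_study_b)
```

Pairing by subject removes between-subject variation from the comparison, so a consistent gap between the two studies yields a large t statistic even when the subjects themselves differ widely.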
[Figure 5-3 plot. Title: ANN Comparison (5 Point Scale in Numeric). Y-axis: RMSE (0 to 2). X-axis: subject 1 to subject 30. Series: ANN_RMSE_XB, ANN_RMSE_JJ.]
Figure 5- 4. Interval Plot of Random Forest Regressor
Figure 5- 5. Random Forest Regressor Comparison
[Figure 5-5 plot. Title: Random Forest Regressor (5 Point Scale in Numeric). Y-axis: RMSE (0 to 2). X-axis: subject 1 to subject 30. Series: RF_Regr_RMSE_XB, RF_Regr_RMSE_JJ.]
The Paired T-Test for the random forest regressor, as illustrated in Table 5-11, indicates a
highly significant difference with a P-Value of 0.000. The average RMSE of the random forest
regressor is 1.36 in XB's datasets and 0.59 in JJ's datasets, an average difference of 0.77, as
shown in Figure 5-4. Therefore, the random forest regressor in JJ's datasets provided better
accuracy in the individual design preference prediction models. Figure 5-5 compares the individual
subjects' RMSE results and shows higher RMSE in XB's datasets across all the test participants.
Table 5- 12. P-Value of Paired T-Test (3 Point Scale in Numeric)
P-Value
Artificial Neural Network 0.000
Random Forest Regressor 0.000
Figure 5- 6. Interval Plot of Artificial Neural Network
Figure 5- 7. Artificial Neural Network Comparison
The Paired T-Test for the artificial neural network, as illustrated in Table 5-12, indicates a
highly significant difference with a P-Value of 0.000. The average RMSE of the artificial neural
network is 0.96 in XB's datasets and 0.64 in JJ's datasets, an average difference of 0.32, as
shown in Figure 5-6. Therefore, the artificial neural network in JJ's datasets provided better
accuracy in the individual design preference prediction models. Figure 5-7 compares the individual
subjects' RMSE results: although XB's RMSE is lower than JJ's for some subjects, XB's datasets
show higher RMSE values overall.
[Figure 5-7 plot. Title: ANN Comparison (3 Point Scale in Numeric). Y-axis: RMSE (0 to 1.2). X-axis: Subject 1 to Subject 30. Series: ANN_RMSE_XB, ANN_RMSE_JJ.]
Figure 5- 8. Interval Plot of Random Forest Regressor
Figure 5- 9. Random Forest Regressor Comparison.
[Figure 5-9 plot. Title: Random Forest Regressor (3 Point Scale in Numeric). Y-axis: RMSE (0 to 1.4). X-axis: Subject 1 to Subject 30. Series: RF_Regr_RMSE_XB, RF_Regr_RMSE_JJ.]
The Paired T-Test for the random forest regressor, as illustrated in Table 5-12, indicates a
highly significant difference with a P-Value of 0.000. The average RMSE of the random forest
regressor is 0.98 in XB's datasets and 0.44 in JJ's datasets, an average difference of 0.54, as
shown in Figure 5-8. Therefore, the random forest regressor in JJ's datasets provided better
accuracy in the individual design preference prediction models. Figure 5-9 compares the individual
subjects' RMSE results and shows higher RMSE in XB's datasets across all the test participants.
Table 5- 13. P-Value of Paired T-Test (5 Point Scale in Nominal)
P-Value
Decision Tree 0.000
Random Forest Classifier 0.000
Figure 5- 10. Interval Plot of Decision Tree
Figure 5- 11. Decision Tree Comparison
The Paired T-Test for the decision tree, as illustrated in Table 5-13, indicates a highly
significant difference with a P-Value of 0.000. The average RMSE of the decision tree is 0.47 in
XB's datasets and 0.37 in JJ's datasets, an average difference of 0.1, as shown in Figure 5-10.
Therefore, the decision tree in JJ's datasets provided better accuracy in the individual
preference prediction models. Figure 5-11 compares the individual subjects' RMSE results and shows
higher RMSE in XB's datasets across all the test participants.
[Figure 5-11 plot. Title: Decision Tree (5 Point Scale in Nominal). Y-axis: RMSE (0 to 0.6). X-axis: subject 1 to subject 30. Series: DT_RMSE_XB, DT_RMSE_JJ.]
Figure 5- 12. Interval Plot of Random Forest Classifier
Figure 5- 13. Random Forest Classifier Comparison
[Figure 5-13 plot. Title: Random Forest Classifier (5 Point Scale in Nominal). Y-axis: RMSE (0 to 0.5). X-axis: subject 1 to subject 30. Series: RF_C_RMSE_XB, RF_C_RMSE_JJ.]
The Paired T-Test for the random forest classifier, as illustrated in Table 5-13, indicates a
highly significant difference with a P-Value of 0.000. The average RMSE of the random forest
classifier is 0.43 in XB's datasets and 0.32 in JJ's datasets, an average difference of 0.11, as
shown in Figure 5-12. Therefore, the random forest classifier in JJ's datasets provided better
accuracy in the individual preference prediction models. Figure 5-13 compares the individual
subjects' RMSE results and shows higher RMSE in XB's datasets across all the test participants.
Table 5- 14. P-Value of Paired T-Test (3 Point Scale in Nominal)
P-Value
Decision Tree 0.000
Random Forest Classifier 0.000
Figure 5- 14. Interval Plot of Decision Tree
Figure 5- 15. Decision Tree Comparison
The Paired T-Test for the decision tree, as illustrated in Table 5-14, indicates a highly
significant difference with a P-Value of 0.000. The average RMSE of the decision tree is 0.54 in
XB's datasets and 0.39 in JJ's datasets, an average difference of 0.15, as shown in Figure 5-14.
Hence, the decision tree in JJ's datasets provided better accuracy in the individual preference
prediction models. Figure 5-15 compares the individual subjects' RMSE results and shows higher
RMSE in XB's datasets across all the test participants.
[Figure 5-15 plot. Title: Decision Tree (3 Point Scale in Nominal). Y-axis: RMSE (0 to 0.7). X-axis: Subject 1 to Subject 30. Series: DT_RMSE_XB, DT_RMSE_JJ.]
Figure 5- 16. Interval Plot of Random Forest Classifier.
Figure 5- 17. Random Forest Classifier Comparison
[Figure 5-17 plot. Title: Random Forest Classifier (3 Point Scale in Nominal). Y-axis: RMSE (0 to 0.6). X-axis: Subject 1 to Subject 30. Series: RF_C_RMSE_XB, RF_C_RMSE_JJ.]
The Paired T-Test for the random forest classifier, as illustrated in Table 5-14, indicates a
highly significant difference with a P-Value of 0.000. The average RMSE of the random forest
classifier is 0.50 in XB's datasets and 0.31 in JJ's datasets, an average difference of 0.19, as
shown in Figure 5-16. Hence, the random forest classifier in JJ's datasets provided better
accuracy in the individual preference prediction models. Figure 5-17 compares the individual
subjects' RMSE results and shows higher RMSE in XB's datasets across all the test participants.
Figure 5- 18. Datasets by using Numeric Comparison
[Figure 5-18 plot. Title: Datasets by using Numeric Comparison. Y-axis: Error Rate (0% to 35%). Categories: ANN_ER_XB, RF_Reg_ER_XB, ANN_ER_JJ, RF_Reg_ER-JJ. Series: 5 Point Scale in Numeric, 3 Point Scale in Numeric.]
Figure 5- 19. Model by using 5 Point Scale in Numeric Comparison
The comparison of datasets using different point scales in numeric, as illustrated in Figure 5-18,
indicates a lower error rate for the datasets using the 5 point scale in numeric across all the
test participants. Therefore, datasets using the 5 point scale in numeric provided better accuracy
in the individual preference prediction models. The comparison of models using the 5 point scale
in numeric, as illustrated in Figure 5-19, identifies the better model in this group: the random
forest regressor with the 5 point scale in numeric has a lower RMSE across all the test
participants. Hence, the random forest regressor with the 5 point scale in numeric provided better
accuracy in the individual preference prediction models.
[Figure 5-19 plot. Title: Model by using 5 Point Scale in Numeric Comparison. Y-axis: RMSE (0 to 1.6). Categories: XB, JJ. Series: ANN_5 Point Scale, RF_Regr_5 Point Scale.]
Figure 5- 20. Datasets by using Nominal Comparison
Figure 5- 21. Model by using 5 Point Scale in Nominal Comparison
[Figure 5-20 plot. Title: Datasets by using Nominal Comparison. Y-axis: Error Rate (0% to 20%). Categories: DT_ER_XB, RF_ER_XB, DT_ER_JJ, RF_ER_JJ. Series: 5 Point Scale in Nominal, 3 Point Scale in Nominal.]
[Figure 5-21 plot. Title: Model by using 5 Point Scale in Nominal Comparison. Y-axis: RMSE (0 to 0.5). Categories: XB, JJ. Series: DT_5 Point Scale, RF_C_5 Point Scale.]
The comparison of datasets using different point scales in nominal, as illustrated in Figure 5-20,
indicates a lower error rate for the datasets using the 5 point scale in nominal across all the
test participants. Therefore, datasets using the 5 point scale in nominal provided better accuracy
in the individual preference prediction models. The comparison of models using the 5 point scale
in nominal, as illustrated in Figure 5-21, identifies the better model in this group: the random
forest classifier with the 5 point scale in nominal has a lower RMSE across all the test
participants. Hence, the random forest classifier with the 5 point scale in nominal provided
better accuracy in the individual preference prediction models.
Figure 5- 22. Best Model Comparison.
[Figure 5-22 plot. Title: Best Model Comparison. Y-axis: RMSE (0 to 1.6). Categories: XB, JJ. Series: RF_C_5 Point Scale in Nominal, RF_Regr_5 Point Scale in Numeric.]
As discussed above, the best model in the numeric group is the random forest regressor with the 5
point scale in numeric, while the best model in the nominal group is the random forest classifier
with the 5 point scale in nominal. The comparison of these two models, as illustrated in Figure
5-22, indicates a lower RMSE for the random forest classifier with the 5 point scale in nominal
across all the test participants. Overall, the random forest classifier with the 5 point scale in
nominal provided the best accuracy in the individual preference prediction models.
As discussed above, JJ's survey-based results provided better accuracy, but the survey method is
not perfect because of differences in individual state and subjective perception. By contrast,
XB's results, based on human physiological signals, have lower accuracy than JJ's results.
However, the approach has enough potential to be integrated into design preference solutions in
the future, given a larger set of human feature information.
5.7. Feature Importance Ranking by Using Each Subject’s Personal Data
The decision tree algorithm was used to select the most significant human physiological signal
from each subject's datasets. The decision tree has several levels that show the significance
ranking. The first level is the root level, which indicates the most important feature for the
subject. The author collected the root feature from every subject's decision tree graphic and
ranked the features. Figure 5-23 shows a sample decision tree for one subject.
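The ranking step amounts to counting how often each signal appears at the root of a subject's tree. The sketch below shows one way to tally this; the function name and the three example subjects are hypothetical, and in the study the root features were read from the decision tree graphics.

```python
from collections import Counter


def rank_root_features(root_by_subject):
    """Rank features by how many subjects have them at the root of their tree.

    root_by_subject maps a subject identifier to the name of the root feature.
    Returns (feature, count) pairs, most frequent first.
    """
    return Counter(root_by_subject.values()).most_common()


# Hypothetical root features for three subjects (not the thesis data).
example_roots = {
    "subject 1": "heart_rate",
    "subject 2": "skin_temp",
    "subject 3": "heart_rate",
}
ranking = rank_root_features(example_roots)
```

Counting only the root feature keeps the ranking simple, but it ignores signals that matter at deeper levels of a tree.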
Figure 5- 23. Decision tree Sample
5.7.1. Ranking by using Every Subject’s Datasets
Figure 5- 24. The 5 Point Scale in Datasets Feature Ranking
Figure 5- 25.The 3 Point Scale in Datasets Feature Ranking
As illustrated in Figure 5-24 and Figure 5-25, heart rate is the most essential human
physiological signal based on all subjects' datasets, and skin temperature is the second most
important. This ranking holds for both the 5 point scale (Figure 5-24) and the 3 point scale
(Figure 5-25).
5.7.2. Ranking by Gender
Figure 5-26. Female Group in 5 Point Scale in Nominal Feature Ranking
The feature importance ranking for the female group in Figure 5-26 indicates that heart rate and skin temperature are tied as the most essential signals. BetaL and EDA are tied as the second most important signals in the female group.
Figure 5-27. Male Group in 5 Point Scale in Nominal Feature Ranking
The feature ranking for the male group in Figure 5-27 indicates that heart rate is the most essential signal and stress level is the second most important.
Figure 5-28. Female Group in 3 Point Scale in Nominal Feature Ranking
The feature ranking for the female group in Figure 5-28 shows that heart rate and skin temperature are tied as the most essential features. The second most important feature is the stress level.
Figure 5-29. Male Group in 3 Point Scale in Nominal Feature Ranking
The feature ranking for the male group in Figure 5-29 shows that heart rate is the most essential feature, and BetaL and Theta are tied as the second most important.
5.8. Summary
This chapter focused on the development of preference prediction models using five machine learning algorithms: linear regression, artificial neural network, random forest regressor, decision tree, and random forest classifier. Two kinds of model were built: a general preference prediction model based on all subjects' datasets, and individual preference prediction models based on each subject's own data. For the general preference prediction model, the root mean squared error (RMSE) was compared across the algorithms; the random forest classifier had the lowest RMSE and was therefore the best machine learning algorithm. For the individual preference prediction models, XB's study, based on human physiological signals, was compared with JJ's study, based on the survey. In XB's study, the random forest classifier with the 5-point nominal scale had the lowest average RMSE, at 0.428, and was the best algorithm. In JJ's study, the random forest classifier with the 5-point nominal scale also had the lowest average RMSE, at 0.315. Overall, the random forest classifier with the 5-point nominal scale is the best prediction model. JJ's survey-based study was more accurate, but the survey method is imperfect because responses depend on each individual's state and subjective perception. Although XB's study, based on human physiological signals, was less accurate than JJ's (average RMSE of 0.428 versus 0.315), this approach has high potential to be integrated into design preference solutions in the future, given a larger set of human feature data. The decision tree was also used to identify the most essential human physiological signal, which was heart rate.
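The model comparison summarized above rests on two simple computations: the RMSE of a set of predictions, and the average RMSE per algorithm across subjects. The following Python sketch illustrates both. The helper names are hypothetical, and the RMSE lists cover only XB's Subjects #1 to #5 (5-scale nominal values copied from Appendix E), so the resulting average is for that subset only and is not expected to reproduce the 0.428 reported for all subjects.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def best_model(rmse_by_model):
    """Return (model, average RMSE) for the model with the lowest average RMSE."""
    averages = {model: sum(vals) / len(vals) for model, vals in rmse_by_model.items()}
    name = min(averages, key=averages.get)
    return name, averages[name]

# 5-scale nominal RMSE values for XB's Subjects #1 to #5, copied from Appendix E.
xb_subset = {
    "DT": [0.454, 0.4872, 0.4746, 0.4994, 0.4936],
    "RF_classifier": [0.4323, 0.4366, 0.4006, 0.4288, 0.4203],
}
print(best_model(xb_subset))

# Hypothetical ratings and predictions, only to show the RMSE formula itself.
print(rmse([1, 2, 3], [1, 2, 5]))
```

Averaging per-subject RMSE values gives every subject equal weight, which is one reasonable reading of "average RMSE"; pooling all predictions before computing a single RMSE would be a different choice.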
6. Conclusions and Future Work
6.1. Conclusions
The goal of this research was to investigate the relationship between human physiological signals and personal design preference. The author selected the third floor of Watt Hall (Master of Building Science Program Corner) at the University of Southern California as the experimental location for data collection. Three wearable sensors were used to measure the human physiological signals: heart rate, heart rate variability/stress level, EDA, skin temperature, and EEG (brainwaves, including Theta, Alpha, BetaL, BetaH, and Gamma). Design preference data were collected through design preference questionnaires, which the participants were asked to complete within 90 minutes. After completing the data collection, the author converted the raw data into the formats required by Minitab and Weka for data analysis. Minitab was used to generate several statistical charts, and Weka was used to run the machine learning algorithms for the general preference prediction model and the individual preference prediction models. After analyzing all the datasets, the author compared the individual preference prediction models with JJ's to identify findings that were consistent across the machine learning algorithms. In the author's (XB) study, the random forest classifier with the 5-point nominal scale had the lowest average RMSE, at 0.428, and was therefore the best algorithm. In JJ's study, the random forest classifier with the 5-point nominal scale also had the lowest average RMSE, at 0.315, and was therefore the best algorithm. JJ's survey-based study was more accurate, but the survey method is imperfect because of differences in individual state and subjective perception. By contrast, XB's results, based on human physiological signals, were less accurate than JJ's (average RMSE of 0.428 versus 0.315). However, this approach has enough potential to be integrated into design preference solutions in the future, given a larger set of human feature data. The conclusions are as follows:
a) In the general preference prediction model, based on all subjects' data, the random forest classifier has the highest accuracy and is thus the best of the five machine learning algorithms.
b) In the individual preference prediction models of XB's study, based on each subject's data, the random forest classifier with the 5-point nominal scale has the highest accuracy and is thus the best preference prediction model.
c) In the individual preference prediction models of JJ's study, the random forest classifier with the 5-point nominal scale has the highest accuracy and is thus the best preference prediction model.
d) Overall, the random forest classifier with the 5-point nominal scale is the best individual preference prediction model.
e) Based on all subjects' data, heart rate is the most essential human physiological signal for both the 5-point and the 3-point nominal scales.
f) In the female group, heart rate and skin temperature are the most essential human physiological signals for both the 5-point and the 3-point nominal scales.
g) In the male group, heart rate is the most essential human physiological signal for both the 5-point and the 3-point nominal scales.
6.2. Limitations
This study has certain limitations in data collection and data analysis that might affect the accuracy of the results. Five human physiological signals were collected, and each might be affected by other factors. Skin temperature, for example, was measured at the wrist, but the temperature of the hand might have a marginal effect on it: if a participant holds a cup of cold water, the skin temperature at the wrist might decrease slightly. Only 30 subjects' data were collected in this study, which is a small sample to analyze. Also, all participants were in their twenties, were students at the University of Southern California, and had a background in architecture. They might have judged the design features using their knowledge, which might have influenced their physiological signals. In addition, during data cleaning in the pre-processing stage, the author selected an expected range for each measured human physiological signal and deleted the data outside that range. These out-of-range values were treated as measurement errors. Such errors might be caused by equipment issues; for instance, the wearable wrist sensor may have been worn too loosely to collect continuous data. Overall, there is still room for improvement in this study. Future research should address these limitations to obtain more accurate analysis results.
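The range-based cleaning step described above can be sketched as follows. This is only an illustration: the limits in `EXPECTED_RANGES`, the signal names, and the sample readings are hypothetical, since the expected ranges actually used are not listed in this section.

```python
# Hypothetical plausible ranges; the limits used in the study are not given here.
EXPECTED_RANGES = {
    "heart_rate": (40.0, 180.0),  # beats per minute
    "skin_temp": (25.0, 40.0),    # degrees Celsius
}

def clean(samples, ranges=EXPECTED_RANGES):
    """Keep only samples whose every ranged signal lies inside its expected range."""
    kept = []
    for sample in samples:
        in_range = all(
            low <= sample[name] <= high
            for name, (low, high) in ranges.items()
            if name in sample
        )
        if in_range:
            kept.append(sample)
    return kept

readings = [
    {"heart_rate": 72.0, "skin_temp": 32.5},
    {"heart_rate": 0.0, "skin_temp": 32.4},   # e.g. a loose sensor that lost contact
    {"heart_rate": 75.0, "skin_temp": 12.0},  # implausibly cold reading
]
print(len(clean(readings)))  # 1
```

Dropping a whole sample when any one signal is out of range is the simplest policy; keeping the sample and masking only the faulty signal would retain more data at the cost of handling missing values later.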
6.3. Future Work
For this research, the sample size of 30 participants should be adequate for an effective analysis as a starting point. However, for research involving human subjects, the findings may be more stable and accurate with a larger sample. In other words, with a larger sample size, the predictive outcome should be more accurate, with a narrower confidence interval. In addition, the subject pool should be more varied and balanced. The majority of the subjects in this study were undergraduate and graduate students of a similar age, as the study was carried out on campus. It would therefore be better if more age groups, such as middle-aged and elderly people, could be involved. The subjects in this study were roughly half Asian and half White; it would be better if more ethnic groups, such as American Indian, African American, and Native Hawaiian, could be involved.
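The remark above about sample size and confidence intervals can be illustrated with the standard normal-approximation formula for the confidence interval of a mean, whose half-width is z * s / sqrt(n). The Python sketch below assumes a fixed, hypothetical standard deviation and a 95% level (z = 1.96); for samples as small as 30, a t-quantile would be slightly more appropriate.

```python
import math

def ci_half_width(std_dev, n, z=1.96):
    """Approximate 95% confidence half-width of a mean: z * s / sqrt(n)."""
    return z * std_dev / math.sqrt(n)

# Hypothetical spread of a measured quantity; only the sample size n changes.
spread = 0.05
for n in (30, 60, 120):
    print(n, round(ci_half_width(spread, n), 4))
```

Because the half-width shrinks with the square root of n, quadrupling the number of participants (for example, from 30 to 120) halves the interval width under these assumptions.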
References:
1. The American Institute of Architects. The Architecture Student’s Handbook of
Professional Practice. 15th ed. US: John Wiley & Sons Inc, 2017.
2. Qingguo Ma, Linfeng Hu, and Xiaoyi Wang, “Emotion and Novelty Processing in an
Implicit Aesthetic Experience of Architectures,” NeuroReport 26, no. 5 (2015): pp. 279-
284, https://doi.org/10.1097/wnr.0000000000000344.
3. Baştanlar Y., Özuysal M. (2014) Introduction to Machine Learning. In: Yousef M.,
Allmer J. (eds) miRNomics: MicroRNA Biology and Computational Analysis. Methods
in Molecular Biology (Methods and Protocols), vol 1107. Humana Press, Totowa, NJ
4. Siva, Jessica Pooi Sun, and Kerry London. “Investigating the Role of Client Learning for
Successful Architect–Client Relationships on Private Single Dwelling Projects.”
Architectural Engineering and Design Management 7, no. 3 (2011): 177–89.
https://doi.org/10.1080/17452007.2011.594570.
5. “DiscoverDesign Handbook,” accessed July 7, 2020,
https://discoverdesign.org/handbook.
6. Teplan, Michal. 2002. “Fundamentals of EEG Measurement.” Measurement Science
Review 2 (2): 1–11.
7. Huang, Tina L., and Christine Charyton. 2008. “A Comprehensive Review of the
Psychological Effects of Brainwave Entrainment.” Alternative Therapies in Health and
Medicine 14 (5): 38–50.
8. Yasui, Yoshitsugu. 2009. “A Brainwave Signal Measurement and Data Processing
Technique for Daily Life Applications.” Journal of Physiological Anthropology 28 (3):
145–50. https://doi.org/10.2114/jpa2.28.145.
9. Sacha, Jerzy. 2014. “Interaction between Heart Rate and Heart Rate Variability.” Annals
of Noninvasive Electrocardiology 19 (3): 207–16. https://doi.org/10.1111/anec.12148.
10. Choi, Kwang Ho, Junbeom Kim, O. Sang Kwon, Min Ji Kim, Yeon Hee Ryu, and Ji Eun
Park. 2017. “Is Heart Rate Variability (HRV) an Adequate Tool for Evaluating Human
Emotions? – A Focus on the Use of the International Affective Picture System (IAPS).”
Psychiatry Research 251 (February): 192–96.
https://doi.org/10.1016/j.psychres.2017.02.025.
11. Mark Van Deusen, “Heart Rate Variability: The Ultimate Guide to HRV,” July 10, 2020,
https://www.whoop.com/thelocker/heart-rate-variability-hrv/.
12. Boucsein, Wolfram. Electrodermal Activity. New York: Springer, 2012.
13. Zangróniz, Roberto, Arturo Martínez-Rodrigo, Jose Manuel Pastor, María López T., and
Antonio Fernandez-Caballero. 2017. “Electrodermal Activity Sensor for Classification of
Calm/Distress Condition.” Sensors 17 (10): 2324.
https://doi.org/10.3390/s17102324.
14. Liu, Yun, and Siqing Du. 2018. “Psychological Stress Level Detection Based on
Electrodermal Activity.” Behavioural Brain Research 341 (November 2017): 50–53.
https://doi.org/10.1016/j.bbr.2017.12.021.
15. Hu, Li, and Zhiguo Zhang. 2019. EEG Signal Processing and Feature.
16. Cela-Conde, Camilo J., Gisèle Marty, Fernando Maestú, Tomás Ortiz, Enric Munar,
Alberto Fernández, Miquel Roca, Jaume Rosselló, and Felipe Quesney. 2004.
“Activation of the Prefrontal Cortex in the Human Visual Aesthetic Perception.”
Proceedings of the National Academy of Sciences of the United States of America 101
(16): 6321–25. https://doi.org/10.1073/pnas.0401427101.
17. Palmer, Stephen E., Karen B. Schloss, and Jonathan Sammartino. 2013. “Visual
Aesthetics and Human Preference.” Annual Review of Psychology 64 (1): 77–107.
https://doi.org/10.1146/annurev-psych-120710-100504.
18. Choi, Joon Ho, and Dongwoo Yeom. 2019. “Development of the Data-Driven Thermal
Satisfaction Prediction Model as a Function of Human Physiological Responses in a Built
Environment.” Building and Environment 150 (January): 206–18.
https://doi.org/10.1016/j.buildenv.2019.01.007.
19. Jang, Eun Hye, Byoung Jun Park, Mi Sook Park, Sang Hyeob Kim, and Jin Hun Sohn.
2015. “Analysis of Physiological Signals for Recognition of Boredom, Pain, and Surprise
Emotions.” Journal of Physiological Anthropology 34 (1): 1–12.
https://doi.org/10.1186/s40101-015-0063-5.
20. Mozos, Oscar Martinez, Virginia Sandulescu, Sally Andrews, David Ellis, Nicola
Bellotto, Radu Dobrescu, and Jose Manuel Ferrandez. 2017. “Stress Detection Using
Wearable Physiological and Sociometric Sensors.” International Journal of Neural
Systems 27 (2): 1–17. https://doi.org/10.1142/S0129065716500416.
21. Bertelsen, S., Emmitt, S., 2005, ‘Getting to grips with client complexity’, in: S. Emmitt,
M. Prins (eds), CIB W096 Architectural Management Designing Value, New Directions
in Architectural Management, Lyngby, Denmark.
22. Taleb, Hala, Syuhaida Ismail, Mohammad Hussaini Wahab, and Wan Nurul Mardiah
Wan Mohd Rani. 2017. “Communication Management between Architects and Clients.”
AIP Conference Proceedings 1891 (October). https://doi.org/10.1063/1.5005469.
23. Bogers, Tetske, Juriaan J. Van Meel, and Theo J.M. Van Der Voordt. 2008. “Architects
about Briefing: Recommendations to Improve Communication between Clients and
Architects.” Facilities 26 (3–4): 109–16. https://doi.org/10.1108/02632770810849454.
24. Shan, Xin, En Hua Yang, Jin Zhou, and Victor W.C. Chang. 2018. “Human-Building
Interaction under Various Indoor Temperatures through Neural-Signal
Electroencephalogram (EEG) Methods.” Building and Environment 129 (October 2017):
46–53. https://doi.org/10.1016/j.buildenv.2017.12.004.
25. Dey, Ayon. 2016. “Machine Learning Algorithms: A Review.” International Journal of
Computer Science and Information Technologies 7 (3): 1174–79. www.ijcsit.com.
26. M. Welling, “A First Encounter with Machine Learning”
27. M. Bowles, “Machine Learning in Python: Essential Techniques for Predictive
Analytics”, John Wiley & Sons Inc., ISBN: 978-1-118-96174-2
28. S.B. Kotsiantis, “Supervised Machine Learning: A Review of Classification Techniques”,
Informatica 31 (2007) 249-268
29. L. Rokach, O. Maimon, “Top – Down Induction of Decision Trees Classifiers – A
Survey”, IEEE Transactions on Systems.
30. D. Meyer, “Support Vector Machines – The Interface to libsvm in package e1071”,
August 2015
31. L. P. Kaelbling, M. L. Littman, A. W. Moore, “Reinforcement Learning: A Survey”,
Journal of Artificial Intelligence Research, 4, Page 237-285, 1996
32. V. Sharma, S. Rai, A. Dev, “A Comprehensive Study of Artificial Neural Networks”,
International Journal of Advanced Research in Computer Science and Software
Engineering, ISSN 2277128X, Volume 2, Issue 10, October 2012
33. Choi, Joon Ho, and Vivian Loftness. 2012. “Investigation of Human Body Skin
Temperatures as a Bio-Signal to Indicate Overall Thermal Sensations.” Building and
Environment 58: 258–69. https://doi.org/10.1016/j.buildenv.2012.07.003.
34. Biau, Gérard, and Erwan Scornet. 2016. “A Random Forest Guided Tour.” Test 25 (2):
197–227. https://doi.org/10.1007/s11749-016-0481-7.
35. Khan, G. M. 2018. “Artificial Neural Network (ANNs).” Studies in Computational
Intelligence 725: 39–55. https://doi.org/10.1007/978-3-319-67466-7_4.
36. Rokach, Lior, and Oded Maimon. 2006. “Decision Trees.” In Data Mining and
Knowledge Discovery Handbook. https://doi.org/10.1007/0-387-25465-x_9.
37. Alin, Aylin. 2010. “Minitab.” Wiley Interdisciplinary Reviews: Computational Statistics
2 (6): 723–27. https://doi.org/10.1002/wics.113.
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Xingbai Zhang’s Results VS Joon Joo Kim’s Results
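The Error Rate columns in this appendix appear to equal the RMSE divided by the number of scale points (5 or 3), expressed as a percentage. This relation is inferred from the tabulated values and is an assumption, not a definition stated in this appendix. A minimal Python sketch, using a hypothetical helper named `error_rate` and two of Joon Joo Kim's Subject #1 ANN values:

```python
def error_rate(rmse, scale_points):
    """RMSE as a percentage of the scale size (assumed relation, see lead-in)."""
    return 100.0 * rmse / scale_points

# Joon Joo Kim, Subject #1, ANN: listed as 14.60% (5-scale) and 33.70% (3-scale).
print(round(error_rate(0.7311, 5), 1))  # 14.6
print(round(error_rate(1.0114, 3), 1))  # 33.7
```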
Xingbai Zhang
Subject #1 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.028 | 20% | 1.241 | 25% | 1.1869 | 24%
3-Scale Numeric | 0.671 | 22% | 0.7857 | 26% | 0.7264 | 24%
Subject #1 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.454 | 9% | 0.4323 | 9%
3-Scale Nominal | 0.5162 | 17% | 0.5213 | 6%
Joon Joo Kim
Subject #1 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.7311 | 14.60% | 0.6544 | 13%
3-Scale Numeric | 1.0114 | 33.70% | 0.379 | 12.60%
Subject #1 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.299 | 5.90% | 0.288 | 5.70% | 0.299 | 5.98%
3-Scale Nominal | 0.317 | 10.60% | 0.284 | 9.40% | 0.25 | 8.30%
Xingbai Zhang
Subject #2 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.587 | 32% | 1.7028 | 34% | 1.9654 | 39%
3-Scale Numeric | 0.969 | 32% | 1.0024 | 33% | 1.1233 | 37%
Subject #2 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4872 | 10% | 0.4366 | 9%
3-Scale Nominal | 0.5468 | 18% | 0.5314 | 18%
Joon Joo Kim
Subject #2 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.8092 | 16.20% | 0.5392 | 10.80%
3-Scale Numeric | 0.8889 | 29.60% | 0.5369 | 17.90%
Subject #2 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.388 | 7.80% | 0.333 | 6.70% | 0.366 | 7.30%
3-Scale Nominal | 0.376 | 12.50% | 0.326 | 10.90% | 0.373 | 12.40%
Xingbai Zhang
Subject #3 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.2349 | 25% | 1.2167 | 24% | 1.1313 | 23%
3-Scale Numeric | 0.9538 | 32% | 0.9075 | 30% | 0.7807 | 26%
Subject #3 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4746 | 9% | 0.4006 | 8%
3-Scale Nominal | 0.4783 | 16% | 0.465 | 16%
Joon Joo Kim
Subject #3 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.1502 | 23% | 0.7248 | 14.50%
3-Scale Numeric | 0.3545 | 11.80% | 0.2592 | 8.60%
Subject #3 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.306 | 6.10% | 0.28 | 5.60% | 0.294 | 5.90%
3-Scale Nominal | 0.307 | 10.20% | 0.222 | 7.40% | 0.18 | 6%
Xingbai Zhang
Subject #4 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 6.73 | 135% | 1.1266 | 23% | 1.298 | 6%
3-Scale Numeric | 5.995 | 200% | 0.9331 | 31% | 1.0832 | 36%
Subject #4 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4994 | 10% | 0.4288 | 9%
3-Scale Nominal | 0.5462 | 18% | 0.5078 | 17%
Joon Joo Kim
Subject #4 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.8808 | 17.60% | 0.4796 | 9.60%
3-Scale Numeric | 0.8993 | 29.90% | 0.6028 | 20%
Subject #4 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.383 | 7.70% | 0.348 | 7% | 0.394 | 7.90%
3-Scale Nominal | 0.402 | 13.40% | 0.371 | 12.40% | 0.422 | 14.10%
Xingbai Zhang
Subject #5 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 4.433 | 89% | 1.4883 | 30% | 1.3373 | 27%
3-Scale Numeric | 3.263 | 109% | 1.0649 | 35% | 0.9258 | 30%
Subject #5 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4936 | 10% | 0.4203 | 8%
3-Scale Nominal | 0.5248 | 17% | 0.4514 | 15%
Joon Joo Kim
Subject #5 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.6396 | 12.70% | 0.4627 | 9.30%
3-Scale Numeric | 0.2082 | 6.90% | 0.2146 | 7.20%
Subject #5 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.359 | 7.20% | 0.267 | 5.30% | 0.318 | 6.40%
3-Scale Nominal | 0.314 | 10.40% | 0.189 | 6.30% | 0.162 | 5.40%
Xingbai Zhang
Subject #6 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.6315 | 33% | 1.2131 | 24% | 1.0883 | 22%
3-Scale Numeric | 1.129 | 38% | 0.8985 | 30% | 0.8403 | 28%
Subject #6 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.469 | 9% | 0.4307 | 9%
3-Scale Nominal | 0.5648 | 19% | 0.4984 | 17%
Joon Joo Kim
Subject #6 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.2232 | 24.40% | 0.7257 | 14.50%
3-Scale Numeric | 0.6837 | 22.80% | 0.3541 | 11.80%
Subject #6 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.333 | 6.70% | 0.305 | 6.10% | 0.355 | 7.10%
3-Scale Nominal | 0.377 | 12.60% | 0.294 | 9.80% | 0.313 | 10.40%
Xingbai Zhang
Subject #7 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.0697 | 21% | 1.085 | 22% | 1.2684 | 25%
3-Scale Numeric | 0.9012 | 30% | 0.7859 | 26% | 0.9797 | 33%
Subject #7 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4957 | 10% | 0.43 | 9%
3-Scale Nominal | 0.6329 | 21% | 0.5346 | 18%
Joon Joo Kim
Subject #7 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.9829 | 19.70% | 0.4908 | 9.80%
3-Scale Numeric | 0.6048 | 20.20% | 0.439 | 14.60%
Subject #7 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.391 | 7.80% | 0.332 | 6.70% | 0.429 | 8.60%
3-Scale Nominal | 0.449 | 14.90% | 0.337 | 11.20% | 0.395 | 13.20%
Xingbai Zhang
Subject #8 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.456 | 29% | 1.2675 | 25% | 1.5619 | 31%
3-Scale Numeric | 0.964 | 32% | 0.9056 | 30% | 1.0868 | 36%
Subject #8 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.5 | 10% | 0.4582 | 9%
3-Scale Nominal | 0.567 | 19% | 0.557 | 19%
Joon Joo Kim
Subject #8 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.7788 | 15.60% | 0.4733 | 9.50%
3-Scale Numeric | 0.6004 | 20% | 0.5123 | 17.10%
Subject #8 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.385 | 7.70% | 0.35 | 7% | 0.394 | 7.90%
3-Scale Nominal | 0.413 | 13.80% | 0.345 | 11.50% | 0.386 | 12.90%
Xingbai Zhang
Subject #9 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.535 | 30% | 1.6298 | 33% | 1.6564 | 33%
3-Scale Numeric | 0.991 | 33% | 0.9714 | 32% | 1.0517 | 35%
Subject #9 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4683 | 9% | 0.3999 | 8%
3-Scale Nominal | 0.516 | 17% | 0.4807 | 16%
Joon Joo Kim
Subject #9 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.6598 | 13.20% | 0.4763 | 9.50%
3-Scale Numeric | 0.7466 | 24.90% | 0.4914 | 16.40%
Subject #9 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.36 | 7.20% | 0.309 | 6.20% | 0.394 | 7.90%
3-Scale Nominal | 0.467 | 15.60% | 0.348 | 11.60% | 0.409 | 13.60%
Xingbai Zhang
Subject #10 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.391 | 28% | 1.5323 | 30% | 1.6922 | 34%
3-Scale Numeric | 0.924 | 31% | 0.9958 | 33% | 1.1077 | 37%
Subject #10 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4697 | 9% | 0.4659 | 9%
3-Scale Nominal | 0.5549 | 18% | 0.5702 | 19%
Joon Joo Kim
Subject #10 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.2321 | 24.60% | 0.7425 | 14.90%
3-Scale Numeric | 0.7716 | 25.70% | 0.5175 | 17.30%
Subject #10 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.428 | 8.60% | 0.349 | 6.98% | 0.455 | 9.10%
3-Scale Nominal | 0.434 | 14.50% | 0.352 | 11.70% | 0.452 | 15.10%
Xingbai Zhang
Subject #11 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.354 | 27% | 1.3455 | 27% | 1.3561 | 27%
3-Scale Numeric | 1.032 | 34% | 0.9856 | 33% | 1.0424 | 35%
Subject #11 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4788 | 10% | 0.3978 | 8%
3-Scale Nominal | 0.3762 | 13% | 0.4105 | 14%
Joon Joo Kim
Subject #11 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.5118 | 10.20% | 0.3853 | 7.70%
3-Scale Numeric | 0.6949 | 23.20% | 0.5565 | 18.60%
Subject #11 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.317 | 6.30% | 0.339 | 6.80% | 0.434 | 8.70%
3-Scale Nominal | 0.272 | 9.10% | 0.298 | 9.90% | 0.371 | 12.40%
Xingbai Zhang
Subject #12 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.221 | 24% | 1.4107 | 29% | 1.3587 | 27%
3-Scale Numeric | 0.859 | 29% | 0.9636 | 32% | 0.9936 | 33%
Subject #12 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4817 | 10% | 0.4227 | 8%
3-Scale Nominal | 0.5408 | 18% | 0.509 | 17%
Joon Joo Kim
Subject #12 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.468 | 9.40% | 0.3477 | 7%
3-Scale Numeric | 0.3986 | 13.30% | 0.3799 | 12.70%
Subject #12 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.346 | 6.90% | 0.312 | 6.20% | 0.325 | 6.50%
3-Scale Nominal | 0.291 | 9.70% | 0.282 | 9.40% | 0.329 | 11%
Xingbai Zhang
Subject #13 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.704 | 34% | 1.4896 | 30% | 1.5479 | 30%
3-Scale Numeric | 0.96 | 32% | 0.8762 | 39% | 0.8848 | 29%
Subject #13 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4302 | 9% | 0.4197 | 8%
3-Scale Nominal | 0.544 | 18% | 0.5191 | 17%
Joon Joo Kim
Subject #13 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.5801 | 11.60% | 0.3927 | 7.90%
3-Scale Numeric | 0.7596 | 25.30% | 0.3638 | 12.10%
Subject #13 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.376 | 7.50% | 0.317 | 6.30% | 0.354 | 7.10%
3-Scale Nominal | 0.427 | 14.20% | 0.315 | 10.50% | 0.433 | 14.40%
Xingbai Zhang
Subject #14 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.178 | 24% | 1.1757 | 24% | 1.1413 | 23%
3-Scale Numeric | 0.756 | 25% | 0.937 | 31% | 0.7707 | 26%
Subject #14 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4553 | 9% | 0.4112 | 8%
3-Scale Nominal | 0.4893 | 16% | 0.4694 | 16%
Joon Joo Kim
Subject #14 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.0015 | 20% | 0.7173 | 14.30%
3-Scale Numeric | 0.7368 | 24.60% | 0.4978 | 16.60%
Subject #14 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.444 | 8.90% | 0.358 | 7.20% | 0.358 | 7.20%
3-Scale Nominal | 0.471 | 15.70% | 0.375 | 12.50% | 0.383 | 12.80%
Xingbai Zhang
Subject #15 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.558 | 31% | 1.4184 | 28% | 1.494 | 30%
3-Scale Numeric | 0.996 | 33% | 0.937 | 31% | 0.9225 | 30%
Subject #15 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4604 | 9% | 0.4529 | 9%
3-Scale Nominal | 0.5268 | 18% | 0.5297 | 18%
Joon Joo Kim
Subject #15 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.8341 | 16.70% | 0.5215 | 10.40%
3-Scale Numeric | 0.802 | 26.70% | 0.6202 | 20.70%
Subject #15 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.391 | 7.80% | 0.332 | 6.60% | 0.429 | 8.60%
3-Scale Nominal | 0.384 | 12.80% | 0.359 | 12% | 0.368 | 12.30%
Xingbai Zhang
Subject #16 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.067 | 21% | 1.0242 | 20% | 0.8468 | 17%
3-Scale Numeric | 0.906 | 30% | 0.893 | 30% | 0.7574 | 25%
Subject #16 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4653 | 9% | 0.3784 | 8%
3-Scale Nominal | 0.5753 | 19% | 0.4593 | 15%
Joon Joo Kim
Subject #16 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.021 | 20.4% | 0.479 | 9.60%
3-Scale Numeric | 0.8523 | 28.40% | 0.5117 | 17.10%
Subject #16 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.413 | 8.30% | 0.37 | 7.40% | 0.426 | 8.50%
3-Scale Nominal | 0.477 | 15.90% | 0.403 | 13.40% | 0.445 | 14.80%
Xingbai Zhang
Subject #17 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.079 | 22% | 1.3398 | 27% | 1.182 | 24%
3-Scale Numeric | 0.878 | 29% | 1.044 | 35% | 1.0002 | 33%
Subject #17 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4629 | 9% | 0.4366 | 8.70%
3-Scale Nominal | 0.5296 | 18% | 0.5077 | 17%
Joon Joo Kim
Subject #17 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.6117 | 32.20% | 0.9198 | 18.40%
3-Scale Numeric | 0.5885 | 19.60% | 0.435 | 14.50%
Subject #17 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.423 | 8.50% | 0.29 | 5.80% | 0.345 | 6.90%
3-Scale Nominal | 0.501 | 16.70% | 0.321 | 10.70% | 0.311 | 10.40%
Xingbai Zhang
Subject #18 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.023 | 20% | 1.1251 | 23% | 1.0033 | 20%
3-Scale Numeric | 0.889 | 30% | 0.9395 | 31% | 0.9234 | 31%
Subject #18 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4633 | 9% | 0.4125 | 8%
3-Scale Nominal | 0.5603 | 19% | 0.5164 | 17%
Joon Joo Kim
Subject #18 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.8461 | 16.90% | 0.6063 | 12.10%
3-Scale Numeric | 0.4636 | 15.50% | 0.3145 | 10.50%
Subject #18 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.327 | 6.50% | 0.268 | 5.40% | 0.33 | 6.60%
3-Scale Nominal | 0.332 | 11.10% | 0.275 | 9.20% | 0.316 | 10.60%
Xingbai Zhang
Subject #19 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.525 | 31% | 1.6853 | 34% | 1.524 | 30%
3-Scale Numeric | 0.933 | 31% | 1.0236 | 34% | 0.9809 | 33%
Subject #19 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4639 | 9% | 0.4412 | 9%
3-Scale Nominal | 0.5662 | 19% | 0.5067 | 17%
Joon Joo Kim
Subject #19 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.0809 | 21.60% | 0.7348 | 14.70%
3-Scale Numeric | 0.7062 | 23.50% | 0.4708 | 15.70%
Subject #19 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.46 | 9.20% | 0.342 | 6.80% | 0.387 | 7.70%
3-Scale Nominal | 0.356 | 11.90% | 0.355 | 11.80% | 0.382 | 12.70%
Xingbai Zhang
Subject #20 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.363 | 27% | 1.3796 | 28% | 1.4365 | 29%
3-Scale Numeric | 0.949 | 32% | 0.9771 | 33% | 0.9908 | 33%
Subject #20 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.5068 | 10% | 0.4617 | 9%
3-Scale Nominal | 0.6207 | 21% | 0.5629 | 19%
Joon Joo Kim
Subject #20 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.8205 | 15.40% | 0.6392 | 12.80%
3-Scale Numeric | 1.0721 | 35.70% | 0.6472 | 21.60%
Subject #20 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.415 | 8.30% | 0.341 | 6.80% | 0.429 | 8.60%
3-Scale Nominal | 0.476 | 15.90% | 0.408 | 13.60% | 0.514 | 17.10%
Xingbai Zhang
Subject #21 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.37 | 27% | 1.0562 | 21% | 1.2962 | 26%
3-Scale Numeric | 0.977 | 33% | 0.7588 | 25% | 0.9194 | 31%
Subject #21 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4924 | 10% | 0.4106 | 8%
3-Scale Nominal | 0.5179 | 17% | 0.4744 | 16%
Joon Joo Kim
Subject #21 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.7022 | 14% | 0.4076 | 8.20%
3-Scale Numeric | 0.8072 | 26.90% | 0.5854 | 19.50%
Subject #21 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.434 | 8.70% | 0.364 | 7.30% | 0.424 | 8.50%
3-Scale Nominal | 0.437 | 14.60% | 0.365 | 12.20% | 0.449 | 15%
Xingbai Zhang
Subject #22 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.7007 | 34% | 1.7561 | 35% | 1.8466 | 37%
3-Scale Numeric | 1.0592 | 35% | 1.0419 | 35% | 1.1329 | 38%
Subject #22 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4347 | 9% | 0.4453 | 9%
3-Scale Nominal | 0.4574 | 15% | 0.46 | 15%
Joon Joo Kim
Subject #22 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.6552 | 13.10% | 0.3897 | 7.80%
3-Scale Numeric | 0.2971 | 9.90% | 0.2437 | 8.10%
Subject #22 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.343 | 6.90% | 0.244 | 4.90% | 0.298 | 5.96%
3-Scale Nominal | 0.222 | 7.40% | 0.198 | 6.60% | 0.205 | 6.80%
Xingbai Zhang
Subject #23 | LR RMSE | LR Error Rate | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 1.205 | 24% | 1.4743 | 29% | 1.2707 | 25%
3-Scale Numeric | 0.980 | 33% | 1.0469 | 35% | 0.9563 | 32%
Subject #23 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate
5-Scale Nominal | 0.4754 | 10% | 0.441 | 9%
3-Scale Nominal | 0.5646 | 19% | 0.5139 | 17%
Joon Joo Kim
Subject #23 | ANN RMSE | ANN Error Rate | RF_regressor RMSE | RF_regressor Error Rate
5-Scale Numeric | 0.8382 | 16.80% | 0.5773 | 11.50%
3-Scale Numeric | 0.5508 | 18.40% | 0.3625 | 12.10%
Subject #23 | DT RMSE | DT Error Rate | RF_classifier RMSE | RF_classifier Error Rate | ANN RMSE | ANN Error Rate
5-Scale Nominal | 0.368 | 7.40% | 0.298 | 5.96% | 0.322 | 6.40%
3-Scale Nominal | 0.465 | 15.50% | 0.302 | 10.10% | 0.289 | 9.60%
Xingbai Zhang
Subject #24 LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.3755 28% 1.3157 26% 1.4888 30%
3-Scale
Numeric
1.1855 40% 1.0114 34% 1.1591 39%
Subject #24 DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.471 9% 0.4257 9%
3-Scale
Nominal
0.5492 18% 0.5223 17%
Joon Joo Kim
Subject #24 ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
0.94 18.80% 0.7335 14.50%
3-Scale
Numeric
0.7475 24.90% 0.3816 12.70%
Subject #24 DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.362 7.20% 0.297 5.90% 0.359 7.20%
3-Scale
Nominal
0.355 11.80% 0.297 9.90% 0.37 12.30%
Xingbai Zhang
Subject #25 LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.5907 32% 1.7202 34% 1.8486 37%
3-Scale
Numeric
0.9536 32% 1.0667 36% 1.1482 38%
Subject #25 DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.5237 10% 0.4504 9%
3-Scale
Nominal
0.5565 19% 0.5159 17%
Joon Joo Kim
Subject #25 ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
0.671 13.42% 0.4627 9.30%
3-Scale
Numeric
0.5923 19.70% 0.4744 15.80%
Subject #25 DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.287 5.70% 0.273 5.50% 0.307 6.10%
3-Scale
Nominal
0.401 13.40% 0.292 9.70% 0.315 10.50%
Xingbai Zhang
Subject #26 LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.2064 24% 1.2379 25% 1.3697 27%
3-Scale
Numeric
0.9307 31% 0.9686 32% 1.1015 37%
Subject #26 DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.4769 9% 0.4335 9%
3-Scale
Nominal
0.4968 17% 0.5231 17%
Joon Joo Kim
Subject #26 ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.1624 23.20% 0.6258 12.50%
3-Scale
Numeric
0.4434 14.80% 0.5235 17.50%
Subject #26 DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.421 8.40% 0.334 6.70% 0.392 7.80%
3-Scale
Nominal
0.433 8.70% 0.319 10.60% 0.304 10.10%
Xingbai Zhang
Subject #27 LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.1334 23% 1.2097 24% 1.1607 23%
3-Scale
Numeric
0.9735 32% 0.9711 32% 1.0114 34%
Subject #27 DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.4313 9% 0.4195 8%
3-Scale
Nominal
0.5291 18% 0.4799 16%
Joon Joo Kim
Subject #27 ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
0.815 16.30% 0.5316 10.60%
3-Scale
Numeric
0.4339 14.50% 0.3356 11.20%
Subject #27 DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.339 6.80% 0.281 5.60% 0.296 5.90%
3-Scale
Nominal
0.369 12.30% 0.25 8.30% 0.275 9.20%
Xingbai Zhang
Subject #28 LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.2382 25% 1.5911 32% 1.3241 26%
3-Scale
Numeric
1.656 55% 1.0607 35% 0.9544 32%
Subject #28 DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.5252 11% 0.4315 9%
3-Scale
Nominal
0.5137 17% 0.519 17%
Joon Joo Kim
Subject #28 ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.7781 35.60% 0.745 14.90%
3-Scale
Numeric
0.3361 11.20% 0.2498 8.30%
Subject #28 DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.335 6.70% 0.241 4.80% 0.209 4.20%
3-Scale
Nominal
0.278 5.60% 0.11 3.70% 0.207 6.90%
Xingbai Zhang
Subject #29 LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
0.9751 20% 0.9666 19% 1.0369 21%
3-Scale
Numeric
0.8085 27% 0.8544 28% 0.9536 32%
Subject #29 DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.463 9% 0.3989 8%
3-Scale
Nominal
0.5859 20% 0.5126 17%
Joon Joo Kim
Subject #29 ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.2821 25.60% 0.7271 14.50%
3-Scale
Numeric
0.919 30.70% 0.6394 21.30%
Subject #29 DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.435 8.70% 0.399 7.98% 0.443 8.86%
3-Scale
Nominal
0.532 17.70% 0.456 15.20% 0.537 17.90%
Xingbai Zhang
Subject #30 LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.6673 33% 1.8774 38% 1.8991 38%
3-Scale
Numeric
1.0184 34% 1.0985 37% 1.066 36%
Subject #30 DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.4698 9% 0.4511 9%
3-Scale
Nominal
0.5191 17% 0.515 17%
Joon Joo Kim
Subject #30 ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.1595 23.19% 0.8494 16.98%
3-Scale
Numeric
0.2055 6.90% 0.3299 10.90%
Subject #30 DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.33 6.60% 0.293 5.90% 0.318 6.40%
3-Scale
Nominal
0.257 8.60% 0.221 7.40% 0.248 8.30%
Xingbai Zhang
Average LR ANN RF_ regressor
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
1.6213 32.4% 1.37 27.4% 1.3872 27%
3-Scale
Numeric
1.2156 40.5% 0.9569 31.5% 0.9792 32.6%
Average DT RF_ classifier
RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.4748 9.43% 0.4282 8.59%
3-Scale
Nominal
0.5356 17.86% 0.5048 16.46%
Joon Joo Kim
Average ANN RF_ regressor
RMSE Error Rate RMSE Error Rate
5-Scale
Numeric
0.9289 18.58% 0.5854 11.71%
3-Scale
Numeric
0.6392 21.31% 0.441 14.70%
Average DT RF_ classifier ANN
RMSE Error Rate RMSE Error Rate RMSE Error Rate
5-Scale
Nominal
0.373 7.46% 0.315 6.30% 0.364 7.28%
3-Scale
Nominal
0.386 12.88% 0.309 10.30% 0.346 11.54%
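In most rows of these tables, the Error Rate value appears to equal the RMSE divided by the number of scale points (for example, 0.373 / 5 = 7.46% in the 5-scale row). This relation is inferred from the tabulated values and is not stated in this section, so the Python sketch below is only an illustration under that assumption; the function names `rmse` and `error_rate` are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def error_rate(rmse_value, scale_points):
    """ASSUMPTION: error rate is RMSE as a fraction of the scale size (5 or 3)."""
    if scale_points <= 0:
        raise ValueError("scale_points must be positive")
    return rmse_value / scale_points


# Average DT values from the 5-scale and 3-scale nominal rows above.
print(f"5-scale: {error_rate(0.373, 5):.2%}")
print(f"3-scale: {error_rate(0.386, 3):.2%}")
```

A few rows do not follow this relation exactly, so the sketch should be read as a consistency check on the tables rather than as the procedure used to produce them.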
Abstract
In most architectural design projects, clients and architects are the main participants in the design process. They often spend a large amount of time reaching agreement on a design because of misunderstandings about the client’s requirements and preferences. Within an architecture firm, the efficiency of the design process depends largely on the architect. When the client changes the idea for the project, all or part of the project must be altered accordingly, and making those changes and redesigning the project more than once takes time. The objective of this research is to develop architectural design guidelines based on advanced machine learning algorithms. Unlike the conventional architectural design process, advanced sensing technologies provide information based on the client’s reactions, such as physiological signals and psychological factors. With this information, the architect’s concept design may have a better chance of satisfying the client. A better understanding of clients’ needs will also substantially reduce the time and effort the design process requires.
Asset Metadata
Creator
Zhang, Xingbai (author)
Core Title
Development of AI-driven architectural design guidelines: establishing human biometric signal-driven architectural design guideline as a function of psychological principles
School
School of Architecture
Degree
Master of Building Science
Degree Program
Building Science
Publication Date
04/16/2021
Defense Date
03/17/2021
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Architects,clients,design process,machine learning,OAI-PMH Harvest,physiological signals
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Choi, Joon-Ho (committee chair), Ferreira da Silva, Rafael (committee member), Ting, Selwyn (committee member)
Creator Email
xingbaiz@usc.edu,zxbem890406@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-447998
Unique identifier
UC11668430
Identifier
etd-ZhangXingb-9488.pdf (filename),usctheses-c89-447998 (legacy record id)
Legacy Identifier
etd-ZhangXingb-9488.pdf
Dmrecord
447998
Document Type
Thesis
Rights
Zhang, Xingbai
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA