Human–Building Integration: Machine Learning–Based and Occupant
Eye Pupil Size–Driven Lighting Control as an Applicable Visual
Comfort Tool in the Office Environment
By
Lingkai Cen
Presented to the
FACULTY OF THE
SCHOOL OF ARCHITECTURE
UNIVERSITY OF SOUTHERN CALIFORNIA
In partial fulfillment of the
Requirements for the degree
MASTER OF BUILDING SCIENCE
May 2019
ACKNOWLEDGEMENTS
This research acknowledges the National Science Foundation (NSF) for funding the project entitled
“Human–Building Integration: Bio Sensing Adaptive Environmental Controls” under grant number
#1707068. It was also partially sponsored by the Environmental Protection Agency. This research was
conducted with the support of the University of Southern California (USC) School of Architecture, which
provided the environmental chamber used as the experiments’ location. This research also acknowledges the
assistance of Xiaomeng Yao, a student in the Viterbi School of Engineering, in the construction of the
prototype control system. I would like to thank my family for supporting me all the way and my beloved
friends for accompanying me at every step. I also appreciate the patient and detailed guidance of Professor
Joon-Ho Choi and Professor Yolanda Gil.
COMMITTEE MEMBERS
Chair: Joon-Ho Choi
Title: Assistant Professor
Email: joonhoch@usc.edu
Second Committee Member: Yolanda Gil
Title: Research Professor
Email: gil@isi.edu
Third Committee Member: Shrikanth Narayanan
Title: Professor
Email: shri@sipi.usc.edu
Abstract
Deficient indoor environments are common issues in today’s workplace, resulting in reduced work
productivity, which contributes to indirect pecuniary losses for firms. Lighting, an important component
of indoor environmental quality, has been demonstrated to be closely related to occupants’ impaired
performance, yet it is largely neglected by existing design guidelines, which are designed primarily for
paper-based tasks and derived from empirical values.
An applicable tool was developed to improve visual comfort for an individual in an office environment. The
tool consists of two parts – visual comfort prediction computed by a machine learning algorithm on the basis
of the occupant’s eye pupil size as well as the illuminance level (lux) and physical luminance control based
on the corresponding predicted visual comfort label (i.e., visual sensation and visual satisfaction) as an input.
After reviewing multiple computational algorithms for establishing a visual comfort prediction model, this
study adopted Gaussian naïve Bayes (Gaussian NB), logistic regression (LG), and support vector machine
(SVM). The model training and testing process utilized data collected from human subject tests, which were
conducted in an environmental chamber at USC. Visual sensation (evaluation of brightness) and visual
satisfaction (assessment of comfort) were used as classes to label each data point described by the features
of human eye pupil size–related parameters and real-time illuminance level. Stepwise control was used for
luminance control of lighting.
The research found that, in terms of accuracy, SVM outperformed LG and Gaussian NB and is therefore
recommended to be used to signal the control. The prototype control developed demonstrated acceptable
performance under extreme conditions (1100 lux) but failed to make changes for occupants under the baseline
condition (500 lux).
Keywords: building environmental control, artificial intelligence
Hypothesis
1. The visual comfort prediction model developed with SVM provides the most accurate prediction in
comparison with LG and Gaussian NB.
2. Stepwise control can adjust illuminance level to that corresponding to perceived visual comfort in the
shortest time, and it induces less discomfort during the modulation process.
Objectives
1. Establish a visual comfort prediction model and a physical lighting control model for individuals and
their workstations in an office environment.
2. Determine the most accurate and efficient machine learning algorithm to drive the visual comfort
prediction model.
3. Determine the most reliable and effective control strategy to achieve the perceived visual comfort level.
4. Discuss the relationship between electrodermal activity and an occupant’s perception of visual comfort.
TABLE OF CONTENTS

ACKNOWLEDGEMENTS
COMMITTEE MEMBERS
ABSTRACT
HYPOTHESIS
OBJECTIVES
CHAPTER 1. INTRODUCTION
1.1. THE CURRENT ISSUE IN INDOOR ENVIRONMENTAL QUALITY
1.2. EYE PUPILS’ REACTIONS TO LIGHT – A HUMAN PHYSIOLOGICAL MECHANISM TO RETAIN HOMEOSTASIS
1.3. APPLICATION OF MACHINE LEARNING IN REAL-WORLD PROBLEMS
1.4. ELECTRODERMAL ACTIVITY AS AN INDICATOR OF THE HUMAN PSYCHOLOGICAL STATE
1.5. GOALS AND OBJECTIVES
CHAPTER 2. BACKGROUND AND LITERATURE REVIEW
2.1. WORK PRODUCTIVITY AND HUMAN WELLBEING ASSOCIATED WITH THE LIGHT ENVIRONMENT
2.2. EYE PUPIL SIZE AS A REPRESENTATION OF INDIVIDUAL VISUAL COMFORT
2.3. MACHINE LEARNING IN BUILDING SCIENCE DOMAIN
2.4. ELECTRODERMAL RESPONSES TO STRESS AND EMOTION
2.5. CONCLUSION
CHAPTER 3. METHODOLOGY
3.1. METHODOLOGY DIAGRAM
3.2. HUMAN SUBJECT EXPERIMENT FOR VISUAL COMFORT MODELING
3.2.1. Input (or features) and output (or labels) variables
3.3. ADOPTED MACHINE LEARNING ALGORITHMS AND FUNDAMENTAL MATHEMATICS INVOLVED
3.3.1. Naïve Bayes (Alpaydin, 2014)
3.3.2. Logistic regression (Ng, n.d.)
3.3.3. Support vector machine (Ng, n.d.)
3.3.4. Scikit-learn for Python
3.3.5. MATLAB for data processing
3.4. CONTROL MODEL
3.4.1. Control logic and strategy
3.4.2. Microcontroller and light dimmer
3.5. ELECTRODERMAL ACTIVITY
3.5.1. Measurement of EDA
3.5.2. Analysis of EDA parameters
CHAPTER 4. DATA PROCESSING AND PRELIMINARY RESULTS
4.1. DATA PROCESSING FOR TRAINING DATASET
4.2. THE RANDOM ORDER OF ILLUMINANCE LEVEL PRODUCED INCONSISTENT LABELS
4.3. 30S WAS PICKED AS THE WINDOW SIZE FOR THE MOVING AVERAGE FILTER
4.4. PRESENTATION OF SAMPLE DATA
4.4.1. Visual sensation versus eye pupil size
4.4.2. Visual sensation versus gradients of eye pupil size
CHAPTER 5. DATA ANALYSIS AND DISCUSSION
5.1. FEATURE SELECTION
5.1.1. Eliminating illuminance level as an input feature
5.1.2. Features selected for visual sensation prediction
5.1.3. Features selected for visual satisfaction prediction
5.1.4. Conclusion
5.2. EVALUATION OF PREDICTION PERFORMANCE
5.2.1. Prediction accuracy of visual sensation (5 levels)
5.2.2. Prediction accuracy of visual satisfaction (5 levels)
5.2.3. Prediction accuracy of visual sensation (3 levels)
5.2.4. Prediction accuracy of visual satisfaction (3 levels)
5.2.5. Conclusion
5.3. PROTOTYPE CONTROL
5.4. VALIDATION TEST
5.4.1. First round validation test
5.4.2. Second round validation test
5.4.3. Conclusion
5.5. ANALYSIS OF EDA PARAMETERS
5.5.1. Artifacts
5.5.2. Summary of EDA activities
CHAPTER 6. CONCLUSIONS AND FUTURE WORKS
6.1. CONCLUSIONS
6.1.1. Visual comfort prediction and prototype control
6.1.2. Visual comfort and EDA activities
6.2. FUTURE WORKS
6.2.1. Explore pupil size as time series data
6.2.2. Possible improvement of experimental design for investigation into EDA parameters
6.2.3. Possible improvement of software and hardware
REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D
Chapter 1. Introduction
It has been estimated that people in the U.S. spend 90% of their time indoors. Indoor environmental
quality (IEQ) has therefore become a crucial contributor to occupants’ wellness. IEQ is determined
by various factors, such as lighting, air quality, and thermal conditions. The sensation and satisfaction of
individual factors define occupants’ overall perception of IEQ. Good IEQ gives rise to positive physical and
mental conditions, for example, high productivity, which could help remediate today’s situation of overwork.
Among all the factors defining IEQ, lighting is closely related to the occupant’s visual perception. The
sensation of light determines the quality of perception, and the comfort level of the lighting environment
decides the efficiency of perception. Visual perception is of great importance in the office environment
because there is a large amount of information input through the visual pathway. Therefore, the office lighting
environment should be designed deliberately.
The energy consumption of the building sector accounts for over 40% of total primary energy consumption
in the U.S. and EU. Over the last decade, to maximize the energy-saving potential embedded in this sector, a
number of techniques have been developed. The emergence of Net Zero Energy (NZE) buildings and passive
houses attests to this remarkable achievement. Codes (e.g., IECC, Title 24) have been published to
streamline the sustainable design process. However, the demand for designs with good IEQ performance has
not grown along with the energy-saving goal. This is evident in the empirical, value-based set-points that
codes define for IEQ: the value is uniform for each functional zone, failing to take into account the diversity
among occupants. Unlike thermal conditions, lighting can hardly accommodate such diversity through
adaptation. Even if lighting can be highly customized in the future, there is no specific model to infer
individual lighting setting preferences. In addition, the existing lighting guidelines mainly focus on paper-based
work and neglect the prevalence of computer-based work in office environments.
Therefore, as stated above, lighting, as one important component of IEQ, affects occupants’ visual perception,
which is significant for an office environment. The existing building industry uses a uniform value for the
lighting system setting, which neglects the occupants’ diversity. Attention should be paid to an improved
comfort design that can also achieve the energy savings required for a sustainable future.
1.1. The current issue in indoor environmental quality
Initially, lighting meant the use of light to help accomplish visual tasks. With the discovery of scientific theories
and the development of techniques, lighting became capable of being quantified and qualified by parameters
beyond illuminance. Consequently, the purpose of lighting expanded, and researchers
started to realize that lighting can also influence individuals’ visual health. The appearance of full-spectrum
light and circadian lighting demonstrates the effort to investigate comprehensive lighting design that
improves visual comfort.
Work performance was first discussed in association with visual comfort. In those studies, scholars were
trying to manipulate the perceived visual comfort by adjusting artificial lighting parameters (e.g., illuminance,
color temperature, spectrum, and the uniformity of distribution), architecture features (e.g., wall finishes) and
access to natural light. The work performance was evaluated as an outcome variable to quantify the influence
of the manipulated visual condition. This stage of research revealed valuable design considerations for
lighting, and their conclusions can be generalized as applicable rules for design.
Well-being was brought into the discussion by subsequent studies, which divided it into two aspects: physiological
and psychological. The manipulated variables were identical to those in the studies of work performance, but the
outcome was replaced with fatigue, alertness, stress, mood, etc. This research reformed our understanding of
lighting design and provided a new way of regulating lighting-related parameters.
However, in actual lighting design projects, design guidelines (e.g., the IES Lighting Handbook) play the
most important role in determining the relative lighting setting. The most common requirement is the amount
of illuminance on the working plane. These ranges of target lux level are derived from either empirical or
experimental approaches in manipulated laboratory tests instead of a real office environment. In addition, the
design tasks in the guidelines are mainly paper-based, which cannot match today’s office fully equipped with
computers.
The industry does put in extra effort to overcome the drawbacks of rigid compliance with the guidelines.
For example, designers use dynamic simulation tools (e.g., Radiance, Daysim) to better analyze lighting conditions.
Although such diagnosis supports a comprehensive design during the design stage, frequent layout changes
afterward prevent the existing design from remaining optimal. Besides, these tools cannot provide real-
time feedback on the lighting and are unable to estimate the occupants’ perception of the lighting environment.
In conclusion, lighting is crucial today in terms of occupants’ wellbeing and work productivity. Existing
studies have proved the relationship between them and corresponding lighting properties. However, current
lighting design projects still follow a conventional guide-compliance approach. This cannot satisfy the
purpose of visual comfort, as its reference values deviate from actual occupants’ demands. Even though
techniques have been developed to improve lighting design, they still cannot take care of real-time conditions
or users’ visual satisfaction and sensation.
1.2. Eye pupils’ reactions to light – a human physiological mechanism to retain homeostasis
The pupil is a hole located at the center of the iris of the eye, giving light access to the retina. Physically, the
variation of brightness will result in the dilation and contraction of the pupil, which is regulated by the iris to
ensure that the right amount of light enters. In detail, the size of the pupil is determined by the activity of two
kinds of smooth muscle in the iris – the sphincter pupillae and dilator pupillae. They operate in the opposite
way. When the sphincter pupillae contracts, the pupil shrinks; when the dilator pupillae contracts, the pupil
dilates. This physiological activity retains a stable physiological condition of the human body by reducing
changes from the ambient environment, e.g., the pupil dilates to allow more light to enter the eye in a dark
environment.
Based on this physiological principle, researchers have conducted several experiments to study the reaction
of eye pupils to different visual stimulations. Berman et al. (1997) tested the eye pupil size patterns under
different illuminance levels, wall spectral reflectivity (i.e., different wall colors), and light source spectral
distributions. It was revealed that log pupil area is linearly dependent on log scotopic illuminance, which is
measured in the plane of the viewer’s eyes. They used visual performance to label the eye pupil size, that is,
smaller pupils corresponded to improved visual performance, even though they usually occur with increased
disability glare in high-luminance conditions. Navvab (2001) extended the research scope and included color
temperature as another potential factor affecting visual acuity. He didn’t measure the occupants’ eye pupil
size in his study but considered the pattern discovered by Berman that high illuminance levels result in
smaller eye pupil size and enhanced visual performance. He kept the illuminance level identical for two
studied lamps with different color temperatures. It was proved that a full spectrum lamp of high color
temperature outperforms a conventional warm–color temperature lamp in terms of visual superiority. For
decades, eye pupil size has been discussed as the result of visual stimuli and then as a cause of variation in
visual performance. However, no correlation has been investigated between eye pupil size and visual comfort.
Choi (2017) reported this research gap and conducted fundamental research to determine the feasibility of
utilizing eye pupil size as an assessment of occupants’ visual sensation. The research validated its hypothesis
by conducting a human subject experiment and analyzing experimental data. Details will be introduced in
the literature review in Chapter 2. In addition, it was found that age can significantly influence the eye pupil
size change pattern in response to different light conditions (Association for Research in Vision and
Ophthalmology, Whitaker, Elliott, & Phillips, 1994). This means that eye pupil size, as a way of evaluating
occupants’ visual comfort, can accommodate variation within and between individuals. In other words, it can
be used as a parameter to construct an individual visual comfort model.
In conclusion, the eye pupil is the organ used by the human body to respond to the ambient lighting environment.
Its size can indicate an individual’s visual comfort, and its change pattern varies among human subjects. Therefore,
eye pupil size can be regarded as a parameter of an occupant’s visual comfort model.
1.3. Application of machine learning in real-world problems
This is an age of “big data.” Data contain loads of valuable information that is instructive for not only business
(e.g., user preferences and habits) but also the academic field (e.g., correlation and causality between
parameters studied). How to make good use of available data and extract the desired information has become
crucial. Machine learning is now widely developed and applied. It uses statistical techniques to
allow a computer system to “learn” from data without being explicitly programmed. The data-driven learning
process frees the system from following strictly static program instructions and allows it to adapt to
applications in different fields. In general, machine learning deals with two kinds of problems –
classification and regression. The major difference is the output: a classification problem requires predicting
the class of an input instance, while regression predicts a continuous value.
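As a minimal illustration of this distinction (a sketch using scikit-learn, the library adopted later in this study; the data, threshold, and coefficients below are placeholders, not experimental values):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 1500, size=(100, 1))        # e.g., illuminance in lux

# Classification: predict a discrete class of an input instance.
y_class = (X[:, 0] > 750).astype(int)          # placeholder labels: 0 = "dark", 1 = "bright"
clf = LogisticRegression().fit(X, y_class)
print(clf.predict([[1100.0]]))                 # -> a class label (0 or 1)

# Regression: predict a continuous value.
y_reg = 0.02 * X[:, 0] + rng.normal(0, 1, size=100)   # placeholder continuous target
reg = LinearRegression().fit(X, y_reg)
print(reg.predict([[1100.0]]))                 # -> a continuous value
```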
In building science, machine learning is highly popular for estimating building energy consumption or load
and predicting occupants’ thermal comfort. The former application is of significance for urban planning.
Access to a building’s energy consumption before construction will benefit a more balanced resources
distribution. The second investigation helps to develop a more advanced HVAC system control if the computed
comfort model performs as well as or better than conventional thermal comfort models (e.g., the
Predicted Mean Vote [PMV]).
It is worthwhile to further explore the potential of machine learning in the building science domain, as it uses
data to explain what may be difficult to figure out with rigorous scientific methods. This interdisciplinary
study can also bridge the gap between a building and its associated components with computers, allowing
digitalization of building information and automatic operations of the building system.
1.4. Electrodermal activity as an indicator of the human psychological state
As the primary interface between an organism and its environment, skin engages in diverse physiological
and psychological functions, such as thermoregulation, vitamin production, immunoreaction, emotional
communication, etc. This complexity of function shows why the skin is densely innervated. In particular,
electrodermal activity (EDA), which refers to the measurable changes in the electrical property of skin (i.e.,
skin conductance), reflects the autonomic innervation of sweat (sudomotor) glands.
In brief, the skin’s conductance changes with the state of sudomotor glands in the skin. The sympathetic
nervous system controls the sweating. A psychological or physiological stimulus can strongly arouse the
sympathetic branch of the autonomic nervous system. Consequently, sweat gland activity increases, which in
turn increases the skin conductance. This mechanism reveals the feasibility of using EDA as a measure of
emotional and sympathetic responses. EDA is measured non-invasively by applying a low, constant voltage
(e.g., 0.5V) to two electrodes placed on the skin and consists of two components – skin conductance level
(SCL) and skin conductance responses (SCRs). SCL provides a tonic level of the skin’s electrical conductivity.
It is related to the background characteristics of the signal. SCR shows the phasic changes of EDA, which
refer to the faster-changing components of the signal. In general, SCRs receive more interest within the
research field owing to their ability to reflect stimulus-specific responses. By conducting event-related analysis (e.g.,
gauging the amplitude of the elicited SCRs) of SCR as concomitants within psychophysiological paradigms,
EDA can be used as an indicator of more general psychophysiological states (Boucsein, 2012a).
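To make the SCL/SCR distinction concrete, the sketch below separates a synthetic skin conductance signal into a tonic estimate (a slow moving average) and a phasic residual. This is an illustration only, with an assumed sampling rate and smoothing window; it is not the decomposition procedure used in this study, and dedicated EDA tools implement more rigorous methods.

```python
import numpy as np

fs = 4                                   # assumed EDA sampling rate (Hz)
t = np.arange(0, 120, 1 / fs)

# Synthetic skin conductance (microsiemens): slow tonic drift plus
# event-related phasic bumps and measurement noise.
rng = np.random.default_rng(1)
tonic = 2.0 + 0.005 * t
phasic = 0.3 * np.exp(-((t % 30) - 5) ** 2 / 8)
eda = tonic + phasic + 0.02 * rng.standard_normal(t.size)

# Crude decomposition: a long moving average approximates the tonic SCL;
# the residual fast fluctuations approximate the phasic SCR component.
win = 20 * fs                            # 20 s smoothing window (assumption)
scl = np.convolve(eda, np.ones(win) / win, mode="same")
scr = eda - scl
print(f"mean SCL: {scl.mean():.2f} uS, max SCR deflection: {scr.max():.2f} uS")
```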
Introduction of EDA as a psychophysiological factor is likely to fill the research gap between the occupants’
psychological state and their reported subjective comfort. General survey procedures have been adopted as
the primary method to collect and analyze the occupant’s perception of comfort in an IEQ study. As for
thermal comfort studies, they are criticized because subjectively perceived stress or strain brings large
variability between and within people, so the thermal levels (either thermal sensation or thermal satisfaction)
recommended by those studies are unreliable (Auliciems, 1981). Luo (2016) provided more solid evidence
that the subject’s perceived accessibility to thermal environment control improved thermal comfort
perception. Therefore, it is necessary to validate the subject’s subjective report from a psychological
perspective to corroborate research findings.
Since such a psychophysiological model of comfort perception has received insufficient attention,
fundamental studies are required to identify paired parameters and establish the potential correlation in preparation
for future advanced application. This study plans to use electrodermal parameters as the reflection of
occupants’ psychological states and analyze their relationship with visual sensation and visual satisfaction.
A general conclusion is expected to reveal the influence of occupants’ emotions or stress conditions on
perceived visual comfort and extend the existing scope of IEQ research.
1.5. Goals and objectives
This research deals with the visual comfort problem in today’s office environment so as to improve the
occupants’ wellbeing and work productivity. A lighting control tool signaled by the prediction of visual
comfort level will be designed to achieve the research goal. The visual comfort prediction is achieved with a
machine learning technique, which uses patterns extracted from existing data to predict the characteristics of a
future instance. Eye pupil size is the primary data collected because it reflects the individual’s response
to ambient lighting and can represent an occupant’s visual comfort level. A user test is
conducted to validate the research findings, and a control principle is designed. EDA parameters are
introduced as a psychophysiological factor in the research. They are utilized as a reflection of an occupant’s
psychological state and analyzed to clarify any correlation with the subject’s perception of visual comfort.
Chapter 2. Background and Literature Review
This chapter explores previous studies to provide insight into the research problem and background
information about the methodology adopted. Reviewing the importance of lighting within IEQ research
clearly identifies the benefits of the visual comfort tool designed here. Background information is also
studied to explain why eye pupil size was chosen as the primary parameter of interest. Because
psychophysiological variables are newly investigated in this work, relevant studies are reviewed to refine
the approach and the scope of the research.
2.1. Work productivity and human wellbeing associated with the light environment
As human photoreception consists of a visual and a non-visual path, light is significant not only for visual
activities but also for human physiological regulation. The efficiency of visual activities accomplished in the
office environment is reflected in the work productivity of employees. Irregularity in human physiological
regulation would affect occupants’ health negatively.
In Tanabe and Nishihara’s (2004) research, it was found that higher illuminance levels give rise to a more
activated brain. They used the amount of cerebral blood oxygenation to evaluate the brain’s activity, as a
more difficult task requires more oxygenated hemoglobin and total hemoglobin concentration. As the brighter
condition allowed for the generation of more oxygenated hemoglobin, it was proved that a high illuminance
level is beneficial for increasing work productivity. A field study by Juslén, Wouters, and Tenner (2007)
corroborated this conclusion. They compared the speed of production of electronic devices under horizontal
illuminance of 800 and 1200 lux in a factory in the Netherlands. It was revealed that in summer, the speed
with 1200 lux at the working plane was 2.9% higher than with 800, and in winter, the percentage increased
to 3.1%. Apart from the illuminance level, correlated color temperature and lighting uniformity are the other
lighting parameters found to be associated with occupants’ activities. Sun and Lian (2016) proved a higher
correlated color temperature contributed to a more satisfying working environment by showing a decreased
melatonin level and increased tear mucus ferning quality. The observed phenomenon was shown to be
positively related to the degree of visual satisfaction during work. Boyce (1979) brought lighting uniformity
into the discussion through research examining problems in local lighting installations. It was recommended
that at least 1 square meter with uniform illuminance should be provided at the working plane and the highest
illuminance should occur within this uniform area. Higher illuminance outside the central uniform area gave
rise to distraction and irritation.
Fatigue is a prominent indicator of wellbeing issues related to lighting conditions. Weston (1953) first
defined visual fatigue as excessive muscular exertion and provided the general conclusion that suboptimal
lighting conditions for particular visual tasks accelerate the occurrence of fatigue. Tanabe and Nishihara
(2004) designed a human subject experiment to evaluate subjective symptoms of fatigue. Test subjects were
exposed to two different illuminance conditions – 800 lux and 3 lux. They were required to report the
symptoms of fatigue, which were predefined in three categories by the Japan Society for Occupational Health
(Table 2.1). The order of the categories, based on the number of complaints, would give the type of fatigue felt
by occupants. It was reported that the subjects complained about feeling mental fatigue after exposure to the
3-lux condition. No explicit relationship has been found between the health-related aspects of lighting
conditions and qualified parameters besides fatigue. However, some fundamental research and empirical
studies suggest the importance of certain components of lighting installations with respect to occupants’
wellbeing, such as spatial distribution of light, duration of light, light color, and lighting level (van Bommel
& van den Beld, 2004).
I | II | III
My head is drooping | I can’t think clearly | I have a headache
My whole body is getting tired | Talking would take some effort | My shoulder muscles are tense
My legs are getting tired | I am starting to feel nervous | My back hurts
I can’t stop yawning | I can’t concentrate | Breathing takes an effort
My mind is blank | I am losing interest in my work | I feel thirsty
I am starting to feel drowsy | I am starting to forget things | My voice is probably hoarse
My eyes are getting tired | I am uncertain | I feel giddy
I feel stiff and clumsy | I am worried | The small muscles around my eyes are twitching
I would be unsteady on my feet | I am sitting badly | My limbs are shaking
I feel like lying down | I feel impatient | I feel unwell
Table 2.1 Three types of subjective symptoms of fatigue (Tanabe & Nishihara, 2004a)
2.2. Eye pupil size as a representation of individual visual comfort
Previous research done by Choi and Zhu (2015) lays the foundation for this study, as it proved the feasibility
of using human eye pupil size as a representation of an occupant’s visual comfort level. The research adopted
human subject experiments to prepare data for assessment of the correlation between human eye pupil size
and visual sensation. The subjects were required to report their visual sensation, evaluated on a 7-point scale,
under a series of lighting intensities ranging from 50 to 1450 lux with an interval of 150 lux. It used a
normalized filter to eliminate the variations among individuals, which is defined as
$$\text{Normalized eye pupil size (\%)} = \frac{\text{Pupil size}(i) - \text{Pupil size}(\text{neutral sensation})}{\text{Pupil size}(\text{neutral sensation})} \times 100,$$
where i refers to the subject’s visual sensation in response to an illuminance level. The research found that
each level of visual sensation can be differentiated by the eye pupil size with a confidence interval of 95%
(Figure 2.1). In addition, a different change rate was found between the change from dark to neutral and from
neutral to bright, even though an irregular pattern was reported with respect to different group features (e.g.,
gender, age, eye color, and glasses-wearing conditions) as a result of insufficient sample size (Table 2.2). It can
be deduced from these two findings that eye pupil size and its change rate can be utilized to describe
occupants’ visual sensation in response to their lighting environment. Furthermore, visual comfort is likely
to be quantified by the physiological factors of the body.
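In code form, this normalization can be written as follows (a minimal sketch; the pupil sizes in the example are illustrative values, not data from the cited study):

```python
def normalized_pupil_size(pupil_size_i, pupil_size_neutral):
    """Percent deviation of the pupil size at sensation level i from the
    pupil size recorded at the neutral visual sensation."""
    return (pupil_size_i - pupil_size_neutral) / pupil_size_neutral * 100.0

# Example: a pupil 10% smaller than at the neutral sensation -> -10.0
print(normalized_pupil_size(63.0, 70.0))
```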
Figure 2.1 Normalized eye pupil size vs. visual sensation (95% confidence interval plot)
Table 2.2 Average change rates grouped by sets of demographic features
2.3. Machine learning in building science domain
Machine learning has been widely investigated in the building science domain as a statistical tool to make a
valuable prediction for building energy consumption, system load, or the indoor thermal comfort index. Its
performance was evaluated to be better than simple statistical models such as linear regression due to its
more accurate description of data patterns. A crucial problem with the application of machine learning is
determining which algorithm (or model) can provide the best solution to the question encountered. This
determination of the optimum algorithm depends on various factors, such as the problem definition and data
characteristics. Therefore, besides providing the state of the art of machine learning in the building science
domain, this subsection will prepare a list of algorithms that are worthy of trial in this study.
Aiming to investigate whether increasing development density actually leads to energy savings, Robinson
and his colleagues (2017) developed a novel technique to predict the energy consumption of commercial
buildings at the building level, which deals with the inaccessibility of spatial energy consumption information
at a granular scale. The prediction model utilized a machine learning model that was trained on the 2012
Commercial Building Energy Consumption Survey microdata and validated on the Local Law 84 (LL84)
dataset from New York City. The model is able to predict the energy usage (e.g., annual major fuel
consumption) of a commercial building based on the five input features – the square footage of the building,
the principal building activity, the heating and cooling degree days, and the number of employees. They
compared the performance of 13 models and found that gradient boosting regressor outperformed the rest of
the models with an R² of 0.82. R² is computed between predicted values and actual values and ranges from
0 to 1; a value approaching 0 represents an estimate that deviates from the actual values. This research
revealed the feasibility of machine learning in building energy estimation even though improvements are still
required for higher precision.
It has been identified that prediction of an occupant’s thermal comfort is beneficial for both building energy
saving, since it helps eliminate extra cooling, and maintaining a healthy indoor environment, as it deals with
a potential delay in human thermal sensation. Chaudhuri et al. (2017) used six machine learning classifiers
(support vector machine [SVM], artificial neural network [ANN], logistic regression [LG], linear discriminant
analysis [LDA], k-nearest neighbors [KNN], and classification trees [CT]) to develop a prediction tool to
estimate the participants’ thermal comfort levels (“cool-discomfort,” “comfort,” or “warm-discomfort”). The
prediction required an input feature set consisting of Fanger’s parameters (e.g., air temperature, mean radiant
temperature, relative humidity, airspeed, clothing level, and metabolic rate), outdoor effective temperature,
age, and gender. They used a public database from ASHRAE to train and test the model. The database
contained 818 surveys about occupants’ thermal sensation across buildings in Singapore. Naturally ventilated
(NV) buildings and air-conditioned (HVAC) buildings were discussed separately because age information
was missing for the surveys taken in the air-conditioned buildings. The results showed that SVM, ANN, and LG
ranked as the top three in terms of test accuracy for both NV and HVAC buildings. Besides, it was found that
the data-driven method, with accuracies of 73.14–81.2%, outperformed the conventional (PMV) method,
with accuracies of 41.68–65.5%. Instead of using a questionnaire to collect thermal comfort feedback from
each individual, Kim et al. (2018) used occupants’ heating and cooling behaviors to label their thermal
preferences. They adopted a personal comfort system, a chair with the ability to heat and cool, and conducted
a 7-month field study in an office building to collect thermal behavior data as well as environmental and
HVAC system data. A total of 64 features were utilized for the model training, and six algorithms
(classification tree [cTree], Gaussian process classification [GPC], gradient boosting method [GBM],
support vector machine [kSVM], random forest [RF], regularized logistic regression [regLR]) were
considered to solve the multiclass classification problem of an occupant’s thermal preference (“warmer”/”no
change”/”cooler”). The prediction models were established for 34 participants individually. The accuracies
of models varied among participants, and it was concluded that algorithms with the ability to control high
dimensions and noise in data (e.g., RF, kSVM, regLR) possessed higher accuracy than others. In addition,
model convergence was introduced as a new assessment criterion for model performance. It tells whether the
current model can generate stable predictions or not and is closely related to the size of the training dataset.
The finding was that a sample size of 64 was required for convergence on average. However, individual
subjects showed remarkable variations, which means the amount of data required to achieve a stable
prediction differs among individuals.
Figure 2.2 (a) Personal comfort system; (b) summary of predictive performance of conventional comfort models and personal
comfort models (J. Kim et al., 2018)
2.4. Electrodermal responses to stress and emotion
EDA parameters are regarded as suitable approaches for continuously monitoring autonomic nervous system
(ANS) activity elicited by stress because the sympathetic branch of the ANS, which determines EDA,
predominates in stress states (Boucsein, 2012b).
Firstly, tonic EDA level (or SCL) was investigated in association with stress states. The Lazarus group (1966)
reported an increased SCL in experiment subjects while they were being presented with stressful scenes of
films. After editing the stress-inducing soundtracks within the films and presenting them again, a decrease in
SCL was observed. This led the researchers to refine their research scope to the role of anticipation of stressful
events. Nomikos, Opton, Averill, and Lazarus (2013) designed an experiment focusing on the effect of
anticipatory periods preceding stressful events. Two groups of participants of both genders, 26 in total,
watched two versions of an industrial safety film depicting lumbermill accidents. One group was presented
with the full version, and the other group viewed a short version in which the anticipatory scenes before the
accidents were deleted. The results showed that the latter group had a lower increase in SCL. Then, the
experimental design was elaborated to fully explore the EDA activities during the anticipatory period. In
Folkins’s (1970) experiment, participants were exposed to a threat of repeated electric shocks. The threat
only came with an announcement delivered at different anticipation intervals (5 and 30s; 1, 3, 5, and 20 min),
and no actual shocks were given. A continuing increase in SCL level was found during anticipation periods
of 5 s, 30 s, and 1 min. The pattern of the 3- and 5-min intervals was shown to be a plateau with two rises at the
start and end of the anticipation period. During the 20-min interval, SCL decreased to base level after 2 min
and increased at the 16th minute.
Secondly, SCRs were brought into discussion when a further investigation into anticipatory stress was
initiated. Temporal uncertainty, defined by the occurrence time of stimuli, and event uncertainty, described
by the probability of occurrence of the stimuli, were introduced as new control variables. In experiments
designed by Monat, Averill, and Lazarus (1972), participants were equally divided into four groups, each of
which was assigned to an experimental condition – (1) 100% stimuli at time known, (2) 100% stimuli at time
unknown, (3) 50% probability of stimuli at time known, (4) 50% probability of stimuli at time unknown. The
stimulus anticipated was a threat of shock, and the anticipation period was set to 3 minutes. The research
showed a continuing decrease in the mean SCR amplitude under the condition of temporal uncertainty;
however, there were no distinctions between different conditions of event probability, as an identical course
of electrodermal responses was reported. Niemelä (1975) conducted research focusing on the effect of long-
lasting anticipatory periods on SCRs. They adopted the lumbermill accidents film as the stressful event
anticipated after presentation of a surgery film. The period was designed to be 0 to 3 days. It was revealed
that a shorter interval gave rise to lower SCRs in response to the accident scenes.
Thirdly, non-specific EDR (NS.EDR) was suggested as a more appropriate indicator of anticipatory stress
than electrodermal level. Kilpatrick (1972) conducted research that compared SRL and NS.SRR frequency.
There were two groups of participants – one was given anticipation of intelligence test and the other was
given control instructions. SRL and NS.SRR were monitored during the anticipation interval and the
subsequent performance test. The outcome showed that NS.SRR, not SRL, can differentiate the experimental
group and the control group. Besides, a remarkable SRL reduction was observed during the performance test.
It was interpreted that EDL is affected by cognitive demands, while NS.EDR frequency is influenced by
anticipatory stress elicited (Katkin, 1965).
It has been found that emotional arousal is maintained by circulation of neuronal activity in a Papez circuit
in which emotions can get access to ANS programs, which are stored in the hypothalamus, and elicit a
concomitant autonomic response (Papez, 1937). Therefore, ANS parameters are capable of quantifying the
emotions. EDA, grouped with heart rate (HR), electromyography (EMG), etc., was studied in the initial
psychophysiological emotion research. In Ax’s experiment (1953), he compared the patterns of ANS
variables under anger and fear. The results revealed that EDA is one of the variables that can differentiate
these two emotional states. In detail, a remarkable increase of NS.EDR frequency was observed under anger
in comparison to fear, while the EDL of fear is significantly higher than that of anger. Pleasure was added in
Stemmler’s (1989) study. It was reported that an increase of forehead SCL emerged for anger and fear, but
the pleasure condition yielded a decrease of forehead SCL.
With the development of research, emotions were refined into more dimensions, intensity and valence, to
analyze their correlations with ANS parameters. It has been suggested that EDA cannot differentiate the
valence of emotions but is able to tell the intensity of arousal. This statement is supported by several studies.
In the experiment by Winton, Putnam, and Krauss (1984), a higher SCR amplitude was reported for both
highly pleasant and unpleasant stimuli as compared to neutral ones. Besides, it was observed that SCR
amplitude was high under intense emotions and low in moderate categories. Both valence and intensity of
emotions were rated based on subjective variables reported by experiment participants. Similar relationships
were found by Greenwald, Cook, and Lang (1989), who concluded that ratings of arousal were positively
related to SCR amplitude, and ratings of pleasantness were connected to HR.
2.5. Conclusion
In summary, lighting and its parameters have been recognized as crucial indoor environmental factors to
occupants’ productivity and wellbeing. Illuminance is the element that has been studied most frequently. It is
worthwhile to adopt it as the primary lighting component to be investigated, since the existing conclusions
clarify its close relationship with variable work performance and distinct psychological states.
Fundamental research has demonstrated the relationship between human eye pupil size and visual sensation.
This fact provides the theoretical support for using eye pupil size as a variable monitored in a lighting control
loop.
Machine learning, as an interdisciplinary approach in the building science field, has allowed recent
researchers to make better use of building- and occupant-related data. It helps to establish various prediction
models, especially for building energy consumption and residents’ thermal comfort index. The most effective
algorithms have been revealed to be those able to handle high-dimensional and noisy data, such as
regLR, SVM, and ANN. Additionally, the attempt to build a personal comfort model is inspiring for this
research. Its superior performance in comparison with conventional models driven by multi-occupant data
proves its feasibility for future applications.
EDA is a reliable tool in the area of stress and emotion research, even though no general theoretical
framework has been revealed for using different EDA parameters to explain various emotional contexts
(Boucsein, 2012b). The review of current research recommends collection of SCL and SCR, especially
NS.SCR, for an investigation into participants’ psychological states during an experiment. It should be noted
that a controlled condition may be required as a reference state to highlight the effects of experimental
conditions.
Chapter 3. Methodology
This chapter elucidates how the research was conducted. The research had two parts – the
development of the visual comfort tool and a fundamental study of EDA in association with the visual comfort
perceived by occupants. A visual comfort prediction model and a lighting control model constitute the tool.
The establishment of the visual comfort prediction model involved the application of machine learning
algorithms; the source of the data and the choice of algorithm were the key problems addressed. The control model
required an investigation into how to adjust the light settings based on the predicted visual sensation level.
Control parameters such as control steps and change per step were explored. The fundamental study of occupants’
psychological states versus corresponding visual comfort perception started with an exploration of EDA
parameters. It attempted to choose representative EDA parameters and initiate a valuable discussion that
clarifies the influence of occupants’ psychological states on the subjective reports of their visual
comfort levels.
3.1. Methodology diagram
Figure 3.1 Research methodology diagram
3.2. Human subject experiment for visual comfort modeling
The visual comfort prediction model was established using a machine learning approach. Figure 3.2 provides
a detailed illustration of how this method works. There are two steps for the application of machine learning
– training and prediction. The training process allows for extraction of data patterns from input examples.
Since visual comfort prediction is a supervised machine learning problem, the input dataset should be
described by features and tagged with labels. Categorical labels are expected for a classification task. The
outcome of the training is a classifier model computed by the machine learning algorithm. During prediction,
the labels of new instances are generated by a classifier obtained through the training. This prediction process
usually follows a model testing period during which the performance of the classifier will be evaluated on a
new collection of labeled data. This evaluation result provides a criterion for choosing the best algorithm for
the case studied. The definitions of features and labels for this research and its mathematical foundation will
be elaborated in Subsections 3.2.1 and 3.3.
Figure 3.2 Workflow of machine learning problem
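As a concrete sketch of this train–test workflow with the three algorithms adopted in this study (the scikit-learn estimator names are real; the feature matrix and labels below are random placeholders for the pupil-size features and visual comfort votes described in the following subsections):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Placeholder data: rows = instances, columns = features
# (e.g., moving-averaged pupil size, pupil-size gradients, illuminance).
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = rng.integers(-2, 3, size=300)    # visual comfort labels on a 5-point scale

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

for clf in (GaussianNB(), LogisticRegression(max_iter=1000), SVC()):
    clf.fit(X_train, y_train)                          # training: extract data patterns
    acc = accuracy_score(y_test, clf.predict(X_test))  # testing: evaluate the classifier
    print(type(clf).__name__, round(acc, 3))
```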
3.2.1. Input (or features) and output (or labels) variables
This subsection introduces the definitions of the input and output variables of the prediction model and the
process of preparing the dataset required for the model’s establishment. The data collection process was
accomplished by conducting a human subject test. The introduction provided covers the laboratory settings,
experiment design, apparatus information, and demographic characteristics of human subjects. The dataset,
as mentioned in the diagram, was prepared with features and labels for the ease of training and testing
different classifiers.
3.2.1.1. Eye pupil size
Eye pupil size was captured by a mobile tracking eye module manufactured by ASL (Figure 3.3). The module
comes with tracking glasses, a display transfer unit, and a laptop. The tracking glasses are mounted with two
cameras – a view camera to catch what the wearer is looking at and an eye camera to record the eye pupil
size data. Both cameras are installed at the right eye side, which means the device only allows a single eye’s
data to be collected. The display transfer unit powers the tracking glasses and transfers the monitoring of the
wearer’s eye and his or her view to the computer. The display transfer unit also has its own touchscreen to
display the monitoring and allow the user to switch the displays from different cameras. The laptop is
preloaded with the Mobile Eye XG software package, which provides an interface (Figure 3.4) for data
collection, apparatus calibration, and real-time monitoring of what has been measured.
Figure 3.3 Mobile tracking eye module (left) and eye-tracking glass (right)
Figure 3.4 Software interface
The sampling rate of the device is 30 Hz. The raw data consist of eye data and view data. Table 3.1 provides
a snapshot of raw data collected in comma-separated values (CSV) format. The first two columns describe
the timeframe of the instance. Columns 3–7 are variables related to the wearer’s eye. In detail, spots x and y
tell the coordinates of the anchor point highlighted by a red cross in Figure 3.4. The camera keeps tracking
the wearer’s pupil by tracing the anchor point. Pupil x and pupil y are the coordinates of the pupil’s center.
Pupil r is the original data required by the model, which measures absolute eye pupil size in pixels. The last
two columns come from the view camera and tell the position of the focus of sight.
CSV File Version 3
Avi TimeStamp | Frame | Spot x | Spot y | Pupil x | Pupil y | Pupil r | Scene x | Scene y
0:00:00.00 | 1 | 225.04 | 161.7 | 204.11 | 93.53 | 72.91 | 169.12 | 73.5
0:00:00.03 | 2 | 268.01 | 169.82 | 287.68 | 107.18 | 79.15 | 251.31 | 86.16
0:00:00.06 | 3 | 268.01 | 169.82 | 291.08 | 106.26 | 78.49 | 254.25 | 83.19
0:00:00.10 | 4 | 265.98 | 167.23 | 287.77 | 102.73 | 78.64 | 253.45 | 79.6
0:00:00.13 | 5 | 262.63 | 166.16 | 280.02 | 99.52 | 78.8 | 245.85 | 77.13
0:00:00.16 | 6 | 262.63 | 166.16 | 279.97 | 99.59 | 78.97 | 245.08 | 76.52
0:00:00.20 | 7 | 261.75 | 166.45 | 279.29 | 99.61 | 78.92 | 244.73 | 76.4
0:00:00.23 | 8 | 261.51 | 166.67 | 278.93 | 99.93 | 79.33 | 245.14 | 76.3
0:00:00.26 | 9 | 261.74 | 167.5 | 280.73 | 101.5 | 79.82 | 247.46 | 77.75
Table 3.1 Sample raw data
There are two types of features derived from the absolute eye pupil size. The first one is the moving average. "Given a sequence $\{a_i\}_{i=1}^{N}$, an $n$-moving average is a new sequence $\{s_i\}_{i=1}^{N-n+1}$ defined from the $a_i$ by taking the arithmetic mean of $n$ terms, $s_i = \frac{1}{n}\sum_{j=i}^{i+n-1} a_j$" (Kenney & Keeping, 1962). This parameter filters out the short-term fluctuations within the raw data but keeps the long-term trend. The choice of the value $n$ is discussed in detail in Chapter 4.
The second derived feature is the gradient. It is defined as the difference between the moving average of eye pupil size at two individual timeframes, that is,

$$\mathrm{gradient}(t) = S_t - S_{t-n}$$

where $S$ is the absolute pupil size processed by moving-average filtering, $t$ is the current time, and $n$ is the time difference.
The determination of the time difference should be based on the size of the moving-average filter window. Several time-difference values were tried as candidate input features, and the most correlated ones were kept during feature selection.
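As a minimal sketch of how these two derived features can be computed from a pupil-size series sampled at 1 Hz (the function names and toy values below are illustrative and are not the Appendix B code, which was written in MATLAB):

```python
import numpy as np

def moving_average(x, n):
    """n-term moving average: s_i = (1/n) * sum of x_i ... x_(i+n-1)."""
    return np.convolve(x, np.ones(n) / n, mode="valid")

def gradient_feature(s, n):
    """Difference of the filtered series at two timeframes: S_t - S_(t-n)."""
    return s[n:] - s[:-n]

# Toy example: pupil size sampled at 1 Hz (values in pixels)
pupil = np.array([73.0, 66.9, 66.9, 62.4, 65.0, 67.6, 70.6, 74.8, 74.5, 71.2])

smoothed = moving_average(pupil, n=3)   # a 30 s window at 1 Hz would be n=30
grad = gradient_feature(smoothed, n=2)  # e.g., a 2 s gradient of the filtered series
print(smoothed, grad)
```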
3.2.1.2. Illuminance
Illuminance measures the amount of light incident on a surface, defined in lumens per square meter, or lux.
It is the only lighting-setting input in the prediction model. A light meter (OMEGA HHLM-1) was used to measure its value, which was noted manually during the human subject test. The illuminance was measured
at the working plane (i.e., the surface of the table).
Figure 3.5 Stick-type light meter
There are 28 LED light bulbs hung on the ceiling of the experiment chamber (Figure 3.6). The light bulbs
were installed in three lines of track with 16-inch spacing. The light bulbs on each track were placed 10
inches apart. Table 3.2 provides their technical details. The bulbs are dimmable and color-tunable and can be controlled wirelessly using the controller provided with the product, which adjusts the luminance by percentage. In the human subject test, different illuminance settings were achieved by adjusting the luminance of the LED light bulbs on different tracks. This is discussed in Subsection 3.2.5.
Figure 3.6 Layout of experiment chamber (in inches)
Brand Coidak
Item Weight 4.6 ounces
Voltage 230 volts
Fixture Features Dimmable
Light Direction Adjustable
Type of Bulb LED
Luminous Flux 800 lm
Wattage 9 Watts
Incandescent equivalent 63 Watts
Color Temperature 2700 Kelvin
Color Rendering Index (CRI) 75
Table 3.2 Technical details of the light bulb
Figure 3.7 Coidak 9W E26 dimmable LED light bulb
3.2.1.3. Visual sensation and visual satisfaction
Visual sensation and visual satisfaction were subjectively reported by the test subjects, collected through a
self-administered questionnaire, to evaluate the individuals’ visual comfort levels. A 5-point scale was used
to rate these two descriptive parameters of visual comfort: for visual sensation, too dark (-2), dark (-1), neutral
(0), bright (1), and too bright (2); for visual satisfaction, very dissatisfied (-2), dissatisfied (-1), neutral (0),
satisfied (1), and very satisfied (2). In psychometrics, 5-point and 7-point rating scales are commonly used
for attitude and opinion measures (Preston & Colman, 2000). Even though a 5-point scale loses certain
variability of interest, it provides higher-quality data compared with 7-point scales in terms of agree-disagree
rating (Revilla, Saris, & Krosnick, 2014). A 5-point format is also found to be able to increase the response
rate and the quality of responses (Babakus & Mangold, 1992). In addition, fewer points allow a smoother
connection between visual comfort prediction and the control model because control positions are designed
to be as simple as possible for this application.
3.2.1.4. Human subject experiment
The human subject test was designed to collect the data for the establishment of the model. The experiment
was conducted in the environmental chamber of Watt Hall at the University of Southern California University Park Campus. The chamber eliminates the influence of daylight because it is in the basement. The size of the entire chamber is 112'' × 120'' × 95'' (Figure 3.8).
Figure 3.8 Fish-eye view of the chamber
The experiment was mainly carried out at the workstation desk that was placed in the lab and along the
southern wall. Two chairs were set beside the table to seat the test subject and the experimenter. The interior
finish of the lab mimicked the real office environment. A monitor, a keyboard, and a mouse were placed on
the desk. During the experiment, the human subjects were required to complete computer-based tasks chosen according to their own preferences. The dimmable lamps were the only light sources, for ease of control. The walls were painted white to ensure reflectivity similar to that of a real office.
The indoor environmental conditions other than lighting were maintained at a stable and satisfactory level to minimize noise. The thermal condition was ensured by a ceiling ventilation diffuser, which not only regulated indoor temperature but also contributed to good indoor air quality. The entire chamber was insulated acoustically and thermally to prevent disturbance from ambient spaces. Air temperature, relative humidity, and the concentration of carbon dioxide were measured to fall into the ranges of 24.5 ± 0.5 °C, 32 ± 2.5%, and 610 ± 35 ppm, respectively.
Human subjects participated in the experiment to provide data for their visual comfort prediction model.
Each experiment could only accommodate one participant. Human subjects were recruited at random, based on their willingness to respond to an email request. All participants had been notified of the experiment process and what they were expected to accomplish. Rules and prohibitions were also clarified in advance.
Figure 3.9 Experiment arrangement
Each experiment took 1 hour and 46 minutes and seated one human subject only. The experiment was divided
into two parts. Firstly, the human subject took 10 minutes to adapt to the environment in the lab, especially
the lighting condition, and filled in the pre-experiment survey. The pre-experiment survey aimed to provide
a description of the human subject and his or her impression about his or her real-world office environment.
During the first 10 minutes, the experimenter also helped the human subject understand the experiment
process and wear the pupilometer and EDA sensor. Calibration was required for each subject because of the
variation of individuals’ interorbital distance and eye size.
The principle of the experiment was to record visual sensation and visual satisfaction level of a human subject
under different lighting settings. The lighting setting, primarily the illuminance on the working plane, was
changed every 8 minutes and 12 times per test. The illuminance for each step was predetermined before the
experiment. The values were picked from 100, 200, 400, 500, 650, 800, 950, 1100, and 1250 lux. The order
(Figure 3.10) was generated by Excel’s random function in an attempt to get rid of any human factors.
Different human subjects used a different order of settings. It should be noted that these levels fall within the range from 100 to 1400 lux, which was determined based on previous research finding that 100–1400 lux is the acceptable range for the majority of experimental subjects (Choi, Lin, & Schiller, 2018). Values above this range were recognized to harm visual ability because of the occurrence of glare.
Figure 3.10 Illuminance levels in experimental order for (a) Test Subject 1, (b) Test Subject 2, and (c) Test Subject 3
Visual satisfaction and visual sensation were surveyed for each step to label lighting settings. The participants
were asked to fill in the survey during the last minute of each step to make sure they had gotten enough
exposure to the current setting. Figure 3.11 provides a copy of the questionnaire used in the experiment. It is
divided into two parts. The pre-experiment part collected demographic information, which provides hints for future multi-occupant analysis. The second part was used during the experiment. The first question about the
brightness collects the test subject’s visual sensation about the lighting environment. The second question
collects the corresponding satisfaction. The third question works as a supplemental question that validates
the answer to the first question about visual sensation.
Figure 3.11 Survey

Test subject survey (Pre-experiment)
(Fill in the blanks or check the options that apply)

Part I. Personal Data
1. Name:
2. Gender: Male / Female
3. Year of Birth:
4. Eye Color: Dark Brown / Blue / Green / Other:
5. Ethnicity: American Indian / Asian / African American / Latino / Other Pacific Islander / White
6. Do you wear corrective lenses when working at a computer? Yes / No
7. If yes, what for? Myopia / Hyperopia / Radiation Protection
8. Which lighting condition do you prefer for computer-based work? Dark / Slightly Dark / Neutral / Slightly Bright / Bright

Part II: Work Description
1. How would you characterize your computer work tasks? Word Processing and Text Editing / Data Processing / Graphical Application / Information Browsing / Others:
2. How long do you spend in front of a computer during a normal working day?

Part III: Well-being & Lab Environment Satisfaction
1. How well did you sleep the night before participating in the test? [Very Badly] -2 -1 0 1 2 [Very Well]
2. How would you describe your current physical condition? [Very Poor] -2 -1 0 1 2 [Excellent]
3. How would you describe your current emotional condition? [Very Poor] -2 -1 0 1 2 [Excellent]
4. How do you rate the temperature in the lab? [Too Cold] -2 -1 0 1 2 [Too Hot]

Test subject survey (During experiment)
1. How do you feel about the brightness of this environment for working? [Too Dark] -2 -1 0 1 2 [Too Bright]
2. How do you feel about this whole lighting environment for working? [Very Dissatisfied] -2 -1 0 1 2 [Very Satisfied]
3. Are you willing to change the current lighting condition? No / Yes, to make it darker / Yes, to make it brighter

3.3. Adopted machine learning algorithms and fundamental mathematics involved
Training is a crucial part of machine learning. From a mathematical perspective, it is an iterative process of establishing the optimized function that maps features to labels. There are three significant components in the process: the hypothesis function, the loss function, and the optimization method. The hypothesis function is the expression of the statistical relationship between features and labels. Basically, it consists of terms derived from the features and their corresponding coefficients. The coefficients are randomized for the first iteration and updated at each subsequent iteration. The loss function tells the cost of adopting the hypothesis function at each
iteration. The optimization method determines the way of modifying the coefficients in the hypothesis
function. Through the iterations in the training, the coefficients converge to a fixed value, which means the
corresponding hypothesis function has been optimized with the least cost. The algorithms are characterized
by their loss functions and hypothesis functions, which are elaborated in the following subsections.
To aid the explanation, an example problem is assumed here. Suppose occupants' visual satisfaction labels, i.e., satisfied or unsatisfied, are determined based on the current light level ($X_1$) and the 10-min average of eye pupil size ($X_2$) during the designed recording period. The satisfaction label is denoted by $y$: $y = 1$ corresponds to satisfaction, and $y = 0$ indicates dissatisfaction. A set of data was collected for the training, expressed as

$$\{(X^{(1)}, y^{(1)}), (X^{(2)}, y^{(2)}), \dots, (X^{(m)}, y^{(m)})\}$$

Given a new instance with $X_1 = x_1$ and $X_2 = x_2$, a value of $y$ has to be determined based on the data pattern extracted from the training dataset.
3.3.1. Naïve Bayes (Alpaydin, 2014)
Unlike LG and SVM, naïve Bayes uses Bayes' rule to calculate the probabilities of the classes, and its decision is made through a risk-minimization process in order to provide a reliable prediction.

$P(y|x_1,x_2)$ describes the probability of outcome $y$ occurring given the observation $X_1 = x_1$ and $X_2 = x_2$. According to Bayes' rule, it can be calculated as

$$P(y|x_1,x_2) = \frac{P(y)\,p(x_1,x_2|y)}{p(x_1,x_2)} \quad (3\text{-}1)$$

$P(y = 1)$ refers to the prior probability, which is the proportion of satisfied occupants in the dataset collected. As it is prior knowledge of the occupants and independent of the observations, it is called the prior probability.

$p(x_1,x_2|y)$ is the class likelihood, i.e., the conditional probability of observing $x_1$ and $x_2$ for an instance that belongs to outcome $y$.

$p(x_1,x_2)$ is the marginal probability that $x_1$ and $x_2$ are seen regardless of the outcome. It is calculated as

$$p(x_1,x_2) = \sum_{y} p(x_1,x_2,y) = p(x_1,x_2|y=1)P(y=1) + p(x_1,x_2|y=0)P(y=0) \quad (3\text{-}2)$$

Both $p(x_1,x_2|y)$ and $p(x_1,x_2)$ are computed from the training set.

In general, the class predicted should have the maximum posterior probability calculated from Bayes' rule. In mathematical form,

Choose $y_i$ $(i = 1, 2)$ if $P(y_i|x_1,x_2) = \max_y P(y|x_1,x_2)$

In detail, this can be written as

Choose $\begin{cases} y = 1 & \text{if } P(y=1|x_1,x_2) > P(y=0|x_1,x_2) \\ y = 0 & \text{otherwise} \end{cases}$

To express the decision rule on the basis of loss and risk, it becomes

Choose $\alpha_i$ if $R(\alpha_i|x_1,x_2) = \min_k R(\alpha_k|x_1,x_2)$

and

$$R(\alpha_i|x_1,x_2) = \sum_{k=1}^{K} \lambda_{ik}\,P(y_k|x_1,x_2) \quad (3\text{-}3)$$

$\alpha_i$ denotes the action taken to assign a new input $(x_1,x_2)$ to class $y_i$, $i = 1, \dots, K$. $\lambda_{ik}$ describes the loss incurred by taking action $\alpha_i$ when the new input actually belongs to class $y_k$. The general decision rules stated above adopt a 0/1 loss principle, which hypothesizes that no loss is incurred for correct decisions and equal loss is incurred for errors, that is,

$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ 1 & \text{if } i \neq k \end{cases}$$
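This decision rule can be sketched in a few lines of Python under the naïve (class-conditionally independent) Gaussian assumption; all numbers below are made up for illustration and are not from the study's dataset:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical training summary: class priors and per-feature Gaussian
# parameters (mean, std) for X1 = light level and X2 = 10-min pupil average.
priors = {1: 0.6, 0: 0.4}                       # P(y=1), P(y=0)
params = {1: [(650.0, 200.0), (55.0, 5.0)],     # satisfied
          0: [(300.0, 150.0), (70.0, 6.0)]}     # dissatisfied

def posterior(x, y):
    """Unnormalized P(y) * p(x1,x2|y); the naive assumption factorizes the likelihood."""
    like = np.prod([norm.pdf(xi, mu, sd) for xi, (mu, sd) in zip(x, params[y])])
    return priors[y] * like

x_new = (500.0, 60.0)
scores = {y: posterior(x_new, y) for y in (1, 0)}
# Under 0/1 loss, risk minimization reduces to picking the maximum posterior.
y_hat = max(scores, key=scores.get)
print(y_hat)
```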
3.3.2. Logistic regression (Ng, n.d.)
The hypothesis function of logistic regression is given as

$$h_\theta(x) = g(\theta^T x) \quad (3\text{-}4)$$

and

$$g(z) = \frac{1}{1 + e^{-z}} \quad (3\text{-}5)$$

$g(z)$ is called the logistic function, whose output falls in the range between 0 and 1. This is the most important property that allows LG to be used as a classifier. The output of the hypothesis $h_\theta(x)$ is interpreted as the estimated probability that $y = 1$ on input $x$. $\theta$ is the coefficient matrix, and $x$ is the feature matrix; $\theta^T x$ describes the multiplication of the two matrices. The matrix form allows the calculation to be implemented in a computer-aided way. In detail, the coefficient matrix is a vector, that is, a matrix containing a single column. The feature matrix contains as many rows as there are instances and as many columns as there are features in consideration. It should be noted that the features do not merely refer to the directly observed parameters; they can also be derived from them. In the case stated above, the features can be $x_1 x_2$, $x_1^2$, $x_2^2$, $x_1 x_2^2$, $x_1^2 x_2$, etc. Derived features allow the establishment of a non-linear decision boundary in the classification problem (Figure 3.12). Example 1 is given below, based on the problem defined at the beginning, to provide a better illustration of what is included in the input matrices.

Example 1:

The hypothesis function is given as

$$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2)$$

In matrix form,

$$\theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \theta_3 \\ \theta_4 \end{bmatrix}, \quad X = \begin{bmatrix} x_0^{(1)} & x_1^{(1)} & x_2^{(1)} & (x_1^{(1)})^2 & (x_2^{(1)})^2 \\ x_0^{(2)} & x_1^{(2)} & x_2^{(2)} & (x_1^{(2)})^2 & (x_2^{(2)})^2 \\ x_0^{(3)} & x_1^{(3)} & x_2^{(3)} & (x_1^{(3)})^2 & (x_2^{(3)})^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots \end{bmatrix}$$

where $x_0^{(i)} = 1$ and $i = 1 \dots m$ ($m$ = number of instances), and

$$h_\theta(x) = g(\theta^T X)$$
Figure 3.12 The non-linear boundary defined by higher-order polynomial terms
The loss function is the criterion for choosing the coefficients $\theta$. By comparing the hypothesis function's output with the actual result, the loss function evaluates the error incurred by adopting a prediction derived from the hypothesis with coefficients $\theta_i$. The decision of $\theta$ is an optimization problem that aims to minimize the loss function. The loss function of LG is

$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m} y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log\left(1 - h_\theta(x^{(i)})\right)\right] \quad (3\text{-}6)$$

By considering the labels $y = 0$ and $y = 1$ separately for a single instance, the expression can be restated as

$$J(\theta) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$$

When the actual label is $y = 1$, Figure 3.13(a) shows that the loss function penalizes the learning algorithm with a large cost if the hypothesis outputs a value approaching 0. In turn, if the hypothesis predicts a value approaching the correct label, the corresponding cost is reduced or even eliminated. The situation is symmetric for the case $y = 0$.
Figure 3.13 (a) y = -log(x); (b) y=-log(1-x)
The only problem left for finding the optimal coefficients $\theta$ is the optimization method. Gradient descent is presented here as the general optimization strategy in machine learning problems; note that other methods are available for a faster determination of the optimal $\theta$. It is expressed mathematically as

Repeat until convergence $\left\{ \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) \right\}$

Convergence means that the loss function converges to a local minimum value. $\theta_j$ refers to a specific element of the coefficient matrix. The operator $:=$ assigns to $\theta_j$ a new value computed from the previous $\theta$. It is required that all elements of $\theta$ be updated simultaneously. In brief, by taking steps proportional to the negative of the gradient of the function at the current point, a local minimum can be reached. Figure 3.14 depicts the optimization process in the coordinates $(\theta_0, \theta_1, J(\theta_0, \theta_1))$.

Figure 3.14 Visualization of an optimization process

It should be noted that the starting point is usually determined randomly.
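A bare-bones NumPy sketch of this update rule applied to the logistic-regression loss of Eq. 3-6 (the learning rate, iteration count, and toy data are arbitrary illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Minimize J(theta) of Eq. 3-6; X includes a leading column of ones (x0 = 1)."""
    m, n = X.shape
    theta = np.random.randn(n) * 0.01   # random starting point
    for _ in range(iters):
        h = sigmoid(X @ theta)          # hypothesis for all m instances
        grad = (X.T @ (h - y)) / m      # dJ/dtheta for the logistic loss
        theta = theta - alpha * grad    # simultaneous update of all elements
    return theta

# Toy data: rows are [1, x1, x2]; y in {0, 1}
X = np.array([[1, 0.2, 0.1], [1, 0.9, 0.8], [1, 0.8, 0.9], [1, 0.1, 0.3]])
y = np.array([0, 1, 1, 0])
print(gradient_descent(X, y))
```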
To ameliorate the problem of overfitting, regularization is normally applied to improve model performance. Overfitting is defined as a distinct performance gap of the hypothesis between the training and testing datasets; that is, the hypothesis fits the training data well but fails to provide reliable predictions for test data. For example, in Figure 3.15, (c) is a case of overfitting: The algorithm is trying to find a hypothesis that fits every single training example well, but the resulting boundary is too contorted to classify future instances. Opposite to (c), (a) shows a case of underfitting, which means the boundary is too loose for either training or testing. Regularization keeps all the features but shrinks the magnitude of their corresponding coefficients $\theta_j$. For instance, by reducing the values of the higher-order coefficients $\theta_6$ through $\theta_n$ in (c) (Figure 3.15), the algorithm can generate a hypothesis similar to (b), which is thought to be an ideal classifier.
Figure 3.15 (a) $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2)$; (b) $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2 + \theta_5 x_1 x_2)$; (c) $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_2^2 + \theta_5 x_1 x_2 + \theta_6 x_1^2 x_2 + \theta_7 x_1^3 x_2^2 + \cdots)$
In practice, an extra regularization term is added to the cost function to shrink every coefficient. The new cost function is given as

$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m} y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log\left(1 - h_\theta(x^{(i)})\right)\right] + \lambda\sum_{j=1}^{n}\theta_j^2 \quad (3\text{-}7)$$
$\lambda$ is called the regularization parameter. It controls the trade-off between two goals: obtaining a hypothesis that fits the training data well and keeping the coefficients small. In practice, establishing the prediction model involves testing a series of $\lambda$ values.
3.3.3. Support vector machine (Ng, n.d.)
SVM generates an optimized classifying boundary whose separation margin between the classes is maximized. Instead of outputting a probability, the hypothesis directly predicts the class of the instance, that is,

$$h_\theta(x) = \begin{cases} 1 & \text{if } \theta^T x \geq 0 \\ 0 & \text{otherwise} \end{cases}$$

Its loss function is expressed as

$$J(\theta) = C\sum_{i=1}^{m}\left[y^{(i)}\,\mathrm{cost}_1(\theta^T x^{(i)}) + (1 - y^{(i)})\,\mathrm{cost}_0(\theta^T x^{(i)})\right] + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2 \quad (3\text{-}8)$$

$C$ plays a role similar to $\lambda$ in Eq. 3-7, controlling the trade-off between overfitting and underfitting; however, it applies to the term calculating the cost incurred. $m$ is the number of instances in the training dataset, and $n$ is the number of coefficients. $\mathrm{cost}_1(\cdot)$ and $\mathrm{cost}_0(\cdot)$ are the functions used to compute the cost. To provide a simplified explanation, the functions are conceptually depicted in Figure 3.16.

Figure 3.16 Curves of $\mathrm{cost}_1(z)$ and $\mathrm{cost}_0(z)$
To enable SVM to build nonlinear classifiers, kernels are applied as the main technique. The basic concept is to find better features than higher-order polynomials with which to compute the non-linear decision boundary. Here, the Gaussian kernel is demonstrated as an example (Ng, n.d.).

Given an input feature $x$, new features are computed depending on the proximity of $x$ to landmarks $l$. Normally, landmarks are chosen as the existing training examples (Figure 3.17).

Figure 3.17 Illustration of landmark

The new features $f$ are calculated as

$$f^{(i)} = \mathrm{similarity}(x, l^{(i)}) = \exp\left(-\frac{\lVert x - l^{(i)} \rVert^2}{2\sigma^2}\right) \quad (3\text{-}9)$$

The similarity function, which is the kernel, is applied to calculate the proximity of feature $x$ to landmark $l$. Eq. 3-9 shows the expression of the Gaussian kernel. The sign $\lVert \cdot \rVert$ represents the norm of the vector $x - l^{(i)}$, that is, the Euclidean distance between point $x$ and landmark $l^{(i)}$. $\sigma$ is the inherent parameter of the kernel. Details about how kernels work will not be introduced here, since they are beyond the scope of this applied research.
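A one-function sketch of Eq. 3-9, taking the landmarks to be a few made-up training examples:

```python
import numpy as np

def gaussian_kernel_features(x, landmarks, sigma=1.0):
    """f_i = exp(-||x - l_i||^2 / (2 * sigma^2)) for each landmark l_i."""
    diffs = landmarks - x                  # broadcasting over the landmark rows
    sq_dist = np.sum(diffs ** 2, axis=1)   # squared Euclidean distances
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Toy landmarks in (illuminance, pupil size) space
landmarks = np.array([[400.0, 55.0], [800.0, 48.0], [1100.0, 44.0]])
print(gaussian_kernel_features(np.array([650.0, 50.0]), landmarks, sigma=100.0))
```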
3.3.4. Scikit-learn for Python
Scikit-learn is a machine learning library (Figure 3.18) for the Python programming language. It was utilized
as the main source of machine learning algorithms in this research. The major advantage of using a library is
that the algorithms have been optimized for usage. It also saves time because scripting the loss function and
optimization method from the beginning requires lots of effort. For researchers outside the computer science
domain, it allows machine learning to be easily accessed as a data analytic tool.
Figure 3.18 Scikit-learn for Python
In this research, Gaussian naïve Bayes, logistic regression, and support vector machine were investigated. The corresponding models in Scikit-learn are sklearn.naive_bayes.GaussianNB ("sklearn.naive_bayes.GaussianNB — scikit-learn 0.20.2 documentation," n.d.), sklearn.linear_model.LogisticRegression ("sklearn.linear_model.LogisticRegression — scikit-learn 0.20.2 documentation," n.d.), and sklearn.svm.SVC ("sklearn.svm.SVC — scikit-learn 0.20.2 documentation," n.d.). Parameter settings for the models are summarized in Table 3.3. The code created is presented in Appendix A.
GaussianNB: var_smoothing = 1×10⁻⁹.
LogisticRegression: C = 1.0 (the inverse of regularization strength); class_weight = None (all classes are given weight one if not specified); random_state = 0 (the seed of the pseudo-random number generator used when shuffling the data); solver = 'lbfgs' (the algorithm used in the optimization problem); max_iter = 100 (the maximum number of iterations allowed for convergence of learning); multi_class = 'multinomial' (a multinomial loss function is assumed to fit the data).
svm.SVC: C = 1.0; kernel = 'rbf' (the radial-basis function (RBF) kernel); degree = 3 (the degree of the polynomial kernel function, ignored by the RBF kernel); gamma = 'scale' (the coefficient used in the kernel function, equal to 1 / (n_features * X.std())); class_weight = None; max_iter = -1 (no limit); random_state = None.
Table 3.3 Parameter settings for scikit-learn models
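As a sketch, the three models of Table 3.3 would be instantiated as follows (Appendix A holds the authoritative code; multi_class='multinomial' matches the scikit-learn 0.20 API cited above and has been deprecated in recent releases):

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

gnb = GaussianNB(var_smoothing=1e-9)

logreg = LogisticRegression(C=1.0, class_weight=None, random_state=0,
                            solver='lbfgs', max_iter=100,
                            multi_class='multinomial')

svm = SVC(C=1.0, kernel='rbf', degree=3, gamma='scale',
          class_weight=None, max_iter=-1, random_state=None)

# Each model exposes the same interface: fit(X_train, y_train), then predict(X_test).
```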
3.3.5. MATLAB for data processing
Because raw data cannot be directly fed into algorithms for either training or prediction, owing to issues such as noise and monotonicity, data processing is necessary for both feature preparation and labeling. MATLAB is the tool used for this process. It is a numerical computing platform and programming language developed by MathWorks. It contains diverse functions that allow the user to process and visualize data, establish a user interface, etc. Compared with Excel, MATLAB allows an automated process that is time-efficient for large-scale data processing. Appendix B shows the code created for the feature preparation.
3.4. Control model
3.4.1. Control logic and strategy
After the prediction model outputs the occupant’s visual comfort labels, the control model should take
corresponding actions. Firstly, according to visual satisfaction, the control model decides whether the
illuminance level should be changed or not. Because the control resolution (2 positions) is lower than that of
predictions (5 points), a grouping of the point scale is required to reduce the dimension of the visual
satisfaction label. It was hypothesized that labels 0, 1, and 2 were recognized as no actions required, and -1
and -2 were recognized as trigger points. If action is required, the visual sensation label should be analyzed
for the definition of the control action. Stepwise control is utilized as the strategy. In detail, if the lighting
environment is perceived as being bright, the controller will lower the luminance of the LED light bulbs by
a single step. The grouping of points will still be required here for the control decision. It was assumed that
the labels -1 and -2 corresponded to being dark, and 1 and 2 corresponded to being bright. The neutral visual
sensation was eliminated for the control actions, as neutral feedback under perceived dissatisfaction is
ambiguous for clear action. After the adjustment, new visual satisfaction and visual sensation labels are
generated as a result of the modified illuminance level to determine whether the next step should be executed
or not. In other words, a feedback loop continues until the occupant’s satisfaction is achieved. The step change
of the LED light bulbs is defined as the equivalent effect of reducing the illuminance at the working plane by
150 lux. Figure 3.19 shows the control theory graphically.
Figure 3.19 Control workflow
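The decision logic of Figure 3.19 can be sketched in Python as follows; the function names and the loop outline are hypothetical, since the actual logic was encoded on the Arduino board:

```python
STEP_LUX = 150  # one dimming step is equivalent to ~150 lux at the working plane

def control_action(vis_satisfaction, vis_sensation):
    """Return the illuminance change (in lux) for one feedback-loop iteration."""
    if vis_satisfaction >= 0:   # labels 0, 1, and 2: no action required
        return 0
    if vis_sensation > 0:       # dissatisfied and bright: dim by one step
        return -STEP_LUX
    if vis_sensation < 0:       # dissatisfied and dark: brighten by one step
        return +STEP_LUX
    return 0                    # neutral sensation is ambiguous: no action

# Feedback-loop outline (predict_visual_comfort and adjust_luminance are
# hypothetical placeholders for the prediction model and the dimmer interface):
# while True:
#     satisf, sens = predict_visual_comfort(features)
#     delta = control_action(satisf, sens)
#     if delta == 0:
#         break
#     adjust_luminance(delta)
```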
3.4.2. Microcontroller and light dimmer
The hardware required to achieve lighting control includes a microcontroller (Arduino Uno Rev3) (Figure
3.20) and an AC light dimmer (Figure 3.21). Arduino takes the digital input from the prediction model and
outputs another digital signal for the light dimmer. The control logic is encoded in Arduino’s board using
Arduino software (IDE) to write a program and upload it to the board. The AC light dimmer is a printed circuit board (PCB). It must be connected to a power source, an Arduino board, and the lighting fixtures. The digital output from the Arduino specifies the alternating current (AC) load, physically adjusted by the PCB, that
should be applied to the lighting fixtures. Figure 3.22 shows the connection between the Arduino board and
the light dimmer.
Figure 3.20 Arduino Uno Rev3
Figure 3.21 AC light dimmer
Figure 3.22 Arduino board connected to light dimmer
3.5. Electrodermal activity
3.5.1. Measurement of EDA
The EDA signal was recorded by the Biopac package (MP160) (Figure 3.23) through the entire human subject
experiment except for the period of adaptation. The package includes a data acquisition unit, a high-level
transducer module, and a wireless amplifier. The wireless amplifier consists of a receiver and a sender. The
sender was worn by participants, fastened with an adjustable band to their wrists. Two electrodes were pasted
on participants’ index and middle fingers and connected to the signal projector by cables (Figure 3.24). An
isotonic gel consisting of electrolytes, which contains 0.05% saline, was applied to the electrode paste for
signal recording (Figure 3.25). It helps with the buildup of sweat, which evaporates under normal conditions.
According to Boucsein et al. (2012), the method is recognized as an exosomatic recording with direct current.
In detail, a small voltage was applied to the two electrodes and a small resistor in series with the skin. The
small resistor provides a way of measuring current flow-through. The skin conductance, the reciprocal of
resistance, can be computed from Ohm’s law (≤ = ≥/f
µ
), where I is the current and E is a small voltage
applied on the skin only (excluding the voltage drop with a small resistor).
Figure 3.23 The Biopac package
Figure 3.24 Signal projector connected with electrodes
Figure 3.25 Electrode paste
3.5.2. Analysis of EDA parameters
The analysis of EDA was conducted in the AcqKnowledge 5 software, which comes with the Biopac package.
The workflow is presented below.
Figure 3.26 EDA parameters analysis flowchart
The summary of number of cycles, mean SCL, and SCR size was matched with visual comfort labels obtained
from the human subject experiment. It was expected that basic statistics would be able to reveal links between
the EDA parameters and participants’ visual comfort.
Figure 3.27 Sample data from a test subject.
Chapter 4. Data Processing and Preliminary Results
Data processing is essential for a machine learning problem: Raw data must be arranged in a proper format, in this case constituting features and classes. This chapter introduces how the data processing was implemented and provides preliminary results, such as the window size for the moving average filter and the timeframes for the gradients. In addition, a modification was made to the experiment design to deal with a problem revealed by the first three participants' surveys; it is described and explained in this chapter. To provide an overview of data characteristics, two samples were selected to generate illustrative figures.
4.1. Data processing for training dataset
Time     Abs_size  Mov_Ave_30s  Gra_30s  Gra_40s  Gra_50s  Gra_60s  Gra_90s  Gra_120s  Lux_Level  Vis_Sens  Vis_Satisf
0:03:30  73.05     70.16        3.48     4.78     7.72     7.11     4.75     -0.47     100        -1        0
0:03:31  66.93     70.03        3.19     4.52     7.52     7.14     4.50     -0.51     100        -1        0
0:03:32  66.94     70.04        2.94     4.53     7.49     7.39     4.52     -0.10     100        -1        0
0:03:33  62.42     69.81        2.42     4.36     6.96     7.28     4.23     0.00      100        -1        0
0:03:34  64.97     69.65        2.03     4.29     6.43     7.07     4.09     0.00      100        -1        0
0:03:35  67.63     69.61        1.91     4.24     6.04     7.09     4.22     0.34      100        -1        0
0:03:36  70.63     69.71        1.77     4.16     5.70     7.35     4.40     0.80      100        -1        0
0:03:37  74.78     69.96        1.63     4.27     5.56     7.60     4.77     1.41      100        -1        0
Table 4.1 Slice of training data from one sample
Table 4.1 shows the final format of the input training dataset. The granularity of the data is 1 second, down-sampled from raw data with a 30 Hz sampling rate. For each time step, the first and last minutes' data were discarded, as the change of light was likely to cause a rapid change in eye pupil size, introducing undesired noise. Additionally, another 150 s of data were discarded from the dataset to keep the total number of data points consistent for every feature; this resulted from the different timeframes used for the gradients. Both the visual comfort labels and the illuminance level were regarded as constant within each time step (8 min). Missing values were occasionally found in the raw data file because the sensor could lose track of a participant's eye pupil. These were dealt with by duplicating the last available value for the missing moment.
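A pandas sketch of these processing steps (the thesis performed this stage in MATLAB, Appendix B; the column names follow Table 3.1, and the miniature data frame is a stand-in for a real 30 Hz export):

```python
import pandas as pd

# Tiny stand-in for a 30 Hz raw export (column names follow Table 3.1)
raw = pd.DataFrame({
    "Avi TimeStamp": ["0:00:00.00", "0:00:00.03", "0:00:01.00", "0:00:02.03"],
    "Pupil r": [72.91, 79.15, None, 78.64],
})

# Down-sample to 1 s granularity by averaging the samples within each second.
raw["t"] = pd.to_timedelta(raw["Avi TimeStamp"])
per_sec = raw.set_index("t")["Pupil r"].resample("1s").mean()

# Duplicate the last available value for seconds where the tracker lost the pupil.
per_sec = per_sec.ffill()

# Discarding the first and last minute of each 8-min step would follow, e.g.:
# step = per_sec[start + pd.Timedelta("1min"): end - pd.Timedelta("1min")]
print(per_sec)
```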
4.2. The random order of illuminance level produced inconsistent labels.
Each column group lists illuminance level (lux), visual sensation, and visual satisfaction.

Test Subject 1       | Test Subject 2      | Test Subject 3
1250   0    0        | 100    1    1       | 1100   1    1
400   -1    1        | 200    2    1       | 400   -1    1
1250   1   -1        | 1250   2   -1       | 200   -2    0
950    0    0        | 800    2   -2       | 100   -2   -1
950    0    0        | 950    2   -2       | 1250   2   -1
1250   2   -2        | 800    1   -1       | 500    0    0
1100   1   -1        | 400    0    0       | 650    2    0
400    0    0        | 400    0    1       | 1100   2    0
100   -1    2        | 500    1    0       | 500    0    2
800    1    0        | 800    1   -1       | 650    1    1
1250   2   -1        | 1250   1    0       | 800    1    0
500   -1    2        | 950   -1    2       | 1100   2   -1
*The shading highlights the repeated lighting settings for a single human test subject.
Table 4.2 Subjective report from participants
Table 4.2 summarizes the subjective reports from the first three human subjects. Because the illuminance levels in the experiment were picked randomly, there were repeated step values, which are highlighted in Table 4.2. It can be seen that participants reported different visual sensation and visual satisfaction levels for
the same illuminance values through the experiment. For example, Test Subject 1 felt neutral about 1250 lux
in terms of its satisfaction and brightness but felt it was unsatisfactory and too bright after exposure to several
light settings. The inconsistency was remarkable for Test Subjects 2 and 3. Test Subject 2 reported 950 lux
as being bright and was dissatisfied at first and then considered it dark and was satisfied at the end of the
experiment. The same situation was found for Test Subject 3.
This inconsistency of labels creates extra hurdles for algorithms to draw clear decision boundaries. In the
worst case, the classification tasks may fail with an accuracy lower than 50%, which is no better than random
guessing. This problem was probably caused by the visual annoyance generated by large step jumps.
According to Kim and Kim (2007), frequent light fluctuation generated by an automatically controlled
dimming system should be controlled within 40% of the task illuminance which, in their research, was
assumed to be between 650 and 500 lux. Otherwise, negative responses to overall visual comfort were
reported from occupants. Therefore, this research advised that the step change in the experiment should be
restricted so as to avoid the effect of visual annoyance on rating visual satisfaction and visual sensation. In
addition, visual annoyance does not merely correlate with illuminance change but also relates to the
illuminance that participants are first adapted to.
To remove this inconsistency in labels, it was decided that the step change should be limited to 400 lux, with no repeated step values allowed. The 400 lux threshold was principally based on the 40% limit stated above, even though it may exceed that limit on some occasions. The other reason was that a 400 lux limit allowed the experiment to expose participants to as many illuminance conditions as possible within a reasonable experiment length. This flexibility would be unavailable with a 300 lux limit, that is, the maximum change found to be acceptable (Kim & Kim, 2007).
The redesigned order of illuminance level is shown in Figure 4.1. It was adopted in the following human
subject experiments for all participants. The data from the first three participants were excluded from the
following analysis.
Figure 4.1 Illuminance levels in experimental order (redesigned)
4.3. 30s was picked as the window size for the moving average filter.
Considering that the absolute pupil size recorded is rather noisy as the result of frequent eye movements, the
study adopted a moving average filter to remove the noise but keep the long-term trend of data. The
mathematical expression was provided in Subsection 3.2.1.1. It was recognized that the size of the time
window (n) is important, as a large value may contribute to smooth but less realistic data and a small value
may not be adequate to eliminate the short-term fluctuation. Figure 4.2 presents the comparison among raw
pupil size and moving averages of 10s, 20s, 30s, 40s, 60s, and 120s. The 30s time window was selected as
the best option, since it filtered out the majority of fluctuation without compromising the sensitivity. It was
adopted for the following analysis, including the computation of the gradient.
Figure 4.2 Comparison among raw pupil size, moving average 10s, 20s, 30s, 40s, 60s, and 120s
4.4. Presentation of sample data
Understanding the general relationship between features and classes allows the development of a more comprehensive data analysis. Illustrative figures were generated in this subsection to assist this process of comprehension. Data selected from two human subjects were used to keep the demonstration clean but informative.
4.4.1. Visual sensation versus eye pupil size
Figure 4.3 Visual sensation versus eye pupil size (Test Subject A)
[Line chart: Absolute_size and Mov_Ave_30s (left axis, pupil size in pixels) with Vis_Sens (right axis, -3 to 3) plotted over the experiment timeline]
Figure 4.4 Visual sensation versus eye pupil size (Test Subject B)
[Line chart: Absolute_size and Mov_Ave_30s (left axis, pupil size in pixels) with Vis_Sens (right axis, -3 to 3) plotted over the experiment timeline]
Figures 4.3 and 4.4 show the general relationship between visual sensation levels and participants’ eye pupil
sizes. It is clearly shown that different visual sensation levels correspond to a different range of eye pupil
sizes. Here, the 30s moving average is highlighted, with absolute size in the background, to show a clearer
data trend. A visual sensation of brightness corresponds to a higher range of eye pupil sizes, and darkness to
a lower range of eye pupil sizes. This can be explained by the human physiological mechanism that tries to
retain homeostasis within the body (Section 1.2). Remarkable distinctions among individuals were found
when the eye pupil size value was brought into consideration. For the identical visual sensation label of 2
(bright), Test Subject A showed a variation of eye pupil size from 45 to 55; however, Test Subject B’s eye
pupil size fluctuated within the range from 35 to 45. In addition, no negative visual sensations were reported
by Test Subject B, that is, this participant perceived all the experimental lighting settings as being bright or
neutral.
Figure 4.5 Mean of 30s moving average under different visual sensation levels
Figure 4.5 plots the mean of the 30s moving average of eye pupil size under different visual sensation levels
for all participants. It can be seen that the distinction among different human subjects was remarkable. Firstly,
the range of fluctuation was different. Participant 8 demonstrated the highest range of variation, from 70 to
80, and Participant 11 showed the lowest range, from 35 to 40. Secondly, the trend of variation was different.
The majority of participants showed decreasing pupil size with increasing visual sensation level; however, it
can be observed from some participants that the turning point would appear at a neutral visual sensation level
to break the smooth decreasing curve, for example, participants 8 and 13. Thirdly, not all participants’
responses fell on a full scale of visual sensation levels. For instance, Participant 8 only responded with levels
-1, 0, and 1.
[Line chart: mean of the 30s moving average of pupil size (30–90) versus visual sensation level (-2 to 2) for Participants 1–15]
These distinctions demonstrate that an individual visual comfort model using occupants’ eye pupil sizes as
the parameter is feasible. A group model would average the individual data so that the sensitivity to individual
differentiation would be reduced.
4.4.2. Visual sensation versus gradients of eye pupil size
Figure 4.6 Visual sensation versus gradients of eye pupil size (Test Subject A)
Figure 4.7 Visual sensation versus gradients of eye pupil size (Test Subject B)
[Line charts: Gra_change_30s through Gra_change_120s (left axis) with Vis_Sens (right axis, -3 to 3) plotted over the experiment timeline for each test subject]
Figures 4.6 and 4.7 describe the variation of gradients of pupil size under different visual sensation levels. In
general, it can be seen that a larger amplitude of gradients of eye pupil size corresponds to a lower visual
sensation level, and a smaller amplitude corresponds to a higher visual sensation level. In other words, an
ambient lighting environment perceived as being bright results in a small fluctuation in eye pupil size and
that perceived as being dark gives rise to a large fluctuation in eye pupil size. Differences among gradients
under various timeframes (30s, 40s, 50s, 60s, 90s, and 120s) are not obvious when the visual sensation level
reported was positive. This can be demonstrated by the overlap among curves of gradients of different
timeframes. Nevertheless, the difference becomes evident when a negative visual sensation level was
reported. Individual differences can also be found in this case. Test Subject A showed a narrow range of
variation compared with Test Subject B. Different range values were found for identical visual sensation
levels collected from Test Subjects A and B. This finding corroborates the conclusion that an individual
model is preferred for establishing a visual comfort model using eye pupil size as the parameter.
Chapter 5. Data Analysis and Discussion
This study was designed, on the one hand, to explore the application of machine learning to human visual comfort prediction in order to construct a lighting control loop that addresses existing comfort issues caused by today's insufficient indoor lighting environments and, on the other hand, to investigate the relationship between human psychology and comfort perception. In detail, the study was intended to determine the performance of a machine learning technique in visual comfort prediction and the performance of the control loop in actual application. In addition, EDA parameters were statistically analyzed to reach a general conclusion about the influence of occupants' psychological state on their visual comfort perception.
The purpose of this chapter is to analyze and discuss the data collected not only from the human subject
experiment but also from the validation test designed for testing the actual performance of the control tool
developed. The analysis and discussion were written after development of the control tool to conform to the
stepwise development of the control. All the data are summarized, and associated data display graphics are
plotted in this chapter for resultant discoveries.
5.1. Feature selection
The purpose of feature selection is to filter out the less relevant features in order to improve model
performance. A boosted decision tree, called XGBoost in Python, was used as the primary method. It
computes the score of importance of each feature in the construction of the boosted decision tree, allowing
attributes to be compared with each other. According to Brownlee (2016), “Importance is calculated for a
single decision tree by the amount that each attribute split point improves the performance measure, weighted
by the number of observations the node is responsible for.” Therefore, the score is presented in a range
between 0 and 1. It was expected that only four features would be kept for the establishment of the prediction
model. Because the visual comfort labels consist of visual sensation and visual satisfaction, individual
prediction models were established to predict these two different labels. The following discussion will also
be provided separately to highlight the feature preferences of different models. Code created correspondingly
is in Appendix C.
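A minimal sketch of this scoring step (Appendix C holds the actual code; the feature table and labels below are random placeholders):

```python
import numpy as np
from xgboost import XGBClassifier

features = ["Abs_Size", "Mov_Ave_30s", "Grad_30s", "Grad_40s",
            "Grad_50s", "Grad_60s", "Grad_90s", "Grad_120s", "Lux_lvl"]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(features)))   # placeholder feature table
y = rng.integers(0, 3, size=500)            # placeholder 3-class labels

model = XGBClassifier()
model.fit(X, y)

# Importance scores sum to 1; rank them to pick the top four features.
ranked = sorted(zip(features, model.feature_importances_),
                key=lambda p: p[1], reverse=True)
print(ranked[:4])
```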
5.1.1. Eliminating illuminance level as an input feature
Initially, illuminance level was recognized as an important feature to drive the prediction. However, the score
of importance revealed an overdependency assigned to the illuminance level for predicting either visual
sensation or visual satisfaction (Tables 5.2 and 5.3). It was shown that, for the majority of participants, a
score of importance higher than 0.8 was found for illuminance level in constructing the boosted decision tree
for visual sensation and visual satisfaction, in comparison with the 30s moving average (0.0175–0.3402) and
gradients (0–0.1170) of pupil size. This strong correlation between illuminance level and visual comfort
labels was debatable because the time scale of input features is inconsistent. In detail, both visual comfort
labels and illuminance level were constant for a single experimental step (8 min), though the pupil size data
were collected with a 30Hz rate. The identical time scale between illuminance level and visual comfort labels
contributed to their relatively strong correlation. Pupil size data, which had a higher time resolution, were
recognized to be sensitive and variable compared with illuminance level so as to be valued less by the boosted
decision tree. To recover the importance of pupil size–related data, illuminance level was eliminated from
the feature set for the prediction model, which is adopted in the following analysis.
No.  Abs_Size  Mov_Ave_30s  Grad_30s  Grad_40s  Grad_50s  Grad_60s  Grad_90s  Grad_120s  Lux_lvl
1    0.0000    0.0175       0.0010    0.0031    0.0072    0.0031    0.0021    0.0072     0.9589
2    0.0000    0.0887       0.0000    0.0010    0.0000    0.0049    0.0049    0.0097     0.8908
3    0.0029    0.2251       0.0249    0.0117    0.0381    0.0132    0.0191    0.0337     0.6312
4    0.0000    0.0236       0.0012    0.0012    0.0000    0.0000    0.0189    0.0059     0.9492
5    0.0027    0.2101       0.0189    0.0162    0.0099    0.0323    0.0018    0.0934     0.6149
6    0.0011    0.1819       0.0000    0.0206    0.0309    0.0034    0.0137    0.0069     0.7414
7    0.0000    0.0507       0.0007    0.0000    0.0014    0.0051    0.0116    0.0152     0.9153
8    0.0000    0.0437       0.0053    0.0000    0.0000    0.0041    0.0085    0.0106     0.9308
9    0.0033    0.2485       0.0142    0.0117    0.0033    0.0550    0.0584    0.0092     0.5963
10   0.0131    0.3402       0.0229    0.0545    0.0022    0.0055    0.0000    0.0807     0.4809
11   0.0073    0.2474       0.0000    0.0044    0.0015    0.0029    0.0029    0.0278     0.7057
12   0.0461    0.2553       0.0000    0.0000    0.0142    0.0000    0.0461    0.1170     0.5213
13   0.0228    0.1368       0.0000    0.0000    0.0028    0.0000    0.0028    0.0114     0.8234
14   0.0000    0.0549       0.0000    0.0000    0.0000    0.0027    0.0000    0.0632     0.8791
15   0.0000    0.0429       0.0000    0.0000    0.0000    0.0000    0.0000    0.0000     0.9571
Abs_Size = Absolute size; Mov_Ave_30s = Moving Average 30s; Grad_ = Gradient
Table 5.1 Score of feature importance for visual sensation prediction (including illuminance level)
No.  Abs_Size  Mov_Ave_30s  Grad_30s  Grad_40s  Grad_50s  Grad_60s  Grad_90s  Grad_120s  Lux_lvl
1    0.0046    0.0547       0.0102    0.0065    0.0083    0.0046    0.0000    0.0102     0.9007
2    0.0020    0.0794       0.0010    0.0010    0.0059    0.0059    0.0049    0.0225     0.8775
3    0.0132    0.2492       0.0122    0.0071    0.0111    0.0284    0.0355    0.0415     0.6018
4    0.0121    0.1212       0.0081    0.0010    0.0000    0.0121    0.0051    0.0121     0.8283
5    0.0000    0.1662       0.0156    0.0000    0.0000    0.0078    0.0156    0.0468     0.7481
6    0.0000    0.0808       0.0000    0.0000    0.0000    0.0000    0.0000    0.0015     0.9178
7    0.0000    0.0332       0.0011    0.0000    0.0043    0.0171    0.0128    0.0310     0.9004
8    0.0000    0.0387       0.0011    0.0011    0.0000    0.0033    0.0109    0.0098     0.9380
9    0.0069    0.2215       0.0034    0.0000    0.0034    0.0137    0.0953    0.0232     0.6326
10   0.0278    0.3803       0.0000    0.0000    0.0000    0.0000    0.0000    0.0000     0.5918
11   0.0173    0.0691       0.0000    0.0173    0.0000    0.0000    0.0000    0.0086     0.8878
12   0.0095    0.1230       0.0000    0.0047    0.0000    0.0038    0.0076    0.0009     0.8505
13   0.0228    0.1368       0.0000    0.0000    0.0028    0.0000    0.0028    0.0114     0.8234
14   0.0034    0.0034       0.0000    0.0000    0.0000    0.0000    0.0023    0.0181     0.9729
15   0.0256    0.0385       0.0000    0.0000    0.0000    0.0000    0.0000    0.0513     0.8846
Table 5.2 Score of feature importance for visual satisfaction prediction (including illuminance level)
5.1.2. Features selected for visual sensation prediction
Pupil size with a 30s moving average filter, 60s gradient, 90s gradient, and 120s gradient were selected as
features for the establishment of the visual sensation prediction model. This decision was based on a summary of votes, which is shown in Table 5.3. In detail, XGBoost calculated the feature importance for the predictive modeling of visual sensation for each participant. From these scores, a ranking of features was derived, and one vote was given to each of the top 4 features for each test subject. By counting the votes for each feature, the overall top 4 were selected for model establishment.
No.   Abs_Size  Mov_Ave_30s  Grad_30s  Grad_40s  Grad_50s  Grad_60s  Grad_90s  Grad_120s
1     0.0289    0.4161       0.0597    0.0820    0.0664    0.1403    0.0749    0.1318
2     0.0420    0.3798       0.0827    0.0598    0.0477    0.0878    0.1272    0.1730
3     0.0415    0.3639       0.0743    0.0752    0.0547    0.1295    0.1081    0.1528
4     0.0503    0.2556       0.1311    0.0748    0.0490    0.0927    0.1642    0.1821
5     0.0289    0.4190       0.1204    0.0612    0.0363    0.0921    0.0928    0.1493
6     0.0321    0.3831       0.0981    0.0439    0.0528    0.0602    0.1568    0.1731
7     0.0268    0.3431       0.0896    0.0818    0.0757    0.0749    0.1277    0.1804
8     0.0401    0.3631       0.0937    0.0724    0.0414    0.1052    0.1320    0.1521
9     0.0291    0.3814       0.0537    0.0895    0.0430    0.0962    0.1795    0.1276
10    0.0333    0.4313       0.0358    0.0824    0.0691    0.0575    0.1107    0.1799
11    0.0480    0.3839       0.0867    0.0763    0.0496    0.0965    0.0927    0.1663
12    0.0481    0.3788       0.0361    0.0621    0.0922    0.0842    0.1162    0.1824
13    0.0382    0.4499       0.0426    0.0313    0.0407    0.0688    0.1277    0.2009
14    0.0482    0.3044       0.0751    0.0482    0.0520    0.1445    0.1599    0.1676
15    0.0094    0.3850       0.0023    0.0117    0.1197    0.1338    0.1056    0.2324
Vote  0         15           4         2         2         9         13        15
*The shading highlights the top 4 features based on the score of importance.
Table 5.3 Score of feature importance for visual sensation prediction (excluding illuminance level)
It can be seen that there were differences among participants in terms of features’ ranking. The 30s moving
average and 120s gradient of pupil size were recognized as the most correlated features by all 15 boosted
decision trees constructed for each test subject. The importance computed for the 30s moving average was
0.3759 on average, with a maximum of 0.4499 and a minimum of 0.2556. The 120s gradient of pupil size
showed a lower average score of importance, which was 0.1701, with a maximum of 0.2324 and a minimum
of 0.1276. The 90s gradient of pupil size was the third most significant feature, losing votes only from human subjects 1 and 15. The 60s gradient of pupil size was the last feature selected, with 9 votes. The 30s, 40s, and 50s gradients of pupil size got 4, 2, and 2 votes, respectively, and were filtered out because of their handful of votes. No vote was given to absolute pupil size.
5.1.3. Features selected for visual satisfaction prediction
Pupil size with the 30s moving average filter, 60s gradient, 90s gradient, and 120s gradient were selected as
features for the establishment of the visual satisfaction prediction model, which is identical to what was found
for the visual sensation prediction model. However, slight differentiation was found for the vote distribution.
The 30s moving average and 90s and 120s gradients of pupil size all got 15 votes. The 60s gradient of pupil
size was the last feature selected, with 10 votes. Few votes were given to the 30s and 40s gradients of pupil
size, and the absolute size and the 50s gradient of pupil size received no votes.
No.   Abs_Size  Mov_Ave_30s  Grad_30s  Grad_40s  Grad_50s  Grad_60s  Grad_90s  Grad_120s
1     0.0303    0.4003       0.0768    0.0776    0.0580    0.1245    0.0839    0.1486
2     0.0636    0.3094       0.1121    0.0549    0.0473    0.0665    0.1413    0.2049
3     0.0397    0.4185       0.0893    0.0769    0.0527    0.1066    0.1135    0.1029
4     0.0525    0.2338       0.1299    0.0681    0.0612    0.1103    0.1542    0.1900
5     0.0263    0.3590       0.0921    0.0752    0.0357    0.1147    0.1071    0.1898
6     0.0427    0.3573       0.1127    0.0372    0.1121    0.0960    0.1220    0.1201
7     0.0251    0.3665       0.0909    0.0495    0.0232    0.1128    0.1222    0.2099
8     0.0525    0.2925       0.0731    0.0831    0.0644    0.1150    0.1481    0.1713
9     0.0393    0.3619       0.0569    0.0628    0.0411    0.0745    0.2170    0.1466
10    0.0120    0.4234       0.0658    0.0754    0.0580    0.0520    0.1477    0.1657
11    0.0428    0.4204       0.0656    0.0680    0.0452    0.0764    0.1113    0.1704
12    0.0340    0.3833       0.0920    0.0340    0.0642    0.0772    0.1222    0.1932
13    0.0382    0.4499       0.0426    0.0313    0.0407    0.0688    0.1277    0.2009
14    0.0230    0.3037       0.0971    0.0604    0.0591    0.1189    0.1649    0.1730
15    0.0062    0.3381       0.0763    0.0412    0.0536    0.0804    0.2247    0.1794
Vote  0         15           4         1         0         10        15        15
*The shading highlights the top 4 features based on the score of importance.
Table 5.4 Score of feature importance for visual satisfaction prediction (excluding illuminance level)
5.1.4. Conclusion
There were two steps in terms of feature selection. Firstly, illuminance level was dropped from the training
feature set because its strong correlation with visual comfort labels blocked the contributions from other
features to the prediction. This strong correlation was realized to be the result of the consistent time scale
between visual comfort labels and illuminance level. Secondly, four features were selected to construct the
visual comfort prediction model: the 30s moving average of absolute pupil size, 60s gradient of 30s moving
average, 90s gradient of 30s moving average, and 120s gradient of 30s moving average. The selection process
was based on the score of importance generated by the boosted decision tree.
5.2. Evaluation of prediction performance
To evaluate the prediction performance of the model, prediction accuracy was adopted as the criterion. It was
defined as the percentage of correct predictions made for the test data. Empirically, the original dataset was
split into training and test data, with a ratio of 70:30. The dataset was prepared initially in a random state for
the separation, in this case, to eliminate the time series effect of the data. Testing the model on a group of
new data provided a more reliable demonstration of its performance. The mathematical expression of the
prediction accuracy is shown below:

$$\text{prediction accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions made}}$$
To improve the performance of prediction and ensure compatibility between predictions of visual comfort
label and control positions, rescaling of visual sensation and visual satisfaction levels was proposed. In detail,
for both visual sensation and visual satisfaction, levels 1 and 2 were grouped as a single level, and levels -1
and -2 were grouped as another single level. Therefore, the new definition of visual sensation level was given
as -1 (dark), 0 (neutral), and 1 (bright). The new definition of visual satisfaction was given as -1 (unsatisfied),
0 (neutral), and 1 (satisfied). The prediction accuracy will be evaluated for 5-level and 3-level scales
respectively.
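A sketch of this evaluation procedure with placeholder data (note that np.clip performs the 5-to-3-level regrouping exactly, since it maps ±2 to ±1 and leaves -1, 0, and 1 unchanged):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# X: selected features; y5: 5-level labels in {-2, -1, 0, 1, 2} (placeholders)
rng = np.random.default_rng(0)
X = rng.normal(size=(2300, 4))
y5 = rng.integers(-2, 3, size=2300)

y3 = np.clip(y5, -1, 1)   # group {1, 2} -> 1 and {-1, -2} -> -1

for y in (y5, y3):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = SVC(C=1.0, kernel='rbf', gamma='scale').fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    print(f"accuracy = {acc:.5f}")
```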
5.2.1. Prediction accuracy of visual sensation (5 levels)
No.  Data Size  Gaussian Naïve Bayes  Logistic Regression  Support Vector Machine
1    2276       0.54905               0.56808              0.71303
2    2308       0.67388               0.70418              0.68975
3    2279       0.53655               0.62281              0.67836
4    2294       0.68360               0.74020              0.75327
5    2307       0.78499               0.76190              0.80090
6    2271       0.74487               0.71114              0.86510
7    2297       0.51014               0.63043              0.65652
8    2311       0.67821               0.61328              0.59307
9    2266       0.73088               0.78676              0.82500
10   2305       0.87139               0.88584              0.90029
11   2299       0.67536               0.76232              0.80000
12   2289       0.73071               0.75254              0.81368
13   1463       0.81777               0.82004              0.94077
14   2276       0.74524               0.68960              0.81113
15   1454       0.94737               0.98627              0.99313
*The shading highlights the algorithm with the highest accuracy.
Table 5.5 Prediction accuracy of three algorithms (visual sensation [5 levels])
Table 5.5 summarizes the prediction accuracy of 5-level visual sensation from three algorithms. The red
shading highlights the algorithm generating the highest accuracy for each test subject. The size of the test
data was different for each participant and is listed in the table for details.
It can be seen that the visual sensation (5-level) prediction model adopting the SVM algorithm achieved the highest accuracy for 13 out of 15 participants. Each algorithm performed differently among individuals. For
example, SVM produced an accuracy of 0.65652 for Participant 7 but an accuracy of 0.99313 for Participant
15. For different test subjects, the variation of performance among the three algorithms can be either
remarkable or negligible. For example, Gaussian NB, LR, and SVM produced prediction accuracies of
0.54905, 0.56808, and 0.71303 for Participant 1. The difference between the highest and lowest values is
0.16396. However, the situation was different for Participant 2: The accuracy difference between the highest
(LR = 0.70418) and the lowest (Gaussian NB = 0.67388) is 0.0303. These facts reveal that even though SVM
performs the best for the majority of test subjects, the hypothesis that SVM would be the best option for the
whole population is likely to be rejected in consideration of the individual differences.
Figure 5.8 Interval plot of prediction accuracy of three algorithms (visual sensation [5 levels])
| Source | DF | Adj SS | Adj MS | F-Value | P-Value |
|--------|----|--------|--------|---------|---------|
| Factor | 2 | 0.04657 | 0.02329 | 1.81 | 0.177 |
| Error | 42 | 0.54103 | 0.01288 | | |
| Total | 44 | 0.58760 | | | |
Table 5.6 Analysis of variance (one-way) for three algorithms in terms of prediction accuracy (visual sensation [5 levels])
To determine whether the prediction accuracies of the three algorithms were significantly different, a one-way analysis of variance (ANOVA) with multiple comparisons was conducted. This test is designed for a case with one categorical factor (e.g., the three algorithms) and a continuous response (e.g., prediction accuracy). It determines whether the means of the three groups differ, so as to tell whether there are significant differences in prediction accuracy among the three algorithms. The test first poses a null hypothesis that all means are equal and then calculates a p-value to determine whether the null hypothesis can be rejected. The significance level was empirically set to 0.05; that is, if the p-value was below 0.05, it could be concluded that the mean difference between the prediction accuracies of the algorithms was statistically significant.
The calculated p-value was 0.177, which is larger than 0.05, so the null hypothesis could not be rejected: the difference between the prediction accuracies of the algorithms was not statistically significant.
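For reproducibility, the same one-way ANOVA can be sketched with scipy over the per-participant accuracies from Table 5.5 (the thesis tables appear to come from a statistics package, so this is an illustration rather than the original computation):

from scipy.stats import f_oneway

acc_gnb = [0.54905, 0.67388, 0.53655, 0.68360, 0.78499, 0.74487, 0.51014,
           0.67821, 0.73088, 0.87139, 0.67536, 0.73071, 0.81777, 0.74524, 0.94737]
acc_lr = [0.56808, 0.70418, 0.62281, 0.74020, 0.76190, 0.71114, 0.63043,
          0.61328, 0.78676, 0.88584, 0.76232, 0.75254, 0.82004, 0.68960, 0.98627]
acc_svm = [0.71303, 0.68975, 0.67836, 0.75327, 0.80090, 0.86510, 0.65652,
           0.59307, 0.82500, 0.90029, 0.80000, 0.81368, 0.94077, 0.81113, 0.99313]

f_stat, p_value = f_oneway(acc_gnb, acc_lr, acc_svm)
# the null hypothesis (equal means) is rejected only when p_value < 0.05
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")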
Figure 5.9 Interval plot of Tukey tests for difference of means between pairs of groups (visual sensation [5 levels])
| Difference of Levels | Difference of Means | SE of Difference | 95% CI | T-Value | Adjusted P-Value |
|---|---|---|---|---|---|
| Logistic Reg – Gaussian Nav | 0.0237 | 0.0414 | (-0.0771, 0.1245) | 0.57 | 0.836 |
| Support Vect – Gaussian Nav | 0.0769 | 0.0414 | (-0.0239, 0.1777) | 1.86 | 0.164 |
| Support Vect – Logistic Reg | 0.0532 | 0.0414 | (-0.0476, 0.1540) | 1.28 | 0.412 |
Individual confidence level = 98.07%
Table 5.7 Tukey tests for difference of means between pairs of groups (visual sensation [5 levels])
The Tukey test was used in conjunction with ANOVA to locate the means that are significantly different from each other. However, as indicated above, the mean differences between the prediction accuracies of the algorithms for 5-level visual sensation are not statistically significant, so here the Tukey test served only as a demonstration of the mean differences between pairs of algorithms. Figure 5.9 and Table 5.7 show the confidence intervals for the differences between the means of LR and Gaussian NB, SVM and Gaussian NB, and SVM and LR. All the confidence intervals include zero, which means the differences between these pairs of groups are not significant. It should be noted that the p-value of the mean difference between SVM and Gaussian NB is much smaller than those of the other two pairs.
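As a companion sketch, the pairwise comparison can be reproduced with statsmodels, reusing the accuracy lists from the ANOVA sketch above; the library choice is an assumption, not the tool used for the thesis tables:

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Pool the three groups and label each accuracy with its algorithm.
scores = np.concatenate([acc_gnb, acc_lr, acc_svm])
groups = ["GaussianNB"] * 15 + ["LogisticRegression"] * 15 + ["SVM"] * 15
# Pairwise Tukey HSD at alpha = 0.05; CIs containing zero are not significant.
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))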
Figure 5.10 Boxplot of prediction accuracy of three algorithms (visual sensation [5 levels])
| Variable | Mean | SE Mean | St Dev | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|---|
| Gaussian Naïve Bayes | 0.712 | 0.031 | 0.120 | 0.510 | 0.674 | 0.731 | 0.785 | 0.947 |
| Logistic Regression | 0.736 | 0.028 | 0.109 | 0.568 | 0.630 | 0.740 | 0.787 | 0.986 |
| Support Vector Machine | 0.789 | 0.029 | 0.110 | 0.593 | 0.690 | 0.801 | 0.865 | 0.993 |
Table 5.8 Basic statistics of prediction accuracy of three algorithms (visual sensation [5 levels])
Figure 5.10 depicts the distribution of the prediction accuracy of three algorithms for 5-level visual sensation.
A tall box plot can be found for all three algorithms in terms of prediction accuracy. This suggests that the
performance of the prediction, regardless of the choice of algorithm, varies among participants. Compared
with LR and Gaussian NB, SVM shows a high median, and its quartile groups are located at the upper part
of the graph. This reveals that for this sample group, SVM output higher accuracy for predicting 5-level
visual sensation.
In conclusion, for the current sample group, SVM is recommended as the classification algorithm in the prediction model for 5-level visual sensation, because it output more accurate predictions than Gaussian NB and LR. However, this statement cannot be generalized to the whole population, because the difference of means proved not to be statistically significant according to the one-way ANOVA.
5.2.2. Prediction accuracy of visual satisfaction (5 levels)
| No. | Data Size | Gaussian Naïve Bayes | Logistic Regression | Support Vector Machine |
|-----|-----------|----------------------|---------------------|------------------------|
| 1 | 2276 | 0.53587 | 0.59151 | **0.63690** |
| 2 | 2308 | 0.61039 | 0.48485 | **0.62771** |
| 3 | 2279 | 0.83772 | 0.85380 | **0.89327** |
| 4 | 2294 | 0.53556 | **0.61538** | 0.60377 |
| 5 | 2307 | 0.71140 | 0.67244 | **0.79221** |
| 6 | 2271 | 0.75220 | 0.76833 | **0.78006** |
| 7 | 2297 | 0.51739 | 0.60290 | **0.72174** |
| 8 | 2311 | **0.69986** | 0.63059 | 0.63059 |
| 9 | 2266 | 0.80294 | 0.83529 | **0.85882** |
| 10 | 2305 | 0.79335 | 0.78902 | **0.79624** |
| 11 | 2299 | 0.75652 | 0.82464 | **0.87971** |
| 12 | 2289 | 0.49054 | 0.55604 | **0.72052** |
| 13 | 1463 | 0.81777 | 0.82005 | **0.94077** |
| 14 | 2276 | 0.59151 | 0.59151 | **0.67643** |
| 15 | 1454 | 0.94050 | 0.95195 | **0.97483** |
*Bold highlights the algorithm with the highest accuracy.
Table 5.9 Prediction accuracy of three algorithms (visual satisfaction [5 levels])
Table 5.9 summarizes the prediction accuracy of 5-level visual satisfaction from three algorithms.
It can be seen that the prediction model of visual satisfaction (5 levels) adopting the SVM algorithm possessed the highest accuracy for 13 out of 15 participants. Each algorithm performed differently among individuals. For example, SVM produced an accuracy of 0.60377 for Participant 4 but an accuracy of 0.97483 for Participant 15. For different test subjects, the variation in performance of the three algorithms can be either remarkable or negligible. For example, Gaussian NB, LR, and SVM produced prediction accuracies of 0.71140, 0.67244, and 0.79221 for Participant 5; the difference between the highest and lowest values is 0.11977. However, the situation was different for Participant 6: The accuracy difference between the highest (SVM = 0.78006) and the lowest (Gaussian NB = 0.75220) values is 0.02786. These facts reveal that even though SVM performed the best for the majority of test subjects, the hypothesis that SVM would be the best option for the whole population is likely to be rejected in consideration of the individual differences.
Figure 5.11 Interval plot of prediction accuracy of three algorithms (visual satisfaction [5 levels])
| Source | DF | Adj SS | Adj MS | F-Value | P-Value |
|--------|----|--------|--------|---------|---------|
| Factor | 2 | 0.03848 | 0.01924 | 0.81 | 0.450 |
| Error | 42 | 0.99308 | 0.02364 | | |
| Total | 44 | 1.03156 | | | |
Table 5.10 Analysis of variance (one-way) for three algorithms in terms of prediction accuracy (visual satisfaction [5 levels])
A one-way ANOVA with multiple comparisons was conducted to determine the statistical significance of the difference. The calculated p-value was 0.450, which is much larger than 0.05. This demonstrates that the difference between the prediction accuracies of the algorithms, in terms of 5-level visual satisfaction, was not statistically significant.
Figure 5.12 Interval plot of Tukey tests for difference of means between pairs of groups (visual satisfaction [5 levels])
| Difference of Levels | Difference of Means | SE of Difference | 95% CI | T-Value | Adjusted P-Value |
|---|---|---|---|---|---|
| Logistic Reg – Gaussian Nav | 0.0070 | 0.0561 | (-0.1296, 0.1436) | 0.12 | 0.991 |
| Support Vect – Gaussian Nav | 0.0652 | 0.0561 | (-0.0713, 0.2018) | 1.16 | 0.482 |
| Support Vect – Logistic Reg | 0.0582 | 0.0561 | (-0.0783, 0.1948) | 1.04 | 0.558 |
Individual confidence level = 98.07%
Table 5.11 Tukey tests for difference of means between pairs of groups (visual satisfaction [5 levels])
The Tukey test was conducted in conjunction with ANOVA to determine whether the mean differences between pairs of algorithms were significant. All the confidence intervals include zero, which means the differences between these pairs of groups are not significant.
Figure 5.13 Boxplot of prediction accuracy of three algorithms (visual satisfaction [5 levels])
| Variable | Mean | SE Mean | St Dev | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|---|
| Gaussian Naïve Bayes | 0.693 | 0.036 | 0.138 | 0.491 | 0.536 | 0.711 | 0.803 | 0.941 |
| Logistic Regression | 0.706 | 0.035 | 0.136 | 0.485 | 0.592 | 0.672 | 0.825 | 0.952 |
| Support Vector Machine | 0.769 | 0.031 | 0.121 | 0.604 | 0.637 | 0.780 | 0.880 | 0.975 |
Table 5.12 Basic statistics of prediction accuracy of three algorithms (visual satisfaction [5 levels])
Figure 5.13 depicts the distribution of the prediction accuracy of the three algorithms for 5-level visual satisfaction. The tall boxes suggest the prediction performance is diverse among participants. Compared with LR and Gaussian NB, SVM shows a higher median, and its quartile groups are located at the upper part of the graph. This reveals the superior performance of SVM within the sample group.
In conclusion, similar to the prediction of 5-level visual sensation, the prediction of 5-level visual satisfaction was most accurate with SVM. However, this statement is only applicable to the current sample group, since the difference between SVM and the other algorithms is not statistically significant; possible explanations are nonnegligible individual differences and the small sample size. In comparison to 5-level visual sensation, the prediction accuracy of 5-level visual satisfaction showed a wider distribution.
5.2.3. Prediction accuracy of visual sensation (3 levels)
| No. | Data Size | Gaussian Naïve Bayes | Logistic Regression | Support Vector Machine | Relative Change |
|-----|-----------|----------------------|---------------------|------------------------|-----------------|
| 1 | 2276 | 0.56076 | 0.57247 | **0.65300** | NA |
| 2 | 2308 | 0.67388 | **0.70418** | 0.68975 | No |
| 3 | 2279 | 0.73295 | 0.78084 | **0.82438** | |
| 4 | 2294 | 0.68360 | 0.74020 | **0.75327** | No |
| 5 | 2307 | 0.78499 | 0.76190 | **0.80090** | No |
| 6 | 2271 | 0.85431 | 0.88563 | **0.96334** | |
| 7 | 2297 | 0.72754 | 0.71304 | **0.73913** | |
| 8 | 2311 | **0.66955** | 0.60750 | 0.58874 | No |
| 9 | 2266 | 0.84118 | 0.91912 | **0.92059** | |
| 10 | 2305 | 0.97543 | **1.00000** | **1.00000** | |
| 11 | 2299 | 0.88116 | 0.95507 | **0.97246** | |
| 12 | 2289 | 0.73071 | 0.75254 | **0.81368** | No |
| 13 | 1463 | 0.81777 | 0.82004 | **0.94077** | No |
| 14 | 2276 | 0.74524 | 0.68960 | **0.81113** | No |
| 15 | 1454 | 0.94737 | 0.98627 | **0.99313** | No |
*Bold highlights the algorithm with the highest accuracy.
Table 5.13 Prediction accuracy of three algorithms (visual sensation [3 levels])
| No. | Gaussian Naïve Bayes | Logistic Regression | Support Vector Machine |
|-----|----------------------|---------------------|------------------------|
| 1 | 0.01171 | 0.00439 | -0.06003 |
| 2 | \ | \ | \ |
| 3 | 0.19640 | 0.15803 | 0.14602 |
| 4 | \ | \ | \ |
| 5 | \ | \ | \ |
| 6 | 0.10944 | 0.17449 | 0.09824 |
| 7 | 0.21740 | 0.08261 | 0.08261 |
| 8 | \ | \ | \ |
| 9 | 0.11030 | 0.13236 | 0.09559 |
| 10 | 0.10404 | 0.11416 | 0.09971 |
| 11 | 0.20580 | 0.19275 | 0.17246 |
| 12 | \ | \ | \ |
| 13 | \ | \ | \ |
| 14 | \ | \ | \ |
| 15 | \ | \ | \ |
*A backslash indicates no change in prediction accuracy.
Table 5.14 Difference of prediction accuracy between 5-level and 3-level visual sensation
Table 5.13 summarizes the prediction accuracy of 3-level visual sensation of the three algorithms, and Table
5.14 calculates the value change in comparison with the prediction accuracy of 5-level visual sensation. For
all three algorithms, an improvement of prediction performance was found for six participants. No change
was revealed for eight participants. A dissimilar situation was found for Participant 1, for whom Gaussian
NB and LR produced more accurate predictions; however, SVM had a decrease in prediction accuracy. It was
expected that downscaling the visual sensation level would generate a uniform increase in prediction
accuracy for all participants. The unchanged performance for the eight participants could be explained by the
fact that no response was found for the -2 or 2 level of visual sensation (Figure 4.5). The degree of
improvement varied among algorithms. The increase for Gaussian NB was remarkable and generally higher than that for LR and SVM. This uneven degree of increase narrowed the gap between the prediction accuracies of the algorithms.
After the rescaling, SVM was still recognized as the algorithm that could generate the most accurate prediction model for 13 out of 15 participants. Each algorithm performed differently among individuals. For
different test subjects, the variation in performance of the three algorithms can be either remarkable or
negligible. These facts reveal that even though SVM performs the best for the majority of test subjects, the
hypothesis that SVM would be the best option for the whole population is likely to be rejected in
consideration of the individual differences.
Figure 5.14 Interval plot of prediction accuracy of three algorithms (visual sensation [3 levels])
| Source | DF | Adj SS | Adj MS | F-Value | P-Value |
|--------|----|--------|--------|---------|---------|
| Factor | 2 | 0.02449 | 0.01225 | 0.78 | 0.465 |
| Error | 42 | 0.66036 | 0.01572 | | |
| Total | 44 | 0.68485 | | | |
Table 5.15 Analysis of variance (one-way) for three algorithms in terms of prediction accuracy (visual sensation [3 levels])
A p-value of 0.465, which is larger than 0.05, failed to demonstrate the statistical significance of SVM's superior performance. Compared with the statistical test conducted for the prediction accuracy of 5-level visual sensation, the p-value is much larger.
Figure 5.15 Interval plot of Tukey tests for difference of means between pair of groups (visual sensation [3 levels])
| Difference of Levels | Difference of Means | SE of Difference | 95% CI | T-Value | Adjusted P-Value |
|---|---|---|---|---|---|
| Logistic Reg – Gaussian Nav | 0.0175 | 0.0458 | (-0.0939, 0.1288) | 0.38 | 0.923 |
| Support Vect – Gaussian Nav | 0.0559 | 0.0458 | (-0.0555, 0.1672) | 1.22 | 0.448 |
| Support Vect – Logistic Reg | 0.0384 | 0.0458 | (-0.0730, 0.1498) | 0.84 | 0.681 |
Individual confidence level = 98.07%
Table 5.16 Tukey tests for difference of means between pairs of groups (visual sensation [3 levels])
The Tukey test failed to identify any statistically significant difference of means between pairs of groups,
since all the ranges of confidence intervals include zero.
Figure 5.16 Boxplot of prediction accuracy of three algorithms (visual sensation [3 levels])
| Variable | Mean | SE Mean | St Dev | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|---|
| Gaussian Naïve Bayes | 0.775 | 0.029 | 0.112 | 0.561 | 0.684 | 0.745 | 0.854 | 0.975 |
| Logistic Regression | 0.793 | 0.034 | 0.132 | 0.572 | 0.704 | 0.762 | 0.919 | 1.000 |
| Support Vector Machine | 0.831 | 0.034 | 0.131 | 0.589 | 0.739 | 0.814 | 0.963 | 1.000 |
Table 5.17 Basic statistics of prediction accuracy of three algorithms (visual sensation [3 levels])
Figure 5.16 suggests a much wider distribution of prediction accuracy after the rescaling. This was primarily due to the unchanged performance of the algorithms for half of the sample group. However, an upward shift of the box plot can be identified when comparing Figure 5.16 with Figure 5.10, a consequence of the improved prediction performance. SVM was again identified, for this sample group, as the algorithm that output higher accuracy in predicting 3-level visual sensation.
In conclusion, the rescaling of the visual sensation level contributed to an improvement of performance for some of the participants; the unchanged performance for the others was attributed to the absence of responses at the 2 or -2 level. SVM was identified, for the sample group, as the algorithm that output the most accurate prediction for 3-level visual sensation. However, this superior performance of SVM was not statistically significant.
5.2.4. Prediction accuracy of visual satisfaction (3 levels)
| No. | Data Size | Gaussian Naïve Bayes | Logistic Regression | Support Vector Machine | Relative Change |
|-----|-----------|----------------------|---------------------|------------------------|-----------------|
| 1 | 2276 | 0.57247 | 0.60469 | **0.64714** | |
| 2 | 2308 | 0.61039 | 0.48485 | **0.62771** | No |
| 3 | 2279 | 0.91582 | 0.91872 | **0.93179** | |
| 4 | 2294 | 0.53556 | **0.61538** | 0.60377 | No |
| 5 | 2307 | 0.71140 | 0.67244 | **0.79221** | No |
| 6 | 2271 | 0.84897 | 0.86657 | **0.87537** | |
| 7 | 2297 | 0.51739 | 0.60290 | **0.72174** | No |
| 8 | 2311 | **0.72294** | 0.62049 | 0.62338 | |
| 9 | 2266 | 0.91176 | 0.95000 | **0.95441** | |
| 10 | 2305 | **0.88728** | 0.86127 | 0.86850 | NA |
| 11 | 2299 | 0.81015 | 0.87246 | **0.90434** | |
| 12 | 2289 | 0.49054 | 0.55604 | **0.72052** | No |
| 13 | 1463 | 0.81777 | 0.82005 | **0.94077** | No |
| 14 | 2276 | 0.59151 | 0.59151 | **0.67643** | No |
| 15 | 1454 | 0.94050 | 0.95195 | **0.97483** | No |
*Bold highlights the algorithm with the highest accuracy.
Table 5.18 Prediction accuracy of three algorithms (visual satisfaction [3 levels])
| No. | Gaussian Naïve Bayes | Logistic Regression | Support Vector Machine |
|-----|----------------------|---------------------|------------------------|
| 1 | 0.03660 | 0.01318 | 0.01024 |
| 2 | \ | \ | \ |
| 3 | 0.07810 | 0.06492 | 0.03852 |
| 4 | \ | \ | \ |
| 5 | \ | \ | \ |
| 6 | 0.09677 | 0.09824 | 0.09531 |
| 7 | \ | \ | \ |
| 8 | 0.02308 | -0.01010 | -0.00721 |
| 9 | 0.10882 | 0.11471 | 0.09559 |
| 10 | 0.09393 | 0.07225 | 0.07226 |
| 11 | 0.05363 | 0.04782 | 0.02463 |
| 12 | \ | \ | \ |
| 13 | \ | \ | \ |
| 14 | \ | \ | \ |
| 15 | \ | \ | \ |
*A backslash indicates no change in prediction accuracy.
Table 5.19 Difference of prediction accuracy between 5-level and 3-level visual satisfaction
Table 5.18 summarizes the prediction accuracy of 3-level visual satisfaction for the three algorithms, and Table 5.19 calculates the value change in comparison with the prediction accuracy of 5-level visual satisfaction. A discovery similar to that for 3-level visual sensation was made: an improvement of performance was found for six participants for all three algorithms, and unchanged performance was revealed for eight participants. Participant 8 showed a dissimilar situation, in which Gaussian NB generated more accurate results but LR and SVM had decreases in performance. The absence of responses at the -2 or 2 level of visual satisfaction explains the unchanged performance.
After the rescaling, SVM possessed the highest accuracy for 12 out of 15 participants. Further statistical testing was required to determine whether this difference was significant.
Figure 5.17 Interval plot of prediction accuracy of three algorithms (visual satisfaction [3 levels])
| Source | DF | Adj SS | Adj MS | F-Value | P-Value |
|--------|----|--------|--------|---------|---------|
| Factor | 2 | 0.03848 | 0.01924 | 0.81 | 0.450 |
| Error | 42 | 0.99308 | 0.02364 | | |
| Total | 44 | 1.03156 | | | |
Table 5.20 Analysis of variance (one-way) for three algorithms in terms of prediction accuracy (visual satisfaction [3 levels])
A p-value of 0.450, which is larger than 0.05, failed to demonstrate the statistical significance of SVM's superior performance.
| Difference of Levels | Difference of Means | SE of Difference | 95% CI | T-Value | Adjusted P-Value |
|---|---|---|---|---|---|
| Logistic Reg – Gaussian Nav | 0.0070 | 0.0561 | (-0.1296, 0.1436) | 0.12 | 0.991 |
| Support Vect – Gaussian Nav | 0.0652 | 0.0561 | (-0.0713, 0.2018) | 1.16 | 0.482 |
| Support Vect – Logistic Reg | 0.0582 | 0.0561 | (-0.0783, 0.1948) | 1.04 | 0.558 |
Individual confidence level = 98.07%
Table 5.21 Tukey tests for difference of means between pairs of groups (visual satisfaction [3 levels])
The Tukey test failed to identify any statistically significant differences of means between pairs of groups,
since all the ranges of confidence intervals include zero.
Figure 5.18 Boxplot of prediction accuracy of three algorithms (visual satisfaction [3 levels])
| Variable | Mean | SE Mean | St Dev | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|---|
| Gaussian Naïve Bayes | 0.726 | 0.042 | 0.161 | 0.491 | 0.572 | 0.723 | 0.887 | 0.941 |
| Logistic Regression | 0.733 | 0.042 | 0.162 | 0.485 | 0.603 | 0.672 | 0.872 | 0.952 |
| Support Vector Machine | 0.791 | 0.035 | 0.137 | 0.604 | 0.647 | 0.792 | 0.932 | 0.975 |
Table 5.22 Basic statistics of prediction accuracy of three algorithms (visual satisfaction [3 levels])
Figure 5.18 suggests a much wider distribution of prediction accuracy for 3-level visual satisfaction after the rescaling. SVM was identified, for the sample group, as the most accurate performer because its quartile groups are located at the upper part of the graph.
In conclusion, the rescaling of the visual satisfaction level contributed to an improvement of performance for some participants. For this sample group, SVM was the algorithm that output the most accurate predictions. However, the statistical test failed to demonstrate the significance of this difference for the whole population.
5.2.5. Conclusion
The prediction performances of the three algorithms were differentiated by their prediction accuracy. For 5-level visual sensation and visual satisfaction, SVM, as the algorithm used for classification, generated
the most accurate prediction for the majority of participants. However, this performance was found not to be statistically significant, which means the conclusion cannot be applied to the whole population. After the downscaling of the visual sensation and visual satisfaction levels, an improvement of prediction accuracy was observed for some participants; the unchanged performance for the rest was attributed to their absence of responses at the 2 or -2 level. This partial improvement gave rise to a greater dispersion of prediction accuracy. Even though SVM was again identified, for the sample group, as the algorithm able to provide predictions with the highest accuracy, the difference from the other algorithms was still not statistically significant (p-value > 0.05). The large p-value may be explained by the small sample group or nonnegligible individual differences.
According to the observation of the sample group, a decision was made to adopt SVM as the algorithm for
prediction of visual comfort label, which would be used to signal the prototype control.
5.3. Prototype control
As an expected deliverable of the research, a prototype control was constructed in the environmental chamber.
Its control logic and controller used were introduced in Section 3.4. Figure 5.19 describes the connections
among controllers, light tracks, the power source, and the computer. During operation, the computer would
output the dimming action to the Arduino board, and the board would translate the action into the
corresponding luminance of light bulbs. The dimming action is the result of a control logic (Figure 3.19) that
is signaled by the predicted visual comfort label. The control step was customizable but was preset to 5 minutes. During this 5-minute period, eye pupil size data were collected via the pupilometer. For the prediction, the first half (2.5 min) was dropped to ensure the occupant had adapted to the action from the previous step. The second half was used to compute a single input instance described by the four features (30s moving average and gradients of 60s, 90s, and 120s). There were two tracks of light under control, located
at the central line of all six tracks (Figure 5.20). They could provide different illuminance levels ranging from
0 to 800 lux with an interval of 100 lux. The actions of dimming down or up for all lightbulbs on the tracks
were simultaneous. In other words, it was impossible to dim a single bulb.
Initially, the prototype was set to provide an illuminance of 400 lux on the working plane, which is the median
of its variation. Then, for each control step, the luminance of light is expected to be reduced, increased, or
kept constant in accordance with the visual comfort labels. The illuminance change generated by a single control step is fixed at 100 lux, whether increasing or decreasing; consecutive control steps in the same direction can accumulate a change greater than 100 lux.
A program was designed to automate this control process. It is able to start the data collection automatically,
read the data collected, process the data into features, predict the visual comfort labels, and trigger the control.
An interface during operation is shown in Figure 5.21. The code developed for the program is shown in
Appendix D.
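The loop described above can be summarized in a short sketch. This is not the actual program (which is listed in Appendix D); the serial port name, one-byte message protocol, helper functions, and model objects are assumptions for illustration:

import serial  # pyserial; the serial link to the Arduino is an assumption

STEP_SECONDS = 300                    # preset 5-minute control step
board = serial.Serial("COM3", 9600)   # port name is an assumption

while True:
    samples = collect_pupil_data(STEP_SECONDS)   # hypothetical pupilometer reader
    usable = samples[len(samples) // 2:]         # drop the first 2.5 min (adaptation)
    x = build_features_single(usable)            # 30s MA + 60/90/120s gradients
    sensation = sensation_model.predict([x])[0]        # trained per-occupant SVM
    satisfaction = satisfaction_model.predict([x])[0]  # trained per-occupant SVM
    action = control_logic(sensation, satisfaction)    # hypothetical, per Figure 3.19
    if action == "up":
        board.write(b"U")   # +100 lux; the protocol byte is an assumption
    elif action == "down":
        board.write(b"D")   # -100 lux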
Figure 5.19 Board connection
Figure 5.20 The prototype system in the environmental chamber
Figure 5.21 Program in operation
5.4. Validation test
Validation tests were designed to assess the performance of the prototype control. There were two rounds of tests, designed for different purposes. Six participants took part in the validation tests, three per round, selected from the group of test subjects used for the human subject experiment. This allowed the adoption of the previously trained prediction models, so that no training period was needed for the test.
5.4.1. First round validation test
Figure 5.22 First round validation test design
The first-round test was designed following the process shown in Figure 5.22. It lasted for an hour, and
surveys were collected every 10 minutes. There were two conditions that each participant was exposed to:
The first condition was designed as the comparable condition, which created a constant illuminance level of
500 lux at the working plane. This level was recognized as the lighting requirement for the office environment
(IES, 2019). The second condition was a dynamic condition under the control of the prototype system. This
comparison was expected to reveal the drawbacks of the current lighting guidelines and demonstrate the
superiority of the prototype control in terms of realization of individual visual comfort. During the test, the
visual comfort labels were required to be collected, which would be compared with visual comfort labels
predicted.
Table 5.25 summarizes the survey results and the prediction results; the control actions that were demanded and actually triggered are included, and consistent results can be identified by comparing the survey and prediction columns. Firstly, in the comparable condition with a constant 500 lux, different participants reported different visual sensations and levels of visual satisfaction. Secondly, poor prediction accuracy was found when the prototype control was given access to the lighting environment. Thirdly, the control actions deduced by the control logic conformed to the demands of all test subjects except Participant 7.
The discrepancies between the predictions and the surveys were within one level. For example, Participant 7 reported dissatisfaction (visual satisfaction = -1) at step 5 and sensed the light level as being too dark (visual sensation = -1). The prediction model did predict the participant's dissatisfaction but gave a visual sensation label of 0, which failed to trigger a dimming-up action.
The fluctuation of participants' feedback is another source of discrepancy. In detail, the test subjects tended to fluctuate between a neutral level and a positive or negative level. For example, Test Subject 3 reported a visual satisfaction of 0 at steps 1, 2, and 5, but reported 1 at steps 3, 4, and 6. Because no control action was triggered during the experiment, there was actually no change in the illuminance level. This fluctuation of attitude is inaccessible to the prediction model, so an accurate prediction was not achievable under this condition.
| Lux Level | Visual Sensation | Visual Satisfaction | Action Required |
|-----------|------------------|---------------------|-----------------|
| 100 | -1 | 0 | Brighter |
| 300 | 0 | 0 | No |
| 600 | 1 | 0 | Darker |
| 200 | 0 | 0 | Darker |
| 500 | 1 | 0 | Darker |
| 800 | 2 | -1 | Darker |
| 1150 | 2 | -2 | Darker |
| 1250 | 2 | -1 | Darker |
| 950 | 1 | 0 | No |
| 550 | 0 | 0 | No |
| 650 | -1 | 0 | No |
| 1000 | 0 | 1 | No |
Table 5.23 Survey results of Test Subject 3 in human subject experiment
The visual comfort labels reported in the validation test were inconsistent with those from the human subject experiment, which resulted in the inaccurate predictions in the validation test. Table 5.23 summarizes the visual comfort labels collected from Participant 3 in the human subject experiment. When the illuminance level was 500 lux, a condition identical to that of the validation test, the visual sensation level collected was 1 and the visual satisfaction was 0. However, in the validation test, the participant reported a visual sensation of 0 and a visual satisfaction of either 1 or 0. This inconsistency would dramatically influence the prediction results, since the classes were effectively redefined by the test subject.
The percentage accuracy of each prediction model (Section 5.2) also partly accounts for the poor performance in the validation test: a prediction model with higher percentage accuracy has a higher chance of outputting the correct label.
| No. | Percentage Accuracy (Visual Sensation) | Percentage Accuracy (Visual Satisfaction) |
|-----|----------------------------------------|-------------------------------------------|
| 1 | 65.3% | 64.7% |
| 7 | 73.9% | 93.2% |
| 3 | 82.4% | 72.2% |
Table 5.24 Percentage accuracy
A stability issue was found in the control loop, which caused the program to collapse during operation. The error message is shown below (Figure 5.23). It indicated that the current input data array could not be reshaped into a new array that was expected to have 30 columns with no restriction on the number of rows. This reshape function was designed to obtain pupil size data at a granularity of 1 second from the raw signal, which was originally sampled at 30 Hz. The error was attributed to the fact that the sensor frequently lost track of the eye pupil during the experiment, so no pupil size was recorded at those times. This was likely a result of blockage by the eyelid, the wearing of corrective lenses, and the sliding of the tracking glasses during the experiment.
Figure 5.23 Error message
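A defensive version of the failing reshape is sketched below: trimming the 30 Hz pupil stream to a whole number of 1-second blocks before reshaping means dropped samples (eye-tracking losses) can no longer break the (n, 30) reshape. The function name is illustrative, not the thesis code:

import numpy as np

def per_second_blocks(raw, rate=30):
    raw = np.asarray(raw, dtype=float)
    n = (len(raw) // rate) * rate       # largest multiple of the sample rate
    return raw[:n].reshape(-1, rate)    # one row per second of data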
| No. | Survey Point | Survey Vis Sens | Survey Vis Satis | Action Req | Predicted Vis Sens | Predicted Vis Satis | Action Trig |
|-----|--------------|-----------------|------------------|------------|--------------------|---------------------|-------------|
| 1 | 1 | 1 | 0 | No | NA | NA | NA |
| 1 | 2 | 1 | 0 | No | NA | NA | NA |
| 1 | 3 | 1 | 0 | No | NA | NA | NA |
| 1 | 4 | 1 | 0 | No | 0/0 | 1/1 | No/No |
| 1 | 5 | 1 | 0 | No | 0/Br | 1/Br | No/Br |
| 1 | 6 | 1 | 0 | No | 0/Br | 1/Br | No/Br |
| 7 | 1 | -1 | 0 | Brighter | NA | NA | NA |
| 7 | 2 | 0 | 0 | Brighter | NA | NA | NA |
| 7 | 3 | -1 | 0 | Brighter | NA | NA | NA |
| 7 | 4 | -1 | 0 | Brighter | 0/0 | -1/-1 | No/No |
| 7 | 5 | -1 | -1 | Brighter | 0/0 | -1/-1 | No/No |
| 7 | 6 | 0 | 0 | No | 0/0 | -1/-1 | No/No |
| 3 | 1 | 0 | 0 | No | NA | NA | NA |
| 3 | 2 | 0 | 0 | No | NA | NA | NA |
| 3 | 3 | 0 | 1 | No | NA | NA | NA |
| 3 | 4 | 0 | 1 | No | 1/1 | 0/0 | No/No |
| 3 | 5 | 0 | 0 | No | 1/1 | 0/0 | No/No |
| 3 | 6 | 0 | 1 | No | Br/Br | Br/Br | Br/Br |
*The numerical labels of participants are consistent with those in Tables 5.2–5.11; Action Req = action required; Action Trig = action triggered; NA = not applicable; Br = break.
Table 5.25 First round validation test results
| No. | Survey Point | Survey Vis Sens | Survey Vis Satis | Action Req | Predicted Vis Sens | Predicted Vis Satis | Action Trig |
|-----|--------------|-----------------|------------------|------------|--------------------|---------------------|-------------|
| 8 | 1 | 0 | 1 | No | NA | NA | NA |
| 8 | 2 | 0 | 1 | No | NA | NA | NA |
| 8 | 3 | 1 | 1 | No | -1/-1 | 0/0 | No/No |
| 8 | 4 | 1 | -1 | Darker | -1/-1 | 0/0 | No/No |
| 8 | 5 | 0 | 0 | No | -1/-1 | 0/0 | No/No |
| 8 | 6 | 0 | 1 | No | Br/Br | Br/Br | Br/Br |
| 13 | 1 | 0 | 1 | No | NA | NA | NA |
| 13 | 2 | 0 | 1 | No | NA | NA | NA |
| 13 | 3 | 0 | 0 | No | -1/-1 | 0/0 | No/No |
| 13 | 4 | 1 | 0 | No | 1/1 | 0/0 | No/No |
| 13 | 5 | 1 | 0 | No | 1/Br | 0/Br | No/Br |
| 13 | 6 | 1 | 0 | No | 1/1 | 0/0 | No/No |
| 5 | 1 | 1 | -1 | Darker | NA | NA | NA |
| 5 | 2 | 1 | -1 | Darker | NA | NA | NA |
| 5 | 3 | 0 | 0 | No | 1/Br | -1/Br | Darker/Br |
| 5 | 4 | 1 | -1 | Darker | 1/1 | -1/-1 | Darker/Darker |
| 5 | 5 | 1 | -1 | Darker | 1/1 | -1/-1 | Darker/Darker |
| 5 | 6 | 1 | -1 | Darker | Br/Br | Br/Br | Br/Br |
*The numerical labels of participants are consistent with those in Tables 5.2–5.11; Action Req = action required; Action Trig = action triggered; NA = not applicable; Br = break.
Table 5.26 Second round validation test results
5.4.2. Second round validation test
Figure 5.24 Second round validation test design
The second-round validation test was designed to test the prototype control under an extreme condition (very bright: 1100 lux). It lasted for 1 hour, and a questionnaire was filled in every 10 minutes. There were three conditions: The first was a comparable condition with a constant illuminance level of 500 lux. The second was a condition controlled by the prototype, with a starting illuminance level of 500 lux. The third was also under the control of the prototype system, but with an extreme starting point (1100 lux). In addition to testing the prediction accuracy, the second validation test aimed to demonstrate the availability of consecutive control actions.
Table 5.26 summarizes the survey results and the prediction results; the control actions that were demanded and actually triggered are included, and consistent results can be identified by comparing the survey and prediction columns. Firstly, accurate predictions were generated for Participants 5 and 13 during exposure to the third condition. Secondly, for Participant 5, consecutive control actions were accessible. Thirdly, an enhancement of visual comfort was reported by Participant 5 at survey point 3: previously, the ambient light environment was dissatisfying and sensed as being bright (survey point 2); during the third 10-min step, the luminance was lowered by a single step, which resulted in a 100 lux decrease in the illuminance level at the working plane; at survey point 3, the lighting environment was rated as neutral in both sensation and satisfaction, which demonstrated an improvement in visual comfort. Fourthly, predictions that were completely opposite to the surveys were found for Participant 8. Fifthly, the control program collapsed as a consequence of losing track of the eye pupil.
The visual comfort labels collected from Participant 8 were questionable, which may explain the model's inaccurate predictions. An ascending visual satisfaction level was collected at survey points 4, 5, and 6 even though no luminance change was made by the system. In other words, the participant showed dissatisfaction with the extreme condition but was satisfied with the identical illuminance level after a 20-minute exposure. Additionally, the inaccurate predictions for Participant 8 can probably be attributed to the low percentage accuracy of the underlying models (Table 5.27).
| No. | Percentage Accuracy (Visual Sensation) | Percentage Accuracy (Visual Satisfaction) |
|-----|----------------------------------------|-------------------------------------------|
| 8 | 67.0% | 72.3% |
| 13 | 94.1% | 94.1% |
| 5 | 80.1% | 79.2% |
Table 5.27 Percentage accuracy
5.4.3. Conclusion
The validation tests evaluated the performance of the control prototype under a baseline condition and an extreme condition. The prediction accuracy was rather low when participants were exposed to the baseline condition but acceptable under the extreme condition. The control actions output by the control logic conformed to the actual demands of four out of six test subjects. Frequent collapse of the program occurred during the tests, resulting from the pupilometer losing track of the eye pupil.
5.5. Analysis of EDA parameter
The process of analyzing the EDA parameters followed the sequence shown in Figure 3.5.2. The focus area was defined as the period from the second minute to the seventh minute of every experimental step (Figure 5.25). The first and the last minute were discarded, since the transient region was not the focus of the study. The existing signal was resampled to lower the sampling rate so that the analysis could be done faster. The EDA was initially measured at a tonic level, the skin conductance level (SCL); transforming the tonic level of EDA to a phasic level allowed skin conductance responses (SCRs) to be located. The mean SCL and the count of full SCR cycles were recognized as a valuable summary of EDA activities. A general discussion is provided by combining the summary of EDA activities with the corresponding visual comfort labels.
Figure 5.25 Complete EDA signal with definition of focus areas for a single test subject
5.5.1. Artifacts
To generate a summary of EDA activities, a clean signal should be prepared. There are several types of
artifacts that should be processed before the analysis.
• High resolution
Figure 5.26 Signal with high resolution
Figure 5.26 shows an EDA signal with high-resolution noise, which requires a low-pass filter to attenuate the high-frequency content. For this study, the cutoff frequency set for the low-pass filter was 1 Hz. Figure 5.27 shows the signal after the low-pass filter was applied.
Figure 5.27 Signal that has been passed with a 1 Hz low-pass filter
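A 1 Hz low-pass filter can be sketched with scipy as below; the EDA sampling rate shown (fs = 125 Hz) is an assumed value, not stated in the text:

from scipy.signal import butter, filtfilt

def lowpass_1hz(eda, fs=125.0, order=4):
    b, a = butter(order, 1.0, btype="low", fs=fs)  # 1 Hz cutoff frequency
    return filtfilt(b, a, eda)                     # zero-phase filtering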
• Rapid transient
Figure 5.28 Signal with rapid transient
Figure 5.28 shows a rapid-transient issue. The large fluctuation within a short time period is treated as noise, which would pollute the original data. The bounding endpoints were connected to eliminate this particular noise; the resultant signal is shown in Figure 5.29.
Figure 5.29 The rapid transient was eliminated by connecting the bounding endpoints
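Connecting the bounding endpoints amounts to replacing the samples inside the affected interval with a straight line; a minimal sketch (the index choice is manual, as in the analysis described above):

import numpy as np

def connect_endpoints(eda, i0, i1):
    eda = np.asarray(eda, dtype=float).copy()
    # straight line between eda[i0] and eda[i1] overwrites the transient
    eda[i0:i1 + 1] = np.linspace(eda[i0], eda[i1], i1 - i0 + 1)
    return eda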
• Decoupling from skin
Figure 5.30 Signal with electrodes decoupling issue
When the electrodes were not applied correctly, or the electrode leads were pulling on an electrode, the issue shown in Figure 5.30 would occur: a rapid transient, highlighted by the box, destroys the smooth transition of the signal. The best way of dealing with this issue would be to identify and resolve the problem before it impacted the data. However, as the experiment process could not be interrupted, any focus area containing such an artifact was discarded.
• Negative SCL value
Figure 5.31 Signal with a negative SCL value
Tonic skin conductance cannot be lower than 0. Possible reasons for observing negative values in an EDA recording are incorrect calibration or the recording being in AC mode. This issue should be resolved before the experiment; when such an observation was made during the experiment (Figure 5.31), the entire sequence of data collected for the participant was discarded.
5.5.2. Summary of EDA activities
Three measurements were made for every experimental setting to summarize the corresponding EDA activities: mean SCL, count of SCRs, and frequency of SCRs. The SCRs studied here were non-specific (NS-SCRs), meaning SCRs that occurred in the absence of an identifiable eliciting stimulus. The identification of SCRs was performed automatically in the software AcqKnowledge 5. The definition of NS-SCRs was preset in the software before the SCRs were located. Figure 5.32 shows the definition used in this study: when the phasic EDA level exceeded the 0.01 micro-siemens threshold, it was recognized as an NS-SCR event; however, if the phasic EDA level of an NS-SCR was lower than 5% of the maximum signal, it was rejected as an NS-SCR event. The resultant graph (Figure 5.33) shows a phasic-level EDA signal and the NS-SCRs located correspondingly in the tonic-level EDA signal, marked with blue bubbles.
Figure 5.32 Definition of NS-SCR
Figure 5.33 Phasic level of EDA signal
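The NS-SCR rule described above can be approximated in a short sketch (this is an illustration of the threshold logic, not the software's algorithm): count excursions of the phasic signal above 0.01 micro-siemens, rejecting peaks below 5% of the signal maximum.

import numpy as np

def count_ns_scrs(phasic, onset=0.01, reject_frac=0.05):
    phasic = np.asarray(phasic, dtype=float)
    floor = reject_frac * phasic.max()     # 5%-of-maximum rejection threshold
    above = phasic > onset                 # above the 0.01 micro-siemens onset
    # onsets are the rising edges of the above-threshold mask
    onsets = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    count = 0
    for i in onsets:
        j = i
        while j < len(phasic) and above[j]:
            j += 1
        if phasic[i:j].max() >= floor:     # keep only sufficiently large SCRs
            count += 1
    return count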
Because of contamination by artifacts, only five participants' data qualified for the analysis. The measurement data are summarized in Table 5.28. The numerical labels used for participants are consistent with those used in Sections 5.1 and 5.2.
• Illuminance level and EDA measurement
The first attempt was to investigate the EDA activities under different lighting conditions. Figure 5.34 depicts
the step changes of illuminance and of mean SCL together. Firstly, an individual pattern can be found for
each participant in terms of the magnitude and the trend. For example, Participant 2 showed a relatively large
value of SCL in comparison with the other participants. Secondly, no relation was found between the patterns
of illuminance variation and SCL fluctuation. For example, there was a stepwise illuminance increase from
steps 4 to 7. The corresponding EDA activities discovered varied for each participant. Participant 1 showed
a stepwise decrease in SCL, while Participant 7 showed a gradual increase in SCL. Meanwhile, Participant
6’s mean SCL fluctuated with the increase of illuminance level.
Figure 5.35 describes the count of NS-SCRs versus illuminance level. Each participant showed an individual
pattern of EDA activities, and no obvious correlation can be found with illuminance level.
Figure 5.34 Illuminance level and mean SCL for 5 participants
Figure 5.35 Illuminance level and count of SCRs for 5 participants
• Visual sensation and EDA measurement
Figure 5.36 Visual sensation and EDA measurement (Participant 1)
Figure 5.37 Visual sensation and EDA measurement (Participant 2)
Figure 5.38 Visual sensation and EDA measurement (Participant 3)
Figure 5.39 Visual sensation and EDA measurement (Participant 6)
Figure 5.40 Visual sensation and EDA measurement (Participant 7)
Figures 5.36–5.40 present visual sensation and EDA measurements together for every participant. According
to the finding from Section 2.4 that stress would contribute to an increase in SCL, the focus of the
investigation would fall on tracking variation of the mean SCL in response to the change in visual sensation
level.
For participants 1, 2, and 3, a decreasing SCL can be found when a consistent visual sensation level was
reported for consecutive experimental steps. This statement cannot be applied to Participant 6, as a fluctuating
SCL was found in response to a consistent visual sensation level. For Participant 7, because the visual
sensation reported changed frequently, it is impossible to investigate a period when a consistent visual
sensation level was reported.
It was difficult to match every change in visual sensation with an increase in SCL. A dramatic change in visual sensation (e.g., varying from a negative to a positive level or vice versa) corresponded to an increase in SCL; see, for example, steps 1–3 for Participant 6. However, variation in visual sensation between the neutral level and a positive or negative level did not produce a traceable increase in SCL. A possible explanation is that the latter kind of change in visual sensation is perceptually insignificant: even though the reported visual sensation level changed, the variation of the actual illuminance level was acceptable, so no impact was felt by the participant.
• Visual satisfaction versus EDA measurement
Figure 5.41 Visual satisfaction and EDA measurement (Participant 1)
Figure 5.42 Visual satisfaction and EDA measurement (Participant 2)
Figure 5.43 Visual satisfaction and EDA measurement (Participant 3)
Figure 5.44 Visual satisfaction and EDA measurement (Participant 6)
Figure 5.45 Visual satisfaction and EDA measurement (Participant 7)
Figures 5.41–5.45 present visual satisfaction and EDA measurements together for every participant. Since
the perception of comfort is subjective, no specific pattern between EDA measurement and visual satisfaction
was accessible for discussion.
| No. | Measurement | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|-----|-------------|---|---|---|---|---|---|---|---|---|----|----|----|
| 1 | Mean SCL (micro-siemens) | 4.624 | 4.38 | 3.984 | 5.774 | 4.888 | 3.213 | 2.939 | 4.921 | 4.272 | 5.151 | \ | 7.04 |
| 1 | Count of SCR | 18 | 24 | 16 | 26 | 19 | 32 | 26 | 24 | 29 | 23 | \ | 32 |
| 1 | Frequency of SCR (per second) | 0.05 | 0.067 | 0.0444 | 0.0722 | 0.0528 | 0.0889 | 0.0722 | 0.067 | 0.0806 | 0.0639 | \ | 0.0889 |
| 2 | Mean SCL (micro-siemens) | 9.673 | \ | 13.358 | 14.177 | 14.583 | 14.875 | 14.198 | \ | 12.73 | 13.275 | 13.011 | 12.517 |
| 2 | Count of SCR | 5 | \ | 19 | 22 | 22 | 27 | 19 | \ | 21 | 28 | 26 | 21 |
| 2 | Frequency of SCR (per second) | 0.014 | \ | 0.0528 | 0.0611 | 0.0611 | 0.075 | 0.0528 | \ | 0.0583 | 0.0778 | 0.0722 | 0.0583 |
| 3 | Mean SCL (micro-siemens) | 3.084 | 5.289 | 4.632 | 5.229 | 3.549 | 4.472 | 3.912 | 2.785 | 2.863 | 3.372 | 1.417 | \ |
| 3 | Count of SCR | 43 | 38 | 35 | 49 | 29 | 42 | 41 | 38 | 41 | 48 | 37 | \ |
| 3 | Frequency of SCR (per second) | 0.119 | 0.106 | 0.0972 | 0.1361 | 0.0806 | 0.1167 | 0.1139 | 0.106 | 0.1139 | 0.1333 | 0.1028 | \ |
| 6 | Mean SCL (micro-siemens) | 3.668 | 4.379 | \ | 5.516 | 1.671 | 4.287 | 3.355 | 5.141 | 3.909 | 3.416 | 5.286 | \ |
| 6 | Count of SCR | 24 | 19 | \ | 19 | 8 | 18 | 14 | 26 | 10 | 17 | 27 | \ |
| 6 | Frequency of SCR (per second) | 0.067 | 0.053 | \ | 0.0528 | 0.0222 | 0.05 | 0.0389 | 0.072 | 0.0278 | 0.0472 | 0.075 | \ |
| 7 | Mean SCL (micro-siemens) | 4.063 | 2.667 | 2.334 | 1.782 | 1.826 | 2.209 | 2.327 | 2.452 | 2.078 | 2.081 | 1.786 | 1.86 |
| 7 | Count of SCR | 46 | 21 | 23 | 14 | 22 | 24 | 23 | 28 | 18 | 23 | 19 | 17 |
| 7 | Frequency of SCR (per second) | 0.128 | 0.058 | 0.0639 | 0.0389 | 0.0611 | 0.0667 | 0.0639 | 0.078 | 0.05 | 0.0639 | 0.0528 | 0.0472 |
Table 5.28 Summary of EDA measurement
Chapter 6. Conclusions and Future Works
This research was aimed at developing a lighting control tool to deal with indoor visual comfort issues in the
office environment. The tool was developed with two components – visual comfort prediction and a physical
controller signaled by the prediction results. The prediction model was constructed utilizing a machine
learning technique. Three algorithms were investigated to find the optimum one for the tool to use. The data
input for training and testing were occupants’ eye pupil sizes and their visual comfort labels (e.g., visual
sensation and visual satisfaction) in response to the illuminance level. A human subject test was conducted
for data collection. Prediction accuracy was adopted as the criterion for the selection of an algorithm. Feature
selection and rescaling of visual comfort labels were actions taken for the improvement of the prediction
performance. A prototype control was established using the prediction model selected based on the criteria
and tested by two rounds of validation test. In addition, electrodermal activity was explored as a human
psychophysiological factor. It was expected that a general conclusion could be drawn to reveal the influence
of occupants’ emotions or stress conditions on visual comfort perceived and extend the existing scope of IEQ
research.
Therefore, two major conclusions were drawn from the research. The first part of the conclusions will
summarize the findings involved in the process of developing the control tool. In detail, it will describe the
performance of the prediction model and the prototype control. The second part includes the general findings
from EDA research. Finally, future works will be recommended for further improvement of the research.
6.1. Conclusions
6.1.1. Visual comfort prediction and prototype control
• An individual visual comfort model is necessary to accommodate occupants’ preferences about their
indoor lighting environment.
• 30s was selected as the most appropriate time window for the moving average filter of absolute
pupil size because it retained not only the trend of the data but also their sensitivity.
• A 30s moving average and 60s, 90s, and 120s gradients of pupil size were selected as the most
significant input features for the establishment of the prediction model. The selection process
utilized a boosted gradient tree to generate the score of importance for each feature for every
participant.
• SVM was revealed as the algorithm that could produce the most accurate predictions for 5-level
visual sensation, 5-level visual satisfaction, 3-level visual sensation, and 3-level visual satisfaction.
However, this outstanding performance of SVM was not proved statistically by the ANOVA test (the p-value was larger than 0.05), which means this conclusion was restricted to the current research
sample group. However, the remarkable difference of medians between SVM and the other
algorithms still shows the distinction in prediction performance.
• Two rounds of validation tests indicated that the prototype control worked properly under an extreme
condition but failed to output correct control action under a baseline condition (i.e., the illuminance
level required by IES lighting guidelines). Frequent collapse of the control program was reported as
a result of tracking loss of participants’ eye pupils.
6.1.2. Visual comfort and EDA activities
• Individual patterns were evident for EDA activities through the process of the experiment.
• A decreasing level of SCL was found in response to the period when a consistent visual sensation
level was reported for consecutive experimental steps.
• A dramatic change in visual sensation, that is, a change starting from a negative to a positive level or from a positive to a negative level, was potentially related to an increase in SCL.
• Considering that the EDA parameter was more like an event-based psychophysiological factor, the
experiment should be intentionally redesigned to reveal the association between visual comfort and
EDA activities.
6.2. Future works
6.2.1. Explore pupil size as time series data
It was noted that the time sequence of the data was deliberately disorganized during the training process for the purpose of randomization. However, the change pattern of pupil size is actually a function of time; in detail, a sharp movement can be located during the first second after the eyes are exposed to a light change. The randomization of the data blocks this time effect, which may be a correlated indicator of occupants' visual comfort.
A possible way of extracting this time feature of pupil size would be to develop 2-dimensional pupil size data. For example, the frequency of change could be derived as a supplement, forming the second dimension of the pupil size data.
6.2.2. Possible improvement of experimental design for investigation into EDA parameters
Previous research revealed that EDA is closely linked with the occurrence of certain events, for example, exposure to a stressful scene. The current experimental design neglected this fact and tried to utilize EDA as a general background parameter to investigate the relationship between occupants' psychological states and their perception of visual comfort. The feasibility of this approach was not proven; instead, the results suggest that the change in visual sensation, as an event, was related to variation in EDA activities. Therefore, it is necessary to redesign the experiment to conform to the event-based nature of EDA research.
6.2.3. Possible improvement of software and hardware
It was found that the collapse of the program occurred because the pupilometer lost track of the subjects' eye pupils. This low sensitivity is expected to be improved by more advanced models of eye sensors. Additionally, the current pupilometer can only be operated with the computer it comes with, which restricted the further application of the prototype control.
The current prediction model utilized relatively simple machine learning algorithms. If a deep learning technique is desired to improve the prediction performance, a more advanced programming library is required to implement the idea; for example, TensorFlow would be an ideal approach for implementing an artificial neural network.
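As an illustration of that route (the layer sizes and training setup are assumptions, not a tested design), a small feed-forward network over the four pupil features could look like this:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),               # 30s MA + three gradients
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),  # dark / neutral / bright
])
# assumes the 3-level labels {-1, 0, 1} are remapped to {0, 1, 2}
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])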
References
Alpaydin, E. (2014). Introduction to machine learning. Retrieved from https://ieeexplore-ieee-org.libproxy1.usc.edu/document/6917134
Association for Research in Vision and Ophthalmology., B., Whitaker, D., Elliott, D. B., & Phillips, N. J. (1994). Investigative ophthalmology & visual science. Investigative Ophthalmology & Visual Science (Vol. 35). Retrieved from https://iovs.arvojournals.org/article.aspx?articleid=2161149
Auliciems, A. (1981). Towards a psycho-physiological model of thermal perception. International Journal of Biometeorology, 25(2), 109–122. http://doi.org/10.1007/BF02184458
Babakus, E., & Mangold, W. G. (1992). Adapting the SERVQUAL scale to hospital services: an empirical investigation. Health Services Research, 26(6), 767–86. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/1737708
Berman, S., Jewett, D., Benson, B., & Law, T. M. (1997). Despite Different Wall Colors, Vertical Scotopic Illuminance Predicts Pupil Size. Journal of the Illuminating Engineering Society, 26(2). Retrieved from https://eta.lbl.gov/publications/despite-different-wall-colors
Boucsein, W. (2012a). Applications of Electrodermal Recording. In Electrodermal Activity (pp. 259–523). Boston, MA: Springer US. http://doi.org/10.1007/978-1-4614-1126-0_3
Boucsein, W. (2012b). Applications of Electrodermal Recording. In Electrodermal Activity (pp. 259–523). Boston, MA: Springer US. http://doi.org/10.1007/978-1-4614-1126-0_3
Boucsein, W., Fowles, D. C., Grimnes, S., Ben-Shakhar, G., Roth, W. T., Dawson, M. E., … Society for Psychophysiological Research Ad Hoc Committee on Electrodermal Measures. (2012). Publication recommendations for electrodermal measurements. Psychophysiology, 49(8), 1017–1034. http://doi.org/10.1111/j.1469-8986.2012.01384.x
Boyce, P. R. (1979). Users' attitudes to some types of local lighting. Lighting Research & Technology, 11(3), 158–164. http://doi.org/10.1177/14771535790110030501
Brownlee, J. (2016). Feature Importance and Feature Selection With XGBoost in Python. Retrieved February 24, 2019, from https://machinelearningmastery.com/feature-importance-and-feature-selection-with-xgboost-in-python/
Chaudhuri, T., Soh, Y. C., Li, H., & Xie, L. (2017). Machine learning based prediction of thermal comfort in buildings of equatorial Singapore. In 2017 IEEE International Conference on Smart Grid and Smart Cities (ICSGSC) (pp. 72–77). IEEE. http://doi.org/10.1109/ICSGSC.2017.8038552
Choi, J.-H. (2017). Investigation of human eye pupil sizes as a measure of visual sensation in the workplace environment with a high lighting colour temperature. Indoor and Built Environment, 26(4), 488–501. http://doi.org/10.1177/1420326X15626585
Choi, J.-H., Lin, X., & Schiller, M. (2018). Investigation of a real-time change of human eye pupil sizes per visual sensation condition. ARCC – EAAE 2018 International Conference. Retrieved from https://par.nsf.gov/biblio/10063577
Choi, J.-H., & Zhu, R. (2015). Investigation of the potential use of human eye pupil sizes to estimate visual sensations in the workplace environment. Building and Environment, 88, 73–81. http://doi.org/10.1016/J.BUILDENV.2014.11.025
Folkins, C. H. (1970). Temporal factors and the cognitive mediators of stress reaction. Journal of Personality and Social Psychology, 14(2), 173–84. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/5478969
Greenwald, M. K., Cook, E. W., & Lang, P. J. (1989). Affective judgment and psychophysiological response: Dimensional covariation in the evaluation of pictorial stimuli. Journal of Psychophysiology. Retrieved from http://www.academia.edu/29172872/Affective_judgment_and_psychophysiological_response_Dimensional_covariation_in_the_evaluation_of_pictorial_stimuli
Juslén, H. T., Wouters, M. C. H. M., & Tenner, A. D. (2007). Lighting level and productivity: a field study in the electronics industry. Ergonomics, 50(4), 615–624. http://doi.org/10.1080/00140130601155001
Katkin, E. S. (1965). Relationship between manifest anxiety and two indices of autonomic response to stress. Journal of Personality and Social Psychology, 2(3), 324–333. http://doi.org/10.1037/h0022303
Kenney, J., & Keeping, E. (1962). Moving averages. Mathematics of Statistics, 14(2), 221–223. Princeton. Retrieved from https://scholar.google.com/scholar?cluster=9550428660616578665&hl=en&as_sdt=2005&sciodt=0,5&scioq=Kenney,+J.+F.+and+Keeping,+E.+S.+%22Moving+Averages.%22+
Kilpatrick, D. G. (1972). Differential Responsiveness of Two Electrodermal Indices to Psychological Stress and Performance of a Complex Cognitive Task. Psychophysiology, 9(2), 218–226. http://doi.org/10.1111/j.1469-8986.1972.tb00756.x
Kim, J., Zhou, Y., Schiavon, S., Raftery, P., & Brager, G. (2018). Personal comfort models: Predicting individuals' thermal preference using occupant heating and cooling behavior and machine learning. Building and Environment, 129, 96–106. http://doi.org/10.1016/J.BUILDENV.2017.12.011
Kim, S.-Y., & Kim, J.-J. (2007). Influence of light fluctuation on occupant visual perception. Building and Environment, 42(8), 2888–2899. http://doi.org/10.1016/J.BUILDENV.2006.10.033
Lazarus, R. S. (1966). Psychological stress and the coping process. New York, NY, US: McGraw-Hill.
Luo, M., Cao, B., Ji, W., Ouyang, Q., Lin, B., & Zhu, Y. (2016). The underlying linkage between personal control and thermal comfort: Psychological or physical effects? Energy and Buildings, 111, 56–63. http://doi.org/10.1016/J.ENBUILD.2015.11.004
Monat, A., Averill, J. R., & Lazarus, R. S. (1972). Anticipatory stress and coping reactions under various conditions of uncertainty. Journal of Personality and Social Psychology, 24(2), 237–253. http://doi.org/10.1037/h0033297
Navvab, M. (2001). A Comparison of Visual Performance under High and Low Color Temperature Fluorescent Lamps. Journal of the Illuminating Engineering Society, 30(2), 170–175. http://doi.org/10.1080/00994480.2001.10748361
Ng, A. (n.d.). Kernels I – Stanford University | Coursera. Retrieved February 12, 2019, from https://www.coursera.org/learn/machine-learning/lecture/YOMHn/kernels-i
Niemelä, P. (1975). Effects of interrupting the process of preparation for film stress. Scandinavian Journal of Psychology, 16(1), 294–302. http://doi.org/10.1111/j.1467-9450.1975.tb00196.x
Papez, J. W. (1937). A proposed mechanism of emotion. Archives of Neurology and Psychiatry, 38(4), 725. http://doi.org/10.1001/archneurpsyc.1937.02260220069003
Preston, C. C., & Colman, A. M. (2000). Optimal number of response categories in rating scales: reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104(1), 1–15. http://doi.org/10.1016/S0001-6918(99)00050-5
Revilla, M. A., Saris, W. E., & Krosnick, J. A. (2014). Choosing the Number of Categories in Agree–Disagree Scales. Sociological Methods & Research, 43(1), 73–97. http://doi.org/10.1177/0049124113509605
sklearn.linear_model.LogisticRegression — scikit-learn 0.20.2 documentation. (n.d.). Retrieved February 11, 2019, from https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
sklearn.naive_bayes.GaussianNB — scikit-learn 0.20.2 documentation. (n.d.). Retrieved February 11, 2019, from https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html#sklearn.naive_bayes.GaussianNB
sklearn.svm.SVC — scikit-learn 0.20.2 documentation. (n.d.). Retrieved February 11, 2019, from https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC
Spielberger, C. D. (2013). Anxiety and Behavior. Elsevier Science. Retrieved from https://books.google.com/books?hl=en&lr=&id=45pGBQAAQBAJ&oi=fnd&pg=PA225&dq=The+study+of+psychological+stress:+A+summary+of+theoretical+formulations+and+empirical+findings.+&ots=C31lRVkuf-&sig=MBfFx5WYczfl0al0YwH0JquUuaI#v=onepage&q&f=false
Stemmler, G. (1989). The autonomic differentiation of emotions revisited: convergent and discriminant validation. Psychophysiology, 26(6), 617–32. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/2629011
Sun, C., & Lian, Z. (2016). Sensitive physiological indicators for human visual comfort evaluation. Lighting Research & Technology, 48(6), 726–741. http://doi.org/10.1177/1477153515624266
Tanabe, S., & Nishihara, N. (2004a). Productivity and fatigue. Indoor Air, 14(Suppl 7), 126–133. http://doi.org/10.1111/j.1600-0668.2004.00281.x
Tanabe, S., & Nishihara, N. (2004b). Productivity and fatigue. Indoor Air, 14(s7), 126–133. http://doi.org/10.1111/j.1600-0668.2004.00281.x
van Bommel, W., & van den Beld, G. (2004). Lighting for work: a review of visual and biological effects. Lighting Research & Technology, 36(4), 255–266. http://doi.org/10.1191/1365782804li122oa
Weston, H. C. (1953). Visual Fatigue. Lighting Research and Technology, 18(2 IEStrans), 39–66. http://doi.org/10.1177/147715355301800201
Winton, W. M., Putnam, L. E., & Krauss, R. M. (1984). Facial and autonomic manifestations of the dimensional structure of emotion. Journal of Experimental Social Psychology, 20(3), 195–216. http://doi.org/10.1016/0022-1031(84)90047-7
96
Appendix A
Code for establishing sensation model
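The original listing for this appendix was embedded as an image in the source document. Below is a minimal, hypothetical sketch of what such a script could look like, assuming Python with pandas and scikit-learn and the column layout of the processed CSV files produced by the Appendix B script; the file name, train/test split, and hyperparameters are illustrative assumptions only.

# Hypothetical sketch of the sensation-model training script; the actual
# Appendix A code is not reproduced in this transcript. Assumes the column
# layout of the per-subject CSVs written by the Appendix B script.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Per-subject file produced by the Appendix B data-processing script
data = pd.read_csv('Mingming_Zhou_Processed_3.csv')

features = ['Absolute_size', 'Mov_Ave_30s', 'Gra_change_30s', 'Gra_change_40s',
            'Gra_change_50s', 'Gra_change_60s', 'Gra_change_90s',
            'Gra_change_120s', 'Lux_Level']
X = data[features]
y = data['Vis_Sens']          # use 'Vis_Satisf' for the satisfaction model

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Train the three candidate classifiers and compare test accuracy
for name, model in [('Gaussian NB', GaussianNB()),
                    ('Logistic regression', LogisticRegression(max_iter=1000)),
                    ('SVM', SVC(kernel='rbf'))]:
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))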
Appendix B
Code for data processing in MATLAB
%% Data pre-processing
% Read the raw pupilometer export (30 Hz) and the pre-processed survey
% spreadsheet for one subject, fill dropped samples, and downsample the
% pupil-size signal to 1 Hz.
filename  = 'Experiment/Pupilometer/Danyang Zhang.csv';
filename2 = 'Experiment/Pre-processing/Danyang Zhang.xlsx';
% Some subjects' recordings were split across several files, e.g.:
%filename3='Experiment/Pupilometer/Lingyu Huang_2.csv'
%filename3='Experiment/Pupilometer/Gaoge Yang_2.csv'
%filename3='Experiment/Pupilometer/Hanxun Liu_2.csv'
%filename4='Experiment/Pupilometer/Hanxun Liu_3.csv'
%filename5='Experiment/Pupilometer/Hanxun Liu_4.csv'
%filename3='Experiment/Pupilometer/Ruoxiao Zeng_2.csv'
%filename4='Experiment/Pupilometer/Ruoxiao Zeng_3.csv'
M  = csvread(filename,2,1);   % read data starting at row offset 2, column offset 1
M2 = readtable(filename2);
%M3=csvread(filename3,2,1);
%M4=csvread(filename4,2,1);
%M5=csvread(filename5,2,1);
% For split recordings, concatenate the timestamp and pupil-size columns:
%MM=[M(1:71770,[1 6]);M3(:,[1 6])]                                     %Lingyu Huang
%MM=[M(1:57497,[1 6]);M3(:,[1 6])]                                     %Gaoge Yang
%MM=[M(1:56881,[1 6]);M3(1:28750,[1 6]);M4(1:55124,[1 6]);M5(:,[1 6])] %Hanxun Liu
%MM=[M(1:30340,[1 6]);M3(1:12649,[1 6]);M4(:,[1 6])]                   %Ruoxiao Zeng
MM = M(:,[1 6]);              % column 1: timestamp, column 6: pupil size
New = zeros(size(MM,1),1);
Data = zeros(30,round(size(MM,1)/30));
Timelabel = zeros(round(size(MM,1)/30),1);
% Fill each missing sample (flagged as -2000) with the last valid value
for i = 1:size(MM,1)
    if MM(i,2) ~= -2000
        New(i,1) = MM(i,2);
    elseif i == 1
        New(i,1) = 0;         % no earlier sample to carry forward
    elseif MM(i-1,2) == -2000
        New(i,1) = New(i-1,1);
    else
        New(i,1) = MM(i-1,2);
    end
end
% Downsample from 30 Hz to 1 Hz by averaging each 30-sample window
for i = 1:size(Data,2)
    if i ~= size(Data,2)
        Data(:,i) = New((1+(i-1)*30):(i*30));
    else
        Data(:,i) = 0;        % the last window may be incomplete, so zero it
    end
end
Fin = mean(Data);
Fin = transpose(Fin);         % final 1 Hz pupil-size series
Fin = table(Fin);
TT  = table2timetable(Fin,'TimeStep',seconds(1));
Fin = timetable2table(TT);
% Keep the first 5280 s (11 lighting steps of 480 s each)
Fin(5281:end,:) = [];
% Start and end row of each 480 s lighting step
t1 = zeros(11,1);
t2 = zeros(11,1);
for i = 1:11
    t1(i) = 1 + (i-1)*480;
    t2(i) = i*480;
end
%% Extract each of the 11 lighting steps and build the feature table
% The 11 per-step blocks of the original script were identical except for
% the step index, so they are collapsed into one loop here. For each 480 s
% step: drop the first and last 60 s, compute a 30 s moving average of the
% pupil size and its change over six look-back horizons (30-120 s), then
% attach the step's lux level and the subject's visual sensation and
% visual satisfaction votes from M2.
T = cell(11,1);
for s = 1:11
    Ts = Fin(t1(s):t2(s),:);
    Ts(421:480,:) = [];          % drop the last 60 s of the step
    Ts(1:60,:)    = [];          % drop the first 60 s of the step
    Mov_Ave_30s     = array2table(zeros(360,1),'VariableNames',{'Mov_Ave_30s'});
    Gra_change_30s  = array2table(zeros(360,1),'VariableNames',{'Gra_change_30s'});
    Gra_change_40s  = array2table(zeros(360,1),'VariableNames',{'Gra_change_40s'});
    Gra_change_50s  = array2table(zeros(360,1),'VariableNames',{'Gra_change_50s'});
    Gra_change_60s  = array2table(zeros(360,1),'VariableNames',{'Gra_change_60s'});
    Gra_change_90s  = array2table(zeros(360,1),'VariableNames',{'Gra_change_90s'});
    Gra_change_120s = array2table(zeros(360,1),'VariableNames',{'Gra_change_120s'});
    Vis_Sens   = array2table(zeros(360,1),'VariableNames',{'Vis_Sens'});
    Vis_Satisf = array2table(zeros(360,1),'VariableNames',{'Vis_Satisf'});
    Lux_Level  = array2table(zeros(360,1),'VariableNames',{'Lux_Level'});
    Ts = [Ts Mov_Ave_30s Gra_change_30s Gra_change_40s Gra_change_50s ...
          Gra_change_60s Gra_change_90s Gra_change_120s Lux_Level Vis_Sens Vis_Satisf];
    % Column 3: 30 s moving average of the absolute pupil size (column 2)
    for i = 31:size(Ts,1)
        Ts{i,3} = mean(Ts{i-30:i,2});
    end
    % Columns 4-9: change of the moving average over 30/40/50/60/90/120 s
    for i = 61:size(Ts,1),  Ts{i,4} = Ts{i,3} - Ts{i-30,3};  end
    for i = 71:size(Ts,1),  Ts{i,5} = Ts{i,3} - Ts{i-40,3};  end
    for i = 81:size(Ts,1),  Ts{i,6} = Ts{i,3} - Ts{i-50,3};  end
    for i = 91:size(Ts,1),  Ts{i,7} = Ts{i,3} - Ts{i-60,3};  end
    for i = 121:size(Ts,1), Ts{i,8} = Ts{i,3} - Ts{i-90,3};  end
    for i = 151:size(Ts,1), Ts{i,9} = Ts{i,3} - Ts{i-120,3}; end
    Ts.Properties.VariableNames{2} = 'Absolute_size';
    Ts(1:150,:) = [];            % drop rows lacking a full 120 s history
    Ts{:,10} = M2{s,2};          % illuminance level (lux) of this step
    Ts{:,11} = M2{s,3};          % visual sensation vote
    Ts{:,12} = M2{s,4};          % visual satisfaction vote
    T{s} = Ts;
end
%% Combine the 11 steps and filter outliers
T_Final = vertcat(T{:});
%writetable(T_Final,'Check_Processed.csv')
% Drop rows whose absolute pupil size is above 80 or below 51
T_Final(T_Final{:,2} > 80 | T_Final{:,2} < 51, :) = [];
writetable(T_Final,'Mingming_Zhou_Processed_5.csv')  % per-subject output, 5-point labels
%% Convert the 5-point label scales to 3-point scales
% Keep only the sign of each vote: negative -> -1, positive -> +1, 0 -> 0
T_Final{:,11} = sign(T_Final{:,11});
T_Final{:,12} = sign(T_Final{:,12});
writetable(T_Final,'Mingming_Zhou_Processed_3.csv')  % per-subject output, 3-point labels
Appendix C
Appendix D
Abstract
Deficient indoor environments are common issues in today’s workplace, resulting in reduced work productivity, which contributes to indirect pecuniary loss among firms. Lighting, as an important component of indoor environmental qualities, is demonstrated to be closely related to occupants’ defective performance and is largely ignored by existing design guidelines, which are designed primarily for paper-based tasks and derived from empirical values.

An applicable tool was developed to improve visual comfort for an individual in an office environment. The tool consists of two parts: visual comfort prediction computed by a machine learning algorithm on the basis of the occupant’s eye pupil size as well as the illuminance level (lux), and physical luminance control based on the corresponding predicted visual comfort label (i.e., visual sensation and visual satisfaction) as an input.

After reviewing multiple computational algorithms for establishing a visual comfort prediction model, this study adopted Gaussian naïve Bayes (Gaussian NB), logistic regression (LG), and support vector machine (SVM). The model training and testing process utilized data collected from human subject tests, which were conducted in an environmental chamber at USC. Visual sensation (evaluation of brightness) and visual satisfaction (assessment of comfort) were used as classes to label each data point described by the features of human eye pupil size–related parameters and real-time illuminance level. Stepwise control was used for luminance control of lighting.

The research found that, in terms of accuracy, SVM outperformed LG and Gaussian NB and is therefore recommended to be used to signal the control. The prototype control developed demonstrated acceptable performance under the extreme condition (1100 lux) but failed to make changes for occupants under the baseline condition (500 lux).
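As context for the two-part design described above, here is a hypothetical Python sketch of the second part, the stepwise luminance control driven by the predicted label. It is not the thesis's actual implementation: the sensor and luminaire functions are placeholders, and the label convention (-1 = too dim, +1 = too bright) and 30 s control cadence are assumptions for illustration.

# Hypothetical sketch of a stepwise lighting-control loop driven by a
# predicted visual-sensation label; placeholders stand in for the
# pupilometer, lux sensor, and dimmable luminaire interfaces.
import time

def read_pupil_features():
    """Placeholder: current pupil-size feature vector (Absolute_size,
    Mov_Ave_30s, and the six gradient-change features)."""
    return [65.0, 65.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

def read_lux():
    """Placeholder: current desktop illuminance in lux."""
    return 500.0

def set_dimming_level(level):
    """Placeholder: command the luminaire to the given dimming step."""
    print('dimming level ->', level)

def control_loop(model, level=5, min_level=0, max_level=10, step=1):
    """Step the light down when the predicted sensation is 'too bright'
    and up when it is 'too dim' (assumed -1/+1 label convention)."""
    while True:
        x = [read_pupil_features() + [read_lux()]]  # one feature row
        label = model.predict(x)[0]                 # predicted visual sensation
        if label > 0 and level > min_level:
            level -= step                           # perceived too bright
        elif label < 0 and level < max_level:
            level += step                           # perceived too dim
        set_dimming_level(level)
        time.sleep(30)                              # allow the pupil to adapt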
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Human-building integration: Investigation of human eye pupil sizes as a measure of visual sensation in the workstation environment
Human-environmental interaction: potential use of pupil size for office lighting controls
Exploration for the prediction of thermal comfort & sensation with application of building HVAC automation
Developing environmental controls using a data-driven approach for enhancing environmental comfort and energy performance
Enhancing thermal comfort: data-driven approach to control air temperature based on facial skin temperature
Multi-occupancy environmental control for smart connected communities
Human-building integration based on biometric signal analysis: investigation of the relationships between human comfort and IEQ in a multi-occupancy condition
Adaptive façade controls: A methodology based on occupant visual comfort preferences and cluster analysis
Smart buildings: employing modern technology to create an integrated, data-driven, intelligent, self-optimizing, human-centered, building automation system
Asset Metadata
Creator
Cen, Lingkai
(author)
Core Title
Human–building integration: machine learning–based and occupant eye pupil size–driven lighting control as an applicable visual comfort tool in the office environment
School
School of Architecture
Degree
Master of Building Science
Degree Program
Building Science
Publication Date
04/19/2021
Defense Date
03/20/2019
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
artificial intelligence,building environmental control,eye pupils,machine learning,OAI-PMH Harvest,visual comfort
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Choi, Joon-Ho (committee chair), Gil, Yolanda (committee member), Narayanan, Shrikanth (committee member)
Creator Email
lingkaic@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-140613
Unique identifier
UC11676622
Identifier
etd-CenLingkai-7204.pdf (filename),usctheses-c89-140613 (legacy record id)
Legacy Identifier
etd-CenLingkai-7204.pdf
Dmrecord
140613
Document Type
Thesis
Rights
Cen, Lingkai
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA