SPATIAL AND TEMPORAL PATTERNS OF BRAIN ACTIVITY
ASSOCIATED WITH EMOTIONS IN MUSIC
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Psychology)
by
Matthew Elliott Sachs, MA
August, 2019
Approved by:
Antonio Damasio, MD/PhD, Chairperson
Jonas Kaplan, PhD
John Monterosso, PhD
Darby Saxbe, PhD
Beatriz Ilari, PhD
© Matthew E. Sachs, 2019
Acknowledgements
I would like to thank Antonio and Hanna Damasio for welcoming me into their Brain and
Creativity Institute and for supporting me throughout my graduate school career.
Specifically, I want to thank Antonio for the countless hours he has spent reading over
and editing manuscripts with me and discussing the origins and foundations of music, the
brain, and culture. I want to thank Hanna for her scrupulous neuroanatomical edification
and her willingness to always share her thoughts and provide helpful feedback on any
question or concern that I have presented.
Jonas Kaplan for his unwavering dedication to helping me figure out the technical and
analytical aspects of these studies, for his “open-door” policy, both figurative and literal,
which allowed me to come to him with any issue, big or small, and know that, whether he
had the time or energy or not, he would work to find a solution, and for his uncanny ability to
delight me with his musings, theories, and unyielding opinions on (name a topic).
Assal Habibi for being one of the most committed and generous mentors and
collaborators I have ever had, for her willingness to read and edit every draft of a draft of
a draft of a paper, for knowing when to provide guidance and when to give me the
freedom to try on my own, for making me feel that I had someone who would go to bat
for me, and for both nurturing and investing in my future success and career.
Denise, Faith, and Cinthya for their administrative support, arranging meetings, providing
reimbursements, organizing events and submitting grants. Collecting data and having
opportunities to present would not have been possible without their tireless and
unmitigated assistance.
Anthony Vaccaro for assisting with parts of the neuroimaging data and for assuring me
that some people out there do still love musicals. Morteza Dehghani for his tough love
and equally tough advice.
All members of the Brain and Creativity Institute who contributed in a variety of ways
throughout graduate school, mostly through systematic interruptions to my meticulous
workflow, which, while undeniably irritating, did provide a nice respite from my own
self-imposed solitude.
My father, my mother, my sister and my brother. Gamma Selma, Grandpop Herb and
Grandma Claire.
And Steph, for sharing in the rage.
Abstract
The evocative and temporal nature of music enables it to be an effective tool for
uncovering the spatial and temporal patterns of brain activity associated with affective
experiences. In a series of three studies, I employ data-driven, multivariate statistical techniques
to capture patterns of information in response to musical stimuli. In Study 1, I employ multivoxel
pattern analysis to show that neural patterns in the auditory, somatosensory, and insular cortices
represent specific categories of emotions perceived through music and that these representations
extend to non-musical stimuli as well. In the next two studies, I shift from perception to
experience, focusing on the emergence of enjoyment in response to sad music, a case in which
the emotion that is perceived by the listener may not necessarily
match the emotion that is felt. In Study 2, I show that people who find sad music enjoyable tend
to score higher on a specific sub-trait of empathy called Fantasy. This relationship is partially
mediated by the intensity and quality of one’s emotional responses to sad music. In Study 3, I
build off the findings from Study 2 by conducting a neuroimaging study in which participants
listened to a full-length piece of sad music. Using a data-driven approach to assess
synchronization of brain activity across participants, I show that, while listening to sad music,
high Fantasy individuals have greater synchronization in regions of the brain involved in
processing emotions and simulating the emotions of others. Furthermore, when evaluating
synchronization dynamically, I found that increased enjoyment of the piece of music predicted
similar patterns of activity across people in the basal ganglia, orbitofrontal, and auditory cortices.
The results presented across these three studies provide a more nuanced understanding of the
neurobiology of emotions and feeling, specifically with regards to the spatial components
involved in processing incoming sensory information to form a concept of an emotion, and the
temporal components involved in integrating external and internal information to form a mental
representation of our subjective experience.
Table of Contents
Acknowledgements
Abstract
List of Tables
List of Figures
CHAPTER 1: Introduction
CHAPTER 2: Decoding the neural representation of emotions expressed through music and voice
    Introduction
    Methods
    Results
    Discussion
CHAPTER 3: Unique Personality Profiles Predict When and Why Sad Music is Enjoyed
    Introduction
    Methods
    Results
    Discussion
CHAPTER 4: Synchronized brain activity reflects the enjoyment of sad music over time
    Introduction
    Methods
    Results
    Discussion
CHAPTER 5: Discussion and Conclusions
References
Appendix A: Confusion matrices for classification of auditory emotions
Appendix D: Significant peaks and coordinates for searchlight results
Appendix E: Emotional responses to sad music (based on GEMS-9)
Appendix F: Reasons for listening to sad music
Appendix G: Music listening in various situations
Appendix I: List of acoustic features extracted from sad piece of music
Appendix J: Zero-order correlations for all survey measures in Chapter 3
Appendix K: Coordinates of significant ISC clusters during sad music-listening
List of Tables
Table 1. Mean scores and standard deviation for survey measures
Table 2. Intensity ratings and affective labeling of auditory emotional stimuli
Table 3. Emotion classification of auditory stimuli using acoustic features alone
Table 4. LASSO models predicting the enjoyment of sad music and GEMS
Table 5. Exploratory factor analysis of 20 situations for listening to sad music
Table 6. LASSO models predicting situational factors for listening to sad music
Table 7. Exploratory factor analysis with 12 reasons for listening to sad music
Table 8. LASSO models predicting reasons for listening to sad music
Table 9. Regions of interest used for intersubject synchronization
Table 10. Summary statistics of collected behavioral measures
Table 11. Intersubject synchronization correlated with enjoyment ratings
Table 12. Inter-regional intersubject synchronization correlated with ratings
List of Figures
Figure 1. Example of one functional session of music-listening task
Figure 2. Regions of interest used for MVPA classification
Figure 3. Classification accuracies for decoding of emotions in auditory stimuli
Figure 4. Within-instrument whole-brain searchlight results
Figure 5. Cross-instrument whole-brain searchlight results
Figure 6. Mediation model with Fantasy, GEMS, and enjoyment of sad music
Figure 7. Mediation model with rumination, GEMS, and purging negative feelings
Figure 8. Mediation model with Fantasy, GEMS, and prolonging feeling of sadness
Figure 9. Average continuous ratings of intensity of sadness and enjoyment
Figure 10. Significant voxelwise intersubject correlation during sad music listening
Figure 11. Synchronization differences between high- and low-Fantasy participants
Figure 12. Intersubject functional connectivity matrix during sad music listening
Figure 13. Enjoyment ratings predicting intersubject phase synchronization
Figure 14. Enjoyment ratings predicting inter-regional phase synchronization
CHAPTER 1
Introduction
Emotions and feeling enrich and color our everyday lives and are a key element of
human survival. Overwhelming evidence points to the idea that emotions serve essential
functions for life regulation by promoting homeostasis (Damasio, 1999). Despite their
importance, thinkers and scholars have debated for the better part of a century over the
essence of what constitutes an emotion and where and how they originate in the body.
When discussing the nature of emotions, it is helpful to be clear on definitions, as a
number of related, yet distinct terms are often used interchangeably when describing such
phenomena. For the purposes of this paper, I define an emotion as the series of bodily
state changes (psychophysiological, musculature, cognitive) that motivate adaptive
behaviors to overcome deviations in the internal or external environment (Damasio,
2004; Phillips, Drevets, Rauch, & Lane, 2003). Paul Ekman first proposed that across
human cultures, six basic emotions exist: happiness, sadness, disgust, fear, surprise, and
anger (Ekman, 1992). While more recent evidence has suggested that these labels may
not be so qualitatively distinct (Feldman Barrett, Mesquita, Ochsner, & Gross, 2007) and
that a range of other emotional states may be equally as universal (Cowen & Keltner,
2017), the idea that emotions can be categorized based on a collection of bodily,
psychological, and neural changes still has merit today (Adolphs & Andler, 2018; Kragel
& LaBar, 2015).
Related to emotions are feelings, which refer specifically to the subjective
experience of the body state changes characteristic of an emotion (Damasio & Carvalho,
2013). The term affect, when used in psychology, more broadly refers to any
neurophysiological state that is consciously and subjectively experienced as a mixture of
two features: valence, or how pleasant or unpleasant the experience is, and arousal, or how
energizing or enervating it is (Russell, 2003). Affect therefore includes
motivational states such as hunger and thirst as well as emotional states. It is often useful
to conceptualize different emotional categories based on their location in this valence-
arousal space. Happiness, for example, is thought of as having a positive valence and is
high on arousal, whereas sadness has a negative valence and is low on arousal.
Since the early days of neuroimaging, neuroscientists have attempted to better
understand how emotions and feelings emerge in the brain by assessing neural activity
associated with a variety of affective stimuli, including pictures, faces, smells, sounds,
film, and music (Wager, Phan, Liberzon, & Taylor, 2003). Meta-analyses that synthesize
the findings from these now hundreds of studies indicate that discrete emotions are not
subserved by a specific brain region, as researchers once believed; instead, all emotional
states appear to involve the co-functioning of a collection of brain regions known to serve
a variety of more general cognitive and psychological functions, such as detecting the
saliency of external stimuli, representing feelings of valence and arousal, encoding
memories, and integrating the incoming external and internal information (Lindquist,
Wager, Kober, Bliss-Moreau, & Barrett, 2012). The brain regions that are consistently
found to be activated across a wide variety of affective experiences include the amygdala,
insula, anterior cingulate cortex, thalamus, hypothalamus, and hippocampus (collectively
thought of as constituting the limbic or paralimbic networks), the posterior cingulate and
medial prefrontal cortex (collectively thought of as constituting the default mode
network), the basal-ganglia, and the orbitofrontal and lateral prefrontal cortices (Kober et
al., 2008).
Despite what we know from meta-analytic findings, the exact role of these
identified regions and networks in affective responses is still unclear. Recent evidence
has suggested that the neural correlates associated with emotion perception, i.e. the
recognition of some pre-experienced state through one or more of the sensory domains
(Schirmer & Adolphs, 2017), may be distinct from those associated with emotion
experience (Garrett & Maddock, 2006). However, previous neuroimaging studies often
combine and conflate these two components of affect. As a result, it is currently difficult
to disentangle the brain systems involved in emotion perception from those involved in
feeling.
Another reason for this lack of clarity may be methodological. One limitation with
traditional neuroimaging studies is that they rely on rigid, highly-controlled experimental
designs and analytical techniques that make a number of assumptions regarding the
expected pattern and shape of the hemodynamic response function. While these
univariate designs are ideal for assessing time-locked activation patterns that correspond
to a particular stimulus, they are ill-suited to uncover neural information that may be
encoded in spatially-distributed patterns across the brain as well as neural information
encoded in time-varying patterns of activity in the brain. It is therefore challenging to use
traditional methods to determine the brain patterns uniquely associated with the
perception of the emotion, which likely involves multiple regions and occurs relatively
quickly, versus the experience of the emotion, which likely develops over a longer period
of time and only in response to more ecologically-valid, naturalistic stimuli.
Recently, researchers have begun incorporating multivariate analyses with
neuroimaging data, which overcomes some of the limitations associated with univariate
approaches and can therefore provide a new understanding of the neural signatures of
emotions. One such method, multivoxel pattern analysis (MVPA; Norman, Polyn, Detre,
& Haxby, 2006), applies machine learning algorithms to neuroimaging data to attempt to
predict some high-level concept of interest based on spatial patterns of BOLD signal. In
theory, if the signal pattern across a set of brain regions can be used to successfully
distinguish among different task conditions or trial types, it is likely that these regions
encode information that is necessary for that task. By capitalizing on spatially distributed
patterns of activity, this approach has been shown to have greater predictive power than
more traditional GLM-based models (Mitchell, Hutchinson, & Pereira, 2004).
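To make the logic of this approach concrete, the following minimal sketch decodes simulated "emotion" labels from simulated multivoxel patterns with a linear support vector machine in scikit-learn; the data, dimensions, and effect size are purely illustrative and are not taken from any of the studies reported here.

```python
# Minimal sketch of MVPA-style decoding on simulated data (illustrative only;
# the studies below use PyMVPA on real BOLD patterns).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials_per_class, n_voxels = 30, 500
labels = np.repeat(["happy", "sad", "fear"], n_trials_per_class)

# Each emotion gets a weak, spatially distributed "signature" added to noise,
# so no single voxel is strongly diagnostic on its own.
signatures = rng.normal(0, 1, size=(3, n_voxels))
X = np.vstack([
    rng.normal(0, 1, size=(n_trials_per_class, n_voxels)) + 0.4 * signatures[i]
    for i in range(3)
])

# A linear classifier can still recover the category from the joint pattern.
accuracy = cross_val_score(SVC(kernel="linear"), X, labels, cv=5).mean()
print(f"cross-validated accuracy: {accuracy:.2f} (chance = 0.33)")
```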
Several studies have used MVPA to classify neural patterns associated with discrete
emotional states induced through different modalities, e.g. music, film, faces, pictures,
and autobiographical recall (Kassam, Markey, Cherkassky, Loewenstein, & Just, 2013;
Kragel & Labar, 2013; Saarimaki et al., 2015). Thus far, the results have provided mixed
support for the conclusions drawn from meta-analyses with univariate data, i.e. that a
range of feeling states, independent of the modality of induction, emerge from the co-
activation of regions throughout the brain that are also involved in more basic cognitive
processes. On the one hand, MVPA findings show that discriminable and stable patterns
of brain activity can be used to predict discrete emotional categories (Saarimäki et al.,
2016). On the other hand, the findings also show that emotional categories are best
captured by signal change across a system of cortical and subcortical brain regions
previously shown to be involved in a variety of affective states (Kragel & LaBar, 2015;
Nummenmaa & Saarimäki, 2019). While MVPA studies have not yet resolved the long-
standing debate with regards to the neural origins of affect, by shifting the focus away
from univariate patterns of activation towards multivariate patterns of information, they
help uncover the neural patterns necessary for representing the concept of an emotion
across different modalities and different brain regions.
MVPA is ideal for evaluating information encoded in spatial patterns of brain
activity. However, it is not well suited to evaluate information encoded in temporal
patterns of brain activity. Given that different aspects of emotions likely have different
temporal characteristics (Garrett & Maddock, 2006), capturing time-varying patterns of
brain activation may help further disentangle the perception of the emotion from the
experience of an emotion. Recently, a variety of model-free analytical approaches have
been developed to assess the dynamics of brain activity in response to the continuous
presentation of more naturalistic stimuli. In general, these methods involve calculating
how the neural signal both within and across participants changes over time and how
these changes map onto events in the stimulus. By using full-length films or stories that
can evoke a range of more intense feelings, researchers can capture the brain regions that
track collective affective experiences as they evolve over time (Nummenmaa, Saarimäki,
et al., 2014; Raz et al., 2012).
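As a concrete illustration of this family of methods, the sketch below computes a leave-one-out intersubject correlation on simulated regional time courses, correlating each participant's signal with the average of the remaining participants; this is one common variant and is not presented as the exact pipeline used in the studies that follow.

```python
# Leave-one-out intersubject correlation (ISC) on simulated regional time courses.
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_timepoints = 20, 300

# A shared, stimulus-driven component plus subject-specific noise.
shared = rng.normal(0, 1, n_timepoints)
data = 0.5 * shared + rng.normal(0, 1, size=(n_subjects, n_timepoints))

isc = []
for s in range(n_subjects):
    others = np.delete(data, s, axis=0).mean(axis=0)   # average of remaining subjects
    isc.append(np.corrcoef(data[s], others)[0, 1])     # correlate with left-out subject

print(f"mean leave-one-out ISC: {np.mean(isc):.2f}")
```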
The multivariate, data-driven models presented above provide a novel way of
uncovering the brain systems involved in emotions. Yet, very few studies have employed
such methods to systematically evaluate emotion perception and emotion experience with
music. Music is an ideal stimulus to use in any investigation attempting to elucidate the
various neural components of affect for several reasons. First, it has been shown that
people can reliably perceive discrete emotions when conveyed through music, including
basic emotions such as happiness and sadness, as well as non-basic, more social emotions
such as love and tenderness (Balkwill & Thompson, 1999; Fritz et al., 2009; Juslin &
Laukka, 2004). Second, people can experience a variety of feelings in response to music
that range from the everyday to the aesthetic (Zentner, Grandjean, & Scherer, 2008) and
that develop over time (Brattico, Brigitte, & Jacobsen, 2013). Third, people can recognize
an emotion in music that is different from, and may even contrast with, the emotion that
they feel (Gabrielsson, 2001). It is these fundamental aspects of music that can be used
by neuroscientists to probe and shed light on the multifaceted nature of human affect.
How do we perceive emotions in music? It is likely that certain musical elements
have become tightly linked with emotional constructs through their mimesis of sounds or
actions that more directly communicate an emotion (Lundqvist, Carlsson, Hilmersson, &
Juslin, 2008). The connection between a musical sound and the emotional information it
conveys can derive from the conditional pairing of sounds and events over long periods
of time or from one individual’s learned associations with those sounds (Cross, 2009).
Laughter, for example, is meant to convey happiness, and any music that contains
similar acoustic elements to laughter, such as faster tempos, higher fundamental
frequencies, and faster tonal attacks, will likely be perceived as sounding happy (Juslin &
Laukka, 2003). Consequently, researchers have found that the perceived emotion of
pieces of music is associated with distinct sets of acoustical features (Eerola, Lartillot,
& Toiviainen, 2009).
How do we feel emotions in response to music? It is overly simplistic to assert that
acoustic properties alone are enough to evoke an affective response. More likely, feelings
in response to music result from complex interactions involving sensory processing,
memory, personality, and mood. In an attempt to systematize these various factors,
Juslin (2013) provides a list of eight possible neuropsychological mechanisms by which
music is able to induce an emotion. The mechanisms range from low-level, evolutionary-
ancient processes, such as reflexive responses to novel and/or alarming sounds, to high-
level, conceptual processes, such as the evaluation and prediction of the reward value of
sounds over time. According to Juslin, more than one mechanism can be engaged at
the same time, which might account for situations in which listening to music results in a
mixture of emotions (Juslin & Laukka, 2003).
In Juslin's (2013) proposal, each of these mechanisms serves a distinct evolutionary
function and, therefore, involves distinct brain systems. However, univariate
neuroimaging studies with musical stimuli have been unable to provide strong evidence
for these separable mechanisms. While such studies have found that music-evoked
emotions are associated with activity in the same brain regions associated with processing
emotions in other domains, including the insula, ACC, VMPFC, striatum, amygdala, and
hippocampus (Habibi & Damasio, 2014), these regions are also implicated in studies
specifically focused on the perception of musical emotions (Escoffier, Zhong, Schirmer,
& Qiu, 2013; Mitterschiffthaler, Fu, Dalton, Andrew, & Williams, 2007), the experience
of musical emotions (Trost, Ethofer, Zentner, & Vuilleumier, 2012), as well as feelings of
enjoyment in response to music (Blood & Zatorre, 2001; Salimpoor, Benovoy, Longo,
Cooperstock, & Zatorre, 2009). Although music has this uncanny ability to convey and
induce complex and intense emotions that vary over time, the current findings have not
focused on these aspects to disentangle affective mechanisms. As previously mentioned,
one reason for this may be an overreliance on more conventional neuroimaging methods
that are not conducive to studying these intriguing and scientifically relevant aspects of
music.
Another way of disentangling emotion perception from emotion experience is to
capitalize on situations in which they are uncoupled. Such moments occur naturally and
frequently in everyday life, such as when we feel envy for another’s fortunes or
schadenfreude for another’s misfortunes, but are difficult to study empirically. Once
again, music may provide a solution. When listening to sad music, for example, it is quite
common for people to recognize that the music is sad, yet to feel more positive emotions
like enjoyment (Sachs, Damasio, & Habibi, 2015). While there is still debate over
whether people feel genuine sadness when listening to sad music that later morphs into
feelings of pleasure, or whether people perceive the sadness of the music, but only feel
pleasure (Vuoskoski & Eerola, 2012), in both situations people appear to be able to
consciously separate the two states (Juslin & Laukka, 2004). Investigating this puzzling
aspect of aesthetic experiences with music can help dissociate the neural signatures of
emotion perception, feeling, and pleasure (Brattico et al., 2016).
What might we expect to find if we use musical stimuli to probe the spatial and
temporal components of affective experience? By combining results from ERP and
univariate fMRI data, Brattico et al. (2013) proposed a model that outlines the time
course of affective responses to music in the brain. In this account, acoustical properties
of the sounds are first processed by the brainstem and auditory cortex and subsequently
appraised for their degree of valence and arousal by the amygdala and parahippocampal
gyrus. The subjective experience of these emotions emerges later in time, with the
engagement of the insula, ACC, and the medial prefrontal cortex, regions that are
involved in mental processes related to feelings, such as monitoring changes in the body,
simulating actions and behaviors, and integrating external and internal information.
Feelings of enjoyment, according to the model, emerge later still, as the orbitofrontal
cortex assesses the reward value of the stimulus and makes an aesthetic judgment and the
striatum initiates the actual experience of pleasure (Brattico et al., 2013).
The model presented above provides a working hypothesis for the brain systems
involved in various components of emotional experience in response to music, but it has
not yet been tested in neuroimaging studies that are careful to separate perception from
experience from subsequent pleasure. In what follows, I will present findings from three
studies that incorporate newer, data-driven analytical techniques in order to provide a
more comprehensive understanding of the ways in which we make meaning out of
musical stimuli. In Chapter 2, I take a bottom-up approach to examine how the quick
perception of emotions in music can be decoded based on spatial patterns of activity in
the human brain. In this study, I use a validated set of brief auditory stimuli that reliably
convey three discrete emotions through musical instruments and the human voice. MVPA
is then employed to determine the patterns of activity in the brain that contain emotion-
specific information across sounds from different instruments and modalities.
In the next two chapters, I shift to a top-down approach, investigating the temporal
component of feelings in response to sad music. Chapter 3 presents results from a large-
scale survey designed to elucidate the personality, situational, and motivational factors
that influence one’s emotional responses to, and subsequent enjoyment of, sad music.
The results from this study inform and motivate the fMRI study presented in Chapter 4.
In this study, I examine the temporal dynamics of neural activity and connectivity in
response to emotions expressed through a full-length, naturalistic piece of sad music.
Participants listened continuously to the piece during scanning and continuously rated
their feelings of both sadness and enjoyment outside of the scanner. These continuous
ratings of affect were then used to predict stimulus-driven neural activation in order to
assess the time-varying patterns in the brain that represent the dynamic experience of
feeling an emotion in response to music.
Collectively, these three studies provide novel insights into brain functioning
during affective experiences as well as into the inextricable link between affect and
music. Such insights will hopefully inform future endeavors designed to improve the
lives of individuals with atypical socioemotional abilities as well as anyone who listens to
music as a way of enhancing wellbeing and finding meaning in life.
CHAPTER 2
Decoding the neural representation of emotions expressed through music and voice
Introduction
The capacity to both convey and perceive emotions through sounds is crucial for
successful social interaction. For example, recognizing that a person is distressed based
on vocal expressions alone can confer certain advantages when it comes to
communicating and connecting with others. Intriguingly, emotions can be recognized in
non-vocal sounds as well. Music can convey emotions even when not mimicking the
human voice, despite the fact that an ability to express emotions through music does not
serve as clear an evolutionary function as vocal expressions of emotions (Frühholz, Trost,
& Grandjean, 2014). And yet, the capability to consistently and reliably discern musical
emotions appears to be universal, even in individuals with no musical training (Fritz et
al., 2009). Studying the neural overlap of expressions of emotions in both vocal and
musical stimuli therefore furthers our understanding of how auditory information
becomes emotionally relevant in the human brain.
Previous univariate neuroimaging studies that have examined this neural overlap
have reported activity in the superior temporal gyrus (Escoffier, Zhong, Schirmer, & Qiu,
2013), amygdala and hippocampus (Frühholz et al., 2014) during both musical and non-
musical, vocal expressions of emotions. While these results support the notion that
musical and vocal patterns recruit similar brain regions when conveying emotions, they
do not clarify whether these regions are responsive to a specific emotional category or are
involved in emotion processing more generally. Neither study addressed the neural
activity patterns that are specific to a particular emotion, but conserved across the two
different domains of music and vocals. One particular univariate study did attempt to
answer this question, but only with the emotion of fear: the researchers found that the
amygdala and posterior insula were commonly activated in response to fear expressed
through non-linguistic vocalizations and musical excerpts, as well as through facial
expressions (Aubé, Angulo-Perkins, Peretz, Concha, & Armony, 2013).
In general, however, univariate methods are not well suited for evaluating
commonalities in the processing of emotions across the senses because, due to spatial
smoothing and statistical limitations, they cannot assess information that may be located
in fine-grained patterns of activity dispersed throughout the brain (Kaplan, Man, &
Greening, 2015). Multivoxel pattern analysis (MVPA), which entails classifying mental
states using the spatially-distributed pattern of activity in multiple voxels at once, can
provide a more sensitive measure of the brain regions that are responsible for
distinguishing amongst different emotions (Norman et al., 2006). In combination with a
searchlight analysis, in which classification is performed on local activity patterns within
a sphere that traverses the entire brain volume, MVPA can reveal areas of the brain that
contain information regarding emotional categories (Kriegeskorte, Goebel, & Bandettini,
2006; Peelen, Atkinson, & Vuilleumier, 2010). This multivariate approach has been used
in various capacities to predict emotional states from brain data (Saarimaki et al., 2015).
Spatial patterns within the auditory cortex, for example, were used to classify emotions
conveyed through both verbal (Ethofer, Van De Ville, Scherer, & Vuilleumier, 2009) and
nonverbal (Kotz et al., 2013) speech. However, it remains unclear whether the neural
activity in these regions corresponds to a particular category of emotion or is instead only
sensitive to the lower-level acoustical features of sounds.
Multivariate cross-classification, in which a classifier is trained on brain data
corresponding to an emotion presented in one domain and tested on separate brain data
corresponding to an emotion presented in another, is a useful approach to uncovering
representations that are modality independent (see Kaplan, Man, & Greening, 2015 for
review). Previously, this approach has been used to demonstrate that emotions induced by
films, music, imagery, facial expressions, and bodily actions can be successfully
classified across different sensory domains (J. Kim, Shinkareva, & Wedell, 2017; Kragel
& LaBar, 2015; Peelen et al., 2010; Saarimaki et al., 2015; Skerry & Saxe, 2014). Cross-
modal searchlight analyses revealed that successful classification of emotions across the
senses and across sources could be achieved based on signal recorded from the cortex
lying within the superior temporal sulcus (STS), the posterior insula, the medial
prefrontal cortex (MPFC), the precuneus, and the posterior cingulate cortex (Y. E. Kim et
al., 2010; Peelen et al., 2010; Saarimaki et al., 2015). While informative for uncovering
regions of the brain responsible for representing emotions across the senses, these studies
did not address how the brain represents emotions within a single sensory domain when
expressed in different ways. To our knowledge, there has been no existing research on the
affect-related neural patterns that are conserved across the voice and musical instruments,
two types of auditory stimuli with differing acoustical properties.
Additionally, the degree to which emotion-specific predictive information in the
brain might be modulated by individual differences remains unexplored. Empathy, for
example, which entails understanding and experiencing the emotional states of others, is
believed to rely on the ability to internally simulate perceived emotions (Lamm, Batson,
& Decety, 2007). Activation of the anterior insula appears to be related to linking
observed expressions of emotions with internal empathic responses (Carr, Iacoboni,
Dubeau, Mazziotta, & Lenzi, 2003) and the degree of activation during emotion
processing tasks is shown to be positively correlated with measures of empathy (Silani et
al., 2008; T. Singer et al., 2004). Emotion-distinguishing activity patterns in the insula
may therefore relate to individual differences in the tendency to share in the affective
states of others.
Here, I used MVPA and cross-classification on two validated datasets of affective
auditory stimuli, one of non-verbal vocalizations (Belin, Fillion-Bilodeau, & Gosselin,
2008) and one of musical instruments (Paquette, Peretz, & Belin, 2013), to determine if
patterns of brain activity can distinguish discrete emotions when expressed through
different sounds. Participants were scanned while listening to brief (0-4s) audio excerpts
produced by the violin, clarinet, and human voice and designed to convey one of three
target emotions—happiness, sadness, and fear. The authors who published the original
dataset chose the violin and clarinet because both musical instruments can readily imitate
the sounds of the human voice, but are from two different classes (strings and woodwinds
respectively; Paquette et al., 2013). These three target emotions were used because (1)
they constitute what are known as “basic” emotions, which are believed to be universal
and utilitarian (Ekman, 1992), (2) they can be reliably produced and conveyed on the
violin and clarinet (Hailstone et al., 2009) and (3) they are also present in both the vocal
and musical datasets.
After scanning, a classifier was trained to differentiate the spatial patterns of
neural activity corresponding to each emotion both within and across instruments. To
understand the contribution of certain acoustic features to our classification results, I
compared cross-instrument classification accuracy with fMRI data to cross-instrument
classification accuracy using acoustic features of the sounds alone. Then, a searchlight
analysis was used to uncover brain areas that represent the affective content that is shared
across the two modalities, i.e. music and the human voice. Finally, classification
accuracies within a priori-defined regions of interest in the auditory cortex, including the
superior temporal gyrus and sulcus, as well as the insula were correlated with behavioral
measures of empathy. These regions were selected for further investigation because of
their well-validated roles in the processing of emotions from sounds (Bamiou, Musiek, &
Luxon, 2003; Sander & Scheich, 2005) as well as across sensory modalities (Peelen et al.,
2010; Saarimaki et al., 2015). Based on previous results, I predict that BOLD signal in
the auditory and insular cortices will yield successful classification of emotions across all
three instruments. Moreover, given the known role of the insula in internal
representations of observed emotional states (Carr et al., 2003), I hypothesize that
classification accuracies within the insula will be positively correlated with empathy.
Methods
Participants. Thirty-eight healthy adult participants (20 females, mean age =
20.63, SD = 2.26, range = 18-31) were recruited from the University of Southern
California and surrounding Los Angeles community. All participants were right-handed,
had normal hearing and normal or corrected-to-normal vision, and had no history of
neurological or psychiatric disorders. All experimental procedures were approved by the
USC Institutional Review Board. All participants gave informed consent and were
monetarily compensated for participating in the study.
Survey. The Goldsmiths Musical Sophistication Index (Gold-MSI; Müllensiefen et
al., 2014) was used to evaluate past musical experience and degree of music training. The
Gold-MSI contains 39 items broken up into five subscales, each related to a separate
component of musical expertise: active engagement, perceptual abilities, musical
training, singing abilities, and emotions. The scale also contains a general musical
sophistication score, which is the sum of responses to all items. Each item is scored on a
7-point Likert scale from 1 = completely disagree to 7 = completely agree.
Both cognitive and affective components of empathy were measured using the
Interpersonal Reactivity Index (Davis, 1983), which includes 28 items and four
subscales: fantasy and perspective taking (cognitive empathy) and empathic concern and
personal distress (affective empathy). Table 1 summarizes the results obtained from the
surveys. Items of the MSI were rated on a 7-point scale and factor scores are calculated
by taking the sum of all items pertaining to that particular subscale. The 28 items of the IRI
were rated on a 5-point scale and each subscale score corresponds to the average rating of
all items pertaining to that subscale.
Table 1. Mean scores and standard deviation for survey measures

                                          M       SD
Age (years)                               20.63   2.26
Music Sophistication Index (MSI)
  F1: Active Engagement                   41.37   9.18
  F2: Perceptual Abilities                48.53   6.84
  F3: Musical Training                    25.37   9.19
  F4: Emotions                            34.55   4.45
  F5: Singing Abilities                   29.66   7.75
  FG: General Sophistication              79.95   17.51
Interpersonal Reactivity Index (IRI)
  Fantasy                                 3.76    0.50
  Perspective Taking                      3.09    0.42
  Empathic Concern                        3.70    0.47
  Personal Distress                       2.74    0.64
Note. N = 38 (20F, 18M)
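For illustration, the scoring rules described above can be expressed as follows; the item responses shown are hypothetical and the item-to-subscale assignments are placeholders, not the published scoring keys.

```python
# Illustrative scoring of the two questionnaire types described above
# (the item groupings shown are placeholders, not the published scoring keys).
import numpy as np

def score_msi_subscale(item_responses):
    """Gold-MSI subscales: sum of the 7-point Likert items (1-7) in that factor."""
    return int(np.sum(item_responses))

def score_iri_subscale(item_responses):
    """IRI subscales: mean of the 5-point items (after any reverse scoring)."""
    return float(np.mean(item_responses))

print(score_msi_subscale([5, 6, 4, 7, 5, 6, 6]))   # hypothetical MSI factor items
print(score_iri_subscale([4, 3, 5, 4, 4, 3, 4]))   # hypothetical IRI Fantasy items
```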
Stimuli. Two validated, publicly available datasets of short, affective auditory
stimuli were used: the Music Emotional Bursts (MEB; Paquette, Peretz, & Belin, 2013)
and the Montreal Affective Voices (MAV; Belin, Fillion-Bilodeau, & Gosselin, 2008).
Studying neural responses to relatively short stimuli provided two main advantages: (1)
As suggested by Paquette et al. (2013), these brief bursts of auditory emotions may mimic
more primitive, and therefore more biologically relevant, expressions of affect and (2) it
allows us to maximize the number of trials that can be presented to participants in the
scanner, theoretically improving the training of the classifier. The MEB contains 60 brief
(1.64s on average) auditory clips designed to express 3 basic emotions (happiness,
sadness, and fear) played on either the violin or clarinet. The dataset contains 10 unique
exemplars of each emotion on each instrument. The MAV is a set of brief (1.35s on
average) non-verbal vocalizations that reliably convey the same 3 emotions (happiness,
sadness, and fear) as well as several others (disgust, pain, surprise, and pleasure; these
emotional clips were not included in this study because they were not included in the
MEB dataset). The MAV dataset contains 10 unique exemplars of each emotion as well
and includes both female and male voices. Both the MEB and MAV also include a
neutral condition, which was not included in this study either. All clips from both
datasets had been normalized so that the peak signal value corresponded to 90% of the
maximum amplitude (Belin et al., 2008; Paquette et al., 2013). Combining these two
stimulus datasets resulted in 90 unique stimuli: 30 featuring the violin, 30 featuring the
clarinet, and 30 featuring the human voice, with 30 clips for each of the three emotions
(happiness, sadness, and fear).
Design and Procedure. Stimuli were presented in an event-related design using
MATLAB’s PsychToolbox (Kleiner, Brainard, & Pelli, 2007) in 6 functional runs.
During scanning, participants were instructed to be still, with their eyes open and focused
on a fixation point continually presented on a screen, and attend to the audio clips when
they heard them. Auditory stimuli were presented through MR-compatible
OPTOACTIVE headphones with noise-cancellation (Optoacoustics). An eye-tracking
camera was monitored to ensure that the participants were awake and alert during
scanning.
During each functional run, participants listened to 45 audio clips, 5 clips for each
trial type (emotion x instrument). Each clip was followed by a rest-period that varied in
length and resulted in a total event length time (clip + rest) of 5s, regardless of the length
of the clip. Five 5-s rest events, in which no sound played, were also added as an
additional condition, resulting in a total functional run time of 250s (125 TRs, see Figure
1). Two unique orders of stimuli presentation were created using a genetic algorithm
(Kao, Mandal, Lazar, & Stufken, 2009), which takes into account designed detection
power and counterbalancing to generate an optimal design that is pseudorandomized. One
optimized order of stimuli presentation was used on odd-numbered runs (1, 3 and 5) and
the other order was used on even-numbered runs (2, 4, and 6). Over the course of the 6
functional runs, each of the 90 audio stimuli were presented exactly 3 times.
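A brief sketch of the run structure described above is shown below: one run of 45 clip trials plus 5 rest trials in 5-s slots, with a check on the resulting run length. A simple random shuffle stands in for the genetic-algorithm optimization that was actually used, so the order it produces is illustrative only.

```python
# Sketch of one functional run as described above: 9 trial types (3 emotions x
# 3 instruments) x 5 repetitions plus 5 silent rest events, each in a 5-s slot.
import itertools
import random

TR, event_len = 2.0, 5.0
emotions = ["happy", "sad", "fear"]
instruments = ["violin", "clarinet", "voice"]

trials = [pair for pair in itertools.product(emotions, instruments) for _ in range(5)]
trials += [("rest", "none")] * 5

random.seed(0)
random.shuffle(trials)   # simple shuffle; the study used a genetic algorithm
                         # (Kao et al., 2009) to optimize detection power and counterbalancing

run_seconds = len(trials) * event_len
print(len(trials), "events;", run_seconds, "s;", run_seconds / TR, "TRs")   # 50 events; 250.0 s; 125.0 TRs
```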
To validate the accuracy of the clips in terms of their ability to convey the intended
emotion, after scanning, participants listened to all 90 clips again in random order and
selected the single emotion, from the list of three, that they believed was being expressed
in the clip. To further describe their perceptions of each clip, participants also rated each
clip for how intensely it expressed each of the three target emotions using a scale ranging
from 1 (not at all) to 5 (very much).
Figure 1. Example of one functional session of music-listening task. In each session,
participants listened to 45 clips and 5 rest trials, each of which lasted for a total of 5s.
Each functional session lasted around 4.5min and there were six functional scans in total.
Data Acquisition. Images were acquired with a 3-Tesla Siemens MAGNETON
Prisma System and using a 32-channel head coil. Echo-planar volumes were acquired
continuously with the following parameters: repetition time (TR) = 2,000 ms, echo time
(TE) = 25 ms, flip angle = 90°, 64 x 64 matrix, in-plane resolution 3.0 x 3.0 mm, 41
transverse slices, each 3.0 mm thick, covering the whole brain. Structural T1-weighted
magnetization-prepared rapid gradient echo (MPRAGE) images were acquired with the
following parameters: TR = 2,530 ms, TE = 3.09 ms, flip angle = 10°, 256 x 256 matrix,
208 coronal slices, 1 mm isotropic resolution.
Data processing. Data preprocessing and univariate analysis were done in FSL
(FMRIB Software Library, Smith et al., 2004). Data were first preprocessed using brain
extraction, slice-time correction, motion correction, spatial smoothing with 5mm FWHM
Gaussian kernel, and high-pass temporal filtering. Each of the 9 trial types
(emotion*instrument) was modeled with a separate regressor derived from a convolution
of the task design and a double gamma hemodynamic response function. Six motion
correction parameters were included in the design as nuisance regressors. The functional
data were registered to the high-resolution anatomical image of each subject and to the
standard Montreal Neurological Institute (MNI) brain using the FSL FLIRT tool
(Jenkinson and Smith 2001). Functional images were aligned to the high-resolution
anatomical image using a 7 degree-of-freedom linear transformation. Anatomical images
were registered to the MNI-152 brain using a 12 degree-of-freedom affine
transformation. This entire procedure resulted in one statistical image for each of the 9
trial types (3 emotions x 3 instruments) in each run. Z-stat images were then aligned to
the first functional run of that participant for within-subject analysis.
Multivoxel pattern analysis. Multivoxel pattern analysis (MVPA) was conducted
using the PyMVPA toolbox (http://www.pymvpa.org/) in Python. A linear support vector
machine (SVM) classifier was trained to classify the emotion of each trial type. Leave-
one-run-out cross-validation was used to evaluate classification performance (i.e. 6-fold
cross-validation with 45 data points in the training dataset and 9 data points in the testing
dataset for each fold). Classification was conducted both within each instrument as well
as across instruments (training the classifier on a subset of data from two of the
instruments and testing on a left-out subset from another instrument) using a mask of the
participant’s entire brain. In addition to training on two instruments and testing on the
third, I also ran cross-instrument classification for every pairwise combination of training
on one instrument and testing on another (6 combinations in total). Feature selection on
the whole brain mask was employed on the training data alone using a one-way ANOVA
and keeping the top 5% most informative voxels (mean 3,320 voxels after feature
selection, SD = 251). Within participant classification accuracy was computed by
averaging the accuracy of predicting the emotion across each of the 6 folds. One-sample
t-tests on the population of participant accuracies were performed to determine if the
achieved accuracies were significantly above theoretical chance (33%).
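The cross-validation and cross-instrument transfer logic can be sketched as follows, using scikit-learn in place of PyMVPA and simulated single-trial patterns; the ANOVA-based selection of the top 5% of voxels is approximated with SelectPercentile, and all names and dimensions are illustrative rather than taken from the actual data.

```python
# Sketch of the leave-one-run-out and cross-instrument classification scheme,
# using scikit-learn in place of PyMVPA and simulated single-trial patterns.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectPercentile, f_classif
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(2)
n_runs, n_voxels = 6, 2000
emotions = ["happy", "sad", "fear"]
instruments = ["violin", "clarinet", "voice"]

# One pattern per emotion x instrument x run, as in the GLM output described above.
rows = [(e, i, r) for r in range(n_runs) for e in emotions for i in instruments]
y = np.array([e for e, _, _ in rows])
runs = np.array([r for _, _, r in rows])
instr = np.array([i for _, i, _ in rows])
signature = {e: rng.normal(0, 1, n_voxels) for e in emotions}
X = np.vstack([rng.normal(0, 1, n_voxels) + 0.5 * signature[e] for e, _, _ in rows])

# ANOVA-based feature selection (top 5% of voxels), fit on training data only
# inside the pipeline, followed by a linear SVM.
clf = make_pipeline(SelectPercentile(f_classif, percentile=5), SVC(kernel="linear"))

# Within-modality decoding with leave-one-run-out cross-validation.
acc = cross_val_score(clf, X, y, groups=runs, cv=LeaveOneGroupOut()).mean()
print(f"leave-one-run-out accuracy: {acc:.2f}")

# Cross-instrument transfer: train on two instruments, test on the held-out one.
for held_out in instruments:
    train, test = instr != held_out, instr == held_out
    score = clf.fit(X[train], y[train]).score(X[test], y[test])
    print(f"train on the others, test on {held_out}: {score:.2f}")
```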
Region of interest classification. In addition to whole brain analysis, I performed a
region of interest (ROI) analysis focusing on a-priori ROIs in the auditory cortex and
insular cortex. These two ROIs were chosen because of their well-known roles in the
processing of emotions from sounds (Bamiou et al., 2003; Sander & Scheich, 2005). For
the auditory cortex, I used the Harvard-Oxford Atlas planum temporale mask, which is
defined as the superior surface of the superior temporal gyrus, as well as the Heschl’s
gyrus mask, merged and thresholded at 25 (Figure 2A). For the insula, I used masks of
the dorsal anterior, ventral anterior, and posterior insula described in Deen, Pitskel, &
Pelphrey (2011) that were defined by the results of a cluster analysis of functional
connectivity patterns (Figure 2B). Within and across instrument classification was
conducted in exactly the same way as described above. For the region of interest analysis,
feature selection was not used, that is, all voxels within the specified anatomical region
were used.
Figure 2. Regions of interest used for MVPA classification. A. The auditory cortex
was defined using the Harvard-Oxford Atlas by merging the planum temporale mask with
Heschl's gyrus mask, both thresholded at 25. B. Three major subdivisions, the dorsal
anterior, ventral anterior, and posterior, were identified based on the results from a
previous study using cluster analysis of functional connectivity patterns.
Whole-brain searchlight analysis. A searchlight analysis for classifying emotions
was conducted both within and across modalities (Kriegeskorte et al., 2006). For each
subject, the classification accuracy was determined for spheres with radius 3 voxels
throughout the entire brain. A sphere of that size was chosen to roughly match the size of
the anatomical regions of interest, large enough to not be biased by individual variation in
any one voxel and yet small enough to adhere to known anatomical boundaries. These
accuracies were then mapped to the center voxel of the sphere and warped to standard
space. The searchlight analysis was conducted both within instruments and across
instruments. For the within instrument searchlights, the SVM classifier was trained on
data from all but one of the six runs and tested on the left-out run (leave-one-run-out
cross validation). To evaluate the significance of clusters in the overlapped searchlight
accuracy maps, nonparametric permutation testing was performed using FSL’s
Randomise tool (Winkler, Ridgway, Webster, Smith, & Nichols, 2014), which models a
null distribution of expected accuracies at chance. The searchlight accuracy maps were
thresholded using threshold-free cluster enhancement (TFCE; Smith & Nichols, 2009).
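The searchlight idea itself can be illustrated with a toy volume, as in the sketch below: a linear classifier is cross-validated on the voxels inside a sphere centered on each voxel, and the resulting accuracy is written back to the sphere's center. The volume size, sphere radius, and signal structure are simulated and chosen for speed; the actual analysis used whole-brain data and a 3-voxel radius.

```python
# Toy searchlight: classify within a sphere around each voxel of a small
# simulated volume and store the cross-validated accuracy at the sphere center.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
shape, radius, n_trials = (8, 8, 8), 2, 54
labels = np.tile(["happy", "sad", "fear"], n_trials // 3)

# Simulated 4-D data (x, y, z, trial); only one corner of the volume carries signal.
data = rng.normal(0, 1, size=shape + (n_trials,))
patterns = {lab: rng.normal(0, 1, size=(3, 3, 3)) for lab in ["happy", "sad", "fear"]}
for t, lab in enumerate(labels):
    data[:3, :3, :3, t] += 0.8 * patterns[lab]

coords = np.array(np.meshgrid(*[np.arange(s) for s in shape], indexing="ij"))
accuracy_map = np.zeros(shape)
clf = SVC(kernel="linear")

for center in np.ndindex(shape):
    dist = np.sqrt(((coords - np.array(center)[:, None, None, None]) ** 2).sum(axis=0))
    sphere = dist <= radius                       # voxels inside the searchlight sphere
    X = data[sphere].T                            # trials x voxels-in-sphere
    accuracy_map[center] = cross_val_score(clf, X, labels, cv=3).mean()

print("peak searchlight accuracy:", round(accuracy_map.max(), 2))
```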
For the cross-instrument searchlight analysis, the classifier was trained on data from
every combination of two instruments and tested on data from the left-out instrument,
resulting in a total of three cross-instrument accuracy maps. The three cross-instrument
searchlights were also overlayed to determine the regions of overlap. To determine the
significance of cross-classification searchlight, I used a more complex nonparametric
method than what was used to determine significance of the within-instrument
searchlight maps. As described in Stelzer, Chen, & Turner (2013), this method involves
random permutation tests on the subject level combined with bootstrapping at the group
level. While Randomise with TFCE, which was used to determine significance of the
within-instrument searchlight maps, does provide excellent control of type 1 errors, the
Stelzer et al. (2013) method can provide a more accurate estimation of the group-level statistics
because it models the null distribution of searchlight maps on both the individual subject
level and group level. However, because within-subject permutation testing and across-
subject bootstrapping are computationally intensive, I only used this method for
determining significance thresholds for the cross-modality searchlight maps, not for the
within-instrument searchlights maps. I believe this decision is justified because a) within
modality classification in auditory cortex is already well known and does not require a
higher standard of proof, b) successful cross-modal classification implies successful
within modality classification, and c) the cross-modal searchlights constitute the most
direct test of our hypotheses.
To achieve this, I randomly permuted the class labels 50 times and performed
whole-brain cross searchlight analyses to create 50 single subject chance accuracy maps.
One permuted accuracy map per subject was selected at random (with replacement) to
create a pooled group accuracy map. This procedure was repeated 10,000 times to create
a distribution of pooled group accuracy maps. Next, a threshold accuracy was found for
each voxel by determining the accuracy that corresponded to a p-value of 0.001 in the
voxel-wise pooled group accuracy map. Clusters were then defined as a group of
contiguous voxels that survived these voxel-wise accuracy thresholds and cluster sizes
were recorded for each of the 10,000 permuted group accuracy maps to create a
histogram of cluster sizes at chance. Finally, cluster-sizes from the chance distribution
were compared to cluster-sizes from the original, group accuracy maps to determine
significance. An FDR method using the Benjamini-Hochberg procedure was used to correct
for multiple comparisons at the cluster level (Heller, Stanley, Yekutieli, Nava, &
Benjamini, 2006).
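A simplified sketch of this permutation-plus-bootstrap scheme is given below, using a small simulated 1-D "volume": per-subject chance accuracy maps are bootstrapped into group maps, a voxelwise accuracy cutoff at p = 0.001 is derived, and a null distribution of suprathreshold cluster sizes is built. The numbers of permutations, bootstraps, and voxels are scaled down for illustration and do not match those used in the study.

```python
# Simplified sketch of the Stelzer et al. (2013) group-level scheme on a small
# simulated 1-D "volume" (sizes scaled down for illustration).
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(4)
n_subjects, n_perms, n_voxels, n_boot = 38, 50, 500, 1000

# Per-subject chance accuracy maps from label permutations (simulated here).
perm_maps = rng.normal(1 / 3, 0.05, size=(n_subjects, n_perms, n_voxels))

# Bootstrap group maps: draw one permuted map per subject (with replacement).
boot_groups = np.empty((n_boot, n_voxels))
for b in range(n_boot):
    picks = rng.integers(0, n_perms, size=n_subjects)
    boot_groups[b] = perm_maps[np.arange(n_subjects), picks].mean(axis=0)

# Voxelwise accuracy threshold corresponding to p = .001 under the null.
vox_thresh = np.quantile(boot_groups, 0.999, axis=0)

# Null distribution of cluster sizes among suprathreshold voxels.
null_cluster_sizes = []
for g in boot_groups:
    labeled, n_clusters = ndimage.label(g > vox_thresh)
    if n_clusters:
        null_cluster_sizes.extend(np.bincount(labeled.ravel())[1:])

cluster_cutoff = np.quantile(null_cluster_sizes, 0.95) if null_cluster_sizes else 1
print("mean voxelwise threshold:", vox_thresh.mean().round(3),
      "| 95th percentile null cluster size:", cluster_cutoff)
```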
Multiple regression with personality measures. Individual scores on the empathy
subscales of the IRI were correlated with classification accuracy within the four ROIs for
both within and across classification to determine if the degree of emotion-specific
predictive information within these regions is associated with greater emotional empathy.
Age, gender, and music sophistication, as measured by the Gold-MSI, were included in
the model as regressors of no interest. Additionally, behavioral accuracy of correctly
identifying the intended emotions of the sound clips, collected outside of the scanner, was
correlated with the performance of the classifier.
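A sketch of this individual-differences model is shown below using statsmodels on simulated data; the variable names are hypothetical and the predictors included are meant only to illustrate the structure of the regression (empathy subscales of interest plus age, gender, and musical sophistication as covariates).

```python
# Sketch of the individual-differences regression: ROI classification accuracy
# regressed on IRI empathy subscales with age, gender, and Gold-MSI general
# sophistication as covariates (simulated data; variable names hypothetical).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 38
df = pd.DataFrame({
    "accuracy": rng.normal(0.45, 0.08, n),        # e.g. insula cross-instrument accuracy
    "fantasy": rng.normal(3.8, 0.5, n),
    "empathic_concern": rng.normal(3.7, 0.5, n),
    "age": rng.normal(20.6, 2.3, n),
    "gender": rng.integers(0, 2, n),
    "msi_general": rng.normal(80, 17, n),
})

model = smf.ols(
    "accuracy ~ fantasy + empathic_concern + age + gender + msi_general", data=df
).fit()
print(model.summary())
```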
Acoustic features of sound clips. For extracting acoustic features from the sound
clips believed to be relevant to emotional expression, I used MIRToolbox, a publicly
available MATLAB toolbox primarily used for music information retrieval (Lartillot &
Toiviainen, 2007), but well suited for extracting relevant acoustical information from
non-musical and vocal stimuli as well (Linke & Cusack, 2015; Rigoulot, Pell, & Armony,
2015). These included: spectral centroid, spectral brightness, spectral flux, spectral
rolloff, spectral entropy, spectral spread, and spectral flatness for evaluating timbral
characteristics, RMS energy for evaluating dynamics, mode and key clarity for evaluating
tonal characteristics, and fluctuation entropy and fluctuation centroid for evaluating
rhythmic characteristics of the clips (Alluri et al., 2012). I also added the acoustic
features published in Paquette et al. (2013): duration and the mean, maximum, and
minimum fundamental frequency. I then evaluated
how these features varied by instrument and by emotion and conducted a classification
analysis based on acoustic features alone to predict the intended emotion of the sound
clip.
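The acoustic-feature classification can be sketched as below, assuming the MIRToolbox/Paquette features have already been exported to a hypothetical data frame feats with columns emotion, instrument, and one column per feature; the linear SVM from the e1071 package is used here to illustrate the cross-instrument scheme and is not necessarily the original implementation.

```r
library(e1071)

feature_cols <- setdiff(names(feats), c("emotion", "instrument"))

# cross-instrument classification: train on two instruments, test on the third
for (test_inst in unique(feats$instrument)) {
  train <- subset(feats, instrument != test_inst)
  test  <- subset(feats, instrument == test_inst)
  fit   <- svm(x = train[, feature_cols], y = factor(train$emotion),
               kernel = "linear")
  acc   <- mean(predict(fit, test[, feature_cols]) == test$emotion)
  cat("test on", test_inst, "- accuracy:", round(acc, 2), "\n")
}
```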
Because I found a main effect of emotion label on the duration of the clips (i.e., fear clips were significantly shorter than sad clips), and because I do not believe that this difference reflects a meaningful distinction amongst emotions, I added to a separate GLM analysis an additional regressor of no interest whose height reflected the duration of each clip. MVPA and searchlight analyses were then repeated with this model for comparison.
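As a rough illustration of this nuisance regressor (the exact GLM software and file format are not restated here, so this assumes an FSL-style three-column timing file and a hypothetical events data frame of clip onsets and durations in seconds):

```r
# Extra EV whose weight (third column) is each clip's duration, so that
# duration-related variance is captured by the lower-level GLM. Demeaning
# the weights is one common convention for parametric regressors, not a
# detail stated in the text.
ev <- data.frame(onset    = events$onset,
                 duration = events$duration,
                 weight   = events$duration - mean(events$duration))
write.table(ev, "duration_ev.txt", row.names = FALSE, col.names = FALSE)
```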
Results
Behavioral results. Behavioral ratings of the sound clips outside of the scanner
were collected for 37 out of the 38 participants. Overall, participants correctly labeled
85% of the clips (SD = 17%). Averaged correct responses for each emotion and
instrument are presented in Table 2. The between-within ANOVA on accuracy scores for
each of the clips showed a significant interaction between
emotion and instrument (F(4,144) = 64.30, p < 0.0001). Post-hoc follow-up t-tests
showed that the fear condition on the clarinet was more consistently labeled incorrectly
(mean accuracy = 55%, SD = 16%) than the happy condition on the clarinet (mean
accuracy = 97%, SD = 5%; t(36) = 15.99, p < 0.0001, paired t-test) as well as the fear
SPATIOTEMPORAL EMOTIONS TO MUSIC
27
condition in the violin (mean accuracy = 90%, SD = 13%; t(36) = 13.83, p < 0.0001,
paired t-test) or voice (mean accuracy = 94%, SD = 7%; t(36) = 13.27, p < 0.0001, paired
t-test).
For intensity ratings of the clips, I calculated the average intensity of each emotion
for each participant. Again, an interaction between emotion and instrument was found for
the intensity ratings (F(4,144) = 38.73, p < 0.0001, ANOVA). Fear clips on the clarinet
were rated as significantly less intense than fear clips on the violin (t(36) = 14.49, p <
0.0001, paired t-test) and voice (t(36) = 13.16, p < 0.0001, paired t-test), whereas sad
clips on the voice were rated as significantly more intense than sad clips on the violin
(t(36) = 9.84, p < 0.0001, paired t-test) or clarinet (t(36) = 9.90, p < 0.0001, paired t-test).
Overall, the intensity ratings provide context for the accuracy scores: fear on the clarinet was both the most difficult emotion to identify and the one rated as significantly less intense. Participant intensity ratings were not found to be related to the performance of the brain-based classifier.
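The behavioral tests reported above follow a standard repeated-measures pattern; a simplified R sketch (with a hypothetical long-format data frame beh containing one accuracy or intensity value per subject, emotion, and instrument, and treating both factors as within-subject) might look like this:

```r
beh$subject    <- factor(beh$subject)
beh$emotion    <- factor(beh$emotion)
beh$instrument <- factor(beh$instrument)

# emotion x instrument interaction on labelling accuracy
fit <- aov(accuracy ~ emotion * instrument +
             Error(subject / (emotion * instrument)), data = beh)
summary(fit)

# post-hoc paired comparison, e.g. fear vs. happy on the clarinet
clar  <- subset(beh, instrument == "clarinet")
fear  <- clar$accuracy[clar$emotion == "fear"]
happy <- clar$accuracy[clar$emotion == "happy"]
t.test(fear, happy, paired = TRUE)   # assumes rows are aligned by subject
```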
Table 2. Intensity ratings and affective labeling of auditory emotional stimuli.
Happy Sad Fear Total
Acc Intensity Acc Intensity Acc Intensity Acc Intensity
All 0.92 3.86 (0.58) 0.85 3.81 (0.68) 0.79 3.52 (0.87) 0.85 3.86 (0.58)
Voice 0.82 4.17 (0.45) 0.81 4.35 (0.48) 0.90 4.04 (0.56) 0.84 4.19 (0.51)
Violin 0.96 3.54 (0.61) 0.89 3.54 (0.56) 0.94 3.87 (0.69) 0.93 3.64 (0.64)
Clarinet 0.97 3.85 (0.50) 0.85 3.55 (0.64) 0.55 2.67 (0.61) 0.79 3.36 (0.77)
Multivariate results. MVPA applied to the whole brain to predict the emotion of
each clip showed above chance (0.33) accuracies using data from all instruments (M =
0.43, SD = 0.08, t(37) = 7.58, p < 0.0001). Above chance accuracy was also obtained
using data collected from each instrument individually (clarinet: M = 0.39, SD = 0.12,
t(37) = 3.06, p = 0.004; violin: M = 0.37, SD = 0.11, t(37) = 2.18, p = 0.04; voice: M =
0.43, SD = 0.15, t(37) = 4.14, p = 0.0002). Within instrument classification accuracy was
also significantly above chance in both the auditory cortex (M = 0.49, SD = 0.06, t(37) =
15.06, p < 0.0001) and all three regions of the insula (dorsal anterior: M = 0.38, SD =
0.07, t(37) = 4.06, p = 0.0002; ventral anterior: M = 0.38, SD = 0.07, t(37) = 4.41, p <
0.0001; posterior: M = 0.38, SD = 0.08, t(37) = 3.68, p = 0.0007, see Figure 3).
Confusion matrices for within-instrument classification in the whole brain and auditory cortex, as well as additional measures of classification performance (sensitivity, specificity, positive predictive value, and negative predictive value), are provided in Appendices A and B, respectively.
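For reference, the test against theoretical chance and the appendix summary measures can be sketched in R as follows, with acc a hypothetical vector of per-subject accuracies for one ROI and truth/pred the pooled true and predicted emotion labels (all names are illustrative):

```r
# one-sample t-test of ROI accuracies against chance (1/3)
t.test(acc, mu = 1/3)

# confusion matrix and derived performance measures
cm          <- table(truth = truth, pred = pred)
sensitivity <- diag(cm) / rowSums(cm)   # per-emotion recall
ppv         <- diag(cm) / colSums(cm)   # positive predictive value
specificity <- sapply(seq_len(nrow(cm)), function(i)
  sum(cm[-i, -i]) / sum(cm[-i, ]))      # true negative rate per emotion
```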
Cross-classification accuracies, in which the classifier was trained on data from two
instruments and tested on data from the left-out third instrument, also showed successful
classification for each combination of training and testing (3 in total). Classification
accuracy averaged across the 3 combinations of training and testing was significantly
greater than chance in the whole brain (M = 0.38, SD = 0.09, t(37) = 3.56, p = 0.001) as
well as the region of interest in the auditory cortex (M = 0.44, SD = 0.08, t(37) = 8.83, p
< 0.0001), but not in the three insula ROIs. Graphs of the accuracies for each
combination of training and testing in both the whole brain analysis and ROI analysis are
presented in Figure 3. Confusion matrices for cross-instrument classification in the whole brain and auditory cortex are provided in Appendix A.
I additionally conducted cross-classification for all six pairwise combinations of training on one instrument and testing on another instrument. Overall, the pairwise cross-instrument classification accuracies were significantly above chance in the auditory cortex. The results are presented in Appendix C.
Figure 3. Classification accuracies for decoding of emotions in auditory stimuli. A, Classification accuracies in the whole brain, auditory cortex (AC), posterior insula (pI), dorsal anterior insula (dAI), and ventral anterior insula (vAI) with all three instruments (violin, clarinet, voice) as well as within each instrument individually. B, Cross-instrument classification accuracies in the whole brain, auditory cortex (AC), posterior insula (pI), dorsal anterior insula (dAI), and ventral anterior insula (vAI), leaving out data from one instrument and training on the other two. Error bars indicate standard error. p values are calculated based on a one-sample t-test comparing classification with chance (0.33, dotted line). †p < 0.05, uncorrected; *p < 0.05; **p < 0.01; ***p < 0.001, corrected for multiple comparisons across the four ROIs.
Searchlight results. The whole-brain, within instrument searchlight analysis
revealed that successful classification of the emotions of the musical clips could be found
bilaterally in the primary and secondary auditory cortices, including the cortices lying
within the superior temporal gyrus and sulcus, as well as the bilateral posterior insular
cortices, parietal operculum, precentral gyrus, inferior frontal gyrus, right middle
temporal gyrus, the right medial prefrontal cortex, right superior frontal gyrus, right
precuneus, and right supramarginal gyrus (Figure 4). Center coordinates and accuracies
for significant regions in the within instrument searchlight analysis are presented in
Appendix D.
Three whole-brain across instrument searchlight analyses were conducted where
the classifier was trained on data from two of the instruments and tested on the held-out
third instrument. All three searchlights showed significant classification bilaterally in
primary auditory cortex, including Heschl’s gyrus, and the superior temporal gyrus and
sulcus, as well as the posterior insula and parietal operculum (Figure 5). Several other
brain regions showed significant classification in one or more of the searchlight analyses,
but not all three. These included the right middle and inferior frontal gyri and precentral
gyrus (leaving out the violin and the voice) and the MPFC (leaving out clarinet). Center
coordinates and accuracies for each significant region of the cross-instrument searchlight
analysis are presented in Appendix D.
Figure 4. Within instrument whole-brain searchlight results. Red-yellow colors
represent classification accuracy. Significant clusters are determined by permutation
testing. Images are thresholded to show clusters that reached an FDR-corrected significance level of alpha = 0.05.
Figure 5. Cross-instrument whole-brain searchlight results. A. Results when training
on data collected during violin and voice clips, testing on clarinet clips. B. Results when
training on data collected during clarinet and voice clips, testing on violin clips. C.
Results when training on data collected during violin and clarinet clips, testing on voice
clips. Red-yellow colors represent classification accuracy. Significant clusters were
determined by permutation testing. Images are thresholded to show clusters that reached an FDR-corrected significance level of alpha = 0.05.
Multiple regression results. Measures of the four subscales of the IRI were
modeled in a multiple regression to predict the classification accuracies in each of the
four regions of interest (auditory cortex and three subcomponents of the insula) with age
and gender added as covariates of no interest. In this model, empathic concern was
positively correlated with both the within and cross classification accuracies in the dorsal
anterior insula (Within: b = 0.08, p = 0.0101, Cross: b = 0.08, p = 0.0158). The
significance of the regression coefficient between empathic concern and within-
instrument accuracy in the dorsal anterior insula survived correction for multiple
comparisons, though the regression with cross-instrument accuracies did not (Bonferroni
correction with four regions of interest, alpha = 0.0125). No other predictors were
significantly correlated with accuracy in the regions of interest.
Scores corresponding to the five subscales of the MSI were additionally modelled
in a separate multiple regression as covariates of interest to predict classification
accuracies in the four ROIs. No significant correlations were found between musical
experience and classification accuracy in either the auditory cortex or insula.
Additionally, no significant correlations were found between behavioral accuracies of
correctly identifying the intended emotion of the clip (collected outside of the scanner)
and classification accuracy.
Acoustic features classification and duration. Duration was significantly
different for the three emotions according to a one-way ANOVA (F(2,87) = 110.30, p <
0.0001). Sad clips (M = 2.39s, SD = 0.54) were significantly longer than the happy (M = 1.48s, SD = 0.39; t(58) = 7.45, adjusted p = 1.2 x 10^-12) and fear clips (M = 0.81s, SD = 0.25; t(58) = 14.38, adjusted p < 0.0001). Because duration is not an acoustic feature directly related
to the expression of an emotion, and because it differed significantly by emotion, I wanted to ensure that the classifier was classifying based on the emotional content of the stimuli rather than merely their length. I therefore added the length of each clip as an additional parametric regressor in the lower-level GLM models and redid both within- and cross-instrument classification with the z-stat images obtained from this analysis. The average within-instrument accuracy using duration as a regressor was 44% (SD = 7%) within the whole brain and the average across-instrument accuracy using duration as a regressor was 39% (SD = 9%), both of which were statistically significant according to a one-sample t-test against theoretical chance (within: t(37) = 9.12, p = 5.33 x 10^-11; across: t(37) = 4.22, p = 0.0002). No significant differences were found between the classification accuracies obtained with and without duration added as a regressor (within: t(37) = 0.61, p = 0.55; across: t(37) = 0.53, p = 0.60, paired t-tests). Given these results, I did not re-compute the searchlight analysis using the results from the GLM analysis with duration modelled.
I also conducted a classification analysis using the acoustic features of the sound
clips only. These included 12 features related to timbre, rhythm, and tonality as described in Alluri et al. (2012) as well as fundamental frequency and duration (Paquette et al., 2013). The linear SVM classifier could successfully classify the emotion label of the
sound clip 82% of the time using all data and 60% on average when training and testing
on data from separate instruments (cross classification). The duration of the sound clips
was determined to be the most important feature used by the SVM. When duration was
removed, classification accuracy was 72% when using data from all instruments and 57%
on average when training and testing across all three instruments (see Table 3). After
removing duration, the most important features for classification were fluctuation
centroid and spectral flux. Fluctuation centroid is a measure of rhythmic changes in
sounds and is calculated by taking the mean (center of gravity) of the fluctuation
spectrum, which conveys the periodicities contained in a sound wave’s envelope (Alluri
et al., 2012). A one-way ANOVA revealed a significant main effect of emotion (F(2,87) = 29.93, p < 0.0001) on fluctuation centroid. Fear clips (M = 2,977; SD = 1,220) were significantly higher than both sad clips (M = 1,325; SD = 554; t(58) = 6.76, adjusted p < 0.0001) and happy clips (M = 2,432; SD = 632.21; t(58) = , adjusted p < 0.0001). Spectral flux is a
measure of how the variance in the audio spectrum changes over time and therefore
conveys both spatial and temporal components of sound (Alluri et al., 2012). It is highly
correlated with fluctuation centroid in our sound clips. A one-way ANOVA revealed
that fearful clips (M = 175.03, SD = 86.05) had a significantly higher spectral flux than
both sad (M = 59.70, SD = 33.32; t(58) = 6.85, p < 0.001) and happy clips (M = 93.20,
SD = 28.25; t(58) = 4.95, p < 0.001).
The results from the acoustic classification provide insight into how the fMRI-based classifier is able to decode the auditory emotions and suggest that differences in neural responses to the changes in rhythm and timbre that distinguish the emotions might contribute to the classifier's performance.
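The feature-level comparisons above (a one-way ANOVA by emotion followed by adjusted pairwise t-tests) can be sketched in R with the same hypothetical feats data frame used earlier; the Bonferroni adjustment here is an assumption, since the exact correction behind the adjusted p-values is not restated in this paragraph.

```r
# main effect of emotion on one acoustic feature
summary(aov(fluctuation_centroid ~ emotion, data = feats))

# follow-up pairwise comparisons with adjusted p-values
pairwise.t.test(feats$fluctuation_centroid, feats$emotion,
                p.adjust.method = "bonferroni")
```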
Table 3. Emotion classification of auditory stimuli using acoustic features alone.
                                          Happy   Sad    Fear   Total
Within classification
  w/ duration                             0.73    0.83   0.90   0.82
  w/out duration                          0.53    0.87   0.77   0.72
Cross-classification: Test on voice
  w/ duration                             0.50    0.20   0.90   0.57
  w/out duration                          0.50    0.40   0.70   0.53
Cross-classification: Test on violin
  w/ duration                             0.50    0.70   0.30   0.50
  w/out duration                          0.50    0.30   0.30   0.37
Cross-classification: Test on clarinet
  w/ duration                             0.60    1.00   0.60   0.73
  w/out duration                          0.80    1.00   0.60   0.80
Discussion
By using multivariate cross-classification and searchlight analyses with different types of
auditory stimuli that convey the same three emotions, I identified higher-level neural
regions that process the affective information of sounds produced from various sources.
Using fMRI data collected from the entire brain, above-chance classification of emotions
expressed through auditory stimuli was found both within and across instruments.
Searchlight analyses revealed that the primary and secondary auditory cortices, including
the superior temporal gyrus (STG) and sulcus (STS), extending into the parietal
operculum and posterior insula, exhibit emotion-specific and modality-general patterns of
neural activity. This is supported by the fact that BOLD signal in these regions could
differentiate the affective content when the classifier was trained on data from one
instrument and tested on data from another instrument. Furthermore, within and cross-
modal classification performance within a region spatially confined to the dorsal anterior
portion of the insula was positively correlated with a behavioral measure of empathy. To
our knowledge, this is the first study to report the emotion-related spatial patterns that are
shared across both musical instruments and vocal sounds as well as to link the degree of
predictive information within these spatial patterns with individual differences.
The findings confirm the role of the cortices in the STG and the STS regions in
perceiving emotions conveyed by auditory stimuli. Significant classification of vocal
expressions of emotions was previously reported in the STG (Kotz et al., 2013) and the
region is active when processing acoustical (Salimpoor, Zald, Zatorre, Dagher, &
Mcintosh, 2015) and affective components of music (Koelsch, 2014). The left STG was
also found to code for both lower-level acoustic aspects as well as higher-level evaluative
judgments of nonlinguistic vocalizations of emotions (Bestelmeyer, Maurage, Rouger,
Latinus, & Belin, 2014). It has been suggested that the STG and STS bilaterally may be
involved in tracking the changing acoustic features of sounds as they evolve over time
(Schonwiesner, Rübsamen, & von Cramon, 2005). The STS in particular appears to
integrate audio and visual information during the processing of non-verbal affective
stimuli (Kreifelts et al., 2009). Both facial and vocal expressions of emotions activate the
STS (Escoffier et al., 2013; Wegrzyn et al., 2015). Multivariate neuroimaging studies
have proposed that supramodal mental representations of emotions lie in the STS (Peelen
et al., 2010). Furthermore, aberrations in both white-matter volume (von dem Hagen et
al., 2011) and task-based functional activity (Alaerts et al., 2014) in these regions were
associated with emotion recognition deficits in individuals with autism spectrum disorder
(ASD). Our findings suggest that discrete emotions expressed through music are
represented by similar patterns of activity in the auditory cortex as when expressed
through the human voice. This confirms the role of the STG and STS in processing the
perceived affective content of a range of sounds, both musical and non-musical, that is
not purely dependent on lower-level acoustic features.
While the peak of the searchlight accuracy maps was located in the auditory cortex,
the significant results extend into the parietal operculum. Because these two regions are
adjacent and because the neuroimaging data are spatially smoothed both in the
preprocessing steps and in the searchlight analysis, I cannot be certain that the significant
classification accuracy found in the parietal operculum indicates that this region is
additionally involved in representing emotions from sounds. Nonetheless, the idea that
cross-modal representation of emotions could be located in this region is consistent with
previous research. The inferior portion of the somatosensory cortex, which is located in
the parietal operculum (Eickhoff, Schleicher, & Zilles, 2006), has been shown to be
engaged during vicarious experiences of perceived emotions (Straube & Miltner, 2011).
Furthermore, patients with lesions in the right primary and secondary somatosensory
cortices performed poorly in emotion recognition tasks (Adolphs, Damasio, Tranel,
Cooper, & Damasio, 2000) and reported reduced intensity of subjective feelings in
response to music (Johnsen, Tranel, Lutgendorf, & Adolphs, 2009). Transcranial
magnetic stimulation applied over the right parietal operculum region was also shown to
impede the ability to detect the emotions of spoken language (Rijn et al., 2005). Using
multivariate methods, Man, Damasio, Meyer, & Kaplan (2015) found that activity in the
parietal operculum could be used to reliably classify objects when presented aurally,
visually, and tactilely, suggesting that this region contains modality invariant
representations of objects and may therefore serve as a convergence zone for information
coming from multiple senses. Taken together, the fact that significant predictive affective
information was found in the parietal operculum may suggest that the ability to recognize
the emotional content of sounds relies on an internal simulation of the actions and
sensations that go into producing such sounds.
The searchlight accuracy maps additionally extended into the posterior portion of
the insula. The insula is believed to be involved in mapping bodily state changes
associated with particular feeling states (Damasio, Damasio, & Tranel, 2013; Immordino-
Yang, Yang, & Damasio, 2014). A range of subjectively-labeled feeling states could be
decoded from brain activity in the insula, suggesting that the physiological experience
that distinguishes one emotion from another is linked to distinct spatial patterns of
activity in the insula (Saarimaki et al., 2015). Studies have shown that the region is
largely modality invariant, activated in response to facial expressions of emotions
(Wegrzyn et al., 2015), perceptual differences between emotions conveyed through non-
speech vocalizations (Bestelmeyer et al., 2014), multimodal presentations of emotions
(Schirmer & Adolphs, 2017) and by a wide range of emotions conveyed through music
(Baumgartner, Lutz, Schmidt, & Jäncke, 2006; Park et al., 2013). The insula’s role in
auditory processing may be to allocate attentional resources to salient sounds (Bamiou et
al., 2003) as evidenced by cases in which patients with lesions that include the insula but
not Heschl’s gyrus develop auditory agnosia (Fifer et al 1995). The function of the insula
in processing emotions expressed across the senses is further substantiated by the
observation that a patient with a lesion in the insula showed an impaired ability to
recognize the emotion disgust when expressed in multiple modalities (Calder, Keane,
Manes, Antoun, & Young, 2000). Developmental disorders characterized by deficits in
emotional awareness and experience may be linked to aberrant functioning of the insula,
as decreased insular activity was observed in children with ASD observing emotional faces
(Dapretto et al., 2006) and altered resting-state functional connectivity between the
posterior insula and somatosensory cortices was observed in adults with ASD (Ebisch et
al., 2011). The fact that emotions conveyed through auditory stimuli could be classified
based on activity in the posterior insula in our study provides further evidence for the
hypothesis that perceiving and recognizing an emotion entails recruiting the neural
mechanisms that represent the subjective experience of that same emotion.
Despite not appearing among the significant regions in the searchlight analysis, the region of interest in the dorsal anterior portion of the insula showed classification accuracy that was significantly positively correlated with empathy.
These two results might be explained in the context of previous functional and structural
imaging studies that suggest that subdivisions of the insular cortex are associated with
specific functions (Deen et al., 2011). According to such accounts, the posterior insula,
which is structurally connected to the somatosensory cortices, is more directly involved
in affective processing of visceral sensations (Kurth, Zilles, Fox, Laird, & Eickhoff,
2010) and interoceptive awareness (Craig, 2009), whereas the dorsal anterior insula,
which is connected to the cognitive control network and the ACC, is more directly
involved in socio-emotional abilities such as empathy and subjective awareness (Craig,
2009). This is evidenced by the fact that the anterior insula is activated when both
observing and imitating the emotions of others (Carr et al., 2003). Measures of empathic
concern, a subtype of affective empathy referring to the tendency to feel sympathy or
concern for others, have been shown to be positively correlated with anterior insula
activity when viewing emotional pictures (Silani et al., 2008) as well as when observing
loved-ones in pain (T. Singer et al., 2004). Given that classification accuracy obtained
from data within the dorsal anterior insula specifically was correlated with empathic
concern, our results provide further evidence for the unique role of this subdivision in
enabling the emotional resonance that is essential to understanding the feelings of others.
I speculate that individuals who readily behave empathically might have more finely
tuned representations of emotions in the dorsal anterior insula when processing affective
information.
While I was mainly interested in identifying brain regions that conserve affect-
related information across sounds with differing acoustical properties, I recognize that
certain acoustic properties are also integral to specific emotional categories regardless of
the source of the sound. Previous results have shown that happiness, for example, is
characterized by higher fundamental frequencies and faster tempos when conveyed
through both vocal expressions and musical pieces (Juslin & Laukka, 2003). An earlier
attempt to disentangle the neural processing of acoustic changes from the neural
processing of perceptual changes associated with two different emotions conveyed
through auditory stimuli acknowledged that the two are interrelated and that a complete
and straightforward separation of the two would be overly simplistic; indeed, the
researchers found evidence for both distinct and overlapping neural networks associated
with these two processes (Bestelmeyer et al., 2014). For this reason, I did not attempt to control for all potential acoustic differences between our stimuli, believing that these features may be essential to each emotional category. Because the duration of the clips varied significantly by emotion and is a feature not directly tied to affective expression, I did regress out the variance explained by duration from our GLM model and showed that cross-instrument classification performance did not change. Aside from duration, no other acoustic properties of the sounds were regressed out of the signal. I therefore expect that our classifier may be sensitive to signal that responds to certain acoustic variations.
To make predictions about the types of acoustic variation that the classifier may be
sensitive to, I conducted classification using several audio features extracted for each
clip. I found that the acoustic-based classifier's performance was also largely dependent on differences in duration between the emotions, as evidenced by the attenuation in performance when durational information was removed. While it is difficult to know what types of information the fMRI-based classifier is using to make distinctions between the emotional states, the classifier trained on acoustic features alone, without duration, can provide some hypotheses. Once duration was removed, the most informative features for classification of emotions included a rhythmic feature called fluctuation centroid. These results suggest that the fMRI-based classifier is not merely sensitive to BOLD signal corresponding to the duration and frequency of the sounds, but may also be capturing finely-tuned responses in the auditory cortex and insula that are sensitive to the changes in rhythm and timbre that are integral to conveying emotions through sound.
Using BOLD data from a mask of the entire brain, classification accuracies were between 38% and 43% for the whole-brain within-instrument classification and between 36% and 40% for the whole-brain cross-instrument classification. Theoretically, if the classifier were guessing the emotional category at random, it would correctly identify the emotion 33% of the time just by chance. While I recognize that the accuracies obtained here are not impressively high compared to theoretical chance, they are statistically significant according to a one-sample t-test corrected for multiple comparisons and comparable to those reported in other multivariate cross-classification fMRI studies (J. Kim et al., 2017; Skerry & Saxe, 2014). Furthermore, the cross-instrument classification accuracies should be interpreted in relation to the within-instrument classification accuracies, which may set the upper bound of possible performance of a cross-modal classifier (Kaplan et al., 2015). I therefore would not expect cross-instrument classification to perform better than within-instrument classification, and the fact that the cross-instrument accuracies are still significantly above chance provides compelling evidence that unique spatial patterns of BOLD signal throughout the brain do contain some predictive information regarding the emotional category.
Of additional note, upon inspection of the confusion matrices, the within-instrument classifier correctly identified fearful clips at a higher rate than the other two emotions, despite the fact that positive predictive value across all three was not
drastically different. This contrasts with the behavioral findings, in which fear was the
emotion most difficult to identify and label. This could indicate that while the fearful
clips were easily distinguishable from the other two emotions, these perceived differences
may not necessarily adhere to the concepts and features humans have learned to associate
with the categorical label of fear. Further exploration into the acoustic components and
behavioral responses to fearful musical and vocal clips will help to interpret these
opposing findings.
In sum, our study reveals the emotion-specific neural information that is shared
across sounds from musical instruments and the human voice. The results support the
idea that the emotional meaning of sounds can be represented by unique spatial patterns
of neural activity in sensory and affect processing areas of the brain, representations that
do not depend solely on the specific acoustic properties associated with the source
instrument. These findings therefore have implications for scientific investigations of
neurodevelopmental disorders characterized by an impaired ability to recognize vocal
expressions of emotions (Allen, Davis, & Hill, 2013) and provide a clearer picture of the
remarkable ability of the human brain to instantaneously and reliably infer emotions
when conveyed nonverbally.
CHAPTER 3
Unique Personality Profiles Predict When and Why Sad Music is Enjoyed
Introduction
Having evaluated regions in the brain that code for brief and immediate recognition
of emotions in music and sounds, I now move to an exploration of music-evoked
emotions that develop and evolve over a longer period of time. To further separate perception from experience, I focus on music in which the recognized valence might not match the valence of the induced feeling state, as is commonly the case in response to sad-sounding pieces of music.
Study 2 sought to provide a clearer understanding of why we enjoy listening to
sad music, in what situations, and for what reasons. From an evolutionary standpoint,
sadness is considered one of the basic, utilitarian emotions that we actively seek to
minimize. The negative sensations associated with sadness are generally elicited by a real
or perceived loss and the resulting bodily-state changes are designed to protect social
status during these potentially vulnerable situations (Damasio & Carvalho, 2013).
Although this interpretation of the evolutionary function of sadness is well supported in
the literature, it does not fully account for situations that commonly occur in response to
music and other forms of art, in which music that conveys and elicits sadness is said to be
enjoyable. The fact that we actively seek out sad music calls into question the generally
accepted notion of the function of sadness. The so-called “tragedy paradox” refers to the
observation that the experience of sadness, when expressed through the arts, can be found
pleasurable or enjoyable. Gaining a better understanding of this problem may further our
basic understanding of the function and origin of negative emotions. It may also resolve
inconsistencies regarding the role that such emotions play in the rewarding aspects of
musical stimuli and help account for our motivations for engaging with such stimuli.
Individual differences and the liking of sad music. Previous explorations of
this tragedy paradox have hypothesized that sadness, when conveyed through music—a
purely aesthetic, non-threatening context—can be found pleasurable because it can yield
psychological benefits related to mood regulation, recollection of and reflection on past
events, and empathic understanding (Levinson, 1990; Sachs et al., 2015; Taruffi &
Koelsch, 2014). Several of these psychological rewards appear to occur specifically in
response to music that conveys sadness and not in response to music that conveys
happiness, for example (Taruffi & Koelsch, 2014). Much of the previous literature has
focused on the personality measures that predict the enjoyment of sad music, though very
few have directly tested how personality traits might relate to how sad music is perceived
and experienced, and, by extension, whether a person gains these psychological benefits
from the music. Openness to Experience, one of the domains of the Big Five model that
encompasses personality dimensions such as aesthetic appreciation and novelty-seeking,
has been shown to be positively associated with liking of sad music: specifically, listeners
who liked sad music that made them feel sad tended to score higher on Openness to
Experience (Ladinig & Schellenberg, 2012; J. Vuoskoski, Thompson, McIlwain, &
Eerola, 2011). Different subcomponents of empathy, such as a measure of cognitive
empathy known as Fantasy, which refers to the tendency to become transported into the
feelings of the characters when engaging with works of fiction, and a measure of
affective empathy known as Empathic Concern, which refers to feelings of sympathy and
concern for unfortunate others, have both been found to be positively associated with
liking sad music (Vuoskoski, Thompson, McIlwain, & Eerola, 2011). In further
explorations, it was determined that an additional subcomponent of cognitive empathy,
Perspective Taking, which refers to the tendency to adopt the viewpoint of another, was
related to liking sad music (Taruffi & Koelsch, 2014) and that this relationship was
mediated by a heightening of the emotional responses to sad music (Kawakami & Katahira,
2015). A personality measure related to the Fantasy subcomponent of empathy called
absorption, referring to the capacity to become deeply focused on a task or stimulus and
temporarily disengaged from the external surroundings, has also been shown to be linked
to the enjoyment of sad music (Garrido & Schubert, 2011).
It has also been shown that personality traits and symptoms related to depression,
such as a tendency to ruminate, are associated with the enjoyment of sad music. Trait
rumination was found to be positively correlated with enjoyment of sad music in several
studies (Garrido & Schubert, 2011; Schubert, Halpern, Kreutz, & Garrido, 2018).
Severity of depressive symptoms more broadly has also been shown to be positively associated with liking sad music, a relationship that was found to be more pronounced in
men than women (Hogue, Crimmins, & Kahn, 2016). Women, on the other hand,
reported listening to sad music because of its ability to trigger memories and because it
caused them to feel less alone (Eerola, Peltola, & Vuoskoski, 2015).
The role of emotional responses in the reactions to sad music. Despite the
support for an association between certain personality traits and the liking of sad music,
several studies have failed to replicate some of these associations. Taruffi & Koelsch,
(2014) for instance, did not find a direct link between Empathic Concern and liking sad
music; on the other hand, they did find that Empathic Concern was related to feeling sad
in response to sad music. These results suggest that the relationship between personality
and liking sad music may be mediated by the intensity and quality of the emotional
responses to sad music. Both Openness to Experience and empathy have been shown to
be correlated with the intensity of emotions felt in response to sad music (J. Vuoskoski et
al., 2011). In light of these findings, it has been proposed that Fantasy and absorption
may be linked to liking sad music because listeners who demonstrate these personality
traits are able to experience positive emotions in response to sad music without the
displeasing emotions that might typically accompany the experience (Garrido &
Schubert, 2015). The same may be true for Openness to Experience as well, which is
known to be associated with aesthetic appreciation (McCrae, 2007) and intense,
pleasurable responses to music in general (Nusbaum & Silvia, 2011). On the other hand,
rumination and depressive symptoms may actually prolong the negative emotional states in
response to sad music; therefore, it may be that the link between these traits and liking
sad music is representative of some “maladaptive attraction” to negative emotions
(Garrido & Schubert, 2015). In either case, the intensity of the emotional responses to sad
music appears to play a role in its enjoyment, by either increasing positive emotions or
by prolonging negative ones. It is quite possible, then, that the psychological benefits of
listening to sad music, and therefore people’s reasons for selecting sad music, depend on
the nature of the emotional response. While there is some empirical evidence for this
idea, whether emotional responses to sad music mediate the relationship between liking
sad music and personality traits has not yet been systematically evaluated.
Situational factors influencing engagement with sad music. Previous research
also suggests that seeking out or enjoying sad music is influenced by situational factors.
It has been shown that when people are listening to music alone, they demonstrate
increased emotional response to the stimuli, as measured by psychophysiology
(Egermann et al., 2011), and, relatedly, report choosing to listen to sad music more often
when alone. Others have found that people choose to listen to sad music more when
feeling homesick or lonely as well as when experiencing emotional distress related to a
death or a break-up (Taruffi & Koelsch, 2014). It is possible that situational factors
influence enjoyment of sad music by manipulating one’s mood. Mood has previously
been shown to influence preferences for sad music. For example, after a sad-mood
induction paradigm, individuals were more likely to report liking sad-sounding music
(Hunter, Schellenberg, & Griffith, 2011). Despite this, not everyone chooses to listen to
sad-sounding music when they are sad. The results from one previous study suggest that
personality may play a role in whether or not people choose to listen to music that is
congruent with their current mood: people who scored higher on the personal distress
subscale of empathy, and lower on the personality trait emotional stability, were more
likely to listen to sad music when they were in a negative mood (mood-congruent),
whereas people who scored higher on the Perspective Taking subscale of empathy were
more likely to listen to sad music when in a positive mood (mood-incongruent; Taruffi &
Koelsch, 2014). Interestingly, scores on the Fantasy subcomponent of empathy were
positively associated with both mood-congruent and mood-incongruent liking of sad
music.
The current study. The exact nature of the relationship between individual
differences, emotional responses and the enjoyment of sad music has yet to be fully
clarified. The previous work has not taken into consideration how different situations
might influence one’s motivation for engaging with sad music nor assessed how the
nature of the emotional response to sad music influences its subsequent enjoyment.
With the goal of both replicating previous findings and extending current
knowledge of the relationship between individual differences and reasons and
motivations for listening to sad music, I conducted an online survey in which participants
(N = 431) completed several personality questionnaires designed to measure a variety of
personality traits, including the Big Five personality domains, empathy, absorption, as
well as rumination and other depressive symptoms. Participants additionally reported on
the emotional quality of music that they would choose to listen to in a variety of situations and, if they were to choose sad music, their reasons for doing so as well as their expected emotional response.
After determining underlying situational and motivational factors using exploratory
factor analysis (EFA), regularized regression models were used to assess the combination
of personality measures that were most predictive of (1) the enjoyment of sad music, (2)
emotional responses to sad music, (3) the situations in which people listen to sad music,
and (4) the reasons people gave for listening to sad music. Finally, mediation models
were constructed to assess whether the quality of the emotions in response to sad music
mediate the relationship between personality and liking sad music in particular situations
and for particular reasons.
Using regularized regression models with data that captures multiple dimensions of
personality, as well as situational and motivational factors, I can resolve outstanding
issues with regard to the personality profiles that account for our attraction to sad music.
I hypothesized that different subcomponents of empathy would be linked to the
enjoyment of sad music through separate mechanisms. Specifically, I predicted that
people who score high on Fantasy-proneness and absorption would enjoy sad music when
in distressing situations because it triggers intense, positive emotions. Furthermore, I
predicted that people who tend to ruminate and express symptoms of depression would
choose to listen to sad music when in negative situations because it triggers more intense,
negative emotions. Finally, I predicted that Openness to Experience would be associated
with liking sad music in both positive and negative situations because of its ability to
induce intense emotions, regardless of valence.
Methods
Participants. Data were collected from participants recruited from two separate sources: Amazon Mechanical Turk and the University of Southern California undergraduate population. Amazon Mechanical Turk workers (N = 218, 117 female, mean age = 34.17 years, SD = 11.02) were paid $3.00 for completing the survey and USC psychology students (N = 213, 174 female, mean age = 19.93 years, SD = 2.07) received course credit.
Materials. The online survey consisted of several personality questionnaires that
were believed to be most relevant to the enjoyment of sad music. Below is a list of all
questionnaires used as well as a brief description of each:
a. Ten Item Personality Index (TIPI): brief measure of the Big Five personality
traits, Openness to Experience, Conscientiousness, Extraversion, Agreeability,
and Emotional Stability (Gosling, Rentfrow, & Swann, 2003)
b. Southampton Nostalgia Scale: measure of nostalgia proneness (Barrett et al.,
2010)
c. Tellegen Absorption Scale: measure of the tendency to become absorbed in
external stimuli (Tellegen & Walker, 2008)
d. Interpersonal Reactivity Index (IRI): measure of global empathy and the four
subscales: Fantasy, Perspective Taking, Empathic Concern, and Personal
Distress (Davis, 1983)
e. Rumination-Reflection Questionnaire: measure of tendency to ruminate and
self-reflect (Trapnell & Campbell, 1999)
f. Geneva Emotional Music Scale (GEMS-9): assesses emotions experienced
when listening to sad music (Zentner, Grandjean, & Scherer, 2008; see
Appendix E.)
g. Patient Health Questionnaire (PHQ-9): measure used to diagnose major
depression as well as subthreshold depressive symptoms in the general
population (Martin, Rief, Klaiberg, & Braehler, 2006).
Participants were also asked to rate how much they generally enjoyed listening to
sad music. Participants who reported liking sad music were then presented with a list of
12 different reasons or motivations for listening to sad music and were asked to rate how
much they agreed or disagreed with each item using a five-point Likert scale. The 12 items were adopted from Taruffi & Koelsch (2014) and can be found in Appendix F.
In addition, the survey contained a list of 20 different situations in which a person
might choose to listen to music, such as when feeling stressed or when in contact with
nature, and asked the participants to describe the quality of the music that they would
most likely listen to in that situation (see Appendix G for a full list of situations). The situations were adopted from previous studies that explored how music preferences change in various external situations (Myriam et al. 2012) and were chosen to represent a full range of potential emotional states, including all four quadrants of the two-dimensional space of valence and arousal (Russell et al. 1980). Using a slider from 0 (not at all) to 10
(very much), participants answered three separate questions for each situation: 1. how
happy sounding is the music; 2. how sad sounding is the music; 3. how energetic is the
music. They were also given the option to select "Not Applicable".
Procedure and Analysis. Following informed consent procedures, participants
were directed to a Qualtrics survey link with a unique, randomly-chosen participant ID
number. The ordering of questionnaire presentation was pseudo-randomized, with
demographic information always being presented first before all other questionnaires.
Exploratory factor analyses. Exploratory factor analyses (EFAs) were conducted
for the 9 emotional responses to sad music (GEMS-9), the 20 situations for listening to
sad music, and the 12 reasons for listening to sad music. For the list of 20 situations, only
responses to the one question "How sad sounding is the music" were used in the
subsequent analyses. Scree plots were generated to select the number of underlying factors to extract based on an eigenvalue cutoff of one. Maximum-likelihood factor analysis with promax rotation was then conducted on all items from the survey using the factanal function in the stats package of the R statistical program (R Development Core Team, 2012). Any item that loaded on more than one factor (with factor loading scores > 0.35) or that failed to load on any factor (no loading > 0.40) was removed, and the factor analysis was conducted again. Regression scores from the final, pruned analyses were then extracted for subsequent analyses.
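A compact sketch of this EFA step, assuming items is a hypothetical data frame holding one of the item sets (e.g. the 20 situation ratings); the eigenvalue rule and the promax/regression options mirror the description above.

```r
# number of factors with eigenvalue > 1 from the item correlation matrix
n_factors <- sum(eigen(cor(items, use = "pairwise.complete.obs"))$values > 1)

# maximum-likelihood EFA with promax rotation and regression factor scores
efa <- factanal(na.omit(items), factors = n_factors,
                rotation = "promax", scores = "regression")
print(efa$loadings, cutoff = 0.40)   # inspect loadings for pruning
factor_scores <- efa$scores          # carried into the lasso models below
```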
Lasso regression models. Lasso regression models were used to predict ratings
for the enjoyment of sad music, emotional responses to sad music (GEMS-9 factors), as
well as the situational and reasons factors scores determined by EFA, based on all
personality variables. The lasso was chosen in order to determine the combination of
personality variables that would best explain the enjoyment of sad music (Tibshirani,
1996). Predictor variables were selected using lasso regression on 80% of the data, with 10-fold cross-validation used to tune the model and select the best-fitting penalty coefficient (lambda) based on the minimum error observed across folds. The final model was then
validated on the left-out 20% of the data and model fit was assessed with mean squared
error. Full regression models using the variables selected by the lasso regression were
then conducted with all of the data to assess relationships between personality and (1)
enjoying sad music, (2) emotional responses to sad music, (3) situations for listening to
sad music, and (4) reasons for listening to sad music. All analyses were conducted in the R statistical program using the lme4 (multiple regression) and glmnet (lasso regression) packages.
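A minimal sketch of this pipeline, assuming X is the numeric matrix of the 17 predictors with named columns and y is one outcome such as the enjoyment rating; the 80/20 split, 10-fold tuning, held-out mean squared error, and the follow-up full regression mirror the steps described above (ordinary lm is used here for the full model purely for brevity, and all variable names are illustrative).

```r
library(glmnet)

set.seed(1)
train <- sample(nrow(X), size = round(0.8 * nrow(X)))

# lasso (alpha = 1) tuned by 10-fold cross-validation on the training split
cvfit <- cv.glmnet(X[train, ], y[train], alpha = 1, nfolds = 10)

# validate on the held-out 20% with mean squared error
pred <- predict(cvfit, newx = X[-train, ], s = "lambda.min")
mse  <- mean((y[-train] - pred)^2)

# predictors retained at the selected lambda, then refit on the full sample
cf   <- coef(cvfit, s = "lambda.min")
kept <- setdiff(rownames(cf)[as.vector(cf) != 0], "(Intercept)")
full <- lm(y ~ ., data = data.frame(y = y, X[, kept, drop = FALSE]))
summary(full)
```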
Mediation models between personality and liking sad music. Mediation analyses
were conducted to investigate the relationship between personality, emotional responses,
and liking sad music in various situations and for various reasons. I used the results from
the lasso regression models to guide these analyses by specifically testing whether emotional responses to sad music (as measured by the GEMS-9) mediate the association between (1)
personality and liking sad music, (2) personality and reasons for liking sad music, and (3)
personality and listening to sad music in certain situations. Indirect and direct effects
were determined using the lavaan package in R.
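One of these mediation models, written in lavaan syntax as a sketch: the specific variables (Fantasy as the predictor, the "sublime" GEMS factor as the mediator, and enjoyment of sad music as the outcome) are one plausible instance of the models described, not necessarily the exact specification used, and survey is a hypothetical data frame.

```r
library(lavaan)

model <- '
  sublime  ~ a * fantasy                 # path a: personality -> emotional response
  liking   ~ b * sublime + c * fantasy   # path b and the direct effect c
  indirect := a * b                      # indirect (mediated) effect
  total    := c + a * b
'
fit <- sem(model, data = survey)
summary(fit, standardized = TRUE)
```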
Results
Descriptive statistics. Correlations between explanatory variables are presented in Appendix H. No significant differences in ratings of the enjoyment of sad music were found between the two survey populations (Mturk: M = 3.31; USC: M = 3.37; t(429) = -0.60, p > 0.10). Therefore, results from the two surveys were combined for all subsequent analyses.
Underlying emotions in response to sad music. The exploratory factor analysis
suggested a 3-factor model (see Table 4). All 9 items were kept in the final model, which replicated the factor design outlined in Zentner et al. (2008). The items feeling “tender”, “nostalgic”, “peaceful”, and “transcendent” loaded onto factor 1, which is therefore subsequently termed the sublime factor, as per Zentner et al. (2008). Factor 2 corresponds to feelings of tension and sadness and is referred to as the unease factor, and factor 3 corresponds to feelings of joy and power and is termed the vital factor.
Predicting the enjoyment of sad music and emotional responses to sad music.
All predictor variables (17 in total: the Big Five personality traits, all four subscales of empathy, nostalgia proneness, trait absorption, rumination, reflection, depression severity, age, years of music training, and gender) were included in the model to predict ratings for
the enjoyment of sad music. The final Lasso model suggested 7 predictor variables (see
Table 4). In the full regression model, the Fantasy subscale of the IRI was positively
correlated with liking sad music (b= 0.17, SE = 0.06, p < 0.01) and the personal distress
subscale of the IRI was negatively correlated with liking sad music (b= 0.14, SE = 0.05,
p < 0.01).
Lasso regression was then used to determine the personality measures that best predict factor scores from the three latent variables identified in the EFA of the GEMS-9 (Table 4). When
predicting factor scores from the sublime factor of the GEMS-9, the best-fitting lasso
regression retained 10 predictor variables. In the full multiple regression model,
emotional stability (b = 0.15, SE = 0.04, p < 0.01), Fantasy (b = 0.15, SE = 0.05, p <
0.01), absorption (b = 0.15, SE = 0.05, p < 0.01), rumination (b = 0.12, SE = 0.06, p <
0.05), and nostalgia proneness (b = 0.14, SE = 0.05, p < 0.01) were positively correlated
with sublime feelings in response to sad music, whereas extraversion was negatively
correlated with sublime feelings (b = -0.10, SE = 0.04, p < 0.05).
The best-fitting lasso regression predicting the vital factor, which included
feelings of joy and power, retained 11 predictor variables. In the full regression model,
gender (b = -0.21, SE = 0.10, p < 0.05), agreeability (b = -0.11, SE = 0.05, p < 0.05), and
depression severity (b = -0.10, SE = 0.05, p < 0.01) were all negatively correlated with
the vital emotions factor. Finally, the best-fitting lasso model retained 12 predictor
variables when predicting the unease factor. This factor, which included feelings of sadness and tension, was found to be positively correlated with extraversion (b = 0.08, SE = 0.04, p <
0.05), Empathic Concern (b = 0.09, SE = 0.04, p < 0.05), personal distress (b = 0.09, SE
= 0.04, p < 0.05), and absorption (b = 0.09, SE = 0.04, p < 0.05).
Table 4. LASSO models predicting the enjoyment of sad music and GEMS
                              Enjoy sad music    GEM 1 - Sublime    GEM 2 - Vital    GEM 3 - Unease
                              b       SE         b       SE         b       SE       b       SE
1. Gender -- -- 0.15 0.09 -0.21* 0.10 -0.10 0.08
2. Years of music training -- -- -- -- 0.06 0.04 -- --
3. Age -- -- -0.08 0.04 -0.05 0.05 -- --
4. Extraversion -0.13 0.05 -0.10* 0.04 0.08 0.04 0.08* 0.04
5. Agreeability -- -- 0.08 0.05 -0.11* 0.05 -- --
6. Conscientiousness -- -- -- -- -0.08 0.05 -0.03 0.04
7. Emotional stability -- -- 0.15** 0.05 0.06 0.05 -0.04 0.05
8. Openness to experience -- -- -- -- 0.09 0.04 0.04 0.03
9. Depression severity 0.07 0.05 -- -- -0.10* 0.05 0.06 0.04
10. Fantasy 0.17** 0.06 0.15** 0.05 0.00 0.05 -- --
11. Perspective taking -- -- -- -- -- -- -0.06 0.04
12. Empathic concern -0.10 0.05 -- -- -- --- 0.09* 0.04
13. Personal distress -- -- -- -- -- -- 0.09* 0.04
14. Absorption 0.16** 0.05 0.15** 0.05 -0.01 0.05 0.09* 0.04
15. Rumination 0.07 0.05 0.12* 0.06 -- -- 0.03 0.05
16. Reflection -- -- 0.04 0.05 -- -- -- --
17. Nostalgia proneness -- -- 0.14** 0.05 -- -- 0.05 0.04
Observations 431 431 431 431
R2 0.14 0.22 0.08 0.12
Adjusted R2 0.12 0.20 0.06 0.10
Residual Std. Error 0.94 0.86 0.91 0.70
F Statistic 11.06*** 11.61*** 3.51*** 4.77***
Mean squared error 1.04 0.80 0.70 0.62
Note. N = 431. ***p < .001. **p < .01,*p < .05
Situations in which one would listen to sad music. A scree-plot was used to
determine the number of latent variables that could account for the variance in the 20
possible situations, which suggested a four-factor model. The first four factors each had
an eigenvalue greater than 1. One item, when wanting to get away, did not have a loading score above 0.40 on any of the four factors and was removed from the model. In
the final, pruned model, the four factors appeared to reflect the four quadrants of the
valence-arousal space (see Table 5). Factor one corresponded to situations that are
positive valence but low arousal, including the items when feeling relaxed, when trying to
focus, when in a good mood, when in nature, when being creative, when doing something
else, and when alone. Factor two corresponded to situations that are negative valence and
low arousal, including the items when missing someone, when sad, when there has been a
recent death, when feeling lonely, when feeling homesick, and when feeling nostalgic.
The third factor corresponded to situations with negative valence, but high in arousal,
such as when angry, when frustrated, and when stressed. The fourth factor corresponded
to situations with positive valence and high arousal, and included the items when with friends, when running, dancing, or working out, and when celebrating.
Table 5. Exploratory factor analysis of 20 situations for listening to sad music.
Situations: F1: Relaxed F2: Sad F3: Angry F4: Celebrate
feeling relaxed 0.79
doing something else, driving, traveling, reading, working 0.64
you are alone 0.63
being creative or feeling inspired 0.63
in contact with nature 0.61
you are happy or in a good mood 0.57
you are trying to focus 0.53
you are missing someone 0.85
you are sad or in a bad mood 0.74
there has been a recent death or breakup 0.64
feeling lonely 0.61
feeling homesick 0.56
feeling nostalgic or reflecting on past personal experiences 0.53
feeling frustrated 0.82
feeling angry 0.71
feeling stressed 0.65
you are celebrating a holiday, birthday or other social occasion 0.76
you are with friends/at a social gathering 0.67
you are running, dancing, working out 0.56
you want to distance yourself or get away from your problems
Loading 3.02 2.87 1.83 1.40
Variance explained 15% 14% 9% 7%
Note. N = 431. Factor loadings < |.40| omitted.
Once the factor structure was determined, lasso regression was again used to
determine the combination of personality measures that predict factor scores from each of
these four situational factors (see Table 6). In the final regression model predicting the
positive valence, low arousal (relaxed) situational factor, reflection (b = 0.19, SE = 0.06,
p < 0.01) was positively correlated with factor scores and Empathic Concern (b = -0.15,
SE = 0.04, p < 0.05) was negatively correlated with factor scores. In the best-fitting
regression model predicting the negative valence, low arousal (sad) situational factor,
only Fantasy was positively correlated with factor scores (b = 0.30, SE = 0.05, p <
0.001). For the negative valence, high arousal (angry) situational factor, rumination was
found to be positively correlated with factor scores (b = 0.14, SE = 0.06, p < 0.05). For
the positive valence, high arousal (celebrate) situational factor, years of musical training
(b = 0.10, SE = 0.05, p < 0.05) and Openness to Experience (b = 0.10, SE = 0.05, p <
0.05) were found to be positively correlated with factor scores, whereas gender (b = -
0.31, SE = 0.11, p < 0.01), conscientiousness (b = -0.10, SE = 0.04, p < 0.05), emotional
stability (b = -0.16, SE = 0.07, p < 0.05), and rumination (b = -0.22, SE = 0.06, p <
0.001) were found to be negatively correlated.
Table 6. LASSO models predicting situational factors for listening to sad music
F1: Relaxed F2: Sad F3: Angry F4: Celebrate
b SE b SE b SE b SE
1. Gender -- -- -- -- 0.04 0.12 -0.31** 0.11
2. Years of music training -- -- -- -- -- -- 0.10* 0.05
3. Age -- -- -- -- -0.07 0.06 0.09 0.05
4. Extraversion -- -- -- -- -- -- -- --
5. Agreeability -0.07 0.07 -- -- -- -- 0.06 0.06
6. Conscientiousness -- -- -- -- -- -- -0.10* 0.05
7. Emotional stability -- -- -- -- -- -- -0.16* 0.07
8. Openness to experience -- -- -- -- -- -- 0.10* 0.05
9. Depression severity 0.08 0.07 -- -- -- -- 0.00 0.06
10. Fantasy -- -- 0.30*** 0.05 -- -- 0.09 0.06
11. Perspective taking -- -- -- -- -- -- -- --
12. Empathic concern -0.15* 0.07 -- -- -- -- -0.10 0.06
13. Personal distress -- -- -- -- -- -- 0.08 0.06
14. Absorption -- -- -- -- -0.06 0.06 -0.07 0.06
15. Rumination 0.04 0.07 -- -- 0.14* 0.06 -0.22*** 0.06
16. Reflection 0.19** 0.06 -- -- -0.11 0.06 -- --
17. Nostalgia proneness -- -- -- -- -- -- -- --
Observations 431 431 431 431
R2 0.05 0.07 0.04 0.11
Adjusted R2 0.04 0.07 0.03 0.08
Residual Std. Error 1.20 1.06 1.10 0.96
F Statistic 4.59*** 33.93*** 3.23** 3.93***
Mean Squared Error 1.14 1.15 1.36 1.09
Note. N = 431. ***p < .001. **p < .01,*p < .05.
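The lasso models reported in Tables 4 and 6 (and later in Table 8) can be approximated with an L1-penalized regression of a factor score on the standardized personality and demographic predictors, with the penalty chosen by cross-validation. The sketch below is a generic scikit-learn version under assumed variable and file names and is not necessarily the exact estimation routine used here; predictors whose coefficients are shrunk to zero correspond to the "--" cells in the tables.

import pandas as pd
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("personality_and_factor_scores.csv")    # hypothetical file
predictors = ["gender", "music_training", "age", "extraversion", "agreeability",
              "conscientiousness", "emotional_stability", "openness", "depression",
              "fantasy", "perspective_taking", "empathic_concern", "personal_distress",
              "absorption", "rumination", "reflection", "nostalgia"]

X = StandardScaler().fit_transform(df[predictors])        # standardize the 17 predictors
y = df["situational_factor_sad"]                          # e.g., the F2 (sad) factor score

lasso = LassoCV(cv=10).fit(X, y)                          # penalty chosen by 10-fold CV
coefficients = pd.Series(lasso.coef_, index=predictors)
print(coefficients[coefficients != 0].sort_values())      # predictors retained by the lasso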
Reasons for enjoying sad music. Exploratory factor analysis was conducted on a
randomly selected half of the data to determine the factor structure of responses to the 12
possible reasons for enjoying sad music. Based on a scree-plot and the eigenvalue
criteria, it was determined that a three-factor model was appropriate. Items that did not have a factor loading score above 0.40 on any of the three factors were not included in the model. These included because it reassures me in my ability to feel emotions at all,
because I can experience a milder version of the emotions expressed in the music without
the negative life consequences, and because it allows me to express who I am.
In the final, pruned model (see Table 7), the items because it makes me feel calm,
because it makes me feel touched or moved, because it strengthens my emotions, because
it helps me regulate my emotions, loaded highly on the first factor. Together, these items
all relate to using sad music to control and induce positive emotional responses. The
items because it allows me to relate to or connect with others, because it helps me gain a
better understanding of my own feelings or my current situation, and because it makes me
think more realistically about my life and current situation all loaded on the second
factor. These items express benefits of sad music related to the engagement of cognitive
processes and self-reflection. Finally, the items because it prolongs my feelings of
sadness and because it allows me to purge myself of negative emotions both loaded highly
on factor 3. These items relate to the idea of catharsis, that sadness induced through
music can be beneficial by engaging with negative emotions, whether by releasing them
or going deeper into them.
Table 7. Exploratory factor analysis with 12 reasons for listening to sad music.
Reasons: F1: Touched F2: Realistic F3: Prolong
I am more likely to feel touched, awed, or moved by sad music 0.73
it generally results in a strengthening and/or deepening of emotions and I find powerful emotions enjoyable 0.63
it makes me feel more calm, soothed, and/or relaxed than other types of music 0.60
it helps me regulate my mood and/or emotions better than other types of music 0.42
it makes me think more realistically about my life and current situation 0.80
it helps me gain a better understanding of my own feelings or my current situation 0.65
it allows me to relate to and/or feel connected to others 0.38
it makes me feel more sad and/or prolongs my sadness and I think feeling truly sad has some value or importance 0.62
it allows me to release or purge myself of negative emotions 0.40
listening to sad music is a way for me to express how I am feeling and/or who I am
it allows me to relate to and/or feel connected to others
I can experience a milder version of the emotions expressed in the music without the negative life consequences
I am more likely to feel touched, awed, or moved by sad music
Loading 1.48 1.38 0.74
Variance explained 16% 15% 8%
Note. N = 332. Factor loadings < |.40| omitted.
Lasso regression was used to determine the combination of personality factors
that predict loading scores from the three factors. In the final regression model predicting
factor 1 (strengthening positive emotions), absorption (b = 0.15, SE = 0.06, p < 0.05) and
rumination (b = 0.14, SE = 0.05, p < 0.05) were positively correlated with factor scores.
In the final regression model predicting factor 2 (thinking realistically), Perspective
Taking was positively correlated with factor scores (b = 0.17, SE = 0.05, p < 0.01). For
factor 3 (releasing negative emotions), Fantasy (b = 0.10, SE = 0.05, p < 0.05) and
rumination (b = 0.10, SE = 0.05, p < 0.05) were positively correlated with factor scores
(see Table 8).
Table 8. LASSO models predicting reasons for listening to sad music.
F1: Touched F2: Realistic F3: Prolong
b SE b SE b SE
1. Gender -- -- 0.04 0.09 -- --
2. Years of music training -- -- -- -- -- --
3. Age -- -- -- -- -- --
4. Extraversion -- -- -- -- -- --
5. Agreeability -- -- -- -- -- --
6. Conscientiousness -- -- -- -- -- --
7. Emotional stability -- -- -- -- -0.09 0.05
8. Openness to experience -- -- 0.08 0.05 -- --
9. Depression severity -- -- -- -- -- --
10. Fantasy -- -- -- -- 0.10* 0.05
11. Perspective taking -- -- 0.17** 0.05 -- --
12. Empathic concern -- -- -- -- -0.01 0.04
13. Personal distress -- -- -- -- -- --
14. Absorption 0.15* 0.06 0.06 0.06 -- --
15. Rumination 0.14* 0.05 -- -- 0.10* 0.05
16. Reflection -- -- -- -- -- --
17. Nostalgia proneness 0.06 0.06 0.06 0.06 -- --
Observations 332 332 332
R2 0.06 0.06 0.07
Adjusted R2 0.06 0.04 0.06
Residual Std. Error 0.91 0.94 0.73
F Statistic 7.55*** 4.07** 6.43***
Mean Squared Error 0.78 1.24 0.65
Note. N = 332. ***p < .001. **p < .01,*p < .05.
Emotional responses mediate the relationship between personality and the
liking of sad music. I first tested whether sublime feelings in response to sad music mediate the relationship between Fantasy/absorption and liking sad music. In all mediation models,
the items from the GEMS-9 that were shown to load onto a single factor were averaged
and used as a single variable. A significant indirect and direct effect was found between
enjoyment of sad music and Fantasy (b_indirect = 0.09, SE = 0.02, p < 0.001; b_direct = 0.15, SE = 0.05, p = 0.002; see Figure 6), suggesting partial mediation. A significant indirect and direct effect was also found with the model including absorption (b_indirect = 0.10, SE = 0.02, p < 0.001; b_direct = 0.16, SE = 0.05, p = 0.001).
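The indirect and direct effects reported in this section come from simple mediation models (predictor, averaged GEMS factor as mediator, enjoyment as outcome). One common way to estimate such paths is an ordinary-least-squares decomposition with a percentile bootstrap for the indirect (a x b) effect; the sketch below is a generic implementation with hypothetical column and file names, offered only to illustrate the logic rather than as the exact routine used in this study.

import numpy as np
import pandas as pd
import statsmodels.api as sm

def mediation_paths(df, x, m, y):
    """Return (indirect a*b, direct c') for a simple X -> M -> Y mediation model."""
    a = sm.OLS(df[m], sm.add_constant(df[x])).fit().params[x]       # path a: X -> M
    outcome = sm.OLS(df[y], sm.add_constant(df[[x, m]])).fit()      # Y ~ X + M
    return a * outcome.params[m], outcome.params[x]                 # a*b and c'

df = pd.read_csv("sad_music_survey.csv")                            # hypothetical file
indirect, direct = mediation_paths(df, "fantasy", "sublime_feelings", "liking_sad_music")

# Percentile bootstrap confidence interval for the indirect effect
rng = np.random.default_rng(0)
boot = np.empty(5000)
for i in range(boot.size):
    idx = rng.integers(0, len(df), len(df))
    boot[i] = mediation_paths(df.iloc[idx], "fantasy", "sublime_feelings",
                              "liking_sad_music")[0]
ci = np.percentile(boot, [2.5, 97.5])
print(indirect, direct, ci)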
I next evaluated whether rumination would be associated with liking sad music
because it increases negative emotions. A negative indirect effect was found, as well as a
positive direct effect (b_indirect = -0.04, SE = 0.01, p = 0.003; b_direct = 0.23, SE = 0.05, p < 0.001): rumination was directly related to liking sad music, but indirectly negatively related to liking sad music through its positive association with feelings of sadness and tension (b = 0.17, SE = 0.04). Furthermore, to our surprise, a significant positive indirect effect was found with sublime feelings mediating the relationship between rumination and liking sad music (b_indirect = 0.07, SE = 0.02, p < 0.001; b_direct = 0.13, SE = 0.05, p = 0.007).
I additionally tested whether an association between sensations of unease when
listening to sad music and liking sad music was mediated by personality measures. Given
that absorption was positively correlated with factor scores of unease as well as liking sad
music, I evaluated whether this trait mediated the relationship between feelings of unease
and liking sad music. A significant positive indirect effect was found with a negative direct effect (b_indirect = 0.06, SE = 0.02, p = 0.01; b_direct = -0.22, SE = 0.07, p = 0.002), indicating a suppression effect. The model suggests that feelings of unease in response to sad music are directly negatively correlated with liking sad music, but positively correlated with liking sad music through trait absorption. Finally, I evaluated whether depression severity negatively mediated the relationship between feeling vital emotions in response to sad music and liking sad music, but no indirect effects were found (b_indirect = 0.01, SE = 0.009, p > 0.10; b_direct = 0.21, SE = 0.06, p < 0.001).
I next tested the mediating effects that emotional responses to sad music would
have on the relationship between personality and the situations in which people listen to
sad music. A significant positive indirect effect was found with sublime feelings mediating the relationship between absorption and liking sad music in negative valence, low arousal (sad) situations (b_indirect = 0.14, SE = 0.03, p < 0.001; b_direct = 0.08, SE = 0.05, p > 0.10). Additionally, a significant positive indirect and direct effect was found with sublime feelings mediating the relationship between Fantasy and liking sad music in sad situations (b_indirect = 0.11, SE = 0.02, p < 0.001; b_direct = 0.19, SE = 0.05, p < 0.001). With sublime feelings mediating the relationship between rumination and listening to sad music in negative valence, high arousal (angry) situations, a significant indirect and direct effect was found (b_indirect = 0.06, SE = 0.02, p = 0.001; b_direct = 0.16, SE = 0.05, p = 0.001).
Finally, I hypothesized that the relationship between Openness to Experience and
listening to sad music when in positive, high arousal (celebratory) situations may be
mediated by positive feelings of joy and power in response to sad music. However, when
this model was tested, no indirect or direct effects were found (b_indirect = 0.03, SE = 0.02, p > 0.10; b_direct = 0.05, SE = 0.04, p > 0.10).
Figure 6. Mediation model with Fantasy, GEMS, and enjoyment of sad music.
Exploration of the relationship between Fantasy, rumination, and
absorption. The multiple regression analysis showed that Fantasy, rumination, and
absorption were all positively correlated with liking sad music as well as with feeling
sublime emotions in response to sad music. To provide evidence for a theory presented in
previous studies that people with ruminative tendencies have a “maladaptive attraction”
to sad music, whereas people high in Fantasy and trait absorption have an “adaptive
attraction” to sad music (Garrido & Schubert, 2013), I tested the relationship between the
personality traits Fantasy, rumination, and absorption, and reasons for liking sad music.
Specifically, I focused on two reasons: (1) because it prolongs feelings of sadness and (2)
because it releases or purges negative emotions. I first built a multiple regression with all
three personality traits to test for an interaction effect and found a significant negative
interaction between Fantasy and rumination when predicting prolonging feelings of
sadness (b_inter = -0.18, SE = 0.06, p = 0.003), indicating that people with higher rumination and lower Fantasy scores have a stronger positive association with liking sad music because it prolongs feelings of sadness. Both absorption and Fantasy, but not rumination, positively predicted purging negative feelings (absorption: b = 0.12, SE = 0.06, p = 0.04; Fantasy: b = 0.16, SE = 0.06, p = 0.01; rumination: b = 0.001, SE = 0.06, p > 0.10).
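A moderated regression of the kind described above can be specified directly with an interaction term. The following sketch uses the statsmodels formula interface with hypothetical column and file names and is meant only to illustrate the model, not to reproduce the exact analysis.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("sad_music_survey.csv")   # hypothetical file and column names

# "Prolongs my feelings of sadness" endorsement regressed on Fantasy, rumination,
# their product (the interaction of interest), and absorption as a covariate
model = smf.ols("prolong_sadness ~ fantasy * rumination + absorption", data=df).fit()
print(model.params["fantasy:rumination"], model.pvalues["fantasy:rumination"])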
To further understand this relationship, I conducted mediation models to
determine if negative (unease) vs. positive (sublime) feelings mediated the relationship
between rumination and liking sad music because it purges and prolongs feelings of
sadness. With rumination predicting liking sad music because it prolongs feelings of
sadness, a significant indirect and direct effect was found with feelings of unease as a
mediator (b_indirect = 0.07, SE = 0.02, p < 0.001; b_direct = 0.20, SE = 0.06, p = 0.001). With rumination predicting purging negative emotions, significant positive indirect effects were found with both unease feelings as a mediator (b_indirect = 0.05, SE = 0.02, p = 0.004; b_direct = 0.03, SE = 0.06, p > 0.10) as well as with sublime feelings as a mediator (b_indirect = 0.03, SE = 0.02, p = 0.03; b_direct = 0.06, SE = 0.06, p > 0.10; see Figure 7). Interestingly, the opposite effects were shown with regards to the relationship between Fantasy and liking sad music because it purges and prolongs feelings of sadness: no indirect effect of Fantasy on liking sad music because it purges negative emotions was found through sublime feelings (b_indirect = 0.02, SE = 0.01, p = 0.09), whereas a smaller indirect effect was found when mediated by unease feelings (b_indirect = 0.03, SE = 0.01, p = 0.02; b_direct = 0.21, SE = 0.06, p < 0.001). Furthermore, a significant indirect effect was found between Fantasy and liking sad music because it prolongs feelings of sadness with feelings of unease as the mediator (b_indirect = 0.06, SE = 0.02, p = 0.001; b_direct = 0.07, SE = 0.06, p > 0.10; see Figure 8).
Figure 7. Mediation model with rumination, GEMS, and purging negative feelings
Figure 8. Mediation model with Fantasy, GEMS, and prolonging negative feelings
Discussion
The findings from this study support those of previous investigations by showing
that the enjoyment of sad music is related to specific personality measures. They
additionally extend past results by showing how these relationships depend on 1)
situational factors, 2) individual reasons for engaging with sad music, and 3) individual
emotional responses to sad music. In particular, when controlling for other personality
measures, the Fantasy subcomponent of empathy, and a related measure—trait
absorption—uniquely predicted the enjoyment of sad music. Moreover, both personality
measures were additionally correlated with sublime emotions, e.g. peacefulness,
tenderness, and transcendence, in response to sad music. Mediation models additionally
confirmed that Fantasy and absorption are indirectly positively correlated with enjoying
sad music through the activation of sublime feelings. Overall, these results provide
support for previous hypotheses regarding the relationship between Fantasy, absorption,
and the enjoyment of sad music; they show that the act of becoming engrossed and
immersed in works of art that convey sadness engenders strong, positive emotions rather
than negative ones. By including questions regarding individuals’ reasons for engaging with sad music, I could then clarify the rewarding aspects of sad music in high-Fantasy and highly absorptive participants. When controlling for additional personality factors, I show that trait absorption is associated with liking sad music because it produces and regulates strong, positive feelings, whereas Fantasy is associated with liking music
because it can purge and/or prolong negative feelings. Although Fantasy and absorption
are highly-related traits, our results suggest that each might be associated with enjoying
sad music through separate mechanisms. Given that Fantasy relates to a tendency to
become engaged and transported into narratives, whereas absorption is far more general,
it may be that fantasy-prone individuals are more likely to resonate with and experience
the complex mixture of emotions being conveyed through music, thus enabling them to
both purge and prolong negative emotions. This notion is in line with the results from a
previous music-listening study, in which it was shown that Fantasy, but not absorption,
was correlated with being moved by unfamiliar sad music (Eerola, Vuoskoski, &
Kautiainen, 2016).
Interestingly, rumination was also found to be positively associated with feeling
sublime in response to sad music and not, as predicted, with feelings of sadness or
tension. Even more intriguing, rumination was associated with liking sad music for
reasons related to the strengthening of positive emotions, calling into question the
hypothesis proposed by Garrido & Schubert (2013) regarding a “maladaptive attraction”
to sad music in people who tend to ruminate. In Schubert et al. (2018), it was shown that
the relationship between rumination and liking sad music was independent of absorption; the authors interpret this finding as evidence of an adaptive attraction to sad music through absorption and a maladaptive attraction through rumination. In our study, I also
found that absorption and rumination independently contributed to liking sad music;
however, I also found that rumination, absorption, and Fantasy were all linked to feeling
positive emotions in response to sad music. Where these traits differ lies in their
associations with motivational factors. Rumination and Fantasy were both independently
related to liking sad music because it prolongs negative emotions, whereas absorption
was not. Our mediation models did show that the link between rumination and liking sad
music because it prolongs negative emotions was partially mediated by feelings of
unease, providing some evidence for an unhealthy response to sad music. On the other
hand, the positive association between rumination and liking sad music because it purges negative feelings was fully mediated by both sublime and unease feelings. Therefore,
it may be that people who tend to ruminate can at times feel positive emotions and at
other times negative emotions in response to sad music, each serving a different function
when it comes to the reasons for engaging with sad music. The attraction to sad music in
people who ruminate may not necessarily be “maladaptive”.
Additionally, our results show that the specific situations in which people choose to engage with sad music are correlated with unique personality profiles. An exploratory factor analysis found that the 20 possible situations for listening to sad music loaded onto four latent constructs, which roughly map onto the four quadrants of the circumplex space of affect (Russell, 1980). People who score higher on Fantasy are more likely to listen to sad music in sad situations, such as after a breakup, when they feel lonely, or when they feel homesick. I further tested whether this relationship was mediated by feeling positive emotions in response to music and report evidence for a partial mediation
effect. This suggests that people who have a tendency to transport themselves
imaginatively into works of art are able to experience intense aesthetic emotions when
listening to sad music. This is likely to help them cope with or overcome current negative
circumstances.
Furthermore, it was determined that Openness to Experience, one of the Big Five
personality traits that pertains to a tendency to seek out new and stimulating situations, is
positively correlated with listening to sad music in more social, positive situations, such
as with friends, when celebrating, and when being active. Openness to Experience is a
personality trait marked by preferences for variety and novelty, intellectual curiosity, and
sensitivity to aesthetics (McCrae, 2007). Previous studies have shown that people who
score higher in Openness to Experience are more likely to experience intense emotional
responses as well as chills in response to music (Liljestrom, Juslin, & Vastfjall, 2012;
Nusbaum & Silvia, 2011). Furthermore, Openness to Experience has been shown to
predict mixed emotional states, whereby people feel comfortable feeling simultaneously
sad and happy (Barford & Smillie, 2016). It is possible, then, that people high in Openness to Experience enjoy listening to sad music in happy situations because they are
able to appreciate, and maybe even prefer, the experience of mixed emotional states and
therefore seek out such experiences in daily life. While I did not find a significant
mediating effect of emotional responses on the relationship between Openness to
Experience and listening to music in celebratory situations, it may be that our measure of
emotional responses failed to capture the potential for feeling contrasting emotions
simultaneously. Further exploration will be needed to clarify the relationship between
Openness to Experience and the function of sad music.
With regards to the empathy subscale Empathic Concern, I replicated previous
findings in showing that this trait was positively correlated with negative feelings in
response to sad music (sadness and tension) and not positively related to liking sad music
(Taruffi & Koelsch, 2014). Furthermore, I showed that Empathic Concern was negatively
correlated with listening to sad music in positive, low arousing situations, such as when
one feels relaxed, when in contact with nature, and when one is in a good mood. The results suggest
that the subcomponent of emotional empathy, in which people tend to feel concern for
others, results in stronger negative feelings in response to sad music that are not
particularly beneficial, making it unlikely that they would choose to listen to such music
when they are trying to relax or maintain a positive mood.
I did not find a relationship between liking sad music and the Perspective Taking
subscale of empathy, as has been shown in some previous studies (Kawakami &
Katahira, 2015; Taruffi & Koelsch, 2014). However, I did find an association between
Perspective Taking and particular reasons for engaging with sad music. Specifically,
individuals who scored higher in Perspective Taking reported enjoying sad music
because it conferred psychological benefits related to understanding one’s emotions and the emotions of others. Our results provide one possible explanation for the inconsistencies
in previous studies on the link between empathy and enjoying sad music, by suggesting
that people with an ability to adopt the psychological perspectives of others can find sad
music enjoyable, particularly when it enables them to process their own feelings through
the act of connecting with the feelings of others.
In sum, our findings suggest that various measures of personality are linked to the
enjoyment of sad music through separate mechanisms. These mechanisms depend on
emotional responses to sad music and on situational factors, suggesting that a complex relationship between individual differences, learned associations, and the external environment is involved in the appreciation of art that conveys negative emotions. Being
able to predict when and how negative affect will produce pleasure improves our
understanding of music’s ability to influence mood and behavior and will help expand the
role that music can play in everyday life as well as in therapeutic settings.
CHAPTER 4
Synchronized brain activity reflects the enjoyment of sad music over time
Introduction
Study 3 incorporates findings presented in study 2 to assess the time-varying patterns of
neural activity associated with changing emotional experiences to full-length pieces of
sad music. Over the last several decades, neuroscientists have provided evidence that numerous brain regions, which serve a variety of functions not exclusively related to affect, are involved in representing a diverse range of emotional states (Lindquist et al., 2012). This has led to the hypothesis that the resplendent tapestry of emotions that we experience
emerges from communication among brain regions that tend to function together, such as
the amygdala, ventral striatum, insula, anterior cingulate cortex (ACC), thalamus,
hypothalamus, and the orbitofrontal cortex (OFC), as well as regions typically thought to be part of the default mode network (DMN), i.e. the posterior cingulate and medial
prefrontal cortex (MPFC; Kober et al., 2008). The traditional task-based neuroimaging
studies that inform this theory tend to use static affective stimuli, such as pictures, faces,
smells, sounds, and short film or music clips (Phan, Wager, Taylor, & Liberzon, 2002;
Schirmer & Adolphs, 2017). Because of this, our understanding of the patterns of co-
functioning and neural communication that generate various aspects of affect, such as the
recognition of emotion, the subjective experience of that emotion, and subsequent
feelings of pleasure or displeasure, is limited.
Music is an ideal stimulus to use to answer key questions related to the dynamic
aspect of emotions and feelings. While basic emotions, such as happiness, sadness, fear,
and anger, can be perceived within musical structure (Balkwill & Thompson, 1999; Fritz
et al., 2009), because of its temporal nature, music can also induce a wide range of feelings, from the everyday to the aesthetic (Zentner et al., 2008), that morph and evolve
over time (Brattico et al., 2013).
Moreover, one can perceive an emotion in music that is distinct from the emotion
that is experienced (Gabrielsson, 2001). Emotional responses to sad music provide an
intuitive and elegant exhibition of this disconnect. While listening to sad music, people
often recognize that the piece is conveying negative emotions, but do not necessarily
report feeling a negative emotion; in fact, they often report feeling positive emotions such
as enjoyment (Sachs et al., 2015). Not everyone enjoys listening to sad music, though,
and recent evidence suggests that personality traits, like the Fantasy component of
empathy, may modulate one’s emotional response to negative valence when expressed
through music (Eerola, Vuoskoski, & Kautiainen, 2016; Kawakami & Katahira, 2015;
Sachs et al. under review). This raises the question of whether separate neural
mechanisms are engaged during the perception of an emotion, the experience of that
emotion, and the overall hedonic experience (Brattico et al., 2013). It has been proposed
that subjective feelings and intensity of emotional responses involve regions of the brain
known for viscerosensory processing, such as the insula, ACC, amygdala, and striatum,
whereas enjoyment, pleasure, and aesthetic appreciation involve regions that monitor and
integrate cognitive and sensory information, such as the orbitofrontal cortex and posterior
cingulate (Brattico et al., 2013). However, this hypothesis has not yet been empirically
tested. Assessing the brain regions that are altered in response to full-length pieces of
music that convey complex emotions like pleasurable sadness can help unravel the
unique role that the spatial and temporal patterns of neural activity serve in representing
affective experiences.
One limitation with previous fMRI studies, which may have contributed to this gap
in our understanding of emotional responses to music, is that traditional experimental
designs typically require time-locked and repeated presentations of mostly static, or
relatively short, stimuli in order to model the expected hemodynamic response. While
these models provide strong experimental control and produce reliable BOLD responses,
the constraints of these designs also make it difficult to assess responses to more dynamic
stimuli that convey emotions over a longer period of time and are therefore more akin to
the types of experiences we have in everyday life, outside of the lab. To gain a more
complete picture of the involvement of various brain networks in the neural basis of
feelings, it becomes necessary to capture the time-varying patterns of the fMRI signal in
response to more ecologically-valid stimuli.
Recently, a variety of analytical techniques have been developed that do not impose
an expected model on the data and therefore allow investigators to image activity from
the brain continuously during the presentation of more realistic and naturalistic stimuli,
such as full-length movies and pieces of music. These analyses involve calculating
correlations between the neural signal across a group of participants’ brains and matching
moments of high correlation with events in the stimulus, allowing researchers to link a
unified, collective experience to changes in neural signal over time. One such method,
termed intersubject correlation (ISC), for instance, involves calculating the correlation in
the BOLD signal between all participants one voxel at a time (Hasson, Malach, &
Heeger, 2010). The advantage of this approach is that idiosyncratic fluctuations in BOLD
signal that are unrelated to the naturalistic stimulus are not synchronized and therefore
voxels that show highly correlated activity across people are assumed to be stimulus-
driven: no explicit modelling of the BOLD signal according to the onset of some external
event is therefore required.
ISC has been used previously to identify neural patterns in response to music
(Trost, Frühholz, Cochrane, Cojan, & Vuilleumier, 2015), films (Hasson, Nir, Levy,
Fuhrmann, & Malach, 2004), and narrative stories (Nummenmaa et al., 2014). The
synchronized patterns that emerge have been shown to be time-locked to the events of the
stimulus, highly reliable across multiple scanning sessions with different stimuli, and
identifiable in brain regions that may not be significantly activated when averaging across
participants, as is done with more traditional neuroimaging procedures (Hasson et al.,
2010). It also has been employed to uncover neural differences that may account for the
ways that individuals process complex, socioemotional stimuli. For example, by
comparing ISC in response to a movie between a cohort of individuals with autism
spectrum disorder (ASD) and typically-developed individuals, researchers found that
synchronization of activity in primary sensory and associated areas was more variable
and less reliable in the ASD group, which may underlie some of the atypical socio-emotional behaviors associated with this developmental disorder (Hasson et al., 2009).
While it is often concluded from these types of studies that the regions that show
high synchronization in response to a stimulus are somehow involved in its processing, it is still unclear from this analysis alone which aspects of the stimulus these patterns are tracking. To begin to understand the meaning behind these synchronized brain
patterns, it becomes necessary to identify how these patterns change in tandem with
certain aspects of the stimulus. Several methods have been developed for assessing the
dynamic quality of synchronization. Sliding window ISC, for instance, involves
calculating correlations across individuals in short, temporal windows that shift in time
by a pre-specified “step” size. What results is a continuous measure, with each point
representing the synchronization of a particular voxel or region in that window of time,
which can be subsequently correlated with an additional continuous measure that reflects
a psychological or behavioral concept of interest. This approach has been used previously
to identify time-varying patterns of activity that map onto changes in expressed valence
in music over time (Trost et al., 2015). In this study, it was shown that the regions that
show high ISC, i.e. the ACC, amygdala, and caudate, also show activation patterns that
mirror the continuous ratings of negative valence; that is, as negative emotions increased
in response to the music, so did activation in these subcortical regions. The dynamic
method provides a necessary balance between stimulus-dependent and stimulus-free
analyses without requiring strong assumptions with regards to the shape, amplitudes, and
timescale of the expected neural responses.
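For a single voxel or region, the sliding-window variant can be written compactly as a windowed leave-one-out correlation. The window and step sizes below (in TRs) and the array shapes are illustrative assumptions, not the settings of any particular published analysis; the resulting series could then be correlated with a continuous behavioral rating sampled at the same points.

import numpy as np

def sliding_window_isc(data, win=30, step=5):
    """data: subjects x timepoints array for one voxel/ROI; returns one ISC per window."""
    n_sub, n_tp = data.shape
    starts = list(range(0, n_tp - win + 1, step))
    isc = np.zeros(len(starts))
    for w, start in enumerate(starts):
        seg = data[:, start:start + win]
        # Correlate each subject's windowed series with the mean of the remaining subjects
        r = [np.corrcoef(seg[i], seg[np.arange(n_sub) != i].mean(axis=0))[0, 1]
             for i in range(n_sub)]
        isc[w] = np.tanh(np.mean(np.arctanh(r)))   # average in Fisher-z space
    return isc

# Example with simulated data: 20 subjects, 480 TRs
windowed_isc = sliding_window_isc(np.random.randn(20, 480))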
One challenge with regards to the sliding window approach is the choice of
appropriate window and step size for analysis. Previous studies have shown that different
window sizes have profound effects on the temporal structure of the synchronization
patterns (Shakil, Lee, & Dawn, 2016). An alternative that was recently developed to
avoid this issue involves comparing two signals by first statistically separating the
instantaneous amplitude from the phasic information with signal processing filters, and
then comparing only the phasic component across people at each moment in time
(Glerean, Salmi, Lahnakoski, Jääskeläinen, & Sams, 2012). Because this method
measures intersubject similarity in phasic rather than in correlational terms, it can be
calculated at each volume and thus, no averaging across windows is necessary.
Intersubject phase synchronization (ISPS), as it is called, has been shown to be more sensitive than sliding-window ISC at revealing synchronization effects (Nummenmaa, Smirnov, et al., 2014). For example, in this paper, continuous self-report ratings of valence during the presentation of 45-s narratives were associated with ISPS in the thalamus, anterior cingulate, lateral prefrontal, and orbitofrontal cortices; an association was not found between the ratings and sliding-window ISC in these regions.
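The phase-based approach can be sketched as follows: band-pass filter each subject's time course, extract the instantaneous phase with the Hilbert transform, and summarize phase agreement across subjects at every timepoint. The 0.04-0.07 Hz band, the TR, and the resultant-vector summary below are illustrative assumptions following the general logic of Glerean et al. (2012), not a re-implementation of their exact pipeline.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def intersubject_phase_synchrony(data, tr=1.0, band=(0.04, 0.07)):
    """data: subjects x timepoints for one voxel/ROI; returns synchrony per timepoint."""
    nyquist = 0.5 / tr
    b, a = butter(2, [band[0] / nyquist, band[1] / nyquist], btype="band")
    filtered = filtfilt(b, a, data, axis=1)                  # narrow-band signal
    phases = np.angle(hilbert(filtered, axis=1))             # instantaneous phase
    # Length of the mean resultant vector across subjects: 1 = perfect phase alignment
    return np.abs(np.mean(np.exp(1j * phases), axis=0))

isps = intersubject_phase_synchrony(np.random.randn(20, 480))   # one value per TR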
In light of evidence suggesting that the brain is organized into functional network
hubs (Yeo et al., 2011), it is highly likely that the dynamics of feelings are additionally
represented by time-varying patterns of communication between brain regions
(Touroutoglou, Lindquist, Dickerson, & Barrett, 2014). Indeed, recent dynamic
functional connectivity studies with naturalistic stimuli have shown that connectivity
between regions involved in affect, including the insula, putamen, ACC, OFC, amygdala,
and striatum, were predicted by changes in affective experience (Raz et al., 2016; Singer,
Jacoby, et al., 2016). Furthermore, the patterns of communication between regions may
reflect separable aspects of affective processing. A recent functional connectivity study
showed that attentional resources allocated to emotional stimuli were associated with
connections between the dorsal anterior insula and dorsal ACC, whereas momentary
experiences of arousal in response to the stimuli were associated with connections
between the ventral anterior insula and ACC (Touroutoglou, Hollenbeck, Dickerson, &
Feldman, 2012).
Both dynamic ISC and ISPS can be adapted to assess stimulus-driven inter-
regional changes by correlating the signal (or comparing the phasic component of the
signal, as is the case with ISPS) of a single participant’s brain region with the signal in all
other regions of all other participants. This approach, termed intersubject functional
connectivity (ISFC; Simony et al., 2016), has been shown to more fully remove the
effects of intrinsic activity on functional connectivity, as compared to averaging across
participants (D. Kim, Kay, Shulman, & Corbetta, 2018).
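Schematically, ISFC can be computed by correlating each subject's ROI time courses with the leave-one-out group average of every ROI, averaging the Fisher-transformed values across subjects, and symmetrizing the result; the diagonal of the resulting matrix reduces to ordinary ISC. The array shapes and the clipping step below are illustrative assumptions, not the code used in this study (the Methods section describes the actual procedure).

import numpy as np

def isfc(data):
    """data: subjects x ROIs x timepoints; returns a symmetric ROI x ROI ISFC matrix."""
    n_sub, n_roi, _ = data.shape
    z = (data - data.mean(-1, keepdims=True)) / data.std(-1, keepdims=True)
    accumulated = np.zeros((n_roi, n_roi))
    for s in range(n_sub):
        others = z[np.arange(n_sub) != s].mean(axis=0)        # leave-one-out group average
        r = np.corrcoef(z[s], others)[:n_roi, n_roi:]         # subject ROI i vs. group ROI j
        accumulated += np.arctanh(np.clip(r, -0.999, 0.999))  # Fisher r-to-z
    group = np.tanh(accumulated / n_sub)                      # back-transform the average
    return (group + group.T) / 2                              # impose symmetry

isfc_matrix = isfc(np.random.randn(20, 48, 480))              # diagonal ~ per-ROI ISC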
While studies that utilize these data-driven, multivariate methods have clarified
the role of specific brain regions in socioemotional processing, few of them have been employed to specifically investigate the dynamic quality of complex feeling states that
are likely to emerge in response to music. In particular, the brain regions involved in
representing the intensity of an emotion as well as the pleasantness or unpleasantness of
that experienced emotion, which can be distinct in response to something like music, have yet to be addressed. Answering this question will likely reveal a more nuanced picture of
the interactive quality of the neural mechanisms that underlie our fluctuations in mood,
motivations, wants, desires, and behaviors.
The current study. This study aimed to assess the time-varying patterns of neural
activity and connectivity associated with changing emotional experiences to naturalistic
pieces of music. To date, few neuroimaging studies have evaluated how different aspects
of emotional responses emerge in the brain and develop over time. Furthermore, little
attention has been paid to separating the subjective feeling of an emotion from the
subsequent feelings of pleasantness or unpleasantness about that emotion. By specifically
using a piece of music that is negatively valent and yet enjoyable, as well as more
advanced model-free analytical techniques, I aimed to uncover stimulus-driven patterns
of neural activity and connectivity that are driven by two related, yet distinct aspects of
emotional experience: feelings of sadness and feelings of enjoyment.
I first assessed how continuous subjective ratings of emotional experience and
enjoyment covary during the presentation of a full-length piece of music that conveys
sadness. Using ISC, I then evaluated neural synchronization of brain activity across
participants as they were listening to a sad piece of music and tested how the Fantasy subcomponent of empathy, a trait known to be related to emotional responding to sad
music (Eerola et al., 2016; Kawakami & Katahira, 2015), modulates the resulting ISC
maps. I next evaluated the temporal dimension of the musically-driven neural
synchronization by calculating intersubject phase synchronization at each moment in
time. Continuous ratings of emotional experience were used to assess how changes in
intensity of sadness and enjoyment predict time-varying patterns of intra-regional and
inter-regional synchronization.
To our knowledge, no study to date has assessed how the intensity of feelings of
both sadness and associated enjoyment are related to stimulus-driven patterns of brain
activation. Findings from previous univariate neuroimaging studies have suggested that
the mechanisms involved in hedonic experiences may be distinct in different affective
contexts. Enjoyment of art associated with positive vs. negative emotions has been shown
to activate different brain regions (Brattico et al., 2016; Ishizu & Zeki, 2017; McPherson,
Barrett, Lopez-Gonzalez, Jiradejvong, & Limb, 2016; Wilson-Mendenhall, Barrett, &
Barsalou, 2015). These results, among others, provide a working hypothesis with regards
to the brain regions and systems that we might expect to be involved in processing
emotional intensity and enjoyment in music. I predicted that listening to a sad,
aesthetically-pleasing piece of music would be associated with neural synchronization of
regions within the auditory cortex as well as in regions involved in processing a broader
range of affective stimuli, including the striatum, insula, ACC, and OFC. I further
predicted that trait measures of empathy would modulate the patterns of synchronization
in these regions, with high-Fantasy participants displaying greater ISC in cortical midline
structures known to be involved in resonating with emotions observed in others (Fan,
Duncan, de Greck, & Northoff, 2011). Finally, I predicted that moments within the music
that are rated as highly enjoyable would be associated with time-varying patterns of
synchronization within and between regions involved in reward monitoring, reward
prediction, and aesthetic judgment, such as regions typically considered to be part of the
DMN (MPFC, precuneus, posterior cingulate) and the OFC, whereas moments within the
piece that are rated as highly sad would be associated with time-varying patterns of
synchronization within and between regions involved in viscerosensory processing of
emotion, such as the insula, striatum, amygdala, and cingulate.
Methods
All experimental procedures were approved by the USC Institutional Review Board. All
participants gave informed consent and were compensated, either monetarily or for
course credit, for participating in the study.
Stimuli selection. To find pieces of music that reliably induced the intended
emotion of sadness, I started by exploring online music streaming sites, such as Spotify
and Last.fm, as well as social media sites such as Reddit and Twitter, for social tags with the word “sad” as well as its synonyms. As a comparison set, I additionally searched for pieces of
music that reliably conveyed happiness, using social tags with the word “happy” and its
synonyms. This resulted in a list of 120 pieces (60 sad and 60 happy with a mix of lyrics
and non-lyrics). Eight coders then listened to 60-second clips extracted from these pieces and rated whether each clip conveyed happiness or sadness. All clips for which 75% (6/8) of
coders agreed on the intended emotion were then used in an online survey that asked
participants to rate how much they enjoyed the clip, what emotion they feel in response to
the clip (sadness, happiness, calmness, anxiousness, boredom), and how familiar they
were with the clip, using a 5-point Likert scale. The survey was completed by 82 adult
participants via Amazon’s Mechanical Turk. The final survey included 27 pieces of
music, though because I believed that listening to 27 1-minute clips of music would have
been too cognitively demanding, each participant was only presented with 12 clips
selected at random. The number of presentations of each clip was counterbalanced across
participants.
Based on the results from the survey, I excluded pieces that were rated as highly
familiar and selected the pieces that had the highest ratings of the intended emotion
(sadness or happiness) and low ratings on all other emotions. To avoid any potential
confounds associated with semantic information conveyed through the lyrics of a song
(Brattico et al., 2011), I additionally decided to only select pieces that were instrumental.
This resulted in three pieces of music to be used during neuroimaging: (1) one longer
piece that reliably induced sadness without lyrics [Discovery of the Camp by Michael
Kamen, performed by the London Metropolitan Orchestra, a classical piece written for
the HBO miniseries Band of Brothers], (2) one shorter piece that reliably induced sadness without lyrics [Frysta by Ólafur Arnalds, an ambient piece written for piano and strings],
and (3) one piece that reliably induced happiness without lyrics [Race against the sunset
by Lullatone, a pop song featuring the ukulele and bells]. While the focus of this paper is
on the responses to the longer piece of sad music, I collected behavioral and neural
responses to this second, shorter sad piece of music in order to be able to validate and
assess the generalizability of potential results. Additionally, I collected responses to a piece of happy music so that I could assess whether similar or distinct patterns of neural synchronization emerge with enjoyment in a happy emotional context. The methods and
results presented here, however, focus on responses to the longer, sad piece of music
only.
Pre-scanning behavioral study. To assess the feasibility and reliability of
continuous ratings in response to the three selected pieces, before the commencement of the fMRI portion of the study, a group of 51 healthy adult participants were recruited
from the USC undergraduate population to participate in a behavioral version of the fMRI
paradigm. During this study, participants were instructed to listen attentively to the three
full-length pieces described above and to simultaneously monitor changes in their
affective experience and report these changes using a fader with a sliding scale. Each
participant listened to each piece twice in two separate conditions. In one condition,
participants continuously reported the intensity of their felt emotional response, from 0-
not at all sad/happy to 10-extremely sad/happy. Participants were only asked to rate the
intensity of their feelings of sadness for the pieces that were intended to convey sadness
and happiness for the pieces that were intended to convey happiness and never both. In
the second condition, participants continuously rated their momentary feelings of
enjoyment from 0-not enjoyable to 10-extremely enjoyable. As participants moved the
fader, a real-time visualization of their responses was presented on a computer screen via Psychtoolbox for MATLAB (Kleiner et al., 2007). The order of ratings was
counterbalanced across participants. During all music-listening trials, heart rate and skin
conductance were continuously monitored and collected using a BIOPAC MP150
system.
fMRI participant selection. Adult, right-handed participants (N = 40, 21 female, M_age = 24.1, SD = 6.24) were recruited from the greater Los Angeles community based
on responses to the online survey in which they listened to a 60s clip of the three final
pieces. Only participants who were not familiar with the pieces of music and reported
feeling the intended emotion in response to the clip (either happiness or sadness) were
invited to participate in the fMRI portion of the study. All participants had self-reported
normal hearing and normal or corrected-to-normal vision with no history of neurological
or psychiatric disorders.
fMRI scanning procedure. All data was collected at the University of Southern
California Dana and David Dornsife Neuroimaging Center. After participants gave
consent, they completed a questionnaire in which they reported their current emotional
state using the Positive and Negative Affect Schedule (PANAS; Watson, Clark, &
Tellegen, 1988).
During scanning, participants listened to full-length recordings of the three pieces
of music, the order of which was counterbalanced across participants. As the music
played, participants were instructed to pay attention to the music, to lie as still as
possible, and to keep their eyes open and focused on a fixation point continually
presented on a screen. An eye-tracking camera was monitored to ensure that the
participants were awake and alert during scanning. The auditory stimuli were presented
through MR-compatible OPTOACTIVE headphones with noise-cancellation
(Optoacoustics). Skin conductance, respiration rate, and pulse were simultaneously
recorded during scanning (see MRI data acquisition section for additional details).
At the conclusion of each piece, participants were presented with a series of
questions on the screen and were instructed to respond using an MRI-safe button box
with four buttons. Specifically, participants were asked how much attention they paid to
the music, how intense was their emotional reaction to the piece, how much they enjoyed
the music, and how familiar they were with the piece. The three pieces were presented in
three separate functional runs with a brief rest period between runs. After music listening,
a five-minute resting scan was additionally collected.
Post-scanning continuous ratings. Immediately following scanning, continuous
ratings of the intensity of felt emotions and enjoyment were collected while listening to
the same pieces of music. I chose to have participants rate their experience with the music
outside of the scanner to avoid the influence of the task on the neural responses to the
stimuli (Taylor, Phan, Decker, & Liberzon, 2003). This data was collected exactly as
described above in the pre-fMRI behavioral study. Briefly, all participants listened to
each piece twice and continuously rated (1) the intensity of felt sadness or happiness and
(2) the intensity of their feelings of enjoyment using a fader connected to a laptop
computer. The order of pieces and conditions was completely randomized.
Survey measures. After the continuous music ratings, participants completed an
online survey designed to assess their musical background and experience (Goldsmiths Musical Sophistication Index; Müllensiefen, Gingras, Musil, & Stewart, 2014) and
general aesthetic responses to music (Aesthetic Experience Scale in Music; Sachs, Ellis,
Schlaug, & Loui, 2016). Additionally, the survey included a trait measure of empathy
(Interpersonal Reactivity Index; Davis, 1983), a measure of depressive symptomology
and severity (PHQ-9; Martin, Rief, Klaiberg, & Braehler, 2006), and a measure of
generalized anxiety disorder severity (GAD-7; Löwe et al., 2008). General enjoyment of
sad music was also evaluated using a modified version of the Like Sad Music Scale
(Garrido & Schubert, 2013).
MRI data acquisition. Imaging was conducted using a 3-T Siemens
MAGNETOM Trio System with a 32-channel matrix head coil at the Dana and David
Dornsife Neuroscience Institute at the University of Southern California. Functional
images were acquired using a multiband, gradient-echo, echo-planar, T2*-weighted pulse sequence (TR = 1000 ms, TE = 25 ms, flip angle = 90°, 64 × 64 matrix). Forty
slices covering the entire brain were acquired with a voxel resolution of 3.0 × 3.0 × 3.0
mm with no interslice gap. A T1-weighted high-resolution (1 × 1 × 1 mm) image was
also obtained using a three-dimensional magnetization-prepared rapid acquisition
gradient (MPRAGE) sequence (TR = 2530 ms, TE = 3.09 ms, flip angle = 10°, 256 × 256
matrix). Two hundred and eight coronal slices covering the entire brain were acquired with a voxel resolution of 1 × 1 × 1 mm.
The BIOPAC MP150 system was used to collect simultaneous psychophysiological
measure during scanning. Skin conductance responses (SCRs) were acquired with two
MRI-compatible electrodes placed on the participant's left foot. Heart rate was measured
indirectly by recording pulse using the BIOPAC TSD200 pulse plethysmogram
transducer, which records the blood volume pulse waveform optically. The pulse
transducer was placed on the participant's left index finger. Respiration was measured
using a BIOPAC TSD201 respiratory-effort transducer attached to an elastic respiratory
belt, which was placed just below the participant’s sternum and measured changes in
thoracic expansion and contraction during breathing. All physiological signals were
sampled simultaneously at 1 kHz using RSP100C and PPG100C amplifiers for
respiration and pulse, respectively, and were recorded using BIOPAC AcqKnowledge
software (version 4.1.1). The SCR electrodes were connected to a grounded RF filter.
MRI Preprocessing. Functional data were preprocessed using FSL
(www.fmrib.ox.ac.uk/fsl). Steps included correction for head motion (MCFLIRT) and
slice-acquisition time, spatial smoothing (5mm FWHM Gaussian kernel), and high-pass
temporal filtering (0.005Hz, 128s period). To account for non-neural fluctuations in BOLD,
mean signal changes in white matter (WM) and cerebrospinal fluid (CSF) were regressed
out of each voxel using linear regression. Additional motion scrubbing was conducted to
regress out abrupt sharp changes in the signal using framewise displacement as a measure
of how much the brain moved from one data-point to the next and a cutoff of 0.50
(Power, Barnes, Snyder, Schlaggar, & Petersen, 2012). Additional artifact filtering was
conducted using an ICA-based procedure for removal of non-neuronal variance from
BOLD data (AROMA; Pruim, Mennes, Buitelaar, & Beckmann, 2015). After all filtering,
the preprocessed data were nonlinearly warped to a standard anatomical (MNI152) brain.
Finally, to account for the initial response to the onset of the music, the first 20 seconds
of the pieces of music were deleted before further analyses.
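For reference, framewise displacement in the sense of Power et al. (2012) can be computed from the six realignment parameters as the sum of their absolute backward differences, with rotations converted to millimeters on a 50 mm sphere. The sketch below assumes MCFLIRT's parameter ordering (three rotations in radians, then three translations in mm) and a 0.50 mm flagging threshold; the file name and the unit of the cutoff are assumptions.

import numpy as np

motion = np.loadtxt("mcflirt_motion.par")            # columns: 3 rotations (rad), 3 translations (mm)
rotations, translations = motion[:, :3], motion[:, 3:]

d_rot = np.abs(np.diff(rotations, axis=0)) * 50.0    # radians -> mm on a 50 mm radius
d_trans = np.abs(np.diff(translations, axis=0))
fd = np.concatenate([[0.0], (d_rot + d_trans).sum(axis=1)])

flagged_volumes = np.where(fd > 0.50)[0]             # volumes to scrub / regress out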
Analysis of continuous affective ratings. Continuous affective ratings were
sampled at 10Hz. To align the time series across participants, spline interpolation was used to align time points to the nearest 10th of a second. Ratings were then downsampled
to match the TR of the brain data (1Hz). The interpolated, downsampled ratings were
demeaned and averaged across participants for each piece and each condition (emotion or
enjoyment) separately. Intersubject coherence in the enjoyment and emotion ratings were
calculated by correlating each participant’s ratings with the average rating of the rest of
the group. Any participant whose intersubject rating was less than 2 standard deviations
below the average intersubject rating was removed from subsequent analyses. This
resulted in the removal of two participants.
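A schematic version of this rating pipeline (interpolation onto a common time base, downsampling to the TR, demeaning, and a leave-one-out coherence check with a 2-SD exclusion rule) is given below; the interpolation order, array shapes, and variable names are illustrative assumptions rather than the exact code used here.

import numpy as np
from scipy.interpolate import interp1d

def resample_rating(sample_times, values, duration, tr=1.0):
    """Interpolate one participant's raw fader samples onto a 1 Hz (TR) grid and demean."""
    grid = np.arange(0, duration, tr)
    f = interp1d(sample_times, values, kind="cubic",
                 bounds_error=False, fill_value="extrapolate")
    resampled = f(grid)
    return resampled - resampled.mean()

def intersubject_coherence(ratings):
    """ratings: participants x timepoints; correlate each rater with the rest of the group."""
    n = ratings.shape[0]
    return np.array([np.corrcoef(ratings[i],
                                 ratings[np.arange(n) != i].mean(axis=0))[0, 1]
                     for i in range(n)])

ratings = np.random.randn(40, 480)                           # placeholder participants x seconds
coherence = intersubject_coherence(ratings)
keep = coherence >= coherence.mean() - 2 * coherence.std()   # drop low-coherence raters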
Continuous measures of acoustic and musical features. To evaluate how
changes in affect and enjoyment were related to specific aspects of the stimuli,
continuous measures of acoustical aspects of the music were calculated using the
MIRtoolbox in Matlab (Lartillot & Toiviainen, 2007). A total of 20 features were
extracted that capture and quantify musical characteristics related to volume, tempo,
tonality, timbre, and rhythm (see Appendix I). The acoustical features were extracted
using a sliding window with a duration of 50ms and a step size of 25ms (Alluri et al.,
2013). All features were extracted using MIRtoolbox’s functions with default parameters,
except for key strength, which was taken as the maximum value of the 24-dimensional
output vector from MIR Toolbox’s key_strength function. In addition to these, Mel-
Frequency Cepstral Coefficients (MFCC) were calculated using the mfcc function in
Matlab’s Audio Toolbox, using a Hamming window, pre-emphasis coefficient of .97,
frequency range of 100-6400 Hz, 20 filterbank channels, and 22 liftering parameters.
MFCCs (13 in total) are a commonly used auditory feature that provide timbral
information (Guan, Chen, & Yang, 2012). Finally, compression ratio, a lower-level sonic
feature related to the complexity of the sound source, was computed by taking the ratio
between the file size of each window’s wav format and that same window’s mp3 format,
after conversion with ffmpeg.
To compare features with the continuous ratings, the ratings were resampled to
40Hz. To compare features to the fMRI data, all extracted acoustic time series were
downsampled to match the sampling rate of fMRI data (1 Hz). Autoregressive models
with L1-regularization (Lasso) were used to predict ratings and brain synchronization in
auditory cortex based on continuous measures of acoustic features. Such models more
accurately account for the autocorrelation that is characteristic of the time series data
(Dean & Bailes, 2010). Models were trained on 80% of the data and tested on both the
first and last 20%, resulting in two “folds”. Features with statistically-significant
coefficients (p < .01) in both folds of the best-performing model were determined to be
important for predicting responses.
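The autoregressive lasso described above can be sketched by augmenting the acoustic feature matrix with lagged copies of the outcome series and fitting an L1-penalized model on the training portion of the timepoints. The lag order, feature dimensions, and the single 80/20 split below are illustrative simplifications of the two-fold scheme described in the text, with placeholder data in place of the extracted features.

import numpy as np
from sklearn.linear_model import LassoCV

def lagged_columns(series, n_lags):
    """Stack lagged copies of a 1-D series (t-1 ... t-n_lags) and trim the wrapped rows."""
    return np.column_stack([np.roll(series, k) for k in range(1, n_lags + 1)])[n_lags:]

features = np.random.randn(480, 34)          # placeholder timepoints x acoustic features (1 Hz)
ratings = np.random.randn(480)               # placeholder continuous enjoyment ratings (1 Hz)

n_lags = 3
X = np.hstack([features[n_lags:], lagged_columns(ratings, n_lags)])
y = ratings[n_lags:]

split = int(0.8 * len(y))                    # train on the first 80% of timepoints
model = LassoCV(cv=5).fit(X[:split], y[:split])
print(model.score(X[split:], y[split:]))     # out-of-sample R^2 on the held-out 20%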
fMRI Data Analysis
Whole-brain intersubject correlation. Whole-brain intersubject correlation (ISC)
of the fMRI data was calculated using the ISC toolbox (isc-toolbox, release 1.1) developed by Kauppi (2010). Using Pearson’s correlation coefficient, temporal correlations in every voxel of the brain between every pair of subjects were calculated and
then averaged to create a group-level ISC map. Global ISC maps were computed for each
of the pieces separately. Statistical significance of the ISC maps was calculated based on
nonparametric voxelwise permutation tests (Pajula, Kauppi, & Tohka, 2012). To generate
an appropriate null distribution, each participant’s voxel time series was circularly shifted
by a random lag so that they were no longer aligned in time across the subjects, yet the
temporal autocorrelations of the BOLD signal were preserved, and the correlation
coefficient was recalculated. This process was repeated 100,000 times. P-values of the r
statistic from the non-shuffled data were then calculated based on the permutation
distribution and corrected for multiple comparisons with Benjamini–Hochberg False
discovery rate (FDR; Kauppi, Pajula, & Tohka, 2014).
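For a single voxel, the pairwise ISC and its circular-shift permutation test can be written as follows; the array shapes and the number of permutations are illustrative, and the actual analysis was run over all voxels with the toolbox cited above.

import numpy as np
from itertools import combinations

def pairwise_isc(data):
    """data: subjects x timepoints for one voxel; returns the mean pairwise Pearson r."""
    return np.mean([np.corrcoef(data[i], data[j])[0, 1]
                    for i, j in combinations(range(len(data)), 2)])

def isc_permutation_test(data, n_perm=1000, seed=0):
    """Circular-shift null: randomly rotate each subject's series, keeping autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = pairwise_isc(data)
    null = np.empty(n_perm)
    for p in range(n_perm):
        shifted = np.array([np.roll(ts, rng.integers(1, ts.size)) for ts in data])
        null[p] = pairwise_isc(shifted)
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value

r_obs, p_val = isc_permutation_test(np.random.randn(20, 480))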
Intersubject functional connectivity. To minimize computational costs and
increase predictive power, I limited this analysis to pre-defined regions of interest that I
hypothesized to be involved in affective experiences with musical stimuli. Often in the
literature, these regions are grouped together into functional networks, due to the
observation that they are co-active at rest and during a variety of emotional tasks
(Touroutoglou et al., 2014). However, the names given to these “large-scale intrinsic
networks”, as well as the exact regions that comprise them, vary widely across
publications. Furthermore, while the existence of some of these networks appears to be
reliable and replicable across studies (Yeo et al., 2011), there is evidence that the
connectivity patterns of a region, and therefore its functional grouping, can change
depending on task demands (Touroutoglou et al., 2012). To avoid presuming
the function of these networks and regions, I use network definitions as a guide to select
ROIs, but do not limit our analyses to pre-defined networks only and do not average data
across pre-supposed networks.
To parcellate the brain into regions of interest, cortical surface extraction was first
performed on each participant’s anatomical image using FreeSurfer, and the resulting image
was segmented into cortical and subcortical regions. Functional data was then warped to
anatomical space and segmented based on the cortical and subcortical reconstruction
using probabilistic automatic labeling (Destrieux, Fischl, Dale, & Halgren, 2010; Fischl
et al., 2002).
In total, 48 ROIs were included. A full list of ROIs as well as their coordinates is
available in Table 9. These included regions typically considered to be part of the limbic
and paralimbic system (caudate, putamen, pallidum, amygdala, hippocampus, thalamus,
and orbitofrontal cortex) and the default mode network (posterior cingulate, precuneus,
inferior parietal lobule/angular gyrus, middle temporal gyrus, and medial prefrontal
cortex; Kober et al., 2008; Shirer, Ryali, Rykhlevskaia, Menon, & Greicius, 2012). I
additionally included regions of the primary and secondary auditory cortex, including
Heschl’s gyrus, the superior, middle, and inferior temporal gyri, and temporal poles.
Time series from the preprocessed, filtered data was extracted from all regions of
interest and z-scored within subjects to zero mean and unit variance. Subject-based ISFC
was calculated as the Pearson correlation between a single subject’s time course in a region
of interest and the average time course of all other subjects. To increase normality of the distribution
of correlation values, Fisher’s r-to-z transformation was applied to each correlation
coefficient before averaging across participants. Averaged z values were then inverse
transformed (z-to-r) to produce a group average ISFC matrix with r-values where the
diagonal is equivalent to intersubject correlation, i.e. the correlation between the same
ROIs across participants. Finally, as in Simony et al. (2016), I averaged the upper and
lower triangles of the group-level matrix to impose symmetry, as the correlation between
two brains should be considered non-directional.
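A compact sketch of this ISFC computation is given below, assuming the ROI time series have already been extracted into a single array; the array shape and function name are illustrative.

```python
# Sketch of intersubject functional connectivity (ISFC): correlate each
# subject's ROI time courses with the average of all other subjects, Fisher
# z-transform, average across subjects, inverse-transform, and symmetrize.
import numpy as np


def isfc(roi_data):
    """roi_data: (n_subjects, n_timepoints, n_rois) -> (n_rois, n_rois) matrix."""
    n_sub, _, n_roi = roi_data.shape
    z_sum = np.zeros((n_roi, n_roi))
    for s in range(n_sub):
        others = np.delete(roi_data, s, axis=0).mean(axis=0)  # leave-one-out mean
        # Correlate each ROI of subject s with each ROI of the others' average.
        full = np.corrcoef(roi_data[s].T, others.T)           # (2*n_roi, 2*n_roi)
        r = full[:n_roi, n_roi:]                              # cross-subject block
        z_sum += np.arctanh(np.clip(r, -0.999999, 0.999999))  # Fisher r-to-z
    r_mean = np.tanh(z_sum / n_sub)                           # average, then z-to-r
    return (r_mean + r_mean.T) / 2                            # impose symmetry
```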
Table 9. Regions of interest used for intersubject synchronization.
Left Hemisphere Right Hemisphere
Region x y z x y z
Heschl's gyrus -42 -19 10 46 −17 10
Middle temporal pole −36 15 −34 44 15 −32
Superior temporal gyrus −53 −21 7 58 −22 7
Superior temporal pole −40 15 −20 48 15 −17
Caudate −11 11 9 15 12 9
Pallidum −18 0 0 21 0 0
Putamen −24 4 2 28 5 2
Angular −44 −61 36 46 −60 39
Inferior parietal lobule −43 −46 47 46 −46 50
Posterior cingulum −5 −43 25 7 −42 22
Precuneus −7 −56 48 10 -56 44
Anterior orbitofrontal −17 47 −13 18 48 −14
Lateral orbitofrontal −31 50 −10 33 53 −11
Medial frontal gyrus −5 49 31 9 51 30
Medial orbitofrontal −5 54 −7 8 52 −7
Rectus gyrus −5 37 −18 8 36 −18
Superior frontal gyrus −18 35 42 22 31 44
Amygdala −23 −1 −17 27 1 −18
Hippocampus −25 −21 −10 29 −20 −10
Thalamus −11 −18 8 13 −18 8
Anterior cingulate −4 35 14 8 37 16
Insula −35 7 3 39 6 2
Mid cingulate −5 −15 42 8 −9 40
ParaHippocampal −21 −16 −21 25 −15 −20
Note: stereotactic coordinates (x,y,z) from MNI atlas
Dynamic ISC with phase synchronization. To avoid issues with selecting an
arbitrary window size and to increase the temporal resolution, I opted to use a recently
developed approach that evaluates BOLD-signal similarity across participants
dynamically by calculating differences in phasic components of the signal at each
moment in time (inter-subject phase synchronization; ISPS). The fMRI Phase
Synchronization toolbox was used to calculate dynamic ISPS (Glerean et al., 2012). The
filtered, preprocessed data was first band-pass filtered between 0.025 Hz (40 s) and 0.09 Hz
(~11 s) because the concept of phase synchronization is meaningful only for narrow-band
signals. Using a Hilbert transform, the instantaneous phase information of the signal was
determined and an intersubject phase coefficient was calculated for each voxel at each
TR by evaluating the phasic difference in the signal for every pair of participants and
averaging. This results in a value between 0 and 1 that represents the degree of phase
synchronization across all participants at that moment in time.
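For illustration, one common way to compute such a phase-synchronization index in Python is sketched below (the band edges and the 1 Hz sampling rate follow the description above). The modulus of the mean phase vector across subjects is a standard summary bounded between 0 and 1; the exact pairwise formula used by the fMRI Phase Synchronization toolbox may differ in detail.

```python
# Sketch of intersubject phase synchronization (ISPS) for one voxel or ROI.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert


def bandpass(x, fs=1.0, low=0.025, high=0.09, order=4):
    """Narrow-band filter, required for a meaningful instantaneous phase."""
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    return filtfilt(b, a, x, axis=-1)


def isps(signals, fs=1.0):
    """signals: (n_subjects, n_timepoints) -> phase synchronization per TR (0..1)."""
    phases = np.angle(hilbert(bandpass(signals, fs=fs), axis=-1))
    return np.abs(np.exp(1j * phases).mean(axis=0))
```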
A similar method was then used to assess dynamic inter-regional
synchronization. Instantaneous seed-based phase synchronization (SBPS) was performed
by calculating the difference between the phasic component of one region of interest in one
participant and the phasic component of all other regions in the other participants.
Averaging across participants resulted in a group-level, time-varying connectivity measure
between each pair of the 48 regions at each TR (Nummenmaa & Lahnakoski, 2018).
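The seed-based variant can be sketched in the same style, assuming an array of instantaneous phases computed per ROI as in the previous snippet; again, the unit-vector averaging below is one plausible reading of the pairwise scheme, not the toolbox's exact formula.

```python
# Sketch of seed-based phase synchronization (SBPS) between ROI pairs.
import numpy as np


def sbps(phases):
    """phases: (n_subjects, n_timepoints, n_rois) instantaneous phases.
    Returns (n_timepoints, n_rois, n_rois) group-level synchronization."""
    n_sub, n_tp, n_roi = phases.shape
    vec = np.zeros((n_tp, n_roi, n_roi), dtype=complex)
    n_pairs = 0
    for s in range(n_sub):
        for o in range(n_sub):
            if o == s:
                continue
            # Phase difference between ROI i of subject s and ROI j of subject o.
            diff = phases[s][:, :, None] - phases[o][:, None, :]
            vec += np.exp(1j * diff)
            n_pairs += 1
    return np.abs(vec / n_pairs)  # values between 0 and 1 at each TR
```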
Predicting dynamic synchronization from continuous affective ratings.
Continuous affective ratings were then used to predict the ISPS and SBPS time series in
order to evaluate how stimulus-driven brain activity reflects changes in emotional
responses to the music. The demeaned, averaged time series for enjoyment ratings and
sadness ratings were down-sampled to 1 Hz to match the TR of the fMRI data and
convolved with a double gamma HRF to compensate for the hemodynamic delay. These
regressors were then used to predict ISPS in regions of interest, as well as SBPS in every
pairwise combination of regions, in a general linear model. I performed two separate
GLMs to predict ROI-based ISPS and SBPS: one with the sadness ratings while regressing
out enjoyment ratings, and one with the enjoyment ratings while regressing out sadness.
Significance of correlations between affective ratings and ISPS/SBPS was calculated
non-parametrically by permuting the raw ratings 5,000 times using circular block
resampling (Politis & Romano, 1992) to create a null distribution. The resulting p-values
were corrected using the Benjamini–Hochberg procedure with q < 0.05 to control the
false discovery rate (FDR).
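The regressor construction and permutation scheme can be illustrated with the following sketch. The double-gamma parameters are the conventional SPM-style defaults, the block length is an arbitrary illustrative choice, and only a single-regressor model is shown for brevity (the models above also regressed out the other rating).

```python
# Sketch: convolve a 1 Hz rating series with a double-gamma HRF, regress the
# ISPS time series on it, and build a null by circular block resampling.
import numpy as np
from scipy.stats import gamma


def double_gamma_hrf(tr=1.0, duration=32.0):
    t = np.arange(0, duration, tr)
    hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0  # peak minus undershoot
    return hrf / hrf.sum()


def convolve_regressor(ratings, tr=1.0):
    r = ratings - ratings.mean()                     # demean before convolution
    return np.convolve(r, double_gamma_hrf(tr))[: len(r)]


def circular_block_resample(x, block_len=20, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    idx = np.concatenate([(np.arange(block_len) + rng.integers(n)) % n
                          for _ in range(n_blocks)])[:n]
    return x[idx]


def permutation_p(ratings, isps_ts, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the slope of ISPS on the convolved ratings."""
    rng = np.random.default_rng(seed)
    observed = np.polyfit(convolve_regressor(ratings), isps_ts, 1)[0]
    null = np.array([np.polyfit(convolve_regressor(circular_block_resample(ratings, rng=rng)),
                                isps_ts, 1)[0] for _ in range(n_perm)])
    return (np.sum(np.abs(null) >= np.abs(observed)) + 1) / (n_perm + 1)
```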
Comparing whole-brain ISC between groups. To evaluate differences in ISC
during the processing of emotional music between people with high and low empathy, I
first divided participants into two groups based on a median split of
scores on the 4 subscales of the IRI. I then used a two-group formulation of a linear
mixed-effects (LME) model with a crossed random-effects formulation to identify voxels
that had higher ISC values within the high-Fantasy group as compared to the low-Fantasy
group, as well as higher ISC values within rather than across groups. The LME approach
has been shown to accurately account for the correlation structure embedded in the ISC
and to provide proper control of false positives (Chen, Taylor, Shin, Reynolds, & Cox, 2017).
The model gives voxelwise population ISC values within groups as well as voxelwise
population ISC values between groups, which reflects the ISC effect between pairs of
participants that belong to different groups. I then contrasted the within-group high-Fantasy
ISC maps with the within-group low-Fantasy ISC maps, as well as the within-group
high-Fantasy ISC maps with the across-group ISC maps. Minimum cluster size for
significance was calculated with AFNI’s 3dFWHMx and 3dClustSim and contrasts were
thresholded using an initial voxelwise threshold of p < 0.001 and controlled for family-
wise error (FWE) using the calculated cluster-size threshold corresponding to a p-value of 0.05.
Individual participant voxelwise regression. As a complementary approach, I also
analyzed associations between neural activation and continuous ratings of sadness and
enjoyment in a traditional GLM context. For each participant, the group-average ratings
were used to predict the preprocessed, trimmed, and filtered BOLD data across the entire
brain. Given the high correlation between enjoyment and sadness ratings, two separate
models were computed. In the first model, I used average continuous sadness ratings as
the explanatory variable, with enjoyment ratings orthogonalized to it. The contrasts in
this model therefore reflect regions of the brain that are correlated with both sadness and
enjoyment ratings, as well as the unique contributions of enjoyment. In the second model,
I used the group-average continuous ratings of enjoyment as well as an orthogonalized
regressor of sadness. The resulting contrasts therefore reflect regions of the brain that are
correlated with enjoyment (including variance shared with sadness), as well as regions
uniquely associated with sadness. Finally, in each model separately, I contrasted sadness
with enjoyment (and vice versa) to determine regions of the brain that were uniquely
associated with sadness versus enjoyment ratings.
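For clarity, orthogonalizing one regressor with respect to another amounts to residualizing it, as in the minimal numpy sketch below; the variable names and file paths are hypothetical, and many fMRI analysis packages offer this step as a built-in option.

```python
# Sketch of regressor orthogonalization: the second regressor is residualized
# with respect to the first, so its coefficient captures only variance it does
# not share with the first. File names are placeholders.
import numpy as np


def orthogonalize(b, a):
    """Return the component of b that is orthogonal to a (both 1-D, demeaned here)."""
    a = a - a.mean()
    b = b - b.mean()
    return b - (a @ b) / (a @ a) * a


sadness = np.load("sadness_ratings.npy")
enjoyment = np.load("enjoyment_ratings.npy")
enjoyment_unique = orthogonalize(enjoyment, sadness)

# Design matrix for the first model: sadness plus the unique part of enjoyment.
X = np.column_stack([sadness - sadness.mean(), enjoyment_unique])
```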
A higher-level analysis was then conducted to determine group-level effects of
emotion and enjoyment ratings on brain activity. Statistical significance was determined
using nonparametric permutation testing in FSL's Randomise tool (Winkler et al., 2014),
with thresholding based on threshold-free cluster enhancement (TFCE; Smith & Nichols,
2009).
Results
Behavioral data and continuous ratings. Four out of the 40 participants had to be
removed from subsequent analyses, one due to an incidental finding discovered during
scanning, one due to high motion-related artifacts, and two because their ratings of affect
fell more than two standard deviations below the average ISC value across participants
(ISC enjoyment: 0.18, SD = 0.13; ISC sadness: 0.21, SD = 0.14). This resulted in a
total of 36 participants (19 female, mean age = 24.25 years). After removing the first 20 seconds of the
pieces, the average ISC for sadness ratings amongst the final 36 was 0.16 (SD = 0.15)
and the average ISC for enjoyment ratings was 0.10 (SD = 0.14). A summary of
responses to all questionnaires is presented in Table 10, and correlations between
personality factors and average ratings for sadness and enjoyment are presented in
Appendix J. Average ratings for both enjoyment and sadness are presented in Figure 9.
Table 10. Summary statistics of collected behavioral measures
Mean SD
Age (in years) 21.50 6.40
Years of music training 4.86 2.18
Empathy (IRI)
Fantasy 19.28 5.79
Perspective taking 18.92 4.86
Empathic concern 19.58 4.29
Personal Distress 10.53 4.27
Music Sophistication
Index (Goldsmiths MSI)
Active Engagement 40.79 6.61
Perceptual Abilities 41.31 3.20
Musical Training 25.82 6.07
Emotions 30.97 2.91
Singing Abilities 29.74 4.54
General Sophistication 78.14 8.94
Mental health status
Anxiety (GAD-7) 5.67 4.10
Depression (PHQ-9) 5.50 4.06
Note. N = 36
Figure 9. Average continuous ratings of intensity of sadness and enjoyment. 10 =
high sadness/enjoyment and 0 = no feelings of sadness/enjoyment. The first 20 seconds
of the piece have been removed. Lighter-colored bounded lines represent +1 standard
deviation from the mean.
Acoustic features predicting ratings. Several acoustic features were found to be
most informative for predicting continuous ratings. Here, I report the three most
significant features in the best-performing autoregressive models with regularization.
When modeling the continuous ratings of sadness, brightness, RMS, and compressibility
were most informative. Brightness, RMS, and compressibility were also found to be most
informative for predicting enjoyment ratings. All three of these measures were positively
correlated with both sadness and enjoyment ratings.
Whole-brain intersubject correlation. Across the entire piece of sad music,
significant intersubject correlation was found in large bilateral clusters in the auditory
cortex, including Heschl’s gyrus, superior temporal gyrus and sulcus (Figure 10). This
cluster extends medially into the posterior insula and posteriorly into the supramarginal
gyrus in both hemispheres. Additional significant clusters were found in the right inferior
frontal gyrus, right anterior insula, right anterior cingulate cortex and paracingulate gyrus,
and right precentral gyrus (see Appendix K for coordinates).
Figure 10. Significant voxelwise intersubject correlation during sad music listening.
Statistically significant voxels based on nonparametric voxelwise permutation testing
with FDR correction (q < 0.01).
ISC differences between high and low empathy participants. When comparing
ISC values while listening to sad music between participants who score high on the
Fantasy subcomponent of empathy and participants who score low on the Fantasy
subcomponent of empathy, significant differences were found in left auditory cortex,
including the superior temporal gyrus, the middle temporal gyrus, regions in the left
prefrontal cortex, including the superior frontal and middle frontal gyrus, and the right
visual cortex, including calcarine sulcus and lingual gyrus. When comparing the ISC
maps between high-Fantasy participants versus across high- and low-Fantasy
participants, significant ISC differences were found additionally in the left medial frontal
gyrus, left medial orbitofrontal cortex, left precuneus, and the left and right cuneus and
lingual gyrus (Figure 11).
Figure 11. Synchronization differences between high- and low-Fantasy participants.
A. Results from a whole-brain, voxelwise contrast revealing brain regions that are more
synchronized between pairs of high-Fantasy participants than pairs of low-Fantasy
participants. B. Results from two whole-brain, voxelwise contrasts revealing brain
regions that are more synchronized within the high-Fantasy group than across the high-
and low-Fantasy groups. For all contrasts, results are shown at an initial threshold of p < 0.001 with cluster
correction corresponding to p < 0.05.
Intersubject functional connectivity. Intersubject functional connectivity between
all regions of interest is presented in Figure 12. The connectivity matrix is presented
unthresholded, with a maximum r-value of 0.25 and a minimum r-value of -0.25. The diagonal of
the ISFC matrix reflects the intersubject correlation values, i.e., each brain region
correlated with itself across participants. Regions are organized based on theoretical
functional groupings (Shirer et al., 2012).
Figure 12. Intersubject functional connectivity matrix during sad music listening.
Red colors correspond to positive correlations and blue colors correspond to negative
correlations. The diagonal is equivalent to intersubject correlation values.
Dynamic phase synchronization in regions of interest. Continuous ratings of
enjoyment significantly predicted changes in phase synchronization in several regions of
interest (see Table 11, Figure 13). These included regions in the temporal lobe, i.e. the
right Heschl’s gyrus (b = 0.03, p = 0.01, adjusted p = 0.04), the left middle temporal gyrus
(b = 0.01, p = 0.01, adjusted p = 0.04) and inferior temporal gyrus (b = 0.01, p = 0.001,
adjusted p = 0.04), regions of the right basal ganglia, i.e. the putamen (b = 0.01, p = 0.006,
adjusted p = 0.04) and globus pallidus (b = 0.01, p = 0.003, adjusted p = 0.04), the left and
right parahippocampal gyrus (left: b = 0.01, p = 0.01, adjusted p = 0.04; right: b = 0.01,
p = 0.003, adjusted p = 0.04), the left orbitofrontal cortex (b = 0.01, p = 0.01, adjusted
p = 0.04), the left posterior cingulate (b = 0.01, p = 0.01, adjusted p = 0.04), and the
inferior parietal lobule (b = 0.01, p = 0.01, adjusted p = 0.04). Continuous ratings
of sadness did not significantly predict dynamic phase synchronization in any of the
regions of interest when correcting for multiple comparisons.
Table 11. Intersubject synchronization correlated with enjoyment ratings
Region Hemisphere b P-value Adjusted P-value
Heschl’s gyrus R 0.031 0.010 0.040
Middle temporal pole L 0.004 0.014 0.049
Putamen R 0.009 0.006 0.040
Pallidum R 0.008 0.003 0.040
ParaHippocampal gyrus L 0.007 0.010 0.040
ParaHippocampal gyrus R 0.007 0.003 0.040
Left lateral OFC L 0.011 0.009 0.040
Medial orbitofrontal cortex L 0.009 0.009 0.040
Superior frontal gyrus L 0.018 0.004 0.040
Medial frontal gyrus L 0.016 0.003 0.040
Posterior cingulum L 0.010 0.011 0.040
Inferior parietal lobule L 0.007 0.007 0.040
Angular gyrus L 0.041 0.009 0.040
Middle temporal gyrus L 0.012 0.011 0.040
Inferior temporal gyrus L 0.006 0.001 0.040
N = 36
FDR-adjusted p-values calculated using the Benjamini–Hochberg procedure
Figure 13. Enjoyment ratings predicting intersubject phase synchronization.
Significant correlations were determined using a non-parametric random shuffling of
ratings 5000 times with FDR-correction for the number of regions of interest tested.
Inter-regional phase synchronization. Next, I evaluated whether changes in phase
synchronization between regions were associated with changes in affective ratings.
Neither ratings of sadness nor enjoyment significantly predicted seed-based phase
synchronization between regions when correcting for multiple comparisons. At the
uncorrected p-value threshold of 0.001 with 5,000 non-parametric permutations of
ratings, inter-regional synchronization between the left globus pallidus and right thalamus
was positively correlated with enjoyment ratings (b = 0.09, p = 0.001, adjusted p = 0.39).
At the uncorrected p-value threshold of 0.005, enjoyment was additionally correlated with
inter-regional synchronization between orbitofrontal, basal ganglia, and parahippocampal
regions (see Table 12, Figure 14). At this threshold, sadness ratings were correlated with
inter-regional phase synchronization between the orbital portion of the right superior
frontal gyrus and the left caudate (b = 0.18, p = 0.005, adjusted p = 0.98).
Table 12. Inter-regional intersubject synchronization correlated with ratings
Region 1 Region 2 b P-value Adjusted P-value
Enjoyment
Left anterior OFC Left gyrus rectus 0.144 <0.001 0.265
Left lateral OFC Right middle temporal pole 0.136 0.003 0.391
Left medial OFC Left parahippocampal gyrus 0.117 0.002 0.391
Right medial OFC Left parahippocampal gyrus 0.052 0.004 0.391
Left caudate Left pallidum 0.088 0.003 0.391
Left caudate Right putamen 0.081 0.004 0.391
Right caudate Left putamen 0.121 0.002 0.391
Right caudate Right putamen 0.145 0.003 0.391
Right caudate Right thalamus 0.122 0.005 0.391
Left pallidum Right thalamus 0.090 0.001 0.391
Right ACC Right putamen 0.091 0.003 0.391
Right ACC Right thalamus 0.079 0.003 0.391
Right parahippocampal gyrus Right middle temporal pole 0.124 0.005 0.391
Right precuneus Right STG 0.094 0.004 0.391
Sadness
Right anterior OFC Left caudate 0.184 0.005 0.986
N = 36
FDR-adjusted p-values calculated using the Benjamini–Hochberg procedure
Figure 14. Enjoyment ratings predicting inter-regional phase synchronization. A.
Mean enjoyment ratings predicted inter-regional phase synchronization between the left
OFC and parahippocampal gyrus. B. Sadness ratings predicted inter-regional phase
synchronization between the right OFC and left caudate. Neither was significant when
correcting for multiple comparisons.
Voxelwise activation. Enjoyment ratings significantly predicted activity in the
right Heschl’s gyrus, extending into the right superior temporal gyrus and partially into the
right rolandic operculum/posterior insula (z = 3.54, x = 40, y = -26, z = 2). When contrasting
enjoyment over sadness, significant activation was additionally found in the right Heschl’s
gyrus (z = 3.54, x = 38, y = -20, z = -8).
Discussion
Using several measures of intersubject neural synchronization combined with
continuous ratings of subjective affective experience, I uncovered brain regions and
networks involved in the representation of enjoyment as it unfolds over time in response
to a sad piece of music. The findings lend support to our current understanding of
affective processing in the brain and enrich it by illuminating the time-varying patterns of
neural synchronization and communication that map onto two distinct aspects of
emotional experience.
While participants were listening to a full-length piece of sad music, I found
significant intersubject correlations in voxels within the primary and secondary auditory
cortices, as well as within the insula (posterior and anterior), anterior cingulate, and
inferior frontal gyrus. While it is highly likely that synchronization of voxels in the
primary auditory cortex reflects changes in acoustic aspects of the music, the secondary
auditory cortex, i.e. the superior temporal gyrus and sulcus, has been shown to be
involved in representing emotions conveyed through a variety of sounds with different
acoustical properties (Escoffier et al., 2013; Sachs, Habibi, Damasio, & Kaplan, 2018).
The superior temporal sulcus in particular, has also been shown to encode affective
information across a variety of non-auditory stimuli, such as body movements and faces
(Peelen, Atkinson, & Vuilleumier, 2010).
The insular cortex is widely considered to be involved in subjective experiences of
emotion (Damasio et al., 2013; Immordino-Yang et al., 2014). More specifically, recent
evidence suggests that the posterior insula is involved in processing interoceptive
changes in the body and the anterior portion is involved in integration of the interoceptive
information with external sensory information to generate a subjective representation of
the feeling state (Nguyen, Breakspear, Hu, & Guo, 2016). The insula, in conjunction with
the anterior cingulate and inferior frontal gyrus, may additionally be involved in empathic
processing, as evidenced by the fact that these regions are activated both when mentally
simulating the emotions of others and when personally experiencing those same emotions
(Lawrence et al., 2006; Singer & Lamm, 2009). The insula’s role in empathy may be in
the immediate and automatic generation of this mental representation of an observed
emotion, given that it has been shown to respond to stimuli depicting others in pain
regardless of conscious attention or cognitive demand. The fact that all three of these
regions were significantly synchronized across participants during presentation of a sad
piece of music mirrors the results of previous findings with other types of emotional
naturalistic stimuli (Jääskeläinen et al., 2008) and provides evidence that (a) participants
were feeling emotions in response to the piece of music at similar times and (b) these
synchronized emotional responses likely reflect the empathic processes involved in
observing and resonating with the emotions of the music.
To further test this notion, whole-brain ISC differences were calculated between
groups of participants with high and low measures of trait empathy. The Fantasy subscale
of the IRI, which reflects a tendency to become mentally transported into a story or
narrative, has been shown to predict the intensity of emotional responses to and
enjoyment of sad music (Kawamichi et al., 2016; Taruffi & Koelsch, 2014; Sachs et al.,
under review). Our results suggest a possible neural explanation for these behavioral
differences. In our study, participants who scored higher on a self-report measure of
Fantasy showed increased within-group synchronization in the left superior temporal
gyrus and sulcus, the left dorsal medial prefrontal cortex, including the superior frontal
and middle frontal gyri, the ventromedial prefrontal cortex, the precuneus, and several
regions in the visual cortex. Portions of the medial frontal cortex, the superior temporal
sulcus, and the precuneus have all previously been shown to become automatically
engaged while observing the emotions of others, responding to both affective and social
cues in situations requiring empathy (Krämer, Mohammadi, Doñamayor, Samii, &
Münte, 2010). The DMPFC appears to be involved in recognizing and inferring the
emotion states of others as well as reflecting upon one’s own emotional state (Schnell,
Bluschke, Konradt, & Walter, 2011). Moreover, while viewing a film that conveyed
negative emotional valence, ISC in the superior temporal gyrus and VMPFC, among
others, were shown to be reduced in patients with a particular subtype of depression
related to impaired affective processing, as compared to healthy controls (Guo, Nguyen,
Hyett, Parker, & Breakspear, 2015). It is therefore possible that the increased
synchronization of brain activity in these regions seen in participants with higher Fantasy
reflects these participants’ ability to readily connect and resonate with the emotions being
conveyed through the music through imaginative processes. Indeed, ratings of intensity
of felt sadness during the piece were positively correlated with the Fantasy subcomponent
in our sample, providing further support for the idea that higher ISC in these empathy-
related regions reflects the simultaneous “catching” of an emotion in response to an
aesthetic stimulus.
The additional finding of increased synchronization of the visual cortex in the high
Fantasy group may indicate that such participants were engaging in visual imagery while
listening to music, picturing scenes and/or characters that might accompany the music.
The left occipital lobe, cuneus and lingual gyrus have been shown to be activated during
visual mental imagery of objects as well as visual perception of those same objects
(Ganis, Thompson, & Kosslyn, 2004). Previous research has argued that the enhancement
of visual mental imagery in response to music mediates the relationship between Fantasy
and emotional responses to sad music (Schubert et al., 2018). Although I cannot say for
certain that the high Fantasy group engaged in more visual imagery during music
listening, it is important to note that the piece of music used in this study was written for
a television show and therefore, was likely composed in a way that it complemented or
enhanced certain aspects of the accompanying visual scene; it is quite reasonable, then, to
surmise that images would be conjured in response to this type of music. Taken together,
the group differences found between high vs. low empathy participants lend support for
the use of inter-group ISC as a method of identifying psychologically and therapeutically-
relevant variations in the ways in which individuals process the natural world (Hasson et
al., 2010).
To further probe the meaningfulness of ISC in terms of a collective emotional
experience, I calculated a moment-to-moment measure of intersubject synchronization
and used continuous ratings of enjoyment and sadness to predict changes in
synchronization over time. In this analysis, enjoyment ratings were positively correlated
with intersubject phase synchronization in an area of the auditory cortex, i.e. the right
Heschl’s gyrus, as well as the left orbitofrontal cortex, left posterior cingulate, and right
basal ganglia (putamen and pallidum). Importantly, when continuous enjoyment ratings
were used to predict BOLD activity using the more traditional GLM framework, only
voxels within the right Heschl’s gyrus were significantly correlated, highlighting the
advantage in statistical sensitivity that assessing associations across participants provides.
These results corroborate several previous studies using emotional naturalistic
stimuli. In response to both evocative speeches and film clips, negative emotional valence
was shown to predict dynamic intersubject synchronization in the thalamus and striatum
as well as the orbitofrontal and medial prefrontal cortices (Nummenmaa et al., 2012;
Nummenmaa, Saarimäki, et al., 2014). Because participants only rated valence along a
single axis (pleasantness to unpleasantness), the authors interpret these findings as
evidence that these regions process both feeling states and reward. By separating
enjoyment from valence ratings, our findings help clarify the specific functions of these
regions in different aspects of emotion.
The putamen and globus pallidus make up part of the basal ganglia, which, together
with the rest of the striatum, are thought of as a key center of the reward system in the
brain: across numerous studies, this system triggers the pleasurable sensations that
accompany the presentation of rewarding stimuli such as food, sex, drugs, and music
(Berridge & Kringelbach, 2013). There is some evidence that the striatum is involved in
reward learning of musical stimuli (Gold et al., 2019; Salimpoor et al., 2013a, 2009),
whereas the basal ganglia appear to be specifically involved in the intensity of emotional responses to
music (Brattico et al., 2011; Pereira et al., 2011; Trost et al., 2012). In light of its known
functional role in movement (Albin, Young, & Penney, 1983), the basal ganglia may
specifically process rhythmic and melodic aspects of musical stimuli (Bengtsson & Ullén,
2006). The fact that synchronization in the basal ganglia was correlated with enjoyment
ratings, not with intensity of sadness in our study, may indicate that the basal ganglia is
tracking changes in musical features that are strongly tied to peak moments of subjective
enjoyment in the stimulus.
The orbitofrontal cortex is consistently found to be activated during tasks requiring
aesthetic judgment and appreciation (Brown, Gao, Tisdelle, Eickhoff, & Liotti, 2011) as
well as reward-processing and reward-prediction (Howard, Gottfried, Tobler, & Kahnt,
2015). Structurally, the region receives input from higher sensory areas, the hypothalamus
and thalamus, and projects to striatal areas, the ACC, amygdala, and insula (Kringelbach
& Rolls, 2004). Resting-state connectivity data has suggested that the orbitofrontal cortex
can be subdivided into functional units. The medial portion of the OFC is functionally
connected with the posterior cingulate, medial and lateral temporal cortex, and the ventral
striatum, whereas the more anterior and lateral portions of the OFC appears to be
functionally connected with the caudate, putamen, dorsal ACC, and anterior insula
(Kahnt, Chang, Park, Heinzle, & Haynes, 2012). These results imply that the medial
cluster may be more involved in monitoring, learning, and predicting reward value,
whereas the more anterior and lateral clusters may be more involved in behavioral
responses to both reward and punishment (Kringelbach & Rolls, 2004). In this study,
enjoyment ratings were positively associated with intersubject synchronization in the
medial and lateral portions of the orbitofrontal cortex, suggesting that multiple, related
functions are occurring in response to music, including monitoring one’s current affective
state and using this information to assess and predict the music’s current and upcoming
reward value and ultimately make decisions regarding its aesthetic quality and one’s
behavioral response.
Moments when the piece of music was found to be most enjoyable were
additionally correlated with across-participant synchronization of signal from the
posterior cingulate, a region considered to be a central hub of the Default Mode Network
(Pearson, Heilbronner, Barack, Hayden, & Platt, 2011). This finding corroborates
previous indications that the DMN is involved in aesthetic processing (Belfi et al., 2019;
Cela-Conde et al., 2013; Vessel, Starr, & Rubin, 2012). The exact role of the posterior
cingulate in the enjoyment of art and music remains unclear, though it is likely that its
more general functions, such as detecting changes in the internal and external
environment in order to modify cognitive and behavioral processes, evaluating the reward
value of emotionally-relevant stimuli, and engaging in self-referential thought (Li, Mai,
& Liu, 2014; Pearson et al., 2011), all contribute in some way to aesthetic experiences.
There is some indication that sad music, as compared to happy music, is more likely to
induce these types of cognitive processes and thought patterns, as evidenced by enhanced
functional connectivity of the DMN in response to sad music (Taruffi, Pehrs, Skouras, &
Koelsch, 2017). Our results elucidate the function of the posterior cingulate by showing
that moment-to-moment intersubject synchronization maps onto moments of high
enjoyment of musical stimuli. I hypothesize that the posterior cingulate portion of the
DMN is involved in monitoring the flow of internal information related to feelings,
memory, mental simulation, and self-referential thought that likely occur during the
aesthetic response to sad music.
One of the main aims of this study was to assess how the interactions between brain
regions reflect changes in music-evoked emotions over time. Recent evidence has
suggested that a range of emotional experiences that span the valence and arousal
spectrum recruit a set of brain regions involved in a variety of cognitive functions
(Touroutoglou et al., 2014). When grouped together, these regions constitute what are
often labeled as the limbic and paralimbic, default mode, and frontoparietal networks
(Kober et al., 2008). It has been argued that emotions in response to naturalistic stimuli
may be best captured by evaluating changes in inter-regional connectivity and recent
investigations with full-length films and music have embraced this idea. For example,
using the same seed-based phase synchronization approach I used here, Nummenmaa et
al. (2014) found that negative valence was associated with increased connectivity
between the insular, inferior temporal, cingulate, and orbitofrontal cortices. Similar
results have been found with other methods of assessing dynamic functional connectivity.
Increases in the intensity of sadness in response to a film were shown to coincide with
increased connectivity between regions of the limbic network (amygdala, thalamus,
striatum, hypothalamus) as well as between the limbic areas and the dorsomedial PFC
and ACC (Raz et al., 2012). Modulations of connectivity within and between these
networks have also been observed in response to both positive and negative emotions in
music (Koelsch & Skouras, 2014; Singer, Jacoby, et al., 2016). These modulations are interpreted as
representing the processing adjustments that are required of these regions as emotions
shift over time, with regards to integrating sensory, motor, affective and higher-order
cognitive information (Koelsch & Skouras, 2014).
When correcting for multiple comparisons in our analysis, I did not find that
either sadness or enjoyment ratings significantly predicted inter-regional phase
synchronization. However, at the uncorrected p-value threshold of 0.001,
I did find that enjoyment predicted seed-based phase synchronization between the left
basal ganglia and right thalamus. At a slightly lower threshold of 0.005, connections
between the OFC and left parahippocampal gyrus, as well as between the ACC and the
putamen and thalamus were also positively correlated with ratings of enjoyment. At this
threshold, sadness ratings also predicted seed-based phase synchronization between a
portion of the right orbitofrontal cortex and the left caudate. While these results need to
be interpreted with caution as they did not reach our criteria for statistical significance,
they mirror those of another study with music, in which it was shown that patterns of
connectivity between orbitofrontal and limbic areas were related to tension ratings of a
musical piece over time (Lehne, Rohrmeier, & Koelsch, 2013). The building and
resolving of tension in music is inherent to its musical structure and is highly correlated
with enjoyment (Steinbeis, Koelsch, & Sloboda, 2006). It is plausible that during the
moments in the music that were rated as most enjoyable, the left and right striatum and
limbic network become more synchronized with the OFC and ACC in order to integrate
the changes in auditory information coming from the music with changes in internal
information coming from the body and sensory regions of the brain, forming a mental
representation of the body’s current affective state that can guide future behaviors
required by this state.
Limitations. There are several limitations with this study that merit discussion.
One difficulty with analyzing continuous self-report measures of affect is determining
how best to account for and model variability across participants. In our data, like in
previously published datasets of continuous ratings of affect (Upham & McAdams,
2018), there were substantial inconsistencies across participants in terms of the moments
of high emotional intensity. It is therefore difficult to ascertain the stability and reliability
of the average affective ratings. While simply averaging across participants has been used
in previous studies with continuous ratings (Lehne et al., 2013), such a statistical method
may be missing key informational content with regards to the emotional responses that is
captured in the variance across participants. Consequently, the average ratings are
characterized by an overall pattern of low-frequency, upward temporal drift. This raises
important psychological and methodological questions with regard to how to accurately
model affective ratings, isolating the variation that is due to error or idiosyncrasies across
participants from the variation that says something meaningful about the ways in which
participants interpret the stimulus. More complex methods have recently been developed
that try to remove the irrelevant differences in timing across people, such as differences
in response time and coordination between cognitive processes and motor actions, to
capture moments of high inter-subject agreement in the time series (Upham & McAdams,
2018). Dynamic time warping, for instance, attempts to assess the similarity between
two time series that may be occurring at different speeds (Deng & Leung, 2015). Another
method involves calculating moments in which a significant change in ratings (either
increases or decreases of a certain threshold) occurs consistently across people, rather
than relying on actual values (Upham & McAdams, 2018). Applying such approaches to
our continuous ratings set may uncover moments of high agreement across participants
with regards to the moments of peak sadness and enjoyment, resulting in a more reliable
model that I can use to predict dynamic brain synchronization.
It is also possible that the inconsistencies in the ratings are due to the task itself.
Continuously monitoring the existence of complex feelings such as enjoyment, while
simultaneously listening to a piece of music may be too cognitively demanding or too
abstract for participants. Previous studies have suggested a number of alternative methods
of reporting on subjective experience, such as having participants tap along to the beat of
the music, which has been shown to be an indirect measure of being moved by music
(Singer, Jacoby, et al., 2016) or stopping the music intermittently and asking participants
to rate the emotional quality of the last 30 seconds (Chang et al., 2018). Future endeavors
could additionally use more objective measures of collective emotional responses to the
sad piece of music, such as psychophysiological data.
An additional degree of variability and uncertainty in our data lies in the fact that
self-report ratings of affect were not recorded simultaneously with brain activity. I chose
to collect ratings outside of the scanner in order to avoid the influence of task constraints
and self-monitoring on the collected brain data (Taylor et al., 2003); because of this
decision, there is no guarantee that the ratings collected afterwards accurately capture what
participants felt the first time they listened to the piece of music. Future studies will be
needed to specifically address the stability of continuous ratings across different listening
sessions as well as the influence of self-monitoring of affective state changes on the
resulting neuroimaging data.
Certain hypotheses were not supported by the results. First, counter to what was
predicted, intersubject synchronization in neither the nucleus accumbens nor the rest of the
ventral striatum was found to be related to the experience of listening to a sad piece of
music. This was somewhat surprising as a number of previous studies have highlighted
the role of the nucleus accumbens in processing the rewarding aspects of music (Gold et
al., 2019; Salimpoor et al., 2013b; Salimpoor, Benovoy, Larcher, Dagher, & Zatorre,
2011). On the other hand, other studies that have specifically addressed the temporal
component of neural processing of affect have shown that the nucleus accumbens
becomes engaged very quickly, as compared to the DMN and OFC, in response to
aesthetic stimuli, suggesting that there might be two pathways, one slower and one faster,
involved in processing reward (Belfi et al., 2019). Given that the subjective ratings were
collected over the course of the entire 9-minute piece of music and were characterized by
slower frequency patterns, it is possible that the fast processing of aesthetics and reward in
the nucleus accumbens could not be accounted for by our predictor variable. It is also
possible that the type of pleasure or reward that was assessed in this study, i.e.
pleasurable sadness, is more complex and/or distinct from the type of pleasure that is
encoded in the nucleus accumbens. Indeed, researchers and scholars have suggested that
there is a meaningful ontological, and neural, distinction between pleasure derived from
obtaining one’s wants and desires (hedonia) and pleasure derived from finding meaning
and well-being (eudaimonia, Kringelbach & Berridge, 2009). Enjoying a sad piece of
music might reflect the latter, as is the case with other forms of art and entertainment
(Oliver & Raney, 2011), and therefore may not directly involve the nucleus accumbens.
Furthermore, intensity of felt sadness was not found to significantly predict
dynamic intersubject phase synchronization in any region of interest, even when
enjoyment ratings were regressed out of the sadness model. Because
sadness ratings were highly correlated with enjoyment ratings, it is difficult to
statistically dissociate the variation in the dynamic intersubject synchronization measure
that was driven by the experience of sadness versus the experience of enjoyment, even
with orthogonalization. It is additionally possible that because participants were rating a
piece that was intended to reflect negative emotions, continuous ratings of sadness were
not as psychologically relevant a concept as enjoyment, and therefore the important
information with regard to the changes in the music was better captured by enjoyment
ratings. One way of teasing these related concepts apart would be to compare ISC values
between participants who found the piece enjoyable and those who did not. However,
most participants found the piece at least somewhat enjoyable at certain time points and it
is therefore difficult to know how to group participants based on these continuous ratings
for this type of comparison. Alternatively, I could assess how the regions found to be
related to enjoyment are related to enjoyment in response to a piece of music designed to
induce a different emotion, such as happiness (see below for more details).
Future directions. The results presented here indicate some logical next-steps and
ideas for further explorations. First, to determine if the findings presented here are unique
to the stimulus used in this study or reflect a more generalizable response to pleasurable
sadness, I can assess correlations between ratings and intersubject synchronization in the
second piece of sad music that is acoustically different, yet still reliably conveys sadness.
If similar regions become synchronized during moments of peak ratings of enjoyment in
this second piece, this would provide strong validation that our findings more generally
highlight brain systems involved in the enjoyment of negative emotional stimuli.
Secondly, it would be interesting and psychologically relevant to evaluate
differences between enjoyment of stimuli that express negative emotions and enjoyment
of stimuli that express positive valence in terms of brain synchronization and
connectivity. Previous research has suggested that emotional valence can alter the brain
regions involved in aesthetic processing (Ishizu & Zeki, 2017; McPherson et al., 2016;
Wilson-Mendenhall, Barrett, & Barsalou, 2015). At the same time, several analyses have
provided evidence for brain regions associated with a variety of emotional states, irrespective of
whether these states have negative or positive valence (Gerber et al., 2008; Lindquist,
Satpute, Wager, Weber, & Barrett, 2015; Viinikainen et al., 2012). To address this
outstanding question, I can calculate intersubject correlation and connectivity during
exposure to the happy piece of non-lyrical music and evaluate how dynamic measures of
synchronization are predicted by continuous ratings of happiness and enjoyment. While
the happy piece is acoustically very different from the sad piece, and it is therefore unlikely that
I would be able to statistically compare the results, this analysis would nonetheless allow
us to test if enjoyment, independent of the emotional valence, is related to
synchronization within and between the OFC, basal-ganglia, and posterior cingulate or if
the brain dynamics involved in tracking enjoyment and happiness are qualitatively
different.
Finally, as previous studies have argued (Du et al., 2017; Finn, Corlett, Chen,
Bandettini, & Constable, 2018; Hasson et al., 2010), time-varying patterns of brain
activity, connectivity, and synchronization across people might have clinical implications
by uncovering aberrations in socioemotional processing and providing useful neural
biomarkers for diagnosis. Because the atypical processing of negative valence tends to be
a common symptom of certain mood disorders like depression and anxiety (Groenewold,
Opmeer, de Jonge, Aleman, & Costafreda, 2013; Guo et al., 2015; Siegle, Steinhauer,
Thase, Stenger, & Carter, 2002), future research on the neural processing of sad music
could include a clinical population as a comparison group, which would allow us to assess
if differences in dynamic intersubject synchronization during the processing of naturalistic
stimuli might account for some of the affective symptoms characteristic of mood disorders.
Conclusion. While previous studies have employed dynamic methods of neural
communication to better understand the brain regions and systems involved in processing
ecologically-valid emotional experiences, this study is the first to capture the time-varying
patterns of neural synchronization involved in subjective enjoyment within a sad emotional
context. Our findings show that enjoyment in response to a sad piece of music predicts
intersubject synchronization in auditory cortex, cortico-basal ganglia regions, OFC, and
posterior cingulate. This lends credence to the hypothesis that changes in neural
communication can reflect and represent different components of our everyday emotional
experience. Moreover, synchronization in visual cortex and regions involved in both
experiencing and mentally simulating the emotions of others was found to be greater in
participants who were more empathic, providing further indication that assessing stimulus-
driven brain activity across people may be a useful tool for illuminating the ways in which
humans experience the natural world differently.
CHAPTER 5
Discussion and Conclusions
Imagine, for a moment, a life without emotions. Maybe it is easy to dismiss the
significance of such an absence. Maybe, at times, it is possible to even yearn for such an
absence, particularly during situations in which our emotions appear to get in our way, to
complicate our decision-making, and malignantly alter our behavior. Yet, without
emotions, we would not be able to learn from our past experiences, to seek out the
resources we need, and to avoid situations that cause us pain or harm. Even if we were
able to survive without emotions, we would be missing a fundamental element of what
makes life meaningful. Imagine taking in the view from the summit of a mountain without
being able to experience awe. Envision listening to a comedian tell a joke without being
able to experience surprise. Picture watching a loved one succeed without being able to
experience pride.
The import of emotions when it comes to our everyday social functioning and
survival makes our appreciation for and fascination with music all the more baffling.
Music has an uncanny ability to induce emotions: to make us smile as well as move us to
tears, to excite us as well as calm us, to connect us with our feelings as well as with the
feelings of others. And yet, music holds no agreed-upon meaning and serves no clear
evolutionary function with regards to homeostatic regulation. We feel so strongly in
response to music, and yet there is no essential resource needed to be procured, no tangible
threat needed to be avoided, no salient event needed to be remembered.
On a societal level, an appreciation for music may have evolved as an apt vessel for
communicating emotions to groups at large. This is still very much true today. It has the
power to unite even the most divided of us, as demonstrated by Daniel Barenboim’s
West-Eastern Divan Orchestra. It can galvanize us to stand up to injustice and
oppression, as was the case when the Russian punk rock group Pussy Riot performed the
song “Mother of God, Drive Putin Away” inside an Orthodox Church in Moscow. It can
collectively heal us after a national tragedy, as when Kendrick Lamar’s song “Alright”
became an uplifting and hopeful anthem in the wake of a wave of violence perpetrated
towards Black Americans. Studying our affective experiences with music not only allows
us to begin to unravel the mysteries surrounding the human mind, but to also demystify
the countless ways in which we learn to understand, support, and inspire one another.
In these chapters, I present my modest and sincere attempt at grasping some of the
profound and beguiling obscurities surrounding music and affect by gazing through the
lens of neuroscience. Taken together, the results from these three studies uncover the
spatial patterns of brain activity that represent emotional categories from incoming
sensory information as well as the temporal patterns of brain activity that map onto
feelings of enjoyment in response to this sensory information. Given how music so
exquisitely touches upon these interwoven cognitive processes related to affect, the
findings serve both to provide a clearer basic-science understanding of affect in the brain,
as well as deepen our insight into the multitude of ways in which music collectively
engages us so intensely and viscerally.
References
Adolphs, R., & Andler, D. (2018). Investigating Emotions as Functional States Distinct
From Feelings. Emotion Review, 10(3), 191–201.
http://doi.org/10.1177/1754073918765662
Adolphs, R., Damasio, H., Tranel, D., Cooper, G., & Damasio, A. R. (2000). A role for
somatosensory cortices in the visual recognition of emotion as revealed by three-
dimensional lesion mapping. The Journal of Neuroscience: The Official Journal of
the Society for Neuroscience, 20(7), 2683–2690.
Albin, R. L., Young, A. B., & Penney, J. B. (1983). Speculations on the Functional
Anatomy of Basal Ganglia Disorders. Annual Review of Neuroscience, 6(1), 73–94.
http://doi.org/10.1146/annurev.ne.06.030183.000445
Allen, R., Davis, R., & Hill, E. (2013). The Effects of Autism and Alexithymia on
Physiological and Verbal Responsiveness to Music. J Autism Dev Disord, 43, 432–
444. http://doi.org/10.1007/s10803-012-1587-8
Alluri, V., Toiviainen, P., Jääskeläinen, I. P., Glerean, E., Sams, M., & Brattico, E.
(2012). Large-scale brain networks emerge from dynamic processing of musical
timbre, key and rhythm. NeuroImage, 59(4), 3677–3689.
http://doi.org/10.1016/j.neuroimage.2011.11.019
Alluri, V., Toiviainen, P., Lund, T. E., Wallentin, M., Vuust, P., Nandi, A. K., …
Brattico, E. (2013). From vivaldi to beatles and back: Predicting lateralized brain
responses to music. NeuroImage, 83, 627–636.
http://doi.org/10.1016/j.neuroimage.2013.06.064
Aubé, W., Angulo-Perkins, A., Peretz, I., Concha, L., & Armony, J. L. (2013). Fear
across the senses: Brain responses to music, vocalizations and facial expressions.
Social Cognitive and Affective Neuroscience, 10(3), 399–407.
http://doi.org/10.1093/scan/nsu067
Balkwill, L., & Thompson, W. F. (1999). A Cross-Cultural Investigation of the
Perception of Emotion in Music: Psychophysical and Cultural
Cues. Music Perception, 17(1), 43–64. http://doi.org/10.2307/40285811
Bamiou, D., Musiek, F. E., & Luxon, L. M. (2003). The insula (Island of Reil) and its
role in auditory processing: Literature review. Brain Research Reviews, 42, 143–
154. http://doi.org/10.1016/S0165-0173(03)00172-3
Barford, K. A., & Smillie, L. D. (2016). Openness and other Big Five traits in relation to
dispositional mixed emotions. Personality and Individual Differences, 102, 118–
122. http://doi.org/10.1016/j.paid.2016.07.002
Barrett, F. S., Grimm, K. J., Robins, R. W., Wildschut, T., Sedikides, C., & Janata, P.
(2010). Music-evoked nostalgia: Affect, memory, and personality. Emotion, 10(3),
390–403. http://doi.org/10.1037/a0019006
Baumgartner, T., Lutz, K., Schmidt, C. F., & Jäncke, L. (2006). The emotional power of
music: How music enhances the feeling of affective pictures. Brain Research, 1075,
151–164. http://doi.org/10.1016/j.brainres.2005.12.065
Belfi, A. M., Vessel, E. A., Brielmann, A., Isik, A. I., Chatterjee, A., Leder, H., … Starr,
G. G. (2019). Dynamics of aesthetic experience are reflected in the default-mode
network. NeuroImage, 188(November 2018), 584–597.
http://doi.org/S105381191832161X
Belin, P., Fillion-Bilodeau, S., & Gosselin, F. (2008). The Montreal Affective Voices: a
validated set of nonverbal affect bursts for research on auditory affective processing.
Behavior Research Methods, 40(2), 531–539. http://doi.org/10.3758/BRM.40.2.531
Bengtsson, S. L., & Ullén, F. (2006). Dissociation between melodic and rhythmic
processing during piano performance from musical scores. NeuroImage, 30(1), 272–
284. http://doi.org/10.1016/j.neuroimage.2005.09.019
Berridge, K. C., & Kringelbach, M. L. (2013). Neuroscience of affect: brain mechanisms
of pleasure and displeasure. Current Opinion in Neurobiology, 23(3), 294–303.
http://doi.org/10.1016/j.conb.2013.01.017
Bestelmeyer, P. E. G., Maurage, P., Rouger, J., Latinus, M., & Belin, P. (2014).
Adaptation to Vocal Expressions Reveals Multistep Perception of Auditory
Emotion. Journal of Neuroscience, 34(24), 8098–8105.
http://doi.org/10.1523/JNEUROSCI.4820-13.2014
Blood, A. J., & Zatorre, R. J. (2001). Intensely pleasurable responses to music correlate
with activity in brain regions implicated in reward and emotion. Proceedings of the
National Academy of Sciences of the United States of America, 98, 11818–11823.
http://doi.org/10.1073/pnas.191355898
Brattico, E., Alluri, V., Bogert, B., Jacobsen, T., Vartiainen, N., Nieminen, S., &
Tervaniemi, M. (2011). A functional MRI study of happy and sad emotions in music
with and without lyrics. Frontiers in Psychology, 2, 1–16.
http://doi.org/10.3389/fpsyg.2011.00308
Brattico, E., Bogert, B., Alluri, V., Tervaniemi, M., Eerola, T., & Jacobsen, T. (2016).
It’s Sad but I Like It: The Neural Dissociation Between Musical Emotions and
Liking in Experts and Laypersons. Frontiers in Human Neuroscience, 9(676), 1–21.
http://doi.org/10.3389/fnhum.2015.00676
Brattico, E., Brigitte, B., & Jacobsen, T. (2013). Toward a neural chronometry for the
aesthetic experience of music. Frontiers in Psychology, 4(May), 1–21.
http://doi.org/10.3389/fpsyg.2013.00206
Brown, S., Gao, X., Tisdelle, L., Eickhoff, S. B., & Liotti, M. (2011). Naturalizing
aesthetics: Brain areas for aesthetic appraisal across sensory modalities.
NeuroImage, 58(1), 250–258. http://doi.org/10.1016/j.neuroimage.2011.06.012
Calder, A. J., Keane, J., Manes, F., Antoun, N., & Young, A. W. (2000). Impaired
recognition and experience of disgust following brain injury. Nature Neuroscience,
3(11), 1077–1078.
Carr, L., Iacoboni, M., Dubeau, M., Mazziotta, J. C., & Lenzi, G. L. (2003). Neural
mechanisms of empathy in humans: A relay from neural systems for imitation to
limbic areas. Proceedings of the National Academy of Sciences, 100(9), 5497–5502.
Cela-Conde, C. J., García-Prieto, J., Ramasco, J. J., Mirasso, C. R., Bajo, R., Munar, E.,
… Maestú, F. (2013). Dynamics of brain networks in the aesthetic appreciation.
Proceedings of the National Academy of Sciences of the United States of America,
110 Suppl, 10454–61. http://doi.org/10.1073/pnas.1302855110
Chang, L. J., Jolly, E., Cheong, J. H., Rapuano, K., Greenstein, N., Chen, P. A., …
Sciences, B. (2018). Endogenous variation in ventromedial prefrontal cortex state
dynamics during naturalistic viewing reflects affective experience.
Chen, G., Taylor, P. A., Shin, Y., Reynolds, R. C., & Cox, R. W. (2017). Untangling the
relatedness among correlations, Part II: Inter-subject correlation group analysis through linear mixed-effects modeling. NeuroImage, 147, 825–840.
http://doi.org/10.1016/j.neuroimage.2016.08.029
Cowen, A. S., & Keltner, D. (2017). Self-report captures 27 distinct categories of
emotion bridged by continuous gradients. Proceedings of the National Academy of
Sciences, 114(28). http://doi.org/10.1073/pnas.1702247114
Craig, A. D. (2009). How do you feel--now? The anterior insula and human awareness.
Nature Reviews Neuroscience, 10(1), 59–70.
Cross, I. (2009). The evolutionary nature of musical meaning. Musicae Scientiae, 13,
179–200. http://doi.org/10.1177/1029864909013002091
Damasio, A. (1999). The feeling of what happens: Body and emotion in the making of
consciousness. New York: Harvest Books.
Damasio, A. (2004). Emotions and feelings. In A. S. R. Manstead, N. Frijda, & A.
Fischer (Eds.), Feelings and Emotions (pp. 49–57). Cambridge: Cambridge
University Press.
Damasio, A., & Carvalho, G. B. (2013). The nature of feelings: evolutionary and
neurobiological origins. Nature Reviews Neuroscience, 14(2), 143–52.
http://doi.org/10.1038/nrn3403
Damasio, A., Damasio, H., & Tranel, D. (2013). Persistence of feelings and sentience
after bilateral damage of the insula. Cerebral Cortex, 23, 833–846.
http://doi.org/10.1093/cercor/bhs077
Dapretto, M., Davies, M. S., Pfeifer, J. H., Scott, A. A., Sigman, M., Bookheimer, S. Y.,
& Iacoboni, M. (2006). Understanding emotions in others: mirror neuron
dysfunction in children with autism spectrum disorders. Nature Neuroscience, 9(1),
28–30. http://doi.org/10.1038/nn1611
Davis, M. H. (1983). Measuring individual differences in empathy: Evidence for a
multidimensional approach. Journal of Personality and Social Psychology, 44(1),
113–126. http://doi.org/10.1037/0022-3514.44.1.113
Dean, R. T., & Bailes, F. (2010). Time series analysis as a method to examine acoustical
influences on real-time perception of music. Empirical Musicology Review, 5(4),
152–175.
Deen, B., Pitskel, N. B., & Pelphrey, K. A. (2011). Three Systems of Insular Functional
Connectivity Identified with Cluster Analysis. Cerebral Cortex, 21, 1498–1506.
http://doi.org/10.1093/cercor/bhq186
Deng, J. J., & Leung, C. H. C. (2015). Dynamic time warping for music retrieval using
time series modeling of musical emotions. IEEE Transactions on Affective
Computing, 6(2), 137–151. http://doi.org/10.1109/TAFFC.2015.2404352
Destrieux, C., Fischl, B., Dale, A., & Halgren, E. (2010). Automatic parcellation of
human cortical gyri and sulci using standard anatomical nomenclature. NeuroImage,
53(1), 1–15. http://doi.org/10.1016/j.neuroimage.2010.06.010
Du, Y., Pearlson, G. D., Lin, D., Sui, J., Chen, J., Salman, M., … Calhoun, V. D. (2017).
Identifying dynamic functional connectivity biomarkers using GIG-ICA:
Application to schizophrenia, schizoaffective disorder, and psychotic bipolar
disorder. Human Brain Mapping, 38(5), 2683–2708.
http://doi.org/10.1002/hbm.23553
Ebisch, S. J. H., Gallese, V., Willems, R. M., Mantini, D., Groen, W. B., Romani, G. L.,
… Bekkering, H. (2011). Altered Intrinsic Functional Connectivity of Anterior and
Posterior Insula Regions in High-Functioning Participants With Autism Spectrum
Disorder. Human Brain Mapping, 32, 1013–1028. http://doi.org/10.1002/hbm.21085
Eerola, T., Lartillot, O., & Toiviainen, P. (2009). Prediction of multidimensional
emotional ratings in music from audio using multivariate regression models.
Information Retrieval, (Ismir), 621–626. Retrieved from
http://ismir2009.ismir.net/proceedings/PS4-8.pdf
Eerola, T., Peltola, H.-R., & Vuoskoski, J. K. (2015). Attitudes toward sad music are
related to both preferential and contextual strategies. Psychomusicology: Music,
Mind, and Brain, 25(2), 116–123. http://doi.org/10.1037/pmu0000096
Eerola, T., Vuoskoski, J. K., & Kautiainen, H. (2016). Being moved by unfamiliar sad
music is associated with high empathy. Frontiers in Psychology, 7(SEP), 1–12.
http://doi.org/10.3389/fpsyg.2016.01176
Egermann, H., Sutherland, M. E., Grewe, O., Nagel, F., Kopiez, R., & Altenmuller, E.
(2011). Does music listening in a social context alter experience? A physiological
and psychological perspective on emotion. Musicae Scientiae, 15, 307–323.
http://doi.org/10.1177/1029864911399497
Eickhoff, S. B., Schleicher, A., & Zilles, K. (2006). The Human Parietal Operculum. I.
Cytoarchitectonic Mapping of Subdivisions. Cerebral Cortex, 15, 254–267.
http://doi.org/10.1093/cercor/bhi105
Ekman, P. (1992). An argument for basic emotions. Cognition & Emotion, 6(3), 169–
200. http://doi.org/10.1080/02699939208411068
Escoffier, N., Zhong, J., Schirmer, A., & Qiu, A. (2013). Emotional expressions in voice
and music: Same code, same effect? Human Brain Mapping, 34(8), 1796–1810.
http://doi.org/10.1002/hbm.22029
Ethofer, T., Van De Ville, D., Scherer, K., & Vuilleumier, P. (2009). Decoding of
Emotional Information in Voice-Sensitive Cortices. Current Biology, 19(12), 1028–
1033. http://doi.org/10.1016/j.cub.2009.04.054
Fan, Y., Duncan, N. W., de Greck, M., & Northoff, G. (2011). Is there a core neural
network in empathy? An fMRI based quantitative meta-analysis. Neuroscience and
Biobehavioral Reviews, 35(3), 903–11.
http://doi.org/10.1016/j.neubiorev.2010.10.009
Feldman Barrett, L., Mesquita, B., Ochsner, K. N., & Gross, J. J. (2007). The Experience
of Emotion. Annual Review of Psychology, 58, 373–403.
http://doi.org/10.1146/annurev.psych.58.110405.085709
Finn, E. S., Corlett, P. R., Chen, G., Bandettini, P. A., & Constable, R. T. (2018). Trait
paranoia shapes inter-subject synchrony in brain activity during an ambiguous social
narrative. Nature Communications, 9(1), 1–13. http://doi.org/10.1038/s41467-018-
04387-2
Fischl, B., Salat, D. H., Busa, E., Albert, M., Dieterich, M., Haselgrove, C., … Dale, A.
M. (2002). Whole Brain Segmentation: Automated Labeling of Neuroanatomical
Structures in the Human Brain. Neuron, 33, 1–15. Retrieved from
papers2://publication/uuid/188F021B-3647-4A51-A9CC-125CD22ED3D5
Fritz, T., Jentschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., … Koelsch, S.
(2009). Universal Recognition of Three Basic Emotions in Music. Current Biology,
19(7), 573–576. http://doi.org/10.1016/j.cub.2009.02.058
Frühholz, S., Trost, W., & Grandjean, D. (2014). The role of the medial temporal limbic
system in processing emotions in voice and music. Progress in Neurobiology, 123,
1–17. http://doi.org/10.1016/j.pneurobio.2014.09.003
Gabrielsson, A. (2001). Emotion perceived and emotion felt: Same and different.
Musicae Scientiae, 10, 123–147. http://doi.org/10.1177/102986490601000203
Ganis, G., Thompson, W. L., & Kosslyn, S. M. (2004). Brain areas underlying visual
mental imagery and visual perception: An fMRI study. Cognitive Brain Research,
20(2), 226–241. http://doi.org/10.1016/j.cogbrainres.2004.02.012
Garrett, A. S., & Maddock, R. J. (2006). Separating subjective emotion from the
perception of emotion-inducing stimuli: An fMRI study. NeuroImage, 33(1), 263–
274. http://doi.org/10.1016/j.neuroimage.2006.05.024
Garrido, S., & Schubert, E. (2011). Negative emotion in music: what is the attraction? A
Qualitative Study. Empirical Musicology Review, 6(4), 214–230.
Garrido, S., & Schubert, E. (2013). Adaptive and maladaptive attraction to negative
emotions in music. Musicae Scientiae, 17(2), 147–166.
http://doi.org/10.1177/1029864913478305
Garrido, S., & Schubert, E. (2015). Moody melodies: Do they cheer us up? A study of the
effect of sad music on mood. Psychology of Music, 43(2), 244–261.
http://doi.org/10.1177/0305735613501938
Gerber, A. J., Posner, J., Gorman, D., Colibazzi, T., Yu, S., Wang, Z., … Peterson, B. S.
(2008). An affective circumplex model of neural systems subserving valence,
arousal, and cognitive overlay during the appraisal of emotional faces.
Neuropsychologia, 46(8), 2129–2139.
http://doi.org/10.1016/j.neuropsychologia.2008.02.032
Glerean, E., Salmi, J., Lahnakoski, J. M., Jääskeläinen, I. P., & Sams, M. (2012).
Functional Magnetic Resonance Imaging Phase Synchronization as a Measure of
Dynamic Functional Connectivity. Brain Connectivity, 2(2), 91–101.
http://doi.org/10.1089/brain.2011.0068
Gold, B. P., Mas-Herrero, E., Zeighami, Y., Benovoy, M., Dagher, A., & Zatorre, R. J.
(2019). Musical reward prediction errors engage the nucleus accumbens and
motivate learning. Proceedings of the National Academy of Sciences, 1–6.
http://doi.org/10.1073/PNAS.1809855116
Gosling, S. D., Rentfrow, P. J., & Swann, W. B. (2003). A very brief measure of the Big-
Five personality domains. Journal of Research in Personality, 37, 504–528.
http://doi.org/10.1016/S0092-6566(03)00046-1
Groenewold, N. A., Opmeer, E. M., de Jonge, P., Aleman, A., & Costafreda, S. G.
(2013). Emotional valence modulates brain functional abnormalities in depression:
Evidence from a meta-analysis of fMRI studies. Neuroscience and Biobehavioral
Reviews, 37(2), 152–163. http://doi.org/10.1016/j.neubiorev.2012.11.015
Guan, D., Chen, X., & Yang, D. (2012). Music emotion regression based on multi-modal
features. Cmmr 2012, (61170167), 19–22.
Guo, C. C., Nguyen, V. T., Hyett, M. P., Parker, G. B., & Breakspear, M. J. (2015). Out-
of-sync: disrupted neural activity in emotional circuitry during film viewing in
melancholic depression. Scientific Reports, 5(October 2014), 11605.
http://doi.org/10.1038/srep11605
Habibi, A., & Damasio, A. (2014). Music, feelings, and the human brain.
Psychomusicology: Music, Mind, and Brain, 24(1), 92–102.
http://doi.org/10.1037/pmu0000033
Hailstone, J. C., Omar, R., Henley, S. M. D., Frost, C., Michael, G., Warren, J. D., …
Warren, J. D. (2009). It’s not what you play, it’s how you play it: Timbre affects
perception of emotion in music. The Quarterly Journal of Experimental Psychology,
62(11), 2141–2155. http://doi.org/10.1080/17470210902765957
Hasson, U., Avidan, G., Gelbard, H., Minshew, N., Harel, M., Behrmann, M., & Vallines,
I. (2009). Shared and idiosyncratic cortical activation patterns in autism revealed
under continuous real-life viewing conditions. Autism Research, 2(4), 220–231.
http://doi.org/10.1002/aur.89
Hasson, U., Malach, R., & Heeger, D. J. (2010). Reliability of cortical activity during
natural stimulation. Trends in Cognitive Sciences, 14(1), 40–48.
http://doi.org/10.1016/j.tics.2009.10.011
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., & Malach, R. (2004). Intersubject
Synchronization of Cortical Activity During Natural Vision. Science, 303(5664),
1634–1640. http://doi.org/10.1126/science.1089506
Heller, R., Stanley, D., Yekutieli, D., Nava, R., & Benjamini, Y. (2006). Cluster-based
analysis of FMRI data. NeuroImage, 33, 599–608.
http://doi.org/10.1016/j.neuroimage.2006.04.233
Hogue, J. D., Crimmins, A. M., & Kahn, H. (2016). “So sad and slow, so why can’t I turn
off the radio”: The effects of gender, depression, and absorption on liking music
that induces sadness and music that induces happiness. Psychology of Music, 44(4),
816–829. http://doi.org/10.1177/0305735615594489
Howard, J. D., Gottfried, J. A., Tobler, P. N., & Kahnt, T. (2015). Identity-specific
coding of future rewards in the human orbitofrontal cortex. Proceedings of the
National Academy of Sciences, 112(16), 201503550.
http://doi.org/10.1073/pnas.1503550112
Hunter, P. G., Schellenberg, E. G., & Griffith, A. T. (2011). Misery loves company:
Mood-congruent emotional responding to music. Emotion, 11, 1068–1072.
http://doi.org/10.1037/a0023749
Immordino-Yang, M. H., Yang, X.-F., & Damasio, H. (2014). Correlations between
social-emotional feelings and anterior insula activity are independent from visceral
states but influenced by culture. Frontiers in Human Neuroscience, 8(September),
1–15. http://doi.org/10.3389/fnhum.2014.00728
Ishizu, T., & Zeki, S. (2017). The Experience of Beauty Derived from Sorrow. Human
Brain Mapping, 00. http://doi.org/10.1002/hbm.23657
Jääskeläinen, I. P., Koskentalo, K., Balk, M. H., Autti, T., Kauramäki, J., Pomren, C., &
Sams, M. (2008). Inter-subject synchronization of prefrontal cortex hemodynamic
activity during natural viewing. The Open Neuroimaging Journal, 2, 14–9.
http://doi.org/10.2174/1874440000802010014
Johnsen, E. L., Tranel, D., Lutgendorf, S., & Adolphs, R. (2009). A neuroanatomical
dissociation for emotion induced by music. International Journal of
Psychophysiology, 72(1), 24–33. http://doi.org/10.1016/j.ijpsycho.2008.03.011
Juslin, P. N. (2013). From everyday emotions to aesthetic emotions: towards a unified
theory of musical emotions. Physics of Life Reviews, 10(3), 235–66.
http://doi.org/10.1016/j.plrev.2013.05.008
Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and
music performance: different channels, same code? Psychological Bulletin, 129(5),
770–814. http://doi.org/10.1037/0033-2909.129.5.770
Juslin, P. N., & Laukka, P. (2004). Expression, Perception, and Induction of Musical
Emotions: A Review and a Questionnaire Study of Everyday Listening. Journal of
New Music Research, 33(3), 217–238.
http://doi.org/10.1080/0929821042000317813
Kahnt, T., Chang, L. J., Park, S. Q., Heinzle, J., & Haynes, J.-D. (2012). Connectivity-
Based Parcellation of the Human Orbitofrontal Cortex. Journal of Neuroscience,
32(18), 6240–6250. http://doi.org/10.1523/JNEUROSCI.0257-12.2012
Kao, M., Mandal, A., Lazar, N., & Stufken, J. (2009). Multi-objective optimal experimental designs for event-related fMRI studies. NeuroImage, 44(3),
849–856. http://doi.org/10.1016/j.neuroimage.2008.09.025
Kaplan, J. T., Man, K., & Greening, S. G. (2015). Multivariate cross-classification:
applying machine learning techniques to characterize abstraction in neural
representations. Frontiers in Human Neuroscience, 9, 1–12.
http://doi.org/10.3389/fnhum.2015.00151
Kassam, K. S., Markey, A. R., Cherkassky, V. L., Loewenstein, G., & Just, M. A. (2013).
Identifying Emotions on the Basis of Neural Activation. PLoS ONE, 8(6).
http://doi.org/10.1371/journal.pone.0066032
Kauppi, J.-P. (2010). Inter-subject correlation of brain hemodynamic responses during
watching a movie: localization in space and frequency. Frontiers in
Neuroinformatics, 4(March). http://doi.org/10.3389/fninf.2010.00005
Kauppi, J.-P., Pajula, J., & Tohka, J. (2014). A versatile software package for inter-
subject correlation based analyses of fMRI. Frontiers in Neuroinformatics,
8(January), 1–13. http://doi.org/10.3389/fninf.2014.00002
Kawakami, A., & Katahira, K. (2015). Influence of trait empathy on the emotion evoked
by sad music and on the preference for it. Frontiers in Psychology, 6(October), 1–9.
http://doi.org/10.3389/fpsyg.2015.01541
Kawamichi, H., Sugawara, S. K., Hamano, Y. H., Makita, K., Kochiyama, T., & Sadato,
N. (2016). Increased frequency of social interaction is associated with enjoyment
enhancement and reward system activation. Scientific Reports, 6(April), 24561.
http://doi.org/10.1038/srep24561
Kim, D., Kay, K., Shulman, G. L., & Corbetta, M. (2018). A new modular brain
organization of the bold signal during natural vision. Cerebral Cortex, 28(9), 3065–
3081. http://doi.org/10.1093/cercor/bhx175
Kim, J., Shinkareva, S. V, & Wedell, D. H. (2017). Representations of modality-general
valence for videos and music derived from fMRI data. NeuroImage, 148, 42–54.
http://doi.org/10.1016/j.neuroimage.2017.01.002
Kim, Y. E., Schmidt, E. M., Migneco, R., Morton, B. G., Richardson, P., Scott, J., …
Turnbull, D. (2010). Music Emotion Recognition: A State of the Art Review.
Information Retrieval, (Ismir), 255–266. Retrieved from
http://ismir2010.ismir.net/proceedings/ismir2010-45.pdf
Kleiner, M., Brainard, D. H., & Pelli, D. (2007). What’s new in Psychtoolbox-3?
Perception 36 ECVP Abstract Supplement.
Kober, H., Barrett, L. F., Joseph, J., Bliss-Moreau, E., Lindquist, K. A., & Wager, T. D.
(2008). Functional grouping and cortical–subcortical interactions in emotion: A
meta-analysis of neuroimaging studies. NeuroImage, 42(2), 998–1031.
http://doi.org/10.1016/j.neuroimage.2008.03.059
Koelsch, S. (2014). Brain correlates of music-evoked emotions. Nature Reviews.
Neuroscience, 15(3), 170–80. http://doi.org/10.1038/nrn3666
Koelsch, S., & Skouras, S. (2014). Functional centrality of amygdala, striatum and
hypothalamus in a “small-world” network underlying joy: An fMRI study with
music. Human Brain Mapping, 35(7), 3485–3498. http://doi.org/10.1002/hbm.22416
Kotz, S. A., Kalberlah, C., Bahlmann, J., Friederici, A. D., & Haynes, J. D. (2013).
Predicting vocal emotion expressions from the human brain. Human Brain Mapping,
34(8), 1971–1981. http://doi.org/10.1002/hbm.22041
Kragel, P. A., & Labar, K. S. (2013). Multivariate pattern classification reveals
autonomic and experiential representations of discrete emotions. Emotion, 13(4),
681–690. http://doi.org/10.1037/a0031820
Kragel, P. A., & LaBar, K. S. (2015). Multivariate neural biomarkers of emotional states
are categorically distinct. Social Cognitive and Affective Neuroscience, 10, 1437–
1448. http://doi.org/10.1093/scan/nsv032
Krämer, U. M., Mohammadi, B., Doñamayor, N., Samii, A., & Münte, T. F. (2010).
Emotional and cognitive aspects of empathy and their relation to social cognition-an
fMRI-study. Brain Research, 1311, 110–120.
http://doi.org/10.1016/j.brainres.2009.11.043
Kriegeskorte, N., Goebel, R., & Bandettini, P. A. (2006). Information-based functional
brain mapping. Proceedings of the National Academy of Sciences, 103(10), 3863–
3868. http://doi.org/10.1073/pnas.0600244103
Kringelbach, M. L., & Berridge, K. C. (2009). Towards a functional neuroanatomy of
pleasure and happiness. Trends in Cognitive Sciences, 13(September), 479–487.
http://doi.org/10.1016/j.tics.2009.08.006
Kringelbach, M. L., & Rolls, E. T. (2004). The functional neuroanatomy of the human
orbitofrontal cortex: Evidence from neuroimaging and neuropsychology. Progress
in Neurobiology, 72, 341–372. http://doi.org/10.1016/j.pneurobio.2004.03.006
Kurth, F., Zilles, K., Fox, P. T., Laird, A. R., & Eickhoff, S. B. (2010). A link between
the systems: functional differentiation and integration within the human insula
revealed by meta-analysis. Brain Structure and Function, 214, 519–534.
http://doi.org/10.1007/s00429-010-0255-z
Ladinig, O., & Schellenberg, E. G. (2012). Liking unfamiliar music: Effects of felt
emotion and individual differences. Psychology of Aesthetics, Creativity, and the
Arts, 6(2), 146–154. http://doi.org/10.1037/a0024671
Lamm, C., Batson, C. D., & Decety, J. (2007). The neural substrate of human empathy:
effects of perspective-taking and cognitive appraisal. Journal of Cognitive
Neuroscience, 19(1), 42–58. http://doi.org/10.1162/jocn.2007.19.1.42
Lartillot, O., & Toiviainen, P. (2007). A matlab toolbox for musical feature extraction
from audio. International Conference on Digital Audio …, 1–8. Retrieved from
http://dafx.labri.fr/main/papers/p237.pdf%5Cnpapers2://publication/uuid/840762A7
-A43B-48F8-A50C-85BFCE586BDE
Lawrence, E. J., Shaw, P., Giampietro, V. P., Surguladze, S., Brammer, M. J., & David,
a. S. (2006). The role of “shared representations” in social perception and empathy:
An fMRI study. NeuroImage, 29(4), 1173–1184.
http://doi.org/10.1016/j.neuroimage.2005.09.001
Lehne, M., Rohrmeier, M., & Koelsch, S. (2013). Tension-related activity in the
orbitofrontal cortex and amygdala: an fMRI study with music. Social Cognitive and
Affective Neuroscience. http://doi.org/10.1093/scan/nst141
Levinson, J. (1990). Music, art, and metaphysics: essays in philosophical aesthetics.
Oxford University Press.
Li, W., Mai, X., & Liu, C. (2014). The default mode network and social understanding of
others: what do brain connectivity studies tell us. Frontiers in Human Neuroscience,
8(February), 1–15. http://doi.org/10.3389/fnhum.2014.00074
Liljestrom, S., Juslin, P. N., & Vastfjall, D. (2012). Experimental evidence of the roles of
music choice, social context, and listener personality in emotional reactions to
music. Psychology of Music. http://doi.org/10.1177/0305735612440615
Lindquist, K. A., Satpute, A. B., Wager, T. D., Weber, J., & Barrett, L. F. (2015). The
Brain Basis of Positive and Negative Affect: Evidence from a Meta-Analysis of the
Human Neuroimaging Literature. Cerebral Cortex. http://doi.org/10.1093/cercor/bhv001
Lindquist, K. A., Wager, T. D., Kober, H., Bliss-Moreau, E., & Barrett, L. F. (2012). The
brain basis of emotion: a meta-analytic review. The Behavioral and Brain Sciences,
35(3), 121–143. http://doi.org/10.1017/S0140525X11000446
Linke, A. C., & Cusack, R. (2015). Flexible Information Coding in Human Auditory
Cortex during Perception, Imagery, and STM of Complex Sounds. Journal of
Cognitive Neuroscience, 27(7). http://doi.org/10.1162/jocn
Löwe, B., Decker, O., Müller, S., Brähler, E., Schellberg, D., Herzog, W., & Herzberg, P.
Y. (2008). Validation and standardization of the generalized anxiety disorder
screener (GAD-7) in the general population. Medical Care, 46(3), 266–274.
http://doi.org/10.1097/MLR.0b013e318160d093
Lundqvist, L.-O., Carlsson, F., Hilmersson, P., & Juslin, P. N. (2008). Emotional
responses to music: experience, expression, and physiology. Psychology of Music,
37, 61–90. http://doi.org/10.1177/0305735607086048
Man, K., Damasio, A., Meyer, K., & Kaplan, J. T. (2015). Convergent and Invariant
Object Representations for Sight, Sound, and Touch. Human Brain Mapping, 36,
3629–3640. http://doi.org/10.1002/hbm.22867
Martin, A., Rief, W., Klaiberg, A., & Braehler, E. (2006). Validity of the Brief Patient
Health Questionnaire Mood Scale (PHQ-9) in the general population. General
Hospital Psychiatry, 28, 71–77. http://doi.org/10.1016/j.genhosppsych.2005.07.003
McCrae, R. R. (2007). Aesthetic chills as a universal marker of openness to experience.
Motivation and Emotion, 31(1), 5–11. http://doi.org/10.1007/s11031-007-9053-1
McPherson, M. J., Barrett, F. S., Lopez-Gonzalez, M., Jiradejvong, P., & Limb, C. J.
(2016). Emotional Intent Modulates The Neural Substrates Of Creativity: An fMRI
Study of Emotionally Targeted Improvisation in Jazz Musicians. Scientific Reports,
6(November 2015), 18460. http://doi.org/10.1038/srep18460
Mitchell, T. M., Hutchinson, R., & Pereira, F. (2004). Learning to decode cognitive states
from brain images. Machine Learning: ECML 2004, 57, 145–175.
Mitterschiffthaler, M. T., Fu, C. H. Y., Dalton, J. a, Andrew, C. M., & Williams, S. C. R.
(2007). A functional MRI study of happy and sad affective states induced by
classical music. Human Brain Mapping, 28(11), 1150–62.
http://doi.org/10.1002/hbm.20337
Mitterschiffthaler, M. T., Kumari, V., Malhi, G. S., Brown, R. G., Giampietro, V. P.,
Brammer, M. J., … Sharma, T. (2003). Neural response to pleasant stimuli in
anhedonia: an fMRI study. Neuroreport, 14(2), 177–182.
http://doi.org/10.1097/00001756-200302100-00003
Müllensiefen, D., Gingras, B., Musil, J., & Stewart, L. (2014). The Musicality of Non-Musicians: An Index for Assessing Musical Sophistication in the General
Population. PLoS ONE, 9(2). http://doi.org/10.1371/journal.pone.0089642
Nguyen, V. T., Breakspear, M., Hu, X., & Guo, C. C. (2016). The integration of the
internal and external milieu in the insula during dynamic emotional experiences.
NeuroImage, 124, 455–463. http://doi.org/10.1016/j.neuroimage.2015.08.078
Norman, K. a, Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading:
multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10(9), 424–
30. http://doi.org/10.1016/j.tics.2006.07.005
Nummenmaa, L., Glerean, E., Viinikainen, M., Jaaskelainen, I. P., Hari, R., & Sams, M.
(2012). Emotions promote social interaction by synchronizing brain activity across
individuals. Proceedings of the National Academy of Sciences, 109(24), 9599–9604.
http://doi.org/10.1073/pnas.1206095109
Nummenmaa, L., & Lahnakoski, J. M. (2018). Sharing the social world via intersubject neural synchronization. Current Opinion in Psychology. http://doi.org/10.1016/j.copsyc.2018.02.021
Nummenmaa, L., & Saarimäki, H. (2019). Emotions as discrete patterns of systemic
activity. Neuroscience Letters, 693, 3–8. http://doi.org/10.1016/j.neulet.2017.07.012
Nummenmaa, L., Saarimäki, H., Glerean, E., Gotsopoulos, A., Jääskeläinen, I. P., Hari,
R., & Sams, M. (2014). Emotional speech synchronizes brains across listeners and
engages large-scale dynamic brain networks. NeuroImage, 102(P2), 498–509.
http://doi.org/10.1016/j.neuroimage.2014.07.063
Nummenmaa, L., Smirnov, D., Lahnakoski, J. M., Glerean, E., Jääskeläinen, I. P., Sams,
M., & Hari, R. (2014). Mental Action Simulation Synchronizes Action–Observation
Circuits across Individuals. The Journal of Neuroscience, 34(3), 748–757.
http://doi.org/10.1523/JNEUROSCI.0352-13.2014
Nusbaum, E. C., & Silvia, P. J. (2011). Shivers and Timbres: Personality and the
Experience of Chills From Music. Social Psychological and Personality Science,
2(2), 199–204. http://doi.org/10.1177/1948550610386810
Oliver, M. B., & Raney, A. a. (2011). Entertainment as Pleasurable and Meaningful:
Identifying Hedonic and Eudaimonic Motivations for Entertainment Consumption.
Journal of Communication, 61(5), 984–1004. http://doi.org/10.1111/j.1460-
2466.2011.01585.x
Pajula, J., Kauppi, J. P., & Tohka, J. (2012). Inter-subject correlation in fMRI: Method
validation against stimulus-model based analysis. PLoS ONE, 7(8).
http://doi.org/10.1371/journal.pone.0041196
Paquette, S., Peretz, I., & Belin, P. (2013). The “Musical Emotional Bursts”: A validated
set of musical affect bursts to investigate auditory affective processing. Frontiers in
Psychology, 4, 1–7. http://doi.org/10.3389/fpsyg.2013.00509
Park, M., Hennig-Fast, K., Bao, Y., Carl, P., Pöppel, E., Welker, L., … Gutyrchik, E.
(2013). Personality traits modulate neural responses to emotions expressed in music.
Brain Research, 1523, 68–76. http://doi.org/10.1016/j.brainres.2013.05.042
Pearson, J. M., Heilbronner, S. R., Barack, D. L., Hayden, B. Y., & Platt, M. L. (2011).
Posterior cingulate cortex: Adapting behavior to a changing world. Trends in
Cognitive Sciences, 15(4), 143–151. http://doi.org/10.1016/j.tics.2011.02.002
Peelen, M. V, Atkinson, A. P., & Vuilleumier, P. (2010). Supramodal Representations of
Perceived Emotions in the Human Brain. The Journal of Neuroscience: The Official
Journal of the Society for Neuroscience, 30(30), 10127–10134.
http://doi.org/10.1523/JNEUROSCI.2161-10.2010
Pereira, C. S., Teixeira, J., Figueiredo, P., Xavier, J., Castro, S. L., & Brattico, E. (2011).
Music and Emotions in the Brain: Familiarity Matters. PLoS ONE, 6(11).
http://doi.org/10.1371/journal.pone.0027241
Phan, K. L., Wager, T., Taylor, S. F., & Liberzon, I. (2002). Functional neuroanatomy of
emotion: a meta-analysis of emotion activation studies in PET and fMRI.
NeuroImage, 16(2), 331–48. http://doi.org/10.1006/nimg.2002.1087
Phillips, M. L., Drevets, W. C., Rauch, S. L., & Lane, R. (2003). Neurobiology of
Emotion Perception I: The Neural Basis of Normal Emotion Perception. Biological Psychiatry, 54(5), 504–514. http://doi.org/10.1016/S0006-3223(03)00168-9
Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L., & Petersen, S. E. (2012).
Spurious but systematic correlations in functional connectivity MRI networks arise
from subject motion. NeuroImage, 59(3), 2142–2154.
http://doi.org/10.1016/j.neuroimage.2011.10.018
Pruim, R. H. R., Mennes, M., Buitelaar, J. K., & Beckmann, C. F. (2015). Evaluation of
ICA-AROMA and alternative strategies for motion artifact removal in resting state
fMRI. NeuroImage, 112, 278–287. http://doi.org/10.1016/j.neuroimage.2015.02.063
Raz, G., Touroutoglou, A., Wilson-Mendenhall, C., Gilam, G., Lin, T., Gonen, T., …
Barrett, L. F. (2016). Functional connectivity dynamics during film viewing reveal
common networks for different emotional experiences. Cognitive, Affective, &
Behavioral Neuroscience, (May). http://doi.org/10.3758/s13415-016-0425-4
Raz, G., Winetraub, Y., Jacob, Y., Kinreich, S., Maron-Katz, A., Shaham, G., …
Hendler, T. (2012). Portraying emotions at their unfolding: A multilayered approach
for probing dynamics of neural networks. NeuroImage, 60(2), 1448–1461.
http://doi.org/10.1016/j.neuroimage.2011.12.084
Rigoulot, S., Pell, M. D., & Armony, J. L. (2015). Time course of the influence of
musical expertise on the processing of vocal and musical sounds. Neuroscience, 290,
175–184. http://doi.org/10.1016/j.neuroscience.2015.01.033
Rijn, S. Van, Aleman, A., Diessen, E. Van, Berckmoes, C., Vingerhoets, G., & Kahn, R.
S. (2005). What is said or how it is said makes a difference: Role of the right fronto-
parietal operculum in emotional prosody as revealed by repetitive TMS. European
Journal of Neuroscience, 21, 3195–3200. http://doi.org/10.1111/j.1460-
9568.2005.04130.x
Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.
Russell, J. A. (1980). A Circumplex Model of Affect. Journal of Personality and Social
Psychology, 39(6), 1161–1178. http://doi.org/10.1037/h0077714
Russell, J. A. (2003). Core affect and the psychological construction of emotion.
Psychological Review, 110(1), 145–72. http://doi.org/10.1037/0033-295X.110.1.145
Saarimaki, H., Gotsopoulos, A., Jaaskelainen, I. P., Lampinen, J., Vuilleumier, P., Hari,
R., … Nummenmaa, L. (2015). Discrete Neural Signatures of Basic Emotions.
Cerebral Cortex, 1–11. http://doi.org/10.1093/cercor/bhv086
Saarimäki, H., Gotsopoulos, A., Jääskeläinen, I. P., Lampinen, J., Vuilleumier, P., Hari,
R., … Nummenmaa, L. (2016). Discrete Neural Signatures of Basic Emotions.
Cerebral Cortex, 26, 2563–2573. http://doi.org/10.1093/cercor/bhv086
Sachs, M. E., Damasio, A., & Habibi, A. (2015). The pleasures of sad music: a
systematic review. Frontiers in Human Neuroscience, 9(July), 1–12.
http://doi.org/10.3389/fnhum.2015.00404
Sachs, M. E., Ellis, R. J., Schlaug, G., & Loui, P. (2016). Brain connectivity reflects
human aesthetic responses to music. Social Cognitive and Affective Neuroscience,
(October 2015), 1–8. http://doi.org/10.1093/scan/nsw009
Sachs, M. E., Habibi, A., Damasio, A., & Kaplan, J. T. (2018). Decoding the neural
signatures of emotions expressed through sound. NeuroImage, 174(March), 1–10.
http://doi.org/10.1016/j.neuroimage.2018.02.058
Salimpoor, V. N., Benovoy, M., Larcher, K., Dagher, A., & Zatorre, R. J. (2011).
Anatomically distinct dopamine release during anticipation and experience of peak
emotion to music. Nature Neuroscience, 14(2), 257–262.
http://doi.org/10.1038/nn.2726
Salimpoor, V. N., Benovoy, M., Longo, G., Cooperstock, J. R., & Zatorre, R. J. (2009).
The rewarding aspects of music listening are related to degree of emotional arousal.
PLoS ONE, 4(10). http://doi.org/10.1371/journal.pone.0007487
Salimpoor, V. N., van den Bosch, I., Kovacevic, N., McIntosh, A. R., Dagher, A., &
Zatorre, R. J. (2013a). Interactions between the nucleus accumbens and auditory
cortices predict music reward value. Science, 340(6129), 216–9.
http://doi.org/10.1126/science.1231059
Salimpoor, V. N., van den Bosch, I., Kovacevic, N., McIntosh, A. R., Dagher, A., &
Zatorre, R. J. (2013b). Interactions between the nucleus accumbens and auditory
cortices predict music reward value. Science (New York, N.Y.), 340(6129), 216–9.
http://doi.org/10.1126/science.1231059
Salimpoor, V. N., Zald, D. H., Zatorre, R. J., Dagher, A., & Mcintosh, A. R. (2015).
Predictions and the brain: How musical sounds become rewarding. Trends in
Cognitive Sciences, 19(2), 86–91. http://doi.org/10.1016/j.tics.2014.12.001
Sander, K., & Scheich, H. (2005). Left Auditory Cortex and Amygdala, but Right Insula
Dominance for Human Laughing and Crying. Journal of Cognitive Neuroscience,
17(10), 1519–1531.
Schirmer, A., & Adolphs, R. (2017). Emotion Perception from Face, Voice, and Touch:
Comparisons and Convergence. Trends in Cognitive Sciences, 21(3), 216–228.
http://doi.org/10.1016/j.tics.2017.01.001
Schnell, K., Bluschke, S., Konradt, B., & Walter, H. (2011). Functional relations of
empathy and mentalizing: An fMRI study on the neural basis of cognitive empathy.
NeuroImage, 54(2), 1743–1754. http://doi.org/10.1016/j.neuroimage.2010.08.024
Schonwiesner, M., Rübsamen, R., & von Cramon, D. Y. (2005). Hemispheric asymmetry
for spectral and temporal processing in the human antero-lateral auditory belt cortex.
European Journal of Neuroscience, 22, 1521–1528. http://doi.org/10.1111/j.1460-
9568.2005.04315.x
Schubert, E., Halpern, A. R., Kreutz, G., & Garrido, S. (2018). Attraction to sad music:
The role of imagery, absorption, and rumination. Psychology of Aesthetics,
Creativity, and the Arts, 12(3), 251–258. http://doi.org/10.1037/aca0000160
Shakil, S., Lee, C., & Dawn, S. (2016). Evaluation of sliding window correlation
performance for characterizing dynamic functional connectivity and brain states.
NeuroImage, 133, 111–128. http://doi.org/10.1016/j.neuroimage.2016.02.074
Shirer, W. R., Ryali, S., Rykhlevskaia, E., Menon, V., & Greicius, M. D. (2012).
Decoding subject-driven cognitive states with whole-brain connectivity patterns.
Cerebral Cortex, 22(1), 158–165. http://doi.org/10.1093/cercor/bhr099
Siegle, G. J., Steinhauer, S. R., Thase, M. E., Stenger, V. A., & Carter, C. S. (2002).
Can’t shake that feeling: Event-related fMRI assessment of sustained amygdala
activity in response to emotional information in depressed individuals. Biological
Psychiatry, 51, 693–707. http://doi.org/10.1016/S0006-3223(02)01314-8
Silani, G., Bird, G., Brindley, R., Singer, T., Frith, C., & Frith, U. (2008). Levels of
emotional awareness and autism: An fMRI study. Social Neuroscience, 3(2), 97–
112. http://doi.org/10.1080/17470910701577020
Simony, E., Honey, C. J., Chen, J., Lositsky, O., Yeshurun, Y., Wiesel, A., & Hasson, U.
(2016). Dynamic reconfiguration of the default mode network during narrative
comprehension. Nature Communications, 7(May 2015), 1–13.
http://doi.org/10.1038/ncomms12141
Singer, N., Jacoby, N., Lin, T., Raz, G., Shpigelman, L., Gilam, G., … Hendler, T. (2016). Common modulation of limbic network activation underlies musical emotions as they unfold. NeuroImage, 141, 517–529. http://doi.org/10.1016/j.neuroimage.2016.07.002
Singer, T., & Lamm, C. (2009). The Social Neuroscience of Empathy. Ann. N.Y. Acad.
Sci., 96, 81–96. http://doi.org/10.1111/j.1749-6632.2009.04418.x
Singer, T., Seymour, B., Doherty, J. O., Kaube, H., Dolan, R. J., & Frith, C. D. (2004).
Empathy for Pain Involves the Affective but not Sensory Components of Pain.
Science, 303, 1157–1162.
Skerry, A. E., & Saxe, R. (2014). A Common Neural Code for Perceived and Inferred
Emotion. The Journal of Neuroscience, 34(48), 15997–16008.
http://doi.org/10.1523/JNEUROSCI.1676-14.2014
Smith, S. M., Jenkinson, M., Woolrich, M. W., Beckmann, C. F., Behrens, T. E. J.,
Johansen-Berg, H., … Matthews, P. M. (2004). Advances in functional and
structural MR image analysis and implementation as FSL. NeuroImage, 23, S208-
19. http://doi.org/10.1016/j.neuroimage.2004.07.051
Smith, S. M., & Nichols, T. E. (2009). Threshold-free cluster enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference.
NeuroImage, 44, 83–98. http://doi.org/10.1016/j.neuroimage.2008.03.061
Steinbeis, N., Koelsch, S., & Sloboda, J. a. (2006). The role of harmonic expectancy
violations in musical emotions: evidence from subjective, physiological, and neural
responses. Journal of Cognitive Neuroscience, 18(8), 1380–93.
http://doi.org/10.1162/jocn.2006.18.8.1380
Stelzer, J., Chen, Y., & Turner, R. (2013). Statistical inference and multiple testing
correction in classification-based multi-voxel pattern analysis (MVPA): Random
permutations and cluster size control. NeuroImage, 65, 69–82.
http://doi.org/10.1016/j.neuroimage.2012.09.063
Straube, T., & Miltner, W. H. R. (2011). Attention to aversive emotion and specific
activation of the right insula and right somatosensory cortex. NeuroImage, 54(3),
2534–2538. http://doi.org/10.1016/j.neuroimage.2010.10.010
Taruffi, L., & Koelsch, S. (2014). The Paradox of Music-Evoked Sadness: An Online
Survey. PloS One, 9(10), e110490. http://doi.org/10.1371/journal.pone.0110490
Taruffi, L., Pehrs, C., Skouras, S., & Koelsch, S. (2017). Effects of Sad and Happy Music
on Mind-Wandering and the Default Mode Network. Scientific Reports, 7(1), 1–10.
http://doi.org/10.1038/s41598-017-14849-0
Taylor, S. F., Phan, K. L., Decker, L. R., & Liberzon, I. (2003). Subjective rating of
emotionally salient stimuli modulates neural activity. NeuroImage, 18(3), 650–659.
http://doi.org/10.1016/S1053-8119(02)00051-4
Touroutoglou, A., Hollenbeck, M., Dickerson, B. C., & Feldman, L. (2012). Dissociable
large-scale networks anchored in the right anterior insula subserve affective
experience and attention. NeuroImage, 60(4), 1947–1958.
http://doi.org/10.1016/j.neuroimage.2012.02.012
Touroutoglou, A., Lindquist, K. A., Dickerson, B. C., & Barrett, L. F. (2014). Intrinsic
connectivity in the human brain does not reveal networks for “basic” emotions.
Social Cognitive and Affective Neuroscience, 10(9), 1257–1265.
http://doi.org/10.1093/scan/nsv013
Trohidis, K., & Kalliris, G. (2008). Multi-Label Classification of Music Into Emotions.
Learning, 2008, 325–330. Retrieved from
http://ismir2008.ismir.net/papers/ISMIR2008_275.pdf
Trost, W., Ethofer, T., Zentner, M., & Vuilleumier, P. (2012). Mapping aesthetic musical
emotions in the brain. Cerebral Cortex, 22(12), 2769–83.
http://doi.org/10.1093/cercor/bhr353
Trost, W., Frühholz, S., Cochrane, T., Cojan, Y., & Vuilleumier, P. (2015). Temporal
dynamics of musical emotions examined through intersubject synchrony of brain
activity. Social Cognitive and Affective Neuroscience, 10(12), 1705–1721.
http://doi.org/10.1093/scan/nsv060
Upham, F., & McAdams, S. (2018). Activity Analysis and Coordination in Continuous
Responses to Music. Music Perception: An Interdisciplinary Journal, 35(3), 253–
294. http://doi.org/10.1525/mp.2018.35.3.253
Vessel, E. A., Starr, G. G., & Rubin, N. (2012). The brain on art: intense aesthetic
experience activates the default mode network. Frontiers in Human Neuroscience,
6(April), 66. http://doi.org/10.3389/fnhum.2012.00066
Viinikainen, M., Glerean, E., Jääskeläinen, I. P., Kettunen, J., Sams, M., & Nummenmaa,
L. (2012). Nonlinear neural representation of emotional feelings elicited by dynamic
naturalistic stimulation. Open Journal of Neuroscience, 17(2–4), 1–2.
von dem Hagen, E. A. H., Nummenmaa, L., Yu, R., Engell, A. D., Ewbank, M. P., &
Calder, A. J. (2011). Autism Spectrum Traits in the Typical Population Predict
Structure and Function in the Posterior Superior Temporal Sulcus. Cerebral Cortex,
21, 492–500. http://doi.org/10.1093/cercor/bhq062
Vuoskoski, J. K., & Eerola, T. (2012). Can sad music really make you sad? Indirect
measures of affective states induced by music and autobiographical memories.
Psychology of Aesthetics, Creativity, and the Arts, 6(3), 204–213.
http://doi.org/10.1037/a0026937
Vuoskoski, J., Thompson, W. F., McIlwain, D., & Eerola, T. (2011). Who Enjoys
Listening to Sad Music and Why? Music Perception, 29(3), 311–318.
Wager, T. D., Phan, K. L., Liberzon, I., & Taylor, S. F. (2003). Valence, gender, and lateralization of functional brain anatomy in emotion: A meta-analysis of findings from neuroimaging. NeuroImage, 19, 513–531. http://doi.org/10.1016/S1053-8119(03)00078-8
Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief
measures of positive and negative affect: the PANAS scales. Journal of Personality
and Social Psychology, 54(6), 1063–70. Retrieved from
http://www.ncbi.nlm.nih.gov/pubmed/3397865
Wegrzyn, M., Riehle, M., Labudda, K., Woermann, F., Baumgartner, F., Pollmann, S., …
Kissler, J. (2015). Investigating the brain basis of facial expression perception using
multi-voxel pattern analysis. Cortex, 69, 131–140.
http://doi.org/10.1016/j.cortex.2015.05.003
Wilson-Mendenhall, C. D., Barrett, L. F., & Barsalou, L. W. (2015). Variety in emotional life: Within-category typicality of emotional experiences is associated with neural activity in large-scale brain networks. Social Cognitive and Affective Neuroscience, 10(1), 62–71. http://doi.org/10.1093/scan/nsu037
Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., & Nichols, T. E. (2014).
Permutation inference for the general linear model. NeuroImage, 92, 381–397.
http://doi.org/10.1016/j.neuroimage.2014.01.060
Yeo, B. T. T., Krienen, F. M., Sepulcre, J., Sabuncu, M. R., Lashkari, D., Hollinshead,
M., … Buckner, R. L. (2011). The organization of the human cerebral cortex
estimated by intrinsic functional connectivity. Journal of Neurophysiology, 106, 1125–1165.
http://doi.org/10.1152/jn.00338.2011.
Zentner, M., Grandjean, D., & Scherer, K. R. (2008). Emotions evoked by the sound of
music: characterization, classification, and measurement. Emotion (Washington,
D.C.), 8(4), 494–521. http://doi.org/10.1037/1528-3542.8.4.494
Appendix A: Confusion matrices for classification of auditory emotions
(Figure: confusion matrices of true versus predicted emotion labels for within-instrument classification in the whole brain and in auditory cortex, and for cross-instrument classification leaving out clarinet, violin, or voice, shown separately for the whole-brain and auditory-cortex ROIs.)
Appendix B: Classification performance values for within-instrument classification
ROI Emotion Sensitivity Specificity PPV NPV
Whole-brain Fear 0.53 0.57 0.49 0.62
Happiness 0.40 0.57 0.38 0.60
Sad 0.37 0.67 0.44 0.59
Average 0.43 0.60 0.43 0.60
Auditory cortex Fear 0.59 0.63 0.53 0.69
Happiness 0.42 0.63 0.39 0.65
Sad 0.45 0.71 0.52 0.65
Average 0.49 0.66 0.48 0.66
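For reference, the four metrics reported above can be derived from a confusion matrix by treating each emotion as a one-versus-rest problem. The following is a minimal Python sketch of that calculation; the function name and the 3 x 3 example matrix are illustrative only and do not correspond to the study's actual trial counts.

```python
import numpy as np

def per_class_metrics(conf):
    """Compute one-vs-rest sensitivity, specificity, PPV, and NPV
    from a square confusion matrix (rows = true labels, cols = predicted)."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    metrics = {}
    for k in range(conf.shape[0]):
        tp = conf[k, k]
        fn = conf[k, :].sum() - tp   # true class k, predicted as something else
        fp = conf[:, k].sum() - tp   # predicted k, true class is something else
        tn = total - tp - fn - fp
        metrics[k] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "PPV": tp / (tp + fp),
            "NPV": tn / (tn + fn),
        }
    return metrics

# Illustrative 3-class confusion matrix (fear, happiness, sadness); counts are made up.
example = [[30, 15, 12],
           [14, 25, 18],
           [13, 17, 26]]
print(per_class_metrics(example))
```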
Appendix C: Pairwise cross-instrument classification accuracies
Note: Cross-instrument classification is conducted by training the classifier on data from one instrument and testing it on data from another. Error bars represent standard error. P values are calculated from a one-sample t-test comparing classification accuracy with chance (0.33, dotted line). †p < 0.05, uncorrected; *p < 0.05; **p < 0.01; ***p < 0.001, corrected for multiple comparisons across the two ROIs.
(Figure: bar plot of pairwise cross-instrument classification accuracies for the whole brain and the auditory cortex (AC) across the six train–test pairs: Violin–Clarinet, Clarinet–Voice, Voice–Clarinet, Clarinet–Violin, Violin–Voice, and Voice–Violin. Classification accuracy is plotted from 0 to 0.7, with the dotted line marking chance (0.33) and asterisks marking accuracies significantly above chance as described in the note above.)
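As a rough illustration of the procedure described in the note above, the following Python sketch trains a classifier on patterns from one instrument, tests it on another, and compares the resulting subject-level accuracies with the chance level of 0.33 using a one-sample t-test. The linear SVM, the variable names, and the randomly generated accuracies are assumptions for illustration, not the study's actual pipeline or data.

```python
import numpy as np
from scipy import stats
from sklearn.svm import LinearSVC

def cross_instrument_accuracy(X_train, y_train, X_test, y_test):
    """Train on voxel patterns from one instrument, test on another (one subject)."""
    clf = LinearSVC(max_iter=10000)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)

# In practice, cross_instrument_accuracy would be run for every subject and every
# train/test instrument pair; here we substitute illustrative random accuracies.
rng = np.random.default_rng(0)
accs = rng.normal(loc=0.42, scale=0.08, size=20)   # one accuracy per subject

# One-sample t-test against the chance level of 1/3 (three emotion classes)
t, p = stats.ttest_1samp(accs, popmean=1 / 3)
print(f"mean accuracy = {accs.mean():.2f}, t = {t:.2f}, p = {p:.4f}")
```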
Appendix D: Significant peaks and coordinates for searchlight results
Within-instrument searchlight results
Region Hemisphere x y z Accuracy
Superior temporal gyrus R 56 -6 4 0.49
Superior temporal gyrus L -56 -18 4 0.48
Heschl’s gyrus R 48 -20 12 0.45
Heschl’s gyrus L -46 -22 12 0.45
Posterior insula R 44 -12 8 0.44
Posterior Insula L -40 -14 8 0.41
Parietal operculum R 54 -20 16 0.44
Parietal operculum L -46 -30 14 0.44
Middle temporal gyrus R 64 -16 -8 0.44
Precentral gyrus R 50 -4 48 0.38
Precentral gyrus L -52 4 6 0.41
Inferior frontal gyrus R 52 16 -2 0.39
Inferior frontal gyrus L -54 16 0 0.38
Superior frontal gyrus R 8 14 60 0.38
Middle frontal gyrus R 46 8 52 0.38
Middle frontal gyrus L -42 6 48 0.38
Medial prefrontal cortex R 20 58 16 0.38
Precuneus R 4 -44 44 0.38
Supramarginal gyrus R 46 -46 46 0.37
Cross-instrument whole brain searchlight results
Region Hemisphere x y z Accuracy
Heschl's gyrus R 56 -8 2 0.46
Heschl's gyrus L -52 -18 4 0.43
Superior temporal gyrus R 60 -14 -2 0.45
Superior temporal gyrus L -58 -6 0 0.43
Posterior insula R 42 -14 6 0.40
Posterior Insula L -38 -14 6 0.40
Parietal operculum R 56 -20 16 0.42
Parietal operculum L -56 -26 14 0.42
Superior temporal gyrus R 60 -12 -2 0.46
Superior temporal gyrus L -52 -10 -2 0.42
Note. Stereotactic coordinates (x, y, z) from the MNI atlas correspond to the voxel with peak accuracy within a particular anatomical region.
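A sketch of how such peak coordinates can be obtained: given a searchlight accuracy map and an integer-labeled anatomical atlas on the same voxel grid, the peak voxel within each region is located and converted to millimeter coordinates through the image affine. The file names and the atlas labeling scheme below are hypothetical.

```python
import numpy as np
import nibabel as nib
from nibabel.affines import apply_affine

# Hypothetical inputs: a searchlight accuracy map and an integer-labeled atlas,
# both in the same (MNI) space and on the same voxel grid.
acc_img = nib.load("searchlight_accuracy.nii.gz")
atlas_img = nib.load("anatomical_labels.nii.gz")

acc = acc_img.get_fdata()
labels = atlas_img.get_fdata().astype(int)

for region_id in np.unique(labels):
    if region_id == 0:           # 0 = background in this hypothetical atlas
        continue
    mask = labels == region_id
    # Voxel index of the peak accuracy within this region
    flat_idx = np.argmax(np.where(mask, acc, -np.inf))
    i, j, k = np.unravel_index(flat_idx, acc.shape)
    # Convert voxel indices to millimeter coordinates using the image affine
    x, y, z = apply_affine(acc_img.affine, (i, j, k))
    print(f"region {region_id}: peak accuracy {acc[i, j, k]:.2f} "
          f"at ({x:.0f}, {y:.0f}, {z:.0f})")
```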
Appendix E: Emotional responses to sad music (based on GEMS-9)
Please describe how listening to sad music makes you feel (e.g. this music makes me feel
sad). Do not describe the music (e.g. this music is sad) or what the music may be
expressive of (e.g. this music expresses sadness). Keep in mind that a piece of music can
be sad or can sound sad without making you feel sad. Please rate the intensity with which
you feel each of the following feelings on a scale ranging from 1 (not at all) to 5 (very
much).
Sad music makes me feel...
Each item is rated on a 5-point scale: 1 (Not at all), 2 (Somewhat), 3 (Moderately), 4 (Quite a lot), 5 (Very much).

Sad or sorrowful
Tender, affectionate, or in love
Nostalgic, dreamy, or melancholic
Tense, agitated, or nervous
Serene, calm, or soothed
Joyful, amused, or bouncy
Fascinated or overwhelmed, and evokes in me feelings of transcendence or spirituality
Strong, triumphant, or energetic
Filled with wonder, dazzled, moved
No particular emotion
Appendix F: Reasons for listening to sad music
Each statement is rated on a 5-point scale: Strongly Disagree, Disagree, Neither Agree nor Disagree, Agree, Strongly Agree.

it helps me gain a better understanding of my own feelings or my current situation
it helps me regulate my mood and/or emotions better than other types of music
it allows me to release or purge myself of negative emotions
it is more likely to trigger an emotional response, which reassures me of my ability to feel emotions at all
it generally results in a strengthening and/or deepening of emotions and I find powerful emotions enjoyable
I am more likely to feel touched, awed, or moved by sad music
I can experience a milder version of the emotions expressed in the music without the negative life consequences
listening to sad music is a way for me to express how I am feeling and/or who I am
it allows me to relate to and/or feel connected to others
it makes me feel more sad and/or prolongs my sadness and I think feeling truly sad has some value or importance
it makes me think more realistically about my life and current situation
it makes me feel more calm, soothed, and/or relaxed than other types of music
Appendix G: Music listening in various situations
Below you will see a number of statements that ask you to describe the type of music that you would most likely listen to in various hypothetical situations. For each question, imagine a time when you were in that particular situation and then move the bars from 0 (not at all) to 10 (very much) for each of the three dimensions in order to describe the music that you would most likely listen to when in that situation. It is okay if you do not always listen to the same type of music in a certain situation. Simply select the choice that corresponds to what you would do most of the time. Notice that you can move the bars independently, meaning that you can select music that sounds both happy and sad at the same time. If you would not listen to music in the given situation, please select "Not Applicable" for all three dimensions. If one of the dimensions is not important in a given situation (e.g., when you are feeling angry you do not care how happy- or sad-sounding the music is), please select "Not Applicable" for that particular dimension. Please respond to all the items; do not leave any blank. Please be as accurate and honest as you can be. Respond to each item as if it were the only item. That is, do not worry about being consistent in your responses.
Describe the qualities of the music that you would most likely listen to…
when feeling angry
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when feeling frustrated
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when there has been a recent death or breakup
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when feeling stressed
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when feeling lonely
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are alone
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when feeling homesick
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are missing someone
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when feeling relaxed
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when in contact with nature
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when feeling nostalgic or reflecting on past personal experiences
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when doing something else, such as driving, traveling, reading, or working
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
while being creative or feeling inspired
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are happy or in a good mood
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are sad or in a bad mood
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are trying to focus
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are with friends/at a social gathering
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are running, dancing, or working out
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you are celebrating a holiday, birthday, or other special occasion
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
when you want to distance yourself or get away from your problems
______ How happy is the music?
______ How sad is the music?
______ How energetic is the music?
Appendix H: Zero-order correlations between all survey measures in Chapter 3
Appendix I: List of acoustic features extracted from sad piece of music
Feature                Type       Description
1st–13th MFCCs         Timbre     Timbre
1st–13th ΔMFCCs        Timbre     Change in timbre
1st–13th ΔΔMFCCs       Timbre     Change in timbre
HCDF                   Harmony    Flux of tonal centroid
Spectral flux          Timbre     Harmonic change
Skewness               Timbre     Degree of tilt towards high/low frequency
Kurtosis               Timbre     Change in spectrum
Centroid               Timbre     Center of amplitude
Brightness             Timbre     % of high frequencies
LPCs                   Timbre     13 features
Spread                 Timbre     SD of spectrum
1st–12th Chroma        Harmony    Likelihood of each of 12 notes
Key strength           Harmony    Likelihood of a key
Key mode               Harmony    Major or minor
Compression ratio      Dynamics   Complexity
RMS                    Dynamics   Loudness
Pulse clarity          Rhythm     Strength of beats
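Features of this kind were extracted with the MATLAB MIRtoolbox; as a rough, non-equivalent illustration, several of them (the MFCCs and their derivatives, the spectral centroid, chroma, and RMS energy) can be computed in Python with librosa as sketched below. The audio file name and the parameter settings are assumptions for illustration, not those used in the study.

```python
import librosa
import numpy as np

# Hypothetical audio file; the dissertation's features come from the MIRtoolbox,
# so this librosa sketch is only an approximate stand-in.
y, sr = librosa.load("sad_piece.wav", sr=None)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)        # timbre (1st-13th MFCCs)
d_mfcc = librosa.feature.delta(mfcc)                      # change in timbre (ΔMFCCs)
dd_mfcc = librosa.feature.delta(mfcc, order=2)            # ΔΔMFCCs
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)  # spectral centroid
chroma = librosa.feature.chroma_stft(y=y, sr=sr)          # 12 chroma features (harmony)
rms = librosa.feature.rms(y=y)                            # loudness (RMS energy)

# Stack frame-wise features into one matrix (features x frames), e.g. for later
# down-sampling to the fMRI sampling rate.
features = np.vstack([mfcc, d_mfcc, dd_mfcc, centroid, chroma, rms])
print(features.shape)
```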
Appendix J: Zero-order correlations for all survey measures in Chapter 4
Appendix K: Coordinates of significant ISC clusters during sad music-listening
Region Hemisphere Corr value x y z
Heschl's gyrus right 0.11 71 54 40
Heschl's gyrus left 0.12 19 54 40
Superior temporal gyrus right 0.15 78 55 40
Superior temporal gyrus left 0.08 16 46 41
Posterior insula right 0.02 49 80 51
Posterior insula left 0.02 28 51 40
Inferior frontal gyrus right 0.04 68 73 48
Anterior insula right 0.02 64 76 39
ACC right 0.01 47 79 49
Paracingulate gyrus right 0.02 47 80 52
Precentral gyrus right 0.03 69 64 62
Note. Stereotactic coordinates (x, y, z) from the MNI atlas correspond to the voxel with the peak intersubject correlation value within a particular anatomical region.
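For context, intersubject correlation (ISC) of this kind is commonly computed in a leave-one-out fashion: each subject's voxel time series is correlated with the average time series of all remaining subjects. The following Python sketch illustrates that computation on randomly generated data; the array shapes and variable names are illustrative assumptions, not the study's actual analysis code.

```python
import numpy as np

def leave_one_out_isc(data):
    """Leave-one-out intersubject correlation.

    data: array of shape (n_subjects, n_timepoints, n_voxels) holding BOLD
    time series. Returns an (n_subjects, n_voxels) array of correlations
    between each subject and the mean of the remaining subjects.
    """
    n_subj = data.shape[0]
    isc = np.zeros((n_subj, data.shape[2]))
    for s in range(n_subj):
        left_out = data[s]
        others = data[np.arange(n_subj) != s].mean(axis=0)
        # Pearson correlation computed voxel-wise over time (z-score, then average product)
        lo = (left_out - left_out.mean(0)) / left_out.std(0)
        ot = (others - others.mean(0)) / others.std(0)
        isc[s] = (lo * ot).mean(axis=0)
    return isc

# Illustrative example with random data: 10 subjects, 200 TRs, 500 voxels
rng = np.random.default_rng(1)
example = rng.standard_normal((10, 200, 500))
print(leave_one_out_isc(example).shape)   # (10, 500)
```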