THE ROLE OF MUSIC TRAINING ON BEHAVIORAL AND NEUROPHYSIOLOGICAL
INDICES OF SPEECH-IN-NOISE PERCEPTION:
A META-ANALYSIS AND RANDOMIZED-CONTROL TRIAL
by
Sarah L. Hennessy
A Thesis Presented to the
FACULTY OF THE USC DORNSIFE COLLEGE OF ARTS AND SCIENCES
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF ARTS
(PSYCHOLOGY)
August 2021
Copyright 2021 Sarah L. Hennessy
TABLE OF CONTENTS
List of Tables
List of Figures
Abstract
Chapter 1: Speech-in-noise perception in musicians and non-musicians: a multi-level meta-analysis
    Introduction
    Methods
    Results
    Discussion
Chapter 2: Neurophysiological improvements in speech-in-noise task after short-term choir training in older adults
    Introduction
    Methods
    Results
    Discussion
Conclusions
References
Appendices
List of Tables
Table 1. Studies included in meta-analysis
Table 2. Demographics for choir and control group
Table 3. Mean number of trials accepted in each EEG task and condition
Table 4. Latencies and electrodes for components observed during Oddball task
Table 5. Latencies and electrodes for components observed during syllable-in-noise active task
Table 6. Latencies and electrodes for components observed during syllable-in-noise passive task
List of Figures
Figure 1. PRISMA flow diagram
Figure 2. Forest plot. For each effect size, speech target type is indicated by color and noise type is indicated by shape. Line length indicates standard error surrounding each effect size (g)
Figure 3. Baujat plot
Figure 4. Funnel plot
Figure 5. Pure tone thresholds for participants in choir and control groups at Pre-test
Figure 6a. N1 latency, difference score (post-test – pre-test) at Cz in the active condition of the syllable-in-noise task in choir and control groups, across SNR conditions
Figure 6b. ERPs recorded at Cz during the active condition of the syllable-in-noise task in the choir and control groups at pre- and post-test for each noise condition
Figure 6c. Topographic headplots for N1 during the active condition of the syllable-in-noise task in the choir and control groups at pre- and post-test for 0 dB and Silent conditions
Figure 7a. N1 amplitude, difference score (post-test – pre-test) averaged across frontal and central electrodes in the passive condition of the syllable-in-noise task in choir and control groups
Figure 7b. ERPs recorded at Cz during the passive condition of the syllable-in-noise task in the choir and control groups at pre- and post-test for each noise condition
Figure 8a. N1 amplitude, difference score (post-test – pre-test) in frontal and central electrodes in the standard condition of the oddball task in choir and control groups
Figure 8b. ERPs recorded at Fz during the standard condition of the oddball task in the choir and control groups at pre- and post-test
Figure 8c. Topographic headplots for N1 during the oddball task in choir and control groups in the standard condition
Abstract
Speech-in-noise perception, the ability to hear a relevant voice within a noisy background, is
important for successful communication. Musicians have been reported to perform better than
non-musicians on speech-in-noise tasks and to show reduced age-related auditory decline. In Chapter 1,
we employ a meta-analysis with a multi-level design to assess the claim that musicians have
improved speech-in-noise abilities compared to non-musicians. Across 31 studies and 61 effect
sizes, the effect of musician status on speech-in-noise ability was significant and moderate (g =
0.58). Older adults displayed a greater musician advantage in speech-in-noise abilities. The effect of
musician status was not moderated by IQ matching, target stimulus or context, or background
noise. In Chapter 2, we employ a randomized-control trial to assess whether short term music
engagement may have similar effects to long-term training. We used a pre-post design to
investigate whether a 12-week music intervention in adults aged 50-65 without prior music
training and with subjective hearing loss improves well-being, speech-in-noise abilities, and
auditory encoding and voluntary attention as indexed by auditory evoked potentials (AEPs) in a
syllable-in-noise task, and later AEPs in an oddball task. Age- and gender-matched adults were
randomized to a choir or control group. Choir participants sang in a 2-hr ensemble with 1-hr
home vocal training weekly; controls listened to a 3-hr playlist weekly and socialized online with
fellow participants. From pre- to post-intervention, no differences between groups were observed
in well-being or behavioral SIN abilities. In the choir, but not the control, group, changes in the N1
component were observed for the syllable-in-noise task, with increased N1 amplitude in the
passive condition and decreased N1 latency in the active condition. During the oddball task,
larger N1 amplitudes to the frequent standard stimuli were also observed in the choir but not
control group from pre to post intervention. Across both studies, findings have implications for
the potential role of music training to improve sound encoding in individuals who are in the
vulnerable age range and at risk of auditory decline.
Chapter 1: Speech-in-noise perception in musicians and non-musicians: a multi-level meta-
analysis
Introduction
Speech-in-noise (SIN) perception refers to the ability to hear a relevant voice in the context of a
noisy background (e.g., a loud restaurant or gathering). Understanding speech in noisy
environments is vital to successful communication. Deficits in SIN abilities are associated with
reduced school performance in children (de Carvalho et al., 2017), and increased emotional and
social loneliness in older adults (Stam et al., 2016). Speech-in-noise abilities decline with age
(Pronk et al., 2013a) and are difficult to remedy with assistive technology (Chung, 2004; Killion,
1997). Important to our understanding of human auditory processing is untangling how speech-
in-noise abilities vary across individuals and can be improved across the lifespan.
One group of individuals reported to have better SIN abilities is lifelong musicians (for review,
see Coffey et al., 2017). Music training may improve auditory processing through repetitive
practice of fine-tuned pitch discrimination and enhanced attention to changes in acoustic
features such as timbre and rhythm. According to the OPERA (overlap, precision, emotion,
repetition, attention) hypothesis (Patel, 2011, 2014a) music training may influence speech
processing specifically because it places demands on shared sensory or cognitive processes and
involves repetition and attentional focus, and is emotionally rewarding. Patel (2014) suggests that
music training may drive neural plasticity by placing a higher demand on overlapping brain
networks that process music and speech than in everyday speech communication.
Several cross-sectional studies have reported differences in speech-in-noise performance
between musicians and non-musicians (Parbery-Clark et al., 2009; Zendel et al.,
2015b), while others have observed no differences (e.g., Boebinger et al., 2015; Madsen et al.,
2017, 2019). Discrepancies between studies may be due to differences in task selection; there is
great variety in tasks used to assess speech-in-noise abilities – for example, many researchers
(e.g., Zendel & Alain, 2012) have used QuickSIN (Etymotic Research, 2001; Killion et al., 2004)
while others (e.g., Parbery-Clark et al., 2013) use HINT (Neff & Green, 1987), or WIN (e.g.:
Slater & Kraus, 2016). Many studies report results from multiple tasks (e.g., Bidelman & Yoo,
2020; Escobar et al., 2019). Each task measures speech-in-noise perception in slightly different
ways; QuickSIN, for example, assesses perception of full-length meaningful and grammatical
sentences that are embedded in 4-talker speech babble that gradually increases in volume, while
HINT has similar sentences embedded in 8-talker babble where the babble volume is adaptive
relative to participant performance. In contrast, WIN (Wilson, 2003) consists of monosyllabic
words embedded in adaptive babble, without any grammatical or semantic context. It has been
reported that some speech-in-noise tasks may be more sensitive than others. For example, HINT,
because most users achieve higher overall performance (at ceiling), is poorer at discriminating
between individuals with and without hearing loss than are QuickSIN and WIN (Wilson et al.,
2007). Relatedly, differences between studies may be due to target speech type, where some
studies use tasks where the target speech is a sentence (e.g., Anaya et al., 2016; Başkent &
Gaudrain, 2016), while others choose syllables (e.g., Du & Zatorre, 2017) or words (e.g.,
Fostick, 2019) as the target. These stimuli may be processed differently, relying on different cues
and cognitive resources. Sentences often contain semantic and grammatical information that
allows the listener to make predictions and fill in gaps about missed information and words.
Words, while not always embedded in contextual cues, follow predictable patterns of consonant
and vowel combinations. In contrast, syllables and made-up words do not contain predictable
information, and thus perception is less able to depend on top-down mechanisms. Type of
background noise also varies across studies, as some researchers choose a single voice masking
paradigm (e.g., Başkent & Gaudrain, 2016), while others use babble (e.g., Escobar et al., 2019),
or speech-shaped noise (e.g., Fuller et al., 2014). SIN perception outside of the laboratory is
supported by contextual cues and syntactical information and is typically in the presence of
speech-related background noise, and thus the choice of speech target type, contextual
information available, or background noise may lead to critical differences in performance in a
measurement setting.
Lastly, in a recent meta-analysis, speech-in-noise abilities were found to be positively associated
with cognitive abilities, including working memory and IQ (Dryden et al., 2017). This was
suggested to be due to a greater ability to effectively use contextual cues, as this association was
particularly strong in contextually-rich tasks. Musicians, as compared to non-musicians, have
been reported to have higher auditory working memory (Chan et al., 1998; Moreno et al., 2011;
Talamini et al., 2016) and verbal and nonverbal IQ (Schellenberg, 2011) although differences in
intelligence may be simply due to differences in music aptitude rather than a result of training (S.
Swaminathan et al., 2017). Thus, to assess whether musicians have improved speech-in-noise
abilities compared to non-musicians independent of cognitive ability, it may be important to
control for IQ and auditory working memory measures. While several studies have explored this
idea and controlled for cognitive ability (e.g., Boebinger et al., 2015), others have not.
In this meta-analysis, we assess the hypothesis that musicians have improved speech-in-noise
ability compared to non-musicians. We restrict our analysis to cross-sectional studies due to
insufficient longitudinal studies and randomized control trials that would be necessary to conduct
this analysis. To our knowledge, this is the first meta-analysis to explore this question. We
explore four main questions: 1) do adult musicians perform better than non-musicians in tasks
measuring speech-in-noise?, 2) how much variance can be attributed to within-study effects
(specifically in studies with multiple SIN tasks) as compared to between-study effects?, 3) are
observed effects dependent on the type of speech target (e.g., sentences vs. words)?, 4) are
observed effects dependent on whether participants were matched on cognitive ability? Given
that speech-in-noise abilities decline with age (Pronk et al., 2013b) and that many studies have
investigated SIN perception specifically in older adult participants, we additionally explore a
fifth question: 5) are observed effects dependent on the age group assessed (e.g., older adults vs.
younger adults)?
Methods
Literature Search
A literature search using PubMed and ProQuest was conducted. The first author designed the
search method, and terms used in each search are listed in Appendix A. The search was
conducted once in April 2020 and updated again in February 2021. We retrieved 3521 records.
Inclusion Criteria
Articles that met the following criteria were included in the meta-analysis:
1. Participants were adults with normal hearing thresholds defined as less than 20 dB HL
from 250-4000 Hz.
2. The study was cross-sectional and included a long-term musically-trained (> 6 years of
training) group and a musically-untrained control group (i.e., music training as a
categorical, not continuous, variable).
3. The study was a peer-reviewed publication, published in English. Dissertations, theses,
conference proceedings, abstracts, unpublished manuscripts, and case studies were not
included.
4. The study reported behavioral outcomes of speech-in-noise (i.e., sentences, words, or
syllables in noise). Studies that reported auditory stimuli unrelated to speech (i.e., tones)
in noise were not included.
5. The study reported sufficient data to compute effect sizes.
Records were evaluated by the first author for eligibility based on inclusion criteria (see Figure
1), resulting in 31 studies and 61 effect sizes included in the meta-analysis. Given that this was
not a meta-analysis of randomized-control trials, a Risk-of-Bias analysis was not conducted.
Figure 1. PRISMA flow diagram
Outcome Measures
Multiple outcome measures of SIN perception were allowed for each study. Outcome variables
were coded as the name (i.e., “Hearing in Noise Task”) and category (i.e., “masked speech”) of
the task performed. See Table 1 for a complete list of outcome variables for each included study.
Data Extraction
Study characteristics and outcome data were extracted manually from each study, first by the
first author and then by two independent researchers using a spreadsheet form. In the case of
disagreement between researchers, the first author reviewed the disagreement and paper in
question. If a study included additional groups, only the data from the formally trained music and
non-musician control groups were extracted. If a study contained insufficient data to calculate
effect size or to attain descriptive statistics, the authors were contacted via email up to two times
by the first author (7 authors contacted). For each article, the following data points were
extracted: publication details, speech-in-noise task description, n for each group, years of music
training in the music group, mean participant age, target speech type and context, whether IQ
matching was performed, outcome means and standard deviations or standard errors for each
group, and F or t statistics.
Power Analysis
We conducted an a priori power analysis using the pwr package (Champely, 2020) in R. To
achieve 80% power, assuming moderate heterogeneity among studies and an alpha level of 0.05,
at least 20 studies with an average of 15 participants in each group were necessary to detect a
Cohen’s d effect size of 0.30. This medium effect size was chosen given the limited number of papers available on this subject from which to estimate an a priori effect size.
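For illustration, a comparable calculation can be sketched in base R using closed-form random-effects power formulas (in the style of Hedges and Pigott); the thesis used the pwr package, so exact figures may differ, and the operationalization of “moderate heterogeneity” below is an assumption.

```r
# Illustrative a priori power sketch for a random-effects meta-analysis
# (assumption-laden; the pwr-based computation reported above may differ).
d  <- 0.30          # smallest effect size of interest (Cohen's d)
n1 <- 15; n2 <- 15  # average per-group sample size
k  <- 20            # number of studies

# Typical within-study sampling variance of a standardized mean difference
v <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))

# "Moderate" heterogeneity assumed here as between-study variance equal to v
tau2 <- v

# Standard error of the pooled random-effects estimate across k studies
se_pooled <- sqrt((v + tau2) / k)

# Two-tailed power at alpha = 0.05
1 - pnorm(qnorm(0.975) - d / se_pooled)
```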
Effect Size Calculation
Effect sizes were calculated for each outcome measure using the esc package (Lüdecke, 2019) in
R (R Core Team, 2020). Hedges’ g, an effect size measure that accounts for small study bias (Hedges
& Olkin, 1985), was computed using the following formula, where d is Cohen’s d, n1 is the
sample size of group 1 and n2 is the sample size of group 2:
$$g \simeq d \times \left(1 - \frac{3}{4(n_1 + n_2) - 9}\right)$$
A positive effect size indicated an advantage in the music compared to the control group.
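As an illustration of this computation, the esc package call below derives Hedges’ g from group summary statistics; the means and standard deviations shown are made up for the example, not taken from any included study.

```r
library(esc)

# Hypothetical summary statistics (not from any included study)
esc_mean_sd(grp1m = 4.2, grp1sd = 1.1, grp1n = 20,  # music group
            grp2m = 3.5, grp2sd = 1.3, grp2n = 20,  # control group
            es.type = "g")                          # small-sample-corrected Hedges' g
```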
3-Level Model
Given that studies reported more than one SIN outcome, a three-level model (Assink & Wibbelink,
2016; Cheung, 2014; Hox, 2010) was employed using the metafor package (Viechtbauer, 2010)
using guidelines from Harrer et al., (2019). Three-level models have been shown to perform well
in meta-analyses involving multiple effect sizes within one study (Cheung, 2014). Here, meta-
analysis variances are assessed across three levels: Level 1) sampling variance, Level 2) variance
between effect sizes within a single study (i.e., different outcome measures), and Level 3)
variance between studies. Model equations, as presented by Harrer et al. (2019), are as follows, where $i$ indexes an individual effect size from study $j$: $\theta_{ij}$ and $\hat{\theta}_{ij}$ are the true and estimated effect sizes $i$ from study $j$, $\epsilon_{ij}$ is the Level 1 error, $\zeta_{(2)ij}$ is the Level 2 error, $\zeta_{(3)j}$ is the Level 3 error, $\kappa_j$ is the average effect size of study $j$, and $\beta_0$ is the effect size at the population level.

Level 1: $\hat{\theta}_{ij} = \theta_{ij} + \epsilon_{ij}$

Level 2: $\theta_{ij} = \kappa_j + \zeta_{(2)ij}$

Level 3: $\kappa_j = \beta_0 + \zeta_{(3)j}$

The combined equation is, therefore: $\hat{\theta}_{ij} = \beta_0 + \zeta_{(2)ij} + \zeta_{(3)j} + \epsilon_{ij}$

Heterogeneity measures (I²) across model levels were calculated using the dmetar package in R (Harrer et al., 2019).
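A minimal sketch of this model specification in metafor, following the Harrer et al. (2019) guidelines, is shown below; the data frame dat and its column names (yi, vi, study, es_id) are assumptions for illustration rather than the actual analysis code.

```r
library(metafor)

# dat is assumed to hold one row per effect size, with columns:
# yi (Hedges' g), vi (sampling variance), study (study ID), es_id (effect-size ID)
m3 <- rma.mv(yi, vi,
             random = ~ 1 | study/es_id,  # Level 3: between-study; Level 2: within-study
             data   = dat,
             method = "REML")
summary(m3)

# Multilevel I^2 can then be obtained with dmetar (assuming the var.comp()
# helper described in the Harrer et al. guide):
# dmetar::var.comp(m3)
```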
Moderators
Five moderators (subgroups) were assessed:
1. Type of target stimulus coded the type of speech stimulus participants were asked to
identify within noise (i.e., words, sentences, or syllables).
2. Type of background noise was coded as the type of noise in which speech targets were
embedded (i.e., competing speaker, babble (more than one speaker), speech-shaped noise,
fluctuating speech-shaped noise (speech-shaped noise matched to the temporal envelope of speech), or white noise).
3. Type of context was coded as the type of contextual or syntactical cues within which a
target stimulus was embedded. For example, if the target was a meaningful, syntactically
correct sentence, it was coded as “semantic”, but if the target was a meaningless but
syntactically correct sentence, it was coded as “syntactic”. Stimuli that contained no
contextual cues were coded as “none”.
4. Age group of participants. Effect sizes were coded as “younger adults” (mean age < 45)
or “older adults” (mean age ≥ 45).
5. IQ matching. Studies were coded as to whether the music-trained group and the
musically-untrained group were matched for Verbal IQ, Non-verbal IQ, or Auditory
Working Memory (AWM).
A separate 3-level model was fitted for each moderator variable, as the inclusion of multiple
moderators has been shown to increase Type-II error of the moderator estimate (Raudenbush &
Bryk, 2002). If moderators were significant, they were included in the full model.
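A sketch of a single-moderator model of this kind is given below, reusing the assumed dat from the earlier sketch; target_type is a hypothetical factor coding the speech target (word, sentence, or syllable).

```r
# One moderator at a time, as described above; test = "t" requests
# t- and F-tests rather than Wald-type z-tests
m_target <- rma.mv(yi, vi,
                   mods   = ~ target_type,
                   random = ~ 1 | study/es_id,
                   data   = dat,
                   method = "REML",
                   test   = "t")
summary(m_target)   # omnibus test of moderation by speech target
```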
Publication Bias
It is well documented that studies reporting statistically significant findings are published more often than studies reporting null results (Viechtbauer, 2007).
in multi-level meta-analyses show inconsistent performance (Fernández-Castilla et al., 2020).
Traditional methods of publication bias analysis, including Egger’s Test of the Intercept (a
method of assessing funnel plot asymmetry) and Duval and Tweedie’s Trim & Fill, show
inflated Type I error rates when dependent effect sizes are ignored (for example, collapsing
multiple effect sizes of one study into a single number, or randomly sampling one effect size
from each study) (Rodgers & Pustejovsky, 2020). Therefore, we opted for a multi-level method
of assessing publication bias, the Egger MLMA test, originally proposed by Van den Noortgate et al. (2013), which demonstrates sufficient power (Rodgers & Pustejovsky, 2020) and does not
inflate Type I error (Fernández-Castilla et al., 2020; Rodgers & Pustejovsky, 2020). This test
regresses effect size precision on effect size, with slope estimated using weighted least-squares
and intercept significance testing using a multi-level model approach (Fernández-Castilla et al.,
2020). The Egger MLMA test was performed in R using the metafor package and code adapted
from Rodgers and Pustejovsky (2020).
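A hedged sketch of the Egger MLMA logic, again on the assumed dat: the effect-size standard error enters the multilevel model as a moderator, and a significant coefficient indicates funnel plot asymmetry. This is a simplified stand-in for the adapted Rodgers and Pustejovsky code, not a reproduction of it.

```r
dat$sei <- sqrt(dat$vi)   # standard error of each effect size

egger_mlma <- rma.mv(yi, vi,
                     mods   = ~ sei,               # precision term
                     random = ~ 1 | study/es_id,
                     data   = dat,
                     method = "REML")
summary(egger_mlma)   # inspect the 'sei' coefficient for asymmetry
```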
Influence Analysis
Influence analysis was conducted using the dmetar package to identify leverage effect sizes. We
conducted analysis using all effect sizes treated independently, as done in previous multi-level
meta-analyses (Castillo-Eito et al., 2020; Parry et al., 2020). Effect sizes exerting high influence
were detected using a leave-one-out method, as suggested by Viechtbauer and Cheung (2010).
These results were confirmed using a Baujat plot (Baujat et al., 2002) to detect studies that
contributed unequally to the meta-analysis heterogeneity (Cochran’s Q).
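The leave-one-out logic can be sketched as below, refitting the model once per effect size on the assumed dat and tracking shifts in the pooled estimate; dmetar’s packaged influence tools wrap similar computations.

```r
# Pooled estimate with each effect size removed in turn
loo_est <- sapply(seq_len(nrow(dat)), function(i) {
  fit <- rma.mv(yi, vi, random = ~ 1 | study/es_id,
                data = dat[-i, ], method = "REML")
  coef(fit)
})

# Effect sizes whose removal shifts the estimate most are leverage points
order(abs(loo_est - coef(m3)), decreasing = TRUE)[1:5]
```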
Results
Study Characteristics
Study characteristics, including mean age, outcome measures, and IQ matching are presented in
Table 1. The mean participant age was 28; weighted by the number of effect sizes each study contributed, it was 28.3. Of the 61 effect sizes included, 11 were from studies
whose participants were “older adults” (> age 45), while 50 had participants who were
considered “younger adults”. 13 effect sizes were from studies that matched participants based
on verbal IQ, 27 matched participants for nonverbal IQ, and 12 matched participants for auditory
working memory. 30 effect sizes were from studies that had matched participants on any of the three IQ measures. Length of music training was specified in 26 of the 31 studies, with an average of
19.3 years of training in the music group. In the remaining studies, music training length was
described as at least 10 years (3 studies), at least 6 years (1 study), and practicing at least 7 hours
per week regularly with 3 hours in orchestral rehearsal (1 study). Although the latter study did not explicitly meet the criterion of more than 6 years of training, we included it because an individual who regularly plays music for at least 7 hours per week is likely to be considered a “musician”.
Table 1. Studies included in meta-analysis

| Study, Journal | Outcome Measure(s) | Speech Target / Noise Type | Duration of Music Training in the Music Group | Mean Age, N | IQ Measures |
|---|---|---|---|---|---|
| Anaya et al., 2016, The Journal of the Acoustical Society of America | HINT and PRESTO | Sentences / speech-shaped noise, babble | 15.45 | 20.7, N = 22 | Nonverbal* |
| Başkent & Gaudrain, 2016, The Journal of the Acoustical Society of America | Masked sentences | Sentences / single speaker | At least 10 years | 22.3, N = 38 | none |
| Bidelman & Yoo, 2020, Frontiers in Psychology | Masked sentences, QuickSIN | Sentences / babble | 15.1 | 24.2, N = 28 | Nonverbal*, AWM* |
| Boebinger et al., 2015, The Journal of the Acoustical Society of America | Masked BKB sentences | Sentences / single speaker, speech-shaped noise, fluctuating speech-shaped noise | 22.7 | 27.2, N = 50 | Verbal, Nonverbal, AWM |
| Clayton et al., 2016, PloS ONE | Masked sentences, co-located, separated | Sentences / single speaker | 14.4 | 21.5, N = 34 | Nonverbal, AWM* |
| Du & Zatorre, 2017, PNAS | Syllable in noise | Syllables / white noise | At least 10 years, start prior to age 7 | 21.8, N = 30 | Verbal, AWM |
| Escobar et al., 2019, Ear and Hearing | QuickSIN, HINT, SPIN-R | Sentences / babble, speech-shaped noise | 13.4 | 21.4, N = 49 | AWM |
| Fostick, 2019, European Journal of Ageing | AB words task in speech noise | Words / speech-shaped noise, white noise | 7 hrs/week | 65.6†, N = 46 | Nonverbal, AWM |
| Fuller et al., 2014, Frontiers in Neuroscience | Sentences and words in noise | Sentences / speech-shaped noise, fluctuating speech-shaped noise, babble | 14.6 | 22.7, N = 50 | none |
| Kaplan et al., 2021, Frontiers in Psychology | Masked sentences | Sentences / 2-talker maskers | 13.1 | 26.7, N = 33 | none |
| Madsen et al., 2017, Scientific Reports | Masked HINT sentences | Sentences / babble, speech-shaped noise | 14.6 | 21.0, N = 60 | Verbal, Nonverbal |
| Madsen et al., 2019, Scientific Reports | Closed and open speech-on-speech task (Dantale II sentences), separated, co-located | Sentences / babble, speech-shaped noise | 15.3 | 22.9, N = 64 | Verbal, Nonverbal |
| Mankel & Bidelman, 2018, PNAS | QuickSIN, LiSN-S | Sentences / babble | 16 | 22.2, N = 28 | none |
| Meha-Bettison et al., 2018, International Journal of Audiology | LiSN-S, low and high cue, co-located and separated | Sentences / single speaker | 39.7 | 45.9, N = 20 | none |
| Morse-Fortier et al., 2017, Trends in Hearing | Masked words (natural, vocoded, spatial, nonspatial) | Words / babble, fluctuating speech-shaped noise | 11.5 | 21.3, N = 40 | none |
| Parbery-Clark et al., 2009, Ear and Hearing | QuickSIN, HINT | Sentences / babble, speech-shaped noise | 16 | 23, N = 31 | Nonverbal, AWM* |
| Parbery-Clark et al., 2011, PLoS ONE | QuickSIN, HINT, WIN, masked speech | Sentences, words / babble, speech-shaped noise | 50 | 54.5†, N = 37 | Verbal, Nonverbal, AWM* |
| Parbery-Clark et al., 2011, Neuropsychologia | HINT | Sentences / speech-shaped noise | 16.4 | 22.4, N = 31 | Nonverbal |
| Parbery-Clark et al., 2012, PloS ONE | HINT | Sentences / speech-shaped noise | 49 | 56†, N = 48 | Nonverbal |
| Parbery-Clark et al., 2012, Neuroscience | QuickSIN | Sentences / babble | 17.3 | 22, N = 50 | Nonverbal |
| Parbery-Clark et al., 2013, Journal of Neuroscience | HINT | Sentences / speech-shaped noise | 16.2 | 20, N = 30 | Nonverbal |
| Ruggles et al., 2014, PloS ONE | QuickSIN, HINT | Sentences / babble, speech-shaped noise | At least 10 years | 21.2, N = 33 | none |
| Slater & Kraus, 2016, Cognitive Processing | QuickSIN, WIN | Sentences, words / babble | 15.7 | 23.9, N = 54 | Nonverbal |
| Swaminathan et al., 2015, Scientific Reports | Masked speech (forwards, reversed, co-located, separated) | Sentences / babble | 13.8 | 21.7, N = 24 | none |
| Vanden Bosch der Nederlanden et al., 2020, Psychological Research | SPIN-R | Sentences / babble | 11.6 | 21, N = 60 | Verbal, Nonverbal, AWM |
| Varnet et al., 2015, Scientific Reports | Nonwords in noise (correct, sensitivity, SNR) | Nonwords / white noise | 15.8 | 22.8, N = 38 | none |
| Yoo & Bidelman, 2019, Hearing Research | QuickSIN, WIN, HINT | Sentences, words / babble, speech-shaped noise | 15.8 | 25.4, N = 31 | Nonverbal*, AWM* |
| Zendel & Alain, 2012, Psychology and Aging | QuickSIN | Sentences / babble | At least 6 years | 47.3†, N = 163 | none |
| Zendel et al., 2015, Journal of Cognitive Neuroscience | Words in babble | Words / babble | 15.5 | 22.7, N = 26 | none |
| Zendel & Alexander, 2020, Frontiers in Neuroscience | QuickSIN | Sentences / babble | 23.5 | 32.0, N = 37 | none |
| Zhang et al., 2019, International Journal of Audiology | QuickSIN | Sentences / babble | At least 10 years, start prior to age 7 | 24, N = 34 | none |

*musicians significantly outperformed non-musicians on IQ measure (marked as “nonmatched”)
†marked as “older adults” in sub-group analysis
3-Level Model Analysis
The first 3-level model, without moderators, resulted in a pooled effect size (g) of 0.58 (p < 0.0001), with a 95% confidence interval of [0.43, 0.74]. 46.92% of the total model variance was attributed to Level 1 (sampling variance). I² at Level 2 was ~0%, indicating no within-study heterogeneity. I² at Level 3 was 53.08%, indicating moderate between-study heterogeneity. Total I², indicating the amount of heterogeneity not attributable to sampling error, was 53.08%. Effect sizes for each outcome variable and study are presented in Figure 2.
Figure 2. Forest plot. For each effect size, speech target type is indicated by color and noise type
is indicated by shape. Line length indicates standard error surrounding each effect size (g).
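For reference, a basic forest plot of the fitted three-level model can be produced as below; the published figure additionally codes speech target type by color and noise type by shape, which would require custom plotting on top of this sketch.

```r
# Basic forest plot from the assumed three-level model object m3
forest(m3, header = TRUE)
```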
We then assessed whether our three-level model was superior to a two-level model by removing
one of the levels and comparing fit. When removing only Level 2, the resulting model had a
slightly lower AIC (76.74) and BIC (80.93) than the full model, but this difference was not
significantly different (p =1.00), suggesting that including Level 2 was not necessary. When
removing Level 3, the resulting model had a higher AIC (89.04) and BIC (93.23) when
compared to the full model (78.74, 85.03) and was significantly different (p < 0.001). This
suggests that Level 3 was necessary to include in the full model for this analysis. Given the
model comparisons, our final model excluded Level 2. The resulting effect size estimate and
level heterogeneity, however, did not differ from the full model, as the removed Level 2 variance
was originally at 0%.
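The level comparison described above can be sketched by fixing one variance component at zero and comparing fits; the sigma2 ordering below matches the random = ~ 1 | study/es_id specification assumed earlier.

```r
# Two-level model with the within-study (Level 2) variance fixed at zero
m_no_l2 <- rma.mv(yi, vi,
                  random = ~ 1 | study/es_id,
                  sigma2 = c(NA, 0),   # c(between-study, within-study)
                  data   = dat, method = "REML")
anova(m3, m_no_l2)   # likelihood-ratio test plus AIC/BIC comparison

# Analogously, sigma2 = c(0, NA) removes the between-study (Level 3) variance
```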
Type of Speech Target
A test of moderation by speech target was conducted to compare effect sizes where the target
stimulus was a sentence versus a word. Syllables and nonwords were excluded from this analysis
as each category contained fewer than 3 effect sizes. The test of moderators indicated that type of speech
target was not a significant moderator (F(1, 55) = 0.13, p = 0.72).
Type of Background Noise
A test of moderation on type of background noise was conducted to compare effect sizes where
the background noise was multi-talker babble, a single speaker, speech-shaped noise, fluctuating
speech-shaped noise, or white noise. The test of moderators indicated that type of background
noise was not a significant moderator (F(4, 56) = 1.22, p = 0.31).
Type of Context
A test of moderation on type of contextual cues available to the listener was conducted to
compare effect sizes where stimuli were placed within the context of semantic, syntactic, or no
cues. The test of moderators indicated that type of context was not a significant moderator (F(2,
58) = 0.29, p = 0.75).
IQ matching
We conducted 4 separate models to assess the impact of IQ matching on effect size: matching on
non-verbal IQ, matching on verbal IQ, matching on auditory working memory, and matching on any IQ measure. Matching on IQ did not moderate effect size in any of these tests of moderation
(any IQ p = 0.27, nonverbal IQ p = 0.06, verbal IQ p = 0.36, AWM p = 0.29).
Age group
The test of moderators including age group showed no significant difference between older
adults and younger adults (F (1, 59) = 2.54, p = 0.12).
Influence Analysis
Influence analysis indicated one outlier effect size (N = 163, g = 0.16; Zendel & Alain, 2012, QuickSIN), identified in both the leave-one-out assessment and the Baujat plot (see Figure 3).
This outlier came from a study that reported only one effect size in the overall model. A
sensitivity analysis excluding this outlier demonstrated similar results to the original analysis (g
= 0.60, p < 0.001, 95% CI: [0.44, 0.76]). Total variance in this model was 0.229, with 48.41%
attributed to Level 1, ~0% attributed to Level 2, and 51.59% attributed to Level 3. Test of
moderators removing the outlier showed a significant effect of age group (F(1, 58) = 8.68, p <
0.01), indicating that studies with older adults had overall higher effect sizes than those with
younger adults. A follow-up three-level model of only studies with younger adults indicated an
overall effect size of (g = 0.52, p < 0.001, 95% CI: [0.37, 0.67]), whereas studies with older
adults indicated an overall effect size of (g = 1.18, p < 0.001, 95% CI: [0.76, 1.60]). Tests of
moderators of other subgroups yielded similar results to the original analysis (any IQ p = 0.38,
nonverbal IQ p = 0.09, verbal IQ p = 0.30, AWM p = 0.29, speech target p = 0.66, noise type p =
0.36, context type p = 0.76).
Figure 3. Baujat plot.
Publication Bias Analysis
The Egger MLMA test for asymmetry on the intercept before outlier removal was significant, indicating funnel plot asymmetry presumably due to selective reporting (β = 2.39, p < 0.05). However, after outlier removal, no funnel plot asymmetry was detected (β = 2.29, p = 0.07) (see Figure 4).
Figure 4. Funnel plot.
Discussion
In this multi-level meta-analysis, we investigated speech-in-noise perception abilities in adult
musicians and non-musicians across 31 studies and 61 effect sizes. Results indicated a moderate
effect size reflecting a musician benefit for the ability to hear speech in noisy environments, with
moderate heterogeneity derived from between-study effects and virtually no heterogeneity
derived from within-study effects. After sensitivity analysis, overall effect size remained high
and publication bias was nonsignificant. Age emerged as a significant moderator of effect size,
suggesting that older adult musicians, who presumably had more years of music training, showed greater advantages in speech-in-noise perception when compared to age-matched controls experiencing age-related decline in speech-in-noise perception. We additionally found that the overall musician benefit for SIN ability was not impacted by whether participants were matched for IQ, or by the type of target stimulus, contextual cues, or background noise. Thus, results from this meta-analysis
indicate that a musician benefit for SIN abilities remains robust across studies, independent of IQ and qualitative aspects of the speech task, persists after controlling for outliers, and is particularly pronounced for older adults.
Moderators of musician advantage for SIN perception
The type of target, context cues, and background noise used to assess SIN perception varied
considerably across studies. Yet, despite these differences, neither speech target, background
noise type, nor contextual cues significantly moderated effect size between or within studies,
indicating that the musician advantage for SIN perception is global, rather than specific to the
semantic or syntactical richness of a speech cue or the complexity of the background.
Additionally, matching for cognitive performance did not significantly moderate the overall
observed effect across studies. That is, while some studies made an effort to match musician and
non-musician participants on verbal IQ, nonverbal IQ, or auditory working memory, whether a
study chose to or was able to do so did not impact the overall difference observed between
musicians and non-musicians in the meta-analysis. This was true for each cognitive measure
independently (verbal, nonverbal, or AWM) and for matching of any of the three measures.
Notably, several studies assessed IQ or auditory working memory and found significant
differences between musicians and non-musicians (see Table 1, studies marked with asterisks).
These studies conducted follow-up analyses to explore how much of the variance could be
attributed to differences in cognitive abilities (e.g., Parbery-Clark et al., 2009) or whether performance on SIN tasks was correlated with performance on cognitive tasks (e.g., Bidelman & Yoo, 2020).
Drawing from these follow-up analyses, each study concluded that differences in SIN abilities
were unlikely to have been due to differences in cognitive ability. Results from our moderator
analyses here support this notion, suggesting that musicians’ enhanced speech-in-noise abilities
are independent of any potential differences in cognitive ability.
Lastly, we explored whether effect size magnitude was moderated by age group, comparing
studies with younger adult participants with those with older adult participants, and found that,
after removing outliers, age group significantly impacted the effect of musician status on SIN
perception. Specifically, while both subgroups demonstrated musician advantages independently,
studies with older adults had an overall higher pooled effect size estimate than studies with
younger adults. Thus, even in later life, when hearing abilities typically decline (Huang & Tang,
2010), musicians still outperform non-musicians in measures of SIN perception. It should be
noted that only 3 of the 31 studies in this analysis (6 out of 61 effect sizes) included older adult
participants. Given that the subgroups assessed in this study are highly uneven (there were many
more studies with younger than older adults), conclusions should be drawn with caution.
Additionally, older adult musicians in this meta-analysis had roughly twice the amount of music
training as younger adults. Age differences observed in this analysis likely simply reflect length of training rather than other factors associated with age. Significantly more studies
investigating older adult musicians and non-musicians are necessary to draw stronger
conclusions regarding musician status, aging, and SIN perception. Additionally, given the age
range available, we used 45 as a cut-off for “older adult”. More studies with adults on the older
end of the age spectrum are necessary.
Defining “musician”
An important feature of this meta-analysis is that we included only studies that defined
“musicians” and “non-musicians” as a binary variable, based on years of formal training or
completion of conservatory degree. This decision was made to maximize comparability between
studies, as most of the literature in this field defines musician status similarly. However,
assessing differences between musicians and non-musicians in a binary fashion may exclude
valuable information. Including music training as a continuous measure in a meta-regression
analysis may provide a more nuanced look into the role of music training on speech-in-noise
perception, allowing for a dose-response relationship to be investigated. For example, while
Ruggles and colleagues (2014) did not find a significant overall difference between musicians
and non-musicians in SIN abilities, they did observe a significant correlation between SIN
perception and years of music training. While, currently, an insufficient number of studies exist
to conduct a meta-regression on this topic, we encourage more incorporation of continuous
measures of music training as a complement in future research.
Taking this one step further, it should be acknowledged that, while assessing formal training is a
convenient way to measure musicianship on a binary or continuous scale, it is not representative
of the range of musical experiences present in the population at large. Many individuals have had
experience with music making in some informal capacity, for example, through religious
services or family gatherings. More recently, the availability of tools for creating and “playing”
music electronically (e.g., GarageBand, Logic), and learning instruments informally online (e.g.,
YouTube tutorials) has increased accessibility of music-making without formal instruction.
Zendel and Alexander (2020) reported that self-taught musicians outperformed non-musicians and
were outperformed by formally trained musicians on a melodic tone violation task. However, no
SIN differences were observed between groups. To our knowledge, this is the only study that has
included a group of self-taught musicians in assessments of SIN perception. Future studies could
incorporate measures of informal music playing experience and assessment of musical abilities
(rather than length of training) to further elucidate these findings.
A limitation of causality
Perhaps the most important caveat of the present meta-analysis is that our findings do not
indicate a causal relationship between music training and improved speech-in-noise perception.
Rather, we provide evidence for a moderate positive association between musicianship and SIN
abilities. All studies included in the analysis were cross-sectional, and therefore it cannot be
ruled out that differences between musicians and non-musicians are due to pre-existing
biological traits rather than as a result of training. While cognitive abilities are related to accurate
speech perception in noisy environments (Anderson et al., 2013b; Dryden et al., 2017), it is
unlikely that differences in cognitive performance accounted for SIN differences here, given that
IQ or AWM matching was not a significant moderator of effect in the present analysis.
Alternatively, it has been proposed that music aptitude, rather than music training, may account
for many of the extra-musical benefits, including language abilities, observed in cross-sectional
studies comparing musicians with non-musicians (S. Swaminathan et al., 2017). Slater and Kraus
(2016) additionally found that performance on a test of rhythmic competence predicted better
performance on speech-in-noise abilities in both musicians and non-musicians. Including a music
aptitude assessment, particularly in studies with continuous measures of music training, may help
to separate these effects in the context of cross-sectional investigations. To address the role of
training independent of pre-existing differences, several longitudinal studies have been
conducted. Developmental work has demonstrated that, after two years of training, musically-
trained children show enhanced maturity of early auditory evoked potentials and better ability to
detect changes in tonal sequences as compared to sports-trained children and children with no training
(Habibi et al., 2016). Specific to SIN perception, Slater et al., (2015) found that, in a waitlist
control study, children who received music training showed improved SIN abilities as compared
to controls. Recent randomized-control trials with older adults show that 10 weeks of choir
participation (Dubinsky et al., 2019), and 6 months of piano lessons (Zendel et al., 2019),
produced improved performance in speech-in-noise tasks. Our present results are in line with
longitudinal findings, suggesting that music training may induce neuroplasticity that supports
speech-in-noise perception. To truly separate the effects of pre-existing differences from
training, however, meta-analyses of longitudinal studies are needed.
Conclusions and Future Directions
Speech-in-noise abilities are important for successful communication, and understanding factors
involved in enhanced SIN perception across the population provides insight into auditory
processing as a whole. This meta-analysis utilized a multi-level design to assess whether
musicians demonstrate improved processing of speech-in-noise when compared to non-
musicians. A strength of the current investigation is its multi-level design, allowing for the
incorporation of multiple effect sizes within a single study. We provide evidence for a moderate
musician benefit for speech-in-noise abilities, supported by our overall effect size of g = 0.58, 95% CI [0.43, 0.74]. This effect remained robust irrespective of speech target type, background noise,
context cues, matching for cognitive ability, and age group, indicating that musicians experience
advantages to SIN perception across a variety of contexts, cognitive abilities, and throughout the
lifespan. Future studies should focus on elucidating a causal link between music training and SIN
perception through the employment of longitudinal design for varying age and hearing groups,
including cochlear implant users, individuals with hearing loss, and children exposed to chronic
noise. Future studies should additionally consider controlling for socioeconomic status, which
contributes to SIN abilities (Anderson et al., 2013b), exposure to noise (Casey et al., 2017)
(which in turn impacts SIN (Skoe et al., 2019) and access to music lessons (Elpus & Abril,
2011). Finally, we believe focusing on continuous measures of music training and experimental
studies of music training and SIN abilities is necessary for future meta-analyses.
CRediT author statement.
Sarah Hennessy: Conceptualization, Methodology, Data curation, Formal analysis, Writing - Original draft preparation. Wendy J. Mack: Validation, Writing - Review and Editing. Assal Habibi: Conceptualization, Writing - Review and Editing, Supervision.
Declaration of competing interests.
The authors declare that they have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this paper.
Data availability statement.
The data that support the findings of this study are openly available on Open Science Framework
at https://osf.io/w2cm8/.
Chapter 2: Neurophysiological improvements in speech-in-noise task after short-term choir
training in older adults
https://doi.org/10.18632/aging.202931
Introduction
In the United States, 25% of adults aged 64-74, and 50% of adults over the age of 75, experience hearing loss (National Institute on Deafness and Other Communication Disorders (NIDCD), 2016). Auditory difficulties can be due to sensorineural hearing loss, conductive
hearing loss, or central hearing loss, which encompasses deterioration or damage to ascending
auditory pathways beyond the cochlea (Mazelová et al., 2003).
One consequence of central hearing loss is the reduction in ability to understand speech in noisy
environments. Speech-in-noise (SIN) discrimination is notably difficult to target with hearing
aids (Chung, 2004; Killion, 1997), and deficits may exist even in the presence of a clinically
normal audiogram (Pienkowski, 2017). Communication difficulties that result from hearing loss
produce strain on social relationships and quality of life. Specifically, auditory decline is
associated with loneliness (Lotfi et al., 2009), depression (Li et al., 2014; Mulrow et al., 1990),
substance abuse (McKee et al., 2019), and reduced social functioning (Mulrow et al., 1990;
Strawbridge et al., 2000; Yoo et al., 2019). To address the dramatic impact of speech-in-noise
discrimination loss on quality of life, it is relevant to both investigate ways to prevent decline
and to improve speech-in-noise abilities in older adults. Music training is a reasonable candidate
to improve auditory abilities by fine-tuning perceptual abilities of sound and enhancing
discrimination between streams of sound in a complex auditory scene.
Accordingly, adult musicians show enhanced performance on sentence-in-noise (Parbery-Clark et al., 2009, 2011, 2012; Zendel et al., 2015a), masked
sentence (Başkent & Gaudrain, 2016; Clayton et al., 2016; Rostami & Moossavi, 2017; J.
Swaminathan et al., 2015), word-in-noise (Fuller et al., 2014), and gap-in-noise (Donai &
Jennings, 2016) tasks as compared to non-musicians. Additionally, Ruggles et al. (2014) observed a significant correlation between speech-in-noise abilities and years of music training in adults. In older adults, musicians additionally outperform non-musicians in sentence-in-noise (Parbery-Clark et al., 2011, 2012) and word-in-noise discrimination (Fostick, 2019; Parbery-Clark et al., 2011). Fostick (2019) demonstrated that the musician advantage
for words-in-noise discrimination remained when comparing older adult musicians to life-long
card players. Zendel and Alain (Zendel & Alain, 2012) found that the rate of speech-in-noise
decline associated with age was less steep in musicians as compared to non-musicians, indicating
that music training may protect against age-related hearing difficulties.
Speech-in-noise difficulties are thought to reflect reduced synchrony of neuronal firing (Kraus et
al., 2000; Rance, 2005; Schneider & Pichora-Fuller, 2001), and are associated with alterations to
both bottom-up and top-down processing (Parthasarathy et al., 2020). Perceiving speech in noise
relies on encoding acoustic features, such as frequency or temporal structure, through bottom-up
processes in combination with recruiting attentional resources, memory, and contextual
prediction through top-down processes. In age-related hearing decline, individuals may
compensate for bottom-up sensory deficits with greater reliance on top-down mechanisms, filling
in missed pieces of information (Besser et al., 2015). In situations of cognitive decline, these
compensatory resources may be less available, resulting in further reduced speech-in-noise
perception (Lin et al., 2011; Moore et al., 2014). Thus, both top-down and bottom-up
mechanisms are important for supporting speech-in-noise perception in older adults and can be
dissociated and assessed at the level of the brain. Specifically, neural responses to speech-in-
noise can be measured with event-related potentials, voltage recorded from scalp electrodes
evoked by a stimulus (Luck, 2014). Specifically, the P1, N1, P2, and P3 components are utilized
to assess auditory processing, including SIN, at a cortical level. The P1 potential (sometimes
referred to as P50) peaks around 70-100ms post-stimulus onset, is the first cortical component of
the auditory response (R. Erwin & Buchwald, 1986; R. J. Erwin & Buchwald, 1986) and has a
fronto-central distribution. It is thought to originate in the primary auditory cortex and the
reticular activating system (R. Erwin & Buchwald, 1986; Liégeois-Chauvel et al., 1994), and
becomes more robust with age (Chambers, 1992). N1 is a negative deflection peaking around
100ms after stimulus onset and most reliably has a frontal and fronto-central distribution on
the scalp (Näätänen & Picton, 1987). N1 is thought to originate in the primary auditory cortex,
specifically from the posterior supratemporal plane, Heschl’s gyrus, and the planum temporale
(Liégeois-Chauvel et al., 1994; Näätänen & Picton, 1987; Scherg et al., 1989; Vaughan & Ritter,
1970), and may be modulated by prefrontal regions engaged in attention processes (Coull, 1998).
A vertically-oriented or “tangential” dipole in the primary auditory cortex, in parallel with
orientation of auditory cortex neurons, is likely responsible for generating the negative potential
recorded in frontal and frontocentral sites (Scherg et al., 1989; Vaughan & Ritter, 1970). N1
response measured in frontal electrodes from this tangential dipole, as compared to a horizontal
dipole originating in secondary auditory areas and recorded more centrally, is more dependent on
stimulus intensity and on age (Hegerl et al., 1994). N1 amplitude increases in the presence of an
unpredictable or change-related stimulus (Nishihara et al., 2011; Schafer et al., 1981). P2,
peaking around 200ms, is less studied but is known to appear with the N1 response (H. Davis &
Zerlin, 1966) and may, like P1, originate in the reticular activating system (Knight et al., 1980).
P2 may reflect attentional processing of sensory input after initial detection marked by N1 (for
review, see (Crowley & Colrain, 2004)). The P3 component peaks from 300-700ms post-
stimulus onset, and is reflective of attentional engagement (Picton, 1992), classically
assessed utilizing the Oddball task. P3 contains two main subcomponents, P3a and P3b. P3a has
a frontocentral distribution and is elicited by novel, non-target stimuli and is largely generated by
the anterior cingulate cortex (Yamaguchi & Knight, 1991). P3b, often referred to as simply P3,
occurs slightly later and has a posterior parietal distribution. It is elicited in response to an
infrequent target sound and reflects voluntary attention (Kok, 2001) and is largely generated by
the temporal-parietal junction (Knight et al., 1989). Of particular relevance to this study
investigating speech in noise, it has been demonstrated that early auditory event-related
potentials (AERPs) showing cortical responses to speech (e.g., N1, P2) degrade with increased
level of background noise (Billings et al., 2013; Whiting et al., 1998), as well as with advancing
age (Koerner & Zhang, 2018; Tremblay et al., 2003).
Behavioral differences between musicians and non-musicians in speech-in-noise abilities are
paralleled by differences in electrophysiological measures of auditory processing. Adult
musicians, compared to non-musicians, show enhancements (earlier and larger peaks) of P1 and
N1 in response to syllables in silence (Musacchia et al., 2008), and P2 in response to vowels
(Bidelman, Weiss, et al., 2014). Adult musicians, compared to non-musicians, also exhibit smaller changes in N400 (Zendel et al., 2015a), a component reflective of meaning representations
(Willems et al., 2008), and N1 (Meha-Bettison et al., 2018) as a result of increasing background
noise level in a speech task, indicating less degrading effects of noise on speech processing. In
older adults, musicians demonstrate enhanced N1, P2, and P3 response to vowels as compared to
non-musicians (Bidelman & Alain, 2015), suggesting more robust encoding of and increased
attention to speech stimuli. At the subcortical level, both child (Strait et al., 2012) and adult
(Bidelman, Weiss, et al., 2014; Musacchia et al., 2007, 2008; Parbery-Clark et al., 2011)
musicians show enhanced auditory brainstem encoding, a measure of pre-attentive processing,
when compared to non-musicians.
While these cross-sectional studies provide valuable information regarding differences between
musicians and musically untrained individuals, they do not establish a causal relationship
between musical experience and speech-in-noise discrimination. Additionally, it has been
suggested that cognitive abilities and socioeconomic status (Anderson et al., 2013a) as well as
inherent differences in auditory abilities (Mankel & Bidelman, 2018), may mediate the
relationship between music training and speech-in-noise perception. To address this, several
longitudinal studies have investigated the effect of music training on speech-in-noise perception.
In a randomized waitlist-control study, children aged 7-9 who received community-based music
training showed significant improvement in sentence-in-noise discrimination after 2 years of
training, and as compared to controls (Slater et al., 2015). Children aged 6-9 with prelingual
moderate-to-profound sensorineural hearing loss showed advantages in sentence-in-noise ability
as compared to a passive control group after 12 weeks of music training (Lo et al., 2020). In
older adults, individuals randomly assigned to choir participation outperformed a passive control
group on a sentence-in-noise task after 10 weeks of training (Dubinsky et al., 2019). In this
study, participants assigned to the choir group additionally demonstrated enhanced neural
representation of the temporal fine structure of speech-related auditory stimuli (i.e., the fundamental frequency of the syllable /da/), and this training effect remained robust in individuals with
higher levels of peripheral hearing loss. In another randomized-control study, older adults who
participated in 6 months of piano training performed better on a words-in-noise task and showed
enhanced N1 and mid-latency responses, as compared to videogame and no-training groups
(Zendel et al., 2019).
Overall, cross-sectional and longitudinal findings demonstrate the potential for music training to
affect speech-in-noise perception across development. However, more experimental work is
needed to continue disentangling the effects of music training from pre-existing biological
differences, both in terms of behavior and neural response. Additionally, as our global population
ages, investigation of auditory decline in relation to socio-emotional well-being in older adults
grows more significant. More research is needed to assess effects of shorter-term music
interventions commencing later in life, as compared to life-long learning. Lastly, it is unclear
whether music training may produce advantages in speech processing through bottom-up
processes, implying that music training improves the neural encoding of sound, or through top-
down processes implying enhanced conscious attentional network performance leading to
improved auditory discrimination. Studies on long-term music training suggest that both
mechanisms are at play, where musicians as compared to non-musicians show enhancements of
attention-related P300 during a 2-stimulus pure tone oddball task (George & Coch, 2011), but
also enhanced subcortical pitch encoding (Musacchia et al., 2008). Working memory additionally
appears to mediate the relationship between preservation of speech-in-noise abilities and lifelong
35
music training in older adults (Zhang et al., 2020b). However, the contribution of each of these
mechanisms in short-term music training is not known.
In this study, we expand upon existing literature to examine the effects of a short-term,
community-oriented music training program on speech-in-noise abilities, associated neural
mechanisms, and well-being in older adults with mild subjective hearing loss. We utilize a
randomized-control design with an active control group to examine whether potential differences
can be attributed to active music engagement, or simply to any music listening activity. Choir
singing was chosen as the active music intervention due to its practicality in short-term
application, potential for near-transfer, and pervasiveness through human culture and evolution.
Additionally, as compared to instrument-learning, choir singing is more accessible to larger
communities as it requires less equipment and financial resources. By recruiting adults aged 50-
65 with mild subjective hearing loss, we examine the effects of music training on a population
vulnerable to age-related auditory decline. Inclusion of EEG measurements provides information
on training-related changes in neural processing of speech and sound. To parse the effects of
bottom-up versus top-down changes in auditory processing related to music training, we include
both a speech-in-noise task, aimed at targeting mostly bottom-up processing, and an auditory attention
(Oddball) task, aimed at targeting mostly top-down processing, in our EEG assessments. Lastly, we
address the link between aging, hearing loss, and psychological well-being by including
measures of quality of life and loneliness.
We hypothesized that after 12 weeks of training, participants in the choir group, as compared to
the control group, would show 1) greater improvements in behavioral measures of speech-in-
noise perception, 2) more robust neural responses during EEG, and 3) improvements in
socioemotional well-being. Exploratory analyses between EEG tasks were additionally conducted.
We expected that greater change in the P3 vs. the early sensory components (N1, P2) in the oddball
task and/or the syllable-in-noise task would support a top-down model of attentional
neuroplasticity associated with music training of this type, indicating that training supports
cognitive processes (i.e., attention, memory) that support speech perception. If the reverse were
observed (a greater change in N1 and P2 vs. P3), a bottom-up model in which music training
enhances stimulus encoding would be supported.
Methods
Participants
Participants between the ages of 50-65 were recruited from local community centers in the Los
Angeles area, and from the Healthy Minds Research Volunteer Registry, a database of potential
participants interested in studies at the University of Southern California related to aging and the
brain. Participants were pre-screened based on inclusion and exclusion criteria. Participant
inclusion criteria were: 1) native English speaker with experience of subjective hearing loss; 2)
normal cognitive function, as measured by the Montreal Cognitive Assessment (score ≥ 23).
Subjective hearing loss was assessed by verbally asking participants if they noticed problems
with their hearing, or if they struggled to hear in noisy environments. Participant exclusion
criteria were: 1) use of prescribed hearing aids; 2) severe hearing loss (thresholds of 50 dB for all
recorded frequencies; see Figure 5); 3) current diagnosis of neurological or psychiatric disorders;
4) formal music training, where the participant currently plays a musical instrument or has had more
than 5 years of formal music training in their life, excluding music classes as part of a typical
education curriculum.
Figure 5. Pure tone thresholds for participants in choir and control groups at Pre-test.
Study design was a pre-post randomized control trial. Participants took part in two testing
sessions: the Pretest session took place up to one month prior to intervention and the Posttest
took place up to one month after 12 weeks of intervention. After all participants had completed
the Pretest session, participants were randomized by an independent statistical consultant into two
groups (Control and Choir), stratified by gender and age (<57, ≥57). During Pretest and Posttest,
participants completed behavioral assessments of socio-emotional well-being, speech-in-noise
perception, music-in-noise perception, and two auditory tasks with simultaneous EEG recording.
Seventy-six participants were recruited to participate in the study. Five participants dropped out
prior to pre-screening assessment. After pre-screening, 11 participants were excluded, leaving 60
participants who completed the Pretest session. After randomization, 17 participants withdrew
from the study due to personal circumstances, changes in schedule, or relocation. Two participants
were removed for insufficient completion of the intervention (missing more than 3 choir
rehearsals or 3 weeks of music listening). This resulted in 41 participants completing
Pretest and Posttest (Control group N = 23, Choir group N = 18). Demographics of participants
within each group are summarized in Table 2.
                        Total           Choir           Control
n                       41              18              23
# Females               26              12              14
Age, Mean (SD)          58.29 (4.19)    58.22 (4.35)    58.39 (4.10)
MoCA Total, Mean (SD)   26.32 (2.13)    26.48 (2.06)    26.11 (2.25)

Table 2. Demographics of participants within each group.
Interventions
Choir-singing group. The choir-singing group (Choir group hereafter) participated in 2-hour
weekly group choir singing sessions for 12 consecutive weeks. Participants were given at-home
vocal training and music theory exercises to complete outside of class for an estimated 1 hour per
week. The choir was directed by a doctoral student from the Department of Choral and Sacred
Music at USC Thornton School of Music and accompanied by a pianist. Four singers from the
Thornton School of Music sang with the choir, one with each voice part, as “section leaders”.
Participants learned a variety of songs across genres and performed them at the end of the 12-
week period in a small concert. The performance included folk (e.g., “Sally Gardens”), musical
theater (e.g., “Food Glorious Food” from Oliver!), holiday (e.g., “Carol of the Bells”), Renaissance
(e.g., “El Grillo”), Baroque (e.g., “Bist du Bei Mir” by J.S. Bach), and traditional choral music (e.g.,
“Life’s Joy” by Schubert, and “Laudate Dominum”). Participants in the choir were given an
additional $15 per rehearsal attended to cover parking and transportation expenses.
Passive-listening group. The passive-listening group (Control group hereafter) received twelve
weekly 3-hour musical playlists that they were asked to listen to throughout the week. Playlists
were curated by a doctoral student in the Thornton School of Music to reflect a variety of
musical genres that would be enjoyable to participants in this age group. Participants were given
the choice to listen to the playlists on a provided MP3 player, or on a personal device through
Spotify. Reminders to listen each week were administered via text. Participants interacted with
other participants on a private online platform to discuss the previous week’s playlist.
Additionally, participants were given opportunities to attend free weekly live concerts and
musical events as a group. Attendance at live events was not required for participation in the
study, but on average different combinations of 4-5 participants attended each week.
Stimuli
Behavioral Tasks. Cognitive abilities were assessed for pre-screening purposes using the
Montreal Cognitive Assessment (MoCA) (Julayanont & Nasreddine, 2016), which includes
measures of memory, language, attention, visuospatial skills, calculation, and orientation and is
intended to detect mild cognitive impairment. Audiometric thresholds were obtained bilaterally
at octave intervals 0.5-8 kHz using a Maico MA 790 audiometer in a sound-attenuated booth.
Musical experience was measured at pre-test only using the Goldsmiths’ Musical Sophistication
Index (Müllensiefen et al., 2014), which measures musical experience as a function of six facets:
active engagement, perceptual abilities, musical training, singing abilities, emotions, and general
sophistication. Socio-emotional well-being was assessed using Ryff’s Psychological Well-Being
Scale (Ryff et al., 2012; Ryff, 1989), which includes 42 self-report items that
measure six aspects of wellbeing: autonomy, environmental mastery, personal growth, positive
relations with others, purpose in life, and self-acceptance. Loneliness was measured at post-test
only, with the De Jong Gierveld Loneliness Scale (de Jong-Gierveld & Kamphuis, 1985),
consisting of 11 self-report items asking participants about current feelings of social and
emotional loneliness. At post-test, participants were additionally asked to respond in writing to
the open-ended prompt: “Do you feel that the music intervention has had any impact on your
social life or feelings of connection with other people?”.
Hearing-in-noise abilities were assessed with the Music-In-Noise Task (MINT) (Coffey et al.,
2019) and the Bench, Kowal, and Bamford Sentences test (BKB-SIN) (Niquette et al., 2003). In
the MINT, participants were presented with a musical excerpt embedded within musical noise,
followed by a matching or non-matching repetition of the target excerpt in silence and are asked
to determine whether the two presented sounds matched. This portion of the task is divided into
Rhythm or Pitch matching conditions. In a third condition of the task (Prediction), participants
were first presented with the target stimulus in silence before being asked to determine if the
following excerpt within noise was a match. Accuracy and response times were recorded.
Participants completed this task using headphones in a sound attenuated room. In the BKB-SIN,
speech-in-noise abilities were assessed by asking participants to repeat simple sentences
embedded in four-talker babble at increasing noise levels. The BKB-SIN uses Bench, Kowal,
and Bamford Sentences (Bench et al., 1979), which are short stimuli written at a first-grade
reading level rich with syntactic and contextual cues. A verbal cue (“ready”) is presented before
each sentence. Background babble is presented at 21, 18, 15, 12, 9, 6, 3, 0, -3, and -6 dB SNR.
Six lists containing ten sentences each were presented through a single loudspeaker in a sound
attenuated room at 60 dBA. Each sentence contains three or four key words that are scored as
correct or incorrect. An experimenter recorded responses, and a total score and an SNR-50 (23.5 –
total score) were calculated.
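To make the scoring rule concrete, the following minimal R sketch computes an SNR-50 from hypothetical key-word counts; the variable names and the per-list aggregation are our own assumptions, and only the 23.5 – total score formula comes from the task description above.

```r
# Hypothetical key words repeated correctly on one BKB-SIN list,
# one value per SNR step (21 dB SNR down to -6 dB SNR)
correct_keywords <- c(3, 3, 3, 3, 2, 2, 2, 1, 1, 0)

total_score <- sum(correct_keywords)  # total correct key words for the list
snr50 <- 23.5 - total_score           # SNR-50 per the formula above
snr50                                 # here: 23.5 - 20 = 3.5 dB SNR
```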
EEG tasks. Participants completed two tasks during EEG recording: an auditory oddball, and a
syllable-in-noise task. The syllable-in-noise (SIN) task consisted of an active and a passive
condition. In the active condition, participants pressed a button when they were able to hear a
target syllable within background babble. In the passive condition, participants watched a muted
nature documentary while passively listening to the stimuli. Stimuli consisted of the syllable /da/
presented at 65 dB SPL within a two-talker babble at one of four SNR conditions (silent (no
background noise), 0dB, 5dB, and 10dB). Each target stimulus was presented for 170 ms with an
inter-stimulus interval jittered at 1000, 1200, or 1400 ms, for a total trial length of 1370 ms. Each
SNR condition was presented in a block of 150 stimuli for both the active and the passive
condition. Accuracy and response time during the active condition were recorded. Auditory
stimuli for both tasks were presented binaurally with ER-3 insert earphones (Etymotic Research).
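As an illustration of what the SNR conditions entail acoustically, here is a minimal R sketch of mixing a target waveform with babble at a requested SNR; the function and variable names are our own, and the actual stimulus construction may have differed.

```r
# Root-mean-square level of a waveform
rms <- function(x) sqrt(mean(x^2))

# Scale the babble so that 20*log10(rms(target) / rms(scaled babble))
# equals the requested SNR in dB, then sum the two waveforms.
# Assumes target and babble share a sampling rate and length.
mix_at_snr <- function(target, babble, snr_db) {
  gain <- rms(target) / (rms(babble) * 10^(snr_db / 20))
  target + gain * babble
}
```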
In the oddball task, 400 trials were presented with a 1000 ms intertrial interval; stimuli
consisted of 280 standard pure tones (500 Hz), 60 oddball target tones (1000 Hz), and 60 white
noise distracter stimuli, each presented for 60 ms. Stimuli were presented at 76 dB SPL.
Participants were instructed to press a button only for the oddball stimulus. Accuracy and
response times were recorded.
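A minimal R sketch of the trial structure just described, with the three stimulus types in randomized order; the seed and the resulting ordering are illustrative only.

```r
set.seed(1)  # arbitrary seed, for reproducibility of the illustration only
trials <- sample(rep(c("standard", "oddball", "distractor"), c(280, 60, 60)))
table(trials)              # 280 standards, 60 targets, 60 distractors
prop.table(table(trials))  # standards occur on 70% of the 400 trials
```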
Procedure
Recruitment and induction protocols were approved by the University of Southern California
Institutional Review Board. Informed consent was obtained in writing from participants, and
participants could end participation at any time. Participants received monetary compensation for
assessment visits ($20 per hour). All participants were tested individually at the Brain and
Creativity Institute at the University of Southern California.
EEG recording and averaging
Electrophysiological data were collected from 32 channels of a 64-channel BrainVision actiCAP
Standard-2 system. Electrodes were labeled according to the standard International 10-20 system
(Oostenveld & Praamstra, 2001). Participants were seated in a comfortable chair in a dark,
sound-attenuated and electrically-shielded room. Impedances were kept below 10 kΩ. Data were
sampled at 500 Hz.
EEG data processing was conducted with EEGLab (Delorme & Makeig, 2004) and ERPLAB
(Lopez-Calderon & Luck, 2014). Data were resampled offline to 250 Hz sampling rate, and
bandpass filtered with cut-offs at .5 Hz and 50 Hz. Channels with excessive noise were removed
and then manually interpolated. The data were visually inspected for artifacts, and segments with
excessive noise were removed. Ocular movements were identified and removed using
independent components analysis. Data were then bandpass filtered at 1-20 Hz. Epochs were
average referenced (excluding EOG and other removed channels) and baseline corrected (-200 to
0 ms prior to each note). Epochs with a signal change exceeding +/- 150 microvolt at any EEG
electrode were artifact-rejected and not included in the averages. For the Active and Passive
syllable-in-noise tasks, EEG data were divided into epochs starting 200ms before and ending 800
ms after the onset of each stimulus. A repeated measures ANOVA was conducted, with SNR
Condition and Time as within-subject factors, and Group as the between-subjects factor for the
Passive and Active tasks separately to assess differences in number of trials accepted. No
differences in accepted trials were observed in the Passive syllable-in-noise task (ps > 0.05). An
effect of time was observed in the Active syllable-in-noise task, (F(1, 32) = 5.96, p < 0.05),
where more trials were accepted at post-test than at pre-test across conditions and groups. No
other differences were observed (see Table 3).
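To illustrate the epoch-level steps described above, here is a minimal R sketch of baseline correction and the ±150 µV rejection rule, assuming each epoch is stored as a channels x samples matrix; the actual pipeline ran in EEGLAB/ERPLAB, so these names and the data layout are our own.

```r
# epoch: channels x samples matrix; times: sample times in ms (-200 to 800)
baseline_correct <- function(epoch, times) {
  bl <- times >= -200 & times < 0
  # subtract each channel's pre-stimulus mean from that channel
  sweep(epoch, 1, rowMeans(epoch[, bl, drop = FALSE]))
}

# TRUE if any channel exceeds +/-150 microvolts anywhere in the epoch
reject_epoch <- function(epoch) any(abs(epoch) > 150)
```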
Table 3. Trials accepted in each EEG task and condition; values are mean (SD).

                                  Pre-Test                          Post-test
                        Choir            Control          Choir            Control
Syllable-in-noise Active
  Silent                123.53 (31.01)   132.68 (19.29)   119.26 (37.64)   112.79 (32.71)
  10 dB                 121.87 (33.01)   132.26 (22.22)   114.13 (44.14)   111.58 (34.99)
  5 dB                  130.33 (29.75)   135.37 (16.77)   119.47 (36.54)   111.63 (40.04)
  0 dB                  123.00 (34.86)   134.00 (19.23)   116.80 (39.49)   115.89 (36.52)
Syllable-in-noise Passive
  Silent                147.11 (4.09)    148.33 (33.93)   148.67 (3.01)    148.16 (2.48)
  10 dB                 146.22 (6.34)    144.78 (25.28)   149.33 (1.85)    147.78 (4.28)
  5 dB                  146.61 (3.18)    139.94 (14.90)   149.00 (1.61)    147.28 (9.61)
  0 dB                  147.83 (2.50)    141.83 (12.19)   147.33 (8.35)    148.22 (3.57)
Oddball
  Standard              274.89 (6.64)    263.20 (36.92)   276.28 (5.97)    269.35 (20.47)
  Oddball               55.00 (7.11)     51.75 (11.27)    54.11 (5.94)     53.25 (8.28)
  Distractor            56.89 (1.94)     54.35 (6.47)     57.11 (1.45)     54.75 (4.52)
For the Oddball task, data were epoched from -200 ms to +1000 ms relative to the onset of each
stimulus. Separate repeated measures ANOVAs were calculated to assess whether
time or group impacted the number of accepted trials in each condition (Oddball, Standard, and
Distractor). No effect of group or time on the number of accepted trials was observed in the
Oddball (p > 0.05), Standard (p > 0.05), or Distractor conditions (p > 0.05) (see Table 3).
Mean amplitude and peak latency for ERPs were calculated automatically in time-windows
centered on the peak of the respective component of the grand average waveform. Latencies
were analyzed at a single electrode chosen from existing literature (Meha-Bettison et al., 2018;
Musacchia et al., 2008a) and verified based on location of peak activity observed in topographic
headplots. Time-windows and electrodes for peak measurements for each component of the
Oddball and the syllable-in-noise task are summarized in Tables 4, 5, and 6. In addition to
examining well-studied ERP components (P1, N1, P2, P3), we investigated the effects of choir
training on a frontally-distributed, P3-like positive peak occurring at 200-1000 ms during the
syllable-in-noise task, as described by Zendel et al. (2019). This peak was
interpreted as a marker of attention orienting, given its temporal overlap with the P3 (Zendel et
al., 2019).
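The window-based measurement can be sketched in R as follows; the function and variable names are our own, since the actual measurements were made with ERPLAB.

```r
# erp: amplitude vector for one electrode; times: matching sample times in ms.
# polarity = 1 finds a positive peak (P1, P2, P3); polarity = -1 finds N1.
peak_in_window <- function(erp, times, lo, hi, polarity = 1) {
  idx <- which(times >= lo & times <= hi)        # samples inside the window
  i_peak <- idx[which.max(polarity * erp[idx])]  # index of the extremum
  list(peak_latency   = times[i_peak],
       mean_amplitude = mean(erp[idx]))          # mean amplitude over window
}
```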
Table 4. Latencies (ms) and electrodes for components observed during Oddball task.

Time        Component   Condition                       Electrodes                Window
Pre         N1          Oddball, Standard, Distractor   F3, Fz*, F4, C3, Cz, C4   65-115
Post        N1          Oddball, Standard, Distractor   F3, Fz*, F4, C3, Cz, C4   70-110
Pre         P2          Oddball                         Fz, Cz*, Pz               145-250
Pre         P2          Standard                        Fz, Cz*, Pz               135-185
Pre         P2          Distractor                      Fz, Cz*, Pz               190-265
Post        P2          Oddball                         Fz, Cz*, Pz               125-155
Post        P2          Standard                        Fz, Cz*, Pz               115-145
Post        P2          Distractor                      Fz, Cz*, Pz               115-145
Pre         P3          Oddball                         P3, Pz*, P4               300-625
Post        P3          Oddball                         P3, Pz*, P4               315-610
Pre         P3a         Distractor                      Fz*, Cz, Pz               345-395
Post        P3a         Distractor                      Fz*, Cz, Pz               320-390
Pre, Post   P3b         Distractor                      Fz, Cz*, Pz               450-660

*electrode from which latency was calculated
Table 5. Latencies (ms) and electrodes for components observed during Syllable-in-Noise Active
Task.

Time   Component           Condition   Electrodes                Window
Pre    P1                  Silent      F3, Fz, F4, C3, Cz*, C4   35-70
                           10 dB                                 50-80
                           5 dB                                  65-110
                           0 dB                                  60-105
Post   P1                  Silent      F3, Fz, F4, C3, Cz*, C4   45-70
                           10 dB                                 50-85
                           5 dB                                  55-95
                           0 dB                                  65-100
Pre    N1                  Silent      F3, Fz, F4, C3, Cz*, C4   90-125
                           10 dB                                 115-170
                           5 dB                                  125-190
                           0 dB                                  130-200
Post   N1                  Silent      F3, Fz, F4, C3, Cz*, C4   85-130
                           10 dB                                 105-175
                           5 dB                                  125-175
                           0 dB                                  155-205
Pre    P2                  Silent      F3, Fz, F4, C3, Cz*, C4   155-200
Post   P2                  Silent      F3, Fz, F4, C3, Cz*, C4   160-245
Pre    P3-like component   Silent      F3, Fz, F4, C3, Cz*, C4   275-400
                           10 dB                                 270-430
                           5 dB                                  280-440
                           0 dB                                  295-480
Post   P3-like component   Silent      F3, Fz, F4, C3, Cz*, C4   275-400
                           10 dB                                 280-410
                           5 dB                                  275-430
                           0 dB                                  305-445

*electrode from which latency was calculated
Table 6. Latencies (ms) and electrodes for components observed during Syllable-in-Noise Passive
Task.

Time   Component   Condition   Electrodes                Window
Pre    P1          Silent      F3, Fz, F4, C3, Cz*, C4   40-75
                   10 dB                                 50-100
                   5 dB                                  55-105
                   0 dB                                  55-110
Post   P1          Silent      F3, Fz, F4, C3, Cz*, C4   40-70
                   10 dB                                 55-95
                   5 dB                                  55-105
                   0 dB                                  65-115
Pre    N1          Silent      F3, Fz, F4, C3, Cz*, C4   90-130
                   10 dB                                 130-195
                   5 dB                                  145-200
                   0 dB                                  144-215
Post   N1          Silent      F3, Fz, F4, C3, Cz*, C4   90-130
                   10 dB                                 125-185
                   5 dB                                  145-200
                   0 dB                                  155-200
Pre    P2          Silent      F3, Fz, F4, C3, Cz*, C4   160-230
Post   P2          Silent      F3, Fz, F4, C3, Cz*, C4   165-230

*electrode from which latency was calculated
Statistical Analysis
All statistical analyses were performed using R (R Core Team, 2020). Difference scores were
calculated for all behavioral and EEG measures (Posttest - Pretest) and used as the primary
outcome of interest. Much of the data was not normally distributed or homoscedastic;
thus, robust estimators were used, with R functions from Wilcox (2017) and the WRS2 package
(Mair & Wilcox, 2020). Pairwise comparisons were conducted using a robust bootstrap-t method
(R function linconbt from Wilcox, 2017). This method computes sample trimmed
means (20%) and Yuen’s estimate of squared standard errors before generating bootstrap
samples to estimate the distribution. For tasks that included multiple conditions, a robust
bootstrap-trimmed-mean method was used (R functions bwtrim and bwwtrim from WRS2). 20%
trimming was used in all tests as it provides a compromise between the mean and the median. These
robust methods perform well under non-normality and with small sample sizes (Wilcox, 2017).
Effect sizes were computed (R function ES.summary) for all significant main effects and
interactions using QS, a heteroscedastic, non-parametric measure based on medians. An alpha
level of 0.05 was used for all tests.
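For concreteness, a minimal sketch of such robust comparisons in R follows, assuming a long-format data frame d with columns diff (Posttest - Pretest), group, snr, and id. yuenbt is the WRS2 analogue of the bootstrap-t comparison described above (the analysis itself used linconbt from Wilcox, 2017), and the exact calls used may have differed.

```r
library(WRS2)

# Two-group comparison of 20% trimmed means via a bootstrap-t method
yuenbt(diff ~ group, data = d, tr = 0.2, nboot = 599)

# Between-subjects (group) x within-subjects (SNR condition) robust
# trimmed-means ANOVA on the difference scores
bwtrim(diff ~ group * snr, id = id, data = d, tr = 0.2)
```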
Behavioral analysis
Separate robust bootstrap-t tests were conducted for each behavioral task, with Group as the
between-groups factor and difference score as the dependent variable. For the MINT, task
condition was included as a within-groups factor (Prediction, Melody, and Rhythm). For Ryff’s
and the Goldsmith MSI, each subcategory was assessed separately. The De Jong Gierveld scale was
assessed at post-test only, and scores on the emotional and social subcategories were assessed separately.
For the open-ended well-being prompt (“Do you think that the music intervention has had any
impact on your social life or feelings of connection with other people?”), responses were
transcribed and sorted into one of three categories: 1) social impact, 2) emotional impact, or 3)
no impact, and the proportion of responses in each category was assessed by Group. These
categories were intended to parallel the “social” and “emotional” aspects of loneliness measured in
the De Jong Gierveld scale (de Jong-Gierveld & Kamphuis, 1985). For the EEG syllable-in-noise task,
SNR condition was included as a within-groups factor (silent, 0dB, 10dB, 5dB). Accuracy and
reaction time during the EEG syllable-in-noise task were only recorded during the Active
listening condition. For the EEG Oddball task, group differences in accuracy and reaction time
were compared separately.
EEG analysis
Separate bootstrap-trimmed-means tests were conducted for each EEG task, for each component
of interest for amplitude and latency difference scores. When appropriate, laterality was included
as a factor in both EEG tasks due to the known right-lateralized processing of musical pitches
(Zatorre et al., 1992), the mediating effect of pitch perception on speech-in-noise abilities
(Coffey, Chepesiuk, et al., 2017; Dubinsky et al., 2019), and the influence of musical training on
right-lateralized temporal structures (Habibi et al., 2018; Hyde et al., 2009). For the syllable-in-
noise task, SNR Condition (Silent, 10dB, 5dB, 0dB), Laterality (amplitude only), and Frontality
(amplitude only; frontal vs central electrodes) were included as within-subjects factors, and
Group was included as a between-subjects factor. The Active and Passive listening conditions of
the syllable-in-noise task were analyzed separately. For the Oddball task, components were
assessed separately for each trial type (Oddball, Standard, and Distractor). Laterality (amplitude
only; left, middle and right) or Frontality (amplitude only; frontal, central, parietal) was included
as a within-subjects factor, and group was included as a between-subjects factor.
Results
Means and standard deviations for each behavioral task, EEG task amplitude, and EEG task
latency by group are presented in Appendices B, C, and D, respectively.
Montreal Cognitive Assessment
At pre-test, no difference between groups was observed for the MoCA (p > 0.05). Groups
demonstrated nearly identical distributions (Choir M = 26.11, SD = 2.25; Control M = 26.48, SD
= 2.06).
Sentence-in-Noise Task
In the BKB-SIN task, no effect of Group was observed (p > 0.05).
Musical Sophistication
At Pretest, no difference between groups was observed in any subcategory of the Goldsmith MSI
(p > 0.05).
Music-in-Noise Task
In the MINT, 3 participants from the control group had incomplete or missing data from one or
more time points and were thus excluded from analysis, resulting in 20 Control and 18 Choir
participants. No main or interaction effects of Condition or Group were observed for accuracy or
reaction time (all p > 0.05).
Well-being
No significant effects of Group were observed for any subcategory of Ryff’s Psychological
Well-being Scale (all p > 0.05).
For the De Jong Gierveld Loneliness Scale, no effect of group was observed in emotional or social
loneliness at post-test (all p > 0.05).
For the open-ended prompt, “Do you think that music intervention has had any impact on your
social life or feelings of connection with other people?”, 13 participants responded from the
Control group and 15 participants responded from the Choir group. In the Choir group, 62%
reported that the intervention had an impact on their social wellbeing, 19% reported an impact on
emotional well-being, and 19% reported no impact. In the Control group, 8% reported that the
intervention had an impact on their social well-being, 54% reported impact on emotional well-
being, and 31% reported no impact. A chi-squared test of independence indicated that response
category (social, emotional, none) was dependent on group (χ²(2, N = 30) = 11.02, p < 0.01).
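For illustration, the test can be run in R as below; the cell counts are hypothetical reconstructions from the reported percentages and responder counts, not the actual data.

```r
# Rows: group; columns: reported impact category (counts are illustrative)
tab <- matrix(c(9, 3, 3,   # Choir: social, emotional, none (hypothetical)
                1, 7, 4),  # Control: social, emotional, none (hypothetical)
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Choir", "Control"),
                              c("Social", "Emotional", "None")))
chisq.test(tab)  # small expected counts would trigger a warning here
```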
Behavioral Responses during EEG tasks
Syllable-in-noise
One participant from the Choir group was removed from analysis due to excessive noise in EEG
data, and 3 participants were removed from the Control group for excessive noise or incomplete
data. No main or interaction effects were observed for accuracy (all p > 0.05). No main or
interaction effects were observed for reaction time (all p > 0.05).
Oddball
Three participants from the Control group were removed from analysis due to excessive noise in
EEG data. No effect of Group was observed for accuracy or reaction time (all p > 0.05).
Event related potentials in active syllable-in-noise task
P1 Amplitude and Latency
P1 reached peak latency at 35-70ms in the Silent SNR condition, 50-85ms in the 10dB SNR
condition, 65-110ms (pre) and 55-95ms (post) in the 5dB SNR condition, and 60-105ms in the
0dB SNR condition. No significant effects between groups or interactions were observed for P1
amplitude or latency (all p > 0.05). For P1 latency, a main effect of SNR Condition was observed
(Test statistic = 7.50, p < 0.01, QS = 0.78), where latency in the 5dB condition decreased more from
Pretest to Posttest than in the 0dB (p < 0.001), 10dB (p < 0.05), and silent (p < 0.01) conditions.
N1 Amplitude
57
N1 reached peak amplitude at 90-125ms (pre) and 85-130ms (post) during the Silent SNR
condition, 105-175ms in the 10dB SNR condition, 125-190 in the 5dB condition, and 130-200 in
the 0dB condition. No significant effects related to intervention were observed for N1 amplitude
(p > 0.05). A main effect of Frontality was observed (Test statistic = 4.15, p < 0.05, QS = 0.50),
where amplitude increased more in frontal than in central electrodes from
Pretest to Posttest (p < 0.01).
N1 Latency
For N1 latency, a main effect of Group was observed (Test statistic = 7.31, p < 0.05, QS = 0.31),
where N1 latency in the Choir group decreased to a greater extent than in the Control group from
Pretest to Posttest (p < 0.01) across all SNR conditions (see Figure 6).
Figure 6. a) N1 latency, difference score (post-test – pre-test) at Cz in the active condition of the
syllable-in-noise task in choir and control groups, across SNR conditions. b) ERPs recorded at
Cz during active condition of the syllable-in-noise task in the choir and control groups at pre and
post-test for each noise condition. c) Topographic headplots for N1 during active condition of
the syllable-in-noise task in the choir and control groups at pre and post-test for 0dB and Silent
conditions.
P2 Amplitude and Latency
P2 was observed only in the Silent SNR condition at around 160-245 ms. For P2 amplitude and
latency, no significant effects between groups were observed (all p > 0.05).
P3-like Amplitude
A positive inflection varying from 275-400ms to 305-445ms (latency dependent on SNR
condition) was observed across SNR conditions of the active, but not the passive, task. A Group
x Laterality interaction was observed for the P3-like amplitude (Test statistic = 3.10, p < 0.05)
where, in the right electrodes, the Control group showed an increased amplitude from Pretest to
Posttest more than the Choir group (p < 0.05, QS = 0.41). A Group x SNR Condition interaction
approached significance (Test statistic = 2.55, p = 0.05) where, in the silent SNR condition only,
the Control group showed an increased amplitude from Pretest to Posttest more than the Choir
group. A main effect of Frontality was observed (Test statistic = 7.51, p < 0.01, QS = 0.44),
where the amplitude increase from Pretest to Posttest was more pronounced in frontal than in central
electrodes (p < 0.01). After inspecting individual traces, we noted that the group differences in
amplitude were driven by a single participant in the Control group and, when that participant was
removed, did not approach significance.
P3-like Latency
For Latency, no significant effects or interactions were observed (p > 0.05).
Event related potentials in passive syllable-in-noise task
P1 Amplitude and Latency
P1 reached peak amplitude at 40-75ms in the Silent SNR condition, 50-100ms in the 10dB SNR
condition, 55-105ms in the 5dB SNR condition, and 55-110ms (pre) and 65-115ms (post) in the
0dB condition. No significant effects between groups or interactions were observed for P1
amplitude or latency (all p > 0.05).
N1 Amplitude
N1 reached peak amplitude at 90-130ms in the Silent SNR condition, 125-195ms (pre) and 125-
185ms (post) in the 10dB SNR condition, 145-200ms in the 5dB SNR condition, and 144-215ms
(pre) and 155-200ms (post) in the 0dB SNR condition. A main effect of Group was observed
(Test statistic = 6.62, p < 0.05, QS = 0.51), where the Choir group showed a significantly greater
increase in N1 amplitude from Pretest to Posttest than did the Control group (p < 0.001) across
SNR conditions (see Figure 7). A Group X SNR Condition X Frontality interaction was
observed on N1 amplitude (Test statistic = 3.38, p < 0.05) but was not significant after correcting
for multiple comparisons (p > 0.05).
Figure 7. a) N1 amplitude, difference score (post-test – pre-test) averaged across frontal and
central electrodes in the passive condition of the syllable-in-noise task in choir and control groups. b) ERPs
recorded at Cz during the passive condition of the syllable-in-noise task in the choir and control
groups at pre and post-test for each noise condition.
N1 Latency
For N1 latency, no significant effects between groups or interactions were observed (p > 0.05).
P2 Amplitude and Latency
P2 was observed only in the silent SNR condition and reached peak amplitude at 160-230ms. No
significant effects related to intervention were observed for P2 amplitude (p > 0.05). A main
effect of Laterality was observed (Test statistic = 7.32, p < 0.01), but was not significant after
correcting for multiple comparisons (p > 0.05). No significant effects between groups were
observed for P2 latency (all p > 0.05).
Event related potentials in oddball task
N1 Amplitude
N1 reached peak amplitude at 65-115ms at pretest and 70-110 ms at posttest in the Oddball,
Standard, and Distractor conditions. During Standard trials, a Group X Frontality interaction was
observed (Test statistic = 5.36, p < 0.05, QS = 0.64) where, in frontal electrodes, amplitude in the
Choir group increased more than in the Control group (p < 0.01, QS = 0.37) from Pretest to
Posttest (see Figure 8). During Oddball and Distractor trials, no effect of Group was observed (p
> 0.05). During Distractor trials, a main effect of Laterality was observed (Test statistic = 3.59, p
< 0.05, QS = 0.73), where amplitude at right electrodes increased more than amplitude at left
electrodes (p < 0.01).
Figure 8. a) N1 amplitude, difference score (post-test – pre-test) in frontal and central electrodes
in the standard condition of the oddball task in choir and control groups. b) ERPs recorded at Fz
during standard condition of the oddball task in the choir and control groups at pre and post-test.
c) Topographic headplots for N1 during oddball task in choir and control groups in the standard
condition.
N1 Latency
During Oddball, Standard, and Distractor trials, no significant effects between groups or
interactions were observed on N1 latency (all p > 0.05).
P2 Amplitude and Latency
P2 reached peak amplitude at 145-250ms (pre) and 125-155ms (post) in the Standard condition,
135-185ms (pre) and 115-145ms (post) in the Oddball condition, and 190-265ms (pre) and 115-
145ms (post) in the Distractor condition. However, no significant effects between groups or
interactions were observed for P2 amplitude or latency (all p > 0.05) for any of the conditions.
P3a Amplitude and Latency
During the Distractor trials, P3a reached peak amplitude at 345-495ms at pretest and 320-390 ms
at posttest. No significant amplitude or latency effects between
groups or interactions were observed (all p > 0.05).
P3b Amplitude and Latency
P3b reached peak amplitude at 300-625ms (pre) and 315-610ms (post) during Oddball trials and
450-660ms during Distractor trials. No significant effects between groups were observed on P3b
amplitude or latency during Oddball or Distractor trials (all p > 0.05).
Discussion
In this study, we investigated the effects of participation in a short-term choir program on
perceiving speech in noise (SIN), auditory attention, and their underlying neurophysiological
correlates using event-related potentials (ERPs) in a randomized-control trial with older adults
between ages 50-65. We also assessed social well-being as a result of participation in the choir.
We observed an effect of music training on the auditory evoked potential N1 response in an
Active and Passive Syllable-in-Noise task, although no behavioral differences were observed. An
effect of training was also observed on N1 response during the Oddball task, again in the absence
of behavioral differences. Lastly, well-being measures qualitatively indicated that choir training
may have benefitted participants’ social well-being, while passive music listening may have
benefitted control participants’ emotional well-being. These results have implications for the use
of a short-term music program to mitigate the perceptual and socioemotional effects of age-
related auditory decline. We discuss these findings in detail in the context of existing literature
below.
N1
N1 is regarded as a correlate of initial stimulus detection (Parasuraman & Beatty, 1980). N1 is
additionally enhanced by increased attention, where larger amplitudes (Folyi et al., 2012;
Hillyard et al., 1973; T. W. Picton et al., 1974) and shorter latencies (Folyi et al., 2012) are
observed with increasing attentional engagement. In the presence of background noise, N1 is
attenuated, with decreased amplitude and increased latency with falling signal-to-noise ratios
(Kaplan-Neeman et al., 2006; Koerner & Zhang, 2015; Salo et al., 2003). Thus, N1 is associated
with encoding of physical properties of sound and marks the arrival of potentially important
sounds to the auditory cortex. While N1 elicitation does not require conscious processing
(Atienza et al., 2001; Näätänen & Winkler, 1999), it can be modulated by attentional demands
(Hillyard et al., 1973).
N1 response is reduced in certain clinical populations with disorders related to audition,
including individuals with misophonia (Schröder et al., 2014) and sensorineural hearing loss
(Oates et al., 2002). The effects of age on N1 are less clear. While some report decreased
amplitude (Schiff et al., 2008), others report a pattern of increased amplitude and longer N1
latency in older adults (Anderer et al., 1996; Bidelman, Villafuerte, et al., 2014; Herrmann et al.,
2016; Rufener et al., 2014) and older adults with hearing loss (Tremblay et al., 2003) and many
investigations report little or no effects of age on either amplitude or latency (Bahramali et al.,
1999; Barrett et al., 1987; Čeponiene et al., 2008; Coyle et al., 1991; Terence W. Picton et al.,
1984; Polich, 1997). Throughout the lifespan, however, N1 appears to be mutable through
experience-dependent plasticity. N1 is larger in adult musicians as compared to non-musicians
(Baumann et al., 2008; Shahin et al., 2003). N1 amplitude increases are observed after short-term
syllable training (Tremblay & Kraus, 2002), frequency training (using a tone-based oddball task;
Menning et al., 2000), and music training (Pantev & Herholz, 2011; Zendel et al., 2019).
Effect of music training on N1
In the present study, participants involved in choir, as compared to participants engaged in
passive music listening, demonstrated larger N1 amplitudes in a passive syllable-in-noise task
from pre- to post-training across all noise conditions. This finding replicates that of Zendel et al.,
2019, who also showed larger N1 during a passive, but not active, words-in-noise task after 6
months of piano training. Of note, all participants in our study first completed the active task
followed by the passive task. The group difference in N1 amplitude observed only in the passive
condition could be related to the order of task administration and its interaction with music training:
during the active condition, both groups attended equally to the incoming auditory stimuli and,
due to a ceiling effect, no group differences were evident; during the passive task, however,
participants in the choir group continued to involuntarily attend to the incoming auditory
stimuli, due to a general re-organization of attention to and encoding of sound related to their
music training.
In the oddball task, choir participants additionally demonstrated larger N1 amplitudes from pre-
to post-training as compared to controls. This finding was specific to the frontal electrode (Fz),
during trials of standard tones. This finding is similar to that of Menning et al. (2000), who
observed that a short-term frequency discrimination intervention led to increased N1 amplitude
most prominently during standard (as compared to deviant) trials of an oddball task. The finding
that N1 amplitude was enhanced only in standard trials may simply reflect the fact that standard
tones were presented 4.7 times as frequently as oddball or distractor tones, indicating that a
larger sample of trials was necessary to see an effect of training. The observed frontality effect
replicates previous work showing that the N1 response is most reliably observed at frontal or
frontocentral sites (Näätänen & Picton, 1987), and further demonstrates that the effect of training
was most robust in locations where N1 is classically observed.
Given that N1 amplitude is known to be enhanced by attention (Folyi et al., 2012; Hillyard et al.,
1973; T. W. Picton et al., 1974), it is possible that observed changes in N1 amplitude in the
oddball and passive syllable-in-noise tasks may be explained by, in addition to enhanced
encoding, increased attention to sound in general in the choir group. Participating in music
training may have in part re-organized participants’ orientation towards sounds and led to greater
engagement of attention resources towards tones and syllables. This, in conjunction with improved
basic auditory perception, may have contributed to enhanced amplitudes of N1.
In contrast to amplitude, latency differences were observed only in the active condition of the
syllable-in-noise task, where choir participants demonstrated earlier N1 latencies from pre- to
post-training across all noise conditions. Attention has been shown to decrease N1 latency,
where latency is earlier in active as compared to passive tasks (Folyi et al., 2012; Okamoto et al.,
2007). These findings support the Prior Entry Hypothesis, which posits that attended stimuli are
perceived earlier than unattended stimuli (Holt & Titchener, 1909). While it is expected that
latencies will be shorter in the active than the passive condition across participants, the choir
group’s latency decrease from pre to post-test in the active condition here suggests that music
training impacted attentional processes. It could be that music training led participants to be
more attentive during the task, or that it increased the potential for acceleration in neural
processing speed for the same level of attentional engagement. Given that the choir group did not
demonstrate any improvements in syllable-in-noise response time, which would also indicate
greater attentiveness during the task, we posit that the latter explanation is more likely to be true.
Specifically, choir training increased the influence of attention on the speed of neural processing
which may be not evident in the motor response as measured by reaction time.
Of note, no effect of latency was observed during the oddball task, even though it is also an
active task and latency effects were observed during the active condition of the syllable-in-noise
task. If attention modulates latency of N1 response, and music training further enhances this
effect, then one would expect N1 latency to also decrease in the oddball task in the choir-
trained group. The lack of a latency difference between groups may relate to a ceiling effect on the
latency of responses to the stimuli in the oddball task. It also likely indicates that the ability of short-term
choir training to accelerate sensory processing speed is not consistent across all types of auditory
stimuli. Rather than a global effect on attention across stimuli, choir training may first modify
the latency of N1 selectively in response to speech sounds as presented in the syllable-in-noise
task as opposed to pure tones and white noise presented in the oddball task. Speech perception
involves top-down processing (for review, see (M. H. Davis & Johnsrude, 2007)), whereas
perception of pure tones, sounds that do not typically occur in the natural environment, may not
benefit as much from top-down filling. In line with this, Shahin et al., (Shahin et al., 2003)
observed enhancements of N1 and P2 to musical tones as compared to pure tones in professional
musicians. Speech stimuli, as used in this study, are arguably more similar to musical stimuli
than are pure tones, given their probability of occurrence in daily life. It is likely that the
attention-related reductions in N1 latency attributed to music training were present in the SIN,
but not the oddball, task because training improved only top-down modulation of sounds relevant
to the natural environment, such as speech, and not to computer-generated stimuli typically
unheard outside of a laboratory.
Together, enhancements of N1 in the Choir group across tasks demonstrate the ability of a short-
term music program to improve the early neural encoding of both speech and tones. The
observed overall effect of music training on N1 is in accordance with experimental (Zendel et al.,
2019) and cross-sectional work comparing musicians to non-musicians, citing enhanced N1
during passive tone listening (Shahin et al., 2003) and active tone listening (Baumann et al.,
2008). After habituation in a passive task, musicians as compared to non-musicians showed
enhanced N1 when presented with a brief active task, demonstrating rapid plasticity (Seppänen et
al., 2012). Yet, others report no N1 differences between musicians and non-musicians in
response to pure and piano tones, noise (Lütkenhöner et al., 2006) or harmonics (O’Brien et al.,
2015), or report reduced amplitudes in musicians (Kühnis et al., 2014). Discrepancies may be
due to differences in EEG task stimuli and design. For example, both O’Brien et al. (2015) and
Kühnis et al. (2014) used an oddball-like paradigm. It may be that N1 enhancement in musicians
observed in the context of an attention-related task produces less consistent results, and that
more research is needed to elucidate these differences. For example, N1 response decreases with
increased predictability of a stimulus (Nishihara et al., 2011; Schafer et al., 1981) (i.e., with high
repetition in an oddball paradigm). Differences in N1 may not be consistently detectable across
task designs due to the saturation of the neural response, yet more investigation is needed.
Alternatively, as proposed by Lütkenhöner et al. (2006), discrepancies
between studies may reflect differences in dipole estimation methods. Here, our results most
closely follow those of Zendel et al. (2019), whose study and EEG task design most closely resemble ours.
Change in N1 could be indicative of more synchronized discharge patterns in N1 generator
neuron populations of Heschl’s gyrus or regions of the superior temporal gyrus. This is
supported by evidence that N1 responses to speech in noise are predicted by neural phase
locking, as measured by inter-trial phase coherence (Koerner & Zhang, 2015). Specifically,
neural synchrony is positively correlated with the earlier latencies and larger amplitudes of N1
that are observed when background noise is decreased (Koerner & Zhang, 2015). The shorter
latency observed in the active condition may additionally indicate faster conduction time in these
neurons (Lister et al., 2011).
Contributions of top-down and bottom-up processing
Using multiple EEG tasks, we aimed to address the question regarding the role of top-down versus
bottom-up processing in music training-related benefits to auditory processing in general and
speech perception specifically. Studies recruiting life-long musicians have provided evidence
primarily for top-down attention modulation to improve speech processing abilities (George &
Coch, 2011; Zhang et al., 2020a). In this study, however, we provide evidence largely towards a
model of improved bottom-up processes. We notably did not observe differences between groups
in later components of the oddball task (e.g., P3a or P3b) or in the later attention-related
positivity of the syllable-in-noise task, suggesting that choir training conferred a general
advantage to encoding acoustic features, but did not modulate general attentional processes. This
is in line with N1 findings from the syllable-in-noise task, where differences between groups
were not affected by noise level. This suggests that changes observed were again due to general
enhanced processing of the target sound, rather than suppression of attention away from a
distracting noise. Importantly, however, it should be noted that, although N1 is an early
component thought to reflect basic encoding, it can still be impacted by top-down processes,
namely attention, as seen in differences in amplitude and latency when comparing active to
passive paradigms (e.g., Folyi et al., 2012). Here, we observed that choir training enhanced the
relationship between attention and sensory processing in the syllable-in-noise task, as seen in
decreased latencies in the active condition only. This suggests that choir training, while mainly
impacting bottom-up processes, may have had some impact on attention-related processing of
speech stimuli. This effect was stimulus-specific, as no latency effects were observed for N1, or
any other component, during the oddball task that involved pure tones as opposed to speech
sounds. This may reflect a more near-transfer effect of choir training, which involves speech and
not pure tones, as compared to instrumental training. It may additionally suggest that
choir singing selectively improves top-down processing of stimuli that more regularly occur in the
environment; pure tones, as compared to speech stimuli, are highly unusual outside of a
laboratory setting, as they are built from an isolated frequency. Due to their prevalence in the
natural environment, speech sounds also involve and benefit more from top-down processing
(for review, see M. H. Davis & Johnsrude, 2007) than do pure tones. Therefore, we overall provide
evidence towards improved neural encoding with some attentional modulation, suggesting that
short term choir training and long-term instrumental training may produce benefits through
different, or proportionally different, mechanisms. As noted by Patel (2014b), the proposed
mechanisms may not be mutually exclusive.
Effect of training on P3-like component
In our analysis of the P3-like component during the active syllable-in-noise task, we investigated
whether we could replicate the findings of Zendel et al. (2019). In that study, the
music group showed greater amplitude of this peak, a result interpreted as an index of
increased voluntary attention allocation similar to a P3b response. Here, we observed enhanced
amplitude in the control group in the P3-like component during the active condition of the
syllable-in-noise task. However, this difference was driven by a single participant in the control
group and thus does not reflect true differences between groups. Discrepancies between our
findings and those of Zendel et al. (2019) may simply be due to task design, as noted previously.
Zendel et al. (2019) observed a positivity peaking from 200-1000 ms in both the passive and the
active tasks, whereas in this study we were only able to reliably measure a similar component in
the active task and in a much smaller time window (~250-450 ms). This may again indicate that
the stimuli used by Zendel et al. (2019) required more effort to process and thus were more
sensitive to training-related effects.
Absence of behavioral change
Despite observed changes on early auditory encoding, we report no effect of training on
behavioral measures of speech-in-noise perception. Groups did not differ in pre- to post-training
improvements of sentence-in-noise tasks during or outside EEG recording. This is in contrast to
experimental evidence demonstrating benefits in behavioral speech-in-noise abilities after 10
weeks of choir training (Dubinsky et al., 2019) and 6 months of piano training (Zendel et al.,
2019), both in older adults. However, with the same group of participants, Fleming et al. (2019)
did not observe behavioral differences in an in-scanner task of hearing in noise. Differences
between observed behavioral speech-in-noise improvements and the results of this study may
reflect differences in tasks. Dubinsky et al. (2019) used the QuickSIN (Etymotic Research, 2001;
Killion et al., 2004), which consists of sentences embedded in 4-talker babble. Comparison of
QuickSIN and BKB-SIN, as used in this study, show greater differences between groups of
differing hearing abilities in QuickSIN as compared to BKB-SIN, a difference associated with
increased contextual cues present in the BKB-SIN that lead to better recognition in individuals
with greater hearing loss (Wilson et al., 2007). It is possible that the BKB-SIN was not sensitive
enough to pick up on potential differences resulting from a short-term training program. In
Zendel et al. (2019), stimuli consisted of 150 different monosyllabic words presented over a
4-talker babble. In contrast, the stimuli presented during EEG in this study consisted of a single
repeated syllable presented in a 2-talker babble. It is possible that the addition of two more
babble speakers, thereby increasing the difficulty, may have impacted accuracy during this task
between groups, especially as Zendel et al. (2019) found differences only during the most difficult
condition of the task (0 dB SNR), and participants in the present study performed at ceiling.
Differences in results between Zendel et al. (2019) and Fleming et al. (2019), in which the same
participants were assessed, were attributed to differences in the speech-in-noise task. The task
completed during the EEG session in Zendel et al. (2019) had lower signal-to-noise ratios than
the task presented in Fleming et al. (2019); additionally, in Zendel et al. (2019), single words
were presented in noise without context, whereas Fleming et al. (2019) presented sentences in
noise, for which participants could use contextual cues. Here, both our behavioral speech-in-
noise task (BKB-SIN) and results are more similar to those of Fleming et al. (2019), indicating that
measurement choice could explain the absence of behavioral change and that a more difficult
task may produce different results.
We also observed no behavioral change between groups on the music-in-noise task. This task is
intended to measure auditory segregation ability in the context of musical excerpts. Musicians
outperformed non-musicians in the original study of the task, and years of music training
(minimum of 2 years) predicted task performance (Coffey et al., 2019). However, no studies to our
knowledge have examined the effects of short-term music training on the MINT. Here, we show
that 12 weeks of choir training for older adults with no prior music training may not be sufficient
to provide an advantage in hearing musical excerpts in noise.
Well-being
Through qualitative assessment, participants in the choir group reported more perceived
social benefit, while participants in the passive listening group reported more perceived
emotional benefit. Group music production has been found to produce feelings of social cohesion
and group belonging (Kokotsaki & Hallam, 2011; Spychiger et al., 1993), while music listening
may help individuals regulate emotions (van Goethem & Sloboda, 2011). While individuals in
the passive listening group did participate in online group discussions about the playlists,
qualitative results here demonstrate that singing together was a more effective way to gain a
sense of social well-being. However, no observed differences were found between groups in
quantitative measures of well-being. In a recent waitlist-control study, 6 months of choir singing
was shown to reduce loneliness and improve interest in life in older adults (Johnson et al., 2020).
It may be that twelve weeks of group singing is not sufficient time to alter feelings of loneliness
and well-being outside of the immediate choir context, as was measured in this study.
Limitations
A limitation of the present study is small sample size due to high rates of attrition before and
during the intervention period. While robust statistical methods were utilized to ensure
appropriate capture of training effects, statistical methodology cannot replace the overall power
gained from larger samples.
Additionally, a possible limitation in this study is the degree to which we were able to match the
groups on programmatic aspects related to the intervention, specifically the nature and setting of
social engagement. In the passive-listening control group, participants responded to prompts and
collectively discussed playlists on an online platform and were encouraged to attend specific in-
person concerts with the research team and other participants. Thus, social engagement between
participants was encouraged and facilitated. However, this type of engagement differed from the
social activity experienced by participants in the choir group, where participants worked together
towards the common goal of a cohesive musical sound. This difference may have contributed to
the observed qualitative well-being or auditory processing findings. Additionally, while we
believe that matching of auditory-based interventions was a reasonable method of control, we do
acknowledge that differences in social setting and differential enhancements in social
functioning could have benefitted cognitive abilities and subsequently impacted auditory
processing.
Conclusions
In older adults, age-related declines in speech-in-noise abilities may significantly disrupt daily
communication and overall well-being. Underlying such declines are hypothesized reductions in
neural conduction speeds and population synchrony of neurons in the auditory cortex. Auditory
training programs have shown to improve speech-in-noise abilities (for review, see (Boothroyd,
2007)), but are frequently expensive, time-consuming, and require high consistency and
motivation. Singing is a low-cost activity that is often fun and engaging, and thus may be easier
to implement and maintain across a variety of situations. Here, we observed that 12 weeks of
choir singing produces enhancements in early sound encoding, as seen in earlier latencies and
larger amplitudes of the N1 response, in a group of older adults with mild subjective hearing
loss. Enhanced N1 response may reflect more synchronized firing and accelerated conduction
velocity in regions of the auditory cortex that are involved in processing of speech and music.
Thus, using a randomized-control design, we provide experimental evidence for the efficacy of a
low-cost, non-invasive method to improve neural processing of speech, specifically early sound
encoding, in individuals who are particularly vulnerable to declines in such abilities due to age.
Additionally, we demonstrate that group singing, through its socially engaging nature, may
improve certain indices of well-being. Importantly, the use of an active control demonstrates that
advantages conferred to the choir group were related specifically to group music production,
rather than passive music listening. Our findings diverge from previous investigations in that
behavioral improvements in speech-in-noise abilities were not observed, likely due to differences
in measurement method. Future work utilizing a variety of hearing-in-noise tasks in a larger
sample could provide clarification.
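
For readers less familiar with peak-based ERP measures, the sketch below illustrates in R how an N1 amplitude and latency of the kind reported here can be read off an averaged waveform: the most negative point within a post-stimulus search window. The waveform, sampling grid, and 80-180 ms window are fabricated for illustration; the actual analyses used the EEGLAB/ERPLAB tools cited in this study (Delorme & Makeig, 2004; Lopez-Calderon & Luck, 2014).

# Synthetic averaged waveform with N1- and P2-like deflections
t_ms <- seq(-100, 500, by = 2)                   # sample times (ms)
erp  <- -1.2 * exp(-(t_ms - 110)^2 / 800) +      # N1-like trough near 110 ms
         1.5 * exp(-(t_ms - 200)^2 / 1500)       # P2-like peak near 200 ms

win    <- t_ms >= 80 & t_ms <= 180               # assumed N1 search window
n1_amp <- min(erp[win])                          # peak amplitude (most negative point)
n1_lat <- t_ms[win][which.min(erp[win])]         # peak latency (ms)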
Author Contributions
Project was conceptualized by AH and SH. AH acquired funding. SH and AW curated data and
performed project administration. SH performed formal analysis, and RW provided critical
revisions. SH and AH drafted the manuscript, with revisions from RW and AW. All authors
approved the final version of the manuscript for submission.
Conflict of Interest
The authors have no conflicts of interest to declare.
Funding
This project was funded by a grant from the Southern California Clinical and Translational
Science Institute awarded to A. Habibi.
Conclusions
In this thesis, I explored the role of music training in speech-in-noise perception across the
lifespan. In Chapter 1, meta-analytic results from a multi-level model of 31 studies and 61 effect
sizes demonstrated that long-term musicians perform significantly better on speech-in-noise
tasks than non-musicians, and that this effect was particularly strong in older adults. In Chapter
2, I examined whether such cross-sectional findings could be replicated in a short-term
randomized-control trial with older adult participants without prior music experience.
Participants who engaged in 12 weeks of community choir class showed significantly enhanced
neural encoding of auditory stimuli, as evidenced by increased N1 amplitude and decreased N1
latency in the passive and active conditions of a syllable-in-noise task, respectively, and by larger
N1 amplitude in an oddball task. Combined, these two studies provide cross-sectional and
longitudinal evidence for the role of music training in enhancing speech-in-noise perception in
adults and older adults.
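
For readers who wish to reproduce the Chapter 1 analysis strategy, the sketch below shows how a three-level meta-analytic model (sampling error at level 1, effect sizes nested within studies at level 2, between-study variance at level 3) can be fit with rma.mv from the metafor package, following the step-by-step tutorial of Assink and Wibbelink (2016). The data frame and all of its values are fabricated placeholders; Chapter 1 reports the actual data and model.

library(metafor)

# Fabricated example: six effect sizes (Hedges' g) nested in three studies
dat <- data.frame(
  study_id = c(1, 1, 2, 2, 3, 3),                    # nesting structure
  es_id    = 1:6,                                    # unique effect-size ID
  g        = c(0.40, 0.55, 0.10, 0.25, 0.60, 0.35),  # effect sizes
  var_g    = c(0.04, 0.05, 0.03, 0.04, 0.06, 0.05)   # sampling variances
)

# Random intercepts for studies and for effect sizes within studies
model <- rma.mv(yi = g, V = var_g,
                random = ~ 1 | study_id / es_id,
                data = dat, method = "REML")
summary(model)  # pooled g plus within- and between-study variance components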
References
Anaya, E. M., Pisoni, D. B., & Kronenberger, W. G. (2016). Long-term musical experience and
auditory and visual perceptual abilities under adverse conditions. The Journal of the
Acoustical Society of America. https://doi.org/10.1121/1.4962628
Anderer, P., Semlitsch, H. V., & Saletu, B. (1996). Multichannel auditory event-related brain
potentials: Effects of normal aging on the scalp distribution of N1, P2, N2 and P300
latencies and amplitudes. Electroencephalography and Clinical Neurophysiology.
https://doi.org/10.1016/S0013-4694(96)96518-9
Anderson, S., White-Schwoch, T., Parbery-Clark, A., & Kraus, N. (2013a). A dynamic auditory-
cognitive system supports speech-in-noise perception in older adults. Hearing Research,
300, 18–32. https://doi.org/10.1016/j.heares.2013.03.006
Anderson, S., White-Schwoch, T., Parbery-Clark, A., & Kraus, N. (2013b). A dynamic auditory-
cognitive system supports speech-in-noise perception in older adults. Hearing Research,
300, 18–32. https://doi.org/10.1016/j.heares.2013.03.006
Assink, M., & Wibbelink, C. (2016). Fitting three-level meta-analytic models in R: A step-by-
step tutorial. The Quantitative Methods for Psychology, 12(3), 154–174.
https://doi.org/10.20982/tqmp.12.3.p154
Atienza, M., Cantero, J. L., & Escera, C. (2001). Auditory information processing during human
sleep as revealed by event-related brain potentials. Clinical Neurophysiology.
https://doi.org/10.1016/S1388-2457(01)00650-2
Bahramali, H., Gordon, E., Lagopoulos, J., Lim, C. L., Li, W., Leslie, J., & Wright, J. (1999).
The effects of age on late components of the ERP and reaction time. Experimental Aging
Research, 25(1), 69–80. https://doi.org/10.1080/036107399244147
Barrett, G., Neshige, R., & Shibasaki, H. (1987). Human auditory and somatosensory event-
related potentials: effects of response condition and age. Electroencephalography and
Clinical Neurophysiology. https://doi.org/10.1016/0013-4694(87)90210-0
Başkent, D., & Gaudrain, E. (2016). Musician advantage for speech-on-speech perception. The
Journal of the Acoustical Society of America, 139(3), EL51–EL56.
https://doi.org/10.1121/1.4942628
Baujat, B., Mahé, C., Pignon, J. P., & Hill, C. (2002). A graphical method for exploring
heterogeneity in meta-analyses: Application to a meta-analysis of 65 trials. Statistics in
Medicine, 21(18), 2641–2652. https://doi.org/10.1002/sim.1221
Baumann, S., Meyer, M., & Jäncke, L. (2008). Enhancement of auditory-evoked potentials in
musicians reflects an influence of expertise but not selective attention. Journal of Cognitive
Neuroscience. https://doi.org/10.1162/jocn.2008.20157
Bench, J., Kowal, A., & Bamford, J. (1979). The BKB (Bamford-Kowal-Bench) sentence lists
for partially-hearing children. British Journal of Audiology.
https://doi.org/10.3109/03005367909078884
Besser, J., Festen, J. M., Goverts, S. T., Kramer, S. E., & Pichora-Fuller, M. K. (2015). Speech-
in-Speech Listening on the LiSN-S Test by Older Adults With Good Audiograms Depends
on Cognition and Hearing Acuity at High Frequencies. Ear & Hearing, 36(1), 24–41.
https://doi.org/10.1097/AUD.0000000000000096
Bidelman, G. M., & Alain, C. (2015). Musical training orchestrates coordinated neuroplasticity
in auditory brainstem and cortex to counteract age-related declines in categorical vowel
perception. Journal of Neuroscience. https://doi.org/10.1523/JNEUROSCI.3292-14.2015
Bidelman, G. M., Villafuerte, J. W., Moreno, S., & Alain, C. (2014). Age-related changes in the
subcortical-cortical encoding and categorical perception of speech. Neurobiology of Aging,
35(11), 2526–2540. https://doi.org/10.1016/j.neurobiolaging.2014.05.006
Bidelman, G. M., Weiss, M. W., Moreno, S., & Alain, C. (2014). Coordinated plasticity in
brainstem and auditory cortex contributes to enhanced categorical speech perception in
musicians. The European Journal of Neuroscience, 40(4), 2662–2673.
https://doi.org/10.1111/ejn.12627
Bidelman, G. M., & Yoo, J. (2020). Musicians Show Improved Speech Segregation in
Competitive, Multi-Talker Cocktail Party Scenarios. Frontiers in Psychology, 11.
https://doi.org/10.3389/fpsyg.2020.01927
Billings, C. J., McMillan, G. P., Penman, T. M., & Gille, S. M. (2013). Predicting perception in
noise using cortical auditory evoked potentials. JARO - Journal of the Association for
Research in Otolaryngology, 14(6), 891–903. https://doi.org/10.1007/s10162-013-0415-y
Boebinger, D., Evans, S., Rosen, S., Lima, C. F., Manly, T., & Scott, S. K. (2015). Musicians
and non-musicians are equally adept at perceiving masked speech. The Journal of the
Acoustical Society of America, 137(1), 378–387. https://doi.org/10.1121/1.4904537
Boothroyd, A. (2007). Adult Aural Rehabilitation: What Is It and Does It Work? Trends in
Amplification. https://doi.org/10.1177/1084713807301073
Casey, J. A., Morello-Frosch, R., Mennitt, D. J., Fristrup, K., Ogburn, E. L., & James, P. (2017).
Race/ethnicity, socioeconomic status, residential segregation, and spatial variation in noise
exposure in the contiguous United States. Environmental Health Perspectives.
https://doi.org/10.1289/EHP898
Castillo-Eito, L., Armitage, C. J., Norman, P., Day, M. R., Dogru, O. C., & Rowe, R. (2020).
How can adolescent aggression be reduced? A multi-level meta-analysis. In Clinical
psychology review (Vol. 78, p. 101853). NLM (Medline).
https://doi.org/10.1016/j.cpr.2020.101853
Čeponiene, R., Westerfield, M., Torki, M., & Townsend, J. (2008). Modality-specificity of
sensory aging in vision and audition: Evidence from event-related potentials. Brain
Research. https://doi.org/10.1016/j.brainres.2008.02.010
Chambers, R. D. (1992). Differential age effects for components of the adult auditory middle
latency response. Hearing Research. https://doi.org/10.1016/0378-5955(92)90122-4
Champely, S. (2020). pwr: Basic Functions for Power Analysis.
Chan, A. S., Ho, Y. C., & Cheung, M. C. (1998). Music training improves verbal memory.
Nature, 396(6707), 128. https://doi.org/10.1038/24075
Cheung, M. W. L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A
structural equation modeling approach. Psychological Methods, 19(2), 211–229.
https://doi.org/10.1037/a0032968
Chung, K. (2004). Challenges and Recent Developments in Hearing Aids: Part I. Speech
Understanding in Noise, Microphone Technologies and Noise Reduction Algorithms.
Trends in Amplification, 8(3), 83–124. https://doi.org/10.1177/108471380400800302
Clayton, K. K., Swaminathan, J., Yazdanbakhsh, A., Zuk, J., Patel, A. D., & Kidd, G. (2016).
Executive function, visual attention and the cocktail party problem in musicians and non-
musicians. PLoS ONE, 11(7), e0157638. https://doi.org/10.1371/journal.pone.0157638
Coffey, E. B. J., Arseneau-Bruneau, I., Zhang, X., & Zatorre, R. J. (2019). The Music-In-Noise
Task (MINT): A Tool for Dissecting Complex Auditory Perception. Frontiers in
Neuroscience. https://doi.org/10.3389/fnins.2019.00199
Coffey, E. B. J., Chepesiuk, A. M. P., Herholz, S. C., Baillet, S., & Zatorre, R. J. (2017). Neural
correlates of early sound encoding and their relationship to speech-in-noise perception.
Frontiers in Neuroscience, 11(AUG), 479. https://doi.org/10.3389/fnins.2017.00479
Coffey, E. B. J., Mogilever, N. B., & Zatorre, R. J. (2017). Speech-in-noise perception in
musicians: A review. In Hearing Research (Vol. 352, pp. 49–69). Elsevier B.V.
https://doi.org/10.1016/j.heares.2017.02.006
Coull, J. T. (1998). Neural correlates of attention and arousal: Insights from electrophysiology,
functional neuroimaging and psychopharmacology. Progress in Neurobiology, 55(4), 343–
361. https://doi.org/10.1016/S0301-0082(98)00011-2
Coyle, S., Gordon, E., Howson, A., & Meares, R. (1991). The effects of age on auditory event-
related potentials. Experimental Aging Research, 17(2), 103–111.
https://doi.org/10.1080/03610739108253889
Crowley, K. E., & Colrain, I. M. (2004). A review of the evidence for P2 being an independent
component process: Age, sleep and modality. In Clinical Neurophysiology.
https://doi.org/10.1016/j.clinph.2003.11.021
Davis, H., & Zerlin, S. (1966). Acoustic Relations of the Human Vertex Potential. Journal of the
Acoustical Society of America. https://doi.org/10.1121/1.1909858
Davis, M. H., & Johnsrude, I. S. (2007). Hearing speech sounds: Top-down influences on the
interface between audition and speech perception. Hearing Research, 229(1–2), 132–147.
https://doi.org/10.1016/j.heares.2007.01.014
de Carvalho, N. G., Novelli, C. V. L., & Colella-Santos, M. F. (2017). Evaluation of speech in
noise abilities in school children. International Journal of Pediatric Otorhinolaryngology,
99. https://doi.org/10.1016/j.ijporl.2017.05.019
de Jong-Gierveld, J., & Kamphuis, F. (1985). The Development of a Rasch-Type Loneliness
Scale. Applied Psychological Measurement. https://doi.org/10.1177/014662168500900307
Delorme, A., & Makeig, S. (2004). EEGLAB: An open source toolbox for analysis of single-trial
EEG dynamics including independent component analysis. Journal of Neuroscience
Methods. https://doi.org/10.1016/j.jneumeth.2003.10.009
Donai, J. J., & Jennings, M. B. (2016). Gaps-in-noise detection and gender identification from
noise-vocoded vowel segments: Comparing performance of active musicians to non-
musicians. The Journal of the Acoustical Society of America, 139(5), EL128–EL134.
https://doi.org/10.1121/1.4947070
Dryden, A., Allen, H. A., Henshaw, H., & Heinrich, A. (2017). The Association Between
Cognitive Performance and Speech-in-Noise Perception for Adult Listeners: A Systematic
Literature Review and Meta-Analysis. In Trends in Hearing (Vol. 21).
https://doi.org/10.1177/2331216517744675
Du, Y., & Zatorre, R. J. (2017). Musical training sharpens and bonds ears and tongue to hear
speech better. Proceedings of the National Academy of Sciences of the United States of
America, 114(51), 13579–13584. https://doi.org/10.1073/pnas.1712223114
Dubinsky, E., Wood, E. A., Nespoli, G., & Russo, F. A. (2019). Short-Term Choir Singing
Supports Speech-in-Noise Perception and Neural Pitch Strength in Older Adults With Age-
Related Hearing Loss. Frontiers in Neuroscience, 13, 1153.
https://doi.org/10.3389/fnins.2019.01153
Elpus, K., & Abril, C. R. (2011). High school music ensemble students in the United States: A
demographic profile. Journal of Research in Music Education.
https://doi.org/10.1177/0022429411405207
Erwin, R., & Buchwald, J. S. (1986). Midlatency auditory evoked responses: Differential effects
of sleep in the human. Electroencephalography and Clinical Neurophysiology/ Evoked
Potentials. https://doi.org/10.1016/0168-5597(86)90017-1
Erwin, R. J., & Buchwald, J. S. (1986). Midlatency auditory evoked responses: Differential
recovery cycle characteristics. Electroencephalography and Clinical Neurophysiology.
https://doi.org/10.1016/0013-4694(86)90075-1
Escobar, J., Mussoi, B. S., & Silberer, A. B. (2019). The Effect of Musical Training and
Working Memory in Adverse Listening Situations. Ear and Hearing.
https://doi.org/10.1097/AUD.0000000000000754
Etymotic Research. (2001). QuickSIN Speech-in-Noise Test (Version 1.3) User Manual.
Etymotic Research Inc.
Fernández-Castilla, B., Jamshidi, L., Declercq, L., Beretvas, S. N., Onghena, P., & Van den
Noortgate, W. (2020). The application of meta-analytic (multi-level) models with multiple
random effects: A systematic review. Behavior Research Methods, 52(5), 2031–2052.
https://doi.org/10.3758/s13428-020-01373-9
Fleming, D., Belleville, S., Peretz, I., West, G., & Zendel, B. R. (2019). The effects of short-term
musical training on the neural processing of speech-in-noise in older adults. Brain and
Cognition. https://doi.org/10.1016/j.bandc.2019.103592
Folyi, T., Fehér, B., & Horváth, J. (2012). Stimulus-focused attention speeds up auditory
processing. International Journal of Psychophysiology.
https://doi.org/10.1016/j.ijpsycho.2012.02.001
Fostick, L. (2019). Card playing enhances speech perception among aging adults: comparison
with aging musicians. European Journal of Ageing, 16(4), 481–489.
https://doi.org/10.1007/s10433-019-00512-2
Fuller, C. D., Galvin, J. J., Maat, B., Free, R. H., & Başkent, D. (2014). The musician effect:
Does it persist under degraded pitch conditions of cochlear implant simulations? Frontiers
in Neuroscience, 8(8 JUN). https://doi.org/10.3389/fnins.2014.00179
George, E. M., & Coch, D. (2011). Music training and working memory: An ERP study.
Neuropsychologia, 49(5), 1083–1094.
https://doi.org/10.1016/j.neuropsychologia.2011.02.001
Habibi, A., Cahn, B. R., Damasio, A., & Damasio, H. (2016). Neural correlates of accelerated
auditory processing in children engaged in music training. Developmental Cognitive
Neuroscience. https://doi.org/10.1016/j.dcn.2016.04.003
Habibi, A., Damasio, A., Ilari, B., Veiga, R., Joshi, A. A., Leahy, R. M., Haldar, J. P.,
Varadarajan, D., Bhushan, C., & Damasio, H. (2018). Childhood music training induces
change in micro and macroscopic brain structure: Results from a longitudinal study.
Cerebral Cortex, 28(12), 4336–4347. https://doi.org/10.1093/cercor/bhx286
Harrer, M., Cuijpers, P., Furukawa, T. A., & Ebert, D. D. (2019). Doing Meta-Analysis in R: A
Hands-on Guide. PROTECT Lab Erlangen. https://doi.org/10.5281/zenodo.2551803
Harrer, M., Cuijpers, P., Furukawa, T. A., & Ebert, D. D. (2019). dmetar: Companion R
Package For The Guide "Doing Meta-Analysis in R."
Hedges, L., & Olkin, I. (1985). Statistical Models for Meta-Analysis. Academic Press.
Hegerl, U., Gallinat, J., & Mrowinski, D. (1994). Intensity dependence of auditory evoked dipole
source activity. International Journal of Psychophysiology, 17(1), 1–13.
https://doi.org/10.1016/0167-8760(94)90050-7
Herrmann, B., Henry, M. J., Johnsrude, I. S., & Obleser, J. (2016). Altered temporal dynamics of
neural adaptation in the aging human auditory cortex. Neurobiology of Aging, 45, 10–22.
https://doi.org/10.1016/j.neurobiolaging.2016.05.006
Hillyard, S. A., Hink, R. F., Schwent, V. L., & Picton, T. W. (1973). Electrical signs of selective
attention in the human brain. Science, 182(4108), 177–180.
https://doi.org/10.1126/science.182.4108.177
Holt, E. B., & Titchener, E. B. (1909). Lectures on the Elementary Psychology of Feeling and
Attention. The Philosophical Review. https://doi.org/10.2307/2177879
Hox, J. J. (2010). Multilevel analysis: Techniques and applications. Routledge.
Huang, Q., & Tang, J. (2010). Age-related hearing loss or presbycusis. In European Archives of
Oto-Rhino-Laryngology. https://doi.org/10.1007/s00405-010-1270-7
Hyde, K. L., Lerch, J., Norton, A., Forgeard, M., Winner, E., Evans, A. C., & Schlaug, G.
(2009). Musical training shapes structural brain development. Journal of Neuroscience,
29(10), 3019–3025. https://doi.org/10.1523/JNEUROSCI.5118-08.2009
Johnson, J. K., Stewart, A. L., Acree, M., Nápoles, A. M., Flatt, J. D., Max, W. B., & Gregorich,
S. E. (2020). A Community Choir Intervention to Promote Well-Being among Diverse
Older Adults: Results from the Community of Voices Trial. Journals of Gerontology -
Series B Psychological Sciences and Social Sciences, 75(3), 549–559.
https://doi.org/10.1093/geronb/gby132
Julayanont, P., & Nasreddine, Z. S. (2016). Montreal Cognitive Assessment (MoCA): Concept
and clinical review. In Cognitive Screening Instruments: A Practical Approach.
https://doi.org/10.1007/978-3-319-44775-9_7
Kaplan-Neeman, R., Kishon-Rabin, L., Henkin, Y., & Muchnik, C. (2006). Identification of
syllables in noise: Electrophysiological and behavioral correlates. The Journal of the
Acoustical Society of America. https://doi.org/10.1121/1.2217567
Killion, M. C. (1997). Hearing aids: Past, present, future: Moving toward normal conversations
in noise. In British Journal of Audiology (Vol. 31, Issue 3, pp. 141–148). Whurr Publishers
Ltd. https://doi.org/10.3109/03005364000000016
Killion, M. C., Niquette, P. A., Gudmundsen, G. I., Revit, L. J., & Banerjee, S. (2004).
Development of a quick speech-in-noise test for measuring signal-to-noise ratio loss in
normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of
America. https://doi.org/10.1121/1.1784440
Knight, R. T., Hillyard, S. A., Woods, D. L., & Neville, H. J. (1980). The effects of frontal and
temporal-parietal lesions on the auditory evoked potential in man. Electroencephalography
and Clinical Neurophysiology. https://doi.org/10.1016/0013-4694(80)90328-4
Knight, R. T., Scabini, D., Woods, D. L., & Clayworth, C. C. (1989). Contributions of temporal-
parietal junction to the human auditory P3. Brain Research. https://doi.org/10.1016/0006-
8993(89)90466-6
Koerner, T. K., & Zhang, Y. (2015). Effects of background noise on inter-trial phase coherence
and auditory N1-P2 responses to speech stimuli. Hearing Research, 328, 113–119.
https://doi.org/10.1016/j.heares.2015.08.002
Koerner, T. K., & Zhang, Y. (2018). Differential effects of hearing impairment and age on
electrophysiological and behavioral measures of speech in noise. Hearing Research.
https://doi.org/10.1016/j.heares.2018.10.009
Kok, A. (2001). On the utility of P3 amplitude as a measure of processing capacity.
Psychophysiology. https://doi.org/10.1017/S0048577201990559
Kokotsaki, D., & Hallam, S. (2011). The perceived benefits of participative music making for
non-music university students: A comparison with music students. Music Education
Research. https://doi.org/10.1080/14613808.2011.577768
Kraus, N., Bradlow, A. R., Cheatham, M. A., Cunningham, J., King, C. D., Koch, D. B., Nicol,
T. G., McGee, T. J., Stein, L. K., & Wright, B. A. (2000). Consequences of neural
asynchrony: A case of auditory neuropathy. JARO - Journal of the Association for Research
in Otolaryngology, 1(1), 33–45. https://doi.org/10.1007/s101620010004
Kühnis, J., Elmer, S., & Jäncke, L. (2014). Auditory evoked responses in musicians during
passive vowel listening are modulated by functional connectivity between bilateral
auditory-related brain regions. Journal of Cognitive Neuroscience.
https://doi.org/10.1162/jocn_a_00674
Li, C. M., Zhang, X., Hoffman, H. J., Cotch, M. F., Themann, C. L., & Wilson, M. R. (2014).
Hearing impairment associated with depression in US adults, National Health and Nutrition
Examination Survey 2005-2010. JAMA Otolaryngology - Head and Neck Surgery, 140(4),
293–302. https://doi.org/10.1001/jamaoto.2014.42
Liégeois-Chauvel, C., Musolino, A., Badier, J. M., Marquis, P., & Chauvel, P. (1994). Evoked
potentials recorded from the auditory cortex in man: evaluation and topography of the
middle latency components. Electroencephalography and Clinical Neurophysiology/
Evoked Potentials. https://doi.org/10.1016/0168-5597(94)90064-7
Lin, F. R., Metter, E. J., O’Brien, R. J., Resnick, S. M., Zonderman, A. B., & Ferrucci, L. (2011).
Hearing loss and incident dementia. Archives of Neurology, 68(2), 214–220.
https://doi.org/10.1001/archneurol.2010.362
Lister, J. J., Maxfield, N. D., Pitt, G. J., & Gonzalez, V. B. (2011). Auditory evoked response to
gaps in noise: Older adults. International Journal of Audiology.
https://doi.org/10.3109/14992027.2010.526967
Lo, C. Y., Looi, V., Thompson, W. F., & McMahon, C. M. (2020). Music Training for Children
With Sensorineural Hearing Loss Improves Speech-in-Noise Perception. Journal of Speech,
Language, and Hearing Research, 63(6), 1990–2015. https://doi.org/10.1044/2020_JSLHR-
19-00391
Lopez-Calderon, J., & Luck, S. J. (2014). ERPLAB: An open-source toolbox for the analysis of
event-related potentials. Frontiers in Human Neuroscience.
https://doi.org/10.3389/fnhum.2014.00213
Lotfi, Y., Mehrkian, S., Moossavi, A., & Faghih-Zadeh, S. (2009). Quality of life improvement
in hearing-impaired elderly people after wearing a hearing aid. Archives of Iranian
Medicine, 12(4).
Luck, S. J. (2014). An Introduction to the Event-Related Potential Technique (2nd ed.). MIT
Press. https://mitpress.mit.edu/books/introduction-event-related-potential-technique
Lüdecke, D. (2019). esc: Effect Size Computation for Meta Analysis (Version 0.5.1).
https://doi.org/10.5281/zenodo.1249218
Lütkenhöner, B., Seither-Preisler, A., & Seither, S. (2006). Piano tones evoke stronger magnetic
fields than pure tones or noise, both in musicians and non-musicians. NeuroImage, 30(3),
927–937. https://doi.org/10.1016/j.neuroimage.2005.10.034
Madsen, S. M. K., Marschall, M., Dau, T., & Oxenham, A. J. (2019). Speech perception is
similar for musicians and non-musicians across a wide range of conditions. Scientific
Reports. https://doi.org/10.1038/s41598-019-46728-1
Madsen, S. M. K., Whiteford, K. L., & Oxenham, A. J. (2017). Musicians do not benefit from
differences in fundamental frequency when listening to speech in competing speech
backgrounds. Scientific Reports. https://doi.org/10.1038/s41598-017-12937-9
Mair, P., & Wilcox, R. (2020). Robust statistical methods in R using the WRS2 package.
Behavior Research Methods. https://doi.org/10.3758/s13428-019-01246-w
Mankel, K., & Bidelman, G. M. (2018). Inherent auditory skills rather than formal music training
shape the neural encoding of speech. Proceedings of the National Academy of Sciences of
the United States of America, 115(51), 13129–13134.
https://doi.org/10.1073/pnas.1811793115
Mazelová, J., Popelar, J., & Syka, J. (2003). Auditory function in presbycusis: Peripheral vs.
central changes. Experimental Gerontology, 38(1–2), 87–94. https://doi.org/10.1016/S0531-
5565(02)00155-9
McKee, M. M., Meade, M. A., Zazove, P., Stewart, H. J., Jannausch, M. L., & Ilgen, M. A.
(2019). The Relationship Between Hearing Loss and Substance Use Disorders Among
Adults in the U.S. American Journal of Preventive Medicine, 56(4), 586–590.
https://doi.org/10.1016/j.amepre.2018.10.026
Meha-Bettison, K., Sharma, M., Ibrahim, R. K., & Mandikal Vasuki, P. R. (2018). Enhanced
speech perception in noise and cortical auditory evoked potentials in professional
musicians. International Journal of Audiology, 57(1), 40–52.
https://doi.org/10.1080/14992027.2017.1380850
Menning, H., Roberts, L. E., & Pantev, C. (2000). Plastic changes in the auditory cortex induced
by intensive frequency discrimination training. NeuroReport.
https://doi.org/10.1097/00001756-200003200-00032
Moore, D. R., Edmondson-Jones, M., Dawes, P., Fortnum, H., McCormack, A., Pierzycki, R. H.,
& Munro, K. J. (2014). Relation between speech-in-noise threshold, hearing loss and
cognition from 40-69 years of age. PLoS ONE.
https://doi.org/10.1371/journal.pone.0107720
Moreno, S., Bialystok, E., Barac, R., Schellenberg, E. G., Cepeda, N. J., & Chau, T. (2011).
Short-term music training enhances verbal intelligence and executive function.
Psychological Science, 22(11), 1425–1433. https://doi.org/10.1177/0956797611416999
Müllensiefen, D., Gingras, B., Musil, J., & Stewart, L. (2014). The musicality of non-musicians:
An index for assessing musical sophistication in the general population. In PLoS ONE.
https://doi.org/10.1371/journal.pone.0089642
Mulrow, C. D., Aguilar, C., Endicott, J. E., Tuley, M. R., Velez, R., Charlip, W. S., Rhodes, M.
C., Hill, J. A., & Denino, L. A. (1990). Quality-of-life changes and hearing impairment: A
randomized trial. Annals of Internal Medicine, 113. http://annals.org/
Musacchia, G., Sams, M., Skoe, E., & Kraus, N. (2007). Musicians have enhanced subcortical
auditory and audiovisual processing of speech and music. Proceedings of the National
Academy of Sciences of the United States of America.
https://doi.org/10.1073/pnas.0701498104
Musacchia, G., Strait, D., & Kraus, N. (2008). Relationships between behavior, brainstem and
cortical encoding of seen and heard speech in musicians and non-musicians. Hearing
Research, 241(1–2), 34–42. https://doi.org/10.1016/j.heares.2008.04.013
Näätänen, R., & Picton, T. (1987). The N1 Wave of the Human Electric and Magnetic Response
to Sound: A Review and an Analysis of the Component Structure. Psychophysiology, 24(4),
375–425. https://doi.org/10.1111/j.1469-8986.1987.tb00311.x
Näätänen, R., & Winkler, I. (1999). The concept of auditory stimulus representation in cognitive
neuroscience. Psychological Bulletin. https://doi.org/10.1037/0033-2909.125.6.826
National Institute on Deafness and Other Communication Disorders (NIDCD). (2016). Quick
Statistics About Hearing | NIDCD. National Institutes of Health.
https://www.nidcd.nih.gov/health/statistics/quick-statistics-hearing#6
Neff, D. L., & Green, D. M. (1987). Masking produced by spectral uncertainty with
multicomponent maskers. Perception & Psychophysics.
https://doi.org/10.3758/BF03203033
Niquette, P., Arcaroli, J., Revit, L., Parkinson, A., Staller, S., Skinner, M., & Killion, M. (2003).
Development of the BKB-SIN test. Annual Meeting of the American Auditory Society,
Scottsdale, AZ.
Nishihara, M., Inui, K., Motomura, E., Otsuru, N., Ushida, T., & Kakigi, R. (2011). Auditory N1
as a change-related automatic response. Neuroscience Research, 71(2), 145–148.
https://doi.org/10.1016/j.neures.2011.07.004
O’Brien, J. L., Nikjeh, D. A., & Lister, J. J. (2015). Interaction of Musicianship and Aging: A
Comparison of Cortical Auditory Evoked Potentials. Behavioural Neurology.
https://doi.org/10.1155/2015/545917
Oates, P. A., Kurtzberg, D., & Stapells, D. R. (2002). Effects of sensorineural hearing loss on
cortical event-related potential and behavioral measures of speech-sound processing. Ear
and Hearing, 23(5), 399–415. https://doi.org/10.1097/00003446-200210000-00002
Okamoto, H., Stracke, H., Wolters, C. H., Schmael, F., & Pantev, C. (2007). Attention improves
population-level frequency tuning in human auditory cortex. Journal of Neuroscience,
27(39), 10383–10390. https://doi.org/10.1523/JNEUROSCI.2963-07.2007
Oostenveld, R., & Praamstra, P. (2001). The five percent electrode system for high-resolution
EEG and ERP measurements. Clinical Neurophysiology. https://doi.org/10.1016/S1388-
2457(00)00527-7
Pantev, C., & Herholz, S. C. (2011). Plasticity of the human auditory cortex related to musical
training. In Neuroscience and Biobehavioral Reviews.
https://doi.org/10.1016/j.neubiorev.2011.06.010
Parasuraman, R., & Beatty, J. (1980). Brain events underlying detection and recognition of weak
sensory signals. Science, 210(4465), 80–83. https://doi.org/10.1126/science.7414324
Parbery-Clark, A., Strait, D. L., & Kraus, N. (2011). Context-dependent encoding in the auditory
brainstem subserves enhanced speech-in-noise perception in musicians. Neuropsychologia,
49(12), 3338–3345. https://doi.org/10.1016/j.neuropsychologia.2011.08.007
Parbery-Clark, A., Tierney, A., Strait, D. L., & Kraus, N. (2012). Musicians have fine-tuned
neural distinction of speech syllables. Neuroscience, 219, 111–119.
https://doi.org/10.1016/j.neuroscience.2012.05.042
Parbery-Clark, A., Anderson, S., Hittner, E., & Kraus, N. (2012). Musical experience
offsets age-related delays in neural timing. Neurobiology of Aging, 33(7), 1483.e1-1483.e4.
https://doi.org/10.1016/j.neurobiolaging.2011.12.015
Parbery-Clark, A., Skoe, E., Lam, C., & Kraus, N. (2009). Musician enhancement for
speech-In-noise. Ear and Hearing, 30(6), 653–661.
https://doi.org/10.1097/AUD.0b013e3181b412e9
Parbery-Clark, A., Strait, D. L., Anderson, S., Hittner, E., & Kraus, N. (2011). Musical
experience and the aging auditory system: Implications for cognitive abilities and hearing
speech in noise. PLoS ONE, 6(5), e18082. https://doi.org/10.1371/journal.pone.0018082
Parbery-Clark, A., Strait, D. L., Hittner, E., & Kraus, N. (2013). Musical training
enhances neural processing of binaural sounds. Journal of Neuroscience.
https://doi.org/10.1523/JNEUROSCI.5700-12.2013
Parry, D. A., Davidson, B. I., Sewall, C., Fisher, J. T., Mieczkowski, H., & Quintana, D. (n.d.). A
Systematic Review and Meta-Analysis of Discrepancies Between Logged and Self-Reported
Digital Media Use. https://doi.org/10.31234/OSF.IO/F6XVZ
Parthasarathy, A., Hancock, K. E., Bennett, K., Degruttola, V., & Polley, D. B. (2020). Bottom-
up and top-down neural signatures of disordered multi-talker speech perception in adults
with normal hearing. ELife. https://doi.org/10.7554/eLife.51419
Patel, A. D. (2011). Why would musical training benefit the neural encoding of speech? The
OPERA hypothesis. Frontiers in Psychology, 2(JUN).
https://doi.org/10.3389/fpsyg.2011.00142
Patel, A. D. (2014a). Can nonlinguistic musical training change the way the brain processes
speech? The expanded OPERA hypothesis. In Hearing Research (Vol. 308, pp. 98–108).
https://doi.org/10.1016/j.heares.2013.08.011
Patel, A. D. (2014b). Can nonlinguistic musical training change the way the brain processes
speech? The expanded OPERA hypothesis. In Hearing Research (Vol. 308, pp. 98–108).
https://doi.org/10.1016/j.heares.2013.08.011
Picton, T. W., Hillyard, S. A., Krausz, H. I., & Galambos, R. (1974). Human auditory evoked
potentials. I: Evaluation of components. Electroencephalography and Clinical
Neurophysiology. https://doi.org/10.1016/0013-4694(74)90155-2
Picton, T. W. (1992). The P300 wave of the human event-related potential. Journal of
Clinical Neurophysiology. https://doi.org/10.1097/00004691-199210000-00002
Picton, T. W., Stuss, D. T., Champagne, S. C., & Nelson, R. F. (1984). The Effects of Age
on Human Event‐Related Potentials. Psychophysiology. https://doi.org/10.1111/j.1469-
8986.1984.tb02941.x
Pienkowski, M. (2017). On the Etiology of Listening Difficulties in Noise Despite Clinically
Normal Audiograms. Ear and Hearing, 38(2), 135–148.
https://doi.org/10.1097/AUD.0000000000000388
Polich, J. (1997). EEG and ERP assessment of normal aging. Electroencephalography and
Clinical Neurophysiology - Evoked Potentials. https://doi.org/10.1016/S0168-
5597(97)96139-6
Pronk, M., Deeg, D. J. H., Festen, J. M., Twisk, J. W., Smits, C., Comijs, H. C., & Kramer, S. E.
(2013a). Decline in older persons’ ability to recognize speech in noise: The influence of
demographic, health-related, environmental, and cognitive factors. Ear and Hearing.
https://doi.org/10.1097/AUD.0b013e3182994eee
Pronk, M., Deeg, D. J. H., Festen, J. M., Twisk, J. W., Smits, C., Comijs, H. C., & Kramer, S. E.
(2013b). Decline in Older Persons’ Ability to Recognize Speech in Noise. Ear & Hearing.
https://doi.org/10.1097/aud.0b013e3182994eee
R Core Team. (2020). R: A Language and Environment for Statistical Computing. R Foundation for
Statistical Computing. https://www.r-project.org/
Rance, G. (2005). Auditory Neuropathy/Dys-synchrony and Its Perceptual Consequences.
Trends in Amplification, 9(1), 1–43. https://doi.org/10.1177/108471380500900102
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data
analysis methods (Vol. 1). SAGE.
Rodgers, M. A., & Pustejovsky, J. E. (2020). Evaluating Meta-Analytic Methods to Detect
Selective Reporting in the Presence of Dependent Effect Sizes. Psychological Methods.
https://doi.org/10.1037/met0000300
Rostami, S., & Moossavi, A. (2017). Musical training enhances neural processing of
comodulation masking release in the auditory brainstem. Audiology Research.
https://doi.org/10.4081/audiores.2017.185
Rufener, K. S., Liem, F., & Meyer, M. (2014). Age-related differences in auditory evoked
potentials as a function of task modulation during speech-nonspeech processing. Brain and
Behavior, 4(1), 21–28. https://doi.org/10.1002/brb3.188
Ruggles, D. R., Freyman, R. L., & Oxenham, A. J. (2014). Influence of musical training on
understanding voiced and whispered speech in noise. PLoS ONE, 9(1), e86980.
https://doi.org/10.1371/journal.pone.0086980
Ryff, C. D., Almeida, D. M., Ayanian, J., Carr, D. S., Cleary, P. D., Coe, C., ..., & Williams, D.
(2012). National Survey of Midlife Development in the United States (MIDUS 2), 2004–
2006. Ann Arbor, MI: Inter-University Consortium for Political and Social Research
(ICPSR) [Distributor].
Ryff, C. D. (1989). Happiness is everything, or is it? Explorations on the meaning of
psychological well-being. Journal of Personality and Social Psychology.
https://doi.org/10.1037/0022-3514.57.6.1069
Salo, S. K., Lang, A. H., Salmivalli, A. J., Johansson, R. K., & Peltola, M. S. (2003).
Contralateral white noise masking affects auditory N1 and P2 waves differently. Journal of
Psychophysiology, 17(4), 189–194. https://doi.org/10.1027/0269-8803.17.4.189
Schafer, E. W. P., Amochaev, A., & Russell, M. J. (1981). Knowledge of stimulus timing
attenuates human evoked cortical potentials. Electroencephalography and Clinical
Neurophysiology, 52(1), 9–17. https://doi.org/10.1016/0013-4694(81)90183-8
Schellenberg, E. G. (2011). Music lessons, emotional intelligence, and IQ. Music Perception.
https://doi.org/10.1525/mp.2011.29.2.185
Scherg, M., Vajsar, J., & Picton, T. W. (1989). A source analysis of the late human auditory
evoked potentials. Journal of Cognitive Neuroscience.
https://doi.org/10.1162/jocn.1989.1.4.336
Schiff, S., Valenti, P., Andrea, P., Lot, M., Bisiacchi, P., Gatta, A., & Amodio, P. (2008). The
effect of aging on auditory components of event-related brain potentials. Clinical
Neurophysiology. https://doi.org/10.1016/j.clinph.2008.04.007
Schneider, B. A., & Pichora-Fuller, M. K. (2001). Age-related changes in temporal processing:
Implications for speech perception. Seminars in Hearing, 22(3), 227–238.
https://doi.org/10.1055/s-2001-15628
Schröder, A., van Diepen, R., Mazaheri, A., Petropoulos-Petalas, D., de Amesti, V. S., Vulink,
N., & Denys, D. (2014). Diminished N1 auditory evoked potentials to oddball stimuli in
misophonia patients. Frontiers in Behavioral Neuroscience.
https://doi.org/10.3389/fnbeh.2014.00123
Seppänen, M., Hämäläinen, J., Pesonen, A. K., & Tervaniemi, M. (2012). Music training
enhances rapid neural plasticity of N1 and P2 source activation for unattended sounds.
Frontiers in Human Neuroscience, 6(MARCH 2012), 1–13.
https://doi.org/10.3389/fnhum.2012.00043
Shahin, A., Bosnyak, D. J., Trainor, L. J., & Roberts, L. E. (2003). Enhancement of neuroplastic
P2 and N1c auditory evoked potentials in musicians. Journal of Neuroscience, 23(13),
5545–5552. https://doi.org/10.1523/jneurosci.23-13-05545.2003
Skoe, E., Camera, S., & Tufts, J. (2019). Noise exposure may diminish the musician advantage
for perceiving speech in noise. Ear and Hearing, 40(4), 782–793.
https://doi.org/10.1097/AUD.0000000000000665
Slater, J., & Kraus, N. (2016). The role of rhythm in perceiving speech in noise: a comparison of
percussionists, vocalists and non-musicians. Cognitive Processing.
https://doi.org/10.1007/s10339-015-0740-7
Slater, J., Skoe, E., Strait, D. L., O’Connell, S., Thompson, E., & Kraus, N. (2015). Music
training improves speech-in-noise perception: Longitudinal evidence from a community-
based music program. Behavioural Brain Research, 291, 244–252.
https://doi.org/10.1016/j.bbr.2015.05.026
Spychiger, M., Patry, J., Lauper, G., Zimmerman, E., & Weber, E. (1993). Does More Music
Teaching Lead to a Better Social Climate. In Experimental Research in Teaching and
Learning.
Stam, M., Smit, J. H., Twisk, J. W. R., Lemke, U., Smits, C., Festen, J. M., & Kramer, S. E.
(2016). Change in Psychosocial Health Status Over 5 Years in Relation to Adults’ Hearing
Ability in Noise. Ear & Hearing, 37(6), 680–689.
https://doi.org/10.1097/AUD.0000000000000332
Strait, D. L., Parbery-Clark, A., Hittner, E., & Kraus, N. (2012). Musical training during early
childhood enhances the neural encoding of speech in noise. Brain and Language, 123(3),
191–201. https://doi.org/10.1016/j.bandl.2012.09.001
Strawbridge, W. J., Wallhagen, M. I., Shema, S. J., & Kaplan, G. A. (2000). Negative
consequences of hearing impairment in old age: A longitudinal analysis. Gerontologist,
40(3), 320–326. https://doi.org/10.1093/geront/40.3.320
Swaminathan, J., Mason, C. R., Streeter, T. M., Best, V., Kidd, G., & Patel, A. D. (2015).
Musical training, individual differences and the cocktail party problem. Scientific Reports.
https://doi.org/10.1038/srep11628
Swaminathan, S., Schellenberg, E. G., & Khalil, S. (2017). Revisiting the association between
music lessons and intelligence: Training effects or music aptitude? Intelligence.
https://doi.org/10.1016/j.intell.2017.03.005
Talamini, F., Carretti, B., & Grassi, M. (2016). The working memory of musicians and
nonmusicians. Music Perception. https://doi.org/10.1525/MP.2016.34.2.183
Tremblay, K. L., & Kraus, N. (2002). Auditory training induces asymmetrical changes in cortical
neural activity. Journal of Speech, Language, and Hearing Research.
https://doi.org/10.1044/1092-4388(2002/045)
Tremblay, K. L., Piskosz, M., & Souza, P. (2003). Effects of age and age-related hearing loss on
the neural representation of speech cues. Clinical Neurophysiology, 114(7), 1332–1343.
https://doi.org/10.1016/S1388-2457(03)00114-7
Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2013).
Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45(2),
576–594. https://doi.org/10.3758/s13428-012-0261-6
van Goethem, A., & Sloboda, J. (2011). The functions of music for affect regulation. Musicae
Scientiae, 15(2), 208–228. https://doi.org/10.1177/1029864911401174
Vaughan, H. G., & Ritter, W. (1970). The sources of auditory evoked responses recorded from
the human scalp. Electroencephalography and Clinical Neurophysiology, 28(4), 360–367.
https://doi.org/10.1016/0013-4694(70)90228-2
Viechtbauer, W. (2007). Publication Bias in Meta-Analysis: Prevention, Assessment and
Adjustments. In Publication Bias in Meta-Analysis: Prevention, Assessment and
Adjustments. https://doi.org/10.1002/0470870168
Viechtbauer, W., & Cheung, M. W.-L. (2010). Outlier and influence diagnostics for meta-
analysis. Research Synthesis Methods. https://doi.org/10.1002/jrsm.11
Whiting, K. A., Martin, B. A., & Stapells, D. R. (1998). The effects of broadband noise masking
on cortical event-related potentials to speech sounds /ba/and/da/. Ear and Hearing.
https://doi.org/10.1097/00003446-199806000-00005
Wilcox, R. (2017). Modern Statistics for the Social and Behavioral Sciences. In Modern
Statistics for the Social and Behavioral Sciences. https://doi.org/10.1201/9781315154480
Willems, R. M., Özyürek, A., & Hagoort, P. (2008). Seeing and hearing meaning: ERP and
fMRI evidence of word versus picture integration into a sentence context. Journal of
Cognitive Neuroscience. https://doi.org/10.1162/jocn.2008.20085
Wilson, R. H. (2003). Development of a speech-in-multitalker-babble paradigm to assess word-
recognition performance. Journal of the American Academy of Audiology.
https://doi.org/10.1055/s-0040-1715938
Wilson, R. H., McArdle, R. A., & Smith, S. L. (2007). An evaluation of the BKB-SIN, HINT,
QuickSIN, and WIN materials on listeners with normal hearing and listeners with hearing
loss. Journal of Speech, Language, and Hearing Research. https://doi.org/10.1044/1092-
4388(2007/059)
Yamaguchi, S., & Knight, R. T. (1991). P300 generation by novel somatosensory stimuli.
Electroencephalography and Clinical Neurophysiology. https://doi.org/10.1016/0013-
4694(91)90018-Y
Yoo, M., Kim, S., Kim, B. S., Yoo, J., Lee, S., Jang, H. C., Cho, B. L., Son, S. J., Lee, J. H.,
Park, Y. S., Roh, E., Kim, H. J., Lee, S. G., Kim, B. J., Kim, M. J., & Won, C. W. (2019).
Moderate hearing loss is related with social frailty in a community-dwelling older adults:
The Korean Frailty and Aging Cohort Study (KFACS). Archives of Gerontology and
Geriatrics, 83, 126–130. https://doi.org/10.1016/j.archger.2019.04.004
Zatorre, R. J., Evans, A. C., Meyer, E., & Gjedde, A. (1992). Lateralization of phonetic and pitch
discrimination in speech processing. Science. https://doi.org/10.1126/science.1589767
Zendel, B. R., & Alain, C. (2012). Musicians experience less age-related decline in central
auditory processing. Psychology and Aging, 27(2), 410–417.
https://doi.org/10.1037/a0024816
Zendel, B. R., & Alexander, E. J. (2020). Autodidacticism and Music: Do Self-Taught Musicians
Exhibit the Same Auditory Processing Advantages as Formally Trained Musicians?
Frontiers in Neuroscience, 14, 752. https://doi.org/10.3389/fnins.2020.00752
Zendel, B. R., Tremblay, C.-D. D., Belleville, S., & Peretz, I. (2015a). The impact of
musicianship on the cortical mechanisms related to separating speech from background
noise. Journal of Cognitive Neuroscience, 27(5), 1044–1059.
https://doi.org/10.1162/jocn_a_00758
Zendel, B. R., Tremblay, C. D., Belleville, S., & Peretz, I. (2015b). The impact of musicianship
on the cortical mechanisms related to separating speech from background noise. Journal of
Cognitive Neuroscience, 27(5), 1044–1059. https://doi.org/10.1162/jocn_a_00758
Zendel, B. R., West, G. L., Belleville, S., & Peretz, I. (2019). Musical training improves the
ability to understand speech-in-noise in older adults. Neurobiology of Aging, 81, 102–115.
https://doi.org/10.1016/j.neurobiolaging.2019.05.015
Zhang, L., Fu, X., Luo, D., Xing, L., & Du, Y. (2020a). Musical Experience Offsets Age-Related
Decline in Understanding Speech-in-Noise: Type of Training Does Not Matter, Working
Memory Is the Key. Ear and Hearing.
Zhang, L., Fu, X., Luo, D., Xing, L., & Du, Y. (2020b). Musical Experience Offsets Age-Related
Decline in Understanding Speech-in-Noise. Ear & Hearing. Advance online publication.
https://doi.org/10.1097/aud.0000000000000921
Appendices
Appendix A. Search terms used for each of the databases.
ProQuest:
all(auditory processing OR speech processing OR speech* PRE/1 noise OR speech masking OR
masking OR speech OR word* PRE/1 noise OR sentence* PRE/1 noise OR syllable* PRE/1
noise) AND all(music* group* OR music lesson* OR music* training OR music* instruction
OR music* experience OR music* group* OR instrument* training OR musician*)

PubMed:
((auditory processing) OR (speech processing) OR (speech* noise) OR (speech masking) OR
(masking) OR (speech) OR (word* noise) OR (sentence* noise) OR (syllable* noise)) AND
((music* group*) OR (music lesson*) OR (music* training) OR (music* instruction) OR
(music* experience) OR (music* group*) OR (instrument* training) OR (musician*))
Appendix B. Means and standard deviations for behavioral tasks by group and time.
Values are Mean (SD).

                          Pre-Test                        Post-Test
Task / Measure            Choir           Control         Choir           Control
BKB-SIN
  Total                   24.38 (1.12)    24.54 (1.24)    25.03 (1.04)    24.83 (1.24)
Goldsmith MSI
  Engagement              37.27 (9.03)    37.47 (11.70)
  Perceptual              44.00 (8.96)    46.53 (8.29)
  Training                21.47 (9.66)    18.47 (10.86)
  Singing                 25.13 (8.48)    23.33 (10.22)
  Emotions                29.80 (6.66)    32.27 (5.44)
  General                 69.93 (21.21)   64.73 (19.34)
MINT
  Rhythm Accuracy         0.61 (0.17)     0.61 (0.13)     0.66 (0.14)     0.59 (0.14)
  Pitch Accuracy          0.66 (0.15)     0.62 (0.13)     0.64 (0.15)     0.61 (0.17)
  Prediction Accuracy     0.73 (0.13)     0.69 (0.14)     0.74 (0.13)     0.65 (0.15)
  Rhythm RT               4.01 (2.13)     4.11 (1.35)     3.86 (1.77)     3.94 (1.57)
  Pitch RT                4.08 (1.52)     4.90 (2.06)     4.56 (3.12)     4.13 (2.09)
  Prediction RT           2.47 (0.96)     2.66 (0.64)     2.74 (0.93)     3.01 (1.86)
Ryff's
  Autonomy                38.07 (7.48)    38.86 (5.73)    39.21 (6.99)    38.50 (6.12)
  Environmental Mastery   36.44 (7.84)    38.04 (6.37)    37.00 (8.44)    37.65 (7.51)
  Personal Growth         40.13 (5.78)    43.78 (5.20)    40.38 (5.82)    44.52 (4.95)
  Positive Relations      36.67 (7.58)    39.83 (7.67)    36.60 (7.56)    40.35 (7.02)
  Purpose                 39.00 (5.41)    40.55 (5.75)    37.29 (7.52)    41.05 (6.07)
  Self-Acceptance         36.88 (6.18)    35.50 (6.12)    36.81 (8.23)    36.05 (6.69)
Dejong's
  Social Loneliness       3.20 (1.78)     2.21 (2.04)
  Emotional Loneliness    2.40 (1.80)     2.36 (2.24)
EEG syllable-in-noise
  Silent Accuracy         0.94 (0.12)     0.96 (0.06)     0.94 (0.09)     0.91 (0.17)
  10 dB Accuracy          0.93 (0.14)     0.96 (0.09)     0.97 (0.03)     0.89 (0.18)
  5 dB Accuracy           0.94 (0.08)     0.96 (0.08)     0.97 (0.05)     0.88 (0.21)
  0 dB Accuracy           0.97 (0.03)     0.98 (0.02)     0.94 (0.13)     0.91 (0.16)
  Silent RT               0.24 (0.06)     0.25 (0.08)     0.28 (0.08)     0.25 (0.08)
  10 dB RT                0.29 (0.08)     0.29 (0.07)     0.29 (0.11)     0.28 (0.08)
  5 dB RT                 0.30 (0.08)     0.30 (0.07)     0.31 (0.09)     0.30 (0.08)
  0 dB RT                 0.30 (0.10)     0.32 (0.07)     0.34 (0.09)     0.33 (0.08)
EEG Oddball
  Accuracy                0.95 (0.11)     0.93 (0.12)     0.93 (0.10)     0.96 (0.06)
  RT                      0.47 (0.10)     0.46 (0.10)     0.48 (0.11)     0.44 (0.08)
Appendix C. Means and standard deviations of amplitudes for EEG tasks by group and time.
Values are mean amplitude (SD).

                           Pre-Test                        Post-Test
Task / Component           Choir           Control         Choir           Control
Syllable-in-noise, active
  P1 Silent                0.21 (0.61)     0.37 (0.72)     0.46 (0.67)     0.20 (0.85)
  P1 10 dB                 0.27 (0.57)     0.08 (0.51)     0.24 (0.45)     0.28 (0.55)
  P1 5 dB                  0.23 (0.46)     0.05 (0.42)     0.19 (0.49)     0.31 (0.58)
  P1 0 dB                  -0.03 (0.48)    0.01 (0.46)     -0.07 (0.64)    0.00 (0.41)
  N1 Silent                -0.69 (1.18)    -0.94 (1.58)    -1.03 (0.98)    -0.90 (1.57)
  N1 10 dB                 -0.57 (0.76)    -0.54 (1.13)    -0.23 (0.73)    -0.39 (0.92)
  N1 5 dB                  -0.38 (0.72)    -0.69 (0.82)    -0.72 (1.01)    -0.55 (0.89)
  N1 0 dB                  -0.78 (0.77)    -0.62 (0.86)    -0.66 (0.76)    -0.54 (0.97)
  P2 Silent                1.65 (1.09)     1.28 (1.02)     1.68 (1.08)     1.53 (1.30)
  P3-like Silent           1.15 (1.12)     1.31 (1.20)     1.00 (1.31)     1.70 (1.07)
  P3-like 10 dB            0.69 (0.75)     1.29 (0.96)     0.78 (1.21)     1.38 (0.82)
  P3-like 5 dB             0.85 (1.24)     1.18 (0.87)     0.87 (1.14)     1.25 (1.08)
  P3-like 0 dB             0.81 (0.86)     1.08 (0.96)     0.72 (0.95)     1.25 (1.09)
Syllable-in-noise, passive
  P1 Silent                0.52 (0.59)     0.49 (0.65)     0.50 (0.63)     0.44 (0.59)
  P1 10 dB                 0.50 (0.33)     0.42 (0.42)     0.50 (0.42)     0.57 (0.46)
  P1 5 dB                  0.32 (0.32)     0.35 (0.26)     0.37 (0.41)     0.52 (0.42)
  P1 0 dB                  0.31 (0.27)     0.42 (0.30)     0.30 (0.34)     0.37 (0.53)
  N1 Silent                -0.93 (0.82)    -1.39 (0.84)    -1.18 (0.80)    -1.34 (0.68)
  N1 10 dB                 -0.28 (0.52)    -0.61 (0.42)    -0.48 (0.52)    -0.56 (0.45)
  N1 5 dB                  -0.33 (0.45)    -0.59 (0.44)    -0.50 (0.51)    -0.70 (0.54)
  N1 0 dB                  -0.19 (0.46)    -0.47 (0.50)    -0.49 (0.50)    -0.58 (0.44)
  P2 Silent                1.13 (0.83)     1.15 (0.82)     1.13 (0.78)     1.46 (0.87)
Oddball
  N1 Oddball               -1.25 (1.46)    -2.32 (1.56)    -0.87 (1.55)    -2.04 (1.35)
  N1 Standard              -0.99 (0.97)    -1.81 (1.33)    -0.95 (1.16)    -1.55 (1.22)
  N1 Distractor            -1.34 (1.27)    -1.92 (2.03)    -0.77 (1.24)    -1.95 (1.58)
  P2 Oddball               1.38 (1.77)     0.81 (1.27)     0.92 (2.42)     1.12 (1.47)
  P2 Standard              1.70 (0.92)     1.59 (0.93)     1.48 (1.01)     1.82 (0.97)
  P2 Distractor            1.62 (1.48)     1.55 (1.43)     1.21 (1.79)     1.50 (1.40)
  P3a Distractor           1.66 (1.57)     1.37 (2.15)     1.79 (1.93)     1.73 (2.06)
  P3b Oddball              0.29 (0.71)     0.03 (1.18)     0.25 (0.92)     0.04 (1.06)
  P3b Standard             0.22 (0.37)     -0.05 (0.56)    0.28 (0.52)     0.00 (0.52)
Appendix D. Means and standard deviations of latencies for EEG tasks by group and time.
Values are mean latency in ms (SD).

                           Pre-Test                          Post-Test
Task / Component           Choir            Control          Choir            Control
Syllable-in-noise, active
  P1 Silent                62.82 (11.29)    62.00 (13.20)    60.47 (9.37)     62.60 (12.40)
  P1 10 dB                 67.06 (14.53)    70.00 (15.44)    67.53 (13.26)    66.40 (15.10)
  P1 5 dB                  91.06 (13.31)    87.80 (12.81)    73.65 (18.50)    72.40 (18.76)
  P1 0 dB                  78.82 (15.67)    77.60 (16.69)    79.29 (17.51)    89.60 (21.06)
  N1 Silent                109.65 (12.33)   108.20 (11.20)   105.41 (10.19)   108.60 (11.12)
  N1 10 dB                 136.94 (14.39)   142.60 (17.76)   143.29 (27.50)   150.80 (22.67)
  N1 5 dB                  159.76 (17.23)   147.40 (19.04)   148.47 (17.37)   150.00 (19.23)
  N1 0 dB                  182.12 (19.80)   176.40 (19.68)   168.47 (15.55)   179.00 (19.11)
  P2 Silent                191.53 (18.86)   195.60 (23.36)   195.29 (18.83)   197.80 (23.91)
  P3-like Silent           341.18 (45.12)   307.78 (37.51)   325.65 (40.33)   319.11 (31.96)
  P3-like 10 dB            355.76 (54.62)   332.89 (42.39)   345.65 (49.30)   346.89 (40.74)
  P3-like 5 dB             367.06 (53.00)   350.22 (43.41)   366.12 (51.03)   356.22 (49.70)
  P3-like 0 dB             372.00 (61.04)   370.22 (43.09)   368.00 (51.13)   364.44 (38.90)
Syllable-in-noise, passive
  P1 Silent                58.89 (10.70)    56.63 (10.61)    57.33 (10.08)    55.37 (12.46)
  P1 10 dB                 76.00 (13.72)    72.84 (12.90)    77.33 (11.15)    72.63 (14.44)
  P1 5 dB                  81.56 (16.01)    77.05 (14.47)    80.89 (15.97)    85.68 (14.13)
  P1 0 dB                  85.33 (16.35)    84.42 (13.39)    94.89 (14.64)    84.00 (16.97)
  N1 Silent                110.89 (7.36)    109.89 (10.01)   109.11 (9.39)    109.05 (10.31)
  N1 10 dB                 157.56 (19.12)   162.11 (20.59)   161.78 (14.21)   158.53 (20.62)
  N1 5 dB                  180.67 (16.54)   177.26 (16.71)   175.33 (18.21)   173.05 (17.07)
  N1 0 dB                  177.56 (18.36)   174.53 (18.39)   184.00 (14.06)   182.11 (10.94)
  P2 Silent                192.22 (18.94)   195.58 (20.91)   195.33 (19.32)   201.26 (19.28)
Oddball
  N1 Oddball               88.89 (9.76)     92.17 (8.54)     87.33 (10.01)    89.80 (10.26)
  N1 Standard              89.33 (7.76)     92.67 (7.04)     88.67 (8.92)     91.40 (8.24)
  N1 Distractor            89.56 (11.16)    96.17 (12.11)    88.67 (13.11)    96.60 (12.26)
  P2 Oddball               170.22 (32.27)   172.33 (33.21)   163.56 (31.25)   176.80 (34.86)
  P2 Standard              192.67 (29.43)   194.00 (31.16)   197.56 (32.03)   205.80 (25.84)
  P2 Distractor            197.33 (30.62)   205.00 (28.73)   197.56 (34.30)   213.60 (22.57)
  P3a Distractor           317.78 (17.79)   317.33 (18.64)   324.22 (20.00)   325.60 (18.33)
  P3b Oddball              578.00 (87.15)   605.33 (103.97)  567.11 (84.78)   567.40 (100.28)
  P3b Standard             625.56 (92.21)   637.50 (83.95)   602.22 (97.75)   648.00 (72.94)