Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Statistical signal processing of magnetoencephalography data
(USC Thesis Other)
Statistical signal processing of magnetoencephalography data
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
STATISTICAL SIGNAL PROCESSING OF MAGNETOENCEPHALOGRAPHY
DATA
by
Dimitrios Pantazis
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
December 2006
Copyright 2006 Dimitrios Pantazis
Dedication
to my family
ii
Acknowledgments
I would like to express my deepest respect and gratitude to my advisor, Prof. Richard
Leahy. Your contribution has been invaluable; you supported me financially throughout
my graduate studies in USC, guided me in selecting and solving research problems,
introduced me to distinguished researchers and scientists, provided me a friendly and
welcome environment, gave me freedom to select my own research directions, let me
enjoy long holidays. You are a great mentor, and it has really been a pleasure working
with you. Thank you!
I also wish to thank the members of my guidance committee: Dr. Krishna Nayak,
Dr. Manbir Singh, Dr. Shrikanth Narayanan, and Dr. Fengzhu Sun for their suggestions
and valuable feedback.
My greatest appreciation goes to Prof. Thomas Nichols at the University of Michi-
gan for being almost my second advisor. You gave me interesting research topics, guided
me through statistics, and helped me the most during my first years in USC. You are a
bright scientist and it is an honor working with you. Thank you!
I am also grateful to Dr. John Mosher at the Los Alamos Labs for sharing his scien-
tific expertise with me. You have infinite knowledge on MEG, and our long discussions
have been fun and educational. Thank you for always being there when I need you!
Spacial thanks to Prof. Gregory Simpson at the University of California, San Fran-
cisco, for giving me invaluable knowledge on neuroscience. Our long discussions have
iii
been very fruitful, and your excitement and motivation helped me face research prob-
lems and move forward.
I also thank Dr. Sylvain Baillet at the Cognitive Neuroscience & Brain Imaging
Hopital de la Salpetriere for offering advice and support, as well as providing MEG
data to conduct my research. I was also privileged to meet Prof. Martin Styner at
the University of North Carolina, and have the opportunity to do original research on
surface-based morphometry.
I could not have selected better office mates than Dr. Quanzheng Li and Dr. Esen
Kucukaltun-Yildirim. We have spend quality time together, discussing research prob-
lems as well as life experiences. Special thanks to Dr. Felix Darvas who has often been
like a mentor to me, explaining mathematical details and answering my questions. I
must also mention that both Quanzheng and Felix motivated me to look for research
problems and care for my future career.
I had the honor to collaborate with some excellent scientists: Dr. Darren Weber and
Dr. Corby Dale at the University of California, San Francisco; Dr. Warren Merrifield
and Dr. Dinah Thyerlei at the Huntington Memorial Hospital, and Dr. Karim Jerbi at
the Brain Imaging Hopital de la Salpetriere.
In the University of Southern California I have enjoyed the cozy and warm environ-
ment of an excellent research lab. I want to thank its members: Sangtae Ahn, Abhijit
Chaudhari, Sanghee Cho, Belma Dogdas, Hua Hui, Anand Joshi, Zheng Li, Sangeetha
Somayajula, Juan Luis Poletti Soto, Evren Asma, and Bing Bai for adding positively to
my research and academic experience.
I also want to express my appreciation to Dr. Athanasios Mouchtaris and Dr. Panayi-
otis Georgiou who welcomed me in USC, Alex and Cristine Ciancio for being great(!)
iv
friends, Mikko Rautiainen for offering me the most exciting semester in USC, Alexan-
dros Batzios for being a great roommate, and Prof. Robert Scholtz for being an excellent
tutor.
Most importantly, I want to thank my family for providing unlimited support and
helping me realize my dreams.
v
Table of Contents
Dedication ii
Acknowledgments iii
List of Tables ix
List of Figures x
Abstract xvi
1 Introduction 1
2 Overview of Magnetoencephalography 6
2.1 Neural Basis of Electromagnetic Signals . . . . . . . . . . . . . . . . . 7
2.2 Instrumentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3 Model and Inverse Methods 17
3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Inverse Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2.1 Regularized Minimum Norm Solutions . . . . . . . . . . . . . 18
3.2.2 Linearly Constrained Minimum Variance Beamformer . . . . . 20
3.2.3 Multiple Signal Classification (MUSIC) . . . . . . . . . . . . . 21
3.2.4 Matched Filter . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3 Imaging Oscillatory Behavior in MEG studies . . . . . . . . . . . . . . 24
3.3.1 Cortical Source Imaging with Time-Frequency Expansions . . . 25
3.3.2 Morlet Wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . 27
vi
4 Controlling Familywise Error Rate in Cortical Current Density Maps 30
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2.2 Random Field Theory Method . . . . . . . . . . . . . . . . . . 37
4.2.3 Permutation Method . . . . . . . . . . . . . . . . . . . . . . . 41
4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.3.1 Simulation Studies . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3.2 Real Data experiment . . . . . . . . . . . . . . . . . . . . . . . 54
4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.5 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5 Controlling Familywise Error Rate in MEG Studies of Oscillatory Behavior 60
5.1 Statistical Interpretation of Time-Frequency Cortical Maps . . . . . . . 61
5.1.1 Modeling and ANOV A Analysis . . . . . . . . . . . . . . . . . 61
5.1.2 Multiplicity Adjustments . . . . . . . . . . . . . . . . . . . . . 63
5.1.3 Permutation Tests . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.4 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6 Neuroimaging Applications: Language Cortex Dominance and Surface-
based Morphometry 70
6.1 Hemispheric Language Dominance using MEG Cortical Imaging and
Non-Parametric Statistical Analysis . . . . . . . . . . . . . . . . . . . 70
6.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.1.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.1.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.1.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.1.5 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . 84
6.2 Statistical Surface-Based Morphometry Using a Non-Parametric Approach 84
6.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.2.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
7 General Linear Modeling and Error Control Using False Discovery Rate 94
7.1 Human Attention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
7.2 Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.2.1 Subjects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.2.2 Visual spatial cueing experiment . . . . . . . . . . . . . . . . . 98
7.2.3 MEG data collection and preprocessing . . . . . . . . . . . . . 98
vii
7.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7.3.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.3.2 Wavelet Expansion and Statistics . . . . . . . . . . . . . . . . . 101
7.3.3 ANCOV A Model . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.3.4 Statistical Significance . . . . . . . . . . . . . . . . . . . . . . 105
7.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.5 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
8 Conclusion and Future Work 113
8.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
8.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
8.2.1 Signal Processing Methodologies in MEG . . . . . . . . . . . . 115
8.2.2 Functional Mapping . . . . . . . . . . . . . . . . . . . . . . . 116
8.2.3 Anatomical Mapping . . . . . . . . . . . . . . . . . . . . . . . 116
References 118
viii
List of Tables
4.1 Summary statistics for three permutation methods. The permutation
samples are T
∗
it
, with i the spatial index, and t the time index. The
tilde indicates the maximum over the dotted subscript; p
i
(·) is the per-
mutation p-value function using only data from spatial locationi. . . . . 44
4.2 Thresholds for the sources illustrated in Figs. 4.3 and 4.4 for controlling
FWER over space and time. Since 12.3% of the
e
P
∗
·
had the smallest
possible P-value of 0.0001, 0.123 is the smallest FWER obtainable with
method 2. Method 1 was repated for this FWER for comparison . . . . 50
4.3 Thresholds for the sources illustrated in Figs. 4.3 and 4.4 for controlling
FWER over space only . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.4 Noise-only simulation results for control of spatial and spatiotemporal
FWER at nominal level α = 5%. The Monte Carlo standard error for
the spatiotemporal FWER is 0.0218; for the spatial FWER it is 0.0022. . 54
4.5 Number of RESELS for simulated and real data. . . . . . . . . . . . . . 54
4.6 Real data thresholds found with the different RF and permutation methods. 55
6.1 Spatiotemporal thresholds
ˆ
F
−1
]
T··
(0.95), used to identify significant acti-
vation in theT
it
maps. . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.2 Thresholds
ˆ
F
−1
S
(α) and
ˆ
F
−1
S
(1−α) at 5% significance level, and orig-
inal statistic S for all subjects. The subject is right dominant if S >
ˆ
F
−1
S
(0.95), and left dominant ifS <
ˆ
F
−1
S
(0.05). For subject P1 the test
is inconclusive. *: We should not proceed to a laterality test for sub-
jects P1 and P2, because they did not have significant voxels (Fig. 6.5).
However we show the results for completeness. . . . . . . . . . . . . . 82
6.3 Edinburgh profile, Wada test, location of significant voxels after thresh-
oldingT
it
maps, and asymmetry analysis usingS statistic. *: The same
comments as for Table 6.2 apply. . . . . . . . . . . . . . . . . . . . . . 84
6.4 Normalization schemes, summary statistics and thresholds for the two
permutation methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.1 P-values of the ipsilateral vs. contralateral statistic S
STF
. Red values
indicate significance after multiplicity correction using a false discovery
rate procedure that gave a threshold of 0.0273 (Fig. 7.9) . . . . . . . . . 110
ix
List of Figures
2.1 Cerebral frontal cortex drawn by Ramn y Cajal using a Golgi staining
technique. Pyramidal (A, B, C, D, E) and non-pyramidal (F, K) cells
are clearly depicted. Currents flowing in the dendritic trunks of pyrami-
dal cells are believed to be the primary generators of magnetic signals
outside the head. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Left: Whole-head CTF Omega MEG system with 275 axial gradiome-
ters; right: MEG sensors use low-temperature electronics cooled by liq-
uid helium. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 MEG model; (a) Sensor arrangement of a 275-channel CTF MEG sys-
tem, (b) Topography of sensor measurements, (c) Minimum-norm inverse
solution on a tessellated cortical surface. . . . . . . . . . . . . . . . . . 12
3.1 (a) Detection maps for two separate actual source locations. (b) Corre-
sponding z-scores. Notice that, although individual detection maps have
different amplitudes, z-score maps exhibit the same range of values so
that thresholding is meaningful. . . . . . . . . . . . . . . . . . . . . . 24
3.2 MEG brain activity in response to a task or stimulus consists of evoked
components that are phase-locked to the stimulus and induced compo-
nents that are not. While averaging over epochs enhances the phase
locked SNR, averaging also results in suppression of the non-phase
locked components. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Time-varying frequency components of a source on the frontal lobe,
after averaging over many epochs; we notice desynchronization in the
β band 250-600ms after stimulus. The Morlet wavelet is a Gaussian-
windowed complex sinusoid with the real part shown in blue, and the
imaginary part in red. . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4 Normalized energy increase in a visual attention MEG study in theθ and
γ frequencies. The colormap encodes percentage increase in energy in
units of standard deviation. The maps reveal high activity in the occipital
lobe and suggest a relevant functional association betweenθ andγ bands. 28
x
4.1 Illustration of the summarizing procedures used to construct FWER-
corrected thresholds: the original epochs Y
itj
produce M permutation
samplesY
∗
itj
by exchanging pre- and post-stimulus data. The epochs are
then averaged and normalized to produce T
it
and T
∗
it
. Three different
summarizing methods can be used. Method 1: T
∗
it
are summarized in
time (
e
T
∗
i·
) and space (
e
T
∗
··
) to produce epochwise thresholds (
ˆ
F
−1
e
T··
(1−α));
Method 2: T
∗
it
are summarized in time, converted into p-values (P
∗
i
)
and summarized in space (
e
P
∗
·
) to produce uniform specificity epochwise
thresholds (
ˆ
F
−1
e
P·
(α)); Method 3: T
∗
it
are summarized in space (
e
T
∗
·t
) to
produce slicewise thresholds (
ˆ
F
−1
e
T·t
(1−α)). . . . . . . . . . . . . . . . 43
4.2 Illustration of the impact of heterogeneous voxel null distributions on
a 5% FWER threshold. Shown are null distributions of 5 surface ele-
ments in three cases: each having different variances, each having dif-
ferent skewed distributions, and all sharing the same normal distribu-
tion. The first case (left) demonstrates the variable false positive rate
when test statistics are not normalized (e.g. use raw CDM values ˆ μ
ij
instead of T
it
). The second case (center) demonstrates the impact of
non-Gaussianity, even when variance is normalized, and motivates the
use of p-values to normalizeT
ij
intop
i
(T
it
). The last case (right) shows
that with homogeneous nulls the false positive rate at each surface ele-
ment is homogeneous. Note that in all cases FWER is controlled at 5%. 45
4.3 Simulated source 1 (left hemisphere) and source 2 (right hemisphere)
are shown mapped onto high resolution and smoothed versions of the
cortical surface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.4 Timecourses of simulated sources, blue for source 1 and red for source
2. The pattern of activation mimics a typical neuroimaging study where
an early response to a stimulus propagates to another brain region giving
a delayed component. . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.5 Examples of significant activation maps for permutation and random
field methods for two time instances; (a) Permutation method 1 using
unsmoothed CDMs, (b) Permutation method 3 using unsmoothed CDMs,
(c) Permutation method 3 using smoothed CDMs, (d) Random field
using smoothed CDMs. The first method controls FWER over space
and time, while the last three methods control FWER over space for one
time point only. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.6 Examples of significant activation maps for permutation methods 1 and
2 for two time instances. The lowest achieved threshold for method 2 is
α = 0.123 and the comparison for both methods is done at this level. . . 52
4.7 Thresholded and unthresholded maps of the current density (ˆ μ
it
), the
noise normalized current density (T
it
) and the (1-p)-value mapp
i
(T
it
) at
t = 113. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
xi
4.8 Examples of significant activation maps for permutation and random
field methods for real data; (a) Permutation method 1 using unsmoothed
CDMs, (b) Permutation method 3 using unsmoothed CDMs, (c) Permu-
tation method 3 using smoothed CDMs, (d) Random field controlling
FWER over space only. . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.9 Reconstruction and significant maps from permutation methods 1 and 2
for a FWER ofα = 0.086. . . . . . . . . . . . . . . . . . . . . . . . . 57
4.10 Global threshold applied by permutation method 1 (
ˆ
F
−1
e
T··
(1−α)) at level
α = 0.086, as compared to the histogram of the thresholdsp
−1
i
(
ˆ
F
−1
e
P·
(α))
applied to each source i by method 2. Also, a map of the thresholds
on the cortical surface is given on the right. Most of the individual
thresholds are below
ˆ
F
−1
e
T··
(1−α). . . . . . . . . . . . . . . . . . . . . 57
5.1 Illustration of the MEG visual attention study: Each trial presents a cen-
tral arrow cue that directs attention to the lower left or right visual field,
in anticipation of a second stimulus (S2), delivered 1 sec later on the
left or right. The locations of the upcoming S2 are continuously marked
by gray patches, to guide allocation of covert visual spatial attention. If
the S2 (a B/W grating) occurs on the cued (attended) side, then subjects
respond if it has the target orientation. . . . . . . . . . . . . . . . . . . 65
5.2 We tested 12 post-stimulus time-frequency bands against their baseline
counterparts. All bands were 200-500ms or 700-1000ms after the pre-
sentation of the cue, in frequency regions: θ (4-7Hz), α low (8-10Hz),
α high (11-14Hz),β low (14-19Hz),β high (20-30Hz), andγ (30-50Hz). 66
5.3 Empirical distribution
ˆ
F
]
T··
ofmax
ik
|T
∗
ik
|.We use this distribution to esti-
mate a threshold
ˆ
F
−1
]
T··
(1−α) = 0.2628 that controls the FWER at the
α = 0.05 level. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.4 Normalized power difference maps of baseline vs. right cue for the
visual-attention MEG study: (a) Unthresholded maps ofT
ik
, (b) Thresh-
olded maps of T
ik
, based on our permutation test. We used the thresh-
old
ˆ
F
−1
]
T··
(1− α) = 0.2628 to control the FWER at α = 0.05 level.
The colormap encodes percentage increase in energy in units of stan-
dard deviation. The thresholded maps reveal increased activity in the
superior frontal gyrus and inferior temporal lobe (arrows), that are only
significant in theθ band. These results suggest a functional dissociation
betweenθ andγ bands. . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6.1 Channel Timeseries on a word recognition MEG study. The early evoked
response occurs at approximately 50-150ms after stimulus onset; the
late evoked response occurs at approximately 250-700ms after stimulus
onset. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
xii
6.2 Illustration of the summarizing procedures used to construct FWER-
corrected thresholds: the original epochs Y
itj
produce M permutation
samples Y
∗
itj
by exchanging pre- and post-stimulus data. The epochs
are then averaged and normalized to produceT
it
andT
∗
it
. Finally,T
∗
it
are
summarized in time and space to produce epochwise thresholds
ˆ
F
−1
]
T··
(1−
α). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.3 Empirical probability density distribution ofmax
it
|T
∗
it
|.We use this dis-
tribution to estimate a threshold
ˆ
F
−1
]
T··
(1−α) that controls the FWER at
theα = 0.05 level. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.4 Illustration of the permutation procedure used to estimate the distribu-
tion of theS statistic. We first permute the energy statisticsE
R
j
andE
L
j
to create permutation samples E
R∗
j
and E
L∗
j
. We then use the permu-
tation samples to estimate the empirical distribution of S, and estimate
the thresholds
ˆ
F
−1
S
(α) and
ˆ
F
−1
S
(1−α). . . . . . . . . . . . . . . . . . 82
6.5 V oxels with significant activity during 250-400ms after the word pre-
sentation. Subjects P1 and P2 had no significant voxels. . . . . . . . . . 83
6.6 The posterior temporal cortex is used to define areasN
R
andN
L
for the
language asymmetry test of all subjects that had significant voxels. . . . 83
6.7 Visualization of the distance map between the right lateral ventricles of
a twin-pair (superior view). A: The two ventricles after alignment. B:
Same as A, one ventricle shown transparent and the other as grid-mesh.
C: Distance map with color-coded distance at each boundary-point. . . . 87
6.8 Concept of statistical significance maps: For two groups of objects, DVF
maps are compared locally resulting in a map of significant differences
between groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.9 Illustration of the permutation scheme. We createM permutation sam-
ples from the original data and form the statistic T
ij
. The statistic is
normalized into T
n
ij
and the data are summarized in space to create S
j
.
The empirical distribution of S
j
, called
ˆ
F
S
is used to define a global
thresholdS
th
which corresponds to spatially varying thresholdsS
th
i
. . . 89
6.10 Color-coded average distance maps visualize the absolute distances between
pairs, averaged over the group. The displays show a smaller average dis-
tance between MZ twins compared to NR subjects. . . . . . . . . . . . 91
6.11 Left: Example empirical distribution of the statisticT
ij
acrossj at a sur-
face elementi. Middle: Empirical distribution of the maximum statistic
S
σ
j
. Right: Empirical distribution of the minimum statisticS
p
j
. . . . . . 93
6.12 Significance maps for the two normalization methods. The maps indi-
cate regions where unrelated individuals have more variability that twins.
The methods perform similarly on both the left and right lateral ventricle. 93
xiii
7.1 Visual spatial cueing experiment. A brief central arrow cue directed
covert attention to the lower left or right visual field, in anticipation of a
second stimulus (S2) delivered 1 sec later on the left or right with equal
probability. The upcoming S2 consisted of gratings slanted clockwise
from the vertical by either 5 (non-target) or 20 (target) degrees, with a
response required if the S2 appeared at the cued (attended) location and
had the target orientation. . . . . . . . . . . . . . . . . . . . . . . . . 99
7.2 MEG model; (a) Sensor arrangement of a 275-channel CTF MEG sys-
tem, (b) Topography of sensor measurementsM, (c) Min-norm inverse
solutionX on a tessellated cortical surface . . . . . . . . . . . . . . . . 101
7.3 Selecting rectangular time-frequency bands in the wavelet signal domain
(right). Also shown is the average frequency content of the cue-right and
cue-left trials on the right intra-parietal sulcus (Fig. 7.4). Alpha power is
stronger on the cue-right trials (ipsilateral hemisphere) around 500-800
ms, indicating a topographical change of alpha activity in anticipation
of a target stimulus at 1 sec in the cued direction. . . . . . . . . . . . . 103
7.4 Visual attention effects are explored in 6 cortical regions, selected sym-
metrically in each hemisphere; SPL: Superior Parietal Lobe, TPJ: Tem-
poral Parietal Junction, IPS: Intra-Parietal Sulcus, OM: Occipital Mid-
dle, OL: Occipital Lateral, OV: Occipital Ventral. . . . . . . . . . . . . 103
7.5 First few rows of the ANCOV A design matrixX. Each MEG trial pro-
duces two observations (post-cue neural energy in the right and left
hemisphere), that account for two rows in the design matrix. We place
1s in the proper cue and hemisphere indicator variable, and the within
trial correlation term. The baseline neural energy is placed on the fifth
or sixth column, if it comes from the right or left hemisphere respectively.105
7.6 Alpha power sensor map showing the difference between cue-right and
cue-left trials. The effect is averaged over 8 subjects at 600 ms post-cue,
and demonstrates that ispilateral is greater than contralateral alpha activity.107
7.7 Caudal view of cortical maps of 8 subjects, showing the difference in
alpha power between cue-right and cue-left trials at 600 ms post-cue.
Despite the considerable variability between subjects, overall ipsilateral
activity is greater than contralateral. . . . . . . . . . . . . . . . . . . . 108
7.8 Testing model validity. If baseline alpha activity is not an additive effect,
as we assume in the ANCOV A model, large values of baseline alpha
activity would cause large errors in the model. Since no inflation of
error occurs for large baseline alpha activity, the baseline covariate is
indeed an additive effect. . . . . . . . . . . . . . . . . . . . . . . . . . 109
xiv
7.9 FDR procedure for the p-values of the S
STF
statistics. The blue line
sorts the 60 p-values of Table 7.1 in ascending order. The black line is
the 0.05 significance line for 60 hypothesis tests. Based on FDR theory,
all p-values below the highest crossing point, i.e. less than or equal to
0.0273, are significant. . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.10 Temporal dynamics of the ispilateral vs. contralateral statistic S
STF
at 6 cortical regions reveal the top-down mechanisms of visual spatial
attention. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
xv
Abstract
Imaging approaches in MEG typically generate dynamic current density maps (CDMs)
on the cortical surface, where the activation in each location is represented by a current
dipole. The highly convoluted nature of the human cortex requires the use of many thou-
sands of dipoles for an accurate representation of the cortical surface, and the resulting
CDMs are difficult to interpret. As with fMRI images, objective assessment of CDMs
requires a principled approach to identifying regions of activation, which involves test-
ing thousands of hypotheses (one per surface element) for statistically significant exper-
imental effects. This raises the possibility of a large number of false positives simply as
a result of multiple hypothesis testing. To overcome this problem, we develop statistical
procedures that control the familywise error rate, i.e. the chance of one or more false
positives under the null hypothesis of no activation. We use random field and permu-
tation methods that consider the spatial dependence of the data, and make inferences
based on the global maximum distribution of the image statistics. To satisfy random
field assumptions, we must first smooth the CDMs. Since permutation methods are eas-
ily generalizable to multidimensional data, we expand them to control the familywise
error rate over spatial-temporal-spectral maps. We compare and evaluate the methods
on simulated and experimental data, and use them to detect hemispheric language dom-
inance in epileptic patients and morphological differences in brain structures.
Since experimental designs often involve multiple factors and confounding effects,
we present the use of general linear modeling theory to model MEG experimental data.
xvi
We demonstrate the theory in a specific MEG experiment involving a visual spatial
cueing paradigm. To extract contrast statistics of attention related alpha activity, we
follow a massively univariate approach and model each subject, region of interest, and
time-frequency band separately with a novel ANCOV A (analysis of covariance) design,
where power in rectangular time-frequency bands forms the observation variables, and
baseline power forms the covariate. We also demonstrate the use of false discovery
rate, a less conservative alternative to familywise error rate control, to do multiplicity
adjustments to our statistics.
xvii
Chapter 1
Introduction
Magnetoencephalography (MEG) is a relatively new medical imaging modality for
monitoring human brain function. While it offers significantly lower spatial resolution
than PET and fMRI, the ability to monitor neuronal activation at the millisecond time
scale makes this modality, together with EEG, a unique window on the human brain.
Until recently, most MEG literature has focused on forward and inverse modeling
of brain activation. Several head models have been proposed to model the propaga-
tion of electromagnetic signals inside the brain, ranging from a simple set of concentric
homogeneous and isotropic spheres representing the brain, skull and scalp, to more
realistic models based on anatomical information and approximate numerical solu-
tions [H¨ am¨ al¨ ainen et al., 1993, Darvas et al., 2004b]. A rich literature also exists for
inverse modeling, i.e. the identification of the neuronal current source configuration
that explains the MEG measurements [Baillet et al., 2001,Mosher et al., 1992,VanVeen
et al., 1997a]. However, until recently the statistical processing of MEG current source
solutions has drawn little attention.
Research or clinical findings are reliable only when accompanied by statistical
analysis to control for false positives, nominally at a 5% level. Consequently, error
rate control has a central role in medical image analysis, and fMRI or PET studies typ-
ically employ random field analysis to determine statistically significant experimental
effects. Our work introduces parametric (random field) and non-parametric (permuta-
tion) statistical procedures into MEG.
Imaging approaches in MEG typically generate dynamic current density maps on the
cortical surface. However, the highly convoluted nature of the human cortex requires the
1
use of many thousands of dipoles for an accurate representation of the cortical surface.
The inverse problem becomes hugely underdetermined and the resulting current density
maps (CDMs) are of low resolution; interpretation is further confounded by the pres-
ence of additive noise exhibiting a highly nonuniform spatial correlation. As with fMRI
images, objective assessment of CDMs requires a principled approach to identifying
regions of activation. The analysis of CDMs involves testing thousands of hypotheses
(one per surface element) for statistically significant experimental effects. This raises the
possibility of a large number of false positives simply as a result of multiple hypothesis
testing. To effectively control the number of false positives over all tests we must there-
fore consider the multiple hypothesis testing problem. Many false positive measures
have been proposed in this context, including familywise error rate, expected false dis-
covery rate, per-comparison error rate and per-family error rate [Nichols and Hayasaka,
2003]. The standard approach, and the one investigated in this work, is to control the
Familywise Error Rate (FWER), i.e. the chance of one or more false positives under
the null hypothesis. We achieve this by developing random field and permutation meth-
ods that consider the spatial dependence of the data, and make inferences based on the
global maximum distribution of the image statistics over space and time.
We demonstrate our statistical analysis of MEG signals with noise-normalized
minimum-norm solutions. However, our statistical methodologies apply to any type of
localization map. Further, we typically restrict the reconstruction to elemental sources
oriented normally to the cortical surface, because the main generators of MEG signals
are clusters of thousands of synchronously activated pyramidal neurons oriented per-
pendicular to the cortex. Although we take a cortical source imaging approach here, our
statistical methodologies can also be applied to volumetric detection maps.
2
The above statistical procedures involve pairwise tests between two conditions, for
example comparing a post-stimulus effect against a baseline. Since the most inter-
esting experimental designs involve multiple factors and confounding effects, we also
present the use of general linear modeling theory to identify main effects and inter-
actions in MEG experimental data, while at the same time suppressing confounding
baseline effects. We also demonstrate the use of false discovery rate, a less conserv-
ative alternative to familywise error rate control, to do multiplicity adjustments to our
statistics.
Chapter 2 gives an overview of MEG, focusing less on mathematical details and
more on basics and current developments. Chapter 3 presents a mathematical model
that linearly relates MEG measurements to brain activation. We then describe several
inverse methods based on minimum norm, spatial beamforming, subspace correlation,
and matched filter techniques, to detect the neuronal current source configuration that
explains the MEG measurements. Since oscillatory brain activity correlates with neu-
ronal communication, we also describe a method to create time-frequency activation
maps using Morlet wavelets.
Chapter 4 introduces the use of random fields and permutation tests to statistically
threshold cortical source images. Both methods consider the spatial and temporal depen-
dence of the data and make inferences based on the global maximum distribution of
the image statistics. Random field methods assume the same parametric distribution at
each spatial location, sufficient smoothness of the data, and a sufficiently high threshold
for the asymptotic results to be accurate. In contrast, permutation methods exploit the
information contained in the data to estimate the empirical distribution of the maximum
statistic. To satisfy random field assumptions, we smooth the current density maps using
a Laplace-Beltrami operator on the cortical surface.
3
In Chapter 5, we extend permutation methods to control for FWER in time-
frequency cortical activation maps produced with the Morlet wavelet approach of Chap-
ter 3. We cluster the data into rectangular pre- and post-stimulus time-frequency bands,
construct pairwise test statistics, and control type I error rate with a permutation proce-
dure. We demonstrate this method in application to high density MEG studies of visual
attention.
Chapter 6 presents two applications of the developed theory. The first identifies
the language dominant hemisphere of epileptic patients, and is an MEG alternative to
invasive procedures, such as the Wada test [Medina et al., 2004] and electrocorticog-
raphy [Salanova et al., 1992]. The second studies morphological differences in brain
structures, in particular the lateral ventricles of monozygotic twins and non-related sub-
jects; using this methodology, our collaborators published results that strongly suggest
the normalized hippocampal shape of a schizophrenic group is different from a control
group, most significantly as a deformation difference in the tail region [Styner et al.,
2004].
In Chapter 7 we move from simple pairwise tests to complex experimental designs
involving multiple factors and covariates. We demonstrate the use of general linear
theory to model main effects and interactions in MEG experimental data. Since mod-
eling is closely related to the experiment at hand, we present the theory in a specific
MEG experiment involving a visual spatial cueing paradigm. To extract contrast statis-
tics of attention related alpha activity, we follow a massively univariate approach and
model each subject, region of interest, and time-frequency band separately with a novel
ANCOV A (analysis of covariance) design, where power in rectangular time-frequency
bands forms the observation variables, and baseline power forms the covariate. We also
demonstrate the use of false discovery rate, a less conservative alternative to familywise
error rate control, to do multiplicity adjustments to our statistics.
4
We end in Chapter 8 with several concluding remarks about our statistical proce-
dures, and suggestions for future work on the statistical modeling of MEG data.
5
Chapter 2
Overview of Magnetoencephalography
Magnetoencephalography (MEG) is a noninvasive technique for measuring neuronal
activity in the human brain. Electrical currents flowing through neurons generate weak
magnetic fields recorded using magnetic sensors surrounding the head. The MEG
method is part of a broad area of research referred to as biomagnetism, which involves
studies of magnetic fields emanating from several organs of the human body, notably
the brain and heart.
The temporal resolution of MEG is in the millisecond range, the timescale at which
neurons communicate. Therefore, we can follow the rapid cortical activity reflecting
ongoing signaling between different brain areas. This is a great advantage compared
to other medical imaging modalities such as functional magnetic resonance imaging
(fMRI) and positron emission tomography (PET), where temporal resolution is on the
order of seconds. Further, unlike other methodologies that measure brain metabolism
or the relatively slow hemodynamic response, MEG directly measures electrical brain
activity. Electroencephalography (EEG) is a complimentary method to MEG, measur-
ing electrical scalp potentials rather than magnetic fields. It offers similar temporal res-
olution to MEG, but the spatial resolution is less accurate because electrical potentials
measured on the scalp are strongly influenced by strongly inhomogeneous conductivity
in the skull and scalp, whereas magnetic fields are mainly produced by currents that flow
in the relatively homogeneous intracranial space.
This chapter describes the neural generation of weak electromagnetic signals around
the head, and the extremely sensitive MEG instrumentation that can even detect mag-
netic flux quantums. We present the principles of forward and inverse modeling of
6
magnetic fields, and then discuss our contributions in the statistical analysis of MEG
activation maps. We conclude with an overview of applications of MEG, which span
both basic and clinical research.
2.1 Neural Basis of Electromagnetic Signals
A neuron consists of the cell body (or soma), which contains the nucleus, branching
dendrites, which receive signals from other neurons, and a projection called an axon,
which conducts the nerve signal. When a pulse arrives at an axon of a presynaptic
cell, neurotransmitter molecules are released from the synaptic vesicles into the synaptic
cleft. These molecules bind to receptors located on target cells, opening ion channels
(mostly Na+, K+ and Cl-) through the membrane. The resulting flow of charge causes
an electrical current along the interior of the postsynaptic cell, changing the postsynaptic
potential (PSP). When an excitatory PSP reaches the firing threshold at the axon hillock,
it initiates an action potential that travels along the axon with undiminished amplitude.
Conservation of electric charge dictates that intracellular currents, commonly called
primary currents, give rise to extracellular currents flowing through the volume conduc-
tor. Both primary and volume currents contribute to magnetic fields outside the head,
however only locally structured arrangements of cells can achieve sufficient coherent
superposition of currents as to produce measurable external fields. Clusters of thou-
sands of synchronously activated pyramidal cortical neurons are believed to be the main
generators of MEG signals (Fig. 2.1). In particular, the currents associated with their
large dendritic trunks, which are locally oriented in parallel and perpendicular to the cor-
tical surface, are believed to be the primary source of the neuromagnetic fields outside
the head. In contrast, the temporal summation of currents for action potentials, which
have duration on the order of 1 ms, is not as effective at producing measurable external
7
Figure 2.1: Cerebral frontal cortex drawn by Ramn y Cajal using a Golgi staining tech-
nique. Pyramidal (A, B, C, D, E) and non-pyramidal (F, K) cells are clearly depicted.
Currents flowing in the dendritic trunks of pyramidal cells are believed to be the primary
generators of magnetic signals outside the head.
fields as is the case the dendritic currents flowing in neighboring fibers, so that action
potentials are believed to contribute little to MEG measurements.
2.2 Instrumentation
Empirical observations indicate that we observe sources on the order of 10 nA-m, and
consequently the neuromagnetic signals are typically 50-500 fT, i.e. 10
9
or 10
8
times
smaller that the geomagnetic field of the earth [H¨ am¨ al¨ ainen et al., 1993]. The only
detector that offers sufficient sensitivity to measure such fields is the Superconducting
8
QUantum Interference Device (SQUID) introduced in the late 1960s by James Zimmer-
man [Zimmerman et al., 1970]. The first measurement of brain magnetic fields using
a SQUID magnetometer was carried out by David Cohen [Cohen, 1972] at the Massa-
chusetts Institute of Technology, and consisted of spontaneous alpha activity of a healthy
subject and abnormal brain activity in an epileptic patient.
The SQUID is a superconducting ring interrupted by thin insulating layers to form
one or two Josephson junctions [Barone and Paterno, 1982]. One important prop-
erty associated with Josephson junctions is that magnetic flux is quantized in units of
Φ
0
= 2.0678×10
−15
tesla m
2
. If a constant biasing current is maintained in the SQUID
device, the measured voltage oscillates as the magnetic flux increases; one period of
voltage variation corresponds to an increase of one flux quantum. Counting the oscil-
lations allows one to evaluate the flux change that has occurred, and therefore detect
magnetic fields on the order of a few fT. The sensitivity of the SQUID can be increased
to 1fT by attaching a coil of superconducting wire or flux transformer. The later is
placed as close to the human head as possible, and depending on its shape, can be con-
figured as a first-order planar or axial gradiometer, a second-order axial gradiometer,
or a simple magnetometer [H¨ am¨ al¨ ainen et al., 1993]. The gradiometer configurations
produce measurements proportional to the spatial gradient of the magnetic field thus
offering robustness to intereference from distant magnetic field sources.
Modern MEG systems consist of a few hundreds SQUID sensors placed in a liquid
helium-filled dewar, with the flux-transformer pickup coils surrounding a helmet struc-
ture (Fig. 2.2). World-wide, three companies build the majority of whole-head MEG
systems: 4-D Neuroimaging (formerly Biomagnetic Technologies Bti), Elekta Neuro-
mag Oy, and VSM MedTech Ltd (manufacturers of the CTF Systems). In recent years,
all three vendors have introduced dense arrays comprising over 200 SQUID channels.
9
Figure 2.2: Left: Whole-head CTF Omega MEG system with 275 axial gradiometers;
right: MEG sensors use low-temperature electronics cooled by liquid helium.
Brain magnetic signals are very weak compared with ambient noise. Outside dis-
turbances include power-line fields, electronic devices, elevators, and radio-frequency
waves. Nearby artifacts are caused by instrumentation noise and body interference,
such as heart, skeletal muscle, and spontaneous or incoherent background brain activity.
Shielded rooms made of successive layers of mu-metal, copper, and aluminum effec-
tively attenuate high frequency disturbances. Further, gradiometer flux transformers
cancel distant noise sources that produce magnetic fields with small spatial gradients.
10
2.3 Modeling
To estimate the neural sources of magnetic fields, one must first solve the associated
forward problem, i.e. the forward model that maps sources of known location, strength,
and orientation to the MEG sensors. The most common source model is the current
dipole [Baillet et al., 2001], used to approximate the flow of electrical current in a small
area of the brain. A typical strength of a current dipole, generated by the synchronous
firing of thousands of neurons, is 10 nAm. Alternatively, to avoid the identifiability prob-
lem that arises when too many small regions and their dipoles are required to represent
a single large region of coherent activation, we can use multipolar models, consisting of
dipoles, quadrupoles, octupoles, and so on [Mosher et al., 1999a].
Since the useful frequency spectrum for electrophysiological signals is largely below
100 Hz, the physics of MEG can be described with the quasi-static approximation
of Maxwell’s equations. The propagation of electromagnetic fields inside the head is
estimated based on the conductivities of the scalp, skull, gray and white matter, cere-
brospinal fluid and other tissue types. Head models that consist of a set of nested concen-
tric spheres with isotropic and homogeneous conductivities have closed form solutions.
Even though spherical head models work surprising well, more accurate solutions use
realistic head models based on anatomical information from high-resolution magnetic
resonance (MR) or x-ray computed tomography (CT) volumetric images. To solve the
forward problem using these more realistic models, numerical solutions using boundary
element methods (BEM), finite element methods (FEM), or finite difference methods
(FDM) are necessary [Darvas et al., 2004a].
To make inferences about the brain activity that gives rise to a set of MEG data,
we must solve the inverse problem, i.e. find a neuronal current source configuration that
explains the MEG measurements. Inverse methods for MEG can be roughly categorized
into two classes: imaging methods and dipole fitting/scanning methods. The imaging
11
Figure 2.3: MEG model; (a) Sensor arrangement of a 275-channel CTF MEG system,
(b) Topography of sensor measurements, (c) Minimum-norm inverse solution on a tes-
sellated cortical surface.
approaches are based on the assumption that the primary sources are intracellular cur-
rents in the dendritic trunks of cortical pyramidal neurons that are aligned normally to
the cortical surface. Consequently, a tessellated representation of the cerebral cortex is
extracted from a coregistered MR image and the inverse problem is solved for a current
dipole located at each vertex of the surface (Fig. 2.3). In this case, since the posi-
tion and orientation of the dipoles are fixed, image reconstruction is a linear problem
and can be solved using standard techniques. The dipole fitting or scanning methods
assume that the sources consist of only a few activated regions, each of which can be
represented by an equivalent current dipole of unknown location and orientation. The
standard approach to localization is to perform a least squares fit of the dipole model to
the data [Lu and Kaufman, 2003]. More recently, scanning methods have been devel-
oped that are also based on the dipole model, but involve scanning a source volume or
surface and detecting sources at those positions at which the scan metric produces a
local peak [Baillet et al., 2001]. Examples of these methods include the MUSIC (MUl-
tiple Signal Classification) algorithm [Mosher et al., 1992] and the LCMV (Linearly
Constrained Minimum Variance) beamformer [VanVeen et al., 1997a].
12
Due to intrinsic spatial ambiguities of the electromagnetic principles that underlie
MEG, the spatial resolution is lower that PET and fMRI. These ambiguities force a
choice between low resolution linear cortical imaging methods, or potentially higher
resolution methods based on parametric models or Bayesian or other nonlinear imaging
methods incorporating physiological priors that reflect the expected characteristics of
neural activation. A consensus is developing in the research community that no single
method suits all MEG applications; each method has strengths and weaknesses, reflect-
ing the ill-posedness of the inverse problem. The characteristics of expected neural
activation, as well as model fitting techniques, can facilitate the proper choice of inverse
methodology.
2.4 Statistical Analysis
Given the large number of localization methodologies in MEG, it is important to per-
form validation and statistical analysis under different experimental settings, such as
the number, location, and timeseries of neuronal sources. Furthermore, several meth-
ods require fine-tuning of parameters, such as the subspace correlation threshold for
the MUSIC algorithm. The receiver operating characteristic (ROC) curve is a standard
tool to evaluate the trade-off between sensitivity and specificity, and to compare dif-
ferent inverse methods. By varying a threshold applied to localization maps, we can
estimate two performance measures: the sensitivity or true positive fraction (TPF), and
1-specificity or false positive fraction (FPF). The ROC curve is a plot of TPF vs. FPF as
a detection threshold is varied. When comparing two detection methods, the one whose
ROC curve gives higher sensitivity at matched specificity, and vice versa, for all points
on the curve is the better detector. A simple metric to compare methods is the area
under the ROC curve (AUC), where the method with largest AUC is superior. The use
of free-response ROC, an ROC variant that can handle the presentation and detection of
13
multiple targets per image, is demonstrated in [Yildirim et al., 2005] for the evaluation
of minimum norm and scanning inverse methods.
In addition to evaluating the relative performance of different methods, it is impor-
tant to establish some degree of confidence in the results of real data analysis. Dipole
scanning methods often produce unstable solutions and the reproducibility of the recon-
structed dipoles is not guaranteed. A number of different approaches have been
investigated for assessing dipole localization accuracy, including Cramer Rao lower
bounds, perturbation analysis, and Monte Carlo simulation. To avoid strict distribu-
tional assumptions, a resampling alternative based on bootstrap theory was proposed
in [Darvas et al., 2005]. The principle underlying the bootstrap is that although the dis-
tribution of the data is unknown, it can be approximated by the empirical distribution
of a set of independent trials. By sampling with replacement over independent trials
collected during an event-related MEG study, the position, variance, and timeseries of
current dipoles can be estimated reliably.
In contrast to dipole scanning methods, imaging methods are hugely underde-
termined, resulting in low-resolution localization maps; interpretation is further con-
founded by the presence of additive noise exhibiting a highly nonuniform spatial corre-
lation. In this case, we need a mechanism to decide which features in the data are indica-
tive of true activation vs. those that are noise artifacts. To determine a suitable threshold
for detecting statistically significant activation, the familywise error rate (FWER), i.e.
the chance of any false positives under the null hypothesis of no activation (type 1 error)
is typically controlled. Parametric random field methods and nonparametric permuta-
tion methods are used to estimate familywise-corrected thresholds in [Pantazis et al.,
2005b]. Alternatively, the control of false discovery rate (FDR), i.e. the proportion of
false positives among those tests for which the null hypothesis is rejected, can produce
more sensitive thresholds.
14
Recent literature in MEG statistical analysis has been mostly limited to pairwise
comparisons at each cortical surface element for event-related averages. However,
extensions of this methodology to investigation of multiple effects using analysis of
variance (ANOV A) and analysis of covariance (ANCOV A) in individuals and groups is
possible, as for example in [Brookes et al., 2004].
2.5 Applications
Applications in MEG include both basic and clinical research. One of the most impor-
tant clinical applications is the detection, classification and localization of abnormal
neuronal activity in epilepsy patients. MEG has been successfully used to localize three
different spontaneous interictal signal components: epileptic spikes, slow-wave activ-
ity, and fast-wave activity [Lu and Kaufman, 2003]. Neurosurgical planning of med-
ically intractable epilepsy often includes the identification of epileptogenic lesions with
MEG [Stefan et al., 2003, Ossadtchi et al., 2004]. Furthermore, recent literature inves-
tigates the possibility of seizure prediction based on a drop in the complexity of neural
activity immediately before seizures [Maiwald et al., 2004].
In addition to the diagnosis of epilepsy, MEG is currently used for functional brain
mapping. Evoked response fields have been used to identify somatosensory, motor,
and vision related activity [Lu and Kaufman, 2003]. Several MEG studies [Bowyer
et al., 2005a, Simos et al., 1998, Pantazis et al., 2005a] have localized language-specific
cortical activity using either equivalent current dipoles or distributed cortical imaging,
with promising results for clinical application in neurosurgery. Time-frequency analysis
of MEG oscillatory evoked responses [Martin Vetterli, 1995, Hari and Salmelin, 1997,
Pantazis et al., 2005c] has detected networks of cortical interactions, and determined
the functional specificity of several frequency bands. A wide range of signal processing
techniques including image modeling and reconstruction, blind source separation, phase
15
synchrony estimation, non-linear analysis and chaos theory are under investigation to
reveal complex cognitive processes such as attention and working memory.
Recent reports in the literature have investigated how evoked response fields relate
to neuronal disorders, such as Alzheimer’s disease, autism, dyslexia, brain tumors, and
Parkinson’s disease. Furthermore, MEG has been used in conjunction with transaxial
magnetic stimulation to ameliorate abnormal brain activity [Anninos et al., 1991].
2.6 Summary
Magnetoencephalography is a relatively new medical imaging modality for monitoring
and imaging of human brain function. While spatial resolution is significantly lower
than that of PET and fMRI, the ability to monitor neuronal activation at the millisecond
time scale makes this modality, together with EEG, a unique window on the human
brain. Recent developments in instrumentation have lead to the manufacture of whole
head MEG arrays with in excess of 300 magnetometers. Coupled with new data analysis
tools for mapping brain function from MEG data, these systems will lead to important
new insights into the workings of the human brain with applications in both clinical and
cognitive neuroscience.
16
Chapter 3
Model and Inverse Methods
In this chapter we present a mathematical model that linearly relates MEG mea-
surements to brain activation. We then introduce several inverse methods based on
minimum-norm, spatial filtering, subspace correlation, and matched filter techniques, to
detect the neuronal current source configuration that explains the MEG measurements.
Oscillatory brain activity has been widely observed in response to sensory stimuli,
and during motor and cognitive tasks. To localize the brain areas and frequency bands
facilitating neuronal processes, the last part of this chapter presents a time-frequency
expansion of cortical source imaging using Morlet wavelets.
3.1 Model
We assume that MEG data are collected as a set of J stimulus-locked event-related
epochs (one per stimulus repetition), each consisting of a pre- and post-stimulus inter-
val. Each epoch consists of an array of data M (n
channels
× n
timepoints
) representing the
measured magnetic field at each sensor as a function of time. The measurements are
linearly related with the brain activation X as:
M = GX+ N (3.1)
where G (n
channels
xn
sources
) is the forward operator, X (n
sources
xn
timepoints
) is the matrix
of source time series, and N represents additive noise in the channel measurements. The
above model uses a single current dipole in each spatial location to represent a source
oriented normally to the cortical surface. We denote the lead field for spatial locationi
as g
i
(n
channels
x1). Alternatively, we may allow 3 current dipoles in each spatial location
17
to represent the x,y, and z components of the source. Then, the forward operator G has
dimensions (n
channels
x 3n
sources
), X has dimensions (3n
sources
x n
timepoints
), and the array
vector for spatial location i is G
i
(n
channels
x 3). The matrix G depends on the source
location and on the shape and conductivity of the head. G can be computed using a
simplified spherical head model, or more accurately using boundary or finite element
methods that account for the true shape and conductivity within the head [Baillet et al.,
2001, Mosher et al., 1999b].
3.2 Inverse Methods
MEG Inverse methods can be roughly categorized into two classes: imaging methods
and dipole fitting/scanning methods. An overview of these methods is given in Chapter
2; here we present the mathematical details of some representative algorithms.
Later chapters demonstrate our statistical analysis of MEG signals with noise-
normalized minimum-norm solutions. However, our statistical methodologies, namely
random fields and permutation tests, apply to any type of localization map. Further, we
typically restrict the reconstruction to elemental sources oriented normally to the cor-
tical surface, because the main generators of MEG signals are clusters of thousands of
synchronously activated pyramidal neurons oriented perpendicular to the cortex. Even
though we prefer cortical source imaging, our statistical methodologies can also be
applied to volumetric detection maps.
3.2.1 Regularized Minimum Norm Solutions
A cortical map of the source time series X can be computed by applying a linear inverse
method to produce an estimate of the temporal activity at each surface element in a
tessellated representation of the cortex. Even when dipole source orientations are con-
strained to be normal to the cortical surface, as is the case in the results presented below,
18
the forward matrix G is ill-conditioned and the inverse problem is highly illposed. Vari-
ous approaches have been proposed to ameliorate the effects of illposedness by introduc-
ing implicit or explicit constraints on the allowed current source distributions using reg-
ularization or Bayesian image estimation methods. The most popular of these, Tikhonov
regularization [Tikhonov and Arsenin, 1977], solves the problem:
ˆ
X = arg min
X
kM-GXk
2
2
+λkXk
2
2
(3.2)
whereλ is the tikhonov regularization parameter, whose choice is typically data depen-
dent [Golub and von Matt, 1997]. By setting the derivative with respect to X to zero, we
have:
ˆ
X = (G
T
G+λI)
−1
G
T
M (3.3)
= HM
where H = (G
T
G+λI)
−1
G
T
is the inverse operator.
Dale et al. [Dale et al., 2000] normalize the cortical maps using an estimate of the
background noise variance at each cortical element:
ˆ
X
nn
= (diag{HC
N
H
T
})
−1/2
HM (3.4)
These noise normalized images, which follow a t-distribution under the null hypoth-
esis of Gaussian background noise, have spatially more uniform point-spread function
than the regular minimum-norm approach. Furthermore, the spatial uniformity under
the null hypothesis should lead to uniform spatial specificity when thresholding these
maps.
19
3.2.2 Linearly Constrained Minimum Variance Beamformer
The Linearly Constrained Minimum Variance (LCMV) Beamformer [VanVeen et al.,
1997a] applies spatial filtering to sensor array data to discriminate between signals from
a location of interest and those originating elsewhere. In application to MEG, the goal
is to find a spatial filter that minimizes the output power of the beamformer subject to a
unity gain constraint at the desired location on the brain. The output of such a filter at
thei
th
spatial location is represented as:
Y
i
= W
T
i
M (3.5)
where W
i
(n
channels
x 3) is the spatial filtering matrix for source i. The LCMV beam-
forming filter satisfies:
min
W
i
{P
i
} subject to W
T
i
G
i
= I (3.6)
whereP
i
= tr{C
Y
i
} = tr{W
T
i
C
M
W
i
} is the estimated output power at source location
i, and C
M
is the data covariance. Using Lagrange multipliers, we solve for the spatial
filtering matrix:
W
T
i
= [G
T
i
C
−1
M
G
i
]
−1
G
T
i
C
−1
M
(3.7)
Using this filter and Eq. 3.5 we can form an estimate of the signal from each location on
the cortical surface. An estimate of the signal power at each source location i is given
byP
i
=tr{[G
T
i
C
−1
M
G
i
]
−1
}.
Van Veen et al. [VanVeen et al., 1997b] defined the neural activity index d
i
as the
estimated signal power normalized by the estimated noise power at locationi:
d
i
=
tr{[G
T
i
C
−1
M
G
i
]
−1
}
tr{[G
T
i
C
−1
N
G
i
]
−1
}
(3.8)
where C
N
is the estimated channel noise covariance.
20
It is straightforward to apply the above procedure when the sources are constrained
to align normally to the cortical surface. For this case, the neural activity index simplifies
to:
d
i
=
{g
T
i
C
−1
M
g
i
}
−1
{g
T
i
C
−1
N
g
i
}
−1
(3.9)
In the results presented below the LCMV is used to scan through all possible source
locations on the cortical surface, producing the neural activity index,d
i
, as an indicator
of brain activity, at each location.
3.2.3 Multiple Signal Classification (MUSIC)
The MUSIC method was initially developed in the array signal processing commu-
nity [Schmidt, 1986], before being used in MEG/EEG source localization [Mosher and
Leahy, 1998]. Each spatial location in the brain volume, or cortical surface, is scanned
for current dipoles of unknown position and orientation. The forward model for each
potential dipole location is projected against an estimated signal subspace computed
from the spatio-temporal channel data. The positions and orientations that give the best
projections onto the signal subspace correspond to the estimated dipole parameters.
Let M = U
M
Σ
M
V
T
M
be the singular value decomposition of the spatiotemporal
measurement matrix M. Then, the matrix U
M
forms an orthonormal basis for the sub-
space spanned by the data. If the number of channels is greater than the number of lin-
early independent active sources p, the signal to noise ratio (SNR) is sufficiently large,
and the noise at the channels is i.i.d, we can define a basis U
s
for the signal subspace
using the firstp column vectors of U
M
and a basis U
N
for the noise subspace using the
remaining column vectors of U
M
[Mosher and Leahy, 1998]:
U
M
= [u
1
...u
p
|u
p+1
...u
n
channels
] = [U
s
|U
N
] (3.10)
21
The number of expected sourcesp in Eq. 3.10 is typically chosen based on the singular
values of the data covariance matrix C
M
.
The MUSIC method identifies sources that maximize the correlation between the
signal subspace U
s
and the lead field G
i
for each spatial location, i. We can form a
cortical detection mapd
i
by computing the correlation between these two subspaces as
given in [Mosher and Leahy, 1998]:
d
i
= subcorr{G
i
, U
s
}
1
(3.11)
where subcorr{A, B}
1
denotes the cosine of the smallest principal angle between the
column spaces of matrices A and B [Golub and Loan, 1996].
3.2.4 Matched Filter
Matched filters have been widely used in signal detection since, under the assumption of
additive Gaussian noise, they lead to the uniformly most powerful test for detection of a
known signal [Scharf, 1991]. We assume here that a source centered at cortical element
i consists of a contiguous activated cortical patch which we denote by the set N
i
. We
can compute a signal “template”’ in the sensor space for this source as:
t
i
=
X
jN
i
c
j
g
j
(3.12)
where the coefficientsc
j
represent the relative strength of the source at cortical element
j and g
j
is the corresponding array vector.
Using this template, we can rewrite the data model in Eq. 3.1 as:
m(t) = t
i
x(t)+ n(t) (3.13)
where m(t) are the channel measurements andx(t) is the time-course of theith source
at timet; we have dropped the matrix notation to ease the following treatment.
22
Note that this model assumes only a single source or patch is active, located at posi-
tion i. As we shall see below, this results in significant degradation in performance
when more than one source is present, which limits the utility of the matched filter for
this application. We further assume the noise n(t) is a white process with standard devi-
ationσ. However, no spacial treatment is required for colored noise, other than applying
a prewhitening operator.
Given the above assumptions, the vector m(t) follows a Gaussian distribution
N(t
i
x(t),σ
2
I), which belongs to a one-parameter exponential family. From the Fisher-
Neyman factorization theorem, we know a sufficient statistic for the parameter x(t) is
the scalar
d
i
(t) =
t
T
i
σ
p
t
T
i
t
i
m(t) (3.14)
This statistic is the matched filter output when testing for a source at location i. If
the actual patch is indeed present at location i, the statistic follows the distribution
N(
p
t
T
i
t
i
x(t)/σ,1) and should produce a large value. However, if the actual source is
at another location a, then the statistic follows the distribution N(
p
t
T
i
t
a
x(t)/σ,1), and
should typically give a small value because the template and the signal do not match.
Applying Eq. 3.14 at each potential source locationi on the cortical surface, we gener-
ate a matched filter detection mapd
i
(t) for each time instancet, which we can threshold
to detect the source location.
Since the sensitivity of the MEG array varies with source location, the mean of
the matched filter output (
p
t
T
i
t
a
x(t)) will depend on the location of the true patch.
Consequently if we apply a fixed threshold to the detection maps d
i
, specificity will
vary with respect to spatial location of the actual source, as illustrated in Fig. 3.1a. To
overcome this problem, we convert the detection maps into z-scores, as follows:
d
z
i
(t) =
d
i
(t)−E
i
{d
i
(t)}
std
i
{d
i
(t)}
(3.15)
23
0 2000 4000 6000
0
0.2
0.4
0.6
0.8
1
MF Outputs
Patch No
Power
0 2000 4000 6000
0
2
4
6
8
z−Scores
Patch No
Power
(a) (b)
Figure 3.1: (a) Detection maps for two separate actual source locations. (b) Correspond-
ing z-scores. Notice that, although individual detection maps have different amplitudes,
z-score maps exhibit the same range of values so that thresholding is meaningful.
The effect of this transformation is to produce uniform specificity for a fixed threshold
with respect to spatial location of the true patch (Fig. 3.1b). After constructing a sep-
arate detection map for each timepoint d
z
i
(t), we form a single detection map over the
whole time period (as in the case of LCMV beamformer and MUSIC methods) using
the average power of each surface element over time:
d
i
= E
t
{d
z
i
2
(t)} (3.16)
3.3 Imaging Oscillatory Behavior in MEG studies
Event-related changes in oscillatory brain activity have been widely observed in
response to sensory stimuli, and during motor and cognitive tasks, in a variety of human
EEG and MEG experiments [Worden et al., 1946, Singh et al., 2002, Pfurtscheller and
da Silva, 1999,Jerbi et al., 2005,Basar et al., 2001]. These changes often localize in spe-
cific brain areas and frequency bands, depending on the underlying neuronal processes.
Increases and descreases in oscillatory power, relative to some baseline, can be attributed
to event-related synchronization and desynchronization of neuronal populations [Singh
et al., 2003, Lee et al., 2005].
Task-related activity that is precisely phase locked to the stimulus is referred to as
an evoked response and averaging of such responses over multiple epochs will result in
24
Figure 3.2: MEG brain activity in response to a task or stimulus consists of evoked
components that are phase-locked to the stimulus and induced components that are not.
While averaging over epochs enhances the phase locked SNR, averaging also results in
suppression of the non-phase locked components.
increased signal to noise ratio relative to background noise and brain activity unrelated
to the task, as illustrated in Fig. 3.2. However, there are other oscillatory components
in EEG and MEG data whose power is modulated in response to a task or stimulus but
which are not phase locked to the stimulus. Averaging of these non phase-locked com-
ponents over multiple epochs will result in signal cancellation as also illustrated in Fig.
3.2. These induced components in the data are believed to play an important role in
neural communication. Consequently producing spatial maps of these induced oscilla-
tory changes is important for developing an increased understanding of communication
in the brain.
3.3.1 Cortical Source Imaging with Time-Frequency Expansions
Since event-related components in MEG studies are often lost in background brain activ-
ity and environmental and sensor noise, it is a standard technique for noise reduction to
average over stimulus-locked responses. However this also removes induced oscillatory
changes in brain activity, because they are not phase locked to the stimulus (Fig. 3.2). To
25
Figure 3.3: Time-varying frequency components of a source on the frontal lobe, after
averaging over many epochs; we notice desynchronization in the β band 250-600ms
after stimulus. The Morlet wavelet is a Gaussian-windowed complex sinusoid with the
real part shown in blue, and the imaginary part in red.
overcome the problem of cancellation of non-phase locked oscillatory components, we
use time-frequency analysis [Morgan and Gevins, 1986, Martin Vetterli, 1995] of each
individual epoch and then analyze event-related changes by computing average power
across epochs in time-frequency space. Figure 3.3 illustrates time-frequency analysis
using Morlet wavelets [Teolis, 1998] for a single source on the frontal lobe in an MEG
study involving attention to a visual task. Repeating this procedure at all cortical loca-
tions, we can produce dynamic cortical images in multiple frequency bands. While the
SNR in individual epochs is too low to see any but the strongest components, averag-
ing signal power across epochs should reveal significant event-related activity across the
entire time-frequency space.
Recently, there have been many studies investigating oscillatory components and
their relation to cognitive, motor and sensory tasks [Jerbi et al., 2004,Hari and Salmelin,
1997, Tallon-Baudry et al., 1997]. Time-frequency analysis has been successfully used
26
to detect and localize evoked and induced responses [Bertrand et al., 2000, Morgan and
Gevins, 1986, Tallon-Baudry et al., 1996].
3.3.2 Morlet Wavelet
Based on the model described in Section 3.1, a cortical map can be computed for each
epoch by applying a linear inverse method, such as a Tikhonov regularized minimum-
norm solution, to produce an estimate of the temporal activity at each surface element
in the cortex:
ˆ
X = (G
T
G+λI)
−1
G
T
M (3.17)
We write the reconstructed cortical maps as{X
itj
}, wherei = 1,...,S,t =−N
0
+
1,...,N, and j = 1,...,J are indices in space, time, and epoch, respectively. We let
t = 1 correspond to the stimulus event time so that there areN
0
pre-stimulus time points.
We use the pre-stim data to estimate the baseline mean m
ij
and standard deviation s
ij
at each spatial element i at epoch j, by averaging over time t. We then estimate the
noise-normalized centered data as:
Y
itj
=
X
itj
−m
ij
s
ij
(3.18)
We use a continuous wavelet transform [Martin Vetterli, 1995] to decompose the
source timeseries (Y
itj
,t = −N
0
+ 1,...,N) into their wavelet coefficients. Unlike
the Fourier transform, which decomposes a signal into infinite length sines and cosines
and loses all time localization information, the continuous wavelet transform basis func-
tions are scaled and shifted versions of the time-localized mother wavelet. A commonly
used continuous time wavelet is the complex Morlet wavelet [Teolis, 1998], a Gaussian-
windowed complex sinusoid defined as:
w
tf
= (πb)
−0.5
e
2iπft
e
−t
2
/b
(3.19)
27
θ band γ band
Figure 3.4: Normalized energy increase in a visual attention MEG study in theθ andγ
frequencies. The colormap encodes percentage increase in energy in units of standard
deviation. The maps reveal high activity in the occipital lobe and suggest a relevant
functional association betweenθ andγ bands.
whereb is the bandwidth parameter.
For each epoch j and each source i we obtain an estimate of the time-varying fre-
quency components by expanding the timeseries using Morlet wavelets as:
C
itfj
=Y
itj
?w
tf
(3.20)
where (?) denotes the convolution operator over the time indext, andC
itfj
are the com-
plex wavelet coefficients. Because the wavelet decomposition is linear and computed
entirely in the time domain, while the inverse operator (3.17) is computed entirely in
the spatial domain, the two operators commute. In practice it is more computationally
efficient to first compute the wavelet decomposition in the channel domain, and then to
apply the inverse operator (3.17) to each of the wavelet coefficients.
Our goal is to detect spatial regions of activity in MEG-based cortical maps that
exhibit task-related changes in oscillatory behavior. Based on our understanding of
neuronal processes related to visual and sensorimotor tasks, we define time-frequency
bandsS
k
={(t,f),t∈ [t
k
0
t
k
1
],f ∈ [f
k
0
f
k
1
]}, that may contain significant event-related
28
components of brain activity. For example, we can choose time-frequency bands cor-
responding to early (200-500ms) and late (700-1000ms) responses to a stimulus event,
in the θ (4-7Hz), α low (8-10Hz), α high (11-14Hz), β low (14-19Hz), β high (20-
30Hz), andγ (30-50Hz) frequency regions. We estimate the total power in these bands
by integrating over time and frequency:
E
ikj
=
ZZ
(t,f)∈S
k
|C
itfj
|
2
dtdf (3.21)
We can then produce dynamic images of brain activity on the cerebral cortex in multiple
time-frequency bands (Fig. 3.4).
29
Chapter 4
Controlling Familywise Error Rate in
Cortical Current Density Maps
We describe the use of random field and permutation methods to detect activation in
cortically-constrained maps of current density computed from MEG data. The meth-
ods are applicable to any inverse imaging method that maps event-related MEG to a
coregistered cortical surface. These approaches also extend directly to images com-
puted from event-related EEG data. We determine statistical thresholds that control the
familywise error rate (FWER) across space or across both space and time. Both random
field and permutation methods use the distribution of the maximum statistic under the
null hypothesis to find FWER thresholds. The former methods make assumptions on
the distribution and smoothness of the data and use approximate analytical solution, the
latter resample the data and rely on empirical distributions. Both methods account for
spatial and temporal correlation in the cortical maps. Unlike previous nonparametric
work in neuroimaging, we address the problem of non-uniform specificity that can arise
without a Gaussianity assumption. We compare and evaluate the methods on simulated
data and experimental data from a somatosensory evoked response study. We find that
the random field methods are conservative with or without smoothing, though with a 5
vertex FWHM they are close to exact. Our permutation methods demonstrated exact
specificity in simulation studies. In real data the permutation method was not as sen-
sitive as the RF method, although this could be due to violations of the random field
theory assumptions.
30
4.1 Overview
Magnetoencephalography (MEG) is used to image electrical activity in the brain. Clus-
ters of thousands of synchronously activated pyramidal cortical neurons are believed
to be the main generators of MEG signals. In particular, the currents associated with
their large dendritic trunks, which are locally oriented in parallel and perpendicular
to the cortical surface, are the primary source of the neuromagnetic fields outside the
head [H¨ am¨ al¨ ainen et al., 1993]. Imaging approaches to the MEG inverse problem
exploit this concept by restricting the reconstruction to elemental sources (dipoles) ori-
ented normally to the cortical surface [Dale and Serano, 1993]. Consequently, a com-
monly used approach extracts a tessellated representation of the cerebral cortex from a
coregistered MR image and solves the inverse problem for a current dipole located at
each vertex of the tessellated surface. Since the position and orientation of the dipoles
is fixed, image reconstruction is a linear problem and can be solved using standard tech-
niques [Baillet et al., 2001, H¨ am¨ al¨ ainen et al., 1993, Katila and Karp, 1983, Phillips
et al., 1997]. However, the highly convoluted nature of the human cortex requires the
use of many thousands of dipoles for an accurate representation of the cortical surface.
The inverse problem becomes hugely underdetermined and the resulting current density
maps (CDMs) are of low resolution; interpretation is further confounded by the presence
of additive noise exhibiting a highly nonuniform spatial correlation.
As with fMRI images, objective assessment of CDMs requires a principled approach
to identifying regions of activation. The analysis of CDMs involves testing thousands
of hypotheses (one per surface element) for statistically significant experimental effects.
This raises the possibility of large number of false positives simply as a result of multiple
hypothesis testing. To effectively control the number of false positives over all tests we
must therefore consider the multiple hypothesis testing problem. Many false positive
measures have been proposed in this context, including familywise error rate, expected
31
false discovery rate, per-comparison error rate and per-family error rate [Nichols and
Hayasaka, 2003]. The standard approach, and the one investigated in this work, is to
control the Familywise Error Rate (FWER), i.e. the chance of one or more false positives
under the null hypothesis.
The simplest approach to controlling the FWER is the Bonferroni correction method
[Hochberg and Tamhane, 1987, Nichols and Hayasaka, 2003]. This method produces
conservative thresholds unless the tests are independent, a case that is rarely true in
neuroimaging experiments and certainly not for the smooth images reconstructed from
MEG data. Other methods that consider the spatial dependence of the data make infer-
ences based on the global maximum distribution. The FWER is directly related to the
maximum statistic; one or more voxels T
i
will exceed the threshold u under the null
hypothesisH
0
only if the maximum exceeds the threshold:
P(FWER) = P(∪
i
{T
i
>u}|H
o
)
= P(max
i
T
i
>u|H
o
)
= 1−F
maxT|Ho
(u) (4.1)
= 1−(1−α
o
) =α
o
Therefore, we can control the FWER if we choose the threshold u to be in the (1−
α)100
th
percentile of the maximum distribution.
Random Field (RF) theory methods approximate the upper tail of the maximum
distribution F
max
using the expected value of the Euler Characteristic of the thresh-
olded image [Worsley et al., 1996]. They are implemented in various software pack-
ages (SPM - http://www.fil.ion.ucl.ac.uk, V oxBo - http://www.voxbo.org, and FSL -
http://www.fmrib.ox.ac.uk/fsl among others), and are typically used in PET and fMRI
studies. However, RF theory relies on several assumptions including the following: the
image has the same parametric distribution at each spatial location, the point spread
32
function has two derivatives at the origin, sufficient smoothness to justify application of
continuous RF theory, and a sufficiently high threshold for the asymptotic results to be
accurate.
Resampling methods are a different approach to controlling the FWER that exploit
the information contained in the data to estimate the empirical distribution of the max-
imum statistic. They do not assume parametric distributions, they adapt to underlying
correlation patterns in the data, and are now computationally feasible. The two main
categories are bootstrap-based, which allow for a general modeling framework, and
permutation-based, which require some knowledge of exchangeability conditions under
the null hypothesis. Here we only consider permutation methods since they are exact,
i.e. they give precise control of the FWER, while bootstrap methods are only asymptoti-
cally exact. Furthermore, the permutation approach relies on a less restrictive exchange-
ability condition than the requirement in the bootstrap that samples are independent and
identically distributed.
Permutation and RF theory methods have been applied widely in functional [Wors-
ley et al., 1996, Worsley et al., 1992, Nichols and Holmes, 2001, Andrade et al., 2000]
and structural [Pantazis et al., 2004, Chung, 2001, Thompson et al., 2001, Thompson
et al., 2003,Sowell et al., 1999a,Sowell et al., 1999b,Bullmore et al., 1999] brain imag-
ing. However, until recently error rate control in MEG experiments has drawn little
attention. [Dale et al., 2000] normalized the CDMs using an estimate of the background
noise variance at each cortical element. These normalized images follow a t-distribution
under the null hypothesis of Gaussian background noise. Thresholding of the resulting
statistical maps was then used to detect significant activation, however the multiple com-
parisons problem was not addressed. [Barnes and Hillbrand, 2003] presented an applica-
tion of RF theory to MEG data but their method is specifically tailored to beamforming
solutions rather that the general linear inverse methods. [Carbonell et al., 2004] used
33
Hotelling’s T
2
random fields to localize significant MEG/EEG activation in time, and
then t-statistics to achieve spatial localization. Permutation tests were applied by [Blair
and Karnisky, 1994] for the analysis of EEG data as recorded on an array of electrodes,
and by [Pantazis et al., 2003] for the analysis of MEG data in reconstructed cortical
maps of brain activation. An alternative permutation scheme, proposed by [Singh et al.,
2003], detects event-related synchronization or desynchronization components in an
MEG study involving visual stimulation and an LCMV beamformer applied to data
decomposed into multiple frequency bands. The current work presents a novel general
RF theory based method to control FWER in MEG. Also, it extends the results in [Pan-
tazis et al., 2003] to extract thresholds for each time-point. Finally, we compare RF
theory and permutation methods in terms of specificity, sensitivity and possible limita-
tions.
4.2 Methods
Our goal is to detect spatial and temporal regions of significant activity in MEG-based
cortical maps while controlling familywise error rate. The methods to do this that we
describe below also apply directly to cortical maps computed from EEG data, since the
inverse imaging methods differ only in the forms of their lead field matrices [Baillet
et al., 2001]. In this section we first describe our MEG data model. We then present two
methods, the first based on RF theory and the second on permutations, for controlling
the error rate in MEG experiments.
4.2.1 Model
We assume that MEG data are collected as a set of J stimulus-locked event-related
epochs (one per stimulus repetition), each consisting of a pre- and post-stimulus inter-
val. Each epoch consists of an array of data M (n
channels
× n
timepoints
) representing the
34
measured magnetic field at each sensor as a function of time. The measurements M are
linearly related with the brain activation X (n
sources
×n
timepoints
) according to the model
presented in section 3.1 and repeated here for quick reference:
M = GX+ N (4.2)
where G (n
channels
× n
sources
) is the forward operator and N represents additive noise in
the channel measurements.
A cortical map can be computed for each epoch by applying a linear inverse method
to produce an estimate of the temporal activity at each surface element i in the cortex.
We write the reconstructed cortical maps as {X
itj
} where i = 1,...,S, t = −N
0
+
1,...,N, and j = 1,...,J are indices in time, space, and epoch, respectively. We
let t = 1 correspond to the stimulus event time so that there are N
0
pre-stimulus time
points. We use the pre-stim data to estimate the baseline mean ˆ μ
i
and standard deviation
ˆ σ
i
at each spatial elementi. We model the centered data≡ Y
itj
=X
itj
− ˆ μ
i
as
Y
itj
=μ
it
+
itj
, t = 1,...,N (4.3)
where μ
it
is the spatiotemporal profile, and
itj
is the zero mean random error. We
assume independence across epochs, but not over time and space; that is, that
itj
is
independent of
itj
0 forj 6=j
0
and alli,t.
We take the standard massively univariate approach and model each spatial location
independently. For each spatial location the model then amounts to a 1-way ANOV A
model with repeated measures. The standard linear modeling approach would be to find
estimates ˆ μ
it
and test
H
0
:{μ
it
= 0,t = 1,...,N}, (4.4)
for each location i. Estimation and inference on this hypothesis is complicated by the
temporal dependence, as the optimal estimates require the temporal whitening of the
35
data (and model). The temporal dependence can be difficult to estimate, and whitening
with an inaccurate covariance can bias variance estimates and even increase the vari-
ance of mean estimates [Friston et al., 2000]. Inference on the nulls in Eq. 4.4 with
F statistics is also challenging. Without whitening, the null F distributions will be
incorrect (though conservative approximations are available [Greenhouse and Geisser,
1959]), while whitening with estimated covariances yields only approximately correct
inferences.
Hence we depart from standard statistical approaches in three important ways. First,
instead of whitening we use ordinary least squares with inference methods that are valid
despite the dependence (the case for permutation), or that require only low-dimensional
characterization of the dependence (the case for RF theory). While this implies that our
parameter estimates do not have minimum variance, it makes our methods general and
easy to implement. In particular, since we expect no experimental effect over epochs,
the spatiotemporal profiles are estimated as the appropriate average overj in Eq. 4.5:
ˆ μ
it
=
¯
Y
it·
, (4.5)
where the bar indicates an average over the dotted subscript.
Second, instead of testing the temporal-omnibus null hypothesis (Eq. 4.4), we test
H
0
: μ
it
= 0, (4.6)
for each t > 0 and i. This allows temporal as well as spatial localization. The test
t-statistic for Eq. 4.6 is
T
it
=
ˆ μ
it
ˆ σ
i
/
√
J
. (4.7)
Lastly, we use the pre-stimulus variance estimate ˆ σ
i
instead of a residual variance
estimator. We may do this because we expect no systematic response in pre-stimulus
time. We choose to do this since it is likely that the stimulus will increase the variance
36
in the post stimulus time period in a heterogeneous fashion; in the presence of a response
the residual variance estimator would overestimate the background noise level.
4.2.2 Random Field Theory Method
Random field methods use the topology of the statistic image to estimate the maximum
distribution. In this section we describe their application to surface reconstructed MEG
data.
Assumptions
Gaussian random field methods assume that the statistic image, or equivalently the nor-
malized errors, approximate a Gaussian random field. Based on our modeling, to control
FWER over space it is required that for eacht andj the spatial process of the normalized
errors
(
itj
p
Var(
itj
)
)
i
(4.8)
is a sampled version of a continuous, mean zero, unit variance Gaussian random field.
To control FWER over space and time, the spatiotemporal process must satisfy this
condition. While the core RF theory results also assume stationarity, there have been
recent developments that relax this assumption, as described below.
When the degrees of freedom are high, thet random fields will be well approximated
by a Guassian random field; [Worsley et al., 1995] offers a rule of thumb of 120 degrees-
of-freedom as being sufficient. Ourt statisticT
it
hasN
0
−1 degrees of freedom. As the
number of epochs (J) grows, the central limit theorem implies that ˆ μ
it
will be Gaussian,
even if
itj
is not. Hence if both N
0
and J are large, the random field methods should
have their distributional assumptions satisfied.
37
Spatial and Temporal Smoothing
To apply the continuous RF theory results, the statistic images must be sufficiently
smooth, since failure to satisfy this assumption typically leads to conservative thresh-
olds. [Nichols and Hayasaka, 2003] found that 3 voxels FWHM smoothness is suf-
ficient for three dimensional Gaussian images; more smoothness was required for low
degrees-of-freedomt images. The smoothness required of surface reconstructed CDM’s
is unknown. The highly convoluted cerebral cortex causesT
it
to be rather rough across
i, especially on gyral crests and sulcal fundi where neighboring vertices have rapidly
changing orientations. Therefore we consider spatially smoothing ˆ μ
it
to avoid conserv-
ative thresholds.
Smoothing in Euclidean space is typically performed by convolving the imageX(z),
z∈R
n
with a Gaussian kernel of FWHM =4(ln2)
1/2
√
t as follows:
X
0
(z,t) =
1
(4πt)
n/2
Z
R
n
e
−(z−y)
2
/4t
f(y)dy (4.9)
Here the time variablet determines the degree of smoothing by analogy to an isotropic
diffusion process. This equation cannot be applied directly to cortical surfaces. To apply
smoothing on arbitrary curved surfaces, we can reformulate Gaussian kernel smoothing
as a solution of a diffusion equation on a Riemannian manifold [Chung, 2001]:
∂X
0
∂t
= ΔX
0
(4.10)
with the initial condition X
0
(z,0) = X(z). For an n-dimentional Euclidean space,
ΔF = ∂
2
F/∂x
2
1
+... +∂
2
F/∂x
2
n
; for an arbitrary Riemannian manifold (such as a
cortical surface),Δ is called the Laplace-Beltrami operator and is given by:
ΔF =
X
i,j
1
|g|
1/2
∂
∂u
i
(|g|
1/2
g
ij
∂F
∂u
j
) (4.11)
38
where g = (g
ij
) is a 2x2 matrix whose coefficients are the Riemannian metric tensors,
u
1
and u
2
are the coefficients of a local parameterization of the cortical surface, and
(g
ij
) = g
−1
[Boothby, 2002, Arfken, 2000].
If control of FWER’s over time and space is desired, then smoothing in time may
also be required. This can be accomplished with initial bandpass filtering of the data, or
with temporal smoothing of the CDMs. One problem with temporal smoothing is that
the usual estimate of σ
i
is only unbiased under independence; when temporal samples
are correlated, a bias correction for ˆ σ
i
can be found using the Saitherwaite correction
[ [Satterthwaite, 1946a, Worsley and Friston, 1995]].
Estimating the maximum distribution
As shown in Eq (4.1), the FWER can be determined directly from the maximum distri-
bution; specifically, the probability of the maximum exceeding u gives the FWER for
thresholdu. [Adler, 1981] demonstrated that the expected value of the Euler Character-
istic is a good approximation of this probability whenu is large. [Worsley et al., 1996]
provides an intuitive formula that unifies the results for all types of random fields:
P(max
i
T
it
≥ u)≈
2
X
d=0
R
d
(S)ρ
d
(u) (4.12)
This equation gives the FWER P-value for thresholdu in a 2-dimensional random field
(T
it
for fixed t) in a search region S (cortical surface). The term R
d
(S) is the d-
dimensional RESEL count, a unitless quantity that depends only on topological features
of the CDMs in the search regionS. Theρ
d
(u) term is the Euler Characteristic density
that depends only on the threshold u and the type of statistical field (such as z, t, X
2
,
and Hotelling’sT
2
). In Eq. 4.12 the summation of the lower dimensional terms (d = 0
andd = 1) compensate for the case when the excursion set, i.e. the regions of voxels in
a field above a thresholdu, touches the boundary.
39
It is straightforward to extend these results to a 3-dimensional random field that
includes the time dimension [Worsley et al., 1996]. Letτ denote the number of RESELS
computed in the time dimension, then:
P(max
it
T
it
≥ u)≈
2
X
d=0
R
d
(S)[τρ
d+1
(u)+ρ
d
(u)] (4.13)
In MEG we are interested in positive or negative deflections of current density, and
hence make inferences on the absolute values|T
it
| using two-side FWER P-values:
P(max
i
|T
it
|≥u) ≈ 2R
2
(S)ρ
2
(u) (4.14)
(4.15)
for control of FWER over space only, and
P(max
it
|T
it
|≥u) ≈ 2R
2
(S)[τρ
3
(u)+ρ
2
(u)] (4.16)
for control of FWER over space and time.
We now give equations for ρ
2
(u), ρ
3
(u) and estimates for R
2
(S) and τ based on
pre-stimulus data. [Worsley et al., 1996] gives the Euler characteristic density for a 2-
dimensional Gaussian RF as
ρ
2
(u) =
4log
e
2
(2π)
3
2
e
−u
2
/2
u (4.17)
ρ
3
(u) =
(4log
e
2)
3
2
(2π)
2
e
−u
2
/2
(u
2
−1) (4.18)
and for at RF as
ρ
2
(u) =
4log
e
2
(2π)
3
2
Γ(
ν+1
2
)
(
ν
2
)
1
2
Γ(
ν
2
)
1+
u
2
ν
−
1
2
(ν−1)
(4.19)
ρ
3
(u) =
(4log
e
2)
3
2
(2π)
2
1+
u
2
ν
−
1
2
(ν−1)
ν−1
ν
u
2
−1
(4.20)
40
The estimation ofR
2
(S) is complicated by the strong but heterogeneous spatial cor-
relation. A thorough treatment of such fields can be found in [Worsley, 2000] and [Wors-
ley et al., 1999]; we only summarize the results here. The variance ofT
it
is already unity,
so does not need to be normalized. For each surface elementi we define the time vector
T
i
withN
0
elementsT
it
,t =−N
0
+1,...,0. We assume the tessellated cortical surface
comprisesK trianglesΔ
k
,k = 1,...,K and let T
i
0k
, T
i
1k
, and T
i
2k
be the random field
time vectors on the vertices of trianglek. Then the number of RESELS,R
2
(S), on the
2-dimensional cortical surface is:
R
2
(S) =
K
X
k=1
det(
1
N
0
−1
P
T
k
P
k
)
1/2
2
(4log
e
2)
−1
(4.21)
P
k
= [T
i
1k
− T
i
0k
, T
i
2k
− T
i
0k
] (4.22)
whereP
k
is anN
0
×2 matrix and
1
N
0
−1
P
T
k
P
k
corresponds to an averaging over time.
Finally, the number of time RESELSτ over a time intervalT
0
is given by:
τ =
T
0
FWHM
t
=
T
0
p
4log
e
2
λ (4.23)
where λ = Var(
∂T
it
∂t
) = E
t
{(
∂T
it
∂t
)
2
} is the variance of the temporal derivative and is
estimated by finite differences.
4.2.3 Permutation Method
We use resampling methods to allow more flexible models and to avoid random field the-
ory assumptions. The standard approach to permutation tests is to find units exchange-
able under the null hypothesis. While epochs can be regarded as independent, permuting
of epochs does not change the value of epoch-averaged statistics (although the bootstrap
can make use of the epoch replicates). By collecting equal pre- and post-stimulus data
(N =N
0
) we can permute pre- and post-stim data since these intervals are interchange-
able under the null hypothesis of no activation at any post-stimulus timepoint. GivenJ
41
original epochsY
itj
,j ={1,··· ,J}, we can createM ≤ 2
J
permutation samplesY
∗
itj
,
each consisting ofJ new epochs (Fig. 4.1). The symbol (*) indicates that the valuesY
∗
itj
are created by permutation.
Assumptions
Except in the special case of data from randomized experiments, permutation methods
are not assumption-free. In order to permute the data under the null hypothesis, we must
assume that the distribution of the errors is exchangeable with respect to epochs. That is,
the multivariate distribution of{
itj
} is invariant with respect to permutations of indices
j, but is otherwise unspecified; independence across epochs is sufficient to satisfy this
condition. We further assume that under the null hypothesis, the distribution of{
itj
} is
in unaltered by exchanging the pre- and post-stim data (we require thatN =N
0
).
Permutation Statistics
Both permutation and RF theory approaches use the maximum distribution to control the
FWER over space for one time point, or over time and space simultaneously. To obtain
thresholds that control FWER over space at a given timet we compute the permutation
distribution of the spatial maximum:
e
T
∗
·t
= max
i
|T
∗
it
|
(4.24)
where the tilde indicates the maximum over the dotted subscript. TheseM permutation
samples
e
T
∗
·t
estimate the null distribution of
e
T
·t
, written as
ˆ
F
e
T·t
1
. The level α spatial
FWER threshold is
ˆ
F
−1
e
T·t
(1−α).
1
F
X
denotes the cumulative density function (CDF) of the random variableX
42
Figure 4.1: Illustration of the summarizing procedures used to construct FWER-
corrected thresholds: the original epochs Y
itj
produce M permutation samples Y
∗
itj
by
exchanging pre- and post-stimulus data. The epochs are then averaged and normalized
to produce T
it
and T
∗
it
. Three different summarizing methods can be used. Method
1: T
∗
it
are summarized in time (
e
T
∗
i·
) and space (
e
T
∗
··
) to produce epochwise thresholds
(
ˆ
F
−1
e
T··
(1− α)); Method 2: T
∗
it
are summarized in time, converted into p-values (P
∗
i
)
and summarized in space (
e
P
∗
·
) to produce uniform specificity epochwise thresholds
(
ˆ
F
−1
e
P·
(α)); Method 3: T
∗
it
are summarized in space (
e
T
∗
·t
) to produce slicewise thresholds
(
ˆ
F
−1
e
T·t
(1−α)).
Thresholds that control FWER over time at a given spatial locationi, or that control
FWER over both time and space, are found by the corresponding permutation distribu-
tion:
e
T
∗
i·
= max
t>0
|T
∗
it
|
e
T
∗
··
= max
i
e
T
∗
i·
(4.25)
The
e
T
∗
··
are used to estimate the null distribution of
e
T
··
, written as
ˆ
F
e
T··
. The level α
FWER threshold is
ˆ
F
−1
e
T··
(1−α). The above procedures are summarized as methods 1
and 3 in Table 4.1.
43
Time-Summarizing Space- Summarizing
Method 1
e
T
∗
i·
= max
t>0
|T
∗
it
|
e
T
∗
··
= max
i
e
T
∗
i·
Method 2 P
∗
i
=p
i
(
e
T
∗
i·
)
e
P
∗
·
= min
i
e
P
∗
i
Method 3
e
T
∗
·t
= max
i
|T
∗
it
|
Table 4.1: Summary statistics for three permutation methods. The permutation samples
areT
∗
it
, withi the spatial index, andt the time index. The tilde indicates the maximum
over the dotted subscript;p
i
(·) is the permutation p-value function using only data from
spatial locationi.
Achieving Uniform Spatial Specificity
Permutation tests are always valid given the assumption of exchangeability under the
null hypothesis. However, if the null distribution varies across space, different surface
elements will have differing specificity, as illustrated in Fig. 4.2. To address the problem
we can perform an element by element normalization by converting the statistic values
at each location to P-values, which are computed using permutations. We then control
FWER with respect to the distribution of the minima of these P-values, rather than the
maxima of the original statistic. In this way, we achieve uniform spatial specificity.
The method to achieve uniform specificity proceeds as follows. We first summarize
by computing the maximum over time for each permutation sample at each location,
to generate
e
T
∗
i·
as defined in Eq. (4.25). We then use these permutation statistics to
estimate the null distribution
ˆ
F
e
T
i·
of
e
T
i·
. Using this distribution, we then replace each
permutation statistics
e
T
∗
i·
with its P-value:
P
∗
i
=p
i
(
e
T
∗
i·
) = 1−
ˆ
F
e
T
i·
(
e
T
∗
i·
)
(4.26)
Effectively, this is equivalent to counting how many of the M permutation sample sta-
tistics
e
T
∗
i·
at locationi are greater than or equal to the current permutation statistic, and
dividing byM. For eachi, theP
∗
i
have a uniform distribution under the null hypothesis,
44
Figure 4.2: Illustration of the impact of heterogeneous voxel null distributions on a 5%
FWER threshold. Shown are null distributions of 5 surface elements in three cases: each
having different variances, each having different skewed distributions, and all sharing
the same normal distribution. The first case (left) demonstrates the variable false positive
rate when test statistics are not normalized (e.g. use raw CDM values ˆ μ
ij
instead of
T
it
). The second case (center) demonstrates the impact of non-Gaussianity, even when
variance is normalized, and motivates the use of p-values to normalize T
ij
into p
i
(T
it
).
The last case (right) shows that with homogeneous nulls the false positive rate at each
surface element is homogeneous. Note that in all cases FWER is controlled at 5%.
and hence are self-normalized. To find a threshold on the P-values that controls FWER,
we first find the minimum P-value for each permutation sample:
e
P
∗
·
= min
i
e
P
∗
i
(4.27)
We then use the M permutation values
e
P
∗
·
to estimate the null distribution,
ˆ
F
e
P·
, of
e
P
·
.
By chosing a levelα threshold on
ˆ
F
e
P·
, i.e. (
ˆ
F
−1
e
P·
(α)), and retaining only those locations
at which the P-value is less than this threshold, we control the FWER atα. Furthermore,
since under the null hypothesis the P-values will have the same uniform distribution at
each location, this approach will guarantee uniform specificity. The above procedure is
summarized in Table 4.1 (method 2).
45
One practical problem of this approach is the discreteness of the P-valuesP
∗
i
, which
in turn causes
e
P
∗
·
to be discrete. If many
e
P
∗
·
have the smallest possible value (1/M),
then small α-levels thresholds may be unattainable. For example, one Monte Carlo
experiment with M = 1,000 found that 30% of the permutations had a minimum P
∗
i
of value 0.001 and hence the smallest possible FWER threshold corresponded to α =
0.3. Therefore the P-value normalization approach, while it makes no assumptions on
differing shapes of the local distributions, requires many permutations.
Thresholds
The three different permutation methods produce different types of thresholds. Method
1 produces a single statistic value threshold for all time and space:
ˆ
F
−1
e
T··
(1−α). Method
2 also produces a single P-value threshold for all time and space:
ˆ
F
−1
e
P·
(α). This P-value
can be re-expressed in terms of statistic values, which will yield different thresholds for
each spatial location: p
−1
i
(
ˆ
F
−1
e
P·
(α)). Method 3 produces one threshold for each time
point:
ˆ
F
−1
e
T·t
(1−α). The key difference between these thresholds is that methods 1 and
2 control FWER over all time and space, while method 3 controls FWER over space at
a single time point.
4.3 Results
In this section we evaluate the random field method and the three permutation methods
given in Table 4.1 in terms of specificity, i.e. the ability of the methods to control false
positives, and sensitivity, a measure of how well the method can detect and localize true
brain activation.
46
4.3.1 Simulation Studies
A cortical surface was extracted from an MRI scan using BrainSuite, a brain surface
extraction tool [Shattuck and Leahy, 2002b]. The surface extraction algorithm consists
of skull and scalp removal, nonuniformity correction, tissue classification, and cortical
surface topology correction. The surfaces produced by BrainSuite can be constrained
to be topologically spherical, and are suitable for use in current source localization and
visualization. The surfaces where then coregistered to the MEG sensor arrangement of a
155 channel CTF Systems Inc. Omega 151 system using an affine transformation, based
on three fiducial points (nasion, right/left preauricular). The resulting surface, which
contained approximately 520,000 faces, was downsampled to produce a 15,000 face
(7481 vertices) surface suitable for reconstruction purposes. CDMs were mapped back
onto the higher resolution surface, as illustrated in Fig. 3a, as well as onto a smoothed
version of the surface to assist in visualization, Fig. 3b. An orientation constraint was
applied during reconstruction, using surface normals estimated from the original dense
cortical surface. The forward model was calculated using an overlapping spheres model
[Huang et al., 1998]. We used a Tikhonov regularized linear inverse method [Tikhonov
and Arsenin, 1977] with regularization parameterλ = 4·10
−7
[Baillet et al., 2001].
Source Simulation
We simulated two sources, one each in left and right primary motor cortex, as shown in
Fig. 4.3. Each source consists of an activated patch approximately 2cm
2
in size. The
time-courses simulate early and delayed responses to a stimulus (Fig. 4.4), which is a
typical pattern in neuroimaging studies.
A total of 100 stimulus-locked event-related epochs (or trials) were simulated, each
one consisting of 100 pre-stimulus and 100 post-stimulus time points. Each epoch is
an array of data (151 channels x 200 timepoints) representing the measured magnetic
47
(a) (b)
Figure 4.3: Simulated source 1 (left hemisphere) and source 2 (right hemisphere) are
shown mapped onto high resolution and smoothed versions of the cortical surface.
Figure 4.4: Timecourses of simulated sources, blue for source 1 and red for source 2.
The pattern of activation mimics a typical neuroimaging study where an early response
to a stimulus propagates to another brain region giving a delayed component.
field at each sensor as a function of time. Gaussian i.i.d. noise with power 2000 times
the averaged power of the signal was added to the channel measurements, making the
reconstruction a difficult task. We then applied the Tikhonov regularized inverse opera-
tor [Tikhonov and Arsenin, 1977] to produce CDMs X
itj
. We also generated spatially
smoothed versions of the CDMs using diffusion smoothing as described in the RF the-
ory section. We used an approximation of the Laplace-Beltrami operator given in [Oos-
tendorp and Oosterom, 1988] with spatial filtering corresponded to a 23.5mm FWHM.
Since the mean distance between the vertices in our tessellated cortical surface was
48
4.5mm, the spatial filtering was equivalent to approximately 5 vertices FWHM. Finally,
we created M = 1,000 permutation samples for permutation methods 1 and 3, and
M = 10,000 permutation samples for permutation method 2, due to the discreteness
concerns discussed in Section 4.2.3.
We tested smoothed unsmoothed CDMs for significant activation using the ran-
dom field and permutation methods. Spatiotemporal FWER-corrected thresholds (Table
4.2) are higher, as expected, than spatial FWER-corrected thresholds (Table 4.3). The
thresholds for permutation method 2 are given only for unsmoothed CDMs, because
of the high computational cost of smoothing 10,000 permutations, and represent the
average statistic value thresholds over all spatial locations, E
i
{p
−1
i
(
ˆ
F
−1
e
P·
(α))}. The esti-
mated empirical distribution of
e
P
·
allowed for a smallest achieved false positive rate of
α = 12.3% for Method 2. Therefore, in order to compare Methods 1 and 2 we repeated
the analysis for Method 1 withα = 12.3% as listed in Table 4.2.
The design of the statisticsT
it
, involving post-stimulus dataμ
it
and pre-stimulus data
σ
i
, causes permutation methods to adapt to the activation pattern of the sources on the
cortical surface. In particular, when a stimulus activates a source on the cortical surface,
the corresponding timecourse has higher amplitude and variability in the post-stimulus
area than in the pre-stimulus area. As a result, the original data will have relatively high
T
it
(high μ
it
and small σ
i
) as compared to the permutation samples; as a source gets
stronger, the significance of the original data increases. In our simulation experiments
the timecourses of the sources vary with time, therefore we expect the FWER-corrected
thresholds to adapt to this variation. For permutation method 3 we estimate a different
threshold
ˆ
F
−1
e
T·t
(1−α) for each timeslice after the stimulus t > 0 and Table 4.3 shows
the mean of these thresholds (3.876 for unsmoothed CDMs and 3.778 for smoothed
CDMs). The random field method is less sensitive with higher thresholds (4.451 and
4.076 respectively) and particularly conservative when unsmoothed CDMs are used.
49
Nominal FWER Unsmoothed CDMs Smoothed CDMs
Permutation Method 1 0.050 5.236 5.151
Permutation Method 1 0.123 5.050 −
Permutation Method 2 0.123 4.998 −
Table 4.2: Thresholds for the sources illustrated in Figs. 4.3 and 4.4 for controlling
FWER over space and time. Since 12.3% of the
e
P
∗
·
had the smallest possible P-value of
0.0001, 0.123 is the smallest FWER obtainable with method 2. Method 1 was repated
for this FWER for comparison
Nominal FWER Unsmoothed CDMs Smoothed CDMs
Permutation Method 3 0.050 3.876 3.778
Random Field Method 0.050 4.451 4.076
Table 4.3: Thresholds for the sources illustrated in Figs. 4.3 and 4.4 for controlling
FWER over space only
.
Fig. 4.5 shows examples of significant activation maps for the permutation and
random field methods. Epochwise thresholds (Fig. 4.5a) are more stringent and thus
less sensitive to signals than slicewise thresholds (Fig. 4.5b). We are, however, more
confident a signal is truely present because the false positives are controlled over all
time slices. Spatial smoothing (Fig. 4.5c) produced a mild loss in resolution compared
to the unsmoothed case (Fig. 4.5b). Finally, comparison of the permutation and random
field results for the smoothed CDMs (Fig. 4.5c and d, respectively) indicates similar
performance in this simulation.
We should comment here that permutation and random field tests do not address
the limited resolution of MEG reconstruction methods. The MEG inverse problem is
ill-posed and CDMs are of low resolution and tend to mislocalize source activation. If
the inverse method identifies experimental variation is some region, permutation and
random field tests will identify these regions regardless of the presence of an actual
source at those locations. In most cases, CDM reconstructions of activation from a single
50
Figure 4.5: Examples of significant activation maps for permutation and random field
methods for two time instances; (a) Permutation method 1 using unsmoothed CDMs,
(b) Permutation method 3 using unsmoothed CDMs, (c) Permutation method 3 using
smoothed CDMs, (d) Random field using smoothed CDMs. The first method controls
FWER over space and time, while the last three methods control FWER over space for
one time point only.
sulcus or gyrus will tend to show activation in neighboring sulci or gyri respectively. It is
quite possible, as shown in Fig. 4.5, that reconstructions of activation in a single cortical
area will exceed the determined threshold in multiple areas, and thus particular care is
required in interpretation of CDMs, even after thresholding to control for FWER.
It is interesting to study the effect of the p-value transformation on the significant
activation maps: if the effect is significant, it is an indication that Gaussianity assump-
tions, used by random fields and permutation methods 1 and 3, are violated. Fig. 4.6
shows that permutation methods 1 and 2 produce very similar results in simulations.
This is expected since in this case the data was homogeneously Gaussian. The p-value
transformation step only affects the solution if, under the null hypothesis, the surface
elements have an inhomogeneous distribution (see also Fig. 4.2). However, for real data
experiments, as we shall see in a following section, the two methods produced different
activation maps, an indication that the Gaussianity assumption may be violated.
51
Figure 4.6: Examples of significant activation maps for permutation methods 1 and 2
for two time instances. The lowest achieved threshold for method 2 is α = 0.123 and
the comparison for both methods is done at this level.
We can display the unthresholded p-value maps of permutation method 2 by trans-
forming the CDMs of the original data into p-values. Even though this does not address
the multiple comparisons problem, it is interesting to compare the relative apparent
localization properties of CDMs ( ˆ μ
it
), noise-normalized CDMs (T
it
), and p-value maps
(p
i
(T
it
)). Such a result is shown in Fig. 4.7. Noise-normalized and (1-p)-value maps are
qualitatively similar, again this is because the data are Gaussian. However, (1-p)-value)
maps offer a direct quantitative measure of significance. In the lower row of Fig. 4.7,
all sources with p-value less than 0.05 are shown for the (1-p)-value maps; CDMs and
noise-normalized CDMs are thresholded subjectively, indicating the importance of some
form of normalization of the CDMs, either by noise standard deviation or using p-values,
before thresholding.
Noise Simulation
In order to test all methods for specificity, we created MEG sensor data using only
standard Gaussian noise and no signal. The data consisted of 100 epochs, each having
52
Figure 4.7: Thresholded and unthresholded maps of the current density (ˆ μ
it
), the noise
normalized current density (T
it
) and the (1-p)-value mapp
i
(T
it
) att = 113.
100 pre-stimulus and 100 post-stimulus time points. We then estimated the 5% threshold
values for all methods, as given in table 4.4.
We then repeated the above procedure 100 times and tested the simulated data
(n
1
= 100 epochs for spatiotemporal FWER-corrected thresholds, n
2
= 10,000 slices
for spatial FWER-Corrected thresholds) for activation. The approximate Monte Carlo
standard errors for a true0.05 rejection rate are
p
α(1−α)/n
1
= 0.0218 for spatiotem-
poral FWER-corrected thresholds and
p
α(1−α)/n
2
= 0.00218 for spatial FWER-
corrected thresholds. The approximate 95% confidence intervals are (0.0073,0.0927)
and(0.0457,0.0543) respectively
The random field method is conservative, with or without spatial smoothing. With-
out smoothing the spatial FWER is0.0139; with spatial smoothing of 5 vertices FWHM
the spatial FWER is 0.0340. Both are outside the 95% confidence limit, though the
result with smoothed data is better.
It can be shown theoretically that permutation methods are always exact, i.e. they
will achieve the chosen FWER. Our experiments verify this: for unsmoothed data
53
Unsmoothed CDMs Smoothed CDMs
Threshold Obs. FWER Threshold Obs. FWER
Spatiotemporal FWER Method
Permutation Method 1 5.350 0.0600 5.245 -
Spatial FWER Methods
Permutation Method 3 4.059 0.0480 3.980 -
Random Field Method 4.453 0.0139 4.081 0.0340
Table 4.4: Noise-only simulation results for control of spatial and spatiotemporal FWER
at nominal levelα = 5%. The Monte Carlo standard error for the spatiotemporal FWER
is 0.0218; for the spatial FWER it is 0.0022.
Unsmoothed CDMs Smoothed CDMs
Source Simulation Data 641.14 141.04
Real Data 574.44 125.62
Table 4.5: Number of RESELS for simulated and real data.
method 3 had a spatial FWER of0.0480 while method 1 had a spatiotemporal FWER of
0.06, both inside the95% confidence intervals.
4.3.2 Real Data experiment
The effectiveness of the proposed algorithms was also investigated using data from a
somatosensory experiment. Data were acquired using a CTF Systems Inc. Omega 151
MEG system. The somatosensory stimulation was an electrical square-wave pulse deliv-
ered randomly to the thumb, index, middle and little finger of each hand of a healthy
right-handed subject [Meunier et al., 2001]. For the purposes of this study only data
from the right thumb was used. The recordings consist of 400 epochs each having 62
pre-stimulus and62 post-stimulus time points.
Table 4.5 shows the number of estimated RESELS,R
2
(S), for the source simulation
and real data experiments. Since both data have the same range of RESELS, our simula-
tion experiments were performed under conditions consistent with our real experiment.
54
Nominal FWER Unsmoothed CDMs Smoothed CDMs
Spatiotemporal FWER Method
Permutation Method 1 0.050 9.346 8.925
0.086 8.691 8.351
Permutation Method 2 0.086 7.473 -
Spatial FWER Methods
Permutation Method 3 0.050 6.597 6.387
Random Field Method 0.050 4.426 4.045
Table 4.6: Real data thresholds found with the different RF and permutation methods.
All methods identify significant activity in the left somatosensory cortex. Since the
experiment involved right thumb stimulation, activation of area S1 (primary somatosen-
sory cortex) in contralateral somatosensory cortex is expected [Kandel et al., 2000]. Fig.
4.8 allows for similar inferences as presented in the simulation section for Fig. 4.5, i.e.
epoch-wise or spatiotemporal thresholds are more conservative than slice-wise thresh-
olds (Fig. 4.8a vs. b) and spatial smoothing slightly reduces resolution (Fig. 4.8b vs. c).
In this case the RF threshold is lower than that for the permutation Method 3, so that a
broader area of activation is seen in the RF result compared to the permutation method
(Fig. 4.8c vs. d). Since this is real data we cannot know whether Fig. 4.8c or Fig.
4.8d is closer to truth, however S1 activation as revealed in other neuroimaging studies
is typically highly focal. Since the two methods performed similarly in simulations, one
explanation is that the real data do not satisfy the distribution assumptions of RF theory.
Evidence of violated assumptions is supported by Fig. 4.9. This shows permutation
method 2 to be more sensitive than method 1. Simulation experiments demonstrated
that when the data are Gaussian the two permutation methods exhibit very similar per-
formance; the discrepancies between the two maps of significant activation using meth-
ods 1 and 2 indicate a violation of this distributional assumption. At t = 22 ms only
method 2 detected significant activity. Further, it seems to correct the CDM, which
shows the main activity in the ipsilateral hemisphere. As mentioned before, we expect
55
Figure 4.8: Examples of significant activation maps for permutation and random field
methods for real data; (a) Permutation method 1 using unsmoothed CDMs, (b) Permu-
tation method 3 using unsmoothed CDMs, (c) Permutation method 3 using smoothed
CDMs, (d) Random field controlling FWER over space only.
activation of the left hemisphere (at least for the early component of the response) and
this is supported by our results. Fort = 28 ms the same remarks for sensitivity are true.
Finally, Fig. 4.10 shows the thresholds applied by each of the two methods. Again, due
to discreteness, the lowest achieved FWER by method 2 isα = 0.086.
An alternative explanation to the stringency of the permutation results is contami-
nation of the empirical permutation distributions. Permutations that are similar to the
correctly labeled data will also yield relatively large statistic values, affecting the upper
tail of the permutation distribution and shifting the threshold upwards. A possible solu-
tion to this problem is to use a step-down test in which the null distribution is ’purified’
by removing spatial elements that test as significant and recomputing the permutation
distribution of the maximum.
Note that our method can, to some extent, counteract the tendency to inflate the
permutation distribution, due to our use of a pre-stim standard deviation. While a strong
signal can inflate a permuted statistic values, it will also induce more variability into the
56
Figure 4.9: Reconstruction and significant maps from permutation methods 1 and 2 for
a FWER ofα = 0.086.
ˆ
F
−1
e
T··
(1−α)
Figure 4.10: Global threshold applied by permutation method 1 (
ˆ
F
−1
e
T··
(1−α)) at level
α = 0.086, as compared to the histogram of the thresholdsp
−1
i
(
ˆ
F
−1
e
P·
(α)) applied to each
sourcei by method 2. Also, a map of the thresholds on the cortical surface is given on
the right. Most of the individual thresholds are below
ˆ
F
−1
e
T··
(1−α).
pre-stim period, inflate the estimated standard error, and hence reduce permuted statistic
values. In this dataset, this effect apparently did not overcome the strong signal.
57
4.4 Conclusion
We have presented RF theory and permutation methods for processing of MEG data and
extracting significant activation maps. They can be used with any linear or nonlinear
cortical imaging method to obtain objective thresholds on statistic maps.
The random field method demonstrated valid but conservative performance in our
null simulation experiments; the observed FWER was 0.034 (vs 0.05 nominal) for
smoothed data, and worse for unsmoothed data. However, the method successfully iden-
tified the two simulated sources while rejecting false positives on other surface elements.
This suggests that the nonstatationary smoothness estimation addresses the problem of
highly variant spatial correlation of the noise.
Permutation method 3, which controls spatial FWER at each point in time, as does
the RF method, achieved superior performance in the simulations, though there was lit-
tle difference with smoothed data. This is because in simulations with smoothed data,
RF assumptions are satisfied, i.e. Gaussianity and sufficient smoothness, so that the
method is almost exact and performs similarly to its non-parametric counterpart. In
experimental somatosensory data, permutation method 3 found activated sources in the
contralateral S1 area, as expected, with fewer significant spatial elements than the ran-
dom field method. We cannot say whether the random field method in this case is more
sensitive or simply giving false positives induced by violated assumptions.
Overall, the permutation methods are more flexible than RF based methods. We can
work with spatially smoothed or unsmoothed CDMs depending on the desired tradeoff
between SNR and spatial resolution, when RFs should only be used with sufficiently
smoothed CDMs. In general, we prefer to smooth the data as little as possible; the data
is low resolution, and it seems undesirable to further reduce the resolution. Further,
permutation methods can achieve uniform sensitivity by defining different thresholds
per surface element via p-values. More importantly, they do not rely on distribution
58
assumptions, making them suitable for real data experiments. Their major limitation is
that we need equal pre- and post-stimulus regions to allow exchangeability. This issue
can generally addressed in planning the event-related MEG study.
4.5 Acknowledgments
We wish to thank Dr. Sylvain Baillet at the Cognitive Neuroscience & Brain Imaging
Hopital de la Salpetriere for providing experimental data of an MEG somatosensory
study, and Prof. Thomas Nichols at the University of Michigan for his assistance in data
analysis.
59
Chapter 5
Controlling Familywise Error Rate in
MEG Studies of Oscillatory Behavior
The previous chapter described the use of random field and permutation methods to
statistically estimate spatiotemporal thresholds that control the familywise error rate
(FWER) in cortical activation maps. The permutation methods are very flexible, since
they do not rely on distributional assumptions, and can be applied to smoothed or
unsmoothed data and achieve uniform specificity with the use of p-values. Further, they
are easily generalizable to multidimensional data, such as spatial-temporal-frequency
maps. In this chapter we demonstrate their application to the time-frequency cortical
maps introduced at the end of Chapter 3.
The methods we describe below also apply directly to cortical maps computed from
EEG data, since the inverse imaging methods differ only in the forms of their lead field
matrices [Baillet et al., 2001]. Although for convenience we describe them in the con-
text of baseline vs. post-stimulus comparison, these can be used to compare any two
conditions. We first motivate error rate control in time-frequency cortical maps, and
then combine one-way ANOV A modeling with permutation tests to statistically thresh-
old the cortical images. We demonstrate this method in application to high density MEG
studies of visual attention.
60
5.1 Statistical Interpretation of Time-Frequency Corti-
cal Maps
In Chapter 3 we combined time-frequency analysis of individual epochs with minimum
norm imaging to produce dynamic cortical images in multiple frequency bands. Objec-
tive assessment of the above cortical images requires a principled approach to identify-
ing regions of activity in frequency bands. This analysis involves testing thousands of
hypotheses (one per surface element and time-frequency band) for statistically signifi-
cant experimental effects. This raises the possibility of large numbers of false positives
simply as a result of multiple hypothesis testing. To control the number of false posi-
tives over all tests we must therefore consider the multiple hypothesis testing problem.
The standard approach, and the one investigated in this work, is to control the Fam-
ilywise Error Rate (FWER), i.e. the chance of one or more false positives under the
null hypothesis. Methods that control the FWER, and consider the spatial dependence
of the data, make inferences based on the global maximum distribution of the image
statistics over space and frequency bands. The maximum distribution can be estimated
non-parametrically using permutation tests.
5.1.1 Modeling and ANOV A Analysis
We follow the Morlet wavelet procedure presented in Section 3.3 to create power esti-
matesE
ikj
in rectangular time-frequency bands. To extract statistics from these observa-
tions, we take the standard massively univariate approach and model each spatial loca-
tion separately [Friston et al., 1995]. For each spatial location, the model then amounts
to a one-way ANOV A model with repeated measures:
E
ikj
=μ
ik
+
ikj
(5.1)
61
Our goal is to identify, in each band, those cortical areas where there are significant
pre- and post-stimulus energy differences. Using the notation k
+
for a post-stimulus
time period t ∈ [t
k
0
t
k
1
], and k
−
for a pre-stimulus period t ∈ [−t
k
1
− t
k
0
], we test the
hypothesis:
H
0
: μ
ik
+ =μ
ik
− (5.2)
for all source locations and time-frequency bands.
We use the standard statistic for pairwise testing:
T
ik
=
E
ik
+
·
−E
ik
-
·
σ
ik
(5.3)
where the bar indicates an average over the dotted subscript. We assume a heteroscedas-
tic model, i.e. we allow different distributions for residuals
ikj
occurring for different
source locationsi and time-frequency bandsk. Then, the varianceσ
2
ik
is estimated using
the repeated measurements over theJ epochs as:
σ
ik
=
r
var{E
ik
+
j
}+ var{E
ik
-
j
}
J
var{E
ik
+
j
} =
X
j
(E
ik
+
j
−E
ik
+
·
)
2
J−1
, var{E
ik
-
j
} =
X
j
(E
ik
-
j
−E
ik
-
·
)
2
J−1
(5.4)
If the error terms
ikj
in Eq. 5.1 follow a Gaussian distribution, the statistic T
ik
follows a t-distribution with approximatelyν
ik
degrees of freedom, computed using the
Satterthwaite method as [Satterthwaite, 1946b]:
ν
ik
= (J−1)
(var{E
ik
+
j
}+ var{E
ik
-
j
})
2
var
2
{E
ik
+
j
}+ var
2
{E
ik
-
j
}
(5.5)
Under the central limit theorem, this statement is approximately true even if the
Gaussian assumption is violated provided the number of epochsJ is large.
62
5.1.2 Multiplicity Adjustments
The massively univariate approach, as modeled in Eq. (5.1), involves testing thousands
of hypotheses (one per surface element and time-frequency band) for statistically signif-
icant experimental effects. To effectively control the number of false positives over all
tests we must therefore apply multiplicity adjustments by controlling the FWER. The
FWER is directly related to the maximum statistic; one or more statisticsT
ik
will exceed
the threshold u under the null hypothesis H
0
if and only if the maximum exceeds the
threshold
1
:
P(FWER) = P(∪
ik
{|T
ik
| >u}|H
o
)
= P(max
ik
|T
ik
| >u|H
o
)
= 1−F
max|T| |Ho
(u) (5.6)
= 1−(1−α) =α
We use the absolute value because negative values of T
ik
can also indicate significant
changes in oscillatory power, and we therefore need a two-tailed test statistic. As shown
in Eq. 5.6, we can control the FWER if we choose the threshold u to be in the (1−
α)100
th
percentile of the maximum distribution.
5.1.3 Permutation Tests
The multiplicity adjustments require the distribution of max
ik
|T
ik
|. Even though we
know each T
ik
follows a t-distribution with ν
ik
degrees of freedom (Subsection 5.1.1),
it is impossible to evaluate analytically the maximum distribution;ν
ik
varies overi and
k, and the correlation structure amongT
ik
is unknown. Further, the distribution ofE
ikj
1
F
X
denotes the cumulative density function (CDF) of the random variableX
63
may deviate from Gaussian, and thus the distribution of T
ik
may deviate from the t-
distribution. While the effects of deviations from Gaussianity can be serious in uni-
variate situations, the effects tend to be amplified in multiple testing applications where
maximal statistics are considered [Westfall and Young, 1992].
To overcome these problems, we use a non-parametric permutation method to esti-
mate the distibution of
e
T
··
= max
ik
|T
ik
|, where the tilde indicates a maximum over the
dotted subscript. The standard approach to permutation tests is to find units exchange-
able under the null hypothesis. Under the null hypothesis that there are no task-related
differences between corresponding pre- and post-stimulus time-frequency bands, we can
permute the energy in time-frequency bands E
ik
+
j
and E
ik
-
j
. The permutations can be
applied over epochs, which are typically regarded as independent, but not over source
locations and time-frequency bands, because we may destroy possible correlations in
the data. Given J original epochs of MEG measurements, we have J repetitions of
E
ik
+
j
and E
ik
-
j
. By randomly interchanging or permuting the pre- and post-stimulus
time-frequency bands, we can create M ≤
2J
J
permutation samples E
∗
ik
+
j
and E
∗
ik
-
j
.
The symbol (*) indicates that the statistics are created by permutation.
By applying Eq. 5.3 to the permuted data, we create permutation samples ofT
ik
:
T
∗
ik
=
E
∗
ik
+
·
−E
∗
ik
-
·
σ
∗
ik
(5.7)
To obtain thresholds that control the FWER over space and time-frequency bands, we
compute the permutation distribution of the maximum statistic:
e
T
∗
··
= max
ik
|T
∗
ik
| (5.8)
We use this distribution to define a levelα threshold:
ˆ
F
−1
e
T··
(1−α). A bandk has statis-
tically significant energy at sourcei ifT
ik
(the original statistic) exceeds this threshold.
64
Figure 5.1: Illustration of the MEG visual attention study: Each trial presents a central
arrow cue that directs attention to the lower left or right visual field, in anticipation of
a second stimulus (S2), delivered 1 sec later on the left or right. The locations of the
upcoming S2 are continuously marked by gray patches, to guide allocation of covert
visual spatial attention. If the S2 (a B/W grating) occurs on the cued (attended) side,
then subjects respond if it has the target orientation.
5.2 Results
We illustrate this method using a high density MEG study of visual attention which
produces activity in many different cortical regions forming functional networks. Fol-
lowing a 1.2 sec baseline period, a brief central arrow cue directs covert attention to the
lower left or right visual field, in anticipation of a second stimulus (S2), delivered 1 sec
later on the left or right (Fig. 5.1). The upcoming S2 consists of gratings slanted clock-
wise from the vertical by either 5 (non-target) or 20 (target) degrees, with a response
required if the grating appears at the cued (attended) location (50%) and has the target
orientation.
The MEG measurements were recorded using a whole-head CTF Omega system
with 275 axial gradiometers. Horizontal and vertical EOGs were recorded and used to
eliminate trials with eye movements or blinks. Data were collected using a bandpass
65
Figure 5.2: We tested 12 post-stimulus time-frequency bands against their baseline
counterparts. All bands were 200-500ms or 700-1000ms after the presentation of the
cue, in frequency regions: θ (4-7Hz), α low (8-10Hz), α high (11-14Hz), β low (14-
19Hz),β high (20-30Hz), andγ (30-50Hz).
filter of 0.3 to 200Hz at a 1.2kHz sampling rate, with trials defined off-line around stim-
ulus events, and rejected for predefined anomalies in EEG or MEG signals. Anatomical
data was acquired with a Philips 1.5 Tesla MRI scanner, using a SENSE head coil; T1
contrast was acquired in 200 axial slices (1.0x1.0x1.2 mm, TI/TE/TR/flip angle = 769.6
msec/3.7 msec/7.9 msec/8 degrees). A cortical surface was extracted from the MRI
scan using BrainSuite, a brain surface extraction tool [Shattuck and Leahy, 2002a], and
coregistered to the MEG sensor arrangement. The MEG forward model was calculated
based on an overlapping spheres model [Huang et al., 1999a] using BrainStorm [Baillet
et al., 2004], a Matlab toolbox for EEG&MEG data processing.
We applied our methodology to identify statistically significant differences between
pre-cue baseline activity vs. post-cue responses. The MEG recordings consisted ofJ =
276 epochs, and were mapped onto the cortical surface using min-norm imaging and
time-frequency analysis based on complex Morlet wavelets. We chose 12 bands to detect
early (200-500ms) and late (700-1000ms) post-cue stimulus responses, inθ (4-7Hz),α
66
ˆ
F
−1
e
T··
(1−α)
Figure 5.3: Empirical distribution
ˆ
F
e
T··
ofmax
ik
|T
∗
ik
|.We use this distribution to estimate
a threshold
ˆ
F
−1
e
T··
(1−α) = 0.2628 that controls the FWER at theα = 0.05 level.
low (8-10Hz),α high (11-14Hz),β low (14-19Hz),β high (20-30Hz), andγ (30-50Hz)
frequency regions. To detect changes from the baseline we also computed power in
the corresponding prestimulus bands. We used Eqs. (5.3) and (5.7) to form t-statistics
T
ik
, from our original data, and T
∗
ik
, from M = 1000 permutation samples. Using the
permutation distribution ofmax
ik
|T
∗
ik
|, we estimated the threshold
ˆ
F
−1
e
T··
(1−α) = 0.2628
that controls the FWER at theα = 0.05 level (Fig. 5.3).
Figure 5.4a shows example image maps from one subject of T
ik
, for the θ and γ
frequency bands at 700-1000ms after the presentation of a right cue (k = 7 andk = 12
respectively, in Fig. 5.2). Figure 5.4b shows thresholded versions of the same maps
based on our permutation test.
The statistically thresholded images indicate gamma activity in multiple visual sen-
sory areas with weak activity in the intra-parietal sulcus (IPS) and temporal parietal
junction (TPJ). This pattern is consistent with models of parietal and sensory networks
for visuospatial attention and supports a role for gamma in amplifying and maintain-
ing a network of internal representations of the upcoming target stimulus. Significant
increases in theta activity are seen in many of the same visual sensory and parietal areas.
In addition, there are two non-corresponding areas that suggest an interesting functional
67
(a) (b)
@
@ R
@
@ I
Figure 5.4: Normalized power difference maps of baseline vs. right cue for the visual-
attention MEG study: (a) Unthresholded maps of T
ik
, (b) Thresholded maps of T
ik
,
based on our permutation test. We used the threshold
ˆ
F
−1
e
T··
(1−α) = 0.2628 to control
the FWER at α = 0.05 level. The colormap encodes percentage increase in energy
in units of standard deviation. The thresholded maps reveal increased activity in the
superior frontal gyrus and inferior temporal lobe (arrows), that are only significant in
theθ band. These results suggest a functional dissociation betweenθ andγ bands.
γ band θ band
dissociation between gamma and theta. The superior frontal gyrus and higher order
visual areas such as inferior temporal lobe (indicated by arrows), have increased theta,
but little or no gamma activity (see arrows). These areas are thought to be involved
in working memory and the cortical pattern suggests that theta plays a role in main-
taining a representation of the anticipated target features, consistent with our findings in
monkey recordings [Lee et al., 2005]. These preliminary findings are consistent with the
hypothesis that there are overlapping yet distinct brain networks operating with different
physiological processes to deploy and hold anticipatory attention.
68
5.3 Summary
In this chapter we combined time-frequency analysis of individual epochs with mini-
mum norm imaging to produce dynamic cortical images in multiple frequency bands.
We then averaged signal power across epochs to find event related components in each
band. To detect statistically significant differences between two conditions, such as
post-stimulus vs. baseline, we used a permutation test. Applying this test to each fre-
quency band produced a set of cortical images showing significant event-related activity
in each band of interest. We demonstrated this method in application to high density
MEG studies of visual attention. This ability to image statistically significant changes
in brain oscillatory activity within individual subjects has a wide range of applications
in basic cognitive neuroscience research and clinical studies.
5.4 Acknowledgments
We wish to thank Dr. Gregory Simpson, Dr. Darren Weber and Dr. Corby Dale at the
University of California, San Francisco, for providing the experimental data of the MEG
visual attention study, and working with us on data analysis.
69
Chapter 6
Neuroimaging Applications: Language
Cortex Dominance and Surface-based
Morphometry
Research or clinical findings are reliable only when accompanied by statistical analysis
to control for false positives, nominally at a 5% level. Consequently, family-wise error
rate control has a central role in medical image analysis, and fMRI or PET studies typ-
ically employ random field analysis to determine statistically significant experimental
effects. Our work has introduced parametric (random field) and non-parametric (per-
mutation) statistical procedures in MEG. In this chapter we demonstrate the use of our
methods to two applications: identification of hemispheric language dominance using
MEG cortical imaging, and statistical surface-based morphometry of human brain struc-
tures.
6.1 Hemispheric Language Dominance using MEG
Cortical Imaging and Non-Parametric Statistical
Analysis
We offer a statistical procedure to localize language-specific cortex in MEG event-
related studies. Using regularized min-norm imaging we produce cortically constrained
dynamic images of brain activity. We then present a two-step procedure to identify
the language dominant hemisphere. The first step identifies language-specific corti-
cal regions that demonstrate significant variation against the baseline. Once significant
70
activation areas have been established, a pairwise test of total activation between hemi-
spheres determines the dominant hemisphere. We illustrate our method in application
to MEG studies of language processing.
6.1.1 Introduction
Patients with medically intractable epilepsy, tumors, or other neurological diseases often
undergo surgical procedures to remove pathological tissue. Since resection of language
areas may cause aphasias and compromise communication skills, surgical planning
includes functional brain mapping that reveals the dominant hemisphere for language
processing.
One factor which has been associated with language asymmetry is handedness, and
is assessed by the Edinburgh inventory [Oldfield, 1971], which ranges from−100 for
strong left-handedness to +100 for strong right-handedness. Typically, in right-handed
people the left hemisphere is language dominant. Further, left-handedness increases
the likelihood of right-hemisphere language dominance. However, left-handedness in
neither a precondition nor a necessary consequence of language asymmetry on the right
hemisphere [Knecht et al., 2000].
A reliable test to determine hemispheric language dominance is the intracarotid
sodium amobarbital procedure, commonly called the Wada test. The Wada test evalu-
ates the language performance of one hemisphere by anesthetizing the other hemisphere
with barbiturate agents injected into the corresponding carotid artery. It is generally a
safe procedure, however it has a relatively high cost [Medina et al., 2004]; it requires
considerable time to perform, an angiogram to evaluate brain blood flow, the presence
of specialized neuroradiologists and neurologists, and is a semi-invasive procedure.
71
Alternatively, to identify the language hemisphere, electrocorticography [Ojemann
et al., 1989,Lesser et al., 1986,Valachovic et al., 1997] performs direct cortical stimula-
tion, either intraoperatively or extraoperatively, using implanted electrodes. It is again an
expensive procedure, poses potential health risks, and causes discomfort to the patient.
Recent literature has focused on non-invasive alternatives or complimentary tech-
niques to the Wada procedure and electrocorticography. Language asymmetry has
been evaluated using fMRI [Woermann et al., 2003, Binder et al., 1996, Gaillard et al.,
2002, Pujol et al., 1999, Gaillard et al., 2004] and PET [Gross et al., 1991], by count-
ing voxels of activation in each hemisphere; the hemisphere with more activation is
considered dominant. Unlike fMRI and PET that produce images of brain activity aver-
aged over hundreds of milliseconds, MEG has excellent temporal resolution and offers
invaluable information when we image cognitive tasks, such as language processing,
involving both primary and associative cortical areas. Several MEG studies [Papan-
icolaou et al., 1999, Papanicolaou et al., 2004, Bowyer et al., 2005b, Szymanski et al.,
2001,Simos et al., 1998,Breier et al., 2001,Zouridakis et al., 1998,Bowyer et al., 2005a]
have localized language-specific cortical activity using either equivalent current dipoles
or distributed cortical imaging, with promising results for clinical application.
In this work we present a new method to localize language-specific cortex in MEG
event-related studies. Since complex cognitive tasks, such as language processing,
involve simultaneous activation of multiple sources, we choose a cortical imaging
approach based on regularized min-norm imaging, instead of an equivalent current
dipole method that assumes only a few focal sources. We use a permutation test to
optimally threshold the cortical maps while controlling the false positive ratio [Pantazis
et al., 2005b, Nichols and Holmes, 2001]. Once we establish significant activation, we
perform a language asymmetry test based on average power, to identify the language
72
dominant hemisphere. We demonstrate our method in application to MEG studies of
language processing.
6.1.2 Method
Our goal is to identify the language dominant hemisphere using MEG cortical imaging.
In this section we first describe the language processing experimental paradigm. We
then present our MEG data model, and a two-step procedure to identify the language
dominant hemisphere. The first step identifies language-specific cortical locations that
demonstrate significant variation against the baseline. Once significant activation areas
have been established, a pairwise test of total activation between hemispheres deter-
mines the dominant hemisphere.
We choose to follow a two step procedure to ensure correct localization of the hemi-
sphere, since false positives carry high cost and are unacceptable in neurosurgical deci-
sions. Before doing a language asymmetry test, it is important to identify the extent
and location of language-specific cortical activity. For example, during acquisition of
MEG data the subject’s head may move close to one side of the helmet. If we perform
a right vs left asymmetry test, we may identify the incorrect hemisphere because brain
activity appears stronger in locations close to the sensors. However, a comparison with
the baseline, as dictated by the first step of our procedure, can prevent such incorrect
decisions.
Experimental Paradigm
The language processing experiment consisted of two phases: the study phase and
recognition phase. In the study phase, participants were asked to listen to and remem-
ber 30 words. During the recognition phase, participants were presented new words in
addition to the words from the study list for a total of 240 words. Words were presented
in six blocks (40 each), which included 30 words from the study list and 10 new words.
73
Participants were instructed to deflect their index finger (left first half, right second half)
when they recognized a word from the study list. Prior to the start of the experiment,
participants were instructed to keep their eyes focused on a target in front of them and
refrain from excessive movement during data collection.
Data were acquired using a 68-channel MEG system (CTF Systems, Port Coquitlam,
BC, Canada) in a magnetically shielded room with multiple sheets of of Mu-Metall
(VacuumSchmelze, Hanau, Germany). The MEG measurements were recorded at the
time interval{-150 ms, 800 ms} with zero corresponding to word presentation. During
acquisition we applied an online bandpass filter from 0 to 100 Hz , then sampled at 312.5
Hz per channel, then filtered offline from 0.3125 to 20 Hz..
Anatomical data were acquired with a 0.3 Tesla Hitachi or a 1.5 Tesla General Elec-
tric scanner (T1-weighted images, FOV 240 mm, 256x256 matrix, with 2 mm axial
slices and no gap between the slices). The MRI was co-registered with MEG using
special fiducial landmarks at the nasion and both pre-auricular points.
Model
We assume that MEG data are collected as a set of J stimulus-locked event-related
epochs (one per stimulus repetition), each consisting of a pre- and post-stimulus inter-
val. Each epoch consists of an array of data M (n
channels
× n
timepoints
) representing the
measured magnetic field at each sensor as a function of time. The measurementsM are
linearly related with the brain activationX (n
sources
×n
timepoints
) as:
M =GX +N (6.1)
whereG (n
channels
×n
sources
) is the forward operator andN represents additive noise in
the channel measurements. The lead field matrix G depends on the shape and conduc-
tivity of the head and can be computed using a simplified spherical head model, or more
74
accurately, using boundary or finite element methods that account for the true shape and
conductivity within the head [Baillet et al., 2001].
A cortical map can be computed for each epoch by applying a Tikhonov regularized
minimum norm inverse method to produce an estimate of the temporal activity at each
surface element in the cortex:
X = (G
T
G+λI)
−1
G
T
M (6.2)
where the Tikhonov regularization parameter λ can be estimated using model fit tech-
niques [Hawkins et al., 2003].
We write the reconstructed cortical maps as{X
itj
}, wherei = 1,...,S,t =−N
0
+
1,...,N, and j = 1,...,J are indices in space, time, and epoch, respectively. We let
t = 1 correspond to the stimulus event time so that there areN
0
pre-stimulus time points.
We use the pre-stim data to estimate the baseline mean ˆ μ
ij
at each spatial element i at
epochj, by averaging over timet. We then estimate the centered data as:
Y
itj
=X
itj
− ˆ μ
ij
(6.3)
Localization of Significant Brain Activity
While the SNR in individual epochs j is too low to see any but the strongest com-
ponents, averaging signal across epochs should reveal significant event-related activity
across the entire cortex. Since interpretation of cortical maps is confounded by the
presence of additive noise exhibiting highly nonuniform spatial dependence, we also
noise-normalize the maps using the pre-stimulus variance estimate ˆ σ
i
:
T
it
=
¯
Y
it·
ˆ σ
i
/
√
J
(6.4)
75
Figure 6.1: Channel Timeseries on a word recognition MEG study. The early evoked
response occurs at approximately 50-150ms after stimulus onset; the late evoked
response occurs at approximately 250-700ms after stimulus onset.
where the bar indicates an average over the dotted subscript.
Assessment of T
it
maps can identify significant brain activation, and localize the
language cortex. Based on Papanicolaou et al. [Papanicolaou et al., 1999], evoked
responses have two types of components (Fig. 6.1): the early evoked response localizes
approximately 50-150 ms after stimulus, and corresponds to bilaterally symmetric acti-
vation of the primary sensory cortex; the late evoked response localizes approximately
250-700 ms after stimulus, and reflects activation of the association cortex. Since lan-
guage asymmetry is reflected predominantly in the late response, we focus our attention
on the analysis ofT
it
maps at 250-700 ms.
Objective assessment is crucial for the interpretation of cortical activation maps,
because the choice of threshold can greatly affect the locus and spatial extent of signifi-
cant activation. Evaluation ofT
it
maps involves testing thousands of hypotheses (one per
surface element and time instance) for statistically significant experimental effects. To
effectively control the number of false positives over all tests we must therefore apply
76
multiplicity adjustments by controlling the Familywise Error Rate (FWER), i.e. the
chance of one or more false-positives under the null hypothesisH
0
of no activation. The
FWER is directly related to the maximum statistic; one or more statisticsT
it
will exceed
the threshold u under the null hypothesis H
0
, if and only if the maximum exceeds the
threshold:
P(FWER) = P(∪
it
{|T
it
|>u}|H
o
)
= P(max
it
|T
it
|>u|H
o
)
= 1−F
max|T| |Ho
(u) (6.5)
= 1−(1−α) =α
We use the absolute value because negative values of T
it
can also indicate significant
changes in the language cortex, and we therefore need a two-tailed test statistic. As
shown in Eq. 6.5, we can control the FWER if we choose the threshold u to be in
the (1− α)100
th
percentile of the maximum distribution. Since the complex spatial
profile of noise makes it impossible to evaluate analytically the maximum distribution
max
it
|T
it
|, in a later section we will demonstrate the use of permutation tests [Pantazis
et al., 2005b] to approximately estimate it.
Dominant Hemisphere
Depending on the sensitivity of our statistical analysis and the actual activation of the
language cortex, thresholding T
it
maps results in significant areas of activation that
localize on right, left, or both hemispheres. In cases where no voxel exceeds the thresh-
old, we do not recommend proceeding to a subsequent language asymmetry test based
on this data; we should not base neurosurgical decisions on MEG analysis, unless we
are absolutely sure that the language-cortex has been specified accurately. Our test is
conservative, but we feel any false positives carry high cost and are unacceptable.
77
Using the activated voxels, we define a set of symmetric spatial locations N
R
and
N
L
for the right and left hemisphere. For example, if we find significant activity in
Wernicke’s area, we can choose a subset of the posterior temporal cortex in both hemi-
spheres (Fig. 6.6). We further specify a time region T during which the significant
voxels are active.
We proceed by averaging activity over N
R
and N
L
spatial locations in each hemi-
sphere:
E
R
j
= mean
t∈T,j∈N
R
|Y
itj
|
2
E
L
j
= mean
t∈T,j∈N
L
|Y
itj
|
2
(6.6)
To identify the language-dominant hemisphere, we do a pairwise test between E
R
j
and E
L
j
using a standard t-statistic, as shown in Eq. 6.7. We assume a heteroscedastic
model, i.e. we allow different variance between the two hemispheres, and the standard
deviationσ is estimated as in Eq. 6.8.
S =
¯
E
R
·
−
¯
E
L
·
σ
(6.7)
σ =
s
var{E
R
j
}+ var{E
L
j
}
J
(6.8)
Under the null hypothesis of no language asymmetry between hemispheres, the statistic
S should be close to zero. If we assume Gaussianity, we can use the t-distribution to
establish significance, however this is not necessary with the permutation alternative
discussed in the next section.
Permutation Tests
In this section we demonstrate the use of permutation tests to evaluate the distribution
under the null hypothesis of
e
T
··
= max
it
|T
it
| and S in Eq. 6.4 and 6.7 respectively,
where the tilde indicates a maximum over the dotted subscript.
78
Beginning with the distribution of
e
T
··
, the standard approach to permutation tests
is to find units exchangeable under the null hypothesis. By collecting equal pre- and
post-stimulus data we can permute pre- and post-stim source timeseries, since these
intervals are interchangeable under the null hypothesis of no activation at any post-
stimulus timepoint. Since the permutations are applied over epochs (Fig. 6.2), which
are typically regarded as independent, we do not destroy possible correlations in the
data. GivenJ original epochsY
itj
,j ={1,··· ,J}, we can createM ≤ 2
J
permutation
samplesY
∗
itj
, each consisting ofJ new epochs (Fig. 6.2). The symbol (*) indicates that
the valuesY
∗
itj
are created by permutation.
By applying Eq. 6.4 to the permuted data, we create permutation samples ofT
it
:
T
∗
it
=
¯
Y
∗
it·
ˆ σ
∗
i
/
√
J
(6.9)
To obtain thresholds that control the FWER over space and time, we compute the per-
mutation distribution of the maximum statistic:
e
T
∗
··
= max
it
|T
∗
it
| (6.10)
We use this distribution to define a levelα threshold,
ˆ
F
−1
e
T··
(1−α) (Fig. 6.3). A voxeli
has statistically significant activation ifT
it
(the original statistic) exceeds this threshold.
Learning the distribution ofS in Eq. 6.7 requires a simpler procedure. Under the null
hypothesis that there are no task-related differences between right and left hemispheres,
we can permuteE
R
j
andE
L
j
(Fig. 6.4) to create permutation samplesE
R∗
j
andE
L∗
j
. We
use these samples to create permutation samples ofS:
S
∗
=
¯
E
R∗
·
−
¯
E
L∗
·
σ
∗
(6.11)
The empirical distribution ofS (called
ˆ
F
S
) allows us to define a levelα threshold using
the right and left tail of the distribution,
ˆ
F
−1
S
(1− α) and
ˆ
F
−1
S
(α) respectively. The
79
Figure 6.2: Illustration of the summarizing procedures used to construct FWER-
corrected thresholds: the original epochs Y
itj
produce M permutation samples Y
∗
itj
by
exchanging pre- and post-stimulus data. The epochs are then averaged and normal-
ized to produce T
it
and T
∗
it
. Finally, T
∗
it
are summarized in time and space to produce
epochwise thresholds
ˆ
F
−1
e
T··
(1−α).
subject is right dominant ifS >
ˆ
F
−1
S
(1−α), and left dominant ifS <
ˆ
F
−1
S
(α), where
S is the original statistic from Eq. 6.7.
6.1.3 Results
We illustrate our method in the language processing MEG study presented in the Exper-
imental Paradigm section. The MEG recordings of the recognition phase were mapped
onto the cortical surface using Tikhonov regularized min-norm imaging (Eq. 6.2). We
used Eqs. (6.4) and (6.9) to form t-statistics T
it
from our original data, and T
∗
it
from
M = 1000 permutation samples. Since the permutation scheme requires equal number
of pre- and post-stimulus timepoints and we have only 150ms of pre-stimulus data, we
used the post-stimulus interval{250 ms, 400ms} during permutation. Using the permu-
tation distribution of max
it
|T
∗
it
|, we estimated thresholds
ˆ
F
−1
e
T··
(1−α) that controls the
FWER at theα = 0.05 level for two control subjects (C1-C2) and four epileptic patients
(P1-P4) (Table 6.1). Figure 6.5 shows all voxels that where deemed significant with
our permutation analysis. Subjects P1 and P2 had no significant voxels, indicating that
80
Figure 6.3: Empirical probability density distribution of max
it
|T
∗
it
|.We use this distri-
bution to estimate a threshold
ˆ
F
−1
e
T··
(1− α) that controls the FWER at the α = 0.05
level.
Subject
ˆ
F
−1
e
T··
(0.95)
C1 49.21
C2 72.28
P1 48.51
P2 89.77
P3 21.92
P4 25.66
Table 6.1: Spatiotemporal thresholds
ˆ
F
−1
e
T··
(0.95), used to identify significant activation
in theT
it
maps.
their language cortex does not demonstrate considerable variation from the baseline. All
remaining subjects had significant voxels close to Wernicke’s area.
We use the results from Fig. 6.5 to define regions N
R
and N
L
to all subjects that
had significant language-specific activity. Since all significant voxels were in Wer-
nicke’s area, we define N
R
and N
L
to be the right and left posterior temporal lobe
respectively (Fig. 6.6). We use the total energy E
R
j
and E
L
j
during the time interval
T = {250ms,600ms} to estimate the asymmetry test statistic S (Eq. 6.7). By apply-
ing a permutation test as described in the method section, we establish significance for
81
Figure 6.4: Illustration of the permutation procedure used to estimate the distribution of
the S statistic. We first permute the energy statistics E
R
j
and E
L
j
to create permutation
samples E
R∗
j
and E
L∗
j
. We then use the permutation samples to estimate the empirical
distribution ofS, and estimate the thresholds
ˆ
F
−1
S
(α) and
ˆ
F
−1
S
(1−α).
Subject
ˆ
F
−1
S
(0.05)
ˆ
F
−1
S
(0.95) S
C1 -0.23 0.22 -0.45
C2 -0.45 0.45 -2.60
P1 -0.26 0.25 0.03
∗
P2 -0.33 0.33 -0.91
∗
P3 -0.38 0.41 -1.45
P4 -0.32 0.30 -0.49
Table 6.2: Thresholds
ˆ
F
−1
S
(α) and
ˆ
F
−1
S
(1− α) at 5% significance level, and original
statistic S for all subjects. The subject is right dominant if S >
ˆ
F
−1
S
(0.95), and left
dominant if S <
ˆ
F
−1
S
(0.05). For subject P1 the test is inconclusive. *: We should not
proceed to a laterality test for subjects P1 and P2, because they did not have significant
voxels (Fig. 6.5). However we show the results for completeness.
all subjects that had significant voxels (Table 6.2). Since P1 and P2 had no significant
voxels, we should not do an asymmetry analysis for these subjects, as explained in the
Method section; our test is conservative, because we want to avoid false positives.
Table 6.3 summarizes the results for all subjects, together with Edinburgh profile and
Wada score when available. For subjects C1, C2, and P4 a Wada test is not available,
however their high Edinburgh profile indicates right handedness, and consequently the
probability of left hemisphere language dominance is very high [Knecht et al., 2000,
Pujol et al., 1999]. Our MEG methodology also indicates left dominance. Edinburgh
82
Right Left
C1 C2
P1
P2 P3 P4
Figure 6.5: V oxels with significant activity during 250-400ms after the word presenta-
tion. Subjects P1 and P2 had no significant voxels.
Figure 6.6: The posterior temporal cortex is used to define areas N
R
and N
L
for the
language asymmetry test of all subjects that had significant voxels.
profile, Wada test, and our MEG methodology indicates left dominance for P3. Our
analysis for P1 and P2 was inconclusive.
83
Subject Edinburgh Wada Sig. Activity Dom. Hemisphere
C1 77.8 - Left Left
C2 77.8 - Left Left
P1 -100 Right No activity Inconclusive
∗
P2 87.5 Left No activity Left
∗
P3 73.3 Left Left Left
P4 81.8 - Left Left
Table 6.3: Edinburgh profile, Wada test, location of significant voxels after thresholding
T
it
maps, and asymmetry analysis usingS statistic. *: The same comments as for Table
6.2 apply.
6.1.4 Conclusion
We have developed a two-step procedure to identify the language dominant hemisphere
using MEG cortical imaging and permutation tests. The first step identifies language-
specific cortical locations that demonstrate significant variation against the baseline.
Once the language cortex is specified, we perform an asymmetry test based on average
power, to identify the language dominant hemisphere.
Since the number of subjects is very small, our results are only preliminary. We are
currently processing more subjects to validate the accuracy of our methodology.
6.1.5 Acknowledgments
We wish to thank Dr. William Sutherling and Warren Merrifield at the Epilepsy and
Brain Mapping MEG Unit in the Huntington Medical Research Institutes, for providing
the experimental data of the language processing MEG study.
6.2 Statistical Surface-Based Morphometry Using a
Non-Parametric Approach
We present a novel method of statistical surface-based morphometry based on the use
of non-parametric permutation tests. In order to evaluate morphological differences of
brain structures, we compare anatomical structures acquired at different times and/or
84
from different subjects. Registration to a common coordinate system establishes corre-
sponding locations and the differences between such locations are modeled as a dis-
placement vector field (DVF). The analysis of DVFs involves testing thousands of
hypothesis for signs of statistically significant effects. We randomly permute the surface
data among two groups to determine thresholds that control the familywise (type 1) error
rate. These thresholds are based on the maximum distribution of the amplitude of the
vector fields, which implicitly accounts for spatial correlation of the fields. We propose
two normalization schemes for achieving uniform spatial specificity. We demonstrate
their application in a shape similarity study of the lateral ventricles of monozygotic
twins and non-related subjects.
6.2.1 Introduction
The advancement of MRI has given us an invaluable tool for extracting structural brain
information. The study of brain morphology has emerged as a new field of computa-
tional neuroanatomy and can provide great insights into brain function and development,
as well as investigate the effects of many pathological diseases. In order to evaluate
morphological differences of brain structures we need to compare structures extracted
from many images acquired at different times and from different subjects. A statistical
analysis of these structures may rely on global features, as in classical MRI-based vol-
umetry, which measures and compares the volumes of homologous regions. Statistical
analysis of shape based features has also been proposed. In shape analysis, anatom-
ically corresponding locations between different images are computed and a mathe-
matical transformation between such locations, called deformation, is evaluated. This
deformation can be mathematically modeled as a displacement vector field (DVF) and
the study of this field is called deformation-based morphometry. Recent work includes
computation of DVF’s using deformable registration schemes on images [Davatzikos
85
et al., 1996, Csernansky et al., 1998] and using structural correspondence establishing
methods on boundary and medial shape descriptions [Styner et al., 2003b, Styner et al.,
2003a].
Analysis of the DVFs involves testing from a few tens to many thousands of hypoth-
esis (one per surface element) for statistically significant effects. It is important to con-
trol for the multiple testing problem, and the most common measure of multiple false
positives is the familywise error rate (FWER).
The multiple testing problem has been an active area of research in the functional
neuroimaging community. The most widely used methods in the analysis of neuroimag-
ing data use random field theory [Chung, 2001] [Worsley et al., 1996] and make infer-
ences based on the maximum distribution. In this framework, a closed form approxima-
tion for the tail of the maximum distribution is available, based on the expected value
of the Euler characteristic of the thresholded image [Worsley et al., 1996]. However,
random field theory relies on many assumptions such as the same parametric distribu-
tion at each spatial location, a point spread function with two derivatives at the origin,
sufficient smoothness to justify the application of the continuous random field theory,
and a sufficiently high threshold for the asymptotic results to be accurate.
In contrast, non-parametric methods rely on minimal assumptions, deal with the
multiple comparisons problem and can be applied when the assumptions of the para-
metric approach are untenable. Non-parametric permutation tests are exact, distribution
free and adaptive to underlying correlation patterns in the data. Further, they are con-
ceptually straightforward and, with recent improvements in computing power, are com-
putationally tractable. They have been applied in a wide range of functional [Nichols
and Holmes, 2001, Pantazis et al., 2003] and structural [Thompson et al., 2003] imag-
ing applications. The method presented here produces a local threshold map applied
directly to the DVF and controls the FWER over all surface elements.
86
A: B: C:
0 mm 8 mm
Figure 6.7: Visualization of the distance map between the right lateral ventricles of a
twin-pair (superior view). A: The two ventricles after alignment. B: Same as A, one
ventricle shown transparent and the other as grid-mesh. C: Distance map with color-
coded distance at each boundary-point.
6.2.2 Method
The objective of this study is to localize regions of brain structures that exhibit statis-
tically significant morphological variation among two population groups while control-
ling the risk of any false positives. We find local thresholds that control the FWER and
at the same time achieve uniform specificity among all surface elements.
Permutation Scheme
We assume we have two groups of DVFs, group A and group B. Each DVF measures the
difference between a brain structure and a template or a designated comparison object.
For example, group A may be composed of lateral ventricles of twins and each DVF
models the morphological difference for each pair of twins (see Fig. 6.7). Group B
may have similarly constructed DVFs but from pairs of unrelated individuals of similar
age and same gender. We want to test the two groups for difference in the means at
each spatial location (see Fig. 6.8). Permutation tests are a valid and tractable approach
for such an application, as described in the introduction. Our null hypothesis is that the
distribution of the DVF at each spatial element is the same for every subject regardless of
the group. Permutations among the two groups satisfy the exchangeability condition, i.e.
87
vs →
Group A Group B Map
Figure 6.8: Concept of statistical significance maps: For two groups of objects, DVF
maps are compared locally resulting in a map of significant differences between groups.
they leave the distribution of the statistic of interest unaltered under the null hypothesis.
Given n
1
members of the first group a
k
, k = 1...n
1
and n
2
members of the second
group b
k
, k = 1...n
2
, we can create M ≤
n
1
+n
2
n
2
permutation samples. A value of
M from 1000 and up should yield results that are negligibly different from using all
permutations [Edgington, 1995].
Our modeling proceeds by forming a vector of statistics for each permutation sample
j, called T
j
, with elements:
T
ij
=
P
n
1
k=1
ka
∗
ki
k
n
1
−
P
n
2
k=1
kb
∗
ki
k
n
2
(6.12)
where i is the spatial index, j the permutation index, and k the group member index.
Further,ka
∗
ki
k (kb
∗
ki
k) is the amplitude of the DVF at the i
th
spatial location of the k
th
member of group A (group B). The symbol (
∗
) indicates that the values a
∗
ki
and b
∗
ki
are
created by permutation (Fig. 6.9). In essence, T
ij
captures the difference of the ampli-
tudes of the DVFs among the two groups. The goal is to use this statistic to estimate
the maximum distribution of the vector amplitude differences over all surface elements
i. We may do this by summarizing in space using a maximum statisticS
j
= max
i
(T
ij
)
and use the empirical distribution to extract thresholds that control the false positives to
a desired level. However, the method requires some modifications to achieve uniform
specificity across space.
88
Figure 6.9: Illustration of the permutation scheme. We create M permutation samples
from the original data and form the statistic T
ij
. The statistic is normalized into T
n
ij
and the data are summarized in space to create S
j
. The empirical distribution of S
j
,
called
ˆ
F
S
is used to define a global thresholdS
th
which corresponds to spatially varying
thresholdsS
th
i
.
Uniform Specificity
Permutation tests are always valid given the assumption of exchangeability under the
null hypothesis. However, we may face the problem of uneven specificity in the spatial
dimension if the null distribution varies across space. For example, with a maximum
statistic over space, surface elements for which the morphological structure varies sig-
nificantly will contribute more to the maximum distribution than others with smaller
morphological variability. The impact is a relatively generous threshold for the highly
varying locations and a stringent threshold for the other locations. We can overcome
this problem by including some form of normalization in the statistic T
ij
. In this work
we propose two different normalization schemes.
The first normalization scheme is based on computing the standard deviation σ
i
of
the statisticT
ij
across permutations for each surface elementi and normalizingT
ij
with
this value: T
σ
ij
= T
ij
/σ
i
. The normalized statistic T
σ
ij
has a new empirical distribution
overj which, under specific assumptions, is invariant across all surface elementsi. We
assume that, under the null hypothesis, the x,y and z vector components of the DVFs are
zero mean Gaussian random variables with possibly different variancesσ
x
,σ
y
andσ
z
. It
89
T
ij
Space- Global Local
Normalized Summarizing Threshold Threshold
T
n
ij
S
j
S
th
S
th
i
Method 1 T
σ
ij
max
i
{T
σ
ij
}
ˆ
F
−1
S
σ
(1−α) S
th
·σ
i
Method 2 T
p
ij
min
i
{T
p
ij
}
ˆ
F
−1
S
p
(α) p
−1
i
(S
th
)
Table 6.4: Normalization schemes, summary statistics and thresholds for the two per-
mutation methods.
is very hard to extract the theoretical distribution ofT
ij
based on eq 6.12, so we resorted
to Monte Carlo simulations. For1000 times we simulated 100 samples ofT
ij
(excluding
the outside absolute value operator) withn
1
=n
2
= 10 and tested how well they match
a Gaussian distribution using a Lilliefors test. The 0.05 Lilliefors threshold is 0.0889
and only 51 out of 1000 samples exceeded this threshold, so T
ij
pass the Gaussianity
test. Similar experiments indicated that the Gaussianity assumption is reasonable even
forn
1
= n
2
≥ 5. Finally, other simulations demonstrated thatT
ij
has the same skewed
distribution for all elementsi, as long asσ
x
=σ
y
=σ
z
. Under the above conditions, the
T
σ
ij
normalization scheme guarantees uniform specificity.
Alternatively, we can also normalize based on p-values, i.e. at each spatial location
we compute the empirical distribution across permutations and then replace the statistic
T
ij
for each permutation sample with its p-value. The p-value at surface element i for
permutationj, calledT
p
ij
, is defined byT
p
ij
=p
i
(T
ij
), where:
p
i
(t) =
1
M
X
j
H(T
ij
−t), H(x) =
1 ify≥ 0
0 ify < 0
(6.13)
We now guarantee thatT
p
ij
has a uniform distribution underH
o
for eachi.
Local Threshold Map
We can use the information contained in the normalized data to define a local threshold
map that controls the FWER to a desired level, say 5%, when applied to the original
data. For this it is necessary to estimate the maximum distibution of the vector amplitude
90
MZ NR
L
R
2 mm 8 mm
Figure 6.10: Color-coded average distance maps visualize the absolute distances
between pairs, averaged over the group. The displays show a smaller average distance
between MZ twins compared to NR subjects.
differences over all surface elementsi. We achieve this by summarizing in space using
an extremal statistic:
S
σ
j
=max
i
{T
σ
ij
}, S
p
j
=min
i
{T
p
ij
} (6.14)
Notice that the minimum p-value plays the same role as the maximum statistic in FWER.
After summarizing in space we can use the empirical distribution ofS
j
to define a thresh-
oldS
th
that controls the FWER. If
ˆ
F
S
σ and
ˆ
F
S
p are the empirical cdfs ofS
σ
andS
p
, then
the appropriate global thresholds for a level α test would be
ˆ
F
−1
S
σ (1− α) and
ˆ
F
−1
S
p (α)
respectively (Table 6.4). For example, if we choose a threshold that leaves 5% of the
area of the empirical distribution on the right side forS
σ
j
, resp. left side forS
p
j
, then we
have 5% probability of one or more false positives throughout the entire surface. The
thresholdS
th
cannot be directly applied toT
i0
(the statistic formed by the original data
with permutation indexj = 0). Since the statisticT
ij
was normalized separately for each
surface elementi, the sameS
th
will correspond to different values of local thresholds at
different surface elements. These local thresholds are found with the inverse normaliza-
tion transformation, where surface elementi shows significant variation ifT
i0
≥ S
th
σ
i
,
resp. T
i0
≥ p
−1
i
(S
th
).
91
6.2.3 Results
We applied our method to a shape similarity study of lateral ventricles, a fluid-filled
space in the human brain. We used 2 groups of subjects each consisting of 20 subjects
(age and gender matched): a monozygotic (MZ) twin group and a non-related subject
group (NR). In this study we are interested in the difference in similarity of the lateral
ventricles between these groups. The ventricles were segmented from single gradient-
echo MRI via automatic brain tissue classification and connectivity based postprocess-
ing. Using the SPHARM surface description [Styner et al., 2003b], each object was
described by 4002 surface points with known correspondence; the parameterization is
aligned according to the first order ellipsoid and then based on a rigid-body procrustes.
The difference between MZ twins and NR subjects was computed as shown in Figure
6.7. We generated M = 50,000 permutation samples and derived significance maps
according to the methods described in Table 6.4.
Figure 6.10 shows the average distance maps, which clearly display a higher degree
of similarity of the MZ group as compared to the NR group. Figure 6.11 shows the
empirical distributions ofS
σ
j
andS
p
j
. The right tail of the maximum statisticS
σ
j
is well
behaved and the threshold value
ˆ
F
−1
S
σ (0.95) can thus be reliably computed. This is not
the case for the minimum statistic S
p
j
, as the left tail is bounded by zero and, due to
discreteness of the empirical distribution, the threshold
ˆ
F
−1
S
p (0.05) required many per-
mutations (M = 50,000) for an accurate computation. The significance maps produced
by the two normalization schemes are quite similar (Fig. 6.12), which demonstrates that
the ventricle data reasonably satisfy the assumptions of the first normalization scheme.
6.2.4 Conclusion
We have presented a novel method for the statistical analysis of morphological differ-
ences of brain structures. We have used permutation tests to extract local thresholds that
92
Figure 6.11: Left: Example empirical distribution of the statisticT
ij
acrossj at a surface
elementi. Middle: Empirical distribution of the maximum statisticS
σ
j
. Right: Empirical
distribution of the minimum statisticS
p
j
.
0.001
0.05
Figure 6.12: Significance maps for the two normalization methods. The maps indi-
cate regions where unrelated individuals have more variability that twins. The methods
perform similarly on both the left and right lateral ventricle.
control the number of false positives at a specified levelα over all surface elements and
our approach implicitly accounts for spatial correlations of the data. Further, we have
proposed two normalization schemes to achieve uniform detection specificity.
If the data are Gaussian under the null hypothesis, the normalization scheme based
on the standard deviationσ
i
may be used. Otherwise, we resort to the p-value normaliza-
tion scheme which requires more permutations but makes no distributional assumptions.
93
Chapter 7
General Linear Modeling and Error
Control Using False Discovery Rate
The procedures described until now involved pairwise comparisons between two con-
ditions. We now generalize our tests to incorporate complex experimental designs, and
model our data using general linear modeling procedures. Since modeling is closely
related to the experiment at hand, we present the theory in a specific MEG experimental
setting involving a visual spatial cueing paradigm.
The human brain employs complex selective mechanisms to process the overwhelm-
ing information entering our senses and produce relevant ongoing behavior. Visual
attention models propose that this may be accomplished by differentially modulating
neural activity in the visual cortex to control the response to target and distractor stim-
uli. Since alpha frequency oscillations have been related to disengagement of cortex, we
hypothesize that dynamic maps of cortical activity over this frequency range can reveal
the temporal and spatial profile of visual attention.
To investigate attention bias mechanisms in the brain, we combine minimum-norm
imaging [H¨ am¨ al¨ ainen et al., 1993,K-Yildirim et al., 2006] with wavelet analysis to com-
pute dynamic images of oscillatory cortical activity. We then select a number of cortical
regions on the parietal, occipital, and temporal lobe that play an important role in form-
ing and maintaining attention, based on published literature on human and animal stud-
ies [Corbetta et al., 2000]. To extract contrast statistics of attention related alpha activity,
we follow a massively univariate approach and model each subject, region of interest,
and time-frequency band separately with a novel ANCOV A (analysis of covariance)
94
design, where power in rectangular time-frequency bands forms the observation vari-
ables, and baseline power forms the covariate. We incorporate trial indicator variables
in our model to compensate for within trial correlation of the right and left hemispheres.
The temporal evolution of the estimated ipsilateral vs. contralateral statistic reveals the
top-down control mechanisms of visual attention. Statistical analysis using permutation
tests and false discovery rate demonstrates the significance of our findings.
7.1 Human Attention
To understand the organization and function of the human brain, we need to explore
how higher cognitive processes exert voluntary control over more automatic brain sys-
tems. Of the many cognitive processes associated with the human mind, such as work-
ing memory, emotion, and decision-making, attention is considered the most concrete
because it is closely tied to perception. Several neuroscience studies have identified
direct neural correlates of human attention using intracortical recordings, or noninvasive
alternatives such as PET, fMRI, and EEG/MEG imaging. Most of them study visual spa-
tial attention, since it is possible to retinotopically map attended and unattended stimuli
into regions on the occipital lobe.
Even though our knowledge of attention is incomplete, we can now identify some
principles of organization of the human visual attention system. By covertly deploy-
ing attention to locations in the visual space, humans can improve perception of a
target stimulus while at the same time suppressing the processing of distractors [Pos-
ner and Petersen, 1990]. The underlying neural mechanisms have been investigated at
the level of single neurons, activity in neuronal assemblies, and macroscopic electrical
and hemodynamic activity. At the neuron level, [Luck et al., 1997] and [Womelsdorf
et al., 2006] have measured higher firing rates of individual neurons when attention is
increased. At the neural ensemble level, [Womelsdorf et al., 2006] and [Fries et al.,
95
2001] observed local field potentials that exhibit high oscillatory synchronization at the
gamma frequency. At the macroscopic level, evoked response potential studies showed
amplitude enhancement of P1 (75-130ms) and N1(1500-190ms) components [Hillyard
et al., 1998, Hillyard and Anllo-Vento, 1998], and E/MEG studies demonstrated alpha
frequency modulation [Vanni et al., 1997, Worden et al., 1946, Sauseng et al., 2005].
Stronger hemodynamic responses were measured in various brain sites, such as the
occipital, parietal and temporal lobe, using fMRI [Kastner et al., 1999] and PET [Man-
gun et al., 1997, Woldorff et al., 1997].
Neural correlates of visual attention are visible even before the presentation of a
target stimulus [Worden et al., 1946, Hopfinger et al., 2000, Foxe et al., 1998, Yam-
agishi et al., 2003], indicating that the attention system employs biasing mechanisms
to enhance ongoing behavior. It is not clear whether such bias mechanisms suppress
interference from distracting stimuli, or enhance signal from attending locations, even
though evidence favors the former [Serences et al., 2004, Ruff and Driver, 2006, Hopf
et al., 2006]. Recently, EEG studies have identified topographic changes of alpha activ-
ity during attention [Worden et al., 1946, Sauseng et al., 2005, Marrufo et al., 2001];
sensor plots showed higher alpha power on the ipsilateral hemisphere to the stimulated
hemifield than the contralateral hemisphere. Since alpha frequency oscillations have
been related to disengagement of cortex, it is not surprising that an anticipatory alpha
power decrease can be obtained at the respective sensory association cortices of all stim-
uli, regardless of whether they are auditory, visual, or somatosensory [Bastiaansen and
Brunia, 2001, Foxe et al., 1998]. However, in the visual attention paradigm we still do
not know which areas are involved [Yamagishi et al., 2003], and what is the top-down
mechanism that deploys attention [Awh et al., 2003]. Further, voluntary spatial attention
can be decomposed into elementary mental operations: disengaging attention from the
96
current focus, orienting attention to a new locus, and selectively modulating new stimu-
lus inputs [Hopfinger et al., 2000]. Despite the excellent temporal resolution of E/MEG,
this modality has not been used to study how attention is shifted and sustained. Rather,
fMRI studies have indirectly identified cortical regions, such as the temporal parietal
junction and the superior temporal gyrus, that play a role in forming and sustaining
attention [Corbetta et al., 2000, Hopfinger et al., 2000].
In the present study, we use MEG to explore the temporal evolution of human visual
attention, understand which cortical sites are responsible for shifting and sustaining
attention, and how the process is coordinated by a top-down control. Since alpha fre-
quency oscillations have been related to disengagement of cortex, we hypothesize that
dynamic maps of cortical activity over this frequency range can reveal the temporal
and spatial profile of visual attention. To our knowledge, this is the first study that
maps the topographic changes of alpha activity during attention on the cortical surface,
and explores its temporal dynamics. We apply general linear modeling of cortically
constrained activation maps extracted from an MEG study of a visual spatial cueing
paradigm. To address our hypothesis, we employ mathematical modeling that takes
into account baseline effects and within trial correlation, rather than a simple ANOV A
design. By creating statistics that estimate the ipsilateral vs. contralateral alpha activ-
ity on cortical regions, we present evidence that the superior parietal lobe and temporal
parietal junction have an instrumental role in shifting attention and driving the occipital
lobe.
7.2 Experiment
7.2.1 Subjects
Eight subjects (6 male, 2 female) participated in the experiment (mean age 27.4 years;
standard deviation 4.6 years). Two additional subjects were excluded from analysis, due
97
to excessive head movements. Informed consent was obtained from all subjects, and
the experiment was approved by the Institutional Review Board at The University of
California, San Francisco, in accordance with the 1964 declaration of Helsinki.
7.2.2 Visual spatial cueing experiment
A liquid crystal projection system displayed stimuli to the subjects onto a screen placed
36 cm from the nasion. At all times, subjects were instructed to maintain fixation on a
cross located at the center of their visual field (see Fig. 7.1). Following a short baseline
period, a brief central arrow cue instructed subjects to covertly deploy their attention to
a location in the lower left or right visual quadrant. After a 1 sec delay, a discrimination
task was performed by the subjects, to make sure they oriented and maintained attention
as instructed. In particular, a second stimulus (S2) was presented with equal probability
to the right or left visual field (a small circular patch containing a sinewave grating),
and the subjects had to press a button with their right index finger only if S2 occurred at
the cue location and was a target (20 degrees clockwise grating pattern). In the present
study, we use the data acquired during the 1 sec between cue and S2 presentation, since
at that time subjects shift and maintain attention.
7.2.3 MEG data collection and preprocessing
A total of 900 trials were collected from each subject in a series of 18 blocks of 50
trials, with acquisition time approximately 80 minutes. The head location within the
MEG helmet was recorded during the short breaks between blocks, and subjects with
excessive head movement were excluded from analysis. The MEG recordings were low-
pass filtered at 0-300 Hz, and sampled at 1200 Hz. Eye activity was measured with a
vertical and horizontal electro-oculograph. All data trials were carefully inspected for
eye movements and other artifacts (MEG artifacts, muscle activity) and contaminated
98
Figure 7.1: Visual spatial cueing experiment. A brief central arrow cue directed covert
attention to the lower left or right visual field, in anticipation of a second stimulus (S2)
delivered 1 sec later on the left or right with equal probability. The upcoming S2 con-
sisted of gratings slanted clockwise from the vertical by either 5 (non-target) or 20 (tar-
get) degrees, with a response required if the S2 appeared at the cued (attended) location
and had the target orientation.
trials were discarded, leaving around 700 trials per subject for analysis. Cardiac artifacts
on the remaining trials were removed using an infomax independent component analysis
algorithm [Hyvarinen et al., 2001], by identifying periodic independent components that
match the beating heart. Since we focus our analysis on low frequencies (alpha band),
we downsampled the timeseries to 120 Hz to reduce processing time.
7.3 Methods
Our goal is to explore the temporal evolution of human visual attention, and identify
the functional role various cortical sites play in directing and sustaining attention. We
believe that the ipsilateral vs. contralateral topographic changes of alpha activity can
reveal the top-down control mechanisms of visual spatial attention. For this reason, we
design statistics that estimate the statistical significance of the ipsilateral vs. contralat-
eral alpha effect for several cortical and temporal regions.
We combine minimum-norm imaging [H¨ am¨ al¨ ainen et al., 1993, K-Yildirim et al.,
2006] with wavelet analysis to compute dynamic images of oscillatory cortical activity.
We then select a number of cortical regions on the parietal, occipital, and temporal lobe
99
that play an important role in forming and maintaining attention, based on published
literature on human and animal studies [Corbetta et al., 2000]. To extract contrast sta-
tistics of attention related alpha activity, we follow a massively univariate approach and
model each subject, region of interest, and time-frequency band separately with a novel
ANCOV A (analysis of covariance) design, where power in rectangular time-frequency
bands forms the observation variables, and baseline power forms the covariate. We
incorporate trial indicator variables in our model to compensate for within trial correla-
tion of the right and left hemispheres. The temporal evolution of the estimated ipsilateral
vs. contralateral statistic reveals the top-down control mechanisms of visual attention.
Statistical analysis using permutation tests and false discovery rate demonstrates the
significance of our findings.
7.3.1 Model
We assume that MEG data are collected as a set of stimulus-locked event-related trials
(one per cue stimulus), each consisting of a baseline period, and a post-cue interval.
Each trial consists of an array of dataM (n
channels
×n
timepoints
) representing the measured
magnetic field at each sensor as a function of time. The measurements M are linearly
related with the brain activationX (n
sources
×n
timepoints
) as:
M =GX +N (7.1)
whereG (n
channels
×n
sources
) is the forward operator andN represents additive noise in
the channel measurements. The lead field matrix G depends on the shape and conduc-
tivity of the head [Darvas et al., 2004b], and in this study we compute it based on an
overlapping spheres model [Huang et al., 1999b] using the BrainStorm electromagnetic
software [Mosher et al., 2005].
100
(a) (b) (c)
Figure 7.2: MEG model; (a) Sensor arrangement of a 275-channel CTF MEG system,
(b) Topography of sensor measurementsM, (c) Min-norm inverse solutionX on a tes-
sellated cortical surface
A cortical map is computed for each epoch by applying a Tikhonov regularized
minimum norm inverse method [Tikhonov and Arsenin, 1977] to produce an estimate
of the temporal activity at each surface element in the cortex:
X = (G
T
G+λI)
−1
G
T
M (7.2)
We write the reconstructed cortical maps as{X
st
}, wheres andt are indices in space
and time respectively. We use the pre-stimulus data to estimate the baseline meanm
s
at
each spatial elements, by averaging over timet. We then estimate the centered data as:
Y
st
=X
st
−m
s
(7.3)
7.3.2 Wavelet Expansion and Statistics
We follow the complex Morlet wavelet procedure presented in Section 3.3 to decompose
the source timeseriesY
st
into their wavelet coefficients. For convenience, we repeat the
wavelet kernel function here:
w
tf
= (πb)
−0.5
e
2iπft
e
−t
2
/b
(7.4)
101
whereb is the bandwidth parameter.
For each sources we obtain an estimate of the time-varying frequency components
by expanding the timeseries using Morlet wavelets as:
C
stf
=Y
st
?w
tf
(7.5)
where (?) denotes the convolution operator over the time indext, andC
stf
are the com-
plex wavelet coefficients.
Our goal is to detect spatial-temporal-spectral components of cortical activity that
relate to visual attention effects. A statistic that estimates neural activation energy at
specific time-frequency instances, given by the squared wavelet coefficients, can capture
such attention effects:
E
stf
=|C
stf
|
2
(7.6)
To improve the signal to noise ratio, and increase the statistical power by minimizing
the total number of statistics we need to test for significance, we average the energy over
timeT = [t
1
,t
2
], frequencyF = [f
1
,f
2
], and cortical areaS:
E
STF
=
ZZZ
(s,t,f)∈(S,T,F)
|C
stf
|
2
dsdtdf (7.7)
Figure 7.3 shows how we select rectangular time-frequency bands in the wavelet sig-
nal domain. We divide the 1 sec post-cue time period into several equal length intervals
(only a few are shown), and select bands in the alpha range [8,14Hz]. We estimate the
E
STF
statistics on 6 cortical sites shown in Fig. 7.4. These sites have a functional role
in human vision and attention, as identified by neuroimaging studies [Corbetta et al.,
2000, Hopfinger et al., 2000], and were selected manually on the cortical surfaces of 8
subjects.
102
Cue Right Cue Left
Figure 7.3: Selecting rectangular time-frequency bands in the wavelet signal domain
(right). Also shown is the average frequency content of the cue-right and cue-left trials
on the right intra-parietal sulcus (Fig. 7.4). Alpha power is stronger on the cue-right
trials (ipsilateral hemisphere) around 500-800 ms, indicating a topographical change of
alpha activity in anticipation of a target stimulus at 1 sec in the cued direction.
Caudal Right Lateral Ventral
Figure 7.4: Visual attention effects are explored in 6 cortical regions, selected symmetri-
cally in each hemisphere; SPL: Superior Parietal Lobe, TPJ: Temporal Parietal Junction,
IPS: Intra-Parietal Sulcus, OM: Occipital Middle, OL: Occipital Lateral, OV: Occipital
Ventral.
7.3.3 ANCOV A Model
During the visual attention experiment, we acquire many energy observations E
STF
in
several spatial-temporal-spectral bandsSTF . We introduce new indices to identify the
observations: k∈{1...K} denotes the subject,i∈{1,2} denotes the main effect cue
(1 for right, 2 for left),j ∈{1,2} denotes the main effect hemisphere (1 for right, 2 for
103
left), andl denotes the independent measurements collected over each subject and main
effect.
The new indices allow us to arrange the observations into general linear models.
We follow a massively univariate approach and model each spatial-temporal-spectral
band (and subject) separately with an ANCOV A model. Our approach resembles the
standard statistical methodology in fMRI data analysis (statistical parametric mapping),
where an ANOV A model is fit in each voxel location [Friston, 1996]. However, MEG
has lower spatial resolution, so instead of voxels we use cortical regions, and higher
temporal resolution, so instead of a hemodynamic response function we expand neural
activity in time-frequency bands. We also use a baseline covariate, because it is reason-
able to expect that pre-cue brain state will affect our post-cue observations; high alpha
activity before the stimulus may predict high alpha activity after the stimulus and vice
versa. Finally, since observations from the right and left hemisphere of the same trial
are correlated, we incorporate trial indicator variables in our model to compensate for
correlated observations. The ANCOV A model we use is:
E
STF,k
ijl
=μ
STF,k
ij
+β
STF,k
j
c
STF,k
ijl
+ρ
STF,k
il
+
STF,k
ijl
(7.8)
whereμ
STF,k
ij
are the main effects for cue and hemisphere conditions,c
STF,k
ijl
are the base-
line covariates (observations of neural energy in the baseline) multiplied by an estimable
parameterβ
STF,k
j
,ρ
STF,k
il
compensates for the correlation between the two hemispheres,
and
STF,k
ijl
is the model error term. The superscriptsSTF,k indicate that we fit the same
ANCOV A model for all spatial-temporal-spectral bands and subjects.
It may be easier to understand the model in matrix notation:
E
STF,k
=Xb
STF,k
+
STF,k
(7.9)
104
Figure 7.5: First few rows of the ANCOV A design matrixX. Each MEG trial produces
two observations (post-cue neural energy in the right and left hemisphere), that account
for two rows in the design matrix. We place 1s in the proper cue and hemisphere indica-
tor variable, and the within trial correlation term. The baseline neural energy is placed
on the fifth or sixth column, if it comes from the right or left hemisphere respectively.
whereE
STF,k
is the vector of all observations, andb
STF,k
is the vector of unknown
parameters. Each observation corresponds to a row in the design matrixX, as shown in
Fig. 7.5. We find the least squares solution ofb
STF,k
, and then estimate a statistic that
captures the ipsilateral vs. contralateral effect for each subject:
S
STF,k
=μ
STF,k
11
−μ
STF,k
12
−μ
STF,k
21
+μ
STF,k
22
(7.10)
Finally, to get a reliable estimate, we average over all subjects k. We hypothesize that
large values of the statisticS
STF
in the alpha band indicate increased attention.
S
STF
= mean
k
{S
STF,k
} (7.11)
7.3.4 Statistical Significance
The spatial and temporal evolution ofS
STF
can reveal the dynamics of visual attention.
However, we need to establish if the observed effects are statistically significant, and we
105
do this using a permutation scheme [Pantazis et al., 2005b,Pantazis et al., 2003,Nichols
and Holmes, 2001].
Under the null hypothesis of no cue or hemisphere main effect, we can randomly
mix the labels of our observations and exchange the ipsilateral and contralateral con-
ditions. Computationally, this is equivalent to randomly multiplying each subject’s
response with 1 or -1 (Eq. 7.12); we avoid a complete randomization scheme that would
require mixing the first 4 columns of the design matrix, because recomputing the least
squares solution for each permutation sample is too computationally demanding.
S
STF
∗
= mean
k
{(±1)
∗
S
STF,k
} (7.12)
Since we have experimental data for 8 subjects, we can generate2
8
permutation samples
S
STF
∗
, and use them to evaluate the p-value P
S
STF of our original statistic S
STF
. The
symbol (*) denotes the statistic is generated by permutation.
By repeating this procedure for each spatial-temporal-spectral bandSTF , we eval-
uate a group of p-values of our original statistics. To find which of these p-values are
significant after multiple comparison corrections, we use a false discovery rate (FDR)
procedure. FDR controls the expected proportion of errors among the rejected hypoth-
esis [Benjamini and Hochberg, 1995, Genovese et al., 2002]. For example, FDR at a
level α = 0.05 means that if we reject 100 null hypotheses, 5 of these will be false
positives. We consider this measure suitable, because when many of our hypotheses are
rejected, the error from a single erroneous rejection is not crucial for drawing conclu-
sions about the ipsilateral vs. contralateral effect. Before applying FDR, we confirmed
that its assumptions hold [Benjamini and Yekutieli, 2001], namely we have non-negative
correlation between theS
STF
statistics in the various spatial-temporal-spectral bands.
106
Figure 7.6: Alpha power sensor map showing the difference between cue-right and cue-
left trials. The effect is averaged over 8 subjects at 600 ms post-cue, and demonstrates
that ispilateral is greater than contralateral alpha activity.
7.4 Results
Each subject was cued to attend to the right or left lower visual quadrant in approxi-
mately half the trials. Figure 7.6 shows a sensor map of the difference in alpha activity
between cue-right and cue-left trials, averaged over all 8 subjects. Even though these
patterns appear dipolar, they should not be confused as such, since these images are
power maps. In the right hemisphere, the alpha activity in the cue-right trials is greater
than the cue-left trials; the opposite holds true for the left hemisphere. This means that in
both hemispheres the ipsilateral is greater than the contralateral alpha activity. The same
result has been reported in previously published literature [Worden et al., 1946,Sauseng
et al., 2005], and it demonstrates that alpha activity represents disengagement of the
cortex.
In figure 7.7 we show the same maps, but now with the data mapped back onto
the cortical surface. Even though the ipsilateral activity is still greater, the maps show
great variation and it is difficult to localize a consistent cortical effect across subjects.
The different temporal dynamic in each subject further confound our understanding of
the effect. For this reason, in the method section we provide a formal mathematical
procedure based on massive univariate ANCOV A modeling that can reveal attention
107
Figure 7.7: Caudal view of cortical maps of 8 subjects, showing the difference in alpha
power between cue-right and cue-left trials at 600 ms post-cue. Despite the considerable
variability between subjects, overall ipsilateral activity is greater than contralateral.
effects in several cortical regions and time-frequency bands which may not be apparent
through visual inspection of the data.
Following the procedure in the method section, we divided the 1 sec post-cue
time period into 10 equal intervals, T
1
= [0,100ms], T
2
= [100,200ms], ..., T
10
=
[900,1000ms], and used the alpha band F
1
= [8,14Hz] and the 6 cortical sites shown
in Fig. 7.4. For each subject we constructed neural energy observationsE
STF,k
in these
bands, and fit multiple ANCOV A models.
To check if our modeling approach is correct, we investigated how well it can
explain the MEG observations. Out of the total variation ofE
STF,k
, on average 4%
was explained by the cue and hemisphere main effects, 12% by the baseline covariate,
66% by the within trial correlation, and 18% by the error term. The error variance is
relatively large, but given the great unpredictability of MEG recordings, we consider
our modeling satisfactory. Also, when constructing the ANCOV A model, we made the
assumption that the baseline alpha activity has an additive effect on the post-cue activity.
It the effect is multiplicative rather than additive, large values of baseline alpha activ-
ity would cause large errors in the model. Figure 7.8 plots the ANCOV A error against
108
Figure 7.8: Testing model validity. If baseline alpha activity is not an additive effect, as
we assume in the ANCOV A model, large values of baseline alpha activity would cause
large errors in the model. Since no inflation of error occurs for large baseline alpha
activity, the baseline covariate is indeed an additive effect.
the baseline alpha covariate. Since no inflation of error occurs for large baseline alpha
activity, our model accurately uses the baseline covariate.
After estimating the ispilateral vs. contralateral statistic S
STF
using Eq. 7.11, we
establish its statistical significance using permutations. The p-values are shown in Table
7.1. Red values indicate significance after multiplicity correction using a false discovery
rate procedure that gave a threshold of 0.0273 (Fig. 7.9). All cortical regions demon-
strate significant topographic changes of alpha activity after 200 ms, apart from the
temporal parietal junction that has significant activity after 400 ms, and the superior
temporal lobe that is not significant after 700 ms.
The temporal dynamics of the statistic S
STF
(Fig. 7.10) reveal the mechanisms of
human attention. Alpha band oscillations were enhanced in the parietal, temporal and
occipital regions of the ipsilateral hemisphere (ignored representation) relative to the
contralateral hemisphere (attended representation), providing support for the hypothesis
that visual selective attention operates in part by means of top-down mechanisms that
modulate cortical activity in a preparatory fashion. The superior parietal lobe reaches
a peak in ipsilateral vs. contralateral activity at 500 ms post-cue, and then the effect
109
SPL 0.4766 0.1055 0.0273 0.0039 0.0039 0.0039 0.0234 0.0586 0.0898 0.1055
IPS 0.0469 0.0156 0.0039 0.0039 0.0039 0.0039 0.0039 0.0039 0.0195 0.0234
TPJ 0.1094 0.1445 0.2813 0.0781 0.0234 0.0195 0.0195 0.0195 0.0117 0.0117
OM 0.6797 0.0781 0.0078 0.0039 0.0039 0.0039 0.0039 0.0078 0.0078 0.0117
OL 0.1680 0.0781 0.0195 0.0039 0.0039 0.0039 0.0039 0.0078 0.0039 0.0039
OV 0.4531 0.0391 0.0039 0.0039 0.0039 0.0039 0.0078 0.0117 0.0117 0.0117
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 sec
Table 7.1: P-values of the ipsilateral vs. contralateral statisticS
STF
. Red values indicate
significance after multiplicity correction using a false discovery rate procedure that gave
a threshold of 0.0273 (Fig. 7.9)
.
Figure 7.9: FDR procedure for the p-values of theS
STF
statistics. The blue line sorts the
60 p-values of Table 7.1 in ascending order. The black line is the 0.05 significance line
for 60 hypothesis tests. Based on FDR theory, all p-values below the highest crossing
point, i.e. less than or equal to 0.0273, are significant.
is suppressed. Effectively, it demonstrates that this region is responsible for shifting
spatial attention, preparing the cortex for the imminent target stimulus. On the contrary,
the temporal parietal junction peaks much later at around 700 ms, probably because
it has a functional role both in shifting and sustaining attention. Both areas control
various cortical occipital sites, that keep building the alpha biasing effect until 1 sec
when the target stimulus occurs. These findings confirm previous fMRI studies that
have indirectly identified cortical regions, such as the temporal parietal junction and the
110
superior temporal gyrus, as playing a role in forming and sustaining attention [Corbetta
et al., 2000, Hopfinger et al., 2000].
7.5 Acknowledgments
We wish to thank Dr. Gregory Simpson, Dr. Darren Weber and Dr. Corby Dale at the
University of California, San Francisco, for providing the experimental data of the MEG
visual attention study, and working with us on data analysis.
111
SPL OM
IPS OL
TPJ OV
Figure 7.10: Temporal dynamics of the ispilateral vs. contralateral statistic S
STF
at 6
cortical regions reveal the top-down mechanisms of visual spatial attention.
112
Chapter 8
Conclusion and Future Work
8.1 Conclusion
We have presented random field theory and permutation methods for processing of MEG
data and extracting statistically significant activation maps. They can be used with any
linear or nonlinear cortical imaging method to obtain objective thresholds on statistic
maps. Type I error rate is controlled over all voxels by considering the spatial and
temporal dependence of the data.
Overall, the permutation methods are more flexible than RF based methods. We can
work with spatially smoothed or unsmoothed current density maps (CDMs) depending
on the desired trade off between SNR and spatial resolution, when RFs should only be
used with sufficiently smoothed CDMs. In general, we prefer to smooth the data as
little as possible; the data is low resolution, and it seems undesirable to further reduce
the resolution. Further, permutation methods can achieve uniform sensitivity by defin-
ing different thresholds per surface element via p-values. More importantly, they do
not rely on distribution assumptions, making them suitable for real data experiments.
Their major limitation is that we need equal pre- and post-stimulus regions to allow
exchangeability. This issue can generally addressed in planning the event-related MEG
study.
We should comment here that permutation and random field tests do not address the
limited resolution of MEG reconstruction methods. The MEG inverse problem is ill-
posed and CDMs are of low resolution and tend to mislocalize source activation. If the
113
inverse method identifies experimental variation is some region, permutation and ran-
dom field tests will identify these regions regardless of the presence of an actual source
at those locations. In most cases, CDM reconstructions of activation from a single sulcus
or gyrus will tend to show activation in neighboring sulci or gyri respectively. It is quite
possible that reconstructions of activation in a single cortical area will exceed the deter-
mined threshold in multiple areas, and thus particular care is required in interpretation
of CDMs, even after thresholding to control for FWER.
Permutation methods are easily generalizable to multidimensional data, such as
spatial-temporal-frequency cortical maps. We have demonstrated their application to
time-frequency cortical maps produced by a Morlet wavelet expansion. This allows us
to further investigate the dynamics of the brain, by identifying the frequency compo-
nents of neuronal communication.
Our methods are directly applicable to neuroimaging studies performed with MEG
or EEG. Such studies, for example, include event-related experiments where baseline
data are acquired and used as a reference to test against experimental effects. We
have demonstrated our methodology in two important applications: localization of the
language-specific cortex of epileptic patients, and surface-based morphometry of brain
structures of schizophrenic patients.
We have also presented procedures to move from simple pairwise tests to complex
experimental designs involving multiple factors and covariates. We have demonstrated
the use of general linear theory to model main effects and interactions in MEG exper-
imental data. Since modeling is closely related to the experiment at hand, we have
presented the theory in a specific MEG experimental setting involving a visual spatial
cueing paradigm. To extract contrast statistics of attention related alpha activity, we
have followed a massively univariate approach and modeled each subject, region of
114
interest, and time-frequency band separately with a novel ANCOV A (analysis of covari-
ance) design, where power in rectangular time-frequency bands forms the observation
variables, and baseline power forms the covariate. We have also demonstrated the use
of false discovery rate, a less conservative alternative to familywise error rate control, to
do multiplicity adjustments to our statistics.
8.2 Future Work
Over the last years I have developed and applied several statistical methodologies to
process brain activation maps from MEG recordings. I have used these methodolo-
gies to investigate cognitive responses in high-density MEG studies of visual attention,
language processing, somatosensory-motor integration, and morphological analysis of
brain structures. My goal is to broaden my research in the area of brain mapping, both
functional and anatomical
8.2.1 Signal Processing Methodologies in MEG
I have investigated a wide range of processing and statistical techniques in MEG: para-
metric (random field) and nonparametric (permutation and bootstrap) tests to evalu-
ate the statistical significance of MEG inverse solutions; imaging oscillatory neural
processes with wavelet expansions; evaluation of reconstruction methods using ROC
analysis; experimental design using general linear modeling; error control based on fam-
ilywise error rate and false discovery rate. I plan to continue research in this direction,
by developing new statistical procedures and improving MEG data quality.
In the future we expect statistical thresholding of brain activation maps to be a stan-
dard procedure in neuroimaging studies. Apart from the maximal statistics investigated
in this work, we can develop methods that control the cluster size of activation in brain
115
images. This technique will combine random field theory and permutation tests to calcu-
late the resolution element (RESEL) count of activation clusters, and define a threshold
based on cluster size. Further, we intend to develop statistical techniques that ensure
MEG data quality and involve tests for Gaussianity, model fitting, and smoothness of
cortical images. We will compare data from different MEG vendors, such as CTF, Elekta
Neuromag, and 4D Neuroimaging, to investigate data reproducibility and SNR perfor-
mance. New denoising techniques will be explored, such as subspace techniques based
on generalized sidelobe cancelers.
8.2.2 Functional Mapping
We will use the developed methodologies to explore brain function and investigate cog-
nitive processes. The human brain forms complex cognitive networks to perform tasks,
such as attention, language processing, and working memory. Our knowledge of these
processes is still limited, but neuroimaging studies can help us gain insight into how the
brain works and integrates information. MEG has the temporal resolution necessary to
resolve brain dynamics and synchronization of cortical regions.
8.2.3 Anatomical Mapping
The study of brain morphology has emerged as a new field of computational neu-
roanatomy and can provide new insights into changes in the brain during development
and aging as well as investigate the morphological effects of different neurological dis-
eases and disorders. We have successfully developed a method for statistical surface-
based morphometry based on the use of non-parametric permutation tests. Recent
improvements include the use of Hotelling’s T
2
statistics to derive a metric of differ-
ence between anatomically corresponding locations between different images. Future
116
work will include developmental studies and multivariate analysis of variance (MAN-
COV A) designs with group and gender main effects, and covariates, such as age and IQ
test. Our goal is to identify morphological manifestations of various diseases, such as
autism and schizophrenia.
To extend the statistical analysis to multiple subjects, we can use recently developed
procedures to co-register brains into an atlas based on volume registration with cortical
surface based constraints. This procedure will be used in intersubject studies of MEG
and fMRI experiments.
117
References
[Adler, 1981] Adler, R. J., editor (1981). The Geometry of Random Fields. New York:
Wiley.
[Andrade et al., 2000] Andrade, A., Kherif, F., Mangin, J.-F., Worsley, K. J., Paradis,
A. L., Simon, O., Dehaene, S., Bihan, D. L., and Poline, J.-B. (2000). Detection of
fMRI activation using cortical surface mapping. Human Brain Mapping, 12:79–93.
[Anninos et al., 1991] Anninos, P., Tsagas, N., Sandyk, R., and Derpapas, K. (1991).
Magnetic stimulation in the treatment of partial seizures. Neuroscience, 60:141–171.
[Arfken, 2000] Arfken, G. B., editor (2000). Mathematical Methods for Physicists.
Academic Press, 5th edition.
[Awh et al., 2003] Awh, E., Matsukura, M., and Serences, J. T. (2003). Top-down con-
trol over biased competition during covert spatial orienting. Journal of Experimental
Psychology, 29(1):52–63.
[Baillet et al., 2004] Baillet, S., Mosher, J., and Leahy, R. (2004). Brainstorm: A
non-commercial matlab toolbox for meg-eeg data visualization and processing.
http://neuroimage.usc.edu/brainstorm.
[Baillet et al., 2001] Baillet, S., Mosher, J. C., and Leahy, R. M. (2001). Electromag-
netic brain mapping. IEEE Signal Processins Magazine, 18:14–30.
[Barnes and Hillbrand, 2003] Barnes, G. R. and Hillbrand, A. (2003). Statistical flat-
tening of MEG beamformer images. Human Brain Mapping, 18:1–12.
[Barone and Paterno, 1982] Barone, A. and Paterno, G., editors (1982). Physics and
Applications of the Josephson Effect. Wiley.
[Basar et al., 2001] Basar, E., Basar-Eroglu, C., Karakas, S., and Schurmann, M.
(2001). Gamma, alpha, delta, and theta oscillations govern cognitive processes. Int.
J. Psychophysiol., 2-3(39):241–248.
118
[Bastiaansen and Brunia, 2001] Bastiaansen, M. C. M. and Brunia, C. H. M. (2001).
Anticipatory attention: An event-related desynchronization approach. International
Journal of Psychophysiology, 43:91–107.
[Benjamini and Hochberg, 1995] Benjamini, Y . and Hochberg, Y . (1995). Controlling
the false discovery rate: a practical and powerful approach to multiple testing. Jour-
nal of the Royal Statistical Society, 57:289–300.
[Benjamini and Yekutieli, 2001] Benjamini, Y . and Yekutieli, D. (2001). The control of
the false discovery rate in multiple testing under dependency. The Annals of Statistics,
29(4):1165–1188.
[Bertrand et al., 2000] Bertrand, O., Tallon-Baudry, C., and Pernier, J. (2000). Time-
frequency analysis of oscillatory gamma-band activity: wavelet approach and phase-
locking estimation. In C, A., Y , O., G, S., S, S., and C, W., editors, Advances in
Biomagnetism Research: Biomag96, Springer-Verlag, New York.
[Binder et al., 1996] Binder, J., Swanson, S., Hammeke, T., Morris, G., Mueller, W.,
Fischer, M., Benbadis, S., Frost, J., Rao, S., and Haughton, V . (1996). Determina-
tion of language dominance using functional mri: a comparison with the wada test.
Neurology, 46:978–84.
[Blair and Karnisky, 1994] Blair, R. C. and Karnisky, W. (1994). Distribution-free sta-
tistical analyses of surface and volumetric maps. Functional Neuroimaging: Techni-
cal Foundations (R. W. Thatcher, M. Hallett, T. Zeffiro, E. R. Jony, M. Huerta, eds.),
pages 19–28.
[Boothby, 2002] Boothby, W., editor (2002). An Introduction to Differentiable Mani-
folds and Riemannian Geometry, Revised. Academic Press.
[Bowyer et al., 2005a] Bowyer, S., Fleming, T., Greenwald, M., Moran, J., Mason, K.,
Weiland, B., Smith, B., Barkley, G., and Tepley, N. (2005a). Magnetoencephalo-
graphic localization of the basal temporal language area. Epilepsy and Behavior,
6:229–34.
[Bowyer et al., 2005b] Bowyer, S., Moran, J., Weiland, B., Mason, K., Greenwald, M.,
Smith, B., Barkley, G., and Tepley, N. (2005b). Language laterality determined by
meg mapping with mr-focuss. Epilepsy and Behavior, 6:235–41.
[Breier et al., 2001] Breier, J., Simos, P., Wheless, J., Constantinou, J., Baumgartner, J.,
Venkataraman, V ., and Papanicolaou, A. (2001). Language dominance in children as
determined by magnetic source imaging and the intracarotid amobarbital procedure:
a comparison. J Child Neurol, 16:124–30.
119
[Brookes et al., 2004] Brookes, M. J., Gibson, A. M., Hall, S. D., Furlong, P. L., Barnes,
G. R., Hillebrand, A., Singh, K. D., Holliday, I. E., Francis, S. T., and Morris, P. G.
(2004). A general linear model for MEG beamformer imaging. Neuroimage, 23:936–
946.
[Bullmore et al., 1999] Bullmore, E. T., Suckling, J., Overmeyer, S., Rabe-Hesketh, S.,
Taylor, E., and Brammer, M. J. (1999). Global, voxel, and cluster tests, by theory
and permutation, for a difference between two groups of structural mr images of the
brain. IEEE Transactions on Medical Imaging, 18:32–42.
[Carbonell et al., 2004] Carbonell, F., Galan, L., Valdes, P., Worsley, K., Biscay, R. J.,
Diaz-Comas, L., Bobes, M. A., and Parra, M. (2004). Random field - union intersec-
tion tests for eeg/meg imaging. NeuroImage, 22:268–276.
[Chung, 2001] Chung, M. K. (2001). Statistical Morphometry in Neuroanatomy. PhD
thesis, McGill University, Montreal.
[Cohen, 1972] Cohen, D. (1972). Magnetoencephalography: Detection of the brain’s
electrical activity with a superconducting magnetometer. Science, 175:664–666.
[Corbetta et al., 2000] Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, M. P.,
and Shulman, G. L. (2000). V oluntary orienting is dissociated from target detection
in human posterior parietal cortex. Nature Neuroscience, 3(3):292–297.
[Csernansky et al., 1998] Csernansky, J., Joshi, S., Wang, L., Haller, J., Gado, M.,
Miller, J., Grenander, U., and Miller, M. (1998). Hippocampal morphometry in
schizophrenia via high dimensional brain mapping. Proc. Natl. Acad. Sci. USA,
95:11406–11411.
[Dale et al., 2000] Dale, A. M., Liu, A. K., Fischi, R. B., Buckner, R. L., Belliveau,
J. W., Lewine, J. D., and Halgren, E. (2000). Dynamic statistical parametric mapping:
Combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron,
26:55–67.
[Dale and Serano, 1993] Dale, A. M. and Serano, M. I. (1993). Improved localization
of cortical activity by combining EEG and MEG with MRI cortical surface recon-
struction: a linear approach. Cognitive Neuroscience, 5:162–176.
[Darvas et al., 2004a] Darvas, F., Pantazis, D., Kucukaltun-Yildirim, E., and Leahy, R.
(2004a). Mapping human brain function with MEG and EEG: methods and valida-
tion. Neuroimage, 23(supplement 1):S289–S299.
[Darvas et al., 2004b] Darvas, F., Pantazis, D., Kucukaltun-Yildirim, E., and Leahy,
R. M. (2004b). Mapping human brain function with MEG and EEG: methods and
validation. Neuroimage, 23:s289–s299.
120
[Darvas et al., 2005] Darvas, F., Rautiainen, M., Pantazis, D., Baillet, S., Benali, H.,
Mosher, J., Garnero, L., and Leahy, R. (2005). Investigations of dipole localization
accuracy in MEG using the bootstrap. Neuroimage, 25(2):355–368.
[Davatzikos et al., 1996] Davatzikos, C., Vaillant, M., Resnick, S., Prince, J., Letovsky,
S., and Bryan, R. (1996). A computerized method for morphological analysis of the
corpus callosum. 20:88–97.
[Edgington, 1995] Edgington, E. S., editor (1995). Randomization Tests. Academic
Press.
[Foxe et al., 1998] Foxe, J. J., Simpson, G. V ., and Ahlfors, S. P. (1998). Parieto-
occipital 10hz activity reflects anticipatory state of visual attention mechanisms. Neu-
roreport, 9:3929–3933.
[Fries et al., 2001] Fries, P., Reynolds, J. H., Rorie, A. E., and Desimone, R. (2001).
Modulation of oscillatory neuronal synchronization by selective visual attention. Sci-
ence, 291:1560–1563.
[Friston, 1996] Friston, K. (1996). Statistical parametric mapping and other analysis of
functional imaging data. In Brain Mapping: The Methods, pages 363–385. Academic
Press.
[Friston et al., 2000] Friston, K., Josephs, O., Zarahn, E., Rouquette, A. H. S., and
Poline, J.-B. (2000). To smooth or not to smooth? Bias and efficiency in fMRI
time-series analysis. NeuroImage, 12:196–208.
[Friston et al., 1995] Friston, K. J., Holmes, A. P., Worsley, K. J., Poline, J. B., Frith, C.,
and Frackowiak, R. S. J. (1995). Statistical parametric maps in functional imaging:
A general linear approach. Human Brain Mapping, 2:189–210.
[Gaillard et al., 2002] Gaillard, W., Balsamo, L., Xu, B., Grandin, C., Braniecki, S.,
Papero, P., Weinstein, S., Conry, J., Pearl, P., Sachs, B., Sato, S., Jabbari, B., Vezina,
L., Frattali, C., and Theodore., W. (2002). Language dominance in partial epilepsy
patients identified with an fmri reading task. Neurology, 59:256–65.
[Gaillard et al., 2004] Gaillard, W., Balsamo, L., Xu, B., McKinney, C., Papero, P.,
Weinstein, S., Conry, J., Pearl, P., Sachs, B., Sato, S., Vezina, L., Frattali, C., and
Theodore., W. (2004). fmri language task panel improves determination of language
dominance. Neurology, 63:1403–1408.
[Genovese et al., 2002] Genovese, C., Lazar, N., and Nichols, T. (2002). Thresholding
of statistical maps in functional neuroimaging using the false discovery rate. Neu-
roImage, 15:870–878.
121
[Golub and Loan, 1996] Golub, G. and Loan, C. V . (1996). Matrix Computations.
Johns Hopkins Univ. Press, Baltimore, MD, 3rd edition.
[Golub and von Matt, 1997] Golub, G. and von Matt, U. (1997). Tikhonov regulariza-
tion for large scale problems. Technical Report SCCM-97-03, Stanford University.
[Greenhouse and Geisser, 1959] Greenhouse, S. and Geisser, S. (1959). On methods in
the analysis of profile data. Psychometrika, 24:95–112.
[Gross et al., 1991] Gross, G., Duara, R., Barker, W., Loewenstein, D., Chang, J.,
Yoshii, F., Apicella, A., Pascal, S., Boothe, T., Sevush, S., and et al. (1991). Positron
emission tomographic studies during serial word-reading by normal and dyslexic
adults. J Clin Exp Neuropsychol, 13:531–44.
[H¨ am¨ al¨ ainen et al., 1993] H¨ am¨ al¨ ainen, M., Hari, R., Ilmoniemi, R., Knuutila, J., and
Lounasmaa, O. (1993). Magnetoencephalography. theory, instrumentation and appli-
cations to the noninvasive study of human brain function. Reviews of Modern Physics,
65:413–448.
[Hari and Salmelin, 1997] Hari, R. and Salmelin, R. (1997). Human cortical oscilla-
tions: a neuromagnetic view through the skull. Trends in Neurosciences, 20:44–49.
[Hawkins et al., 2003] Hawkins, D. M., Basak, S. C., and Mills, D. (2003). Assessing
model fit by cross-validation. J. Chem. Inf. Comput. Sci., 43:579–586.
[Hillyard and Anllo-Vento, 1998] Hillyard, S. A. and Anllo-Vento, L. (1998). Event-
related brain potentials in the study of visual selective attention. Proceedings of
National Academy of Science, 95:781–787.
[Hillyard et al., 1998] Hillyard, S. A., Teder-Salejarvi, W. A., and Munte, T. F. (1998).
Temporal dynamics of early perceptual processing. Current Opinion in Neurobiol-
ogy, 8:202–210.
[Hochberg and Tamhane, 1987] Hochberg, Y . and Tamhane, A. C., editors (1987). Mul-
tiple Comparison Procedures. Wiley.
[Hopf et al., 2006] Hopf, J. M., Boehler, C. N., Luck, S. J., Tsotsos, J. K., Heinze,
H. J., and Schoenfeld, M. A. (2006). Direct neurophysiological evidence for spatial
suppression surrounding the focus of attention in vision. Proceedings of National
Academy of Science, 103(4):1053–1058.
[Hopfinger et al., 2000] Hopfinger, J. B., Buonocore, M. H., and Mangun, G. R. (2000).
The neural mechanisms of top-down attentional control. Nature Neuroscience,
3(3):284–291.
122
[Huang et al., 1999a] Huang, M., Mosher, J., and Leahy, R. (1999a). A sensor-weighted
overlapping-sphere head model and exhaustive head model comparison for meg.
Phys. Med. Biol., 44:423–440.
[Huang et al., 1999b] Huang, M., Mosher, J., and Leahy, R. (1999b). A sensor-
weighted overlapping-sphere head model and exhaustive head model comparison for
meg. Physics in Medicine and Biology, 44(2):423–440.
[Huang et al., 1998] Huang, M. X., Mosher, J. C., and Leahy, R. M. (1998). A sensor-
weighted overlapping-sphere head model and exhaustive head model comparison for
MEG. Physics in Medicine and Biology, 44:423–440.
[Hyvarinen et al., 2001] Hyvarinen, A., j. Karhunen, and Oja, E., editors (2001). Inde-
pendent Component Analysis. Wiley-Interscience.
[Jerbi et al., 2005] Jerbi, K., Baillet, S., Lachaux, J.-P., Pantazis, D., Leahy, R. M., and
Garnero, L. (2005). Modulations of power and synchronization of neural activity
during sustained visuomotor coordination: An MEG study. In The organization of
Human Brain Mapping, 11th annual meeting.
[Jerbi et al., 2004] Jerbi, K., Lachaux, J.-P., Baillet, S., and Garnero, L. (2004). Imaging
cortical oscillations during sustained visuomotor coordination in MEG. In Proceed-
ings of ISBI 2004, Arlington, VA, USA.
[K-Yildirim et al., 2006] K-Yildirim, E., Pantazis, D., and Leahy, R. (2006). Task-
based comparison of inverse methods in magnetoencephalography. IEEE Transac-
tions on Biomedical Engineering, 53(9):1783–1793.
[Kandel et al., 2000] Kandel, E. R., Schwartz, J. H., and Jessell, T. M., editors (2000).
Principles of Neural Science. McGraw-Hill/Appleton and Lange.
[Kastner et al., 1999] Kastner, S., Pinsk, M. A., Weerd, P. D., Desimone, R., and Unger-
leider, L. G. (1999). Increased activity in human visual cortex during directed atten-
tion in the absence of visual stimulation. Neuron, 22:751–761.
[Katila and Karp, 1983] Katila, T. and Karp, P. (1983). Magnetocardiography: Mor-
phology and multipole presentations. In Williamson, S. J., Romani, G. L., Kaufman,
L., and Modena, I., editors, Proc. 16th Conf. Information Processing in Medical
Imaging, pages 237–263.
[Knecht et al., 2000] Knecht, S., Drager, B., Deppe, M., Bobe, L., Lohmann, H., Floel,
A., Ringelstein, E., and Henningsen, H. (2000). Handedness and hemispheric lan-
guage dominance in healthy humans. Brain, 123:2512–2518.
123
[Lee et al., 2005] Lee, H., Simpson, G. V ., Logothetis, N. K., and Rainer, G. (2005).
Phase locking of single neuron activity to theta oscillations during working memory
in monkey. Neuron, 45:147–156.
[Lesser et al., 1986] Lesser, R. P., Luders, H., Morris, H. H., Dinner, D. S., Klem, G.,
Hahn, J., and Harrison, M. (1986). Electrical stimulation of wernicke’s area interferes
with comprehension. Neurology, 36:658–663.
[Lu and Kaufman, 2003] Lu, Z.-L. and Kaufman, L., editors (2003). Magnetic Source
Imaging of the Human Brain. Lawrence Erlbaum Associates.
[Luck et al., 1997] Luck, S. J., Chelazzi, L., Hillyard, S. A., and Desimone, R. (1997).
Neural mechanisms of spatial selective attention in areas v1, v2, and v4 of macaque
visual cortex. Journal of Neurophysiology, 77:24–42.
[Maiwald et al., 2004] Maiwald, T., Winterhalder, M., Aschenbrenner-Scheibe, R.,
V oss, H., Schulze-Bonhage, A., and Timmer, J. (2004). Comparison of three non-
linear seizure prediction methods by means of the seizure prediction characteristic.
Physica D, 194:357–368.
[Mangun et al., 1997] Mangun, G. R., Hopfinger, J. B., Kussmaul, C. L., Fletcher,
E. M., and Heinze, H. J. (1997). Covariations in ERP and PET measures of spa-
tial selective attention in human extrastriate visual cortex. Human Brain Mapping,
5:273–279.
[Marrufo et al., 2001] Marrufo, M. V ., Vaquero, E., Cardoso, M. J., and Gomez, C. M.
(2001). Temporal evolution of alpha and beta bands during visual spatial attention.
Cognitive Brain Research, 12:315–320.
[Martin Vetterli, 1995] Martin Vetterli, J. K., editor (1995). Wavelets and Subband Cod-
ing. Prentice Hall PTR; 1st edition.
[Medina et al., 2004] Medina, L. S., Aguirre, E., Bernal, B., and Altman, N. R. (2004).
Functional mr imaging versus wada test for evaluation of language lateralization:
Cost analysis. Radiology, 230:49–54.
[Meunier et al., 2001] Meunier, S., Garnero, L., Ducorps, A., Mazieres, L., Montcel,
S. L. S. T. D., Renault, B., and Vidailhet, M. (2001). Human brain mapping in
dystonia reveals both endophenotypic traits and adaptative reorganization. Annals of
Neurology, 47(2):521–527.
[Morgan and Gevins, 1986] Morgan, N. H. and Gevins, A. S. (1986). Wigner distribu-
tions of human event-related brain potentials. IEEE Trans. BME, 33:63–70.
124
[Mosher et al., 2005] Mosher, J., Baillet, S., Darvas, F., Pantazis, D., Yildirim, E., and
Leahy, R. (2005). Brainstorm electromagnetic imaging software. In International
Journal of Bioelectromagnetism, volume 7(2), pages 189–190.
[Mosher and Leahy, 1998] Mosher, J. and Leahy, R. (1998). Recursive MUSIC: a
framework for EEG and MEG source localization. IEEE Trans. on Biomed. Eng.,
45(11):1342–1355.
[Mosher et al., 1992] Mosher, J., Leahy, R., and Lewis, P. (1992). Multiple dipole mod-
eling and localization from spatiotemporal MEG data. IEEE Trans. Biomed. Eng.,
39:541–557.
[Mosher et al., 1999a] Mosher, J., Leahy, R., Shattuck, D., and Baillet, S. (1999a).
MEG source imaging using multipolar expansions. In Kuba, A., Samal, M., and
Todd-Pokropek, A., editors, Proc. 16th Conf. Information Processing in Medical
Imaging, pages 15–28.
[Mosher et al., 1999b] Mosher, J. C., Leahy, R. M., and Lewis, P. S. (1999b). EEG and
MEG: Forward solutions for inverse problems. IEEE Transactions of Biomedical
Engineering, 46 (3):245–259.
[Nichols and Hayasaka, 2003] Nichols, T. E. and Hayasaka, S. (2003). Controlling the
familywise error rate in functional neuroimaging: A comparative review. Statistical
Methods in Medical Research, 12(5):419–446.
[Nichols and Holmes, 2001] Nichols, T. E. and Holmes, A. P. (2001). Nonparametric
permutation tests for functional neuroimaging: A primer with examples. Human
Brain Mapping, 15:1–25.
[Ojemann et al., 1989] Ojemann, G., Ojemann, J., and et al, E. L. (1989). Cortical
language localization in left, dominant hemisphere. an electrical stimulation mapping
investigation in 117 patients. J Neurosurg, 71:316–326.
[Oldfield, 1971] Oldfield, R. C. (1971). The assessment and analysis of handedness:
the edinburgh inventory. Neuropsychologia, 9:97–113.
[Oostendorp and Oosterom, 1988] Oostendorp, T. F. and Oosterom, A. V . (1988). Inter-
polation on a triangulated 3d surface. Journal of Computational Physics, 80:331–
343.
[Ossadtchi et al., 2004] Ossadtchi, A., Baillet, S., Mosher, J., Thyerlei, D., Sutherling,
W., and Leahy, R. (2004). Automated interictal spike detection and source localiza-
tion in magnetoencephalography using independent components analysis and spatio-
temporal clustering. Clin Neurophysiol., 115(3):508–522.
125
[Pantazis et al., 2004] Pantazis, D., Leahy, R. M., Nichols, T. E., and Styner, M. (2004).
Statistical surface-based morphometry using a non-parametric approach. In 2004
IEEE International Symposium on Biomedical Imaging.
[Pantazis et al., 2005a] Pantazis, D., Merrifield, W., Darvas, F., Sutherling, W., and
Leahy, R. (2005a). Hemispheric language dominance using MEG cortical imag-
ing and non-parametric statistical analysis. WSEAS Transactions on Biology and
Biomedicine, 2(3):318–325.
[Pantazis et al., 2003] Pantazis, D., Nichols, T. E., Baillet, S., and Leahy, R. M. (2003).
Spatiotemporal localization of significant activation in MEG using permutation tests.
In Information Processing in Medical Imaging, pages 512–523.
[Pantazis et al., 2005b] Pantazis, D., Nichols, T. E., Baillet, S., and Leahy, R. M.
(2005b). A comparison of random field theory and permutation methods for the
statistical analysis of MEG data. Neuroimage, 25:383–394.
[Pantazis et al., 2005c] Pantazis, D., Weber, D., Dale, C., Nichols, T., Simpson, G., and
Leahy, R. (2005c). Imaging of oscillatory behavior in event-related MEG studies.
In Bouman, C. and Miller, E., editors, Proceedings of SPIE, Computational Imaging
III, pages 55–63.
[Papanicolaou et al., 1999] Papanicolaou, A., Simos, P., Breier, J., Zouridakis, G., Will-
more, L., Wheless, J., Constantinou, J., Maggio, W., and Gormley, W. (1999). Mag-
netoencephalographic mapping of the language-specific cortex. J Neurosurg, 90:85–
93.
[Papanicolaou et al., 2004] Papanicolaou, A., Simos, P., Castillo, E., Breier, J., Sarkari,
S., Pataraia, E., Briilingsley, R., Buchanan, S., Wheless, J., Maggio, V ., and Maggio,
W. (2004). Magnetoencephalography: a noninvasive alternative to the wada proce-
dure. J Neurosurg, 100:867–76.
[Pfurtscheller and da Silva, 1999] Pfurtscheller, G. and da Silva, F. H. L. (1999). Event-
related EEG/MEG synchronization and desynchronization: basic principles. In Press,
T. U., editor, Proceedings of the 11th International Conference on Biomagnetism,
volume 11, pages 1842–1857.
[Phillips et al., 1997] Phillips, J. W., Leahy, R. M., and Mosher, J. C. (1997). MEG
- based imaging of focal neuronal current sources. IEEE Transactions of Medical
Imaging, 163:338–348.
[Posner and Petersen, 1990] Posner, M. I. and Petersen, S. E. (1990). The attention
system of the human brain. Annual Reviews of Neuroscience, 13:25–42.
126
[Pujol et al., 1999] Pujol, J., Deus, J., Losilla, J., and Capdevila, A. (1999). Cerebral
lateralization of language in normal left-handed people studied by functional mri.
Neurology, 52:1038–1043.
[Ruff and Driver, 2006] Ruff, C. C. and Driver, J. (2006). Attentional preperation for
a lateralized visual distractor: Behavioral and fMRI evidence. Journal of Cognitive
Neuroscience, 18(4):522–538.
[Salanova et al., 1992] Salanova, V ., Andermann, F., Oliver, A., Rasmussen, T., and
Quesney, L. F. (1992). Occipital lobe epilepsy: Electroclinical manifestations, eloc-
trocorticography, cortical stimulation and outcome in 42 patients treated between
1930 and 1991: Surgery of occipital lobe epilepsy. Brain, 115(6):1655–1680.
[Satterthwaite, 1946a] Satterthwaite, F. (1946a). An approximate distribution of esti-
mates of variance components. Biometrics, 2:110–114.
[Satterthwaite, 1946b] Satterthwaite, F. E. (1946b). An approximate distribution of esti-
mates of variance components. Biometrics Bulletine, 2:110–114.
[Sauseng et al., 2005] Sauseng, P., Klimesch, W., Stadler, W., Schabus, M., Doppel-
mayr, M., Hanslmayr, S., Gruber, W. R., and Birbaumer, N. (2005). A shift of visual
spatial attention is selectively associated with human EEG alpha activity. Journal of
Neuroscience, 22:2917–2926.
[Scharf, 1991] Scharf, L. (1991). Statistical Signal Processing: Detection, Estimation,
and Time Series Analysis. Addison-Wesley Publishing Co., New York.
[Schmidt, 1986] Schmidt, R. (1986). Multiple emitter location and signal parameter
estimation. IEEE Trans. Antennas Propagat., 34:276–280.
[Serences et al., 2004] Serences, J. T., Yantis, S., Culberson, A., and Awh, E. (2004).
Preparatory activity in visual cortex indexes distractor suppression during covert spa-
tial orienting. Journal of Neurophysiology, 92:3538–3545.
[Shattuck and Leahy, 2002a] Shattuck, D. and Leahy, R. (2002a). Brainsuite: An auto-
mated cortical surface identification tool. Medical Image Analysis, 6(2):129–142.
[Shattuck and Leahy, 2002b] Shattuck, D. W. and Leahy, R. M. (2002b). Brainsuite:
An automated cortical surface identification tool. Medical Image Analysis, 6(2):129–
142.
[Simos et al., 1998] Simos, P., Breier, J., Zouridakis, G., and Papanicolaou, A. (1998).
Identification of language-specific brain activity using magnetoencephalography. J
Clin Exp Neuropsychol, 20:706–22.
127
[Singh et al., 2003] Singh, K., Barnes, G. R., and Hillebrand, A. (2003). Group imaging
of task-related changes in cortical synchronization using nonparametric permutation
testing. Neuroimage, 19:1589–1601.
[Singh et al., 2002] Singh, K., Barnes, G. R., Hillebrand, A., Forde, E. M., and
Williams, A. L. (2002). Task-related changes in cortical synchronization are spa-
tially coincident with the hemodynamic response. Neuroimage, 1(16):103–114.
[Sowell et al., 1999a] Sowell, E. R., Thompson, P. M., Holmes, C. J., Jernigan, T. L.,
and Toga, A. W. (1999a). In vivo evidence for post-adolescent frontal and striatal
maturation. Nature Neuroscience.
[Sowell et al., 1999b] Sowell, E. R., Thompson, P. M., Holmes, C. J., Jernigan, T. L.,
and Toga, A. W. (1999b). Localizing age-related changes in brain structure between
childhood and adolescence using statistical parametric mapping. Neuroimage, 9:587–
597.
[Stefan et al., 2003] Stefan, H., Hummel, C., Scheler, G., Genow, A., Druschky, K.,
Tilz, C., Kaltenhauser, M., Hopfengartner, R., Buchfelder, M., and Romstock, J.
(2003). Magnetic brain source imaging of focal epileptic activity: a synopsis of 455
cases. Brain, 126(11):2396–2405.
[Styner et al., 2003a] Styner, M., Gerig, G., Lieberman, J., Jones, D., and Weinberger,
D. (2003a). Statistical shape analysis of neuroanatomical structures based on medial
models. Medical Image Analysis, 7(3):207–220.
[Styner et al., 2004] Styner, M., Lieberman, J. A., Pantazis, D., and Gerig, G. (2004).
Boundary and medial shape analysis of the hippocampus in schizophrenia. Medical
Image Analysis, 8.
[Styner et al., 2003b] Styner, M. A., Lieberman, J., and Gerig, G. (2003b). Boundary
and medial shape analysis of the hippocampus in schizophrenia. In Medical Image
Computing and Computer - Assisted Intervention (MICCAI), pages 464–471.
[Szymanski et al., 2001] Szymanski, M., Perry, D., Gage, N., Rowley, H., Walker, J.,
Berger, M., and Roberts, T. (2001). Magnetic source imaging of late evoked field
responses to vowels: toward an assessment of hemispheric dominance for language.
J Neurosurg, 94:445–53.
[Tallon-Baudry et al., 1996] Tallon-Baudry, C., Bertrand, O., Delpuech, C., and Per-
mier, J. (1996). Stimulus specificity of phase-locked and non-phase-locked 40Hz
visual responses in human. Neuroscience, 16(16):4240–4249.
[Tallon-Baudry et al., 1997] Tallon-Baudry, C., Bertrand, O., Delpuech, C., and Per-
mier, J. (1997). Oscillatory gamma-band (30-70 Hz) activity induced by a visual
search task in humans. Neuroscience, 17(2):722–734.
128
[Teolis, 1998] Teolis, A., editor (1998). Computational Signal Processing With
Wavelets (Applied and Numerical Harmonic Analysis). Birkhauser Boston.
[Thompson et al., 2001] Thompson, P. M., Cannon, T. D., Narr, K. L., Erp, T., Pouta-
nen, V .-P., Huttunen, M., L¨ onnqvist, J., Standertskj¨ old-Nordenstam, C.-G., Kaprio,
J., Khaledy, M., Dail, R., Zoumalan, C. I., and Toga, A. W. (2001). Genetic influences
on brain structure. Nature Neuroscience, 4(12).
[Thompson et al., 2003] Thompson, P. M., Hayashi, K. M., Zubicaray, G., Janke, A. L.,
Rose, S. E., Semple, J., Herman, D., Hong, M. S., Dittmer, S. S., Doddrell, D. M.,
and Toga, A. W. (2003). Dynamics of gray matter loss in alzheimer’s disease. Journal
of Neuroscience, 23(3):994–1005.
[Tikhonov and Arsenin, 1977] Tikhonov, A. N. and Arsenin, V . Y ., editors (1977). Solu-
tions to Ill-Posed Problems. V . H. Winston & Sons.
[Valachovic et al., 1997] Valachovic, A. M., Smith, B. J., Elisevich, K., Fisk, J., and
Jacobson, G. (1997). Language and its management in the surgical epilepsy patient.
In Johnson, A. and Jacobson, B., editors, Medical Speech-Language Pathology: A
practitioner’s Guide, pages 425–466.
[Vanni et al., 1997] Vanni, S., Revonsuo, A., and Hari, R. (1997). Modulation of the
parieto-occipital alpha rhythm during object detection. The Journal of Neuroscience,
17(18):7141–7147.
[VanVeen et al., 1997a] VanVeen, B., van Drongelen, W., Yuchtman, M., and Suzuki,
A. (1997a). Localization of brain electrical activity via linearly constrained minimum
variance spatial filtering. IEEE Transactions on Biomedical Engineering, 44:867–
880.
[VanVeen et al., 1997b] VanVeen, B., van Drongelen, W., Yuchtman, M., and Suzuki,
A. (1997b). Localization of brain electrical activity via linearly constrained minimum
variance spatial filtering. IEEE Transactions on Biomedical Engineering, 44(9):867–
880.
[Westfall and Young, 1992] Westfall, P. H. and Young, S. S., editors (1992). Resam-
pling Based Multiple Testing: Examples and Methods for p-Value Adjustment. Wiley-
Interscience.
[Woermann et al., 2003] Woermann, F., Jokeit, H., Luerding, R., Freitag, H., Schulz,
R., Guertler, S., Okujava, M., Wolf, P., Tuxhorn, I., and Ebner, A. (2003). Language
lateralization by wada test and fmri in 100 patients with epilepsy. Neurology, 61:699–
701.
129
[Woldorff et al., 1997] Woldorff, M. G., Fox, P. T., Matzke, M., Lancaster, J. L.,
Veeraswamy, S., Zamarripa, F., Seabolt, M., Glass, T., Gao, J. H., Martin, C. C., and
Jerabek, P. (1997). Retinotopic organization of early visual spatial attention effects
as revealed by PET and ERPs. Human Brain Mapping, 5:280–286.
[Womelsdorf et al., 2006] Womelsdorf, T., Fries, P., Mitra, P. P., and Desimone, R.
(2006). Gamma-band synchronization in visual cortex predicts speed of change
detection. Nature, 439:733–736.
[Worden et al., 1946] Worden, M. S., Foxe, J. J., Wang, N., and Simpson, G. V . (1946).
Anticipatory biasing of visuospatial attention indexed by retinotopically specific
alpha-band EEG increases over occipital cortex. Neuroscience, 20(6):RC63.
[Worsley et al., 1995] Worsley, K., Marrett, S., Neelin, P., Vandal, A., Friston, K., and
Evans, A. (1995). A unified statistical approach for determining significant signals
in images of cerebral activation. Human Brain Mapping, 4:58–73.
[Worsley, 2000] Worsley, K. J. (2000). Exceedence probabilities of local maxima of
non-isotropic random fields, with an application to shape analysis via surface dis-
placements. To be published.
[Worsley et al., 1999] Worsley, K. J., Andermann, M., Koulis, T., MacDonald, D., and
Evans, A. C. (1999). Detecting changes in non-isotropic images. Human Brain
Mapping, 8:98–101.
[Worsley et al., 1992] Worsley, K. J., Evans, A. C., Marrett, S., and Neelin, P. (1992).
A three-dimensional statistical analysis for cbf activation studies in human brain.
Journal of Cerebral Blood Flow and Metabolism, 12:900–918.
[Worsley and Friston, 1995] Worsley, K. J. and Friston, K. J. (1995). Analysis of fMRI
time-series revisited - again. NeuroImage, 2:173–181.
[Worsley et al., 1996] Worsley, K. J., Marrett, S., Neelin, P., Vandal, A. C., Friston,
K. J., and Evans, A. C. (1996). A unified statistical approach for determining signifi-
cant signals in images of cerebral activation. Human Brain Mapping, 4:58–73.
[Yamagishi et al., 2003] Yamagishi, N., Callan, D. E., Goda, N., Anderson, S. J.,
Yoshida, Y ., and Kawato, M. (2003). Attentional modulation of oscillatory activ-
ity in human visual cortex. NeuroImage, 20:98–113.
[Yildirim et al., 2005] Yildirim, E., Pantazis, D., and Leahy, R. (2005). Task-based
comparison of inverse methods in MEG. submitted to IEEE Transactions on Bio-
medical Engineering.
130
[Zimmerman et al., 1970] Zimmerman, J. E., Thiene, P., and Harding, J. T. (1970).
Design and operation of stable rf-based superconducting point-contact quantum
devices and a note on the properties of perfectly clean metal contacts. J. Appl. Phys.,
41:1572–1580.
[Zouridakis et al., 1998] Zouridakis, G., Simos, P., Breier, J., and Papanicolaou, A.
(1998). Functional hemisperic asymmetry assessment in a visual language task using
meg. Brain Topogr, 11:57–65.
131
Abstract (if available)
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Signal processing methods for interaction analysis of functional brain imaging data
PDF
Multivariate statistical analysis of magnetoencephalography data
PDF
Geometric methods of image registration and analysis
PDF
Prediction modeling and statistical analysis of amino acid substitutions
PDF
Causality and consistency in electrophysiological signals
PDF
Optimal designs for high throughput stream processing using universal RAM-based permutation network
PDF
Fast iterative image reconstruction for 3D PET and its extension to time-of-flight PET
PDF
Statistical inference for dynamical, interacting multi-object systems with emphasis on human small group interactions
PDF
Diffusion MRI white matter tractography: estimation of multiple fibers per voxel using independent component analysis
PDF
Location of lesions on postmortem brain by co-registering corresponding MRI and postmortem slices
PDF
Data-driven methods in description-based approaches to audio information processing
PDF
Theory and simulation of diffusion magnetic resonance imaging on brain's white matter
PDF
Automatic quantification and prediction of human subjective judgments in behavioral signal processing
PDF
Dynamic functional magnetic resonance imaging to localize activated neurons
PDF
Applications of all optical signal processing for advanced optical modulation formats
PDF
Random access to compressed volumetric data
PDF
Shift-invariant autoregressive reconstruction for MRI
PDF
Nonlinear optical signal processing for high-speed, spectrally efficient fiber optic systems and networks
PDF
Measuring functional connectivity of the brain
PDF
Statistical modeling of sequence and gene expression data to infer gene regulatory networks
Asset Metadata
Creator
Pantazis, Dimitrios
(author)
Core Title
Statistical signal processing of magnetoencephalography data
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
11/16/2006
Defense Date
10/13/2006
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
magnetoencephalography,OAI-PMH Harvest,permutations
Language
English
Advisor
Leahy, Richard M. (
committee chair
), Nayak, Krishna S. (
committee member
), Singh, Manbir (
committee member
)
Creator Email
pantazis@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m160
Unique identifier
UC185614
Identifier
etd-Pantazis-20061116 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-28405 (legacy record id),usctheses-m160 (legacy record id)
Legacy Identifier
etd-Pantazis-20061116.pdf
Dmrecord
28405
Document Type
Dissertation
Rights
Pantazis, Dimitrios
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu
Tags
magnetoencephalography
permutations