Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Multiscale spike-field network causality identification
(USC Thesis Other)
Multiscale spike-field network causality identification
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
Multiscale Spike-Field Network Causality Identification
By
Chuanmeizhi Wang
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
May 2022
Copyright 2022 Chuanmeizhi Wang
ii
Acknowledgements
First and foremost, I would like to thank my Ph.D. advisor and committee chair, Professor
Maryam M. Shanechi, for her guidance and continuous support throughout my PhD work. Prof.
Shanechi is very knowledgeable, patient and encouraging.
I wish to thank my dissertation committee and qualifying exam committee members, Prof.
Paul Bogdan, Prof. Dong Song, Prof. Antonio Ortega and Prof. Krishna Nayak for their support
and valuable advice. I also greatly thank for the helps from Ms. Diane Demetras and Ms. Annie
Yu.
I thank all my colleagues in NSEIP lab. I learned a lot from them. I thank Yuxiao Yang and
Omid Ghasem-Sani for their valuable discussions in weekly meetings. I thank Hamidreza Ab-
baspour and Yanyu Li for their special helps during the pandemic of COVID-19.
Last but not least, I express my utmost gratitude to my parents for their love and encourage-
ment. It is their support that made the thesis possible.
iii
Table of Contents
Acknowledgements ....................................................................................................................................... ii
List of Tables ................................................................................................................................................ v
List of Figures .............................................................................................................................................. vi
Abstract ....................................................................................................................................................... vii
Preface ......................................................................................................................................................... ix
Introduction ................................................................................................................................................... 1
Chapter 1: Estimating Multiscale Direct Causality Graphs in Neural Spike-Field Networks ...................... 4
1.1 Introduction ....................................................................................................................................... 4
1.2 Methods ............................................................................................................................................. 7
1.2.1 Model description ................................................................................................................. 7
1.2.2 Multiscale causality measure .............................................................................................. 11
1.2.3 Model estimation................................................................................................................. 12
1.2.4 Multiscale causality statistical tests .................................................................................... 14
1.2.5 Estimating multiscale causality graphs in spike-field network activity .............................. 18
1.3 Results ............................................................................................................................................. 19
1.3.1 Method validation ............................................................................................................... 19
1.3.2 Performance measures ........................................................................................................ 20
1.3.3 Multiscale causality estimation algorithm accurately reconstructs the 3-node network
graphs………………………………………………………………………………………………...21
1.3.4 Multiscale causality estimation algorithm accurately reconstructs the 10-node network
graphs………………………………………………………………………………………………...22
1.3.5 Robustness of the multiscale causality estimation algorithm.............................................. 23
1.4 Discussions ..................................................................................................................................... 27
1.5 Summary ......................................................................................................................................... 30
Chapter 2: Data-efficient multiscale causal interaction modeling between spiking and field potential
signals during behavior ............................................................................................................................... 31
2.1 Introduction ..................................................................................................................................... 31
2.2 Causality graph identification ......................................................................................................... 32
2.2.1 Spike train model and causality to spike trains. .................................................................. 33
2.2.2 Field potential model and causality to field potentials ........................................................ 35
2.2.3 Model selection and causality assessment .......................................................................... 38
2.3 Results ............................................................................................................................................. 40
2.3.1 Simulation showing data-efficiency of the new causality method ...................................... 40
2.3.2 Biophysical simulation ............................................................................................................... 42
2.4 Summary ......................................................................................................................................... 47
Chapter 3: Modeling causal interactions between spiking and field potential signals during behavior of
non-human primates .................................................................................................................................... 48
3.1 Introduction ..................................................................................................................................... 48
3.2 Methods........................................................................................................................................... 48
3.2.1 Motor task and neural data recording.................................................................................. 48
iv
3.2.2 Prediction power ................................................................................................................. 49
3.3 Results ............................................................................................................................................. 51
3.3.1 Multiscale causal connections improve prediction of spike trains and field potentials in
NHP datasets ....................................................................................................................................... 51
3.3.2 Latent firing rates predict field potentials better than binary spike trains ........................... 56
3.3.3 Properties of causal connections ......................................................................................... 58
3.4 Discussions ..................................................................................................................................... 61
3.5 Summary ......................................................................................................................................... 63
Chapter 4: Conclusion ................................................................................................................................. 64
Bibliography ............................................................................................................................................... 66
v
List of Tables
Table 1-1 Variables in 1.2.1 ~ 1.2.3………………………………………………………………………..8
Table 2-1 Parameters of neurons, synapses and input currents in the biophysical simulation. AHP here
refers to after-hyperpolarization………………………………………………………….…………..…...43
Table 3-1 Mean and sem of averaged PP increase ratio across sessions from baseline to multiscale,
and from single-scale or cross-scale to multiscale models of all signals with multiscale connections.
P-values are computed by one-sided paired t-tests……………………………………….………..……..56
vi
List of Figures
Figure 1-1 Multiscale causality estimation..………………...……………………………………………...9
Figure 1-2 Cross-validation procedure to assess the significance of causality to field nodes ……….…...16
Figure 1-3 Multiscale causality estimation algorithm ……………………………………….……………17
Figure 1-4 Multiscale causality estimation algorithm accurately estimates the proxy and cascading
neworks……………………………………………………………………………………………………21
Figure 1-5 Multiscale causality estimation algorithm accurately estimates the causality graph of two
random 10-node networks ……...…………………………………………………………………………24
Figure 1-6 Multiscale causality estimation algorithm is robust to the sparsity level and exact network
topology…………………………………………………………………………………………...………25
Figure 1-7 Multiscale causality estimation algorithm is robust to correlated white Gaussian noise
between field signals and unobserved inputs……………………………………………………………...26
Figure 2-1 Comparison of new and original test procedures on causality to field potentials shows that
the new method is more data efficient…………………………………………………………………….41
Figure 2-2 Multiscale causality identification method can find the directed connections within
spike-field networks in biophysical simulations…………………………………………………………..44
Figure 2-3 We show 20 example spike trains and 10 example LFPs observed from a network shown
in Figure 2-2a during a 10-second period…………………………………………………………………45
Figure 3-1 Multiscale causalities are successfully detected, thus improving the prediction of spike
trains and field potentials………………………………………………………………………………….53
Figure 3-2 Our method using latent firing rates predicts field potentials better than using binary spike
trains……………………………………………………………………………………………………….57
Figure 3-3 Causal connection density from a signal is negatively correlated to physical distance from
that signal………………………………………………………………………………………………….58
Figure 3-4 Average individual-connection strength of four types of causal connections measured by
directed information (DI) with errorbar (mean ±sem) across sessions……………………………………59
Figure 3-5 Within-region and cross-region density of four kinds of connections in each session………..60
vii
Abstract
Behavior is encoded across multiple spatiotemporal scales of brain activity, measured with
spike trains from neurons and larger-scale field potential signals that reflect network activity.
Thus, it is important to identify and model causal interactions not only at a single scale of activ-
ity, but also across multiple scales, i.e., between spike trains and field potential signals. Standard
causality measures are not directly applicable here because spike trains are binary-valued but
field potentials are continuous-valued and have slower timescales. In this thesis, we develop
novel computational methods to identify multiscale neural causality during behavior, evaluate
these methods on neural datasets, and study whether the prediction of neural signals can be im-
proved beyond what is possible with single-scale causality.
Our methods compute a multiscale model-based causality measure based on directed infor-
mation. To compute multiscale causality, we learn point-process generalized linear models that
predict the binary spike events at a given time based on the history of both spike trains and field
potential signals. We also learn linear Gaussian models that predict the field potential signals at a
given time based on their own history as well as either the history of observable binary spike
events or that of latent continuous firing rates. We first validate the method with extensive
Monte Carlo simulations and then evaluate its success both in realistic biophysical spike-field
simulations and in non-human primate (NHP) motor cortical datasets during movement. We
show that our method reveals the true multiscale causality network structure not only in Monte
Carlo simulations but also in biophysical simulations despite the presence of model mismatch.
Further, we find that the multiscale models in the NHP dataset predict both spike trains and field
potential signals better compared to just modeling single-scale or cross-scale causalities, thus
suggesting the successful discovery of multiscale causal relationships. Finally, we show that the
viii
identified multiscale interaction are negatively correlated with physical distance between signals
and are stronger within a given brain region compared to across brain regions. These novel mul-
tiscale causality identification and modeling methods can reveal causal interactions across differ-
ent spatiotemporal scales of brain activity to inform basic science investigations and neurotech-
nologies.
.
ix
Preface
Parts of this thesis have been published or accepted for publication.
Chapter 1 is a reproduce of the following published article with slight revisions: Chuanmeizhi
Wang and Maryam M. Shanechi, “Estimating Multiscale Direct Causality Graphs in Neural
Spike-Field Networks,” IEEE Transactions on Neural Systems and Rehabilitation Engineering,
vol. 27, no. 5, pp. 857–866, 2019. doi: 10.1109/TNSRE.2019.2908156. Chapter 2 and 3 include
contents from the following article: Chuanmeizhi Wang, Bijan Pesaran, Maryam M. Shanechi,
“Modeling multiscale causal interactions between spiking and field potential signals during be-
havior." Journal of Neural Engineering, 2022. doi: 10.1088/1741-2552/ac4e1c. Abstract, Intro-
duction and Conclusion also contain contents combined and revised from two published articles.
The candidate was the primary author on these manuscripts, worked on the studies,
analyzed the data and wrote the first draft of the manuscripts.
1
Introduction
Identifying the functional causality structure in brain networks is important for our under-
standing of neural mechanisms underlying brain functions and for designing neurotechnologies
to decode and modulate brain states [1]–[11]. Further, many behaviors involve multiple spatio-
temporal scales of neural activity, spanning not only neural spiking activity but also larger-scale
neural population activity measured through local field potentials (LFP) [12]–[28]. Thus, identi-
fying the causality structure can be important not only at a single scale of activity but also across
different activity scales [3], [13], [29], [30]. As many modern neurophysiological datasets pro-
vide measurements of the brain at multiple scales through simultaneous recordings of spike
trains and field potentials [31], [32], statistical tools that can assess multiscale causality are im-
portant to develop and validate. Further, it is important to study whether modeling the causality
structure across different scales can lead to more accurate models of neural activity beyond what
is possible through identifying a single-scale causality structure. Finally, neurophysiology exper-
iments can record from only a subset of neurons in the brain and as such, functional causality is
generally evaluated by Granger-like causality methods in which causality is statistically meas-
ured within the recorded data only. Regardless of the method used, a fundamental limitation of
such Granger-like causality evaluation is that identified causal connections are not conditioned
on the unrecorded activity. To improve accuracy of Granger-like causality assessment, it is bene-
ficial to consider as much data as possible within a given experimental recording. As spiking and
field potential activity provide measurements of the brain at multiple scales [31], [32], consider-
ing both of these multiscale signals simultaneously could lead to more accurate causality anal-
yses and help better understand brain mechanisms [12], [33]–[36].
2
However, to evaluate causal interactions between spike and field signals, new computational
algorithms are required that can assess multiscale causality. Such an assessment presents a chal-
lenging computational problem because of the fundamental differences in spike and field signals.
First, spikes are discrete binary-valued while fields are continuous-valued. Second, the time-
scales of dynamics for spikes and fields could be different, e.g., milliseconds for spikes vs. gen-
erally several to tens of milliseconds for fields [29]. Classical causality measures are not de-
signed to assess interactions between these mixed discrete and continuous signals.
Here we develop a novel multiscale causality estimation algorithm for spike-field networks
(Chapter 1). We construct a likelihood function comprised of point process models for spikes
and linear Gaussian models for fields. For spikes, firing rates are modeled as a function of the
history of both field signals and binary spike events within the network. For fields, to make their
linear models consistent with biophysical findings, we use the history of field signals and the his-
tory of the latent log-firing rates of neurons as predictors. To resolve the challenge of estimating
the network model parameters in the presence of latent firing rates, we develop a sequential max-
imum-likelihood parameter estimation procedure that extends to large networks. Once models
are estimated, we compute directed information as our measure of multiscale causality and de-
vise two statistical tests to assess its significance. Using extensive simulations, we show that the
algorithm can accurately reconstruct the true causality graphs of random spike-field networks.
Moreover, the algorithm is robust to the number of connections, connection strengths, or exact
topology of the network. Then we also design a data efficient estimation method and evaluate its
success both in realistic biophysical spike-field simulations (Chapter 2) and in motor cortical da-
tasets from two non-human primates (NHP) performing a motor behavior (Chapter 3). We find
that our method reveals the true multiscale causality network structure in biophysical simulations
3
despite the presence of model mismatch. Further, models with the identified multiscale causali-
ties in the NHP neural datasets lead to better prediction of both spike trains and field potential
signals compared to just modeling single-scale causalities. Finally, we find that latent firing rates
are better predictors of field potential signals compared with the binary spike events in the NHP
datasets. This multiscale causality method can reveal the directed functional interactions across
spatiotemporal scales of brain activity to inform basic science investigations and neurotechnolo-
gies.
4
Chapter 1: Estimating Multiscale Direct Causality Graphs in Neural
Spike-Field Networks
1.1 Introduction
As technology advances, it has become increasingly common to concurrently record spikes
from individual neurons and field signals, which represent the activity of larger neural popula-
tions. These field signals include local field potentials (LFPs) and electrocorticogram (ECoG).
Analyses of causal functional connectivity patterns at multiple spatiotemporal scales using these
spike-field recordings could help advance our understanding of neural representations underlying
function and dysfunction. Furthermore, uncovering such multiscale causal patterns may lead to
more accurate neural encoding models and decoders to improve future neurotechnologies such as
motor brain-machine interfaces (BMIs) or closed-loop brain stimulation systems for neurological
and neuropsychiatric disorders.
Identifying multiscale causality introduces additional challenges because of the signal differ-
ences in the spike train and field potential modalities. Spike trains consist of fast action potential
events that take a binary 0-1 value while field potentials are continuous-valued signals [9], [37]–
[40]. Some prior multiscale studies focus on undirectional connectivity using correlation-based
measures such as spike-field coherence [18], [34], [41]–[43] or on developing encoding models
that fit correlation terms between signals [9], [44], [45]. But these studies do not focus on as-
sessing causality and directional interactions between spike trains and field potentials. There are
also studies on spike-field coherence that measures consistent neural spiking at a specific phase
of LFP [46] and on model-free measures such as spike-triggered average of the LFP (stLFP) that
assess interactions from spike trains to field potentials, but these measures are not always causal
[29]. To assess causality, traditionally, Granger causality [47] is used for continuous signals like
5
LFP and electroencephalography (EEG)/magnetoencephalography (MEG) [48]–[50]. Also, alt-
hough Granger causality is not directly applicable for discrete signals that are not Gaussian pro-
cesses [37], it has been used for filtered continuous signals generated from spike trains [50]–[52].
But assessing multiscale causality between spike trains and field potentials requires methods that
can work on a mixture of both discrete and continuous signal modalities.
For binary spike trains alone, other methods for computing causality have also been developed
beyond Granger causality [50], [53]–[62]. Some methods use a generalized linear model (GLM)
framework to model the spike trains as point processes [53], [57], [58], [60], [63]. Doing so, cau-
sality for spike trains can be computed using an information-theoretic measure termed directed
information [53], [57]–[59], [64], which is a generalization of Granger causality. But these
methods are applicable within only spike trains rather than across spike trains and field poten-
tials.
In general, methods for computing causality for spike trains or field potentials are either
model-based or model-free. Model-based methods can explicitly incorporate the influence of
various factors such as behavior on neural activity but depend on the accuracy of the constructed
models. Despite this challenge, these methods have been shown to be useful tools for analyzing
causality statistically and can reveal neurophyisological insights as shown in prior work [50],
[53]. However, to ensure the accuracy of the estimated causality, these methods need to be care-
fully validated by assessing the accuracy of the underlying models in predicting the neural sig-
nals.
To measure causality in spike-field networks, one prior approach applied Granger causality by
operating on the power spectrum of the spike trains computed using multitaper techniques in
sliding bins [65]. This method is non-parametric and model-free. Thus, new methods are needed
6
to produce an encoding model of spike-field network interactions as encoding models are often
desired. Further, computing the spectrum required stationarity and did not consider the behavior.
As a dynamic behavior such as movement makes the spectrum change over time, new methods
are needed when a dynamic behavior is being performed to take it into account. Another prior
approach computes a joint probability density across all signals [66]. But because computing this
density is complex, this approach has focused on smaller networks, for example consisting of 3
signals or nodes [66]. As modern datasets contain recordings from tens or hundreds of elec-
trodes, there is a need to develop and validate methods that can assess multiscale causality across
larger networks. Further, as the predictor for field potentials, this prior approach directly used the
binary spike trains [66]. However, compared with the measured binary spike trains, the firing
rates of neurons may be a better predictor of field potentials [29], [67]; but these firing rates are
not measurable and are instead latent. Thus, firing rates are challenging to incorporate in a model
of field potentials.
Here, we develop a novel multiscale causality estimation algorithm to identify the causality
graphs of neural spike and field networks. We write a joint statistical likelihood model using a
combination of point-process GLM models for spikes and linear Gaussian models for fields. Im-
portantly, given prior biophysical evidence [29], [67], instead of observable binary spike events,
we take the history of latent log-firing rates as predictors for fields. To resolve the challenge of
estimating the large number of network model parameters in the presence of these latent firing
rates, we develop a sequential estimation method that estimates these latent variables as an inter-
mediate step. This estimation method characterizes the effect of spike and field history as well as
behavioral states. From the estimated multiscale models, we compute directed information as our
measure of causality. Finally, we devise two statistical tests for different connection types---log-
7
likelihood ratio test or cross-validation based test---to detect multiscale causal interactions. We
show that the multiscale causality estimation algorithm identifies the true causality graphs of
neural spike and field networks in both basic network topologies and larger random network to-
pologies in extensive Monte Carlo simulations. Finally, we find that the algorithm is robust to
network connection strength, number of connections, or exact topology. This algorithm provides
a new tool to help assess causal neural interactions across spatiotemporal scales both during be-
havior and during spontaneous activity.
1.2 Methods
In this section, we describe the parametric multiscale models used to simultaneously charac-
terize spike-field signals and their interactions. We then devise a multiscale causality estimation
algorithm that estimates model parameters and computes the directed information between spike-
field signals. Finally, we provide statistical tests of multiscale causality. These methods together
reconstruct the multiscale causality graph from spike-field data. In Table 1-1, we list all variables
used in 1.2.1, 1.2.2 and 1.2.3 for reference.
1.2.1 Model description
As shown in Fig. 1-1, we consider the scenario where multiscale brain signals including 𝐶 bi-
nary spike trains and 𝐷 field signals are simultaneously observed. We assume spike trains and
fields are stationary ergodic Markov processes that are strictly causal [64], i.e., that each process
(whether spike or field) is conditionally independent of other processes at the current time condi-
tioned on the full history of all processes. This assumption is used in classical Granger causality
and directed information measures and is reasonable for data with high enough sampling rate
[47], [64]. Our formulation also allows for the concurrent time-series of behavioral states (e.g.,
movement kinematics) to be observed, thus allowing us to assess causality both during behavior
8
and during spontaneous activity. Having observed the spikes and fields, we build two parametric
models, a point-process GLM for spikes and a linear Gaussian model for fields.
Table 1-1 Variables in 1.2.1 ~ 1.2.3
Variable Description
𝐶 number of observed binary spike trains
𝐷 number of observed field signals
𝑁 𝑡 𝑞 binary spike event of neuron 𝑞 at time 𝑡
𝑦 𝑡 𝑞 value of 𝑞 -th field signal at time 𝑡
𝒩 set of all spike series
𝒴 set of all field time-series
𝜆 𝑡 𝑞 instantaneous firing rate of 𝑁 𝑡 𝑞
𝑢 𝑡 vector of behavioral states at time 𝑡
Δ time-bin for spikes
𝜇 𝑡 𝑞 mean of 𝑞 -th field signal at time 𝑡
𝜎 𝑞 2
variance of 𝑞 -th field signal
𝑚 the integer ratio of the field and spike sample times
𝒩 𝑠𝑝
𝑞 the subset of spike trains that 𝑁 𝑞 depends on
𝒴 𝑠𝑝
𝑞 the subset of field trains that 𝑁 𝑞 depends on
𝒩 𝑓𝑙
𝑞 the subset of spike trains that 𝑦 𝑞 depends on
𝒴 𝑓𝑙
𝑞 the subset of field trains that 𝑦 𝑞 depends on
𝐾 𝑠𝑝
𝑞 the length of history that 𝑁 𝑞 depends on
𝐾 𝑓𝑙
𝑞 the length of history that 𝑦 𝑞 depends on
𝛽 linear coefficients in point process GLM
𝛼 linear coefficients in linear Gaussian model
𝐼 𝑛 directed information of 𝑛 samples
𝐼 directed information rate
𝘃 𝑡𝑝
𝑗
vector of all 𝛽 ’s or 𝛼 ’s and 𝜎 depending on signal type tp
𝛿
size of 𝘃 𝑡𝑝
𝑗
𝑧 , 𝑤 denote either 𝑁 or 𝑦
We denote the binary spike event of neuron 𝑞 ∈ { 1 , 2 , . . . , 𝐶 } at time 𝑡 by 𝑁 𝑡 𝑞 , the value of the
𝑞 -th field signal 𝑞 ∈ { 1 , 2 , . . . , 𝐷 } at time 𝑡 by 𝑦 𝑡 𝑞 , and the behavioral states at time 𝑡 by 𝒖 𝑡 . We
denote the set of all spike time-series by 𝒩 = { 𝑁 1
, 𝑁 2
, . . . , 𝑁 𝐶 }, where 𝑁 𝑞 denotes the time-se-
ries of the 𝑞 -th spike train, i.e., 𝑵 1 : 𝑡 𝑞 = [ 𝑁 1
𝑞 , 𝑁 2
𝑞 , . . . , 𝑁 𝑡 𝑞 ] ′ and ∙ ′ is the transpose operator. Simi-
larly, we denote the set of all field time-series by 𝒴 = { 𝑦 1
, 𝑦 2
, . . . , 𝑦 𝐷 }, where 𝑦 𝑞 denotes the
9
time-series of the 𝑞 -th field signal, i.e., 𝒚 1 : 𝑡 𝑞 = [ 𝑦 1
𝑞 , 𝑦 2
𝑞 , . . . , 𝑦 𝑡 𝑞 ] ′. Finally, we denote the history of
a general time-series 𝑧 𝑞 (whether spike or field) from time 𝑎 to 𝑏 by 𝒛 𝑏 , 𝑎 𝑞 = [ 𝑧 𝑎 𝑞 , . . . , 𝑧 𝑏 𝑞 ] ′ .
Figure 1-1 Multiscale causality estimation. We observe multiscale spike-field network activity using
which we aim to detect all direct causal connections between network nodes, whether spike or field
nodes. Thus, there are 4 connection types we assess, spike →spike, field →spike, field →field, and
spike →field. Spikes ( 𝑁 ) whose sampling rate is generally higher as shown} are modeled as GLM point
processes whose log-firing rate (ln 𝜆 ) is a function of the history of the spike binary events and the history
of field signals ( 𝑦 ). Fields are characterized using a linear Gaussian model that depends on the history of
spike log-firing rates and fields. Both spike and field signals can also be modulated by behavioral states
( 𝑢 ), e.g., movement kinematics. In both models, related signals are shaded, and the current time is indi-
cated by 𝑡 . Our algorithm can assess multiscale direct causality in spike-field networks both during be-
havior and during spontaneous activity.
We model a binary spike train as a point-process GLM [37], [63], [68]–[70] that depends on
the history of all observed spikes and fields and the current behavioral states, which together
constitute the covariates (Fig. 1-1) [1], [35], [37], [63], [68]–[72]. For each neuron 𝑞 and at time
𝑡 , this model is fully specified by the instantaneous firing rate function 𝜆 𝑡 𝑞 . The point process
likelihood function can be written as
(1)
10
where Δ is the time-bin for spikes, which is selected small enough to contain at most one spike.
Let's denote the subset of spike trains and the subset of field time-series that the 𝑞 -th spike train
𝑁 𝑞 ∈ 𝒩 depends on by 𝒩 𝑠𝑝
𝑞 ⊆ 𝒩 and 𝒴 𝑠𝑝
𝑞 ⊆ 𝒴 , respectively. Within the GLM framework, we
write the log of the firing rate of this spike train 𝑁 𝑞 ∈ 𝒩 as
(2)
where 𝛽 's with subscripts and superscripts are parameters to be estimated, a non-negative integer
𝐾 𝑠𝑝
𝑞 is the length of the history (to be found), and 𝑚 is the ratio of the field and spike sample
times and is assumed to be an integer (since the time-scale of spikes is faster than fields, the sam-
ple time for spikes is taken to be no larger than that of fields). 𝛽 's are vectors except 𝛽 0
𝑞 .
We model a field signal using a linear Gaussian model. We assume a field time-series 𝑦 𝑞 ∈ 𝒴
is Gaussian distributed with time-variant mean 𝜇 𝑡 𝑞 and fixed variance 𝜎 𝑞 2
. The Gaussian likeli-
hood function can be written as
(3)
The mean of the field time-series depends on the history of observed fields, the history of spike
firing rates (which need to be estimated) and the current behavioral states, which together consti-
tute the covariates (Fig. 1-1). Denoting the subset of spike trains and the subset of field time-se-
ries that the 𝑞 -th field time-series depends on by 𝒩 𝑓𝑙
𝑞 ⊆ 𝒩 and 𝒴 𝑓𝑙
𝑞 ⊆ 𝒴 , respectively, we can
write the mean as
(4)
11
where 𝛼 's with subscripts and superscripts are parameters to be estimated and a non-negative in-
teger 𝐾 𝑓𝑙
𝑞 is the length of the history (to be found) in units of the field sample time.
To assess causality, we first need to estimate the model parameters. We then need to devise
and compute a multiscale causality measure and the corresponding statistical tests to assess cau-
sality between spikes and fields. The subset of signals that may affect a given spike or field sig-
nal in the models can be narrowed down by computing unconditional interactions as described in
Section 1.2.5.
1.2.2 Multiscale causality measure
We measure multiscale causality using an information theoretic measure termed directed in-
formation or directed information rate [64]. A non-zero directed information implies causality.
Let's denote the conditional likelihood function of the first 𝑛 samples of 𝑦 𝑖 and 𝑁 𝑗 by
𝑓 𝑛 ( 𝑦 𝑖 | | 𝒩 𝑓𝑙
𝑖 , 𝒴 𝑓𝑙
𝑖 ) and 𝑓 𝑛 ( 𝑁 𝑗 | | 𝒩 𝑠𝑝
𝑗 , 𝒴 𝑠𝑝
𝑗 ), respectively. The conditional likelihood for the 𝑖 -th field
time-series 𝑓 𝑛 ( 𝑦 𝑖 | | 𝒩 𝑓𝑙
𝑖 , 𝒴 𝑓𝑙
𝑖 ) is the product of 𝑛 Gaussian likelihoods with means given by (4)
and with variance 𝜎 𝑖 2
. The conditional likelihood for the 𝑗 -th spike time-series 𝑓 𝑛 ( 𝑁 𝑗 | | 𝒩 𝑠𝑝
𝑗 , 𝒴 𝑠𝑝
𝑗 )
can be obtained by inserting (2) into (1). We aim to compute the causally conditioned directed
information [64] between multiscale signals, i.e., from a signal 𝑧 𝑖 to a signal 𝑤 𝑗 where each of
these two signals can be either spike or field, i.e., 4 total combinations: spike-spike, spike-field,
field-spike, and field-field. Let's denote the subset of spike trains and the subset of fields that the
time-series 𝑤 𝑗 depends on by 𝒩 𝑡 𝑝 𝑗 and 𝒴 𝑡𝑝
𝑗 , respectively (where 𝑡𝑝 stands for type and could be
either 𝑠𝑝 or 𝑓𝑙 depending on whether 𝑤 𝑗 is spike or field, respectively, as in (2) and (4). Caus-
ally conditioned directed information from 𝑧 𝑖 to 𝑤 𝑗 can be written with a unified equation for all
4 combinations
12
(5)
Here 𝐴 ∪ 𝐵 indicates the union of sets 𝐴 and 𝐵 , and 𝐴 \𝐵 indicates the relative complement of
𝐵 in 𝐴 , i.e., the members in 𝐴 that do not belong to 𝐵 . Thus (5) is a non-negative value and is a
measure of the information in 𝑤 𝑗 that can be extracted directly from the history of 𝑧 𝑖 conditioned
on the history of the whole network. If (5) is evaluated to be 0, then 𝑧 𝑖 does not directly cause 𝑤 𝑗
in the network. Note that the reason we condition on all signals (except 𝑧 𝑖 and 𝑤 𝑗 ) is because we
are interested in ``direct" causal interactions between 𝑧 𝑖 and 𝑤 𝑗 [64]. The computed directed in-
formation is thus referred to as conditional directed information. Not having the conditioning on
the rest of the network signals will compute the unconditional directed information and will re-
sult in also detecting indirect causal interactions between 𝑧 𝑖 and 𝑤 𝑗 , which are due to their indi-
rect connections through other nodes.
From (5), we also define the directed information rate [58], which is a scaled version that does
not depend on the number of samples 𝑛 and is given by
(6)
1.2.3 Model estimation
To compute the directed information for multiscale data, we first need to estimate the model
parameters in (2) and (4). To do so, we need to write the joint multiscale likelihood function for
all spike and field observations and then find the maximum-likelihood estimate (MLE) of param-
eters that optimize this function. However, joint optimization is quite difficult given the large
number of parameters and the complex nonlinear form of the joint multiscale likelihood function.
In particular, the firing rates of all spike trains are used in every field model in (4). However,
13
these firing rates are latent variables. Thus we would need to insert (2) into (4), resulting in a
prohibitively large number of parameters that need to be optimized jointly within a complex non-
linear function.
To resolve this challenging problem, we devise an approximate sequential MLE approach in-
stead. The key idea is that if we first estimate the firing rates by fitting the spike models, then the
problem becomes much simpler because the parameters of the spike and field models can be esti-
mated separately rather than jointly. Thus, we estimate the firing rates as an intermediate step.
We do so by first estimating the spike model parameters based on spike-field observations; this
also enables obtaining an estimate of the latent firing rates. After that, we separately estimate the
field model parameters by inserting the estimate of the firing rates into (4) as expanded on be-
low.
Denote all unknown parameters in the likelihood function 𝑓 𝑛 ( 𝑤 𝑗 | | 𝒩 𝑡𝑝
𝑗 , 𝒴 𝑡𝑝
𝑗 ) by 𝘃 𝑡𝑝
𝑗 . 𝘃 𝑠𝑝
𝑗 in-
cludes all 𝛽 ’s and 𝘃 𝑓𝑙
𝑗 includes all 𝛼 ’s and 𝜎 . For a given history length 𝐾 𝑡𝑝
𝑗 , the MLE of
𝘃 𝑡𝑝
𝑗 ( 𝐾 𝑡𝑝
𝑗 ) is given by
(7)
When 𝐾 𝑡𝑝
𝑗 is unknown, we use the Akaike information criterion (AIC) to select the model or-
der [73] as
(8)
where 𝛿 𝐾 𝑡𝑝
𝑗 is the total number of model parameters, with 𝛿 = 𝑚 | 𝒩 𝑠𝑝
𝑗 | + | 𝒴 𝑠𝑝
𝑗 | if 𝑤 𝑗 is a
spike signal and 𝛿 = | 𝒩 𝑓𝑙
𝑗 | + | 𝒴 𝑓𝑙
𝑗 | if 𝑤 𝑗 is a field signal ( | 𝐴 | indicates the number of elements
in 𝐴 . Note that when estimating the two likelihood functions in (5), we only select the model
14
order for the first term 𝑓 𝑛 ( 𝑤 𝑗 | | 𝒩 𝑤 𝑗 ∪ 𝒴 𝑤 𝑗 ) and the second term uses the same order. This is be-
cause the signals conditioned on in the second term are a subset of the signals conditioned on in
the first term and thus the second term is a special case of the first term.
Once model parameters are estimated, we can easily compute the multiscale causality measure
using (5). However, to assess causality, we need to develop statistical tests for whether the com-
puted values are significantly different from zero, as we develop next.
1.2.4 Multiscale causality statistical tests
We develop statistical tests for the significance of causality between spikes and fields. The
null hypothesis that there is no causality is equivalent to directed information being zero.
For assessing causality to all spike nodes in the network (whether from fields or other spike
nodes), we note that all covariates that affect the firing rate in (2) are observable. Hence in (5),
we have a nested model because the first term contains all parameters in the second term. A
nested model satisfies the log-likelihood ratio (LLR) test [74]. Thus, we can use the LLR test to
assess causality. By Wilks' theorem [75], we have
(9)
and
(10)
where 𝜒 2
( 𝑘 ) denotes a chi-squared distribution with degree of freedom 𝑘 . This distribution can
be used to compute the significance level.
However, the LLR test cannot be used to test the causality to field nodes because the firing
rate covariate, 𝜆 , used in (4) is latent. To resolve this problem, we use the cross-validation-based
conservative Z method [76] to evaluate the significance level. Instead of testing 𝐼 ̂
𝑛 directly, we
test if in (5), 𝑓 𝑛 ( 𝑦 𝑗 | | 𝒩 𝑓𝑙
𝑗 ∪ 𝒴 𝑓𝑙
𝑗 ) is significantly greater than 𝑓 𝑛 ( 𝑦 𝑗 | | 𝒩 𝑓𝑙
𝑗 ∪ 𝒴 𝑓𝑙
𝑗 \ { 𝑧 𝑖 } ). Thus, given
15
the Gaussian distribution for the field time-series, a zero directed information is equivalent to the
mean-squared one-step-ahead prediction error of 𝑦 𝑗 using (4) being the same with and without
𝑧 𝑖 . The one-step-ahead prediction of 𝑦 𝑗 at time 𝑡 ---predicting its current value using the history
of covariates---is equal to 𝜇 𝑡 𝑗 , which is a function of all covariates up to time 𝑡 − 1 in (4).
Denote the mean-squared one-step-ahead prediction error of 𝑦 𝑗 using (4) by 𝜎 ̂
2
. Assuming 𝑛
total data samples (i.e., recording time), we estimate the difference of cross-validated 𝜎 ̂
2
with
and without 𝑧 𝑖 , denoted by 𝛾 ̂
0
as follows. Repeatedly for a total of 𝑅 times, we select 𝑛 /2 of the
samples as train set and the other half as test set. Note that these 𝑛 /2 samples are randomly se-
lected and are not continuous in time. Each time, we estimate the model parameters in the train
set using the sequential MLE method, and then compute 𝜎 ̂
2
on the test set. Thus 𝛾 ̂
0
can be ob-
tained as the average difference between 𝜎 ̂
2
with and without 𝑧 𝑖 over the 𝑅 repeats.
We also need to obtain the variance and bias of 𝛾 ̂
0
for our statistical tests. To do so, we per-
form the following cross-validation procedure as shown in Fig. 1-2. Repeatedly and for a total of
𝑀 times, we partition the data into two halves to create 𝑀 bipartitions (Fig. 1-2A, B). Note that
the samples for each half are selected randomly and are not continuous in time. We denote the
two halves of the 𝑚 -th bipartition by 𝐴 𝑚 and 𝐵 𝑚 . Then repeatedly for a total of 𝑅 times, in each
half we randomly select test sets consisting of 𝑛 2
samples. We denote the 𝑟 -th train-test set
within 𝐴 𝑚 and 𝐵 𝑚 by 𝐴 𝑚 , 𝑟 and 𝐵 𝑚 , 𝑟 , respectively (Fig. 1-2C). We can now compute two aver-
age one-step-ahead prediction errors 𝛾 ̂
𝐴 , 𝑚 and 𝛾 ̂
𝐵 , 𝑚 by averaging across all 𝑅 test sets of 𝐴 𝑚 , 𝑟
and 𝐵 𝑚 , 𝑟 in the 𝑚 -th bipartition, respectively. A conservative estimate (i.e., over-estimate) of the
variance of 𝛾 ̂
0
is given by [76].
(9)
16
Figure 1-2 Cross-validation procedure to assess the significance of causality to field nodes. This proce-
dure is used to estimate the one-step-ahead prediction error for field signals, and to compute the variance
and bias of this estimate. Data are indicated by pies. In each partition, data chunks consist of random sam-
ples in time. (A, B) Data is partitioned into two random halves. (C) Each half is partitioned into a test and
a train set consisting of randomly selected samples. (D) The two train sets and two test sets corresponding
to the two halves are combined to generate a combined train set and a combined test set, respectively.
The bias of 𝛾 ̂
0
is caused by the estimation error in 𝜆 in (2) and is positive (error in 𝜆 causes
larger prediction error). To estimate the bias, we construct 𝑅 train-test sets 𝐶 𝑚 , 𝑟 as the combina-
tion of the train-test sets 𝐴 𝑚 , 𝑟 and 𝐵 𝑚 , 𝑟 as shown in Fig. 1-2D. Then for the 𝑚 -th bipartition, we
compute 𝛾 ̂
𝐶 , 𝑚 as before. We can now compute the bias as
(10)
To see why this gives the bias, note that the bias is due to the estimation error of 𝜆 , which is in
turn caused by the estimation error of parameters in (2); since these parameter estimates are ob-
tained using MLE, the error variances have a convergence rate of 1 / 𝑛 [77]. Thus, the bias in
𝛾 ̂
𝐶 , 𝑚 is half the bias in each of 𝐴 𝑚 , 𝑟 and 𝐵 𝑚 , 𝑟 , which are computed using ( 𝑛 − 2 𝑛 2
) /2 samples.
Also, 𝑏 ̂
converges at approximately the same rate of 1 /𝑛 .
Having computed the bias and variance in (11) and (12) above, we z-score 𝛾 ̂
0
and use
17
(11)
to perform a conservative significance test and compute a p-value.
For all tests above, given a significance level 𝘀 , Benjamini-Hochberg procedure [78] is used to
control false discovery rate at level 𝘀 for connections to spikes and fields, respectively. For con-
nections to both spikes or fields, the procedure is applied in the unconditional step first and then
in the conditional step.
Figure 1-3 Multiscale causality estimation algorithm. It consists of two sequential steps (two dashed
boxes). First, we assess causality to a spike node from field or spike by estimating a GLM point process
model. This step also provides an estimate of the latent firing rates. We then use these estimated firing
rates in the second step to assess causality to a field node from field or spike. Here, we estimate the linear
Gaussian model for fields and compute the corresponding directed information. To reduce computational
complexity, in both steps, we first identify the subset of connections that may have direct causality by
computing the unconditional directed information, which is less complex to calculate. We then compute
the conditional directed information only for these identified connections. The output causality matrix re-
constructs the network graph.
18
1.2.5 Estimating multiscale causality graphs in spike-field network activity
A flow chart of the entire multiscale causality estimation algorithm is shown in Fig. 1-3. We
use all observations including spikes, fields and behavioral states as input to estimate the causal-
ity graph. Using our sequential MLE estimation method, we execute two sequential steps. In the
first step, we test the causality from any network node (whether spike or field) to spike nodes by
inserting the spike GLM point process likelihood functions into (5). The first step also allows us
to estimate the firing rate of all spike nodes. In the second step, we use the estimate of firing
rates obtained in the first step as input to compute the mean of the field Gaussian likelihood
functions in (4); we then test the causality from any network node (whether spike or field) to
field nodes by inserting the field Gaussian likelihood functions into (5).
We are interested in ``direct" causal interactions, which can be evaluated by estimating the
conditional directed information between any node-pair. Since this estimation is computationally
complex, we divide each of the two steps of the algorithm into two procedures (Fig. 1-3): First,
we test the pairwise unconditional causality---which finds all causal interactions including indi-
rect ones---to identify the subset of node pairs that may have direct causality between them. We
then only test the direct causality, i.e., the conditional directed information, within the identified
subset of pairs. We emphasize that at the end, the output causality matrix with p-values indicates
direct (conditional) causality (Fig. 1-3). However, the separation above is still beneficial in prac-
tice especially for large networks because conditional tests are much more complex than uncon-
ditional tests.
19
1.3 Results
In this section, we perform extensive Monte Carlo simulations of random multiscale causality
graphs to show that the proposed algorithm and tests can accurately uncover the true causality
graph from network spike-field activity.
1.3.1 Method validation
We generate data using Monte-Carlo simulations from different spike-field networks, includ-
ing basic 3-node networks and more complex random 10-node networks. We simulate data using
(2) and (4). In our simulations, each coefficient 𝜷 1
𝑖𝑗
, 𝜷 2
𝑖𝑗
, 𝜶 1
𝑖𝑗
, 𝜶 2
𝑖𝑗
is generated randomly from a
uniform distribution and normalized to have a zero mean and a fixed L1 norm. The L1 norms of
𝜷 2
𝑖𝑗
, 𝜶 1
𝑖𝑗
and 𝜶 2
𝑖𝑗
are 4, 0.2 and 0.2, respectively. The L1 norms of 𝜷 1
𝑖𝑗
for 𝑖 = 𝑗 is taken as 2 and
for 𝑖 ≠ 𝑗 is taken as 10. We generate 𝒖 𝑘 at time 𝑘 by simulating a 3-dimensional white Gaussian
process with zero mean and unit variance. 𝜷 3
𝑖 is also generated from a uniform distribution and is
normalized to have zero mean and L1 norm of 1.5. 𝜶 3
𝑖 is set to 0. Finally, we set the history
length for spikes and fields as 𝐾 𝑓𝑙
= 𝐾 𝑠𝑝
= 4, the ratio between spike and field sample times as
𝑚 = 5, the time-bin for spikes as Δ = 0 . 001 seconds, the baseline parameters as 𝛽 0
𝑖 = 1 . 5, 𝛼 0
𝑖 =
0, the false positive rate threshold (i.e., significance level) of statistical tests as 𝘀 = 0 . 05, and the
sampling rates of fields and spikes as 200Hz and 1kHz, respectively. To keep a similar variance
for different field signals, we set the variance in the linear Gaussian model of the fields as 𝜎 =
0 . 2 for field signals that are not caused by spike signals and 𝜎 = 0 . 1 otherwise. For the cross-
validation significance test of causality to fields, we set 𝑅 = 1 , 𝑀 = 5 and 𝑛 2
= 𝑛 /10 [76]. This
setting provides an acceptable computing time. A larger 𝑀 provides more precise estimate of
variance and bias, and a larger 𝑅 provides more conservative tests, which can be chosen to be
larger if fewer false positives are preferred. However, the computing time is linear to 𝑅 and 𝑀 .
20
1.3.2 Performance measures
We simulate two classes of basic 3-node topologies called proxy and cascading topology [58].
In proxy topology, P1 causes P2 and P2 causes P3. Thus, no direct causality exists from P1 to
P3, but may be incorrectly detected if P2 is not conditioned on in computing the directed infor-
mation. Thus, we evaluate the accuracy by the average of the true negative rate from P1 to P3
(i.e., percentage of trials in which no direct causality is detected between P1 and P3) and the true
positive rate from P1 to P2 and P2 to P3 (i.e., percentage of trials in which direct causality is de-
tected between these nodes).
In cascading topology, C1 causes C2 and C1 also causes C3. No causality between C2 and C3
exists but may be incorrectly detected if C1 is not conditioned on in directed information compu-
tation. Thus, we find the accuracy by the average of the true negative rate from C2 to C3 and the
true positive rates from C1 to C2 and C1 to C3. For multiscale networks, the three nodes in a to-
pology can be either spikes or fields, resulting in 8 possible network types for each of proxy and
cascading setups.
For 10-node networks, we simulate 5 spike nodes and 5 field nodes in order to have a balanced
evaluation across different kinds of connections. As our performance measures, we compute the
overall accuracy ( 𝐴 𝑐𝑐 ) for different kinds of connections and also the F1 score 𝐹 1 defined as
(12)
where 𝑛 𝑡𝑝
, 𝑛 𝑡𝑛
, 𝑛 𝑓𝑝
and 𝑛 𝑓𝑛
are the number of true positives, true negatives, false positives and
false negatives, respectively. 𝐹 1 replaces the true negatives in the definition of 𝐴 𝑐𝑐 by true posi-
tives and thus puts more emphasis on correctly detecting the existence of causalities [79].
21
1.3.3 Multiscale causality estimation algorithm accurately reconstructs the 3-node network
graphs
We simulate 300 trials for each of proxy and cascading topologies. In 100 of these trials, we
use the base connection strengths, i.e., the L1 norms of parameters as in Section 1.3.1. To assess
the robustness of our method to network connection strengths, we increase or decrease the con-
nection strengths by half in 100 trials each, which means that we multiply the L1 norms of all 𝜷 's
and 𝜶 's (except the self-connections 𝜷 1
𝑖𝑖
and 𝜶 2
𝑖𝑖
) by 0.5 for 100 trials and by 1.5 for another 100
trials.
Figure 1-4 Multiscale causality estimation algorithm accurately estimates the proxy and cascading net-
works. The network graphs to the side of the subplots and the figure legends specify the types of nodes in
the graph corresponding to each curve. In the graphs, the solid arrows indicate true causal connections to
be detected and the dashed arrows show the non-existing causal connections that should be rejected. S
and F in the network graph and the legends indicate spike and field nodes, respectively. Accuracy is com-
puted as the average of the true positive rates of the solid arrows and the true negative rates of the dashed
arrows in different graphs. Data length is shown in minutes (min).
22
Fig. 1-4 shows the average accuracy over 300 trials in proxy and cascading networks with dif-
ferent node types. Accuracy increases with data length for all node types (i.e., 16 possible net-
works). Starting from a low value, the accuracy in each network asymptotically increases to
around 1. The false positive rate may not be very small for small data lengths even with FDR
control as our estimation of bias in (12) holds asymptotically.
From Fig. 1-4, we can also observe that the convergence time for the curves is different. This
is due to two factors. First, convergence time depends on the complexity of the model, which is
different for different types of node connections (i.e., spike or field) as expanded on in Section
1.3.4. Second, convergence time is also related to the strengths of the connections, which are ran-
domly set in the simulation and are not equal across different kinds of connections.
1.3.4 Multiscale causality estimation algorithm accurately reconstructs the 10-node network
graphs
First, we simulated 100 trials for 2 randomly generated 10-node networks with 12 and 24
causal connections (Fig. 1-5A,E). The method accurately estimates the true causality graphs
(Figs. 1-5B,F). There are 4 possible connection types here: field →field, field →spike,
spike →field, and spike →spike. For each connection type, F1 score asymptotically increases as
the number of samples (i.e., data length) increases (Fig. 1-5C-D,G-H). The accuracy also asymp-
totically converges to close to 1; this result suggests that both the true positive and the true nega-
tive rates are increasing. Note that the asymptotic accuracy depends on the significance level of
the statistical test 𝘀 , which sets the false positive rate; thus, the asymptotic accuracy will be close
to 1 but smaller as 𝘀 is not set to 0 in statistical tests.
One observation is that initially and for small data lengths, as the data length increases, there
may be a transient decrease in F1 score or accuracy before eventual convergence. For example,
23
in Fig. 1-5C, F1 score of spike →field decreases at 12 min transiently before eventually converg-
ing to close to 1. This decrease happens because for very short data lengths in this example
(shorter than 12 min), the bias in (12) is under-estimated and thus most of the connections are
identified as positive. As this abundant identification of positives increases not only the false
positive rate but also the true positive rate, F1 score could start from a relatively large value ini-
tially because it weighs true positives more strongly. However, the accuracy measure is low for
these short data lengths since many false positives result in a low accuracy. Nevertheless, after
this transient phase, both accuracy and F1 score increase and converge to close to 1.
We also find that the convergence time of different connection types are different because
these connections have different strengths and model complexities (Figs. 1-5C-D, G-H). Gener-
ally, the model when considering the causality from spikes or fields to fields is more complex
because the firing rate is latent in the linear Gaussian model in (4). Thus, when the two-step se-
quential estimation procedure is applied, many parameters in both (2) and (4) need to be esti-
mated, whose estimation errors affect the spike to field causality result. In contrast, the model
when assessing causality from spikes or fields to spikes is less complex because only the param-
eters in the firing rate model in (2) are needed to assess this causality, which are much smaller in
number. Finally, we observe that the two networks have different convergence times given their
different topologies.
1.3.5 Robustness of the multiscale causality estimation algorithm
We firstly assess the robustness of the algorithm to network topology by simulating random
networks that have different sparsity levels, quantified by the number of connections, as well as
different topologies (i.e., differences in which exact nodes have these connections). The total
number of connections in a network is selected to be 8, 16 or 24, which means that there are 2, 4
24
or 6 of each 4 connection types. For each sparsity level, we simulated 100 trials. For each of
these trials, the network is randomly and independently generated; thus, the generated networks
have different placement of the connections and thus different topology.
Figure 1-5 Multiscale causality estimation algorithm accurately estimates the causality graph of two
random 10-node networks. (A, E) True network graphs. The blue nodes indicate field nodes, and the or-
ange nodes indicate spike nodes. (B, F) Estimated network graphs. The color of arrows indicates the de-
tection rate (between 0 to 1 as shown by the colorbar) of causality between each pair calculated across
100 simulated trials, with a darker color indicating a larger rate of detection. 𝑇 indicates the data length
used for estimation. (C-D, G-H) F1-score and accuracy for the 4 possible connection types using our
method.
25
We see that regardless of the sparsity level, accuracy and F1-score converge in every case
tested (Fig. 1-6). Also, the convergence times are only slightly different for different sparsity lev-
els and have no systematic trend. These results suggest that our algorithm is robust to different
sparsity levels and network topologies since the networks are randomly generated for each trial.
Figure 1-6 Multiscale causality estimation algorithm is robust to the sparsity level and exact network
topology. F1-score and accuracy in 10-node random networks with (A) 8, (B) 16, and (C) 24 causal con-
nections are shown for the 4 connection types.
A simulation on a random 20-node network (10 fields and 10 spikes) with 24 connections on
100 trials is also performed and the result is similar to 10-node networks showing that the algo-
rithm is robust to large number of nodes.
Then we assess the robustness of the algorithm to several model mismatch cases. First, we
simulate data from the same network topology as in Fig. 1-6B but with correlated white Gaussian
noise in fields. The noise covariance matrix contains 0.015 on diagonal. Each field signal has a
0.005 covariance with a field signal and a -0.005 covariance with another field signal. We can
26
observe that our method still works in this case (Fig. 1-7B). The reasons are that correlated noise
does not introduce causality between fields since it is white, and the multivariate Gaussian noise
projected to each channel is still Gaussian.
Figure 1-7 Multiscale causality estimation algorithm is robust to correlated white Gaussian noise be-
tween field signals and unobserved inputs. F1-score and accuracy are shown for (A) data generated from
the exact same model in estimation as Fig. 1-6B, (B) data with correlated noise between field signals, and
(C) 48-minute data with different level of unobserved white Gaussian inputs; 95% confidence intervals of
mean accuracy are shown in red.
Finally, we simulate data from the same network topology but with independent unobserved
field signals inputting to each observed node. To avoid unreasonably large value, the L1 norms
of 𝜷 1
𝑖𝑗
, 𝜷 2
𝑖𝑗
, 𝜶 1
𝑖𝑗
, 𝜶 2
𝑖𝑗
are halved. The average variance of unobserved signals is adjusted such that
it is equal to 0, 30%, 60% or 90% of the variance of observed part of log-firing rates and mean
values of fields. Based on t-tests on groups of accuracy in different trials, the overall accuracy
does not have a difference between different level of unobserved inputs (Fig. 1-7 C).
27
1.4 Discussions
To study neural mechanisms, it is important to assess the flow of information within brain net-
works by computing causal interactions between spikes and fields. Field recordings such as LFP
and ECoG capture cumulative large-scale network processes whereas spikes reflect the activity
of smaller-scale populations of neurons. We consider spike and field nodes in a statistical sense.
However, there are also neuroscientific evidence showing that spikes contribute to field genera-
tion and field signals can also predict spiking activity variations. Biophysical forward model
shows that a field potential measurement has contributions from different sources and the largest
source of field potentials comes from neurons that generate the largest current dipoles [38]. Bio-
physical inverse model is used to infer neuronal sources from field potentials. Inferring neuronal
sources from MEG or EEG is called source reconstruction and from LFPs has typically involved
current source density (CSD) analysis [38]. Indeed, prior works have shown that field signals
contain useful information in predicting spikes [30], [39], [68] and spikes can contribute to stLFP
[29], which is often used to quantify the postsynaptic responses at one location of spiking activ-
ity at another location and it is partially causal. Such causality assessment is challenging since
spikes are binary-valued and have fast time-scales whereas fields are continuous-valued and have
slower time-scales. Here we developed a novel multiscale causality estimation algorithm and the
associated statistical tests that can assess causal interactions within mixed spike-field networks.
We also have behavioral states in the model, examples of which include kinematic states of
hands or eyes and abstract states like mood score. In case of external stimulation, behavioral
states may also be able to be replaced by stimulation parameters like voltage or frequency.
As a key element, we designed a two-step sequential estimation procedure. This procedure en-
abled two features. First, it allowed for efficient estimation of parameters for larger networks.
28
Second, it allowed us to build biophysically-informed models by using the latent spike log-firing
rates as predictors for the field features instead of using the observable binary spike events---bio-
physical studies indicate that log-firing rates contribute to field prediction [29], [67]. It is a chal-
lenging problem that the firing rates cannot be estimated efficiently using maximum-likelihood
estimation on a joint likelihood function. The nonlinear joint likelihood function can be opti-
mized with gradient-based algorithms. In each iteration, the complexity is linear to the number of
parameters and samples, but the computational time also depends on the number of iterations,
which usually increases with the number of parameters. However, the sequential procedure re-
solves the problem by computing the firing rates as an intermediate step. For GLM model, spikes
are estimated separately. For fields, linear least-square-based method is applied. It does not re-
quire iteration and thus the complexity is only linear to the number of parameters and samples.
Besides, the number of parameters in each separate model is fewer than in the joint likelihood
function.
The computational time to reconstruct the network graph also depends on the number of con-
nections, i.e., network sparsity. Sparser networks will require less computational time. Specifi-
cally, in our algorithm, unconditional models are fitted about 𝑛 2
times where 𝑛 is the total num-
ber of nodes in the network. Then conditional models are fitted fewer times depending on the
network sparsity. Various prior studies of structural and functional brain networks across spatio-
temporal scales (e.g., [80], [81]) have suggested that brain networks are indeed sparse. In par-
ticular, by using various experimental modalities such as electroencephalogram (EEG), magne-
toencephalogram (MEG) and functional and structural MRI, these studies have indicated that
brain networks have small-world properties [82], which means that the networks are sparse and
29
locally clustered, with short paths that link all nodes of the network. In a small-world network,
most nodes have a small number of direct connections despite the network being possibly large.
We assume conditional independence between signals in our model. This assumption holds
when the sampling rates of signals are high enough. For fields, we showed that our proposed al-
gorithm also works when there are correlated white noises between field signals in simulation.
For spikes, the sampling rate can always be made as high as we want such that correlations are
negligible, and each spike can be separately modeled. Therefore, this is indeed a reasonable as-
sumption.
We focused on the derivation of the novel multiscale causal estimation algorithm and its vali-
dation and evaluation using Monte Carlo simulations. These simulations allowed us to validate
the algorithm by providing the ground-truth causality graphs and also enabled the study of algo-
rithm robustness by generating networks with various sparsity levels and topologies. Simula-
tions have also been useful in prior studies for designing causality measures (e.g., [52]–[54],
[66]) and for developing closed-loop neural systems such as motor BMIs~(e.g., [39], [83]–[85]),
BMIs for anesthetic delivery [86], [87], and brain stimulation systems [88], [89]. Having derived
and validated the algorithm in simulations, an important future direction of our research is to ap-
ply this algorithm on multiscale spike-field datasets from animal and human experiments across
various brain regions and behaviors. Other future directions will be in modeling aspects. In the
future, nonlinear modeling between log-firing rates, spikes, fields and behavioral states is possi-
ble and can be compared to our work. It might be possible to define directed information be-
tween fields and spikes directly from biophysical models such as integrate-and-fire network
models [67] instead of a simplified linear model.
30
In addition to helping investigate neural mechanisms, the algorithm may help inform future
neurotechnologies [88]–[97]. Brain stimulation systems for neuropsychiatric disorders need to
identify a target brain region whose stimulation can causally alter the relevant brain state [88]–
[94]. For example, in depression, stimulation of the target region should causally modulate brain
networks that affect mood. Since these networks are multi-site and distributed [98], the mul-
tiscale causality measure may help uncover connectivity patterns across spatial scales and thus
guide the location of the stimulation electrode. The detected causal interactions from any net-
work node can be further validated by stimulating that node and observing the response at other
nodes [93], [96].
1.5 Summary
We developed a multiscale causality estimation algorithm to identify the causality graphs of
mixed discrete-continuous spike-field networks. We described spike-field activity with a likeli-
hood function comprised of point process GLMs for spikes and linear Gaussian models for
fields. Both spike and field signals were modeled as functions of the history of all spike and field
signals and the behavioral state, enabling causality assessment in both task-driven and spontane-
ous activity. We designed a sequential estimation procedure to fit the large number of model pa-
rameters in the presence of latent firing rates that the fields depended on. We then computed the
directed information based on the estimated multiscale models. We devised two statistical tests
to assess causal interactions. We validated the algorithm using extensive simulations. The algo-
rithm accurately estimated the causality graphs for random spike-field networks regardless of the
number of connections, connection strengths, or exact topology.
31
Chapter 2: Data-efficient multiscale causal interaction modeling be-
tween spiking and field potential signals during behavior
2.1 Introduction
We developed a model-based Granger-like theoretical framework and tested it within numeri-
cal Monte Carlo simulations [99] in Chapter 1. In this model-based Granger-like framework, a
spike train is modeled as a point process GLM in which the instantaneous firing rate of each neu-
ron is modeled as a function of the history of both the spike trains and the field potentials across
the network. Further, in addition to their own history, field potentials can be modeled either as a
function of the history of the measurable binary spike trains or the history of the latent firing
rates. This framework resolves the above challenges through three main ideas. First, to enable
assessing causality across networks consisting of several nodes/signals, it assumes that condi-
tioned on the history of all signals---i.e., all spike trains and field potential time-series---, spike
trains and field potentials at a given time are conditionally independent and thus their models can
be separately estimated. Second, to enable using the latent firing rates of neurons as predictors of
field potentials, the framework consists of a sequential maximum-likelihood estimation ap-
proach. It first fits the GLM model of the spike trains to estimate the latent firing rates and then
uses these estimated latent firing rates to fit the model of field potentials. Third, time-varying dy-
namic behavior is modeled as covariates in both spike and field models. Thus, a stable measure
of causality during behavior can be estimated. Together, these ideas resolve the challenge of esti-
mating a large number of parameters, allow for firing rates to be predictors of field potentials,
consider behavior, and generate a network encoding model. The two expanded parametric mod-
els can then be used to compute a multiscale causality measure in terms of directed information.
32
While this multiscale causality framework was shown effective within Monte Carlo simula-
tions [99], its statistical tests to assess causality to field potential signals required repetitively
splitting the data, which resulted in reduced data efficiency and redundant computation. To make
the method suitable for real neural datasets, especially when assessing causality for a large num-
ber of signals in modern datasets that contain recordings from tens of electrodes, it is important
to improve the data efficiency of the method because real datasets are length-limited and contain
a large number of electrodes. Moreover, the framework was only tested in Monte Carlo simula-
tions, and so it remained unclear whether multiscale causality can be recovered in more realistic
biophysical simulations---where there exists a model mismatch between the fitted model and the
ground truth network---or in real neural datasets.
Here, we address the above problems by developing a new data-efficient statistical test proce-
dure within our multiscale network causality method based on the Wald test [77], and by then
demonstrating the multiscale network causality method on biophysical simulations. We refer to
the network model with the identified causal connections as the multiscale spike-field causality
graph. First, we show that our new test procedure is more data-efficient and can thus perform
better with the same amount of training data. Second, we show that the method can identify mul-
tiscale causality in more realistic biophysical simulations of spike-field network activity in which
the ground-truth network structure is known.
2.2 Causality graph identification
We need to estimate the likelihood models of spikes and fields to compute it and develop a
statistical test to assess the significance of causality. This estimation is based on our proposed
multiscale parametric models. We also devise new statistical tests for causality assessments be-
low.
33
We denote the likelihood function of the field potential 𝑦 𝑚𝑡
𝑞 , causally conditioned on all sig-
nals in the network up to time 𝑚𝑡 − 1, by 𝑓 𝑡 ( 𝑦 𝑚𝑡
𝑞 | | 𝒩 ∪ 𝒴 \ { 𝑦 𝑞 } ). Similarly, we denote the likeli-
hood function of a spike train 𝑁 𝑚𝑡
𝑞 , causally conditioned on all signals in the network up to time
𝑚𝑡 − 1, by 𝑓 𝑡 ( 𝑁 𝑚𝑡
𝑞 | | 𝒩 ∪ 𝒴 \ { 𝑁 𝑞 } ). Thus, the conditional likelihood function of the 𝑛 samples of
𝑦 𝑞 and 𝑁 𝑞 can be denoted by 𝑓 𝑛 ( 𝑦 𝑞 | | 𝒩 ∪ 𝒴 \ { 𝑦 𝑞 } ) and 𝑓 𝑛 ( 𝑁 𝑞 | | 𝒩 ∪ 𝒴 \ { 𝑁 𝑞 } ), respectively. We
have
(15)
2.2.1 Spike train model and causality to spike trains.
Here we denote the behavioral states at time 𝑚𝑡 by 𝒖 𝑚𝑡
. For each spike train 𝑞 at time 𝑚𝑡 , the
model is fully specified by the instantaneous firing rate function 𝜆 𝑚𝑡
𝑞 .
Given any subset of spike trains 𝒩 𝑠 𝑢𝑏 ⊆ 𝒩 \{ 𝑁 𝑞 } and any subset of field potentials 𝒴 𝑠 𝑢𝑏 ⊆ 𝒴 ,
we write the logarithm of the firing rate of the 𝑞 -th spike train 𝑁 𝑞 ∈ 𝒩 within the GLM frame-
work as
(16)
where 𝐾 𝑠 𝑞 is a non-negative integer representing the length of history. Here 𝛽 's with subscripts
and superscripts are parameters to be estimated. All 𝛽 's are vectors except 𝛽 0
𝑞 . 𝑨 𝑎 , 𝑏 denotes the
vector [ 𝐴 𝑎 , 𝐴 𝑎 + 1
, . . . , 𝐴 𝑏 ] and 𝑨 𝑚 ( 𝑎 , 𝑏 )
denotes the vector [𝐴 𝑚𝑎
, 𝐴 𝑚𝑎 + 𝑚 , . . . , 𝐴 𝑚𝑏
]. We note that
all samples of spike trains from time 𝑚𝑡 − 𝑚 𝐾 𝑠 𝑞 to time 𝑚𝑡 − 1 are included as covariates in the
GLM model above as can be seen in the first summation on the right side of Equation (16).
34
We denote the set of all unknown parameters (all 𝛽 's) in (16) by 𝜗 . Elements in 𝜗 need to be
estimated using maximum likelihood estimation (MLE) by maximizing the point process likeli-
hood function of 𝑁 𝑚𝑡
𝑞 written as
(17)
where Δ is the time-step used to bin the spikes [37], [99]. Note that the likelihood of spike trains
is evaluated at times 𝑚𝑡 because the field potential signal is only available at these times. How-
ever, the firing rates at time 𝑚𝑡 are evaluated using all samples of spike trains at every time-step
from time 𝑚𝑡 − 𝑚 𝐾 𝑠 𝑞 to time 𝑚𝑡 − 1 as seen from Equation (16).
The MLE is given by
(18)
To find the set of signals that cause 𝑁 𝑞 , we need to select the smallest sets 𝒩 𝑠 𝑢𝑏 and 𝒴 𝑠 𝑢𝑏
such that 𝑓 𝑛 ( 𝑁 𝑞 | | 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 ) = 𝑓 𝑛 ( 𝑁 𝑞 | | 𝒩 ∪ 𝒴 \ { 𝑁 𝑞 } ). This would mean that conditioning on
signals other than the ones in 𝒩 𝑠 𝑢𝑏 and 𝒴 𝑠 𝑢𝑏 does not improve the likelihood function and thus
all and only signals in 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 cause 𝑁 𝑞 . To select the hyperparameters 𝒩 𝑠 𝑢𝑏 , 𝒴 𝑠 𝑢𝑏 , and 𝐾 𝑠 𝑞
(i.e. history length), we devise a model selection procedure based on comparing
𝑓 𝑛 ( 𝑁 𝑞 | | 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 ) with different hyperparameters [99]. We use the Akaike information crite-
rion (AIC) for this selection [73] and thus for a given 𝒩 𝑠 𝑢𝑏 and 𝒴 𝑠 𝑢𝑏 , we pick 𝐾 𝑠 𝑞 as follows:
(19)
where 𝛿𝐾
𝑠 𝑞 is the number of parameters with 𝛿 = 𝑚 ( | 𝒩 𝑠 𝑢𝑏 | + 1 ) + | 𝒴 𝑠 𝑢𝑏 | and with | ∙ | being
the operator counting the number of elements in a given set. To determine whether a signal 𝑧 ∈
𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 causes 𝑁 𝑞 , by Wilk's theorem [75], under the hypothesis that 𝑧 does not cause 𝑁 𝑞 ,
we have
35
(20)
where 𝜒 2
( 𝑘 ) denotes a chi-squared distribution with degree of freedom 𝑘 , 𝑐 = 1 if 𝑧 is a spike
train and 𝑐 = 𝑚 if 𝑧 is a field potential. If the p-value in this test is smaller than a desired thresh-
old, we say 𝑧 significantly causes 𝑁 𝑞 . This enables us to remove any signal in a given set 𝒩 𝑠 𝑢𝑏 ∪
𝒴 𝑠 𝑢𝑏 that does not significantly cause the modeled spike train. We will further explain the model
selection procedure in Section 2.2.3. For tests on spike train models, the threshold of p-value is
set to 0.05 and we also control false discovery rate (FDR) using Benjamini-Hochberg procedure
[78].
2.2.2 Field potential model and causality to field potentials
We model a field potential using a linear Gaussian model. We assume a field potential 𝑦 𝑞 ∈ 𝒴
is Gaussian distributed with time-variant mean 𝜇 𝑚𝑡
𝑞 at time 𝑚𝑡 and fixed variance 𝜎 𝑞 2
. The mean
of the field potential depends on various covariates, including the history of observed field po-
tentials, the history of spike firing rates (which need to be estimated) and the current behavioral
states. Instead of the history of spike firing rates, it is easy to use the history of the observed bi-
nary spike trains as predictors as well because these spike trains are directly observed. We can
thus study whether the binary events or the firing rates are better predictors of the field poten-
tials. Given any subset of spike trains 𝒩 𝑠 𝑢𝑏 ⊆ 𝒩 \{ 𝑁 𝑞 } and any subset of field potentials 𝒴 𝑠 𝑢𝑏 ⊆
𝒴 , we can write the mean of the field potential as
(21)
36
where 𝐾 𝑓 𝑞 is the history length for spike-field and field-field causality, which is a non-negative
integer. Here 𝛼 's with subscripts and superscripts are parameters to be estimated. Thus, the
Gaussian likelihood function of the field potential is given by
(22)
We denote the set of all remaining unknown parameters by 𝜗 including all 𝛼 's in (21) and 𝜎 in
(22). Elements in 𝜗 still need to be estimated using MLE by maximizing the Gaussian likelihood
function as
(23)
We now have to perform model selection and statistical tests of causality for the field potential
models. In our previous chapter, this was done through a data-expensive and redundant empirical
data-splitting process. Here, we improve on the data efficiency of this method by devising an al-
ternative analytical approach as described in the next paragraphs. The new method is better
suited for neural data analyses as real datasets are length-limited and can contain recordings from
many electrodes, thus introducing a large number of parameters and signals.
The new analytical method is devised as follows. Once 𝛼 's are estimated, the covariance ma-
trix of these estimates -- i.e., covariance of 𝛼 ̂ 's -- denoted by 𝑉 plays an important role in model
selection procedure and statistical tests for causality as we will show below. Finding this covari-
ance matrix is not straightforward. This is because we use the history of spike firing rates as pre-
dictors of field potentials and thus the spike train models have to be estimated before the field
potential models can be estimated, resulting in a 2-step MLE problem for the field potential mod-
els. This means that in the second step of MLE, there are nuisance parameters 𝛽 ̂
's in the spike
models that are estimated from the first step. The latent spike firing rates computed using these
37
nuisance parameters are different from the true values, and thus 𝑉 is also related to the variances
of 𝛽 ̂
's.
We thus need to estimate 𝑉 , which is the covariance of the 𝛼 ̂ parameters of the field potential
model in the presence of nuisance parameters. To do so, we leverage a method developed for this
covariance estimation in the presence of nuisance parameters [77]. Let's use ∇
𝜁 to denote the first
order derivatives of a function w.r.t. 𝘁 and ∇
𝜁 2
to denote the second order derivative of a function
w.r.t. 𝘁 and 𝘂 . We define 𝐺 𝛼 = − 𝑛 − 1
∇
𝛼𝛼
2
𝑙𝑛 𝑓 𝑛 ( 𝑦 𝑞 ) as the second order derivative of
− 𝑛 − 1
𝑙𝑛 𝑓 𝑛 ( 𝑦 𝑞 ) w.r.t. 𝛼 's. Let 𝐺 𝛽 = − 𝑛 − 1
∇
𝛼𝛽
2
𝑙𝑛 𝑓 𝑛 ( 𝑦 𝑞 ) be the second order derivative of
− 𝑛 − 1
𝑙𝑛 𝑓 𝑛 ( 𝑦 𝑞 ) w.r.t. 𝛽 's then 𝛼 's, and 𝑀 = − 𝑛 − 1
∇
𝛽𝛽
2
𝑙𝑛 𝑓 𝑛 ( 𝑁 ) with − 𝑛 − 1
𝑙𝑛 𝑓 𝑛 ( 𝑁 ) =
− 𝑛 − 1
[ 𝑙𝑛 𝑓 𝑛 ( 𝑁 1
) + 𝑙𝑛 𝑓 𝑛 ( 𝑁 2
) + . . . + 𝑙𝑛 𝑓 𝑛 ( 𝑁 𝐶 ) ]. Also let Ω
𝛼 =
𝑛 − 1
∑ ∇
𝛼 𝑙𝑛 𝑓 𝑡 ( 𝑦 𝑚𝑡
𝑞 ) ∇
𝛼 𝑙𝑛 𝑓 𝑡 ( 𝑦 𝑚𝑡
𝑞 )
′
𝑛 𝑡 = 1
, and Ω
𝛽 = 𝑛 − 1
∑ ∇
𝛽 𝑙𝑛 𝑓 𝑡 ( 𝑁 𝑚𝑡
) ∇
𝛽 𝑙𝑛 𝑓 𝑡 ( 𝑁 𝑚𝑡
)
′ 𝑛 𝑡 = 1
, where
𝑓 𝑡 ( 𝑁 𝑚𝑡
) = [ 𝑓 𝑡 ( 𝑁 𝑚𝑡
1
) , 𝑓 𝑡 ( 𝑁 𝑚𝑡
2
) , . . . , 𝑓 𝑡 ( 𝑁 𝑚𝑡
𝐶 ) ]
′
. It can be shown that [77]
(24)
All variables in Equation (24) are evaluated at the estimated value of the parameters, i.e., at 𝛼 ̂
and 𝛽 ̂
. The first term in 𝑉 above is the same as the typical covariance in a 1-step MLE. The sec-
ond term in 𝑉 is contributed by the uncertainty of the nuisance parameters 𝛽 ̂
's due to the 2-step
nature of the MLE here that affects the field potential model. Now that we have an estimate of 𝑉 ,
we can formulate the model selection procedure and statistical tests for assessing causality to
field potential signals.
A model selection procedure is required when 𝒩 𝑠 𝑢𝑏 , 𝒴 𝑠 𝑢𝑏 , or 𝐾 𝑓 𝑞 are undetermined. For given
𝒩 𝑠 𝑢𝑏 and 𝒴 𝑠 𝑢𝑏 , we select the history length 𝐾 𝑓 𝑞 using an information criterion. For the multiscale
model, there is uncertainty in 𝛼 ̂ 's contributed by 𝛽 ̂
's, i.e., the second term in 𝑉 in Equation (24)
38
and so AIC is not directly applicable. However, we can apply AIC to a model with only field po-
tential signals. Thus, taking the history of spike-field connections to be no longer than the history
of field-field connections in this case, we select the length of history in the multiscale model as:
(25)
where 𝛿 𝐾 𝑓 𝑞 is the number of parameters with 𝛿 = | 𝒴 𝑠 𝑢𝑏 | + 1.
The estimated 𝑉 is then applied to statistical tests for assessing causality. We use the Wald test
[77] to test whether a sub-vector of 𝛼 is 0 in Equation (21), and thus to determine whether a sig-
nal 𝑧 ∈ 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 causes 𝑦 𝑞 . The Wald statistic follows a 𝜒 2
distribution under the null hy-
pothesis 𝛼 𝑠 = 0 and is given by
(26)
where 𝛼 𝑠 is any sub-vector of 𝛼 , 𝑉 𝑠 is a sub-matrix of 𝑉 with selected rows and columns corre-
sponding to 𝛼 𝑠 and the degree of freedom 𝑟 is the length of 𝛼 𝑠 . To assess causality from a signal
𝑧 to a field potential 𝑦 𝑞 , we consider the 𝛼 𝑠 that only contains parameters related to 𝑧 ; we then
say 𝑧 does not cause 𝑦 𝑞 when the p-value corresponding to the above test is larger than a thresh-
old. We use a threshold of 0.05 for statistical tests and do separate FDR controls on spike-field
and field-field connections in the following sections.
2.2.3 Model selection and causality assessment
To find the set of signals that cause a spike train or a field potential, we use the above statisti-
cal tests described in Section 2.2.1 and 2.2.2. We now describe how we assess causality to a
spike train and to a field potential signal, respectively.
For a spike train 𝑁 𝑞 and another specific signal, we first model 𝑁 𝑞 using its own history and
the history of that other signal to decide if that specific signal unconditionally causes 𝑁 𝑞 . We
then construct the set 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 to consist of all signals that significantly unconditionally
39
cause 𝑁 𝑞 . This allows us to narrow down the number of signals for which we need to compute
the conditional causality to 𝑁 𝑞 . We then proceed to testing the conditional causality within this
set 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 using Equation (20). In particular, we determine if each signal 𝑧 in 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏
significantly conditionally causes the modeled signal (i.e. when knowing the history of all other
signals in set 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 , does knowing 𝑧 still help the prediction of 𝑁 𝑞 ). At a given signifi-
cance level, we eliminate every insignificant 𝑧 from the set after all tests are done. This proce-
dure provides the set of signals that significantly cause a spike train.
For a field potential signal 𝑦 𝑞 , we first take 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 to be 𝒩 ∪ 𝒴 \𝑦 𝑞 , i.e. all of the sig-
nals being recorded except 𝑦 𝑞 itself. However, because these sets are high-dimensional, to re-
duce the computation complexity of model selection and make the method applicable to large
channel counts, we devise a preprocessing procedure to reduce the number of signals that need to
be considered in model-selection, i.e. the number of signals in 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 . First, we fit a sin-
gle-scale model with only field-field connections and remove the insignificant connections after
FDR control from 𝒴 𝑠 𝑢𝑏 . The lengths of history are also selected using Equation (25). We then
compute a residual for each field signal at each time point by subtracting the single-scale one-
step-ahead prediction of the signal from its observed value (predicting the signal at the current
time based on all past values of field potentials, see Section 3.2.2). This residual indicates the re-
sidual part of the field potential signal that is not predicted by the other field potential signals.
Second, we consider each pairwise spike-field connection by fitting a pairwise field model for
each residual series. Spike-field connections with p-values larger than threshold 0.05 are re-
moved from 𝒩 𝑠 𝑢𝑏 . Together, these preprocessing steps provide us with a subset of signals
𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 before we do model selection for the multiscale model, thus reducing the complex-
ity and allowing for causality assessment in large channel-sets.
40
Having obtained the 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 , we fit the multiscale model and look at the p-values com-
puted by Equation (26) for the causality of spike trains to the field potential signal and remove
the spike train whose p-value is the largest and also larger than a relatively large threshold of
0.95 from the set and then fit the model again. This threshold means that the removed spike train
has less than 5% chance of having a causal connection to the field potential and thus helps avoid
false negatives. We repeat this removal and refitting procedure until all p-values corresponding
to spike trains are smaller than the threshold. Then the remaining signals in 𝒩 𝑠 𝑢𝑏 ∪ 𝒴 𝑠 𝑢𝑏 are the
candidates that may cause 𝑦 𝑞 . Now, the p-values computed by Equation (26) are used to find all
significant causalities to field potential signals (see also Section 2.2.2).
Based on prior evidence about the length of history for spike and field potential dynamics and
for computational tractability, we consider a history length of up to 100ms for a spike train [37]
and up to 300ms for a field potential [29] in our model selection procedure. History lengths
around these values are generally considered long enough for spike train and field potential dy-
namics [29], [37]. This means that we allow 𝐾 𝑠 𝑞 to be up to 10 and 𝐾 𝑓 𝑞 to be up to 30 when the
data is sampled at 100 Hz, i.e. at a time-step of Δ =10 ms.
2.3 Results
In this section, we first show, using Monte-Carlo simulations, that the new statistical test on
causality to field potentials in our multiscale causality method introduced in Section 2.2.2 works
successfully and is data-efficient. We then use biophysical simulations with a known ground-
truth causality graph to show that our method can identify the ground truth multiscale causality
graph in realistic simulations of multiscale spike-field signals.
2.3.1 Simulation showing data-efficiency of the new causality method
41
To show data-efficiency of the statistical test for assessing causality to field potentials, we
compare the new test to our original test in [99] using the same simulated data from [99]. In [99],
100 randomly generated 10-node networks with 16 connections were used to evaluate the perfor-
mance of our original multiscale causality method. Note in our prior work, we only performed
Monte Carlo simulations to validate the approach [99]. We apply our new statistical test to the
100 random networks and evaluate the AUC from data lasting for different lengths of 3-min to
24-min.
Figure 2-1 Comparison of new and original test procedures on causality to field potentials shows that
the new method is more data efficient. (a) AUC of field-field and spike-field connections with errorbar
showing mean ± sem. (b) AUC of field-field connections with errorbar showing mean ± sem. (c) AUC of
spike-field connections with errorbar showing mean ± sem. The new method can perform as well or better
than the original method using much less training data.
Figure 2-1 shows the comparison of AUCs of the two tests. We can observe that for any given
length of data shown in Figure 2-1a, the new method has larger AUC than the original method
used in [99]. At 3 min, the overall AUC using the new method is significantly larger than the
original AUC by 2.1% (p-value=1.6e-2, one-sided paired t-test). At 12 min, the overall AUC
42
using the new method is again significantly larger than the original AUC by 9.0% (p-value=8e-
20, one-sided paired t-test). Furthermore, the new AUC achieved using 6-min data is larger than
the original AUC using even 12-min data (p-value=0.008, one-sided paired t-test). These results
show that the new method performs as well or better than the original method using much less
training data and thus is more data-efficient than the original method.
2.3.2 Biophysical simulation
We use the Virtual Electrode Recording Tool for EXtracellular potentials (VERTEX) [100] to
generate data from biophysical simulations. Given the position of electrodes, properties of neu-
rons and tissues, VERTEX can calculate extracellular LFPs using a forward modeling approach
[101].
2.3.2.1 Simulation settings
The purpose of this simulation is to show that our method can identify the ground truth causal-
ity graph in a realistically simulated multiscale spike-field network. We simulate 10 sessions of
spike-field data, with each session lasting 200 s. The placement of neurons is randomized for
each session. For generality, each random simulated network includes all 4 types of causality:
spike-field, field-field, spike-spike, field-spike. The overall simulated network structures are the
same in each session, but neurons are randomly placed in 4 different 3-D regions (A-D squares
in the x-y plane in Figure 2-2a). The size of each region is 1 mm × 1 mm × 0.2 mm. The mini-
mum distance between two connected regions is 6 mm. Neurons in a region are randomly placed
in each session while electrodes recording LFPs are placed as shown in x-y plane with height
𝑧 = 0 . 1 mm.
Each simulated network contains 6120 neurons. We chose our simulated network and its pa-
rameters such that we can have realistic firing rates with a reasonable number of simulated
43
neurons and a known ground truth causality to which we can compare. Among all neurons, a net-
work of 60 neurons (orange circles) and 20 electrodes (blue crosses) are shown in Figure 2-2a.
The weight of the synapses is 1100 pA for every connection between these neurons. Each of the
60 displayed neurons has an input from another neuron not shown in the figure through a syn-
apse with a weight of 4000 pA. This input neuron fires with Poisson statistics at a rate of 10 Hz
if connected to a displayed neuron without other inputs or 5 Hz otherwise. Each of the 30 dis-
played neurons with no displayed input from the same region also has an output to another 200
neurons within the same region through synapses with a weight of 2200 pA. All neurons are ex-
citatory pyramidal neurons. Synapses are single exponential current-based, and the axon arbor is
uniformly distributed within the region. Other synapse properties are specified in Table 2-1. The
structural and passive properties are the same as the layer 2/3 pyramidal neuron model described
in [100]. The 60 neurons shown in Figure 2-2a and the 6000 neurons driven by them follow the
same adaptive exponential (AdEx) model [102] with parameters specified in Table 2-1. These
neurons also have random current input following Ornstein Uhlenbeck process with parameters
also specified in Table 2-1.
Table 2-1 Parameters of neurons, synapses and input currents in the biophysical simulation. AHP
here refers to after-hyperpolarization.
Name Value
spike generation threshold -50 mV
spike steepness parameter 1 mV
scale factor of AHP current 2.6 nS
AHP current time constant 25 ms
AHP current instantaneous change 600 pA
reset membrane potential -60 mV
cutoff membrane potential -45 mV
synapse exponential decay time constant 2 ms
axon conduction speed 0.3 m/s
synapse release delay 0.5 ms
input current mean 100 pA
input current std 30 pA
input current time constant 2 ms
44
Figure 2-2 Multiscale causality identification method can find the directed connections within spike-field
networks in biophysical simulations. (a) Simulated network structure consists of 6120 neurons as detailed
in Section 2.3.2.1. The figure shows 60 neurons (orange circles) and 20 electrodes (blue crosses) in 4 dif-
ferent 3-D regions (A--D squares in x-y plane). The size of each region is 1 mm × 1 mm × 0.2 mm. The
minimum distance between two connected regions is 6 mm. Neurons in a region are randomly placed in
each session while electrodes recording LFPs are placed as shown in x-y plane with height 𝑧 = 0 . 1 mm.
A spike train is observed from each neuron. Connections between spike trains are shown in black arrows
and are all from left to right. To have a clear ground truth between regions, 20 spike trains from regions
45
A, D and 10 LFPs from regions B, C are modeled. (b--e) Ground truth and identified network between
signals. The ground truth (b, d) and identified network (c, e) averaged across 10 sessions are shown. For
(c, e), the colors indicate the ratio of sessions for which a positive connection was identified, and the scale
is shown by the color bar. (f--i) Receiver operating characteristic (ROC) curves of connectivity with
shaded 95% confidence interval computed using the ground truth and identified networks. Our method
can successfully identify (f) field-field (g) spike-field (h) field-spike and (i) spike-spike connections in the
multiscale spike-field networks.
Figure 2-3 We show 20 example spike trains and 10 example LFPs observed from a network shown in
Figure 2-2a during a 10-second period. Sampling rates are 100 Hz for LFPs and 200 Hz for spikes (i.e., 5-
ms spike bins). The regions to which the LFP and spike train signals belong are indicated by A-D. These
regions as well as the LFP and spike train numberings are the same as in Figure 2-2a.
The raw output from VERTEX includes spike timings and LFPs. The original sampling rate of
LFPs is 1 kHz. Under the above simulation settings, neurons rarely spike more than once every 5
46
ms as expected. The maximum 𝐾 considered here is also 30 for field potentials and 10 for spikes.
For spikes, we generate spike trains from spike timings. To obtain the spike trains, we generate
time-series of 0’s and 1’s by binning the spikes in 5ms time bins. If there is a spike in a time bin,
we set the value of the spike train at that time to 1 and if there is no spike in a time bin, we set
the value of the spike train at that time to 0. For LFPs, we down-sample the LFP signals at
100Hz sample rate and standardize them to have zero mean and unit variance. Example simu-
lated spike and LFP signals are shown in Figure 2-3.
2.3.2.2 Multiscale causal connections can be identified in biophysical simulations
To have a clear ground truth between regions that we can then use for assessment, we select
20 spike trains from regions A and D and 10 LFPs from regions B and C as shown in Figure 2-
2a. All our subsequent analyses will focus on these signals. The ground truth causality graph for
these signals is shown in Figure 2-2b, d, with former showing the causality to the spike trains in
area A and D and the latter showing the causality to LFPs in regions B and C. For spike-spike
connections, the ground truth is the same as synapse connections and causal connections are
shown with yellow. For the other 3 types of connections, each LFP in a region is caused by every
direct synaptic input to that region and also causes every signal in that region and every signal in
the next region with direct synaptic connections. Thus, for these 3 types of connections, the true
causal connections are shown between regions (rather than being at the level of synapse connec-
tions) in yellow.
We apply the multiscale causality algorithm to each session of the simulated data. A causality
from one signal to another is defined as positive if the first signal causes the second. A positive
causality in an identified network is true positive if the ground truth is positive. It is false positive
if the ground truth is negative. Then we can also compute the true positive rate and false positive
47
rate by dividing the number of true positives and false positives by the number of positives and
negatives in ground truth, respectively.
The ratio of sessions for which a positive causality was identified from each signal to another
signal is shown by the color in Figure 2-2c, e. Comparing with the ground-truth causal connec-
tions in Figure 2-2b, d, this figure suggests that the method is successful in identifying causal
connections.
To quantify this performance, we plot the receiver operating characteristic (ROC) curves for
different kinds of connections by sweeping the threshold of p-values from 0 to 1 (p-values used
for causality detection). The average ROC curves across 10 sessions with shaded 95% confi-
dence are shown in Figure 2-2f-i. We find the area under the ROC curve (AUC) in each case.
The chance-level AUC is 0.5, which we compare to. As we can observe from Figure 2-2i, true
spike-spike causalities are accurately detected and the associated AUC is 0 . 999 ± 0 . 001 (mean
± std), which is significantly larger than chance level of 0.5 (p-value=9e-26, one-sample one-
sided t-test). In addition, for field-field, spike-field and field-spike causality, the AUCs are
shown in Figure 2-2f-h and are 0 . 95 ± 0 . 04, 0 . 79 ± 0 . 10 and 0 . 91 ± 0 . 14, respectively. All
these AUCs are also significantly larger than the chance level of 0.5 (p-values=2e-11, 5e-6 and
3e-6, respectively; one-sample one-sided t-test).
2.4 Summary
Here we explore the causality structure in multiscale spike-field networks and demonstrate a
multiscale spike-field causality graph identification method based on directed information in re-
alistic biophysical simulations and in motor cortical spike-LFP data during motor behavior from
two NHPs. We show that the method is data-efficient and can successfully identify the mul-
tiscale causal connections in biophysical simulations.
48
Chapter 3: Modeling causal interactions between spiking and field
potential signals during behavior of non-human primates
3.1 Introduction
In this chapter, we demonstrate the multiscale network causality method developed in Chapter
2 on experimental data collected from two non-human primates (NHP) during an arm reaching
behavior. We apply the method to NHP spike-field data during motor behavior and show that
given its data-efficiency, the method can perform causality tests within high-dimensional record-
ings from over 100 electrodes. Also, our results on NHP data suggest the existence of multiscale
spike-field network causality. Further, we find that the multiscale causality method leads to mod-
els that improve the prediction of both spike trains and field potentials in NHP data. Finally, in
the NHP datasets, we find that latent firing rates are better predictors of field potentials and ex-
plore the properties of different directed connections, such as their density as a function of dis-
tance or their density within vs. across brain regions.
3.2 Methods
3.2.1 Motor task and neural data recording
We use multiscale recordings from motor cortical areas of two Rhesus macaques performing a
motor task [103]. The NHPs reached to objects located at different 3D locations for a liquid re-
ward and then returned to the resting position before initiating another reach as described in
[103]. For the first NHP J, 7 sessions of recordings were obtained from the dorsal premotor cor-
tex (PMd), ventral premotor cortex (PMv), primary motor cortex (M1), and prefrontal cortex
(PFC) contralateral to the arm using an array of 137 electrodes (Gray Matter Research, USA)
where neighboring electrodes were spaced 1.5 mm apart in X and Y directions and the depth in Z
direction was adjusted by the experimenter. For the second NHP C, 4 sessions of recordings
49
were obtained from left PMd, right PMd, left PMv and right PMv using 128 electrodes in total
from 4 arrays (Gray Matter Research, USA). Arm movements were tracked by reflective body
markers attached to the subject's skin. Marker locations were captured by near infrared cameras
with sampling frequency of 100 frames per second (Motion Analysis Corp., USA). Behavioral
covariates 𝒖 𝑡 were taken as the 7 joint angle time-series corresponding to the shoulder, elbow
and wrist movements (shoulder elevation, elevation angle, shoulder rotation, elbow flexion, pro
supination, wrist flexion, and wrist deviation) [9]. Neural data was originally recorded with a
sampling rate of 30 kHz. All surgical and experimental procedures were performed in compli-
ance with the National Institute of Health Guide for Care and Use of Laboratory Animals and
were approved by the New York University Institutional Animal Care and Use Committee.
As described in [103], local field potentials (LFPs) were extracted by applying a low-pass fil-
ter with 400 Hz cut-off frequency on the raw signals. Spike trains were obtained by band-pass
filtering the raw signal from 0.3 to 6.6 kHz and then identifying the threshold crossings below
the mean filtered signal. The threshold was set to 3.5 standard deviations [103]. We fit models
both for all available channels in each monkey and for selected higher-quality channels. In the
latter case, we eliminated low-quality noisy spike trains by keeping the 10 spike trains that can
be best predicted using only behavioral states as predictors in GLM models for each session [9].
We then also used the field potentials recorded from these channels in our network causality
analyses. Here we use the same sampling rate for spikes and LFPs of 100 Hz (i.e. 𝑚 = 1).
3.2.2 Prediction power
In the biophysical simulations, we can know the ground truth causality graph and compare that
to the estimated causality graph. However, this is not the case for real neural data. As such, we
evaluate the estimated causality graph in real data by asking how well the multiscale model with
50
that causality graph can predict the spike trains and field potentials. In other words, we take the
fitted multiscale model corresponding to an identified causality graph. We then use this fitted
model to predict the signals one-step-ahead into the future. This means that we assess how well a
signal can be predicted by the history of those signals that are identified as causing it within a fit-
ted multiscale model. This way we can compare different causality graphs by comparing the pre-
diction of their corresponding multiscale models. We refer to this one-step-ahead prediction ca-
pability as prediction power (PP) and define it for spike trains and field potentials as follows.
For each spike train, the one-step-ahead prediction of the continuous valued firing rates 𝜆 𝑞 in
Equation (16) can be computed from a model based on any given causality graph. Then given
any threshold, we can predict the binary spike 𝑁 𝑚𝑡
𝑞 to be 1 if the one-step-ahead predicted firing
rate exceeds the threshold and to be 0 otherwise. We can then construct a receiver operating
characteristic (ROC) curve by sweeping the threshold and compute the area under the curve
(AUC) of the ROC. We define the predictive power (PP) as PP=2 ×AUC-1 [9], [12], [45]. The
range of PP is [0,1] in the training set but PP could be smaller than 0 in the test set. A perfect
predictor achieves a PP score of 1 and a random predictor achieves a PP score of 0.
For each field potential, we find the one-step-ahead prediction of the field potential 𝑦 𝑞 by
computing 𝜇 𝑞 from the multiscale model in Equation (21) based on the history of the signals that
are identified as causing it in the identified causality graph. We define the normalized root
mean-squared error (NRMSE) (the root mean-squared error normalized by the standard deviation
of 𝑦 𝑞 [104] as
𝑁𝑅𝑀𝑆𝐸 =
√
∑ ( 𝜇 𝑚𝑡
𝑞 − 𝑦 𝑚𝑡
𝑞 )
2
𝑡 / ∑ ( 𝑦 ̅
𝑞 − 𝑦 𝑚𝑡
𝑞 )
2
𝑡
51
where 𝑦 ̅
𝑞 is the mean of 𝑦 𝑞 {and the prediction 𝜇 𝑚𝑡
𝑞 is given in Equation (21). Given the
NRMSE, for LFPs, we define PP=1-NRMSE. Thus, the range of PP is also [0,1] in the training
set but PP could be smaller than 0 in the test set. A perfect predictor achieves a PP score of 1 and
a chance-level predictor achieves a PP of 0.
Our statistical tests are on functions of PP, for example on the PP increase ratio defined in Fig-
ure 3-1 that shows the ratio of improvement in PP between different cases (e.g., multiscale vs.
single-scale). Further, we use cross-validation to evaluate PPs and assess statistical significance
in tests on functions of PP.
Note we report p-values using the scientific notation used in MATLAB in which 𝑥𝑒 − 𝑛 de-
notes 𝑥 × 10
− 𝑛 .
3.3 Results
3.3.1 Multiscale causal connections improve prediction of spike trains and field potentials in
NHP datasets
Having established that the multiscale causality graph can be significantly identified in bio-
physical simulations in Chapter 2, we next examined whether multiscale causal connections were
also identified in real NHP motor datasets and if so whether their identification improved the
prediction of spike-field activity. To investigate these questions, we define four types of PP to be
compared: baseline, single-scale, cross-scale, and multiscale. Multiscale PP is the PP of full
models that contain all kinds of connections including multiscale causal connections between
spike-field and field-spike, as well as self-history for each signal. Baseline PP is the PP of a
model acquired by removing causal connections from the full models and considers only self-
history in prediction of a given signal. Single-scale PP is the PP of models acquired by removing
spike-field and field-spike connections from the full models. Cross-scale PP is the PP of models
52
acquired by removing spike-spike and field-field connections except self-history from the full
models. All PP's are computed in cross-validation as described in Methods. As a first validation
and for easier visualization, in Sections 3.3.1.1 and 3.3.1.2, we focus on the causality analyses
with the higher-quality 10-channel sets. In later sections, we repeat these analyses when using all
channels and further provide all the other results by including all of the channels in the analyses.
All conclusions are similar in the 10-channel sets and the all-channel set analyses.
3.3.1.1 Spike train prediction
Here we examine spike trains and find that the identified multiscale causalities improve the
prediction of spike trains in cross-validation. To show a meaningful comparison, we compute
these improvements for spike trains with at least one single-scale (spike-spike) and one cross-
scale (field-spike) connection in all 5 cross-validation folds. In this section we focus on a high-
quality 10-channel set for each session for easier visualization and validation, and then we con-
firm that our conclusions for all-channel analyses are similar in Section 3.3.1.3.
Figure 3-1a shows that multiscale PP is significantly larger than baseline PP for all signals
with multiscale connections (p-value<0.05 by one-sided t-test). We then asked whether the PP
improvement over baseline was due to single-scale or cross-scale causal connections being iden-
tified. To answer this question, we found the normalized improvement of multiscale PP vs. both
single-scale PP (p-value = 2e-11, one-sided t-test) and cross-scale PP (p-value = 0.045, one-sided
t-test) as in Figures 3-1b and c, respectively. We can see when considering both connection
types instead of just spike-spike or field-spike connections, most of the PPs (97.0% and 83.6% of
PPs) for a spike train are significantly improved (p-value<0.05 by one-sided t-test). Figure 3-1b
shows the normalized PP improvement contributed by field-spike connections was on average
66.0% for the significant signals, while Figure 3-1c shows the normalized PP improvement
53
Figure 3-1 Multiscale causalities are successfully detected, thus improving the prediction of spike
trains and field potentials. Signals---whether spike train signals in (a--c) or field potential signals in (d--f)-
--are sorted by the 95% confidence bound of PP increase ratios (these ratios are defined in the legends).
PP increase ratios are significantly above 0 for signals to the right of the black vertical lines (p-
value<0.05, one-sided t-test). Box plots to the right of each panel show the statistics of significant PP in-
crease ratios, where the central mark is the median, the edges of the box are the 25th and 75th percentiles
and the whiskers extend to the most extreme data points that are not outliers. Outliers are shown by red
crosses and are points whose distance to the box is larger than 1.5 times of the box height. P-values are
shown for comparing the average to 0 by one-sided t-tests. (a--c) Multiscale causalities improve the pre-
diction of spike trains. (a) PP increase ratio from baseline to multiscale. Signals from all 11 sessions with
multiscale connections are shown. (b) PP increase ratio from single-scale to multiscale. (c) PP increase
54
ratio from cross-scale to multiscale. (d--f) Multiscale causalities improve the prediction of field potentials.
(d) PP increase ratio from baseline to multiscale. Signals from all 11 sessions with multiscale connections
are shown. (e) PP increase ratio from single-scale to multiscale. (f) PP increase ratio from cross-scale to
multiscale.
contributed by spike-spike connections was on average 24.7% for the significant signals. This
result suggests that both spike-spike and field-spike causal connections are successfully identi-
fied, leading to improved prediction of spikes. Further, this result suggests that multiscale con-
nections (from field potentials to spikes) exist and their identification helps the prediction of
spike trains.
Interestingly, overall, on average across signals, cross-scale connections were more important
compared to single-scale connections in improving PP of spike trains (p-value=8e-20 by one-
sided Wilcoxon rank sum test). This result suggests that the unique contribution from field po-
tentials to spike train prediction is more than that from the spike trains.
3.3.1.2 Field potential prediction
We next consider field potentials and find that the identified multiscale connections also im-
prove the prediction of field potentials. Here we focus on the high-quality 10-channel sets for
easier visualization and validation and then confirm that our conclusions for all-channel analyses
are similar in Section 3.3.1.3. Similar to Figure 3-1a-c, Figure 3-1d-f show the PP improvements
for field potentials that had at least one field-field connection and one spike-field connection
identified in all 5 folds. Multiscale PP significantly improved compared with baseline PP for all
signals with multiscale connections (p-value<0.05 by one-sided t-test). We then asked whether
the improvement over baseline was due to single-scale or cross-scale causal connections being
identified. To answer this question, we found the normalized improvement of multiscale PP vs.
both single-scale PP (p-value = 2e-4, one-sided t-test) and cross-scale PP (p-value = 8e-37, one-
sided t-test) as in Figure 3-1e and f, respectively. We can see that when considering both spike-
55
field and field-field connections instead of just one kind of connection, 73.3% and 97.8% of the
PPs were significantly improved, respectively (p-value<0.05 by one-sided t-test). Figure 3-1f
shows that the normalized PP improvement contributed by single-scale (field-field) connections
was on average 88.4%, while Figure 3-1e shows that the normalized PP improvement contrib-
uted by cross-scale (spike-field) connections was on average 10.3%. This result suggests that
both spike-field and field-field causal connections are successfully identified, leading to im-
proved prediction of field potentials. Further, this result suggests that multiscale connections
(from spike to fields) exist and that is why their identification helps the prediction of field poten-
tials.
Comparing the single-scale and cross-scale contribution to PP's reveals that for field potential
prediction, single-scale causal connections are more important than cross-scale ones. On average,
the normalized PP improvement contributed by single-scale (field-field) connections (88.4%)
shown by Figure 3-1f is much larger than that by cross-scale (spike-field) connections (10.3%)
shown by Figure 3-1e (p-value 3e-16 by one-sided Wilcoxon rank sum test). This result shows
that the unique contribution to field potential prediction from field potentials is much larger than
that from spike trains.
3.3.1.3 Prediction considering all signals
We now consider all channels and study how much multiscale connections improve the pre-
diction of signals by considering all observed signals. The models are fitted using the first 80%
of samples and the PPs are computed using the last 20% of samples as test set. Similar to 10-
channel sets, we find that the identified multiscale causal connections improve the prediction of
both spikes and field potentials. For signals with multiscale connections, the mean of PP increase
ratios of spikes from baseline to multiscale model was 58% (p-value 0.004; one-sided t-test). The
56
mean of PP increase ratios of fields from baseline to multiscale model was 6.4% (p-value 2e-47;
one-sided t-test). Compared with the 10-channel sets, the improvement from baseline to mul-
tiscale is larger for all-channel sets because there are more connections being modeled in this lat-
ter case. Also, the associated averaged PP increase ratios across sessions from baseline, single-
scale or cross-scale to multiscale models are summarized in Table 3-1. Similar to 10-channel set
results, more PP improvement is contributed by field-spike or field-field connections than spike-
spike or spike-field connections. All analyses that follow are done by considering all channels
together.
Table 3-1 Mean and sem of averaged PP increase ratio across sessions from baseline to multiscale, and
from single-scale or cross-scale to multiscale models of all signals with multiscale connections. P-values
are computed by one-sided paired t-tests.
PP increase ratio mon-
key
mean sem p-value
Spike From Baseline to Multiscale:
(multiscale PP - baseline PP)/baseline PP
J 71.5% 37.3% 0.03
C 38.7% 3.4% 2e-19
From Single-scale to Multiscale:
(multiscale PP - single-scale PP)/(multiscale PP - baseline PP)
J 76.3% 5.0% 4e-31
C 58.3% 3.7% 4e-28
From Cross-scale to Multiscale:
(multiscale PP - cross-scale PP)/(multiscale PP - baseline PP)
J 13.9% 6.0% 0.01
C 26.2% 3.7% 1e-10
Field From Baseline to Multiscale:
(multiscale PP - baseline PP)/baseline PP
J 7.8% 1.0% 4e-8
C 6.2% 0.31% 2e-41
From Single-scale to Multiscale:
(multiscale PP - single-scale PP)/(multiscale PP - baseline PP)
J 2.7% 0.9% 3e-3
C 0.51% 0.16% 8e-4
From Cross-scale to Multiscale:
(multiscale PP - cross-scale PP)/(multiscale PP - baseline PP)
J 91.6% 1.0% 7e-28
C 97.6% 0.41% 5e-168
3.3.2 Latent firing rates predict field potentials better than binary spike trains
In addition to handling the case of large networks, one advantage of our multiscale causality
identification framework is that it allows our multiscale models to use either the latent spike fir-
ing rates or the observed spike events as field potential predictors. This can then allow us to test
the hypothesis that latent firing rates may be better predictors of field potentials as could be im-
plied from biophysical models [67]. We thus used our framework to test this hypothesis within
57
the NHP datasets. As a comparison, we trained an alternative multiscale model with the same
length of history. Figure 3-2 shows the multiscale PP difference between two models. We find
that using the latent firing rates predicts the field potentials better than using the observed binary
spike trains (p-value= 8.5e-49 for monkey J and 0.003 for monkey C, one-sided paired t-test).
This result indicates that the latent firing rates contain more information than binary spike trains
about field potentials. This result is consistent with the prediction in biophysical studies suggest-
ing firing rate can be an LFP proxy [67].
Figure 3-2 Our method using latent firing rates predicts field potentials better than using binary spike
trains. Panels (a) and (b) show the probability density function (PDF) of PP difference using firing rates
vs. using spike trains for monkeys J and C, respectively. PP differences greater than 0 are shown in red
and the rest are shown in blue. In most cases, PP differences are positive indicating that latent firing rates
are better predictors of field potentials overall. The PPs are computed for all field potentials. The p-values
are computed with one-sided paired t-test and indicate that PPs are larger when using latent firing rates
compared to using binary spike trains as predictors.
58
3.3.3 Properties of causal connections
3.3.3.1 Density of Connections and correlations to distance
We next examined the density of connections and whether there was a relationship between
the physical distance from a given signal and the density of connections from that signal. To do
so, we evenly cut the sorted distances into 50 groups. Density in each group is defined by the ra-
tio between the number of identified connections and all possible connections in that group. The
distance representing each group is defined as the average of lower and upper bound of the dis-
tances in the group. Figure 3-3 shows the results of the correlation analyses between the distance
of causal connections and their density for (a) field-field, (b) spike-field, (c) field-spike, and (d)
spike-spike connections.
Figure 3-3 Causal connection density from a signal is negatively correlated to physical distance from that
signal. Pearson's correlation between the physical distance (mm) from a signal and the corresponding den-
sity of connections from that signal is shown for (a) field-field (b) spike-field (c) field-spike and (d)
spike-spike connections in monkey J's dataset where we have 3-D coordinates of electrodes. There is a
significant negative correlation between distance and density of connections for all connection types.
59
We find that there was a significant negative correlation between distance and density for all
types of connections (p-value=2e-20, 2e-4, 5e-6 and 2e-6, respectively, Pearson's correlation
test). This result suggests that the density of the directed functional connectivity represented by
directed information is negatively correlated with physical distance for connections from field
potentials and also from spike trains.
Figure 3-4 Average individual-connection strength of four types of causal connections measured by di-
rected information (DI) with errorbar (mean ±sem) across sessions.
3.3.3.2 Strength of causality
We then examined the strength of causality between two signals. From section 3.3.3.1, the
density of field-field connections is much larger than that of other kinds of connections. This
may be due to field potentials being aggregate network-level signals that contain information
from groups of neurons nearby. Here we examined the average strength of individual connec-
tions measured by directed information across 11 sessions shown in figure 3-4. The strength of
field-spike connections was significantly larger than the strength of spike-spike connections (p-
value = 0.008, paired one-sided t-test) but the strength of field-field connections was not signifi-
cantly different from the strength of spike-field connections (p-value=0.46, two-sided t-test).
60
This latter result suggests that field-field causality improves the prediction power of field poten-
tials more than spike-field causality (see section 3.3.1.2) mainly because of the larger density of
field-field connections as the individual field-field and spike-field connection strengths are com-
parable.
3.3.3.3 Causal connections within vs. across regions
We also compared the density of causal connections across regions with that within regions.
The electrode arrays were placed in different cortical regions via an MRI guided stereotax [103].
The regions for the two NHPs are specified in Section 3.2.1. Figure 3-5 shows the density of
within-region and cross-region connections in each session. We can observe that the method
identifies that there are more within-region connections than cross-region connections for each
kind of connection (p-values: 9e-10, 0.1, 2e-9 and 1e-5, respectively; one-sided paired t-test).
Figure 3-5 Within-region and cross-region density of four kinds of connections in each session.
61
3.4 Discussions
Prior directed information measures of causality have been developed for a single scale of ac-
tivity, e.g. spikes [53], [57]–[59], [64]. Our method extends and demonstrates the directed infor-
mation measure for estimation of multiscale causality in mixed binary-continuous spike-field
signals. The multiscale directed information measure is a causal directional measure, and thus is
distinct from prior non-causal measures to assess spike-field connectivity including spike-field
coherence [18], [34], [41]–[43] or measures in encoding models that use the LFP features at a
given time to predict the spikes at the same time [9], [45] or to study correlations [44].
Our method is a model-based Granger-like causality method. There are some general limita-
tions for these methods [53], [105]. First, these methods depend on building statistical models of
the phenomena, and thus depend on the accuracy of these models. Indeed, that is why as our
main measure of how accurately causality is estimated in NHP data, we consider the model with
the identified causality network, and find this model's predictions for the measured spiking and
LFP signals. As these signals are observations from the latent causality network, their better pre-
diction can imply a better causality identification by our method. By comparing these predictions
for models with the identified single-scale causality vs. models with the identified multiscale
causality, we find that multiscale causality identification improves the prediction of both spike
trains and LFP signals. Second, a fundamental challenge for assessing causality is that factors
that are not measured in a given experiment cannot be considered or conditioned on in any cau-
sality method. Thus, the causality graph identified from the neural signals measured in an experi-
ment provides a measure of network interactions within the measured network only and without
conditioning on the unmeasured elements. Despite this challenge, model-based Granger-like cau-
sality is still a useful tool for analyzing functional connectivity statistically and could reveal
62
insights about the neurophysiology as shown in prior work [50], [53], for example for activity
during sleep or during visuomotor integration. Our work on modeling multiscale causality fills
an important gap for such analyses because it allows us to model all observed signals in a given
experiment for better causality assessment by including both field potentials and spike trains ra-
ther than just one or the other. As such the method enables including more factors/neural ele-
ments when assessing causality.
The method demonstrated here has several unique features. First, it allows multiscale causality
to be assessed for large networks, which is needed for modern electrophysiology datasets that
record from tens of electrodes. We demonstrated this capability by showing that the method can
compute multiscale connections across a large number of spike and LFP signals in NHP data and
in biophysical simulations. Second, our method allows the use of latent firing rates as field po-
tential predictors. The use of latent firing rates was motivated by biophysical suggestions that fir-
ing rates may be better LFP predictors compared to binary spike events [67]. Our results con-
firmed this biophysical conclusion. We found that in our NHP data, the estimated history of fir-
ing rates predicted the LFPs better than the history of binary spike trains. Also, our new mul-
tiscale causality test procedure based on the Wald test was more accurate for a given training
data length, and thus was data-efficient. This is key for assessing causality in high-dimensional
data recorded across large-scale brain networks. Finally, our method generates an encoding
model of spike-field network activity during behavior that is more predictive of neural data com-
pared with models that only use single-scale causality.
It is also important to note that by explicitly modeling a dynamic behavior as a covariate in the
spike model in Equation (16) and field model in Equation (21), the method can take into account
non-stationarities that are due to a changing behavioral signal that modulates the neural signals.
63
Having modeled the changing behavior, the method then assumes that spike and field potential
models are stationary within the typical time-lengths of a dataset. If the data time-length is much
longer, it may be that the models change for example due to plasticity. Future work can thus ex-
plore refitting the models intermittently or adaptively for more accurate causality assessment.
We found that the density of causal connections is negatively correlated with physical distance
for connections in the NHP motor cortical data. This is consistent with a prior study [9] which
has shown that the strength of non-causal correlations from LFP to spike trains drops over dis-
tance. It is also consistent with prior studies showing spike-LFP coherence drops over distance
[30], [106].
3.5 Summary
Our method reveals that multiscale spike-field network causality exists in NHP spike-field
motor cortical data during arm movements and can be identified; indeed, models that included
the identified multiscale causal connections better predicted both spike trains and field potentials
compared to models that just included either single-scale or cross-scale connections. Also, com-
pared to using the binary spike trains as predictors, our method better predicted the LFPs by
uniquely allowing for latent firing rates to be used as LFP predictors. These results show how
causality can be computed in multiscale binary-continuous data reflected in spike-field measure-
ments and how multiscale causality identification improves the modeling of spike-field network
activity.
64
Chapter 4: Conclusion
We developed a multiscale causality estimation measure to identify the causality graphs of
mixed discrete-continuous spike-field networks. We described spike-field activity with a likeli-
hood function comprised of point process GLMs for spikes and linear Gaussian models for
fields. We computed the directed information based on the estimated multiscale models with two
model estimation and selection algorithms. We devised statistical tests to assess causal interac-
tions. We validated the algorithms using extensive simulations and biophysical simulations. Our
method reveals that multiscale spike-field network causality exists in NHP spike-field motor cor-
tical data during arm movements and identified multiscale causal connections predicted both
spike trains and field potentials better compared to models that just included either single-scale
or cross-scale connections.
One future direction would be to explore whether the model selection procedure in this work
may be further improved using regularization methods. Regularization methods like ridge or
LASSO [107], [108] can help reduce over-fitting and can potentially be integrated into our
method. However, because of the two-step model fitting procedure in our method and the fact
that the distribution of regularized parameters within a regularization method are hard to find, it
would be hard to statistically test the significance of causal connections with such a regulariza-
tion procedure [109]. Nevertheless, a future direction could be to solve this challenge and assess
whether regularization can further improve our method of multiscale causality identification.
The multiscale spike-field causality graph identification method demonstrated here can also be
applied to other brain regions and behaviors to discover the multiscale causal interactions in
spike-field networks that drive these behaviors. One can also use this method to explore the de-
pendence of multiscale causality in a given brain region on behavioral and task context, or to
65
study how the causal network topology may change as a result of adaptation, learning or plastic-
ity [97], [110]–[113]. This method can also inform the design of future closed-loop neural con-
trol systems such as brain stimulation systems [2], [4], [5], [87], [88], [90]–[93], [95], [96],
[114]–[120]. For example, a major question in developing deep brain stimulation (DBS) thera-
pies for mental disorders such as treatment-resistant depression is which site within the brain
should be stimulated [2], [5], [116]. This method can provide one approach to identify candidate
sites by finding the brain site that has the strongest causal connection to the cortico-limbic re-
gions involved in mental disorders. This method can thus help develop future closed-loop DBS
systems for mental disorders that aim to causally modulate symptom-related neural activity using
the decoded symptom as feedback [2], [4], [5], [92], [95], [116], [117], [119]. More generally,
this method can facilitate the investigation of causal directed interactions in large-scale brain net-
works, whose findings can guide the development of more effective neurotechnologies.
66
Bibliography
[1] M. M. Shanechi, “Brain–Machine Interface Control Algorithms,” IEEE Trans Neural Syst Rehabil
Eng, vol. 25, no. 10, pp. 1725–1734, 2017.
[2] M. M. Shanechi, “Brain-machine interfaces from motor to mood,” Nat. Neurosceince, vol. 22, pp.
1554–1564, 2019.
[3] T. Liu, L. Ungar, and K. Kording, “Quantifying causality in data science with quasi-experiments,”
Nat. Comput. Sci., vol. 1, no. 1, pp. 24–32, 2021.
[4] Y. Yang et al., “Modelling and prediction of the dynamic responses of large-scale brain networks
during direct electrical stimulation,” Nat. Biomed. Eng., pp. 1–22, 2021.
[5] K. B. Hoang, I. R. Cassar, W. M. Grill, and D. A. Turner, “Biomarkers and Stimulation Algorithms
for Adaptive Brain Stimulation,” Front. Neurosci., vol. 11, p. 564, 2017, doi:
10.3389/fnins.2017.00564.
[6] K. J. Friston, “Functional and effective connectivity: a review,” Brain Connect., vol. 1, no. 1, pp. 13–
36, 2011.
[7] E. Schneidman, M. J. Berry, R. Segev, and W. Bialek, “Weak pairwise correlations imply strongly
correlated network states in a neural population,” Nature, vol. 440, no. 7087, pp. 1007–1012, 2006.
[8] J. W. Pillow et al., “Spatio-temporal correlations and visual signalling in a complete neuronal popula-
tion,” Nature, vol. 454, no. 7207, pp. 995–999, 2008.
[9] R. Bighamian, Y. T. Wong, B. Pesaran, and M. M. Shanechi, “Sparse model-based estimation of
functional dependence in high-dimensional field and spike multiscale networks,” J. Neural Eng.,
2019.
[10] T. Tu, N. Schneck, J. Muraskin, and P. Sajda, “Network configurations in the human brain reflect
choice bias during rapid face processing,” J. Neurosci., vol. 37, no. 50, pp. 12226–12237, 2017.
[11] M. Jamali, B. L. Grannan, E. Fedorenko, R. Saxe, R. Bá ez-Mendoza, and Z. M. Williams, “Single-
neuronal predictions of others’ beliefs in humans,” Nature, vol. 591, no. 7851, pp. 610–614, 2021.
[12] W. Truccolo, L. R. Hochberg, and J. P. Donoghue, “Collective dynamics in human and monkey sen-
sorimotor cortex: predicting single neuron spikes,” Nat. Neurosci., vol. 13, no. 1, p. 105, 2010.
[13] H. Abbaspourazad, M. Choudhury, Y. T. Wong, B. Pesaran, and M. M. Shanechi, “Multiscale low-
dimensional motor cortical state dynamics predict naturalistic reach-and-grasp behavior,” Nat. Com-
mun., vol. 12, no. 1, pp. 1–19, 2021.
[14] D. Susilaradeya, W. Xu, T. M. Hall, F. Galán, K. Alter, and A. Jackson, “Extrinsic and intrinsic dy-
namics in movement intermittency,” Elife, vol. 8, p. e40145, 2019.
[15] T. M. Hall, F. de Carvalho, and A. Jackson, “A common structure underlies low-frequency cortical
dynamics in movement, sleep, and sedation,” Neuron, vol. 83, no. 5, pp. 1185–1199, 2014.
67
[16] A. K. Bansal, W. Truccolo, C. E. Vargas-Irwin, and J. P. Donoghue, “Decoding 3D reach and grasp
from hybrid signals in motor and premotor cortices: spikes, multiunit activity, and local field poten-
tials,” J. Neurophysiol., vol. 107, no. 5, pp. 1337–1355, 2012.
[17] J.-M. Schoffelen, R. Oostenveld, and P. Fries, “Neuronal coherence as a mechanism of effective cor-
ticospinal interaction,” Science, vol. 308, no. 5718, pp. 111–113, 2005.
[18] B. Pesaran, J. S. Pezaris, M. Sahani, P. P. Mitra, and R. A. Andersen, “Temporal structure in neu-
ronal activity during working memory in macaque parietal cortex,” Nat. Neurosci., vol. 5, no. 8, p.
805, 2002.
[19] D. P. Nguyen, M. A. Wilson, E. N. Brown, and R. Barbieri, “Measuring instantaneous frequency of
local field potential oscillations using the Kalman smoother,” J. Neurosci. Methods, vol. 184, no. 2,
pp. 365–374, 2009.
[20] S. Perel et al., “Single-unit activity, threshold crossings, and local field potentials in motor cortex dif-
ferentially encode reach kinematics,” J. Neurophysiol., vol. 114, no. 3, pp. 1500–1512, 2015.
[21] S. D. Stavisky, J. C. Kao, P. Nuyujukian, S. I. Ryu, and K. V. Shenoy, “A high performing brain–
machine interface driven by low-frequency local field potentials alone and together with spikes,” J.
Neural Eng., vol. 12, no. 3, p. 036009, 2015.
[22] H.-L. Hsieh, Y. T. Wong, B. Pesaran, and M. M. Shanechi, “Multiscale modeling and decoding algo-
rithms for spike-field activity,” J. Neural Eng., vol. 16, no. 1, p. 016018, Dec. 2018.
[23] R. D. Flint, E. W. Lindberg, L. R. Jordan, L. E. Miller, and M. W. Slutzky, “Accurate decoding of
reaching movements from field potentials in the absence of spikes,” J. Neural Eng., vol. 9, no. 4, p.
046006, 2012.
[24] C. Mehring, J. Rickert, E. Vaadia, S. C. de Oliveira, A. Aertsen, and S. Rotter, “Inference of hand
movements from local field potentials in monkey motor cortex,” Nat. Neurosci., vol. 6, no. 12, pp.
1253–1254, 2003.
[25] P. Berens, G. A. Keliris, A. S. Ecker, N. K. Logothetis, and A. S. Tolias, “Comparing the feature se-
lectivity of the gamma-band of the local field potential and the underlying spiking activity in primate
visual cortex,” Front. Syst. Neurosci., vol. 2, p. 2, 2008.
[26] A. Belitski, S. Panzeri, C. Magri, N. K. Logothetis, and C. Kayser, “Sensory information in local
field potentials and spikes from visual and auditory cortices: time scales and frequency bands,” J.
Comput. Neurosci., vol. 29, no. 3, pp. 533–545, 2010.
[27] O. G. Sani, H. Abbaspourazad, Y. T. Wong, B. Pesaran, and M. M. Shanechi, “Modeling behavior-
ally relevant neural dynamics enabled by preferential subspace identification,” Nat. Neurosci., vol.
24, pp. 140–149, 2021.
[28] O. G. Sani, B. Pesaran, and M. M. Shanechi, “Where is all the nonlinearity: flexible nonlinear model-
ing of behaviorally relevant neural dynamics using recurrent neural networks,” bioRxiv, 2021, doi:
10.1101/2021.09.03.458628.
68
[29] G. T. Einevoll, C. Kayser, N. K. Logothetis, and S. Panzeri, “Modelling and analysis of local field
potentials for studying the function of cortical circuits,” Nat. Rev. Neurosci., vol. 14, no. 11, p. 770,
2013.
[30] S. Ray, “Challenges in the quantification and interpretation of spike-LFP relationships,” Curr Opin
Neurobiol, vol. 31, pp. 111–118, 2015.
[31] J. Viventi et al., “Flexible, foldable, actively multiplexed, high-density electrode array for mapping
brain activity in vivo,” Nat. Neurosci., vol. 14, no. 12, p. 1599, 2011.
[32] C.-H. Chiang et al., “Development of a neural interface for high-definition, long-term recording in
rodents and nonhuman primates,” Sci. Transl. Med., vol. 12, no. 538, 2020.
[33] Y. T. Wong, M. M. Fabiszak, Y. Novikov, N. D. Daw, and B. Pesaran, “Coherent neuronal ensem-
bles are rapidly recruited when making a look-reach decision,” Nat. Neurosci., vol. 19, no. 2, p. 327,
2016.
[34] T. Womelsdorf et al., “Modulation of neuronal interactions through neuronal synchronization,” sci-
ence, vol. 316, no. 5831, pp. 1609–1612, 2007.
[35] H.-L. Hsieh and M. M. Shanechi, “Multiscale brain-machine interface decoders,” in Engineering in
Medicine and Biology Society (EMBC), 2016 IEEE 38th Annual International Conference of the,
2016, pp. 6361–6364.
[36] L. Muller, F. Chavane, J. Reynolds, and T. J. Sejnowski, “Cortical travelling waves: mechanisms and
computational principles,” Nat. Rev. Neurosci., 2018.
[37] W. Truccolo, U. T. Eden, M. R. Fellows, J. P. Donoghue, and E. N. Brown, “A point process frame-
work for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate
effects,” J Neurophysiol, vol. 93, pp. 1074–1089, 2005.
[38] B. Pesaran et al., “Investigating large-scale brain dynamics using field potential recordings: analysis
and interpretation,” Nat. Neurosci., vol. 21, no. 7, pp. 903–919, 2018.
[39] H.-L. Hsieh and M. M. Shanechi, “Optimizing the learning rate for adaptive estimation of neural en-
coding models,” PLOS Comput. Biol., vol. 14, no. 5, pp. 1–34, May 2018, doi: 10.1371/jour-
nal.pcbi.1006168.
[40] H. Abbaspourazad, H. Hsieh, and M. M. Shanechi, “A Multiscale Dynamical Modeling and Identifi-
cation Framework for Spike-Field Activity,” IEEE Trans. Neural Syst. Rehabil. Eng., pp. 1–1, 2019,
doi: 10.1109/TNSRE.2019.2913218.
[41] G. G. Gregoriou, S. J. Gotts, H. Zhou, and R. Desimone, “High-frequency, long-range coupling be-
tween prefrontal and visual cortex during attention,” science, vol. 324, no. 5931, pp. 1207–1210,
2009.
[42] K. Benchenane et al., “Coherent theta oscillations and reorganization of spike timing in the hippo-
campal-prefrontal network upon learning,” Neuron, vol. 66, no. 6, pp. 921–936, 2010.
69
[43] M. van Wingerden, M. Vinck, J. V. Lankelma, and C. M. Pennartz, “Learning-associated gamma-
band phase-locking of action–outcome selective neurons in orbitofrontal cortex,” J. Neurosci., vol.
30, no. 30, pp. 10025–10038, 2010.
[44] J. R. Manning, J. Jacobs, I. Fried, and M. J. Kahana, “Broadband shifts in local field potential power
spectra are correlated with single-neuron spiking in humans,” J. Neurosci., vol. 29, no. 43, pp.
13613–13620, 2009.
[45] M. E. Rule, C. Vargas-Irwin, J. P. Donoghue, and W. Truccolo, “Contribution of LFP dynamics to
single-neuron spiking variability in motor cortex during movement execution,” Front. Syst. Neuro-
sci., vol. 9, p. 89, 2015.
[46] M. Jarvis and P. Mitra, “Sampling properties of the spectrum and coherency of sequences of action
potentials,” Neural Comput., vol. 13, no. 4, pp. 717–749, 2001.
[47] C. W. Granger, “Investigating causal relations by econometric models and cross-spectral methods,”
Econom. J. Econom. Soc., pp. 424–438, 1969.
[48] A. Brovelli, M. Ding, A. Ledberg, Y. Chen, R. Nakamura, and S. L. Bressler, “Beta oscillations in a
large-scale sensorimotor cortical network: directional influences revealed by Granger causality,”
Proc. Natl. Acad. Sci. U. S. A., vol. 101, no. 26, pp. 9849–9854, 2004.
[49] M. G. Philiastides and P. Sajda, “Causal influences in the human brain during face discrimination: a
short-window directed transfer function approach,” IEEE Trans. Biomed. Eng., vol. 53, no. 12, pp.
2602–2605, 2006.
[50] M. Kamiński, M. Ding, W. A. Truccolo, and S. L. Bressler, “Evaluating causal relations in neural
systems: Granger causality, directed transfer function and statistical assessment of significance,”
Biol. Cybern., vol. 85, no. 2, pp. 145–157, 2001.
[51] M. Krumin and S. Shoham, “Multivariate autoregressive modeling and Granger causality analysis of
multiple spike trains,” Comput. Intell. Neurosci., vol. 2010, p. 10, 2010.
[52] A. G. Nedungadi, G. Rangarajan, N. Jain, and M. Ding, “Analyzing multiple spike trains with non-
parametric granger causality,” J. Comput. Neurosci., vol. 27, no. 1, pp. 55–64, 2009.
[53] S. Kim, D. Putrino, S. Ghosh, and E. N. Brown, “A Granger Causality Measure for Point Process
Models of Ensemble Neural Spiking Activity,” PLoS Comput. Biol., vol. 7, no. 3, p. e1001110, 2011.
[54] K. Sameshima and L. A. Baccalá, “Using partial directed coherence to describe neuronal ensemble
interactions,” J. Neurosci. Methods, vol. 94, no. 1, pp. 93–103, 1999.
[55] L. Zhu, Y.-C. Lai, F. C. Hoppensteadt, and J. He, “Probing changes in neural interaction during adap-
tation,” Neural Comput., vol. 15, no. 10, pp. 2359–2377, 2003.
[56] E. Pereda, R. Q. Quiroga, and J. Bhattacharya, “Nonlinear multivariate analysis of neurophysiologi-
cal signals,” Prog. Neurobiol., vol. 77, no. 1–2, pp. 1–37, 2005.
[57] C. Quinn, T. Coleman, N. Kiyavash, and N. Hatsopoulos, “Estimating the directed information to in-
fer causal relationships in ensemble neural spike train recordings,” J. Comput. Neurosci., vol. 30, no.
1, pp. 17–44, 2011.
70
[58] K. So, A. C. Koralek, K. Ganguly, M. C. Gastpar, and J. M. Carmena, “Assessing functional connec-
tivity of neural ensembles using directed information,” J. Neural Eng., vol. 9, no. 2, p. 13, 2012.
[59] Z. Cai, C. L. Neveu, D. A. Baxter, J. H. Byrne, and B. Aazhang, “Inferring neuronal network func-
tional connectivity with directed information,” J Neurophysiol, vol. 118, no. 2, pp. 1055–1069, 2017.
[60] A. Sheikhattar et al., “Extracting neuronal functional network dynamics via adaptive Granger causal-
ity analysis,” Proc. Natl. Acad. Sci., vol. 115, no. 17, pp. E3869–E3878, 2018.
[61] Y. J. Liew, A. Pala, C. J. Whitmire, W. A. Stoy, C. R. Forest, and G. B. Stanley, “Inferring Mon-
osynaptic Connectivity Across Brain Structures In-Vivo,” bioRxiv, 2020.
[62] T. Schreiber, “Measuring information transfer,” Phys. Rev. Lett., vol. 85, no. 2, p. 461, 2000.
[63] L. Citi, D. Ba, E. N. Brown, and R. Barbieri, “Likelihood methods for point processes with refractori-
ness,” Neural Comput., vol. 26, no. 2, pp. 237–263, 2014.
[64] C. J. Quinn, N. Kiyavash, and T. P. Coleman, “Directed Information Graphs,” IEEE Trans Info The-
ory, vol. 61, no. 12, pp. 6887–6909, 2015.
[65] X. Gong, W. Li, and H. Liang, “Spike-field Granger causality for hybrid neural data analysis,” J.
Neurophysiol., vol. 122, no. 2, pp. 809–822, 2019.
[66] M. Hu, M. Li, W. Li, and H. Liang, “Joint analysis of spikes and local field potentials using copula,”
NeuroImage, vol. 133, pp. 457–467, 2016.
[67] A. Mazzoni, H. Lindé n, H. Cuntz, A. Lansner, S. Panzeri, and G. T. Einevoll, “Computing the Local
Field Potential (LFP) from Integrate-and-Fire Network Models,” PLoS Comput Biol, vol. 11, no. 12,
p. e1004584, 2015.
[68] E. N. Brown, L. M. Frank, D. Tang, M. C. Quirk, and M. A. Wilson, “A statistical paradigm for neu-
ral spike train decoding applied to position prediction from ensemble firing patterns of rat hippocam-
pal place cells,” J Neurosci, vol. 18, no. 18, pp. 7411–7425, Sep. 1998.
[69] R. E. Kass and V. Ventura, “A Spike-Train Probability Model,” Neural Comput., vol. 13, no. 8, pp.
1713–1720, 2001.
[70] R. Agarwal, Z. Chen, F. Kloosterman, M. A. Wilson, and S. V. Sarma, “A novel nonparametric ap-
proach for neural encoding and decoding models of multimodal receptive fields,” Neural Comput.,
vol. 28, no. 7, pp. 1356–1387, 2016.
[71] H. Abbaspourazad and M. M. Shanechi, “An unsupervised learning algorithm for multiscale neural
activity,” in Proc. IEEE EMBC, 2017, pp. 201–204.
[72] M. M. Shanechi, A. L. Orsborn, and J. M. Carmena, “Robust Brain-Machine Interface Design Using
Optimal Feedback Control Modeling and Adaptive Point Process Filtering,” PLoS Comput Biol, vol.
12, no. 4, p. e1004730, 2016.
[73] H. Akaike, “A new look at the statistical model identification,” IEEE Trans. Autom. Control, vol. 19,
no. 6, pp. 716–723, 1974.
71
[74] J. Neyman and E. S. Pearson, “IX. On the problem of the most efficient tests of statistical hypothe-
ses,” Phil Trans R Soc Lond A, vol. 231, no. 694–706, pp. 289–337, 1933.
[75] S. S. Wilks, “Weighting systems for linear functions of correlated variables when there is no depend-
ent variable,” Psychometrika, vol. 3, no. 1, pp. 23–40, 1938.
[76] C. Nadeau and Y. Bengio, “Inference for the Generalization Error,” Mach. Learn., vol. 52, no. 3, pp.
239–281, 2003.
[77] W. K. Newey and D. McFadden, “Large sample estimation and hypothesis testing,” Handb. Econom.,
vol. 4, pp. 2111–2245, 1994.
[78] Y. Benjamini and Y. Hochberg, “Controlling the false discovery rate: a practical and powerful ap-
proach to multiple testing,” J. R. Stat. Soc. Ser. B Methodol., pp. 289–300, 1995.
[79] D. M. Powers, “Evaluation: from precision, recall and F-measure to ROC, informedness, markedness
and correlation,” J. Mach. Learn. Technol., vol. 2, no. 1, pp. 37–63, 2011.
[80] E. Bullmore and O. Sporns, “Complex brain networks: graph theoretical analysis of structural and
functional systems,” Nat. Rev. Neurosci., vol. 10, no. 3, p. 186, 2009.
[81] C. J. Stam and J. C. Reijneveld, “Graph theoretical analysis of complex networks in the brain,” Non-
linear Biomed Phys, vol. 1, no. 1, p. 3, 2007.
[82] D. J. Watts and S. H. Strogatz, “Collective dynamics of small-world networks,” nature, vol. 393, no.
6684, p. 440, 1998.
[83] J. P. Cunningham, P. Nuyujukian, V. Gilja, C. A. Chestek, S. I. Ryu, and K. V. Shenoy, “A closed-
loop human simulator for investigating the role of feedback control in brain-machine interfaces,” J
Neurophysiol, vol. 105, no. 4, pp. 1932–1949, 2011.
[84] M. M. Shanechi, G. W. Wornell, Z. M. Williams, and E. N. Brown, “Feedback-controlled parallel
point process filter for estimation of goal-directed movements from neural signals,” IEEE Trans Neu-
ral Syst Rehabil Eng, vol. 21, no. 1, pp. 129–140, 2013.
[85] S. Gowda, A. L. Orsborn, S. A. Overduin, H. G. Moorman, and J. M. Carmena, “Designing dynam-
ical properties of brain-machine interfaces to optimize task-specific performance,” IEEE Trans Neu-
ral Syst Rehabil Eng, vol. 22, no. 5, pp. 911–920, 2014.
[86] M. M. Shanechi, J. J. Chemali, M. Liberman, K. Solt, and E. N. Brown, “A brain-machine interface
for control of medically-induced coma,” PLoS Comput Biol, vol. 9, no. 10, p. e1003284, 2013.
[87] Y. Yang and M. M. Shanechi, “An adaptive and generalizable closed-loop system for control of med-
ically induced coma and other states of anesthesia,” J Neural Eng, vol. 13, no. 6, p. 066019, 2016.
[88] D. Ehrens, D. Sritharan, and S. V. Sarma, “Closed-loop control of a fragile network: application to
seizure-like dynamics of an epilepsy model,” Front Neurosci, vol. 9, p. 58, 2015.
[89] Y. Yang and M. M. Shanechi, “A framework for identification of brain network dynamics using a
novel binary noise modulated electrical stimulation pattern,” in Conf. Proc. IEEE Eng. Med. Biol.
Soc., 2015, pp. 2087–2090.
72
[90] I. Graat, M. Figee, and D. Denys, “The application of deep brain stimulation in the treatment of psy-
chiatric disorders,” Int Rev Psychiatry, vol. 29, no. 2, pp. 178–190, 2017.
[91] T. Wichmann and M. R. DeLong, “Deep brain stimulation for movement disorders of basal ganglia
origin: restoring function or functionality?,” Neurotherapeutics, vol. 13, no. 2, pp. 264–283, 2016.
[92] O. G. Sani, Y. Yang, M. B. Lee, H. E. Dawes, E. F. Chang, and M. M. Shanechi, “Mood variations
decoded from multi-site intracranial human brain activity,” Nat. Biotechnol., vol. 36, pp. 954–961,
Sep. 2018.
[93] J. P. Newman, M. Fong, D. C. Millard, C. J. Whitmire, G. B. Stanley, and S. M. Potter, “Optogenetic
feedback control of neural activity,” Elife, vol. 4, p. e07192, 2015.
[94] Z. M. Williams, “Good vibrations with deep brain stimulation,” Nat Neurosci, vol. 18, no. 5, pp.
618–619, 2015.
[95] K. A. Moxon and G. Foffani, “Brain-machine interfaces beyond neuroprosthetics,” Neuron, vol. 86,
no. 1, pp. 55–67, 2015.
[96] D. C. Millard, Q. Wang, C. A. Gollnick, and G. B. Stanley, “System identification of the nonlinear
dynamics in the thalamocortical circuit in response to patterned thalamic microstimulation in vivo,”
J. Neural Eng., vol. 10, no. 6, p. 066011, 2013.
[97] P. T. Sadtler et al., “Neural constraints on learning,” Nature, vol. 512, no. 7515, pp. 423–426, 2014.
[98] H. S. Mayberg, “Modulating dysfunctional limbic-cortical circuits in depression: towards develop-
ment of brain-based algorithms for diagnosis and optimised treatment,” Br Med Bull, vol. 65, no. 1,
pp. 193–207, 2003.
[99] C. Wang and M. M. Shanechi, “Estimating Multiscale Direct Causality Graphs in Neural Spike-Field
Networks,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 27, no. 5, pp. 857–866, 2019.
[100] R. J. Tomsett et al., “Virtual Electrode Recording Tool for EXtracellular potentials (VERTEX):
comparing multi-electrode recordings from simulated and biological mammalian cortical tissue,”
Brain Struct. Funct., vol. 220, no. 4, pp. 2333–2353, 2015.
[101] H. Pettersen, A. M. Dale, and G. T. Einevoll, “Extracellular spikes and current-source density.
Handbook of neural activity measurements,. Romain Brette and A. Destexhe.” Cambridge University
Press, Cambridge, UK, 2012.
[102] R. Brette and W. Gerstner, “Adaptive exponential integrate-and-fire model as an effective de-
scription of neuronal activity,” J. Neurophysiol., vol. 94, no. 5, pp. 3637–3642, 2005.
[103] Y. T. Wong, D. Putrino, A. Weiss, and B. Pesaran, “Utilizing movement synergies to improve
decoding performance for a brain machine interface,” in 2013 35th Annual International Conference
of the IEEE Engineering in Medicine and Biology Society (EMBC), 2013, pp. 289–292.
[104] J. S. Armstrong and F. Collopy, “Error measures for generalizing about forecasting methods: Em-
pirical comparisons,” Int. J. Forecast., vol. 8, no. 1, pp. 69–80, 1992.
73
[105] M. Maziarz, “A review of the Granger-causality fallacy,” J. Philos. Econ. Reflect. Econ. Soc. Is-
sues, vol. 8, no. 2, pp. 86–105, 2015.
[106] S. Ray and J. H. Maunsell, “Network rhythms influence the relationship between spike-triggered
local field potential and functional connectivity,” J. Neurosci., vol. 31, no. 35, pp. 12674–12682,
2011.
[107] A. E. Hoerl and R. W. Kennard, “Ridge regression: Biased estimation for nonorthogonal prob-
lems,” Technometrics, vol. 12, no. 1, pp. 55–67, 1970.
[108] H. Zou, “The adaptive lasso and its oracle properties,” J. Am. Stat. Assoc., vol. 101, no. 476, pp.
1418–1429, 2006.
[109] A. Buja and L. Brown, “Discussion:" A significance test for the lasso",” Ann. Stat., vol. 42, no. 2,
pp. 509–517, 2014.
[110] A. M. Green and J. F. Kalaska, “Learning to move machines with the mind,” Trends Neurosci.,
vol. 34, no. 2, pp. 61–75, 2011, doi: https://doi.org/10.1016/j.tins.2010.11.003.
[111] A. L. Orsborn and B. Pesaran, “Parsing learning in networks using brain–machine interfaces,”
Curr. Opin. Neurobiol., vol. 46, pp. 76–83, 2017.
[112] Y. Yang, P. Ahmadipour, and M. M. Shanechi, “Adaptive latent state modeling of brain network
dynamics with real-time learning rate optimization,” J. Neural Eng., 2020.
[113] P. Ahmadipour, Y. Yang, E. F. Chang, and M. M. Shanechi, “Adaptive tracking of human ECoG
network dynamics,” J. Neural Eng., 2020, [Online]. Available: http://iopscience.iop.org/arti-
cle/10.1088/1741-2552/abae42
[114] M. M. Shanechi, R. C. Hu, M. Powers, G. W. Wornell, E. N. Brown, and Z. M. Williams, “Neu-
ral population partitioning and a concurrent brain-machine interface for sequential motor function,”
Nat. Neurosci., vol. 15, no. 12, p. 1715, 2012.
[115] N. Sadras, B. Pesaran, and M. M. Shanechi, “A point-process matched filter for event detection
and decoding from population spike trains,” J Neural Eng, vol. 16, no. 6, p. 066016, 2019, doi:
10.1088/1741-2552/ab3dbc.
[116] A. S. Widge et al., “Treating refractory mental illness with closed-loop brain stimulation: pro-
gress towards a patient-specific transdiagnostic approach,” Exp. Neurol., vol. 287, pp. 461–472,
2017.
[117] Y. Yang, A. T. Connolly, and M. M. Shanechi, “A control-theoretic system identification frame-
work and a real-time closed-loop clinical simulation testbed for electrical brain stimulation,” J Neural
Eng, vol. 15, no. 6, p. 066007, 2018.
[118] Y. Yang et al., “Developing a personalized closed-loop controller of medically-induced coma in a
rodent model,” J Neural Eng, vol. 16, no. 3, p. 036022, 2019.
[119] Y. Yang, O. G. Sani, E. F. Chang, and M. M. Shanechi, “Dynamic network modeling and dimen-
sionality reduction for human ECoG activity,” J Neural Eng, vol. 16, no. 5, p. 056014, 2019.
74
[120] L. Citi, R. Poli, C. Cinel, and F. Sepulveda, “P300-based BCI mouse with genetically-optimized
analogue control,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 16, no. 1, pp. 51–61, 2008.
Abstract (if available)
Abstract
Behavior is encoded across multiple spatiotemporal scales of brain activity, measured with spike trains from neurons and larger-scale field potential signals that reflect network activity. Thus, it is important to identify and model causal interactions not only at a single scale of activity, but also across multiple scales, i.e., between spike trains and field potential signals. Standard causality measures are not directly applicable here because spike trains are binary-valued but field potentials are continuous-valued and have slower timescales. In this thesis, we develop novel computational methods to identify multiscale neural causality during behavior, evaluate these methods on neural datasets, and study whether the prediction of neural signals can be improved beyond what is possible with single-scale causality.
Our methods compute a multiscale model-based causality measure based on directed information. To compute multiscale causality, we learn point-process generalized linear models that predict the binary spike events at a given time based on the history of both spike trains and field potential signals. We also learn linear Gaussian models that predict the field potential signals at a given time based on their own history as well as either the history of observable binary spike events or that of latent continuous firing rates. We first validate the method with extensive Monte Carlo simulations and then evaluate its success both in realistic biophysical spike-field simulations and in non-human primate (NHP) motor cortical datasets during movement. We show that our method reveals the true multiscale causality network structure not only in Monte Carlo simulations but also in biophysical simulations despite the presence of model mismatch. Further, we find that the multiscale models in the NHP dataset predict both spike trains and field potential signals better compared to just modeling single-scale or cross-scale causalities, thus suggesting the successful discovery of multiscale causal relationships. Finally, we show that the identified multiscale interaction are negatively correlated with physical distance between signals and are stronger within a given brain region compared to across brain regions. These novel multiscale causality identification and modeling methods can reveal causal interactions across different spatiotemporal scales of brain activity to inform basic science investigations and neurotechnologies.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Geometric and dynamical modeling of multiscale neural population activity
PDF
Switching dynamical systems with Poisson and multiscale observations with applications to neural population activity
PDF
Dynamical representation learning for multiscale brain activity
PDF
Detection and decoding of cognitive states from neural activity to enable a performance-improving brain-computer interface
PDF
Decoding memory from spatio-temporal patterns of neuronal spikes
PDF
Causality and consistency in electrophysiological signals
PDF
Nonlinear dynamical modeling of single neurons and its application to analysis of long-term potentiation (LTP)
PDF
Dealing with unknown unknowns
PDF
Algorithms and frameworks for generating neural network models addressing energy-efficiency, robustness, and privacy
PDF
Dynamic neuronal encoding in neuromorphic circuits
PDF
Emulating variability in the behavior of artificial central neurons
PDF
Intelligent urban freight transportation
PDF
Adaptive octree meshing of the extracellular space in neural simulations
PDF
Bidirectional neural interfaces for neuroprosthetics
PDF
Hardware-software codesign for accelerating graph neural networks on FPGA
PDF
A probabilistic model predicting retinal ganglion cell responses to natural stimuli
PDF
A million-plus neuron model of the hippocampal dentate gyrus: role of topography, inhibitory interneurons, and excitatory associational circuitry in determining spatio-temporal dynamics of granul...
PDF
Nonlinear modeling of causal interrelationships in neuronal ensembles: an application to the rat hippocampus
PDF
Functional connectivity analysis and network identification in the human brain
PDF
Stochastic and multiscale models for urban and natural ecology
Asset Metadata
Creator
Wang, Chuanmeizhi
(author)
Core Title
Multiscale spike-field network causality identification
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Degree Conferral Date
2022-05
Publication Date
04/11/2022
Defense Date
03/08/2022
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
causality,field potentials,LFP,multiscale,neural encoding models,OAI-PMH Harvest,spike trains
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Shanechi, Maryam M. (
committee chair
), Bogdan, Paul (
committee member
), Song, Dong (
committee member
)
Creator Email
chuanmew@usc.edu,wzh1002011@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC110937263
Unique identifier
UC110937263
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Wang, Chuanmeizhi
Type
texts
Source
20220412-usctheses-batch-922
(batch),
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu
Tags
causality
field potentials
LFP
multiscale
neural encoding models
spike trains