DECODING MEMORY FROM SPATIO-TEMPORAL PATTERNS OF NEURONAL SPIKES
by
Xiwei She
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BIOMEDICAL ENGINEERING)
August 2022
Copyright 2022 Xiwei She
Acknowledgments
First and foremost, I would like to express my sincere gratitude to my Ph.D. mentor, Dr. Dong Song, for
providing all his help and guidance during these past five years. I have learned many skills and much of the spirit of research from Dr. Song on my way to becoming a qualified Ph.D. and a good researcher.
In addition, I would like to give special thanks to my dissertation committee members. To Dr.
Theodore Berger for his helpful advice and lectures. To Dr. Charles Liu for his clinical support. To
Dr. Vasilis Marmarelis and Dr. Maryam Shanechi for providing many meaningful suggestions during my
qualifying exam and dissertation.
I also would like to thank my previous and current lab members, research collaborators, and fellow
graduate students for all the discussions and fun times over these years. They have made my Ph.D. life
a wonderful period that I will remember forever.
Finally, I would like to thank and dedicate this thesis to my parents. Their love, continuous support
and understanding make me fearless in chasing my goals. I would not have made it this far without them.
Table of Contents

Acknowledgments
List Of Tables
List Of Figures
Abstract
Chapter 1: Introduction
1.1 Motivations
1.2 Decoding of Spikes - Background
1.3 Decoding of Spikes - Challenges
1.4 Neural Prostheses Applications
Chapter 2: Memory Decoding Model
2.1 Introduction
2.2 Methods
2.2.1 Spike Decoding Model
2.2.2 Regularized General Linear Models
2.2.3 Multi-Resolution Feature Extraction
2.2.4 Sparse Classification Functional Matrix
2.2.5 Bagging-based Ensemble Classifier
2.2.6 Nested Cross-Validation
2.2.7 The Double-layer Multi-Resolution Spike Decoding Model
2.2.8 Model Evaluation
2.3 Results
2.3.1 Decoding Single-Neuron Patterns
2.3.2 Decoding Two-Neuron Patterns
2.3.3 Decoding Population Patterns
2.3.4 Factors Influencing Classification Performance
2.4 Summary
Chapter 3: Experimental Memory Decoding with Rodent and Human Hippocampal Recordings
3.1 Introduction
3.2 Methods
3.2.1 Experimental Procedures
3.2.1.1 Delayed Nonmatch-to-Sample Task in Rats
3.2.1.2 Delayed Match-to-Sample Task in Human Patients
3.2.2 Human Study Paradigm
3.2.3 Placement of Electrodes
3.2.4 Hardware Environments
3.2.5 Input-Output Data Pre-processing
3.2.6 Design of Testing Cases
3.3 Results
3.3.1 Decoding Binary Spatial Location from Hippocampal Spikes in Rats
3.3.2 Decoding Visual Memory Categories from Hippocampal Spikes in Humans
3.3.2.1 Human Subjects
3.3.2.2 Selection Criteria of Human Subjects for Modeling
3.3.2.3 Performances of Human Memory Decoding Models
3.3.2.4 Meta-Analysis of Human Decoding Results
Chapter 4: Accelerating Model Estimation with Parallel Computing for Testing Hippocampal Memory Prostheses in Humans
4.1 Introduction
4.2 Methods
4.2.1 MIMO Estimation Procedures
4.2.2 DLMR Estimation Procedures
4.2.3 Parallel Computing Strategies
4.2.4 Materials and Equipment
4.3 Results
4.3.1 Accelerating MIMO model estimation
4.3.2 Accelerating DLMR model estimation
4.4 Summary
Chapter 5: Discussion and Future Works
5.1 Future Development of Memory Decoding Model
5.1.1 Discussion
5.1.1.1 Contribution of the Bagging Procedure
5.1.1.2 Robustness of SCFM Patterns
5.1.1.3 Comparison with Other Baseline Decoders
5.1.1.4 Resemblance to Artificial Neural Networks
5.1.2 Future Directions
5.2 Future Development of Parallel Computing Models
5.2.1 Discussion
5.2.2 Future Directions
References
List Of Tables

2.1 Table of Notation
2.2 PIF Parameters Used in Simulation Studies
3.1 Model Performances of Rodent Data
3.2 Human Subjects
List Of Figures

1.1 Regression Models vs. Classification Models
1.2 MIMO and DLMR Model for Hippocampal Prostheses
2.1 Feature Extraction with B-splines
2.2 Algorithm 1: Bagging
2.3 Nested Cross Validation
2.4 Algorithm 2: Stacking
2.5 The Double-Layer Multi-Resolution Decoding Model
2.6 Decoding Single-Neuron Patterns
2.7 Decoding Two-Neuron Patterns
2.8 Decoding Population Patterns
2.9 Influence of Key Experimental Parameters on Classification Performance
3.1 Behavioral Tasks
3.2 Experimental Paradigm
3.3 Example Images Shown in DMS Tasks
3.4 Design of Testing Cases
3.5 Rodent Decoding Example
3.6 Human Decoding Performances (Normal Decoding Group)
3.7 Human Decoding Performances (Control Group)
3.8 Human Decoding Example 1
3.9 Human Decoding Example 2
3.10 Modeling Performances Comparison
3.11 Sparseness of Spatio-temporal Regions
3.12 Contribution of CA3 and CA1 Individual Neurons
3.13 Contribution of CA3 and CA1 Region
3.14 Distribution of Memory Category Information in CA3 and CA1 Region
3.15 Contributions of Temporal Resolutions
4.1 Concept Structure of Parallel Computing Models
4.2 Flowcharts of the Model Estimation Procedure
4.3 Parallel Computing Schemes - MIMO
4.4 Parallel Computing Schemes - DLMR
4.5 Illustration of CARC
4.6 Accelerating MIMO Model Estimation
4.7 Accelerating DLMR Model Estimation
5.1 Contribution of the Bagging Procedure
5.2 Robustness of SCFM patterns
5.3 Performances Comparisons of Multiple Decoders
Abstract
To study how memories are encoded in neural signals, we have built a double-layer multiple temporal-
resolution (multi-resolution) model for decoding specific memory information from single-trial spatio-
temporal patterns of spikes. The model takes hippocampal spiking activities as input signals and memory-
related variables as output signals. It provides a powerful tool for better understanding how memory
content is encoded in hippocampal spikes.
The double-layer multi-resolution (DLMR) decoding model represents the input-output mapping with
a two-layer structure. In the first layer, to solve the under-determined problem caused by the small sample
size and the very high dimensional input and output signals, L1-regularized B-spline logistic regression
classifiers are used to reduce dimensionality and yield sparse model estimation. A wide range of temporal
resolutions of neural signals is included by using a large number of classifiers with different numbers
of B-spline knots. Each classifier serves as a base learner to classify spatio-temporal patterns into the
probability of the memory label with a given temporal resolution. A bootstrap aggregating strategy is
used to further reduce estimation variances of these classifiers. In the second layer, another L1-regularized
logistic classifier takes outputs of first-layer classifiers as inputs to generate the final output prediction.
This classifier serves as a meta learner that combines multiple temporal resolutions to classify the spatio-
temporal patterns of spikes into final output memory labels.
For the development of hippocampal memory prostheses, a human hippocampal closed-loop analysis
system was built at both Keck Hospital of the University of Southern California (USC) and Rancho Los
Amigos National Rehabilitation Center. The system can successfully complete the entire closed-loop
paradigm integrating recording, modeling and stimulation. Neuronal signals from 43 subjects have been
recorded and corresponding DLMR decoding models have been built. In addition, meta-analyses were
performed on modeling results to investigate the memory coding characteristics of the hippocampus.
Moreover, to accelerate model estimation procedures, we have developed parallelization strategies to
decompose the overall model estimation task into multiple independent sub-tasks. These sub-tasks are
then accomplished in parallel on different computer nodes provided by the USC Center for Advanced
Research Computing (CARC).
Results of simulation studies show that the DLMR decoding model can capture neural patterns
with various temporal resolutions. Results of experimental data decoding prove that this method can
effectively avoid overfitting and yield accurate prediction of memory-related information. The double-
layer multi-resolution classifier consistently outperforms the best single-layer single-resolution classifier by
extracting and utilizing multi-resolution spatio-temporal features of spike patterns in the classification.
Moreover, the DLMR decoding model can be used not only to predict memory information but also to provide signature functions that represent the spatio-temporal characteristics of spike patterns most relevant to the memory. Such signature functions can potentially be used to design spatio-temporal stimulation patterns for eliciting specific memories in the hippocampus. Therefore, the proposed DLMR decoding model has important implications for the development of hippocampal memory prostheses.
Chapter 1
Introduction
1.1 Motivations
Spatio-temporal patterns of spikes are widely regarded as the means by which neurons in the brain transmit information. Spikes occurring in specific brain areas are usually associated with external stimuli or behaviors. For example, neurons in the motor cortex are widely found to be related to movements [1, 2]; neurons located in the primary visual cortex are sensitive to spatial information in vision [3, 4]; and neurons in the hippocampus have long been known to be responsible for the formation of memories [5, 6]. Therefore, understanding how such information is encoded in the brain and how to decode information from spatio-temporal patterns of spikes are essential problems in the fields of neuroscience and neural engineering. Here we focus on developing computational machine learning models to decode behavior-related memory information from hippocampal spatio-temporal patterns of spikes. With the help of the proposed models, we can decode specific memory information from single-trial spatio-temporal patterns of spikes, investigate how memories are encoded in the hippocampus, and gain insight into the relationship of neural signals to external behaviors.
1.2 Decoding of Spikes - Background
Decoding information from spikes allows us to 1) understand the relationship between neural activities and external stimuli or behaviors [7, 8, 9], 2) identify the functional connectivity between brain regions [10, 11], and 3) develop neural code-based prostheses for restoring sensory, motor, and cognitive functions [12, 13, 14, 15, 16]. Over the past decade, more attention has been paid to spike decoding problems using computational machine learning models. Viewed through the lens of machine learning, spike decoding can be formulated computationally as building an input-output model that takes spike trains x as input and the signal to be decoded y as output.
A large literature now exists on models for solving such a spike decoding problem. Based on the assumptions these models make, they can be roughly classified into two groups: generative models and discriminative models. Generative models are based on Bayes' rule, in which the prior of the signal y, i.e., P(Y), and the conditional probability P(X|Y) are assumed to follow certain distributions. These models are efficient when the assumed distribution matches the ground truth [17, 18, 19, 20, 21, 22]. However, generative models assume an arbitrary probability distribution and cannot specifically describe the relationship between input and output signals. Moreover, computational difficulties often prevent generative models from yielding a Bayes-optimal solution. Discriminative models, on the other hand, assume a linear/non-linear functional mapping between x and y, and estimate the conditional probability P(Y|X) directly from data. Discriminative models are powerful when they can explicitly capture the nonlinear properties of input-output signals [23, 24, 25, 26].
Moreover, decoding problems can also be grouped into regression problems and classification problems (Fig. 1.1). In regression models, the output signals are continuous time-series variables such as 2D/3D kinematics [7, 27, 28], brain states [29, 30], and spike firing probabilities [31, 32]. In classification models, output signals are discrete categorical variables such as image labels [33, 34], memory categories [35], and behavioral events [36, 37].
Figure 1.1: Decoding neural spikes with a regression model and a classification model. In a regres-
sion model, output is a continuous time-series variable. In a classification model, output is a discrete
categorical variable.
1.3 Decoding of Spikes - Challenges
While many methods have been reported to be useful for decoding spikes, several essential questions need to be addressed before building a machine learning model for the problem of decoding
information from spikes. First, spatio-temporal patterns of spikes are extraordinarily high dimensional
signals. Taking every spike timing as a feature for decoding, the dimension of spatio-temporal patterns
is equal to the product of the number of neurons and the number of possible spike timings within the
decoding window, which can easily be in the order of thousands [38, 35]. Second, although the feature
dimension can be reduced by lowering the temporal resolution of spike trains with binning, the optimal
temporal resolution is not known a priori. Using a resolution that is too high cannot sufficiently reduce
the feature dimension; while using a resolution that is too low will lose the temporal structure of spike
patterns. Numerous studies have shown that precise spike timing is crucial in neural coding [39, 40, 41], and that different temporal resolutions are involved in different neurons, brain regions, and species [42, 43, 44].
Third, output signals representing behaviors and cognitive states are also high dimensional. Taking the
visual memory as an example, a single image can be labelled with numerous categories and features [45],
and thus leads to output signals that also have numerous dimensions. Fourth, classification models often suffer from limited data length. Different from time-series regression models, which take continuous time steps with fine resolutions and thus result in a large number of samples (e.g., T), classification models are often estimated with discrete data epochs that each contain many data points, resulting in a much smaller sample size (e.g., T/M, where M is the epoch length). For example, the numbers of memory events recorded from animal or human subjects range from 50 to 150 in the datasets used in this study, due to the time constraints of experimental or clinical procedures. Using such a short data length to build classification models with high input and output dimensions poses a highly under-determined problem and becomes virtually impossible with conventional estimation methods. Finally, to achieve sufficiently high modeling performance, models usually require computationally intensive settings and must run for hundreds of hours on a single personal computer in order to optimize a large number of open parameters given high-dimensional input-output data. However, strict medical restrictions in neuroscience studies limit the modeling time to typically 3-4 days, after which electrodes are promptly removed from research subjects. Therefore, in order to test the model through stimulation, faster methods of model estimation are required.
1.4 Neural Prostheses Applications
The hippocampal memory prosthesis is a cognitive prosthesis device for helping patients to restore their
memory functions [46]. Memory disabilities can result from damaged hippocampal regions caused by
disease or brain injuries. When hippocampal regions are damaged, the neural spike communication between upstream and downstream brain regions can be interrupted. As one of the key parts of the hippocampal prosthesis, a
multi-input multi-output (MIMO) model has been built. The MIMO model identifies neural functional
connectivity and captures the neural transmission properties (Fig. 1.2 left). It can predict down-stream
spikes based on recorded up-stream spikes and thus be used to drive stimulation for bypassing damaged
hippocampal regions [47, 48, 46, 38].
Another important topic of the hippocampal prosthesis is to understand how memory contents are encoded in the hippocampus. Therefore, as another key part of the hippocampal prosthesis, the double-layer multi-resolution decoding model has been built to decode memory-related information from recorded hippocampal spikes (Fig. 1.2 right). It plays a complementary role to the MIMO model in the memory prosthesis. The challenges of building such a DLMR model for better understanding the hippocampus are equivalent to those of decoding spikes described in section 1.3. The DLMR decoding model described in this study can successfully capture temporal properties of spatio-temporal patterns of spikes and yield accurate prediction of memory labels based on input spikes. Moreover, the DLMR decoding model can be used not only to predict memory information but also to provide signature functions that represent the spatio-temporal characteristics of spike patterns most relevant to the memory. Such signature functions can potentially be used to design spatio-temporal stimulation patterns for eliciting specific memories in the hippocampus. Therefore, the proposed DLMR decoding model has important implications for the development of hippocampal memory prostheses.
Figure 1.2: Two core models for the development of the hippocampal memory prosthesis. Top: MIMO
model captures the input-output spike transmission properties. Bottom: DLMR decoding model decodes
memory-related information from high-resolution neural spikes.
Chapter 2
Memory Decoding Model
2.1 Introduction
In this chapter, we propose the methodology of a memory decoding model for decoding spatio-temporal
patterns of spikes. Memory decoding models can be formulated computationally as building an input-
output model that takes hippocampal spikes as inputs and the signal (e.g., memory contents) to be
decoded as output. The estimation of memory decoding models based on hippocampal spiking activities
offers a powerful methodology to the development of the hippocampal prostheses (section 1.4).
Specifically, we utilize modern machine learning techniques such as bootstrap aggregating (bagging),
stacking and regularization to develop a double-layer multi-resolution classification model for decoding
spatio-temporal patterns of spikes. The model takes spatio-temporal patterns of spikes as input sig-
nals and memory-related information as output signals and represents the input-output mapping with a
double-layer ensemble classifier. In the first layer, to solve the under-determined problem caused by small
sample size and the very high dimensionality of input signals, B-spline functional expansion [49] is used to reduce feature dimensionality, and an L1-regularized (LASSO) logistic regression classifier is used to yield sparse model estimations. A wide range of temporal resolutions of neural features is included by using a large number of classifiers with different numbers of B-spline knots. Each classifier serves as a base learner to
classify spatio-temporal patterns into probability of the output with a single temporal resolution. A boot-
strap aggregating strategy is utilized to reduce the estimation variances of these base learners [50]. In the
second layer, by taking advantage of the stacking strategy [51], another L1-regularized logistic classifier
takes outputs of first-layer classifiers as inputs to generate the final output predictions. This second-layer
classifier serves as a meta-learner that fuses multiple temporal resolutions to classify the spatio-temporal
patterns of spikes into the binary memory label. Nested cross-validation [52] is implemented in both
layers to avoid overfitting and yield true out-of-sample estimations.
In this chapter, we test the proposed memory decoding model with synthetic datasets. Results show
that the proposed model can successfully decode spatio-temporal patterns of spikes generated with vari-
ous temporal resolutions. In addition, the proposed model can capture the optimal temporal resolution
by assigning weights to base learners with different resolutions appropriately. Moreover, such a model structure can be used not only to decode information from the spatio-temporal patterns of spikes, but also to provide signature functions that represent the spatio-temporal characteristics of spike patterns most relevant to classification.
2.2 Methods
2.2.1 Spike Decoding Model
Consider a spatio-temporal pattern of neural spikes X recorded around the i
th
trial of behavioral event
happened at t
i
, and that event is labelled as y. The goal is to build a model H that approximates the
functional relationship between input X and output y:
H[X(t
i
);
M
2
M
2
]!y(i) (2.1)
where X(t
i
) = (x
(1)
(t
i
);x
(2)
(t
i
);:::;x
(N)
(t
i
)), N is the number of input neurons, and M is the length of
the decoding time window. With a sufficiently small sampling interval (e.g., 2 msec), x only contains
8
Table 2.1: Table of notations
Symbols Meanings
Knot of B-spline basis functions
Estimated probabilities
B B-spline basis functions in units of probability
D Data set pair x;y
F Sparse classification functional matrix
G Data set after bagging
H Input-output decoding models
I Number of instances
J Number of B-spline basis
L First-layer base learners
L
0
Second-layer meta-learner
M Length of the decoding window
N Number of input signals
Q Number of included B-spline resolutions
R Number of bagging replicas
S Model estimation minimizer
b B-spline basis function
d Degree of B-spline basis function
l Log-likelihood function
m Number of interior knots of B-spline
w Model coefficients
x Input signals
y Output signals
z Input feature vectors
Regularization parameter
9
binary values (1 when there is a spike and 0 when there is not) and preserves the temporal structures of
spike trains. Notations used in this letter are summarized in Table 2.1.
Taking a probabilistic approach, equation 2.1 can be expressed as

\mu(i) = \mathrm{Prob}\left(y(i) \mid X(t_i);\ -\tfrac{M}{2} \le \tau \le \tfrac{M}{2}\right)    (2.2)

For binary classification problems where y takes binary values (0 or 1), the following classification rule predicts y(i) based on \mu(i):

\hat{y}(i) = \begin{cases} 1 & \mu(i) > 0.5 \\ 0 & \text{otherwise} \end{cases}    (2.3)
The naive way of solving this classification problem is to consider each spatio-temporal point of X as one independent feature. However, this is impractical since the total number of features (N \times M) can be forbiddingly large. To reduce the number of features, X is projected into a feature space, such as, in the linear case:

z_i(n, j) = \sum_{\tau = -M/2}^{M/2} b_j(\tau)\, x^{(n)}(t_i - \tau)    (2.4)

where b_j is the j-th of the total J basis functions used for feature extraction, and x^{(n)} is the n-th of the total N neurons included in the analysis. By concatenating the N \times J projected features into a feature vector, the model inputs (i.e., spatio-temporal features) can be significantly reduced from a vector with a dimensionality of N \times M to a vector with a dimensionality of N \times J, since M \gg J. Equation 2.2 can thus be rewritten as:

\mu(i) = \mathrm{Prob}\left(y(i) \mid Z^{(1)}_i, Z^{(2)}_i, \ldots, Z^{(N)}_i\right)    (2.5)

where Z^{(n)}_i = \left[z_i(n, 1), z_i(n, 2), \ldots, z_i(n, J)\right].

Equation 2.5 represents a multi-input, single-output (MISO) decoding model of spikes. It decodes the spatio-temporal pattern of spikes X_i into the behavioral label y_i based on the spatio-temporal features Z_i extracted from X_i. The spike decoding model proposed in this chapter is different from the multi-input, multi-output (MIMO) model, which is essentially a regression model for capturing spike-to-spike transformations [46, 47]. The spike decoding model, on the other hand, is a classification model that takes neural spike trains as inputs to decode behavioral/cognitive variables (e.g., image labels) (see Fig. 1.1).
2.2.2 Regularized General Linear Models
Equation 2.5 can be solved with a generalized linear model (GLM) [53], where the output signal y is assumed to follow an exponential-family distribution with a mean \mu, which can be calculated as a weighted summation of the input features Z^{(n)}_i followed by a static nonlinear link function as

\mu(i) = f^{-1}\left(\sum_{n=1}^{N} w^{(n)} Z^{(n)}_i + w_0\right)    (2.6)

where f is the link function, and w^{(n)} are the sought model coefficients that can be estimated with a maximum-likelihood estimation (MLE) method [54]. Assuming that samples of the output signal y are conditionally independent, the likelihood function can be expressed as:

L(w) = \prod_{i=1}^{I} \mathrm{Prob}\left(y(i) \mid Z^{(1)}_i, Z^{(2)}_i, \ldots, Z^{(N)}_i\right) = \prod_{i=1}^{I} \mu(i)^{y(i)} \left(1 - \mu(i)\right)^{1 - y(i)}    (2.7)

where I is the total number of samples.

The goal of the MLE is to maximize L(w) or, equivalently, to minimize the negative log-likelihood l(w):

l(w) = -\sum_{i=1}^{I} \left[ y(i) \log \mu(i) + \left(1 - y(i)\right) \log\left(1 - \mu(i)\right) \right]    (2.8)

Using a logit link function, equation 2.6 becomes:

\mu(i) = \left[1 + \exp\left(-\sum_{n=1}^{N} w^{(n)} Z^{(n)}_i - w_0\right)\right]^{-1}    (2.9)

When the number of features is large compared to the number of instances, estimation of w becomes an under-determined problem; L1 regularization can be used to achieve sparse model estimation of w. The minimizer S becomes the summation of l(w) and a penalty term that is expressed as the L1-norm of w:

S(w) = l(w) + \lambda \sum_{n=1}^{N} \left\| w^{(n)} \right\|_1    (2.10)

where \lambda is the regularization hyper-parameter that controls the relative importance between the likelihood and the penalty. A larger value of \lambda leads to a higher degree of penalty and sparser model estimations. The optimal value of \lambda can be selected using a cross-validation procedure [55].
2.2.3 Multi-Resolution Feature Extraction
In this study, the B-spline basis function [56, 49, 47, 57] is used to extract temporal features from
spike trains. B-splines are piece-wise polynomials with smooth transitions between adjacent pieces at
a set of interior knot points. Compared to the typical binning method, B-spline can be used as a
generalized histogram method [58] to bin spike trains into firing rates with smoother boundaries. Suppose
a polynomial spline of degree d in the range [0, M] has m interior knot points, which form the sequence \tau_0 = 0 < \tau_1 < \tau_2 < \ldots < \tau_m < \tau_{m+1} = M. The j-th B-spline basis function can be defined in a recursive fashion as:

b^m_{j,d}(\tau) = \frac{\tau - \tau_j}{\tau_{j+d-1} - \tau_j}\, b^m_{j,d-1}(\tau) + \frac{\tau_{j+d} - \tau}{\tau_{j+d} - \tau_{j+1}}\, b^m_{j+1,d-1}(\tau)    (2.11)

where

b^m_{j,0}(\tau) = \begin{cases} 1 & \tau_j < \tau < \tau_{j+1} \\ 0 & \text{otherwise} \end{cases}    (2.12)

With a fixed degree d and a sequence of m interior knots, the total number of B-spline basis functions J is m + d + 1.
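For illustration, the sketch below evaluates such a B-spline basis with the standard Cox-de Boor recursion (the same construction as equations 2.11 and 2.12, written in the common degree-based indexing); the function names, the NumPy implementation, and the clamped knot vector at the window ends are assumptions made for the example, not details taken from this work.

```python
# Illustrative sketch: B-spline basis functions via the standard Cox-de Boor recursion.
import numpy as np

def cox_de_boor(j, d, knots, t):
    """Value of the j-th B-spline basis of degree d at points t."""
    if d == 0:
        return ((knots[j] <= t) & (t < knots[j + 1])).astype(float)
    out = np.zeros_like(t, dtype=float)
    if knots[j + d] > knots[j]:
        out += (t - knots[j]) / (knots[j + d] - knots[j]) * cox_de_boor(j, d - 1, knots, t)
    if knots[j + d + 1] > knots[j + 1]:
        out += (knots[j + d + 1] - t) / (knots[j + d + 1] - knots[j + 1]) * cox_de_boor(j + 1, d - 1, knots, t)
    return out

def bspline_design(m, d, M, t):
    """Basis matrix (len(t) x J) for m evenly spaced interior knots on [0, M]; J = m + d + 1."""
    interior = np.linspace(0.0, M, m + 2)[1:-1]                      # m interior knots
    knots = np.concatenate([[0.0] * (d + 1), interior, [M] * (d + 1)])
    J = m + d + 1
    return np.column_stack([cox_de_boor(j, d, knots, t) for j in range(J)])

# Example: cubic (d = 3) basis with m = 7 interior knots on a 4000-ms window, 2-ms bins.
B = bspline_design(m=7, d=3, M=4000.0, t=np.arange(0, 4000, 2.0))
print(B.shape)   # (2000, 11)
```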
In this study, the m interior knots are evenly distributed in the decoding window [-M/2, M/2]. Therefore, given a set of interior knots, the temporal resolution of the B-spline basis is M/(m + 1). In the following text, we simply refer to m as the "B-spline resolution." Given a B-spline resolution, the basis b_j described in equation 2.4 can take the form of the j-th B-spline basis function directly. As such, the feature dimension is controlled by the B-spline resolution m. A larger m results in a higher feature dimension and a finer temporal resolution. By adjusting m, B-spline features can range from representing a rate code to a temporal code with varying temporal resolutions.
For the purpose of including multiple temporal resolutions, a wide range of values of m is included (see Fig. 2.1). Each B-spline resolution corresponds to one classifier H_m (i.e., a single-resolution model), and a total of Q classifiers are included.
Figure 2.1: Extracting multi-resolution features from spatio-temporal patterns of spikes using B-spline
functions with different resolutions
2.2.4 Sparse Classification Functional Matrix
For every single-resolution model H_m, the model coefficients can be reshaped into a matrix with dimensionality N \times J, with each row containing the coefficients of all features of one neuron. With the coefficient matrix and the B-spline basis functions, a sparse classification functional matrix (SCFM) can be calculated as:

F_m(n, \tau) = \sum_{j=1}^{J} b^m_j(\tau)\, w(n, j)    (2.13)

The SCFM has the same dimension as the spatio-temporal pattern of spikes X. It provides a functional map that relates spike patterns to the decoding target in a probabilistic manner, and can be directly used to calculate the conditional probability of the label from the original spatio-temporal pattern X as:

\mu\left(\hat{y}(i) = 1 \mid X(t_i)\right) = \left[1 + \exp\left(-w_0 - \sum_{n=1}^{N} \sum_{\tau=-M/2}^{M/2} F_m(n, \tau)\, x^{(n)}(t_i - \tau)\right)\right]^{-1}    (2.14)

Regions of interest (ROIs) refer to the non-zero regions of the SCFM, where spike patterns have non-zero (positive or negative) contributions to the decoding target y.

For each neuron n, the j-th B-spline basis function (modeled with the single resolution m) can be converted into units of probability with its corresponding model coefficient as:

B^m_j(\tau) = \left[1 + \exp\left(-w_0 - b^m_j(\tau)\, w(n, j)\right)\right]^{-1}    (2.15)

Equation 2.15 illustrates how each B-spline basis function depends on the estimated model weights and contributes to the conditional probability \mu. The baseline probability is estimated as \bar{B}^m_j = \left[1 + \exp(-w_0)\right]^{-1}, i.e., the value taken when b^m_j(\tau) = 0. Points on B^m_j(\tau) with values higher than \bar{B}^m_j represent regions where a spike increases the probability of the pattern belonging to the label, i.e., \mu(\hat{y}(i) = 1 \mid X(t_i)); points on B^m_j(\tau) with values lower than \bar{B}^m_j represent regions where a spike decreases this probability.
2.2.5 Bagging-based Ensemble Classifier
Estimation of the model H_m tends to have large variance due to the small sample size. In this study, a bagging-based strategy is used to reduce the estimation variances [50, 59]. In this strategy, data are resampled and partitioned into multiple sets of training and validation data for model estimation and cross-validation. Specifically, we sample the data based on our partition strategy R times to obtain R training and validation datasets. For each dataset, a classifier is trained. The optimal hyper-parameter is estimated globally by considering all R classifiers; thus an ensemble estimation with a lower variance is achieved (Algorithm in Fig. 2.2).
We modify the standard partition strategy to handle imbalanced data, where classes of labels have very different sizes that can cause the accuracy paradox [60]. Consider a random k-fold partition of a small dataset. It is possible that some folds contain only instances of the majority class, where classification becomes impossible. To solve this problem, we sample the two binary labels separately and then combine them to ensure that both training and validation sets contain instances of both classes. A similar partition strategy is used in the stratified k-fold cross-validation [61].
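A minimal sketch of such a label-stratified, repeated partition is given below; the 80/20 split fraction, the NumPy implementation, and the function name are assumptions for illustration only.

```python
# Illustrative sketch of the label-stratified bagging partition described above.
import numpy as np

def stratified_partitions(y, R=8, train_frac=0.8, seed=0):
    """Return R (train_idx, valid_idx) pairs in which the two binary labels are sampled
    separately, so training and validation sets both contain both classes
    (assumes at least two instances per class)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    splits = []
    for _ in range(R):
        train, valid = [], []
        for label in (0, 1):
            idx = rng.permutation(np.flatnonzero(y == label))
            cut = min(max(int(train_frac * len(idx)), 1), len(idx) - 1)
            train.extend(idx[:cut])
            valid.extend(idx[cut:])
        splits.append((np.array(train), np.array(valid)))
    return splits
```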
2.2.6 Nested Cross-Validation
Nested cross-validation [52] is used to avoid overfitting and provide an unbiased evaluation of the model
goodness-of-fit (see Fig. 2.3).
The nested cross-validation process consists of two cross-validation loops - outer and inner loops.
In the outer loop, the original dataset is divided into two subsets: outer training sets and test sets.
The test set is used only for measuring the method's performance. In the inner loop, each outer training set is split into inner training sets and validation sets using the bagging-based partition procedure
(Algorithm in Fig. 2.2). The inner loop is the same as the standard cross-validation. Inner training sets
are used to estimate the model coefficients, whereas validation sets are used to tune hyper-parameters.
Figure 2.2: The pseudo-code of the bagging-based data partition strategy
Optimal hyper-parameters are selected globally within each outer training set across all inner training and
validation sets. Finally, the test set in the outer loop is used to assess the model prediction performance.
In this project, we concatenate the predictions from all test sets and compare them with the true labels of the output signals as the final model evaluation. All steps described above are repeated eight times (i.e.,
R = 8 in Algorithm in Fig. 2.2) to reduce model variance.
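The following sketch shows the overall structure of this nested procedure, assuming scikit-learn, a small candidate grid of \lambda values, five outer folds, and the stratified_partitions helper from the previous sketch; none of these specific choices are taken from this work.

```python
# Illustrative sketch of the nested cross-validation loop described above.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression

def nested_cv_predict(Z, y, lambdas=(0.1, 1.0, 10.0), R=8, n_outer=5, seed=0):
    """Return out-of-sample predicted probabilities, concatenated over all outer test folds."""
    Z, y = np.asarray(Z), np.asarray(y)
    y_hat = np.zeros(len(y), dtype=float)
    outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed)
    for train_idx, test_idx in outer.split(Z, y):
        # Inner loop: pick one lambda globally across the R bagging replicas.
        scores = []
        for lam in lambdas:
            acc = []
            for tr, va in stratified_partitions(y[train_idx], R=R, seed=seed):
                clf = LogisticRegression(penalty="l1", C=1.0 / lam, solver="liblinear")
                clf.fit(Z[train_idx][tr], y[train_idx][tr])
                acc.append(clf.score(Z[train_idx][va], y[train_idx][va]))
            scores.append(np.mean(acc))
        best_lam = lambdas[int(np.argmax(scores))]
        # Refit on the full outer training set and predict the held-out test fold.
        clf = LogisticRegression(penalty="l1", C=1.0 / best_lam, solver="liblinear")
        clf.fit(Z[train_idx], y[train_idx])
        y_hat[test_idx] = clf.predict_proba(Z[test_idx])[:, 1]
    return y_hat
```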
2.2.7 The Double-layer Multi-Resolution Spike Decoding Model
As described in section 2.2.3, multi-resolution models with different B-spline resolutions are trained to include a wide range of temporal resolutions of spikes. However, to solve the classification problem in equation 2.6, one final model is needed to provide the final prediction of output labels. One simple solution is to choose the best model, i.e., the model that achieves the highest decoding accuracy in the ensemble, as the final model. However, such a strategy requires intensive computing time and discards all other models except the "best" one. It becomes problematic when computational resources are limited. More importantly, such a final model is still a single-resolution model. It is not necessarily true that all neurons share the same temporal resolution. Modeling neurons with a single fixed resolution lacks flexibility and could also harm the decoding performance. The generalization ability of an ensembled multi-resolution model is often much stronger than that of single-resolution models by combining a wide range of resolutions.
Figure 2.3: The nested cross-validation.
In this study, we use a stacking procedure to combine all single-layer single-resolution models together
to form a double-layer multi-resolution ensemble model [50, 62, 63]. Through this approach, multiple
single-resolution models serve as the first-layer "base learners." In this layer, all base learners classify the
input and make their output predictions in the form of probabilities. In the second layer, all first-layer
output predictions are taken as inputs of a classifier, termed as the "meta learner", to produce the final
output prediction. The pseudo-code of our double-layer multi-resolution classification model is shown as
Algorithm in Fig. 2.4.
By applying the same sigmoid classification rule described in equation 2.3, we can get the final ensembled binary classification of the output label as:

\hat{y}(i) = \begin{cases} 1 & \mu'(i) > 0.5 \\ 0 & \text{otherwise} \end{cases}    (2.16)
With stacking, the second-layer model combines all first-layer models with their corresponding second-layer model coefficients w', which has a dimensionality of 1 \times Q. In addition, an ensemble SCFM can be calculated from the first-layer SCFMs and their corresponding second-layer coefficients as:

F'(n, \tau) = \left[1 + \exp\left(-w'_0 - \sum_{m=1}^{Q} F_m(n, \tau)\, w'(m)\right)\right]^{-1}    (2.17)
Figure 2.4: The pseudo-code of the double-layer multi-resolution classification model. *Note: the processes in Steps 2 and 6 also contain the bagging and nested cross-validation strategies described in previous sections.
Similarly, B-spline basis functions in units of probability can also be calculated with the second-layer coefficients as:

B'_j(\tau) = \left[1 + \exp\left(-w'_0 - \sum_{k \ne m}^{Q} w'(k)\, \bar{B}^k_j - B^m_j(\tau)\, w'(m)\right)\right]^{-1}    (2.18)

The baseline probability of such double-layer probability curves is defined by the baselines from all first-layer B^m_j. Equations 2.17 and 2.18 show how the double-layer SCFM combines a wide range of resolutions and contributes to the conditional probability \mu.
Now we have a complete double-layer multi-resolution classification model for decoding spatio-temporal patterns of spikes. In the first layer, multi-resolution B-spline basis functions are used to explore the neural temporal resolutions and reduce data dimensionality. Multiple LASSO classifiers are trained as first-layer base learners to classify the spatio-temporal patterns of spikes based on the B-spline features. Inputs of those base learners are B-spline features with different resolutions. Outputs are behavioral labels in memory-dependent tasks. In the second layer, another LASSO classifier is used to combine all single-resolution base learners to form a final multi-resolution model (Figure 2.5). As such, inputs of the meta-learner are the probabilities estimated by the base learners, while outputs are the corresponding behavioral labels (see Fig. 2.5).
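A compact sketch of this stacking step is given below; it assumes the bspline_design and nested_cv_predict helpers from the earlier sketches and, for brevity, omits the bagging and nested cross-validation that the full model also applies when estimating the meta-learner.

```python
# Illustrative sketch of the two-layer stacking structure (assumed helpers: bspline_design,
# nested_cv_predict). Not the thesis implementation, which was written in MATLAB.
import numpy as np
from sklearn.linear_model import LogisticRegression

def double_layer_decode(spikes, y, resolutions, d=3, M=4000.0, dt=2.0):
    """spikes: I x N x T binary array; returns the fitted meta-learner and base outputs."""
    I, N, T = spikes.shape
    t = np.arange(0, M, dt)
    base_probs = []
    for m in resolutions:                             # one base learner per B-spline resolution
        basis = bspline_design(m, d, M, t)            # T x J basis matrix
        Z = np.einsum("int,tj->inj", spikes, basis).reshape(I, -1)   # eq. 2.4 features, I x (N*J)
        base_probs.append(nested_cv_predict(Z, y))    # out-of-sample probabilities for this m
    P = np.column_stack(base_probs)                   # I x Q matrix of first-layer outputs
    meta = LogisticRegression(penalty="l1", C=1.0, solver="liblinear").fit(P, np.asarray(y))
    return meta, P
```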
2.2.8 Model Evaluation
Modeling performance is evaluated with the Matthews correlation coefficient (MCC), which is robust to imbalanced data. MCC is calculated from the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) in the confusion matrix as:

\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}    (2.19)
Figure 2.5: The double-layer multi-resolution classification model of spikes. The first row: spatio-temporal patterns of spikes; the second row: bagging-based data partition strategy; the third row: first-layer single-resolution base-learners; the fourth row: the second-layer multi-resolution meta-learner; the fifth row: model outputs.
MCC values of 1, 0, and -1 indicate perfect, random, and opposite predictions of the output labels, respectively. Note that trivial models, whose predictions are all 1 or all 0, also yield an MCC of 0.
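Equation 2.19 can be computed directly from the predicted and true label vectors, for example as below (an illustrative sketch; a library routine such as scikit-learn's matthews_corrcoef gives the same value).

```python
# Illustrative sketch: equation 2.19 computed from a pair of binary label vectors.
import numpy as np

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    # Trivial (all-0 or all-1) predictions give a zero denominator and, by convention, MCC = 0.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```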
2.3 Results
In this chapter, we tested the double-layer multi-resolution decoding model with synthetic data to see
whether it can decode spatio-temporal patterns of spikes generated with various temporal resolutions.
Synthetic spike patterns were simulated with different forms of probability intensity functions (PIFs), which consist of a baseline firing rate f_0 (5 Hz) and Gaussian peaks that can be described as:

P_f(t) = P_{f_0}(t) + \frac{\alpha}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(t - c)^2}{2\sigma^2}\right)    (2.20)

where P_{f_0} is the baseline firing probability, \alpha is the peak intensity, c is the center, and \sigma is the standard deviation of the Gaussian function that controls the temporal resolution of simulated spike patterns.
We designed four simulations with progressively higher complexity to examine how the proposed
model performs with (1) a single neuron with a wider (low-resolution) peak, (2) a single neuron with a
narrower (high-resolution) peak, (3) two neurons with wider and narrower peaks respectively, and (4)
a population of neurons with randomly generated peaks. Simulation parameters of the PIFs are shown in Table
2.2.
Table 2.2: PIF parameters used in simulation studies. Each row lists the Gaussian peak parameter values of equation 2.20 for one simulation case.
Single (Low): 1, 0.3, 0.02
Single (High): 3, 0.005, 0.002
Combined Two: (1, 3), (0.3, 0.005), (0.02, 0.002)
Population: 0.1-3.9, 0.001-0.1, 0.002-0.02
In the first three simulations, two classes (i.e., class 1 and class 0) of patterns were simulated. For class
1, PIFs were designed as in equation 2.20 and table 2.2; for class 0, PIFs were designed to be flat lines
22
(i.e., homogeneous Poisson spike trains) with values equal to the mean PIF values of their corresponding
class 1 patterns. Therefore, the mean firing rates of both classes were the same. Each class was simulated
with 100 instances. Every instance was four second long. In the fourth population simulation, multiple
classes of patterns were simulated with PIF parameters randomly generated in the ranges given in table
2.2. In other word, each neuron exhibited different PIF patterns corresponding to multiple classes. In
this study, multiple classes were classified with multiple binary classifiers, where in each classifier, the
class to be classified were treated as class 1, while all other classes were treated as class 0.
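For illustration, the sketch below draws one spike pattern from a PIF of this form by Bernoulli sampling at 2-ms bins; the specific parameter values in the example are placeholders rather than the settings of Table 2.2.

```python
# Illustrative sketch: simulating a spike pattern from a PIF of the form in equation 2.20
# (baseline rate plus one Gaussian peak). Parameter values below are placeholders.
import numpy as np

def simulate_pattern(duration=4.0, dt=0.002, f0=5.0, alpha=1.0, center=1.0, sigma=0.05, seed=0):
    """Return (t, spikes): a Bernoulli spike train sampled from the PIF at resolution dt."""
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, duration, dt)
    rate = f0 + alpha / (sigma * np.sqrt(2 * np.pi)) * np.exp(-((t - center) ** 2) / (2 * sigma ** 2))
    p = np.clip(rate * dt, 0.0, 1.0)        # per-bin firing probability
    return t, (rng.random(t.size) < p).astype(int)

# Class-0 patterns would instead use a flat PIF with the same mean rate, so that a pure
# rate code cannot separate the two classes.
```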
In all classifications, the number of bagging replicas R was 8, and the B-spline resolution m was selected from the range [0:25, 50:5:150]. Thus, 47 base learners H_m were trained in the first layer. [a:b:c] denotes the vector of possible values from a to c with step size b. All models were built in MATLAB 2019a.
2.3.1 Decoding Single-Neuron Patterns
We started with the simplest two cases, where class 1 patterns had PIFs with low- and high-resolution
Gaussian peaks, respectively (Fig. 2.6A). PIFs of class 0 are flat lines and not shown for simplicity. It is
worth noting that since the average firing rates of class 1 patterns and class 0 patterns are designed to be
the same, these two patterns cannot be classified by any firing rate-based classifier. The main goal here is
to gain insights into how the double-layer classifier captures different temporal resolutions in the patterns
with different B-splines. The B-splines selected by the double-layer model and the ensemble SCFMs in units of probability, which are described in sections 2.2.3 and 2.2.7, were calculated to show the ROIs and were compared with the true PIFs (Fig. 2.6B, red lines) and the effective PIFs (Fig. 2.6B, blue lines). True PIFs refer to the analytical PIFs defined in equation 2.20, while effective PIFs are spike histograms calculated after the simulations. These two types of PIFs converge when the number of trials is large.
Decoding results show that ensemble SCFMs (Fig. 2.6B, shaded areas) well match the PIFs used in
simulation (Fig. 2.6A, red and blue lines). As expected, low-resolution B-splines (smaller m) and high-
resolution B-splines (larger m) are selected under the wider and narrower Gaussian peaks, respectively,
to construct the corresponding SCFMs. In both low- and high-resolution cases, the model also selects a
few high-resolution B-splines outside the Gaussian peaks of the true PIFs (Fig. 2.6A, red lines), due to the
random fluctuation in effective PIFs caused by small sample sizes and low firing rates.
We further plot MCCs of all base learners and the meta-learner classifiers (see Fig. 2.6C). In the
low-resolution case, the MCCs of single-neuron base learners (see Fig. 2.6C, solid line) reach their maximal value at the optimal temporal resolution of B-splines (m = 7). In the high-resolution case, one of the maximal
temporal resolutions (m = 120) yields the highest MCC (see Fig. 2.6C, solid line). In both cases, the
multi-resolution meta-learners (see Fig. 2.6C, dashed lines) further boost the MCCs (0.922 in Fig. 2.6C
left, and 0.871 in Fig. 2.6C right) on top of the highest MCCs (0.844 in Fig. 2.6C left, and 0.754 in Fig.
2.6C right) achieved by the best single-resolution base learner.
It is worth noting that there is no multi-resolution PIF in such single-neuron simulations. However,
the meta-learner can still boost the performance significantly by combining base learners with different
B-spline basis functions since each individual base learner is imperfect in capturing the PIF due to its
specific B-spline knot sequence (see Fig. 2.6C, colored circles).
2.3.2 Decoding Two-Neuron Patterns
In the second simulation, we combined the two neurons above to form two-neuron patterns for decoding.
The goal is to see how the double-layer model decodes patterns with multiple temporal resolutions, i.e.,
wide and narrow Gaussian peaks, with its base learners and the meta-learner.
Decoding results show that the ensemble SCFM (concatenation of SCFMs in Fig. 2.7A) well matches
the PIFs used in simulations (Fig. 2.6A, red and blue lines). Both low- and high-resolution B-splines
under Gaussian peaks are selected by the meta-learner and used to construct the SCFM. Among the
four groups of B-splines (represented by four different colors) selected by the meta-learner, two are high-resolution (Fig. 2.7A, dark and light red lines) and two are low-resolution (Fig. 2.7A, dark and light blue lines).
These four groups of B-splines correspond to four base learners with different temporal resolutions.
Similar to single-neuron patterns decoding results, low- and high-resolution B-splines show relatively
Figure 2.6: Decoding single-neuron patterns. Left column: decoding patterns with a low-resolution
Gaussian peak; right column: decoding patterns with a high-resolution Gaussian peak. A: Spike raster
plots (black dots), true PIFs (red lines), and effective PIFs (blue lines) of simulated spike patterns. Note
only class 1 patterns are shown. B: B-spline (color lines) selected and SCFMs (shaded areas) estimated
with the double-layer model. C: MCCs of base-learner classifiers (black lines: mean; shaded areas: STD)
and meta-learner classifiers (dashed lines). Colored circles represent base learners selected by the meta
learner.
Figure 2.7: A: B-splines (color lines) selected and SCFMs (shaded areas) estimated with the double-layer
model. B: MCCs of base-learners classifiers (black lines: mean; shaded areas: STD) and meta-learner
classifiers (dashed lines). Colored circles represent base-learners selected by the meta-learner.
larger contributions to the wider and narrower Gaussian peaks in the PIFs, respectively. Different from
single-neuron results, the meta-learner here tends to use a mixture of low- and high-resolution B-splines
to fit both wide and narrow Gaussian peaks. This is because each base learner uses a single temporal
resolution to fit both wide and narrow peaks. Although each base learner is imperfect in fitting all
peaks, the meta-learner combines multiple base learners to achieve a good fit to the PIFs (Fig. 2.7A). It is worth noting that with both low- and high-resolution B-splines, the SCFM captures the noisier effective PIF instead of the smooth, true PIF. Therefore, the "noisy" SCFMs in Figure 2.7A are caused not by overfitting but by the small sample size, which biases the effective PIF away from the true PIF.
MCCs of all base learners and the meta-learner classifiers are shown in Fig. 2.7B. Results show that
with two neurons, base learners in general perform better than with single neurons due to the redundant
information in the two-neuron patterns (i.e., base learners can achieve relatively higher MCCs from
classification in either neuron). More importantly, the meta-learner selects both low- and high-resolution
base learners to classify the two-neuron patterns that contain both wide and narrow Gaussian peaks
(Fig. 2.7B, colored circles). By combining multiple resolutions, the meta-learner boosts the MCC to
0.95, which is much higher than the MCC achieved by the best base-learner (0.89).
2.3.3 Decoding Population Patterns
Our last simulation dealt with population-level spatio-temporal patterns of spikes with a larger number of
neurons and more complex temporal patterns. The population patterns were simulated with 30 neurons
and 5 categories. These numbers were chosen based on the typical numbers we encountered in human
experiments. Each neuron has a 0.5 probability of having no Gaussian peaks and a 0.25 probability
of having 1 or 2 peaks, respectively. The Gaussian peaks have randomly chosen locations, widths, and
intensities (see Table 2.2 and Fig. 2.8A, B). For each category, 500 instances were simulated.
Five double-layer multi-resolution decoding models were built for the 5 categories (Fig. 2.8). Results
show that these models can perfectly decode the population patterns (MCC=1 in all categories). The
resulting SCFMs well capture most ROIs in the PIFs (Fig. 2.8A, C). As shown by equation 2.14 and demonstrated here, SCFMs provide characteristic spatio-temporal maps for decoding spatio-temporal patterns by identifying ROIs unique to each class (category). In other words, SCFMs are essentially the spatio-temporal areas that maximize the difference between class 1 patterns and class 0 patterns for classification. This is clearly shown in Fig. 2.8D: without applying SCFMs (i.e., counting all spikes in the spatio-temporal patterns), there is no significant difference between spike counts in the patterns across different categories (Fig. 2.8D, top panel); with SCFMs applied (i.e., counting spikes within nonzero areas of SCFMs only), the differences between spike counts across different categories become obvious (Fig. 2.8D, bottom panel). Note that this simple counting procedure is not used in the decoding model: these results are illustrated to intuitively demonstrate how the SCFMs identify ROIs. The actual model calculation is shown in the equations described in section 2.2.
Figure 2.8: Decoding population patterns. A: PIFs of the five categories. B: Raster plots of one example spatio-temporal pattern of spikes simulated based on the PIFs. C: SCFMs of the double-layer decoding models. In panels A, B, and C, the x-axis is the decoding time window (-2 sec to 2 sec) and the y-axis is the index of neurons. PIF, raster, and SCFM all have the same dimensionality. D: Spike counts with and without SCFMs as masks. Bars: mean spike counts per instance. Error bars: STD of spike counts.
2.3.4 Factors Influencing Classification Performance
At the end of the simulation study, we investigated the influence of key experimental parameters on
the model performance. These parameters include the number of neurons N, the number of instances
per category T, and the number of categories C. Since 500 instances per category results in perfect
classification as shown in the previous section, we decrease T to smaller numbers that are more realistic
in experimental or clinical settings. Twenty-seven simulations and classifications were performed with
different combinations of N (10, 20, 30), T (100, 150, 200), and C (2, 3, 5) (see Fig. 2.9).
Results show that, as expected, decoding performance increases with the number of instances. Second,
including more neurons increases model performance. This result indicates that the decoding model does not suffer from overfitting caused by including a larger number of features from more neurons. Finally, having more categories decreases model performance as the data become more imbalanced. However, even in the most challenging case with the fewest instances (100), the fewest neurons (10), and the most categories (5), the decoding model can still achieve a significant level of classification (MCC = 0.493 ± 0.077).
Figure 2.9: Influence of key experimental parameters on classification performance. Bars: mean MCCs;
Error bars: STD of MCCs.
2.4 Summary
In this chapter, we proposed a double-layer multi-resolution decoding model for decoding spatio-temporal
patterns of spikes. This model structure is designed specifically for solving the highly under-determined
estimation problem caused by high-dimensional spike patterns and the small sample size, while taking into account the unknown optimal temporal resolution for decoding spikes. In the first-layer base learners, B-
spline basis functions with a single temporal resolution are used to extract spatio-temporal features from
spike patterns and to reduce dimensionality. L1-regularized logistic classifiers are used to yield sparse
estimations of model coefficients as solutions to the under-determined problems. Estimating multiple
copies of classifiers using resampled data allows us to build ensemble classifiers to reduce estimation
variance and thus obtain more stable model estimation. With the second-layer meta learner, multiple
base learners are fused in a data-driven manner to incorporate multiple temporal resolutions into the
model and render the decoding model double-layer and multi-resolution. To avoid overfitting, nested
cross-validation is used throughout estimations.
Results in the synthetic data show that this model can capture the underlying ground truth temporal
resolution by assigning weights to base learners with different resolutions appropriately. Significant MCCs
indicate that the model can accurately decode high-dimensional spike patterns using a small number
of instances. The decoding models also provide signature functions (i.e., SCFMs) that represent spatio-
temporal characteristics of spike patterns most relevant to classifications.
Chapter 3
Experimental Memory Decoding with Rodent and Human
Hippocampal Recordings
3.1 Introduction
The hippocampus is a brain region critical for the formation of long-term episodic memories [64, 65, 66].
Damage to the hippocampus and surrounding regions can result in a permanent loss of the ability to
form new long-term memories. Recently, modern machine learning models have been increasingly used to analyze neural signals recorded from hippocampal areas and thus to better understand the hippocampus.
This chapter presents applications of the proposed double-layer multi-resolution decoding models
with experimental hippocampal spiking data recorded from rodent and human subjects. We have built
a whole human hippocampal analysis system for the development of hippocampal memory prostheses at
both Keck Hospital of USC and Rancho Rehabilitation Center. We used depth electrodes with micro-
macro contacts to record neural spike trains from hippocampal CA3 and CA1 regions. A memory-dependent behavioral task was designed for representing visual memories. Several signal pre-processing steps were applied to obtain good input-output data for the modeling. In total, 43 subjects participated in the study and were successfully recorded.
We tested the double-layer multi-resolution memory decoding model with experimental data recorded
from 5 rats and 24 human subjects performing memory-dependent tasks. Results show that this model
can effectively avoid overfitting and yield accurate prediction of memory labels. The double-layer multi-
resolution model consistently outperforms the best single-layer single-resolution model by extracting and
utilizing multi-resolution spatio-temporal features of spike patterns in the classification.
3.2 Methods
3.2.1 Experimental Procedures
Two types of experimental data are used to test the double-layer multi-resolution decoding model. One
is hippocampal spiking activities recorded from rats performing a spatial delayed nonmatch-to-sample
(DNMS) task [67]. The other is hippocampal spiking activities recorded from human epilepsy patients
performing a visual delayed match-to-sample (DMS) task [38].
3.2.1.1 Delayed Nonmatch-to-Sample Task in Rats
Hippocampal CA3 and CA1 spike trains were recorded from rats (n=5) performing the DNMS task using
multi-electrode arrays (MEAs). Each DNMS trial started with the sample phase (see Fig. 3.1, top) when
the animal was presented randomly with one of the two levers (left or right) on one wall of the chamber.
The animal was trained to press the lever to generate a sample response event. The lever was then
retracted, and the delay phase with a random duration (5-30 seconds) started. For the delay duration,
the animal was required to nose-poke into a lighted device on the opposite wall. When the delay ended,
the nose-poke light was turned off; both levers were extended, and the animal was required to press the
lever opposite the one it pressed during the sample phase and generated a nonmatch response. If the
correct lever was pressed, the animal was rewarded with a drop of water. One session had 80 to 120
DNMS trials. Spatio-temporal patterns of CA3 and CA1 spikes around (-2 sec to 2 sec) sample response
events were used as model inputs. Lever locations (left: 1, right: 0) were used as model outputs.
Figure 3.1: Memory-dependent behavioral tasks used in this study. Top: Spatial delayed nonmatch-to-
sample task in rodent studies. Bottom: Visual delayed match-to-sample task in human studies.
3.2.1.2 Delayed Match-to-Sample Task in Human Patients
Hippocampal CA3 and CA1 spike trains were recorded from human epilepsy patients performing a DMS
task. Each DMS trial was initiated with a red focus ring presented on the touch screen. The subject
was asked to click on the focus ring to start the sample phase. During the sample phase, one image was
presented on the touch screen. The subject was asked to remember the image and then press on the
image to form a sample response event. After pressing, the image disappeared, and the delay phase (3-5 seconds) started. When the delay phase finished, multiple images, including the image presented
at the sample phase (sample image) were presented on the touch screen. The subject should choose and
press on the sample image to generate a correct match response. One session consisted of 100 to 150
DMS trials.
3.2.2 Human Study Paradigm
We aim to develop hippocampal prostheses for restoring and enhancing memory functions. All subjects
enrolled in this study were suffering from hippocampal-related refractory epilepsy and underwent im-
plantation of intracranial depth electrodes to monitor and localize seizures. Subjects were selected based on non-invasive epileptogenic foci monitoring, which indicated that further invasive monitoring with recording electrodes was necessary before possible surgical treatment. Our experimental paradigm for testing such
prostheses consists of multiple steps happening in four days (Fig. 3.2, bottom):
1. Implanting Day: subjects are typically implanted with 8-10 macro EEG probes and 1-4 macro-
micro EEG/single neuron probes. Spiking activities are recorded with the "micro" electrodes of the
macro-micro probes.
2. Recording Day: 1-2 days after the Implanting Day, subjects perform the memory-dependent DMS
tasks (section 3.2.1.2) while implanted probes record the ensemble spike trains from hippocampal CA3
and CA1 areas.
3. Stimulation Day: 3-4 days after the Recording Day, model-based electrical stimulations are deliv-
ered back to specific hippocampal locations as a way to test the model.
4. Explanting Day: due to postoperative observation requirements, typically, all electrodes are re-
moved from the subject on the seventh day after implantation.
Figure 3.2: Hippocampal memory prosthesis human study paradigm. Top: delayed match-to-sample
task. Bottom: Timeline of the human study paradigm.
All procedures were reviewed and approved by the Institutional Review Boards of Wake Forest Uni-
versity and the University of Southern California, in accordance with the National Institutes of Health.
The subjects provided voluntary written informed consent prior to participation in this study. All human
experiments were performed at the Comprehensive Epilepsy Center of Wake Forest Baptist Medical Cen-
ter, Keck School of Medicine of the University of Southern California, and Rancho Los Amigos National
Rehabilitation Center.
3.2.3 Placement of Electrodes
In order to determine the placement of intraoperative electrodes for the ‘stereo-EEG’ assessment, all subjects underwent appropriate pre-operative imaging. Typically, 8-10 ‘macro’ style EEG probes
and 1-4 ‘macro-micro’ style EEG/single neuron probes were implanted into our subjects. All probes
consisted of FDA-approved depth electrodes (Ad-Tech Medical Instrumentation Corporation, Racine,
WI). The electrodes were placed using either a stereotactic headframe (CRW Precision Arc, Integra Life
Sciences, Plainsboro, NJ) or a frameless stereotactic system (VarioGuide, Brainlab AG, Feldkirchen,
Germany). The placement of the hippocampal electrode was aimed at penetrating the head of the
hippocampus perpendicular to the long axis to record neuronal activity in both the CA1 and the CA3
subfield. Typically, on each probe, 6 out of 10 micro electrodes were implanted in the CA3 area of the
hippocampus whereas 4 out of 10 micro electrodes were implanted in the CA1 area of the hippocampus.
During the positioning of the electrodes, intraoperative monitoring of neuronal activity was performed
to confirm single unit and field potential recordings. The surgery was then concluded with all electrodes secured and the craniotomies closed. After 1-2 days of recovery from the anesthesia,
subjects were transferred to the ICU of the Keck Hospital or the Rancho Rehabilitation Center, where
all memory behavioral tests (DMS tasks) were performed. Both MRI and electrophysiological activity
consistent with putative hippocampal CA1 and CA3 principal cells were used to verify the localization
of the electrode placements. After the seizure localization was confirmed, or at a time designated by the care team for each patient once a sufficient period of invasive monitoring had been performed, electrodes were explanted from all subjects.
3.2.4 Hardware Environments
The recording and microstimulation were performed using the Blackrock Cervello neural recording system (Blackrock Neuromed, Salt Lake City, UT). Continuous digitized electrical monitoring identified single-unit action potential waveforms (bandpass filtered to 500–5000 Hz, 30000 samples/s) and single-unit spike events (i.e., timestamps, 200 µs resolution) during the aforementioned DMS task performance.
The recording was performed simultaneously with the clinical EEG recording system. Active headstages
amplified the signal at the level of the electrode and the signal was subsequently fed to a pre-amplifier
for further amplification. Referencing used macro electrode contacts selected to be within white matter
and to demonstrate good noise characteristics. The grounding scheme shared one circuit between the
clinical and research EEG systems which reached the patient via a scalp disc electrode. Isolated power
transformers and software noise filtering were used.
3.2.5 Input-Output Data Pre-processing
Manual sorting schemes were applied according to each subject’s signal-to-noise ratio to isolate and
identify single unit extracellular action potential waveforms. The timestamp of the spike events was set
to 200 s resolution, and the bandpass filter for the single unit action potential waveforms was 500-5000
Hz with 30000 samples per second. As such, spatio-temporal patterns of spikes were obtained as input
signals to the decoding model.
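The sketch below illustrates how sorted spike timestamps can be converted into such a spatio-temporal input pattern around one behavioral event; the window, bin width, and example data are assumptions for illustration, not the exact pipeline settings.

```python
# A minimal sketch of binning spike timestamps around one event into a
# (neurons x time bins) spike-count matrix.
import numpy as np

def build_pattern(spike_times_per_neuron, event_time, window=(-2.0, 2.0), bin_width=0.01):
    """Return a (neurons x bins) spike-count matrix around one event."""
    n_bins = int(round((window[1] - window[0]) / bin_width))
    edges = event_time + np.linspace(window[0], window[1], n_bins + 1)
    return np.vstack([np.histogram(ts, bins=edges)[0] for ts in spike_times_per_neuron])

# Example: two neurons and one sample response event at t = 120.0 s
spikes = [np.array([118.2, 119.9, 120.4]), np.array([121.1, 121.8])]
pattern = build_pattern(spikes, event_time=120.0)
print(pattern.shape)   # (2, 400) for a 4 s window with 10 ms bins
```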
In human decoding, we used image labels (termed "memory labels") as representations of visual
memory information. Memory labels were obtained from thousands of normal volunteers giving scores
to a set of images through an online survey system. Five main categories (i.e., Animal, Plant, Building,
Tool, and Vehicle) were selected as the main decoding target categories (see Fig. 3.3).
Figure 3.3: Five categories of images were selected as the main decoding target categories for testing the
decoding model.
3.2.6 Design of Testing Cases
We designed 4 testing cases (Fig. 3.4), two as normal decoding cases and two as control cases, to evaluate
the model’s ability to decode memory-related information from spatio-temporal patterns of spikes.
In the first normal decoding case (labeled “Sample Response” case), spike patterns around the “Sample
Response” events, when visual memories were formed, were used as model inputs. Memory labels of
images shown in the Sample phase were used as model outputs. This case aims to check whether the
model can decode specific memories accurately when memories were formed.
In the second normal decoding case (labeled “Match Response” case), spike patterns around the “Match Response” event were used as model inputs. Memory labels of images, which are shown in the Sample
phase, were used as model outputs. This case aims to check whether the model can decode specific
information when memories were retrieved.
In the first control case (labeled “Control (Time-shifted)” case), time windows of spike patterns were
shifted to be 4 seconds before the “Sample On” events, so that the spike patterns contain no memory
information about the sample images. This case serves as a negative control, which aims to check whether
the model can decode real memory-related information.
In the second control case (labeled “Control (Label-shuffled)” case), memory labels were randomly
shuffled to disrupt any possible correlation between spike patterns (model inputs) and memory labels
(model outputs). This case serves as another negative control, which aims to check whether the model
may suffer from overfitting given such high-dimensional input signals and a small sample size.
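The two negative controls are simple to construct in practice, as the sketch below shows; the event times and labels are hypothetical placeholders.

```python
# Sketch of how the two negative controls can be constructed.
import numpy as np

rng = np.random.default_rng(1)
sample_on_times = np.array([12.0, 55.3, 98.7])   # hypothetical "Sample On" timestamps (s)
memory_labels = np.array([1, 0, 1])              # hypothetical binary labels for one category

# Control (Time-shifted): extract patterns from windows 4 s before "Sample On",
# so the spike patterns carry no information about the sample images.
shifted_event_times = sample_on_times - 4.0

# Control (Label-shuffled): permute labels across trials to break any
# input-output relation while keeping the label distribution unchanged.
shuffled_labels = rng.permutation(memory_labels)
print(shifted_event_times, shuffled_labels)
```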
3.3 Results
3.3.1 Decoding Binary Spatial Location from Hippocampal Spikes in Rats
In rat decoding models, model inputs are spatio-temporal patterns of spikes recorded from hippocampal
CA3 and CA1 neurons around sample response events (-2 sec to 2 sec). Model outputs are locations of
Figure 3.4: Design of Testing Cases. Top: Delayed match-to-sample memory tasks; Middle: neural spiking activities and the selected decoding window of the specific testing case; Bottom: correlation of spike patterns with behavioral events in each testing case.
the lever (left or right) pressed at sample response events represented as a binary variable (1 or 0). We
build both single-layer single-resolution and double-layer multi-resolution decoding models using data
from five animals. Results show that both types of model can accurately decoding the sample locations
(see Table 3.1). In addition, double-layer models consistently outperform their corresponding single-layer
models with significantly higher MCCs (paired t-test, p < 0.001 in all cases).
Table 3.1: Model Performances of Single-Layer and Double-Layer Decoding Models in Rodent Data
Rat Number of Neurons Number of Trials Single-layer (MCC) Double-layer (MCC)
a 50 89 0.952 ± 0.008 0.977 ± 0.02
b 26 63 0.835 ± 0.038 0.943 ± 0.044
c 57 53 0.929 ± 0.032 0.989 ± 0.018
d 41 68 0.956 ± 0.015 0.964 ± 0.022
e 56 68 0.825 ± 0.031 0.945 ± 0.031
Figure 3.5 shows examples of decoding models from two animals. Since PIFs are unknown in exper-
imental data, peri-event histograms of neurons around sample response are shown instead. There is no
obvious difference between the left and right histograms in both animals (see Fig. 3.5A). This is also indicated in Fig. 3.5D, which shows little difference in spike counts between left and right without applying
the SCFM mask. However, with SCFM applied, the difference in spike counts between the two patterns
becomes much more apparent. In both models, SCFMs exhibit both positive and negative ROIs.
3.3.2 Decoding Visual Memory Categories from Hippocampal Spikes in Humans
We further applied the memory decoding model to hippocampal spiking data collected from human
subjects. In human decoding models, model inputs are spatio-temporal patterns of spikes around a
specific behavioral event of the DMS task. Model outputs are binary labels (1 or 0) corresponding to five
memory categories of sample images.
3.3.2.1 Human Subjects
We had 43 subjects enrolled in this project (26 from the Wake Forest University, 10 from the USC Keck
Hospital, and 7 from the Rancho Los Amigos National Rehabilitation Center) (Table 3.2).
Figure 3.5: Decoding hippocampal spike patterns in rats performing a DNMS task. Left: rat a. Right:
rat b. (A) Peri-event histograms of spike patterns during left and right sample response. (B) Raster
plot of one spatio-temporal pattern of spikes during left and right sample responses. (C) SCFMs of
double-layer decoding models. (D) Spike counts without and with SCFMs as masks.
All procedures were reviewed and approved by the Institutional Review Boards of Wake Forest Uni-
versity and the University of Southern California, in accordance with the National Institutes of Health.
The subjects provided voluntary written informed consent prior to participation in this study.
3.3.2.2 Selection Criteria of Human Subjects for Modeling
We defined four criteria for selecting subjects for modeling:
1. The number of correctly performed DMS trials should be larger than 100 to ensure enough sample
points for model training.
2. Sample images shown in the DMS trials should be labeled with (any of) five objective categories,
i.e., Animal, Building, Plant, Tool, and Vehicle, and each category should have more than twenty class 1 instances.
3. The average firing rate should range from 2 Hz to 20 Hz to filter out abnormal recording
sessions (e.g., noisy datasets).
4. The timestamp of every trial’s “Match Response” event should be at least 2 seconds before the
timestamp of the next trial’s “Sample Present” event.
By applying such criteria, we selected 24 subjects for modeling. Both double-layer multi-resolution
decoding models and single-layer single-resolution models were built with four testing cases (see section
3.2.6).
3.3.2.3 Performances of Human Memory Decoding Models
We first tested the double-layer multi-resolution model with the two normal decoding cases. Spatio-
temporal patterns of spikes around the “Sample Response” events (i.e., the periods of memory encoding)
and the “Match Response” events (i.e., the periods of memory retrieval) were used as model inputs,
respectively. Five memory labels of the sample images were used as model outputs.
Results show that, in both “Sample Response” and “Match Response” cases, the double-layer multi-
resolution model can yield significantly higher-than-zero MCCs in most categories and subjects (Fig. 3.6,
Table 3.2: Information of recorded human subjects (n=43). “WFU” represents subjects recorded from
Wake Forest University. “Keck” represents subjects recorded from the Keck Hospital of USC. “Rancho”
represents subjects recorded from the Rancho Rehabilitation Center. “LA”, “LP”, “RA”, and “RP” refer to the Left Anterior, Left Posterior, Right Anterior, and Right Posterior of the hippocampus, respectively.
Subject ID Electrode location # of Neurons # of Trials
WFU09 LA 29 66
WFU10 RA 20 86
WFU11 LA RA 43 78
WFU12 LA RA 15 54
WFU14 RA RP 28 55
WFU15 LA RA 44 72
WFU16 LA LP RA RP 97 99
WFU17 LA RA 41 99
WFU18 LA LP RA 65 100
WFU19 LA 22 100
WFU20 RA RP 38 120
WFU21 LA RA 43 199
WFU22 LA RA 42 172
WFU23 RA 19 200
WFU24 LA 20 199
WFU25 LA RA 46 196
WFU26 LA RA 40 147
WFU27 LA LP RA RP 48 150
WFU28 LA RA 44 149
WFU29 LA 27 150
WFU31 LA LP RA RP 64 150
WFU32 LA LP RA RP 65 150
WFU33 LA RA 31 150
WFU34 LA LP RA RP 58 150
WFU35 LA LP RA RP 62 150
WFU37 LA LP RA RP 62 150
Keck04 LA LP RA RP 51 150
Keck05 LA LP RA RP 57 99
Keck06 LA RA RP 60 89
Keck07 LA RA 37 100
Keck08 LA RA 35 68
Keck09 LA LP RA RP 55 145
Keck10 LA LP RA RP 75 150
Keck12 LA LP RA RP 49 135
Keck14 LA RA 38 53
Keck15 LA RA 37 150
Rancho01 LA RA 36 142
Rancho02 LA LP RA RP 64 89
Rancho03 LA LP RA RP 70 150
Rancho04 LA LP RA RP 80 109
Rancho05 LA 18 113
Rancho06 LA 9 145
Rancho07 LA LP RA 57 149
Figure 3.6: Modeling performances of the double-layer multi-resolution models obtained from all subjects
(n=24) in two normal decoding cases. Color lines: MCCs of an individual subject; Black solid lines:
average MCCs of all subjects.
distribution of color lines and the violin plots). The average MCCs that the model obtained in the decoding case of “Sample Response” are 0.29, 0.39, 0.47, 0.29, and 0.40, corresponding to the five memory categories “Animal”, “Building”, “Plant”, “Tool”, and “Vehicle”, respectively (black solid lines on Fig. 3.6, Left). In
the decoding case of “Match Response”, the MCCs of five memory categories are 0.38, 0.43, 0.5, 0.43,
and 0.36, respectively (black solid lines on Fig. 3.6, Right). Such results indicate that the double-layer
multi-resolution model can decode memory categories from spatio-temporal patterns of spikes during the
period of memory encoding and memory retrieval.
We further evaluated the model performances with the two control cases. In the case of “Control (Time-shifted)”, input spike patterns contain no memory information about the output memory labels, whereas in the case of “Control (Label-shuffled)”, no input-output relation persists across sample points. The main goal here is to check whether the model might suffer from any overfitting issues.
Results show that the double-layer multi-resolution model yields significantly lower MCCs in the case
of “Control (Time-shifted)” (Fig. 3.7, Left) than in both normal decoding cases. Most models yield zero
MCC in this case. The average MCCs of five memory categories are 0.06, 0.04, 0.03, 0.04, and 0.03,
Figure 3.7: Modeling performances of the double-layer multi-resolution models obtained from all subjects
(n=24) in two control cases. Color lines: MCCs of an individual subject; Black solid lines: average MCCs
of all subjects.
respectively (black solid lines on Fig. 3.7, Left). In addition, all models yield zero MCC in the case of “Control (Label-shuffled)” (Fig. 3.7, Right). These results demonstrate that the double-layer multi-resolution model can effectively avoid overfitting and decode real memory-related information.
Details of two example human decoding models decoding the “Sample Response” case are shown in
Figures 3.8 and 3.9. Similar to what is shown in the rodent models (see Fig. 3.5), there is no obvious
difference in the peri-event histograms of spikes across different categories (see Fig. 3.8A and 3.9A).
However, the decoding models identify the sparsely distributed spatio-temporal regions of spike patterns
that best differentiate the patterns across different categories (see Fig. 3.8D and 3.9D). These regions
are visualized with the SCFMs, which are characteristic of each category.
Moreover, in the structure of our double-layer multi-resolution memory decoding model, the second
layer meta-learner is used to combine multiple first-layer single-resolution base learners to make the final
prediction. To demonstrate the contribution of the second-layer meta-learner, we statistically compared
the modeling performances of the double-layer multi-resolution models and the best single-layer single-
resolution models by using the paired sample t-test.
Figure 3.8: Decoding hippocampal spike patterns in human performing a DMS task (subject 1). A:
peri-event histograms of spike patterns during sample responses. B: Raster plot of one spatio-temporal pattern of spikes during sample response. C: SCFM of double-layer decoding models. D: Spike counts
with and without SCFMs as masks.
Figure 3.9: Decoding hippocampal spike patterns in human performing a DMS task (subject 2). A:
peri-event histograms of spike patterns during sample responses. B: Raster plot of one spatio-temporal pattern of spikes during sample response. C: SCFM of double-layer decoding models. D: Spike counts
with and without SCFMs as masks.
Results show that, as expected, double-layer multi-resolution models consistently outperform corre-
sponding best single-resolution models (p-value < 0.05 in all cases shown in Fig. 3.10). This demonstrates that the multi-resolution meta-learner can significantly boost modeling performance by combining multiple single-resolution base learners.
Figure 3.10: Paired t-test comparisons between the best single-layer (SL) single-resolution model and the
double-layer (DL) multi-resolution model. Color regions: violin distributions of MCCs obtained from all
subjects (n=24). Black solid lines connect each single-layer model and double-layer model pair.
3.3.2.4 Meta-Analysis of Human Decoding Results
We further performed meta-analyses on the decoding results to better understand several scientific ques-
tions.
1. Sparseness of Spatio-temporal Regions
From previous sections, we have shown that spatio-temporal regions of spike patterns, which are
characteristic of a certain memory category, are sparsely distributed in the decoding window. The double-
layer multi-resolution model can identify such spatio-temporal regions and visualize them with the SCFMs
(see Fig. 2.8, 3.5, 3.8, and 3.9). In our first meta-analysis, we quantitatively measured the sparseness of
such spatio-temporal regions across different categories and subjects.
The sparseness of each category’s spatio-temporal region is defined as the area of zero-valued SCFM regions divided by the area of the entire decoding window (i.e., one minus the fraction of the window occupied by non-zero SCFM regions). We statistically calculated the sparseness of each category’s corresponding spatio-temporal regions considering results from all subjects.
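A minimal sketch of this measure for a single SCFM is shown below; the SCFM values and window size are hypothetical, not estimated ones.

```python
# Minimal sketch of the sparseness measure for one SCFM.
import numpy as np

scfm = np.zeros((20, 400))                # neurons x time bins of the decoding window (assumed)
scfm[3, 50:90] = 0.8                      # a few non-zero ROIs
scfm[11, 200:260] = -0.5

sparseness = np.mean(scfm == 0)           # zero-valued area / area of the entire window
print(f"sparseness = {sparseness:.3f}")   # close to 1 for a sparse SCFM
```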
Results show that, overall, the sparseness of each category is larger than 85% (Fig. 3.11) in both the period of memory encoding and the period of memory retrieval. The sparseness of spatio-temporal regions in all memory categories and hippocampal areas is similar, with low standard deviations. Such results suggest that a sparse coding pattern of memory information exists in the spatio-temporal patterns of spikes recorded from the hippocampus.
2. Contribution of Hippocampal CA3 and CA1 to the Decoding of Memory Categories
In human decoding studies, model inputs are spiking activities recorded from hippocampal CA3 and
CA1 regions. To gain insights into the contribution of CA3 and CA1 neurons to the decoding of memory
categories, we applied a permutation feature importance (PFI) analysis on model features extracted from
the CA3 neurons and features from the CA1 neurons, respectively. The unique contribution of a specific feature is calculated as the gain in loss that results from permuting this feature divided by the gain in loss that results from permuting all features. As such, the unique contribution of an individual CA3 or CA1 neuron can be calculated by permuting all CA3 or CA1 features and then normalizing the contribution by the number of CA3 or CA1 neurons.
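The sketch below illustrates this group-wise permutation idea with a generic logistic classifier and a synthetic feature matrix; the CA3/CA1 column grouping, the neuron counts, and the loss function are illustrative assumptions, not the actual model or data.

```python
# A sketch of group-wise permutation feature importance (PFI) with sklearn stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 30))                           # 120 trials x 30 model features (synthetic)
y = (X[:, :5].sum(axis=1) + 0.5 * rng.normal(size=120) > 0).astype(int)
ca3_cols, ca1_cols = np.arange(0, 18), np.arange(18, 30) # hypothetical feature grouping
n_ca3, n_ca1 = 6, 4                                      # hypothetical neuron counts

clf = LogisticRegression(max_iter=1000).fit(X, y)

def loss(Xe):
    # In the actual analysis an out-of-sample loss would be used here.
    return log_loss(y, clf.predict_proba(Xe)[:, 1])

def permuted(Xe, cols):
    Xp = Xe.copy()
    Xp[:, cols] = Xp[rng.permutation(len(Xp))][:, cols]  # permute the selected columns across trials
    return Xp

base = loss(X)
gain_all = loss(permuted(X, np.arange(X.shape[1]))) - base
for name, cols, n in [("CA3", ca3_cols, n_ca3), ("CA1", ca1_cols, n_ca1)]:
    gain = loss(permuted(X, cols)) - base
    print(name, (gain / gain_all) / n)                   # normalized unique contribution
```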
We calculated such unique contributions of individual CA3 and CA1 neurons considering all categories
and subjects and compared them statistically with the paired sample t-test. Results show that there is no
significant difference between the normalized unique contribution of CA3 and CA1 neurons (Fig. 3.12, p values > 0.01 in all cases).

Figure 3.11: The sparseness of spatio-temporal regions of five memory categories and two hippocampal areas. Bars: average sparseness of all subjects (n=24); Error bars: standard deviation of the sparseness.

Such results suggest that CA3 individual neurons and CA1 individual neurons
contribute equivalently to the decoding of memory categories.
Figure 3.12: Normalized unique contributions from individual hippocampal CA3 and CA1 neurons to the decoding of memory categories. Color areas: violin distribution of normalized unique contributions
obtained from all subjects (n=24). Black solid lines connect each CA3 and CA1 neuron’s contribution
pair.
3. Distribution of Memory Category Information in CA3 and CA1 Region
In addition to the contribution of CA3 and CA1 individual neurons, we further explored the distri-
bution of memory category information in the CA3 and CA1 regions.
We performed the permutation feature importance analysis on features from the CA3 region versus
features from the CA1 region across all categories and subjects. Results show that average contributions
from the CA3 region are larger than average contributions from the CA1 region (Fig. 3.13, violin plots).
However, it is worth noting that in our recording paradigm, we typically define 6 out of 10 channels on each electrode as CA3 channels and the remaining 4 out of 10 as CA1 channels. Therefore, such differences in average region contributions result from having a different number of CA3 and CA1 neurons. Nevertheless, statistically, the CA3 region contributes equivalently to the CA1 region (Fig. 3.13, p values > 0.01 in all cases).
Figure 3.13: Unique contributions from the hippocampal CA3 and CA1 region to the decoding of memory
categories. Color areas: violin distribution of unique contributions obtained from all subjects (n=24).
Black solid lines connect each CA3 and CA1 region’s contribution pair
Furthermore, unique contribution pairs shown in Figure 3.13 indicate a certain amount of redundant
information shared by CA3 and CA1 regions. The unique contribution measured by the PFI analysis
represents the percentage of independent information contained in the CA3 and CA1 regions. Therefore,
we can calculate the shared redundant information of one memory category in one subject by subtracting the sum of each CA3 and CA1 unique contribution pair from 100%.
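As a worked example of this calculation (with hypothetical unique contributions for one category in one subject, not measured values):

```python
# Worked example of the redundancy calculation.
ca3_unique, ca1_unique = 0.45, 0.35            # independent information fractions (assumed)
redundant = 1.0 - (ca3_unique + ca1_unique)    # information shared by CA3 and CA1
print(f"redundancy = {redundant:.0%}")         # -> 20%
```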
We calculated such redundant information considering all 24 subjects with 5 memory categories.
Results show that there is, overall, about 20% of redundant information shared by the CA3 and CA1 regions across all five memory categories and all subjects (Fig. 3.14). This suggests that the hippocampal CA3 and CA1 regions contain partially redundant information about memory categories.
Figure 3.14: Hippocampal CA3 and CA1 regions contain partially redundant information of memory
categories. Such a redundancy is similar across different memory categories.
4. Temporal Code of Memory Categories
One advantage of the double-layer multi-resolution model is that the model includes a wide range
of temporal resolutions by using multiple base learners to explore the optimal temporal resolution for
decoding. Therefore, we aim to better understand whether there are any optimal temporal resolutions,
and how low- and high-temporal resolutions contribute to the decoding of memory categories.
To investigate this topic, we performed the PFI analysis on the model’s base learners to quantitatively measure their unique contributions, i.e., the contribution of each base learner’s corresponding temporal resolution to the decoding of memory categories.
Figure 3.15: Contributions of low- and high-temporal resolution base learners to the decoding of memory
categories. Color dots: contribution of an individual model with a certain category; Color solid lines:
the average contribution of a certain category across all subjects (n=24); Black solid lines: the overall
contribution of all categories and all subjects.
Results show that, similar to our previous findings [68], there is not an exact single optimal temporal
resolution. The model uses multiple resolutions to achieve better decoding. However, base learners
with higher temporal resolutions contribute more than base learners with lower temporal resolutions in
predicting all five memory categories (Fig. 3.15, color lines and dots). Such a contribution difference
exists in both periods of memory encoding (Fig. 3.15, left) and retrieval (Fig. 3.15, right). It indicates
a temporal code, as opposed to a rate code, of memory categories in the hippocampal CA3 and CA1
spiking activities.
Chapter 4
Accelerating Model Estimation with Parallel Computing for
Testing Hippocampal Memory Prostheses in Humans
4.1 Introduction
The hippocampal prosthesis consists of a recording unit (e.g., multi-array electrodes) for recording input
spike trains from an upstream hippocampal region (e.g., CA3), a computational unit (e.g., input-output
models) for predicting the output spike trains in a downstream region (e.g., CA1), and a stimulator for
delivering electrical stimulations to the downstream region following the predicted output spike patterns.
It can be considered a neural code-based closed-loop deep brain stimulation (DBS) system.
To build the computational unit of the hippocampal memory prosthesis, we have developed two
types of input-output models, i.e., a multi-input multi-output (MIMO) model, and a double-layer multi-
resolution (DLMR) decoding model (see Fig. 4.1A). The MIMO model is a nonlinear dynamical regression
model that predicts downstream hippocampal spike trains based on upstream hippocampal spike trains
[32, 10, 69]. This model has proven to be a powerful tool for identifying input-output nonlinear dynam-
ics and driving neural code-based micro-stimulations for improving short-term and long-term memory
functions [70, 46, 71, 38]. The DLMR model is the proposed model described in chapter 2. It identifies
characteristic spike patterns for different memory categories and features and can potentially be used to
design stimulation patterns to elicit or suppress specific memories [35, 68].
Figure 4.1: Utilizing parallel computing strategies to accelerate MIMO and DLMR model estimation. A:
MIMO and DLMR models of spike trains. MIMO model captures the input-output spike transformation
properties and predicts output spike trains based on input spike trains. The DLMR model decodes memory categories from spatio-temporal patterns of spikes. B: Parallel computing strategy accelerates model estimation and allows it to be accomplished within the human study time constraints.
Both input-output models have been successfully implemented with hippocampal CA3 and CA1 spikes
recorded in rodents, nonhuman primates, and human epilepsy patients. The high model performances
result from:
1. Large-scale input-output datasets, i.e., both models are estimated from spiking data simultaneously
recorded from many neurons during a long recording time.
2. Large-scale nonlinear dynamical model structures, i.e., both models contain a large number of
model parameters for capturing complex nonlinear neural dynamics.
3. Computationally intensive estimation procedures, i.e., both models involve highly iterative proce-
dures to optimize hyper-parameters and estimate model coefficients.
All these factors at the same time dramatically increase the model estimation time. A MIMO or
DLMR model estimation typically requires hundreds of hours on a single personal computer. Although
long model estimation times are not problematic on their own, the time constraints imposed by human
studies (see section 3.2.2) require that model estimations have to be completed within a short time frame,
e.g., 48-72 hours after recording so that models can be further used to drive stimulations. Typically,
electrodes will be explanted from patients 5-7 days after the recording. Therefore, much faster model
estimation procedures are required.
To solve this problem, we have utilized parallel computing as a powerful tool for accelerating model
estimation procedures. In this strategy, the overall modeling estimation task is decomposed into multiple
independent sub-tasks that can be solved in parallel on different computer processors (see Fig. 4.1B). Such a
parallel computing strategy is implemented on a high-performance computer cluster at the University of
Southern California Center for Advanced Research Computing (CARC). Results show that this parallel
computing strategy allows us to accomplish all modeling tasks within the human study time constraints
and enables testing the hippocampal memory prosthesis in human subjects.
4.2 Methods
4.2.1 MIMO Estimation Procedures
The MIMO model is a nonlinear dynamical time-series regression model that predicts downstream hip-
pocampal spike trains based on upstream hippocampal spike trains. A MIMO model consists of a series
of multi-input single-output (MISO) models.
MISO models have a physiologically plausible structure that can be expressed as:
w = u(K, x) + a(H, y) + n(\sigma) \qquad (4.1)

y = \begin{cases} 0 & \text{when } w < \theta \\ 1 & \text{when } w \ge \theta \end{cases} \qquad (4.2)
where w is a hidden variable consisting of three components: the synaptic potential u, the after-potential a, and a pre-threshold Gaussian noise term n. The synaptic potential u results from the input spike trains x and a feed-forward kernel K, which describes the transformation from x to u. The after-potential a is triggered by the output spike train y and a feed-back kernel H. Once w exceeds the threshold θ, an output spike is generated by the model, and an after-potential is triggered and added to w through the feed-back kernel H. The Gaussian noise term renders the model stochastic.
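A toy generative sketch of this output stage is given below; the potentials, kernel, threshold, and noise level are arbitrary toy values chosen only to illustrate Eqs. 4.1-4.2, not fitted model quantities.

```python
# Toy sketch of the MISO output stage: a hidden variable w is compared with a
# threshold to produce a binary spike train with spike-triggered feedback.
import numpy as np

rng = np.random.default_rng(0)
T = 1000
u = 0.02 * rng.normal(size=T).cumsum()          # stand-in for the synaptic potential u(t)
theta, sigma = 0.5, 0.2                         # threshold and noise standard deviation (assumed)
h = -0.8 * np.exp(-np.arange(50) / 10.0)        # toy after-potential (feedback) kernel H

y = np.zeros(T, dtype=int)
a = np.zeros(T)                                 # after-potential triggered by output spikes
for t in range(T):
    w = u[t] + a[t] + sigma * rng.normal()      # Eq. 4.1
    if w >= theta:                              # Eq. 4.2: spike when w crosses the threshold
        y[t] = 1
        end = min(T, t + len(h))
        a[t:end] += h[:end - t]                 # feed the spike back through H
print(y.sum(), "spikes")
```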
The MIMO estimation procedure starts with a preprocessing step to create design matrices (see Fig. 4.2A). The design matrices are formed from spike trains with Laguerre expansion and Volterra series [72]. With the design matrices, estimation of MISO models becomes estimation of model coefficients and optimization of the model hyper-parameter (i.e., λ1).
The group lasso [73] is used to achieve group-level sparsity and to estimate model coefficients. With
the group lasso, every MISO model selects and estimates its coefficients at a group level by minimizing
an objective function S that consists of two components:
1. Likelihood estimation term: a negative log-likelihood l to be minimized.
2. Regularization term: a grouped L2-norm penalty term.
By using a second-order self-kernel model with N inputs and one output, we can express the function S as the equations below:

S(c) = l(c) + \lambda_1 \, \| C(u(t), a(t)) \|_2 \qquad (4.3)

l(c) = -\sum_{t=1}^{T} \left[ y(t) \log P(t) + (1 - y(t)) \log(1 - P(t)) \right] \qquad (4.4)

P(t) = \Phi(u(t) + a(t)) \qquad (4.5)

where C(·) denotes the model coefficients of u(t) and a(t), T is the data length, and λ1 is the regularization parameter. Φ is the normal cumulative distribution function, which combines the overall influence caused by the post-synaptic potential u and the after-potential a into the output firing probability P.
In this study, we set a pool of λ1 with 100 values that are logarithmically spaced on a selection path with the maximal value λ1max and the minimal value 0.01 λ1max. The maximal value of the selection path can be determined from features of the input spike trains and the model residual as:

\lambda_{1max} = \max_q \frac{\frac{1}{T} \| L(x)_q^{T} r \|}{\sqrt{J_q}} \qquad (4.6)

where L(x) denotes the convolution of the input spike trains with the Laguerre basis, r is the model residual, and J_q corresponds to the number of Laguerre basis functions in group q. λ1 is optimized by cross-validation that yields the smallest out-of-sample negative log-likelihood.
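The selection path itself is straightforward to generate, as the sketch below shows; the value of λ1max here is an arbitrary placeholder, whereas in practice it would come from Eq. 4.6.

```python
# Sketch of the regularization selection path: 100 values logarithmically spaced
# between lambda_1max and 0.01 * lambda_1max.
import numpy as np

lambda_1max = 2.5   # placeholder; computed from Eq. 4.6 in practice
path = np.logspace(np.log10(lambda_1max), np.log10(0.01 * lambda_1max), num=100)
print(path[0], path[-1])   # 2.5 ... 0.025
```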
The entire MIMO model estimation procedure consists of multiple MISO model estimation procedures within the output loop. For each output, it contains a fold loop for cross-validation (5-fold in this study) using different data splits. For each fold, there is another loop for optimizing the regularization parameter λ1 from the regularization selection path and estimating model coefficients for each given λ1 value (see Fig. 4.2A). The whole MIMO model estimation procedure thus involves three loops of regularized estimations and becomes computationally intensive.
Figure 4.2: Flowcharts of the model estimation procedure. A: estimation procedure of the MIMO model.
B: estimation procedure of the DLMR model.
4.2.2 DLMR Estimation Procedures
The estimation of DLMR model is to estimate model coefficients of all base learners and the meta-
learner. As described in Chapter 2, a 10-fold nested cross-validation method (see section 2.2.6) is applied
to optimize the regularization parameter
2
(see Eq. 2.10). In addition, the bagging method (see
section 2.2.5) is applied in the DLMR model by re-sampling the data to achieve ensemble estimations.
With bagging, data are re-sampled and partitioned into multiple splits for model estimations. The
62
hyper-parameter
2
is estimated globally by considering all splits; thus, an ensemble model with a lower
estimation variance is achieved.
Similar to the MIMO estimation, the DLMR model estimation procedure also consists of multiple estimation loops (see Fig. 4.2B). The first loop (label loop) deals with multiple memory labels (outputs) to be decoded from spike trains. The second loop (split loop) combines multiple resolutions with a meta-learner and produces ensemble model coefficients. The third loop (resolution loop) explores different temporal resolutions for extracting spike features using different numbers of B-spline basis functions. The fourth loop (fold loop) contains bagging-based ensemble model estimations for reducing estimation variance. The fifth loop (λ loop) optimizes the regularization parameter λ2 and estimates model coefficients corresponding to each λ2 value. It is worth noting that, by using nested cross-validation, two loops of
cross-validation, i.e., split loops and fold loops, are included in the DLMR model estimation procedure
[68]. Model training and validation happen in the inner fold loop while model testing happens in the
outer split loop.
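The overall loop structure can be summarized schematically as below; the stub functions and loop ranges are placeholders for illustration, not the actual implementation or its parameters.

```python
# Structural sketch of the five nested DLMR estimation loops.
memory_labels = ["Animal", "Building", "Plant", "Tool", "Vehicle"]
bagging_splits = range(20)                         # split loop (assumed number of splits)
resolutions = range(50, 151)                       # resolution loop (assumed B-spline numbers)
inner_folds = range(10)                            # fold loop (10-fold nested CV)
lambda_path = [10 ** (-k / 5) for k in range(10)]  # lambda loop (assumed path)

def fit_base_learner(label, split, J, fold, lam):  # stub for one L1-regularized fit
    pass

def fit_meta_learner(label, split):                # stub for combining base learners
    pass

for label in memory_labels:                        # 1) label loop
    for split in bagging_splits:                   # 2) split loop
        for J in resolutions:                      # 3) resolution loop
            for fold in inner_folds:               # 4) fold loop
                for lam in lambda_path:            # 5) lambda loop
                    fit_base_learner(label, split, J, fold, lam)
        fit_meta_learner(label, split)
```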
4.2.3 Parallel Computing Strategies
Both models need to perform computationally intensive modeling tasks with large-scale datasets within
a limited time-frame (see Fig. 3.2). Therefore, their amenability to parallelization is crucial for them to
handle such large-scale modeling tasks.
With the help of the CARC cluster, we can launch multiple model estimation procedures in parallel
by employing multiple computing nodes. With this characteristic, the most straightforward and effective
parallelization strategy is to identify independent modeling (sub-) tasks, which can be distributed to
separate computing nodes and run simultaneously. In practice, such a strategy can theoretically lead to a reduction in computing time by a factor of N, where N is the number of independent tasks.
In the MIMO model, if N output neurons were modeled, N MISO models need to be built. The
entire modeling procedure of each MISO model can be broken into three estimation stages. Estimation
stage 1 (S1) estimates the regularization parameter λ1max described in Eq. 4.6 to determine the selection path of λ1. Estimation stage 2 (S2) then takes charge of minimizing the objective function S described in Eq. 4.3. The last estimation stage (S3) is for hyper-parameter optimization, in which cross-validation is used to pick the optimal value of λ1. Note that S3 is the most computationally expensive stage since it also contains multiple runs of S1 and S2 for full model estimations. From the above, the serial (non-parallel) computing time for the MIMO model is O(N × (T(S1) + T(S2) + T(S3))).
Significantly, in the estimation procedure of the MIMO model, all MISO models can be estimated
independently since their outputs do not interact with each other (see Fig. 4.2A, output loop). Therefore,
we can distribute MISO estimations as independent tasks and run them in parallel to reduce the total
computing time.
Our first parallelization scheme (MIMO Scheme A) is based on the number of independent MISO models. Each MISO model can be distributed and estimated on a separate CARC computing node. Each node runs all three stages of a single MISO model. By estimating N MISO models on N computing nodes in parallel, we can theoretically reduce the computing time by a factor of N, to O(T(S1) + T(S2) + T(S3)). Here we assume that each MISO model requires a similar computing time for estimation tasks.
However, it is also important to note that the estimation of S3 can be further accelerated since each fold of the cross-validation is independent of the others (see Fig. 4.2A, fold loop). As such, we can further reduce every MISO model’s computing time by distributing each S3 to 5 additional CARC computing nodes after the original S1 and S2 are finished. With this scheme (MIMO Scheme B), we first run N MISO models on N nodes until the first two stages are accomplished. Subsequently, after saving all intermediate results of S1 and S2, we allocate 5N new nodes from CARC and distribute a one-fold task of S3 to each node. As such, the total computing time can be further reduced to O(T(S1) + T(S2) + T(S3)/5) (see Fig. 4.3).
Figure 4.3: Parallel MIMO computing schemes outperform the non-parallel computing scheme.

In the DLMR model, the estimation of each base learner is independent (see Fig. 4.2B, resolution loop). Such independence allows us to design parallelization schemes. In addition, the DLMR model involves multiple sets of estimations from different bagging splits to reduce model variance. Consequently, the estimation of each split can be parallelized (see Fig. 4.2B, split loop). It is worth noting that, since the model estimations with different resolutions and different splits are independent of each other, the two loops can be parallelized with respect to either resolution or split. In summary, the independent components of DLMR models are 1) base learners and 2) bagging splits.
Suppose the range of J is X to Y (see section 2.2.3), and we train the DLMR model with Z bagging splits. Assuming that the computing times required by the base learners with B-spline numbers X and Y are T_X and T_Y, respectively, the non-parallel computing time for the DLMR model is O((T_X + T_{X+1} + ... + T_{Y-1} + T_Y) × Z).
The first parallelization scheme (DLMR Scheme A) for the model is based on the number of splits Z, meaning that we can distribute the estimation procedure to Z computing nodes and let each node estimate all base learners and the meta-learner. After each node has accomplished its estimations, we can gather the results from all nodes and calculate the ensemble performance. With this parallel strategy, we can reduce the total computing time to O(T_X + T_{X+1} + ... + T_{Y-1} + T_Y).
The second scheme (DLMR Scheme B) stems from the independent base learners. In practice, the number of base learners tends to be much larger than the number of bagging splits, i.e., (Y − X) >> Z. In this way, if enough computing nodes are available on the cluster, this scheme can be faster than DLMR Scheme A by employing R (R = Y − X + 1) nodes. With this scheme, each node only needs to estimate a single base learner Z times. After all nodes finish, one head-node on the cluster combines all base learners’ outputs to train the meta-learner and then obtains the final ensemble estimations. Theoretically, since T_X < T_Y, the upper bound for the total computing time is decided by the most time-consuming base learner as O(T_Y × Z) (see Fig. 4.4).
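The analogous back-of-the-envelope comparison for the DLMR schemes is sketched below; the per-base-learner times and the number of splits are hypothetical placeholders.

```python
# Hypothetical comparison of serial vs. DLMR Scheme A vs. Scheme B computing times (hours).
import numpy as np

Z = 20                                      # assumed number of bagging splits
T_base = np.linspace(0.05, 0.4, 101)        # assumed times for the 101 base learners
serial   = T_base.sum() * Z                 # O((T_X + ... + T_Y) * Z)
scheme_a = T_base.sum()                     # one bagging split per node
scheme_b = T_base.max() * Z                 # one base learner per node, run Z times
print(round(serial, 1), round(scheme_a, 1), round(scheme_b, 1))   # 454.5 22.7 8.0
```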
4.2.4 Materials and Equipment
We have implemented the parallel computing strategies with the high-performance computer cluster at
the University of Southern California Center for Advanced Research Computing (CARC). The CARC
provides 600 computing nodes with more than 1000 CPU cores. Both MIMO and DLMR models are implemented in MATLAB 2019a and submitted through the SLURM (Simple Linux Utility for Resource Management) scheduler [74], which communicates with the cluster for all of the modeling steps (see Fig. 4.5).
Figure 4.4: Parallel DLMR computing schemes outperform non-parallel computing scheme
Figure 4.5: Implementing parallel computing with a computer cluster.
4.3 Results
We have tested the two parallel computing schemes for both MIMO and DLMR models with data collected
from 11 subjects using the CARC cluster. One subject typically needs to perform 100-150 DMS trials to
generate enough data for training the models. The performances of the parallel schemes are compared
with the performances of the non-parallel scheme.
4.3.1 Accelerating MIMO model estimation
Figure 4.6 illustrates parallel computing results of one MIMO model from one subject. This model
contains 15 input (CA3) neurons and 10 output (CA1) neurons.
Results show that by applying MIMO Scheme A, the computing time was reduced to 5.5 hours (see Fig. 4.6A). The total computing time of this scheme is determined by the slowest MISO model estimation on a single node (Node 10 in this case). Without the parallel scheme, the total computing time is approximately equal to the summation of all computing times on all nodes (43 hours).
Faster model estimations can be achieved by using MIMO Scheme B. In Scheme B, the cluster first assigns 10 computing nodes to run the MISO estimation stages S1 and S2. Then, when each node has finished S2, the cluster saves all intermediate results and terminates these nodes. Subsequently, the cluster assigns 50 new nodes to begin S3. Therefore, every new node only needs to perform a single fold of the cross-validation procedure.
The total computing time of MIMO Scheme B also depends on the slowest MISO model estimation. However, since S3 is divided into five sub-tasks, the entire computing time is reduced to approximately 2 hours (see Fig. 4.6B). Note that we terminate the first 10 nodes after they finish S2 and then use more computing nodes to run S3. Although this scheme requires additional computing time for communicating between the head-node and the cluster, this overhead is negligible compared with the other computing times.
MIMO computing times of all 11 subjects are compared across the schemes (see Fig. 4.6C). In
all cases, both parallel schemes accelerated MIMO model estimation. Scheme B further reduces MIMO
model computing times compared with Scheme A.
Figure 4.6: MIMO model estimations are accelerated by two parallel schemes. A: MIMO computing
time with Scheme A. B: MIMO computing time with Scheme B. C: comparison of computing time in all
subjects (n=11).
With such a parallelization strategy, MIMO model estimation can be completed within the clinical time window, which further enables testing the MIMO model-based hippocampal memory prosthesis in epilepsy patients. Results show that MIMO models can accurately predict CA1 spike trains based on CA3 spike trains during the DMS task [75]. More importantly, MIMO model-based electrical stimulation delivered to the hippocampal CA1 region during the sample phase of the DMS task significantly enhances DMS performance (by 35%) of epilepsy patients (n=22) [38].
4.3.2 Accelerating DLMR model estimation
Figure 4.7 illustrates parallel computing results of one DLMR model from one subject. This model
contains 36 input (CA3) neurons, 24 output (CA1) neurons, and 5 memory labels. In all DLMR model
estimations, the number of bagging splits is 20. In each split, 101 base learners (i.e., temporal resolutions)
were explored.
Results show that, by applying the DLMR Scheme A, the computing time was reduced from 296
hours to 17 hours (Fig. 4.7A). Since all computing nodes start simultaneously, the total computing time depends on the slowest computing node, which varies depending on the specific computing task. In
general, computing times across different computing nodes are quite similar.
Notably, the resolution-based Scheme B is even faster than the split-based Scheme A, as it utilizes more computing nodes (101, [50:150]) and each node runs its base learner fewer times (20). After all nodes finish their modeling tasks, one head-node on the cluster combines the outputs of all base learners to train the meta-learner. The computing time of this head-node is negligible compared to the computing time of the base learners.
Computing times of DLMR Scheme B show a very different pattern from those of DLMR Scheme A. In Scheme A, most of the computing nodes have similar computing times, while in Scheme B, the longest run-time belongs to the base learner with the highest B-spline resolution value. The total computing time of Scheme B is 4.5 hours (see Fig. 4.7C), which is much shorter than that of Scheme A. If no parallelization is used, the total computing time is 298 hours, which would make testing the DLMR model-based stimulation in this particular subject impossible.
DLMR computing times of all 11 subjects are compared across the schemes (see Fig. 4.7C). In all
cases, both parallel schemes accelerate DLMR model estimation. Scheme B further reduces DLMR model
computing times compared with Scheme A.
Figure 4.7: DLMR model estimations are accelerated by two parallel schemes. A: DLMR computing
time with Scheme A. B: DLMR computing time with Scheme B. C: comparison of computing time in all
subjects (n=11).
With such a parallelization strategy, DLMR model estimation can be completed within the clinical time window, which further enables testing the DLMR model-based hippocampal memory prosthesis in epilepsy
patients. Results show that DLMR models can accurately decode visual memory (i.e., sample image)
categories from the hippocampal CA3 and CA1 spike patterns during the DMS task [68]. The effect of
DLMR model-based electrical stimulation is still under investigation and will be reported in the future.
4.4 Summary
This chapter describes parallelization strategies for accelerating model estimations of both MIMO and
DLMR models to meet the human study time constraint requirement for testing a hippocampal prosthesis
in human subjects. The performances of the two parallel schemes for both models are evaluated and compared in multiple human subjects (n=11).
For the MIMO model, we take advantage of the independence of each MISO model to estimate all
MISO models simultaneously (MIMO Scheme A). In addition, S3 of each MISO model estimation is further parallelized to achieve even shorter computing times (MIMO Scheme B). The DLMR model
estimations are performed with multiple bagging splits to achieve low estimation variances, which are
time consuming on a single computer. However, since the estimations of DLMR models with different
splits and different resolutions are independent, we are able to reduce computing time by parallelizing
the splits (DLMR Scheme A) or the resolutions (DLMR Scheme B).
By implementing the parallel schemes with a high-performance computer cluster, we successfully reduced the
computing time of model estimations from hundreds of hours to tens of hours, which makes input-output
model-based stimulation experiments possible in the same patients.
Chapter 5
Discussion and Future Works
5.1 Future Development of Memory Decoding Model
5.1.1 Discussion
5.1.1.1 Contribution of the Bagging Procedure
The bagging procedure is included in our modeling methodology to facilitate model estimation and reduce estimation variances caused by the small sample size. To demonstrate the contribution of the bagging procedure, we built decoding models with and without bagging and compared their MCC performances (see Fig. 5.1). In the models without bagging, hyper-parameters are optimized within each inner loop and then applied to the outer-loop testing set.
Results show that the decoding models without the bagging procedure under-perform their corre-
sponding decoding models with bagging in all synthetic and experimental data sets. Specifically, models
with bagging consistently yield MCCs with higher mean values and lower STDs.
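As a rough, generic illustration of this kind of with/without-bagging comparison, the sketch below wraps an L1-regularized logistic classifier in a standard bagging ensemble; the data, hyper-parameters, and ensemble configuration are illustrative assumptions, not those of the actual decoding model.

```python
# Generic sketch comparing a single L1 logistic classifier against a bagged ensemble.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 500))                        # small sample size, high dimension
y = (X[:, :3].sum(axis=1) + rng.normal(size=150) > 0).astype(int)

base = LogisticRegression(penalty="l1", C=0.1, solver="liblinear")
bagged = BaggingClassifier(base, n_estimators=20, random_state=0)

print("single :", cross_val_score(base, X, y, cv=5).mean())
print("bagged :", cross_val_score(bagged, X, y, cv=5).mean())
```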
In addition, as shown in the results section of chapter 2 and chapter 3, the double-layer decoding
model drastically improves the model performance with its ensemble learning components (i.e., bagging
[section 2.2.5], and stacking [section 2.2.7]). This strong improvement results from the ensemble learn-
ing methods and is also due to the small sample size. In experimental datasets, the sample sizes are so small that most of the standard estimation methods fail with no above-chance performances; even with regularized estimation, most single-resolution single-trial (no bagging) base learners have moderate performances. This leaves enough room for the ensemble learning methods to boost the performance by combining multiple weak learners to form a strong learner. In contrast, when the sample size is large, the improvement caused by ensemble learning methods is expected to be less prominent [76].

Figure 5.1: Comparisons of decoding performances between the model with bagging and the model without the bagging procedure. A: Synthetic data sets. B: Rodent data sets. C: Human data set 1. D: Human data set 2. Blue bars: Mean MCCs of the model with bagging. Orange bars: Mean MCCs of the model without bagging. Error bars: STD.
5.1.1.2 Robustness of SCFM Patterns
In our modeling methodology, SCFMs are important variables for decoding and representing the spatio-
temporal characteristics of spike patterns for different output labels. To examine the robustness of SCFM
patterns identified by the decoding model, we conducted additional experiments using down-sampled
datasets to see whether the SCFM patterns are sensitive to down-sampling. Specifically, we compare
the results from the first half of the data and the results from the second half of the data in our analysis.
Results show that, although SCFMs estimated from the two down-sampled datasets are slightly noisier,
they are nonetheless very similar to SCFMs estimated with full datasets (see Fig. 5.2). More importantly,
the Full pattern, which is the pattern estimated from all data using the 10-fold bagging procedure, captures the pattern that is common to both the First Half pattern and the Second Half pattern. This is not
surprising since in our modeling approach, SCFMs are estimated with a bagging-based ensemble learning
approach: multiple copies of SCFMs are estimated from multiple bagging down-sampled datasets and
then averaged to yield stable estimations that can reflect true patterns within the data.
5.1.1.3 Comparison with Other Baseline Decoders
Our modeling methodology is designed to solve a highly under-determined decoding problem with short
data length. Such a challenging decoding problem cannot be readily solved by baseline decoders. To
prove this, we have implemented multiple standard baseline decoders with all synthetic and experimental
datasets and compared their performances with those of our decoding models. The baseline decoders include a naive Bayes decoder, a logistic regression classifier, an L1-regularized (LASSO) logistic regression classifier using original spikes as inputs, and an L1-regularized (LASSO) logistic regression classifier using PCA features extracted from original spikes as inputs.
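The sketch below shows how such baselines might be instantiated with off-the-shelf components; it is illustrative only, and the hyper-parameters shown are assumptions rather than the values used in the actual comparison.

```python
# Sketches of the baseline decoders, expressed with generic sklearn components.
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

baselines = {
    "naive Bayes": BernoulliNB(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "LASSO logistic (spike inputs)": LogisticRegression(penalty="l1",
                                                        solver="liblinear", C=0.1),
    "LASSO logistic (PCA inputs)": make_pipeline(PCA(n_components=20),
                                                 LogisticRegression(penalty="l1",
                                                                    solver="liblinear",
                                                                    C=0.1)),
}
print(list(baselines))
```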
Results show that in all datasets, our decoding model (see Fig. 5.3) achieves the highest decoding MCCs. All baseline decoders yield poor performances in both synthetic and experimental datasets, except in a few cases, e.g., logistic regression and LASSO logistic regression in the third synthetic dataset.
These results indicate the necessity of including model selection and ensemble learning procedures in the
modeling methodology for solving the decoding problem described in this study.
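For reference, the following sketch shows how baseline decoders of these kinds can be assembled and scored with MCC; it uses scikit-learn on placeholder data, and the regularization settings, PCA dimensionality, and cross-validation scheme are assumptions rather than the exact configuration used in this study.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score
from sklearn.metrics import make_scorer, matthews_corrcoef

X = np.random.default_rng(1).normal(size=(150, 200))    # placeholder spike-derived features
y = np.random.default_rng(2).integers(0, 2, size=150)   # placeholder memory labels

baselines = {
    "naive Bayes": GaussianNB(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "LASSO logistic (spikes)": LogisticRegression(penalty="l1", solver="liblinear"),
    "LASSO logistic (PCA)": make_pipeline(
        PCA(n_components=20),
        LogisticRegression(penalty="l1", solver="liblinear")),
}

mcc = make_scorer(matthews_corrcoef)
for name, model in baselines.items():
    scores = cross_val_score(model, X, y, cv=5, scoring=mcc)   # cross-validated MCC per baseline
    print(f"{name}: MCC = {scores.mean():.3f} +/- {scores.std():.3f}")
```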
5.1.1.4 Resemblance to Artificial Neural Networks
The double-layer decoding model described in chapter 2 bears some resemblance to an artificial neural
network (ANN). For example, both are supervised learning models with a layered structure. The logistic
function is also a commonly used activation function in ANNs. However, fundamental differences exist between these two models in that the ANN originated from a connectionism learning theory, in which simultaneous processing of distributed computational units is used.

Figure 5.2: Comparisons of SCFM patterns estimated using the full dataset (first column), the first half of the dataset (second column), and the second half of the dataset (third column). A and B: example SCFMs from two rodent datasets. C and D: example SCFMs from two human datasets.

Figure 5.3: Comparisons of performance of multiple decoders. A: synthetic datasets. B: rodent datasets. C: human dataset 1. D: human dataset 2. Colored bars: mean MCCs achieved by the decoders. Blue bars: double-layer multi-resolution model. Orange bars: single-layer single-resolution model. Yellow bars: naive Bayes decoder. Purple bars: logistic regression classifier. Green bars: L1-regularized (LASSO) logistic regression classifier using spike trains as inputs. Cyan bars: L1-regularized (LASSO) logistic regression classifier using PCA features from spike trains as inputs. Error bars: STD.

In contrast, the double-layer decoding model described in chapter 2 is inspired by functional expansion and ensemble learning theories, where multiple independently estimated computational units emphasizing different aspects (e.g., temporal resolutions) of the modeled system are combined to form the final model. The main reason behind this choice is the highly under-determined estimation problem caused by the high dimensionality of the input signals and, more importantly, the very short data length (100-200 instances). Given such data length, model estimation at every step can include only a minimal number of open parameters to keep the estimation feasible and avoid overfitting. In our double-layer decoding model, every base learner is estimated with all (training) data points available. The meta-learner again uses all (training) data points to determine the weight of each base learner. This somewhat "greedy" training strategy is necessary for the highly under-determined problem being solved in this work, but it is drastically different from what is typically used in ANNs (e.g., back-propagation). When the standard back-propagation method is used, all model parameters of the ANN are trained simultaneously based on the error signal. This causes serious overfitting or non-convergence when the sample size is small, as in human brain studies. The ensemble learning approach greedily estimates model parameters group by group and step by step to solve the under-determined estimation problem and mitigate overfitting. It can be considered a very special way of training an ANN with a small sample size.
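A minimal sketch of this greedy, two-step training is given below (Python; the temporal resolutions, feature shapes, and in-sample stacking are illustrative assumptions, not the exact procedure of chapter 2): each base learner is fit independently, and only then is the meta-learner fit on their outputs, with no error signal propagated back through the base learners.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
y = rng.integers(0, 2, size=100)                         # binary memory label (placeholder)
# one feature matrix per temporal resolution (e.g., different bin widths / numbers of knots)
resolutions = {r: rng.normal(size=(100, 30 * r)) for r in (1, 2, 4)}

# Step 1: estimate each base learner independently, each with all training data
base_learners = {}
for r, Xr in resolutions.items():
    clf = LogisticRegression(penalty="l1", solver="liblinear")
    clf.fit(Xr, y)
    base_learners[r] = clf

# Step 2: the meta-learner combines the base learners' probability outputs
Z = np.column_stack([base_learners[r].predict_proba(resolutions[r])[:, 1]
                     for r in resolutions])
meta = LogisticRegression(penalty="l1", solver="liblinear").fit(Z, y)
```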
5.1.2 Future Directions
Modeling factors involved in the study described in section 2.3.4 can be further explored to improve
the model performances in future studies. The current model is computationally intensive and requires
tens to hundreds of hours on a single personal computer. Speeding up model estimation is crucial for
experimental and clinical applications as described in chapter 4 and in [77, 78]. For example, the L1-regularization rate (Eq. 2.10 and Eq. 4.3), which controls the sparsity level, is a key determinant of the total computing time, since model estimation needs to be repeated for each candidate value. In our current strategy, the regularization rate is treated as a hyper-parameter; its optimal value is searched exhaustively within a certain range (10^-5 to 10^0) at high resolution using the nested cross-validation procedure (see section 2.2.6).
The sequence of candidate values can potentially be reduced and optimized based on empirical results to speed up model estimation. Similarly, the set of temporal resolutions to be explored can also be reduced to include only the more plausible temporal resolutions. As a starting point, we use over-complete sets of hyper-parameters to impose minimal constraints on the models; as more insights are
gained from more analyses using experimental data, the hyper-parameter space is expected to be greatly
reduced without sacrificing the model performance.
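As an illustration of this kind of speed-up, the sketch below searches only a coarse candidate grid of regularization values with an inner cross-validation loop; the grid endpoints, the scoring metric, and scikit-learn's C = 1/lambda convention are assumptions for the example, not the settings used in this dissertation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X = np.random.default_rng(4).normal(size=(120, 80))     # placeholder features
y = np.random.default_rng(5).integers(0, 2, size=120)   # placeholder labels

lambdas = np.logspace(-5, 0, 6)          # coarse candidate grid instead of a high-resolution sweep
param_grid = {"C": 1.0 / lambdas}        # scikit-learn parameterizes strength as C = 1/lambda

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(LogisticRegression(penalty="l1", solver="liblinear"),
                      param_grid, cv=inner_cv, scoring="balanced_accuracy")
search.fit(X, y)   # an outer CV loop (omitted here) would evaluate search.best_estimator_
print(search.best_params_)
```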
In the current version of the double-layer multi-resolution decoding model, output labels are treated
independently; a binary classification model is built for each output label using only input (spike pattern) features. The dependencies between output labels are not included. For example, the human spike decoding model consists of five independent decoding models for five memory (sample image) categories. This is mainly because the sample images are hand-selected to be relatively independent [75]; the dependence between image categories is thus weak and would not generalize to other images. When extending to larger-scale decoding models with a larger number of categories and a naturalistic distribution
of sample images, the decoding model can be modified to explicitly incorporate dependencies between
image categories into the model using a graphical modeling approach [79, 80, 81]. By exploiting such
dependencies, model decoding performances are expected to be improved further.
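One lightweight way to let label predictions inform each other is sketched below with scikit-learn's ClassifierChain on placeholder data, where each label's classifier is conditioned on the predictions for earlier labels; this is only a simple stand-in for the graphical-model approaches cited above, not an implementation of them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain

rng = np.random.default_rng(7)
X = rng.normal(size=(150, 40))             # placeholder spike-pattern features
Y = rng.integers(0, 2, size=(150, 5))      # five (possibly dependent) memory-category labels

# each classifier in the chain sees X plus the predicted labels earlier in the chain
chain = ClassifierChain(LogisticRegression(max_iter=1000), order="random", random_state=0)
chain.fit(X, Y)
probs = chain.predict_proba(X)             # (150, 5) label probabilities that reflect dependencies
```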
5.2 Future Development of Parallel Computing Models
5.2.1 Discussion
There are several limitations with the current experimental and modeling procedures for developing and
testing a clinically-viable hippocampal memory prosthesis. First, Ad-Tech electrodes are not ideal electrodes for recording unitary spiking activities due to their relatively large electrode size (40 µm diameter), low density (24 microelectrodes per probe), and short implantation time (typically 2 weeks). The signal-to-noise ratio is relatively low, and the number of neurons recorded and included in the models is still small compared with that of animal electrodes. Long-term variation of neural coding and encoding is currently impossible to assess due to the short implantation time. These limitations are expected to be overcome with the advance of multi-electrode array techniques, which can potentially enable multi-region,
large-scale, chronic recording of the hippocampus [82, 83, 84].
Second, a single memory task, i.e., the DMS task, is used to study hippocampal memory function in this study. This task only involves static visual stimuli (images) presented with short time delays. Although such simplification is commonly used and necessary for proof-of-concept studies, to fully develop hippocampal memory prostheses it is necessary to include different forms of episodic memories involving multiple sensory modalities, e.g., auditory, spatial, and olfactory, so that hippocampal functions can be studied and modeled more completely [85, 86, 87, 88]. Ideally, the behavioral task should include rich sets of spatio-temporal sequences of behavioral events, as in real life. We are planning to implement this type of naturalistic behavioral paradigm in our future studies to test the generality of our modeling approach and of the hippocampal memory prostheses.
Finally, we would like to emphasize that both MIMO and MD models are input-output functional
models and have been formulated to be general enough for testing other hippocampal functions. Specifi-
cally, the MIMO model aims to reinstate signal transmission from an upstream region of the hippocampus
to a downstream region of the hippocampus impaired by diseases or injuries. It does not assume or rely
on a specific hippocampal function as long as sufficient numbers of input and output neurons are included. The MD model can be readily used to decode other memory information, such as locations during the spatial delayed nonmatch-to-sample task [48]. Therefore, the parallel computing strategy described in
chapter 4 is also applicable to the modeling and testing of other hippocampal functions during other
experimental paradigms. The key idea of this strategy is to identify and further parallelize the computa-
tionally independent sub-tasks during modeling procedures such as model estimation and cross-validation
with different hyper-parameters. These procedures are commonly used in neural modeling to yield bet-
ter model estimation. Therefore, such a parallelization strategy can be applied to any other modeling
procedure involving similar independent sub-tasks.
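The essence of the strategy can be illustrated with a local process pool: each regularized estimation depends only on its own hyper-parameter value, so the fits can run concurrently. The sketch below is a toy stand-in for the cluster-based workflow described in chapter 4, with placeholder data and hyper-parameter values.

```python
import numpy as np
from multiprocessing import Pool
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
X = rng.normal(size=(150, 60))          # placeholder features
y = rng.integers(0, 2, size=150)        # placeholder labels

def fit_one(lam):
    """Fit one regularized model; each call is an independent sub-task."""
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0 / lam)
    clf.fit(X, y)
    return lam, clf.score(X, y)

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # each regularization value is dispatched to a separate worker process
        results = pool.map(fit_one, np.logspace(-5, 0, 12))
    print(results)
```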
5.2.2 Future Directions
Moving forward, we may further improve computing efficiency by changing computing hardware and optimizing the parallel computing strategy. In terms of hardware, a promising solution is to switch from CPU cores to GPU cores [89, 90, 91]. GPUs are advantageous over CPUs since they may allow us to run thousands of smaller sub-tasks, instead of the current tens or hundreds of sub-tasks, on a reduced number of computers [92]. To take advantage of the much larger scale of GPU-based parallelization, our model estimation procedures need to be implemented in a more standard form of the kind typically used for building artificial neural network (ANN) models. In addition, the estimation procedure needs to be further parallelized to make sure that each sub-task can be handled by a single GPU. One possibility is to parallelize the MIMO and MD model estimation down to the level of each individual regularized likelihood estimation of a MISO model or base learner, since such estimations involve different regularization parameters, are independent from each other (see Fig. 4.2, the regularization-parameter loop), and thus can be performed on separate nodes simultaneously. This ideal-case parallel computing strategy has the potential to further reduce the model computing time by another 100 times, which has important implications for testing the input-output
model-based DBS intraoperatively and developing clinically viable hippocampal memory prostheses.
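As a rough illustration of this finer-grained parallelism, the sketch below batches many regularized logistic-regression fits into a single GPU tensor computation driven by a plain (sub)gradient loop; the use of PyTorch, the optimizer, and the problem sizes are assumptions for illustration, not the planned implementation or the exact penalized-likelihood estimator described in earlier chapters.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
N, D, L = 150, 60, 32                                   # trials, features, number of lambda values
X = torch.randn(N, D, device=device)                    # placeholder features
y = torch.randint(0, 2, (N,), device=device).float()    # placeholder labels
lambdas = torch.logspace(-5, 0, L, device=device)       # one regularization value per model

W = torch.zeros(L, D, device=device, requires_grad=True)  # one weight vector per lambda
opt = torch.optim.Adam([W], lr=0.05)
for _ in range(200):
    logits = X @ W.t()                                   # (N, L): all L models evaluated at once
    nll = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, y.unsqueeze(1).expand(-1, L), reduction="none").mean(dim=0)
    loss = (nll + lambdas * W.abs().mean(dim=1)).sum()   # L1-penalized loss summed over models
    opt.zero_grad()
    loss.backward()
    opt.step()
```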
References
[1] Claire-Bénédicte Rivara, Chet C Sherwood, Constantin Bouras, and Patrick R Hof. Stereologic
characterization and spatial distribution patterns of betz cells in the human primary motor cortex.
The Anatomical Record Part A: Discoveries in Molecular, Cellular, and Evolutionary Biology: An
Official Publication of the American Association of Anatomists, 270(2):137–151, 2003.
[2] Otfrid Foerster. The motor cortex in man in the light of hughlings jackson’s doctrines. Brain,
59(2):135–159, 1936.
[3] Kalanit Grill-Spector and Rafael Malach. The human visual cortex. Annu. Rev. Neurosci., 27:649–
677, 2004.
[4] Li Zhaoping and Zhaoping Li. Understanding vision: theory, models, and data. Oxford University
Press, USA, 2014.
[5] Neil Burgess, Eleanor A Maguire, and John O’Keefe. The human hippocampus and spatial and
episodic memory. Neuron, 35(4):625–641, 2002.
[6] Sam A Deadwyler and Robert E Hampson. Ensemble activity and behavior—what’s the code?
Science, 270(5240):1316–1316, 1995.
[7] John P Donoghue. Connecting cortex to machines: recent advances in brain interfaces. Nature
neuroscience, 5(11):1085–1088, 2002.
[8] Liam Paninski, Jonathan W Pillow, and Eero P Simoncelli. Maximum likelihood estimation of a
stochastic integrate-and-fire neural encoding model. Neural computation, 16(12):2533–2561, 2004.
[9] Dawn M Taylor, SI Helms Tillery, and Andrew B Schwartz. Information conveyed through brain-
control: cursor versus robot. IEEE Transactions on Neural Systems and Rehabilitation Engineering,
11(2):195–199, 2003.
[10] Dong Song, Haonan Wang, Catherine Y Tu, Vasilis Z Marmarelis, Robert E Hampson, Sam A
Deadwyler, and Theodore W Berger. Identification of sparse neural functional connectivity us-
ing penalized likelihood estimation and basis functions. Journal of computational neuroscience,
35(3):335–357, 2013.
Ian H Stevenson, James M Rebesco, Nicholas G Hatsopoulos, Zach Haga, Lee E Miller, and Konrad P Kording. Bayesian inference of functional connectivity and network structure from spikes. IEEE
Transactions on Neural Systems and Rehabilitation Engineering, 17(3):203–213, 2008.
[12] Jennifer L Collinger, Brian Wodlinger, John E Downey, Wei Wang, Elizabeth C Tyler-Kabara,
Douglas J Weber, Angus JC McMorland, Meel Velliste, Michael L Boninger, and Andrew B
Schwartz. High-performance neuroprosthetic control by an individual with tetraplegia. The Lancet,
381(9866):557–564, 2013.
[13] Leigh R Hochberg, Mijail D Serruya, Gerhard M Friehs, Jon A Mukand, Maryam Saleh, Abraham H
Caplan, Almut Branner, David Chen, Richard D Penn, and John P Donoghue. Neuronal ensemble
control of prosthetic devices by a human with tetraplegia. Nature, 442(7099):164–171, 2006.
[14] Mark S Humayun, James D Weiland, Gildo Y Fujii, Robert Greenberg, Richard Williamson, Jim Lit-
tle, Brian Mech, Valerie Cimmarusti, Gretchen Van Boemel, Gislin Dagnelie, et al. Visual perception
in a blind subject with a chronic microelectronic retinal prosthesis. Vision research, 43(24):2573–
2581, 2003.
[15] Gerald E Loeb, Frances JR Richmond, and Lucinda L Baker. The bion devices: injectable interfaces
with peripheral nerves and muscles. Neurosurgical focus, 20(5):1–9, 2006.
[16] KH Mauritz and HP Peckham. Restoration of grasping functions in quadriplegic patients by func-
tional electrical stimulation (fes). International Journal of Rehabilitation Research, 10:57–60, 1987.
[17] Terence David Sanger. Probability density estimation for the interpretation of neural population
codes. Journal of neurophysiology, 76(4):2790–2793, 1996.
[18] Emery N Brown, Loren M Frank, Dengda Tang, Michael C Quirk, and Matthew A Wilson. A
statistical paradigm for neural spike train decoding applied to position prediction from ensemble
firing patterns of rat hippocampal place cells. Journal of Neuroscience, 18(18):7411–7425, 1998.
[19] Wei Wu, Michael J Black, David Mumford, Yun Gao, Elie Bienenstock, and John P Donoghue.
Modeling and decoding motor cortical activity using a switching kalman filter. IEEE transactions
on biomedical engineering, 51(6):933–942, 2004.
[20] AE Brockwell, Robert E Kass, and Andrew B Schwartz. Statistical signal processing and the motor
cortex. Proceedings of the IEEE, 95(5):881–898, 2007.
[21] Ryan C Kelly and Tai Sing Lee. Decoding v1 neuronal activity using particle filtering with volterra
kernels. In Advances in neural information processing systems, pages 1359–1366, 2004.
[22] Shy Shoham, Liam M Paninski, Matthew R Fellows, Nicholas G Hatsopoulos, John P Donoghue,
and Richard A Normann. Statistical encoding model for a primary motor cortical brain-machine
interface. IEEE Transactions on Biomedical Engineering, 52(7):1312–1322, 2005.
[23] Frederic Theunissen, J Cooper Roddey, Steven Stufflebeam, Heather Clague, and John P
Miller. Information theoretic analysis of dynamical encoding by four identified primary sensory
interneurons in the cricket cercal system. Journal of neurophysiology, 75(4):1345–1364, 1996.
[24] William Bialek, Fred Rieke, RR De Ruyter Van Steveninck, and David Warland. Reading a neural
code. Science, 252(5014):1854–1857, 1991.
[25] Vasilis Marmarelis. Analysis of physiological systems: The white-noise approach. Springer Science
& Business Media, 2012.
[26] Konstantinos Alataris, Theodore W Berger, and Vasilis Z Marmarelis. A novel network for nonlinear
modeling of neural systems with arbitrary point-process inputs. Neural Networks, 13(2):255–266,
2000.
[27] Miguel AL Nicolelis. Brain–machine interfaces to restore motor function and probe neural circuits.
Nature Reviews Neuroscience, 4(5):417–422, 2003.
[28] Maryam M Shanechi, Ziv M Williams, Gregory W Wornell, Rollin C Hu, Marissa Powers, and
Emery N Brown. A real-time brain-machine interface combining motor target and trajectory intent
using an optimal feedback control design. PloS one, 8(4):e59049, 2013.
[29] David Raposo, Matthew T Kaufman, and Anne K Churchland. A category-free neural population
supports evolving demands during decision-making. Nature neuroscience, 17(12):1784, 2014.
[30] Maryam M Shanechi, Rollin C Hu, Marissa Powers, Gregory W Wornell, Emery N Brown, and Ziv M
Williams. Neural population partitioning and a concurrent brain-machine interface for sequential
motor function. Nature neuroscience, 15(12):1715–1722, 2012.
[31] Cunle Qian, Xuyun Sun, Yueming Wang, Xiaoxiang Zheng, Yiwen Wang, and Gang Pan. Bin-
less kernel machine: Modeling spike train transformation for cognitive neural prostheses. Neural
Computation, 32(10):1863–1900, 2020.
[32] Dong Song, Rosa HM Chan, Vasilis Z Marmarelis, Robert E Hampson, Sam A Deadwyler, and
Theodore W Berger. Nonlinear dynamic modeling of spike train transformations for hippocampal-
cortical prostheses. IEEE Transactions on Biomedical Engineering, 54(6):1053–1066, 2007.
[33] R Quian Quiroga, Leila Reddy, Christof Koch, and Itzhak Fried. Decoding visual inputs from
multiple neurons in the human temporal lobe. Journal of neurophysiology, 98(4):1997–2007, 2007.
[34] Yichen Zhang, Shanshan Jia, Yajing Zheng, Zhaofei Yu, Yonghong Tian, Siwei Ma, Tiejun Huang,
and Jian K Liu. Reconstruction of natural visual scenes from neural spikes with deep neural networks.
Neural Networks, 125:19–30, 2020.
[35] Dong Song, Xiwei She, Robert E Hampson, Sam A Deadwyler, and Theodore W Berger. Multi-
resolution multi-trial sparse classification model for decoding visual memories from hippocampal
spikes in human. In 2017 39th Annual International Conference of the IEEE Engineering in Medicine
and Biology Society (EMBC), pages 1046–1049. IEEE, 2017.
[36] David K Warland, Pamela Reinagel, and Markus Meister. Decoding visual information from a
population of retinal ganglion cells. Journal of neurophysiology, 78(5):2336–2350, 1997.
[37] David Sussillo, Paul Nuyujukian, Joline M Fan, Jonathan C Kao, Sergey D Stavisky, Stephen Ryu,
and Krishna Shenoy. A recurrent neural network for closed-loop intracortical brain–machine interface
decoders. Journal of neural engineering, 9(2):026027, 2012.
[38] Robert E Hampson, Dong Song, Brian S Robinson, Dustin Fetterhoff, Alexander S Dakos, Brent M
Roeder, Xiwei She, Robert T Wicks, Mark R Witcher, Daniel E Couture, et al. Developing a
hippocampal neural prosthetic to facilitate human memory encoding and recall. Journal of neural
engineering, 15(3):036014, 2018.
[39] Hélene Paugam-Moisy and Sander M Bohte. Computing with spiking neuron networks. Handbook
of natural computing, 1:1–47, 2012.
[40] Simon Thorpe, Arnaud Delorme, and Rufin Van Rullen. Spike-based strategies for rapid processing.
Neural networks, 14(6-7):715–725, 2001.
[41] Jilles Vreeken. Spiking neural networks, an introduction, 2003.
[42] Ehsan Arabzadeh, Stefano Panzeri, and Mathew E Diamond. Deciphering the spike train of a
sensory neuron: counts and temporal patterns in the rat whisker pathway. Journal of Neuroscience,
26(36):9216–9226, 2006.
[43] Edmund C Lalor and John J Foxe. Neural responses to uninterrupted natural speech can be extracted
with precise temporal resolution. European journal of neuroscience, 31(1):189–193, 2010.
[44] Michael W Oram, Dengke Xiao, Barbara Dritschel, and Kevin R Payne. The temporal resolution
of neural codes: does response latency have a unique role? Philosophical Transactions of the Royal
Society of London. Series B: Biological Sciences, 357(1424):987–1001, 2002.
[45] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolu-
tional neural networks. Communications of the ACM, 60(6):84–90, 2017.
[46] Theodore W Berger, Dong Song, Rosa HM Chan, Vasilis Z Marmarelis, Jeff LaCoss, Jack Wills,
Robert E Hampson, Sam A Deadwyler, and John J Granacki. A hippocampal cognitive prosthe-
sis: multi-input, multi-output nonlinear modeling and vlsi implementation. IEEE Transactions on
Neural Systems and Rehabilitation Engineering, 20(2):198–211, 2012.
[47] Dong Song, Rosa HM Chan, Vasilis Z Marmarelis, Robert E Hampson, Sam A Deadwyler, and
Theodore W Berger. Nonlinear modeling of neural population dynamics for hippocampal prostheses.
Neural Networks, 22(9):1340–1351, 2009.
[48] Dong Song, Madhuri Harway, Vasilis Z Marmarelis, Robert E Hampson, Sam A Deadwyler, and
Theodore W Berger. Extraction and restoration of hippocampal spatial memories with non-linear
dynamical modeling. Frontiers in systems neuroscience, 8:97, 2014.
[49] Larry Schumaker. Spline functions: basic theory. Cambridge University Press, 2007.
[50] Leo Breiman. Bagging predictors. Machine learning, 24(2):123–140, 1996.
[51] Thomas G Dietterich. Ensemble methods in machine learning. In International workshop on multiple
classifier systems, pages 1–15. Springer, 2000.
[52] Sudhir Varma and Richard Simon. Bias in error estimation when using cross-validation for model
selection. BMC bioinformatics, 7(1):91, 2006.
[53] John Ashworth Nelder and Robert WM Wedderburn. Generalized linear models. Journal of the
Royal Statistical Society: Series A (General), 135(3):370–384, 1972.
[54] Halbert White. Maximum likelihood estimation of misspecified models. Econometrica: Journal of
the Econometric Society, pages 1–25, 1982.
[55] Ron Kohavi et al. A study of cross-validation and bootstrap for accuracy estimation and model
selection. In Ijcai, volume 14, pages 1137–1145. Montreal, Canada, 1995.
[56] Carl De Boor. On calculating with b-splines. Journal of Approximation theory, 6(1):50–62, 1972.
[57] Catherine Y Tu, Dong Song, F Jay Breidt, Theodore W Berger, and Haonan Wang. Functional
model selection for sparse binary time series with multiple inputs. Economic Time Series: Modeling
and Seasonality, pages 477–497, 2012.
[58] Katerina Hlaváčková-Schindler, Milan Paluš, Martin Vejmelka, and Joydeep Bhattacharya. Causal-
ity detection based on information-theoretic approaches in time series analysis. Physics Reports,
441(1):1–46, 2007.
[59] D Rindskopf. An introduction to the bootstrap-efron, b, tibshirani, rj, 1997.
[60] BJM Abma. Evaluation of requirements management tools with support for traceability-based
change impact analysis. Master’s thesis, University of Twente, Enschede, 2009.
[61] NA Diamantidis, Dimitris Karlis, and Emmanouel A Giakoumakis. Unsupervised stratification of
cross-validation for accuracy estimation. Artificial Intelligence, 116(1-2):1–16, 2000.
[62] Padhraic Smyth and David Wolpert. Stacked density estimation. In Advances in neural information
processing systems, pages 668–674, 1998.
[63] David H Wolpert. Stacked generalization. Neural networks, 5(2):241–259, 1992.
[64] William Beecher Scoville and Brenda Milner. Loss of recent memory after bilateral hippocampal
lesions. Journal of neurology, neurosurgery, and psychiatry, 20(1):11, 1957.
[65] Stuart Zola-Morgan, Larry R Squire, and David G Amaral. Human amnesia and the medial tem-
poral region: enduring memory impairment following a bilateral lesion limited to field ca1 of the
hippocampus. Journal of Neuroscience, 6(10):2950–2967, 1986.
[66] CT Dickson and CH Vanderwolf. Animal models of human amnesia and dementia: hippocampal
and amygdala ablation compared with serotonergic and cholinergic blockade in the rat. Behavioural
brain research, 41(3):215–227, 1990.
[67] Sam A Deadwyler, Terence Bunn, and Robert E Hampson. Hippocampal ensemble activity during
spatial delayed-nonmatch-to-sample performance in rats. Journal of Neuroscience, 16(1):354–372,
1996.
[68] Xiwei She, Theodore W Berger, and Dong Song. A double-layer multi-resolution classification
model for decoding spatiotemporal patterns of spikes with small sample size. Neural Computation,
34(1):219–254, 2022.
[69] Dong Song, Rosa HM Chan, Vasilis Z Marmarelis, Robert E Hampson, Sam A Deadwyler, and
Theodore W Berger. Sparse generalized laguerre-volterra model of neural population dynamics. In
2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society,
pages 4555–4558. IEEE, 2009.
[70] Theodore W Berger, Robert E Hampson, Dong Song, Anushka Goonawardena, Vasilis Z Marmarelis,
and Sam A Deadwyler. A cortical neural prosthesis for restoring and enhancing memory. Journal
of neural engineering, 8(4):046017, 2011.
[71] Robert E Hampson, Greg A Gerhardt, Vasilis Marmarelis, Dong Song, Ioan Opris, Lucas Santos,
Theodore W Berger, and Sam A Deadwyler. Facilitation and restoration of cognitive function in
primate prefrontal cortex by a neuroprosthesis that utilizes minicolumn-specific neural firing. Journal
of neural engineering, 9(5):056012, 2012.
[72] Vasilis Z Marmarelis. Identification of nonlinear biological systems using laguerre expansions of
kernels. Annals of biomedical engineering, 21(6):573–589, 1993.
[73] Lei Yuan, Jun Liu, and Jieping Ye. Efficient methods for overlapping group lasso. In Advances in
neural information processing systems, pages 352–360, 2011.
[74] Andy B Yoo, Morris A Jette, and Mark Grondona. Slurm: Simple linux utility for resource man-
agement. In Workshop on Job Scheduling Strategies for Parallel Processing, pages 44–60. Springer,
2003.
[75] Dong Song, Brian S Robinson, Robert E Hampson, Vasilis Z Marmarelis, Sam A Deadwyler, and
Theodore W Berger. Sparse large-scale nonlinear dynamical modeling of human hippocampus
for memory prostheses. IEEE Transactions on Neural Systems and Rehabilitation Engineering,
26(2):272–280, 2016.
[76] Joshua I Glaser, Ari S Benjamin, Raeed H Chowdhury, Matthew G Perich, Lee E Miller, and
Konrad P Kording. Machine learning for neural decoding. Eneuro, 7(4), 2020.
[77] Xiwei She, Brian S Robinson, Theodore W Berger, and Dong Song. Accelerating estimation of a
multi-inputmulti-outputmodelofthehippocampuswithaparallelcomputingstrategy. In 2020 42nd
Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC),
pages 2479–2482. IEEE, 2020.
[78] Xiwei She, Brian Robinson, Garrett Flynn, Theodore W Berger, and Dong Song. Accelerating
input-output model estimation with parallel computing for testing hippocampal memory prostheses
in human. Journal of Neuroscience Methods, 370:109492, 2022.
[79] Jack Lanchantin, Arshdeep Sekhon, and Yanjun Qi. Neural message passing for multi-label classifi-
cation. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases,
pages 138–163. Springer, 2019.
[80] Zheng-Jun Zha, Xian-Sheng Hua, Tao Mei, Jingdong Wang, Guo-Jun Qi, and Zengfu Wang. Joint
multi-label multi-instance learning for image classification. In 2008 ieee conference on computer
vision and pattern recognition, pages 1–8. IEEE, 2008.
[81] Jie Zhou, Ganqu Cui, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li,
and Maosong Sun. Graph neural networks: A review of methods and applications. arXiv preprint
arXiv:1812.08434, 2018.
[82] Sahar Elyahoodayan, Wenxuan Jiang, Curtis D Lee, Xiecheng Shao, Gregory Weiland, John J
Whalen Iii, Artin Petrossians, and Dong Song. Stimulation and recording of the hippocampus
using the same pt-ir coated microelectrodes. Frontiers in Neuroscience, 15, 2021.
[83] Xuechun Wang, Ahuva Weltman Hirschberg, Huijing Xu, Zachary Slingsby-Smith, Aziliz Lecomte,
Kee Scholten, Dong Song, and Ellis Meng. A parylene neural probe array for multi-region deep brain
recordings. Journal of Microelectromechanical Systems, 29(4):499–513, 2020.
[84] Huijing Xu, Ahuva Weltman Hirschberg, Kee Scholten, Theodore William Berger, Dong Song, and
Ellis Meng. Acute in vivo testing of a conformal polymer microelectrode array for multi-region
hippocampal recordings. Journal of neural engineering, 15(1):016017, 2018.
[85] Howard Eichenbaum. On the integration of space, time, and memory. Neuron, 95(5):1007–1018,
2017.
[86] Michael J Kahana, Robert Sekuler, Jeremy B Caplan, Matthew Kirschen, and Joseph R Mad-
sen. Human theta oscillations exhibit task dependence during virtual maze navigation. Nature,
399(6738):781–784, 1999.
[87] Emma R Wood, Paul A Dudchenko, and Howard Eichenbaum. The global record of memory in
hippocampal neuronal activity. Nature, 397(6720):613–616, 1999.
[88] Ali Yousefi, Anna K Gillespie, Jennifer A Guidera, Mattias Karlsson, Loren M Frank, and Uri T
Eden. Efficient decoding of multi-dimensional signals from population spiking activity using a gaus-
sian mixture particle filter. IEEE Transactions on Biomedical Engineering, 66(12):3486–3498, 2019.
[89] Ronan Collobert, Koray Kavukcuoglu, and Clément Farabet. Torch7: A matlab-like environment
for machine learning. In BigLearn, NIPS workshop, 2011.
[90] Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu,
Chiyuan Zhang, and Zheng Zhang. Mxnet: A flexible and efficient machine learning library for
heterogeneous distributed systems. arXiv preprint arXiv:1512.01274, 2015.
[91] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu
Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. Tensorflow: A system for large-scale
machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265–283, 2016.
[92] Ebubekir Buber and Banu Diri. Performance analysis and CPU vs GPU comparison for deep learning.
In 2018 6th International Conference on Control Engineering & Information Technology (CEIT),
pages 1–6. IEEE, 2018.
Abstract
To study how memories are encoded in neural signals, we have built a double-layer multiple temporal-resolution (multi-resolution) model for decoding specific memory information from single-trial spatio-temporal patterns of spikes. The model takes hippocampal spiking activities as input signals and memory-related variables as output signals. It provides a powerful tool for better understanding how memory content is encoded in hippocampal spikes.
The double-layer multi-resolution (DLMR) decoding model represents the input-output mapping with a two-layer structure. In the first layer, to solve the under-determined problem caused by the small sample size and the very high dimensional input and output signals, L1-regularized B-spline logistic regression classifiers are used to reduce dimensionality and yield sparse model estimation. A wide range of temporal resolutions of neural signals is included by using a large number of classifiers with different numbers of B-spline knots. Each classifier serves as a base learner to classify spatio-temporal patterns into the probability of the memory label with a given temporal resolution. A bootstrap aggregating strategy is used to further reduce estimation variances of these classifiers. In the second layer, another L1-regularized logistic classifier takes outputs of first-layer classifiers as inputs to generate the final output prediction. This classifier serves as a meta learner that combines multiple temporal resolutions to classify the spatio-temporal patterns of spikes into final output memory labels.
For the development of hippocampal memory prostheses, a human hippocampal closed-loop analysis system was built at both Keck Hospital of the University of Southern California (USC) and Rancho Los Amigos National Rehabilitation Centre. The system can successfully complete the entire closed-loop paradigm that integrates recording, modeling, and stimulation. Neuronal signals from 43 subjects have been recorded and corresponding DLMR decoding models have been built. In addition, meta-analyses were performed on modeling results to investigate the memory coding characteristics of the hippocampus. Moreover, to accelerate model estimation procedures, we have developed parallelization strategies to decompose the overall model estimation task into multiple independent sub-tasks. These sub-tasks are then accomplished in parallel on different computer nodes provided by the USC Center for Advanced Research Computing (CARC).
Results of simulation studies show that the DLMR decoding model can capture neural patterns with various temporal resolutions. Results of experimental data decoding prove that this method can effectively avoid overfitting and yield accurate prediction of memory-related information. The double-layer multi-resolution classifier consistently outperforms the best single-layer single-resolution classifier by extracting and utilizing multi-resolution spatio-temporal features of spike patterns in the classification. Moreover, the DLMR decoding model can be used not only to predict memory information but also to provide signature functions that represent spatio-temporal characteristics of spike patterns most relevant to the memory. Such signature functions can be potentially used to design spatio-temporal stimulation patterns for eliciting specific memories in the hippocampus. Therefore, the proposed DLMR decoding model has important implications for the development of hippocampal memory prostheses.