Experimental and computational explorations of different
forms of plasticity in motor learning and stroke recovery
Amarpreet Singh Bains
Faculty of the USC Graduate School
Doctor of Philosophy (Neuroscience)
University of Southern California
August 9, 2016
Acknowledgements
I would like to acknowledge the many people who played vital roles in helping me along the
path from entering USC as a first-year graduate student to defending my thesis. First, my thesis advisor,
Dr. Nicolas Schweighofer, was always willing to find time to discuss my ongoing projects and share his
perspective. His advice was invaluable in getting me to think about problems in new ways and, on a
more general scale, helped to shape me professionally as a scientist. I would also like to thank my
dissertation committee members, Drs. James Gordon and Stefan Schaal. Dr. Gordon’s feedback was
always enlightening and appreciated, and it pushed my research towards a higher standard of rigor and
scientific integrity. Dr. Schaal provided a keen eye in evaluating the soundness of the more theoretical
portions of my work and brought an important computational perspective to my committee. A thank
you is also due to my qualification committee members, Drs. Bartlett Mel and Norberto Grzywacz, who
thoroughly evaluated my work and made sure I changed direction where needed to move towards a
successful thesis. Additionally, I am grateful to have worked with the other members of the CNRL and
MBNL lab groups over the years, whose feedback, friendship, and support always brightened the
prospect of working in our otherwise sunless basement laboratory.
Finally, a huge thank you is in order to the friends and family who have seen me through the
successes and difficulties of my PhD. This includes my friends and colleagues within the Neuroscience
Graduate Program, as well as my close friends in LA and elsewhere who have provided encouragement
and perspective throughout this process. I would especially like to thank my parents, sisters, in-laws, and
above all my wife, Pallavi, for their support and guidance. There were points during my PhD when I was
not sure that I would be able to finish successfully, but their counsel and unwavering confidence in me
provided the motivation to push me through those times. This thesis is a product of all of that support,
and I cannot thank all of you enough.
Table of Contents
Chapter 1: Overview and Organization of the Thesis
Chapter 2: Background on Plasticity, Stroke, and Motor Learning
2.1 Neural plasticity mechanisms: physiology and models
2.1.1 Hebbian-like plasticity
2.1.2 Hebbian plasticity-based models of cortical organization
2.1.3 Homeoplasticity
2.1.4 Homeoplasticity in neural network models of the brain
2.2 Animal, human, and computational models of stroke
2.2.1 Clinical studies: physiological and behavioral aspects of spontaneous motor stroke recovery
2.2.2 Clinical studies: stroke rehabilitation
2.2.3 Animal studies: physiology and behavior during stroke and recovery
2.2.4 Animal studies: areas of discrepancy with human studies
2.2.5 Computational models of stroke: current models
2.2.6 Computational models of stroke: shortcomings, simplifications, and future possibilities
2.2.7 Hebbian and homeoplasticity: putative roles in stroke recovery
2.3 Mechanisms of motor learning
2.3.1 Error-based learning
2.3.2 Reward-based learning
2.3.3 Use-dependent learning
Chapter 3: Roles of Hebbian and Homeoplasticity in a Neural Network Model of Stroke Recovery
3.1 Abstract
3.2 Introduction
3.3 Methods
3.3.1 Arm and Muscle Spindle Simulations
3.3.2 Cortical Network Simulations
3.3.3 Hebbian Plasticity and Homeoplasticity
3.3.4 Initial Network Training
3.3.5 Lesioning
3.3.6 Rehabilitation training
3.3.7 Outcome Measures
3.3.8 Parameter Sensitivity Analyses
3.4 Results
3.4.1 Initial Network Training
3.4.2 Effects of Targeted Lesion
3.4.3 Experiment 1: Essential Role of Homeoplasticity in Recovery
3.4.4 Experiment 2: Effect of Delayed Rehabilitation Training
3.4.5 Experiment 3: Effects of Elbow-Only Training after Lesion
3.4.6 Parameter Sensitivity Analyses
3.5 Discussion
3.6 Chapter Appendix
Chapter 4: Use-Dependent Learning in Reaching Movements is Robust, Feedback-Insensitive, and Occurs in Hand Space Coordinates
4.1 Abstract
4.2 Introduction
4.3 Methods
4.3.1 Subjects
4.3.2 Common Apparatus and Task Description for all Three Experiments
4.3.3 Bias Time Course Experiment
4.3.4 Verstynen Replication Experiment
4.3.5 Coordinate System Experiment
4.3.6 Common Movement Analysis for the Three Experiments
4.3.7 Common Statistical Data Analysis for the Three Experiments
4.3.8 Statistical Data Analysis for the Bias Time Course Experiment
4.3.9 Statistical Data Analysis for the Verstynen Replication Experiment
4.3.10 Statistical Data Analysis for the Coordinate System Experiment
4.4 Results
4.4.1 Bias Time Course Experiment: Feedback-Independence, Time Course, and Retention of Use-Dependent Learning
4.4.2 Washout of Use-Dependent Learning
4.4.3 Generalization of Use-Dependent Learning
4.4.4 Bias in Verstynen Replication Experiment
4.4.5 Adaptive Bayesian Model Fit to Bias Time Course Experiment Data and Prediction of Verstynen Replication Experiment Data
4.4.6 Coordinate System of Use-Dependent Learning
4.5 Discussion
Chapter 5: Summary of Work
Bibliography
Chapter 1: Overview and Organization of the Thesis
Motor learning and the neural plasticity that underlies it are essential ingredients in allowing us
to perform almost everything we do on a daily basis, from driving a car to typing on a keyboard.
Additionally, proper understanding of these phenomena would allow us to harness them to aid those
who suffer motor impairments resulting from stroke or other diseases. This thesis lays out the work
done during my PhD aimed at improving this understanding. The work falls under two main projects.
The first proposes a neural network model of sensory cortex to explore the roles of Hebbian and
homeoplasticity after stroke, including how they interact to determine the optimal time to initiate
rehabilitation. The second project then endeavors to strengthen a central assumption of the model,
namely that purely unsupervised Hebbian learning can occur in the sensorimotor system independent of
performance feedback (error or reward) from the environment. This is accomplished through arm
reaching experiments that assay for unsupervised learning using behavioral measurements of use-
dependent learning. This second project is also extended to explore the phenomenon of use-dependent
learning in general to determine whether it could play a significant role alongside error- and reward-
based learning mechanisms in shaping motor control.
The organization of the thesis is as follows. Chapter 2 reviews the literature that is
pertinent to understanding models of cortical organization, behavioral and physiological aspects of
stroke recovery, and the mechanisms of motor learning. Chapters 3 and 4 then detail the two projects
described above. These chapters may be read independently of the others and are formatted as
manuscripts for submission to particular journals, as per Neuroscience Graduate Program guidelines.
Finally, Chapter 5 summarizes the work and hypothesizes about its links to other recent work.
Chapter 2: Background on Plasticity, Stroke, and Motor Learning
Before describing the details of my thesis research, I have included this chapter as a review of
the pertinent topics and literature that have formed the foundation of my work. The first section
describes two neural plasticity mechanisms, Hebbian-like plasticity and homeoplasticity, whose roles in
stroke recovery were investigated through a computational model in my first project. The second
section then details the effects of stroke and recovery on cortical (re-)organization, including knowledge
gleaned from animal, human, and computational work, and explains the gaps addressed by our model.
Finally, the third section takes a step back from cellular physiology and gives a background on human
motor control studies. This lays the groundwork for my second thesis project, which used behavioral
experiments to confirm that pure Hebbian plasticity, a mechanism assumed to exist in our stroke model,
could actually play a role in motor learning.
2.1 Neural plasticity mechanisms: physiology and models
Brain plasticity was classically described by the 19th century American psychologist William
James as the “possession of a structure weak enough to yield to an influence, but strong enough not to
yield all at once,” going on to observe “nervous tissue seems endowed with a very extraordinary degree
of plasticity of this sort” (James, 1890). Although he was focused on plasticity’s role in the formation of
habits, his observations have proved prescient regarding the brain’s general ability to change in
response to functional demands. The second half of the 20th century and the beginning of the 21st have
seen a great expansion in our knowledge of how the cellular and molecular mechanisms underlying this
plasticity operate. Notably, it has been revealed through both computational predictions and
experimental work that multiple forms of plasticity exist at a cellular level. Two forms are discussed
here: Hebbian-like plasticity and homeoplasticity.
2.1.1 Hebbian-like plasticity
In 1949, Donald Hebb suggested the following hypothesis about how neurons alter the strength
of their connecting synapses: “When an axon of cell A is near enough to excite a cell B and repeatedly or
persistently takes part in firing it, some growth process or metabolic change takes place in one or both
cells such that A’s efficiency, as one of the cells firing B, is increased” (Hebb, 1949). This hypothesis,
named Hebb’s Rule, is often boiled down into the phrase “neurons that fire together wire together”.
Bliss and Lomo revealed the first physiological correlate of Hebb’s Rule by stimulating axons synapsing
onto dentate gyrus neurons in the rabbit hippocampus at high frequencies, between 10 and 100 Hz. After
less than 15 seconds of this high-frequency stimulation, subsequent stimulation of the axons elicited
strengthened postsynaptic responses in the dentate cells (Bliss and Lomo, 1973).
This phenomenon of long-term potentiation (LTP) was soon discovered to have a complement in
the form of long-term depression (LTD), also first discovered in the hippocampus. LTD was initially found
to occur at an unstimulated synapse if a neighboring synapse underwent LTP induction by stimulation at
a high frequency (Lynch et al., 1977). Since this form of LTD did not require presynaptic activity, it was
referred to as “anti-Hebbian” (Massey and Bashir, 2007). However, later work uncovered a Hebbian
form of LTD that acted in opposition to previously induced LTP. In this case, low-frequency stimulation
of presynaptic axonal inputs to the hippocampus (1-5 Hz) led to a weakening of the postsynaptic
response to subsequent presynaptic stimulation (Barrionuevo et al., 1980).
In the intervening decades, many details have been added to this picture of LTP and LTD. These
include molecular details of the receptors, biochemical pathways, proteins, and genes involved, but also
the demonstration that both processes occur in many locations besides the hippocampus. Relevant to
motor learning, these include the motor cortex (Iriki et al., 1989, 1991; Rioult-Pedotti et al., 1998, 2000),
cerebellum (Ito and Kano, 1982), and basal ganglia (Calabresi et al., 1992; Kreitzer and Malenka, 2008).
Furthermore, different forms of LTP and LTD (both Hebbian and anti-Hebbian) have been characterized
in different parts of the brain, each requiring a specific set of neurotransmitters, receptors, or pattern of
stimulation for induction (Citri and Malenka, 2008a; Collingridge et al., 2010). For instance, to induce LTP
in the direct pathway of the basal ganglia, dopamine release and metabotropic glutamate receptor
activation are required in addition to coordinated presynaptic activity and postsynaptic depolarization
(Shen et al., 2008). Similarly, assuming LTP in the motor cortex (M1) underlies skill learning (Rioult-
Pedotti et al., 1998, 2000; Cantarero et al., 2013a, 2013b), dopamine release from the ventral tegmental
area may be essential to consolidating this plasticity and allowing skill retention across days (Hosp et al.,
2011).
These variations on the theme of Hebbian-like plasticity are important since they reveal that
highly correlated pre- and postsynaptic activity patterns alone are not always sufficient to induce
synaptic changes. In the case of sensorimotor systems such as the basal ganglia and M1, the
involvement of dopamine suggests reward feedback is also needed (Schultz et al., 1997; Hosp et al.,
2011). In the cerebellum, LTD at the parallel fiber-Purkinje cell synapses requires error feedback carried
by climbing fibers (Albus, 1971; Ito and Kano, 1982). This raises the question of how much each variation
of Hebbian-like plasticity (reward-based, error-based, or purely activity-based) contributes toward
sensorimotor learning in both healthy and stroke-affected individuals. In the neural network model of
stroke described in Chapter 3, we assumed the existence of purely activity-driven plasticity, similar to
previous models of cortical organization noted in the next section. Behavioral evidence for this
assumption was then sought through the series of motor learning experiments on healthy subjects
described in Chapter 4.
2.1.2 Hebbian plasticity-based models of cortical organization
The basic idea of Hebbian plasticity has been used in many models simulating the process of
cortical organization, especially during development of the visual cortex. An early model included
afferent connections from a retinal surface to a cortical layer of neurons that were laterally
interconnected with excitatory and inhibitory synapses (von der Malsburg, 1973). Neighboring cortical
neurons were assumed to excite each other, while more distant neurons were assumed to inhibit each
other. The retinal surface was activated to simulate input to the cortex that represented oriented bars
of light. Each bar was statically presented until the cortical activity settled to near steady state, with
each neuron’s activity being either a linearly weighted sum of retinal and lateral cortical inputs, or being
set to zero if this sum was below a threshold. The weights of the afferent synapses were then increased
in a Hebbian manner, being proportional to the product of pre- and postsynaptic activity. To prevent
unchecked synaptic weight increase, afferent weights for a given neuron were normalized after each
update such that their sum remained constant. Lateral cortical connection strengths were assumed
fixed. Despite these simplifications, the model was able to organize afferent synaptic strengths from an
initial, largely unstructured state, to one that caused neighboring cortical neurons to represent similarly
oriented retinal bars. This reflected the orientation-selective cortical columns seen in primary visual
cortex and established Hebbian plasticity as a valid mechanism for driving this organization.
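The core of this scheme, a Hebbian update followed by sum normalization, can be sketched in a few lines. This is a minimal illustration rather than a reimplementation of the 1973 model: it omits the lateral excitatory and inhibitory connections and the settling of cortical activity, and the function name `hebbian_step`, the parameter values, and the toy input pattern are all assumptions made for the example.

```python
import numpy as np

def hebbian_step(W, x, eta=0.05, theta=0.0, total=1.0):
    """One Hebbian update of afferent weights followed by sum normalization.

    W     : (n_cortex, n_input) non-negative afferent weights
    x     : (n_input,) non-negative input activity
    eta   : learning rate (illustrative value, not from the source)
    theta : activity threshold (illustrative value, not from the source)
    total : constant to which each neuron's summed afferent weight is held
    """
    drive = W @ x
    # Threshold-linear activity: the weighted sum, or zero if below threshold.
    y = np.where(drive > theta, drive, 0.0)
    # Hebbian growth proportional to the product of post- and presynaptic activity.
    W = W + eta * np.outer(y, x)
    # Rescale so that each neuron's afferent weights again sum to `total`.
    return W * (total / W.sum(axis=1, keepdims=True))

# Toy usage: repeatedly present one input pattern to four model neurons.
rng = np.random.default_rng(0)
W = rng.uniform(0.1, 1.0, size=(4, 6))
W = W / W.sum(axis=1, keepdims=True)
x = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
W_new = W
for _ in range(50):
    W_new = hebbian_step(W_new, x)
```

Because the total afferent weight per neuron is held constant, growth at the active inputs comes at the expense of the inactive ones. This competition is what lets selectivity emerge without weights growing unchecked.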
Later models continued to expand on these initial results and simulated additional aspects of
primary visual cortex, including two-dimensional topographic organization (self-organizing maps),
experience-driven lateral connectivity patterns, and the ability to perform contour integration
(Kohonen, 1982; Sirosh and Miikkulainen, 1994, 1997; Choe and Miikkulainen, 2004). Recently, a similar
model has also been used to explain the organizational development of rat barrel cortex in response to
whisker stimulation (Wilson et al., 2010). These models all included extensions to von der Malsburg’s
initial incarnation, including approximately sigmoidal functions to translate synaptic inputs to neuronal
activity output, plasticity in lateral connections, and a wider range of afferent input patterns. However,
the key ingredient to each has still been the use of Hebbian plasticity to drive synaptic changes, along
with some form of synaptic weight normalization to prevent unchecked weight increases. In our model,
we continue to build on this core plasticity mechanism to simulate both the initial development and
post-stroke reorganization of a cortical network.
2.1.3 Homeoplasticity
Homeoplasticity is the maintenance of neural activity near a set point (Turrigiano, 2012).
Specifically, if a neuron becomes hyperactive, homeoplasticity reduces excitability and hence activity
levels; conversely, if the neuron becomes too quiescent, homeoplasticity increases excitability. This
balancing act is extremely important in preventing brain network activity from becoming unstable and
reaching a seizure state, while also making sure neural activity does not become so sparse that little
information can be transmitted (Triesch, 2005; Turrigiano, 2011). In contrast to changes driven by LTP or
LTD, the changes produced by homeoplasticity take place over hours or days, preventing them from
interfering in learning processes or information transfer on a short time scale (Turrigiano and Nelson,
2004; Murphy and Corbett, 2009).
Although the overall goal of homeoplasticity is singular, namely to maintain proper activity, the
mechanisms underlying this task are far from monolithic. At a high level, homeoplasticity can be broken
into two types: synaptic scaling and intrinsic plasticity. Synaptic scaling may be analogous to the synaptic
weight normalization rules required in the models of cortical development discussed previously. It has
been found to take place in several cortical areas, including the hippocampus and neocortical areas such
as visual cortex (Turrigiano et al., 1998; Thiagarajan et al., 2005; Turrigiano, 2008). Multiple studies have
found it relies heavily on insertion or removal of AMPA receptor subtypes from synaptic membranes
using signaling pathways that are different from those acted on by LTP/LTD (Turrigiano, 2012). To add
further complexity, synaptic scaling can be further split into many separate processes. For instance,
under conditions of reduced activity, excitatory synapses onto excitatory pyramidal cells are scaled up
while some inhibitory synapses onto pyramidal cells are scaled down (Rutherford et al., 1998; Vale and
Sanes, 2002; Maffei et al., 2004; Turrigiano, 2012), as one might expect if the goal of scaling was to
oppose changes in activity. However, under similar conditions, the inhibitory postsynaptic currents
driven by other inhibitory cell types may not be scaled down or may even be increased (Echegoyen et
al., 2007; Bartley et al., 2008). Furthermore, some forms of scaling appear cell-autonomous, requiring
each cell to only monitor its own activity through changes in calcium levels (Burrone et al., 2002;
Turrigiano and Nelson, 2004), while others appear to require information about network activity
(Rutherford et al., 1998; Stellwagen and Malenka, 2006). Even more complications arise when
considering that each of these synaptic scaling variations is controlled by myriad poorly understood
cell signaling and gene transcription pathways, some of which may work in parallel (Turrigiano, 2008).
Intrinsic plasticity adds another layer of complexity on top of synaptic scaling. Although it has not
been as well studied as scaling, it has been found to act in a variety of cells, including neocortical ones
(Desai et al., 1999; Turrigiano, 2011). As opposed to regulating synaptic strength, intrinsic plasticity
alters a cell’s input-output function, changing how much activity is elicited by a given amount of input
current (Triesch, 2005; Turrigiano, 2011). This is likely done by changing the balance of inward and
outward currents through changes in voltage-gated ion channels. Similar to synaptic scaling, however,
there appear to be many mechanisms underlying this phenomenon, including which channels are
regulated in different cell types and whether the excitability of whole cells or specific dendrites is
altered (Nelson et al., 2005; Breton and Stuart, 2009). What mechanism is selected for action may
depend on cell type, cortical layer, and developmental stage (Turrigiano, 2011).
Even apart from the myriad mechanisms underlying synaptic scaling and intrinsic plasticity, it is still
unclear how these two overall forms of homeoplasticity interact. Whether they are redundant or have
some preferential order of action remains to be elucidated, and again may depend on factors such as
brain location, cell type, or developmental stage (Maffei et al., 2004; Hartman et al., 2006; Echegoyen et
al., 2007; Huupponen et al., 2007; Bartley et al., 2008). Furthermore, defective homeoplasticity plays a
yet-uncertain role in chronic diseases such as autism and depression, while functioning homeoplasticity
may be important to cortical reorganization after stroke (discussed below; Murphy and Corbett, 2009).
What is clear, however, is that the processes maintaining balance in neural activity are immensely
multifaceted, demand more study, and, importantly for our purposes, must be vastly simplified in
simulation.
2.1.4 Homeoplasticity in neural network models of the brain
As described above, early attempts at simulating self-organizing maps and using them as models of
cortical organization relied on the essential heuristic of synaptic weight normalization to prevent
unchecked weight growth. This may be viewed as a loose theoretical parallel to the synaptic scaling
described in the previous section; hence it was arguably an initial inclusion of homeoplasticity in cortical
models prior to homeoplasticity being described in vivo. Later computational work has provided a more
biologically-inspired, homeoplasticity-based twist on the weight normalization rule (Sullivan and de Sa,
2006). Here, instead of normalizing to maintain the total sum of all synaptic weights converging on a
postsynaptic cell, the weights were divisively normalized according to a term based on the recent
average activity of the cell. If recent activity had been high, this term increased to greater than one, thus
reducing total excitatory input weight to the cell; conversely, if activity had been low, this term
decreased to less than one, thus increasing excitatory input. In the case of the laterally connected one-
dimensional neural network explored by the authors, this rule was able to maintain cell activity close to
a desired average better than the standard weight normalization rule. However, its efficacy in
topographic mapping in a two-dimensional network was untested.
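The activity-dependent normalization described above can be illustrated with a short sketch. The exact form of the normalization term is an assumption here: the divisor is written as one plus a scaled relative deviation of the running-average activity from a target. The names `update_running_average` and `activity_normalize` and all parameter values are hypothetical. The Hebbian term is left out so that the homeostatic effect can be seen in isolation.

```python
import numpy as np

def update_running_average(avg, y, rate=0.1):
    """Exponential running average of a cell's recent activity."""
    return (1.0 - rate) * avg + rate * y

def activity_normalize(w, avg, target=0.1, beta=0.01):
    """Divide input weights by a term that exceeds one when recent average
    activity is above target and falls below one when it is under target."""
    factor = 1.0 + beta * (avg - target) / target
    return w / factor

# Toy usage: a linear cell whose initial activity is five times the target.
x = np.array([1.0, 1.0])
w = np.array([0.25, 0.25])
avg = 0.1
for _ in range(3000):
    y = float(w @ x)
    avg = update_running_average(avg, y)
    w = activity_normalize(w, avg)
final_activity = float(w @ x)
```

In this toy setting the weights are scaled down while the running average sits above target, so the cell's activity is drawn back toward the target value rather than being pinned by a fixed weight sum.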
Aside from synaptic weight normalization rules, another approach to preventing uncontrolled
increase of synaptic weights (and of the resulting cell activity) was the Bienenstock-Cooper-Munro
(BCM) rule for simulating a sliding LTP/LTD threshold (Bienenstock et al., 1982). Instead of simply
multiplying pre- and postsynaptic activity (both assumed positive in the basic Hebbian rule), the BCM
rule replaced the postsynaptic activity term with a nonlinear function of postsynaptic activity that could
become negative or positive. This nonlinear term therefore provided for weight reduction as well as
growth. Specifically, the nonlinear term output positive values if the postsynaptic firing rate was above a
certain threshold and negative values if the firing rate was below this threshold. The threshold itself was
dependent on a running average of cellular activity and was designed to stabilize weights and activity.
For example, a period of low activity reduced the threshold and made weight increases more likely. This
in turn would bring cellular activity back towards a desired regime, echoing the goals of homeoplastic
mechanisms discovered over a decade later (Turrigiano et al., 1998; Desai et al., 1999).
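A compact sketch of a BCM-style update may help make the sliding threshold concrete. It assumes one commonly used form of the rule, in which the nonlinear postsynaptic term is y(y - theta) and the threshold tracks a running average of the squared firing rate. The function name `bcm_step` and the parameter values are illustrative assumptions and are not taken from the original paper.

```python
import numpy as np

def bcm_step(w, x, theta, eta=0.01, tau=0.1):
    """One BCM-style update for a single linear neuron.

    w     : (n_input,) synaptic weights
    x     : (n_input,) non-negative presynaptic activity
    theta : current sliding modification threshold
    eta   : learning rate (illustrative value)
    tau   : averaging rate for the threshold (illustrative value)
    """
    y = float(w @ x)
    # Nonlinear postsynaptic term: negative below threshold (depression),
    # positive above threshold (potentiation), for positive y.
    phi = y * (y - theta)
    w_new = w + eta * phi * x
    # The threshold slides with a running average of squared activity.
    theta_new = (1.0 - tau) * theta + tau * y ** 2
    return w_new, theta_new, y

# Toy usage: the same input yields depression or potentiation depending on theta.
w0 = np.array([0.5, 0.5])
x0 = np.array([1.0, 1.0])
w_ltd, theta_after_low, _ = bcm_step(w0, x0, theta=2.0)
w_ltp, _, _ = bcm_step(w0, x0, theta=0.5)
```

In the first call the firing rate sits below the threshold, so the weights shrink and the threshold itself drifts downward, which makes potentiation more likely on later steps.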
The BCM rule has continued to play a part in cortical network models, with a recent study using
a variation of it to allow homeoplasticity in a model of visual cortical development after monocular
deprivation (Toyoizumi and Miller, 2009). In this model, homeoplasticity was found necessary to
replicate the takeover of visual cortex by the un-occluded eye. At the start of monocular deprivation,
certain areas of the simulated cortex had very strong inputs from the occluded eye and very weak inputs
from the un-occluded eye that were ineffective at driving a postsynaptic response. With only a Hebbian
rule in play, the ineffectiveness of the un-occluded synaptic inputs at driving postsynaptic response in
these cortical areas would have prevented their strengthening and subsequent takeover of the cortex.
However, with homeoplasticity present, these cortical areas reduced their “LTP” threshold because the
loss of input from the occluded eye diminished their activity. Thus, any small amount of activity driven
by the un-occluded eye became fodder for increasing its associated synaptic weights such that they
eventually became stronger than the weights from the occluded eye. This result is important to think
about in the current context of modeling stroke recovery. As detailed in Chapter 3, this is because
recovery stems from strengthening the remaining weak synapses representing body parts whose
original area of strong cortical input was removed by the stroke lesion.
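The argument about weak synapses can be made concrete with a toy simulation. The sketch below is not the Toyoizumi and Miller model; it is a single BCM-style unit with two synapses, and every number in it (weights, learning rate, time constant, initial threshold) is hypothetical. It only contrasts a fixed plasticity threshold with one that slides with recent activity once the strong input has been silenced.

```python
def simulate_deprivation(sliding_threshold, steps=5000, eta=0.01, tau=20.0):
    """Toy BCM-style unit during deprivation; returns the final open-eye weight.

    All values are hypothetical and chosen only for illustration.
    """
    w_open, w_closed = 0.1, 1.0   # weak open-eye input, strong closed-eye input
    x_open, x_closed = 1.0, 0.0   # deprivation silences the closed eye
    theta = 0.5                   # plasticity ("LTP") threshold
    for _ in range(steps):
        rate = w_open * x_open + w_closed * x_closed
        if sliding_threshold:
            # Threshold follows a running average of the squared rate.
            theta += (rate ** 2 - theta) / tau
        # Hebbian term gated by the threshold; the weight is kept non-negative.
        # The closed-eye weight gets no update because its input is silent.
        w_open = max(0.0, w_open + eta * x_open * rate * (rate - theta))
    return w_open


fixed = simulate_deprivation(sliding_threshold=False)
sliding = simulate_deprivation(sliding_threshold=True)
print(f"fixed threshold:   open-eye weight = {fixed:.3f}")
print(f"sliding threshold: open-eye weight = {sliding:.3f}")
```

With the fixed threshold the weak synapse never potentiates and decays from its starting value, whereas with the sliding threshold it grows to a strength comparable to the silenced synapse. The toy does not reproduce a full cortical takeover, only the rescue of the weak input.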
In our model, however, a slightly different combination of homeoplastic rules than those
described so far was utilized. As with most cortical organization models, our model included synaptic
weight normalization as a mathematical necessity, but also included a biologically-inspired intrinsic
plasticity mechanism based on computational work by Triesch and colleagues (Triesch, 2005, 2007;
Butko and Triesch, 2006). In this mechanism, cells were assumed to have a sigmoidal input-output
function that translated the weighted sum of synaptic inputs into a firing rate. The slope and offset of
the sigmoid could be homeoplastically varied to control the firing rate distribution given some input
distribution.
The mathematical details are provided in Chapter 3, but in brief, the authors derived update
rules for the sigmoid parameters that pushed the firing rate distribution to approximate an exponential
distribution. This was done by first determining the equation for the firing rate probability density
function (PDF) based on the current synaptic input distribution. Then, the difference between this PDF
and the desired exponential distribution was written out using the Kullback-Leibler (KL) divergence, a
non-negative measure of the difference between two probability distributions. Finally, the partial
derivatives of the KL divergence equation were determined with respect to both parameters that
defined the shape of the sigmoidal input-output function. These partial derivatives were the bases of
the sigmoid update rules, allowing stochastic gradient descent to be performed on each time step to
minimize the KL divergence based on the current synaptic input and cell firing rate.
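As a rough illustration of the procedure just outlined, the following Python sketch applies gradient updates of the form published by Triesch (2005) to a single sigmoidal unit. The equations used in this thesis are given in Chapter 3 and may differ in detail; the Gaussian input, the learning rate, the target mean, and all function names below are assumptions made for the example.

```python
import math
import random


def sigmoid_rate(x, a, b):
    """Firing rate for summed input x with sigmoid slope a and offset b."""
    return 1.0 / (1.0 + math.exp(-(a * x + b)))


def ip_step(x, a, b, mu=0.1, eta=0.005):
    """One stochastic gradient step on the KL divergence to an exponential
    distribution with mean mu (update form after Triesch, 2005)."""
    y = sigmoid_rate(x, a, b)
    db = eta * (1.0 - (2.0 + 1.0 / mu) * y + (y * y) / mu)
    da = eta / a + x * db
    return a + da, b + db, y


rng = random.Random(0)   # fixed seed so the run is repeatable
a, b = 1.0, 0.0          # hypothetical starting slope and offset
late_rates = []
for step in range(60000):
    x = rng.gauss(0.0, 1.0)          # assumed input distribution
    a, b, y = ip_step(x, a, b)
    if step >= 40000:
        late_rates.append(y)         # collect rates after adaptation

mean_rate = sum(late_rates) / len(late_rates)
print(f"slope a = {a:.2f}, offset b = {b:.2f}, mean rate = {mean_rate:.3f}")
```

Each update uses only the unit's current input and firing rate, which is the sense in which the rule is local in space and time. Because a sigmoid is bounded, the resulting rate distribution can only approximate an exponential, so the mean rate should settle near, not exactly at, the target.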
An exponential firing rate distribution was chosen as the goal of this homeoplastic rule for both
theoretical and biological reasons. Theoretically, the exponential distribution has the highest
entropy, and hence the highest potential information transfer rate, of any distribution over non-negative values with a fixed mean, while biologically, firing
rate must be non-negative while obeying a metabolic budget that constrains its mean (Baddeley et al.,
1997; Triesch, 2005). Additionally, neurons in sensory areas have actually been observed to follow
sparse, approximately exponential firing rate distributions (Baddeley et al., 1997). Notably, this
formulation of intrinsic plasticity allowed cells to automatically and successfully adjust their excitability
in response to changing input distributions (a condition that may occur post-stroke), to represent the
independent components composing their input patterns, and to simulate the development of oriented
simple cell receptive fields in V1 (Butko and Triesch, 2006; Triesch, 2007).
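The theoretical motivation can be stated compactly. The block below is the standard maximum-entropy result for densities on the non-negative half-line, written in LaTeX. It treats the firing rate as unbounded above, which is an idealization because a sigmoidal output is bounded; the symbols p, mu, and h are notation chosen here and are not taken from Chapter 3.

```latex
% Maximum-entropy argument for the exponential target (standard result).
% h(p): differential entropy of a firing-rate density p on y >= 0 with mean \mu.
\begin{align}
  \max_{p}\; h(p) &= -\int_0^\infty p(y)\,\ln p(y)\,dy
  \quad\text{s.t.}\quad \int_0^\infty p(y)\,dy = 1,\;\;
  \int_0^\infty y\,p(y)\,dy = \mu \\
  \Rightarrow\quad p_{\exp}(y) &= \frac{1}{\mu}\,e^{-y/\mu},
  \qquad h(p_{\exp}) = 1 + \ln\mu \\
  D_{\mathrm{KL}}(p \,\|\, p_{\exp})
  &= -h(p) + \ln\mu + \frac{1}{\mu}\int_0^\infty y\,p(y)\,dy
  \;=\; h(p_{\exp}) - h(p) \;\ge\; 0
  \quad\text{when the mean of } p \text{ is } \mu
\end{align}
```

Read this way, minimizing the KL divergence to the exponential distribution at a fixed mean rate is equivalent to maximizing the entropy of the firing rate distribution.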
This model of intrinsic plasticity was attractive for inclusion in our model for several reasons.
The proposed mechanism was local in space and time and thus biologically plausible, perhaps through
each cell monitoring its calcium influx as a measure of firing rate. It was also a possible alternative to the
BCM model of a sliding LTP/LTD threshold, for which evidence is indirect (Triesch, 2007). Instead of this
threshold, in the Triesch model changes in a cell’s excitability can bias it towards LTP when it is more
excitable or LTD when it is less excitable. Additionally, such excitability changes have been directly
shown to operate in vitro through changes in sodium and potassium currents after 48 hours of input
deprivation (Desai et al., 1999). Finally, the Triesch intrinsic plasticity rule was clear in its theoretical
motivations (maximizing information transfer) and computationally simple to implement.
2.2 Animal, human, and computational models of stroke
The previous section outlined the mechanisms whose interactions are explored through the
model described in Chapter 3 and, in the case of Hebbian plasticity, whose behavioral outcomes are
examined in the use-dependent learning experiments of Chapter 4. However, the end goal of the model
in Chapter 3 is to specifically explore these mechanisms in the case of stroke recovery. To put this
goal in context, this section provides a general background on what we know about the stroke recovery
process and rehabilitation from human, animal, and computational studies, as well as our current
shortcomings. At the end, a brief summary is given of how Hebbian and homeoplasticity may interact
during stroke recovery, providing further motivation for the model presented later.
2.2.1 Clinical studies: physiological and behavioral aspects of spontaneous motor stroke recovery
As the stroke-affected brain passes through the acute and into the chronic phases of recovery, a
timeline of different physiological events unfolds that may aid or, in some cases, hinder functional
recovery. These changes are thought to underlie the spontaneous recovery (i.e. recovery thought to be
driven largely by biological changes as opposed to rehabilitative therapy) seen in the weeks and months
after stroke (Cramer, 2008). However, it is impossible to completely abolish the interaction of functional
motor activity that may occur outside of rehabilitation and these biological mechanisms, making
“spontaneous recovery” a difficult term to define. Additional challenges lie in the fact that patients
exhibit a large degree of heterogeneity in stroke location and severity, making broad statements hard to
prove true for all patients (Buma et al 2010). For instance, a return of voluntary movement may be
exhibited anywhere from the first days to a month post-stroke, with more mildly-affected patients
showing a greater rate of recovery (Cramer 2008a). However, in spite of these difficulties several general
principles appear to be coalescing. Their discovery in humans has been greatly aided by the noninvasive
tools of transcranial magnetic stimulation (TMS) and brain imaging such as fMRI and PET in the last two
decades (Schaechter et al 2011, Cramer 2008a, Swayne et al 2008, Buma et al 2010, Schaechter 2004).
Soon after stroke, cortical activity in the peri-infarct area (or immediately downstream of the
infarct in the case of subcortical strokes) is greatly reduced (Cramer 2008a). Over time, the surviving
tissue in the peri-infarct, which may have received reduced blood flow during the stroke, gradually
recovers its activity. The extent of this recovery has been linked to improvement in functional outcomes
(Schaechter et al 2011). Additional correlates of functional recovery found in TMS studies include
corticospinal tract integrity and cortical motor map sizes of affected body parts (Fujii and Nakada 2003,
Winship and Murphy 2009).
As recovery proceeds over weeks, activity in response to activation of affected body parts is also
altered in regions distant from the lesion but within the same network. Areas that often show
increased activation are neighboring or secondary sensorimotor areas in the case of motor cortex
stroke, such as premotor or supplementary motor areas. Decreased activation, or diaschisis, is also
observed in far-flung brain regions downstream from the lesion. Shifts to activation of corresponding
contralesional sensorimotor areas are also seen (Schaechter 2004, Cramer 2008a, Buma et al 2010). The
amount of activation shift to ipsilesional or contralesional sites may depend on stroke severity (Cramer
2008a), though one study suggests even mild injury, such as prolonged hypoperfusion in the absence of
stroke, may be enough to prompt some degree of contralesional shift (Krakauer et al 2004).
Reviews of multiple clinical studies suggest that there are trends relating the degree of stroke
recovery with normalization of these post-stroke activity shifts. Specifically, recovery may be negatively
correlated with residual, abnormal activations of secondary motor and contralateral cortices. However,
one systematic review suggests that currently significant relationships cannot be determined due to
many poor experimental designs, especially with few subjects or variable stroke characteristics (Buma et
al 2010). To add to this ambiguity, two competing hypotheses exist to explain contralesional activity
shifts. One postulates that these shifts aid in controlling affected body parts, perhaps through
preexisting ipsilateral corticospinal pathways. The second posits that the increase in contralesional
activation is a passive epiphenomenon driven by the reduction in interhemispheric inhibition seen in
stroke patients (Nair et al 2007) caused by death of neurons in one hemisphere (Cramer 2008a,
Schaechter 2004). Evidence exists for both theories. One study measured healthy control and paretic
patient hand movement speed with and without TMS-driven disruption of the ipsilateral (contralesional)
motor cortical areas. Patients experienced greater slowing of the movements than controls when the
ipsilateral cortex was disrupted. Additionally, fMRI imaging showed that this slowing was greater in
patients who showed more ipsilateral activation and had poorer functional performance without TMS
(Johansen-Berg et al 2002). Interestingly, other studies have shown increased ipsilateral activation even
in healthy subjects when performing complex tasks, and it is plausible that the complexity of even basic
tasks could be considered high for stroke patients (Cramer 2008a). On the other hand, a retrospective
study on chronic stroke patients (3-6 months post-stroke) divided subjects into three groups, one of
which showed fast and good recovery, another which showed good recovery but over a longer period of
time, and a third which showed poor recovery. The authors found that persistent contralesional activity
was correlated with slow rate of recovery, but not degree of recovery, whereas corticospinal tract
integrity was more important to degree of recovery (Fujii and Nakada 2003). In addition, another study
showed that only TMS stimulation of the ipsilesional, not contralesional, motor cortex after stroke
recovery elicited motor evoked potentials in the hand (Nair et al 2007). Thus, it seems the shift to
contralesional activation may only be partially related to motor recovery. Its correlation with recovery
may even depend on which functional tasks were affected by stroke, as some functions, such as
proximal arm movements, are normally less lateralized than other functions such as hand movements
(Cramer 2008a).
Finally, differences in genetics may also be important to recovery outcomes, as changes in gene
regulation are known to occur after stroke. Though current knowledge of genetic effects is limited, one
known example is the effect on recovery of a brain-derived neurotrophic factor (BDNF) polymorphism
found in 27% of the US population. BDNF is linked to many brain functions, such as neuronal
differentiation, plasticity, and repair. A clinical study showed that subarachnoid hemorrhage patients
with the polymorphism were less likely to show good recovery 3 months after stroke than those without
it (Siironen et al 2007), though data was not measured at any other time points to allow determination
of the rate of recovery or any possible late recovery.
2.2.2 Clinical studies: stroke rehabilitation
Many different types of rehabilitative approaches are currently being studied, though a recent
Cochrane review showed no approach had high-quality evidence of superiority over traditional
rehabilitation therapy (Pollock et al., 2014). This may be partly due to the great heterogeneity of initial
stroke severity in humans, which in and of itself is perhaps the strongest behavioral predictor of
recovery (Coupar et al 2012). However, many clinical trials on various novel therapies also suffer from
design flaws, such as small patient sample sizes (< 100), un-blinded outcome assessors, or poor
concealment of therapy allocation to each experimental group (Langhorne et al 2009, Laver et al 2011).
Also, even when studying one mode of therapy, different studies may use a wide range of functional
outcome measures, such that comparison of results can sometimes be difficult and may differ
depending on the measure (Sivan et al 2011, Nijland et al 2011, Laver et al 2011). In spite of these
limitations and complications, recently published meta-analyses and systematic reviews of different
therapies can guide future investigations in promising directions.
In the case of the upper limb, moderate evidence for the benefit of several therapies exists,
though as noted above, none has been convincingly shown superior to the others. These therapies
include constraint-induced movement therapy (CIMT), mirror therapy, mental practice, virtual therapy,
unilateral practice (with possible superiority to bilateral practice), EMG biofeedback, and high doses of
repetitive task practice (Conforto et al., 2010; Langhorne et al., 2011; Laver et al., 2011; van Delden et
al., 2012; Pollock et al., 2014). Regarding therapy dose, however, complications may arise from the
timing of administration. Clinical studies comparing standard and high doses of therapy have found a
detrimental effect of high dose when administered very early (< 2 weeks) post-stroke, no effect when
administered slightly later (1.5 months) post-stroke, and a beneficial effect when administered even
later (3-9 months) post-stroke (Wolf et al., 2006; Dromerick et al., 2009; Winstein et al., 2016). These
seemingly conflicting findings may reflect an interaction of rehabilitation with the changing physiological
milieu of the brain as it undergoes post-stroke changes (Winstein et al., 2016).
Specifically for the hand, bilateral training, constraint-induced movement therapy, electrical
stimulation, high-intensity therapy, robotic-assisted therapy, repetitive task training, and splinting have
shown potential as therapies in some studies. However, two recent systematic reviews of clinical studies
suggested that none of these treatments showed robust, certain benefits for hand function (Langhorne
et al 2009, Langhorne et al 2011).
For the arm, CIMT has been robustly successful in improving mobility (ability to carry and handle
objects). In CIMT, a patient’s functioning arm is constrained to force use of the paretic one, thus helping
to overcome the barrier of learned non-use in recovery (Peurala et al 2012, Schaechter 2004, Cramer
2008b, Langhorne et al 2009, Langhorne et al 2011, Nijland et al 2011, Han et al 2012). Especially
encouraging results from one study suggest that improved outcomes (measured on the Wolf Motor
Function Test and Motor Activity Log) after CIMT may be independent of infarct location and the related
severity of baseline impairment (Gauthier et al 2009). Interestingly, in one meta-analysis, high levels of
practice (60-72 hours over 2 weeks) clearly improved mobility outcomes but appeared not to improve
self-reported self-care abilities, while moderate doses (30 hours over 3 weeks) seemed to improve both
(Peurala et al 2012). It is unclear why this is the case, but one conjecture might be that a moderate dose
prompts enough improvement to allow additional practice of self-care while not tiring the patient
excessively. The high dose, on the other hand, may leave the patient too tired to practice self-care.
Additionally, improvement in tasks practiced in one context may not carry over to related tasks in a
different context (Langhorne et al 2011). Although CIMT appears promising, enthusiasm should be
tempered by the fact that most of the clinical trials involved very select groups of patients with only mild
or moderate impairments who could undergo significant periods of constraint (Langhorne et al 2011,
Cramer 2008b).
Robotic therapy also appears to be beneficial to recovery, though its effect size and robustness
may be smaller than that of CIMT (Langhorne et al 2009, Langhorne et al 2011). In comparison to CIMT
and many other therapies, however, robotic therapy has the added benefit of easing the work burden of
physical therapists and thus may allow patients to undergo extended therapy sessions even when their
therapist may be unable to supervise them (Cramer 2008b).
For the lower limb and related functions such as walking, electromechanical-assisted gait
training, task-oriented fitness training, high-intensity gait therapy, speed-dependent treadmill training,
repetitive task training for gait speed and transfers, and endurance training for walking have been
shown to have robust positive outcomes on mobility and gait. Other lower-limb techniques that have
ambiguous benefits include external rhythmic cueing of the gait, position and force biofeedback or
moving platform training for leg function and balance, body-weight supported treadmill training, leg
strengthening, therapist-driven stretching and mobilization, and orthotic or functional electrical
stimulation for foot drop (Langhorne et al 2011, Stanton et al 2011, Winter et al 2011). As with the
ambiguous treatments for the upper limb, many of these techniques need further study with larger,
high-quality clinical experiments.
In summary, challenges lying ahead in improving rehabilitation include determining the correct
dosage and timing of different therapies and their usefulness for patients with, for example, varying
lesion locations or impairment severities (Corti et al 2012, Langhorne et al 2011). As functional
neuroimaging methods improve, they may be applied to gain information about how different patterns
of brain activity can be used to predict outcomes for different modes of therapy. This may be more
useful than only using behavioral abilities as a predictor, as different brain activity patterns can give rise
to very similar functional profiles (Cramer 2008b). Social and environmental factors outside of therapy
are also known to greatly impact recovery and are harder to control. For instance, depression is
common in stroke patients and leads to poorer outcomes, but can be alleviated in part by a supportive
social setting (Cramer 2008a). Thus, stroke recovery is a complicated, multifaceted problem that
requires many approaches, including animal trials, which are easier to control than clinical ones but are
still informative on human stroke.
2.2.3 Animal studies: physiology and behavior during stroke and recovery
Much animal work has been done in stroke research, both from the perspective of physiological
responses and of rehabilitative paradigms. Some of the earliest work showing remapping of motor
cortical somatotopy after lesion was performed in monkeys. The remapping of lesion-affected hand
representations into neighboring cortical territory was associated with motor training of the hand and
functional task recovery (Nudo et al 1996). Similar apparent remapping of function has been found in
many rodent sensorimotor lesion studies, as well as in human TMS motor cortex excitation studies
following rehabilitation (Winship and Murphy 2009, Dijkhuizen et al 2001). Lending additional credence
to the use of animal models for human stroke research, alteration of activity after stroke has been
observed in rats as well as humans. An fMRI study of rats showed that a sensorimotor infarct caused
large decreases in peri-lesional activity in response to affected limb stimulation. However, within a few
days, the same stimulation evoked large contralesional responses, similar to human activity shifts.
Additionally, as in humans, the return of ipsilesional sensorimotor responses over time correlates with
functional ability (Dijkhuizen et al 2001). Rodent models have shown that the contralesional cortex
undergoes decreased GABAergic activity and increased glutamatergic responses. There is also significant
degradation of white matter tracts, possibly including the largely inhibitory transcallosal pathways
(Winship and Murphy 2009, Dijkhuizen et al 2001). This may be the driving reason behind the
contralesional shifts seen in both rodents and humans.
Besides replicating the regional stroke effects seen in human brains, rodent studies have helped
elucidate the physiological changes allowing sensorimotor remapping. The rodent neural activity
patterns observed in response to forelimb activation after forelimb-affecting lesions have been imaged
using intrinsic optical signaling (IOS) and two-photon calcium imaging to allow measurement of
subthreshold responses. They have revealed that prior to lesion, the border separating forelimb- and
hindlimb-responsive cells is very sharp. As expected, two weeks after lesion the response to forelimb
activation was abolished, while the hindlimb response was generally normal. However, around a month
after lesion, forelimb responses began emerging again, but in the pre-lesion hindlimb area. These
forelimb-responsive cells were still hindlimb-responsive as well, leading to overlapped somatotopic
representations. After approximately 2 months, the newly forelimb-responsive cells began to specialize
again, losing their responsiveness to hindlimb stimulation (Winship and Murphy 2009).
This cellular function change may be facilitated by several factors, including the existence of
diffuse subthreshold synaptic connections and hyperexcitability, synaptogenesis, increased dendritic
spine turnover, axonal sprouting, and prolonged depolarizations in the peri-lesional cortex (Murphy and
Corbett 2009). Hyperexcitability may allow forelimb inputs that used to only generate subthreshold
depolarizations in hindlimb-area cells to generate action potentials after stroke. This could allow
forelimb-driven activity to strengthen these synapses to the point of dominance over the hindlimb
inputs (Winship and Murphy 2009). Thus, when hyperexcitability begins to decrease, the forelimb inputs
would now be the suprathreshold ones, leading to the observed somatotopy changes. The
strengthening of forelimb synapses, if undertaken in a Hebbian manner (discussed below), may also be
aided by prolonged depolarizations that increase the chance of presynaptic activity coincidentally
occurring with postsynaptic depolarizations (Brown et al 2009). Finally, even if few existing connections
are in place, synaptogenesis, spine turnover, and axonal sprouting could lay the foundation for
formation of new synapses. The fact that these changes are limited to the peri-lesional area is perhaps
important for giving the affected forelimb inputs an advantage at regaining territory (Cramer 2008a,
Murphy and Corbett 2009). The dependence of this theory on forelimb activity would explain some of
the benefits of rehabilitative limb use and the failure of affected hand areas to reappear in untrained
monkeys in the Nudo study mentioned above.
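The reasoning in this paragraph can be sketched as a toy calculation. The Python example below is only an illustration of the proposed logic, not a model from the cited studies: the spike thresholds, starting weight, learning rate, and weight cap are all hypothetical, and the Hebbian rule is reduced to strengthening the synapse whenever forelimb input coincides with a postsynaptic spike.

```python
def train_forelimb_synapse(spike_threshold, steps=30, eta=0.05, w_max=1.5):
    """Hebbian growth of a weak forelimb synapse onto a hindlimb-area cell.

    The weight grows only on steps where presynaptic input coincides with a
    postsynaptic spike. All values are hypothetical.
    """
    weight = 0.3          # initially subthreshold forelimb input
    forelimb_input = 1.0  # repeated use of the affected forelimb
    for _ in range(steps):
        fired = weight * forelimb_input >= spike_threshold
        if fired:
            weight = min(w_max, weight + eta * forelimb_input)
    return weight


NORMAL_THRESHOLD = 1.0          # hypothetical pre-stroke spike threshold
HYPEREXCITABLE_THRESHOLD = 0.2  # hypothetical post-stroke spike threshold

w_normal = train_forelimb_synapse(NORMAL_THRESHOLD)
w_hyper = train_forelimb_synapse(HYPEREXCITABLE_THRESHOLD)

# Once excitability returns to normal, does the trained synapse still drive spikes?
drives_spikes_afterwards = w_hyper * 1.0 >= NORMAL_THRESHOLD
print(w_normal, w_hyper, drives_spikes_afterwards)
```

In the sketch, the synapse trained under the normal threshold never changes because the cell never fires, while the synapse trained during the hyperexcitable period grows enough to remain suprathreshold after the threshold is restored.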
Finally, many of the physiological changes just discussed are driven by alterations in gene
expression and release of various molecules (from both neurons and glia) found to occur peri-lesionally
in rodents. Some of these molecules include growth factors, glial-derived synaptogenic thrombospondin
1 and 2, brain-derived neurotrophic factor (BDNF), and other substances promoting axon and dendrite
outgrowth (Murphy and Corbett 2009). Outgrowth-inhibiting or axon-repellant factors are also released
(such as NOGO, EPH receptors and ligands, semaphorin 3A, and others), though these tend to be
upregulated after the increase in growth-promoting factors. They may be a response to prevent the
formation of too much new connectivity outside of the forelimb recovery area that could interfere with
other circuits (Murphy and Corbett 2009). Animal age was found to affect the exact combination of
upregulated factors and their timings (with possible implications if applied to humans). Interestingly, the
subset of differentially regulated genes during this period has significant overlap with those seen during
rodent development, and even patterns of motor (re)development are similar to those of young rodents
(Murphy and Corbett 2009). This has been paralleled in a human study discussing the similarities
between infants and stroke victims when learning skilled movements (Cramer and Chopp 2000). Thus, a
“critical window” after stroke may exist, similar to the developmental one, and would help to explain
the observation that much-delayed therapy, whether in rodents or humans, tends to yield poorer outcomes.
In light of these molecular changes, pharmacological treatments have been tried in rodents and some,
like administration of BDNF along with rehabilitation training, have shown beneficial results (Murphy
and Corbett 2009). However, the jump from animals to human clinical trials of stroke-related drugs has
been disappointing, with only fibrinolytics shown effective as a neuroprotectant in humans after being
successful in rodents or primates (Cook and Tymianski 2011, Cook and Tymianski 2012). This suggests a
lack of an exact correspondence between animal and human stroke recovery in spite of the similarities
discussed here.
2.2.4 Animal studies: areas of discrepancy with human studies
Discrepancies in pharmacological findings between humans and animals are driven by
differences in physiology that are not fully understood. The discrepancies may be especially great in
rodent models as compared to primate ones, which is important given the preponderance of rodent
stroke studies (Hainsworth and Markus 2008). Differences regarding genetic regulation, cell signaling,
and single-unit electrophysiological changes in the peri-lesional cortex are hard to ascertain due to the
invasive procedures that would be needed to determine their profiles in humans. However, more easily
recognizable differences in vasculature, stroke location and size, neural organization, outcome
measures, study designs, and motor control could also affect animal vs. clinical results.
The most common rodent model of stroke involves middle cerebral artery occlusion (MCAO),
which can be induced in several ways (Murphy and Corbett 2009, Cook and Tymianski 2011). Some
methods, such as intraluminal sutures, give rise to very large infarcts affecting most of the hemisphere
and perhaps even the hypothalamus, which is not similar to most human strokes and could cause
additional behavioral deficits that are hard to trace to specific brain regions. Another method, inducing
middle cerebral artery embolism, is more related to human strokes but could cause variable infarct
patterns. Other methods (photothrombosis, endothelin 1 vasoconstriction, and proximal or distal
middle cerebral artery occlusion) cause smaller, more precisely located strokes, but in general no
method produces all effects seen in human strokes besides gray matter cell death, such as white matter
damage or vessel pathology (Hainsworth and Markus 2008, Murphy and Corbett 2009). Additionally,
rodents generally have a much lower white to gray matter ratio than humans, so the damage caused to
white matter tracts in humans is impossible to replicate in rodents (Cook and Tymianski 2011). The lack
of gyri and sulci in the rodent (Cook and Tymianski 2011), which imparts a much higher surface area-to-
volume ratio in humans, could also change how the infarct spreads across different functional areas of
the cortex. Furthermore, differences in motor control patterns of the affected limbs and hands/paws
could alter the functional outcomes after stroke. Humans have much more dexterous finger control and
rely on their hands and arms for more complicated tasks, which may be harder to re-learn or
compensate for than more basic rodent movements (Karl and Whishaw 2011). Finally, differences in size
and physiology alter the time course of recovery and appropriate therapy timings and doses in rodents
as compared to humans. This makes it very difficult to estimate what dose of therapy shown effective in
rodents should be attempted in a clinical trial, such that a therapy that may in fact be effective is
discarded due to improper dosing.
Aside from physiological differences, rodent and human studies have important experimental
design differences. One major difference is the lack of homogeneous stroke size and location in humans.
Even similar functional abilities, often screened for in clinical trials, could hide underlying cortical
damage differences that affect recovery (Cramer 2008b, Buma et al 2010). This fact raises a conundrum
in animal trials: good experimental control dictates some level of infarct standardization, while
applicability to clinical reality suggests allowing variability. Thus, it is sometimes hard to judge whether
animal results, based on a homogeneous stroke group, could transfer to heterogeneous humans.
Additionally, animal researchers are often not blinded to experimental conditions, and outcome
measures can be very different (Cook and Tymianski 2012, Cook and Tymianski 2011). For instance, in
neuroprotective drug trials, effectiveness in rodents was often measured not by behavioral outcomes,
as in humans, but by stroke volume reduction. This is problematic, as stroke volume has not been
proven to relate directly to functional human outcomes. Even when using rodent behavioral outcome
measures, comparison problems persist because rodent recovery is often judged based on their
performance of a specific, highly trained task, whereas human recovery is measured by broad indices
that often measure performance of different tasks or activities of daily living (Cook and Tymianski 2012).
These activities can include multiple motor and cognitive components, and in combination with the
heterogeneity of human lesions, it is hard to focus on a performance criterion that specifically
addresses only functions degraded by stroke (Cook and Tymianski 2012).
The discrepant results caused by differences between humans and rodents could be partially addressed
in several ways. First, transitioning promising therapies to macaque monkeys before testing them in
humans may be useful. These animals possess much more similar characteristics to human physiology
with regard to vasculature, pharmacological kinetics, recovery rates, white to gray matter ratios, and
number of gyri and sulci (Cook and Tymianski 2012, Cook and Tymianski 2011). They also possess similar
dexterity and control of the upper limbs and hands. However, they are much harder and more expensive
to obtain and care for than rodents, and drugs for neuroprotection have not yet been carried over from
macaques to humans, making the closer link between these species in stroke still somewhat speculative
(Cook and Tymianski 2011). Second, outcome measures in animals should be changed to better
correlate with behavioral outcome measures used clinically (Cook and Tymianski 2012). Third, more
animal models that express the comorbidities and advanced age that most human patients express may
provide more realistic results than studies which are currently often done on young animals (Murphy
and Corbett 2009, Hainsworth and Markus 2008). Implementing these partial solutions would allow
researchers to begin to narrow the reasons for discrepancies in results and would ideally speed the pace
of research.
2.2.5 Computational models of stroke: current models
Stroke research has given rise to a handful of computational models in the past two decades. These models operate at
different levels of biological realism and detail, with some being data-driven regressions of recovery,
while others attempt to model post-stroke changes in local cortical networks.
Early modeling attempts often tried to replicate basic reorganization dynamics in two-layer,
feedforward networks (with input and cortical layers) based on the self-organizing map principles made
popular by developmental models (Lytton et al 1999). Key features of these basic maps included
Hebbian synapses from the input to the cortex and a “Mexican hat” lateral cortical connectivity pattern,
whereby immediately neighboring cells excited one another and more distant neighbors inhibited each
other. Lesioning this type of network led to immediate changes in cells’ receptive field sizes (dependent
on the parameters of the lateral connections). Cells near the lesion, whose inhibition from the lesioned
cells had been released, expanded their receptive fields towards the areas once covered by the lesioned
cells. If an additional halo of disinhibition was added around the lesion (suggested by histological
studies), cells far away from the lesion contracted the size of their fields, since the further lack of
inhibition near the lesion caused nearby cells to become highly active, which then heavily inhibited
those cells with which they had inhibitory connections. Thus based on simple connectivity and a
disinhibitory halo, this model replicated the expansion and contraction of various receptive fields seen
experimentally (Lytton et al 1999).
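As an illustration only, the "Mexican hat" lateral connectivity described above can be sketched as a difference of two Gaussians over cortical distance, giving net excitation between close neighbors and net inhibition between more distant ones. The function name, widths, and amplitudes below are assumptions chosen for illustration and are not taken from Lytton et al.

```python
import numpy as np

def mexican_hat(distance, sigma_exc=1.0, sigma_inh=3.0, a_exc=1.0, a_inh=0.5):
    """Difference-of-Gaussians lateral weight as a function of cell distance.

    All parameter values are illustrative assumptions.
    """
    d = np.asarray(distance, dtype=float)
    excitation = a_exc * np.exp(-d**2 / (2 * sigma_exc**2))  # narrow, strong
    inhibition = a_inh * np.exp(-d**2 / (2 * sigma_inh**2))  # broad, weaker
    return excitation - inhibition

# Lateral weight matrix for a one-dimensional sheet of 21 cortical cells.
positions = np.arange(21)
lateral_weights = mexican_hat(np.abs(positions[:, None] - positions[None, :]))
```

With these assumed values, weights are positive for immediate neighbors and negative a few cells away. Removing a block of cells from such a sheet removes the inhibition they supplied to the cells bordering the lesion.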
Unfortunately, the basic model could not replicate the patterns of synaptic and somatotopic
reorganization seen over a longer period after stroke. To enable this, various modifications were
attempted, including some biologically unrealistic ones such as completely randomizing all input
synaptic connections or imposing a set amount of activity on a cell that it could divide amongst its
downstream connections. Another more plausible approach simply added a third thalamic layer to the
network and allowed Hebbian plasticity in both the cortical inputs and lateral connections. One
limitation of this model, though, was its propensity to produce irregularly-shaped, sometimes
discontinuous receptive fields (Lytton et al., 1999).
A more recent neural network model also attempted to simulate the recovery phase after
stroke (Reinkensmeyer et al., 2012). Specifically, the model contained a layer of cortical neurons that
were connected with fixed weights (either +1 or -1) to a flexor or extensor unit on a simulated wrist.
Activity in the neurons was passed through a saturating nonlinearity prior to arriving at the flexor or
extensor units, and these units each summed the weighted inputs from connected cortical neurons. 70%
of cortical cells connected with excitatory (+1) weight only to either the flexor or the extensor, while
30% also connected to the antagonist with an inhibitory (-1) weight. In secondary simulations aimed at
explaining post-stroke spread of activation to secondary motor areas, 20% of the cortical cells were
connected with a weight of only 0.1 to the wrist units. This represented a more weakly connected
supplementary motor area (SMA) versus the more strongly connected primary motor cortex. The output
of the system was the amount of flexion force in the wrist, calculated by subtracting the extensor unit
activity from the flexor unit activity. The goal of network training was to maximize this flexion using a
reinforcement learning algorithm. Noise was added on each time step to the cortical cell activities and
new activity patterns were retained if they led to stronger flexion than any previous pattern. This simple
model was used to replicate the idea of residual capacity in stroke rehabilitation, whereby a plateau in
recovery appeared to be reached in the acute post-stroke phase that was then improved upon by
additional rehabilitation in the sub-acute or chronic phases (Ada et al., 2006). However, in the context of
the model this effect was simply due to more trials of training leading to more wrist flexion in a
predictable, steady manner. The model also replicated the finding that a dose of training early in
recovery had a larger effect on recovery than a later dose. Again, the relevance of this replication to
biological reality was tenuous. It was simply a result of the random neural activity patterns frequently
leading to improvement in wrist flexion early in training compared to the initial random pattern.
However, as performance improved, it became less likely that a new random pattern would improve on
the current pattern, thus slowing the overall rate of improvement. The model did provide two
interesting clinical predictions for improving flexion strength, namely decreasing noise later in training
and inhibiting the more strongly connected primary motor cortex neurons to allow the weakly
connected SMA neurons to better optimize their activity.
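The search procedure described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the published implementation: the function name, the number of cells, the clipping form of the saturating nonlinearity, the noise level, and the even split between flexor-exciting and extensor-exciting cells are all hypothetical choices.

```python
import numpy as np

def simulate_search(n_cells=100, n_trials=500, noise_sd=0.1, seed=0):
    """Stochastic search for cortical activity that maximizes wrist flexion."""
    rng = np.random.default_rng(seed)
    # Each cell excites (+1) either the flexor or the extensor unit.
    excites_flexor = rng.random(n_cells) < 0.5
    w_flex = np.where(excites_flexor, 1.0, 0.0)
    w_ext = np.where(excites_flexor, 0.0, 1.0)
    # About 30% of cells also inhibit (-1) the antagonist of the unit they excite.
    inhibits_antagonist = rng.random(n_cells) < 0.3
    w_flex[inhibits_antagonist & ~excites_flexor] = -1.0
    w_ext[inhibits_antagonist & excites_flexor] = -1.0

    def flexion(activity):
        drive = np.clip(activity, 0.0, 1.0)  # saturating nonlinearity (assumed form)
        return drive @ w_flex - drive @ w_ext  # flexor minus extensor activity

    best_activity = rng.random(n_cells)
    best_force = flexion(best_activity)
    history = [best_force]
    for _ in range(n_trials):
        candidate = best_activity + rng.normal(0.0, noise_sd, n_cells)
        force = flexion(candidate)
        if force > best_force:  # keep a pattern only if it beats every earlier one
            best_activity, best_force = candidate, force
        history.append(best_force)
    return np.array(history)
```

Because a candidate pattern is kept only when it beats the best so far, the returned curve can never decrease, and improvements become rarer as the best pattern approaches saturation. This is the source of the diminishing returns discussed above.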
Two models have attempted to address the interhemispheric alterations in activity seen post-
stroke. An earlier model found that the involvement of the contralesional hemisphere in reorganization
was dependent on pre-stroke levels of lateralization and lesion size (Levitan and Reggia 1999). A later
model then implemented both direct, excitatory transcallosal connections between hemispheres, and
inhibitory connections routed through a subcortical pathway (Reggia 2004). This allowed development
of lateralized hemispheres when trained on certain tasks (though an initial skewing towards dominance
of one hemisphere over the other was required). Importantly, it also allowed a depression in
contralesional hemisphere activity after stroke, but not as severe as the one in the ipsilesional cortex.
This depression later recovered such that the contralesional hemisphere was more active than the
ipsilesional one. Aspects of this model seemed slightly unrealistic or uninteresting, however. The author
claimed that the contralesional hemisphere underwent activity depression after stroke, in agreement
with experiments. Nonetheless, he did not address the more interesting and perhaps more important
(to recovery) clinical observations that certain homologous contralesional areas later become active in
tasks once performed by the ipsilesional cortex. In fact, he did not measure contralesional activity
specifically in elements of the task that were once carried out by the ipsilesional cortex, as is done
clinically with movement of the ipsilesionally-controlled limb. Thus, though an appealing model, it is
lacking in explanatory power.
Several models in the last decade have used connectome data from the macaque to model the
brain using network theory (Alstott et al 2009, Honey and Sporns 2008, Rubinov et al 2009). In these
models, large ensembles of excitatory or inhibitory neurons whose cumulative activity is simulated are
represented by network nodes. Multiple local nodes are connected with one another, representing, for
example, different areas of sensory cortex, while long-range connections also exist between far-flung
areas of the brain. These models have explored aspects of functional relationships between nodes, such
as how quickly they synchronize with each other when their activity is modeled in an oscillatory manner.
How these functional relationships are altered by node lesions has also been investigated, suggesting
(perhaps unsurprisingly) that areas that are highly interconnected with other disparate regions are most
devastating to normal activity when lesioned. Though useful for broad insights into overall brain
function, these models have yet to add in elements of plasticity or repair after lesion.
Two other models take a “black-box” approach to stroke recovery by modeling high-level
processes without being bogged down in biological detail. These models are focused on the recovery
phase of motor stroke. The first model addresses the issue of non-use of the paretic limb by stroke
patients, postulating simply that as functionality of the limb improves, patients will be more likely to use
it (Han et al 2008). In the model, this idea is incorporated into a setup in which use of the arm improves
its function through supervised learning and increases the topographic representation in a neural
network of directions it moves through unsupervised learning. As arm function improves, this provides
positive feedback reward to a choice simulator, which learns through reinforcement learning which limb
to use for later trials. Thus, use increases function, which in turn increases use. The main prediction of
this model is that if rehabilitation can reach a threshold performance level for the limb, the patient will
continue to use it spontaneously. If this threshold is not reached, the patient will stop using the limb
again after rehabilitation ends and its function will regress.
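The threshold prediction can be illustrated with a deliberately simplified toy loop. This is not the Han et al. model, which combines supervised, unsupervised, and reinforcement learning in a neural network; it is a one-variable sketch in which the probability of using the limb is an assumed sigmoid function of limb function, use improves function, and non-use lets it decay. All names and parameter values are hypothetical.

```python
import numpy as np

def simulate_use_function_loop(initial_function, n_steps=500, gain=0.05,
                               decay=0.05, threshold=0.5, steepness=20.0):
    """Toy positive-feedback loop between limb use and limb function (0 to 1)."""
    function = initial_function
    trajectory = [function]
    for _ in range(n_steps):
        # Expected probability of choosing the paretic limb (assumed sigmoid).
        p_use = 1.0 / (1.0 + np.exp(-steepness * (function - threshold)))
        # Use drives function toward 1; non-use lets it decay toward 0.
        function += gain * p_use * (1.0 - function) - decay * (1.0 - p_use) * function
        trajectory.append(function)
    return np.array(trajectory)

below = simulate_use_function_loop(0.3)  # rehabilitation stopped below threshold
above = simulate_use_function_loop(0.7)  # rehabilitation stopped above threshold
```

Under these assumptions the loop is bistable: starting below the assumed threshold, function regresses toward zero, while starting above it, function continues to improve without further intervention.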
The second “black-box” model gets at the same issue of use and function, but instead models a
real clinical data set (Hidaka et al 2012). The model itself is a recursive regression that predicts paretic
arm function from previous use and function, where the amount of use is dependent on past function.
Through this model fitting, the authors found an increase in the parameter relating use to function after
rehabilitation, suggesting increased spontaneous use of the arm as function improved.
Finally, one recent computational model of a basic two-layer network has attempted to begin
explicitly modeling the synaptogenesis during normal activity and after lesion (Butz et al 2009). The
authors consider formation of dendritic spines (which can either be searching for inhibitory or excitatory
inputs) to be driven by a homeoplastic mechanism, whereby a low activity level drives formation of
excitatory spines and shedding of inhibitory ones, and high activity induces the opposite. Axonal
outgrowth, however, is driven only by high cellular activity levels and is needed to form synapses with
nearby empty spines. Given this setup, they find a lesion that disrupts inputs and reduces activity in one
area of the network can be best recovered from by providing cyclic instead of constant stimulation. This
is because the affected cells send out excitatory dendritic spines to relieve their depression, but the
unaffected cells, which are still close to homeostatic activity levels, do not send out enough axons to fill
this void. Stimulation increases activity, thus driving needed axonal growth, but prolonged continuous
stimulation causes the unaffected cells to homeoplastically diminish activity. This leads to a halt in
axonal growth before the needs of the lesion-affected cells have been met. Cyclic stimulation solves this
problem by preventing unaffected cells from diminishing their activity and therefore keeps their axonal
growth going. The authors suggest this approach may be beneficial clinically if similar homeostatic
synaptogenesis mechanisms are at work.
2.2.6 Computational models of stroke: shortcomings, simplifications, and future possibilities
It is clear from the above descriptions that models come nowhere close to a detailed biological
reality of stroke progression or recovery: any mention of growth factors, genes, glia, or even
synaptogenesis, all of which are thought to play important roles post-stroke, is completely missing in
most models. However, the unwieldy complexity that a model incorporating more than a few detailed
mechanisms would reach would be a major problem in two regards. First, such a complex model would
lead to too many parameters to constrain based on fitting a model to experimental data. Second, such
complexity would be antithetical to the purpose of most models. Models should be heavily pared down,
bare-bones versions of reality. This is where their explanatory and predictive power can be seen in full
force. If an experimental finding can be replicated by a model featuring only a few important
phenomena, then that provides researchers with a strong clue as to what might be underlying the
finding. Ideally, these clues then form the fodder for new experiments. However, if there are instead 20
different processes acting in the model, some of which can compensate for others depending on
parameter settings, then this explanatory power is lost: researchers no longer gain clear insights into
which factors are most important in causing the outcome.
However, the need for simplicity and the risk of excessive detachment from reality can be hard to
balance in simulation studies. Although models can be far from biological reality in the general sense,
they should be biologically plausible in what assumptions they are using to address the specific
phenomenon at hand. Again, the non-use model is a good example of this. Though the human brain is
millions of times more complex than that basic model, it is not far-fetched to argue that it has some sort
of reinforcement-based circuitry for making arm choices, or that use of a paretic arm improves its
function and increases its neural representation. However, some of the elaborations to the basic two-
layer model by Lytton et al. discussed above, such as complete randomization of synaptic connections
after stroke, are less immediately useful since they lack any plausible biological correlate.
In the work described in Chapter 3, our goal was to build on the progress made in using neural
network models to simulate cortical development and stroke to explore the roles of Hebbian and
homeoplasticity in stroke recovery (see next section for a review of experimental knowledge addressing
this question). Given the presence of these two plasticity mechanisms, we also wanted to determine
how they shaped the optimal timing of a rehabilitation dose given post-stroke. In contrast with past
models, we simulate the full process of initial development and organization of a cortical network to
represent sensory input signals prior to lesioning the network and observing its reorganization. We
include fully plastic afferent as well as lateral synaptic connections. Additionally, we provide input to the
network that changes on every time step, similar to what happens in reality. This is different from the
common approach of providing a single input pattern, allowing the network activity to settle, and then
altering synaptic weights before providing the next input pattern (Kohonen, 1982; Sirosh and
Miikkulainen, 1997; but see Sullivan and de Sa 2006). However, we of course made many simplifying
assumptions as well, including ignoring many biological processes such as the effects of changes in cell
signaling and gene regulation post-stroke (Murphy and Corbett, 2009). The major simplification was
that our network received input from a moving, 6-muscle arm, but the resulting network activity was
not used to aid in control of the arm. Thus, stroke led to no degradation of motor control, and
reorganization of the network was evaluated based on synaptic representation of each muscle instead
of accurate motor control.
2.2.7 Hebbian and homeoplasticity: putative roles in stroke recovery
Hebbian plasticity, incarnated biologically as long-term potentiation (LTP) and long-term depression (LTD), appears key to
stroke recovery for several reasons. First, potentiation is facilitated in the peri-infarct cortex 7 days after
stroke in rats (Winship and Murphy 2009). This may be partially driven through molecular mechanisms,
such as upregulation of NMDA receptor binding (Cramer 2008a). NMDA receptors are the major
receptor responsible for Hebbian LTP in the brain (Citri and Malenka 2008). Additionally, blocking of
brain-derived neurotrophic factor (BDNF) prevents the benefits of rehabilitation in rats (Murphy and
Corbett 2009), and BDNF is known to be an important product of protein synthesis during consolidation
of LTP. In fact, consolidation of LTP, which is normally dependent on protein synthesis, becomes
independent of synthesis if BDNF is exogenously administered (Pang et al 2004). In addition to molecular
mechanisms, the long-lasting postsynaptic depolarization responses seen in the weeks after stroke may
make it easier for coincident pre and postsynaptic activity to occur and drive LTP (Brown et al 2009).
Neuronal hyperexcitability also occurs concomitantly with this potentiation facilitation, providing the
presynaptic element to LTP (Murphy and Corbett 2009).
In addition, in healthy rats, subthreshold forelimb responses have been observed in hindlimb-
area somatosensory cells (Murphy and Corbett 2009). These subthreshold inputs cannot drive cell
activity normally, but a month after lesion to the forelimb area, hyperexcitability may allow these inputs
to drive activity in formerly hindlimb-only territory. After two months, cells in this territory become
selective again, but this time for the forelimb, which has become the stronger input (Winship and
Murphy 2009). This process is explainable within a Hebbian framework, especially given that training of
the affected limb is essential to regaining territory (Nudo et al 1996). This training presumably drives
affected-limb activity, which is needed to drive LTP of the formerly subthreshold inputs. Without this
training, the remaining affected-limb territory can in fact disappear as other, more-used areas of the
body encroach and possibly drive their own LTP (Nudo et al 1996). Thus, Hebbian plasticity appears very
important in allowing reorganization of the injured cortex after stroke.
Homeoplasticity, on the other hand, has a less well-understood role in stroke. However, it is active in
various areas of the CNS (Turrigiano 2012), and presumably remains so after stroke. If this is true,
important phenomena in the peri-lesional cortex may be critically dependent on it. First, the peri-
lesional area undergoes greatly reduced activity in the first days after rodent stroke. The later
hyperexcitability described above might therefore be driven, at least in part, by homeostatic
mechanisms attempting to bring cellular activity back up to desired levels (Murphy and Corbett 2009).
Second, the high rates of synaptogenesis, axon outgrowth, and dendritic spine growth near the infarct
may also be homeoplastic mechanisms acting to bring more excitation to inactive cells (Murphy and
Corbett 2009). If these changes are important to recovery, it may be detrimental to drive activity in the
peri-infarct area, since doing so may prevent these homeoplastic responses to low activity from occurring (Murphy
and Corbett 2009). This could be part of the reason for the surprising results found in a clinical study of
very early rehabilitation therapy (within 2 weeks of stroke onset), where patients with a high dose of
therapy performed worse after 90 days than normal-dose patients (Dromerick et al 2009).
In the model presented in Chapter 3, we attempt to begin exploring this hypothesized interplay
of rehabilitation-driven activity, homeoplasticity, and Hebbian plasticity in a quantitative fashion.
Important to our results is the assumption that Hebbian plasticity occurs in the sensorimotor cortices in
a purely use-dependent manner. This means we assume network activity alone acts to shape
reorganization, whether or not that activity necessarily corresponds to successful or accurate
movements. However, motor control and learning depend on multiple mechanisms, some of which are
inextricably linked not just to brain activity but also to performance feedback from the environment.
Thus, Chapter 4 presents experiments exploring whether activity alone, with little or no feedback, can
actually induce sensorimotor changes as assumed in the model. To provide a primer for that chapter,
the next section discusses the different mechanisms thought to underlie motor learning, including both
feedback-based and possible non-feedback-based mechanisms.
2.3 Mechanisms of motor learning
Currently, three major mechanisms are thought to drive motor learning and shape motor
control, namely error-based, reward-based, and use-dependent learning. These are thought to map on
to three types of learning that are used to classify machine learning and artificial intelligence algorithms:
supervised, reinforcement, and unsupervised learning, respectively. This computational framework
provides a clear way to organize thinking about each biological learning mechanism. The following
sections outline each of these three mechanisms, focusing on providing a definition, behavioral
evidence, and hypothesized anatomical and physiological correlates for each based on upper limb motor
control studies.
2.3.1 Error-based learning
Error-based motor learning is characterized as the reduction of sensory prediction error
between actual and expected movements (Mazzoni and Krakauer, 2006). In the context of upper limb
motor learning studies, this error may be manifested as a difference in observed and expected reach
endpoint or trajectory. Error-based learning maps onto the computational construct of supervised
learning, in which a teacher provides a learning system with the correct output for a given input. The
system can then determine the direction and magnitude of the error between its actual output and the
desired one, using this error signal to incrementally update its internal parameters with a learning rule
that reduces the error on future trials.
This learning mechanism has been extensively studied over the past two decades mostly using
two motor adaptation paradigms. The first is the force field adaptation paradigm (Shadmehr and Mussa-
Ivaldi, 1994; Brashers-Krug et al., 1996), in which participants attempt to reach to a target while holding
a robotic manipulandum that laterally perturbs their reach with a force that corresponds to the current
position or velocity of the hand. Eventually, subjects learn from their trajectory errors and preemptively
compensate for the lateral forces such that their reach trajectories become straight again. The second
common adaptation paradigm is visuomotor adaptation (Krakauer et al., 2000), in which the hand is
hidden from view and represented on a visual display (along with a reach target) as a cursor whose
position relative to the actual hand is displaced. In both paradigms, it is thought that adaptation to the
given perturbation occurs based on update of a forward model, which maps motor commands to a
predicted sensory outcome (Mazzoni and Krakauer, 2006).
Many features of adaptation have been explained by a simple, linear time-invariant state-space
model that assumes error correction is the only force at play (Smith et al., 2006). The model contains
two learning processes that are both updated to compensate for error experienced on each reach. One
process has a fast learning rate but also a fast forgetting rate, while the second process has both a slow
learning and forgetting rate. The features explained by this model include anterograde interference,
spontaneous recovery, and long-term retention (Smith et al., 2006; Joiner and Smith, 2008; Sing and
Smith, 2010). The model was unable to explain savings (the faster relearning of a task after a period of
washout), but further extensions to it such as adding multiple slow processes (Lee and Schweighofer,
2009) or allowing learning rates to vary over time (Zarahn et al., 2008) have addressed this issue.
Generally, this model has proven powerful in its combination of simplicity and explanatory ability,
suggesting that a cohesive learning mechanism underlies error-based adaptation. It has also provided
theoretical foundations for experimental forays into determining the neural correlates of learning
processes with different time scales (Kim et al., 2015). Attempts have also been made to relate the fast
and slow processes of the model back onto classic dichotomies of learning into explicit and implicit
components (Keisler and Shadmehr, 2010; McDougle et al., 2015), although hypotheses on this front are
still in need of further testing.
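The structure of the two-process model can be made concrete with a short simulation. In this sketch each process is updated as x(n+1) = A x(n) + B e(n), where e(n) is the error between the perturbation and the summed output of the two processes; the fast process has a smaller retention factor A and a larger learning rate B than the slow process. The function name, the parameter values, and the trial schedule are illustrative assumptions, not fitted values.

```python
import numpy as np

def two_state(perturbation, a_fast=0.59, b_fast=0.21,
              a_slow=0.992, b_slow=0.02, clamp=None):
    """Simulate fast and slow adaptive processes driven by a shared error."""
    n = len(perturbation)
    fast = np.zeros(n + 1)
    slow = np.zeros(n + 1)
    for t in range(n):
        net = fast[t] + slow[t]
        # On error-clamp trials the experienced error is forced to zero.
        clamped = clamp is not None and clamp[t]
        error = 0.0 if clamped else perturbation[t] - net
        fast[t + 1] = a_fast * fast[t] + b_fast * error
        slow[t + 1] = a_slow * slow[t] + b_slow * error
    return fast, slow

# Adaptation, brief counter-perturbation, then error clamp (assumed schedule).
perturbation = np.array([1.0] * 300 + [-1.0] * 4 + [0.0] * 50)
clamp = np.array([False] * 304 + [True] * 50)
fast, slow = two_state(perturbation, clamp=clamp)
net = fast + slow
```

With these assumed values, the brief counter-perturbation drives the net output close to zero by pushing the fast process negative while the slow process remains positive. During the clamp trials the fast process decays quickly, so the net output rebounds toward the originally learned adaptation, which is the signature of spontaneous recovery.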
Anatomically, it appears that supervised, error-based learning, at least as it applies to implicitly
updating a forward model based on sensory prediction errors, is critically dependent on the cerebellum
(Doya, 2000). Patients with cerebellar damage show impaired adaptation to perturbations in the
absence of a conscious strategy to correct for the given perturbation (Rabe et al., 2009; Taylor et al.,
2010). Additionally, anodal transcranial direct current stimulation (tDCS) of the cerebellum increases
adaptation learning rates in healthy subjects (note, however, that retention of this learning was
improved by stimulation of M1 and not cerebellum) (Galea et al., 2011). Overall, these behavioral results
are in accordance with evidence that climbing fiber inputs to the cerebellum encode movement errors
during reaching (Kitazawa et al., 1998).
Although error-based adaptation is clearly important to adjusting for environmental
perturbations, its applicability to skill learning is uncertain because it appears only to drive performance
to return to baseline instead of to increase, as would be required to learn a new skill (Krakauer and
Mazzoni, 2011). However, it may be that adaptation helps to adjust only for errors that are behaviorally
relevant. For example, when reaching with a perturbation, subjects have no motivation or reason to
increase performance beyond a return to baseline since this performance level was satisfactory for the
task at hand. However, if some reward or cost was imposed that made the remaining baseline reach
errors relevant to achieving an acceptable level of task success, perhaps adaptation would play a role in
further reducing error and improving performance. Recent evidence provides some support for such a
hypothesis, since it has been found that movement noise at a given speed can be reduced if high enough
reward is provided, presumably to make up for the extra effort required for noise reduction (Manohar
et al., 2015). In this case, the extra reward makes the baseline movement error “worth” reducing, at
which point perhaps error-based adaptation mechanisms help to better define a forward model that
leads to more precise movements.
2.3.2 Reward-based learning
As opposed to error-based learning, reward-based learning depends on feedback that only
provides information about task success or failure, not about the direction or magnitude of adjustments
that must be made to counteract an error. In this way, it maps onto the computational construct of
reinforcement learning (Doya, 2000). In reinforcement learning, an “agent” (in this case, the subject’s
brain) must use some degree of trial and error to determine what output leads to successful
improvement in a task. The agent can then approximately repeat those outputs which lead to
improvements (exploitation), while still continuing to vary them in an effort to find even better
outcomes (exploration) (Sutton and Barto, 1998). As applied to a motor learning example, this could
mean that when learning a new task, the changes in muscle activation patterns needed to correct for
task failure are not well defined, but that variation in execution from trial to trial sometimes leads to
improvement in the outcome. Future movements would then be driven to become more like the
successful variations.
Recent work has started to clarify the role of reward-based learning in motor learning. In rats, it
was found that lesion of inputs from the dopaminergic ventral tegmental area to M1 prevented
retention of skill learning across days (Hosp et al., 2011). Interestingly, lesioned rats still showed some
improvement within single training sessions. Given the role of dopamine in signaling unexpected reward
(Schultz et al., 1997), this could suggest that reward feedback was necessary for retention of motor skill
learning. This hypothesis is lent more support by a human study in which subjects learned an isometric
pinch force task (Abe et al., 2011). Three groups participated, one receiving monetary losses for
unsuccessful task trials (punishment group), one with no change in monetary compensation based on
performance (neutral group), and one receiving monetary rewards for successful trials (reward group).
In a parallel to the rat study, there was no between-group difference in learning during the training
session. However, the reward group had greater offline performance gains after 24 hours and showed
better retention after 30 days.
The suggestion that reward-based learning is the biological equivalent of reinforcement learning
has also received direct support. A recent study found that in tasks involving either reward-based or
error-based feedback, motor variability (the “exploration” of reinforcement learning) led to faster
learning if it was in a task-relevant dimension; i.e. if it causally drove changes in task success (Wu et al.,
2014). Additionally, as subjects became familiar with the structure of a task, their motor variability
would tend to change so that more of it was in task relevant dimensions. A separate reaching study also
found variability could be modulated in a manner consistent with reinforcement learning: in this case, if
only binary reward feedback about hitting or missing a target was provided, healthy subjects’ reach
direction variability was increased after unsuccessful trials (exploration) and decreased after successful
ones (exploitation) (Pekny et al., 2015).
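A toy sketch of this reward-modulated variability is given below. It is not the analysis or model used by Pekny et al.; the function name, the one-dimensional reach direction, the target, the tolerance, and both variability levels are hypothetical values chosen only to illustrate the exploration and exploitation pattern.

```python
import numpy as np

def reach_with_binary_reward(target=10.0, tolerance=2.0, n_trials=200,
                             sd_after_hit=1.0, sd_after_miss=4.0, seed=0):
    """Adjust reach variability using only hit-or-miss feedback."""
    rng = np.random.default_rng(seed)
    aim = 0.0
    sd = sd_after_miss
    hits = []
    for _ in range(n_trials):
        reach = aim + rng.normal(0.0, sd)
        hit = abs(reach - target) <= tolerance  # binary reward, no error vector
        if hit:
            aim = reach         # exploit: repeat the rewarded direction
            sd = sd_after_hit   # reduce variability after success
        else:
            sd = sd_after_miss  # explore more widely after failure
        hits.append(hit)
    return np.array(hits), aim

hits, final_aim = reach_with_binary_reward()
```

The learner never sees the direction or size of its error. Once any trial is rewarded, the aim stays within the rewarded zone because it is only ever updated to a reach that hit the target.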
Evidence has also grown for reward-based mechanisms being at play in reach adaptation
scenarios, a paradigm previously thought of as mostly engaging error-based mechanisms. One study
showed that if a visuomotor rotation was gradually introduced, without awareness on the part of the
subject, it could be adapted to even if instead of error feedback, only binary reward feedback about
hitting the target or not was provided (Izawa and Shadmehr, 2011). After correcting for the
perturbation, subjects who only received this binary feedback could still accurately estimate the true
position of their hand (which was hidden during the task). On the other hand, another group of subjects
who received error feedback instead of binary reward feedback during adaptation mistakenly estimated
their true hand position to be near the perturbed cursor location. This result is notable because it
suggests that despite similar levels of adaptation in these two groups, separate learning mechanisms
had been recruited simply by providing different kinds of feedback. The adaptation driven by error-
based feedback appeared to involve a remapping of the forward model based on sensory prediction
error, such that the cursor was believed to represent the actual hand location. Meanwhile, the
adaptation driven by reward-based feedback did not involve this remapping, and may have instead
represented associative learning between a given action and a successful outcome.
A different study put a twist on reward-based correction of a visuomotor rotation by allowing
two subject groups to first learn to compensate for the rotation through error feedback, but then only
providing binary success or failure (reward) feedback to one subject group once performance had
plateaued (Shmuelof et al., 2012). The other group continued to receive error feedback in addition to
binary feedback. During a subsequent block of error clamp trials (in which error was “clamped” at 0 by
showing cursor endpoints that were artificially forced to appear at the target location), the group that
had received error feedback during the performance plateau showed a decay of their adaptation back to
baseline. However, the group that had received only reward (binary) feedback showed little or no decay
in their adaptation (they kept reaching as though to compensate for a rotation perturbation). In a
follow-up experiment in which an additional rotation perturbation was adapted to after the
performance plateau, the error-based group decayed to baseline during a final error clamp block while
the reward-based group decayed back to the level of adaptation seen during the performance plateau.
The findings of this study may relate to the apparent importance of reward-based feedback in retention
mentioned above. It is possible that the group that received only reward-based feedback during the
performance plateau preferentially engaged retention mechanisms, while the error feedback provided
to the other group somehow interfered with this process.
A final study of note that examined the interplay of reward-based and error-based mechanisms
(as well as use-dependent mechanisms, discussed in the next section) suggested reward-based
mechanisms were behind the savings seen during relearning of a visuomotor rotation after washout
(Huang et al., 2011). The authors proposed that the reach direction that previously had been found as
the correct solution to the initial visuomotor rotation was subject to operant reinforcement as the
“rewarding” outcome to the task, even though error-based adaptation had driven the initial finding of
this solution. Thus, after washout and reinstatement of the visuomotor rotation, the brain was quicker
to find the previously reinforced solution again. This quickness would then behaviorally be revealed as
savings. To test whether it was the correct reach direction itself that was important in savings, the
authors then performed a follow-up experiment in which adaptation to a counterclockwise rotation was
followed by adaptation to a clockwise rotation. For one subject group, the target appeared in the same
visual location in both rotations, while for another group, the targets were separated such that the
correct reach direction (i.e. actual direction of hand movement) was the same for both rotations.
Notably, only the group for which the reach direction was the same showed savings, bolstering the
hypothesis that a reach direction that had already been reinforced as the correct solution for one
rotation was quicker to find when a new, previously unexperienced rotation was imposed. However,
the presumption that a reward-based mechanism acted without interference from the concurrent error-based
adaptation is somewhat surprising here in light of reward-based mechanisms appearing to be
interfered with by error-based adaptation in the Shmuelof et al. study described above. The interference
may be a matter of degree: it is strong enough to prevent resetting of a baseline rotation adaptation as
in Shmuelof et al., but weak enough to allow faster re-finding of a previously experienced rotation
adaptation as in Huang et al.
Within the brain, the basal ganglia, which are densely interconnected with motor areas, have
been postulated to play a major role in reward-based reinforcement learning, at least at the level of
action selection (Doya, 2000). The dense dopaminergic projections from the substantia nigra pars
compacta (SNc) to the basal ganglia, carrying information about changes in reward feedback (Schultz et
al., 1997), support this postulate. A recent study of Parkinson’s disease (PD) patients, in whom these
dopaminergic cells die off, provided further evidence that the basal ganglia may act as a biological
reinforcement learning system. In this study, healthy and PD subjects made reaches with their hand
hidden, and received only binary reward feedback (success/failure) depending on whether they hit a
target on a given trial. However, a slowly changing visuomotor rotation was imposed during training that
displaced the correct reach direction from the visually displayed target, thus requiring subjects to learn
the correct reach direction over time. If learning in this task proceeded based on reinforcement learning
principles, subjects would be expected to increase their reach direction variability after an unsuccessful
trial (increase exploration) and decrease it after a successful one (increase exploitation). This is exactly
what the healthy subjects did, while the PD subjects were unable to modulate their variability. Overall,
this suggests reinforcement learning is an appropriate model of learning in this reward-feedback-based
task, and that the basal ganglia may be the locus of this mechanism since this was the locus of damage
in the PD subjects (Pekny et al., 2015). Finally, however, it should be noted that M1 might also be
important to reward-based learning based on the studies described above that detailed its importance
in retention and the modulation of this function by reward feedback and dopaminergic projections from
the ventral tegmental area (Abe et al., 2011; Hosp et al., 2011).
2.3.3 Use-dependent learning
The final motor learning mechanism, use-dependent learning (UDL), theoretically does not
depend on any error or reward feedback whatsoever, but by definition only shapes motor control
through frequent use of certain movements. Within the framework of computational learning theories,
this mechanism maps onto unsupervised learning and therefore is thought to be driven by purely
unsupervised Hebbian learning mechanisms (Verstynen and Sabes, 2011). However, the existence of
purely Hebbian learning in the brain, without any need for modulation or gating by reward or error
signals, is not clear. For example, in a study of monkeys either learning to retrieve a food pellet from a
small well (which required new dexterous finger manipulations) or retrieving the pellet from a large well
(which required no skill learning), only the small well group showed increased finger representations in
M1 (Plautz et al., 2000). This result demonstrated skill learning, and not just activity, was necessary for
reorganization of M1. Physiologically, it may be that the successes and errors during learning drive
dopaminergic or other neuromodulatory inputs to M1, thus gating Hebbian-like plasticity that drives
reorganization.
Whether activity alone is sufficient to drive plastic changes (within M1 or elsewhere in the
motor system) is an important question in the current work, given the assumption that it was sufficient
in the model presented in Chapter 3. As will be described in that chapter, for example, we find allowing
a period of arm rest (i.e. where the arm is kept stationary) after stroke lesion but before rehabilitative
arm movements begin to be beneficial to subsequent network recovery speed. However, this result
depends on low proprioceptive arm muscle activity during the rest period driving some cortical
reorganization that can subsequently be built upon by rehabilitative movements, which provide higher,
quickly varying proprioceptive inputs to the cortex. If we instead assume, based on studies such as the
Plautz monkey study, that skill learning is necessary to induce plasticity anywhere in the motor system, it
is not feasible to suggest that the rest period could drive any initial cortical reorganization. Therefore, to
lend credence to the model assumptions and results, we planned the experiments described in Chapter
4 to try to assay whether movement alone could cause any measurable motor control changes,
presumably due to neural plasticity. Specifically, we used assays of UDL and attempted to show that its
effects were independent of the error-based or reward-based mechanisms that are important in skill
learning. The remainder of this section gives a brief overview of previous work characterizing UDL and
the open questions we attempted to answer in the current work.
Behaviorally, UDL is manifested when repetition of a single movement biases future movements
towards the repeated one (Verstynen and Sabes, 2011). It was first characterized in thumb movement
studies almost two decades ago. Subjects trained for 30 minutes by moving their thumb repeatedly in
the direction opposite from the direction evoked by TMS over the thumb representation of M1 prior to
training. UDL was measured by comparing the pre- and post-training TMS-evoked thumb movement
directions. It was found that training had rotated the evoked direction to be similar to the trained
direction, an effect that decayed over approximately 30 minutes (Classen et al., 1998). Using the same
TMS and thumb paradigm, later work found that pharmacological GABA agonists and NMDA receptor
antagonists both diminished UDL. This suggests the mechanism underlying UDL is similar to activity- and
NMDA receptor-dependent LTP (Bütefisch et al., 2000). Other thumb experiments provided further
support for this notion, showing that UDL could be enhanced by anodal tDCS over M1 (Galea and Celnik,
2009). This intervention has been shown to decrease GABA levels in the stimulated area (Stagg et al.,
2009), leading to decreased inhibition and more excitatory activity. Additionally, its aftereffects are
dependent on NMDA activity (Stagg and Nitsche, 2011).
Work in the last decade has demonstrated UDL can shape reaching movements as well, even
when movements testing for UDL are planned and voluntary as opposed to TMS-induced (Jax and
Rosenbaum, 2007; Diedrichsen et al., 2010; Verstynen and Sabes, 2011; Hammerbeck et al., 2014). In
one of these studies, the reach trajectory shape on one reach caused a similar trajectory shape on the
next reach (Jax and Rosenbaum, 2007). A later study investigated the effects of reaching hundreds of
times to training targets drawn from different distributions. Probe targets appeared on random trials
that were located in directions away from the center of the training distributions. UDL was measured as
bias towards the center of the training distribution during reaches to these probe targets. The authors
found that increasingly narrow distributions of training targets led to greater UDL (Verstynen and
Sabes, 2011).
Interestingly, in addition to trajectory shape or direction, UDL has also been demonstrated for
movement speed, where the speeds of test movements were biased towards a speed enforced during
preceding training reaches (Hammerbeck et al., 2014). This is notable because it demonstrates that
aspects of planned movements other than direction are also subject to UDL. Adding to this possibility,
an older study showed that generalized bias in reach direction occurred for all target directions when
the initial posture of the arm was changed, especially after a period of training in the first posture. This
bias was removed with practice in the second posture, or if vision of the arm was allowed prior to a
reach. These data were qualitatively explained by the assumption that, if relying only on proprioceptive
information about initial posture, the brain tended to bias its judgement towards whatever posture
had recently been used for training (Ghilardi et al., 1995). This postulate fits within the later-identified
framework of UDL if UDL can be assumed to act in brain areas associated with postural estimation. Thus,
UDL may operate in premotor areas that are thought to be responsible for higher-level aspects of
planned movement such as speed, and perhaps in sensory areas that interpret proprioceptive
information as well. The latter possibility is important in the model presented in Chapter 3, because it
assumes Hebbian plasticity independent of feedback (again, a mechanism which might underlie UDL,
although this is yet to be shown) is active in sensory cortices.
Forays have been made into investigating the interaction of UDL with error- and reward-based
learning. Two studies utilizing visuomotor adaptation paradigms (involving mostly error-based learning)
explained certain features of the adaptation time courses they observed as being due to UDL-induced
biases in reach direction (Huang et al., 2011; McDougle et al., 2015). In a study directly aimed at
investigating the role of UDL in conjunction with error-based learning, UDL was shown to act even in task-irrelevant
dimensions and in directions opposite to error-based learning (Diedrichsen et al., 2010), suggesting it is
not tied to performance outcomes. However, in this study, performance feedback about the task was
given. It is possible that the changes driven by UDL, which were task-irrelevant and neither helped nor
hindered successful performance, were unable to be separated out by the brain from the causal, error-
or reward-based changes that affected performance. Thus, the changes induced by UDL may have been
incidentally gated by some sort of error- or reward-based performance feedback. Other studies
investigating UDL in voluntary reaching movements similarly provided some form of error- or reward-
based performance feedback. Therefore, the question of whether “pure” UDL exists, driven only by
unsupervised Hebbian-like mechanisms, was not fully addressed by these studies.
If UDL does in fact map onto unsupervised learning, then its locus within the motor system may
be the cerebral cortex (Doya, 2000). This dovetails with the experimental evidence gathered since Doya
proposed his hypothesis. As detailed above, this includes modulation of M1 with tDCS to promote the
induction and retention of UDL (Galea and Celnik, 2009), as well as earlier work using TMS over M1 to
test for UDL after thumb training (Classen et al., 1998; Bütefisch et al., 2000). Other areas of the
sensorimotor cortex may also be involved, however, based on putative UDL-based effects on movement
speed and estimation of current postural state (Ghilardi et al., 1995; Hammerbeck et al., 2014). As
detailed in Chapter 4, however, there are several outstanding questions about the feedback-
independence, time course, and motor system locus of UDL, at least in the case of its effects on reach
direction. These questions are tackled by the experiments described in that chapter. The answer to the
question about whether UDL is feedback-independent helps to cement its role alongside error- and
reward-based mechanisms as a third motor learning mechanism. It also bolsters the assumption of the
model in Chapter 3 that pure unsupervised Hebbian learning can occur in the cortex, since the
experiments described above draw a link between UDL and Hebbian-like LTP mechanisms in the brain.
Chapter 3: Roles of Hebbian and Homeoplasticity in a Neural Network
Model of Stroke Recovery
3.1 Abstract
Together with Hebbian plasticity, homeoplasticity presumably plays a significant, yet unclear,
role in recovery post-lesion. Here, we undertake a simulation study addressing the role of
homeoplasticity and rehabilitation timing post-stroke. We first hypothesize that homeoplasticity is
essential for recovery, and second that rehabilitation training delivered too early, before
homeoplasticity has compensated for activity disturbances post-lesion, is less effective for recovery than
training delivered after a delay. We developed a neural network model of the sensory cortex driven by
muscle spindle inputs arising from a six-muscle arm. All synapses undergo Hebbian plasticity, while
homeoplasticity adjusts cell excitability to maintain a desired firing distribution. After initial training, the
network was lesioned, leading to areas of hyper- and hypo-activity due to the loss of lateral synaptic
connections. The network was then retrained through rehabilitative arm movements. We found that
network recovery was unsuccessful in the absence of homeoplasticity, as measured by re-establishment
of lesion-affected inputs. We also found that a delay preceding rehabilitation led to faster network
recovery during the rehabilitation training than no delay. Our simulation results thus suggest that
homeoplastic restoration of pre-lesion activity patterns is essential to functional network recovery via
Hebbian plasticity.
3.2 Introduction
It is widely appreciated that motor rehabilitation given after a prolonged delay post-stroke is
less effective than after a short delay (Biernaskie et al. 2004; Horn et al. 2005; Maulden et al. 2005;
Salter et al. 2006; Wolf et al. 2010; Wolf et al. 2006), presumably because of the ending of the critical
period for plasticity that characterizes the semi-acute post-stroke phase (Cramer and Chopp 2000;
Murphy and Corbett 2009). However, rehabilitation given too early may also lead to sub-par outcomes
(Bland et al. 2000; Dromerick et al. 2009; Kozlowski et al. 1996; Risedal et al. 1999). For instance, in
rodents, rehabilitative training that started 24 hours after cortical infarct led to worse recovery than
training that started 7 days after infarct, although recovery was still better than without training (Risedal
et al. 1999). In humans, intense motor rehabilitation within 2 weeks post-stroke led to poorer outcomes
than less-intense therapy during the same period (Dromerick et al. 2009).
The mechanisms underlying the sub-par effects of too-early rehabilitation are not well
understood. Abnormal cortical patterns of excitation and inhibition occur both near (Clarkson et al.
2010; Liepert et al. 2000; Mittmann et al. 1998; Qu et al. 1998; Schiene et al. 1996) and far from the
lesion (Buchkremer-Ratzmann and Witte 1997; Butefisch et al. 2003). Such abnormal excitability and
resulting abnormal activity patterns are in part due to the loss of efferent connections from lesioned
cells to local and distant areas (Buchkremer-Ratzmann and Witte 1997; Sober et al. 1997). Locally,
pyramidal cells send lateral projections up to 2 mm away (Boucsein et al. 2011) with net excitatory
effects to nearby cells and net inhibitory effects to more distant cells, via connections to local GABAergic
neurons (Derdikman et al. 2003). Computer simulation studies using such “Mexican hat” connectivity
patterns (Sirosh and Miikkulainen 1997; Stevens et al. 2013; Sullivan and de Sa 2006; Wilson et al. 2010)
have supported the experimental findings that lesions yield complex changes in activity, with zones of
abnormally low and high excitability (Goodall et al. 1997; Sober et al. 1997). Abnormal hyperexcitability,
if exacerbated by activity due to early and intense use of the affected limb, can cause cell death and
lesion enlargement (Risedal et al. 1999), presumably via abnormally high levels of NMDA activity (Humm
et al. 1999; Risedal et al. 1999). In surviving cells, hyperexcitability enhances Hebbian plasticity-like long-
term potentiation (LTP) peri-lesionally (Hagemann et al. 1998). Such increased plasticity may be
beneficial to recovery (Hagemann et al. 1998; Murphy and Corbett 2009), because it can help in the
strengthening and reemergence of remaining weak afferent synapses from stroke-affected inputs (Jones
2000; Sanes and Donoghue 2000; Sigler et al. 2009; Winship and Murphy 2009). However, increased LTP
may also lead to maladaptive plasticity and poor cortical reorganization if existing inputs are further
strengthened at the expense of the reemergence of weak afferent synapses.
In addition to Hebbian plasticity-like LTP, homeoplasticity has been proposed to play a
significant role post-lesion (Murphy and Corbett 2009; Nahmani and Turrigiano 2014). Homeoplasticity,
which is ubiquitous in the brain, acts to maintain desired firing rates and patterns (Desai et al. 1999;
LeMasson et al. 1993; Turrigiano 2011). In response to abnormally low or high activity after lesion,
homeoplasticity may thus enhance or decrease excitability, respectively, to restore pre-lesion firing
levels (Murphy and Corbett 2009).
Experimentally, it is difficult to study the interacting effects of homeoplasticity, LTP, and
enhanced inputs due to rehabilitative training on cortical reorganization and recovery post-lesion
(Murphy and Corbett 2009; Nahmani and Turrigiano 2014). In addition, rehabilitative training will affect
the network differently at different times post-stroke depending on the level of “spontaneous recovery”
achieved before training. Here, we aim to better understand these interactions through computer
simulations to guide future experimental work and, eventually, clinical practice regarding the optimal
timing of rehabilitation. We make the following two hypotheses. First, homeoplasticity, together with
Hebbian plasticity, enhances recovery and prevents maladaptive cortical reorganization. Second,
rehabilitation training delivered too early, before homeoplasticity has compensated for activity
disturbances post-lesion, is less effective for network recovery than training delivered after a delay.
To test these hypotheses, we developed a simplified somatosensory cortical network of neurons
that received simulated arm muscle spindle inputs and lateral connections via Mexican hat-like
connectivity. All synapses were modifiable via Hebbian plasticity; cell activity was adjusted via
homeoplasticity. After initial cortical organization in response to arm reaching movements, the network
was partly lesioned to simulate a stroke. We first tested the specific contribution of homeoplasticity to
network reorganization and then tested the effect of rehabilitation timing.
3.3 Methods
3.3.1 Arm and Muscle Spindle Simulations
We simulated a planar 2-joint kinematic human arm model with shoulder and elbow joints. Each
joint was spanned by an extensor and flexor muscle pair, as well as a bi-articular pair (Figure 1)
(Katayama and Kawato 1993). These muscles are abbreviated as Sh-Fl and Sh-Ex for the shoulder, El-Fl
and El-Ex for the elbow, and Bi-Fl and Bi-Ex for the bi-articular pair, with Fl and Ex denoting flexor and
extensor. Each muscle was passively stretched and contracted during reaching movements and gave rise
to two spindle activities, one based on muscle length (similar to a group II fiber), and the other on the
sum of length and stretch velocity (similar to a group Ia fiber).
Planar point-to-point movements were generated via a minimum jerk model (Flash and Hogan
1985). Inverse kinematics generated shoulder and elbow angles. Movement endpoints were randomly
selected with the constraint that changes in shoulder and elbow angles during each movement were
required to be less than π/4 and π/3 radians, respectively. The time MT allowed for each movement was
based on a modified Fitts’s law, $MT = \log_2(a + \mathit{distance})$ (MacKenzie 1989), where distance is the
Cartesian distance between consecutive endpoints and a = 1.3. The stretch lengths and velocities of
each muscle were normalized to between 0 and 1 to keep all spindle activity ranges equal. Time step
length was 0.01 sec.
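As an illustrative sketch only (not the original simulation code), the movement time and minimum-jerk path described above could be computed as follows. The function names, the example endpoints, and the treatment of distance as a plain Cartesian number are assumptions made for this example; the constant a = 1.3 and the 0.01 sec time step come from the text. Inverse kinematics and spindle normalization are left out.

```python
import math

A_FITTS = 1.3  # constant a from the text


def movement_time(distance, a=A_FITTS):
    """Modified Fitts's law from the text: MT = log2(a + distance)."""
    return math.log2(a + distance)


def minimum_jerk(start, end, mt, dt=0.01):
    """Sample a minimum-jerk point-to-point path (Flash and Hogan 1985).

    Returns a list of (x, y) hand positions, one per time step.
    """
    n_steps = max(1, round(mt / dt))
    path = []
    for k in range(n_steps + 1):
        s = k / n_steps  # normalized time in [0, 1]
        # Standard minimum-jerk blending polynomial.
        blend = 10 * s**3 - 15 * s**4 + 6 * s**5
        path.append(tuple(p0 + (p1 - p0) * blend for p0, p1 in zip(start, end)))
    return path


# Hypothetical endpoints, chosen only to exercise the functions.
mt = movement_time(0.25)
path = minimum_jerk((0.0, 0.3), (0.2, 0.45), mt)
```

The blending polynomial has zero velocity and acceleration at both ends, which is why the sampled path starts and stops smoothly.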
3.3.2 Cortical Network Simulations
A 400-cell (20x20) network arranged in a doughnut-shaped grid received the 12 spindle inputs.
This represented a small patch of primary sensory cortex. Before training, initial connection weights
from spindles to cortical cells were all-to-all, chosen randomly from the range [0,1] before being
normalized to sum to 1 for each cell. Cells were laterally connected to each other with Mexican hat
connectivity (Kohonen 1982; Lytton et al. 1999; Sirosh and Miikkulainen 1997; Sober et al. 1997; Stevens
et al. 2013; Sullivan and de Sa 2006; Wilson et al. 2010). The lateral weight $L_{ij}$ from cell j to cell i was defined
as:
$$L_{ij} = \frac{A_e}{\sigma_e \sqrt{2\pi}} \, e^{-\frac{d^2}{2\sigma_e^2}} - \frac{A_i}{\sigma_i \sqrt{2\pi}} \, e^{-\frac{d^2}{2\sigma_i^2}}$$
where d is the grid distance between the cells, and other parameter values are given in Table 1.
Positive and negative lateral input weights to each cell were normalized to sum to 9 and -9,
respectively. Such bias in the strength of lateral versus afferent input has been found in cortical
pyramidal cells (Boucsein et al. 2011; Stepanyants et al. 2009). Normalization constants and
difference-of-Gaussian parameters were heuristically chosen based on two criteria: (1) formation of contiguous cell
neighborhoods (as defined by strong input weights from the same subset of spindles) following initial
training, and (2) large activity disturbances throughout the network following lesions targeting given
neighborhoods (see “Lesioning” sub-section below).
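A minimal sketch of how such a Mexican hat profile might be built and normalized is given below. It is not the original model code: the helper names are hypothetical, and two assumptions are made explicitly, namely that the "doughnut-shaped" grid means wrap-around (toroidal) edges and that grid distance is Euclidean. The amplitudes, widths, and normalization constants are taken from Table 1.

```python
import math

GRID = 20                      # 20x20 cells, as in the text
A_E, SIGMA_E = 10.0, 1.5       # excitatory Gaussian (Table 1)
A_I, SIGMA_I = 5.0, 2.5        # inhibitory Gaussian (Table 1)
POS_SUM, NEG_SUM = 9.0, -9.0   # lateral normalization constants (Table 1)


def gaussian(d, amplitude, sigma):
    """One scaled Gaussian term of the difference-of-Gaussians profile."""
    return amplitude / (sigma * math.sqrt(2 * math.pi)) * math.exp(-d**2 / (2 * sigma**2))


def torus_distance(cell_a, cell_b, size=GRID):
    """Euclidean grid distance with wrap-around edges (an assumption)."""
    d_row = abs(cell_a[0] - cell_b[0])
    d_col = abs(cell_a[1] - cell_b[1])
    d_row = min(d_row, size - d_row)
    d_col = min(d_col, size - d_col)
    return math.hypot(d_row, d_col)


def lateral_weights_to(cell, size=GRID):
    """Return {source_cell: weight} for every lateral input to `cell`."""
    raw = {}
    for row in range(size):
        for col in range(size):
            if (row, col) == cell:
                continue  # no self-connection (i != j)
            d = torus_distance(cell, (row, col), size)
            raw[(row, col)] = gaussian(d, A_E, SIGMA_E) - gaussian(d, A_I, SIGMA_I)
    pos_total = sum(w for w in raw.values() if w > 0)
    neg_total = sum(w for w in raw.values() if w < 0)
    # Rescale positive and negative weights separately to the target sums.
    return {src: w * (POS_SUM / pos_total if w > 0 else NEG_SUM / neg_total)
            for src, w in raw.items()}


weights = lateral_weights_to((10, 10))
```

With these parameter values the profile is excitatory for near neighbors and inhibitory beyond a distance of roughly three cells.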
Each cell was modeled as a simple leaky integrator neuron (Arbib 1995). The change in
“membrane voltage” of cell i at time t was given by:
$$\tau \frac{\Delta x_{t,i}}{\Delta t} = -x_{t,i} + \sum_{k=1}^{12} S_{ik} f_{t,k} + \sum_{j=1}^{400} L_{ij} y_{t,j}, \quad i \neq j, \qquad (1)$$
where $x_{t,i}$ is the voltage for cell i at time t, $y_{t,j}$ is the activity for cell j (i ≠ j) at time t, $L_{ij}$ is the lateral
connection weight from cell j to i, $S_{ik}$ is the synaptic connection weight from spindle k to cell i, $f_{t,k}$ is the
activity of spindle k at time t, τ is the cell excitation constant, and Δt is the time step length in seconds.
The voltage $x_{t,i}$ was then passed through a sigmoidal response curve to give a final cell activity, or “firing
rate,” $y_{t,i}$:
$$y_{t,i} = \frac{1}{1 + e^{-(a_i x_{t,i} + b_i)}}, \qquad (2)$$
where both the slope $a_i$ and the offset $b_i$ are adjusted via homeoplastic mechanisms in equations (4)
and (5) below.
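One way to read equations (1) and (2) as a discrete update is sketched below with a forward-Euler step on a toy three-cell, two-spindle network. This is an assumption-laden illustration, not the original implementation: the function names and the toy weights are invented for the example.

```python
import math


def sigmoid(x, a, b):
    """Equation (2): firing rate from voltage with slope a and offset b."""
    return 1.0 / (1.0 + math.exp(-(a * x + b)))


def step_voltage(x, afferent, lateral, tau=0.01, dt=0.01):
    """One forward-Euler step of tau * dx/dt = -x + afferent + lateral."""
    return x + (dt / tau) * (-x + afferent + lateral)


def network_step(x, y, S, L, f, a, b, tau=0.01, dt=0.01):
    """Advance all cells by one time step and return (new_x, new_y)."""
    n = len(x)
    new_x = []
    for i in range(n):
        afferent = sum(S[i][k] * f[k] for k in range(len(f)))
        lateral = sum(L[i][j] * y[j] for j in range(n) if j != i)
        new_x.append(step_voltage(x[i], afferent, lateral, tau, dt))
    new_y = [sigmoid(new_x[i], a[i], b[i]) for i in range(n)]
    return new_x, new_y


# Toy network: 3 cells, 2 spindles (all values hypothetical).
S = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
L = [[0.0, 1.0, -1.0], [1.0, 0.0, 1.0], [-1.0, 1.0, 0.0]]
x1, y1 = network_step([0.0] * 3, [0.5] * 3, S, L, [0.4, 0.6], [1.0] * 3, [0.0] * 3)
```

With the Table 1 values Δt = τ = 0.01, the Euler step reduces to setting each voltage to its current summed input, so under this discretization the voltage carries no memory of its previous value.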
3.3.3 Hebbian Plasticity and Homeoplasticity
At each time step, afferent and lateral (positive and negative) weights were modified via
Hebbian-type learning as follows:
$$w_{t+\Delta t} = w_t + \eta_{hebb} \, y_{pre,t} \, x_{post,t} \qquad (3)$$
where $y_{pre,t}$ and $x_{post,t}$ are the pre-synaptic cell or spindle activity and post-synaptic cell voltage,
respectively, at time t, and $\eta_{hebb}$ is the Hebbian learning rate (Castillo et al. 2011; Citri and Malenka 2008;
Foldiak 1990; Madison et al. 1991). Only the magnitude, not the sign, of the weights could change.
Weights were re-normalized after each update to prevent runaway weight values. The normalization
created competition between weights, whereby one weight to a cell could not increase without
decreasing the contributions of other weights. Such competitive Hebbian learning is known to lead to
regular map formation in models of sensory cortex (Kohonen 1982; Linsker 1989; Malsburg 1973; Wilson
et al. 2010).
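The competition created by re-normalization can be illustrated with a small sketch for the afferent (spindle) weights of a single cell. The function name is hypothetical, and clipping at zero is an assumption introduced here to respect the statement that weights cannot change sign; the text does not say how that constraint was enforced. Lateral weights would need the same treatment applied to their magnitudes.

```python
ETA_HEBB = 0.03  # Hebbian learning rate (Table 1)


def hebbian_update(weights, pre_activity, post_voltage, total=1.0, eta=ETA_HEBB):
    """Apply w <- w + eta * y_pre * x_post, then rescale to sum to `total`."""
    grown = [max(0.0, w + eta * pre * post_voltage)  # clip: assumed sign guard
             for w, pre in zip(weights, pre_activity)]
    scale = total / sum(grown)
    return [w * scale for w in grown]


# Four equal weights; only the first two inputs are active.
old = [0.25, 0.25, 0.25, 0.25]
new = hebbian_update(old, pre_activity=[1.0, 0.5, 0.0, 0.0], post_voltage=2.0)
```

In this toy case the two active inputs gain weight while the two silent inputs lose weight, even though their raw Hebbian change was zero.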
To model homeoplasticity, the slope and the offset of each cell’s response curve (equation (2))
were modified at each time step with the goal of maintaining an exponential firing rate distribution with
a desired mean (Butko and Triesch 2006; Triesch 2005). For cell i, the changes of sigmoidal
parameters $a_i$ and $b_i$ in equation (2) are given by:
$$\Delta a_i = \eta_{hom} \left( \frac{1}{a_i} + x_i - \left( 2 + \frac{1}{\mu_{hom}} \right) x_i y_i + \frac{1}{\mu_{hom}} x_i y_i^2 \right) \qquad (4)$$
$$\Delta b_i = \eta_{hom} \left( 1 - \left( 2 + \frac{1}{\mu_{hom}} \right) y_i + \frac{1}{\mu_{hom}} y_i^2 \right) \qquad (5)$$
where $\eta_{hom}$ is a learning rate and $\mu_{hom}$ is the mean of the exponential distribution (see Table 1 for
parameter values and initial values).
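Equations (4) and (5) translate directly into a per-cell update, sketched below. The function name and the starting values of a and b are assumptions for illustration; the learning rate and target mean follow Table 1.

```python
import math

ETA_HOM = 0.001  # homeoplastic learning rate (Table 1)
MU_HOM = 0.3     # target mean of the exponential firing distribution (Table 1)


def homeoplastic_step(a, b, x, eta=ETA_HOM, mu=MU_HOM):
    """Return updated (a, b) and the activity y for one cell at one time step."""
    y = 1.0 / (1.0 + math.exp(-(a * x + b)))  # equation (2)
    delta_a = eta * (1.0 / a + x - (2.0 + 1.0 / mu) * x * y + (1.0 / mu) * x * y * y)
    delta_b = eta * (1.0 - (2.0 + 1.0 / mu) * y + (1.0 / mu) * y * y)
    return a + delta_a, b + delta_b, y


# A cell firing above the target mean has its offset lowered...
a_hi, b_hi, y_hi = homeoplastic_step(a=1.0, b=0.0, x=0.0)
# ...while a cell firing far below the target has its offset raised.
a_lo, b_lo, y_lo = homeoplastic_step(a=1.0, b=-3.0, x=0.0)
```

The two calls show the corrective direction of the rule: excitability falls when activity is too high and rises when activity is too low.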
MODEL PARAMETERS
Mexican Hat Parameters:    $A_e$ = 10; $\sigma_e$ = 1.5; $A_i$ = 5; $\sigma_i$ = 2.5
Plasticity Parameters:     $\eta_{hebb}$ = 0.03; $\eta_{hom}$ = 0.001; $\mu_{hom}$ = 0.3;
                           Spindle input weight normalization constant = 1;
                           Positive lateral weight normalization constant = 9;
                           Negative lateral weight normalization constant = -9
Cell Dynamics Parameters:  Δt = 0.01; τ = 0.01
Table 1. Values of all the parameters used in the model.
3.3.4 Initial Network Training
Using spindle activities generated by 400 simulated arm movements (a total training length of
24,693 time steps) as inputs, we trained 10 different networks with a different random number
generator seed for each network. The distribution of MT during these 400 movements was 0.62 ± 13 sec
(mean ± SD).
3.3.5 Lesioning
After initial training, the 10 cortical networks were lesioned by removing all cells with an input
weight from the shoulder flexor length/velocity spindle (abbreviated as the Sh-Fl Len/Vel spindle)
greater than 0.1. Lesioning the cortical representation of a specific spindle is similar to experimental approaches targeting the
somatotopic representations of specific limbs or limb segments – see for instance (Brown et al. 2009;
Kleim et al. 2003; Nudo et al. 1996).
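The lesion criterion can be expressed as a simple threshold on one column of the afferent weight matrix. The sketch below uses a hypothetical four-cell, three-spindle matrix and invented function names; only the 0.1 threshold comes from the text. Silencing lesioned cells by zeroing their activity is an assumed way of removing their lateral output.

```python
LESION_THRESHOLD = 0.1  # weight threshold from the text


def surviving_cells(afferent_weights, spindle_index, threshold=LESION_THRESHOLD):
    """Indices of cells whose weight from the targeted spindle is not above threshold."""
    return [i for i, row in enumerate(afferent_weights)
            if row[spindle_index] <= threshold]


def silence_lesioned(activities, survivors):
    """Zero the activity of lesioned cells so they send no lateral output (assumed)."""
    alive = set(survivors)
    return [y if i in alive else 0.0 for i, y in enumerate(activities)]


# Rows are cells, columns are spindles (hypothetical values).
W = [[0.60, 0.30, 0.10],
     [0.05, 0.50, 0.45],
     [0.10, 0.20, 0.70],
     [0.30, 0.30, 0.40]]
survivors = surviving_cells(W, spindle_index=0)
post_lesion_activity = silence_lesioned([0.8, 0.2, 0.3, 0.6], survivors)
```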
3.3.6 Rehabilitation training
After lesioning, the post-stroke rehabilitation phase consisted of the same sequence of arm
movements as in the training phase under each of the following conditions.
3.3.6.1 Simulated Experiment 1: Hebbian/Homeoplastic vs. Hebbian-Only Rehabilitation
To test the importance of homeoplasticity in network recovery, rehabilitation training with both
Hebbian and homeoplastic mechanisms active was compared to training with only the Hebbian
mechanism active (η
hom
was set to 0).
3.3.6.2 Simulated Experiment 2: Delay vs. No Delay Rehabilitation
To test the effect of rehabilitation timing, training was given either immediately after lesion (No
Delay condition) or after a delay period (Delay condition). During the delay, only spindle activity
stemming from the home position lengths of the six muscles was provided as input to the network.
These activities were between 0.4 and 0.6 for all 12 spindles, in the middle of the possible [0,1] range of
activity. For the main results, the delay length was equivalent to 50% of the 24,693 time steps of initial
training, or 12,346 time steps. To link with Experiment 1, the delay period was also simulated with only
Hebbian plasticity present. We also explored the effect of variable delays in a parameter sensitivity
analysis (see below).
3.3.6.3 Simulated Experiment 3: Elbow-Only Training after Lesion
After infarct affecting the forelimb representation in monkey sensorimotor cortex, the absence
of forced forelimb use and re-training prevents recovery of the forelimb cortical representation (Nudo
and Milliken 1996; Nudo et al. 1996). To qualitatively compare the reorganization of our networks to
these results, we administered a type of degraded post-lesion training in which spindle activities from
the shoulder flexors and extensors were held at the values generated in the home arm position; other
spindle activities were as during normal rehabilitative training above.
3.3.7 Outcome Measures
3.3.7.1 Mean Cortical Cell Activities
Mean cell activities were calculated by averaging the activity of each cell in the network over a
set number of simulation time steps (precise numbers noted in Results for different situations). To
quantify abnormal activity, the mean cell activities taken across all time steps in initial training were
subtracted from the mean cell activities during the period of interest (e.g. during rehabilitation) on a
cell-by-cell basis. The mean and standard deviation of the absolute value of this change in mean activity
was then calculated across cells and across simulations arising from different random number seeds.
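The abnormal-activity measure described above can be sketched for a single simulation as follows. The function names and the toy activity logs are hypothetical, and pooling across several random seeds is omitted for brevity.

```python
from statistics import mean, stdev


def mean_activity(activity_log):
    """activity_log[t][i] is the activity of cell i at step t; returns per-cell means."""
    n_cells = len(activity_log[0])
    return [mean(step[i] for step in activity_log) for i in range(n_cells)]


def abnormal_activity(baseline_log, period_log):
    """Mean and SD across cells of |mean activity in period - mean activity at baseline|."""
    base = mean_activity(baseline_log)
    period = mean_activity(period_log)
    changes = [abs(p - b) for p, b in zip(period, base)]
    return mean(changes), stdev(changes)


# Two cells, two time steps per period (toy values).
baseline = [[0.2, 0.4], [0.4, 0.4]]
rehab = [[0.6, 0.1], [0.8, 0.1]]
mean_change, sd_change = abnormal_activity(baseline, rehab)
```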
3.3.7.2 Spindle Input Weight Map and Proportion Changes
We measured the degree of recovery by comparing the input maps before lesion and at
different time points after lesion. Specifically, for each spindle, 20x20 input weight maps were created
by plotting the strength of input from this spindle to each of the 400 cells. Comparisons between post-
and pre-lesion maps were quantified using the weight proportion change for each spindle, calculated as
follows. The sum of input weight to all surviving cortical cells for a given spindle was divided by the sum
of input weights to all surviving cells from all 12 spindles (there were 400 surviving cells pre-lesion, but
this was reduced after lesion). This proportion (expressed as a percentage) was then compared at
different times post-lesion to the spindle’s average proportion pre-lesion (across the final 6,000 time
steps of training, i.e., after map organization had converged). Recovery was defined as the point at which the
proportion change from pre-lesion was no longer significantly different from 0% for the lesion-targeted
Sh-Fl Len/Vel spindle. Significance was defined at the p = 0.05 level using the Wilcoxon signed
rank test, using the 10 network simulations as the sample. Weight proportion change thus allowed an
intuitive link with experimental work, where the reduction or expansion of somatotopic representations
of different limb segments is one of the outcomes of interest after post-infarct rehabilitation or skill
learning (Kleim et al. 1998; Nudo et al. 1996; Plautz et al. 2000).
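A minimal sketch of the weight proportion calculation is shown below for a hypothetical three-cell, two-spindle network. The function name and the values are illustrative assumptions, and the Wilcoxon signed rank test across the 10 networks is not reproduced here.

```python
def weight_proportion(afferent_weights, survivors, spindle_index):
    """Percentage of all surviving afferent weight that comes from one spindle."""
    alive = set(survivors)
    rows = [row for i, row in enumerate(afferent_weights) if i in alive]
    spindle_sum = sum(row[spindle_index] for row in rows)
    total = sum(sum(row) for row in rows)
    return 100.0 * spindle_sum / total


# Rows are cells, columns are spindles (hypothetical values).
W = [[0.7, 0.3],
     [0.2, 0.8],
     [0.5, 0.5]]
pre_lesion = weight_proportion(W, survivors=[0, 1, 2], spindle_index=0)
post_lesion = weight_proportion(W, survivors=[1, 2], spindle_index=0)
proportion_change = post_lesion - pre_lesion
```

Removing the cell most strongly driven by spindle 0 lowers that spindle's share of the surviving input weight, which is the kind of deficit the measure is meant to track during recovery.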
3.3.7.3 Cell Allegiance Changes
At any single time point, one spindle has the largest input weight to a given cortical cell. Because
of competitive Hebbian plasticity, this spindle can eventually have a lower weight to the cell than one of
the other 11 spindles. If this happens, it is termed an “allegiance change,” since the cell changes the
spindle with which it has the strongest allegiance. Allegiance changes were used to quantify the degree
of synaptic plasticity occurring post-lesion, since such plasticity was necessary for reorganization of
cortical inputs. To compare the Hebbian/Homeoplastic and Hebbian-Only conditions, the number of
allegiance changes per cell was computed over the entire rehabilitation training.
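Counting allegiance changes for one cell amounts to tracking which spindle holds the largest weight over time. The sketch below assumes a hypothetical weight history and function names; ties are resolved in favor of the lower spindle index, which is a choice made for this example only.

```python
def dominant_spindle(weights_row):
    """Index of the spindle with the largest input weight to this cell."""
    return max(range(len(weights_row)), key=weights_row.__getitem__)


def count_allegiance_changes(weight_history):
    """weight_history[t] is one cell's afferent weight vector at step t."""
    changes = 0
    previous = dominant_spindle(weight_history[0])
    for row in weight_history[1:]:
        current = dominant_spindle(row)
        if current != previous:
            changes += 1
            previous = current
    return changes


# Dominant spindle goes 0, 0, 1, 1, 0 over five steps (toy values).
history = [[0.60, 0.40], [0.55, 0.45], [0.40, 0.60], [0.45, 0.55], [0.70, 0.30]]
n_changes = count_allegiance_changes(history)
```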
3.3.8 Parameter Sensitivity Analyses
To check that the simulation results were not based on precisely-tuned parameters alone, we
ran rehabilitation simulations with different delay lengths (0, 12.5, 25, 37.5, 50, 62.5 and 75%), Hebbian
learning rates (0, 0.0003, 0.003, 0.03 and 0.3), and homeoplastic learning rates (0, 0.0001, 0.001, 0.01
and 0.1). For each parameter, we ran six simulations with different random number seeds and
computed lesion-targeted Sh-Fl Len/Vel input weight proportion change to quantify the degree of
recovery.
Figure 1. Model and simulated experiments. (a) Effects of arm movement on cortical activity. Left to
right: simulated 2-joint, 6-muscle arm during movement (background shows arm in home position for
reference). Black dashed lines are muscles and include shoulder flexor and extensor (Sh-Fl and Sh-Ex),
elbow flexor and extensor (El-Fl and El-Ex), and bi-articular flexor and extensor (Bi-Fl and Bi-Ex). 6-by-2
pixel grid shows corresponding activity in Length-Only and Length/Velocity spindles for each muscle. As
input to cortical cells, spindle activities were multiplied by input weights (shown as stack of weight
maps) to cells. Each map pixel shows weight from one spindle to one cortical cell; different maps
represent weights from different spindles. Each cell also received weighted activity from neighboring
cells as input, with weights in Mexican hat pattern (plot second from right shows example of lateral
weights from nearest 120 neighbors to cell at center of grid). Summed input from spindles and
neighbors (“membrane potential”) is passed through sigmoid function to generate cortical cell activity,
shown on far right of figure (each pixel is activity of one cell). (b) Schematic demonstrating time course
of initial network training period, lesion, and rehabilitation training in experiments 1 and 2, as well as
post-lesion Elbow-Only training in experiment 3.
3.4 Results
3.4.1 Initial Network Training
As expected with competitive Hebbian learning, the initial random distribution of input weights
from a given spindle to the cortical network (representing a small patch of primary sensory cortex)
became organized into well-defined neighborhoods of relatively high input strength (Figure 2a). During
the final 6,000 time steps of training, each spindle contributed between 7.8% and 8.8% (averaged across
time steps and simulations) of the total cortical input weight. The similar contributions reflected the
similar percentage of total input activity accounted for by each spindle over time, averaging between
8.2% and 8.5%. Note that because of correlations or anti-correlations in spindle activities (see
Appendix), spindle pairs had overlapping and non-overlapping weight map distributions, respectively.
Because there was very large overlap in the map distributions of length/velocity and length-only
spindles arising from the same muscle, only results for the length/velocity spindles are shown hereafter.
3.4.2 Effects of Targeted Lesion
To simulate a stroke lesion, all cells with input weights from the Sh-Fl length/velocity spindle
greater than 0.1 were removed from the network. This lesion criterion resulted in 31 ± 7.0% (mean ± SD)
of cells being lesioned and removed from each network. The post-lesion input weight proportion for the
targeted Sh-Fl spindle was reduced by 43 ± 8.7% across simulation seeds (Figure 2d). The Bi-Fl spindle
also experienced a weight proportion decrease of 32 ± 11% due to its significant cortical representation
overlap with the Sh-Fl spindle.
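The lesion criterion and the resulting drop in weight proportion can be illustrated with a toy network. The sketch below assumes that a spindle's weight proportion is its summed weight over surviving cells divided by the summed weight from all spindles; the function names, the data layout, and the toy weights are hypothetical. Only the 0.1 threshold comes from the text.

```python
LESION_THRESHOLD = 0.1  # threshold stated in the text


def lesion(weights, target, threshold=LESION_THRESHOLD):
    """Keep only cells whose weight from the target spindle is <= threshold.

    weights: list of cells, each a list of input weights (one per spindle).
    """
    return [cell for cell in weights if cell[target] <= threshold]


def weight_proportion(weights, spindle):
    """Share of total input weight contributed by one spindle (assumed definition)."""
    total = sum(sum(cell) for cell in weights)
    return sum(cell[spindle] for cell in weights) / total


def percent_change(before, after):
    """Percent difference of `after` relative to `before`."""
    return 100.0 * (after - before) / before


# Toy network: 4 cells, 3 spindles; spindle 0 plays the role of Sh-Fl.
cells = [
    [0.30, 0.05, 0.05],
    [0.20, 0.10, 0.10],
    [0.05, 0.20, 0.15],
    [0.05, 0.15, 0.20],
]
survivors = lesion(cells, target=0)
pre = weight_proportion(cells, 0)
post = weight_proportion(survivors, 0)
change = percent_change(pre, post)
print(len(survivors), round(pre, 3), round(post, 3))  # 2 0.375 0.125
```

In this toy case the targeted spindle loses most of its share because the cells that carried its strongest weights are exactly the ones removed, which is the same logic behind the reductions reported above.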
Lesions caused large disturbances in mean cell activities of surviving cells by removing a large
fraction of lateral connections. Figures 3a and b demonstrate this for one network by showing mean
network activity after initial training either pre- or post-lesion (computed over the same 3,000 time
steps of spindle activity). The lesion caused persistent hyperactivity in local areas of the network, which
then inhibited surrounding areas, causing them to exhibit low activity.
Figure 2. Effects of initial training, lesion, and rehabilitation on spindle weight input representations in
Experiment 1. (a) Input weight maps from shoulder muscle spindles to cortical cells at end of training.
Brightness denotes strength of input weight. White border denotes area to be lesioned. (b) Input weight
maps for shoulder spindles after Hebbian/Homeoplastic condition. Here and in c), hatched region
denotes lesion. (c) Input weight maps for shoulder spindles after Hebbian-Only condition. Note the
lesion-targeted Sh-Fl spindle shows diminished areas of representation compared to the
Hebbian/Homeoplastic condition in b). (d) Percent difference from pre-lesion of each spindle’s input
weight proportion after lesion and at the end of rehabilitation. Error bars represent mean ± SD. 0%
denotes recovery of the pre-lesion proportion. (e) Percent difference from pre-lesion of each spindle’s
input weight proportion during rehabilitation. Note for the lesion-targeted Sh-Fl spindle, recovery only
occurs in the Hebbian/Homeoplastic condition. Shaded error represents mean ± SD.
3.4.3 Experiment 1: Essential Role of Homeoplasticity in Recovery
We next addressed our first hypothesis, that homeoplasticity enhanced recovery of spindle
input weight representations in the network after lesion. Figure 2 shows that the lesion-targeted Sh-Fl
spindle recovered its pre-lesion input weight proportion by the end of the rehabilitation training in the
Hebbian/Homeoplastic rehabilitation (recovery defined in Methods using the Wilcoxon signed rank
test). In contrast, in the Hebbian-Only rehabilitation, there was no recovery: the post-lesion Sh-Fl
proportion remained significantly different from pre-lesion. Similarly, the other spindles recovered in
the Hebbian/Homeoplastic condition but did not recover their pre-lesion proportions in the Hebbian-
Only condition.
The different recovery patterns seen in the Hebbian/Homeoplastic and Hebbian-Only conditions
were driven by greatly different cell activity patterns during rehabilitation training. Figure 3c and d show
the mean cell activities during rehabilitation for both Hebbian-Only and Hebbian/Homeoplastic
rehabilitation. In comparison to the Hebbian/Homeoplastic condition, activity remained persistently
high or low in localized areas in the Hebbian-Only condition, similar to the activity patterns seen
immediately after lesion in Figure 3b. Under the Hebbian-Only condition, the absolute value of the
change in mean cell activities compared to initial training was 0.13 ± 0.14 (mean ± SD across simulations
and surviving cells). Such large changes in mean activity contrasted with the small changes in the
Hebbian/Homeoplastic condition, which showed an absolute change of 0.041 ± 0.029 (p < 1e-172 using
the Wilcoxon rank sum test to compare conditions).
Figure 3. Mean cortical cell activity before and after lesion and rehabilitation. (a) Mean activity of
cortical cells (each represented as a single pixel) when network is presented with 3,000 time steps of
arm movement after the end of the training dose in the absence of a lesion. Brightness denotes activity
level. (b) Mean activity of cortical cells after lesion in response to the same 3,000 time steps of arm
movement after training. Here and in c) and d), hatched region denotes lesion. (c) Mean cortical cell
activity over rehabilitation dose during Hebbian/Homeoplastic condition. (d) Mean cortical cell activity
over rehabilitation dose during Hebbian-Only condition.
The difference in cell activities under each condition was driven by the presence or absence of
homeoplasticity. Importantly, the direct effect of homeoplasticity on cell activity was the first link in a
chain whereby homeoplasticity ultimately affected Hebbian plasticity and hence network
reorganization.
The next link connected activity levels with cell membrane potentials. Although the activity of
any cell in isolation had no way of affecting the cell’s own membrane potential, in the laterally
connected network the activity could affect the membrane potentials of other cells. Thus by altering the
activities of the cells, homeoplasticity also indirectly changed the distribution of membrane potentials in
the network. This effect is seen in the different distributions of mean cell membrane potentials (mean
taken over rehabilitation training) under the Hebbian/Homeoplastic and Hebbian-Only conditions
(Figure 4a).
In the final link, homeoplasticity’s effect on membrane potentials allowed it to affect Hebbian
weight plasticity (see equation (3)) and hence network reorganization, quantified here by cell allegiance
changes during rehabilitation. Figure 4b demonstrates the close relationship between membrane
potential and allegiance changes. By affecting the membrane potentials of the cells, homeoplasticity
affected Hebbian spindle input weight changes through the membrane potential term in the Hebbian
rule (equation (3)). In particular, cells in the Hebbian-Only condition with near-zero mean membrane
potential did not exhibit the allegiance changes needed for recovery: under the Hebbian-Only condition
34.9 ± 17.1% (mean ± SD across simulations) of cells had few (fewer than 10) allegiance changes during
rehabilitation training, versus only 1.0 ± 0.91% for the Hebbian/Homeoplastic condition. The cells with
fewer than 10 allegiance changes in the Hebbian-Only condition are shown in Figure 4c, along with the
number of allegiance changes in the same cells under the Hebbian/Homeoplastic condition. This
demonstrates that the same cells that contributed little to reorganization in the Hebbian-Only condition
were not similarly impotent in the Hebbian/Homeoplastic condition.
Importantly, for the cells with fewer than 10 allegiance changes in the Hebbian-Only condition,
the input weight from the lesion-targeted Sh-Fl spindle changed by -0.016 ± 0.046 (mean ± SD). In
contrast, in the Hebbian/Homeoplastic condition, input weight from the lesion-targeted Sh-Fl spindle
increased by 0.031 ± 0.079 (Figure 4d). This illustrates the connection between reorganization and
recovery: cells with few allegiance changes experienced little gain in Sh-Fl input strength, so the chain
mechanistically linking homeoplasticity to reorganization and increased allegiance changes also linked it
to recovery.
Figure 4. Relationship between mean cell membrane potentials and spindle input weight plasticity in the
Hebbian/Homeoplastic and Hebbian-Only conditions. a) Distributions of mean cell membrane potentials
during rehabilitation under both conditions. Note the clustering around zero in the Hebbian-Only
condition. b) Relationship between mean membrane potential and the number of cell allegiance
changes (defined as a change in the spindle with the highest input weight to a cell). The two values are
highly correlated because membrane potential drives synaptic plasticity according to the Hebbian rule
(equation (3)), which in turn leads to allegiance changes. c) Only cells with few (< 10) allegiance changes
in the Hebbian-Only condition are shown. The same cells with fewer than 10 allegiance changes in the
Hebbian-Only condition exhibit many more changes under the Hebbian/Homeoplastic condition. d)
Connection between allegiance changes and increase in weight (i.e. recovery) from the lesion-targeted
Sh-Fl spindle. For the same cells as in (c), the histogram shows changes in Sh-Fl spindle weight from the
beginning to end of rehabilitation. Note the shift towards higher positive values in the
Hebbian/Homeoplastic condition.
3.4.4 Experiment 2: Effect of Delayed Rehabilitation Training
Our second hypothesis, that rehabilitation delivered immediately after lesion (No
Delay condition) is less effective than rehabilitation delivered after a delay (Delay condition; see Figure 1b), was
supported, owing to the following mechanism. Driven by similar activity from all spindles during the delay period,
many cells decreased their input weights from previously strongly connected spindles and increased
weights from previously weakly connected spindles, making weights more similar (or “flattened”) across
spindles. For each cell, this change was gauged by calculating the standard deviation of input weights
across all 12 spindles before and after the delay, since more flattened weights would give rise to a
smaller standard deviation. Indeed, the input weight standard deviation per cell fell from 0.053 ± 0.017
(mean ± SD) immediately post-lesion to 0.026 ± 0.025 after the delay. In particular, since the lesion left
alive only cells with weak weights from the Sh-Fl spindle, the flattening of input weights increased
weight from this spindle on average. Immediately post-lesion, the average input weight value from the
Sh-Fl spindle was 0.048 ± 0.024 per cell (mean ± SD across simulations and surviving cells), while it
increased to 0.064 ± 0.017 after the delay (see Figure 5a-b).
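The flattening measure can be sketched for a single cell as follows. The weights are invented for illustration, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not state which estimator was used.

```python
from statistics import pstdev


def weight_spread(cell_weights):
    """Standard deviation of one cell's input weights across spindles.

    Population SD is used here; the estimator is an assumption.
    """
    return pstdev(cell_weights)


# Hypothetical weights for one surviving cell, 4 spindles shown (the model has 12).
# Index 1 stands in for the lesion-targeted Sh-Fl spindle, which is weak in
# every surviving cell by construction of the lesion.
before_delay = [0.20, 0.02, 0.02, 0.02]
after_delay = [0.08, 0.06, 0.06, 0.06]

spread_before = weight_spread(before_delay)
spread_after = weight_spread(after_delay)
print(spread_after < spread_before)  # True: weights are flatter after the delay
print(after_delay[1] > before_delay[1])  # True: the weak Sh-Fl weight has risen
```

A smaller spread means the weights are more similar across spindles, and because the surviving cells start with weak Sh-Fl weights, flattening tends to raise that weight.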
Thus, rehabilitation after delay began with similar weight values from all spindles. This allowed
the Sh-Fl weights to compete more effectively with spindles that would have had much higher weights
before the delay, thereby speeding recovery (Figure 5c). Although recovery of weight proportions for
lesion-targeted Sh-Fl spindle occurred in both conditions, recovery occurred after 13032 time steps of
rehabilitation in the No Delay condition but only after 4550 time steps in the Delay condition, a 65%
faster recovery (Wilcoxon signed rank test was used to measure recovery, as explained in Methods).
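As a quick check on the figure quoted above, one reading of "65% faster" is that the Delay condition needed about 65% fewer rehabilitation time steps to reach recovery. The variable names below are illustrative; the two step counts are taken from the text.

```python
no_delay_steps = 13032  # time steps to recovery, No Delay condition
delay_steps = 4550      # time steps to recovery, Delay condition

# Fractional reduction in the time steps needed to reach recovery.
reduction = (no_delay_steps - delay_steps) / no_delay_steps
# Equivalent speed-up ratio between the two conditions.
speedup = no_delay_steps / delay_steps

print(f"{100 * reduction:.1f}% fewer time steps")  # 65.1% fewer time steps
print(f"{speedup:.2f}x speed-up")                  # 2.86x speed-up
```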
The beneficial effect of delay depended on homeoplasticity: in a simulation of Hebbian-Only
plasticity during the delay, input weights did not flatten to the same degree during the delay. In
particular, the average Sh-Fl weight remained at 0.052 ± 0.024 (mean ± SD across simulations and
surviving cells) after delay without homeoplasticity, little-changed from immediately post-lesion in
comparison to the Sh-Fl weights after the delay with homeoplasticity.
Figure 5. Delay vs. No Delay Hebbian/homeoplastic rehabilitation. (a)-(b) Input weight maps for lesion-
targeted Sh-Fl spindle at beginning of rehabilitation dose in No Delay condition (i.e. immediately after
lesion) and Delay condition (i.e. after the delay period), respectively. Note in Delay condition, weights
from Sh-Fl spindle have slightly re-emerged and become more evenly distributed across network by
start of rehabilitation. (c) Percent difference from pre-lesion of each spindle’s input weight proportion
during rehabilitation. 0% denotes recovery of the pre-lesion proportion. For lesion-targeted Sh-Fl
spindle, recovery occurs faster in Delay condition (recovery points shown by arrows; based on Wilcoxon
signed rank test using a significance level of p < 0.05). Shaded error represents mean ± SD.
3.4.5 Experiment 3: Effects of Elbow-Only Training after Lesion
During Elbow-Only training post-lesion, the input weight proportion from the lesion-targeted Sh-
Fl spindle showed no recovery of representation across all simulation seeds (see Figure 6). The spindle’s
input weight proportion immediately after lesion was reduced by 44.8 ± 4.7% (mean ± SD) compared to
pre-lesion, and after post-lesion training it was still reduced by 33.3 ± 9.1%. There was also loss of
representation of the shoulder extensor spindle Sh-Ex due to its inactivity during the Elbow-Only
training. Immediately after lesion, it had increased its weight proportion by 45.7 ± 10.1% compared to
pre-lesion, but by the end of training this had become a decrease of 10.5 ± 5.9%.
Figure 6. Elbow-Only post-lesion training. a) Input weight maps for shoulder spindles at end of post-
lesion training show failure of lesion-targeted Sh-Fl spindle to regain representation and reduced
representation of unused Sh-Ex spindle. Compare to Figure 2b. b) Percent difference from pre-lesion of
each spindle’s input weight proportion during post-lesion training. 0% denotes recovery of pre-lesion
proportion. Similar to experimental findings, “non-use” of affected shoulder in absence of rehabilitation
leads to no recovery of affected Sh-Fl spindle, as well as expanded representation beyond pre-lesion
levels in some muscles that were in use during post-lesion training (e.g., Bi-Fl). Shaded error represents
mean ± SD.
3.4.6 Parameter Sensitivity Analyses
We ran parameter sensitivity analyses for the delay length, Hebbian learning rate, and
homeoplastic learning rate. The mean time courses of Sh-Fl recovery for all altered parameter values are
shown in Figure 7. To test different delays, the same number of time steps of rehabilitation was applied
to the network after varying delay lengths. As shown in Figure 7a, recovery during rehabilitation training
was slowest with no (0%) delay, as in our main results. This confirmed that the benefits of delay were
not dependent on the precise delay length, perhaps excluding delay lengths much shorter than those
shown in Figure 7a.
Figure 7b and c show the recovery results for the Hebbian and homeoplastic learning rates,
respectively. Very low learning rates did not achieve recovery within the length of rehabilitation training,
while high rates led to unstable recovery, suggesting unstable weight maps that changed drastically
based on recent spindle input. High Hebbian learning rates yielded much more frequently changing
recovery levels than high homeoplastic learning rates, however.
Figure 7. Parameter sensitivity analyses. Each plot shows mean percent difference from pre-lesion of
targeted Sh-Fl spindle’s input weight proportion during rehabilitation. 0% denotes recovery of pre-lesion
proportion. a)-c) represent sensitivity analyses for different delays, Hebbian learning rates, and
homeoplastic learning rates, respectively. Recovery was robust in all cases except for very low or zero-
valued Hebbian or homeoplastic learning rates. Note the x-axis of a) was shortened in order to clearly
show differences in time course for different delays.
3.5 Discussion
In this simulation study, we explored the influence of cellular Hebbian and homeoplastic
mechanisms on recovery post-stroke. We notably aimed to answer two important questions regarding
neural plasticity mechanisms and rehabilitative therapy in the period post-stroke. The first question was
whether homeoplasticity was, in addition to Hebbian plasticity, necessary for recovery following a
lesion; the second question was whether administering rehabilitation training after a delay period aided
recovery.
Regarding the first question, our results clearly showed that recovery in our model was only
possible in the presence of both homeoplasticity and Hebbian plasticity. Homeoplasticity abolished local
high or low cortical activity patterns seen immediately post-lesion. As shown in Figure 4, this led to very
different cell membrane potential distributions in the Hebbian/Homeoplastic and Hebbian-Only
conditions, which in turn led to different amounts of network reorganization (quantified by cell
allegiance changes). In the Hebbian/Homeoplastic condition, the ubiquity of non-zero mean membrane
potentials allowed many allegiance changes and thus allowed inputs from the lesion-affected spindle to
establish strong connections to the surviving cells. In contrast, in Hebbian-Only networks, local high or
low activity patterns were persistent. This, in turn, drove maladaptive plasticity and poor map
reorganization by causing many mean cell membrane potentials to remain close to zero, thus preventing
re-emergence of strong weights from the lesion-targeted Sh-Fl spindle. Parameter sensitivity analyses
revealed that it was the presence of both Hebbian and homeoplasticity that was important, not the
precise values of their learning rates, since poor recovery only occurred with extreme changes in these
learning rates. Note that our explanation of the role of homeoplasticity post-stroke is different from, but
not mutually exclusive with, previous suggestions that a homeoplastic response to the low levels of peri-
lesional activity immediately post-lesion leads to the hyperexcitability seen in the days after lesion
(Murphy and Corbett 2009; Nahmani and Turrigiano 2014).
Interestingly, another simulation study reached a similar conclusion on the complementary roles
of homeoplasticity and Hebbian plasticity, but during visual cortical development (Toyoizumi and Miller
2009). Homeoplasticity was a necessary addition to Hebbian plasticity to allow robust equalization in
the representation of both eyes in normal development and overrepresentation of one eye in
monocular deprivation. Similar to the present model, homeoplasticity allowed strengthening of initially
weak input weights, permitting them to compete effectively for cortical representation against other
inputs. Note that in the Toyoizumi and Miller model, very noisy cell activity could compensate for a lack
of homeoplasticity (a possibility not explored here); however, compensation was not robust to
parameter changes.
Regarding the second question, we found that although recovery and the alleviation of
maladaptive plasticity were possible under both No Delay and Delay conditions, a delay led to faster
recovery during a subsequent rehabilitation dose. In the Delay condition, during the delay,
homeoplasticity led to a “flattening” of input weight distributions; that is, the spindle input weights to
each cortical cell became more evenly distributed across all spindles. This was quantified by the overall
decrease after delay of the standard deviations of spindle weight values arriving at each cell. This
allowed re-emergence of weights from the lesion-affected spindles by the beginning of rehabilitation
training (Figure 5), acting as a seed upon which further strengthening and reorganization of lesion-
affected spindle weights occurred quickly. In contrast, in the No Delay condition, more time during
training was required to achieve any initial weight re-emergence. In a link with Experiment 1,
homeoplasticity was required for Sh-Fl re-emergence during the delay. This spontaneous reorganization
depended on the same effects of homeoplasticity on persistent high or low cortical activity explored in
Experiment 1.
This homeoplastic-dependent mechanism of weight flattening may partially explain (along with
activity-induced cell death) poor outcomes seen in rodents and humans when exposed to rehabilitation
very early after stroke (Bland et al. 2000; Humm et al. 1999; Kozlowski et al. 1996; Risedal et al. 1999).
Note that although the model predicts the same level of recovery after extensive training in both the
Delay and No Delay condition, faster recovery in the Delay condition suggests that a limited dose
produces better recovery in a short time if given after a delay. Whether this phenomenon occurs
biologically could be tested experimentally in rodents by using different rehabilitation dose lengths and
different delays before rehabilitation onset to see if for short doses, some delay is beneficial to recovery.
This is important clinically because the majority of stroke patients receive only a limited dose of physical
therapy that may be insufficient to maximize recovery (Lang et al. 2009).
Further experimental work could also attempt to reduce the ability of peri-infarct cells to
undergo homeoplastic adjustments to test our prediction that homeoplasticity is necessary for recovery.
However, homeoplasticity in vivo is controlled through a variety of signaling pathways, some of which
overlap with LTP pathways (Guzman-Karlsson et al. 2014; Turrigiano 2012; Turrigiano 2011). To
minimize overlap, targets for reducing homeoplasticity might be found at the ends of these pathways at
the gene transcription or protein trafficking level. For example, genes or promoter sequences for
homeostatic processes, such as Na⁺ 1β or K⁺ Kv1.4 channel subunits (McClung and Nestler 2003)
affecting excitability, could be turned off after lesion with optogenetically-controlled transcriptional
repressors.
Previous experimental work in post-stroke monkeys (Nudo et al. 1996) provided a useful
benchmark against which to test the capability of our model to reproduce basic biological outcomes. In
that work, lesion-targeted inputs only regained cortical representation if their use was forced during
rehabilitation. If they remained unused, they failed to regain representation while allowing other, un-
lesioned inputs to gain further representation. Here, Experiment 3 showed similar results: simulated
disuse of the shoulder after lesioning cells with strong Sh-Fl input prevented recovery of pre-lesion
representation of the Sh-Fl input (Figure 6). This occurred because relative Sh-Fl spindle inactivity failed
to provide the presynaptic activity necessary to increase its input weights via Hebbian plasticity (see
equation (3)). This led to the inactive spindles having reduced input weight relative to active spindles,
which, because of competitive Hebbian plasticity, increased the active spindles’ overall representation
beyond pre-lesion levels.
Computer simulations of motor recovery post-stroke have several benefits when pursued in
concert with experimental studies. Simulation allows rapid iteration of theoretical experiments that is
not possible biologically. Though simulation results do not amount to proof, they do provide predictions
to be developed into hypotheses for testing in vivo. However, the simplifications inherent in simulations
necessarily impose a number of limitations on our results.
First, although we proposed here that homeoplasticity (modeled as adjustment of cell
excitability) is an important factor in recovery, there are a number of other processes occurring after
stroke, including changes in cell signaling and gene regulation prompted by cell death (reviewed in
Murphy and Corbett 2009). Experiments that induce lesions with reduced cell death, and hence without
some of these secondary lesion effects, could be undertaken by using local cortical cooling (Orton et al.
2012) or blockade of glutamatergic signaling through drug injection (Malmierca et al. 2003).
Second, our model contains highly simplified neuron models. However, we do not expect large
changes in results with more complex spiking neurons, because homeoplasticity works over long time
scales that filter out individual spikes.
Third, the lack of closed-loop control of the simulated arm meant there was no degradation in
arm control after the stroke and no need to produce compensatory actions to achieve reaching goals.
Interaction between compensatory behavior and plastic mechanisms presumably further impairs
recovery by reinforcing abnormal motor control and will need to be addressed in future models.
While our model addresses sensory cortex recovery post-stroke, recent related modeling
studies have explored reorganization in the motor system post-stroke. Reorganization of the motor system
post-stroke, unlike purely “unsupervised” sensory system reorganization, requires learning rules that
depend on feedback from the environment. In Han et al. (2008), reorganization depends on both error-
based, or supervised, and unsupervised learning rules. In a paper based on this work, Reinkensmeyer et
al. (2012) proposed that reorganization of motor cortical activity to re-learn how to flex a simulated wrist
depends on reinforcement learning.
Future models of reorganization of the motor system post-stroke should aim at better
understanding the combined roles of different plastic processes in stroke reorganization. Importantly,
they should study how the role of homeoplasticity proposed here interacts with feedback-driven plastic
rules in the motor system (Takiyama and Okada 2012). In addition, future models could incorporate
interhemispheric effects of lesions (Levitan and Reggia 1999; Reggia 2004) and better link to anatomy.
For instance, they could draw on previous work using connectome data to model brain regions as
graphical network nodes and evaluate the effects of node deletion on network dynamics (Alstott et al.
2009; Honey and Sporns 2008; Rubinov et al. 2009).
Finally, a crucial aspect of plasticity that is not included in the current model is that various
forms of plasticity, such as dendritic and axonal sprouting and LTP, are modulated as a function of time
after stroke (i.e. metaplasticity). Following stroke, specific features of brain function revert to those seen
at an early stage of development, with the subsequent process of “recovery recapitulating ontogeny”
(Cramer and Chopp 2000). In particular, genetic changes in the perilesional area allow for a window of
increased plasticity that makes it easier for perilesional neurons to modify existing connections and form
new ones in response to motor training (see Murphy and Corbett 2009 for review). Sensorimotor
training during this window can drive plasticity more effectively, which benefits recovery if the training
involves appropriate rehabilitative movements. On the assumption that rehabilitative training is given
before this increased plasticity subsides, our model predicts that a limited dose of training produces
better recovery if given after a delay. This prediction, combined with better characterization of the post-
stroke plasticity window, has important clinical implications for optimally timing the limited dose of
physical therapy received by most patients.
3.6 Chapter Appendix
To better characterize the network inputs and understand the pattern of input weights seen
after training, the correlations between spindle activities were calculated. Spearman’s rank correlation
coefficient was used because this measure is nonparametric and does not require the relationship
between variables to be linear, only monotonic. The resulting correlation matrix is shown in Figure A1.
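To make the choice of measure concrete, Spearman's coefficient can be computed as the Pearson correlation of the ranked data. The sketch below uses only the standard library and invented toy signals; the function names and the "length" and "firing" labels are illustrative and do not correspond to the thesis simulation variables.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy signals: a nonlinear but monotonic relationship, and an opposed one.
length = [1.0, 2.0, 3.0, 4.0, 5.0]
firing = [v ** 3 for v in length]   # monotonic increasing, not linear
opposed = [-v for v in length]      # monotonic decreasing

rho_same = spearman(length, firing)      # approximately +1
rho_opposite = spearman(length, opposed)  # approximately -1
```

The cubic relationship still yields a coefficient of approximately +1, which is why a rank-based measure suits spindle signals whose relationships may be monotonic without being linear.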
Figure A1. Correlations in spindle activities during training/rehabilitation dose. Correlations and anti-
correlations seen between spindles gave rise to patterns of overlapping or non-overlapping areas,
respectively, of high input weight from certain subsets of spindles seen at end of training.
Chapter 4: Use-Dependent Learning in Reaching Movements is Robust,
Feedback-Insensitive, and Occurs in Hand Space Coordinates
4.1 Abstract
Use-dependent learning is driven by repetition of a well-practiced movement, is assumed to be
an unsupervised motor learning mechanism, and is manifested as bias towards the repeated movement
when executing similar movements. However, the time course, sensitivity to feedback, robustness to
forgetting, and coordinate system within which this bias occurs all require better understanding. Here,
we conducted three experiments to improve understanding of use-dependent learning in voluntary
reaching movements. In the first experiment, we showed that performance feedback frequency does
not modulate the time course or magnitude of use-dependent learning, supporting the hypothesis that
it is an unsupervised learning mechanism. This experiment additionally showed that use-dependent
learning has a gradual time course and is robust to forgetting across 24 hours or during washout. In the
second experiment, we replicated work from Verstynen and Sabes (2011) in which repetitive reaching
occurred under conditions similar to our first experiment but induced a larger bias after less training. We
demonstrated that the Bayesian model of use-dependent learning proposed by these authors was
sufficient to explain this discrepancy. The model elucidated how the time course of use-dependent
learning was modulated by periodically executing movements other than the repeated one. Finally, in
the third experiment, we examined the pattern of use-dependent learning transfer across workspaces.
Our results showed that use-dependent learning occurs in a hand space coordinate system. Together,
these results support the idea of use-dependent learning as a mechanism independent of error- or
reward-based learning that can have lasting effects on motor control.
4.2 Introduction
Motor learning can be categorized into supervised learning, reinforcement learning, and
unsupervised learning (Doya, 2000), often simply paraphrased as learning from errors, learning from
rewards, and learning regular patterns in input data without feedback, respectively. Most previous arm
movement learning studies have focused on supervised learning in motor adaptation paradigms (e.g.,
Shadmehr and Mussa-Ivaldi, 1994; Krakauer et al., 2000; Smith et al., 2006). Reinforcement learning has
also been implicated in adaptation, either via the direct update of motor commands from rewards
(Izawa and Shadmehr, 2011; Wu et al., 2014), or via the reinforcement of the rewarded movement
direction following initial adaptation (Huang et al., 2011; Shmuelof et al., 2012).
What role unsupervised learning plays in learning arm movements is less well understood.
Previous studies have shown that repetitions of a well-practiced movement induce a bias towards that
movement when executing similar movements (Classen et al., 1998; Jax and Rosenbaum, 2007; Han et
al., 2008; Diedrichsen et al., 2010; Verstynen and Sabes, 2011; Hammerbeck et al., 2014). Such use-
dependent learning is hypothesized to occur independently of feedback, and thus, by definition, to be
an unsupervised learning process (Verstynen and Sabes, 2011). However, in all previous voluntary arm
movement studies, movements were followed by a combination of error and reward feedback, which
can both act to update well-practiced movements such as unperturbed reaching movements.
Specifically, error feedback can minimize endpoint variability (van Beers, 2009), and reward feedback
can increase the magnitude of use-dependent learning resulting from repeating a reaching movement
associated with successful adaptation to a perturbation (Huang et al., 2011). Our first aim is therefore
to conduct an experiment in which we minimize feedback following repetitive reaching movements in
order to test whether use-dependent learning can occur independently of performance feedback.
In addition to its independence from feedback, the dynamics of use-dependent learning have
not been thoroughly investigated. The decay of directional bias induced by repeating reaches in a single
direction is slow compared to the washout of error-based adaptation (Diedrichsen et al., 2010).
However, it is untested whether use-dependent learning is a labile process that soon decays following a
training session or is a true motor learning mechanism that can alter motor commands long after initial
induction. If use-dependent learning is an unsupervised learning process, performance changes
associated with it are presumably rooted in Hebbian-like plasticity mechanisms (Han et al., 2008;
Verstynen and Sabes, 2011). In turn, these mechanisms, such as long-term potentiation (LTP), would be
resistant to forgetting (Bliss and Lomo, 1973; Citri and Malenka, 2008). Supporting this argument, the
magnitude and resistance to forgetting of bias in thumb movements towards a direction of repeated
motion is increased by anodal transcranial direct current stimulation (tDCS; Galea and Celnik, 2009), a
process that stimulates physiological changes associated with LTP (Stagg et al., 2009; Stagg and Nitsche,
2011). Our second aim is therefore to study the time course of use-dependent learning by continuously
inserting probe trials during training, by measuring 24-hour retention, and by measuring resistance to
washout.
Finally, it is unknown whether use-dependent learning occurs in joint-based (intrinsic) or hand-
based (extrinsic) coordinates. The primary motor cortex has been shown to be involved in use-
dependent learning of thumb movements (Galea and Celnik, 2009). However, neurons in the primary
motor cortex display a mix of intrinsic and extrinsic coordinate frame representations (Kalaska,
2009). In addition, use-dependent learning during voluntary, planned reaching movements could occur
upstream of the primary motor cortex in premotor and parietal cortices, in which many cells code
movements in an extrinsic coordinate frame (Kakei et al., 2001; Buneo and Andersen, 2006). Our third
aim is therefore to test whether use-dependent learning in arm reaching occurs in an intrinsic or
extrinsic coordinate system by measuring the transfer of the bias in reaching movements induced by
use-dependent learning across arm postures.
4.3 Methods
4.3.1 Subjects
Forty-eight healthy, right-handed individuals (as assessed by the Edinburgh Handedness
Inventory) participated in the three experiments of this study (mean age 23 years, range 18-33, 31
females). Twenty (13 females) participated in the first experiment, the Bias Time Course Experiment.
Eight (5 females) participated in the second experiment, the Verstynen Replication Experiment. Twenty
(13 females) participated in the third experiment, the Coordinate System Experiment. All subjects signed
an informed consent, and procedures were approved by the University of Southern California
Institutional Review Board.
4.3.2 Common Apparatus and Task Description for all Three Experiments
A horizontal Wacom tablet was placed in front of the seated subjects, who held a stylus to
perform reaching movements on the tablet. Subjects’ hands were hidden from view during the task.
Visual information and feedback were displayed via a 17-inch monitor, the location of which varied
between experiments (see below). The stylus position, recorded at 60 Hz, was shown as a 3 mm
diameter red cursor. Stimuli were presented using MATLAB 8.2.0 and the Cogent Graphics library
(developed by John Romaya at the LON at the Wellcome Department of Imaging Neuroscience).
To begin each trial, subjects were instructed to move the cursor into an 8 mm diameter white
start circle. After a variable delay, a 6 mm diameter white target appeared 11.1 cm from the start circle.
A “go” tone was then played after a 500 ms delay. In response to this signal, the subjects were
instructed to move the cursor toward the target within a predefined reaction time threshold. Violation
of this threshold resulted in a large red square appearing on the screen, a low tone playing, and the trial
re-starting. After leaving the start circle, the cursor disappeared. The trial ended when cursor speed
dropped below 25 mm/s. The maximal speed during the movement was constrained to be within an
allowed range. Violation of this condition resulted in a text warning appearing (“Speed Too Low” or
“Speed Too High”) and a low tone playing. Trials that met reaction and movement speed constraints
were assigned a score from 0-100 based on reach endpoint accuracy. The score decayed with distance
as a Gaussian (SD 9.4 mm), becoming zero when reaches terminated >30.5 mm from the target. The
values of the reaction time threshold, the movement speed constraints (see below) and whether the
final cursor positions and scores were displayed after each trial depended on the experimental
condition.
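As an illustration of the scoring rule described above, the following Python sketch computes a score from the distance between reach endpoint and target. The text does not give the exact functional form, so the peak value of 100 at zero error, the unnormalized Gaussian profile, and the hard cutoff beyond 30.5 mm are assumptions; the function name `endpoint_score` is hypothetical and is not taken from the experimental code.

```python
import math


def endpoint_score(distance_mm, sd_mm=9.4, cutoff_mm=30.5, max_score=100.0):
    """Score a reach from its endpoint distance to the target (in mm).

    Assumed form: max_score * exp(-d^2 / (2 * sd^2)), forced to zero
    when the reach ends more than cutoff_mm from the target.
    """
    if distance_mm < 0:
        raise ValueError("distance must be non-negative")
    if distance_mm > cutoff_mm:
        return 0.0
    return max_score * math.exp(-distance_mm ** 2 / (2.0 * sd_mm ** 2))
```

Under these assumptions, a reach ending exactly on the target scores 100, and the score falls smoothly with distance until the cutoff is reached.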
Subjects were instructed to move the cursor to targets quickly and accurately. They were told to
make straight, single-segment, uncorrected reach movements, and to use only the elbow and shoulder
to move the pen while minimizing finger and wrist movements.
4.3.3 Bias Time Course Experiment
In this experiment, we explored both the sensitivity of use-dependent learning to feedback and
the time course of use-dependent learning over two days. Two groups of subjects participated in this
experiment (Figure 1A-D). The full feedback (Full FB) group (n = 10, 7 females) received both reach
endpoint and score feedback on training trials and no feedback on probe trials. The sparse feedback
(Sparse FB) group (n = 10, 6 females) received no performance feedback on most training trials and on
all probe trials during Baseline, Training, and Generalization blocks. During training, endpoint feedback
(but no score) was given at random intervals of 4-6 training trials. The frequency of this feedback was
adjusted in pilot studies to the minimal level that prevented large drifts in movement direction away
from a target displayed repeatedly at the same location. This drift would have prevented the repetition
of movements in the same direction, which is necessary to induce strong use-dependent bias (Verstynen
and Sabes, 2011).
A mirror placed above the tablet hid the subject’s hand from view. A horizontal monitor
suspended above the mirror displayed all task information and feedback, which was reflected from the
mirror to the subject’s eye. Subjects rested their chin in a support above the front edge of the mirror.
Before each trial, the cursor appeared once it was <12 mm from the start circle. The variable delay from
trial start to target appearance was between 100-300 ms; the reaction time threshold was 450 ms. The
allowed maximum speed range was 750-1600 mm/s.
On Day 1, subjects executed a 50-trial Familiarization block followed by a 50-trial Baseline block,
in which training (i.e. non-probe) targets were randomly drawn from the range (0°, 360°], with 0° lying
along the rightward x-axis. These blocks were followed by six 103-trial Training blocks to a target at
135°. In both Baseline and Training blocks, probe trials were interspersed among training trials at
random intervals of 4-6 trials to measure use-dependent learning. Four consecutive probes were also
given at the beginning of each Training block. Probe targets appeared at ±90° from 135°, with the order
of CW and CCW probe targets counterbalanced among subjects.
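The interleaving of probe and training trials described above can be sketched as follows. This is a hypothetical reconstruction, not the schedule generator used in the experiment: it assumes that the interval of 4-6 counts the training trials between successive probes, that the block length includes the probe trials, and that the block is simply truncated at the requested length. The name `make_block` is illustrative only.

```python
import random


def make_block(n_trials, interval=(4, 6), n_initial_probes=0, rng=None):
    """Return a list of 'train'/'probe' labels of length n_trials.

    Assumption: each probe is preceded by a run of training trials whose
    length is drawn uniformly from interval (inclusive bounds).
    """
    rng = rng or random.Random()
    labels = ["probe"] * n_initial_probes
    while len(labels) < n_trials:
        gap = rng.randint(interval[0], interval[1])
        labels.extend(["train"] * gap)
        labels.append("probe")
    # Truncate so the block has exactly n_trials entries.
    return labels[:n_trials]
```

For example, `make_block(103, n_initial_probes=4)` would produce a Training-block layout with four consecutive probes at the start, under the assumptions stated above.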
Subjects returned 24-26 hours after the start of the Day 1 session for Day 2 and completed three
more Training blocks followed by Generalization and Washout blocks. The first Training block started
with ten consecutive probe trials to assess 24-hour use-dependent learning retention. The
Generalization block consisted of 116 trials, with probe targets appearing at ±15°, ±30°, ±60°, ±90°,
±135°, or 180° from 135°. The order of probe target angles was pseudorandom, with CW and CCW
probes counterbalanced among subjects. Finally, the Washout block contained probes at ±90° from
135°, with training targets uniformly drawn from the range (0°, 360°]. Feedback (endpoint error and
score) was provided following each washout trial for both groups.
4.3.4 Verstynen Replication Experiment
This experiment was a shortened version of Experiment 2 from Verstynen and Sabes (2011), and
was intended to verify that our experimental procedure resulted in a similar use-dependent bias
learning effect under similar task conditions. The monitor, mirror, tablet, and chin support were placed
as described for the Bias Time Course Experiment. Before each trial, a white circle denoting cursor
distance from the start circle appeared when the cursor was < 55.5 mm from the start circle; the cursor
appeared when it was <4 mm from the start circle. The variable delay from trial start to target
appearance was between 500-1500 ms, as in Verstynen and Sabes, while the allowed maximum speed
range was as in the Bias Time Course Experiment. The reaction time threshold was 600 ms. The threshold
was shorter (450 ms) in Experiments 1 and 3 to keep overall experiment lengths shorter while still allowing sufficient
time to react, based on pilot data. Note that in Verstynen and Sabes there was no reaction time
threshold and a short reaction time was instead enforced by increasing the score for decreasing reaction
time. We decided to include a threshold and not include reaction time in the score so that subjects
could not achieve high scores by making inaccurate reaches with short reaction times.
Subjects completed four blocks of 90 trials in a single experimental session (Figure 1E). Each
block consisted of 76 training targets and 14 probe targets. Training targets were drawn from a different
distribution context in each block. These distributions included a repeated context in which all targets
appeared at 150°, 7.5° and 15° contexts in which targets were drawn from the distribution
$N(150^\circ, \sigma_{\mathrm{target}})$ with $\sigma_{\mathrm{target}} = 7.5^\circ$ or $15^\circ$, respectively, and a uniform context in which targets were
uniformly drawn from the range (0°, 360°]. Probe trials appeared twice at each of 0°, ±30°, ±60°, and
±90° from 150°. The first 10 trials of each block were training trials, and probe trials were randomly
placed among the remaining 80 trials. Feedback conditions were the same as for the Full FB group in the
Bias Time Course Experiment.
4.3.5 Coordinate System Experiment
The Coordinate System Experiment tested whether use-dependent learning transferred
between workspaces in hand or joint space. The design of this study was based on a previous study of
force field adaptation transfer (Malfait et al., 2002). In this experiment, because of the need to translate
the subject’s arm position horizontally along with the tablet, the mirror was not used, and instead the
monitor was placed vertically in front of the subject. To prevent vision of the arm, the subject wore
goggles that blocked peripheral vision. Their arm was suspended in a support hanging from the ceiling
that fit around the upper arm and abducted it to an angle of 80°. This effectively restricted shoulder and
elbow movements to two dimensions, allowing transformation of Cartesian hand location to joint
angles, while limiting fatigue. Each subject’s upper arm was abducted such that arm movements were
restricted to a horizontal plane with two degrees of freedom corresponding to the shoulder and elbow
angles. The shoulder angle was defined as 0° when the upper arm was in line with the shoulders; the
elbow angle was defined as 0° when the lower arm was perpendicular to the upper arm (Figure 1).
Reaching movements were performed in two postures, a Train posture (shoulder = 15°, elbow = 15°) and
a Test posture (shoulder = 75°, elbow = 0°).
A Use-Dependent Learning group (UDL; n = 10, 7 females) and a Control group (n = 10, 6
females) participated in this experiment. The UDL group received training trials to a repeated target at
90° to induce a use-dependent bias towards this target. The Control group received training trials to
targets randomly drawn from the range [-20°, 200°]. This group was included to allow correction for any
non-use-dependent learning biases that occurred in the transition between postures, for instance due to
range effects or other non-specific causes (Ghilardi et al., 1995). Feedback conditions for training and
probe trials were the same as for the Full FB group in the Bias Time Course Experiment.
Before each trial, the cursor appeared once it was <12 mm from the start circle. The variable
delay from trial start to target appearance, reaction time threshold, and allowed maximum speed range
were as in the Bias Time Course Experiment.
Subjects were first placed in the Test posture and completed a 50-trial Test Familiarization block
followed by a 15-second break and a 48-trial Test Baseline block (Figure 1F). Targets during the
Familiarization block were drawn randomly from the range [-20°, 200°] (targets beyond this range
appeared beyond the edges of the tablet for subjects with shorter arms). Targets during the Baseline
block appeared at 90° and ±58° from 90° (i.e. at 148° and 32°) in order to measure any initial biases in
reaching movements towards these targets. Each consecutive triplet of trials contained all three targets
in a pseudorandom order. Next, subjects were placed in the Train posture, in which they completed
Train Familiarization and Train Baseline blocks that were identical to the corresponding Test posture
blocks. They then executed five 100-trial Training blocks, each preceded by a 1-minute break. During
these blocks, probe trials were interspersed randomly every 9-11 trials. Each probe trial target was at
±58° from 90° (i.e. 148° and 32°), with the order of CW and CCW probes counterbalanced between
subjects. Finally, subjects were placed back in the Test posture for a Transfer block of 24 probe trials,
with 8 targets each at 90°, 148°, and 32°. Consecutive triplets of trials contained all three targets in a
pseudorandom order and no performance feedback was given.
The Transfer block in the UDL group tested whether the use-dependent bias formed during
Training transferred from the Train posture to the Test posture in hand space or in joint space. Use-
dependent bias towards the 90° target in the Test posture would provide evidence of a hand space
transfer because the positions of the 90°, 148°, and 32° targets are the same relative to the hand in both
Train and Test postures (see Figure 1G). Conversely, use-dependent bias from the 90° target towards the
CCW probe target at 148° would provide evidence of joint space transfer. This is because the relative
elbow and shoulder angle changes required to reach towards the 90° target in the Train posture are the
same as those needed to reach towards the 148° target in the Test posture (Figure 1G).
4.3.6 Common Movement Analysis for the Three Experiments
Individual trials were excluded if the reach was >45° away from the target (0.58% of trials were
excluded). Subjects were excluded if they violated the movement speed constraint on >25% of trials
during Training blocks (based only on Day 1 for the Bias Time Course Experiment). In the Bias Time
Course Experiment, two subjects were in violation of this constraint and did not return for Day 2. In the
Coordinate System Experiment, one subject was in violation. No data from these subjects was included
in the analyses.
Use-dependent learning was quantified by use-dependent bias on probe trials. As in Verstynen
and Sabes (2011), on trial $t$, use-dependent bias was defined as (Figure 1):
$$\gamma_t = \mathrm{sgn}(\theta_{\mathrm{repeat}} - \theta_{\mathrm{target},t}) \cdot (\theta_{\mathrm{reach},t} - \theta_{\mathrm{target},t}) \qquad (1)$$
where $\theta_{\mathrm{reach},t}$ was the subject’s actual reach direction (measured from the center of the start position
to the cursor location 100 ms into the reach). $\theta_{\mathrm{repeat}}$ was the angle of the repeated target in the
Training blocks (Bias Time Course Experiment, 135°; Coordinate System Experiment, 90°) or repeated
context block (Verstynen Replication Experiment, 150°), or was the direction of the CCW probe in the
Coordinate System Experiment Transfer block (148°) when evaluating joint space transfer. $\theta_{\mathrm{target},t}$ was
the target angle. For the analyses noted below, the bias from probe target distance $\theta_p$ for a given
subject was then averaged across multiple probe trials (Verstynen and Sabes, 2011):
$$\mathrm{Bias}(\theta_p) = \frac{1}{N_{CW} + N_{CCW}} \sum_{t:\,|\theta_{\mathrm{repeat}} - \theta_{\mathrm{target},t}| = \theta_p} \gamma_t \qquad (2)$$
where $N_{CW}$ and $N_{CCW}$ were the number of CW or CCW probe targets at distance $\theta_p$, respectively, and $\gamma_t$
was the bias on trial $t$. $N_{CW}$ and $N_{CCW}$ were counted over varying intervals of trials depending on the
specific analysis being performed (see below).
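The bias definitions in Eqns. 1 and 2 can be sketched in Python as below. This is an illustrative reading of the equations rather than the analysis code used in the study. The angle-wrapping helper and the tolerance used to match probe distances are assumptions added so that the example handles targets on either side of 0°/360°, and all function names are hypothetical. For a probe exactly 180° from the repeated direction the sign in Eqn. 1 is ambiguous, and this sketch does not resolve that case.

```python
def wrap_deg(angle):
    """Wrap an angle difference to the interval [-180, 180) degrees (assumed helper)."""
    return (angle + 180.0) % 360.0 - 180.0


def sign(x):
    return (x > 0) - (x < 0)


def trial_bias(theta_repeat, theta_target, theta_reach):
    """Eqn. 1: reach error signed so that positive means 'toward the repeated target'."""
    toward_repeat = sign(wrap_deg(theta_repeat - theta_target))
    return toward_repeat * wrap_deg(theta_reach - theta_target)


def mean_bias(probe_trials, theta_repeat, probe_distance, tol=1e-6):
    """Eqn. 2: average trial bias over CW and CCW probes at a given distance.

    probe_trials is a list of (theta_target, theta_reach) pairs in degrees.
    """
    values = [
        trial_bias(theta_repeat, target, reach)
        for target, reach in probe_trials
        if abs(abs(wrap_deg(theta_repeat - target)) - probe_distance) < tol
    ]
    if not values:
        raise ValueError("no probe trials at the requested distance")
    return sum(values) / len(values)
```

With a repeated target at 135°, a reach of 48° to the 45° probe and a reach of 222° to the 225° probe both deviate by 3° toward 135°, so each contributes a bias of +3° under this definition.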
4.3.7 Common Statistical Data Analysis for the Three Experiments
Linear mixed models (LMM), fit with the MATLAB ‘fitlme’ command, were used to compare bias
between groups, time points, and/or probe targets, with these variables acting as fixed factors in a given
experiment. In all cases, the within-subject mean biases were first calculated across each time point or
probe target of interest (e.g. if Day 1 Training Block 6 was a time point of interest, the mean bias across
all probe trials within this block was calculated for each subject separately). These within-subject means
were then used as samples for LMM fitting. A fixed intercept was included in the model, representing
the effect for one combination of levels across all fixed factors. All effects for other levels of the fixed
factors were defined relative to this intercept. Subject was evaluated as a possible random effect on all
fixed effect variables as appropriate. Fixed effects were tested for significance using the MATLAB ‘anova’
command after fitting the model using ‘effects’ dummy variable coding. Fixed factors with non-
significant fixed effects were then removed and the model re-fit. Final selection of random effects was
based on comparing models with the likelihood ratio test. Post-hoc testing of fixed factor levels for
difference from each other and from zero was carried out using the ‘coefTest’ command with the
Satterthwaite degrees of freedom approximation and Holm-Bonferroni multiple comparisons
corrections. Residuals of the final model were examined to ensure homoscedastic, normal distributions.
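For readers unfamiliar with the Holm-Bonferroni step-down procedure mentioned above, a minimal Python sketch follows. It shows the standard procedure applied to a list of p-values and is not the code used for the analyses reported here; the function name and the example p-values in the usage note are illustrative only.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    P-values are tested from smallest to largest. The k-th smallest (k = 0, 1, ...)
    is compared with alpha / (m - k), and testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are also retained
    return reject
```

For three hypothetical comparisons with p-values 0.01, 0.04, and 0.03, only the first is rejected at alpha = 0.05, because 0.03 exceeds the second-step threshold of 0.025.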
4.3.8 Statistical Data Analysis for the Bias Time Course Experiment
The bias for a given probe target was baseline-corrected for each subject by subtracting off the
bias present during the Day 1 Baseline block at the same target. CW and CCW probes were separately
baseline-corrected before being averaged to counteract any movement asymmetries. Average bias
across probe trials was computed at each of three time points (Training Block 6 on Day 1, the first 10 consecutive probe
trials on Day 2, and Training Block 3 on Day 2) to compare bias both between and within
groups across these time points. Comparisons were completed using an LMM analysis with Time Point,
Group, and their interactions as possible categorical fixed factors.
To evaluate the decay of use-dependent bias during Washout, within- and between-group
comparisons of the mean bias during the first and last four probe trials of the Washout block were
performed using a second LMM analysis with the same categorical factors as above. Additionally, trial-
by-trial Washout bias data from each subject were fitted with exponential decay functions containing
asymptotes:
$$\hat{\gamma}_t = a e^{-\alpha t} + b \qquad (3)$$
Asymptotes $b$ were compared between groups using an unpaired t-test, and were compared to 0 using
one-tailed one-sample t-tests with Holm-Bonferroni corrections. Log decay rates $\log(\alpha)$ were compared
using Wilcoxon’s rank sum test after failing the Lilliefors test for normality (the logarithm was taken in an attempt
to normalize the values, but this did not succeed).
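A fit of the exponential-plus-asymptote function in Eqn. 3 can be sketched as follows. The fitting routine used in the study is not specified in the text, so this example substitutes a simple approach chosen for transparency: a grid search over the decay rate, with the amplitude a and asymptote b solved by linear least squares at each candidate rate. The grid range, the use of the trial index as t, and the function name are all assumptions.

```python
import math


def fit_exponential_decay(bias, alphas=None):
    """Fit bias[t] ~ a * exp(-alpha * t) + b for t = 0, 1, 2, ...

    Returns (a, alpha, b) minimizing the sum of squared residuals over
    a grid of candidate decay rates (assumed grid: 1e-4 to 10, log-spaced).
    """
    n = len(bias)
    if alphas is None:
        alphas = [10.0 ** (k / 20.0) for k in range(-80, 21)]
    best = None
    sum_y = sum(bias)
    for alpha in alphas:
        e = [math.exp(-alpha * t) for t in range(n)]
        sum_e = sum(e)
        sum_ee = sum(v * v for v in e)
        sum_ey = sum(v * y for v, y in zip(e, bias))
        det = n * sum_ee - sum_e * sum_e
        if abs(det) < 1e-12:
            continue  # regressors are collinear for this alpha
        # Closed-form solution of the 2x2 normal equations for a and b.
        a = (n * sum_ey - sum_e * sum_y) / det
        b = (sum_ee * sum_y - sum_e * sum_ey) / det
        sse = sum((a * v + b - y) ** 2 for v, y in zip(e, bias))
        if best is None or sse < best[0]:
            best = (sse, a, alpha, b)
    if best is None:
        raise ValueError("no valid fit found")
    return best[1], best[2], best[3]
```

The grid search avoids sensitivity to starting values at the cost of resolution in alpha; a gradient-based nonlinear least squares routine would be a reasonable alternative.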
The Generalization block data were analyzed using an LMM approach to determine whether
bias at each probe distance was significantly greater than 0°, and if there was a consistent pattern of
differences between pairs of probe distances. Fixed factors were ordinal Probe Distance, categorical
Group (Full or Sparse FB), and their interactions. Because probes were only given at ±90° from 135° in
Baseline, baseline corrections for the other probe target angles in Generalization (±15°, ±30°, ±60°, ±135°, and
180° from 135°) were estimated from LOESS local regression fits (moving average span of five data
points) to each subject’s reach direction errors on all trials during Baseline.
4.3.9 Statistical Data Analysis for the Verstynen Replication Experiment
An adaptive Bayesian model of use-dependent learning (Verstynen and Sabes, 2011) was fitted
to data from the Full FB group in the Bias Time Course Experiment. The model was then used to predict
data from the Verstynen Replication Experiment and to verify whether the same learning process can
explain data from both experiments. The Full FB group data was used, because feedback for this group
and for the Verstynen Replication Experiment was equivalent. The Bayesian model calculated the
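posterior distribution of the target location on trial $t$ as
$$p(\theta_{\mathrm{target},t} \mid x_t) \propto p(x_t \mid \theta_{\mathrm{target},t}) \, p(\theta_{\mathrm{target},t}) \qquad (4)$$
where $\theta_{\mathrm{target},t}$ is the true target direction on trial $t$, $x_t$ is the noisy sensory estimate of target direction,
$p(x_t \mid \theta_{\mathrm{target},t}) \sim N(\theta_{\mathrm{target},t}, \sigma_{\mathrm{likelihood}}^2)$ is the likelihood distribution for the sensory estimate with
mean $\theta_{\mathrm{target},t}$ and variance $\sigma_{\mathrm{likelihood}}^2$, and $p(\theta_{\mathrm{target},t}) \sim N(\bar{\theta}_t, \sigma_{\mathrm{prior},t}^2)$ is the prior distribution for
target direction on trial $t$ with mean $\bar{\theta}_t$ and variance $\sigma_{\mathrm{prior},t}^2$. Reach direction on trial $t$ was predicted by
the maximum of the posterior distribution (MAP), given by the weighted sum
$$\theta_{\mathrm{MAP},t} = \frac{\sigma_{\mathrm{likelihood}}^2}{\sigma_{\mathrm{likelihood}}^2 + \sigma_{\mathrm{prior},t}^2} \, \bar{\theta}_t + \frac{\sigma_{\mathrm{prior},t}^2}{\sigma_{\mathrm{likelihood}}^2 + \sigma_{\mathrm{prior},t}^2} \, x_t \qquad (5)$$
After every trial, the most recent target direction was used to update the prior mean and variance as
$$\bar{\theta}_{t+1} = (1 - \beta) \, \bar{\theta}_t + \beta \, \theta_{\mathrm{target},t} \qquad (6)$$
$$\sigma_{\mathrm{prior},t+1}^2 = (1 - \beta) \, \sigma_{\mathrm{prior},t}^2 + \beta \, (\bar{\theta}_t - \theta_{\mathrm{target},t})^2 \qquad (7)$$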
The parameters $\sigma_{\mathrm{likelihood}}^2$ and learning rate $\beta$ were fit to each subject’s data by minimizing the sum of
squared error
$$\sum_t (\theta_{\mathrm{reach},t} - \theta_{\mathrm{MAP},t})^2 \qquad (8)$$
with constraints $\sigma_{\mathrm{likelihood}}^2 > 0$ and $0 < \beta < 1$. The initial value of $\sigma_{\mathrm{prior},t}^2$ was set to 110°, while the
initial value of $\bar{\theta}_t$ was set to the repeated direction.
After fitting to data from the Full FB group in the Bias Time Course Experiment, the model was
used to simulate the Verstynen Replication Experiment by presenting it with the sequence of targets
experienced by the subjects and using Eqns. 4-6 to predict reach direction and update the prior
distribution.
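The simulation procedure can be illustrated with a short Python sketch of the trial-by-trial prediction (Eqn. 5) and prior updates (Eqns. 6 and 7). It is a simplified illustration, not the fitted model from this study. It assumes that the sensory estimate equals the true target direction (no sampled sensory noise) and that angles can be treated as ordinary numbers without wrapping. The function name and the parameter values in the usage note are hypothetical.

```python
def simulate_map_reaches(targets, sigma2_likelihood, beta,
                         prior_mean0, prior_var0=110.0):
    """Predict reach directions for a sequence of target directions (degrees).

    Each prediction is the weighted sum of the prior mean and the sensory
    estimate (Eqn. 5); the prior mean and variance are then updated from the
    most recent target (Eqns. 6 and 7).
    """
    prior_mean, prior_var = prior_mean0, prior_var0
    predictions = []
    for target in targets:
        x = target  # assumption: noise-free sensory estimate of the target
        w_prior = sigma2_likelihood / (sigma2_likelihood + prior_var)
        predictions.append(w_prior * prior_mean + (1.0 - w_prior) * x)
        # Both updates use the prior mean from before this trial.
        new_mean = (1.0 - beta) * prior_mean + beta * target
        new_var = (1.0 - beta) * prior_var + beta * (prior_mean - target) ** 2
        prior_mean, prior_var = new_mean, new_var
    return predictions
```

Running the sketch on a sequence of repeated 135° targets with occasional 45° probes (for example with `sigma2_likelihood=25.0` and `beta=0.05`, values chosen only for illustration) shows the qualitative behavior of the model: each probe inflates the prior variance, which reduces the weight on the prior and hence the bias on later probes, compared with a sequence in which the same probe follows uninterrupted repetition.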
4.3.10 Statistical Data Analysis for the Coordinate System Experiment
Biases at a given target in the Test or Train posture were baseline corrected using the bias at
that target during the second half of the Test or Train Baseline block, respectively. CW and CCW probes
were analyzed separately because the CCW probe (148° target) was of specific interest. Reaches to this
target when the arm was in the Test posture required the same direction of joint space trajectory (i.e.
change in shoulder and elbow angles) as reaches to the repeated 90° target in the Train posture (Figure
1G). Thus, in joint space, it was the Test posture analogue of the Train posture repeated direction.
We first verified the existence of a use-dependent bias in the UDL group and its absence in the
Control group using an LMM analysis with Group, Probe Direction (CW or CCW), and their interactions
as possible categorical fixed factors. We then determined the coordinate system in which use-
dependent bias developed by regressing the mean bias in the Transfer block against the mean bias in
Training Block 5 (final training block) in both groups. Positive bias in Training Block 5 (the predictor) was
defined as being toward the 90° target during reaches to the 148° (CCW) probe target. To test if transfer
occurred in hand space, positive bias in the Transfer block (the outcome variable) was also defined as
being towards the 90° target during reaches to the 148° target. To test if transfer occurred in joint space,
positive bias in the Transfer block was instead defined as occurring toward the 148° target during
reaches to the 90° target. As detailed above, the 148° target in the Test posture was in the same joint
space direction as the 90° target in the Train posture (Figure 1G); thus, a bias induced in joint space
towards the 90° target in the Train posture would appear as a bias towards the 148° target in the Test
posture. If transfer of bias from the Train to the Test posture occurred in either coordinate system, the
respective Transfer block bias (outcome variable) would be positively correlated with the Block 5 bias
(predictor variable), resulting in a positive slope between 0 and 1. Regression slopes for the Control
group were expected to be zero in both cases because there was no use-dependent learning to be
transferred in this group.
The noisy predictor violated the ordinary least squares (OLS) regression assumption of fixed
predictors measured with no error. Thus, an additional bootstrap analysis was performed to generate
regression slope distributions that could robustly be compared to zero in spite of the predictor noise
(Fox, 2016). In each bootstrap sample, the probe trial biases were resampled with replacement for each
subject before calculating within-subject mean biases. Next, the subjects themselves were resampled
with replacement, since both the subjects and the within-subject mean biases were random variables.
Finally, the resampled data was used for linear regression as described above. This process was iterated
10,000 times to yield a distribution of regression slopes for each regression analysis detailed above,
which were compared to zero using 95% confidence intervals (Fox, 2016).
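The two-level resampling scheme described above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the code used in the study: the confidence interval is taken as a simple percentile interval (the text specifies 95% intervals but not their type), resamples in which all selected subjects share the same predictor value are skipped, and the function names and data layout (one list of probe-trial biases per subject) are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the ordinary least squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slopes(train_trials, transfer_trials, n_boot=10000, seed=0):
    """Bootstrap distribution of the Transfer-on-Training regression slope.

    train_trials[s] and transfer_trials[s] hold the probe-trial biases of
    subject s in the final Training block and the Transfer block.
    """
    rng = random.Random(seed)
    n_subj = len(train_trials)
    slopes = []
    for _ in range(n_boot):
        # Level 1: resample probe trials within each subject, then average.
        x_means = [sum(rng.choices(t, k=len(t))) / len(t) for t in train_trials]
        y_means = [sum(rng.choices(t, k=len(t))) / len(t) for t in transfer_trials]
        # Level 2: resample the subjects themselves.
        idx = rng.choices(range(n_subj), k=n_subj)
        x = [x_means[i] for i in idx]
        y = [y_means[i] for i in idx]
        if len(set(x)) < 2:
            continue  # slope undefined when the predictor has no spread
        slopes.append(ols_slope(x, y))
    return slopes


def percentile_interval(values, level=0.95):
    """Percentile confidence interval (assumed interval type)."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(tail * (len(ordered) - 1))]
    hi = ordered[int(round((1.0 - tail) * (len(ordered) - 1)))]
    return lo, hi
```

A slope would be judged different from zero under this scheme when the interval returned by `percentile_interval` excludes zero.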
Figure 1. a) Schedule for the Bias Time Course Experiment over Day 1 and Day 2, with a 24-hour break.
b) Description of target distributions in different blocks. Examples of possible training target locations
are shown in black; probe trials at ±90° from the repeated direction are shown in dark gray.
Generalization was similar to Training but with probe targets in additional directions. The illustration of
the Training block also shows how bias was measured (gray dashes). c) Description of trial schedule and
feedback in the Sparse FB group. Probe trials and “Training” trials yielded no feedback to the subject.
“Feedback” training trials showed the reach endpoint. d) Description of trial schedule and feedback in
the Full FB group. Feedback in “Training” trials was composed of reach endpoint and an accuracy-based
score. Probe trials yielded no feedback. e) Description of target distributions in each block of the
Verstynen Replication Experiment. Examples of possible training target locations are shown in black;
probe trials are shown in darker shades of gray as distance from the repeated direction increases. f)
Schedule for the Coordinate System Experiment and diagram of shoulder and elbow angles during
different sets of blocks. Probe targets are labelled and shown in dark gray; examples of possible training
target locations are shown in black. Note that during Baseline and Transfer blocks, all targets (including
training targets) only appeared in the probe directions. g) Hand space and joint space paths to 90° and
148° targets in the Train and Test postures. Subplots in different columns correspond to different
coordinate systems; subplots in different rows correspond to different postures. Note that in joint
space, the path to 90° target in Train posture is in the same direction as the path to 148° target in Test
posture. If use-dependent learning occurs in hand space, bias will be towards 90° target in Test posture;
if learning occurs in joint space, bias will be towards 148° target in Test posture.
4.4 Results
4.4.1 Bias Time Course Experiment: Feedback-Independence, Time Course, and Retention of
Use-Dependent Learning
During Day 1 of the Bias Time Course Experiment, the bias appeared to gradually increase
(Figure 2A) and then plateau in Training Blocks 4-6 (Block 6: Sparse FB 2.6° ± 1.0°, Full FB 4.3° ± 1.4°;
Figure 2B; unless otherwise stated, all reported values are mean ± SEM of within-subject means during
specified interval). For both groups the bias remained positive at the start of Day 2 (10 probe trials at
start of Day 2: Sparse FB 1.9° ± 0.8°, Full FB 1.4° ± 0.6°) and increased again until Training Block 3 on Day
2 (Day 2 Block 3: Sparse FB 4.2° ± 1.3°, Full FB 4.5° ± 1.0°; Figure 2B).
This pattern was corroborated by an LMM analysis that included Time Point, Group, and their
interaction as candidate fixed factors. Only Time Point (Day 1 Block 6, Day 2 Start, and Day 2 Block 3)
was significant (F = 10.5, p = 0.0001 for Time Point; F = 22.4, p < 0.0001 for intercept, representing bias
on Day 1 Block 6 for Full FB group). Group (F = 0.2, p = 0.71) and the interaction of Group with Time Point
(F = 1.7, p = 0.20) were not significant, suggesting that feedback is not necessary in driving use-
dependent bias during voluntary movements.
The bias at all three time points was positive (post-hoc F-tests, Day 1 Block 6: F = 21.0, p <
0.0001; Day 2 Start: F = 5.1, p = 0.03; Day 2 Block 3: F = 34.7, p < 0.0001), although the bias at the start
of Day 2 was smaller than at the other two time points (Day 2 Start vs. Day 1 Block 6: F = 8.5, p = 0.006;
Day 2 Start vs Day 2 Block 3: F = 19.7, p < 0.0001; Day 1 Block 6 vs. Day 2 Block 3: F = 2.3, p = 0.14). These
results confirmed that use-dependent bias developed by Day 1 Block 6, decayed to some extent by the
start of Day 2, and then recovered by Day 2 Block 3 (Figure 2B). Despite the decay between Day 1 Block
6 and Day 2 Start, however, a positive bias remained at Day 2 Start, demonstrating 24-hour retention of
use-dependent learning.
The noticeably variable mean bias seen in the Full FB group data during Day 1 Block 6 (Figure 2A)
was more closely investigated post-hoc and found to be largely driven by two subjects. Additionally,
although there was no evidence that the within-subject mean biases during this block were non-
normally distributed (Lilliefors test, test statistic = 0.17, p > 0.5), we also tested for any between-group
differences in bias using nonparametric methods. For this analysis, the within-subject median biases
during the block were compared and no evidence of between-group difference was found (Wilcoxon
rank sum test, rank sum = 121, p = 0.25).
Figure 2. a) Use-dependent bias over the course of the Bias Time Course Experiment. Bias increases
gradually during Day 1, is retained across 24 hours, increases slightly during Day 2 training, and slowly
decays during extensive Washout. b) Mean bias in Block 6 Day 1, Day 2 Start, and Block 3 Day 2 (mean ±
SE of within-subject means across probes is shown) for both groups. Asterisks (*) show significant
differences between blocks (black) or for each block in comparison to 0° (gray). Note that the effect of
group was not significant (see text for details of hypothesis testing), indicating no effect of feedback in
modulating use-dependent learning.
4.4.2 Washout of Use-Dependent Learning
Figure 3A shows the bias at the start and end of Washout in both groups. An LMM analysis
revealed no effect of Group (F = 0.4, p = 0.51) or interaction effect of Group and Time Point (F = 0.01, p =
0.92), but a significant bias at the start of Washout in the Full FB group (F = 30.5, p < 0.0001) and a trend
toward significance of Time Point (F = 3.5, p = 0.07). Post-hoc analyses confirmed that the bias remained
greater than 0° at both the start (F = 28.4, p < 0.0001) and end (F = 7.0, p = 0.01) of Washout, with a
trend towards decay from start to end (F = 3.6, p = 0.07).
Figure 3. Washout and Generalization data from the Bias Time Course Experiment. a) For both groups,
bias at start and end of Washout block. Both groups still had positive bias at the end of Washout. b)
Results from Generalization block (mean ± SE of within-subject means at each probe distance is shown).
Asterisks (*) show significant differences from 0° at each time point (a) or probe distance (b). Note that
the effect of group was not significant based on LMM analysis for either Washout or Generalization (see
text for details of hypothesis testing).
The decay rates of the exponential models fit to the washout data did not differ between groups
(Wilcoxon rank sum test after taking logarithm of decay rates, rank sum = 88, p = 0.22). In addition, both
groups showed asymptotic bias values greater than 0 (one tailed t-test with Holm-Bonferroni
correction, Sparse FB t = 2.6, p = 0.02; Full FB t = 2.0, p = 0.04) that were not different between groups
(two-tailed t-test, t = 0.1, p = 0.91). (Note that in the comparison of asymptotes, we excluded one outlier
in the Sparse FB group because the estimated asymptote value was -414). Overall, these results show
that the bias was robust to extensive washout, with incomplete decay and significant remaining effects
even after 203 washout trials. These results also show that feedback had no effect on the washout of
use-dependent bias.
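To illustrate what a decay rate and an asymptote mean in this kind of analysis, the following minimal sketch fits bias(t) = a·exp(−k·t) + c to synthetic washout data. The functional form, the grid search over k, the function name `fit_exponential`, and all parameter values are assumptions for illustration only; the fitting procedure actually used is the one described in Methods.

```python
import math
import random

def fit_exponential(trials, bias, rates):
    """Fit bias = a * exp(-k * trial) + c by grid search over k.

    For each candidate decay rate k, amplitude a and asymptote c are found
    by ordinary least squares; the k with the smallest squared error wins.
    """
    best = None
    n = len(trials)
    for k in rates:
        x = [math.exp(-k * t) for t in trials]
        mx = sum(x) / n
        my = sum(bias) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, bias))
        a = sxy / sxx
        c = my - a * mx
        sse = sum((a * xi + c - yi) ** 2 for xi, yi in zip(x, bias))
        if best is None or sse < best[0]:
            best = (sse, a, k, c)
    _, a, k, c = best
    return a, k, c

# Synthetic washout block of 203 trials with hypothetical parameters.
rng = random.Random(0)
trials = list(range(203))
true_a, true_k, true_c = 3.0, 0.02, 1.5
bias = [true_a * math.exp(-true_k * t) + true_c + rng.gauss(0, 0.3) for t in trials]

rates = [0.001 * i for i in range(1, 201)]  # candidate decay rates (per trial)
a_hat, k_hat, c_hat = fit_exponential(trials, bias, rates)
print(f"amplitude = {a_hat:.2f}, decay rate = {k_hat:.3f}, asymptote = {c_hat:.2f}")
```

When the decay is very slow relative to the block length, the curve is nearly linear and the asymptote is poorly constrained. This is one way an extreme estimated asymptote can arise for an individual subject.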
4.4.3 Generalization of Use-Dependent Learning
The results of the Generalization block LMM analysis were inconclusive as to the pattern of bias
shown at different probe target distances from the repeated direction (Figure 3B). ANOVA testing of
fixed effects showed significant intercept (F = 52.3, p < 0.0001; representing bias at the 15° probe in the
Full FB group) and Probe Distance effects (F = 6.1, p < 0.0001). Group effects (F = 1.7, p = 0.19) and
Group interactions with Probe Distance (F = 0.6, p = 0.68) were not significant. Post-hoc comparisons
using a model without Group as a factor showed significant non-zero biases at 30° (F = 41.6, p < 0.0001),
90° (F = 28.8, p < 0.0001), and 135° (F = 7.6, p = 0.01), while biases at 15° (F = 0.2, p = 0.64), 60° (F = 5.3,
p = 0.02), and 180° (F = 0.19, p = 0.67) were not significant.
Generally, these comparisons show that the bias started small at the probe distance of 15°,
increased at the 30° probe, and stayed elevated at the 90° probe until shrinking slightly at the 135°
probe and becoming small at the 180° probe. However, this otherwise consistent pattern of increase
and decrease was broken by the unexplained non-significant bias at the 60° probe.
4.4.4 Bias in Verstynen Replication Experiment
Figure 4A shows the bias at different probe distances in each block in the Verstynen Replication
Experiment, with different distributions of training targets occurring in each block. The pattern of
increasing bias with increasing probe distance and with narrower training target distributions
qualitatively agrees with data from experiment 2 in Verstynen and Sabes 2011 (see their Figure 2B).
Note, however, that the bias at the 90° probe in the repeated context block (14.3° ± 6.4°) was much
larger than the bias observed at the 90° probes in the Bias Time Course Experiment. This was in spite of
the fact that subjects reached repeatedly to a single target for many more trials during the Bias Time
Course Experiment Day 1 than during the Verstynen Replication Experiment (618 during Bias Time
Course Experiment Day 1; 90 during Verstynen Replication Experiment).
Figure 4. a) Use-dependent bias at each probe distance during each block in the Verstynen Replication
Experiment. Note how the bias at the 90° probe during the repeated context block in the Verstynen
Replication Experiment is much larger than the biases recorded in the Bias Time Course Experiment. b)
Verstynen and Sabes’s adaptive Bayesian model simulation of the Bias Time Course Experiment. The left
plot shows the data from the Full FB group on Day 1, overlaid with the Bayesian simulation after fitting
the model parameters to these data. The right plot shows the simulated Verstynen Replication
Experiment data: using the same model parameter set as on the left, the model can account for the
different magnitudes of bias seen in each experiment. c) Prior variance in the Bayesian model during
one simulated block of the Bias Time Course Experiment Day 1. d) Prior variance in the Bayesian model
during the simulated repeated context block of the Verstynen Replication Experiment. In comparison to the
Bias Time Course Experiment, the less-frequent and less-distant probes allowed the variance of the prior
estimate of target location to become smaller, thus allowing larger biases to develop (see Eqn. 5 and
Results for details).
4.4.5 Adaptive Bayesian Model Fit to Bias Time Course Experiment Data and Prediction of
Verstynen Replication Experiment Data
To test whether the results from both experiments could be explained by a single model of use-
dependent learning, we fitted the adaptive Bayesian model of Verstynen and Sabes (2011) to the
data of the Bias Time Course Experiment and then predicted the behavior in the Verstynen Replication
Experiment. Specifically, the model was first fitted separately to the Bias Time Course Experiment Day 1
data of each subject in the Full FB group. As can be seen in Figure 4B, the model provided a good
approximation of the gradually increasing bias during Day 1. Then, the model parameters fitted to each
of the ten subjects in the Bias Time Course Experiment Day 1 were used to simulate the Verstynen
Replication Experiment (Figure 4B, right panel). Here, again, the model accounted well for the data, and
notably for the larger bias at the 90° probe distance in the repeated context block of the Verstynen
Replication Experiment.
The reason for the discrepancy in bias amplitude between the two experiments was revealed by
examining the effects of probe targets on the model’s prior estimates of target distribution (Figure 4C
and D). When training in a repeated direction, the mean prior estimate converged to the repeated
direction and the variance shrank towards zero. Smaller prior variance weighted target location
estimation towards the mean of the prior estimate, hence biasing the reach towards the repeated
direction (see Eqn. 5 in Methods). However, probe targets, which appeared away from the repeated
direction and hence away from the mean of the prior estimate, had the effect of increasing the variance
of the prior estimate (see Eqn. 7 in Methods). Thus, presentation of probe targets decreased the weight
of the prior during target location estimation on the next trial, hence reducing bias. In the Bias Time
Course Experiment, probes appeared 90° away from the repeated direction every 4-6 trials, preventing
the prior variance from shrinking to a value that caused a bias of more than a few degrees (Figure 4C).
However, in the repeated context block of the Verstynen Replication Experiment, there were longer
stretches of trials without a probe, and probes were sometimes closer to the repeated direction than
90°. In this scenario, the prior variance sometimes shrank (Eqn. 7) to values smaller than in the Bias Time
Course Experiment and caused larger biases at probes 90° from the repeated direction (Eqn. 5; Figure
4D).
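The mechanism described above can be sketched with a toy simulation. This is not the fitted model: it assumes a standard Gaussian combination of prior and visual estimate and an exponentially weighted running update of the prior mean and variance, which may differ in detail from Eqns. 5 and 7 in Methods. The function names, the learning rate `beta`, the sensory variance, and the simplified probe schedules (a 90° probe every 5 trials versus every 15 trials) are all hypothetical.

```python
def simulate(targets, beta=0.1, sensory_var=25.0, prior_mean=0.0, prior_var=2000.0):
    """Return the per-trial shift of the target estimate toward the prior mean.

    targets: target directions in degrees relative to the repeated direction (0).
    beta: assumed learning rate of the running prior estimate.
    sensory_var: assumed variance of the visual target estimate (deg^2).
    """
    biases = []
    for target in targets:
        # The prior receives more weight as its variance shrinks.
        w_prior = sensory_var / (sensory_var + prior_var)
        estimate = (1.0 - w_prior) * target + w_prior * prior_mean
        # For probes at positive angles, a positive value is a shift toward 0 deg.
        biases.append(target - estimate)
        # Running update: a distant target inflates the prior variance.
        err = target - prior_mean
        prior_mean += beta * err
        prior_var = (1.0 - beta) * prior_var + beta * err ** 2
    return biases

def schedule(n_trials, probe_every, probe_angle=90.0):
    """Repeated-direction trials (0 deg) with a probe every `probe_every` trials."""
    return [probe_angle if (i + 1) % probe_every == 0 else 0.0 for i in range(n_trials)]

def mean_probe_bias(targets, biases):
    """Average bias on probe trials, discarding the first half as burn-in."""
    vals = [b for t, b in zip(targets, biases) if t != 0.0]
    late = vals[len(vals) // 2:]
    return sum(late) / len(late)

frequent = schedule(600, probe_every=5)
sparse = schedule(600, probe_every=15)
frequent_bias = mean_probe_bias(frequent, simulate(frequent))
sparse_bias = mean_probe_bias(sparse, simulate(sparse))
print(f"frequent probes: {frequent_bias:.1f} deg, sparse probes: {sparse_bias:.1f} deg")
```

Under these assumptions, each probe adds to the prior variance and longer runs of repeated-direction trials let it shrink again, so the sparser schedule produces the larger probe bias.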
4.4.6 Coordinate System of Use-Dependent Learning
Figure 5A shows that by the final training block in the Coordinate System Experiment, only the
UDL group developed use-dependent bias from both CW and CCW probes towards the repeated
direction, as expected. This was confirmed with an LMM analysis that showed bias at the CW probe for
the UDL group (F = 5.1, p = 0.03), no effect of Probe Direction (F = 0.09, p = 0.77), a significant effect of
Group (F = 8.8, p = 0.01), and no Probe Direction by Group interaction (F = 3.0, p = 0.09). We then tested
whether use-dependent bias transferred in a hand or joint space coordinate system when the arm
switched postures.
Figure 5B shows that the use-dependent bias developed during Training transferred across postures
in hand space. This was demonstrated by a significant, positive regression slope using Transfer block bias
from the 148° target to the 90° target as the outcome variable (bootstrapped slope mean [95% CI]: 1.3 [0.5
2.0]; see Methods for details about why this evaluation measure was used). In contrast, the regression
slope using Transfer block bias from the 90° target to the 148° target as the outcome variable (i.e.,
testing for joint space transfer) was not significantly different from zero (bootstrapped slope mean [95%
CI]: -0.4 [-1.3 0.7]).
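A bootstrapped regression slope with a percentile confidence interval, of the kind reported above, can be sketched as follows. This is a minimal illustration under stated assumptions: resampling is over subjects with replacement, the interval uses the simple percentile method, and the function names, number of subjects, and data are hypothetical. The procedure actually used is described in Methods.

```python
import random

def slope_intercept(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def bootstrap_slope(x, y, n_boot=2000, seed=0):
    """Mean bootstrapped slope and 95% percentile CI, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if max(xs) == min(xs):
            continue  # degenerate resample: slope undefined, draw again
        ys = [y[i] for i in idx]
        slopes.append(slope_intercept(xs, ys)[0])
    slopes.sort()
    lo = slopes[int(0.025 * n_boot)]
    hi = slopes[int(0.975 * n_boot) - 1]
    return sum(slopes) / n_boot, (lo, hi)

# Hypothetical per-subject biases (degrees): training bias vs. transfer bias.
rng = random.Random(1)
train_bias = [rng.uniform(0.0, 10.0) for _ in range(12)]
transfer_bias = [1.0 * b - 8.0 + rng.gauss(0.0, 1.0) for b in train_bias]

mean_slope, ci = bootstrap_slope(train_bias, transfer_bias)
print(f"slope mean [95% CI]: {mean_slope:.2f} [{ci[0]:.2f} {ci[1]:.2f}]")
```

In this framing, a confidence interval that excludes zero is read as evidence that transfer bias scales with the bias developed during training.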
Note that the non-zero intercepts seen in Figure 5B (bootstrapped mean intercept for 90° to
148° Transfer bias regression [95% CI]: 8.1 [3.7 13.1]; for 148° to 90°: -8.1 [-12.0 -3.7]) were biases
arising from the switch between the Train and Test postures (Ghilardi et al., 1995) and were not a result
of use-dependent learning as defined in the current work. This was confirmed by the presence of similar
intercepts in the Control group (Control group bootstrapped mean intercept for 90° to 148° Transfer
bias regression [95% CI]: 4.8 [1.6 8.4]; for 148° to 90°: -6.0 [-10.1 -1.3]). In agreement with the data from
Ghilardi et al., 1995, these intercepts showed that there was an overall CCW bias in reach direction
caused by the transition from Train to Test postures.
Figure 5. a) Mean bias during Training Block 5 (final training block) from probes located at 32° (CW
probe) and 148° (CCW probe) (mean ± SE of within-subject means across probes is shown) for each
group. Asterisks (*) show significant differences between groups (black) or for each group in comparison
to 0° (gray). Note the effect of probe direction was not significant (see text for details of hypothesis
testing). b) Regression of bias during Transfer block vs. bias during final training Block for UDL group. To
test for use-dependent learning transfer in joint space, the bias in the Transfer block from the 90° target
towards the 148° target was used as the outcome variable. To test for transfer in hand space, the bias
from the 148° target towards the 90° target was used instead. Only the slope of the regression for hand
space transfer was significantly different from zero (see text).
4.5 Discussion
The current study generated three novel and valuable additions to our understanding of use-
dependent learning. First, use-dependent learning was found to be insensitive to the frequency of
performance feedback. This supports the theory that use-dependent learning is a third motor learning
mechanism independent of error- and reward-based mechanisms. Second, use-dependent learning was
found to develop with a gradual, non-exponential time course and was robust to time-based and to
washout-based forgetting. This supports the notion that use-dependent learning plays a long-lasting role
in controlling movements, which is an important prerequisite to considering its effects in motor learning
and neurorehabilitation. Third, use-dependent learning was found to occur in a hand space coordinate
system. This suggests that brain areas representing extrinsic coordinates during motor learning underlie
use-dependent learning.
The first finding supports the prevailing assumption that use-dependent learning does not
require performance feedback to occur. This is in line with previous proposals suggesting that use-
dependent learning is a third motor learning mechanism separate from error- or reward-based ones
(Huang et al., 2011). Furthermore, the existence of separate error-based, reward-based, and use-
dependent mechanisms maps directly onto the hypothesis that the brain takes advantage of three kinds
of learning: supervised (requiring error feedback), reinforcement (requiring reward feedback), and
unsupervised (requiring no feedback) (Doya, 2000). In this hypothesis, supervised learning was assumed
to take place largely in the cerebellum, reinforcement learning in the basal ganglia, and unsupervised
learning in the cerebral cortex. The assumption that use-dependent learning occurs in an unsupervised
manner also agrees with previous results focusing on TMS-induced thumb movements after extensive
repetitions of voluntary thumb movements in a single direction. In one study, pharmacological blockade
of NMDA-receptor activity or enhancement of GABAergic inhibition substantially reduced use-
dependent learning (Bütefisch et al., 2000). This result suggested a mechanism similar to long-term
potentiation (LTP) underlying use-dependent learning. LTP, in turn, is thought to be the Hebbian-like
unsupervised learning rule that allows the cortex to organize itself during development or after
pathological events such as stroke (von der Malsburg, 1973; Kohonen, 1982; Sirosh and Miikkulainen,
1997; Sullivan and de Sa, 2006; Wilson et al., 2010; Bains and Schweighofer, 2014).
The second finding revealed novel details about the time course and robustness of use-
dependent learning. Day 1 of the Bias Time Course Experiment revealed a gradual increase in bias that
lacked the rapid exponential initial phase that characterizes error-based motor adaptation. However,
because such training only led to a bias of less than 5° after more than 600 training trials, this result was
at first glance incongruent with the bias of over 15° seen in the Verstynen and Sabes 2011 experiment
with almost the same number of trials to a repeated target. We therefore conducted the Verstynen
Replication Experiment and verified this difference in biases was not due to a difference in experimental
protocol or setup, because this experiment yielded a bias magnitude of 14.3° ± 6.4° (mean ± SE) at the
90° probes in the repeated context block. The Verstynen and Sabes adaptive Bayesian model allowed us
to reconcile the different bias magnitudes observed in the two experiments. The model revealed that
bias magnitude was sensitive to probe trial frequency and distance from the repeated direction.
Therefore, the model confirmed that use-dependent learning: i) develops gradually (even
under experimental conditions that lead to large bias magnitudes), ii) is well described as an incremental
Bayesian process whereby a subject builds a prior expectation of where a target may appear, and iii)
arises when this prior is combined with visual information about the actual target location to generate
a reach direction.
Also related to the second finding, the results of Day 2 of the Bias Time Course Experiment show
that the use-dependent bias learned on Day 1 was partially retained after 24 hours. This is a meaningful
finding because it is the first time, to our knowledge, that use-dependent learning has been shown to
last for a day or more after a single session of training. The robustness of bias to forgetting was also
observed during the Washout block. Although subjects could observe their own errors in this block, the
bias remained greater than zero after over 200 trials. This is in line with previous work demonstrating
that the aftereffects of use-dependent learning induced in a task-irrelevant direction were slower to
wash out than aftereffects relating to error-based learning (Diedrichsen et al., 2010). In sum, these two
forgetting results further the idea that use-dependent learning is not a transient phenomenon, but is
instead a true learning mechanism that shapes motor learning over an extended period in addition to
error- and reward-based learning.
The third finding was that use-dependent learning in voluntary reaching movements occurred in
a hand space (extrinsic) coordinate system. Previous work has shown that this coordinate frame is well-
represented in many cells of the posterior parietal and premotor cortices, and to some extent in the
primary motor cortex (Georgopoulos et al., 1986; Kakei et al., 2001; Buneo and Andersen, 2006; Kalaska,
2009; Toxopeus et al., 2011). Additionally, in thumb studies of use-dependent learning, TMS stimulation
of primary motor cortex evoked thumb movements biased in the direction of preceding repetitive
training movements (Classen et al., 1998; Bütefisch et al., 2000), while tDCS over the motor cortex
increased the magnitude and duration of use-dependent learning (Galea and Celnik, 2009). This suggests
the primary motor cortex as one locus of use-dependent learning in these studies, which relied on
passively evoked single-joint movements as a behavioral assay. However, voluntary reaching
movements like those in the current work may additionally be affected by use-dependent learning in
parietal and premotor areas that are involved in voluntary reach planning (Cisek and Kalaska, 2005) and
more strongly represent reaches in extrinsic coordinates than the primary motor cortex (Kakei et al.,
2001; Kalaska, 2009).
There were two main limitations to the current study. First, we were unable to remove all
performance feedback in the Bias Time Course Experiment. We minimized it as much as possible in the
Sparse FB group by providing only endpoint error, and not score, every 4-6 trials. This minimal feedback
was found necessary in pilot experiments to prevent drift in the subjects’ reach directions away from the
intended target. Such drift would have undermined the goal of movement in a repeated direction during
training. However, even with this titration of feedback to its lowest possible level, there was no
corresponding change in use-dependent bias observed in the Sparse FB group compared to the Full FB
group.
Second, results from the Generalization block during Day 2 of the Bias Time Course Experiment
were unclear. We tried to evaluate the generalization of use-dependent learning by determining how far
away from the repeated direction probes could occur and still reveal a reach bias. However, we
observed an unexpectedly low bias at probes 60° from the repeated direction compared to neighboring
30° and 90° probes. This may have been due to a poor estimation of the baseline bias at probes other
than 90°, because targets were not intentionally placed at the other probe distances during the Baseline
block. Nonetheless, it did appear that some bias was maintained even at probes at least 135° from the
repeated direction, with a possible maximum bias occurring at probes closer to the repeated direction.
This generalization of use-dependent learning across a wide range of probe directions would be similar
to the wide generalization range previously found for the effects of hand path priming (i.e. bias of hand
trajectory towards the trajectory executed on a previous trial; Jax and Rosenbaum, 2007), assuming the
mechanism underlying hand path priming is similar to the use-dependent learning mechanism
investigated here.
The current study raises several possibilities for future work on use-dependent learning in
voluntary movements. First, further testing of the Bayesian model of use-dependent learning proposed
by Verstynen and Sabes could be undertaken by, for example, blurring targets to increase uncertainty in
the visual information and observing if this further increases the weight of the prior expectation in
determining reach direction, hence increasing bias. Second, the robustness of use-dependent learning to
washout could be more thoroughly characterized by determining if the robustness is sensitive to the
magnitude of the use-dependent bias or the length of training. Third, experiments could investigate the
effects of tDCS applied to motor or parietal cortex during induction of use-dependent learning in
reaching. This work would be in a similar vein to the tDCS thumb movement study of Galea and Celnik
2009, but would involve testing for use-dependent learning using planned, voluntary movements
instead of TMS-evoked movements. Such an experiment could help to pinpoint the brain areas
underlying use-dependent learning.
Aside from having a role in healthy motor learning, use-dependent learning may also be
pertinent in rehabilitation after stroke or other pathological events. A patient who receives insufficient
rehabilitation may continue to repeat a compensatory movement pattern, thereby entrenching this
pattern through use-dependent learning and making it difficult to “wash out” by performing healthy
movement patterns. Though speculative for now, this clinical application would be an interesting
avenue for future research.
Chapter 5: Summary of Work
Chapters 3 and 4 detailed two different but related projects that have constituted my thesis
research. In chapter 3, I described a purely computational project in which a simplified model of sensory
cortex was used to explore how homeoplasticity and Hebbian plasticity may interact post-stroke, and
how this interaction could determine when a rehabilitation dose should be given. Chapter 4 described
subsequent experimental work examining whether a central assumption of the cortical model, namely
that purely unsupervised Hebbian learning could occur in the sensorimotor system, was feasible.
The cortical modeling project yielded two major conclusions. First, Hebbian and homeoplasticity
were both essential to recovery after simulated stroke lesion. Homeoplasticity allowed the network to
recover from abnormal post-lesion activity patterns and better correlate with afferent muscle spindle
activity. Hebbian plasticity then took advantage of this correlation to drive a reorganization of synaptic
weights in the remaining network that reflected the patterns of afferent activity. Second, a post-lesion
delay period, during which the simulated arm and muscles were held in a “rest” position, allowed a
subsequent rehabilitation dose to drive recovery faster than if the dose was given with no delay. This
was because during the delay, afferent synaptic weights from all spindles became more equally spread,
or “flattened,” across the cortical cells since all spindles had similar, mild activity levels. The subsequent
rehabilitation dose could then more easily drive reorganization as compared to the case immediately
post-lesion, where remaining cortical cells had pre-existing strong synaptic connections with certain
spindles that had to be rearranged to allow the lesion-affected afferent inputs to gain representation.
These results, however, depended on the assumption that a purely unsupervised Hebbian
mechanism could alter cortical organization in absence of skill learning and its related error- or reward-
based feedback. This assumption was especially important in supposing some reorganization would
occur during a delay period before rehabilitation, since no skill learning could occur with the arm at rest.
Thus, in my second thesis project, we attempted to determine whether such purely unsupervised
changes were actually possible using UDL as a behavioral assay of neural plasticity. However, we also
extended this project to generally explore and better characterize UDL in an attempt to determine if it
was an important third motor learning mechanism in addition to error- and reward-based mechanisms.
The UDL project yielded three main conclusions. First, titrating feedback to as low a level as
possible had no effect on UDL, suggesting UDL was independent of error- or reward-based mechanisms.
Second, the time course and magnitude of UDL were very sensitive to how frequently targets distant
from the repeated target occurred, but the UDL that did develop was retained across days and only
slowly washed out. This was the first time to our knowledge such retention was demonstrated, and the
slow washout was in line with previous work showing UDL decayed more slowly than error-based effects
(Diedrichsen et al., 2010). Third, UDL transferred in hand-based (extrinsic) coordinates, suggesting in the
case of voluntary, planned arm movements, UDL was largely driven by cortical areas mostly
representing extrinsic reach direction, such as premotor areas of the frontal cortex, and not by areas
representing reaches in terms of intrinsic, joint-based coordinates, such as area 5d of the posterior
parietal cortex (Shadmehr and Wise, 2005). In sum, these results show that UDL is a robust
phenomenon that could play a role in shaping motor control over extended periods.
Recently, however, evidence has been presented that performance feedback is in fact important
in modulating UDL (Mawase et al., 2015). The experiments demonstrating this were not reaching
experiments, but instead were based on thumb movements and tested for UDL by measuring changes in
the direction of TMS-evoked thumb movements, as in previous studies discussed earlier (Classen et al.,
1998; Bütefisch et al., 2000). Crucially, the recent experiments differed from these previous ones in how
subjects were induced to repeatedly generate thumb forces in a repeated direction. In the previous
studies, subjects made ballistic movements of their thumb in a repeated direction on every beat of a
metronome set at 1 Hz. In the recent study, subjects had to complete trials of a sequential visuomotor
pinch task. In this task, the pinch force of their thumb and forefinger was used to control the one-
dimensional movement of a cursor and pass it through targets requiring a specific pinch force. In one
experiment, subjects had to pass the cursor through five different targets requiring five different forces
on each trial, returning to the start position after every target. One group of subjects had a consistent
mapping of force to cursor location on every trial such that they were able to learn and become more
skilled at the task. A second group had random mappings on every trial, such that they never learned
and their performance never improved. UDL in the thumb was only found to a significant degree in the
first group. A follow-up experiment used a similar task, but only required subjects to hit one target on
each trial. However, cursor location was not shown, and only binary success/failure feedback was given.
One group received feedback related to their actual performance, hence allowing skill improvement,
while each subject in a second group was yoked to a subject in the first (meaning they saw feedback
based on the subject in the first group’s trial-by-trial performance, not their own). Again, only the first
group showed skill learning and significant UDL.
There are several reasons why the results from this experiment may not be in conflict with our
own showing that performance feedback and skill learning were not necessary for UDL. First, it may very
well be possible that skill learning (or performance improvement) in addition to repetitive movements
increases the magnitude of UDL beyond what repetitive movement alone can induce (Huang et al.,
2011). Substantial skill learning was not present in our experiments. Although subjects did become
somewhat more accurate and precise in their reaching to a single target during training, this change was
not as substantial as the skill learning in Mawase et al. since reaching is already a well-practiced skill.
However, even if performance feedback that leads to skill learning increases the magnitude of UDL, our
results demonstrate that there is still a degree of UDL that can be induced without such feedback. Hence
the Mawase et al. results do not necessarily detract from the notion of UDL as a mechanism distinct
from error- or reward-based mechanisms, but rather provide another example of how these
mechanisms may interact.
Second, it should be noted that, for example, subjects in the first experiment from Mawase et
al. performed approximately 600 pinch-release cycles during training, compared to the 1800 ballistic
thumb movements performed during the original Classen et al. paper investigating UDL in the thumb.
This difference in repetitions may have prevented the non-improving groups in Mawase et al. from
inducing significant UDL, while skill learning in the other groups allowed a high magnitude of UDL to
develop more quickly. This question about repetition number is especially pertinent considering that the
original thumb studies induced UDL without subjects learning a skill. This may again point to an
interaction of feedback-based mechanisms with UDL as opposed to a complete dependence of UDL on
feedback.
Finally, the difference in how UDL was evaluated in our work versus that of Mawase et al. could
play a role in how much UDL is observed independent of performance feedback during training. Mawase
et al. relied on TMS stimulation of M1 to move the thumb in order to measure directional biases caused
by UDL, thereby only allowing measurement of UDL that had been caused by plasticity in M1 or
downstream, subcortical motor-related areas. In our reliance on voluntary, planned arm movements to
measure UDL, on the other hand, we allowed any UDL that had developed in higher motor planning
areas, such as the premotor cortices, to influence our outcome measurements, along with M1 UDL.
Although how much or whether UDL is restricted to specific cortical areas is still an open question, this
raises the possibility that small amounts of feedback-independent UDL in multiple cortical areas
summed to give us the reach direction biases we measured. In Mawase et al., however, the perhaps
small amount of UDL only arising from M1 in the non-improving groups may not have yielded a
significant outcome effect.
Nonetheless, the questions raised by Mawase et al. are interesting. Further work needs to be
done on elucidating the interactions of error-based, reward-based, and use-dependent learning
mechanisms and their neural plasticity correlates. It is my hope that this thesis has moved some small
distance towards that goal by investigating the interaction of Hebbian and homeoplasticity and better
characterizing use-dependent learning, Hebbian plasticity’s putative behavioral manifestation.
Continued work on this front will reveal not only basic truths about motor learning, but also valuable
understanding of how these mechanisms can be harnessed to improve rehabilitative outcomes for
millions of stroke patients.
Bibliography
Abe M, Schambra H, Wassermann EM, Luckenbaugh D, Schweighofer N, Cohen LG (2011) Reward
improves long-term retention of a motor memory through induction of offline memory gains. Curr
Biol 21:557–562.
Ada L, Dorsch S, Canning C (2006) Strengthening interventions increase strength and improve activity
after stroke: a systematic review. Aust J Physiother 52:241–248.
Albus JS (1971) A theory of cerebellar function. Math Biosci 10:25–61.
Baddeley R, Abbott LF, Booth MCA, Sengpiel F, Freeman T, Wakeman EA, Rolls ET (1997) Responses of
neurons in primary and inferior temporal visual cortices to natural scenes. Proc R Soc London Ser B
Biol Sci 264:1775–1783.
Bains AS, Schweighofer N (2014) Time-sensitive reorganization of the somatosensory cortex poststroke
depends on interaction between Hebbian and homeoplasticity: a simulation study. J Neurophysiol
112:3240–3250.
Barrionuevo G, Schottler F, Lynch G (1980) The effects of repetitive low frequency stimulation on control
and “potentiated” synaptic responses in the hippocampus. Life Sci 27:2385–2391.
Bartley AF, Huang ZJ, Huber KM, Gibson JR (2008) Differential activity-dependent, homeostatic plasticity
of two neocortical inhibitory circuits. J Neurophysiol 100:1983–1994.
Bienenstock EL, Cooper LN, Munro PW (1982) Theory for the development of neuron selectivity:
orientation specificity and binocular interaction in visual cortex. J Neurosci 2:32–48.
Bliss T, Lomo T (1973) Long-lasting potentiation of synaptic transmission in the dentate area of the
anaesthetized rabbit following stimulation of the perforant path. J Physiol 232:331–356.
Brashers-Krug T, Shadmehr R, Bizzi E (1996) Consolidation in Human Motor Memory. Nature 382:252–
255.
Breton J-D, Stuart GJ (2009) Loss of sensory input increases the intrinsic excitability of layer 5 pyramidal
neurons in rat barrel cortex. J Physiol 587:5107–5119.
Buneo CA, Andersen RA (2006) The posterior parietal cortex: sensorimotor interface for the planning
and online control of visually guided movements. Neuropsychologia 44:2594–2606.
Burrone J, O’Byrne M, Murthy V (2002) Multiple forms of synaptic plasticity triggered by selective
suppression of activity in individual neurons. Nature 420:414–418.
Bütefisch CM, Davis BC, Wise SP, Sawaki L, Kopylev L, Classen J, Cohen LG (2000) Mechanisms of
use-dependent plasticity in the human motor cortex. Proc Natl Acad Sci U S A 97:3661–3665.
Butko NJ, Triesch J (2006) Exploring the role of intrinsic plasticity for the learning of sensory
representations. In: European Symposium on Artificial Neural Networks.
Calabresi P, Maj R, Pisani A, Mercuri N, Bernardi G (1992) Long-term synaptic depression in the striatum:
physiological and pharmacological characterization. J Neurosci 12:4224–4233.
Cantarero G, Lloyd A, Celnik P (2013a) Reversal of long-term potentiation-like plasticity processes after
motor learning disrupts skill retention. J Neurosci 33:12862–12869.
Cantarero G, Tang B, O’Malley R, Salas R, Celnik P (2013b) Motor learning interference is proportional to
occlusion of LTP-like plasticity. J Neurosci 33:4634–4641.
Choe Y, Miikkulainen R (2004) Contour integration and segmentation with self-organized lateral
connections. Biol Cybern 90:75–88.
Cisek P, Kalaska JF (2005) Neural correlates of reaching decisions in dorsal premotor cortex: Specification
of multiple direction choices and final selection of action. Neuron 45:801–814.
Citri A, Malenka RC (2008a) Synaptic plasticity: multiple forms, functions, and mechanisms.
Neuropsychopharmacology 33:18–41.
Citri A, Malenka RC (2008b) Synaptic plasticity: multiple forms, functions, and mechanisms.
Neuropsychopharmacol Rev 33:18–41.
Classen J, Liepert J, Wise SP, Hallett M, Cohen LG (1998) Rapid plasticity of human cortical movement
representation induced by practice. J Neurophysiol 79:1117–1123.
Collingridge GL, Peineau S, Howland JG, Wang YT (2010) Long-term depression in the CNS. Nat Rev
Neurosci 11:459–473.
Conforto AB, Ferreiro KN, Tomasi C, dos Santos RL, Moreira VL, Marie SK, Baltieri SC, Scaff M, Cohen LG
(2010) Effects of somatosensory stimulation on motor function after subacute stroke. Neurorehabil
Neural Repair 24:263–272.
Cramer SC (2008) Repairing the human brain after stroke: I. Mechanisms of spontaneous recovery. Ann
Neurol 63:272–287.
Desai NS, Nelson SB, Turrigiano GG (1999) Activity-dependent regulation of excitability in rat visual
cortical neurons. Neurocomputing 26-27:101–106.
Diedrichsen J, White O, Newman D, Lally N (2010) Use-dependent and error-based learning of motor
behaviors. J Neurosci 30:5159–5166.
Doya K (2000) Complementary roles of basal ganglia and cerebellum in learning and motor control. Curr
Opin Neurobiol 10:732–739.
Dromerick AW, Lang CE, Birkenmeier RL, Wagner JM, Miller JP, Videen TO, Powers WJ, Wolf SL, Edwards
DF (2009) Very Early Constraint-Induced Movement during Stroke Rehabilitation (VECTORS): A
single-center RCT. Neurology 73:195–201.
Echegoyen J, Neu A, Graber KD, Soltesz I (2007) Homeostatic plasticity studies using in vivo hippocampal
activity-blockade: Synaptic scaling, intrinsic plasticity and age-dependence. PLoS One 2.
Fox J (2016) Bootstrapping Regression Models. In: Applied Regression Analysis and Generalized Linear
Models, 3rd ed., pp 587–606. Sage Publications.
Galea JM, Celnik P (2009) Brain polarization enhances the formation and retention of motor memories. J
Neurophysiol 102:294–301.
Galea JM, Vazquez A, Pasricha N, Orban De Xivry JJ, Celnik P (2011) Dissociating the roles of the
cerebellum and motor cortex during adaptive learning: the motor cortex retains what the
cerebellum learns. Cereb Cortex 21:1761–1770.
Georgopoulos AP, Schwartz AB, Kettner RE (1986) Neuronal population coding of movement direction.
Science 233:1416–1419.
Ghilardi MF, Gordon J, Ghez C (1995) Learning a visuomotor transformation in a local area of work space
produces directional biases in other areas. J Neurophysiol 73:2535–2539.
Hammerbeck U, Yousif N, Greenwood R, Rothwell JC, Diedrichsen J (2014) Movement speed is biased by
prior experience. J Neurophysiol 111:128–134.
Han CE, Arbib MA, Schweighofer N (2008) Stroke rehabilitation reaches a threshold. PLoS Comput Biol
4:e1000133.
Hartman KN, Pal SK, Burrone J, Murthy VN (2006) Activity-dependent regulation of inhibitory synaptic
transmission in hippocampal neurons. Nat Neurosci 9:642–649.
Hebb DO (1949) The Organization of Behavior: A Neuropsychological Theory. John Wiley and Sons.
Hosp JA, Pekanovic A, Rioult-Pedotti MS, Luft AR (2011) Dopaminergic projections from midbrain to
primary motor cortex mediate motor skill learning. J Neurosci 31:2481–2487.
Huang VS, Haith A, Mazzoni P, Krakauer JW (2011) Rethinking motor learning and savings in adaptation
paradigms: model-free memory for successful actions combines with internal models. Neuron
70:787–801.
Huupponen J, Molchanova SM, Taira T, Lauri SE (2007) Susceptibility for homeostatic plasticity is down-
regulated in parallel with maturation of the rat hippocampal synaptic circuitry. J Physiol 581:505–
514.
Iriki A, Pavlides C, Keller A, Asanuma H (1989) Long-term potentiation in the motor cortex. Science
245:1385–1387.
Iriki A, Pavlides C, Keller A, Asanuma H (1991) Long-term potentiation of thalamic input to the motor
cortex induced by coactivation of thalamocortical and corticocortical afferents. J Neurophysiol
65:1435–1441.
Ito M, Kano M (1982) Long-lasting depression of parallel fiber-Purkinje cell transmission induced by
conjunctive stimulation of parallel fibers and climbing fibers in the cerebellar cortex. Neurosci Lett
33:253–258.
Izawa J, Shadmehr R (2011) Learning from sensory and reward prediction errors during motor
adaptation. PLoS Comput Biol 7:e1002012.
James W (1890) The Principles of Psychology (Vols. 1 & 2). New York: Holt.
Jax SA, Rosenbaum DA (2007) Hand path priming in manual obstacle avoidance: evidence that the dorsal
stream does not only control visually guided actions in real time. J Exp Psychol Hum Percept
Perform 33:425–441.
Joiner WM, Smith MA (2008) Long-term retention explained by a model of short-term learning in the
adaptive control of reaching. J Neurophysiol 100:2948–2955.
Kakei S, Hoffman DS, Strick PL (2001) Direction of action is represented in the ventral premotor cortex.
Nat Neurosci 4:1020–1025.
Kalaska JF (2009) From intention to action: motor cortex and the control of reaching movements. In:
Progress in Motor Control, pp 139–178. Springer US.
Keisler A, Shadmehr R (2010) A shared resource between declarative memory and motor memory. J
Neurosci 30:14817–14823.
Kim S, Ogawa K, Lv J, Schweighofer N (2015) Neural Substrates Related to Motor Memory with Multiple
Timescales in Sensorimotor Adaptation.
Kitazawa S, Kimura T, Yin PB (1998) Cerebellar complex spikes encode both destinations and errors in
arm movements. Nature 392:494–497.
Kohonen T (1982) Self-Organized Formation of Topologically Correct Feature Maps. Biol Cybern 43:59–
69.
Krakauer JW, Mazzoni P (2011) Human sensorimotor learning: adaptation, skill, and beyond. Curr Opin
Neurobiol 21:636–644.
Krakauer JW, Pine ZM, Ghilardi MF, Ghez C (2000) Learning of visuomotor transformations for vectorial
planning of reaching trajectories. J Neurosci 20:8916–8924.
Kreitzer AC, Malenka RC (2008) Striatal Plasticity and Basal Ganglia Circuit Function. Neuron 60:543–554.
Langhorne P, Bernhardt J, Kwakkel G (2011) Stroke rehabilitation. Lancet 377:1693–1702.
Laver KE, George S, Thomas S, Deutsch JE, Crotty M (2011) Virtual reality for stroke rehabilitation.
Cochrane Database Syst Rev:CD008349.
Lee JY, Schweighofer N (2009) Dual adaptation supports a parallel architecture of motor memory. J
Neurosci 29:10396–10404.
Lynch GS, Dunwiddie T, Gribkoff V (1977) Heterosynaptic depression: a postsynaptic correlate of long-
term potentiation. Nature 266:737–739.
Lytton WW, Stark JM, Yamasaki DS, Sober SJ (1999) Computer Models of Stroke Recovery:
Implications for Neurorehabilitation. Neuroscientist 5:100–111.
Maffei A, Nelson SB, Turrigiano GG (2004) Selective reconfiguration of layer 4 visual cortical circuitry by
visual deprivation. Nat Neurosci 7:1353–1359.
Malfait N, Shiller DM, Ostry DJ (2002) Transfer of motor learning across arm configurations. J Neurosci
22:9656–9660.
Manohar SG, Chong TT, Apps MAJ, Batla A, Stamelou M, Jarman PR, Bhatia KP, Husain M (2015) Reward
Pays the Cost of Noise Reduction in Motor and Cognitive Control. Curr Biol 25:1707–1716.
Massey PV, Bashir ZI (2007) Long-term depression: multiple forms and implications for brain function.
Trends Neurosci 30:176–184.
Mawase F, Uehara S, Bastian A, Celnik P (2015) The crucial role of success-related reward signals in use-
dependent learning. In: TCMC 2015. Chicago.
Mazzoni P, Krakauer JW (2006) An implicit plan overrides an explicit strategy during visuomotor
adaptation. J Neurosci 26:3642–3645.
McDougle SD, Bond KM, Taylor JA (2015) Explicit and Implicit Processes Constitute the Fast and Slow
Processes of Sensorimotor Learning. J Neurosci 35:9568–9579.
Murphy TH, Corbett D (2009) Plasticity during stroke recovery: from synapse to behaviour. Nat Rev
Neurosci 10:861–872.
Nelson AB, Gittis AH, Du Lac S (2005) Decreases in CaMKII activity trigger persistent potentiation of
intrinsic excitability in spontaneously firing vestibular nucleus neurons. Neuron 46:623–631.
Pekny SE, Izawa J, Shadmehr R (2015) Reward-Dependent Modulation of Movement Variability. J
Neurosci 35:4015–4024.
Plautz EJ, Milliken GW, Nudo RJ (2000) Effects of repetitive motor training on movement
representations in adult squirrel monkeys: role of use versus learning. Neurobiol Learn Mem
74:27–55.
Pollock A, Farmer SE, Brady MC, Langhorne P, Mead GE, Mehrholz J, van Wijck F (2014) Interventions for
improving upper limb function after stroke. Cochrane Libr 11:CD010820–CD010820.
Rabe K, Livne O, Gizewski ER, Aurich V, Beck A, Timmann D, Donchin O (2009) Adaptation to visuomotor
rotation and force field perturbation is correlated to different brain areas in patients with
cerebellar degeneration. J Neurophysiol 101:1961–1971.
Reinkensmeyer DJ, Guigon E, Maier MA (2012) A computational model of use-dependent motor
recovery following a stroke: optimizing corticospinal activations via reinforcement learning can
explain residual capacity and other strength recovery dynamics. Neural Netw 29-30:60–69.
Rioult-Pedotti M-S, Friedman D, Donoghue JP (2000) Learning-Induced LTP in Neocortex. Science
290:533–536.
Rioult-Pedotti MS, Friedman D, Hess G, Donoghue JP (1998) Strengthening of horizontal cortical
connections following skill learning. Nat Neurosci 1:230–234.
Rutherford LC, Nelson SB, Turrigiano GG (1998) BDNF has opposite effects on the quantal amplitude of
pyramidal neuron and interneuron excitatory synapses. Neuron 21:521–530.
Schultz W, Dayan P, Montague PR (1997) A Neural Substrate of Prediction and Reward. Science
275:1593–1599.
Shadmehr R, Mussa-Ivaldi FA (1994) Adaptive representation of dynamics during learning of a motor
task. J Neurosci 14:3208–3224.
Shadmehr R, Wise SP (2005) The Computational Neurobiology of Reaching and Pointing. Cambridge,
MA: MIT Press.
Shen W, Flajolet M, Greengard P, Surmeier DJ (2008) Dichotomous dopaminergic control of striatal
synaptic plasticity. Science 321:848–851.
Shmuelof L, Huang VS, Haith AM, Delnicki RJ, Mazzoni P, Krakauer JW (2012) Overcoming motor
“forgetting” through reinforcement of learned actions. J Neurosci 32:14617–14621.
Sing GC, Smith MA (2010) Reduction in learning rates associated with anterograde interference results
from interactions between different timescales in motor adaptation. PLoS Comput Biol 6.
Sirosh J, Miikkulainen R (1994) Cooperative self-organization of afferent and lateral connections in
cortical maps. Biol Cybern 71:65–78.
Sirosh J, Miikkulainen R (1997) Topographic receptive fields and patterned lateral interaction in a self-
organizing model of the primary visual cortex. Neural Comput 9:577–594.
Smith MA, Ghazizadeh A, Shadmehr R (2006) Interacting adaptive processes with different timescales
underlie short-term motor learning. PLoS Biol 4:e179.
Stagg CJ, Best JG, Stephenson MC, O’Shea J, Wylezinska M, Kincses ZT, Morris PG, Matthews PM,
Johansen-Berg H (2009) Polarity-Sensitive Modulation of Cortical Neurotransmitters by
Transcranial Stimulation. J Neurosci 29:5202–5206.
Stagg CJ, Nitsche MA (2011) Physiological basis of transcranial direct current stimulation. Neuroscientist
17:37–53.
Stellwagen D, Malenka RC (2006) Synaptic scaling mediated by glial TNF-alpha. Nature 440:1054–1059.
Sullivan TJ, de Sa VR (2006) Homeostatic synaptic scaling in self-organizing maps. Neural Netw 19:734–
743.
Sutton RS, Barto AG (1998) Reinforcement Learning: An Introduction, 1st ed. Cambridge, MA: MIT
Press.
Taylor JA, Klemfuss NM, Ivry RB (2010) An explicit strategy prevails when the cerebellum fails to
compute movement errors. Cerebellum 9:580–586.
Thiagarajan TC, Lindskog M, Tsien RW (2005) Adaptation to synaptic inactivity in hippocampal neurons.
Neuron 47:725–737.
Toxopeus CM, de Jong BM, Valsan G, Conway BA, Leenders KL, Maurits NM (2011) Direction of
movement is encoded in the human primary motor cortex. PLoS One 6:e27838.
Toyoizumi T, Miller KD (2009) Equalization of ocular dominance columns induced by an activity-
dependent learning rule and the maturation of inhibition. J Neurosci 29:6514–6525.
Triesch J (2005) A gradient rule for the plasticity of a neuron’s intrinsic excitability. In: Artificial Neural
Networks: Biological Inspirations – ICANN 2005, pp 65–70.
Triesch J (2007) Synergies between intrinsic and synaptic plasticity mechanisms. Neural Comput 19:885–
909.
Turrigiano G (2011) Too Many Cooks? Intrinsic and Synaptic Homeostatic Mechanisms in Cortical Circuit
Refinement. Annu Rev Neurosci 34:97–107.
Turrigiano G (2012) Homeostatic synaptic plasticity: local and global mechanisms for stabilizing neuronal
function. Cold Spring Harb Perspect Biol 4:a005736.
Turrigiano GG (2008) The self-tuning neuron: synaptic scaling of excitatory synapses. Cell 135:422–435.
Turrigiano GG, Leslie KR, Desai NS, Rutherford LC, Nelson SB (1998) Activity-dependent scaling of
quantal amplitude in neocortical neurons. Nature 391:892–896.
Turrigiano GG, Nelson SB (2004) Homeostatic plasticity in the developing nervous system. Nat Rev
Neurosci 5.
Vale C, Sanes DH (2002) The effect of bilateral deafness on excitatory and inhibitory synaptic strength in
the inferior colliculus. Eur J Neurosci 16:2394–2404.
van Beers RJ (2009) Motor learning is optimally tuned to the properties of motor noise. Neuron 63:406–
417.
van Delden AE, Peper CE, Beek PJ, Kwakkel G (2012) Unilateral versus bilateral upper limb exercise
therapy after stroke: a systematic review. J Rehabil Med 44:106–117.
Verstynen T, Sabes PN (2011) How each movement changes the next: an experimental and theoretical
study of fast adaptive priors in reaching. J Neurosci 31:10050–10059.
von der Malsburg C (1973) Self-organization of orientation sensitive cells in the striate cortex. Kybernetik
14:85–100.
Wilson SP, Law JS, Mitchinson B, Prescott TJ, Bednar JA (2010) Modeling the emergence of whisker
direction maps in rat barrel cortex. PLoS One 5:e8778.
Winstein CJ, Wolf SL, Dromerick AW, Lane CJ, Nelsen MA, Lewthwaite R, Cen SY, Azen SP (2016) Effect of
a Task-Oriented Rehabilitation Program on Upper Extremity Recovery Following Motor Stroke: The
ICARE Randomized Clinical Trial. JAMA 315:571–581.
Wolf SL, Winstein CJ, Miller JP, Taub E, Uswatte G, Morris D, Giuliani C, Light KE, Nichols-Larsen D (2006)
Effect of Constraint-Induced Movement Therapy on Upper Extremity Function 3 to 9 Months After
Stroke. JAMA 296:2095–2104.
Wu HG, Miyamoto YR, Castro LNG, Ölveczky BP, Smith MA (2014) Temporal structure of motor
variability is dynamically regulated and predicts motor learning ability. Nat Neurosci.
Zarahn E, Weston GD, Liang J, Mazzoni P, Krakauer JW (2008) Explaining savings for visuomotor
adaptation: linear time-invariant state-space models are not sufficient. J Neurophysiol 100:2537–
2548.
Abstract
Motor learning and the neural plasticity that underlies it are essential to almost everything we do on a daily basis, from driving a car to typing on a keyboard. Additionally, a proper understanding of these phenomena would allow us to harness them to aid those who suffer motor impairments resulting from stroke or other diseases. This thesis lays out the work done during my PhD aimed at improving this understanding. The work falls under two main projects. The first proposes a neural network model of sensory cortex to explore the roles of Hebbian and homeoplasticity after stroke, including how they interact to determine the optimal time to initiate rehabilitation. The second project then endeavors to strengthen a central assumption of the model, namely that purely unsupervised Hebbian learning can occur in the sensorimotor system independent of performance feedback (error or reward) from the environment. This is accomplished through arm reaching experiments that assay for unsupervised learning using behavioral measurements of use-dependent learning. This second project is also extended to explore the phenomenon of use-dependent learning in general to determine whether it could play a significant role alongside error- and reward-based learning mechanisms in shaping motor control. The organization of the thesis is as follows. Chapter 2 reviews the literature that is pertinent to understanding models of cortical organization, behavioral and physiological aspects of stroke recovery, and the mechanisms of motor learning. Chapters 3 and 4 then detail the two projects described above. These chapters may be read independently of the others and are formatted as manuscripts for submission to particular journals, as per Neuroscience Graduate Program guidelines. Finally, Chapter 5 summarizes the work and hypothesizes about its links to other recent work.
Asset Metadata
Creator: Bains, Amarpreet Singh (author)
Core Title: Experimental and computational explorations of different forms of plasticity in motor learning and stroke recovery
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Neuroscience
Publication Date: 08/08/2016
Defense Date: 05/06/2016
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: computational modeling, Hebbian plasticity, homeoplasticity, motor learning, neural network, OAI-PMH Harvest, stroke, unsupervised learning, use-dependent learning
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Schweighofer, Nicolas (committee chair); Gordon, James (committee member); Schaal, Stefan (committee member)
Creator Email: abains05@gmail.com, amarpreb@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c40-300541
Unique identifier: UC11281573
Identifier: etd-BainsAmarp-4757.pdf (filename), usctheses-c40-300541 (legacy record id)
Dmrecord: 300541
Document Type: Dissertation
Rights: Bains, Amarpreet Singh
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA