Computational models and model-based fMRI studies in motor learning
(USC Thesis Other)
COMPUTATIONAL MODELS AND MODEL-BASED FMRI STUDIES IN
MOTOR LEARNING
by
Sung Shin Kim
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(NEUROSCIENCE)
August 2013
Copyright 2013 Sung Shin Kim
Acknowledgements
First of all, I would like to thank Dr. Nicolas Schweighofer for being the best advisor
with his endless support, insight, and patience. I am also thankful to my dissertation
committee members Drs. Stefan Schaal, Bosco Tjan, James Gordon, and Jinchi Lv, as
well as my qualification exam committee member Dr. Norberto Grzywacz, for offering
valuable feedback on this work. I would also like to thank my Japanese collaborators Drs.
Imamizu Hiroshi, Kenji Ogawa, and Daniel Callan for their work in fMRI data analysis, Dr.
Jeong-Yoon Lee for his insightful discussion and brilliant research ideas, and Dr. Mitsuo
Kawato for hosting me at ATR, Kyoto, Japan. I feel thankful to all of my fellows at the
University of Southern California for their moral support and encouragement: Youngmin
Oh, Yupeng Xiao, Chunji Wang, Younggeun Choi, Cheol Han, Feng Qi, Yukikazu
Hidaka, Amarpreet Bains, Hyeshin Park, and Sujin Kim in the Computational
Neuro-Rehabilitation and Learning Lab. I also give thanks to all the Christian fellows of LIGHT
ministry at USC and to pastor Sanghwan Lee, and special thanks to my old friends, Jaewook
Kim and Euijun Park, and my piano teacher, Taeyeon Lim, who guided me in playing
Chopin, an exciting motor learning experience. Last but not least, I would like to thank
my family, especially my parents and brothers, for supporting me through the toughest
times with love and prayers.
Table of Contents
Chapter 1. Introduction
1.1. Computational models and model-based fMRI in motor learning
1.2. Organization of the dissertation
Chapter 2. Dynamics of fast and slow memories underlies both the spacing and the contextual interference effects in motor adaptation
2.1. Introduction
2.2. Materials and Methods
2.2.1. Participants
2.2.2. Experimental procedure
2.2.3. Computational modeling
2.2.4. Model parameter fitting and model comparison
2.2.5. Measurement of initial learning rate
2.2.6. Analysis of retention
2.3. Results
2.3.1. Initial learning rate depends on the activation of the fast process
2.3.2. Short-term retention is determined by relative proportion of fast and slow processes
2.3.3. Time-spacing induces overnight consolidation
2.4. Discussion
Chapter 3. Neural correlates for multi-rate models of sensorimotor learning
3.1. Introduction
3.2. Materials and Methods
3.2.1. Subjects
3.2.2. Task procedures
3.2.3. Computational model
3.2.4. MRI acquisition
3.2.5. Processing of fMRI data
3.2.6. Model-based regression analysis of fMRI data
3.2.7. Multi-voxel pattern analysis (MVPA)
3.3. Results
3.3.1. Behavioral analysis and modeling
3.3.2. Model-based regression of fMRI
3.3.3. MVPA result
3.4. Discussion
Chapter 4. A model-based fMRI study of decision making in motor learning
4.1. Introduction
4.2. Materials and Methods
4.2.1. Participants
4.2.2. Experiment procedure
4.2.3. Bayesian search model
4.2.4. Alternative models
4.2.5. Model-based regressors
4.3. Results
4.3.1. Behavioral results and model performance
4.4. Discussion
Chapter 5. Conclusion
5.1. Summary
5.2. Future work
Bibliography
List of Tables
Table 2.1. Mean and 95% confidence interval of BICs from 10000 bootstrapped data
List of Figures
Figure 2.1. Experiment design
Figure 2.2. Observed subject behavior and model fits in three experimental conditions
Figure 2.3. First-day learning and forgetting
Figure 2.4. Second-day retention and consolidation
Figure 3.1. Timeline in a trial
Figure 3.2. Adaptation process in behavioral data and result of model fitting
Figure 3.3. Responsible regions for individual adaptation components with 30 different
time constants
Figure 3.4. T-values of regression coefficients for parietal region and the cerebellum as a
function of regressor numbers, which correspond to different time constants
Figure 3.5. Eigenvariates and eigenimages of the top of four components yielded by the
singular value decomposition analysis of variations in brain activity related to different
timescales of sensorimotor memory
Figure 3.6. Regions of interest (ROIs) and classification accuracy of multi-voxel pattern
analysis
Figure 4.1. Experiment design
Figure 4.2. Bayesian decision making for the movement direction
Figure 4.3. The reward-prediction errors (RPE) and the exploration calculated from a
posterior distribution
Figure 4.4. Behavioral data analysis and three latent variables as regressors
Abstract
In the last decade, computational models of motor learning have become popular
because they provide a theoretical framework that not only explains but also predicts motor
learning behaviors. However, the computational approach has sometimes been criticized by
the experimentalists who dominate neuroscience because the hypothesized models lack
support from underlying neural mechanisms. In this dissertation, we combined
computational modeling with neuroimaging methods such as fMRI to understand human
motor memories and decision making in motor learning.
First, we provided a unifying mechanism accounting for the spacing and the
contextual interference effects using multi-rate motor adaptation models: model
comparison and retention performance analyses showed how varying the learning schedule
influences the dynamics of fast and slow motor memories.
Second, we searched for neural correlates of the motor memories predicted by multi-rate
motor adaptation models. We hypothesized that different brain regions are correlated
with the activity of motor memories with different time scales. We observed a gradual
shift of the correlated areas from fronto-parietal to cerebellar areas with slower
time scales. In addition, using multi-voxel pattern analysis (MVPA), we found results
supporting separate representations of learning opposing rotations in the cerebellum.
Third, we searched for neural correlates of decision making in motor learning by
designing an fMRI experiment in which subjects searched for a hidden target through
movements. We hypothesized that subjects used a Bayesian strategy to update the next
movement direction, given feedback in the form of a binary reward, in order to search for
the target with exploration. We focused on behavioral results with computational models and
left the fMRI data analysis for future work, discussing the expected results based on similar
studies of the neural correlates of reward-prediction errors, exploration, and uncertainty in
motor learning.
Chapter 1
Introduction
1.1. Computational models and model-based fMRI in motor
learning
Suppose that you are learning to play a complicated musical piece such as a
Brahms piano concerto. You might want to quantify your performance to monitor how
you learn the piece during practice. The learning mechanism (i.e., the update rule for motor
commands) could depend on the feedback available during practice (e.g., the sounds you
hear, instructions from a teacher, or even no feedback).
First, the feedback could be given in the form of errors between the current and the
desired outputs, either in the motor space or in the task space; this type of
learning is called “supervised” learning. The update rule for motor commands reduces
the errors on each trial by correcting in the direction opposite to the errors. Second,
the feedback could simply evaluate the performance rather than provide full
information on the error; this type of learning is called “reinforcement” learning.
The update rule for motor commands seeks higher evaluations, or maximizes overall
rewards, by exploring the motor space or exploiting the currently available
information (Sutton and Barto, 1998).
One of the most widely studied forms of motor learning is motor adaptation, in which a
learner adapts to a new environment after an external perturbation alters the initial
mapping between motor outputs and their consequences. In this paradigm, a learner
returns to the initial level of motor performance by learning the altered mapping.
A visuomotor adaptation task is a popular paradigm in which the
consequences of motor outputs are provided as visual feedback, as in supervised
learning.
Adaptive behaviors in this task have been well accounted for by simple linear
state-space models, which suggest two distinct motor memories with different time
scales (Smith et al., 2006; Zarahn et al., 2008; Lee and Schweighofer, 2009). These
models have become popular because they suggest mechanisms underlying interesting learning
behaviors such as anterograde interference, savings, and spontaneous recovery.
It is also of interest to study how forgetting in motor memories with
different time scales, as predicted by the models, accounts for well-known effects in motor
learning such as the contextual interference and the spacing effects. We hypothesized that
forgetting in the fast motor memory, induced by either interference or time decay, would
lead to lower performance during practice but higher long-term retention after
practice. To test this hypothesis, we employed our previously proposed
1-fast-2-slow-process model (Lee and Schweighofer, 2009), in which a shared fast memory is
connected in parallel with two slow memories.
Although the multi-rate motor memory model can account for a wide range of
motor learning behaviors, little is known about the neural correlates of
the motor memories. It is even unclear whether the brain has qualitatively distinguishable
motor memories with different time scales (Wolpert et al., 2011). We hypothesized that
the neural correlates of the motor memories are spatially distributed across different regions
of the brain depending on the time scales of the memories. Although previous studies
suggested neural correlates activated at different stages of learning, their approach was
rather qualitative: by simply comparing the early and the late stages of learning, it could
not exploit the entire dynamics of learning. In contrast, we combined
computational models and functional MRI, which provides a more quantitative
analysis of the dynamic neural activity of motor memories over a reasonable range of
time scales. Model-based fMRI analysis has a great advantage over conventional
fMRI analysis in that it provides variables that cannot be observed directly in behavioral
data (O’Doherty et al., 2007), such as the internal states of motor memories, as regressors.
Since more than one candidate model may explain the behavioral data, we should select a
model whose predictions fit the behavioral data well with few parameters. The
selected model with the best-fitting parameters can then provide predictions of internal
variables as regressors for further fMRI analysis.
Since it is more likely that the brain has a distribution of time scales for motor
memories rather than two distinct time scales (i.e., fast and slow), we adopted and modified a
more general multi-time-scale model suggested by Körding et al. (2006). This
approach made it possible to find a gradual change in the brain areas correlated with
model-based regressors predicting the dynamic changes in the activity of motor memories.
The aforementioned analysis is based on correlations between model predictors
(i.e., regressors) and neural responses in individual voxels, under the assumption that
cognitive variables (e.g., the state of a memory) are encoded by the average neural
response in individual voxels. In contrast, multi-voxel pattern analysis (MVPA) exploits
patterns of the neural response across multiple voxels in a region of interest to extract
encoded cognitive variables (Norman et al., 2006). MVPA has been applied to study
how the brain represents multi-task motor learning as distinct patterns of brain activity
(Ogawa and Imamizu, 2013).
The visuomotor adaptation task discussed so far can be learned mostly through an
implicit process (Mazzoni and Krakauer, 2006) because of the minimal uncertainty in the
feedback information, i.e., feedback errors directly instruct the update of the next movement
plan. However, when the feedback is simply evaluative, as in reinforcement learning, the
task can be more complicated and involve more explicit processes because of the high
uncertainty in the feedback information. In this decision-making process, the brain needs to
estimate the values of possible movement plans while considering the uncertainty in the
expected rewards associated with the movement plans. In addition, the brain must
decide whether to exploit the currently available information (e.g., uncertainty and
expected rewards) or to explore to gain more information before making the movement.
Unfortunately, this internal cognitive process is not directly accessible from observed
behavioral data. This is why we need computational models, which can represent
the internal process as model variables. A seminal study (Schultz et al., 1997)
that identified dopamine as an indicator of reward-prediction error shows the
power of the model-based approach, which can explain the reward-prediction error as
an internal variable of a reinforcement learning model.
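As a minimal sketch of what "reward-prediction error as an internal variable" means, the following Python snippet implements a simple delta-rule value update. This is a generic textbook simplification chosen for illustration, not the model used in the studies cited above or in this dissertation; the function name and the learning rate are assumptions.

```python
def update_value(value, reward, learning_rate=0.1):
    """One delta-rule update (illustrative, hypothetical helper).

    The reward-prediction error is the received reward minus the
    currently expected value; the value moves a fraction of the way
    toward the reward.
    """
    rpe = reward - value
    new_value = value + learning_rate * rpe
    return new_value, rpe


# The prediction error is never observed in behavior directly; it only
# exists as a variable of the model, which is what makes it usable as a
# model-based regressor.
value = 0.0
for reward in (1.0, 1.0, 0.0):
    value, rpe = update_value(value, reward, learning_rate=0.5)
```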
Inspired by this seminal study, numerous model-based fMRI studies have been
performed to identify neural correlates of decision making and reinforcement learning
(O’Doherty et al., 2004; Daw et al., 2006; Yoshida and Ishii, 2006; Behrens et al., 2007;
Gershman et al., 2009). However, most studies focused on discrete decision-making tasks
(e.g., the n-armed bandit task), and no motor task was involved. We designed an
fMRI experiment to study the neural correlates of decision making in a motor task.
We mostly focus on the behavioral data analysis and model-based regressors, and leave
the expected results of the fMRI analysis to the discussion and future work.
In this dissertation, we describe three research projects: one combining modeling
with a behavioral study, and two combining modeling, behavior, and fMRI,
collectively called model-based fMRI studies. The first two studies share a similar
learning model (i.e., supervised learning) and task paradigm, whereas the last study
uses a different learning model (i.e., reinforcement learning) and task paradigm.
1.2. Organization of the dissertation
In Chapter 2, we propose a unifying theoretical framework to account for
both the contextual interference and the spacing effects using computational models. We
dissociate the two effects by designing three different learning conditions and show that the
dynamics of the fast and the slow memories are key to understanding both phenomena.
In Chapter 3, we search for neural correlates of motor memories with various time scales
using multi-rate adaptation models. The correlated brain regions shift from
fronto-parietal to cerebellar regions with longer time scales. In addition, using
MVPA, we find slow learning of the discrimination between two opposing visuomotor rotations,
which are represented by patterns of neural responses. In Chapter 4, we present
another model-based fMRI study, in which we are interested in the neural correlates of decision
making in a motor task. Although this study is incomplete, we discuss the behavioral results and
the expected results of the fMRI analysis. In Chapter 5, we summarize the dissertation and
discuss future work and a few interesting research ideas.
Chapter 2
Dynamics of fast and slow memories underlies both
the spacing and the contextual interference effects in
motor adaptation
2.1. Introduction
The spacing effect and the contextual interference effect are two robust behavioral
phenomena that are widely used to increase motor performance in activities such as
sports or rehabilitation after brain lesions. In the spacing effect, practicing a single task
with temporally spaced presentations leads to superior retention compared to blocked
presentations (Lee and Genovese, 1988). In the contextual interference effect,
intermixing different tasks reduces performance during training, but enhances retention
(Shea and Morgan, 1979; Schmidt and Lee, 1999).
Despite more than a century of research (Ebbinghaus, 1913; Pyle, 1919), the
underlying mechanisms of both effects are unclear, and a number of theories are still
entertained (Magill and Hall, 1990). In their “forgetting-reconstruction” theory of the
contextual interference effect, Lee and Magill (1983) proposed that short-term forgetting
between presentations of the same task results in stronger memories. Such a forgetting
view had previously been advanced to explain the spacing effect in verbal learning
(Cuddy and Jacoby, 1982). Lee and Magill (1985) then proposed a unifying mechanism
of the spacing and contextual effects, according to which conditions of practice that
promote forgetting in working memory between presentations, either via “spacings” or
“task interferences”, will depress acquisition performance, but promote retention.
Here, we used a combined experimental and computational approach to test this
unifying mechanism in motor adaptation, whereby the motor system returns to baseline
performance following an external perturbation. Computational studies have shown that
motor adaptation occurs via simultaneous update of a fast process that contributes to fast
initial learning, but forgets quickly, and a slow process that contributes to long-term
retention, but learns slowly (Smith et al., 2006). The fast process has been linked to
working memory, notably because it correlates with tests of visuo-spatial working
memory (Anguera et al., 2009; Schweighofer et al., 2011), and because it can be
interfered with by other tasks (Keisler and Shadmehr, 2010). Lee and Schweighofer
(2009) extended this work to multiple adaptation: in their model, a single fast process,
which is highly prone to interference, is arranged in parallel with, and competes for errors
with, multiple slow processes, which are protected from interference.
The present study is based on theoretical predictions from the Lee and
Schweighofer model: 1) the spacing effect is due to decay of the fast process with the
passage of time, and 2) the contextual interference effect is due to even quicker decay of
the fast process caused by the combined effect of the passage of time and interference.
To test these predictions, we first designed a visuo-motor experiment with three conditions:
a fast blocked condition (FBK), in which a single motor adaptation task was presented with
a short inter-trial interval (ITI); a time-spaced condition (TSP), in which a single task was
presented with a longer ITI; and an alternating condition (ALT), in which two opposite tasks
were presented alternately, with the same single-task ITI as in TSP. We then used a computational
modeling approach to study the dynamics of fast and slow memories in both the spacing
and the contextual interference effects.
2.2. Materials and Methods
2.2.1. Participants
Forty-six neurologically intact right-handed subjects (10 men and 36 women, 21-
32 years old) participated in the study. We randomly assigned the subjects to one of three
different experimental conditions: FBK, TSP, and ALT; with a predefined goal of 15
subjects per group. Participants were excluded from the study if the standard deviation of
performance in baseline trials following the first 80 trials of the familiarization session
was greater than 10 degrees (see below). One participant was excluded according to this
criterion. All subjects were naïve to the purpose of the study, and signed an informed
consent to participate in this study, which was approved by the IRB at the University of
Southern California.
2.2.2. Experimental procedure
Subjects sat facing a computer monitor with the right arm supported with a
JAECO/Rancho arm support. Subjects controlled a cursor shown on the screen by
moving a pen on the surface of a digitizing tablet (sampling rate: 200Hz, Wacom Tech
Corp.). At each trial, subjects were instructed to make straight and uncorrected out-and-
back movements to hit a target. An opaque shield blocked subjects from seeing their hand
or arm.
Figure 2.1. Experiment design: A, Training schedules of the three experimental groups.
Note that the schedule for task 1 (T1) in ALT is the same as the schedule for task 1 in
TSP. The only difference between ALT and TSP is that task 2 (T2) is intercalated
between two presentations of T1. B, The experiment was comprised of three blocks: 30
baseline trials, 60 training trials per task (which depend on the condition, as shown in A),
and retention test trials, blocks of 5 trials given 2 min, 5 min, 10 min and 1 day after the
end of training trials. C, Distribution of targets: Targets randomly appeared either upward
or leftward depending on a task with a color cue, green or blue. To hit a target, subjects
should move toward the rotated position (goal position) from the actual target position,
+45º or -45º depending on a task.
The experiment ran across two consecutive days, with the sessions being
separated by 22 to 26 hours (~ 1 day). On the first day, subjects first performed a
familiarization session of 160 baseline trials (5.2 s ITI) with no transformation between
hand and cursor movements. Feedback (see below for details) was provided in all trials
except in two blocks of 10 trials in the middle and the end of the session, to familiarize
the subjects with the upcoming retention tests. The training session started about 1 min
after the familiarization session. The training session comprised two blocks: a block of
30 baseline trials, followed by a block of training trials in the FBK, TSP, or ALT
condition (Figure 2.1A), with 60 trials per task. Then, four retention tests, each of five
trials (ITI = 9.2 s) without feedback, were given at 0 min, 2 min, 5 min, and 10 min after
the end of training (Figure 2.1B). Between retention tests, subjects were instructed to
remain seated in front of the screen to keep motor memory in the active state and prevent
possible transformation of the fast states into slow states (Criscimagna-Hemminger and
Shadmehr, 2008). On the second day, subjects performed a last retention test of 5 trials
(ITI = 9.2 s) without feedback (Figure 2.1B). Ten seconds before each test, a beep signal
was played to alert the subject of the upcoming test.
In training trials, we altered the mapping between the actual hand position and
the cursor position via a counterclockwise (+45˚) visuo-motor rotation for task 1, and a
clockwise (-45˚) rotation for task 2. Task 1 and 2 were counterbalanced across subjects in
the different conditions. The schedules of the training trials varied according to the
experimental conditions. In both FBK and TSP conditions, there were 60 training trials of
a single task, either task 1 or task 2. The ITI in FBK was short and lasted only 5.2 s, and the ITI in
TSP was long and lasted 18.4 s. For these two groups, the task that was practiced was
presented in the retention tests. In ALT, tasks 1 and 2 were presented alternately with
an ITI of 9.2 s, with 60 trials per task (120 trials in total). Because the ITI in ALT is
half the ITI in TSP (see Figure 2.1A), the first task of ALT was presented with the same
timing as the task in TSP (see Taylor and Rohrer, 2010 for a similar design but for
learning how to solve mathematical problems). In ALT, the task sequence was
counterbalanced across subjects, and the first task presented was also that presented in
retention tests.
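The visuo-motor rotation described above can be sketched as a plane rotation of the hand position about the home position. The snippet below is an illustration only: the function names are hypothetical, the home position is assumed to be the origin, and positive angles are taken as counterclockwise, following the sign convention stated for task 1.

```python
import math


def rotate_cursor(hand_x, hand_y, rotation_deg):
    """Map a hand position to the displayed cursor position by rotating it
    about the home position (assumed to be the origin).
    Positive angles are counterclockwise."""
    theta = math.radians(rotation_deg)
    cursor_x = hand_x * math.cos(theta) - hand_y * math.sin(theta)
    cursor_y = hand_x * math.sin(theta) + hand_y * math.cos(theta)
    return cursor_x, cursor_y


def direction_deg(x, y):
    """Direction of a position vector in degrees, in [0, 360)."""
    return math.degrees(math.atan2(y, x)) % 360.0


# Task 1 (+45 degree rotation): a 10 cm hand movement aimed at 45 degrees
# moves the cursor toward a target located straight up (90 degrees).
hand = (10 * math.cos(math.radians(45)), 10 * math.sin(math.radians(45)))
cursor = rotate_cursor(hand[0], hand[1], rotation_deg=45.0)
cursor_direction = direction_deg(cursor[0], cursor[1])
```

In other words, full adaptation to a +45˚ rotation means aiming the hand 45˚ clockwise of the visual target.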
The targets appeared at a random position (uniform distribution) either upward
(task 1, green in Figure 2.1C), along an arc ranging from 60˚ to 120˚, or leftward (task 2,
blue in Figure 2.1C), along an arc ranging from 150˚ to 210˚. Note that after complete
adaptation, the hand position required to hit the targets (goal positions in Figure 2.1C) for
tasks 1 and 2 were in two opposite ranges centered on 45˚ and 225˚, respectively.
Separation of the workspace (i.e., range of target and goal positions, see Figure 2.1C )
between the tasks allowed the subjects to succeed in dual adaptation (Woolley et al., 2007,
2011), although learning two opposing adaptations simultaneously is difficult (Osu et al.,
2004; Krakauer et al., 2006). Indeed, in a pilot experiment where the two tasks shared a
common workspace, we found that almost half the subjects (12 out of 25) could not learn
the tasks (results not shown).
We introduced such variability in the target position along each arc because we
noticed in pilot studies that subjects could reach the target in very few trials if a single
target was displayed upward and/or leftward, presumably due to cognitive strategies
(Mazzoni and Krakauer, 2006). The randomized target position is also beneficial to avoid
directional biases toward the repeated movement (Verstynen and Sabes, 2011), which, if
associated with successful error reduction, increases savings in relearning (Huang et al.,
2011). By randomizing target position within and across subjects, we minimized those
effects, which are otherwise not taken into account by our adaptation model. In baseline
trials, targets were always red and appeared randomly along the leftward or the upward
arcs.
At the beginning of each trial, a white cross appeared at the center of the screen.
After several seconds, depending on the specific ITI, a target (colored disk of 0.7 cm
radius) appeared at 10 cm from the center either upward or leftward, and signaled the
start of the movement. Subjects had to move the 1 cm-cross-shaped white cursor to the
target within 1.5 s, otherwise the trial was considered a missed trial. Feedback was
presented in two forms: First, the cursor was displayed early in the movement, while
inside an invisible disk of 3.3 cm centered at the home position (defined as the initial
position of the cursor at the onset of target). Second, the cursor re-appeared 1.5 s after the
start of movement for 0.5 s. To encourage faster reactions after appearance of the target,
the cursor color changed to yellow if the subjects did not move within 1.0 s. To help
subjects move back to the home position after feedback presentation, we displayed a
white circle whose diameter was proportional to the distance between the cursor and the
home position.
The outcome measure was the directional error between the target direction and
the final cursor direction, which was computed at each trial.
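One way to compute such a directional error is sketched below. The sign convention (cursor direction minus target direction, counterclockwise positive) and the wrapping to [-180, 180) are assumptions made for this illustration; the text does not specify them, and the function name is hypothetical.

```python
import math


def directional_error_deg(target_xy, cursor_xy):
    """Signed angle in degrees from the target direction to the final cursor
    direction, both measured from the home position (origin).
    The result is wrapped to the interval [-180, 180)."""
    target_dir = math.degrees(math.atan2(target_xy[1], target_xy[0]))
    cursor_dir = math.degrees(math.atan2(cursor_xy[1], cursor_xy[0]))
    return (cursor_dir - target_dir + 180.0) % 360.0 - 180.0


# Target straight up, cursor ending 45 degrees counterclockwise of it.
error = directional_error_deg(
    (0.0, 10.0),
    (10 * math.cos(math.radians(135)), 10 * math.sin(math.radians(135))),
)
```

Wrapping matters for the leftward targets, whose directions lie on both sides of 180˚.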
2.2.3. Computational modeling
We previously proposed a multiple adaptation model with a parallel structure of
fast and slow motor memories (Lee and Schweighofer, 2009). The model contains one
common fast-updating fast-decaying process and multiple separate slow-updating slow-
decaying processes. The motor output at each trial n is given by:
$$y(n) = x_f(n) + \mathbf{x}_s(n)^T \mathbf{c}(n) \qquad (2.1)$$
where $x_f$ is a fast learning process and $\mathbf{x}_s$ is a slow learning process with multiple
internal states. In the case of two tasks, there are two slow states, $\mathbf{x}_s = [x_{s1} \; x_{s2}]^T$.
In the original model, the contextual cue $\mathbf{c}$ addresses the internal states such that
$\mathbf{c} = (1 \; 0)^T$ for task 1 and $\mathbf{c} = (0 \; 1)^T$ for task 2, where we assumed perfect
switching, i.e., no interference or transfer between the states of the slow process.
Here, to estimate the amount of memory decay as a function of time instead of
trials, we updated the model by replacing the forgetting rate parameters of the model with
exponential decay terms (Ethier et al., 2008; Tanaka et al., 2012). The update equations
from trial $n$ to trial $n+1$ for the fast and the slow states are as follows:
$$x_f(n+1) = x_f(n)\, e^{-T(n)/\tau_f} + \beta_f\, e(n)$$
$$\mathbf{x}_s(n+1) = \mathbf{x}_s(n)\, e^{-T(n)/\tau_s} + \beta_s\, e(n)\, \mathbf{c}(n) \qquad (2.2)$$
where $T(n)$ is the inter-trial interval following trial $n$ (in sec); $\beta_f$, $\tau_f$, $\beta_s$, and $\tau_s$
are four free parameters, which are the learning rates ($\beta$) and time constants ($\tau$) for the
fast and slow learning processes. The motor error $e$ is the difference between the
external perturbation $f$ and the motor output $y$, i.e., $e(n) = f(n) - y(n)$. Because we assumed
no difference in the task difficulty between task 1 and 2, we used the same learning rate and time
constants for the slow learning processes of the two tasks. Note that in single-task conditions,
such as FBK or TSP, the model reduces to the 1-fast-1-slow model of Smith et al. (2006)
(with the exponential decay terms).
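A minimal simulation sketch of the model, under stated assumptions, is given below. It takes the output equation (2.1) as the fast state plus the cue-weighted slow states, and the update equations (2.2) as exponential decay over the inter-trial interval followed by an error-driven update. The symbol names (beta for learning rates, tau for time constants) and all parameter values are illustrative assumptions, not the fitted values reported in this chapter.

```python
import math


def simulate(perturbations, tasks, itis, beta_f, tau_f, beta_s, tau_s, n_tasks=2):
    """Simulate a 1-fast-n-slow model with time-based decay (illustrative).

    perturbations[n]: external perturbation f(n) on trial n (deg)
    tasks[n]:         index of the task cued on trial n (0 or 1)
    itis[n]:          inter-trial interval T(n) following trial n (s)
    Returns the list of motor outputs y(n).
    """
    x_f = 0.0
    x_s = [0.0] * n_tasks
    outputs = []
    for f, task, T in zip(perturbations, tasks, itis):
        # Perfect switching: the cue selects exactly one slow state.
        c = [1.0 if i == task else 0.0 for i in range(n_tasks)]
        y = x_f + sum(xs * ci for xs, ci in zip(x_s, c))  # Eq. 2.1
        e = f - y                                         # motor error
        outputs.append(y)
        # Eq. 2.2: decay over the interval T(n), then error-driven update.
        x_f = x_f * math.exp(-T / tau_f) + beta_f * e
        x_s = [xs * math.exp(-T / tau_s) + beta_s * e * ci
               for xs, ci in zip(x_s, c)]
    return outputs


# Hypothetical parameters (not fitted): learning rates and time constants in s.
params = dict(beta_f=0.2, tau_f=20.0, beta_s=0.05, tau_s=2000.0)
n = 60
fbk = simulate([45.0] * n, [0] * n, [5.2] * n, **params)   # short ITI
tsp = simulate([45.0] * n, [0] * n, [18.4] * n, **params)  # long ITI
```

Passing the inter-trial interval per trial, rather than a fixed per-trial retention factor, is what lets a single parameter set describe schedules with different ITIs.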
In dual-task conditions such as ALT, the common fast process of the original Lee
and Schweighofer (2009) model is interfered with each time the other task is
presented. In contrast, the slow process of each task is “protected” from interference (as
later shown by Pekny et al., 2011) via the contextual input. Because it is possible that this
protection is not perfect, we also considered an extended model that accounts for some
degree of interference or transfer between the slow processes, with a contextual cue
vector such as $\mathbf{c} = (1 \; q)^T$ or $\mathbf{c} = (q \; 1)^T$ in Equations 2.1 and 2.2, where the free
parameter $q$ ranges from -1 (full transfer) to 1 (full interference). Positive values of $q$ represent
interference because the sign of the state value of task 2, $x_{s2}$, is opposite to that of task 1,
$x_{s1}$, in our experiment design with opposite rotations, -45˚ and +45˚, respectively.
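The effect of the parameter q can be illustrated with a short sketch. It assumes the cue vector is (1, q) for task 1 and (q, 1) for task 2 and that the motor output is the fast state plus the cue-weighted slow states; the slow-state values used here are hypothetical numbers, not experimental results.

```python
def cue_vector(task, q=0.0):
    """Contextual cue for two tasks: (1, q) for task 1 (index 0) and
    (q, 1) for task 2 (index 1). q = 0 recovers perfect switching."""
    return [1.0, q] if task == 0 else [q, 1.0]


def motor_output(x_f, x_s, c):
    """Fast state plus cue-weighted slow states."""
    return x_f + sum(xs * ci for xs, ci in zip(x_s, c))


# Hypothetical slow states after training on opposite rotations (deg).
x_s = [30.0, -30.0]
outputs = {q: motor_output(0.0, x_s, cue_vector(0, q)) for q in (-1.0, 0.0, 1.0)}
# q = +1: the opposite-signed state of task 2 cancels task 1 (interference).
# q = -1: the state of task 2 adds to task 1 (transfer).
```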
2.2.4. Model parameter fitting and model comparison
We first estimated the directional bias of movements for each subject by taking
the mean of the movement directions in the upward or leftward baseline trials, depending
on the task, and subtracted this bias from the actual movement directions. We then took
the median of the bias-corrected movements from 15 subjects for each trial of each
experiment group. The advantage of the median over the mean is to reduce the effect of
outliers, comprised of large overshoots or undershoots. For parameter estimation, we
used the MATLAB fmincon function, which minimizes the root mean squared error
(RMSE) between the observed median data and the model prediction. We also estimated
95% confidence intervals of model parameter estimates by using bootstrapped data. This
method is more accurate than standard parametric methods, especially for small sample
sizes with an unknown sample distribution (DiCiccio and Efron, 1996).
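The group-median and bootstrap steps can be sketched as follows. This is a simplified illustration: it uses a plain percentile interval, which may differ from the interval construction actually used, and the `estimator` argument is a hypothetical stand-in for the model-fitting step (which in the study was done with MATLAB's fmincon).

```python
import random
import statistics


def median_curve(subject_curves):
    """Per-trial median across subjects; each curve is a list of per-trial
    bias-corrected movement directions."""
    return [statistics.median(trial) for trial in zip(*subject_curves)]


def bootstrap_ci(subject_curves, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for estimator(median_curve(sample)),
    resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(subject_curves)
    estimates = []
    for _ in range(n_boot):
        sample = [subject_curves[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(median_curve(sample)))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 3 subjects, 2 trials. The estimator here is just the curve mean,
# standing in for a fitted model parameter.
curves = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
lower, upper = bootstrap_ci(curves, estimator=lambda c: sum(c) / len(c), n_boot=200)
```

Resampling whole subjects, rather than individual trials, keeps each subject's learning curve intact within a bootstrap sample.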
In the Lee and Schweighofer (2009) model, alternation of opposite tasks in ALT
condition yields near zero update of the fast process due to interferences (see
Schweighofer et al. (2011) for qualitative simulations); in the other conditions, FBK and
TSP, the fast process has greater than zero activity. We therefore predicted that the
performance of each task in ALT is well fitted by a model containing only two
independent slow states, and that the performance of the single task in FBK and TSP is
well fitted by a model with a fast and a slow process (with a single state in each).
We adopted a 2-stage model comparison to test these predictions based on
computations of the Bayesian Information Criterion (BIC) (Eq. 2.3). In the first stage of
the model comparison, we tested the prediction that adaptation to two opposite
perturbations in ALT is due to two independent slow processes. For this, we compared
the original Lee and Schweighofer (2009) model to an extended model with a contextual
cue vector such as $\mathbf{c} = (1 \; q)^T$ or $\mathbf{c} = (q \; 1)^T$ (see above). We compared the BICs of the
1-state (2 independent slow processes, $k$ = 2), 1-state with $q$ (2 slow processes, $k$ = 3),
2-state (1 fast, 2 independent slow processes, $k$ = 4), and 2-state with $q$ (1 fast, 2 slow
processes, $k$ = 5) models, using all the training and retention trials (N = 140) of ALT (see
Eq. 2.3).
We performed the second stage of the model comparison after verifying in the
first stage that adaptation in ALT is well supported by two independent slow states (see
Results). Here, we tested the prediction that in FBK and TSP, unlike in ALT, there is
greater than zero activity in the fast process. For this, we compared the BICs of the
2-parameter (1-state) model and the 4-parameter (2-state) model for each experimental
condition (FBK, TSP, and ALT), using the training and retention trials for a single task (N
= 80). Note that using single-task trials in ALT in this model comparison is valid given
that we showed in the first stage of the model comparison that a single independent slow
process is involved in each task in ALT. Also note that for ALT, although the 2- vs.
4-parameter model comparison was previously performed in the first stage using N = 140,
we compared these models with N = 80 trials in this second stage to match FBK and TSP.
For these model comparisons, we generated 10,000 bootstrapped data sets by
randomly selecting 15 subjects with replacement and fitted the candidate
models to each bootstrapped data set. We then calculated the BIC of each bootstrapped data set
for the candidate models (k-parameter models: k = 2, 3, 4, and 5) and the conditions (i.e.,
FBK, TSP, and ALT) as follows:
$$\mathrm{BIC} = -2\log_e L(\theta) + k\log_e N \qquad (2.3)$$
where $L(\theta)$ is the likelihood of the observed data given an optimized parameter set $\theta$
and N is the number of trials used for data fitting. We selected the model with the lowest
BICs by performing paired bootstrap t tests (DiCiccio and Efron, 1996; see details in Lee
and Schweighofer, 2009), in which 1- p is equivalent to the proportion of the
bootstrapped data preferring (i.e., with lower BIC) the model of interest versus the
compared model.
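As a rough illustration of Eq. 2.3 and of the bootstrap comparison, the sketch below computes a BIC from model residuals and the proportion-based p value. It is written in Python with hypothetical function names. The i.i.d. Gaussian error model used for the likelihood is an assumption, since the text does not specify the likelihood function.

```python
import math

def gaussian_bic(residuals, k):
    """BIC = -2 ln L + k ln N, assuming i.i.d. Gaussian residuals (an assumption)."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the error variance
    neg2_log_lik = n * (math.log(2 * math.pi * sigma2) + 1)
    return neg2_log_lik + k * math.log(n)

def bootstrap_p(bics_model, bics_other):
    """p such that 1 - p is the proportion of bootstrapped data sets in which
    the model of interest has the lower BIC."""
    wins = sum(a < b for a, b in zip(bics_model, bics_other))
    return 1 - wins / len(bics_model)

# Extra penalty paid by the 4-parameter model over the 2-parameter model at N = 80.
penalty_gap = (4 - 2) * math.log(80)
```

Under these assumptions, with N = 80 the two additional parameters of the 4-parameter model must lower $-2\log_e L$ by more than about 8.76 for that model to be preferred.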
2.2.5. Measurement of initial learning rate
The rationale for this analysis is that the activity of the fast process will largely
influence change in performance early in training. The prediction is that initial learning
speed will be greatest in FBK, then in TSP, and smallest in ALT. We computed how fast
error decreased over the initial 15 trials of one task in the training block for each condition
(with the MATLAB robustfit function). The slope represents the error decrement per trial,
which is indicative of the initial learning rate (in degrees per trial).
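The slope measure can be sketched as below. Note that this Python illustration swaps in ordinary least squares for the robust regression (MATLAB robustfit) used in the study, so it has no protection against outliers. The function names and toy data are hypothetical.

```python
def ols_slope(y):
    """Ordinary least-squares slope of y against trial index 0..len(y)-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx

def initial_learning_rate(errors, n_initial=15):
    """Error decrement per trial (deg/trial): minus the slope over the early trials."""
    return -ols_slope(errors[:n_initial])

# Hypothetical errors that shrink by exactly 1.5 degrees per trial from 40 degrees.
toy_errors = [40.0 - 1.5 * t for t in range(20)]
rate = initial_learning_rate(toy_errors)
```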
2.2.6. Analysis of retention
Retention data were obtained by taking the median of the 5 trials within each of
the five tests at 0, 2, 5, 10 minutes, and at 1 day after training. We applied a series of
mixed model analyses to evaluate retention over three different time periods:
1) Retention between 0 and 2 minutes after training. The rationale for the
analysis in the first 2 minutes is that most of the decay in performance will be due to the
fast process decay. Our model predicts that the activity of the fast process at the end of
training is largest in FBK, then in TSP, and smallest in ALT. Thus, the first prediction is
that the amount of forgetting will be largest in FBK, then in TSP, and smallest in ALT.
2) Retention at 0, 2, 5, and 10 minutes after training. The rationale for this
analysis is that the rate of decay in performance will be affected by the relative
proportions of fast and slow process activity. Because our model predicts that the slow
process contributes most to performance in ALT, then in TSP, and least in FBK, the decay
rate is expected to be fastest in FBK, then in TSP, and slowest in ALT. We measured decay
rate in degrees per min.
3) Retention between 10 minutes and 1 day after training. Because the slow
process will be more activated at the end of training in TSP and ALT, the first prediction
is that overall performance at 10 minutes and 1 day post-training will be higher for TSP
and ALT compared to FBK. The second prediction is that performance in the 1-day retention
test will correlate with performance at 10 minutes post-training because the slow process
has been shown to predict 1-day retention (Joiner and Smith, 2008). The third prediction is
that there will be a consolidation effect following both the TSP and ALT schedules, but not
in FBK, because the conditions of training have been shown to affect long-term
consolidation (Tanaka et al., 2010; Kantak et al., 2010).
In these mixed model analyses, final model choices (covariance structures for
fixed effects, inclusion of random effects, and covariates) were based on the lowest BIC.
For fixed effects, we used training conditions (groups) as factors and time after training
as covariates, with 2-way interactions. For random effects, we used intercepts to
compensate for the inter-subject variability in performance. Four covariance structures
for main effects were considered a priori: auto-regressive, diagonal, scaled identity, and
compound symmetry. In all analyses, the random effect intercepts were significant (p <
0.0001), and the covariance structure that minimized the BIC was the (scaled) identity
matrix, with all covariance parameters significant. In addition, because final error in
performance (ERR) on the last five training trials had a highly significant effect on
retention and minimized the BIC, we also included ERR as a second covariate. We used the
restricted maximum likelihood method and SPSS 18 for these statistical analyses of
retention. Since the hypotheses were all pre-specified, no adjustments were made to the
reported p values. Our criterion for significance was p < 0.05.
2.3. Results
Figure 2.2 compares measured performance (stars: median, shaded region: 25-75%
quartiles) for the three groups: FBK, TSP, and ALT. The comparisons of interest between
groups were the initial learning rates and the changes of errors in retention.
Figure 2.2. Observed subject behavior and model fits in three experimental conditions,
FBK (A), TSP (B), and ALT (C). Star symbols: median subject performance. Shaded area:
inter-quartile (25-75%) ranges. Solid black line: motor output, gray dotted line: fast
process, brown dotted line: slow process (See Eq. 2.1).
2.3.1. Initial learning rate depends on the activation of the fast process
As seen in Figure 2.3A, initial learning rates, in units of degrees per trial (˚/trial),
differed across experiment groups (mean ± SEM; FBK: 1.85 ± 0.244˚/trial, TSP: 1.53 ±
0.227˚/trial, ALT: 0.963 ± 0.238˚/trial; one-way ANOVA, F(2,42) = 3.64, p = 0.034). As
predicted, FBK showed a faster initial learning rate than ALT (two-tailed t-test, p = 0.014).
Although the initial learning rate in TSP was not significantly different from that of the
other groups (FBK/TSP, p = 0.34 and ALT/TSP, p = 0.096), its mean value was between those
of the two other groups. There was no difference in final performance across groups,
although there was a trend for lower performance in ALT compared to FBK (mean ± SEM; FBK:
37.4 ± 2.18˚, TSP: 35.4 ± 1.70˚, ALT: 33.0 ± 2.27˚; one-way ANOVA, F(2,42) = 1.14, p =
0.330; difference between FBK and ALT, p = 0.140).
Figure 2.3. First-day learning and forgetting: A, Initial learning rates (performance
increase per trial) obtained from the initial 15 trials in the training block. B, Forgetting in
the two minutes post-training. Forgetting was measured as a difference between zero and
two minute retention tests post-training. Negative sign indicates actual forgetting. C,
Forgetting rates in the 10 minutes following training. Forgetting rates were calculated
using a mixed model analysis with retention tests data at 0, 2, 5, and 10 minutes after
training (see Materials and Methods).
The first model comparison analysis using the ALT data for the 2-, 3-, 4-, and
5-parameter models showed that BIC was lowest for the 2-parameter model (bootstrap
t-test, $p < 10^{-4}$ for all other comparisons; Table 2.1A). Not a single bootstrapped
data set preferred any of the other models, which strongly supports our model with one fast
and two independent slow processes. The additional parameter q in the extended model only
marginally decreased the model fitting errors, on average by 0.0342˚ from k = 2 to k = 3
and by 0.0457˚ from k = 4 to k = 5.
The second model comparison analysis using data for all experimental conditions
and for the 2-parameter (k = 2) and 4-parameter (k = 4) models showed that: 1) BIC was
significantly lower for the 4-parameter model for FBK (bootstrap t-test, p = 0.0022;
Table 2.1B); 2) there was no significant difference in BIC between the two models for
TSP (bootstrap t-test, p = 0.462); and 3) for ALT, BIC was significantly lower for the
2-parameter model (bootstrap t-test, $p < 10^{-4}$), as expected from the first model
comparison. In ALT, not a single bootstrapped data set preferred the 4-parameter model,
which decreased RMSEs only minimally, by 0.0002˚ on average, compared to the 2-parameter
model.
A
                 k = 2             k = 3             k = 4             k = 5
ALT (N = 140)    245.35            249.38            255.23            258.95
                 (220.67~282.04)   (224.37~286.44)   (230.55~291.93)   (233.91~296.17)

B
                 k = 4                     k = 2                     p
FBK (N = 80)     136.77 (117.77~162.29)    159.84 (146.99~174.39)    0.0022
TSP (N = 80)     135.41 (119.43~156.28)    135.25 (121.03~154.07)    0.462
ALT (N = 80)     152.60 (135.20~173.84)    143.84 (127.43~168.07)    <10^-4

Table 2.1. Mean and 95% confidence interval of BICs from 10,000 bootstrapped data sets.
(A) ALT group, using N = 140 trials. The 2-parameter model had significantly lower BICs
than all the other models (k = 3, 4, 5). (B) All three experimental conditions (FBK, TSP,
and ALT), using N = 80 trials. The 4-parameter model in FBK and the 2-parameter model in
ALT had significantly lower BICs than the other model; there was no difference between the
two models in TSP.
Because the model comparisons provide strong evidence for lack of activation of
the fast process in ALT, i.e., $x_f \approx 0$, the two parameters estimated using ALT data
are those of the slow learning process: its learning rate and its time constant $\tau_s$.
Using these parameters, we then estimated the parameters of the fast learning process, its
learning rate and its time constant $\tau_f$, from the training data of the FBK and
TSP groups. We reserved the retention data of these groups to test predicted retention.
The estimated parameters using all the 15 subjects, with their 95% confidence intervals
from the bootstrapped data, were: slow-process learning rate = 0.0525 (0.0404-0.0817) and
$\tau_s$ = 1201 s (670.0-2214 s); fast-process learning rate = 0.160 (0.0898-0.4273) and
$\tau_f$ = 51.4 s (11.5-162.3 s), with an overall fitting error (RMSE) of 3.82˚ (3.62-6.46˚).
In Figure 2.2, we plotted the model performance with the estimated mean
parameters, as well as the predicted activity of the fast and slow learning processes,
which showed distinct patterns depending on the experimental condition. Activation of
the fast process in ALT was almost zero, i.e., $x_f \approx 0$, when compared with that of
the slow process (the slightly negative activation of the fast process indicates the effect
of net interference from the second, opposite, task). TSP showed intermediate activation of
both fast and slow processes compared to ALT and FBK.
2.3.2. Short-term retention is determined by relative proportion of fast and slow
processes
Using the estimated time constants for the fast process, $\tau_f$, and the slow process,
$\tau_s$, we predicted retention at 0, 2, 5, and 10 minutes after training in FBK and TSP,
and compared them with retention in ALT. The RMSEs between the model predictions and
the median retentions of 15 subjects for all 20 trials were 5.83˚ (FBK) and 6.17˚ (TSP)
(see Figure 2.2A, B). For comparison, the RMSE for ALT (a fitting error in this case,
since the retention data were also used for fitting) was 5.38˚. The model predicted the
fastest decay in FBK due to large fast process activity at the end of training (Figure 2.2A),
intermediate decay in TSP (Figure 2.2B), and the slowest decay in ALT because
performance is mostly due to the slow process activity (Figure 2.2C).
Mixed model analysis for retention data between the first two tests (at 0 and 2
minutes) showed that training condition (p = 0.026), time after training (p = 0.001), and
the interaction of condition and time (p = 0.046) contributed to decay significantly.
Forgetting was significantly different from 0 in FBK (-7.21 ± 2.03˚, p = 0.001) and TSP
(-5.24 ± 2.03˚, p = 0.013), but not different from 0 in ALT (-0.057 ± 2.0˚, p = 0.97).
Forgetting was larger in FBK compared to ALT (p = 0.017), with a trend towards
being greater in TSP than in ALT (p = 0.078), but was not different between FBK and TSP
(p = 0.50; see Figure 2.3B). Performance at time 0 was greater in TSP than in ALT (p =
0.01) and marginally greater than in FBK (p = 0.051).
Mixed model analysis of retention over tests at 0, 2, 5, and 10 minutes with
condition as factor and time as covariate showed that condition was significant (here
again barely, p = 0.047), time was not significant (p = 0.33), but the interaction of
condition and time was highly significant (p = 0.006) in affecting decay rate. Decay rate
was faster in both FBK (-0.32 ± 0.29˚/min, p = 0.024) and TSP (-0.55 ± 0.28˚/min, p =
0.002) than in ALT, for which it was not different from zero (0.33 ± 0.26˚/min, p = 0.21)
(Figure 2.3C). There was no difference between decay rates in TSP and FBK (p = 0.41).
Overall error in the four retention tests following training was smaller in TSP than in ALT
(p = 0.035) and in FBK (p = 0.031).
2.3.3. Time-spacing induces overnight consolidation
Model simulations predict that, in all conditions, performance at 10 minutes post-training
is mostly due to the slow process because the fast process is near zero at this time
(which is supported by the mean time constant of the fast process, $\tau_f$ = 51.4 s; see
Figure 2.2). We thus tested whether the individual performance in the 10-minute retention
test predicts performance in the 1-day retention test. Note that taking actual performance
in the 10-minute test for each subject, and not the slow process estimated from group data,
allowed us to obtain individual predictions of 1-day performance. We found a strong
linear relationship (Figure 2.4A) between performance in the 10-minute test and
performance in the 1-day test, with the slope parameter = 1.16 ($p < 10^{-8}$, MATLAB
function robustfit).
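The claim that the fast process is near zero 10 minutes after training can be checked with a short calculation using the time constants reported above (51.4 s and 1201 s). The Python sketch below assumes pure exponential decay with no further updates during the delay, which ignores any learning on the retention test trials themselves. The function and constant names are hypothetical.

```python
import math

TAU_FAST_S = 51.4    # estimated fast time constant (s), from the text
TAU_SLOW_S = 1201.0  # estimated slow time constant (s), from the text

def remaining_fraction(elapsed_s, tau_s):
    """Fraction of a state left after elapsed_s seconds of exponential decay."""
    return math.exp(-elapsed_s / tau_s)

fast_2min = remaining_fraction(120, TAU_FAST_S)
fast_10min = remaining_fraction(600, TAU_FAST_S)
slow_10min = remaining_fraction(600, TAU_SLOW_S)
```

Under this assumption, roughly a tenth of the fast state survives 2 minutes and essentially none survives 10 minutes, whereas about 60% of the slow state remains at 10 minutes.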
Do the conditions of practice have an effect on long-term forgetting? After
removing the retention data of three subjects who appeared to have failed to associate the
color cue with the task (see below), we calculated the long-term (1–day) forgetting as the
difference between performances in the 10-minute test and the 1-day test in the three
experiment groups (Figure 2.4B). Mixed model analysis showed that, overall,
performance was not different across groups (p = 0.75), and time was not significant (p =
0.50). However, the interaction of condition and time was significant (p = 0.034). TSP
showed an increase in performance from the 10-minute to the 1-day test, which presumably
indicates overnight consolidation (6.10 ± 1.96˚, p = 0.015). However, performance in FBK
and ALT did not change from the previous day (FBK: -2.09 ± 2.35˚, p = 0.35; ALT:
-1.30 ± 2.53˚, p = 0.577). Forgetting in TSP was significantly less (TSP actually showed an
increase in performance) than in the other two groups (TSP/FBK: p = 0.014, TSP/ALT: p
= 0.031), but there was no difference in forgetting between ALT and FBK (p > 0.5)
(Figure 2.4B).
Figure 2.4. Second-day retention and consolidation: A, Correlation between performance
at the 10-minute and the 1-day retention tests. Data from three subjects (surrounded by
the red circle) were excluded as outliers as they presumably failed to recognize the task in
the 1-day test (see Results). B, Change in performance between the 10-minute and 1-day
retention tests. Performance change was obtained from the difference between performance in
the 10-minute test and in the 1-day test. Positive sign indicates an increase in performance
on the second day (consolidation).
Identification of subject outliers: During training, the color of the target was used as a
contextual cue to distinguish the different tasks from baseline. However, three subjects
appeared to fail to remember this color cue on the 1-day retention test, and generated
baseline-like movements instead. As shown in Figure 2.4A, performance for these three
subjects at 1 day deviated largely from the regression line despite excellent median
performance (around 45 degrees) in the 10-minute post-training test. Only for these three
subjects was the long-term forgetting more than two standard deviations from the mean of
the 45 subjects (Gaussian distributed, Kolmogorov-Smirnov test, p = 0.23). In addition,
analysis of the weights used in robustfit showed that these three subjects had weights
below 0.5, while the other 42 subjects had weights above 0.6.
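The two-standard-deviation criterion can be sketched as below. This Python illustration uses the sample standard deviation, which is an assumption, as the text does not say which estimator was used. The function name and the change scores are hypothetical and are not the study data.

```python
import statistics

def flag_outliers(values, n_sd=2.0):
    """Return indices of values more than n_sd standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    return [i for i, v in enumerate(values) if abs(v - mean) > n_sd * sd]

# Hypothetical change scores (degrees): 12 typical subjects and one large loss.
changes = [1.0, -2.0, 0.5, 3.0, -1.5, 2.0, -0.5, 1.5, -3.0, 0.0, 2.5, -1.0, -40.0]
outliers = flag_outliers(changes)
```

Because an extreme value inflates both the mean and the standard deviation, a criterion of this kind is conservative, which is one reason to cross-check it against the robust regression weights as done above.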
2.4. Discussion
We studied both the spacing effect and the contextual interference effect by
measuring visuo-motor adaptation and decay in three conditions: FBK, TSP, and ALT.
Because the only difference between TSP and ALT schedules was the intercalation of a
secondary task in ALT, our design decomposed the contextual interference effect into the
two components predicted by the computational model of Lee and Schweighofer (2009):
time decay by comparing TSP to FBK, and interferences by comparing TSP to ALT.
Performance changes during training were predicted by the model in the three
conditions as shown by both model selection and fitting. Performance in FBK was due to
large activation of the fast process and relatively low activation of the slow process. As a
result, subjects in FBK showed fast initial adaptation followed by slow and gradual
adaptation: a well-known result (Redding and Wallace, 1996; Karni et al., 1998; Della-
Maggiore and McIntosh, 2005; Anguera et al., 2010) which is accounted for by the 4-
parameter model with a fast and a slow process (Smith et al., 2006; Zarahn et al., 2008).
In TSP, performance was due to intermediate activation of both fast and slow processes,
as compared to ALT and FBK. Finally in ALT, performance was due to a large activation
of two independent slow processes and near-zero activation of the fast process. Note,
however, that our result concerning the independence of the slow processes is probably
due to our choice of opposite rotations: similar rotations would presumably show transfer
between the tasks.
Forgetting in the four same-day retention tests was in part predicted by the model.
As predicted by the near-zero activation of the fast process in ALT at the end of training,
there was more decay in FBK in the 2 minutes following training compared to ALT.
However, there was no difference in decay between FBK and TSP. This was unexpected
because the fast process at the end of training was estimated to be larger in FBK than in
TSP. However, this difference is smaller than between FBK and ALT (compare Figure
2.2A, B, and C), and may not be sufficient to lead to a significant difference. As predicted,
ALT showed the smallest forgetting rate in the same day (0, 2, 5, and 10 min) retention
tests. However, despite no difference in performance at the end of training between ALT
and TSP (two tailed t-test, p = 0.27), overall retention level in these retention tests was
poorer in ALT than in TSP. Retention in ALT may have been affected by the change of
context during testing compared to training: although task 1 was practiced with task 2
during training, task 1 was tested in isolation following ALT training (in order to keep
tests identical across conditions). Such “encoding specificity” is known to affect
retention (Tulving and Thomson, 1973). In addition, the rapid change of context from
two tasks to a single task could have made the subjects in ALT sensitive to
environmental change, thus leading to an initial rapid drop in performance (Vaswani
and Shadmehr, 2013). However, once this change had been recognized, the remaining
decay was slow because of the near-zero activity in the fast process. Our model does not
take into account this change in testing context, and thus cannot explain the drop in
performance following training (see Figure 2.2C). As a result, the estimated decay of
the slow process obtained via model fitting of the training and one-day retention data in
ALT is faster than that obtained via mixed model analysis, which only takes into
account the retention data. Nonetheless, the model-estimated decay rate leads to good
model fits for the TSP and FBK retention data (See Figure 2.2A, B). In addition, the
independence of the slow processes revealed by our model comparison analysis suggests
that the greater number of training trials in ALT compared to FBK and TSP (120 vs 60)
did not influence retention performance.
Across conditions, we found a linear relationship between performance in 10
minutes and performance in 1-day retention tests. This result was predicted by activation
of the slow process at the end of each training session and extends previous results
(Joiner and Smith, 2008) of linear relationship between short-term and long-term
retention to individual subjects. However, our model does not account for the overnight
consolidation effect reported only in TSP (Figure 2.4B). The first possible explanation
would be that increased rewards (in the form of “target hits”) acquired during training
enhanced retention in TSP (Abe et al., 2011; Huang et al., 2011; Pekny et al., 2011;
Shmuelof et al., 2012a). However, this does not seem to be the case in our study, as the
reward rate (mean ± STD) was higher during training in FBK (0.148 ± 0.091) than in
TSP (0.067 ± 0.058; p = 0.0054) and ALT (0.063 ± 0.058; p = 0.0072), with no
difference between TSP and ALT (p = 0.876). Thus, the effect of rewards seems to be
more complicated in our study with variable schedules because 1) more rewards during
fast learning in FBK do not lead to greater retention, and 2) there is no difference in the
rewards between TSP and ALT, but consolidation occurred in TSP only. The second
possible explanation would be that a single task induces repetitive and similar brain
activity patterns that improve long-term retention and consolidation, like episodic
memory for faces or words (Xue et al., 2010). In contrast, the two opposing tasks in ALT
are represented as different brain activity patterns (Ogawa and Imamizu, 2013), which
possibly reduces long-term memory consolidation. In addition, the longer ITI in TSP than
in FBK presumably gives more time for mental practice of the incoming trials (Feltz and
Landers, 1983); this could further increase the subsequent consolidation process. A
number of recent studies explored the effect of task schedules in motor learning or
adaptation and are related to the present study. Two studies reported no significant
difference in learning rate during practice when the time between trials was greater than 1
second (Francis, 2005; Bock et al., 2005). Similarly, we found no difference (p = 0.34) in
the initial learning rate between FBK and TSP. Unlike in our study, Huang et al. (2007)
showed increased learning rates during practice with a longer ITI group compared to a
shorter ITI group, which cannot be explained by multi-rate learning models with fixed
learning rates. However, they calculated the learning performance by averaging 32 trials,
which might not capture dynamic change of the fast process with a time constant of less
than a minute. Körding et al. (2007) proposed that the spacing effect is due to increase in
learning rates of the slower processes during spaced adaptation. Although this is one
possible explanation, our model does not require modifiable learning rates to explain the
spacing effect: longer ITI induces more forgetting in the fast process, which in turn
results in larger errors, and thus greater update in the slow process. In our previous study
(Schweighofer et al., 2011), we compared the contextual effect in learning to generate
specific force profiles in healthy individuals and individuals with chronic stroke. We
showed that individuals with chronic stroke, who have low visuo-spatial working
memory, exhibited little long-term forgetting after either random or blocked schedules.
This finding was predicted based on simulations of the contextual interference effect with
the Lee and Schweighofer model. However, the models were qualitative in the study: the
models were not selected from the data, and similarly the parameters were not estimated
from the data. The present study, by directly selecting the best model and estimating
model parameters from the data, clearly demonstrated that the 2-parameter model with
two independent slow processes was preferred in alternating dual-task training.
In sum, like the “forgetting-reconstruction” hypothesis, our results strongly
support the view that the spacing and the contextual interference effects are largely
based on fast forgetting between presentations of the same task during training. However,
the specific mechanisms underlying the enhancement of long-term memory differ in the
forgetting-reconstruction hypothesis and in our model. According to the forgetting-
reconstruction hypothesis, forgetting in working memory between spaced presentations
necessitates retrieval from long-term memory, which increases long-term retention. In
our model, forgetting in the fast process between spaced presentations leads to greater
errors during training, which increases the update of the slow process, and in turn leads to
better long-term retention. Note that our account of the contextual interference effect
does not exclude additional explanations, such as the “elaboration-distinctiveness”,
“deficient processing” (Magill and Hall, 1990), or active preparation (Cross et al., 2007).
In addition, we uncovered a specific role of presentation spacing in enhancing long-term
consolidation. Further studies are needed to dissociate the possible roles of additional
mechanisms in both the spacing and the contextual effects.
Chapter 3
Neural correlates for multi-rate models of
sensorimotor learning
3.1. Introduction
Recent studies have proposed computational models of multi-state motor
memories with different time scales in motor adaptation. These models account well for
several interesting motor learning phenomena, such as anterograde interference,
spontaneous recovery, and savings (Smith et al., 2006; Körding et al., 2007; Joiner
and Smith, 2008; Lee and Schweighofer, 2009). However, little is known about the
neural correlates of the putative multi-rate motor memories suggested by the
computational models.
Several related neuroimaging studies using fMRI (Della-Maggiore and McIntosh,
2005; Luauté et al., 2009; Anguera et al., 2010; Landi et al., 2011) or PET (Clower et al.,
1996; Shadmehr and Holcomb, 1997; Krakauer et al., 2004; Della-Maggiore and
McIntosh, 2005) have found activation and plastic changes in different brain regions
during and after motor adaptation. The prefrontal cortex (PFC) is known for its
role in spatial working memory, with its activity correlated with working memory
capacity (Pessoa et al., 2002; Olesen et al., 2003). Thus, the PFC mostly contributes to the
early but not the late stage of adaptation. The posterior parietal cortex (PPC) is also
important in the early stage of motor adaptation (Clower et al., 1996; Graydon et al.,
2005; Luauté et al., 2009) as well as in working memory (Pessoa et al., 2002; Olesen et
al., 2003), with a role in planning movements and coordinating a new visuomotor
transformation (Clower et al., 1996; Buneo and Andersen, 2006). The involvement of the
cerebellum in motor adaptation appears more complex, with roles in receiving motor errors
(Schweighofer et al., 2004; Diedrichsen et al., 2005), building internal models (Wolpert
et al., 1998; Kawato, 1999), and storing multiple motor skills (Imamizu et al., 2003). The
activity of the cerebellum increases in the later stage of visuomotor adaptation (Imamizu
et al., 2000; Graydon et al., 2005; Luauté et al., 2009) and is correlated with the
degree of savings at transfer of learning (Seidler and Noll, 2008).
However, these studies focused on separate functions of specific brain
regions in different stages of motor learning without directly investigating dynamic
changes of their activation during the entire course of learning. In addition, they assumed
qualitatively separable neural systems (fast vs. slow or early vs. late), whereas the
different learning and forgetting rates we experience depending on task difficulty
presumably suggest a distribution of possible time scales of motor memory (see the
discussions in Smith et al., 2006; Wolpert et al., 2011).
Our approach is more quantitative in estimating states of memory activation with
varying time scales based on a computational model. For this, we adopted and modified a
model suggested by Körding et al. (2007), which generalized two-state models to
multi-state models with a distribution of time scales. This model-based fMRI analysis has
an advantage over a conventional fMRI analysis in that it provides internal variables (i.e.,
states of memory activation) as regressors, which are otherwise inaccessible from
observable data alone (O'Doherty et al., 2007). Therefore, we could explore neural
correlates of motor memory with varying time scales, from the slowest to the fastest in a
reasonable range. For this, we designed an event-related fMRI experiment in which subjects
learned opposing visuomotor rotation tasks in alternating blocks, which prevented
premature adaptation within a few trials. In addition, we further performed multi-voxel
pattern analysis (MVPA) to investigate whether activity patterns in brain regions such as
the PPC and the cerebellum could represent different visuomotor rotations (Ogawa and
Imamizu, 2013).
3.2. Materials and Methods
3.2.1. Subjects
Subjects were 21 right-handed volunteers (15 males and 6 females, 20-50 years),
as assessed by a modified version of the Edinburgh Handedness Inventory (Oldfield,
1971). Written informed consent was obtained from all subjects in accordance with the
Declaration of Helsinki. The experimental protocol received approval from the local
ethics committee.
3.2.2. Task procedures
We designed a dual-task adaptation experiment with two opposing visuomotor
rotations (Lee and Schweighofer, 2009): task 1: 40˚, task 2: -40˚. At the beginning of each
trial, a white cross appeared at the center of the screen and served as the subjects’
fixation point. A round target of 0.7 cm radius appeared at the top of the screen, 8 cm
from the center. Subjects had to manipulate a joystick to move the cursor to the target
within 1.5 s; otherwise the trial was considered a missed trial and no data were recorded.
The cursor trajectory was not visible during reaching, but the cursor reappeared for 500 ms
to give feedback of the final cursor position 8 cm from the center. To encourage subjects to
respond faster, the color of the feedback cursor changed to yellow if the subject did not
move within 800 ms. Intertrial intervals were random, exponentially distributed from 4 to
14 seconds in 2-second increments. For each trial, we calculated the directional error
between the target direction and the final cursor direction from the center of the screen.
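The directional error can be computed from screen coordinates as in the Python sketch below. The coordinate convention (origin at the screen center, y pointing up, clockwise angles positive) and the function names are assumptions made for illustration. The same geometry gives the approximate angular half-width of a 0.7 cm radius target placed 8 cm from the center.

```python
import math

def direction_deg(x, y):
    """Direction of point (x, y) from the screen center, in degrees from straight up."""
    return math.degrees(math.atan2(x, y))

def directional_error(target_xy, cursor_xy):
    """Signed angle (deg) from target direction to cursor direction, in [-180, 180)."""
    diff = direction_deg(*cursor_xy) - direction_deg(*target_xy)
    return (diff + 180.0) % 360.0 - 180.0

# Approximate angular half-width of a 0.7 cm radius target 8 cm from the center.
half_width_deg = math.degrees(math.atan(0.7 / 8.0))

target = (0.0, 8.0)           # target straight up, 8 cm from the center
angle = math.radians(40.0)    # hypothetical end point rotated by 40 degrees
cursor = (8.0 * math.sin(angle), 8.0 * math.cos(angle))
err = directional_error(target, cursor)
```

With these numbers the half-width comes out at about 5˚ on either side of the target center.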
The size of the target is equivalent to 10˚ in visual angle, allowing up to ±5˚ of error
to be ‘hitting’ the target. We used different colors, such as red, blue, and green for
subjects to distinguish between different tasks including a control task without
visuomotor rotation. Blocks of the tasks were presented alternately, e.g., C1212C2121C,
where C, 1, and 2 indicate a block of 9 trials for the control task, task 1, and task 2, respectively. The sequence
of tasks 1 and 2 and the target colors were counterbalanced across both sessions and
subjects to eliminate any confounding effects. There were three sessions, and each
consisted of 99 trials, with 27 trials for the control task and 36 trials for each of tasks 1 and
2. A session lasted ~11 minutes, and subjects had a one-minute break between sessions. All
the participants achieved the minimum required performance in a screening session of 150
trials without visuomotor rotation, in which they had to hit more than 80 of the last 100
targets.
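As a quick consistency check of the block design, the example schedule can be expanded into a trial list. This is only an illustrative sketch; the helper name is hypothetical.

```python
from collections import Counter

TRIALS_PER_BLOCK = 9  # stated in the text

def expand_schedule(blocks, trials_per_block=TRIALS_PER_BLOCK):
    """Turn a block string such as 'C1212C2121C' into a per-trial label list."""
    return [label for label in blocks for _ in range(trials_per_block)]

session = expand_schedule("C1212C2121C")
counts = Counter(session)  # trials per task label within one session
```

The expansion gives 99 trials per session, with 27 control trials and 36 trials for each rotation task, in agreement with the numbers above.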
Stimuli were presented on a liquid crystal display and projected onto a custom-made
viewing screen. Subjects lay supine in the scanner and viewed the screen via a mirror,
being unable to see their hand throughout this task. They used their left index and middle
fingers to control the joystick with the left upper arm immobilized using foam pads to
minimize body motions.
Figure 3.1. Timeline of a trial: Subjects adapted to opposing visuomotor rotations (+40˚
and -40˚). Target presentation signaled a task, and subjects had to move within 1.5 s for
the trial to be valid. Feedback was given as a cursor position for 0.5 s after the
movement time. The control trials did not have a rotation (0˚). The different tasks were
cued by colors (blue, red and green), which were counterbalanced across subjects.
3.2.3. Computational model
Two state-space models have explained trial-by-trial motor adaptation well as the sum
of decomposed memory traces with two different time constants (Smith et al., 2006; Lee
and Schweighofer, 2009). These studies assumed distinct fast and slow processes, but it
has yet to be determined whether the brain implements distinguishable neural and behavioral
systems (Wolpert et al., 2011). Therefore, it is more natural to consider continuous
timescales of motor memories possibly implemented in the brain (Körding et al., 2007).
As an approximation to a continuous distribution, we chose 30 different time constants,
similarly to Körding et al. (2007), ranging from 2 seconds for the fastest to ~92.6 hours for the
slowest, with the time constants logarithmically spaced.
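The logarithmic spacing can be illustrated with a short sketch. The function name is hypothetical, and the endpoints are taken from the text (the slowest value is approximate).

```python
def log_spaced_time_constants(n=30, fastest_s=2.0, slowest_s=92.6 * 3600.0):
    """Return n time constants (in seconds) with a constant ratio between neighbors."""
    ratio = (slowest_s / fastest_s) ** (1.0 / (n - 1))
    return [fastest_s * ratio ** k for k in range(n)]

taus = log_spaced_time_constants()
# taus[0] is component k = 1 and taus[29] is component k = 30
```

Under this spacing each time constant is about 1.5 times the previous one, and component k = 16 falls at roughly 16.7 minutes, which matches the values quoted in the Results.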
We assumed that the motor memories simultaneously update their states in response to a common error
feedback, as our group suggested for the two-state model (Lee and Schweighofer, 2009).
To account for dual adaptation, we defined the state of the motor memory with time
constant $\tau_k$ as a vector with two internal states, $\mathbf{x}_k = [x_{1,k}\ x_{2,k}]^T$. We solved a first-order
differential equation similar to the one shown in the references to obtain the trial-by-trial
update equation (for details, see supplemental material). In the solution, the learning
processes forget exponentially with different time constants and, at each trial $n$, update
their states simultaneously at the moment of receiving the error feedback $e$, the difference between an
external perturbation $f$ and the motor output $y$.
$$\mathbf{x}_k(n+1) = e^{-T(n)/\tau_k}\,\mathbf{x}_k(n) + \beta_k\, e(n)\,\mathbf{c}(n) \qquad (k = 1, \ldots, 30) \tag{3.1}$$
$$y(n) = \sum_{k=1}^{30} \mathbf{x}_k^T(n)\,\mathbf{c}(n) \tag{3.2}$$
$$e(n) = f(n) - y(n) \tag{3.3}$$
where $\mathbf{x}_k$ is a learning process with 2 internal states, $[x_{1,k}\ x_{2,k}]^T$, corresponding to the tasks,
$\mathbf{c}$ is the contextual cue, and $T(n)$ is the duration of the inter-trial interval following trial $n$.
We used $\mathbf{c} = (1,0)^T$ or $\mathbf{c} = (0,1)^T$ for each task, where we assumed no interference and
perfect switching between tasks (Lee and Schweighofer, 2009).
The time constant $\tau_k$ determines the learning rate $\beta_k$ according to the following
equation.
$$\beta_k = \frac{c}{\tau_k^{\,p}} \qquad (k = 1, \ldots, 30) \tag{3.4}$$
where $c$ and $p$ are positive free parameters. Thus, a memory process with a
smaller time constant both forgets and learns faster. Error-based learning occurs trial by
trial and is therefore a discrete-time process, whereas the time-based decay of motor memories is a
continuous-time process (Ethier et al., 2008; Tanaka et al., 2012).
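To make the update rule concrete, the sketch below simulates Eqs. 3.1 to 3.4 in plain Python. It is an illustration under stated assumptions, not the MATLAB code used in this study: the function name is hypothetical, time constants are assumed to be in seconds, the default values of c and p are the fitted values reported in the Results, and the constant 8 s inter-trial interval is chosen only for the example.

```python
import math

def simulate(perturbations, cues, itis, taus, c=0.0487, p=0.263):
    """Simulate the multi-timescale model for a sequence of trials.
    perturbations: f(n) in degrees; cues: (1, 0) or (0, 1); itis: T(n) in seconds."""
    betas = [c / tau ** p for tau in taus]                  # Eq. 3.4
    x = [[0.0, 0.0] for _ in taus]                          # x_k = [x_1k, x_2k]
    outputs, errors = [], []
    for f, cue, T in zip(perturbations, cues, itis):
        y = sum(xk[0] * cue[0] + xk[1] * cue[1] for xk in x)    # Eq. 3.2
        e = f - y                                               # Eq. 3.3
        for k, tau in enumerate(taus):                          # Eq. 3.1
            decay = math.exp(-T / tau)
            x[k] = [decay * x[k][j] + betas[k] * e * cue[j] for j in range(2)]
        outputs.append(y)
        errors.append(e)
    return outputs, errors, x

# 30 log-spaced time constants from 2 s to about 92.6 h
taus = [2.0 * (92.6 * 3600.0 / 2.0) ** (k / 29.0) for k in range(30)]
n_trials = 36
y, e, x = simulate([40.0] * n_trials, [(1.0, 0.0)] * n_trials, [8.0] * n_trials, taus)
```

Because the cue vector selects only one internal state per trial, the states for the uncued task receive no error-driven update, which is the "no interference" assumption stated above.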
We used the MATLAB fmincon function to optimize the two free parameters, $c$
and $p$, by minimizing the mean absolute error between the subjects’ actual adaptation
and the model predictions, $y(n)$ (see Eq. 3.2). The actual adaptation at each trial was
calculated by averaging the observed adaptations of 21 subjects, excluding missed trials
and severely overshot trials (>20˚), which were less than 5% of the total trials. Once the
parameters were optimized, we calculated 30 memory traces for each task, $\mathbf{x}_k$ (see Eq.
3.1), every 1.8 seconds, corresponding to the repetition time (TR) of the scanner. Since our
model is continuous-time based, we did not interpolate between trials but used the actual
elapsed time at image acquisition in the decay term of the state update
equation. We used the calculated memory traces as regressors for univariate fMRI
analysis.
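One way to read the sampling step above is sketched below: between two feedback events, a trace simply decays from its most recent updated value according to the elapsed time. The function name, its arguments, and the convention that a trace is zero before the first update are assumptions made for illustration.

```python
import math
from bisect import bisect_right

def trace_at_scans(update_times, post_update_states, tau, scan_times):
    """Evaluate one memory trace at arbitrary scan times (all times in seconds).
    update_times must be sorted; post_update_states[i] is the state right after
    the update at update_times[i]."""
    values = []
    for t in scan_times:
        i = bisect_right(update_times, t) - 1      # last update at or before t
        if i < 0:
            values.append(0.0)                     # nothing learned yet
        else:
            elapsed = t - update_times[i]
            values.append(post_update_states[i] * math.exp(-elapsed / tau))
    return values

# Hypothetical numbers: two updates, a 5 s time constant, four scan times.
example = trace_at_scans([1.0, 10.0], [2.0, 3.0], 5.0, [0.0, 1.0, 6.0, 10.0])
```

Sampling on the scanner's TR grid then amounts to passing scan_times = [0, 1.8, 3.6, ...].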
3.2.4. MRI acquisition
A 3-T Siemens Trio scanner (Erlangen, Germany) with a 12-channel head coil
was used to perform T2*-weighted echo planar imaging (EPI). A total of 368 scans were
acquired for each session with a gradient echo EPI sequence, and each subject underwent
three sessions. The first five scans were discarded to allow for T1 equilibration. Scanning
parameters were repetition time (TR), 1800 ms; echo time (TE), 30 ms; flip angle (FA),
70°; field of view (FOV), 192 × 192 mm; matrix, 64 × 64; 30 axial slices; and slice
thickness, 5 mm without gap. T1-weighted anatomical imaging with an MP-RAGE
sequence was performed with the following parameters: TR, 2250 ms; TE, 3.06 ms; FA,
9°; FOV, 256 × 256 mm; matrix, 256 × 256; 192 axial slices; and slice thickness, 1 mm
without gap.
3.2.5. Processing of fMRI data
Image preprocessing was performed using SPM8 software (Wellcome
Department of Cognitive Neurology, http://www.fil.ion.ucl.ac.uk/spm). All functional
images were first realigned to adjust for motion-related artifacts. The realigned images
were then spatially normalized with the Montreal Neurological Institute (MNI) template
and resampled into 2-mm-cube voxels with sinc interpolation. All images were spatially
smoothed using a Gaussian kernel of 8 × 8 × 8 mm full width at half-maximum. The
smoothing was not performed for multi-voxel pattern analysis (see below), as this could
blur fine-grained information contained in multi-voxel activity (Mur et al., 2009).
3.2.6. Model-based regression analysis of fMRI data
We conducted a model-based regression analysis of the fMRI data. The 30
components with different time constants, estimated in the preceding
behavioral modeling (see Eq. 3.1), were used as explanatory variables (i.e., regressors)
in the general linear model (GLM). To avoid multicollinearity
due to the similarity of regressors between adjacent time constants, we separately estimated
30 regression models, one for each time constant. The regressors in each
model included the states of the adaptation components (see modeling of behavioral data) for
Tasks 1 (+40º) and 2 (-40º), each of which was orthogonalized using an SPM function
(spm_orth.m). The regressors also included pulse functions at every onset of joystick
movement, modeling the hand movement. Hand movements were modeled separately for
each trial type (Tasks 1 and 2, and Control [0º]). Amplitudes of the pulse functions were
modulated by behavioral measures (directional error and reaction time) in each trial,
modeling the effects of error and reaction time; however, these effects were of no interest in
the current analyses. Low-frequency noise was removed using a high-pass filter with a
cut-off period of 128 s, and serial correlations among scans were estimated with an
autoregressive model implemented in SPM8. Contrast images of each subject, generated
using a fixed-effects model, were taken into the group analysis using a random-effects
model with a one-sample t-test. Because the purpose of the model-based regression analysis
was to identify candidate regions related to the many (30) adaptation components for the singular
value decomposition analysis (see below), activation was reported with a lenient
threshold of p < 0.001, uncorrected for multiple comparisons at the voxel level.
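The orthogonalization step mentioned above can be illustrated with a plain serial Gram-Schmidt procedure, in which each regressor has its projection onto the preceding regressors removed. This is a generic sketch written for illustration and is not the SPM implementation; the function name and the toy regressors are hypothetical.

```python
def orthogonalize(columns):
    """Serially orthogonalize regressors (each given as a list of samples).
    Column i keeps only the part that is orthogonal to columns 0..i-1."""
    result = []
    for col in columns:
        v = list(col)
        for u in result:
            uu = sum(a * a for a in u)
            if uu > 0.0:
                w = sum(a * b for a, b in zip(v, u)) / uu   # projection weight
                v = [a - w * b for a, b in zip(v, u)]
        result.append(v)
    return result

# Toy example: a constant regressor followed by a linear trend.
ortho = orthogonalize([[1, 1, 1, 1], [1, 2, 3, 4]])
```

In such a serial scheme the order of the columns matters: variance shared between two regressors is attributed to the one that comes first.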
3.2.7. Multi-voxel pattern analysis (MVPA)
We additionally conducted a multi-voxel pattern analysis (MVPA) to test if the
regional brain activity could be used to classify the two rotational types (+40 and –40
degrees). The regions of interest (ROIs) included the right parietal lobe and the
cerebellum. Our preceding model-based regression analysis suggested that these regions
are related to the middle (the parietal regions) and the slow (the cerebellum) components.
The ROI of the right parietal region comprised the superior and the inferior parietal lobes
according to the anatomical map in PickAtlas (http://fmri.wfubmc.edu/software/PickAtlas)
(regions enclosed by red curves in Fig. 3.6A). Within the anatomical ROI (regions
enclosed by cyan curves in Fig. 3.6A), we applied MVPA to BOLD signals of voxels that
were significantly correlated with at least one of the intermediate components ($\tau_k$
ranging from 2.1 to 87.9 minutes: k = 11, …, 20) in the model-based regression analysis.
In the cerebellum, MVPA was applied to signals that were significantly correlated with at
least one of the slow components ($\tau_k$ ranging from 2.2 to 92.6 hours: k = 21, …, 30).
First, all 297 trials were modeled as separate pulse regressors at the onset of
movement, which were convolved with a canonical hemodynamic response function.
This analysis yielded 297 independently estimated parameters (beta-values) for each
individual voxel, and only the 198 trials with rotational conditions (+40 or –40 degrees)
were subsequently used as inputs for the MVPA. The classification was performed with a
linear support vector machine (SVM) implemented in LIBSVM
(http://www.csie.ntu.edu.tw/~cjlin/libsvm/), with default parameters (a fixed
regularization parameter C = 1). Separate training and testing datasets were generated
with a pseudo-random half-split of all the samples. The cross-validation was then
repeated 1,000 times for each subject, and the averaged classification accuracy was
estimated. A two-sided t-test was used to determine whether the observed classification
accuracy was significantly higher than chance (50%) with inter-subject difference treated
as a random factor (degrees of freedom [d.f.] = 20).
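The repeated half-split procedure can be sketched as below. To keep the example self-contained, a nearest-class-mean classifier stands in for the linear SVM; it is not the classifier used in this study, and all function names and the toy data are hypothetical.

```python
import random

def fit_nearest_mean(X, y):
    """Stand-in classifier (NOT the LIBSVM linear SVM): store one mean per class."""
    means = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def predict_nearest_mean(means, x):
    """Assign the label whose class mean is closest in squared Euclidean distance."""
    return min(means, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, means[lab])))

def half_split_accuracy(X, y, n_repeats=1000, seed=0):
    """Average test accuracy over repeated pseudo-random half-splits."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    half = len(idx) // 2
    total = 0.0
    for _ in range(n_repeats):
        rng.shuffle(idx)
        train, test = idx[:half], idx[half:]
        model = fit_nearest_mean([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict_nearest_mean(model, X[i]) == y[i] for i in test)
        total += hits / len(test)
    return total / n_repeats

# Toy, well-separated one-feature "patterns" for two rotation labels.
X_toy = [[float(i % 5)] for i in range(20)] + [[100.0 + i % 5] for i in range(20)]
y_toy = [0] * 20 + [1] * 20
acc = half_split_accuracy(X_toy, y_toy, n_repeats=20)
```

In the actual analysis each sample would be the vector of beta-values of the voxels within an ROI for one trial, and the per-subject averaged accuracy would then enter the group-level t-test against 50%.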
3.3. Results
3.3.1. Behavioral analysis and modeling
The fitted parameters, $c$ and $p$ (see Eq. 3.4), were 0.0487 and 0.263, and the fitting
error, measured as the mean absolute error, was 1.75 degrees (Fig. 3.2A). The internal state of the
learning process for task 1, $x_{1,k}$, showed different profiles depending on the time constant
$\tau_k$ (Fig. 3.2B).
Figure 3.2. Adaptation process in behavioral data and result of model fitting. A, Blue,
red, and black circles indicate the direction of joystick movements in Task 1 (40º visuomotor
rotation), Task 2 (-40º), and Control (0º), respectively, averaged across subjects (N = 21).
Blue or red shaded regions indicate trials in Task 1 or 2, respectively. The thick black line
indicates the total output of the multi-state model (see main text). B, Profiles of individual
adaptation components for Task 1 of the fitted model. Colors indicate component
numbers and corresponding time constants as indicated by the color bar.
3.3.2. Model-based regression of fMRI
The model-based regression analysis revealed distinct patterns of correlated
regions for each component with a different time constant ($\tau_k$) (Fig. 3.3).
Figure 3.3. Regions associated with individual adaptation components with 30 different
time constants. Red-yellow regions indicate those where BOLD signal time courses were
significantly correlated with those of individual adaptation components (see Fig. 3.2B) (p
< 0.001, uncorrected for multiple comparisons, see Methods). Color-coded T-values of
regression coefficients are rendered on a right posterior view of the brain surface. Blue
circles indicate the anterior region of the inferior parietal lobe and the cerebellum, which
are consistently associated with the intermediate (k = 11, …, 20) and slow components (k = 21,
…, 30). s: second, m: minute, h: hour.
The faster components ($\tau_k$ ranging from 2.0 to 4.6 seconds; k = 1, 2 and 3)
mainly revealed large regions in the frontal and parietal cortices, while the middle
time constants ($\tau_k$ ranging from 2.1 to 87.9 minutes; k = 11, …, 20) consistently showed
a restricted area in the right anterior region of the inferior parietal lobe (aIPL: a blue
circle in Fig. 3.3). With slower time constants ($\tau_k$ ranging from 2.2 to 92.6 hours; k = 21,
…, 30), we observed only the cerebellum. These patterns for each time constant were
nearly the same for the +40 and -40 degree conditions. We then compared the averaged
t-values for all regressions with the 30 time constants between the right aIPL and the
cerebellum, observed in the middle ($\tau_k$ = 16.7 minutes; k = 16) and slow ($\tau_k$ = 92.6 hours;
k = 30) components, respectively (Fig. 3.4). We found that the crossing point between the
curves of the middle and slow components is around 40 minutes (48 and 32 minutes
for the +40 and -40 degree conditions, respectively).
Figure 3.4. T-values of regression coefficients for the parietal region and the cerebellum as a
function of regressor number, which corresponds to different time constants. Curves with
open circles indicate t-values averaged within the parietal region (see the right panels),
where BOLD signal time courses are significantly correlated with the regressor for the
intermediate adaptation component ($\tau_k$ = 16.7 minutes; k = 16), and averaged across
subjects (blue = Task 1, and red = Task 2). Curves with filled circles indicate those
averaged within the cerebellar regions where signal time courses are correlated with
regressors for the slow component ($\tau_k$ = 92.6 hours; k = 30). Thick blue and red arrows
indicate the time constants at which the parietal (open circles) and the cerebellar (filled
circles) curves cross each other for Tasks 1 and 2, respectively.
We further applied singular value decomposition (SVD) to the series of
t-values of all components with the 30 time constants to extract principal components. All the
voxels that survived the voxel-level threshold (p < 0.001) for at least one time constant
were included in this analysis, so the data matrix has dimensions of the number of voxels
(approx. 20,000) by the number of time constants (30). The SVD decomposes the data
matrix into the following three orthogonalized matrices: eigenvariates, eigenvalues, and
eigenimages. We then selected the top four components, which accounted for over 99.5%
of the observed variance. The 1st and 2nd eigenvariates correspond to the fast components
with different time courses, while the 3rd and 4th ones represent the slow and middle
components, respectively (Figs. 3.5A and 3.5B). The corresponding eigenimages showed
the 1st component mainly in the prefrontal and medial parietal regions, the 2nd in the
posterior regions of the left and right inferior parietal lobes, the 3rd mainly in the
cerebellum and partly in the temporal lobe, and the 4th in the right aIPL and premotor
cortex (Fig. 3.5C).
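The selection of components by explained variance can be sketched as follows, assuming that the variance accounted for by each component is proportional to its squared singular value and that the singular values are sorted in descending order. The function name and the example values are hypothetical.

```python
def components_needed(singular_values, target=0.995):
    """Smallest number of leading components whose squared singular values
    account for at least `target` of the total."""
    energy = [s * s for s in singular_values]
    total = sum(energy)
    running = 0.0
    for count, value in enumerate(energy, start=1):
        running += value
        if running / total >= target:
            return count
    return len(energy)

# Made-up singular values, for illustration only.
n_for_99 = components_needed([10.0, 5.0, 1.0, 0.1], target=0.99)
n_for_995 = components_needed([10.0, 5.0, 1.0, 0.1], target=0.995)
```

The singular values themselves would come from a numerical SVD routine applied to the voxels-by-time-constants matrix of t-values.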
Figure 3.5. Eigenvariates and eigenimages of the top four components yielded by the
singular value decomposition analysis of variations in brain activity related to different
timescales of sensorimotor memory. A, B, Eigenvariates as a function of regressor number
corresponding to different time constants for Tasks 1 and 2, respectively. C, Eigenimages
rendered on the brain surface and transverse slices at different levels. Eigenimages are
thresholded at the top 5% of their magnitudes for each component.
3.3.3. MVPA result
The MVPA revealed significantly above-chance accuracy in the right aIPL as
well as in the cerebellum for all sessions, with accuracy averaged across subjects ranging
from 60 to 70% (Fig. 3.6B). We then tested differences in classification accuracy across
sessions. A one-way analysis of variance (ANOVA) with session as an intra-subject
factor revealed a significant increase of accuracy in the cerebellum ($F_{(2,40)}$ = 11.97, p <
0.001), but no significant difference in the right aIPL ($F_{(2,40)}$ = 1.69, p > 0.05). These
results indicate that the specificity of activity patterns to the task (40º or -40º rotations)
increased with sessions in the cerebellum.
We further confirmed that the increase in classification accuracy with session
observed in the cerebellum was not due to behavioral confounds during adaptation: the
increased behavioral performance due to adaptation (i.e., the decrease in endpoint directional
errors) may decrease the variance of low-level movement kinematics, which could
then contribute to better classification accuracy for the two rotational conditions. First, the
variance of the directional errors was compared between sessions. We found a significant
difference in the variance of errors between the 1st and 2nd sessions ($F_{(20,20)}$ = 3.33, p < 0.01 [p
< 0.02 corrected for two comparisons with the Bonferroni method]), but no significant
difference between the 2nd and 3rd sessions ($F_{(20,20)}$ = 0.90, p = 0.82; Fig. 3.6C). The
mean of the errors also showed no significant difference between the 2nd and 3rd sessions
($t_{(20)}$ = 1.05, p = 0.31). These behavioral measures indicate that performance had already
plateaued in the 2nd session. Second, we found no significant correlation between the
change in the classification accuracy of the cerebellum and that in the variability
(S.D.) of absolute directional errors between the 2nd and 3rd sessions across individual subjects
(correlation coefficient = -0.27, p = 0.24). This indicates that the increased classification
accuracy is unlikely to be caused by behavioral confounds (i.e., movement kinematics).
Figure 3.6. Regions of interest (ROIs) and classification accuracy of multi-voxel pattern
analysis. A, Functional ROIs to which multi-voxel pattern analysis was applied (gray-black
regions). Broken lines indicate anatomical ROIs (red: the superior and the inferior
parietal lobes, and cyan: the cerebellum). Right and left panels indicate the parietal
and cerebellar ROIs, respectively. Top and bottom panels show the regions projected onto
the sagittal and the transverse planes, respectively. B, Classification accuracy of Tasks 1
and 2 as a function of session using activity patterns in the above functional ROIs. A
plus (+) marker indicates accuracy averaged within each subject according to the
cross-validation tests (see Methods). Thick lines indicate accuracy averaged across subjects.
A one-way analysis of variance (ANOVA) with session as an intra-subject factor
revealed a significant increase of accuracy in the cerebellar ROI ($F_{(2,40)}$ = 11.97, p < 0.001),
but no significant difference in the parietal ROI ($F_{(2,40)}$ = 1.69, p > 0.05). C, Variability
(standard deviation: SD) of directional error among trials in a session for individual
subjects (gray circles). Red lines and boxes indicate the mean and SD of the SDs in each
session. A significant difference in the SD of errors was found between the 1st and 2nd
sessions ($F_{(20,20)}$ = 3.33, p < 0.01 [*: p < 0.02 corrected for two comparisons with the
Bonferroni method]), but no significant difference between the 2nd and 3rd sessions
($F_{(20,20)}$ = 0.90, p = 0.82, ns: not significant).
3.4. Discussion
We searched for neural correlates of motor memories across a distribution
of possible time scales during motor adaptation. The computational model of learning
behavior estimated the states of motor memories with different time constants, and
these were entered into a design matrix as regressors for univariate fMRI analysis. The
correlated brain regions shifted from the fronto-parietal area down to the cerebellar area
as the time constants went from the fastest to the slowest, and this result is consistent
with the expected roles of the PFC, PPC, and cerebellum. For the first time, we showed the
spatial distribution of motor memories with varying time scales.
With the first few fastest time constants (2 to 4.6 seconds), various brain regions were
activated, including not only the frontal and parietal lobes but also the visual and temporal cortices and the
cerebellum. Activation of these areas is mostly due to cognitive processes such as
attention and arousal at the onsets of target presentations (Coull et al., 1996; Coull, 1998).
In particular, the strong activation in the right intraparietal sulcus shown in Figure 3.3 is
consistent with its known correlation with attention to a spatial cue of a target (Coull, 1998).
Beyond the cognitive processes, the activated regions became more localized to
the PPC with longer but still relatively fast time constants. Within the PPC, we
found more voxels correlated with slower time constants in the inferior area than in the
superior area. In particular, we found a characteristic region of activation (blue circle
in Fig. 3.3) in the right anterior part of the inferior parietal lobule (aIPL) correlated with
the intermediate time constants around 16.7 minutes. From before to after this time constant,
the number of correlated voxels in the superior PPC decreased and the number in the
cerebellum increased. Our finding is consistent with the report that this area (aIPL) is
activated in both the early and late stages of motor adaptation (Inoue et al., 2000). Given the
anatomical location of the area, intermediate between the PPC and the cerebellum, we
expect it to serve as a locus of the parieto-cerebellar network, connecting memories with
different time scales.
With slower time constants longer than 1 hour, most activity was identified
in the cerebellum, and this result is consistent with previous studies (Imamizu et al., 2000;
Graydon et al., 2005; Luauté et al., 2009). In contrast, a recent study reported that brain
stimulation using tDCS over the cerebellum induced faster adaptation during training but
did not affect retention after training (Galea et al., 2011). However, their result does not
necessarily imply that the cerebellum is a site for the fast memory, because the faster
adaptation could be accounted for by a learning rate increased by the stimulation. The
unaffected retention performance could be due to the tDCS stimulation having no effect on the
forgetting rate of the motor memory in the cerebellum. This seeming discrepancy
warrants further investigation.
We could not find evidence of modular organization (Imamizu et al., 2003) in
the cerebellum for learning two opposing rotations. In the study by Imamizu et al. (2003),
the properties of the learning tasks (rotation and velocity mice) were quite different, but in
the current study, only a parameter of learning (i.e., rotation direction) differed, with
the same task structure. As discussed in their study, the correlated regions in the
cerebellum for the two tasks mostly overlapped because of the similarity of the tasks.
MVPA successfully discriminated the representations of the opposing rotations not
only in the cerebellum but also in the PPC, with decoding accuracy higher than chance level.
However, we observed a significant increase of the decoding accuracy across sessions
only in the cerebellum. This result provides supplementary evidence that the cerebellum
is a site for the multiple slow processes that learn separate representations in dual
adaptation, as predicted by our previously suggested model (Lee and Schweighofer, 2009).
The non-significant increase of the decoding accuracy in the PPC also suggests that the PPC
experiences more interference than the cerebellum, as also predicted by the same
model (Lee and Schweighofer, 2009). A further study on the neural correlates of the
degree of interference in multi-task learning would be interesting and could reveal the
neural mechanism underlying the failure to learn two opposing tasks in high-interference conditions
(Osu et al., 2004).
Our study does not address the architecture of motor memories with
different time scales, because we already assumed a parallel architecture in our model,
where all the memories received common errors as feedback (see Eq. 3.1). Although a
parallel architecture of motor memories is supported by behavioral evidence (Lee and
Schweighofer, 2009), there is still no neurological evidence for this architecture. A
functional connectivity analysis (Bullmore and Sporns, 2009) combined with causal intervention
methods such as TMS and tDCS could help resolve this issue.
We could not find significant activity in subcortical areas such as the basal
ganglia (BG), which are involved in reward-based motor learning (Doya, 2000). Hitting
targets or reducing errors during adaptation could be rewarding and possibly
correlated with activity in the BG. We also could not find activity in the primary motor
cortex (M1) significantly correlated with slower time scales, as other studies predicted
(Hadipour-Niktarash et al., 2007; Galea et al., 2011; Landi et al., 2011). The activity in
M1 might be correlated with even slower time scales beyond the range used in the current
study, possibly inducing structural change (Landi et al., 2011), although further studies
would be necessary.
Chapter 4
A model-based fMRI study of decision making in
motor learning
4.1. Introduction
Learning motor skills involves movement planning and control to reduce motor
errors and costs associated with the movement or to maximize rewards from movements.
Therefore, motor learning necessarily requires a process of decision making (Wolpert and
Landy, 2012). However, most modeling studies in motor learning have focused on
simple error-based learning, as in motor adaptation (Smith et al., 2006; Zarahn et al.,
2008; Lee and Schweighofer, 2009), in which motor commands are updated to reduce the error
trial by trial. On the other hand, most modeling studies in decision making have
focused on making choices among discrete action plans, e.g., the n-armed bandit task (Daw et
al., 2006; Behrens et al., 2007).
A seminal study by Körding and Wolpert modeled the decision-making process
in sensorimotor learning in the framework of Bayesian integration (Körding and Wolpert,
2004). In the model, the next movement plan was optimally decided to minimize
uncertainty given states and feedback. In addition, a few studies have proposed reinforcement
learning models to account for human motor learning behavior (Izawa et al., 2008; Izawa
and Shadmehr, 2011). However, little is known about the underlying neural mechanism
of decision making in motor learning. Most model-based fMRI studies have
focused on simple decision making in cognitive tasks (Daw et al., 2006; Behrens et al.,
2007), not in motor tasks, mostly because of restricted movement in the scanner.
We designed an event-related fMRI experiment in which subjects learned to decide
where to search for a hidden target through movements. We hypothesized that subjects
decided on a movement direction by integrating their current states and
feedback in a Bayesian way, but with a certain degree of exploration, which appears as a
deviation from the Bayesian predictions. The advantage of our model-based approach is
that it estimates hidden variables such as uncertainty and reward-prediction errors. A few
model-based fMRI studies using decision-making tasks have reported the neural correlates of
exploration (Daw et al., 2006), in the frontopolar cortex and intraparietal sulcus, and of
uncertainty (Yoshida and Ishii, 2006; Behrens et al., 2007), in the anterior prefrontal cortex and
anterior cingulate cortex. However, none of these studies used a motor task like the one in our
experiment. First, using computational models, we analyze behavior in searching for a hidden
target given feedback such as binary rewards and continuous error information. Second,
for fMRI analysis, we generate regressors for hidden variables predicted by the
selected model, such as reward-prediction errors, exploration, and uncertainty. In this
dissertation, we focus on the behavioral results with modeling and discuss the expected
results from fMRI data analysis.
4.2. Materials and Methods
4.2.1. Participants
Twenty-four neurologically healthy subjects participated in the experiment (18
males and 6 females, 20-37 years, mean age of 23.6 years). All subjects were right-
handed, as assessed by a modified version of the Edinburgh Handedness Inventory
(Oldfield, 1971). All subjects gave written informed consent for experimental procedures
approved by the ATR Human Subject Review Committee in accordance with the
principles expressed in the Declaration of Helsinki.
4.2.2. Experiment procedure
We designed a hidden-target search task on a semi-circle (Fig. 4.1A). Subjects
lay in a supine position in the scanner and viewed stimuli projected onto a custom-made
viewing screen. They controlled a joystick to move a cursor in the direction where they
thought a hidden target was located on the semi-circle. They had 1.3 seconds for movement
after the onset of the trial and received feedback for 0.7 seconds. If they did not move within
1.3 seconds, the trial was considered missed and no data were recorded. Time intervals
between trials were random, exponentially distributed over 4, 6, 8, 10, and 12 seconds.
After every 14 trials, the task changed and the hidden target moved to a new, randomly chosen
location. The feedback on their movement depended on the task: control,
supervised learning (SL), or reinforcement learning (RL), each cued by a different color.
For the control task, subjects received no feedback but were instructed to move
around over the semi-circle. For the SL task, the feedback was a directional error bar
(middle in Fig. 4.1A) scaled with an inverted Gaussian function (Körding and Wolpert,
2004). The sensitivity of the error bar was increased by reducing the standard deviation
of the Gaussian function from 45˚ to 5˚ (left in Fig. 4.1B). For the RL task, binary
feedback with either a ‘smiley face’ or a ‘sad face’ (bottom in Fig. 4.1A) was provided
depending on whether the movement direction was within a predefined reward zone
around the hidden target. The reward zone initially spanned up to 90˚ around the location of the
hidden target and shrank to 5˚ in the last trial. The increasing sensitivity of the error bar
and the shrinking reward zone were intended to facilitate learning (right in Fig. 4.1B).
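The shrinking feedback schedules can be sketched as a geometric interpolation over the 14 trials of a block. The exact schedule is not given in the text, so the geometric form, the treatment of the zone size as a full width centered on the target, and all names below are assumptions made for illustration.

```python
def shrinking_schedule(start, end, n_trials=14):
    """Geometric interpolation from `start` (first trial) to `end` (last trial)."""
    ratio = (end / start) ** (1.0 / (n_trials - 1))
    return [start * ratio ** t for t in range(n_trials)]

reward_zone_deg = shrinking_schedule(90.0, 5.0)    # RL task: reward zone size
error_bar_sd_deg = shrinking_schedule(45.0, 5.0)   # SL task: SD of the Gaussian

def rewarded(movement_deg, target_deg, zone_deg):
    """True if the movement falls inside the reward zone; zone_deg is treated
    as the full width centered on the target (an assumption)."""
    return abs(movement_deg - target_deg) <= zone_deg / 2.0
```

With such a schedule, a movement 10˚ away from the target would be rewarded on the first trial of a block but not on the last one.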
Figure 4.1. Experiment design: A, Subjects had to move a joystick within 1.3 seconds after the “go” signal, cued
by a red cross. Subjects received feedback as a ‘smiley’ or ‘sad’ face
depending on whether the search direction was within the reward zone. B, The reward zone
around the hidden target direction shrinks exponentially across trials to facilitate learning.
We designed an alternating task schedule for each session, for example,
CRSRSCSRSRC, where C, R, and S respectively indicate the control, RL, and SL tasks.
The sequence and colors of the SL and RL tasks were counterbalanced across both sessions
and subjects to eliminate any confounding effects of sequencing. Each session consisted
of 154 trials, with 42 trials for the control task and 56 trials each for the RL and SL tasks, and lasted
~13 minutes. Subjects underwent the same experiment as a training session a few days
before scanning, with hidden target locations different from those in the actual experiment.
4.2.3. Bayesian search model
Our task requires subjects to estimate the most probable position of a hidden
target to decide a search direction for each trial. After every decision, they receive
feedback and update their belief about the target position to make the next movement. This
hypothetical procedure can be well described by Bayesian learning, which combines
the prior belief and the likelihood to update the posterior belief. A similar idea was tested in the
human visual system, which is known to achieve nearly optimal search
performance (Najemnik and Geisler, 2005) or to maximize total reward gains by optimal
eye movements (Eckstein et al., 2010). Their optimal search model optimally integrates
information, namely visibility maps obtained by eye movements, to predict the next fixation point.
In our model, subjects obtain the likelihood, representing the subject’s belief about the
target direction, from a reward feedback. From a learning perspective, the likelihood is
equivalent to a generalization pattern around the direction where a subject received
feedback. We used a Gaussian or inverted Gaussian function for the likelihood depending
on a ‘smiley face’ ($r = 1$) or ‘sad face’ ($r = 0$). In other words, the reward feedback
provides evidence that enhances or suppresses the belief around the trial direction. The
motivation for the latter comes from the inhibition of return (IOR) mechanism found in
the attentional system, which searches towards novel spatial locations while inhibiting already
scanned ones.
Mathematically, the likelihood of a target being located in the direction $\theta_t$ for
each of the reward conditions is as follows.
\[
\begin{aligned}
L_1(\theta_t \mid u_t, r_t = 1) &= \exp\left(-\frac{(\theta_t - u_t)^2}{2\sigma^2}\right) \\
L_0(\theta_t \mid u_t, r_t = 0) &= 1 - \exp\left(-\frac{(\theta_t - u_t)^2}{2\sigma^2}\right)
\end{aligned}
\tag{4.1}
\]
where $u_t$ indicates the actual movement direction at trial $t$ and $\sigma$ is the only free
parameter of our model, which is the standard deviation of the Gaussian function.
For each trial, our model integrates the likelihood with the prior to update the posterior
belief on the target direction, $\theta_t$, in the Bayesian way.
\[
p(\theta_{t+1} \mid \mathbf{u}_t, \mathbf{r}_t) \propto p(\theta_t \mid \mathbf{u}_{t-1}, \mathbf{r}_{t-1}) \, L(\theta_t \mid u_t, r_t)
\tag{4.2}
\]
where $\mathbf{u}_t = [u_1 \; u_2 \; \ldots \; u_t]$ and $\mathbf{r}_t = [r_1 \; r_2 \; \ldots \; r_t]$ represent a history of actual search directions
and rewards. The model prediction for the next trial is a direction where the posterior is
maximized and thus it is deterministic without exploration.
\[
\hat{u}_{t+1} = \arg\max_{\theta_{t+1}} \, p(\theta_{t+1} \mid \mathbf{u}_t, \mathbf{r}_t)
\tag{4.3}
\]
We defined the exploration as the deviation of the actual search direction of the
next trial, $u_{t+1}$, from the prediction, $\hat{u}_{t+1}$. The only free parameter of our model,
the standard deviation $\sigma$ of the likelihood, was tuned to minimize the sum of the absolute
deviations (equivalently, the mean absolute error, MAE) from the second trial onward, with
the assumption that our model could predict subjects’ behavior (Izawa
and Shadmehr, 2011).
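The update and prediction steps of the Bayesian search model can be illustrated with a minimal Python sketch. It is not the code used for the analyses in this chapter. The 1° grid over 0° to 180°, the function names, and the tie-breaking rule are assumptions made for illustration, and the initial prior is taken to be a Gaussian centred on the first movement with the same standard deviation as the likelihood, following the example of Figure 4.2.

```python
import math

# Assumed discretization: candidate target directions 0..180 degrees in 1-degree steps.
GRID = [float(d) for d in range(181)]


def likelihood(u, r, sigma):
    """Gaussian bump around u for a reward (r = 1), inverted Gaussian for a non-reward."""
    bump = [math.exp(-((theta - u) ** 2) / (2.0 * sigma ** 2)) for theta in GRID]
    return bump if r == 1 else [1.0 - g for g in bump]


def normalize(p):
    """Scale a non-negative list so that it sums to one."""
    total = sum(p)
    if total == 0.0:
        raise ValueError("belief underflowed to zero; try a larger sigma")
    return [x / total for x in p]


def update(prior, u, r, sigma):
    """Posterior is proportional to the prior times the likelihood of the new feedback."""
    like = likelihood(u, r, sigma)
    return normalize([p * l for p, l in zip(prior, like)])


def predict(posterior):
    """Predicted direction maximizes the posterior (lowest direction wins on exact ties)."""
    best = max(range(len(GRID)), key=lambda i: posterior[i])
    return GRID[best]


def run_model(directions, rewards, sigma):
    """Return the predicted directions for trials 2..T of one hidden target."""
    # Assumed initial prior: Gaussian centred on the first movement, same sigma.
    belief = normalize(likelihood(directions[0], 1, sigma))
    predictions = []
    for u, r in zip(directions[:-1], rewards[:-1]):
        belief = update(belief, u, r, sigma)
        predictions.append(predict(belief))
    return predictions
```

For instance, with an illustrative value of sigma = 51° (chosen only for this demonstration, not a fitted value), run_model([40, 90], [0, 0], 51.0) returns [100.0], whereas a rewarded first movement, run_model([40, 70], [1, 1], 20.0), returns [40.0].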
Figure 4.2. Bayesian decision making for the movement direction: A, The first
movement is at 40° with a subsequent reward. The likelihood is enhanced at 40° due to
the reward and predicts the next movement at 40° as a maximum posterior. B, The second
movement is at 70° with deviation (i.e., exploration) from the previous prediction at 40°.
C, The first movement is at 40° without a subsequent reward. The likelihood is
suppressed at 40° due to the non-reward and predicts the next movement at 100° as a
maximum posterior. D, The second movement is at 90° with deviation (i.e., exploration)
from the previous prediction at 100°.
Figure 4.2 demonstrates an example of how the model predicts the next search
direction given the actual search direction. An exemplary subject moves in the 40° direction
in the first trial ($u_1$), and we assumed that the subject has an initial prior belief around
that direction, which is a Gaussian distribution.
If the subject received a ‘smiley face’ ($r_1 = 1$) for the trial (Fig. 4.2A), the model
combines the initial prior with the likelihood $L_1$ to update the posterior. In this case, the
likelihood is the same as the initial prior because we used the same standard deviation for
the two distributions. As a result of positively reinforcing the initial direction, the
posterior becomes sharper than the initial prior. If the subject received a ‘sad face’ ($r_1 = 0$)
for the trial (Fig. 4.2C), the likelihood $L_0$ is used to update the posterior, in which the
probability around the search direction is suppressed. In both cases, the model
predicts the next search direction at the maximum of the posterior. The actual search direction
for the second trial (e.g., $u_2$ = 70°) differs from the prediction by the amount of
exploration. The model then updates the posterior again for the third trial by combining it with the
likelihood given by the reward feedback of the second trial (Fig. 4.2B, D). Following
this procedure, the model sequentially updates the posterior and predicts the next search
direction at the maximum posterior from the second to the 14th trial.
4.2.4. Alternative models
Because our task is in a continuous space (i.e., search direction), we proposed
policy-based models rather than traditional state and/or action value-based models. The
first alternative model plans the next movement direction by estimating the gradient of
the action policy with a learning rate (Williams, 1992; Sutton et al., 2000; Peters and
Schaal, 2008). This model updates a policy mean given the previous exploration and its
consequent reward (Williams, 1992) as follows.
\[
\begin{aligned}
\mu_{t+1} &= \mu_t + \alpha \, (r_t - b_t)(u_t - \mu_t) \\
b_t &= \Big( b_0 + \sum_{i=1}^{t-1} r_i \Big) / t
\end{aligned}
\tag{4.4}
\]
where $\mu_t$, $r_t$, and $b_t$ indicate a policy mean, reward, and baseline at a trial $t$,
respectively. The only free parameter is a learning rate, $\alpha$. The baseline implies the
expected reward, which is calculated as the running average of rewards as shown in the
second equation and thus its value is between 0 and 1. The initial baseline, $b_0$, was
estimated as the average reward of the first trials from a training session for each subject.
The original REINFORCE algorithm suggested by Williams also updates the policy
exploration with additional parameters. However, we used the policy mean, $\mu_{t+1}$, as the
deterministic prediction for the next movement, $\hat{u}_{t+1} = \mu_{t+1}$, and defined the exploration as
the difference between the actual search direction and the model prediction, as in the Bayesian
search model. In the first equation, the update of the policy mean depends on the product of
the reward prediction error, $r_t - b_t$, and the exploration, $u_t - \mu_t$. For example, if a subject
received a ‘smiley face’ ($r = 1$) in a trial, the reward prediction error becomes positive. This
implies that the direction (i.e., sign) of the policy mean update, $\mu_{t+1} - \mu_t$, is the same as that of
the exploration. In contrast, given a ‘sad face’ ($r = 0$), which corresponds to a negative reward
prediction error, the direction of the policy mean update is opposite to that of the
exploration.
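The policy-mean update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, and the initial policy mean mu0 is an extra argument introduced here because its initialization is not specified in the text.

```python
def policy_gradient_predictions(directions, rewards, alpha, b0, mu0):
    """Return the policy means used as predictions for trials 2..T.

    directions, rewards: observed search directions (degrees) and binary rewards.
    alpha: learning rate (the single free parameter of this model).
    b0: initial baseline, e.g. an average reward from a training session.
    mu0: initial policy mean (hypothetical argument, assumed for this sketch).
    """
    mu = mu0
    reward_sum = 0.0  # sum of rewards from trials 1..t-1
    predictions = []
    for t, (u, r) in enumerate(zip(directions[:-1], rewards[:-1]), start=1):
        # Baseline b_t: running average of b0 and the rewards of earlier trials.
        b = (b0 + reward_sum) / t
        # Update: learning rate x reward prediction error x exploration.
        mu = mu + alpha * (r - b) * (u - mu)
        predictions.append(mu)
        reward_sum += r
    return predictions
```

With the illustrative values policy_gradient_predictions([40, 70, 60], [1, 0, 1], alpha=0.5, b0=0.5, mu0=30.0), the result is [32.5, 18.4375]: after the rewarded movement at 40° the mean moves toward 40°, and after the non-rewarded movement at 70° it moves away from 70°.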
The second alternative model plans the next search movement direction by taking
the reward-weighted average of previous trials. The advantage of this model is that it
requires no learning rate. However, the model prediction heavily depends on how we
define a reward (Peters and Schaal, 2008). For a binary reward, a relative weight, $\epsilon$,
assigned to trials with a ‘sad face’ can be a free parameter, which is less than 1 when the
weight for a ‘smiley face’ is 1.
\[
r_i \in \{1, \epsilon\}, \quad 0 < \epsilon < 1, \qquad
\hat{u}_{t+1} = \frac{r_t u_t + r_{t-1} u_{t-1}}{r_t + r_{t-1}}
\tag{4.5}
\]
It is notable that the predictions always lie between the two previous actual search directions
and start from the third trial. Both of the alternative models have only one free parameter
to fit. We compared the performance of these predictors with that of the Bayesian search model
and analyzed how their predictions differ.
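A minimal Python sketch of the reward-weighted average predictor is given below, assuming that the prediction for the next trial uses the two most recent trials, as stated above. The function and argument names are hypothetical and introduced only for illustration.

```python
def reward_weighted_average_predictions(directions, rewards, eps):
    """Return predictions for trials 3..T from the two preceding trials.

    eps is the relative weight (0 < eps < 1) given to non-rewarded trials;
    rewarded trials have weight 1.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    weights = [1.0 if r == 1 else eps for r in rewards]
    predictions = []
    # k is the 0-based index of the most recent trial; the prediction is for trial k + 1.
    for k in range(1, len(directions) - 1):
        w_now, w_prev = weights[k], weights[k - 1]
        numerator = w_now * directions[k] + w_prev * directions[k - 1]
        predictions.append(numerator / (w_now + w_prev))
    return predictions
```

For example, reward_weighted_average_predictions([40, 70, 60, 50], [1, 0, 1, 1], 0.5) gives 50.0 for the third trial (between 40° and 70°, closer to the rewarded 40°) and about 63.3 for the fourth trial.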
4.2.5. Model-based regressors
The Bayesian search model predicts several hidden variables that are not visible from
behavioral observations. These variables were entered in the design matrix as explanatory
regressors to identify brain regions whose activities are correlated with them. First, we
estimated the reward prediction error for each trial as the difference between the actual
reward ($r$ = 0 or 1) and the expected reward, which is equivalent to the expected
probability of getting a reward. Since subjects were informed that the reward zone shrank
after every trial, the expected reward generally decreases. However, if subjects are
confident in their movements (i.e., a sharp posterior), they would expect a reward with
higher probability. Based on this idea, we calculated the expected reward as the area
under the normalized posterior distribution over the reward zone of the trial around the
current search direction (Fig. 4.3). Although subjects did not know exactly how the
reward zone shrank, we used the same reward zone as in the experimental design.
Second, the exploration was estimated trial-by-trial as the difference between the actual
search direction and the model prediction (Fig. 4.3), as explained in the description of our
model. Third, the uncertainty for each trial was estimated as the entropy of the posterior
distribution, as shown in the following equation (Fiorillo et al., 2003; Aron et al., 2004;
Yoshida and Ishii, 2006; Bach et al., 2011).
\[
H_{t+1} = -\int_{0}^{180} p(\theta_{t+1} \mid \mathbf{u}_t, \mathbf{r}_t) \, \log p(\theta_{t+1} \mid \mathbf{u}_t, \mathbf{r}_t) \, d\theta_{t+1}
\tag{4.6}
\]
The regressor for the exploration was correlated with brain activity after the onset of each
trial, which triggered the movement. However, those for the reward prediction error and the
uncertainty were correlated with brain activity after the reward feedback, which was delayed
by 1.3 sec from the trial onset, because subjects updated their posterior after the feedback.
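The three regressors can be computed from a discretized posterior as in the following Python sketch. It assumes a 1° grid over 0° to 180°, a reward zone described by a hypothetical half-width argument, and a Riemann-sum approximation of the entropy integral; none of these implementation details are taken from the analysis code of this study.

```python
import math

# Assumed discretization: candidate target directions 0..180 degrees in 1-degree steps.
GRID = [float(d) for d in range(181)]


def expected_reward(posterior, u, zone_half_width):
    """Posterior mass inside the reward zone centred on the search direction u."""
    total = sum(posterior)
    inside = sum(p for theta, p in zip(GRID, posterior)
                 if abs(theta - u) <= zone_half_width)
    return inside / total


def reward_prediction_error(r, posterior, u, zone_half_width):
    """Actual reward (0 or 1) minus the expected reward."""
    return r - expected_reward(posterior, u, zone_half_width)


def exploration(u_actual, u_predicted):
    """Deviation of the actual search direction from the model prediction."""
    return u_actual - u_predicted


def entropy(posterior):
    """Entropy of the normalized posterior (natural log, 1-degree bins)."""
    total = sum(posterior)
    h = 0.0
    for p in posterior:
        q = p / total
        if q > 0.0:  # bins with zero probability contribute nothing
            h -= q * math.log(q)
    return h
```

As a sanity check, a flat posterior over the 181 grid points has the maximal entropy log(181), and its expected reward for a zone of half-width 10° around 90° is 21/181.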
Figure 4.3. The reward-prediction errors (RPE) and the exploration calculated from a
posterior distribution: RPE was defined as the difference between the actual reward and the
expected reward, which is calculated as the area under the curve of a posterior distribution.
The exploration was defined as the difference between the actual search direction and the model
prediction.
4.3. Results
4.3.1. Behavioral results and model performance
Subjects missed 124 of the 4032 trials in the RL task because they did not
respond within the given movement time of 1.3 sec. Among the 288 hidden targets given to
24 subjects (4 targets × 3 sessions, 12 targets for each subject), we excluded all 14
trials of 56 targets from the analysis because each of those targets had at least one missed trial.
All the subjects were successful for at least 3 hidden targets.
Subjects could learn from the reward feedback on every trial. The average
reward rate, the number of rewarded trials divided by the total number of trials, generally
decreased over trials because the reward zone shrank, but it was significantly higher (p < 0.001)
than that of a random search except for the first trial (Fig. 4.4A). We analyzed whether subjects
showed the pattern expected from the IOR mechanism. As shown in Fig. 4.4B, the magnitude and
variance of search direction changes were significantly larger for all the trials after
non-rewarded trials ($r = 0$) than after rewarded trials ($r = 1$) (two-sided t-test and Levene’s
test: p < $10^{-7}$), supporting the IOR mechanism. It is notable that there are more non-rewarded
trials and that the distribution becomes narrower around zero for later trials due to the shrunken
reward zone. After most non-rewarded trials, the search direction changed by more than the level
of the mean motor noise, 5.5°, which we measured in a separate experiment, although
the changes became smaller for later trials. This pattern of IOR behavior was consistent for
several trials after a non-rewarded trial, although we do not present the detailed results here.
Figure 4.4C shows examples of the search behavior of one subject together with the
prediction of the Bayesian search model. The free parameter of the model, $\sigma$, was
optimized for each hidden target and subject. Owing to the shrinking reward zone (shown
as two curves), the subject could guide the search toward the target direction, and our
model predicted the behavior with some active search noise (i.e., exploration). For each
subject, we calculated the model performance as the mean absolute error between the
actual search directions and model predictions for all available trials.
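The fitting of the free parameter and the MAE-based performance measure can be sketched as follows. The optimizer used in this study is not described here, so the sketch simply assumes a search over a list of candidate values; the compact model re-implementation, the 1° grid, the Gaussian initial prior centred on the first movement, and all function names are likewise assumptions for illustration.

```python
import math

GRID = list(range(181))  # assumed 1-degree grid over 0..180 degrees


def map_predictions(directions, rewards, sigma):
    """Maximum-posterior predictions for trials 2..T of one hidden target."""
    def gauss(u):
        return [math.exp(-((th - u) ** 2) / (2.0 * sigma ** 2)) for th in GRID]

    belief = gauss(directions[0])  # assumed initial prior around the first movement
    predictions = []
    for u, r in zip(directions[:-1], rewards[:-1]):
        g = gauss(u)
        like = g if r == 1 else [1.0 - x for x in g]
        belief = [b * l for b, l in zip(belief, like)]
        total = sum(belief)
        if total == 0.0:
            raise ValueError("belief underflowed to zero; try a larger sigma")
        belief = [b / total for b in belief]
        # On exact ties, max() keeps the first (lowest) direction.
        predictions.append(GRID[max(range(len(GRID)), key=lambda i: belief[i])])
    return predictions


def mean_absolute_error(directions, rewards, sigma):
    """MAE between actual and predicted directions from the second trial onward."""
    preds = map_predictions(directions, rewards, sigma)
    errors = [abs(u - p) for u, p in zip(directions[1:], preds)]
    return sum(errors) / len(errors)


def fit_sigma(directions, rewards, candidates):
    """Pick the candidate sigma with the smallest MAE (simple grid search)."""
    return min(candidates, key=lambda s: mean_absolute_error(directions, rewards, s))
```

For the toy sequence directions = [40, 90] with rewards = [0, 0], mean_absolute_error(directions, rewards, 51.0) is 10.0, and fit_sigma(directions, rewards, [20.0, 51.0]) returns 51.0. These numbers only illustrate the procedure and are not results of the study.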
Figure 4.4. Behavioral data analysis and three latent variables as regressors: A, Reward
rate at each trial for behavior (blue, mean ± SEM) and random search (black). B, Adaptive
searching behavior given reward and non-reward conditions across trials: Subjects
explored less after rewards (bottom) than after non-rewards (top), and the exploration
decreased over trials, presumably due to the shrunken reward zone. C, Examples of
searching behavior with the Bayesian search model prediction: The model captures not only
exploitative but also explorative search behaviors. D, Predicted entropy as the uncertainty of
the posterior distribution for each trial: The shaded areas indicate trials with a non-reward.
The entropy generally decreases after rewards but increases after non-rewards, as
predicted by our model.
We compared the performance of the Bayesian search model with the alternative
models. For each subject, we calculated the averaged MAE over a maximum of 12 targets and
reported the mean of the averaged MAE for each model. There was no
significant difference between models (1-way ANOVA, F(2,69) = 0.21, p = 0.815; mean ±
STD, Bayesian search (BS): 20.17 ± 5.46, reward-weighted average (RWA): 21.38 ± 7.35, and
policy gradient (PG): 20.63 ± 6.83), and the three models made
similar predictions of behavior although their search algorithms are based on different
principles.
This similarity between models supports the Bayesian search model as a
robust predictor of behavior. As we described in the methods, regressors for fMRI analysis
were calculated from the Bayesian search model only, because the other models do not
provide the uncertainty information of subjects’ decisions on the search direction. The
prediction accuracy of the Bayesian model correlated significantly with subjects’ search
performance measured as the reward rate (r = 0.667, p < $10^{-30}$); that is, better performance
was associated with nearly optimal search following the Bayesian model (Yoshida and Ishii, 2006).
Figure 4.4D demonstrates model-based regressors for reward prediction errors,
exploration, and uncertainty with reward conditions (shaded: non-rewarded trials). The
uncertainty generally decreased as reward feedback accumulated over trials.
4.4. Discussion
We designed an event-related functional MRI experiment in which subjects had to
make a series of decisions to search for hidden targets. Behavioral analysis showed that
subjects could learn the position of hidden targets given binary feedback. Our suggested
Bayesian search model predicted search behavior, which is either
explorative or exploitative, reasonably well. The Bayesian search model predicted the next search
direction by sequentially updating a posterior distribution, from which we could calculate
three interesting latent variables: reward-prediction errors, exploration, and
uncertainty. We also presented alternative reinforcement learning models, namely the policy
gradient and reward-weighted average algorithms, which showed fitting performance similar to that of the
Bayesian search model (1-way ANOVA, F(2,69) = 0.21, p = 0.815). Therefore, we adopted
the Bayesian search model to provide the latent variables for further fMRI data analysis.
A key process of the suggested model is the estimation of a posterior distribution,
from which all three latent variables were calculated. In this process, the only free parameter,
$\sigma$, determines how much the posterior distribution is updated from the feedback for each
trial. In other words, the estimated $\sigma$ is an indicator of the reliability of the feedback that
subjects estimated. Therefore, it would be interesting to search for the neural correlates of the
reliability in contrast to the uncertainty.
The suggested model is deterministic in predicting the next search direction as the
maximum a posteriori estimate and cannot predict the exploration. Therefore, we calculated the
exploration as the fitting errors with a strong assumption that subjects used the same
strategy suggested by the Bayesian search algorithm. In addition, the model did not include
a motor noise term, which could be a part of the fitting error. Therefore, the validity of
the suggested model should also be supported by fMRI data analysis, by identifying
brain regions that are consistent with those from previous studies. The model-driven
exploration could be compared with a simple change of search direction, $u_{t+1} - u_t$,
which does not require the model. Similarly, the model-driven reward-prediction errors
could also be calculated without the model by defining the expected reward as the
running average of rewards from previous trials. We could compare the identified neural
correlates for the regressors calculated from these two approaches.
We are interested in three model-based regressors and their neural correlates, and
here we discuss the expected results from fMRI data analysis. For the reward-prediction
errors, the expected neural correlates are the basal ganglia, as most other studies have
found (Schultz et al., 1997; O’Doherty et al., 2003, 2004; Samejima et al., 2005;
Gershman et al., 2009; Kim et al., 2009). For the exploration, the expected neural
correlates are the bilateral frontopolar cortex, intraparietal sulcus (Daw et al., 2006), and
rostrolateral prefrontal cortex (Badre et al., 2012). These areas were more active
during explorative choices than during exploitative choices in a ‘four-armed bandit’ task. For
the uncertainty, the expected neural correlates are the dorsomedial prefrontal cortex (Hsu et
al., 2005; Yoshida and Ishii, 2006), anterior cingulate cortex (Behrens et al., 2007), and
dorsolateral prefrontal cortex (Badre et al., 2012).
Once the neural correlates are identified, we could apply MVPA techniques to
the regions of interest to see if we can extract relevant information from the pattern of the
neural responses. A recent study showed that the reward value of sensory cues can be decoded
from the distributed fMRI patterns in the orbitofrontal cortex (Kahnt et al., 2010).
Likewise, the exploration and the uncertainty could be decoded from the patterns of brain
responses in the regions of interest identified by the univariate analysis with the
corresponding regressors. We will further analyze the fMRI data to see if our results are
consistent with those of the previous studies or if there are any other interesting regions
which have not been reported so far.
Chapter 5
Conclusion
5.1. Summary
In this dissertation, we presented three studies on computational models and
model-based fMRI in motor learning. We took a combined approach of computational
modeling, behavioral analysis, and fMRI analysis to understand underlying behavioral
and neural mechanisms of human motor learning.
First, in Chapter 2, we suggested a unifying mechanism accounting for two
very well-known effects in motor learning: the contextual interference and the spacing
effects. We designed three different training conditions, FBK, TSP, and ALT. Then, we
simulated motor adaptation models containing both fast-learning-fast-forgetting and
slow-learning-slow-forgetting processes in these conditions. Both model selection and
model fitting showed that the faster increase in performance during training in FBK was due
to large fast process activation. In contrast, the slower increase in performance in ALT was
due to near-zero fast process activation and large slow process activation. In TSP,
performance was due to intermediate activation of both fast and slow processes. In same-day
retention, the contextual interference effect was caused by slower forgetting in ALT
than in FBK, as predicted by slow process activity. Over 24 hours, the spacing effect was
due to an increase in consolidation. Our results at least partially support the “forgetting-
reconstruction” hypothesis of the contextual interference effect.
Second, in Chapter 3, we searched for neural correlates of motor memories with
different time scales using model-based fMRI methods. We identified three
characteristic brain regions that were significantly correlated with the different variables:
prefrontal, intraparietal, and cerebellum. The prefrontal region was correlated with
regressors with the shorter time constants, the medial parietal cortex became more
dominant with the intermediate time constants, and the cerebellum was activated with the
longer time constants. A further multivariate classification analysis showed that the
neural activity in the cerebellar region progressively represented the two separate tasks.
Our results support the existence of multiple time constants in motor adaptation and give
additional support for a role of the cerebellum in the acquisition of multiple internal models.
Third, in Chapter 4, we designed a sequential decision making task involving
motor experience. Since the feedback is binary, subjects needed to learn the position of
hidden targets under uncertainty with a certain extent of exploration. We suggested a
Bayesian search model that updates a posterior distribution given binary feedback, and this
model could predict subjects’ search behaviors reasonably well. We estimated three
interesting quantities for every trial, namely reward-prediction errors, exploration, and
uncertainty, from the posterior distribution. Those quantities were only available from the
model and were entered into a design matrix for further fMRI data analysis. We have not
finished the fMRI data analysis yet, but we discussed the expected results based on previous
studies.
5.2. Future work
The adaptation models that we employed have a few major limitations. First, they
excluded the effects of rewards on long-term retention. We need to investigate how
the reward, “hitting a target” or “reducing errors”, can influence the state of the fast
and slow memories during and after adaptation (Huang et al., 2011). Second, the models
also did not consider cognitive and explorative strategies as a process of learning
(Mazzoni and Krakauer, 2006). It would be interesting to study how these strategies
affect the dynamics of the fast and slow memories. Third, we need to test the degree of
interference in the fast memory as we did in the slow memory. Although our model
suggested one shared fast memory which is the source of interference, it is still unclear
how the degree of interference depends on the time scale of memory. We have a
hypothesis that memories with shorter time scales would suffer from a higher degree of
interference, but this hypothesis should be tested in both behavioral and neural data
analysis.
What we found in the model-based fMRI study on the multi-rate motor memories
is mostly based on correlation analysis, and thus it is still not clear whether the identified brain
regions have any causal effects on the time scale of memory. Therefore, further studies
using TMS and tDCS to stimulate the identified regions would be necessary to
understand how the stimulation affects the speed of adaptation and the pattern of
long-term retention. This line of studies could be applied to develop more efficient
neuro-rehabilitation programs for patients with stroke to recover their lost motor functions.
While numerous modeling studies on human motor adaptation are being done,
few studies on human motor skill learning have been reported (Shmuelof et al., 2012b). The
lack of studies is mostly due to the difficulty of defining performance in skill learning,
although the speed-accuracy tradeoff is a well-known indicator of performance (Reis
et al., 2009; Shmuelof et al., 2012b). However, skill learning in robotics is a popular topic
of research, and numerous reinforcement learning algorithms have been suggested so far
(see review by Peters and Schaal, 2008). Applying algorithms developed in robotics to
study human motor skill learning could be an interesting reverse-engineering approach,
which could influence both robotics and neuroscience (Hapgood, 2006). It would also be
interesting to see how the multi-rate adaptation model can be applied to motor skill learning. In
addition, our hidden target searching task could be considered a cognitive task rather
than a motor task. Therefore, it would be interesting to employ a true motor skill learning
task such as pole-balancing during fMRI scanning and identify interesting brain activities.
Bibliography
Abe M, Schambra H, Wassermann EM, Luckenbaugh D, Schweighofer N, Cohen LG
(2011) Reward improves long-term retention of a motor memory through induction
of offline memory gains. Curr Biol 21:557–562.
Anguera JA, Reuter-Lorenz PA, Willingham DT, Seidler RD (2010) Contributions of
spatial working memory to visuomotor learning. J Cogn Neurosci 22:1917–1930.
Aron AR, Shohamy D, Clark J, Myers C, Gluck MA, Poldrack RA (2004) Human
midbrain sensitivity to cognitive feedback and uncertainty during classification
learning. J Neurophysiol 92:1144-1152.
Bach DR, Hulme O, Penny WD, Dolan RJ (2011) The known unknowns: neural
representation of second-order uncertainty, and ambiguity. J Neurosci 31:4811-
4820.
Badre D, Doll BB, Long NM, Frank MJ (2012) Rostrolateral prefrontal cortex and
individual differences in uncertainty-driven exploration. Neuron 73:595-607.
Behrens TE, Woolrich MW, Walton ME, Rushworth MF (2007) Learning the value of
information in an uncertain world. Nat Neurosci 10:1214-1221.
Bock O, Thomas M, Grigorova V (2005) The effect of rest breaks on human
sensorimotor adaptation. Exp Brain Res 163:258–260.
Bullmore E, Sporns O (2009) Complex brain networks: graph theoretical analysis of
structural and functional systems. Nat Rev Neurosci 10:186-198.
Buneo CA, Andersen RA (2006) The posterior parietal cortex: sensorimotor interface for
the planning and online control of visually guided movements. Neuropsychologia
44:2594-2606.
Clower DM, Hoffman JM, Votaw JR, Faber TL, Woods RP, Alexander GE (1996) Role
of posterior parietal cortex in the recalibration of visually guided reaching. Nature
383:618-621.
Coull J, Frith C, Frackowiak RSJ, Grasby P (1996) A fronto-parietal network for rapid
visual information processing: a PET study of sustained attention and working
memory. Neuropsychologia 34:1085-1095.
Coull JT (1998) Neural correlates of attention and arousal: insights from
electrophysiology, functional neuroimaging and psychopharmacology. Prog
Neurobiol 55:343-361.
Criscimagna-Hemminger SE, Shadmehr R (2008) Consolidation patterns of human motor
memory. J Neurosci 28:9610–9618.
Cross ES, Schmitt PJ, Grafton ST (2007) Neural substrates of contextual interference
during motor learning support a model of active preparation. J Cogn Neurosci
19:1854-1871.
Cuddy LJ, Jacoby LL (1982) When forgetting helps memory: An analysis of repetition
effects. J Verbal Learning Verbal Behav 21:451-467.
Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for
exploratory decisions in humans. Nature 441:876-879.
DiCiccio TJ, Efron B (1996) Bootstrap confidence intervals. Stat Sci 11:189–228.
Della-Maggiore V, McIntosh AR (2005) Time course of changes in brain activity and
functional connectivity associated with long-term adaptation to a rotational
transformation. J Neurophysiol 93:2254–2262.
Diedrichsen J, Hashambhoy Y, Rane T, Shadmehr R (2005) Neural correlates of reach
errors. J Neurosci 25:9919-9931.
Doya K (2000) Complementary roles of basal ganglia and cerebellum in learning and
motor control. Curr Opin Neurobiol 10:732-739.
Ebbinghaus H (1913) Memory: a contribution to experimental psychology. New York:
Teachers College, Columbia University.
Eckstein M, Schoonveld W, Zhang S (2010) Optimizing eye movements in search for
rewards. J Vis 10:33-33.
Ethier V, Zee DS, Shadmehr R (2008) Spontaneous recovery of motor memory during
saccade adaptation. J Neurophysiol 99:2577–2583.
Feltz DL, Landers DM (1983) The effects of mental practice on motor skill learning and
performance: A meta-analysis. J Sport Psychol 5:25-57.
Fiorillo CD, Tobler PN, Schultz W (2003) Discrete coding of reward probability and
uncertainty by dopamine neurons. Science 299:1898-1902.
Francis JT (2005) Influence of the inter-reach-interval on motor learning. Exp Brain Res
167:128–131.
Galea JM, Vazquez A, Pasricha N, de Xivry JJ, Celnik P (2011) Dissociating the roles
of the cerebellum and motor cortex during adaptive learning: the motor cortex
retains what the cerebellum learns. Cereb Cortex 21:1761-1770.
Gershman SJ, Pesaran B, Daw ND (2009) Human reinforcement learning subdivides
structured action spaces by learning effector-specific values. J Neurosci
29:13524-13531.
Graydon FX, Friston KJ, Thomas CG, Brooks VB, Menon RS (2005) Learning-related
fMRI activation associated with a rotational visuo-motor transformation. Cogn
Brain Res 22:373-383.
Hadipour-Niktarash A, Lee CK, Desmond JE, Shadmehr R (2007) Impairment of
retention but not acquisition of a visuomotor skill through time-dependent
disruption of primary motor cortex. J Neurosci 27:13413-13419.
Hapgood F (2006) Reverse-engineering the Brain. Technol Rev 109:M12-M17.
Hsu M, Bhatt M, Adolphs R, Tranel D, Camerer CF (2005) Neural systems responding
to degrees of uncertainty in human decision-making. Science 310:1680-1683.
Huang VS, Shadmehr R (2007) Evolution of motor memory during the seconds after
observation of motor error. J Neurophysiol 97:3976–3985.
Huang VS, Haith A, Mazzoni P, Krakauer JW (2011) Rethinking motor learning and
savings in adaptation paradigms: model-free memory for successful actions
combines with internal models. Neuron 70:787–801.
Imamizu H, Kuroda T, Miyauchi S, Yoshioka T, Kawato M (2003) Modular
organization of internal models of tools in the human cerebellum. Proc Natl Acad
Sci U S A 100:5461-5466.
Imamizu H, Miyauchi S, Tamada T, Sasaki Y, Takino R, Pütz B, Yoshioka T, Kawato
M (2000) Human cerebellar activity reflecting an acquired internal model of a
new tool. Nature 403:192-195.
Inoue K, Kawashima R, Satoh K, Kinomura S, Sugiura M, Goto R, Ito M, Fukuda H
(2000) A PET study of visuomotor learning under optical rotation. Neuroimage
11:505-516.
Izawa J, Shadmehr R (2011) Learning from sensory and reward prediction errors
during motor adaptation. PLoS Comput Biol 7:e1002012.
Izawa J, Rane T, Donchin O, Shadmehr R (2008) Motor adaptation as a process of
reoptimization. J Neurosci 28:2883-2891.
Joiner WM, Smith MA (2008) Long-term retention explained by a model of short-term
learning in the adaptive control of reaching. J Neurophysiol 100:2948–2955.
Kahnt T, Heinzle J, Park SQ, Haynes JD (2010) The neural code of reward
anticipation in human orbitofrontal cortex. Proc Natl Acad Sci U S A 107:6010-
6015.
Kantak SS, Sullivan KJ, Fisher BE, Knowlton BJ, Winstein CJ (2010) Neural
substrates of motor memory consolidation depend on practice structure. Nat
Neurosci 13:923-925.
Karni A, Meyer G, Rey-Hipolito C, Jezzard P, Adams MM, Turner R, Ungerleider LG
(1998) The acquisition of skilled motor performance: fast and slow experience-
driven changes in primary motor cortex. Proc Natl Acad Sci U S A 95:861-868.
Kawato M (1999) Internal models for motor control and trajectory planning. Curr Opin
Neurobiol 9:718-727.
Keisler A, Shadmehr R (2010) A shared resource between declarative memory and motor
memory. J Neurosci 30:14817–14823.
Kim H, Sul JH, Huh N, Lee D, Jung MW (2009) Role of striatum in updating values of
chosen actions. J Neurosci 29:14701-14712.
Klein RM, MacInnes WJ (1999) Inhibition of return is a foraging facilitator in visual
search. Psychol Sci 10:346-352.
Körding KP, Wolpert DM (2004) The loss function of sensorimotor learning. Proc Natl
Acad Sci U S A 101:9839-9842.
Körding KP, Wolpert DM (2004) Bayesian integration in sensorimotor learning. Nature
427:244-247.
Körding KP, Tenenbaum JB, Shadmehr R (2007) The dynamics of memory as a
consequence of optimal adaptation to a changing body. Nat Neurosci 10:779–786.
Krakauer JW, Ghilardi M-F, Mentis M, Barnes A, Veytsman M, Eidelberg D, Ghez C
(2004) Differential cortical and subcortical activations in learning rotations and
gains for reaching: a PET study. J Neurophysiol 91:924-933.
Krakauer JW, Mazzoni P, Ghazizadeh A, Ravindran R, Shadmehr R (2006)
Generalization of motor learning depends on the history of prior action. PLoS Biol
4:e316.
Landi SM, Baguear F, Della-Maggiore V (2011) One week of motor adaptation induces
structural changes in primary motor cortex that predict long-term memory one year
later. J Neurosci 31:11808-11813.
Lee TD, Genovese ED (1988) Distribution of practice in motor skill acquisition:
Learning and performance effects reconsidered. Res Q Exerc Sport 59:277-287.
Lee TD, Magill RA (1983) The locus of contextual interference in motor-skill acquisition.
J Exp Psychol Learn Mem Cogn 9:730–746.
Lee TD, Magill RA (1985) Can forgetting facilitate skill acquisition? Adv Psychol 27:3-
22.
Lee J-Y, Schweighofer N (2009) Dual adaptation supports a parallel architecture of
motor memory. J Neurosci 29:10396–10404.
Luauté J, Schwartz S, Rossetti Y, Spiridon M, Rode G, Boisson D, Vuilleumier P (2009)
Dynamic changes in brain activity during prism adaptation. J Neurosci 29:169-178.
Magill RA, Hall KG (1990) A review of the contextual interference effect in motor skill
acquisition. Hum Mov Sci 9:241-289.
Mazzoni P, Krakauer JW (2006) An implicit plan overrides an explicit strategy during
visuomotor adaptation. J Neurosci 26:3642–3645.
Mur M, Bandettini PA, Kriegeskorte N (2009) Revealing representational content with
pattern-information fMRI - an introductory guide. Soc Cogn Affect Neurosci 4:101-
109.
Najemnik J, Geisler WS (2005) Optimal eye movement strategies in visual search.
Nature 434:387-391.
Norman KA, Polyn SM, Detre GJ, Haxby JV (2006) Beyond mind-reading: multi-voxel
pattern analysis of fMRI data. Trends Cogn Sci 10:424-430.
O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ (2003) Temporal difference
models and reward-related learning in the human brain. Neuron 38:329-337.
O'Doherty JP, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ (2004) Dissociable
roles of ventral and dorsal striatum in instrumental conditioning. Science 304:452-
454.
O'Doherty JP, Hampton A, Kim H (2007) Model-based fMRI and its application to
reward learning and decision making. Ann N Y Acad Sci 1104:35-53.
Ogawa K, Imamizu H (2013) Human sensorimotor cortex represents conflicting
visuomotor mappings. J Neurosci 33:6412–6422.
Oldfield RC (1971) The assessment and analysis of handedness: the Edinburgh inventory.
Neuropsychologia 9:97-113.
Olesen PJ, Westerberg H, Klingberg T (2003) Increased prefrontal and parietal activity
after training of working memory. Nat Neurosci 7:75-79.
Osu R, Hirai S, Yoshioka T, Kawato M (2004) Random presentation enables subjects to
adapt to two opposing forces on the hand. Nat Neurosci 7:111–112.
Pekny SE, Criscimagna-Hemminger SE, Shadmehr R (2011) Protection and expression
of human motor memories. J Neurosci 31:13829–13839.
Pessoa L, Gutierrez E, Bandettini PA, Ungerleider LG (2002) Neural correlates of visual
working memory: fMRI amplitude predicts task performance. Neuron 35:975-987.
Peters J, Schaal S (2008) Reinforcement learning of motor skills with policy
gradients. Neural Netw 21:682-697.
Pyle WH (1919) Transfer and interference in card-distributing. J Educ Psychol 10:107-
110.
Redding GM, Wallace B (1996) Adaptive spatial alignment and strategic perceptual-motor
control. J Exp Psychol Hum Percept Perform 22:379-394.
Samejima K, Ueda Y, Doya K, Kimura M (2005) Representation of action-specific
reward values in the striatum. Science 310:1337-1340.
Sapir A, Soroker N, Berger A, Henik A (1999) Inhibition of return in spatial attention:
direct evidence for collicular generation. Nat Neurosci 2:1053-1054.
Schmidt RA, Lee TD (1999) Motor control and learning, Ed 3. Champaign, IL: Human
Kinetics.
Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and
reward. Science 275:1593-1599.
Schweighofer N, Doya K, Fukai H, Chiron JV, Furukawa T, Kawato M (2004) Chaos
may enhance information transmission in the inferior olive. Proc Natl Acad Sci U S
A 101:4655-4660.
Schweighofer N, Lee J-Y, Goh H-T, Choi Y, Kim SS, Stewart JC, Lewthwaite R,
Winstein CJ (2011) Mechanisms of the contextual interference effect in individuals
poststroke. J Neurophysiol 106:2632–2641.
Seidler RD, Noll DC (2008) Neuroanatomical correlates of motor acquisition and motor
transfer. J Neurophysiol 99:1836-1845.
Shadmehr R, Holcomb HH (1997) Neural correlates of motor memory consolidation.
Science 277:821-825.
Shea JB, Morgan RL (1979) Contextual interference effects on the acquisition, retention,
and transfer of a motor skill. J Exp Psychol Learn Mem Cogn 5:179–187.
Shmuelof L, Huang VS, Haith AM, Delnicki RJ, Mazzoni P, Krakauer JW (2012a)
Overcoming motor “forgetting” through reinforcement of learned actions. J
Neurosci 32:14617–14621.
Shmuelof L, Krakauer JW, Mazzoni P (2012b) How is a motor skill learned? Change and
invariance at the levels of task success and trajectory control. J Neurophysiol
108:578-594.
Smith MA, Ghazizadeh A, Shadmehr R (2006) Interacting adaptive processes with
different timescales underlie short-term motor learning. PLoS Biol 4:e179.
Sutton RS, Barto AG (1998) Reinforcement learning. Cambridge, MA: MIT.
Tanaka S, Honda M, Hanakawa T, Cohen LG (2010) Differential contribution of the
supplementary motor area to stabilization of a procedural motor skill acquired
through different practice schedules. Cereb Cortex 20:2114–2121.
Tanaka H, Krakauer JW, Sejnowski TJ (2012) Generalization and multirate models of
motor adaptation. Neural Comput 24:939–966.
Taylor K, Rohrer D (2010) The effects of interleaved practice. Appl Cogn Psychol
24:837–848.
Tulving E, Thomson DM (1973) Encoding specificity and retrieval processes in episodic
memory. Psychol Rev 80:352–373.
Vaswani PA, Shadmehr R (2013) Decay of motor memories in the absence of error. J
Neurosci 33:7700-7709.
Verstynen T, Sabes PN (2011) How each movement changes the next: an experimental
and theoretical study of fast adaptive priors in reaching. J Neurosci 31:10050–
10059.
Wolpert DM, Landy MS (2012) Motor control is decision-making. Curr Opin Neurobiol
22:996-1003.
Wolpert DM, Miall RC, Kawato M (1998) Internal models in the cerebellum. Trends
Cogn Sci 2:338-347.
Wolpert DM, Diedrichsen J, Flanagan JR (2011) Principles of sensorimotor learning. Nat
Rev Neurosci 12:739-751.
Woolley DG, Tresilian JR, Carson RG, Riek S (2007) Dual adaptation to two opposing
visuomotor rotations when each is associated with different regions of workspace.
Exp Brain Res 179:155–165.
Woolley DG, de Rugy A, Carson RG, Riek S (2011) Visual target separation determines
the extent of generalisation between opposing visuomotor rotations. Exp Brain
Res 212:213-224.
Xue G, Dong Q, Chen C, Lu Z, Mumford JA, Poldrack RA (2010) Greater neural
pattern similarity across repetitions is associated with better memory. Science
330:97–101.
Yoshida W, Ishii S (2006) Resolution of uncertainty in prefrontal cortex. Neuron 50:781-
789.
Zarahn E, Weston GD, Liang J, Mazzoni P, Krakauer JW (2008) Explaining savings for
visuomotor adaptation: linear time-invariant state-space models are not sufficient. J
Neurophysiol 100:2537–2548.
Abstract
In the last decade, computational models of motor learning have become popular because they provide a theoretical framework that not only explains but also predicts motor learning behaviors. However, the computational approach has sometimes been criticized by experimentalists, who dominate neuroscience, because the hypothetical models lack support from underlying neural mechanisms. In this dissertation, we combined computational modeling with neuroimaging methods such as fMRI to understand human motor memories and decision making in motor learning. First, we provided a unifying mechanism that accounts for the spacing and contextual interference effects using multi-rate motor adaptation models: model comparison and retention performance analyses showed how varying the learning schedule influences the dynamics of fast and slow motor memories. Second, we searched for neural correlates of the motor memories predicted by multi-rate motor adaptation models: we hypothesized that different brain regions would correlate with the activity of motor memories at different time scales. We observed a gradual shift of the correlated areas from fronto-parietal regions to the cerebellum as the time scale became slower. In addition, using multi-voxel pattern analysis (MVPA), we found results supporting separate representations of learning opposing rotations in the cerebellum. Third, we searched for neural correlates of decision making in motor learning by designing an fMRI experiment in which subjects searched for a hidden target through movements. We hypothesized that subjects used a Bayesian strategy, updating the next movement direction given binary reward feedback, to search for the target with exploration. We focused on behavioral results with computational models and left the fMRI data analysis for future work, discussing the results expected from similar studies of the neural correlates of reward-prediction errors, exploration, and uncertainty in motor learning.
Asset Metadata
Creator
Kim, Sung Shin (author)
Core Title
Computational models and model-based fMRI studies in motor learning
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Neuroscience
Publication Date
02/01/2014
Defense Date
06/13/2013
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Bayesian,contextual interference effects,fMRI,motor learning,multi-rate adaptation,OAI-PMH Harvest,reinforcement learning,spacing effects
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Schweighofer, Nicolas (committee chair), Gordon, James G. (committee member), Lv, Jinchi (committee member), Schaal, Stefan (committee member), Tjan, Bosco S. (committee member)
Creator Email
holykim79@gmail.com,sungshik@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-310639
Unique identifier
UC11294875
Identifier
etd-KimSungShi-1916.pdf (filename),usctheses-c3-310639 (legacy record id)
Legacy Identifier
etd-KimSungShi-1916.pdf
Dmrecord
310639
Document Type
Dissertation
Rights
Kim, Sung Shin
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA