THE REPRESENTATION, LEARNING, AND CONTROL OF DEXTEROUS
MOTOR SKILLS IN HUMANS AND HUMANOID ROBOTS
by
Michael Nalin Mistry
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
May 2009
Copyright 2009 Michael Nalin Mistry
Acknowledgements
Although there is only a single name on the cover of this dissertation, it goes without
saying that this work is the accumulation of countless hours of discussions, ideas, and
efforts of a significant number of individuals. It is unfortunate that only this single brief
section is used to acknowledge the efforts of all.
I would especially like to thank Stefan Schaal for his patience, support, and guid-
ance throughout this entire process. His generosity has far exceeded the requirements
for a typical Ph.D. advisor. I also thank my dissertation and qualification exam commit-
tee members: Henryk Flashner, Nicolas Schweighofer, Gaurav Sukhatme, and Francisco
Valero-Cuevas. I am also especially grateful to my supervisors at the ATR Computational
Neuroscience Laboratories: Mitsuo Kawato, Gordon Cheng, and Erhan Oztop. Additionally,
the following individuals have contributed to this thesis, either directly or indirectly,
in a small or large portion, and should therefore be recognized: Darrin Bentivenga, Jonas
Buchli, Daniel Bullock, Etienne Burdet, Rick Cory, Kazuko Davis, Aaron D’Souza, Yuka
Furukawa, Ganesh Gowrishankar, Joshua Hale, Joseph Hart, Heiko Hoffmann, Sang-ho
Hyon, Mrinal Kalakrishnan, Gary Liaw, Takamitsu Matsubara, Peyman Mohajerian, Jan
Moren, Jun Morimoto, Srideep Musuvathy, Jun Nakanishi, Nakano Nao, Nerses Ohanyan,
Dae-Hyung Park, Peter Pastor, Jan Peters, Dimitris Pongas, Keng Peng Tee, Evangelos
Theodorou, Jo-Anne Ting, Ales Ude, and Toshinori Yoshioka.
Finally I would like to thank my parents Nalin and Jette, as well as my brother and
sister-in-law, Nicholas and Kristina, for their love and continuous encouragement.
Table of Contents

Acknowledgements
List Of Tables
List Of Figures
Abstract

Chapter 1: Introduction
1.1 Background
1.1.1 Internal models
1.1.2 Redundancy resolution
1.1.3 Impedance control
1.2 Dissertation Outline

Chapter 2: Resolving Kinematic Redundancy in Human Reaching
2.1 Introduction
2.2 The Exoskeleton Apparatus
2.2.1 Exoskeleton Control Architecture
2.2.2 Exoskeleton Control Formulation
2.2.2.1 Low Level Joint Control
2.2.2.2 Gravity, Coriolis, and Inertia Compensation
2.2.2.3 Parameter Identification
2.3 Experimental Procedure
2.4 Results
2.4.1 Experiment 1
2.4.2 Experiment 2
2.5 Discussion
2.5.1 Control model
2.5.1.1 Model Simulation
2.6 Conclusion

Chapter 3: Desired Trajectories in Human Reaching
3.1 Introduction
3.2 Stochastic Optimal Control
3.2.1 Control formulation
3.2.2 A model of sensorimotor control of human reaching
3.2.3 Reoptimization as a model of learning
3.2.3.1 Modeling catch trials
3.2.3.2 Example with a viscous curl field
3.3 Reaching Experiment
3.3.1 Apparatus
3.3.2 Experimental Procedure
3.3.2.1 Experiment A
3.3.2.2 Experiment B
3.3.2.3 Data Analysis
3.3.3 Results
3.3.3.1 Experiment A
3.3.3.2 Experiment B
3.3.4 Modeling using stochastic optimal control
3.3.5 Modeling using a directionally biased cost function
3.4 Discussion

Chapter 4: Task Space Control for Humanoid Robots
4.1 Introduction
4.2 The Floating Base Representation of Robot Kinematics
4.3 Floating Body Kinematics
4.3.1 Floating Base Jacobians
4.3.2 Inverse Kinematics with Floating Base
4.4 Task Prioritization
4.5 Task Space Control for Humanoid Tasks
4.5.1 Evaluations
4.5.1.1 Platform
4.5.1.2 COG Tracking
4.5.1.3 Hand Tracking with Simultaneous Balance
4.5.2 Discussion
4.6 Task space control for simultaneous locomotion and balance
4.6.1 Controller Formulations
4.6.1.1 Locomotion Control Method 1
4.6.1.2 Locomotion Control Method 2
4.6.1.3 Locomotion Control Method 3
4.6.2 Evaluations
4.6.2.1 Platform
4.6.2.2 Locomotion Performance
4.6.2.3 Constraint Switching
4.6.2.4 Swing Leg Tracking
4.6.3 Discussion
4.7 Conclusion

Chapter 5: Model Based Control for Humanoid Robots
5.1 Introduction
5.2 Floating Base Dynamics
5.2.1 Derivation
5.3 Orthogonal decomposition for reduced dimensionality rigid-body dynamics
5.3.1 The QR decomposition
5.4 Evaluations
5.4.1 Platform
5.4.2 Joint Tracking with Feedforward Control
5.4.3 Weighted Distribution of Forces
5.4.4 Superposition of joint space and operational space control
5.4.5 Quadruped locomotion
5.5 Conclusion

Chapter 6: Conclusion
6.1 Summary of Dissertation Contributions
6.2 Summary of Chapters
6.2.1 Chapter 2
6.2.2 Chapter 3
6.2.3 Chapter 4
6.2.4 Chapter 5

Reference List

Appendix A
Chapter 2: Additional Results
A.1 Experiment 1
A.2 Experiment 2

Appendix B
Proofs of Task Prioritization Equations
B.1 Proof of Equation (4.16)
B.2 Proof of Equation (4.17)
List Of Tables

3.1 Summary of experiment A results
3.2 Summary of experiment B results
List Of Figures

1.1 Internal model control with feedback
2.1 SARCOS master arm with user
2.2 Subject A: Experiment 1
2.3 Subject B: Experiment 2
2.4 SL master arm
2.5 Experiment 1: model results
2.6 Experiment 2: model results
2.7 Experiment 1: model results without task-space force field compensation
3.1 SOC Model: curl field
3.2 Acceleration Force Field
3.3 ATR Planar Force Manipulandum
3.4 Experiment A: averaged trajectories
3.5 Experiment A: averaged trajectories, early learning trials
3.6 Experiment A: learning curve
3.7 Experiment B: averaged trajectories
3.8 Experiment B: averaged trajectories, early learning trials
3.9 Experiment B: learning curve
3.10 Stochastic optimal control trajectories
3.11 Stochastic optimal control: reoptimization trajectories
3.12 Stochastic optimal control: ramp learning trajectories
3.13 Exponential decay term
3.14 Direction biased cost function: viscous curl field
3.15 Direction biased stochastic optimal control trajectories
3.16 Direction biased stochastic optimal control: reoptimization trajectories
3.17 Direction biased stochastic optimal control: ramp learning trajectories
3.18 Position and velocity gains of optimal controller
4.1 Controlling a humanoid robot in task space
4.2 The floating base representation
4.3 Inverse Kinematics/Inverse Dynamics control diagram
4.4 Examples of adequately constrained systems
4.5 Examples of under-constrained systems
4.6 Sarcos/ATR CBi Humanoid Robot
4.7 COG tracking task performance
4.8 Hand tracking task performance: time series
4.9 Hand tracking task performance
4.10 Hand tracking task: COG performance
4.11 LittleDog Robot Simulator
4.12 Locomotion COG tracking
4.13 Locomotion foot tracking
4.14 Locomotion desired knee velocity (controller 2)
4.15 Locomotion desired knee velocity (controller 3)
4.16 Locomotion null space distortion
5.1 Floating Base Representation
5.2 Sarcos CBi robot and SL biped simulator
5.3 Simulated squatting evaluation
5.4 Simulated squatting evaluation: slow tracking performance
5.5 Simulated squatting evaluation: fast tracking performance
5.6 Weighted torque distribution
5.7 Weighted torque distribution: tracking performance
5.8 Simulated stepping evaluation
5.9 Simulated stepping evaluation: tracking performance
5.10 LittleDog evaluation: tracking performance
A.1 Subject B: Experiment 1
A.2 Subject C: Experiment 1
A.3 Subject D: Experiment 1
A.4 Subject E: Experiment 1
A.5 Subject A: Experiment 2
A.6 Subject C: Experiment 2
A.7 Subject D: Experiment 2
A.8 Subject E: Experiment 2
Abstract

Humans are capable of executing a wide variety of complex and dexterous motor behavior
with grace and elegance. We would like to understand the computational principles
underlying their sensorimotor control, both for the advancement of neuroscience and to
achieve similar performance on artificial machines. I study the problem via two
approaches: motor control experiments on human subjects, and the implementation of
control models on anthropomorphic hardware. In one experiment, we use a novel
exoskeleton platform that permits us to alter arm dynamics by applying torques to specific
joints during full 3-D motion, and study how arm redundancy is utilized. The results of
this experiment suggest that: 1) humans are able to learn, and later predict, novel dynamic
environments, 2) humans plan motion in an extrinsic hand, or end-effector oriented, space,
and 3) joint redundancy can be utilized in order to assure task achievement. In a second
experiment, we attempt to create a dynamic environment that affects motor cost, but not
accuracy: if a force perturbation first pushes the hand off-course in one direction and then
subsequently in the opposite direction, the reaching task may still be achieved with minimal
correction, but a strongly curved trajectory. Under such conditions, subjects learn to
return their trajectories towards the baseline, or null field, trajectory, effectively fighting
this disturbance in order to maintain a straight trajectory. These results are inconsistent
with theories suggesting that perceived kinematic error plays no role in motor adaptation.
I will demonstrate how improved predictions can be obtained with a stochastic optimal
controller using a directionally biased cost function. In the second part of my dissertation,
I discuss how to obtain robust, compliant, human-like motion of humanoid robots via
task-space, model-based, feed-forward control. I will explain the challenges of doing so,
namely the under-actuation due to the “floating” base unattached to the world, the generally
unknown contact forces, and the ill-posed nature of floating base inverse dynamics. I will
show how to overcome these obstacles by planning task-space motion in a sufficiently
constrained space, and by computing inverse dynamics torques by first projecting the system
dynamics into a reduced dimensional space which is independent of contact forces.
Chapter 1
Introduction
A primary goal of this dissertation is to study both humans and humanoid robots syn-
ergistically. Since both share the same underlying anthropomorphic form, at least in
theory, both should be capable of similar function. The investigation of motor behavior
on either will potentially lead to advances and insights towards the other. We are already
aware of the proficiency of motor behavior in humans, particularly in the learning of new
skills and the grace and elegance in execution. To date, no artificial machine has been
able to emulate such a high level of performance. Thus, there is much to be gained from
the study of human motor control. However, we should not ignore the potential of the
humanoid robot as a tool for neuroscience. By implementing biologically inspired control
models on anthropomorphic hardware, we are forced to examine some of the difficult
issues already solved by the nervous system and crucial for understanding human brain
function as a whole. For example, how should we control our hand motion to most ro-
bustly or efficiently resolve our remaining redundancy? How can we best resolve multiple
tasks simultaneously? How can we control these tasks with a high level of accuracy while
simultaneously remaining compliant and robust to disturbances? Our hope is that by
resolving such issues by implementing controllers on artificial systems, we will develop
insights into the underlying mechanisms of biological motor systems, possibly unavailable
by other research means.
Until recently, robotic hardware has not been capable of emulating human perfor-
mance in such a manner that we could seriously address such issues. However, the latest
advancements in humanoid robotics have produced high degree of freedom systems with
anthropomorphic configurations of human-proportional weight, size, and strength. Sci-
entists and engineers, now less constrained by hardware limitations, increasingly need to
look towards biology to achieve human-like control of humanoid systems. Thus, it is the
beginning of an exciting time for research in computational motor control and humanoid
robotics.
We should stress that humans and robots are still remarkably different: the electric,
hydraulic or pneumatic actuators used in robotics are highly dissimilar to the muscles
operating in an antagonistic manner with nonlinear spring-like properties, not to mention
the differences in sensory processing, computation, energy utilization, etc. However, at a
higher level in the systems hierarchy, there exist elements common to both systems and
a plethora of issues to be investigated and understood. In this work, it is primarily at
this higher level of abstraction where we investigate theories of motor control common to
both humans and humanoid robots. Specifically, the issues examined include:
Representation: when the brain plans a movement, such as reaching the hand to an
object, which coordinate system does it use to represent the task? Is it in extrinsic
coordinates (i.e. hand, task, or operational space)? Or rather in intrinsic coordinates
(joint position, joint velocity, torque, muscle length, etc.)? How can we efficiently
represent tasks to plan motion for a complex 50 degree-of-freedom humanoid robot?

Learning: what mechanisms does the brain have in order to learn new tasks or adapt to
new dynamic environments (for example, when limbs grow and dynamic properties
change)? How can our robots adapt to unknown environments? What are the
principles that guide adaptation in humans?

Control: given a task or a trajectory plan, how does the brain execute the movement,
resolving any redundancy and dealing with disturbances? How can we control our
robots to be as robust and compliant as humans? How does the brain utilize the
available sensory feedback information, and which sensory information is best for
our robots?
The remainder of this chapter will give an overview of the background and concepts
I will be using in this thesis. I will discuss some of the key challenges in this work, and
will briefly highlight the contributions of this dissertation.
1.1 Background
The underlying framework that we will use as a model of movement control and execution
in both humans and humanoid robots is shown in its simplest form in figure 1.1. Here
we represent a controlled object that can either be a human or robot. The vector u
represents some control command signal, which could be forces, torques, actuator voltage
or current, muscle lengths, motor neuron activation, etc. What is important is that
there are two control pathways by which we generate control commands.

Figure 1.1: Internal model control with feedback

The internal model pathway (or feed-forward pathway) computes the precise control
commands u_ff required to realize some desired task variable x_d. The feedback pathway
uses the error between the desired task variable x_d and the sensed (or estimated) task
state x to compute the control command u_fb. This model forms the basis of both internal model
theories in humans (Kawato, Furukawa & Suzuki 1987, Kawato & Gomi 1992, Kawato
& Wolpert 1998, Kawato 1999) and inverse dynamics control in robotics (Sciavicco &
Siciliano 2001).
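As a toy sketch (not taken from the dissertation), the two control pathways can be written down for a 1-D point mass: the internal model supplies the feed-forward command and a PD law supplies the feedback correction. The mass and gain values below are illustrative assumptions.

```python
# Toy sketch of the two-pathway controller of figure 1.1 for a 1-D
# point mass. The mass and gain values are illustrative, not taken
# from the dissertation.
m = 2.0                 # mass of the controlled object
K_p, K_v = 50.0, 10.0   # feedback stiffness and damping gains

def u_ff(xdd_d):
    """Feed-forward pathway: internal (inverse) model of the plant.
    For a point mass, the exact inverse dynamics is just m * accel."""
    return m * xdd_d

def u_fb(x_d, xd_d, x, xd):
    """Feedback pathway: PD correction on the sensed task state."""
    return K_p * (x_d - x) + K_v * (xd_d - xd)

def control(x_d, xd_d, xdd_d, x, xd):
    """Total command: sum of the two pathways, as in figure 1.1."""
    return u_ff(xdd_d) + u_fb(x_d, xd_d, x, xd)

# With no tracking error, the feedback pathway contributes nothing
# and the command reduces to the feed-forward term alone.
u = control(x_d=1.0, xd_d=0.0, xdd_d=3.0, x=1.0, xd=0.0)
```

When the internal model is accurate, the feedback pathway only mops up disturbances and modeling error, which is what allows fast movement despite slow sensory loops.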
1.1.1 Internal models
Internal models in motor control are hypothetical neural mechanisms used for sensory
prediction and motor command computation. The so called forward internal model is
used to map motor commands to their sensory consequences, while the inverse internal
model computes the motor commands required for a desired trajectory (Kawato 1999).
Arguments in support of the internal model hypothesis claim that fast coordinated arm
movements cannot be executed solely under feedback control, since the latency of sensory
feedback (vision, proprioception, etc.) is too long: spinal feedback loops
typically take 30-50 ms, while visual feedback requires 150-250 ms. These latencies are
large compared to arm movements which can be as fast as 150 ms. Opponents of the in-
ternal model hypothesis claim that the viscoelastic properties of muscles combined with
fast peripheral reflex loops allow the limbs to be controlled by prescribing an equilib-
rium point trajectory, without requiring explicit knowledge of musculoskeletal dynamics
(Feldman 1986, Feldman, Ostry, Levin, Gribble & Mitnitski 1998). While the equilibrium
point hypothesis is appealing for its computational simplicity, Gomi & Kawato (1996)
demonstrated that the trajectories required to execute such movements can be arbitrarily
complex (and require knowledge of arm dynamics to compute), and that fast movements may
require stiffness levels that are not feasible. In human balance, Morasso demonstrates
that the ankle stiffness required for postural stabilization solely via feedback control would
be higher than biologically possible (Morasso, Baratto, Capra & Spada 1999, Morasso
& Schieppati 1999, Morasso & Sanguineti 2002). Thus, given the low stiffness levels
observed in human behavior, it is believed that fast coordinated movements can be achieved,
in spite of sensory latencies, via feed-forward motor commands computed with knowledge
and prediction of musculoskeletal dynamics and their sensory consequences.
The most convincing empirical evidence in favor of the internal model hypothesis
have been demonstrated by the force field experiment. A force field is used to alter limb
dynamics, by placing the subject in a rotating room (Lackner & DiZio 1994), or more
commonly, by exerting external forces on the subject via a robot manipulandum, forcing
subjects to adapt to the novel dynamics in order to achieve their task. Typically these
forces are fixed functions of endpoint or joint positions, velocities, or accelerations. In the
seminal work of Shadmehr & Mussa-Ivaldi (1994), a robot manipulandum was used to apply
a viscous force field at the hand of the form f = Bẋ, where ẋ is the velocity of the
hand, f is the vector of the external force to be applied at the hand, and B is a constant
matrix which determines the strength and direction of the force. Shadmehr and Mussa-
Ivaldi showed that when executing point-to-point reaching movements in the absence
of such fields, endpoint trajectories are typically straight with a smooth velocity profile
(Morasso 1981). However, upon first exposure to a novel dynamic environment (e.g. a
force field), trajectories become skewed in response to the field. After several subsequent
trials within the force field, endpoint trajectories actually return to their original paths;
subjects learn to compensate for external disturbances. If the field is suddenly removed
after learning, the trajectories once again become skewed, in the opposite direction of the
initial pre-adaptation field trajectories, and are called After Effects (Jeannerod 1988) of
the force field adaptation. Evidence of After Effects suggests that the brain builds
an internal model of the novel dynamics of the force field, which it uses to predict
and compensate for the environment such that its endpoint plan is realized. Shadmehr
and Mussa-Ivaldi propose the following equation as a model of the controller used by the
nervous system
τ = M̂q̈_d(t) + Ĉ − K_P(q − q_d(t)) − K_V(q̇ − q̇_d(t))    (1.1)

where τ is the torque command, M̂ is the nervous system's estimate of arm inertia,
Ĉ is the estimate of centripetal and Coriolis forces, q is the joint configuration of the
arm, q_d(t) is the desired joint space trajectory of the arm, and K_P and K_V are stiffness
and damping gains, respectively. This equation is precisely the inverse dynamics control
term is missing, since the arm was supported against gravity). Additionally, this equation
termismissing,sincethearmwassupportedagainstgravity). Additionally, thisequation
fits into the model of figure 1.1 since it contains both feed-forward and feedback compo-
nents. Using this controller equation, Shadmehr and Mussa-Ivaldi demonstrate that the
simulated hand trajectories exhibit corrective “hooks” similar to those observed in the human data.
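Under illustrative assumptions (a planar point-mass "arm", so the inertia estimate is a scalar and the Coriolis estimate vanishes; made-up gain and field values), the controller of equation (1.1), together with predictive cancellation of a learned viscous curl field, can be sketched as follows. This is a hedged sketch, not the authors' implementation.

```python
import numpy as np

# Hedged sketch of the controller of equation (1.1) for a planar
# point-mass "arm": M_hat is a scalar and the Coriolis estimate
# C_hat is zero. Gains and the curl-field matrix B are illustrative.
M_hat = 1.0                        # estimated inertia (point mass)
K_P = np.diag([30.0, 30.0])        # stiffness gains
K_V = np.diag([8.0, 8.0])          # damping gains
B = np.array([[0.0, 13.0],
              [-13.0, 0.0]])       # viscous curl field f = B @ qd

def tau(q, qd, q_d, qd_d, qdd_d, field_model=None):
    """Equation (1.1): feed-forward inertia term plus PD feedback.
    If an internal model of the force field has been learned, its
    predicted force is cancelled in addition."""
    u = M_hat * qdd_d - K_P @ (q - q_d) - K_V @ (qd - qd_d)
    if field_model is not None:
        u = u - field_model @ qd   # predictive field cancellation
    return u

q = np.array([0.0, 0.0])
qd = np.array([0.2, 0.1])
# Before adaptation, the field force B @ qd perturbs the hand; after
# adaptation, the learned internal model cancels it exactly.
net_before = tau(q, qd, q, qd, np.zeros(2)) + B @ qd
net_after = tau(q, qd, q, qd, np.zeros(2), field_model=B) + B @ qd
```

Removing `field_model` after adaptation leaves the cancellation term active with no field to cancel, which is exactly the mechanism behind the After Effects described above.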
Additional experiments using the force field paradigm have added insight into the
learning process of the predictive internal model. Internal models appear to be a mapping
from joint state to corrective torque, rather than a memorized, time-dependent sequence
of torques (Conditt, Gandolfo & Mussa-Ivaldi 1997). The mapping is in an intrinsic
(joint-space) coordinate system rather than an extrinsic (task-space) coordinate system
(Shadmehr & Mussa-Ivaldi 1994). Additionally, the learning appears to be local: a force
field learned in one area of joint space will not generalize well to another area (Gandolfo,
Mussa-Ivaldi & Bizzi 1996). Finally, as witnessed by the adaptation of EMG signals
during training, the learning process appears to occur gradually, on a trial-to-trial basis
(Thoroughman & Shadmehr 1999).
1.1.2 Redundancy resolution
Most manipulation tasks, such as object grasping, tool use, etc. require 6 degrees of
freedom (DOF). Other tasks such as reaching require 3 DOFs, while bipedal balancing
can be considered a 2 DOF task (controlling the center of mass on a plane perpendicular to
gravity). Since humans and humanoid robots have more degrees of freedom than required
for these tasks (the human arm alone has at least 7 major DOFs), these systems contain
a significant amount of redundancy. From a computational point of view, redundancy is
problematic because the resulting control problem is often ill-posed, having an infinite number of solutions. However,
redundancy can also be beneficial, since it gives the system some freedom to improve the
reliability of task achievement, to execute other tasks simultaneously, or to satisfy some
other optimization criterion. As scientists, we wish to understand how the nervous sys-
tem chooses among the infinite number of possible solutions available for solving motor
problems, for example choosing a particular joint configuration for reaching to an ob-
ject. There are two main types of redundancy to be considered in the motor system:
1) Kinematic (or joint space) redundancy: given a target to reach to or an end-effector
trajectory tofollow, whichjointconfiguration doIchoose amongtheinfinitepossibilities?
2) Actuator or muscular redundancy: given a joint configuration, how do I choose the
muscle activation to realize it? Actuator redundancy is useful in humans, since it allows
for separate control of joint accelerations and stiffness (via simultaneous contraction of
antagonistic pairs of muscles).
As early as 1930, Lashley recognized the importance of kinematic redundancy for
motor equivalence: the ability to achieve the same motor goal using different movements
(Lashley 1930). However, it was Bernstein (1967) who first formalized the degrees-of-
freedom problem: how does the nervous system cope with the indeterminate mapping
from goals to actions? Proposed solutions to Bernstein's problem generally fall under
one of two camps: passive or computational. The passive methods suggest that the so-
lution comes about automatically, without computation, due to either fixed constraints
or mechanical properties of the system. For example, in eye gaze experiments, Donders
observed a specific and fixed eye torsion for every gaze angle, and thus proposed that
the nervous system may have a fixed one-to-one mapping from task variables to joint
variables (i.e. Donders' law). However, limb movements have been shown to violate this
relatively simple law (Soechting, Buneo, Herrmann & Flanders 1995). Mussa-Ivaldi et
al. argue that the elastic properties of muscles generate potential fields which deter-
mine the joint configurations (Mussa-Ivaldi, Morasso & Zaccaria 1988). Computational
techniques, on the other hand, rely on active decision-making processes for resolving
redundancy. Bullock et al. developed a neural network that learns (from prior experience)
a mappingfrom joint configuration and end-effector velocities to joint velocities (Bullock,
Grossberg & Guenther 1992). This model is a biologically plausible way of implementing
the common Jacobian pseudoinverse inverse kinematics technique of robotics (Sciavicco &
Siciliano 2001). Many have suggested that the nervous system follows some optimization
principle, such as the minimization of jerk (Flash & Hogan 1985), the minimization of
torque change (Kawato, Maeda, Uno & Suzuki 1990), the minimization of interaction torques
(Goble, Zhang, Shimansky, Sharma & Dounskaia 2007), or combinations of these (Ohta,
Svinin, Luo, Hosoe & Laboissière 2004). Todorov and Jordan propose a theory based on
stochastic optimal control (Todorov & Jordan 2002, Todorov 2004), where redundancy
is used as a way of minimizing the variance in task relevant dimensions. Todorov and
Jordan suggest the minimum intervention principle: errors are only corrected when they
interfere with task performance. Thus they predict that the variance in movements will be
higher in task irrelevant dimensions. These predictions are compatible with Scholz and
Schöner's Uncontrolled Manifold concept (Scholz & Schöner 1999), a method of determining
the task irrelevant dimensions (which lie on a manifold) by measuring the variance in
motor output. Additionally, motor synergies (Latash, Scholz & Schöner 2007) and motor
primitives (Bizzi, Mussa-Ivaldi & Giszter 1991, Mussa-Ivaldi, Hogan & Bizzi 1985,
Mussa-Ivaldi & Bizzi 2000) have also been proposed as dimensionality reduction techniques.
1.1.3 Impedance control
With respect to the actuator redundancy mentioned in Section 1.1.2, humans are capable
of controlling their limb impedance. For example, we can execute the same arm movement
(i.e. the same kinematic trajectory) with more compliance or more stiffness depending
on how tightly we flex (or co-contract) our muscles. A model by Bullock and Grossberg
explains how neurons of the spinal cord enable the independent control of joint motion
and stiffness via muscle co-contraction (Bullock & Grossberg 1989). Stiffness is desirable
as a method of fast rejection of disturbances and is the strategy employed by humans
when operating in uncertain environments or when learning new skills. However, humans
typically reduce stiffness levels after a skill is learned (Milner & Cloutier 1993). Thus
compliance is more desirable for achieving smoother and more energy efficient motion
with less variability (motor noise increases with muscle activity (Harris & Wolpert 1998)).
Burdet et al. studied impedance control in humans by exposing subjects to unstable force
fields generated by a manipulandum (Burdet, Osu, Franklin, Milner & Kawato 2001).
They show that subjects learn to stabilize the unstable dynamics by increasing arm
impedance, but only in the direction of instability. Thus subjects choose an efficient
strategy for dealing with instability, and internal model control and impedance control
do coexist in humans (Osu, Burdet, Franklin, Milner & Kawato 2003). Additionally,
Takahashi et al. (Takahashi, Scheidt & Reinkensmeyer 2001) show that after learning
in a force field with a strength that varies randomly from trial to trial, subjects have
reduced aftereffects, suggesting that they are increasing joint impedance as a method to
cope with the uncertainty.
1.2 Dissertation Outline
The remainder of this dissertation is organized as follows:
• Chapter 2 addresses the issue of kinematic redundancy resolution in human arm
movements. Using an exoskeleton robot, we alter the dynamic properties of the
human arm and test how subjects react and adapt to such disturbances in joint
space.
• Chapter 3 attempts to create a dynamic environment that affects motor cost, but
not accuracy. In doing so, we examine the "optimality" of human motor adaptation,
and try to understand if kinematic error plays a role.
• Chapter 4 turns to the problem of humanoid robotics, and discusses how to imple-
ment human like task-space control, via floating base inverse kinematics.
• Chapter 5 tackles the problem of model based control of humanoid robots, and
provides a solution to the ill-posed problem of floating base inverse dynamics.
• Chapter 6 concludes with a summary of contributions presented in this thesis.
Chapter 2
Resolving Kinematic Redundancy in Human Reaching
2.1 Introduction
Very early in the study of human motor control, kinematic redundancy was identified as
a key problem to be solved by the central nervous system (see Section 1.1.2 for a brief
overview). Moving a hand to a target requires the coordination of a plethora of joints,
and the set of possible joint motions is infinitely large. How and why does the motor
system choose the joint pattern it does? How does the nervous system solve Bernstein’s
Degrees-of-freedom problem (Bernstein 1967)?
Researchers have primarily studied the issue of kinematic redundancy by observing
natural human motion and examining the patterns that emerge. For example, (Scholz &
Schöner 1999) observed the variance in motor output, and concluded that there is some
consistency when repeating a task. The space with highest movement variance can be mapped
onto a lower dimensional manifold (the so called uncontrolled manifold). However, we
would like to address how these patterns of joint motion change and adapt
when learning new skills or new dynamic environments. In particular, we would like to
study the problem in a dynamic setting, potentially altering the dynamics of motion and
testing how the nervous system adapts.
Robot manipulanda have been a common tool in motor control research for alter-
ing motor dynamics. In these experiments, robotic manipulanda apply controlled, ex-
traneous forces/torques either at the hand (Shadmehr & Mussa-Ivaldi 1994),(Gandolfo
et al. 1996),(Conditt et al. 1997),(Gomi & Kawato 1996) or individual joints (Singh &
Scott 2003) while the subjects carry out movement tasks, such as point-to-point reach-
ing movements or continuous patterns (Conditt et al. 1997). However, because of the
mechanical constraints of the manipulanda used, these experiments have been limited to
two degrees-of-freedom (DOF) movements, focusing on shoulder and elbow joints, and
thus not allowing for any spatial redundancy in a movement. With the goal of extending
this experimental paradigm to explore a wider variety of movements, including movements
in full 3-D space using the major seven degrees of freedom of the human arm, we have
developed a new experimental platform using a 7-DOF anthropomorphic exoskeleton
(Figure 2.1) (Mistry, Mohajerian & Schaal 2005a, Mistry, Mohajerian & Schaal 2005b).
The exoskeleton robot allows the subject to make unconstrained 3-D movements with all
seven major DOFs of the arm. By applying perturbing forces at any or all of these seven
joints, we are able to explore how subjects cope with intrinsic (joint space) force fields
and resolve redundancy during reaching tasks.
In this chapter, we first introduce the robotic apparatus, and discuss how we control
the device for use as an experimental platform. We will then discuss our particular
reaching movement experiment, and show results of several human subjects. Next we
will propose a model to explain the underlying behavior observed in our subjects, and
Figure 2.1: SARCOS master arm with user
suggest a possible control mechanism used by the brain. Finally we will draw some
conclusions from this work.
2.2 The Exoskeleton Apparatus
The experimental platform is a seven DOF hydraulically actuated exoskeleton robot arm
(Sarcos Master Arm, Sarcos, Inc., Salt Lake City) (Fig. 2.1). Its anthropomorphic design
mimics the major seven DOFs of the human arm¹, such that any joint movement of the
user's arm will be approximately reflected by a similar joint movement of the exoskeleton.
Conversely, torques applied by an exoskeleton joint will be reflected to the corresponding
joint of the user’s arm. The exoskeleton’s most proximal joint is mounted to a fixed
platform, and the user wields the device by holding on to a handle at the most distal
¹The major joints of the human arm are labeled: shoulder-flexion-extension (SFE), shoulder-abduction-adduction (SAA), humeral rotation (HR), elbow-flexion-extension (EB), wrist-supination-pronation (WSP), wrist-flexion-extension (WFE) and wrist-abduction-adduction (WAA).
joint and by a strapping of the forearm to the equivalent link of the robot just before the
elbow. The shoulder remains unconstrained, but is positioned such that the three shoulder
rotation axes of the exoskeleton approximately intersect with the human’s shoulder joint.
2.2.1 Exoskeleton Control Architecture
The control architecture consists of independent PD servo controllers at each joint (im-
plemented on individual Sarcos Advanced Joint Controller analog circuit boards), with
additional feed-forward torque commands computed on a centralized controller running on
2 Motorola PPC 603 parallel processors with the commercial real-time operating system
vxWorks (Windriver Systems). Potentiometers and load cells at each joint are sampled
at 960 Hz to provide positional and torque feedback, respectively. Joint velocities and
accelerations are computed numerically by differentiating the position signal. The signals
are filtered with a 2nd-order Butterworth filter with a cutoff frequency of 33.6 Hz for
position, velocity, and torque, and 4.8 Hz for acceleration. The acceleration signal requires
more aggressive filtering because of the noise amplified by the numerical differentiation.
The centralized controller updates the feed-forward commands and PD set points at 480
Hz.
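As a concrete sketch of this signal pipeline, the following code differentiates a position signal numerically and smooths it with a 2nd-order Butterworth low-pass filter. This is only an illustration of the processing described above, not the actual Sarcos controller implementation; the filter is realized here as a standard biquad via the bilinear transform.

```python
import math

def butter2_lowpass(fc, fs):
    """2nd-order Butterworth low-pass biquad coefficients (bilinear transform, Q = 1/sqrt(2))."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * (1.0 / math.sqrt(2.0)))
    b = [(1 - math.cos(w0)) / 2, 1 - math.cos(w0), (1 - math.cos(w0)) / 2]
    a = [1 + alpha, -2 * math.cos(w0), 1 - alpha]
    return [bi / a[0] for bi in b], [ai / a[0] for ai in a]

def filt(b, a, x):
    """Causal direct-form-I filtering of the sequence x."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        y.append(yn)
        x1, x2, y1, y2 = xn, x1, yn, y1
    return y

fs = 960.0                                    # sampling rate (Hz), as in the text
pos = [i / fs for i in range(2000)]           # a unit-slope position ramp
# backward-difference velocity, as produced by numerical differentiation
vel = [0.0] + [(pos[i] - pos[i - 1]) * fs for i in range(1, len(pos))]
b, a = butter2_lowpass(33.6, fs)              # position/velocity/torque cutoff from the text
vel_f = filt(b, a, vel)                       # filtered velocity settles to the true slope of 1
```

Since the biquad has unit DC gain, the filtered velocity converges to the true ramp slope after a short transient, at the cost of a small lag; this is the same trade-off the text notes for the more aggressively filtered acceleration signal.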
2.2.2 Exoskeleton Control Formulation
Ideally, the user should not be (unintentionally) burdened by the exoskeleton while
executing movements. Therefore the control scheme needs to compensate for the exoskeleton's
gravity, Coriolis, and inertia forces. Typically, robot manipulanda use force sensors at
the end effector to measure the forces applied by the user (Gomi & Kawato 1996), and
apply control laws that generate a particular impedance characteristic at the point where
the user holds the manipulandum. In contrast to these previous approaches, which fo-
cused on investigations of the end-effector movement of human arms, the aim of our current
robotic setup is to examine joint-level effects of human motor control. Thus, we do not
have a constraint in the form of a desired impedance at the end effector, but rather require
impedance control for every DOF. Since we cannot reliably attach force sensors between
the human arm and every link of the robot, we resorted to a model-based control approach.
2.2.2.1 Low Level Joint Control
The low-level joint controllers are governed by the equation
\[
\mathbf{u} = \mathbf{K}_D(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}) + \mathbf{K}_P(\mathbf{q}_d - \mathbf{q}) + \mathbf{u}_{ff}, \tag{2.1}
\]
where $\mathbf{u} \in \mathbb{R}^7$ is the vector of motor command torques, $\mathbf{q}, \dot{\mathbf{q}} \in \mathbb{R}^7$ are the vectors of joint positions and velocities, and $\mathbf{K}_D, \mathbf{K}_P$ are diagonal gain matrices. The desired joint position and velocity vectors $\mathbf{q}_d, \dot{\mathbf{q}}_d \in \mathbb{R}^7$ and the feed-forward torque command $\mathbf{u}_{ff} \in \mathbb{R}^7$ are set by the centralized controller. Ideally, we would set $\dot{\mathbf{q}}_d = \dot{\mathbf{q}}$ and $\mathbf{q}_d = \mathbf{q}$ in this controller, and compute an inverse-dynamics-based feed-forward command, $\mathbf{u}_{ff}$, to eliminate inertial, Coriolis, and gravity forces. However, as such a controller is neutrally stable and difficult to realize in light of inevitable modeling errors in the dynamics model, a more prudent control approach is required. In order to keep a small amount of a position reference for enhanced stability, we define
\[
\mathbf{q}_d^{n+1} = \mathbf{q}_d^n + \epsilon\left(\mathbf{q}^n - \mathbf{q}_d^n\right), \tag{2.2}
\]
where $n$ is the discrete time step of the control loop, and $\epsilon$ is a constant factor between 0 and 1. Thus $\mathbf{q}_d$ is a filtered version of $\mathbf{q}$, and removes high-frequency noise at the cost of a lag behind the current joint state. The filter parameter, $\epsilon$, is chosen high enough such that the speed of typical human arm movements is well within the bandwidth of the filter. If we choose $\epsilon = 1$, we effectively achieve the setting $\mathbf{q}_d = \mathbf{q}$. We indeed set the desired velocity $\dot{\mathbf{q}}_d = \dot{\mathbf{q}}$, but we will maintain a small amount of damping in the feed-forward command $\mathbf{u}_{ff}$.
2.2.2.2 Gravity, Coriolis, and Inertia Compensation
We assume the exoskeleton is governed by the well-known rigid-body dynamics model of a manipulator robot arm given by the equation
\[
\mathbf{M}(\mathbf{q})\ddot{\mathbf{q}} + \mathbf{C}(\mathbf{q},\dot{\mathbf{q}}) + \mathbf{G}(\mathbf{q}) = \mathbf{u}, \tag{2.3}
\]
where $\mathbf{M}(\mathbf{q}) \in \mathbb{R}^{7\times 7}$ is the mass or inertia matrix, $\mathbf{C}(\mathbf{q},\dot{\mathbf{q}}) \in \mathbb{R}^7$ denotes centrifugal and Coriolis forces, and $\mathbf{G}(\mathbf{q}) \in \mathbb{R}^7$ denotes the gravity force (Sciavicco & Siciliano 2001). Theoretically, we can completely cancel the dynamics of the exoskeleton robot with the feed-forward control law
\[
\mathbf{u}_{ff} = \tilde{\mathbf{M}}(\mathbf{q})\left(\ddot{\mathbf{q}} - \mathbf{K}'_D\dot{\mathbf{q}}\right) + \tilde{\mathbf{C}}(\mathbf{q},\dot{\mathbf{q}}) + \tilde{\mathbf{G}}(\mathbf{q}), \tag{2.4}
\]
where $\tilde{\mathbf{M}}, \tilde{\mathbf{C}}, \tilde{\mathbf{G}}$ are our estimates of the robot's inertia, Coriolis, and gravity terms, and $\mathbf{K}'_D$ is a diagonal matrix of damping gains. Assuming perfect parameter estimation, the robot's dynamics will be eliminated, leaving only the damping term $\tilde{\mathbf{M}}(\mathbf{q})\mathbf{K}'_D\dot{\mathbf{q}}$, which is required to maintain stability. The damping gains can be individually tuned to apply the appropriate impedance characteristic at each joint. Small gains can be used to keep the user's motion as unconstrained as possible. Premultiplying the damping term by the inertia matrix has the consequence of amplifying any modeling inaccuracies of the inertia matrix. Therefore, we move the damping term outside of the inertia matrix product, which empirically we find provides a greater margin of stability. Combining this control law with Equation (2.1) results in the equation
\[
\mathbf{u} = \tilde{\mathbf{M}}(\mathbf{q})\ddot{\mathbf{q}} + \tilde{\mathbf{C}}(\mathbf{q},\dot{\mathbf{q}}) + \tilde{\mathbf{G}}(\mathbf{q}) - \mathbf{K}'_D\dot{\mathbf{q}} + \mathbf{K}_P(\mathbf{q}_d - \mathbf{q}), \tag{2.5}
\]
with $\mathbf{q}_d$ defined by Equation (2.2).
2.2.2.3 Parameter Identification
The estimates of our model parameters, $\tilde{\mathbf{M}}, \tilde{\mathbf{C}}, \tilde{\mathbf{G}}$, are determined with system identification techniques (An, Atkeson & Hollerbach 1988). We recorded an hour of robot
trajectories in response to sufficiently exciting desired trajectories of pseudorandom mo-
tor commands (including sine waves of various frequencies at the joint-level and discrete
endpoint movements at various speeds). This data was used to regress the rigid body
dynamics parameters acting on each DOF. However, we noticed that due to unmodeled
nonlinearities of the robot, we obtained partially physically inconsistent rigid body pa-
rameters, e.g., non positive definite link inertia matrices and negative viscous friction
coefficients. We improved this estimation with a novel nonlinear rigid body dynamics
identification algorithm that guaranteed a physically correct rigid body model (Nakanishi,
Cory, Mistry, Peters & Schaal 2008). This model significantly increased the stability of
our control system.
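The regression step works because rigid-body dynamics are linear in the inertial parameters. A toy 1-DOF illustration (hypothetical inertia and viscous friction values, noise-free data; the actual identification uses the full 7-DOF regressor and recorded trajectories) recovers the parameters by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical true parameters of a 1-DOF "link": inertia and viscous friction
I_true, b_true = 0.8, 0.15
qdd = rng.standard_normal(500)      # sufficiently exciting accelerations
qd = rng.standard_normal(500)       # and velocities
tau = I_true * qdd + b_true * qd    # measured joint torques (noise-free here)

# regressor formulation: tau = Phi @ [I, b], solved by linear least squares
Phi = np.column_stack([qdd, qd])
theta, *_ = np.linalg.lstsq(Phi, tau, rcond=None)
```

The physical-consistency problem mentioned in the text shows up precisely here: unconstrained least squares can return, e.g., a negative friction coefficient, which motivated the constrained nonlinear identification of (Nakanishi et al. 2008).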
2.3 Experimental Procedure
We asked five healthy right-handed subjects (4 male, 1 female) to make a point-to-point
reaching movement to a visual target. In each trial, the right hand begins above the
shoulder (as if starting to throw a ball) and finishes with the arm slightly extended, in
front of the torso (as if shaking someone’s hand). The movement was chosen as such
to maximize the spatial extent of the trajectory within the limited workspace of the
robot. At the start of each trial, the robot servos the subject’s arm to the starting
location, and each trial begins with the same joint configuration. The subject is not
instructed on how to execute the movement, but told that the reaching task should only
involve an arm movement (not the torso, hips, etc.). This way the subject's shoulder position
stays approximately at the intersection of the exoskeleton’s three shoulder rotation axes,
and the subjects’ and exoskeleton’s joint angles will remain in cohesion with minimal
discrepancy. The subject is also instructed to complete each trial within one second.
The force field, when applied, adds joint torques dependent on the velocity of one or
several joints. The general form of such external forces is
\[
\mathbf{u}_{field} = \mathbf{A}\dot{\mathbf{q}}. \tag{2.6}
\]
In this experiment we investigate two different force fields. Field 1 applies an elbow torque dependent on the sum of the shoulder flexion-extension and shoulder abduction-adduction velocities. The $\mathbf{A}$ matrix of this field takes the form:
\[
\mathbf{A}_1 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
3.0 & 3.0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{2.7}
\]
Field 2 applies a humeral rotation torque dependent on elbow velocity:
\[
\mathbf{A}_2 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1.5 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{2.8}
\]
In experiment 1, each subject completed a block of 70 trials without a force field (null
field trials), followed by a block of 70 trials with field 1 turned on. In experiment 2,
each subject completed a block of 70 null trials, followed by 70 trials with force field 2.
Experiments 1 and 2 were conducted on different days, and in random order for each
subject.
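The two fields can be written down directly from (2.6)-(2.8). In the sketch below, the joint ordering [SFE, SAA, HR, EB, WSP, WFE, WAA] follows the footnote in Section 2.2, and the example velocities are arbitrary:

```python
import numpy as np

# Joint order: [SFE, SAA, HR, EB, WSP, WFE, WAA]
A1 = np.zeros((7, 7))
A1[3, 0] = A1[3, 1] = 3.0   # Eq. (2.7): elbow torque from shoulder velocities
A2 = np.zeros((7, 7))
A2[2, 3] = 1.5              # Eq. (2.8): humeral-rotation torque from elbow velocity

def field_torque(A, qdot):
    """Velocity-dependent force-field torques, Eq. (2.6)."""
    return A @ qdot

qdot = np.array([0.5, 0.2, 0.0, 1.0, 0.0, 0.0, 0.0])  # example joint velocities (rad/s)
u1 = field_torque(A1, qdot)  # torque only at the elbow: 3.0 * (0.5 + 0.2)
u2 = field_torque(A2, qdot)  # torque only at humeral rotation: 1.5 * 1.0
```

Each field thus couples exactly one joint to the motion of others, which is what makes it a purely intrinsic (joint-space) perturbation.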
2.4 Results
2.4.1 Experiment 1
Figure 2.2 (left graph) shows the average of the last 10 hand trajectories during subject
A’s nullfieldblock inblueandtheaverage of thelast 10handtrajectories duringthefield
1 trials in red. On the right, are the averaged joint-space trajectories of shoulder flexion-
extension, shoulder abduction-adduction, and elbow (SFE, SAA, and EB, respectively)
for the same subject during the same trials. Clearly, after force field training, this joint
Figure 2.2: The left graph shows the three dimensional trajectories of the right hand of Subject A during experiment 1. The average of the last 10 null field trials (blue) is plotted with the average of the last 10 force field trials (red). Also plotted as thinner lines are the two dimensional projections on orthogonal planes. Hand trajectories return to the original null-field trajectories after adapting to the force field. The right graph shows the average trajectories of the same trials in a three dimensional projection of joint space (SFE, SAA, and EB). Although the hand trajectory nearly matches the null field trajectory, there is an alteration in joint space.
space trajectory is skewed to the right. Average trajectories are found by resampling
each trajectory (by applying an anti-aliasing FIR filter) to a fixed array length, and then
computing the mean of each index across all trials. Results for subjects B, C, D, and E
are shown in Appendix A.1.
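The averaging procedure can be sketched as follows; plain linear interpolation stands in for the anti-aliasing FIR resampler used in the actual analysis, and the trial lengths are arbitrary:

```python
import numpy as np

def resample(traj, n):
    """Linearly resample a 1-D trajectory to a fixed length n.
    (A simplified stand-in for the anti-aliasing FIR resampler.)"""
    old = np.linspace(0.0, 1.0, len(traj))
    new = np.linspace(0.0, 1.0, n)
    return np.interp(new, old, traj)

def average_trajectories(trials, n=100):
    """Resample every trial to n samples, then average index-wise across trials."""
    return np.mean([resample(t, n) for t in trials], axis=0)

# trials of unequal duration are put on a common normalized time base
trials = [np.linspace(0.0, 1.0, 50), np.linspace(0.0, 1.0, 80)]
avg = average_trajectories(trials, n=100)
```

Resampling to a common length is what allows the index-wise mean across trials of unequal duration.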
2.4.2 Experiment 2
Figure 2.3 (left graph) shows the average of the last 10 hand trajectories during subject
B’s null field block in blue and the average of the last 10 hand trajectories during the
field 2 trials in red. On the right, are the averaged joint-space trajectories of shoulder
Figure 2.3: The left graph shows the three dimensional trajectories of the right hand of Subject B during experiment 2. The average of the last 10 null field trials (blue) is plotted with the average of the last 10 force field trials (red). Also plotted as thinner lines are the two dimensional projections on orthogonal planes. Hand trajectories return to the original null-field trajectories after adapting to the force field. The right graph shows the average trajectories of the same trials in a three dimensional projection of joint space (SAA, HR, and EB). Although the hand trajectory nearly matches the null field trajectory, there is an alteration in joint space.
abduction-adduction, humeral rotation, and elbow (SAA, HR, and EB, respectively) for
the same subject during the same trials. Results for subjects A, C, D, and E are shown
in Appendix A.2.
2.5 Discussion
Figures 2.2 and 2.3, as well as those in Appendix A, demonstrate that the hand trajectory
remains invariant after force field adaptation. Interestingly, however, we see no such
invariance in joint space trajectories. These results suggest that the motor system may
be planning reaching movements in extrinsic (task space) coordinates and not creating
fixed positional trajectories for the joints to follow. Rather, the motor system is able to
exploit the redundancy available in the human arm (perhaps to reduce the effects of the
force field) while still executing its endpoint plan.
2.5.1 Control model
In order to better understand the underlying process that may result in the observed
motion of our experiments, we propose a model for the motor control of reaching tasks.
Our model emphasizes task-space, or operational space control (Khatib 1987), as well as
the minimum intervention principle (Todorov & Jordan 2002). A disturbance will only
be corrected for if it interferes with the task, otherwise the nervous system need not be
concerned.
First we model the human arm as a 7-DOF rigid body, identical to the form of (2.3).
We will develop a controller for this arm model that executes a specified task. In this
case, our task is to track a 3-DOF desired hand trajectory. The remaining redundancy
should additionally be resolved, but in such a way that it does not interfere with
task achievement. The framework of operational space control has been developed by
(Khatib 1987) for the task-space control of robotic manipulators. Khatib’s formulation
provides a method to decouple those forces applied directly to a task and those in the
task’s orthogonal null space. Thus, forces can be applied to the null space of a task, such
that they do not interfere dynamically with task space motion.
Given a set of generalized forces, $\mathbf{u}$, and a task, $\mathbf{x}$ (in our case we assume $\mathbf{u} \in \mathbb{R}^7$ and $\mathbf{x} \in \mathbb{R}^3$), if there is the differential kinematic relationship $\dot{\mathbf{x}} = \mathbf{J}\dot{\mathbf{q}}$ (where $\mathbf{J}$ is the Jacobian matrix), we can separate the generalized forces into their orthogonal task and null space components as follows:
\begin{align}
\mathbf{u} &= \mathbf{u}_{task} + \mathbf{u}_{null} \tag{2.9}\\
&= \mathbf{J}^T\bar{\mathbf{J}}^T\mathbf{u} + \left(\mathbf{I} - \mathbf{J}^T\bar{\mathbf{J}}^T\right)\mathbf{u}, \tag{2.10}
\end{align}
where $\bar{\mathbf{J}}$ is the inertia-weighted pseudoinverse of $\mathbf{J}$:
\[
\bar{\mathbf{J}} = \mathbf{M}^{-1}\mathbf{J}^T\left(\mathbf{J}\mathbf{M}^{-1}\mathbf{J}^T\right)^{-1}. \tag{2.11}
\]
Using this methodology, we propose the following controller for a reaching task:
\[
\mathbf{u} = \mathbf{J}^T\bar{\mathbf{J}}^T\left[\mathbf{M}\mathbf{J}^{+}\left(\ddot{\mathbf{x}}_r - \dot{\mathbf{J}}\dot{\mathbf{q}}\right) + \mathbf{u}_{field}\right] + \left(\mathbf{I} - \mathbf{J}^T\bar{\mathbf{J}}^T\right)\ddot{\mathbf{q}}_r + \mathbf{C} + \mathbf{G}, \tag{2.12}
\]
where
\begin{align}
\ddot{\mathbf{x}}_r &= \ddot{\mathbf{x}}_d + \mathbf{K}_{Dx}(\dot{\mathbf{x}}_d - \dot{\mathbf{x}}) + \mathbf{K}_{Px}(\mathbf{x}_d - \mathbf{x}) \tag{2.13}\\
\ddot{\mathbf{q}}_r &= \ddot{\mathbf{q}}_d + \mathbf{K}_{Dq}(\dot{\mathbf{q}}_d - \dot{\mathbf{q}}) + \mathbf{K}_{Pq}(\mathbf{q}_d - \mathbf{q}) \tag{2.14}
\end{align}
($\mathbf{x}_d, \dot{\mathbf{x}}_d, \ddot{\mathbf{x}}_d$ and $\mathbf{q}_d, \dot{\mathbf{q}}_d, \ddot{\mathbf{q}}_d$ are desired task and joint trajectories, respectively). Here we choose $\mathbf{K}_{Dx} \gg \mathbf{K}_{Dq}$ and $\mathbf{K}_{Px} \gg \mathbf{K}_{Pq}$ to emphasize the importance of task-space control over joint-space control. Additionally, we only compensate for the force field, $\mathbf{u}_{field}$, in task space. We assume that the nervous system only learns the component of the disturbance affecting hand motion, and allows the disturbance to alter the irrelevant null-space motion. Finally, we compensate for Coriolis, centrifugal, and gravity forces in the full joint space (as opposed to only task space as originally proposed by Khatib).
2.5.1.1 Model Simulation
We implement our rigid body arm model and controller using the SL dynamics simula-
tor (Figure 2.4). For the simulation, we use the same kinematic and estimated inertial
parameters of our robot (as found in Section 2.2.2.3). Ideally we could use inertial pa-
rameters that are closer to that of a human arm, but since our controller is compensating
for inertial effects, we believe the desired effect should be similar. We tune the gain
parameters of (2.12) in order to match our experimental results, while keeping task-space
gains at least 20 times higher than joint-space gains. For experiment 1, desired task and
joint space trajectories are directly taken from Subject A's averaged hand and joint space
null-field trajectories. For experiment 2, desired trajectories are taken from Subject B.
Figure 2.5 shows the simulated result of Experiment 1. Qualitatively, the joint space
motion under the force field is similar to that found in the actual subject data in Figure 2.2.
Figure 2.6 shows the simulated result of Experiment 2. Again, qualitatively, the joint
space motion under the force field is similar to that found in Figure 2.3. As an additional test,
we dropped the task-space force field compensation in the controller. Figure 2.7 shows
the simulated result using force field 1. The task-space compensation term is required in
order to maintain the desired hand path.
Figure 2.4: The SL master arm simulator, used to model the human arm and controller.
2.6 Conclusion
In this work we demonstrated how a complex exoskeleton robot can be employed for
behavioral studies of motor control. Although this hydraulic robot cannot compete
with the quality of impedance control of very small scale haptic devices, a model-based
controller with carefully tuned parameters accomplished a surprising quality of gravity,
Coriolis, and inertia compensation, such that the robot did not significantly alter the
movement of a human inserted into the exoskeleton structure. We demonstrated the usefulness
of this new experimental platform in a behavioral experiment, where human subjects
were exposed to a novel dynamic environment in the form of joint space force fields.
The experiment revealed that movement planning of reaching movements in humans
does not employ desired joint space trajectories, but instead may be using a form of
operational space control. We show how an operational space controller applied to a
simulated rigid body arm model can produce similar motion to our observed results in
human experiments.
Figure 2.5: Simulated results: the left graph shows the hand trajectories of the rigid
body arm model. A null field trial (blue) is plotted together with a force field trial (red).
The right graph shows the joint space trajectories (SFE, SAA, and EB). The force field
perturbs the joint motion similarly to that seen in the human data.
Figure 2.6: Simulated results: the left graph shows the hand trajectories of the rigid
body arm model. A null field trial (blue) is plotted together with a force field trial (red).
The right graph shows the joint space trajectories (SAA, HR, and EB). The force field
perturbs the joint motion similarly to that seen in the human data.
Figure 2.7: Simulated results: the left graph shows the hand trajectories of the rigid body arm model, but without task-space compensation of the force field. A null field trial (blue) is plotted with a force field trial (red). The right graph shows the joint space trajectories (SFE, SAA, and EB). Without compensating for the task-space forces produced by the force field disturbance, the controller cannot maintain the desired hand trajectory.
Chapter 3
Desired Trajectories in Human Reaching
3.1 Introduction
Even a simple point to point reaching movement is a highly complex control problem for
the central nervous system (CNS). There are an infinite number of possibilities in terms
of joint kinematics, hand path, and muscle actuation. To complicate matters further, the
sensory system is noisy, delayed, and partially observable. We would like to understand
how the CNS learns an effective and robust control strategy for such a task, and in
what way the control strategy is adapted when learning to control in a new dynamic
environment.
Optimization has been suggested as a guiding principle. Given some cost function,
the CNS strives to find a control strategy that will minimize it. Various cost functions
for human reaching have been suggested in literature: the minimization of jerk (Flash &
Hogan 1985), or torque change (Uno, Kawato & Suzuki 1989), etc. Using the framework
of stochastic optimal control (SOC), (Todorov & Jordan 2002) have shown how human
reaching behavior can result from the minimization of a cost function based only on
the sum of motor commands during a movement and position error at the end of the
movement. Since this cost function does not include any kinematic constraints such as
a desired path of the hand to the target, it is possible the CNS need not be concerned
about the hand path during a reaching movement. However, there has been evidence to
the contrary. For example in (Wolpert, Ghahramani & Jordan 1995) subjects received
artificially curved visual feedback of their hand position during a reaching task that
did not alter target accuracy. Subjects adapted by curving their hand trajectory in the
opposite direction in order to maintain a visually straight path. In this experiment, the
perceived kinematic error became a motivating factor for adaptation.
In this chapter, we further challenge the major SOC assumption that path kinematics
plays no role in sensorimotor learning and adaptation. First we give an overview of
the stochastic optimal control framework, and show how it is used for modeling the
sensorimotor control of human reaching. Next we will present results of a human reaching
experiment in which we attempt to alter motor cost without affecting accuracy. We will
then show how to modify the SOC model in order to better represent our observed
experimental results. Finally, we will conclude with a discussion of the implications of
our results.
3.2 Stochastic Optimal Control
Stochastic optimal control has been a useful tool for modeling the estimation and control
of biological sensorimotor systems, particularlybecausesuchsystems areinherentlynoisy
and partially observable (Loeb, Levine & He 1990). If one assumes a linear model of
dynamics with additive Gaussian noise, and a globally quadratic cost function, a cost
minimizing optimal controller can be computed analytically via the Linear-Quadratic-
Gaussian (LQG) framework. Additionally, (Todorov 2005) extends the LQG framework
to the case of signal-dependent noise, a property believed to exist in biological muscles
(Harris & Wolpert 1998).
3.2.1 Control formulation
We assume a dynamics equation of the form
\[
\mathbf{x}_{t+1} = \mathbf{A}\mathbf{x}_t + \mathbf{B}\mathbf{u}_t + \boldsymbol{\xi}_t + \sum_{i=1}^{c}\varepsilon^i_t\mathbf{C}_i\mathbf{u}_t, \tag{3.1}
\]
where $\mathbf{x}_t$ is the state vector, $\mathbf{u}_t$ is the control vector, and $\boldsymbol{\xi}_t$ and $\varepsilon^i_t$ are the additive and multiplicative zero-mean Gaussian noise variables, respectively. We also assume our sensory feedback is partially observable, and given by the equation
\[
\mathbf{y}_t = \mathbf{H}\mathbf{x}_t + \boldsymbol{\omega}_t, \tag{3.2}
\]
where $\mathbf{y}_t$ are the observations, and $\boldsymbol{\omega}_t$ is zero-mean Gaussian noise. Finally, we assume the following quadratic cost function:
\[
0 \leq \mathbf{x}_t^T\mathbf{Q}_t\mathbf{x}_t + \mathbf{u}_t^T\mathbf{R}\mathbf{u}_t, \tag{3.3}
\]
where $\mathbf{Q}_t$ and $\mathbf{R}$ are positive definite matrices. In (Todorov 2005), the optimal estimator and controller under such assumptions are derived and take the form
\begin{align}
\mathbf{u}_t &= -\mathbf{L}_t\hat{\mathbf{x}}_t \tag{3.4}\\
\hat{\mathbf{x}}_{t+1} &= (\mathbf{A} - \mathbf{B}\mathbf{L}_t)\hat{\mathbf{x}}_t + \mathbf{K}_t(\mathbf{y}_t - \mathbf{H}\hat{\mathbf{x}}_t) + \boldsymbol{\eta}_t, \tag{3.5}
\end{align}
where $\hat{\mathbf{x}}_t$ is the state estimate at time $t$, $\mathbf{L}_t$ are the optimal control gains, $\mathbf{K}_t$ is the optimal estimator gain (the Kalman gain), and $\boldsymbol{\eta}_t$ is zero-mean Gaussian noise.
3.2.2 A model of sensorimotor control of human reaching
The assumptions of linear dynamics and a globally quadratic cost function may seem to
be too restrictive, particularly for the highly nonlinear problem of sensorimotor control of
human reaching. However, as demonstrated in (Todorov 2005), SOC still has predictive
power, particularly in stochastic systems under uncertainty. For modeling human reaching
(as a 2-dimensional problem), 10 states are chosen:
x
t
=
p
x,y
v
x,y
f
x,y
g
x,y
p
∗
x,y
T
(3.6)
p
x,y
, v
x,y
, and f
x,y
are the 2 dimensional position, velocity and forces of a point mass.
g
x,y
is a 2 dimensional muscle state, and p
∗
x,y
is the target location. Linear dynamics
equations of the motor systems are as follows:
p
x,y
(t+Δt)=p
x,y
(t)+v
x,y
(t)Δt (3.7)
34
v
x,y
(t+Δt)=v
x,y
(t)+
f
x,y
(t)
m
Δt (3.8)
f
x,y
(t+Δt)=f
x,y
(t)
1−
Δt
τ
2
+g
x,y
(t)
Δt
τ
2
+D
x,y
x
t
(3.9)
g
x,y
(t+Δt) =g
x,y
(t)
1−
Δt
τ
1
+u
x,y
(t)(1+σ
c
ε
t
)
Δt
τ
2
(3.10)
whereD
x,y
x
t
in (3.9) is some linear, state dependent, external force, such as a force field.
The quadratic cost function is typically defined as follows:
w
p
p
x,y
(t
f
)−p
∗
x,y
2
+w
v
||v
x,y
(t
f
)||
2
+w
f
||f
x,y
(t
f
)||
2
+
Z
t
f
0
u
T
t
Ru
t
dt (3.11)
or in more compact notation:
x
T
t
f
Q
t
f
x
t
f
+
Z
t
f
0
u
T
t
Ru
t
dt (3.12)
With this cost function, the objective of the reaching movement is to be at the target
at a specified time (t
f
), with zero velocity and force (all motion stopped), and using a
minimal amount of control effort possible. It is important to note that this cost function
does not specify any direction or path to the target. Various human-like motions (e.g.
minimum jerk trajectories) are shown to automatically emerge from the minimization of
this minimum effort/maximum accuracy cost function (Todorov 2005).
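The discrete-time dynamics above can be sketched as a single Euler step. This is a minimal sketch assuming a null field (D = 0) and no control noise (sigma_c = 0); the mass m and time constants tau1, tau2 are placeholders, not the values fitted in (Todorov 2005).

```python
import numpy as np

# A minimal sketch of one Euler step of the point-mass/muscle dynamics
# (3.7)-(3.10), with D = 0 (null field) and no control-dependent noise.
# m, tau1, tau2 are placeholder values, not those used in (Todorov 2005).
def reach_step(p, v, f, g, u, dt=0.01, m=1.0, tau1=0.04, tau2=0.04):
    p_next = p + v * dt                              # (3.7)
    v_next = v + (f / m) * dt                        # (3.8)
    f_next = f * (1 - dt / tau2) + g * (dt / tau2)   # (3.9) with D = 0
    g_next = g * (1 - dt / tau1) + u * (dt / tau2)   # (3.10) with sigma_c = 0
    return p_next, v_next, f_next, g_next
```

Iterating this step from rest produces the second-order muscle filter response: the command u first charges the muscle state g, which in turn drives the force f and only then the motion of the point mass.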
3.2.3 Reoptimization as a model of learning
The work of (Izawa, Rane, Donchin & Shadmehr 2008) extends the stochastic optimal
control model to incorporate learning and uncertainty of force field dynamics. When
learning a new force field, the subject will not have full knowledge of D_{x,y} from (3.9). Instead the subject estimates D_{x,y} with

\hat{D}_{x,y} = \alpha D_{x,y}    (3.13)

with 0 < \alpha \le 1. The optimal controller (3.4) is computed using \hat{D}_{x,y}, while during simulation the full D_{x,y} is used, and learning becomes the process of gradually increasing \alpha from 0 to 1.
3.2.3.1 Modeling catch trials
In this chapter, we will extend this concept of reoptimization to additionally model the effects of catch trials. Assuming the actual force field is Z, Before Effects will be modeled simply as

D_{x,y} = Z    (3.14)

\hat{D}_{x,y} = 0    (3.15)

and After Effects are

D_{x,y} = 0    (3.16)

\hat{D}_{x,y} = Z    (3.17)
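The learning model (3.13) and the catch-trial conditions (3.14)-(3.17) can be summarized as pairs of (field used in simulation, field assumed when computing the controller). A sketch, with a scalar Z standing in for the field matrix:

```python
# Sketch of the reoptimization model (3.13) and the catch-trial conditions
# (3.14)-(3.17). Each condition maps to a pair (D used in simulation,
# D_hat assumed by the optimal controller). Z stands in for the field.
def field_pair(Z, condition, alpha=1.0):
    if condition == "learning":        # (3.13): D_hat = alpha * D, 0 < alpha <= 1
        return Z, alpha * Z
    if condition == "before_effects":  # (3.14)-(3.15): field on, controller unaware
        return Z, 0.0
    if condition == "after_effects":   # (3.16)-(3.17): field off, controller expects it
        return 0.0, Z
    raise ValueError(f"unknown condition: {condition}")
```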
Figure 3.1: The prediction made by stochastic optimal control under a rightward viscous
curl field. Each line plots the average of 50 simulations. Note that the optimal action
under this field is to move with leftward curvature.
3.2.3.2 Example with a viscous curl field
Figure 3.1 shows the trajectory predictions made by stochastic optimal control with
reoptimization, under a rightward viscous curl field. This repeats and confirms the result
reported in (Izawa et al. 2008), while additionally showing the prediction made by our
model of before and after effects. Note that, as emphasized in (Izawa et al. 2008), the 100% optimal controller does not move straight to the target, but rather moves with a leftward curvature. It costs less energy to fight the field at first (while velocity is low), and to subsequently allow the field to push the hand towards the target.
3.3 Reaching Experiment
Inspired by the experiment of (Wolpert et al. 1995), we attempt to create a dynamic environment that affects motor cost, but not accuracy: while reaching to a goal, if a force perturbation first pushes the hand off-course in one direction and then subsequently back in the opposite direction, the goal may still be achieved with minimal correction. Fighting this perturbation in order to maintain a straight hand trajectory may unnecessarily increase motor cost.
We attempt to create such a perturbation via a force field generated by a robot manipulandum. Since we also wish to use existing linear optimal control modeling techniques, we implement a simple force field that is a linear function of hand kinematics:

F_x = 2.0 \ddot{y} - 5.0 \dot{x}    (3.18)

Thus, if the subject plans to execute a straight, minimum jerk movement, a disturbing force will first push the hand off course to the right during acceleration, and then subsequently left, back towards the target, during deceleration (Figure 3.2). The viscous term is required to dampen inertia sufficiently such that the deceleration disturbance reverses hand direction.
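The sign flip described above follows directly from (3.18) and the standard minimum jerk acceleration profile. A sketch, using the 25 cm, roughly 0.8 s reach of the experiment:

```python
# The perturbing force (3.18): proportional to forward (y) acceleration,
# plus a viscous term on lateral (x) velocity.
def field_force_x(y_acc, x_vel):
    return 2.0 * y_acc - 5.0 * x_vel

# Acceleration of a minimum jerk reach of distance d (m) over duration T (s).
def min_jerk_acc(t, T=0.8, d=0.25):
    s = t / T
    return d * (60 * s - 180 * s**2 + 120 * s**3) / T**2
```

For a straight minimum jerk plan, the field pushes the hand rightward (positive x) during the acceleration phase and leftward during deceleration, as illustrated in Figure 3.2.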
Additionally, because (Wolpert et al. 1995) already addressed the issue of visually perceived error, we will focus solely on the case of proprioceptive feedback. Thus we will give no visual feedback to our subjects, except for their terminal position.
Figure 3.2: Our acceleration force field shown as arrows for a minimum jerk reaching
movement
3.3.1 Apparatus
We use the ATR parallel-link, 2-DOF, direct drive, air-magnet floating manipulandum (PFM), shown in Figure 3.3. Subjects were seated in a chair and strapped in with a shoulder harness to keep their back against the chair. A wrist brace was attached to their right hand to restrict wrist motion and connect the subject to the handle of the device. The combination of wrist brace and shoulder straps allows only planar shoulder and elbow motion. The subjects cannot see their arm, as a table top covers the entire workspace. Start and target locations are projected onto the table, as well as the positional feedback at the end of the movement. Kinematic data is recorded at a 500 Hz rate. The manipulandum and setup were described in detail previously (Gomi & Kawato 1996).
39
Figure 3.3: The ATR Planar Force Manipulandum (PFM).
3.3.2 Experimental Procedure
Subjects are asked to execute 25 cm point-to-point reaching movements, within a time window of 800 ± 150 ms, to a specified target in a single direction, perpendicular to the frontal plane, away from the body. The target is indicated by a 2.5 cm diameter circle.
Before the start of each trial, the subject moves their hand to the starting location,
indicated by a 2cm diameter circle. During this phase, visual feedback of the hand
position is given (indicated by a dot), but only within 4 cm of the starting location, and
only so the subject can properly position their hand to start. Once inside the starting
circle, visual feedback is removed, and 3 consecutive beeps sound to indicate the start of
the trial. No visual feedback is given during the trial, except a single dot indicating the
final hand position at the end of the trial.
3.3.2.1 Experiment A
We test 6 subjects (5 male, 1 female), ages 20-32. Subjects are tested in a single session consisting of 5 blocks, with a 5 minute break in between each block. In Block 1, subjects train on the null field condition until 70 successful trials have been completed. In Block 2, subjects execute 30 successful null field trials, followed by 80 trials with 10 random catch trials (where the field is switched on). Blocks 3 and 4 are 60 successful trials each in the force field. Finally, in Block 5, subjects execute 30 successful trials followed by 80 trials with 10 catch trials (with the field off). Each subject trains up to a total of 100 successful null field trials, and 150 successful force field trials.
3.3.2.2 Experiment B
In experiment B, we test a total of 10 subjects (9 male, 1 female), ages 20-32. Subjects are tested in a single session consisting of 5 blocks, with a 5 minute break in between each block. In Block 1, subjects train on the null field condition until 70 successful trials have been completed. In Block 2, subjects execute 30 successful null field trials, followed by 18 trials with the force field gain linearly rising from trial to trial (from 0 to full strength), and then followed by an additional 40 successful trials in the full force field. Blocks 3 and 4 continue training in the force field (Block 4 trains 70 successful force field trials). Finally, in Block 5, subjects execute 30 successful trials followed by 80 trials with 10 catch trials (with the field off). Each subject trains up to a total of 100 successful null field trials, and 150 successful force field trials.
3.3.2.3 Data Analysis
In order to quantify the curvature of a trajectory, we compute the average perpendicular
distance from a straight line that connects the start of the trial to the target. We assign
this metric a positive value if the greater portion of the trajectory is on the right side of
the line, otherwise it receives a negative value. For the purpose of this data analysis, the
start of a trial is considered to be the last time the velocity is 0.05m/s before reaching
peak velocity. The end of the trial is the first time the velocity reaches 0.05m/s after
peak velocity.
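The curvature metric above can be sketched as follows. The sign convention here (positive for deviation to the right of a reach in the +y direction) is an assumption consistent with the plots, not a quote of the original analysis code.

```python
import numpy as np

# Sketch of the curvature metric: the mean perpendicular distance of a
# trajectory from the start-target line, signed positive when the greater
# portion of the trajectory lies to the right of the line.
def signed_mean_deviation(traj, start, target):
    traj = np.asarray(traj, dtype=float)
    start = np.asarray(start, dtype=float)
    d = np.asarray(target, dtype=float) - start
    d /= np.linalg.norm(d)                      # unit vector along the reach
    rel = traj - start
    # signed perpendicular distance via the 2D cross product; for a reach
    # in the +y direction, positive values lie on the right (+x) side
    side = rel[:, 0] * d[1] - rel[:, 1] * d[0]
    sign = 1.0 if np.sum(side > 0) >= np.sum(side < 0) else -1.0
    return sign * np.mean(np.abs(side))
```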
3.3.3 Results
3.3.3.1 Experiment A
Figure 3.4 shows average trajectory plots collectively for 6 subjects. Baseline (black) is the average of the last 15 trials of null field training (after at least 100 successful trials in the null field). Before Effects (blue) are the average of the 10 catch trials immediately following null field training. Final Force (red) is the average of the last 15 trials of force field training (after at least 150 successful trials in the force field). After Effects (green) is the average of the 10 catch trials after force field learning.

Figure 3.5 shows how trajectories change during early trials of force field learning. Plotted are the averages (over all 6 subjects) of the first 1-10, 11-20, 21-30, and 31-40 trajectories in the force field, respectively. We see that the effect of force field training gradually brings the trajectory curvature inward towards baseline.
Figure 3.4: Shown are averaged trajectory plots for all 6 subjects during experiment A.
The left figure is the x and y position, and the right is the velocity profile over time.
Black is the baseline trajectory (after null field learning). Red is the final trajectory after
force field learning. Blue and green show the before and after effects (respectively) of
learning.
Figure 3.5: Averaged trajectory plots for all six subjects, during early force field learning trials (trials 1-10, 11-20, 21-30, and 31-40). We see that the curvature moves inwards during training.
Finally, in Figure 3.6 we plot the averaged learning curve during the first block of force field training (Block 3). The y-axis plots the average distance from the line connecting the start of the trial to the target (see Section 3.3.2.3). Each point of the black graph is the trial-by-trial average of all 6 subjects. Blue is a 15-point moving average. The mean value of all baseline trajectories is also shown for reference.
Table 3.1 reports the statistics of the shown trajectories. Mean and Stdev report the mean trajectory error and standard deviation (respectively) of 6 subjects. P and Sig. report the significance from baseline (paired t-test, df=5, * P<0.05, ** P<0.01). Success rates for each trajectory are also shown.
Figure 3.6: Averaged trial by trial learning curve during Block 3 (the first block of force
field training). The y-axis is the average distance from the line connecting the start of
the trial to the target. The baseline value is also shown with a dashed line.
Table 3.1: Statistics for average trajectories from experiment A.

Trajectory             Mean (m)     Stdev (m)   P          Sig.   Success Rate
Baseline               -0.0020      0.0018      --         --     58.9%
Before Effects          0.0128      0.0028      3.0×10^-6  **     20.0%
Force Trials [1-10]     0.0086      0.0030      6.6×10^-5  **     25.0%
Force Trials [11-20]    0.0050      0.0036      3.6×10^-3  **     25.0%
Force Trials [21-30]    0.0038      0.0033      6.5×10^-3  **     36.7%
Force Trials [31-40]    7.7×10^-4   0.0050      0.2500            61.7%
Final Force             0.0032      0.0054      0.0694            65.6%
After Effects          -0.0131      0.0043      7.5×10^-4  **     16.7%
3.3.3.2 Experiment B
Figure 3.7 shows average trajectory plots collectively for 10 subjects. Baseline (black) is the average of the last 15 trials of null field training (after at least 100 successful trials in the null field). Post Ramp (blue) is the first 10 trials immediately following the ramp stage. Final Force (red) is the average of the last 15 trials of force field training (after at least 150 successful trials in the force field). After Effects (green) is the average of the 10 catch trials after force field learning.
Figure 3.7: Shown are averaged trajectory plots for all 10 subjects during experiment B.
The left figure is x and y position, and the right is the velocity profile over time. Black
is the baseline trajectory (after null field learning). Red is the final trajectory after force
field learning. Blue is the first 10 trials after the ramp stage, and green shows the after
effects of learning.
Figure 3.8: Averaged trajectory plots for 10 subjects, during early force field learning trials of experiment B. Ramp [1-10] and Ramp [11-20] are the first and last 10 trials, respectively, during the ramp stage where the force field gain increases from trial to trial. Post Ramp [1-10] and Post Ramp [11-20] are the first 10 and next 10 trials immediately following the ramp stage (with full force field strength).
Figure 3.8 shows trajectories during early force field trials, both during and immediately after the ramp stage, where the force field gain increases from trial to trial.

Figure 3.9 shows the trial-by-trial averaged learning curve for all 10 subjects. In order to compute the trial-by-trial average, we first aligned the trials of each subject by the start of the ramp stage.
Table 3.2 reports the statistics for the shown trajectories in experiment B. Mean and Stdev report the mean trajectory error and standard deviation (respectively) of 10 subjects. P and Sig. report the significance from baseline (paired t-test, df=9, * P<0.05, ** P<0.01). Success rates for each trajectory are also shown.
Figure 3.9: Averaged trial-by-trial learning curve during Block 2, which includes the ramp stage of gradually increasing the force field gain. The y-axis is the average distance from the line connecting the start of the trial to the target. The baseline value is also indicated with a dashed line.
Table 3.2: Statistics for average trajectories from experiment B.

Trajectory             Mean (m)     Stdev (m)   P          Sig.   Success Rate
Baseline               -0.0021      0.0036      --         --     58.0%
Ramp Trials [1-10]      0.0011      0.0047      0.1125            59.0%
Ramp Trials [11-20]     0.0025      0.0058      0.0511            34.0%
Force Trials [1-10]     0.0019      0.0062      0.1015            23.0%
Force Trials [11-20]    4.5×10^-4   0.0056      0.2498            34.0%
Final Force             0.0015      0.0048      0.0768            56.0%
After Effects          -0.0117      0.0017      4.0×10^-6  **     24.0%
3.3.4 Modeling using stochastic optimal control
Using the linear dynamics model described in Section 3.2.1 and the cost function defined by (3.11), we use stochastic optimal control, with multiplicative noise, to compute the optimal control policy under our force field. To add our acceleration based force field to the model, we modify the x component of (3.9) to be:
f_x(t + \Delta t) = f_x(t) \left( 1 - \frac{\Delta t}{\tau_2} \right) + g_x(t) \frac{\Delta t}{\tau_2} + \alpha \frac{f_y}{m}    (3.19)
Figure 3.10: Predictions made by stochastic optimal control in our acceleration based
force field. The optimal trajectory has a strong rightward curvature.
Note that our model remains linear with the addition of this force field term.
Figure 3.10 shows the average results of 50 simulations under the null field, force field, as well as before and after effect catch trials (the modeling of catch trials is described in Section 3.2.3.1). Essentially, the stochastic optimal control model predicts that the optimal behavior is to move with a significant rightward curvature.

Figure 3.11 shows how the reoptimization framework (Section 3.2.3) predicts learning will modify the resulting trajectories. The first trajectory (0% learning) is curved to the right, but slightly misses the target to the left. Reoptimization predicts that subjects will adapt by moving further to the right, and increasing the amount of curvature.
Finally, Figure 3.12 shows the SOC model's prediction of adaptation during the ramp stage of Experiment B, where the force field gain is gradually increased from trial to
Figure 3.11: Predictions made by stochastic optimal control in our acceleration based
force field, and with the reoptimization model of learning. As learning improves, the
trajectory moves further to the right and with additional curvature.
trial. If the subject behaves optimally during the ramp stage, curvature should increase
with increased force field gain.
3.3.5 Modeling using a directionally biased cost function
As Section 3.3.4 shows, the stochastic optimal control model fails to predict the adaptation process of our subjects. In particular, Figure 3.11 predicts that curvature should move outwards with learning, while Figures 3.5 and 3.6 clearly confirm that curvature moves inwards. In this case, the perceived kinematic error must play some role in adaptation.
Figure 3.12: Predictions made by stochastic optimal control in our acceleration based
force field during the ramp stage where the force field gain is gradually increased from
trial to trial. As the gain increases, the SOC model predicts additional curvature.
We can modify the cost function given by (3.11) in order to include a kinematic bias. The new cost function takes the form:

x_{t_f}^T Q_{t_f} x_{t_f} + \int_0^{t_f} \left( u_t^T R u_t + e^{-t/\tau} \left( k_p p_x^2 + k_v v_x^2 \right) \right) dt    (3.20)
where p_x and v_x are the x direction (perpendicular to the target direction) position and velocity, respectively. These new terms bias the motion towards the target at the beginning of the movement. The exponential decay term ensures that this directional bias is only active at the very beginning of the trial. For this experiment, we use \tau = 0.07, which results in the decay term shown in Figure 3.13. Note that with the addition of this new term, the cost function still remains quadratic with respect to the state.
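The decaying bias of (3.20) can be sketched as a pair of time-varying weights; k_p and k_v below are placeholder gains, not the values used in our simulations.

```python
import numpy as np

# Sketch of the exponentially decaying lateral penalty weights in (3.20).
# k_p and k_v are placeholder gains, not the values used in our simulations.
def bias_weights(t, tau=0.07, k_p=1.0, k_v=1.0):
    decay = np.exp(-t / tau)     # the e^{-t/tau} term of Figure 3.13
    return k_p * decay, k_v * decay
```

With tau = 0.07 s, the bias has decayed to under 0.1% of its initial value by 0.7 s, so it shapes only the initial movement direction.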
Figure 3.13: The exponential decay term of (3.20) with τ = 0.07.
First we apply the new cost function to the standard viscous curl field. The result is shown in Figure 3.14.
Next we apply the new cost function to model our acceleration based force field. Figure 3.15 shows the average results of 50 simulations under the null field, force field, as well as before and after effect catch trials. In this case, the optimal result is to move with a trajectory close to baseline. Additionally, Before and After Effects are symmetrical, as they were for our subjects. Overall, this model provides a much better match to the actual subject results shown in Figure 3.4.
Figure 3.16 shows how the direction biased cost function affects the reoptimization framework. Previously, reoptimization predicted adaptation by moving further outward to the right. Now, learning moves curvature inward, towards baseline, as executed by our subjects (Figure 3.5).

Finally, Figure 3.17 shows the direction biased SOC model's prediction of adaptation during the ramp stage of Experiment B, where the force field gain is gradually increased from trial to trial. If the subject behaves optimally during the ramp stage, very little change occurs in the observed trajectories.
Figure 3.14: Stochastic Optimal control, with direction biased cost function, applied to
a rightward viscous curl field. As learning improves, the trajectory becomes straight.
Figure 3.15: Predictions made by the direction biased stochastic optimal control in our
acceleration based force field. The optimal trajectory is close to baseline, and nicely
matches the results observed from subjects.
Figure 3.16: Predictions made by the direction biased stochastic optimal control in our
acceleration based force field, and with the reoptimization model of learning. As learning
improves, the trajectory moves further inward towards baseline, with reduced curvature.
3.4 Discussion
In motor control, the minimization of motor effort, or energy, is of course very practical for biological systems. However, our sensory system is not capable of measuring instantaneous motor cost. Feedback of motor cost can be substantially delayed, perhaps on the time course of hours or days, in terms of muscle fatigue, exhaustion, etc. Therefore, when quickly learning a new dynamic environment, it does not make sense for motor cost to drive adaptation. Rather, perceived kinematic error is a much more reliable indication that something has gone awry with the current internal model. Attempting to reduce this kinematic error provides a systematic way to learn a new internal model. Once the new model is formed, the CNS can turn its efforts to optimizing for energy.
Figure 3.17: Predictions made by the direction biased stochastic optimal control in our acceleration based force field during the ramp stage where the force field gain is gradually increased from trial to trial. As the gain increases, the SOC model predicts very little change to the trajectory.
Figure 3.18: The position and velocity gains of the optimal controller, in the x direction, before (in blue) and after (in green) adding a directional bias to the cost function. The directional bias increases the stiffness and damping during the early part of the movement.
Adding a directional bias to the stochastic optimal control cost function provides a
simple way to model the effect of kinematic error. We can see one effect of this new term
in Figure 3.18. The directional bias increases the stiffness and damping levels during the
early part of the movement. This increased impedance will help to maintain robustness
in light of uncertainty, and is certainly a strategy used by the CNS during learning
(Takahashi et al. 2001), (Franklin, Burdet, Tee, Osu, Chew, Milner & Kawato 2008).
Chapter 4
Task Space Control for Humanoid Robots
4.1 Introduction
Previous chapters outlined how for certain tasks, such as arm reaching, the brain likely encodes the movement in the lower dimensional end effector space (i.e. the hand) rather than over the individual joints of the arm. This way the complexity of the motion problem is reduced to a smaller number of task relevant dimensions. For high degree-of-freedom robotic systems, such as humanoid robots, it is logical to plan motion in a similar manner. For a humanoid, key end effectors that we may want to control include the robot's hands for reaching and grasping, feet for locomotion, or the center-of-gravity (COG) for balance (Figure 4.1). In order to control such task space points, while resolving all remaining redundancy in a suitable way, we need a general framework for inverse kinematics computation in a humanoid robot. However, humanoid robots complicate such matters since they are typically high DOF systems, inherently under-actuated, and have changing contacts and constraints with respect to the environment.
Figure 4.1: The Sarcos/NiCT i-1 humanoid robot with various task space control points
shown.
This chapter addresses such issues, and presents a general framework for inverse kinematics computation for high DOF humanoid robots. This framework allows us to plan motion in a task relevant space, while resolving full body motion in joint space. We also use a floating-base representation: a natural way to represent any robot that is unconfined to a fixed platform and has a constantly changing contact state with the environment, such as a humanoid that may break and create contacts as it walks. We show how to resolve the issue of under-actuation by sufficiently constraining the robot. Additionally, we show how, through task prioritization, we can complete multiple tasks simultaneously, or execute less critical tasks without interfering with higher priority tasks. Finally, we show how to deal with switching contact states, a crucial problem for locomotion.
4.2 The Floating Base Representation of Robot Kinematics

Because most humanoid robots are not physically attached to the world, as are fixed manipulators, the floating base framework not only provides a more general representation for full body humanoid robot control, but is required in order to represent the full configuration with respect to an inertial frame of reference. We attach a base reference frame to the robot, and assume that the base frame can move freely relative to a fixed inertial frame. We can then completely describe the configuration of a rigid body robot, with respect to the inertial frame, as:

q = \begin{bmatrix} q_r^T & x_b^T \end{bmatrix}^T    (4.1)

where q_r \in R^n is the joint configuration of the rigid body robot with n joints and x_b \in R^6 is the position and orientation of the coordinate system attached to the robot base, measured with respect to the fixed inertial frame. Figure 4.2 illustrates this representation by showing the 6 virtual degrees of freedom attached from the inertial frame to the robot base frame.
Floating-base kinematics and dynamics have been extensively analyzed for spacecraft control (Umetani & Yoshida 1989), (Vafa & Dubowsky 1990), (Dubowsky & Papadopoulos 1993), (Caccavale & Siciliano 2001) and in recent years for the control of humanoid robots (Sentis & Khatib 2005), (Park & Khatib 2006), (Sentis & Khatib 2006), (Khatib, Sentis & Park 2007) under the framework of Khatib's operational space control (Khatib 1987). The advantage of Khatib's framework is that it can compute the control forces that decouple task and null-space dynamics. Additionally, as it is a force control method, it
Figure 4.2: The base frame attached to the robot is connected to the inertial frame via 6
unactuated virtual DOFs
does not use joint position references, and thus does not require explicit inverse kinematics computation. Because these methods do not explicitly control for joint positions, they can maintain a high level of compliance and thus are suitable for use in environments with human interaction.¹
In (Nakanishi et al. 2008) we conducted an extensive theoretical and empirical evaluation of task space controllers for compliant control of fixed-base redundant manipulators. The types of controllers examined included velocity, acceleration and force based methods, where velocity and acceleration based methods utilize explicit inverse kinematics to compute joint references before feeding into an inverse dynamics computation. We found that the force control methods were highly sensitive to rigid-body dynamics model inaccuracies, in particular the inertia matrix, which must be inverted. In this particular evaluation, we found that a joint acceleration-based controller coupled with inverse dynamics had slightly better performance in light of modeling errors.
¹There is also evidence that humans do not explicitly control their joint positions in arm reaching tasks; see Chapter 2.
A goal of this chapter is to complement the force control work of Sentis, Park, and Khatib, by focusing on the specific issues related to inverse kinematics for floating base systems in order to provide alternate control solutions. In particular, if we have a methodology for inverse kinematics computation for floating base humanoids, we can then easily utilize velocity or acceleration based task-space control coupled with floating base inverse dynamics (Chapter 5), as well as traditional joint position control. These methods, which maintain joint position references, can potentially provide a more robust solution in light of dynamic model inaccuracies. Figure 4.3 is a high level overview of a potential control framework. If we have an accurate inverse dynamics model, we can operate the feedback controller at near zero gain in order to keep compliance high. However, in practice, dynamics models are rarely accurate, so we still use a moderate feedback gain to maintain robustness at the cost of higher stiffness. Additionally, pure position based feedback control can be very useful for learning system dynamics, since we can execute trajectories without any prior dynamics knowledge, in order to collect data for building dynamics models (Ting, Mistry, Peters, Schaal & Nakanishi 2006).
Figure 4.3: Inverse Kinematics/Inverse Dynamics control diagram. x_d is a desired task, u_{ff} is the feedforward torque, and u_{fb} is the feedback torque.
Next we will introduce floating base Jacobians and the other notation we will use. We then provide and prove an inverse kinematics equation, and the sufficient conditions, that allow for the control of arbitrary task variables, regardless of the under-actuation of the floating base. We then suggest a controller which controls for static balance, as well as other tasks, while maintaining environmental constraints. Finally, we demonstrate the feasibility of the controller on an actual humanoid robot platform.
4.3 Floating Body Kinematics
In this section we will define the necessary notation and the Jacobians required for floating base inverse kinematics. We will also discuss the role of constraints in the control of unactuated degrees of freedom.
4.3.1 Floating Base Jacobians
If x = f(q) is the position and orientation of any frame located on the robot, represented in the fixed (inertial) world coordinate system, the velocity of x is given by the equation \dot{x} = J \dot{q}, where

J = \begin{bmatrix} \frac{\partial x}{\partial q_r} & \frac{\partial x}{\partial x_b} \end{bmatrix} = \begin{bmatrix} J_{\partial q_r} & J_{\partial x_b} \end{bmatrix},    (4.2)
and J_{\partial q_r} and J_{\partial x_b} are the components of J relating to robot joint and base motion, respectively. Each component of the Jacobian can be computed geometrically (Sciavicco & Siciliano 2001). For example, end-effector motion relative to base motion is

J_{\partial x_b} = \frac{\partial x}{\partial x_b} = \begin{bmatrix} I_{3 \times 3} & R_b \times \left( x^P - x_b^P \right) \\ 0_{3 \times 3} & I_{3 \times 3} \end{bmatrix}    (4.3)
where x^P \in R^3 are the three positional components of x, and R_b \times is a column-wise cross product of R_b, the rotation matrix representing base orientation relative to the fixed world frame. Other important free-floating base Jacobians to consider are the Jacobian describing motion of the base itself:

J_b = \begin{bmatrix} 0_{6 \times n} & I_{6 \times 6} \end{bmatrix}    (4.4)
as well as the center of gravity of the robot (Sugihara 2004):
$$J_{cog} = \begin{bmatrix} \dfrac{\partial x^P_{cog}}{\partial q_r} & I_{3\times 3} & R_b \times \left( x^P_{cog} - x^P_b \right) \end{bmatrix} \quad (4.5)$$
where $x^P_{cog}$ is the position of the robot's center of gravity, and
$$\frac{\partial x^P_{cog}}{\partial q_r} = \frac{\sum_i m_i J_{i,cog}}{\sum_i m_i} \quad (4.6)$$
is the mass-weighted average of the Jacobians of the COGs of the individual links.
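As a minimal numerical sketch of eq. (4.6) (not code from the thesis), assuming NumPy; the function name `cog_jacobian_wrt_joints` and its arguments are hypothetical:

```python
import numpy as np

def cog_jacobian_wrt_joints(link_jacobians, masses):
    """Mass-weighted average of the per-link COG Jacobians, eq. (4.6).

    link_jacobians: list of (3, n) arrays, one per link.
    masses: list of link masses.
    """
    total_mass = sum(masses)
    # Sum of m_i * J_{i,cog}, normalized by total mass.
    J = sum(m * Ji for m, Ji in zip(masses, link_jacobians))
    return J / total_mass
```

Each column of the result is the sensitivity of the whole-body COG position to one joint velocity.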
4.3.2 Inverse Kinematics with Floating Base

Task space control of robots using floating base Jacobians is complicated due to the addition of unactuated degrees of freedom. Computing joint velocities via a traditional Jacobian pseudo-inverse approach will create velocities for the base position and orientation which may not be realizable due to under-actuation. However, if we sufficiently constrain our system, we will be able to fully realize end-effector velocities, provided certain conditions are met. Calling $x_C \in \mathbb{R}^k$ the vector of $k$ constraints, and $J_C$ the Jacobian of constraints ($\dot{x}_C = J_C \dot{q}$), we can use the following equation to compute the required actuated joint velocities ($\dot{q}_r$) that will realize an arbitrary desired end-effector velocity ($\dot{x}_d$):
$$\dot{q}_r = \begin{bmatrix} I_{n\times n} & 0_{n\times 6} \end{bmatrix} \begin{bmatrix} J_C \\ J \end{bmatrix}^{+} \begin{bmatrix} 0_{k\times 1} \\ \dot{x}_d \end{bmatrix}, \quad (4.7)$$
where $A^{+} = A^T \left( A A^T \right)^{-1}$ is the right pseudoinverse, provided the following sufficient conditions are satisfied:

i) $\dot{x}_C = 0$ is maintained (there is sufficient friction and stiffness at the constraint locations to prevent motion relative to the inertial frame).

ii) The system is not over-constrained: the matrix $\begin{bmatrix} J_C^T & J^T \end{bmatrix}^T$ should remain full row rank.

iii) The system is not under-constrained: recalling the notation introduced in (4.2) ($J_C = \begin{bmatrix} J_C^{\partial q_r} & J_C^{\partial x_b} \end{bmatrix}$), the constraint Jacobian related to base motion ($J_C^{\partial x_b}$) must have a rank equal to 6.
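As an illustrative sketch of eq. (4.7) (not the thesis's own implementation), assuming NumPy and small generic dimensions; the function name `floating_base_ik` is hypothetical:

```python
import numpy as np

def floating_base_ik(J_task, J_C, xdot_d, n):
    """Actuated joint velocities realizing xdot_d while holding the
    k constraints stationary (xdot_C = 0), following eq. (4.7).

    J_task, J_C each have n + 6 columns (n joints plus 6 base DOFs).
    """
    A = np.vstack([J_C, J_task])                      # stacked constraint/task Jacobians
    rhs = np.concatenate([np.zeros(J_C.shape[0]), xdot_d])
    qdot = np.linalg.pinv(A) @ rhs                    # right-pseudoinverse solution, eq. (4.8)
    return qdot[:n]                                   # selector [I_{n x n}  0_{n x 6}]
```

For a full-row-rank stack, the resulting full-configuration velocity satisfies both $J_C \dot q = 0$ and $J \dot q = \dot x_d$.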
Proof. If $\begin{bmatrix} J_C^T & J^T \end{bmatrix}^T$ is full row rank (condition 2), its right pseudoinverse exists, and we can compute $\dot{q}$ as:
$$\dot{q} = \begin{bmatrix} J_C \\ J \end{bmatrix}^T \left( \begin{bmatrix} J_C \\ J \end{bmatrix} \begin{bmatrix} J_C \\ J \end{bmatrix}^T \right)^{-1} \begin{bmatrix} 0_{k\times 1} \\ \dot{x}_d \end{bmatrix}. \quad (4.8)$$
Assuming that this $\dot{q}$ can be fully realized by some controller, and contact points do not slip (condition 1), it is trivial to verify that $\dot{x} = \dot{x}_d$ and $\dot{x}_C = 0$ (the task is successfully completed while constraints are maintained). Additionally, because of (4.8), $\dot{q}$ will be in the null space of the constraints, and we can write:
$$J_C \dot{q} = J_C^{\partial q_r} \dot{q}_r + J_C^{\partial x_b} \dot{x}_b = 0. \quad (4.9)$$
We can then solve for $\dot{x}_b$:
$$\dot{x}_b = -\left[ J_C^{\partial x_b} \right]^{+} J_C^{\partial q_r} \dot{q}_r. \quad (4.10)$$
Only when (4.10) has a unique solution will we be able to represent $\dot{q}$ unambiguously using only $\dot{q}_r$. Otherwise, for a fixed $\dot{q}_r$, $x_b$ may be able to move independently. Note that $J_C^{\partial x_b} \in \mathbb{R}^{k\times 6}$. If $k < 6$, (4.10) has potentially an infinite number of solutions. If $k \geq 6$ and $\mathrm{Rank}\left( J_C^{\partial x_b} \right) = 6$, a unique solution is guaranteed (Strang 1988)², and given by the left pseudo-inverse ($A^{+} = \left( A^T A \right)^{-1} A^T$).

² Assuming at least one solution exists, which we are able to do, since we also assume (4.9) is true.

Figure 4.4: Examples of systems that are adequately constrained ($\mathrm{Rank}\left( J_C^{\partial x_b} \right) = 6$): double surface contact, single surface contact, double edge contact (non-collinear), triple point contact (non-collinear).

Finally, when $\mathrm{Rank}\left( J_C^{\partial x_b} \right) = 6$, we can represent $\dot{q}$ unambiguously as:
$$\dot{q} = \begin{bmatrix} I_{n\times n} \\ -\left[ \left( J_C^{\partial x_b} \right)^T J_C^{\partial x_b} \right]^{-1} \left( J_C^{\partial x_b} \right)^T J_C^{\partial q_r} \end{bmatrix} \dot{q}_r. \quad (4.11)$$
Because of condition 3), the structure of $J_C^{\partial x_b}$ determines whether or not our system is controllable with respect to an arbitrary task point on the robot. The minimum number of constraints required is 6; however, their relative positions are also critical. For example, two edge contacts are adequate provided they are not collinear. Figures 4.4 and 4.5 illustrate some examples of adequately constrained and under-constrained configurations, respectively.
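Condition (iii) can be checked numerically. A hedged sketch, assuming NumPy and the convention that the base-related columns of $J_C$ are its last six; the helper name is hypothetical:

```python
import numpy as np

def base_adequately_constrained(J_C, n, tol=1e-9):
    """Check condition (iii): the base-related block of the constraint
    Jacobian (its last 6 columns, J_C^{dx_b}) must have rank 6."""
    J_C_base = J_C[:, n:]                         # shape (k, 6)
    return np.linalg.matrix_rank(J_C_base, tol=tol) == 6
```

A double surface contact (12 constraint rows spanning all 6 base directions) passes; three or fewer independent constraints cannot.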
Figure 4.5: Examples of systems that are under-constrained ($\mathrm{Rank}\left( J_C^{\partial x_b} \right) < 6$): single edge contact, double point contact, triple collinear point contact, double collinear edge contact.
It is important to note that when constraints change (for example, when feet make and break contact with the ground during locomotion), $J_C$ will abruptly change structure (possibly size as well, if new constraints are added or removed), creating undesirable discontinuities in $\dot{q}_r$ at the moment of constraint switch. Methods for reducing these discontinuities for locomotion are addressed in Section 4.6.
4.4 Task Prioritization

Robots can utilize redundancy, if available, to achieve multiple tasks simultaneously. A robot may have a set of $n$ tasks, each represented by the vector $x_i$, $i = 1 \ldots n$. It may be possible to compute a configuration velocity that will achieve all $n$ tasks simultaneously by utilizing the pseudo-inverse of an augmented Jacobian matrix, created by stacking the $n$ task Jacobians together:
$$\dot{q} = \begin{bmatrix} J_1 \\ J_2 \\ \vdots \\ J_n \end{bmatrix}^{\#} \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \vdots \\ \dot{x}_n \end{bmatrix}. \quad (4.12)$$
For convenience we will write augmented Jacobians and velocity vectors, stacked with $i$ components, as $\hat{J}_i$ and $\hat{\dot{x}}_i$ respectively, and can therefore rewrite (4.12) as
$$\dot{q} = \hat{J}_n^{\#} \hat{\dot{x}}_n. \quad (4.13)$$
However, occasions can arise when two or more tasks are in conflict, i.e. one task cannot be completed without violating another task. In this case, the augmented Jacobian of (4.13) will drop rank and become singular. The pseudo-inverse can create unreasonably high joint velocities when approaching singularities, and discontinuities when rank is changed (Maciejewski & Klein 1985). Note that we can potentially deal with singularities by using a damped pseudo-inverse, as discussed in (Baerlocher & Boulic 1998):
$$J^{\#\lambda} = J^T \left( J J^T + \lambda^2 I \right)^{-1}$$
at the cost of inaccurate control of task variables. Another way to prevent singularities is to employ a task-priority framework, whereby lower priority tasks are guaranteed not to interfere with the achievement of higher priorities. In addition, if it becomes necessary to sacrifice a lower priority task for a higher one, then the lower priority task should be executed with a minimum amount of error.
Task-priority control of redundant manipulators has been studied in several papers (Nakamura, Hanafusa & Yoshikawa 1987, Siciliano & Slotine 1991, Chiaverini 1997, Baerlocher & Boulic 1998, Marani, Kim, Yuh & Chung 2003). In one of the original works (Nakamura et al. 1987), task-priority based control of two tasks is defined in the following way:
$$\dot{q} = J_1^{\#} \dot{x}_1 + \left( J_2 N_1 \right)^{\#} \left( \dot{x}_2 - J_2 J_1^{\#} \dot{x}_1 \right), \quad (4.14)$$
where $N_i = I - J_i^{\#} J_i$ is the null-space projection matrix of $J_i$, and the task defined by $\dot{x}_1$ is claimed to have higher priority than the task defined by $\dot{x}_2$. The framework is generalized, recursively, in (Siciliano & Slotine 1991) for the prioritization of $n$ tasks:
generalized, recursively, in (Siciliano & Slotine 1991) for the prioritization of n tasks:
˙ q
1
= J
#
1
˙ x
1
˙ q
i
= ˙ q
i−1
+
J
i
ˆ
N
i−1
#
(˙ x
i
−J
i
˙ q
i−1
) (4.15)
˙ q = ˙ q
n
where
ˆ
N
i−1
is defined to be the null space projection of the augmented Jacobian
ˆ
J
i−1
:
ˆ
N
i−1
=I−
J
1
J
2
.
.
.
J
i−1
#
J
1
J
2
.
.
.
J
i−1
,
and task $i-1$ is claimed to have higher priority than task $i$. Unfortunately this formulation also has difficulty handling task conflicts, since so-called algorithmic singularities are created when one task will violate another (and at least one $J_i \hat{N}_{i-1}$ term will become singular). By using a manipulability measure, (Marani et al. 2003) is able to show that algorithmic singularities in the two task case of (4.14) occur if and only if the augmented Jacobian of both tasks, i.e. (4.13), is singular. Thus the formulations of (4.13) and (4.15) have two striking similarities. They both achieve accurate tracking of all $n$ tasks whenever possible, and when two or more tasks conflict, they both run into singularity difficulties.
It is proved in Appendix B.1 that when $\hat{J}_n$ has full row rank, for an arbitrary $n > 1$ tasks, the two formulations are precisely equivalent, i.e.:
$$\dot{q}_n = \dot{q}_{n-1} + \left( J_n \hat{N}_{n-1} \right)^{\#} \left( \dot{x}_n - J_n \dot{q}_{n-1} \right) = \hat{J}_n^{\#} \hat{\dot{x}}_n. \quad (4.16)$$
Note that the relation of (4.16) does not hold when $\hat{J}_n$ is not full row rank (in which case SVDs are used to compute the pseudoinverses). However, it is proved in Appendix B.2 that, regardless of the rank of $\hat{J}_n$, the following relation does hold:
$$\det\left( \hat{J}_n \hat{J}_n^T \right) = \prod_{j=1}^{n} \det\left( \left( J_j \hat{N}_{j-1} \right) \left( J_j \hat{N}_{j-1} \right)^T \right) \quad (4.17)$$
Thus as the matrix $\hat{J}_n$ approaches a singularity, so will at least one $J_i \hat{N}_{i-1}$ term of (4.15). Consequently, the only benefit of (4.15) over (4.13) occurs precisely at singularities, where (4.15) is still able to guarantee task achievement of higher priority tasks. However, practically, operating at singularities is unrealistic since approaching them can generate unreasonably high joint velocities, as well as large discontinuities at the rank transition. Therefore in its current form, (4.15) is no more useful as a task-priority formulation than (4.13), and we must turn to singularity robust methods. (Chiaverini 1997) mentions an alternative form of task prioritization:
$$\dot{q} = J_1^{\#} \dot{x}_1 + N_1 J_2^{\#} \dot{x}_2, \quad (4.18)$$
which (Baerlocher & Boulic 1998) claims has poor tracking of lower priorities (even if redundancy is available for the achievement of all tasks). However, when two tasks of different priorities conflict, there are no problematic singularities. Because of this property, this technique, coupled with the augmented Jacobians of (4.13) and damped pseudo-inverses, will be employed in the development of our control algorithms.
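The two ingredients just named, the damped pseudo-inverse and the null-space prioritization of eq. (4.18), can be sketched as follows. This is an illustrative NumPy sketch with hypothetical helper names, not the thesis's code:

```python
import numpy as np

def damped_pinv(J, lam=0.05):
    """Singularity-robust pseudoinverse: J^T (J J^T + lam^2 I)^-1."""
    k = J.shape[0]
    return J.T @ np.linalg.inv(J @ J.T + lam**2 * np.eye(k))

def two_task_qdot(J1, xdot1, J2, xdot2, lam=0.05):
    """Null-space prioritization of eq. (4.18): the secondary task is
    projected into the null space of the primary one."""
    J1_pinv = np.linalg.pinv(J1)
    N1 = np.eye(J1.shape[1]) - J1_pinv @ J1       # null-space projector of J1
    return J1_pinv @ xdot1 + N1 @ damped_pinv(J2, lam) @ xdot2
```

Near a singularity of $J_2$ the damped inverse stays bounded, trading task accuracy for robustness as $\lambda$ grows.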
4.5 Task Space Control for Humanoid Tasks

The high number of DOFs of humanoids allows for a variety of tasks to be completed simultaneously, for example locomotion while manipulating a tool and orienting the head to look at an object. However, in order to maintain stability and controllability of the robot, it is critical to manage the different tasks in an appropriate hierarchical manner. Therefore, similarly to (Sentis & Khatib 2006), we categorize our robot's tasks into one of three prioritization groups:

i) Critical tasks: essential tasks which must succeed. All other tasks depend on the success of these tasks.

ii) Functional tasks: tasks we want to accomplish with a high level of accuracy, but their success or failure generally does not affect other tasks. This group could potentially have additional sub-levels of hierarchy within it.

iii) Extraneous tasks: these tasks are used for resolving any remaining redundancy (such that the robot does not drift in null-space). Success of these tasks is generally irrelevant.
Once categorized, these tasks will be controlled via a task prioritization framework. As emphasized in Section 4.3.2, in order to be able to use (4.7) to control arbitrary tasks, it is critical that $\dot{x}_C = 0$ at all times. If a contact point ever has a nonzero velocity, then the robot's base will undergo motion independent of the joints, and as a consequence we lose the controllability of all tasks. Thus, when controlling the robot, the most critical task is to maintain zero velocity at all contact points. Secondly, balance is obviously critical, since a fall can be disastrous. Thus we will always place emphasis on static stability (maintaining the COG over the support polygon). For the purpose of this section, we will assume that the robot is in double surface contact with the environment (both feet are flat on the floor), and we define our critical task Jacobian as:
$$J_A = \begin{bmatrix} J_{rightfoot} \\ J_{leftfoot} \\ J_{cog,xy} \end{bmatrix}, \quad (4.19)$$
where $J_{rightfoot}$ and $J_{leftfoot}$ are the Jacobians for the positions and orientations of the two feet, and $J_{cog,xy}$ is the Jacobian for the projection of the COG onto the XY plane (assuming the gravity vector is in the Z direction). Since we plan to use a matrix pseudoinverse for control, as with (4.7), it is critical for robot safety that we avoid any singularities of $J_A$. Singularities will occur when two or more tasks are in conflict, i.e. when the task of controlling the XY position of the COG conflicts with the task of maintaining the position of either foot. This particular situation would only occur when the robot is fully extended across the floor (i.e. the robot has already fallen), or has its legs in a full split position. Thus in practice, as long as the robot is upright, we have no real concerns of $J_A$ reaching a singularity. Note that fully extended knees are not a singular configuration in this case, and balance can still be maintained (provided the ankle joints have sufficient torque).
For a functional task we may want to control hand motion, for example, and thus we define the functional task Jacobian as:
$$J_B = \left[ J_{righthand} \right], \quad (4.20)$$
where $J_{righthand}$ is the Jacobian for the right hand.
Finally we define our lowest priority (extraneous) tasks as:
$$J_E = \begin{bmatrix} J_{rpy} \\ J_{cog,z} \end{bmatrix} \quad (4.21)$$
where $J_{rpy}$ dictates base orientation (in roll, pitch, yaw angles), and $J_{cog,z}$ the height of the COG above ground. Note that it is necessary to control for these variables in order to roughly maintain some desired posture, and prevent the robot from drifting in null-space.
Next we will define the vectors of desired end-effector velocities, noting that $\dot{x}_{rightfoot} = \dot{x}_{leftfoot} = 0$ in order to maintain constraints:
$$\dot{x}_A = \begin{bmatrix} 0 \\ 0 \\ \dot{x}_{cog,xy} \end{bmatrix}, \quad \dot{x}_B = \left[ \dot{x}_{righthand} \right], \quad \dot{x}_E = \begin{bmatrix} \dot{x}_{rpy} \\ \dot{x}_{cog,z} \end{bmatrix}. \quad (4.22)$$
Finally, we will compute the desired joint velocities as:
$$\dot{q}_r = \begin{bmatrix} I_{n\times n} \\ 0_{6\times n} \end{bmatrix}^T \left( J_A^{+} \dot{x}_A + N_A J_B^{+} \dot{x}_B + N_{AB} J_E^{+} \dot{x}_E \right), \quad (4.23)$$
where $N_A = I - J_A^{+} J_A$ is the null space projection matrix of $J_A$, and
$$N_{AB} = I - \begin{bmatrix} J_A \\ J_B \end{bmatrix}^{+} \begin{bmatrix} J_A \\ J_B \end{bmatrix} \quad (4.24)$$
projects into the intersection of the $J_A$ and $J_B$ null spaces. Note that we conduct task prioritization via null space projections, which as a consequence does not track secondary tasks as accurately as other task prioritization frameworks (Baerlocher & Boulic 1998). However, (4.23) uses the only task prioritization framework that will guarantee the success of $\dot{x}_A$ (our critical tasks) for all time (except at rare singularities of $J_A$, as discussed previously), and does not suffer from the problem of algorithmic singularities as the other approaches do (Nakamura et al. 1987), (Siciliano & Slotine 1991), or require a less accurate damped pseudoinverse (Chiaverini 1997). For more discussion of these issues in relation to floating base kinematics, please refer to (Mistry, Nakanishi & Schaal 2007).³
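The three-level controller of eq. (4.23) can be sketched by combining the pieces above. This is a hedged NumPy illustration with small generic Jacobians; `whole_body_qdot` is a hypothetical name:

```python
import numpy as np

def whole_body_qdot(J_A, xdot_A, J_B, xdot_B, J_E, xdot_E, n):
    """Sketch of eq. (4.23): critical tasks (constraints + COG) first,
    the functional task in their null space, and the extraneous task in
    the null space of both. All Jacobians have n + 6 columns."""
    dof = n + 6
    J_A_pinv = np.linalg.pinv(J_A)
    N_A = np.eye(dof) - J_A_pinv @ J_A            # null space of J_A
    J_AB = np.vstack([J_A, J_B])
    N_AB = np.eye(dof) - np.linalg.pinv(J_AB) @ J_AB   # eq. (4.24)
    qdot = (J_A_pinv @ xdot_A
            + N_A @ np.linalg.pinv(J_B) @ xdot_B
            + N_AB @ np.linalg.pinv(J_E) @ xdot_E)
    return qdot[:n]                               # actuated joints only
```

When the three task directions are mutually non-conflicting, each task velocity is realized exactly while lower priorities never disturb higher ones.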
Extending (4.23) to the 2nd order acceleration case, we have:
$$\ddot{q} = J_A^{+} \left( \ddot{x}_A - \dot{J}_A \dot{q} \right) + N_A J_B^{+} \left( \ddot{x}_B - \dot{J}_B \dot{q} \right) + N_{AB} J_E^{+} \left( \ddot{x}_E - \dot{J}_E \dot{q} \right) \quad (4.25)$$
and $\ddot{q}_r = \begin{bmatrix} I_{n\times n} & 0_{n\times 6} \end{bmatrix} \ddot{q}$. Ideally, we would like to use (4.25) in conjunction with floating base inverse dynamics in order to compute joint torques (Chapter 5).

After computing $\dot{q}_r$ from (4.23), $q_r$ can be computed via numerical integration:
$$q_r^{t} = q_r^{t-1} + \Delta T \, \dot{q}_r \quad (4.26)$$
where $t$ is the current time step, and $\Delta T$ is the control cycle period. Similarly, if equation (4.25) is used:
$$\ddot{q}_r^{t} = \begin{bmatrix} I_{n\times n} & 0_{n\times 6} \end{bmatrix} \ddot{q} \quad (4.27)$$
$$\dot{q}_r^{t} = \dot{q}_r^{t-1} + \Delta T \, \ddot{q}_r^{t} \quad (4.28)$$
$$q_r^{t} = q_r^{t-1} + \Delta T \, \dot{q}_r^{t} \quad (4.29)$$
³ Also, because technically $J_A$ and $J_B$ could conflict, we should replace $N_{AB}$ with $N_A N_{AB}^{\lambda}$, where $N_{AB}^{\lambda}$ uses a damping for the pseudo-inverse; see (Mistry et al. 2007) for more detail.
Note that (4.25) requires knowledge of $\dot{q}$, which is often obtained via noisy sensors. In order to ensure that $\dot{q}$ is consistent with the constraints, we either need to use the $\dot{q}^{t-1}$ computed from the previous time step via integration, or use current measured or estimated values filtered through the null space of the constraints ($N_A$).
4.5.1 Evaluations
4.5.1.1 Platform
We evaluate our controller on the CBi humanoid robot (figure 4.6), developed by SARCOS Corporation and ATR Computational Neuroscience Laboratories (Cheng, Hyon, Morimoto, Ude, Hale, Colvin, Scroggin & Jacobsen 2007). CBi is a full-body, 51 degree of freedom humanoid (2 × 2 DOF eyes, 1 × 3 DOF neck, 2 × 7 DOF arms, 2 × 7 DOF legs, 1 × 3 DOF torso, 2 × 6 DOF fingers, 1 × 1 DOF jaw). CBi is approximately 157.5 cm in height and 87 kg in weight. All DOFs except the eyes, fingers, and jaw are hydraulically actuated. The eyes and jaw use electric motors and the fingers use pneumatics. Each joint has local position, velocity, and torque control available at a 5 kHz rate. Position, velocity, and torque set points are computed on an off-board computer and communicated via Ethernet at a 500 Hz bandwidth.
4.5.1.2 COG Tracking

In the first experiment we attempt COG tracking control using (4.23) along with (4.26) and high gain joint position control on the robot. Although a support harness exists for safety, it does not bear any load, and the robot is fully standing and balancing by its own power. The projection of the COG onto the floor is given a side-to-side sinusoidal trajectory to track (shown in green in figure 4.7). The front-back COG position (lower graph) is commanded to remain in a fixed position. Actual COG positions are shown in blue.

Figure 4.6: Sarcos/ATR CBi Humanoid Robot
4.5.1.3 Hand Tracking with Simultaneous Balance

Next we attempt to track a hand trajectory while simultaneously maintaining static balance (also using joint position control). The COG is controlled to remain at a fixed position within the support polygon. The right hand position is given a figure-8 trajectory to track in the perpendicular plane. Tracking performance in a time series graph is shown in figure 4.8, and in the plane in figure 4.9. Note that the controller tracks the desired hand trajectory in the null space of the COG controller, and is therefore less accurate. However, the arm motion theoretically does not interfere with the static balance of the robot. COG tracking during this experiment is shown in figure 4.10. Please see the video associated with this work for the full motion.

Figure 4.7: Tracking of the COG position in the horizontal plane. The top graph shows the left-right sinusoidal target (green) and actual position (blue). The bottom graph shows the front-back tracking performance.
4.5.2 Discussion

Ideally we would like to control the robot via floating base inverse dynamics (Chapter 5). This would allow us to achieve accurate tracking performance while maintaining compliance, since high gain position controllers would not be required. However, the accurate estimation of the inertial parameters of a floating base inverse dynamics model is challenging, and work continues to be ongoing. The position based control explored here has the advantage of model-freedom, and demonstrates some nice results on simple tasks. However, for critical tasks such as balancing, we would prefer more accurate control methods. For example, moderate to low task space gains contribute to the tracking lags observed in the COG tracking task (figure 4.7). Raising task space gains may result in better tracking performance, with a higher risk of instability. Limitations of the position control approach become obvious as we attempt more complicated or faster motion, which may generate more momentum. Unfortunately, the model-free techniques cannot cope with the disturbances due to momentum, and the need for dynamic balance (Vukobratovic & Stepanenko 1972), (Vukobratovic & Borovac 2004) becomes apparent. The hand tracking experiment demonstrates the feasibility of whole body motions while maintaining static balance. Again, faster motion would require model-based approaches. In addition, we would like to attempt whole body motion during constraint switching, for more complex tasks including locomotion.

Figure 4.8: Time series graph of right hand tracking of a figure 8 in the perpendicular plane. The top graph shows the left-right position, and the bottom is the vertical. Green is desired and blue is actual.

Figure 4.9: Right hand tracking of a figure 8 in the perpendicular plane. Green is desired position and blue is actual. Note that performance is degraded because the hand controller operates in the null-space of the balance/constraint controller.

Figure 4.10: COG tracking in the horizontal plane during the hand figure-8 task. The COG is commanded to remain in a fixed position.
4.6 Task Space Control for Simultaneous Locomotion and Balance

In this section we use the floating base task space control framework for the locomotion of a quadruped robot, with the eventual goal of implementing a similar technique on a bipedal humanoid robot. In order to achieve static locomotion in a robot, the position of the COG must be accurately controlled so that it does not drift outside the support polygon. However, in order to do so with floating base kinematics, we must have sufficiently many constraints (to allow for control of unactuated DOFs), and our controllers must always maintain those constraints. If at any time a constraint is allowed to be violated, the COG will no longer be controllable. Therefore, we must always place the COG and constraints in the same and highest level of priority, i.e. together in an augmented Jacobian. It may be possible that the COG trajectory can conflict with constraints: the most common example would be when the COG is commanded to go as high as possible, which is only achievable if the feet lift from the floor (thus breaking constraints). Therefore, we will only place the x and y (floor plane) components of the COG Jacobian in this augmented Jacobian. The z position (height) of the COG will be controlled at a lower level of priority. It may still be possible that an x or y component of the COG will violate the constraints; however, we will assume that these postures only occur when the COG is outside the support triangle, when stability is lost anyway. Our secondary concern is for the position of the swing leg, for as accurate foot placement as possible. We will investigate placing control of this task variable both in primary and secondary levels. Finally, any remaining degrees of freedom (COG height and orientation of the body) must never be able to interfere with the constraints, COG, or swing leg, and will always be placed at the lowest level of priority.
4.6.1 Controller Formulations

4.6.1.1 Locomotion Control Method 1

In our first controller implementation for locomotion, we place the swing leg in the same level of priority as the constraints and COG. The main advantages are that the structure of the primary Jacobian does not change with the constraints, and thus desired joint velocities remain continuous (provided desired end-effector velocities remain continuous). The major disadvantage of this approach is that the swing leg can potentially conflict with the COG (for example, when reaching for a target outside of the leg's workspace), and thus swing leg trajectories must be planned in advance to not interfere with the COG trajectory. We will define:
$$\hat{J}_A = \begin{bmatrix} \hat{J}_{stance} \\ J_{swing} \\ J_{COG,XY} \end{bmatrix}, \quad \hat{J}_P = \begin{bmatrix} J_{RPY} \\ J_{COG,Z} \end{bmatrix}$$
where $\hat{J}_{stance}$ is the augmented Jacobian containing a stack of all legs in contact with the ground, $J_{swing}$ is the Jacobian of the swing leg, $J_{RPY}$ dictates base orientation (in roll, pitch, yaw angles), $J_{COG,XY}$ is the Jacobian for the projection of the COG on the XY plane, and $J_{COG,Z}$ for the height of the COG. Similarly we will define the vectors of desired end-effector velocities:
$$\hat{\dot{x}}_A = \begin{bmatrix} 0 \\ \dot{x}_{swing} \\ \dot{x}_{COG,XY} \end{bmatrix}, \quad \hat{\dot{x}}_P = \begin{bmatrix} \dot{x}_{RPY} \\ \dot{x}_{COG,Z} \end{bmatrix}$$
Finally we compute desired joint velocities as
$$\dot{q}_1 = \hat{J}_A^{\#} \hat{\dot{x}}_A + \hat{N}_A \hat{J}_P^{\#} \hat{\dot{x}}_P, \quad (4.30)$$
where $\hat{N}_A = I - \hat{J}_A^{\#} \hat{J}_A$ is the null space projection matrix of $\hat{J}_A$.
4.6.1.2 Locomotion Control Method 2

Because Method 1 leaves us in danger of commanding the swing leg to a posture that conflicts with the COG (creating a kinematic singularity), we would like to move the swing leg controller into the null space of the COG controller and constraints. Defining:
$$\hat{J}_B = \begin{bmatrix} \hat{J}_{stance} \\ J_{COG,XY} \end{bmatrix}, \quad \hat{\dot{x}}_B = \begin{bmatrix} 0 \\ \dot{x}_{COG,XY} \end{bmatrix},$$
the 2nd controller will be:
$$\dot{q}_2 = \hat{J}_B^{\#} \hat{\dot{x}}_B + \hat{N}_B J_{swing}^{\#} \dot{x}_{swing} + \hat{N}_A \hat{J}_P^{\#} \hat{\dot{x}}_P. \quad (4.31)$$
Note that we project $\hat{J}_P^{\#} \hat{\dot{x}}_P$ into the null-space of $\hat{J}_A$, since this is the intersection of the $\hat{J}_B$ and $J_{swing}$ null-spaces. Thus the term $\hat{N}_A \hat{J}_P^{\#} \hat{\dot{x}}_P$ will not interfere with the constraints, COG, or swing leg. Also note that computation of $\hat{N}_A$ requires a pseudo-inverse of $\hat{J}_A$, and thus eliminates the singularity robustness we were seeking by moving the swing leg out of the top priority. To deal with this issue we will compute $\hat{N}_A$ with a damped pseudo-inverse:
$$\hat{N}_A^{\lambda} = I - \hat{J}_A^{\#\lambda} \hat{J}_A$$
Because $\hat{N}_A^{\lambda}$ does not perfectly project into the null-space of $\hat{J}_A$, we will need to premultiply with $\hat{N}_B$ to be certain the constraints and COG are not violated. Thus the final version of the Method 2 controller is:
$$\dot{q}_2 = \hat{J}_B^{\#} \hat{\dot{x}}_B + \hat{N}_B J_{swing}^{\#} \dot{x}_{swing} + \hat{N}_B \hat{N}_A^{\lambda} \hat{J}_P^{\#} \hat{\dot{x}}_P. \quad (4.32)$$
4.6.1.3 Locomotion Control Method 3

Jacobian switching problems arise because, at the moment of transfer between a 3 and 4 leg stance, $\dot{q}_1 \neq \dot{q}_2$. One question we can ask is whether it is possible to add a term to $\dot{q}_2$ such that at the moment of constraint switching:
$$\dot{q}_1 = \dot{q}_2 + \hat{N}_B \xi, \quad (4.33)$$
where $\xi$ is an arbitrary vector. The additional term will not violate the constraints or the COG trajectory (although it may interfere with the swing leg). Substituting equations (4.30) and (4.31) into (4.33), and solving for $\hat{N}_B \xi$, gives us:
$$\hat{N}_B \xi = \hat{J}_A^{\#} \hat{\dot{x}}_A - \hat{J}_B^{\#} \hat{\dot{x}}_B - \hat{N}_B J_{swing}^{\#} \dot{x}_{swing}. \quad (4.34)$$
We can multiply both sides of (4.34) by $\hat{N}_B$, and because $\hat{N}_B \hat{J}_B^{\#} = 0$ and $\hat{N}_B$ is idempotent, we have:
$$\hat{N}_B \xi = \hat{N}_B \hat{J}_A^{\#} \hat{\dot{x}}_A - \hat{N}_B J_{swing}^{\#} \dot{x}_{swing}. \quad (4.35)$$
Now adding this new term to (4.31) gives us
$$\dot{q}_3 = \hat{J}_B^{\#} \hat{\dot{x}}_B + \hat{N}_B \hat{J}_A^{\#} \hat{\dot{x}}_A + \hat{N}_A \hat{J}_P^{\#} \hat{\dot{x}}_P.$$
Again, because we wish to be robust to singularities of $\hat{J}_A$, we will use the damped pseudo-inverse:
$$\dot{q}_3 = \hat{J}_B^{\#} \hat{\dot{x}}_B + \hat{N}_B \hat{J}_A^{\#\lambda} \hat{\dot{x}}_A + \hat{N}_B \hat{N}_A^{\lambda} \hat{J}_P^{\#} \hat{\dot{x}}_P. \quad (4.36)$$
If $\lambda = 0$, then the relation of equation (4.33) holds and $\dot{q}_3 = \dot{q}_1$. Thus as long as $\hat{J}_A$ is not near a singularity, we can operate with $\lambda = 0$ and maintain continuous joint velocities and accurate swing leg tracking. As $\hat{J}_A$ approaches a singularity (manipulability becomes low), we can increase $\lambda$ to maintain robustness.
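Controller 3, eq. (4.36), can be sketched as follows. This is an illustrative NumPy sketch with hypothetical names and small generic dimensions, not the thesis's implementation:

```python
import numpy as np

def dpinv(J, lam):
    """Damped pseudoinverse J^T (J J^T + lam^2 I)^-1."""
    return J.T @ np.linalg.inv(J @ J.T + lam**2 * np.eye(J.shape[0]))

def qdot_controller3(J_B_hat, xdot_B_hat, J_A_hat, xdot_A_hat,
                     J_P_hat, xdot_P_hat, lam=0.0):
    """Sketch of eq. (4.36). With lam = 0 (and J-hat_A full row rank)
    it reduces to controller 1; raising lam trades swing-leg tracking
    accuracy for robustness near singularities of J-hat_A."""
    dof = J_A_hat.shape[1]
    JB_pinv = np.linalg.pinv(J_B_hat)
    N_B = np.eye(dof) - JB_pinv @ J_B_hat          # null space of J-hat_B
    N_A_lam = np.eye(dof) - dpinv(J_A_hat, lam) @ J_A_hat
    return (JB_pinv @ xdot_B_hat
            + N_B @ dpinv(J_A_hat, lam) @ xdot_A_hat
            + N_B @ N_A_lam @ np.linalg.pinv(J_P_hat) @ xdot_P_hat)
```

A scheduling rule such as increasing `lam` as the manipulability of $\hat{J}_A$ drops would follow the discussion above.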
4.6.2 Evaluations

4.6.2.1 Platform

We evaluate the three controllers on a dynamic simulator of the Boston Dynamics LittleDog quadruped robot (figure 4.11). The LittleDog robot is a 12 degree of freedom robot (3 per leg) with point feet contacts. We attempt a locomotion task on flat ground where a smooth, twice-differentiable COG trajectory for locomotion is generated automatically based on foot movement (Pongas, Mistry & Schaal 2007). Foot motion is planned to reach predefined targets on the floor (for moving straight ahead). Trajectories for the feet to reach targets are generated via 5th order splines (which provide minimum jerk trajectories). Desired task velocities (for input into the controllers) are generated from the trajectories $\dot{x}_d$, $x_d$, as follows:
$$\dot{x} = K_V \dot{x}_d + K_P \left( x_d - x \right) \quad (4.37)$$
The controllers compute desired joint velocities, which are then integrated to produce desired joint positions. The desired joint position and velocity set points are sent to low level joint PD controllers on the robot at a rate of 500 Hz.
Figure 4.11: LittleDog Robot Simulator
4.6.2.2 Locomotion Performance

The COG trajectory is automatically placed within the support triangle of the stance legs, and since the COG trajectory is parameterized by foot position, velocity, and acceleration, any foot movement, including slipping, will readjust the trajectory (see (Pongas et al. 2007)). Oscillations in the desired trajectories are caused by ground contacts, which are modeled in the simulator as spring mass dampers. Figure 4.12 shows the COG tracking performance for each of the three controllers during a locomotion task on flat terrain. The robot moves at a speed of one gait cycle per 6 seconds. Controller 3 uses a constant $\lambda = 0.05$ for the entire motion (although a more intelligent approach may be to vary $\lambda$). Tracking for all controllers is good enough for maintaining stability during a locomotion task.

The swing leg is guided to its target with a trajectory created by a 5th order spline. Figure 4.13 shows the swing leg tracking performance in the z direction of the left front leg for each of the three controllers during locomotion. In green is the trajectory generated by the splines for the leg to lift off and touch down, and in blue is the foot's actual trajectory.

Figure 4.12: x and y coordinates (overhead view) of the COG trajectory (in blue) for the 3 controllers plotted with the desired COG trajectory in green, during walking on flat ground. The robot moves from left to right.
4.6.2.3 Constraint Switching

Controllers 2 and 3 suffer from constraint switching discontinuities. Figure 4.14 shows the effect of constraint switching on controller 2 during the transition from the 4 foot support phase to 3 foot support. Shown in the graph are plots of the computed desired knee velocities for each of the four knees, and the discontinuity created by the switching Jacobian.

Figure 4.15 shows the same time period for Controller 3 (with $\lambda = 0.05$). Although this controller is also going through a constraint switch, the effect is nearly negligible.
Figure 4.13: Left front leg's tracking performance in the z direction over time during locomotion for each of the three controllers. Controller 2 is not able to track that well.
Figure 4.14: Desired knee velocities computed by controller 2. There are clearly discontinuities in the trajectories, caused by a sudden constraint switch.
Figure 4.15: Desired knee velocities computed by controller 3. There is a constraint switch, but this controller maintains a rather smooth trajectory through it.
4.6.2.4 Swing Leg Tracking

Since controller 1 places the swing leg in the highest level of priority, we know that swing leg control will not be distorted (provided $\hat{J}_A$ remains non-singular). In order to quantify the amount of degradation caused by placing the swing leg controller in the null space of the balance controller, we can compute the error in the $\dot{x}_{swing}$ reconstruction. For controller 2, it is:
$$e_2 = \dot{x}_{swing} - J_{swing} \hat{N}_B J_{swing}^{\#} \dot{x}_{swing} = \left( I - J_{swing} \hat{N}_B J_{swing}^{\#} \right) \dot{x}_{swing} = E_2 \, \dot{x}_{swing}$$
where $E_2$ is independent of the desired swing foot velocity. For Method 3:
$$E_3 = I - J_{swing} \hat{N}_B \hat{J}_A^{\#\lambda} S_{swing},$$
where $S_{swing}$ is used to convert $\dot{x}_{swing}$ into a higher dimensional vector:
$$S_{swing} = \begin{bmatrix} 0 \\ I \\ 0 \end{bmatrix} \quad (4.38)$$
The matrix $E$ should give some indication of the controller's performance in swing leg tracking, independently of the particular trajectory ($E = 0$ means perfect swing leg tracking). Figure 4.16 shows plots of the Frobenius norms of $E_2$ (in blue) and $E_3$ (in green) with $\lambda = 0.05$ during the swing phase of the left front leg. Note that the norms of the first controller (or the third controller with $\lambda = 0$) are zero. The norm of controller 3 is consistently less than that of controller 2, indicating that controller 3 should be better at swing leg tracking. Reducing $\lambda$ even further will likewise reduce the norm, and should improve swing leg tracking even further.
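The distortion measure $E_2$ can be evaluated numerically. A hedged NumPy sketch (the function name is hypothetical; $\hat{N}_B$ would come from the controller above):

```python
import numpy as np

def swing_distortion_E2(J_swing, N_B_hat):
    """Frobenius norm of E_2 = I - J_swing N-hat_B J_swing^#,
    the swing-leg distortion of controller 2 (0 means no distortion)."""
    k = J_swing.shape[0]
    E2 = np.eye(k) - J_swing @ N_B_hat @ np.linalg.pinv(J_swing)
    return np.linalg.norm(E2, 'fro')
```

With no higher-priority conflict ($\hat{N}_B = I$) the norm is zero; the more the projector removes swing directions, the larger it grows.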
4.6.3 Discussion

Controller 1 had the best performance for locomotion. However, in order to use this controller, foot placement must be carefully planned to avoid conflicts with the COG. Using a damped pseudo-inverse in this controller is not an option, since we would lose the ability to maintain the constraints (and thus lose COG controllability). Controller 2 no longer has these singularity issues, but as the results show, it performs poorly in swing leg tracking and suffers from discontinuities during constraint switches. Controller 3 seems to be a nice trade-off between these two approaches; however, it requires the additional tuning of the $\lambda$ parameter to achieve the best performance. We have demonstrated the feasibility of using floating base inverse kinematics with task-prioritization to address the issue of simultaneous balance and walking of a quadruped robot. A controller developed here seems to be a promising trade-off between singularity problems and constraint switching. Future work will focus on the evaluation of these controllers on more complex terrain, torque based control via inverse dynamics, and other robot platforms including humanoids.

Figure 4.16: The Frobenius norms of $E_2$ (blue) and $E_3$ (green) plotted during the swing phase of the left front leg. A smaller norm means less distortion of the desired swing leg trajectory.
92
4.7 Conclusion
In this chapter, we have discussed some relevant issues concerning inverse kinematics
algorithms for floating base structures. In particular, we have addressed how to deal with
under-actuation by sufficiently constraining the robot. We have also taken the first steps
of implementing full body, floating base, task space control on a real humanoid robot.
While the performance of the demonstrated position based controllers is limited, they
show promise and demonstrate the feasibility of the approach. Work continues on model-based
floating base inverse dynamics, and the development of robust controllers for more
complex tasks including constraint switching and locomotion.
Chapter 5
Model Based Control for Humanoid Robots
5.1 Introduction
In Chapter 1, we mentioned some key features of sensorimotor control that enable humans
to be robust, dexterous, and able to learn new skills. Some of these features include:

Stiffness/Compliance Control: Biological muscles are unique actuators in that they
can be controlled to maintain a position or load. Thus force output can be
modulated in order to tune the stiffness (or compliance) for a specific task (Burdet
et al. 2001).

Internal Models: Humans can learn models of dynamic environments, and can use
these models to predict execution forces (Shadmehr & Mussa-Ivaldi 1994, Kawato
1999), to improve control accuracy without sacrificing compliance.

Task-Space Control: Humans can control motion in an extrinsic, or end-effector oriented,
space, such as the hand or center of gravity (COG), and redundancy in the
degrees-of-freedom (DOFs) can be utilized in order to increase robustness of task
achievement (Mistry et al. 2005a), or in order to perform additional tasks at the
same time (Nakamura et al. 1987).

Discrete/Rhythmic Motion: Humans can execute discrete (e.g. reaching) or rhythmic
(e.g. walking) movements, and are capable of superimposing both types together
(Sternad, de Rugy, Pataky & Dean 2002). Rhythmic behavior can also be modulated
by a task objective. For example, walking patterns can be adjusted in real
time to account for a balance/COG correction.
Besides advancing research in the neural control of movement, understanding these
features is also of particular importance if, in the near future, robots will share
their workspace with humans, as anticipated for humanoid and assistive robots. Human
environments are dynamic, have stochastic components, and require safe operation of
robots – all these issues are largely addressed in the context of model-based compliant
control.
Additionally, because such free-moving robots are floating base systems (Chapter 4),
their control is more complex than standard manipulator control due to under-actuation,
constraints with unknown contact forces and switching contact states, and closed-loop
kinematics and dynamics. These issues make floating base control an open topic
of research – hardly any humanoid robot, so far, uses such control principles.
In the literature, both model-based and task-space control have been well studied topics
in the realm of manipulator robotics. For a recent review comparing the theoretical
and empirical aspects of several techniques, see (Nakanishi et al. 2008). Additionally, the
work of (Sentis & Khatib 2005), (Park & Khatib 2006), (Sentis & Khatib 2006), and (Khatib
et al. 2007) has extended Khatib's operational space framework to the control of floating
base humanoid robot systems in contact with their environments. The advantage of
Khatib's framework is that it computes the control forces that decouple task and null-
space dynamics. Thus secondary tasks can be achieved without interfering dynamically
with higher priority tasks. Additionally, as it is a force control method, it does not use
joint position references, and thus does not require explicit inverse kinematics computation.
However, these methods rely heavily on accurate modeling, in particular of the inertia
matrix, which is required for pseudoinverse computations.
The work of (Hyon, Hale & Cheng 2007) has focused on passivity-based approaches to
humanoid control and balance, by controlling for optimal contact forces at the feet. They
have demonstrated robust disturbance rejection on a real humanoid platform, and have
recently shown (Hyon, Morimoto & Cheng 2008) how the joint trajectories generated by
a task-space controller can bootstrap similar, but faster, motions by way of supervised
learning.
This chapter presents a novel algorithm for model-based control of humanoid
robots, via floating base inverse dynamics, that can be applied to arbitrary robots with
multiple and dynamically changing constraints. The method is a general framework for
compliant control, via feed-forward model-based inverse dynamics, within the context of
floating base rigid-body dynamics. Inverse dynamics allows for the computation of the
actuator commands required to accurately execute a desired trajectory. First, our approach
can be used to improve joint tracking accuracy, while allowing for lower feedback gains.
Second, we consider the superposition of joint-space control with operational space control,
as, for instance, needed to stabilize a joint-space locomotory pattern generator with
a balance controller. We evaluate our methods on a biped humanoid robot simulation,
as well as an actual robot dog walking over rough terrain. In both cases, our approach
allows reducing feedback gains, increases movement accuracy, and enhances compliance
and robustness towards perturbations.
5.2 Floating Base Dynamics
In this section, we derive an analytically correct solution to the ill-posed problem of
floating base inverse dynamics. We do so by first projecting the system dynamics into
a reduced dimensional space that is independent of constraint forces. The projection
is created via an orthogonal decomposition of the constraint Jacobian. Orthogonal
decomposition was first proposed for modal synthesis in (Flashner 1986) as a method of
eliminating Lagrange multipliers for linear dynamical systems with linear constraints.
Here, we extend this technique to nonlinear rigid-body dynamics with nonlinear (but
holonomic) constraints. The result allows us to take a simple pseudoinverse to compute
analytically correct torque values, without knowledge of constraint forces.
5.2.1 Derivation
The floating base framework provides the most general representation of a rigid-body
system unattached to the world, and is necessary to describe the complete dynamics of
the system with respect to an inertial frame. The system configuration is represented as:

q = \begin{bmatrix} q_r^T & x_b^T \end{bmatrix}^T   (5.1)
where q_r \in R^n is the joint configuration of the rigid body robot with n joints, and x_b \in R^6
is the position and orientation of the coordinate system attached to the robot base,
measured with respect to an inertial frame. Figure 5.1 illustrates this representation by
showing the 6 virtual degrees of freedom attached from the inertial frame to the robot base
frame.
[Figure: robot base frame connected to the inertial frame by 6 virtual DOFs]
Figure 5.1: The base frame attached to the robot is connected to the inertial frame via 6
unactuated virtual DOFs.
When the robot is in contact with the environment, the equations of motion with
respect to an inertial frame are given by:

M(q)\ddot{q} + h(q, \dot{q}) = S^T \tau + J_C^T(q) \lambda   (5.2)

with variables defined as follows:
• M(q) \in R^{(n+6)\times(n+6)}: the floating base inertia matrix
• h(q, \dot{q}) \in R^{n+6}: the floating base centripetal, Coriolis, and gravity forces
• S = \begin{bmatrix} I_{n\times n} & 0_{n\times 6} \end{bmatrix}: the actuated joint selection matrix
• \tau \in R^n: the vector of actuated joint torques
• J_C \in R^{k\times(n+6)}: the Jacobian of k constraints
• \lambda \in R^k: the vector of contact forces
The goal of inverse dynamics is to compute the joint torques that will realize a desired
joint acceleration, \ddot{q}_d, given the current state (q, \dot{q}) of the robot. In other words, we wish
to find a \tau such that \ddot{q} = \ddot{q}_d. Note that this problem is ill-posed. In order to solve for \tau
using (5.2), we need to know the contact forces, \lambda. However, because \lambda also depends on
\tau, there are potentially infinitely many solutions to this equation. Additionally, due to
under-actuation, a given \ddot{q}_d may not be realizable at all. For example, it is impossible to
accelerate the base up into the air while keeping all other joints fixed.

In order to resolve these issues, we can attempt to project the equations of motion
into actuated joint space (Sentis & Khatib 2005). If we assume that the contact points
do not move with respect to the inertial frame, i.e. \ddot{x}_C = J_C \ddot{q} + \dot{J}_C \dot{q} = 0, and that \ddot{q} is
in the null-space of J_C (joint/base motion will not violate the constraints), then contact
forces can be uniquely determined in the following way. We multiply (5.2) by J_C M^{-1},
replace J_C \ddot{q} with \ddot{x}_C - \dot{J}_C \dot{q}, and solve for \ddot{x}_C:

\ddot{x}_C = \dot{J}_C \dot{q} - J_C M^{-1} \left( h - S^T \tau \right) + J_C M^{-1} J_C^T \lambda   (5.3)

Thus the only \lambda for which \ddot{x}_C = 0 is:

\lambda = \left( J_C M^{-1} J_C^T \right)^{-1} \left( J_C M^{-1} \left( h - S^T \tau \right) - \dot{J}_C \dot{q} \right)   (5.4)
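As a concrete check of (5.3) and (5.4), the following sketch (a minimal numpy illustration with randomly generated, hypothetical dynamics quantities, not the thesis's actual robot model) computes \lambda from (5.4) and verifies that it indeed yields \ddot{x}_C = 0 in (5.3):

```python
import numpy as np

def contact_forces(M, h, J_C, S, tau, Jdot_qdot):
    """Sketch of (5.4): the unique lambda for which xdd_C = 0."""
    Minv = np.linalg.inv(M)
    JMJt = J_C @ Minv @ J_C.T
    return np.linalg.solve(JMJt, J_C @ Minv @ (h - S.T @ tau) - Jdot_qdot)

# Hypothetical system: n = 2 actuated joints, 6 base DOFs, k = 3 constraints
rng = np.random.default_rng(0)
n, k = 2, 3
A = rng.standard_normal((n + 6, n + 6))
M = A @ A.T + 10.0 * np.eye(n + 6)          # symmetric positive definite inertia
h = rng.standard_normal(n + 6)
J_C = rng.standard_normal((k, n + 6))       # full row rank (generically)
S = np.hstack([np.eye(n), np.zeros((n, 6))])
tau = rng.standard_normal(n)
Jdot_qdot = rng.standard_normal(k)          # stands in for Jdot_C @ qdot

lam = contact_forces(M, h, J_C, S, tau, Jdot_qdot)
# Evaluate (5.3): the contact-point acceleration should vanish for this lambda
xdd_C = (Jdot_qdot - J_C @ np.linalg.inv(M) @ (h - S.T @ tau)
         + J_C @ np.linalg.inv(M) @ J_C.T @ lam)
assert np.allclose(xdd_C, 0.0)
```

Note that this check only exercises the algebra of (5.3)–(5.4); on a real robot, M, h, and J_C come from the rigid-body model.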
Inserting (5.4) into (5.2) and projecting the result into actuated joint space (by
multiplying by \bar{S}^T = \left( S M^{-1} S^T \right)^{-1} S M^{-1}) results in the equations of constrained dynamics
in actuated joint space:

\left( S M^{-1} S^T \right)^{-1} \ddot{q}_r + \bar{S}^T \left( I - J_C^T \bar{J}_C^T \right) h + \bar{S}^T M \bar{J}_C \dot{J}_C \dot{q} = \bar{S}^T \left( I - J_C^T \bar{J}_C^T \right) S^T \tau   (5.5)

where \bar{J}_C^T = \left( J_C M^{-1} J_C^T \right)^{-1} J_C M^{-1}. Note that the result is independent of the constraint
forces, \lambda. If the expression \bar{S}^T \left( I - J_C^T \bar{J}_C^T \right) S^T were invertible, it could be used to compute
inverse dynamics torques. However, in general, this matrix is rank deficient and therefore
not invertible.
In (Nakanishi, Mistry & Schaal 2007) we attempted to resolve this issue by treating \lambda as
a vector of external forces. Again projecting (5.2) into actuated joint space, but replacing
\lambda with F_{ext}, we have:

\left( S M^{-1} S^T \right)^{-1} \ddot{q}_r + \bar{S}^T \left( h - J_C^T F_{ext} \right) = \tau   (5.6)

Thus inverse dynamics torques can be computed, if F_{ext} can be measured. However,
this approach is undesirable since force sensors must exist at all contact points, and
are typically noisy and delayed if filtered. Alternatively, we could estimate F_{ext} using
(5.4). In this case, we need to use \tau from the previous control time step, which will
not produce an analytically correct solution, but may be acceptable with a fast enough
control loop. Nevertheless, this approach is just an approximate solution to floating-base
inverse dynamics control, and can go unstable in adverse situations.
Because of the problems caused by working with contact forces directly, in the
following we will develop an approach that avoids them altogether and computes
the analytically correct inverse dynamics torques by working in a constrained, reduced
dimensional space, as realized by an orthogonal decomposition of the constraint Jacobian.
5.3 Orthogonal decomposition for reduced dimensionality
rigid-body dynamics
In order to derive our inverse dynamics formulation, we will make the following assumptions:

i) \dot{x}_C = \ddot{x}_C = 0 is maintained, i.e., there is sufficient friction and stiffness at the
constraint locations to prevent motion.

ii) The system is not over-constrained: the matrix J_C must remain full row rank,
i.e., every constraint can be simultaneously satisfied.

iii) The system is sufficiently constrained to eliminate under-actuation: if we divide the
constraint Jacobian into its parts (J_C = \begin{bmatrix} \partial x_C / \partial q_r & \partial x_C / \partial x_b \end{bmatrix}), the constraint
Jacobian related to base motion, \partial x_C / \partial x_b, must have a rank equal to 6.
In Chapter 4 we show that these three conditions are sufficient for full controllability of
the robot, even with the under-actuated degrees of freedom of the floating base. If conditions
i) and iii) are satisfied, the system no longer has under-actuated dynamics. If condition
ii) is additionally satisfied, the system will also be able to maintain all constraints
simultaneously.

These three conditions can typically be satisfied for most modern, high degree of
freedom humanoid robots, with one foot flat on the ground (k = 6), two feet flat on
the ground (k = 12), or even with 2 edge or 3 point contacts, provided they are not
colinear. If these three conditions are satisfied, the system will still have n+6-k degrees
of freedom remaining for useful motion. By transforming the system dynamics into this
reduced dimensional space, we end up with n+6-k "unconstrained" degrees of freedom
which do not depend on constraint forces.
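Conditions ii) and iii) are easy to check numerically. The following sketch (a hypothetical numpy helper; the assumption that the last 6 columns of J_C correspond to the base DOFs is ours) tests both rank conditions:

```python
import numpy as np

def check_constraints(J_C, n):
    """Check conditions ii) and iii) for a constraint Jacobian J_C of
    shape (k, n+6), assuming the last 6 columns correspond to base DOFs."""
    k = J_C.shape[0]
    not_over_constrained = np.linalg.matrix_rank(J_C) == k             # ii)
    base_block = J_C[:, n:]                                            # dx_C/dx_b
    sufficiently_constrained = np.linalg.matrix_rank(base_block) == 6  # iii)
    return not_over_constrained, sufficiently_constrained

# One foot flat on the ground constrains all 6 base DOFs: k = 6
rng = np.random.default_rng(1)
n = 12
J_C = np.hstack([rng.standard_normal((6, n)), np.eye(6)])
ok_ii, ok_iii = check_constraints(J_C, n)
assert ok_ii and ok_iii
```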
5.3.1 The QR decomposition
Because of condition ii), Rank(J_C) = k, and we can compute the QR decomposition of
J_C^T (see Appendix A):

J_C^T = Q \begin{bmatrix} R \\ 0 \end{bmatrix}   (5.7)

where Q is orthogonal (Q Q^T = Q^T Q = I), and R is an upper triangular matrix of rank
k. Additionally, if R is constrained to have all positive diagonal elements, Q and R are
unique.
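As a numerical illustration, the following sketch uses numpy's qr in 'complete' mode on a small, hypothetical constraint Jacobian (note that numpy does not enforce the positive-diagonal convention on R, which does not affect the properties checked here):

```python
import numpy as np

# A hypothetical constraint Jacobian for k = 2 constraints
# on a system with n + 6 = 5 total DOFs.
J_C = np.array([[1.0, 0.0, 2.0, 0.0, 1.0],
                [0.0, 1.0, 0.0, 3.0, 0.0]])
k = J_C.shape[0]

# Full QR decomposition of J_C^T: J_C^T = Q [R; 0]
Q, R_full = np.linalg.qr(J_C.T, mode='complete')
R = R_full[:k, :]

# Q is orthogonal, and the lower block of Q^T J_C^T vanishes,
# so constraint forces enter only the first k transformed equations.
assert np.allclose(Q.T @ Q, np.eye(5))
assert np.allclose((Q.T @ J_C.T)[k:, :], 0.0)
```

This vanishing lower block is exactly what allows the constraint forces to be eliminated in the derivation that follows.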
Next we multiply (5.2) by Q^T:

Q^T M \ddot{q} + Q^T h = Q^T S^T \tau + \begin{bmatrix} R \\ 0 \end{bmatrix} \lambda.   (5.8)

and define a new coordinate system p such that

\dot{q} = Q \dot{p}   (5.9)

and subsequently

\ddot{q} = Q \ddot{p} + \dot{Q} \dot{p}.   (5.10)

We then rewrite (5.8) as:

Q^T M Q \ddot{p} + Q^T M \dot{Q} \dot{p} + Q^T h = Q^T S^T \tau + \begin{bmatrix} R \\ 0 \end{bmatrix} \lambda,   (5.11)

or in more compact notation:

\hat{M} \ddot{p} + \hat{M}' \dot{p} + \hat{h} = D^T \tau + \begin{bmatrix} R \\ 0 \end{bmatrix} \lambda,   (5.12)

where

\hat{M} = Q^T M Q   (5.13)
\hat{M}' = Q^T M \dot{Q}   (5.14)
\hat{h} = Q^T h   (5.15)
D = S Q.   (5.16)
Next, we can subdivide p as follows:

p = \begin{bmatrix} p_c \\ p_u \end{bmatrix},   (5.17)

where p_c \in R^k and p_u \in R^{n+6-k}, and can expand (5.12) as:

\begin{bmatrix} \hat{M}_{cc} & \hat{M}_{cu} \\ \hat{M}_{uc} & \hat{M}_{uu} \end{bmatrix} \begin{bmatrix} \ddot{p}_c \\ \ddot{p}_u \end{bmatrix} + \begin{bmatrix} \hat{M}'_{cc} & \hat{M}'_{cu} \\ \hat{M}'_{uc} & \hat{M}'_{uu} \end{bmatrix} \begin{bmatrix} \dot{p}_c \\ \dot{p}_u \end{bmatrix} + \begin{bmatrix} \hat{h}_c \\ \hat{h}_u \end{bmatrix} = \begin{bmatrix} D_c^T \\ D_u^T \end{bmatrix} \tau + \begin{bmatrix} R \lambda \\ 0 \end{bmatrix}   (5.18)

If \dot{q} is in the null space of J_C:

J_C \dot{q} = \begin{bmatrix} R^T & 0 \end{bmatrix} \begin{bmatrix} \dot{p}_c \\ \dot{p}_u \end{bmatrix} = 0,   (5.19)

and therefore \dot{p}_c = 0. If we differentiate (5.19):

\begin{bmatrix} R^T & 0 \end{bmatrix} \begin{bmatrix} \ddot{p}_c \\ \ddot{p}_u \end{bmatrix} + \begin{bmatrix} \dot{R}^T & 0 \end{bmatrix} \begin{bmatrix} \dot{p}_c \\ \dot{p}_u \end{bmatrix} = 0.   (5.20)

Since \dot{p}_c = 0, we must also have \ddot{p}_c = 0. Finally, we can conclude that

\dot{p}_c = \ddot{p}_c = 0   (5.21)
p_c = c_0   (5.22)

where c_0 is some constant vector (it is unknown, but irrelevant; note that it is distinct
from the number of constraints k).
Because of (5.21) and (5.22), the upper and lower equations of (5.18) are decoupled,
and we can write the complete dynamics of our system without any constraint forces as:

\hat{M}_{uu} \ddot{p}_u + \hat{M}'_{uu} \dot{p}_u + \hat{h}_u = D_u^T \tau   (5.23)

Note that D_u^T \in R^{(n+6-k)\times n}, and is invertible (either directly if k = 6, or via right
pseudoinverse if k > 6), provided condition iii) is met (i.e. there are no under-actuated
dynamics). Therefore we can write our inverse dynamics equation as:

\tau = \left( D_u^T \right)^{+} \left[ \hat{M}_{uu} \ddot{p}_u + \hat{M}'_{uu} \dot{p}_u + \hat{h}_u \right]   (5.24)

where (\cdot)^{+} is the right pseudoinverse (A^{+} = A^T \left( A A^T \right)^{-1}).
Since (5.24) requires a \dot{Q} term, which may be difficult to determine analytically, we
can substitute (5.10) back into (5.24), and use

S_u = \begin{bmatrix} 0_{(n+6-k)\times k} & I_{(n+6-k)\times(n+6-k)} \end{bmatrix},   (5.25)

to write the inverse dynamics equation as:

\tau = \left( S_u Q^T S^T \right)^{+} S_u Q^T \left[ M \ddot{q}_d + h \right]   (5.26)
Note that (5.26) does not depend on the contact forces. Thus we have successfully
determined the analytically correct inverse dynamics torques that will realize the desired
joint accelerations \ddot{q}_d. This control law is surprisingly simple, as it is just a projection of
the floating-base inverse dynamics (the last term in square brackets), which can be computed
with efficient algorithms (Featherstone 1987). Also, because a right pseudoinverse is used
in (5.26), when k > 6 there are an infinite number of possible torques that can realize \ddot{q}_d.
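To make (5.26) concrete, the following numpy sketch (with randomly generated, hypothetical dynamics quantities, and assuming a zero-velocity state so that all \dot{J} terms vanish) computes \tau and then verifies that, together with the contact forces recovered from the upper block of (5.18), the original equations of motion (5.2) are satisfied:

```python
import numpy as np

def inverse_dynamics(M, h, J_C, qdd_d, n):
    """Sketch of (5.26) for n actuated joints and k constraints."""
    n6, k = M.shape[0], J_C.shape[0]
    S = np.hstack([np.eye(n), np.zeros((n, 6))])
    Su = np.hstack([np.zeros((n6 - k, k)), np.eye(n6 - k)])
    Q, _ = np.linalg.qr(J_C.T, mode='complete')   # J_C^T = Q [R; 0]
    P = Su @ Q.T                                  # projection onto unconstrained DOFs
    return np.linalg.pinv(P @ S.T) @ P @ (M @ qdd_d + h)

rng = np.random.default_rng(2)
n, k = 4, 6                                       # n + 6 = 10 total DOFs
A = rng.standard_normal((n + 6, n + 6))
M = A @ A.T + 10.0 * np.eye(n + 6)                # symmetric positive definite
h = rng.standard_normal(n + 6)
J_C = np.hstack([rng.standard_normal((k, n)), np.eye(6)])  # base block rank 6

# A desired acceleration consistent with the constraints: J_C qdd_d = 0
_, _, Vt = np.linalg.svd(J_C)
qdd_d = Vt[k:].T @ rng.standard_normal(n + 6 - k)

tau = inverse_dynamics(M, h, J_C, qdd_d, n)

# Recover lambda from the upper block (cf. (5.28)) and check (5.2)
S = np.hstack([np.eye(n), np.zeros((n, 6))])
Q, R_full = np.linalg.qr(J_C.T, mode='complete')
lam = np.linalg.solve(R_full[:k, :], (Q.T @ (M @ qdd_d + h - S.T @ tau))[:k])
assert np.allclose(M @ qdd_d + h, S.T @ tau + J_C.T @ lam)
```

The final assertion is the point of the method: the torques computed without any knowledge of \lambda are exactly the ones that make (5.2) consistent.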
By using a weighted pseudoinverse in (5.26), i.e.

A^{#} = W^{-1} A^T \left( A W^{-1} A^T \right)^{-1}   (5.27)

where W is a positive definite matrix, we may be able to reduce the torque loads on
specific joints, at the cost of increasing the total torque generated by the system (as well
as the total contact force).
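A minimal sketch of the weighted right pseudoinverse (5.27) in numpy; the right-inverse property A A^{#} = I is checked for a random wide matrix and a hypothetical non-uniform diagonal weighting:

```python
import numpy as np

def weighted_right_pinv(A, W):
    """Weighted right pseudoinverse (5.27): A# = W^-1 A^T (A W^-1 A^T)^-1."""
    Winv = np.linalg.inv(W)
    return Winv @ A.T @ np.linalg.inv(A @ Winv @ A.T)

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 7))                     # full row rank (generically)
W = np.diag([1.0, 1.0, 1e-5, 1.0, 1.0, 1.0, 1.0])  # non-uniform weighting of one DOF
A_sharp = weighted_right_pinv(A, W)
assert np.allclose(A @ A_sharp, np.eye(3))          # right-inverse property holds
```

Any positive definite W preserves the right-inverse property; W only changes how the redundancy among the columns of A is resolved.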
Also, we can still compute the contact forces, using the upper half of (5.18):

\lambda = R^{-1} S_c Q^T \left[ M \ddot{q}_d + h - S^T \tau \right]   (5.28)

where

S_c = \begin{bmatrix} I_{k\times k} & 0_{k\times(n+6-k)} \end{bmatrix}.   (5.29)

This equation is potentially more accurate than (5.4), since the modeling error-prone
inertia matrix is used only once, and in non-inverted form.
Finally, note that because Q is numerically determined (typically with Householder
transformations), it is possible that Q can exhibit discontinuities during tracking: an
arbitrarily small change in J_C may result in a large change in Q. However, as (5.26)
actually uses Q to project first into the unconstrained DOF space, and then uses Q again
to transform back into the original configuration space, it does not matter whether Q
changes from one servo cycle to the next, as the next cycle does not make use of the
Q from the previous cycle. Thus, a large jump in Q will not create a discontinuity in the
expression \left( S_u Q^T S^T \right)^{+} S_u Q^T.
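This invariance can be demonstrated numerically: any two valid orthogonal factors of the same J_C^T differ only by an orthogonal transform of the unconstrained (null-space) columns, and the torque projection in (5.26) is unaffected. A sketch with hypothetical dimensions (k = 6 constraints, n = 4 actuated joints):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 4, 6
n6 = n + 6
J_C = np.hstack([rng.standard_normal((k, n)), np.eye(6)])
S = np.hstack([np.eye(n), np.zeros((n, 6))])
Su = np.hstack([np.zeros((n6 - k, k)), np.eye(n6 - k)])

Q, _ = np.linalg.qr(J_C.T, mode='complete')

# A second valid Q: rotate the unconstrained columns by a random orthogonal U
U, _ = np.linalg.qr(rng.standard_normal((n6 - k, n6 - k)))
Q2 = Q.copy()
Q2[:, k:] = Q[:, k:] @ U

def torque_projection(Qm):
    """The matrix (Su Q^T S^T)^+ Su Q^T appearing in (5.26)."""
    return np.linalg.pinv(Su @ Qm.T @ S.T) @ (Su @ Qm.T)

# Both choices of Q yield the same projection, hence the same torques
assert np.allclose(torque_projection(Q), torque_projection(Q2))
```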
5.4 Evaluations
5.4.1 Platform
We use the SL simulated bipedal robot, modeled after the lower half of the CBi humanoid
robot (Figure 5.2), developed by the SARCOS Corporation and the ATR Computational
Neuroscience Laboratories (Cheng, Hyon, Ude, Morimoto & Hale 2008). The simulated
robot has 2 × 7 DOF legs and a 1 × 2 DOF torso, for a total of 16 DOF. The robot
has a free floating base, and is only constrained by its contacts with the floor (using a
spring-damper contact model). In the simulator, the control loop runs at a 500 Hz cycle,
as it does on the real robot.
5.4.2 Joint Tracking with Feedforward Control
In our first example, we demonstrate how floating base inverse dynamics can improve
joint tracking performance of the biped robot. We implement a feedforward controller
(Liegeois, Fournier & Aldon 1980):
\tau = InvDyn(q, \dot{q}, \ddot{q}_d) + K_P (q_{rd} - q_r) + K_D (\dot{q}_{rd} - \dot{q}_r)   (5.30)
Figure 5.2: Sarcos CBi robot and SL biped simulator
where q_{rd} is the desired joint position of only the robot links (not including the floating
base), and InvDyn() is computed via (5.26). Note that the inverse dynamics computation
requires the full desired configuration vector \ddot{q}_d, including base motion. However, if we
assume the system is sufficiently constrained (there are no under-actuated dynamics),
and motion only happens in the null-space of the constraints, there will be a one-to-one
mapping between joint and base motion, and desired base motion can be computed
analytically. If we define J_C^{\partial q_r} = \partial x_C / \partial q_r and J_C^{\partial x_b} = \partial x_C / \partial x_b (the components of the
constraint Jacobian related to joint and base motion, respectively), then:

J_C \ddot{q} = J_C^{\partial q_r} \ddot{q}_r + J_C^{\partial x_b} \ddot{x}_b + \dot{J}_C \dot{q} = 0,   (5.31)
and we can solve for the desired base acceleration as:

\ddot{x}_{bd} = - \left[ J_C^{\partial x_b} \right]^{+} \left( J_C^{\partial q_r} \ddot{q}_r + \dot{J}_C \dot{q} \right).   (5.32)
If Rank(J_C^{\partial x_b}) = 6 (assumption iii) from Section 5.3), then this equation will have a unique
solution. Otherwise, it can have an infinite number of solutions.
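A small numpy sketch of (5.32), using hypothetical constraint Jacobian blocks; the check confirms that the computed base acceleration keeps the contact points stationary, i.e. satisfies (5.31):

```python
import numpy as np

def desired_base_acceleration(J_qr, J_xb, qdd_r, Jdot_qdot):
    """Sketch of (5.32): base acceleration consistent with fixed contacts.
    J_qr = dx_C/dq_r, J_xb = dx_C/dx_b, Jdot_qdot = Jdot_C @ qdot."""
    return -np.linalg.pinv(J_xb) @ (J_qr @ qdd_r + Jdot_qdot)

rng = np.random.default_rng(5)
n, k = 12, 6
J_qr = rng.standard_normal((k, n))
J_xb = np.eye(6) + 0.1 * rng.standard_normal((6, 6))   # rank 6 (generically)
qdd_r = rng.standard_normal(n)
Jdot_qdot = rng.standard_normal(k)

xdd_b = desired_base_acceleration(J_qr, J_xb, qdd_r, Jdot_qdot)
# (5.31): J_qr qdd_r + J_xb xdd_b + Jdot_C qdot = 0
assert np.allclose(J_qr @ qdd_r + J_xb @ xdd_b + Jdot_qdot, 0.0)
```

With a rank-6 base block the pseudoinverse reduces to a plain inverse and the solution is unique, mirroring the discussion above.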
We execute a squatting/bowing motion in the sagittal plane, by commanding desired
joint angles created by combining sinusoids. In order to show that our method is capable
of operating with knees at full extension (a configuration typically considered "singular"
with other control methods), we gradually lower the desired knee offset of the sinusoids
of the knees (extending the knee joint), until it is fully at its limit. Figure 5.3 displays
some snapshots of the resulting motion. Figure 5.4 shows tracking performance of our
method (labeled "PD+FF"), compared to only high-gain PD control, for a relatively slow
execution of the desired trajectories. Tracking is significantly improved using feedforward
control, achieving RMS error values of 0.001, 0.011, and 0.005 rad for the hip, knee, and ankle
flexion/extension joints, respectively. In comparison, PD only control had 0.008, 0.076,
and 0.021 rad RMS error over the same desired motion.
Figure 5.3: Some configurations of the biped simulator during the squatting/bowing
motion.
We repeated the trajectories 5 times faster (Fig. 5.5). PD only control was not able
to keep the robot upright after 1 second. Note that in this example there is no explicit
balance controller; however, compensating for the robot's inertia via floating base inverse
[Figure: hip, knee, and ankle angles (rad) vs. time (s); curves: Desired, PD only, PD+FF]
Figure 5.4: Tracking performance of the feedforward controller and PD only controller
during a slow squatting/bowing motion. Shown are the joint angles of the left hip, knee,
and ankle flexion/extension joints.
dynamics significantly improved stability. The method was also able to maintain stability
during full knee extension (starting roughly at the 4 second mark), although this resulted
in a degradation of tracking performance due to hitting the joint range limits. The
feedforward controller achieved tracking RMS errors of 0.005, 0.109, and 0.052 rad for the
hip, knee, and ankle flexion/extension joints, respectively.
5.4.3 Weighted Distribution of Forces
Next we show how we can use a weighting term to redistribute control forces. Imagine
that the robot's left knee joint becomes damaged, and we wish to minimize the torque
output required by this actuator. From (5.27), we set the weight matrix W to identity,
except at the entry corresponding to the left knee joint, where we use 10^{-5}. We execute
a squatting motion (2 Hz sinusoids at the hip, knee, and ankle flexion/extension joints), and
[Figure: hip, knee, and ankle angles (rad) vs. time (s); curves: Desired, PD+FF]
Figure 5.5: Tracking performance of the feedforward controller during fast squatting/bowing
motion. The PD only controller was not able to maintain stability after
1 second. Notice that the knee joint reaches full extension (after roughly 4 seconds).
measure the RMS torque over 6 seconds. As shown in Fig. 5.6, by using the weighting
factor, we were able to achieve a 3.0 times reduction in left knee RMS torque, at the cost
of a 1.9, 2.4, and 1.7 times increase in left ankle, right hip, and right knee (respectively).
[Figure: bar chart of RMS torque (Nm) for left/right hip, knee, and ankle; Original vs. Weighted]
Figure 5.6: RMS torque values for a squatting task when using the original (W = I)
formulation, and a weighted formulation intended to decrease the torque required by the
left knee joint.
This redistribution of torque did not greatly impact tracking performance. Fig. 5.7
shows relatively small increases in tracking error when compared to PD only control.
[Figure: bar chart of RMS tracking error (rad) for left/right hip, knee, and ankle; Original, Weighted, PD only]
Figure 5.7: RMS tracking errors for a squatting task. The weighted formulation reduces
the torque required at the left knee joint, without significantly affecting tracking
performance.
5.4.4 Superposition of joint space and operational space control
Next we show how floating base inverse dynamics can be used to modify joint tracking
control in order to achieve an operational space objective. In this case, we execute a
periodic squatting motion in joint space, while maintaining stability using a feedforward
balance controller in operational space. We also introduce a stepping pattern which
demonstrates the balance controller's robustness to constraint switching, due to making
and breaking contacts with the feet. Additionally, we perturb the balancing system with
external forces.
We use the same feedforward controller as in (5.30); however, for the desired joint
acceleration (in the inverse dynamics computation) we use:

\ddot{q}_d = \begin{bmatrix} J_C \\ J_{b,xy} \end{bmatrix}^{+} \left( \begin{bmatrix} 0 \\ \ddot{x}_{b_d,xy} \end{bmatrix} - \begin{bmatrix} \dot{J}_C \\ \dot{J}_{b,xy} \end{bmatrix} \dot{q} \right),   (5.33)
where J_{b,xy} is the Jacobian of the x and y components of the robot's base (assuming z is
the direction of gravity), and \ddot{x}_{b_d,xy} is a desired task-space tracking trajectory for the x
and y components of the robot's base. In this case, we simply define:
\ddot{x}_{b_d,xy} = K_P \left( x_{b_d,xy} - x_{b,xy} \right) - K_D \dot{x}_{b,xy}   (5.34)
Note that we augment the task Jacobian with the constraint Jacobian J_C, and include
a null vector, 0, in order to ensure that \ddot{q}_d remains within the null space of J_C. For a
detailed discussion of this, and other issues related to floating base inverse kinematics,
please see Chapter 4. Additionally, we use the base Jacobian as an approximation to the
COG Jacobian (Sugihara & Nakamura 2002), which works well for our top-heavy biped
system. Also, we intentionally do not include the z component of the base in (5.33).
This way, we can control for the robot's balance, and the stretched knee posture does not
introduce a singularity.
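The construction in (5.33)–(5.34) can be sketched as follows (numpy, with hypothetical dimensions, gains, and randomly generated Jacobians); the assertion checks that the resulting \ddot{q}_d stays in the null space of the constraints, i.e. J_C \ddot{q}_d + \dot{J}_C \dot{q} = 0:

```python
import numpy as np

def balance_qdd(J_C, J_bxy, x_err, xdot_bxy, Jdot_qdot_C, Jdot_qdot_b,
                K_P=100.0, K_D=20.0):
    """Sketch of (5.33)-(5.34): desired acceleration from a task-space PD law
    on the base x/y position, constrained to the contact null space."""
    xdd_task = K_P * x_err - K_D * xdot_bxy            # (5.34)
    A = np.vstack([J_C, J_bxy])                        # augmented Jacobian
    rhs = np.concatenate([-Jdot_qdot_C, xdd_task - Jdot_qdot_b])
    return np.linalg.pinv(A) @ rhs                     # (5.33)

rng = np.random.default_rng(6)
n6, k = 16, 6
J_C = rng.standard_normal((k, n6))
J_bxy = rng.standard_normal((2, n6))
Jdot_qdot_C = rng.standard_normal(k)
Jdot_qdot_b = rng.standard_normal(2)
qdd_d = balance_qdd(J_C, J_bxy, rng.standard_normal(2), rng.standard_normal(2),
                    Jdot_qdot_C, Jdot_qdot_b)
# Contacts remain stationary at the acceleration level
assert np.allclose(J_C @ qdd_d + Jdot_qdot_C, 0.0)
```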
The desired base position x_{b_d} is set to alternate between the two feet, such that the
COG lies above the support polygon of a single foot, and the opposing foot can be lifted.
To lift the feet, we instantaneously change the desired joint offsets of the hip, knee, and
ankle of a single leg. Also at this time, we provide for a constraint switch by instantly
changing the constraint Jacobian (from double-support, to right foot, to double-support,
to left foot, etc.). Doing so creates discontinuities in the torque output of (5.26).
We test this controller with a 2 Hz periodic squatting motion, superimposed with a 3
second period stepping pattern (alternating right and left feet). The robot bobs up and
down while hopping from one foot to the other. Additionally, every 5 seconds, we perturb
the system with an external 200 N force at the base for 150 ms. The resulting motion during
disturbance and stepping is shown in Fig. 5.8. Tracking performance of the base as well
as the leg joints is shown in Fig. 5.9.
Figure 5.8: Reaction of the biped during a 200 N external disturbance, and a constraint
switch from the right to the left foot. Time progresses from the left frame to the right.
5.4.5 Quadruped locomotion
We also evaluated our floating base inverse dynamics controller on the LittleDog robot,
a roughly 0.3 meter long and 0.2 meter tall robot dog that walks over rough terrain
(Fig. 4.11). This robot is controlled wirelessly, over a 100 Hz control cycle. Position
and orientation of the robot are measured via a motion capture system. We determined
the rigid-body dynamics of the robot from CAD data and from parameter estimation
techniques applied to empirical data (Ting et al. 2006), such that we obtained a high confidence
dynamics model of the robot. Our planning algorithms for the robot generate a walk
pattern in joint space out of 5th order splines (Pongas et al. 2007), and also use a 5th
order spline representation for robot base motion. Both the base trajectory and the joint
space trajectories are compatible with the constraints due to ground contacts of the feet.
Figure 5.9: Tracking results of the feedforward balance controller during the stepping task.
The base (top two graphs) is controlled to sway side-to-side, to allow for the sequential
lifting of alternate feet. Discontinuities in desired joint trajectories are due to an instant
offset change to lift the feet. At t = 5 s, a 200 N external disturbance pushes the robot for
150 ms.
It should be noted that the constraints switch multiple times during one walking period,
by cycling through all the swing legs and also by having short 4-leg support phases.

We compared the improvement of joint-space tracking on a test terrain of medium
difficulty. As a comparison, we looked at a high gain PD controller which is used by all
research groups that work with LittleDog. The floating-base inverse dynamics controller
improved the tracking performance by a factor of 4, measured from the mean-squared
negative feedback command that the PD controller generated. Feedback commands could
be reduced very significantly. These results are illustrated in Fig. 5.10. The top row
shows the desired and realized trajectory of the knee position of the left-front leg of the
robot. The middle row of the plot shows the corresponding joint velocities. The bottom
row shows the total motor command (u), the feedback motor command (ufb), and the
feedforward motor command (uff) due to the inverse dynamics controller. As can be seen,
the tracking performance is excellent, and the total motor command is almost identical
to the feedforward command, indicating that little correction is needed by the PD
controller. We also noted that the floating base inverse dynamics controller was very
robust in the face of slipping feet, or even stumbling – these issues violate the assumptions
of rigid constraints that were used in the derivation of the controller. We also explored
reducing the PD gains by a factor of 4 to increase the compliance of the robot. Without the
inverse dynamics controller, i.e., pure PD control, the robot could not walk across the
terrain anymore. With the inverse dynamics controller, the robot could walk over the
rough terrain with the same overall performance as before. These results confirmed that
floating base inverse dynamics is effective in increasing the compliance of the robot
without a performance penalty.
Figure 5.10: Figure a) PD+Feedforward control: Position (top) and velocity (middle)
trajectories of the left front knee of the real LittleDog robot while walking on rough
terrain. The bottom graph shows the required feedforward and feedback motor commands
for the same joint. Relatively low feedback commands show the feedforward controller
is working effectively. Plots on the right side are magnifications at the location of the
vertical red line. Figure b) PD-only (zero feedforward commands) control at roughly
a similar phase of the locomotion cycle, magnified at the same scale as in a). As is typical
for a pure PD controller, the tracking performance shows a delay, and thus a significant
increase in tracking error.
5.5 Conclusion
This chapter addressed how to generate compliant, dexterous control in a floating base
robotic system, e.g., a humanoid robot. Compliant control requires model-based methods
to keep negative feedback gains low, but model-based control on floating base systems
is not straightforward due to changing constraints when the robot interacts with the
environment, potential closed-loop kinematics, and the fact that the exact location and
force of contacts with the environment may be unpredictable. Thus, a general inverse
dynamics control approach is needed that can flexibly accommodate constraints without
the need to derive new analytical models for every new constraint situation. For this
purpose, we have extended orthogonal decomposition to nonlinear rigid-body dynamics
with nonlinear, holonomic constraints. We use the orthogonal decomposition to reduce
the dimensionality of our robotic system to only the unconstrained degrees of freedom, which
do not depend on constraint forces. In this way, we can find a solution to an otherwise ill-posed
inverse dynamics problem. We are able to compute the analytically correct inverse dynamics
torques for a sufficiently constrained, floating base rigid-body system, such as a
humanoid robot, without the need to know the constraint forces. This is a major advantage,
since otherwise the contact forces must either be estimated, or measured with noisy
sensors. The resulting formulation is surprisingly simple, and does not require complex
formulations with inverted inertia matrices, which are prone to magnify modeling errors.
Additionally, we can use this formulation to naturally combine operational space control
with joint tracking control. For example, we can combine a central pattern generator
for locomotion with a COG controller for balance. We have demonstrated the feasibility
of our approach with an evaluation on a simulated bipedal robot, and a real quadruped
robot. We have shown how our approach improves the performance of tracking joint
trajectories significantly and, thus, can help to keep systems compliant with low feedback
gains. We have also shown that our approach can robustly handle constraint switch-
ing, extended knee postures, and balancing under disturbances. Our next step is the
implementation on a real high dimensional force-controlled humanoid robot, to perform
locomotion and manipulation tasks.
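The core of the approach admits a compact numerical sketch. The following numpy illustration is a minimal sketch of the idea only, not the dissertation's exact formulation: the function name, the selection-matrix convention, and the least-squares solve are illustrative assumptions for a generic constrained rigid-body system with dynamics M q̈ + h = Sᵀτ + Jcᵀλ.

```python
import numpy as np

def constrained_inverse_dynamics(M, h, Jc, S, qdd_des):
    """Sketch: inverse dynamics torques via orthogonal (QR) decomposition.

    Assumed constrained dynamics:  M qdd + h = S^T tau + Jc^T lambda.
    QR-decomposing Jc^T gives an orthogonal basis whose last (n - k) rows
    annihilate Jc^T, i.e. they select the part of the dynamics that is
    independent of the unknown constraint forces lambda; those rows are
    then solved for tau in a least-squares sense.
    M: (n, n) inertia matrix, h: (n,) nonlinear terms,
    Jc: (k, n) constraint Jacobian, S: (m, n) actuation selection matrix.
    """
    n, k = M.shape[0], Jc.shape[0]
    Q, _ = np.linalg.qr(Jc.T, mode="complete")  # Q is (n, n) orthogonal
    Su = Q.T[k:, :]                             # rows orthogonal to range(Jc^T)
    lhs = Su @ S.T                              # maps tau into constraint-free rows
    rhs = Su @ (M @ qdd_des + h)
    tau, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return tau
```

Any torque returned this way leaves a dynamics residual that lies entirely in the range of Jcᵀ, i.e., it can be absorbed by some set of constraint forces, which is why the constraint forces never need to be measured.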
Chapter 6
Conclusion
In this dissertation, we examined some of the computational principles underlying sen-
sorimotor control in humans. Specifically, we looked into task-space control, redundancy
resolution, internal model learning, and optimization. We conducted two studies of human
reaching: one was a study of redundancy resolution, and the other looked into what drives
the adaptation process during motor learning. We additionally addressed how to implement
such computational principles for the control of humanoid robots. We showed how to
conduct task-space, model-based control of humanoid robots, and showed how some of the
challenges inherent to floating base structures can be overcome.
6.1 Summary of Dissertation Contributions
In this dissertation, we developed:
• A new platform for exploring arm movements in a full 3D environment.
• A novel experiment for testing the desired trajectory hypothesis in humans.
• A framework for task-space control and inverse kinematics in humanoid robots.
• A practical solution to the ill-posed problem of floating base inverse dynamics, and
a framework for model-based control of humanoid robots.
6.2 Summary of Chapters
6.2.1 Chapter 2
Chapter 2 demonstrated how a complex exoskeleton robot can be employed for behav-
ioral studies of motor control. We demonstrated the usefulness of this new experimental
platform in a behavioral experiment, where human subjects were exposed to a novel dy-
namic environment in the form of joint space force fields. The experiment revealed that
movement planning of reaching movements in humans does not employ desired joint space
trajectories, but may rather be similar to inverse kinematics controllers with some form
of null space optimization. Simulated results using such an operational space controller
were shown to qualitatively match the resulting motion observed in subjects.
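The kind of controller referred to above can be sketched in a few lines. This is a generic resolved-motion-rate sketch with a hypothetical rest-posture objective; the function name, gain, and secondary objective are illustrative assumptions, not the specific controller used in the simulations.

```python
import numpy as np

def resolved_rate_step(J, x_err, q, q_rest, k_null=0.1):
    """One resolved-motion-rate step with null space optimization:
    the task-space error is tracked via the pseudoinverse, while a
    secondary gradient (here: a pull toward a rest posture) acts only
    in the null space of the task Jacobian, so it cannot disturb the task.
    """
    J_pinv = np.linalg.pinv(J)
    N = np.eye(J.shape[1]) - J_pinv @ J   # null-space projector of J
    dq_null = -k_null * (q - q_rest)      # illustrative secondary objective
    return J_pinv @ x_err + N @ dq_null
```

Because J N = 0 for a full-row-rank Jacobian, the secondary term redistributes motion among the redundant joints without changing the hand path.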
6.2.2 Chapter 3
Chapter 3 used the force field experiment paradigm to examine the adaptation process
in humans. Specifically, we asked what role perceived kinematic error plays in motor
learning. A stochastic optimal control model concerned only with target accuracy and
motor cost failed to accurately predict the adaptation result of our subjects. Rather, a
cost function that is directionally biased towards the target provided a much stronger
prediction, suggesting that the minimization of kinematic error is an important concern
for the CNS.
6.2.3 Chapter 4
Chapter 4 discussed the relevant issues concerning inverse kinematics algorithms for
floating base structures. In particular, we addressed how to deal with under-actuation
by sufficiently constraining the robot. We outlined and proved the sufficient conditions
that allow for full controllability of a task for a floating base platform. We also
developed controllers for task-space control, task prioritization, and dealing with
discontinuities created by switching constraints. We implemented the controllers for full
body control of a humanoid robot and locomotion of a simulated quadruped robot.
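The task-prioritization scheme can be sketched as the standard recursive null-space update over successively augmented Jacobians; the names and the generic structure below are illustrative, not the dissertation's implementation.

```python
import numpy as np

def prioritized_dq(jacobians, task_vels):
    """Recursive task-priority resolution (lower list index = higher
    priority): each task velocity is satisfied as well as possible in
    the null space of all higher-priority tasks, using successively
    augmented Jacobians to build the null-space projector.
    """
    n = jacobians[0].shape[1]
    dq = np.zeros(n)
    N = np.eye(n)                         # null space of tasks seen so far
    J_aug = np.zeros((0, n))
    for J, dx in zip(jacobians, task_vels):
        dq = dq + np.linalg.pinv(J @ N) @ (dx - J @ dq)
        J_aug = np.vstack([J_aug, J])
        N = np.eye(n) - np.linalg.pinv(J_aug) @ J_aug
    return dq
```

When the stacked Jacobian is full row rank, this recursion reproduces the direct pseudoinverse solution of the stacked system, which is the identity proved in Appendix B.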
6.2.4 Chapter 5
Chapter 5 addressed how to generate compliant dexterous control in a floating base
robotic system, via model-based control. We extended orthogonal decomposition to
nonlinear rigid-body dynamics with nonlinear, holonomic constraints, and used the de-
composition to reduce the dimensionality of our robotic system to only the unconstrained
degrees of freedom, which do not depend on constraint forces. In this way, we found
a practical solution to an ill-posed inverse dynamics problem, and are able to compute
the analytically correct inverse dynamics torques for floating base rigid-body systems,
without requiring contact force measurement. Additionally, we were able to use this
formulation to naturally combine operational space control with joint tracking control.
We demonstrated the feasibility of our approach with an evaluation on a simulated bipedal
robot and a real quadruped robot. We have shown how our approach significantly improves
the tracking of joint trajectories and, thus, can help to keep systems compliant with
low feedback gains. We have also shown that our approach can robustly handle constraint
switching, extended knee postures, and balancing under disturbances.
Reference List
An, C. H., Atkeson, C. & Hollerbach, J. (1988), Model-Based Control of a Robot Manipulator, MIT Press, Cambridge, MA.

Baerlocher, P. & Boulic, R. (1998), 'Task-priority formulations for the kinematic control of highly redundant articulated structures', IROS 1998 1, 323–329.

Bernstein, N. (1967), The Co-ordination and Regulation of Movements, Pergamon Press, London.

Bizzi, E., Mussa-Ivaldi, F. A. & Giszter, S. (1991), 'Computations underlying the execution of movement: a biological perspective', Science 253(5017), 287–291.

Bullock, D. & Grossberg, S. (1989), 'VITE and FLETE: neural modules for trajectory formation and postural control', Volitional Action (Hershberger WA, ed.) pp. 253–297.

Bullock, D., Grossberg, S. & Guenther, F. (1992), 'A self-organizing neural network model for redundant sensory-motor control, motor equivalence, and tool use', International Joint Conference on Neural Networks (IJCNN 1992) 4, 91–96.

Burdet, E., Osu, R., Franklin, D. W., Milner, T. E. & Kawato, M. (2001), 'The central nervous system stabilizes unstable dynamics by learning optimal impedance', Nature 414(6862), 446–9.

Caccavale, F. & Siciliano, B. (2001), 'Kinematic control of redundant free-floating robotic systems', Advanced Robotics 15(4), 429–448.

Cheng, G., Hyon, S.-H., Morimoto, J., Ude, A., Hale, J., Colvin, G., Scroggin, W. & Jacobsen, S. (2007), 'CB: a humanoid research platform for exploring neuroscience', Advanced Robotics 21(10), 1097–1114.

Cheng, G., Hyon, S., Ude, A., Morimoto, J. & Hale, J. (2008), 'CB: Exploring neuroscience with a humanoid research platform', ICRA 2008.

Chiaverini, S. (1997), 'Singularity-robust task-priority redundancy resolution for real-time kinematic control of robot manipulators', IEEE Transactions on Robotics and Automation 13(3), 398–410.

Conditt, M., Gandolfo, F. & Mussa-Ivaldi, F. A. (1997), 'The motor system does not learn the dynamics of the arm by rote memorization of past experience', J Neurophysiol 78(1), 554–560.

Dubowsky, S. & Papadopoulos, E. (1993), 'The kinematics, dynamics, and control of free-flying and free-floating space robotic systems', IEEE Transactions on Robotics and Automation 9(5), 531–543.

Featherstone, R. (1987), Robot Dynamics Algorithms, Kluwer Academic Publishers, Boston/Dordrecht/Lancaster.

Feldman, A. G. (1986), 'Once more on the equilibrium-point hypothesis (lambda model) for motor control', Journal of Motor Behavior 18(1), 17–54.

Feldman, A. G., Ostry, D. J., Levin, M. F., Gribble, P. L. & Mitnitski, A. B. (1998), 'Recent tests of the equilibrium-point hypothesis (lambda model)', Motor Control 2(3), 189–205.

Flash, T. & Hogan, N. (1985), 'The coordination of arm movements: an experimentally confirmed mathematical model', J Neurosci 5(7), 1688–703.

Flashner, H. (1986), 'An orthogonal decomposition approach to modal synthesis', International Journal for Numerical Methods in Engineering 23(3), 471–493.

Franklin, D. W., Burdet, E., Tee, K. P., Osu, R., Chew, C.-M., Milner, T. E. & Kawato, M. (2008), 'CNS learns stable, accurate, and efficient movements using a simple algorithm', J Neurosci 28(44), 11165–73.

Gandolfo, F., Mussa-Ivaldi, F. A. & Bizzi, E. (1996), 'Motor learning by field approximation', Proc Natl Acad Sci U S A 93(9), 3843–3846.

Goble, J. A., Zhang, Y., Shimansky, Y. P., Sharma, S. & Dounskaia, N. V. (2007), 'Directional biases reveal utilization of arm's biomechanical properties for optimization of motor behavior', J Neurophysiol 98(3), 1240–52.

Gomi, H. & Kawato, M. (1996), 'Equilibrium-point control hypothesis examined by measured arm stiffness during multijoint movement', Science 272(5258), 117–20.

Harris, C. M. & Wolpert, D. M. (1998), 'Signal-dependent noise determines motor planning', Nature 394(6695), 780–4.

Hyon, S.-H., Hale, J. & Cheng, G. (2007), 'Full-body compliant human-humanoid interaction: Balancing in the presence of unknown external forces', IEEE Transactions on Robotics 23(5), 884–898.

Hyon, S., Morimoto, J. & Cheng, G. (2008), 'Hierarchical motor learning and synthesis with passivity-based controller and phase oscillator', ICRA 2008.

Izawa, J., Rane, T., Donchin, O. & Shadmehr, R. (2008), 'Motor adaptation as a process of reoptimization', J Neurosci 28(11), 2883–91.

Jeannerod, M. (1988), The Neural and Behavioural Organization of Goal-Directed Movements, Clarendon Press, Oxford, UK.

Kawato, M. (1999), 'Internal models for motor control and trajectory planning', Curr Opin Neurobiol 9(6), 718–27.

Kawato, M., Furukawa, K. & Suzuki, R. (1987), 'A hierarchical neural-network model for control and learning of voluntary movement', Biological Cybernetics 57(3), 169–85.

Kawato, M. & Gomi, H. (1992), 'A computational model of four regions of the cerebellum based on feedback-error learning', Biological Cybernetics 68(2), 95–103.

Kawato, M., Maeda, Y., Uno, Y. & Suzuki, R. (1990), 'Trajectory formation of arm movement by cascade neural network model based on minimum torque-change criterion', Biological Cybernetics 62(4), 275–88.

Kawato, M. & Wolpert, D. M. (1998), 'Internal models for motor control', Novartis Found Symp 218, 291–304; discussion 304–7.

Khatib, O. (1987), 'A unified approach for motion and force control of robot manipulators: The operational space formulation', IEEE Journal of Robotics and Automation 3(1), 43–53.

Khatib, O., Sentis, L. & Park, J.-H. (2007), 'A unified framework for whole-body humanoid robot control with multiple constraints and contacts', Springer Tracts in Advanced Robotics (STAR), European Robotics Symposium (EURON), p. 10.

Lackner, J. R. & DiZio, P. (1994), 'Rapid adaptation to Coriolis force perturbations of arm trajectory', J Neurophysiol 72(1), 299–313.

Lashley, K. (1930), 'Basic neural mechanisms in behavior', Psychological Review 37, 1–24.

Latash, M. L., Scholz, J. P. & Schöner, G. (2007), 'Toward a new theory of motor synergies', Motor Control 11(3), 276–308.

Liegeois, A., Fournier, A. & Aldon, M. (1980), 'Model reference control of high-velocity industrial robots', Proc. Joint Automatic Control Conf.

Loeb, G. E., Levine, W. S. & He, J. (1990), 'Understanding sensorimotor feedback through optimal control', Cold Spring Harb Symp Quant Biol 55, 791–803.

Maciejewski, A. & Klein, C. (1985), 'Obstacle avoidance for kinematically redundant manipulators in dynamically varying environments', The International Journal of Robotics Research 4(3), 109–117.

Marani, G., Kim, J., Yuh, J. & Chung, W. K. (2003), 'Algorithmic singularities avoidance in task-priority based controller for redundant manipulators', IROS 2003 4, 3570–3574.

Milner, T. E. & Cloutier, C. (1993), 'Compensation for mechanically unstable loading in voluntary wrist movement', Exp Brain Res 94(3), 522–532.

Mistry, M., Mohajerian, P. & Schaal, S. (2005a), 'Arm movement experiments with joint space force fields using an exoskeleton robot', 9th International Conference on Rehabilitation Robotics pp. 408–413.

Mistry, M., Mohajerian, P. & Schaal, S. (2005b), 'An exoskeleton robot for human arm movement study', 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems pp. 4071–4076.

Mistry, M., Nakanishi, J. & Schaal, S. (2007), 'Task space control with prioritization for balance and locomotion', IROS 2007 pp. 331–338.

Morasso, P. (1981), 'Spatial control of arm movements', Exp Brain Res 42(2), 223–227.

Morasso, P., Baratto, L., Capra, R. & Spada, G. (1999), 'Internal models in the control of posture', Neural Netw 12(7-8), 1173–1180.

Morasso, P. & Sanguineti, V. (2002), 'Ankle muscle stiffness alone cannot stabilize balance during quiet standing', J Neurophysiol 88(4), 2157–62.

Morasso, P. & Schieppati, M. (1999), 'Can muscle stiffness alone stabilize upright standing?', J Neurophysiol 82(3), 1622–6.

Mussa-Ivaldi, F. A. & Bizzi, E. (2000), 'Motor learning through the combination of primitives', Philos Trans R Soc Lond B Biol Sci 355(1404), 1755–69.

Mussa-Ivaldi, F. A., Hogan, N. & Bizzi, E. (1985), 'Neural, mechanical, and geometric factors subserving arm posture in humans', J Neurosci 5(10), 2732–43.

Mussa-Ivaldi, F. A., Morasso, P. & Zaccaria, R. (1988), 'Kinematic networks. A distributed model for representing and regularizing motor redundancy', Biological Cybernetics 60(1), 1–16.

Nakamura, Y., Hanafusa, H. & Yoshikawa, T. (1987), 'Task-priority based redundancy control of robot manipulators', The International Journal of Robotics Research 6(2), 3–15.

Nakanishi, J., Cory, R., Mistry, M., Peters, J. & Schaal, S. (2008), 'Operational space control: A theoretical and empirical comparison', The International Journal of Robotics Research 27(6), 737–757.

Nakanishi, J., Mistry, M. & Schaal, S. (2007), 'Inverse dynamics control with floating base and constraints', ICRA 2007 pp. 1942–1947.

Ohta, K., Svinin, M. M., Luo, Z., Hosoe, S. & Laboissière, R. (2004), 'Optimal trajectory formation of constrained human arm reaching movements', Biological Cybernetics 91(1), 23–36.

Osu, R., Burdet, E., Franklin, D. W., Milner, T. E. & Kawato, M. (2003), 'Different mechanisms involved in adaptation to stable and unstable dynamics', J Neurophysiol 90(5), 3255–69.

Park, J.-H. & Khatib, O. (2006), 'Contact consistent control framework for humanoid robots', ICRA 2006 pp. 1963–1969.

Pongas, D., Mistry, M. & Schaal, S. (2007), 'A robust quadruped walking gait for traversing rough terrain', ICRA 2007 pp. 1474–1479.

Scholz, J. P. & Schöner, G. (1999), 'The uncontrolled manifold concept: identifying control variables for a functional task', Exp Brain Res 126(3), 289–306.

Sciavicco, L. & Siciliano, B. (2001), Modeling and Control of Robot Manipulators, 2nd ed., Springer.

Sentis, L. & Khatib, O. (2005), 'Control of free-floating humanoid robots through task prioritization', ICRA 2005 pp. 1718–1723.

Sentis, L. & Khatib, O. (2006), 'A whole-body control framework for humanoids operating in human environments', Proceedings of the 2006 IEEE International Conference on Robotics and Automation.

Shadmehr, R. & Mussa-Ivaldi, F. A. (1994), 'Adaptive representation of dynamics during learning of a motor task', J Neurosci 14(5 Pt 2), 3208–3224.

Siciliano, B. & Slotine, J.-J. (1991), 'A general framework for managing multiple tasks in highly redundant robotic systems', Fifth International Conference on Advanced Robotics pp. 1211–1216.

Singh, K. & Scott, S. H. (2003), 'A motor learning strategy reflects neural circuitry for limb control', Nat Neurosci 6(4), 399–403.

Soechting, J. F., Buneo, C. A., Herrmann, U. & Flanders, M. (1995), 'Moving effortlessly in three dimensions: does Donders' law apply to arm movement?', J Neurosci 15(9), 6271–80.

Sternad, D., de Rugy, A., Pataky, T. & Dean, W. J. (2002), 'Interaction of discrete and rhythmic movements over a wide range of periods', Exp Brain Res 147(2), 162–74.

Strang, G. (1988), Linear Algebra and its Applications, 3rd ed., Saunders.

Sugihara, T. (2004), Mobility Enhancement Control of Humanoid Robot based on Reaction Force Manipulation via Whole Body Motion, PhD thesis, University of Tokyo.

Sugihara, T. & Nakamura, Y. (2002), 'Whole-body cooperative balancing of humanoid robot using COG Jacobian', IROS 2002.

Takahashi, C. D., Scheidt, R. A. & Reinkensmeyer, D. J. (2001), 'Impedance control and internal model formation when reaching in a randomly varying dynamical environment', J Neurophysiol 86(2), 1047–51.

Thoroughman, K. A. & Shadmehr, R. (1999), 'Electromyographic correlates of learning an internal model of reaching movements', J Neurosci 19(19), 8573–88.

Ting, J.-A., Mistry, M., Peters, J., Schaal, S. & Nakanishi, J. (2006), 'A Bayesian approach to nonlinear parameter identification for rigid body dynamics', Robotics: Science and Systems Conference.

Todorov, E. (2004), 'Optimality principles in sensorimotor control', Nat Neurosci 7(9), 907–15.

Todorov, E. (2005), 'Stochastic optimal control and estimation methods adapted to the noise characteristics of the sensorimotor system', Neural Computation 17(5), 1084–108.

Todorov, E. & Jordan, M. I. (2002), 'Optimal feedback control as a theory of motor coordination', Nat Neurosci 5(11), 1226–35.

Umetani, Y. & Yoshida, K. (1989), 'Resolved motion rate control of space manipulators with generalized Jacobian matrix', IEEE Transactions on Robotics and Automation 5(3), 303–314.

Uno, Y., Kawato, M. & Suzuki, R. (1989), 'Formation and control of optimal trajectory in human multijoint arm movement. Minimum torque-change model', Biological Cybernetics 61(2), 89–101.

Vafa, Z. & Dubowsky, S. (1990), 'The kinematics and dynamics of space manipulators: the virtual manipulator approach', The International Journal of Robotics Research.

Vukobratovic, M. & Borovac, B. (2004), 'Zero-moment point - thirty five years of its life', IJHR 1(1), 157–173.

Vukobratovic, M. & Stepanenko, J. (1972), 'On the stability of anthropomorphic systems', Journal of Mathematical Bioscience 15, 1–37.

Wolpert, D. M., Ghahramani, Z. & Jordan, M. I. (1995), 'Are arm trajectories planned in kinematic or dynamic coordinates? An adaptation study', Exp Brain Res 103(3), 460–70.
Appendix A
Chapter 2: Additional Results
A.1 Experiment 1
Figure A.1: Experiment 1: Subject B. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (SFE, SAA, EB, in rad), for null field and force field trials.
Figure A.2: Experiment 1: Subject C. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (SFE, SAA, EB, in rad), for null field and force field trials.
Figure A.3: Experiment 1: Subject D. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (SFE, SAA, EB, in rad), for null field and force field trials.
Figure A.4: Experiment 1: Subject E. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (SFE, SAA, EB, in rad), for null field and force field trials.
A.2 Experiment 2
Figure A.5: Experiment 2: Subject A. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (HR, SAA, EB, in rad), for null field and force field trials.
Figure A.6: Experiment 2: Subject C. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (HR, SAA, EB, in rad), for null field and force field trials.
Figure A.7: Experiment 2: Subject D. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (HR, SAA, EB, in rad), for null field and force field trials.
Figure A.8: Experiment 2: Subject E. Trajectories in Cartesian space (X, Y, Z, in m) and joint space (HR, SAA, EB, in rad), for null field and force field trials.
Appendix B
Proofs of Task Prioritization Equations
B.1 Proof of Equation (4.16)
Since (4.15) is a recursive formulation, we will use induction to prove (4.16). The induction hypothesis will be:

$$
\dot{q}_i =
\begin{bmatrix} J_1 \\ J_2 \\ \vdots \\ J_i \end{bmatrix}^{\#}
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \vdots \\ \dot{x}_i \end{bmatrix}
= \hat{J}_i^{\#} \hat{\dot{x}}_i, \tag{B.1}
$$

and the base case ($i = 1$) is trivially verified ($\dot{q}_1 = J_1^{\#}\dot{x}_1$). Therefore we only need to prove the $(i+1)$th step:

$$
\dot{q}_i + \left(J_{i+1}\hat{N}_i\right)^{\#}\left(\dot{x}_{i+1} - J_{i+1}\dot{q}_i\right) =
\begin{bmatrix} J_1 \\ J_2 \\ \vdots \\ J_i \\ J_{i+1} \end{bmatrix}^{\#}
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \vdots \\ \dot{x}_i \\ \dot{x}_{i+1} \end{bmatrix} \tag{B.2}
$$

The right hand side of (B.2) can be rewritten by using the augmented Jacobian notation:

$$
\dot{q} = \begin{bmatrix} \hat{J}_i \\ J_{i+1} \end{bmatrix}^{\#}
\begin{bmatrix} \hat{\dot{x}}_i \\ \dot{x}_{i+1} \end{bmatrix} \tag{B.3}
$$

Assuming the augmented Jacobian of (B.3) is full row rank, we can expand its pseudo-inverse to get the equation:

$$
\dot{q} = \begin{bmatrix} \hat{J}_i^T & J_{i+1}^T \end{bmatrix}
\begin{bmatrix} \hat{J}_i\hat{J}_i^T & \hat{J}_i J_{i+1}^T \\ J_{i+1}\hat{J}_i^T & J_{i+1}J_{i+1}^T \end{bmatrix}^{-1}
\begin{bmatrix} \hat{\dot{x}}_i \\ \dot{x}_{i+1} \end{bmatrix}. \tag{B.4}
$$

The inversion in (B.4) can be solved using the Strassen block-wise inversion formula:

$$
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} =
\begin{bmatrix} W & X \\ Y & Z \end{bmatrix}, \tag{B.5}
$$

where

$$
\begin{aligned}
W &= A^{-1} + A^{-1}B\left(D - CA^{-1}B\right)^{-1}CA^{-1} \\
X &= -A^{-1}B\left(D - CA^{-1}B\right)^{-1} \\
Y &= -\left(D - CA^{-1}B\right)^{-1}CA^{-1} \\
Z &= \left(D - CA^{-1}B\right)^{-1}
\end{aligned}
$$

By setting $A$, $B$, $C$, and $D$ to the corresponding blocks of the inverted matrix in (B.4), we can solve for $W$, $X$, $Y$, and $Z$. First we expand $Z$:

$$
\begin{aligned}
Z &= \left(J_{i+1}J_{i+1}^T - J_{i+1}\hat{J}_i^T\left(\hat{J}_i\hat{J}_i^T\right)^{-1}\hat{J}_i J_{i+1}^T\right)^{-1} \\
&= \left(J_{i+1}J_{i+1}^T - J_{i+1}\hat{J}_i^{\#}\hat{J}_i J_{i+1}^T\right)^{-1} \\
&= \left(J_{i+1}\left(I - \hat{J}_i^{\#}\hat{J}_i\right)J_{i+1}^T\right)^{-1} \\
&= \left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}. \tag{B.6}
\end{aligned}
$$

Next we will expand the off-diagonal blocks of (B.5):

$$
Y = -\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1} J_{i+1}\hat{J}_i^T\left(\hat{J}_i\hat{J}_i^T\right)^{-1}
= -\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1} J_{i+1}\hat{J}_i^{\#}, \tag{B.7}
$$

$$
X = -\left(\hat{J}_i\hat{J}_i^T\right)^{-1}\hat{J}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}. \tag{B.8}
$$

And the final block is expanded:

$$
W = \left(\hat{J}_i\hat{J}_i^T\right)^{-1} + \left(\hat{J}_i\hat{J}_i^T\right)^{-1}\hat{J}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1} J_{i+1}\hat{J}_i^{\#} \tag{B.9}
$$

Replacing the matrix inversion in (B.4) with (B.5) results in:

$$
\begin{aligned}
\dot{q} &= \begin{bmatrix} \hat{J}_i^T & J_{i+1}^T \end{bmatrix}
\begin{bmatrix} W & X \\ Y & Z \end{bmatrix}
\begin{bmatrix} \hat{\dot{x}}_i \\ \dot{x}_{i+1} \end{bmatrix} \\
&= \left(\hat{J}_i^T W + J_{i+1}^T Y\right)\hat{\dot{x}}_i + \left(\hat{J}_i^T X + J_{i+1}^T Z\right)\dot{x}_{i+1} \\
&= P\,\hat{\dot{x}}_i + Q\,\dot{x}_{i+1} \tag{B.10}
\end{aligned}
$$

First we will expand the rightmost term of (B.10) and substitute (B.6) and (B.8):

$$
\begin{aligned}
Q &= \hat{J}_i^T X + J_{i+1}^T Z \\
&= -\hat{J}_i^T\left(\hat{J}_i\hat{J}_i^T\right)^{-1}\hat{J}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}
 + J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1} \tag{B.11} \\
&= -\hat{J}_i^{\#}\hat{J}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}
 + J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1} \\
&= \left(I - \hat{J}_i^{\#}\hat{J}_i\right)J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1} \\
&= \hat{N}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1} \\
&= \hat{N}_i^T J_{i+1}^T\left(J_{i+1}\hat{N}_i\hat{N}_i^T J_{i+1}^T\right)^{-1} \\
&= \left(J_{i+1}\hat{N}_i\right)^{\#} \tag{B.12}
\end{aligned}
$$

The last two steps are possible because $\hat{N}_i$ is symmetric and idempotent. Next we expand the left hand term of (B.10) and substitute (B.7) and (B.9):

$$
\begin{aligned}
P &= \hat{J}_i^T W + J_{i+1}^T Y \\
&= \hat{J}_i^T\left(\hat{J}_i\hat{J}_i^T\right)^{-1}
 + \hat{J}_i^T\left(\hat{J}_i\hat{J}_i^T\right)^{-1}\hat{J}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}J_{i+1}\hat{J}_i^{\#}
 - J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}J_{i+1}\hat{J}_i^{\#} \\
&= \hat{J}_i^{\#} + \hat{J}_i^{\#}\hat{J}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}J_{i+1}\hat{J}_i^{\#}
 - J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}J_{i+1}\hat{J}_i^{\#} \\
&= \hat{J}_i^{\#} - \hat{N}_i J_{i+1}^T\left(J_{i+1}\hat{N}_i J_{i+1}^T\right)^{-1}J_{i+1}\hat{J}_i^{\#} \\
&= \hat{J}_i^{\#} - \left(J_{i+1}\hat{N}_i\right)^{\#} J_{i+1}\hat{J}_i^{\#} \tag{B.13}
\end{aligned}
$$

Substituting (B.12) and (B.13) into (B.10) yields:

$$
\begin{aligned}
\dot{q} &= \hat{J}_i^{\#}\hat{\dot{x}}_i - \left(J_{i+1}\hat{N}_i\right)^{\#} J_{i+1}\hat{J}_i^{\#}\hat{\dot{x}}_i + \left(J_{i+1}\hat{N}_i\right)^{\#}\dot{x}_{i+1} \\
&= \hat{J}_i^{\#}\hat{\dot{x}}_i + \left(J_{i+1}\hat{N}_i\right)^{\#}\left(\dot{x}_{i+1} - J_{i+1}\hat{J}_i^{\#}\hat{\dot{x}}_i\right) \tag{B.14}
\end{aligned}
$$

Finally, we invoke the induction hypothesis and substitute (B.1) into (B.14):

$$
\dot{q} = \dot{q}_i + \left(J_{i+1}\hat{N}_i\right)^{\#}\left(\dot{x}_{i+1} - J_{i+1}\dot{q}_i\right).
$$
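The induction result can be spot-checked numerically. The sketch below uses arbitrary random task Jacobians (generically full row rank; the dimensions are illustrative), comparing the recursion of (4.15) against the stacked pseudoinverse of (4.16).

```python
import numpy as np

# Spot-check: the recursive prioritized solution equals the pseudoinverse
# of the augmented Jacobian applied to the stacked task velocities.
rng = np.random.default_rng(1)
n = 6
J = [rng.standard_normal((2, n)) for _ in range(3)]   # three 2-dof tasks
dx = [rng.standard_normal(2) for _ in range(3)]

dq = np.zeros(n)
N = np.eye(n)                                         # N_hat_0 = I
J_aug = np.zeros((0, n))
for Ji, dxi in zip(J, dx):
    dq = dq + np.linalg.pinv(Ji @ N) @ (dxi - Ji @ dq)
    J_aug = np.vstack([J_aug, Ji])
    N = np.eye(n) - np.linalg.pinv(J_aug) @ J_aug

dq_direct = np.linalg.pinv(np.vstack(J)) @ np.concatenate(dx)
assert np.allclose(dq, dq_direct)
```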
B.2 Proof of Equation (4.17)
The following proof is an extension of the one shown in (Marani et al. 2003). While they were able to prove the case of $n = 2$, we use induction to extend the proof to arbitrary $n$. The induction hypothesis will be:

$$
\det\left(\hat{J}_i\hat{J}_i^T\right) = \prod_{j=1}^{i}\det\left(J_j\hat{N}_{j-1}\left(J_j\hat{N}_{j-1}\right)^T\right),
$$

where we define $\hat{N}_0 = I$. The $i = 1$ step is trivially verified. We need to prove the $(i+1)$th step:

$$
\begin{aligned}
\det\left(\hat{J}_{i+1}\hat{J}_{i+1}^T\right)
&= \prod_{j=1}^{i+1}\det\left(J_j\hat{N}_{j-1}\left(J_j\hat{N}_{j-1}\right)^T\right) \\
&= \left[\prod_{j=1}^{i}\det\left(J_j\hat{N}_{j-1}\left(J_j\hat{N}_{j-1}\right)^T\right)\right]
\det\left(J_{i+1}\hat{N}_i\left(J_{i+1}\hat{N}_i\right)^T\right) \tag{B.15}
\end{aligned}
$$

Invoking the induction hypothesis on the bracketed term of (B.15):

$$
\det\left(\hat{J}_{i+1}\hat{J}_{i+1}^T\right) = \det\left(\hat{J}_i\hat{J}_i^T\right)\det\left(J_{i+1}\hat{N}_i\left(J_{i+1}\hat{N}_i\right)^T\right) \tag{B.16}
$$

If $\hat{J}_i$ is full row rank, then we can expand the last term of (B.16) by:

$$
\begin{aligned}
\det\left(J_{i+1}\hat{N}_i\left(J_{i+1}\hat{N}_i\right)^T\right)
&= \det\left(J_{i+1}\hat{N}_i J_{i+1}^T\right) \\
&= \det\left(J_{i+1}\left(I - \hat{J}_i^{\#}\hat{J}_i\right)J_{i+1}^T\right) \\
&= \det\left(J_{i+1}J_{i+1}^T - J_{i+1}\hat{J}_i^{\#}\hat{J}_i J_{i+1}^T\right) \\
&= \det\left(J_{i+1}J_{i+1}^T - J_{i+1}\hat{J}_i^T\left(\hat{J}_i\hat{J}_i^T\right)^{-1}\hat{J}_i J_{i+1}^T\right) \tag{B.17}
\end{aligned}
$$

Substituting (B.17) into (B.16) gives us

$$
\det\left(\hat{J}_{i+1}\hat{J}_{i+1}^T\right) = \det\left(\hat{J}_i\hat{J}_i^T\right)\det\left(J_{i+1}J_{i+1}^T - J_{i+1}\hat{J}_i^T\left(\hat{J}_i\hat{J}_i^T\right)^{-1}\hat{J}_i J_{i+1}^T\right) \tag{B.18}
$$

Using the following block formula for determinants:

$$
\det\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det A \cdot \det\left(D - CA^{-1}B\right), \tag{B.19}
$$

equation (B.18) becomes

$$
\det\left(\hat{J}_{i+1}\hat{J}_{i+1}^T\right)
= \det\begin{bmatrix} \hat{J}_i\hat{J}_i^T & \hat{J}_i J_{i+1}^T \\ J_{i+1}\hat{J}_i^T & J_{i+1}J_{i+1}^T \end{bmatrix}
= \det\left(\begin{bmatrix} \hat{J}_i \\ J_{i+1} \end{bmatrix}\begin{bmatrix} \hat{J}_i \\ J_{i+1} \end{bmatrix}^T\right)
= \det\left(\hat{J}_{i+1}\hat{J}_{i+1}^T\right).
$$

If $\hat{J}_i$ is not full row rank, then $\hat{J}_{i+1}$ will also not be full row rank, and:

$$
\det\left(\hat{J}_{i+1}\hat{J}_{i+1}^T\right) = 0.
$$

However, the right hand side of (B.16) will also be zero, since $\det\left(\hat{J}_i\hat{J}_i^T\right) = 0$.
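The determinant identity can likewise be spot-checked numerically; the sketch below uses arbitrary random task Jacobians (generically full row rank; dimensions are illustrative).

```python
import numpy as np

# Spot-check of (4.17): det(J_hat_i J_hat_i^T) equals the product of
# det((J_j N_hat_{j-1})(J_j N_hat_{j-1})^T) over the task hierarchy.
rng = np.random.default_rng(2)
n = 7
J = [rng.standard_normal((2, n)) for _ in range(3)]

prod = 1.0
J_aug = np.zeros((0, n))
N = np.eye(n)                                 # N_hat_0 = I
for Ji in J:
    JN = Ji @ N
    prod *= np.linalg.det(JN @ JN.T)
    J_aug = np.vstack([J_aug, Ji])
    N = np.eye(n) - np.linalg.pinv(J_aug) @ J_aug

J_hat = np.vstack(J)
assert np.isclose(np.linalg.det(J_hat @ J_hat.T), prod, rtol=1e-6)
```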
Abstract
Humans are capable of executing a wide variety of complex and dexterous motor behavior with grace and elegance. We would like to understand the computational principles underlying their sensorimotor control, both for the advancement of neuroscience and to achieve similar performance on artificial machines. I study the problem via two approaches: motor control experiments on human subjects, and the implementation of control models on anthropomorphic hardware. In one experiment, we use a novel exoskeleton platform that permits us to alter arm dynamics by applying torques to specific joints during full 3-D motion, and study how arm redundancy is utilized. Results of this experiment suggest that: 1) humans are able to learn, and later predict, novel dynamic environments, 2) humans plan motion in an extrinsic hand, or end-effector oriented, space, and 3) joint redundancy can be utilized in order to assure task achievement. In a second experiment, we attempt to create a dynamic environment that affects motor cost, but not accuracy: if a force perturbation first pushes the hand off-course in one direction and then subsequently in the opposite direction, the reaching task may still be achieved with minimal correction, but a strongly curved trajectory. Under such conditions, subjects learn to return their trajectories towards the baseline, or null field, trajectory, effectively fighting this disturbance in order to maintain a straight trajectory. These results are inconsistent with theories suggesting that perceived kinematic error plays no role in motor adaptation. I demonstrate how improved predictions can be obtained with a stochastic optimal controller using a directionally biased cost function. In the second part of my dissertation, I discuss how to obtain robust, compliant, human-like motion of humanoid robots, via task-space, model-based, feed-forward control.