Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Optimizationbased wholebody control and reactive planning for a torque controlled humanoid robot
(USC Thesis Other)
Optimizationbased wholebody control and reactive planning for a torque controlled humanoid robot
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
OPTIMIZATIONBASED WHOLEBODY CONTROL AND REACTIVE
PLANNING FOR A TORQUE CONTROLLED HUMANOID ROBOT
by
Sean Alan Mason
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
August 2018
Copyright 2018 Sean Alan Mason
Acknowledgements
While I have a great number of people to thank, I would like to begin by
thanking my advisor Stefan Schaal. Thank you Stefan for creating an incredible
work environment that fostered both creativity and collaboration. During my
Ph.D., I felt fully supported to pursue all of my curiosities and knew I could
always count on you for support. Equally important to my success, I would like
to thank my coadvisor Ludovic Righetti. Thank you for the countless meetings,
whiteboard talks, code sprints, and guidance over the course of my Ph.D. You
have an amazing ability to lead diplomatically and to take complex math and
reduce it down to physical intuition. Thank you for opening so many doors and
for always pushing me to pursue good science. I also would like to thank my Ph.D.
dissertation and qualifying exam committee: Gaurav Sukhatme, James Finley,
Nora Ayanian, and Hao Li.
Next, I would like to thank my friends and colleagues in CLMC, AMD, and
RESL for all of their help in reviewing my work, assisting with experiments, brain
storming ideas, and creating a happy and healthy workplace. While I had many
late night code sprints and paper revisions, I always knew that I was never working
alone. We commiserated in the tough times and celebrated in the happy times.
Thank you all! I would like to specifically thank Nicholas Rotella for all of his help
ii
over the years. Almost all of our projects overlapped in one way or another and it
was incredibly helpful to have you there to divide and conquer.
To my girlfriend Nathalie, I thank you so much for supporting me through
this journey and helping me grow over the years. You’ve endured my late night
rants on robotics, constant rephrasing of life’s problems as research problems, and
always knew how to make me laugh. You have been my biggest supporter and
I am blessed to have you in my life. Lastly, I would like to thank my wonderful
parents who have supported my pursuit of robotics for as long I can remember.
Thank you for your steadfast patience and love throughout this journey.
iii
Contents
Acknowledgements ii
List of Tables vi
List of Figures vii
Abstract xi
1 Introduction 1
1.1 WholeBody Control . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Online Trajectory Generation in Locomotion . . . . . . . . . . . . . 6
1.3 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Document Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 The Sarcos Humanoid Robot 10
2.1 Actuation and Sensing Overview . . . . . . . . . . . . . . . . . . . . 10
2.2 Overall Control Pipeline . . . . . . . . . . . . . . . . . . . . . . . . 12
3 Full Dynamics ContactConsistent LQR Design for Balancing 15
3.1 Floating Base Dynamics . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Contact Consistent Dynamics . . . . . . . . . . . . . . . . . . . . . 17
3.3 Linearized Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 Task Space Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.5 Balancing Experiments . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.5.1 Double Support Balancing: Impulse Rejection . . . . . . . . 25
3.5.2 Double Support Balancing: Upper Body Disturbance Motion 29
3.5.3 Single Support Balancing: Impulse Rejection . . . . . . . . . 29
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Full Dynamics ContactConsistent LQR Design for Walking 33
4.1 ZMP Based Walking Pattern Generation . . . . . . . . . . . . . . . 34
4.1.1 Zero Moment Point . . . . . . . . . . . . . . . . . . . . . . . 34
4.1.2 Control of ZMP through LIPM . . . . . . . . . . . . . . . . 35
iv
4.1.3 Finite Horizon Control of the ZMP . . . . . . . . . . . . . . 36
4.2 MultiContact Linearizations . . . . . . . . . . . . . . . . . . . . . . 38
4.3 Switching Between Multiple Linearizations . . . . . . . . . . . . . . 41
4.4 Walking: Multiple Linearizations and Contact Switching . . . . . . 43
4.4.1 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.4.2 Real Robot . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5 MPC Walking Control Using Hand Contacts 49
5.1 Linear Inverted Pendulum with an External Contact Force . . . . . 52
5.2 Shifting the ZMP Support Polygon . . . . . . . . . . . . . . . . . . 54
5.3 Optimization Formulation . . . . . . . . . . . . . . . . . . . . . . . 55
5.3.1 Stage 1: MPC of CoM Trajectory and Footstep Locations . 57
5.3.2 Stage 2: Hand Contact . . . . . . . . . . . . . . . . . . . . . 58
5.3.3 Determining the Contact Force . . . . . . . . . . . . . . . . 60
5.3.4 Determining the Contact Location . . . . . . . . . . . . . . 62
5.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4.1 Optimization Benchmarks . . . . . . . . . . . . . . . . . . . 63
5.4.2 Maximum Push Experiments . . . . . . . . . . . . . . . . . 64
5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6 Conclusion 69
6.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Reference List 72
v
List of Tables
4.1 RMSE for tracking a sidetoside ZMP trajectory with no contact
switching. The single support phase (SSP) time refers to the dwell
of the ZMP located at a single foot, where the foot would be lifted if
the robot were walking, and the double support phase (DSP) refers
tothetransitionoftheZMPfromonefoottotheother. Thenumber
in the poses column indicates the number of linearized poses used
for tracking. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.1 Solving speed for multiple contact points considering different hori
zon lengths. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
vi
List of Figures
1.1 Control pipeline for stabilizing offline trajectories. The dashed lines
in this plot indicate that the information is computed without feed
back. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Control pipeline extending Fig. 1.1. The stabilizing capability is
further increased by modifying trajectories online. . . . . . . . . . . 3
2.1 Lower body of Sarcos Humanoid with a modular upperbody mass
attached. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Comparisonoftheoldtorquetracking(top)andnewtorquetracking
(bottom). Red is the desired torque coming in as a sine wave and
blue is the measured torque. . . . . . . . . . . . . . . . . . . . . . . 14
3.1 Robot response and plot of the CoM error for a push in the lateral
plane (front). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Joint response to the push shown in Fig. 3.1 . . . . . . . . . . . . . 27
3.3 CoM error for push in frontal plane (side). . . . . . . . . . . . . . . 27
vii
3.4 Simulation results from pushing experiments on the full humanoid
model. The top and bottom animations show the resulting motion
from a 150 N push for 0.1s on the torso from the back and front
respectively. The robot operating under a PD controller using the
diagonal of the LQR gain matrix was not able to balance provided
the same push. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.5 The LQR controller was used to balance the lower body of the robot
without knowing about the motion of the upper body. The graphs
show the error in the base position for both a slow (0.2 Hz) and fast
motion (0.8 Hz). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.6 Snapshots from single support balancing task and the tracking error
of the CoM. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.7 Tracking error of the CoM and swing leg after an external impulse.
Theimpulsewas8.2Nswithapeakforceof488N.Asimilartestwas
conducted for a push located at the base of the robot, resulting in a
impulse of 5.5 Ns and peak force of 246 N. This value is very close
to the max impulse of 5.8 Ns measured when using a hierarchical
QP controller on the same robot (Herzog et al., 2014). . . . . . . . 32
4.1 Diagram of a sample pipeline used to generate ZMP based walking
trajectories using the LIP. . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Joint position portion of gain matrices produced from key poses and
constraint conditions. The outer matrices are created for single leg
balancing and the center matrix is double support balancing. . . . . 40
4.3 Postures that were chosen for linearization along with the Frobenius
normkK
n
−K
1
k of the gain differences with respect to the first pose
(base centered over feet). . . . . . . . . . . . . . . . . . . . . . . . . 41
viii
4.4 Blockdiagramdepictingthecontrolarchitectureusedinthewalking
experiments. θ is the joint and base state, τ
ff
is the feedforward
torque, andR
c
,L
c
denote the timevarying contact trajectories for
the feet. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.5 CoP and footstep location tracking for walking using the LQR con
troller in simulation. . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.7 CoP, CoM, and foot locations in the sidetoside direction through
out the experiment from the walking experiment on the real robot
using multiple LQR linearizations and contact switching. . . . . . . 46
4.6 Snapshots from the walking experiment on the real robot using mul
tiple LQR linearizations and contact switching. . . . . . . . . . . . 46
5.1 Visual of a push experiment setup mid push. The spheres located
on the wall mark available contact points, in which green spheres
specify those considered reachable. . . . . . . . . . . . . . . . . . . 51
5.2 When the conservative ZMP bounds are violated the robot seeks
to create a contact force among the available options such that the
"Shifted ZMP Support Polygon", Eq. (5.8) is centered at the current
ZMP location. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.3 In the digram the block remains static because the contact force f
c
remains inside of the friction cone. . . . . . . . . . . . . . . . . . . 59
5.4 The conservative linear approximation for a friction cone, referred
to as a friction pyramid. . . . . . . . . . . . . . . . . . . . . . . . . 59
5.5 The left shows the solution space when we solve for the exact ZMP
shift (QP1) and the right for when we formulate the optimization
as QP2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
ix
5.6 Maximum impulse at different times in the walking gait that the
robot was able to recover from for zero, one, and two step recoveries
with and without hand contacts. . . . . . . . . . . . . . . . . . . . . 66
5.7 Top: External impulse applied to the robot (10 Ns) and the force
tracking of the hand. Bottom: ZMP tracking and CoM trajectory
as a result of the push. Despite imperfect force tracking, the robot
is still able to recover from the impulse. . . . . . . . . . . . . . . . . 67
x
Abstract
In this work, we explore computationally lightweight optimizationbased meth
ods for planning and control of a humanoid robot. In doing so, we focus on
the important role of contact for the task of balancing and walking for a bipedal
system. This thesis puts a strong focus on real robot implementations and will
address questions that come with working with real complex systems. We begin by
looking into wholebody control techniques fortorquecontrolledhumanoids. While
manyresearchgroupshaveconvergedtousingQuadraticProgramming(QP)based
inverse dynamics techniques, we explore to what extent computationally simpler
optimal control strategies work well on real systems. We show how linearizations of
the full dynamics can be projected into contact consistent constraints to compute
timeinvariant optimal feedback gains. We then show how high level task goals
can be mapped into this framework. We evaluate this computationally lightweight
control policy in a variety balancing tasks and show that for significantly less
computational effort, disturbance rejection competitive with the state of the art
methods can be achieved on the real robot. We then extend this work to use lin
earizations around different contact configurations and postures to stabilize more
complicated motions that involve changing contacts such as walking. We conclude
by developing an online walking pattern generator that generates hand contact
plans when adjusting the center of mass (CoM) and footstep location alone is not
xi
sufficient. The lightweight formulation presented allows for hand contact locations
and force trajectories to be determined in realtime without predetermining the
specific contact location or timing.
xii
Chapter 1
Introduction
Humans have long envisioned a world in which robots are capable of assisting
in a variety of tasks, from disaster recovery, to household chores, to medical care.
Expanding beyond the conventional idea of a machine being highly specialized for
a specific task, general purpose robots have been designed with the intention of
being programmable to a much larger range of tasks. There has been considerable
progress in the automation of manufacturing tasks using industrial robots, where
highly articulated robotic manipulators have been used for tasks such as pickand
place, welding, assembly, painting, and machining. Such platforms, however, are
limited to the highly predictable industrial settings for which they were designed.
In contrast, humanoid robots can serve a more general purpose. Their human form
factor allow these robots to take advantage of environments and tools that have
been specifically designed for humans. We envision these robots assisting humans
in environments such as homes, work places, hospitals, or places too dangerous
for humans. Though actuation and sensing technologies have advanced signifi
cantly, robots perform very poorly compared to their human counterparts at what
we consider everyday tasks such as walking through a variety of environments and
interacting with objects. Humanoid robots offer a unique combination of the capa
bilities of other types of robots, including locomotion, manipulation, vision, and
pathplanning. In particular, humanoid locomotion is challenging for a number of
reasons. Because the robot is underactuated, the system is not controllable from
a classical controls perspective. To fully control the system, the robot must create
1
contact forces through interactions with the environment. This switching of con
tacts leads to hybrid dynamics which furthers the complexity. Bipedal locomotion
is challenging at a theoretical level because the problem is high dimensional, heav
ily constrained, and must be solved at time scales fast enough to prevent falling.
Beyond theory, the problem is further complicated by the difficulties involved in
working with real hardware rather than idealized simulations. Even the most high
performancesystemssufferfromnoisysensors, unmodelednonlineardynamics(e.g.
wires, hoses, stiction, compliance), imperfect state estimation, and communication
delays. Despite considerable hardwarerelated difficulties, recently much progress
has been made on real robots through the use of optimization techniques for both
control and planning. Because the goal for robotics is to develop methods that will
work on real systems, it is important for methods to be designed for and tested on
real systems.
Forbalancingandwalkingtasks, therearetwomaincontrolloopsforstabilizing
a humanoid. The first control loop is responsible for tracking trajectories (e.g. in
task space or joint space) and outputting joint torques to the robot, Fig. 1.1. This
controller is responsible for coordinating motion among all of the joints to achieve
goals specified in task space and it is commonly referred to as a wholebody control.
High Level
Planning
Trajectory
Generation
Trajectory
Tracking
Contact Sequence(s)
Com / Ende ffector
Trajectories
Robot
Joint T orques
Environment
Sensing
Figure 1.1: Control pipeline for stabilizing offline trajectories. The dashed lines in
this plot indicate that the information is computed without feedback.
While this class of control can be sufficient for balancing and stabilizing admis
sible trajectories, it is generally insufficient to prevent the robot from falling over
in the case of large disturbances, modeling error, or changes in the environment.
2
Duringbalancingandlocomotion, thereisalimitonthemagnitudeofdisturbances
whichcanberejectedbythetrajectorytrackingcontroller(eithermodelbasedcon
trol or simply joint space feedback control). At this point, this lowerlevel control
loop is insufficient to stabilize the system, meaning that the trajectories themselves
must be refined, Fig. 1.2.
High Level
Planning
Trajectory
Generation
Trajectory
Tracking
Contact Sequence(s)
Com / Ende ffector
Trajectories
Robot
Joint T orques
Environment
Sensing
Figure 1.2: Control pipeline extending Fig. 1.1. The stabilizing capability is
further increased by modifying trajectories online.
By modifying the CoM trajectory online, researchers have shown that robots
can reject much larger disturbances without changing the foot step locations
(Wieber, 2006). By additionally modifying the footstep locations, the ability to
not fall over further increases (Koolen et al., 2012). In this work, I address con
trol at both levels discussed: wholebody control and online trajectory generation.
Throughthesetools, thegoalistocreateandstabilizemotionthatallowstherobot
to balance, locomote, and move dynamically over different terrains. In particular,
this thesis has a focus on applying such methods on real systems and confronting
the unique challenges that real systems impose.
1.1 WholeBody Control
Earlyon,humanoidrobotswerealmostexclusivelypositioncontrolled. Walking
was demonstrated on fully actuated electric bipedal systems such as Asimo by
Honda (Hirai et al., 1998), the HRP robots by Kawada Industries (Kaneko et al.,
2004; Akachi et al., 2005) and HUBO by KAIST (Park et al., 2005). By precisely
3
modeling the environment and kinematics of the robot, joint trajectories can be
generated from simple models and inverse kinematics, and then tracked using high
gain joint position control. Given well tuned gains and precise sensing, position
control schemes can be used to track trajectories very accurately; however, because
they lack model information, and thus a feedforward component, they are usually
quite stiff. This stiffness makes the robots more vulnerable to unexpected impacts
and external forces. Additionally, the lack of compliance in the system can become
a safety concern when working around humans.
More recently, human sized torque controlled humanoids have been developed
such as CB by Sarcos (Cheng et al., 2008), Toro by DLR (Ott et al., 2011), and
Atlas by Boston Dynamics (Feng et al., 2015a). Torque controlled robots can deal
with forces directly and typically require lower gains to track joint trajectories
because they compensate for the dynamics of the robot in a feedforward manner.
In general, this is accomplished through a model based control scheme such as
inverse dynamics (Righetti et al., 2011; Stephens & Atkeson, 2010; Herzog et al.,
2014), passivitybased methods (Hyon et al., 2007; Ott et al., 2011), or virtual
model control (Pratt et al., 1997). In the inverse dynamics approach, a dynamic
model of the robot is used to compute feedforward torques that correspond to the
desired motion. Feedback control is then added, either in joint space or task space,
to account for modeling error and disturbances usually using a PID controller.
Passivity methods have also been successfully implemented on humanoid robots.
They have the advantage that they do not require a precise model of the robot
to compute admissible contact forces and torque commands under a quasistatic
assumption.
Within recent years, there has been a convergence among many researchers
working with torque controlled humanoid platforms to formulate the inverse
4
dynamics problem as a Quadratic Program (QP) (Koolen et al., 2013; Tedrake
& others, 2014; Feng et al., 2015a; Stephens & Atkeson, 2010; Herzog et al., 2014;
Hopkins et al., 2015). In the QP formulation, joint torques, joint accelerations, and
contact wrenches are simultaneously optimized to minimize a quadratic cost func
tion subject to linear equality and inequality constraints. While classical optimal
control theory also optimizes a quadratic cost function, as in a Linear Quadratic
Regulator (LQR), the addition of linear constraints allows the optimizer to model
a variety of limitations previously unexpressed. One major difference, is that LQR
usually considers control over a horizon while using a QP for inverse dynamics
typically only optimizes for the next best state. For humanoid robots, this can
include joint acceleration limits, torque limits, and contact force limits. The prob
lem can further be organized into hierarchies to solve wholebody control problems
according to a set priority in goals such that tasks of higher priority will always
be achieved first (Kanoun et al., 2011; Herzog et al., 2016). The benefit of using
a strict hierarchy is that it provides guarantees on the order in which potentially
conflicting tasks are resolved. For example, when balancing on one foot, it may be
reasonable to specify that not falling over is strictly more important than tracking
of the swing leg trajectory. While the same priority could roughly be achieved
though the tuning of individual cost functions for different tasks, hierarchies pro
vide better numerical stability when drastically different weights would be required
in the cost function. Unfortunately, along with the growing flexibility of these
methods comes added computational overhead, complexity in tuning, and a lack
of theoretical disturbance rejection metrics (such as the gain and phase margin of
classical controls). It is unclear how much complexity is really required to create
controllers which exhibit good performance on real systems.
5
1.2 OnlineTrajectoryGenerationinLocomotion
The control techniques presented so far are used to track trajectories; however,
they do not reason about the future system behavior. In addition to providing
feedback to stabilize a trajectory, the trajectories themselves can be modified to
further stabilize the robot. Online trajectory generation methods for balancing
and walking are generally formulated as combinations of the following techniques:
modifying the CoM trajectory, modifying end effector trajectories, and modifying
the timing of contact transitions. In the seminal work by (Kajita et al., 2003b),
CoM trajectories are computed by solving a infinite horizon LQR problem in dis
crete time using a simplified linear model, the Linear Inverted Pendulum (LIPM).
This problem was later restructured in (Wieber, 2006) as a receding horizon LQR
problem and solved using a QP. By structuring the problem as a QP, the problem
wasgeneralizedto acceptinequalityconstraintssuch asstabilitybounds andreach
ability limits. By adding constraints, the optimization problem no longer provides
feedback gains. Instead, feedback is generated through the repeated solution of a
Model Predictive Control (MPC) problem. In the MPC framework, each optimiza
tion results in a solution of optimal control inputs over a horizon; however, only
the first input is applied. At the next time step, the optimization is once again
solved using the the current measured state of the system as the new initial state.
ResearchershaveextendedthisQPbasedMPCframeworktoadditionallyoptimize
footstep adjustment (Herdt et al., 2010), long term balance constraints (Sherikov
et al., 2014), and CoM height trajectories (Brasseur et al., 2015). Additionally, the
dynamics of the LIPM can be seperated into unstable and stable components. By
focusing control efforts on only the unstable components, known as the Divergent
Component of Motion (DCM) or Instantaneous Capture Point (ICP), researchers
6
have developed online methods that modify the DCM trajectories and foothold
locations (Englsberger et al., 2014; Hopkins et al., 2015; Pratt et al., 2012).
Whentryingtoadjuststeptimingasanoptimizationparameter,thiscommonly
resultsinnonlinearoptimizationproblem, (Aftabetal.,2012;Kryczkaetal.,2015).
To avoid solving a nonlinear optimization, which is computationally expensive and
does not does not guarantee convergence to a global minimum, (Khadiv et al.,
2016) proposed solving the optimization problem as a QP in two stages. In the
first stage, nominal values for the step length, step time and DCM offset are
calculate based on kinematic limits and a desired walking speed. In the second
stage, these variables are reoptimized using measurements from the current robot
state. Another method to avoid nonlinear optimization is to structure the problem
as a Mixed Integer Quadratic Program (MIQP). MIQP extend quadratic programs
by allowing the inclusion of integer variables. By using binary variables, boolean
logiccanbeusedfordecisionmakingandnonlinearfunctionscanbeapproximated
aspiecewiselinearfunctions. In(Maximoetal.,2016)theonlineCoMandfootstep
planner of (Herdt et al., 2010) is extended to automatically choose the single
support timings for walking. In all of the methods discussed so far, the step
cadence, (i.e. left foot, right foot, left foot, etc.) remained fixed. (Ibanez et al.,
2014) proposed using a MIQP to additionally optimizing over the contact phase
sequencing, allowing the optimizer to create gaits that did not adhere to a strict
cadence. While online trajectory generation has been successfully applied to many
problems, online methods for balancing and walking that consider multiplane
contacts, both foot and hand contacts, is less explored. One reason for this is
because the control point, Zero Moment Point (ZMP), of the simplified models
typically used, LIPM, is defined only for flat ground planes.
7
1.3 Summary of Contributions
In this thesis, we propose contributions in the areas of Trajectory Tracking and
Trajectory Generation, emphasizing the importance of contact, low computational
cost, and feasibility for real robotic systems. We begin by experimentally exploring
using a computational lightweight linear optimal control policy that utilizes the
full dynamic model together with contact information on a torque controlled robot
for balancing. This method is then extended by taking multiple linearizations
about different poses and contact conditions and used to stabilize more complex
motions such as walking. We then derive an online walking controller that utilizes
hand contacts with the environment to stabilize the robot when CoM and footstep
adjustment alone are not sufficient. The lightweight formulation presented allows
for hand contact locations and force trajectories to be determined in realtime
without predetermining the specific contact location or timing.
1.4 Document Outline
This document is outlined as follows:
• Chapter 2 introduces the Sarcos humanoid used in experiments and describes
the low level torque control framework.
• Chapter 3 derives a contactconsistent infinite timehorizon Linear Quadratic
Regulator using the full robot dynamics. Next, we show how cost functions
in the optimization can be manipulated to track task space quantities such as
CoM motion and momentum. Experiments are shown on a real robot using
this controller formulation for balancing in different scenarios (Mason et al.,
2014).
8
• Chapter 4 extends the previous chapter by investigating the use of multi
ple linearizations for multicontact scenarios. Offline walking trajectories are
generated through preview control of the Linear Inverted Pendulum Model
(LIPM).Walkingexperimentsarethenshownonarealrobotusingthedevel
oped control architecture (Mason et al., 2016).
• Chapter 5 derives the dynamics for the LIPM in the presence of an exter
nal contact force created by a the hand of a humanoid. This is then used
in a two stage optimization to simultaneously plan footstep locations, the
CoM trajectory, and hand contact trajectories to utilize the environment for
additional stability (Mason et al., 2018).
• Chapter 6 concludes the thesis and proposes future work on developing a
reactive CoM and footstep planner which can be used for traversing of rough
terrain.
9
Chapter 2
The Sarcos Humanoid Robot
In this work, experiments are conducted both in simulation and on real hard
ware. The lower body of the Sarcos humanoid “Hermes” used extensively in the
experiments presented is described in this chapter. Additionally, an overview of
the lowlevel control is provided.
2.1 Actuation and Sensing Overview
Hermes is a torque controlled hydraulic robot with 17 degrees of freedom (seven
per leg and three at the torso). Actuation is provided by linear pistons designed
and manufactured by Sarcos. The linear motion from the pistons is converted to
rotational motion through threebar, fourbar, or bell crank linkages. The motion
of the pistons is controlled by hydraulic servo valves (Moog Series 30 flow control
servo valves) that regulate the flow of oil both in and out of the hydraulic piston.
The operating pressure of 3000 psi is supplied by an offboard pump. The servo
valves receive flow rate commands from the onboard General Degree of Freedom
Cards (i.e. GDC cards) at a rate of 5000 Hz, which in turn receive commands
from the offboard computer at 1000 Hz. While it is common for position controlled
robots can be controlled at much lower frequencies (100400 Hz), a high bandwidth
controller is necessary to control the torque because the valve and piston dynamics
are much faster than the robot dynamics.
10
Figure 2.1: Lower body of Sarcos Humanoid with a modular upperbody mass
attached.
Hermes contains sixaxis force/torque sensors located within each of the feet.
These sensors are used primarily for estimation and measuring the Center of Pres
sure (CoP). Potentiometers are located at each joint and provide the rotational
position. Joint velocities are calculated by differentiating the potentiometer sig
nals. Both signals are filtered independently using a secondorder Butterworth
filters having nominal cutoff frequencies of five percent of the Nyquist frequency
(five percent of 500Hz or 25Hz). The potentiometers themselves are not perfectly
linear and subject to considerable noise, especially as the mechanical structure
of the potentiometer degrades over time. This is problematic because noisy joint
11
velocities can create undesirable jointlevel behavior such as vibration and oscil
lation. While filtering can help smooth out the noisy velocities signals, it also
induces delays which can completely destabilize the joint control loops. The noise
levels observed in the velocity signals, combined with stiction and backlash, create
a particularly difficult control problem when using heavily modelbased methods.
Additionally, the robot has an IMU located at the base, used for floating base pose
estimation, and accelerometers located in each foot.
2.2 Overall Control Pipeline
The goal of the lowlevel torque control pipeline is to mimic a pure torque
source (i.e. track a reference torque with no delay so that higherlevel controllers
can assume perfect execution of commands). The high bandwidth of hydraulics
allows one to get very close to this decoupling assumption. For early experiments
(i.e. Chap. 3) the lowlevel torque control was accomplished through an offboard
PID controller at 1000 Hz with a feedforward velocity term to compensate for the
moving piston, similar to (Boaventura et al., 2012). Eq. 2.1 describes this feedback
policy:
τ =PID(F,F
des
) +k
vc
˙ q +b, (2.1)
whereτ is torque,PID(F,F
des
) is a PID controller on force,F is the force,k
vc
is a velocity feedback gain, ˙ q is the joint velocity, and b is a bias. The resultant
torqueateachjointismeasuredthroughlinearforcesensorslocatedonthelinkages
connected to the linear pistons. From the kinematics of the specific joint linkage,
the torque can be calculated from the measured force.
In later experiments (i.e. Chapter 4) it was noticed that there were many
stability issues when trying to tune a stabilizing wholebody LQR control policy.
12
Despite many tuning strategies, we were unable to find gains that remained stable
when pushed while balancing on one leg. Properly damping the system was one of
the biggest limitations to obtain robust behaviors on the robot. LQR controllers
tend to create aggressive feedback on joint velocity which can create issues on
the robot due to excessive noise in the velocity signals. While this problem does
not exist in simulations, it becomes quickly visible on the real robot. To address
this, a secondary lowlevel control policy was explored where the torque control
feedback loop was placed on the onboard GDC cards operating at 5 kHz. Velocity
compensation was removed, and torque feedback gains were lowered. Eq. 2.2
describes this new feedback policy.
τ =PID(F,F
des
) +b (2.2)
While torque tracking bandwidth was slightly reduced for faster tracking, this
allowed for higher feedback gains in the LQR controller and significantly improved
the balancing performance, Fig. 2.2. In addition, the higher control frequency
allow for a very similar performance in zero torque tracking tasks compared to the
previous torque control architecture without the need for velocity compensation.
Since the velocity signals for this robot were particularly noisy, we believe adding
a velocity compensation term at the lowlevel should be avoided unless necessary.
The tradeoff between force tracking performance and higher level control perfor
mance and the role of velocity compensation is extensively discussed in Focchi et.
al (Focchi et al., 2016).
13
Figure 2.2: Comparison of the old torque tracking (top) and new torque tracking
(bottom). Red is the desired torque coming in as a sine wave and blue is the
measured torque.
14
Chapter 3
Full Dynamics ContactConsistent
LQR Design for Balancing
The dynamics of a humanoid robot reveal a highly coupled nonlinear multiple
input multiple output (MIMO) system. It is shown in classical controls that the
optimal solution to the MIMO regulator problem for a linear system is provided
by LQR controllers. In addition, it is wellknown that LQR controllers provide
very good gain and phase margins and by extension a certain tolerance to non
linearities (Anderson & Moore, 1990). A control framework that uses optimal
feedback gains would provide the same compliance observed in approaches that
use inverse dynamics. Furthermore, it should provide equal or greater tracking
and disturbance rejection capabilities than a PID controller with even lower gains.
This advantage comes from the fact that the controller explicitly takes into account
the coupling between the different joints to create optimal feedback controllers for
wholebody coordination. While an LQR controller works well in theory and sim
ulations, no evidence exists showing whether or not it can provide good stability
margins on a real humanoid robot (a high dimensional, underactuated, floating
base system) in the presence of realworld conditions (model uncertainty, sensor
noise, imperfect state estimation, ground contact, compliance, etc.). In this chap
ter, I will derive a contactconsistent infinitetime horizon LQR controller using
the full dynamics of the robot. I then experimentally evaluate this controller in a
variety of different tests: impulse rejection in double support, impulse rejection in
15
single support, and double support balancing in the presence of unmodeled upper
body motion.
3.1 Floating Base Dynamics
Consider a floatingbase robot with n DoFs, six of which correspond to the
position and orientation of the base with respect to an inertial frame and the
rest composed of the joint space of the robot. To avoid singularities, the base
orientation is represented as a unit quaternion. The equations of motion for this
system can be written as
M(θ)
¨
θ +C(θ,
˙
θ)
˙
θ +g(θ) =S
T
τ +J
T
c
(θ)f
c
(3.1)
The variables in Eq. 3.1 are defined as
θ∈R
n−6
×SE(3) : generalized coordinates
M(θ)∈R
n×n
: inertia matrix
C(θ,
˙
θ)∈R
n×n
: centrifugal and Coriolis forces
g(θ)∈R
n
: gravitational force
τ∈R
n−6
: active joint torques
J
c
(θ)∈R
m×n
: contact Jacobian matrix
f
c
∈R
m
: contact forces
WhereS
T
∈R
n×n−6
converts active joint torques to generalized forces and has
the form
S
T
=
I
(n−6)×(n−6)
0
6×(n−6)
. (3.2)
16
3.2 Contact Consistent Dynamics
A fully actuated bipedal platform that possesses actuation at each joint is able
to fully control its configuration, but cannot control the floating base. In order to
do so, the robot must interact with the environment to create additional forces.
Contact with the environment can be represented as a constraint on the overall
system that projects the dynamics into a lower dimensional subsystem. In order
for this constraint to be valid, the assumption that the foot does not move relative
to the environment must remain true. Practically, this means that there is no
slipping or tilting and thus the reaction forces on the foot remain with the local
friction cone. This nonslip constraint can be written as follows:
J
c
˙
θ = 0 (3.3)
and by taking the time derivative
J
c
¨
θ +
˙
J
c
˙
θ = 0. (3.4)
This constraint can be written for any number of end effectors. In the case of
walking, the number of feet in contact with the ground changes and such the con
straint would also change during relinearization. The constraints can be included
into the dynamics by:
1. Solving the constraint equation, Eq. 3.4, for the accelerations
2. Plug this into the equations of motion and solve for the reaction forces.
3. Plug the contactconsistent reaction forces into the original equations of
motion.
17
This results in:
¨
θ = Φ(θ)S
T
τ +φ(θ,
˙
θ) (3.5)
where
Φ =M
−1
−M
−1
J
T
c
(J
c
M
−1
J
T
c
)
−1
J
c
M
−1
=
I− (J
+
c
)
M
J
c
!
M
−1
(3.6)
φ =M
−1
J
T
c
(J
c
M
−1
J
T
c
)
−1
˙
J
c
˙
θ− Φ(C +g) (3.7)
and (J
+
c
)
M
is the inertia weighted pseudoinverse. Equation (3.6) shows that
Φ is nothing more than the inverse of the mass inertia matrix projected into the
null space of the end effector constraints. While an infinite number of valid gener
alized matrix inverses exist, the inertia weighted pseudoinverse is used because it
preserves the kinetic energy of the system (Khatib, 1987). For a more indepth dis
cussion on the properties of constraint projected dynamics refer to (Aghili, 2005).
3.3 Linearized Dynamics
In the general formulation, Eq. 3.5 is linearized around the nominal pose
(θ
T
0
,
˙
θ
T
0
)
T
, withτ
0
defined as the joint torques that realize the desired posture and
velocity. In the case of balancing, where
˙
θ
T
0
= 0, then τ
0
is equivalent to gravity
compensation. Linearization results in
δ
¨
θ = Γδθ + Λδ
˙
θ + Φ(θ
0
)S
T
δτ (3.8)
18
where
Γ =
∂Φ(θ
0
)S
T
τ
0
∂θ
+
∂φ(θ
0
,
˙
θ
0
)
∂θ
!
(3.9)
Λ =
∂φ(θ
0
,
˙
θ
0
)
δ
˙
θ
!
. (3.10)
Furthermore, it can be shown as that for the specific case where
˙
θ
0
= 0 (i.e.
tracking a static pose), Λ =0. Expanding Λ we have
Λ =
∂
M
−1
J
T
c
(J
c
M
−1
J
T
c
)
−1
˙
J
c
˙
θ− Φ(C +g)
!
δ
˙
θ
(3.11)
=
∂
M
−1
J
T
c
(J
c
M
−1
J
T
c
)
−1
˙
J
c
˙
θ
!
δ
˙
θ
−
∂
Φ(C +g)
!
δ
˙
θ
Rewriting
˙
J
c
as
˙
J
c
=
∂J
c
∂t
=
∂J
c
∂θ
dθ
dt
(3.12)
gives
Λ =
∂
M
−1
J
T
c
(J
c
M
−1
J
T
c
)
−1dJ
dθ
˙
θ
2
!
δ
˙
θ
−
∂
Φ(C +g)
!
δ
˙
θ
(3.13)
Both terms of 3.13 are quadratic in
˙
θ (C has an extrema point at
˙
θ = 0) and
therefore are 0 when their partial derivatives are evaluated at
˙
θ = 0. This yields
the following matrices in the familiar state space form, (3.14)
˙ x =Ax +Bu (3.14)
A =
0 I
Γ Λ
=
(Λ=0)
0 I
Γ 0
,B =
0
Φ
0
S
T
(3.15)
19
where x = [Δθ
T
, Δ
˙
θ
T
]
T
and u = [Δτ]. For a static pose, this becomes x =
[Δθ
T
,
˙
θ
T
]
T
. To calculate Γ and Λ, Yamane et al. (Yamane, 2012) suggested a
samplebasedmethod. Instead, Icalculate Γand Λnumericallythroughperturbing
the initial pose and using a central finite difference method.
∂f(x)
∂x
≈
f(x +)−f(x−)
2
(3.16)
wheref is the terms of Φ andφ in (3.5). Special care need to be taken for the
orientation of the base which is an element of SO(3). To account for this the box
plus operator is introduced
:SO(3)×so(3)→SO(3), (3.17)
q
1
,ω7→ exp(ω)⊗q
1
,
which makes use of the exponential mapping, q
2
= exp(ω), between an infinitely
smallrotationω anditscorrespondingquaternionq
2
. Notethat⊗isthequaternion
multiplication operator. In short, one maps the desired perturbation about an
axis, e
i
, to the corresponding rotation in SO(3). A more detailed description of
the differential calculus on SO(3) and the operator can be found in Hertzberg
et al. (2013). The numerical differentiation is then
∂f
∂q
=
f(qe
i
)−f(q−(e
i
))
2
, i = 1,..., 3 (3.18)
In order to compute the LQR gains, one must find a minimal realization of the
linearsystem. Inpractice, numericaldifferentiationcommonlyyieldsastatetransi
tion matrix that is illconditioned, in this case cond(A) = 2e20. This poses a prob
lem to compute such a minimal realization. This problem arises because numerical
20
differentiation does not guarantee that the contact constraints are enforced and
thereforeA has a rank that is higher than its theoretical maximum rank 2(n−m).
This makes it difficult for offtheshelf programs (e.g. minreal in Matlab) to numer
ically eliminate the uncontrollable states of the system, as required to solve the
LQR problem. To resolve this, constraints are again embedded in the linearized
dynamics by reprojecting the linearized system into the nullspace of the kinematic
constraint of Eq. 3.4 using the projection matrix N,
N = null
J
c
0
˙
J
c
J
c
!
.
Here,N∈R
n×(n−m)
is an orthonormal basis for the nullspace of the constraint
equation which maps the linearized dynamics to the minimal system as follows.
x
m
= N
T
x
A
m
= N
T
AN
B
m
= N
T
B
R
m
= R
Q
m
= N
T
QN
This linearization pipeline enables one to linearize around predefined poses or
the current state of the robot at each point along a trajectory as for the time
varying LQR problem. The optimal feedback gain matrix K
m
are computed by
minimizing the following cost function in the reduced state space
J =
Z
∞
0
(x
T
m
Q
m
x
m
+u
T
R
m
u)dt (3.19)
21
I then map the gains from the minimal system, K
m
, back to the full system as
K =K
m
N
T
.
The resulting controller for the humanoid robot is thus
τ =τ
0
−Kx. (3.20)
3.4 Task Space Control
It is possible to rewrite the matrixQ of the LQR cost function (3.19) such that
one optimizes for elements in the task space defined as a linear combination of the
original state vector x. If it is desirable to place costs on the states ˆ x defined as
ˆ x =Tx
rather than transforming the states, one can equivalently transform the cost func
tion such that Q =T
T
ˆ
QT. The cost function is equivalently written
J =
Z
∞
0
ˆ x
T
ˆ
Qˆ x +u
T
Ru (3.21)
J =
Z
∞
0
x
T
T
T
ˆ
QT
!
x +u
T
Ru (3.22)
In this study, I chose the task space objectives to be the regulation of the
CoM and total momentum. Human studies have shown that momentum is tightly
regulated during walking suggesting the importance of regulation in humanoid
systems (Popovic et al., 2004). Furthermore, recent work has shown promising
22
resultfromcontrollingtheCoMandmomentumforbalancing(Herzogetal.,2014),
(Lee & Goswami, 2012). By reformulating the cost function, one can generate
optimal gains for these task space objectives. The momentum about the CoM is
written in the form of the 6× 1 vector h
G
containing both the linear and angular
momentum. This matrix was defined by Orin et. al (Orin & Goswami, 2008) as
the centroidal momentum matrix, written
h
G
=
h
lin
h
ang
=
m ˙ x
com
I
˙
θ
com
=A
G
(θ)
˙
θ (3.23)
Where ˙ x
com
and
˙
θ
com
arethelinearandangularvelocityoftheCoMrespectively.
From here, I define A
G
= A
G
(θ
0
) = [A
G1
,A
G2
]
T
where A
G1
is the first 3 rows of
the 6×n matrix A
G
. The relationship between the CoM and joint angles can be
written
h
lin
=mΔx
com
=A
G1
Δθ (3.24)
Δx
com
=
1
m
A
G1
Δθ (3.25)
Q is partitioned such that
x
T
Qx =
(Δθ)
T
Q
1
(Δθ) 0
0
˙
θ
T
Q
2
˙
θ
(3.26)
Using (3.25) and following the transformation method in (3.21) and (3.22), we can
choose Q
1
such that
Q
1
=
1
m
2
!
A
T
G1
Q
com
A
G1
(3.27)
23
giving
(Δθ)
T
Q
1
(Δθ) =
1
m
2
!
(Δx
com
)
T
Q
com
(Δx
com
) (3.28)
where Q
com
is a 3× 3 matrix. Along the same lines, using (3.23) we choose Q
2
such that
Q
2
=A
T
G
Q
G
A
G
(3.29)
giving us
˙
θ
T
Q
2
˙
θ =h
T
G
Q
G
h
G
(3.30)
where Q
G
is a 6× 6 matrix. This allows one to weight the CoM movement and
total robot momentum directly through the cost matrices Q
com
and Q
G
. Lastly,
if needed, one can add a regularization term Q
R
to directly specify costs for joint
tracking.
Q =
Q
1
0
0 Q
2
+Q
R
(3.31)
Note that Q
R
should be chosen to be appropriately small such that it does not
dominate the weighting. For the rest of the paper the controller that uses this type
of cost function manipulation will be referred to as the LQRmomentum controller.
3.5 Balancing Experiments
Experiments were conducted in both simulation and on real hardware. The
focus of this section will be on the real hardware experiments, while simulation
results are used to show how the experiments may extend to a full humanoid
robot (i.e. including an upper body). The hardware used is the lower body of the
hydraulically actuated torque controlled robot Hermes, Sec. 2. The experiments
included are the following:
24
• Double Support Balancing: Impulse Rejection
• Double Support Balancing: Upperbody Disturbance Motion
• Single Support Balancing: Impulse Rejection
With the goal of balancing in mind, the cost matrix was tuned such that the
heaviest cost was placed on the error in the base x and y positions for the LQR
controller and the heaviest cost on the error in the CoM x and y position for the
LQRmomentum controller. To test the disturbance rejection capabilities of the
two controllers, push tests were conducted in both the lateral and frontal plane.
To measure the force of the push a 6 DoF force/torque sensor was mounted to the
end of push stick to record timesynchronized data.
3.5.1 Double Support Balancing: Impulse Rejection
To see how this LQR policy compares to more computationally intense meth
ods, I began by looking at the max impulse the control policy can reject without
falling over. The results of the lateral push test are shown in (Fig. 3.1, 3.2) for
the CoM and the joint tracking responses respectively. Because testing was only
done on the lower body of the robot, the position of the base was quite close to
where the COM would be located. As such, with less tuning parameters the LQR
momentum formulation produced a gain matrix similar in structure to that of the
LQR gain matrix. Because of the similarities, the plots of the LQRmomentum
gain matrix is omitted. The two gain matrices will diverge more when the full
humanoid is used and the joint coupling becomes more complex.
While the real experiments were only conducted on the lower body of the
humanoid, the feasibility of using this formulation on the full humanoid was con
ducted in simulation using the simulation environment SL (Schaal, 2009). Figure
25
z
y
x
−400
−200
0
LateralPushExperiment−LQRController
Force(N)
−0.1
0
0.1
COMX(m)
actual
desired
−0.05
0
0.05
0.1
COMY(m)
actual
desired
0 0.5 1 1.5 2 2.5 3 3.5
−0.1
0
0.1
Time(s)
COMZ(m)
actual
desired
Figure 3.1: Robot response and plot of the CoM error for a push in the lateral
plane (front).
26
−0.1
−0.05
0
0.05
0.1
Lateral Push Experiment −LQRController (Joint Response)
Left Joint Angles (rad)
0 0.5 1 1.5 2 2.5 3 3.5
−0.1
−0.05
0
0.05
0.1
Right Joint Angles (rad)
Time (s)
HFE
HAA
HFR
KFE
AR
AFE
AAA
Figure 3.2: Joint response to the push shown in Fig. 3.1
.
−400
−200
0
FrontalPushExperiment−LQRController
Force(N)
−0.1
0
0.1
COMX(m)
actual
desired
−0.05
0
0.05
0.1
COMY(m)
actual
desired
0 0.5 1 1.5 2 2.5 3 3.5
−0.1
0
0.1
Time(s)
COMZ(m)
actual
desired
Figure 3.3: CoM error for push in frontal plane (side).
27
3.4 shows the resulting motions from two push experiments. The robot model was
pushed both from the front and back with an impulse of 15 Ns. The goal here
was to see how the additional limbs would behave using the LQR formulation and
show feasibility for future tests on the full humanoid. As expected, the limbs were
coupled such they contributed to the robustness in balancing.
Figure 3.4: Simulation results from pushing experiments on the full humanoid
model. The top and bottom animations show the resulting motion from a 150
N push for 0.1s on the torso from the back and front respectively. The robot
operating under a PD controller using the diagonal of the LQR gain matrix was
not able to balance provided the same push.
.
The reformulation of the LQRmomentum was much easier to tune (i.e. fewer
tuning parameters with more intuitive effects) and produced a similar perfor
mance in terms of impulses that could be rejected. For both the LQR and LQR
momentum controller formulations, the humanoid was able to reject pushes of 8.1
Ns  8.6 Ns with a peak force of approximately 350 N in the lateral plane and 6.2 Ns
 6.7 Ns pushes with a peak force of approximately 250 N in the frontal plane. In
(Ott et al., 2011), test results were reported for an experiment where a pendulum
was used to impact the robot with about 5.8J in the frontal plane. This would
28
equivalently deliver approximately a 7.6 Ns impulse assuming that all the momen
tum of the ball was transferred to the robot. In (Herzog et al., 2014), impulses
between 4.5 Ns and 5.8 Ns with peak forces up to 150N were rejected while in sin
gle support balancing. These numbers show that the simple controller presented
is capable of maintaining balance and rejecting perturbations competitively with
more complicated methods.
3.5.2 Double Support Balancing: Upper Body Distur
bance Motion
While disturbances are often external in nature, they can also originate from
the motion of the robot itself. It is a common scenario for a humanoid to engage in
tasks that decouple upper and lower body goals. The upper body may be moving
around and interacting with the world while the lower body is purely focused on
balancing. To simulate this, a mass of 10kg was added to the torso joint and moved
through sinusoidal motions of different frequencies. The LQR controller was able
to balance for motions of the upper body moving up to 0.8 Hz in the sagittal plane
and 0.5 Hz in the frontal plane with an amplitude of 0.1 rad. Plots of the sagittal
disturbance tests for slow and fast sinusoids are shown in Fig. 3.5.
3.5.3 Single Support Balancing: Impulse Rejection
In the case of locomotion over very rough or complex terrain, stepping to main
tain stability may not always be possible. An example of this is the problem of
crossing stepping stones; there are a finite number of safe regions to step and
the robot must be able to reject disturbances or correct for error while in single
29
Sineso idal Di stur ba nce
0 5 10 15 20
0.05
0.00
0.05
0.10
0.15
0.20
0 5 10 15 20
0.10
0.05
0.00
0.05
0.10
0 5 10 15 20
0.10
0.05
0.00
0.05
0.10
Sag ittal Mov e ment Fast
0 5 10 15 20
0.10
0.05
0.00
0.05
0.10
0.15
0.20
0 5 10 15 20
0.10
0.05
0.00
0.05
0.10
0 5 10 15 20
T ime( s)
0.10
0.05
0.00
0.05
0.10
X
Y
Z
Sag ittal Mov e ment Sl ow
Base E r r o r (m ) C o M E r r or ( m) T o r so ( r ad) B a se E r r or ( m) C o M E r r or ( m) T o r s o ( r ad)
X
Y
Z
Figure 3.5: The LQR controller was used to balance the lower body of the robot
without knowing about the motion of the upper body. The graphs show the error
in the base position for both a slow (0.2 Hz) and fast motion (0.8 Hz).
support so it can plan and execute safe stepping motions. In the following exper
iment, I test the ability of the LQR controller to balance in single support while
being externally perturbed in different locations. Referring to the gain matrices in
Fig. 4.2, one can see that the balancing strategy for single support is drastically
different than that for double support. In the gain matrix, it is clear that the
support leg has a large amount of coupling (shown by offdiagonal terms) to help
the robot balance, while the swing leg essentially becomes decoupled from the sup
port leg (primarily diagonal terms) with some coupling relating the hip position
to base error (i.e. the swing leg can be moved at the hip to help correct for errors
measured in the base).
30
Figure 3.6: Snapshots from single support balancing task and the tracking error
of the CoM.
3.6 Summary
In this chapter, the general formulation for an LQR controller derived from
the full robot dynamics with constraints was presented. From this derivation,
the specific case where the regulation of a statically stable posture was evaluated
on the robot to keep the robot balanced. It has been successfully demonstrated
that with this simple control law both good disturbance rejection and tracking
can be achieved on a physical robot in the presence of real conditions (i.e. model
uncertainty, sensor noise, imperfect state estimation, ground contact compliance,
etc.). In the next chapter, I investigate the improvements that relinearization
contributes to the tracking performance for more complicated tasks involving a
larger range of motion and contact switching (i.e. walking).
31
250
200
150
100
50
0
50
Push Force (N)
20 25 30 35 40
Time (s)
0.20
0.15
0.10
0.05
0.00
0.05
0.10
0.15
Tracking Error (m)
CoM x
CoM Y
CoM Z
Swing Leg X
Swing Leg Y
Swing Leg Z
Figure 3.7: Tracking error of the CoM and swing leg after an external impulse.
The impulse was 8.2 Ns with a peak force of 488 N. A similar test was conducted
for a push located at the base of the robot, resulting in a impulse of 5.5 Ns and
peak force of 246 N. This value is very close to the max impulse of 5.8 Ns measured
when using a hierarchical QP controller on the same robot (Herzog et al., 2014).
32
Chapter 4
Full Dynamics ContactConsistent
LQR Design for Walking
In Chapter 3, a contact consistent LQR controller was developed and validated
experimentallyonrealhardwarefordifferentbalancingtasks. Thecontrollerexhib
ited a robust push recovery behavior competitive with more sophisticated balance
controllers which use QP solvers for inverse dynamics (Stephens & Atkeson, 2010;
Herzog et al., 2014), rejecting impulses up to 11.7 Ns with peak forces of 650 N
while in double support without stepping. Despite this good performance, the pre
vious section was limited to a single contact scenario using a single linearization
of the dynamics. In this section, we extend the previous approach and explore the
idea of using only a small number of LQR controllers computed from a set of pre
defined robot poses and contact configurations. My hypothesis is that only a small
number of LQR controllers, corresponding to a contactconsistent linearization of
the dynamics at key poses, are necessary to stabilize complex tasks. (Posa et al.,
2016) proposed to use a similar approach to stabilize constrained systems in order
to track dynamic behaviors. Feedback gains, although computed, are not directly
executed. Instead, walking trajectories are stabilized by computing the costto
go of the timevarying LQR problem. This cost is then used as the objective in
an inverse dynamics controller, which is implemented using a QP solved at each
control cycle. This optimizationbased approach allows for the use of additional
constraints such as torque limits and contact friction cones but has a relatively
33
high computational complexity compared to a simple LQR design. Further, only
results in simulation were presented, while in this work we focus predominantly
on real robot experiments: switching between multiple linearizations and contact
conditions for walking. Additionally, we experimentally study how the number of
relinearizations of the dynamics along a trajectory affects performance.
4.1 ZMP Based Walking Pattern Generation
4.1.1 Zero Moment Point
Arguably the most influential theoretical contribution in legged locomotion has
been the introduction of the Zero Moment Point (ZMP). The ZMP is used as both
a stability criteria and as a control point for many simplified or reduced models.
Consider a robot balancing on one foot on a horizontal floor. Contact with the
ground creates a resultant reactionary wrench w = [f
x
,f
y
,f
z
,τ
x
,τ
y
,τ
z
]
T
. This
wrench completely determines the evolution of the center of mass of the system,
however, it cannot be controlled directly. Rather, it can be indirectly controlled
by the motion of the robot above the foot. In this scenario, f
x
, f
y
, f
z
, and τ
z
are
all determined through the balancing forces provided by the contact friction. The
ZMP is defined as the point on the floor where the additional moments, (τ
x
and
τ
y
), are equal to zero. The ZMP is a useful quantity because it replaces all forces
acting on the robot by a single force. Note that the ZMP is only defined within the
support polygon of coplanar feet and is identical to the center of pressure (CoP)
within these bounds. Once the ZMP reaches the edge of the support polygon the
foot can rotate, indicating instability.
34
4.1.2 Control of ZMP through LIPM
Because the ZMP is controlled by actuating the joints composing the structure
above the foot, reduced models have been used to more compactly represent the
dynamics of the full robot. The mass distribution of the robot can be reduced to
a single point, called the center of mass (CoM). The dynamics of the CoM are
determined by the linear momentum of the system, which in turn evolves through
the application of forces on the end effectors. By constraining angular momentum
to be zero and the CoM to move along a plane we have derived the Linear Inverted
Pendulum Model (LIPM). The LIPM has been widely used in locomotion because
encodes the dynamic relationship between contact forces, on flat ground, and the
CoM. Additionally, because the model is linear it allows for fast computations
when used in optimal control algorithms. This has lead to a variety of control
designs that use output feedback to control the ZMP by planning the motion of
the CoM. The LIPM dynamics can be derived from the NewtonEuler Equations
(4.1), (4.2). Similar to (Agravante et al., 2016), we separate foot contact forces,
f
i
, from all other forces.
m(¨ c +g) =f
c
+
X
f
i
(4.1)
c×m(¨ c +g) +
˙
L =
X
s
i
×f
i
(4.2)
,
where c is the CoM, L is the angular momentum taken about the CoM, s
i
is
the location of the coplanar foot contacts, and the coordinate frame has thez axis
aligned with the gravity vector. Dividing the Euler equation by the foot contact
forces in the z direction, f
z
i
, of the Newton equation and rearranging yields:
35
P
s
i
×f
i
P
f
z
i
=
c×m(¨ c +g)−
˙
L
m(¨ c
z
+g)
.
Under the LIPM assumptions, the angular momentum and the CoM height
remain constant (
˙
L = 0, ¨ c
z
= 0), with the ground plan at zero (s
z
i
= 0). We can
then write the equations for the Zero Moment Point (ZMP), (Kajita et al., 2003a)
.
Z
x,y
=
P
s
x,y
i
f
z
i
P
f
z
i
=c
x,y
−
c
z
¨ c
x,y
g
. (4.3)
Note that the dynamics are decoupled in between x and y. In chapter 5, we
modify this derivation to allow for the use of hand contacts with the environment.
4.1.3 Finite Horizon Control of the ZMP
Dropping the superscripts of c
x
and c
y
, the dynamics of the LIPM can be
written in discrete statespace form as:
x(t + 1) =Ax(t) +Bu(t) (4.4)
y(t) =Cx(t) (4.5)
x(t) =
c(t)
˙ c(t)
¨ c(t)
, u(t) =
...
c(t) (4.6)
A =
1 T T
2
/2
0 1 T
0 0 1
, B =
T
3
/6
T
2
/2
T
, C =
1 0 −
c
z
g
(4.7)
36
where the CoM states are given by x(t) = [c(t), ˙ c(t), ¨ c(t)]
T
, u(t) =
...
c, and T is
the time between discrete steps. In this formulation, the evolution of the states is
deterministic and can be written in the following nested form
x(1) =Ax(0) +Bu(0)
x(2) =Ax(1) +Bu(1)
=A(Ax(0) +Bu(0)) +Bu(1)
=A
2
x(0) +ABu(0) +Bu(1).
For the horizon N, we can write the evolution of the ZMP compactly as
y(1)
y(2)
.
.
.
y(N)
=
Cx(1)
Cx(2)
.
.
.
Cx(N)
=
CA
CA
2
.
.
.
CA
N
x(0) +
CB 0 0 0
CAB CB 0 0
.
.
.
.
.
.
.
.
. 0
CA
N−1
B CA
N−2
B ... CB
u(1)
u(2)
.
.
.
u(N)
(4.8)
Y =P
x
x
0
+P
u
U (4.9)
Optimization of the inputs forU = [u(t)...u(t +N)] can be written in following
form:
minimize
U
Y−Y
ref

2
Q
+U
2
R
(4.10)
By optimizing over the entire trajectory, one can calculate the optimal offline
CoM trajectory to track. The pipeline from foothold selection to joint trajectories
can be visualized in Fig. 4.1. This offline method is used to generate a CoM
37
trajectory for walking, as shown in this Chapter. Alternatively, this optimization
problem can additionally include linear inequality constraints and be solved in an
online manner as discussed in Chapter 5.
4.2 MultiContact Linearizations
By using methods discussed in Sec. 3.2, one can linearize around arbitrary
poses and contact conditions offline and build up a library of local linear con
trollers. Fig. 4.2 shows a visualization of the gain matrices calculated around key
poses in the walking trajectory, both single support and double support poses. By
visualizing these resulting gain matrices, one can gain insight into the balancing
strategy by inspecting the coupling between joint states and output torques. This
type of analysis is not readily available for other controllers which output joint
torques rather than a local feedback policy. It is very helpful to have such insights
when working with such a high dimensional system and trying to reason about the
behavior the robot is exhibiting. High costs on the diagonals indicate decoupled
joint tracking while off diagonal terms indicate a coupling between different states
(note that a perfectly diagonal matrix would correspond to a standard indepen
dent joint PD controller). Note that while in double support the LQR controller
produces a highly coupled gain matrix (i.e. large off diagonal terms). In single
support, the balancing strategy changes and the support leg is coupled among its
own joints while mostly ignoring the tracking of the swing leg which becomes a
diagonally dominated controller that mostly considers decoupled joint tracking.
Figure 4.3 shows different poses and contact conditions used in this study and
provides values of the Frobenius norm of the difference between gain matrices with
respect to those for the centered posture. One can see that the gain variation is
38
Plan Footstep Locations
Generate ZMP Trajectory
(Piecewise Linear Interpolation
of Foothold Centers)
Generate CoM Trajectory
(Preview Control of CoM
using LIP Dynamics)
Generate Ende ffector Trajectory
(Quintic Spline Interpolation)
Inverse Kinematics
and/or
Inverse Dynamics
(Quadratic Program)
Robot
Joint
T orques
Figure 4.1: Diagram of a sample pipeline used to generate ZMP based walking
trajectories using the LIP.
39
400
300
200
100
0
1 00
2 00
L ef t L eg Righ t L eg T o r s o B a se L ef t L eg Righ t L eg T o r s o B a se L ef t L eg Righ t L eg T o r s o B a se
L ef t L eg Righ t L eg T o r s o
Figure 4.2: Joint position portion of gain matrices produced from key poses and
constraint conditions. The outer matrices are created for single leg balancing and
the center matrix is double support balancing.
40
quite small when shifting to the side, compared to when the contact constraint
changes, indicating that switching constraints has a much larger effect on the
LQR solution. This insight is further explored experimentally in Section 4.3. In
general, the diagonal terms produced by the LQR controller are lower than the
independent joint PD controller typically used on the robot. The Frobenius norm
of the LQR gain matrices used on the robot and independent joint PD matrix are
approximately 650 and 1,500 respectively.
1 2 3 4 5
Pose # 1 2 3 4 5
Norm 0 28.03 56.18 545.84 548.07
Figure 4.3: Postures that were chosen for linearization along with the Frobenius
normkK
n
−K
1
k of the gain differences with respect to the first pose (base centered
over feet).
4.3 Switching Between Multiple Linearizations
Intuitively, using more linearizations should better capture the dynamics of
the robot. However, the gain matrices produced by varying the pose changed
very little compared to those produced by changing the contact constraints. To
quantify the effect of using multiple linearizations, we tracked a sidetoside motion
41
and transitioned between sets of gains, again using the control diagram shown in
Fig. 4.4. The root mean squared error (RMSE) for CoM tracking was calculated
for two different motion speeds, as summarized in Table 4.1.
Poses SSP (s) DSP (s) X (m) Y (m) Z (m)
1 4 4 0.00806 0.08239 0.30955
3 4 4 0.00814 0.08268 0.30967
5 4 4 0.00816 0.08262 0.30962
1 0.5 2 0.01287 0.02913 0.03900
3 0.5 2 0.01304 0.05456 0.06337
5 0.5 2 0.01299 0.028760 0.04142
Table4.1: RMSEfortrackingasidetosideZMPtrajectorywithnocontactswitch
ing. The single support phase (SSP) time refers to the dwell of the ZMP located
at a single foot, where the foot would be lifted if the robot were walking, and the
double support phase (DSP) refers to the transition of the ZMP from one foot to
the other. The number in the poses column indicates the number of linearized
poses used for tracking.
From the experiments, we saw no clear advantage to using a larger number
of linearizations. We believe that this indicates that for the scenarios presented,
that there are other bottlenecks on the real robot (such as sensor noise and torque
tracking) that dominate the system’s performance. From the experiments on the
robot, we saw no clear advantage to using multiple linearizations over a single
linearization for a given contact state.
42
4.4 Walking: Multiple Linearizations and Con
tact Switching
While any walking planner could be used, we chose to generate walking trajec
tories offline. In doing so, we rely solely on the ability of the proposed LQR control
architecture to robustly track the originally planned trajectory. In the following
Chapter, we explore the extension to this idea by looking into reactive planning.
In these experiments, we used the preview control approach proposed by (Kajita
et al., 2003b) based on a ZMP stability criterion, as discussed in Sec. 4.1.1. From
the resulting CoM and predefined foot step trajectories, we generated desired joint
space trajectories using inverse kinematics and feedforward torques from gravity
compensation. During double support, we exploit torque redundancy to optimize
contact forces to shift the robot’s weight between successive stance legs (Righetti
& Schaal, 2012). We found that this weight distribution optimization is crucial in
order to achieve the desired motion and ZMP.
From the preceding results, we know that the LQR formulation behaves well
for tasks without contact switching. Switching contact conditions based solely on
the planned trajectory often cause instability because the real contact with the
environment never follows the exact timing of the planned trajectory. For the
walking task, it thus proved essential to estimate the contact state using both the
measured normal force and the planned trajectory. To ensure that contacts are
created robustly during the swing leg touch down, trajectories were designed such
that the foot contacted the ground with nonzero velocity. Furthermore, to deal
withjointtorquediscontinuitiesatcontacttransitions, causedifoneweretosimply
switch from one LQR controller to another, we used a heuristic approach to blend
the gains produced by the LQR controllers. When the contact condition changed,
43
thecurrentgainsarequicklyinterpolatedtoanindependentjointPDcontrollerand
then interpolate to the new LQR gain matrix. The gain matrices were selected by
considering the estimated contact condition of the end effectors and the minimum
norm of the current posture to the preselected linearized postures. The control
scheme used for walking is shown in Fig. 4.4.
ZMP
T r ajec tor y
Gen er a tor
LQR F eedbac k
P ol ic y
Gr a vi t y
C o m pen sa tio n
R o bot
+

+
+
C o n t ac t
E st i m at o r
Figure 4.4: Block diagram depicting the control architecture used in the walking
experiments. θ is the joint and base state, τ
ff
is the feedforward torque, and R
c
, L
c
denote the timevarying contact trajectories for the feet.
4.4.1 Simulation
Inthesimulation, therobotcouldpreciselytracktheprescribedZMPtrajectory
and planned footsteps for a wide range of timings and step parameters using only
three linearizations (one for double support and one for each single support pose)
as shown in Fig. 4.5. The close tracking of the open loop trajectory over a large
number of steps demonstrates the effectiveness of using only a small number of
linearizations.
44
0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4
0.2
0.1
0.0
0.1
0.2
0.3
0.4
Desir ed C oP
A c tu a l C o P
Figure 4.5: CoP and footstep location tracking for walking using the LQR con
troller in simulation.
4.4.2 Real Robot
Expanding on the simulation results, we tested the proposed control architec
ture on the real robot to walkinplace. Fig. 4.6 shows snapshots from the walking
experiment for a series of nine steps. Fig. 4.7 shows the CoP and CoM relative to
the feet for the same experiment. The maximum walking speed able to achieve had
a 1.5s SSP and a 1.8s DSP. As the robot walked at faster rates, the placement of
the foot steps became less precise and the performance degraded because of incon
sistencies with the preplanned trajectory. If the feet land in the wrong location,
the desired pose would become kinematically infeasible and the LQR controller
would be physically unable to drive the error to zero. We believe that replanning
the trajectory online, Chapter 5, would alleviate this issue and allow for both faster
and more stable walking.
45
0 5 10 15 20 25 30 35 40
Time (s)
0.10
0.05
0.00
0.05
0.10
0.15
0.20
0.25
Position (m)
CoM
CoP
Left Foot
Right Foot
Figure 4.7: CoP, CoM, and foot locations in the sidetoside direction throughout
the experiment from the walking experiment on the real robot using multiple LQR
linearizations and contact switching.
Figure4.6: Snapshotsfromthewalkingexperimentontherealrobotusingmultiple
LQR linearizations and contact switching.
4.5 Discussion
The experiments of the preceding section demonstrate both the strengths and
shortcomings of the wholebody LQR control approach for a number of different
46
tasks. We showed in the experiments that a relatively small number of lineariza
tions were sufficient to control stepping tasks and that increasing the number of
linearizationalongthetrackedtrajectoriesdidnotnecessarilyimproveperformance
on the real robot. My experiments suggest that linearizations are only necessary at
different contact configurations and that a complete gain schedule along planned
trajectories, where gains change every control cycle, might not provide additional
robustness in real robot experiments. It is also important to notice that stepping
and balancing tasks could not be achieved by merely using an independent joint
PD controller. The ability of the LQR controller to take into account joint cou
pling significantly improved the performance. Properly damping the system was
one of the biggest limitations to obtain robust behaviors on the robot. LQR con
trollers tend to create aggressive feedback on joint velocity which can create issues
on the robot due to excessive noise in the velocity signals. While this problem
does not exist in simulations, it becomes quickly visible on the real robot. While
a combination of onboard and offboard filtering helped, we expect that using a
modelbased filter with additional sensors such as IMUs can allow increasing the
control bandwidth (Rotella et al., 2016). Using control design techniques that
explicitly take into account measurement uncertainty might also help to address
this issue (Ponton et al., 2016).
Contact transitions were also difficult to control and accurate contact estima
tion was an important element of the control architecture. Indeed, gains change
drastically when contact conditions changes and poor estimation of this can lead
to unstable behaviors (e.g. when switching to a double support control when the
robot is still in single support). In addition, the heuristic used to smooth transi
tioning between gains at each contact sequence during the fast contact transition
also significantly increased performance. While controlling contact transitions was
47
not an issue in simulation, it remains an important issue for implementation on
real robots.
4.6 Summary
In this chapter, we showed experiments for computationally lightweight control
policy can be used to combine a small number of linearizations to create complex
motion. We explored using multiple linearizations to tracked sidetoside motions
and found that on the real robot there was no measurable advantage for using a
higher number of linearizations. Finally, using linearizations around each contact
situation (double support and both single support poses) was sufficient enough to
track a ZMP walking trajectory when coupled with a contact estimator that helped
transition between contact switching. These results highlight both the ability to
control complex motions with relatively simple control policies and also the need
to evaluate modern algorithms on real hardware. On the robot, we observed that
the aggressive stabilizing commands LQR produces could be problematic with high
sensor noise and limited control bandwidth. In the future, we plan to incorporate
joint state estimation (Rotella et al., 2016), focus on robustifying contact transi
tions and test the control framework in combination with online model predictive
control planners.
48
Chapter 5
MPC Walking Control Using
Hand Contacts
In bipedal locomotion, online pattern generators have seen much success
because of their ability to adapt to continually changing environment and inter
nal states. In order to generate new plans quickly, researchers typically utilize a
simplified dynamic model for fast planning of center of mass (CoM) trajectories.
The plans are then tracked using a wholebody controller typically utilizing either
inversekinematicsorinversedynamics. Historically, theLinearInvertedPendulum
Model (LIPM) has been the most prominent model for biped walking because it
is linear and yet effectively captures the dynamics of walking on flat surfaces. The
low dimensionality and linearity of the LIPM allow the dynamics of the model to
be regulated through linear control methods for compact formulations and efficient
computations.
In (Kajita et al., 2003a), Kajita et. al introduced the preview control of the
LIPM, which optimizes the CoM trajectory over a timehorizon. For a linear
system with no inequality constraints, this is equivalent to solving the well known
finitehorizonLinearQuadraticRegulator(LQR),whichyieldsatimevaryingfeed
forward and feedback policy. If the problem includes inequality constraints, the
solution no longer provides a feedback policy that stabilizes the trajectory. One
approach used to rectify this is Model Predictive Control (MPC). In an MPC
49
framework, feedback is generated by continuously resolving the constrained opti
mization from the measured state and applying the first feedforward command.
In (Wieber, 2006), Center of Pressure (CoP) constraints in the form of linear
inequalities are introduced and the problem is solved in a receding horizon manner
to produce CoM trajectories robust to perturbations. This approach was further
extended to additionally optimize over the footstep locations (Herdt et al., 2010),
allowingtherobottotakerecoverystepsandtrackareferencevelocity. Researchers
have further developed MPC methods using the LIPM to adjust footstep timing
(Khadiv et al., 2016; Maximo et al., 2016; Caron & Pham, 2017), plan 3D tra
jectories (Brasseur et al., 2015), and control the Divergent Component of Motion
(DCM) (Englsberger et al., 2014; Hopkins et al., 2015). A nonlinear extension also
allows the robot to deal with walking around obstacles (Naveau et al., 2017).
Theproblem ofdesigningreceding horizon controllers thathandle rough terrain
and allow the use of hand contacts remains a largely open area. More general
models such as the centroidal momentum dynamics have gained popularity to
generatepatternsallowingmulticontactbehaviors,(Carpentieretal.,2016;Herzog
et al., 2016), but they are not convex and are computationally more demanding.
While significant progress has been made with these algorithms, they have not yet
been applied for receding horizon control of legged robots.
Simpler models still allowing hand contacts have also been explored. In (Serra
et al., 2016), hand contacts with the environment were introduced in an MPC
problem by deriving a nonlinear dynamical model of the CoM that included an
external wrench. To solve the nonlinear optimization problem, a Newton scheme
was presented in which each step of the optimization was bounded such that the
intermediate solutions were always feasible. While external hand contacts were
50
External
Force
Footstep
Boundry
x
y
z
Figure 5.1: Visual of a push experiment setup mid push. The spheres located
on the wall mark available contact points, in which green spheres specify those
considered reachable.
used for stability, the contact locations and timing were predetermined. Addition
ally, the contact considered was a grasp and rather than a push, thus neglecting
friction cone constraints. In (Agravante et al., 2016), an external wrench is also
included in the MPC scheme to design a walking pattern generator for physical
collaboration. To drive cooperation with a second agent, the robot assumes the
51
role of either a leader or follower in the task of collectively carrying an object. Con
tact forces are included in the dynamics and cost function such that the problem
remains linear in the constraints and quadratic in the cost.
In this chapter, we extend simplified walking pattern generators to exploit hand
contactswhenstabilitycannotbemaintainedbysteppingalone. Wetreatthehand
contact as a complement to footstep planning. To limit computational complex
ity and allow receding horizon control, we propose a twopart optimization that
automatically calculates how to use hand contacts when the CoM and footstep
adjustment alone are not sufficient. The contact locations are chosen among mul
tiple contact points, considering both the time to reach the contact and friction
cone constraints. Our decomposition maintains the computational simplicity of
preview controllers based on the LIPM dynamics in the first optimization. The
second part then selects a hand contact location and contact force to stabilize
the robot when necessary by solving a lowdimensional mixedinteger quadratic
program (MIQP). The result is a controller that reactively decides when and how
to use the hands of the robot to further stabilize the walking controller. Push
recovery simulation experiments demonstrate that our approach can significantly
improve the stability of a walking robot under strong perturbations.
5.1 Linear Inverted Pendulum with an External
Contact Force
In Section 4.1.2, we derived the dynamics of the LIPM. Here, we show how
this model changes with the addition of an external contact force f
c
= [f
x
c
,f
y
c
,f
z
c
],
52
corresponding to a hand contact located at pointp = [p
x
,p
y
,p
z
]. Similar to (Agra
vante et al., 2016), we separate foot contact forces, f
i
, from all other forces in the
NewtonEuler Equations (5.1), (5.2).
m(¨ c +g) =f
c
+
X
f
i
(5.1)
c×m(¨ c +g) +
˙
L =p×f
c
+
X
s
i
×f
i
(5.2)
In these equations, c is the CoM, L is the angular momentum taken about the
CoM, s
i
is the location of the coplanar foot contacts, and the coordinate frame
has thez axis aligned with the gravity vector. Dividing the Euler equation by the
foot contact forces in the z direction, f
z
i
, of the Newton equation and rearranging
yields:
P
s
i
×f
i
P
f
z
i
=
c×m(¨ c +g)− (p×f
c
) +
˙
L
m(¨ c
z
+g)−f
z
c
.
Under the LIPM assumptions, the angular momentum and the CoM height
remain constant (
˙
L = 0, ¨ c
z
= 0), with the ground plane at zero (s
z
i
= 0). We can
then write the equations for the Zero Moment Point (ZMP) which we will refer to
as Z
hand
(i.e. ZMP with external hand contact).
Z
x,y
hand
=
P
s
x,y
i
f
z
i
P
f
z
i
=−
mc
z
¨ c
x,y
−mgc
x,y
−p
z
f
x,y
c
+p
x,y
f
z
c
mg−f
z
c
(5.3)
Notice that if there is no contact force (i.e. f
c
= 0) this term reduces to the familiar
ZMP equations for the LIPM, Z
LIPM
, (Kajita et al., 2003a) :
Z
x,y
LIPM
=
P
s
x,y
i
f
z
i
P
f
z
i
=c
x,y
−
c
z
¨ c
x,y
g
. (5.4)
53
5.2 Shifting the ZMP Support Polygon
To express the difference between the ZMP with and without a hand contact,
we introduce the variable ΔZ such that:
ΔZ =Z
hand
−Z
LIPM
(5.5)
or equivalently
Z
hand
=Z
LIPM
+ ΔZ. (5.6)
By expressing the ZMP this way, one can see how the traditional ZMP from the
LIPM shifts with the addition of an external force. As mentioned in (Agravante
et al., 2016), we can additionally think about the implication of this force on the
ZMP bounds. That is, we can rewrite
Z≤Z
LIPM
+ ΔZ≤Z (5.7)
as
Z− ΔZ≤Z
LIPM
≤Z− ΔZ. (5.8)
Eq. (5.8) implies that it is possible to shift the effective support polygon that
boundsZ
LIPM
through the addition of an external force. Solving for ΔZ yields the
nonlinear expression
ΔZ
x,y
=
p
z
mg−f
z
c
!
f
x,y
c
+
−p
x,y
−
c
z
¨ c
x,y
g
+c
x,y
mg−f
z
c
!
f
z
c
. (5.9)
54
ZMP Reference
ZMP
Conservative Safety Boundries
ZMP
Support Polygon
Shifted ZMP
Support Polygon
Constraint Violation
Figure 5.2: When the conservative ZMP bounds are violated the robot seeks to
create a contact force among the available options such that the "Shifted ZMP
Support Polygon", Eq. (5.8) is centered at the current ZMP location.
5.3 Optimization Formulation
Due to constraints on footstep locations (e.g. kinematic limits and environ
mental obstructions) and CoM motion (e.g. acceleration limits), it is not always
feasible to keep the ZMP within the specified bounds. We remedy these situations
by shifting the ZMP bounds by specifying an external force with the hand. One
approach is to embed Eq. (5.9) into the optimization resulting in nonlinear dynam
ics. It also introduces the variables f
c
i
,i = 1,...,N, andp for a total of (3N + 3)
55
new decision variables, where N is the number of timesteps in the MPC preview
horizon. This problem grows even further if one would like to consider points on
different surfaces with different constraints (i.e. friction cone and boundary con
straints). Because of the increased dimensionality, complexity, and thus the time
needed to solve this larger nonlinear optimization in an online manner, we propose
a more lightweight approach that remains linear by breaking the problem into two
separate optimizations. In the first optimization, we decide both when an external
force is necessary and what value ΔZ
x,y
is desired to shift the support polygon.
In the second optimization, we decide both the location of the hand contactp and
the force f
c
. While separating the optimization into two stages will result in a
suboptimal solution, we argue that in the context of a quick recovery motion, fea
sibility is more important than optimality. The result is a walking controller that
hand contacts in a reactive manner when the control over the CoM and footstep
locations alone is deemed insufficient. An outline of the algorithm logic is shown
in Alg. 1.
1 while Robot Has Not Fallen do
2 Find reachable contact points from list.
3 Constrain slack variables based on time to reach contact
4 Solve LIPMM based MPC (Sec. 5.3.1) to generate CoM, footstep, and
slack trajectory.
5 ifkSlack Trajectoryk> 0 then
6 if Hand In Contact then
7 Solve QP to calculate the external force that compensates for the
ZMP error (Sec. 5.3.3)
8 else
9 Solve MIQP to decide the best contact location (Sec. 5.3.4).
10 Track solution with hand.
11 end
12 end
13 end
Algorithm 1: MPC Walking Scheme With Hand Contacts
56
5.3.1 Stage1: MPCofCoMTrajectoryandFootstepLoca
tions
In the first stage of the optimization, we formulate an MPC scheme to minimize
CoM jerk, ZMP tracking error, and CoM velocity tracking error. The scheme
is based on the well known approach from (Herdt et al., 2010), so we omit the
details for brevity. The only difference in our approach is that we write the ZMP
constraints with the addition of a trajectory of slack variables,S
k
,k = 1...N. We
can write this constraint in the same form as Eq. (5.8), omitting the subscript k,
as:
Z−S≤Z
LIPM
≤Z−S, (5.10)
where Z and Z correspond to the upper and lower bounds of a conservative
support polygon, Fig. 5.2. The amount we scale the support polygon boundaries
is directly linked to how early the hand contact will be triggered to help. When
the ZMP leaves the conservative safety boundaries,S > 0, we treat this as an
indication that additional support is necessary to maintain balance, triggering the
use of a hand contact. In order to account for the fact that it will take some time
for the hand to reach the contact point location, we constrain the first N
D
time
steps of the slack variable S to be zero, where N
D
is representative of the time
steps it will take for the hand to make contact. Furthermore, we add a weighted
quadratic term to minimize S in the cost function to favor solutions that only use
footstep and CoM adjust for balance.
57
5.3.2 Stage 2: Hand Contact
Once the first optimization signifies the need for a hand contact,S > 0, a
second optimization calculates the necessary contact force and location to shift
the ZMP boundaries by ΔZ such that the ZMP is centered within the support
polygon , Fig. 5.2. While the hand has not yet created a contact, we formulate
a MIQP that includes all the reachable contact point locations, along with the
corresponding friction cones, to determine which, if any, is the optimal contact
location to enact the shift specified by the ΔZ trajectory. Once a contact has been
made, we remove all other contacts from the optimization and solve for the contact
force as a single QP that can be solved in realtime. We begin by formulating the
problem for when the hand is already in contact.
Friction Cone Constraints
Consider a block of massm sitting on a table. For a sufficiently small external
force applied tangentially, the block remains unmoved as the external force is
counteractedbyfriction. Astheforcesincrease,theexternalforcesremainbalanced
by friction until up until a certain point where the block starts sliding. When
creating contact with the environment it is desirable to know the limits on the
forces that can be applied while remaining still. To model this, many assume a
Coulomb friction model. In the Coulomb friction model, the block remains fixed
as long as the contact force, f
c
lies within the friction cone as shown in Fig. 5.4.
f
c
x
≤μf
c
z
(5.11)
58
where μ is the coefficient of friction between the two surfaces. In three dimen
sions this is visualized by a cone the vertex located at the contact point being
considered.
friction
cone
Figure 5.3: In the digram the block remains static because the contact force f
c
remains inside of the friction cone.
In many cases, such as optimization problems that require linear constraints,
is it advantageous to use a linear approximation of the friction cone, Fig. 5.3.
The conservative linear approximation for a friction cone, referred to as a friction
pyramid, is shown in
Figure 5.4: The conservative linear approximation for a friction cone, referred to
as a friction pyramid.
59
5.3.3 Determining the Contact Force
In Eq. (5.9), we can see that with the values of ΔZ
x,y
already determined, we
have two equations and three unknowns. Geometrically, the solutions for the 3D
force vector form a line can be rewritten in parametric form as:
f
x,y
c
=
p
x,y
+
c
z
¨ c
x,y
g
−c
x,y
− ΔZ
x,y
p
z
f
z
c
+
mgΔZ
x,y
p
z
. (5.12)
Note that f
x,y
c
is a linear function of f
z
c
and thus the only decision variable is
f
z
c
. We include linearized friction cone constraints, by first changing the frame of
our variables to be aligned with the surface normal of the contact:
f
b
f
t
f
n
=R
f
x
f
y
f
z
(5.13)
and write the following constraints
f
b
≤uf
n
, f
b
≥−μf
n
f
t
≤uf
n
, f
t
≥−μf
n
f
n
≤f
n
max
, f
n
≥ 0 ,
(5.14)
where μ is the static coefficient of friction that defines the linearized fiction
cone. The cost function to optimize is then:
minimize
f
z
(f
b
)
2
+ (f
t
)
2
+κ(f
n
)
2
subject to Friction Cone Constraints (5.14).
(5.15)
Reducing the gainκ favors solutions that are closer to the center of the friction
cone and thus more robust to slipping. This optimization formulation specified by
60
Eq. (5.14,5.12,5.15) will be referred to as QP1. Because the formulation of QP1
shifts the support polygon by exactly ΔZ, the optimization does not consider that
the ZMP may not be centered in the support polygon. To address this, we include
the addition of the new slack variable S
z
= [S
x
z
,S
y
z
] that is bounded as follows:
S
z
≤S
z
≤S
z
. (5.16)
Rewriting Eq. (5.12) as a function of S
z
and f
z
results in:
f
x,y
c
=
p
x,y
+
c
z
¨ c
x,y
g
−c
x,y
− ΔZ
x,y
p
z
!
f
z
c
+
1
p
z
!
S
x,y
z
f
z
c
+
mg
p
z
!
S
x,y
z
+
mgΔZ
x,y
p
z
!
.
(5.17)
Unfortunately, by introducing the new decision variable S
z
we have added the
bilinear term S
z
f
z
c
. In order to keep the problem linear and solvable with a QP,
we introduce a convex relaxation of the bilinear term by means of a McCormick
envelope (Castro & Cincias, 2014). In doing so, we create the new variablesW
x,y
=
S
x,y
z
f
z
c
, more compactly written as W, and add the following new constraints to
the system:
S
z
f
z
c
+ S
z
f
z
c
− S
z
f
z
c
≤ W
S
z
f
z
c
+ S
z
f
z
c
− S
z
f
z
c
≤ W
S
z
f
z
c
+ S
z
f
z
c
− S
z
f
z
c
≥ W
S
z
f
z
c
+ S
z
f
z
c
− S
z
f
z
c
≥ W.
(5.18)
This new formulation will be referred to as QP2.
61
Figure 5.5: The left shows the solution space when we solve for the exact ZMP
shift (QP1) and the right for when we formulate the optimization as QP2.
5.3.4 Determining the Contact Location
In order to determine the contact location, we reason about reachable contact
locations and their corresponding friction cone in the form of a MIQP. To define
whether a contact has been chosen, we introduce the binary variablesb
i
∈ 0, 1, i =
1,...,N
c
. We then optimize over the time steps
ˆ
k corresponding to the nonzero
elements in the ΔZ trajectory.
minimize
f
z
i
ˆ
k
,b
i
(f
b
)
2
i
ˆ
k
+ (f
t
)
2
i
ˆ
k
+κ(f
n
)
2
i
ˆ
k
subject to
Nc
X
1
b
i
= 1 (5.19)
b
i
=⇒
0, f
z
i
ˆ
k
= 0
1, (Friction Cone Constraints)
i
(5.20)
62
5.4 Experiments
5.4.1 Optimization Benchmarks
The experiments were conducted in this section using the optimization library
(Gurobi Optimization, 2016). The average solve time while searching among con
tacts, shown in Table 5.1, was found to vary linearly with the length of the ΔZ
trajectory and the number of contacts points. Once in contact, the optimization
problem collapses down to a single QP which can be solved in 0.11 ms.
ΔZ Timesteps Number of Contact Points (N
c
)
5 10 20 40
16 5.2 ms 9.7 ms 18.7 ms 35.9 ms
8 3.2 ms 5.2 ms 9.8 ms 19.8 ms
4 1.8 ms 2.9 ms 5.4 ms 10.2 ms
Table 5.1: Solving speed for multiple contact points considering different horizon
lengths.
As shown in Fig. 5.5, by solving for contact forces using the formulation of
QP2 as opposed to QP1 we expand the feasible solution space from a line to a
volume. TocomparehowthesolutionsofQP1andQP2, weperformedaparameter
sweep over μ and ΔZ and evaluated 5,000 random friction cones orientations and
CoM accelerations for the following ranges: μ = [.1, 1] and ΔZ = [−.05,.05],
S
z
= [−.02,.02]. In total, there were 2,000,000 test cases evaluated, resulting in
the following statistics:
Success Rate Average Force (N)
Force QP1 36.1% 28.0
Force QP2 73.6% 11.7
63
On average, formulation of QP2 favorably found optimal solutions for approx
imately twice the number of test cases at approximately half the force. Because
of the convex relaxation of the bilinear term used in QP2, we additionally need to
check that the solutions satisfy the original problem constraints. Over the values
tested, the maximum force violation was 0.037 N, which is negligible given the
accuracy of force tracking capabilities of the robot seen in experiments.
5.4.2 Maximum Push Experiments
We evaluated the proposed control scheme in simulation through a series of
push recovery tests on a 31 DoF torque controlled humanoid. We are particularly
interested in the relationship between the max impulse the robot can recover from
with and without an external force through hand contact with the environment.
Additionally, we are interested in how the number of recovery steps before hand
contact affects this maximum impulse value. To test this, the robot was pushed
towards a wall while walking in place from three separate distances that allow for
zero, one, and two recovery steps before hand contact. Additionally, the robot
footstep locations were constrained to remain at a minimum distance from the
wall (0.35m) so that the upperbody would not make contact unless the arm was
purposefully extended, Fig. 5.1. In order to track the CoM and footstep trajectory,
we use an optimizationbased wholebody inverse dynamics controller formulated
asa QP,similarto (Fengetal., 2015b;Herzogetal.,2016;Tedrake&others,2014),
to resolve for joint torques at 1kHz. To track the desired forces, we implemented
a simple force feedback policy that was added to the resulting joint torques from
the wholebody QP solver (τ
qp
):
τ =τ
qp
+J
T
c
(f
c
+P (f
c
−
ˆ
f
c
)) (5.21)
64
, where J
c
is the Jacobian from the CoM to the hand in contact,
ˆ
f
c
is the
measured force at the endeffector in global coordinates, andP is a forcefeedback
gain. In the future, we plan to include these desired forces and friction cone
constraints directly in the wholebody QP. The following parameters were used in
these experiments: MPC horizon = 1.6 s, MPC Δt = 0.1 s, robot mass = 62.5 kg,
CoM height = 0.78m, single support duration = 1.0 s, double support duration =
0.1 s, μ = 1, and the support polygon boundary used in Sec. 5.3.1 scaled the foot
dimensions by a factor of 0.1.
The results of the push experiments are shown in Fig. 5.6. On average, using
an external hand contact increased the maximum disturbance impulse by 233%,
156%, and 13% for the experiments where the robot was allowed zero, one, and two
recovery steps respectively. While the addition of a hand contact greatly increased
the walking controllers robustness to external impulses for zero and one recovery
steps, inthe2steprecoverysituation, therobothadaveryhighaccelerationbythe
time the wall became close enough to make contact. This resulted in high desired
contact forces that were difficult to track. The experiments confirm that because
of the short optimization horizon and the reactive nature of this formulation, the
approach presented here is most effective in close proximity of contacts or when
there is limited control authority through stepping.
65
0
5
10
15
2 Step Recovery
Hand Contact
No Hand Contact
0
5
10
15
Impulse (Ns)
1 Step Recovery
0.4 0.2 0.0 0.2 0.4
Time Relative to Double Support (s)
0
5
10
15
No Stepping
Figure 5.6: Maximum impulse at different times in the walking gait that the robot
was able to recover from for zero, one, and two step recoveries with and without
hand contacts.
Additionally, it is important to note that perfect force tracking was not
achieved, but the task was still successful. This is important because it is very
improbable to achieve perfect force tracking, while making and breaking contacts
with the environment, on a real humanoid robot. Instead, the robustness of the
approach presented comes from constantly replanning the contact force from the
66
measured robot state. Fig. 5.7 shows the tracking of the external hand force in
one of the experiments.
75
50
25
0
25
50
75
100
Force (N)
Push Force x
Contact Force x
Desired Contact Force x
Contact Force y
Desired Contact Force y
Contact Force z
Desired Contact Force z
3 4 5 6 7 8
Time (s)
0.05
0.10
0.15
0.20
0.25
0.30
0.35
0.40
Position (m)
ZMP x Ref
ZMP x
CoM x
Figure 5.7: Top: External impulse applied to the robot (10 Ns) and the force
tracking of the hand. Bottom: ZMP tracking and CoM trajectory as a result of
the push. Despite imperfect force tracking, the robot is still able to recover from
the impulse.
67
5.5 Summary
In this chapter, we presented an extension to an MPC walking control scheme
that reactively plans a desired contact location and force without any predeter
mined timing. In doing so, the formulation considers friction properties of the
contact surfaces and the time it takes for the robot to move its hand to the con
tact location. To this end, we introduced a twopart optimization scheme in which
the first optimization plans the CoM and footsteps trajectories while the second
optimizationreasonsaboutthehandcontacts. Thesecondoptimizationisinitiated
when the first optimization violates some conservative ZMP constraints. This sep
aration into two optimization problems allowed us to reach realtime computation
while maintaining low computational complexity. As shown in our experiments,
there is a great benefit of using hand contact together with a walking controller
when within a one recovery step proximity. Because this control scheme separates
walking from hand contacts, there is the possibility to pair the second stage with
different LIPM based walking pattern generators and other methods to trigger the
need for hand contact. Potential directions for future research would be to use the
DCM of the LIPM or incorporate an external force estimator (Rotella et al., 2015)
to trigger the need for hand contact.
68
Chapter 6
Conclusion
In this thesis, we have presented optimizationbased methods for wholebody
control and reactive planning. We began by detailing an LQR feedback policy that
exploitedknowledgeofthehighlycoupleddynamicsoftherobotforavarietyofbal
ancing tasks on a torque controlled robot. With a heavy focus on real robot exper
iments, we experimentally showed that this computationally lightweight approach
was able to reject external disturbances while balancing in both single support and
double support, competitive with more complex and less efficient control policies.
Our work on LQR balancing was then extended to track walking trajectories
generated from preview control of the LIPM by using a library of multiple LQR
control policies linearized around different postures and contact states. Obtaining
results on the robot required modification of the lowlevel torque control scheme
to increase the overall stability of the coupled wholebody LQR control policy.
Additionally. we investigated how tracking error changed as a result of the number
of linearizations, concluding that a small number of linearizations around different
contact states was sufficient to stabilize the planned trajectories.
Lastly, we explored online methods of modifying the task space trajectories
directly for walking while considering hand contacts for providing disturbance
rejection. To accomplish this, we developed a reactive planning framework based
ontheLIPMdynamicswhenconsideringforcesprovidedbyahandcontact. Inthis
framework, the CoM and footstep trajectories were solved for online as an MPC.
When ZMP bounds in the MPC problem were violated (by using a slack variable),
69
a secondary optimization was solved to determine hand contact locations and tra
jectories designed to keep the robot stable. The formulation presented allowed
for the hand contact trajectories to be planned in realtime. Additionally, we
showed how this optimization could consider a larger range of solutions by using
McCormick envelopes. Simulation results were presented on a full humanoid robot
(31 DoFs) for a variety of scenarios, timings, and contact surfaces.
6.1 Future Work
OnenaturalextensionfortheMPCplannerpresentedinChapter5istodevelop
methods that allow for footstep adjustment on rough terrain. While rough terrain
encapsulates a great variety of terrain types, the rough terrain referred to in this
section is characterized by 3D geometry consisting of both continuous and discrete
sections of terrain. For continuous flat terrain, footstep locations can be written
analytically as a function of the CoM (Koolen et al., 2012) or directly optimized
as continuous decision variables (Herdt et al., 2010) using linear models. This
problem becomes nonlinear when considering nonflat terrain.
If only discrete foothold placements are considered, then the choice of all pos
sible foothold sequences can be written as a combinatorial problem and discrete
searches can be used to choose among precomputed motions. In (Deits & Tedrake,
2014), footstep locations are planned over rough terrain by using a MIQP. MIQPs
extend quadratic programs by allowing the inclusion of integer variables. By using
binary variables, boolean logic can be used for decision making and nonlinear
functions can be approximated as piecewise linear functions. By using this formu
lation, (Deits & Tedrake, 2014) was able to assign footstep locations to precom
putedconvexplanarobstaclefreeregions, throughbooleanlogic, andthenoptimize
70
footstep locations within these convex regions as continuous variables. While this
formulation works well as a footstep planner, dynamics and timing were ignored
when choosing the foothold locations. An extension to this work was provided
by (Ponton et al., 2016), in which a convex relaxation of the nonlinear momen
tum dynamics was used to plan CoM motion and interaction forces. Using this
plan, footstep locations were selected to be dynamically consistent with the plan.
While impressive planning results were shown, experiments were not conducted
using dynamic simulations. Additionally, the solver takes 200500ms to compute
a planned trajectory while the approach of (Herdt et al., 2010) can be solved in
under 1ms. Because of these reasons, it is not clear whether this approach would
work well on a real robot.
Along these lines, we propose a simpler approach to walking over rough terrain
that uses a linear model. Reactive planners that rely on the LIPM are constrained
to reasoning about flat ground because the control point, the ZMP, is only defined
for this situation. (Caron et al., 2015) formulated a generalization to the ZMP
and the corresponding stability bounds. By considering projections of the contact
wrench and friction cones at contact points, a pseudoZMP and its corresponding
support polygon can be calculated for any virtual plane. This allows the LIPM
based MPC framework to be extended to noncoplanar contacts. We believe that
reactive walking on rough terrain could be formulated by combining pseudoZMP
projection methods and considering discrete stepping planes as a MIQP.
71
Reference List
Aftab Z, Robert T, Wieber PB (2012) Ankle, hip and stepping strategies for
humanoid balance recovery with a single model predictive control scheme. In
2012 IEEERAS International Conference on Humanoid Robots (Humanoids),
pp. 159–164.
Aghili F (2005) A unified approach for inverse and direct dynamics of constrained
multibody systems based on linear projection operator: Applications to control
and simulation. IEEE Transactions on Robotics 21:834–849.
Agravante DJ, Sherikov A, Wieber PB, Cherubini A, Kheddar A (2016) Walking
pattern generators designed for physical collaboration. In 2016 IEEE Interna
tional Conference on Robotics and Automation (ICRA), pp. 1573–1578.
Akachi K, Kaneko K, Kanehira N, Ota S, Miyamori G, Hirata M, Kajita S, Kane
hiro F (2005) Development of humanoid robot hrp3p. In 5th IEEERAS Inter
national Conference on Humanoid Robots, 2005., pp. 50–55.
Anderson B, Moore J (1990) Optimal control: linear quadratic methods. Prentice
Hall information and system sciences series. Prentice Hall.
Boaventura T, Semini C, Buchli J, Frigerio M, Focchi M, Caldwell DG (2012)
Dynamic torque control of a hydraulic quadruped robot. In 2012 IEEE Inter
national Conference on Robotics and Automation (ICRA), pp. 1889–1894.
Brasseur C, Sherikov A, Collette C, Dimitrov D, Wieber PB (2015) A robust
linear mpc approach to online generation of 3d biped walking motion. 2015
IEEERAS International Conference on Humanoid Robots (Humanoids) 2015
Decem:595–601.
Caron S, Pham QC (2017) When to make a step? tackling the timing problem
in multicontact locomotion by toppmpc. In 2017 IEEERAS International
Conference on Humanoid Robotics (Humanoids), pp. 522–528.
72
Caron S, Pham QC, Nakamura Y (2015) Stability of surface contacts for humanoid
robots: Closedform formulae of the contact wrench cone for rectangular sup
port areas. 2015 IEEE International Conference on Robotics and Automation
(ICRA) 2015June:5107–5112.
Carpentier J, Tonneau S, Naveau M, Stasse O, Mansard N (2016) A versatile
and efficient pattern generator for generalized legged locomotion. In 2016 IEEE
International Conference on Robotics and Automation (ICRA), pp. 3555–3561.
Castro P, Cincias F (2014) Tightening piecewise mccormick relaxations through
partitiondependent bounds for nonpartitioned variables. .
Cheng G, Hyon SH, Ude A, Morimoto J, Hale J, Hart J, Nakanishi J, Bentivegna
D, Hodgins J, Atkeson C, Mistry M, Schaal S, Kawato M (2008) Cb: Exploring
neuroscience with a humanoid research platform. In 2008 IEEE International
Conference on Robotics and Automation (ICRA), pp. 1772–1773.
Deits R, Tedrake R (2014) Footstep planning on uneven terrain with mixedinteger
convexoptimization. In2014IEEERASInternationalConferenceonHumanoid
Robots (Humanoids), pp. 279–286.
Englsberger J, Koolen T, Bertrand S, Pratt J, Ott C, AlbuSchäffer A (2014)
Trajectory generation for continuous leg forces during double support and heel
totoe shift based on divergent component of motion pp. 4022–4029.
Feng S, Xinjilefu X, Atekson C, Kim J (2015a) Optimization based controller
design and implementation for the atlas robot in the darpa robotics chal
lenge finals In 2015 IEEERAS International Conference on Humanoid Robots
(Humanoids).
Feng S, Whitman E, Xinjilefu X, Atkeson CG (2015b) Optimization based full
body control for the atlas robot. 2015 IEEERAS International Conference on
Humanoid Robots (Humanoids) pp. 120–127.
Focchi M, MedranoCerda GA, Boaventura T, Frigerio M, Semini C, Buchli J,
Caldwell DG (2016) Robot impedance control and passivity analysis with inner
torque and velocity feedback loops. Control Theory and Technology 14:97–112.
Gurobi Optimization I (2016) Gurobi optimizer reference manual.
Herdt A, Diedam H, Wieber PB, Dimitrov D, Mombaur K, Diehl M (2010) Online
walkingmotiongenerationwithautomaticfootstepplacement. Vol.24ofSpecial
Issue: Section Focused on Cutting Edge of Robotics in Japan 2010, pp. 719–737.
Taylor & Francis.
73
Hertzberg C, Wagner R, Frese U, Schröder L (2013) Integrating generic sen
sor fusion algorithms with sound state representations through encapsulation of
manifolds. Inf. Fusion 14:57–77.
Herzog A, Righetti L, Grimminger F, Pastor P, Schaal S (2014) Balancing exper
iments on a torquecontrolled humanoid with hierarchical inverse dynamics. In
2014 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), pp. 981–988.
Herzog A, Schaal S, Righetti L (2016) Structured contact force optimization for
kinodynamic motion generation. In 2016 IEEE/RSJ International Conference
on Intelligent Robots and Systems (IROS), pp. 2703–2710.
Herzog A, Rotella N, Mason S, Grimminger F, Schaal S, Righetti L (2016) Momen
tumcontrolwith hierarchical inversedynamics onatorquecontrolledhumanoid.
Autonomous Robots 40:473–491.
Hirai K, Hirose M, Haikawa Y, Takenaka T (1998) The development of honda
humanoid robot. In 1998 IEEE International Conference on Robotics and
Automation (ICRA), Vol. 2, pp. 1321–1326 vol.2.
Hopkins M, Hong D, Leonessa A (2015) Humanoid locomotion on uneven ter
rain using the timevarying divergent component of motion. 2015 IEEERAS
International Conference on Humanoid Robots (Humanoids) pp. 266–272.
HyonS,HaleJ,ChengG(2007) Fullbodycomplianthumanhumanoidinteraction:
Balancing in the presence of unknown external forces. IEEE Transactions on
Robotics 23:884–898.
Ibanez A, Bidaud P, Padois V (2014) Emergence of humanoid walking behaviors
from mixedinteger model predictive control. 2014 IEEE International Confer
ence on Intelligent Robots and Systems (IROS) pp. 4014–4021.
Kajita S, Kanehiro F, Kaneko K, Fujiwara K, Harada K, Yokoi K, Hirukawa H
(2003a) Biped walking pattern generation by using preview control of zero
moment point. 2:1620–1626 vol.2.
Kajita S, Kanehiro F, Kaneko K, Fujiwara K, Harada K, Yokoi K, Hirukawa H
(2003b) Resolved momentum control: humanoid motion planning based on the
linear and angular momentum. In 2003 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), Vol. 2, pp. 1644–1650 vol.2.
Kaneko K, Kanehiro F, Kajita S, Hirukawa H, Kawasaki T, Hirata M, Akachi K,
IsozumiT(2004) Humanoidrobothrp2. In2004IEEEInternationalConference
on Robotics and Automation (ICRA), Vol. 2, pp. 1083–1090 Vol.2.
74
Kanoun O, Lamiraux F, Wieber PB (2011) Kinematic control of redundant manip
ulators: Generalizing the taskpriority framework to inequality task. IEEE
Transactions on Robotics 27:785–792.
Khadiv M, Herzog A, Moosavian S, Righetti L (2016) Step timing adjustment: A
step toward generating robust gaits. In 2016 IEEERAS International Confer
ence on Humanoid Robots (Humanoids), pp. 35–42.
Khatib O (1987) A unified approach for motion and force control of robot manip
ulators: The operational space formulation. IEEE Journal of Robotics and
Automation (IJRA) RA3:43–53.
Koolen T, Smith J, Thomas G, Bertrand S, Carff J, Mertins N, Stephen D, Abeles
P, Englsberger J, McCrory S, van Egmond J, Griffioen M, Floyd M, Kobus S,
Manor N, Alsheikh S, Duran D, Bunch L, Morphis E, Colasanto L, Hoang KLH,
Layton B, Neuhaus P, Johnson M, Pratt J (2013) Summary of team ihmc’s
virtual robotics challenge entry. In 2013 IEEERAS International Conference
on Humanoid Robots (Humanoids), pp. 307–314.
Koolen T, Boer T, Rebula J, Goswami A, Pratt J (2012) Capturability
based analysis and control of legged locomotion, part 1: Theory and appli
cation to three simple gait models. The International Journal of Robotics
Research 31:1094–1113.
Kryczka P, Kormushev P, Tsagarakis NG, Caldwell DG (2015) Online regeneration
of bipedal walking gait pattern optimizing footstep placement and timing. In
2015 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), pp. 3352–3357.
Lee SH, Goswami A (2012) A momentumbased balance controller for humanoid
robots on nonlevel and nonstationary ground. Autonomous Robots 33:399–414.
Mason S, Righetti L, Schaal S (2014) Full dynamics lqr control of a humanoid
robot: An experimental study on balancing and squatting. In 2014 IEEERAS
International Conference on Humanoid Robots (Humanoids), pp. 374–379.
Mason S, Rotella N, Schaal S, Righetti L (2016) Balancing and walking using full
dynamics lqr control with contact constraints. In 2016 IEEERAS International
Conference on Humanoid Robots (Humanoids).
Mason S, Rotella N, Schaal S, Righetti L (2018) An mpc walking framework with
external contact forces In 2018 IEEE International Conference on Robotics and
Automation (ICRA).
75
Maximo MROA, Ribeiro CHC, Afonso RJM (2016) Mixedinteger programming
for automatic walking step duration. pp. 5399–5404.
Naveau M, Kudruss M, Stasse O, Kirches C, Mombaur K, Souères P (2017) A
reactive walking pattern generator based on nonlinear model predictive control.
IEEE Robotics and Automation Letters 2:10–17.
Orin D, Goswami A (2008) Centroidal momentum matrix of a humanoid robot:
Structure and properties. In 2008 IEEE/RSJ International Conference on Intel
ligent Robots and Systems (IROS), pp. 653–659.
Ott C, Roa M, Hirzinger G (2011) Posture and balance control for biped robots
based on contact force optimization. In 2011 11th IEEERAS International
Conference on Humanoid Robots (Humanoids), pp. 26–33.
Park IW, Kim JY, Lee J, Oh JH (2005) Mechanical design of humanoid robot plat
form khr3 (kaist humanoid robot 3: Hubo). In 2005 IEEERAS International
Conference on Humanoid Robots (Humanoids), pp. 321–326.
Ponton B, Herzog A, Schaal S, Righetti L (2016) A convex model of momentum
dynamics for multicontact motion generation. 2016 IEEERAS International
Conference on Humanoid Robots (Humanoids) p. 8.
Ponton B, Schaal S, Ludovic R (2016) The role of measurement uncertainty in
optimal control for contact interactions. .
Popovic M, Hofmann A, Herr H (2004) Angular momentum regulation during
human walking: biomechanics and control. In 2004 IEEE International Confer
ence on Robotics and Automation (ICRA), Vol. 3, pp. 2405–2411 Vol.3.
Posa M, Kuindersma S, Tedrake R (2016) Optimization and stabilization of tra
jectories for constrained dynamical systems. In 2016 IEEE International Con
ference on Robotics and Automation (ICRA), pp. 1366–1373.
Pratt J, Dilworth P, Pratt G (1997) Virtual model control of a bipedal walking
robot. In 1997 International Conference on Robotics and Automation (ICRA),
Vol. 1, pp. 193–198 vol.1.
Pratt J, Koolen T, Boer T, Rebula J, Cotton S, Carff J, Johnson M, Neuhaus
P (2012) Capturabilitybased analysis and control of legged locomotion, part
2: Application to m2v2, a lowerbody humanoid. The International Journal of
Robotics Research 31:1117–1133.
Righetti L, Buchli J, Mistry M, Schaal S (2011) Inverse dynamics control of
floatingbase robots with external constraints: A unified view. In 2011 IEEE
International Conference on Robotics and Automation (ICRA), pp. 1085–1090.
76
Righetti L, Schaal S (2012) Quadratic programming for inverse dynamics with
optimal distribution of contact forces. In 2012 IEEERAS International Con
ference on Humanoid Robots (Humanoids), pp. 538–543.
Rotella N, Herzog A, Schaal S, Righetti L (2015) Humanoid momentum estimation
usingsensedcontactwrenches In2015IEEERAS15thInternationalConference
on Humanoid Robots (Humanoids), pp. 556–563.
Rotella N, Mason S, Schaal S, Righetti L (2016) Inertial sensorbased humanoid
joint state estimation. In 2016 IEEE International Conference on Robotics and
Automation (ICRA), pp. 1825–1831.
Schaal S (2009) The SL simulation and realtime control software package. Tech
nical report.
SerraD, BrasseurC, Sherikov A,Dimitrov D,Wieber PB(2016) Anewton method
with always feasible iterates for nonlinear model predictive control of walking
in a multicontact situation. In 2016 IEEERAS International Conference on
Humanoid Robots (Humanoids), pp. 932–937.
Sherikov A, Dimitrov D, Wieber PB (2014) Whole body motion controller with
longterm balance constraints. In 2014 IEEERAS International Conference on
Humanoid Robots (Humanoids), pp. 444–450.
Stephens B, Atkeson C (2010) Dynamic balance force control for compliant
humanoid robots. In 2010 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), pp. 1248–1255.
Tedrake R et al. (2014) A summary of team mit’s approach to the virtual robotics
challenge In 2014 IEEE International Conference on Robotics and Automation
(ICRA), pp. 2087–2087.
Wieber P (2006) Trajectory free linear model predictive control for stable walk
ing in the presence of strong perturbations. In 2006 IEEERAS International
Conference on Humanoid Robots (Humanoids), pp. 137–142.
Yamane K (2012) Systematic derivation of simplified dynamics for humanoid
robots. In 2012 12th IEEERAS International Conference on Humanoid Robots
(Humanoids), pp. 28–35.
77
Abstract (if available)
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Estimationbased control for humanoid robots
PDF
The representation, learning, and control of dexterous motor skills in humans and humanoid robots
PDF
Nonverbal communication for nonhumanoid robots
PDF
The task matrix: a robotindependent framework for programming humanoids
PDF
Iterative path integral stochastic optimal control: theory and applications to motor control
PDF
Leveraging structure for learning robot control and reactive planning
PDF
Discrete geometric motion control of autonomous vehicles
PDF
From active to interactive 3D object recognition
PDF
Machine learning of motor skills for robotics
PDF
Design and control of a twomode monopod
PDF
Datadriven acquisition of closedloop robotic skills
PDF
Data scarcity in robotics: leveraging structural priors and representation learning
PDF
Characterizing and improving robot learning: a controltheoretic perspective
PDF
Intelligent robotic manipulation of cluttered environments
PDF
Informative path planning for environmental monitoring
PDF
Mobilitybased topology control of robot networks
PDF
Information theoretical action selection
PDF
Rethinking perceptionaction loops via interactive perception and learned representations
PDF
Learning objective functions for autonomous motion generation
PDF
A robotic system for benthic sampling along a transect
Asset Metadata
Creator
Mason, Sean Alan
(author)
Core Title
Optimizationbased wholebody control and reactive planning for a torque controlled humanoid robot
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Computer Science
Publication Date
07/26/2018
Defense Date
05/09/2018
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
Balance,controls,dynamics,humanoid,hydraulic,locomotion,LQR,OAIPMH Harvest,optimization,robot,torquecontrolled,Walking
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Schaal, Stefan (
committee chair
), Finley, James (
committee member
), Righetti, Ludovic (
committee member
), Sukhatme, Gaurav (
committee member
)
Creator Email
seanmaso@usc.edu,seanmason337@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/uscthesesc8928685
Unique identifier
UC11672302
Identifier
etdMasonSeanA6503.pdf (filename),uscthesesc8928685 (legacy record id)
Legacy Identifier
etdMasonSeanA6503.pdf
Dmrecord
28685
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Mason, Sean Alan
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 900892810, USA
Tags
controls
dynamics
humanoid
hydraulic
locomotion
LQR
optimization
robot
torquecontrolled