Augmented Simulation Techniques for Robotic Manipulation
by
David Millard
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
December 2023
Copyright 2024 David Millard
Dedication
To my parents, for their support and encouragement.
ii
Acknowledgements
The work in this thesis would have been impossible to complete without my advisor, collaborators, colleagues,
and friends. There are too many to list here, but I would like to thank a few in particular.
First and foremost, I thank my advisor, Gaurav S. Sukhatme, for his guidance, support, and encouragement
throughout my time in graduate school. His principled and consistent approach to advising, his strong support
of academic exploration and research freedom, and his dedication to his students have been invaluable to
me. I am also grateful that he has assembled such a talented, diverse, and supportive group of students
and postdocs in the Robotic Embedded Systems Laboratory, and am proud to have been a part of it. It
seems we will never solve some of our favorite pet debates, but I look forward to continuing to argue about
consciousness, bitter lessons, and longevity for years to come.
I am grateful to my NASA collaborators through the NASA Space Technology Research Fellowship,
especially at the NASA Intelligent Robotics Group and the Jet Propulsion Laboratory, for their support and
encouragement. Lorenzo Flückiger, Trey Smith, and Brian Coltin have supported me throughout my PhD,
and I am grateful for their mentorship. They have been willing to dive deep into experimental and algorithmic
details, and have always helped me to position and understand my work in the context of the broader research
community. I am also grateful for my time at the Jet Propulsion Laboratory, where I was able to work with
Paul Backes, Daniel Pastor, and Joseph Bowkett on the Europa Lander project. JPL is a unique place that I
will always be proud to have been a part of.
iii
The other members of my thesis committee, Lars Lindemann and Somil Bansal, have provided valuable
feedback and encouragement throughout the late stages of my PhD. I am grateful for their time and attention.
My many co-authors have been invaluable to me. I am grateful to have worked with such talented and
dedicated researchers, and I am proud of the work we have done together. Eric Heiden, James Preiss, Chris
Denniston, Erwin Coumans, Jernej Barbič, Fabio Ramos, Yizhou Sheng, and Tao Yao, thank you for your
contributions to the work presented here. I have learned so much from each of you.
I am lucky to have made many friends during my time in graduate school. Gautam Salhotra, Aleksei
Petrenko, Isabel Rayas, Chris Birmingham, Lilly Clark, and Salvador Buse have been my closest friends and
have been continued sources of inspiration and support. From our very first days with the first year potluck to
our last days on the RTH fourth floor, I am glad to have shared this time in my life with all of you.
To my RAGBRAI 2023 team, I may have recovered enough now to ride again. Eric Ewing, Jackie
Downling, Anna Scott, Mike O’Connell, and Jeremy Morgan - thanks for whatever it was that we did, and I
hope we can do it again.
My friends from the University of Georgia, who have believed in me and supported me, have been a
foundation for me over the past decade. Matthew Saltz, John B. Stroud, Eilidh Geddes, Jerry Griffin, Jesse
Chan, and Cameron Zahedi - thank you for our lasting friendship, and I look forward to many more years of
it.
To Madison Douglas, my partner, I am so grateful for your love and support. You are a constant source of
inspiration and encouragement, in ways that I cannot express. I am so lucky to have you in my life.
iv
Table of Contents
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 2: Simulation Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Mathematical Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.1 Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 Manipulator Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Numerical Integration of ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.1 Contact Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3.2 Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Chapter 3: NeuralSim: Augmenting Differentiable Simulators with Neural Networks . . . . . . . . 13
3.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.4 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.4.1 Hybrid Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.4.2 Overcoming Local Minima . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.4.3 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5.1 Friction Dynamics in Planar Pushing . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.5.2 Passive Frictional Forces for Articulated Systems . . . . . . . . . . . . . . . . . . . 25
3.5.3 Discovering Neural Augmentations . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.5.4 Imitation Learning from MPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Chapter 4: Parameter Estimation for Deformable Objects in Robotic Manipulation Tasks . . . . . . 30
4.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
v
4.2 Chapter Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.3 Background and Placement within Related Works . . . . . . . . . . . . . . . . . . . . . . . 32
4.3.1 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.3.2 Closing the Sim-to-Real Gap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.3.2.1 Deformable Object Modeling . . . . . . . . . . . . . . . . . . . . . . . . 33
4.3.2.2 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.4 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.4.1 Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.4.1.1 Constitutive Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.4.1.2 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.4.1.3 Time Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.4.1.4 Measurement model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.4.2 Nonlinear Programming for Parameter Estimation . . . . . . . . . . . . . . . . . . 38
4.4.2.1 Equality Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.4.2.2 Quadratic Penalty Methods . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.4.2.3 µ-Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.4.2.4 Initial X Guess . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.2.5 Rest state problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.6 Conclusion and Ongoing Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Chapter 5: Tracking Fast Trajectories with a Deformable Object using a Learned Model . . . . . . 45
5.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2 Chapter Introduction and Placement within Related Work . . . . . . . . . . . . . . . . . . . 45
5.3 Problem Setting and Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.4 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.4.1 Data collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.4.2 RNN dynamics model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.4.3 Model-predictive control with reduced-order model . . . . . . . . . . . . . . . . . . 53
5.4.4 Estimating the RNN state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.4.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.5.1 Model frequency response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.5.2 MPC tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Chapter 6: GranularGym: High Performance Simulation for Robotic Tasks with Granular Materials 62
6.1 Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.2 Chapter Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.3.1 Implicit Contact Impulses with Projected Jacobi . . . . . . . . . . . . . . . . . . . . 65
6.3.2 Linked Spatial Hashmaps for Broadphase Nearest Neighbor Searches . . . . . . . . 68
6.3.3 Collision Geometry using Signed Distance Functions . . . . . . . . . . . . . . . . . 71
6.3.4 Minimizing Warp Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.4 Benchmark Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.4.1 Bulldozing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.4.2 Helical Gear Tower . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.4.3 Excavation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
vi
6.5 Placement Within Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.6 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Chapter 7: Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Appendix Chapter A: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
7.1 Background on the Julia Programming Language . . . . . . . . . . . . . . . . . . . . . . . 100
7.2 A Julia wrapper for the Franka Emika Panda robot (Franka.jl) . . . . . . . . . . . . . . . . 101
vii
List of Tables
3.1 Comparison of dynamics modeling approaches (selected works) along the axes of analytical
and data-driven modeling, end-to-end differentiability, and hybrid approaches. . . . . . . . . 15
3.2 Physical properties of media used in the swimmer experiment. . . . . . . . . . . . . . . . . 25
4.1 Comparison of ground-truth and estimated values for the material parameters θ in our
experiment. (λ, µ): Lamé parameters. (γ_r, γ_m): Rayleigh damping and mass-proportional
damping constants. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.1 Values of user-chosen constants in our experiments. . . . . . . . . . . . . . . . . . . . . . 57
5.2 MPC tracking errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
viii
List of Figures
3.1 Top: Snapshots from a few seconds of the locomotion trajectories generated by the QP-based
model-predictive controller (first row) and the neural network controller imitating the QP
solver (second row) in our differentiable simulator running the Laikago quadruped. Bottom:
model-predictive control using the neural network (third row) and OSQP (fourth row)
running on a real Unitree A1 robot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 Left: comparison of various model architectures (cf. Anurag et al. [2]). Right: augmentation
of differentiable simulators with our proposed neural scalar type where variable e becomes a
combination of an analytical model φ(·,·) with inputs a and b, and a neural network whose
inputs are a, c, and d. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Exemplar architecture of our hybrid simulator where a neural network learns passive forces τ
given joint velocities q̇ (MLP biases and joint positions q are not shown). . . . . . . . . . 18
3.4 Comparison of system identification results from local optimization (left) and the parallel
basin hopping strategy (right), where the grid cells indicate the initial parameters from which
the optimization was started. The link lengths of a double pendulum are the two simulation
parameters to be estimated from joint position trajectories. The true parameters are 3 and 4
(indicated by a star), and the colors indicate the ℓ2 distance between the ground truth and the
parameters obtained after optimization (darker shade indicates lower error). . . . . . . . . . 20
3.5 Runtime comparison of finite differences (blue) against the various automatic differentiation
frameworks supported by our physics engine. Note the log scale for the time (in seconds). . 22
3.6 Left: example setup for the push experiment (Section 3.5.1) where the rounded object is
pushed by the black cylindrical tip at a constant velocity on a plywood surface. Without
tuning the analytical physics engine, a deviation between the pose of the object in the real
world (green) and in the simulator (metallic) becomes apparent. Right: predefined contact
points (red dots) for one of the shapes in the push dataset. . . . . . . . . . . . . . . . . . . . 23
3.7 Trajectories of pushing an ellipsoid object from our hybrid simulator (blue) and the
non-augmented rigid-body simulation (orange) compared against the real-world ground truth
from the MCube push dataset [164]. The shaded ellipsoids indicate the final object poses. . . 24
ix
3.8 Left: Model of a swimmer in MuJoCo with five revolute joints connecting its links simulated
as capsules interacting with ambient fluid. Right: Evolution of the cost function during the
training of the neural-augmented simulator for the swimmer in different media. . . . . . . . 24
3.9 Trajectory of positions q and velocities q̇ of a swimmer in water at 25 °C temperature
actuated by random control inputs. In a low-data regime, our approach (orange) generates
accurate predictions, even past the training cutoff (vertical black bar), while an entirely
data-driven model (green) regresses to the mean. The behavior of the unaugmented analytical
model, which corresponds to a swimmer in vacuum, is shown for comparison. . . . . . . . . 25
3.10 Left: Neural augmentation for a double pendulum that learns joint damping. Given the fully
connected neural network (top), after 15 optimization steps using the sparse group lasso
cost function (Equation 3.2) (bottom), the input layer (light green) is sparsified to only the
relevant quantities that influence the dynamics of the observed system. Right: Final mean
squared error (MSE) on test data after training a feed-forward neural network and our hybrid
physics engine. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.1 A motivating tracking task, where a robot controller needs to manipulate a deformable object
along a given trajectory, while predicting and controlling for the passive dynamic objects. . . 31
4.2 An example state trajectory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.3 Structure of the matrices in the computation of ∂ c
∂x. Left: The overall block-diagonal
structure of ∂ c
∂x. The first block column is the parameter Jacobians ∂ ct
∂ θ . The rest of
the matrix is a block structure with a diagonal and two subdiagonals, coming from the time
differencing in the calculation of vt and at
. Right: An example symmetric sparsity pattern
of one of the diagonal subblocks. This sparsity pattern comes from the connectivity of the
nodes in the Finite Element Method (FEM) mesh. . . . . . . . . . . . . . . . . . . . . . . 41
4.4 A motivating testbed problem setup for our proposed method. Left: Franka Emika Panda arm
rigidly grasping a foam cylinder, with attached Vicon markers. Right: Visualization of a
simulated FEM mesh used for the methods described in this paper. . . . . . . . . . . . . . . 43
5.1 Physical testbed for our method. Trajectories are tracked by a Vicon motion capture marker
attached to the end of a pool noodle, rigidly held by a Franka Emika Panda robot. Pitch/yaw
inputs u¯ = (φ,ψ), tracking point measurement y¯, and coordinate axes (X,Y,Z) shown. (The
fiducial marker visible in the image is not used.) . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2 Diagram of our system. A small dataset of control inputs and tracked marker locations is
obtained from the real system. A recurrent neural network (RNN) model is trained to predict
input/output behavior with a latent state. The RNN forms the nonlinear dynamics model for
an extended Kalman filter (EKF) and model-predictive controller (MPC) to track a dynamic
input trajectory with the deformable object. . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.3 Frequency-domain gain and phase response (Bode plots) for real pool noodle and LSTM model. 58
5.4 Two-dimensional projections of paths traced by pool noodle free end in MPC tracking
experiments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
x
5.5 Traces of rotation angle inputs (pitch, yaw) and horizontal and vertical components of
pool noodle free end (y, z) for MPC tracking of the circle trajectory (see Figure 5.4). The
closed-loop variant of our method shows superior tracking performance. . . . . . . . . . . 60
6.1 Simulation of a Franka Emika Panda robot with a rigidly attached scoop attachment,
interacting with a bed of 50000 particles, running at realtime on a single NVIDIA GeForce
RTX 3080 Ti with a simulated timestep of 5×10⁻⁴. Our simulation engine approximates
granular material state and dynamics as a system of rigidly interacting spherical particles,
which may interact with kinematically driven rigid bodies of arbitrary geometry. . . . . . . 63
6.2 Graphical representation of a contact frame between two objects (red and blue). The contact
normal is aligned with the frame’s e1-axis, while the e2 and e3 axes are an arbitrary basis for
the frictional force plane. A cone representing the bounds of dry Coulomb friction is shown
in yellow. ∆v* shows a candidate impulse, and the dashed line shows its projection onto the
friction cone via the solution of the contact Nonlinear Complementarity Problem (NCP)
described in Section 6.3.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.3 Plot of speedup over realtime against hash table size (higher numbers are better). To
maximize the number of collision candidates, we simulate particles at rest inside a tall
cylindrical column. The vertical lines mark the number of particles used in the performance
plot of the corresponding style. Based on the shape of these performance curves, we propose
a heuristic hash table size of twice the number of particles simulated. . . . . . . . . . . . . 68
6.4 Graphical representation of the spatial hashmap data structure for fast, parallel broadphase
collision checking. Physical coordinates are divided into cells of size c, each with a hash
value h computed from the cell’s location. A linked hash table, shown in the lower half of
the figure, is constructed in parallel according to Algorithm 2, with one thread for each
particle in the scene. This data structure generalizes to any number of dimensions; we show a
two-dimensional representation for clarity. . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.5 Graphical illustration of the execution paths for warp divergence during collision detection
during a single timestep of simulation. Left: Execution pattern of a single fused loop, where
all threads in a warp must wait for even a single impulse to be computed. Performance of this
approach is shown in Figure 6.6 as ONELOOP. Right: Execution pattern from precomputing
all collision checks and storing a list of impulses to be processed and then processing
them later. Note that this allows impulses from different collision checks to be computed
simultaneously. Performance of this approach is shown in Figure 6.6 as TWOLOOPSFUSED
and TWOLOOPSSPLIT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.6 Speedup vs. realtime (higher numbers are better) for different orderings for processing
contacts and collisions. To maximize the number of collision candidates, we simulate
particles at rest inside a tall cylindrical column. ONELOOP shows a naive implementation,
while TWOLOOPSFUSED shows the performance of splitting collision and contact processing
but keeping them in a single kernel. TWOLOOPSSPLIT shows the performance of splitting
collision and contact processing across kernel calls. . . . . . . . . . . . . . . . . . . . . . . 72
xi
6.7 On the left is an image taken from the bulldozer simulation environment with 50000 granular
material particles simulated at 1kHz, running in real-time on an NVIDIA RTX 3080 Ti
Graphics Processing Unit (GPU). Visible here are particles interacting with the non-convex
geometry of a toy bulldozer model, as well as mounding effects from stable interparticle
contacts with dry Coulomb friction. In the top right is displayed a 36-by-36 pixel “ego view”
perspective camera showing a depth image from the cockpit of the bulldozer, while on the
bottom right is a 72-by-36 pixel depth image taken from a “sky view” orthographic camera.
Combined with the lateral position and orientation of the bulldozer, these images are used as
the observation space for a reinforcement learning environment. . . . . . . . . . . . . . . . 74
6.8 A plot of agent total episode return (averaged over the 100 most recent episodes) against
wall clock time, using the Twin Delayed Deep Deterministic Policy Gradient (TD3)
algorithm. The simulation environment, reward structure, and computer hardware used in
this experiment are described in depth in Section 6.4.1. . . . . . . . . . . . . . . . . . . . . 75
6.9 Plots of speedup vs. realtime (higher numbers are better) against the number of rigid bodies
in the scene (left) and the number of particles simulated (right, note log x axis). The
benchmark environment simulated is described in Section 6.4.2. . . . . . . . . . . . . . . . 76
6.10 Image of the helical gear benchmark setup that we use to test performance as it scales with
the number of bodies and the number of particles. Particles are colored according to their
z-axis coordinate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
xii
Abstract
The development and application of capable robot manipulators require advances in world modeling and
simulation. This thesis provides a comprehensive overview of our work in simulation methodologies, covering
diverse physical phenomena and target applications, and with augmented structures that make associated
computation tractable.
First, we present work in rigid body dynamics with contact, augmented for fast computation of first
and second order derivatives and probability distributions, and present applications to parameter estimation
and control. We then present work in soft object modeling and control, first by using differentiable solid
mechanics for constrained parameter estimation and then by machine-learning-based predictive control.
Finally, we present work on GPU-accelerated parallel Discrete Element Methods (DEM) and their applicability
to the challenges of robotic sampling and excavation. To demonstrate the translational utility of this work, we
present results from several of these methodologies on real data or on real hardware.
xiii
Chapter 1
Introduction
Any robot that must react to its environment must have a model of that environment. These models most
commonly take the form of an ordinary or partial differential equation describing the way that the state of
the robot-environment system changes over time, perhaps in response to inputs from the robot. Since the
earliest days of science, such models have been expressed as low-dimensional equations of motion through
the language of physics, with parameters determined through experimentation. More recently, these models
have gained more and more parameters, which have been fit from more and more data, as is common in
modern deep learning.
This thesis relays our work in several areas of simulation that are of interest for robotics, particularly in
the domain of robotic manipulation. Early robotic manipulators followed fixed paths, and largely existed to
move tools to predefined locations in Cartesian space. With the advent of more capable sensing technology
and computer vision algorithms, robot manipulators took on tasks that were more reactive, such as block
stacking and tabletop rearrangement. Historically, most such tasks were fully actuated in the dynamics of the
object under manipulation, which were furthermore largely rigid. This work addresses tasks which are not
so neatly described: rigid objects moving through fluid or sliding on rough surfaces, deformable objects, or
large beds of granular materials. We present methods for the efficient computation of the dynamics of these
environments.
1
Simulators alone can serve as a useful tool for testing algorithms in safe environments, or for collecting
simulated data for reinforcement learning experiments. By augmenting simulators with the efficient computation of associated quantities of interest, like first- and second-order derivative information, or probability
distributions, we can apply them to optimal control, motion planning, and parameter estimation problems.
This thesis covers several contributions to the methodology of augmented simulation for robotic manipulation:
1. We introduce a method for computing fast rigid body dynamics with dry, frictional, Coulomb contact,
with fast first and second derivative information. Rigid body dynamics and Coulomb friction are
both simplifying approximations, and may leave a large gap between a real-world system and its
associated simulated dynamics. To this end, we present a novel methodology for inserting learned
function approximators, in our case using neural networks, into the internal workings of a rigid body
simulator. We show that our approach yields results which are more interpretable and significantly
more data efficient than other neural machine learning techniques, while allowing for a much smaller
simulation-to-reality gap than rigid body dynamics alone. We showcase this work experimentally on
simulated viscous friction tasks, real-world pushing tasks with anisotropic friction, and a real-world
quadruped control task.
2. We introduce a differentiable solid mechanics simulator, which solves visco-hyperelastic dynamics
formulated as a partial differential equation using the Finite Element Method, and describe an efficient
augmented Lagrangian technique which leverages time series and physical stiffness structure for fast
optimization of a parameter estimation problem. We showcase this problem on a simulation of a
flexible foam beam, and present results on reidentification of simulation problems.
3. We develop and present a method for fast optimal control of deformable objects using a machine
learned model. We leverage the smoothness properties of the Long Short-Term Memory architecture
2
to compute the relevant derivatives required for an extended Kalman filter state observer, and use the
estimated state as an initial rollout state in a model predictive controller. By leveraging existing
optimized routines for recurrent model derivative calculation, we are able to control deformable beams
in real time to track very fast trajectories through free space in front of a robotic testbed.
4. We introduce a simulation engine for the large scale computation of the dynamics of large beds
of granular materials. We showcase several physical model reductions which make the problem
computationally tractable, describe related data structures, and cover the mathematical formulations
underpinning the mechanics of rigid body contact, particularly when computed with implicit integration
schemes. We demonstrate this simulator on several downstream tasks, including a reinforcement
learning agent training scenario, and a robotic excavation task formulated as a model predictive action
choice and executed on real hardware.
3
Chapter 2
Simulation Preliminaries
This chapter will give a brief overview of the mathematical preliminaries of simulation as they pertain
to robotic manipulation. We will cover some of the mathematical techniques that are commonly used
in simulation and control, but will generally omit proofs and derivations, instead directing readers to
established texts and references. We will also cover some of the common simulation techniques used in
robotics, and discuss some of the challenges associated with them.
2.1 Mathematical Preliminaries
All of the simulation techniques discussed in this thesis are based on mathematical models of the physical
world. In general these models are represented as ordinary or partial differential equations, which describe
the way that the state of the robot-environment system changes over time, perhaps in response to inputs from
the robot. Since the earliest days of science, such models have been expressed as low-dimensional equations
of motion through the language of physics, with parameters determined through experimentation. More
recently, these models have gained more and more parameters, which have been fit from more and more data,
as is common in modern deep learning.
4
2.2 Resources
The techniques presented in this chapter are generally not novel, nor is this section meant to be exhaustive.
Three texts have proven most useful as resources to develop simulators. A high quality introductory text on
robot manipulator dynamics is "Modern Robotics: Mechanics, Planning, and Control", by Lynch and Park
[99]. A more complete, though occasionally more involved treatment is given in "Rigid Body Dynamics
Algorithms" by Roy Featherstone [43], which provides a comprehensive and understandable overview of
the mathematical tools used to simulate rigid body systems, as well as complete derivations of these
algorithms from physical principles. Finally, while not a formal text, the MuJoCo documentation, available
at https://mujoco.readthedocs.io/en/latest/computation/index.html, provides a wealth of information on
realistic nuances of simulation, including contact models, actuation models, transmissions, joint armatures,
and friction models.
2.2.1 Ordinary Differential Equations
Ordinary differential equations (ODEs) are equations which describe the relationship between a function
and its derivatives. They are commonly used to describe the dynamics of a system, and are often used in
simulation and control. The most common form of ODEs are first and second order, which describe the
relationship between a function and its first or second derivative. For example, the equation of motion for a
simple pendulum is a second order ODE:
\ddot{q} = -\frac{g}{l} \sin(q) \qquad (2.1)
where q is the angle of the pendulum, g is the acceleration due to gravity, and l is the length of the
pendulum. This equation describes the acceleration of the pendulum as a function of its current angle. The
solution to this equation is a function of time, q(t), which describes the angle of the pendulum at any given
time. This function can be found analytically, but often it is more convenient to solve the equation numerically,
which is discussed in Section 2.3. Indeed, many ODEs arising in robotics cannot be solved analytically, and
must be solved numerically.
It is worth noting that the pendulum could be analyzed in another way, in so called “maximal coordinates”,
where we use state variables x and y to represent the respective coordinates of the bob at end of the pendulum.
In the general case of robot dynamics in three dimensions, where each body in a robot has six degrees
of freedom of motion but is constrained often to a single hinge or prismatic subspace, it is almost always
more efficient to represent motion with so-called “Lagrangian” or “generalized” coordinates, which assign
one coordinate to each remaining degree of freedom. Maximal coordinates may occasionally be a proper
choice for analysis, particularly in constrained dynamical systems or systems with particular control goals
involving contacts. They may also be an appealing choice for parallelizable dynamics algorithms on large
scale computer hardware, as in [101].
2.2.2 Manipulator Dynamics
Analysis in [43] brings us to the famous “manipulator equation”, the equivalent of f = ma for open-chain
serial mechanisms
M(q)\ddot{q} + C(q, \dot{q})\dot{q} + g(q) = \tau \qquad (2.2)
where M is the inertia matrix, C is the Coriolis matrix, g is the gravity vector, and τ is the vector of joint
torques. The inertia matrix is a function of the joint angles q, and is defined as:
M(q) = \sum_{i=1}^{n} m_i J_{v_i}(q)^T J_{v_i}(q) \qquad (2.3)
This information, coupled with the fact that τ(q̈, q̇, q) is easily computed using the Recursive Newton-Euler
Algorithm, is sufficient to simulate the dynamics of a robot, by inverting the inertia matrix and solving
for q̈, which can then be integrated to find q̇ and q.
The manipulator equation is a second order ODE, which relates the joint accelerations q¨ and velocities q˙
to the joint torques τ, and is fundamental to robot dynamics.
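As a concrete illustration, the following sketch evaluates the manipulator equation for a single-link pendulum, where every term is a scalar and the Coriolis matrix vanishes; the function name and default parameter values are illustrative rather than drawn from any particular simulator.

```python
import math

def pendulum_forward_dynamics(q, qd, tau, m=1.0, l=1.0, grav=9.81):
    """Solve M(q) qdd + C(q, qd) qd + g(q) = tau for qdd, specialized
    to a single-link pendulum so that every term is a scalar."""
    M = m * l * l                     # inertia "matrix" for one joint
    C = 0.0                           # no Coriolis/centrifugal coupling
    g = m * grav * l * math.sin(q)    # gravity torque about the pivot
    return (tau - C * qd - g) / M     # qdd = M^{-1}(tau - C qd - g)
```

For articulated chains with many links, M becomes a dense matrix and the same solve is performed with a linear system rather than a scalar division.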
2.3 Numerical Integration of ODEs
Only a small number of ODEs can be solved analytically, and many of the ODEs arising in robotics cannot be
solved as such. Instead, numerical methods are used to approximate the solution to the ODE. Fundamentally,
these methods rely on Taylor series expansions of the function in question. The simplest numerical integration
method is the explicit Euler method, which uses only the first term in the Taylor series, evaluated at the
current time step:
y_{n+1} = y_n + h f(t_n, y_n) \qquad (2.4)
Since many ODEs are second order, it is convenient to “lower” the system by introducing velocity states.
For example,
\dot{q} = v, \qquad \dot{v} = M(q)^{-1}\left(\tau - C(q, v)v - g(q)\right) \qquad (2.5)
which may then be integrated using any first order method, such as the explicit Euler method. The
explicit Euler method, however, numerically adds energy to the system through the poorly approximated term
h f(t_n, y_n), resulting in (often) catastrophic instability. A more stable method is the implicit Euler method,
which solves for the next state using the current state and the next derivative:
y_{n+1} = y_n + h f(t_{n+1}, y_{n+1}) \qquad (2.6)
which can be solved using a Newton-Raphson method with the residual r(y_{n+1}) = y_{n+1} - y_n - h f(t_{n+1}, y_{n+1})
and the Jacobian J(y_{n+1}) = I - h \frac{\partial f}{\partial y}(t_{n+1}, y_{n+1}). This method is computationally more
expensive, but is unconditionally stable, and is therefore often used in simulation. This method, however, will numerically
damp the system through underestimation of the derivative term f(t_{n+1}, y_{n+1}), especially for stiff systems or large
time steps, and is therefore not always appropriate for simulation.
A method which is both stable and preserves energy is the symplectic Euler method, which integrates the
velocity state first, and then the position state:
v_{n+1} = v_n + h\, M(q_n)^{-1}\left(\tau_n - C(q_n, v_n)v_n - g(q_n)\right)
q_{n+1} = q_n + h\, v_{n+1} \qquad (2.7)
The MuJoCo documentation [121] provides a more complete overview of the relevant numerical integration
methods, including the Runge-Kutta methods and the “Implicit-in-Velocity” variants, which are hybrid
implicit-explicit (IMEX) methods and are superior for most realistic robotics simulations. These
methods, however, are not used in this thesis.
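The qualitative difference between the explicit and symplectic variants is easy to reproduce on the pendulum of Equation 2.1. The following sketch, with illustrative parameter values, integrates both schemes for fifty seconds and tracks the total mechanical energy; the explicit method gains energy while the symplectic method stays near the initial value.

```python
import math

def step_explicit(q, v, h, g=9.81, l=1.0):
    # Explicit Euler: both states advanced with derivatives at time t_n.
    return q + h * v, v - h * (g / l) * math.sin(q)

def step_symplectic(q, v, h, g=9.81, l=1.0):
    # Symplectic Euler: update the velocity first, then use the *new*
    # velocity to update the position.
    v_new = v - h * (g / l) * math.sin(q)
    return q + h * v_new, v_new

def energy(q, v, g=9.81, l=1.0, m=1.0):
    # Total mechanical energy of the pendulum (zero at the bottom).
    return 0.5 * m * (l * v) ** 2 + m * g * l * (1.0 - math.cos(q))

q_e, v_e = 1.0, 0.0
q_s, v_s = 1.0, 0.0
e0 = energy(1.0, 0.0)
for _ in range(5000):              # 50 seconds at h = 0.01
    q_e, v_e = step_explicit(q_e, v_e, 0.01)
    q_s, v_s = step_symplectic(q_s, v_s, 0.01)
```

Running this, the explicit trajectory visibly spins up while the symplectic one keeps oscillating at roughly its initial amplitude.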
2.3.1 Contact Dynamics
Robots interact meaningfully with their environment through contact: manipulators through their grippers,
mobile robots through their wheels, and legged robots through their feet. Contact is a very complex physical
phenomenon, and there are large swaths of mechanical engineering and physics literature devoted to its
study. Roboticists have, to this point, used extremely simplified models for contact, primarily consisting of
point-wise perfectly rigid contact with approximated or exact Coulomb friction cones. These models are used
for computational efficiency, and are a reasonable approximation given the inherent uncertainty of sensor
measurements and the difficulty of system identification.
Contact is a non-smooth phenomenon, and therefore cannot be modeled using ODEs. Instead, contact is
modeled using a complementarity problem, which is solved at each time step. Importantly, the complementarity problem is specified in terms of impulses, the integrals of forces over time, which are used to update
the velocity of the system directly. Complementarity problems are quite interesting and sometimes taken
for granted in related literature, so we will take a moment to discuss them here. A very useful reference for
complementarity problems is [33].
In general, a linear complementarity problem (LCP), given a matrix M and a vector q, finds a vector z
such that
Mz + q \geq 0, \qquad z \geq 0, \qquad z^T (Mz + q) = 0 \qquad (2.8)
where z is the solution to the LCP. The third condition is called the complementarity condition. Combined
with the non-negativity constraints, it implies that if z_i > 0, then (Mz + q)_i = 0, and if (Mz + q)_i > 0,
then z_i = 0, i.e. the vectors Mz + q and z are complementary.
The LCP can be reformulated as a quadratic program (QP)
\min_{z} \;\; \tfrac{1}{2} z^T M z + q^T z \qquad \text{subject to } z \geq 0 \qquad (2.9)
Although it looks like we have dropped the complementarity condition, it is actually implied by the
non-negativity constraint and the Karush-Kuhn-Tucker (KKT) conditions of constrained optimization. For
this problem, the Lagrangian is
\mathcal{L}(z, \lambda) = \tfrac{1}{2} z^T M z + q^T z - \lambda^T z \qquad (2.10)
and the KKT conditions are
\frac{\partial \mathcal{L}}{\partial z} = Mz + q - \lambda = 0 \quad \text{(Stationarity)}
z \geq 0 \quad \text{(Primal Feasibility)}
\lambda \geq 0 \quad \text{(Dual Feasibility)}
\lambda^T z = 0 \quad \text{(Complementary Slackness)}
\qquad (2.11)
Solving the stationarity condition for λ yields λ = Mz+q, which, when combined with the complementary slackness condition, yields the complementarity condition.
In practice, LCPs are solved using a variety of methods. A popular family is the class of projected
operator splitting methods. Operator splitting methods are an iterative class of linear solvers, and include
the Jacobi, Gauss-Seidel, and Successive Over-Relaxation methods.
Since projected Gauss-Seidel (PGS) is likely the most popular method for physics simulators at the time
of this writing, we will discuss it here.
The Gauss-Seidel method for solving Ax = b splits the matrix A into a strictly lower triangular matrix L, a
strictly upper triangular matrix U, and a diagonal matrix D, such that A = L + D + U. The method then
iteratively solves the system (L + D)x^{k+1} = b - Ux^k for x^{k+1}, where x^k is the kth iterate of the
solution. The method takes advantage of the computational ease of solving triangular systems, and is
guaranteed to converge if A is strictly diagonally dominant, i.e. |a_{ii}| > \sum_{j \neq i} |a_{ij}| for all i.
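The sweep above can be sketched in a few lines of Python; matrix values and iteration count here are illustrative.

```python
def gauss_seidel(A, b, iters=50):
    """Solve A x = b by Gauss-Seidel sweeps: each pass solves row i for
    x_i using the most recently updated values of the other components,
    which is exactly the triangular solve (L + D) x^{k+1} = b - U x^k."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x
```

For a diagonally dominant system such as A = [[4, 1], [1, 3]], b = [1, 2], a few dozen sweeps recover the exact solution (1/11, 7/11) to machine precision.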
For a given LCP(M, q), the PGS method splits M into a diagonal matrix D, a strictly lower triangular matrix L,
and a strictly upper triangular matrix U, such that M = L + D + U. The method then solves a series of problems of
the form x^{k+1} = LCP((L + U)x^k + q, D). Since D is diagonal and M is diagonally dominant (in fact, M is
positive definite in contact dynamics problems), the iterates can be solved in closed form,

x_i^{k+1} = \max\left(0,\; x_i^k - \left[D^{-1}\left(q + M x^k\right)\right]_i\right) \qquad (2.12)
An analysis and proof of correctness for this algorithm is given in [33].
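A minimal sketch of the projected sweep follows; this is a simplified illustration of the iteration, not the implementation used in later chapters.

```python
def pgs_lcp(M, q, iters=100):
    """Projected Gauss-Seidel for LCP(M, q): find z >= 0 with
    M z + q >= 0 and z^T (M z + q) = 0. Each sweep solves row i for
    z_i (holding the other components fixed) and projects the result
    onto the non-negative orthant."""
    n = len(q)
    z = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            r = q[i] + sum(M[i][j] * z[j] for j in range(n) if j != i)
            z[i] = max(0.0, -r / M[i][i])
    return z
```

On a small diagonal example, M = diag(2, 2) and q = (-2, 1), the solver returns z = (1, 0): the first "contact" is active with zero residual, the second is inactive with positive residual, satisfying all three LCP conditions.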
These methods are applied to contact dynamics because complementarity is key to the mechanics of
contact: two objects may either have a contact force between them, or they may be moving away from each
other, but never both. Restated, if there is ever a positive separation velocity, then the contact is already
breaking, and the contact force is zero. To wrangle the problem into a form that can be solved by the PGS
method, the impulse problem is formulated in a combined set of contact coordinate frames, and the contact
impulses are solved for in this contact frame. Solving for contact impulse \Delta v, with Jacobian J, and
contact velocity v, the complementarity problem is to find \Delta v such that
c = J M^{-1} J^T \Delta v + J\left(v + M^{-1} h F_{\text{ext}}\right) \geq 0
\Delta v \geq 0
\Delta v^T c = 0
\left\lVert \operatorname{proj}_{e_2 e_3} \Delta v \right\rVert \leq \mu\, \Delta v_1
\qquad (2.13)
where F_{ext} is the external force vector, h is the time step, and M is the familiar mass matrix. The final
constraint enforces the Coulomb friction cone for friction coefficient μ, given that the contact frame is
oriented so that e_1 points along the contact normal. This constraint is also nonlinear, although it is
still amenable to continued projection through the PGS method (though there is no analytical guarantee of
convergence, for physically realistic problems it converges quickly). Chapter 6 discusses a variant of
PGS called Projected Jacobi, which is more amenable to parallelization. Empirical comparisons of various
approximations of contact dynamics, including a discretization of the Coulomb cone into a frictional pyramid,
are given in [69].
2.3.2 Partial Differential Equations
Partial differential equations (PDEs) are equations which describe the relationship between a function and its
partial derivatives. They are commonly associated with problems in fluid dynamics, electromagnetism, heat
transfer, and solid mechanics. They are also used in simulation and control, and are particularly useful for
describing the dynamics of deformable objects. The most common form of PDEs are first and second order,
which describe the relationship between a function and its first or second partial derivative. For example, the
equation of motion for a vibrating string is a second order PDE, often also called the wave equation:
\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2} \qquad (2.14)
where u(x,t) is the displacement of the string at position x and time t, and c is the wave speed. This
equation describes the acceleration of the string as a function of its current displacement. The solution to this
equation is a function of time and space, u(x,t), which describes the displacement of the string at any given
time and position. Note that this equation is indeed a partial differential equation, as it relates the function u
to its partial derivatives with respect to both x and t. PDEs for the behavior and deformation of hyperelastic
deformable objects are often solved numerically by discretizing the domain of the function through the Finite
Element Method and solving the resulting system of ODEs, which is discussed heavily in Chapter 4.
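As a small illustration of such a discretization, the following sketch solves the wave equation (2.14) on a fixed-end string using a central-difference (leapfrog) scheme in both space and time; the grid resolution and time step are illustrative and chosen to satisfy the stability (CFL) condition.

```python
import math

def simulate_string(n=50, steps=200, c=1.0, dt=0.01):
    """Leapfrog finite-difference scheme for u_tt = c^2 u_xx on [0, 1]
    with fixed ends, started from the standing wave u(x, 0) = sin(pi x)
    at rest. Returns the displacement after steps * dt seconds."""
    h = 1.0 / n
    lam2 = (c * dt / h) ** 2    # squared Courant number; must be <= 1
    u_prev = [math.sin(math.pi * i * h) for i in range(n + 1)]
    # Second-order accurate first step for a string released from rest.
    u_cur = [0.0] * (n + 1)
    for i in range(1, n):
        u_cur[i] = u_prev[i] + 0.5 * lam2 * (
            u_prev[i + 1] - 2 * u_prev[i] + u_prev[i - 1])
    for _ in range(steps - 1):
        u_next = [0.0] * (n + 1)
        for i in range(1, n):
            u_next[i] = (2 * u_cur[i] - u_prev[i]
                         + lam2 * (u_cur[i + 1] - 2 * u_cur[i] + u_cur[i - 1]))
        u_prev, u_cur = u_cur, u_next
    return u_cur
```

With the defaults, the simulation runs for exactly one temporal period of the fundamental mode, so the final displacement should nearly coincide with the initial condition, up to small dispersion error.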
Chapter 3
NeuralSim: Augmenting Differentiable Simulators with Neural Networks
3.1 Abstract
Differentiable simulators provide an avenue for closing the sim-to-real gap by enabling the use of efficient,
gradient-based optimization algorithms to find the simulation parameters that best fit the observed sensor
readings. Nonetheless, these analytical models can only predict the dynamical behavior of systems for which
they have been designed. In this work, we study the augmentation of a novel differentiable rigid-body physics
engine via neural networks that is able to learn nonlinear relationships between dynamic quantities and can
thus model effects not accounted for in traditional simulators. Such augmentations require less data to train and
generalize better compared to entirely data-driven models. Through extensive experiments, we demonstrate
the ability of our hybrid simulator to learn complex dynamics involving frictional contacts from real data, as
well as match known models of viscous friction, and present an approach for automatically discovering useful
augmentations. We show that, besides benefiting dynamics modeling, inserting neural networks can accelerate
model-based control architectures. We observe a ten-fold speed-up when replacing the QP solver inside a
model-predictive gait controller for quadruped robots with a neural network, allowing us to significantly
improve control delays as we demonstrate in real-hardware experiments. We publish code, additional results
and videos from our experiments on our project webpage at https://sites.google.com/usc.edu/neuralsim.
Figure 3.1: Top: Snapshots from a few seconds of the locomotion trajectories generated by the QP-based model-predictive controller
(first row) and the neural network controller imitating the QP solver (second row) in our differentiable simulator running the Laikago
quadruped. Bottom: model-predictive control using the neural network (third row) and OSQP (fourth row) running on a real Unitree
A1 robot.
3.2 Introduction
Physics engines enable the accurate prediction of dynamical behavior of systems for which an analytical
model has been implemented. Given such models, Bayesian estimation approaches [131] can be used to find
the simulation parameters that best fit the real-world observations of the given system. While the estimation
of simulation parameters for traditional simulators via black-box optimization or likelihood-free inference
algorithms often requires a large number of training data and model evaluations, differentiable simulators
allow the use of gradient-based optimizers that can significantly improve the speed of parameter inference
approaches [115].
Approach | Analytical | Data-driven | End-to-end ∇ | Hybrid
Physics engines [35, 154, 87] | X | | |
Residual physics [77, 2, 165, 51] | X | X | | X
Learned physics [136, 12, 91, 79] | | X | X |
Differentiable simulators [49, 71, 29, 9, 48, 126, 115] | X | | X |
Our approach | X | X | X | X

Table 3.1: Comparison of dynamics modeling approaches (selected works) along the axes of analytical and data-driven modeling,
end-to-end differentiability, and hybrid approaches.
Nonetheless, a sim-to-real gap remains for most systems we use in the real world. For example, in typical
simulations used in robotics, the robot is assumed to be an articulated mechanism with perfectly rigid links,
even though they actually bend under heavy loads. Motions are often optimized without accounting for air
resistance. In this work, we focus on overcoming such errors due to unmodeled effects by implementing a
differentiable rigid-body simulator that is enriched by fine-grained data-driven models.
Instead of learning the error correction between the simulated and the measured states, as is done in
residual physics models, our augmentations are embedded inside the physics engine. They depend on
physically meaningful quantities and actively influence the dynamics by modulating forces and other state
variables in the simulation. Thanks to the end-to-end differentiability of our simulator, we can efficiently
train such neural networks via gradient-based optimization.
Our experiments demonstrate that the hybrid simulation approach, where data-driven models augment an
analytical physics engine at a fine-grained level, outperforms deep learning baselines in training efficiency and
generalizability. The hybrid simulator is able to learn a model for the drag forces for a swimmer mechanism
in a viscous medium with orders of magnitude less data compared to a sequence learning model. When
unrolled beyond the time steps of the training regime, the augmented simulator significantly outperforms
the baselines in prediction accuracy. Such capability is further beneficial to closing the sim-to-real gap.
We present results for planar pushing where the contact friction model is enriched by neural networks to
closely match the observed motion of an object being pushed in the real world. Leveraging techniques from
sparse regression, we can identify which inputs to the data-driven models are the most relevant, presenting
an opportunity to discover where to insert such neural augmentations.
Besides reducing the modeling error or learning entirely unmodeled effects, introducing neural networks
to the computation pipeline can have computational benefits. In real-robot experiments on a Unitree A1
quadruped we show that when the quadratic programming (QP) solver of a model-predictive gait controller is
replaced by a neural network, the inference time can be reduced by an order of magnitude.
We release our novel, open-source physics engine, Tiny Differentiable Simulator (TDS)*, which implements articulated rigid-body dynamics and contact models, and supports various methods of automatic
differentiation to compute gradients w.r.t. any quantity inside the engine.
3.3 Related Work
Figure 3.2: Left: comparison of various model architectures (cf. Anurag et al. [2]). Right: augmentation of differentiable simulators
with our proposed neural scalar type where variable e becomes a combination of an analytical model φ(·,·) with inputs a and b, and
a neural network whose inputs are a, c, and d.
*Open-source release of TDS: https://github.com/google-research/tiny-differentiable-simulator
Various methods have been proposed to learn system dynamics from time series data of real systems
(see Table 3.1 and Figure 3.2 left). Early works include Locally Weighted Regression [138] and Forward
Models [111]. Modern “intuitive physics” models often use deep graph neural networks to discover constraints
between particles or bodies ([12, 162, 60, 129, 112, 31, 91, 137]). Physics-based machine learning approaches
introduce a physics-informed inductive bias to the problem of learning models for dynamical systems from
data [30, 96, 130, 97, 52, 147]. We propose a general-purpose hybrid simulation approach that combines
analytical models of dynamical systems with data-driven residual models that learn parts of the dynamics
unaccounted for by the analytical simulation models.
Originating from traditional physics engines (examples include [35, 154, 87]), differentiable simulators
have been introduced that leverage automatic, symbolic or implicit differentiation to calculate parameter
gradients through the analytical Newton-Euler equations and contact models for rigid-body dynamics [49, 29,
9, 84, 65, 64, 48, 88, 98]. Simulations for other physical processes, such as light propagation [120, 61] and
continuum mechanics [74, 92, 71, 126, 115] have been made differentiable as well.
Residual physics models [165, 2, 77, 51] augment physics engines with learned models to reduce the
sim-to-real gap. Most of them introduce residual learning applied from the outside to the predicted state
of the physics engine, while we propose a more fine-grained approach, similar to [77], where data-driven
models are introduced only at some parts in the simulator. While in [77] the network for actuator dynamics is
trained through supervised learning, our end-to-end differentiable model allows backpropagation of gradients
from high-level states to any part of the computation graph, including neural network weights, so that these
parameters can be optimized efficiently, for example from end-effector trajectories.
3.4 Approach
Our simulator implements the Articulated Body Algorithm (ABA) [43] to compute the forward dynamics
(FD) for articulated rigid-body mechanisms. Given joint positions q, velocities q˙, torques τ in generalized
Figure 3.3: Exemplar architecture of our hybrid simulator where a neural network learns passive forces τ given joint velocities q˙
(MLP biases and joint positions q are not shown).
coordinates, and external forces fext, ABA calculates the joint accelerations q¨. We use semi-implicit Euler
integration to advance the system dynamics in time. We support two types of contact models: an impulse-level
solver that uses a form of the Projected Gauss-Seidel algorithm [145, 69] to solve a nonlinear complementarity
problem (NCP) to compute contact impulses for multiple simultaneous points of contact, and a penalty-based
contact model that computes joint forces individually per contact point via the Hunt-Crossley [76] nonlinear
spring-damper system. In addition to the contact normal force, the NCP-based model resolves lateral friction
following Coulomb’s law, whereas the spring-based model implements a nonlinear friction force model
following Brown et al. [24].
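A sketch of one common parameterization of the Hunt-Crossley normal force follows; the constants and the exact damping form are illustrative assumptions, not the defaults used in TDS.

```python
def hunt_crossley_force(depth, depth_rate, k=1.0e4, n=1.5, damping=0.5):
    """Normal contact force from one common Hunt-Crossley
    parameterization: a Hertzian spring k * depth**n whose damping term
    scales with the same nonlinearity, so the force rises smoothly from
    zero at first touch rather than jumping. All constants illustrative."""
    if depth <= 0.0:
        return 0.0                     # bodies separated: no contact force
    f = k * depth ** n * (1.0 + damping * depth_rate)
    return max(0.0, f)                 # contact can push, never pull
```

Because the damping is multiplied by the spring term, the dissipative force also vanishes at zero penetration, avoiding the discontinuity of a plain linear spring-damper.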
Each link i in a rigid-body system has a mass value m_i, an inertia tensor I_i ∈ ℝ^{6×6}, and a center of mass
h_i ∈ ℝ^3. Each corresponding joint i with n_d degrees of freedom between link i and its parent λ(i) has a
spring stiffness coefficient k_i, a damping coefficient d_i, a transformation from its axis to its parent
^{λ(i)}X_i ∈ SE(3), and a motion subspace matrix S_i ∈ ℝ^{6×n_d}.
For problems involving n_c possible contacts, each possible contact pair j has a friction coefficient
μ_j ∈ ℝ describing the shape of the Coulomb friction cone. Additionally, to maintain contact constraints
during dynamics integration, we use Baumgarte stabilization [14] with parameters α_j, β_j ∈ ℝ. To soften the
shape of the contact NCP, we use Constraint Force Mixing (CFM), a regularization technique with parameters
R_j ∈ ℝ.
\theta_{AM} = \{m_i, I_i, h_i, {}^{\lambda(i)}X_i, S_i\}_i \cup \{\mu_j, \alpha_j, \beta_j, R_j\}_j
are the analytical model parameters of the system, which have an understandable and physically relevant
effect on the computed dynamics. Any of the parameters in θAM may be measurable with certainty a priori,
or can be optimized as part of a system identification problem according to some data-driven loss function.
With these parameters, we view the dynamics of the system as
\tau = H_{\theta_{AM}}(q)\,\ddot{q} + C_{\theta_{AM}}(q, \dot{q}) + g_{\theta_{AM}}(q) \qquad (3.1)
where HθAM computes the generalized inertia matrix of the system, CθAM computes nonlinear Coriolis and
centrifugal effects, and gθAM describes passive forces and forces due to gravity.
3.4.1 Hybrid Simulation
Even if the best-fitting analytical model parameters have been determined, rigid-body dynamics alone often
does not exactly predict the motion of mechanisms in the real world. To address this, we propose a technique
for hybrid simulation that leverages differentiable physics models and neural networks to allow the efficient
reduction of the simulation-to-reality gap. By enabling any part of the simulation to be replaced or augmented
by neural networks with parameters θNN, we can learn unmodeled effects from data which the analytical
models do not account for.
Our physics engine allows any variable participating in the simulation to be augmented by neural networks
that accept input connections from any other variable. In the simulation code, such neural scalars (Figure 3.2
right) are assigned a unique name, so that in a separate experiment code a “neural blueprint” is defined that
declares the neural network architectures and sets the network weights.
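The neural scalar mechanism can be sketched as follows; the class and the example friction law are simplified Python stand-ins for the C++ implementation in TDS, with illustrative coefficients.

```python
import math

class NeuralScalar:
    """A simulation variable whose value is an analytical model phi plus
    a learned correction (cf. Figure 3.2, right). The "network" here is
    any callable; in TDS it is an MLP whose architecture and weights are
    supplied by a separately declared neural blueprint."""

    def __init__(self, phi, net=None):
        self.phi = phi    # analytical model, e.g. a friction law
        self.net = net    # learned residual; None means purely analytical

    def __call__(self, analytical_inputs, net_inputs):
        value = self.phi(*analytical_inputs)
        if self.net is not None:
            value += self.net(*net_inputs)   # data-driven augmentation
        return value

# Hypothetical example: viscous joint friction tau = -d * qd, augmented
# by a correction that also observes the joint position.
passive_force = NeuralScalar(
    phi=lambda qd: -0.1 * qd,
    net=lambda q, qd: -0.05 * math.tanh(qd),  # stand-in for a trained MLP
)
```

Because the augmentation reads physically meaningful quantities and writes back into the force computation, its output is integrated by the same dynamics pipeline as the analytical terms.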
Figure 3.4: Comparison of system identification results from local optimization (left) and the parallel basin hopping strategy (right),
where the grid cells indicate the initial parameters from which the optimization was started. The link lengths of a double pendulum
are the two simulation parameters to be estimated from joint position trajectories. The true parameters are 3 and 4 (indicated by a
star), and the colors indicate the ℓ2 distance between the ground truth and the parameters obtained after optimization (darker shade
indicates lower error).
3.4.2 Overcoming Local Minima
For problems involving nonlinear or discontinuous systems, such as rigid-body mechanisms undergoing
contact, our strategy yields a highly nonlinear loss landscape fraught with local minima. As such, we
employ global optimization strategies, such as the parallel basin hopping strategy [105] and population-based
techniques from the Pagmo library [22]. These global approaches run a gradient-based optimizer, such
as L-BFGS in our case, locally in each individual thread. After each evolution of a population of local
optimizers, the best solutions are combined and mutated so that in the next evolution each optimizer starts
from a newly generated parameter vector. As can be seen in Figure 3.4, a system identification problem as
basic as estimating the link lengths of a double pendulum from a trajectory of joint positions, results in a
complex loss landscape where a purely local optimizer often only finds suboptimal parameters (left). Parallel
Basin Hopping (right), on the other hand, restarts multiple local optimizers with an initial value around the
currently best found solution, thereby overcoming poor local minima in the loss landscape.
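The structure of basin hopping can be sketched as follows. This single-threaded illustration uses plain gradient descent as the local optimizer, standing in for the L-BFGS instances run in parallel through Pagmo; the test problem is a hypothetical asymmetric double well rather than a simulation loss.

```python
import random

def basin_hopping(loss, grad, x0, hops=50, step=1.0, lr=0.01, iters=200, seed=0):
    """Single-threaded sketch of basin hopping: run a local
    gradient-based optimizer, then repeatedly restart it from random
    perturbations of the best parameters found so far."""
    rng = random.Random(seed)

    def local_opt(x):
        # Stand-in local optimizer (plain gradient descent).
        for _ in range(iters):
            x = [xi - lr * gi for xi, gi in zip(x, grad(x))]
        return x

    best = local_opt(list(x0))
    for _ in range(hops):
        start = [b + rng.gauss(0.0, step) for b in best]
        cand = local_opt(start)
        if loss(cand) < loss(best):   # keep only improving hops
            best = cand
    return best

# Hypothetical asymmetric double well: local minimum near x = 0.61,
# global minimum near x = -0.77. From x0 = 1, a purely local optimizer
# stalls in the shallow basin; the hopping restarts escape it.
loss = lambda x: x[0] ** 4 - x[0] ** 2 + 0.3 * x[0]
grad = lambda x: [4 * x[0] ** 3 - 2 * x[0] + 0.3]
best = basin_hopping(loss, grad, [1.0])
```

The parallel variant distributes the local optimizers across threads and mixes their best solutions between evolutions, but the accept-if-better hopping logic is the same.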
3.4.3 Implementation Details
We implement our physics engine Tiny Differentiable Simulator (TDS) in C++ while keeping the computations
agnostic to the implementation of linear algebra structures and operations. Via template meta-programming,
various math libraries, such as Eigen [53] and Enoki [78] are supported without requiring changes to the
experiment code. Mechanisms can be imported into TDS from Universal Robot Description Format (URDF)
files.
To compute gradients, we currently support the following third-party automatic differentiation (AD)
frameworks: the “Jet” implementation from the Ceres solver [1] that evaluates the partial derivatives for
multiple parameters in forward mode, as well as the libraries Stan Math [28], CppAD [15], and CppADCodeGen [86] that support both forward- and reverse-mode AD. We supply operators for taking gradients
through conditionals such that these constructs can be traced by the tape recording mechanism in CppAD.
We further exploit code generation not only to compile the gradient pass but to additionally speed up the
regular simulation and loss evaluation by compiling models to non-allocating C code.
In our benchmark shown in Figure 3.5, we compare the gradient computation runtimes (excluding
compilation time for CppADCodeGen) on evaluating a 175-parameter neural network, simulating 5000 steps
of a 5-link pendulum falling on the ground via the NCP contact model, and simulating 500 steps of a double
pendulum without contact. Similar to Giftthaler et al. [49], we observe orders of magnitude in speed-up by
leveraging CppADCodeGen to precompile the gradient pass of the cost function that rolls out the system
with the current parameters and computes the loss.
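Forward-mode AD of the kind implemented by the Ceres "Jet" type can be illustrated with a minimal dual-number class; this sketch supports only addition, multiplication, and sine, and tracks the derivative with respect to a single input.

```python
import math

class Dual:
    """A value together with its derivative w.r.t. one chosen input
    (a one-parameter forward-mode "Jet")."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def _wrap(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = self._wrap(other)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __mul__(self, other):
        o = self._wrap(other)
        # Product rule carries the derivative forward with the value.
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)

    __rmul__ = __mul__

    def sin(self):
        return Dual(math.sin(self.val), math.cos(self.val) * self.dot)

def d_dx(f, x):
    """Derivative of f at x: seed the dual part of the input with 1."""
    return f(Dual(x, 1.0)).dot
```

Evaluating a function on Dual inputs computes the value and derivative in a single pass, which is why forward mode is attractive when the number of parameters is small.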
3.5 Experimental Results
Figure 3.5: Runtime comparison of finite differences (blue) against the various automatic differentiation frameworks supported by
our physics engine. Note the log scale for the time (in seconds).
3.5.1 Friction Dynamics in Planar Pushing
Combining a differentiable physics engine with learned, data-driven models can help reduce the simulation-to-reality gap. Based on the Push Dataset released by the MCube lab [164], we demonstrate how an analytical
contact model can be augmented by neural networks to predict complex frictional contact dynamics.
The dataset contains a variety of planar shapes that are pushed by a robot arm equipped with a cylindrical
tip on different types of surface materials. Numerous trajectories of planar pushes with different velocities
and accelerations have been recorded, from which we select trajectories with constant velocity. We model the
contact between the tip and the object via the NCP-based contact solver (Section 3.4). For the ground, we use
a nonlinear spring-damper contact model that computes the contact forces at predefined contact points for
each of the shapes. We pre-processed each shape to add contact points along a uniform grid with a resolution
of 2 cm (Figure 3.6 right). Since the NCP-based contact model involves solving a nonlinear complementarity
problem, we found the force-level contact model easier to augment by neural networks in this scenario since
the normal and friction forces are computed independently for each contact point.
Figure 3.6: Left: example setup for the push experiment (Section 3.5.1) where the rounded object is pushed by the black cylindrical
tip at a constant velocity on a plywood surface. Without tuning the analytical physics engine, a deviation between the pose of the
object in the real world (green) and in the simulator (metallic) becomes apparent. Right: predefined contact points (red dots) for one
of the shapes in the push dataset.
As shown in Figure 3.6 (left), even when these two contact models and their analytical parameters (e.g.
friction coefficient, Baumgarte stabilization gains, etc.) have been tuned over a variety of demonstrated
trajectories, a significant error between the ground truth and simulated pose of the object remains. By
augmenting the lateral 2D friction force of the object being pushed, we achieve a significantly closer match
to the real-world ground truth (cf. Figure 3.7). The neural network inputs are given by the object’s current
pose, its velocity, as well as the normal and the penetration depth at the point of contact between the tip
and the object. Such contact-related variables are influenced by the friction force correction that the neural
network provides. Unique to our differentiable simulation approach is the ability to compute gradients for the
neural network weights from the trajectories of the object poses compared against the ground truth, since the
influence of the friction force on the object pose becomes apparent only after the forward dynamics equations
have been applied and integrated.
Figure 3.7: Trajectories of pushing an ellipsoid object from our hybrid simulator (blue) and the non-augmented rigid-body simulation
(orange) compared against the real-world ground truth from the MCube push dataset [164]. The shaded ellipsoids indicate the final
object poses.
Figure 3.8: Left: Model of a swimmer in MuJoCo with five revolute joints connecting its links simulated as capsules interacting
with ambient fluid. Right: Evolution of the cost function during the training of the neural-augmented simulator for the swimmer in
different media.
Figure 3.9: Trajectory of positions q and velocities q̇ of a swimmer in water at 25 °C, actuated by random control inputs.
In a low-data regime, our approach (orange) generates accurate predictions, even past the training cutoff (vertical black bar), while
an entirely data-driven model (green) regresses to the mean. The behavior of the unaugmented analytical model, which corresponds
to a swimmer in vacuum, is shown for comparison.
Medium          Dynamic viscosity µ [Pa·s]   Density ρ [kg/m³]
Vacuum          0                            0
Water (5 °C)    0.001518                     1000
Water (25 °C)   0.00089                      997
Air (25 °C)     0.00001849                   1.184

Table 3.2: Physical properties of media used in the swimmer experiment.
3.5.2 Passive Frictional Forces for Articulated Systems
While in the previous experiment an existing contact model has been enriched by neural networks to compute
more realistic friction forces, in this experiment we investigate how such augmentation can learn entirely new
phenomena which were not accounted for in the analytical model. Given simulated trajectories of a multi-link
robot swimming through an unidentified viscous medium (Figure 3.8), the augmentation is supposed to learn
a model for the passive forces that describe the effects of viscous friction and damping encountered in the
fluid. Since the passive forces exerted by a viscous medium are dependent on the angle of attack of each link,
as well as the velocity, we add a neural augmentation $\psi_\theta(q, \dot{q})$ to the generalized forces $\tau$, analogous to Figure 3.3.
Using our hybrid physics engine, and given a set of generalized control forces $u$, we integrate Equation 3.1
to form a trajectory $\hat{T} = \{\hat{q}_i, \hat{\dot{q}}_i\}_i$, and compute an MSE loss $L(T, \hat{T})$ against $T$, a ground-truth trajectory
of generalized states and velocities recorded using the MuJoCo [154] simulator. We are able to compute
exact gradients ∇θL end-to-end through the entire trajectory, given the differentiable nature of our simulator,
integrator, and function approximator.
Figure 3.10: Left: Neural augmentation for a double pendulum that learns joint damping. Given the fully connected neural network
(top), after 15 optimization steps using the sparse group lasso cost function (Equation 3.2) (bottom), the input layer (light green) is
sparsified to only the relevant quantities that influence the dynamics of the observed system. Right: Final mean squared error (MSE)
on test data after training a feed-forward neural network and our hybrid physics engine.
We train the hybrid simulator under different ground-truth inputs. We consider water at 5 °C and 25 °C,
plus air at 25 °C, as media, which have different properties for dynamic viscosity and density
(see Table 3.2).
Leveraging the mechanical structure of a problem improves long-term, physically relevant predictive power,
especially in the extremely sparse data regimes common in learning for robotics. As shown in Figure 3.9,
our approach, trained end-to-end on the initial 200 timesteps of 10 trajectories of a 2-link swimmer moving
in water at 25 °C, exhibits more accurate long-term rollout prediction over 900 timesteps than a deep long
short-term memory (LSTM) network. In this experiment, the augmentation networks for TDS have 1637
trainable parameters, while the LSTM has 1900.
3.5.3 Discovering Neural Augmentations
In our previous experiments we carefully selected which simulation quantities to augment by neural networks.
While it is expected that air resistance and friction result in passive forces that need to be applied to the
simulated mechanism, it is often difficult to foresee which dynamical quantities actually influence the
dynamics of the observed system. Since we aim to find minimally invasive data-driven models, we focus on
neural network augmentations that have the least possible effect while achieving the goal of high predictive
accuracy and generalizability. Inspired by Sparse Input Neural Networks (SPINN) [44], we adapt the
sparse-group lasso [142] penalty for the cost function:

$$L = \sum_t \left\| f_\theta(s_{t-1}) - s^*_t \right\|_2^2 + \kappa \left\| \theta_{[:1]} \right\|_1 + \lambda \left\| \theta_{[1:]} \right\|_2^2, \qquad (3.2)$$

given weighting coefficients $\kappa, \lambda > 0$, where $f_\theta(s_{t-1})$ is the state predicted by the hybrid simulator given
the previous simulation state $s_{t-1}$ (consisting of $q, \dot{q}$), $s^*_t$ is the state from the observed system, $\theta_{[:1]}$ are the
network weights of the first layer, and $\theta_{[1:]}$ are the weights of the upper layers. This cost function sparsifies
the input weights so that the augmentation depends on only a small subset of physical quantities, which
is desirable when simpler models are to be found that potentially overfit less to the training data while
maintaining high accuracy (Occam’s razor). In addition, the L2 loss on the weights from the upper network
layers penalizes the overall contribution of the neural augmentation to the predicted dynamics since we only
want to augment the simulation where the analytical model is unable to match the real system.
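Evaluating this penalty is straightforward; a minimal sketch (array shapes and names are illustrative, not taken from our implementation):

```python
import numpy as np

def sparse_group_lasso_loss(pred, target, W_first, W_upper, kappa, lam):
    # Eq. (3.2): prediction error + L1 on input-layer weights (sparsifies
    # which physical quantities the augmentation may depend on) + squared L2
    # on upper-layer weights (penalizes the overall size of the correction)
    mse = np.sum((pred - target) ** 2)
    l1_inputs = kappa * np.sum(np.abs(W_first))
    l2_upper = lam * np.sum(W_upper ** 2)
    return mse + l1_inputs + l2_upper
```

The L1 term drives entire input columns toward zero, which is what produces the sparsified input layer visible in Figure 3.10.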
We demonstrate the approach on a double pendulum system that is modeled as a frictionless mechanism
in our analytical physics engine. The reference system, on the other hand, is assumed to exhibit joint friction,
i.e., a force proportional to and opposed to the joint velocity. As expected, compared to a fully learned
dynamics model, our hybrid simulator outperforms the baselines significantly (Figure 3.10). As shown on the
right, convergence is achieved while requiring orders of magnitude less training samples. At the same time,
the resulting neural network augmentation converges to weights that clearly show that the joint velocity q˙i
influences the residual joint force τi
that explains the observed effects of joint friction in joint i (Figure 3.10
left).
3.5.4 Imitation Learning from MPC
Starting from a rigid-body simulation and augmenting it by neural networks has shown significant improvements in closing the sim-to-real gap and learning entirely new effects, such as viscous friction that has not
been modeled analytically. In our final experiment, we investigate how the use of learned models can benefit
a control pipeline for a quadruped robot. We simulate a Laikago quadruped robot and use a model-predictive
controller (MPC) to generate walking gaits. The MPC implements [81] and uses a quadratic programming
(QP) solver leveraging the OSQP library [144] that computes the desired contact forces at the feet, given
the current measured state of the robot (center of mass (COM), linear and angular velocity, COM estimated
position (roll, pitch and yaw), foot contact state as well as the desired COM position and velocity). The swing
feet are controlled using PD position control, while the stance feet are torque-controlled using the QP solver
to satisfy the motion constraints. This controller has been open-sourced as part of the quadruped animal
motion imitation project [124]†.
We train a neural network policy to imitate the QP solver, taking the same system state as inputs, and
predicting the contact forces at the feet which are provided to the IK solver and PD controller. As training
dataset we recorded 10k state transitions generated by the QP solver, using random linear and angular MPC
control targets of the motion of the torso. We choose a fully connected neural network with 41-dimensional
input and two hidden layers of 192 nodes that outputs the 12 joint torques, all using ELU activation [32].
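The policy architecture described above can be sketched as follows (the supervised training loop on the recorded QP transitions is omitted, and the weight initialization scheme is our assumption):

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU activation as in [32]
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def init_layer(rng, n_in, n_out):
    # hypothetical 1/sqrt(n_in) initialization; biases start at zero
    return rng.normal(0, 1.0 / np.sqrt(n_in), (n_out, n_in)), np.zeros(n_out)

def policy(params, state):
    # 41-dim state -> two hidden layers of 192 ELU units -> 12 joint torques,
    # matching the architecture described above
    h = state
    for W, b in params[:-1]:
        h = elu(W @ h + b)
    W, b = params[-1]
    return W @ h + b

rng = np.random.default_rng(0)
params = [init_layer(rng, 41, 192), init_layer(rng, 192, 192), init_layer(rng, 192, 12)]
out = policy(params, np.zeros(41))
```

Inference is three matrix-vector products, which is why the distilled policy runs roughly an order of magnitude faster than the QP solve it imitates.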
The neural network controller is able to produce extensive walking gaits in simulation for the Laikago
robot, and in-place trotting gaits on a real Unitree A1 quadruped (see Figure 3.1). The learned QP solver is
an order of magnitude faster than the original QP solver (0.2 ms inference time versus 2 ms on an AMD 3090
Ryzen CPU).
Thanks to improved performance, the control frequency increased from 160 Hz to 300 Hz. Since the model
was trained using data only acquired in simulation, there is some sim-to-real gap, causing the robot to trot
with a slightly lower torso height.
†The implementation of the QP-based MPC is available open-source at https://github.com/google-research/motion_imitation
In future work, we plan to leverage our end-to-end differentiable simulator
to fine-tune the policy while closing the simulation-to-reality gap from real-world sensor measurements.
Furthermore, the use of differentiable simulators, such as in [156], to train data-driven models with faster
inference times is an exciting direction to speed up conventional physics engines.
3.6 Conclusion
We presented a differentiable simulator for articulated rigid-body dynamics that allows the insertion of neural
networks at any point in the computation graph. Such neural augmentation can learn dynamical effects that
the analytical model does not account for, while requiring significantly less training data from the real system.
Our experiments demonstrate benefits for sim-to-real transfer, for example in planar pushing where, contrary
to the given model, the friction force depends not only on the object's velocity but also on its position on
the surface. Effects that were entirely absent from the rigid-body simulator, such as drag forces
due to fluid viscosity, can also be learned from demonstration. Throughout our experiments, the efficient
computation of gradients that our simulator facilitates has considerably sped up the training
process. Furthermore, replacing conventional components, such as the QP-based controller in our quadruped
locomotion experiments, has resulted in an order of magnitude computation speed-up.
In future work we plan to investigate how physically meaningful constraints, such as the law of conservation of energy, can be incorporated into the training of the neural networks. We will further our preliminary
investigation of discovering neural network connections and the viability of such techniques on real-world
systems, where hybrid simulation can play a key role in model-based control.
Acknowledgments
We thank Carolina Parada and Ken Caluwaerts for their helpful feedback and suggestions.
Chapter 4
Parameter Estimation for Deformable Objects in Robotic Manipulation Tasks
4.1 Abstract
We consider the problem of identifying material parameters of a deformable object, such as elastic moduli,
by non-destructive robotic manipulation. We assume known geometry and mass, a reliable fixed grasp, and
the ability to track the positions of a few points on the object surface. We collect a dataset of grasp pose
sequences and corresponding point position sequences. We represent the object by a tetrahedral FEM mesh
and optimize the material parameters to minimize the difference between the real and predicted observations.
We use a collocation-type formulation in which the sequence of FEM mesh states is a set of decision variables, and
the dynamics are encoded as constraints. Sparsity patterns in the constraints make this problem tractable
despite the large number of variables. Experiments show that our approach is computationally feasible and
able to adequately re-identify simulated material parameters.
4.2 Chapter Introduction
Robotic manipulation of deformable objects is a valuable and challenging domain. Many objects of human
necessity, like clothes, food, and bedding, move in ways which are dominated by their deformability. Systems
Figure 4.1: A motivating tracking task, where a robot controller needs to manipulate a deformable object along a given trajectory,
while predicting and controlling for the object's passive dynamics.
capable of safe and reliable dexterous deformable manipulation would greatly increase the number of industrial,
domestic, and logistical contexts in which robots could be deployed.
Deformable dynamics are difficult to model, manipulate, and even to perceive. Unlike many tasks with
rigid objects, grasping in different locations on the same object will result in significantly different behavior.
An object may fold to obscure itself or otherwise change shape dramatically, preventing a sensor system from
reliable tracking over time.
A motivating example for system identification in robotics is the task of tracking a given fast trajectory
with the distal end of a deformable object. The control task is dominated by the passive deformable dynamics
of the object, and requires accurate system parameter identification for model-based control. An example
physical setup is shown in Figure 4.1.
In this work, we propose a method to determine the material parameters of a deformable object model,
using data collected from robot interactions. In contrast to many common techniques which require specialized
force measurement equipment, our proposed method relies only on observations of the bulk motion of the
object over time. We describe a nonlinear optimization problem formulation inspired by collocation methods,
a penalty augmented loss transformation which enables optimization algorithms to solve it, and discuss sparse
matrix structures which enable efficient solutions. We also present preliminary experiments which show that
this method is promising for complex systems with deformable dynamics.
4.3 Background and Placement within Related Works
4.3.1 Simulation
Rich simulation environments have become a fundamental technology in autonomous robotics. An early and
ongoing use of physics simulation is as a virtual testbed for algorithm and mechanism design [83]. In such
uses, simulated environments provide a safe, inexpensive, fast, and easily configurable approximation of the
world, and are primarily useful for decreasing engineering time and cost. The physical phenomena simulated
by such systems can be complex and span many domains within one system, including electrical, mechanical,
thermal, with interactions between such subsystems [39].
With the advent of deep Reinforcement Learning (RL) techniques for robotics applications, simulators
also play an important role in data collection during the learning process [3]. In this context, simulators
again provide a safe environment for algorithms to execute suboptimal actions, which could be dangerous or
damaging in the real world. The popular class of model-free RL algorithms require vast amounts of data from
environment interactions to train [89], and simulators provide an inexpensive, safe, and often much faster
avenue towards such data generation. Successful deep RL methods for dynamically manipulating deformable
objects have been trained in simulation, for flinging [55] and folding cloth [159], and for ballistically casting
ropes [93].
Finally, environmental modelling has a long history in nonlinear optimal control techniques [143]. From
this perspective, it may be difficult to draw an exact line between a simulator and a dynamics model, but
the capabilities of control and dynamic planning techniques are steadily encompassing an increasingly rich
set of phenomena [148, 154]. Recent research has focused on the differentiability of entire simulators,
encompassing various modeling methods and phenomena, from contact-rich rigid body dynamics [36], to
mesh-free multiphysics with the Material Point Method [72], to Finite Element based soft-body cutting
simulation [63].
4.3.2 Closing the Sim-to-Real Gap
In the use of a simulator for any of these purposes, a major concern is the accuracy of approximation of
the real world. This “sim-to-real” gap introduces particular concern for RL and optimal control contexts –
both seek optimal actions, but an optimal action in an approximate simulator may be quite different from an
optimal action on a real-world system. Broadly, there are two ways to solve this problem: by identifying
parameters for a given model which more optimally explain the real world, or by choosing a more nuanced or
more structurally appropriate model.
In the extreme end of the spectrum of nuanced modelling, deep neural networks or dynamic mode
decompositions [85] can be used to directly approximate system input-output dynamics from data [125].
These fully data-driven schemes are able to model highly complex dynamics, for which there may not be
expressible equations of motion. However, these schemes can only effectively predict from observations
similar to data seen during fitting, and thus they require many example trajectories to generalize even to a
single system, much less to similar systems with different physical parameters. Furthermore, despite recent
advancements [140, 109], it is difficult to detect when such systems are not making accurate predictions,
leading to unsafe and unpredictable actions.
4.3.2.1 Deformable Object Modeling
System-specific models require less data and may more easily generalize to similar systems, at the cost of
significantly increased human effort required to design the model. Development of nuanced and accurate
physical models has been a cornerstone of engineering since the development of calculus. Civil, mechanical,
aerospace [103] and agricultural [132] engineering have long relied on complex models based on solid
mechanics and fluid dynamics.
For many applications, computer graphics also requires deformable object simulation. Computer graphics
methods are of particular interest to robotics applications due to the shared interest in a tradeoff between
fidelity and speed. In computer graphics, it may be desirable to render a reduced model to preserve realtime
interactivity, while in robotics, a less accurate dynamics estimate may be much more useful than a high
fidelity estimate if it can be provided at a much higher frequency. This tradeoff has been explored in depth for
physically-based model reductions of St. Venant-Kirchhoff materials using Finite Element Methods [10]. As
will be discussed, the approach proposed in this work relies on a constraint formulation to model the grasping
hand of the robotic manipulator. Detailed analysis of constrained deformable dynamics, and approaches to
overcome related issues of coupling between mesh vertices, is presented in [158].
In robotics, many systems have approximated environments by assuming all objects are rigid. Only
recently have common simulators begun to incorporate objects with deformable dynamics, using linked rigid
body approximations [154], FEM based dynamics [34], and Position-Based Dynamics (PBD) [114].
4.3.2.2 Parameter Estimation
Physically based models often have many physically interpretable parameters. Choosing these parameters
poorly can significantly degrade a model’s ability to accurately predict dynamics. Parameter estimation is the
process of attempting to accurately identify the parameters which most optimally match the system’s true
dynamics, and has been an area of interest since the earliest days of robotics, with publications resulting from
parameter identifications of individual serial manipulators [7].
Identification of the parameters of a deformable system is also a well explored area. Mechanically focused
fields, for which an engineered material will be produced repeatedly within a specified tolerance, are able to
sample small sections of material and subject them to detailed stress-strain measurements with specialized
scientific apparatus. A survey of such approaches discusses the tradeoffs of the relevant solution methods with
such fine-grained data [104]. In the field of computer graphics, it is valuable to ascertain material properties
to get a visually plausible match, so that an animated object with simulated dynamics will look real to a user
or audience. An approach for a co-rotational linearly elastic material model is detailed in [157]. In a robotics
context, an approach for identifying the parameters of arbitrary material models is given in [58], and is based
on the adjoint sensitivity analysis method. In contrast, our proposed method is based on a direct approach,
optimizing the parameters jointly with the candidate state trajectories.
4.4 Methods
4.4.1 Modeling
The true dynamics of a deformable object are continuous in space and time, with an infinite-dimensional
state and highly complex equations of motion. Nevertheless, the mechanics of deformation have a rich
history of study dating back to Euler and Lagrange, yielding approximate continuous dynamics in the form
of systems of Partial Differential Equations (PDEs). Tractable approximate representations of deformable
object mechanics are a cornerstone of engineering analysis for nearly all mechanical systems built in the real
world. A classic technique is the Finite Element Method (FEM), which converts the continuous-state system
into a discrete-state Ordinary Differential Equation (ODE). For brevity, we assume familiarity with the
FEM technique, and refer interested readers to standard texts on the subject.
The natural world simultaneously exhibits a complex collection of phenomena, and there are accordingly
many aspects of any physical object to model. We focus in this paper on the restricted class of viscoelastic
deformable objects. Broadly, these objects are governed by restitutive forces that resist local deformations
and viscous damping forces that resist movement.
Formally, given a mesh of linear tetrahedral elements with $n$ nodes, and a state vector $x \in \mathbb{R}^{3n}$ describing
the three-dimensional position of each node, the FEM approximation of the equations of motion yields the
second-order system

$$M\ddot{x} = f_e(x) + f_d(x, \dot{x}) + f_{\mathrm{ext}}, \qquad f_d(x, \dot{x}) = -\gamma_r K(x)\dot{x} - \gamma_m M\dot{x}, \qquad (4.1)$$

where $M \in \mathbb{R}^{3n \times 3n}$ is the lumped mass matrix, $f_e$ describes the internal elastic forces, $f_d$ describes the
collected damping forces, and $f_{\mathrm{ext}}$ collects any external forces, such as gravity. The parameter $\gamma_r \in \mathbb{R}_{>0}$
is a coefficient for an element-local damping term called Rayleigh damping, where $K = \partial f_e / \partial x$ is called the
stiffness matrix. The parameter $\gamma_m \in \mathbb{R}_{>0}$ is a coefficient for a global viscous damping term called
mass-proportional damping.
4.4.1.1 Constitutive Model
The choice of the function fe determines the elastic behavior of the system, and is equivalent to the choice
of a constitutive model. The choice of such a model for a given object is a field in itself [67]. In this work,
we select the Neo-Hookean model described in [16], primarily for the fact that compressing an element to
zero volume requires infinite energy. We emphasize that this choice is based on the real-world properties
of the particular object used in our experiments, and our method is compatible with all major viscoelastic
constitutive models. The Neo-Hookean constitutive model is defined by the energy function

$$\psi(C) = \frac{1}{2}\lambda(\ln J)^2 - \mu \ln J + \frac{1}{2}\mu(\operatorname{tr} C - 3), \qquad (4.2)$$

where $\lambda$ and $\mu$ are material parameters called the Lamé parameters, and are effectively a nonlinear change of
variables from Poisson's ratio and Young's modulus. For a local deformation gradient $F$ representing the
linearized deformation across a single element,

$$C = F^\top F \quad \text{and} \quad J = \det(F). \qquad (4.3)$$

$C$ is called the right Cauchy-Green deformation tensor. As mentioned, the $\ln J$ terms result in an infinite
amount of energy required to compress an element to zero volume, and thus an infinite restitutive force away
from such compression.
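As a concrete check of these formulas, a per-element energy evaluation can be written in a few lines. Our implementation is in Julia inside the FEM assembly; this Python sketch is only illustrative:

```python
import numpy as np

def neohookean_energy(F, lam, mu):
    # Eq. (4.2) evaluated for one element's deformation gradient F
    J = np.linalg.det(F)
    C = F.T @ F                      # right Cauchy-Green tensor, Eq. (4.3)
    lnJ = np.log(J)
    return 0.5 * lam * lnJ ** 2 - mu * lnJ + 0.5 * mu * (np.trace(C) - 3.0)

# at the rest configuration (F = I) the energy is zero; it grows without
# bound as det(F) -> 0, i.e., as the element is compressed toward zero volume
```
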
4.4.1.2 Boundary Conditions
Manipulation problems where the robot can be assumed to have a rigid and unbreakable grasp of the object
may be modelled via time-dependent Dirichlet boundary conditions. In the description of a PDE problem,
boundary conditions represent the behavior of valid solutions at the edges of the domain. In particular, a
Dirichlet boundary condition specifies the value of the solution for regions of the boundary. For the solid
mechanics problems considered here, this corresponds to specifying the position and velocity of specific
nodes on the boundary of the mesh, representing the rigid grasp of the robot manipulator.
By varying the states specified by the boundary condition with time, we represent the motion of the
robot, and excite motion in the rest of the mesh. In our problem formulation, we assume that the robot
has rigid control over a subset of "grasped" nodes with indices $I_g \subseteq \{1,\dots,n\}$, and call the state of those
nodes $x^g = (x_i)_{i \in I_g}$. We model the grasp as a time-varying 6-DOF rigid transform $u_t \in SE(3)$ of these nodes,
resulting in the constraints

$$x_{t,i} = u_t \bar{x}_i \quad \forall i \in I_g, \; \forall t \in \{1,\dots,T\}, \qquad (4.4)$$

where $\bar{x}_i$ is the undeformed position of node $i$.
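As a sketch, this constraint can be evaluated by applying the homogeneous-matrix form of $u_t$ to the undeformed grasped nodes (names are hypothetical; our implementation is in Julia):

```python
import numpy as np

def grasp_constraint_residual(x_t, rest_positions, grasped, T):
    # Eq. (4.4): grasped nodes must coincide with the rigidly transformed
    # rest positions; T is a 4x4 homogeneous matrix representing u_t in SE(3)
    R, p = T[:3, :3], T[:3, 3]
    res = []
    for i in grasped:
        res.append(x_t[i] - (R @ rest_positions[i] + p))
    return np.array(res)
```

A zero residual means the candidate state exactly satisfies the Dirichlet boundary condition at that timestep.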
Figure 4.2: An example state trajectory.
4.4.1.3 Time Integration
The second-order FEM dynamics (4.1) can be solved by any ODE solver by introducing variables v = x˙ and
lowering to a first order system. Alternatively, there exist tailored ODE solvers for second-order systems
with mass and damping matrices. For general forward simulation tasks, we use the second order implicit
Newmark integrator [118], which enables stable timestepping for timesteps on the order of 10 ms.
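A minimal sketch of one implicit Newmark step follows. The fixed-point iteration on the new acceleration is used here purely for clarity on a mild test problem; a production implementation, including ours, solves the implicit system with Newton's method:

```python
import numpy as np

def newmark_step(x, v, a, force, M_inv, dt, beta=0.25, gamma=0.5, iters=30):
    # implicit Newmark (average acceleration for beta=1/4, gamma=1/2):
    # the new acceleration a_new must satisfy M a_new = f(x_new, v_new),
    # where x_new and v_new themselves depend on a_new
    a_new = a.copy()
    for _ in range(iters):
        x_new = x + dt * v + dt ** 2 * ((0.5 - beta) * a + beta * a_new)
        v_new = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        a_new = M_inv @ force(x_new, v_new)
    return x_new, v_new, a_new
```

For the linear undamped case the average-acceleration variant is unconditionally stable, which is what permits the 10 ms timesteps mentioned above.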
4.4.1.4 Measurement model
We assume that in the real system, we have the ability to measure the positions of the subset of mesh vertices
with indices $I_o \subseteq \{1,\dots,n\}$. This is denoted by $y = h(x) = (x_i)_{i \in I_o}$.
4.4.2 Nonlinear Programming for Parameter Estimation
The model in Section 4.4.1 describes an entire family of dynamical systems. Collecting free variables from
Equations (4.1) and (4.2), we see that this family of systems is parameterized by a choice of $(M, \lambda, \mu, \gamma_r, \gamma_m)$.
In this work we assume that the mass matrix $M$ is known. In experiments, we compute $M$ by weighing the
deformable object, assuming uniform density throughout the mesh, and dividing the resulting mass of each
tetrahedron equally among its vertices. (This is known as a lumped mass matrix.) We denote the remaining
parameters as $\theta = (\lambda, \mu, \gamma_r, \gamma_m)$. The goal of parameter estimation is to pick system parameters $\theta^*$ which
best describe a set of collected real-world observations.
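The lumped mass construction described above amounts to a few lines; a Python sketch of the logic (our implementation is in Julia, and names here are illustrative):

```python
import numpy as np

def lumped_mass(rest_positions, tets, total_mass):
    # uniform density: each tetrahedron's share of the total mass (proportional
    # to its rest volume) is split equally among its four vertices
    vols = np.array([abs(np.linalg.det(np.stack([
        rest_positions[b] - rest_positions[a],
        rest_positions[c] - rest_positions[a],
        rest_positions[d] - rest_positions[a]]))) / 6.0
        for a, b, c, d in tets])
    node_mass = np.zeros(len(rest_positions))
    for m, tet in zip(total_mass * vols / vols.sum(), tets):
        node_mass[list(tet)] += m / 4.0
    return node_mass  # diagonal of M, one entry per node (repeated 3x in R^{3n})
```
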
We collect data from the real system by supplying a predetermined discrete-time input signal and
recording the corresponding output signal. To simplify notation, we present our method using only one
pair of input/output signals. However, the method is easily extended to a dataset of multiple input/output
signal pairs. Let $U = (u_1,\dots,u_T) \in SE(3)^T$ denote the input signal. We execute $U$ with zero-order hold
using a constant time interval of $\Delta t$, yielding the output signal $\hat{Y} = (\hat{y}_1,\dots,\hat{y}_T)$. We will also use the notation
$X = (x_1,\dots,x_T)$ to denote a sequence of mesh states.
Broadly, parameter estimation can be described as the following optimization problem:

$$\theta^*, X^* = \operatorname*{argmin}_{X,\theta} \; L(X,\theta) \triangleq \sum_{t=1}^{T} \|\hat{y}_t - h(x_t)\|_2^2
\quad \text{subject to } X \text{ consistent with eqs. (4.1) and (4.4)}. \qquad (4.5)$$

The choice of squared Euclidean norm to quantify error is arbitrary, and could be replaced by other divergences
or metrics.
4.4.2.1 Equality Constraints
Given a candidate state trajectory $X = (x_1,\dots,x_T)$ and candidate parameters $\theta$, we wish to evaluate how well
they satisfy Equation (4.1). At each timestep $t \in \{1,\dots,T\}$, we introduce an equality constraint function

$$c_t(X,\theta) \triangleq f_e(x_t;\theta) + f_d(x_t, v_t;\theta) + f_{\mathrm{ext}} - M(\theta)\,a_t = 0, \qquad (4.6)$$
where

$$v_t = \frac{x_t - x_{t-1}}{\Delta t} \quad \text{and} \quad a_t = \frac{x_t - 2x_{t-1} + x_{t-2}}{\Delta t^2}. \qquad (4.7)$$
In our experiments the system begins in its rest state, so it is reasonable to set v1 = 0 and for t ≤ 2, at = 0.
Equation (4.6) applies only for nodes which are not grasped by the manipulator. For the grasped nodes $x^g$
we introduce a separate equality constraint which enforces the Dirichlet constraint in Equation (4.4), and
supersedes Equation (4.6):

$$c^g_t(X,\theta) \triangleq x_{t,i} - u_t \bar{x}_i. \qquad (4.8)$$
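A sketch of evaluating one residual $c_t$ from three consecutive states, following Equations (4.6) and (4.7) (a Python illustration of the logic; the force callbacks are placeholders, and our implementation is in Julia):

```python
import numpy as np

def dynamics_residual(x_t, x_tm1, x_tm2, dt, M, f_e, f_d, f_ext):
    # finite-difference velocity and acceleration, Eq. (4.7), then the
    # force-balance residual of Eq. (4.6) that a physically consistent
    # trajectory drives to zero
    v_t = (x_t - x_tm1) / dt
    a_t = (x_t - 2.0 * x_tm1 + x_tm2) / dt ** 2
    return f_e(x_t) + f_d(x_t, v_t) + f_ext - M @ a_t
```

Each timestep's residual depends only on three consecutive states, which is exactly what produces the banded Jacobian structure discussed next.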
In realistic FEM problems, a mesh may have thousands of nodes, and a trajectory may have thousands of
timesteps. Many powerful local optimization algorithms rely on the Jacobian of the constraint vector to
choose a search direction for improvement. As such, evaluation of the combined equality constraint c and its
Jacobian requires special care for efficient computation. Here we state the equations for $\partial c_t / \partial X$ and
$\partial c_t / \partial \theta$, with some comments on performant implementation:

$$\begin{aligned}
\frac{\partial c_t}{\partial x_t} &= \left(1 + \frac{\gamma_r}{\Delta t}\right) K(x_t) + \gamma_r H(x_t) \cdot v_t + \frac{\gamma_m}{\Delta t} M - \frac{1}{\Delta t^2} M \\
\frac{\partial c_t}{\partial x_{t-1}} &= -\frac{1}{\Delta t}\left(\gamma_r K(x_t) + \gamma_m M\right) + \frac{2}{\Delta t^2} M \\
\frac{\partial c_t}{\partial x_{t-2}} &= -\frac{1}{\Delta t^2} M
\end{aligned} \qquad (4.9)$$

where the order-3 tensor $H(x_t)$, called the Hessian stiffness tensor, has elements $H_{ijk} = \partial K_{ij} / \partial x_{t,k}$, and
$A \cdot b = A_{ijk} b^k$ is a tensor contraction along the last dimension. In practice, this term is computed efficiently
as the directional derivative $D_v K(x)$.
As shown in Figure 4.3, notice that the structure of $\partial c / \partial x$ is block sparse and block lower triangular,
and that each block may be computed independently and in parallel. Additionally, each block is itself a sparse
matrix arising from the structure of the FEM mesh. Storing $\partial c / \partial x$ using a sparse matrix representation is
crucial, as the uncompressed matrices can approach terabytes of storage of almost all exact zeros.
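The block assembly can be sketched with SciPy's sparse utilities (our actual implementation uses Julia's sparse arrays; this only illustrates the block layout):

```python
from scipy import sparse

def assemble_jacobian(diag, sub1, sub2):
    # block lower-triangular dc/dX: per-timestep blocks for dc_t/dx_t,
    # dc_t/dx_{t-1}, and dc_t/dx_{t-2}, assembled directly into CSC format
    T = len(diag)
    grid = [[None] * T for _ in range(T)]
    for t in range(T):
        grid[t][t] = diag[t]
        if t >= 1:
            grid[t][t - 1] = sub1[t]
        if t >= 2:
            grid[t][t - 2] = sub2[t]
    return sparse.bmat(grid, format="csc")
```

Because each block row touches at most three block columns, the assembled matrix stays sparse regardless of the trajectory length.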
Figure 4.3: Structure of the matrices in the computation of $\partial c / \partial x$. Left: The overall block-diagonal structure of $\partial c / \partial x$. The first
block column is the parameter Jacobians $\partial c_t / \partial \theta$. The rest of the matrix is a block structure with a diagonal and two subdiagonals,
coming from the time differencing in the calculation of $v_t$ and $a_t$. Right: An example symmetric sparsity pattern of one of the
diagonal subblocks. This sparsity pattern comes from the connectivity of the nodes in the FEM mesh.
4.4.2.2 Quadratic Penalty Methods
Equation (4.5) is difficult to optimize. While the objective is convex, there are a large number of nonlinear
equality constraints, so the feasible set is highly nonconvex. Furthermore, the problem is very high dimensional. Satisfying hundreds of simultaneous nonlinear equality constraints is challenging, and we find that
many methods reach very suboptimal local minima quite quickly. Relaxing the constraints into a penalty
allows optimization algorithms to explore outside of the strictly physically feasible set of trajectories by
transforming the constrained Equation (4.5) into the unconstrained problem
$$\operatorname*{argmin}_{X,\theta} \; L(X,\theta) + \frac{\mu}{2} \sum_t \|c_t(X,\theta)\|_2^2, \qquad (4.10)$$

where $\mu > 0$, which we solve using a nonlinear conjugate gradient method [57].
4.4.2.3 µ-Scheduling
Equation (4.10) leaves a free parameter $\mu$ which weights the relative contribution of the equality constraints $c$ to the output-matching loss function $L$. As is common in the optimization literature, we find
that exponentially increasing the value of $\mu$ during optimization (by a factor of 10) improves optimizer
performance.
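The overall scheme can be sketched as an outer loop over $\mu$ around an inner unconstrained solve. Here SciPy's conjugate gradient minimizer stands in for the Optim.jl solver we actually use, and the toy problem is purely illustrative:

```python
import numpy as np
from scipy.optimize import minimize

def solve_with_penalty(z0, loss, constraint, mu0=1.0, rounds=4, factor=10.0):
    # outer loop: multiply the penalty weight mu by `factor` each round;
    # inner loop: unconstrained CG on  L(z) + (mu/2) ||c(z)||^2  (Eq. 4.10)
    z, mu = np.asarray(z0, dtype=float), mu0
    for _ in range(rounds):
        obj = lambda z, mu=mu: loss(z) + 0.5 * mu * np.sum(constraint(z) ** 2)
        z = minimize(obj, z, method="CG").x
        mu *= factor
    return z

# toy problem: loss pulls both coordinates toward 1, the constraint pins z[0]
z = solve_with_penalty(np.array([1.0, 1.0]),
                       lambda z: np.sum((z - 1.0) ** 2),
                       lambda z: z[:1])
```

Early rounds with small $\mu$ let the iterate move through physically infeasible trajectories; later rounds progressively enforce the dynamics constraints.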
4.4.2.4 Initial X Guess
Additionally, the initial guess of the state trajectory X strongly influences optimization convergence. Given
an initial guess θ0 for the parameters, we compute an initial state trajectory as a numerical solution of the
dynamics (4.1) where the input u is determined by the zero-order hold of the sequence U and the initial state
x1 is set to the solution of the rest-state problem for θ0, as detailed below. Since X0 satisfies Equation (4.1)
for θ0, it also satisfies ct(x,θ0) = 0 for all timesteps t, up to small differences caused by the ODE integration
scheme. Therefore, it is an approximately feasible point for the optimization problem (4.5).
4.4.2.5 Rest state problem
The rest state problem refers to finding an equilibrium point of the FEM dynamics under the Dirichlet
boundary condition, that is, solving the nonlinear system of equations

$$f_e(x) = -f_{\mathrm{ext}}, \qquad x_i = u\bar{x}_i \;\; \forall i \in I_g$$

for $x$ with $u$ and $\theta$ fixed. When solving this system numerically, we must take precautions to ensure that the
solver does not evaluate the dynamics at a state $x$ with very large restorative forces caused by the $\ln\det(F)$ term
in the Neo-Hookean constitutive model (4.2). In our experiments it was sufficient to use Newton's method with a
backtracking line search to ensure that the stiffness matrix $K$ remains positive definite.
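The rest-state solve can be sketched as follows (the positive-definiteness guard on $K$ is elided, and the function names are placeholders):

```python
import numpy as np

def newton_backtracking(residual, jac, x0, tol=1e-10, max_iter=50):
    # Newton's method with a backtracking line search on the residual norm;
    # the actual rest-state solver additionally guards that the stiffness
    # matrix K stays positive definite along the search
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        if np.linalg.norm(r) < tol:
            break
        dx = np.linalg.solve(jac(x), -r)
        alpha = 1.0
        while (np.linalg.norm(residual(x + alpha * dx)) >= np.linalg.norm(r)
               and alpha > 1e-8):
            alpha *= 0.5
        x = x + alpha * dx
    return x
```

The backtracking step is what keeps iterates away from nearly collapsed elements, where the $\ln\det(F)$ term makes the residual blow up.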
Figure 4.4: A motivating testbed problem setup for our proposed method. Left: Franka Emika Panda arm rigidly grasping a foam
cylinder, with attached Vicon markers. Right: Visualization of a simulated FEM mesh used for the methods described in this paper.
Parameter   Ground truth   Estimated
λ           200.0          143.8
µ           300.0          347.3
γr          0.009          0.011
γm          1.0            0.91

Table 4.1: Comparison of ground-truth and estimated values for the material parameters θ in our experiment. (λ, µ): Lamé
parameters. (γr, γm): Rayleigh damping and mass-proportional damping constants.
4.5 Results
We implement a FEM simulation environment in the Julia language [20], and use the Optim [110] software suite's implementation of a nonlinear conjugate gradient algorithm to optimize Equation (4.10). To store the sparse Jacobian ∂c/∂x, we use the common Compressed Sparse Column (CSC) format. We are able to evaluate the Jacobian of a trajectory of 100 timesteps, with a mesh of 176 3-dimensional nodes, in less than 100 ms (minimum 57 ms, median 82 ms, mean 98 ms).
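The CSC layout can be illustrated with a small example (a hypothetical 3×3 Jacobian; our implementation is in Julia, but SciPy exposes the same three-array format):

```python
import numpy as np
from scipy.sparse import csc_matrix

# Compressed Sparse Column (CSC) storage of a small sparse matrix. CSC
# keeps three arrays: the nonzero values in column-major order, their row
# indices, and per-column pointers into those arrays -- efficient for the
# column operations used when assembling and factorizing dc/dx.
J_dense = np.array([[1.0, 0.0, 0.0],
                    [5.0, 0.0, 2.0],
                    [0.0, 3.0, 0.0]])
J = csc_matrix(J_dense)

values = J.data    # nonzeros, column by column: [1, 5, 3, 2]
rows = J.indices   # row index of each nonzero:  [0, 1, 2, 1]
colptr = J.indptr  # column j occupies values[colptr[j]:colptr[j+1]]
```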
Given an initial parameter guess θ0, we are able to identify a set of parameters and state variables which achieve an optimization loss L(X, θ) corresponding to a mean Euclidean error of 1×10⁻³ m. However, it is important to note that due to the penalty method presented, there is still slack in the dynamics constraints during optimization, and these parameters diverge during forward simulation, where an individual point on the distal tip of the mesh has a mean tracking error of 1.1 cm. This points towards possible issues with our choice of
a penalty-based optimization algorithm or towards a difference between the implicit Newmark integrator and
the finite-difference approximations enforced in the dynamics constraints ct(X,θ). This provides motivation
for ongoing work.
4.6 Conclusion and Ongoing Work
We have proposed a method to estimate the material parameters of a deformable object by observing the
object’s motion under non-destructive robotic manipulation. Our method is intended to improve the fidelity
of finite-element models in simulation environments for complex manipulation tasks. Unlike approaches
based on the adjoint method or shooting, our method is similar to collocation-style trajectory optimization
and estimates the unobserved trajectory of the finite-element mesh along with the material parameters. We
believe that this method will show an improved set of initial guesses which will converge to a correct system
parameter identification, compared to other methods in the literature.
Our ongoing work focuses on three main efforts. First, we intend to compare the accuracy and computational cost of a wider range of constrained nonlinear optimization algorithms for the collocation-style
problem setup. Second, we will investigate the possibility of exploiting the problem structure for faster or
more accurate results, such as solving a series of estimation problems over increasing time horizons. Third,
we will compare our method against adjoint- and shooting-based parameter estimation algorithms. We are
especially interested in comparing stability-related properties such as sensitivity to initial guess.
Chapter 5
Tracking Fast Trajectories with a Deformable Object using a Learned Model
5.1 Abstract
We propose a method for robotic control of deformable objects using a learned nonlinear dynamics model.
After collecting a dataset of trajectories from the real system, we train a recurrent neural network (RNN)
to approximate its input-output behavior with a latent state-space model. The RNN internal state is low-dimensional enough to enable real-time nonlinear control methods. We demonstrate a closed-loop control
scheme with the RNN model using a standard nonlinear state observer and model-predictive controller. We
apply our method to track a highly dynamic trajectory with a point on the deformable object, in real time and
on real hardware. Our experiments show that the RNN model captures the true system’s frequency response
and can be used to track trajectories outside the training distribution. In an ablation study, we find that the
full method improves tracking accuracy compared to an open-loop version without the state observer.
5.2 Chapter Introduction and Placement within Related Work
Manipulating deformable objects represents a challenging area of robotics. In contrast to typical objects
consisting of a single rigid body, deformable objects often admit limited control authority and have dynamics
that are difficult to predict. At the same time, many objects of human interest are deformable. Safe, reliable
robotic manipulation of these objects is critical for capable general-purpose robots [8, 166].
As with many manipulation problems in robotics, there are multiple ways to model the dynamical
behavior of the object in question. Broadly speaking, there are two classes of models: Fully data-driven
methods based on an expressive function class with many parameters, and analytical methods based on a
physical model with fewer parameters that can be identified from data.
Physics-based models of deformable objects have been studied extensively in science and engineering
contexts, and many constitutive models have been described for various materials [68]. These models are
continuous in space and time, and simulation on computer hardware requires a discretization strategy. Finite
element methods (FEM) have been used for decades to solve continuum mechanics problems [161] and
have shown promise for robotic control. Recent work with FEM modeling addresses dynamic control of
deformable objects and soft robots with offline trajectory optimization [167, 17, 37, 90, 127, 62]. FEM
is appealing as it admits extensive theoretical analysis [11, 151]. For real-world problems, it is possible
to estimate the parameters of FEM meshes in a principled way from exteroceptive sensors common in
robotics [59]. Alternative discretizations include the material point method [146, 75], (extended) position
based dynamics [102], and meshless shape matching [113]. Additionally, task-specific reduced states can be
formulated, which accelerate control, perception and planning [106, 107].
Deformable objects can be complex to model, and for objects with resonant dynamics, seemingly
minor errors in model assumptions or material parameter estimates can cause large deviations in dynamic
behavior over time [18]. For these reasons, purely data-driven methods are an appealing alternative to
physics-based models. Machine learning methods have been explored in deformable manipulation [108], for
purely kinematic trajectory tracking [18], for cable-driven soft robot actuators [25], for control of pneumatic
deformable mechanisms [50], and for structures actuated by shape-memory alloys [134]. For scenarios
where controllers can continuously interact with their environment to improve, model-based reinforcement
learning has been proposed [153]. Other data-driven approaches studied in soft robot control include proper
orthogonal decompositions [155].
In this work, we propose a data-driven method for modeling and trajectory-tracking control of a deformable object. Tracking fast trajectories serves as a benchmark for a method’s ability to predict and account
for dynamics in the manipulated object. This is of particular interest for free-flying space robots like the
Astrobee platform [27], where the passive dynamics of many objects of interest are much stronger than the
actuator’s control authority. Our method models the dynamics with a long short-term memory (LSTM)
recurrent neural network (RNN) [66, 95], trained in a standard sequence modeling setup using input-output
trajectory data from a real physical system. The internal state of the learned RNN is not physically meaningful, but the RNN still forms a discrete-time dynamical system that is compatible with standard methods in
state-space nonlinear control. We therefore apply model-predictive control [4] and extended Kalman filtering
methods [80] to track the trajectory using the RNN model.
We implement our method to run online with a real robot, and use it to “draw in the air” along a fast
trajectory with the free end of a pool noodle (long foam cylinder) held by the robot arm. To measure how well
non-instantaneous dynamics are captured, we compare frequency response plots of the true system and the
RNN model. To verify that a nonlinear observer helps compensate for model errors, we perform an ablation
study comparing our method to an open-loop variant where the RNN model is assumed to be perfect. We find
that the full closed-loop method reduces trajectory tracking error.
RNNs have a long history within the broader field of nonlinear state-space system identification with
partial observability [117]. Such models can be used as an intermediate tool for learning a control policy,
e.g. in model-based reinforcement learning [54], or directly with nonlinear observers and controllers as in
our method [149]. To our knowledge, our work represents the first application of the RNN/observer/MPC
architecture to the task of manipulating deformable objects.
Figure 5.1: Physical testbed for our method. Trajectories are tracked by a Vicon motion capture marker attached to the end of a pool noodle, rigidly held by a Franka Emika Panda robot. Pitch/yaw inputs ū = (φ, ψ), tracking-point measurement ȳ, and coordinate axes (X, Y, Z) shown. (The fiducial marker visible in the image is not used.)
5.3 Problem Setting and Preliminaries
We consider a robot manipulating a deformable object such that a particular point on the object tracks a
trajectory. We use the following notation: The special Euclidean group of rigid transformations in three-dimensional space is denoted by SE(3). The notation A ⪰ 0 (resp. A ≻ 0) indicates that the matrix A is positive semidefinite (resp. positive definite). The notation ‖y‖_W = √(yᵀWy) for W ≻ 0 indicates a weighted Euclidean norm. The notation diag(x) refers to a diagonal matrix with the entries of the vector x on its diagonal. We define a setting with the following assumptions:
Figure 5.2: Diagram of our system. A small dataset of control inputs and tracked marker locations is obtained from the real system.
A recurrent neural network (RNN) model is trained to predict input/output behavior with a latent state. The RNN forms the nonlinear
dynamics model for an extended Kalman filter (EKF) and model-predictive controller (MPC) to track a dynamic input trajectory
with the deformable object.
Assumption 5.3.1 (Input) The deformable object is attached to the robot’s end effector by an unbreakable
grasp at a fixed position. The control policy interacts with the robot by commanding the position and attitude
of the grasp point.
Assumption 5.3.2 (Output) The robot’s perception system can accurately measure the three-dimensional
position of a single point on the surface of the deformable object.
Assumption 5.3.3 (Objective) The robot should manipulate the deformable object such that the measured
point on its surface tracks a given trajectory in three-dimensional space.
Assumption 5.3.4 (Protocol) Interaction with the environment is divided into two stages. In the preparation
stage, the robot can interact with the deformable object, perform calculations, and store results. There is no
time limit on this stage. In the testing stage, a supervisor reveals the trajectory to be tracked. The robot must
track the trajectory promptly without slow pre-computations. The robot can use all the results stored from
the preparation stage, but cannot interact with the deformable object in any way other than attempting to
track the trajectory. This restriction is motivated by safety- and time-constrained tasks, where it may not be
feasible for the robot to perform additional exploratory actions.
Formally, we assume that the coupled system of the robot and deformable object can be described by a
continuous-time, deterministic, time-invariant dynamical system
dx̄/dt = f̄(x̄, ū),   ȳ = h̄(x̄),   (5.1)

where the state x̄ is an abstract infinite-dimensional quantity representing the full continuum state of the deformable object and the robot, the input ū ∈ SE(3) is the desired pose of the robot end effector, and the output ȳ ∈ ℝ³ is the position of the measured point. The unknown dynamics f̄ encapsulates both the deformable object itself and a low-level controller that attempts to track the end-effector pose input ū by issuing actuator commands, e.g. motor torques, to the robot. The measurement model h̄ extracts the position of the measured point from the continuum state.
Assumption 5.3.5 The true system has a unique globally exponentially stable equilibrium state x̄₀ corresponding to the steady-state identity input I, i.e. f̄(x̄₀, I) = 0. This assumption is realistic for damped elastic
objects, such as closed cell foam. We assume that no new plastic deformations to the foam (such as a
permanent bend) are introduced during our experimental protocol.
The control policy interacts with the system (5.1) in discrete time steps of fixed length ∆t. In this paragraph we overload the same notations for discrete-time and continuous-time inputs, states, and outputs; for the remainder of the paper we refer to the discrete-time quantities exclusively. At step k ∈ ℕ, the discrete-time input ū[k] is supplied to the continuous-time system as a zero-order hold, resulting in the continuous-time input signal ū(t) = ū[⌊t/∆t⌋] for all t ≥ 0, and the discrete-time output ȳ[k] is sampled such that ȳ[k] = ȳ(k∆t).
Our control task is specified by a discrete-time signal of K goal positions z[1], ..., z[K] ∈ ℝ³ for the tracked point and the cost function

J = Σ_{k=1}^{K} ‖z[k] − ȳ[k]‖²_W ,   (5.2)

where the weighting matrix W ⪰ 0 encodes a (potentially) non-isotropic tracking objective.
Remark The protocol in Assumption 5.3.4 rules out some methods that have been widely used for
deformable manipulation. Trajectory optimization methods that build a local dynamics model around a
reference trajectory by interacting with the environment, such as guided policy search [153], violate the rule
against non-task interaction in the testing stage.
5.4 Methods
The components of our method are outlined in Figure 5.2. To ensure that our system relies on the whip-like
resonant behavior of the deformable object to track trajectories, rather than relying on large translational
movements of the robot end effector, we restrict the inputs to only the pitch φ and yaw ψ angles of the
end effector. We denote this restricted space as U ⊂ SE(3). In practice, we generate training data within a
compact subset of pitch and yaw angles, and regularize the MPC problem so that inputs far from the identity
I are penalized. Therefore, we parameterize U by ℝ² (in radians) and ignore potential issues of multiple covering.
5.4.1 Data collection
We begin by collecting a training dataset D containing N input-output trajectories from the real system. We
denote by ū[j, k] and ȳ[j, k] the input and output from the k-th time step of the j-th trajectory in D. Before
starting each trajectory, we apply the identity input and allow the system to settle for several seconds (10 in
our experiments) to ensure that the system returns to a state very close to the rest state x¯0, which will happen
promptly due to Assumption 5.3.5. We then apply random sinusoidal pitch and yaw inputs. Sinusoidal
inputs excite the system enough to demonstrate large-scale dynamics such as resonance that are important for
manipulation, but are smooth enough to respect the actuator limits of our robot.
The sinusoidal inputs are chosen to respect a user-selected angular acceleration limit ω̇_max and absolute angle limits φ_min, φ_max and ψ_min, ψ_max to avoid triggering our robot arm’s built-in safety stop when actuator limits are reached. For each angle, the sinusoid’s frequency ν is sampled log-uniformly between 0.125 Hz and 3 Hz. The maximum angular acceleration of a sinusoid is given by 2πAν², where A is the amplitude. Therefore, ν and the acceleration limit induce a maximum amplitude A ≤ A_max = ω̇_max/(2πν²). We sample the amplitude uniformly from [0.1, A_max]. Finally, we sample the phase uniformly from [0, 2π) and add a constant offset, which is sampled uniformly from the range induced by the amplitude and angle limits.
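The sampling procedure above can be sketched as follows. The limit values (ω̇_max = 6.0, angle limits ±0.5 rad) are hypothetical, and we additionally cap the amplitude so that offset plus amplitude fits inside the angle limits, which the text implies but does not state explicitly.

```python
import numpy as np

# Sampler for the random sinusoidal excitation inputs described above.
# Following the text, the maximum angular acceleration of the sinusoid is
# taken as 2*pi*A*nu^2, so the acceleration limit induces
# A_max = omega_dot_max / (2*pi*nu^2).
def sample_sinusoid(rng, omega_dot_max=6.0, phi_min=-0.5, phi_max=0.5):
    nu = np.exp(rng.uniform(np.log(0.125), np.log(3.0)))  # Hz, log-uniform
    a_max = min(omega_dot_max / (2.0 * np.pi * nu ** 2),
                (phi_max - phi_min) / 2.0)                # assumption: fit limits
    amplitude = rng.uniform(0.1, a_max)
    phase = rng.uniform(0.0, 2.0 * np.pi)
    offset = rng.uniform(phi_min + amplitude, phi_max - amplitude)
    return nu, amplitude, phase, offset

def sinusoid(t, nu, amplitude, phase, offset):
    return offset + amplitude * np.sin(2.0 * np.pi * nu * t + phase)

rng = np.random.default_rng(0)
params = [sample_sinusoid(rng) for _ in range(100)]
```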
5.4.2 RNN dynamics model
The true state x¯ of the deformable object is an infinite-dimensional continuum of points, which is not
representable on a computer without discretization and approximation. Furthermore, the dynamics of the
system are not purely dictated by the behavior of the deformable object—they also include the behavior of
the low-level controller of the robot arm. We overcome both challenges by representing the system state as
the hidden state of a learned RNN. In comparison to other methods of system identification from input/output
data, such as linear methods [26], RNNs are a highly expressive class of nonlinear state-space models. The
RNN is a generic function approximation scheme parameterized by a real-valued vector θ, and consists of a
discrete-time dynamics model
x[k+1] = f_θ(x[k], u[k]),   y[k] = h_θ(x[k]),   (5.3)

where x ∈ ℝⁿ is the internal state, u is the input, y is the output, and f_θ and h_θ are the dynamics and measurement functions respectively. Both f_θ and h_θ are differentiable with respect to their arguments and the parameter θ. The particular form of the functions f_θ and h_θ must be carefully chosen to maximize expressiveness while preserving desirable properties for optimization, but from the perspective of estimation and control, their exact form is unimportant.
The RNN is trained on the dataset D to minimize a regression loss on the input-output map over complete
sequences:
minimize over θ:   Σ_{j=1}^{N} Σ_{k=1}^{K_j} ‖ȳ[j, k] − h_θ(x[j, k])‖²₂

subject to   x[j, 0] = 0,
             x[j, k+1] = f_θ(x[j, k], ū[j, k]),   (5.4)
where K_j is the length of the j-th trajectory. The fixed initial state of 0 is justified in our setting because each
trajectory in D begins at the rest state x¯0. The optimization problem (5.4) is typically solved with stochastic
gradient descent (SGD) using backpropagation, so the gradient of the output prediction loss is allowed to
flow through the recursive applications of fθ . This allows the RNN to learn dynamics where the effect of
an input does not appear until many time steps later. RNNs have achieved state-of-the-art results on many
sequence modeling tasks, even though the objective is nonconvex and SGD may converge to suboptimal local
minima [95].
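The structure of problem (5.4) can be made concrete with a minimal sketch. The chapter trains an LSTM with SGD in PyTorch; to keep this sketch dependency-free we substitute a small tanh RNN with handwritten backpropagation through time, one synthetic trajectory, and plain gradient descent. All dimensions and constants are illustrative.

```python
import numpy as np

# Sequence regression in the shape of (5.4): x[k+1] = tanh(A x[k] + B u[k]),
# y[k] = C x[k], zero initial state, loss summed over the whole trajectory.
rng = np.random.default_rng(1)
n, m, p, K = 8, 2, 3, 30                 # state, input, output dims; length
A = 0.1 * rng.standard_normal((n, n))
B = 0.1 * rng.standard_normal((n, m))
C = 0.1 * rng.standard_normal((p, n))
U = rng.standard_normal((K, m))          # one synthetic training trajectory
Y = 0.1 * rng.standard_normal((K, p))

def forward(A, B, C, U):
    xs = [np.zeros(n)]                   # x[0] = 0, as in (5.4)
    for k in range(K):
        xs.append(np.tanh(A @ xs[-1] + B @ U[k]))
    loss = sum(np.sum((Y[k] - C @ xs[k + 1]) ** 2) for k in range(K))
    return loss, xs

def bptt_grads(A, B, C, U):
    _, xs = forward(A, B, C, U)
    gA, gB, gC = np.zeros_like(A), np.zeros_like(B), np.zeros_like(C)
    gx = np.zeros(n)                     # gradient w.r.t. x[k+1] from the future
    for k in reversed(range(K)):
        e = C @ xs[k + 1] - Y[k]
        gC += 2.0 * np.outer(e, xs[k + 1])
        gx = gx + 2.0 * C.T @ e          # total gradient w.r.t. x[k+1]
        gz = (1.0 - xs[k + 1] ** 2) * gx # through the tanh pre-activation
        gA += np.outer(gz, xs[k])
        gB += np.outer(gz, U[k])
        gx = A.T @ gz                    # pass back to x[k]
    return gA, gB, gC

loss0, _ = forward(A, B, C, U)
for _ in range(200):                     # plain gradient descent
    gA, gB, gC = bptt_grads(A, B, C, U)
    A -= 1e-2 * gA; B -= 1e-2 * gB; C -= 1e-2 * gC
loss1, _ = forward(A, B, C, U)
```

The backward pass is where "the gradient of the output prediction loss is allowed to flow through the recursive applications of f_θ": the running vector gx carries loss information from later time steps back to earlier parameters.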
5.4.3 Model-predictive control with reduced-order model
After finding a value θ* of the RNN parameter that approximately optimizes the learning objective (5.4),
we use the RNN model in a model-predictive control (MPC) framework to optimize the trajectory-tracking
objective (5.2) in an online manner. Assume temporarily (we will return to this assumption in Section 5.4.4)
that at time step k we know a value xˆ[k] of the abstract RNN state that is consistent with the input and output
history from previous time steps in the real-world system. We solve the short-horizon optimal control problem
minimize over ū[k], ..., ū[k+H−1]:

   Σ_{i=1}^{H} ‖z[k+i] − h_θ*(x[k+i])‖²_W
   + α Σ_{i=0}^{H−1} d(ū[k+i−1], ū[k+i])²
   + β Σ_{i=1}^{H} d(ū[k+i], I)²   (5.5)

subject to   x[k] = x̂[k],
             x[k+i+1] = f_θ*(x[k+i], ū[k+i])   ∀i.
In the objective (5.5), H ≪ K is the MPC horizon. In the second and third terms, d : U × U → ℝ≥0 is a metric on the input space U. The first regularization term, weighted by constant α ≥ 0, encourages smoothness
of the control signal. The second regularization term, weighted by constant β ≥ 0, encourages the robot to
return to its rest state. Solving (5.5) yields a sequence of inputs ū*[k], ..., ū*[k+H−1] optimized to track the next H steps of the full goal trajectory. Following the standard moving-horizon architecture, we apply only the first input ū*[k] from the solution to the real system. Then, at time step k+1, we solve a new instance of (5.5). The optimization problem is nonconvex, but the solution from the previous time step provides a high-quality initial guess, so local optimization methods typically perform well as long as the initial guess is not close to a bad local optimum [4]. We obtain an approximate solution with a few steps of gradient descent with momentum. The momentum state from previous MPC steps is persisted and time-shifted in the same manner as the initial guess.
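The receding-horizon loop can be sketched as below. The dynamics are a toy damped oscillator standing in for the RNN, the gradients are finite differences rather than autodiff, plain gradient descent replaces gradient descent with momentum, and all constants are hypothetical.

```python
import numpy as np

dt, H = 0.1, 10
def f(x, u):                       # toy dynamics: [position, velocity]
    return x + dt * np.array([x[1], -4.0 * x[0] - 0.8 * x[1] + u])
def h(x):
    return x[0]

def horizon_cost(x0, U, z):        # tracking + input-smoothness penalty
    x, c = x0, 0.0
    for i in range(H):
        x = f(x, U[i])
        c += (z[i] - h(x)) ** 2
    c += 0.1 * np.sum(np.diff(U) ** 2)
    return c

def mpc_step(x0, U_init, z, iters=20, lr=0.5, eps=1e-5):
    U = U_init.copy()
    for _ in range(iters):         # a few descent steps, as in the text
        g = np.zeros(H)
        for i in range(H):         # finite-difference gradient
            Up = U.copy(); Up[i] += eps
            g[i] = (horizon_cost(x0, Up, z) - horizon_cost(x0, U, z)) / eps
        U -= lr * g
    return U

# Closed-loop rollout tracking a constant reference, with warm starting.
z_ref = np.ones(60)
x, U = np.zeros(2), np.zeros(H)
outputs = []
for k in range(40):
    U = mpc_step(x, U, z_ref[k:k + H])
    x = f(x, U[0])                      # apply only the first input
    U = np.roll(U, -1); U[-1] = U[-2]   # time-shift warm start
    outputs.append(h(x))
```

The time-shifted warm start is what makes a handful of descent iterations per control step sufficient: each new problem differs from the last by only one step of the horizon.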
5.4.4 Estimating the RNN state
In Section 5.4.3, we assumed the availability of an RNN state x̂[k] that is consistent with previous inputs and outputs from the real-world system. To obtain x̂[k], it is not sufficient to simply evaluate f_θ* recursively on ū[1], ..., ū[k−1]. Because the RNN model is not perfect, the true outputs ȳ[1], ..., ȳ[k−1] obtained from the real-world system may diverge from the outputs h_θ*(x[1]), ..., h_θ*(x[k−1]) predicted by applying the RNN model in this open-loop manner.
Instead we observe that, although the RNN state has no direct physical meaning, the RNN model is still
a nonlinear discrete-time dynamical system with known dynamics. Therefore, its state can be estimated
from the input-output history using standard techniques from estimation theory. In particular, we apply an
extended Kalman filter (EKF) to the RNN model. We now review the well-known EKF equations to highlight
the role of the RNN.
The EKF maintains a Gaussian-distributed belief over the RNN state. At time k, the belief is distributed
according to
x[k] ∼ N (µ[k],Σ[k]),
where N (µ,Σ) is the Gaussian distribution with mean µ and covariance Σ. After sending an input u¯[k] to the
system, we then update the belief according to
µ[k|k−1] = f_θ*(µ[k−1], ū[k]),
Σ[k|k−1] = F[k] Σ[k−1] F[k]ᵀ + Q[k],   (5.6)
where
F[k] = ∂f_θ*/∂x, evaluated at (µ[k−1], ū[k]),

is the Jacobian of the RNN dynamics f_θ* with respect to the state. The process noise covariance Q[k] ⪰ 0 represents noise in the RNN abstract state, so it cannot be derived from a physical model or estimated from data. We simply set Q to a scaled identity matrix in this work. Next, the measurement ȳ[k] is captured from the real system. We compute the measurement residual γ[k] = ȳ[k] − h_θ*(µ[k|k−1]). According to the current belief, γ[k] has covariance

S[k] = H[k] Σ[k|k−1] H[k]ᵀ + R[k],
where
H[k] = ∂h_θ*/∂x, evaluated at µ[k|k−1],

is the Jacobian of the RNN observation function h_θ* with respect to the state. The sensor noise covariance R[k] ≻ 0 represents a physically meaningful quantity that can be derived from a model or estimated from data. In this work we use a motion capture system with isotropic noise, so we set R to a constant scaled identity matrix. We then compute the Kalman gain
matrix. We then compute the Kalman gain
K[k] = Σ[k|k −1]H[k]
>
S[k]
−1
55
and update the belief according to
µ[k|k] = µ[k|k−1] + K[k] γ[k],
Σ[k|k] = (I − K[k] H[k]) Σ[k|k−1].   (5.7)
We then use the mean of the belief distribution as the initial state for the MPC problem (5.5), that is,
xˆ[k] = µ[k].
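The predict/update cycle of (5.6) and (5.7) can be exercised end-to-end on any discrete-time model. The sketch below uses numerically estimated Jacobians and a toy 2-D rotation model as a stand-in for the RNN (the chapter differentiates the RNN with autodiff in PyTorch); all noise levels are illustrative.

```python
import numpy as np

def num_jac(fun, x, eps=1e-6):         # forward-difference Jacobian
    y0 = np.atleast_1d(fun(x))
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        xp = x.copy(); xp[i] += eps
        J[:, i] = (np.atleast_1d(fun(xp)) - y0) / eps
    return J

def ekf_step(f, h, mu, Sigma, u, y, Q, R):
    # Predict (5.6)
    F = num_jac(lambda x: f(x, u), mu)
    mu_pred = f(mu, u)
    S_pred = F @ Sigma @ F.T + Q
    # Update (5.7)
    Hj = num_jac(h, mu_pred)
    gamma = y - h(mu_pred)                    # measurement residual
    S = Hj @ S_pred @ Hj.T + R
    K = S_pred @ Hj.T @ np.linalg.inv(S)      # Kalman gain
    mu_new = mu_pred + K @ gamma
    Sigma_new = (np.eye(mu.size) - K @ Hj) @ S_pred
    return mu_new, Sigma_new

th = 0.1                                      # toy dynamics: slow rotation
f = lambda x, u: np.array([[np.cos(th), -np.sin(th)],
                           [np.sin(th), np.cos(th)]]) @ x + u
h = lambda x: x[:1]                           # observe first coordinate only

Q, R = 1e-4 * np.eye(2), 1e-2 * np.eye(1)
mu, Sigma = np.zeros(2), np.eye(2)
x_true = np.array([1.0, -0.5])
rng = np.random.default_rng(2)
for _ in range(50):
    u = 0.01 * rng.standard_normal(2)
    x_true = f(x_true, u)
    y = h(x_true) + 0.1 * rng.standard_normal(1)
    mu, Sigma = ekf_step(f, h, mu, Sigma, u, y, Q, R)
```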
5.4.5 Implementation
For the RNN reduced-order dynamics model fθ , we select the long short-term memory (LSTM) architecture [66]. The LSTM is the de facto standard RNN architecture due to its favorable properties for training with
gradient descent. Numerical values of the architectural and training hyperparameters are listed in Table 5.1.
We train the RNN in PyTorch [123]. We also solve the model-predictive control problems and implement
the EKF in PyTorch due to the ease of differentiating through the RNN model. We run the MPC control loop
at 40 Hz.
5.5 Experiments
In all experiments, our deformable body is a thin cylinder of uniform closed-cell polyethylene foam, commonly known as a pool noodle, with length 1.5 m and diameter 6.5 cm. To attach the object to the robot, we
press-fit approximately 4 cm of one end of the pool noodle into a rigid 3D-printed ring attached to the robot
end effector. To track the distal end of the object, we attach a rigid assembly of motion capture markers and
track its full pose with a Vicon motion capture system. The measurement y¯ has a calibrated offset from the
marker assembly to lie on the center line of the pool noodle. Our experimental setup is shown in Figure 5.1.
The robot is a Franka Emika Panda. To track the pose commands u¯ ∈ SE(3) output by the MPC controller,
we apply a proportional-only control law in both the position and in the attitude quaternion to generate desired
Table 5.1: Values of user-chosen constants in our experiments.

Symbol  Description                        Value
Q       EKF process covariance             10⁻⁶ I
R       EKF measurement covariance         10⁻² I
H       MPC horizon                        25
α       MPC smoothness weight              1.0
β       MPC homing weight                  0.1
—       MPC gradient descent rate          0.4
—       MPC gradient descent steps         5
—       LSTM layers                        1
n       Reduced state dimension            200
—       LSTM SGD steps                     10⁴
—       LSTM SGD learning rate             10⁻³
—       LSTM SGD batch size                10
N       Number of trajectories in dataset  100
linear and angular velocities v for the end effector. We compute the kinematic Jacobian from libfranka and
produce desired joint velocities using the Jacobian pseudoinverse with null space optimization towards the
“home” position [141].
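A sketch of this velocity-level control law follows. The Jacobian here is a random stand-in for the one returned by libfranka, and the null-space gain is hypothetical.

```python
import numpy as np

# Differential IK: joint velocities from the Jacobian pseudoinverse, with a
# null-space term pulling toward a "home" configuration [141]. The secondary
# term is projected so it does not disturb the commanded end-effector twist.
def ik_velocities(J, v_des, q, q_home, k_null=1.0):
    J_pinv = np.linalg.pinv(J)
    primary = J_pinv @ v_des                 # achieve end-effector twist
    N = np.eye(J.shape[1]) - J_pinv @ J      # null-space projector
    secondary = N @ (k_null * (q_home - q))  # drift toward home pose
    return primary + secondary

rng = np.random.default_rng(3)
J = rng.standard_normal((6, 7))              # 6-DOF twist, 7 joints (stand-in)
v_des = rng.standard_normal(6)
q, q_home = rng.standard_normal(7), np.zeros(7)
qdot = ik_velocities(J, v_des, q, q_home)
```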
5.5.1 Model frequency response
To validate our RNN model, we compare its empirical frequency response near the internal state x = 0 to that
of the true physical system at its rest state x¯0. Frequency response is a holistic property of the model that
depends on its behavior in multiple parts of the state space. To avoid the complications of frequency-domain
analysis for multiple-input/multiple-output systems, we actuate the system only in the yaw axis ψ and
measure only the deflection of the pool noodle tip in the horizontal axis. Yaw inputs, as opposed to pitch
inputs, avoid the asymmetric effect of gravity.
We sample 33 geometrically-spaced frequencies ranging from 0.125 Hz to 2 Hz. For each frequency, we
apply a yaw input with a small amplitude (11.5° peak-to-peak) for 20 sec and record the trajectory of the pool
noodle tip ¯y. The small input amplitude allows higher-frequency inputs before reaching the robot’s actuator
limits. We discard approximately the first half of the recording (on the nearest whole cycle boundary) to

Figure 5.3: Frequency-domain gain and phase response (Bode plots) for the real pool noodle and the LSTM model.

eliminate transient effects. We then compute the empirical gain and phase by taking the inner product of the
output signal with complex exponential functions, analogous to the discrete Fourier transform. Because the
input and output have different units, only the relative gain magnitudes between different frequencies are
meaningful. We normalize the gains such that the gain of the real system for the slowest input frequency
equals 1.
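The single-frequency projection can be sketched as below, using synthetic signals in place of the recorded yaw input and tip deflection. The projection onto a complex exponential at the input frequency is exactly one bin of a DFT.

```python
import numpy as np

# Empirical gain and phase at a single frequency nu: project input and
# output onto exp(-2*pi*i*nu*t) over a whole number of cycles, then take
# the ratio of the complex amplitudes.
def gain_phase(u, y, nu, dt):
    t = np.arange(len(u)) * dt
    basis = np.exp(-2j * np.pi * nu * t)
    cu = np.sum(u * basis)            # complex amplitude of input at nu
    cy = np.sum(y * basis)            # complex amplitude of output at nu
    H = cy / cu                       # empirical frequency response at nu
    return np.abs(H), np.angle(H)

dt, nu, n_cycles = 0.005, 0.5, 10
t = np.arange(int(n_cycles / nu / dt)) * dt         # whole number of cycles
u = 0.1 * np.sin(2 * np.pi * nu * t)                # synthetic yaw input
y = 0.35 * np.sin(2 * np.pi * nu * t - np.pi / 4)   # gain 3.5, 45 deg lag
gain, phase = gain_phase(u, y, nu, dt)
```

Because the input appears in the denominator, the input/output unit mismatch noted in the text cancels only up to a constant factor, which is why the gains are normalized across frequencies.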
Bode (gain and phase) plots for each system are overlaid in Figure 5.3. In the true physical system, we
observe typical behavior of a resonant lowpass filter with a strong resonant peak around 0.8 Hz. The peak
gain is approximately 4 times larger than the low-frequency gain. This represents a dramatic phenomenon
that a faithful model must capture. In the RNN model, the gain response closely matches the real system.
The phase response matches closely below the resonant frequency, but begins to lag behind the true system
for high frequencies. We suspect this may be due to an unbalanced training data distribution. Overall, this
experimental result suggests that the RNN model has successfully captured the frequency response of the
physical system.
5.5.2 MPC tracking
We apply our method to track several test trajectories which attempt to expose the controller’s performance
with regard to resonant dynamics. The goal trajectories z[1], ..., z[K] ∈ ℝ³ are specified by the user. As
described in Equation (5.2), our tracking cost is non-isotropic. We use the value W = diag(0,1,1) to focus
only on tracking in the Y- and Z-axes, which represents the view from the front of the robot. Without this

Figure 5.4: Two-dimensional projections of paths traced by the pool noodle free end in MPC tracking experiments.

non-isotropic weight, the goal trajectory would need to be a carefully designed curve in ℝ³ to be trackable with zero error.
We show the results from three trajectories in Figure 5.4. The first two trajectories are a circle of diameter
0.6 m and a figure-8 Lissajous curve of width 1 m and height 0.4 m. Both trajectories are sinusoidal in each
axis, which matches the data collected during our training step. The circle and the vertical axis of the Lissajous
curve are set to the system’s resonant frequency 0.8 Hz as determined experimentally. The third trajectory
is a rectangle with constant linear velocity along each edge (no stopping at corners). These trajectories are
close to the robot’s dynamic limits as discussed in Section 5.4.1. Improvements in our low-level controller
are required to test more demanding trajectories that push the system further into nonlinearity.
To show that the EKF observer provides meaningful feedback to the controller (the “closed-loop” setup),
we compare it to a setup that assumes the predicted feedforward state, i.e. the value yielded by applying the
RNN dynamics f_θ* to the full sequence of past inputs, is always correct (the “open-loop” setup). The results
of this comparison are visualized in Figures 5.4 and 5.5, and the tracking errors are compared numerically in
Table 5.2.
For the circle trajectory, we observe significantly improved performance in both mean and maximum error
from the closed-loop setup. In the traces over time, shown in Figure 5.5, the open-loop solution drifts towards
stronger resonance in the vertical axis and weaker resonance in the horizontal axis. In contrast, the error of the
closed-loop solution does not grow over time, indicating that our EKF setup is able to compensate for model
Table 5.2: MPC tracking errors.

shape      kind         max error (cm)   mean error (cm)
circle     closed-loop  12.59             5.59
           open-loop    29.15            11.61
lissajous  closed-loop   9.19             4.43
           open-loop    14.85             4.77
rectangle  closed-loop  14.60             6.34
           open-loop    14.62             6.01
error. For the Lissajous trajectory, the closed-loop setup yields a minor improvement in mean tracking error
but a significant improvement in maximum error. The rectangle trajectory shows little difference between the
open- and closed-loop approaches, but it is noteworthy that both approaches track the rectangle reasonably,
even though it is physically infeasible and not similar to the training data. This result suggests that the LSTM
model behaves reasonably for sequence inputs that are dissimilar to those in the training data.
Figure 5.5: Traces of rotation angle inputs (pitch, yaw) and horizontal and vertical components of pool noodle free end (y, z) for
MPC tracking of the circle trajectory (see Figure 5.4). The closed-loop variant of our method shows superior tracking performance.
5.6 Conclusion
We have described and demonstrated a system to manipulate a deformable object such that a particular point
on the object tracks a fast trajectory. Our approach is completely data-driven, and requires a fixed initial
data-collection phase, without further exploratory actions. We model our dynamical system as an LSTM
recurrent neural network and design a nonlinear MPC controller. We use an EKF state observer to account
for model error by estimating a value of the LSTM hidden state consistent with past inputs and outputs.
We validate our model on a real hardware setup with a robot manipulator holding a foam pool noodle,
measured by a motion capture system. Our experiments show that closing the loop with the EKF observer
improves tracking performance compared to an open loop control scheme for several of the test trajectories.
In future work, we aim to improve tracking accuracy by investigating other nonlinear state estimation
methods and MPC implementations. The EKF is one of several ways to account for modeling error. Other
nonlinear state estimators such as particle filters and moving window least-squares estimators could also
be adapted to estimate RNN internal states. Alternatively, a pure deep-learning approach might add past
observations as inputs to the RNN. For MPC, a more sophisticated constrained optimization scheme could
be applied to the nonlinear MPC problem (5.5) to help enforce smoothness and actuator limits. We are
also interested in applying our work to non-position-based control tasks, such as force control for using
deformable objects as tools.
Chapter 6
GranularGym: High Performance Simulation for Robotic Tasks with
Granular Materials
6.1 Abstract
Granular materials are of critical interest to many robotic tasks in planetary science, construction, and
manufacturing. However, the dynamics of granular materials are complex and often computationally very
expensive to simulate. We propose a set of methodologies and a system for the fast simulation of granular
materials on GPUs, and show that this simulation is fast enough for basic training with Reinforcement
Learning algorithms, which currently require many dynamics samples to achieve acceptable performance.
Our method models granular material dynamics using implicit timestepping methods for multibody rigid
contacts, as well as algorithmic techniques for efficient parallel collision detection between pairs of particles
and between particle and arbitrarily shaped rigid bodies, and programming techniques for minimizing warp
divergence on Single-Instruction, Multiple-Thread (SIMT) chip architectures. We showcase our simulation
system on several environments targeted toward robotic tasks, and release our simulator as an open-source
tool.
Figure 6.1: Simulation of a Franka Emika Panda robot with a rigidly attached scoop attachment, interacting with a bed of 50000
particles, running at realtime on a single NVIDIA GeForce RTX 3080 Ti with a simulated timestep of 5×10^−4 s. Our simulation
engine approximates granular material state and dynamics as a system of rigidly interacting spherical particles, which may interact
with kinematically driven rigid bodies of arbitrary geometry.
6.2 Chapter Introduction
Robots are expanding their operational domains from structured environments, like factory floors, into the
unstructured world of homes [163], the outdoors [45], and other planets in our solar system [42]. Successful
operation in any environment requires a robot to predict the effect of its actions on the world, so that it may
select an action that best achieves its goal.
The dynamics of the objects in a robot's environment may be quite complex. In addition to rigid objects, robots may encounter deformable [6] or sliceable [63] objects, including cloth [135] or ropes [94]. Here, our focus is on robots that interact with large quantities of granular material like rock, sand, or loose soil, and that are given tasks requiring prediction of the bulk state of such materials, such as material transport or shaping.
While machine learning is a powerful and generalizable tool for robot prediction and perception [46],
it remains necessary to use large models and a large number of samples to achieve accurate performance.
Physical modeling, which we refer to in this paper as simulation, is a well-tested and interpretable method for
dynamical system prediction. Robust and efficient simulation provides robotics engineers with accurate, safe,
and fast environments to test algorithms, train data-driven learning agents, and serve as a predictive model in
robotic control algorithms.
To these ends, we have developed GranularGym, a faster-than-realtime simulation engine for the mechanics of granular materials, handling tens of thousands of particles in realtime and hundreds of thousands of particles at interactive speeds on a single commodity GPU. Additionally, we contribute a set of environments that we use as benchmarks for algorithms intended to solve granular material manipulation tasks. We document the equations of motion for the simulated dynamics, describe the algorithms and data structures used for high-performance simulation on GPUs, analyze how the performance of our engine scales with various parameters, present several benchmark environments with their associated performance scaling, and show that our engine is performant enough to train state-of-the-art reinforcement learning algorithms to achieve high rewards.
Our approach uses a simplified model of granular interaction based on rigid body interparticle contact,
which may not capture the full, vast space of rheological phenomena found in nature. Nevertheless, we
believe that fast, approximate simulation in complex domains is a powerful tool for the development of
robotic autonomy, particularly in closed-loop robotic systems where modeling errors may be accounted for
and corrected based on sensor observations.
The code for GranularGym is released under an open-source license. It is written in Julia [21] with a Python interface available, targets multithreaded CPU and GPU architectures that support the NVIDIA Compute Unified Device Architecture (CUDA) or Apple Metal frameworks, and is easily portable to other parallel compute platforms.
6.3 Methods
We simulate rigid, dry-frictional contact of np spherical particles and nb rigid bodies of arbitrary geometry interacting in a scene. The particle states and velocities are described by matrices x, v ∈ R^(3×np). Each rigid body's state is given by a six Degree of Freedom (DoF) transform 0X_i ∈ SE(3) from the world frame 0 into the frame of body i. Rigid body i also has velocity v_i ∈ se(3), where se(3) is the Lie algebra of the Special Euclidean Lie group SE(3). Rigid bodies in our engine are fully driven by time and may exert forces on granular particles but do not accumulate a reaction force.
In the following subsections, we describe our algorithm for computing contact impulses across large
systems of particles, efficient parallel data structures for broadphase collision detection and non-convex rigid
body collision detection, and branched execution considerations for efficient programming in SIMT GPU
environments.
6.3.1 Implicit Contact Impulses with Projected Jacobi
We compute particle impulses with rigid dry frictional contact using an implicit timestepping approach common in rigid body simulation, constructing a Nonlinear Complementarity Problem (NCP) to solve for the systemic contact impulse. A thorough treatment and comparison of implicit timestepping methods is given in [70]. We use a parallelized variant of the serial Projected Gauss-Seidel (PGS) method to solve this NCP, which we call the Projected Jacobi Algorithm (PJA), listed in Algorithm 1. For completeness, in this section we detail the construction of the contact impulse NCP and then describe the PJA solver and its implementation on a CUDA GPU.
During collision impulse calculation, it is useful to establish a contact frame, shown in Figure 6.2. A contact frame is a local coordinate system with its origin at the point of contact and the e1-axis oriented along the contact normal. As a result, the contact normal force is axis-aligned with e1, and frictional forces exist only in the e2e3 plane. For contact between particles i and j, the contact Jacobian J_{i,j} ∈ R^(3×3) is the transformation into the associated contact frame. We assemble these into a large, sparse contact Jacobian
Figure 6.2: Graphical representation of a contact frame between two objects (red and blue). The contact normal is aligned with the frame's e1-axis, while the e2 and e3 axes are an arbitrary basis for the frictional force plane. A cone representing the bounds of dry Coulomb friction is shown in yellow. ∆v* shows a candidate impulse, and the dashed line shows its projection onto the friction cone via the solution of the contact NCP described in Section 6.3.1.
Algorithm 1: Projected Jacobi Algorithm
1  ∆v ← 0;
2  foreach iteration from 1 to n do
3      foreach contact pair (i, j, ψ) do
4          b ← J_{i,j}^T (v_i^(k) − γ v_j^(k) + ∆t F_ext,i + ∆v_i);
5          b[1] ← max(b[1] + αψ/∆t, 0);
           // Project onto Coulomb friction cone.
6          if ||b[2:3]||_2 > µ·b[1] then
7              b[2:3] ← µ·b[1] · b[2:3] / ||b[2:3]||_2;
8          end if
9          ∆v_i ← ∆v_i + J_{i,j} b;
10     end foreach
11 end foreach
for the entire system, J ∈ R^(3nc×3np), where nc is the number of contacts at the current timestep. The contact impulse ∆v is then computed from the solution to Equation (6.1), where M is the mass-inertia matrix, v^(k) is the system velocity at timestep k, c^(k) is the post-contact relative velocity in the contact frame at timestep k, F_ext is the sum of any external forces applied to the system, such as gravity, and ∆t is the timestep.

c^(k+1) = J M^−1 J^T ∆v + J (v^(k) + M^−1 ∆t F_ext)    (6.1)

where we write A = J M^−1 J^T and b = J (v^(k) + M^−1 ∆t F_ext), so that c^(k+1) = A ∆v + b.
To enforce the non-penetration of rigid body contact, where objects may either exert a contact force on one another or accelerate away from one another, but not both, we add the complementarity constraint in Equation (6.2) to Equation (6.1) on the normal directions

0 ≤ ∆v · e1 ⊥ (A ∆v + b) · e1 ≥ 0    (6.2)

Furthermore, to enforce a Coulomb approximation of dry friction, we include the nonlinear constraint in Equation (6.3) between the tangent and normal impulses

||proj_{e2e3} ∆v||_2 ≤ µ ∆v · e1,    (6.3)

where µ is the coefficient of friction and proj_{e2e3} is the projection operator into the e2e3 plane. The projection onto the friction cone is the nonlinearity in the NCP. Together, Equations (6.1) to (6.3) form the systemic contact impulse NCP. The post-separation contact velocities c^(k) are treated as slack variables, and we use the resulting contact impulses ∆v in a symplectic Euler time integration scheme, where the updated velocity is computed first, per Equation (6.4), and then used to update positions, per Equation (6.5).

v^(k+1) = v^(k) + ∆t ∆v    (6.4)

x^(k+1) = x^(k) + ∆t v^(k+1)    (6.5)

We use a symplectic Euler integrator for its energy-preserving properties, particularly in systems of dynamics with rigid body contacts [40].
Due to accumulating errors inherent in simulation, particularly in implicit methods, particles may
intersect by small amounts. Since the contact impulse NCP of Equation (6.1) is specified in terms of velocity
and impulse, rather than position, such errors would accumulate if left untreated. To mitigate erroneous
Figure 6.3: Plot of speedup over realtime against hash table size (higher numbers are better). To maximize the number of collision
candidates, we simulate particles at rest inside a tall cylindrical column. The vertical lines mark the number of particles used in the
performance plot of the corresponding style. Based on the shape of these performance curves, we propose a heuristic hash table size
of twice the number of particles simulated.
penetration, we apply a stabilization scheme proposed by [13], applying a small corrective impulse along the contact normal of magnitude αψ/∆t, where ψ is the penetration depth and α is a user-chosen parameter.
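The core of the PJA update (lines 5–8 of Algorithm 1), with the stabilization bias αψ/∆t folded in, can be sketched in NumPy as follows; the function name and signature are illustrative, not from the released code:

```python
import numpy as np

def project_friction_cone(b, mu, bias=0.0):
    """Clamp a candidate contact impulse b, expressed in the contact
    frame with b[0] along the normal e1 (0-indexed here; Algorithm 1
    is 1-indexed), onto the Coulomb friction cone."""
    b = np.asarray(b, dtype=float).copy()
    # Normal impulse must be non-negative; bias = alpha*psi/dt adds
    # the penetration-stabilization impulse from line 5 of Algorithm 1.
    b[0] = max(b[0] + bias, 0.0)
    # Tangential impulse magnitude is capped at mu times the normal.
    t = np.linalg.norm(b[1:3])
    if t > mu * b[0]:
        b[1:3] *= mu * b[0] / t
    return b
```

Because each contact is projected independently, the Jacobi sweep over contacts is order-free, which is what makes it parallelizable on the GPU.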
6.3.2 Linked Spatial Hashmaps for Broadphase Nearest Neighbor Searches
A naive implementation for detecting collisions between neighboring particles scales as O(np^2) with the number of particles in the system and would contribute a significant source of computation time each timestep. To alleviate this, we implement a linked hashmap data structure to generate lists of candidate collision pairs,
Algorithm 2: Parallel Spatial Hashmap Construction
1  foreach particle i do
2      x̄_i ← Round(x_i / 2r);
3      h_i ← SpatialHash(x̄_i, n_h);
4      p ← AddressOf(table[h_i]);
5      head ← table[h_i];
6      while head = table[h_i] do
7          head ← AtomicCAS(p, head, i);
8      end while
9      next[i] ← head;
10 end foreach
Figure 6.4: Graphical representation of the spatial hashmap data structure for fast, parallel broadphase collision checking. Physical
coordinates are divided into cells of size c, each with a hash value h computed from the cell’s location. A linked hash table, shown in
the lower half of the figure, is constructed in parallel according to Algorithm 2, with one thread for each particle in the scene. This
data structure generalizes to any number of dimensions; we show a two-dimensional representation for clarity.
which can be built in parallel in GPU global memory. By hashing three-dimensional points according to a discretized spatial hashing rule, we can quickly check discretized regions of state space for particle occupancy and, therefore, significantly filter the list of potential collisions. We build this data structure atomically in GPU memory with a single Atomic Compare-and-Swap (CAS) instruction, using Algorithm 2. Since each particle can belong to at most one cell, the entire set of linked lists can be stored in a single, preallocated array of size np. A graphical representation of the data structure is shown in Figure 6.4. We hash the discretized coordinates of individual particles with Equation (6.6), a modification of the rule proposed by [150],
h(x_i) = ⊕_{j=1}^{3} p_j ( round( x_{i,j} / 2r ) − q )  (mod n_h),    (6.6)
Figure 6.5: Graphical illustration of the execution paths for warp divergence during collision detection during a single timestep of
simulation. Left: Execution pattern of a single fused loop, where all threads in a warp must wait for even a single impulse to be
computed. Performance of this approach is shown in Figure 6.6 as ONELOOP. Right: Execution pattern from precomputing all
collision checks and storing a list of impulses to be processed and then processing them later. Note that this allows impulses from
different collision checks to be computed simultaneously. Performance of this approach is shown in Figure 6.6 as TWOLOOPSFUSED
and TWOLOOPSSPLIT.
where x_{i,j} denotes the jth coordinate of the position x_i ∈ R^3, p1 = 73856093, p2 = 19349663, p3 = 83492791, q = 100, n_h is the hashmap size, ⊕ is the bitwise xor operator, and round rounds a real number to the nearest integer.
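A direct Python rendering of Equation (6.6); the helper name is illustrative:

```python
import numpy as np

PRIMES = (73856093, 19349663, 83492791)  # p1, p2, p3 from Equation (6.6)
Q = 100

def spatial_hash(x, r, n_h):
    """Map a 3D position to a cell index in [0, n_h) per Equation (6.6),
    where r is the particle radius and 2r the cell size."""
    cell = np.round(np.asarray(x) / (2.0 * r)).astype(np.int64)
    h = 0
    for p, c in zip(PRIMES, cell):
        h ^= p * (int(c) - Q)
    return h % n_h  # Python's % is non-negative for n_h > 0
```

Positions closer than a particle diameter along every axis discretize to the same cell and therefore share a hash bucket.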
We find that the size of the hash table has a significant effect on simulator performance and present results
in Figure 6.3. As a heuristic summary of these findings, we propose a hash table size of twice the number of
simulated particles.
We query the hash table in parallel with one thread per particle, traversing the linked list of particles with the same spatial hash as filtered candidates for collisions. Since particles can collide across cell boundaries, each thread must also traverse the 3^3 − 1 = 26 neighboring cell lists. During the linked list traversal, we check particles for collision against the particle associated with the current thread. We discuss a method for accelerating narrowphase collision checking and contact processing on SIMT GPUs in Section 6.3.4.
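A serial Python sketch of the build and query makes the structure concrete; the GPU version replaces the list-head update with the atomic compare-and-swap of Algorithm 2, the q offset of Equation (6.6) is dropped here for brevity, and all names are illustrative:

```python
import numpy as np
from itertools import product

PRIMES = (73856093, 19349663, 83492791)

def cell_hash(cell, n_h):
    """Hash an integer cell coordinate into [0, n_h)."""
    h = 0
    for p, c in zip(PRIMES, cell):
        h ^= p * int(c)
    return h % n_h

def build_table(x, r, n_h):
    """Linked spatial hashmap: table[h] is the head particle index of
    cell-list h (-1 if empty); next_[i] is the particle after i."""
    table = np.full(n_h, -1, dtype=np.int64)
    next_ = np.full(len(x), -1, dtype=np.int64)
    for i, xi in enumerate(x):
        h = cell_hash(np.round(xi / (2 * r)).astype(np.int64), n_h)
        next_[i] = table[h]  # the AtomicCAS push in the GPU version
        table[h] = i
    return table, next_

def candidates(i, x, r, n_h, table, next_):
    """Collision candidates for particle i: walk the lists of its own
    cell and the 3^3 - 1 = 26 neighboring cells."""
    cell = np.round(x[i] / (2 * r)).astype(np.int64)
    out = []
    for off in product((-1, 0, 1), repeat=3):
        j = table[cell_hash(cell + np.array(off), n_h)]
        while j != -1:
            if j != i:
                out.append(int(j))
            j = next_[j]
    return out
```

Note that the query returns a superset of the true collisions (hash collisions between distant cells are possible), which is exactly what a broadphase filter is allowed to do.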
6.3.3 Collision Geometry using Signed Distance Functions
Our simulation engine models granular materials as spherical particles of uniform size, so narrowphase interparticle collision checking between particles i and j is as simple as checking ||x_i − x_j||_2^2 ≤ (2r)^2. However, many interesting scenarios for robotic tasks involve objects of arbitrary geometry, like excavator buckets, bulldozer blades, or robot wheels. We represent general rigid-body geometry using Signed Distance Functions (SDFs). A function f_G : R^3 → R is a signed distance function for a closed, simple surface G if |f_G(x)| is the distance between x and the nearest point in G. More formally, given a distance function d, |f_G(x)| = inf{d(x, y) | y ∈ G}. Additionally, the sign of f_G is positive if x is outside of G and negative if x is inside of G. Using a signed distance function, it is simple to check collisions with spheres and, thus, to perform particle-body collision checks. For a sphere of radius r centered at x, the maximum penetration depth into a geometry G is given by Equation (6.7),

ψ(x) = f_G(x) − r.    (6.7)
There are many SDFs which can be represented in simple closed forms for geometric primitives [128]. Of broader interest are geometries described by arbitrary triangle meshes, which are commonly used in physical simulation. Computing an SDF for an arbitrary mesh can be quite expensive, depending on the resolution of the mesh surface. To mitigate this, we pre-compute the SDF values on a regular rectilinear three-dimensional grid which contains the bounds of the mesh geometry. To query the SDF, we use a trilinear interpolation scheme across the knot points in this grid. While this discretization naturally introduces approximation errors above and beyond the inherent discretization errors in triangle meshes, such errors are user-tunable through the grid resolution. Additionally, while the memory used by the grid scales as O(n^3) with the inverse of the grid resolution, an interpolated lookup in the grid is always O(1), regardless of the resolution or of the complexity of the represented geometry. To check for collisions between a world-frame particle i at coordinates 0x_i and
Figure 6.6: Speedup vs. realtime (higher numbers are better) for different orderings for processing contacts and collisions. To
maximize the number of collision candidates, we simulate particles at rest inside a tall cylindrical column. ONELOOP shows a naive
implementation, while TWOLOOPSFUSED shows the performance of splitting collision and contact processing but keeping them in a
single kernel. TWOLOOPSSPLIT shows the performance of splitting collision and contact processing across kernel calls.
a rigid body j at pose 0X_j ∈ SE(3), we transform the particle coordinates into the local coordinates of the body and evaluate Equation (6.7) as ψ(jX_0 · 0x_i), where jX_0 = (0X_j)^−1.
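The precomputed-grid lookup described above can be sketched as a single-point CPU version; the engine performs this in parallel on the GPU, a cubic grid is assumed for brevity, and the helper names are illustrative:

```python
import numpy as np

def make_grid(sdf, lo, hi, n):
    """Sample a callable sdf on an n x n x n rectilinear grid over [lo, hi]."""
    axes = [np.linspace(lo[k], hi[k], n) for k in range(3)]
    pts = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    return sdf(pts)

def grid_sdf(x, grid, lo, hi):
    """O(1) trilinear interpolation of the precomputed grid at point x."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    n = grid.shape[0]
    u = (np.asarray(x, float) - lo) / (hi - lo) * (n - 1)
    i = np.clip(np.floor(u).astype(int), 0, n - 2)
    f = u - i  # fractional position inside the cell
    val = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((f[0] if dx else 1 - f[0]) *
                     (f[1] if dy else 1 - f[1]) *
                     (f[2] if dz else 1 - f[2]))
                val += w * grid[i[0] + dx, i[1] + dy, i[2] + dz]
    return val

def penetration(x, grid, lo, hi, r):
    """Equation (6.7): a negative value means a particle of radius r
    centered at x penetrates the geometry."""
    return grid_sdf(x, grid, lo, hi) - r
```

The lookup cost is eight array reads and a weighted sum regardless of the source mesh's complexity, which is the point of the precomputation.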
6.3.4 Minimizing Warp Divergence
In CUDA programs, a warp is a collection of threads that execute the same instruction at the same time.
Since each thread sees different data in the SIMT architecture, conditional instructions may cause two threads
in the same warp to take different branches of control flow, a phenomenon called warp divergence. Since
the program counter must remain the same for each thread in the warp, CUDA environments will execute
both branches of the conditional, one after the other, and will mask out the effects of instructions on inactive
threads. In effect, warp divergence can substantially degrade program performance on the GPU, as threads may spend time “doing nothing” while waiting for other threads to finish their branches.
As suggested in [116], we minimize warp divergence by deferring computation of contact impulses until
after all collisions are detected and only marking particle pairs as “colliding” during an initial narrowphase
collision detection step. Due to the nature of the SIMT architecture, splitting the collision checking and
contact processing phases into two loops greatly reduces the number of instructions in the “colliding” code
path during collision detection. Despite an efficient broadphase filter on nearest neighbor search using spatial
hashing (Section 6.3.2), we find experimentally that only about 16% of candidate collisions are actually in
collision. Since it is significantly faster to store the indices of colliding pairs than it is to compute the contact
response impulse (Section 6.3.1), we can reduce the time spent waiting for threads assigned to non-colliding
particles. A graphical representation of the difference in approaches is shown in Figure 6.5. This performance
increase is confirmed experimentally, using a large pile of particles at rest in a tall cylindrical column designed
to maximize the number of interparticle collisions that are physically feasible. We show the experimental
difference in several approaches to collision-contact computation order in Figure 6.6. In summary, we find
that splitting the collision detection and contact impulse computation across two subsequent CUDA kernel
calls is the fastest approach for all tested numbers of particles.
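A schematic CPU analogue of the split makes the pattern concrete; the equal-and-opposite normal impulse below is a simplified stand-in for the PJA solve of Section 6.3.1, and all names are illustrative:

```python
import numpy as np

def collide_then_resolve(x, v, r):
    """Two-pass pattern of Section 6.3.4: pass 1 runs the cheap
    narrowphase check and records only the colliding pair indices;
    pass 2 computes impulses over that compact list, mirroring the
    TWOLOOPSSPLIT kernel split."""
    n = len(x)
    ii, jj = np.triu_indices(n, k=1)
    # Pass 1: narrowphase check, store indices only.
    d2 = np.sum((x[ii] - x[jj]) ** 2, axis=1)
    hit = d2 <= (2 * r) ** 2
    pairs = np.stack([ii[hit], jj[hit]], axis=1)
    # Pass 2: impulse computation over the compact pair list.
    dv = np.zeros_like(v)
    for i, j in pairs:
        nvec = x[i] - x[j]
        nvec /= np.linalg.norm(nvec)
        rel = np.dot(v[i] - v[j], nvec)
        if rel < 0:  # approaching: apply equal-and-opposite impulse
            dv[i] -= rel * nvec / 2
            dv[j] += rel * nvec / 2
    return pairs, dv
```

On a SIMT GPU the payoff is that the expensive second pass runs over a dense list of real collisions, so no warp wastes lanes on non-colliding candidates.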
6.4 Benchmark Environments
Using the simulation methodology described in Section 6.3, we construct several exemplary simulation
environments which showcase various aspects of our simulation engine, particularly as it pertains to robotics
test and learning environments. We also fully describe the environment we use for performance benchmarks
of our engine.
6.4.1 Bulldozing
In the BULLDOZING task, shown in Figure 6.7, the controlled robot is a tracked vehicle with a scoop rigidly attached to the front, modeled after a toy bulldozer STL file. We set up an OpenAI Gym [23]-style environment, designed for a high-achieving agent to plan a series of actions to use the bulldozer to move material from one zone to another within a fixed time budget, which we briefly describe in this subsection.
Figure 6.7: On the left is an image taken from the bulldozer simulation environment with 50000 granular material particles simulated
at 1kHz, running in real-time on an NVIDIA RTX 3080 Ti GPU. Visible here are particles interacting with the non-convex geometry
of a toy bulldozer model, as well as mounding effects from stable interparticle contacts with dry Coulomb friction. In the top right is displayed a 36-by-36 pixel “ego view” perspective camera showing a depth image from the cockpit of the bulldozer, while on the
bottom right is a 72-by-36 pixel depth image taken from a “sky view” orthographic camera. Combined with the lateral position and
orientation of the bulldozer, these images are used as the observation space for a reinforcement learning environment.
Dynamics We use the simulation engine described in this paper, running on a machine with a 72-core
Intel(R) Xeon(R) Gold 6154 CPU and four NVIDIA GeForce RTX 2080 Ti GPUs. We note that our engine
currently does not meaningfully support multiple GPUs, besides running multiple independent simulations
simultaneously. On this hardware, our simulation is able to run at realtime with tens of thousands of particles.
However, to decrease the time required for training, we sacrifice physical accuracy for computation speed by increasing the simulation timestep to 0.01 s. We note that this increases numerical error during integration, causing nonphysical behavior, particularly the collapse of tall stacks of particles and a general tendency of steep slopes to “ooze” down to a shallower configuration. However, we believe that the differences in particulate behavior are subtle and that these errors do not significantly affect the performance of training agents in this environment. For environments and tasks in which particle stacking and stable steep slopes are required, this tradeoff may, of course, not be acceptable.
Observations We compute two image observations of the environment, both on the GPU. The “ego
camera” is a perspective depth camera that moves with the robot mesh and points out of the “front windshield”
Figure 6.8: A plot of agent total episode return (averaged over the 100 most recent episodes) against wall clock time, using the TD3
algorithm. The simulation environment, reward structure, and computer hardware used in this experiment are described in depth in
Section 6.4.1.
of the bulldozer. The “sky camera” is an orthographic third-person depth camera that faces down from the
top of the scene and captures the heights of particle stacks across the entire environment. These images
are computed on the same GPU used for simulation, minimizing expensive host-device memory transfers.
Additionally, we provide the x and y positions of the bulldozer, as well as the yaw angle θ about the global
z-axis.
Actions We use a track steering model parametrized by the box [−1, 1] × [−1, 1] ⊂ R^2. The first axis describes the linear velocity of the bulldozer along the direction it is facing, and the second axis describes the angular velocity around the yaw axis of the bulldozer.
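A minimal sketch of this action model as a unicycle-style kinematic update (illustrative; not the environment's actual integrator, which steps the vehicle alongside the particle dynamics):

```python
import numpy as np

def track_steer_step(pose, action, dt):
    """Advance a planar pose (x, y, yaw) under the [-1, 1]^2
    track-steering action: action[0] is forward velocity,
    action[1] is yaw rate."""
    x, y, theta = pose
    v, w = np.clip(action, -1.0, 1.0)  # enforce the action box
    return (x + dt * v * np.cos(theta),
            y + dt * v * np.sin(theta),
            theta + dt * w)
```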
Rewards The agent receives a reward of 100/n for each particle inside a “goal box” in the environment, where n is the total number of simulated particles. To provide a smoother reward function, we assign a negative value of −d/n to each particle outside the box, where d is the distance from the particle to the closest point on the box.
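This reward can be computed in one vectorized pass; the following is a sketch with an illustrative helper name, assuming an axis-aligned goal box:

```python
import numpy as np

def bulldozing_reward(particles, box_lo, box_hi):
    """Reward of Section 6.4.1: +100/n per particle in the goal box,
    -d/n per particle outside, where d is the distance to the
    closest point on the box."""
    n = len(particles)
    nearest = np.clip(particles, box_lo, box_hi)  # closest box point
    d = np.linalg.norm(particles - nearest, axis=1)
    inside = d == 0.0  # clip is the identity for particles in the box
    return (100.0 * inside.sum() - d[~inside].sum()) / n
```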
We train this environment using the TD3 algorithm and show that an agent is able to learn to achieve
higher rewards than an initial random policy. We do not present these results as an exemplary demonstration
of robot learning, but instead to show that complex environments can be simulated fast enough to train
Reinforcement Learning agents in days.
6.4.2 Helical Gear Tower
Figure 6.9: Plots of speedup vs. realtime (higher numbers are better) against the number of rigid bodies in the scene (left) and the
number of particles simulated (right, note log x axis). The benchmark environment simulated is described in Section 6.4.2.
To test our performance and demonstrate our ability to scale to multiple rigid bodies of arbitrary, non-convex geometry, we simulate a tall, cylindrical containment tower with a varying number of involute
Figure 6.10: Image of the helical gear benchmark setup that we use to test performance as it scales with the number of bodies and the
number of particles. Particles are colored according to their z-axis coordinate.
helical gears rotating at uniformly random velocities, as shown in Figure 6.10. To keep particles flowing,
we implement a cyclic boundary condition at the vertical extents of the cylinder, where particles passing
through the bottom cap are transported to the top of the cylinder with the same velocity. However, we
do not implement this cyclic boundary for nearest neighbor contact searches. We present results of how
performance scales with the number of particles and the number of rigid bodies in Figure 6.9. While we do
not propose a quantitative measurement of the “complexity” of a mesh geometry, we select these helical
gears as a benchmark of interaction with arbitrary non-convex geometry, and note that due to our gridded
SDF representation of collision geometries discussed in Section 6.3.3, our performance does not change with
the shape or complexity of the mesh.
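The cyclic boundary condition above amounts to a modular wrap of the vertical coordinate, leaving velocities untouched; a minimal sketch with illustrative names:

```python
import numpy as np

def cyclic_z(x, z_lo, z_hi):
    """Wrap particle z-coordinates into [z_lo, z_hi): particles leaving
    the bottom of the column reappear at the top (Section 6.4.2);
    velocities are unchanged."""
    x = x.copy()
    span = z_hi - z_lo
    x[:, 2] = z_lo + (x[:, 2] - z_lo) % span
    return x
```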
6.4.3 Excavation
In the EXCAVATION environment, shown in Figure 6.1, the robotic manipulator is a 7-DoF Franka Emika
Panda robot with a tool rigidly attached that is similar in shape to an excavator bucket. The robot accepts
velocity control inputs in joint space. In our testing, we are able to interact with 50000 particles at realtime speeds on a single NVIDIA GeForce RTX 3080 Ti, using a simulation timestep of 5×10^−4 s.
We select the Franka Emika Panda robot for its flexibility and because it has become commonplace in robotics research. However, the Panda highlights an important limitation of our simulation: rigid bodies are kinematically driven and do not accumulate reaction forces from the environment.
Particularly in granular manipulation tasks, there is a stark difference in the amount of force required to
push a cutting edge through an aggregate (the penetration force), and the amount of force required to push a
flat surface, like the back of a shovel, through an aggregate (the deadload force). Since the Panda robot is
designed for safety around humans and in research applications, it has a relatively low maximum applicable
force, leading to a wide gap in behavior between simulation and reality, which we intend to address in future
improvements to our engine. Currently, users may manually compute the total force on an individual particle
or rigid body, and use this in a higher-level decision to terminate an action or training episode, which we
propose as a mitigation for this limitation.
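As an illustration of that mitigation, one could gate episode termination on an aggregate penetration-based force estimate; the linear stiffness k and every name below are hypothetical, not part of the released engine:

```python
import numpy as np

def tool_reaction_force(particles, sdf_local, pose_inv, r, k):
    """Hypothetical helper: estimate the aggregate reaction force on a
    kinematic tool by summing a stiffness-weighted penetration depth
    (Equation (6.7)) over all particles. pose_inv is the 4x4
    homogeneous world-to-tool transform; k is an assumed contact
    stiffness."""
    local = particles @ pose_inv[:3, :3].T + pose_inv[:3, 3]
    psi = np.array([sdf_local(p) - r for p in local])
    return k * np.clip(-psi, 0.0, None).sum()
```

A caller could then truncate an action or training episode whenever this estimate exceeds the robot's force limit, as suggested above.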
6.5 Placement Within Related Works
Simulation of granular dynamics is of great interest for many tasks besides robotics, including Computer
Aided Design and Engineering of earthmoving equipment [119], scientific sampling apparatus [133], and
production lines [47]. Accordingly, there are several overarching approaches for the simulation of granular
dynamics in existing literature, each with various design decisions that affect computational efficiency and
accuracy. Broadly, these are the Discrete Element Method (DEM), which represents the Lagrangian state of granular materials as a set of distinct bodies; Computational Fluid Dynamics and Material Point Method (MPM) approaches, which use a hybrid Eulerian-Lagrangian state representation; and reactive force theories, which are extremely efficient and use a fully data-driven approach, but model only the force exerted on a tool by soil. Our presented framework falls firmly into the DEM category.
The simulation of bulk material by the interactions of discrete particles with Lagrangian state is generally
called the Discrete Element Method (DEM). One popular DEM framework for
granular dynamics is Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) [152], which
offers a “Granular” package for rigid or elastic interparticle physical contact forces. A fork of LAMMPS,
called LAMMPS Improved for General Granular and Granular Heat Transfer Simulations (LIGGGHTS) [82],
also offers large-scale granular dynamics simulations with arbitrary particle geometries and sizes, as well as
interactions with arbitrarily shaped rigid objects. While both LAMMPS and LIGGGHTS are well-tested and
proven simulation codes that offer significant flexibility across multiple physical contact models, they
have been primarily designed for large clusters of CPUs communicating with networked message-passing
interfaces, and are inherently limited in their ability to reach “interactive” rates of simulation for large
numbers of particles.
Other solvers, such as Ansys Rocky [5], target granular interactions on GPUs, and are flexible, high
quality, and high performance. However, these systems are closed-source simulation codes without published
methods, which require expensive licenses, and prohibit user modification of the underlying software.
Algoryx Momentum Granular [139] provides real-time simulation of granular materials, along with a two-way coupling with rigid multibody physics, by using an adaptive multiscale particle model, presumably executing on a CPU. While the authors publish their methodology, the implementation is also closed source and available only under a commercial license.
Project Chrono is a multiphysics engine supporting multibody rigid dynamics, computational fluid
dynamics, and solid mechanics through finite element analysis. Project Chrono also offers Chrono::GPU [41],
a GPU-accelerated engine for granular flows. Chrono::GPU does not target real-time interaction, instead
supporting elastic contact modes for very high accuracy simulations, which run significantly slower than
real time.
Wheel-soil interaction, also called terramechanics, is a key aspect of modeling wheeled robots traversing
off-road terrains. This behavior is especially important during highly dynamic motions or when traversing
loose or rocky soils. Previous approaches have found success by directly learning input-output maps of
vehicle dynamics on such terrains from real trajectories collected by human experts, and then using the
learned model in Model-Predictive Path Integral control schemes to achieve “drifting”-like behavior
on dirt racetracks [160].
Additionally, there have been several efforts in robotics to control heavy earth-moving equipment like
excavators. Early efforts used a physically based analytical model of soil, computing only reactive forces
of the earth on the bucket mechanism of the excavator [122]. More recent efforts have achieved
impressive results controlling excavator buckets in soils of inhomogeneous density by combining a
reduced analytical reactive force model with heavy domain randomization to
train a reinforcement learning policy for bucket control [38]. A key stated motivation for the use of reduced,
force-based models in these works is the computational intensity required for a full, Lagrangian-state particle
simulation of soil mechanics. Reactive force models are limited in their ability to richly represent the state of
the soil under manipulation, and thus cannot fully simulate tasks such as material transport.
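The flavor of such a reduced reactive model can be seen in the classical fundamental earthmoving equation, which maps cutting depth and bulk soil parameters to a single resistive force on a flat blade. The Python sketch below uses placeholder coefficients and is illustrative only; the cited works use more elaborate, calibrated variants, and the N-factors in reality depend on soil friction angle and tool geometry rather than being constants.

```python
def blade_reaction_force(depth, width, unit_weight=18000.0, cohesion=2000.0,
                         surcharge=0.0, N_gamma=1.5, N_c=5.0, N_q=2.0):
    """Soil reaction force (N) on a flat blade, in the spirit of the
    fundamental earthmoving equation:
        F = w * (gamma * z**2 * N_gamma + c * z * N_c + q * z * N_q)
    with depth z (m), blade width w (m), soil unit weight gamma (N/m^3),
    cohesion c (Pa), and surcharge q (Pa). The N-factor values here are
    placeholders, not calibrated soil/tool constants."""
    z, w = depth, width
    return w * (unit_weight * z**2 * N_gamma
                + cohesion * z * N_c
                + surcharge * z * N_q)
```

Because the model outputs only a resultant force, it says nothing about where displaced material ends up, which is exactly the limitation for tasks such as material transport.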
Other approaches include the Material Point Method (MPM), which models Lagrangian particle states but integrates their dynamics
on Eulerian grids. MPM is an extremely flexible modeling approach for a wide variety of physical phenomena,
but in general it is not efficient to run at interactive simulation speeds [56], despite high quality implementations
being available as part of the DiffTaichi [73] GPU framework.
There are many high quality interfaces for GPU programming in high level languages, including the
aforementioned Taichi framework and the Warp [100] library from NVIDIA. We choose the Julia language
due to its composable interfaces, high quality GPU libraries [19], and the native compilation of non-GPU
code.
6.6 Conclusion and Future Work
We present a methodology for a simulation framework of hundreds of thousands of particles in dry frictional
contact, which in bulk approximate granular media. Our framework is designed for execution on CUDA GPUs, and
we describe methods for achieving high performance on such architectures. While our approach does not model all
types of materials, our engine favors speed while modeling a Lagrangian state, which is critical for robotic
tasks that need to manipulate the state of soil. A performant implementation of the simulation architecture,
which was used for the performance benchmarks in this paper, is written in the Julia programming language
and is released as open-source software. This framework handles one-way contact coupling with
rheonomically constrained rigid bodies of arbitrary geometry, and allows a user-tunable tradeoff between
geometry accuracy and CUDA memory usage, while maintaining constant compute time performance for
rigid body collision detection.
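One standard way to obtain such a tradeoff, sketched here in Python for illustration (the released engine is written in Julia, and whether it uses exactly this scheme is an assumption), is to precompute a signed-distance voxel grid for each rigid body: grid resolution trades memory for geometric accuracy, while each particle-body query costs the same regardless of mesh complexity.

```python
import numpy as np

def build_sdf_grid(sdf_fn, lo, hi, resolution):
    """Sample a signed-distance function onto a dense voxel grid over the
    box [lo, hi]. A coarser `resolution` uses less memory but smooths
    fine geometric detail."""
    axes = [np.linspace(lo[i], hi[i], resolution) for i in range(3)]
    pts = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    return axes, sdf_fn(pts)

def query_sdf(axes, grid, p):
    """Constant-time nearest-voxel lookup of the signed distance at point p
    (a real engine would interpolate trilinearly instead)."""
    idx = tuple(int(np.clip(np.searchsorted(a, p[i]), 0, len(a) - 1))
                for i, a in enumerate(axes))
    return grid[idx]
```

A negative query value indicates the particle is inside the body, and a contact response can then be applied along the local SDF gradient.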
Our model of granular interaction incorporates several simplifications and is not an appropriate model for
all of the various types of granular interactions present in nature. Particularly in robotics, however, where fast
simulation allows planning algorithms to predict dynamics in closed-loop control algorithms and enables
fast environments for robot learning setups, we believe there is a role for a granular simulation engine that is
focused on speed and which can represent arbitrary state configurations of particulate media.
To demonstrate the utility of our simulation framework for tasks in robotics, we present a collection
of benchmark robotic tasks involving the manipulation of granular materials, and show
that state-of-the-art reinforcement learning algorithms can be brought to bear on these problems. An
open-source implementation of these environments, along with an OpenAI Gym-like API, is available in the
linked software repository.
We hope that these contributions spur increased research in the difficult area of robotic manipulation of granular materials. We see several clear paths toward a more complete simulation framework, which
we list in the following paragraphs and intend to address in future work.
Two-Way Coupling For tasks where the mass of the accumulated material is non-negligible compared to
the actuation power of the manipulator, a full rigid-body or multiphysics dynamics engine, and a full two-way
force coupling is required.
Differentiability Many robotic dynamic planning algorithms and system identification algorithms use
Jacobians of the dynamics to accelerate optimization. The Lagrangian particle state representation is
not amenable to useful direct differentiation, due to its permutation invariance. A carefully chosen state
representation could pave a path toward meaningful differentiability.
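One candidate, sketched below in Python, is a fixed Eulerian density grid: histogramming particle positions onto a grid produces a state that is invariant to particle ordering. The hard binning shown here is itself piecewise constant and so not directly differentiable; a practical variant would spread each particle with a smooth kernel. This sketch only illustrates the permutation invariance.

```python
import numpy as np

def particles_to_density_grid(positions, lo, hi, resolution):
    """Count particles per cell of a regular grid over the box [lo, hi].
    Reordering the rows of `positions` leaves the result unchanged,
    unlike the raw Lagrangian particle list."""
    lo = np.asarray(lo)
    hi = np.asarray(hi)
    cell = (hi - lo) / resolution
    idx = np.clip(np.floor((positions - lo) / cell).astype(int),
                  0, resolution - 1)
    grid = np.zeros((resolution,) * positions.shape[1])
    np.add.at(grid, tuple(idx.T), 1.0)  # unbuffered scatter-add of counts
    return grid
```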
Non-Homogeneity While the presented framework allows for an easily replaceable interparticle contact
model, each particle has the same parameters and contact equations. Since many real-world materials are
non-homogenous, efficient computation for such media is an important contribution towards a fully flexible
simulator.
Chapter Acknowledgments
Chapter 7
Conclusions and Future Work
This thesis presents a collection of work on the use of simulation for robotics research. The work is organized
into four major contributions, each of which is presented as a separate chapter. The first contribution (Chapter 3) describes the NeuralSim framework, which augments differentiable simulators with neural networks to
improve their accuracy and efficiency. The second contribution (Chapter 4) describes a differentiable finite
element simulator for robots manipulating soft objects, and describes a system for estimating the parameters
of the simulator from real-world data. The third contribution (Chapter 5) describes a learning-based approach
to the dynamics of soft objects, and demonstrates that it can be used for control to track fast trajectories. The
fourth contribution (Chapter 6) describes GranularGym, a high-performance simulator for robotic tasks with
granular materials.
Together these contributions push forward the ability of robots to accurately model their environments,
and to use data from the real world to improve their models. This is a critical step towards robots that can
operate capably in the real world, particularly with limited amounts of data. The work presented here is a
step towards that goal, but there is still much work to be done.
Closing the Learning Loop. The work presented in this thesis has focused on the use of simulation
to improve the accuracy of models for robotic manipulation. However, the frameworks presented here in
Chapters 3, 4, and 5 learn once from data, and then use that model for control. Closing the learning loop, akin
to techniques in Model-Based Reinforcement Learning (MBRL), would allow for continued improvement of
the model during operation. Of particular interest is the use of differentiable models to guide the collection
of data in specific regions where the model may be ill-conditioned or inaccurate.
Control of Granular Media. The work presented in Chapter 6 demonstrates the use of GranularGym
to simulate a variety of tasks with granular media. However, the control of granular media is still an open
problem, as is the related long-horizon planning of tasks. The problem is complicated by the fact that the
instantaneous dynamical systems formed during granular interactions are non-smooth and transient, and there
are combinatorially many of them. Furthermore, the problem has an interesting permutation symmetry, where the
order of the particles in the system does not matter, as many tasks require an amount of material to be moved
to or removed from a given location, but not a specific set of particles. Much work focuses on various model
reductions, both from physical principles and from data, but an area of interest is the use of differentiability
and an exploitation of the structure of the large-scale complementarity problem, combined with an ability to
simulate (and thus predict) at faster than real-time speeds.
Parameter Estimation of Granular Media. Chapter 6 demonstrates the use of GranularGym with
largely known parameters. In limited experiments, we have demonstrated the ability to match the parameters
of the simulator to experimental setups in the real world, with good suspected initial guesses of the simulation
parameters. The real world, of course, is seldom so kind. The ability to robustly estimate the parameters, and
distributions of the parameters of the simulator from data, and particularly from data collected by the robot,
would be a significant step towards the use of granular media in robotic tasks.
Multiphysics Simulation. The work presented in this thesis has focused on the use of several different
types of simulation engines. Chapter 3 uses a differentiable rigid body simulator, Chapter 4 uses a differentiable
finite element simulator, Chapter 5 uses a learned model directly from real data, and Chapter 6 uses a non-differentiable rigid body contact solver. Each of these tools is presented in its own limited scope, but there is
significant opportunity to combine them into a single framework. This would allow for a particularly rich
set of environments for robotic tasks. Multiphysics simulation is a challenging problem, but there are many
interesting frameworks for computational efficiency, like co-simulation, which could be brought to bear on
this problem for modern hardware.
Bibliography
[1] Sameer Agarwal, Keir Mierle, et al. Ceres Solver. http://ceres-solver.org.
[2] Anurag Ajay, Jiajun Wu, Nima Fazeli, Maria Bauza, Leslie P Kaelbling, Joshua B Tenenbaum, and
Alberto Rodriguez. “Augmenting Physical Simulators with Stochastic Neural Networks: Case Study
of Planar Pushing and Bouncing”. In: IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS). 2018.
[3] Ilge Akkaya, Marcin Andrychowicz, Maciek Chociej, Mateusz Litwin, Bob McGrew, Arthur Petron,
Alex Paino, Matthias Plappert, Glenn Powell, Raphael Ribas, et al. “Solving Rubik’s Cube with a
Robot Hand”. 2019. arXiv: 1910.07113.
[4] Frank Allgöwer and Alex Zheng. Nonlinear model predictive control. Vol. 26. Birkhäuser, 2012.
[5] Ansys Rocky | Particle Dynamics Simulation Software.
https://www.ansys.com/products/fluids/ansys-rocky. (Visited on 01/30/2023).
[6] Rika Antonova, Jingyun Yang, Priya Sundaresan, Dieter Fox, Fabio Ramos, and Jeannette Bohg. “A
Bayesian Treatment of Real-to-Sim for Deformable Object Manipulation”. In: IEEE Robotics and
Automation Letters 7.3 (2022), pp. 5819–5826.
[7] Brian Armstrong, Oussama Khatib, and Joel Burdick. “The Explicit Dynamic Model and Inertial
Parameters of the PUMA 560 Arm”. In: Proceedings. 1986 IEEE International Conference on
Robotics and Automation. Vol. 3. IEEE. 1986, pp. 510–518.
[8] Veronica E. Arriola-Rios, Puren Guler, Fanny Ficuciello, Danica Kragic, Bruno Siciliano, and
Jeremy L. Wyatt. “Modeling of Deformable Objects for Robotic Manipulation: A Tutorial and
Review”. In: Frontiers in Robotics and AI 7 (2020), p. 82. ISSN: 2296-9144. DOI:
10.3389/frobt.2020.00082. (Visited on 09/14/2021).
[9] Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, and J. Zico Kolter.
“End-to-End Differentiable Physics for Learning and Control”. In: Advances in Neural Information
Processing Systems 31. Ed. by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi,
and R. Garnett. Curran Associates, Inc., 2018, pp. 7178–7189. URL: http:
//papers.nips.cc/paper/7948-end-to-end-differentiable-physics-for-learning-and-control.pdf.
[10] Jernej Barbič and Doug L James. “Real-Time Subspace Integration for St. Venant-Kirchhoff
Deformable Models”. In: ACM Transactions on Graphics (TOG) 24.3 (2005), pp. 982–990.
[11] Jernej Barbič and Jovan Popović. “Real-time control of physically based simulations using gentle
forces”. In: ACM Transactions on Graphics (TOG) 27.5 (2008), pp. 1–10.
[12] Peter Battaglia, Razvan Pascanu, Matthew Lai, Danilo Jimenez Rezende, et al. “Interaction networks
for learning about objects, relations and physics”. In: Advances in Neural Information Processing
Systems. 2016, pp. 4502–4510.
[13] J. Baumgarte. “Stabilization of Constraints and Integrals of Motion in Dynamical Systems”. In:
Computer Methods in Applied Mechanics and Engineering 1.1 (June 1972), pp. 1–16. ISSN:
0045-7825. DOI: 10.1016/0045-7825(72)90018-7. (Visited on 02/02/2023).
[14] Joachim Baumgarte. “Stabilization of constraints and integrals of motion in dynamical systems”. In:
Computer methods in applied mechanics and engineering 1.1 (1972), pp. 1–16.
[15] Bradley Bell. CppAD: a package for C++ algorithmic differentiation.
http://www.coin-or.org/CppAD. 2020.
[16] Ted Belytschko, Wing Kam Liu, Brian Moran, and Khalil Elkhodary. Nonlinear Finite Elements for
Continua and Structures. John Wiley & Sons, 2014.
[17] James M Bern, Pol Banzet, Roi Poranne, and Stelian Coros. “Trajectory Optimization for
Cable-Driven Soft Robot Locomotion.” In: Robotics: Science and Systems. Vol. 1. Issue: 3. 2019.
[18] James M. Bern, Yannick Schnider, Pol Banzet, Nitish Kumar, and Stelian Coros. “Soft Robot
Control With a Learned Differentiable Model”. In: 2020 3rd IEEE International Conference on Soft
Robotics (RoboSoft). May 2020, pp. 417–423. DOI: 10.1109/RoboSoft48309.2020.9116011.
[19] Tim Besard, Christophe Foket, and Bjorn De Sutter. “Effective Extensible Programming: Unleashing
Julia on GPUs”. In: IEEE Transactions on Parallel and Distributed Systems (2018). ISSN: 1045-9219.
DOI: 10.1109/TPDS.2018.2872064. arXiv: 1712.03112 [cs.PL].
[20] Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B Shah. “Julia: A Fresh Approach to
Numerical Computing”. In: SIAM Review 59.1 (2017), pp. 65–98. DOI: 10.1137/141000671.
[21] Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B Shah. “Julia: A Fresh Approach to
Numerical Computing”. In: SIAM review 59.1 (2017), pp. 65–98.
[22] Francesco Biscani and Dario Izzo. “A parallel global multiobjective framework for optimization:
pagmo”. In: Journal of Open Source Software 5.53 (2020), p. 2338. DOI: 10.21105/joss.02338.
[23] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and
Wojciech Zaremba. OpenAI Gym. 2016.
[24] Peter Brown and John McPhee. “A 3D ellipsoidal volumetric foot–ground contact model for forward
dynamics”. In: Multibody System Dynamics 42.4 (2018), pp. 447–467.
[25] Daniel Bruder, Brent Gillespie, C. David Remy, and Ram Vasudevan. “Modeling and Control of Soft
Robots Using the Koopman Operator and Model Predictive Control”. In: arXiv:1902.02827 [cs]
(July 2019). arXiv: 1902.02827. URL: http://arxiv.org/abs/1902.02827 (visited on 09/14/2021).
[26] Steven L Brunton and J Nathan Kutz. Data-driven science and engineering: Machine learning,
dynamical systems, and control. Cambridge University Press, 2019.
[27] Maria Bualat, Jonathan Barlow, Terrence Fong, Chris Provencher, and Trey Smith. “Astrobee:
Developing a Free-flying Robot for the International Space Station”. In: AIAA SPACE 2015
Conference and Exposition. American Institute of Aeronautics and Astronautics. DOI: 10.2514/6.2015-4643.
(Visited on 09/15/2021).
[28] Bob Carpenter, Matthew D Hoffman, Marcus Brubaker, Daniel Lee, Peter Li, and
Michael Betancourt. “The Stan math library: Reverse-mode automatic differentiation in C++”. In:
arXiv preprint arXiv:1509.07164 (2015).
[29] Justin Carpentier and Nicolas Mansard. “Analytical Derivatives of Rigid Body Dynamics
Algorithms”. In: Robotics: Science and Systems. 2018.
[30] Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. “Neural Ordinary
Differential Equations”. In: Advances in Neural Information Processing Systems 31. Ed. by
S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett. Curran
Associates, Inc., 2018, pp. 6571–6583. URL:
http://papers.nips.cc/paper/7892-neural-ordinary-differential-equations.pdf.
[31] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. “Neural ordinary
differential equations”. In: Advances in Neural Information Processing Systems. 2018,
pp. 6571–6583.
[32] Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter. Fast and Accurate Deep Network
Learning by Exponential Linear Units (ELUs). 2016. arXiv: 1511.07289 [cs.LG].
[33] Richard W Cottle, Jong-Shi Pang, and Richard E Stone. The Linear Complementarity Problem.
SIAM, 2009.
[34] Erwin Coumans and Yunfei Bai. “PyBullet, a Python Module for Physics Simulation for Games,
Robotics and Machine Learning”. In: (2016). URL: http://pybullet.org.
[35] Erwin Coumans and Yunfei Bai. “PyBullet, a Python module for physics simulation for games,
robotics and machine learning”. In: (2016–2020). URL: http://pybullet.org.
[36] Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, and J Zico Kolter.
“End-to-End Differentiable Physics for Learning and Control”. In: Advances in neural information
processing systems 31 (2018).
[37] Simon Duenser, James M Bern, Roi Poranne, and Stelian Coros. “Interactive robotic manipulation of
elastic objects”. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS). IEEE, 2018, pp. 3476–3481.
[38] Pascal Egli, Dominique Gaschen, Simon Kerscher, Dominic Jud, and Marco Hutter. “Soil-Adaptive
Excavation Using Reinforcement Learning”. In: IEEE Robotics and Automation Letters 7.4 (Oct.
2022), pp. 9778–9785. ISSN: 2377-3766. DOI: 10.1109/LRA.2022.3189834.
[39] Hilding Elmqvist, Sven Erik Mattsson, and Martin Otter. “Modelica-a Language for Physical System
Modeling, Visualization and Interaction”. In: Proceedings of the 1999 IEEE International Symposium
on Computer Aided Control System Design (Cat. No. 99TH8404). IEEE. 1999, pp. 630–639.
[40] Tom Erez, Yuval Tassa, and Emanuel Todorov. “Simulation Tools for Model-Based Robotics:
Comparison of Bullet, Havok, MuJoCo, ODE and PhysX”. In: 2015 IEEE International Conference
on Robotics and Automation (ICRA). May 2015, pp. 4397–4404. DOI: 10.1109/ICRA.2015.7139807.
[41] Luning Fang, Ruochun Zhang, Colin Vanden Heuvel, Radu Serban, and Dan Negrut. “Chrono::GPU:
An Open-Source Simulation Package for Granular Dynamics Using the Discrete Element Method”.
In: Processes 9.10 (Oct. 2021), p. 1813. ISSN: 2227-9717. DOI: 10.3390/pr9101813. (Visited on
02/05/2023).
[42] Kenneth A. Farley, Kenneth H. Williford, Kathryn M. Stack, Rohit Bhartia, Al Chen, Manuel de la
Torre, Kevin Hand, Yulia Goreva, Christopher D. K. Herd, Ricardo Hueso, Yang Liu, Justin N. Maki,
German Martinez, Robert C. Moeller, Adam Nelessen, Claire E. Newman, Daniel Nunes,
Adrian Ponce, Nicole Spanovich, Peter A. Willis, Luther W. Beegle, James F. Bell, Adrian J. Brown,
Svein-Erik Hamran, Joel A. Hurowitz, Sylvestre Maurice, David A. Paige,
Jose A. Rodriguez-Manfredi, Mitch Schulte, and Roger C. Wiens. “Mars 2020 Mission Overview”.
In: Space Science Reviews 216.8 (Dec. 2020), p. 142. ISSN: 1572-9672. DOI:
10.1007/s11214-020-00762-y. (Visited on 02/02/2023).
[43] Roy Featherstone. Rigid Body Dynamics Algorithms. Berlin, Heidelberg: Springer-Verlag, 2007.
ISBN: 0387743146.
[44] Jean Feng and Noah Simon. Sparse-Input Neural Networks for High-dimensional Nonparametric
Regression and Classification. 2019. arXiv: 1711.07592 [stat.ME].
[45] Isabel M. Rayas Fernández, Christopher E. Denniston, David A. Caron, and Gaurav S. Sukhatme.
“Informative Path Planning to Estimate Quantiles for Environmental Analysis”. In: IEEE Robotics
and Automation Letters 7.4 (2022), pp. 10280–10287.
[46] Zipeng Fu, Xuxin Cheng, and Deepak Pathak. “Deep Whole-Body Control: Learning a Unified
Policy for Manipulation and Locomotion”. In: Conference on Robot Learning (CoRL). 2022.
[47] S. Geer, M. L. Bernhardt-Barry, E. J. Garboczi, J. Whiting, and A. Donmez. “A More Efficient
Method for Calibrating Discrete Element Method Parameters for Simulations of Metallic Powder
Used in Additive Manufacturing”. In: Granular Matter 20.4 (Oct. 2018), p. 77. ISSN: 1434-7636.
DOI: 10.1007/s10035-018-0848-4. (Visited on 02/02/2023).
[48] Moritz Geilinger, David Hahn, Jonas Zehnder, Moritz Bächer, Bernhard Thomaszewski, and
Stelian Coros. ADD: Analytically Differentiable Dynamics for Multi-Body Systems with Frictional
Contact. 2020. arXiv: 2007.00987 [cs.GR].
[49] Markus Giftthaler, Michael Neunert, Markus Stäuble, Marco Frigerio, Claudio Semini, and
Jonas Buchli. “Automatic differentiation of rigid body dynamics for optimal control and estimation”.
In: Advanced Robotics 31.22 (2017), pp. 1225–1237. DOI: 10.1080/01691864.2017.1395361.
[50] Morgan T. Gillespie, Charles M. Best, Eric C. Townsend, David Wingate, and Marc D. Killpack.
“Learning nonlinear dynamic models of soft robots for model predictive control with neural
networks”. In: 2018 IEEE International Conference on Soft Robotics (RoboSoft). Apr. 2018,
pp. 39–45. DOI: 10.1109/ROBOSOFT.2018.8404894.
[51] Florian Golemo, Adrien Ali Taiga, Aaron Courville, and Pierre-Yves Oudeyer. “Sim-to-Real
Transfer with Neural-Augmented Robot Simulation”. In: Proceedings of The 2nd Conference on
Robot Learning. Ed. by Aude Billard, Anca Dragan, Jan Peters, and Jun Morimoto. Vol. 87.
Proceedings of Machine Learning Research. PMLR, 29–31 Oct 2018, pp. 817–828. URL:
http://proceedings.mlr.press/v87/golemo18a.html.
[52] Samuel Greydanus, Misko Dzamba, and Jason Yosinski. “Hamiltonian Neural Networks”. In:
Advances in Neural Information Processing Systems 32. Ed. by H. Wallach, H. Larochelle,
A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett. Curran Associates, Inc., 2019,
pp. 15379–15389. URL: http://papers.nips.cc/paper/9672-hamiltonian-neural-networks.pdf.
[53] Gaël Guennebaud, Benoît Jacob, et al. Eigen v3. http://eigen.tuxfamily.org. 2010.
[54] David Ha and Jürgen Schmidhuber. “Recurrent World Models Facilitate Policy Evolution”. In:
Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information
Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada. Ed. by
Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and
Roman Garnett. 2018, pp. 2455–2467. URL: https:
//proceedings.neurips.cc/paper/2018/hash/2de5d16682c3c35007e4e92982f1a2ba-Abstract.html.
[55] Huy Ha and Shuran Song. “Flingbot: The Unreasonable Effectiveness of Dynamic Manipulation for
Cloth Unfolding”. In: Conference on Robot Learning. PMLR. 2022, pp. 24–33.
[56] Amin Haeri, Dominique Tremblay, Krzysztof Skonieczny, Daniel Holz, and Marek Teichmann.
“Efficient Numerical Methods for Accurate Modeling of Soil Cutting Operations”. In: 37th
International Symposium on Automation and Robotics in Construction. Kitakyushu, Japan, Oct. 2020.
DOI: 10.22260/ISARC2020/0085. (Visited on 02/04/2023).
[57] William W. Hager and Hongchao Zhang. “Algorithm 851: CGDESCENT, a Conjugate Gradient
Method with Guaranteed Descent”. In: ACM Trans. Math. Softw. 32.1 (Mar. 2006), pp. 113–137.
ISSN: 0098-3500. DOI: 10.1145/1132973.1132979.
[58] David Hahn, Pol Banzet, James M Bern, and Stelian Coros. “Real2sim: Visco-elastic Parameter
Estimation from Dynamic Motion”. In: ACM Transactions on Graphics (TOG) 38.6 (2019), pp. 1–13.
[59] David Hahn, Pol Banzet, James M Bern, and Stelian Coros. “Real2sim: Visco-elastic parameter
estimation from dynamic motion”. In: ACM Transactions on Graphics (TOG) 38.6 (2019). Publisher:
ACM New York, NY, USA, pp. 1–13.
[60] Siyu He, Yin Li, Yu Feng, Shirley Ho, Siamak Ravanbakhsh, Wei Chen, and Barnabás Póczos.
“Learning to predict the cosmological structure formation”. en. In: Proceedings of the National
Academy of Sciences of the United States of America 116.28 (July 2019), pp. 13825–13832.
[61] Eric Heiden, Ziang Liu, Ragesh K. Ramachandran, and Gaurav S. Sukhatme. “Physics-based
Simulation of Continuous-Wave LIDAR for Localization, Calibration and Tracking”. In:
International Conference on Robotics and Automation (ICRA). IEEE. 2020.
[62] Eric Heiden, Miles Macklin, Yashraj Narang, Dieter Fox, Animesh Garg, and Fabio Ramos.
“DiSECt: A Differentiable Simulation Engine for Autonomous Robotic Cutting”. In: arXiv preprint
arXiv:2105.12244 (2021).
[63] Eric Heiden, Miles Macklin, Yashraj Narang, Dieter Fox, Animesh Garg, and Fabio Ramos. “Disect:
A Differentiable Simulation Engine for Autonomous Robotic Cutting”. In: Robotics: Science and
Systems (2021).
[64] Eric Heiden, David Millard, and Gaurav S. Sukhatme. “Real2Sim Transfer using Differentiable
Physics”. In: R:SS Workshop on Closing the Reality Gap in Sim2real Transfer for Robotic
Manipulation (2019).
[65] Eric Heiden, David Millard, Hejia Zhang, and Gaurav S. Sukhatme. “Interactive Differentiable
Simulation”. In: CoRR abs/1905.10706 (2019). arXiv: 1905.10706. URL:
http://arxiv.org/abs/1905.10706.
[66] Sepp Hochreiter and Jürgen Schmidhuber. “Long short-term memory”. In: Neural computation 9.8
(1997). Publisher: MIT Press, pp. 1735–1780.
[67] Gerhard A. Holzapfel. Nonlinear Solid Mechanics: A Continuum Approach for Engineering. Wiley,
2000. ISBN: 978-0-471-82304-9.
[68] Gerhard A. Holzapfel. “Nonlinear Solid Mechanics: A Continuum Approach for Engineering
Science”. en. In: Meccanica 37.4 (July 2002), pp. 489–490. ISSN: 1572-9648. DOI:
10.1023/A:1020843529530. (Visited on 09/14/2021).
[69] Peter C Horak and Jeff C Trinkle. “On the similarities and differences among contact models in robot
simulation”. In: IEEE Robotics and Automation Letters 4.2 (2019), pp. 493–499.
[70] Peter C. Horak and Jeff C. Trinkle. “On the Similarities and Differences Among Contact Models in
Robot Simulation”. In: IEEE Robotics and Automation Letters 4.2 (Apr. 2019), pp. 493–499. ISSN:
2377-3766, 2377-3774. DOI: 10.1109/LRA.2019.2891085. (Visited on 01/05/2023).
[71] Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, and
Frédo Durand. “DiffTaichi: Differentiable Programming for Physical Simulation”. In: ICLR (2020).
[72] Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, and
Frédo Durand. “DiffTaichi: Differentiable Programming for Physical Simulation”. In: ICLR (2020).
[73] Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, and
Frédo Durand. “Difftaichi: Differentiable Programming for Physical Simulation”. In: arXiv preprint
arXiv:1910.00935 (2019). arXiv: 1910.00935.
[74] Yuanming Hu, Jiancheng Liu, Andrew Spielberg, Joshua B Tenenbaum, William T Freeman,
Jiajun Wu, Daniela Rus, and Wojciech Matusik. “ChainQueen: A Real-Time Differentiable Physical
Simulator for Soft Robotics”. In: Proceedings of IEEE International Conference on Robotics and
Automation (ICRA) (2019).
[75] Yuanming Hu, Jiancheng Liu, Andrew Spielberg, Joshua B Tenenbaum, William T Freeman,
Jiajun Wu, Daniela Rus, and Wojciech Matusik. “ChainQueen: A Real-Time Differentiable Physical
Simulator for Soft Robotics”. In: Proceedings of IEEE International Conference on Robotics and
Automation (ICRA) (2019).
[76] Kenneth H Hunt and Frank R Erskine Crossley. “Coefficient of restitution interpreted as damping in
vibroimpact”. In: (1975).
[77] Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis,
Vladlen Koltun, and Marco Hutter. “Learning agile and dynamic motor skills for legged robots”. In:
Science Robotics 4.26 (2019), eaau5872.
[78] Wenzel Jakob. Enoki: structured vectorization and differentiation on modern processor architectures.
https://github.com/mitsuba-renderer/enoki. 2019.
[79] Yifeng Jiang and C. Karen Liu. “Data-Augmented Contact Model for Rigid Body Simulation”. In:
CoRR abs/1803.04019 (2018). arXiv: 1803.04019. URL: http://arxiv.org/abs/1803.04019.
[80] R.E. Kalman. “A new approach to linear filtering and prediction problems”. In: Journal of Basic
Engineering 82.1 (1960). Publisher: Citeseer, pp. 35–45.
[81] Donghyun Kim, Jared Di Carlo, Benjamin Katz, Gerardo Bledt, and Sangbae Kim. Highly Dynamic
Quadruped Locomotion via Whole-Body Impulse Control and Model Predictive Control. 2019. arXiv:
1909.06586 [cs.RO].
[82] Christoph Kloss, Christoph Goniva, Alice Hager, Stefan Amberger, and Stefan Pirker. “Models,
Algorithms and Validation for Opensource DEM and CFD–DEM”. In: Progress in Computational
Fluid Dynamics, an International Journal 12.2-3 (Jan. 2012), pp. 140–152. ISSN: 1468-4349. DOI:
10.1504/PCFD.2012.047457. (Visited on 01/30/2023).
[83] Nathan Koenig and Andrew Howard. “Design and Use Paradigms for Gazebo, an Open-Source
Multi-Robot Simulator”. In: 2004 IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS) (IEEE Cat. No.04CH37566). Vol. 3. Sept. 2004, 2149–2154 vol.3. DOI:
10.1109/IROS.2004.1389727.
[84] Twan Koolen and Robin Deits. “Julia for robotics: simulation and real-time control in a high-level
programming language”. In: International Conference on Robotics and Automation. May 2019.
[85] J Nathan Kutz, Steven L Brunton, Bingni W Brunton, and Joshua L Proctor. Dynamic Mode
Decomposition: Data-Driven Modeling of Complex Systems. SIAM, 2016.
[86] João Rui Leal and Brad Bell. joaoleal/CppADCodeGen: CppAD 2017. Version v2.2.0. July 2017.
DOI: 10.5281/zenodo.836832.
[87] Jeongseok Lee, Michael X. Grey, Sehoon Ha, Tobias Kunz, Sumit Jain, Yuting Ye,
Siddhartha S. Srinivasa, Mike Stilman, and C. Karen Liu. “DART: Dynamic Animation and Robotics
Toolkit”. In: Journal of Open Source Software 3.22 (2018), p. 500. DOI: 10.21105/joss.00500.
[88] Q. Lelidec, I. Kalevatykh, I. Laptev, C. Schmid, and J. Carpentier. “Differentiable simulation for
Appendix A
Much of my thesis involved producing new software tools for simulation and robotics research. These tools
are now largely available as open-source software, and I hope they will be useful to the community. I cover
them briefly in a series of short appendices, along with a few other pieces of information and computing
tools that I found valuable during development. For more information, please see each tool's documentation,
which is available online.
7.1 Background on the Julia Programming Language
The Julia programming language is a high-performance, dynamic programming language for technical
computing. It is designed to be fast and easy to use for numerical computing. It is a relatively new
language, first released in 2012, but has seen significant adoption in the scientific computing community.
The major strength of Julia lies in its dynamicity, combined with its JIT compilation strategies. This
means several things for the simulation engineer. First, Julia is easy to use for interactive
development: the interactive REPL allows for rapid development and testing of code, and the dynamic nature
of the language means that changes propagate immediately. JIT compilation means that native code
is generated immediately with the proper and most specific types, and the flexibility of the metaprogramming
system means that code can be generated for each specific system and set of parameters, i.e., requisite loops
can be unrolled and fused for each kinematic structure of a robot. This powerful combination of features is
almost unavailable in other languages traditionally used for scientific computing, such as C++. To achieve
similar performance, we had to trace, codegen, compile, and runtime-dlopen C++ code in the NeuralSim
framework; see Chapter 3 for more details. To say the least, this process in C++ was significantly less
ergonomic than the Julia equivalent.
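As a small illustration of this specialization (a toy sketch, not code from any of the packages in this thesis), structure can be encoded in the type itself; the JIT then compiles a separate, fully specialized method body for each instance:

```julia
# Toy sketch: encoding structure in types lets the JIT specialize per robot.
# `NLink{N}` is an illustrative stand-in for a kinematic chain of N links.
struct NLink{N}
    lengths::NTuple{N,Float64}
end

# The loop bound N is a compile-time constant, so for small N the JIT
# unrolls this sum into straight-line native code for each chain length.
reach(chain::NLink{N}) where {N} = sum(ntuple(i -> chain.lengths[i], Val(N)))

r = reach(NLink((1.0, 0.5, 0.25)))  # compiles a specialization for N = 3
```

Inspecting the result with `@code_native reach(NLink((1.0, 0.5, 0.25)))` shows the specialized machine code the compiler emits for this particular chain length.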
More importantly, Julia's type dynamicity, combined with its rich and concise mathematical syntax, makes
it an ideal system that serves as its own domain-specific language. The same piece of code, without
modification, can be used to generate its own derivatives, run Monte Carlo inference, or even be transformed
into PTX assembly to run on CUDA GPUs. These features are critical to the success of the FEM package in
Chapter 4 and the performance of GranularGym in Chapter 6. This flexibility made it nearly trivial to port
GranularGym to CUDA GPUs, multithreaded CPUs, and even Apple Metal GPUs on Apple Silicon chips, all
without code modification. Similar features are available in Python through PyTorch and other packages, but
packaging this functionality as a library makes it difficult to interoperate with existing, non-type-compatible
Python code, as opposed to the natively compositional packages of the Julia ecosystem.
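The "same code, new types" idea can be sketched with a toy forward-mode dual number (a minimal stand-in for what ForwardDiff.jl implements properly): generic Julia code computes its own derivative without modification.

```julia
# Toy forward-mode AD: a dual number carrying a value and a derivative.
# (Illustrative only; ForwardDiff.jl implements this idea robustly.)
struct Dual <: Number
    val::Float64
    der::Float64
end
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.convert(::Type{Dual}, x::Real) = Dual(float(x), 0.0)
Base.promote_rule(::Type{Dual}, ::Type{<:Real}) = Dual

# A physics-flavored function written with no knowledge of Dual:
spring_energy(x, k) = 0.5 * k * x * x

# Seeding x with derivative 1 yields dE/dx = k*x alongside the value:
E = spring_energy(Dual(2.0, 1.0), 3.0)  # E.val == 6.0, E.der == 6.0
```

The `spring_energy` function is ordinary scalar code; only the argument type changed, and the compiler generated a fresh specialization that propagates derivatives.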
7.2 A Julia wrapper for the Franka Emika Panda robot (Franka.jl)
The Franka.jl package is a Julia wrapper for the Franka Emika Panda robot. It is a thin wrapper around the
C++ API provided by Franka Emika, and provides a simple interface for controlling the robot at the lowest
available level, the 1kHz control loop. Since Julia code is compiled to native code and libfranka expects
a callback function at the 1kHz interface, after JITting, we can pass a pointer to the Julia equivalent of the
.text section directly to libfranka. After bouncing a raw pointer through a thin C++ trampoline, the
Julia code unpacks its own data structures and calls the user-supplied callback. The net result is completely
transparent to the user.
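The mechanism can be sketched in a few lines, with an illustrative scalar callback standing in for the real libfranka control signature:

```julia
# Sketch of passing JIT-compiled Julia code to a C API as a raw function
# pointer. The callback signature here is illustrative, not libfranka's.
double_it(x::Cdouble)::Cdouble = 2x

# @cfunction yields a C-callable pointer into the JIT-generated native code.
cb = @cfunction(double_it, Cdouble, (Cdouble,))

# A C library would store and invoke this pointer; we can exercise the same
# path from Julia with ccall:
result = ccall(cb, Cdouble, (Cdouble,), 21.0)  # == 42.0
```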
For completeness, an example of code using Franka.jl is provided below. This code is a simple Cartesian
impedance controller that is fully compliant in the global y direction while tracking a sinusoidal trajectory
in the global z direction. This code is also available in the Franka.jl documentation.
using Franka
using LinearAlgebra
using StaticArrays
using Rotations

Base.@kwdef mutable struct ZSineGenerator <: AbstractRobotController{CartesianPose}
    time::Float64 = 0.0
    center::Matrix{Float64} = zeros(4, 4)
    zA::Float64
    zf::Float64
end

function (c::ZSineGenerator)(state, dt::Real)
    c.time += dt
    finished = false
    if c.time == 0.0
        c.center = deepcopy(reshape(get_O_T_EE_c(state), 4, 4))
    end
    if c.time > 60.0
        finished = true
    end
    O_T_EE_c = deepcopy(c.center)
    dz = c.zA * (cos(2pi * c.zf * c.time) - 1)
    O_T_EE_c[3, 4] += dz
    return (O_T_EE_c, finished)
end

Base.@kwdef mutable struct CartesianImpedanceController{
    ModelT <: Model,
} <: AbstractRobotController{Torques}
    time::Float64 = 0.0
    model::ModelT
    Kp::SMatrix{6,6,Float64,36}
    Kd::SMatrix{6,6,Float64,36}
end

function (c::CartesianImpedanceController)(stateptr, dt::Real)
    state = stateptr[]
    c.time += dt
    finished = c.time > 60.0

    goal = SMatrix{4,4,Float64}(get_O_T_EE_d(state))
    pose = SMatrix{4,4,Float64}(get_O_T_EE(state))
    Js = SMatrix{6,7,Float64}(get_zero_jacobian(c.model, kEndEffector, state))
    dq = SVector{7,Float64}(get_dq(state))

    coriolis = SVector{7,Float64}(get_coriolis(c.model, state))

    e_translation = pose[SOneTo(3), 4] - goal[SOneTo(3), 4]

    R_goal = QuatRotation(goal[1:3, 1:3])
    R_current = QuatRotation(pose[1:3, 1:3])
    quatdot(q1, q2) = q1.s * q2.s + q1.v1 * q2.v1 + q1.v2 * q2.v2 + q1.v3 * q2.v3
    if quatdot(R_goal.q, R_current.q) < 0
        R_current = QuatRotation(-R_current.q.s, -R_current.q.v1, -R_current.q.v2,
                                 -R_current.q.v3)
    end
    e_quat = QuatRotation(inv(R_current) * R_goal)
    e_rotation = -R_current * SA[e_quat.q.v1, e_quat.q.v2, e_quat.q.v3]

    error = SA[e_translation..., e_rotation...]

    Fs = -c.Kp * error - c.Kd * Js * dq
    tau_J = Js' * Fs + coriolis
    return (tau_J, finished)
end

function main()
    robot = Robot("172.16.0.2")
    automatic_error_recovery!(robot)

    torque_thresh = Franka.StlArray7d()
    torque_thresh .= 99
    force_thresh = Franka.StlArray6d()
    force_thresh .= 500
    set_collision_behavior!(robot, torque_thresh, torque_thresh, force_thresh,
                            force_thresh)

    panda_home = [0.0, -pi / 4, 0.0, -3pi / 4, 0.0, pi / 2, pi / 4]
    control!(robot, JointGoalMotionGenerator(0.5, panda_home))

    Kp_translation = 5000.0
    Kp_rotation = 30.0
    Kp = Diagonal([Kp_translation, 0, Kp_translation, Kp_rotation, Kp_rotation,
                   Kp_rotation])
    Kd = 2.0 * sqrt.(Kp)
    Kp = SMatrix{6,6,Float64}(Kp)
    Kd = SMatrix{6,6,Float64}(Kd)
    torque_controller = CartesianImpedanceController(; model = load_model(robot), Kp, Kd)

    generator = ZSineGenerator(; zA = 0.1, zf = 0.5)

    control!(robot, torque_controller, generator)
end

if !isinteractive()
    main()
end
Combined with the flexibility of the Julia type system, this makes it easy to write complex controllers. An
example controller that accepts a constraint function on the end-effector position, and keeps the end
effector on the manifold defined by that constraint, is given in the Franka.jl documentation. It uses the
automatic differentiation system from ForwardDiff.jl to compute the corrective torques that hold the robot
on the constraint manifold.
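The core of that idea can be sketched as follows. The constraint and gains below are hypothetical and the gradient is written analytically for brevity; the documented Franka.jl example obtains it automatically with ForwardDiff.jl.

```julia
using LinearAlgebra

# Sketch of constraint-manifold correction for an end-effector position p.
# The constraint and gain are illustrative, not from Franka.jl itself.
c(p) = p[1]^2 + p[2]^2 - 1.0    # hypothetical: stay on a unit cylinder about z
gradc(p) = [2p[1], 2p[2], 0.0]  # analytic gradient; ForwardDiff.jl can supply this

# A simple penalty force pushing p back toward the manifold along the
# normalized constraint gradient.
corrective_force(p; k = 50.0) = -k * c(p) * normalize(gradc(p))

f = corrective_force([2.0, 0.0, 0.3])  # points back toward the cylinder
```

Mapping such a Cartesian force through the manipulator Jacobian transpose, as in the impedance controller above, yields the corrective joint torques.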