HIERARCHICAL TACTILE MANIPULATION ON A HAPTIC
MANIPULATION PLATFORM
by
Harry Zhe Su
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BIOMEDICAL ENGINEERING)
December 2018
Copyright 2018 Harry Zhe Su
Acknowledgements
I would like to start by thanking my thesis advisors: Dr. Gerald Loeb and Dr.
Stefan Schaal. I thank Dr. Loeb for giving me the opportunity to pursue my
graduate study with him when I first came to USC. He is a great role model by
setting his high standards and expectations in pursuing science and engineering
projects. I would also like to thank him for his patience, encouragement, and guidance that
enabled me to grow into a better scientist and engineer. I thank my second advisor,
Stefan, for creating a flexible and encouraging work environment that fostered
creativity and collaboration. He also encouraged me to travel all over the world to
share my work, learn from others’, and expand my professional network.
I would like to thank my committee, Dr. Gaurav S. Sukhatme, Dr. Heather
Culbertson, and Dr. James Finley, for their efforts and time in guiding me to
focus on the right work and the right philosophical thinking. I would also like to thank my
graduate advisor Mischal Diasanta for the advice, encouragement, and friendship
throughout my Ph.D. process.
Next, I would like to thank my colleagues and friends from the Medical Device
Development Facility, SynTouch, Computational Learning and Motor Control lab,
Robotic Embedded Systems Laboratory and Autonomous Motion Department at
Max Planck Institute for Intelligent Systems for collaboration, help in reviewing
my work, and experiments. I would specifically like to thank Oliver Kroemer and
Franziska Meier for your guidance, project brainstorming, and software development;
Gary Lin and Jeremy Fishel for providing your expertise in the BioTac tactile
sensors; and Giovanni Sutanto, Yevgen Chebotar, and Karol Hausman for providing
your expertise in robotics and software development.
A big shout out to all my friends for your support and distractions from graduate
school. First of all, I would like to thank all the long distance calls from friends
in China, Yue Zhen, Xiaokun Qin, Wenjing Pan, and many more. Thank you
to my wonderful friends I met in Los Angeles - Aditya Patel, Mandy Lai, Travis
Peterson, Shanie Liyanagamage, Adriana Nicholson Vest, Dylan Vest, Shraddha
More, Jessica Ortigoza, John Sunwoo, Enrique Argüelles, Gene Yu, Leonardo Nava,
Gary Lin, Katie Zheng and more. I hope we can continue having more adventures
together.
Finally, I would like to thank all my family, who have supported me throughout
the years. I especially want to thank my parents, Zhencen Zhang and Jinlong
Su, who have been my first and best role models. You helped me by providing
great examples of hard work, perseverance, selflessness, and generosity. Your
unconditional love, support, and sacrifices enabled me to pursue my education and
personal growth. To my aunts and uncles-in-law, thank you for introducing me to
the opportunities of pursuing graduate studies and research.
Lastly, thank you to my wife, Evelyn Qingqi Li, and her family, who have supported
me throughout this process, which has been full of challenges. You are always there
to celebrate any accomplishment and to support me through any hardship. To my
pets, companions, and friends, Luna and Mars, thank you for giving me your
selfless love and bringing cheerfulness and comfort to my life.
Contents
Acknowledgements ii
List of Tables viii
List of Figures ix
Abstract xviii
1 Introduction 1
1.1 The Current State of Tactile Perception and Manipulation . . . . . 3
1.1.1 Low-level Tactile Feedback . . . . . . . . . . . . . . . . . . . 5
1.1.2 High-level Tactile Feedback . . . . . . . . . . . . . . . . . . 6
1.2 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Background 11
2.1 Overview of the Haptic Robotic Manipulation Platform . . . . . . . 11
2.2 Human and Robot Skin Deformation Sensing . . . . . . . . . . . . . 11
2.2.1 Anatomy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.2 Evolutionary Development . . . . . . . . . . . . . . . . . . . 14
2.2.3 Mechanical Properties . . . . . . . . . . . . . . . . . . . . . 15
2.2.4 Sensory Transduction . . . . . . . . . . . . . . . . . . . . . . 16
3 Tactile Servoing 18
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2 Materials and Methods . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.2 Robot exploratory movements . . . . . . . . . . . . . . . . . 24
3.2.3 Normal and tangential force extraction . . . . . . . . . . . . 28
3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.1 Force Extraction . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.2 Pressing with orientation uncertainty . . . . . . . . . . . . . 30
3.3.3 Compliance discrimination . . . . . . . . . . . . . . . . . . . 32
3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4 Learning Tactile Feedback Model 40
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2 Background and Related Work . . . . . . . . . . . . . . . . . . . . . 43
4.2.1 Quaternion DMPs . . . . . . . . . . . . . . . . . . . . . . . 43
4.2.2 Related Work on Learning Feedback Models . . . . . . . . . 46
4.3 Learning Feedback Models via Phase-Modulated Neural Networks . 47
4.3.1 Learning Expected Sensor Traces . . . . . . . . . . . . . . . 48
4.3.2 Learning Feedback Models from Demonstration . . . . . . . 49
4.3.3 Phase-Modulated Neural Network Structure . . . . . . . . . 50
4.4 Learning Tactile Feedback Models: System Overview and Experimental Setup . . . . . . . . . . . . . . . . . 53
4.4.1 Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.4.2 Robot's Environmental Settings and Human Demonstrations with Sensory Traces Association . . . . . . . . . . . 55
4.4.3 Learning Pipeline Details and Lessons Learned . . . . . . . . 55
4.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.5.1 Fitting and Generalization Evaluation of PMNNs . . . . . . 60
4.5.2 Performance Comparison between FFNN and PMNN . . . . 60
4.5.3 Comparison between Separated versus Embedded Feature
Representation and Phase-Dependent Learning . . . . . . . 61
4.5.4 Evaluation of Movement Phase Dependency . . . . . . . . . 63
4.5.5 Unrolling the Learned Feedback Model on the Robot . . . . 63
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5 Precision Grip Force Control with Slip Detection and Classification 67
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.3.1 Force Estimation . . . . . . . . . . . . . . . . . . . . . . . . 71
5.3.2 Slip Detection . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.3.3 Slip Classification . . . . . . . . . . . . . . . . . . . . . . . . 74
5.3.4 Grip Controller . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4 Evaluation and Discussion . . . . . . . . . . . . . . . . . . . . . . . 77
5.4.1 Force Estimation . . . . . . . . . . . . . . . . . . . . . . . . 77
5.4.2 Slip Detection . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.4.3 Slip Classification . . . . . . . . . . . . . . . . . . . . . . . . 83
5.4.4 Grip Controller . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.5 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . 86
6 Precision Grip Force Control with Slip Prediction and Classification 88
6.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.4 Technical Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . 91
6.4.2 Data Collection . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4.3 Slip Prediction . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.4.4 Grip Control Law . . . . . . . . . . . . . . . . . . . . . . . . 95
6.5 Experiments completed and Results . . . . . . . . . . . . . . . . . . 98
6.5.1 Slip Prediction Results . . . . . . . . . . . . . . . . . . . . . 98
6.5.2 Grip Control Experiments . . . . . . . . . . . . . . . . . . . 99
6.6 Conclusion and future work . . . . . . . . . . . . . . . . . . . . . . 101
7 Manipulation Graph Acquisition Using Tactile Sensing 102
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.3.1 Demonstration and Multimodal Sensory Signals . . . . . . . 108
7.3.2 Sensorimotor Primitive Segmentation . . . . . . . . . . . . . 109
7.3.3 Segmentation Clustering and Skill Graph Generation . . . . 110
7.3.4 Skill Replay with Exploration and Mode Discovery . . . . . 112
7.3.5 Manipulation Graph Generation . . . . . . . . . . . . . . . . 113
7.4 Experimental Evaluations . . . . . . . . . . . . . . . . . . . . . . . 113
7.4.1 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.4.2 Segmentation Clustering and Skill Graph Generation . . . . 116
7.4.3 Mode Discovery . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.4.4 Manipulation Graph Generation with Failure Modes . . . . . 121
7.5 Conclusions and Future Work . . . . . . . . . . . . . . . . . . . . . 123
8 Conclusions and Future Directions 124
8.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
8.2 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.2.1 High-level Decision-making Using Tactile Sensing . . . . . . 126
8.2.2 Skill Outcome Verification . . . . . . . . . . . . . . . . . . . 127
8.2.3 Learning Tactile Servoing . . . . . . . . . . . . . . . . . . . 128
8.2.4 Nature to Inspire Robotics, Robotics to Understand Nature 129
8.2.5 Biomedical Applications . . . . . . . . . . . . . . . . . . . . 130
A Quaternion 134
Reference List 136
List of Tables
4.1 Force-torque control schedule for steps 2-4. . . . . . . . . . . . . . . 57
4.2 NMSE of the roll-orientation coupling term learning with leave-one-
demonstration-out test, for each primitive. . . . . . . . . . . . . . . 60
List of Figures
1.1 Control pipeline for object exploration during robotic tactile per-
ception. The dashed lines in this plot indicate that the information
is computed without feedback. The red arrow represents the PID
controllers used to track trajectories with joint torques. . . . . . . 3
1.2 The figure shows the hierarchical controller architecture with tactile
feedback. Each skill, represented as end-effector pose trajectories
and force trajectories, defines a low-level control loop, indicated by
the orange arrows. The blue arrows indicate the high-level control
loop, which involves the high-level policy selecting skills to execute
until their termination conditions are fulfilled, at which point the
high-level policy selects a new skill. The task context, linking tactile
information and high-level policy with the green arrows, is used
to acquire new skills in certain task contexts using tactile sensory
signals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 The large arrows at the top show a sequence of skill executions.
The bottom part of the figure shows how information is extracted
during the skill executions. Each of the small arrows has a horizontal
baseline that indicates the time window from which the information
is being extracted. The head of the blue arrows indicates the point in
time at which that information is used to perform an action or adapt
the skill execution. For example, skill initialization uses tactile data
from the previous skill to select and set the parameter of the current
skill. The red arrows indicate where information is extracted to
monitor the skill, but not alter the execution, e.g., for subsequent
learning or error detection. For example, if no contact event is
detected, then the outcome detection will simply continue as normal.
When a contact event occurs, the robot jumps ahead and terminates
the skill. For the event prediction, the robot monitors the skill
(red arrows) and then adapts the skill (blue arrow) to avoid the
contact event (blue cross) and continue executing the skill for its
full duration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1 Haptic Robotic Manipulation Platform. . . . . . . . . . . . . . . . . 12
2.2 Cross-sectional schematics of the BioTac sensor. . . . . . . . . . . . 12
2.3 (a). A volar view of the distal phalanx (A), and a lateral view of
the distal phalanx (B). Source: (Shrewsbury & Johnson, 1975); (b).
Side and ventral views of the rigid core of the BioTac. . . . . . . . . 13
2.4 The force-BioTac displacement data (open blue circles) during con-
tact with a flat rigid surface (inflation volume 0.15ml, inflation
pressure 11721.1 Pa according to factory specifications); the force-
fingertip displacement data (red dots) as one human subject taps
his or her fingertip on a flat, rigid surface. Redrawn from (Serina et
al., 1998) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1 (a) Orientations on the BioTac: the finger local coordinate frame has
its origin in the center of the two electrode pairs and is coplanar
with the flat surface of the core; (b) Electrode array map. . . . . . 23
3.2 (a) Barrett with BioTac pressing a compliant surface; (b) Force/position
control diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Algorithm for online orientation generation using tactile sensor feed-
back. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4 Coordinate frame of BioTac: each impedance sensing electrode has
a specific orientation in the BioTac coordinate frame.. . . . . . . . 28
3.5 Force measured on force plate (blue) and measured from the BioTac
(red) for pokes with various tangential components. . . . . . . . . 30
3.6 Typical BioTac impedance sensor feedback on the point of contact
and robot orientation behavior obtained from pressing on a compli-
ant surface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.7 Typical BioTac tangential force (X force) and normal force (Z force)
feedback obtained from pushing on a compliant surface. . . . . . . 32
3.8 Measured normal force from BioTac: five objects with five different
hardness are tested with Barrett robot equipped with BioTac. . . . 33
3.9 Measured indentation displacement from Barrett joint encoder. . . 34
3.10 Force vs. indentation displacement. . . . . . . . . . . . . . . . . . 35
3.11 Measured average pressure from MEMS pressure transducer on Bio-
Tac: the rate of average pressure and saturation pressure are used
to discriminate object compliance. . . . . . . . . . . . . . . . . . . 36
3.12 Typical BioTac lateral impedance electrode feedback from pressing
on different compliant surfaces. . . . . . . . . . . . . . . . . . . . . 38
4.1 Proposed framework for learning behavior adaptation based on associa-
tive skill memories (ASMs). . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Process pipeline of learning feedback model. . . . . . . . . . . . . . 48
4.3 Phase-modulated neural network (PMNN) with one-dimensional
output coupling term C. . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 Experimental setup of the scraping task. . . . . . . . . . . . . . . . 54
4.5 Comparison of regression results on primitives 2 and 3 using different
neural network structures. . . . . . . . . . . . . . . . . . . . . . . . 61
4.6 Comparison of regression results on primitives 2 and 3 using sepa-
rated feature learning (PCA or Autoencoder and phase kernel mod-
ulation) versus embedded feature learning (PMNN). . . . . . . . . 62
4.7 The top 10 dominant regular hidden layer features for each phase
RBF in primitive 2, roll-orientation coupling term, displayed in yel-
low. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.8 Snapshots of our experiment on the robot while scraping on the tilt
stage with a +10° roll angle environmental setting: without adapta-
tion (top figures, (a) to (d)) versus with adaptation (bottom figures,
(e) to (h)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.9 The roll-orientation coupling term (top) vs. the corresponding sen-
sor traces deviation of the right BioTac finger’s electrode #6 on
primitive 2 (bottom), during the scraping task on environmental (env.)
settings with the tilt stage's roll angle varied as specified in captions
(a)-(d). The x-axis is the time index, the y-axis of the top figures is
the coupling term magnitude (in radians), and the y-axis of the bottom
figures is the discretized sensor trace deviation magnitude (unitless). . 65
5.1 Robotic arm grasping a fragile object using a standard position con-
troller (left) and the proposed force grip controller (right). . . . . . 68
5.2 The coordinate frame of the BioTac sensor (adapted from (Su et al.,
2012)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.3 Control diagram of the grip controller. . . . . . . . . . . . . . . . . 76
5.4 Experimental setup for the force estimation comparison: the finger
is pressed at different positions and orientations against the force
plate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.5 The performance comparison between force estimation techniques.
Analytical approach is outperformed by the other methods. LWPR
and 1-layer NN perform well on the full data set but have low per-
formance on the test set. 3-layer NN avoids overfitting and yields
good results on the test set. . . . . . . . . . . . . . . . . . . . . . . 79
5.6 Example of force estimation with different methods over time. From
top to bottom: force estimation for dimensions F_x, F_y, F_z. . . . . . 80
5.7 Different objects used for the experiments. . . . . . . . . . . . . . . 81
5.8 An example run of the slip detection experiment. Using the BioTac
sensor we are able to detect the slip event before the IMU accelerom-
eter attached to the object measures any acceleration due to slip. . 82
5.9 Linear/rotational slip classification accuracy dependent on the time
of prediction. Red line shows the point when slip is detected based
on the IMU readings. . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.10 An example run of the grip controller that includes all of the grasp-
ing phases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.1 The Barrett Hand equipped with a pair of BioTacs is forming a
pinch grasp on an object. The Vicon motion capture markers are
used to track the 6D pose of the object. . . . . . . . . . . . . . . . 92
6.2 Left: An example of a rotational slip trial and slip prediction data
extraction using motion capture data. The slip prediction data
begins when the fingers start to load the weight of the object,
shown by the first dotted vertical line, which is labeled from the rel-
ative position changes between the fingers and the object due to small
finger skin distortions. The slip event, such as a rotational slip, is
labeled from the relative orientation between the fingertips and the
object, shown as the second vertical dotted line. The data between
these two vertical lines are extracted to train a slip predictor to
predict rotational slips. Right: An example of a stable grip with-
out translational and rotational slips. During the lifting movement,
there are no significant position or orientation differences between
the fingertips and the object. Therefore, the second vertical line is
labeled at the end of the lifting movement of the robot hand. The
data between these two vertical lines are extracted to train a slip
predictor to predict stable contact. . . . . . . . . . . . . . . . . . . 93
6.3 Feedforward adjustments of motor output to object weight (A), fric-
tional conditions (B), and object shape (C) in a task in which a
test object is lifted with a precision grip, held in the air, and then
replaced. The top graphs show the horizontal grip force, vertical
load force, and the vertical position of the object as a function of
time for two superimposed trials. The bottom graphs show the
relation between load force and grip force for the same trials. The
dashed line indicates the minimum grip-to-load force ratio required
to prevent slip. The gray area represents the safety margin against
slip. After contact with the object (leftmost vertical line, top), grip
force increases for a short period while the grip is established. A
command is then released for simultaneous increases in grip and load
force (second vertical line). This increase continues until the load
force overcomes the force of gravity and the object lifts off (third ver-
tical line). After replacement of the object and table contact occurs
(fourth line), there is a short delay before the two forces decline in
parallel (fifth line) until the object is released (sixth line). (adapted
from (Johansson & Westling, 1984, 1988; Jenmalm & Johansson,
1997)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.4 Training, testing, and generalization test results from the logistic regres-
sion classifier are presented in confusion matrices. . . . . . . . . . . 99
6.5 The classification results on these three novel objects are presented
in confusion matrices. . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.6 We included three novel objects to test the robustness, or the gen-
eralization capability, of the learned classifier. From the upper left,
clockwise: a long cardboard tea box with unknown center of mass,
a tall cardboard tea box with unknown center of mass, and a foam
brick. All three novel objects are deformable. . . . . . . . . . . . . 100
6.7 Percentage of success grip trials. . . . . . . . . . . . . . . . . . . . 100
7.1 Robot needs to learn to disambiguate successful insertion from failed
insertion into a screw. . . . . . . . . . . . . . . . . . . . . . . . . . 104
7.2 Overview of the framework used in this experiment. . . . . . . . . . 105
7.3 Experimental setup of demonstrating the grasping task. . . . . . . . 113
7.4 Experimental setup of demonstrating the peg-in-hole task. . . . . . 114
7.5 Experimental setup of demonstrating the unscrewing task. . . . . . 114
7.6 A: An example of joint BOCPD to segment sensorimotor primitives
in the grasping task; B: An example of joint BOCPD to segment
sensorimotor primitives in the unscrewing task; C: An example of
joint BOCPD to segment sensorimotor primitives in the peg inser-
tion task; D: Segments clustering and skill graph for the grasping
task; E: Segments clustering and skill graph for the unscrewing task;
F: Segments clustering and skill graph for the peg-in-hole task . . . 115
7.7 Similarity matrix heat-map (A, C, E) and spectral clustering (B,
D, F) of tactile signals of the exploratory movements at the goal of
each phase of grasping, unscrewing and peg-in-hole tasks. . . . . . . 118
7.8 A: failed to insert the tool-tip into the screw head; B: after the failed
insertion, continued twisting of the screwdriver fails to unscrew the
screw; C: failed to slide into the vertical groove and therefore missed
the corner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.9 Grasping Manipulation Graph . . . . . . . . . . . . . . . . . . . . . 121
7.10 Success and Failure Mode Detection . . . . . . . . . . . . . . . . . . 122
Abstract
A service robot deployed in human environments must be able to perform reli-
able dexterous manipulation tasks under many different conditions. These tasks
require robots to interact with objects under intrinsic pose uncertainties, such as
poor forward kinematics due to unmodeled nonlinear dynamics, and extrinsic pose
uncertainties, such as uncertainty in objects' poses. Recent advances in com-
puter vision and range sensing enable robots to detect their end-effectors' and
objects' poses reliably. However, pose estimation accuracy deteriorates when
visual occlusion is involved in the tasks. Even with correct pose estimation of the
robot and the objects, reliable dexterous manipulation tasks remain a challeng-
ing problem, because these tasks involve interacting with objects with unknown
material properties, such as mass, center of mass, compliance, shape, and friction.
Tactile sensors can be used to monitor hand-object interactions that are very
important in dexterous manipulation. Recently, biomimetic tactile sensors, designed
to provide more humanlike capabilities, have been developed. With such rich tac-
tile sensing capabilities, we propose a hierarchical tactile manipulation framework
to improve the robustness of the robotic manipulation so that robots could get
close to human-level performance in dexterous manipulation tasks. First, we dis-
cuss using heuristic tactile features to cope with external objects' pose uncertainties
through low-level feedback control, such as tactile servoing while estimating the
compliance material property. Second, we propose a framework to learn a low-level
feedback policy through human demonstration, and we demonstrate this framework
on a scraping task using the learned tactile feedback policy. Third, we present our slip
detection method, which is used to provide low-level feedback control to adjust the
grip force during a grasping task. Fourth, we present a framework to predict trans-
lational and rotational slips, which are used to update the desired feedforward grip
force during a precision pinch grip task. Finally, we present a framework to use
tactile events to perform skill acquisitions as well as skill performance evaluation,
a high-level perceptual process, through active exploration.
Chapter 1
Introduction
Humans have long envisioned a world in which robots are capable of relieving them
from various dexterous manipulation tasks, from household chores to medical care.
Conventional robots, specialized in a specific task, have achieved success in indus-
tries that require workers to perform highly repetitive tasks, such as manufacturing
tasks. However, these robots are limited to working in highly structured environments,
and they can be dangerous to human workers or even disrupt the workflow of
their human coworkers. In contrast, service robots should be able to handle
versatile manipulation tasks in human environments, which are not structured and
organized. They should be able to utilize tools specifically designed for humans.
Although actuation and sensing technologies have advanced significantly, service
robots perform very poorly compared to their human counterparts at what we
consider everyday tasks. A good example of this type of service robot would
be one that can cook in our kitchen. Cooking is a uniquely human behavior
that poses several unique challenges for a service robot:
• the ability to perceive objects with diverse material properties; for example, it
needs to discriminate the compliance of dough;
• the ability to adapt to environmental uncertainties; for example, it needs to pal-
pate dough with uncertain pose and curvature, and to use a
spatula to scrape chunks of food off a frying pan with uncertain pose, friction,
and inhomogeneous properties;
• the ability to control grip force to prevent slippage due to uncertain object
material properties, such as compliance, yield strength, friction, shapes,
mass, and center of mass;
• the ability to acquire new skills from human demonstrations.
Beyond these extrinsic uncertainties, fully dexterous manipulation is further
complicated by intrinsic uncertainties in the robot's hardware. On the state-of-
the-art DARPA autonomous robotic manipulation platform (Hackett et al., 2013),
non-linear cable stretch with motor-side encoders gives the cable-driven dual
robot manipulators very low accuracy (about 1.5 cm), which varies across
different parts of the workspace (Pastor et al., 2013).
To solve such challenging tasks, we first looked for clues in how humans
achieve such dexterous manipulation. Neurophysiology and psychophysics
studies on primate tactile perception and manipulation have shown that primates,
especially humans, rely on mechanoreceptors in their glabrous skin to perceive
materials properties (Jenmalm et al., 2003; Khalsa et al., 1998), react to envi-
ronmental uncertainties during manipulation (Johansson & Westling, 1984; Good-
win et al., 1998), and provide cues to terminate and select manipulation sub-
tasks (Johansson & Flanagan, 2009a). Given the importance of tactile sensing in
dexterous manipulation, we built a robot platform equipped with state-of-the-art
biomimetic tactile sensors, called BioTac (Wettels et al., 2008). In the two following
sections, we will first propose a hierarchical tactile manipulation framework to
close the perception-action loop, along with a summary of the current state of tactile
perception and manipulation, and then outline the contributions of my thesis.
Figure 1.1: Control pipeline for object exploration during robotic tactile percep-
tion. The dashed lines in this plot indicate that the information is computed
without feedback. The red arrow represents the PID controllers used to track
trajectories with joint torques.
1.1 The Current State of Tactile Perception and Manipulation
In robotic tactile perception, there have been many efforts to use tactile sensors for
information gathering. In a typical control pipeline to explore objects, as shown in
Fig. 1.1, the skill sequences, such as stereotypical exploratory movements, are
predefined based on previous observations from psychophysics studies, such as (Led-
erman & Klatzky, 1987). Material properties of objects are typically characterized
and/or classified using these tactile signals (Chu et al., 2015). The characterized
material properties can also be used to perform object recognition (Sinapov et al.,
2011). However, this control pipeline does not include any tactile information in
the perception-action loop of the robot control.
In my thesis, we propose a hierarchical controller architecture to provide tactile
feedback at different levels of robot control. Robots, like humans, generally need
to execute a sequence of skills to perform a task (Johansson & Flanagan, 2009b),
e.g., grasping, transporting, and releasing for pick-and-place tasks, or a sequence
of steps for locomotion. The semantics of tactile signals often change based on
the skill being executed. For example, bumping into another object is expected
during a placing skill, but not during transport. We define different types of tactile
feedback based on when the data is acquired and used for each skill. An overview
of the different types of feedback is shown in different colors in Fig. 1.2:
• Tactile information is used to provide low-level feedback control, shown as
the orange lines in Fig. 1.2, to adapt the end-effector pose/force trajectories
from a skill policy. The tactile information includes both contact information
on the robot itself, such as contact forces, and objects’ information, such as
material properties of the objects.
• Tactile information is used to provide high-level feedback, shown as the blue
lines in Fig. 1.2, to monitor the progress and performance of a skill execution
and select a revised skill if the currently selected skill fails to achieve the
goal of a manipulation task, which is monitored by a termination condition.
Sometimes a single skill execution cannot determine whether the goal of
a manipulation task has been achieved; additional skills are then required to
provide more tactile signals to verify the goal of the manipulation task.
• Tactile signals can also be used to acquire new skills from human demonstra-
tions based on a specific task context, shown as green lines in Fig. 1.2.
In this proposed architecture, a robot can adapt its execution and complete the
perception-action loop accordingly.
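To make the two loops concrete, the sketch below shows the overall execution cycle. It is a minimal illustration only: the object and method names (high_level_policy.select, skill.adapt, and so on) are hypothetical stand-ins, not the interfaces of the controllers actually developed in this thesis.

    # Minimal sketch of the hierarchical control loop described above.
    # All names are hypothetical illustrations, not the thesis's implementation.

    def run_task(high_level_policy, read_tactile, task_context):
        """Run skills until the high-level policy declares the task finished."""
        skill = high_level_policy.select(task_context, read_tactile(), outcome=None)
        while skill is not None:
            # Low-level loop (orange arrows): adapt pose/force setpoints from
            # tactile feedback until the skill's termination condition fires.
            while not skill.terminated(read_tactile()):
                setpoint = skill.adapt(read_tactile())
                skill.send_to_robot(setpoint)
            # High-level loop (blue arrows): use the observed outcome to pick
            # the next skill (or a recovery skill if this one failed).
            outcome = skill.outcome(read_tactile())
            skill = high_level_policy.select(task_context, read_tactile(),
                                             outcome=outcome)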
Figure 1.2: The figure shows the hierarchical controller architecture with tactile
feedback. Each skill, represented as end-effector pose trajectories and force trajec-
tories, defines a low-level control loop, indicated by the orange arrows. The blue
arrows indicate the high-level control loop, which involves the high-level policy
selecting skills to execute until their termination conditions are fulfilled, at which
point the high-level policy selects a new skill. The task context, linking tactile
information and high-level policy with the green arrows, is used to acquire new
skills in certain task contexts using tactile sensory signals.
1.1.1 Low-level Tactile Feedback
One of the most common uses for tactile feedback is in low-level control. In this
case, the robot is acquiring tactile data at a certain rate during the skill execution
and using the data to directly adapt the trajectories during skill execution. The
robot may, for example, use the tactile signals to control the forces applied to an
object during grasping or it may use the data to trace the edge of an object with
its fingertip using tactile servoing (Li et al., 2013). The actions and perceptions
are tightly coupled in this case, and the robot is frequently adapting its current
actions to the current tactile signals.
Instead of using manually generated tactile features, previous work on tactile-
driven manipulation with tools has tried to learn feedback models to correct the
position trajectories based on current tactile sensing, compensating for the position uncer-
tainty between tools and the environment, via reinforcement learning (Chebotar
et al., 2014) or motor babbling (Hoffmann et al., 2014). Another paper models
haptic tasks as POMDPs, learning tactile representations using deep
recurrent neural networks and learning to choose an optimal action based on
the current tactile representation in each state using deep Q-learning. It demonstrates
this framework on a knob-turning task and learns representations and policies offline
from demonstrated data (Sung et al., 2017).
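In spirit, such learned feedback models act as an additive correction on a nominal trajectory, driven by the deviation between actual and expected tactile signals (the scheme developed with PMNNs in Chapter 4). A minimal sketch follows, with a generic regressor standing in for the learned model and hypothetical array shapes:

    import numpy as np

    def apply_learned_feedback(model, nominal_traj, expected_traces, actual_traces):
        """Correct each setpoint of a nominal trajectory (T x d_pose) with a
        learned model mapping tactile deviations (T x d_tactile) to pose
        corrections. `model.predict` is a placeholder for any regressor."""
        deviations = actual_traces - expected_traces   # tactile "surprise"
        corrections = np.array([model.predict(dev) for dev in deviations])
        return nominal_traj + corrections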
1.1.2 High-level Tactile Feedback
An overview of high-level tactile feedback is illustrated in Fig. 1.3.
Skill Selection and Initialization
Tactile data is also used to select skills and initialize skill parameters in the high-
level of robot control. For example, a robot may select a shorter, faster step if it
feels that the ground is not as stable. A robot may select different rubbing move-
ments in terms of different normal forces and scanning velocities if it is discrimi-
nating two objects with very subtle differences in their texture properties (Fishel
& Loeb, 2012). That work uses a biomimetic tactile sensor to construct a feature
library of 117 objects by sliding the fingertip on each object's surface. Based on this library, it picks
the parameters of this sliding action to produce the greatest distinction between
the most plausible candidate objects. A robot may also decide how to regrasp an
object based on the current tactile information from a failed grasp attempt (Cheb-
otar et al., 2016). That approach learns spatiotemporal tactile features during a grasping task,
as well as the mapping between these tactile features and regrasping policies,
in a self-supervised manner. In this case, the tactile data provides high-level action
[Figure 1.3 diagram labels: Previous Skill(s), Skill, Next Skill(s); Initialization; Outcome/Termination; Contact Event; Low-Level Feedback (Closed-Loop); Selection/Initialization; Outcome Detection (Open-Loop); Verification; Event Prediction.]
Figure 1.3: The large arrows at the top show a sequence of skill executions. The
bottom part of the figure shows how information is extracted during the skill
executions. Each of the small arrows has a horizontal baseline that indicates the
time window from which the information is being extracted. The head of the blue
arrows indicates the point in time at which that information is used to perform
an action or adapt the skill execution. For example, skill initialization uses tactile
data from the previous skill to select and set the parameter of the current skill. The
red arrows indicate where information is extracted to monitor the skill, but not
alter the execution, e.g., for subsequent learning or error detection. For example,
if no contact event is detected, then the outcome detection will simply continue as
normal. When a contact event occurs, the robot jumps ahead and terminates the
skill. For the event prediction, the robot monitors the skill (red arrows) and then
adapts the skill (blue arrow) to avoid the contact event (blue cross) and continue
executing the skill for its full duration.
information. The skill selection and initialization are also performed only once for
each skill execution. Since the selection and initialization occur before the actual
skill, the tactile signal for these decisions is acquired during previous skills or at
the time of the decision.
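A minimal sketch of this style of action selection, in the spirit of (Fishel & Loeb, 2012): pick the exploratory movement whose feature measurements best separate the two currently most plausible objects. The data structures here are illustrative assumptions, not that work's implementation.

    import numpy as np

    def select_exploratory_movement(posterior, separation):
        """posterior: (n_objects,) belief over object identity.
        separation[m][i, j]: how well movement m's features distinguish
        objects i and j in the library (larger = more discriminative)."""
        i, j = np.argsort(posterior)[-2:]           # two most plausible objects
        scores = [sep[i, j] for sep in separation]  # one score per movement
        return int(np.argmax(scores))               # most informative movement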
Outcome Detection
Most skills are designed or learned to achieve specific goals. If the robot is holding
the object at the end of a grasping skill execution, then the skill's outcome is considered
to be a success. If the grasping skill leaves the robot empty-handed, it is considered
a failure and an error has occurred. To perform tasks in a robust manner under
various conditions, the robot will need to monitor its skill executions and detect
when errors occur and when goals are achieved. Outcome detection allows the
robot to select if it should continue as planned or switch to a different skill to
recover. Similar to tactile servoing, the tactile signals for outcome detection are
acquired during the skill’s execution. Previous work (Pastor et al., 2011; Kappler
et al., 2015) introduced a method of using sensor information collected over a series
of trials in the training phase to compute the average and variance of the expected
sensor feedback to predict the outcome of future trials.
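A minimal sketch of that statistic-based outcome detector, assuming time-aligned sensor traces from successful training trials; the threshold k is an illustrative choice, not a value from the cited work.

    import numpy as np

    def fit_expected_traces(training_trials):
        """training_trials: (n_trials, n_timesteps, n_sensors) sensor feedback
        from successful executions; returns per-timestep mean and spread."""
        mean = training_trials.mean(axis=0)
        std = training_trials.std(axis=0) + 1e-6   # guard against zero variance
        return mean, std

    def detect_outcome(trial, mean, std, k=3.0):
        """Flag a failure if any channel strays more than k standard deviations
        from the expected trace at any timestep of the execution."""
        z = np.abs(trial - mean) / std
        return "failure" if (z > k).any() else "success"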
Detecting outcomes can be performed either online or offline. Online approaches
continuously monitor the skill execution to detect goals and errors as soon as they
occur, such that the robot can adapt immediately. In offline approaches, the skill
is always executed in its entirety. The robot then evaluates if the goal was achieved
or if an error occurred. For example, previous work uses the tactile data at the
end of a grasp to determine if the blind grasp was successful or failed (Dang &
Allen, 2012; Bekiroglu et al., 2011; Madry et al., 2014). Offline outcome detection
is closely related to skill selection, as the robot will often need to select a different
next skill if an error occurred. The difference is that outcome detection provides
information for the current skill rather than the next skill. Outcome detection can,
therefore, be used for learning to adapt the skill for the future.
Event Prediction
Rather than waiting for an outcome to occur, the robot can predict events cor-
responding to outcomes and preemptively adapt to them. The tactile signals are
again acquired during the skill’s execution. However, rather than deciding whether
or not to continue the skill execution, the robot must choose how to adapt the skill
when it predicts that an error or a goal is about to be reached. For example, the
robot may predict slips and apply more force to stabilize a held object (Veiga et
al., 2015; Veiga & Peters, 2016).
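A minimal sketch of that preemptive adaptation, assuming some learned predictor already outputs a slip probability from recent tactile data; the threshold, gain, and force cap are illustrative values, not parameters from the cited work.

    def adapt_grip_force(slip_probability, desired_force,
                         threshold=0.5, gain=1.5, force_cap=20.0):
        """Scale up the grip-force setpoint (N) when slip is predicted as
        imminent, instead of waiting for the slip event itself."""
        if slip_probability > threshold:
            desired_force = min(gain * desired_force, force_cap)
        return desired_force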
Outcome Verification
Tactile signals depend on the actions being performed. The tactile signals acquired
during a skill’s execution may therefore not provide sufficient data for determining
the outcome of a skill. Tactile verification involves performing one or more addi-
tional skills to determine the outcome. For example, a robot may wiggle a screw
to verify that it was correctly inserted into a hole and is now constrained (Su et
al., 2018). In this case, the robot is using data from subsequent skills to determine
information about the original skill. Similar to offline outcome detection, outcome
verification cannot change the already completed skill execution, but it can be used
to adapt the skill for future executions.
1.2 Thesis Outline
This thesis is outlined as follows:
• Chapter 2 provides an overview of the haptic robot platform and a detailed
literature review on human vs. robot skin deformation sensing.
• Chapter 3 introduces tactile servoing using heuristic tactile features while
discriminating a material property, such as compliance.
• Chapter 4 introduces a framework to learn tactile features and a feedback
model to adapt to environmental uncertainties, such as orientation.
• Chapter 5 introduces a slip detection and grip control method to prevent
translational slippage due to uncertain object material properties, such as
mass and friction coefficients.
• Chapter 6 introduces slip prediction and grip control to prevent rotational
slippage due to environmental uncertainties, such as center of mass.
• Chapter 7 introduces a skill acquisition framework from human demonstra-
tions using tactile sensing.
• Chapter 8 concludes the thesis and proposes future work on developing gen-
eralized tactile feedback models and generalized predictive models.
Chapter 2
Background
2.1 Overview of the Haptic Robotic Manipulation Platform
We present a haptically-enabled robot (Fig. 2.1) with the Barrett arm/hand system whose
three fingers are equipped with novel biomimetic tactile sensors (BioTac®) (Lin
et al., 2009), designed to provide all sensing modalities and mechanical properties
of human fingertips. Each BioTac (see Fig. 2.2a) consists of a rigid core housing all
electronics and sensory components surrounded by an elastic skin that is inflated
with an incompressible and conductive fluid. When the skin contacts an object,
this fluid is displaced, resulting in distributed impedance changes in the electrode
array on the surface of the rigid core. A MEMS pressure transducer measures
hydrostatic pressure, which increases depending on the distribution of deformation
in the elastic skin. Temperature and heat flow are transduced by a thermistor near
the surface of the rigid core.
2.2 Human and Robot Skin Deformation Sensing
Each BioTac (see Fig. 2.2a,b) has an array of 19 electrodes surrounded by an
elastic skin. The skin is inflated with an incompressible and conductive liquid.
Figure 2.1: Haptic Robotic Manipulation Platform.
[Figure 2.2 schematic labels, panels (a) and (b): rigid core with integrated electronics; incompressible conductive fluid; elastomeric skin; thermistor; impedance sensing electrodes; hydroacoustic fluid pressure transducer; external texture (fingerprints); fingernail.]
Figure 2.2: Cross-sectional schematics of the BioTac sensor.
When the skin is in contact with an object, the liquid is displaced, resulting in
distributed conductance changes in the electrode array on the surface of the rigid
core. The conductance of each electrode tends to be dominated by the thickness
of the liquid between the electrode and the immediately overlying skin. The skin
has a pattern of asperities on its inner surface that gradually compresses with
increasing normal force, greatly increasing the dynamic force range (Wettels et al.,
Figure 2.3: (a). A volar view of the distal phalanx (A), and a lateral view of the
distal phalanx (B). Source: (Shrewsbury & Johnson, 1975); (b). Side and ventral
views of the rigid core of the BioTac.
2008). The distributed conductance changes can be used to detect the point of
contact, the magnitude and orientation of the force vector, compliance (Su et al.,
2012), and object spatial properties (Su et al., 2011; Wettels & Loeb, 2011). The
conductance of each electrode is measured as a voltage induced by a brief, current
regulated pulse and digitized (100 samples/s at 12-bit resolution) in the BioTac for
serial data transmission. As described in more detail below, four of these electrodes
are located on a small flat surface of the rigid core that corresponds to a similar
feature of the human distal phalanges (Fig. 2.3b).
2.2.1 Anatomy
In the human hand, the distal end of each distal phalanx possesses a flat expansion
called an apical tuft (also known as an ungual tuberosity or process), which serves
to support the fleshy pad or pulp on the volar side of the fingertips and the nails on
the dorsal side (Shrewsbury & Johnson, 1975). Two lateral ungual spines project
proximally from the apical tuft, see Fig. 2.3a. The pairs of ungual spines on the
human thumb are asymmetric (the ulnar side being more prominent) to ensure
that the thumb pulp is always facing the pulps of the other digits, an osteological
configuration which maximizes contact surface with handheld objects (Almecija et
al., 2010). Humans tend to use a “precision grip” to hold small tools, in which the
apical tufts of the opposing fingers are parallel to each other and to the surfaces
of the tool (Napier, 1956).
The shape of the molded epoxy core of the BioTac was based on careful measure-
ments of a human distal phalanx bone. As shown in Fig. 2.3b, the rigid core of
the BioTac consists of a spherical region and a cylindrical region, which can be
approximated as one-quarter of a sphere, and one-half of a cylinder, respectively.
On the spherical region, a flat surface at a 30-degree angle was developed to mimic
the human apical tuft.
2.2.2 Evolutionary Development
In (Susman, 1979; Shrewsbury & Johnson, 1983), the authors have shown that humans
have proportionately broader and more robust apical tufts on the distal phalanges
than other extant primates. The widened apical tufts support broad, palmar,
fibrofatty pads whose large surfaces distribute pressure during forceful grasping
and whose deformation accommodates the pads to uneven surfaces as well as fine-
tuning in the positioning of objects (Susman, 1979; Marzke & Shackley, 1986;
Susman, 1988). From an evolutionary point of view, the widened tufts and the
large frictional surface would have been essential for securing and controlling cradle
precision pinch of large preforms and three-jaw “baseball” grip of hammerstones in
habitual tool making and tool using (Marzke & Marzke, 2000; Mittra et al., 2007).
Compared to chimpanzees and hamadryas baboons, human pad-to-pad precision
grips are distinguished by the greater force with which objects may be secured by
the thumb and fingers of one hand (precision pinching) and the ability to adjust
Figure 2.4: The force-BioTac displacement data (open blue circles) during contact
with a flat rigid surface (inflation volume 0.15ml, inflation pressure 11721.1 Pa
accordingtofactoryspecifications); theforce-fingertipdisplacementdata(reddots)
as one human subject taps his or her fingertip on a flat, rigid surface. Redrawn
from (Serina et al., 1998)
the orientation of gripped objects through movements at joints distal to the wrist
(precision handling) (Marzke, 1997).
2.2.3 Mechanical Properties
Measurements of the contact force and fingertip displacement as individuals tap on
a flat, rigid surface (Serina et al., 1997) or grasp an object between the thumb and
index finger (Westling & Johansson, 1987) indicate that the fingertip pulp behaves as a nonlinear spring in
which most of the displacement of the fingertip pulp occurs at forces of less than
1N (red dots in Fig. 2.4). At higher forces, the pulp stiffens rapidly from around
3.5N/mm at 1N to over 20N/mm at 4 N, which mechanically protects the distal
phalanx from impact (Serina et al., 1997).
The core of the BioTac is surrounded by an elastomeric skin (Dow Corning Silastic
S, Shore A 26) and the space between the skin and the core is inflated with an
incompressible fluid (water with NaBr and polyethylene glycol) to give it com-
pliance similar to the human fingertip. The displacement of the fingertip of the
BioTac was measured by a linear motor that pressed a flat aluminum plate against
the BioTac, and the force between the BioTac and the flat plate was measured by
a six-axis force plate (AMTI HE6x6-16) supporting the BioTac (blue circles in Fig.
2.4). The relationship between the displacement and force was estimated with an
8th-order polynomial function, and the stiffness of the BioTac at 1 and 4N was
the first derivative of the force-displacement polynomial function at those specific
forces. The stiffnesses of the BioTac at 1 and 4 N are 3.38 N/mm and 14.53 N/mm,
respectively. The texturing of the inner surface of the skin provides sensitivity for
transduction of relatively small compression changes that occur in the high force
– high stiffness range of the BioTac (Wettels et al., 2008).
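The stiffness numbers above follow directly from the fitting procedure described. A minimal sketch of that computation, assuming displacement (mm) and force (N) are NumPy arrays recorded during the indentation experiment:

    import numpy as np

    def stiffness_at_forces(displacement, force, targets=(1.0, 4.0), order=8):
        """Fit force = p(displacement) with an 8th-order polynomial, then
        return the stiffness dF/dx (N/mm) at the displacements where the
        fitted force matches each target force, mirroring the text above."""
        p = np.polynomial.Polynomial.fit(displacement, force, order)
        dp = p.deriv()
        xs = np.linspace(displacement.min(), displacement.max(), 2000)
        return {f: float(dp(xs[np.argmin(np.abs(p(xs) - f))])) for f in targets}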
2.2.4 Sensory Transduction
Primate glabrous skin over the area of the apical tuft contains a higher density
of Merkel cells, which are highly sensitive to local skin stress/strain (Vallbo &
Johansson, 1978; Johansson & Vallbo, 1979). These cells are innervated by slowly
adapting, low-threshold afferents (SA-I afferent fiber) with small receptive fields
(Ogawa, 1996). SA-I afferents are particularly sensitive to spatial features, such
as points, edges, curvature and orientation (Phillips & Johnson, 1981), and their
population responses have been shown to encode the representation of these spatial
features independent of contact forces (Goodwin et al., 1995).
Similarly to the human apical tuft, the flat surface on the core of the BioTac is
equipped with a cluster of four identical electrodes, including electrodes 8 and
9 symmetrically distributed along the Y-axis, or the roll direction in the finger
coordinate frame, and electrodes 7 and 10 symmetrically distributed along the X-
axis, or the pitch direction in the finger coordinate frame (Fig. 2.3b). If the flat
surface is oriented parallel to a contact surface while pressing the skin against it, the
four electrodes should generate the same voltage values as normal force increases
symmetrically on all four electrodes (Su et al., 2012). Even slight rotations or tilts
of the sensor from parallel alignment with a contacting surface will produce large
asymmetries in skin deformation, which could be sensed by the voltage differences
between each pair of electrodes. For example, the voltage differences between
electrodes 8 and 9 can be used to detect the tilt angle along the roll direction of the
finger coordinate frame, as shown in Fig. 2.3b. Therefore, the voltage asymmetries
can be used to achieve very fine perception and control of the orientation of the
fingertip with respect to a contacting surface.
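The pair differences described here reduce to two scalar error signals; a minimal sketch, where the sign conventions are an illustrative assumption:

    def orientation_errors(v):
        """v: mapping from electrode number to measured voltage on the flat-
        surface cluster. Both differences vanish when the flat surface is
        parallel to the contacted surface, per the pairing in this section."""
        roll_error = v[8] - v[9]     # pair along the Y-axis (roll direction)
        pitch_error = v[7] - v[10]   # pair along the X-axis (pitch direction)
        return roll_error, pitch_error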
Chapter 3
Tactile Servoing
Humans have been shown to be good at using active touch to perceive subtle differ-
ences in compliance. They tend to use highly stereotypical exploratory strategies,
such as applying normal force to a surface. We developed similar exploratory
and perceptual algorithms for a mechatronic robotic system (Barrett arm/hand
system) equipped with liquid-filled, biomimetic tactile sensors (BioTac from Syn-
Touch LLC). The distribution of force on the fingertip was measured by the elec-
trical resistance of the conductive liquid trapped between the elastomeric skin and
a cluster of four electrodes on the flat fingertip surface of the rigid core of the Bio-
Tac. These signals provided closed-loop control of exploratory movements, while
the distribution of skin deformations, measured by more lateral electrodes and by
the hydraulic pressure, was used to estimate material properties of objects. With
this control algorithm, the robot plus tactile sensor was able to discriminate the
relative compliance of various rubber samples.
3.1 Introduction
Humans interact with compliant objects to judge ripeness of fruits, the air pressure
in bicycle tires or the quality of a mattress. Expert bakers judge the quality of
flour by evaluating physical firmness or toughness of dough (Katz, 1937). During
breast or prostate examinations, healthcare practitioners use their hands to locate
and characterize a hard lump in soft tissue. Unlike visual features such as size and
shape, compliance can only be appreciated via active or passive touch.
It is essential for social and personal assistive robots and prosthetic hands (a form
of telerobot) to be able to perceive material properties such as compliance to
handle household objects. The ability to interact with fragile objects is neces-
sary particularly if such systems are designed to interact physically with humans.
Compliance perception could also be beneficial to robot-assisted, minimally inva-
sive surgeries by detecting a hidden tumor in an organ or a calcified artery in heart
tissue (Yamamoto et al., 2009). A variety of tactile sensors have been designed
to solve the tactile sensing problems in robotic manipulation and medicine (Web-
ster, 1988), but their practical use is limited by the hostile environments to which
robotic and prosthetic hands are typically exposed. The BioTac is a robust and
easy to repair tactile sensor that is capable of detecting the point of contact,
normal/tangential contact forces, and object spatial properties with impedance
sensing electrodes (Wettels & Loeb, 2011; Wettels et al., 2008), micro-vibrations
associated with slip and textures through a hydro-acoustic pressure sensor (Fishel
et al., 2008), and thermal fluxes with a thermistor (Lin et al., 2009).
Previous studies of compliance discrimination by robots used a combination of tac-
tile and force sensors. Takamuku et al. built a tendon-driven robot hand covered
with strain gauges and a piezoelectric polyvinylidene fluoride (PVDF) skin (Taka-
muku et al., 2007). By performing squeezing and tapping over objects with different
material properties, the strain gauges in this tactile sensor enabled the discrimi-
nation of hardness of different materials. In (Campos & Bajcsy, 1991), authors
proposed a robotic haptic system architecture that performed haptic exploratory
procedures based on psychophysical studies of human performance (Lederman &
Klatzky, 1987). The hardness of objects was determined by measuring the force
required to produce a given displacement. Both studies focused on measuring con-
tact force and indentation displacement to discriminate object hardness or com-
pliance. An adaptive force/position control algorithm was tested on an industrial
robot to maintain force along the normal direction to the surface while moving in
tangential directions on a rubber ball with 10 cm radius and 5000N/m stiffness
(Villani et al., 2000). In this paper, we present the results of using information
about distributed deformation of the elastic skin of our tactile sensor to discrim-
inate compliance, a strategy that appears to be similar to that used by humans.
This is made possible by using sensory feedback from a cluster of impedance sens-
ing electrodes in the BioTac that are responsive to distributed forces. With these
electrodes, we were able to maintain a consistent orientation while applying normal
forces to the surface of the object.
Subjective hardness/softness discrimination has been studied in psychophysical
studies. In (Srinivasan & LaMotte, 1995), authors showed that humans are effi-
cient at discriminating subtle differences in softness under both active touch and
passive touch with only cutaneous sensation but they are unable to discriminate
even large differences during local cutaneous anesthesia. This suggests that tactile
sensory information independent of proprioceptive information is necessary for dis-
criminating softness of objects with a deformable surface. Their studies also show
that randomizing maximum force levels and indentation velocity in passive touch
does not seem to affect sensitivity. This indicates that compliance discrimination
can be done without fine control of these movements. Instead, we propose that
spatial distribution of skin could be the cue for compliance discrimination. Peine
developed a taxonomy that classifies the surgeons’ finger motion during palpation
procedures (Peine, 2000). They found that surgeons apply various normal forces
with no lateral motion to sense the stiffness of body tissues. Lateral motion after
applying heavy pressure was found to enhance the ability to detect hard lumps in
soft tissue.
To acquire information about object properties, humans tend to perform stereo-
typed exploratory movements (Lederman & Klatzky, 1987). The exploratory
movements to detect hardness are pressing and squeezing (Lederman & Klatzky,
1990). We have developed a haptic robot platform with a Barrett hand-wrist-arm
system whose three fingers have been equipped with novel BioTac® multimodal tactile sensors. In this paper, we present algorithms for the control of human-like exploratory movements for pressing on and characterizing objects with various hardnesses (durometer values). When the robot gradually presses its fingertip into
rubber samples with compliant surfaces, it uses the sensory feedback from the
tactile sensor (BioTac) to control both normal and tangential contact forces and
to adjust the orientation of its fingertip to account for the potentially unknown
orientation of contact surfaces and internal discontinuities such as buried lumps.
The distributed deformation sensed by the BioTac can be used to estimate the
compliance of the contact surface.
3.2 Materials and Methods
We present data from initial experiments with flat objects made from materials
with varying hardness to demonstrate the simultaneous use of multimodal tactile
sensor data to control exploratory movements and to interpret their results.
3.2.1 Experiment Setup
Tactile Signals
Similar to the human fingertip, the BioTac sensors are sensitive to tangential as
well as normal forces. When performing a compliance discrimination movement, it is desirable
to apply forces normally and symmetrically to the object. For the haptic robot,
this means servoing its end-effectors in the pitch and roll directions to orient a
flat portion of the core of the BioTac that defines a local coordinate frame (Fig.
3.1a). The sensory feedback is provided by four adjacent electrodes on this flat
region whose impedance depends on compression of the skin against the electrode
surface. These four adjacent electrodes are labeled electrode 7, 8, 9, and 10 on the
electrode array map (Fig. 3.1b). The pair of electrodes along the x-direction (8
and 9) and the pair of electrodes along the y-direction (7 and 10) are used for servo
control of the pitch and roll, respectively, of the robotic fingertip. When the tactile
sensor detects differences between these pairs of electrodes the error is corrected
by adjusting the pitch or roll of the fingertip with the robot. The total contact
force during indentation is estimated from the sum of impedance changes on all
four electrodes. When pressing into a compliant object, the object has a tendency
to wrap around the finger and the resulting forces can be measured by lateral
electrodes that are not on the flat surface (such as 17). Comparing this change
with the relative magnitude of impedance changes in the central four electrodes
can yield substantial information about the compliance of the object. Additional
information from the fluid pressure can also be used to characterize these changes.
The sensor signals that provide information about compliance depend also on the
curvature of the surface of the object, which must be estimated simultaneously
from the complete temporal profiles of all sensor signals (Wettels & Loeb, 2011).
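As a concrete illustration of the servo logic described above, the following minimal Python/NumPy sketch (function and variable names are illustrative, not taken from our implementation) computes the differential electrode signals and the total-force proxy:

```python
import numpy as np

def servo_signals(e, e_rest):
    """Differential servo errors and a total-force proxy from the four
    central electrodes (7, 8, 9, 10) on the flat portion of the BioTac core.

    e, e_rest: 19-element arrays of current and resting electrode values,
    zero-indexed so that e[6] is electrode 7, e[7] is electrode 8, etc.
    """
    de = e - e_rest                              # impedance changes
    pitch_error = de[7] - de[8]                  # x-direction pair: electrodes 8 vs. 9
    roll_error = de[6] - de[9]                   # y-direction pair: electrodes 7 vs. 10
    force_proxy = de[6] + de[7] + de[8] + de[9]  # sum over the central four
    return pitch_error, roll_error, force_proxy
```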
Figure 3.1: (a) Orientations on the BioTac: the finger local coordinate frame has its origin in the center of the two electrode pairs and is coplanar with the flat surface of the core; (b) Electrode array map.
Testing Materials
The levels of compliance for objects used in this experiment are classified by their
durometers. The durometer is measured by the indentation depth into a material
created by a given force on a standardized indenter with a specific diameter. There are several scales of durometer, depending on the diameter and configuration of the indenter and the spring force applied to the tested material. The samples in this experiment were all one inch thick and made from Neoprene rubber (50 Shore A) and polyurethane rubber (30 Shore A, 50 Shore OO and 30 Shore OO), going from
hard to soft.
Experimental procedure
The experiments were conducted on the 7 DOF Barrett WAM robot arm and 4
DOF Barrett Hand BH-280 equipped with the BioTac. In each trial, the robot
pressed one digit against a rubber sample in an unknown orientation and position.
The robot controller had no prior knowledge of the orientation of the surface;
Figure 3.2: (a) Barrett with BioTac pressing a compliant surface; (b)
Force/position control diagram.
instead, it used tactile sensory feedback to identify a contact surface and adjust
its finger orientation while pressing onto the compliant surface (Fig. 3.2a).
3.2.2 Robot exploratory movements
The exploratory movement can be divided into three phases: 1) Reach toward the object surface by controlling position in the Cartesian coordinate system with a smooth path movement; the desired position is either provided a priori or estimated by machine vision. 2) Establish normal contact and orientation with the center of the fingertip by maintaining a symmetrical distribution of force on a cluster of tactile sensors. 3) Control the exploratory movement, which consists of pressing the fingertip gradually into the contact surface while maintaining the normal orientation of the fingertip in the pitch and roll directions.
Figure 3.3: Algorithm for online orientation generation using tactile sensor feed-
back.
Online orientation control using tactile sensor feedback
In order to maintain the orientation of the flat portion of the sensor while gradually pressing into a compliant surface, the desired orientation trajectory is generated from the feedback signals of the two pairs of electrodes. These differential signals are used to incrementally increase or decrease the current pitch and roll angles ($\beta_c$, $\gamma_c$) with very small increments ($\Delta\beta$, $\Delta\gamma$) in the finger local coordinates, respectively. From the new local roll-pitch angles ($\beta$, $\gamma$) in the finger local coordinate frame (shown in Fig. 3.1a), the corresponding finger local rotation matrix $R_{local}$ can be derived and transformed into a rotation matrix in the robot base coordinates $^{B}R_{Finger}$ by premultiplying the local rotation matrix with the forward kinematics matrix from fingertip to robot base. Instead of using roll-pitch-yaw angles for orientation control directly, a unit quaternion representation of orientation $[\eta, \epsilon_1, \epsilon_2, \epsilon_3]$ is derived from the new rotation matrix because of its singularity-free property (Yuan, 1988). This online orientation generation algorithm is shown in pseudo-code (Fig. 3.3).
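The sketch below illustrates one possible realization of this incremental update; the step size, rotation composition order, and helper names are assumptions for illustration, not our exact implementation:

```python
import numpy as np

DELTA = 1e-4  # assumed small angle increment per control cycle (rad)

def update_local_angles(beta_c, gamma_c, pitch_error, roll_error):
    """Incrementally adjust the current local pitch/roll angles from the
    differential electrode signals."""
    beta_c += DELTA * np.sign(pitch_error)
    gamma_c += DELTA * np.sign(roll_error)
    return beta_c, gamma_c

def local_rotation(beta, gamma):
    """Finger-local rotation matrix R_local from the roll-pitch angles
    (composition order is an assumption for illustration)."""
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    R_pitch = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])  # about y
    R_roll = np.array([[1.0, 0.0, 0.0], [0.0, cg, -sg], [0.0, sg, cg]])   # about x
    return R_pitch @ R_roll

def rotation_to_quaternion(R):
    """Unit quaternion [eta, eps1, eps2, eps3] from a rotation matrix
    (trace-positive branch; a full implementation handles all branches)."""
    eta = 0.5 * np.sqrt(max(1.0 + np.trace(R), 1e-12))
    eps = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return np.concatenate(([eta], eps / (4.0 * eta)))
```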
The scalar part $\eta$ and the vector part $(\epsilon_1, \epsilon_2, \epsilon_3)$ of a unit quaternion representation fulfill

$$\eta^2 + \epsilon_1^2 + \epsilon_2^2 + \epsilon_3^2 = 1$$
A velocity-based orientation control with quaternion feedback is written below:

$$\omega_r = \omega_d - K_o e_o \qquad (3.1)$$

where $\omega_d$ is the vector of desired angular velocities, $K_o$ is a diagonal gain matrix, and $e_o$ is the orientation error, which is formulated using the unit quaternion (Yuan, 1988) as:

$$e_o = \delta\epsilon = \eta\,\epsilon_d - \eta_d\,\epsilon + [\epsilon_d \times]\,\epsilon \qquad (3.2)$$

where

$$[\epsilon_d \times] = \begin{bmatrix} 0 & -\epsilon_{3d} & \epsilon_{2d} \\ \epsilon_{3d} & 0 & -\epsilon_{1d} \\ -\epsilon_{2d} & \epsilon_{1d} & 0 \end{bmatrix}$$

and $[\eta, \epsilon_1, \epsilon_2, \epsilon_3]$ is the current orientation and $[\eta_d, \epsilon_{1d}, \epsilon_{2d}, \epsilon_{3d}]$ is the desired orientation.
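For illustration, Equations 3.1 and 3.2 can be transcribed directly into Python/NumPy as follows (function names are illustrative):

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v x] of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def orientation_error(q, q_d):
    """Quaternion orientation error e_o of Eq. 3.2, with
    q = [eta, eps1, eps2, eps3] (current) and q_d (desired)."""
    eta, eps = q[0], q[1:]
    eta_d, eps_d = q_d[0], q_d[1:]
    return eta * eps_d - eta_d * eps + skew(eps_d) @ eps

def reference_angular_velocity(omega_d, q, q_d, K_o):
    """Velocity-based orientation control law of Eq. 3.1."""
    return omega_d - K_o @ orientation_error(q, q_d)
```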
Robot position control
The desired positions and orientations generated by the online trajectory generation using tactile sensory feedback are achieved by a velocity-based operational space controller together with an inverse dynamics law and PD feedback error compensation in joint space (Nakanishi et al., 2008). Inverse dynamics control enables low PD feedback gains for compliant control while ensuring high tracking performance. The control law is written as:

$$\tau_{arm,p} = M\ddot{q}_d + h + K_p(q_d - q) + K_d(\dot{q}_d - \dot{q}) \qquad (3.3)$$
where $\tau_{arm,p}$ is the computed vector of torques to track the desired joint angles $q_d$ given the measured current joint angles $q$, $M$ is the rigid-body inertia matrix of the arm, and $\dot{q}_d$ is the vector of desired joint velocities given by:

$$\dot{q}_d = J^{+}(\dot{x}_d + K_x(x_d - x)) + K_{post}(I - J^{+}J)(q_{post} - q) \qquad (3.4)$$

where $x$ and $x_d$ are the measured and desired finger position and orientation, $h$ is the vector of Coriolis, centrifugal, and gravitational forces, and $K_p$, $K_d$, $K_x$, and $K_{post}$ are diagonal gain matrices. $J$ is the end-effector Jacobian, $J^{+}$ denotes the pseudo-inverse of the Jacobian, and $q_{post}$ is the vector of the default posture optimized in the nullspace of the end-effector motion. The desired joint acceleration $\ddot{q}_d$ and desired joint position $q_d$ are obtained by numerical differentiation and integration of the desired velocity $\dot{q}_d$.
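For illustration, Equations 3.3 and 3.4 can be transcribed directly into Python/NumPy (names are illustrative; a real controller would run this inside the servo loop):

```python
import numpy as np

def desired_joint_velocity(J, x, x_d, xdot_d, q, q_post, K_x, K_post):
    """Velocity-based operational-space resolution of Eq. 3.4: task-space
    tracking plus a default-posture term in the nullspace of the motion."""
    J_pinv = np.linalg.pinv(J)
    task = J_pinv @ (xdot_d + K_x @ (x_d - x))
    null = K_post @ (np.eye(J.shape[1]) - J_pinv @ J) @ (q_post - q)
    return task + null

def position_control_torques(M, h, q, q_d, qdot, qdot_d, qddot_d, K_p, K_d):
    """Inverse-dynamics control law with PD error compensation, Eq. 3.3."""
    return M @ qddot_d + h + K_p @ (q_d - q) + K_d @ (qdot_d - qdot)
```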
Robot force control
Because the robot presses its end-effector onto compliant surfaces, external contact forces need to be taken into account. The external contact forces are obtained from the three force vectors on the BioTac extracted from impedance changes. They are used to compute torques in joint space that account for the external contact forces by premultiplying them with the Jacobian transpose, as shown in Eq. 3.5. The tracking of desired contact forces is achieved with a PI controller (Pastor et al., 2011):

$$\tau_{arm,f} = -J^{T}(F_{arm\_des} - F_{arm}) + K_I \int_{t-\Delta t}^{t} (F_{arm\_des} - F_{arm})\, dt \qquad (3.5)$$

where $F_{arm\_des}$ is the vector of desired forces at the end-effector, $F_{arm}$ is the vector of forces measured by the BioTac, $K_I$ is a diagonal positive definite gain matrix, and $\Delta t$ is the time window during which the force error is integrated. The integral controller compensates for steady-state errors during contact. An overview of the presented control architecture is shown in Fig. 3.2b.

Figure 3.4: Coordinate frame of the BioTac: each impedance sensing electrode has a specific orientation in the BioTac coordinate frame.
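A sketch of this PI force controller is given below; because the extracted equation is ambiguous about grouping, this sketch applies the Jacobian transpose to the full PI term so that the output is a joint torque vector, and it approximates the integral with a fixed-length error buffer:

```python
import numpy as np

def force_control_torques(J, F_des, F_meas, error_buffer, K_I, dt):
    """PI force controller in the spirit of Eq. 3.5.

    J            : (3, n) end-effector Jacobian
    F_des, F_meas: (3,) desired and measured contact force vectors
    error_buffer : list of recent (3,) force errors covering the
                   integration window of length Delta-t
    K_I          : (3, 3) diagonal positive definite integral gain
    """
    error = F_des - F_meas
    error_buffer.append(error)
    integral = np.sum(error_buffer, axis=0) * dt  # rectangle-rule integral
    # J^T maps the full PI term to joint torques; sign follows Eq. 3.5.
    return -J.T @ (error + K_I @ integral)
```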
3.2.3 Normal and tangential force extraction
During contact with an object, external forces deform the skin and fluid path
around the impedance sensing electrodes. This deformation results in a distributed
pattern of impedance changes on the electrodes. Previous studies have shown that
both normal and tangential forces can be characterized by the impedance changes
on the electrodes using machine learning techniques (Wettels et al., 2009a; Wettels
& Loeb, 2011). Here we present a simpler and more robust analytical algorithm
to estimate normal and tangential forces.
The BioTac contains an array of 19 impedance sensing electrodes distributed over
the surface of the core, which has a coordinate frame aligned with its long axis
(Fig. 3.4). Each impedance sensing electrode was determined to have the highest
sensitivity to forces applied normally to its surface. The normal vectors to each
of these electrodes in 3-axis coordinate space can be weighted with the change
in impedance of these electrodes to determine an estimate of tri-axial force. We
calculate the x, y, and z force vectors from these electrodes with the following equation:

$$\begin{bmatrix} F_x \\ F_y \\ F_z \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0 & 0 & S_z \end{bmatrix} \times \begin{bmatrix} N_{1,x} & \dots & N_{19,x} \\ N_{1,y} & \dots & N_{19,y} \\ N_{1,z} & \dots & N_{19,z} \end{bmatrix} \times \left( \begin{bmatrix} E_1 \\ \vdots \\ E_{19} \end{bmatrix} - \begin{bmatrix} E_{1,rest} \\ \vdots \\ E_{19,rest} \end{bmatrix} \right) \qquad (3.6)$$

where $[F_x\ F_y\ F_z]^T$ is the three-dimensional force vector on the BioTac, $[E_1 \dots E_{19}]^T$ and $[E_{1,rest} \dots E_{19,rest}]^T$ are the measured impedance values and resting impedance values on the BioTac, each column of the matrix of $N$ entries is the calculated normal vector of the corresponding impedance sensing electrode surface, derived from the geometry of the rigid core, and $S_x$, $S_y$, and $S_z$ are scaling values for the three dimensions that transform these arbitrary units into engineering units (N).
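In code, Equation 3.6 reduces to two matrix products; a minimal Python/NumPy sketch (argument names are illustrative):

```python
import numpy as np

def extract_force(E, E_rest, N, S):
    """Analytical tri-axial force estimate from the BioTac, Eq. 3.6.

    E, E_rest : (19,) measured and resting electrode values
    N         : (3, 19) matrix whose columns are the electrode surface
                normal vectors computed from the rigid-core geometry
    S         : (3,) calibration scale factors (S_x, S_y, S_z) mapping
                arbitrary units to Newtons
    """
    return np.diag(S) @ (N @ (E - E_rest))
```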
3.3 Results
3.3.1 Force Extraction
The scaling factor for each of the three dimensional estimated force vectors from
the BioTac was calibrated on a 6-axis force plate (HE6x6-16, AMTI). We found
that using the above-mentioned normal/tangential force calibration method was
computationally efficient and achieved a low root-mean-square (RMS) error, outperforming the neural network and machine learning techniques described in (Wettels et al., 2009a; Wettels & Loeb, 2011). Fig. 3.5 shows the
actual forces (blue) measured from force plate and the predicted force vectors
Figure 3.5: Force measured on force plate (blue) and measured from the BioTac
(red) for pokes with various tangential components.
(red) extracted from BioTac by manually pressing and sliding the BioTac on the
force plate. While pressing and sliding the BioTac, the flat portion of the BioTac
was kept parallel with the surface of the force plate, similar to the orientation goal
of the servo-controller for the exploratory poking movements. The RMS errors for
these sample movements were less than 10% of the applied forces in each axis.
3.3.2 Pressing with orientation uncertainty
Typical behavior of the system poking a surface with unknown orientation is illustrated in Fig. 3.6. The top two plots show the impedances of the pairs of electrodes
along x and y directions (E8 vs. E9, E7 vs. E10) on the flat portion of the core of
the BioTac, which is the desired center of contact. The differential signals between
Figure 3.6: Typical BioTac impedance sensor feedback on the point of contact and
robot orientation behavior obtained from pressing on a compliant surface.
those two pairs of electrodes are also displayed in the two middle plots in Fig.
3.6. After the initial contact (around 0.5 to 1s), the robot gradually pressed the
BioTac into the compliant surface. A small asymmetry in the x-direction pair
triggered the small correction to the roll angle that occurred at about 2-2.5s and
a larger correction at 4.5-5s, shown in the bottom left plot. A larger asymmetry
in the y-direction pair triggered a large pitch angle correction at 4-6s, shown in
the bottom right plot. The correction to pitch angle was relatively slow because
it involved most of the proximal joints of the Barrett arm and it actually resulted
in the second correction to the roll angle.
While the robot performed a pressing behavior, its contact force on the compliant
surface was controlled by using tangential and normal force feedback extracted
from the impedance electrode array on the BioTac. Fig. 3.7 shows that the robot
pressed 10N on the compliant surface and kept its lateral tangential force (X Force)
close to zero. The axial tangential force of 4N is what would be expected given
Figure 3.7: Typical BioTac tangential force (X force) and normal force (Z force)
feedback obtained from pushing on a compliant surface.
the 30° tilt of the flat portion of the fingertip with respect to the long axis of the
BioTac. Stretch between the BioTac elastic skin and the compliant rubber sample
caused by two rolling movements (around 2 to 2.5s and 4.5 to 5s in Fig. 3.6)
created a positive tangential force on the sensor, but the force controller gradually
decreased the tangential force to close to zero by the end of the movements shown
in Fig. 3.7.
3.3.3 Compliance discrimination
Force and displacement
Previous experiments showed that force and indentation displacement can be used
in compliance discrimination when actively palpating with a tool (LaMotte, 2000).
Thus, the ratio between force and indentation displacement can also provide infor-
mation for the perception of compliance, especially for compliant objects covered
with non-deformable surfaces, such as piano keys. During our experiment, the
Figure 3.8: Measured normal force from BioTac: five objects with five different
hardness are tested with Barrett robot equipped with BioTac.
robot was controlled to apply 10 N in the normal direction onto five objects, consisting of an aluminum plate and four progressively softer rubber samples with durometers of Shore 50A, Shore 30A, Shore 50OO, and Shore 30OO. Shore 50A is about as hard as a pencil eraser and Shore 30OO is slightly softer than a racquetball. When the
robot was actively pressing the BioTac onto five objects with various hardnesses,
normal forces and indentation displacement were measured from BioTac and robot
joint encoders, respectively, as shown in Fig. 3.8 and Fig. 3.9. The initial rates of
rising of force were similar for all materials, but it took the robot longer to reach
10N on soft materials 50OO and 30OO because it needed to constantly adjust its
fingertip orientation to keep its fingertip orthogonal to the surface of soft materials.
In Fig. 3.9, we observe that the softer materials have more indentation displace-
ment than the harder materials. The indentation displacement was measured by
the position sensors in the robot actuators. It reflects the sum of indentation of
Figure 3.9: Measured indentation displacement from Barrett joint encoder.
the skin of the BioTac plus indentation of the object being probed plus stretch-
ing of the fine stainless-steel cables that link the motors to the joints. Fig. 3.10
shows that indentation displacement trajectories were similar for all materials up
to about 6mm and 1N, which were due to the displacement of the elastic skin on
the BioTac and the initial stretching on the finger joints, which explain a nearly
linear relationship between force and displacement. From 6mm to 12mm, inden-
tation displacement trajectories diverged as a result of the deformation of rubber
samples with different compliance properties and stretching of cables in the robot.
At about 15 mm and 10N, they reconverged because they were then dominated
by the stretching of the cables in the robot wrist subjected to large external forces
from the compressed samples.
Figure 3.10: Force vs. indentation displacement.
Deformation
In light of the complex combination of factors that contributes to the apparent indentation displacement of the fingertip, it would be desirable to use temporal variations
of average pressure and spatiotemporal variations of distributed skin deforma-
tion, as proposed for human discrimination of hardness by (Srinivasan & LaMotte,
1995). Both types of tactile information are available from the BioTac. The
MEMS pressure transducer measures the average pressure of the fluid inside the
space between the elastic skin and rigid core. The spatiotemporal variations of
distributed deformation are provided by the impedance electrode array, especially
from lateral electrodes adjacent to the central four electrodes used for controlling
the applied force.
Figure 3.11: Measured average pressure from MEMS pressure transducer on Bio-
Tac: the rate of average pressure and saturation pressure are used to discriminate
object compliance.
When the BioTac was pressed against hard materials (e.g. the aluminum plate and
Shore 50A and Shore 30A rubber samples), fluid pressure plateaued or actually declined after normal force reached 2.87 N (around 2 s in Fig. 3.11). This saturation
is caused by the rigid object pushing the elastic skin against the rigid core on
the BioTac. The first part of the increased fluid pressure reflects the compliance of
the BioTac skin and fluid pressure, which grows nonlinearly after the skin contacts
the core. The curves diverge before that occurs if and when the object compliance
exceeds the BioTac compliance. As shown in Fig. 3.11 (1-2 s), harder objects created a higher rate of average pressure change in the BioTac. When the BioTac
pressed objects with a softer surface, the soft surface not only pushed the elastic
skin against the rigid core more gradually but also progressively enveloped the side
of the BioTac fingertip. This created higher saturation pressures for softer surfaces
(around 2-3s in Fig. 3.11).
The tendency of soft surfaces to envelop the fingertip as they are deformed can be
seen also in the impedances of more lateral electrodes such as #17 (Fig. 3.12). The
BioTac actually measures the current admitted into the electrode from a test pulse
applied to various reference electrodes distributed in the fingertip, so a decrease in
measured voltage from an initial value reflects an increase in electrode impedance.
For the lateral electrode # 17, the impedance initially increased similarly for all
materials as increasing force was applied at the fingertip and the skin deformed,
but the curves diverged as the more compliant materials deformed and enveloped
the skin further from the centroid of contact. The reorientation movement that the
robot made to correct pitch to maintain normal force (4.5-5s in Fig. 3.6) resulted
in the transients in lateral electrode impedance at that time (Fig. 3.12), which
were particularly pronounced for the hard materials. After the robot corrected
its orientation and reached its maximum contact force, the resting voltage on the
lateral electrode reflected the compliance of the object.
3.4 Discussion
The tactile sensors available in the BioTac have properties similar to those in
human fingertips and can be used to measure compliance of objects, but only
if there is accurate control of the exploratory movement. Those same sensors
can be used to control the exploratory movements, using tactile feedback control
that may also be similar to what humans use when deciding how to palpate an
unknown object. The preliminary results presented here are a first step in designing
algorithms that can enable robots to produce the range of exploratory movements
and the percepts that humans achieve thereby.
Figure 3.12: Typical BioTac lateral impedance electrode feedback from pressing
on different compliant surfaces.
In this paper, the BioTac was controlled to explore flat compliant objects. Com-
pliant objects that have curved surfaces or inhomogeneities in material properties
will generate different responses in the sensors, whose interpretation may require
additional exploratory movements. The tactile-based control of exploratory move-
mentspresentedhereshouldenablesystematicexplorationofsuchunknownobjects
regardless of their location or orientation with respect to the robot hand.
Systematic datasets need to be generated by poking the BioTac into objects with
various curvatures and various compliances to develop a more complete perceptual
algorithm. In previous studies, the impedance sensing electrodes of the BioTac
could be used to make coarse determinations of the radius of curvature of rigid
objects (Wettels & Loeb, 2011). Humans tend to follow the contour of objects
to perceive their precise shapes (Lederman & Klatzky, 1990). Palpation of hard
objects buried in soft tissues probably reflects a combination of tactile-driven move-
ments to determine the orientation of hard surfaces and kinesthesia to keep track
of the location and size of those surfaces (Peine, 2000). In the future, we will
combine pressing and contour-following exploratory movements to facilitate the
perception of both compliance and shape of objects. Eventually, tactile informa-
tion from exploratory movements must be fused with machine vision to permit
location, characterization, identification and dexterous manipulation of objects in
the environment.
Chapter 4
Learning Tactile Feedback Model
In order to robustly execute a task under environmental uncertainty, a robot needs
to be able to reactively adapt to changes arising in its environment. The environ-
ment changes are usually reflected in deviations from expected sensory traces. These
deviations in sensory traces can be used to drive the motion adaptation, and for
this purpose, a feedback model is required. The feedback model maps the devia-
tions in sensory traces to the motion plan adaptation. In this paper, we develop a
general data-driven framework for learning a feedback model from demonstrations.
We utilize a variant of a radial basis function network structure –with movement
phases as kernel centers– which can generally be applied to represent any feedback
models for movement primitives. To demonstrate the effectiveness of our frame-
work, we test it on the task of scraping on a tilt board. In this task, we are learning
a reactive policy in the form of orientation adaptation, based on deviations of tac-
tile sensor traces. As a proof of concept of our method, we provide evaluations on
an anthropomorphic robot.
4.1 Introduction
The ability to handle unexpected sensor events is key to robustly executing manip-
ulation tasks. Humans, for instance, can predict how it should feel to pick up an
object and correct a grasp if the actual experience deviates from this prediction.
Phrased differently, humans can map errors in sensory space to corrections in action
Figure 4.1: Proposed framework for learning behavior adaptation based on associative
skill memories (ASMs).
space. In order to endow our robots with this ability, two problems need to be
tackled: First, the system needs to be able to predict what sensor measurements
to expect. Second, it needs to learn how to map deviations from those predictions
to changes in actions.
Learning what sensor measurements to expect at any moment in time, anywhere in
the state space, is a challenging problem with no known viable solution. However,
associating sensor information with successful executions of motion primitives has
been shown to be promising (Pastor et al., 2011, 2013). When such sensor traces
have been associated with a primitive, the robot can try to correct the primitive’s
nominal actions when the actual sensor readings deviate from what is expected.
In order to do so, a feedback model that maps errors in sensor space to the cor-
rective actions needs to be acquired. In initial implementations of such Associa-
tive Skill Memories (ASMs) (Pastor et al., 2011), a linear feedback model was
used. This feedback model essentially multiplies the sensor trace error with a
manually defined feedback gain matrix to compute acceleration changes. While
hand-designing feedback models can work well for specific problem settings, this
approach is not expected to generalize beyond the scenario it was tuned for. Fur-
thermore, when considering high-dimensional and multimodal sensory input, such
as haptic feedback, manually designing a feedback policy quickly becomes infeasi-
ble. For example, in this work, we consider tactile-driven manipulation with tools.
Manipulation tasks involving tools are challenging due to inaccurate tool kinematics
models and non-rigid contacts between tactile sensors and the tool.
Thus, the larger goal of this research is to equip Associative Skill Memories with
a general feedback modulation learning framework, as depicted in the block dia-
gram in Fig. 4.1. Data-driven approaches to learning such feedback models have
been proposed (Rai et al., 2014, 2017; Chebotar et al., 2014) in the past. Here,
we present a learning framework that improves such data-driven approaches in
generality and experimental validation. First, we contribute towards the goal of
generality by proposing the use of phase-modulated neural networks (PMNNs).
Our previous work (Rai et al., 2017) shows that feedforward neural networks
(FFNNs) have greater flexibility to learn feedback policies from human demon-
strations than a hand-designed model. However, FFNNs cannot capture phase-
dependent sensory features or corrective actions. Thus, in this paper, we introduce
(PMNNs), which can learn phase-dependent feedback models and show that this
improves learning performance when compared to regular FFNNs. Second, we
present detailed insight on our experimental pipeline for learning feedback models
on a tactile-driven manipulation task. Furthermore, we extensively evaluate our
learning approach on this manipulation task across multiple task variations and
successfully deploy our approach on a real robot.
This paper is organized as follows. Section 4.2 provides some background on
the motion primitive representation and related work. Section 4.3 presents the
details of our approach for learning feedback models from demonstrations. We then
present insights into our experimental setup in Section 4.4. Finally, we evaluate
our approach in Section 4.5 and conclude with Section 4.6.
4.2 Background and Related Work
Here we review background material on our chosen motion primitive representation
and related work in learning feedback model approaches, including tactile feedback
learning.
4.2.1 Quaternion DMPs
The Associative Skill Memories framework, as proposed in (Pastor et al., 2013),
uses Dynamic Movement Primitives (DMPs) (Ijspeert et al., 2013) as a motion
primitive representation. DMPs are a goal-directed behavior described as a set of
differential equations with well-defined attractor dynamics. It is this formulation of
DMPs as a set of differential equations that allows for online modulation from var-
ious inputs, such as sensor traces, in a manner that is conceptually straightforward
and simple to implement, relative to other movement representations.
In our work, DMPs need to represent both position and orientation of the end-
effector. We refer the reader to (Rai et al., 2017) for our position DMP formulation.
Here we focus on reviewing Quaternion DMPs, which we use for orientation rep-
resentation in our learning-from-demonstration experiments.
Quaternion DMPs were first introduced in (Pastor et al., 2011), and then improved
in (Kramberger et al., 2016; Ude et al., 2014) to fully take into account the geometry
of SO(3). Like position DMPs, they consist of a transformation system and a
canonical system, governing the evolution of the orientation state and movement
phase, respectively.
The transformation system of a Quaternion DMP is¹:

$$\tau^2 \dot{\omega} = \alpha_\omega \left( \beta_\omega\, 2\log\left(Q_g \circ Q^{*}\right) - \tau\omega \right) + f + C \qquad (4.1)$$

where $Q$ is a unit quaternion representing the orientation, $Q_g$ is the goal orientation, and $\omega$, $\dot{\omega}$ are the 3D angular velocity and angular acceleration, respectively. $f$ and $C$ are the 3D orientation forcing term and feedback/coupling term², respectively. The forcing term encodes the nominal behavior, while the coupling term encodes behavior adaptation, which is commonly based on sensory feedback. In this paper, we focus on learning a feedback model that generates the coupling term, which is described in Sub-Section 4.3.2. During unrolling, we integrate $Q$ forward in time to generate the kinematic orientation trajectory as follows:

$$Q_{t+1} = \exp\left(\frac{\omega \Delta t}{2}\right) \circ Q_t \qquad (4.2)$$

where $\Delta t$ is the integration step size. We set the constants $\alpha_\omega = 25$ and $\beta_\omega = \alpha_\omega/4$ to obtain a critically damped system response when both the forcing term and the coupling term are zero. $\tau$ is set proportional to the motion duration.
The movement phase variable $p$ and phase velocity $u$ are governed by the second-order canonical system as follows:

$$\tau \dot{u} = \alpha_u (\beta_u (0 - p) - u) \qquad (4.3)$$

$$\tau \dot{p} = u \qquad (4.4)$$
¹ For defining Quaternion DMPs, the operators $\circ$, $^{*}$, and the generalized log and exponential maps $\log(\cdot)$ and $\exp(\cdot)$ are required. The definitions of these operators are stated in Equations A.1, A.2, A.3, and A.4 in the Appendix.
² Throughout this paper, we use the terms feedback and coupling term interchangeably.
We set the constants $\alpha_u = 25$ and $\beta_u = \alpha_u/4$. The phase variable $p$ is initialized at 1 and converges to 0. On the other hand, the phase velocity $u$ has initial value 0 and converges back to 0. Note that for a multiple degree-of-freedom (DOF) system, each DOF has its own transformation system, but all DOFs share the same canonical system (Ijspeert et al., 2013).
The forcing term $f$ governs the shape of the primitive and is represented as a weighted combination of $N$ basis functions $\psi_i$ with width parameters $h_i$ and centers at $c_i$, as follows:

$$f(p, u; w) = \frac{\sum_{i=1}^{N} \psi_i(p)\, w_i}{\sum_{j=1}^{N} \psi_j(p)}\, u \qquad (4.5)$$

where

$$\psi_i(p) = \exp\left(-h_i (p - c_i)^2\right) \qquad (4.6)$$

Note that because the forcing term $f$ is modulated by the phase velocity $u$, it is initially 0 and will converge back to 0.
The $N$ basis function weights $w_i$ in Equation 4.5 are learned from human demonstrations of baseline/nominal behaviors by setting the target regression variable:

$$f_{target} = -\alpha_\omega \left( \beta_\omega\, 2\log\left(Q_{g,bd} \circ Q^{*}_{bd}\right) - \tau\omega_{bd} \right) + \tau^2 \dot{\omega}_{bd}$$

where $\{Q_{bd}, \omega_{bd}, \dot{\omega}_{bd}\}$ is the set of baseline/nominal orientation behavior demonstrations. Then we can perform linear regression to identify the parameters $w$, as shown in (Ijspeert et al., 2013).
Finally, we include a goal evolution system as follows:

$$\tau \omega_g = \alpha_{\omega_g}\, 2\log\left(Q_G \circ Q^{*}_g\right) \qquad (4.7)$$
where $Q_g$ and $Q_G$ are the evolving and steady-state goal orientations, respectively. We set the constant $\alpha_{\omega_g} = \alpha_\omega/2$. The goal evolution system has two important
roles related to safety during the algorithm deployment on robot hardware. The
first role, as mentioned in (Ijspeert et al., 2013), is to avoid discontinuous jumps
in accelerations when the goal is suddenly moved. The second role, as mentioned
in (Nemec & Ude, 2012), is to ensure continuity between the state at the end of
one primitive and the state at the start of the next one when executing a sequence
of primitives. Here we ensure continuity between primitives for both position and
orientation DMPs by adopting (Nemec & Ude, 2012).
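To make the orientation integration concrete, the sketch below implements one Euler step of Equations 4.1-4.4 in Python/NumPy, using the standard unit-quaternion log/exp maps (the exact operator definitions are given in Equations A.1-A.4 in the Appendix); constants and function names are illustrative:

```python
import numpy as np

def q_mult(q1, q2):
    """Quaternion composition q1 o q2, with q = [eta, eps1, eps2, eps3]."""
    eta1, eps1 = q1[0], q1[1:]
    eta2, eps2 = q2[0], q2[1:]
    eta = eta1 * eta2 - eps1 @ eps2
    eps = eta1 * eps2 + eta2 * eps1 + np.cross(eps1, eps2)
    return np.concatenate(([eta], eps))

def q_conj(q):
    """Quaternion conjugate Q*."""
    return np.concatenate(([q[0]], -q[1:]))

def q_log(q):
    """Generalized log map of a unit quaternion (a 3-vector)."""
    eta, eps = q[0], q[1:]
    n = np.linalg.norm(eps)
    return np.zeros(3) if n < 1e-12 else np.arccos(np.clip(eta, -1, 1)) * eps / n

def q_exp(r):
    """Generalized exponential map of a 3-vector back to a unit quaternion."""
    n = np.linalg.norm(r)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * r / n))

def canonical_step(p, u, tau, dt, alpha_u=25.0):
    """One Euler step of the second-order canonical system, Eqs. 4.3-4.4."""
    beta_u = alpha_u / 4.0
    u_new = u + alpha_u * (beta_u * (0.0 - p) - u) / tau * dt
    p_new = p + u / tau * dt
    return p_new, u_new

def quat_dmp_step(Q, omega, Q_g, f, C, tau, dt, alpha_w=25.0):
    """One Euler step of the Quaternion DMP transformation system,
    Eqs. 4.1-4.2."""
    beta_w = alpha_w / 4.0
    omega_dot = (alpha_w * (beta_w * 2.0 * q_log(q_mult(Q_g, q_conj(Q)))
                            - tau * omega) + f + C) / tau**2
    omega = omega + omega_dot * dt
    Q = q_mult(q_exp(omega * dt / 2.0), Q)
    return Q / np.linalg.norm(Q), omega
```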
4.2.2 Related Work on Learning Feedback Models
The ability to adapt movement plans to changes in the environment requires feed-
back models. In previous work, researchers have hand-designed feedback models
for specific purposes. For instance, feedback models for obstacle avoidance were
devised in (Park et al., 2008; Hoffmann et al., 2009). A human-inspired feedback
model was designed for performing robotic surface-to-surface contact alignment
based on force-torque sensing (Khansari et al., 2016). Force-torque sensing is also
used in (Pastor et al., 2011), where a hand-designed feedback gain matrix maps
deviations from the expected force-torque measurements to the grasp plan adap-
tation.
Previous work on robotic tactile-driven manipulation with tools has tried to learn
feedback models to correct the position plans for handling uncertainty between
tools and the environment, via reinforcement learning (Chebotar et al., 2014) or
motor babbling (Hoffmann et al., 2014). In our work, we propose to bootstrap the
learning of feedback model from human demonstrations.
Abu-Dakka et al. iteratively learned feedforward terms to improve a force-torque-
guided task execution over trials, while fixing feedback models as constant gain
matrices (Abu-Dakka et al., 2015).
Learning by demonstrations is also employed in (Gams et al., 2015) to train sepa-
rate feedback models for different environmental settings. Gaussian process regres-
sion is used to interpolate between these learned models to predict the required
feedback model in a new environmental setting. Our work directly uses a single
model to handle multiple settings.
Kupcsik et al. learn the mapping from contexts –or environmental settings– to
DMP parameters (Kupcsik et al., 2017). On the other hand, we learn the mapping
from sensory input to the plan adaptation, abstracting the pre-specification of the
context.
In (Sung et al., 2017), a partially-observable Markov decision process (POMDP),
which is parameterized by deep recurrent neural networks, is used to represent
a haptic feedback model. In general, POMDPs models are not explicitly pro-
vided with the information of the movement phase which is essential for making a
prediction on the next corrective action. Our proposed approach can learn phase-
dependent corrective actions.
4.3 Learning Feedback Models via Phase-
Modulated Neural Networks
In this section we describe our framework to learn general feedback models from
human demonstrations. The process pipeline of learning feedback models is visu-
alized in Fig. 4.2. For a specific instance of this pipeline in our experiment, please
refer to Sub-Section 4.4.3. Our framework comprises three core components: learning expected sensor traces; learning the feedback model to map sensor trace errors to corrections; and finally we introduce PMNNs, a feedback model representation that is flexible enough to capture phase-dependent features and can learn across multiple task settings.

Figure 4.2: Process pipeline of learning feedback models.
4.3.1 Learning Expected Sensor Traces
The core idea of ASMs (Pastor et al., 2011, 2013) rests on the insight that similar
task executions should yield similar sensory events. Thus, an ASM of a task
includes both a movement primitive as well as the expected sensor traces associated with this primitive's execution in a known environment. We term this execution the primitive's nominal behavior, the known environment the nominal setting, and the expected sensor traces $S_{expected}$. To learn the $S_{expected}$ model, we execute the nominal behavior and collect the experienced sensor measurements. Since these measurements are trajectories by nature, we can encode them using DMPs to become $S_{expected}$. This has the advantage that $S_{expected}$ is phase-aligned with the position and Quaternion DMPs' execution, because they all share the same canonical system in Equations 4.3 and 4.4.
4.3.2 Learning Feedback Models from Demonstration
When a movement primitive is executed under environment variations and/or uncertainties, the perceived sensor traces, denoted as actual sensor traces $S_{actual}$, tend to deviate from $S_{expected}$. The disparity $S_{actual} - S_{expected} = \Delta S$ can be used to drive corrections for adapting to the environmental changes causing the deviated sensor traces. Previous work (Chebotar et al., 2014; Kober et al., 2008) uses reinforcement learning to learn these corrective behaviors, also in the form of feedback models. However, learning a good feedback policy via trial-and-error from scratch is a very slow process. Therefore, we would like to bootstrap this process by learning feedback models from demonstrations. In our supervised learning framework, the disparity $\Delta S$ is used as the input to a feedback model, mapping it to the motion plan adaptation or the coupling term $C$ (from Equation 4.1), as follows:

$$C = h(S_{actual} - S_{expected}) = h(\Delta S) \qquad (4.8)$$

We pose this as a regression problem, and similar to learning the nominal behavior, we can also learn this feedback model $h$ from human demonstrations of corrected behavior, i.e. the demonstrated behavior when the feedback is active. To perform the learning-from-demonstration, we need to extract the target output variable,
i.e. the target coupling term $C_{target}$, from the demonstration data, which can be done as follows:

$$C_{target} = -\alpha_\omega \left( \beta_\omega\, 2\log\left(Q_{g,cd} \circ Q^{*}_{cd}\right) - \tau\omega_{cd} \right) + \tau^2 \dot{\omega}_{cd} - f \qquad (4.9)$$

where $\{Q_{cd}, \omega_{cd}, \dot{\omega}_{cd}\}$ is the set of corrected orientation behavior demonstrations.
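Given the quaternion helpers from the sketch in Sub-Section 4.2.1, extracting the target coupling term of Equation 4.9 for one demonstration sample is a short computation; the following is an illustrative transcription:

```python
def target_coupling_term(Q_cd, omega_cd, omega_dot_cd, Q_g_cd, f, tau,
                         alpha_w=25.0):
    """Target coupling term of Eq. 4.9 for one sample of a corrected
    demonstration (q_mult, q_conj, q_log as defined in the earlier sketch)."""
    beta_w = alpha_w / 4.0
    return (-alpha_w * (beta_w * 2.0 * q_log(q_mult(Q_g_cd, q_conj(Q_cd)))
                        - tau * omega_cd)
            + tau**2 * omega_dot_cd - f)
```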
Next, we describe our proposed general learning representation for the feedback
model.
4.3.3 Phase-Modulated Neural Network Structure
We use neural network (NN) structures for representing feedback term models
due to its ability to learn task-relevant feature representations of high-dimensional
inputs from data. In this paper, we improve upon our previous work (Rai et
al., 2017), in which we used a regular fully-connected feedforward neural network
(FFNN) to represent the feedback model. Our new neural network design is a vari-
ant of the radial basis function network (RBFN) (Bishop, 1991), which we call the
phase-modulated neural networks (PMNNs) as depicted in Fig. 4.3. PMNN has
an embedded structure that allows the encoding of a feedback model’s dependency
on the movement phase, which an FFNN structure lacks. We expect PMNN to
model human adaptation better than FFNN because the same sensory deviation
(NN input) may occur at different movement phases, but the form of the adap-
tation (NN output) will most likely be different. There is also an alternative way
of modeling phase-dependent adaptation behavior by using FFNN and including
both phase variable p and phase velocity u as inputs, together with the sensor
trace deviations ΔS. However, there is no convergence guarantee on the adapted
motion plan because the coupling term is not guaranteed to converge to zero, hence
we may still need to hand-design an output post-processing similar to (Rai et al.,
2017) to ensure convergence. PMNN, on the other hand, guarantees convergence
due to the way we embed the information of phase velocity u into the structure.
Figure 4.3: Phase-modulated neural network (PMNN) with one-dimensional out-
put coupling term C.
The PMNN consists of:

• input layer
The input is $\Delta S = S_{actual} - S_{expected}$.

• regular hidden layers
The regular hidden layers perform non-linear feature transformations on the high-dimensional inputs. If there are $L$ layers, the output of the $l$-th layer is:

$$h_l = \begin{cases} a_l\left(W_{h_1 \Delta S}\, \Delta S + b_{h_1}\right) & \text{for } l = 1 \\ a_l\left(W_{h_l h_{l-1}}\, h_{l-1} + b_{h_l}\right) & \text{for } l = 2, \dots, L \end{cases}$$

$a_l$ is the activation function of the $l$-th hidden layer, which can be tanh, ReLU, or others. $W_{h_1 \Delta S}$ is the weight matrix between the input layer and the first hidden layer. $W_{h_l h_{l-1}}$ is the weight matrix between the $(l-1)$-th hidden layer and the $l$-th hidden layer. $b_{h_l}$ is the bias vector at the $l$-th hidden layer.
• final hidden layer with phase kernel modulation
This special and final hidden layer takes care of the dependency of the model on the movement phase. The output of this layer is $m$, which is defined as:

$$m = G \odot \left(W_{m h_L}\, h_L + b_m\right) \qquad (4.10)$$

where $\odot$ denotes the element-wise product of vectors. $G = \begin{bmatrix} G_1 & G_2 & \dots & G_N \end{bmatrix}^T$ is the phase kernel modulation vector, and each component $G_i$ is defined as:

$$G_i(p, u) = \frac{\psi_i(p)}{\sum_{j=1}^{N} \psi_j(p)}\, u \qquad i = 1, \dots, N \qquad (4.11)$$

with phase variable $p$ and phase velocity $u$, which come from the second-order canonical system defined in Equations 4.3 and 4.4. $\psi_i(p)$ is the radial basis function (RBF) as defined in Equation 4.6. We use $N = 25$ phase RBF kernels both in the PMNNs as well as in the DMP representation. The phase kernel centers have equal spacing in time, and we place these centers in the same way in the DMPs as well as in the PMNNs.
• output layer
The output is the one-dimensional coupling term $C$:

$$C = w_{Cm}^{T}\, m \qquad (4.12)$$

$w_{Cm}$ is the weight vector. Please note that there is no bias introduced in the output layer, and hence if $m = 0$ –which occurs when the phase velocity $u$ is zero– then $C$ is also zero. This ensures that $C$ is initially zero when a primitive is started. $C$ will also converge to zero because the phase velocity $u$ converges to zero. This ensures the convergence of the adapted motion plan.

For an $M$-dimensional coupling term, we use $M$ separate PMNNs with the same input vector $\Delta S$, and the output of each PMNN corresponds to one dimension of the coupling term. This separation allows each dimension of the coupling term to be learned independently of the others.

We implemented the PMNN in TensorFlow (Abadi et al., 2016). To avoid overfitting, we used the dropout technique as introduced in (Srivastava et al., 2014).
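Although our implementation uses TensorFlow for training, the forward pass of a one-output PMNN is compact enough to sketch in plain NumPy; parameter names below are illustrative:

```python
import numpy as np

def rbf(p, centers, widths):
    """Phase radial basis functions psi_i(p), Eq. 4.6."""
    return np.exp(-widths * (p - centers) ** 2)

def pmnn_forward(dS, p, u, params, centers, widths):
    """Forward pass of a one-output PMNN, Eqs. 4.10-4.12.

    params is a dict of learned parameters (illustrative names):
    'W', 'b'     : regular hidden layer weights and bias
    'W_m', 'b_m' : final (phase-modulated) hidden layer weights and bias
    'w_C'        : output weight vector (no output bias, so C -> 0 with u)
    """
    h = np.tanh(params['W'] @ dS + params['b'])   # regular hidden layer
    psi = rbf(p, centers, widths)
    G = psi / np.sum(psi) * u                     # phase kernel modulation, Eq. 4.11
    m = G * (params['W_m'] @ h + params['b_m'])   # final hidden layer, Eq. 4.10
    return params['w_C'] @ m                      # coupling term C, Eq. 4.12
```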
4.4 Learning Tactile Feedback Models: System
Overview and Experimental Setup
This work is focused on learning to correct tactile-driven manipulation with tools.
Our experimental scenario involves a demonstrator teaching our robot to perform
a scraping task, utilizing a hand-held tool to scrape paint off the surface of a dry-
erase board (see Figure 4.4). The system is taught this skill at a nominal tilt
angle and needs to correct when the board is tilted away from that default angle.
Neither vision nor motion capture system is used, thus we only rely on tactile
sensing to inform the correction. One of the main challenges is that the tactile
sensors interact indirectly with the board, i.e. through the tool adapter and the
scraping tool via a non-rigid contact, and the robot does not explicitly encode
the tool kinematics model. This makes hand-designing a feedback gain matrix
difficult. Next, we explain the experimental setup and some lessons learned from
the experiments.
Figure 4.4: Experimental setup of the scraping task.
4.4.1 Hardware
The demonstrations were performed on the right arm and the right hand of our
bi-manual robot. The arm is a 7-degrees-of-freedom (DoF) Barrett WAM arm
which is also equipped with a 6D force-torque (FT) sensor at the wrist. The hand
is a Barrett hand whose left and right fingers are equipped with biomimetic tactile
sensors (BioTacs) (Wettels et al., 2008). The two BioTac-equipped fingers were
set up to perform a pinch grasp on a tool adapter. A tool adapter is a 3D-printed
object designed to hold a scraping tool with an 11mm-wide tool-tip.
The dry-erase board was mounted on a tilt stage whose orientation can be adjusted
to create static tilts of ±20° in the roll and/or pitch with respect to the robot global coordinates as shown in Fig. 4.4. Two digital protractors with 0.1° resolution
(Wixey WR 300 Digital Angle Gauge) were used to measure the tilt angles during
the experiment.
4.4.2 Robot’sEnvironmentalSettingsandHumanDemon-
strations with Sensory Traces Association
For our experiment, we considered 5 different settings, and each setting is associ-
ated with a specific roll angle of the tilt stage, specifically at 0°, 2.5°, 5°, 7.5°, and 10°. At each setting, we fixed the pitch angle at 0° and maintained the scraping path at roughly the same height. Hence, we assume that among the 6D pose action (x-y-z-pitch-roll-yaw), the necessary correction is only in the roll-orientation. For each setting, we collected 15 demonstrations. The setting with a roll angle of 0° is selected as the nominal setting, while the remaining settings become the corrected ones.
For the demonstrated actions, we recorded the 6D pose trajectory of the right
hand end-effector at 300 Hz rate, and along with these demonstrations, we also
recorded the multi-dimensional sensory traces associated with this action. The
sensory traces are the 38-dimensional tactile signals from the left and right BioTacs'
electrodes, sampled at 100 Hz.
4.4.3 Learning Pipeline Details and Lessons Learned
DMPs provide kinematic plans to be tracked with a position control scheme. How-
ever, for tactile-driven contact manipulation tasks such as the scraping task in this
paper, using position control alone is not sufficient. In order to attain consistent
tactile signals on task repetitions –during the demonstrations as well as during
unrolling of the learned feedback models– similar contact force profiles need to be
applied. Hence force control is required.
Moreover, while it is possible to perform corrected demonstrations solely by
humans, the sensor traces obtained might be significantly different from the traces
obtained during the robot’s execution of the motion plan. This is problematic
because during learning and during prediction phases of the feedback terms, the
input to the feedback models is different. Hence, instead, we try to let the robot
execute the nominal plans, and only provide correction by manually adjusting the
robot’s execution in different settings as necessary.
Therefore, we use the force-torque (FT) sensor in the robot’s right wrist for FT
control, with two purposes: (1) to maintain tool-tip contact with the board, such
that consistent tactile signals are obtained, and (2) to provide compliance, allowing
the human demonstrator to perform corrective action demonstration as the robot
executes the nominal behavior.
For simplicity, we set the force control set points in our experiment to be constant.
We need to set the force control set point carefully: if the downward force (in the
z-axis direction) for contact maintenance is too large, friction will prevent the robot
from being able to execute the corrections as commanded by the feedback model.
We found that 1 Newton is a reasonable value for the downward force control set
point. Regarding the learning process pipeline as depicted in Fig. 4.2, here we
provide the details in our experiment:
1. Nominal primitives acquisition: While the robot is operating in the gravity-
compensation mode and the tilt stage is at 0° roll angle, the human demon-
strator guided the robot’s hand to kinesthetically perform a scraping task,
which can be divided into three stages, each of which corresponds to a move-
ment primitive:
(a) primitive 1: starting from its home position above the board, go down
(in the z-axis direction) until the scraping tool made contact with the
scraping board’s surface (no orientation correction at this stage),
(b) primitive 2: correct the tool-tip orientation such that it made a full flat
tool-tip contact with the surface,
(c) primitive 3: go forward in the Y-axis direction while scraping paint off
the surface, applying orientation correction as necessary to maintain full
flat tool-tip contact with the surface.
We used the Zero Velocity Crossing (ZVC) method (Fod et al., 2002) and local
minima search refinement on the velocity signal in the z and y-axes, to find
segmentation points of primitives 1 and 3, respectively. The remaining part
– between the end of primitives 1 and the beginning of primitive 3 – becomes
primitive 2. We encode each of these primitives with position and orientation
DMPs.
Force-Torque Control Activation Schedule

         | Primitive 1 | Primitive 2      | Primitive 3
---------+-------------+------------------+------------------
  Step 2 |      -      | z 1 N            | z 1 N
  Step 3 |      -      | z 1 N, roll 0 Nm | z 1 N, roll 0 Nm
  Step 4 |      -      | z 1 N            | z 1 N

Table 4.1: Force-torque control schedule for steps 2-4.
The following pipeline steps (2, 3, and 4) refer to Table 4.1, which indicates which force-torque control modes are active at each primitive of these steps. "z 1 N" refers to the 1 Newton downward z-axis proportional-
integral (PI) force control, for making sure that consistent tactile signals are
obtained at repetitions of the task; this is important for learning and making
correction predictions properly. "roll 0 Nm" refers to the roll-orientation PI
torque control at 0 Newton-meter, for allowing corrective action demonstra-
tion.
2. Expected sensor traces acquisition: Still with the tilt stage at 0° roll angle, we unroll the nominal primitives 15 times and record the tactile sensor traces. We encode each dimension of the 38-dimensional sensor traces as $S_{expected}$, using the standard DMP formulation.
3. Feedback model learning: Now we vary the tilt stage's roll-angle to 2.5°, 5°, 7.5°, and 10°, one at a time, to encode different environmental settings. At each setting, we let the robot unroll the nominal behavior. Besides the downward force control for contact maintenance, now we also activate the roll-orientation PI torque control at 0 Newton-meter throughout primitives 2 and 3. This allows the human demonstrator to perform the roll-orientation correction demonstration, to maintain full flat tool-tip contact relative to the now-tilted scraping board. We recorded 15 demonstrations for each setting, from which we extracted the supervised dataset for the feedback model, i.e. the pairs of the sensory trace deviation $\Delta S$ and the target coupling term $C_{target}$ as formulated in Equation 4.9. Afterwards, we learn the feedback models from this dataset using the PMNN.
4. DMP and Feedback Model Unrolling/Testing: We test the feedback models
in different settings on the robot.
4.5 Experiments
To evaluate the performance of the learned feedback model, we first evaluate the
regression and generalization ability of the PMNNs which were trained offline on
the demonstration data. Second, we show the superiority of PMNNs over FFNNs
as a choice for feedback models learning representation. Third, we investigate the
importance of learning both the feature representation and the phase dependencies
together within the framework of learning feedback models. Fourth, we show the
significance of the phase modulation in the feedback model learning. Finally, we
evaluate the learned feedback model’s performance in making predictions of action
corrections online on a real robot.
We evaluate feedback models only on primitives 2 and 3, for roll-orientation cor-
rection. In primitive 1, we deem that there is no action correction because the
height of the dry-erase board surface is maintained constant across all settings.
As an error metric, we use the normalized mean squared error (NMSE), i.e. the mean squared prediction error divided by the target coupling term's variance (see the sketch after this list). To evaluate the learning performance of each model in our experiments, we perform a leave-one-demonstration-out test. In this test, we perform $K$ iterations of training and testing, where $K = 15$ is the number of demonstrations per setting. At the k-th iteration:
• The data points of the k-th demonstration of all settings are left out as unseen data for generalization testing, while the remaining $K - 1$ demonstrations' data points³ are shuffled randomly and split 85%, 7.5%, and 7.5% for training, validation, and testing, respectively.
• We record the training-validation-testing-generalization NMSE pairs corre-
sponding to the lowest generalization NMSE across learning steps.
We report the mean and standard deviation of training-validation-testing-
generalization NMSEs across K iterations.
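A minimal sketch of the NMSE metric described above:

```python
import numpy as np

def nmse(C_pred, C_target):
    """Normalized mean squared error: MSE divided by the target variance."""
    return np.mean((C_pred - C_target) ** 2) / np.var(C_target)
```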
On all models we evaluated, we use tanh as the activation function of the hidden
layer nodes. We use the Root Mean Square Propagation (RMSProp) (Tieleman & Hinton, 2012) as the gradient descent optimization algorithm and set the dropout (Srivastava et al., 2014) rate to 0.5.

³ Each demonstration – depending on the data collection sampling rate and demonstration duration – provides hundreds or thousands of data points.
4.5.1 Fitting and Generalization Evaluation of PMNNs
The results for primitives 2 and 3, using the PMNN structure with one regular hidden layer of 100 nodes, are shown in Table 4.2. The PMNNs achieve good training, validation, and testing results, and reasonable generalization results for both primitives.

Roll-Orientation Coupling Term Learning NMSE

         | Training    | Validation  | Testing     | Generalization
---------+-------------+-------------+-------------+---------------
 Prim. 2 | 0.15 ± 0.05 | 0.15 ± 0.05 | 0.16 ± 0.06 | 0.36 ± 0.19
 Prim. 3 | 0.22 ± 0.05 | 0.22 ± 0.05 | 0.22 ± 0.05 | 0.32 ± 0.13

Table 4.2: NMSE of the roll-orientation coupling term learning with the leave-one-demonstration-out test, for each primitive.
4.5.2 Performance Comparison between FFNN and
PMNN
We compare the performance between FFNN and PMNN. For PMNN, we test
two structures: one with no regular hidden layer, and the other with one regular hidden layer comprised of 100 nodes. For FFNN, we use two hidden layers with 100 and 25 nodes each, which is equivalent to a PMNN with one regular hidden layer of 100 nodes but with the phase modulation de-activated. The results can be seen in Fig. 4.5. The PMNN with one regular hidden layer of 100 nodes demonstrated the best performance compared to the other structures. The PMNN with one regular hidden layer is better than the one without a regular hidden layer, most likely because of the richer learned feature representation, without overfitting to the data.
Figure 4.5: Comparison of regression results on primitives 2 and 3 using different
neural network structures.
4.5.3 Comparison between Separated versus Embed-
ded Feature Representation and Phase-Dependent
Learning
We also compare the effect of separating versus embedding the feature representa-
tion learning with overall parameter optimization under phase modulation. Cheb-
otar et al. (Chebotar et al., 2014) used PCA for feature representation learning,
which was separated from the phase-dependent parameter optimization using rein-
forcement learning. On the other hand, PMNN embeds feature learning together
with the parameter optimization under phase modulation, into an integrated pro-
cess.
In this experiment, we used PCA which retained 99% of the overall data variance,
reducing the data dimensionality to 7 and 6 (from originally 38) for primitive 2
and 3, respectively. In addition, we also implemented an autoencoder, a non-
linear dimensionality reduction method, as a substitute for PCA in representation
learning. The dimensions of the latent space of the autoencoders were 7 and 6 for
primitives 2 and 3, respectively. For PMNNs, we used two kinds of networks: one
with one regular hidden layer of 6 nodes (such that it becomes comparable with
the PCA counterpart), and the other with one regular hidden layer of 100 nodes.
Figure 4.6: Comparison of regression results on primitives 2 and 3 using sepa-
rated feature learning (PCA or Autoencoder and phase kernel modulation) versus
embedded feature learning (PMNN).
Fig. 4.6 illustrates the superior performance of PMNNs, due to the feature learn-
ing performed together with the phase-dependent parameter optimization. Of the
two PMNNs, the one with more nodes in the regular hidden layer performs bet-
ter, because it can more accurately represent the mapping, while not over-fitting to
the data. Based on these evaluations, we decided to use PMNNs with one regular
hidden layer of 100 nodes and 25 phase-modulated nodes in the final hidden layer
for subsequent experiments.
4.5.4 Evaluation of Movement Phase Dependency
Here we visualize the trained weight matrix mapping the output of 100 nodes in
the regular hidden layer to the 25 nodes in the final hidden layer being modulated
by the phase RBFs. This weight matrix is of dimension 25 × 100, and each row shows how each of the 100 node outputs (or "features") in the regular hidden layer is weighted into a particular phase RBF-modulated node. In Fig. 4.7, we display the top 10 dominant regular hidden layer node outputs for each phase RBF-modulated node (in yellow), and the rest (colored in blue) are the
less dominant ones. We see that between different phase RBF-modulated nodes,
the priority ranking is different, suggesting that there is some dependency of the
feedback on the movement phase.
Figure 4.7: The top 10 dominant regular hidden layer features for each phase RBF
in primitive 2, roll-orientation coupling term, displayed in yellow.
4.5.5 Unrolling the Learned Feedback Model on the Robot
In Fig. 4.8, we show snapshots of our robot scraping experiment on a setting
with a 10° roll-angle of the tilt stage. In particular, we compare the nominal
plan execution (top figures, from (a) to (d)) and the adapted plan execution
Figure 4.8: Snapshots of our experiment on the robot while scraping on the tilt
stage with a +10° roll-angle environmental setting: without adaptation (top figures,
(a) to (d), with gauge readings 0.0°, 0.0°, 0.0°, and 2.0°) versus with adaptation
(bottom figures, (e) to (h), with gauge readings 0.7°, 2.5°, 5.7°, and 3.7°).
(bottom figures, from (e) to (h), using the trained feedback models). From left to
right ((a) to (d), and (e) to (h)), the figures show subsequent phases of the plan
execution. The sub-captions ((a) to (h)) give the readings of the digital angle gauge
mounted on top of the middle finger of the hand. We see that with the coupling
term turned off (nominal plan execution, top figures), no correction was applied to
the tool-tip orientation, and the scraping result was worse than when the online
adaptation was applied (adapted plan execution, bottom figures).
Fig. 4.9 shows the coupling term (top) alongside the corresponding sensor trace
deviation of one of the electrodes (bottom) during plan execution at 4 different
environmental settings, as specified in captions (a)-(d). We compare several cases:
the human demonstrations (blue), the demonstrations' mean trajectory
(dashed black), the range of demonstrations within one standard deviation from the mean
trajectory (solid black), robot unrolling of the nominal behavior (green),
and robot unrolling while applying the coupling term computed online by
the trained feedback model (red). On the top plots, we see that the trained feed-
back model can differentiate between settings and apply the approximately correct
amount of correction. When applying the coupling term computed online by the
trained feedback model, the sensor trace deviation is also close to those of demon-
strations, as shown in the bottom plots.
Figure 4.9: The roll-orientation coupling term (top) vs. the corresponding sensor
trace deviation of the right BioTac finger's electrode #6 on primitive 2 (bottom),
during the scraping task at environmental (env.) settings with tilt-stage roll-angles
of 2.5°, 5.0°, 7.5°, and 10.0° ((a) to (d)). The x-axis is the time index, the y-axis
of the top figures is the coupling term magnitude (in radians), and the y-axis of
the bottom figures is the discretized sensor trace deviation magnitude (unitless).
Finally, the video at https://youtu.be/7Dx5imy1Kcw shows the scraping execution at
two settings, 5° and 10° roll-angle of the tilt stage, while applying the corrections
predicted online by the trained feedback model.
4.6 Conclusion
We introduced a general framework for learning-from-demonstration of feedback
models, mapping sensory trace deviations to action corrections. In particular, we
introduced phase-modulated neural networks (PMNNs), which make it possible to fit
phase-dependent feedback models while preserving the convergence properties of DMPs.
Finally, we demonstrated the superior learning performance of our PMNN-based
framework compared to state-of-the-art methods, as well as its capability to
perform online adaptation on a real robot.
Chapter 5
Precision Grip Force Control with
Slip Detection and Classification
We introduce and evaluate contact-based techniques to estimate tactile properties
and detect manipulation events using a biomimetic tactile sensor. In particular, we
estimate finger forces and detect and classify slip events. In addition, we present
a grip force controller that uses the estimation results to gently pick up objects
of various weights and textures. The estimation techniques and the grip controller
are experimentally evaluated on a robotic system consisting of Barrett arms and
hands. Our results indicate that we are able to accurately estimate forces acting
in all directions, detect incipient slip, and classify the type of slip with over 80%
success rate.
5.1 Introduction
A service robot deployed in human environments must be able to perform dex-
terous manipulation tasks under many different conditions. These tasks include
interacting with unknown objects (e.g. grasping). Recent advances in computer
vision and range sensing enable robots to detect objects reliably (Erhan et al.,
2014). However, even with the correct pose and location of an object, reliable
grasping remains a problem.
Figure 5.1: Robotic arm grasping a fragile object using a standard position con-
troller (left) and the proposed force grip controller (right).
Tactile sensors can be used to monitor gripper-object interactions that are very
important in grasping, especially when it comes to fragile objects (see Fig. 5.1).
These interactions are otherwise difficult to observe and model.
Achieving human-level performance in dexterous grasping tasks will likely require
richer tactile sensing than is currently available (Dahiya et al., 2009). Recently,
biomimetic tactile sensors, designed to provide more humanlike capabilities, have
been developed. These new sensors provide an opportunity to significantly improve
the robustness of robotic manipulation. In order to fully use the available
information, new estimation techniques have to be developed. This chapter presents a
first step towards estimating tactile properties and detecting manipulation
events, such as slip, using biomimetic sensors.
In this work, we use the BioTac sensors (Wettels et al., 2008) (Fig. 5.2) in order
to estimate forces, detect slip events and classify the type of slip. Additionally, we
present a grip controller that uses the above techniques to improve grasp quality.
The key contributions of this work are: a) a force estimation technique that
outperforms the state of the art, b) two different slip detection approaches that are
able to detect a slip event up to 35 ms before it is detected by an accelerometer
attached to the object, c) a slip classifier that is able to classify the type of
slip with over 80% accuracy, and d) potential applications of the above techniques
to robotic grasp control.
5.2 Related Work
Humans are capable of manipulating novel objects with uncertain surface prop-
erties even when experiencing random external perturbations (Birznieks et al.,
1998). Tactile sensing plays a crucial role during these tasks (Johansson & Flana-
gan, 2009a). As reported in (Srinivasan et al., 1990), humans mainly rely on tactile
feedback for slip detection and contact force estimation.
Previous work has taken inspiration from human grip control. Romano et al. pro-
pose and evaluate a robotic grasp controller for a two-finger manipulator based
on a human-inspired processing of data from tactile arrays (Romano et al., 2011).
In (Wettels et al., 2009b), an approach to control grip force using the BioTac
is presented. The approach adopts a conservative estimate of the friction coef-
ficient instead of estimating it on-the-fly. However, a conservative estimate may
result in damaging fragile objects with excessive grip force. In (De Maria et al.,
2015), the authors propose a new slipping-avoidance algorithm based on integrated
force/tactile sensors (De Maria et al., 2012). The algorithm includes a tactile
exploration phase that aims to estimate the friction coefficient before grasping. It also uses
a Kalman filter to track the tangential component of the force estimated from
tactile sensing, in order to adaptively change the grip force applied by the manipulator.
In our work, instead of a tactile exploration phase, we continuously re-estimate the
friction coefficient while grasping the object.
Significant work has also focused on slip detection and slip-based controllers. In
(Heyneman & Cutkosky, 2013), the authors present a method for slip detection
and try to distinguish between finger/object and object/world slip events. Their
approach is based on multidimensional coherence which measures whether a group
of signals is sampling a single input or a group of incoherent inputs. A frequency-
domain approach is presented for incipient slip detection based on information
from a Piezo-Resistive Tactile Sensor (Schoepfer et al., 2010). Our work, however,
is novel in using the BioTac sensors for these tasks, which provide the robot with
increased sensitivity and frequency range over traditional sensors.
The slip classification problem has not been explored as much as the other aspects
of tactile estimation. The work in (Melchiorri, 2000) addresses the problem of
detecting both linear and rotational slip using an integrated suite comprising
force/torque and tactile sensors. However, this approach neglects the temporal
aspect of tactile data, which may be useful in classifying manipulation events.
The BioTac sensors have been previously used to estimate contact forces. In (Su
et al., 2012), an analytical approach based on electrode impedance was used to
extract normal and tangential forces. In this work, we show that our machine
learning methods outperform this method substantially.
In (Wettels et al., 2014), the authors also use the BioTac sensors to estimate forces
acting on a finger. Machine learning methods (artificial neural networks and Gaussian
mixture models) are used to learn the mapping from sensor values to forces.
The best performance is achieved by using neural networks with regularization
Figure 5.2: The coordinate frame of the BioTac sensor (adapted from (Su et al., 2012)).
techniques. Here we extend this approach to a network with multiple layers and
show that it leads to better estimation performance.
5.3 Approach
In this section, we introduce different aspects of tactile-based estimation that are
useful in various manipulation scenarios. The high-resolution and multimodal
properties of the BioTac sensor enable us to estimate forces, detect and classify
slip, and control the gripper using the reaction forces exerted on the fingers.
5.3.1 Force Estimation
Reliable estimation of the tri-axial forces (F_x, F_y, F_z) applied to the robot finger,
shown in Fig. 5.2, is important for robust finger control. In this work, we
employ and evaluate four methods to estimate these forces based on the readings
from the BioTac sensor.
Previous studies have shown that tri-axial forces can be characterized based on
the impedance changes on the 19 electrodes (Su et al., 2012). This method makes
the assumption that each electrode is only sensitive to forces that are normal to its
surface. In our first approach, the tri-axial contact forces are analytically estimated
by a weighted sum of the normal vectors (N_{x,i}, N_{y,i}, N_{z,i}) of the electrodes,
with the impedance changes (E_i) on the electrodes as weights:
\[ \begin{bmatrix} F_x \\ F_y \\ F_z \end{bmatrix} = \sum_{i=1}^{19} \begin{bmatrix} S_x E_i N_{x,i} \\ S_y E_i N_{y,i} \\ S_z E_i N_{z,i} \end{bmatrix}, \]
where (S_x, S_y, S_z) are scaling factors that convert the calculated contact forces into
Newtons (N). They are learned with linear regression using ground-truth data (Su
et al., 2012).
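Concretely, the weighted sum above is one matrix-vector product per axis. The following is a minimal sketch; the normals and scaling factors here are random placeholders, whereas the real normals come from the BioTac core geometry and S from the linear regression against force-plate ground truth.

```python
import numpy as np

def analytical_force(E, N, S):
    """Analytical tri-axial force estimate from electrode impedance changes.

    E : (19,) impedance changes E_i on the electrodes
    N : (19, 3) electrode normal vectors (N_{x,i}, N_{y,i}, N_{z,i})
    S : (3,) scaling factors (S_x, S_y, S_z) mapping the raw sums to Newtons
    Returns the estimated (F_x, F_y, F_z) in Newtons.
    """
    return S * (E @ N)  # per-axis scaled weighted sum of the normals

# Placeholder geometry and readings for illustration only.
rng = np.random.default_rng(1)
N = rng.standard_normal((19, 3))
N /= np.linalg.norm(N, axis=1, keepdims=True)   # unit normals
E = rng.random(19)                               # impedance changes
S = np.array([0.1, 0.1, 0.05])                   # hypothetical scaling
print(analytical_force(E, N, S))
```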
To improve the quality of force estimation we apply two other machine learning
methods: Locally Weighted Projection Regression (LWPR) (Vijayakumar et al.,
2005) and regression with neural networks. LWPR is a nonparametric regression
technique that uses locally linear models to perform nonlinear function approxi-
mation. Given N local linear models ψ
k
(x), the estimation of the function value
is performed by computing a weighted mean of the values of all local models:
f(x) =
P
N
k=1
w
k
(x)ψ
k
(x)
P
N
k=1
w
k
(x)
.
The weights determine how much influence each local model has on the function
value based on its distance from the estimation point. The weights are commonly
modelled by a Gaussian distribution:
\[ w_k(x) = \exp\left( -\frac{1}{2} (x - c_k)^{T} D\, (x - c_k) \right), \]
where c_k are the centers of the Gaussians and D is the distance metric. Locally
weighted partial least squares regression is used to learn the weights and the
parameters of each local model.
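The prediction step above can be sketched in a few lines. This is a minimal illustration of the weighted-mean formula only, not the full LWPR algorithm (which also learns the local models and a separate distance metric per model, online); the centers, metric, and local models below are hypothetical.

```python
import numpy as np

def lwpr_predict(x, centers, D, models):
    """Weighted mean of local linear models, as in the prediction equation above.

    x       : (d,) query point
    centers : (N, d) Gaussian centers c_k
    D       : (d, d) distance metric (simplification: a single shared metric)
    models  : list of N callables psi_k implementing the local linear models
    """
    diffs = centers - x
    # w_k(x) = exp(-1/2 (x - c_k)^T D (x - c_k))
    w = np.exp(-0.5 * np.einsum('kd,de,ke->k', diffs, D, diffs))
    psi = np.array([m(x) for m in models])
    return np.dot(w, psi) / np.sum(w)

# Hypothetical 1-D example with two local linear models.
centers = np.array([[0.0], [1.0]])
D = np.eye(1)
models = [lambda x: 2.0 * x[0], lambda x: 1.0 - x[0]]
print(lwpr_predict(np.array([0.5]), centers, D, models))
```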
As our third approach, we use the single-hidden-layer neural network (NN)
proposed by (Wettels et al., 2014). The hidden layer consists of 38 neurons,
twice the number of inputs.
We also propose a fourth approach, where we use a multilayer NN to learn the
mapping from BioTac electrode values to the finger forces. The network consists
of input, output and three hidden layers with 10 neurons each.
For both NN approaches we use neurons with the hyperbolic tangent sigmoid
transfer function:
\[ a = \frac{2}{1 + \exp(-2n)} - 1 . \]
For the activation of the output layer we use a linear transfer function, i.e. the
output is a linear combination of the inputs from the previous layer.
NNs are trained with the error back-propagation and Levenberg-Marquardt opti-
mization technique (Hagan & Menhaj, 1994). To avoid overfitting to the
training data, we employ early stopping during training (Yao et al.,
2007). The data set is divided into mutually exclusive training, validation, and
test sets. While the network parameters are optimized on the training set, the
training stops once the performance on the validation set starts decreasing.
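A minimal stand-in for the multilayer variant can be written with scikit-learn's MLPRegressor, which supports the same early-stopping scheme. Note this is only an approximation of the setup above: scikit-learn does not provide Levenberg-Marquardt training, so this sketch uses its default Adam optimizer, and the data are synthetic placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data: 19 electrode values -> 3 force components.
rng = np.random.default_rng(2)
X = rng.standard_normal((5000, 19))
y = X @ rng.standard_normal((19, 3)) + 0.1 * rng.standard_normal((5000, 3))

# Three hidden layers of 10 tanh units, as in the multilayer NN above;
# early_stopping holds out part of the training data as a validation set
# and stops once validation performance stops improving.
net = MLPRegressor(hidden_layer_sizes=(10, 10, 10), activation='tanh',
                   early_stopping=True, validation_fraction=0.2,
                   max_iter=2000, random_state=0)
net.fit(X, y)
print(net.predict(X[:1]))  # estimated (F_x, F_y, F_z) for one sample
```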
5.3.2 Slip Detection
Robust slip detection is one of the most important features needed in a manip-
ulation task. Knowledge about slip may help the robot to react such that the
object does not fall out of its gripper. In order to detect a slip event, two different
73
estimation techniques are used: a force-derivative method and a pressure-based
method.
The force-derivative method uses changes in the estimated normal force to detect
slip. Since the grip force becomes smaller as the object slips, the negative
derivative of the normal force is used to detect the slip event. Based on
experimental experience, the threshold on the negative normal-force
derivative is set to 1 N/s.
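This test is a one-line comparison on consecutive force estimates; a minimal sketch (with hypothetical force values) follows.

```python
DT = 1.0 / 300.0     # onboard controller period (300 Hz)
THRESHOLD = -1.0     # threshold on the normal-force derivative (N/s)

def force_derivative_slip(f_z_prev, f_z_curr, dt=DT):
    """Flag slip when the estimated normal force drops faster than 1 N/s."""
    return (f_z_curr - f_z_prev) / dt < THRESHOLD

# Hypothetical consecutive normal-force estimates (in N):
print(force_derivative_slip(2.00, 1.99))  # -3 N/s -> True (slip)
print(force_derivative_slip(2.00, 2.00))  #  0 N/s -> False
```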
Slip is also detected using the pressure sensor. Since the BioTac skin contains
a pattern of human-like fingerprints, it is possible to detect slip-related micro-
vibration on the BioTac skin when rubbing against the textured surface of an
object. A bandpass filter (60-700Hz) is first employed to filter the pressure signal.
Second, the absolute value of the signal is calculated since we are interested in the
absolute vibration. Due to the difference between the pressure sensor sampling
frequency (2.2 kHz) and the onboard controller rate (300 Hz), the slip detection
algorithm considers a 10 ms time window (3 cycles of the onboard controller). This guarantees
22 samples of pressure readings in the time window. Slip is detected if 11 out of
22 pressure sensor values exceed the threshold. Based on the experiments, the slip
threshold is set to be twice as large as the baseline vibration caused by the motors
of the robot.
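A minimal sketch of this pipeline follows, assuming raw high-frequency pressure samples are available per 10 ms window; the filter order, threshold value, and data below are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, lfilter

FS = 2200.0   # pressure (vibration) sampling rate, Hz
WINDOW = 22   # samples per 10 ms detection window

# 60-700 Hz Butterworth bandpass (order chosen arbitrarily for this sketch).
b, a = butter(2, [60.0 / (FS / 2), 700.0 / (FS / 2)], btype='band')

def pressure_slip(pac_window, threshold):
    """Pressure-based slip test on one 10 ms window of raw pressure samples.

    Slip is flagged if at least 11 of the 22 band-passed, rectified samples
    exceed the threshold (~2x the motor-noise vibration baseline).
    """
    vib = np.abs(lfilter(b, a, pac_window))
    return np.count_nonzero(vib > threshold) >= WINDOW // 2

# Hypothetical windows: baseline motor noise vs. added slip vibration.
rng = np.random.default_rng(3)
t = np.arange(WINDOW) / FS
quiet = 5.0 * rng.standard_normal(WINDOW)
slipping = quiet + 200.0 * np.sin(2 * np.pi * 300.0 * t)
print(pressure_slip(quiet, threshold=50.0), pressure_slip(slipping, threshold=50.0))
```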
5.3.3 Slip Classification
In the course of our experiments, we observed two main categories of object slip:
linear and rotational. During linear slip, the object maintains its orientation with
respect to the local end-effector frame but gradually slides out of the robot fingers.
During rotational slip, the center of mass of the object tends to rotate about
an axis normal to the grasping surface, although the point of contact with the
74
robot’s fingers might stay the same. It is important to discriminate between these
two kinds of slip to react and control finger forces accordingly. We notice that
rotational slip requires much stronger finger force response than linear slip in order
to robustly keep the object grasped within the robot hand (Kinoshita et al., 1997).
To be able to classify linear and rotational slip, we train a neural network to learn
the mapping from the time-varying BioTac electrode values to the slip class. To
construct the features, we take a certain time interval of electrode values and com-
bine all values inside the window into one long feature vector, e.g. 100 consecutive
timestamps of 19-dimensional electrode values result in a 1900-dimensional input
vector. The architecture of the NN consists of input, output and one hidden layer
with 50 neurons. The hidden layer has a sigmoid transfer function. The softmax
activation function is used in the output neurons. It produces the probabilities of
the signal sequence belonging to one of the slip classes.
As for the force estimation, we use early stopping to prevent overfitting. The
network is trained with the scaled conjugate gradient back-propagation algorithm
(Moller, 1993).
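The window-flattening and classifier setup can be sketched as follows. scikit-learn does not implement scaled conjugate gradient, so this stand-in uses its default Adam optimizer; the data are random placeholders, and only the window size, hidden layer width, and activations follow the description above.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def window_feature(electrodes):
    """Flatten a (100, 19) window of electrode samples into a 1900-D vector."""
    assert electrodes.shape == (100, 19)
    return electrodes.reshape(-1)

# Random stand-in windows labeled 0 = linear slip, 1 = rotational slip.
rng = np.random.default_rng(4)
X = np.stack([window_feature(rng.standard_normal((100, 19))) for _ in range(200)])
y = rng.integers(0, 2, size=200)

# One hidden layer of 50 sigmoid units; the output layer yields class
# probabilities, and early_stopping holds out validation data as above.
clf = MLPClassifier(hidden_layer_sizes=(50,), activation='logistic',
                    early_stopping=True, max_iter=1000, random_state=0)
clf.fit(X, y)
print(clf.predict_proba(X[:1]))  # P(linear), P(rotational) for one window
```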
5.3.4 Grip Controller
In order to test the estimation of the forces and detection of the slip event, we
design a grip controller that is able to take advantage of the estimated information.
Appropriate grip force control is required for the robot to manipulate fragile objects
without damaging or dropping them.
The control algorithm consists of two main stages: grip initiation and object lifting
(Fig. 5.3). During grip initiation, the robot fingers are position-controlled to close on
an object until the estimated normal force (F_z) exceeds a certain threshold. The
threshold is chosen to be very small (0.2 N) in order to avoid damaging the object.
Figure 5.3: Control diagram of the grip controller: the fingers close with contact
detection until all fingers are in contact; the object is then lifted under force
control with slip detection and online friction coefficient estimation (initialized
with μ = 2 and updated via the Coulomb friction law), with the grip force set to
F_z = F_t/μ plus a safety margin.
Once all the fingers are in contact with the object, the position controller is stopped,
and the grip force controller is employed. The force control is used for the entire
object-lifting phase.
In order to establish the minimal required grip force, the force tangential to the
BioTac sensor, F_t, is estimated:
\[ F_t = \sqrt{F_x^2 + F_y^2} . \]
Since the tangential force is directly proportional to the weight of the object, the
grip force F_z is controlled based on the current estimate of the friction coefficient
μ, plus a safety margin:
\[ F_z = \frac{F_t}{\mu} + \text{safety margin} . \]
The friction coefficient is initially set to 2, based on the known friction between
the silicone skin of the BioTac and other common materials. Since the initial friction
coefficient is not estimated accurately, slip may occur during the lifting phase. Once slip
is detected using the force-derivative-based slip detection described earlier, the
friction coefficient is estimated more accurately online using the Coulomb friction
law:
\[ \mu = \frac{F_t}{F_z} . \]
The safety margin was chosen to be 10-40% to account for object acceleration
during manipulation and for additional uncertainty in the friction coefficient. Finally,
the commanded grip force F_z is updated according to the newly estimated friction
coefficient, which provides the minimal force sufficient to lift the object.
The grip control algorithm is shown in Fig. 5.3.
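One control cycle of this law can be sketched as follows; the function name, margin value, and force trace are hypothetical, and the real controller runs at the 300 Hz onboard rate with the estimators described above.

```python
def grip_force_update(F_x, F_y, F_z_current, mu, slip_detected, margin=0.2):
    """One cycle of the grip force law F_z = F_t / mu + safety margin.

    F_x, F_y      : estimated tangential force components on the BioTac (N)
    F_z_current   : current grip (normal) force (N)
    mu            : current friction coefficient estimate (initialized to 2)
    slip_detected : True if the force-derivative slip detector fired
    margin        : relative safety margin (10-40% in the experiments)
    """
    F_t = (F_x ** 2 + F_y ** 2) ** 0.5
    if slip_detected and F_z_current > 0.0:
        mu = F_t / F_z_current  # Coulomb friction law at slip onset
    F_z_cmd = (F_t / mu) * (1.0 + margin)
    return F_z_cmd, mu

# Hypothetical lifting trace: load ramps up, then a slip triggers a mu update.
F_z, mu = 0.2, 2.0
for F_y, slipped in [(0.5, False), (0.9, False), (0.9, True), (0.9, False)]:
    F_z, mu = grip_force_update(0.0, F_y, F_z, mu, slipped)
    print(f"F_t={F_y:.1f} N  mu={mu:.2f}  commanded F_z={F_z:.2f} N")
```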
5.4 Evaluation and Discussion
5.4.1 Force Estimation
In order to evaluate different force estimation methods, we collected a dataset
consisting of raw signals of 19 electrodes. The ground truth data were acquired
using a force plate that was rigidly attached to the table. The BioTac was rigidly
attached to the force plate as shown in Fig. 5.4. In the experiment, the BioTac
77
Figure 5.4: Experimental setup for the force estimation comparison: the finger is
pressed at different positions and orientations against the force plate.
was perturbed manually multiple times from various directions with a wide range
of forces. The data were collected at a frequency of 300 Hz (over 17,000 individual
force readings). The collected data sets were divided into 30-second intervals of
continuous electrode readings. Afterwards, these intervals were randomly shuffled
and divided into 80% training and 20% test sets. Additionally, during the training
of the NNs, 20% of the training set was used as the validation set to prevent
overfitting with the early stopping technique.
Fig. 5.5 shows the results of the four compared methods evaluated on the full
and test sets. In both cases, common estimation metrics were chosen: Root Mean
Squared Error (RMSE) of the force in N and the unitless Standardized Mean Squared
Error (SMSE). SMSE is computed by dividing the MSE by the variance of the
data.
The analytical approach developed previously (Su et al., 2012) is outperformed by
the other three methods. From the results, we draw the conclusion that the LWPR
and 1-layer-NN methods overfitted to the data, i.e., they perform better on the full
dataset than the other methods but yield inferior performance on the set that
Figure 5.5: The performance comparison between the force estimation techniques
(RMSE and SMSE of F_x, F_y, and F_z on the full and test data sets, for the
analytical, LWPR, 1-layer NN, and 3-layer NN methods). The analytical approach
is outperformed by the other methods. LWPR and the 1-layer NN perform well on
the full data set but have low performance on the test set. The 3-layer NN avoids
overfitting and yields good results on the test set.
was not seen during training. The 3-layer neural network approach, however,
achieved good results on the test set and avoided overfitting. It illustrates that
the deeper structure of the NN was able to capture the high-dimensional force
mapping more accurately. On the test set, we could achieve the best RMSE of
0.43N in the x-direction, 0.53N in the y-direction and 0.85N in the z-direction.
It is also worth noting that there exists a significant difference between different
force directions in the case of the MSE evaluation. It can be explained by the
range of forces that were exerted on the sensor. Since F_z is the vertical axis of the
Figure 5.6: Example of force estimation with the different methods over time.
From top to bottom: force estimation for F_x, F_y, and F_z.
BioTac, the forces experienced during the experiments vary more than in the other
directions. An SMSE comparison is more appropriate in this case as it incorporates
the range of the data. The best SMSE values on the test set were achieved with
the 3-layer NN: 0.08 for the x-direction, 0.03 for the y-direction and 0.02 for the
z-direction.
In addition to the absolute errors, it is important to see how the estimation errors
correspond to the actual forces over time. An exemplary result is depicted in Fig.
Figure 5.7: Different objects used for the experiments.
5.6. One problem of the analytical approach is that it has an offset that differs in
various situations. The assumption that each electrode is mostly sensitive to skin
compression along its normal vector is not able to capture the non-linear patterns
given by the highly non-linear deformation on the silicone skin of the BioTac. In
the case of LWPR and NNs, the results are similar. One can notice, however,
that the LWPR force estimation produces forces that are not as smooth as the
NN approaches. The difference between the two NN approaches is too small to be
noticed on this data set. Given the results obtained from the test data set, the
3-layer NN approach yields better performance than the other methods.
5.4.2 Slip Detection
We tested the previously described slip detection algorithms on two objects with
distinctive textures: a plastic jar with a smooth surface and a wooden block with
rough texture (see Fig. 5.7). In both cases, we attached an IMU to the objects in
order to detect the moment when the object starts moving. In order to make the
Figure 5.8: An example run of the slip detection experiment, showing the right
finger position, the finger force derivatives with the force-derivative slip marker,
the finger vibration signals with the vibration slip marker, and the IMU acceleration
with the IMU slip marker. Using the BioTac sensor, we are able to detect the slip
event before the IMU accelerometer attached to the object measures any acceleration
due to slip.
object slip, the robot first grasps and picks up the object, and then opens its right
finger by 0.04 rad. The collected dataset consists of 20 slip events per object.
An example run of the slip detection experiment using the wooden block is depicted
in Fig. 5.8. One can see that using the force-derivative and the pressure-based
method, we were able to detect slip even before it was noticed by the IMU. It is
also worth noting that the pressure-based method can detect slip sooner than the
force-derivative method. This may be caused by the higher sampling rate of the
pressure sensor. It is also the case that in the very initial stage of slip
(incipient slip), the microscopic slip effects are not yet visible at the electrodes.
Nonetheless, the slight movement of the fingerprints is picked up by the high-
frequency pressure-based slip detection signal.
82
Statistical analysis of the experiments shows that the robot is able to detect slip
using the force-derivative method 5.7 ms ± 4.5 ms (plastic jar) and 7.8 ms ± 3.6 ms
(wooden block) before the movement is noticed by the IMU. The pressure-based
method detects slip even sooner: 32.8 ms ± 4.2 ms (plastic jar) and 35.7 ms ±
6.0 ms (wooden block) before the object motion is detected by the IMU. These
results indicate that the BioTac is able to quickly and reliably detect slip, which is
important for robust grip control.
5.4.3 Slip Classification
To evaluate the NN approach for the classification of two kinds of slip events,
four objects were chosen: a wooden block, an oil bottle, a wipes box, and a jar with
added weights (see Fig. 5.7). For training, the robot grasped an object either
approximately at the center of mass of the object or at the edge of the object.
These two grasping methods caused either linear (if grasped at the center of mass)
or rotational slip of the object while it was being picked up. In order to detect
slip, an IMU was attached to the object. For each object, over 80 grasps were
performed (40 for the linear slip and 40 for the rotational slip). The dataset was
randomly shuffled and divided into the 80% training and 20% test sets. Similar
to the force estimation, 20% of the training set was used for the validation during
the NN training.
Results of the experiments are depicted in Fig. 5.9. For the input of the NN, points
from 100 consecutive timestamps were selected, resulting in a 1900-dimensional
input vector. Each point in Fig. 5.9 corresponds to the last timestamp that was
taken into account as the NN input, i.e. the point when we classify slip given 100
previous values. The moment when the slip was detected by the IMU is depicted
by a vertical line. As more data are gathered during an actual slip, classification
Figure 5.9: Linear/rotational slip classification accuracy as a function of the time
of prediction. The red line shows the point at which slip is detected based on the
IMU readings.
accuracy improves as expected. However, it is worth noting that, using the NN
approach, the robot is able to achieve an approximately 80% classification rate
before the IMU is even able to notice that the slip event has started. Our algorithm
accurately detects the slip class even before significant object motion is detected
(using an IMU), allowing more time for the robot to respond appropriately.
5.4.4 Grip Controller
The grip controller was evaluated using two different objects with varying weight:
a plastic jar (see Fig. 5.7) with the weight ranging from 100g to 1500g and a
plastic cup (see Fig. 5.1, top) with the weight ranging from 10g to 500g. In each
experiment, the robot grasped the object approximately at its center of mass, lifted
it off the table, held it in the air, and placed it back on the table.
Figure 5.10: An example run of the grip controller that includes all of the grasping
phases (reach, grip/lift, hold, replace, open), showing the finger positions, grip
forces, force derivatives, vibration signals, and IMU acceleration over time.
Mechanical transients correspond to slip detection and collisions with the
environment; contact responses correspond to contact detection, tri-axial contact
forces, and the friction coefficient.
Fig. 5.10 shows an example run of the grip controller. During the reaching phase,
the robot’s fingers detect the contact with the jar using the normal force estima-
tion. This is the moment when the grip force control is employed (10 seconds in
the experiment). When the robot starts to lift the jar, the grip force (F
z
) starts
to increase proportionally to the tangential force (F
t
) sensed on the BioTac. The
friction coefficient (μ) is updated at approximately the 18th second of the exper-
iment when the slip event is detected by the force-derivative method. After the
20th second of the experiment, the jar was successfully picked up and held in the
air. Two 150 g weight plates were added to the jar at the 32nd and 38th seconds,
consecutively. It is worth noting that the grip controller detected the two slip
events using the force-derivative method and increased the grip force to prevent
further slip. When the robot placed the jar back on the table, there were large
spikes in the slip detection signal (at the 48th second). These may be used to
detect the collision with the environment and to release the object without pressing
the jar against the table with excessive force.
5.5 Conclusions and Future Work
In this work, we explored how one can use biomimetic tactile sensors to extract
useful tactile information needed for robust robotic grasping and manipulation.
We performed estimation of the normal and tangential forces that occur while
holding and manipulating objects. Machine learning techniques were employed
and evaluated to learn the non-linear mapping from raw sensor values to forces. As
the experiments demonstrated, the best performance was achieved using 3-layer
neural network regression.
Different modalities available from the BioTac sensor were used to perform detec-
tion of the slip event. The best performance was observed with the pressure-based
method, where slip was detected more than 30ms before it was picked up by an
IMU accelerometer.
Slip classification into a linear or rotational slip was observed to be important for
robust object handling due to different requirements for finger force response. We
achieved an 80% classification success rate using a neural network approach before
the slip event was detected by an IMU accelerometer. This indicates that the robot
should be able to adjust finger forces at a very early stage of the slip and, therefore,
prevent the object from moving inside the hand. In future work, a controller
that uses this classification will be employed and evaluated.
In order to test the above-mentioned estimation techniques, we created a grip force
controller that adjusts the gripping force according to the forces acting on the
fingers. We presented an example run of the controller during the entire grasping
experiment. Our results indicate that, by using the grip controller, the robot is
able to successfully grasp even easily deformable objects such as a plastic cup (Fig.
5.1).
At present, we are able to detect simple manipulation events and estimate forces.
In the future, we plan to predict more high-level features such as grasp stability,
which can be used to plan high-level decisions to manipulate objects successfully.
Chapter 6
Precision Grip Force Control with
Slip Prediction and Classification
6.1 Motivation
Tactile sensing enables humans to perform dexterous pinch grasps and pinch
manipulation on objects despite environmental uncertainties that are impossible
or extremely challenging to estimate from other sensory signals, such as vision.
These environmental uncertainties include poor object pose estimates, unknown
mass and center-of-mass properties, unknown friction properties, and uncertain
curvatures at the local contact surfaces between fingers and objects. Due to long
time delays, feedback control through reactive policies cannot support the swift
and skilled coordination of fingertip forces observed in most manipulation tasks
involving ordinary "passive" objects. Instead, humans rely on feedforward control
mechanisms that take advantage of the stable and predictable physical properties
of these objects. Even under various forms of uncertainty, humans tend to predict
translational and rotational slip events and prevent them by scaling their grip
forces in proportion to the mass, center of mass, and frictional properties of the
objects, and to the curvatures at the local contact surfaces (Flanagan & Johansson, 2002).
Neurophysiological studies have shown that the mechanoreceptors in the glabrous
skin enable humans to sense translational loads (Johansson & Westling, 1987,
1984) and torsional loads (Kinoshita et al., 1997), as well as the directions of these
loads (Birznieks et al., 2010). If the desired object poses are purely translational,
grip forces are regulated whenever translational or rotational slips are predicted.
If the desired object poses are purely rotational, grip forces are regulated whenever
translational slips are predicted. Because the literature has shown that higher grip
forces are required to prevent rotational slips than translational slips, the grip
forces should be regulated at different rates for predicted translational and
rotational slips.
6.2 Problem Statement
State-of-the-art research robotic platforms, such as the Autonomous Robotic
Manipulation platform from the DARPA challenge, are equipped with cable-driven,
non-industrial robot arms and state-of-the-art computer vision systems. These
vision systems usually achieve sub-centimeter accuracy for object pose estimation.
However, non-linear cable stretch and motor-side encoders on cable-driven,
non-industrial robot arms introduce varying pose errors across different parts of
the robot's workspace. Together, these two sources of error typically result in only
centimeter-level accuracy on this research platform. Therefore, it suffers from
environmental uncertainties similar to those faced by its human counterparts
during dexterous pinch grasp and pinch manipulation tasks. The robotic hand on
our robot has a very limited active gripping force (15 N) using a pinch grasp,
which is much smaller than the human pinch grasp (~60 N). The robot finger
motors stall when the gripping forces exceed the active gripping force recommended
by the manufacturer. Our robot platform is equipped with state-of-the-art
multimodal biomimetic tactile sensors. The goal of this work is to give the robot
the capabilities of predicting translational slips, rotational slips, and the directions
of these slips, and of controlling the grip force to prevent them.
6.3 Related Work
In (Veiga et al., 2015), the authors used supervised learning methods, such as
random forest classifiers, to predict future slips. Their experimental setup was very
limited because it required the experimenter to visually label slip events from a
video camera, which has a significantly lower sampling rate than tactile sensors. In
their more recent work (Veiga & Peters, 2016), they adopted a motor babbling
approach, randomly moving the fingertips against non-moving objects and using
the changes of joint angles to label slip events. This experimental setup prevents
them from capturing rotational slips, because rotational slip events happen while
the finger joints are not changing. Therefore, their work only addresses predicting
translational slips. In their grip controller, they incrementally add weights for
detected slips or subtract weights for stable contacts and apply an exponential
regulator to this weight. The exponentially regulated weight is used to control the
velocity of the finger joint. Therefore, the velocity quickly approaches zero when
no slip is predicted. After a long period of stable contact, the weights become very
negative and take a while to ramp the velocity back up, causing significant delays
in the grip control.
Our previous work (Su et al., 2015) used spatial and temporal tactile sensor data
from the BioTac electrodes and a neural network to predict the types of
translational and rotational slips, achieving an approximately 80% classification
rate before the ground truth even registered that the slip event had started. The
ground truth was labeled by placing an inertial measurement unit on the grasped
object. In (Meier et al., 2016), the authors integrated spatial and temporal tactile
sensor data from a piezoresistive sensor array through deep learning techniques;
the network is not only able to classify the contact state into stable versus slipping,
but also to distinguish between rotational and translational slippage. They
evaluated different network layouts and reached a final classification rate of more
than 97% within 10 ms. However, none of these works was evaluated in a grip
controller on a real robot, and the slip classifiers do not differentiate the directions
of slips. Previous work (Yao et al., 2017) proposed a rotation detection method as
well as a center-of-mass estimation approach to explore the center of mass of cubic
objects. Their approach is limited to cubic objects, and the rotation detection
relied on a pair of symmetrical sensors in a three-point contact scenario. Our
method is not limited in the number of contacts.
In (Chebotar et al., 2016), the authors developed a self-supervised experimental
setup to train a grasp stability predictor whose ground truth labels are provided
by whether a grasp remains stable after the robot performs a range of shaking
motions. This self-supervised setup and stability predictor do not account for an
object that is rotated in-hand but is successfully lifted off the table and not shaken
out of the grasp.
6.4 Technical Approach
6.4.1 Experimental Setup
Our experimental setup consists of a pair of BioTacs forming a pinch grasp on a
cardboard box (~250 g). The 6D pose of the cardboard box is tracked in real time
with a Vicon motion capture system, as shown in Fig. 6.1.
Figure 6.1: The Barrett Hand, equipped with a pair of BioTacs, forming a pinch
grasp on an object. The Vicon motion capture markers are used to track the 6D
pose of the object.
6.4.2 Data Collection
A single data collection trial consists of the robot reaching to a predefined pose
relative to the center of mass of the object, forming a pinch grasp on the object at
predefined normal force, loading its weight, and lifting it to a predefined height.
There are four types of contact events during a data collection trial: translational
slips, if the object slips out of the grip without significant rotation during the
lifting movement; rotational slips along the positive and negative directions of the
grip axis; and stable contact, if the robot successfully lifts the object off the table
without any type of slip. The translational and rotational slip labels are
automatically generated by thresholding the relative position and orientation
between the object and the fingers. Fig. 6.2 (left) shows a typical example of a
rotational slip, and Fig. 6.2 (right) shows a typical example of a stable grip without
translational or rotational slips.
The predefined normal forces range from 1 N to 10 N in 1 N intervals, and the
predefined relative pose to the center of mass ranges from -5 cm to 5 cm in 1 cm
intervals. At any given predefined relative pose to the center of mass, the robot
collects data by
Figure 6.2: Left: An example of a rotational slip trial and of slip prediction data
extraction using motion capture data. The slip prediction data begin when the
fingers start to load the weight of the object, shown by the first dotted vertical
line and identified from the relative position changes between the fingers and the
object due to small finger skin distortions. The slip event, here a rotational slip,
is labeled from the relative orientation between the fingertips and the object, shown
as the second vertical dotted line. The data between these two vertical lines are
extracted to train a slip predictor to predict rotational slips. Right: An example of
a stable grip without translational or rotational slips. During the lifting movement,
there are no significant position or orientation differences between the fingertips
and the object. Therefore, the second vertical line is placed at the end of the
lifting movement of the robot hand. The data between these two vertical lines are
extracted to train a slip predictor to predict stable contact.
incrementally increasing the normal force in 1 N intervals until it can successfully
lift the object off the table without translational or rotational slips. We collect
20 trials for each of these four contact event types, for a total of 80 trials.
6.4.3 Slip Prediction
We treat slip prediction as a multi-class classification problem:
\[ c_t = h(\phi(x_{(t-\tau):t})) \qquad (6.1) \]
where a classifier h(·) is used to predict the state of the grip at time t as one of
four types of events, c_t ∈ {contact, ts, rs+, rs-}, where contact, ts, rs+, and rs-
represent stable contact, translational slip, positive rotational slip along the grip
axis, and negative rotational slip along the grip axis, respectively. The inputs to
the classifier are features φ(·) of the raw sensor values x_{(t-τ):t}, where τ is a time
window. x_{(t-τ):t} are the raw sensory data from the 19 skin-deformation-sensing
electrodes on each BioTac during all four types of contact events. For example,
the 19 electrodes from a typical rotational slip and a typical stable grip are
visualized in the fourth and fifth plots from the top of Fig. 6.2. The fifth plot from
the top of Fig. 6.2 is a heatmap in which the x-axis is time, the y-axis is the
electrode index, and the colors represent the digital values on these electrodes. To
construct the features, we take a time window τ of electrode values from the
extracted sensory data and combine all values inside the window into one long
feature vector; e.g., 10 consecutive timestamps of 19-dimensional electrode values
result in a 190-dimensional input vector, and therefore a 380-dimensional input
vector for the two BioTacs forming a pinch grasp on the object.
We chose logistic regression to classify these four types of contact events. The
classification is performed using logistic regression with regularization, which is a
form of probabilistic classifier. The probability of each contact event, such as a
translational slip y_ts = 1, is given by:
\[ p(y_{ts} = 1 \mid x_s) = \left( 1 + \exp(-\theta^{T} \phi(x_s)) \right)^{-1} \qquad (6.2) \]
where φ(x_s) is a vector of features describing the sensory signals during contact
events, and the weight vector θ is computed iteratively from the training data
through a conjugate gradient method.
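A minimal sketch of this classifier follows, using scikit-learn's LogisticRegression as a stand-in (its solvers are quasi-Newton rather than the conjugate gradient method described above); the window length, stream lengths, and all data are hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

TAU = 10  # window length in timestamps (tau = 1 is used in Sec. 6.5)

def window_feature(left, right, t, tau=TAU):
    """phi(x_{(t-tau):t}): stack tau timestamps of 2 x 19 electrode values
    into one vector (380-D for tau = 10 and a two-finger pinch grasp)."""
    return np.hstack([left[t - tau:t], right[t - tau:t]]).reshape(-1)

# Random stand-in electrode streams and labels for the four contact events.
rng = np.random.default_rng(5)
left = rng.standard_normal((500, 19))
right = rng.standard_normal((500, 19))
X = np.stack([window_feature(left, right, t) for t in range(TAU, 500)])
y = rng.choice(['contact', 'ts', 'rs+', 'rs-'], size=X.shape[0])

# L2-regularized multi-class logistic regression.
clf = LogisticRegression(C=1.0, max_iter=1000)
clf.fit(X, y)
print(clf.classes_, clf.predict_proba(X[:1]))  # per-event probabilities
```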
6.4.4 Grip Control Law
A grip force control law includes an original feedforward component as well as an
updated feedforward component, F_ff and F'_ff. The original feedforward component
can be estimated from previous experience, such as the minimum contact force
needed to maintain a stable grasp when grasping and manipulating similar objects.
Because our work does not assume visual recognition of the given objects, we always
choose the original feedforward force to be a minimum grip force, such as 1 N.
The updated feedforward component, F'_ff, is used to modulate the desired grip
force to cope with object uncertainties, such as mass and center-of-mass properties.
We propose a grip control law that converts the predicted slip events c_t into
changes of the feedforward grip force as follows, using a hyperbolic tangent sigmoid
to guarantee a smooth and bounded contact force:
\[ F'_{ff} = \beta \left( \frac{2}{1 + \exp(-\tau\, W_{slip,t}\, F_N / \beta)} - 1 \right) \qquad (6.3) \]
where F'_ff is the desired feedforward grip force at the given tactile sensor; β is the
maximum grip force recommended by the robot hand manufacturer; τ is proportional
to the time constant for ramping up the grip force and can be used to scale the grip
force based on the estimated friction coefficient and local contact curvatures; F_N
is the current estimated contact normal force on the given tactile sensor, which is
used to cope with the degraded contact-force calibration of different tactile sensors
due to sensor deflation; and W_slip,t is the accumulated slip-event weight predicted
by the slip predictor:
\[ W_{slip,t+1} = \begin{cases} W_{slip,t} + w_{ts}, & \text{if } ts \\ W_{slip,t} + w_{rs}, & \text{if } rs+ \text{ or } rs- \\ W_{slip,t}, & \text{otherwise} \end{cases} \]
where w_ts and w_rs are the weights for predicted translational slips and rotational
slips, respectively. Adding different weights w_ts and w_rs increases the desired grip
force at different rates. These weights can be scaled by the currently estimated
contact tangential force and torque to account for a larger range of object masses.
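The force law of Eq. (6.3) and the weight accumulation can be sketched as follows; the weight values, normal force, and event stream are hypothetical placeholders, and only β (the 15 N hand limit) comes from the setup described above.

```python
import math

BETA = 15.0  # max recommended pinch grip force of the hand (N)
W_TS = 0.5   # weight per predicted translational-slip sample (hypothetical)
W_RS = 2.0   # larger weight per predicted rotational-slip sample (hypothetical)

def feedforward_force(W_slip, F_N, tau=1.0, beta=BETA):
    """Eq. (6.3): smooth hyperbolic-tangent-sigmoid force law, saturating at
    the manufacturer's maximum grip force beta."""
    return beta * (2.0 / (1.0 + math.exp(-tau * W_slip * F_N / beta)) - 1.0)

def accumulate(W_slip, event):
    """Accumulate slip-event weights; rotational slips ramp the force faster."""
    if event == 'ts':
        return W_slip + W_TS
    if event in ('rs+', 'rs-'):
        return W_slip + W_RS
    return W_slip  # stable contact: hold the current weight

# Hypothetical stream of events from the slip predictor.
W, F_N = 0.0, 1.0
for event in ['contact', 'ts', 'ts', 'rs+', 'rs+', 'contact']:
    W = accumulate(W, event)
    print(f"{event:7s} W={W:4.1f}  F'_ff = {feedforward_force(W, F_N):5.2f} N")
```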
Improvements to the current grip control law can be made as follows. 1) τ should
be estimated from the ratio between the tangential and normal loads during the
initial ~100 ms of the loading phase, because the friction coefficient and local
contact curvatures are embedded in this ratio. For example, if the friction
coefficient is lower than expected, the ratio is lower than expected, and therefore
τ should be set to a higher value, i.e., a faster ramp-up of the grip force. A
concave local contact surface induces a higher ratio between tangential and normal
load than a convex one, so τ should be set to a smaller value for a concave surface
than for a convex surface. 2) w_ts and w_rs are constant, and the force increase is
proportional to the accumulated predicted slip events, which is equivalent to the
duration of the predicted slip events; this causes objects to be lifted off the table
after different durations; however,
Figure 6.3: Feedforward adjustments of motor output to object weight (A),
frictional conditions (B), and object shape (C) in a task in which a test object is
lifted with a precision grip, held in the air, and then replaced. The top graphs show
the horizontal grip force, vertical load force, and vertical position of the object as
functions of time for two superimposed trials. The bottom graphs show the relation
between load force and grip force for the same trials. The dashed line indicates the
minimum grip-to-load force ratio required to prevent slip; the gray area represents
the safety margin against slip. After contact with the object (leftmost vertical line,
top), grip force increases for a short period while the grip is established. A command
is then released for simultaneous increases in grip and load force (second vertical
line). This increase continues until the load force overcomes the force of gravity
and the object lifts off (third vertical line). After the object is replaced and table
contact occurs (fourth line), there is a short delay before the two forces decline in
parallel (fifth line) until the object is released (sixth line). (Adapted from
(Johansson & Westling, 1984, 1988; Jenmalm & Johansson, 1997).)
humans tend to lift an object off the table within the same duration regardless of
its material properties, as shown in Fig. 6.3. In the future, we should fix the
duration of the predicted slip events and scale the weights w_ts and w_rs with the
currently estimated tangential forces and torques. This means W_slip,t would be a
function of the form W_slip,t = h(F_tan(x_(t-τ):t), F_tor(x_(t-τ):t)) or
W_slip,t = h(F'_tan(x_(t-τ):t), F'_tor(x_(t-τ):t)).
6.5 Completed Experiments and Results
6.5.1 Slip Prediction Results
We report the slip prediction results in terms of classification accuracy and
generalization capability. We first describe the experimental procedure and the
classification results. We split the 80 trials of data into a 90% training set and a
10% generalization test set. The generalization test set consists of held-out trials
that are never trained on. The 90% training portion is then randomly shuffled and
divided into 80% training and 20% test sets. In all experiments, we set the time
window to τ = 1.
First, we perform experiments using the logistic regression classifier described in
Sec. 6.4.3. We report the results of these experiments as confusion matrices, shown
in Fig. 6.4. We report confusion matrices instead of classification accuracy because
the majority of labels are stable-contact labels; the classification accuracy could
therefore be very high even while failing to predict any of the slip events.
One observation can be made regarding the classification results: 13.9% and 17.5%
of the rotational slips are classified as stable contacts at the beginning of the
loading phase, as shown in Fig. 6.4 (left). This potentially results in delayed slip
predictions, which would allow some amount of rotation at the beginning of the
loading phase. Another observation is that this classifier generalizes well to
held-out data sets, as shown in Fig. 6.4 (right).
Second, we examine how well the learned classifier generalizes to previously unseen
objects (Fig. 6.6). On these unseen objects, we collected two trials for each contact
event except translational slips. Because these objects weigh less than 200 g and
have high friction coefficients, only rotational slips and stable contacts could be
induced with the minimum grip force available on the Barrett robot fingers.
Figure 6.4: Training, testing, and generalization test results from the logistic
regression classifier, presented as confusion matrices.
Figure 6.5: The classification results on the three novel objects, presented as
confusion matrices.
We perform classification on the data collected on these objects and report the
classification results in Fig. 6.5.
One observation from these results is that the predictor trained on the cardboard
box performed poorly in predicting negative rotations. This may be due to the
deformable surfaces of these objects. We will perform further analysis on these
data.
6.5.2 Grip Control Experiments
As the final test of the generalization capabilities of our slip prediction method, we
examine the ability to perform grip stabilization on novel objects, including the
three novel objects shown in Fig. 6.6 and the cardboard box with a modified center
of mass. We used the predicted slip events from the slip predictor in the grip controller
Figure 6.6: We included three novel objects to test the robustness and
generalization capability of the learned classifier. From the upper left, clockwise: a
long cardboard tea box with unknown center of mass, a tall cardboard tea box with
unknown center of mass, and a foam brick. All three novel objects are deformable.
Figure 6.7: Percentage of successful grip trials.
to update the desired feedforward grip force presented in Sec. 6.4.4. We conduct
10 trials per object and report the percentage of trials that successfully lift the
object off the table with less than 1 cm of translational slip and ±15° of rotational
slip, as shown in Fig. 6.7.
6.6 Conclusion and future work
In this work, a slip predictor was trained to predict slippage as well as to classify
slips into translational slips, rotational slips, and stable grips. The predictor
achieved above 80% classification success rate on the trained object and generalized
to three novel objects. The predicted slip events were used to regulate the grip
force, which prevented objects from undergoing translational as well as rotational
slips. The slip predictor achieved 60%-90% stable grasps on the trained object and
the three novel objects. In the future, the translational and rotational slip
prediction model can be further extended to objects with other material properties,
such as different contact curvatures and frictional properties.
Chapter 7
Manipulation Graph Acquisition
Using Tactile Sensing
Complex contact manipulation tasks can be decomposed into sequences of motor
primitives. Individual primitives often end with a distinct contact state, such as
inserting a screwdriver tip into a screw head or loosening it through twisting. To
achieve robust execution, the robot should be able to verify that the primitive’s
goal has been reached as well as disambiguate it from erroneous contact states.
In this chapter, we introduce and evaluate a framework to autonomously construct
manipulation graphs from manipulation demonstrations. Our manipulation graphs
include sequences of motor primitives for performing a manipulation task as well
as corresponding contact state information. The sensory models for the contact
states allow the robot to verify the goal of each motor primitive as well as detect
erroneous contact changes. The proposed framework was experimentally evaluated
on grasping, unscrewing, and insertion tasks on a Barrett arm and hand equipped
with two BioTacs. The results of our experiments indicate that the learned manip-
ulation graphs achieve more robust manipulation executions by confirming sensory
goals as well as discovering and detecting novel failure modes.
7.1 Introduction
Object manipulation tasks can be decomposed into sequences of discrete motor
primitives. For instance, an assembly task like unscrewing a screw involves insert-
ing the screwdriver tip into the head of the screw and twisting it. Each of these
motor primitives terminates in a sensory event that corresponds to a sensorimo-
tor subgoal of the task (Flanagan et al., 2006), e.g., making contact between the
screwdriver tip and the head of the screw, and loosening the screw. In humans,
these distinct sensory events have been characterized by specific neural responses in
cutaneous sensory afferents on the fingertips (Johansson & Flanagan, 2009a). For
example, fast- and slow-adapting type one afferents (FA-I, SA-I) respond strongly
to making or breaking contact as well as normal and tangential forces between
fingertips and hand-held tools. When a tool makes contact with or slides against an
object, the fast-adapting type two afferents (FA-II) sense the vibrations indicating
the contact changes.
In (Klingbeil et al., 2016), Klingbeil et al. showed that humans can perform
manipulation tasks requiring complicated contact changes indirectly through tools,
and that they achieve robust manipulation by spending a significant amount of
time at a few distinct types of contact states, which often require exploratory
strategies to disambiguate. It is desirable to equip robots with this capability. For
example, a robot could learn to disambiguate between a successful insertion into a
screw head and making contact with a flat surface, as shown in Fig. 7.1.
In our previous work (Su et al., 2016), we proposed a method to segment demon-
strated manipulation tasks into a sequence of sensorimotor primitives using unsu-
pervised Bayesian online changepoint detection (BOCPD) (Adams & MacKay,
2007) with multimodal haptic signals. In this chapter, we expand our previous work
Figure 7.1: Robot needs to learn to disambiguate successful insertion from failed
insertion into a screw.
into a manipulation skill acquisition framework by making the following improve-
ments. First, correspondences of the segmented motor primitives from multiple
demonstrations are found by clustering the final poses of all the segments extracted
from BOCPD. After clustering these segments, skill clusters and the frequency of
transitions between clusters within demonstrations are used as nodes and edges to
build a skill graph. The robot then performs the task by replaying a sequence of
motor primitives by traversing the skill graph. A sequence of exploratory move-
ments is performed at the end of each motor primitive execution. The resulting
sensory signals, i.e. from tactile sensors in this work, are clustered to identify sen-
sory events corresponding to distinct contact states, which we refer to as modes.
These modes are formed from successful and failed skill executions. Finally, a uni-
fied manipulation graph is built with both the motor primitives and the modes,
and the learned graph is used by the robot to verify successful skill executions by
detecting contact state changes and discovering novel failures. The overall frame-
work is shown in Fig. 7.2.
Figure 7.2: Overview of the framework used in this experiment.
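The graph-construction step described above — cluster segment end poses, then count transitions between clusters within each demonstration — can be sketched as follows. This is a minimal illustration under simplifying assumptions: the cluster count is fixed, the "poses" are 3-D positions, and all data are synthetic placeholders.

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

# Synthetic final poses (x, y, z) of BOCPD segments, grouped per demonstration:
# 5 demonstrations with 4 segments each, clustered around 4 skill end poses.
rng = np.random.default_rng(6)
demos = [[rng.normal(loc=float(k), scale=0.05, size=3) for k in range(4)]
         for _ in range(5)]

# Cluster segment end poses to find skill correspondences across demonstrations.
poses = np.vstack([p for demo in demos for p in demo])
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(poses)

# Nodes are skill clusters; edge weights count within-demonstration transitions.
graph = nx.DiGraph()
i = 0
for demo in demos:
    seq = labels[i:i + len(demo)]
    i += len(demo)
    for a, b in zip(seq[:-1], seq[1:]):
        w = graph.get_edge_data(int(a), int(b), default={'weight': 0})['weight']
        graph.add_edge(int(a), int(b), weight=w + 1)

print(sorted(graph.edges(data=True)))
```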
The proposed framework was evaluated on three manipulation tasks: a grasping
task, an unscrewing task, and a peg insertion task (0.5mm tolerance). The experi-
ments evaluated the robot on segmenting demonstrations, clustering segments from
multiple demonstrations, and building manipulation graphs, as well as on
discovering novel failure cases.
7.2 Related Work
Imitation learning methods are an effective approach for transferring human
manipulation skills to robots. These methods often learn motor primitive libraries
that generalize between different task contexts (Pastor et al., 2012; Chebotar
et al., 2014; Manschitz et al., 2015). The motor primitives are usually trained on
pre-segmented trajectories, and they tend to terminate after a fixed duration or
once they have reached a predefined pose threshold from their goal. Kappler et al.
proposed a framework for using multimodal signals to switch between primitives
(Kappler et al., 2015). Their approach models the stereotypical sensor signals as
functions of time rather than monitoring for a specific sensory event of the primi-
tive’s goal and failures. Niekum et al. learn a finite-state automaton or skill graph
that selects the next primitive based on the current state (Niekum et al., 2015a).
Methods for segmenting manipulations into sequences of primitives (Meier et al.,
2011; Niekum et al., 2015a; Lioutikov et al., 2017) usually use proprioceptive sig-
nals of the robot and the locations of the objects to segment the demonstrations.
Konidaris et al. used the returns from a Reinforcement Learning framework to
segment demonstrated trajectories (Konidaris et al., 2012). Niekum et al. have
proposed an approximate online Bayesian changepoint detection method to seg-
ment demonstrations by detecting changes in the articulated motion of objects
(Niekum et al., 2015b). The authors also proposed verification tests to verify
skills have been successfully executed before switching onto subsequent skills. We
use low and high-frequency tactile signals, inspired by human sensorimotor prim-
itives (Flanagan et al., 2006), to detect contact events for segmentation as well
as discovering modes corresponding to successful and failed executions. It has been
shown that high-frequency tactile signals are particularly important for manipu-
lation tasks (Romano et al., 2011; Su et al., 2015). Recently, Chu et al. have
shown the importance of using multiple sensory modalities, such as force/torque
sensing and vision, to improve skill segmentation (Chu et al., 2017). Other tech-
niques have been proposed for decomposing tasks into modes based on changes
in the state transition model (Kroemer et al., 2014; Kulick et al., 2015). Motor
primitives are subsequently optimized for switching between the modes.
In the planning domain, a contact manipulation task can be treated as a contact
motion planning problem by dividing it into a sequence of contact state transitions
which can be represented as connections in a graph (Ji & Xiao, 2001). However,
the graph size grows combinatorially with the number of contact states. Lee et
al. proposed a hierarchical approach to decrease the search space by planning for
three subproblems: finding sequences of object contact states, finding sequences
of object’s poses, and finding sequences of contact points for manipulators on the
object (Lee et al., 2015). Jain et al. proposed to solve contact manipulation tasks
with a hierarchical POMDP motion planner that develops high-level discrete-state plans to find sequences of local models to visit and low-level cost-optimized continuous-state belief-space plans (Jain & Niekum, 2018). Previous work in motion
planning (Alami et al., 1994, 1990) has also created manipulation graphs with
modes, which represent finite and discrete sub-manifolds of the full configuration
space. In our work, the modes are discovered by clustering the sensory sig-
nals after executing a sequence of skills and they correspond to different types of
contact constraints formed at the end of these skill executions.
In (Klingbeil et al., 2016), Klingbeil et al. developed a framework to analyze
human control strategies while humans demonstrated complex contact manipula-
tion tasks in a virtual environment with visual and haptic feedback. Their experi-
ments showed that humans tend to explicitly control and explore only a few contact
states along the manipulation trajectories due to physiological delay limits. This
work seems to agree with our assumption that these few states correspond to the
subgoals of the manipulation tasks and a robot should develop sensory models at
these key contact states.
Guarded motions are primitives that terminate when a sensory condition is ful-
filled. These primitives are widely used in industrial applications and prosthetics to
avoid excessive force (Deiterding & Henrich, 2007; Matulevich et al., 2013). The
termination conditions are usually manually predefined.
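For illustration, a minimal sketch of a guarded motion in Python follows; the robot and tactile interfaces, the force threshold, and the control rate are hypothetical placeholders rather than the API of any specific system described here.

import time

FORCE_THRESHOLD = 2.0  # N; a manually predefined termination condition (assumed value)
CONTROL_RATE = 100.0   # Hz; assumed control-loop rate

def guarded_move(robot, tactile, velocity):
    """Move at a constant velocity until the sensed normal force exceeds
    a predefined threshold, then stop -- the guarded-motion pattern."""
    period = 1.0 / CONTROL_RATE
    while tactile.normal_force() < FORCE_THRESHOLD:
        robot.command_velocity(velocity)      # keep approaching the contact
        time.sleep(period)
    robot.command_velocity((0.0, 0.0, 0.0))   # terminate on the sensory event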
7.3 Approach
We present a framework for autonomously segmenting manipulations, clustering
segments into skill primitives, and discovering corresponding modes to create a
manipulation graph. The manipulation graph is learned from successful and failed
executions during skill replays, and therefore also includes failure modes. The suc-
cess and failure modes are subsequently learned for each skill primitive to determine
when to switch to the next primitive and to detect when an error has occurred.
We explain the segmentation of the demonstrations into primitives in Sec. 7.3.2,
finding corresponding segments among demonstrations to build skill graphs and
removing over-segmentations in Sec. 7.3.3, discovering unique sensory events asso-
ciated with contact state changes at the end of each skill replay in Sec. 7.3.4, and
building manipulation graphs from skill graphs and the corresponding modes in
Sec. 7.3.5. An overview of our framework is shown in Fig. 7.2.
7.3.1 Demonstration and Multimodal Sensory Signals
The graph generation process is initialized from demonstrations, as shown in Fig.
7.2A.Ourexperimentalsetupconsistsofa7-DOFBarrettWAMarmandaBarrett
hand, which is equipped with two biomimetic tactile sensors (BioTacs) (Wettels et
al., 2008). We demonstrate manipulation tasks through two types of demonstra-
tions: kinesthetic demonstrations and teleoperated demonstrations. In the kines-
thetic demonstrations, the human expert demonstrates tasks by directly moving
the robot arm. In the teleoperated demonstrations, the human operates the bi-manual robot by manually moving the robot's master arm while the slave arm mimics the movements of the master arm to manipulate the objects. More details
can be found in Fig. 7.3, Fig. 7.4, Fig. 7.5.
Multimodal haptic signals, including proprioceptive signals and both low and high-
frequency tactile signals, are captured throughout the human demonstration. The
proprioceptive signals are the 6D Cartesian position and orientation of the robot’s
end-effector, $y_{\text{pos}} \in \mathbb{R}^6$, derived from the robot's forward kinematics. We also recorded the 6D Cartesian pose of the object in the robot's surroundings with a Vicon motion capture system, $y_{\text{obj}} \in \mathbb{R}^6$.
The low-frequency tactile signals (≤ 100Hz) are measured from an array of 19 impedance-sensing electrodes that detect deformations of the sensor's skin. The electrode impedances $y_E \in \mathbb{R}^{19}$ are sampled at 100Hz. For dynamic tactile sensing, high-frequency vibration signals (10-1040Hz) are available from the hydro-acoustic pressure sensor. These vibration signals $y_{PAC} \in \mathbb{R}^1$ are sampled at 2200Hz and often correspond to transient mechanical events, such as micro-vibrations between the sensor's skin and the external environment. Detailed descriptions of the tactile signals can be found in (Su et al., 2016).
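As an illustration only, one plausible way to assemble these streams into a single 100 Hz observation vector is sketched below; summarizing each 22-sample PAC window (2200 Hz / 100 Hz) by its log energy is our assumption here, not the exact feature construction of (Su et al., 2016).

import numpy as np

def make_observation(pose6, electrodes19, pac_window):
    """Assemble one multimodal sample y_t at 100 Hz: the 6D end-effector pose,
    the 19 electrode impedances, and a scalar summary of the 22 PAC samples
    that captures high-frequency vibration."""
    pac_energy = np.log1p(np.mean(np.square(pac_window)))  # assumed summary feature
    return np.concatenate([pose6, electrodes19, [pac_energy]])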
7.3.2 Sensorimotor Primitive Segmentation
To discover primitives that terminate in distinct sensory events, the robot must
segment demonstrations and skill executions according to the sensory signals. The
tactile signals are particularly important for segmenting sensory trajectories into
primitives with sensory goals (Romano et al., 2011; Su et al., 2016).
Unlike the relatively smooth proprioceptive signals, dynamic tactile sensor signals
are highly sensitive to transient mechanical events. Some of these detected events
correspond to the end of a contact state, at which point a new primitive should
start, but others may be caused by noise. In a peg-in-hole task, the vibrations
from scratching the peg over the rough surface are irrelevant for the segmentation,
but the vibrations from entering the hole are relevant.
BOCPD (Adams & MacKay, 2007) is used to segment demonstrated trajectories
into a sequence of primitives, as shown in Fig. 7.2B. Each of the primitives ends
with a sensory event. Because our previous work showed superior segmentation results with multimodal sensory signals (Su et al., 2016), we apply BOCPD jointly to the proprioceptive and tactile signals. The number of segments for each demonstration is automatically determined by the algorithm.
BOCPD passes through the sensory trajectories and calculates the posterior distribution $p(r_t \mid y_{1:t})$ over the current run length $r_t \in \mathbb{Z}$ at time $t$ given the previously observed data $y_{1:t}$; here $r_t$ represents the number of time steps since the last changepoint was detected. The posterior distribution is computed by normalizing the joint likelihood, $P(r_t \mid y_{1:t}) = P(r_t, y_{1:t}) / P(y_{1:t})$, where the joint likelihood $P(r_t, y_{1:t})$ over the run length $r_t$ and the observed data $y_{1:t}$ is computed online using a recursive message passing scheme (Adams & MacKay, 2007):

$$P(r_t, y_{1:t}) = \sum_{r_{t-1}} P(r_t \mid r_{t-1}) \, P(y_t \mid r_{t-1}, y_t^{(r)}; \theta_m) \, P(r_{t-1}, y_{1:t-1}),$$

where $P(r_t \mid r_{t-1})$ is the conditional changepoint prior over $r_t$ given $r_{t-1}$. The multivariate time-series sensory signals are modelled with a joint Student's t-distribution $P(y_t \mid r_{t-1}, Y_{1:t}; \theta_m)$, where $y_t$ are the multimodal sensory signals described in Sec. 7.3.1 and $\theta_m$ are hyperparameters.
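For readers who prefer code, a minimal univariate BOCPD sketch follows. It uses a constant changepoint hazard and a Normal-Inverse-Gamma conjugate model, whose posterior predictive is a Student's t-distribution; the chapter's multivariate joint Student's t model follows the same recursion. All prior hyperparameter values shown are illustrative assumptions.

import numpy as np
from scipy import stats

def bocpd(y, hazard=1.0 / 200.0, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Bayesian online changepoint detection (Adams & MacKay, 2007).
    Returns R, where R[t, r] = p(r_t = r | y_1:t)."""
    T = len(y)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0
    mu = np.array([mu0]); kappa = np.array([kappa0])
    alpha = np.array([alpha0]); beta = np.array([beta0])
    for t, y_t in enumerate(y):
        # Student's t predictive probability of y_t under every run length.
        scale = np.sqrt(beta * (kappa + 1.0) / (alpha * kappa))
        pred = stats.t.pdf(y_t, df=2.0 * alpha, loc=mu, scale=scale)
        # Growth step: no changepoint, run length increments.
        R[t + 1, 1:t + 2] = R[t, :t + 1] * pred * (1.0 - hazard)
        # Changepoint step: run length resets to zero.
        R[t + 1, 0] = np.sum(R[t, :t + 1] * pred * hazard)
        R[t + 1] /= np.sum(R[t + 1])  # normalize the joint into the posterior
        # Conjugate updates of the sufficient statistics per run length.
        mu_new = (kappa * mu + y_t) / (kappa + 1.0)
        beta_new = beta + kappa * (y_t - mu) ** 2 / (2.0 * (kappa + 1.0))
        mu = np.concatenate([[mu0], mu_new])
        kappa = np.concatenate([[kappa0], kappa + 1.0])
        alpha = np.concatenate([[alpha0], alpha + 0.5])
        beta = np.concatenate([[beta0], beta_new])
    return R

Segment boundaries can then be read off wherever the posterior mass over the run length collapses back toward zero.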
7.3.3 Segmentation Clustering and Skill Graph Generation
The BOCPD algorithm decomposes the trajectories from multiple demonstrations
into multiple sequences of segments, but it does not provide correspondences among
the segments of multiple demonstrations. These correspondences are required to
construct a unified skill graph for each task. Rather than manually define the
correspondences, our framework finds correspondences between segments by clus-
tering the 3D goal positions and 3D orientations represented in Euler angles of the
segments, as shown in Fig. 7.2C. We assume that the same skills from multiple
demonstrations of the same task will have similar goal poses, which correspond
to the particular configuration of the robot learned by demonstrations. The goal
poses are extracted from the final poses of these segments.
The segments are clustered using spectral clustering (Von Luxburg, 2007; Shi &
Malik, 2000). We first compute the similarity of each pair of segments' final poses ($x_i$ and $x_j$) using a squared exponential kernel:

$$[K]_{ij} = k(x_i, x_j) = e^{-\frac{(x_i - x_j)^2}{2\sigma^2}}.$$

Then, a normalized Laplacian is computed as $L = I - D^{-1}K$, where $D$ is a diagonal matrix whose $j$-th diagonal element is given by $[D]_{jj} = \sum_{i}^{n} [K]_{ij}$, with $n$ the total number of segments. Subsequently, k-means clustering is performed in a lower-dimensional space spanned by the eigenvectors of the normalized Laplacian. As shown in Fig. 7.2C, each demonstration has a sequence of segments which are labeled with unique colors based on the assigned clusters.
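A minimal sketch of this clustering step in Python follows; the kernel bandwidth sigma and the choice of the number of clusters are assumptions, since the chapter does not fix them here.

import numpy as np
from sklearn.cluster import KMeans

def cluster_segment_goals(X, n_clusters, sigma=0.05):
    """Spectral clustering of segment goal poses (rows of X) with the
    squared exponential kernel and normalized Laplacian defined above."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))       # [K]_ij
    D_inv = np.diag(1.0 / K.sum(axis=0))             # [D]_jj = sum_i [K]_ij
    L = np.eye(len(X)) - D_inv @ K                   # L = I - D^{-1} K
    eigvals, eigvecs = np.linalg.eig(L)
    order = np.argsort(eigvals.real)
    embedding = eigvecs.real[:, order[:n_clusters]]  # smallest eigenvectors
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embedding)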
After the segments' final poses are clustered, a skill graph is constructed, as shown on the right side of Fig. 7.2C. The segments from the same cluster are used to learn a skill primitive, which corresponds to a node in the skill graph. After the nodes are formed, we add directed edges wherever pairs of skills are demonstrated consecutively. The strength of each edge indicates the probability of the connected skills being performed in sequence.
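Given the per-demonstration label sequences, building this graph reduces to counting transitions; a minimal sketch (with an invented build_skill_graph helper) might look like:

from collections import defaultdict

def build_skill_graph(demo_labels):
    """Build a skill graph from cluster-label sequences, e.g.
    demo_labels = [[0, 1, 2, 3, 4], [0, 1, 2, 4, 3, 4], ...].
    Nodes are skill clusters; edge strengths are empirical
    probabilities of consecutive execution."""
    counts = defaultdict(lambda: defaultdict(int))
    for labels in demo_labels:
        for a, b in zip(labels[:-1], labels[1:]):
            counts[a][b] += 1
    graph = {}
    for node, successors in counts.items():
        total = sum(successors.values())
        graph[node] = {nxt: c / total for nxt, c in successors.items()}
    return graph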
7.3.4 Skill Replay with Exploration and Mode Discovery
Given the skill graph of a task, the robot can execute the task by traversing the
skill graph through a sequence of skill primitives, which are represented as force-
position controllers. The position and force signals at the end of the segments are
used to compute the final desired state for the controller. The feedback gains for
the controllers are predefined. A detailed overview of the control architecture can
be found in (Pastor et al., 2011).
Due to nonlinear cable stretch and motor-side encoders, our robot has poor positioning accuracy (1.5cm), and this accuracy also varies significantly across different regions of its workspace. Simply executing a sequence of skills will tend to result in successful replays when they are executed at the same pose as the demonstrations, but failed replays when executed at different robot poses. A sensory model for monitoring the progress of each motor primitive is therefore essential to achieve robust execution performance. The
sensory model is used to confirm whether a sensory subgoal of the task has been
reached by the end of each skill.
At the end of each motor primitive, the robot performs a sequence of exploratory
movements, which are 5mm position deviations along the three orthogonal directions of the current pose of the robot's end-effector, see Fig. 7.2D. The goal of
these exploratory movements is to collect sensory signals for observing the distinct
types of contact states.
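One plausible way to generate such exploratory targets is sketched below, assuming the end-effector orientation is available as a rotation matrix R whose columns are the frame's axes; producing ± deviations along each axis matches the six movements described later for the unscrewing experiments.

import numpy as np

def exploration_targets(position, R, delta=0.005):
    """Exploratory position targets: +/- 5 mm deviations along the
    three orthogonal axes of the current end-effector frame."""
    return [position + sign * delta * R[:, axis]
            for axis in range(3) for sign in (1.0, -1.0)]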
The tactile sensory signals from these exploratory movements are clustered using
spectral clustering to identify distinct modes, as shown in Fig. 7.2E. The clusters
from the spectral clustering represent distinct types of contact states. These clus-
ters are used to learn sensory models to confirm whether the goal of each primitive
has been achieved or if an error has occurred.
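While the chapter clusters all exploration signals jointly, a lightweight online check against already-discovered modes could look like the following sketch; the feature vector, the reuse of the squared exponential similarity, and the novelty threshold are assumptions.

import numpy as np

def classify_mode(feature, mode_centroids, sigma=1.0, novelty_threshold=0.1):
    """Match a new exploration's tactile feature vector to the nearest known
    mode centroid; return None when nothing is similar enough, which
    flags a potentially novel (failure) mode."""
    sims = {mode: np.exp(-np.sum((feature - c) ** 2) / (2.0 * sigma ** 2))
            for mode, c in mode_centroids.items()}
    best = max(sims, key=sims.get)
    return best if sims[best] >= novelty_threshold else None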
Figure 7.3: Experimental setup of demonstrating the grasping task.
7.3.5 Manipulation Graph Generation
Given a skill graph and the modes for the manipulation task discovered from both
successful and failed skill executions, a unified manipulation graph can be created
for the robot, as shown in Fig. 7.2F. The large rectangles in a manipulation
graph correspond to the unique modes discovered by skill replays and exploration.
The directed edges indicate the transition probabilities between the vertices in the
graph. Some skills result in the robot remaining in the same mode while others
result in switching into different modes, as indicated by the connections within the
same mode or between different modes.
After generating the manipulation graph, the robot can perform the task through
graph traversal by executing a sequence of skills in the graph. It can also confirm
successful or failed skill executions by clustering the tactile sensory signals at the
end of each skill execution against the corresponding discovered success and failure
modes.
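Sampling a traversal from the learned edge probabilities is straightforward; a sketch, reusing the graph dictionary from the earlier sketch, follows.

import random

def sample_traversal(skill_graph, start, goal, max_steps=20):
    """Walk the skill graph from start to goal, sampling each next skill
    in proportion to the learned edge strengths."""
    path, node = [start], start
    while node != goal and len(path) < max_steps:
        successors = skill_graph.get(node, {})
        if not successors:
            break
        node = random.choices(list(successors),
                              weights=list(successors.values()))[0]
        path.append(node)
    return path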
7.4 Experimental Evaluations
In this section, we describe the experiments and results obtained for evaluating the
proposed framework for building manipulation graphs by segmenting demonstra-
tions into skill primitives and discovering corresponding modes to verify successful
and failed skill executions.
Figure 7.4: Experimental setup of demonstrating the peg-in-hole task.
Figure 7.5: Experimental setup of demonstrating the unscrewing task.
7.4.1 Segmentation
We evaluated the segmentation method in our skill learning framework on three
tasks: a grasping task, an unscrewing task, and a peg insertion task. Because
kinesthetic demonstration of the grasping task would require direct contact with the fingertips, which would corrupt the tactile sensory signals, we use the master-slave dual-arm setup to move the fingers to grasp the object, as shown in Fig. 7.3. During each grasp demonstration, the human expert moves the master arm through a sequence of movements while visually observing the slave arm and a cylinder, so that the slave arm reaches the top of the object, closes its fingers on it, lifts the object off the supporting table by about 15 cm, and places it back on the table.
Kinesthetic demonstrations were used for the peg-in-hole and unscrewing tasks as
we can hold on to the wrist above the force-torque sensor, as shown in Fig. 7.4 and
Fig. 7.5. The initial pose of each object is recorded by a Vicon motion capture
system.
Figure 7.6: A: An example of joint BOCPD to segment sensorimotor primitives in the grasping task; B: An example of joint BOCPD to segment sensorimotor primitives in the unscrewing task; C: An example of joint BOCPD to segment sensorimotor primitives in the peg insertion task; D: Segment clustering and skill graph for the grasping task; E: Segment clustering and skill graph for the unscrewing task; F: Segment clustering and skill graph for the peg-in-hole task

The results of using BOCPD with the proprioceptive and tactile data for the grasping task are shown in Fig. 7.6A. The ground-truth primitive switches, as indicated
by the double vertical dashed lines, were manually labeled only for the purpose of
showing this exemplary segmentation result. In this example case, six significant
sensorimotor events were labeled, including reaching the object, closing one finger
on the object, forming a pinch grasp on the object, loading the object off a sup-
porting table, lifting the object above the targeted pose, and placing the object
back on the supporting table, as shown in Fig. 7.3. The changepoints detected
by the BOCPD algorithm are indicated by black crosses. If these changepoints
are between the double vertical dashed lines, we consider the BOCPD algorithm
as having successfully segmented the primitive. If changepoints fall between two consecutive sensorimotor events, we consider them false positives, such as the changepoints at 2 sec and 4.4 sec, indicated by the open blue circles.
These two false positives are caused by the vibrations resulting from the motors
and gears of the robot fingers during finger closing. They are not considered to be
relevant to this task and are effectively over-segmentations caused by noisy sensory
signals. The false positive changepoints are manually labeled only for the purpose
of showing exemplary over-segmentation from BOCPD.
For the unscrewing task, there are three significant sensorimotor events: reaching
the object, inserting the screwdriver into the head of the screw, and unscrewing
the screw, as shown in Fig. 7.6B.
For the peg insertion task, the significant sensorimotor events are: the peg reaching the board, making contact with the surface of the board, sliding into the groove, reaching the corner of the groove, reaching the top of the hole, and making contact with the bottom of the hole, as shown in Fig. 7.6C.
7.4.2 Segmentation Clustering and Skill Graph Generation
The segmentation clustering and skill graph generation are evaluated on the seg-
mented primitives from all three tasks: grasping, unscrewing and peg insertion.
The segmented primitives from 15 trials of grasping demonstrations are clustered
by applying spectral clustering on the goal poses of these segments, which are high-
lighted by the colored columns in Fig. 7.6A. On the left of Fig. 7.6D, each row
represents one of the 15 demonstrations, and its segments are colored based on the
assigned cluster. We use the same color coding to visualize the correspondences
between segments from BOCPD and segments used in segment clustering, as shown in Fig. 7.6A and Fig. 7.6D, respectively. Due to over-segmentation caused by noisy
sensory signals, sometimes multiple segments are assigned to the same cluster. We
keep the first segment assigned to a cluster and reject the segments that are subsequently assigned to the same cluster. Thus, clustering the goal poses of seg-
ments allows the robot to not only find the correspondences among segments from
multiple demonstrations but also eliminate over-segmentations. After eliminating
these over-segmentations, segments among all 15 demonstrations with the same
cluster label are assumed to represent the same skill primitive; therefore, the mean of those segments' poses is used to form a node in the skill graph, shown as colored
circles on the right of Fig. 7.6D. The five clusters are used to form five nodes in
this grasping skill graph.
As shown on the left of Fig. 7.6E, the segments from 10 unscrewing demonstrations are clustered into three clusters, shown in dark blue, green and yellow. Because the first segments only neighbor the second segments and the second segments only neighbor the third segments, only two sequential connections are created
among these three nodes, as shown on the right of Fig. 7.6E.
The segments from 20 trials of the peg insertion demonstrations are clustered into
eight unique clusters, shown as eight nodes in Fig. 7.6F.
7.4.3 Mode Discovery
The grasping task is executed by traversing through those five skill nodes in the
grasping skill graph. Each graph traversal is sampled based on the connections’
strength among those nodes. For example, traversals sampled from the grasping
skill graph are 1→2→3→4→5 and 1→2→3→5→4→5 with 80% and 7% probabilities, respectively. The robot performs exploratory movements, which are 5mm position deviations along the three orthogonal directions of the current pose, at the end of each skill. It samples 10 sequences of the five nodes, resulting in a total of 50 trials of explorations. During executions of skills 3, 4, and 5, which correspond to the robot forming a pinch grasp on the object, lifting it above the table, and placing it back on the table, we sample an additional 10 times each while failures are introduced by the experimenter, resulting in a total of 30 trials of explorations from failed executions of skills 3-5.

Figure 7.7: Similarity matrix heat-map (A, C, E) and spectral clustering (B, D, F) of tactile signals of the exploratory movements at the goal of each phase of the grasping, unscrewing and peg-in-hole tasks.

In Fig. 7.7A, a heatmap shows the similarity matrix of the tactile sensory signals corresponding to those
80 explorations. The brighter color in the heatmap represents high similarity. For
example, we can see that the 21-30 diagonal elements and the 31-40 diagonal elements in the similarity matrix have high similarities; these correspond to the robot forming a pinch grasp on the object and the object being lifted above the table, respectively. This is because a stable grasp has been formed regardless of whether the object is still supported by the table or has been lifted off of it.

Figure 7.8: A: failed to insert the tool-tip into the screw head; B: after the failed insertion, continuing to twist the screwdriver fails to unscrew the screw; C: failed to slide into the vertical groove and therefore missed the corner
After applying spectral clustering, 80 trial explorations are clustered into three
distinct clusters, shown in cyan, dark red and orange in the middle of Fig. 7.7.
The dark red cluster (trials 21-50) in the middle of Fig. 7.7B shows that successful executions of skills 3, 4 and 5 are grouped into the same cluster: once a stable grasp has been formed, the tactile sensory signals are very similar across these three skills. Trials 51-60, corresponding to failed executions of skill 3 (forming a pinch grasp), are clustered into a different cluster, shown in orange, from the successful executions of skill 3, shown in dark red. Trials 61-70 correspond to failed executions of skill 4, when the robot attempts to lift the object. Some of them are clustered into the same cluster as successful executions of skill 1, in which the robot does not make contact with the object; this is due to the object slipping out of the fingers when the robot tries to lift it.
By following the skill graph of the unscrewing task, a sequence of three skills is executed on the robot: moving the screwdriver towards the screw, inserting the tip into the head of the screw, and twisting it. Eight exploratory movements are applied at the end of each primitive: six 5mm translational movements along the three orthogonal directions of the robot's end-effector and two rotations (±5°) along the normal of the robot's palm. This
process was repeated 10 times. A similarity matrix of these 30 trials of tactile
sensory signals corresponding to these eight exploratory movements is shown in the first 30 diagonal elements of the similarity matrix on the left of Fig. 7.7C.
Three clusters are formed for these 30 trials of explorations, shown in dark red,
orange, and green at the diagonal elements of the spectral clustering matrix in Fig.
7.7D.
The robot subsequently executes these motor primitives under pose uncertainties introduced by the experimenter. Although the robot tracks its trajectory based on the object pose measured by the Vicon system, its poor and regionally varying positioning accuracy causes executions to fail under these pose uncertainties. Failures are detected when the tactile sensory signals at the end of a primitive fail to be clustered into the same clusters as the modes formed from successful executions. A common failure mode is discovered: the robot fails to insert the tool-tip into the screw hole, as shown in Fig. 7.8A, corresponding to the cyan cluster in Fig. 7.7D. This failure mode results in an additional failure mode, corresponding to the blue cluster in Fig. 7.7D, if the robot twists the screwdriver without a successful insertion, as shown in Fig. 7.8B.
As shown in Fig. 7.7F, six unique modes are discovered from the exploration movements (trials 1-60) sampled from the eight motor primitives in the peg-in-hole skill graph of Fig. 7.6F. Two failure modes are also discovered due to failing to slide the peg into the vertical groove. In the first, the robot remains in the same mode as making contact with the surface; continuing to execute the next motor primitive then results in a new mode, because the peg slides from the flat surface directly into the horizontal groove instead of sliding into the corner along the vertical groove, as shown on the right of Fig. 7.8.

Figure 7.9: Grasping Manipulation Graph
7.4.4 Manipulation Graph Generation with Failure Modes
Once we have a skill graph and have discovered the corresponding modes for the grasping task, a manipulation graph can be formed by combining them. As shown in Fig. 7.9, the starting state $s_0$ is in mode 1, and a sequence of actions $a_1$ (reaching to the object) and $a_2$ (forming a pinch grasp on the object) results in states $s_1$ and $s_2$, respectively, which are clustered into the same mode as $s_0$. Then action $a_3$, closing both fingers on the object, causes a mode switch, as state $s_{31}$ is in mode 2. Executing action $a_4$ lifts the object off of the table and action $a_5$ places the object back onto the table, but neither causes any detected mode switch because both states $s_{41}$ and $s_f$ are still in mode 2, where $s_f$ is the final state. Executing action $a_3$ could also result in failure mode 1, which is represented as the failure state $s_{32}$ shown in red in Fig. 7.9. After discovering this failure mode, continuing to execute the next action $a_4$ either stays in the same failure mode or results in an additional failure mode 2, which is clustered together with mode 1. This corresponds to the object slipping out of the robot's fingers when it attempts to lift the object off the table.

Figure 7.10: Success and Failure Mode Detection
We evaluated the built manipulation graphs on the Barrett robot to perform all
three tasks. The robot performs each task by graph traversal while clustering the tactile sensory signals from these executions against the success modes and failure modes. The robot executed each task 40 times, which included 20
successful executions and 20 failed executions. The failed executions are caused by
pose variations on the objects introduced by the experimenter. The ground truth
successes and failures are manually labeled by the experimenter. By comparing
the manual labels against the predicted clusters, we reported success rates for
successful and failed executions of each of the three tasks in Fig. 7.10. The
detection success rates are above 90% for all the discovered success modes as well
as failure modes. The modes with only 90% detection success rates, such as failing to unscrew in the unscrewing task and failing to slide the peg into the groove in the peg-in-hole task, are affected by other novel failure modes that were not present in the training data.
7.5 Conclusions and Future Work
We presented a framework for segmenting contact-based manipulation tasks into
sequences of motor primitives. The correspondences among segments from multiple
demonstrations are found by clustering the final poses of these segments. During
skill replays, a sequence of exploratory movements is performed at the end of each
skill to discover distinct modes that correspond to distinct contact states. Fail-
ure modes could be discovered under environment variations and uncertainties by
clustering sensory events against successful skill executions. A manipulation graph
is built by using skill graphs as well as discovered modes corresponding to both
successful and failed skill executions. The proposed framework was successfully
evaluated on grasping, unscrewing and peg insertion tasks.
Learning from demonstration allows the robot to initialize a skill, but the demonstrated skills tend to fail if they deviate from the demonstrated trajectory due to environmental uncertainties. Building a manipulation graph that incorporates both success modes and distinct failure modes enables the robot to discover novel failure modes, which opens the possibility for the robot to acquire recovery behaviors for each failure from human teachers or through reinforcement learning.
Chapter 8
Conclusions and Future
Directions
8.1 Conclusions
To achieve fully autonomous service robots, such as a cooking robot proposed
by Moley Robotics (Moley, 2018), this dissertation identifies the general problems that must be solved to achieve human-level autonomy in manipulation.
These general problems are: 1) material properties perception; 2) reactive poli-
cies to cope with the robot and environmental pose uncertainties; 3) the ability
to detect slippages, estimate contact forces and friction coefficients, and regulate
contact force to prevent future slippages; 4) the capability to predict translational
and rotational slippages as well as to update desired grip force; 5) the ability to
predict grasp failures in the very early stage of the lifting motion and choose better
grasp poses on the objects through regrasping policy; 6) the capability to teach
robots new tasks by segmenting human demonstration into subtasks.
The research described in this thesis places particular emphasis on the role of
tactile sensing, inspired by the demonstration of the severe loss of dexterity that
humans exhibit when their fingertips are anesthetized (Johansson, 2018; Monzée
et al., 2003). Chapter 2 describes a haptic robot platform to mimic human haptic
sensing capabilities, including cutaneous and proprioceptive sensing (Grunwald,
2008). This haptic robot platform is built by equipping the bimanual manipulation
platform from the DARPA Autonomous Robotics Manipulation (Hackett et al.,
2013) with three biomimetic tactile sensors (BioTacs) (Wettels et al., 2008) and a
6-DOF force-torque sensor.
Throughout Chapters 3 to 7, this dissertation presents tools to address each
of the general problems in manipulation by utilizing tactile sensory signals to close
the perception-action loop from low-level control to high-level decision-making.
Equipped with these tools, robots can be taught to perform complex tasks, such as
cooking a dish, through kinesthetic or teleoperated demonstrations. The autonomous skill acquisition algorithm presented in Chapter 7 segments these demonstrations
into a sequence of skills. Each skill includes not only a motion trajectory but also
concurrent sensory trajectories. When a robot executes the motion trajectory in a
novel environment, the errors between the demonstrated sensory trajectories and
actual sensory trajectories drive a reactive policy that causes the robot to deviate from the demonstrated motion trajectory. This reactive policy can be
taught through human demonstrations, as shown in Chapter 4, or formulated using contact mechanics principles, as shown in Chapter 5. Reactive policies are powerful tools, but their actions are usually significantly delayed due to the latency between sensed events, such as slip detection in Chapter 5, and reactions, such as grip control in Chapter 5. Therefore, predictive control, such as translational/rotational
slip prediction and grip control in Chapter 6, is a more desirable approach to avoid
delayed actions as well as to account for unmodeled contact mechanics, such as
torques along the grip-axis due to the unknown center of mass of the objects. The
reactive policies and predictive control help the robot to achieve robustness during
each individual skill. The manipulation graph constructed from segmented skills
from Chapter 7 and material properties perception, such as objects’ compliance or
fragility in Chapter 3, can be used to make high-level decisions about transitions
among learned skills, such as when to terminate current skill and what next skill
should be activated. If a learned termination condition is not fulfilled, such as when
a failed grasp attempt is detected, a corrective skill called regrasping is activated
before activating the lifting motion during a grasping task (Chebotar et al., 2016).
Extensive experiments on the haptic robot platform appear to show that the tools
presented in this dissertation are sufficient to solve examples of each general prob-
lem to achieve human-like performance in manipulation. However, a few challenges
remain and will be the focus of future research.
8.2 Future Directions
8.2.1 High-level Decision-making Using Tactile Sensing
The slip prediction and grip force control framework presented in Chapter 6 is
sufficient to achieve stable grasps on objects that are sufficiently rigid and whose desired grip forces are below the maximum force available from the robot finger motors.
However, it may not be sufficient to handle deformable or fragile objects which will
be crushed by increased grip force. It may also not be sufficient if the maximum
grip forces fail to prevent further rotational slippage when the grip axes are far
away from the center of mass of the objects.
When picking up an object, it would be useful to know its susceptibility to crushing. The compliance perception method from Chapter 3 could be used to estimate the compliance and fragility properties and to make a high-level decision regarding whether increasing the grip force is safe, or whether a new grasping pose or a different type of grasp, such as a palm power grasp instead of a precision grip, should be selected to avoid breaking the object.
When the threat of slip arises from rotational (torsion) forces on the fingertips, it
would be useful to use the tactile sensory information to predict a new grasping
pose on the object that will achieve stable grasp without exceeding available and
safe grip force limits. Previous work from us (Chebotar et al., 2016) has demon-
strated that this regrasping policy can be learned through reinforcement approach.
Instead of reinforcement learning, a more efficient method is to reuse the data col-
lected to train the slip predictor in Chapter 6 and learn the regrasping policy using
supervised learning methods. Throughout the data-collection procedure in Chap-
ter 6, the object is tracked through the Vicon motion capture system. We can
extract the 6D pose differences between the robot fingers and the object. Then
we can learn a mapping between the electrode values on the tactile sensors at the beginning of the lifting motion, e.g., at the moment of maximum desired hand velocity, and the extracted 6D pose differences between the robot hand and the object.
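As a sketch of this supervised alternative, one could fit a small multi-output regressor; the feature layout, the network size, and the choice of MLPRegressor are our assumptions, not a description of the actual experiments.

from sklearn.neural_network import MLPRegressor

def train_regrasp_model(X, Y):
    """X: (n_trials, 2 * 19) electrode values from both BioTacs at lift onset.
    Y: (n_trials, 6) hand-object 6D pose differences extracted from Vicon.
    Returns a regressor that predicts the pose correction for regrasping."""
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
    model.fit(X, Y)
    return model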
8.2.2 Skill Outcome Verification
The autonomous skill acquisition algorithm presented in Chapter 7 segments
human demonstrations into a sequence of skills. However, a demonstrated skill
may not provide sufficient sensory data to determine the outcome of that skill.
For example, a robot without vision may have to wiggle the screwdriver’s tip
to verify that it was correctly inserted into a screw head and is now constrained.
(Klingbeil et al., 2016) showed that humans can perform manipulation tasks involv-
ing complicated contact changes indirectly through tools and they achieve robust
manipulation by spending a significant amount of time at a few distinct types of
contact states which often require exploratory strategies to disambiguate. (Su et
al., 2018) adopted this idea by allowing the robot to apply predefined exploratory
movements along the three orthogonal directions of the current pose of the robot’s
end-effector after each skill execution. The results of this work indicate that the
learned manipulation skills along with these predefined exploration achieve more
robust manipulation executions by confirming sensory goals as well as discovering
novel failure modes. This work uses hard-coded exploratory movements based on
the heuristics from the experimenter. One interesting future direction would be to
adopt some automatic ways to choose the best exploratory actions, such as basing
the next best action on the Bhattacharyya coefficient between two normal distri-
butions (Fishel & Loeb, 2012; Loeb & Fishel, 2014). Action selection is an active
research topic within the newly defined field of Interactive Perception (Bohg et al.,
2017).
8.2.3 Learning Tactile Servoing
The tactile servoing method presented in Chapter 3 is sufficient for a robot to cope with pose uncertainties from itself or its environment by adopting servoing motions based on known sensor geometry while estimating material properties of the environment. One interesting extension of this work is to learn servoing
motions instead of relying upon the geometries of each sensor to engineer these
servoing motions. A general approach was recently proposed to learn visual ser-
voing (Byravan et al., 2017). First, the sensory signals and actions are acquired
through kinesthetic demonstrations by human operators on a robot. Then these
sensory signals and actions are used to learn a dynamics model, which takes in
the current sensory signals and an action and predicts the change in sensory sig-
nals. Once the dynamics model is learned, the servoing motions are computed by
minimizing sensory error using gradient-based methods (Byravan et al., 2017).
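A minimal sketch of that last step: given a learned dynamics model dynamics(s, a) -> ds, the servoing action can be found by gradient descent on the predicted sensory error. Byravan et al. backpropagate through a deep network; for brevity this sketch uses finite differences, and all step sizes are assumptions.

import numpy as np

def servo_action(dynamics, s, s_desired, a_dim, iters=50, lr=0.1, eps=1e-4):
    """Find an action whose predicted next sensory state is closest to the
    desired one, by finite-difference gradient descent on the model."""
    a = np.zeros(a_dim)
    error = lambda act: np.sum((s + dynamics(s, act) - s_desired) ** 2)
    for _ in range(iters):
        base = error(a)
        grad = np.zeros(a_dim)
        for i in range(a_dim):          # numerical gradient w.r.t. the action
            a_probe = a.copy()
            a_probe[i] += eps
            grad[i] = (error(a_probe) - base) / eps
        a -= lr * grad
    return a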
8.2.4 Nature to Inspire Robotics, Robotics to Understand
Nature
The methods and strategies used in this dissertation were inspired by biological,
psychological, and psychophysical studies of humans and other primates. One future research direction along this line of thinking is to utilize the idea of efference copy, in which an internal copy of the signals going to the muscles is used as a sensory prediction to enhance the robustness of robot skill execution. To mimic their human counterparts,
when robots are trying to pick up an object after forming a stable grip on it,
they should be able to anticipate the upward acceleration that could decrease the
stability of the formed grip, and use the efference copy signals to cue an increase in grip force accordingly. Another example is to use efference copy to differentiate
movement-produced vibrations versus slip-induced vibrations at the interface of
textured objects and tactile sensors. This could avoid an undesirable positive
feedback loop in grip force control when using a vibration-based slip detection method. Thirdly, all of the over-segmentations from the skill segmentation algorithm (Su et al., 2018) are caused by noise introduced by the human demonstrations or the robot motors and gears. The movement-produced sensory signals, such as the
robot end-effector poses, can be used to eliminate over-segmentations.
While nature can inspire much progress in robotics, we believe that robotics can also be used to learn about nature, especially in haptics. In recent years, fairly strong
theories of computation have been developed for many aspects of perceptual and
even cognitive behavior. These usually derive from systematic recordings of neural
activity from various brain regions of highly trained animals performing tasks and
correlations of such activity with various parameters of the performance. However,
this strategy is difficult to apply to haptics because few animals have anything like
the manual dexterity of humans. For haptic behaviors that are feasible for animals,
the movements and forces between the digits and objects are difficult to capture and
the steps in the complex sequences tend to be variable and uncontrollable by the
experimentalist. Alternatively, a theory of haptics can be developed and tested by
building a haptically enabled robotic system that incorporates known principles of
operation of human touch and attempting to emulate the behaviors and capabilities
of human subjects. If the controller of the haptically enabled robot successfully
uses a theory of computation to achieve humanlike haptic performance, this would
be suggestive that the brain may be using a similar theory of computation (Loeb
et al., 2011).
8.2.5 Biomedical Applications
All the ideas and methods presented in this dissertation have broader impacts
beyond tactile-sensing-based manipulation in robotics. We give two examples of adapting them to biomedical applications: prosthetics and robot-assisted medical devices.
There have been efforts to restore the sensory functions of tactile feedback to amputees. One such system measured force, vibration, and temperature on a biomimetic tactile sensor and played back these feedback modalities on a subject's forearm through a tactile display made of force, vibration and thermal tactors (Jimenez & Fishel, 2014). Although this system is sufficient for perception in tactile discrimination tasks, it would be extremely cognitively demanding or impossible for subjects to control grip forces through biological signals, such as EMG signals, based on only this tactile feedback. This is mainly due to the limited information bandwidth of a typical body-machine interface. On top of that, subjects tend to find the majority of haptic interfaces distracting and undesirable for day-to-day use. One potential solution is to equip
the microcontroller of the prosthetic limb with autonomous grip force control to grasp objects using tactile sensing, and to provide only high-level tactile percepts back to the subjects through the haptic interface. The subjects then only need to provide high-level manipulation commands. For example, the subjects may use the EMG signal to command opening, closing, and different types of stable grasps, such as precision grip or power grip. If a stable precision grip is commanded, the slip detection, slip prediction and grip control algorithms presented in Chapters 5 and 6 can be implemented as the autonomous grip force control. The material
compliance perception using tactile sensing in Chapter 3 can be used to estimate
compliance and fragility of the objects. The compliance and fragility properties
then can be used to scale down the maximum grip force, slow down ramping up the
grip force, and adopt different PID controller gains to avoid crushing the objects
(Matulevich et al., 2013). If a high-level predictor determines that the objects
cannot be stably grasped without breaking them, it could provide feedback to the
subjects through the haptic interface to warn the subjects to regrasp the objects
while releasing the objects autonomously to avoid crushing them.
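As a sketch of how estimated compliance might gate such a controller, with an entirely assumed linear scaling law and constants:

def grip_limits(compliance, max_force=10.0, ramp_rate=5.0):
    """Scale the grip force ceiling (N) and its ramp rate (N/s) down as the
    estimated compliance rises (0 = rigid, 1 = very compliant).
    The linear law and constants are illustrative assumptions."""
    scale = 1.0 - 0.8 * min(max(compliance, 0.0), 1.0)
    return max_force * scale, ramp_rate * scale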
Due to the shortage of skilled sonographers, who are specially trained to posi-
tion the ultrasound probe to obtain blood flow profiles, and recent advances in
telemedicine, a robotic-assistive Transcranial Doppler Ultrasound (Hamilton et
al., 2017) has been proposed to substitute the functionalities of a sonographer in
hospitals that cannot afford or hire a qualified sonographer. This robot system
is designed to control the ultrasound probe to obtain desired blood flow profiles
regardless of the anatomical and physiological differences among patients' heads. The servoing algorithm presented in Chapter 3 can be
implemented to control the ultrasound probe's 6-DOF pose while searching for the
desired blood flow profiles. The skill acquisition method presented in Chapter 7
could be used by skilled sonographers to teach the robot the probing motions as
well as desired blood flow profiles. The reactive policy driven by sensory errors pre-
sented in Chapter 4 could be used by sonographers to teach robot recovery motions
once current sensory profiles have deviated from the desired sensory profiles.
Appendix A
Quaternion
A unit quaternion is a hypercomplex number which can be written as a vector $Q = \begin{bmatrix} r & \mathbf{q}^T \end{bmatrix}^T$, such that $\|Q\| = 1$, where $r$ and $\mathbf{q} = \begin{bmatrix} q_1 & q_2 & q_3 \end{bmatrix}^T$ are the real scalar and the vector of three imaginary components of the quaternion, respectively. For computation with orientation trajectories, several operations need to be defined as follows:

• quaternion composition operation:
$$Q_A \circ Q_B = \begin{bmatrix} r_A & -q_{A1} & -q_{A2} & -q_{A3} \\ q_{A1} & r_A & -q_{A3} & q_{A2} \\ q_{A2} & q_{A3} & r_A & -q_{A1} \\ q_{A3} & -q_{A2} & q_{A1} & r_A \end{bmatrix} \begin{bmatrix} r_B \\ q_{B1} \\ q_{B2} \\ q_{B3} \end{bmatrix} \tag{A.1}$$

• quaternion conjugation operation:
$$Q^{*} = \begin{bmatrix} r \\ -\mathbf{q} \end{bmatrix} \tag{A.2}$$

• logarithm mapping ($\log(\cdot)$ operation), which maps an element of SO(3) to so(3):
$$\log(Q) = \log\!\begin{pmatrix} r \\ \mathbf{q} \end{pmatrix} = \frac{\arccos(r)}{\sin(\arccos(r))} \, \mathbf{q} \tag{A.3}$$

• exponential mapping ($\exp(\cdot)$ operation, the inverse of the $\log(\cdot)$ operation), which maps an element of so(3) to SO(3):
$$\exp(\boldsymbol{\omega}) = \begin{bmatrix} \cos(\|\boldsymbol{\omega}\|) \\ \frac{\sin(\|\boldsymbol{\omega}\|)}{\|\boldsymbol{\omega}\|} \, \boldsymbol{\omega} \end{bmatrix} \tag{A.4}$$
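A direct Python transcription of these four operations follows (a sketch; the small-angle guards are our addition).

import numpy as np

def compose(qa, qb):
    """Quaternion composition Q_A o Q_B (Eq. A.1); quaternions are [r, q1, q2, q3]."""
    ra, xa, ya, za = qa
    rb, xb, yb, zb = qb
    return np.array([ra*rb - xa*xb - ya*yb - za*zb,
                     ra*xb + xa*rb - za*yb + ya*zb,
                     ra*yb + za*xb + ya*rb - xa*zb,
                     ra*zb - ya*xb + xa*yb + za*rb])

def conjugate(q):
    """Quaternion conjugation Q* (Eq. A.2)."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quat_log(q):
    """Logarithm map SO(3) -> so(3) (Eq. A.3); returns a 3-vector."""
    r = np.clip(q[0], -1.0, 1.0)
    theta = np.arccos(r)
    s = np.sin(theta)
    return np.zeros(3) if s < 1e-12 else (theta / s) * np.asarray(q[1:])

def quat_exp(w):
    """Exponential map so(3) -> SO(3) (Eq. A.4), the inverse of quat_log."""
    n = np.linalg.norm(w)
    if n < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate([[np.cos(n)], (np.sin(n) / n) * np.asarray(w)])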
Reference List
Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado G, Davis A,
Dean J, Devin M, Ghemawat S, Goodfellow I, Harp A, Irving G, Isard M, Jia
Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Mané D, Monga R, Moore
S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K,
Tucker P, Vanhoucke V, Vasudevan V, Viégas F, Vinyals O, Warden P, Watten-
berg M, Wicke M, Yu Y, Zheng X (2016) Tensorflow: Large-scale machine learn-
ing on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 .
Abu-Dakka FJ, Nemec B, Jørgensen JA, Savarimuthu TR, Krüger N, Ude A (2015)
Adaptation of manipulation skills in physical contact with the environment to
reference force profiles. Autonomous Robots 39:199–217.
Adams RP, MacKay DJ (2007) Bayesian online changepoint detection. arXiv
preprint arXiv:0710.3742 .
Alami R, Laumond JP, Siméon T (1994) Two manipulation planning algorithms
In WAFR Proceedings of the workshop on Algorithmic Foundations of Robotics,
pp. 109–125. AK Peters, Ltd. Natick, MA, USA.
Alami R, Simeon T, Laumond JP (1990) A geometrical approach to planning
manipulation tasks. the case of discrete placements and grasps In The Fifth
International Symposium on Robotics Research, pp. 453–463. MIT Press.
Almecija S, Moya-Sola S, Alba DM (2010) Early origin for human-like precision
grasping: a comparative study of pollical distal phalanges in fossil hominins.
PLoS One 5:e11727.
Bekiroglu Y, Laaksonen J, Jorgensen JA, Kyrki V, Kragic D (2011) Assess-
ing grasp stability based on learning and haptic data. IEEE Transactions on
Robotics 27:616–629.
Birznieks I, Burstedt MKO, Edin BB, Johansson RS (1998) Mechanisms for force
adjustments to unpredictable frictional changes at individual digits during two-
fingered manipulation. Journal of Neurophysiology 80:1989–2002.
Birznieks I, Wheat HE, Redmond SJ, Salo LM, Lovell NH, Goodwin AW (2010)
Encoding of tangential torque in responses of tactile afferent fibres innervating
the fingerpad of the monkey. Journal of Physiology 588:1057–1072.
Bishop C (1991) Improving the generalization properties of radial basis function
neural networks. Neural Computation 3:579–588.
Bohg J, Hausman K, Sankaran B, Brock O, Kragic D, Schaal S, Sukhatme GS
(2017) Interactive perception: Leveraging action in perception and perception
in action. IEEE Transactions on Robotics 33:1273–1291.
Byravan A, Leeb F, Meier F, Fox D (2017) Se3-pose-nets: Structured
deep dynamics models for visuomotor planning and control. arXiv preprint
arXiv:1710.00489 .
Campos M, Bajcsy R (1991) A robotic haptic system architecture In IEEE Inter-
national Conference on Robotics and Automation (ICRA), pp. 338–343.
Chebotar Y, Hausman K, Su Z, Sukhatme GS, Schaal S (2016) Self-supervised
regrasping using spatio-temporal tactile features and reinforcement learning In
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
pp. 1960–1966.
Chebotar Y, Kroemer O, Peters J (2014) Learning robot tactile sensing for object
manipulation In IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS), pp. 3368–3375.
Chu V, Gutierrez RA, Chernova S, Thomaz AL (2017) The role of multisensory
data for automatic segmentation of manipulation skills In RSS 2017 Workshop
on Empirically Data-Driven Manipulation.
Chu V, McMahon I, Riano L, McDonald CG, He Q, Perez-Tejada JM, Arrigo
M, Darrell T, Kuchenbecker KJ (2015) Robotic learning of haptic adjectives
through physical interaction. Robotics and Autonomous Systems 63:279–292.
Dahiya RS, Gori M, Metta G, Sandini G (2009) Better manipulation with human
inspired tactile sensing In RSS 2009 workshop on Understanding the Human
Hand for Advancing Robotic Manipulation, pp. 1–2.
Dang H, Allen PK (2012) Learning grasp stability In IEEE International Confer-
ence on Robotics and Automation (ICRA), pp. 2392–2397.
De Maria G, Falco P, Natale C, Pirozzi S (2015) Integrated force/tactile sens-
ing: The enabling technology for slipping detection and avoidance In IEEE
International Conference on Robotics and Automation (ICRA), pp. 26–30.
De Maria G, Natale C, Pirozzi S (2012) Tactile sensor for human-like manipulation
In IEEE International Conference on Biomedical Robotics and Biomechatronics
(BioRob), pp. 1686–1691.
Deiterding J, Henrich D (2007) Automatic adaptation of sensor-based robots In
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
pp. 1828–1833.
Erhan D, Szegedy C, Toshev A, Anguelov D (2014) Scalable object detection using
deep neural networks In IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 2155–2162.
Fishel JA, Loeb GE (2012) Bayesian exploration for intelligent identification of
textures. Frontiers in Neurorobotics 6:4.
Fishel JA, Santos VJ, Loeb GE (2008) A robust micro-vibration sensor for
biomimetic fingertips In IEEE International Conference on Biomedical Robotics
and Biomechatronics (BioRob), pp. 659–663.
Flanagan JR, Bowman MC, Johansson RS (2006) Control strategies in object
manipulation tasks. Current Opinion in Neurobiology 16:650–659.
Flanagan JR, Johansson RS (2002) Hand movements. Encyclopedia of the Human
Brain 2:399–414.
Fod A, Matarić MJ, Jenkins OC (2002) Automated derivation of primitives for
movement classification. Autonomous Robots 12:39–54.
Gams A, Denisa M, Ude A (2015) Learning of parametric coupling terms for robot-
environment interaction In IEEE-RAS International Conference on Humanoid
Robots (Humanoids), pp. 304–309.
Goodwin AW, Browning AS, Wheat HE (1995) Representation of curved sur-
faces in responses of mechanoreceptive afferent fibers innervating the monkey’s
fingerpad. Journal of Neuroscience 15:798–810.
Goodwin AW, Jenmalm P, Johansson RS (1998) Control of grip force when tilting
objects: effect of curvature of grasped surfaces and applied tangential torque.
Journal of Neuroscience 18:10724–10734.
Grunwald M (2008) Human haptic perception: Basics and applications Springer
Science & Business Media.
Hackett D, Pippine J, Watson A, Sullivan C, Pratt G (2013) An overview of the
darpa autonomous robotic manipulation (arm) program. Journal of the Robotics
Society of Japan 31:326–329.
Hagan MT, Menhaj MB (1994) Training feedforward networks with the marquardt
algorithm. IEEE Transactions on Neural Networks 5:989–993.
Hamilton R, O’brien M, Petrossian L, Radhakrishnan S, Thibeault C, Wilk S,
Zwierstra J (2017) Systems and methods for determining clinical indications
US Patent App. 15/399,710.
Heyneman B, Cutkosky MR (2013) Slip interface classification through tactile
signal coherence In IEEE International Conference on Intelligent Robots and
Systems (IROS), pp. 801–808.
Hoffmann H, Pastor P, Park D, Schaal S (2009) Biologically-inspired dynami-
cal systems for movement generation: Automatic real-time goal adaptation and
obstacle avoidance In IEEE International Conference on Robotics and Automa-
tion (ICRA), pp. 2587–2592.
Hoffmann H, Chen Z, Earl D, Mitchell D, Salemi B, Sinapov J (2014) Adap-
tive robotic tool use under variable grasps. Robotics and Autonomous Sys-
tems 62:833–846.
Ijspeert AJ, Nakanishi J, Hoffmann H, Pastor P, Schaal S (2013) Dynamical
movement primitives: Learning attractor models for motor behaviors. Neural
Computation 25:328–373.
Jain A, Niekum S (2018) Efficient hierarchical robot motion planning under uncer-
tainty and hybrid dynamics. arXiv preprint arXiv:1802.04205 .
Jenmalm P, Birznieks I, Goodwin AW, Johansson RS (2003) Influence of object
shape on responses of human tactile afferents under conditions characteristic of
manipulation. European Journal of Neuroscience 18:164–176.
Jenmalm P, Johansson RS (1997) Visual and somatosensory information
about object shape control manipulative fingertip forces. Journal of Neuro-
science 17:4486–4499.
Ji X, Xiao J (2001) Planning motions compliant to complex contact states. The
International Journal of Robotics Research 20:446–465.
Jimenez MC, Fishel JA (2014) Evaluation of force, vibration and thermal tactile
feedback in prosthetic limbs In Haptics Symposium (HAPTICS), 2014 IEEE,
pp. 437–441. IEEE.
Johansson RS, Vallbo AB (1979) Detection of tactile stimuli. thresholds of affer-
ent units related to psychophysical thresholds in the human hand. Journal of
Physiology 297:405–422.
Johansson RS, Westling G (1984) Roles of glabrous skin receptors and sensorimotor
memory in automatic control of precision grip when lifting rougher or more
slippery objects. Experimental Brain Research 56:550–564.
Johansson RS, Westling G (1987) Signals in tactile afferents from the fingers
eliciting adaptive motor responses during precision grip. Experimental Brain
Research 66:141–154.
Johansson RS, Westling G (1988) Coordinated isometric muscle commands ade-
quately and erroneously programmed for the weight during lifting task with
precision grip. Experimental Brain Research 71:59–71.
Johansson R (2018) Light a match: Normal, pre-anesthetization performance
vs post-anesthetization performance https://www.youtube.com/watch?v=
0LfJ3M3Kn80 Accessed: 2018-08-04.
Johansson RS, Flanagan JR (2009a) Coding and use of tactile signals from the
fingertips in object manipulation tasks. Nature Reviews Neuroscience 10:345.
Johansson RS, Flanagan JR (2009b) Sensory control of object manipulation. Sen-
sorimotor control of grasping: Physiology and pathophysiology pp. 141–160.
Kappler D, Pastor P, Kalakrishnan M, Wüthrich M, Schaal S (2015) Data-driven
online decision making for autonomous manipulation In Proceedings of Robotics
Science and Systems (RSS).
Katz D (1937) Studies on test baking: III. The human factor in test baking: a
psychological study. Cereal Chemistry 14:382–396.
Khalsa PS, Friedman RM, Srinivasan MA, Lamotte RH (1998) Encoding of shape
and orientation of objects indented into the monkey fingerpad by populations
of slowly and rapidly adapting mechanoreceptors. Journal of Neurophysiol-
ogy 79:3238–3251.
Khansari M, Klingbeil E, Khatib O (2016) Adaptive human-inspired compliant
contact primitives to perform surface–surface contact under uncertainty. The
International Journal of Robotics Research 35:1651–1675.
Kinoshita H, Bäckström L, Flanagan JR, Johansson RS (1997) Tangential torque
effects on the control of grip forces when holding objects with a precision grip.
Journal of Neurophysiology 78:1619–1630.
Klingbeil E, Menon S, Khatib O (2016) Experimental analysis of human con-
trol strategies in contact manipulation tasks In International Symposium on
Experimental Robotics, pp. 275–286. Springer.
Kober J, Mohler B, Peters J (2008) Learning perceptual coupling for motor primi-
tives In IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), pp. 834–839.
Konidaris G, Kuindersma S, Grupen R, Barto AG (2012) Robot learning from
demonstration by constructing skill trees. The International Journal of Robotics
Research 31:360–375.
Kramberger A, Gams A, Nemec B, Ude A (2016) Generalization of orientational
motion in unit quaternion space In IEEE-RAS International Conference on
Humanoid Robots (Humanoids), pp. 808–813.
Kroemer O, van Hoof H, Neumann G, Peters J (2014) Learning to predict phases
of manipulation tasks as hidden states In IEEE International Conference on
Robotics and Automation (ICRA), pp. 4009–4014.
Kulick J, Otte S, Toussaint M (2015) Active exploration of joint dependency struc-
tures In IEEE International Conference on Robotics and Automation (ICRA),
pp. 2598–2604.
Kupcsik AG, Deisenroth MP, Peters J, Loh AP, Vadakkepat P, Neumann G
(2017) Model-based contextual policy search for data-efficient generalization of
robot skills. Artificial Intelligence 247:415–439.
LaMotte RH (2000) Softness discrimination with a tool. Journal of Neurophysiol-
ogy 83:1777–1786.
Lederman SJ, Klatzky RL (1987) Hand movements: A window into haptic object
recognition. Cognitive Psychology 19:342–368.
Lederman SJ, Klatzky RL (1990) Haptic exploration and object representation.
Lee G, Lozano-Pérez T, Kaelbling LP (2015) Hierarchical planning for multi-
contact non-prehensile manipulation In IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), pp. 264–271.
Li Q, Schürmann C, Haschke R, Ritter H (2013) A control framework for tactile
servoing In Proceedings of Robotics Science and Systems (RSS).
Lin CH, Erickson TW, Fishel JA, Wettels N, Loeb GE (2009) Signal processing
and fabrication of a biomimetic tactile sensor array with thermal, force and
microvibration modalities In IEEE International Conference on Robotics and
Biomimetics (ROBIO), pp. 129–134.
Lioutikov R, Neumann G, Maeda G, Peters J (2017) Learning movement primi-
tive libraries through probabilistic segmentation. The International Journal of
Robotics Research 36:879–894.
Loeb GE, Fishel JA (2014) Bayesian action & perception: Representing the world
in the brain. Frontiers in Neuroscience 8:341.
Loeb GE, Tsianos GA, Fishel JA, Wettels N, Schaal S (2011) Understanding
haptics by evolving mechatronic systems In Progress in Brain Research, Vol. 192,
pp. 129–144. Elsevier.
Madry M, Bo L, Kragic D, Fox D (2014) ST-HMP: Unsupervised spatio-temporal
feature learning for tactile data In IEEE International Conference on Robotics
and Automation (ICRA), pp. 2262–2269.
Manschitz S, Kober J, Gienger M, Peters J (2015) Learning movement primitive
attractor goals and sequential skills from kinesthetic demonstrations. Robotics
and Autonomous Systems 74:97–107.
Marzke MW (1997) Precision grips, hand morphology, and tools. American Journal
of Physical Anthropology 102:91–110.
Marzke MW, Marzke RF (2000) Evolution of the human hand: approaches to
acquiring, analysing and interpreting the anatomical evidence. The Journal of
Anatomy 197:121–140.
Marzke MW, Shackley MS (1986) Hominid hand use in the Pliocene and Pleis-
tocene: evidence from experimental archaeology and comparative morphology.
Journal of Human Evolution 15:439–460.
Matulevich B, Loeb GE, Fishel JA (2013) Utility of contact detection reflexes in
prosthetic hand control In IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS), pp. 4741–4746.
Meier F, Theodorou E, Stulp F, Schaal S (2011) Movement segmentation using a
primitive library In IEEE/RSJ International Conference on Intelligent Robots
and Systems (IROS), pp. 3407–3412.
Meier M, Patzelt F, Haschke R, Ritter HJ (2016) Tactile convolutional networks
for online slip and rotation detection In International Conference on Artificial
Neural Networks, pp. 12–19. Springer.
Melchiorri C (2000) Slip detection and control using tactile and force sensors.
IEEE/ASME Transactions on Mechatronics 5:235–242.
Mittra ES, Smith HF, Lemelin P, Jungers WL (2007) Comparative morpho-
metrics of the primate apical tuft. American Journal of Physical Anthropol-
ogy 134:449–459.
Moley (2018) Moley – the world’s first robotic kitchen http://www.moley.com/
Accessed: 2018-08-04.
Møller MF (1993) A scaled conjugate gradient algorithm for fast supervised learn-
ing. Neural Networks 6:525–533.
Monzée J, Lamarre Y, Smith AM (2003) The effects of digital anesthesia on force
control using a precision grip. Journal of Neurophysiology 89:672–683.
Nakanishi J, Cory R, Mistry M, Peters J, Schaal S (2008) Operational space
control: A theoretical and empirical comparison. The International Journal of
Robotics Research 27:737–757.
Napier JR (1956) The prehensile movements of the human hand. The Journal of
Bone and Joint Surgery. British Volume 38:902–913.
Nemec B, Ude A (2012) Action sequencing using dynamic movement primitives.
Robotica 30:837–846.
Niekum S, Osentoski S, Konidaris G, Chitta S, Marthi B, Barto AG (2015a) Learn-
ing grounded finite-state representations from unstructured demonstrations. The
International Journal of Robotics Research 34:131–157.
Niekum S, Osentoski S, Atkeson CG, Barto AG (2015b) Online Bayesian change-
point detection for articulated motion models In IEEE International Conference
on Robotics and Automation (ICRA), pp. 1468–1475.
Ogawa H (1996) The Merkel cell as a possible mechanoreceptor cell. Progress in
Neurobiology 49:317–334.
Park D, Hoffmann H, Pastor P, Schaal S (2008) Movement reproduction and obsta-
cle avoidance with dynamic movement primitives and potential fields In IEEE-
RAS International Conference on Humanoid Robots (Humanoids), pp. 91–98.
Pastor P, Kalakrishnan M, Righetti L, Schaal S (2012) Towards associative
skill memories In IEEE-RAS International Conference on Humanoid Robots
(Humanoids), pp. 309–315.
Pastor P, Kalakrishnan M, Binney J, Kelly J, Righetti L, Sukhatme G, Schaal
S (2013) Learning task error models for manipulation In IEEE International
Conference on Robotics and Automation (ICRA), pp. 2612–2618.
Pastor P, Kalakrishnan M, Chitta S, Theodorou E, Schaal S (2011) Skill learning
and task outcome prediction for manipulation In IEEE International Conference
on Robotics and Automation (ICRA), pp. 3828–3834.
Pastor P, Kalakrishnan M, Meier F, Stulp F, Buchli J, Theodorou E, Schaal
S (2013) From dynamic movement primitives to associative skill memories.
Robotics and Autonomous Systems 61:351–361.
Pastor P, Righetti L, Kalakrishnan M, Schaal S (2011) Online movement adapta-
tion based on previous sensor experiences In IEEE/RSJ International Confer-
ence on Intelligent Robots and Systems (IROS), pp. 365–371.
Peine WJ (2000) Remote palpation instruments for minimally invasive surgery.
Phillips J, Johnson KO (1981) Tactile spatial resolution. II. Neural representation
of bars, edges, and gratings in monkey primary afferents. Journal of Neurophys-
iology 46:1192–1203.
Rai A, Meier F, Ijspeert A, Schaal S (2014) Learning coupling terms for obsta-
cle avoidance In IEEE-RAS International Conference on Humanoid Robots
(Humanoids), pp. 512–518.
Rai A, Sutanto G, Schaal S, Meier F (2017) Learning feedback terms for reac-
tive planning and control In IEEE International Conference on Robotics and
Automation (ICRA), pp. 2184–2191.
Romano JM, Hsiao K, Niemeyer G, Chitta S, Kuchenbecker KJ (2011) Human-
inspired robotic grasp control with tactile sensing. IEEE Transactions on
Robotics 27:1067–1079.
Schoepfer M, Schuermann C, Pardowitz M, Ritter H (2010) Using a piezo-resistive
tactile sensor for detection of incipient slippage In IEEE International Sympo-
sium on Robotics, pp. 14–20, Berlin.
Serina ER, Mockensturm E, Mote C, Rempel D (1998) A structural model of the
forced compression of the fingertip pulp. Journal of Biomechanics 31:639–646.
Serina ER, Mote C, Rempel D (1997) Force response of the fingertip pulp to
repeated compression—effects of loading rate, loading angle and anthropometry.
Journal of Biomechanics 30:1035–1040.
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Transactions
on Pattern Analysis and Machine Intelligence 22:888–905.
Shrewsbury MM, Johnson RK (1975) The fascia of the distal phalanx. The Journal
of Bone and Joint Surgery. American Volume 57:784–788.
Shrewsbury MM, Johnson RK (1983) Form, function, and evolution of the distal
phalanx. Journal of Hand Surgery 8:475–479.
Sinapov J, Sukhoy V, Sahai R, Stoytchev A (2011) Vibrotactile recognition
and categorization of surfaces by a humanoid robot. IEEE Transactions on
Robotics 27:488–497.
Srinivasan MA, Whitehouse JM, LaMotte RH (1990) Tactile detection of slip:
Surface microgeometry and peripheral neural codes. Journal of Neurophysiol-
ogy 63:1323–1332.
Srinivasan MA, LaMotte RH (1995) Tactual discrimination of softness. Journal
of Neurophysiology 73:88–101.
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014)
Dropout: A simple way to prevent neural networks from overfitting. The Journal
of Machine Learning Research 15:1929–1958.
Su Z, Fishel JA, Yamamoto T, Loeb GE (2012) Use of tactile feedback to control
exploratory movements to characterize object compliance. Frontiers in Neuro-
robotics 6:7.
Su Z, Hausman K, Chebotar Y, Molchanov A, Loeb GE, Sukhatme GS, Schaal S
(2015) Force estimation and slip detection/classification for grip control using a
biomimetic tactile sensor In IEEE-RAS International Conference on Humanoid
Robots (Humanoids), pp. 297–303.
Su Z, Kroemer O, Loeb GE, Sukhatme GS, Schaal S (2016) Learning to switch
between sensorimotor primitives using multimodal haptic signals In Interna-
tional Conference on Simulation of Adaptive Behavior, pp. 170–182. Springer.
Su Z, Kroemer O, Loeb GE, Sukhatme GS, Schaal S (2018) Learning manipu-
lation graphs from demonstrations using multimodal sensory signals In IEEE
International Conference on Robotics and Automation (ICRA), pp. ?–?
Su Z, Li Y, Loeb GE (2011) Estimation of curvature feature using a biomimetic
tactile sensor In Proceedings of American Society of Biomechanics, pp. 1–2.
Sung J, Salisbury JK, Saxena A (2017) Learning to represent haptic feedback for
partially-observable tasks. arXiv preprint arXiv:1705.06243.
Susman RL (1979) Comparative and functional morphology of hominoid fingers.
American Journal of Physical Anthropology 50:215–236.
Susman RL (1988) Hand of Paranthropus robustus from Member 1, Swartkrans:
fossil evidence for tool behavior. Science 240:781–784.
Takamuku S, Gomez G, Hosoda K, Pfeifer R (2007) Haptic discrimination of
material properties by a robotic hand In IEEE International Conference on
Development and Learning (ICDL), pp. 1–6.
Tieleman T, Hinton G (2012) Lecture 6.5—RmsProp: Divide the gradient by
a running average of its recent magnitude COURSERA: Neural Networks for
Machine Learning.
Ude A, Nemec B, Petric T, Morimoto J (2014) Orientation in Cartesian space
dynamic movement primitives In IEEE International Conference on Robotics
and Automation (ICRA), pp. 2997–3004.
Vallbo AB, Johansson RS (1978) The tactile sensory innervation of the glabrous
skin of the human hand In Active Touch, pp. 29–54.
Veiga F, Peters J (2016) Can modular finger control for in-hand object stabilization
be accomplished by independent tactile feedback control laws? arXiv preprint
arXiv:1612.08202.
Veiga F, Van Hoof H, Peters J, Hermans T (2015) Stabilizing novel objects by
learning to predict tactile slip In IEEE/RSJ International Conference on Intel-
ligent Robots and Systems (IROS), pp. 5065–5072.
Vijayakumar S, D’Souza A, Schaal S (2005) Incremental online learning in high
dimensions. Neural Computation 17:2602–2634.
Villani L, Natale C, Siciliano B, de Wit CC (2000) An experimental study of adap-
tive force/position control algorithms for an industrial robot. IEEE Transactions
on Control Systems Technology 8:777–786.
Von Luxburg U (2007) A tutorial on spectral clustering. Statistics and Comput-
ing 17:395–416.
Webster JG (1988) Tactile sensors for robotics and medicine. John Wiley & Sons,
Inc.
Westling G, Johansson RS (1987) Responses in glabrous skin mechanoreceptors
during precision grip in humans. Experimental Brain Research 66:128–140.
Wettels N, Fishel JA, Loeb GE (2014) Multimodal tactile sensor In The Human
Hand as an Inspiration for Robot Hand Development, Vol. 95 of Springer Tracts
in Advanced Robotics, pp. 405–429. Springer.
Wettels N, Fishel JA, Su Z, Lin CH, Loeb GE (2009a) Multi-modal synergistic tac-
tile sensing In Tactile Sensing in Humanoids - Tactile Sensors and Beyond Work-
shop, IEEE-RAS International Conference on Humanoid Robots (Humanoids).
Wettels N, Parnandi AR, Moon J, Loeb GE, Sukhatme GS (2009b) Grip con-
trol using biomimetic tactile sensing systems. IEEE/ASME Transactions on
Mechatronics 14:718–723.
Wettels N, Santos VJ, Johansson RS, Loeb GE (2008) Biomimetic tactile sensor
array. Advanced Robotics 22:829–849.
Wettels N, Loeb GE (2011) Haptic feature extraction from a biomimetic tactile
sensor: force, contact location and curvature In IEEE International Conference
on Robotics and Biomimetics (ROBIO), pp. 2471–2478.
Wettels N, Smith LM, Santos VJ, Loeb GE (2008) Deformable skin design to
enhance response of a biomimetic tactile sensor In IEEE International Confer-
ence on Biomedical Robotics and Biomechatronics (BioRob), pp. 132–137.
Yamamoto T, Vagvolgyi B, Balaji K, Whitcomb LL, Okamura AM (2009) Tis-
sue property estimation and graphical display for teleoperated robot-assisted
surgery In IEEE International Conference on Robotics and Automation (ICRA),
pp. 4239–4245.
Yao K, Kaboli M, Cheng G (2017) Tactile-based object center of mass exploration
and discrimination In IEEE-RAS International Conference on Humanoid Robots
(Humanoids), pp. 876–881.
Yao Y, Rosasco L, Caponnetto A (2007) On early stopping in gradient descent
learning. Constructive Approximation 26:289–315.
Yuan JS (1988) Closed-loop manipulator control using quaternion feedback. IEEE
Journal on Robotics and Automation 4:434–440.
Abstract
A service robot deployed in human environments must be able to perform reliable dexterous manipulation tasks under many different conditions. These tasks require robots to interact with objects under intrinsic pose uncertainties, such as poor forward kinematics due to unmodeled nonlinear dynamics, and extrinsic pose uncertainties, such as uncertainty about the objects' poses. Recent advances in computer vision and range sensing enable robots to estimate the poses of their end-effectors and of objects reliably. However, pose estimation accuracy deteriorates when tasks involve visual occlusion. Even with correct pose estimates of the robot and the objects, reliable dexterous manipulation remains a challenging problem, because these tasks involve interacting with objects with unknown material properties, such as mass, center of mass, compliance, shape, and friction.

Tactile sensors can be used to monitor the hand-object interactions that are central to dexterous manipulation. Recently, biomimetic tactile sensors, designed to provide more humanlike capabilities, have been developed. Building on these rich tactile sensing capabilities, we propose a hierarchical tactile manipulation framework to improve the robustness of robotic manipulation so that robots can approach human-level performance in dexterous manipulation tasks. First, we discuss using heuristic tactile features to cope with external objects' pose uncertainties through low-level feedback control, such as tactile servoing, while estimating the compliance material property. Second, we propose a framework to learn low-level feedback policies from human demonstration, and we demonstrate this framework on a scraping task using a learned tactile feedback policy. Third, we present our slip detection method, which provides low-level feedback control to adjust the grip force during a grasping task. Fourth, we present a framework to predict translational and rotational slips, which is used to update the desired feedforward grip force during a precision pinch grip task. Finally, we present a framework that uses tactile events to perform skill acquisition as well as skill performance evaluation, a high-level perceptual process, through active exploration.
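To make the slip-triggered low-level feedback idea in the third contribution concrete, the following is a minimal illustrative sketch in Python, not the dissertation's implementation: the sensor and hand interfaces (read_vibration, set_grip_force), the thresholds, and the gains are all hypothetical placeholders chosen only to show the control structure.

import time

# All constants below are illustrative, not tuned values from the dissertation.
SLIP_ENERGY_THRESHOLD = 0.05   # vibration energy above this is treated as slip
FORCE_STEP = 0.2               # N added to the grip setpoint per detected slip
FORCE_DECAY = 0.995            # slow relaxation toward the minimal grip force
F_MIN, F_MAX = 1.0, 10.0       # clamp on the commanded grip force (N)

def detect_slip(vibration_samples):
    """Crude slip detector: mean squared high-frequency vibration energy."""
    energy = sum(v * v for v in vibration_samples) / len(vibration_samples)
    return energy > SLIP_ENERGY_THRESHOLD

def grip_control_loop(sensor, hand, rate_hz=100.0):
    """Feedback loop: raise grip force on slip, otherwise let it decay slowly."""
    force = F_MIN
    period = 1.0 / rate_hz
    while True:
        samples = sensor.read_vibration()   # hypothetical tactile-sensor call
        if detect_slip(samples):
            force = min(force + FORCE_STEP, F_MAX)
        else:
            force = max(force * FORCE_DECAY, F_MIN)
        hand.set_grip_force(force)          # hypothetical gripper call
        time.sleep(period)

The structure mirrors the reactive grip responses reported in humans by Johansson and Westling (1987): a fast increase in grip force when slip-related vibrations are detected, followed by a slow relaxation toward an economical grip.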