AI-DRIVEN EXPERIMENTAL DESIGN FOR LEARNING OF PROCESS
PARAMETER MODELS FOR ROBOTIC PROCESSING APPLICATIONS
by
Yeo Jung Yoon
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MECHANICAL ENGINEERING)
December 2023
Copyright 2023 Yeo Jung Yoon
Acknowledgements
I would like to express my heartfelt gratitude to my advisor, Dr. Satyandra K. Gupta, for his unwavering
guidance throughout my Ph.D. journey. With his profound expertise in robotics and manufacturing, Dr.
Gupta has provided me with invaluable feedback and unwavering direction for my research. I have learned
immeasurable knowledge and wisdom from him, and I will forever cherish his mentorship. I consider myself
exceptionally fortunate to have had him as my advisor.
I am also deeply thankful to my dissertation committee, consisting of Dr. Jin, Dr. Bermejo-Moreno, Dr.
Quan, and Dr. Nikolaidis, who dedicated their time and expertise to attend both my proposal defense and
final defense. Their insightful comments and constructive suggestions during these defenses have significantly
enriched the quality of my dissertation.
I extend my sincere thanks to my dedicated labmates at the Center for Advanced Manufacturing (CAM),
Sarah, Ariyan, Neel, Omey, Jeon, Yeowon, Alec, Shantanu, Pradeep, Prahar, Brual, and Rishi. Our collaborative brainstorming sessions, various activities at CAM, shared campus experiences, joint research efforts,
and successful paper publications have made my Ph.D. journey at CAM exceptionally rewarding and colorful.
I am immensely grateful to each one of you.
I wish to express my gratitude to the students and collaborators I had the privilege of working with on various research projects, including Santosh, Yang, Sung Eun, Minsok, Oswin, and Ashish. Our collaborations
have been a source of great learning and experience.
Dr. Max Liu deserves special thanks for his endless encouragement, support, and motivation throughout
my entire Ph.D. journey, from its beginning to the end. Max has consistently inspired and engaged me
in stimulating discussions about my research. These fun discussion sessions provided me with invaluable
insights, greatly elevating the quality of my work.
I would also like to extend my warmest appreciation to Dr. Chan-Ye Ohh, with whom I embarked on
this Ph.D. journey in the same department. We prepared for defenses side by side and graduated around the
same time. The time I spent with Chan-Ye studying for exams and preparing for defenses holds immense
value and support for me. The memories we created both on and off the USC campus are cherished and
unforgettable.
My heartfelt thanks go to Calvin Chang, who has encouraged me to maintain a steady and effective pace
while conducting research. Calvin brought a positive influence and his unwavering belief in me has propelled
me forward during challenging moments in my Ph.D. journey.
Lastly, I want to express my deepest appreciation to my parents, Dr. Youngsoo Yoon and Kyounghee
Choi, whose strong faith in me and encouragement inspired me to pursue my Ph.D. journey. Even from
Korea, they have consistently been my sources of support, offering unlimited cheers and caring for my well-being. They provided invaluable advice whenever I needed it. Their strong belief, endless encouragement,
and love have been instrumental in helping me overcome numerous challenges throughout my Ph.D. journey.
Table of Contents
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Research Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Objectives and Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Chapter 2: Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Parameter Learning and Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Parameter Estimation Using Active Learning Approach . . . . . . . . . . . . . . . . . 11
2.2.2 Parameter Estimation Using Bayesian Optimization . . . . . . . . . . . . . . . . . . . 13
2.2.3 Parameter Estimation Using Design of Experiments Methods . . . . . . . . . . . . . . 15
2.2.4 Parameter Estimation Using Meta-heuristic Approaches . . . . . . . . . . . . . . . . . 16
2.3 Spatially Varying Process Parameters Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Temporally Varying Process Parameters Learning . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 Safe Parameter Learning and Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Chapter 3: Background: Planning, Control, and Calibration Foundations for Robotic Processing
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2 Tool Path Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.1 Tool Paths for Covering Areas on Non-planar Layers . . . . . . . . . . . . . . . . . . . 24
3.2.2 Tool Paths for Pick and Place Operations . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 Robot Trajectory Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Tool Velocity Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.5 Calibration of Robots and Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Chapter 4: Learning of Constant Process Parameter Models for Robotic Processing Applications . . . 53
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.2 Feasibility Biased Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.3 Surrogate Model Construction: Gaussian Process . . . . . . . . . . . . . . . . . . . . . 59
4.3.4 Greedy Optimization using GP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.5 A Sequential Decision Making Approach for Parameter Selection . . . . . . . . . . . . 62
4.4 Case Study: Robotic Sanding Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.5 Results of Robotic Sanding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.5.1 Description of Function Used for Simulating Sanding Process . . . . . . . . . . . . . . 69
4.5.2 Implementation of Our Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.6 Case Study: Robotic Sanding Task with Risks of Irreversible Damage . . . . . . . . . . . . . 74
4.6.1 Description of Function Used for Simulating Sanding Process . . . . . . . . . . . . . . 79
4.6.2 Implementation of Safe Learning Approach . . . . . . . . . . . . . . . . . . . . . . . . 79
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Chapter 5: Learning of Spatially Varying Process Parameter Models for Robotic Finishing Applications 82
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.3.2 Definition of Parameters and Objective Function for Contact-based Finishing Tasks . 88
5.3.3 Initial Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.3.1 Part Surface Splitting into Smaller Patches . . . . . . . . . . . . . . . . . . . 90
5.3.3.2 Initial Task Execution Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.3.4 Surrogate Model Construction and Update . . . . . . . . . . . . . . . . . . . . . . . . 93
5.3.5 Selection of Region Sequencing Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.3.6 Selection of Process Parameter Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.4 Case Study: Contact-based Sanding Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.4.1 Initial Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.4.2 Surrogate Model Construction and Update . . . . . . . . . . . . . . . . . . . . . . . . 102
5.4.3 Selection of Region Sequencing Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.4.4 Selection of Process Parameter Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.5 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.5.1 Computational Simulation of Robotic Sanding Task . . . . . . . . . . . . . . . . . . . 106
5.5.2 Physical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Chapter 6: Learning of Temporally Varying Process Parameter Models for Direct Ink Writing
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.3 Overview of Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.4 Task Performance Analysis: Image Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.5 Surrogate Model Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.6 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.7 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Chapter 7: A Sequential Decision Making Approach to Learn Process Parameter Models by
Conducting Experiments on Sacrificial Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
7.3 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
7.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
7.3.2 Search Tree Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
7.3.3 Gaussian Process Surrogate Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
7.3.4 Policy to Select Process Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.4 Case Study: Robotic Spray Painting Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.4.1 Process Parameters and Task Performance . . . . . . . . . . . . . . . . . . . . . . . . 143
7.4.2 Gaussian Process Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7.4.3 Parameter Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7.4.4 Search Tree Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
7.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Chapter 8: Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.1 Intellectual Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.2 Anticipated Benefits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
8.3 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
List of Tables
4.1 Selecting the different sampling methods results in different outcomes in terms of N′, tcb, and
T. The unit of time is the second. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2 Sanding the different sizes of rusty surfaces results in different outcomes in terms of N′, tcb,
and T. The unit of time is the second. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.3 Tuning hyperparameters of GP models affects the average and maximum values of N′ and
tcb. The unit of time is the second. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.4 Comparison between our approach and other DOE methods. The unit of time is the second. . 74
4.5 Comparison of outcomes when applying the safe learning strategy and not applying it. The
unit of time is the second. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.6 Comparison of outcomes when applying the safe learning strategy and not applying it. The
unit of time is the second. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.1 The mean task completion time with different ratios of γ (unit: sec) . . . . . . . . . . . . . . 107
5.2 The mean task completion time with different combinations of kernel functions in GPR
models (unit: sec) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.3 The mean task completion time with different region sequencing policies (unit: sec) . . . . . . 109
5.4 The mean task completion time with and without heuristics (unit: sec) . . . . . . . . . . . . . 110
5.5 The mean task completion time with different parts . . . . . . . . . . . . . . . . . . . . . . . . 111
6.1 The computation of RMSE 1, 2, and 3. We ran the computation five times. The training
and test data are randomly selected for each run. . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.2 The comparison of the expected total length when the process parameters are adjusted over
time or not adjusted. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.3 The comparison of the expected total length of prints with and without GP tuning. . . . . . 133
List of Figures
1.1 Traditionally, the process of determining the right process parameters and programming the
robot is done by human operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 The parts are severely damaged, requiring significant costs and time for the replacement. . . 4
1.3 Three characteristics for the robotic sanding task with potential for irreversible damage. . . . 5
1.4 Learning framework for a manufacturing task should satisfy four factors: learn efficiently,
learn safely, utilize prior knowledge, and handle complex manufacturing constraints. . . . . . 7
1.5 The learning approach is established and applied in the following applications: (a) Sanding,
(b) Spray painting, (c) Additive manufacturing. . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1 The robotic system for conformal 3D printing. The left figure shows a Yaskawa Motoman
GP12 manipulator with a Bowden extruder. The right figure shows an ABB IRB 120. . . . . . 25
3.2 (a) Hatching along 20° slope, (b) Hatching along 40° slope, (c) Hatching along 60° slope,
(d) Hatching along 90° slope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3 Tool path generation of non-planar layers with varying hatching angles . . . . . . . . . . . . . 27
3.4 The execution of the AM process on the non-planar surfaces. . . . . . . . . . . . . . . . . . . 29
3.5 3D CAD models and physical models printed by the robotic 3D printing system: Specimen
A to F . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.6 (a) Surface finish comparison between Specimen A_planar (left) and Specimen A (right), (b)
Specimen C_planar printed by the traditional 3D printer using a planar layered method, (c)
Enlarged pictures of Specimen C_planar (left) and Specimen C (right). . . . . . . . . . . . . . 31
3.7 The robotic cell consisting of two 6 DOF robotic manipulators, one extruder, and a gripper . 33
3.8 Pick and place operations: (a) The electromagnetic gripper is in the activated mode and
picks up the servo motor, (b) The gripper inserts the servo motor into the yellow 3D printed
part, (c) The gripper is in the deactivated mode and applies forces to slide the servo motor
into the printed part. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.9 Tool path generated for Part A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.10 Tool path generated for Part B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.11 Entire execution process of the proposed robotic cell . . . . . . . . . . . . . . . . . . . . . . . 38
3.12 Printing process of Part A: (a) The bottom layers are printed with non-planar layers, (b)
The inner layers are printed with planar layers, (c) After the motor is embedded, the rest of
the infill is printed with planar layers, (d) The top layers of the part are conformally printed
with the non-planar layers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.13 Initial CAD model designs of Part A, B, and C (top figures). The printed parts (bottom
figures). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.14 Illustration of path consistency challenges when moving from P1 to P2. Robot configurations
Θ2 and Θ2′ can both reach P2. Going from Θ1 to Θ2 leads to a consistent path. Going from
Θ1 to Θ2′ leads to an inconsistent path. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.15 Representative cone generation for waypoints along the tool path. . . . . . . . . . . . . . . . 46
3.16 Changes in TCP orientation along a planar, convex, and concave surface. . . . . . . . . . . . 47
3.17 Printing along a curved surface (a) without velocity control, (b) with velocity control. . . . . 48
3.18 (a) Damaged base, (b) Printed model of scaled down version of car bonnet . . . . . . . . . . . 50
3.19 (a) Squeezed out excess material, (b) Curling of hatch pattern. . . . . . . . . . . . . . . . . . 50
3.20 Position of TCP and end effector with respect to flange frame coordinate . . . . . . . . . . . 52
3.21 TCP calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.22 RobotStudio Simulation for singularity and collision check . . . . . . . . . . . . . . . . . . . . 52
4.1 An industrial robot is performing the surface finishing task. . . . . . . . . . . . . . . . . . . . 53
4.2 Figure (a) shows the robotic finishing process for a rigid part. (b) shows the surface quality
is poor and the task fails. (c) shows the surface quality constraint is satisfied and the task
succeeds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.3 A representative example of Ω and Ψ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.4 Constructing GP models without hyperparameter optimization can cause the poor estimation
of task performance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.5 The 6 DOF manipulator with the sanding tool needs to sand the surface with N number of
identical panels. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.6 This describes the results of the current exploration and potential reasons for the outcomes.
It also presents suitable reactions for further task execution. . . . . . . . . . . . . . . . . . . . 66
4.7 A general framework of greedy optimization for the robotic sanding task. . . . . . . . . . . . 68
4.8 The modeling of sanding tasks using analytical functions. . . . . . . . . . . . . . . . . . . . . 70
4.9 The sets of parameters selected for task executions are marked in the parameter space. The
sequence of the process parameter selection is represented by the color gradation from light
to dark. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.10 Robotic sanding task that poses the risk of irreversible damage. The part can be damaged
severely if an excessive amount of force is applied. . . . . . . . . . . . . . . . . . . . . . . . . 74
4.11 The deflection estimation using the performance model constructed with initial task executions. 77
5.1 The regions of the part exhibit different levels of compliance. . . . . . . . . . . . . . . . . . . 83
5.2 A manipulator is performing the sanding operations on the part that has spatially varying
stiffness. The part’s stiffness varies depending on regions due to its geometries, thickness,
and metal support underneath. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.3 Flow chart of the learning and task execution process. . . . . . . . . . . . . . . . . . . . . . . 87
5.4 (a) Isometric view: a trapezoid shape part is given. (b) Top view: a part is split into
non-matching patches. (c) Top view: a part is split into two pairs of identical patches. . . . 91
5.5 An example of feasible and infeasible regions in 3-dimensional parameter space. Red areas
indicate the infeasible region. The remaining area marked in blue is the feasible region. . . . 92
5.6 GPR model 1 is to predict surface quality and GPR model 2 is to predict deflection. . . . . . 94
5.7 The different region sequencing policies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.8 Setup of the robotic sanding task. The figures show the setup in different orientations. . . . . 100
5.9 Four different parts (a)(b)(c)(d) are given for the sanding task. The parts are shown in an
isometric view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.10 Mean absolute percentage error decreases as the number of initial experiments increases. . . . 103
5.11 (a) The average loss when using different kernel functions. In this iteration, ARD matern
3/2 is the best kernel function as it produces the smallest loss. (b) and (c): the percentages
of each kernel selected as the best hyperparameter of GPR model 1 and GPR model 2,
respectively. Note: the kernel functions in the x-axis are listed in the same order for figures
(a)(b)(c). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.12 Figures show the regions of high stiffness and low stiffness for parts (a)(b)(c)(d). Figures are
shown in a top view. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.13 Left figure shows how policy 1 is applied to part (a). Right figure shows how policy 9 could
be applied to part (d). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.14 (a) Robotic sanding cell: ABB IRB 2600, the sanding tool, and the part to sand (b) Enlarged
photo of the tool. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.15 (a) The part surface quality before sanding, (b) The part surface quality after sanding task. . 114
6.1 (a) The shape recovery product is fabricated through a dual DIW approach [23]. (b) Mesoscale
printing is performed on a curved surface using a robotic DIW setup [19]. . . . . . . . . . . . 118
6.2 (a) The experimental setup for DIW using a 6 DOF manipulator. (b) The detail of fluid
dispenser. Two process parameters can be controlled. . . . . . . . . . . . . . . . . . . . . . . 121
6.3 The flowchart of DIW process using the AI-driven experimental design . . . . . . . . . . . . . 123
6.4 Image processing to determine edges of the printed line. (a) is the original image, and (b)
shows the edges of the printed line using Sobel edge detection. . . . . . . . . . . . . . . . . . 125
6.5 Image processing to determine discontinuity in the printed artifacts. (a) is the original image,
and (b) is the binary image. (c) shows all the bounding boxes detected including noise. (d)
shows the line segmentation is successful after filtering out the noise. . . . . . . . . . . . . . 127
6.6 Printing failure: the blobs occurred during the printing process . . . . . . . . . . . . . . . . . 128
6.7 Three GP models are constructed in each time step. . . . . . . . . . . . . . . . . . . . . . . . 128
6.8 (a) Printing actual artifacts at different time steps. (b) The artifacts in the orange box show
the process without temporal adjustment. The artifacts in the blue box show the printing
process with temporal compensation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.9 The sets of process parameters selected over time . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.1 The figures show a mobile manipulator spray painting a mural [36] . . . . . . . . . . . . . . . 136
7.2 The figures describe the spray painting operation on a sacrificial part. . . . . . . . . . . . . . 137
7.3 An example of the search tree structure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
7.4 The figures show the process parameters: the distance and the spray angle. . . . . . . . . . . 144
7.5 Examples of the spray painting task: (b) and (c) show successful results without any drips,
and (a), (d), (e), (f) show failed results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.6 The structures of GP regression models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7.7 N′ changes depending on the number of data M′. This decaying function means that the
ratio of N′ and N′′ changes depending on M′. . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.8 The search tree structure used in our algorithm. A single round of execution includes a
decision, actions, and outcomes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.9 Five batches of Monte Carlo rollouts are performed. The figure shows the case where 8
sets of process parameters are selected for exploration and 2 sets of parameters are selected
for pseudo-exploitation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.10 Result of task executions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.11 The word, USC, is spray painted. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Abstract
Robots are increasingly being considered for different manufacturing processing applications. Completing
the process efficiently and successfully requires using the right process parameters. Under traditional practices, the responsibility for determining the right process parameters and implementing them on the robots has
largely fallen to human operators. This approach, although reliable, has drawbacks in its time-consuming nature and associated costs. Instead, we want to facilitate an AI-driven experimental design
approach for learning manufacturing tasks using robots. AI-driven experimental design will enable robots to
learn from and adapt to the outcomes of previous experiments for determining process parameters to use in
further experiments. Robots can try different values of process parameters, evaluate the task performance,
and incorporate insights from these evaluations to guide subsequent experiments. Robots can continuously
enhance task performance and update process parameter models through this iterative learning approach.
The first contribution of this dissertation is to develop and implement an adaptive experimental design for
learning tasks characterized by constant process parameter models. Since the process parameter models are
constant, the sets of process parameters that ensure efficient and successful task execution can be determined
and employed. The AI-driven experimental design integrates aspects of feasibility biased sampling, surrogate
model construction, and heuristic-driven optimization. The practical implementation of this approach is
demonstrated within the context of a robotic sanding application.
The second contribution of this dissertation is building a framework to learn tasks characterized by
spatially varying process parameter models through AI-driven experimental design. Compared to models
utilizing constant process parameters, those involving spatially varying process parameters are more complex
and challenging to learn. The adaptive experimental design presented in this chapter utilizes a combination of
initial parameter exploration, surrogate modeling, region sequencing policy selection, and process parameter
selection policy. The applied execution of this method is showcased in contact-based robotic finishing tasks.
Through the computational simulations and physical experiments of the robotic sanding case study, we
demonstrate the successful implementation of our approach.
The third contribution of this dissertation is to develop and implement a learning approach for robotic
processing applications characterized by temporally varying process parameter models. We implement our
approach to direct ink writing applications, where the issue of ink drying is prevalent over time. To account
for the issue of ink drying, it becomes necessary to adjust process parameters, such as tool velocity or pressure,
accordingly for each temporal phase of the processing. By making temporal adjustments of parameters, we
successfully maximize the achievable print length without any constraint violations of the process.
The final contribution of this dissertation is developing a sequential decision making approach to learn
process parameters by conducting experiments on sacrificial objects. When there is a risk of damaging target
objects (objects of interest), experimenting on sacrificial objects is a viable strategy to ensure the safety of the
target objects. However, excessive utilization of sacrificial objects could increase the associated costs. Using
an appropriate quantity of sacrificial objects is important to complete the task efficiently and safely with the
minimum task completion costs. To find the right quantity of sacrificial objects and determine the process
parameter to use, we utilize an AI-driven experimental design using a sequential decision making approach.
The AI-driven experimental design approach encapsulates aspects of look-ahead search, surrogate modeling,
and a policy for process parameter selection. The proposed method is implemented and demonstrated on
the robotic spray painting application.
Chapter 1
Introduction
1.1 Motivation
The use of robots has brought a revolution in the industry, resulting in increased productivity, enhanced
quality of products, and reduced overall manufacturing costs. Robots could be used for various manufacturing processing applications. When performing a manufacturing task, the selection of appropriate process
parameters is crucial in order to efficiently and safely complete the task. Traditionally, the process of identifying the appropriate process parameters and programming robots has largely been done by human operators.
Human operators try different values of process parameters with the robots and assess the task performance. With the performance measurement outcomes, the operators make adjustments and update process
parameter models of the task. The operators then perform another experiment and iterate this process (e.g.
Figure 1.1). Eventually, the operators find the best process parameters for robots to successfully perform
the manufacturing task. However, this manual process is inefficient due to the need for numerous trials
and significant programming costs. While this manual programming process can be economically viable for
mass-production scenarios where the high programming costs can be spread across a large number of parts,
it remains impractical for many other manufacturing scenarios, such as high-mix low-volume production,
customization, small-batch production, and so on.
In high-mix low-volume production, customization, and small-batch production, it is not economically
feasible to identify the appropriate process parameters manually. These types of production involve manufacturing a wide array of part types, each in small quantities. Each part type might require a distinct set
of process parameters to successfully complete the manufacturing tasks. This means that each and every
new order would require human operators to identify the new optimal parameters. In these situations, the
costs will be substantial, and the high programming costs cannot be spread out over small numbers of items.
Therefore, these types of production may require a more efficient approach to determine the appropriate
process parameters.
Figure 1.1: Traditionally, the process of determining the right process parameters and programming the robot is done
by human operators.
One possible solution is to integrate AI for experimental design to learn robotic processing applications.
AI-driven experimental design could save costs and increase the efficiency of learning the task. Robots
could efficiently collect the data and improve the task on the fly while minimizing associated costs. The
following is the general framework: a robot performs initial experiments with a selective and planned
experimental design and evaluates the task performance. Surrogate models of the process parameter
models are then built using the collected data. The robot performs another experiment with a set of
parameters recommended by AI algorithms. Every time new data is collected, the data set and the
models are updated, and the AI algorithms select the parameters for each experiment based on the
updated posterior. Over time, the process parameters are progressively refined. There are many factors
to consider in this general framework. The algorithm should efficiently search the process parameter
space while minimizing the
costs that may come from constraint violations. The experiments should be performed in a way to avoid
catastrophic failures. All related manufacturing constraints, such as the physical limits of the experimental
platform or task performance constraints, should be considered. Prior knowledge or domain knowledge,
such as physics, should be utilized. We want to develop an adaptive and efficient experimental design while
considering all these factors. We will examine how to enable robots to efficiently and safely complete various
manufacturing tasks while minimizing manufacturing costs. With this in mind, this dissertation explores an
AI-driven approach to learning process parameter models for robotic processing applications.
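To make this loop concrete, the following minimal Python sketch illustrates the general framework on a synthetic two-parameter task. It is only an illustration of the iterative structure described above, not the specific algorithms developed in later chapters; the run_experiment function and the lower-confidence-bound selection rule are illustrative assumptions.

```python
# A minimal sketch of the iterative, AI-driven experimental design loop,
# assuming a hypothetical run_experiment() that executes the task on the robot
# and returns a scalar performance measure (lower is better).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def run_experiment(params):
    # Stand-in for a physical task execution; here a noisy synthetic cost surface.
    force, speed = params
    return (force - 0.6) ** 2 + 0.5 * (speed - 0.3) ** 2 + 0.01 * np.random.randn()

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5, 2))        # initial planned experiments
y = np.array([run_experiment(x) for x in X])  # measured task performance

for _ in range(20):
    # Rebuild the surrogate model every time new data is collected.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    # Score random candidates with a lower confidence bound (mean minus scaled
    # uncertainty) to trade off exploration against exploitation.
    candidates = rng.uniform(0.0, 1.0, size=(256, 2))
    mean, std = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmin(mean - std)]
    # Execute the recommended experiment and update the data set and the model.
    X = np.vstack([X, x_next])
    y = np.append(y, run_experiment(x_next))

print("best parameters found:", X[np.argmin(y)])
```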
1.2 Research Issues
The learning of manufacturing tasks presents research challenges for achieving efficient and safe learning.
One of the major challenges is effectively handling various constraints that have different levels of complexity.
There are numerous requirements or constraints in manufacturing tasks that need to be taken into consideration. These include factors such as task performance constraints, quality standards, physical limitations,
safety protocols, as well as time and financial constraints. These constraints have different levels of complexity in models. Some constraints may have complex models, while others may have simpler models. For
instance, when considering sanding tasks, the surface quality constraint is more intricate than the deflection
constraint. Surface quality is influenced by multiple parameters (force, tool rotation speed, and velocity),
and the relationship is non-linear. This complexity makes it more challenging to model the surface quality
accurately. In contrast, the deflection constraint is relatively simpler as the deflection is affected mainly by
one parameter (force). This difference between the two constraints may require different levels of consideration and computational analysis to precisely model those constraints. While learning, capturing the precise
behaviors of the surface quality model might demand a greater volume of data or experiments, whereas the
deflection model can be constructed accurately with fewer data points. Hence, it is important to determine and establish an effective learning approach that takes into account the varying levels of complexity in
manufacturing constraints.
It is also crucial to consider the consequences of constraint violations in manufacturing tasks. Constraint
violations can result in substantial damages, compromised safety, or an increase in manufacturing costs. By
considering the consequences of constraint violations, the learning approach can minimize the chances of
constraint violations and make informed decisions to reduce dangerous situations during exploration. For
instance, Figure 1.2 (a) and (b) show that the parts are severely damaged during robotic sanding due to
wrong process parameters. Wrong choices of process parameters can result in constraint violations and
lead to critical damage on the surfaces. In such instances, part replacements become necessary, causing a
significant surge in overall manufacturing costs due to the high expense of replacements. When learning a
task that carries severe consequences of constraint violations, a safe learning approach should be employed
during the exploration stage to minimize the risk of selecting inappropriate process parameters. Conversely,
if the consequences of constraint violations are less severe, a more ambitious and greedy exploration can be
used. The incurred cost from selecting incorrect process parameters would remain relatively low, as the part
would not be significantly damaged. In this scenario, it might be sufficient to just reperform the task with
different process parameters to compensate for the constraint violation. Therefore, determining the most
appropriate learning approach based on the consequences of constraint violations is a significant challenge.
Figure 1.2: The parts are severely damaged, requiring significant costs and time for the replacement.
Another challenge is to effectively utilize prior knowledge of manufacturing tasks. Leveraging prior
knowledge can significantly improve the efficiency of learning and the accuracy of process parameter models.
Prior knowledge, such as domain expertise, available data, and insights from physics can be taken into
consideration when establishing the learning process. For example, we could use the prior knowledge to
build the right kind of models for process parameters, explore specific regions within the parameter space,
curtail redundant experiments, or avoid exploring parameters that would offer marginal or no improvement
to task performance. This can make the learning process efficient and safe, reducing the overall costs of
learning the tasks. Considering the highly dynamic nature of manufacturing processes, strategically using
prior knowledge can significantly increase the adaptability of the learning approach. The ability to adapt to
changes in manufacturing tasks is key to a successful learning framework, and prior knowledge provides a
solid foundation for such adaptability.
Figure 1.3: Three characteristics for the robotic sanding task with potential for irreversible damage.
For a new manufacturing task, we must characterize the task with these three factors (complexity of
constraint models, consequences of constraint violations, and prior knowledge) to establish the appropriate
learning approach. For instance, Figure 1.3 shows an example of determining these characteristics for a
robotic sanding task. Assume a part is newly given and could break due to excessive deformation. The
surface quality constraint is a complex model as it is a nonlinear function of multiple parameters. The
deflection constraint is a relatively simpler model as it is mainly affected by force amounts. When the
surface quality constraint is violated, the damage is reversible. If the surface does not possess the desired
level of smoothness, additional sanding can be carried out to enhance its smoothness. However, the violation
of the deflection constraint could result in permanent damage, causing physical damage to the part. As the
part is newly provided, we possess limited prior knowledge of both constraints. Figure 1.3 shows where these
two constraints are located in the 3-D space of the characteristics graph.
Many different manufacturing processes present unique constraints. An appropriate and efficient learning strategy should be established considering all these constraints. Several learning approaches present
themselves as potential solutions, such as Bayesian Optimization, active learning, and reinforcement learning. Each learning approach has its own advantages and disadvantages. Our AI-driven experimental design
was inspired by Bayesian Optimization (BO), active learning, and reinforcement learning. AI-driven experimental design based on BO can adapt very well to the dynamic environment of manufacturing tasks by
using diverse acquisition functions to maximize the final utility, in other words, minimize the manufacturing
costs. It determines the process parameters to use in each experiment, which minimizes the expected overall
manufacturing costs. While active learning could determine the complete process parameter models and increase
their accuracy, our main goal is to improve performance during the processing applications. In the context
of manufacturing, the choice of process parameters often needs to evolve in accordance with the learning
progress, the number of available data, knowledge gained from previous experiments, and other factors.
The adaptive experimental design through AI can accommodate these factors and various manufacturing
constraints effectively throughout the iterative learning process. While active learning predominantly selects
data points with the highest uncertainty, our approach concentrates on a broader spectrum of adaptability.
This adaptability allows for finely tuning surrogate models, and considering exploration and exploitation
trade-offs. Our approach enables the strategic selection of process parameters at various stages, enabling
the minimization of manufacturing costs under diverse constraints. Reinforcement learning could be ideal
when data collection for learning the manufacturing tasks through experimentation is easy and cheap. When there
is a simulation or digital twin that can replicate real-world tasks in virtual settings, reinforcement learning
can be used efficiently. Like reinforcement learning, the adaptive experimental design can find a good balance
between exploration and exploitation. Our approach requires far fewer trials, as it builds
the models with a small amount of data.
1.3 Objectives and Scope
The goal of this dissertation is to develop and implement AI-driven experimental design for learning process
parameter models in various robotic processing applications. The experimental design should promote the
efficiency of learning and the safety of the task while utilizing prior knowledge and handling all relevant
constraints (e.g. Figure 1.4). In manufacturing, it is crucial to achieve task completion with minimum costs
or time. Maximizing efficiency in learning is essential to enhance manufacturing productivity and reduce
overall costs. Also, safe learning is important to ensure the safety of parts and physical environments and
avoid catastrophic failures. Safe learning is closely related to manufacturing costs as well. If any
fatal failure occurs due to a safety violation, additional costs may be incurred, such as repairing the damaged
parts, recovering the damaged settings, or reperforming the task. Leveraging prior knowledge can increase
learning efficiency, offering valuable insights into process parameters and models. By effectively utilizing prior
knowledge and making informed decisions, manufacturing costs or time could be saved. Finally, the learning
framework should be able to handle various constraints in the manufacturing application. Violations of the
constraints can result in the need for task repetition or irreversible damage to parts, leading to unnecessary
expenses and time.
Figure 1.4: Learning framework for a manufacturing task should satisfy four factors: learn efficiently, learn safely,
utilize prior knowledge, and handle complex manufacturing constraints.
The dissertation focuses on building the computational foundation of learning algorithms for four robotic
processing applications. These include situations where robots aim to learn the following.
(1) Constant process parameters: The task is governed by constant process parameter models. In this
scenario, an appropriate set of process parameters to achieve successful task completion could be
identified and consistently utilized.
(2) Spatially varying process parameters: The task exhibits spatially varying process parameter models.
The optimal set of process parameters varies based on the specific regions within the part. It is
necessary to learn which parameter sets are appropriate to use in each region to execute the task
efficiently.
(3) Temporally varying process parameters: The task exhibits temporally varying process parameter models. In this scenario, the process parameters must be adjusted over time for efficient task completion.
The absence of such adjustments could lead to constraint violations, inefficiencies, and resource wastage.
(4) Safe task execution using sacrificial objects: To avoid catastrophic events caused by constraint violation,
sacrificial objects could be used to explore process parameters. In this scenario, the quantities of
sacrificial objects to use and the right process parameter values should be identified. Once the right
process parameters are determined with sacrificial objects, the task can proceed with the target objects.
Figure 1.5: The learning approach is established and applied in the following applications: (a) Sanding, (b) Spray
painting, (c) Additive manufacturing.
The dissertation demonstrates the details of the adaptive and iterative learning approaches for each
situation. We will demonstrate the practical applications of this research in the fields of sanding, spray
painting, and additive manufacturing (Figure 1.5).
1.4 Overview
The dissertation consists of multiple chapters as follows: introduction, related work, background, learning framework and algorithms, and conclusions. Chapter 2 presents the related literature review. Chapter
3 illustrates the background knowledge about path planning, control, and calibration for robotic processing
applications. Chapter 4 presents the learning framework for constant process parameters models. In this
chapter, we discuss an efficient learning strategy when a new part is assigned for a robotic automated process, such as sanding. Chapter 5 describes the robot learning for contact-based finishing tasks which exhibit
spatially varying process parameter models. The learning approach includes region sequencing policies and
process parameter policies to effectively identify the orders of the regions to execute the task and process
parameters to use in each region. Chapter 6 describes the learning of the tasks that exhibit temporally
varying process parameter models, such as direct ink writing (DIW). We determine the appropriate adjustments of process parameters over time to avoid constraint violations and improve task performance. The
task performance of the DIW process with and without temporal adjustments of process parameters is
compared. Chapter 7 presents the sequential decision making process for learning the robotic spray painting
process using sacrificial objects. To avoid irreversible damage to the object of interest, sacrificial objects are
utilized to explore process parameter models. Finally, the conclusions of the dissertation are presented in
Chapter 8.
Chapter 2
Related Work
2.1 Overview
Robots have been used for a wide array of manufacturing applications [42]. Numerous studies on the use
of robots in manufacturing processes have been done. The manufacturing applications include, but are
not limited to, the following: Arc welding [17, 37, 93, 127, 137], finishing [16, 18, 41, 53, 60, 79, 89, 92, 94,
110, 115, 129, 131], spray painting [6, 7, 20, 22, 28, 45, 49, 50, 72, 111, 134, 144, 148], additive manufacturing
[10, 11, 44, 107–109, 119, 136, 143], machining [66, 83, 100, 125, 135, 145], pick and place [47, 74, 84, 97], material
handling [38, 120, 122, 126], assembly [24, 35, 46, 54, 63, 75, 82, 86–88], and packaging [2, 5, 39, 43, 124, 130].
The fields of the studies include robot path planning, manipulation, control, task scheduling, human-robot
interaction, multi-robot collaboration, parameter optimization, and robot learning. With their superior
precision and repeatability, robots could enhance task efficiency, thereby driving a significant increase in
productivity.
The capabilities of robotic systems in diverse applications can be largely enhanced by advancements in
artificial intelligence (AI) and machine learning (ML). These advancements, coupled with the integration of
smart sensor systems, digital twins, virtual realities, or big data enable smart manufacturing. Through a
smart manufacturing system, task efficiency can be increased, and the potential risk of task failures can be
reduced. As manufacturing processes are often complex and involve numerous constraints, AI technologies
could be utilized to effectively analyze data and incorporate physics knowledge into parameter models [150].
This chapter describes the literature reviews regarding the learning approach for robotic applications.
Parameter estimation plays a crucial role in efficiently learning the right parameters. Section 2.2 presents
an overview of parameter learning and estimation. Sections 2.2.1 to 2.2.4 describe the literature reviews
of parameter estimation using different methodologies. Section 2.2.1 focuses on parameter estimation using
active learning, and Section 2.2.2 discusses parameter estimation using Bayesian Optimization (BO). Design
of Experiments (DOE) techniques for parameter estimation are covered in Section 2.2.3, and parameter
estimation using meta-heuristics is described in Section 2.2.4. Section 2.3 introduces related studies for
learning spatially varying process parameter models. Section 2.4 introduces the related work focusing on
temporally varying process parameters. Section 2.5 reviews the literature on safe parameter
learning and exploration, and describes applications in which safe learning strategies should be employed
to avoid constraint violations.
2.2 Parameter Learning and Estimation
The process of learning parameter models and estimating parameters becomes a vital component of constructing the learning framework for manufacturing applications. Process parameters of the task could be
estimated from the parameter models built with experimentation. Accurately estimated process parameters
can be used in task executions to significantly improve task performance and save task completion time.
Several approaches can be employed to learn and estimate parameters effectively: Active learning, Bayesian
Optimization, DOE methods, and meta-heuristics. The details of these approaches are described in the
following:
2.2.1 Parameter Estimation Using Active Learning Approach
Active learning approaches allow a system to actively select the most informative parameters for further
exploration. With an active learning approach, the model can be efficiently trained in a way that the
accuracy of the model improves. Through the iterative process of active learning over time, it becomes
possible to discover comprehensive process parameter models that accurately describe the relationships
between process parameters and task performance. Compared to a random sampling approach, active
learning offers a remarkable advantage by enabling faster learning with less data. This
quality is especially beneficial in experiment-based learning approaches, where data acquisition is costly or
time-consuming. Active learning can be utilized in many different robotic applications where robots learn
flexibly through continuous feedback [116].
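As a concrete illustration of this idea, the sketch below performs uncertainty-based query selection with a Gaussian Process surrogate: each iteration queries the candidate parameter about which the model is least certain. The candidate pool, the synthetic measure_performance function, and the maximum-variance criterion are illustrative assumptions, not the method of any particular cited work.

```python
# A minimal sketch of active learning for a one-dimensional process parameter
# model, where each experiment is expensive and one parameter set is queried
# per iteration. measure_performance() is a hypothetical stand-in for running
# the task and measuring its outcome.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def measure_performance(x):
    return float(np.sin(3.0 * x[0]) + 0.05 * np.random.randn())  # synthetic response

pool = np.linspace(0.0, 1.0, 200).reshape(-1, 1)  # candidate parameter values
X = pool[[0, 99, 199]]                            # a few seed experiments
y = np.array([measure_performance(x) for x in X])

for _ in range(10):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    _, std = gp.predict(pool, return_std=True)
    x_query = pool[np.argmax(std)]                # most informative candidate
    X = np.vstack([X, x_query])
    y = np.append(y, measure_performance(x_query))
# The surrogate now approximates the response with far fewer experiments
# than a dense sweep over all 200 candidates would require.
```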
Rakicevic and Kormushev introduced a unique active learning strategy, designed to efficiently navigate
the space of trial parameters for both robot task learning and transfer [99]. The authors successfully demonstrated the effectiveness of the proposed framework through a bimanual robot puck-passing task. Wilson
et al. showed empirical evidence of online parameter estimation and trajectory optimization for dynamic
task executions performed by the Baxter research robot [4]. The authors utilized an active estimator which
used Fisher information and non-linear control, known as sequential action control. Wang et al. introduced
a method employing active learning for affordances in continuous state and action spaces geared towards
enabling robot utilization of household items [123]. The authors demonstrated how a humanoid robot can
actively learn affordances and develop manipulation skills for dealing with objects, such as garbage bins.
Kroemer et al. proposed an active exploration method for robot parameter selection [65]. The authors
of this study approached the exploration and exploitation trade-off as an episodic reinforcement learning
problem. They conducted experiments to validate the effectiveness of their proposed method in the context
of robot grasping tasks involving diverse objects. Marvel et al. presented a model-assisted stochastic learning framework for automatic parameter optimization. The paper introduces a novel method that extends
conventional machine learning techniques to predict the performance of input parameter sequences. The
authors conducted experiments to validate the effectiveness of this approach in diverse mechanical assembly
operations using an industrial robot testbed. Kabir et al. [58] proposed a semi-supervised learning approach
to optimize operation parameters in a robotic cleaning application. Their method aimed to automate the
process of cleaning stain profiles on various surfaces using robotic arms, resulting in a reduced number of
cleaning experiments. The authors validated their approach through physical experiments conducted with
two industrial robots. In the paper [57], the authors described a method of identifying the optimal operation
and trajectory parameters in finishing applications with a small number of experiments. Their approach
was based on the uncertainty in the task performance surrogate models. They showed that the algorithm
converged to the optimal point while minimizing the number of physical experiments.
2.2.2 Parameter Estimation Using Bayesian Optimization
The Bayesian Optimization (BO) algorithm is primarily used for optimizing an unknown, black-box objective function that is expensive to evaluate. Because the objective function is expensive or time-consuming to evaluate, maintaining a balance between exploration and exploitation is crucial for the efficiency of the BO algorithm. Excessive exploration of new parameters may lead to increased costs, while prematurely stopping the exploration phase can result in undesirable or sub-optimal outcomes. Commonly used acquisition functions are probability of improvement, expected improvement, upper confidence bound, and Thompson sampling. The selection of an acquisition function depends on factors such as the characteristics of the objective function or uncertainty levels. Different acquisition functions capture different strategies for selecting the next parameter sets to evaluate, and users have the flexibility to select acquisition functions at different stages of optimization based on the progress of learning and the specific characteristics of the tasks. To guide the optimization process, BO utilizes an acquisition function that assigns a value to each candidate point, quantifying its potential. By carefully considering this acquisition function, BO intelligently selects which points to evaluate next to make optimal use of limited resources. BO utilizes a probabilistic surrogate model, most commonly Gaussian Process Regression (GPR). BO algorithms provide a powerful framework for efficiently estimating and learning the parameter space in robotics and manufacturing domains. By leveraging BO algorithms, we can effectively optimize an objective function. This enables robotic processing to avoid unnecessary movements that incur additional costs and time.
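As an illustration, the following is a minimal sketch of a BO loop with an expected improvement acquisition over a GPR surrogate, using Python and scikit-learn. The process_cost objective, the bounds, and the iteration budget are hypothetical placeholders, not any of the cited implementations.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def process_cost(x):
    # Hypothetical expensive objective, e.g. negative cycle time.
    return -np.sum((x - 0.4) ** 2)

rng = np.random.default_rng(1)
bounds = np.array([[0.0, 1.0], [0.0, 1.0]])
X = rng.uniform(bounds[:, 0], bounds[:, 1], (6, 2))    # initial experiments
y = np.array([process_cost(x) for x in X])

def expected_improvement(cand, gp, y_best, xi=0.01):
    # EI for maximization: expected gain over the current best value.
    mu, std = gp.predict(cand, return_std=True)
    std = np.maximum(std, 1e-9)                        # avoid division by zero
    z = (mu - y_best - xi) / std
    return (mu - y_best - xi) * norm.cdf(z) + std * norm.pdf(z)

for _ in range(10):                                    # BO loop
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], (1000, 2))
    x_next = cand[np.argmax(expected_improvement(cand, gp, y.max()))]
    X, y = np.vstack([X, x_next]), np.append(y, process_cost(x_next))

The xi term trades exploration against exploitation: larger values push the search toward uncertain regions, while smaller values exploit the current best estimate.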
In the paper [21], a GPR-surrogated BO algorithm is used to optimize the process parameters for a torque converter assembly process performed by robots. To keep the right balance between exploration and exploitation, the cycle time of the process is optimized. Hong et al. applied BO to learn assembly control parameters for complicated robotic assembly processes [51]. Compared to traditional methods using DOE, the total number of experiments could be reduced using their proposed method. Cheng et al. used a BO algorithm in robotic force-controlled assembly processes; the authors compared the efficiency and accuracy of the processes with DOE methods [25]. Dong et al. proposed a method to tune real-time welding parameters to achieve the desired welding quality in arc welding processes [40]. The approach was applied to gas tungsten arc welding experiments, and the proposed model, built from the data acquired during the experiments, was able to predict proper weld bead geometry. Kabir et al. proposed a systematic approach to minimize the physical experiments of a robotic cleaning application; the proposed algorithm was benchmarked against other optimization methods on mathematical problems [59]. The authors achieved a reduction in the total number of cleaning experiments using a semi-supervised learning approach in their research [58]. Langsfeld et al. [68] showed an approach for online learning in the automatic robotic cleaning of deformable objects. Wu et al. developed online robotic assembly parameter optimization based on an orthogonal-exploration, GPR-surrogated BO algorithm [132]. In the paper [114], the authors proposed a way to efficiently search for the optimal set of process parameters for an application with industrial collaborative robots. The authors utilized the Wilson score to make the right estimation of the success rate in sparsely sampled regions.
The BO algorithm can also be employed in robotics applications beyond manufacturing to identify optimal task parameters. Learning robot gaits under uncertainty is one popular example [103]. Lizotte et al. demonstrated a GP-based BO algorithm on a quadruped robot to optimize speed and smoothness [32]. Robotic cooking is another example of using the BO algorithm to improve the performance of a process [56]; the authors used a batch BO algorithm because the cost of the cooking process was high while the cost of the evaluation was relatively cheap. There are more studies regarding batch BO in terms of using adaptive local search [73] or improving the speed of computation [27]. The paper [104] presented the use of a BO method for online sensing and planning of a visually guided mobile robot; the proposed approach enabled a dynamic trade-off between exploration and exploitation. Pawar and Rao proposed parameter optimization using teaching-learning-based optimization for the abrasive water jet machining process. They compared the proposed algorithm with other optimization methods, including the genetic algorithm, simulated annealing, particle swarm optimization, harmony search, and the artificial bee colony algorithm [96]. Wan et al. utilized a GPR-based deformation compensation method for optimal path planning and control of assembly robots [121]; they used hard-to-measure, easily deformed assemblies for the experiments. Desautels et al. proposed an algorithm that could select batches of experiments to run in parallel using the GP batch upper confidence bound [33]. Contal et al. proposed an algorithm using the GP upper confidence bound and pure exploration [29]. The proposed algorithm combined the upper confidence bound method and pure exploration in the same batch of evaluations with parallel iterations. They evaluated the proposed algorithm on real and synthetic problems to confirm its efficiency.
2.2.3 Parameter Estimation Using Design of Experiments Methods
Parameter estimation and optimization using DOE methods is a traditional yet popular approach. DOE is a
systematic approach to acquiring data and analyzing the relationships between input parameters and output
parameters. DOE involves carefully pre-designing experiments to acquire data points at specific combinations
of input parameter values. These meticulously planned experiments provide the foundation for constructing
statistical models that inherently encapsulate the insights derived from the data. The statistical models,
which were created based on the data from the carefully designed experiments, are used to analyze the
impact of each parameter on task performance. In manufacturing, it helps to uncover the relationship
between different input process parameters and the resulting task performance outcomes. The types of
DOEs include full factorial design, fractional factorial design, and the Taguchi method. Full factorial design encompasses all possible combinations of input parameters, while fractional factorial design investigates a subset of the possible combinations. The Taguchi method is robust even when there are variations in the data. The choice of design depends on the complexity of the given task, the required level of accuracy, and the available resources. The formulation of these statistical models can be done through techniques such as regression analysis or response surface methodology (RSM). Regression analysis finds the relationship between a dependent variable and independent variables via linear or non-linear regression models. RSM creates a response surface, or so-called contour plot, to visualize the impact of variables on the response. Also, analysis of variance (ANOVA) can be used to construct accurate models. ANOVA plays an important role in understanding whether there are statistically significant differences among various groups. The literature reviews below present how the DOE approach can be used to determine parameters in various manufacturing applications.
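For illustration, a full factorial design and a first-order regression fit can be sketched in a few lines of Python; the parameter names, levels, and placeholder responses below are hypothetical.

import numpy as np
from itertools import product

# Full factorial design over three parameters at the listed levels;
# the parameter names and level values are illustrative only.
levels = {
    "speed_mm_s": [20, 40, 60],
    "temp_C":     [190, 210],
    "layer_mm":   [0.2, 0.3],
}
design = np.array(list(product(*levels.values())))     # 3 x 2 x 2 = 12 runs

# After running the 12 experiments, fit a first-order regression model
# y = b0 + b1*x1 + b2*x2 + b3*x3 via least squares.
y = np.random.default_rng(2).normal(size=len(design))  # placeholder responses
A = np.hstack([np.ones((len(design), 1)), design])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

A fractional factorial design would simply select a structured subset of these 12 rows, trading the ability to estimate some interactions for fewer experiments.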
Mohamed et al. [85] used RSM based on Q-optimal design to find the optimal process parameters in fused deposition modeling (FDM). They ran the experiments based on the Q-optimal design and formulated mathematical models for the relationships between input and output variables. They used multi-objective optimization to find the optimum setting within the experiment ranges and showed that their method was very effective for process optimization with multiple variables and responses. Alafaghani et al. [3] investigated the impact of FDM processing parameters on part dimensional accuracy and tensile mechanical properties. They used Taguchi's L9 DOE method and ran physical experiments. They concluded that there was a trade-off between mechanical properties and dimensional accuracy. A survey paper [34] investigated existing literature on FDM process parameter optimization and the influence of parameters on part characteristics. Many researchers used the Taguchi method combined with statistical analyses, such as signal-to-noise ratio and ANOVA, so they could determine the important parameters that primarily affected performance and their interaction with part properties.
2.2.4 Parameter Estimation Using Meta-heuristic Approaches
Process parameters can be estimated, and appropriate parameter values identified, with the use of meta-heuristic algorithms. Meta-heuristic algorithms are optimization algorithms inspired by natural processes and can be used for solving complex, nonlinear optimization problems. Meta-heuristic approaches offer a robust and flexible approach to optimization, capable of navigating complex parameter spaces. By using meta-heuristics, it becomes possible to iteratively explore a process parameter space and determine optimal or near-optimal values. Commonly used meta-heuristic algorithms for parameter estimation and learning include the following: the Genetic Algorithm (GA), which is inspired by the process of natural selection and evolution; ant colony optimization, which is inspired by the foraging behavior of real ants; particle swarm optimization, in which particles in the solution space adjust their positions and velocities according to their own experiences and the experiences of their neighbors; and simulated annealing, which begins with an initial solution and iteratively explores nearby solutions, permitting uphill moves to strategically avoid local optima during the search process. Meta-heuristic approaches can be effectively employed for parameter estimation in manufacturing by adapting their strategies to obtain the desired outcomes in manufacturing processes.
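As a concrete illustration of the last of these, the following is a minimal simulated annealing sketch in Python; the cost function and the tuning constants are hypothetical.

import math
import random

def simulated_annealing(cost, x0, step=0.1, t0=1.0, cooling=0.95, iters=200):
    # Minimize `cost` by iteratively exploring nearby solutions; uphill
    # moves are accepted with a temperature-dependent probability, which
    # helps the search escape local optima.
    x, fx, t = list(x0), cost(x0), t0
    for _ in range(iters):
        cand = [xi + random.gauss(0.0, step) for xi in x]
        fc = cost(cand)
        if fc < fx or random.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc          # accept (possibly uphill) move
        t *= cooling                  # cool the temperature
    return x, fx

# Hypothetical usage: tune two process parameters against a cost model.
best_x, best_cost = simulated_annealing(lambda p: (p[0] - 1) ** 2 + p[1] ** 2,
                                        x0=[0.0, 0.0])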
Marvel et al. developed a strategy leveraging GA for automated learning in the context of robotic
assembly tasks [80, 81]. Similarly, an approach aimed at reducing the frame vibration of a delta robot in pick-and-place applications was proposed by Cheng and Li; they achieved this by optimizing the acceleration profile of the robot [26]. In another work, the study [128] elaborated on the enhancement of
robotic assembly performance through autonomous exploration. In a comparison between GA and response
surface methodology in the context of gas metal arc welding application process optimization, Correia et al.
discovered interesting findings [30]. Unlike response surface methodology, which generated input and output
models using a selected experimental design, GA predicted subsequent experiments based on prior processes,
bypassing the need for model generation. Both approaches, however, were able to reach optimum conditions
with a minimal number of experiments. In another study [70], the authors introduced an approach enabling
a robot to generate trajectories for task execution with limited physical experimentation. The effectiveness
of the algorithm was validated using a robot executing a dynamic fluid pouring task.
2.3 Spatially Varying Process Parameters Learning
In the context of manufacturing processing applications, certain applications may require models with spatially varying process parameters. For instance, consider contact-based surface finishing of a part
that exhibits spatially varying stiffness. It is common for a part to possess variations in stiffness across
different regions of the part. The variations in stiffness can be caused by design considerations, material
properties, or other factors. In this case, different regions of the part may require the use of different
process parameters to achieve optimal results. Using constant process parameters might lead to inefficient
execution or task failure. For instance, if one region of the part is more compliant than another, the amount
of force applied or other parameters may need to be adjusted accordingly to avoid constraint violations.
Hence, finishing the part with desired outcomes, such as desired surface quality and deflection amount,
requires tailored approaches by applying varying process parameters in each region. Another example is
when machining parts with composite materials. The process parameters may need to be spatially adjusted
and used due to the anisotropic properties of the material. Anisotropic materials have different mechanical
properties in different directions. The optimal parameters, such as cutting speed, depth of cut, or feed rate
may vary depending on the orientation of the fibers at each region in the part.
When learning spatially varying parameter models, it is important to determine the order of regions
to learn and execute the task. The sequence of regions to learn can significantly impact the efficiency and
effectiveness of the learning process because acquired knowledge from one region can be leveraged to facilitate
learning in subsequent regions. Careful consideration of the sequence of regions to learn can lead to precise
and efficient estimation of spatially varying parameter models. Experimental design for learning spatially varying parameter models should consider both the sequence of regions and the parameter values at which to perform experiments in each region.
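One simple heuristic for choosing that sequence, not taken from the cited studies, is sketched below in Python: order regions greedily by similarity of their descriptors so that each newly learned model starts from the closest already-learned region. The per-region features (e.g., estimated stiffness and curvature) are hypothetical.

import numpy as np

def order_regions(features):
    # Greedy ordering: start from the most "typical" region, then always
    # visit the unvisited region most similar to the last one learned,
    # so that acquired knowledge transfers between neighboring models.
    features = np.asarray(features, dtype=float)
    remaining = set(range(len(features)))
    current = int(np.argmin(np.abs(features - features.mean(axis=0)).sum(axis=1)))
    order = [current]
    remaining.remove(current)
    while remaining:
        nxt = min(remaining,
                  key=lambda r: np.linalg.norm(features[r] - features[current]))
        order.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return order

# Hypothetical usage: four regions described by (stiffness, curvature).
print(order_regions([[1.0, 0.2], [5.0, 0.1], [1.2, 0.25], [4.5, 0.15]]))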
Several studies have delved into the subject of spatially varying process parameters in manufacturing.
Langsfeld et al. explored the robotic finishing of intricate interior regions within components with complex
geometries [67]. The authors developed an algorithm that incorporated online learning of part deformation
models, enabling effective adaptation to potential deformations during the finishing process. The authors
validated their algorithm through a robotic cleaning application involving compliant parts. Notably, the
algorithm demonstrated the ability to perform cleaning or finishing tasks on compliant parts while mitigating
the risks of part cracks or permanent damage, ensuring a safe finishing result. In the study [69], the authors
introduced an approach to estimate the stiffness characteristics of elastically deformable parts during the
robotic cleaning process. Their objective was to minimize the overall cleaning time by optimizing the selection
of grasping positions on the part. To achieve this, a finite-element analysis was employed to approximate
the part’s deformation, providing valuable insights for determining optimal grasping points and reducing
cleaning time.
2.4 Temporally Varying Process Parameters Learning
Temporally varying process parameter models involve the temporal adjustment of parameters over time
during the learning and execution of the tasks. Numerous applications necessitate such temporal adjustment
to accommodate changing conditions. For example, in manufacturing processes, various process parameters
like force, temperature, feed rate, tool orientation, or velocity may need to be dynamically adjusted over
time to adapt to variations occurring in the part or system dynamics. Failing to appropriately adjust the
process parameters as the system dynamics change can result in task execution failure, poor performance,
or inefficient task completion. Managing process parameters in these manufacturing tasks becomes crucial to ensure satisfactory task performance as well as to optimize it. The importance of temporal
adjustment of process parameters is highlighted in a study by Nassif et al. [90] which emphasizes the need
to consider and effectively manage process parameters in order to achieve desired task outcomes.
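One simple way to make a learned model time-aware, shown here only as an illustrative sketch, is to treat elapsed time as an extra input dimension of a GPR surrogate; the parameter columns and response values below are hypothetical.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Treat elapsed time as an extra input dimension so the learned model can
# capture drift, e.g. ink viscosity rising as it dries. Columns here are
# hypothetical: [extrusion_pressure, nozzle_speed, elapsed_time_s].
X = np.array([[80, 10,   0], [80, 10,  60], [80, 10, 120],
              [95, 12,   0], [95, 12,  60], [95, 12, 120]], dtype=float)
y = np.array([0.52, 0.48, 0.41, 0.61, 0.55, 0.47])   # e.g. line width in mm

gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
# Query the model at a later time to decide how to re-tune the pressure.
mean, std = gp.predict(np.array([[80, 10, 180]]), return_std=True)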
Researchers have dedicated efforts to address this aspect and have conducted studies focusing on temporal
adjustment techniques in manufacturing tasks. These investigations aim to develop strategies for the effective
adaptation of process parameters to accommodate the temporal changes in the manufacturing process. Hu et al. expanded the conventional GPR technique into a powerful method for regression in time-varying manufacturing systems with unknown shifts and drifts [52]. A suboptimal GPR approach was suggested
to balance the accuracy of the estimation and the computational efficiency. The chemical vapor deposition
process was used as a case study with the time-varying parameters, temperature and layer thickness. Hao
et al. [48] and Zhang et al. [149] investigated the time-varying deformation and dynamic characteristics of
the workpiece. The method known as the curved surface mapping based geometric representation model
enabled obtaining sufficient accuracy with a small amount of measurement data.
Another example can be direct ink writing (DIW) process. Direct ink writing is an additive manufacturing
process that is expanding the range of materials applicable within the realm of additive manufacturing [105].
DIW enables the creation of intricate geometries while minimizing material waste by precisely dispensing ink
only in the required areas, leading to more efficient material utilization [91]. However, its common problem
is ink drying during the process. Using sub-optimal process parameters can lead to low-quality outcomes and wasted ink. Given that the ink's chemical reactions cause it to dry over time, the process parameters must be adjusted in each time phase. For instance, Mantelli et al. enhanced the printability of a UV-curable
thermosetting composite ink by incorporating antisagging agents and modifying its rheological behavior [77].
The authors improved the ink’s flow properties and stability during the printing process. Tu et al. contributed
to the field by developing a process parameters optimization method for quality control in DIW processes
[118]. In their work, they divided the parameters related to printing a line and the parameters associated
with transitioning between lines and layers into two distinct groups. Subsequently, they individually adjusted
the parameters within each group to optimize the overall printing process. Aboutaleb et al. introduced a
novel technique that could reduce the number of optimization experiments by using experimental data
from previous studies [1]. By utilizing this methodology, the optimization process for laser-based additive
manufacturing was accelerated, allowing for more efficient parameter tuning and improved process outcomes.
2.5 Safe Parameter Learning and Exploration
Ensuring safe exploration is key to preventing irreversible harm or catastrophic incidents in various applications. These include, but are not limited to, robotics, manufacturing processes, medical procedures,
combustion engines, and gas turbines. Each of these systems operates within certain safety constraints or
thresholds. Violations of these boundaries can inflict significant damage to the systems. The objective is to
explore and learn the parameter space, all while staying within the safety threshold. The algorithm should
find the right balance between prioritizing safety and maintaining efficiency.
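A common building block for such algorithms is to restrict exploration to parameters whose constraint value is predicted to stay within the threshold with high confidence. The Python sketch below illustrates this idea with a GPR constraint surrogate; the force model, data, and threshold are hypothetical, and this is a simplification of the safe-optimization methods discussed next.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def safe_candidates(cand, gp_constraint, threshold, beta=2.0):
    # Keep only candidates whose constraint value (e.g. applied force) is
    # predicted to stay below `threshold` with high confidence: the
    # surrogate's upper confidence bound mu + beta*std must not cross it.
    mu, std = gp_constraint.predict(cand, return_std=True)
    return cand[mu + beta * std < threshold]

# Hypothetical usage: gp_c models measured force as a function of parameters.
rng = np.random.default_rng(3)
X_c = rng.uniform(0, 1, (8, 2))
force = 5.0 * X_c.sum(axis=1) + rng.normal(0, 0.1, 8)   # placeholder data
gp_c = GaussianProcessRegressor(normalize_y=True).fit(X_c, force)
cand = rng.uniform(0, 1, (500, 2))
safe = safe_candidates(cand, gp_c, threshold=6.0)        # explore only these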
Numerous studies have delved into the topic of safe exploration for learning. For instance, Schreiter et al.
introduced an innovative strategy for safe exploration in active learning [106]. The proposed algorithm could
learn regression models while ensuring that critical and unsafe regions were avoided. The authors validated
their algorithm with a one-dimensional toy example and an inverse pendulum policy search problem. Sukhija et al. [113] presented a model-free algorithm capable of a global search for optimal policies. This algorithm
was designed to facilitate safe and optimal exploration within high-dimensional systems. Their algorithm
could be applied to safety-critical, real-world dynamic systems. The use of a pure BO algorithm may lead
to unsafe parameters during the optimization process. Consequently, there have been research endeavors
focused on learning parameters while ensuring safe exploration within the parameter space. In this context,
Sui et al. proposed a comprehensive framework for an effective and safe BO algorithm that could be applied
to a range of applications, including medical treatments and safe robotic control [112]. The algorithm could
separate the expansion of the safe region and the maximization of the utility function into two distinct stages. The authors compared their algorithm with traditional BO and demonstrated that the algorithm could achieve theoretical guarantees of both safety and optimality. Berkenkamp et al. [8] presented
a generalized algorithm that enabled both safe and automatic parameter tuning in the field of robotics.
The proposed algorithm maximized task performance while only exploring parameters that were safe for all
constraints with high probabilities. The authors validated the effectiveness of the proposed algorithm using
the experimental setup consisting of a quadrotor vehicle. Marco et al. introduced a framework for robot
learning with crash constraints, specifically addressing scenarios where the threshold for crash detection was
not known beforehand [78]. The authors validated their framework through experiments conducted on a
real jumping quadruped, demonstrating its effectiveness in handling crash constraints during the learning
process. The proposed algorithm described in the study by Bharadhwaj et al. [9] demonstrated the ability to achieve competitive task performance while effectively preventing catastrophic failures during the training process.
Chapter 3
Background: Planning, Control, and Calibration Foundations for
Robotic Processing Applications
3.1 Overview
This background chapter provides essential insights into planning, control, and calibration for robotic processing applications. The algorithms developed played a crucial role in generating smooth, collision-free
trajectories and enabling successful robotic executions for manufacturing tasks. First, planning involves tool
path planning and robot trajectory planning. The sequences of operations and tool movements will be determined during the planning stage. Through planning, the robot can create smooth and collision-free motions
to complete the given manufacturing tasks. Collision avoidance should be considered during the planning stage to ensure there are no collisions between the robotic system and its environment. Control is crucial to regulate the
robot’s movements to ensure the accuracy, precision, and quality of the task. With velocity control of the
tool, the robot’s tool can follow a path accurately while maintaining the desired tool velocity. Calibration
of the robotic system is an essential step so that robotic processing can be performed with high accuracy as
planned. Both robots and tools should be calibrated prior to commencing the task. Without calibration,
the robotic system may experience offset from the planned trajectory or produce poor performance during
task executions.
The discussion of these topics (planning, control, and calibration) is critical for any robotic processing
application that requires the robot to manipulate a tool over the entire surface or area of a part, as exemplified
by tasks like robotic sanding, spray painting, or additive manufacturing (AM). This chapter mainly discusses
robotic AM processes as examples based on the previously published papers [108, 138]. Nevertheless, the
applicability of these concepts extends beyond just robotic AM processes [13]. Fundamental principles related
to planning, control, and calibration are transferable and can be used across various robotic processing
applications.
3.2 Tool Path Planning
Tool path planning is important in manufacturing tasks where a robot operates a tool on the surface of
a part [62, 76, 98]. Tool path planning generates the optimal or near-optimal sequences of points that the
tool should follow during the task [61]. By determining the appropriate tool path, the robot can effectively
move the tool along the surface, ensuring that the tool reaches all the required locations and executes the
necessary operations. Different manufacturing processes and applications require different considerations
when designing the tool path. For instance, the tool path generation for the 3D printing process requires
precise, smooth, collision-free paths [12]. The 3D printing process is delicate and often uses nozzles of small diameter. When the tool path generates sequential points to build the part layer by layer, we should ensure there are no gaps between printed lines or excessive accumulation of material on the surface.
We discuss two case studies for tool path generation. Section 3.2.1 describes the case study for depositing materials onto non-planar layers. Section 3.2.2 presents the tool path generation for a process involving the insertion of a prefabricated component during 3D printing. In summary, tool path planning is crucial in guiding the tool's movements during manufacturing tasks. It involves generating optimal
sequences of points, tailored to the specific process requirements, to enable accurate execution. Meticulous
tool path planning ensures the successful completion of the task while maintaining the desired performance
and precision.
3.2.1 Tool Paths for Covering Areas on Non-planar Layers
AM technologies have been widely used to fabricate 3D objects quickly and cost-effectively. However, building
parts consisting of complex geometries with curvatures can be a challenging process for the traditional AM
system whose capability is restricted to planar layered printing. In the traditional AM process, independent
X and Y drives are used to move the deposition head in the horizontal plane to create a layer. The Z
drive moves the deposition head up and down to create new layers. However, constructing a part using
only planar horizontal layers limits the capabilities. To overcome this limitation, conformal AM could be
performed where the material is deposited on complex non-planar layers. Using non-planar layers brings many benefits. For instance, many composite parts have thin three-dimensional shell structures because of
weight constraints. Achieving the right fiber orientation is critical to the proper functioning of these parts.
The conventional planar layer material deposition often leads to undesirable fiber orientation. The capability
to deposit the material along non-planar layers could achieve the right orientations while completing the part
with high quality. Another benefit is the better surface quality and the reduction in overall fabrication time
in printing. Many part geometries require the use of a particular build direction to minimize the staircase
effect in a conventional planar layer based process. This, in turn, may lead to a very time-consuming
process. Non-planar layers present more options for minimizing the staircase effect. Subsequently, printing time savings can be achieved by using fewer layers, and the need for post-processing is reduced by minimizing the staircase effect.
In this section, the main idea is to generate a tool path that covers the area on non-planar layers. It
requires generating the sequences of positions and tool inclination angles of the tool center point (TCP) to
form hatching lines covering the entire surface. The input for the tool path generation method is an STL
file and the output is a nominal tool path in the form of an ordered set of tool configurations. The rotation
of the tool about its axis is not set while computing the nominal tool path. Figure 3.1 describes examples
of AM applications where the tool paths should cover the entire area on non-planar layers.
STL files of non-planar surfaces include the set of unit normal vectors and information on vertices of each
triangle of a tessellated surface in 3D Cartesian coordinates. This information is useful for planar projection
mapping between the reference 2D plane and non-planar surface.

Figure 3.1: The robotic system for conformal 3D printing. The left setup consists of a Yaskawa Motoman GP12 manipulator with a Bowden extruder; the right setup consists of an ABB IRB 120.

As the layer thickness is given as user input, the number of layers is calculated using the total thickness of the STL file. Since the input STL
file is a shell model, we identify the bottom surface of the STL file and compute the tool path along the
bottom layer. For successive layers, we offset the Z-axis coordinates of previous layers by layer thickness and
compute the next non-planar layer with the desired hatching angle.
The 2D reference plane includes the uniform grid point data to be mapped onto the non-planar surface. Since each triangle of a tessellated surface is a discrete two-dimensional plane, grid points are projected onto the triangular region of each plane in order to obtain a uniform point cloud on the projected surface. After generating the point cloud, the continuous tool path is generated over the non-planar surface with a uniform spacing that matches the thickness of the filament to be extruded.
The orientation of the FDM extruder corresponding to the position of the extruder is calculated in the
form of Euler angles. The default setting for the extruder orientation can be normal to the surface but there
is a risk that the extruder tip or heating block may hit the mold while printing on a concave surface. To
account for this, the extruder tip is not always aligned with the surface normal. This prevents the extruder from colliding and creates a smooth surface finish, as the change in joint angles is reduced. The exact
TCP orientation is calculated by computing the trajectory in the joint configuration space under necessary
constraints.
For generating the tool path, various hatching patterns such as zigzag, raster, contour, or spiral are
available. For our work, a zigzag pattern is used. The hatching lines of this pattern can have any hatching
direction for printing. It is useful to have different hatching directions for consecutive layers, as stacking layers
with varying fiber orientation will give improved part strength. In the method of generating a tool path,
users can select the hatching angle of the first layer with respect to the X-axis of the part coordinate system.
Subsequent hatching angles for remaining layers can be either user-defined or automatically generated based
on the initial hatching angle and increment. The default increment for the hatching angle is kept at 90°.
Figure 3.2 shows the zigzag hatching pattern with different hatching angles. Figure 3.3 shows the complete
tool path generation of all layers from the STL file. The hatching angle of each layer varies.
Figure 3.2: (a) Hatching along 20° slope, (b) hatching along 40° slope, (c) hatching along 60° slope, (d) hatching along 90° slope
To create a tool path on a non-planar surface, we use a projection method instead of a slicing algorithm
that transforms non-planar layers into planar layers. From the STL file, information about vertices, faces
and unit normal vectors of triangles is extracted and stored in arrays [v], [f], and [n], respectively.

Figure 3.3: Tool path generation of non-planar layers with varying hatching angles

Based on the minimum and maximum values of the X and Y coordinates in [v], grid point data P_grid are generated over
the reference XY plane. It is better to have a grid size bigger than the part size, hence some offset is given to the X and Y coordinates (the range of x goes from min(x) − d to max(x) + d and the range of y goes from min(y) − d to max(y) + d, where d is the offset parameter). The grid spacing is determined based on the layer thickness and nozzle diameter inputs. This grid is then rotated about the Z-axis, through the centroid of the grid points, by the given hatching angle. To identify the bottom surface, only normals with negative k̂ components are stored as [n_bottom] from [n]. The vertices associated with [n_bottom] are projected onto the XY plane, so the entire bottom surface is projected onto the reference plane. The surface, as mentioned before, has a triangular tessellation. All projected triangles store the grid points that lie inside them, and those grid points are stored in P_in. These grid points are then projected back onto the original 3D triangles of the non-planar surface. There is an assumption that the point projections from the bottom surface to the reference plane and back to the conformal surface form a 1:1 mapping. Since any plane needs a minimum of three points to define its equation, the vertices of each triangle generate the equation of the plane, and the corresponding Z-coordinate value for all grid points inside the triangle is obtained. The process is repeated for all triangles, and the uniform grid point data P_in,projected are obtained along the conformal surface. In P_in,projected, information on the unit normals associated with the STL triangles is stored. This information is useful in calculating the Euler angles for the TCP orientation. As each point in P_in,projected belongs to a unique triangle of the bottom surface of the STL file, each point is assigned the unit normal of the triangle it belongs to. One way of computing the Euler angle rotations is R_x = −tan⁻¹(n_y/n_z) and R_y = tan⁻¹(n_x/n_z), where n_x, n_y, n_z are the components of the unit normal from the STL file. Another way of computing the Euler
angles is by calculating unit direction vectors [bx, by, bz] along the X, Y, and Z-axis of the tool frame. The
Z-axis direction vector bz is along the unit normal. Depending upon the hatching direction (i.e. the direction
of tool motion), the other two direction vectors are calculated. One vector is along the hatching direction
while the last vector is orthonormal to the other two vectors.
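The projection and orientation computations described above can be summarized in the short Python sketch below. It is a simplified illustration rather than the exact implementation used in this work; the triangle vertices and hatching direction are hypothetical.

import numpy as np

def project_to_triangle_plane(p_xy, v0, v1, v2):
    # Lift a 2D grid point onto the plane of an STL triangle: the Z value
    # follows from the plane equation n . (p - v0) = 0.
    n = np.cross(v1 - v0, v2 - v0)            # (non-unit) triangle normal
    x, y = p_xy
    z = v0[2] - (n[0] * (x - v0[0]) + n[1] * (y - v0[1])) / n[2]
    return np.array([x, y, z])

def tool_euler_angles(unit_normal):
    # Euler rotations matching R_x = -tan^-1(n_y/n_z), R_y = tan^-1(n_x/n_z)
    # from the text, aligning the tool Z-axis with the surface normal.
    nx, ny, nz = unit_normal
    return -np.arctan(ny / nz), np.arctan(nx / nz)

def tool_frame(unit_normal, hatch_dir):
    # Alternative formulation: build orthonormal direction vectors
    # [bx, by, bz] with bz along the normal and bx along the hatching
    # direction projected onto the surface.
    bz = unit_normal / np.linalg.norm(unit_normal)
    bx = hatch_dir - np.dot(hatch_dir, bz) * bz
    bx /= np.linalg.norm(bx)
    by = np.cross(bz, bx)                     # completes a right-handed frame
    return bx, by, bz

# Hypothetical usage with one tessellation triangle.
v0, v1, v2 = (np.array([0., 0., 1.]), np.array([1., 0., 1.5]),
              np.array([0., 1., 1.2]))
print(project_to_triangle_plane((0.3, 0.3), v0, v1, v2))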
Algorithm 1 Path Generation from STL File
START
Input: STL file, hatching distance, hatching angle, and layer thickness
V = {v_1, ..., v_k}, where v_j : (v_xj, v_yj, v_zj) // vertices from STL file
F = {f_1, ..., f_m}, where f_i : (v_j1_i, v_j2_i, v_j3_i) // faces from STL file
N = {n_1, ..., n_m}, where n_i : (n_xi, n_yi, n_zi) // unit normals from STL file
Generate uniform grid points P_grid within the bounds x : {min(x) − d, max(x) + d}, y : {min(y) − d, max(y) + d} on the reference plane // d is a positive variable used to make the grid bigger than the projected part size
P_grid_r ← apply Z-axis rotation to P_grid about its centroid by the input hatching angle
// Identify the bottom surface
for all i = 1 to m do
    if n_zi < 0 then
        store {V_bottom, F_bottom, N_bottom} ← {v_j ∈ f_i, f_i, n_i}
    end if
end for
// Now V_bottom = {v_bottom_1, ..., v_bottom_k}, F_bottom = {f_bottom_1, ..., f_bottom_m}, and N_bottom = {n_bottom_1, ..., n_bottom_m}
for all i = 1 to m do
    identify the vertices v_bottom that belong to f_bottom_i // to get the vertices of each triangle
    v_bottom_z ← 0 // project v_bottom onto the XY plane
    find all P_in ∈ P_grid_r in the triangular region of f_bottom_i
    for all j = 1 to size of P_in do
        project P_in_j onto f_bottom_i to get P_in,projected_j
        rotation about the X-axis: R_x_j ← −tan⁻¹(n_bottom_yj / n_bottom_zj)
        rotation about the Y-axis: R_y_j ← tan⁻¹(n_bottom_xj / n_bottom_zj)
        store {P_in,projected, R_x, R_y}
    end for
end for
// Generate the tool path with a zigzag hatching pattern
P_in_r,projected ← apply Z-axis rotation to P_in,projected about its centroid by the negative of the input hatching angle
for all i = 1 to size of P_in_r,projected do
    PathPoints ← group sorted P_in_r,projected_i along lines parallel to the X-axis
    if i is EVEN then
        flip P_in_r,projected_i and append to PathPoints // to create the zigzag pattern
    end if
end for
Apply Z-axis rotation to PathPoints about its centroid by the input hatching angle
END
Grid point data P_in,projected are then rotated about the Z-axis through their centroid by the negative of the input hatching angle. Based on the hatching direction, the projected points are grouped into parallel hatching lines, and every alternate line array is flipped to obtain the zigzag pattern for printing. Once the hatching pattern is generated, the points are rotated back about the centroid by the hatching angle, which generates the tool path with the desired hatching direction. This process is repeated for each layer.

Figure 3.4: The execution of the AM process on the non-planar surfaces.

Algorithm 1 presents
the formation of the tool path for a single layer. The hatching lines of the tool path are converted into
individual path segments and the continuous trajectory is generated on each of the segment independently.
Using the algorithm, building paths were successfully generated on surfaces with complex geometries. The result is a zigzag path generated on the non-planar layer using the projection method. The proposed projection-based algorithm is an efficient way to create various hatching patterns along curved surfaces without slicing a 3D object. Users are able to select a number of parameters, including hatching pattern, hatching direction, and grid size. Figure 3.4 shows the printing process of specimens on non-planar surfaces. An ABB IRB 120 robot manipulator is used for processing. The material used was black polylactic acid (PLA) filament with a diameter of 1.75 mm. The specimens were successfully fabricated using conformal printing along non-planar surfaces. The generated surfaces exhibit satisfactory texture without gaps or excessive filament. The variety in surface curvature and size of the specimens shows the capability and reliability of the proposed algorithm and printing process.
Specimens of six different sizes and shapes were fabricated after implementing the tool path and trajectory planning algorithm on the robotic additive manufacturing system (Figure 3.5). Tool velocity control is
performed on curved surfaces. Calibrations of robots and tools were performed when printing a new part to
ensure accurate tool motion. The names, descriptions and nominal sizes of the specimens are shown below:
1. Specimen A : Curved beam (65 x 14 x 4mm)
2. Specimen B : Mini armor chest protector (96 x 70 x 3mm)
3. Specimen C : Wind turbine blade (178 x 127 x 3mm)
4. Specimen D : Scaled down car bonnet version 1 (152 x 130 x 2.5mm)
5. Specimen E : Scaled down car bonnet version 2 (230 x 178 x 2mm)
6. Specimen F : Octagonal dome shape (508 x 381 x 4mm)
Figure 3.5: 3D CAD models and physical models printed by the robotic 3D printing system: Specimens A to F
Specimen A was fabricated by using a conformal deposition pattern along the length of the beam (0°). Specimens B, C, D, E, and F were produced by using two conformal deposition patterns (0° and 90°), which means
the specimens were alternately printed along the length and width. The hatching spacing used for printing
is 0.5mm which matches the diameter of the deposited filament.
The surface quality of Specimens A through E was evaluated by using the surface roughness measuring instrument Mitutoyo SJ-410. For each specimen, five points were selected randomly and roughness values were measured along the hatching direction. The average roughness value for each specimen was calculated from all locations. The mean Ra value is 1.390 µm for Specimen A, 1.574 µm for Specimen B, 2.274 µm for Specimen C, 1.088 µm for Specimen D, and 1.798 µm for Specimen E. All roughness values, Ra, were around 2 µm or less, which shows the satisfactory performance of the non-planar 3D printing process in terms of surface quality.
Figure 3.6: (a) Surface finish comparison between Specimen A_planar (left) and Specimen A (right), (b) Specimen C_planar printed by the traditional 3D printer using a planar layered method, (c) enlarged pictures of Specimen C_planar (left) and Specimen C (right).
To evaluate differences in surface finish between non-planar layered prints and planar layered prints, specimens A_planar and C_planar were additionally fabricated using the commercial 3D printer Ultimaker 3 Extended. Specimen A_planar was printed from the same STL file as specimen A. Both have the same geometry and size; however, notable differences in the surface finish were observed. As shown in Figure 3.6(a), specimen A_planar failed to achieve conformal printing and showed the staircase effect. Due to the staircase-like shape terminations along the curvature, its surface roughness was not measured. In contrast, specimen A showed a relatively smooth surface without the discrete layer terminations. A similar result was observed in specimen C_planar, as shown in Figure 3.6(b), which has the same wind turbine blade geometry as specimen C. Figure 3.6(c) illustrates the differences in surface texture between the non-conformal and conformal specimens. The printing patterns in specimen C_planar were neither uniform nor evenly spaced over the surface, whereas the pattern in specimen C was evenly spaced with a zigzag pattern. Also, the non-conformal specimen shows various distinct terminations and fails to achieve satisfactory surface smoothness.
3.2.2 Tool Paths for Pick and Place Operations
Figure 3.7: The robotic cell consisting of two 6 DOF robotic manipulators, one extruder, and a gripper
This section is about tool path generation for pick and place operations. The tool path generation
focuses on the efficient and precise movement of the tool to pick up an object from one location and place
it in another location. Depending on the tool used, the right orientation should be determined to pick or
grasp an object. The tool path generation should also consider collision avoidance to ensure the tool and
the operating system do not collide with any objects or obstacles during pick and place operations. The tool
path generation could involve minimizing the travel distance of the tool to reduce the time required for the
operations.
Our main application is the process of picking and placing an object during an extrusion-based AM process. As AM operations are being performed, the tool path generation explained in Section 3.2.1 is also used. The experimental setup is the following: two 6 DOF robotic arms with a gripper and a custom-built extruder. Figure 3.7 shows the robotic cell we designed and utilized for the experiments. The bigger robot, a Yaskawa Motoman GP12, is solely dedicated to material extrusion-based AM. The GP12 carries an extruder at its end effector; it is a three-nozzle extrusion system that can print parts with up to three nozzles at a time. We used one nozzle with a diameter of 1 mm. The other robot arm in the figure, a Yaskawa GP8, is committed to embedding prefabricated components into the 3D printed parts. We designed and fabricated an electromagnetic gripper for the embedding process. This gripper can perform pick and place operations with up to 100 N of holding and lifting force.
The process of the pick and place operations is described as shown in Figure 3.8. The figure illustrates the
pick and place process of the servo motor. Figure 3.8(a) shows that the electromagnetic gripper is in active
mode and Yaskawa GP8 picks up the servo motor. Then the robot brings the servo close to the previously
printed part using a collision-free tool path. During this time, Yaskawa GP12 is in wait mode and does not
move the tool. Figure 3.8(b) shows the moment the electromagnetic gripper implants the servo motor into
the part. The part is printed in a way that it has a shape that can accommodate the servo motor. After the
motor is placed, the electromagnet is switched to the deactivated mode and used as a guiding tool. Figure
3.8(c) shows that the gripper slides the servo motor into the part. We use tight clearances so that the servo motor can be securely positioned when the gripper applies force to slide it into the part.
Figure 3.8: Pick and place operations: (a) The electromagnetic gripper is in the activated mode and picks up the servo
motor, (b) The gripper inserts the servo motor into the yellow 3D printed part, (c) The gripper is in the deactivated
mode and applies forces to slide the servo motor into the printed part.
The tool path generation should consider both AM and pick and place operations. After the prefabricated components are picked and placed, the printing is resumed and the material is deposited on top of the
prefabricated component. In this case, a collision-free tool path should be considered as well. Combining the
AM process with the assembly process can improve the efficiency of the part fabrication process by saving
materials and reducing build times. There is no need to split the part design into several sub-parts. To
validate the proposed method, we have built three different parts with embedded prefabricated components.
In our previous study [107, 108], we have developed the tool path generation algorithms that decompose
a part into planar or non-planar layers and generate a sequence of points on the layers. For the planar layers,
the orientation vector is constant and normal to the flat surface. For non-planar layers, the normal vectors
vary, so the build orientation changes accordingly. The orientation is calculated at each point on the layer.
Performing the extrusion-based AM process with both planar and non-planar layers allows for maximizing
the quality of the surface finish and increases the efficiency of the building process. For example, the surfaces
of a part can be printed with curved layers so the surface can be smoothly printed without staircase effects.
Internal regions of the part can be printed with planar layers so the printing process can be quick and
simplified. In this work, we have applied the algorithms already developed in our previous work. Other
factors must be taken into consideration. One consideration is generating additional sequences of points in the middle of the printing process so that the main robot can pause execution when the supporting robot approaches the printing plate and embeds a prefabricated component into the 3D printed part. The
main robot should entirely pause the printing process and move its extruder further away from the printing
plate so there is no collision between the robots. After the prefabricated component is inserted and the
supporting robot moves away from the plate, the main robot reapproaches the part to resume the printing
process. The nozzle should deposit the material precisely, especially when printing onto the prefabricated
component. Even small deviations in the position and orientations of the print head can lead to a collision
between the nozzle tip and prefabricated components. Since the nozzle hot end is at a very high temperature of over 200 °C, the prefabricated components can be burned by a slight collision. The path generation around
the prefabricated component should take these factors into consideration, and users need to set sufficient
clearance between the nozzle tip and the prefabricated component.
Figure 3.9 shows the tool path generation for printing Part A with pick and place operations. There are Parts A, B, and C; their shapes, sizes, and CAD designs will be shown in a later section.

Figure 3.9: Tool path generated for Part A

A servo motor would be embedded into Part A. As shown in the figure, the bottom surfaces of
the part are generated with non-planar layers, which are the orange colored layers. The inner sections of
the parts are printed by using planar layers. The purple colored layers are printed before embedding the
prefabricated component. Once these sections are completed, the main robot is paused at the pausing point
as marked. The supporting robot embeds the servo motor into the void section in the purple layers. By
giving a tight clearance to the spaces between the purple layers and the motor, the motor can be firmly
inserted. The yellow colored layers are printed when the embedding process of the prefabricated component
is done. Finally, the top blue layers are printed with non-planar layers. The circled numbers in the figure
indicate the sequences of the printing operation and motor insertion.
Figure 3.10 illustrates another example of the tool path generation of Part B. The blue, orange, and
yellow layers are printed before inserting the camera. Once the camera is embedded, the non-planar layers
in purple color are directly deposited on top of the camera frame. Since the PLA deposited is at a high
temperature, it allows the filament to stick to the frame of the camera.
Figure 3.11 illustrates the entire operation process in the proposed robotic cell. Here, we call the layers
printed before embedding prefabricated components as P1, and the layers printed after the embedding process
as P2 for the sake of convenience. For instance, for Figure 3.9, the orange and purple layers are P1, and
the yellow and blue layers are P2. Also, for Figure 3.10, the blue, orange, and yellow layers are P1, and
the purple layer is P2. The execution starts with robot trajectory planning for given STL files. The robot
trajectory planning process includes the tool path generation for the part, tool orientation calculation, and
trajectory parameter selection. The tool path generation will return a set of points in sequence. The raster
pattern is used for the path generation. The tool orientation will be the set of Euler angles calculated for the given points.

Figure 3.10: Tool path generated for Part B

The points and orientations are calculated offline and are sent to the robot controllers through
TCP/IP communication.
The trajectory parameters refer to 3D printing parameters such as nozzle tip speed, material extrusion
rate, or print temperature. After the trajectory planning is completed, the extrusion-based AM process
begins with the main robot. The robot starts printing P1. If printing P1 fails, that means there is a
problem in the trajectory planning or printing process. The possible reasons for the errors in the trajectory
planning could be missing points, sparse path gaps, or a miscalculation of rotation angles. These can cause
part defects, such as gaps between filaments, extra material accumulations, or collisions between current
and previous layers. The execution process goes back to the previous step when printing fails. If P1 is
successfully printed, the subsequent execution begins, which is the robot position change. This execution
refers to the moment that one robot stops its task and moves away from the printing plate, while the other
robot comes close to the printing plate to start its task. The position change performed after P1 fabrication
means that the main robot moves away from the printing plate while the supporting robot approaches the
plate. After the robots move, the process of planting or embedding the prefabricated component begins.
The prefabricated component can be described as C. If C is embedded securely into P1, the position change
execution begins. This time, the main robot goes back to the printing platform to resume the printing
process. After P2 is successfully printed, the final part consisting of P1, P2, and C is produced. Before
performing any operation by the robots, the two robots and their tools should be calibrated to ensure that
the nozzle tip accurately follows the calculated tool paths. We assume that the calibration of the two robots and their tools is performed successfully.

Figure 3.11: Entire execution process of the proposed robotic cell

Figure 3.12: Printing process of Part A: (a) The bottom layers are printed with non-planar layers, (b) the inner layers are printed with planar layers, (c) after the motor is embedded, the rest of the infill is printed with planar layers, (d) the top layers of the part are conformally printed with the non-planar layers.
Figure 3.12(a) shows that the bottom layer is conformally printed with non-planar layers onto the support
structure. Figure 3.12(b) illustrates that the planar layers are printed on top of the curved layers. Then,
the servo motor is picked and embedded into the part by using the supporting robot. After pick and place
operations, the printing process is resumed, and the planar layers are printed on top of the previously printed
layers and the embedded servo motor as shown in Figure 3.12(c). Finally, Figure 3.12(d) shows that the top
surfaces are printed with non-planar layers to achieve the better surface finish. Top surfaces are covered by
two non-planar layers. It is observed that the first top layer printed shows visible stair shapes as a reflection
from the previous layers.
Figure 3.13 shows successful task completion of robotic AM with different prefabricated components. The
top figures show the initial CAD model designs of the three parts, and the bottom figures show the photos of
the printed parts. Each part has a prefabricated component that is embedded in the middle of the printing
process. These prefabricated components are a servo motor, camera, and strain gauge. By printing the
parts with a combination of planar and non-planar layers, the exterior surfaces of the parts can be printed smoothly without staircase effects.

Figure 3.13: Initial CAD model designs of Parts A, B, and C (top figures). The printed parts (bottom figures).

The interior sections of the parts, where the surface finish is less
critical, can be printed with planar layers. Part A is printed on top of the support materials, while Parts
B and C are printed without support materials. The dimensions of the parts (length x width x height) are: Part A is 81mm x 32mm x 37mm, Part B is 42mm x 26mm x 26mm, and Part C is 93mm x 90mm x 5mm. The following is the model information and dimensions of the prefabricated
part (length x width x height) used for pick and place operations. Part A: HiTech HS-646WP servo motor.
The dimension of the servo motor is 56mm x 46mm x 21mm. Part B: Fly cam One eco V2 camera. The
dimension of the camera is 24mm x 19mm x 20mm. Part C: Comidox BF12 high-precision resistance strain
Gauge foil. The dimension of the strain gauge is 6mm x 3mm x 0.1mm.
One interesting observation is that the wavy surfaces in Part C can be printed without any support structures underneath. When the non-planar layer is printed over another non-planar layer, the layer underneath
serves as the support. So, the waviness becomes more visible when there are more curved layers printed. A
total of three curved layers are printed. The layer thickness, which equals the nozzle diameter, is 1 mm for all the parts.
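The execution flow of Figure 3.11 can be summarized as a small state machine. The Python sketch below is an illustrative simplification; the four callables (plan, print_layers, swap_robots, embed) are hypothetical interfaces to the real robotic cell, and only the printing-failure paths from the figure are modeled.

from enum import Enum, auto

class Step(Enum):
    PLAN = auto()
    PRINT_P1 = auto()
    SWAP_TO_GRIPPER = auto()
    EMBED_C = auto()
    SWAP_TO_PRINTER = auto()
    PRINT_P2 = auto()
    DONE = auto()

def run_cell(plan, print_layers, swap_robots, embed):
    # Drive the P1 -> C -> P2 sequence; a failed print sends execution
    # back to trajectory planning, mirroring the flow in Figure 3.11.
    step = Step.PLAN
    while step is not Step.DONE:
        if step is Step.PLAN:
            plan()
            step = Step.PRINT_P1
        elif step is Step.PRINT_P1:
            step = Step.SWAP_TO_GRIPPER if print_layers("P1") else Step.PLAN
        elif step is Step.SWAP_TO_GRIPPER:
            swap_robots("gripper")      # main robot retracts, GP8 approaches
            step = Step.EMBED_C
        elif step is Step.EMBED_C:
            embed("C")                  # plant the prefabricated component
            step = Step.SWAP_TO_PRINTER
        elif step is Step.SWAP_TO_PRINTER:
            swap_robots("printer")      # GP8 retracts, main robot resumes
            step = Step.PRINT_P2
        elif step is Step.PRINT_P2:
            step = Step.DONE if print_layers("P2") else Step.PLAN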
3.3 Robot Trajectory Planning
After the tool path is generated, the robot’s tool needs to follow the tool path with high accuracy. This
is where robotic trajectory planning comes into play. Robotic trajectory planning determines the specific
path that the tool will traverse to precisely follow the generated tool path. Trajectory planning algorithms
consider the robot’s physical constraints, such as joint limits or velocity limits. Trajectory planning considers
various factors such as the kinematics of the robot, experimental settings, and task requirements.
We apply trajectory planning to the application for covering areas on non-planar layers. The robotic
AM system is used for the experimental setup. Trajectory planning for the AM process should maintain the
desired printing speed while following the smooth motions of the robot. The waypoints generated for the tool
path are in the part frame and do not consider the constraints and parameters imposed by executing the AM
process with an articulated robot arm. These waypoints include the tool position and nominal inclination
angle for the tool so that it travels over the surface in a consistent manner. The mapping between the tool
configuration and robot configuration is highly non-linear in nature and not one-to-one in most situations.
Determining joint space robot trajectories from the tool (e.g., extruder) configurations is a computationally
challenging problem and requires addressing many important factors related to robot capabilities.
The first important factor is robot reachability considerations. If the robot workspace is small compared
to the part being printed, then tool path execution may not be possible. Even if the robot is big enough
to access the entire region of the print base, there is a risk that some part of the robot or the extruder may collide with the print base. Hence collision avoidance is the second important factor. 6 DOF robots may have multiple inverse kinematic solutions; therefore, reachability of every point along the tool path does not mean that the resulting path will remain in contact with the desired surface (see Figure 3.14 for an illustration). We need to check tool path consistency. From Figure 3.14, it is clear that moving from Θ1 to Θ2 leads to a consistent path, while moving from Θ1 to Θ2′ does not. As the path
consistency directly affects the material deposition, smooth and continuous motion of the extruder over the
print base is required in order to maintain the quality of the printed part. The last factor for the execution of
the trajectory is to meet the TCP speed constraints. To maintain the uniform material deposition over the
print base, the desired extruder linear velocity should be maintained at all the time. This extruder velocity
is governed by the robot joint angle velocities. The mapping of the joint velocities to TCP velocity is highly
non-linear. TCP velocity near workspace singularities tends to be very small. Hence we need to make sure
that the robot is able to provide the desired TCP velocities.
Figure 3.14: Illustration of path consistency challenges when moving from P1 to P2. Robot configurations Θ2 and Θ2′ can both reach P2. Going from Θ1 to Θ2 leads to a consistent path. Going from Θ1 to Θ2′ leads to an inconsistent path.
Following are two general approaches to solve the trajectory planning problem of a 6 DOF robot arm moving along a curve in 3D Cartesian coordinates.
1. Solving inverse kinematics (IK) and performing graph-based search: For semi-constrained Cartesian trajectories in 3D space, a set of sample points S = {Pi : i = 1, 2, . . . , n} along the trajectory can be selected. Let Pi = (xi, yi, zi, Rx^i, Ry^i, Rz^i) ∈ R^6 be the corresponding position and orientation of the end effector at each sample point Pi. We can solve the inverse kinematic equations to find the set of possible joint values of the robot arm at a given Pi, Θi = (θ1^i, θ2^i, θ3^i, θ4^i, θ5^i, θ6^i). For one given configuration of the robot arm, there can be multiple joint solutions [31]. The next step is to design a graph model over the viable joint solutions and perform a graph-based search by connecting nodes in time. A node corresponds to a solution, so if we can find a connection between the previous node and the current node, [Θi−1 → Θi], we know that the robot can shift from the previous joint solution to the current solution without collision or singularity. So, a continuous connection between the nodes [Θi−1 → Θi] in the proposed graph model implies the existence of a feasible trajectory [Pi−1 → Pi]. When all possible connections of the nodes are explored, we can choose the one with the shortest trajectory or the one with minimal joint movement.
2. Solving the inverse Jacobian and integrating over the domain: Using the same notation as in approach 1, the set of sample points along the trajectory can be selected. For the given position and orientation of the end effector Pi, the velocity can be denoted as Ṗi = (ẋi, ẏi, żi, Ṙx^i, Ṙy^i, Ṙz^i) ∈ R^6, where the first three values indicate the linear velocity of the end effector and the last three values represent its angular velocity. The possible joint velocity of the robot arm at a given Ṗi will be Θ̇i = (θ̇1^i, θ̇2^i, θ̇3^i, θ̇4^i, θ̇5^i, θ̇6^i). To avoid IK calculations, the problem is formulated in velocity space as Ṗi = J(Θ)Θ̇i, where J(Θ) is the Jacobian of the robot at configuration Θ. The inverse Jacobian can be used to calculate Θ̇i = J^-1(Θ)Ṗi. Then, by integration, the joint values at the next point Θi+1 can be obtained. By using this sequential method, we can obtain the joint values at all sample points and generate the trajectory.
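The second approach can be condensed into a short numerical routine. The following sketch is illustrative rather than the exact implementation used in this work: it assumes a user-supplied jacobian(theta) function returning the 6x6 manipulator Jacobian, treats the difference between consecutive waypoints as the velocity increment for one integration step, and substitutes a pseudo-inverse for the plain inverse so that the update degrades gracefully near singularities.

```python
import numpy as np

def integrate_joint_trajectory(theta_0, waypoints, jacobian):
    """Approach 2 in a nutshell: propagate joint values along the Cartesian
    waypoints by inverting the Jacobian at each step."""
    thetas = [np.asarray(theta_0, dtype=float)]
    for i in range(len(waypoints) - 1):
        dP = waypoints[i + 1] - waypoints[i]   # 6-vector (dx, dy, dz, dRx, dRy, dRz)
        J = jacobian(thetas[-1])               # 6x6 manipulator Jacobian
        # The pseudo-inverse behaves more gracefully than a plain inverse
        # near singular configurations, where J is ill-conditioned.
        d_theta = np.linalg.pinv(J) @ dP
        thetas.append(thetas[-1] + d_theta)
    return np.array(thetas)
```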
Our approach presented in this work adopts concepts from both of the above-described ideas. Robot
motion trajectory from the nominal tool path is generated in two stages. The first stage is the generation
of feasible solutions for robot configuration under collision and robot reachability constraints. The second
stage is to generate a trajectory from feasible solutions under path consistency constraints.
Each path segment in the nominal tool path has a set of waypoints in the form of a location (x, y, z) and a tool inclination angle (Rx and Ry). Using the tool inclination angle information, a feasible conical region of possible tool orientations can be determined by giving some slack to the Rx and Ry values. The size of the cone is restricted by the location along the segment where the cone is generated. Another important factor to take into consideration is gravity feed, which helps in the deposition of filament from the nozzle onto the surface. The cone is bigger on a relatively flat surface (where the gradient on the path segment is small) and smaller on slanted surfaces (where the gradient on the path segment is large). Figure 3.15 illustrates a representative example of cone generation for waypoints along the path. The cone size needs to be determined by conducting printing experiments on surfaces with different types of inclinations.
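To make the cone construction concrete, the sketch below illustrates one way such inclination samples could be generated: the nominal direction plus a ring of vectors on the cone boundary, with the half-angle shrunk as the local surface gradient grows. The shrinking rule and the maximum half-angle are illustrative assumptions; as noted above, the actual cone size is determined experimentally.

```python
import numpy as np

def sample_cone_inclinations(nominal_dir, gradient, n_boundary=4,
                             max_half_angle=np.radians(15.0)):
    """Sample candidate tool inclination vectors inside a cone around the
    nominal inclination. The cone shrinks as the local surface gradient
    grows (illustrative scaling only)."""
    half_angle = max_half_angle / (1.0 + abs(gradient))
    nominal_dir = nominal_dir / np.linalg.norm(nominal_dir)
    # Build an orthonormal basis (u, v) perpendicular to the nominal direction.
    helper = np.array([1.0, 0.0, 0.0])
    if abs(nominal_dir @ helper) > 0.9:
        helper = np.array([0.0, 1.0, 0.0])
    u = np.cross(nominal_dir, helper)
    u /= np.linalg.norm(u)
    v = np.cross(nominal_dir, u)
    samples = [nominal_dir]                       # always keep the nominal inclination
    for phi in np.linspace(0.0, 2.0 * np.pi, n_boundary, endpoint=False):
        tilt = np.cos(half_angle) * nominal_dir + np.sin(half_angle) * (
            np.cos(phi) * u + np.sin(phi) * v)    # vector on the cone boundary
        samples.append(tilt / np.linalg.norm(tilt))
    return samples                                # 1 nominal + n_boundary samples
```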
For the start point of a tool path segment, the following steps are used to find feasible robot configurations:
1. For each cone at the waypoint, 5 possible tool inclinations are initially created by sampling. These include the nominal inclination and other inclination vectors around the boundary of the cone. The first stage
for eliminating improper configuration samples is checking for collision between the tool mesh model and the print base mesh model. In traditional 3D printing, all the print layers are flat and there is no orientation change in the nozzle. Hence, there is no collision between the extruder and the print bed, or between the extruder and previous layers. In non-planar printing, as the extruder is tilted to match its axis with the surface normal, there is a chance of collision with the print base. To avoid collision, mesh-to-mesh collision checking [95] is done and all configurations with detected collisions are eliminated from the set of possible solutions. Collision checking is modeled as a distance function (dist(ExtruderMesh, PrintBase) ≥ 0) that checks the intersection of each face in the mesh file of the extruder with each face in the mesh file of the print base.
2. All the remaining tool configurations are converted into robot joint space Θi = {θ1^i, . . . , θn^i} (where n is the number of degrees of freedom of the robotic arm) by solving the inverse kinematics (IK) of the robot. As we are
dealing with only 6 DOF robots in this work, analytical IK solutions are used based on the robot
Universal Robotic Description Format (URDF) model. For calculating the nominal tool path, the
rotation of the tool about its axis was not specified. For a given tool inclination, we start with an
arbitrarily selected tool rotation and compute all possible IK configurations. We adaptively change the
tool rotation angle to make sure that the joint angle corresponding to the wrist rotation angle for the
robot is at the middle of the range for the joint angle. This enables the robot to have margins to rotate
the tool to meet tool inclination requirements at other waypoints along the tool path. To avoid the
robot motion near its joint angle limits, upper and lower bounds (θj^lb ≤ θj ≤ θj^ub) are applied on the
permissible robot joint angles for trajectory generation. After calculating IK for sample configurations,
samples with at least one joint angle outside the bound are eliminated.
3. For 3D printing, it is desirable to have the nozzle speed maintained within certain ranges. While
maintaining the nozzle speed, the velocities of the robot joints should not exceed their limit set by the
manufacturer. To ensure reliable printing, joint angle velocity is calculated at each sample configuration
and samples with any joint angle velocity outside the bounds (θ̇j^lb ≤ θ̇j ≤ θ̇j^ub) are eliminated. As the TCP velocity can be calculated as the product of the Jacobian and the joint angle velocities, the desired input 3D printing speed V is used to check the joint angle velocities of all joints via Θ̇ = J^-1 V.
4. The final stage of eliminating undesired samples is collision checking between the robot and the print
base. Using the computed robot joint angles, the remaining samples are checked for a collision between
the robot mesh model and the print base mesh model with mesh-to-mesh collision checking. As this mesh-to-mesh collision checking is computationally expensive due to the larger size of the robot mesh, it is performed as the last elimination step, by which point the number of sample configurations to be tested has already been reduced. Collision checking between the robot and the print base is modeled in a similar way as collision checking between the extruder and the print base (dist(RobotMesh, PrintBase) ≥ 0).
5. If the above steps do not lead to any feasible robot kinematic configurations at a point on the tool
path, then we sample additional inclination angles to search for feasible solutions. We start with the
tool inclination angle that leads to the minimum amount of constraint violation and search for tool
inclination angles and rotation angles that lead to a feasible robot configuration. We perform gradient
descent over the constraint violation function to find feasible solutions. If new solutions are found, they are added to the list. If no feasible robot configuration can be found, then it is not possible to execute the path using the current setup.
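The five steps above can be viewed as a filtering cascade, with the cheap checks applied first and the expensive robot-versus-print-base mesh check deferred until the candidate set has shrunk. The following sketch shows the control flow only; every helper callable (solve_ik, extruder_collides, robot_collides, and so on) is a hypothetical placeholder for the corresponding check described above.

```python
def feasible_configs_at_waypoint(tool_samples, solve_ik, in_joint_bounds,
                                 within_velocity_limits, extruder_collides,
                                 robot_collides):
    """Sketch of the elimination cascade for one waypoint (steps 1-4)."""
    survivors = []
    for tool_pose in tool_samples:
        if extruder_collides(tool_pose):            # step 1: extruder vs. print base
            continue
        for theta in solve_ik(tool_pose):           # step 2: analytical IK branches
            if not in_joint_bounds(theta):          # step 2: joint angle bounds
                continue
            if not within_velocity_limits(theta):   # step 3: joint velocity limits
                continue
            if robot_collides(theta):               # step 4: robot vs. print base
                continue
            survivors.append(theta)
    return survivors  # step 5 (adaptive resampling) is triggered if this is empty
```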
For each interior point on the tool path segment, we also compute feasible robot configurations using
a method similar to the method described above. Instead of using a large range of tool rotational angles,
we use a narrow range of tool rotation angles defined using the tool rotation angle at the start point of
the tool segment. This range is set using the allowable twist in the strand during deposition. This value is
experimentally determined.
After identifying feasible robot configurations by checking for collisions and robot reachability, robot
trajectory is constructed by linking robot configurations that lead to consistent paths. We examine each
potential feasible robot configuration at the start point. Starting with a robot configuration at the start
point, we find all feasible robot configurations along the tool path that lead to a consistent path. Let Pi and Pi+1 be two tool configurations and let Θi and Θi+1 be two feasible robot configurations capable of reaching these two tool configurations. Robot configuration Θi+1 will be considered path-consistent with Θi if it is very close to Θi + J^-1(Pi+1 − Pi). Here, J is the Jacobian at configuration Θi. If the distance between the vector Θi+1 and the vector Θi + J^-1(Pi+1 − Pi) is below a given threshold, then we simulate the path by identifying intermediate robot configurations obtained by linearly interpolating between Θi and Θi+1. We use the intermediate robot configurations to compute intermediate tool configurations. If the intermediate tool configuration lies between Pi and Pi+1, then Θi+1 is considered path-consistent with Θi. In practice, given a configuration Θi, we identify the configuration Θi+1 at Pi+1 that is closest to Θi + J^-1(Pi+1 − Pi), and check path consistency by evaluating the tool configuration P at (Θi + Θi+1)/2.
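The consistency test itself can be written compactly. In the sketch below, jacobian and forward_kin are assumed helper functions (the 6x6 Jacobian at a joint configuration, and the tool configuration reached by a joint configuration), the tolerance values are illustrative, and checking the joint-space midpoint against the midpoint of the tool segment is one simple realization of the "lies between Pi and Pi+1" test described above.

```python
import numpy as np

def is_path_consistent(theta_i, theta_next, P_i, P_next,
                       jacobian, forward_kin, tol_joint=0.05, tol_tool=0.5):
    """Sketch of the path-consistency test for one pair of waypoints."""
    # Predicted next configuration from a first-order (Jacobian) step.
    predicted = theta_i + np.linalg.pinv(jacobian(theta_i)) @ (P_next - P_i)
    if np.linalg.norm(theta_next - predicted) > tol_joint:
        return False
    # Simulate the midpoint of a linear joint-space interpolation and check
    # that the resulting tool configuration stays between P_i and P_next.
    midpoint_tool = forward_kin(0.5 * (theta_i + theta_next))
    segment_mid = 0.5 * (P_i + P_next)
    return np.linalg.norm(midpoint_tool - segment_mid) <= tol_tool
```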
If we can find a sequence of robot configurations that leads to a consistent path from the start to the
end of the path segment, then we have found a valid solution. We perform this step for all feasible robot
configurations at the start point of the tool path segment. More than one such trajectory sequence may be generated for a segment. If this case arises, the average change in consecutive joint angles over the entire segment is computed for each joint, and the trajectory with the minimum average is selected. This method of trajectory generation is carried out on each of the path segments in order to compute the complete trajectory.
Figure 3.15: Representative cone generation for waypoints along the tool path.
Print speed, which couples TCP speed, joint angle velocity, and extrusion rate, indicates how fast the nozzle moves and how fast material is deposited while printing. The print speed and the material deposition rate have significant impacts on the printing quality and the strength of a 3D printed object. In traditional additive manufacturing, slower speed usually results in better quality. Also, print speed can be kept consistent over a planar surface, keeping the quality level the same. However, in non-planar printing, where material is deposited at various angles, uniform speed does not guarantee the same print quality due to the curvature and complexity of the surfaces. If surfaces have high curvatures and require substantial orientation changes, trajectory parameters should be controlled in order to minimize vibration, to ensure
the reliability of printing, and to reduce the risk of collisions between the nozzle tip and the mold. In order to have variable velocity during motion, accurate velocity control is required, as explained in Section 3.4.
3.4 Tool Velocity Control
In planar 3D printing, there is no change in rotation angles (Rx, Ry & Rz) of the tool since the extruder
does not orient about its tool center point. Hence, hatching can be generated with almost constant extruder
velocity and extrusion flow rate without affecting the print quality. On the other hand, non-planar 3D
printing has active TCP orientation which demands variation in both extruder speed and extrusion flow
rate.
Figure 3.16: Changes in TCP orientation along a planar, convex, and concave surface.
When a robot travels along sharp curves, there is a significant change in rotation angles for adjacent
points. During this motion, the robot joints should cover large angles while the TCP covers comparatively
smaller distances. Figure 3.16 shows TCP orientation along different surfaces. If the robot joint speeds are
constant throughout the motion, the time taken to change joint angles is high which results in more material
Figure 3.17: Printing along a curved surface (a) without velocity control, (b) with velocity control.
extrusion along a shorter path. It creates an accumulation of material and distorts subsequent layers over
it. This can be prevented by accurately controlling the extrusion flow rate and changing the robot velocity
by detecting curvature along every print.
Experiments are performed in order to obtain the maximum extrusion flow rate without creating an
overflow of material inside the nozzle. Robot velocity is finely tuned relative to the extrusion rate to make
hatching smooth and uniform. When the robot is moving along a curve, excess material deposition can be
reduced either by reducing the material flow or by increasing the robot joint angle speed. Both methods
can be applied either independently or together for better material deposition control. While increasing the
robot joint speed, there is a possibility that any joint angle may reach a speed limit that would stop the
entire printing process. Such a scenario can be handled by keeping the joint angle speed under the upper
limit, reducing the TCP speed, and further lowering the extrusion flow rate.
Globally determining the curvature of a printing surface is not always possible. Hence, the nature of the
surface is analyzed in a more discretized manner, that is, between two consecutive projected grid points.
Since the grid spacing is an input variable, even small curvatures can be identified by resizing the grid mesh.
For every projected point and its associated normal, another temporary point is generated along the normal
at a specific distance from the original point. This distance tends to vary from 1 to 4mm and it can be
empirically decided based on experimentation. The gap between consecutive temporary projected points is then compared with the gap between the original projected points. If the gap between the temporary points is larger than the actual gap, the curvature is convex, and vice versa. In either case, the robot joint angles vary due to curvature. Hence, the material extrusion rate is reduced by an amount matching the difference between the gap of the temporary projected points and that of the actual projected points. For general estimation of flow rate, linear
interpolation can be applied between extreme values (upper and lower limit) of gap and motor micro-stepping
delay (maximum and minimum delay possible under specific micro-stepping). This results in precise control
over the extrusion rate while constantly changing the slope along the curve. Since extrusion rate by default
is kept at maximum, velocity control will only reduce the flow rate along areas of sharp curvature. Extrusion
rate control can be easily integrated with the robot system by generating either an analog or a digital output signal from the robot controller and sending it to a microcontroller between every motion command to change the stepper motor speed.
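The curvature test and the flow-rate interpolation can be sketched as follows. The offset distance, gap limits, and scale limits in this sketch are illustrative stand-ins; in practice, the limits correspond to the experimentally determined gap extremes and the micro-stepping delays discussed above.

```python
import numpy as np

def extrusion_scale(p_a, n_a, p_b, n_b, offset=2.0,
                    gap_limits=(0.5, 4.0), scale_limits=(0.6, 1.0)):
    """Offset each projected point along its normal, compare the offset gap
    with the original gap, and derive an extrusion scale factor from the
    gap difference (all limits are illustrative values)."""
    actual_gap = np.linalg.norm(p_b - p_a)
    temp_gap = np.linalg.norm((p_b + offset * n_b) - (p_a + offset * n_a))
    convex = temp_gap > actual_gap                 # offset gap widens on convex regions
    diff = abs(temp_gap - actual_gap)
    # Linearly map the gap difference onto an extrusion scale between the limits.
    lo, hi = gap_limits
    frac = np.clip((diff - lo) / (hi - lo), 0.0, 1.0)
    s_lo, s_hi = scale_limits
    return convex, s_hi - frac * (s_hi - s_lo)     # larger difference -> lower flow
```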
A similar approach can be used to control the robot joint angle speed by interpolating it against higher
and lower gap values. Unlike flow control, joint speed control can be directly added in motion command
by assigning respective joint angle speeds to each motion. As joint speed increases along the curve from
its initial value, there is a possibility that the joint angular speed reaches its limit. This can be eliminated by identifying the joint angle speed limit in a RobotStudio simulation and keeping the highest value of joint angle velocity along the path just below this limit. A scale-down factor is calculated from the highest velocity value, and all remaining velocities are decreased to match the scale. This scale is also used to further reduce the extrusion rate in order to maintain a uniform layer.
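The scale-down step for the joint speeds admits an equally compact sketch; the margin below is an illustrative safety factor, and the returned factor is the one reused to lower the extrusion rate.

```python
import numpy as np

def scale_joint_velocities(joint_vels, speed_limit, margin=0.95):
    """If any planned joint velocity along the path exceeds the limit, scale
    all velocities uniformly so that the peak sits just below it (sketch)."""
    joint_vels = np.asarray(joint_vels, dtype=float)
    peak = max(np.max(np.abs(joint_vels)), 1e-9)
    factor = min(1.0, margin * speed_limit / peak)
    return joint_vels * factor, factor
```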
Figure 3.17 shows the comparison between non-planar printing with and without velocity control for the
curved structure. The absence of velocity control deposits extra material on the fillet as shown in Figure
3.17(a). On the other hand, Figure 3.17(b) shows smoother hatching at fillet with velocity control.
3.5 Calibration of Robots and Tools
During the printing process, the gap between the hot nozzle tip of the FDM extruder and the previous layer
(or workpiece in the case of the first layer) should be accurate and uniform across the complete conformal
surface. Even the slightest variation in gap adversely affects the smoothness of layers, which further affects
consecutive layers. In extreme conditions, a significant amount of variation in the gap increases the risk that
the extruder will hit the base and result in a print failure. Figure 3.18 shows the damaged base and broken
build due to improper calibration. Such a failure damages both the extruder and the printing base. Additionally, it increases the chance of motor failure in the robotic manipulator due to excess load.
To avoid this, the precise calibration of the nozzle tip with respect to the printing base is required.
Figure 3.18: (a) Damaged base, (b) Printed model of scaled down version of car bonnet
While calibrating, the following two important factors should be taken into consideration:
1. Proper offset value of nozzle tip from surface: The offset between the end of the nozzle tip and the
printing surface should match the desired layer thickness to print. If the gap is smaller than the proper
value, thickening of hatching lines takes place as extra material between gaps is squeezed out from the
side. This extra material creates a hindrance to consecutive hatching lines and distorts their path as
shown in Figure 3.19(a). On the other hand, if the gap is greater than the desired value, the layer starts to curl up and the print fails, as shown in Figure 3.19(b).
Figure 3.19: (a) Squeezed out excess material, (b) Curling of hatch pattern.
2. Uniform TCP orientation over the entire space: If the robot’s tool data is poorly calibrated, TCP
shifts its position (x, y, z) up to a certain extent when rotated about axes. This creates an uneven
thickness in layers. Even though the change in position coordinates is very small, the difference is
noticeable because the layer thickness is as small as 0.4 mm.
Due to varying constraints involved in the initial setup such as small deviation in the 3D printed mold
base from its actual CAD model or extruder tip wear, calibration is performed every time a new base is
mounted. The steps of the calibration process are as follows:
1. Calculation of accurate tool data & load data: To get highly accurate data of the TCP for uniform
TCP orientation, tool data is calculated with at least seven data points in the procedure. Figures 3.20
and 3.21 show the Tool Center Point calibration before printing.
2. Mounting the base in the proper position on a table: The base, on which the printing will take place, is mounted on a flat table. RobotStudio simulations can be performed to determine the right position of the base. This is done by placing the base at a candidate location and confirming that the robot encounters neither a singularity nor a collision over the entire region, as shown in Figure 3.22. The appropriate place should be within the workspace of the robot and should be neither too close to nor too far from the robot, since either case increases the chance of reaching the robot's joint limits. The initial position is validated in RobotStudio. If the program generates errors, the user re-selects the initial position. When confirmed, the base is firmly attached to the table to make sure that it does not get disturbed while printing.
3. Calibration of Z-axis offset value: At least 3 points along the boundary and 3 points over the surface
are selected, and TCP is moved to each point with its required rotation about axes. Correct offset is
recorded considering the thickness of the desired layer.
4. Pilot run along the boundary of printing surface: After Z-axis offset, a calibration test is performed
along the boundary to ensure that the offset is uniform over the surface and the calibration is reliable.
The complete calibration procedure is followed for every new base mounted in order to avoid damage to
any of the hardware used in printing.
Figure 3.20: Position of TCP and end effector with respect to the flange frame coordinates.
Figure 3.21: TCP calibration.
Figure 3.22: RobotStudio Simulation for singularity and collision check
Chapter 4
Learning of Constant Process Parameter Models for Robotic
Processing Applications
4.1 Introduction
Figure 4.1: An industrial robot is performing the surface finishing task.
Robots are increasingly being considered for non-repetitive tasks. For many such tasks, a performance
model that maps operation parameters to task performance is not known a priori. For example, consider
the task of finishing a surface using a robotic system (Figure 4.1). The condition of the surface may not be
known in advance. Hence, the robot does not know what operation parameters to use for the task, such as the rotational speed of the tool, the forward velocity of the tool, and the applied force. The robot may start with initial values of the process parameters based on prior experience or random selection. It can execute a small portion of the
task with this set of parameters. Based on the observed task performance, the robot can select parameters
again and perform another small portion of the task. The process of updating process parameters based on
the task performance on the partially completed task can be continued until the task is completed or the
robot has determined the best operation parameter values.
In this chapter, we mainly focus on developing an efficient and adaptive experimental design for learning manufacturing processing applications that have constant process parameter models. The main objective is the minimization of task completion time. An AI-driven algorithm is employed to select process parameters in each experiment with a priority on the main objective. The selected process parameters affect the task completion rate. They also affect the performance and determine whether the task performance constraints are met. For instance, Figure 4.2 shows the robotic finishing process and its outcomes. After the robot finishes the rigid part, the surface quality is evaluated. In Figure 4.2(b), the outcome has not met the surface quality constraint, and the task fails. On the other hand, the surface quality constraint is satisfied in Figure 4.2(c) and the finishing task succeeds. Task failure is due to wrong parameter selection. If the selection process cannot meet the performance constraints, then the task needs to be performed again. This leads to wasted time.
When learning a manufacturing task, three characteristics are taken into account: consequences of constraint violations, complexities of constraint models, and prior knowledge. Constraint violation consequences
should be carefully determined and considered, depending on the specific task settings and conditions. For
example, violating surface quality constraints during the finishing of a rigid part may not have fatal consequences, whereas failure to meet deflection constraints could result in significant damage to the part. The
complexity of constraints refers to the varying degrees of intricacy in their models. Some constraints may
have complex models, while others may be simpler. Each model exhibits different relationships between
input and output parameters. For instance, the surface quality constraint is considered more complex than
the deflection constraint. Surface quality is influenced by three parameters (force, tool rotation speed, and
Figure 4.2: Figure (a) shows the robotic finishing process for a rigid part. (b) shows the surface quality is poor and
the task fails. (c) shows the surface quality constraint is satisfied and the task succeeds.
velocity), and the relationship is non-linear. On the other hand, the deflection constraint is relatively simpler,
as it can be assumed that deflection increases with force while rotational speed and velocity have minimal
impact on part deflection. The utilization of prior knowledge plays a crucial role in facilitating efficient
learning of manufacturing processes. Having knowledge of various aspects, such as which process parameters impact task performance, the appropriate or inappropriate ranges of process parameters, the effects of
parameter changes on performance, or possessing data from similar tasks can greatly assist in designing an
effective learning framework. By leveraging this prior knowledge, we can save time by bypassing unnecessary
experiments or focusing on the parameters of particular interest.
We are interested in developing an efficient experimental design that achieves the best balance between
exploration and exploitation. The term exploration means searching for process parameters that can improve
task execution performance by trying different values of process parameters. The term exploitation means
using the best performing process parameters that we have discovered so far. Initially, we need to explore
many different process parameter values to understand their impact on task performance. All process
parameters that are tried during exploration do not necessarily lead to progress on the task. In fact, some
process parameters may lead to violation of task performance constraints, and hence, that portion of the
task may need to be repeated. During the exploitation phase, we use the experience that we have gained and
use process parameter values that are known to work. Initially, we need to rely on exploration to ensure that
we do not use sub-optimal values of process parameters to perform the task. When we start approaching
task completion, exploration may not offer much value. Towards the end, small gains in performance will
not have much impact on the overall task performance. The exploration phase may end, and exploitation
may begin.
This chapter presents an adaptive and iterative learning framework that refines process parameters by
achieving the right balance between exploration and exploitation while minimizing the expected task completion time. We utilize Gaussian Process (GP) models to predict task performance based on the observations
from the attempted/completed portion of the task. The ability to predict task performance enables us to
decide the next course of action. Our method analyzes the likely gains by performing exploration and considers the consequence of not meeting the task performance constraints. It then makes the informed decision
about either selecting process parameters to improve performance or selecting the best parameters based on
the observed performance so far. This approach enables us to achieve the best trade-off between exploration
and exploitation.
4.2 Problem Formulation
We begin by defining terminologies and notations. We consider scenarios where a robot needs to execute
and complete a specific type of task such as sanding, painting, or cleaning. The task is non-repetitive.
(1) Task: Define the task that the robot will perform as C. Divide the entire task into N sub-tasks (or task fractions). In this case, the entire task is the union of the sub-tasks, C = c1 ∪ c2 ∪ · · · ∪ cN, where ci indicates the i-th sub-task.
(2) Process parameters: When performing the task, the robot needs to select the process parameters, or operation parameters, such as the force to apply or the tool speed. Assume there are L process parameters for the task C. Define each process parameter as x1, ..., xL. The set of process parameters can be written as x = (x1, ..., xL). Process parameters are subject to physical constraints of the robot or safety considerations. We can denote the parameter constraints as g(x) ≤ 0.
(3) Task performance: We can measure task performance after executing the task with the selected process parameters. The finish quality or accuracy of the task are examples of task performance. Assume there are M task performance measurements. Define each task performance as y1, y2, ..., yM. The set of task performance measurements can be written as y = (y1, ..., yM). Define the task performance targets or constraints as h(y) ≤ 0. The performance model that maps the process parameters to each task performance can be written as yj = fj(x), where j = 1, ..., M.
(4) Probability of success or failure: For each sub-task, we know whether the execution of the sub-task succeeds or fails using the selected process parameters. The success rate for meeting all the task constraints can be written as ps, and the probability of failure can be written as pf = 1 − ps.
(5) Task completion time: The time taken to complete the sub-task ci is a function of task performance and can be written as ti = d(y). The input of the function, y, is the set of task performance measurements for ci. In many applications, we assume ti can be computed analytically.
(6) Multiple task attempts: If an attempt fails, the robot should perform the task again. Let us introduce the notation ti^k, the time taken for the retry attempt corresponding to sub-task ci, where the upper index k indicates how many retrials were performed. We do not put any upper index if there is no retry; ti^1 means the time taken for the first retry of sub-task ci.
(7) Parameter space: Define the parameter space Ω as the set of all possible process parameters. For each sub-task, the robot selects a set of parameters x in Ω and executes the task.
We are interested in an adaptive and iterative learning approach to minimize the expected task completion
time for a given task. First, we can divide the given task into N number of sub-tasks. All sub-tasks are
assumed to be identical and have the same conditions, such as the amount of work or the characteristics of the task.
The robot begins executing a sub-task and updates operation parameters as it observes the task performance.
As the robot completes more sub-tasks, more information about the performance model can be obtained.
It aims to complete all sub-tasks with the right balance of exploration and exploitation. Process parameter
limits and task performance constraints must be met. Our objective function is to minimize the expected
task completion time and can be formulated as follows.
min E[ ∑_{i=1}^{N} ti ]    (4.1)
s.t. g(x) ≤ 0 and h(y) ≤ 0    (4.2)
We will use several algorithms to solve this problem including feasibility biased sampling, surrogate model
fitting, and greedy heuristics. Each approach is described in the following sections.
4.3 Approach
4.3.1 Overview
We begin by exploring process parameters. Our approach for the initial sampling of process parameters
is presented in Section 4.3.2. Once we have attempted to perform the task, surrogate models of the task
performance can be built based on the measurement of the outcomes. The surrogate models can be used to
predict task performance across the entire parameter space (Section 4.3.3). Section 4.3.4 presents the greedy
optimization to select improved process parameters. Finally, Section 4.3.5 presents the sequential decision
making process that selects process parameters based on exploration and exploitation trade-off.
4.3.2 Feasibility Biased Sampling
In the early stage, a robot does not have much information about operation parameters. The robot can start
a task with a random selection of operation parameters. A random set of parameters would be sampled from
the entire parameter space Ω. This is random sampling and is drawn from a uniform distribution. Random
sampling is unbiased and straightforward to implement. However, it may not be the best approach if we
want to sample a specific set of parameters. We can introduce a bias in the sampling scheme to accelerate
the process of finding feasible process parameters.
To facilitate the initial selection process, feasibility biased sampling can be employed. Feasibility biased
sampling is a sampling method that prioritizes the selection of process parameter sets that are more likely to
satisfy the task performance constraints. This can be done by sampling the parameters from certain regions
of the parameter space Ω, guided by prior knowledge or previous task execution results. These designated
regions are defined as the region of interest, Ψ. We have a limited number of performance evaluations to query,
so our goal is not to precisely identify the exact location of Ψ within Ω. Rather, we approximately determine
Ψ by using an elimination strategy. Each time the robot executes a task, we know whether task performance
Figure 4.3: A representative example of Ω and Ψ.
constraints are met or not, with the selected parameter set. Any regions in the parameter space where
task performance violations have already occurred or are likely to occur are eliminated. After eliminating
infeasible regions, the remaining regions can be denoted as Ψ. This elimination strategy can be repeated each
time an experiment is performed. The use of feasibility biased sampling increases the probability of sampling a set of process parameters that does not violate the task performance constraints. Figure 4.3 shows an example of Ψ in the parameter
space Ω. In order to construct surrogate models, we utilize the data obtained during the initial exploration
stage. However, if the initial data is insufficient to build accurate surrogate models, we address this by
conducting additional random sampling. By gathering more data through random sampling, we ensure that
an adequate amount of initial data is available for the construction of reliable surrogate models.
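A simple way to realize this elimination strategy is rejection sampling over Ω with predicates describing the eliminated regions. The sketch below is a minimal illustration, not the exact implementation used in this work; the bounds in the usage example borrow the force, rotational speed, and velocity ranges of the sanding case study in Section 4.5, and the elimination rule encodes the excessive-deformation example from Section 4.4.

```python
import numpy as np

def feasibility_biased_sample(bounds, eliminated, rng, max_tries=1000):
    """Draw uniformly over the parameter space Omega, rejecting draws that
    fall in regions already eliminated by observed constraint violations
    (the remaining region is Psi). `eliminated` is a list of predicates."""
    lows = np.array([b[0] for b in bounds], dtype=float)
    highs = np.array([b[1] for b in bounds], dtype=float)
    for _ in range(max_tries):
        x = rng.uniform(lows, highs)
        if not any(region(x) for region in eliminated):
            return x
    raise RuntimeError("no feasible-looking sample found; Psi may be empty")

# Usage: after an execution with excessive deformation at force fa, eliminate
# all parameter sets with a larger force value.
rng = np.random.default_rng(0)
bounds = [(0, 200), (0, 1000), (0, 200)]   # (force N, speed RPM, velocity mm/s)
fa = 150.0
eliminated = [lambda x: x[0] > fa]
x_next = feasibility_biased_sample(bounds, eliminated, rng)
```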
4.3.3 Surrogate Model Construction: Gaussian Process
The relationship between process parameters and task performance measurements can be represented by
yj = fj (x) where j = 1, ..., M. The complete form of the model fj is not known a priori, and is often a
nonlinear and complex function. We employ a surrogate model approach to estimate the relationship between
process parameters and task performance. Among various surrogate functions, we choose to use Gaussian
Process (GP) because of its ability to achieve high accuracy with a limited amount of data. GP exhibits
remarkable flexibility and fits well for nonlinear input and output relations that are commonly observed in
practical implementations [101]. Furthermore, GP is capable of computing uncertainty levels for unobserved
data.
The process of building and updating GP models follows the steps outlined below. Firstly, all process parameters and corresponding task performance measurements obtained during the exploration phase are stored in the current data set Scurrent. Then a surrogate model is constructed with Scurrent for each task performance. The surrogate model for the j-th task performance is denoted as f̂j. The task performance estimation can be written as ŷj = f̂j(x). The surrogate model is subsequently updated each time Scurrent is updated or augmented with new data.
In addition, we can optimize the hyperparameters or so-called free parameters of GP. By tuning these
hyperparameters, we can customize the GP to better suit the specific requirements of different robotic
tasks [15, 133]. The choice of which hyperparameters to tune, such as those related to the basis function, the kernel function, or the kernel scales, is left to the user. The frequency of optimization can also be controlled by the user. Figure 4.4 shows an example of model fitting in a robotic sanding application. In the top figure, the GP regression model is constructed without hyperparameter tuning, resulting in a green predicted model that significantly deviates from the original blue model. The bottom figure is the estimation of deflection without adjusting hyperparameters; the red dots indicate the estimates, the blue dots indicate the 95% confidence level, and the black dots are the original values. Both figures demonstrate poor prediction accuracy, emphasizing the importance of hyperparameter optimization in improving the performance of the GP models.
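As a concrete illustration, the per-performance surrogate fitting could look like the following scikit-learn sketch. The library, the anisotropic RBF kernel, and the restart count are assumptions made for illustration; hyperparameter tuning is realized here through the optimizer restarts that scikit-learn runs during fitting.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def fit_performance_models(X, Y):
    """Fit one GP surrogate per task performance measure.
    X: (n, L) process parameters; Y: (n, M) performance measurements."""
    models = []
    for j in range(Y.shape[1]):
        kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(X.shape[1]))
        # n_restarts_optimizer re-optimizes the kernel hyperparameters from
        # several starting points (the tuning discussed above).
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                                      n_restarts_optimizer=5)
        models.append(gp.fit(X, Y[:, j]))
    return models

# Predictions come with uncertainty: mean and standard deviation at a new x.
# mu, sigma = models[j].predict(x_query.reshape(1, -1), return_std=True)
```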
4.3.4 Greedy Optimization using GP
We perform greedy optimization using GP to improve process parameters. The greedy optimization starts
from the current best parameter set, and it aims to find the best immediate solution to meet an objective
function [14]. First, define the current best set of parameters in Scurrent as xcb. xcb indicates the set of
parameters that returns the least task completion time while meeting all task performance constraints. tcb
is the time taken to complete the current sub-task using the optimal process parameters, xcb. tcb can be
considered as the minimum task completion time achievable for the current sub-task while satisfying all
constraints.
Figure 4.4: Constructing GP models without hyperparameter optimization can cause the poor estimation of task
performance.
We are interested in exploring the regions in the parameter space that can complete the next sub-task in
less than the time tcb. Denote this region of interest as ΩT. ΩT can be defined using the process parameters
that govern the task completion time. For example, the velocity of the tool mainly determines the task
completion time. In this case, ΩT encompasses all possible parameter sets with a higher velocity value
compared to the current one. Next, we employ grid sampling to generate numerous parameter sets within
ΩT. The granularity of the grid, or grid resolution, is selected by the user. To expedite the search
process, an adaptive grid can be utilized. Save all parameters in the grid to Sgrid. Our objective now is to
select a parameter set that exhibits a high probability of meeting the task performance constraints. Using
GP models, the mean and variance of the task performance estimations for all parameters in Sgrid can be
computed. Normalize the mean and variance using a standard score to calculate the success rate of meeting
the task performance constraints. Define the success rate as ps = P(h(ŷ1, ..., ŷM) ≤ 0). We can find the
parameter set that returns the least task completion time and the highest success rate ps. This parameter
set is referred to as the next best set, denoted as xnb. The greedy heuristic returns xnb to be used for
executing the next sub-task.
We proceed to the next sub-task by updating it as the current sub-task. After executing the task with
xnb, we assess whether the task performance constraints have been satisfied. If the constraints are satisfied,
the current best parameter set is updated with the value stored as xnb. However, if the constraints are not
satisfied, we need to search for an alternative set of process parameters to retry the task. In this case, the
current best value remains unchanged. We update the GP models based on the outcomes obtained from the
unsuccessful task executions. This allows us to incorporate new information into the models. Subsequently,
we repeat the process of greedy optimization to identify a fresh set of process parameters for executing the
task. This iterative approach continues until the task performance constraints are successfully met.
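Combining the GP predictions with the standard-score normalization yields the success rate used for the selection. The sketch below is illustrative: for concreteness it assumes the two constraints of the sanding case study in Section 4.4 (a minimum finish quality and a maximum deflection), treats them as independent, and breaks ties by preferring the fastest parameter set and then the highest success rate.

```python
import numpy as np
from scipy.stats import norm

def next_best_parameters(models, S_grid, completion_time, q_min, d_max):
    """Sketch of the greedy step: score each candidate in the grid by its
    probability of meeting all performance constraints, then pick the
    candidate with the least completion time and highest success rate."""
    best_x, best_key = None, None
    for x in S_grid:
        x2d = np.asarray(x, dtype=float).reshape(1, -1)
        mu_q, sd_q = models[0].predict(x2d, return_std=True)  # finish quality GP
        mu_d, sd_d = models[1].predict(x2d, return_std=True)  # deflection GP
        # Standard-score normalization turns each constraint into a normal CDF.
        p_quality = 1.0 - norm.cdf((q_min - mu_q[0]) / max(sd_q[0], 1e-9))
        p_deflect = norm.cdf((d_max - mu_d[0]) / max(sd_d[0], 1e-9))
        p_success = p_quality * p_deflect                     # independence assumption
        key = (completion_time(x), -p_success)                # fast first, then likely
        if best_key is None or key < best_key:
            best_x, best_key = x, key
    return best_x
```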
4.3.5 A Sequential Decision Making Approach for Parameter Selection
This section presents the sequential decision making process to select process parameters to meet the requirements of the objective function. This approach has the following steps.
1. Perform a combination of feasibility biased sampling and random sampling. Save the data to the
current data set.
2. Fit GP models using the current data set.
3. Perform greedy optimization using the GP models and find the next best set of parameters.
4. Compare the expected task completion time that can be achieved by exploration and exploitation using
the success rate to meet task performance requirements.
(a) If the expected task completion time of exploration is smaller than that of exploitation, go to
Step 5.
(b) If not, go to Step 8.
5. Continue exploration and execute the next sub-task using the next best set of parameters.
6. Evaluate if it satisfies the task performance requirements.
(a) If so, update the current best parameter set.
(b) If not, the current best parameter set remains the same.
7. Update the current data set. Repeat Steps 2 to 7.
8. Stop exploration and use the current best set of parameters to execute the remaining tasks.
We make a decision based on the value that can be gained from our objective function. The expected
task completion time includes the retry of the portion of the task that was previously executed but did not
meet the task performance constraints. We learn to improve the process parameters to produce better task
performance. The exploration may not provide much improvement near the end of the task. Then we stop
exploring and start to exploit. During exploitation, we execute the remaining portion of the task using the
set of parameters known to produce the best results.
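The eight steps can be summarized as a single control loop. The sketch below is a schematic outline only: every callable is a hypothetical placeholder for the corresponding component described above, and the bookkeeping (for example, counting failed portions that must be redone) is simplified.

```python
def run_task(N_subtasks, sample_initial, best_feasible, fit_models,
             greedy_step, execute, exploit_time, explore_time):
    """Schematic outline of the sequential decision loop (Steps 1-8)."""
    S_current = sample_initial()                              # Step 1
    x_cb = best_feasible(S_current)                           # current best parameters
    completed = len(S_current)                                # sub-tasks done so far
    while completed < N_subtasks:
        models = fit_models(S_current)                        # Step 2
        x_nb = greedy_step(models, S_current)                 # Step 3
        if exploit_time(completed) <= explore_time(completed):  # Step 4
            break                                             # switch to Step 8
        success, observation = execute(x_nb)                  # Step 5
        if success:                                           # Step 6
            x_cb = x_nb                                       # update current best
            completed += 1                                    # failed portions redone
        S_current.append((x_nb, observation))                 # Step 7
    while completed < N_subtasks:                             # Step 8: exploitation
        execute(x_cb)
        completed += 1
```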
4.4 Case Study: Robotic Sanding Task
Figure 4.5: The 6 DOF manipulator with the sanding tool needs to sand the surface with N number of identical
panels.
We apply our algorithms to a robotic sanding application. The application setting is as follows. Consider
a 6 degrees of freedom (DOF) manipulator that performs sanding operations on a given surface. The
manipulator uses a sanding tool at the end effector to mechanically remove rusty areas from the part. The
entire surface to sand can be divided into N number of identical panels. We assume each panel has the
same part parameters and mechanical properties such as curvatures or materials. Additionally, we make the
assumption that the part possesses high rigidity and that any constraint violation will not cause permanent
damage to the part. Figure 4.5 illustrates the robotic sanding task.
We can define the process parameters as follows. Note that they are defined in general terms. The lower index of a process parameter or task performance indicates that it is for the i-th sub-task.
- Force applied in the normal direction to the surface, f (N)
- Rotational speed of the tool, r (RPM)
- Forward velocity of the tool, v (mm/s)
- The parameter set xi = (fi, ri, vi).
The task performance can be written as the following:
- Finish quality of the surface, q
- Deflection of the surface, d (mm)
- Task completion time, t (sec)
- The task performance set yi = (qi, di, ti).
All process parameters have an impact on the finish quality of the surface. We can write the function as
q = F1(f, r, v). We characterize the range of q from 0 to 10, where state 0 indicates the worst finish quality
and state 10 is the best quality. The manipulator needs to apply a sufficient amount of force to remove
material from the surface. As the force increases, the quality of the surface increases as well to some extent.
However, a large force can cause harm and excessive deformation on the surface. The impact of force on the
deflection is significant. We can ignore the impact of the tool’s angular speed and forward velocity on the
deflection because they are negligible. Hence the deflection can be written as the function of force, d = F2(f).
For the rotational speed of the tool, sanding with too high RPM may damage the surface, whereas using too
low RPM cannot remove the rust. There usually exists an optimal speed in the middle that produces the
maximum surface quality. The finish quality decreases as the rotational speed gets further from the optimal
value. Using the low to medium forward velocity of the tool leads to reasonably good finish quality. When
the tool is moving too fast, it cannot remove rusty areas effectively. The time to sand one panel is inversely
proportional to the forward velocity of the tool. The time can be written as ti = λ/vi. λ is constant because
every panel has the same size, mechanical properties, or curvatures. The main objective function can be
written as the following.
min E[ ∑_{i=1}^{N} ti ]    (4.3)
s.t. −q + γ ≤ 0, d − δ ≤ 0, G(f, r, v) ≤ 0    (4.4)
q = F1(f, r, v), d = F2(f), and ti = λ/vi    (4.5)
γ, δ, and λ are given constants    (4.6)
Equation (4.4) indicates that the surface quality should be greater than or equal to a certain level γ, and the deformation d should be less than or equal to a certain value δ. G(f, r, v) ≤ 0 indicates the parameter constraints imposed by the capability of the robot or by safety considerations, such as limits on the velocity or the maximum force to apply. F1 and F2 are unknown, black-box functions. We want to improve task performance
by executing more sub-tasks. We continue selecting the next best set of parameters while the value computed
from exploration is greater than that of exploitation. Computing the next best parameters can be seen as
maximizing acquisition functions when solving Bayesian Optimization under unknown constraints [55, 71,
102]. When the exploration does not offer that much added value, we stop focusing on improving the
parameter sets and sand the remaining panels with the current best parameters. The way that we apply
each algorithm to the robotic sanding task is as follows.
1. Combination of feasibility biased sampling and random sampling: Initially, we explore a portion of
the task using feasibility biased sampling. We use the following elimination approach. Define the current
operation parameter set as xa = (fa, ra, va). If the selected parameter set results in excessive deformation,
then we know the selected force fa is too large. We exclude the region where a force value is greater than
fa from sampling. For tool angular speed, we expect the optimal value lies in the middle of the minimum
and maximum values of the speed. So the regions near the speed limits can be excluded from sampling. For
the case that the finish quality constraint is violated, whereas the deformation constraint is not, sampling
the region with a smaller force value will not help to improve finish quality. Hence, we eliminate the region
from the parameter selection. As we conduct more sub-tasks, we can eliminate more infeasible regions from
sampling. Define ninit as the number of task executions done by feasibility biased sampling. If ninit is too
small, then we may not have enough data to make reasonable estimations of GP models. Hence, we perform
random sampling to acquire enough executions. Users can define a fixed number Na. When ninit ≤ Na, we perform (Na − ninit) additional random samples. Save all the parameter sets to Scurrent. Figure
4.6 shows an example of the feasibility biased sampling based on the current performance. This approach
helps determine the specific regions within the parameter space where sampling should be focused, as well
as regions that may be avoided. We can effectively guide the sampling process, selecting relevant ranges of
process parameters for exploration while excluding areas that are less likely to yield desired results.
Figure 4.6: This describes the results of the current exploration and potential reasons for the outcomes. It also
presents suitable reactions for further task execution.
2. Surrogate model construction: With the current data set Scurrent, fit GP models. Two separate GP
models are constructed. One GP is to predict the finish quality of the surface. The other GP is to make the
estimation of the deflection. We choose to tune the hyperparameters of the kernel function every Nb executions. For example, Nb = 1 indicates that hyperparameter optimization is performed at every iteration.
3. Greedy optimization: The current best set of parameters is written as xcb = (fcb, rcb, vcb). Assign a
step size of each parameter as (∆f, ∆r, ∆v). Define the region of interest ΩT as all the possible parameter
sets with the velocity value of vcb + ∆v. Define a uniform grid in ΩT using ∆f and ∆r as grid resolutions.
We construct an Ng by Ng grid around the parameter set (fcb, rcb, vcb + ∆v), where Ng is a number assigned by the user. Save the resulting Ng ∗ Ng parameter sets to Sgrid.
By using the GP models constructed in the previous step, compute the estimated task performance values
for every parameter set in Sgrid. Compute the probability of satisfying the task performance targets. The
probability of meeting the finish quality constraint can be written as pfinish, and that of the deflection constraint as pdef. With the assumption that the two events are independent, the probability of meeting both constraints can be computed as pb = pfinish ∗ pdef. If there exists a set of parameters in Sgrid such
that pb ≥ α, then explore the higher velocity regions by adding one more step size velocity. Update ΩT
to be all the possible parameter sets with the velocity value of vcb + 2∆v. Perform grid sampling around
(fcb, rcb, vcb+2∆v). Repeat the process until there is no set of parameters that has the probability of meeting
task performance above a given threshold percentage α.
Compute the success rate of meeting the task performance constraints for all the parameter sets in Sgrid.
Define the parameter set that has the highest success rate as the next best parameter set xnb = (fnb, rnb, vnb).
Execute the task using xnb. The task completion time can be written as tnb = λ/vnb. Determine if the task
performance constraints are met or not. If they are met, update the current best solution to xnb. Update
the current data set Scurrent as well. If the task performance fails to meet the target, save the selected
parameter set to the data set of failed parameters, Sfail. The parameters in Sfail will be excluded from
further executions. Figure 4.7 describes the framework of the greedy optimization.
4. Exploration vs. exploitation decision: We want to design the learning framework to find the right balance
between exploration and exploitation. We need to make a decision between two choices: exploring the next
sub-task by using xnb or exploiting the current best set of parameters xcb.
We compute the likely gains to decide when to stop exploration. Assume that the robot has explored J panels. The task completion time so far can be written as TJ. Assume there are M remaining
panels to sand. The remaining panels include the panels that have to be sanded again due to the failure of
previous attempts. When we choose to exploit the current best parameter set, then the total time to sand
the remaining panels will be the product of M and tcb, where tcb is the task completion time for one panel
using the current best parameter set xcb. When we choose to explore one more panel with the next best
Figure 4.7: A general framework of greedy optimization for the robotic sanding task.
parameter set xnb, we have to consider the likelihood of succeeding the attempt. Define ps as the probability
of meeting the task performance constraints when sanding one panel with the best estimated parameter set
xnb. The expected task completion time for the total panels can be written as follows:
- When we stop exploring and start exploitation, the expected time is:
E[T]exploit = TJ + M ∗ tcb
- When we keep exploring, then the total expected time is:
E[T]explore = TJ + M ∗ (ps ∗ tnb + (1 − ps) ∗ tcb) + (1 − ps) ∗ tnb
Whether the attempt to sand the next panel using xnb fails or succeeds, the amount of time tnb is spent
for the exploration. When it fails, we conclude that xcb is the best parameter set and decide to sand the
remaining M panels with xcb. When the attempt succeeds, xnb becomes the best parameter set to use,
and the remaining (M − 1) panels will be sanded with those parameters. We compute the expected task
completion times every time sub-tasks are executed sequentially. If E[T]exploit < E[T]explore, then we stop
exploring. If E[T]exploit > E[T]explore, then we keep exploring. The expected task completion time for
all panels is governed by the remaining number of panels M. When there exist lots of remaining panels,
exploring some more portions of the task may lead to a great reduction of the total task completion time.
68
Even a slight reduction of tcb can affect a lot to the total task completion time. On the other hand, if only a
few panels are left, then the small reduction of tcb may not be much reduction in the total task completion
time. In this case, choosing the exploration will not be a good choice.
We set up the constant value Nc in order to prevent the case when the algorithm is stuck in the local
minimum and stops prematurely. Nc is the number of executions for mandatory exploration. Regardless
of the expected values computed by exploitation or exploration, we can keep the exploration phase until
we complete a certain number of executions. This approach enables building better GP models that make accurate predictions and ensures that a sufficient number of panels are explored. In our application, Nc is assigned by users. It can be set appropriately depending on the total number of panels to sand. When there
are lots of panels, Nc can be a large number so the exploration phase can be long enough. On the other
hand, Nc can be a smaller number when there are a small number of panels. In this case, the mandatory
exploration session will end early, then the exploration and exploitation trade-off algorithm will take place
for the decision making process.
4.5 Results of Robotic Sanding
Unfortunately, due to the pandemic, we were unable to conduct physical experiments in the lab to validate
our approach. We instead tested our approach with computational simulations for the robotic sanding
process.
4.5.1 Description of Function Used for Simulating Sanding Process
We modeled the sanding process using analytical functions. The process parameter limits are given as
follows. The force ranges from 0 to 200N, the rotational speed ranges from 0 to 1000RPM, and the tool
velocity ranges from 0 to 200mm/s. Figure 4.8 shows the example models of the sanding task built with
analytical functions. The figure shows three representative examples of different velocity levels. When the
force and rotational speed values are the same, low velocity produces a greater or equal finish quality than
high velocity. We also constructed a model for the force and surface deflection. The degree of deformation
increases as the applied force increases up to a certain threshold.
Figure 4.8: The modeling of sanding tasks using analytical functions.
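The exact analytical functions are not reproduced here; the sketch below is a purely illustrative stand-in that only mimics the qualitative trends stated in Section 4.4: finish quality peaks at a mid-range rotational speed, improves with force up to a point, degrades at high velocity, and deflection grows with force alone. All coefficients are invented for illustration.

```python
import numpy as np

def simulated_sanding(f, r, v):
    """Illustrative stand-in for the analytical sanding simulator (not the
    functions used in this work). Returns (finish quality, deflection)."""
    q_force = np.clip(f / 120.0, 0.0, 1.0)       # more force helps, saturating
    q_rpm = np.exp(-((r - 500.0) / 250.0) ** 2)  # optimum near mid-range RPM
    q_vel = 1.0 / (1.0 + (v / 100.0) ** 2)       # fast motion hurts quality
    quality = 10.0 * q_force * q_rpm * q_vel     # finish quality in [0, 10]
    deflection = 0.12 * f                        # deflection depends on force only
    return quality, deflection
```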
4.5.2 Implementation of Our Approach
We have implemented our approach with different choices of settings. Depending on which methods to use
for sampling or how large the part is, the algorithm makes different decisions. For each scenario, we ran
the algorithm 100 times and computed the mean of outcomes. N is the total number of panels. N′ is the
number of executions for the exploration phase (or the number of panels used for exploration). When the
robot stops exploration and starts exploitation, it takes the process parameter set known to work the best to
perform the rest of the task. tcb indicates the time to sand one panel using the best performing parameters
discovered so far. T represents the total task completion time. γ = 7, δ = 16, and λ = 900.
Scenario 1. Different sampling methods: We want to see the impact of sampling methods in the early
stage of exploration. We consider three cases: perform pure feasibility biased sampling (FBS), pure random
sampling (RS), and a combination of the two sampling methods (RS+FBS). For the (RS+FBS) case, we perform
FBS first until the algorithm finds the set of parameters that meets the task performance constraints. If the
number of executions is smaller than the minimum required number of initial data Na, we perform random
sampling up to Na. We set the number as Na = 9, the frequency of GP hyperparameter optimization to
be Nb = 15, and the minimum executions of the exploration phase to be Nc = 20. Besides the sampling
method, other conditions are the same. The entire surface consists of 100 identical panels. Table 4.1
compares the three outcomes when different sampling methods are used for the initial data. The RS
method is inefficient compared with the other two sampling methods: it requires approximately 76 executions
for exploration and takes more time to complete the entire task. The FBS and (RS+FBS) methods take an
average of about 20 and 21 exploration executions, respectively, and their mean total task completion times
are also smaller than that of RS.
Table 4.1: Selecting different sampling methods results in different outcomes in terms of N′, tcb, and T. Times are in seconds.

Sampling method   µ(N′)   µ(tcb)   µ(T)
RS                75.73   10.86    2584.70
FBS               20.38   10.87    1335.08
RS + FBS          21.20   10.83    1376.01
Scenario 2. Different sizes of the surface to sand: We want to consider cases in which the robot needs
to sand surfaces of different sizes. Assume that there are three sizes: small, medium, and large parts,
consisting of 20, 100, and 500 identical panels, respectively. Na = 9 and Nb = 15. Nc is
assigned differently for each size. For the large part, we want to ensure the robot has enough attempts for
exploration. For the small part, we may want to keep the exploring phase short and begin exploitation to
maximize the likely gain. Nc is 15, 20, 30 for the cases of N = 20, 100, 500, respectively. The sampling
method during the initial stage is selected to be (RS+FBS) method. Other than N and Nc, the conditions
are the same. The outcome is presented in Table 4.2.
As N increases, the mean of exploration executions and total task completion time increase. Let’s
compare the cases of N = 20 and N = 100. µ(tcb) of the small part (N = 20) is about 11 seconds. It
seems the algorithm has not identified the optimal process parameters. This is understandable because the
Table 4.2: Sanding different sizes of rusty surface results in different outcomes in terms of N′, tcb, and T. Times are in seconds.

# of total panels   µ(N′)   µ(tcb)   µ(T)
N = 20              16.81   11.01    474.40
N = 100             21.20   10.83    1376.01
N = 500             30.25   10.83    5801.14
small part does not offer much value from a longer exploration phase. Hence, the algorithm settles for
suboptimal parameters and begins exploitation. We have used the success rate of each task performance
in a sequential manner for greedy heuristics. The parameter sets in Sgrid that are likely to meet the deflection
constraints are saved in S. Then, the set of parameters with the highest success rate of meeting finish quality
in S is selected as the next best set of process parameters.
Scenario 3. Tuning hyperparameters of GP models: We want to see the impact of GP hyperparameter
optimization on the task performance and decision making process. In this scenario, we assume there are
20 total panels, and the (RS+FBS) method is used. The given values are N = 20, Na = 9, Nb = 5,
and Nc = 15. Table 4.3 shows the mean and maximum of N′ and tcb. GP tuning improves the outcomes.
Without tuning the GP hyperparameters, the task completion time for a single panel can exceed 20 seconds;
in such cases, the parameter set is stuck in a local minimum. Optimizing the GP hyperparameters helps to
improve task performance.
Table 4.3: Tuning hyperparameters of GP models affects the average and maximum values of N′ and tcb. Times are in seconds.

GP model         µ(N′)   max(N′)   µ(tcb)   max(tcb)
With tuning      15.74   24        11.12    14.03
Without tuning   16.14   26        11.30    20.86
The sequential decision making process is visualized in Figure 4.9 as a representative example. The
colored points in the figure are the sets of parameters selected in the parameter space for task executions.
The brightest point represents the first execution, and the darkest point represents the last execution. The
color transition from light to dark color expresses the sequence in which the parameter sets are selected.
During the initial sampling stage, the points are scattered around. Once greedy optimization begins, it leads
the parameter set in the direction of higher velocity levels to improve task performance and reduce the task
completion time.
Figure 4.9: The sets of parameters selected for task executions are marked in the parameter space. The sequence of
the process parameter selection is represented by the color gradation from light to dark.
Comparison with other methods: Our proposed method is compared with three other DOE methods: a
full factorial design with three factors at three levels, a Taguchi L25 orthogonal array design with three
factors at five levels, and the central composite design of RSM. Finding the optimal process parameters
with these DOE methods returns the time to sand one single panel, tcb. Table 4.4 shows the total number
of executions and the best time value calculated. The concept of balancing exploration and exploitation is
not considered in these methods, so the number of total panels does not affect the optimization result. We
report the results of our method for the medium surface size (N = 100). We ran our algorithm 100 times,
so the mean number of executions and the mean of the optimal time are calculated and compared with
the traditional DOE methods. The Full factorial, Taguchi L25, and RSM methods require 27, 25, and 20
experiments, respectively. After the optimal process parameters are calculated through each DOE method,
one more experiment is conducted for validation. Hence, the final numbers of experiments needed for the
Full factorial, Taguchi L25, and RSM methods are 28, 26, and 21. Compared to our approach, the Full
factorial and Taguchi L25 methods require a greater number of executions, resulting in an increase in task
completion time. Their optimal sanding time, tcb, is 12.86 seconds, which is higher than the optimal sanding
time using RSM or our method. Compared to RSM, our method achieves a smaller µ(tcb), and its mean
number of executions, 21.20, is comparable to the number of executions needed with RSM.
Table 4.4: Comparison between our approach and other DOE methods. Times are in seconds.

Method           # of executions   tcb
Full Factorial   28                12.86
Taguchi L25      26                12.86
RSM              21                11.12

Method       µ(N′)   µ(tcb)
Our Method   21.20   10.83
4.6 Case Study: Robotic Sanding Task with Risks of Irreversible
Damage
Figure 4.10: Robotic sanding task that poses the risk of irreversible damage. The part can be damaged severely if
an excessive amount of force is applied.
We have extended our algorithm to another case study involving a robotic sanding application.
In the previous case study, we assumed that the part could withstand a substantial amount of force without
encountering catastrophic failures. In this application, we introduce a characteristic where the part is
susceptible to irreversible damage if a force greater than a certain threshold is applied. It is crucial to
exercise caution when applying force to ensure the part's safety. A 6 DOF manipulator is used to perform
sanding operations on a given surface. The robot employs a sanding tool at the end effector to mechanically
remove areas affected by rust. Figure 4.10 illustrates an example of the robotic sanding process in which
irreversible damage occurs as a result of excessive force applied during sanding. The part may experience
reversible damage up to a certain force level, but once the force surpasses a specific threshold, the damage is
irreparable. Repairing or replacing the damaged part incurs significant time and cost. We assume
that the part parameters and mechanical properties remain constant across the entire surface when the
applied force is below the threshold. Violating the deflection constraint carries severe and irreversible
consequences, whereas violating the surface quality constraint results in reversible damage; the deflection
constraint is also relatively simple compared to the more complex surface quality constraint. There is
limited prior knowledge available.
For the iterative learning process, we divide the entire surface into N identical panels. The
part parameters and mechanical properties are assumed to be the same over the entire surface. The relation
between process parameters and task performance is the same as in the previous case study. The only
difference is that excessive force causes catastrophic damage to the part or tool. When a catastrophic failure
happens, the cost of repairing or replacing the part is high; it includes the time to remove the broken part,
prepare a new part, and the associated labor costs. We express all of these costs as time. When the part is
sanded without catastrophic failures, the time to sand one panel is inversely proportional to the forward
velocity. It can be written as ti = λ/vi, where λ is given. When a catastrophic failure occurs, λ becomes
extremely large, and ti skyrockets. The finish quality q should be greater than or equal to a certain level γ;
higher finish quality implies better performance. The deflection amount d should be less than δ to meet the
task performance constraints. When d is greater than δ, the deflection constraint is violated, but that does
not necessarily mean a catastrophic failure has happened. Catastrophic failures occur when d is greater than
or equal to the maximum deflection limit ζ. When d lies between δ and ζ, the damage to the part surface is
reversible, and the robot can redo the sanding task on the panel. The main objective function can be written
as follows.
min E[Σ_{i=1}^{N} ti] (4.7)
s.t. q ≥ γ, d ≤ δ (redo), G(f, r, v) ≤ 0 (4.8)
q = F1(f, r, v), d = F2(f), d ≥ ζ (catastrophic failure), and ti = λ(di)/vi (4.9)
γ, δ are given and constant (4.10)
G(f, r, v) ≤ 0 indicates the constraints on the parameters due to the capability of the robot or physical
constraints imposed on the robot. For example, for safety, the speed of the robot or tool might be limited
to a certain value, or there may be a maximum rotational speed depending on the type of sanding tool.
The performance models that map the process parameters to task performances are unknown, black-box
functions. λ is a function of the deflection, as it changes depending on the failure status: if d ≥ ζ, then λ
becomes extremely large. The details of our algorithm are as follows:
1. Initial sampling stage to identify the safe region: We know the maximum deflection limit that can
cause the part to break, but we do not know the force that corresponds to that deflection limit. We call
this force the failure limit. The first step is to identify this failure limit so that the robot never explores
the region above it. We assume that the impact of the rotational speed and forward velocity of the tool
on the surface deflection is negligible. Then, the process parameter model of the deflection is a function of
the applied force alone. To construct a process model using a surrogate model, we need a few data points.
We can select small force values that are unlikely to cause catastrophic events. Assume that we know that
any force less than or equal to fsafe will not result in excessive deformation. We first select K samples
within this initial safe force region and execute the task. By sampling K points incrementally, a local
mapping between force and deformation can be inferred. With the initial data, we can construct a GP
model that estimates deflection. The uncertainty of the GP model is low for forces located near the
previously explored forces; it becomes high as we estimate deflections for forces further away from the
initial data. Figure 4.11 shows how the uncertainty in the GP model changes. Here, the forces of 40, 45,
and 50N (red starred points) are the parameters that have been explored. The forces between 40 and 50N
have extremely low uncertainty, and the uncertainties become higher as force values move further from the
visited area. It is therefore reasonable to explore a force slightly above 50N. In this way, we can expand
the safe region by increasing the force for the following executions, updating the GP models, and estimating
deflections. Ideally, we would like to know the force region that satisfies the deflection constraints while
not causing catastrophic events. The other process parameters, rotational speed and forward velocity, are
selected randomly, as they have no impact on the surface deflection. Once the safe region is explored, we
can determine the current best parameter set among the visited parameters: the parameter set that returns
the highest gain while satisfying the task performance constraints. If no such parameter set exists, we can
perform feasibility biased sampling using the information from the current data, making sure to sample
parameter sets only in the safe region to avoid catastrophic failures.
Figure 4.11: The deflection estimation using the performance model constructed with initial task executions.
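Below is a minimal sketch of this incremental safe-region exploration, assuming a scikit-learn GP and a placeholder callable execute_task(f) that runs one sanding pass at force f and returns the measured deflection; the step size, kernel length scale, and optimism factor beta are illustrative choices.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def explore_safe_force(execute_task, f_safe=50.0, delta=15.0, zeta=16.5,
                       step=5.0, f_max=300.0, n_init=3, beta=2.0):
    """Grow the safe force region one step at a time.

    Start with n_init samples at and below f_safe (e.g. 40, 45, 50 N),
    then raise the force only while the GP's pessimistic deflection
    estimate mean + beta * std stays below the failure limit zeta."""
    forces = list(np.linspace(f_safe - (n_init - 1) * step, f_safe, n_init))
    defl = [execute_task(f) for f in forces]
    f_next = f_safe + step
    while f_next <= f_max:
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=20.0),
                                      normalize_y=True)
        gp.fit(np.array(forces).reshape(-1, 1), np.array(defl))
        mu, sd = gp.predict(np.array([[f_next]]), return_std=True)
        if mu[0] + beta * sd[0] >= zeta:   # too close to failure: stop growing
            break
        forces.append(f_next)
        defl.append(execute_task(f_next))
        f_next += step
    # largest visited force whose measured deflection met the redo constraint
    ok = [f for f, d in zip(forces, defl) if d <= delta]
    return max(ok) if ok else None
```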
2. Greedy heuristics with the consideration of task performance violations and catastrophic failures: The
general idea of the greedy heuristic algorithm is similar to the one in Section 4.4, and we use the same
notations. We construct the GP models for deflection estimation and finish quality estimation using
the data we have so far. Then we compute the probability of meeting the task performance constraints while
avoiding catastrophic failures for the points around the current best set of parameters xcb = (fcb, rcb, vcb).
The probability of meeting the finish quality constraint is pfinish and that of the deflection constraint is
pdef. We can also write the probability of catastrophic failure as pfailure. Then, the probability of meeting
both constraints while avoiding failures can be written as pu = pfinish ∗ pdef ∗ (1 − pfailure). We first
explore the highest velocity that satisfies pu ≥ α, then explore the rotational speed and force values, making
sure that we only explore forces that are in the safe region. We define the parameter set that has
the highest probability pu as the next best parameter set xnb = (fnb, rnb, vnb). Then, we need to decide
whether to explore this set of parameters. If we decide to explore, we execute the next sub-task using
xnb. The task completion time can be written as tnb = λ(dnb)/vnb, where dnb is the deflection that occurred
on the part surface. We then update the current data set and GP models and repeat the process until the
algorithm decides to stop exploring and start exploitation.
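Assuming Gaussian GP posteriors, pu can be assembled from the two surrogate models as sketched below; gp_quality and gp_deflection stand for fitted scikit-learn regressors, and the small floor on the standard deviations only guards against division by zero.

```python
from scipy.stats import norm

def success_probability(gp_quality, gp_deflection, x, gamma, delta, zeta):
    """pu = pfinish * pdef * (1 - pfailure) under Gaussian GP posteriors.

    x = (f, r, v): gp_quality predicts finish quality from (f, r, v),
    gp_deflection predicts deflection from the force alone."""
    f, r, v = x
    mu_q, sd_q = gp_quality.predict([[f, r, v]], return_std=True)
    mu_d, sd_d = gp_deflection.predict([[f]], return_std=True)
    sq, sd = max(sd_q[0], 1e-9), max(sd_d[0], 1e-9)
    p_finish = 1.0 - norm.cdf(gamma, loc=mu_q[0], scale=sq)   # P(q >= gamma)
    p_def = norm.cdf(delta, loc=mu_d[0], scale=sd)            # P(d <= delta)
    p_failure = 1.0 - norm.cdf(zeta, loc=mu_d[0], scale=sd)   # P(d >= zeta)
    return p_finish * p_def * (1.0 - p_failure)
```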
3. Exploration vs. exploitation trade-off: The exploration and exploitation decision will be made by
comparing the expected cost to accomplish the objective function. We follow the same exploration and
exploitation decision explained in Section 4.4. When the current best set of parameters is xcb, the next best
set of parameters computed by greedy heuristics is xnb, and the remaining number of panels to sand is M,
the expected task completion time is as follows:
- When we stop exploring and start exploitation, the expected time is:
E[T]exploit = TJ + M ∗ tcb
- When we keep exploring, then the total expected time is:
E[T]explore = TJ + M ∗ (pu ∗ tnb + (1 − pu) ∗ tcb) + (1 − pu) ∗ tnb
Here pu = pfinish ∗ pdef ∗ (1 − pfailure). This implies that we want to explore the parameter set that
meets all task performance constraints while strictly avoiding catastrophic failures. The expected task
completion times will be computed whenever a new attempt to sand a panel is made. We stop exploring
when E[T]exploit < E[T]explore.
In safe learning, we also set up the constant value Nc to prevent the algorithm from getting stuck
in a local minimum. Nc is the number of executions for mandatory exploration. Although many parameter
sets may be explored in the initial stage, the algorithm can still stop prematurely; setting Nc helps it
escape the local minimum and find a better parameter set. We formulated the synthetic functions and
implemented our safe learning algorithm for the robotic sanding process.
4.6.1 Description of Function Used for Simulating Sanding Process
We modeled the sanding process using analytical functions. The force ranges from 0 to 300N, the rotational
speed ranges from 0 to 1000RPM, and the tool velocity ranges from 0 to 300mm/s. The performance models
between process parameters and task performance have a similar trend to the previously formulated synthetic
function. However, this time, there is a maximum deflection limit that could cause fatal damage to the part
surface or tool. Also, we assume that the deflection could occur somewhere between 0.1 and 20cm.
4.6.2 Implementation of Safe Learning Approach
We have implemented our safe learning approach with different choices of settings. Here, we ran the algorithm
500 times and computed the mean of outcomes. N is the total number of panels. N′ is the number of
executions for exploration; it can be seen as the number of panels used for exploration. When the robot
starts exploitation, it utilizes the set of process parameters known to work the best, called the best parameter
set xcb, to sand the rest of the panels. tcb stands for the time to sand one panel using the best performing
parameter set xcb. T stands for the total task completion time to sand all panels. In our setting, the numbers
are given as follows: N = 100, γ = 6, δ = 15, and ζ = 16.5. Table 4.5 indicates how the sanding performance
compares with and without the safe learning strategy. We have used our learning algorithm presented in
Section 7.3 for the non-repetitive robotic sanding task. PS indicates the success rate of sanding the part
surface without catastrophic events among 500 tries.
Table 4.5: Comparison of outcomes when applying the safe learning strategy and not applying it. Times are in seconds.

Learning Strategy   PS     µ(N′)   µ(tcb)   µ(T)
Regular             85%    20.03   9.49     1304.80
Safe Learning       100%   24.20   9.60     1362.58
When we did not apply the safe learning strategy, 75 violations occurred out of 500 sanding runs. Comparing
the mean number of exploration executions, the safe learning strategy requires more than the regular
learning algorithm. This is reasonable because it takes more time to find a safe region to explore in the
initial stage. In terms of the means of tcb and T, the safe learning algorithm returns slightly higher values.
This indicates that regular learning is slightly more aggressive than safe learning and was able to find a
better parameter set.
We have also compared the case where the deflection constraint for performance is relatively close to
the maximum deflection limit. We set the numbers as N = 100, γ = 6, δ = 15, and ζ = 15.2. When
the deflection is above 15cm but less than 15.2cm, the robot can redo the sanding process. However,
re-execution can be more dangerous because there is little room between the regular deflection constraint
(redo) and the maximum deflection limit (failure). Table 4.6 shows the results for these two cases.
Table 4.6: Comparison of outcomes when applying the safe learning strategy with different maximum deflection limits ζ. Times are in seconds.

ζ      PS     µ(N′)   µ(tcb)   µ(T)
15.2   100%   23.37   9.64     1357.69
16.5   100%   24.20   9.60     1362.58
We can see that whether or not the deflection limit is near the maximum deflection value, our safe learning
algorithm avoids catastrophic failures 100% of the time. There is little difference between the two cases in
terms of µ(N′), µ(tcb), and µ(T).
4.7 Summary
This chapter presents the adaptive experimental design for learning constant process parameter models while
enhancing task performance for robotic sanding applications. The decision making process for selecting the
set of parameters in each experiment balances exploration and exploitation while meeting the constraints
of our objective function. We implemented the learning algorithm in a computational simulation of the
robotic sanding application and examined three different scenarios. From the outcomes of the scenarios, we
learn that the use of feasibility biased sampling in the initial exploration phase can efficiently improve task
performance and reduce the total task completion time. It is also observed that the number of executions
during the exploration phase can increase when sanding a larger part. Finding the right trade-off between
exploration and exploitation is essential to minimize the expected task completion time, and GP
hyperparameter optimization proves beneficial in enhancing task performance. We compare our approach
with three conventional DOE methods: full factorial design, the Taguchi method, and RSM. Our approach
using greedy heuristics shows better results in terms of finding the best process parameters with a smaller
number of executions during the exploration phase.
The second case study presents the safe learning framework to avoid irreversible damage to the part in
robotic sanding applications. When a problem is given, the first step is to determine the task performance
measures and the process parameters that affect them. Then, our priority is to learn a performance model
between the affecting parameters and the affected task performance. This exploration process entails GP
surrogate modeling and estimation. Once we have determined the safe region to explore, we can proceed
to employ feasibility biased sampling or greedy heuristics, depending on the acquired data set. In the
context of greedy heuristics, the exploration process involves sequentially investigating regions with higher
velocities than the current best velocity; the objective is to minimize the task completion time. The
probabilities of meeting performance constraints and avoiding catastrophic failures are computed for both
sparsely and densely sampled regions, and we identify the parameter set with the highest likelihood of
satisfying both conditions. Then, we decide whether to explore the new parameter set or exploit the best
set of parameters gained so far. The expected task completion times for both cases are computed to make
the decision. In this case study, we consider only a one-step lookahead, which means the exploration and
exploitation decision is based on whether to explore one more sub-task.
Chapter 5
Learning of Spatially Varying Process Parameter Models for
Robotic Finishing Applications
5.1 Introduction
Surface finishing is an important manufacturing process. Completing the process efficiently and successfully
requires using the right process parameters. Parts being processed can be divided into two types. The
first type of parts can be considered rigid, i.e., part deflection under the force applied during the process
is negligible. Therefore, constant process parameters can be used to finish the part. Since the process
parameters are constant, the optimal process parameters can be identified and used. The second part type
includes compliant regions (see Figure 5.1), i.e., the part exhibits significant deflection during the processing.
Different regions on the part exhibit different levels of rigidity or compliance depending upon their geometries
or their distances from the points on the part that make contact with the fixture. These parts cannot be
processed using the constant process parameters. Regions that are relatively rigid can use a higher magnitude
of force to ensure fast processing (e.g., see Region B in Figure 5.1). On the other hand, regions that are
not very rigid require the use of a lower level of force to ensure safety and prevent failure due to excessive
deformation (e.g., see Region A in Figure 5.1). The use of lower force means that the region will be processed
at a slower rate. Therefore, compliant parts require the use of process parameters that vary spatially based
on the rigidity of various regions on the part. So rather than learning optimal constant process parameters,
we need to learn spatially varying process parameter models for the part to help us in using the right process
parameters in various part regions depending upon the rigidity of the region.
Figure 5.1: The regions of the part exhibit different levels of compliance.
There is significant interest in using robots in surface finishing applications (Figure 5.2). Human experts
can program the robots by performing a large number of experiments and finding the right process parameters. However, this approach is costly and time-consuming. AI technologies could instead be used so that an
algorithm finds the right process parameters by analyzing data. AI-driven experimental design enables robots to learn process parameter models by performing experiments, evaluating measurement
outcomes, learning from the evaluation, and updating internal models. This adaptive learning approach
potentially reduces the time and effort required for surface finishing compared to traditional manual programming methods.
The learning can be realized through many different methods such as active learning, reinforcement learning (RL), or design of experiments (DOE). With the RL approach, a robot learns to perform a task through trial
and error, with rewards based on performance. RL can be a very effective approach if a simulation exists
for the given task so that a large number of experiments can be conducted virtually. The DOE method is a
systematic approach to designing physical experiments and understanding the impact of parameters on task
performance: it selects process parameters for experiments and uses mathematical models to estimate the best
process parameters. Many DOE methods have been developed, and they are well-structured and widely used.
The use of active learning in a robotic manufacturing task enables robots to receive feedback and actively
query experimental data to learn the task and improve learning efficiency. For example, in our previous
paper [139], we reduced the task completion time of learning non-repetitive manufacturing tasks that used
constant process parameters, demonstrating that our approach could result in a shorter task completion
time than DOE methods.
Figure 5.2: A manipulator is performing the sanding operations on the part that has spatially varying stiffness. The
part’s stiffness varies depending on regions due to its geometries, thickness, and metal support underneath.
The focus of this chapter is on developing the adaptive experimental design for learning contact-based
finishing tasks, such as sanding, polishing, grinding, or buffing on parts that exhibit varying levels of rigidity
in different regions (i.e., spatially varying stiffness) [141]. In this case, the relationships between the input
process parameters and the output task performance vary depending on the region in the part. Different
process parameters should be used in different regions by taking into account the stiffness characteristics of
the regions. Careful use of process parameters is required to prevent excessive deformation and ensure safety
in regions that are not very rigid.
Our goal is to develop an efficient approach to learning spatially varying process parameter models for
contact-based finishing tasks for a robot. Learning spatially varying process parameter models is more
complex than learning constant process parameter models because additional factors should be considered
during the learning process. Those factors include performing the appropriate data acquisition over the part,
selecting the efficient region sequencing policy based on the part’s stiffness map, and selecting the proper
process parameters to use in each region. For example, initial exploration should be carefully designed so
that the data is evenly acquired from different regions of the part. If a robot performs the initial experiments
only in certain regions, this may result in a biased data set. Also, the sequences of the regions to perform the
finishing process should be designed with the consideration of the part’s stiffness map or rigidity variations.
Users may choose to start from the regions that are relatively rigid or from the regions that are more
compliant. The cost of failure and the value of success will vary depending on the regions that are processed.
We have developed the iterative learning approach with an initial task execution policy, surrogate modeling approach, region sequencing policy, and process parameter policy. Contributions of this chapter include
(1) Development of a framework for learning spatially varying process parameter models through iterative
learning in the context of contact-based finishing tasks by a robot. (2) Characterization of region sequencing
policies by carefully weighing the cost of failure against the value of success. Our approach identifies a
set of process parameters that can accomplish the task with the minimum expected completion time. (3)
Validation of our approach using computational simulations and physical experiments with a case study of
the robotic sanding task.
5.2 Problem Formulation
Consider a scenario where a robot needs to complete contact-based finishing tasks such as robotic sanding,
polishing, grinding, and buffing. Assume that the tools used are rotary tools. The task is high-mix, so the
robot needs to perform finishing tasks for parts with different geometries, sizes, etc. Parts have spatially
varying stiffness, and the range of optimal process parameters is not known in advance.
(1) Process parameters: A robot needs to select process parameters when performing a contact-based
finishing task. Process parameters are controllable inputs for the task. For contact-based finishing tasks,
process parameters can be a force, rotational tool speed, or tool forward velocity. We can define a set of
process parameters as a vector form x. Process parameters may be subject to physical constraints, safety
constraints, or any other constraints. These constraints can be written as g(x) ≥ 0. g(x) is a vector-valued
function.
(2) Part parameters: Part parameters are parameters related to the characteristics of a part such as
stiffness. Part parameters can have impacts on task performance. The part parameter can be written as a
vector form, y.
(3) Task performance measures: Task performance measures are quantifiable parameters or evaluation
metrics that represent the performance of the task. The task performance measures are dependent parameters
and can be considered outputs for the task. A set of task performance measures can be written as z. The
task performance requirements can be written as h(z) ≥ 0. h(z) is a vector-valued function.
(4) Process parameter model: A process parameter model is a relationship between input parameters
and task performance measures of the task. The inputs can be process parameters and part parameters.
The outputs are task performance measures. A process parameter model can be written as a function in the
form f(x, y) in general. For the ith task performance measure zi, the process parameter model represents
the relationship between zi and the input parameters. This can be represented by zi = fi(x, y).
(5) Task completion time: Task completion time is the time it takes to complete a given task. If the
robot performs a total of N task executions, the total task completion time can be written as the sum over
executions, T = Σ_{i=1}^{N} ti, where ti denotes the task completion time of the ith task execution.
(6) Probability of success: Before executing a task, we can determine the probability of success for the
task execution. The task execution succeeds when all task performance constraints are satisfied. The task
execution fails even if one performance constraint is violated. We can denote the probability of success as
Ps. The probability of task failure can be written as Pf = 1 − Ps.
Our approach aims to minimize the expected time needed to complete the contact-based finishing tasks
on a part with spatially varying stiffness. This problem can be seen as an optimization problem under
constraints. The objective is to minimize the expected task completion time. The constraints that need to
be satisfied are the process parameter constraints and the task performance constraints.
5.3 Approach
5.3.1 Overview
Figure 5.3: Flow chart of the learning and task execution process.
Figure 5.3 illustrates a flowchart outlining the adaptive experimental design process proposed for learning
robotic finishing tasks. Our goal applications are contact-based finishing tasks, such as sanding, grinding,
polishing, and buffing for a part with spatially varying stiffness. The process begins with obtaining a new part
and performing initial finishing experiments to acquire data. Since the part is newly given, the relationships
between input and output parameters are unknown. During initial exploration, the experiments should be
performed at various regions of the part so that the data is acquired from different stiffness regions. The next
step is to construct surrogate models using the data acquired from the initial experiments. Considering the
extensive time and resources required to construct comprehensive process parameter models, our approach
utilizes surrogate modeling. The aim of this chapter is not to identify complete process parameter models
for finishing tasks but rather to find adequate models and efficiently learn the tasks. The next process is to
create a list of the remaining task regions to complete. Should the list be empty, it means all areas requiring
work have been finished, and the procedure can be terminated. If there are remaining task regions, the task
execution continues. First, we select a task execution region based on the region sequencing policy, and then
we determine which task execution parameters to use in that region. The policy identifies the parameters
expected to result in the shortest task completion time. The task is executed using the selected parameter
set. If the task is unsuccessful or the task performance constraints are violated, the corresponding data is
saved to the failed data set, prompting an update to the surrogate models. If the task succeeds, the list of
remaining task regions is updated accordingly. These processes are repeated until all task regions of the part
are visited.
5.3.2 Definition of Parameters and Objective Function for Contact-based Finishing
Tasks
For contact-based finishing tasks performed by a robot with a rotary tool, we can identify input and output
parameters as below. The units of the parameters or variables are written inside parentheses.
- Force applied to the surface, f (N).
- Rotational speed of the rotary tool, r (RPM).
- Tool forward velocity, v (mm/s).
- The process parameter set can be written as x = (f, r, v).
There is a spatially varying part parameter, which is stiffness.
- Stiffness of the part, s (N/mm).
- The spatially varying part parameter set can be written as y = s.
There are two task performance measures.
- Surface quality, q (no unit).
- Deflection on the surface, d (mm).
- The task performance set can be written as z = (q, d).
We can characterize the quality of the surface with a scale of 0 to 10. A scale of 0 indicates the worst
surface quality and a scale of 10 means the best surface quality. All three process parameters affect the surface
quality. Stiffness does not significantly influence the surface quality. So, the process parameter model of
the surface quality can be written as q = f1(f, r, v). During contact-based finishing tasks performed by
the manipulator, it is necessary to apply a suitable level of force to attain a smooth surface. Another task
performance is deflection occurred on the part location when the finishing task is being performed. The
impact of force and stiffness is significant on the deflection. We assume the impact of the rotational speed of
the tool and forward velocity is negligible on the deflection. In this case, the deflection can be expressed as a
function of force and stiffness, denoted as d = f2(f, s). As stiffness varies across different regions or locations
of the part, the deflection model f2 exhibits spatial variation. In the regions that are relatively rigid, even
if the robot applies a high amount of force to the part, it may result in a small deflection. Conversely, in
regions with low stiffness, proper application of force on the surface is necessary to ensure safe exploration
of that area. It may be safe to start with a lower level of force when exploring the regions that are relatively
compliant to avoid excessive deformation. Hence, it is crucial to carefully determine an appropriate range
of forces based on the stiffness of each region. In situations where direct measurement of the part’s stiffness
is challenging, we can use a location on the part as an input because the stiffness is a function of location.
The task execution time is inversely proportional to the forward velocity of the tool. This can be written as
t = λ/v, where λ is a time-velocity constant. If the task fails and the robot needs to perform the task again
at the same location, the task completion time will be t = λ/v + β, where β represents the additional time
required for the re-execution process. The objective function can be written as follows when there is
a total of N task executions.
min E[Σ_{i=1}^{N} ti] (5.1)
s.t. g1(f) ≥ 0, g2(r) ≥ 0, g3(v) ≥ 0 (5.2)
h1(q) ≥ 0, h2(d) ≥ 0 (5.3)
q = f1(f, r, v), d = f2(f, s) (5.4)
Constraints (5.2) indicate the process parameter constraints. The force applied by the robot arm during
the finishing operation must satisfy the constraint g1(f) ≥ 0, ensuring that the force stays within allowable
boundaries. The rotational speed cannot exceed the maximum speed of the rotary tool, written in the form
g2(r) ≥ 0. The tool forward velocity cannot exceed the maximum speed of the robot arm, written as
g3(v) ≥ 0. Constraints (5.3) indicate the task performance constraints. The surface quality should be above
a certain quality level, written as h1(q) ≥ 0. The deformation d should be less than a certain value to avoid
excessive deformation and permanent damage to the part, written as h2(d) ≥ 0.
Equation (5.4) indicates the process parameter models, f1 and f2, for the surface quality and deflection.
Our goal is to find the set of process parameters that can satisfy all the constraints while minimizing the
expected completion time.
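Since later policies rank parameter sets by expected time, it helps to see the one-execution time model in code; the sketch below assumes at most one re-execution per location, which is a simplification of the redo process.

```python
def expected_execution_time(lam, v, beta, p_success):
    """Expected time of one task execution at tool forward velocity v.

    A successful pass costs lam / v; a failure adds the re-execution
    overhead beta. Assumes at most one redo per location."""
    return lam / v + (1.0 - p_success) * beta

# A faster tool speed only pays off if it is likely enough to succeed:
print(expected_execution_time(lam=900, v=90, beta=30, p_success=0.95))   # 11.5
print(expected_execution_time(lam=900, v=120, beta=30, p_success=0.60))  # 19.5
```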
5.3.3 Initial Exploration
5.3.3.1 Part Surface Splitting into Smaller Patches
To make the finishing process more efficient, we can split a given part surface into multiple patches. The
purpose of splitting the part into multiple patches is to efficiently learn the characteristics of the part with
spatially varying stiffness. This approach is particularly useful if the part can be split into matching or
identical patches. If the patches are identical, they have the same stiffness map.
Figure 5.4 shows the example of part split. A part with a trapezoid shape is given for a contact-based
finishing task in Figure 5.4(a). The stiffness of the part varies due to its geometry and extruded supports
underneath. The part surface can be split into random shape patches like Figure 5.4(b). The part can be
Figure 5.4: (a) Isometric view: a trapezoid shape part is given. (b) Top view: a part is split into non-matching
patches. (c) Top view: a part is split into two pairs of identical patches.
split into pairs of identical patches like Figure 5.4(c). In the case of Figure 5.4(c), U1 and U2, as well as
U3 and U4 possess the same stiffness map. We can first learn proper process parameters (force, rotational
speed, tool speed) to perform the task in one patch U1. Then, we can leverage the parameter information
obtained from U1 to perform the task in U2. Similarly, after learning process parameter models of U3, we
can apply that knowledge to improve the finishing process in U4. This strategy enables efficient learning
of process parameters and time-saving for the finishing task. On the other hand, when the part is split
into non-matching or randomly-shaped patches like Figure 5.4(b), it may take more time to learn process
parameters. Since the shapes of the patches are different, we cannot efficiently leverage the information we
gained from one patch to perform the finishing task for another patch. This may increase the entire task
completion time. Hence, it is more consistent and efficient to split the part surface into matching or identical
patches for efficient learning of process parameter models.
5.3.3.2 Initial Task Execution Policy
After splitting the part into multiple patches, we can begin initial exploration. The initial task execution
policy comprises two elements: spatial variation when selecting the task execution locations and feasibility-biased sampling when selecting the process parameters to use. Assume that Ninit experiments will be
conducted in the current patch. It is important to consider the spatial variation of the stiffness when
exploring the patch. Ideally, experiments should be performed evenly across the part surface so the data can
be acquired in various stiffness regions. This helps to build an approximate stiffness map of the part surface.
Also, this approach ensures that the data collected is not biased toward a particular region.
Figure 5.5: An example of feasible and infeasible regions in 3-dimensional parameter space. Red areas indicate the
infeasible region. The remaining area marked in blue is the feasible region.
To select the process parameters, a combination of random sampling and feasibility biased sampling can
be used. Random sampling entails randomly selecting a set of process parameters from the entire parameter
space, Ω. All possible sets in Ω can be a candidate for the sampling. The disadvantage of random sampling
is that it may be difficult to find good sets of parameters due to a lack of directed heuristics. A set of
parameters selected by random sampling often fails to meet task performance constraints without heuristics.
To overcome this disadvantage, feasibility biased sampling can be employed. Feasibility biased sampling is
the sampling method to select a set of process parameters from feasible parameter space, Ψ. Feasible and
non-feasible regions in the parameter space can be distinguished through multiple experiments and physics-based information. By sampling the parameters only from the feasible regions, we can reduce the risk of
choosing parameters that are likely to fail. We define γ as a ratio of feasibility biased sampling out of the
total sampling number. If γ is close to 1, most of the process parameters are selected using feasibility biased
sampling, while only a small number of parameters are selected through random sampling. When γ is close
to 0, most of the process parameter selection is done with random sampling, and a few sets of parameters
are selected using feasibility biased sampling. Figure 5.5 shows the example of feasible and infeasible regions
in 3-dimensional parameter space for contact-based finishing tasks.
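A minimal sketch of this mixed sampling scheme is given below; the function name and arguments are hypothetical, and the feasible space Ψ is assumed to be available as a discrete list of candidate sets judged feasible from earlier experiments and physics-based information.

```python
import random

def initial_samples(n_init, gamma, omega, psi):
    """Draw n_init process parameter sets for initial exploration.

    A fraction gamma comes from the feasible region psi (feasibility
    biased sampling); the remainder is drawn from the full parameter
    space omega (random sampling). Both omega and psi are lists of
    (f, r, v) tuples, with psi a subset of omega."""
    n_fbs = round(gamma * n_init)
    samples = random.sample(psi, min(n_fbs, len(psi)))
    samples += random.choices(omega, k=n_init - len(samples))
    return samples
```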
5.3.4 Surrogate Model Construction and Update
A process parameter model for the ith task performance measure can be written as zi = fi(x, y). The
relationship between input and output parameters is not known in advance. We can build surrogate models
and make predictions of task performance. The surrogate model that estimates the ith task performance
measure can be written as ẑi = f̂i(x, y). Various models can be considered to capture the linear or non-linear
correlation between input and output parameters such as Gaussian Process Regression (GPR), polynomial
regression, radial basis function, or artificial neural network. We choose GPR, which is a nonparametric,
probabilistic kernel-based regression. GPR surrogate models can provide uncertainty levels of task performance at unobserved process parameters.
When building a GPR surrogate model, the selection of kernel function and hyperparameter tuning is very
important. Different types of kernel functions capture different patterns in the data. An appropriate kernel
function should be chosen to capture the relationship between input and output parameters. Also, tuning
of the hyperparameters associated with the selected kernel function is recommended to build an accurate
surrogate model. Various methods can be used to select a good kernel function and hyperparameters such
as computing marginal likelihood, comparing the loss of the models, or performing confidence evaluation
and cross-validation. We use a combination of loss comparison and cross-validation. The basic idea is to
build the models with different kernels and hyperparameter tuning, then compare the loss of the constructed
models using a validation set. The performance metric can be a loss function (e.g. mean squared error or
mean absolute percentage error). To avoid any bias in selecting the validation set, k-fold cross-validation
is used. The selection of k in k-fold cross-validation can be decided depending on the sizes of the data set.
We can compare the performance across different folds, and select the kernel function and hyperparameter
setting that produces the best performance.
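A sketch of this selection procedure with scikit-learn is given below; the candidate kernels and k = 5 are illustrative choices, and each fit() call internally tunes the kernel hyperparameters by maximizing the marginal likelihood.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

def select_kernel(X, y, k=5):
    """Return the kernel whose k-fold cross-validated MSE is lowest."""
    candidates = {"RBF": RBF(), "Matern": Matern(nu=2.5),
                  "RQ": RationalQuadratic()}
    scores = {}
    for name, kernel in candidates.items():
        fold_mse = []
        for train, test in KFold(n_splits=k, shuffle=True,
                                 random_state=0).split(X):
            gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
            gp.fit(X[train], y[train])   # hyperparameters tuned here
            fold_mse.append(mean_squared_error(y[test], gp.predict(X[test])))
        scores[name] = float(np.mean(fold_mse))
    best = min(scores, key=scores.get)
    return best, scores
```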
For contact-based finishing tasks defined in Section 5.3.2, two GPR models can be constructed as shown
in Figure 5.6. The input parameters are indicated with the arrows that go into the model box. GPR model
1 is built to predict the surface quality and can be written as q̂ = f̂1(f, r, v). GPR model 1 is not governed
by task execution location. GPR model 2 makes predictions of deflection; it is a spatially varying process
parameter model, as the part has spatially varying stiffness. The task execution location can be used as an
input because stiffness is a function of task execution location. The inputs of GPR model 2 are force and
task execution location, u = (u, v, w). Hence, GPR model 2 can be represented by d̂ = f̂2(f, u, v, w).
Figure 5.6: GPR model 1 is to predict surface quality and GPR model 2 is to predict deflection.
Through the iterative learning process, GPR surrogate models will be updated whenever the robot
performs the finishing task. The following steps summarize how to update the surrogate model.
(1) Construct the GPR surrogate models using the current data set, Scurrent.
(2) Select the task execution location to perform the next experiment using region sequencing policy. The
current location can be written as uc = (uc, vc, wc).
(3) Select the set of process parameters, xc = (fc, rc, vc), using the process parameter selection policy.
(4) Perform the finishing task at uc with the selected parameter set xc. The data set acquired from the
task execution is saved to Snew.
(5) Scurrent ← Scurrent + Snew: Update the current data set by adding the new data set.
(6) Go back to step (1) to update the GPR surrogate model. Repeat steps (1)-(6).
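The loop in steps (1)-(6) can be condensed into the following skeleton; every callable is a placeholder for the corresponding policy or model in this chapter, and the success/failure handling follows the flowchart in Figure 5.3.

```python
def iterative_learning(data, fit_gprs, select_region, select_params,
                       execute, regions_remaining):
    """Skeleton of the iterative surrogate update of steps (1)-(6)."""
    while regions_remaining:
        gpr_quality, gpr_deflection = fit_gprs(data)           # step (1)
        u_c = select_region(regions_remaining)                 # step (2)
        x_c = select_params(gpr_quality, gpr_deflection, u_c)  # step (3)
        result = execute(u_c, x_c)                             # step (4)
        data.append(result)                                    # step (5)
        if result["success"]:                                  # step (6):
            regions_remaining.remove(u_c)                      # repeat loop
    return data
```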
5.3.5 Selection of Region Sequencing Policy
After the initial exploration, we need to decide where to start the finishing task and the sequence of task
regions for subsequent tasks. The decisions can be made using a region sequencing policy, which is represented
by the symbol π. π includes the sequences of the task execution regions. For example, π = {U1, U3, U4, U2}
indicates that the task execution should begin in region U1, followed by U3, then U4, and finally U2. A
region contains an infinite set of points. A point in a region can be written as u = (u, v, w). Surface normal
vectors can be calculated from u, and the orientations of the tool can be computed based on the surface
normal vectors.
It is crucial to take into account both the cost of failure and the value of success when exploring policies
for sequencing regions. Within a part, high stiffness regions are better able to withstand high force and high
speed settings during the finishing process. The task can be completed quickly in these regions as the robot
can move the tool with high speed. Excessive deformation, changes in shape, or part damage is not likely to
occur in high stiffness regions when the force is applied. In the event of a constraint violation, the robot can
redo the finishing tasks on the same part. The cost of task failure is not high, and learning of a good process
parameter set can be done efficiently. On the other hand, the cost of task failure is high in low stiffness
regions. The regions with low stiffness are more susceptible to deformation. The part may experience a
significant deformation when a high force or high speed is applied. An excessive deformation could lead to
cracks in the part or catastrophic damage to the part. If the part is permanently damaged, it will need to
be discarded and replaced. Hence, the cost of failure is high in low stiffness regions. Learning of parameters
should be conducted meticulously in low stiffness regions to ensure safety and avoid catastrophic events. It
may take a longer time to complete the finishing task in the low stiffness regions than in other regions as
the robot may use low force and speed settings in the low stiffness regions. But once we find the process
parameter set that successfully completes the finishing task in low stiffness, the same process parameter set
can be used to perform the finishing tasks in the regions with higher stiffness. In other words, the value
of success is high. In summary, finding a successful parameter set may take a long time in low stiffness
regions, but the value of success is high once it is found. Conversely, in high stiffness regions, the value of
success is not as significant, and finding a successful parameter set is relatively easy. Even if we find a
successful set of parameters for high stiffness regions, those parameters do not guarantee a successful task
execution in other regions. In short, there is a trade-off between the cost of failure and the value of success,
and we want to strike a good balance when exploring region sequencing policies.
We need to characterize region sequencing policies when a new part is given for the finishing task. When
a part is given, we can enclose the surface of the part with a rectangular shape. Then, we can determine a
Figure 5.7: The different region sequencing policies.
long side and a short side of the rectangular shape that encloses the part. Figure 5.7 shows the rectangular
shape and several region sequencing policies. We explore the region sequencing policies described below.
(1) π1: A robot can perform the finishing task by moving a tool parallel to a long side direction. A robot
begins the task from a high stiffness region and moves the tool in a zig-zag pattern.
(2) π2: A robot performs the finishing task by moving a tool parallel to a short side of the part. A robot
begins the task from a high stiffness region and moves in contiguous regions with a zig-zag pattern.
(3) π3: A robot moves a tool in a contour parallel pattern (parallel to the boundaries of the part). The
task begins from a high stiffness region, and the pattern is clockwise.
(4) π4: A robot moves a tool in a contour parallel pattern (parallel to the boundaries of the part). The
task begins from a low stiffness region, and the pattern is counterclockwise. This region sequencing
policy can be seen as an inverse sequence of π3.
(5) π5: A robot moves a tool in a contour parallel pattern (parallel to the boundaries of the part). The
task begins from a high stiffness region, and the pattern is counterclockwise.
(6) π6: A robot performs the finishing task in a contour parallel pattern (parallel to the boundaries of the
part). The task begins from a low stiffness region, and the pattern is clockwise. This region sequencing
policy can be seen as an inverse sequence of π5.
(7) π7: A robot starts to perform the finishing task in a high stiffness region and moves the tool in a
diagonal raster pattern.
(8) π8: This region sequencing policy allows a robot to move its tool in a raster pattern while keeping
parallel to a short side. The task execution begins from a region that has high stiffness.
(9) π9: This region sequencing policy performs the finishing tasks in high stiffness regions first. After
finishing multiple high stiffness regions, it moves the tool to regions with lower stiffness, and lastly
executes the task in the lowest stiffness regions. This region sequencing policy considers the spatial
diversity of the part.
(10) π10: This policy begins by performing experiments in the lowest stiffness region, then moves on to
regions with higher stiffness. The last executions are done in the regions with the highest stiffness.
This approach can be seen as the opposite of π9.
The policies π1 to π6 move the tool contiguously, whereas the policies π7 to π10 move the tool in a
non-contiguous way. Transition time is added whenever the robot has to detach the tool from one region
and move it to another. The policies π9 and π10 consider spatial diversity; incorporating spatial diversity
in the learning components helps the algorithm avoid local minima. Figure 5.7 shows how the policies can
be applied to the rectangular shape that encloses the part surface. In this example, the rectangular shape
has different stiffness values per region: the part has the highest stiffness at the corners and the lowest
stiffness in the center. The figure shows the top view of the part. The red color indicates the starting
region, and the green arrows indicate the sequences of the regions. In π9, the green circles with numbers
represent the order in which tasks are performed: the green circle with ‘1’ represents the first region to
explore, ‘2’ the second region, and so on. In π10, the orange circles with numbers show the reverse order in
which tasks are carried out: ‘1’ represents the final region to explore, ‘2’ the region completed immediately
before the final region, and so on.
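As a small illustration of the stiffness-ordered policies π9 and π10, the sketch below orders regions by an estimated stiffness value; the region ids and stiffness numbers are invented for the example.

```python
def stiffness_ordered_policy(regions, stiffness, stiffest_first=True):
    """Order regions by estimated stiffness (N/mm): stiffest_first=True
    gives a pi_9-style sequence (rigid regions first, compliant regions
    last); stiffest_first=False gives the pi_10-style reverse."""
    return sorted(regions, key=lambda r: stiffness[r], reverse=stiffest_first)

# Hypothetical stiffness map of four regions:
stiffness = {"U1": 120.0, "U2": 80.0, "U3": 45.0, "U4": 20.0}
pi_9 = stiffness_ordered_policy(list(stiffness), stiffness)          # U1..U4
pi_10 = stiffness_ordered_policy(list(stiffness), stiffness, False)  # U4..U1
```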
5.3.6 Selection of Process Parameter Policy
After the region to perform the finishing task is selected, the next step is to determine which process
parameters to use. We use the process parameter selection policy which is based on the surrogate models
and the estimation of the task performance. First, we can define the sets of the possible parameters in the
region of interest, Ψ, and save them to Xunexplored. There are two discretization methods.
- Discretization of Ψ using fixed spacing: This method discretizes the entire feasible parameter space
Ψ with fixed spacing. The spacing size can be chosen by users. Whether coarse or fine spacing is used,
fixed spacing evenly covers the entire space of Ψ.
- Discretization of Ψ using adaptive spacing: This method uses adaptive (or dynamic) spacing when
dividing Ψ. Adaptive spacing means the spacing size is not fixed and can vary. Users can change the
spacing size depending on their needs. For instance, we can adjust the spacing size based on the size
of the feasible space. We can use coarse spacing when Ψ is large and then switch to fine spacing as we
gather more data and narrow down Ψ to a smaller space. Adaptive spacing also enables efficient search
in the parameter space. By starting with coarse spacing, the search can quickly identify broad regions
of the parameter space that could possibly contain good sets of parameters. Then, we can conduct a
more detailed search using a smaller spacing to find the optimal set of parameters within that region.
Adaptive spacing can be beneficial when dealing with large and complex parameter spaces, where a
finely spaced search from the beginning can be computationally expensive or impractical. A sketch of
both discretization schemes follows this list.
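As a concrete illustration of the two schemes, the sketch below builds a fixed-spacing grid over (f, r, v) and then a finer grid around a promising candidate, which is the adaptive-spacing pattern described above. All bounds and spacing values are illustrative, not values from the case study.

```python
import numpy as np

def discretize(bounds, spacing):
    """Cartesian grid over Psi. bounds: {name: (lo, hi)}, spacing: {name: step}."""
    axes = [np.arange(lo, hi + 1e-9, spacing[k]) for k, (lo, hi) in bounds.items()]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    return grid.reshape(-1, len(bounds))   # one row per (f, r, v) candidate

# Coarse pass over a large feasible space Psi (illustrative bounds).
coarse = discretize({"f": (10, 40), "r": (1000, 5000), "v": (5, 30)},
                    {"f": 10, "r": 1000, "v": 5})
# Adaptive refinement: re-discretize a small box around a promising candidate.
fine = discretize({"f": (18, 22), "r": (2800, 3200), "v": (9, 11)},
                  {"f": 1, "r": 100, "v": 0.5})
print(coarse.shape, fine.shape)   # (120, 3) (125, 3)
```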
After saving the discretized process parameters to Xunexplored, we need to find the set of parameters
to query or to perform the experiment. The expected task completion time for the finishing task will be
computed to search the parameter set. Define uc as the current location. Adaptive spacing is used: a
broad search of the parameter space with coarse spacing is performed first, and then finer spacing
is used to search the best set of parameters. The details of the discretization and selection of the process
parameters are as below.
(1) Discretize the region of interest Ψ using coarse spacing (∆f, ∆r, ∆v are the user’s choice). Save these
discretized values to Xunexplored.
(2) Using GPR model 1, estimate the surface quality for all x ∈ Xunexplored. Calculate the probability of
satisfying the surface quality constraint, P(q̂ ≥ qmin).
(3) Using GPR model 2, estimate the deflection at the current location uc for all x ∈ Xunexplored. The
probability of meeting the deflection constraint is calculated, P(d̂ ≤ dmax).
(4) Calculate the probability of meeting both the surface quality and deflection constraints for all x ∈ Xunexplored.
The success rate is the product of the probabilities computed in steps (2) and (3):
Ps = P(q̂ ≥ qmin) × P(d̂ ≤ dmax).
(5) Sort the sets of parameters in Xunexplored that have a success rate greater than a certain threshold
(Ps ≥ Pthreshold), and save those sets of process parameters to Xnew.
(6) Calculate the expected task completion time for all x ∈ Xnew. Find the set of process parameters
with the minimum expected task completion time. We call this the candidate set of process parameters
for task execution, defined as xcandidate = argmin_{x ∈ Xnew} E[t].
(7) Perform a finely tuned search around xcandidate. We want to check if there is a set of process parameters
that may produce a shorter task completion time around xcandidate. Adjust the spacing (∆f, ∆r, ∆v)
to finer spacing, and discretize the process parameter space around xcandidate. Save the results of
discretization to Xc.
(8) Using similar methods as shown in steps (2)-(6), calculate the success rate and the expected task
completion time to perform the task for all x ∈ Xc.
(9) Select the set of process parameters in Xc that has the minimum expected task completion time. Define
this set as x∗ = argmin_{x ∈ Xc} E[t].
(10) Use x∗ for the task execution at the current location uc.
In step (5), we sort the sets of parameters based on the probability of success. The purpose of this step is
to avoid selecting a parameter set that has a low expected task completion time but also a low success rate.
In step (7), the reason for performing the finely tuned search is that there may be a better process parameter
set around xcandidate when using finer spacing. In the last step, the robot executes the finishing task at the
current location using the process parameter set x∗. If the task succeeds, the robot can move to the next
location. If the task fails, a new process parameter set will be suggested and the task will be performed again.
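The scoring core of steps (2)-(6) can be sketched as follows. The sketch assumes fitted surrogate regressors exposing predict(X, return_std=True) (as scikit-learn's GaussianProcessRegressor does) and Gaussian predictive distributions, so the constraint probabilities reduce to normal CDF evaluations; all names and thresholds here are illustrative, not the study's implementation.

```python
import numpy as np
from scipy.stats import norm

# Sketch of steps (2)-(6): score every candidate x = (f, r, v), filter by the
# joint success rate, then minimize the expected completion time E[t].
# gpr_quality / gpr_defl are assumed fitted regressors; q_min, d_max,
# p_threshold, path_len (the path length lambda), and beta are task settings.

def select_candidate(X, gpr_quality, gpr_defl, q_min, d_max,
                     p_threshold, path_len, beta):
    mu_q, sd_q = gpr_quality.predict(X, return_std=True)
    mu_d, sd_d = gpr_defl.predict(X, return_std=True)
    p_q = 1.0 - norm.cdf(q_min, loc=mu_q, scale=sd_q)  # P(q_hat >= q_min)
    p_d = norm.cdf(d_max, loc=mu_d, scale=sd_d)        # P(d_hat <= d_max)
    p_s = p_q * p_d                                    # success rate, step (4)
    keep = p_s >= p_threshold                          # filter, step (5)
    if not keep.any():
        return None                                    # nothing clears the bar
    t = path_len / X[keep, 2]                          # nominal time lambda / v
    e_t = t + (1.0 - p_s[keep]) * beta                 # expected completion time
    return X[keep][np.argmin(e_t)]                     # step (6)
```

The same function can be reused for the fine pass of steps (7)-(9) by calling it on the finer grid Xc.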
5.4 Case Study: Contact-based Sanding Task
Figure 5.8: Setup of the robotic sanding task. The figures show the setup in different orientations.
In this section, our approach is explained in detail with an example of a robotic sanding task. We assume
a high-mix production setting and a robot needs to sand parts with spatially varying stiffness. The setup
includes a 6 degrees of freedom (DOF) articulated robot arm. The robot arm will move the rotary tool over
the part and apply force to the surface for the sanding operation. Figure 5.8 shows a setup of the robotic
sanding task.
5.4.1 Initial Exploration
Figure 5.9 shows four parts given for the sanding task. Parts (a) and (c) can be split into four identical
matching patches. Part (b) can be divided into two identical matching patches. Part (d) does not have any
matching patches.
Figure 5.9: Four different parts (a)(b)(c)(d) are given for the sanding task. The parts are shown in an isometric view.
Based on the physics-based information, high force and velocity values could be used in high stiffness
regions, while this is not the case for low stiffness regions. In low stiffness regions, high force and velocity
values may result in excessive deformation. The feasible parameters should be in the low force and low
velocity range. This information can be used for feasibility biased sampling. We can adjust the region of
interest Ψ depending on the stiffness of the regions. When sanding the part with a high stiffness region,
Ψ could include high force and velocity ranges in the parameter space. Conversely, when uc is in a low
stiffness region, Ψ could be limited to regions with low force and velocity. Also, the following physics-based
information can be utilized. Assume the robot executes the sanding task with the process parameters
(fc, rc, vc) at the selected region. From the task result and performance values, (qc, dc), the following factors
can be deduced to distinguish feasible from infeasible regions.
(1) When the surface quality constraint is violated and the deflection constraint is satisfied: This means
the force applied was fine in that region, not causing too much deflection. However, when the same
values rc and vc are used, the force ranges smaller than fc are infeasible, as the surface quality will
decrease. One way to increase the surface quality is to sample a velocity smaller than vc while using
the same values, fc and rc. The surface quality increases as the velocity decreases. Another option is
to select a different rotational speed value other than rc. If rc is a low value, the surface may not be
sanded enough. The increase in the rotational speed could make the surface smoother.
(2) When the surface quality constraint is satisfied and the deflection constraint is violated: The force
applied was too large and resulted in excessive deflection. In this case, the regions with forces greater
than fc can be considered infeasible regions. To satisfy the deflection constraint, a force smaller than
fc should be selected. When a force decreases, the deflection decreases. However, the smaller force
may lower the surface quality. The velocity and rotational speed values need to be changed accordingly
to meet the surface quality requirement.
(3) Neither the surface quality constraint nor the deflection constraint is satisfied: Similar to (2), we should not
sample forces greater than or equal to fc because a larger force only causes more deflection. Because
the surface quality constraint is not satisfied, we need to check if rc is too low and if vc is too high. If
rc is very small, the rotational speed can be sampled in the region greater than rc. Similarly, if vc is
very high, the velocity can be sampled from the region below vc.
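The three cases above can be encoded as simple pruning rules that bias subsequent sampling. The sketch below is an illustrative simplification of this qualitative reasoning, not the exact rules used in the study.

```python
def prune_hints(f_c, r_c, v_c, quality_ok, deflection_ok):
    """Return sampling hints from one observed execution (f_c, r_c, v_c)."""
    hints = []
    if not quality_ok and deflection_ok:            # case (1)
        hints.append(f"avoid f < {f_c} at the same r and v")
        hints.append(f"try v < {v_c}, or r > {r_c} if r_c was low")
    elif quality_ok and not deflection_ok:          # case (2)
        hints.append(f"avoid f >= {f_c}; sample f < {f_c}")
        hints.append("re-tune r and v to keep the surface quality")
    elif not quality_ok and not deflection_ok:      # case (3)
        hints.append(f"avoid f >= {f_c}")
        hints.append(f"try r > {r_c} if r_c was low, v < {v_c} if v_c was high")
    return hints
```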
By following the initial execution policy in Section 5.3.3.2, initial experiments were done evenly over the
parts’ surfaces. Ninit data points are selected for the initial experiments. We determine Ninit by computing
the mean absolute percentage error (MAPE). Figure 5.10 shows how MAPE changes as Ninit changes. A
simulation function is created for the deflection function of part (a). We set the threshold error rate as 5%.
Also, the measurement errors are assumed to be between 2–3%. From the graph, MAPE becomes less than
the threshold error of 5% when Ninit ≥ 17. Considering the measurement error, we select Ninit = 20. From
the graph, the average MAPE of 50 runs is 2.89% when Ninit = 20. With similar logic, we select Ninit of
part (b) to be 11, Ninit of part (c) to be 20, and Ninit of part (d) to be 20. The error percentage decreases
as we perform more experiments and have more data.
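The Ninit selection can be reproduced with a short loop: for each candidate N, fit the surrogate on N simulated points, compute MAPE on held-out points, average over repeated runs, and take the smallest N whose average MAPE clears the threshold. In the sketch below, fit_surrogate, simulate_deflection, and sample_inputs are hypothetical stand-ins for the GPR fit, the part-(a) deflection simulator, and the sampling routine.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def choose_n_init(candidates, n_runs, threshold,
                  fit_surrogate, simulate_deflection, sample_inputs):
    for n in sorted(candidates):
        errs = []
        for _ in range(n_runs):                    # e.g., n_runs = 50
            X_tr = sample_inputs(n)
            model = fit_surrogate(X_tr, simulate_deflection(X_tr))
            X_val = sample_inputs(50)              # held-out validation points
            errs.append(mape(simulate_deflection(X_val), model.predict(X_val)))
        if np.mean(errs) < threshold:              # e.g., threshold = 5.0 (%)
            return n                               # smallest N meeting the bar
    return max(candidates)
```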
5.4.2 Surrogate Model Construction and Update
For the case study, we tuned the kernel function and hyperparameters to optimize the surrogate models.
First, we made the list of available kernel functions, which was {ARD exponential, ARD matern 3/2, ARD
Figure 5.10: Mean absolute percentage error decreases as the number of initial experiments increases.
matern 5/2, ARD rational quadratic, ARD squared exponential, exponential, matern 3/2, matern 5/2,
rational quadratic, squared exponential}. Then, we built the GPR models using the data acquired from the
initial exploration. Different kernel functions were used to build GPR models, and the loss was computed
using three additional validation points in the parts. For the hyperparameter tuning, the hyperparameter
optimization tool in MATLAB was used. Figure 5.11(a) shows an example of the process of computing loss
for different kernel functions. We ran 100 iterations because the data acquired from the initial experiments
has stochastic characteristics. In each iteration, the kernel function that produced the least loss was selected
as the best kernel function. Figure 5.11(b) illustrates the percentage of the 100 iterations in which each
kernel function was selected as the best for the GPR 1 model. For the GPR 1 model, which predicts surface
quality, ARD matern 3/2 was selected as the best kernel 19 times, ARD squared exponential was selected
16 times, matern 3/2 and matern 5/2 were each selected 15 times, and the other kernel functions were
selected fewer than 15 times. In this case, there was not much difference between ARD matern 3/2, ARD
squared exponential, matern 3/2, and matern 5/2, so we concluded that users could select any of these
kernel functions. Figure 5.11(c) illustrates the percentages of being selected as the best kernel for GPR
model 2. For the GPR 2 model, which predicts deflection, ARD squared exponential was selected 78 times,
ARD matern 3/2 was selected 9 times, ARD matern 5/2 was selected 6 times, ARD rational quadratic was
selected 4 times, ARD exponential was selected 3 times, and the others were selected 0 times among the
100 iterations. In this case, we concluded that ARD squared exponential is the best kernel function to use.
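The kernel-selection loop can be sketched as below. The study used MATLAB's GPR tooling; here scikit-learn kernels stand in for a subset of the kernel list (passing a length-scale vector yields the ARD variants), and draw_data is a hypothetical callable that resamples the stochastic initial-experiment data into training and validation splits.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

def best_kernel_votes(draw_data, n_iters=100):
    """Count how often each candidate kernel wins on validation loss."""
    votes = {}
    for _ in range(n_iters):
        X_tr, y_tr, X_val, y_val = draw_data()   # stochastic resample
        dim = X_tr.shape[1]
        kernels = {
            "ARD matern 3/2": Matern(length_scale=np.ones(dim), nu=1.5),
            "ARD matern 5/2": Matern(length_scale=np.ones(dim), nu=2.5),
            "ARD squared exp": RBF(length_scale=np.ones(dim)),
            "matern 3/2": Matern(nu=1.5),
            "squared exp": RBF(),
            "rational quadratic": RationalQuadratic(),
        }
        losses = {}
        for name, k in kernels.items():
            gpr = GaussianProcessRegressor(kernel=k, normalize_y=True)
            gpr.fit(X_tr, y_tr)
            losses[name] = np.mean((gpr.predict(X_val) - y_val) ** 2)
        winner = min(losses, key=losses.get)     # lowest validation loss wins
        votes[winner] = votes.get(winner, 0) + 1
    return votes
```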
Figure 5.11: (a) The average loss when using different kernel functions. In this iteration, ARD matern 3/2 is the
best kernel function as it produces the smallest loss. (b) and (c): the percentages of each kernel selected as the best
kernel function for GPR model 1 and GPR model 2, respectively. Note: the kernel functions on the x-axis are listed
in the same order for figures (a)(b)(c).
5.4.3 Selection of Region Sequencing Policy
Figure 5.12: Figures show the regions of high stiffness and low stiffness for parts (a)(b)(c)(d). Figures are shown in
a top view.
For the given parts, we can apply the general region sequencing policies described in Section 5.3.5. When
characterizing the region sequencing policies, stiffness or rigidity variation of the parts should be considered.
There are several ways to estimate or measure the stiffness, such as manual probing, visual inspection,
mechanical testing, non-destructive testing, or finite element analysis. Some of the physical testing may
require special equipment, and it may take a lot of time to test all the regions of the parts. For mechanical
testing, a human expert needs to apply a controlled force onto different regions of the part and measure the
displacement or strain amount. For non-destructive testing, such as ultrasonic testing or radiographic testing,
special equipment may be required. Finite element analysis can be performed with computer-aided design
(CAD) simulation, but it also takes time to simulate various scenarios through virtual experiments. Visual
inspection and manual probing are relatively easier compared to mechanical or non-destructive testing, and
finite element analysis. Users can use any technique to assess or estimate stiffness depending on the resources
they have.
Figure 5.13: Left figure shows how policy 1 is applied to part (a). Right figure shows how policy 9 could be applied
to part (d).
Figure 5.12 shows the approximate stiffness map over the surface of the four parts. We estimate the
stiffness of the parts through the CAD files of the parts. As some regions of the parts have extruded support
underneath, we can assume that those regions will have high stiffness. Also, the regions far from those
underneath support would be more compliant. Figure 5.12 depicts those regions. The level of stiffness
decreases as the regions get further from the supports. To check our assumption, we conducted the finite
element analysis at the regions that are assumed to be rigid and compliant (about 5-10 discrete points).
Finite element analysis also shows that the regions near supports are much more rigid and the regions far
from supports are more compliant. By understanding the estimated stiffness variations of the part, we can
characterize region sequencing policy in each part. Figure 5.13 shows two examples of how a certain region
sequencing policy can be applied to parts (a) and (d). The left figure shows the tool movement when π1 is
applied to part (a). The right figure illustrates the sequences of the tool when π9 is applied to part (d).
5.4.4 Selection of Process Parameter Policy
We can calculate the expected task completion time using the probability of succeeding in the finishing task.
Assume we want to compute the expected task completion time using the parameters (fc, rc, vc). The time
that it takes to complete the task at the current location when using the tool velocity vc is tc = λ/vc.
When the task fails, additional time β is added and the task completion time becomes (λ/vc + β). β varies
depending on the degree of the task failure. If the part is permanently damaged and must be replaced, β will
be high, as it takes a lot of time to discard the damaged part, replace it with a new part, and perform the
task initialization. If the part is not permanently damaged and is still reusable, then β is a low value; in this
case, we just add the time to redo the sanding operation. For simplicity, the expected task completion
time can be written as E[t] = Ps × (λ/vc) + (1 − Ps) × (λ/vc + β). As described in Section 4.6, the set of
process parameters that minimizes the expected task completion time will be selected.
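As a quick numeric illustration of this formula (a minimal sketch; the values of Ps, λ, vc, and β below are made up), note that E[t] simplifies algebraically to λ/vc + (1 − Ps)β, which makes the trade-off between speed and success rate explicit:

```python
# E[t] = Ps*(lam/v) + (1 - Ps)*(lam/v + beta) = lam/v + (1 - Ps)*beta.
# lam is the path length at the current location, beta the failure penalty.

def expected_time(p_s, lam, v, beta):
    return lam / v + (1.0 - p_s) * beta

# A fast-but-risky setting can lose to a slower, safer one when beta is large:
print(expected_time(p_s=0.70, lam=300.0, v=25.0, beta=60.0))  # 30.0 s
print(expected_time(p_s=0.98, lam=300.0, v=20.0, beta=60.0))  # 16.2 s
```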
5.5 Results and Discussion
5.5.1 Computational Simulation of Robotic Sanding Task
Our algorithm is tested in a virtual environment of the robotic sanding task. In the virtual setting, the robot
can perform the sanding task on the given parts, and the task performance will be evaluated. The details
of the virtual experiment setting are the same as defined in the previous section, Section 5.4: a 6 DOF
manipulator with a rotary tool will perform the sanding tasks for the given parts. We test our algorithm
under various scenarios of virtual experiments. Task conditions or parameter settings vary across scenarios.
For example, a task may be executed with a different ratio of sampling methods or a different
choice of region sequencing policy. In each scenario, the task completion time is calculated to evaluate the
task result. For each task condition or parameter setting, we perform 100 trial experiments to validate
the results. Tinit represents the time taken for the initial experiments in one patch. Tpolicy represents the
time it takes to explore using the region sequencing policy and process parameter policy in a single patch.
Tpolicy does not include the time for the initial experiments. If the robot has to detach the tool from the part
surface and reposition the tool, a transition time will be added. Tpolicy includes the transition time. Ttotal
is the total task completion time which represents the time it takes to complete the sanding task for the
entire part surface. As we perform 100 trial experiments for each scenario, the mean values of Tinit, Tpolicy,
and Ttotal are calculated to evaluate the performance. These mean values can be represented by µ(Tinit),
µ(Tpolicy), and µ(Ttotal). The details of each scenario are explained below.
Scenario 1. Different ratio of biased sampling, γ: We want to examine the impact of γ, the ratio of
feasibility-biased sampling, on the performance of robotic sanding. A fraction γ of the total number of
initial experiments is performed with feasibility-biased sampling; the remaining initial experiments
are conducted via random sampling. The ratios used for the comparison are γ = 1, γ = 2/3, γ = 1/2, γ = 1/3,
and γ = 0. When γ = 1, the initial experiments are done only by feasibility-biased sampling. When γ = 1/2,
the initial experiments are performed with 50% feasibility-biased sampling and 50% random sampling. When
γ = 0, the initial exploration is performed only through random sampling. The computational simulation
is done in part (a). The region sequencing policy used is π1. We assume that part (a) is a relatively thin
metal panel with a size of 36 inches (width) × 30 inches (length). The diameter of the round-shape orbital
sander is 5 inches. When constructing GPR models, the kernel function used is ARD matern 3/2 for GPR
1 model and ARD squared exponential for GPR 2 model. The computational result is shown in Table 5.1.
Table 5.1: The mean of task completion time with different ratios of γ (unit: sec)

              γ = 1     γ = 2/3   γ = 1/2   γ = 1/3   γ = 0
µ(Tinit)      63.28     77.05     83.64     88.33     117.18
µ(Tpolicy)    121.66    124.00    125.57    130.64    143.56
µ(Ttotal)     549.93    573.05    585.91    610.90    691.43
Main finding of scenario 1: The computational simulation results showed that using a high ratio of
feasibility-biased sampling resulted in a shorter task completion time. µ(Tinit), µ(Tpolicy), and µ(Ttotal) are
all reduced when using a higher value of γ. When using 100% feasibility-biased sampling, the task completion
time was reduced from 691.43 secs to 549.93 secs compared to using 100% random sampling, a time
reduction of about 20%. In our case study, using a γ value of 1 is the optimal setting for efficiently learning
the sanding task. In other domains or tasks, the optimal γ value can be different.
Scenario 2. Different tuning combinations of GPR models: In this scenario, GPR models are built with
different kernel functions and the performance of the models is evaluated. We want to compute the entire task
completion time using different kernels. In Figure 5.11(b), the models with particular kernel functions show
better performance than the models with other kernels for GPR 1 optimization. So we tested the kernels
that have more than 10% selection rate. Those kernel functions are ARD matern 3/2, ARD matern 5/2,
ARD squared exponential, matern 3/2, matern 5/2, and squared exponential. For GPR 2 tuning, we tested
the first and second most selected kernel functions in Figure 5.11(c). They are ARD squared exponential
and ARD matern 3/2. We made the different combinations of kernel functions for GPR 1 and GPR 2.
The combinations are as follows (the first kernel is the kernel function for GPR 1 model, and the second
kernel represents the kernel for GPR 2 model): Tuning 1 is ARD matern 3/2 and ARD squared exponential.
Tuning 2 is ARD matern 5/2 and ARD squared exponential. Tuning 3 is ARD squared exponential and ARD
squared exponential. Tuning 4 is matern 3/2 and ARD squared exponential. Tuning 5 is matern 5/2 and
ARD squared exponential. Tuning 6 is squared exponential and ARD squared exponential. Tuning 7 is ARD
matern 3/2 and ARD matern 3/2. Tuning 8 is ARD matern 5/2 and ARD matern 3/2. The computational
results are shown in Table 5.2. Tinit is the same for all tuning methods in the initial experiment because
we did not use the GPR models to choose process parameters during initial exploration. The mean of task
completion time for the initial experiment was 63.28 seconds, represented by µ(Tinit) = 63.28. As in
scenario 1, we performed the computational simulation on part (a) and selected π1 as the region sequencing
policy. The sizes of the part and the tool are the same as in scenario 1.
Table 5.2: The mean of task completion time with different combinations of kernel functions in GPR models (unit: sec)

              Tuning 1  Tuning 2  Tuning 3  Tuning 4  Tuning 5  Tuning 6  Tuning 7  Tuning 8
µ(Tpolicy)    121.66    124.55    125.50    151.25    149.22    147.21    131.31    137.34
µ(Ttotal)     549.63    561.47    565.27    668.29    660.15    652.13    588.53    612.64
Main finding of scenario 2: By comparing (tuning 1 and 7) and (tuning 2 and 8), it appears that the
tuning combinations which include an ARD squared exponential for GPR 2 result in relatively shorter task
completion time than the ones without it. For GPR 1 model, using ARD matern 3/2, ARD matern 5/2, and
ARD squared exponential result in shorter task completion time (as shown in tuning 1, 2, and 3). Compared
to unoptimized combinations (tuning 4, 5, or 6), using optimized combinations (tuning 1, 2, or 3) can reduce
the task completion time by up to about 18%. In our case study, using tuning 1 resulted in the minimum task
completion time. This scenario shows that we can achieve time savings and improve the overall efficiency of
the task by selecting good combinations of kernel functions in GPR models.
Scenario 3. Different region sequencing policies: This scenario examines the influence of different region
sequencing policies on the task completion time. The policies in section 5.3.5 are applied to perform sanding
experiments for part (a). The sizes of the part and tool, and GPR tuning are the same as in scenario 1. Table
5.3 shows the computational results. In the initial experiments, Tinit was the same for all region sequencing
policies because the policies were only applied after the initial experiment. The mean task completion time
for the initial experiment was µ(Tinit) = 63.28 seconds.
Table 5.3: The mean of task completion time with different region sequencing policies (unit: sec)

              π1      π2      π3      π4      π5      π6      π7      π8      π9      π10
µ(Tpolicy)    121.66  122.83  122.12  127.52  121.21  127.55  147.75  136.75  131.48  137.00
µ(Ttotal)     549.63  554.60  551.75  573.36  548.12  573.49  654.29  610.28  589.03  611.28
Main finding of scenario 3:
(1) Based on the results, {π1, π2, π3, π4, π5, π6} show relatively shorter task completion time compared to
{π7, π8, π9, π10}. Using {π1, π2, π3, π4, π5, π6}, the robot can sand contiguous regions in a consecutive manner
without stopping to detach or reattach the tool to the surface unless the task has to be redone in the region.
When using {π7, π8, π9, π10}, the robot executes the sanding task through non-contiguous regions. After
completing one region, the robot may need to detach the tool from the surface and move the tool to another
region to start sanding in that region. Also, there could be additional time to reposition the tool. This is
the time required for the transition between non-contiguous regions. The transition time increases the total
task completion time. Hence, this can be one of the reasons that it takes more time to complete the task
when applying the policies, {π7, π8, π9, π10}. Time savings can be achieved by sanding contiguous regions
using {π1, π2, π3, π4, π5, π6} because these policies do not require the additional transition time.
(2) By comparing {π1, π2, π3, π5} and {π4, π6}, we could conclude that starting the task from a high
stiffness region and moving to a low stiffness region results in a relatively shorter task completion time compared to sanding
in inverse sequences. A similar result is shown when comparing π9 and π10. When we analyzed the data,
the time increase was caused by task failures in the low stiffness regions at the beginning of the task. Because
the GPR models were not mature enough at that stage, the parameter selection was poor, and failed tasks
had to be redone. This increased the total task completion time. On the other hand,
when the task began from the high stiffness region, there were fewer task failures. Later, when the
robot moved to the low stiffness regions, the task executions were more successful because enough data
had been collected and the GPR models were more accurate.
(3) There is not much difference in task completion time among the policies {π1, π2, π3, π5}. Selecting
π5 resulted in a slightly shorter task completion time compared to selecting other policies. Users can select
any of these four policies to execute the task in this case.
(4) Among four policies {π7, π8, π9, π10}, π9 results in a relatively shorter task completion time. This
shows that sanding all high stiffness regions first, and then moving to the lower stiffness regions, can achieve
time savings.
Scenario 4. Using a heuristic approach in process parameter selection: We applied a heuristic method and
evaluated its impact on task performance. The heuristic method is to limit the parameter space depending
on the stiffness of the region. Part (a) is used for the computational simulation, and three region sequencing
policies {π1, π2, π3} are tested. The results are in Table 5.4.
Table 5.4: The mean of task completion time with and without heuristics (unit: sec)

              π1 w/o heur.  π1 w/ heur.  π2 w/o heur.  π2 w/ heur.  π3 w/o heur.  π3 w/ heur.
µ(Tinit)      63.28         43.28        63.28         43.28        63.28         43.28
µ(Tpolicy)    121.66        115.87       122.12        115.65       121.21        115.43
µ(Ttotal)     549.93        506.77       551.75        505.90       548.12        504.99
Main finding of scenario 4: For all policies tested, the use of the heuristic method in process parameter
selection resulted in shorter task completion times. For π1, it achieved a 43.16-second reduction in
µ(Ttotal). For π2, there is a 45.85-second reduction in µ(Ttotal), and for π3, there is a 43.13-second reduction in
µ(Ttotal). This shows that around 7.8%−8.3% time savings can be achieved through the heuristic method.
Scenario 5. Performing sanding tasks on different parts: In this scenario, we evaluated the performance
of the sanding task on different parts, parts (a)(b)(c)(d). For part (a), the part size and tool size are the
same as in scenarios 1 and 2. For part (b), the part size is 14 inches (width) and 24 inches (length). Part (b)
is smaller than part (a). We use a round-shape sander with a 3-inch diameter for part (b). For part (c),
the part size is 48 inches (width) and 40 inches (length). The size of the rectangular empty space in
the center of the part is 24 inches (width) and 14 inches (length). The tool used has a 3-inch diameter.
For part (d), the size is 30 inches (width) and 23 inches (length). The tool used has a 4-inch diameter.
We calculated the mean task completion times for these parts, and the results are shown in Table 5.5. The
number of points in the part used for initial experiments is Ninit = 20, 11, 20, 20 for parts (a), (b), (c), and
(d), respectively.
Table 5.5: The mean of task completion time with different parts (unit: sec)

Part          Part (a)                 Part (b)                 Part (c)                 Part (d)
Policies      π1      π2      π3       π1      π2      π3       π1      π2      π3       π1      π2      π3
µ(Tinit)      43.28   43.28   43.28    28.75   28.75   28.75    56.77   56.77   56.77    56.75   56.75   56.75
µ(Tpolicy)    115.87  115.65  115.43   62.26   61.09   63.30    111.41  111.09  113.62   148.19  154.37  149.36
µ(Ttotal)     506.77  505.90  504.99   153.27  150.92  155.34   502.42  501.13  511.25   204.94  211.12  206.11
Main finding of scenario 5: In this scenario, we implemented our approach on different parts to evaluate
the performance. In part (a), there is not much difference in the total task completion time across region
sequencing policies π1, π2, and π3. For part (b), the policy π2 produces a relatively shorter task completion
time compared to the other two policies. In part (c), π1 and π2 perform relatively well and result in shorter
task completion times compared to π3. π3 shows about 2% increase in task completion time compared to
π2. When applying π3 to part (c), the low stiffness regions were sanded in an earlier stage of the sequences.
It is possible that the GPR models were not accurate enough due to the small amount of data, which could
have led to task failures. For part (d), the best policy is π1. Using other policies results in approximately 3.0% (with
π2) and 0.5% (with π3) increases in the total task completion time.
5.5.2 Physical Experiments
Through the simulations, we have shown that our approach works across different scenarios. In this section,
we apply our approach in a physical experiment. The purpose of the physical experiment is to test the
performance of our algorithm in a real-world setting. The task setting for the physical experiment could vary
depending on the resources users have; for example, specific settings could vary depending on the robot
model, tool, measurement devices, experiment location, etc. Our setting of the robotic
cell is the following: The robotic cell consists of a 6 DOF manipulator arm with a random orbital sanding
tool, as shown in Figure 5.14. The model of the manipulator is ABB IRB 2600 (with a maximum reach of
5.41 feet). The force controller is installed at the end-effector of the manipulator. The model of the tool is
3M random orbital sander with 5 × 5/16 inches orbit. The tool holder is 3D printed and the material used is
polylactic acid. Figure 5.14(a) illustrates the robotic cell. Figure 5.14(b) shows the enlarged photo of the tool
which includes the end effector of the robot, 3D printed tool holder, force controller, and the orbital sanding
tool. The part is rectangular with 36 × 24 inches (width × length). The tested part has a similar stiffness
variation to part (a) in Figure 5.12. The material of the part is steel, and the extruded supports underneath
the part are wooden posts. The cross-section of each wooden post is a 2 × 2 inch square. Manual
probing was performed in several points over the part to roughly check the rigidity of the part. The regions
right above the wooden posts are relatively rigid, while the regions farther from the wooden posts are much
more compliant. The surface quality was measured before and after the sanding task. We used the surface
roughness measuring instrument from Mitutoyo (model: SJ-410) to evaluate the surface quality. A pneumatic
source is used for the rotary tool. The control input of the rotary tool is pressure. The rotational speed of
the tool increases as the pressure fed into the pneumatic rotary tool increases. Due to the characteristics
of the pneumatic drive, accurately controlling or measuring the rotational speed was not feasible. Instead
of the rotational speed, we controlled the pressure applied to the orbital sander, so the pressure fed to the
tool was used as the input. The range of the pressure setting was 10−40 psi. We set the deflection constraint
as 2 inches, so when the measured deflection exceeded 2 inches, the task was considered a failure. When
we performed the experiments, the joint angles of the robots were recorded. Using forward kinematics, the
position of the tool that made contact with the part surface could be computed. By comparing that with
the default position, the deflection was measured. The surface roughness constraint is that the Ra value of
the surface should be less than 1.0 µm (about 3.9 × 10⁻⁵ inches).
We applied our proposed approach to the experimental setting described above. The part was split into four
identical patches using a line of symmetry. The sanding experiments were performed on one patch first and
then the remaining three patches were sanded in sequence. A patch consists of varying stiffness regions.
In the high stiffness regions, the optimal force amount selected was 30N. The optimal velocity of 20mm/s
(0.79 inch/s) was selected and used through our algorithm. In the regions of medium stiffness, the
selected force was 25 N, and the optimal velocity calculated was 15 mm/s (0.59 inch/s). In the regions that
are more compliant, a force of 20 N was selected and used. The velocity should decrease accordingly,
Figure 5.14: (a) Robotic sanding cell: ABB IRB 2600, the sanding tool, and the part to sand (b) Enlarged photo of
the tool.
and the optimal velocity of 12mm/s (0.47inch/s) was used. The optimal pressure range computed was 20-25
psi for all regions. We observed that using high pressure caused too much vibration on the part. This is
because the sheet metal part is thin. To avoid excessive vibration during the sanding operation, we
avoided using excessively high pressure in the pneumatic drive. Along with the aforementioned parameters, the tool
was oriented at 10 degrees from the horizontal surface. The angle was adjusted because when the tool
was oriented perpendicular to the surface, the contact between the tool and the surface was too large, and
the sheet metal part vibrated harshly. Also, when the tool was perpendicular to the part surface, and the
rotational speed was very low, the tool was not rotating due to high friction between the tool and the part
surface. By adjusting the angle between the tool and the part surface, we could avoid these situations.
The result of the experiments is shown in Figure 5.15. Figure 5.15(a) shows the quality of the surface
before the sanding task is executed. Figure 5.15(b) illustrates the surface of the part after the sanding task is
done. We used the region sequencing policy π8. This region sequencing policy begins from the high stiffness
region. Ten random locations on the part surface were selected before and after the sanding operation to
measure the surface roughness (Ra). For each location, the Ra value was measured three times, and the
average value was calculated. Prior to sanding, the average Ra values for the 10 locations ranged between
2.5–3.8 µm. After the sanding execution, the Ra values decreased to the range of 0.3–0.9 µm for the
same locations. This indicates that all locations satisfied the surface roughness constraint, which was less
than 1.0 µm. When using π8, the task completion time to sand the entire part was 1120 seconds, which
is about 18.8 minutes. For the comparison, we applied π2 for the region sequencing policy. In this case,
the task completion time for the entire part was about 1790 seconds, which is about 30 minutes. The task
completion time was much longer than with policy π8 because the surface was not smooth enough and
the task had to be redone. In both cases, there was no excessive deflection or permanent damage at any
location within the part during sanding. The part did not have to be replaced during the processing.
Figure 5.15: (a) The part surface quality before sanding, (b) The part surface quality after sanding task.
π8, which performs the sanding task in a raster pattern (keeping the movement parallel to a short
side of the rectangular part), was the most promising due to the direction of the scratches and the angle
between the tool and the surface. In Figure 5.15(a), most of the scratches run along the long side of the part. Sanding
it at about 90 degrees could maximize surface smoothness. The reason that the best policy of the physical
experiment is different from the simulation is that the detailed settings for the sanding were different. In
the simulation, the direction of the surface scratch is random, so sanding in one direction does not provide
any advantages in terms of making the surface smooth. The simulation has more general settings, but the
physical experiments could have more specific settings depending on what users have (e.g., the tools, robots
or measurement devices, experimental resources, etc.). In our setting, the surface roughness measurement
device measures roughness along one direction, so the scratches on the parts were made along one direction
instead of random directions. Also, the angle of the tool was about 10 degrees when sanding to avoid
excessive vibration. Due to this, sanding in one direction as in π8 could perform better than sanding in a
zig-zag pattern as in π2.
5.6 Summary
In this study, we proposed a general framework for learning spatially varying process parameter models
through an adaptive and iterative approach. The target applications are contact-based surface finishing.
The process is performed by a robot arm with a rotary tool. The proposed framework utilizes the initial task
execution policy, surrogate modeling, region sequencing policy, and process parameter policy. The effectiveness of our approach was validated through both computational simulation and physical experimentation for
a robotic sanding case study. The virtual experiments were performed through simulations in five scenarios
with different task settings. In the physical experiment, we used more specific settings for our robotic
cell for the sanding process. The best region sequencing policy was applied. Compared to the case when a
non-optimal policy was used, the best region sequencing policy reduced the task completion time. The
proposed approach can be applied to any other type of contact-based surface finishing, such as
grinding, polishing, or buffing. Users can adopt the general framework based on the methods explained in
Section 4 Approach. Depending on the resources (computational resources or physical resources), users may
have different settings for their experiments. Users need to take the specific settings of their experiments
into consideration when determining the region sequencing policy and adjusting the parameters.
In future work, we aim to enhance the efficiency and automation of our approach. For instance, we
could automate the process of determining the region sequencing policy for a given part. Currently, region
sequencing policies are manually suggested and evaluated through simulations. We aim to develop an
algorithm that can generate a richer set of region sequencing policies when the input geometry CAD file
is given. While generating the set of possible region sequencing policies, the algorithm should consider a
trade-off between the cost of failure and the value of success. Then, the algorithm should be able to suggest the best region sequencing
policy among the possible policies for the given part. Also, instead of using GPR-based surrogate models, we
want to try other methods to fit the models and to predict the task performance for unobserved data. With
the CAD file and geometric information that we can acquire from the CAD file, we may be able to fit the
models through physics-informed artificial neural networks.
Chapter 6
Learning of Temporally Varying Process Parameter Models for
Direct Ink Writing Applications
6.1 Introduction
Direct ink writing (DIW) is one of the additive manufacturing (AM) techniques that enables the printing
process by dispensing ink through a nozzle. DIW can use any ink or material as long as it exhibits proper
rheological behavior and can be dispensed through a nozzle. The DIW process is versatile as it allows the
printing of various types of inks, including multi-materials. Such materials can be polymers, hydrogels,
ceramics, metals, glass-forming materials, composites, and even food. The versatility in using material can
lead to greater freedom in printing design and expand the functionality of printing [105]. The DIW process
can also create multifunctional structures through its capability of using multi-materials and designing the printing
process. The DIW process is used for various applications such as bioinspired materials, tissue engineering,
or deformable electronics [64, 117, 147]. Figure 6.1 shows examples of printing using DIW process.
The rheology of the ink and its material properties are critical to a successful DIW process. The
ink’s rheological properties affect how the ink flows through a nozzle, and ultimately affect the quality and
structure of the print. Unfortunately, ink drying is a common issue during DIW, as the ink’s properties can
change over time due to chemical reactions or environmental conditions such as temperature and humidity.
Hence, the usable lifetime of the ink is limited. To maximize the utilization of the ink, process parameter adjustment
is necessary during the printing process. Proper adjustment of process parameters could help the ink to be
Figure 6.1: (a) The shape recovery product is fabricated through dual DIW approach [23]. (b) Mesoscale printing is
performed on curved surface using a robotic DIW setup [19].
dispensed for a longer period of time and can prevent printing failures due to ink drying. If the process
parameters are not adjusted properly, the ink drying could cause defects in the printed artifacts, such as
thinning or gaps, or nozzle clogging. This leads to poor print quality and compromised structural integrity.
In such cases, the ink is not being fully utilized and the printing process is inefficient. Hence, the process
parameters should be controlled and updated over time to ensure successful DIW printing and prevent waste
of resources.
Manual tuning of process parameters requires real-time monitoring and careful observation of the printing
process to detect and correct errors. This is time-consuming and labor-intensive. Also, for some complex
printing tasks, such as the multi-material DIW process, manual tuning may not be practical. Instead,
we want to automate this process using an adaptive and iterative learning approach. AI-driven experimental
design can save time and cost by allowing the system to adjust parameters on the fly and determine good
ranges of process parameters that could maximize an objective function of the DIW process [142]. The first
step is to collect data by printing test artifacts. In this step, we want to explore parameter space as much as
possible. The task performances of the test artifacts printed are analyzed through image processing. Then,
we construct temporally varying process parameter models by analyzing the input and output data. The
relationships between input and output can be complex and highly non-linear, so we need to select the right
models to fit the data. Using the constructed data, we estimate the optimal range of process parameters
over time. As more data is collected and analyzed, the learning process becomes more reliable and efficient.
Ultimately, our goal is to maximize the use of ink resources and improve the efficiency of the DIW process
through the automated adjustment of process parameters through the iterative learning process.
Moreover, the adaptive experimental design approach for automating the adjustment of process parameters
in response to temporal changes has wider applications beyond DIW. It can be used for tasks whose process
parameter models exhibit temporal phenomena, which is a common scenario in manufacturing processes
and industrial settings. Process parameters often need to be adapted over time to maintain optimal
performance and maximize resource utilization. For instance, various manufacturing processes are influenced
by environmental conditions, changes in machinery, materials used, and so on. Our approach can be applied
in such cases to dynamically adapt to changes and determine the appropriate ranges of process parameters.
6.2 Problem Formulation
We define terminologies and notations related to process parameters and task performance as below.
(1) Process parameters: When performing experiments, a robot must use the appropriate process parameters.
Process parameters can be seen as input parameters. If there are M1 controllable process parameters, the
set of process parameters can be represented as x = (x1, ..., xM1). Process parameters are time-dependent
if they can vary with time. Process parameters have to conform to process parameter constraints, which
can be expressed as g(x) ≤ 0, where g(x) is a vector-valued function. The process parameter constraints
could be based on physical limitations of the robot or specifications of tools.
(2) Process parameter space: We can denote the process parameter space as Ω. Ω represents the set of all
possible process parameters for the given task. When there are M1 process parameters in total,
the dimension of Ω will be M1. For example, if there are three controllable process parameters, Ω is in
3-dimensional space.
(3) Task performance: After performing experiments, we need to determine how well the task was completed.
Task performance is a measure used to assess the quality of the task performed. Task performance measures
can be seen as output parameters for the task. If there are M2 such measures, the set of task performance
measures can be represented by y = (y1, ..., yM2). The task performance constraints can be written as
h(y) ≤ 0, where h(y) is a vector-valued function.
(4) Process parameter model: A process parameter model is the model that maps the input process parameters to the task performance. It represents the relationship between a set of process parameters and
one single task performance measure. The jth process parameter model can be written as yj = fj (x) where
j = 1, ..., M2.
In this chapter, we explore the adaptive experimental design approach for learning process parameter models
that have temporal characteristics. The goal is to optimize the objective function of the DIW process
while satisfying the constraints on both process parameters and task performance. Optimizing the objective
function could mean maximizing resource utilization and conducting as many experiments as possible. If
the measure of task amount is represented by f, then f can be, for example, the total number of experiments
or the length of the produced object. The objective function can be represented by max E[f(x)] s.t.
g(x) ≤ 0 and h(y) ≤ 0. f is a function of the process parameters because the process parameters have a
direct impact on the task performance. The process parameters, x, should be adjusted at every time step
with a fixed time step size, ∆t. The time step interval ∆t can be decided by the experimental setup and its
resources. For example, if the ink characteristics in DIW change quickly over time, the process parameters
should be adjusted frequently and ∆t should be small. On the other hand, if the ink properties do not
change quickly, frequent parameter adjustment is not necessary; ∆t can be larger in this case, allowing
for a longer interval between parameter compensations.
We can consider the following DIW robotic setup. A 6 degrees of freedom (DOF) manipulator is used to
perform DIW. The model is a Yaskawa GP8 manipulator which has high accuracy and repeatability. We
Figure 6.2: (a) The experimental setup for DIW using a 6 DOF manipulator. (b) The detail of fluid dispenser. Two
process parameters can be controlled.
have a fluid dispensing system (Model: Nordson EFD fluid dispenser) which enables the automated and
precise ink dispensing. A custom 3D printed holder is attached to the flange of the robot arm, which holds
the fluid dispenser and laser displacement sensor (LDS). The experimental setup is shown in Figure 6.2(a).
Figure 6.2(b) shows the enlarged photo of the fluid dispenser. It also shows how two process parameters, p
and v, can be applied. We can determine the process parameters and task performance for the DIW process
as below. The units are indicated inside parentheses.
- Applied pressure in the tool p (psi)
- Forward velocity of the tool v (mm/s)
- The process parameter set x = (p, v).
The task performance set is:
- Mean of linewidth d̄ (mm)
- Variance of linewidth σ² (mm²)
- Ratio of the printed line ρ (unitless)
- The task performance set y = (d̄, σ², ρ).
In Section 6.3, the overall process of the adaptive experimental design for the DIW process is described. Section
6.4 shows the method for analyzing task performance using image processing. With the analyzed data,
surrogate models are constructed, as described in Section 6.5. Parameter estimation is described in Section
6.6. Finally, Section 6.7 presents the results of our approach, and Section 6.8 provides our conclusion.
6.3 Overview of Approach
The overall flowchart of DIW using iterative learning is shown in Figure 6.3. First, we print test artifacts
to explore process parameters. We want to explore the parameter space as much as possible during this
initial experiment stage. Various combinations of process parameters are tried during this stage. Then,
image processing techniques are used to efficiently analyze the acquired data. The surrogate models can be
constructed based on the data acquired. Hyperparameter tuning can be done when building the surrogate
models. The optimization of hyperparameters can lead to more accurate model construction. Using the
surrogate models, the proper ranges of process parameters to meet the constraints can be determined.
As the ink shows the temporal characteristics, the appropriate ranges of the process parameters change
correspondingly. By analyzing the trend of temporal adjustment, we can also estimate the feasible duration
of a successful printing period. Next, we can start printing actual artifacts with a new ink. When any errors
are observed during the printing process, the printing ends immediately and the failed artifacts and used ink
are discarded. The data acquired so far including the failed data can be used to update the surrogate models.
Then, process parameters can be newly estimated and the process can be repeated. If there is no error while
printing actual artifacts, the DIW process continues until it reaches the estimated feasible duration.
Figure 6.3: The flowchart of DIW process using the AI-driven experimental design
We can further develop our objective function for DIW experiments using the parameters defined above. We
aim to maximize the utilization of the ink. When printing actual artifacts, the given constraints should be
satisfied. Hence, the goal is to maximize the expected length of the total print while meeting all constraints.
The objective function can be written as below.
max E[Ltotal]   (6.1)
s.t. |d̄ − d′| ≤ γ1, σ² ≤ γ2, ρ = 1   (6.2)
g1(p(j)) ≤ 0, g2(v(j)) ≤ 0   (6.3)
γ1, γ2 are given and constant   (6.4)
Ltotal is the total length of the printed artifacts. γ1 is the absolute error allowed between d̄ and the desired
linewidth, d′. The variance of linewidth, σ², should be below a threshold γ2 to ensure the linewidth is
consistent. The printed artifacts should not have any discontinuity, so ρ should be 1. Equation (6.3)
indicates the constraints on each process parameter: the pressure range and velocity constraints.
6.4 Task Performance Analysis: Image Processing
For the DIW robotic setup, two major process parameters can be considered. One is the applied pressure to
the fluid dispenser. This can be written as p, and p is controlled by the fluid dispensing system in Figure
6.2(a). This device applies precise pressure to the dispenser, ranging from 0–16.9 psi with a resolution of
0.1 psi. The other process parameter is the tool forward velocity, or printing speed, v, which is controlled by the
manipulator’s movement. The height of the nozzle tip is controlled to be within a proper range so that the
material can be deposited on the surface. LDS in the system is used to read the distance between the tool
and the surface. The orientation of the nozzle is kept to be constant.
We want to analyze three task performance measures to determine the quality of the printing. The first
measure is the error between the desired linewidth and the mean linewidth of the printed artifacts. The
second measure is the consistency of the linewidth. We want to determine whether the printed line achieves
consistent width and quality throughout its length. The last measure is the discontinuity. We want to ensure
that the printed artifacts have no discontinuity while maintaining smooth ink dispensing and achieving good
quality print. To analyze these measures, we use image processing techniques on the images acquired from
the initial experiments. The details of these techniques are as follows.
1. Mean of linewidth: This component indicates the average linewidth of the printed artifacts. We can use
an edge detection algorithm to determine the edges of the printed line; Sobel edge detection is used. Then,
we can calculate the distance between the edges at different points along the line. Assume we measure a
total of N points along the line. The distance between the edges at the ith point can be written as di.
This indicates the linewidth printed at the ith point. The mean of the linewidth can be represented by:

d̄ = (1/N) Σ_{i=1}^{N} di   (6.5)
2. Consistency of the linewidth: The level of consistency of the printed linewidth indicates how much the
linewidth varies around its mean value. We can calculate the variance of the linewidth at different points
along the printed line. We can use the same edge detection method as above and calculate the distance
between the edges at various points along the line, as explained in the previous section. Mathematically,
the variance of the linewidth along the line can be written as:

σ² = (1/N) Σ_{i=1}^{N} (di − d̄)²   (6.7)

where d̄ = (1/N) Σ_{i=1}^{N} di   (6.8)
Figure 6.4 shows an example of a printed line that is not straight. In Figure 6.4(b), the top edge is
indicated by the red line, and the bottom edge is indicated by the blue line. The distance between the edges
at each point along the line is calculated.
Figure 6.4: Image processing to determine edges of the printed line. (a) is the original image, and (b) shows the
edges of the printed line using Sobel edge detection.
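A minimal sketch of the linewidth statistics is given below. It assumes a dark printed line on a light background; the file name is a placeholder, and for brevity the edge rows are read directly from an Otsu-thresholded mask per image column rather than from the Sobel response shown in Figure 6.4. Pixel widths would still need conversion to mm using the camera's pixel pitch.

```python
import cv2
import numpy as np

# Per-column linewidths d_i, then the mean (Eq. 6.5) and variance (Eq. 6.7).
img = cv2.imread("line.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

widths = []
for col in range(mask.shape[1]):
    rows = np.flatnonzero(mask[:, col])   # ink pixels in this column
    if rows.size:                         # top-to-bottom edge distance d_i
        widths.append(rows[-1] - rows[0] + 1)

widths = np.asarray(widths, dtype=float)
d_bar = widths.mean()                     # mean linewidth (pixels)
sigma2 = widths.var()                     # variance of linewidth
print(d_bar, sigma2)
```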
3. Discontinuity in the printed artifacts: We want to determine if the printed artifacts have any discontinuity.
To determine if the line is continuous, we can calculate the ratio of the printed length. Define L as the planned
length and Lp as the printed length. The ratio of the actual printed line can be written as ρ = Lp/L. The
first step is to convert an input image to a binary image with a thresholding technique. The background
pixels are white, while the pixels of the printed parts or noise can be black. Figure 6.5(a) and (b) show
the conversion from the original image to the binary image. Then, we can apply connected component
analysis (CCA) to identify the group of black pixels that are connected together. Using CCA method, line
segments in the image can be detected. The bounding boxes can be applied to these groups of pixels that
are segmented. Figure 6.5(c) shows the bounding boxes in red color. However, the noise was also detected
and marked with the bounding boxes as shown in the figure. To filter out the noise, we can apply a filtering
process based on the size of the bounding boxes. As noise is usually small and negligible, we can filter small
size bounding boxes. Figure 6.5(d) shows the line segmentation after filtering out the bounding boxes of the
noise. Here, each bounding box represents the line segment. The total length of the printed line Lp is the
sum of the length of all bounding boxes. When there are M line segments, Lp =
PM
k=1 Lk. Here, Lk is the
kth bounding box. Then, we calculate the ratio of the actual printed line, ρ = Lp/L. When ρ = 1, the line
is continuously printed. As ρ decreases, the line is not printed contiguously and the discontinuity along the
line increases. In the extreme case, ρ = 0 and this indicates that the line is not printed at all.
Based on the physics and related research about DIW application [146], we can assume the impact of
changes in process parameters on the printed artifacts. An increase in pressure could lead to an increased
flow of the ink, so the printed line could have a wider linewidth. Excessive pressure can also lead to
increased variance of the linewidth due to fluctuations in the ink. If the pressure is too high, the material
could be accumulated on the nozzle tip, resulting in blobs. Figure 6.6 shows the blobs that occur on the tip
of the ink dispenser. On the other hand, decreasing pressure can lead to thinner lines due to less material
deposition. The variance of the linewidth could be lower as the dispensing process could be more controlled
as the ink will be dispensed more slowly with decreased pressure. If the pressure is too low, the material
may not be dispensed continuously, resulting in irregularities in the printed line. In such cases, the ratio of
the actual printed line could decrease.
Figure 6.5: Image processing to determine discontinuity in the printed artifacts. (a) is the original image, and (b)
is the binary image. (c) shows all the bounding boxes detected including noise. (d) shows the line segmentation is
successful after filtering out the noise.
When the tool forward velocity increases, the material is deposited with less time for spreading. This
could result in a thinner linewidth. When the tool is moving too fast, discontinuity may occur in the printed
line. Conversely, decreasing the velocity can result in a thicker linewidth or higher variance because there is
more time for irregularity to occur. If the tool is moving too slowly, material accumulation could happen as
the ink stays in one area for too long. Also, the ink could stick to the nozzle tip and hinder the smooth and
straight material deposition.
6.5 Surrogate Model Construction
The process parameter model that maps between the input process parameters and output task performance
is a complicated, non-linear and black-box model. Obtaining a comprehensive process parameter model
often requires conducting a significant number of experiments, which can be time-consuming. Alternatively,
a surrogate model can be constructed to approximate the relationships between process parameters and task
performance.
Figure 6.6: Printing failure: blobs occurring during the printing process
Surrogate models are constructed using the data acquired from the initial experiments. Various data at
different time steps could be acquired from time step 1 to time step N′ (assume the experiment continues until time step N′, with a total duration of N′ · ∆t). Figure 6.7 represents how the surrogate models are
constructed for each time step. We have three task performance measures. Gaussian Process (GP) regression
models are used to estimate the process parameter models. GP 1 is a model for the mean of the linewidth.
GP 2 is for the variance of the linewidth, and GP 3 is for the ratio of the printed line. These GP models
can then be utilized to predict the optimal process parameter ranges that satisfy the task performance
constraints.
Figure 6.7: Three GP models are constructed in each time step.
When constructing GP models, hyperparameter optimization could be performed. We select different kernel
functions to fit the GP models and optimize their hyperparameters to improve the accuracy of the models.
The choice of kernel function can significantly impact the performance of the GP models in predicting the
task performance measures. In our previous study, it was demonstrated how the performance of prediction
could vary depending on whether hyperparameter tuning was performed for a robotic sanding task.
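A sketch of this per-time-step model construction is shown below using scikit-learn. The kernel names reported later in this chapter (e.g., ARD matern32) suggest a different toolchain was used in practice, so the library choice, kernel settings, and number of optimizer restarts here are assumptions for illustration; scikit-learn tunes the kernel hyperparameters by maximizing the log marginal likelihood during fit.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import ConstantKernel, Matern

    def fit_step_models(X, y_mean, y_var, y_ratio):
        """Fit GP 1-3 for one time step.

        X is an (n, 2) array of (pressure, velocity) samples; the three
        targets are the mean of linewidth, variance of linewidth, and
        printed-line ratio measured at that time step.
        """
        models = []
        for y in (y_mean, y_var, y_ratio):
            # ARD-style Matern 3/2: one length scale per input dimension.
            kernel = ConstantKernel(1.0) * Matern(length_scale=[1.0, 1.0], nu=1.5)
            gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                                          n_restarts_optimizer=5)
            models.append(gp.fit(X, y))
        return models  # [GP 1 (mean), GP 2 (variance), GP 3 (ratio)]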
6.6 Parameter Estimation
This section outlines the process for selecting the optimal process parameters at each time step using GP
models. We first calculate the probability of meeting each task performance for the sets of possible process
parameters. P1 indicates the probability of meeting the constraint of the mean of linewidth. P2 is the
probability to satisfy the variance constraint. P3 is the probability of satisfying the constraint of the printed
line ratio. Then, using the probabilities, we identify the combination of process parameters that maximizes
the expected print length in the time step. This combination of process parameters is considered the optimal
set for the time step. The process is as follows (a code sketch of this selection loop appears after the list):
1. Select the first time step j = 1.
2. Generate all possible combinations of p and v within the process parameter limits and save the data
to Ωj .
3. Make predictions of all task performance for each combination (p, v) in Ωj using GP 1, 2, 3 models in
the current time step j.
4. Using the mean and variance of the predictions, calculate the probabilities to meet the task performance, P1, P2, P3.
5. Save a combination of p and v to Ψj if its P3 value is greater than or equal to a certain threshold α.
6. Calculate the probabilities to meet the constraints for both task performance 1 and 2. The probability
can be computed by P = P1 ∗ P2.
7. Compute the expected length in the current time step by multiplying the probability, velocity, and
duration of printing, P ∗ v ∗ ∆t.
8. Select the combination of (p∗, v∗) which can maximize the expected length of the print in the current
time step. Save the combination to the data set Sj = (p∗, v∗).
9. Go to the next time step and repeat the process from 2-9. Stop the process if the end of the time steps
is reached.
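The sketch below implements steps 2-8 for a single time step under stated assumptions: the linewidth constraint is taken as a two-sided band d′ ± γ1, the variance constraint as an upper bound γ2, and the ratio constraint as a lower bound rho_min (a hypothetical value, since the exact form is not restated here). The probabilities follow from the Gaussian predictive distributions of the GP models.

    import numpy as np
    from scipy.stats import norm

    def select_parameters(gp1, gp2, gp3, candidates,
                          d_target=1000.0, gamma1=50.0, gamma2=5000.0,
                          rho_min=0.95, alpha=0.8, dt=300.0):
        """Return the (p*, v*) in `candidates` (an (n, 2) array of
        pressure-velocity pairs) maximizing the expected print length."""
        mu1, s1 = gp1.predict(candidates, return_std=True)
        mu2, s2 = gp2.predict(candidates, return_std=True)
        mu3, s3 = gp3.predict(candidates, return_std=True)
        s1, s2, s3 = (np.maximum(s, 1e-9) for s in (s1, s2, s3))
        # P1: mean linewidth inside d_target +/- gamma1.
        p1 = norm.cdf((d_target + gamma1 - mu1) / s1) \
           - norm.cdf((d_target - gamma1 - mu1) / s1)
        # P2: linewidth variance below gamma2.
        p2 = norm.cdf((gamma2 - mu2) / s2)
        # P3: printed-line ratio at least rho_min (assumed constraint form).
        p3 = 1.0 - norm.cdf((rho_min - mu3) / s3)
        expected_len = p1 * p2 * candidates[:, 1] * dt   # step 7: P * v * dt
        expected_len[p3 < alpha] = -np.inf               # step 5: filter on P3
        best = int(np.argmax(expected_len))
        return candidates[best], expected_len[best]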
6.7 Results
We validate our approach in a DIW application using a robotic arm. The ink is prepared as follows: silicone (20 g of Ecoflex 00-30A and 20 g of Ecoflex 00-30B, a 1:1 ratio), 1.5 g of dark-colored silicone pigment (model: Silc Pig), and 2 g of silicone thickener (model: THI-VEX). The prepared materials are mixed with a planetary centrifugal mixer (model: Thinky Mixer ARE-310) at a speed of 2000 rpm. Each mixing cycle is 30 seconds, and two cycles were performed. The room temperature is kept between 72-74°F. To ensure
consistency, we maintain the same conditions every time a new ink is prepared. Environmental conditions,
as well as the ratio and amount of each material used, can affect ink drying rates. Therefore, it is crucial
to maintain the same printing conditions. We evaluated our approach based on three factors. Firstly, the
results of the initial experiments were shown. Secondly, we compared the outcomes of printed artifacts with
and without temporal compensation. Finally, we estimated the duration of the feasible printing process.
The details of the results are presented as follows.
1. Model Construction using Data Set from Initial Experiments: During the initial experiments, we printed
test artifacts to investigate the impact of different process parameters on task performance. Different combinations of pressure and velocity were used to print test artifacts. The range of p was between 2-16.9 psi,
and the range of v was between 1-6 mm/s. The time step was ∆t = 5 min. We have a total of 225 data points across the time steps j = 1, 2, ..., 25. In the last time step, the ink was not dispensed and the initial experiments
ended. To determine the accuracy of the surrogate models, first we randomly divided the entire data set
into training and test sets (200 of the training data and 25 of the test data). The GP models were then
constructed using the training set and validated with the test set. The predicted process parameters are
compared with the actual values in the test set. Root mean squared error (RMSE) was computed. We
repeated the computation 5 times, and the results are as shown in Table 6.1. RMSE 1, 2, and 3 are the
error values for d̄, σ², and ρ, respectively. We set the constraints as follows: the desired linewidth d′ is 1000 µm and γ1 = 50; the threshold of the variance is γ2 = 5000; and α = 0.8. The results showed that RMSE 1 ranges between 47.62-65.05 µm, RMSE 2 ranges between 2598.46-3864.79 µm², and RMSE 3 ranges between 0.02-0.19.
Table 6.1: The computation of RMSE 1, 2, and 3. We ran the computation five times. The training and test data are randomly selected for each run.

Run    RMSE 1    RMSE 2     RMSE 3
1      50.55     2959.28    0.11
2      51.00     2598.46    0.02
3      47.62     3113.71    0.19
4      62.19     3864.79    0.07
5      65.05     3521.28    0.10
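The random-split validation described above can be sketched as follows; fit_step_models is the hypothetical fitting routine sketched in Section 6.5, and the seed handling is an assumption.

    import numpy as np
    from sklearn.metrics import mean_squared_error

    def rmse_trials(X, Y, n_trials=5, n_test=25, seed=0):
        """Repeat a random 200/25 split; Y columns are (d-bar, sigma^2, rho)."""
        rng = np.random.default_rng(seed)
        results = []
        for _ in range(n_trials):
            idx = rng.permutation(len(X))
            test, train = idx[:n_test], idx[n_test:]
            models = fit_step_models(X[train], Y[train, 0], Y[train, 1], Y[train, 2])
            results.append([np.sqrt(mean_squared_error(Y[test, j],
                                                       m.predict(X[test])))
                            for j, m in enumerate(models)])
        return np.asarray(results)   # shape (n_trials, 3): RMSE 1, 2, 3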
2. Temporal Adjustment: We determined the impact of temporal compensation during the DIW process.
We adjusted the pressure applied to the ink dispenser and the tool forward velocity over time. Because
our experimental setup was not suitable for printing a large artifact, we only printed a small artifact at the beginning and at the midpoint of each time step to analyze the task performance, as shown in Figure 6.8. If the
printed artifacts satisfy all constraints, we assume that the selected process parameters work for that time
step. Although ink drying process may vary depending on whether printing is done continuously for the full
duration or only for a portion of the time, we make this assumption for simplicity. Based on this assumption,
we calculate the expected total length, E[Ltotal]. This represents the total printed length if the experimental
setup allows for printing large artifacts. We computed E[Ltotal] when the temporal compensation was applied
and not applied. The comparison is shown in Table 6.2. If the process parameters are not compensated
over time, the expected total length, E[Ltotal], is approximately 4 meters. However, with the temporal
compensation approach, the expected total length is around 26 meters, which is significantly longer. The
lines on the left side in Figure 6.8(a) are printed without the compensation, and the lines on the right side
are printed with the compensation. Figure 6.8(b) shows an enlarged photo of the last five lines. The last two
lines (time step j = 5.5 and 6) in the orange box demonstrate a thinning process as the process parameters
used were not compensated over time.
Figure 6.8: (a) Printing actual artifacts at different time steps. (b) The artifacts in the orange box show the printing process without temporal adjustment; the artifacts in the blue box show the printing process with temporal compensation.
Table 6.2: Comparison of the expected total length when the process parameters are adjusted over time or not adjusted.

                               E[Ltotal] (mm)
With temporal adjustment       26095
Without temporal adjustment    3889
We explored the effect of using different kernel functions to construct the GP models. As presented in Table
6.3, using the optimal kernel function improves the accuracy of predictions and maximizes the expected
printing length. In particular, when the ARD matern32 kernel is chosen, the estimated total length is longer
than the other two cases, which use different kernel functions. The difference in length percentages is approximately 1.6-2.8% when compared to using the squared exponential and exponential kernels, respectively.
Table 6.3: Comparison of the expected total length of prints with and without GP tuning.

GP model               Kernel function        E[Ltotal] (mm)
With optimization      ARD Matern 3/2         26529
Without optimization   Squared exponential    26095
Without optimization   Exponential            25816
3. Estimated duration of the feasible printing process : Finally, we aim to estimate the duration of time
that the DIW process is successful. Beyond a certain duration, the printing process may fail to meet
the constraints due to physical limitations and ink drying. For instance, once the mixed silicone becomes
more viscous beyond a certain point, it becomes difficult to smoothly dispense the ink even with parameter
adjustments. At this point, further attempts to print may lead to a waste of resources. Moreover, such
attempts could result in additional costs associated with the replacement of the ink due to failed trials. By
determining the duration of the feasible printing process, we can minimize the waste of resources and cost
incurred in failure. The prediction from the model shows that the task performance will fail after the 20th
time step or 100 minutes. At the final time step, the selected process parameters were (p, v) = (16.9, 1.2).
Figure 6.9 displays the process parameters selected at each time step. The brightest color represents the
lowest time step (time step = 1), while the darkest color represents the highest time step (time step = 20).
As shown in the graph, the pressure increased over time while the tool velocity generally decreased as the
time step increased.
6.8 Summary
In this chapter, we presented the adaptive and iterative learning approach for the DIW application. Our
approach includes data acquisition during initial experiments, task performance analysis with image processing, surrogate model construction, estimation of process parameters at different time steps, and estimation
of the duration of feasible task executions. We aimed to build an efficient learning framework to maximize
the utilization of the ink. We computed the total length to be printed while satisfying the constraints for
both process parameters and task performance. We applied our approach to DIW setup with a robot arm.
Two process parameters were controlled: the pressure applied to the ink dispenser and the tool forward velocity. The measurements of the task performance were analyzed using image processing, which yielded the mean of the linewidth, the variance of the linewidth, and the ratio of the printed line.

Figure 6.9: The set of process parameters that were selected over time
Using test artifacts, we explored how changes in process parameters at each time step affect the task performance. With the initial data, we built the surrogate models and determined the accuracy of the surrogate
models. We divided the entire data set into training and test sets and then computed the RMSE of the
predicted values of task performance. We also printed actual artifacts with or without temporal adjustment
of process parameters. Our results demonstrated three main factors: (1) The surrogate models constructed
using the initial data set achieved sufficient accuracy. (2) Enabling temporal
adjustment of process parameters could result in significantly longer printing runs compared to the process
without such adjustment. Additionally, optimization of the kernel function in GP models can also lead to a
longer expected length. (3) The duration of feasible task execution was estimated, thereby preventing waste
of the material and additional costs that may occur due to failed printing attempts. Overall, our approach
provides an effective learning strategy for DIW using a robot arm.
Chapter 7
A Sequential Decision Making Approach to Learn Process
Parameter Models by Conducting Experiments on Sacrificial
Objects
7.1 Introduction
Often, we need to learn process parameters when a task needs to be performed on a part with new material
or new geometry. Learning process parameters may also be required when dealing with new tools. A trial-and-error approach is often adopted while learning the process parameters. However, directly learning these
parameters on the target object or the object of interest can be impractical and costly, as the cost of failure
is very high. Using incorrect process parameter values during the trial-and-error phase can result in serious
consequences or irreversible damages for the target object, requiring expensive repairs or even replacement.
To mitigate these risks and costs, many complex tasks require the use of sacrificial objects (i.e., test pieces).
These sacrificial objects will be used to determine proper process parameters before proceeding with the task
on the target object.
Since our goal is to complete the given task in the shortest amount of time while satisfying the constraints,
the number of experiments conducted on sacrificial objects plays a critical role. Conducting a substantial
number of experiments would theoretically enable us to safely and accurately identify the right process
parameters. However, this approach would lead to a very long task completion time and therefore incur high
costs. On the other hand, if we do not conduct a sufficient number of experiments on sacrificial objects,
then we may not be able to identify the appropriate process parameters. This can damage the target object
and lead to highly inefficient task execution. Once again, these outcomes would incur considerable expenses.
Hence, we need to determine the appropriate amount of experimentation required with sacrificial objects.
Neither too many nor too few experiments is a good choice, so finding the optimal midpoint is crucial.
Increasingly, robots are being used for high-mix manufacturing applications such as robotic sanding,
spray painting, and coating. Learning of the right process parameters is important to avoid damage to
target objects. As an example, robotic sanding has three process parameters: the force applied to the
surface, the forward velocity of the tool, and the rotational speed of the orbital sander. If the robot applies
excessive force to the surface, the object may be damaged and this requires expensive repairs. Similar
problems can occur in spray painting applications. If the robot is moving too slowly, or spraying too close
to the target surface, the spray paint may drip and hence the object is ruined. To prevent this, the robot
can conduct experiments on sacrificial objects to identify the right process parameters. Figure 7.1 shows the
robot spray painting a large mural.
Figure 7.1: A mobile manipulator spray painting a mural [36]
In the context of robotic experimentation, we can view this problem as a sequential decision making
problem. The decision has two parts. The first part of the decision is to decide whether to conduct
experiments on sacrificial objects or move on to task execution on the target object(s). Using reinforcement
learning analogy, experimentation on sacrificial parts can be viewed as exploration. A decision to execute
tasks on the target objects can be viewed as exploitation. Data from experimentation on sacrificial parts can
be used to construct a surrogate model of the process. When the process model is sufficiently complete and
can provide good process parameter settings, then a task can be executed on the target object. The second
part of the decision requires deciding which process parameters to choose during experimentation or task
execution. A sacrificial object may enable multiple experiments. Therefore, the robot may need to select a
batch of experimentation to perform. To find a good solution, the robot needs to explore new regions of the
parameter space. The robot may also choose to conduct some experiments near parameter settings where
good performance was observed. Once again, at the second part of the decision, the robot also needs to find
the right trade-off between exploration and exploitation. After each round of experimentation, the robot will
evaluate the task execution performance and use this information to decide what to do next. This approach
enables the robot to adapt its strategy based on the outcome of the previous actions. Figure 7.2 describes
the spray painting operation being performed on a sacrificial part. By doing this, the robot is exploring the
process parameter space.
Figure 7.2: The figures describe the spray painting operation on a sacrificial part.
This chapter investigates the problem of sequential decision making in the context of learning process
parameters by conducting experiments on sacrificial objects [140]. We use a spray painting task as an
illustrative example. We build a search tree to evaluate different decision options. Based on the expected
task execution cost, we decide whether to use sacrificial objects or perform tasks on the target object(s).
Whichever option gives us the least expected cost, we proceed with that option. We use a policy to select
process parameters by considering the trade-off between exploration of new parameter space and exploitation
of parameters that lead to satisfactory task performance. The method presented in this chapter is general
purpose and can be used for any application that may require the use of sacrificial objects.
7.2 Problem Formulation
First, we would like to define terminologies and notations for manufacturing tasks that may require the use
of sacrificial objects to explore process parameters.
(1) M: Number of the sacrificial objects that have been used. M is not a fixed number. The sequential
decision making process will decide how many sacrificial objects are needed to safely and economically
execute the tasks.
(2) N: Number of the target objects to finish. The amount of work is fixed from the beginning, so N is a fixed
and given number. We assume that all target objects are identical in size, geometry, and material.
(3) Nd: Number of the target objects that failed. When incorrect process parameters are used, catastrophic
events may happen on the target objects such as part cracks or surface damage. The damaged target objects
are not reusable and hence will be discarded. The cost and time spent setting up and executing tasks are
wasted.
(4) C^S_setup: Cost to set up the sacrificial object. This cost includes the material cost of the sacrificial object and other costs associated with preparing the sacrificial object for experimentation. In this chapter, we assume that the setup cost is the same and given.
(5) C^T_setup: Cost to set up the target object. This cost includes the material cost of the target object and all costs related to arranging the experimental setup. In this chapter, we assume that the setup cost for each target object is the same.
(6) C^S_execution: Cost to execute a certain manufacturing task using the sacrificial object.
(7) C^T_execution: Cost to execute a certain manufacturing task using the target object.
(8) α: Ratio of time and cost coefficient. The time taken for the setup and task execution should be converted into cost. α indicates the ratio between time and cost, and its unit can be, for example, dollars/sec or dollars/min. α varies depending on the type of manufacturing task. We assume that α is constant during task execution.
(9) T^S_setup: Setup time for the sacrificial object. The setup time indicates the time taken to prepare task executions for a single sacrificial object.
(10) T^T_setup: Setup time for the target object, the time taken to prepare executions for a target object.
(11) T^S_execution: Time for conducting manufacturing tasks using the sacrificial object. While performing experimentation on the sacrificial object, the robot may do multiple experiments with different process parameter sets.
(12) T^T_execution: Time for executing manufacturing tasks on the target object.
The total cost of setting up a single sacrificial object can be written as C^S_setup + α · T^S_setup. For the sake of simplicity, we will denote this value as C1. Similarly, C2 is the total setup cost of a target object: C2 = C^T_setup + α · T^T_setup. We denote by C3 the cost of task execution on a sacrificial object, C3 = C^S_execution + α · T^S_execution. Likewise, the execution cost of a target object can be written as C4 = C^T_execution + α · T^T_execution.
In this chapter, the objective function is to minimize the expected cost to complete the tasks for N
number of target objects. The expected cost includes the costs spent to explore process parameters using
the sacrificial objects, and the cost spent to execute tasks on the target objects. The problem is formulated
as below.
Minimize

E[C_total] = Σ_{i=0}^{M} (C1 + C3) + Σ_{j=1}^{N} (C2 + C4) + Σ_{k=0}^{Nd} (C2 + C4)
Based on the assumption, the setup costs of both sacrificial and target objects, C1 and C2, are constant. The execution cost can differ depending on which process parameters are selected. Hence, the execution costs of sacrificial and target objects are functions of the process parameters. C3(i) indicates the task execution cost of the i-th sacrificial object, C4(j) indicates the task execution cost of the j-th target object, and C4(k) indicates the task execution cost of the k-th target object that failed. The above equation can be rewritten as follows:

Minimize

M · C1 + (N + Nd) · C2 + Σ_{i=0}^{M} C3(i) + Σ_{j=1}^{N} C4(j) + Σ_{k=0}^{Nd} C4(k)
In this chapter, we solve the above objective function using the sequential decision making process.
Details of the approach are described in the next section.
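As a small sanity check of the objective, the total cost can be evaluated directly once the per-object execution costs are known; the function below is a literal transcription of the rewritten objective, with all names hypothetical.

    def expected_total_cost(M, N, Nd, C1, C2, C3_costs, C4_costs, C4_failed_costs):
        """Total cost: M sacrificial setups, (N + Nd) target setups, plus the
        parameter-dependent execution costs C3(i), C4(j), and C4(k)."""
        return (M * C1 + (N + Nd) * C2
                + sum(C3_costs) + sum(C4_costs) + sum(C4_failed_costs))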
7.3 Approach
7.3.1 Overview
In this section, we describe the details of the approach to solve a sequential decision making problem. The
decision making approach is a combination of the look ahead search tree, surrogate modeling, and policy for
selecting process parameters. The search tree analysis is used to make the decision at the current status.
After making a decision, the experiment is performed using the selected process parameters. The task
performance results will be stored in the data set. The constructed search tree is discarded, and a new search tree will be constructed to make the next decision. In the experimentation context, the sequential decision
making process only commits to the current decision and does not commit to any future decisions. In the
computational context, the search tree is expanded beyond the first executions and calculates the effect of
further executions by simulating future decisions. By computing the expected costs of further executions, we
can make the best possible move from the current status. Part of the decision making process requires
selecting the process parameters to use. Depending on the amount of data acquired so far, the selection
approach can be different. It can be exploration-oriented or exploitation-oriented. We use the parameter
policy which takes the right trade-off between exploration and exploitation. We also use the surrogate
modeling approach where we can construct the approximate model of the relationship between input and
output parameters. The details of each approach are described below.
7.3.2 Search Tree Evaluation
Search tree analysis can be used in a variety of applications to quickly analyze the values and performance
associated with the tree. We use this search tree evaluation for our decision making process to search the
best move (decision) at the current status. Our proposed search tree consists of decision nodes, action nodes,
and outcome nodes as shown in Figure 7.3. At the decision nodes, an agent needs to make a decision and
the node will be extended to action nodes. Each action node denotes a possible action that can be made
from the decision. After action nodes, the path flows to the outcome nodes that show the results of the
actions. Figure 7.3 shows that three actions can be made from a decision (D1). Those actions are Action
1 (A1), Action 2 (A2), and Action 3 (A3). A1 leads to three outcomes, A2 leads to two outcomes, and A3
results in one outcome. Each outcome is associated with probability. Using the search tree, we can compute
the expected costs of the outcomes under each action. The expected costs of an action can be written as
E(C) = Σ_{i=1}^{N} C_i · P_i, where C_i represents the cost of the i-th outcome in the action, P_i stands for the probability associated with the outcome, and N is the total number of outcomes in the action. After computing the expected
cost of A1, A2, and A3, we can make the best move at the decision node, D1, by comparing the expected
costs.
Figure 7.3: The example of search tree structure.
The search tree analysis consists of expansion and backpropagation. Expansion means that additional nodes are created after the outcome nodes. For example, in Figure 7.3, the outcome nodes (O1 to O6) can be extended: decision nodes can be appended to the outcome nodes to add further rounds of decisions. From those decision nodes, more action nodes are created, each with its associated outcomes. Suppose we want to evaluate the possible decisions, actions, and outcomes several rounds ahead. We can compute the expected costs at the leaf nodes (outcomes) and then backpropagate the results to the first decision node to select the best possible path. In this case, we can make the decision based on the possible outcomes after several rounds of decisions instead of a single round. This search tree with expansion and backpropagation can be applied to the decision making process of manufacturing tasks.
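A compact sketch of this expansion-and-backpropagation evaluation is given below. The node classes and function names are hypothetical; the point is that a decision node takes the minimum expected cost over its actions, while an action node takes the probability-weighted sum over its outcomes plus its own cost.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Outcome:
        probability: float
        next_decision: Optional["Decision"] = None  # None at a leaf node
        leaf_cost: float = 0.0                      # cost accrued at a leaf

    @dataclass
    class Action:
        action_cost: float          # e.g., C1 + C3 (sacrificial) or C2 + C4 (target)
        outcomes: List[Outcome] = field(default_factory=list)

    @dataclass
    class Decision:
        actions: List[Action] = field(default_factory=list)

    def evaluate_decision(d: Decision) -> float:
        """Best move = action with the least expected cost (decisions cost nothing)."""
        return min(evaluate_action(a) for a in d.actions)

    def evaluate_action(a: Action) -> float:
        """Expected cost = own cost + probability-weighted child costs."""
        exp_cost = a.action_cost
        for o in a.outcomes:
            child = (evaluate_decision(o.next_decision)
                     if o.next_decision else o.leaf_cost)
            exp_cost += o.probability * child
        return exp_cost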
7.3.3 Gaussian Process Surrogate Model
We can build surrogate models after performing experiments and collecting the data. When performing a
new task, the process model that maps input and output parameters is not known in advance. Experiments
or engineering simulations may be performed to determine the accurate process model. However, this is not
always possible since it requires thousands of evaluations. Instead, we will use the surrogate model approach
that constructs the approximation of the process model. This approach is much cheaper to evaluate because
the number of observations required is much smaller. Several types of surrogate models were considered: polynomial regression, Gaussian process regression, and artificial neural networks.
We decide to use Gaussian process (GP) regression because it can provide predictions with uncertainty.
GP regression is a non-parametric kernel based probabilistic model. It can produce the predicted mean
and variance of the model at unobserved data based on the previous observations. We can construct a GP
surrogate model based on the existing data set, and keep updating the model whenever we perform new
experimentation. Uncertainty in the GP surrogate model can be an indicator when selecting which process
parameters to use. For example, high uncertainty means that regions are under-explored. We may select
process parameters from under-explored regions, conduct experiments, and update the model. This approach
allows the model to become more complete. The GP regression model is flexible even with a limited amount of data, which is another reason we chose it for surrogate modeling.
7.3.4 Policy to Select Process Parameters
When a robot executes a task, it needs to select process parameters. It can perform pure exploration, pure
exploitation, or something in between. We define the parameter policy as the policy that maps the current
state to the selection of process parameters. This means the policy provides which process parameters to
choose based on the current state. There can be several policies that an agent (robot) can use. For example,
one policy can be to select the process parameter set that can maximize the immediate future rewards.
Another policy can be to select the process parameters whose uncertainty measures in surrogate models are
high. Another example of a policy is to select the process parameter set that will provide the lowest failure
rate on the task. Likewise, there can be various parameter policies to choose from. A good policy will be a
strategy that considers the right balance between exploration and exploitation.
7.4 Case Study: Robotic Spray Painting Task
7.4.1 Process Parameters and Task Performance
As a case study, we apply our approach to the robotic spray painting task. An experimental setup consists
of a robot arm that can perform spray painting on target objects (target panels). The robot does not know
the process models in advance. The robot can perform the task on either sacrificial panels or target panels.
Whether the robot chooses to perform spray painting on sacrificial panels or target panels, there are costs to
pay: setup cost and execution cost. In spray painting application, the setup cost includes the cost of a panel
and spray paint can, the cost and time to replace a panel with a new one, to calibrate the tool for a newly placed panel, and to move the robot arm to its initial position and orientation. The execution
cost includes the operation cost of the robot and the time taken for the spray painting task. Here, the time
spent should be recalculated to the cost by using the ratio of time and cost coefficient α. In this case study,
we assume that the setup cost is significantly higher than the execution cost. Another assumption is that
the sacrificial panel is solely designed for practice purposes and much cheaper than the target panel.
The first part of the decision making process is whether to use a sacrificial object or a target object. The second part is to select which process parameters to use for experimentation. In the spray
painting application, there are three process parameters. We define those as below.
- Distance between the tool (spray paint can) and panel surface, d (mm).
- Angle between the tool and panel surface, a (degree).
- Forward velocity of the tool, v (mm/s).
Figure 7.4: Illustration of two process parameters: the distance and the spray angle.
Figure 7.4 shows an illustration of two process parameters, distance (d) and spray angle (a). We measure
the task performance after performing the experiment using selected process parameters. There are two task
performance measures.
- Dripping status, D (no unit).
- Width of the spray paint, w (mm).
We determine that the task fails and the panel is permanently damaged when spray painting produces
drips on the target panel. Once spray paint drips happen, it is difficult to remove them. The target panel has to be thrown away, and the cost spent on setup and execution is wasted.
One of the task performance measures, the dripping status, can be seen as a qualitative measure. We divide it into six levels: failure 1 to failure 5, and success. Failure 1 means there are extreme drips on the
panel. Similarly, failure 2 stands for high dripping, failure 3 means moderate dripping, failure 4 indicates
low dripping, and failure 5 means extremely minor drips happen. The success denotes there is no drip. The
dripping status is scored by human evaluation. Figure 7.5 shows the example of the results from the spray
painting experiment. In terms of task performance measurements, (a) shows extremely minor drips (failure
5), (b), (c) show no drips (success), (d) shows high dripping (failure 2), and (e) shows extreme drips (failure
1), (f) illustrates moderate dripping (failure 3).
Figure 7.5: The example of spray painting task, (b) and (c) show successful results without any drip, and (a), (d),
(e), (f) show failed results.
When using the sacrificial object, the robot will select a batch of experiments to perform. The sacrificial
panel is for practice purposes, so it is recommended to explore the parameter space as much as possible.
We assume that there are 10 experiments in one batch. Then, the robot will follow the parameter policy
to choose which process parameter sets to use. The right balance between exploration and exploitation of
process parameters is considered when selecting process parameters. When performing the experiment on
the target object, we do not want to take any risk since the cost of failure is too high. Hence, we assume
that the best-known, successful process parameter set will be selected to execute the task. We will use the
process parameters that guarantee no drips on the panels. The details of the policy are described in Section
7.4.3.
7.4.2 Gaussian Process Regression
The process parameters and task performance were defined in the previous section. With those data, we fit
a model. Figure 7.6 illustrates two Gaussian process regression models constructed. One model predicts the
dripping status. The other model estimates the core width of the spray paint. We divide the dripping status
into six levels (extreme, high, moderate, low, extremely minor dripping, and no drips). We can change each
dripping status to a quantitative measure, so it can fit a regression model. For example, we can use 0 to
represent the success mode. We can also assign a number between 1 and 5 to indicate failure 1 through
failure 5 depending on the level of the dripping status. When measuring the width, we only consider the
core width which shows the clear and distinct spray paint stroke. If the spray paint stroke has vague and
misty areas, the areas are not considered part of the core width. If the paint stroke is all misty and there is
no distinct and clear line, then the core width is 0mm. For example, Figure 7.5(d), (e), (f) have the misty
areas on top of the clear paint strokes. Gaussian process regression provides estimations of output values
over the entire input range, along with uncertainty levels in the measurement.
Figure 7.6: The structures of GP regression models.
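A sketch of these two models is below, reusing the same GP machinery as in the DIW chapter. The numeric encoding of the dripping levels follows the text (0 for success, 1-5 for failure 1 through failure 5); the kernel choice and library are assumptions.

    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    # Hypothetical numeric encoding of the qualitative dripping status:
    # 0 = success (no drip); 1 = failure 1 (extreme) ... 5 = failure 5 (minor).
    DRIP_SCORE = {"success": 0, "failure 1": 1, "failure 2": 2,
                  "failure 3": 3, "failure 4": 4, "failure 5": 5}

    def fit_spray_models(X, drip_scores, core_widths):
        """X is an (n, 3) array of (d, a, v); returns the two GP models of
        Figure 7.6 (dripping status and core width of the spray paint)."""
        make = lambda: GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                                normalize_y=True,
                                                n_restarts_optimizer=5)
        return make().fit(X, drip_scores), make().fit(X, core_widths)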
7.4.3 Parameter Policy
In each round of the execution, the search tree algorithm is used to find the best sequence of decisions using
expansion and backpropagation. Once the first part of the decision is made, whether to use a sacrificial panel
or target panel, the robot needs to make the second part of the decision. The second part of the decision
making process is to select the set of process parameters. In exploitation stage, the selection of process
parameters is simple. When spray painting on the target object, we want to use the process parameters known
to work the best without any dripping. Hence, the parameter policy is to find the best set of parameters in
the current data set and exploit it. In this case, the task performances are deterministic because a specific
set of parameters is used in the experiment. On the other hand, when using the sacrificial object, the robot
needs to select multiple parameter sets for exploration. It can select process parameters from the regions
where uncertainty measurements on the predictions are high. This strategy makes the surrogate model more
accurate and complete. Having a sufficiently complete model can lead to good parameter selection. This
exploration method is especially effective in the beginning when we do not have a complete model. However,
once we have more data set and a sufficiently accurate model, we may want to perform pseudo exploitation.
We define pseudo exploitation as the process of selecting process parameters from a neighborhood of the
best set in parameter space. The best set means the set of parameters that works the best so far. We call
this pseudo exploitation since it can be viewed as a kind of exploitation process, however, it is not pure
exploitation.
Our parameter policy for the sacrificial panel is as follows. Define M′ as the total number of parameter
sets acquired from experimentation. Define N′ as the number of process parameters selected from a high
uncertainty area, and N′′ as the number of process parameters selected using pseudo exploitation. The
parameter policy should consider the right trade-off between exploration and pseudo exploitation. Depending on M′, we want to use a variable ratio of N′ and N′′. For example, when M′ is small, performing pseudo exploitation is not a good idea since we do not have sufficient data to build a complete model. Hence, we put more weight on the N′ side. When M′ increases, we have a larger data set. The surrogate models can be accurate enough to make good predictions, and we can perform more pseudo exploitation at this stage to find good parameter settings. Figure 7.7 shows the proportion of N′ and N′′ based on the variation of M′. M′ only increments when we acquire data by using the sacrificial object. The parameter policy suggests pure exploitation when using the target object; the data obtained from the target object is already in the existing data set, so it does not provide new data. We use a fixed schedule of N′ and N′′ for the sake of simplicity. It can be any ratio as long as the shape of the function is a decaying function, as shown in Figure 7.7. The decaying function means the policy suggests performing more exploration initially and more pseudo exploitation at a later stage.
Figure 7.7: N′ changes depending on the number of data M′. This decaying function means that the ratio of N′ and N′′ changes depending on M′.
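One concrete realization of this policy is sketched below; the exponential decay of N′ with M′, the candidate pool for random exploration, and the Euclidean notion of a pseudo-exploitation neighborhood are all assumptions, since the text only requires a decaying exploration share and some randomness in the rollout.

    import numpy as np

    def split_batch(m_prime, batch_size=10, decay=30.0):
        """Decaying schedule: returns (N', N'') for one 10-experiment batch."""
        n_explore = int(round(batch_size * np.exp(-m_prime / decay)))
        n_explore = min(max(n_explore, 1), batch_size - 1)
        return n_explore, batch_size - n_explore

    def select_batch(gp_drip, candidates, best_params, m_prime, rng):
        """Pick N' sets from high-uncertainty regions and N'' near the
        best-known set; randomness makes each Monte Carlo rollout differ."""
        n_explore, n_exploit = split_batch(m_prime)
        _, std = gp_drip.predict(candidates, return_std=True)
        pool = np.argsort(std)[-3 * n_explore:]        # most uncertain points
        explore = rng.choice(pool, size=n_explore, replace=False)
        dist = np.linalg.norm(candidates - best_params, axis=1)
        exploit = np.argsort(dist)[1:n_exploit + 1]    # neighbors, not the best itself
        return np.concatenate([explore, exploit])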
7.4.4 Search Tree Analysis
This section describes how the decision is made using the search tree and the analysis of the expected cost
is performed. We use search tree analysis to find the best move that returns the least expected cost at the
current status. In the computational context, we need to build the search tree with consideration of all
possible outcomes, expand the tree, and evaluate the expected cost to make the right decision at the current
decision node. The first step is to start with the first part of decision making process. When performing
the very first experiment, we will select the sacrificial object. Using the target object is very risky and may
damage the target because there are no data or experimental results. Hence, the first decision is always
to use the sacrificial panel to avoid catastrophic events. We assume that 10 experiments will be performed
when using the sacrificial panel. After the first round of execution, the obtained data set (10 different sets
of process parameters and the corresponding task performance measures) will be added to the current data
set.
148
After the first round of experimentation, we can start to build the search tree. Figure 7.8 shows the
simplified version of the search tree approach we have used. The decision nodes (D2, D3, etc.) represent the
first part of decision making process which is to select the sacrificial panel or the target panel. D2 denotes
the second round of the execution, D3 denotes the third round of the execution, and so on. There are two
possible actions from the decision node. Action 1 (A1) node is to use the sacrificial object, and action 2 (A2)
node is to use the target object. After the first part of the decision and the corresponding action is taken,
the second part of the decision should be made. The search tree will follow the parameter policy described
in Section 7.4.3 to select which process parameters to use. When using the sacrificial panel, we decide to run
five Monte Carlo rollout of the policy (this number can be changed by the user). Each Monte Carlo rollout
consists of 10 sets of process parameters. From these five batches of Monte Carlo rollout, we will have five
possible outcomes. Each outcome node (from O1 to O5) indicates the task performance measurements from
one batch of Monte Carlo rollout. For the target object, we perform pure exploitation. Therefore, there
will be only one outcome (O6). The process of decision, action, and outcome stands for one round of task
execution. After one cycle of the execution, the tree keeps expanding to the next round of the execution. We
can expand the tree up to Nt executions. For example, consider the case where we want to make the decision
at the 2nd node (D2). The tree starts from D2 and will be expanded to Nt rounds. Then the expected cost will be computed. The values at the leaf nodes will be backpropagated
to the parent nodes. Using backpropagation, we can compute the expected cost at each node in the search
tree. Finally, we can compare the expected costs of A1 and A2 under D2, and select the action that provides
the least expected cost.
When the target object is selected for task executions, the outcome values are deterministic because we
use exploitation. On the other hand, when performing the experiments on the sacrificial panel, the outcomes
are not deterministic. For the experiments, (N′+ N′′) sets of process parameters will be selected. The
parameter policy guides us where to explore and pseudo exploit, however, it does not specifically pick which
process parameters to select in the parameter space. The parameter policy itself has stochastic behavior,
and the outcomes will not be identical each time. Therefore, we have to perform Monte Carlo rollout of
the parameter policy. We perform five batches of the rollout while following the variable ratio between exploration and pseudo exploitation. In Figure 7.9, each rollout picks N′ sets of process parameters for exploration and N′′ sets of parameters for pseudo exploitation. The results of Monte Carlo rollout will be stored in the outcome nodes. Monte Carlo rollout will be performed at every execution in the search tree.

Figure 7.8: The search tree structure used in our algorithm. A single round of execution includes a decision, actions, and outcomes.
Using the search tree, we calculate the expected task completion cost (the cost of finishing the given
N number of target objects). For example, when Nt = 4, the search tree will be expanded up to the four
rounds of the task executions. Each outcome node will indicate the width and dripping status. Assume we
want to calculate the expected task completion cost at each leaf node under D1(A1)-D2(A1)-D3(A1)-D4(A1)
path. The path indicates that all four rounds of executions have used sacrificial objects. If the first three
executions failed and the fourth execution succeeded, the remaining target panels are N panels. Then, the
expected cost to finish all the panels will be N ∗ (C2 + C4) at one of the outcome nodes. The next step is
to calculate the expected task completion cost at the action node. Here, the action node itself costs expenses.
Using the sacrificial panel requires the setup and execution cost, C1 + C3. Using the target panel requires
the setup and execution cost, C2 + C4. Therefore, the expected task completion cost at the action node (C_A) is the sum of the expected costs from the child outcome nodes and the cost of the action itself (C_action). This can be written as C_A = Σ_{i=1}^{k} P_i · C_i + C_action, where k is the number of outcomes under the selected action node, P_i is the probability associated with the i-th outcome, and C_i is the task completion cost calculated at the i-th outcome. In this example, the sacrificial panel is selected and five batches of Monte Carlo rollout are performed. Hence, the number of outcomes is 5 (k = 5). Each batch is simulated with equal probability, so P_i = 0.2. On the other hand, if the target panel is selected (the parent action node is A2), we perform exploitation and there will be only one outcome set. Hence, k = 1 and P_i = 1 in this case.

Figure 7.9: Five batches of Monte Carlo rollout are performed. The figure shows the case where 8 sets of process parameters are selected for exploration and 2 sets are selected for pseudo exploitation.
After computing the expected task completion cost at the action nodes, we will backpropagate the values
to the parent decision node. In our search tree, making a decision itself does not incur any cost. We can start
comparing the cost of the action nodes (A1 or A2) at the Nt round. If CA1 < CA2, this means the expected
task completion cost of using the sacrificial panel is less than that of using the target panel. Hence, the
decision at Nt round is to use the sacrificial panel. After this step, the cost at the decision node will be
backpropagated to the parent nodes which are the outcome nodes at the (Nt −1) round of executions. Then
the above evaluation, backpropagation, and search process will be repeated until the search tree analyzes
the costs across all nodes of the search tree.
7.5 Results
We validate our proposed approach using an experimental setup to perform robotic spray painting. The
experimental setup consists of a KUKA robot and 2-finger Robotiq gripper. The gripper is attached to the
end effector of the robot and the gripper holds the spray paint can. First, we construct the search tree to
determine whether to use sacrificial panels or target panels, and decide which process parameters to conduct
the experiment. For computational analysis in the search tree, we assign the fixed values as follows. The
total number of target objects to spray paint is 100 panels (N = 100), the setup cost of one sacrificial object
is C1 = 20, the setup cost of one target object is C2 = 100. The execution cost of one sacrificial object is
C3 = 2 and the execution cost of one target panel is C4 = 15. Using these values, we can perform the search
tree analysis. The sequential decision making process only commits to the current decision, so the search
tree constructed is discarded after each experiment. New search tree will be built for the next decision. We
always look five steps ahead (five rounds of executions ahead) whenever we construct the new search tree.
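To put these numbers in perspective: one round on a sacrificial panel costs C1 + C3 = 20 + 2 = 22, one round on a target panel costs C2 + C4 = 100 + 15 = 115, and completing all 100 target panels costs at least 100 · 115 = 11500. A single damaged target panel therefore wastes more than five full sacrificial rounds, which explains why the early decisions in the search tree favor sacrificial experimentation.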
For the experimentation, we need to set a reasonable range of input process parameters. The distance
range used in the experiment is from 90mm to 200mm. For parameterization, the parameter space is
discretized with the spacing of 5mm. The range of the tool forward velocity is from 30mm/s to 120mm/s.
The parameter space is discretized with a spacing of 10 mm/s. The range of the spray angle is from 0° to 20°, discretized with a spacing of 10°. The table in Figure 7.10 shows the results of task performance. In the table, a blue background means the values are from exploration (N′), and a red background means the task performance values are from pseudo exploitation (N′′). We also use bold text to highlight the best task performance in each execution.

Figure 7.10: Result of task executions.
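Under this discretization, the candidate grid can be enumerated directly; the variable names below are illustrative.

    import itertools
    import numpy as np

    distances = np.arange(90, 201, 5)     # d: 90-200 mm, spacing 5 mm (23 values)
    velocities = np.arange(30, 121, 10)   # v: 30-120 mm/s, spacing 10 mm/s (10 values)
    angles = np.arange(0, 21, 10)         # a: 0-20 degrees, spacing 10 (3 values)
    grid = list(itertools.product(distances, velocities, angles))
    print(len(grid))                      # 23 * 10 * 3 = 690 candidate parameter sets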
1. The first execution: Since the first decision is always to use a sacrificial panel, 10 sets of process
parameters are selected for exploration (N′ = 10). There is no GP regression model at this stage
indicating the level of uncertainty because we don't have a data set. Hence, we select 10 sets of parameters that are reasonably distributed in the parameter space. Among the 10 sets, only 2 sets were
successful ([w, D] = [35mm, 0], [30mm, 0]). The other 8 sets resulted in spray paint drips on the sacrificial panel. [35mm, 0] is the best task performance set in the first execution because it does not produce drips and has the widest width. We assume that a wide core width is better than a narrow one, because it covers the surface faster.
2. The second execution: the search tree analysis starts from the second execution. Five steps look
ahead search tree is constructed with five batches of Monte Carlo rollout of the parameter policy. The
first decision making process is to use a sacrificial panel. In the second decision making process, the
parameter policy selects 8 sets of process parameters to perform exploration (N′ = 8) and 2 sets to
perform pseudo exploitation (N′′ = 2). By comparing the expected costs, we decide to use a sacrificial
panel at this stage. The outcome node with the lowest expected task completion cost is selected to
conduct the experimentation. 4 sets were successful (no dripping) among the 10 sets. The best task
performance result in the second execution is [42mm, 0].
3. The third execution: the search tree used to make the second decision is discarded, and the new one
is constructed for the third execution. We perform five steps look ahead as well. A sacrificial panel is
chosen from the first decision making process. 5 of the 10 sets resulted in no drips. The result from pseudo exploitation shows good performance, since the dripping status is either
no drips (D = 0) or minor drips (D = 5). The best task performance result in the third execution is
[50mm, 0].
4. The fourth execution: The new search tree is built for the fourth decision with five steps look ahead.
The first part of the decision is made to use a sacrificial panel. This time, we perform more pseudo
exploitation (N′′ = 8) and less exploration (N′ = 2). Among the 10 sets, 8 sets were successful. The
other 2 sets showed minor drips (D = 5). The best task performance is [w, D] = [65mm, 0].
5. The fifth execution: The new search tree is built for the fifth decision with five steps look ahead.
This time, the decision is to use a target panel. The second part of decision making process is to
exploit the best set of process parameters from the current model. The set of process parameters
that shows the best performance is [d, v, a] = [150 mm, 80 mm/s, 0°] (the corresponding task performance is [w, D] = [65 mm, 0]).
6. The sixth execution and beyond: The search tree is built for the sixth decision with five steps look
ahead. Again, a target panel is selected to execute the task. We can intuitively see that the search
tree newly constructed for further executions will choose to use target panels as it will keep producing
the successful result and save the task completion cost.
The above table and search tree analysis show how the decisions were made for the robotic spray painting
task. With the set of process parameters we gained in the experiments ([d, v, a] = [150 mm, 80 mm/s, 0°]),
we spray painted the word (USC) on a canvas. The result of spray painting is shown in Figure 7.11. No drip
was observed on the canvas.
Figure 7.11: The word, USC, is spray painted.
7.6 Summary
In this chapter, we presented a sequential decision making approach to learn process parameters by using
sacrificial objects. We solved the sequential decision making problem with a combination of look ahead search
tree, GP regression modeling, and parameter policy. We validated our proposed approach by performing the
computational analysis and experimentation on the robotic spray painting task. The main findings of this
work are the following:
(1) We have demonstrated that the search tree analysis by looking ahead future steps can help to make
an informed decision at the current stage.
(2) The surrogate model approach based on GP regression can be used to build the relationships between
input and output parameters in robotic spray painting application. It provides better predictions as
more experiments are performed and the model is updated based on the experimental data.
(3) The proposed parameter policy can balance the right trade-off between exploration and exploitation
of process parameters. Performing more explorations at the beginning makes the model more complete. Performing pseudo exploitation helps to find good parameter settings based on the constructed
surrogate models.
Chapter 8
Conclusions
The present chapter outlines the anticipated intellectual contributions and benefits resulting from the proposed work presented in this dissertation. Furthermore, it explores potential avenues for future research
directions.
8.1 Intellectual Contributions
The intellectual contributions of this dissertation include the following:
• The dissertation presents an efficient learning framework for tasks with constant process parameter
models. The AI algorithm explores the parameter space and guides the parameter search to enhance the
task performance. This adaptive and iterative learning approach finds the set of process parameters that
could minimize the expected task completion time. Our adaptive experimental design approach includes
feasibility biased sampling, surrogate model construction, and greedy optimization. We implement our
approach and validate it with computational simulations of robotic sanding. Our algorithm produces
robust and efficient learning outcomes in different scenarios of manufacturing settings. Compared to
the traditional design of experiment methods, our approach completes the sanding task with a shorter
task completion time while satisfying all performance constraints.
• This dissertation presents an adaptive and iterative learning framework for a robot to learn spatially
varying process parameter models. Our focus applications are contact-based surface finishing processes.
Our learning approach using AI-driven experimental design utilizes an initial parameter space exploration method, surrogate modeling, selection of region sequencing policy, and development of process
parameter selection policy. We demonstrated the effectiveness of our approach through computational
simulations and physical experiments with a robotic sanding application with a rotary tool. The computational simulation was tested in five different scenarios with varying parameter settings. Our work
shows that the learning approach that has been optimized based on task characteristics significantly
outperforms an unoptimized learning approach based on the overall task completion time. The proposed approach can be applied to other types of robotic contact-based finishing operations, such as
grinding, polishing, or buffing. Users can adopt the general framework and adjust the parameters
based on their specific robotic task requirements.
• This dissertation proposes an efficient learning approach for temporally varying process parameter
models. The optimal process parameters vary over time, so temporal adjustment is needed for efficient
task execution. The proposed learning framework includes systematic data acquisition during initial
experiments, construction of surrogate models, estimation of process parameters at different time
intervals, and determination of feasible task execution durations. Our target application is DIW
process where the ink drying problem is prevalent. We apply image processing techniques to analyze
the printed artifacts. Our results demonstrate the accuracy of the models, compare the DIW process
outcomes for actual artifacts with and without temporal adjustment, and estimate the feasible duration
for successful printing. With the temporal process parameter adjustment, our method maximizes ink
utilization.
• This dissertation investigates the sequential decision making of learning process parameters by conducting experiments on sacrificial objects. We determine the right number of sacrificial objects to identify
the proper process parameters while keeping the expected task completion cost to the minimum. The
sequential decision making approach we proposed is a combination of look ahead search, surrogate
modeling, and a policy to select process parameters. In computation, the search tree simulates future
decisions beyond the current decision and evaluates the costs by considering the effect of future experiments. The decision made in our approach only commits the user to the current stage and does
158
not affect future decisions. When a decision is made to conduct experiments on the sacrificial object,
the parameter policy to select process parameters should be simulated. In this policy, we consider the
right trade-off between exploration and exploitation. We validate our approach through experiments
using the robotic spray painting application.
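To make the constant-parameter framework concrete, the following is a minimal sketch of the adaptive loop: feasibility-biased sampling, Gaussian process surrogates for completion time and constraint margin, and a greedy pick of the next experiment. The run_experiment stand-in, the RBF kernels, and all numeric settings are illustrative assumptions, not the implementation used in this dissertation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def run_experiment(x):
    # Stand-in for a sanding trial: returns (completion_time,
    # constraint_margin); margin >= 0 means all constraints hold.
    time = 10.0 + 5.0 * np.sin(3.0 * x[0]) + 4.0 * (x[1] - 0.5) ** 2
    margin = 0.6 - abs(x[0] - 0.5)
    return time, margin

def feasibility_biased_samples(n, feas_gp=None):
    # Uniform candidates, filtered by the feasibility surrogate so
    # sampling is biased toward the predicted-feasible region.
    cand = rng.uniform(0.0, 1.0, size=(10 * n, 2))
    if feas_gp is None:
        return cand[:n]
    safe = cand[feas_gp.predict(cand) >= 0.0]
    return safe[:n] if len(safe) >= n else cand[:n]

X, times, margins = [], [], []
time_gp = GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True)
feas_gp = GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True)

for it in range(15):
    batch = feasibility_biased_samples(5, feas_gp if X else None)
    if X:
        # Greedy optimization: run the candidate with the lowest
        # predicted completion time.
        x_next = batch[int(np.argmin(time_gp.predict(batch)))]
    else:
        x_next = batch[0]
    t, m = run_experiment(x_next)
    X.append(x_next)
    times.append(t)
    margins.append(m)
    time_gp.fit(np.array(X), np.array(times))   # surrogate update
    feas_gp.fit(np.array(X), np.array(margins))

feasible = np.array(margins) >= 0.0
best = int(np.argmin(np.where(feasible, times, np.inf)))
print("best parameters:", X[best], "expected time:", times[best])
```

The greedy step is chosen here for simplicity; any acquisition function could replace the predicted-time minimizer without changing the overall loop structure.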
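For the spatially varying case, a region-by-region sketch is given below. The sand_region stand-in, the difficulty-based sequencing heuristic, and the per-region budget of eight trials are placeholder assumptions chosen only to illustrate the structure of region sequencing plus per-region surrogate fitting.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)

def sand_region(region_id, force, speed):
    # Stand-in trial: surface-quality error for one region; each
    # region has its own (unknown) best force/speed pair.
    f_star = 0.3 + 0.1 * region_id
    s_star = 0.6 - 0.05 * region_id
    return (force - f_star) ** 2 + (speed - s_star) ** 2 + 0.01 * rng.normal()

n_regions = 5
# Region sequencing policy: visit regions in decreasing order of an
# estimated difficulty score so hard regions are learned first.
difficulty = rng.uniform(size=n_regions)
sequence = np.argsort(-difficulty)

learned = {}
for r in sequence:
    X, y = [], []
    for _ in range(8):  # small per-region experiment budget
        x = rng.uniform(0.0, 1.0, size=2)  # (force, speed) candidate
        X.append(x)
        y.append(sand_region(r, *x))
    gp = GaussianProcessRegressor(normalize_y=True).fit(np.array(X), np.array(y))
    # Parameter selection policy: minimize the surrogate mean over a
    # dense random candidate set.
    grid = rng.uniform(0.0, 1.0, size=(500, 2))
    learned[int(r)] = grid[int(np.argmin(gp.predict(grid)))]
    print(f"region {r}: force={learned[int(r)][0]:.2f}, "
          f"speed={learned[int(r)][1]:.2f}")
```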
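For the temporally varying case, the sketch below fits a surrogate over (parameter, elapsed time) pairs gathered in initial experiments and then queries it at later times to pick time-adjusted parameters and flag when the task becomes infeasible. The print_line_width drying model, the target width, and the pressure bounds are synthetic assumptions for illustration only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)

def print_line_width(pressure, t):
    # Stand-in measurement: as the ink dries, the same pressure
    # extrudes a narrower line (width in mm, with sensor noise).
    return 0.8 * pressure * np.exp(-0.02 * t) + 0.01 * rng.normal()

# Systematic initial experiments: sweep pressure at several times.
times = np.arange(0, 90, 15)          # minutes since ink loading
pressures = np.linspace(0.5, 2.0, 6)  # normalized extrusion pressure
X = np.array([[p, t] for t in times for p in pressures])
y = np.array([print_line_width(p, t) for p, t in X])

gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)

target_width, max_pressure = 0.8, 2.0
for t in np.arange(0, 90, 10):
    # Temporal adjustment: re-estimate the pressure for this time.
    cand = np.linspace(0.5, max_pressure, 200)
    widths = gp.predict(np.column_stack([cand, np.full_like(cand, t)]))
    k = int(np.argmin(np.abs(widths - target_width)))
    feasible = abs(widths[k] - target_width) < 0.05
    print(f"t={t:3d} min: pressure={cand[k]:.2f} "
          f"({'feasible' if feasible else 'infeasible'})")
```

The last time step at which the target width remains reachable within the pressure bound gives the estimated feasible printing duration.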
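For the sacrificial-object case, the following is a deliberately simplified look-ahead sketch: it replaces the surrogate-driven belief with a synthetic failure-probability curve and searches a few steps ahead before committing only to the current decision, mirroring the structure (though not the fidelity) of the proposed search tree. All costs and the failure_prob curve are illustrative assumptions.

```python
import numpy as np

COST_SACRIFICIAL = 1.0  # cost of one trial on a sacrificial object
COST_FAILURE = 20.0     # cost of violating constraints on the real part

def failure_prob(n_trials):
    # Stand-in belief: each sacrificial trial makes the learned
    # parameters less likely to fail on the real part.
    return 0.5 * np.exp(-0.4 * n_trials)

def lookahead_cost(n_done, depth):
    # Expected cost of the best policy from this state, found by
    # simulating up to `depth` future experiments in the search tree.
    stop_cost = failure_prob(n_done) * COST_FAILURE
    if depth == 0:
        return stop_cost
    continue_cost = COST_SACRIFICIAL + lookahead_cost(n_done + 1, depth - 1)
    return min(stop_cost, continue_cost)

n = 0
while True:
    stop = failure_prob(n) * COST_FAILURE
    cont = COST_SACRIFICIAL + lookahead_cost(n + 1, depth=3)
    if stop <= cont:   # commit only to the current decision
        break
    n += 1             # one more experiment on a sacrificial object
print(f"run {n} sacrificial experiments; residual expected failure "
      f"cost {failure_prob(n) * COST_FAILURE:.2f}")
```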
8.2 Anticipated Benefits
This dissertation mainly focuses on AI-driven experimental design to learn process parameter models for robotic processing applications. With the adaptive experimental design approach, robots can explore parameters through experiments, evaluate task performance, and execute the task with parameters that produce the minimum expected task completion time. This iterative learning algorithm can handle various constraints in manufacturing applications, ranging from simple bounds to complicated nonlinear constraints. Also, the AI algorithm can incorporate the consequences of constraint violations into its learning sequence so that the parameter space can be explored safely. This can potentially avoid constraint violations and thereby reduce the costs associated with them; it benefits the entire system by preserving the part, the experimental setup, and the environment while saving time and money. The learning algorithm also incorporates prior knowledge into the exploration stage. Leveraging prior knowledge expedites learning by focusing on parameters of interest and utilizing physics-based relationships between parameters and task performance. Without prior knowledge, unnecessary experiments may be repeated and time wasted.
Four different learning approaches are introduced for different scenarios: tasks with constant process parameter models, safe task exploration, tasks with spatially varying process parameters, and tasks with temporally varying process parameters. They share the same learning structure of exploring the parameter space, updating the model, and determining the best process parameters. The learning details differ per scenario because of the characteristics of each manufacturing task. As every manufacturing task has its own physical setting and application, the learning approach should be tailored to each manufacturing application.
The utilization of AI in experimental design enables the deployment of robots in high-mix low-volume, customized, and small-batch manufacturing tasks. This means the AI algorithm allows for efficient, adaptive, and safe learning strategies for new parts, new geometries, and even new tools. These capabilities are beneficial in a wide range of manufacturing tasks and robotic processing applications. The learning algorithm can be adaptive, flexible, and reliable while providing more efficiency than traditional learning methods for robotic processing. The adaptive and iterative learning facilitates the optimization of task execution by continuously adapting and refining the robotic task execution parameters. Subsequent benefits include improved performance and significantly reduced manufacturing cost and time. The optimization of task execution, the elimination of inefficient procedures, and safe exploration together result in cost savings and shorter production cycles.
8.3 Future Directions
The present work can be extended in the following directions in the future.
• Integration with advanced sensor systems: The adaptive learning of manufacturing processes can be integrated with intelligent sensors to expand its capabilities. Advanced sensing systems can effectively monitor numerous variables in manufacturing processes and provide a rich source of data. They could enable real-time sensing with more accurate parameter measurements, which in turn could yield more accurate surrogate models and parameter estimates. Accurate parameter estimation in manufacturing processes could decrease failures and undesirable outcomes, achieve better task performance, and eventually reduce task completion time.
• Transfer learning across various manufacturing domains: One future direction is to further investigate the transferability of learned parameters and models across different manufacturing domains. While specific tasks may differ, many robotic processing applications share common underlying structures, such as the robotic system and its planning and control. The data and knowledge obtained from one robotic manufacturing task can be transferred and adapted to similar or related tasks. For instance, data gained from robotic sanding could be transferred and used for learning robotic cleaning; both are contact-based tasks in which a robot must apply force or pressure on a surface to improve its quality. Such knowledge transfer can significantly reduce the time required to learn and optimize new tasks. It eliminates redundant data acquisition and provides useful information for estimating the right parameters, enabling efficient exploration and optimization of manufacturing processes in similar domains.
• Enhanced efficiency and automation of the learning components: Some of the learning components in the current algorithm are manually designed and selected. Fully automating them can make task execution more efficient. A good example is developing algorithms that automatically generate candidate policies for operation sequencing and process parameter selection. For instance, given the CAD files of a new part, the algorithm should be capable of suggesting optimal sequences for executing the manufacturing task and selecting the parameters, taking into consideration the specific characteristics of the task. By automating these processes, the overall learning process becomes more streamlined and efficient.