EXPLOITING MECHANICAL PROPERTIES OF BIPEDAL ROBOTS FOR PROPRIOCEPTION
AND LEARNING OF WALKING
by
Darío Urbina-Meléndez
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BIOMEDICAL ENGINEERING)
May 2023
Copyright 2023 Darío Urbina-Meléndez
Dedication
For my family, from whom I come,
and for my wife, with whom I go
Acknowledgements
First, I want to express my gratitude to my family: my wife, my mother, my father and my sister. I want
to thank Mariana, my wife, for always bringing love, happiness and support to my life. Before we got
married, she emigrated with me from México to the United States of America where I decided to pursue a
PhD degree. I want to thank my father, who taught me very basic and beautiful things in life, like walking
and meditating; he also taught me to be patient and to enjoy the process. I will always remember what
he told me when I graduated with my Engineering degree in México: "The significance of your degree
will depend on what you do after you get it". It is because of him that I like engineering. I want to thank
my mother, who has always been an immeasurable source of love to me. Love is the most important
thing, and it should always be there, especially when things seem to "not be working". She is the one who
first motivated my scientific curiosity; it is because of her that I like science. I also want to show
my gratitude to my sister, with whom I have shared this world since I was a baby. By her side and
together with her I understood that each one of us is unique and that “success” has multiple meanings.
For this group of people, who are my core and my source, my gratitude comes from very deep levels in
my soul. They understand that getting a Doctorate in Philosophy goes beyond doing research. They are
fundamental underpinnings of this accomplishment.
I want to thank a very important person in my life: Francisco Valero-Cuevas, my PhD advisor, my
mentor and my "compatriota". He has helped me pursue my dream of creating engineering artifacts (i.e.,
robotic systems) that mimic how legged animals learn and perform locomotion. It was because of him and
the USC-CONACYT program that I was able to start my PhD studies. Not only did he open to me the
doors of his laboratory at the magnificent University of Southern California, he also welcomed me into his
own home. He has shown me that vision put into actions can enable us to get to places few people have
gotten to. He has challenged me in so many different ways. With him I learned that the most difficult
person to work with is oneself. Thank you Francisco for helping me realize that I need to work so much
inwardly so as to be “good” at moving in a very complicated and complex world.
I would like to thank my committee members: Professors Nicolas Schweighofer and James Finley. Since
I met Professor Schweighofer, he has always urged me to be more rigorous and formal with my methods
and with how I report them, this being fundamental to good science and engineering. He has
very positively impacted my last and biggest PhD-student project by pointing out the weak points and
opportunities for improvement in the last chapter of my thesis. Professor Finley, whom I met months before
I started my PhD program, has always offered me support and motivation. By hearing my rationale and
providing very detailed feedback, he has boosted my research outcomes. I am especially grateful to him for
always offering his engineering and scientific perspective on my projects; he has always encouraged
me to leverage my mechatronics background to make my projects better.
Going back to my foundations I could not go over this section without acknowledging México, my
country. THANK YOU!! Thank you for giving me such a significant part of my identity which is one of
the most powerful assets I have. Also, I want to express my deepest gratitude to the Universidad Nacional
Autónoma de México (UNAM), my alma mater, which has offered me many powerful tools to walk the
path to pursuing so many of my dreams. UNAM comes together with “autonomía” and freedom, and that
freedom together with what my family has given me, has pushed me to be more responsible: my actions
can take me very low or very high.
I would also like to thank the University of Southern California. It has been a great honor to be part of
such a great institution which I am sure will always have a special place in my heart. Coming to the US has
been one of the biggest steps I have taken, and it was through and greatly thanks to this university that I
was able to reach this milestone in my professional and personal journey. Here I want to thank William
Yang, my PhD program advisor, who has helped me build a communication bridge with the University.
He has offered me so much support and help, and very importantly he has shown deep care about the
complications that someone like me (a PhD student, and specifically an international one) sometimes
needs to go through. The USC environment (including teachers, advisors, clubs and KECK hospital) has
given me a lot of support in many areas further beyond research, THANK YOU.
Finally, I am grateful to have very dear friends in my life who have helped make my journey more
manageable. Special thanks go to Octavio and Suraj, my PhD big buddies with whom I have sorted out
so many things even before the program officially started. I also want to thank my B5 group, my swim
mates, the lab mates with whom I created a friendship that goes beyond the "academia world" and my
co-authors (some of whom I have developed nice friendships with). I want to express my gratitude to all
of you who have been with me as a friend and special thanks to the ones that will stay for the years to
come. Here I want to mention a very dear friend of mine, my grandfather: “El Rino”. He was able to erase
generational differences; he heard me, motivated me, loved me unconditionally and showed me so many
beautiful things in life.
Table of Contents
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 General focus
1.2 Innovation
Chapter 2: A physical model suggests that hip-localized balance sense in birds improves state estimation in perching: implications for bipedal robots
2.1 Chapter summary
2.2 Introduction
2.3 Methods
2.3.1 Physical model of a guinea fowl
2.3.2 Instrumentation
2.3.3 Trials
2.3.4 Data Acquisition
2.3.5 Estimation of neck stiffness
2.3.6 Estimation of sensory delay at the hip and head
2.3.7 Estimation of the time history of foot acceleration
2.3.8 Boot-Strap Analysis and Statistics
2.4 Results
2.5 Discussion
2.6 Acknowledgements
2.7 Author contributions
Chapter 3: Estimating Center of Pressure of a Bipedal Mechanism Using a Proprioceptive Artificial Skin around its Ankles
3.1 Chapter summary
3.2 Introduction
3.3 Methods
3.3.1 Biped, Artificial Skin, Ligaments and Force Plate Construction
3.3.2 Double Stance Experiments
3.3.3 Single Stance Experiments
3.3.4 Code
3.4 Results
3.4.1 Double Stance Case
3.4.2 Single Stance Case: Skin vs. Ligaments
3.4.3 Single Stance Case: Skin with 30 and 40 mm Spring Leaf Sensor versions
3.4.4 Single Stance Case: Skin and Ligament combination
3.4.5 Statistical Significance
3.4.6 Convergence to Stable Equilibrium Point
3.5 Discussion
3.6 Conclusions
3.7 Supplementary information
3.8 Acknowledgements
Chapter 4: Exploiting Brain-Body-Environment Interactions to Improve Locomotion Learning in a Biped Robot
4.1 Chapter Summary
4.2 Introduction
4.3 Methods
4.3.1 Robot characteristics
4.3.2 General G2P overview (naïve and natural babbling explained)
4.3.3 Desired foot trajectory characteristics and biped support
4.3.4 Hardware experiments steps
4.3.5 Data analysis
4.3.5.1 Sparseness
4.3.5.2 Detrended Fluctuation Analysis
4.4 Results
4.4.1 Exploiting limb mechanical properties reduces sparsity of training data and increases success rate of locomotion learning
4.4.2 Removing support increases walking success rate and correlates to faster walking
4.5 Discussion
Chapter 5: Conclusions
5.1 Applications and limitations of the presented work
5.2 Immediate follow up work and recommendations
5.3 All in all
Bibliography
Appendix A
Expanding on electromechanical and algorithmic systems of the bipedal robot of Chapter 4
A.1 Details of G2P the precursor of Natural G2P
A.2 Data and power circuits
A.3 The biped
List of Tables
2.1 Each trial consisted of 3,000 random center-out-and-back displacements (center-surface of a sphere).
3.1 Center of Pressure Prediction Accuracy, Single Stance
3.2 Center of Pressure Prediction Accuracy, Single Stance: "Blind Tests": Variable and unknown training load
List of Figures
2.1 Physical model of the skeleton of the guinea fowl made of articulated aluminum plates and
an elastic tube for a neck. The location of the sensors can be seen on the floor between the
model’s feet, on its pelvis between the hips, and on its head. The joints of the model are,
starting from the pelvis: the hip, knee, ankle and metatarsal joints. The transparent sphere
around the accelerometer between the feet indicates the scale of random displacements 20
mm in radius.
2.2 Photograph of the physical model of the skeleton of the guinea fowl. On the left, the
complete model is shown; the middle and right sections show details of the elastic linkages
that are required for the robot to maintain a standing posture.
2.3 Generating 3D movements with the 6-DOF AdeptSix 300 robotic arm enabled us to apply
repeatable and specific types of perturbations to our model.
2.4 Estimated bending stiffness of the two necks. Neck stiffnesses were calculated using data
from 1,000 different trials and the simple lumped-parameter model in Equation 2.1. Left:
stiffness calculated for the flexible corrugated metal tubing (i.e., low stiffness neck). Right:
stiffness calculated for the solid rubber tube (i.e., high stiffness neck). Flexible corrugated
metal tubing and solid rubber tube data were statistically different (p < 0.05); their
respective medians are 0.67 and 1.25 N/m.
2.5 Independently of the neck stiffness, foot-to-hip delays were shorter than foot-to-head
ones. The two data groups in the panel corresponding to the low stiffness neck (left panel)
were statistically different (to p < 0.05); this is not the case for high stiffness neck data
(right panel). Foot-to-head median delays were longer, measured at 0.095 s for the low
stiffness neck, and 0.055 s for the high stiffness one. Foot-to-hip median delays were 0.02
and 0.03 s respectively for the low stiffness and high stiffness necks.
2.6 Example of the acceleration at the feet in the sagittal plane estimated from the measured
accelerations at the hip and head. The acceleration at the hip yields more accurate
estimates of acceleration at the feet. (A) A 5 s time window. (B) A 300 ms time window.
2.7 Comparison of estimation accuracy of foot acceleration from the hip, head and their
fusion as a function of perturbation magnitudes. Hip-to-foot acceleration estimation was
more accurate than head-to-foot estimation (p < 0.05). Fusion of the hip and head
information did not improve estimation of the foot acceleration beyond that obtained with
hip information alone.
2.8 Estimation accuracy of foot acceleration as a function of neck type. Increasing neck
stiffness improved estimation of foot acceleration from acceleration measured at the
head. Low stiffness and high stiffness neck data were statistically different (p < 0.05).
The median %VAF was 15.11 and 17.95 for the low stiffness neck and high stiffness neck,
respectively.
3.1 A and B show experiments to estimate the Center of Pressure, while C presents
device details. A) Biped standing on the force plate with CoP sensing area marked by
a numbered grid. The force plate provides ground truth CoP locations (i.e., labels) while
signals from four leaf spring sensors on each of the biped’s ankles are recorded (i.e.,
features). B) Foot structure in an upside-down position to which Center of Pressure loads
were applied to different locations on its sole. Simultaneously, data was recorded from
the spring leaf sensors (i.e., features). C) Ankle joint consists of a universal joint. Two
strain gauges are encapsulated inside two metal layers that form the leaf spring sensor.
When the leaf spring geometry changes due to the skin elongation or contraction, the
strain gauge electric resistance changes, generating a signal which depends on the strain
the skin experiences derived from its elongation.
3.2 Non-homogeneously distributed sensors, subjected to different conditions can
enable the prediction of body states. A) Render of Foot-Ankle-Leg with Dacron
Cable Ligament Assembly; ligaments can also be extension springs, as shown in B). Two
mounting points are available, one being more proximal to the knee. When using the
distal mounting point, we noticed a loss of state observability due to the universal joint
and mount point alignment. B) Foot-Ankle-Leg with a combination of skin with 30 mm
Leaf Springs and an Extension Spring Ligament.
3.3 Raw signal samples: A) Skin with 40 mm Leaf Spring and B) Extension Spring Ligaments.
3.4 Skin strain afferent signals enable the prediction of the Center of Pressure.
Confusion Matrices that show percentages of how much the prediction of the CoP (i.e., x
axis) agrees with the ground truth CoP value (i.e., y axis). All cases done using skin with
30 mm Leaf Spring Sensors. A) Double stance experiment, four CoP locations prediction
accuracy: 87.44% = 97.6% × 89.6%. The force plate prediction accuracy was 97.6%, and 89.6% of the
CoP values assigned by the plate were correctly predicted using skin afferent signals. B) Double
stance experiment, nine CoP locations prediction accuracy: 78.53% = 97.2% × 80.8%
(same rationale as in A). Even though this prediction value is high, CoP values were
not assigned correctly by the force plate: the biped was manipulated to reach nine CoP
locations (as described in Section 3.3.2), but only some points (i.e., labels) were assigned by
the plate while performing the task. This is considered a poor test due to the inability
of the force plate to better assign nine CoP location labels. C) We show with the single
stance experiment that a synthetic skin is capable of providing signals to estimate 16 CoP
values. The experimental protocol is described in Sections 3.3.2 and 3.3.3 and in Fig. 3.1-B.
4.1 A) One-leg tendon route diagram. B) Render of the 3D model of the biped. C) Photograph of
the biped.
4.2 Desired joint and foot trajectories
4.3 Two minutes of babbling data and desired trajectories. A: Joint Space, B: Endpoint Space
4.4 Plots of obtained and desired foot trajectories shown together with close-ups of the biped’s feet
in full (A), partial (B) and minimal (C) support cases.
4.5 Box plots of the fractal scaling components (FSC) for naive (blue) and natural
(red) versions for eight different trials (four trials for each of the right and left legs). When the
FSC value is low, meaning that persistence is compromised (partially supported cases), the
biped can sometimes generate locomotion, but at slower speeds of displacement
than in the cases where persistence is higher or presents less dispersion (minimally
supported cases).
A.1 This diagram offers a general overview of the main hardware components of our system.
In A, we symbolize a PC where the learning is performed; this can also be considered the
high-level control part of our system. In B, we show the main components to interface
the learning with the mechanical components of the system; this can also be called
the low-level control part of our system. In C, we show a photograph of the robot, which
interacts with the environment to produce walking.
A.2 Representation of the ANN that we train to be used as a map from six limb kinematics to
3 motor activations. In this figure we show data used to train the NN (particularly naïve
babbling data): limb kinematics (values for input nodes) which result from motor
activations (desired values for output nodes). As a reminder, motor activations in
babbling are random (Specific details on naïve and natural babbling are given in 4.3.2). The
ANN has three fully connected layers: input, hidden and output layers with respectively 6,
15 and 3 nodes. It is trained with babbling data: babbling data sets are divided into training
and testing sets. Once the NN has been satisfactorily trained, it is used to produce motor
commands given a set of desired limb kinematics. The details of the NN are given in
Appendix A.1.
A.3 Representation of the ANN that we train to be used as a map from six limb kinematics to
3 motor activations. In this figure we show data used to train the NN (particularly natural
babbling data): limb kinematics (values for input nodes) which result from motor
activations (desired values for output nodes). As a reminder, motor activations in
babbling are random (Specific details on naïve and natural babbling are given in 4.3.2). The
ANN has three fully connected layers: input, hidden and output layers with respectively 6,
15 and 3 nodes. It is trained with babbling data: babbling data sets are divided into training
and testing sets. Once the NN has been satisfactorily trained, it is used to produce motor
commands given a set of desired limb kinematics. The details of the NN are given in
Appendix A.1.
A.4 ANN learning performance. Here we compare ANN performances with training babbling
data sets (red line) vs. ANN performances with testing babbling data sets (blue line). In
detail, for each subfigure: X axis is the number of epochs, Y axis is the mean square error
of PWM activation values (i.e., sum of the mean of the differences between the predicted
and ground truth values). Each row represents a different experiment trial. The groups of
subfigures framed in blue and green show ANN performance based on naïve
and natural babbling data, respectively.
A.5 Electric diagram showing the required components to actuate one DC motor and read
one rotary encoder. Our robot has 6 motors and 4 rotary encoders;
the connection approach for each is the same as shown in this figure. The L298 motor
driver interfaces the data and power circuits (respectively signaled with green and red
wires). With orange we signal 5V lines to power data circuits and with purple the
PC-Microcontroller serial communication lines.
A.6 Renders of the biped’s physical components. A) The whole biped, including all 3D printed
components (explained in more detail in the next points), hip metal plate (upper-most
flat metal structure), motor locations (per leg: two close to the hip and one close to the
knee), and aluminum tubes as upper and lower leg segments (respectively connecting hip
with knees and knees with feet). B) Hip bridge structure used to fix both legs to the
hip. C) Hip motor mounts, designed to hold metal plates to attach motors of different
characteristics/diameters. D) Hip leg mounts, which have housings for ball bearings to
reduce the friction of the proximal DoF of the legs. E) Upper segment rotator cuffs. F)
Lower segment leg mounts, which have housings for ball bearings to reduce the friction
of the distal DoF of the legs. G) Distal segment rotator cuffs. E and G include tendon
channels to maintain proper tendon routing. D and F were designed to have rounded
profiles to reduce tendon abrasion; furthermore, these parts include encoder mounts.
Abstract
My goal is to test how the anatomy of bipedal animals and robots influences proprioception, and how it
can be exploited to improve learning of locomotion. Specifically, my objectives are to study: (i) the effect
of mechanoreceptor number and location on proprioception accuracy, (ii) how the utility of mechanoreceptor
signals is affected by the location and properties of the structure on which they sit, and (iii) how
to exploit passive mechanical properties to improve learning of locomotion.
For my first objective, I quantify the benefits of bioinspired sensory fusion and distributed sensing on
state estimation for bipedal balance. We provide evidence that hip-localized balance sense, by its proximity
to a moving platform and in combination with head acceleration, does provide two functional advantages
compared to head-only balance sense: (1) improved state estimation, and (2) reduced sensory delays.
Moreover, increased neck stiffness can improve the utility of vestibular signals.
For my second objective, we show that strain data of bioinspired proprioceptive skins and/or ligament
arrangements can enable reliable center of pressure (CoP) estimates. This is an alternative to traditional
CoP sensing methods that rely on sensors at the foot sole, which are subject to wear and damage
due to direct interaction with the ground [62]. With a model-free machine learning approach, we reliably
estimated the CoP location of a biped by measuring the strain experienced by sensorized artificial skin
and/or ligaments wrapped and arranged around its ankles.
For my third objective, I study how to exploit the mechanical and dynamic properties of tendon-driven
robotic limbs (and their interaction with the environment) to improve the learning of inverse dynamical
models for locomotion. Here we demonstrate that two minutes of bioinspired motor babbling performed
in tendon-driven bipedal limbs can enable the learning of partially supported bipedal locomotion. This is
achieved only if the random actions are compatible with the physical properties and range of motion of
the limb (i.e., “natural” babbling), as opposed to purely random “naïve” babbling, which can conflict with the
leg’s natural dynamics or exacerbate antagonistic muscle actions. Furthermore, we see that when support
is removed from the system (i.e., when the constraints that the environment imposes on walking are increased), the biped
learns to walk after training with both “naïve” and “natural” babbling approaches. With this we explore
how the performance of learned actions depends not only on brain-body properties, but also on their
interaction with the environment. We showed these brain-body-environment interactions by combining
our published in-hardware learning algorithm (G2P) with a custom-built, tendon-driven bipedal robot.
Chapter 1
Introduction
1.1 General focus
Proprioception is a form of biological state estimation that uses motor output and mechanoreceptor signals
to extract physical variables (like location and movement of body parts) to enable effective control of
movement [97]. In this thesis I explain the experiments I have conducted to test how the anatomy of
bipedal animals and robots influences proprioception [97, 19], and how it can be exploited to improve
learning of locomotion. A central question for my studies is: How do bipedal organisms exploit their
anatomy to achieve locomotion? This is a very broad question, so during my PhD I focused on three
relevant components necessary to achieve bipedal locomotion:
1. Balance, which heavily relies on estimating environmental characteristics like the type of terrain and
its behavior, for example when standing on swaying platforms or branches (focus of Chapter 2).
2. Estimating body states useful to understand positions of the body with respect to the environment
(focus of Chapter 3).
3. Learning and performing tasks, like walking (focus of Chapter 4).
With (1) we study the impact that birds’ morphology has on their vestibular and lumbosacral balance
organ signaling [100]. With (2) we study the impact that skin and/or ligament strain could have on balance
[101]. With both (1) and (2) we investigated how the utility of mechanoreceptor signals is affected by the
location and properties of the tissues on which they sit. And with (3) we study how anatomical properties
influence learning of locomotion [102, 19]. As explained in this thesis, the completed and proposed studies
promote better learning of locomotion tasks and better robot dexterity and agility. The approach I will follow
throughout the rest of this section will be to: i) explain the significance that anatomical properties have
on proprioception, and ii) explain the significance that exploiting anatomical properties has on learning
of locomotion. For i) we will focus on cases where the system passively acquires data to describe its body,
the environment or the body-environment interaction without self-generated energy; for ii) we focus on a
case where the robot actively acquires data to describe its body, the environment or the body-environment
interaction with self-generated energy.
Overall, I designed experiments inspired by how vertebrates leverage their neural and mechanical properties
to obtain relevant afferent information [62, 88] from mechanoreceptors that are often distributed
and non-collocated [56, 42]. Bipedalism depends on many aspects, like balance (focus of Chapters 2 and 3),
rhythmic locomotion generation (focus of Chapter 4) and control [94]. Generally, keeping balance means
preventing falling or rotating about the foot point after perturbations [110]. Specifically, a balance-sensing
organ produces afferent signals to detect current body posture and motion to determine the movements
required to achieve or maintain a desired posture and motion. In the experiments reported in Chapters
2 and 3, balance is passively achieved thanks to viscoelastic muscle, ligament and skin properties. In Chapter
4, the hip of the robot is constrained to move along the x and z axes, and around the z axis, preventing
the biped from falling down and allowing us to focus solely on the task of learning a locomotion cycle.
In birds (focus of Chapter 2), direct neurophysiological evidence has clearly established that they must
possess a balance sense that is independent of the vestibular system [1] and located between the hips, in
what is called the lumbosacral organ (LSO) [78]. They retain the ability to reflexively compensate for body rotations even
after labyrinthectomy and spinal cord transection to eliminate descending inputs influenced by the visual
2
and vestibular senses [1]. This neurophysiological evidence, along with particular anatomical features of
avian lumbosacral region, suggests a balance sensing function of the LSO that complements proprioceptive
information from the vestibular system, as well as mechanoreceptors in muscles, joints and skin which is
our focus for chapter 3.
Regarding skin (focus of Chapter 3), it has haptic sensors that are often thought of as pressure sensors,
which help in estimating parameters like the Center of Pressure (CoP) on the soles of the feet [24,
81]. However, a less known but no less critical example of non-collocated sensors (both in animals and
in robots) is that of mechanoreceptors in the skin surrounding joints [85, 115, 20]. These cutaneous sensors
react to stretch rather than pressure and are a prime example where the information from multiple
distributed mechanoreceptors is processed to extract estimates of joint angles, which the nervous system
uses to control limb movements [53, 25, 26]. Furthermore, skin (and ligament) strain signals play
an important role in understanding ankle positions [66, 4, 73], motivating our data-driven studies on the
role that a skin can have in estimating the Center of Pressure (an important parameter used by many
roboticists for balance control [84, 69, 35, 92, 31]). Data-driven approaches have a significant drawback:
more data comes with the burden of more processing time, which cannot be afforded in many deployment
applications. One of the main motivations for Chapter 4 is to develop a data collection strategy that, when
paired with the plant dynamics, makes data collection less time consuming.
In Chapter 4, our bipedal robot performs locomotion learning informed by the natural dynamics of its
legs; this reduces the amount of training data required to achieve a useful cyclical movement. We developed
a “Natural Babbling” strategy to preferentially excite the natural dynamics of the legs. This babbling emphasizes
exploring areas of the robotic limbs’ configuration space where locomotion solutions lie [86, 75, 48], thus
improving learning of locomotion. Rather than only contracting their muscles to locomote, vertebrates
use intrinsic properties of their own limbs (e.g., natural dynamics, inertia, natural frequency, resonance)
to reduce the energy expenditure of locomotion [105, 18], potentially reducing the energy and time they
require to learn a locomotion task.
The studies in Chapter 4 aim to contribute to the development of data-driven, model-free locomotor learning
approaches that do not rely on prior knowledge and can learn with minimal experience. In particular, these
studies extend [68, 42] in the domain of learning the capabilities of tendon-driven anatomies. In these
studies, learning happens in a model-agnostic way, within a few attempts in the physical world, usually in
a five- to ten-minute time window. Many current algorithms for controlling robots require models of
their physical structure, environment and task [110, 5, 80]. Robotic learning algorithms usually take hours
or days' worth of iterations to produce a useful locomotor task [74, 120]. The same holds for systems
that depend on experts to manually tune their parameters and/or provide demonstrations of the task [67,
39].
Adopting lessons from the millions of years of biological evolution poses intriguing and exciting pos-
sibilities for the engineering evolution of robust and versatile bipedal robots. There are well-known forms
of “embodied intelligence” where the structure of the body co-evolves with the nervous system (or con-
troller) to simplify and improve open- or closed-loop control [104, 15]. At the other extreme we have the
classical robotics approach to fully centralized control that depends on algorithms that process sensory
information and issue motor commands. Such approaches can run the risk of ignoring or overriding the
natural dynamics of the plant.
In this thesis we explore novel ways to generate and process data to enhance agility and versatility and
to enable more efficient autonomous learning. We emphasize the importance of sensor distribution and location,
as well as the importance of pairing actuation, natural dynamics and sensation. We want to motivate
the creation of robots that perform locomotion learning guided by the natural dynamics of the plant. We
present a way to promote more informative (and sparser) sensory feedback to enhance learning
and refinement of locomotion.
1.2 Innovation
To our knowledge, the studies in Chapter 2 are the first to quantitatively analyze the practical benefits of
hip-localized sensing of acceleration for balance control [100]. We present concrete evidence that a hip-localized
balance sense organ (like the LSO) is an effective source of faster and better sensing of posture-relevant
information [100].
The results of Chapter 2 and publication [100] show that the time delays for head-localized balance sense
organs can be shortened by co-contracting neck muscles (i.e., a stiffer neck). From the state estimation
point of view, however, we find that hip-localized balance sense organs are superior and do not benefit
from sensory fusion with head-localized acceleration, regardless of neck stiffness.
For chapter 3, similarly to [84, 99], we use machine learning to avoid the need for precise modeling
and characterization of sensors and body dynamics. Relying on detailed models of the robot’s dynamics
to calculate the CoP makes the system prone to failure when there are mismatches between the model
and the real system dynamics due to wear and tear, damage, degradation, lack of information about the
system, model simplicity or poor parameter identification [34].
We are, to the best of our knowledge, the first group to estimate the Center of Pressure using strains
on an artificial skin [101] (Chapter 3). Overall, the mechanisms were manufactured at low cost, because
we want to show that the introduced techniques can be applied to a variety of equivalent
ankles and artificial skins incorporating different technologies. Sensorized skins also have the advantages
of (i) passively stabilizing the ankle while (ii) not restricting the DoFs of the mechanism, allowing smooth
movements and applicability to soft or hybrid (soft-rigid) robots [99].
I am particularly interested in data-efficient locomotion learning strategies driven by the natural dynamics
and mechanical properties of bio-inspired tendon-driven robots. A learning strategy like this, which
we implement in Chapter 4, emphasizes exploring areas of the robotic limbs’ configuration space where
locomotion solutions lie, thus accelerating learning of locomotion. Furthermore, we see that learning
performed on a body is only one part of successfully completing a task; in Chapter 4 we also explore how
brain-body-environment interaction is fundamental in shaping the final performance of a backdrivable
system. In this study we highlight how the success of the resultant action of a compliant and backdrivable
biped robot does not necessarily depend on reducing its error (i.e., in terms of movement patterns of
the feet: the mean squared distance between obtained and desired trajectories). Our study relates to the
adaptive behavior observed in animals, where the success of learned actions relies on the emergence of a useful
brain-body-environment interaction [16].
Chapter 2
A physical model suggests that hip-localized balance sense in birds
improves state estimation in perching: implications for bipedal robots
Darío Urbina-Meléndez, Kian Jalaleddini, Monica A Daley, and Francisco J Valero-Cuevas. “A physical
model suggests that hip-localized balance sense in birds improves state estimation in perching: implica-
tions for bipedal robots”. In: Frontiers in Robotics and AI 5 (2018), p. 38.
2.1 Chapter summary
In addition to a vestibular system, birds uniquely have a balance-sensing organ within the pelvis, called
the lumbosacral organ (LSO). The LSO is well developed in terrestrial birds, possibly to facilitate balance
control in perching and terrestrial locomotion. No previous studies have quantified the functional benefits
of the LSO for balance. We suggest two main benefits of hip-localized balance sense: reduced sensorimo-
tor delay and improved estimation of foot-ground acceleration. We used system identification to test the
hypothesis that hip-localized balance sense improves estimates of foot acceleration compared to a head-
localized sense, due to closer proximity to the feet. We built a physical model of a standing guinea fowl
perched on a platform, and used 3D accelerometers at the hip and head to replicate balance sense by the LSO
and vestibular systems. The horizontal platform was attached to the end effector of a 6 DOF robotic arm,
allowing us to apply perturbations to the platform analogous to motions of a compliant branch. We also
compared state estimation between models with low and high neck stiffness. Cross-correlations revealed
that foot-to-hip sensing delays were shorter than foot-to-head, as expected. We used multi-variable out-
put error state-space (MOESP) system identification to estimate foot-ground acceleration as a function of
hip- and head-localized sensing, individually and combined. Hip-localized sensors alone provided the best
state estimates, which were not improved when fused with head-localized sensors. However, estimates
from head-localized sensors improved with higher neck stiffness. Our findings support the hypothesis
that hip-localized balance sense improves the speed and accuracy of foot state estimation compared to
head-localized sense. The findings also suggest a role of neck muscles for active sensing for balance con-
trol: increased neck stiffness through muscle co-contraction can improve the utility of vestibular signals.
Our engineering approach provides, to our knowledge, the first quantitative evidence for functional ben-
efits of the LSO balance sense in birds. The findings support notions of control modularity in birds, with
preferential vestibular sense for head stability and gaze, and LSO sense for body balance control.
The findings also suggest advantages for distributed and active sensing for agile locomotion in compliant
bipedal robots.
The work in this chapter is published in [100].
2.2 Introduction
All terrestrial vertebrates have linear and angular acceleration sense localized to the vestibular system of
the inner ear. It is well known that birds use a variety of reflexes mediated by internal signals to stabilize
their head during walking and flying [71]. Uniquely among living animals, birds appear to have two spe-
cialized balance-sensing organs: the vestibular system of the inner ear and an additional balance sensor
located between the hips called the lumbosacral organ (LSO) [78] which has been proposed to be especially
useful for terrestrial locomotion [78, 76]. Birds have long flexible necks, with head motions tightly coupled
to gaze control [82, 77, 72]. Consequently, the vestibular system is neither close to nor tightly coupled to
the torso. In contrast, the LSO is located in the sacrum between the hips, near the center of mass (CoM). Having a
balance organ at the torso is likely to be beneficial to legged locomotion and balance because the hip joint plays an
important role in controlling the position of the CoM of the whole body with respect to the foot [1]. Here
we consider and contrast the functional implications of hip-localized (LSO) and head-localized (vestibular)
balance sense.
Generally speaking, keeping balance is a task that many legged animals perform to prevent falling or
rotating about the foot point after perturbations [109]. Specifically, a balance-sensing organ produces afferent
signals to detect current body posture and motion to determine the movements required to achieve
or maintain a desired posture and motion. In birds, direct neurophysiological evidence has clearly established
that they must possess a balance sense that is independent of the vestibular system [7]. They
retain the ability to reflexively compensate for body rotations even after labyrinthectomy and spinal cord
transection to eliminate descending inputs influenced by the visual and vestibular senses [7]. This
neurophysiological evidence, along with particular anatomical features of the avian lumbosacral region (below),
suggests a balance-sensing function of the LSO that complements proprioceptive information from the
vestibular system, as well as from mechanoreceptors in the skin, joints and muscles.
Anatomically, the LSO is located within an enlargement of the lumbosacral region of the spinal column,
between the 27th to 38th segments [98]. The LSO presents a suite of features unique to the spinal column
of birds, including bilateral protrusions of neural tissue identified as mechanosensors (accessory lobe (AL)
neurons), located adjacent to ligaments supporting the spinal cord [90, 114, 76, 78]. The spinal cord
is dorsally bifurcated in this region and supports a “glycogen body” (GB) centered on top. The entire
region is enclosed by bony canals with a distinct concentric ring structure [78]. The arrangement of the
canals, AL, ligaments, and GB is reminiscent of the vestibular system [78] and invites functional analogy
to an accelerometer. Each AL contains mechanoreceptors [78, 90, 114], with commissural axons projecting
to last-order premotor interneurons in the spinal pattern generating network [29, 78]. The AL neurons
within the LSO exhibit spontaneous firing and phase-coupled firing in response to vibrational stimulation
between 75 and 100 Hz, and ablation of these neurons disrupts standing balance [78]. Thus, multiple lines of
anatomical and neurophysiological evidence suggest balance-sensing function of the LSO.
Despite evidence of LSO hip-localized balance sense in birds, no previous studies have provided quantitative
evidence for the functional benefits of the LSO as an adaptation for sensing posture-relevant
information. We hypothesize that hip-localized balance sense provides two main functional advantages
compared to head-only balance sense: 1) reduced sensorimotor delay and 2) more accurate state
estimation of foot-ground acceleration due to closer proximity to the feet. Here we use a physical model
of a perching guinea fowl subject to foot-ground perturbations to test the hypothesis that hip-localized
balance sense enables more rapid sensing and accurate state estimation compared to only a head-localized
balance sense.
Most birds “perch” (balance with the feet attached to the substrate) when they alight on elevated ob-
jects such as branches; therefore we focus on perching as a conveniently simple and ecologically relevant
balancing behavior. We built a simple physical model of a standing guinea fowl perched on a horizontal
platform (i.e., feet attached to the platform). The horizontal platform was attached to the end effector of a
6 DOF robotic arm, allowing us to apply perturbations analogous to motions of a compliant branch. The
physical model provides a first approximation of the muscle-tendon viscoelastic properties that provide
leg compliance. We approximated LSO and vestibular balance sensors using 3-D accelerometers located at
hips and at the head, respectively. We used system identification to estimate foot-ground acceleration as
a function of hip- and head-localized sensing, individually and combined.
2.3 Methods
2.3.1 Physical model of a guinea fowl
A skeletal model of a guinea fowl was built from interconnected, hinged aluminum bars (Figures 2.1 and
2.2). Friction was reduced by using bearings at the hip, knee, ankle, and foot. The general body size, limb
segment lengths and configuration were based on guinea fowl anatomy from the literature [21, 37, 40],
with a hip height of 20 cm.
This physical model focused on approximating the guinea fowl’s (i) LSO (hip) and vestibular (head)
balance sensing system locations, (ii) body center of mass location and limb configuration in a standing
posture, and (iii) viscoelastic mechanical properties of the muscle-tendon-driven limbs. This model was meant
as a first approximation of the key physical features, to allow a quantitative comparison of the information
available at hip- and head-localized balance sensors. It was not meant to be an exhaustive exploration of
the effects of posture, material properties, and muscle-tendon actions. Such considerations could be the
subject of future work.
The toes of the model were firmly attached to a platform. Thus, the guinea fowl model “perched”
while maintaining an upright standing posture. This posture was maintained by the passive tensions
in rubber bands that cross the hip, knee, ankle and metatarsal joints without further assistance or active
support (Figure 2.2). We pre-tensioned rubber bands across joints to represent the tendon-driven functional
anatomy of a guinea fowl. These rubber bands also have viscoelastic properties that approximate the
passive mechanical properties of “muscles” held at a constant activation level when holding the standing
posture. The origins and insertions of the rubber bands were adjusted to have large enough moment arms
at each joint to overcome gravity and maintain posture even when perturbed by the moving platform. Our
focus was not to explore effects of varying muscle activation patterns for standing postures, but instead
Figure 2.1: Physical model of the skeleton of the guinea fowl made of articulated aluminum plates and
an elastic tube for a neck. The location of the sensors can be seen on the floor between the model’s feet,
on its pelvis between the hips, and on its head. The joints of the model are, starting from the pelvis: the
hip, knee, ankle and metatarsal joints. The transparent sphere around the accelerometer between the feet
indicates the scale of the random displacements (20 mm in radius).
to find a set of tensions in the rubber bands sufficient to maintain standing posture and propagate the
perturbations from the platform through the skeletal anatomy.
We used two interchangeable necks, each with different stiffness to test the effects of muscle coac-
tivation on balance sensing at the head. Each neck was 25 cm long and curved as shown in Figure 2.1
and 2.2. The first neck was made of 12.7 mm diameter Ultra-Flex Corrugated Steel Sleeving (McMaster-
Carr, part 54885K21). The second was 19.05 mm diameter Abrasion-Resistant Polyurethane Rubber Rod
(McMaster-Carr, part 8695K155).
2.3.2 Instrumentation
The end-effector of a 6 degrees of freedom (DOF) AdeptSix 300 robotic arm (Omron Adept Technologies,
Inc, San Ramón, CA) held the platform where the model perched (Figure 2.3). We used 3-D accelerometers
Figure 2.2: Photograph of the physical model of the skeleton of the guinea fowl. The complete model is
shown on the left; the middle and right sections show details of the elastic linkages that are required for the
robot to maintain a standing posture.
at the following locations on the model: (i) head to represent the vestibular system; (ii) hip to represent
the LSO sensor, and (iii) between the feet to record the reference perturbations or "foot acceleration" (Fig-
ure 2.1). All accelerometers were MEMS inertial sensors Model LIS344ALH (ST Microelectronics, Geneva,
Switzerland).
2.3.3 Trials
Each trial replicated a scenario that a guinea fowl might face while perching on a tree branch which is
subject to perturbations from weather and other animals. Our goal was not to replicate natural perturba-
tion exactly, but to provide a general test of our hypothesis that the LSO has benefits over the vestibular
system for rapid and accurate state estimation for balance.
Each trial consisted of a series of 3,000 random, uncorrelated displacements generated by the robotic
arm. Each displacement was a center-out-and-back movement in a random direction to the surface of
a sphere of 2, 5, 10, or 20 mm radius. Trials were block-randomized across sphere sizes. We recorded
a total of eight trials (4 sphere sizes x 2 neck stiffnesses) (Table 2.1).
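The displacement protocol described above can be sketched in a few lines. This is only an illustration, not the code used to drive the robotic arm: it assumes that "random direction" means directions drawn uniformly over the sphere, and the function names (random_sphere_targets, center_out_and_back) are hypothetical.

```python
import numpy as np

def random_sphere_targets(n, radius_mm, rng):
    """Return n points uniformly distributed on the surface of a sphere.

    Assumption: directions are uniform over the sphere (normalized Gaussian vectors).
    """
    v = rng.normal(size=(n, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return radius_mm * v

def center_out_and_back(targets):
    """Interleave each target with a return to the center: out, back, out, back, ..."""
    path = np.zeros((2 * len(targets), 3))
    path[0::2] = targets  # even rows: surface targets; odd rows stay at the center
    return path

rng = np.random.default_rng(0)
radii_mm = [2, 5, 10, 20]  # sphere sizes from the text
trials = {r: center_out_and_back(random_sphere_targets(3000, r, rng)) for r in radii_mm}
```

Each entry of trials holds 3,000 out-and-back displacements for one sphere size; repeating the set for each neck would give the eight trials of Table 2.1.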
TRIALS          Low stiffness neck    High stiffness neck
2 mm sphere     Trial_LS_2            Trial_HS_2
5 mm sphere     Trial_LS_5            Trial_HS_5
10 mm sphere    Trial_LS_10           Trial_HS_10
20 mm sphere    Trial_LS_20           Trial_HS_20
Table 2.1: Each trial consisted of 3,000 random center-out-and-back displacements (center-surface of a
sphere).
Figure 2.3: Generating 3D movements with the 6-DOF AdeptSix 300 robotic arm enabled us to apply
repeatable and specific types of perturbations to our model.
2.3.4 Data Acquisition
We used a high-performance National Instruments (NI) PXI-8108 computer, upgraded with 4 GB DDR2
RAM and a 500 GB SSD. An NI PXI-6254 ADC card recorded the acceleration signals. The data acquisition
hardware was housed in the NI PXI-1042 chassis. We acquired data at a sampling rate of 1 kHz.
2.3.5 Estimation of neck stiffness
To estimate the effective neck stiffness, we performed a boot-strap analysis of 1,000 trials by randomly
selecting 30 s segments from each trial. We then found the resonant frequency (the frequency with maximal
power) of accelerations at the head. The effective neck stiffness was estimated from:
$$K_i = m_i f_i^2 \tag{2.1}$$
where $i$ is the neck number, $m_i$ the mass and $f_i$ the resonant frequency.
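As a minimal sketch of this procedure, the resonant frequency can be read off the power spectrum of a head-acceleration segment and inserted into Equation 2.1. The function names, the synthetic 4 Hz signal and the 0.05 kg mass are illustrative assumptions only; the thesis does not state how the spectrum was computed.

```python
import numpy as np

def resonant_frequency(signal, fs):
    """Frequency (Hz) with maximal power, ignoring the DC component."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[1:][np.argmax(power[1:])]

def effective_stiffness(mass, f_res):
    """Equation 2.1 as written in the text: K = m * f^2 (no 2*pi factor)."""
    return mass * f_res ** 2

# Synthetic 30 s segment sampled at 1 kHz with a hypothetical 4 Hz resonance.
fs = 1000
t = np.arange(30 * fs) / fs
rng = np.random.default_rng(0)
head_acc = np.sin(2 * np.pi * 4.0 * t) + 0.1 * rng.normal(size=t.size)

f_res = resonant_frequency(head_acc, fs)
k_est = effective_stiffness(0.05, f_res)  # 0.05 kg is a made-up neck mass
```

In a boot-strap setting this calculation would be repeated on each randomly selected segment, giving a distribution of stiffness estimates rather than a single value.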
2.3.6 Estimation of sensory delay at the hip and head
We calculated cross-correlation of foot acceleration against that recorded from hip or head to estimate the
propagation delays of the applied mechanical perturbations. The delay was taken as the lag where the
cross-correlation was maximal.
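The delay estimate can be illustrated as follows. This is a sketch under assumptions: the search range of ±0.2 s and the function name xcorr_delay are not from the text, and the signals are synthetic.

```python
import numpy as np

def xcorr_delay(reference, sensed, fs, max_lag_s=0.2):
    """Lag (s) at which the cross-correlation of `sensed` against `reference` peaks.

    A positive value means `sensed` trails `reference`.
    """
    ref = np.asarray(reference, dtype=float)
    sen = np.asarray(sensed, dtype=float)
    ref = ref - ref.mean()
    sen = sen - sen.mean()
    n = len(ref)
    max_lag = int(round(max_lag_s * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            cc[i] = np.dot(ref[:n - lag], sen[lag:])   # sum of ref[j] * sen[j + lag]
        else:
            cc[i] = np.dot(ref[-lag:], sen[:n + lag])
    return lags[np.argmax(cc)] / fs

# Synthetic check: a noise "foot" signal and a copy delayed by 20 samples (20 ms at 1 kHz).
fs = 1000
rng = np.random.default_rng(0)
foot = rng.normal(size=5000)
hip = np.roll(foot, 20)
delay = xcorr_delay(foot, hip, fs)
```

With real recordings, the same function would be applied once with the hip signal and once with the head signal as the sensed input.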
2.3.7 Estimation of the time history of foot acceleration
We used a data-driven modeling approach to estimate the time history of the foot acceleration given the
time history of signals recorded at the sensory sites (hip and head). To this end, we trained state-space
models (in the least-squares sense) to predict foot acceleration from the hip or head accelerations. We
used MOESP state-space identification [108, 107] implemented in the State-space Model Identification (SMI)
MATLAB toolbox [45]. The state-space model is represented as follows:
$$
\begin{aligned}
x(k+1) &= A\,x(k) + B\,\mathrm{acc}_{\mathrm{sensor}}(k) \\
\mathrm{acc}_{\mathrm{foot}}(k) &= C\,x(k) + D\,\mathrm{acc}_{\mathrm{sensor}}(k)
\end{aligned}
\tag{2.2}
$$
where $\mathrm{acc}_{\mathrm{sensor}}(k)$ is the input signal (the acceleration signal recorded from the hip or head) and
$\mathrm{acc}_{\mathrm{foot}}(k)$ is the measured foot acceleration. $x(k)$ is the state variable, and $A$, $B$, $C$, $D$ are the
unknown state-space matrices. We set the model order to three after inspecting the singular values of the
extended observability matrix, as described in previous work [44]. A model order of three resulted in 21
parameters, significantly fewer than the 4000 training data points available for each training run.
Since the number of free parameters was much less than 10% of the number of training data points, the model
is not over-parameterized and is unlikely to fit noise or stochastic behavior.
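To make the structure of Equation 2.2 concrete, the sketch below runs a third-order state-space model forward on an input sequence. It does not perform MOESP identification (that step was done with the SMI toolbox); the matrices are arbitrary placeholders, not identified values, and simulate_state_space is a hypothetical helper name.

```python
import numpy as np

def simulate_state_space(A, B, C, D, u, x0=None):
    """Run x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k) over an input sequence.

    A, B, C, D must be 2-D arrays; u has shape (samples,) or (samples, inputs).
    """
    A, B, C, D = (np.asarray(M, dtype=float) for M in (A, B, C, D))
    u = np.asarray(u, dtype=float)
    if u.ndim == 1:
        u = u[:, None]
    x = np.zeros(A.shape[0]) if x0 is None else np.asarray(x0, dtype=float)
    y = np.empty((len(u), C.shape[0]))
    for k, uk in enumerate(u):
        y[k] = C @ x + D @ uk   # output equation
        x = A @ x + B @ uk      # state update
    return y

# Placeholder third-order model with one input and one output.
A = 0.5 * np.eye(3)
B = np.ones((3, 1))
C = np.ones((1, 3))
D = np.zeros((1, 1))
y = simulate_state_space(A, B, C, D, u=[1.0, 0.0, 0.0])
```

Once matrices have been identified from training data, the same loop yields the predicted foot acceleration from a sensor acceleration sequence.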
We assessed the performance of the model in predicting the foot acceleration ($\widehat{\mathrm{acc}}_{\mathrm{foot}}$). By running
the identified models in prediction mode, we compared the predictions to the actual measured signals,
$\mathrm{acc}_{\mathrm{foot}}$. We quantified the difference using the identification Variance Accounted For (VAF), expressed as:
$$
\%\mathrm{VAF} = 100\left(1 - \frac{\mathrm{var}\big(\widehat{\mathrm{acc}}_{\mathrm{foot}}(k) - \mathrm{acc}_{\mathrm{foot}}(k)\big)}{\mathrm{var}\big(\mathrm{acc}_{\mathrm{foot}}(k)\big)}\right)
\tag{2.3}
$$
where 100% indicates a perfect prediction of all the variability in the measured signals, and 0% means no
meaningful prediction.
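Equation 2.3 translates directly into code. The following sketch uses made-up numbers to show the two reference cases named in the text; vaf_percent is a hypothetical function name.

```python
import numpy as np

def vaf_percent(measured, predicted):
    """Variance Accounted For (Equation 2.3), in percent."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * (1.0 - np.var(predicted - measured) / np.var(measured))

measured = np.array([1.0, 2.0, 3.0, 4.0])        # made-up measured signal
vaf_perfect = vaf_percent(measured, measured)    # prediction equals measurement
vaf_constant = vaf_percent(measured, np.zeros(4))  # prediction carries no information
```

A perfect prediction gives 100%, and a constant prediction gives 0%. Note that the expression can become negative when the residual variance exceeds the variance of the measured signal.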
2.3.8 Boot-Strap Analysis and Statistics
To estimate the robustness of the analyses (cross-correlation, system identification, etc.), we performed a
100-trial boot-strap study (random sampling with replacement) [27]. For each trial, we randomly chose
40 s windows from the measured data, performed the cross-correlation and system identification analyses,
and then calculated summary statistics across the 100 measures. We used Student’s t-test to test for
statistically significant differences between conditions. Values for central tendency and variance are reported
as medians (interquartile range) unless stated otherwise.
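The window sampling and summary statistics can be sketched as below. The recording length of 120 s is an assumption for illustration, as are the function names; the analysis applied to each window (cross-correlation or system identification) is left out.

```python
import numpy as np

def bootstrap_windows(n_samples, fs, window_s, n_boot, rng):
    """Start/stop indices of n_boot randomly placed windows (drawn with replacement)."""
    width = int(window_s * fs)
    starts = rng.integers(0, n_samples - width + 1, size=n_boot)  # upper bound exclusive
    return [(int(s), int(s) + width) for s in starts]

def median_iqr(values):
    """Median and interquartile range (25th and 75th percentiles)."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return med, (q1, q3)

rng = np.random.default_rng(0)
# Hypothetical 120 s recording at 1 kHz; 100 windows of 40 s as in the text.
windows = bootstrap_windows(n_samples=120_000, fs=1000, window_s=40, n_boot=100, rng=rng)
med, (q1, q3) = median_iqr([1, 2, 3, 4, 5])
```

Each (start, stop) pair would index one segment of the recorded accelerations, and median_iqr would then summarize the 100 resulting delay or %VAF values.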
2.4 Results
We first present the differences in neck stiffness, then the effect of sensor location and neck stiffness on (i)
sensing delay, and (ii) estimation of foot acceleration.
The necks made from two different materials have different bending stiffnesses whose estimates are
shown in figure 2.4. Since we measured the dynamical response of the entire physical model (see Discus-
sion), each direction and magnitude of perturbation induced a different dynamical response that resulted
in a different acceleration measured at the head. This led to different resonant frequencies to be multiplied
by the mass of the neck (Equation 2.1). Note that we would obtain different estimates of neck stiffnesses if
the square of the resonant frequency at the head were multiplied by the mass of the whole model. Doing
this would have given us an approximation of the stiffness of the whole body, which besides the neck, has
a fixed stiffness. Also, if the complete body mass were considered, mass differences between trials with
different necks would have been smaller, resulting in a constant bias that would not change the statistical
differences between the estimates of neck stiffness. The median neck stiffnesses were 0.67 (0.26 to 1.05)
N/m and 1.25 (0.56 to 1.55) N/m for the low and high stiffness necks, respectively. Student’s t-test shows that the
average neck stiffnesses are significantly different (p<0.05).
As expected, an accelerometer at the hip generally detected foot acceleration with shorter delays than
the accelerometer at the head. Foot-to-hip median delays were 0.02 (0 to 0.03) s and 0.03 (0.005 to 0.065) s
for the low and high stiffness necks, respectively. Foot-to-head median delays were longer, measured at
0.095 (0.06 to 0.135) s for the low stiffness neck, and 0.055 (0.02 to 0.07) s for the high stiffness one (Figure
[Figure 2.4 plot. Title: Estimated necks' bending stiffness. Vertical axis: Stiffness (N/m), 0 to 4.5. Groups: flexible corrugated metal tubing (low stiffness neck); solid rubber tube (high stiffness neck).]
Figure 2.4: Estimated bending stiffness of the two necks. Neck stiffnesses were calculated using data from
1,000 different trials and the simple lumped-parameter model in Equation 2.1. Left: stiffness calculated for
the flexible corrugated metal tubing (i.e., low stiffness neck). Right: stiffness calculated for the solid rubber
tube (i.e., high stiffness neck). Flexible corrugated metal tubing and solid rubber tube data were statistically
different (p<0.05); their respective medians are 0.67 and 1.25 N/m.
2.5). The variability was quite large as the shown information collapses data across different acceleration
axes and different sphere experiments (perturbation magnitudes).
Foot-to-hip delays were significantly shorter than foot-to-head delays ( p<0.05) for the low stiffness
neck, but not significantly different for the high stiffness neck (Figure 2.5). A stiffer neck reduced the
delays for information sensed at the head. This resulted in hip and head delays that were very similar, with
no statistically significant difference.
Estimates of acceleration at the feet are more accurate when using signals from the hip-mounted ac-
celerometers than from the head-mounted accelerometers. Figure 2.6 shows an example where acceleration
at the feet is estimated from the hip- and head- mounted accelerometer, overlaid with the ground-truth
signal measured at the feet.
Hip-localized estimates of the foot acceleration accounted for 30.81-48.96% of variance (%VAF as defined
in Equation 2.3), against 15.59-22.19% for head-localized estimates (Figure 2.7). This figure summarizes
[Figure 2.5 plots. Two panels: low stiffness neck (left) and high stiffness neck (right). Vertical axis: Delay (s), -0.05 to 0.2. Groups in each panel: foot-to-hip, foot-to-head.]
Figure 2.5: Independently of the neck stiffness, foot-to-hip delays were shorter than foot-to-head ones.
The two data groups in the panel corresponding to the low stiffness neck (left panel) were statistically
different (p<0.05); this is not the case for high stiffness neck data (right panel). Foot-to-head median
delays were longer, measured at 0.095 s for the low stiffness neck, and 0.055 s for the high stiffness one.
Foot-to-hip median delays were 0.02 and 0.03 s for the low stiffness and high stiffness necks, respectively.
the estimation results by pooling together data from both neck stiffnesses. Prediction of foot acceleration
as a function of neck type is shown in Figure 2.8. In particular, Figure 2.7 shows data separated as a function
of perturbation magnitude. It demonstrates that, independently of the perturbation magnitude, the estimate
of foot acceleration from the hip was always more accurate than that from the head sensor. Moreover,
sensory fusion (combining information from both sensors) did not significantly improve the foot acceleration
estimation. Therefore, sensory fusion did not provide additional benefits beyond hip-only sensing.
We found that when only head-localized accelerometers were available, the high stiffness neck
improved estimates of foot acceleration compared to the low stiffness neck (Figure 2.8). With the low stiffness
neck, the median VAF was 15.11 (11.38 to 21.74)%, while it was 17.95 (10.18 to 29.19)% for the high stiffness
one. These data groups were statistically different (p<0.05).
[Figure 2.6 plots. Title: Typical estimation of floor acceleration. Panels A and B. Horizontal axis: Time (s); vertical axis: Acceleration (g). Traces: measured, hip estimated, head estimated.]
Figure 2.6: Example of the acceleration at the feet in the sagittal plane estimated from the measured
accelerations at the hip and head. The acceleration at the hip yields more accurate estimates of acceleration at
the feet. (A) A 5 s time window. (B) A 300 ms time window.
[Figure 2.7 plot. Title: Estimation of foot acceleration. Vertical axis: %VAF, 0 to 70. Horizontal axis: perturbation sphere radius (2, 5, 10, 20). Groups: foot acceleration estimated from different sensory channels (head, hip, head and hip sensory fusion).]
Figure 2.7: Comparison of estimation accuracy of foot acceleration from the hip, head and their fusion as
a function of perturbation magnitudes. Hip-to-foot acceleration estimation was more accurate than
head-to-foot estimation (p < 0.05). Fusion of the hip and head information did not improve estimation of the foot
acceleration beyond that obtained with hip information alone.
[Figure 2.8 plot. Title: Estimation of foot acceleration with signals from the head accelerometer. Vertical axis: %VAF, 0 to 60. Groups: low stiffness neck, high stiffness neck.]
Figure 2.8: Estimation accuracy of foot acceleration as a function of neck type. Increasing neck stiffness
improved estimation of foot acceleration from acceleration measured at the head. Low stiffness and high
stiffness neck data were statistically different (p < 0.05). The median %VAF was 15.11 and 17.95 for the
low stiffness neck and high stiffness neck, respectively.
2.5 Discussion
To validate the anatomical and neurophysiological evidence of LSO balance sensing function in birds, we
present a quantitative investigation of the functional benefits of hip-localized balance sense. Here we
investigated the perturbation sensing dynamics of a physical model of a guinea fowl perched in a standing
posture. We explored two proposed functional advantages of hip-localized compared to head-localized
balance sense: minimization of sensorimotor delay and improved estimation of foot-ground acceleration,
due to closer proximity of the sensor to the feet. To our knowledge, this is the first study to quantitatively
analyze the practical benefits of hip-localized sensing of acceleration for balance control. We find that
a hip-localized acceleration sensor—analogous to the LSO—provides shorter delays and improved state
estimation of feet acceleration during substrate perturbations.
In particular, our experimental paradigm applied displacements at the feet, where we also measured
the ‘ground truth’ acceleration of the moving substrate on which the bird is perched. We then compared
the ability to sense and reconstruct that ground truth acceleration on the basis of accelerations measured
at the hip and head. We find that the location of these simulated balance sense organs has important
consequences for how a bird (a model of a guinea fowl, in this case) could use acceleration information
from hip-localized balance sense for bipedal perching, standing and locomotion. A second level of analysis
focused on the material properties of the neck of the physical model. One was (less stiff) corrugated tubing,
and the other (more stiff) solid rubber tubing. These material differences were designed to explore the effect
of muscle co-contraction at the neck as a means of active sensing, or at least modulation of the utility of
head-localized balance sense.
Before discussing the results in detail, it is important to clarify some features of our experimental
approach to balance sense. A salient feature of our experimental results is their variability,
as in Figure 2.4. Shouldn’t the bending stiffness of each neck be thought of as a single number? Similarly,
shouldn’t the foot-to-hip delays be constant and the same independently of neck stiffness (Figure 2.5)?
Recall that the stiffness of the system is inferred from the resonant frequency of the acceleration measured
at the head. The acceleration at the head is a function of the dynamical response of the entire guinea fowl
model to input perturbations. In fact, we are measuring the frequency response and delays of the coupled
oscillations of the legs held in a standing position by rubber bands, plus the pelvis and neck. Given that
this physical structure is only symmetric in the sagittal plane, its dynamical response will depend on the
direction of the 3D perturbations—which naturally results in variability in our results. Nevertheless, the
corrugated tubing condition (‘low stiffness neck’) leads to perturbation responses at the head that, in
general and on average, reflect a lower stiffness for this lumped-parameter analysis. Similarly, foot-to-hip
delays were, in general and on average, shorter than the feet-to-head delays. In a sense, instead of ‘neck
stiffness,’ the results in Figure 2.4 may be better called the ‘apparent stiffness lumped at the head.’ But
given that the purpose of this analysis is to test for the effect of the material properties of the neck on time
delays and estimation accuracy, we chose not to belabor this point and simply call it ‘neck stiffness.’ After
all, (i) the neck is the only body part that was swapped, and (ii) changes in material properties only at the
neck better reflect the potential effects of muscle co-contraction at the neck in the guinea fowl.
There are limitations to our approach that, while worth mentioning, we believe do not challenge the
validity of our results. Importantly, our physical model can only approximate the anatomy and muscle
mechanics of the guinea fowl. Our multi-link articulated structure approximates only the general link-
segment arrangement and length proportions of the animal skeleton, and the viscoelastic rubber bands
only roughly approximate the properties of muscle-tendon linkages. Similarly, we did not consider the
proprioceptive signals coming from the joints, skin and muscles that could also contribute to state estimates
of foot acceleration. While these limitations prevent us from claiming that our results are direct parallels of
how a guinea fowl would respond neuro-mechanically to perturbations, it is nevertheless a valid means to
test for differences in sensory signals as a function of sensor location and neck stiffness. Moreover, we explicitly
avoided making the assumption that the skeleton of the guinea fowl was simply a set of links rigidly fused
at a given posture. Rather, we used rotating hinges at the joints, where the posture of the model was
achieved by appropriately setting the lengths and tensions of the rubber bands to approximate muscle-
tendon actions to maintain posture at rest. This mechanical structure—as a first approximation—provides
a biomechanically realistic instantaneous response to a perturbation at the feet, and avoids other multiple
assumptions associated with a computational model [70]. The results we present here are an analysis of
the aggregate acceleration responses to a sequence of center-out 3-dimensional perturbations. As such, we
consider the details of each response only implicitly. Future research could explore the moment-to-moment
details of the responses within an individual perturbation.
The biological interpretation of these results hinges on the assumption that the functional benefits of
hip-localized balance sense could translate into selective evolutionary pressure to promote the anatomical
specialization of the LSO in evolutionary time. This assumption is supported by two fundamental control-
theoretical notions: (i) that delays are detrimental because they make any causal closed-loop controller
(biological or engineered) more unstable [41] and (ii) that having a more faithful estimate of a perturbation
improves the corrective response, and thus improves performance, economy and stability.
The simplest interpretation of the time delays hinges on the notion that a causal feedback controller
has knowledge of the past, but not of the present (strictly speaking) or future. Therefore, it cannot exe-
cute anticipatory control actions and is thus limited by its closed-loop bandwidth. In contrast, biological
systems are well-known to produce anticipatory motor commands [7, 112], as well as short-latency reflex-
ive responses [96, 52, 51]. Anticipatory strategies are considered to be critical adaptations to mitigate the
deleterious effects of large transmission and processing delays inherent to neural systems [10, 32]. Nev-
ertheless, any voluntary, anticipatory or reflexive action would benefit from shorter delays. This point is
supported by the observation of many morphological and physiological adaptations in the nervous systems
to reduce time delays such as increased axonal diameter, myelination and saltatory conduction.
The biological relevance of state estimation [54, 95] relates to the fact that physiological sensory signals
contain task-relevant information, but not necessarily in the coordinates and units used by the controller.
In particular, some version of the ‘state’ of the system is encoded in sensory coordinates and units that
are different from those used by the neural controller to select, plan and execute a response. This means
that any raw sensory signal (e.g., acceleration at the LSO or vestibular system) must first be interpreted to
extract useful information (e.g., the details of the perturbation at the feet). The MOESP state-space iden-
tification technique is but one example of a state estimator [108, 107]. To test our hypothesis, it suffices
to show that a hip-localized balance sensing organ is better at sensing, estimating, and reconstructing the
perturbations at the feet than a head-localized one (Figure 2.6). In the same figure, we show only
forward/backward accelerations (i.e., along the y axis), which are the most destabilizing during locomotion.
It has been shown that lateral (i.e., side-to-side) movements are more stable than forward/backward move-
ments because stance width naturally provides a stabilizing effect [23]. Whether and how the concept of
state estimation applies to the nervous system, however, is yet unresolved [65].
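As an illustration of how estimation accuracy of this kind can be scored, the following Python sketch computes a percent variance-accounted-for value between a measured and a reconstructed signal. It assumes that the %VAF reported in Figures 2.7 and 2.8 follows the common definition 100 × (1 − var(error)/var(measured)). The function names, the toy signals and the use of population variance are illustrative assumptions, and the sketch does not reproduce the MOESP identification itself.

```python
def _variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def percent_vaf(measured, estimated):
    """Percent variance accounted for (assumed definition, see lead-in)."""
    if len(measured) != len(estimated) or len(measured) < 2:
        raise ValueError("signals must have the same length (at least 2)")
    var_measured = _variance(measured)
    if var_measured == 0:
        raise ValueError("measured signal has zero variance")
    residual = [m - e for m, e in zip(measured, estimated)]
    return 100.0 * (1.0 - _variance(residual) / var_measured)


# Toy signals (hypothetical): a foot acceleration and two reconstructions.
measured = [0.0, 1.0, 0.0, -1.0]
perfect = list(measured)
half_amplitude = [0.0, 0.5, 0.0, -0.5]

vaf_perfect = percent_vaf(measured, perfect)
vaf_half = percent_vaf(measured, half_amplitude)
print(vaf_perfect, vaf_half)
```

Under this definition a reconstruction that captures only half the amplitude of the measured signal still accounts for most of its variance, which is worth keeping in mind when comparing %VAF values across sensor locations.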
Necker stated in the concluding paragraph of his 2006 paper that ‘The local organization of the neuronal
network [of the LSO] favors rapid and hence effective control,’ with no further elaboration [78].
We now present what is, to the best of our knowledge, the first concrete evidence that a hip-localized
balance sense organ (like the LSO) is an effective source of faster and better sensing of posture-relevant
information. Faster sensing is evidenced by the shorter time delays for hip-localized vs. head-localized
accelerometers. Moreover, our results also show that the time delays for head-localized balance sense organs
can be shortened by co-contracting neck muscles (i.e., a stiffer neck). From the state estimation point
of view, however, we find that hip-localized balance sense organs are superior, and do not benefit from
sensory fusion with head-localized acceleration—independently of neck stiffness. Therefore, we conclude
that hip-localized balance sense indeed promotes more rapid and effective control.
These results have important implications for how the evolution of hip-localized balance sense by the
LSO might have contributed to the unique sensorimotor control features of birds. In particular, it has long
been recognized that birds have relatively ‘modular’ function and control of wings, legs and tail compared
to other vertebrates [38]. The functional dissociation between forelimb (wing) for aerial locomotion and
hindlimb (leg) for terrestrial locomotion is paralleled by increased autonomy of their respective sensorimotor
control networks [47, 93, 50, 13, 72]. The presence of a local and distributed balance sensing organ
that is directly integrated with hindlimb spinal networks has likely contributed to this modular control
organization. The mechanosensing neurons of the LSO project directly to pre-motor neurons in the spinal
cord [78, 28]. This suggests the balance sense information produced by the LSO is likely to contribute to
rapid and effective control because it is processed locally. Such local processing is advantageous because
involving the brain in the response could introduce counterproductive time delays.
While our results focus on perching, hip-localized balance sense is likely beneficial for other postural
and locomotor tasks. We designed our perturbations to simulate sensory inputs analogous to those of a bird
perching on a branch subject to varied 3-D movements caused by wind, movements of other animals, etc. During
perching, a bird is exposed to 3-D substrate perturbations, for which short-latency reflex responses could
suffice, if sufficiently rapid sensing is available. This is similar to the observed knee and ankle strategies
in the control of human upright posture [14], or slip-grip mechanisms for human finger manipulation
[17]. Moreover, such rapid and informative sensing is also critical to low-level (distributed, spinal or sub-
cortical) sensorimotor processing to control short-latency responses to perturbations [60, 61] that ulti-
mately supports long-latency control of voluntary function in general. The LSO is directly integrated with
the hindlimb spinal motor control networks [78, 28], suggesting that hip-localized balance sense is likely
relevant to all hindlimb-mediated behaviours, including perching, standing balance, over-ground locomo-
tion and arboreal locomotion. Birds effectively have two distinct balance sensorimotor processing centers:
the ‘cerebral brain’, responsible for executive function and navigation, and the ‘sacral brain’, responsible
for low-level, short latency control of terrestrial perching, standing and locomotion.
Adopting lessons from the millions of years of biological evolution poses intriguing and exciting pos-
sibilities for the engineering evolution of robust and versatile bipedal robots. There are well known forms
of morphological control where the structure of the body co-evolves with the nervous system (or con-
troller) to simplify and improve open- or closed-loop control [82, 63, 104]. At the other extreme we have
the classical robotics approach to fully centralized control that depends on algorithms that process sen-
sory information and issue motor commands. The LSO provides support for an intermediate alternative,
where one can have the benefits of morphological adaptations and central control, but supplemented by
distributed neural control centers informed by distributed balance sense organs like the LSO.
2.6 Acknowledgements
Author MD would like to thank Alexander Spröwitz for discussions on the potential balance sensing func-
tion of the lumbosacral organ of birds.
2.7 Author contributions
DU designed and constructed the physical model of the guinea fowl, wrote the Title, Abstract, Introduc-
tion, did renders of the physical model of the guinea fowl and put together different author ideas and
perspectives.
KJ guided the NI DAQ implementation and data analysis activities: system identification analysis,
bootstrap analysis and statistics.
Together, DU and KJ implemented the NI DAQ system, programmed the AdeptSix 300 robotic arm, did
data analysis in MATLAB, wrote the Methods section, and created and edited the figures.
MD and FV proposed the initial idea of a quantitative engineering analysis of the functional benefits
of the LSO balance sense in birds. They wrote most of the Discussion section and validated: (i) the design
and construction of the physical model of the guinea fowl, (ii) data analysis activities and (iii) each of the
paper sections and figures.
All the authors contributed to editing the paper for style, clarity, succinctness and grammar.
Chapter 3
Estimating Center of Pressure of a Bipedal Mechanism Using a
Proprioceptive Artificial Skin around its Ankles
Darío Urbina-Meléndez, Jiaoran Wang, Daniel Wang, Ali Marjaninejad, and Francisco J Valero-Cuevas.
“Estimating Center of Pressure of a Bipedal Mechanism Using a Proprioceptive Artificial Skin around its
Ankles”. In: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology
Society (EMBC). IEEE. 2021, pp. 4522–4528.
3.1 Chapter summary
Estimating the Center of Pressure (CoP) under legged robots is useful to control their posture and gait. This
is traditionally done using contact sensors at the base of the foot or with sensors on distal joints, which
are subject to wear and damage due to impulse forces. In vertebrates, skin and ligament deformation at
the ankle is a particularly rich source of sensory information for locomotion. For our bipedal mechanism,
afferent signals from sensors on synthetic skin wrapped around the ankles sufficed to estimate the location
of the CoP with a mean accuracy >81.5%. For this we used a K-Nearest Neighbors (KNN) algorithm trained
on the same force magnitude applied at four and nine ground-truth CoP locations. For a single mechanical
foot (i.e., single stance), signals from skin or ligaments (i.e., elastic rubber sheets and cables, respectively)
also sufficed to calculate the CoP (mean prediction accuracy >91.3%). Moreover, the visco-elasticity of
these elements serves to passively stabilize the ankle. Importantly, training the single leg case with forces
of different magnitudes also resulted in a similarly high mean CoP prediction accuracy (>84.5%). We
show that using bio-inspired proprioceptive skins and/or ligament arrangements can provide reliable CoP
predictions, while permitting arbitrary postures of the ankle and no sensors on the sole of the foot prone
to wear and damage. This novel approach to estimation of the CoP can be used to improve locomotion
control in a new class of bio-inspired rigid, soft and hybrid (soft-rigid) legged robots.
The work in this chapter is published in [101].
3.2 Introduction
In contrast to engineered systems where controlled variables are measured directly and accurately, bi-
ological systems use mechanoreceptors that are often distributed and non-collocated [56], [43]. Haptic
sensors on the skin are often thought of as pressure sensors that help to estimate parameters like the
Center of Pressure (CoP) on the soles of the feet [81], [24]. However, a less known but no less critical
example of non-collocated sensors (both in animals and in robots) is that of mechanoreceptors on the skin
surrounding joints [85], [115]. These cutaneous sensors react to stretch rather than pressure and are a
prime example where the information from multiple distributed mechanoreceptors is processed to extract
estimates of joint angles, which the nervous system uses to control limb movements [53], [25], [26].
Furthermore, skin stretch plays an important role in understanding ankle positions [73], [66], [4]. Here we
explored whether the strain measured by sensors on a low-cost artificial skin wrapped around the ankles
of a bipedal mechanism suffices to estimate its CoP (Fig. 3.1).
Common solutions for measurement of the CoP involve signal acquisition from the sole of the robot’s
foot, followed by a model-based [84], [69], [35], [31], [92] or a data-driven approach [91]. Some of these
studies report that the sensors are used to calculate the Zero Moment Point (ZMP). In these studies,
calculation of the ZMP depends on the CoP location; the former should always coincide with the
latter for dynamically balanced configurations [110]. Relying on detailed models of the robot’s dynamics
to calculate the CoP makes the system prone to failure when there are mismatches between the model
and the real system dynamics due to wear and tear, damage, degradation, lack of information about the
system, model simplicity or poor parameter identification [34]. In [99] a “redundant” sensor architecture
together with a model-free machine learning approach is used to observe the behaviour of a soft actuator.
Similarly to [99], we use machine learning to avoid the need for precise modeling and characterization of
sensors and body dynamics.
We calculate the position of the CoP by measuring the strain experienced by a low-cost sensorized
artificial skin wrapped around the ankles of an eight Degrees-of-Freedom (DoF) biped, as well as around
a two DoF uniped structure. This skin functions as a passive neutral-position ankle stabilizer while pro-
viding strain measurements as afferent sensory information. We used these strain measurements to train
a K-Nearest Neighbor (KNN) algorithm to estimate the CoP’s location, whose ground truth was known a priori.
The foot-leg mechanism, stabilized by the taut elastic skin, makes the ankle joints return to a neutral
configuration when not loaded. To further validate our results, we also performed experiments where we substituted the skin
with ligaments with different stiffnesses to operate like guy-wires. This approach presents an alternative
to sensors on the sole of the foot which are subject to wear and potential damage due to impulse forces
caused by robot-floor interaction [62].
3.3 Methods
3.3.1 Biped, Artificial Skin, Ligaments and Force Plate Construction
We developed a simple biped with eight DoF (i.e., two per ankle and two per hip joint), Fig. 3.1-A. For
manufacture and debugging simplicity we did not include knees. As explained in [110], the forces and
moments to which an ankle is subjected can be used to understand foot-ground interaction; thus the shape
Figure 3.1: A and B show experiments to estimate the Center of Pressure, while C presents device
details. A) Biped standing on the force plate with CoP sensing area marked by a numbered grid. The force
plate provides ground truth CoP locations (i.e., labels) while signals from four leaf spring sensors on each
of the biped’s ankles are recorded (i.e., features). B) Foot structure in an upside-down position, with
loads applied at different Center of Pressure locations on its sole. Simultaneously, data was recorded
from the leaf spring sensors (i.e., features). C) The ankle joint consists of a universal joint. Two strain gauges
are encapsulated inside two metal layers that form the leaf spring sensor. When the leaf spring geometry
changes due to skin elongation or contraction, the electrical resistance of the strain gauges changes, generating
a signal that depends on the strain the skin experiences.
of a structure over this joint can be ignored or, in our case, simplified. The two legs are connected by a
transversal post which we call the "hip". We added a 900 g mass (i.e., black cube in Fig. 3.1-A) to the hip bar
to increase the load and moments applied to the ground and ankle joints, respectively. Due to the biped’s
symmetry with respect to its sagittal plane, and the high density of the 900 g mass compared to the hip
bar, we know that the Center of Mass (CoM) of the biped is located within this mass. For these experiments
a different mass could have been used, as long as the biped’s CoM remained at the location just described. The
leg and foot incorporate mounting points for both skin and ligament components (Fig. 3.1-C and 3.2).
We also created a sensorized visco-elastic skin structure. The sensing component of the skin, which we
call the "leaf spring sensor", consists of three parts: a strain device encapsulated in a double-layer aluminum
arch, a load bearing buffer structure made of a double layer of highly elastic rubber polymer, and metal
connectors that facilitate the connection to the leg structure (Fig. 3.1-C, upper right corner). Two Comidox
BF120-3AA strain gauges were attached to the proximal side of the arch structure and reinforced with an
electrical tape infill. When the leg experiences a perturbation, the skin elongates causing the strain gauges
to sense the surface deformation of the aluminum arch structures. A pair of Adafruit ADS-1015 amplifiers
(Adafruit Industries, New York, NY, USA) were used in conjunction with a diagonal-half Wheatstone bridge
to prepare the strain gauge signals for acquisition using a PC.
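To make the relation between skin strain and the acquired signal concrete, the following Python sketch evaluates the textbook output of a diagonal half Wheatstone bridge. It assumes two active gauges in opposite arms that experience the same strain, two fixed resistors of the nominal gauge resistance, and a gauge factor of 2.0. None of these values is stated in the text, the sign of the output depends on wiring, and the function names are hypothetical.

```python
def diagonal_half_bridge_ratio(strain, gauge_factor=2.0):
    """Output-to-excitation voltage ratio of a diagonal half bridge.

    With x = gauge_factor * strain, the two active arms have resistance
    R * (1 + x) and the two passive arms have resistance R, which gives
    Vout / Vex = x / (2 + x).
    """
    x = gauge_factor * strain
    return x / (2.0 + x)


def strain_from_ratio(ratio, gauge_factor=2.0):
    """Invert the bridge equation: x = 2 * ratio / (1 - ratio)."""
    x = 2.0 * ratio / (1.0 - ratio)
    return x / gauge_factor


# Hypothetical example: 1000 microstrain on the aluminum arch.
example_strain = 1000e-6
example_ratio = diagonal_half_bridge_ratio(example_strain)
recovered_strain = strain_from_ratio(example_ratio)
print(example_ratio, recovered_strain)
```

Because the KNN classifier used later works directly on raw signal values, such a conversion to strain is not required for CoP prediction; it only clarifies what the raw values represent.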
For some experiments, we replaced the skin or sections of it with ligaments (Fig. 3.2); to mount them,
we created a structure employing SparkFun TAL-220b load cells (SparkFun Electronics, Boulder, CO, USA)
as the sensing element. As ligaments, we used Dacron® cable and extension springs, and then used them
to couple the load cells to the leg (Fig. 3.2). The flexibility and elasticity of these ligaments buffered the
load cell from the motion of the leg. This was done to combat the typically limited range of motion of
commercially available load cells. Four load cells and accompanying cable mounting mechanisms were
attached to the leg structure and special care was taken to maintain equal tension on each load cell while
the structure was in equilibrium. The consistent elastic properties of the ligaments (especially the extension
springs version) combined with a stiff mounting structure allowed us to reliably measure load-cell tension,
even after extensive use and testing.
To measure the ground truth of the CoP for the double stance case, we created a 40.9 × 40.9 cm force
plate sensor (Fig. 3.1-A). The plate’s edge dimensions were chosen based on the biped’s support polygon
dimensions (i.e., three foot-sole areas). Four SparkFun TAL-220b load cells, placed under the plate’s surface,
determined the 8.0 × 8.0 cm CoP sensing area (i.e., square framed with black electrical tape in Fig. 3.1-A).
Placing the sensors in this position increased the sensitivity to the CoP position variations throughout
experiments.
Overall, the mechanisms were manufactured at low cost, keeping in mind that we want to show that the
introduced techniques can be applied to a variety of equivalent ankles and artificial skins incorporating
different technologies.
3.3.2 Double Stance Experiments
First, we trained the force plate on both nine and four ground truth CoP locations marked on a numbered
8 × 8 cm grid on the plate center, subdivided into four or nine squares (16 and 7.1 cm² respectively). For this,
we used a mass equivalent to the mass of the biped but with a 2.4 cm² base to ensure localized loading. We
then centered the biped over the force plate and positioned its CoM centered over the numbered grid. We
used the trained platform to assign CoP values (i.e., labels) to the afferent signals provided by the skin (i.e.,
features) while manually manipulating the biped mechanism to place its CoM over different locations of
the numbered grid (Fig. 3.1-A). During the experiments, the CoM remained over each number in the grid
for the same amount of time.
As shown in Fig. 3.1-A, to ensure that the load experienced by the platform only depends on the biped’s
position and mass, we equipped the bipedal structure with a rope to pull the hip towards a desired location,
while we compensated rotational hip movements by holding the opposite side of the hip. Hip rotation
compensation was done to keep the CoM projection on the force plate within the numbered grid area. In
our experiments, CoP and CoM are in the same region (i.e., anterior or posterior with respect to the coronal
plane, and/or lateral right or left with respect to the sagittal plane); this is consistent with the well-studied
relationship between CoM and CoP for static cases [113], [11]. Due to the low friction between the metal
foot and the acrylic platform, electrical tape was used to hold the foot at the desired position on the force
plate.
The prediction of the CoP is possible after training a KNN algorithm [83] with afferent strain data from
a synthetic skin (i.e., features), and using CoP locations as ground truth labels. We report results of five
nearest neighbors (K = 5). For some cases there is a better prediction accuracy using a different K, but
considering that the best K cannot be determined while a robot operates, we chose K=5 as our standard
parameter for our algorithm.
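The classification step described above can be sketched in a few lines of Python. This is a minimal illustration of K-Nearest Neighbors with K = 5 and Euclidean distance, not the caret/R analysis used for the reported results. The four-channel "strain" vectors, the class centroids and the offsets are synthetic, hypothetical values chosen only to make the example run.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=5):
    """Predict the label of `query` by majority vote among its k nearest
    training samples (Euclidean distance)."""
    ranked = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote, most_common keeps the label that was seen first,
    # i.e. the one whose closest neighbour is nearest to the query.
    return votes.most_common(1)[0][0]


# Synthetic training set: four CoP locations (labels 1 to 4), each with a
# hypothetical characteristic pattern on four strain channels.
CENTROIDS = {
    1: (1.0, 0.0, 0.0, 1.0),
    2: (0.0, 1.0, 0.0, 1.0),
    3: (1.0, 0.0, 1.0, 0.0),
    4: (0.0, 1.0, 1.0, 0.0),
}
OFFSETS = (-0.10, -0.05, 0.0, 0.05, 0.10, 0.15)

train_features, train_labels = [], []
for label, centroid in CENTROIDS.items():
    for offset in OFFSETS:
        train_features.append(tuple(c + offset for c in centroid))
        train_labels.append(label)

predicted = knn_predict(train_features, train_labels, (0.9, 0.1, 0.05, 0.95))
print(predicted)
```

Fixing K in advance, as done in the text, keeps the predictor usable online: no tuning step is needed once the robot is operating.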
3.3.3 Single Stance Experiments
For this experiment, we mounted just one foot of the biped in an upside-down position. This allows its
two degree-of-freedom ankle to flex when applying a 900 g load. We first applied this load across nine
numbered and equally sized locations forming a 3 × 3 grid of boxes on its upward-facing sole. For a
second version of the experiment, we increased the number of square segments to 16, forming a 4 × 4 grid
on the sole. For both cases, the total grid area was 85.9 cm².
A 900 g mass was used to mimic the load that our biped foot would experience while in a
single stance position. Each sequence consisted of a human operator applying the load to all grid locations,
following a visual cue on a screen, while data was recorded from the leaf spring sensors (i.e.,
features). The collected data consists of 28 complete sequences (Fig. 3.1-B).
In the same way as described for the double stance experiment, the prediction of the CoP is possible
by using skin afferent information as features and CoP locations as labels to train a KNN algorithm [83].
The main difference is that, in the double stance case, the ground truth CoP was given by a trained force
plate, whereas for the single stance experiment the labels were automatically assigned. The operator follows
a visual cue on a screen to apply the load to the foot sole. The same values shown on the screen are assigned
as labels to the data features. While a value is shown on the screen, a batch of 25 lines of data is recorded
(each line containing all sensor readings); only the twelfth line of each batch is used to train the KNN
model. This helps to ensure that the recorded features correspond to the correct ground truth CoP label,
by giving the operator enough time to manually move the load to the corresponding
value on the foot sole grid.
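The labeling rule just described (keep only the twelfth of every 25 recorded lines and label it with the value shown on the screen) can be sketched as follows in Python. The data layout is an assumption: the sketch supposes that the batches are stored back to back in a single list and that one cue value is available per batch. The function and variable names are hypothetical.

```python
BATCH_SIZE = 25   # lines recorded while one cue value is on screen
KEPT_LINE = 12    # 1-based index of the line kept from each batch


def select_training_rows(rows, cue_labels,
                         batch_size=BATCH_SIZE, kept_line=KEPT_LINE):
    """Return (features, labels), keeping one row per batch.

    `rows` holds all recorded lines, batch after batch; `cue_labels`
    holds the CoP grid value shown on screen during each batch.
    """
    if len(rows) != batch_size * len(cue_labels):
        raise ValueError("expected exactly one full batch per cue label")
    features, labels = [], []
    for batch_index, label in enumerate(cue_labels):
        start = batch_index * batch_size
        features.append(rows[start + kept_line - 1])
        labels.append(label)
    return features, labels


# Hypothetical recording: two batches whose single "sensor" value is just
# the running line index, so the selected rows are easy to verify.
recorded_rows = [(i,) for i in range(50)]
features, labels = select_training_rows(recorded_rows, cue_labels=[3, 7])
print(features, labels)
```

Taking a line from the middle of the batch discards the samples recorded while the operator is still moving the load between grid positions.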
Figure 3.2: Non-homogeneously distributed sensors, subjected to different conditions, can enable
the prediction of body states. A) Render of Foot-Ankle-Leg with Dacron Cable Ligament Assembly;
ligaments can also be extension springs, as shown in B). Two mounting points are available, one being
more proximal to the knee. When using the distal mounting point, we noticed a loss in state observability
due to the universal joint and mount point alignment. B) Foot-Ankle-Leg with a combination of skin with
30 mm Leaf Springs and an Extension Spring Ligament.
3.3.4 Code
For Sections 3.3.2 and 3.3.3 we used C++ (Arduino boards) and MathWorks (Natick, MA, USA) MATLAB
for data acquisition. The caret library in R was used to perform KNN analyses. We validated our estimator
with five-fold cross-validation in all our experiments [59]. A link to the code repository is given in the
Supplementary Information (Section 3.7).
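For readers unfamiliar with the validation scheme, the following Python sketch shows the logic of five-fold cross-validation around a KNN classifier. It is not the caret implementation: folds are assigned deterministically by sample index instead of by random stratified sampling, the compact `knn_predict` helper is a stand-in, and the one-dimensional two-class data set is synthetic.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k nearest training samples."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    return Counter(y for _, y in ranked[:k]).most_common(1)[0][0]


def k_fold_accuracies(features, labels, n_folds=5, k=5):
    """Hold out each fold once, train on the rest, return per-fold accuracy.

    Sample i belongs to fold i % n_folds (a simplifying assumption).
    """
    accuracies = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(features)) if i % n_folds == fold]
        train_idx = [i for i in range(len(features)) if i % n_folds != fold]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic, well separated data: ten samples near 0 and ten near 5.
features = [(0.1 * i,) for i in range(10)] + [(5.0 + 0.1 * i,) for i in range(10)]
labels = ["A"] * 10 + ["B"] * 10

fold_accuracies = k_fold_accuracies(features, labels)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(fold_accuracies, mean_accuracy)
```

Each sample is used for testing exactly once, so the mean over folds estimates how the classifier performs on data it was not trained on.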
3.4 Results
We focus on the CoP prediction done from skin afferent signals and compare them to predictions obtained
from equivalent mechanisms (i.e., springs and string ligaments). For all the following results, unless ex-
plicitly mentioned, training and testing were done with the K-Nearest Neighbor approach described in
Methods (Section 3.3.2 and 3.3.3).
3.4.1 Double Stance Case
Skin wrapped around the ankles of a bipedal structure can be used to estimate its CoP location. In our experiments
we estimate the location of the biped’s CoP with a mean accuracy >81.5%.
Four CoP locations experiments: Force plate prediction accuracy for these experiments was 97.6%.
Three tests were performed; for each test, CoP values assigned by the force plate were predicted with
respective accuracies of 87.44%, 82.76% and 77.29%. In Fig. 3.4-A the CoP prediction accuracy for one of the
tests of this experiment is presented (i.e., 87.44% = 97.6% × 89.6%).
Nine CoP locations experiments: Force plate prediction accuracy for this experiment was 97.2%.
One test was performed; the CoP values assigned by the force plate were predicted with an accuracy of
80.8%. In Fig. 3.4-B the CoP prediction accuracy for this experiment is presented (i.e., 78.53% = 97.2% × 80.8%).
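The products quoted in parentheses above suggest that the overall figure is obtained by multiplying the force plate's own labeling accuracy by the accuracy with which those labels are predicted from the skin signals. The short Python sketch below reproduces that arithmetic. Reading the product as a chained accuracy, which implicitly treats the two error sources as independent, is an interpretation and not something the text states; the function name is hypothetical.

```python
def chained_accuracy(plate_accuracy, skin_accuracy):
    """Accuracy of a two-stage pipeline, assuming independent errors."""
    return plate_accuracy * skin_accuracy


# Values quoted in the text, expressed as fractions.
four_locations = chained_accuracy(0.976, 0.896)
nine_locations = chained_accuracy(0.972, 0.808)

print(f"{100 * four_locations:.4f}%  {100 * nine_locations:.4f}%")
```

The two products evaluate to about 87.45% and 78.54%, which agree with the reported 87.44% and 78.53% to within 0.01 percentage points.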
3.4.2 Single Stance Case: Skin vs. Ligaments
In the single stance case, strain afferent signals from ankle-wrapping skin or ligaments suffice to estimate
the structure’s CoP.
The mean prediction accuracy for the skin case was 91.44%, while for the string and spring ligament
cases it was 86.68% and 95.75% respectively (Table 3.1). It is important to consider that skin sensors are
not calibrated; this can be seen in Fig. 3.3 when comparing section A (same baseline for all signals) with
section B (different baseline for each signal), respectively for the skin and spring cases.
3.4.3 Single Stance Case: Skin with 30 and 40 mm Leaf Spring Sensor Versions
Regardless of their type, sensors that provide strain measurements of ankle skin can enable CoP estimation.
Here we show that similar CoP prediction results can be obtained from two versions of the sensor.
Nine CoP locations experiments: When the CoP was estimated from the artificial skin afferent
signals, the maximum and minimum prediction accuracies were 96.79% and 81.5% (Table 3.1). For the skin,
a total of 6 prediction accuracies were calculated using different leaf spring sensor sizes: 3 for the 30 mm
and 3 for the 40 mm cases. Respectively, the mean prediction accuracies obtained were 87.42% and 95.47%.
Sixteen CoP locations experiments: For this case, the CoP was estimated with a mean prediction
accuracy of 97.05% (Table 3.1).
3.4.4 Single Stance Case: Skin and Ligament Combination
Here we show how the CoP can be estimated using signals from skin and tendon strain sensors simultaneously
(Fig. 3.2 and Table 3.1). Signals from different sensors are used to build a prediction or understanding
of a phenomenon or parameter (see Discussion, Section 3.5).
Maximum and minimum CoP prediction accuracy values for these cases were 92.79% and 84.85%. Even
though skin and extension spring elements have very different stiffnesses, we did not observe a significant
drop in prediction accuracy with respect to the other already presented results (Table 3.1).
3.4.5 Statistical Significance
After performing five-fold cross-validation in all our experiments, we obtained a Kappa value that was
always above 0.81 for the single stance experiments. For the double stance experiments, we consistently
Table 3.1: Center of Pressure Prediction Accuracy, Single Stance
obtained a Kappa value above 0.70. The obtained values point to a substantial or a near-perfect agreement
(Kappa>0.61 or Kappa>0.81 respectively) [59]. Overall, this can be interpreted as the prediction almost
always being accurate: the prediction "agrees" with the real or ground truth values.
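The Kappa statistic corrects raw accuracy for the agreement expected by chance. The Python sketch below computes it from two label sequences, assuming that the reported value is the unweighted Cohen's kappa; the label sequences are hypothetical and are not data from the experiments.

```python
from collections import Counter


def cohens_kappa(true_labels, predicted_labels):
    """Unweighted Cohen's kappa between two equally long label sequences."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(true_labels)
    observed = sum(t == p for t, p in zip(true_labels, predicted_labels)) / n
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    # Chance agreement: product of the marginal frequencies, summed per class.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: eight samples over four CoP locations, one error.
true_cop = [1, 1, 2, 2, 3, 3, 4, 4]
predicted_cop = [1, 1, 2, 2, 3, 3, 4, 1]

kappa = cohens_kappa(true_cop, predicted_cop)
print(round(kappa, 4))
```

In this example the raw accuracy is 0.875 while kappa is about 0.83, which illustrates why the two figures reported for the same experiment differ.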
3.4.6 Convergence to Stable Equilibrium Point
To demonstrate how the plant, for all skin and ligaments stance configurations, converges to a stable
equilibrium point (i.e., neutral ankle position or foot-leg in a 90 deg angle), we let the ankle go to its
neutral position after perturbing it. During this simple test we observed how signals converge to the same
signal value (e.g. 1400 for the extension spring configuration).
3.5 Discussion
We show that bio-inspired proprioceptive skins and/or ligament arrangements can provide reliable CoP
predictions via a model-free machine learning approach (KNN), while permitting arbitrary postures of
the ankle and no sensors on the sole of the foot prone to wear and damage. It is important to consider
that we use a very low-cost artificial skin to show that different kinds of skins and/or ligaments (capable
of measuring strain or longitudinal deformation) can be used to calculate the CoP. We propose this novel
approach to estimate the CoP, which can be used to improve locomotion control in a new class of bio-inspired
rigid, soft and hybrid (soft-rigid) legged robots.
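To make the model-free KNN idea concrete, here is a minimal sketch of a k-nearest-neighbours classifier that maps a vector of strain readings to a CoP location label. The function name `knn_predict`, the choice of four sensor channels, the value of k, and all signal values and labels are assumptions made for illustration; they are not the settings or data used in the experiments.

```python
from collections import Counter
import math

def knn_predict(train_signals, train_labels, query, k=3):
    """Predict a CoP location label by majority vote of the k nearest training samples."""
    distances = sorted(
        (math.dist(signal, query), label)
        for signal, label in zip(train_signals, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical raw readings from four strain sensors around the ankle.
train_signals = [
    (1400, 1380, 1410, 1395), (1405, 1378, 1412, 1390),   # "center"
    (1500, 1300, 1420, 1380), (1495, 1310, 1418, 1385),   # "front"
    (1300, 1480, 1400, 1405), (1310, 1475, 1398, 1400),   # "back"
]
train_labels = ["center", "center", "front", "front", "back", "back"]

predicted_label = knn_predict(train_signals, train_labels, (1498, 1305, 1419, 1382))
```

No model of the ankle or foot is needed here: the prediction depends only on distances between stored and new sensor readings.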
Our work is motivated by the well-known problem that sensors placed on the sole of the foot are subject
to wear and impulse forces [62]. Our alternative, as in nature, is to use skin and ligament strain sensors
at the ankle. Our work now demonstrates that these non-collocated strain sensors suffice to estimate the
CoP of bipeds during double and single stance. We observed that non-homogeneously distributed sensors,
subjected to different conditions (i.e., strain sensors mounted on a combination of skin and ligaments
surrounding the ankle, as described in Section 3.4.4, Figure 3.2 and Table 3.1) can enable the prediction
Figure 3.3: Raw signal samples: A) Skin with 40 mm Leaf Spring and B) Extension Spring Ligaments.
Table 3.2: Center of Pressure Prediction Accuracy, Single Stance: "Blind Tests": Variable and unknown
training load
Figure 3.4: Skin strain afferent signals enable the prediction of the Center of Pressure. Confusion
matrices that show percentages of how much the prediction of the CoP (i.e., x axis) agrees with the ground
truth CoP value (i.e., y axis). All cases were done using skin with 30 mm Leaf Spring Sensors. A) Double stance
experiment, four CoP locations prediction accuracy: 87.44% = 97.6% × 89.6%. The force plate prediction
accuracy was 97.6%, and the CoP values assigned by the plate were 89.6% correctly predicted using skin afferent signals.
B) Double stance experiment, nine CoP locations prediction accuracy: 78.53% = 97.2% × 80.8% (same
rationale as in A). Even though this prediction value is high, CoP values were not assigned correctly by
the force plate: the biped was manipulated to reach nine CoP locations (as described in Section 3.3.2), but
only some points (i.e., labels) were assigned by the plate while performing the task. This is considered a
poor test due to the inability of the force plate to reliably assign nine CoP location labels. C) We show
with the single stance experiment that a synthetic skin is capable of providing signals to estimate 16 CoP
values. The experimental protocol is described in Sections 3.3.2 and 3.3.3 and in Fig. 3.1-B.
of body states. This observation is aligned with our previous publication [100] where sensory signals are
fused to show the benefits of distributed sensing for balance. Sensorized skins also have the advantages
of (i) passively stabilizing the ankle, while (ii) not restricting the DoFs of the mechanism, allowing smooth
movements and applicability to soft or hybrid (soft-rigid) robots [99].
Similar to our approach, in [91], the CoP location of a commercial prosthetic foot was estimated by
measuring the strain experienced by structural elements at the ankle. We extend that approach by showing
that measuring skin and ligament strain is also enough to calculate the CoP. Table 3.1 and Fig. 3.4 show
that we can estimate the CoP for the double and single stance cases, with mean prediction accuracy values of
92.4% and 81.51% respectively.
Even though we got a mean CoP prediction accuracy >80% in the double stance experiments, we
identified two main areas for potential improvements of this experiment: (i) using a robotic biped with
motors to change its posture, instead of manually changing and holding new positions and (ii) increasing
the resolution of the force plate used to provide more accurate ground truth CoP values. As shown in
Fig. 3.4-A and explained in Section 3.3.2, our self-built force plate provides measurements of the ground
truth for the four CoP locations to be predicted with skin afferent signals. But the resolution of the force
plate may not have been high enough to provide accurate ground truth CoP locations for the nine CoP
locations experiment (Fig. 3.4-B). This likely explains why we do not see a solid blue diagonal in Fig. 3.4-B.
Regarding the ability of the skin alone to estimate the CoP, we show that (for the single stance experiments,
where no force plate is involved) afferent signals from the skin suffice to estimate up to 16 CoP locations
with a prediction accuracy of 96.8% (Fig. 3.4-C).
We did an additional ‘blind test’ which involves a training load with variable and unknown magnitude.
This was our attempt to approximate what a foot would experience in a natural terrain where ground
reaction forces are variable and not known a priori. We repeated the experiments described in Section
3.3.3 and Fig. 3.1-B, but instead of applying the same 900 g load, a human operator applied a variable and
unmeasured load to the foot sole with her index finger at the same specified locations (results in Table
3.2). We obtained overall prediction accuracies close to the ones obtained for the constant force case (cf.
84.53%, Table 3.2, and 91.37%, Table 3.1). Kappa values consistently pointed to high prediction accuracy
(i.e., >0.61 for setups combining skin and ligaments and >0.71 for setups with only skin or ligaments).
Finally, the consistency of all results involving two versions of the spring-leaf skin sensors, ligaments,
and skin+ligaments combination (Tables 3.1, 3.2 and Fig. 3.4-C) highlights the likely generalizability of our
approach to different kinds of artificial skins, ligaments and their combination. This approach should be
useful to robotics, as it is known to be critical in biological systems. As mentioned in [49], afferent signals
produced by skin stretched at the ankle or knee have a significant impact on the control of joint angles
during walking. In [49], the importance of skin sensation is stressed by showing that it has a greater impact
on ankle joint control than visual cues.
3.6 Conclusions
We believe that this bio-inspired study will motivate engineers and scientists to further explore the benefits
and applications of the bio-inspired, non-collocated proprioception for the control of locomotion in legged
robots.
3.7 Supplementary information
The code and the supplementary files can be accessed through the project’s GitHub repository at:
https://github.com/CatStrain/Cat_skin
3.8 Acknowledgements
The authors thank Tailun Liu for his help in developing the first versions of the circuits used for the experiments, and
Suraj Chakravarthi Raja for helping in proofreading the manuscript. The authors acknowledge the access to equip-
ment for building experimental hardware provided by the Baum Family Makerspace, from the Viterbi School of En-
gineering (USC). Research reported in this publication was supported in part by the Department of Defense CDMRP
Grant MR150091, and Award W911NF1820264 from the DARPA-L2M program, as well as National Institutes of Health
under the award number R21-NS113613 to F.J.V.-C. The authors acknowledge support for D.U.-M. by the research
fellowship granted by Consejo Nacional de Ciencia y Tecnología (CONACYT-Mexico).
Chapter 4
Exploiting Brain-Body-Environment Interactions to Improve
Locomotion Learning in a Biped Robot
4.1 Chapter Summary
Young vertebrates use motor babbling (random sparse physical actions) to explore brain-body-environment interac-
tions and to heuristically learn complex motor tasks. Here we demonstrate that two minutes of motor babbling in a
physical biped robot can enable the learning of supported-by-a-gantry bipedal locomotion, but only if the random
actions exploit the physical properties and are compatible with the range of motion of the limb (i.e., “natural” bab-
bling), compared to purely random “naïve” babbling that can conflict with the leg’s natural dynamics or exacerbate
antagonistic muscle actions. Furthermore we see how, by reducing the support the gantry provides (i.e., increasing
the constraints that the environment imposes to walking) the biped learns to walk after training with both “naïve”
and “natural” babbling approaches. Importantly, the interference with the ground was naturally mediated by the
inherent backdrivability of the electric motors, and was not pre-programmed. With this we explore how the perfor-
mance of learned actions improves not only if the brain exploits the physical properties of a body, but also if the body
admits mechanical inputs from the environment. This is especially relevant when collecting critical learning data
via physical experience is risky, expensive or desirable.
4.2 Introduction
Getting inspiration from biology to create robots has become more common as their architecture and task complexity
have increased (e.g., more degrees of freedom and walking with unknown constraints). These complexities have
increasingly complicated the creation of human-made models and/or equations that encapsulate all the dynamics of
a robot [8], [55]. Two bioinspired approaches, central to this paper, that aim to make robots more versatile and faster
learners are: i) the creation of tendon (i.e., cable) driven robots with variable impedance actuators whose movements
adapt to their body dynamics and environmental mechanical inputs [106] and ii) the creation of model- or task-
agnostic machine learning algorithms to estimate body states [101] and to learn to perform tasks heuristically [117,
58, 46] via brain-body collaboration and sparse physical actions [58, 68].
For this paper we created a bioinspired backdrivable biped robot that performs data-driven model-free locomo-
tion learning guided by the dynamics of the plant. We show that these properties can improve the learning rate and
performance of bipedal locomotion. We draw inspiration from: i) the mutual inhibition of agonist/antagonist muscle
pairs [22], especially relevant for flexing movements (i.e., abduction or adduction), which can produce oscillatory
movements (e.g., leg swing) [36] and ii) motor babbling (random sparse physical actions) used by young vertebrates to
explore brain-body-environment interactions to produce useful behaviors [33, 2].
Our approach falls into the group of heuristic data-driven learning techniques that have become viable alternatives
to algorithmic approaches to perform bipedal locomotion. Numerous robots have presented walking behavior
boosted by data-driven machine learning (ML) techniques to tune or learn models of their body [68] and/or tasks
[58]. These techniques rely heavily on data sampling strategies that often consist of thousands of action trials. One
way to reduce training time is to use kinematic or dynamic models together with reinforcement learning (RL), significantly
reducing the sampling space [119]. In [116], artificial neural networks (ANNs) based on reference trajectories were used
to produce walking and running. Other
approaches are based on skills parametrized as dynamic movement primitives and improved with RL techniques
[89, 118]. In contrast to these heuristic data-driven learning approaches, robotic locomotion has been traditionally
achieved using control techniques based on: i) error correction, as in linear control, computed torque control and
adaptive control, and ii) computing optimal values for cost functions, as in optimal control [94]. Even
though optimal control is among the most successful approaches to control limb movement, it has been proven to
require longer calculation time to find solutions to the nonlinear problems of generating movement of biologically
realistic limbs compared to approaches that exploit the limb’s properties [12].
In this work we are particularly interested in developing a strategy to sparsely sample the robot’s configuration
space while exploiting its mechanical properties. Importantly, we do not rely on a robot’s model which is the basis
of many reinforcement learning algorithms. Following this idea, in [64] learning of a walking task was accelerated
using a lookup table of kinematic robot ranges corresponding to the area where the center of pressure needs to be to
maintain a dynamically balanced gait. Similarly to us, they constrain the degrees of freedom of their robot to simplify
the task of finding suitable actions. In [100], we explain how positioning the sensors closer to perturbation sites
helps in making data more relevant to estimate the body states of a biped, potentially being a way to reduce the data
required to learn a task. In the furthest extreme we have robots whose structure allows them to produce locomotion
even without sensory feedback, showing that intrinsic mechanical control can suffice to produce useful movements
[9].
We present a “natural” motor babbling strategy as an extension of the G2P or “General to Particular” algorithm [68],
which enables bioinspired learning of locomotion movements in tendon driven robotic limbs. Our new “natural”
babbling strategy is an improvement of the naïve babbling strategy previously used by G2P. Naïve babbling imposes
movements that can conflict with leg dynamics, causing 80% of the data generated to lie on edges of the configuration
space, away from the area where the locomotion solutions lie. With the natural babbling, motor activations: i)
produce joint rotations away from their limits of rotation and ii) follow a sinusoidal pattern instead of step functions
(sinusoid patterns have a phase shift of 180 ± 20 degrees for pairs of motors that act on the same joint; in other words,
two antagonist motors are not simultaneously activated with high activation values, resembling muscle mutual
inhibition in living organisms). As a result, the leg joints are more homogeneously exposed to the region of the
configuration space where locomotion patterns lie, promoting a more informative sensory feedback, compatible
with the limb properties and faster learning of robotic locomotion.
Implementing such novel data-driven heuristic learning approaches must emphasize brain-body co-adaptation
[103]. To this end, we designed and built a physical tendon-driven biped robot with backdrivable and variable
impedance (i.e., high admittance) actuators. These are crucial aspects to the design of the robot of this paper, whose
movement depends not only on the controller outputs but also on the interaction with the environment. As explained
in [106]: variable impedance allows for better leg movement adaptation, while increasing robustness to environmen-
tal perturbations and/or changes in robot kinematics or dynamic properties. To explain the foundation of this, we
can focus on a single motor: we say that it “admits” mechanical input (i.e., variation in its shaft position due to an
external mechanical load) when the total torque applied to its shaft is bigger than the torque generated by the motor.
For our robot, when two motors acting on the same joint are simultaneously excited with the same activation values,
with negligible environmental or limb mechanical input, the equilibrium point will be a static position (i.e., the impedance
of both actuators is the same, so no joint rotation will be generated). All in all, the state of our robot limbs (their
position, velocity and acceleration) depends on the balance between i) motor activations and ii) environmental input,
both interfaced by the limb mechanical properties.
4.3 Methods
4.3.1 Robot characteristics
For our experiments we built a physical biped robot consisting of two legs with two Degrees of Freedom each,
connected through a hip. For each leg, the degrees of freedom are hip and knee, the foot being a ball foot to facilitate
interaction with the environment. The mechanical power to the joints is provided by a structure that resembles a
muscle: the force of the muscle is provided by a motor, while the muscle-joint interface (which in our robot is
the motor-joint interface) is a string that we call a tendon. This robot is over-actuated in the sense that it has more
actuators than degrees of freedom. The tendon route is shown in Figure 4.1.
The tendon routing of our robot is an evolution of the routing for the robot in our already published paper [68],
where all the motors were placed distally to the leg (i.e., in the hip). Here we simplify the tendon routing by having
only two motors placed distal to the leg and a third one in the thigh. This design decision was made to reduce
the torques affecting the hip joint, thus potentially simplifying the task of learning a useful movement. The motors
(Maxon DCX16S GB KL 24V) include a gearhead (with a reduction ratio of 21:1). Comparing two motors A and B,
both set to the same voltage level and mechanical load; A with a gearhead and B without one: motor A reduces
backdrivability of the limb while increasing its mechanical power output capabilities. This is an advantage for when
Figure 4.1: A) One-leg tendon route diagram. B) Render of the 3D model of the biped. C) Photograph of the
biped.
the design of the robot is changed to a heavier one due to bigger body size and/or the addition of more components
(e.g., sensors and actuators).
The ranges of motion of the joints were larger than in our previous robot designs, allowing us to explore the
capability of the robot to track a desired trajectory independently of hard stops providing physical help. Here it is
important to mention that in locomotion experiments the movement of a robot is typically physically limited by two
components which serve as boundaries of its feasible configuration space: mechanical constraints (i.e., hard stops) in
its own body and environmental constraints (i.e., objects or the ground itself). By designing our robot to have large ranges
of motion normally not reachable while performing tasks (Figures 4.2-B and 4.3-B), we focus on the role that environmental
constraints have on the resultant performance of a task.
To keep inertia as low as possible (which has a direct impact on the power consumption needed to meet the demands of leg
movement), and to increase the stiffness of the legs, we used aluminum tubes as the main components of the legs. We
used additive manufacturing (3D printing) techniques for the construction of the joints. We also considered the
implementation of easy tendon attachment points to facilitate the replacement of tendons, which are the part of the
robot that breaks most often.
We built a gantry to support the biped, only allowing its hip to move along the x and z axes in its sagittal plane.
The gantry prevents the biped from falling down, allowing us to focus merely on the task of learning a locomotion
cycle.
4.3.2 General G2P overview (naïve and natural babbling explained)
The General to Particular (G2P) algorithm was developed in [68]. This algorithm uses a Neural Network (NN) as a map
from inputs to outputs (respectively, desired kinematics to motor activations). The NN is trained with input-output
data sets obtained from babbling, and tested with input-output data sets obtained from babbling by predicting
outputs given inputs. The predicted outputs are compared with ground truth motor activation outputs. The difference
between predicted and obtained values is the error, and the goal is to reduce this error. The testing/training
data set size ratio is 0.25. G2P refines this map with a reinforcement learning approach. For this paper we do
not consider that part of the algorithm, since we are interested in understanding the value of the data obtained
during babbling. To better understand the value of babbling data, we test the performance of the system after training
and testing the NN. By having only this step to improve the system's walking performance, we more clearly see
the effect of using different types of babbling data. We use one NN per leg. (In Appendix A.1 we provide more
details on the characteristics of G2P.)
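The data handling described above can be sketched as follows. This is a minimal illustration, not the G2P implementation: the names `split_babbling` and `mean_squared_error` are hypothetical, the error measure is assumed to be a mean squared error, and the 0.25 ratio is read literally as (test set size) / (training set size), which corresponds to 20% of all samples being held out.

```python
import random

def split_babbling(samples, test_to_train_ratio=0.25, seed=0):
    """Shuffle samples and split them so that len(test) / len(train) is close to the ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # test = r * train and test + train = n  =>  test = n * r / (1 + r)
    n_test = round(len(shuffled) * test_to_train_ratio / (1.0 + test_to_train_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def mean_squared_error(predicted, actual):
    """Mean squared difference between predicted and ground truth motor activations."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

samples = list(range(1000))          # stand-in for 1000 babbling samples
train, test = split_babbling(samples)
```

If the ratio is instead meant as a fraction of the whole data set, the split would hold out 25% of the samples rather than 20%.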
As mentioned in the introduction, we made changes to the babbling strategy of G2P to more homogeneously
expose the leg joints to the areas in its configuration space where locomotion patterns lie. To keep our focus on
assessing the usefulness of the data to produce a mapping with which a desired trajectory can be tracked, we par-
ticularly tested the G2P capability to create a motor activations to limb kinematics map without any refinement to
such map. With this paper we show that (for a two-DoF, three-actuator leg) properly obtained data can be enough
to train an ANN to produce useful movement (more details in the Results and Discussion sections).
In this section, when motors M1, M2 and M3 are mentioned, please refer to Figure 4.1.
We first give a naïve babbling overview to then continue with the natural babbling explanation.
In general, naïve babbling consists of random step variations of the PWM signal for each of the motors. Each motor
signal is independent of the others. Step change frequency: 1.3 Hz.
For natural babbling, each PWM signal for each of the motors follows a sinusoid profile. Considering that the
mean value of the signals is 0, only the positive section is used. For each motor, the signal amplitude is varied
randomly. M1 and M2 signals have a phase shift of 180 deg. This is to avoid simultaneous activations of the motors,
which would cause no hip movement to happen [22, 36]. Every 15 seconds, the phase between M1 and M3 was increased
by 36 deg. The baseline of each signal varies ±30 PWM units (±1 V). To get a sinusoid-like shape, a number of
small steps in series need to be considered (this is a digital system, so we are discretizing the signal). Step frequency:
6 Hz. Sinusoid frequency (every time a period is completed): 0.6 Hz. Frequency of each signal peak: 1.3 Hz. Each
peak (natural babbling) has approximately the same width as each step of naïve babbling (look at the purple signals
within the two black horizontal lines in the chart). All frequencies are reported as approximate values. It is intrinsic
to the microcontroller behavior to have slight variations in signaling and sampling frequency. The limits of rotation
of each of the joints were never reached with natural babbling, a crucial point for our results and conclusions (Figure
4.2-B).
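The two babbling signal families can be sketched as below. This is a simplified illustration under stated assumptions, not the code that ran on the robot: the PWM range `PWM_MAX`, the uniform distributions, and the choice to redraw amplitudes once per sinusoid period are assumptions, and the ±30 PWM baseline variation is left out for brevity. The frequencies, the 180 deg M1-M2 shift and the 36 deg M1-M3 increment every 15 seconds are taken from the description above.

```python
import math
import random

STEP_HZ = 6.0     # update rate of the discretized sinusoid (from the text)
SINE_HZ = 0.6     # sinusoid frequency (from the text)
NAIVE_HZ = 1.3    # rate at which naive step levels change (from the text)
PWM_MAX = 255     # assumption: PWM range is not stated in the text

def naive_babbling(duration_s, n_motors=3, rng=None):
    """Independent random step levels per motor, one row per step of 1 / NAIVE_HZ seconds."""
    rng = rng or random.Random(0)
    n_steps = round(duration_s * NAIVE_HZ)
    return [[rng.uniform(0, PWM_MAX) for _ in range(n_motors)] for _ in range(n_steps)]

def natural_babbling(duration_s, rng=None):
    """Half-wave rectified sinusoids for M1, M2 and M3, sampled at STEP_HZ."""
    rng = rng or random.Random(0)
    samples_per_period = round(STEP_HZ / SINE_HZ)
    amplitudes = []
    samples = []
    for i in range(round(duration_s * STEP_HZ)):
        t = i / STEP_HZ
        if i % samples_per_period == 0:
            # Assumption: amplitudes are redrawn once per sinusoid period.
            amplitudes = [rng.uniform(0, PWM_MAX) for _ in range(3)]
        # The M1-M3 phase grows by 36 degrees every 15 seconds.
        m3_phase = math.radians(36.0 * int(t // 15))
        phases = [0.0, math.pi, m3_phase]      # M2 is 180 degrees away from M1
        base = 2.0 * math.pi * SINE_HZ * t
        # Keep only the positive half of each zero-mean sinusoid.
        samples.append([max(0.0, a * math.sin(base + p)) for a, p in zip(amplitudes, phases)])
    return samples

natural = natural_babbling(120.0)   # two minutes of babbling
naive = naive_babbling(120.0)
```

Because M2 is the half-wave rectified mirror of M1, the two antagonist signals are never active at the same time in this sketch, whereas the naïve signals carry no such coupling.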
Figure 4.2: Desired joint and foot trajectories
4.3.3 Desired foot trajectory characteristics and biped support
Before hardware experiments were performed, we did a forward kinematics analysis of possible robot movement to
determine the desired joint ranges and foot trajectory to follow. This foot trajectory is such that it allows for front
and back swings of the leg to have different heights (Figure 4.2-C).
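For reference, the forward kinematics of a planar two-joint leg can be sketched as follows. The link lengths (0.2 m each), the angle conventions and the function name `foot_position` are assumptions for illustration only; the actual dimensions and conventions of the robot are not given in this section.

```python
import math

def foot_position(hip_angle, knee_angle, thigh_len=0.2, shank_len=0.2):
    """Sagittal-plane foot position (x forward, z up) relative to the hip joint.

    Angles are in radians; zero angles give a straight leg hanging down.
    Link lengths are hypothetical placeholders.
    """
    knee_x = thigh_len * math.sin(hip_angle)
    knee_z = -thigh_len * math.cos(hip_angle)
    shank_angle = hip_angle + knee_angle   # shank orientation accumulates both joints
    foot_x = knee_x + shank_len * math.sin(shank_angle)
    foot_z = knee_z - shank_len * math.cos(shank_angle)
    return foot_x, foot_z

straight = foot_position(0.0, 0.0)            # leg hanging straight down
swung = foot_position(math.radians(30), 0.0)  # hip swung forward, knee straight
```

Sweeping both angles over their ranges with such a function gives the set of reachable foot positions from which a desired trajectory can be chosen.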
As shown in Figure 4.4, we divide our experiment into three main cases determined by the level of support the
gantry provided to the biped: full, partial and minimal.
1. Full support: only in-air movement, no robot interaction with the ground. When performing movements,
the feet trajectories will be limited only by the characteristics of the biped itself (Figure 4.4-A).
2. Partial support: desired trajectories are partially reachable since they are partially under ground level. In
other words, the ground constrains the movement of the robot to stay above the boundary marked by the ground
(Fig. 4.4-B).
3. Minimal support: the desired trajectory is unreachable, as it is completely under ground level. This is the case
where the biped’s movements are most constrained. Also, for this case the area of the feasible joint configuration
space is smaller than in cases 1 or 2 (i.e., here the biped's movements are constrained to exist between
the limits imposed by the ground and the limits of joint rotation) (Figure 4.4-C).
For all the system variations described in the previous points, the desired trajectory always has the same distance
to the hip. Its position variation is achieved by reconfiguring the biped gantry to lower the biped’s hip.
4.3.4 Hardware experiments steps
The following steps were performed using both naïve and natural babbling. These steps describe our experiment.
Eight trials of this experiment were performed, four based on naïve babbling and four on natural babbling. If the
biped displaces its body mass by 40 cm, we consider this a successful walking trial.
1. Collect babbling data for two minutes (Figure 4.3). Babbling characteristics are described in Section 4.3.2.
2. Train an ANN to map motor activations to limb kinematics as described in Section 4.3.2.
3. With the biped fully supported (i.e., Section 4.3.3, suspended in air, no ground constraint), track the desired
foot trajectory (Figure 4.4-A).
4. With the biped partially supported (i.e., Section 4.3.3, in touch with the ground with its hip 40 cm off the
ground), perform trajectory tracking (Figure 4.4-B). Measure the time the biped takes to travel 40 cm if
walking is successful.
5. With the biped minimally supported (i.e., Section 4.3.3, in touch with the ground with its hip 39 cm off the
ground, Figure 4.4-C), measure the time the biped takes to travel 40 cm if walking is successful.
4.3.5 Data analysis
4.3.5.1 Sparseness
Sparseness was calculated by discretizing the area within the desired trajectory into 1 mm² squares and calculating
the ratio of squares occupied to squares unoccupied by babbling data.
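A minimal sketch of this occupancy calculation is given below. It assumes that the desired trajectory is available as a closed polygon in millimetres, that a square belongs to the area when its centre lies inside the polygon, and that the ratio is occupied squares divided by unoccupied squares, as the sentence above reads literally. The names `point_in_polygon` and `occupancy_ratio` and the example data are hypothetical.

```python
import math

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def occupancy_ratio(points, polygon, cell=1.0):
    """Ratio of occupied to unoccupied grid squares inside the polygon (units: mm)."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    x0, y0 = min(xs), min(ys)
    n_cols = math.ceil((max(xs) - x0) / cell)
    n_rows = math.ceil((max(ys) - y0) / cell)
    # Squares whose centre lies inside the trajectory outline.
    cells = {
        (c, r)
        for c in range(n_cols)
        for r in range(n_rows)
        if point_in_polygon(x0 + (c + 0.5) * cell, y0 + (r + 0.5) * cell, polygon)
    }
    occupied = {
        (math.floor((x - x0) / cell), math.floor((y - y0) / cell)) for x, y in points
    } & cells
    unoccupied = len(cells) - len(occupied)
    return len(occupied) / unoccupied if unoccupied else math.inf

# Hypothetical 10 mm x 10 mm outline with points in 20 of its 100 squares.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
points = [(c + 0.5, r + 0.5) for r in range(2) for c in range(10)] + [(50.0, 50.0)]
ratio = occupancy_ratio(points, square)   # 20 occupied squares, 80 unoccupied
```

If the intended measure is occupied squares over all squares, only the denominator in the last line of `occupancy_ratio` changes.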
4.3.5.2 Detrended Fluctuation Analysis
In Detrended Fluctuation Analysis (DFA), the fractal scaling component estimates a time series’ scaling behavior
which represents the power law scaling behavior of the time series over various time scales. The steps for DFA are
as follows:
1. First, we integrated the time series data to obtain the detrended series, which removes the trend from the data.
To detrend the series, we divided the original time series into non-overlapping windows of equal length and
then fit a polynomial function of a certain degree to each window.
2. Then we divided the detrended series into smaller segments of equal length (boxes). The scale factor deter-
mines the length of the boxes.
3. Afterward, we calculated each box’s root-mean-square fluctuation (F) of the detrended series.
4. Then we obtained the detrended series’ fluctuation by calculating the root-mean-square fluctuations’ average
across all the boxes at a given scale.
5. We repeated steps 1 to 4 for different scale factor values and plotted the average fluctuation versus the scale
factor (DFA curve).
6. Finally, we analyzed the DFA curve to check the time series data for long-term correlations. The DFA curve
shows a power-law relationship between the fluctuation and the scale factor quantified by the slope alpha
(fractal scaling component) using linear regression on a log-log scale.
A higher fractal scaling component indicates that the time series exhibits stronger long-term correlations or
persistence over various time scales. This means that the fluctuations in the time series at larger time scales are
more correlated than expected by chance and have a more pronounced or persistent trend. Conversely, a lower
fractal scaling component indicates weaker long-term correlations or anti-persistence in the time series, which
means that the fluctuations at larger time scales are less correlated and the time series has a
weaker or more variable trend.
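The procedure can be sketched as follows. This is a minimal illustration of a common first-order DFA formulation and may differ in detail from the analysis used for the results: it assumes linear (degree 1) detrending, integrates the mean-subtracted series before boxing, and averages the per-box root-mean-square values as in steps 3 and 4. The function names, box sizes and test signals are hypothetical.

```python
import math
import random

def detrended_rms(segment):
    """RMS of a segment after subtracting its least-squares straight line."""
    n = len(segment)
    mean_x = (n - 1) / 2.0
    mean_y = sum(segment) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(segment))
    slope = sxy / sxx
    residual_sq = sum((y - mean_y - slope * (i - mean_x)) ** 2 for i, y in enumerate(segment))
    return math.sqrt(residual_sq / n)

def dfa_fluctuation(series, box):
    """Average detrended RMS over non-overlapping boxes of the integrated series."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for value in series:
        total += value - mean            # integrate the mean-subtracted series
        profile.append(total)
    n_boxes = len(profile) // box
    return sum(detrended_rms(profile[b * box:(b + 1) * box]) for b in range(n_boxes)) / n_boxes

def dfa_alpha(series, boxes=(16, 32, 64, 128, 256)):
    """Slope of log F(box) against log box, from a least-squares line fit."""
    xs = [math.log(b) for b in boxes]
    ys = [math.log(dfa_fluctuation(series, b)) for b in boxes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

rng = random.Random(1)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
random_walk, position = [], 0.0
for step in white_noise:
    position += step
    random_walk.append(position)

alpha_noise = dfa_alpha(white_noise)   # uncorrelated series: theory gives about 0.5
alpha_walk = dfa_alpha(random_walk)    # strongly persistent series: theory gives about 1.5
```

The two synthetic signals illustrate the interpretation given above: the persistent random walk yields a clearly larger slope than the uncorrelated noise.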
4.4 Results
As mentioned before, all the trials presented here are based on two minutes of babbling data (Figure 4.3). When a
result is reported as “mean”, it is the average value from four trials. For the mean values of sparseness, Hurst exponent
analysis and Poincaré analysis, the number of values considered for this calculation is eight (left and right legs for
each of the four trials, eight in total).
As mentioned above, we consider a trial successful when walking is produced and at least 40 cm are travelled.
Success rate is the number of successful trials of a particular kind divided by the number of performed trials
of that kind. Please refer to the Methods for a detailed description of the fully, partially and
minimally supported cases (i.e., none, half or all of the desired endpoint trajectory below the floor, respectively).
Figure 4.3: Two minutes of babbling data and desired trajectories. A: Joint Space, B: Endpoint Space
4.4.1 Exploiting limb mechanical properties reduces sparsity of training data and
increases success rate of locomotion learning
All results reported in this subsection correspond to babbling data and walking attempts for the partially supported
case (Fig. 4.4-B).
Two minutes of natural babbling data are enough to produce locomotion, while two minutes of naïve babbling data are
not enough (Fig. 4.4-B). With natural babbling for the partially supported case, G2P learned walking in 75% of the cases,
compared to 0% for naïve babbling cases. Mean displacement speed for successful natural-babbling-based cases was
1.9 cm/sec.
The difference, as previously described, between the naïve and natural cases resides in the babbling data. Sparser
babbling data (i.e., naïve babbling data as shown in Figure 4.3) is correlated with less successful walking trials (i.e.,
naïve babbling cases produce slower walking than natural cases, and in the partially supported scenario
naïve babbling does not achieve walking at all).
As shown in Figure 4.3, natural babbling data is closer to the regions of the configuration space where locomotion
solutions lie. If we analyze the sparseness of this data within the area delimited by a desired trajectory, we see that
the natural babbling data is less sparse than the naïve babbling data, i.e., it occupies a larger share of the squares
defined in Section 4.3.5.1. For the trial presented in Figure 4.3, the left and right leg sparseness values of the naïve
babbling data are 0.073 and 0.091 respectively; the left and right leg sparseness values of the natural babbling data are
0.73 and 0.52 respectively. Mean sparseness values for naïve and natural babbling data
respectively are 0.235 and 0.7.
In [3] it is described how a model, to be able to describe a system and to accurately predict its behavior, needs
to be trained with more training samples spanning the entire range of possible values such samples can
have. Using the example of randomly generated points within a two-dimensional square that delimits a 2D pattern,
51% of the points will lie “inside” the square. Lying “inside”, for the purpose of this explanation, means that the points
are at least 10% of the distance away from any of the edges of the figure. The points inside this square will be able to
describe the characteristics of the patterns inside such a square.
This can be related to our experiments, where most of the naïve babbling points lie away from the desired trajectory
and few lie inside it. Following the above explanation, these points will potentially fail to train a model that can
accurately predict the behavior inside the desired trajectory. In this case, the behavior is the motor commands to
pull on the tendons to produce cyclical movements that are close to the desired trajectory. This is seen in Figure 4.4-A
where the blue trajectories based on a model trained with naïve babbling data fail to closely resemble the desired
trajectory. In contrast natural babbling points, which lie inside of the desired trajectory are better to train a model
which can predict the motor activations required to produce cyclical foot trajectory patterns that better resemble the
desired trajectory. This is seen in Figure 4.4-A where the green trajectories based on a model trained with natural
babbling data better resemble the desired trajectory compared to the case of the naïve babbling based experiments.
In the experiments presented in this work our system has two joints (i.e., two degrees of freedom in the joint
space). The proposed natural babbling approach is potentially more beneficial when the number of joints increases or
when the dimensions of the system increase. The explanation in [3] implies that the proportion of randomly generated
points lying inside squares, cubes and figures in progressively higher-dimensional spaces is progressively reduced
(i.e., the proportion of inside points is 0.51, 0.40, 0.32, 0.26, 0.20, 0.18 and 0.13, respectively, for the
2-dimensional to 8-dimensional cases).
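The shrinking share of “inside” points with growing dimensionality can be illustrated with a minimal sketch. It assumes a unit hypercube, uniformly sampled points, and that “inside” means farther than 10% of the side length from every face; the function names are hypothetical. Under that reading the fraction is (1 - 2 x 0.1)^d, so the values it prints depend on the assumed margin definition and are not claimed to reproduce the proportions quoted above.

```python
import random


def inside_fraction_exact(dims, margin=0.1):
    """Fraction of a unit hypercube that is at least `margin` away from every face.

    Assumption: `margin` is expressed as a fraction of the side length.
    """
    return (1.0 - 2.0 * margin) ** dims


def inside_fraction_sampled(dims, margin=0.1, n_points=20000, seed=0):
    """Monte Carlo estimate of the same fraction using uniform random points."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        point = [rng.random() for _ in range(dims)]
        # A point is "inside" only if every coordinate keeps the margin.
        if all(margin <= x <= 1.0 - margin for x in point):
            inside += 1
    return inside / n_points


if __name__ == "__main__":
    for d in range(2, 9):
        exact = inside_fraction_exact(d)
        sampled = inside_fraction_sampled(d)
        print(f"{d} dimensions: exact={exact:.3f}, sampled={sampled:.3f}")
```

The point of the sketch is only the trend: whatever the exact margin, the inside fraction decays geometrically with the number of dimensions, so uniformly random samples become less and less likely to fall where the desired behavior lies.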
Figure 4.4: Plots of obtained and desired foot trajectories shown together with close-ups of the biped feet in
full (A), partial (B) and minimal (C) support cases
4.4.2 Removing support increases walking success rate and correlates to faster walking
With a minimally supported biped (Fig. 4.4-B), G2P learned supported bipedal walking in 100% of the trials based
on both naïve and natural babbling (mean displacement speeds for these cases were 2.23 cm/sec and 4.99 cm/sec,
respectively). When going from the partially to the minimally supported case, mean speed for trials based on natural
babbling increased by 262%. For the same trials, going to the minimally supported scenario increased the success rate
from 75% to 100%. For the trials based on naïve babbling, the success rate increased from 0% to 100%.
In the partially supported case, the biped can only barely touch the ground with fully straight legs, reducing the
work that the legs produce to only the swing of the hip. In contrast, in the minimally supported case, the biped
can produce work with both hip swing and knee flexion (Figure 4.4).
Compared to the in-air performance of the biped (fully supported case), in the partially supported case, when
the biped touches the ground, the scaling behavior in both versions drops. This shows, as expected, that following
the trend on the ground is more complicated for the biped than in the in-air condition. By choosing more challenging
constraints for the biped in the minimally supported case, the naïve version shows significantly higher scaling
components (p approximately 0.03) than in the partially supported case, indicating more persistent locomotion in the
minimally supported case. On the other hand, in the minimally supported case, the natural version did not have a
significantly different scaling component (p approximately 0.22) from the partially supported case; however, there is
less variance from trial to trial compared to the partially supported case, which shows more robustness in the
performance for the minimally supported case (Figure 4.5).
Figure 4.5: Box plot of the fractal scaling components (FSC) for naïve (blue) and natural (red)
versions for eight different trials (four trials for both the right and left legs). When the FSC value is low,
meaning that persistence is compromised (partially supported cases), the biped can sometimes generate
locomotion, while presenting slow speeds of displacement compared to the cases where persistence is
higher or presents less dispersion (minimally supported cases).
The improvement observed from partially to minimally supported scenarios shows faster walking for cases based
on natural babbling. The reason for this is that for cases based on natural babbling, walking has already emerged for
the partially supported scenario, whereas for cases based on naïve babbling walking first emerges with the reduction
of the support to the system. In other words, both naïve and natural cases present an improvement when support is
removed, but naïve cases have had less improvement after locomotion emerges than cases based on natural babbling.
4.5 Discussion
This paper aims to motivate the creation of backdrivable biped robots that perform data-driven model-free locomotion
learning guided by the dynamics of the plant, thus potentially improving the learning rate and performance of
bipedal locomotion. As a future direction and a further improvement of the G2P algorithm, the techniques presented
in this paper could evolve to incorporate approaches that include the calculation of parameters useful to maintain
a balanced gait. Zero moment point (ZMP) calculations [110] are a common example of an approach that has successfully
achieved quasi-static balanced gaits. Another more recent approach that focuses on dynamic walking is
the hybrid zero dynamics (HZD) paradigm [6], which considers the transition between the leg dynamics in aerial and
ground stages. Even though these techniques are not necessary for the performance of our robot, in general they
are potential options to enhance the capabilities of G2P.
A fundamental aspect of our studies is that we exploit the mechanical properties of the limbs to produce
walking. In [12] the anatomical properties of a bioinspired limb are exploited by using recurring muscle patterns
(i.e., muscle synergies) to simplify the task of producing limb movements. In our previous work [68], based on motor
babbling (random step-like motor activations to explore limb capabilities), activation patterns to produce functional
movements with tendon-driven limbs were found. Here we take this work a step further by exploiting the fact that
some of the actuators of our robot act antagonistically. We take inspiration from biological systems where muscle
spindles on a contracting (i.e., homonymous) muscle generate signals that inhibit the action of an opposing muscle
(i.e., afferent fibers of muscle spindles bifurcate to innervate both the alpha motor neuron that causes a muscle to
contract and the inhibitory spinal cord interneurons that cause its antagonist muscle to relax [87]). We then modified
the randomness of motor activations by including the rule that the activation levels of two antagonist motors should
be significantly different. We see that while there is randomness in the actions produced for motor babbling, when
the activations to produce such actions i) allow a gradual transition between states and ii) exploit limb anatomy (i.e.,
consider the antagonistic nature of actuators in tendon-driven structures), learning of desired actions is more robust.
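The antagonist rule described above can be sketched as a constraint on random sampling. This is a minimal illustration under stated assumptions, not the implementation used in this work: activations are normalized to [0, 1], “significantly different” is modeled as a minimum absolute difference `min_diff` (a hypothetical threshold), antagonist pairs do not share motors, and the gradual transition between states is not modeled. All names are hypothetical.

```python
import random


def naive_babbling(n_steps, n_motors, rng):
    """Independent uniform activations in [0, 1] for every motor at every step."""
    return [[rng.random() for _ in range(n_motors)] for _ in range(n_steps)]


def natural_babbling(n_steps, n_motors, antagonist_pairs, rng, min_diff=0.5):
    """Uniform activations, resampled until each antagonist pair differs enough.

    Assumptions: pairs are disjoint (no motor appears in two pairs) and
    min_diff is well below 1 so that resampling terminates quickly.
    """
    samples = []
    for _ in range(n_steps):
        activations = [rng.random() for _ in range(n_motors)]
        for i, j in antagonist_pairs:
            # Reject co-contraction: redraw the pair until the levels differ.
            while abs(activations[i] - activations[j]) < min_diff:
                activations[i] = rng.random()
                activations[j] = rng.random()
        samples.append(activations)
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = [(0, 1), (2, 3)]  # hypothetical motor indexing
    for step in natural_babbling(3, 4, pairs, rng):
        print([round(a, 2) for a in step])
```

Rejection sampling keeps each motor's activation random while excluding the region where both antagonists pull with similar effort, which is the region associated with static, co-contracted joints.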
Walking based on natural babbling data consistently produced higher displacement speeds than walking based
on naïve babbling data. Fully explaining this is one of the potential research routes that could be followed after these
studies. One hypothesis that could provide part of the explanation for this behavior is that, instead of the
leg joints being static due to muscle co-contraction, antagonist muscle inhibition produces longer movement periods
and shorter static periods during babbling, which is a way to produce longer periods of endpoint torque production
and shorter periods of no torque production. Learned actions based on this data will consequently also have more
alternating muscle actions, which theoretically should produce higher torques at the endpoint and thus higher
biped displacement speeds. We offer this as a possible explanation for higher displacement speeds, but we are aware
that further analysis in this area is needed: even though the presented rationale helps explain higher torque
production by a single joint, it is also necessary to brake (stop or stall) the movement of some joints in a multi-DoF
system in order to avoid limb slackness and to propagate the action of a proximal joint to a distal end effector.
This point could easily develop into creating a model that explains the observed behaviour. This is an opportunity
to invite the reader to think about one of the advantages that learning techniques like this offer: being able to
control a system without fully modeling it.
A central aspect of our results is that, thanks to its backdrivable limbs, our robot's movements directly depend on
the level of support given to the biped. Here we highlight how the success of the resultant action does not depend
on reducing its error (i.e., in terms of movement patterns of the feet, the mean square distance between obtained
trajectories and desired ones). Our study relates to the adaptive behavior observed in animals, where the success of
learned actions relies on the emergence of a useful brain-body-environment interaction [16]. In our studies, first the
“brain” heuristically learns a motor-activations-to-limb-kinematics map, and then the “body” (thanks to its
backdrivability) admits mechanical inputs from “environment” constraints that depend on the level of support provided
to the biped. By removing support we observe how the feasible action space of the legs is reduced, guiding the plant
to a better execution of the walking task; in many cases this is what permits the realization of such a task.
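The error measure mentioned above, the mean square distance between obtained and desired foot trajectories, can be written as a short sketch. It assumes both trajectories are sampled at the same instants as lists of planar (x, y) points, so that points are compared index by index; the function name and this pairing are assumptions for illustration only.

```python
def mean_square_distance(obtained, desired):
    """Mean of squared Euclidean distances between paired 2D trajectory points.

    Assumption: obtained[k] and desired[k] correspond to the same time sample.
    """
    if len(obtained) != len(desired) or not obtained:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = 0.0
    for (x_o, y_o), (x_d, y_d) in zip(obtained, desired):
        total += (x_o - x_d) ** 2 + (y_o - y_d) ** 2
    return total / len(obtained)


if __name__ == "__main__":
    desired = [(0.0, 0.0), (1.0, 1.0)]
    obtained = [(0.0, 0.0), (1.0, 2.0)]
    print(mean_square_distance(obtained, desired))
```

A measure of this kind quantifies how closely a trajectory is tracked, which is separate from whether the resulting movement actually displaces the biped.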
All in all, in this chapter we first focused on obtaining more “natural” babbling data with movements similar
to the ones the biped will need to execute when performing a walking action. As shown in our results, this natural
data facilitates learning of walking. We show that natural babbling is more informative about the motor activation
patterns that produce desired limb behaviours. We have, in other words, made babbling data useful for the production
of tasks of a particular sort. This could at first be interpreted as a double-edged sword: specialization, and thus
better performance in a particular task, is a potential outcome, but this arguably compromises generalizability. By
changing the mechanical support level of the system and not relying on an algorithm to compensate for the
environmental changes, we show how it is imperative to improve the performance of a robot with techniques beyond
the algorithmic ones. The experiments in this chapter show some benefits of mechanical computation (where the
mechanical structure of the system, through its backdrivability, admits both command and environmental inputs),
which allows for generalizability. Consequently, the non-algorithm-driven emergence of actions that depart from the
initially desired ones is needed to ensure successful plant (robot) performance.
Chapter 5
Conclusions
As we highlight in [57], machines to be deployed in unpredictable environments face challenges closer to those
that living organisms face (i.e., learning without preprogrammed models of themselves or the environment, and adapting
to changes in the environment and their own bodies). These machines, in many cases robots, need to be able to learn
new skills while always subjected to a rarely linear and usually dynamical body-environment interaction. By taking
inspiration from biology, where organisms are multisensory agents, sensory fusion has been an important aspect to
consider for the creation of robots of the type described above. As explained in [30], using information from multiple
sensors allows systems to still acquire data in case of partial damage, increase the time and space coverage of data,
reduce the uncertainty in the accuracy of data provided by a single sensor and, in general, obtain better measurement
resolution. As mentioned in all the projects in this thesis, data to enable learning and proprioception is expensive
and valuable, so acquiring it in non-naïve ways is crucial. In other words, in this thesis my aim has been to show
some examples of how to exploit properties of bipeds to enhance data relevance to perform state estimations and
learning of tasks.
As a central tool for my studies I designed and built bioinspired tendon-driven and skin-wrapped bipedal structures
and robots. I first undertook the creation of a physical model of a bird to show the role that proprioceptive
signals from the vestibular system and lumbosacral organs in birds could have in the prediction of body states useful
for balance. In that study we highlighted that other signals coming from joints, skin and muscles could also contribute
to foot state estimates. This was one motivation to expand my research into exploring the potential impact of skin
in the prediction of the center of pressure parameters useful for bipedal balance. Lastly, I explored ways to exploit
fundamental properties of tendon-driven limbs (i.e., efferent antagonist muscle activations paired to their afferent
signaling) to improve learning of complex actions like walking.
5.1 Applications and limitations of the presented work
Some future applications of my work could be:
1. The better placement of sensors to more accurately estimate body states useful for balance in bipedal robots
with infinite-degrees-of-freedom structures (e.g., a bird's neck, which can position the head relative to the center
of mass of the body in far more positions than the neck of a human). As robots become more complex
and more degrees of freedom separate their head from their hip (like in birds), it makes more sense to have
a sensor close to the center of mass (COM) of the robot, since the COM's behaviour is the one that best
resembles the behavior of the whole system. In other words, if the system were simplified to a point in space,
that point would be represented by the COM.
2. The usage of artificial skins to distally estimate parameters like the center of pressure (CoP). Placing sensors
on a visco-elastic structure like a skin comes with manufacturing challenges but also with advantages like
the reduction of sensor wear due to the impulse forces intrinsic to foot-ground interaction. A skin can also serve
as a flexible and light protective coat for inner components like electronics. The existence of such a coat opens
the possibility of acquiring sensory information useful for the estimation of body states, similarly to the example
provided in this thesis (Chapter 3). For robots deployed in “real world” scenarios (in other words, dynamic,
unstructured and unknown scenarios), skin afferent information should not be used alone to estimate the CoP.
Given a foot position, skin strain signals alone wouldn’t be able to provide signals to differentiate between
ground contact and non-ground contact scenarios. The CoP measurement in this thesis depends on the position
of the feet, which only depends on the position of the biped kept by external forces. Strains in skin
are only sufficient to estimate the CoP under the assumption that the foot or feet used for this estimation
are placed flat on the ground. As seen in biology, and described in Chapter 3, skin provides afferent information
useful to estimate body states (like joint positions), but proprioception depends on more than a single data
source, hence the need to build systems with complementary sensory systems to achieve reliable state
estimation.
3. To develop data sampling techniques that exploit the mechanical properties of a robot to create better and more
robust models for locomotion, especially when acquiring data in hardware is necessary and when prolonged
periods of data acquisition are not feasible.
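As background for item 2 above, the CoP is commonly defined as the force-weighted centroid of the normal contact forces under the foot. The following minimal sketch assumes discrete force sensors at known planar positions on a flat foot sole; the sensor layout, function name and contact threshold are hypothetical and do not describe the skin-strain setup used in this thesis. It also makes explicit why ground contact must be known before a CoP value is meaningful.

```python
def center_of_pressure(sensor_positions, normal_forces, contact_threshold=1e-6):
    """Force-weighted centroid of planar sensor positions.

    Returns None when the total normal force does not exceed the threshold,
    i.e. when the foot is assumed not to be in contact with the ground.
    """
    if len(sensor_positions) != len(normal_forces):
        raise ValueError("one force value is required per sensor position")
    total_force = sum(normal_forces)
    if total_force <= contact_threshold:
        return None  # CoP is undefined without ground contact
    x = sum(p[0] * f for p, f in zip(sensor_positions, normal_forces)) / total_force
    y = sum(p[1] * f for p, f in zip(sensor_positions, normal_forces)) / total_force
    return (x, y)


if __name__ == "__main__":
    positions = [(0.0, 0.0), (2.0, 0.0)]  # hypothetical heel and toe sensors
    print(center_of_pressure(positions, [1.0, 1.0]))
    print(center_of_pressure(positions, [0.0, 0.0]))
```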
5.2 Immediate follow-up work and recommendations
1. The work presented in Chapter 2 can be directly connected to the work in Chapter 4. This will potentially
enable the creation of a robot that performs bioinspired balance as well as learning of cyclical movement
generation (two important aspects of a walking agent). Currently the work presented in Chapter 4 is applied
to a biped that is held by a gantry that prevents it from falling down. Even though this has allowed us to show
the advantages of a novel strategy to learn locomotion by exploiting the mechanical properties of a
robot, we are aware that this system would only be ready for deployment in "real world" environments when
it becomes independent of the gantry's help.
2. The bipedal robot in Chapter 4 is able to displace its own body mass thanks to its robust design. In fact, it is the
first physical system that uses the G2P algorithm that can perform such a task. Many groups don’t give enough value
to the importance of rigorous mechanical design and manufacturing. The work done in this thesis shows that
brain-body-environment interaction is fundamental to the learning and performance of useful actions. Much
work and attention needs to be devoted to each one of these areas throughout the whole life of a project.
5.3 All in all
With this thesis I show that bioinspired robotic components, compared to traditional rigid mechanisms, allow for a
more dynamic environment-robot interaction and improve the acquisition of more informative signals to estimate
body states useful for locomotion.
In general, with my studies I show that the architecture and mechanical properties of a body should not be
neglected, but exploited to improve body-environment interactions. Yes, machine learning techniques have been
shown to be powerful heuristic approaches to create models of legged structures, and reinforcement
learning in particular has improved data collection techniques to acquire more relevant data faster. Still, as shown
in the chapters of this thesis, there is a need to improve locomotion learning processes by letting the mechanics
drive the behaviour of the system. Like in biology, I want to motivate not only a "brain-body" biased interaction,
but also a "body-brain" one.
Bibliography
[1] Anick Abourachid, Rémi Hackert, Marc Herbin, Paul A Libourel, François Lambert,
Henri Gioanni, Pauline Provini, Pierre Blazevic, and Vincent Hugel. “Bird terrestrial locomotion
as revealed by 3D kinematics”. In: Zoology 114.6 (2011), pp. 360–368.
[2] Karen E Adolph, Whitney G Cole, Meghana Komati, Jessie S Garciaguirre, Daryaneh Badaly,
Jesse M Lingeman, Gladys LY Chan, and Rachel B Sotsky. “How do you learn to walk? Thousands
of steps and dozens of falls per day”. In: Psychological science 23.11 (2012), pp. 1387–1394.
[3] Charu C Aggarwal, Alexander Hinneburg, and Daniel A Keim. “On the surprising behavior of
distance metrics in high dimensional space”. In: Database Theory—ICDT 2001: 8th International
Conference London, UK, January 4–6, 2001 Proceedings 8. Springer. 2001, pp. 420–434.
[4] Jean-Marc Aimonetti, Valérie Hospod, Jean-Pierre Roll, and Edith Ribot-Ciscar. “Cutaneous
afferents provide a neuronal population vector that encodes the orientation of human ankle
movements”. In: The Journal of physiology 580.2 (2007), pp. 649–658.
[5] Aaron D Ames, Eric A Cousineau, and Matthew J Powell. “Dynamically stable bipedal robotic
walking with NAO via human-inspired hybrid zero dynamics”. In: Proceedings of the 15th ACM
international conference on Hybrid Systems: Computation and Control. 2012, pp. 135–144.
[6] Aaron D Ames and Ioannis Poulakakis. “Hybrid zero dynamics control of legged robots”. In:
(2018).
[7] Alexander S Aruin and Mark L Latash. “The role of motor action in anticipatory postural
adjustments studied with self-induced and externally triggered perturbations”. In: Experimental
Brain Research 106.2 (1995), pp. 291–300.
[8] Mohsen Attaran and Promita Deb. “Machine learning: the new ‘big thing’ for competitive
advantage”. In: International Journal of Knowledge Engineering and Data Mining 5.4 (2018),
pp. 277–305.
[9] Alexander Badri-Spröwitz, Alborz Aghamaleki Sarvestani, Metin Sitti, and Monica A Daley.
“BirdBot achieves energy-efficient gait with minimal control using avian-inspired leg clutching”.
In: Science Robotics 7.64 (2022), eabg4055.
[10] Bruce P Bean. “The action potential in mammalian central neurons”. In: Nature Reviews
Neuroscience 8.6 (2007), pp. 451–465.
[11] Brian J Benda, Patrick O Riley, and David E Krebs. “Biomechanical relationship between center of
gravity and center of pressure during standing”. In: IEEE Transactions on Rehabilitation
Engineering 2.1 (1994), pp. 3–10.
[12] Max Berniker, Anthony Jarc, Emilio Bizzi, and Matthew C Tresch. “Simplified and effective motor
control based on muscle synergies to exploit musculoskeletal dynamics”. In: Proceedings of the
National Academy of Sciences 106.18 (2009), pp. 7601–7606.
[13] Marguerite Biederman-Thorson and John Thorson. “Rotation-compensating reflexes independent
of the labyrinth and the eye”. In: Journal of comparative Physiology 83.2 (1973), pp. 103–122.
[14] Jeffrey T Bingham, Julia T Choi, and Lena H Ting. “Stability in a frontal plane model of balance
requires coupled changes to postural configuration and neural feedback control”. In: Journal of
neurophysiology 106.1 (2011), pp. 437–448.
[15] Angelo Cangelosi, Josh Bongard, Martin H Fischer, and Stefano Nolfi. “Embodied intelligence”.
In: Springer handbook of computational intelligence (2015), pp. 697–714.
[16] Hillel J Chiel and Randall D Beer. “The brain has a body: adaptive behavior emerges from
interactions of nervous system, body and environment”. In: Trends in neurosciences 20.12 (1997),
pp. 553–557.
[17] Kelly J Cole and James H Abbs. “Grip force adjustments evoked by load force perturbations of a
grasped object”. In: Journal of neurophysiology 60.4 (1988), pp. 1513–1522.
[18] David F Collins, Kathryn M Refshauge, Gabrielle Todd, and Simon C Gandevia. “Cutaneous
receptors contribute to kinesthesia at the index finger, elbow, and knee”. In: Journal of
neurophysiology 94.3 (2005), pp. 1699–1706.
[19] Steve Collins, Andy Ruina, Russ Tedrake, and Martijn Wisse. “Efficient bipedal robots based on
passive-dynamic walkers”. In: Science 307.5712 (2005), pp. 1082–1085.
[20] Heather Culbertson, Samuel B Schorr, and Allison M Okamura. “Haptics: The present and future
of artificial touch sensation”. In: Annual Review of Control, Robotics, and Autonomous Systems 1
(2018), pp. 385–409.
[21] Monica A Daley, G Felix, and Andrew A Biewener. “Running stability is enhanced by a
proximo-distal gradient in joint neuromechanical control”. In: Journal of Experimental Biology
210.3 (2007), pp. 383–394.
[22] BL Day, CD Marsden, JA Obeso, and JC Rothwell. “Reciprocal inhibition between the muscles of
the human forearm.” In: The Journal of physiology 349.1 (1984), pp. 519–534.
[23] Jesse C Dean, Neil B Alexander, and Arthur D Kuo. “The effect of lateral stabilization on walking
in young and old adults”. In: IEEE Transactions on Biomedical Engineering 54.11 (2007),
pp. 1919–1926.
[24] Andrei Drăgulinescu, Ana-Maria Drăgulinescu, Gabriela Zincă, Doina Bucur, Valentin Feieș, and
Dumitru-Marius Neagu. “Smart socks and in-shoe systems: State-of-the-art for two popular
technologies for foot motion analysis, sports, and medical applications”. In: Sensors 20.15 (2020),
p. 4316.
[25] Benoni B Edin. “Quantitative analyses of dynamic strain sensitivity in human skin
mechanoreceptors”. In: Journal of neurophysiology 92.6 (2004), pp. 3233–3243.
[26] Benoni B Edin and Niclas Johansson. “Skin strain patterns provide kinaesthetic information to the
human central nervous system.” In: The Journal of physiology 487.1 (1995), pp. 243–251.
[27] Bradley Efron and Robert J Tibshirani. An introduction to the bootstrap. CRC press, 1994.
[28] Anne Lill Eide. “The axonal projections of the Hofmann nuclei in the spinal cord of the late stage
chicken embryo”. In: Anatomy and embryology 193.6 (1996), pp. 543–557.
[29] Anne Lill Eide and Joel C Glover. “Development of an identified spinal commissural interneuron
population in an amniote: neurons of the avian Hofmann nuclei”. In: Journal of Neuroscience 16.18
(1996), pp. 5749–5761.
[30] Wilfried Elmenreich. “An introduction to sensor fusion”. In: Vienna University of Technology,
Austria 502 (2002), pp. 1–28.
[31] Kemalettin Erbatur, Akihiro Okazaki, Keisuke Obiya, Taro Takahashi, and Atsuo Kawamura. “A
study on the zero moment point measurement for biped walking robots”. In: 7th International
Workshop on Advanced Motion Control. Proceedings (Cat. No. 02TH8623). IEEE. 2002, pp. 431–436.
[32] A Aldo Faisal, Luc PJ Selen, and Daniel M Wolpert. “Noise in the nervous system”. In: Nature
reviews neuroscience 9.4 (2008), pp. 292–303.
[33] Michael S Fine and Kurt A Thoroughman. “Trial-by-trial transformation of error into
sensorimotor adaptation changes with environmental dynamics”. In: Journal of neurophysiology
98.3 (2007), pp. 1392–1404.
[34] Michel Fliess and Cédric Join. “Model-free control”. In: International Journal of Control 86.12
(2013), pp. 2228–2252.
[35] Michele Folgheraiter, Alikhan Yessaly, Galym Kaliyev, Asset Yskak, Sharafatdin Yessirkepov,
Artemiy Oleinikov, and Giuseppina Gini. “Computational efficient balance control for a
lightweight biped robot with sensor based zmp estimation”. In: 2018 IEEE-RAS 18th International
Conference on Humanoid Robots (Humanoids). IEEE. 2018, pp. 232–237.
[36] W Otto Friesen. “Reciprocal inhibition: a mechanism underlying oscillatory animal movements”.
In: Neuroscience & Biobehavioral Reviews 18.4 (1994), pp. 547–553.
[37] SM Gatesy and AA Biewener. “Bipedal locomotion: effects of speed, size and limb posture in birds
and humans”. In: Journal of Zoology 224.1 (1991), pp. 127–147.
[38] Stephen M Gatesy and Kenneth P Dial. “Locomotor modules and the evolution of avian flight”. In:
Evolution 50.1 (1996), pp. 331–340.
[39] Arjan Gijsberts and Giorgio Metta. “Real-time model learning using incremental sparse spectrum
gaussian process regression”. In: Neural networks 41 (2013), pp. 59–69.
[40] Joanne C Gordon, Jeffery W Rankin, and Monica A Daley. “How do treadmill speed and terrain
visibility influence neuromuscular control of guinea fowl locomotion?” In: Journal of
Experimental Biology 218.19 (2015), pp. 3010–3022.
[41] Keqin Gu, Jie Chen, and Vladimir L Kharitonov. Stability of time-delay systems. Springer Science
& Business Media, 2003.
[42] Daniel A Hagen, Ali Marjaninejad, Gerald E Loeb, and Francisco J Valero-Cuevas. “insideOut: A
Bio-Inspired Machine Learning Approach to Estimating Posture in Robots Driven by Compliant
Tendons”. In: Frontiers in Neurorobotics 15 (2021), p. 679122.
[43] Daniel A Hagen, Ali Marjaninejad, and Francisco J Valero-Cuevas. “A Bio-Inspired Framework
for Joint Angle Estimation from Non-Collocated Sensors in Tendon-driven Systems”. In: 2020
IEEE International Conference on Intelligent Robots and Systems (IROS). IEEE. 2020.
[44] Bert Haverkamp. “Subspace method identification, theory and practice”. PhD thesis.
TU Delft, Delft, The Netherlands, 2000.
[45] Bert Haverkamp and Michel Verhaegen. “SMI Toolbox: State space Model Identification software
for multivariable dynamical systems”. In: Delft University of Technology, Delft, The Netherlands
(1997).
[46] Yunpeng He, Chuanzhi Zang, Peng Zeng, Qingwei Dong, Ding Liu, and Yuqi Liu. “Convolutional
shrinkage neural networks based model-agnostic meta-learning for few-shot learning”. In: Neural
Processing Letters (2022), pp. 1–14.
[47] Stephen Ho and Michael J O’Donovan. “Regionalization and intersegmental coordination of
rhythm-generating networks in the spinal cord of the chick embryo”. In: Journal of Neuroscience
13.4 (1993), pp. 1354–1371.
[48] Philip Holmes, Robert J Full, Dan Koditschek, and John Guckenheimer. “The dynamics of legged
locomotion: Models, analyses, and challenges”. In: SIAM review 48.2 (2006), pp. 207–304.
[49] Erika E Howe, Adam J Toth, Lori Ann Vallis, and Leah R Bent. “Baseline skin information from
the foot dorsum is used to control lower limb kinematics during level walking”. In: Experimental
brain research 233.8 (2015), pp. 2477–2487.
[50] Richard D Jacobson and M Hollyday. “Electrically evoked walking and fictive locomotion in the
chick.” In: Journal of Neurophysiology 48.1 (1982), pp. 257–270.
[51] Kian Jalaleddini, Chuanxin Minos Niu, Suraj Chakravarthi Raja, Won Joon Sohn, Gerald E Loeb,
Terence D Sanger, and Francisco J Valero-Cuevas. “Neuromorphic meets neuromechanics, part II:
the role of fusimotor drive”. In: Journal of neural engineering 14.2 (2017), p. 025002.
[52] Kian Jalaleddini, Ehsan Sobhani Tehrani, and Robert E Kearney. “A subspace approach to the
structural decomposition and identification of ankle joint dynamic stiffness”. In: IEEE transactions
on biomedical engineering 64.6 (2017), pp. 1357–1368.
[53] Henrik Jörntell and Carl-Fredrik Ekerot. “Topographical organization of projections to cat motor
cortex from nucleus interpositus anterior and forelimb skin”. In: The Journal of Physiology 514.2
(1999), pp. 551–566.
[54] Rudolph Emil Kalman et al. “A new approach to linear filtering and prediction problems”. In:
Journal of basic Engineering 82.1 (1960), pp. 35–45.
[55] Daekyum Kim, Sang-Hun Kim, Taekyoung Kim, Brian Byunghyun Kang, Minhyuk Lee,
Wookeun Park, Subyeong Ku, DongWook Kim, Junghan Kwon, Hochang Lee, et al. “Review of
machine learning methods in soft robotics”. In: Plos one 16.2 (2021), e0246102.
[56] Dinant A Kistemaker, Arthur J Knoek Van Soest, Jeremy D Wong, Isaac Kurtzer, and
Paul L Gribble. “Control of position and movement is simplified by combined muscle spindle and
Golgi tendon organ feedback”. In: Journal of neurophysiology 109.4 (2013), pp. 1126–1139.
[57] Dhireesha Kudithipudi, Mario Aguilar-Simon, Jonathan Babb, Maxim Bazhenov,
Douglas Blackiston, Josh Bongard, Andrew P Brna, Suraj Chakravarthi Raja, Nick Cheney,
Jeff Clune, et al. “Biological underpinnings for lifelong learning machines”. In: Nature Machine
Intelligence 4.3 (2022), pp. 196–210.
[58] Robert Kwiatkowski and Hod Lipson. “Task-agnostic self-modeling machines”. In: Science
Robotics 4.26 (2019), eaau9354.
[59] J Richard Landis and Gary G Koch. “The measurement of observer agreement for categorical
data”. In: biometrics (1977), pp. 159–174.
[60] Emily L Lawrence, Guilherme M Cesar, Martha R Bromfield, Richard Peterson,
Francisco J Valero-Cuevas, and Susan M Sigward. “Strength, multijoint coordination, and
sensorimotor processing are independent contributors to overall balance ability”. In: BioMed
research international 2015 (2015).
[61] Emily L Lawrence, Sudarshan Dayanidhi, Isabella Fassola, Philip Requejo, Caroline Leclercq,
Carolee J Winstein, and Francisco J Valero-Cuevas. “Outcome measures for hand function
naturally reveal three latent domains in older adults: strength, coordinated upper extremity
function, and sensorimotor processing”. In: Frontiers in aging neuroscience 7 (2015).
[62] Zhibin Li, Bram Vanderborght, Nikos G Tsagarakis, Luca Colasanto, and Darwin G Caldwell.
“Stabilization for the compliant humanoid robot COMAN exploiting intrinsic and controlled
compliance”. In: 2012 IEEE International Conference on Robotics and Automation. IEEE. 2012,
pp. 2000–2006.
[63] Hod Lipson and Jordan B Pollack. “Automatic design and manufacture of robotic lifeforms”. In:
Nature 406.6799 (2000), pp. 974–978.
[64] Jinsu Liu and Manuela Veloso. “Online ZMP sampling search for biped walking planning”. In:
2008 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE. 2008, pp. 185–190.
[65] Gerald E Loeb. “Optimal isn’t good enough”. In: Biological cybernetics 106.11-12 (2012),
pp. 757–765.
[66] Catherine R Lowrey, Nick DJ Strzalkowski, and Leah R Bent. “Skin sensory information from the
dorsum of the foot and ankle is necessary for kinesthesia at the ankle joint”. In: Neuroscience
letters 485.1 (2010), pp. 6–10.
[67] Poramate Manoonpong, Tao Geng, Tomas Kulvicius, Bernd Porr, and Florentin Wörgötter.
“Adaptive, fast walking in a biped robot under neuronal control and learning”. In: PLoS
Computational Biology 3.7 (2007), e134.
[68] Ali Marjaninejad, Darío Urbina-Meléndez, Brian A Cohn, and Francisco J Valero-Cuevas.
“Autonomous functional movements in a tendon-driven limb via limited experience”. In: Nature
machine intelligence 1.3 (2019), pp. 144–154.
[69] Ernesto C Martinez-Villalpando, Hugh Herr, and Matthew Farrell. “Estimation of ground reaction
force and zero moment point on a powered ankle-foot prosthesis”. In: 2007 29th Annual
International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE. 2007,
pp. 4687–4692.
[70] Flávio VC Martins, Eduardo G Carrano, Elizabeth F Wanner, Ricardo HC Takahashi, and
Geraldo R Mateus. “A dynamic multiobjective hybrid approach for designing wireless sensor
networks”. In: Evolutionary Computation, 2009. CEC’09. IEEE Congress on. IEEE. 2009,
pp. 1145–1152.
[71] Monique Maurice, Henri Gioanni, and Anick Abourachid. “Influence of the behavioural context
on the optocollic reflex (OCR) in pigeons (Columba livia)”. In: Journal of experimental biology
209.2 (2006), pp. 292–301.
[72] Kimberly L McArthur and J David Dickman. “State-dependent sensorimotor processing: gaze and
posture stability during simulated flight in birds”. In: Journal of neurophysiology 105.4 (2011),
pp. 1689–1700.
[73] Robyn L Mildren, Catherine M Hare, and Leah R Bent. “Cutaneous afferent feedback from the
posterior ankle contributes to proprioception”. In: Neuroscience letters 636 (2017), pp. 145–150.
[74] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness,
Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al.
“Human-level control through deep reinforcement learning”. In: Nature 518.7540 (2015),
pp. 529–533.
[75] Benjamin Morris and Jessy W Grizzle. “Hybrid invariant manifolds in systems with impulse
effects with application to periodic locomotion in bipedal robots”. In: IEEE Transactions on
Automatic Control 54.8 (2009), pp. 1751–1764.
[76] R Necker. “The structure and development of avian lumbosacral specializations of the vertebral
canal and the spinal cord with special reference to a possible function as a sense organ of
equilibrium”. In: Anatomy and embryology 210.1 (2005), pp. 59–74.
[77] Reinhold Necker. “Head-bobbing of walking birds”. In: Journal of comparative physiology A 193.12
(2007), p. 1177.
[78] Reinhold Necker. “Specializations in the lumbosacral vertebral canal and spinal cord of birds:
evidence of a function as a sense organ which is involved in the control of walking”. In: Journal of
Comparative Physiology A 192.5 (2006), p. 439.
[79] Derrick Nguyen and Bernard Widrow. “Improving the learning speed of 2-layer neural networks
by choosing initial values of the adaptive weights”. In: 1990 IJCNN international joint conference
on neural networks. IEEE. 1990, pp. 21–26.
[80] Quan Nguyen and Koushil Sreenath. “Optimal Robust Control for Bipedal Robots through
Control Lyapunov Function based Quadratic Programs.” In: Robotics: Science and Systems. Vol. 11.
Rome, Italy. 2015.
[81] Riann M Palmieri, Christopher D Ingersoll, Marcus B Stone, and B Andrew Krause.
“Center-of-pressure parameters used in the assessment of postural control”. In: Journal of sport
rehabilitation 11.1 (2002), pp. 51–66.
[82] Ashley E Pete, Daniel Kress, Marina A Dimitrov, and David Lentink. “The role of passive avian
head stabilization in flapping flight”. In: Journal of The Royal Society Interface 12.110 (2015),
p. 20150508.
[83] Leif E Peterson. “K-nearest neighbor”. In: Scholarpedia 4.2 (2009), p. 1883.
[84] Vadakkepat Prahlad, Goswami Dip, and Chia Meng-Hwee. “Disturbance rejection by online ZMP
compensation”. In: Robotica 26.1 (2008), pp. 9–17.
[85] Uwe Proske and Simon C Gandevia. “The proprioceptive senses: their roles in signaling body
shape, body position and movement, and muscle force”. In: Physiological reviews (2012).
[86] Subramanian Ramamoorthy and Benjamin J Kuipers. “Trajectory generation for dynamic bipedal
walking through qualitative model based manifold learning”. In: 2008 IEEE International
Conference on Robotics and Automation. IEEE. 2008, pp. 359–366.
[87] David Andrew Rice and Peter John McNair. “Quadriceps arthrogenic muscle inhibition: neural
mechanisms and treatment perspectives”. In: Seminars in arthritis and rheumatism. Vol. 40. 3.
Elsevier. 2010, pp. 250–266.
[88] Barbara Sargent, Nicolas Schweighofer, Masayoshi Kubo, and Linda Fetters. “Infant exploratory
learning: influence on leg joint coordination”. In: PloS one 9.3 (2014), e91500.
[89] Matteo Saveriano, Fares J Abu-Dakka, Aljaz Kramberger, and Luka Peternel. “Dynamic
movement primitives in robotics: A tutorial survey”. In: arXiv preprint arXiv:2102.03861 (2021).
[90] DM Schroeder and RG Murray. “Specializations within the lumbosacral spinal cord of the
pigeon”. In: Journal of Morphology 194.1 (1987), pp. 41–53.
[91] Steffen Schütz, Atabak Nezhadfard, Navid Dorosti, and Karsten Berns. “Exploiting the intrinsic
deformation of a prosthetic foot to estimate the center of pressure and ground reaction force”. In:
Bioinspiration & Biomimetics (2020).
[92] Makoto Shimojo, Takuma Araki, Aigou Ming, and Masatoshi Ishikawa. “A ZMP sensor for a
biped robot”. In: Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006.
ICRA 2006. IEEE. 2006, pp. 1200–1205.
[93] Gerald N Sholomenko and John D Steeves. “Effects of selective spinal cord lesions on hind limb
locomotion in birds”. In: Experimental neurology 95.2 (1987), pp. 403–418.
[94] Hayder FN Al-Shuka, F Allmendinger, Burkhard Corves, and Wen-Hong Zhu. “Modeling, stability
and walking pattern generators of biped robots: a review”. In: Robotica 32.6 (2014), pp. 907–934.
[95] Dan Simon. Optimal state estimation: Kalman, H infinity, and nonlinear approaches. John Wiley &
Sons, 2006.
[96] Thomas Sinkjær, Jacob Buus Andersen, Jørgen Feldbæk Nielsen, and Hans Jacob Hansen. “Soleus
long-latency stretch reflexes during walking in healthy and spastic humans”. In: Clinical
Neurophysiology 110.5 (1999), pp. 951–959.
[97] Larry R Squire, N Dronkers, and J Baldo. Encyclopedia of neuroscience. Vol. 2. Elsevier Amsterdam,
The Netherlands: 2009.
[98] George L Streeter. “The structure of the spinal cord of the ostrich”. In: Developmental Dynamics
3.1 (1904), pp. 1–27.
[99] Thomas George Thuruthel, Benjamin Shih, Cecilia Laschi, and Michael Thomas Tolley. “Soft
robot perception using embedded soft sensors and recurrent neural networks”. In: Science
Robotics 4.26 (2019), eaav1488.
[100] Darío Urbina-Meléndez, Kian Jalaleddini, Monica A Daley, and Francisco J Valero-Cuevas. “A
physical model suggests that hip-localized balance sense in birds improves state estimation in
perching: implications for bipedal robots”. In: Frontiers in Robotics and AI 5 (2018), p. 38.
[101] Darío Urbina-Meléndez, Jiaoran Wang, Daniel Wang, Ali Marjaninejad, and
Francisco J Valero-Cuevas. “Estimating Center of Pressure of a Bipedal Mechanism Using a
Proprioceptive Artificial Skin around its Ankles”. In: 2021 43rd Annual International Conference of
the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE. 2021, pp. 4522–4528.
[102] Francisco J Valero-Cuevas. Fundamentals of neuromechanics. Vol. 8. Springer, 2015.
[103] Francisco J Valero-Cuevas and Andrew Erwin. “Bio-robots step towards brain–body
co-adaptation”. In: Nature Machine Intelligence 4.9 (2022), pp. 737–738.
[104] Francisco J Valero-Cuevas, Jae-Woong Yi, Daniel Brown, Robert V McNamara, Chandana Paul,
and Hod Lipson. “The tendon network of the fingers performs anatomical computation at a
macroscopic scale”. In: IEEE Transactions on Biomedical Engineering 54.6 (2007), pp. 1161–1166.
[105] Pasha A Van Bijlert, AJ ‘Knoek’ van Soest, and Anne S Schulp. “Natural Frequency Method:
estimating the preferred walking speed of Tyrannosaurus rex based on tail natural frequency”. In:
Royal Society open science 8.4 (2021), p. 201441.
[106] Bram Vanderborght, Alin Albu-Schäffer, Antonio Bicchi, Etienne Burdet, Darwin G Caldwell,
Raffaella Carloni, Manuel Catalano, Oliver Eiberger, Werner Friedl, Ganesh Ganesh, et al.
“Variable impedance actuators: A review”. In: Robotics and autonomous systems 61.12 (2013),
pp. 1601–1614.
[107] Michel Verhaegen and Patrick Dewilde. “Subspace model identification part 1. The output-error
state-space model identification class of algorithms”. In: International journal of control 56.5
(1992), pp. 1187–1210.
[108] Michel Verhaegen and Vincent Verdult. Filtering and system identification: a least squares
approach. Cambridge university press, 2007.
[109] Miomir Vukobratovic, Branislav Borovac, Dusan Surla, and Dragan Stokic. Biped locomotion:
dynamics, stability, control and application. Vol. 7. Springer Science & Business Media, 2012.
[110] Miomir Vukobratović and Branislav Borovac. “Zero-moment point—thirty five years of its life”.
In: International journal of humanoid robotics 1.01 (2004), pp. 157–173.
[111] MR Wayahdi, M Zarlis, and PH Putra. “Initialization of the Nguyen-Widrow and Kohonen
algorithm on the backpropagation method in the classifying process of temperature data in
Medan”. In: Journal of Physics: Conference Series. Vol. 1235. 1. IOP Publishing. 2019, p. 012031.
[112] David T Westwick and Eric J Perreault. “Closed-loop identification: application to the estimation
of limb impedance in a compliant environment”. In: IEEE Transactions on Biomedical Engineering
58.3 (2011), pp. 521–530.
[113] David A Winter. “Human balance and posture control during standing and walking”. In: Gait &
posture 3.4 (1995), pp. 193–214.
[114] Yuko Yamanaka, Naoki Kitamura, and Izumi Shibuya. “Chick spinal accessory lobes contain
functional neurons expressing voltage-gated sodium channels generating action potentials”. In:
Biomedical Research 29.4 (2008), pp. 205–211.
[115] Tingting Yang, Dan Xie, Zhihong Li, and Hongwei Zhu. “Recent advances in wearable tactile
sensors: Materials, sensing mechanisms, and device performance”. In: Materials Science and
Engineering: R: Reports 115 (2017), pp. 1–37.
[116] Yuxiang Yang, Ken Caluwaerts, Atil Iscen, Tingnan Zhang, Jie Tan, and Vikas Sindhwani. “Data
efficient reinforcement learning for legged robots”. In: Conference on Robot Learning. PMLR. 2020,
pp. 1–10.
[117] Jaesik Yoon, Taesup Kim, Ousmane Dia, Sungwoong Kim, Yoshua Bengio, and Sungjin Ahn.
“Bayesian model-agnostic meta-learning”. In: Advances in neural information processing systems
31 (2018).
[118] Yuxia Yuan, Zhijun Li, Ting Zhao, and Di Gan. “DMP-based motion generation for a walking
exoskeleton robot using reinforcement learning”. In: IEEE Transactions on Industrial Electronics
67.5 (2019), pp. 3830–3839.
[119] Wei Zhu, Xian Guo, Dai Owaki, Kyo Kutsuzawa, and Mitsuhiro Hayashibe. “A survey of
sim-to-real transfer techniques applied to reinforcement learning for bioinspired robots”. In: IEEE
Transactions on Neural Networks and Learning Systems (2021).
[120] Viktor Zykov, Josh Bongard, and Hod Lipson. “Evolving dynamic gaits on a physical robot”. In:
Proceedings of Genetic and Evolutionary Computation Conference, Late Breaking Paper, GECCO.
Vol. 4. Citeseer. 2004, p. 2004.
Appendix A
Expanding on electromechanical and algorithmic systems of the bipedal robot of Chapter 4
The objective of this appendix is to give a detailed explanation of the electromechanical and algorithmic systems
behind the experiments of Chapter 4. The robot of Chapter 4 consists of a tendon-driven bipedal structure that is
mechanically supported by a gantry and actuated by DC motors (shown in Figure A.1-C, described in Methods
Section 4.3.1 of Chapter 4, and expanded on in Appendix A.3). Electric circuits enable microcontroller-encoder
and microcontroller-motor driver communication; additionally, a power circuit sets the voltage and provides enough
current for the motors to operate (shown in Figure A.1-B and explained in Appendix A.2). The microcontroller
is in charge of the data acquisition (DAQ) and low-level control of our robot, which consists of following learned
motor activation patterns. Learning of such patterns, which is the focus of Chapter 4, takes place on a PC
(shown in Figure A.1-A and explained in Appendix A.1).
A.1 Details of G2P, the precursor of Natural G2P
[Figure A.1 diagram; labels: PC, power supply, motor drivers (L298 H-bridge), microcontroller (Arduino), encoder data; panels A, B, C]
Figure A.1: This diagram offers a general overview of the main hardware components of our system. In
A, we symbolize a PC where the learning is performed; this can also be considered the high-level control
part of our system. In B, we show the main components that interface the learning with the mechanical
components of the system; this can also be called the low-level control part of our system. In C, we show a
photograph of the robot, which interacts with the environment to produce walking.

The first version of the learning algorithm that we use in Chapter 4 was developed in [68]; it is called the General to
Particular (G2P) algorithm. As explained in Section 4.3.2, this algorithm uses an Artificial Neural Network
(ANN) as a map from inputs to outputs (respectively, desired kinematics to motor activations). The ANN is
trained with input-output data sets obtained from babbling, and tested with input-output data sets also obtained from
babbling by predicting outputs given inputs (Figures A.2 and A.3). The predicted outputs are compared with the ground
truth motor activation outputs. The difference between predicted and ground truth values is the error, and the goal is
to reduce this error. The testing/training data set size ratio is 0.25.
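As a minimal sketch of the 0.25 testing/training ratio mentioned here, the following Python snippet splits a babbling data set so that the testing set is one quarter the size of the training set (that is, 20% of all samples). The function name, the random shuffling, and the seed are assumptions made for illustration; the source does not state how the samples were assigned to each set.

```python
import numpy as np

def split_babbling_data(kinematics, activations, test_to_train_ratio=0.25, seed=0):
    """Split babbling samples into (train, test) pairs of arrays.

    Hypothetical helper: interprets the ratio as test size / train size,
    so the test fraction of the whole set is r / (1 + r).
    """
    kinematics = np.asarray(kinematics)
    activations = np.asarray(activations)
    n = len(kinematics)
    n_test = int(round(n * test_to_train_ratio / (1.0 + test_to_train_ratio)))
    # Shuffling is an assumption of this sketch, not a detail from the source.
    order = np.random.default_rng(seed).permutation(n)
    test_idx, train_idx = order[:n_test], order[n_test:]
    train = (kinematics[train_idx], activations[train_idx])
    test = (kinematics[test_idx], activations[test_idx])
    return train, test
```

With 1000 babbling samples, this sketch would yield 800 training and 200 testing samples.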
The ANN used for Natural and Naïve G2P represents the inverse map from 6D limb kinematics (i.e., for our
robot, proximal and distal joint positions, velocities, and accelerations) to 3D motor control sequences (i.e.,
three motors actuating the joints through tendons). It has three fully connected layers (input, hidden, and
output layers) with 6, 15, and 3 nodes, respectively (Figures A.2 and A.3).
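To make the 6-15-3 architecture concrete, the sketch below shows a forward pass through such a network in NumPy. It is an illustration under stated assumptions, not the code used in the experiments: the parameter names are hypothetical, the small uniform random weights are placeholders (not the initialization used in this work), and the hyperbolic tangent is applied to the hidden and output layers, following the transfer function named in this appendix.

```python
import numpy as np

N_IN, N_HIDDEN, N_OUT = 6, 15, 3  # layer sizes stated in the text

def init_params(seed=0):
    """Placeholder parameters (hypothetical; small uniform random weights)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.uniform(-0.5, 0.5, (N_HIDDEN, N_IN)),
        "b1": np.zeros(N_HIDDEN),
        "W2": rng.uniform(-0.5, 0.5, (N_OUT, N_HIDDEN)),
        "b2": np.zeros(N_OUT),
    }

def forward(params, kinematics):
    """Map rows of 6 kinematic values to rows of 3 activations in [-1, 1]."""
    x = np.atleast_2d(np.asarray(kinematics, dtype=float))
    hidden = np.tanh(x @ params["W1"].T + params["b1"])
    return np.tanh(hidden @ params["W2"].T + params["b2"])
```

Each input row holds the position, velocity, and acceleration of the two joints; the ordering of those six values within a row is a choice left open by this sketch.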
As the transfer function for all nodes we selected the hyperbolic tangent sigmoid function, which is an S-shaped
function that produces a bounded output value in the range between -1 and 1. We chose this function
over the logistic sigmoid because its gradient is larger (the maximum slope of the hyperbolic tangent is 1,
compared with 0.25 for the logistic sigmoid). The higher gradient produces a greater
sensitivity to changes in the input values, producing larger updates in the weights of the network (thus potentially
faster learning). We also applied a scaling to the output layer (which gives values between -1 and 1) so that the
outputs cover the whole range of motor control values (0-255).
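The two points of this paragraph can be checked numerically with a short sketch. The first two functions compare the slopes of the hyperbolic tangent and the logistic sigmoid; the third maps network outputs in the range -1 to 1 onto PWM values from 0 to 255. The linear form of that scaling, the rounding, and all function names are assumptions of this sketch, since the source only states that a scaling was applied.

```python
import numpy as np

def tanh_grad(x):
    """Derivative of tanh(x); its maximum, at x = 0, is 1."""
    return 1.0 - np.tanh(x) ** 2

def logistic_grad(x):
    """Derivative of the logistic sigmoid; its maximum, at x = 0, is 0.25."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def to_pwm(y, pwm_min=0, pwm_max=255):
    """Linearly map values in [-1, 1] to integer PWM commands (assumed scaling)."""
    y = np.clip(np.asarray(y, dtype=float), -1.0, 1.0)
    scaled = (y + 1.0) / 2.0 * (pwm_max - pwm_min) + pwm_min
    return np.rint(scaled).astype(int)
```

Under this assumed mapping, an output of -1 corresponds to a PWM value of 0 and an output of 1 to a PWM value of 255.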
The weights and biases were initialized with the Nguyen–Widrow initialization algorithm [79, 111]; with
this we avoid initializing weights close to the regions where the gradient of the transfer function has very small or
high values. Initial values located in such regions create undesired output saturation. To obtain the
best results, this approach randomly initializes weights close to the midpoint of the transfer function (i.e., 0 in the
case of our experiments). As the performance/error function we used the mean square error (m.s.e.)
(Figure A.4). With this, the mean of the squared differences between the values predicted by the ANN and the ground truth
values is calculated. This error is propagated backwards to update the initial weights, which is performed with the
Levenberg–Marquardt backpropagation technique; the assignment of new weights is done with
Adaptive Moment Estimation (Adam), a gradient descent method chosen over Momentum, AdaGrad, and RMSProp.
Adam is the standard go-to method since it combines benefits of both Momentum and RMSProp. To find the
best model weights, it leverages the speed usually seen with Momentum and the adaptability to gradients with different
orientations that RMSProp commonly handles well. Each time the backpropagation is complete, one epoch is considered
to have elapsed. We set the maximum number of epochs to 100; also, model training stops when
there is no improvement for 5 epochs.
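The error function and stopping rule described above can be sketched as follows. This is an illustration only: `run_epoch` is a hypothetical callback standing in for one full pass of training that returns the monitored m.s.e., and which error is monitored for "no improvement" is an assumption, since the source does not specify it.

```python
import numpy as np

def mse(predicted, target):
    """Mean of the squared differences between predictions and ground truth."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((predicted - target) ** 2))

def train_with_early_stopping(run_epoch, max_epochs=100, patience=5):
    """Run epochs until max_epochs or until `patience` epochs pass without improvement.

    run_epoch: hypothetical callable that trains for one epoch and
    returns the monitored error (assumed here to be an m.s.e. value).
    """
    best_error, best_epoch, history = float("inf"), 0, []
    for epoch in range(1, max_epochs + 1):
        error = run_epoch()
        history.append(error)
        if error < best_error:
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` consecutive epochs
    return best_error, best_epoch, history
```

In this sketch the defaults of 100 epochs and a patience of 5 mirror the values given in the text.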
A.2 Data and power circuits
[Figure A.2 diagram; column labels: Input layer, Middle layer, Output layer]
Figure A.2: Representation of the ANN that we train to be used as a map from six limb kinematics to three
motor activations. In this figure we show data used to train the NN (particularly naïve babbling data):
limb kinematics (values for input nodes) which result from motor activations (desired values for
output nodes). As a reminder, motor activations in babbling are random (specific details on naïve and
natural babbling are given in Section 4.3.2). The ANN has three fully connected layers: input, hidden, and output
layers with 6, 15, and 3 nodes, respectively. It is trained with babbling data: babbling data sets are divided into
training and testing sets; once the NN has been satisfactorily trained, it is used to produce motor commands
given a set of desired limb kinematics. The details of the NN are given in Appendix A.1.

The brain-body (i.e., PC-robot) interface relies on data and power circuits, which are described in this appendix
section. These circuits make possible the data communication to and from the biped and the activation of the motors
that actuate the robot’s mechanical structure. In the context of this work, a data circuit is one that is set to voltage
levels at or below 5 V and in which data is represented by a train of pulses at voltage levels of 5
and 0 V; transitions between these two values are perceived and treated as a step function. A power circuit also
carries current and is set to a voltage but, in contrast to the data circuit, it is designed with heavier-duty
components to withstand the heat dissipation intrinsic to the higher currents that travel through its conductors. A power
circuit is meant to drive a load, which in the case of our experiments is the motors. Circuits like the one used to actuate
our biped robot (Figure A.5) have both data and power components. The data and the power sections need to be carefully
interfaced to avoid damaging low-voltage, low-current components (i.e., logic components like microcontrollers
and PLCs) with the current that high-voltage, high-current components (like motors) draw. This interface is
done with the L298N H-bridge, which is a transistor- (switch-) based circuit that allows or blocks current flow in
particular directions. The maximum voltage applied to the motors is the voltage supplied to the H-bridge; motors receive
current as a train of pulses (pulse width modulation, or PWM), whose duty cycle determines the effect
on the motor. For a given period of time, if the current is received continuously without interruptions, the motor will
behave as if set to 100% of the voltage supplied to the H-bridge; if the current is null 50% of the time,
then the motor will behave as if it were set to 50% of the voltage supplied to the H-bridge. This is a linear behavior that
can be extrapolated to other PWM duty cycles.
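The linear relationship between PWM duty cycle and effective motor voltage described in this section can be written as a one-line calculation. The sketch below is an idealization: the function name is hypothetical, the 8-bit PWM range (0-255) follows the motor control range given in Appendix A.1, and any voltage drop across the driver is ignored.

```python
def effective_voltage(pwm_value, supply_voltage, pwm_max=255):
    """Idealized average motor voltage for a given PWM command.

    Assumes a purely linear relation between duty cycle and voltage and
    neglects losses in the H-bridge (an assumption of this sketch).
    """
    if not 0 <= pwm_value <= pwm_max:
        raise ValueError("pwm_value must lie between 0 and pwm_max")
    duty_cycle = pwm_value / pwm_max  # fraction of the period with current flowing
    return duty_cycle * supply_voltage
```

For example, with a hypothetical 12 V supply, a command of 255 corresponds to the full 12 V, and a duty cycle of 50% corresponds to 6 V.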
[Figure A.3 diagram; column labels: Input layer, Middle layer, Output layer]
Figure A.3: Representation of the ANN that we train to be used as a map from six limb kinematics to three
motor activations. In this figure we show data used to train the NN (particularly natural babbling data):
limb kinematics (values for input nodes) which result from motor activations (desired values for
output nodes). As a reminder, motor activations in babbling are random (specific details on naïve and
natural babbling are given in Section 4.3.2). The ANN has three fully connected layers: input, hidden, and output
layers with 6, 15, and 3 nodes, respectively. It is trained with babbling data: babbling data sets are divided into
training and testing sets; once the NN has been satisfactorily trained, it is used to produce motor commands
given a set of desired limb kinematics. The details of the NN are given in Appendix A.1.

The main component between the PC and the robot is the microcontroller (an Arduino MEGA in our
experiments). The PC handles the learning part of the experiments, as described in A.1; the learning aspects are
also the focus of Chapter ??. The microcontroller coordinates the data collection sampling interval as well as the
frequency of the motor activations. Through the serial port, it communicates with the computer by sending data to it or
receiving data from it. In general, per leg, the data sent to the computer consists of two kinematic values (hip and knee encoder
positions) and the sampling interval (the PC then differentiates these two values to obtain the respective velocities and
accelerations); the way this data is used is described in Chapter ?? and A.1. Per leg, the data sent from the PC to the
microcontroller consists of motor activation values for three motors (these activation values are always motor
commands: either motor babbling commands or commands to execute a learned action, aspects described in Chapter ?? and
A.1). Data transfer is done with .csv files. In summary, for all the experiments, data is sent to the computer in a
.csv file; this data is then used to train a NN, as described in detail in Chapter ?? and Appendix A.1, and the
generated activation commands that produce a desired action are then sent back to the microcontroller.
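The step in which the PC differentiates encoder positions to obtain velocities and accelerations can be sketched with finite differences. The snippet below is one possible way to do this, assuming a constant sampling interval; the function name, the use of `numpy.gradient`, and the column ordering of the result are assumptions, not details taken from the experiments.

```python
import numpy as np

def kinematics_from_encoders(hip_pos, knee_pos, dt):
    """Build an N x 6 array of kinematics from two encoder position traces.

    Columns (ordering assumed for this sketch): hip position, velocity,
    acceleration, then knee position, velocity, acceleration.
    dt is the sampling interval in seconds, assumed constant.
    """
    features = []
    for pos in (hip_pos, knee_pos):
        pos = np.asarray(pos, dtype=float)
        vel = np.gradient(pos, dt)  # finite-difference first derivative
        acc = np.gradient(vel, dt)  # finite-difference second derivative
        features.extend([pos, vel, acc])
    return np.column_stack(features)
```

Each row of the resulting array has the six values that the ANN of Appendix A.1 takes as input for one leg.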
Here, I describe the PC-robot interface required to read one encoder and actuate one motor (as described in
Chapter 4, our robot has in total 6 motors and 4 encoders). The PC-microcontroller communication is done
through the serial port of the Arduino (purple line in Figure A.5). For the microcontroller-H-bridge
(microcontroller-L298 motor driver) and microcontroller-encoder data communication we use the digital PWM and
general-purpose input/output pins of our microcontroller (green lines in Figure A.5). The logic components of both the
encoder and the L298 are powered from the microcontroller's 5 V power supply port (orange line in Figure A.5). For our
experiments this port provides enough current for all the encoders and H-bridges, but it is strongly recommended to use
an external power supply for all those components: the more components, the greater the risk of drawing too much current
from the microcontroller, causing intermittent and faulty operation. A common ground for all circuits is an important
aspect to consider: it sets the reference point for all voltage levels in the system.
A.3 The biped
In this section we provide a figure (Figure A.6) that shows the different components that form the physical structure
of the biped robot we use in Chapter 4. The Methods section (4.3.1) of Chapter 4 gives the mechanical characteristics of
the tendon-driven biped, including: degrees of freedom (DoF), power train structure (i.e., geared DC motors
and tendons), an explanation of the overactuated architecture of the robot, material considerations and distribution to
keep inertia as low as possible, and the gantry characteristics that restrict the DoF of the center of mass of the robot
(located between the legs, under the hip). In Methods 4.3.1 of Chapter 4 we also discuss the differences between our
biped and the robot in our published work [68], including differences in tendon routing and in ranges
of motion.
[Figure A.4 plots; y-axis label on each panel: m.s.e. (PWM activation value)]
Figure A.4: ANN learning performance. Here we compare ANN performance on training babbling data
sets (red line) vs. ANN performance on testing babbling data sets (blue line). In detail, for each subfigure:
the X axis is the number of epochs, and the Y axis is the mean square error of PWM activation values (i.e., computed
from the differences between the predicted and ground truth values). Each row represents a different
experiment trial. The groups of subfigures framed in blue and green are ANN performances based
on naïve and natural babbling data, respectively.
[Figure A.5 schematic (SOLIDWORKS Electrical drawing by Dario Urbina Melendez, University of Southern California, Valero Lab, dated 3/24/2023); components shown: PC, Arduino 2560, L298 motor driver (In 1, In 2, EN A, 5V, G, Vcc, OUT 1, OUT 2), rotary encoder (A, B, X, 5V, G), and motor M1]
Figure A.5: Electric diagram showing the components required to actuate one DC motor and read one
rotary encoder. Our robot has 6 motors and 4 rotary encoders; the connection approach for each
is the same as shown in this figure. The L298 motor driver interfaces the data and power circuits
(indicated with green and red wires, respectively). Orange indicates the 5 V lines that power the data circuits,
and purple the PC-microcontroller serial communication lines.
Figure A.6: Renders of the biped’s physical components. A) The whole biped, including all 3D-printed
components (explained in more detail in the next points), the hip metal plate (upper-most flat metal
structure), motor locations (per leg: two close to the hip and one close to the knee), and aluminum tubes as upper
and lower leg segments (respectively connecting hip with knees and knees with feet). B) Hip bridge
structure used to fix both legs to the hip. C) Hip motor mounts, designed to hold metal plates to attach
motors of different characteristics/diameters. D) Hip leg mounts, which have housings for ball bearings
to reduce the friction of the proximal DoF of the legs. E) Upper segment rotator cuffs. F) Lower segment
leg mounts, which have housings for ball bearings to reduce the friction of the distal DoF of the legs.
G) Distal segment rotator cuffs. E and G include tendon channels to maintain proper tendon routing. D
and F were designed to have rounded profiles to reduce tendon abrasion; furthermore, these parts include
encoder mounts.
Abstract
My goal is to test how the anatomy of bipedal animals and robots influences proprioception, and how it can be exploited to improve learning of locomotion. Specifically, my objectives are to study: (i) the effect of mechanoreceptor number and location on proprioception accuracy, (ii) how the utility of mechanoreceptor signals is affected by the location and properties of the structure on which they sit, and (iii) how to exploit passive mechanical properties to improve learning of locomotion. For my first objective, I quantify the benefits of bioinspired sensory fusion and distributed sensing on state estimation for bipedal balance. We provide evidence that hip-localized balance sense, by its proximity to a moving platform and in combination with head acceleration, does provide two functional advantages compared to head-only balance sense: (1) improved state estimation, and (2) reduced sensory delays. Moreover, increased neck stiffness can improve the utility of vestibular signals. For my second objective, we show that strain data of bioinspired proprioceptive skins and/or ligament arrangements can enable reliable center of pressure (CoP) estimates. This is an alternative to traditional CoP sensing methods that rely on sensors at the foot sole, which are subject to wear and damage due to direct interaction with the ground. With a model-free machine learning approach, we reliably estimated the CoP location of a biped by measuring the strain experienced by sensorized artificial skin and/or ligaments wrapped and arranged around its ankles. For my third objective, I study how to exploit the mechanical and dynamic properties of tendon-driven robotic limbs (and their interaction with the environment) to improve the learning of inverse dynamical models for locomotion. Here we demonstrate that two minutes of bioinspired motor babbling performed in tendon-driven bipedal limbs can enable the learning of partially supported bipedal locomotion. This is achieved only if the random actions are compatible with the physical properties and range of motion of the limb (i.e., “natural” babbling), compared to purely random “naïve” babbling that can conflict with the leg's natural dynamics or exacerbate antagonistic muscle actions. Furthermore, we see how by removing support to the system (i.e., increasing the constraints that the environment imposes on walking) the biped learns to walk after training with both naïve and natural babbling approaches. With this we explore how the performance of learned actions depends not only on brain-body properties, but also on their interaction with the environment. We showed these brain-body-environment interactions by combining our published in-hardware learning algorithm (G2P) with a custom-built, tendon-driven bipedal robot.
Asset Metadata
Creator: Urbina-Meléndez, Darío (author)
Core Title: Exploiting mechanical properties of bipedal robots for proprioception and learning of walking
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Biomedical Engineering
Degree Conferral Date: 2023-05
Publication Date: 05/02/2024
Defense Date: 03/09/2023
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: backdrivable,Balance,bioinspired,bio-inspired,bio-mechanics,bipedal robot,Birds,brain-body,brain-body-environment,center of pressure,co-adaptations,co-localized sensing,compliant robot,Control,distributed sensing,dynamical properties,emergence of locomotion,gait,humanoid,impact awareness,impacts,K-nearest neighbors,learning of walking,legged robots,ligaments,limit cycles,locomotion,locomotion planning,lumbosacral organ,machine learning,model-free,motor babbling,naïve babbling,natural babbling,OAI-PMH Harvest,Perch,posture,proprioception,robot,skin,state estimation,tendon-driven,tendons,vertebrates,vestibular system
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Valero-Cuevas, Francisco (committee chair), Finley, James (committee member), Schweighofer, Nicolas (committee member)
Creator Email: urbdario@gmail.com,urbiname@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC113095832
Unique identifier: UC113095832
Identifier: etd-UrbinaMeln-11756.pdf (filename)
Legacy Identifier: etd-UrbinaMeln-11756
Document Type: Thesis
Rights: Urbina-Meléndez, Darío
Internet Media Type: application/pdf
Type: texts
Source: 20230503-usctheses-batch-1035 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu