MODEL-BASED APPROACHES TO OBJECTIVE INFERENCE DURING STEADY-STATE AND ADAPTIVE LOCOMOTOR CONTROL

by

Pouria Nozari Porshokouhi

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (BIOMEDICAL ENGINEERING)

August 2023

Copyright © 2023 Pouria Nozari Porshokouhi

Epigraph

Within the graceful tapestry of human body movement, patterns unfold — a symphony of purposeful motion. One traverses distance in a certain manner, adopting certain strategies aligned with certain objectives and motives. In unraveling the enigma of movement pattern objectives, we embark on a quest to push the boundaries of knowledge in the realm of human locomotor control.

Dedication

This dissertation is dedicated to the beautiful, everlasting soul and memory of my beloved mother, who remains the eternal source of inspiration in my life.

Acknowledgements

I am most grateful to Dr. James M. Finley for his strong intellectual mentorship, his demand for excellence, and his support and presence throughout all the challenging stages of my professional career and my personal life. Working alongside this remarkable individual has been a privilege, and I am truly appreciative of the invaluable knowledge and wisdom I have gained from him.

I am profoundly grateful to the esteemed members of my committee, Dr. Francisco Valero-Cuevas, Dr. Nicolas Schweighofer, and Dr. Kristan Leech, whose impactful contribution to my academic journey has not only challenged me but also pushed the boundaries of my capabilities, propelling me to reach new heights.

I extend my heartfelt gratitude to my beloved parents for their uncompromising support and unconditional love throughout my entire life. They made significant sacrifices, selflessly relinquishing their own ties and sending me on a journey to a foreign land in pursuit of my own success. I am forever indebted to the guidance, encouragement, and immense strength my mother instilled in me, propelling me forward even in the face of challenges. Her sacrifices and belief in me have shaped the person I am today, and for that, I am eternally thankful to her. May this piece of achievement bring pride to her everlasting soul.

There are individuals in every person's life whose unwavering presence and support play a vital role in their journey toward success. For me, my one and only, Suzanna Kazarian, is that extraordinary person. She has consistently stood by my side, being an inspiration with each step I took and a motivation with each challenge I faced. Suzanna, I am forever grateful to you for being my sunshine, filling my life with light and my heart with love, and I love you.

Contents

Epigraph
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract

Chapter 1. Introduction
1.1 Objectives of Human Motor Control
1.1.1 Normative Theories to Explain Human Motor Control
1.1.2 Optimal Control; A Dominant Theory in Motor Control
1.1.3 Energy Economy in Gait
1.1.4 Energy Economy in Adaptive Gait
1.1.5 Stability in Locomotor Control
1.1.6 Low-dimensional Control Laws
1.2 Utility of Predictive Simulation in Human Locomotion
1.3 Other Considerations and Objectives in Locomotor Control
1.4 Inverse Optimization as a Tool to Infer Motor Costs
1.4.1 From Optimal Control to Inverse Optimal Control: A Commentary on Terminology
1.5 Alternatives to Optimality
1.5.1 Feasibility Theory
1.5.2 Good-enough Control
1.6 Conclusion

Chapter 2. Specific Aims

Chapter 3. A Simulation Platform to Evaluate Principles of Bipedal Locomotor Control
Abstract
3.1 Introduction
3.2 Methods
3.2.1 Dynamical Movement Primitives (DMPs)
3.2.2 Biped-Treadmill Modeling
3.2.3 Stabilizing the Gait via Control Policies Update
3.3 Results
3.4 Discussion

Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio
Abstract
4.1 Introduction
4.2 Methods
4.2.1 Biped-Treadmill Model
4.2.2 Learning from Demonstrations
4.2.3 Predictive Simulations of Bipedal Locomotion
Cost Functions
Constraints and Considerations
Optimization of Bipedal Walking using DMPs
Good-enough Control
4.2.4 Statistical Analysis
4.3 Results
4.3.1 Tied-belt Walking Across Different Speeds
Work minimization predicted symmetric features in normal walking
Cost of transport scales with walking speed during normal walking
4.3.2 Split-belt Walking Across Different Belt Speed Ratios
4.4 Discussion
4.5 Conclusion

Chapter 5. A Review of Inverse Optimal Control Approaches
5.1 Overview
5.2 Markov Decision Process Methods
5.2.1 Markov Decision Process Representations
5.2.2 Inverse Reinforcement Learning Based on Markov Decision Processes
5.3 Trajectory Optimization Methods
5.3.1 Trajectory Optimization Representation
5.3.2 Inverse Optimal Control Based on Trajectory Optimization
5.3.3 Different Methods of Estimating Costs and Policies from Example Data
5.4 Considerations of Cost Functions to Generate Behavior
5.4.1 On Intrinsic Costs
5.4.2 The Choice of Cost for Human Locomotor Behavior
5.5 Assumptions on the Optimality of Observed Behavior
5.6 Applications to Human Biomechanics and Behavior
5.7 Open Questions and Opportunities for Human Locomotor Control
5.7.1 Incorporation of Feedback Control
5.7.2 Analysis of Cost Components
Cost Multicollinearity
Cost Scaling
5.7.3 Stochastic Cost Inference
5.8 Conclusion

Chapter 6. Utility of a Feature-Reduced Bayesian IOC in Human Locomotor Control
Abstract
6.1 Introduction
6.2 Methods
6.2.1 Five-Segment Biped Model
6.2.2 Inverse Optimal Control as a Method to Infer Locomotor Costs
6.2.3 Cost Functions
6.2.4 Multicollinearity in the Cost Space
6.2.5 Bayesian Optimization to Infer Distributions of Cost Weights
6.2.6 Feature-Reduced Bayesian Inverse Optimization to Infer Costs from Synthetic and Behavioral Data
6.3 Results
6.3.1 Feature Reduction in Cost Space
6.3.2 Cost Inference from Synthetic Gait Examples
Behavior Prediction
Cost Weight Recapturing
6.4 Discussion
6.5 Conclusion

Chapter 7. Conclusions and Future Directions
7.1 Inverse Good-enough Control: A Visionary Outlook

References

Appendices
Appendix 1: Trajectory Optimization for Gait Simulations of a Compass Walker Using Direct Collocation
Appendix 2: A Naïve Evolutionary Bi-level Inverse Optimization

List of Tables

4.1 A pool of physically meaningful (intrinsic) cost functions proposed in the literature. Three different families of costs are quantified at different process levels of kinematics, dynamics, and kinetics.
5.1 A pool of physically meaningful cost features proposed/used in the literature. Features are categorized into four different families of criteria: kinematics, dynamics, models of energetics, and stability measures.
6.1 Candidate cost components used in the inverse optimization analysis.
6.2 Summary of ANOVA results for behavioral prediction error (RMSE). RMSE ~ Inference*Weight was the formula used to explain RMSE.
6.3 Summary of ANOVA results for cost recapturing accuracy. KS-Stat ~ Inference*Weight was the formula used to explain KS-Stat.

List of Figures

3.1 The representation of the biped-treadmill model in the Matlab Simscape environment.
3.2 Block diagram of the model dynamics simulator, genetic algorithm optimization, and control scheme with PID controllers for stabilizing walking on a treadmill.
3.3 DMPs trained with observed human joint angle data for all DOFs. The coefficient of determination ($R^2$) and the root mean square error (RMSE) for each DOF represent the variance explained and the prediction error norm values, respectively.
3.4 The strides of the simulated (tuned) DMP-driven joint angle trajectories and the actual (baseline) data over gait cycles for both left and right legs.
3.5 The biped torso translational and rotational states over the simulation time.
3.6 Spatiotemporal parameters of the biped optimal gait on the treadmill vs. stride number.
3.7 Changes in spatiotemporal parameters of the biped optimal gait versus the biped torso relative position on the treadmill.
4.1 The biped-treadmill model.
4.2 Overview of the predictive simulation of bipedal walking using hybrid optimization.
4.3 Overview of the simulations of bipedal walking using evolutionary optimization.
4.4 Optimal solution vs. good-enough solutions.
4.5 Joint Angle Trajectories for tied-belt walking at 1.0 m/s.
4.6 Energy-speed relationship in tied-belt walking.
4.7 Optimal Values for Optimization Variables.
4.8 Correlation plot for spatiotemporal scaling variables from good-enough solutions.
4.9 Leg work rate results from good-enough solutions with respect to positive work rate.
4.10 Results of spatiotemporal features from good-enough solutions with respect to positive work rate.
4.11 A comparison among spatiotemporal features of split-belt walking at the belt speed ratio of 3:1 from good-enough solutions and other empirical and simulation studies.
4.12 Results of stance and swing time from good-enough solutions with respect to positive work rate.
4.13 Results of foot placement from good-enough solutions with respect to positive work rate.
4.14 Feasible solutions for walking on the treadmill across a range of belt speed ratios.
6.1 Schematics of the five-segment biped model.
6.2 Free body diagram of a five-segment biped.
6.3 Limb angle trajectories for all step cycles from lift-off to heel-strike for subject 1.
6.4 The Bayesian cost inference framework.
6.5 The pipeline for inference of locomotor costs from synthetic data.
6.6 Cost correlation matrix.
6.7 The results of variance accounted for (VAF) in cost (X) and behavior (y) spaces.
6.8 Matrix of cost bases loadings for dimensionality reduction.
6.9 Joint angle predictions by the two Bayesian inverse optimization approaches for simulated subject one and weight set 1.
6.10 Comparison of behavioral prediction error by the Naïve and Feature-Reduced inference approaches across different weight sets.
6.11 Posterior distributions of cost weights for simulated subject 1 inferred by the two Bayesian inverse optimization approaches.
6.12 Comparison of cost weight recapturing performance by the Naïve and Feature-Reduced inference approaches across different weight sets.
A.1 Schematics of the compass walker model.
A.2 Overview of Gait Optimization Using Direct Collocation.
A.3 Overview of an IOC problem implemented in a nested bi-level paradigm.
A.4 The weights identified by IOC and weighted contributions of cost features to the composite cost for case 1.
A.5 The kinematic and dynamic trajectories for case 2 with a combined cost.
A.6 The weights and valuations of cost features for combined features with nonzero weights (case 2).

Abstract

This dissertation addresses the utility of model-based computational approaches for simulations of treadmill walking to understand the principles and factors that underlie human locomotor behavior.
Split-belt walking is a common paradigm for adaptive gait through which locomotor studies can uncover the underlying principles of motor control in novel conditions. Combining the principles of optimality and good-enough control, we proposed a hybrid evolutionary-gradient-based algorithm in simulations of adaptive gait on a split-belt treadmill that selects a family of suboptimal solutions that are feasible and good-enough. We then aimed to predict empirical features of gait observed during split-belt walking by selecting the good-enough solutions with respect to an effort-related cost function. Our predicted ranges of good-enough spatiotemporal asymmetries were in agreement with physics-based predictions. Moreover, our results were closer to the empirical observations of split-belt walking compared to prior simulation work.

Across various tasks and contexts, it is highly likely that a composite cost representing a trade-off among several energetic and non-energetic components would better explain the gait features. Despite the effort to uncover this cost trade-off in human locomotor control by inverse optimization studies, the attempts towards inference of a range of costs and resolution of cost multicollinearity have been far too limited. Thus, we developed a Feature-Reduced Bayesian inverse optimization framework to identify a range of plausible costs that would explain the observed locomotor patterns in human walking. We implemented this approach to infer costs from synthetic gait data, for which the cost weights were known. Our Feature-Reduced Bayesian inference was able to recapture the true weights dictating the synthetic locomotor behavior more accurately than the Naïve approach.

Keywords: Bipedal locomotor control, Predictive simulation, Optimization, Feasibility, Good-enough control, Modeling and simulation, Inverse optimization, Dynamics, Redundant systems, Inverse reinforcement learning

Chapter 1
Introduction

1.1 OBJECTIVES OF HUMAN MOTOR CONTROL

1.1.1 Normative Theories to Explain Human Motor Control

Biomechanists and neuroscientists are often interested in identifying normative theories that explain human motor control behaviors. These theories should encompass principles that determine the policies we use when moving in both conventional and novel environments. Several plausible theories have been developed using a combination of computational modeling approaches from machine learning, control theory, and estimation-detection theory [1].

Optimal control is one theory that has successfully explained many features of human motor behaviors. This theory states that actions stem from the tendency of the nervous system to choose control policies that optimize a functional objective that may be composed of several criteria such as energy expenditure, effort, and variance [2, 3, 4]. A different principled framework in optimal control states that behaviors are derived from supervised learning processes that detect and minimize performance or prediction error, i.e., deviations from nominal motor patterns, by learning an internal representation of the dynamics of the body and environment [5]. The habitual control hypothesis is a third theory of motor control, stating that when exposed to novel environments, people adopt policies that resemble what they typically use [6]. Depending on the type of task, any of these theories may be used either alone or in combination to potentially make inferences about motor control behavior.
1.1.2 Optimal Control; A Dominant Theory in Motor Control

There are challenges involved with determining which theory or theories explain human motor behavior. Of all normative theories, optimal control is one that allows for addressing many motor control problems in a coherent and principled framework [7]. Further, it motivates the use of mathematics to solve difficult problems with which the brain contends in an intuitive fashion [8]. It provides an objective representation of behavior rather than of the control policies, and it also allows for direct implementation of the dynamics of the behavior as well as organismic or task considerations as constraints in the problem.

Optimal control theory, when applied to biological motor behaviors, states that the brain chooses control policies that optimize a functional objective consisting of one or more criteria [9, 10]. This theory is in line with evolutionary theories of natural selection and de novo learning, which favor motor patterns that increase an organism's fitness by increasing its probability of passing its genes to subsequent generations [11]. Optimal control theory is also supported by experimental observations in movement, foraging, hunting, and several other behaviors in animals [12, 13]. Moreover, there is also evidence that features of human walking can be explained as resulting from optimization of an objective function [14, 15, 16]. However, it remains to be investigated what the most important locomotor objectives are that drive various features of human locomotion under different gait conditions.

1.1.3 Energy Economy in Gait

Energy economy, i.e., the tendency to minimize energetic costs, is a prevalent objective that is believed to drive many features observed in human locomotor behavior [17, 18, 19, 20, 21, 22, 23]. Results of empirical studies suggest that various features of bipedal gait are adopted to minimize energy consumption per unit distance at either the mechanical or neural level, known as the cost of transport [19, 24]. Individuals prefer to walk at speeds that correspond to their minimum metabolic cost expenditure ranges [25]. In addition, individuals generally take steps of equal lengths and times, which correspond to the lowest levels of metabolic cost rate during normal walking at steady state [26].

1.1.4 Energy Economy in Adaptive Gait

There is evidence that energy minimization may also explain how people adapt their walking patterns to novel environments, such as when wearing powered exoskeletons [27, 28, 29, 30] or when walking on a dual-belt treadmill with each belt running at a different speed [31, 32, 33]. Particularly, when walking on a dual-belt treadmill, given enough time for adaptation, individuals adapt to an asymmetric gait in which they take longer steps on the fast belt at steady state [32, 33]. This strategy allows individuals to take advantage of the work provided by the treadmill, which, in turn, helps them reduce their metabolic cost of transport. In another study of adaptive walking, with increasing belt speed differences, participants preferred to adopt increasingly positive step time asymmetries but symmetric step lengths [34], which also coincided with the lowest energy cost consumption.

1.1.5 Stability in Locomotor Control

Optimality principles are not limited to the energy economy.
Behavioral or performance criteria other than energy have also been shown to matter in human locomotor control when it is modeled as an optimal control problem. For instance, stability, known as balance in the behavioral gait literature, is another important feature that is believed to determine gait patterns along with energetic and kinematic-related objectives. The slower gait with shortened step length adopted by older adults, when compared to younger adults, was shown to be related to an increased weight on the importance of balance and to minimizing the risk of falls in this population [35]. In addition, preliminary gait observations in young adults indicated that their preferred walking speeds were associated with maximized stability, characterized by minimal head and pelvic accelerations [36]. With similar head and pelvic acceleration measures for stability, another empirical study found that during walking, healthy young adults maximize vertical and anteroposterior stability while maintaining mediolateral stability only enough to not fall over [37].

1.1.6 Low-dimensional Control Laws

In human motor control, if the goal really is an optimization of some sort, then a biological learner should prefer to search within a lower-dimensional activation space, also known as the space of free variables that control the behavior [38]. Along these lines of thought, Bernstein proposed that the central nervous system imposes constraints, i.e., fixed synergies or primitives, in the lower-level control of biological motor behaviors [39]. There is evidence that walking may be coordinated in part via central pattern generators (CPG) [40] that generate rhythmic control policies. Here, we introduce a type of movement primitive, namely, dynamical movement primitives, which will be used in our simulations of gait with nonlinear dynamics to represent the baseline movement patterns.

Dynamical movement primitives (DMPs) are one principled approach to modeling rhythmic control policies [41], representing movement trajectories as weakly nonlinear dynamical systems. This framework consists of nonlinear differential equations that govern the behavior of a virtual point mass called the DMP attractor. The attractor is influenced by nonlinear forcing functions that encode the desired movement characteristics, such as shape, timing, and scaling. One of the key strengths of DMPs is their ability to generalize and adapt to different movement contexts. This adaptability is achieved through a process known as trajectory reproduction: DMPs can dynamically adjust the attractor based on sensory feedback, including position, velocity, and external forces. This real-time adaptation enables robust and flexible motor control through phase and frequency resetting [42]. In addition to capturing the rhythmic patterns of walking, DMPs are also beneficial because they can reduce the search space from which gaits are selected by imposing a biologically-inspired structure on joint-level trajectories.

1.2 UTILITY OF PREDICTIVE SIMULATION IN HUMAN LOCOMOTION

Experimental observations of human motor control and the correspondence of observed gait features with cost economy incentivize further use of optimization as a valid computational process representing the emergence of human motor behavior. Optimal control theory has been tested through computational modeling of human gait in the framework of predictive simulation.
Studies involving predictive simulations of human gait have tested optimality principles with various single or composite objectives by validating the predictions against experimental observations. Chow and Jacobson [43] were among the first to use optimization techniques in the analysis of walking control schemes and stability [14, 44, 45, 46]. They proposed the formalization of an optimal programming framework to simulate energy-optimal walking in the sagittal plane with seven degrees of freedom [43]. Walking models were later extended by the addition of muscles to the model dynamics to help predict more naturalistic kinematics, ground reaction forces, and muscle activation patterns that were more consistent with experimental measurements [47, 48].

Predictive simulations of bipedal walking using such models were also utilized to explain the features observed in human walking. For instance, performing a constrained optimization on a walking model using minimum metabolic cost as the primary objective predicted stride length values that were similar to those observed in walking experiments [23]. Multiple relations between speed and frequency during human walking were predicted by constrained optimization of energy cost [21]. Further computer optimizations of minimal models [49] and musculoskeletal models [50] of walking had success in predicting the walk-to-run transition speed thresholds for the optimized mechanical cost of transport [16] and metabolic cost of transport [51]. Other predictive simulations have investigated whether energy optimality can explain the emergent spatiotemporal features of walking, e.g., step lengths and step times, across different contexts. For healthy normal walking, gait patterns with equal step lengths and step times were predicted by energetically optimal simulations on a minimal model [52].

Computational modeling and simulations of human walking have also explored the use of a variety of non-energetic costs to explain locomotor control behavior. For instance, the inclusion of a metric related to peak joint loads in the objective predicted the knee flexion present during stance in a 2D walking model, resulting in more realistic gait patterns than those in prior work [53]. Taking into account a criterion such as jerk minimization in the objective had success in predicting more dynamically realistic behavior, resulting in torque patterns with smaller peaks lying within ranges computed from human demonstrations [54]. Further, measures of body and head angular momentum are other important objectives that have been added to conventional objectives, which helped predict patterns associated with gaits with less body rotation and more balance, consistent with experimental observations [55].

1.3 OTHER CONSIDERATIONS IN LOCOMOTOR CONTROL AND FAILURES OF SINGLE-CRITERION OBJECTIVES

Although several empirical and simulation studies suggest that energy optimality is a primary factor driving biological motor control behaviors, energy-optimal predictions fail to capture all the features adopted during locomotion [21, 56]. Several studies attempting to predict the natural ranges of the walk-to-run transition speed in human gait have found the naturally preferred transition speeds to be significantly slower than those corresponding to minimum levels of energy expenditure [57, 58, 23, 59].
Furthermore, studies on horses have revealed some conflicting evidence on trot-to-gallop transition features. While one study accurately predicted the minimum sustained galloping speed in horses using energy-optimal simulations of the trot-to-gallop transition [60], another simulation study predicted galloping speeds that were significantly different from the minimum sustained galloping speed ranges [61]. During downhill slope walking, individuals adopted step lengths and times that were not energy-optimal, reportedly adopting strategies that would maximize their stability [62]. Zelik et al. found that during a jumping task, individuals avoided stiff landings at the cost of performing extra negative muscle work, corresponding to preferred strategies that are energetically suboptimal but are less painful and discomforting [63].

Studies of gait in individuals post-stroke have shown that, compared to the neurotypical population, individuals post-stroke demonstrate a 40% to 50% increased cost of transport, resulting in energetically suboptimal gait patterns [64, 65, 66, 67, 68]. A study on hemiparetic gait showed that, in individuals affected by stroke, self-selected speeds were slower than normal, corresponding to energetically suboptimal strategies [69]. In another study, when assisted by a prosthesis, amputees perceived higher rates of exertion at speeds that corresponded to the minimum metabolic cost of transport relative to less energy-economical speeds. They also preferred to expend extra energy but presumably reduce their perceived rates of exertion [70]. The suboptimal patterns preferred by amputees with increased co-contraction could potentially suggest that they opt for a more stable gait strategy despite the added physiological cost, consistent with reports of downhill walking in healthy individuals [62]. Individuals post-stroke may also have a psychological bias towards aesthetics [71, 72] and harmony, quantified by the presence of repetitive temporal proportions of gait phases, e.g., the relative proportion of swing time and stance time across steps [73], that may or may not result in energetically optimal strategies when walking.

Such instances of inconsistency between energy optimality predictions and natural observations stress the insufficiency of standalone energy optimality in predicting all the gait features across all conditions and populations. It is, therefore, likely that a composite objective composed of energetic as well as various other non-energetic criteria would better predict the gait features adopted by individuals [21]. For instance, the inclusion of jerk minimization in the objective function predicted smoother motions and more realistic behaviors with reasonably smaller torques [54]. Adding a fatigue-like criterion such as peak joint loads in the objective along with a muscular activation effort term predicted more realistic gait patterns, which led to the presence of knee flexion during stance in a 2D walking model [53]. In another study, measures of body and head angular momentum were other essential objectives that have been combined with effort terms to predict patterns associated with more balanced gaits [55].
Other relevant criteria in the locomotion domain include effort costs related to joint torque, muscle force, and activation patterns [48, 74], deviations from desired velocity and acceleration at the center of mass [75], and deviations from the capture point and vertical center of mass oscillations [76]. Performance accuracy and variability have also been among essential factors, especially in the upper-limb reaching domain [77, 78].

The evidence raised in empirical as well as simulation studies corroborates that a trade-off among several of the objective and subjective factors raised here most likely drives motor pattern choices in biological organisms. What the trade-off among different factors is, and whether and how this trade-off can change depending on a range of conditions and considerations, remains to be addressed.

1.4 INVERSE OPTIMIZATION AS A TOOL TO INFER MOTOR COSTS

As discussed in Section 1.3, several theories in human motor control have posed the question as to what the main factors influencing the emergence of motor control behavior are. One theory stems from empirical evidence on natural evolution that emphasizes a compromise among many factors, such as speed, acceleration, maneuverability, energy economy, and stability [11]. According to this theory, motor strategies are selected such that multiple objectives become optimal simultaneously [79]. As such, the goal of motor behaviors computationally amounts to the solution of either a multi-objective optimization problem or a new problem in an optimal control framework referred to as inverse optimal control (IOC), also introduced in Chapter 5. In a multi-objective optimization process, the precise contribution of each criterion is not clear, and in lieu of a unique optimal solution, a set of optimal solutions, also known as a Pareto front, is obtained in the multi-dimensional space of all criteria [80, 81]. Inverse optimal control, on the other hand, approaches an optimization problem by obtaining a unified mechanistic ensemble of all the different criteria as a composite objective.

Inverse optimal control (IOC) is a computational approach used in the optimal control context to infer an objective function that, when optimized, gives rise to an observed behavior. Unlike a forward optimization problem that aims to obtain the control laws that optimize a given objective function, in an IOC problem only the behavior is known a priori, and the attempt is then towards learning the correct combination of criteria forming the composite function that, when optimized, would make predictions of the walking behavior consistent with those given. However, what the appropriate group of criteria should be and to what extent they each contribute to the emergence of motor behavior remains to be addressed. Thus, in an IOC problem, a primary step in cost inference is to choose the structure of the composite objective function. A composite objective can be a non-parametric function of the components, wherein there are no assumptions on the form of the trade-off among components, which can be solved by, e.g., a multi-objective optimization [82]. However, a more common approach used in biological motor control opts for a parametric composite objective function, wherein the composite objective is assumed to take on a certain shape that is a function of the different components, such as a linear or exponentially weighted combination of all components [83, 84].
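For concreteness, the most common parametric form, a linear weighted sum, can be sketched as follows (the notation here is a generic illustration rather than the specific formulation developed in later chapters):

\[
J(x, u) = \sum_{i=1}^{K} w_i \, \phi_i(x, u), \qquad w_i \ge 0,
\]

where the $\phi_i$ are candidate cost components (e.g., mechanical work, squared joint torques, jerk) and the $w_i$ are their unknown relative weights. The corresponding inverse problem is then to find weights $w$ such that forward optimization of $J$ approximately recovers the observed behavior,

\[
x^{\mathrm{obs}} \approx \arg\min_{x,\, u} \; J(x, u) \quad \text{subject to the system dynamics and task constraints.}
\]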
A parametric function is well-defined in its structure since its components are selected in advance. Thus, the goal of IOC will be to solve for a set of either time-varying or constant weights determining the contribution of each selected component to the composite objective function. A common parametric composite objective function takes the form of a linear weighted sum of a group of components, inspired by the classical objective functions defined as a sum of state cost and control cost in optimal control frameworks, such as linear quadratic regulators (LQR) [85]. Hereafter, an IOC problem refers to one where the composite objective function is parametric and represented as a linear combination of a set of cost components.

Inverse Reinforcement Learning (IRL) is an alternative formulation of the cost inference problem, generally used when the resulting optimal control policies are of more interest than the sparse cost function as inferred by IOC. Inverse optimal control (IOC) and inverse reinforcement learning (IRL) are both approaches used to understand motor control by inferring the underlying reward or cost functions. However, IOC is a more appropriate framework to characterize a human motor control problem due to the interpretability of the inferred costs and the better capability of trajectory optimization offered by IOC to model locomotor behavior. Further details on the representation, implementation, goals, and applications of the two main principled approaches of IOC and IRL utilized in reward or cost inference will be provided in a review in Chapter 5. Additionally, various considerations in both cost inference frameworks, such as the choices of costs and rewards, the type of optimality of the given example motor behaviors, and the representation of stochasticity, will be discussed.

1.4.1 From Optimal Control to Inverse Optimal Control: A Commentary on Terminology

Optimal control has historically been proposed and used as a fundamental theoretical framework for controlling dynamical systems. The term "Optimal Control" typically refers to the problem of finding the optimal control policy for a system given a known cost function or reward structure, characterized by the key fundamental Hamilton-Jacobi-Bellman (HJB) equation [9, 86]:

\[
\frac{\partial V^{*}(x(t))}{\partial t} + \min_{u(t)} \Big[ L\big(x(t), u(t)\big) + \nabla V^{*}\big(x(t)\big) \cdot f\big(x(t), u(t)\big) \Big] = 0, \qquad (1.1)
\]

where the system dynamics are described via the state equation $\dot{x} = f\big(x(t), u(t)\big)$, and $L\big(x(t), u(t)\big)$ denotes the instantaneous cost function quantifying the cost associated with being at state $x(t)$ and applying control $u(t)$. Additionally, $V^{*}\big(x(t)\big)$ is the optimal value function, representing the expected cumulative cost or reward associated with being at state $x(t)$ and following an optimal control policy thereafter, and $\nabla V^{*}\big(x(t)\big)$ denotes the gradient of the optimal value function. Equation 1.1, which is a partial differential equation in nature, represents the necessary optimality condition used as the key element characterizing any optimal control problem.

On the other hand, what is referred to as "Inverse Optimal Control" in the literature refers to the problem of identification of the cost function or reward structure that, when minimized or maximized, respectively, would make predictions that align with some observed behavior.
Thus, referring to the cost inference frameworks as "inverse optimal control" may not be strictly accurate, since they do not formalize and represent the problem as an optimal control problem characterized by the specific representation laid out by the HJB equation. Cost inference frameworks, especially those based on trajectory optimization methods such as bi-level or brute-force optimizations, could be, at best, considered a parameter estimation problem based on approximate least squares. Even with a single-level cost inference framework based on Karush-Kuhn-Tucker (KKT) conditions, the solution to the inverse problem amounts to estimating the cost weights for an underdetermined system of equations. In this case, the cost weights are obtained from solving an approximate algebraic least-squares problem that satisfies the given equations to some degree but does not provide a unique and precise optimal solution [87, 88]. Inspired by the relevant literature, the terms "Inverse Approximate Optimal Control", "Inverse Optimization", "Inverse Least-Squares", "Cost Inference", and "Cost Parameter Identification" could be appreciated as reasonable alternatives that could replace the term "Inverse Optimal Control" when characterizing a cost inference problem. However, to remain consistent with a large body of literature in the field, the term "Inverse Optimal Control" will be used when referring to methods of cost inference that are not based on reinforcement learning and, in particular, Markov decision processes.

1.5 ALTERNATIVES TO OPTIMALITY

Although optimality principles can explain several features of motor control and learning, they rely on algorithms with a greedy nature that explore a given cost landscape for the unique global or local optima. However, achieving the global optimum is not necessarily the goal of a biological learning process, nor does settling for a unique solution explain the natural diversity observed within individuals' locomotor strategies. Repeating the same task in the same context would result in at least a slightly different motor control behavior.

There are several reasons why true optimization is unlikely to govern adaptive motor learning. An optimization-based algorithm typically assumes access to an accurate representation of the cost function landscape, the multidimensional map that quantifies the cost values for any given control strategy. However, in biological learning, uncertainty within the sensorimotor system due to noise and delay at the sensory processing level may impair the learner's ability to evaluate their performance accurately [89]. Furthermore, optimization-based algorithms can be allowed to implement an exhaustive search through the control-cost map space until a unique optimal control policy is obtained. This not only requires more exploration time than may be practically plausible, but it also relies on a memory capable of storing all information on continuous updates to the cost landscape. A biological learner may rather settle for control laws that are only feasible [90], i.e., satisfy the mechanical constraints, or good-enough [38], i.e., are locally stable on the cost landscape and result from refinement around a habitual repertoire of strategies but are not necessarily globally optimal. In turn, the costs associated with both the exploration and learning time and the cognitive load associated with memory limitations are remarkably reduced.
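As a minimal illustration of the contrast between exhaustive optimization and such satisficing, the Python sketch below refines a habitual strategy on a hypothetical one-dimensional cost landscape and stops once improvements fall below a sensed threshold; all names and numbers here are illustrative and are not taken from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(step_length):
    """Hypothetical cost landscape with a shallow basin (stand-in for an effort-like cost)."""
    return (step_length - 0.7) ** 2 + 0.02 * np.sin(20.0 * step_length)

def good_enough_search(habitual=0.55, spread=0.02, tol=1e-3, patience=10, max_trials=500):
    """Refine around a habitual strategy; settle once improvements stay below tol."""
    current, best = habitual, cost(habitual)
    stalled = 0
    for _ in range(max_trials):
        candidate = current + spread * rng.standard_normal()  # small exploratory perturbation
        c = cost(candidate)
        if best - c > tol:           # discernible improvement: adopt the new strategy
            current, best = candidate, c
            stalled = 0
        else:                        # no discernible improvement on this trial
            stalled += 1
            if stalled >= patience:  # shallow gradient sensed: stop exploring (good enough)
                break
    return current, best

strategy, value = good_enough_search()
print(f"good-enough step length: {strategy:.3f}  cost: {value:.4f}")
```

Because the stopping rule depends on the starting habit and the sensed improvement threshold rather than on reaching the global minimum, repeated runs settle on a family of near-equivalent, good-enough strategies.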
1.5.1 Feasibility Theory

As an alternative to optimization, some suggest that for a control strategy to be a candidate, all it needs is to exist in the feasible activation space. A feasible activation space for a given motor task is a family of valid solutions at the mechanical or neuromuscular level, e.g., joint torque or muscle activation patterns, that, when recruited, satisfy the biomechanical or organismic constraints as well as the mechanical constraints manifested as motor task demands. Take, for instance, an arm-reaching task, wherein the individual is asked to reach from point A to point B. As long as both points are within the person's reach without violating their organismic constraints, and as long as the individual completes the task by moving their hand from point A and landing at point B, one can claim their selected strategy is feasible. Similarly, a feasible strategy in a locomotor control task on a treadmill would be for the person to not fall off the treadmill and to not drift, i.e., to adjust to and more or less maintain the commanded treadmill speed. Thus, Feasibility Theory has been proposed as a conceptual framework believed to drive biological motor learning through the physics of the body and the properties of the task, and not necessarily through a value function scoring the performance [90]. Feasibility Theory consists of heuristic local searches driven by the memory of the likelihoods of different solutions, as a result of which a repertoire of useful, yet likely suboptimal, motor habits is adopted, learned, and stored for later use in a novel condition or task in the form of a look-up exploration.

However, upon validation for biological motor control and learning, feasibility alone may not explain why people choose certain strategies for different tasks. For instance, from demonstrations of adaptive gait on a split-belt treadmill, it is clear that individuals do not adapt their walking just to avoid falling or drifting. In particular, from the large pool of feasible gait strategies, they seem to target regions and adopt generally stereotypical patterns and gait features that have a time course similar to that of metabolic cost or that are associated with lower levels of energy cost [33, 91, 34]. This may indicate that feasibility is one necessary component in learning but may not be sufficient to provide a mechanistic understanding of motor control, and other principles may be required.

1.5.2 Good-enough Control

It is plausible that individuals are willing to adopt a range of potentially different strategies, all of which are almost equally good enough with respect to a cost [38]. A potential good-enough strategy can be selected through an iterative assessment of performance, where the search for a control policy ends when a threshold performance is achieved. For instance, a successful strategy can be found where the gradient of the cost landscape is shallow. That is, once no discernible improvement is sensed in the recently selected strategy over experienced ones, the biological learner stops exploring. In addition, when exposed to a novel context, a biological learner is more likely to start by using an accessible repertoire of previously learned strategies. As a result, the search process is efficiently reduced from de novo learning to a look-up search around habitual strategies and some refinement of those.
Further, once learned, either replicas or fine-tuned renditions of these suboptimal motor habits are likely to persist should the biological learner interact with the same context again [6]. One can implement this learning process through bioinspired heuristic algorithms such as reinforcement learning [92, 93, 94] and gradient descent [95].

1.6 CONCLUSION

A set of the key points discussed thus far is listed below:

• Energy optimality has proved potent in explaining several features of gait. However, the role of alternative theories to optimality, such as feasibility and good-enough control, in predicting families of plausible solutions is yet to be investigated.

• Trajectory optimization is a principled framework that can be applied in simulations of human locomotor control. However, when the system's dynamics are complex, primitive-based control laws can replace the more computationally costly trajectory optimization approaches.

• Although energy optimality may explain several gait features, especially in normal level-ground walking, it may fail to explain gait features across different contexts. It is highly likely that a compromise among various costs underlies emergent locomotor behavior.

• Studies of inverse optimal control have attempted to understand the cost trade-off underlying locomotor behavior. However, considerations of cost multicollinearity resolution and inference of distributions of cost weights, rather than a unique combination, have been understudied.

Chapter 2
Specific Aims

The present study proposes using predictive simulation as a computational tool to test the role of optimal control in shaping human locomotor behavior. We also investigate the role of alternative approximate theories in motor control, such as feasibility and good-enough control, utilized to explain human locomotor behaviors. Although energy optimality can predict features of locomotion in some conditions, it fails to fully predict some features such as gait transition speeds [57, 96, 97], step lengths and times during downhill walking [62], or self-selected speed in people post-stroke [98]. It is likely that other criteria such as jerk [99, 100], effort costs related to joint torque and muscle force or activation patterns [48, 74], performance accuracy or variability [77, 78], stability [37], and several others may need to be taken into account to better predict the features adopted when walking. However, how these criteria are balanced to drive the locomotor patterns that people select remains to be seen.

One plausible approach to address this uncertainty is to model the locomotor control problem in an inverse optimal control (IOC) framework where the underlying objective of observed behavior is inferred as a weighted combination of several candidate criteria [101, 102, 84]. However, since several hypothesized criteria are likely to correlate, a conventional IOC problem would not be able to disambiguate the individual effects of correlated criteria from one another [87]. One solution to circumvent this problem is to utilize a dimensionality reduction algorithm in the IOC to form new composite cost components that are independent, so-called basis functions, to use in lieu of the individual components. The IOC problem then changes to the identification of weightings on the composite cost components, each of which is an independent sum of the individual criteria with fixed weights.
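As an illustration of how such independent composite components can be obtained, the Python sketch below applies a principal component analysis (via the singular value decomposition) to a matrix of candidate cost values evaluated over many gaits; the synthetic data, the component count, and the 95% variance threshold are illustrative assumptions rather than the dissertation's actual pipeline.

```python
import numpy as np

# Hypothetical cost-feature matrix: each row is one simulated gait, each column one
# candidate cost (e.g., positive work, summed squared torques, jerk, angular momentum).
rng = np.random.default_rng(1)
n_gaits, n_costs = 200, 6
latent = rng.standard_normal((n_gaits, 2))             # two underlying factors
mixing = rng.standard_normal((2, n_costs))
costs = latent @ mixing + 0.05 * rng.standard_normal((n_gaits, n_costs))

# Standardize the candidate costs, then diagonalize their correlation structure (PCA via SVD).
z = (costs - costs.mean(axis=0)) / costs.std(axis=0)
u, s, vt = np.linalg.svd(z, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Keep enough components to explain ~95% of the variance among the candidate costs.
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
loadings = vt[:k]             # each row: fixed weights defining one independent cost basis
cost_bases = z @ loadings.T   # values of the composite (basis) costs for each gait

print(f"{k} independent cost bases explain "
      f"{np.cumsum(explained)[k - 1]:.1%} of the variance among {n_costs} candidate costs")
```

An IOC weight then multiplies an entire basis, i.e., a fixed mixture of the original criteria, rather than a single, possibly collinear, criterion.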
In addition, the focus of the majority of cost inference studies has been on inferring only a unique cost combination. However, a range of costs may exist that, when used in optimization, would generate the same behavior, leading to a potential diversity at the motor cost level. Attempts have been made towards inference of a range of cost weights via approximate approaches, such as bounded-error frameworks [87]. However, the inherent dissection of the cost inference problem into multiple independent optimizations for the cost bounds, and the assumption that the cost weight space can be realized via a linear uniform interpolation between the cost weight bounds inferred from the behavior bounds, make the use of this approach far too limited. Thus, a Bayesian inference approach can help formalize a stochastic multivariate inverse optimization to infer a feasible range of cost weights that, when optimized, would recover the observed behavior. Therefore, our aim is to develop a feature-reduced Bayesian inverse trajectory optimization framework that accounts for redundancy in motor cost space and can identify ranges of plausible locomotor costs that reproduce observed locomotor patterns while disambiguating among different basis functions.

The aims and hypotheses of the present work are listed as follows.

Specific Aim 1: To determine the role of optimality and feasibility principles in predicting human gait during normal and adaptive walking through the use of model-based predictive simulations.

Current theories of adaptive walking suggest that healthy individuals adapt their gait to take advantage of the work provided by a split-belt treadmill, which, in turn, reduces the positive work performed by their legs [31, 33]. What remains to be seen is whether participants' behavior is consistent with energy-optimal predictions made via simulation. Here, we developed a nine-degree-of-freedom (DOF) model of bipedal walking on a split-belt treadmill, for which the belts can run at different speeds, in Matlab Simscape [103]. We first tested if this model could predict the known parabolic relationship between measures of the energetic cost of transport and walking speed [25]. We then determined if our simulations could predict the empirical observations of spatiotemporal features that people adopt during split-belt adaptation [33, 91, 34]. We also compared our simulation predictions with other simulation studies [104, 105].

Hypothesis 1a: We hypothesized that energy-optimal simulations of bipedal gait that minimized positive work per unit distance, i.e., the mechanical cost of transport, on a regular treadmill would predict a parabolic relation between the cost of transport and walking speed.

Hypothesis 1b: We also hypothesized that in simulations of split-belt treadmill walking, good-enough solutions with respect to costs related to effort and energetics would reduce the positive work rate required by the legs as the belt speed ratio increased. Moreover, our simulations would predict that people would step further forward on the faster belt and also spend less time in the stance phase on the fast belt as strategies to take advantage of the provided external positive work.

Specific Aim 2: To develop a novel Feature-Reduced Bayesian inverse optimal control framework for inference of a range of plausible costs in human locomotor control.

We will first use behavioral data and a bipedal walking model to identify the cost bases during steady-state walking.
Hypothesis 2a: We hypothesize that, given simulated behavioral data, the Feature-Reduced IOC will recover the true cost weights more accurately than the Naïve IOC, and that our Feature-Reduced Bayesian inference will predict the demonstrated synthetic gait data more accurately than the Naïve inference.

Hypothesis 2b: We hypothesize that the composite cost inferred by the Feature-Reduced IOC applied to human locomotor behavioral data will predict the observed gait patterns in novel conditions, i.e., novel walking speeds, better than a Naïve IOC would. We also compare the behavioral predictions made by the costs inferred via the two Bayesian IOC approaches against one another, as well as the predictions made by a typical classical cost such as energetics.

The structure of this thesis is laid out as follows. In Chapter 3, we present a framework for simulating bipedal gait, which will later be used in predictive simulations of optimal gait patterns. We modeled an anthropometric biped and a treadmill in Matlab Simscape and used Dynamical Movement Primitives (DMP) to define modifiable reference trajectories for joint-level actuators from demonstrations of human walking. Using optimization of gait patterns through a small number of spatiotemporal parameters via DMPs, we implemented a simple feedback controller to stabilize the biped based on deviation from reference states. In Chapter 4, we aim to predict features of energy-optimal and good-enough gait patterns with respect to energy at steady state across different walking speeds and a range of belt speed ratios for walking on a tied-belt and a split-belt treadmill, respectively. To achieve this, we developed an anthropometric 7-segment, 9-DOF biped model and, accordingly, a model-based predictive simulation to determine how energy-optimal and good-enough gaits in simulation compare to experimental observations and predictions made by prior simulation work in both tied-belt and split-belt walking contexts. Chapter 5 provides further details on the representation, implementation, goals, and applications of the two main principled approaches of inverse optimal control (IOC) and inverse reinforcement learning (IRL) utilized in reward or cost inference. Moreover, various considerations in both cost inference frameworks, such as the choices of costs and rewards, the type of optimality of the given example motor behaviors, and stochasticity, will be discussed. Chapter 6 aims to apply inverse optimal control to synthetic and behavioral data on locomotion and to investigate the effects of stochastic inference and dimensionality reduction in the cost feature space on the objective inference problem. To this end, we will first use behavioral data on human locomotion from a public data set and a five-segment biped model to reduce the redundancy in cost feature space and identify the cost bases for our objective inference problem. Utilizing the structural results from the feature reduction process, we will develop a novel feature-reduced Bayesian inverse optimal control framework with the purpose of inferring a range of locomotor cost weights that can predict given synthetic and human behavioral data.

Chapter 3
Development of a Platform to Evaluate Principles of Bipedal Locomotor Control Using Dynamical Movement Primitives †

† The content of this Chapter is published [103].

ABSTRACT

The control of bipedal locomotion is often considered to arise from the optimization of variables related to energetic cost and stability.
However, developing model-based predictions of optimal strategies can be challenging due to the high-dimensionality of the control space and challenges associated with optimizing potentially conflicting objectives. Here, we present a framework for simulating bipedal gait which can ultimately be used in predictive simulations of optimal gait patterns. We modeled a human-like biped and a treadmill in Matlab Simscape and used Dynamical Movement Primitives (DMP) to generate control joint-level controllers from demonstrations of human walking. DMPs facilitate the optimization of gait patterns through a small † The content of this Chapter is published [103]. 27 Chapter 3. A Simulation Platform to Evaluate Principles of Bipedal Locomotor Control number of free parameters, such as the amplitude and timing of the patterns. We also implemented a simple feedback controller that variated the amplitude of the DMPs to stabilize the biped based on the deviation from a reference point on the treadmill. Optimizing of this controller allowed us to generate a human-like gait and ultimately contributed to the development of a platform with which we can explore optimization principles during locomotion. Keywords: Bipedal locomotion; Dynamical Movement Primitves; Stabilizing Control; Optimization; Genetic Algorithm. 3.1 INTRODUCTION Patterns of healthy bipedal gait in humans are often thought to emerge from the optimization of one or more biomechanical or physiological criteria [16]. Current evidence suggests that many features of bipedal gait are selected to minimize energy consumption [106, 19]. Stability is another objective feature that the CNS must control when walking [107, 108]. Therefore, the use of an optimization-based framework can be helpful for understanding the costs that influence how people select walking patterns in different contexts. Although energetic cost and stability appear to be important costs underlying steady-state gait, there is also evidence that walking may be coordinated in part via central pattern generators (CPG) [40] that generate a rhythmic control policy (RCP). One powerful approach to modeling RCPs is through dynamical movement 28 3.1. Introduction primitives (DMP) [41], which model RCPs as weakly nonlinear dynamical systems. In addition to capturing the rhythmic patterns of walking, DMPs are also beneficial because they can reduce the search space from which gaits are selected by imposing a biologically-inspired structure to joint-level trajectories. One challenge with using DMPs for control is that joint-level control strategies alone are insufficient for stabilizing gait because the system is underactuated [109]. This means that there is no direct control of the translation and orientation degrees of freedom of the biped even if the desired joint level trajectories are tracked perfectly. What remains to be seen is if a simple, biologically-plausible feedback controller can be used in conjunction with DMP-driven control policies to maintain stability during bipedal gait. The objective of this study is to test a simple feedback control law derived from human behavioral data for stabilizing a simulation of bipedal locomotion on a treadmill using DMPs. We created a treadmill model and a simulated biped with the same anthropometric features as a healthy participant. DMPs were extracted from joint angle trajectories observed in the participant’s locomotion. Once the DMPs were learned, a simple feedback control law was used to modify the DMPs and stabilize the gait. 
This control law was based on experimental data demonstrating that foot placement during walking can be predicted from a linear function of the state of the torso [110].

3.2 METHODS

3.2.1 Dynamical Movement Primitives (DMPs)

We used dynamical movement primitives (DMPs) to model rhythmic control policies for bipedal locomotion. Separate DMPs for the ankle, knee, and hip joints were fit to data collected from a 40-year-old healthy female (155 cm tall, mass of 47 kg) who walked on a treadmill at 1.0 m/s. Each DMP is modeled as a second-order nonlinear dynamical system representing a damped spring around a nominal trajectory [41]. The DMP is formulated as

\[
\tau \dot{z} = \alpha_z\bigl(\beta_z\,(y_0 - y) - z\bigr) + f , \qquad \tau \dot{y} = z ,
\tag{3.1}
\]

where τ denotes the gait period, α_z and β_z are constants set so that the system is critically damped, y_0 denotes the average value of each joint angle trajectory, y denotes the joint angle trajectory, and f is a forcing term that allows the DMP to be fit to a given set of joint angle data. The forcing term is obtained for each joint angle trajectory through a learning algorithm. It is defined as a nonlinear function of the DMP's phase (ϕ) and amplitude (r) and is composed of a normalized weighted sum of Gaussian-like basis functions of the following form [111]:

\[
f(\phi, r) = \frac{\sum_{i=1}^{N} \psi_i\, \omega_i}{\sum_{i=1}^{N} \psi_i}\; r ,
\tag{3.2}
\]

where ψ_i = exp(h_i (cos(ϕ − c_i) − 1)) defines the i-th von Mises basis function, a Gaussian-like periodic function. N is the number of basis functions, set to 100 here. h_i and c_i specify the width and midpoint of each basis function, respectively. Here, h_i = 250, and the c_i are chosen so that the basis functions are equally spaced over each gait cycle. ω_i denotes the weight, i.e., the extent to which the i-th basis function contributes to the forcing term.

The DMPs are fit in multiple steps. First, the gait period is determined by extracting the dominant frequency of the behavioral data using a Fourier decomposition. Then, the average value of each joint angle trajectory (y_0) is calculated. Finally, the desired forcing term for each joint is computed from (3.1) as

\[
f_d = \tau^2 \ddot{y}_d - \alpha_z\bigl(\beta_z\,(y_0 - y_d) - \tau \dot{y}_d\bigr) ,
\tag{3.3}
\]

where f_d denotes the desired forcing term, and y_d, \dot{y}_d, and \ddot{y}_d represent the desired joint angle, angular velocity, and angular acceleration, respectively. From here, we determine the appropriate weights ω_i of the basis functions using locally weighted regression to minimize the error between the actual and desired forcing terms [112].

Figure 3.1: The representation of the biped-treadmill model in the Matlab Simscape environment.

3.2.2 Biped-Treadmill Modeling

To simulate bipedal walking, we first modeled a biped with anthropometric properties close to those of our test participant using Matlab Simscape (Fig. 3.1). The biped was constrained to walk in the sagittal plane and has 9 DOFs, including hip flexion/extension, knee flexion/extension, and ankle dorsi-/plantar-flexion, as well as horizontal/vertical translation and forward/backward tilt of the trunk. The segments in the model include the thighs, shanks, and feet bilaterally and a point-mass torso. The inertial properties of all segments were estimated from the participant's mass and height according to Winter [113].
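Assuming uniformly sampled joint angle data, the fitting procedure in Eqs. (3.1)-(3.3) can be sketched in a few lines of Matlab. The function name and argument list below are illustrative, and the typical choice of α_z = 25, β_z = α_z/4 for critical damping is an assumption rather than a value taken from the published implementation.

```matlab
function [w, y0, f_d] = fit_rhythmic_dmp(y, dt, tau, alpha_z, beta_z, N, h)
% Fit the forcing-term weights of a rhythmic DMP (Eqs. 3.1-3.3) to one
% joint angle trajectory sampled at a fixed interval dt.
%   y    : joint angles over an integer number of gait cycles
%   tau  : gait period (s), e.g., from the dominant frequency of y
%   N, h : number and width of the von Mises basis functions

y   = y(:);
y0  = mean(y);                        % baseline (average) joint angle
yd  = gradient(y, dt);                % angular velocity
ydd = gradient(yd, dt);               % angular acceleration

% Desired forcing term from the transformation system (Eq. 3.3)
f_d = tau^2 * ydd - alpha_z * (beta_z * (y0 - y) - tau * yd);

% Phase advances linearly with time, one cycle per gait period
t   = (0:numel(y)-1)' * dt;
phi = 2*pi*t / tau;

% Von Mises (periodic, Gaussian-like) basis functions, centers equally spaced
c   = 2*pi*(0:N-1)/N;
psi = exp(h * (cos(phi - c) - 1));    % (samples x N), implicit expansion

% Locally weighted regression for the weights, with amplitude r = 1
r = 1;
w = sum(psi .* (r * f_d), 1) ./ (sum(psi * r^2, 1) + eps);
w = w(:);
end
```

A call with, for example, N = 100 and h = 250 (the basis settings quoted above) returns one weight per basis function for the given joint; repeating the call per DOF yields the full set of baseline control policies.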
The contacts between the biped and the treadmill were modeled as sphere-to-plane contact forces using the Simscape Multibody contact forces library [114]. Contact spheres were modeled at each corner of the feet. The force generated during contact was modeled via viscoelastic elements with viscous friction as follows:

\[
F_n = -k_y\, p_y - b_y\, \dot{p}_y , \qquad F_t = -\mu\, v_s ,
\tag{3.4}
\]

where F_n and F_t denote the normal and friction forces, respectively. p_y, \dot{p}_y, and v_s denote the vertical penetration, the vertical penetration rate, and the shear velocity of each sphere on the treadmill surface, respectively. k_y and b_y denote the stiffness and damping coefficients and are set to k_y = 2×10^3 N·m^(-1) and b_y = 1×10^5 N·s·m^(-1), respectively. µ represents the viscous friction coefficient and is set to µ_k = 0.6 and µ_s = 1.0 for the kinetic and static cases, respectively.

3.2.3 Stabilizing the Gait via Control Policies Update

Once the model of the biped and the treadmill was prepared and the control laws for all lower-extremity joints were extracted as DMPs, we simulated the biped using the approach illustrated in Fig. 3.2. Once the weight vectors were obtained (Section 3.2.1), the "Update DMP Trajectories" block used these weights to compute the corresponding forcing terms according to (3.2). After substituting the forcing terms into (3.1), the joint angle trajectories y, \dot{y}, and \ddot{y} were obtained. To stabilize the biped, the forcing functions were modified at each midstance based on the torso state. Stabilization occurred through variation of the amplitude values r as follows:

\[
\delta r = -Kp_r\, \delta x_{torso} - Ki_r \int \delta x_{torso}\, dt - Kd_r\, \dot{\delta x}_{torso} ,
\tag{3.5}
\]

where δr = r_des − r_ref, with r_ref = 1 denoting the reference DMP amplitude and r_des the desired DMP amplitude tuned by PID control. Further, δx_torso is the signed distance between the torso and the center of the treadmill.

Figure 3.2: Block diagram of the model dynamics simulator, genetic algorithm optimization, and control scheme with PID controllers for stabilizing walking on a treadmill.

The position controller gains are tuned such that the cumulative error in biped pitch is minimized while maximizing the simulation time. To this end, the following optimization problem is solved:

\[
\min_{[Kp_r,\, Ki_r,\, Kd_r]} \int \left(\theta_p - \theta_{p_{des}}\right)^2 dt \;+\; K_{pen}\left(t_{des} - t_{sim}\right)
\quad \text{s.t.} \quad [Kp_r,\, Ki_r,\, Kd_r] \in \chi ,
\tag{3.6}
\]

where the term K_pen(t_des − t_sim) penalizes falls that occur at time t_sim before the desired episode duration t_des is achieved. Setting the penalty coefficient K_pen to a large value, e.g., K_pen = 10^5, heavily penalizes losses of balance before the desired simulation duration is reached. χ is the feasibility region in which the gain variables Kp_r, Ki_r, and Kd_r must lie. We used a Genetic Algorithm (GA) from the Global Optimization toolbox in Matlab to find the optimal controller gains for the biped.

3.3 RESULTS

The trained DMPs fit the joint angle trajectories for all DOFs accurately (Fig. 3.3), resulting in an average coefficient of determination of R² > 0.99. The optimal gains for the biped position controller were Kp_r = 0.27, Ki_r = 0.562, and Kd_r = 0.033. This set of controller gains allowed the biped to walk for 25.3 seconds, whereas the biped was only able to walk for approximately 2 seconds in the absence of the feedback control component.
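As a concrete illustration of the stabilization and tuning scheme in Eqs. (3.5)-(3.6), the minimal Matlab sketch below shows a midstance amplitude update and the penalized objective that a genetic algorithm could minimize over the gains. The two functions would be separate files in practice; `simulate_biped` is a placeholder for running the Simscape model, and all names and default values are illustrative rather than the actual published implementation.

```matlab
function r_des = midstance_amplitude_update(x_torso, dt, Kp, Ki, Kd, r_ref)
% PID-style update of the DMP amplitude at each midstance (Eq. 3.5).
%   x_torso : torso position relative to the treadmill center (m), sampled
%             since the start of the trial; the last entry is the current value
if nargin < 6, r_ref = 1; end
dx     = x_torso(end);
dx_int = trapz(x_torso) * dt;                   % integral of the deviation
if numel(x_torso) > 1
    dx_dot = (x_torso(end) - x_torso(end-1)) / dt;   % finite-difference rate
else
    dx_dot = 0;
end
delta_r = -(Kp*dx + Ki*dx_int + Kd*dx_dot);
r_des   = r_ref + delta_r;
end
```

```matlab
function P = gain_cost(gains, simulate_biped, t_des, K_pen)
% Penalized objective minimized by the GA over [Kp Ki Kd] (Eq. 3.6).
% simulate_biped is a placeholder handle that runs the Simscape model with the
% given gains and returns the integrated squared pitch error and the time at
% which the simulation ended (a fall, or t_des if the episode completed).
[pitch_err_int, t_sim] = simulate_biped(gains);
P = pitch_err_int + K_pen * (t_des - t_sim);
% Example use with the Global Optimization Toolbox (bounds lb, ub assumed):
%   best = ga(@(g) gain_cost(g, @simulate_biped, 30, 1e5), 3, [],[],[],[], lb, ub);
end
```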
The simulated DMP-driven joint angle trajectories for all DOFs were consistent with the actual motion data over a gait cycle (Fig. 3.4). The stride-to-stride spread of the simulated trajectories provides a measure of the variability in the commanded DMP amplitude, and this variability reflects the need for stabilizing responses during the simulation. However, the relatively small RMSE values show that the biped's stability can be controlled with only minor deviations from the trained DMP trajectories.

Figure 3.3: DMPs trained with observed human joint angle data for all DOFs (hip, knee, and ankle angles, bilaterally, plotted against time). The coefficient of determination (R²) and the root mean square error (RMSE) for each DOF represent the variance explained and the prediction error norm, respectively (R² between approximately 0.990 and 0.998; RMSE between approximately 0.92 and 2.06 deg across joints).

Figure 3.4: The strides of the simulated (tuned) DMP-driven joint angle trajectories and the actual (baseline) data over gait cycles for both left and right legs (mean RMSE per joint between approximately 1.0 and 2.4 deg).

Figure 3.5 represents how well the controller stabilized the torso and, consequently, the biped. The position of the biped was calculated as its signed distance from the center of the treadmill. The pitch of the biped was calculated as the rotation angle of the torso about the frontal axis (Y-axis), as shown in Fig. 3.1. The pitch is a measure of stability, while the position of the biped provides a measure of how successfully the biped maintained the desired position on the treadmill. The eventual failure at the end of both trajectories is due to a backward fall, which the DMPs are unable to rescue.

Figure 3.5: The biped torso translational (relative position, m) and rotational (pitch angle, deg) states over the simulation time.

The spatiotemporal data of the optimal solution for biped walking were also obtained as a function of both the stride number (Fig. 3.6) and the biped position with respect to the center of the treadmill (Fig. 3.7). The change in step length was an approximately linear function of the biped's distance from the set-point position on the treadmill (Fig. 3.7). As the biped tended to accelerate past the center of the treadmill, the controller applied a reduction in the DMP amplitude that led to a reduced step length. As the biped dropped behind the reference position, it returned toward the target position by increasing the DMP amplitude and, consequently, increasing the following step length.

Figure 3.6: Spatiotemporal parameters (step time and step length change) of the biped optimal gait on the treadmill vs. stride number.

Figure 3.7: Changes in spatiotemporal parameters of the biped optimal gait versus the biped torso relative position on the treadmill (linear fits for step length change: R² = 0.84 for the left foot and R² = 0.62 for the right foot).

3.4 DISCUSSION

Here, we evaluated the capacity of a simple feedback control law to stabilize a simulated biped driven by dynamical movement primitives. The form of the feedback control law was derived from human studies and enabled the biped to reject perturbations and maintain position on the treadmill. The use of a DMP-based framework allowed us to generate human-like behavior through simple modulation of the amplitude of the joint-level trajectories. Our spatiotemporal results were also in line with values that one would observe from human data. The optimal solution had the biped walk for a significantly longer time compared to the case where only the actual joint angle data were fed to the joints with no update law on the DMP output.

Although the combination of DMP-based joint-level control and a human-inspired stabilization strategy improved the stability of the biped when walking on the treadmill, some aspects of the control can stand to be improved. For example, updating the gait period based on the biped's position or pitch could lead to more robust control [115]. Alternatively, the addition of inter-limb coupling and of phase and frequency resetting could enhance stability [42]. The strategy for optimizing our feedback controller is also an essential issue to consider. Here, we used a genetic algorithm to find controller gains that produced a significant improvement in simulation time. However, genetic algorithms are sensitive to the number of generations and the population size, depending on the structure of the optimization problem, and may get stuck at local minima if the distribution of solutions at each generation is not dispersed enough. Therefore, a potential future direction to pursue is to exploit other evolutionary optimization strategies, such as Covariance Matrix Adaptation [116], to improve the likelihood of finding globally optimal solutions. Ultimately, we hope to use this platform to explore the feasibility of using inverse optimization strategies [117] to extract cost functions from observed behavior. As an example, the current biped model can easily be extended to explore control strategies underlying adaptation to walking on a split-belt treadmill [118, 119]. Our model could be used to find energy-minimizing strategies for walking on a split-belt treadmill, and the resulting behavior can be compared with experimental data.

Chapter 4

Selecting Actions Based on Cost Gradients Reveals Scaling of Gait Asymmetry With Belt Speed Ratio During Split-belt Walking

ABSTRACT

Simulation studies have attempted to uncover underlying principles of human adaptive gait during split-belt walking through optimality principles.
For split-belt walking, the simulation studies typically have predicted one unique solution, corresponding to substantially larger positive values of spatiotemporal asymmetry, when compared to those observed experimentally. The discrepancies in predictions may suggest that other principles than sole energy optimality may promote different action selection algorithms that can better explain the adopted gait features. Thus, it has yet to be determined what is the main underlying principle that can explain the observed ranges of gait features. Our goal was to implement an action selection algorithm on a mechanical model of bipedal walking, obtaining suboptimal solutions 42 that are feasible and good enough with respect to energetic costs, and predict the features of split-belt gait in humans. We first obtained a large set of feasible solutions via a genetic algorithm optimization, which then was used to fit a surrogate cost landscape. Implementing a gradient descent with gradient thresholding along the surrogate cost landscape, we next obtained potential good-enough solutions as those in basins around local minima of the surrogate cost landscape. We performed simulations of good-enough control for belt speed ratios ranging from 1.0 to 3.0 with an average belt speed of 1.0 m/s. Our predictions revealed a significant increase in step time and step length asymmetry, and accordingly, a significant decrease in net positive work rate by the legs with belt speed ratio. In particular, positive step time asymmetries of (Mean ± SD) 0.138 ± 0.091 and positive step length asymmetries of (Mean ± SD) 0.145 ± 0.130 were observed to predominantly give rise to our predicted family of good-enough solutions for split-belt walking at the speed ratio of 3:1. Our findings from the good-enough simulations of split-belt walking are in agreement with the hypothesis that by spending less time in contact with the fast belt and placing the foot further forward on the fast belt, people take advantage of the work provided by the split-belt treadmill to reduce the positive work required by the legs. These results suggest that settling for good-enough solutions and not optimization per se could drive human locomotor control, especially in novel contexts such as adaptation to split-belt treadmill walking. Individuals may explore and exploit solutions that are just good enough and meet the physical demands of the task. 43 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio Keywords: Bipedal locomotion; Energy optimality; Good-enough Control; Predictive simulation; Split-belt treadmill; Adaptation; Evolutionary optimization; Feasibility. 4.1 INTRODUCTION Several studies on human motor control have proposed various normative theories attempting to understand the factors driving human movements [120, 121, 122, 123, 1]. A common feature of these theories is that they each propose that motor strategies are selected to reduce costs associated with physical effort, such as metabolic cost [14, 15, 16]. For instance, individuals tend to naturally select gait features, such as walking speed, that minimize their metabolic cost of transport [15, 19, 25, 26]. Not only does energy economy explain characteristics of well-practiced behaviors, but it may also underlie adaptive locomotor control in novel conditions. One common approach to studying adaptive locomotor control and learning is split-belt treadmill walking wherein the belt under each leg moves at a different speed [118, 31]. 
After extended experience walking on the treadmill, participants adopt asymmetric gait where the step times and foot placement locations differ on the fast and slow belts [32, 33, 34]. Stepping further forward on the fast belt allows individuals to take advantage of the external work provided by the treadmill, reducing the positive work performed by the legs, hence, reducing the energetic cost [33]. However, it was also shown that individuals adopt asymmetric step times that are energetically optimal [34]. While there were observations of people adopting asymmetric step lengths to be 44 4.1. Introduction optimal [33], asymmetric step times were preferred by individuals in the other study [34]. The discrepancies in these findings may be because one held stride length constant while varying step length asymmetry [33], whereas the other varied either step length asymmetry or step time asymmetry while keeping the other constant [34]. Although empirical studies have been able to observe a reduction in the cost of walking in participants’ adaptation given an experience of a limited range of walking conditions, these studies are unable to determine what strategies are best with respect to a given cost. Empirical cost mapping involves a local search within a limited subset of the behavioral feature space. This mapping typically occurs through varying one feature at a time while the others are clamped to determine the relationship between a target spatiotemporal feature and costs such as metabolic cost and mechanical work [33, 34]. As a result, these studies could not achieve a simultaneous variation of different features. As such, they are not fully capable of offering a more extensive experience of and evaluating all different plausible walking strategies in a novel context. Thus, due to only partial experience and the experimental paradigm designs specifically customized for each study, empirical studies cannot strongly support the presence of any such principles underlying the behaviors. These limitations justify the need for a more systematic and comprehensive evaluation and testing of the energy optimality principles in adaptive locomotor control. An alternative approach to investigate the role of optimality principles in locomotor behavior is model-based predictive simulations of bipedal locomotion. Model-based 45 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio simulations are capable of simultaneously exploring the effects of varying multiple gait features, such as step length asymmetry, step time asymmetry, stride time, and stride length. The use of a gradient-based exploration through energetic cost landscape in simulations of split-belt gait for a minimal biped model [16] predicted positive step length asymmetry and step time asymmetry as optimal patterns at steady-state [104]. Further, minimization of muscle activations cubed in simulations of a musculoskeletal biped model walking on a split-belt treadmill predicted substantial positive step length asymmetries, associated with reduced metabolic rates below that of tied-belt walking at the same average belt speed [105]. Despite predicting longer steps on the fast belt as energy-optimal, simulations have typically predicted step length asymmetry levels (0.26 for belt speed ratio of 3 : 1 [105], and approximately 0.4 for belt speeds of 0.45 : 0.35 ms − 1 [104]) that are an order of magnitude larger than those observed (0.028± 0.01 [33]). 
First, it is possible that other factors, such as stability, excessive loads, or smoothness, as well as energy, influence human adaptation. Furthermore, simulation studies often explore the cost landscape to find only one solution that is globally or locally optimal. Although optimality principles can explain several features of motor control and learning, they rely on algorithms with a greedy nature that explores a given cost landscape for the unique global or local optima. There are several reasons why true optimization is unlikely to govern adaptive motor learning. An optimization-based algorithm typically assumes access to an accurate representation of the cost function 46 4.1. Introduction landscape. However, in biological learning, uncertainty within the sensorimotor system due to noise and delay at the sensory processing level may impair the learner’s ability to evaluate their performance accurately [89]. Furthermore, optimization-based algorithms can be allowed to implement an exhaustive search through control-cost map space until a unique optimal control policy is obtained. This not only requires more exploration time than may be practically plausible, but it relies on a memory capable of storing all information on continuous updates to the cost landscape. A biological learner, on the contrary, may rather settle for suboptimal policies to reduce both the cost of search time and the cognitive load associated with memory limitations. It is plausible that a biological learner, such as an animal or a human, explores different strategies in the form of a trial-and-error learning process and selects a repertoire of good-enough motor behaviors that are locally stable, i.e., lie within a basin around a local minimum, when evaluated on the landscape of some cost, e.g., energetics [38]. Good-enough solutions, by definition, are those strategies in the vicinity of which the cost gradient is shallow, amounting to suboptimal locally stable strategies. An important instance of locally stable solutions could be where a range of recent strategies have incurred approximately equal cost values. Further, once learned, replicas or fine-tuned renditions of these motor habits that are suboptimal are likely to persist in the event the biological learner were to interact with the same context again [6]. 47 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio Here, we investigated whether the principle of good-enough control with respect to a range of effort-related cost functions could explain empirical features of gait observed during split-belt walking. We developed and used an anthropometric 7-segment, 9-DOF biped model to implement predictive simulations of bipedal walking on a split-belt treadmill. We first tested the hypothesis that solutions that are good enough with respect to positive mechanical work performed by the legs are associated with stepping further forward and spending less time on the faster belt, in line with the experimental observations. We then aimed to determine how the features of good-enough locomotor strategies varied across a wide range of split-belt conditions. 4.2 METHODS This section describes the developed model of walking and the framework for performing optimization-based simulations of human locomotion. First, a description of the biped model we developed for our simulations is given. 
Next, we introduce the dynamical movement primitives (DMPs) as a set of rhythmic control policies used to create baseline joint angle patterns that actuate the joints in the biped model. We use DMPs due to their inherent modular nature, which would allow for a low-dimensional tuning of motor control patterns. We propose a framework for optimization-based predictive simulations of gait via manipulation of a set of spatiotemporal scaling and shift variables in the DMPs. We include an additional 48 4.2. Methods component of good-enough control in our optimization paradigm based on which we select good-enough strategies rather than a single globally optimal solution. 4.2.1 Biped-Treadmill Model We developed a biped model with seven segments and nine degrees of freedom walking on a split-belt treadmill in Matlab SimScape (Mathworks, Natick, MA) [103]. The segments consisted of the feet, the shanks, the thighs, and a lumped point-mass trunk, and the model is free to move in the sagittal plane with dorsi/plantar-flexion in the ankles, flexion/extension in the knees, and flexion/extension in the hips (Fig. 4.1A, B). The trunk was also free to move with two translational DOFs in the anterior/posterior and superior/inferior directions, and one rotational DOF , i.e., the pitch about the sagittal axis. The mass, length, and inertial properties of the body segments were estimated to be consistent with those from a female with a mass of 47 kg and height of 155 cm based on published anthropomorphic data [113]. Lower-extremity joint torques were determined by a PD controller that enabled tracking of the desired joint angle trajectories. In our walking model, we represented the treadmill as two separate belts each of which can run at a different speed to simulate split-belt walking. Moreover, the contact between each foot and the treadmill was modeled via four spheres of radius 5 mm located on the four corners of each foot (Fig. 4.1C). We modeled the normal contact force acting on each sphere by the treadmill through the use of a compliant 49 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio Fast Belt Slow Belt A Contact sphere Treadmill Belt Speed (m/s) 1.5 1.0 0.5 Belt Speed Ratio B D C Fast/Right Belt Slow/Left Belt Figure 4.1: The biped-treadmill model. (A) The biped model and treadmill schematics. (B) The biped is a 7-segment, 9-DOF model of bipedal walking free to move in the sagittal plane. The actuated degrees of freedom include ankle dorsi-/plantar-flexion, knee flexion/extension, and hip flexion/extension. Further, the torso is free to translate in both anterior/posterior and superior/inferior directions and to pitch about the frontal axis. (C) The contact model; We modeled the contact through the use of four spheres placed on the corners of each foot. The normal force, noted by F n acting on each sphere was modeled using elastic components through the Hunt-Crossley model (with stiffness k and damping b) relating the sphere penetration (p y ) to force values. The friction force followed stick-slip law relating the normal force to friction force (F f ) using the friction coefficient µ that was a function of sphere velocity relative to the treadmill surface (v poc ). (D) The top view of the biped on the treadmill and belt speeds (left) and the variety of belt speed ratios used in simulations of split-belt gait (right). 
Hunt-Crossley model [124]:

\[
F_n = k\, p_y^{1.5}\left(1 + 1.5\, b\, \dot{p}_y\right) ,
\tag{4.1}
\]

where k and b denote the stiffness and dissipation coefficients of the contact, respectively, and p_y and \dot{p}_y are the amount and the rate by which each sphere penetrates the treadmill plane, respectively. We modeled the friction force acting on each sphere by the treadmill through the use of a stick-slip law as follows:

\[
F_f = \mu\left(v_{poc}\right) F_n ,
\tag{4.2}
\]

where v_poc is the velocity of the sphere at the point of contact relative to the belt speed in the anterior-posterior direction. µ(v_poc) is the coefficient of friction, a function of the relative slip velocity v_poc and the properties of the contact. µ(v_poc) is described by the static and kinetic coefficients of friction, µ_s and µ_k, respectively, and a threshold velocity v_th, below which the sphere sticks to the contact:

\[
\mu\left(v_{poc}\right) =
\begin{cases}
\mu_s\, v_{poc}/v_{th} & v_{poc} < v_{th} \\[2pt]
\mu_s + \left(\mu_k - \mu_s\right)\left(v_{poc} - v_{th}\right)/\left(0.5\, v_{th}\right) & v_{th} < v_{poc} < 1.5\, v_{th} \\[2pt]
\mu_k & v_{poc} \geq 1.5\, v_{th}
\end{cases}
\tag{4.3}
\]

We first hand-tuned the contact parameters for tied-belt walking on the treadmill with belts running at 1.0 m/s to avoid gaits that had aerial phases or excessive penetration of the foot into the treadmill surface. We then used those values as initial guesses to search for the contact parameters that minimized the error between the ground reaction forces predicted by the model and the actual ground reaction forces recorded from a human walking on a tied-belt treadmill at 1.0 m/s. Ultimately, we used k = 9×10^4 N·m^(-1.5), b = 60 s·m^(-1), v_th = 0.1 m·s^(-1), µ_s = 0.8, and µ_k = 0.5 as the contact parameters for our simulations.

4.2.2 Learning from Demonstrations

There is strong evidence that walking may be coordinated in part via central pattern generators (CPGs) [40] that create a rhythmic control policy for lower-extremity motion. Dynamical movement primitives (DMPs) [41] can be used to model weakly nonlinear rhythmic control policies. DMPs allow behavioral structure to be imposed on the model, which in turn can reduce the search space when exploring optimal strategies. Thus, we used DMPs to parameterize desired joint angle trajectories, capturing the idea that the nervous system may modulate behaviors such as walking by tuning a few parameters that scale a set of primitives. For each of the ankle, knee, and hip joints, we separately fit DMPs to the corresponding joint angle trajectories in the gait data collected from a healthy female walking at 1.0 m/s. Each DMP was modeled as a second-order dynamical system as follows:

\[
\tau^2 \ddot{\theta} = \alpha_z \beta_z\left(\theta_b - \theta\right) - \tau \dot{\theta} + f ,
\tag{4.4}
\]

where θ, \dot{θ}, and \ddot{θ} are the joint angle, angular velocity, and angular acceleration trajectories, respectively. τ denotes the stride time here. α_z and β_z are the constants of the DMP oscillators and are set such that the oscillators are critically damped, hence dynamically stable. θ_b denotes the joint angle baseline, set to the initial states of the joint angle trajectories, and f is a weakly nonlinear forcing term that is used to fit the experimental joint angle data. For each DOF, we defined the forcing term as a weighted sum of a set of Gaussian kernel functions [111]:

\[
f(\phi, A) = \frac{\sum_{i=1}^{N} \psi_i\, w_i}{\sum_{i=1}^{N} \psi_i}\; A ,
\tag{4.5}
\]

where the forcing term f is a function of the phase of a given DMP oscillator (ϕ) and a spatial scaling parameter (A), and is composed of N activation functions.
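The contact model in Eqs. (4.1)-(4.3) can be sketched as follows in Matlab, using the parameter values reported above. The piecewise friction coefficient is written in the continuous form assumed in the reconstruction of Eq. (4.3), and the function and variable names (as well as the guard against adhesive normal forces) are illustrative additions rather than the published implementation.

```matlab
function [F_n, F_f] = sphere_contact_forces(p_y, p_y_dot, v_poc, prm)
% Normal (Hunt-Crossley, Eq. 4.1) and friction (stick-slip, Eqs. 4.2-4.3)
% forces for one contact sphere. prm holds the parameters reported in the
% text, e.g.: prm = struct('k',9e4,'b',60,'v_th',0.1,'mu_s',0.8,'mu_k',0.5);
if p_y <= 0                              % sphere not penetrating the belt
    F_n = 0; F_f = 0;
    return
end

% Hunt-Crossley normal force (Eq. 4.1); clipped at zero so the contact
% cannot pull the foot down (an added guard, not stated in the text)
F_n = max(prm.k * p_y^1.5 * (1 + 1.5*prm.b*p_y_dot), 0);

% Piecewise friction coefficient (Eq. 4.3, continuous interpolation)
v = abs(v_poc);
if v < prm.v_th
    mu = prm.mu_s * v / prm.v_th;
elseif v < 1.5*prm.v_th
    mu = prm.mu_s + (prm.mu_k - prm.mu_s) * (v - prm.v_th) / (0.5*prm.v_th);
else
    mu = prm.mu_k;
end

% Friction opposes the slip direction (Eq. 4.2)
F_f = -sign(v_poc) * mu * F_n;
end
```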
ψ i andw i denote the i th Gaussian activation kernel and its corresponding weight, respectively. For a given DOF , each activation functionψ i is defined as follows: ψ i (ϕ )=exp h cos(ϕ − c i )− 1 (4.6) where h denotes the width of each Gaussian activation function and was set to h = 250. Moreover,c i captures the midpoint locations of each Gaussian kernel and spans the entire 2π phase range for each oscillator according to c i = 2πi/N , i = 1,...,N. We fit six different DMPs for hip, knee, and ankle DOFs to joint angle data recorded from human gait. The learned set of DMPs was each assumed to have an amplitude and initial phase value ofA = 1 andϕ 0 = 0, respectively, to capture the baseline joint angles. Moreover, the stride timeτ in (4.4) was determined by extracting the dominant frequency of the experimental data using a Fourier decomposition. Then, we fed the 53 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio demonstrated joint angles (θ d ), angular velocities ( ˙ θ d ), and angular accelerations ( ¨θ d ) into (4.4) to find the desired forcing term ( f d ): f d =τ 2 d ¨θ d − α z β z (θ bd − θ d )− τ ˙ θ d (4.7) Plugging f d from (4.7) into (4.5) and using locally weighted regression [125] with A=1 andϕ 0 =0, we obtained the weights on Gaussian kernels (w i ’s) for the baseline control policies in each DOF . 4.2.3 Predictive Simulations of Bipedal Locomotion We investigated whether the principle of good-enough control with respect to a range of effort-related cost functions could explain empirical features of gait observed during split-belt walking. The diagram in Fig. 4.2 represents an overview of our predictive simulation framework for split-belt bipedal walking, representing the two main components as the evolutionary optimization and the gradient descent. The joint angle trajectories resulting from the DMP kernel weights obtained for each DOF (per Section 4.2.2) were used as the baseline motion reference. We then used evolutionary optimization to obtain the optimal set of DMP scaling variables with respect to a given cost for different belt speed ratios (see Section 4.2.3). All feasible solutions were then used as input to a neural network to obtain a surrogate function representing the cost landscape. Next, we used a gradient descent algorithm to search through the resulting surrogate cost for obtaining the good-enough gait 54 4.2. Methods Minimize cost s.t. Equations of motion Cost A. Evolutionary Optimization Minimize surrogate cost ′ s.t. C. Gradient Descent with multiple starts Optimization Variable, e.g., stride time final position constraints & final position constraints stop search when ‖ ▽ ′ ‖ ′ ≤ 1 % initial guess vicinity of the best evolutionary solution for tied-belt walking = , ∗ 0 . 1 D. Range of potential good-enough solutions ▽ ≻ 0 & Optimization Variable, e.g., stride time Cost Optimization Variable, e.g., stride time Neural Network Cost vs. variable for feasible solutions in one dimension B. Cost surrogate function for feasible solutions shown in one dimension Cost Figure 4.2: Overview of the predictive simulation of bipedal walking using hybrid optimization. Predictive simulations of good-enough split-belt walking consisted of two main components of evolutionary optimization and gradient-based search; (A) Evolutionary optimization was used to predict a broad range of feasible solutions through evaluations of the cost and the dynamics of motion in Simulink. 
(B) Surrogate cost obtained from the evolutionary solutions through neural network fitting. (C) Gradient descent searched through the surrogate cost function to obtain a locally optimal set of variables multiple times. (D) Regions of the cost landscape corresponding to potential good-enough solutions are highlighted in red.

strategies under different belt speed ratio conditions (see Section 4.2.3).

Cost Functions

Results from both simulation and empirical studies support the idea that minimization of mechanical work could be a factor that drives the gaits that individuals adopt when walking in novel conditions. For instance, positive mechanical work performed by the limbs correlates with the metabolic cost consumed by humans during walking [126]. Further, during split-belt walking, reduction in metabolic cost was associated with a reduced positive mechanical work rate required by individuals' legs [34, 33].

In addition to positive mechanical work, one could alternatively use other relevant, physically meaningful costs proposed in the literature (Table 4.1). For instance, alternatives to positive work rate within the energetics family of costs include negative work rate, absolute work rate, and mixed work rate. Minimization of jerk has also been proposed as a kinematic criterion that would result in human body motion with maximal smoothness [99]. Moreover, dynamic criteria, also known as minimum-effort criteria, emphasize the importance of minimizing a specific norm of the generalized forces. Examples of minimum-effort criteria involve the minimization of Euclidean or cubic norms of, for instance, torques [48, 127, 74].

In this work, we present only the results from the use of positive work rate as the cost function. We calculated the total positive work rate (W⁺) as the sum of the positive portions of mechanical work performed by the lower-extremity joints. T^(i) and \dot{θ}^(i) (as represented in Table 4.1) denote the torque and angular velocity trajectories at the i-th DOF, respectively. The product T^(i) \dot{θ}^(i) returns the mechanical power trajectory at the i-th joint. Integrating the positive portion of this product over a stride amounts to the positive mechanical work performed by the i-th joint. Ultimately, summing the work values across joints and dividing the outcome by the stride time returns a measure of the net positive work rate. For the cost represented as mixed work (Table 4.1), we specifically calculated a sum of positive and negative portions of work weighted by w_p and w_n, respectively, where w_p = 4.0 and w_n = 0.8 [49].

Table 4.1: A pool of physically meaningful (intrinsic) cost functions proposed in the literature. Three families of costs are quantified at the levels of kinematics, dynamics, and energetics.

Kinematics
  Jerk: \( \frac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \dddot{\theta}^{(i)\,2}\,dt \) (refs. [54], [99], [77], [78], [128])

Dynamics
  Torque squared: \( \frac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} T^{(i)\,2}\,dt \) (refs. [48], [102], [76], [127], [54], [99])
  Torque cubed: \( \frac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} T^{(i)\,3}\,dt \) (refs. [48], [53], [129], [130])

Energetics
  Positive work rate: \( \frac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \left[T^{(i)}\dot{\theta}^{(i)}\right]^{+}dt \) (refs. [19], [131], [132])
  Negative work rate: \( \frac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \left[T^{(i)}\dot{\theta}^{(i)}\right]^{-}dt \) (refs. [133], [16], [52])
  Absolute work rate: \( \frac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \left|T^{(i)}\dot{\theta}^{(i)}\right|dt \) (refs. [63], [134], [135])
  Mixed work rate (mechanical correlate of metabolic cost): \( \frac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \left(w_p\left[T^{(i)}\dot{\theta}^{(i)}\right]^{+} - w_n\left[T^{(i)}\dot{\theta}^{(i)}\right]^{-}\right)dt \) (refs. [49], [56])

Constraints and Considerations

A set of constraints and checks was used in the optimizations to ensure the feasibility of the behaviors resulting from the simulations. The constraints consisted of (1) biomechanical or organismic feasibility, (2) task goal demands, and (3) the biological feasibility of the resulting behaviors. Physiologically plausible ranges of motion for joint angles and stride time [136] and the empirically reported peak joint moments [137] were used to inform the biomechanical constraints on the resulting behaviors. The constraints related to the task goal consisted of imposing minimal drift of the biped during the simulations. This was achieved by forcing the biped's center of mass to stay within 2 cm of the center of the treadmill.

In our simulations of bipedal walking, cases arose where the biped successfully found efficient strategies that were not biologically plausible. Examples of such behaviors were scuffing, where the swing foot drags along the treadmill, and jumping, leading to instants during gait at which neither foot was in contact with the treadmill. We deemed these examples unreasonable for human walking and heavily penalized the emergence of such behaviors by imposing a set of extra constraints to ensure the biological feasibility of the results. The constraint used to avoid scuffing was that the swing toe clearance be greater than zero throughout the initial swing, i.e., while the swing foot is behind the center of mass. To avoid behaviors involving flight or jumping, we constrained the minimum toe clearance across both feet to be zero, which ensured that one foot was always in contact with the treadmill.

Optimization of Bipedal Walking using DMPs

The optimization problem for walking is represented in the following form:

\[
\left[A,\ \tau,\ \phi_0,\ \theta_b\right]^{*} = \underset{A,\ \tau,\ \phi_0,\ \theta_b}{\operatorname{argmin}}\ J ,
\tag{4.8a}
\]
\[
\text{s.t.}\quad
\left|x_{torso}(t)\right| < 3\ \text{cm}\ \ \forall t , \qquad
\left|x_{torso}\right|_{t=\tau} < 2\ \text{cm} ,
\]
\[
F_N(t) = 0\ \ \forall t:\ \left\{ x_{ankle}(t) - x_{torso}(t) < 0\ \wedge\ v_{ankle}(t) > 0 \right\} ,
\]
\[
F_N^{Left}(t) > 0\ \vee\ F_N^{Right}(t) > 0\ \ \forall t ,
\tag{4.8b}
\]

where the goal was to find the optimal values of the variables of interest, [A, τ, ϕ_0, θ_b]*, that minimized a given cost function J. The cost selected for J here was the positive work rate quantified as shown in Table 4.1. Note that in Eq. (4.8), four constraints (4.8b) must also be satisfied while minimizing the cost function (4.8a). x_torso stands for the position of the biped torso in the anteroposterior direction relative to the center of the treadmill. We introduced inequality constraints on the end position of the torso to ensure that no net drift occurred over either of the two strides; hence, the biped would show an approximate cyclic behavior.
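To make the objective in Eq. (4.8a) concrete, the following minimal Matlab sketch evaluates the work-rate costs of Table 4.1 from simulated joint torque and angular velocity trajectories sampled over one stride. The uniform-sampling assumption and the function name are illustrative; the mixed-work weights default to the values quoted above.

```matlab
function [w_pos, w_neg, w_abs, w_mix] = work_rate_costs(T, theta_dot, dt, tau, w_p, w_n)
% Work-rate cost integrands from Table 4.1, evaluated over one stride.
%   T, theta_dot : (n_samples x n_dof) joint torques (N*m) and angular
%                  velocities (rad/s), sampled at a fixed interval dt
%   tau          : stride time (s); w_p, w_n : mixed-work weights (4.0, 0.8)
if nargin < 5, w_p = 4.0; end
if nargin < 6, w_n = 0.8; end

P    = T .* theta_dot;                       % per-joint mechanical power
Ppos = max(P, 0);                            % [T * theta_dot]^+
Pneg = min(P, 0);                            % [T * theta_dot]^- (non-positive)

int_t = @(X) trapz(sum(X, 2)) * dt;          % sum over joints, integrate in time

w_pos = int_t(Ppos) / tau;                   % positive work rate
w_neg = int_t(Pneg) / tau;                   % negative work rate
w_abs = int_t(abs(P)) / tau;                 % absolute work rate
w_mix = (w_p*int_t(Ppos) - w_n*int_t(Pneg)) / tau;   % mixed work rate
end
```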
We implemented this cyclicality by allowing for the position of the torso throughout the gait and at the end to vary within± 3cm and± 2cm of the center of the treadmill, respectively, in lieu of using an exact equality constraint. This choice of constraints was because, realistically, there is some deviation in individuals’ fore-aft position from the center of the treadmill during 59 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio walking. The third and fourth constraints in Eq. (4.8b) represent avoidance of scuffing and flight, respectively, where F N denotes the normal ground reaction force. One efficient way to implement the constrained optimization problem in (4.8) with genetic algorithm is to use an equivalent piece-wise composite penalty [138] that entails constraint violations along with the cost. Hence, our constrained optimization problem posed in 4.8 reduces to an unconstrained one taking the following form for optimization of positive work rate: h A, τ, ϕ 0 , θ b i ∗ = argmin [A, τ, ϕ 0 , θ b ] 1/τ Z τ 0 N DOF X i=1 h T (i) ˙ θ (i) i + dt+K p 1− S/C , (4.9) where C denotes the total number of constraints, which equals C = 4, in our optimization. S stands for the number of constraints that were satisfied at any iteration, andK p = 10 9 is a penalty coefficient set to a large value to heavily penalize the composite cost function in conditions where any constraints were not satisfied. With the new additive cost function, we ensured that any solutions evaluated by the optimization first satisfied the approximate cyclicality constraints and then minimized the given cost. Figure 4.3 shows how the evolutionary optimization solves the problem in Eq. (4.9) resulting in a range of feasible solutions (corresponding to Fig. 4.2A). For each belt speed ratio, we used a genetic algorithm to obtain a set of DMP scaling parameters, 60 4.2. Methods including stride time (τ ), six angle amplitude scaling variables (A DOF ), the fast hip oscillator phase shift variable (ϕ 0f ) and two offset variables for fast and slow hips (θ b S andθ b F ) that correspond to optimal cost values. Further, we stored all the intermediate feasible solutions evaluated by the genetic algorithm. The optimization in this step was performed for 100 generations and a population size of 512, where the crossover rate and mutation rate were set to 0.8 and 0.01, respectively. Good-enough Control The main goal of most optimization frameworks is to achieve a unique global optimum. However, achieving the global optimum is not necessarily the goal of a biological learning process, nor does settling for a unique solution explain the natural step-to-step and stride-to-stride variability observed in individuals’ locomotor strategies. Aside from the uncertainty in the sensorimotor system, it is plausible that individuals are equally willing to adopt a range of potentially different strategies, all of which are good enough with respect to a cost function. A potential good-enough strategy can be selected through an iterative assessment of performance, where success is determined by whether a threshold performance is achieved [38]. For instance, a successful strategy can be found where the gradient of the cost landscape is shallow. That is, once there is no discernible improvement in cost from the previously experienced strategies to the currently selected one, the biological learner stops exploring. In addition, when exposed to a novel context, a biological learner 61 Chapter 4. 
Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio a. Optimization (Genetic Algorithm) Yes No b. Biped-Treadmill in Simscape [, , , ] Joints Angles Joints Torques Torso States Joints States Gen+=1 Evaluate GA Penalty = + 1 − Motion Tracking PD Control Joints Torques Tune DMP Parameters = , = , = , = , New DMP Parameters Generate Target Trajectories = ( , , , , ) Stall Gen = 10 Max Gen = 100 Figure 4.3: Overview of the simulations of bipedal walking using evolutionary optimization. Simulations based on evolutionary optimization consisted of two levels; (a) GA optimization and (b) biped-treadmill walking simulation in Simscape. Starting with DMPs fit onto a given set of baseline joint angles (θ baseline ), one can initialize the optimization by specifying spatiotemporal scaling variables (A 0 ,τ 0 ,ϕ 0 0 andθ 0 b ) to then obtain the target (reference) joint anglesθ joints ref . The reference joint angles will then be used for inverse dynamics through a motion tracking block to obtain joint torquesT joints for performing the simulations of bipedal walking in Simscape. Next, kinematic and dynamic outputs of walking simulations at each iteration will be evaluated by the genetic algorithm with respect to a piecewise penalty function (P) that entails a target cost function (J) and an extra penalty term (K(1− S/C)) that ensures the constraints are satisfied. By evaluating the penalty at each generation, one can determine whether the optimal decision variables are achieved or if the variables need to be tuned for the next generation until the stopping criteria are met. is more likely to start off by using a repertoire of previously learned strategies. As a result, the search process would efficiently be reduced from a de novo learning to a look-up exploration and refinement around good habitual strategies from before. One can implement this learning process through bioinspired algorithms such as 62 4.2. Methods reinforcement learning [92, 93, 94] and gradient descent [95]. Here, we obtained a set of good-enough solutions, for each belt speed ratio, by combining the use of feasible solutions from a genetic algorithm and a gradient-based search. Figure 4.4 illustrates the feasible solutions obtained from the genetic algorithm and their use to fit a surrogate cost landscape that, when explored via gradient descent, would find a range of good-enough solutions. For each cost and belt speed ratio, we used all the feasible solutions obtained from the genetic algorithm (the red circles in Fig. 4.4) and their corresponding costs to estimate the surrogate cost landscape (the black curve in Fig. 4.4). To this end, we fit a shallow neural network with 10 neurons to the data based on a 55/20/25% training/validation/test split ratio that would map the solutions to the cost values with a coefficient determination of R 2 > 0.95. The transfer functions in the hidden, and the output layers were a sigmoid and a positive linear function, respectively. Further, we used Levenberg-Marquardt backpropagation as the algorithm to train the neural networks. We then used the fitted neural network as a surrogate for the real cost landscape, along which we performed a gradient-based search with multiple starts to achieve good-enough solutions for each belt speed ratio. The initial guess was set to a strategy chosen randomly from within 10% of the best evolutionary solution obtained for the tied-belt walking from the optimization of the same cost. 
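A minimal Matlab sketch of this multi-start, gradient-thresholded search over the fitted surrogate is shown below. The step size, iteration cap, and function handles are illustrative assumptions, and the positive-definite Hessian check described in the following paragraph is omitted for brevity.

```matlab
function solutions = good_enough_search(surrogate, grad_fn, x_tied_best, lb, ub, n_starts)
% Multi-start gradient descent over the fitted surrogate cost landscape.
%   surrogate   : handle x -> estimated cost (the fitted shallow network)
%   grad_fn     : handle x -> gradient of the surrogate at x
%   x_tied_best : best evolutionary solution from tied-belt walking; each start
%                 is drawn within 10% of this point, as described in the text
step    = 0.01;      % fixed step size (illustrative)
rel_tol = 0.01;      % stop when the gradient norm falls below 1% of the cost
max_it  = 500;
solutions = zeros(numel(x_tied_best), n_starts);
for s = 1:n_starts
    % initial guess within +/- 10% of the tied-belt optimum
    x = x_tied_best .* (1 + 0.1*(2*rand(size(x_tied_best)) - 1));
    for it = 1:max_it
        g = grad_fn(x);
        if norm(g) <= rel_tol * abs(surrogate(x))   % shallow-gradient stop
            break
        end
        x = min(max(x - step*g, lb), ub);           % project onto allowable ranges
    end
    solutions(:, s) = x;
end
end
```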
This was to mimic the carry-over from tied-belt walking to split-belt gait during early adaptation, where the motor patterns and features resemble those of tied-belt walking. Further, 63 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio Cost Optimization Variable, e.g., stride time Global Optimum Evolutionary Solutions Good-enough Region Surrogate Cost Landscape Figure 4.4: Optimal solution vs. good-enough solutions. A hypothetical landscape for a given cost is shown (black curve). A pure optimization algorithm evaluates intermediate solutions over different search iterations (grey circles; darker ones are closer to the global optimum) and ideally evolves along the gradient vector in the direction of decreasing cost to eventually find the global optimum (red circle). However, if the goal of the search were to find good-enough solutions, the search could terminate once the cost gradient is shallow enough; that is, the iterative relative cost improvement is below a threshold α (the landscape regions highlighted in dark red). different initializations potentially leading to different suboptimal strategies represent the uncertainty in what solution a biological learner thinks is optimal or the variability in initial states. The gradient-based search stopped wherever the relative improvement in the cost gradient norm was less than 1% of the cost value, and the Hessian was positive definite. The regions highlighted in dark red traces in Fig. 4.4 illustrate the potential locations of good-enough solutions along the surrogate cost landscape in one dimension. 64 4.3. Results 4.2.4 Statistical Analysis We used linear regression to determine if there were significant associations between belt speed ratios and (1) effort-related costs, such as work rate, as well as (2) spatiotemporal features, e.g., step lengths, step times, foot placement, and their corresponding asymmetry values. We reported the regression coefficients from these analyses as slope β and their corresponding 95% confidence intervals. The null hypothesis in all regressions was the slope β is zero, that is, there is no association between the belt speed ratio and any feature of interest with the statistical significance level set toα =0.05. 4.3 RESULTS 4.3.1 Tied-belt Walking Across Different Speeds Work minimization predicted symmetric features in normal walking We simulated steady-state walking on a split-belt treadmill and obtained good-enough solutions with respect to various costs at different belt speed ratio values. Further, the simulations satisfied the constraint violations on endpoint position and torso pitch. In all cases, the position and pitch deviated from those at the initial point by less than 2cm and1deg, respectively. For tied-belt walking at1.0m/s, our simulations predicted almost equal double-stance times (Fig. 4.5 - the time span of the shaded areas, [t DSL = 0.20 s; t DSR = 0.21 s], supporting the symmetry being the optimal feature for tied-belt walking. Moreover, our predicted energy-optimal step lengths and step 65 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio times for tied-belt walking were [SL L = 47.6 cm; SL R = 47.8 cm] and [ST L = 0.517 s; ST R = 0.515 s], respectively. These predictions are in agreement with experimental observations, where steps of equal lengths and times during normal walking were energy-optimal [26]. Moreover, the positive mechanical work rate performed by the legs was [W L + =65.2W;W R + =62.2W]. 
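The slopes, confidence intervals, and p-values reported in this section and below follow the linear-regression procedure of Section 4.2.4. A minimal Matlab sketch is shown here; it assumes the Statistics and Machine Learning Toolbox is available, and the function name is illustrative.

```matlab
function [beta, ci, p] = slope_with_ci(x, y)
% Ordinary least-squares slope, its 95% confidence interval, and p-value,
% as used for the association tests in Section 4.2.4 (null hypothesis: slope = 0).
%   x : predictor (e.g., belt speed ratio, or walking speed squared)
%   y : outcome (e.g., net positive work rate, or an asymmetry measure)
mdl   = fitlm(x(:), y(:));              % y = b0 + beta*x
beta  = mdl.Coefficients.Estimate(2);
p     = mdl.Coefficients.pValue(2);
ciAll = coefCI(mdl, 0.05);              % 95% confidence intervals
ci    = ciAll(2, :);
end
```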
Cost of transport scales with walking speed during normal walking Energy-optimal simulations of tied-belt walking predicted a linear relationship between the square of walking speed and positive work rate (Fig. 4.6 - linear fit in black, β = 0.0112, 95% CI [0.0103,0.0122], p < 10 − 6 ). This predicted trend was consistent with the linear relationship between square of walking speed and metabolic cost rate observed by Ralston (Fig. 4.6, the line in red) [25]. However, discrepancies were observed between the slopes and intercepts of the fit to our predictions and those predicted by Ralston. We speculate that these inconsistencies stem from the estimation of metabolic energy expenditure based on respiratory data by Ralston, while in our model, we did not account for any energy cost term associated with standing still. Considering only a mechanical measure of the cost of transport could have the effect of having resting state or isometric energy expenditure terms ruled out so our predictions would differ from experimental observations. 66 4.3. Results 0 25 50 75 100 -20 0 20 Hip Angle (deg) Right Lift-Off Left Lift-Off Right Heel-Strike Left Heel-Strike A Left Mid-Stance Flexion Extension Joint Angle Curves: Left Leg Double-Stance: Left HS to Right LO 0 25 50 75 100 0 20 40 60 Knee Angle (deg) B Flexion Joint Angle Curves: Right Leg Double-Stance: Right HS to Left LO 0 25 50 75 100 Gait Cycle % -20 -10 0 10 Ankle Angle (deg) C Plantar- Flexion Dorsi- Flexion Figure 4.5: Joint Angle Trajectories for tied-belt walking at 1.0 m/s. (A) Hip joint angles, (B) Knee joint angles, (C) Ankle joint angles. Energy-optimal simulations of walking on a tied-belt treadmill predicted symmetric joint angle patterns across limbs. Specifically, predictions of step lengths ( SL L = 47.6 cm; SL R = 47.8 cm), step times (ST L = 0.517 s; ST R = 0.515 s) and double-stance times (t DSL = 0.20 s; t DSR = 0.21 s) were symmetric across limbs. 4.3.2 Split-belt Walking Across Different Belt Speed Ratios Search based on the proposed good-enough control framework found a feasible family of solutions for each spatiotemporal decision variable. Figure 4.7 represents 67 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio 0 3000 6000 9000 12000 Speed Squared (m/min) 2 0 30 60 90 120 Energy Cost Rate (Cal/min/kg) E Ralston = 0.0053 V 2 + 29.00 E sim = 0.0112 V 2 + 9.00 r= 0.997 , p<0.01 ** Predicted by Simulations Linear Fit to Predictions 95% CI Range for Fit Fit by Ralston, 1958 Figure 4.6: Energy-speed relationship in tied-belt walking. Our simulations of energy-optimal walking on a tied-belt treadmill predicted a linear effect of speed squared on the mechanical cost of transport rate (the linear fit in black, β = 0.0112, 95% CI [0.0103,0.0122], p < 1× 10 − 6 ). Our prediction on the presence of a linear trend was in agreement with a linear fit to the experimental observations made by Ralston (the linear fit in red, [25]). the good-enough solutions with respect to the positive work rate at different belt speed ratios. This suggests that if selected based on the gradient along the cost landscape, a variable range of solutions could potentially be adopted. However, one could argue that different dimensions of the variable space could be correlated with one another, leading to mechanical redundancy in the set of selected strategies. Investigating 68 4.3. 
Figure 4.7: Optimal values for optimization variables. (A) Spatiotemporal scaling variables include a stride time (τ) and six joint angle amplitude scaling parameters for the hips (A_S^H for the slow leg; A_F^H for the fast leg), knees (A_S^K for the slow leg; A_F^K for the fast leg), and ankles (A_S^A for the slow leg; A_F^A for the fast leg). (B) The spatiotemporal shift variables include three temporal phase shift parameters in the DMP oscillators of the fast joint angles relative to their contralateral joints (ϕ_0^H, ϕ_0^K, and ϕ_0^A) and two spatial shift parameters used to tune the baseline angles in the DMPs of the two hip joints (y_0S and y_0F). Good-enough solutions are shown against the allowable ranges across belt speed ratios (V_Fast/V_Slow).
Investigating the association among our variables of interest revealed that the correlation among the scaling variables was negligible (R^2_max ≈ 0.09, found between A_S^H and A_S^K). This indicates that our gradient-based search obtained a set of good-enough gait strategies that are independent of each other.
Spending a shorter time on the fast belt reduced positive work rate in split-belt walking
Solutions from simulations of split-belt walking that were good enough with respect to positive work rate revealed a significant decrease in net positive work with belt speed ratio (Fig. 4.9B; β = -55.6, 95% CI [-93.7, -17.6], p < 0.01). Figure 4.9A depicts the positive and negative work rate components for each leg. Good-enough search with respect to positive work rate also showed a significant increase in spatiotemporal features with belt speed ratio (Fig. 4.10; A) FPA: β = 0.16, 95% CI [0.02, 0.31], p = 0.03; B) SLA: β = 0.06, 95% CI [0.01, 0.11], p = 0.04; and C) STA: β = 0.13, 95% CI [0.07, 0.18], p < 0.01). Our simulations also generally predicted positive values of step time asymmetry during split-belt gait, i.e., a longer step time for the fast leg, corresponding to a shorter time spent in contact with the fast belt. Theoretically, a shorter time in contact with the fast belt typically corresponds to a smaller hip extension at push-off, requiring less positive power from the fast leg to push off the treadmill. Our good-enough simulations of gait also predicted strategies wherein the fast leg would generally step further forward than the slow leg (Fig. 4.10A). Placing the fast foot further forward would lead to a greater amount of positive work performed by the treadmill on the biped at heel strike. Consequently, the biped could take advantage of this absorbed external positive work to reduce the work demanded of the legs later in the gait cycle. From the simulation results, we took only the solutions for the belt speed ratio of 3:1 to compare our results to the findings of other studies. For the empirical evidence, we used the observations of step length asymmetry from Sánchez et al. (2019) [33] and both step length asymmetry and step time asymmetry from Stenum and Choi (2020) [34]. We also used the predictions made by prior simulation studies of split-belt gait, including Price et al. (2023) [105] and Seethapathi et al. (2021) [104]. Our good-enough framework, in general, generated solutions that were widely variable compared to other studies (Fig. 4.11).
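For clarity about the asymmetry measures compared here, the sketch below assumes the normalized-difference convention that is common in the split-belt literature (fast minus slow, divided by their sum); the numerical values are made up for illustration and do not come from the simulations.

```python
import numpy as np

def asymmetry(fast, slow):
    """Normalized asymmetry index: (fast - slow) / (fast + slow)."""
    fast, slow = np.asarray(fast, dtype=float), np.asarray(slow, dtype=float)
    return (fast - slow) / (fast + slow)

# Hypothetical per-stride step lengths (m) and step times (s).
sla = asymmetry(fast=[0.62, 0.60, 0.63], slow=[0.48, 0.50, 0.47])   # step length asymmetry
sta = asymmetry(fast=[0.55, 0.56, 0.54], slow=[0.42, 0.43, 0.44])   # step time asymmetry
print("SLA per stride:", np.round(sla, 3))
print("STA per stride:", np.round(sta, 3))
```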
In contrast to other simulation studies, our predicted spatiotemporal features were more inclusive of the empirical observations. For step length asymmetry, the good-enough predictions were (Mean ± SD) 0.145 ± 0.130, compared to the empirical observations of 0.028 ± 0.013 [33] and 0.034 ± 0.025 [34], and prior simulation predictions of 0.42 [104] and 0.40 [105]. The step time asymmetries predicted by the good-enough algorithm were (Mean ± SD) 0.138 ± 0.091, compared to empirical values of 0.178 ± 0.028 [34], and prior simulation predictions of 0.34 [104] and 0.25 [105]. 4.4 DISCUSSION In this study, we aimed to determine the role of optimality and good-enough control principles in shaping the gait patterns adopted during split-belt walking at steady-state. We developed a predictive simulation framework based on principles of optimality and good-enough control to investigate the effect of different belt speed ratios on energetics and spatiotemporal features adopted during split-belt walking. Our overall approach involved the utility of evolutionary optimization to obtain a large set of feasible solutions and a multi-start gradient-based search to find a family of good-enough solutions across a range of belt speed ratios. We also tested different 71 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio cost functions in the problem that were related to the energy expenditure of walking, including different combinations of positive work rate and negative work rate. Our spatiotemporal features of interest were measures of step length, step time, leading and trailing foot placement, and their corresponding asymmetry features including step length asymmetry (SLA), step time asymmetry (STA), and foot placement asymmetry (FPA). Our simulations predicted that when all features are free to vary, increasingly positive levels of step time asymmetry were selected with an increased belt speed ratio to reduce the positive work rate required by legs. Our step time asymmetry results were qualitatively consistent with empirical observations of positive step time asymmetry as the dominant strategy in split-belt treadmill walking [34]. Our findings of positive step time asymmetry agree with the findings of other simulation studies, where the gait adopted to optimize the metabolic rate was associated with substantially positive step time asymmetry levels [105, 104]. The predicted positive step time asymmetry levels also agree with physics-based theoretical models because greater step time for the fast leg is qualitatively equivalent to less stance time on the fast belt (Fig. 4.12). That is, the biped would leave the fast belt earlier when the fast leg has extended relatively less than normal prior to push-off. This can be a reasonable strategy to avoid a greater propulsion force required for push-off, hence reducing work demands by the fast leg. Good-enough solutions with respect to net positive work rate also predicted 72 4.4. Discussion generally positive values for foot placement asymmetry; that is, solutions with fast leg stepping further forward relative to the biped’s trunk were predicted to reduce positive work rate at each given belt speed ratio. These predictions were also in line with the experimental observations of longer steps on the fast belt [33, 91, 34]. Our results also agreed with predictions of positive foot placement difference and step length asymmetry found by prior simulation studies [104, 105]. 
Stepping further forward on the fast belt relative to the biped’s trunk predicted by the simulations (Fig. 4.13) was another expected gait feature to emerge according to physics-based models of split-belt treadmill walking. By stepping further forward on the fast belt at heel-strike, the biped can harvest a greater amount of positive work than usual due to the fast leg being more aligned with the belt speed vector geometrically. This implies that when able, the biped coordinated its motor patterns to increase the provided assistance and take advantage of the external positive work rate by the fast belt via stepping further forward on the fast belt. While increases in measures of step time asymmetry and foot placement asymmetry were shown to be helpful strategies in maximizing the harnessed energy from the treadmill, the extremely positive asymmetries will likely challenge the anatomical and mechanical constraints. On the other hand, increases in spatiotemporal features beyond typically adopted asymmetry levels could be harmful to the body, even if the goal of the central nervous system was energy optimality. In addition, our cost maps for feasible solutions in split-belt walking across different 73 Chapter 4. Good-enough Split-belt Gait Predicts Scaling of Asymmetry with Speed Ratio speed ratios showed that the more extreme asymmetries can require more positive work rate production by legs (Fig. 4.14). Furthermore, the feasible solutions for walking at each belt speed ratio corroborate that a broad range of strategies could feasibly be selected, and the mechanics of split-belt walking does not mute the options. Although our predictions of asymmetry for spatiotemporal features were qualitatively in agreement with prior empirical evidence [32, 33, 91, 34], when compared quantitatively, our results, similar to other simulation studies, were substantially greater than the experimental observations of these features. This discrepancy may be due to the presence of biomechanical and biological constraints in human body movement, while we did not impose any such constraints in the biped model nor in our simulations of gait other than mechanical saturations of joint torque limits. The differences between insights provided by simulations and empirical evidence could also potentially be attributed to the limited experience of various gait strategies achieved through mapping trials. During mapping trials, individuals are typically exposed to only a limited number of asymmetry levels for one feature while the other features are clamped at certain levels throughout each bout [33, 34]. One could argue that experiencing a few discrete values of one asymmetry feature may not provide enough opportunity to potentially explore regions of the cost landscape that correspond to more optimal strategies. In addition, a simultaneous variation 74 4.4. Discussion of different features rather than independent manipulation of one feature at a time may be the solution to experience and achieve strategies that are more optimal and perhaps resemble what simulations of split-belt gait would predict. Another potential explanation for why simulations of gait with respect to energy optimality principles predict substantially larger asymmetries than experiments could be that the true principles underlying human locomotor control and learning may be different from optimality. A rather less greedy action selection algorithm, such as a good-enough control approach, may predict the empirical evidence better. 
Alternatively, a combination of different algorithms and the addition of different control laws to govern the motor control problem at different time scales may help to understand the underlying principles better. One example of such a framework is the implementation of feedback control to stabilize gait reactively while the longer-term goal is to land feed-forward solutions, via an optimization-based approach, in regions of the cost landscape with good fitness values. In addition, rather than a single cost or value function, often related to energetics or effort, the human central nervous system may care about a trade-off among several factors, such as different components of muscle work, stability, and kinematic metrics, in generating control behavior. A main methodological limitation of the present work is the approximation of the cost landscape from a set of feasible solutions and its use for gradient-based search. Estimating a fit through a set of data points in solution space is a difficult problem, and since it is done using only a limited discrete sample of the solution space, the function approximation may be inaccurate around regions of the variable space where the sample density is smaller, especially in higher variable dimensions. Another limitation of the current study is the model of the biped, wherein no muscular or tendon-driven control elements are incorporated. Simulation studies of gait have attempted to achieve and predict a more accurate and realistic human gait by using musculoskeletal models [105]. However, increasing the model complexity brings about challenges related to computational cost, interpretability of the results, and validation feasibility. Studies of human motor control have historically been interested in explaining variability in human motor control through mechanistic approaches that mathematically represent the presence of noise at different levels of the sensorimotor system [139, 140, 128, 141]. While variability in the strategies humans select is typically represented as a stochastic process, we did not explicitly incorporate noise at any of the control or sensory levels to gain a mechanistic understanding of the principles underlying human locomotor behavior. The hybrid evolutionary-gradient-based search with multiple starts developed in the current study as a heuristic approach provided an opportunity to obtain a family of feasible and good-enough solutions with respect to energetics-related costs. The resulting variability in our predicted solutions was primarily due to the search based on the cost landscape gradient, wherein our optimality thresholding approach allowed the achievement of approximate local minima at potentially different regions of the cost landscape, each corresponding to a different solution. By performing this gradient-based search starting from random initial solutions, the algorithm could find solutions anywhere within the potential regions associated with shallow-enough gradients. Since we simulated the good-enough search multiple times, each time with a different start point randomly sampled in the vicinity of the best solutions from tied-belt treadmill walking, the search algorithm resulted in a variable set of acceptable and good-enough solutions.
4.5 CONCLUSION
The results of our feasible and good-enough simulations of split-belt gait with respect to energy-related costs predicted dominantly positive step time asymmetries and generally positive foot placement asymmetry, which reduced the net positive work required of the legs. Our solutions were associated with substantially higher spatiotemporal asymmetry values than the empirical observations of steady-state split-belt gait features. Our simulations provided a framework to predict human gait patterns under the premise that the goal of human split-belt locomotion at steady state is achieving good-enough solutions with respect to effort. Therefore, the discrepancies between our predictions and the empirical evidence suggest that biological motor control and learning are perhaps driven in part by principles and factors other than energy optimality alone.
ACKNOWLEDGEMENTS
The authors acknowledge the Center for Advanced Research Computing (CARC) at the University of Southern California for providing computing resources that have contributed to the research results reported within this publication. URL: https://carc.usc.edu.
Figure 4.8: Correlation plot for spatiotemporal scaling variables from good-enough solutions. The pairwise correlations among the decision variables obtained across different levels of belt speed difference are represented. The size of the circle for any given pair of variables represents the strength of their correlation. Circles in the red spectrum represent positive correlation coefficients, whereas circles in the blue spectrum represent negative correlation coefficients. The plot suggests that the variation in several variables is not independent of that in the others, and that only a few variables could possibly explain the variance in positive work rate, as the dependent variable, across different belt speed conditions.
Figure 4.9: Leg work rate results from good-enough solutions with respect to positive work rate. (A) The positive and negative components of the work rate performed by each leg. (B) The net positive and negative work rates performed by the legs (net positive work rate fit: y = -55.6x + 424.4, r = -0.35). Net positive and negative work rates both decreased in magnitude with increasing belt speed ratio.
Figure 4.10: Results of spatiotemporal features from good-enough solutions with respect to positive work rate. (A) Foot placement on each belt (top) and the measure of foot placement asymmetry (FPA) (bottom) across different belt speed ratios. (B) Step length on each belt (top) and step length asymmetry (SLA) (bottom) across different belt speed ratios. (C) Step time for each leg (top) and the measure of step time asymmetry (STA) (bottom) across different belt speed ratios.
Figure 4.11: A comparison of spatiotemporal asymmetry features (step length asymmetry and step time asymmetry) of split-belt walking at the belt speed ratio of 3:1 from the good-enough and optimality predictions, empirical observations (Sánchez et al. 2019; Stenum & Choi 2020), and prior simulation predictions (Seethapathi et al. 2021; Price et al. 2023).
Figure 4.12: Results of stance and swing time from good-enough solutions with respect to positive work rate. (A) Swing time for the slow leg. (B) Swing time for the fast leg. (C) Stance time for the slow leg. (D) Stance time for the fast leg.
Figure 4.13: Results of foot placement from good-enough solutions with respect to positive work rate. (A) Leading and trailing foot placement on the slow belt. (B) Leading and trailing foot placement on the fast belt.
Figure 4.14: Feasible solutions for walking on the treadmill across a range of belt speed ratios. The cost maps formed from feasible solutions reveal regions of the variable space that correspond to minimal values of the positive work rate cost. Deviations beyond the cost-minimal regions could lead to suboptimal feasible solutions that correspond to extra amounts of energy consumption.
Chapter 5
Challenges and Opportunities in the Application of Inverse Optimal Control to Explain Human Locomotion: A Review
5.1 OVERVIEW
Observing humans walking in stereotypical fashions across various conditions motivates the effort to understand what factors influence how they select their gait strategies.
To this end, several normative theories have been proposed to provide plausible objective explanations for the patterns and features that emerge during human locomotor behavior. Optimal control theory is a normative theory that has had success in explaining several locomotor features in both walking and running. According to optimal control theory, when applied to bio-inspired motor behaviors, the organism's brain regulates the pertinent motor features such that some performance or behavioral objective, also known as fitness, is optimal. While often utilized to predict behaviors, optimal control theory can also be used to infer the mechanisms underlying given behavioral patterns. Mathematically, optimal control amounts to implementing an optimization problem to explore sequences of motor patterns that minimize some functional objective while also satisfying different task or organismic constraints. One often implements optimal control in a predictive simulation framework where the objective is known a priori and the behavioral patterns are to be determined. The objective is mainly a scalar functional metric that represents performance or behavioral fitness and can entail one or more criteria. Although the functional forms of these criteria are known, which features constitute the objective of a locomotor control behavior in an optimal control framework, and to what extent, remains unknown. Therefore, the main purpose of this review is to survey frameworks that can help us understand the mechanistic factors underlying given motor behaviors. This problem is identified as an inverse optimal control (IOC) problem, which aims to infer the features of plausible motor objectives that, when optimized, would predict given behavioral patterns. The present review deals with the fundamentals of human locomotion and the utility of an inverse optimal control problem for objective inference in locomotor control. In addition to the focus on reviewing motor objectives, this review also discusses different variations of the inverse optimal control problem applied to models of locomotion. The recommended recipe to represent and solve an objective inference problem for locomotion could vary depending on specific assumptions made about the behavior. In particular, considerations that would affect one's choice of approach include those about the order of optimality of the given behaviors and the presence of sensorimotor uncertainty in the behavior. Another important dimension to consider is which components of motor control are to be modeled in the problem, i.e., whether both proactive and reactive control are assumed to be present. Therefore, we propose an overview of different approaches for implementing an inverse optimal control framework. Further, this review deals with the challenges and open questions in motor control that can be addressed through the use of IOC. Ultimately, we discuss what opportunities the use of IOC can offer in locomotor control and learning. For instance, using the outcome of an objective inference problem as a proactive control scheme for making predictions is one application of IOC in walking. Further, when walking on unpredictable terrains with perturbations, an IOC problem could be used to infer the contributions of both reactive and proactive control components to the mechanistic control scheme.
The remainder of this work will review the two main approaches to inferring the preferences underlying observed motions. These approaches differ mainly in the choice of model for the problem formulation. The first, Inverse Reinforcement Learning (IRL), models the dynamics as a Markov Decision Process (MDP). An MDP is typically appropriate when considering many example motions in a state space, such as many observed motions in a particular room or along a road network. The second approach is Inverse Optimal Control (IOC), which models the problem as a trajectory optimization. This formulation is generally appropriate when considering a single optimal behavior. These two modeling approaches are detailed below, along with a brief synopsis of the existing literature on inverse approaches, in Sections 5.2 and 5.3. Key issues to consider when formulating an inverse method using either approach are then examined, including cost functions (see Section 5.4) and optimality properties (see Section 5.5). Finally, a summary of applications of both cost inference frameworks and specific areas of further research are proposed to advance inverse optimization methods for treating human locomotion priorities (Sections 5.6 and 5.7).
5.2 MARKOV DECISION PROCESS METHODS
5.2.1 Markov Decision Process Representations
A Markov Decision Process (MDP) can be considered a graph representation of a dynamic system with state x_t at time t. At a given state x_t, a controller chooses an action a from a set of available actions, each of which results in probabilities of transitioning to the next state x_{t+1}. A reward of R_a(x_t, x_{t+1}) is received when transitioning from state x_t to x_{t+1} after taking action a. While this representation most naturally represents discrete states and actions, sampling methods or function approximation can be used to approximate continuous MDPs. A complete representation of a system using an MDP would include every state the system can assume, as well as every action available at each state. Dynamic programming considers a policy and a value function over an MDP. A policy π(x) maps from state x to a desired action a, and the value V(x) of a state x is the expected total discounted reward starting from x (e.g., assuming a policy π(x)). The discount factor γ^n, where γ ∈ (0, 1), is used to discount rewards n timesteps in the future; γ represents a preference for achieving similar rewards sooner rather than later. State-indexed arrays are typically used to represent the policy and value function for discrete problems. However, state-indexed arrays contain information proportional to the number of states under consideration, which may be substantial for complex systems. The fundamental basis for solving for a policy is the Bellman optimality equation,
\[ V^{\pi^*}(s) = \max_{a}\Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{\pi^*}(s') \Big]. \tag{5.1} \]
Briefly, it states that the value of a state, when following the optimal policy, is the reward of the optimal action at that state plus the (discounted) expected value of all future rewards when continuing to follow the optimal policy. MDPs are useful for finding a policy to run at any particular state in a limited state space. For example, an engineer building a control policy for a robot operating in a known environment with a specified goal location may use dynamic programming with an MDP. This policy can be followed starting from any location in the environment and will arrive at the goal state.
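To make Eq. (5.1) concrete, the following minimal sketch runs value iteration on a small, hypothetical four-state chain MDP; the transition and reward structure are invented purely for illustration and are unrelated to any locomotion model.

```python
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9

# P[a, s, s']: transition probabilities; R[s, a]: expected immediate reward.
P = np.zeros((n_actions, n_states, n_states))
for s in range(n_states):
    P[0, s, max(s - 1, 0)] = 1.0               # action 0: step "left"
    P[1, s, min(s + 1, n_states - 1)] = 1.0    # action 1: step "right"
R = np.zeros((n_states, n_actions))
R[n_states - 2, 1] = 1.0                       # reward for stepping into the goal state

V = np.zeros(n_states)
for _ in range(500):                           # Bellman backups until a fixed point
    Q = R + gamma * np.einsum("asn,n->sa", P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=1)                      # greedy policy w.r.t. the optimal values
print("V* =", np.round(V, 3), " policy =", policy)
```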
Therefore, MDPs are appropriate for reasoning about many trajectories which span a substantial volume of the state space. These state 90 5.2. Markov Decision Process Methods spaces are typically represented with discrete states, with a specific cost for each discrete state. Markov decision processes are used to treat problems with continuous states as well, typically by sampling states in a representative way. 5.2.2 Inverse Reinforcement Learning Based on Markov Decision Processes Typical MDP-based approaches to inverse optimization are referred to as Inverse Reinforcement Learning (IRL), so-called because they invert the typical reinforcement learning algorithm, which aims to generate actions to optimally traverse an MDP based on a reward function. Instead, IRL approaches seek a cost function and policy which can best explain a database of given motions. The database provides example motions that potentially span the volume of the state space, possibly with each state being visited many times. Perhaps the earliest work in this area introduces the goal of discovering a reward function underlying observed behaviors [142]. One notable requirement is that the discount factor preference γ is assumed to be known rather than learned from behavior. The fundamental observation is that the cost function must satisfy the Bellman optimality equation (Equation 5.1). The authors note an ambiguity in the formulation of the inverse problem, where a trivial reward (R a (x t ,x t+1 ) = 0) is always consistent with any policy. That is to say, any behavior might always be a result of randomly choosing actions. The authors then introduce criteria to choose a reward function to maximally separate the performance of the observed policy’s choices from other action choices while 91 Chapter 5. A Review of Inverse Optimal Control Approaches also attempting to use ”simpler”, sparser rewards over states. The authors consider both discrete and continuous state spaces and, in continuous space, introduce cost basis function approximations, which map states to reward components in continuous space. Formulating the cost of a continuous MDP as a linear combination of cost basis functions allows for modeling physiological costs, although solving the optimality conditions may require approximation. Specifically, in continuous state spaces, the optimality requirement may be infeasible due to too many constraints or the inability of the linear function approximation to match the true cost function [142]. In this case, penalty methods can be used to find a maximally feasible solution. Additional work on inverting MDP models focuses on the concept of ”apprenticeship learning” [143], where the main goal is to recover a policy that behaves similarly to expert demonstrations. This is motivated by the observation that an expert might have difficulty expressing reward functions but can easily demonstrate desirable behaviors. The authors derive an IRL algorithm that finds a policy that performs as well as the expert demonstration according to an unknown reward function which may be different from the expert’s true reward function. The work also seeks a generalizable policy that extends beyond the examples. The algorithm first generates a list containing a random policy. The second step then finds reward weights which results in the expert behavior being better than all of the policies considered so far by the largest possible margin. 
A forward reinforcement learning step is then performed to estimate the optimal policy given these weights, which is added to the list of policies considered. The algorithm then returns to step two and iteratively repeats until the margin between the demonstrated and proposed controller behavior becomes acceptably small. Closely related work on imitation learning [144] focuses on finding reward weightings and policies which match the state-action pair frequency observed in the example behaviors. That is, instead of a cost function that matches feature vectors with an unknown cost weighting, the algorithm seeks a cost function that behaves statistically as similar to the observed trajectory (or policy) as possible. The authors posit this approach provides better generalization than direct feature mapping as it is less dependent on specific choices of features. Similarly, apprenticeship learning [145] seeks a policy that performs well with respect to an unknown weighting of given reward basis functions of the state. These basis functions may, for example, represent physiological costs. One implication of assuming reward bases is that the resulting policy from apprenticeship learning may be different from, and even better than, the expert demonstration. However, the algorithm performs more poorly when the demonstrated behavior is far from optimal. Additional work [146] treats the optimization of road path planning as inherently stochastic, where the policies to be learned are probabilistic. This approach led to a substantial generalization of cost functions for a large dataset of many drivers navigating a road network, driving to many different goal states. Attempts have been made to relax the requirements of entire trajectories in favor of simple lists of state transitions [147]. This work also focuses on adaptive continuous state approximations with cost functions of state, allowing for the treatment of physiological cost factors. One assumption in this work is the full controllability of the system, which typically cannot be guaranteed for bipedal walking dynamics. Another approach to optimization in dynamic continuous spaces [148] uses Gaussian functions to model costs as a function of positions and velocities. This choice of parametrization results in efficient learning with limited motion capture data. The generality of modeling problems as MDPs has led to IRL application beyond motion planning, including understanding the rules of parsing natural language [149].
5.3 TRAJECTORY OPTIMIZATION METHODS
5.3.1 Trajectory Optimization Representation
It is not always required to derive a closed-form control policy that expresses solutions for a composition of optimal paths for all states at all times as in a closed-loop control. Further, obtaining such closed-loop solutions based on the principle of optimality becomes implausible when a control system with high-dimensional nonlinear dynamics and several physical or physiological constraints, such as human motor control, is involved. In particular, when the goal is to find a specific control strategy that, given, e.g., an initial state, is better than all other plausible strategies, then the system's dynamics can be modeled as a trajectory through a continuous state space. Then, the goal in such behaviors amounts to identifying trajectories of
states and controls that, when adopted by the dynamical system, move the system through the state space in a manner that minimizes a given cost. This is achieved through trajectory optimization, which is based on the calculus of variations. Trajectory optimization aims to compute trajectories of states and controls as an open-loop solution to a broader constrained optimal control problem posed for a dynamical system, such as human locomotor control. A trajectory optimization problem is generally defined as an optimization over continuous trajectories of states and controls. However, to numerically solve a motor control problem as a trajectory optimization, one approach is to discretize the continuous trajectories into arrays of state and control values over a finite number of time points. In a trajectory optimization problem, the goal is to determine the trajectories of the n states x(t) = [x_1(t), ..., x_n(t)]^T, which often describe the kinematics, the trajectories of the m control inputs u(t) = [u_1(t), ..., u_m(t)]^T, which could be represented as either generalized forces (such as torques and propelling forces) or neuromuscular activations, and the final time t_f, such that a scalar objective function of the following form is minimized:
\[ J = \hat{J}\big(x(t_0), x(t_f), t_f\big) + \int_{t_0}^{t_f} \phi\big(x(t), u(t), t\big)\, dt, \tag{5.2} \]
subject to the dynamics of the motion, represented as continuity constraints,
\[ \dot{x}(t) = f\big(x(t), u(t), t\big), \tag{5.3} \]
and other constraints, including those related to state or control bounds and those on task or goal completion,
\[ P_{\min} \le P\big(x(t), u(t), t\big) \le P_{\max}, \tag{5.4} \]
\[ b_{\min} \le b\big(x(t), u(t), t\big) \le b_{\max}. \tag{5.5} \]
In the objective function J in (5.2), the term \hat{J} denotes the cost associated with achieving the initial and terminal states, and the second term on the right-hand side is the cost in the form of the integral of a path-dependent function \phi over time. For human locomotion, since there is often no cost associated with initial or terminal configurations, the objective function reduces to J = \int_{t_0}^{t_f} \phi(x(t), u(t), t)\, dt. In (5.3), \dot{x}(t) denotes the time derivative of the state vector x(t), and f embodies the dynamics as a function of states, controls, and time. Moreover, (5.4) represents the vector of path constraints over the state and control trajectories, and (5.5) shows the upper and lower limits that need to be placed on the states, the controls, or points in time. Both (5.4) and (5.5) are generally required to ensure the physical or physiological feasibility of the obtained strategies. In general, for modeling a locomotor behavior through trajectory optimization in an optimal control framework, (5.2)-(5.5) can be simplified as follows:
\[ [x^*, u^*] = \arg\min_{x,u} \int_{t_0}^{t_f} \phi\big(x(t), u(t), t\big)\, dt, \quad \text{s.t.} \quad h_j\big(x(t), u(t), t\big) = 0,\ j = 1, ..., m_1; \quad g_j\big(x(t), u(t), t\big) < 0,\ j = 1, ..., m_2, \tag{5.6} \]
where h_j (j = 1, ..., m_1) and g_j (j = 1, ..., m_2) summarize all the dynamics, the path constraints, and the bounds as equality and inequality constraints, respectively.
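To make the discretization step concrete, the sketch below applies direct transcription to a toy double-integrator system rather than a biped: state and control trajectories are collected at a finite number of nodes, the integral cost is approximated by a sum, and the dynamics enter as equality (defect) constraints. It is a minimal sketch under those simplifying assumptions, not the forward solver used elsewhere in this dissertation.

```python
import numpy as np
from scipy.optimize import minimize

# Move a unit point mass from rest at x=0 to rest at x=1 in T seconds while
# minimizing the integral of u^2 dt (a simple effort-like running cost).
N, T = 20, 1.0
dt = T / (N - 1)

def unpack(z):
    return z[:N], z[N:2*N], z[2*N:]            # positions, velocities, controls

def cost(z):
    _, _, u = unpack(z)
    return dt * np.sum(u**2)                   # discretized running cost

def constraints(z):
    x, v, u = unpack(z)
    # Trapezoidal defect constraints for x' = v, v' = u, plus boundary conditions.
    dx = x[1:] - x[:-1] - 0.5*dt*(v[1:] + v[:-1])
    dv = v[1:] - v[:-1] - 0.5*dt*(u[1:] + u[:-1])
    bc = [x[0], v[0], x[-1] - 1.0, v[-1]]
    return np.concatenate([dx, dv, bc])

res = minimize(cost, np.zeros(3*N), method="SLSQP",
               constraints={"type": "eq", "fun": constraints},
               options={"maxiter": 500})
x_opt, v_opt, u_opt = unpack(res.x)
print("optimal effort cost:", round(res.fun, 4))
```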
Regarding the proposed formalization for a rather complicated behavior with different phases, such as human locomotion, it is worth noting that (5.6) cannot yet embody the details of the behavior unless it accounts for transitions from one phase to another (e.g., single-support to double-support transitions or vice versa). This does not change the formalization represented in (5.6), but other specific constraints need to be put in place under the equality constraints to ensure that continuous dynamics is achieved across phases as well. The objective function, accordingly, takes the form of the sum of similar or different scalar criterion measures over each phase. What remains to be debated is, among all behavioral or performance criteria, what the primary objective of a locomotion behavior is that, when optimized in the form of a function, can make reasonable predictions of motor strategies.
5.3.2 Inverse Optimal Control Based on Trajectory Optimization
An objective inference problem formalized via an IOC framework with a composite cost as a parametric linear combination of cost components becomes a parameter identification problem. Thus, the goal mathematically amounts to determining a set of unknown relative weights corresponding to each of the well-defined sub-criteria that, when optimized, can make reasonable predictions of walking behaviors consistent with those observed in humans. An IOC problem is often formalized as a bilevel problem as follows:
\[ \omega^* = \arg\min_{\omega} \big\| x^*(t; \omega) - \tilde{x}(t) \big\|, \tag{5.7a} \]
where \omega denotes the weights of a vector of sub-criteria, and x^*(t; \omega) is the optimal behavior that results from the solution of the following forward simulation:
\[ [x^*, u^*] = \arg\min_{x,u} \sum_{i=1}^{C} \omega_i \int_{t_0}^{t_f} \phi_i\big(x(t), u(t), t\big)\, dt, \quad \text{s.t.} \quad h_j\big(x(t), u(t), t\big) = 0,\ j = 1, ..., m_1; \quad g_j\big(x(t), u(t), t\big) < 0,\ j = 1, ..., m_2. \tag{5.7b} \]
An IOC problem, as presented in (5.7), is essentially composed of two levels. The lower level (5.7b) attends to solving the same forward optimal control problem as in (5.6), but with a new objective function that is the weighted sum of C different sub-criteria instead of the old single-criterion objective; m_1 and m_2 denote the number of equality and inequality constraints, respectively. The upper-level optimization in (5.7a) modulates the weights corresponding to the sub-criteria in the objective of the lower-level optimization. The upper level ultimately aims to find a set of optimal weights \omega^* that, when plugged into the objective of the lower-level problem, yields a locomotion behavior x^*(t; \omega^*) from the lower-level optimization that minimizes the error cost \|x^*(t; \omega^*) - \tilde{x}(t)\|. Minimization of this upper-level error cost ensures that the predicted behavior x^*(t; \omega^*) is close enough to the demonstrated human locomotion behavior \tilde{x}(t). It is worth noting that the demonstrated data, depending on the observability of the model of the system and its environment, can involve the kinematic, dynamic, or neural aspects of a given behavior.
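A minimal sketch of this bilevel structure is shown below, again on a hypothetical double-integrator task rather than a gait model: the inner loop solves the forward problem of (5.7b) for a candidate weight vector over two assumed sub-criteria, and a derivative-free outer loop adjusts the weights, as in (5.7a), to match a synthetic demonstration generated with known weights.

```python
import numpy as np
from scipy.optimize import minimize

N, T = 15, 1.0
dt = T / (N - 1)

def forward_opt(w):
    """Inner loop: forward trajectory optimization for feature weights w."""
    def unpack(z):
        return z[:N], z[N:2*N], z[2*N:]
    def cost(z):
        _, v, u = unpack(z)
        # Two candidate sub-criteria: control effort and velocity magnitude.
        return w[0]*dt*np.sum(u**2) + w[1]*dt*np.sum(v**2)
    def constr(z):
        x, v, u = unpack(z)
        dx = x[1:] - x[:-1] - 0.5*dt*(v[1:] + v[:-1])
        dv = v[1:] - v[:-1] - 0.5*dt*(u[1:] + u[:-1])
        return np.concatenate([dx, dv, [x[0], v[0], x[-1] - 1.0, v[-1]]])
    res = minimize(cost, np.zeros(3*N), method="SLSQP",
                   constraints={"type": "eq", "fun": constr},
                   options={"maxiter": 500})
    return res.x[:N]                             # predicted position trajectory

# Synthetic "demonstration" produced with known weights (here [1.0, 0.3]).
x_demo = forward_opt(np.array([1.0, 0.3]))

def outer_cost(w):
    w = np.abs(w)
    w = w / w.sum()                              # weights matter only up to scale
    return np.linalg.norm(forward_opt(w) - x_demo)

res = minimize(outer_cost, np.array([0.5, 0.5]), method="Nelder-Mead")
w_hat = np.abs(res.x) / np.abs(res.x).sum()
print("recovered weights:", np.round(w_hat, 3))  # ideally close to [0.77, 0.23]
```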
5.3.3 Different Methods of Estimating Costs and Policies from Example Data
Trajectory optimization approaches to inverse optimal control can be split into two categories: single-level and bi-level. Bi-level trajectory optimizations consist of an outer-loop optimization process to propose cost weights and an inner forward optimization to find the trajectory predicted by those weights. Bi-level trajectory optimization IOC approaches differ in the implementation of the outer- and inner-loop optimizations. Outer-loop optimizations are typically derivative-free, so gradients do not need to be estimated by finite-differencing the results of the inner-loop optimization. Approaches include genetic algorithms [150] and Nelder-Mead simplex algorithms [151]. The inner-loop optimization is a global forward trajectory optimization, modeling the costs and constraints of the system in question, typically implemented as a direct collocation. A likely limiting factor on the complexity of the model and task under consideration is the reliability of this forward trajectory optimization routine. Such methods can apply to locomotion models with tens of joints if contact ordering can be prespecified [152, 101]. For healthy, steady walking and running behaviors, prespecifying a heel-strike and toe-off contact scheme, with double support for walking, is likely sufficient. However, predicting behavior with other reasonable contact ordering options, such as using a handrail while walking, may stress the ability of these trajectory optimizations to reliably predict optimal trajectories given weights. One approach to improving the robustness of the inner loop is to run many parallel forward optimizations with different starting states and choose the result with the lowest cost. Another approach to the outer-loop weight optimization is a simple iterative process whereby all trajectories generated by the inner loop are assumed to be suboptimal, and a candidate set of weights is generated under constraints that ensure the optimal example has a lower cost than each of the generated suboptimal trajectories [84]. Bi-level trajectory optimizations have shown some robustness to added noise, model and task errors, as well as arbitrarily incorrect assumptions on the presence of distracting cost components [84]. One important issue that arises with a bilevel optimization scheme is that if the example trajectory is a local minimum of the optimization problem, the inner-loop trajectory optimization may find a strictly better trajectory with respect to the real cost.
Single-level trajectory optimization approaches can be split into two categories: methods based on Karush-Kuhn-Tucker (KKT) conditions and local update laws. KKT methods formulate the bi-level optimization problem specified above but then examine the KKT conditions for optimality of the inner loop in the vicinity of the demonstrated optimal trajectory. The essential observation is that the observed trajectory must satisfy the KKT conditions of that inner trajectory loop, so the optimization itself need not be performed; the KKT conditions can be assumed to be satisfied, which simply adds constraints to a single-level optimization for the weights ω [153]. The KKT conditions for the inverse optimization problem represented in (5.7), with the first-order optimal state x^*, are based on the stationarity of the Lagrangian:
\[ \nabla_x \mathcal{L}(x^*, \nu, \lambda) = \nabla_x \sum_{i=1}^{C} \omega_i \int_{t_0}^{t_f} \phi_i(x^*, u^*, t)\, dt + \sum_{i=1}^{m_1} \nu_i \nabla_x h_i(x^*) + \sum_{i=1}^{m_2} \lambda_i \nabla_x g_i(x^*) = 0. \tag{5.8} \]
The full set of Karush-Kuhn-Tucker (KKT) conditions is:
\[ g_i(x^*) \le 0, \quad i = 1, ..., m_2, \tag{5.9a} \]
\[ h_i(x^*) = 0, \quad i = 1, ..., m_1, \tag{5.9b} \]
\[ \lambda_i\, g_i(x^*) = 0, \quad i = 1, ..., m_2, \tag{5.9c} \]
\[ \lambda_i \ge 0, \quad i = 1, ..., m_2, \tag{5.9d} \]
\[ \nabla_x \sum_{i=1}^{C} \omega_i \int_{t_0}^{t_f} \phi_i(x^*, u^*, t)\, dt + \sum_{i=1}^{m_1} \nu_i \nabla_x h_i(x^*) + \sum_{i=1}^{m_2} \lambda_i \nabla_x g_i(x^*) = 0, \tag{5.9e} \]
where the first two conditions, (5.9a) and (5.9b), are the primal feasibility conditions, (5.9c) is the complementary slackness condition, (5.9d) is the dual feasibility condition, and (5.9e) is the stationarity condition.
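The stationarity condition (5.9e) is linear in the unknown weights and multipliers, which is precisely what single-level KKT-based IOC exploits. The sketch below illustrates the idea on an unconstrained toy problem (so only the weights appear): the nontrivial weight vector that minimizes the stationarity residual at the demonstration is recovered from the smallest right singular vector of the feature-gradient matrix. The example cost features and the demonstration are assumptions made purely for illustration.

```python
import numpy as np

def grad_phi(x):
    """Gradients of two candidate features: phi1 = ||x - a||^2, phi2 = ||x||^2."""
    a = np.array([2.0, 0.0, 1.0])
    return np.column_stack([2*(x - a), 2*x])    # shape (n_states, n_features)

# A "demonstration" that is stationary for true weights [0.6, 0.4]:
# minimizing 0.6*||x - a||^2 + 0.4*||x||^2 gives x* = 0.6*a.
x_demo = 0.6 * np.array([2.0, 0.0, 1.0])

A = grad_phi(x_demo)                            # stationarity residual is A @ w
_, _, Vt = np.linalg.svd(A, full_matrices=False)
w = Vt[-1]                                      # minimizes ||A w|| subject to ||w|| = 1
w = w / w.sum()                                 # rescale so the weights sum to one
print("recovered weights:", np.round(w, 3))     # close to [0.6, 0.4]
```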
Typically, in a forward trajectory optimization, the KKT conditions are used to determine search directions and stopping conditions [154]. Optimization algorithms search for both the statex ∗ , and the Lagrange multipliersλ , which satisfy KKT conditions, where the cost function f is known. In inverse optimal control using the KKT conditions, the same conditions must hold, except x is known and equal to x ∗ (the given example), and the cost function is a function of unknownω. The other approach to single-level inverse optimal control relies on a forward trajectory optimization which updates a candidate trajectory based on an update law, so-called pi-squared. In particular, the pi-squared update law has proven useful for inverse optimal control [148, 155, 156]. The essential observation leading to pi-squared IOC is that, given an optimal trajectory, an admissible cost function is one where the pi-squared update law evaluates to zero. That is, the update law suggests no direction in which the example trajectory could be improved, given the correct cost function. IOC based on pi-squared, like all inverse methods, inherits the 102 5.3. Trajectory Optimization Methods limitations of its forward methods. In particular, it is unclear how to incorporate hybrid events such as impacts and contact transitions naturally in a pi-squared framework. Extensions to pi-squared to handle hybrid events, such as sequencing separate trajectories for separate phases connected by discrete transitions, may be possible. However, the application of pi-squared inverse optimal control to human locomotion, where hybrid dynamics are crucial, is limited until these extensions are developed. A commonality among single-level trajectory optimization-based IOC techniques is that they necessarily operate on a local minimum. This may be considered an advantage over the bilevel trajectory optimization with a global forward search, as discussed above, as single-level algorithms do not require that the demonstration be a global optimum. However, this may also preclude the investigation of cost functions that choose among distinct isolated approaches to solving the task. Another method of more limited practical value that has been used to investigate cost functions is brute force search of a small set of costs or models. For example, it was shown that a range of natural properties of gait emerges from simply optimizing different simple models of gait for mechanical energy [157]. Rebula and Kuo also showed that a range of natural walking properties emerges when optimizing for various combinations of mechanical work and force smoothness, which may be related to a metabolic cost for switching muscle forces [158]. While these studies were more for illustrating the expressive power of simple models and costs in human locomotion, simply performing forward searches for all possible models and cost 103 Chapter 5. A Review of Inverse Optimal Control Approaches combinations is likely infeasible for most natural motions. 5.4 CONSIDERATIONS OF COST FUNCTIONS TO GENERATE BEHAVIOR Formulating an IOC or IRL problem requires an insight into the objective function and what potential criteria are best suited to form a composite objective. One classification of individual costs would suggest that the most important criteria can be split into four different types. 
The criteria related to behavior reproduction are those for which part of the objective is to closely track some nominal patterns or features observed in natural gait, e.g., joint angles, walking speed, symmetry, and so on [159, 160, 161]. The second group of criteria comprises the abstract costs that allow for arbitrary shaping of the reward or penalty over the state space, the most functional examples of which are state function approximators in the form of Gaussian kernels [143]. This group of criteria is often used as the objective for behavior when it is represented via a Markov Decision Process (MDP) or solved in the context of dynamic programming [142, 143]. Task costs are a group of criteria that quantify performance mostly in the task space and are extrinsic in that they can be directly measured in experiments or calculated in a model without the need for lower-level analysis such as dynamics. Minimization of end-point position error or error variance in reaching tasks [162, 163], or of the biped torso pitch angle in walking [164], are examples of such costs. The intrinsic criteria are the last and most popular family of costs and are best suited for predictive simulations of motor behaviors. These criteria mostly consist of mechanical or neuromuscular measures of performance, such as energy, effort, jerk, etc., which cannot be directly measured in the task space. For these criteria to be used as optimality objectives, one needs a deeper knowledge of the lower-level processes involved in the control of the given motor behavior. Section 5.4.1 reviews choices of intrinsic costs in the human motor control domain.
5.4.1 On Intrinsic Costs
A compromise among many candidate criteria, such as speed, acceleration, maneuverability, energy economy, stability, and many others, might be the driver of individuals' locomotor patterns [11]. However, it remains to be seen what set of physically meaningful criteria explains human locomotor behaviors across various contexts [79, 165]. The general consensus is that, depending on the level of motor control at which the features are to be evaluated, different families of performance or behavioral criteria can be used as candidates for representing the cost function in the optimization [165]. Kinematic criteria are represented as functions of the states, at the joint level or in task space, and their respective derivatives. The minimum jerk [99] and minimum acceleration [166] criteria, in both joint and Cartesian space, result in human body motions with maximal smoothness [167, 168]. Dynamic criteria, also known as minimum effort criteria, emphasize minimizing a specific norm of the generalized forces or their derivatives. Examples of minimum effort criteria involve the minimum absolute or Euclidean norms of, for instance, torques [48, 127, 74] or torque rates [100, 169] in joint space. Dynamic criteria can also be translated to neural criteria in a sufficiently complex model of walking, such as a musculoskeletal model, where minimization of a norm of motor neuron activities in muscles can represent the minimum effort feature in neural space [8, 170]. Another major class of criteria consists of those related to energetics.
Empirical observations of human gait features, such as symmetric step length and step width or estimates of the optimal speed of walking, have been shown to correspond to energetically optimal patterns [22, 31, 171, 30, 25, 24]. In addition, predictive simulations of walking with energy minimization on various models of the human lower extremity, e.g., a simple inverted pendulum model of the human legs, discover reasonable patterns for a classic gait [16, 51]. Several studies have adopted different forms of the minimum mechanical work per distance performed by the legs as a valid alternative energetic cost metric, since the mechanical work for step-to-step transitions appears to correlate with the metabolic cost [19]. Examples of work-related proxies include peak work [131], positive work [132], and total absolute work [52]. Furthermore, minimizing metabolic energy expenditure, as another form of energetic criterion, in simulations of minimal walking models has been shown to predict behaviors that are consistent with those observed in humans [23, 56, 49]. Another form of energetic criterion, defined at the junction of kinematic and dynamic criteria, is the geodesic cost, also referred to as kinetic energy. Minimization of a geodesic cost is inspired by the idea that the human central nervous system may adopt the shortest path in configuration space with respect to the kinetic energy measure [79, 172]. Other criteria are those related to the stability or balance of a model during walking. One can opt for a scalar biomechanical measure related to, e.g., the margin of stability [173], the distance from the capture point [174, 76], or stabilizing and destabilizing forces [175] to represent stability in the form of a cost function to be optimized during walking. Alternative stability measures include the total distance from the position of the projected center of mass (CoM) or the head to the center of the extended base of support in the fore-aft direction over a gait cycle [129]. The minimization of the angular momentum of body segments has also been proposed to have success in achieving stable gait patterns [55]. Similar to lower-extremity studies, the choice of performance criteria is debated in upper-limb movement optimization. Flash and Hogan were the first to propose the use of a minimum jerk model as the objective function of the optimization for simulating two-dimensional upper-limb reaching tasks [167]. Their analysis was independent of the dynamics of the musculoskeletal system, and it yielded predictions of motor patterns that were in good agreement with experimental observations. Uno and Kawato proposed the use of the sum of squared torque rates as the objective function in the optimization of the same multijoint arm reaching task, which made predictions that were consistent with those made by the minimum jerk model as well as with the empirical observations [100]. For higher-dimensional reaching tasks in the upper extremity, a compromise among different criteria was adopted as the final composite objective function [176]: norms of hand Cartesian jerk [167], angular jerk [177], angular acceleration [178], torque [74] and torque change [100, 169], geodesic (kinetic energy) [179], joint-level mechanical effort [180, 181], and neural effort [8, 170]. However, the present review is primarily focused on the literature of studies on locomotion.
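As a concrete illustration of how intrinsic criteria of the kinds reviewed above, and collected in Table 5.1 below, might be evaluated and combined into a single composite objective, the following sketch computes a positive-work proxy, a torque-effort term, and a jerk term from synthetic joint trajectories; the feature definitions, weights, and data are illustrative assumptions only.

```python
import numpy as np

def cost_features(q, tau, dt):
    """q, tau: (timesteps, joints) arrays of joint angles (rad) and torques (N*m)."""
    qdot = np.gradient(q, dt, axis=0)                         # joint angular velocities
    power = tau * qdot                                        # instantaneous joint power
    positive_work = np.sum(np.clip(power, 0, None)) * dt      # energetics-like proxy
    effort = np.sum(tau**2) * dt                              # dynamic (effort) criterion
    qddot = np.gradient(qdot, dt, axis=0)
    jerk = np.sum(np.gradient(qddot, dt, axis=0)**2) * dt     # kinematic smoothness criterion
    return np.array([positive_work, effort, jerk])

def composite_cost(q, tau, dt, weights):
    return float(np.dot(weights, cost_features(q, tau, dt)))

# Example usage with synthetic two-joint trajectories.
t = np.linspace(0.0, 1.0, 101)
q = np.column_stack([np.sin(2*np.pi*t), np.cos(2*np.pi*t)])
tau = 0.1 * np.column_stack([np.cos(2*np.pi*t), -np.sin(2*np.pi*t)])
print(composite_cost(q, tau, dt=t[1] - t[0], weights=[1.0, 0.5, 1e-4]))
```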
Table 5.1 provides an overview of some of the most important cost features related to states and controls that can be used as the objective function of an optimal control problem in both locomotor and upper-limb control behaviors.
Table 5.1: A pool of physically meaningful cost features proposed or used in the literature. Features are categorized into four families of criteria: kinematics, dynamics, models of energetics, and stability measures. References are listed separately for locomotion and upper-limb studies.
Kinematics
- Accelerations (joint or Cartesian): locomotion [166]; upper limb [176], [178]
- Jerk (joint or Cartesian): locomotion [54], [99]; upper limb [182], [167], [168], [177]
- Quasi-tracking or accuracy: locomotion [166], [102], [76]; upper limb [77], [78], [128]
Dynamics
- Joint torques: locomotion [48], [102], [76], [127]; upper limb [74], [178]
- Joint torque rates: locomotion [54], [99]; upper limb [100], [169]
- Neural/muscular effort: locomotion [48], [53], [129]; upper limb [8], [170], [183], [184]
- Fatigue (neural exponents): locomotion [53], [129]; upper limb [130]
- Reaction force or force changes: locomotion [133], [53], [185], [158]; upper limb [130]
Energetics
- Mechanical work or cost of transport (positive, absolute, and peak measures): locomotion [19], [131], [132], [133], [16], [52], [63]; upper limb [132], [181]
- Metabolic cost models: locomotion [23], [49], [56]; upper limb [134], [135]
- Kinetic energy (geodesic): locomotion [79], [172]; upper limb [179]
Stability
- Deviations or variability in features: locomotion [76]
- Deviations from capture point or stability margin: locomotion [186], [174], [76]
- Stabilizing and destabilizing forces: locomotion [175]
- Body segment angular momentum: locomotion [55]
5.4.2 The Choice of Cost for Human Locomotor Behavior
The IRL algorithms considered in Section 5.2.2 rely on a cost formulation ranging from simple arrays indexed by discrete states to continuous cost basis functions. A Markov Decision Process that has been trained on a large library of example motions provides predictive generalization within the volume of states used to determine the cost function. That is, by choosing a starting point within the learned volume, the MDP may predict the path an expert demonstrator would choose to the goal state. However, conceptually this places strong assumptions on the state space of interest. For example, if a library of level-ground walking examples is used to identify the cost function in an MDP, it is unclear how to apply that learned process to predict the task of climbing stairs. Predicting stair climbing would require consideration of states that were not seen, or were not even possible to achieve, in the example data. It is, therefore, desirable to have a cost function that can be calculated even for novel states and that is based on physical or neuromuscular factors. Thus, in human locomotor control, inference of a parameterized representation of cost that is interpretable, such as a linear weighted combination of intrinsic costs, and that is capable of generalizing to other task contexts is more desirable.
5.5 ASSUMPTIONS ON THE OPTIMALITY OF OBSERVED BEHAVIOR
One key consideration for inferring the costs and control policies in IOC and IRL frameworks is how to treat the given example behaviors. In practice, the examples provided can be assumed to be either locally or globally optimal for the to-be-inferred costs or reward maps. The optimality assumption for the examples or reference behaviors depends on the research question or application and the available data, and the choice of either can have important implications for the inferred costs and resulting control policies.
Local optimality assumptions are often used in IOC and IRL to learn cost functions or reward-policy maps that capture suboptimal goals or constraints in a specific motor task. As such, the resolution of IOC or IRL amounts to solving an approximate cost or policy inference for which the demonstrated trajectories are locally optimal. The local optimality assumption has enabled the use of combinatorial single-level algorithms. Single-level approximate IOC algorithms based on trajectory optimization have formalized the inverse optimal control problem as a direct optimization problem via the Karush-Kuhn-Tucker conditions to infer cost weights for which the given example trajectories are locally optimal, i.e., for which first-order optimality is met [87, 88]. Local optimality has also been a key assumption in regularized search based on gradient descent and in approaches based on maximum entropy, e.g., the PI² (policy improvement with path integrals) update law [148, 155, 156].

Some studies solve the IOC problem to infer the costs for which the given example behaviors are globally optimal; that is, a forward optimizer solves the system dynamics numerous times for any given cost combination to find a globally optimal solution. The global optimality assumption on the examples often facilitates the use of a brute-force IOC [16] or a bi-level IOC [187]. Despite their power in matching the given example behavior, approaches based on the global optimality assumption cannot generalize to novel behaviors.

In human locomotor behavior, perhaps the most reasonable interpretation of steady treadmill walking steps is that each step is an imperfect attempt at achieving a theoretical optimal step. The theoretical optimal step may be found in the volume spanned by the examples but is never perfectly achieved on any given step. An MDP might be a natural method to analyze how one chooses to deal with noise around a given optimal trajectory, but it is unclear whether it is suited to predicting the ideal optimal trajectory given simple task constraints such as the desired walking speed, physical constraints on the limbs, etc.

5.6 APPLICATIONS TO HUMAN BIOMECHANICS AND BEHAVIOR

Inverse optimal control approaches have proven to be highly valuable in the field of navigation, offering insights into the decision-making processes and strategies employed by individuals while moving through complex environments. By estimating the underlying reward function, researchers can gain a deeper understanding of how pedestrians or autonomous systems plan their trajectories and make navigation choices. For example, Fajen and Warren utilized inverse optimal control techniques to analyze the behavioral dynamics of steering, obstacle avoidance, and route selection [188]. The incorporation of social forces and individual preferences to estimate the reward function guiding pedestrian navigation, and the use of the resulting model to navigate robots safely and efficiently in crowded environments, is an important application in the navigation domain [189, 190, 191]. Models capable of predicting and simulating pedestrian behavior are generally useful for designing intelligent transportation systems, enhancing pedestrian safety, and improving urban planning. By understanding the underlying principles guiding pedestrian movement, researchers can make informed decisions to optimize navigation systems and improve the overall efficiency of human-machine interaction in crowded environments.
Other applications of inverse optimal control and inverse reinforcement learning include robotic manipulation, where the aim is to infer the cost and reward structure underlying a teleoperator's intentions and task goals and then to aid in completing the task [192]. Enabling drones to make decisions on their own, as expert pilots would, is another application of IOC and IRL in robotics [193, 194]. The inverse optimal control problem has also been addressed in the framework of human-robot collaboration. In this domain, after sufficient training, robots are expected to anticipate the future behavior of humans to support a better and smoother interaction. Mainprice et al. (2015) inferred the cost functions underlying human-human collaborative tasks and then used them within a direct motion planning algorithm to predict the movement behavior demonstrated by a human avatar in the presence of a moving agent (collaborator) [195].

Inverse optimal control approaches have also been used in human biomechanics to infer the cost functions that explain given observations in reaching and manipulation tasks [196, 176, 156, 197, 198]. Moreover, different perspectives on inverse optimal control have been applied to human locomotion [76, 165, 84, 87, 129] to understand the costs underlying locomotor behavior in humans. In [87], an approximate inverse optimal control algorithm based on Karush-Kuhn-Tucker (KKT) optimality conditions was proposed to infer the weights of a combination of known costs underlying behavior from motion capture data of human arm trajectories during an industrial screwing task, postural coordination during a visual tracking task, and a walking gait initialization task. The authors also proposed a new approach to solving the inverse optimization problem in a bounded-error framework to infer a range of costs from noisy observations of motor behavior. Nguyen et al. used a bi-level inverse optimization framework to infer the weights among three common cost components from normal human walking [129]. The bi-level framework in this work consisted of an inner loop that performed the forward optimization of gait with a musculoskeletal model via direct collocation and an outer loop that tuned the weights of the inner-loop cost components. Their approach was tested on finding weights in a cost function that reproduces synthetic gait and experimental human gait data. Rebula et al. (2019) also developed a bi-level method for characterizing the cost functions of legged locomotion, with the goal of representing complex humanoid behavior with simple models [84]. They used a simple five-link biped to simulate walking gaits and generate various gait patterns and then assessed the ability of their method to recover them. In addition, the authors explored the sensitivity of their method to artificial noise in the observed behavior, imperfect knowledge of the model or task, and uncertainty in the known cost components. Some studies have utilized the inferred costs for transfer learning to autonomous walking in robots [187, 165, 117]. For instance, an approximate local inverse optimal control approach proposed learning the reward function from demonstrations of human running, which was then used to simulate the locomotion behavior of a humanoid robot in new contexts, including running on flat ground, rough terrain,
and under strong lateral perturbation [117].

5.7 OPEN QUESTIONS AND OPPORTUNITIES FOR HUMAN LOCOMOTOR CONTROL

5.7.1 Incorporation of Feedback Control

In the computational frameworks proposed by studies of IOC or IRL, stability or balance is often represented by including a metric such as those introduced in Section 5.4 in the composite objective function. The margin of stability [173], stabilizing and destabilizing forces [175], and body angular momentum values [55] are all candidates for such a metric. However, two major flaws can arise from the direct use of such measures in the objective function. First, the formulation of most of these proxies relies on the base of support and the contact sequence, which are not continuous over a gait cycle during bipedal walking, causing discontinuities in the objective function or constraints. The second and more important pitfall is that wrapping stability into a scalar objective within a feed-forward control law does not fully account for the dynamic stability of walking behavior, which relies in part on a reactive control component [199, 200]. This challenge becomes even greater when modeling adaptive or self-stabilizing gait, in which individuals must deal with external perturbations during walking, such as abrupt accelerations imposed on either belt of a dual-belt treadmill [201]. The suggested IOC formulation, which is introduced in the form of a forward model scheme, needs adjustments when used for modeling adaptive walking so that it respects instantaneous disturbances imposed by the environment and the resulting changes in walking behavior. This can often be addressed through the use of a feedback control law that tunes the motor patterns in real time in response to changes in the biped states.

A reasonable alternative is to represent stability as part of the control scheme over time rather than as a criterion in the objective function. For instance, the zero moment point (ZMP), the point on the walking plane where the resultant ground reaction force acts, can be used as part of the control policy in designing the reference kinematic trajectories to be tracked for the stability of a bipedal walking model [45, 82]. Similarly, using in the optimization the concept of the capture point or foot rotation point (FRI), a point on the walking plane where the resultant ground reaction force acts to bring the feet to a full stop [202, 203, 204], can accommodate a foot placement estimator (FPE). The FPE is derived from the assumption of conservation of angular momentum during the transition from one leg to the other and estimates where on the walking plane the feet should be placed for a dynamically stable gait [205, 206, 207, 208]. Another potential avenue to further this discussion is the use of an inverse optimal control framework to understand the trade-off between the importance of control costs associated with reactive control components and the costs associated with proactive control.

5.7.2 Analysis of Cost Components

Cost Multicollinearity

Although a linear combination of multiple cost components can be a valid assumption for the form of the composite cost function, there are often challenges associated with it.
One must note that, in biological motor control, physically meaningful costs, defined over the state variables, are often preferred over a set of abstract basis functions because of the interest in intuition and interpretability. However, since these criteria need not form a set of basis functions for the composite cost, it is highly likely that some of these meaningful criteria are not linearly independent of the others. In the scope of human locomotion, two different criteria are independent if the behavioral patterns, e.g., the kinematic evolutions resulting from the optimization of each criterion alone, are not correlated with those of the other [87]. Hence, for any pair of different but correlated meaningful components, the optimization of each alone would likely predict the same behavioral outcome as the other. As a result, solving the IOC problem would not be able to disambiguate the effects of any correlated pair of components as long as multicollinear components are present in the composite cost. This creates a challenge in identifying the weights of the correlated components, because the optimization can no longer disambiguate the relative effects that the multicollinear costs have on the behavior. Thus, one can argue that including a comprehensive set of physically meaningful criteria, rather than a true set of basis functions, in the composite cost would most likely cause a mathematical redundancy in the solutions provided by IOC.

The cost redundancy problem has been addressed in prior work only in part, by omitting the costs that had the highest degrees of cross-correlation and hence keeping only a limited number of costs in the optimization [87]. First, cost multicollinearity was evaluated as the correlation among the gradient vectors of each pair of different cost components with respect to the given example behavior. Then, the costs with the least-correlated gradients in the pairwise correlation result for a given reference behavior were selected, and the rest were omitted from the IOC process. Although omitting the correlated cost components can address cost multicollinearity within a task, it was done arbitrarily for each task. Further, it is speculated that the resulting cost basis would no longer be rich enough to generate a wide range of behaviors, especially upon interaction with novel contexts and conditions [195]. An alternative to cost reduction is to keep all components but transform them to construct a smaller number of bases, where each basis can be a combination of different cost components. Additionally, by isolating the cost gradient analysis for each given behavior, one cannot interpret the resulting cost combinations across cases, since the evaluations of cost gradients can vary heavily from one behavior to another. In addition, by examining the correlation only among cost gradients, the effect of cost variations on the behavior is undervalued. Therefore, a more biologically relevant analysis would investigate the correlations among the values of different costs with respect to behavior variations. Mathematically, this would amount to a supervised dimensionality reduction.
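For concreteness, the gradient-based screening described above can be sketched as follows (Python; the finite-difference gradients and the toy smoothness costs are purely illustrative and are not the implementation used in [87]).

import numpy as np

def cost_gradient(cost_fn, x_ref, eps=1e-6):
    # numerical gradient of a scalar cost with respect to a flattened reference trajectory
    x = x_ref.ravel().astype(float)
    g = np.zeros_like(x)
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        g[i] = (cost_fn((x + dx).reshape(x_ref.shape))
                - cost_fn((x - dx).reshape(x_ref.shape))) / (2.0 * eps)
    return g

def gradient_correlation_matrix(cost_fns, x_ref):
    # pairwise Pearson correlations among cost gradients evaluated at one example behavior
    grads = np.stack([cost_gradient(f, x_ref) for f in cost_fns])
    return np.corrcoef(grads)

# toy example: two smoothness-type costs evaluated on a (time x DOF) joint-angle trajectory
t = np.linspace(0.0, 1.0, 100)
x_ref = np.column_stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
sq_acceleration = lambda x: float(np.sum(np.diff(x, n=2, axis=0) ** 2))
sq_jerk = lambda x: float(np.sum(np.diff(x, n=3, axis=0) ** 2))
print(gradient_correlation_matrix([sq_acceleration, sq_jerk], x_ref))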
Cost Scaling

Another challenge involved with a conventional IOC is the uncertainty in the relative importance of the criteria in the objective, i.e., their different orders of magnitude. Criteria with greater orders of magnitude typically dominate the composite objective function. As a result, the IOC becomes less sensitive to the criteria with smaller orders of magnitude, as if they were absent from the additive composite objective. Take, for instance, the norm of the jerks and the error between the foot path and a desired trajectory in Cartesian space, and suppose both are present in a composite objective function with weights of roughly the same order of magnitude. The former can take values of order O(10^3), whereas the foot path deviation, when using consistent SI units, takes values of order O(1). Minimizing this composite objective would therefore lead to a disproportionately high contribution of the jerk term, and variations in the foot path error cost would be practically neglected in dictating the behavior predictions. Therefore, when including various physically meaningful cost components with different orders of magnitude in the composite cost, scaling the components prior to the inverse optimization is of paramount importance. Studies have attempted to scale the cost components by dividing them by their values from the corresponding single-cost forward optimizations [176]. For instance, if a cost related to energy and a cost related to jerk were combined in a cost function, one may first run forward simulations minimizing energy and jerk alone and then use the resulting optimal values to scale the two terms in the composite cost. An alternative is to also account for the potential variance of the costs, alongside their mean orders of magnitude, through non-negative cost standardization informed by behavioral data.

5.7.3 Stochastic Cost Inference

After accounting for cost multicollinearity in cost inference for biological motor control, there remains some likelihood that each cost can take on a range of weights that, when used in the optimization, would generate the same behavior, leading to potential diversity at the motor cost level. Moreover, a degree of flexibility in the cost combinations may explain the generalizability of the range of inferred costs to predict behavior in novel contexts in a principled fashion. However, apart from behavioral variability incorporated via noise in the reference demonstrations [84], the major focus of cost inference studies has been on inferring only a unique cost combination from behavioral observations, and attempts towards inferring a range of cost weights have been far too limited. For instance, bounded-error frameworks have been utilized to infer a feasible range of cost weights from locomotor behaviors with a bounded uncertainty introduced through the behavior [87]. In lieu of an explicit cost function and exploration of the cost weight space, these approaches solve for the lower and upper bounds of the cost weights and report the polytope connecting these bounds as the solution for the feasible range of cost weights. However, the assumptions of 1) the presence of a linear map from the uncertainty in behavior to the uncertainty in cost space and 2) the inference of a uniform distribution for the cost weights may not be valid.
First, one may find solutions for the cost weights given the reference behaviors that represent the feasibility bounds. However, the fact that all other reference behaviors lie within the polytope of the given behaviors does not imply that the same holds for their corresponding cost weights, and the cost weights for other behaviors may fall outside the inferred bounds. The same argument undermines any uniformity assumption on the map from behavior bounds to cost weight bounds. Further, in bounded-error approaches, the inherent dissection of the cost inference problem into multiple independent optimizations for each cost weight makes these approaches less efficient, and the results less interpretable, for large-scale problems. Thus, unless a powerful stochastic method aids in sampling from the cost-weight space, bounded-error methods are of limited practical use. Stochastic inference methods that have had success in uncovering some of the underlying structure of human motor control, such as Bayesian inference, may offer a solution to this problem.

5.8 CONCLUSION

The exploration of inverse optimal control (IOC) and inverse reinforcement learning (IRL) has advanced our understanding of human locomotor behavior. Although IRL techniques based on Markov Decision Processes are able to model human control policies and infer reward values over the state-control space, inverse optimal control methods based on trajectory optimization remain superior in modeling human locomotor behavior. Cost inference approaches based on trajectory optimization allow for capturing the temporal dependencies and detailed kinematics inherent in human locomotion. Moreover, the incorporation of physical constraints, such as walking speed and other explicit physiological constraints, makes trajectory-based methods particularly suitable for capturing the complexities of real-world motor behaviors. Finally, IOC based on trajectory optimization provides a more interpretable representation of the cost trade-offs underlying human motor behavior, whereas IRL methods based on MDPs do not necessarily have a predefined structure. In IRL, the inference algorithm searches for a reward function that maximizes the likelihood of the observed demonstrations of state-action pairs until a good fit is achieved. Thus, the inferred reward in IRL, unlike in IOC, does not have a predefined shape; it may be complex, cannot easily be expressed through a simple mathematical formulation, and is hence less interpretable for biological motor control problems. In the next chapter, we utilize an inverse optimization approach based on trajectory optimization methods with a pool of physically meaningful costs, and we propose accounting for cost redundancy and incorporating a stochastic inference component in the framework to infer distributions of cost basis weights that, when optimized, would predict locomotor behavior.

Chapter 6

Utility of a Feature-Reduced Bayesian Inverse Optimization to Understand Human Locomotor Control

ABSTRACT

Although energy optimality has been a primary factor explaining several gait features in biological locomotor behaviors, prior empirical and simulation work suggests that a composite cost representing a trade-off among various energetic as well as non-energetic cost terms would better explain the gait patterns and features demonstrated by individuals.
However, attempts in prior work to infer a range of cost weights, rather than a unique cost combination, have been far too limited. In addition, the use of a set of correlated cost components in the composite cost has caused problems with cost inference due to multicollinearity in the cost space. Thus, we developed a Feature-Reduced Bayesian inverse optimization framework to identify a range of plausible costs that would explain the observed locomotor patterns in human walking. We implemented this approach to infer costs from synthetic gait data for which the cost weights were known. Our Bayesian inference approaches provided distributions of different cost weights given behavior. Our Feature-Reduced Bayesian inference recovered the true weights dictating the synthetic locomotor behavior more accurately than the Naïve approach. Our results imply that a parsimonious representation of costs may be sufficient in dictating locomotor control.

Keywords: Inverse optimization, Bi-level cost inference, Bayesian inference, Markov-Chain Monte-Carlo, Cost redundancy, Supervised dimensionality reduction, Human locomotion.

6.1 INTRODUCTION

There are several factors that dictate how we choose to move through our environments. Although energy optimality explains several gait features in biological locomotor behaviors, it cannot fully predict all aspects of control, nor does it generalize to novel contexts and other populations [21, 56]. Significant differences have been observed between self-selected and energy-optimal walk-to-run speeds [57, 59]. Moreover, the step lengths and times adopted by individuals are associated with higher stability in downhill walking [62]. Additionally, individuals have preferred to suboptimally perform extra negative muscle work in order to avoid stiff landings and thereby reduce discomfort in vertical jumps [63]. These pieces of empirical evidence reveal that gait, in general, can be explained by various factors besides energy optimality, and perhaps energy economy is not the only goal of motor control. Further, clinical populations, such as individuals post-stroke, demonstrate a preference for gait patterns more associated with balance [64, 66, 67]. Hemiparetic individuals prefer walking faster than their most balanced speed but slower than their energy-optimal speed, seemingly trading off energy cost and balance [209]. Moreover, when asked, amputees favored walking with compliant prosthetic feet, corresponding to a desired interplay of stability, energy, and comfort [210].

Studies on modeling and simulation of gait have also investigated the degree to which minimization of other costs can predict human locomotor behaviors. For instance, minimization of jerk, compared to prior simulation studies with a pure effort cost, predicted more naturalistic gait patterns with smaller torque peaks [54]. The use of muscle activations raised to the powers of three and ten, representing fatigue-like factors, along with muscle activation squared as effort, predicted realistic amounts of knee flexion during the stance phase, consistent with empirical observations [53]. Body angular momentum is another important factor that, when combined with work rate, predicted arm motions counter-oscillating with respect to the legs, resulting in a more stable gait controller and a more accurate limit cycle [55].
Effort costs related to joint torque and muscle activation [48, 74], performance accuracy or variability [77, 78], capture point error and stability margins [211], acceleration [75], and vertical oscillation of the center of mass position [76] are other important factors considered in simulations of gait.

Altogether, prior empirical and simulation work suggests that a composite cost representing a trade-off among various energetic as well as non-energetic cost terms would better explain the gait patterns and features demonstrated by individuals [21]. Simulation studies have formalized this trade-off as a linear combination of multiple cost terms that capture factors related to energy expenditure, muscle activation, motion smoothness, and stability [212, 213, 53, 214, 215]. Studies of human gait simulation have aimed to infer the trade-off among a set of known costs from human demonstrations of locomotor patterns through a class of methods based on optimal control, namely inverse optimal control. For instance, nested bi-level optimal control (see Section 5.3.2 and Eq. 5.7) has commonly been used along with simplistic or musculoskeletal models to obtain the cost weights in the proposed linear combination that, when used in the forward optimization, would predict locomotor patterns consistent with empirical observations [101, 102, 129]. While nested approaches often use evolutionary algorithms, an alternative formalization of the inverse optimal control problem has been the use of Karush-Kuhn-Tucker (KKT) conditions to infer the cost weights by solving a single-level algebraic problem (see Section 5.3.3 and Eq. 5.9) [87, 216, 217].

There is a likelihood that a range of locomotor costs may exist that, when optimized, would generate the same behavior, leading to potential diversity at the motor cost level. However, despite incorporating uncertainty in the given behavioral examples of locomotion for cost inference [84], the major focus of cost inference studies has been on inferring only a unique cost combination from behavioral observations, and attempts towards inferring a range of cost weights have been far too limited. For instance, bounded-error frameworks have been utilized to infer a feasible range of cost weights from locomotor behaviors with a bounded uncertainty introduced through the behavior [87]. In lieu of an explicit cost function and exploration of the cost weight space, these approaches solve for the lower and upper bounds of the cost weights and report the polytope connecting these bounds as the feasible range of cost weights. However, the assumptions of 1) a linear map from the behavior bounds to the cost weight bounds and 2) a uniform distribution for cost weights that results in a feasible polytope for the set of cost weights are not physiologically plausible. Moreover, the inherent dissection of the cost inference problem into multiple independent optimizations for each cost weight makes these approaches inefficient and less interpretable for large-scale problems. A Bayesian inference approach can help formalize a stochastic multivariate inverse optimization to infer a feasible range of cost weights that, when optimized, would recover a range of observed behavior.
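Written out, one plausible form of this stochastic inverse problem is the posterior over weights (the Gaussian error likelihood and the scale \sigma below are illustrative assumptions rather than the specific likelihood defined later in the Methods):

p(\omega \mid \tilde{x}) \;\propto\; p(\tilde{x} \mid \omega)\, p(\omega),
\qquad
p(\tilde{x} \mid \omega) \;\propto\; \exp\!\left(-\frac{1}{2\sigma^{2}}\,\big\lVert x^{*}(t;\omega) - \tilde{x}(t) \big\rVert^{2}\right),

where x^{*}(t;\omega) is the behavior predicted by the forward optimization under the weights \omega, so that sampling from p(\omega \mid \tilde{x}) yields a distribution of cost weights rather than a single point estimate.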
Bayesian inference is a statistical framework that allows for the estimation of unknown parameters by combining prior beliefs and observed data, providing a principled way to quantify uncertainty [218]. Markov Chain Monte Carlo (MCMC) is a simulation-based method commonly used to perform Bayesian inference in complex models [219]. In the domain of human motor control, Bayesian inference and MCMC have been utilized to gain insights into motor learning, decision-making processes, and sensorimotor control mechanisms. Examples of prior work in the human motor control domain that have incorporated Bayesian inference include Bayesian modeling to investigate motor adaptation [7], to understand the neural control of movement [220], and to infer the underlying structure of motor variability [221]. Additional uses of MCMC have explored the Bayesian integration of visual and proprioceptive information in reaching tasks [222] and the modeling of sequential decision-making tasks in motor learning [223].

Another challenge with inverse optimization is cost inference in the presence of multicollinearity in the cost space. This problem has been framed as cost redundancy and addressed by omitting the costs with the highest degrees of cross-correlation, hence keeping only a limited number of costs in the optimization [87]. Although omitting the correlated cost components can address cost multicollinearity within a task, it is speculated that the resulting cost basis would no longer be rich enough to generate a wide range of behaviors, especially upon interaction with novel contexts and conditions [195]. An alternative to cost reduction is to keep all components but transform them to construct a smaller number of bases, where each basis can be a combination of different cost components. Additionally, in the analysis of cost multicollinearity, correlations among the gradients of costs have been used to determine the presence of multicollinearity [87]. Although the cost gradient points to a form of variance in the costs as the behavior changes, the evaluations of cost gradients can vary heavily from one behavior to another. In addition, by examining the correlation only among cost gradients, the effect of cost variations on the behavior is undervalued. Therefore, a more biologically relevant analysis would investigate the correlations among the values of different costs with respect to behavior variations. Mathematically, this amounts to a supervised dimensionality reduction.

We aimed to determine whether the resolution of multicollinearity in the cost space and the use of Bayesian estimation for cost inference could identify a range of plausible costs that would explain the observed locomotor patterns better than a Naïve Bayesian inference. We began with a range of physically meaningful cost components proposed in the literature and then used empirical behavioral data to inform a supervised dimensionality reduction that resolved the multicollinearity among the cost components. We then used the cost bases obtained from the dimensionality reduction to form our composite cost function in the forward problem as a linear weighted sum of these cost bases.
To verify the feasibility of our framework, we first used this composite cost to generate synthetic gait behaviors at six different walking speeds and then used Markov-Chain Monte-Carlo Bayesian inference to recover the weights that had given rise to the synthetic behaviors. Next, we used a portion of the empirical walking patterns at six different speeds as the reference behavioral data and ran our framework to infer a range of cost bases that, when used in the forward problem, would explain those reference patterns. Lastly, we used the inferred range of composite costs from the behavioral data to predict the locomotor behaviors at two new walking speeds to determine whether an objective understanding of walking in one condition could generalize to novel conditions.

6.2 METHODS

6.2.1 Five-Segment Biped Model

We developed a five-segment biped model for our simulations of walking on a treadmill. The model consists of a lumped torso connected to the legs at the hip joints. Each leg consists of two rigid bodies modeling the femur and the tibia, connected at the knee joint. The joints and the legs' contact points with the treadmill are modeled as frictionless revolute joints actuated by torque motors (Fig. 6.1).

Figure 6.1: Schematics of the five-segment biped model. The Cartesian coordinates of the joints and the centers of mass of the segments are represented in the left panel. The right panel shows the notation used for the joint torques and their locations on the model segments.

We define q = [q_1 \; q_2 \; q_3 \; q_4 \; q_5]^T as the vector of segment angles and take the ankle joint of the stance limb as the origin (P_0 = 0). Traversing the kinematic chain of the biped and using the chain rule, one can obtain the Cartesian coordinates, velocities, and accelerations of the joints (P_i, \dot{P}_i, \ddot{P}_i) and of the segments' centers of mass (G_i, \dot{G}_i, \ddot{G}_i):

\dot{P}_i = \Big(\frac{\partial P_i}{\partial q}\Big)\dot{q}, \qquad \dot{G}_i = \Big(\frac{\partial G_i}{\partial q}\Big)\dot{q},    (6.1a)

\ddot{P}_i = \Big(\frac{\partial \dot{P}_i}{\partial x}\Big)\dot{x}, \qquad \ddot{G}_i = \Big(\frac{\partial \dot{G}_i}{\partial x}\Big)\dot{x},    (6.1b)

where \dot{q} = [\dot{q}_1 \; \dot{q}_2 \; \dot{q}_3 \; \dot{q}_4 \; \dot{q}_5]^T and \ddot{q} = [\ddot{q}_1 \; \ddot{q}_2 \; \ddot{q}_3 \; \ddot{q}_4 \; \ddot{q}_5]^T. Further, x = [q \; \dot{q}]^T and \dot{x} = [\dot{q} \; \ddot{q}]^T represent the vectors of states and state derivatives, respectively.

With access to explicit kinematic terms, one can expand the dynamics of the model. To this end, the controls were defined as a vector of joint torques u = [u_1 \; u_2 \; u_3 \; u_4 \; u_5]^T. As shown in Fig. 6.1, u_1 through u_5 denote the torques at the stance ankle, stance knee, stance hip, swing hip, and swing knee, respectively. The free body diagram of the five-segment biped is shown in Fig. 6.2.

Figure 6.2: Free body diagram of a five-segment biped.

We used the moment balance equations at each joint, starting from P_0, to derive the system of equations governing the motion dynamics:

u_i + \sum_{j=i}^{5}\big[(G_j - P_{i-1})\times(-m_j g\,\hat{j})\big]\cdot\hat{k} = \sum_{j=i}^{5}\big[(G_j - P_{i-1})\times(-m_j \ddot{G}_j) + \ddot{q}_j I_j \hat{k}\big]\cdot\hat{k}, \quad i \in \{1,2,5\},    (6.2a)

u_i + \sum_{j=i}^{5}\big[(G_j - P_2)\times(-m_j g\,\hat{j})\big]\cdot\hat{k} = \sum_{j=i}^{5}\big[(G_j - P_2)\times(-m_j \ddot{G}_j) + \ddot{q}_j I_j \hat{k}\big]\cdot\hat{k}, \quad i \in \{3,4\}.    (6.2b)

The equations of motion in Eq. (6.2) describe the dynamics of our biped model by relating the states, their derivatives, and the joint torques across all segments.
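As a numerical illustration of the kinematic relations in Eq. (6.1), a small sketch for a planar serial chain is given below (Python; the angle convention, the purely serial structure, and the segment ordering are simplifying assumptions relative to the branched five-segment model used here).

import numpy as np

def chain_positions(q, lengths):
    # joint positions P_i of a planar serial chain with the stance ankle as origin P_0 = 0;
    # each segment adds a vector of length L_i oriented by its absolute angle q_i (from vertical)
    P = [np.zeros(2)]
    for qi, li in zip(q, lengths):
        P.append(P[-1] + li * np.array([np.sin(qi), np.cos(qi)]))
    return np.array(P)

def chain_velocities(q, qdot, lengths, eps=1e-7):
    # velocities via the kinematic Jacobian, dP/dt = (dP/dq) qdot, built by finite differences
    q = np.asarray(q, float)
    qdot = np.asarray(qdot, float)
    base = chain_positions(q, lengths).ravel()
    J = np.zeros((base.size, q.size))
    for j in range(q.size):
        dq = np.zeros_like(q)
        dq[j] = eps
        J[:, j] = (chain_positions(q + dq, lengths).ravel() - base) / eps
    return (J @ qdot).reshape(-1, 2)

# example: five segments with unit angular rates and hypothetical segment lengths
q = np.array([0.1, 0.3, 0.0, -0.2, -0.4])
print(chain_velocities(q, np.ones(5), lengths=[0.45, 0.45, 0.5, 0.45, 0.45]))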
6.2.2 Inverse Optimal Control as a Method to Infer Locomotor Costs

Optimal control theory, when applied to biological motor behaviors, states that the brain chooses control policies that optimize a functional objective consisting of one or more criteria [9, 10]. We formalized optimal control for human locomotion as a trajectory optimization problem with the purpose of determining the trajectories of states x(t) and controls u(t) that minimize a scalar objective function:

[x^*, u^*] = \arg\min_{x,u} \int_{t_0}^{t_f} \phi(x(t), u(t), t)\, dt,    (6.3a)

subject to motion dynamics

\dot{x}(t) = f(x(t), u(t), t),    (6.3b)

path constraints

P_{\min} \le P(x(t), u(t), t) \le P_{\max},    (6.3c)

and scalar constraints such as bounds on the end-point reaching error

b_{\min} \le b(x(t), u(t), t) \le b_{\max},    (6.3d)

where the cost function is a Lagrange term introduced as the time integral of a cost integrand \phi(x(t), u(t), t). Equation (6.3b) describes the Newtonian dynamics of the system, introduced in the form of state-space continuity constraints (f) along the trajectory, by relating the state derivatives \dot{x}(t) to the states x(t) and controls u(t) at each point in the time horizon. Moreover, (6.3c) represents the path constraints over the state and control trajectories, and (6.3d) the upper and lower limits on the states and controls. The presence of (6.3b)-(6.3d) ensures that the physical or physiological feasibility of the obtained strategies, depending on the nature of the constraints introduced, is met. For simplicity, the trajectory optimization problem introduced in (6.3) can be reduced to a nonlinear optimization of a Lagrange term with a set of equality and inequality constraints:

[x^*, u^*] = \arg\min_{x,u} \int_{t_0}^{t_f} \phi(x(t), u(t), t)\, dt,
\quad \text{s.t.} \quad h_j(x(t), u(t), t) = 0, \; j = 1, \dots, m_1, \quad g_j(x(t), u(t), t) < 0, \; j = 1, \dots, m_2,    (6.4)

where h_j (j = 1, \dots, m_1) and g_j (j = 1, \dots, m_2) summarize all the dynamics, path constraints, and bounds as equality or inequality constraints, respectively.

While an optimal control problem predicts optimal policies or patterns for a given cost function, an inverse optimization problem aims to infer the plausible cost that, when optimized, would predict a given reference behavior. Classically, the cost function in an inverse optimization problem takes the form of a linear weighted combination of a set of physically meaningful cost components (see Eq. 6.5b). Accordingly, the goal of the inverse optimization mathematically amounts to inferring the set of cost component weights. Here, we used a bilevel framework for inferring the locomotor costs, formalized as follows:

\omega^* = \arg\min_{\omega} \lVert x^*(t; \omega) - \tilde{x}(t) \rVert,    (6.5a)

where \omega denotes the weight of each cost component, \tilde{x}(t) represents the reference behavior, and x^*(t; \omega) is the optimal behavior that results from the solution of the following forward simulation:

[x^*, u^*] = \arg\min_{x,u} \sum_{i=1}^{N_\phi} \omega_i \int_{t_0}^{t_f} \phi_i(x(t), u(t), t)\, dt,
\quad \text{s.t.} \quad h_j(x(t), u(t), t) = 0, \; j = 1, \dots, m_1, \quad g_j(x(t), u(t), t) < 0, \; j = 1, \dots, m_2.    (6.5b)

The inverse optimization problem in (6.5) consists of two levels. The lower level (6.5b) solves the same forward optimal control problem as in (6.4), with its cost now being a composite cost in the form of a linear weighted sum of N_\phi different costs rather than a single cost; m_1 and m_2 denote the numbers of equality and inequality constraints in the forward optimal control problem, respectively. The upper-level optimization (6.5a) modulates the weightings on the costs in the composite cost function of the lower level. It aims to find a set of optimal weights \omega^* such that, when used to scale the cost components of the lower level, the resulting locomotor behavior prediction x^*(t; \omega^*) is as close as possible to the given reference behavior \tilde{x}(t). It is worth noting that, depending on the observability of the biped model and its environment, the demonstrated reference data can involve kinematic (movement), dynamic (generalized force), or neural (muscle activation) aspects of a given behavior; however, we only used the kinematic aspects of the demonstrated states \tilde{x}(t) as the reference behaviors.
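To make the bilevel structure in Eq. (6.5) concrete, a minimal sketch using SciPy is shown below (Python; the helper names are hypothetical, the trajectory is represented by a generic decision vector z in the same parameterization as the reference, and the choice of solvers is an assumption rather than the method used in this work).

import numpy as np
from scipy.optimize import minimize

def forward_optimize(weights, cost_components, z0, dynamics_residual):
    # lower level: find a trajectory (decision vector z) minimizing the weighted sum of costs,
    # with the dynamics expressed as equality constraints h(z) = 0
    composite = lambda z: float(np.dot(weights, [c(z) for c in cost_components]))
    res = minimize(composite, z0, method="SLSQP",
                   constraints=[{"type": "eq", "fun": dynamics_residual}])
    return res.x

def bilevel_ioc(z_ref, cost_components, z0, dynamics_residual, w0):
    # upper level: adjust the cost weights so the forward-optimal trajectory matches z_ref
    def tracking_error(w):
        w = np.abs(w) / (np.abs(w).sum() + 1e-12)   # keep weights non-negative and normalized
        z_opt = forward_optimize(w, cost_components, z0, dynamics_residual)
        return float(np.linalg.norm(z_opt - z_ref))
    return minimize(tracking_error, w0, method="Nelder-Mead").x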
6.2.3 Cost Functions

We selected from the pool of physically meaningful costs proposed in Section 5.4 a group of components as candidates for our inverse optimization analysis (see Table 6.1).

Table 6.1: Candidate cost components used in the inverse optimization analysis.

Kinematics
  Accelerations (rotational and translational): \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} (\ddot{q}^{(i)})^2\,dt and \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} (\ddot{G}^{(i)})^2\,dt; references [166], [176], [87]
  Jerk (rotational and translational): \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} (\dddot{q}^{(i)})^2\,dt and \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} (\dddot{G}^{(i)})^2\,dt; references [54], [99], [77], [87], [78], [128]
Dynamics
  Torque squared and absolute torque: \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \lVert u^{(i)}\rVert_2^2\,dt and \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \lVert u^{(i)}\rVert_1\,dt; references [48], [102], [76], [127], [54], [99]
  Torque rate squared and absolute: \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \lVert \dot{u}^{(i)}\rVert_2^2\,dt and \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \lVert \dot{u}^{(i)}\rVert_1\,dt; references [48], [53], [129], [130]
  Torque second rate squared and absolute: \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \lVert \ddot{u}^{(i)}\rVert_2^2\,dt and \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \lVert \ddot{u}^{(i)}\rVert_1\,dt
  Vertical and horizontal GRF rate: \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{foot}} \lVert \dot{vGRF}^{(i)}\rVert_2^2\,dt and \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{foot}} \lVert \dot{hGRF}^{(i)}\rVert_2^2\,dt; references [133], [53], [185], [158]
Energetics
  Mechanical cost of transport (positive and absolute): \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \big[u^{(i)}\dot{q}^{(i)}\big]_{+}\,dt and \tfrac{1}{\tau}\int_{\tau}^{2\tau}\sum_{i=1}^{N_{DOF}} \lVert u^{(i)}\dot{q}^{(i)}\rVert_1\,dt; references [19], [131], [132], [63], [134], [135]
Stability
  Peak-to-peak angular momentum: \max(AM) - \min(AM), where AM = \sum_{i=1}^{N_{DOF}} [I_i + m_i L_i^2]\,\dot{q}^{(i)}; reference [127]

6.2.4 Multicollinearity in the Cost Space

Since the costs describe the behavioral aspects of motor patterns, it is likely that different costs affect the motion dynamics and behavioral structures in a relatively similar fashion, though at different levels. That is, variations of different costs would have similar effects on the outcome behavior, meaning that the cost components in the cost basis are multicollinear. Therefore, whenever physics-based costs are of interest, one needs to ensure that multicollinearity in the cost space is addressed. Since our aim was to investigate the costs for normal treadmill walking, we used behavioral human locomotion data from a public dataset [224] to detect any underlying multicollinearity in the cost space and, in turn, reduce the cost dimensionality. The dataset contains motion capture data of treadmill walking at eight speeds per individual across 42 participants. We preprocessed the joint angle data for each walking trial through imputation and low-pass filtering at 10 Hz. Then, for all participants, we used event detection to split the joint angle time series at each speed into periods of single-limb stance starting with lift-off and ending at ipsilateral touchdown. Figure 6.3 represents instances of preprocessed gait data for participant 1, shown in limb coordinates.

Figure 6.3: Limb angle trajectories for all step cycles from lift-off to heel-strike for subject 1. The limb angle trajectories from eight different walking speeds are segmented by steps starting with lift-off and ending with touch-down. The represented limb angles are for the tibia and femur across both legs in swing and stance. The gradient grey color bar represents the range of walking speeds, with darker trajectories corresponding to faster gait instances.

To investigate any multicollinearity among the cost components, we used the evaluations of the costs given the segmented limb angle trajectory instances. Using the segmented trajectories and the equations for kinematics (6.1) and inverse dynamics (6.2), we computed kinematic terms down to jerk in both linear and angular space, as well as the dynamics, including torques, ground reaction forces, and their derivatives. Then, for each segmented gait instance, we used the obtained trajectories of kinematics, dynamics, and their derivatives to evaluate the selected cost components shown in Table 6.1. For instance, using the derived kinematics and dynamics, we computed the positive work rate as one of the cost components for each step according to W^{+} = \int \sum_{i=1}^{N_{DOF}} \big[u_i(t)\,\dot{q}_i(t)\big]_{+}\, dt, i.e., the positive part of the joint power summed over the degrees of freedom. Performing the same process across all subjects and walking speeds, we obtained a broad range of values for each cost component. Evaluating the costs for all segmented gait step instances yields a matrix of cost values [C_\phi]_{n \times N_C}, with each row assigned to one step; n denotes the number of steps, and N_C the number of cost components. To perform a preliminary correlation analysis, we computed the pairwise Pearson's correlation coefficients between each pair of cost component evaluations according to the following expression for two given cost columns X and Y:

\mathrm{Corr}(X, Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2 \sum_{i=1}^{n}(Y_i - \bar{Y})^2}},    (6.6)

where \bar{X} and \bar{Y} are the sample means of X and Y, respectively.
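A compact sketch of assembling the cost matrix [C_\phi] and the pairwise correlations of Eq. (6.6) is shown below (Python; cost_fns stands in for implementations of the components in Table 6.1, and step_trajectories for the segmented joint-angle data).

import numpy as np

def cost_value_matrix(step_trajectories, cost_fns):
    # evaluate every candidate cost on every segmented step: an (n_steps x N_C) matrix C_phi
    return np.array([[f(step) for f in cost_fns] for step in step_trajectories])

def pairwise_cost_correlation(C_phi):
    # pairwise Pearson correlations among cost components across steps, as in Eq. (6.6)
    return np.corrcoef(C_phi, rowvar=False)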
One approach to reducing the cost dimensionality is to apply an unsupervised learning algorithm to the cost matrix [C_\phi] with the purpose of obtaining a lower-dimensional cost basis. However, an unsupervised dimensionality reduction would neglect the behavioral aspect of the data, and the result would be independent of the effect of the costs on the behavior. Since how the variation of the cost components influences the behavior is of interest, performing a supervised dimensionality reduction that considers a behavior-related dependent variable is of significance. To this end, we introduced a measure of the amount of angular deviation from reference patterns as the dependent variable. For each set of cost evaluations, the dependent variable was calculated as the Euclidean distance from the limb angles over that step to the ensemble mean patterns at the corresponding walking speed. Since step time varied across steps, using a regular Euclidean distance was not feasible. To address this, we used dynamic time warping to align vectors of different time lengths, which then enabled us to compute the time-warped Euclidean distance between the patterns at each step and the ensemble mean patterns [225].
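For completeness, a textbook dynamic-time-warping distance of the kind used for this dependent variable is sketched below (Python; this is the standard dynamic-programming recursion, not necessarily the exact implementation of [225]).

import numpy as np

def dtw_distance(a, b):
    # dynamic-time-warping distance between two (time x channels) trajectories of unequal length
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(np.asarray(a[i - 1]) - np.asarray(b[j - 1]))
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]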
We then used a robust sparse principal component regression (sPCR) [226, 227] on the cost component evaluation matrix [C_\phi]_{n \times N_C} and the corresponding vector of time-warped distance measures [D]_{n \times 1} to reduce the cost dimensionality. Algorithm 1 provides an overview of the cost dimensionality reduction process.

Algorithm 1: Cost dimensionality reduction via sparse Principal Component Regression
Data: Cost matrix [C_\phi]_{n \times N_C}, time-warped distance vector [D]_{n \times 1}
Result: Cost transformation matrix [T^*]_{N_C \times N_B}
 1  sPCR hyperparameters: number of bases N_Comp and sparsity \lambda
 2  \lambda <- 1e-5                              // initialize sparsity
 3  N_Comp <- 1                                  // initialize number of bases
 4  X <- Z([C_\phi])                             // standardize the cost matrix
 5  Y <- [D]                                     // assign the distance vector as the response variable
 6  VAF_X <- {}                                  // empty set for variance accounted for (VAF) in cost space
 7  Rsq <- {}                                    // empty set for R-squared values
 8  T_set <- {}                                  // empty set for sPCR transformation matrices
 9  while N_Comp != N_C do
10      while \lambda != 0.1 do
11          split X and Y into 10 consistent random folds
12          fold <- 1
13          VAF_tmp <- 0
14          Rsq_tmp <- 0
15          while fold != 10 do
16              X^(Train) <- X[-(fold)]          // split data into training and cross-validation (CV) sets
17              X^(CV) <- X[(fold)]
18              Y^(Train) <- Y[-(fold)]
19              Y^(CV) <- Y[(fold)]
20              [A, B, \beta] <- argmin_{A,B,\beta} { ||Y^(Train) - X^(Train)\beta||^2 + ||X^(Train) - X^(Train) A B^T||^2 + ||B||^2 + \lambda||B||_1 } s.t. B^T B = I_{N_Comp}   // regularized sPCR minimization problem [226, 227]
21              VAF_tmp <- VAF_tmp + Var(X^(CV) B) / Var(X^(CV)) x 100          // VAF for the CV set
22              Rsq_tmp <- Rsq_tmp + (1 - ||Y^(CV) - X^(CV)\beta||^2 / Var(Y^(CV))) x 100   // R-squared for the CV set
23          end
24          VAF^(CV) <- VAF_tmp / 10             // mean CV VAF across the 10 folds
25          Rsq^(CV) <- Rsq_tmp / 10             // mean CV R-squared across the 10 folds
26          VAF_X <- VAF_X U VAF^(CV)            // add the mean CV VAF to the VAF_X set
27          Rsq <- Rsq U Rsq^(CV)                // add the mean CV R-squared to the Rsq set
28          T_set <- T_set U B                   // add the transformation matrix to T_set
29      end
30  end
31  From the set of all transformation matrices T_set, choose the optimal transformation [T^*] that corresponds to VAF_X > 90% and a plateaued Rsq

The goal was to obtain the transformation matrix that maps from the high-dimensional cost space to a lower-dimensional one by applying sparse principal component regression to the cost matrix [C_\phi]_{n \times N_C} and the distance vector [D]_{n \times 1}. Since the orders of magnitude of the different costs varied considerably, it was important to standardize the cost matrix prior to the dimensionality reduction (Line 4); the z-score standardization operator is denoted by Z(\cdot). This avoids any structural bias in the resulting transformation matrix towards costs with larger orders of magnitude and, hence, inherently higher variance. Next, we solved the sparse principal component regression problem on the standardized cost matrix and the time-warped distance vector for a range of target numbers of bases and sparsity values (Lines 9-30). For each combination of the number of bases and sparsity, we solved the joint regularized minimization problem for sparse principal component regression [226, 227] on the training portion of the data as follows:

[A, B, \beta] = \arg\min_{A, B, \beta} \Big\{ \lVert Y^{(Train)} - X^{(Train)}\beta \rVert^2 + \lVert X^{(Train)} - X^{(Train)} A B^T \rVert^2 + \lVert B \rVert^2 + \lambda \lVert B \rVert_1 \Big\} \quad \text{s.t.} \quad B^T B = I_{N_{Comp}},    (6.7)

where A and B are the regular and sparse rotation matrices for cost dimensionality reduction, respectively, and \beta denotes the coefficients of the linear regression. Equation (6.7) jointly minimizes a residual term for the regression and a variance term regularized with sparsity for the dimensionality reduction. We then used the obtained matrices and coefficients for each sparsity and number of bases to calculate and store the mean cross-validated variance accounted for in cost space (Lines 24 and 26) and the mean cross-validated coefficient of determination for the time-warped distance (Lines 25 and 27). We also stored all the resulting transformation matrices for later selection (Line 28).

Then, from the stored results, we selected the optimal transformation matrix [T^*]_{N_C \times N_B} that met our selection criteria: a variance accounted for (VAF) of 90% in the cost space and a plateau in the cross-validated variance explained in the time-warped distance metric. The transformation matrix obtained from the sPCR procedure in Algorithm 1 was then used to inform a feature-reduced composite cost in our forward simulations of walking. Computationally, the obtained transformation [T^*]_{N_C \times N_B} maps from a column space with N_C dimensions to one with N_B dimensions. Thus, applying [T^*] to the Naïve cost inference problem in Eq. (6.5), we formalized our feature-reduced cost inference as follows:

\omega_B^* = \arg\min_{\omega_B} \lVert x^*(t; \omega_B) - \tilde{x}(t) \rVert,    (6.8a)

where \omega_B denotes the vector of weightings on the cost bases, and x^*(t; \omega_B) is the optimal behavior resulting from the solution of the following lower-level forward optimization with a transformed composite cost:

[x^*, u^*] = \arg\min_{x,u} \; {\omega_B^T}_{\,N_B \times 1} \int_{t_0}^{t_f} \Big[\mathcal{Z}\big(\phi(x(t), u(t), t)\big)\Big]_{N_T \times N_C} \big[T^*\big]_{N_C \times N_B}\, dt,    (6.8b)
subject to motion dynamics implemented as path constraints

\dot{x}(t) = f(x(t), u(t), t),    (6.8c)

and other equality, inequality, and scalar boundary constraints, e.g., a limit-cycle constraint:

P_{\min} \le P(x(t), u(t), t) \le P_{\max},    (6.8d)

b_{\min} \le b(x(t), u(t), t) \le b_{\max}.    (6.8e)

We refer to the problem in (6.8) as the feature-reduced inverse optimization, which aims to obtain, for any given reference behavior, the optimal weights \omega_B^* on the cost bases such that minimization of the new composite cost predicts the reference behaviors.

6.2.5 Bayesian Optimization to Infer Distributions of Cost Weights

This section overviews the Bayesian optimization framework used to infer the costs from a set of locomotor patterns given as the reference behavior (Fig. 6.4). The Bayesian optimization approach used here was Metropolis-Hastings Markov Chain Monte Carlo (MCMC) [228, 229]. A Markov Chain Monte Carlo process refers to a random sampling process in which each new sample is drawn from a Gaussian proposal centered at the previous sample with a tuned fixed or adaptive variance, w^{(t+1)} \sim \mathcal{N}(w^{(t)}, \sigma^2). In this work, the generated samples were used, at each iteration, to weigh the cost components that constitute the composite cost in the forward simulations of gait. The error between the behavioral predictions resulting from the forward simulations and the reference patterns was then used to form the log-likelihood. We did not assume any prior knowledge of the structure of the weights; hence, the prior for each weight was a uniform distribution between zero and one, U(0,1). Applying the Metropolis-Hastings criterion to the samples based on their posteriors, we then accepted or rejected each new sample. If the posterior of the new sample was greater than that of the previous sample, the new sample was accepted; otherwise, the new sample was still accepted if the posterior ratio was greater than a random draw from a uniform distribution. The process stops when the target number of iterations is reached. We used this Bayesian optimization framework for both the Naïve and the Feature-reduced cost inference approaches.

Figure 6.4: The Bayesian cost inference framework. The Bayesian estimation uses a Gaussian Markov-chain process to generate new samples of the cost weights. Each new sample is used to weigh the cost components in the forward simulations of gait. The predicted locomotor patterns are compared against a given set of reference behaviors to form the log-likelihood function and, accordingly, the posterior distribution. Using the Metropolis-Hastings criterion, the new sample is then accepted or rejected.

6.2.6 Feature-Reduced Bayesian Inverse Optimization to Infer Costs from Synthetic and Behavioral Data

This section overviews two types of locomotor control behaviors for which we applied and compared the Naïve and our Feature-reduced cost inference. The first case used a set of synthetic gait data as the reference for cost inference, while in the second case, we used human behavioral gait data as the reference patterns from which to infer the costs.
Synthetic Data as Reference Behaviors

As a first step in analyzing the performance of the Naïve and Feature-reduced Bayesian inference approaches, we verified the use of the Feature-reduced approach on a synthetic set of gait data (Fig. 6.5). We generated simulated behavioral data using our five-link biped model walking at a range of treadmill speeds, where the model segments were scaled to represent 10 different sets of anthropometric parameters. The purpose of using 10 different sets of anthropometric parameters was to investigate any potential sensitivity of the methodology to the inertial properties of the model. We also used five different weight sets to scale the cost components, constructing five different composite costs. For all combinations of anthropometric data and cost weight sets, we performed simulations of walking through forward optimization and then used the locomotor patterns predicted by the forward simulations as reference demonstrations for the cost inference problem. We then performed both the Naïve and Feature-reduced inference approaches to test the performance of both frameworks in recovering the weights originally used to generate the synthetic patterns. For any given combination of weight set and anthropometric parameters, we ran each inference approach with three different initial seeds for the Bayesian optimization to obtain three chains of cost weight values. Running several chains in parallel from a wide range of initial starts can provide useful information about how well the chains are mixing, which helped us conduct a convergence analysis on the inference for each combination of anthropometric parameters and weight sets [230]. To compare the performance of the Naïve and Feature-reduced inference approaches, we compared the posterior distributions of the costs obtained by each approach to the true costs. For each set of results from the cost inference, we performed a paired Kolmogorov-Smirnov test between the resulting weight distributions and their corresponding true weights, with the null hypothesis significance level set at \alpha = 0.05.

Figure 6.5: The pipeline for inference of locomotor costs from synthetic data. We assigned to the cost components five different known sets of weights and, for 10 different anthropometric parameters, performed forward simulations of gait using the resulting composite cost at a range of treadmill speeds.
Behavioral Data as Reference Patterns

Next, we aimed to infer locomotor costs from behavioral data of humans walking, using the Naïve and Feature-reduced inference approaches. Our goal was to understand the effect of feature reduction on the robustness of cost weight recovery and on the predictive power of the inferred composite objective in novel conditions. We hypothesized that the Feature-reduced cost inference would infer costs more consistently across different simulations than the Naïve approach. We also hypothesized that the composite cost inferred by the Feature-reduced cost inference applied to human locomotor behavioral data would predict features of locomotion in novel conditions or contexts better than the Naïve inference. To test this, we first compared the generalizability of the costs inferred from the two approaches in accurately predicting gait patterns at new walking speeds. To implement the cost inference approaches on the behavioral gait data, we first scaled our walking model by the anthropometric parameters of each individual. Then, for each individual, we used the gait patterns at five of the eight walking speeds for the cost inference and held out the data for the remaining speed conditions. For any conjectured composite cost at each iteration of either cost inference approach, the forward optimization predicted the gait behavior at the five training walking speeds. A combined measure of prediction error across the different speeds was used to inform the likelihood function in the Bayesian cost inference problem. After inferring chains of costs from each inference for each individual, we used the held-out data for the remaining speed conditions to determine how well the costs inferred by each approach generalized to these unforeseen walking conditions.

6.3 RESULTS

6.3.1 Feature Reduction in Cost Space

The preliminary results from the correlation analysis of the cost components suggest that there are degrees of multicollinearity among the cost components (Fig. 6.6). Performing the sparse principal component regression for different numbers of bases and a range of sparsity values, we obtained the variance accounted for (VAF) in cost space and in behavior space (Fig. 6.7). Considering the selection criteria, we found a combination of sparsity (λ = 0.001) and number of bases (N_B = 5) that achieved VAF_X ≈ 92% and VAF_Y ≈ 54% (the point marked by a red star in Fig. 6.7). From the results of the VAF analysis (Fig. 6.7), we conclude that the sparse principal component regression reduced the sixteen cost component candidates to only five cost bases, where each cost basis is composed of a set of sparse components (Fig. 6.8).
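To illustrate the selection criterion above, the following sketch shows one way to approximate the two VAF measures for a candidate combination of sparsity and number of bases, using a sparse principal component decomposition of the cost evaluations and a linear regression from the basis scores to the behavioral features. The helper name, the use of scikit-learn's SparsePCA, and the reconstruction step are illustrative assumptions rather than the exact procedure of Section 6.2.4.

    import numpy as np
    from sklearn.decomposition import SparsePCA
    from sklearn.linear_model import LinearRegression

    def vaf_for_combination(X, Y, n_bases, alpha):
        # X: (n_observations, 16) standardized cost-component evaluations.
        # Y: (n_observations, n_features) behavioral features for the same observations.
        spca = SparsePCA(n_components=n_bases, alpha=alpha, random_state=0).fit(X)
        scores = spca.transform(X)            # projections of the costs onto the sparse bases
        X_hat = scores @ spca.components_     # approximate reconstruction in cost space
        vaf_x = 1.0 - np.sum((X - X_hat) ** 2) / np.sum(X ** 2)
        # Variance accounted for in behavior space: R^2 of a regression from
        # basis scores to behavioral features.
        vaf_y = LinearRegression().fit(scores, Y).score(scores, Y)
        return vaf_x, vaf_y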
Each of the five cost bases obtained here accounted for approximately 56%, 19%, 10%, 4%, and 3% of the variance, respectively, adding up to a cumulative variance accounted for of VAF_X ≈ 92%. Two observations can be made from these results. First, the dimensionality reduction successfully clustered cost components that carry nearly the same interpretation, such as those related to effort and energy (first basis), those related to translational kinematics such as acceleration and jerk (second basis), and those emphasizing dynamic renditions of jerk (third basis). Second, costs related to effort and energy proved to be the most important factor in explaining the behavior for normal walking, explaining 56% of the variance in cost space while also contributing to explaining the variance in behavior. The substantial dominance of costs related to energetics is in agreement with the literature on normal walking energetics.

Figure 6.6: Cost correlation matrix. The matrix represents the pairwise correlations among the 16 candidate cost components (1. angular acceleration, 2. angular jerk, 3. linear acceleration, 4. linear jerk, 5. torque squared, 6. absolute torque, 7. torque rate squared, 8. absolute torque rate, 9. torque 2nd rate squared, 10. absolute torque 2nd rate, 11. yGRF rate, 12. xGRF rate, 13. positive work rate, 14. absolute work rate, 15. kinetic energy, 16. peak-to-peak angular momentum). Correlations were calculated from cost values grouped by individuals, and the reported values reflect the mean and standard deviation across individuals. Fields with lighter colors correspond to higher degrees of correlation.
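As a sketch of how such a matrix can be assembled, the snippet below computes pairwise correlations among the cost components separately for each individual and then reports the mean and standard deviation across individuals. The use of absolute correlations and the helper name are assumptions made for illustration.

    import numpy as np

    def cost_correlation_matrix(cost_values_by_individual):
        # cost_values_by_individual: list of (n_samples_i, n_components) arrays,
        # one array of cost-component evaluations per individual.
        mats = np.array([np.abs(np.corrcoef(X, rowvar=False))
                         for X in cost_values_by_individual])
        return mats.mean(axis=0), mats.std(axis=0)  # mean and SD across individuals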
Figure 6.7: Variance accounted for (VAF) in cost (X) and behavior (Y) spaces. VAF in cost space (blue curves) and in behavior space (red curves) is shown as a function of the desired number of bases, N, and of a range of loading-matrix sparsity values (plotted against the inverse of the sparsity penalty). Sparser structures are more desirable for the transformation matrix; hence, the goal was to select a point as far left as possible on these curves while still achieving an acceptable amount of variance accounted for. Curves in darker colors correspond to greater numbers of bases in the dimensionality-reduction structure; since a smaller number of bases is more desirable, the goal was to select the lowest number of bases possible while still respecting the VAF criteria.

Figure 6.8: Matrix of cost basis loadings for dimensionality reduction. The transformation (loading) matrix depicts the cost basis constructs, i.e., the eigenvectors of the cost correlation matrix corresponding to the selected combination. Each cost basis can be thought of as a unit vector whose constructing components are the cost components, and the horizontal bars show the contribution (loading) of each component to each basis. The solid grey curve above the loadings shows the cumulative variance accounted for (VAF) as each basis is added.

6.3.2 Cost Inference from Synthetic Gait Examples

Behavior Prediction

We used ten different sets of anthropometric properties and five different weight sets to generate gait patterns at five different speeds (see Fig. 6.5). The joint angle trajectories from the synthetic data were compared against those generated using the costs inferred via the Naïve approach (Fig. 6.9A; for subject 1 and weight set 1) and via the Feature-Reduced approach (Fig. 6.9B; for subject 1 and weight set 1). For simulated subject 1 and weight set 1, the accuracy of the costs inferred via the Feature-Reduced approach in predicting the reference synthetic behaviors was qualitatively better than that of the Naïve approach (Fig. 6.9).
Figure 6.9: Joint angle predictions by the two Bayesian inverse optimization approaches for simulated subject 1 and weight set 1. (A) Joint angle trajectories predicted by the costs inferred via the Naïve Bayesian inverse optimization. (B) Joint angle trajectories predicted by the costs inferred via the Feature-Reduced Bayesian inverse optimization. Each panel shows right and left hip and knee joint angles (rad) over time at treadmill speeds ranging from 0.8 to 1.6, overlaid on the reference behavior.

However, to understand the effect of the cost inference approach on behavior prediction, we performed a thorough comparison between the approaches in explaining the variance in the root mean square error as a measure of behavioral predictive power. Our aim was to compare the two cost inference approaches in predicting the synthetic gait, i.e., the reference joint angle trajectories, for all simulated weight-subject conditions. To achieve this, we performed a two-way analysis of variance (ANOVA) with the two main effects of inference approach and weight set and an interaction term between the inference approach and the weight set variable. The ANOVA did not reveal any significant effect of the inference approach on the prediction error (Table 6.2), suggesting that there was no significant difference between the Naïve and Feature-Reduced approaches in predicting the synthetic patterns. We summarized the root-mean-square error (RMSE) results and grouped them by weight set for visualization purposes. Figure 6.10 shows the RMSE results comparing the behavioral predictive power of the two approaches across different weight sets; there were no significant differences between the Naïve and Feature-Reduced approaches in predicting the reference synthetic gait patterns for any weight set.

Cost Weight Recapturing

After obtaining the cost weight results from the Markov-Chain Monte-Carlo (MCMC) Bayesian inverse optimization, we performed convergence diagnostics on the weight chains using the three differently initialized chains in each simulation. We used the three chains in each subject-weight combination to obtain measures of inter-chain and intra-chain convergence for each cost weight [231] and modified those to obtain rank-normalized convergence measures [230].
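A minimal sketch of such a diagnostic is shown below: the draws are rank-normalized, each chain is split in half, and the classic potential scale reduction factor is computed from the between- and within-chain variances. This follows the general recipe in the rank-normalization literature, but the exact variant used in this work may differ.

    import numpy as np
    from scipy import stats

    def rank_normalized_split_rhat(chains):
        # chains: (n_chains, n_samples) array of MCMC draws for a single cost weight.
        flat = chains.reshape(-1)
        ranks = stats.rankdata(flat).reshape(chains.shape)
        z = stats.norm.ppf((ranks - 0.5) / flat.size)            # rank-normalized draws
        half = chains.shape[1] // 2
        split = np.vstack([z[:, :half], z[:, half:2 * half]])    # 2 * n_chains half-chains
        n = split.shape[1]
        between = n * split.mean(axis=1).var(ddof=1)             # between-chain variance
        within = split.var(axis=1, ddof=1).mean()                # within-chain variance
        var_hat = (n - 1) / n * within + between / n
        return np.sqrt(var_hat / within)                         # values near 1 indicate convergence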
Performing these steps revealed that the rank-normalized reduction factors were always R̂ < 1.01, which satisfies both the standard conventional threshold of 1.1 and the more stringent threshold of 1.01 for convergence; thus, no convergence problems were detected in our MCMC results for the cost weights.

Table 6.2: Summary of ANOVA results for behavioral prediction error (RMSE). RMSE ~ Inference*Weight was the formula used to explain RMSE.

Term               DOF   Mean-Sq   F-stat   P-value
Inference            1     0.000     0.009     0.926
Weight               4     0.001     0.088     0.986
Inference:Weight     4     0.001     0.148     0.964
Residuals           90     0.005

Figure 6.10: Comparison of behavioral prediction error (root mean square error) by the Naïve and Feature-Reduced inference approaches across different weight sets. No comparison reached significance.

The posterior distributions of the chains of cost weights for subject 1, obtained via MCMC for the Naïve and Feature-Reduced Bayesian cost inference approaches, are illustrated by the blue and orange distribution functions, respectively, in Fig. 6.11. By definition, if an inferred cost weight distribution was more inclusive of its corresponding expected true weight (Fig. 6.11; shown as red marks), the performance of the inference was deemed better.

Figure 6.11: Posterior distributions of cost weights for simulated subject 1 inferred by the two Bayesian inverse optimization approaches. The chains of cost weights for subject 1 obtained by the Bayesian inverse optimization were used to calculate posterior distributions of costs, represented as empirical distribution functions for each of the 16 cost components and each of the five weight sets. The results from the Naïve (blue) and Feature-Reduced (orange) approaches are compared to the true cost weights shown as red marks.

To quantify the performance of each inference approach in recapturing the true weights, we used the Kolmogorov-Smirnov (KS) test to calculate the KS statistic, which can be interpreted as the maximum distance from the true weight to the 95% confidence bound of each distribution. Accordingly, a smaller KS-Stat value for a given cost weight distribution implies better recovery of the corresponding true weight. Using the KS-Stats obtained for all the inferred cost weights, we aimed to compare the inference power of the Naïve and Feature-Reduced approaches in recapturing the true weights across different weight set conditions. To achieve this, we performed a two-way analysis of variance (ANOVA) with the two main effects of inference approach and weight set and an interaction between the inference and weight set variables. The dependent variable was the KS-Stat value.
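A short sketch of this model, using statsmodels' formula interface, is given below; the data frame, its column names, and the synthetic stand-in values are assumptions for illustration, and the actual analysis may have used different software.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    # Synthetic stand-in data: one row per simulation with the KS statistic, the
    # inference approach, and the weight set (column names are assumptions).
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "ks_stat": rng.uniform(0.0, 1.0, 100),
        "inference": np.tile(["Naive", "FeatureReduced"], 50),
        "weight_set": np.repeat([1, 2, 3, 4, 5], 20),
    })
    model = smf.ols("ks_stat ~ C(inference) * C(weight_set)", data=df).fit()
    print(anova_lm(model))  # main effects of inference and weight set plus their interaction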
The results of the ANOVA for the KS-Stat revealed that the inference approach had a significant effect on the KS-Stat (Table 6.3; F(1, 90) = 16.89, P < 0.001). A Tukey's test on these ANOVA results indicated that the Feature-Reduced inference approach produced KS-Stat values that were significantly smaller than those of the Naïve approach (mean difference of ΔM = 0.051 in KS-Stat, P = 8.7e-5). Furthermore, the KS-Stat results from the two inference approaches across different costs, subjects, and weight sets were grouped by weight set and illustrated in Fig. 6.12.

Table 6.3: Summary of ANOVA results for cost recapturing accuracy. KS-Stat ~ Inference*Weight was the formula used to explain KS-Stat.

Term               DOF   Mean-Sq   F-stat    P-value
Inference            1     0.066    16.892   8.72e-05 ***
Weight               4     0.008     2.124   0.084 .
Inference:Weight     4     0.007     1.703   0.156
Residuals           90     0.004
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure 6.12: Comparison of cost weight recapturing performance by the Naïve and Feature-Reduced inference approaches across different weight sets. The Kolmogorov-Smirnov statistic was used as the metric to quantify the performance of the inference approaches.

6.4 DISCUSSION

The lack of incorporation of uncertainty in cost inference and the potential multicollinearity in the space of candidate costs motivated this work. Despite efforts in the literature towards inference of a convex hull of cost weights that, when optimized, would predict a bounded range of gait behaviors [87], the strict assumptions on the existence of a linear and uniformly distributed map from bounded behaviors to bounded cost weights make these approaches of limited use. Moreover, the use of cost components that are linearly dependent and of almost the same nature in the cost basis of an inverse optimization problem can lead to issues with weight inference due to cost multicollinearity, which also makes the results less interpretable.

In this Chapter, we developed a Feature-Reduced Bayesian inverse optimization approach to infer a distribution of cost weights from a given behavior. The dimensionality reduction in the approach proposed here was performed using a sparse principal component regression on the cost evaluations based on behavioral gait data. Moreover, our proposed approach also encompassed a Bayesian inference component, wherein, for a given set of reference behaviors, a distribution of plausible weights on the cost bases is obtained. We compared the performance of the proposed approach to a Naïve Bayesian inference approach in predicting a set of synthetic gait patterns that were generated through optimization of a known weighted set of costs. The dimensionality reduction managed to reduce the 16 individual cost components to five cost bases, with costs related to energetics being the most important principal cost basis, explaining almost 57% of the variance in cost space. Considering that the gait data used here are from normal steady walking, and given that costs related to energetics are known to be a typically important factor in dictating normal gait, this observation is reasonable.
Moreover, our cost reduction is already informative of the presence of clusters of costs, so-called cost bases, that point at various physically meaningful factors, also referred to as intrinsic costs. Energy or effort, jerk in task space, and costs related to effort rate or jerk at the dynamic level were, for instance, the dominant constructs of the first three extracted cost bases. Our Bayesian estimation provided a distribution of plausible cost component weights in both the Naïve and Feature-reduced approaches. Although there was no significant difference between the Naïve and Feature-reduced inference approaches in predicting the reference behaviors, our proposed Feature-reduced approach recaptured the true cost weights better. Thus, the Feature-reduced Bayesian cost inference resolved the cost redundancy and yet, with fewer degrees of freedom, was able to recover the distributions of cost weights better than a Naïve approach.

A methodological limitation of the proposed framework is efficiency. Despite its success in obtaining distributions of cost basis weights, the bi-level nature of the framework makes it computationally expensive. For one simulation of Bayesian inference with 2000 iterations from behaviors at five different gait speeds, the CPU run time using five parallel cores of an AMD Epyc 7513 could range from 25 to 35 hours. Although we distributed different cases of such simulations onto a set of high-performance compute nodes as different tasks, the 25-35-hour run time would still make the use of this algorithm in real-time applications almost impossible. However, it remains to be investigated how well the results from a smaller number of iterations would converge and infer the cost weight distributions. Another limitation of the current algorithm is that the likelihood function in the Bayesian inference was an ensemble average of the root mean square errors of the behaviors. Alternative formalizations of the likelihood function that incorporate uncertainties in the represented example behaviors, rather than only an ensemble average across them, would be an open avenue to pursue.

It remains a future direction to utilize our proposed approach for cost inference from behavioral gait data. It is to be investigated how accurately the cost bases inferred by our approach would predict locomotor behaviors in novel tasks, such as new walking speeds, when compared to the costs inferred by other Naïve approaches. Moreover, we will test the accuracy of our inferred composite costs against, e.g., energy optimization alone in predicting feature trends for novel conditions. To achieve this, it is necessary to consider a behavioral task for which it can be ensured that factors other than energetic cost play an important role in dictating locomotor behavior.

6.5 CONCLUSION

Our proposed Feature-reduced Bayesian inverse optimization inferred distributions of plausible cost families that could predict given walking behavior at different gait speeds. When compared to a Naïve Bayesian inference, there was no significant difference between the approaches in terms of behavioral predictive power. However, the Feature-reduced approach was better able to recover the true cost weights that had dictated the synthetic data.
This implies that a Feature-reduced inference approach, despite reducing the cost dimensionality and hence the degrees of freedom in the inference, was still able to recover distributions of cost weights better than an approach operating on the full-rank cost space, such as the 16 costs in the Naïve approach. In sum, the Feature-reduced Bayesian inverse optimization provides a more parsimonious representation of the costs that may give rise to observed behavior.

ACKNOWLEDGEMENTS

The authors acknowledge the Center for Advanced Research Computing (CARC) at the University of Southern California for providing computing resources that have contributed to the research results reported within this publication. URL: https://carc.usc.edu.

Chapter 7

Conclusions and Future Directions

Studies of human motor control have historically been interested in understanding the principles underlying human movement. Despite the attempts made in empirical studies, the need remains for a more systematic and comprehensive evaluation of dominant theories, such as predictive simulation, for gaining a mechanistic understanding of human gait. The main goal of this thesis was to investigate the utility of model-based approaches in explaining the principles underlying human locomotor behavior. To this end, we developed physics-based mechanical models of bipedal walking and incorporated them alongside a variety of computational algorithms from the fields of optimal control and biological learning to infer the principles and motor objectives that drive human gait behavior.

In Chapter 3, the aim was to develop a simulation platform for bipedal walking overground and on a treadmill. The biped consisted of seven segments with nine degrees of freedom free to move in the sagittal plane. To actuate the model joints, we achieved low-dimensional control strategies through the use of dynamical movement primitives, which extracted rhythmic control policies from experimental gait data. To test the feasibility of our simulation platform, we aimed to achieve stable walking overground and on a treadmill using a feedback control law that tuned the spatiotemporal variables of our baseline primitives. The goal of the problem was to walk stably for a desired period of time. The reactive control update law modulated only a few PID gain parameters that mapped undesirable changes in the biped's states, such as torso position and pitch, to changes in spatiotemporal scaling variables as stabilizing responses. The biped was able to achieve a stable gait for the desired period of time without falling, and the simple control update law proved efficient in increasing the walk time compared to a purely proactive control or a less robust feedback control, e.g., one updated with respect to only one state. The platform developed in Chapter 3 was then used to test principles underlying normal and adaptive gait on a split-belt treadmill; Chapter 4 attended to this problem.

In Chapter 4, the main aim was to test the principles and factors that drive human adaptive gait. For instance, empirical observations of human gait have revealed that specific gait features, such as self-selected speed, step lengths, and step times, are adopted by individuals during normal walking to minimize the cost of transport. However, more importantly, how these principles and factors change in a novel context, such as adaptive gait, remains unknown. Here, we modeled split-belt treadmill walking, which has been proposed as a common paradigm for
studying adaptive gait, and sought to uncover the factors that drive human adaptation to the split-belt treadmill. To this end, we modified the simulation platform developed in Chapter 3 to represent only steady-state gait controlled via a feed-forward control law. We developed a hybrid computational algorithm that first obtained a wide range of feasible solutions and then searched through the feasible surrogate fitness landscape via gradient thresholding to obtain a set of feasible good-enough solutions for walking at a range of belt speed ratios. Our hybrid framework found a wide range of solutions that were good enough with respect to the positive work rate. The biped decreased its positive work rate as the belt speed ratio increased, indicating that higher belt speed ratios provide greater assistance. Our approximate local search with gradient thresholding also predicted that the biped decreased its stance time on the fast belt and placed the foot further forward on the faster belt as strategies to decrease the positive work rate performed by the legs. Our findings indicated that optimality and good-enough principles can explain the gait individuals adopt during split-belt walking. However, the discrepancies may indicate that a mix of principles and control law components, as well as consideration of different costs rather than energy optimality alone, may better explain the demonstrated human behavior. Although factors related to energy have been shown to play an important role in shaping gait features, a standalone cost such as energy optimality fails to explain all the gait features observed in different contexts. A compromise among different factors may better explain the features humans adopt under varying gait conditions.

In Chapter 5, we reviewed how the problem of uncovering factors that drive human behavior has been addressed through cost inference approaches in the literature. Different approaches to inverse optimization that have had success in the inference of human motor costs from human behavior were reviewed, and challenges and opportunities in the domain were raised. The two essential methods of 1) inverse optimal control, as inspired by trajectory optimization, and 2) inverse reinforcement learning, based on Markov decision processes, were introduced as the leading approaches to cost inference in the field. In addition, other important aspects of a cost inference problem, such as costs and constraints, as well as uncertainty and optimality of the given example behavior, were discussed. This Chapter provided a background for Chapter 6 and motivated the utility of a stochastic cost inference to obtain a distribution of plausible locomotor costs from human behavior.

In Chapter 6, the aim was to infer the distribution of locomotor costs that describe observed human gait. To this end, we first reduced the dimensionality of a rich pool of costs to obtain the cost bases via behavioral gait data. A Markov-Chain Monte-Carlo algorithm was then used as a Bayesian inverse optimization framework, implemented in both the Naïve and Feature-Reduced cases of cost inference. We performed the Bayesian inverse optimization of costs on both synthetic and behavioral gait data to infer the costs that could explain the gait patterns. Our results from synthetic data revealed that the behaviors generated by the costs inferred via the two approaches were not significantly different. However, the cost weights obtained by the Feature-Reduced
cost inference were more accurate than the Naïve approach in recapturing the true weights that were used in generating the synthetic data.

In sum, the model-based predictive simulation frameworks proposed here enabled us to investigate the principles underlying human locomotor behavior and gain a mechanistic understanding of human body movement in conventional and novel contexts. Chapter 3 tested the feasibility of a primitive-based control used as a feedback control law in stabilizing a mechanical biped model walking on a treadmill. Chapter 4 used the platform developed in Chapter 3 to test hypotheses around optimality principles and their variants, such as feasibility and good-enough control theories, in understanding how individuals adapt to split-belt treadmill walking. Chapter 5 reviewed inverse approaches to a mechanistic understanding of human motor control, which led to the choice of the general inverse optimal control framework used in Chapter 6. Finally, the Bayesian Feature-Reduced cost inference approach proposed in Chapter 6 addressed the curse of dimensionality and inference uncertainty, the two major drawbacks of conventional cost inference methods. In general, we demonstrated that computational models of human locomotor control are useful for understanding the underlying mechanisms that drive human movement. Depending on the task and context, other researchers may utilize any of the frameworks proposed in the current thesis, alone or in combination, to further their mechanistic understanding of biological motor systems.

The methodological frameworks proposed and used here, and the insights found, can have implications for human movement science and robotics in general. The low-dimensional primitive-based reactive control framework proposed in Chapter 3 can be scaled and applied as a simple baseline stability control law for powered exoskeletons as well as bio-inspired robots. The model-based predictive simulation developed in Chapter 4 enabled us to predict trends in spatiotemporal features during split-belt gait. After validation and scaling, the physics-based model, alongside the inferred underlying principle, can be used for motor skill training or enhancement in rehabilitation therapy or sports biomechanics to test the feasibility and effectiveness of a proposed motor strategy through a simulation framework prior to imposing it on individuals. This would help, at least in part, to predict how individuals will respond to a motor practice intervention before actually implementing it in the real world. As such, the safe and controlled platform provided by such simulation studies can also test a wide variety of events and strategies that cannot otherwise be directly observed and experienced in experiments. If established and validated, such a model-based approach can alleviate the need for extensive human experiments. As a result, a naïve experimental paradigm can be replaced by one that is also informed by prior mechanistic inferences made through model-based simulations of motor control. In addition, the Bayesian Feature-Reduced cost inference in Chapter 6 has the potential to enhance rehabilitation outcomes by personalizing therapy, evaluating and tracking progress, and guiding the design of bio-inspired assistive devices.
Inverse optimization can extract the reward or cost functions that a patient was probably optimizing before an impairment by looking at, e.g., patterns demonstrated on the unaffected limbs in the hemiparetic population. Accordingly, rehabilitation interventions can be tailored to facilitate the recovery of impaired motor skills by promoting the inferred motor costs as the goal in relevant motor skill training paradigms. The cost inference approach and insights delivered here can also inform clinical assessments of quality or improvement in motor skills, e.g., at the motor examination level of the Unified Parkinson's Disease Rating Scale (UPDRS). Our insights into feature-reduced cost bases can replace high-dimensional, often kinematic, motor assessment metrics by providing a parsimonious family of objective measures that also capture the dynamics and kinetics of movement. These metrics could be used to gauge an individual's motor skills at different points in time, hence their progress, or to compare motor skills between different populations objectively. Additionally, the distributions of parsimonious families of costs inferred via our approach can be transferred as a more compact, generalizable representation of control policies to learning in bio-inspired robotics, hence generating a broader range of biologically feasible behaviors.

Looking ahead, we see the following potential directions for future work:

• Implementation of good-enough control is somewhat subjective; although gradient thresholding is a very reasonable criterion, there are still avenues for further work in exploring alternative approaches that potentially combine feasibility, modular, and habitual control.

• An alternative to gradient descent in the good-enough framework is to use stochastic gradient descent as a more biologically reasonable approach. Using stochastic gradient descent, one can mimic local observability and memory limits that obscure on-the-go gradient evaluation. Accordingly, as the exploration process evolves, only a subset of the feasible sample solutions in the vicinity of the current state would be used to fit the cost surrogate, and this local cost surrogate would then be used to evaluate the gradient.

• Alternative metrics, related or unrelated to energy expenditure, can be utilized in our predictive simulations of split-belt gait. The literature, mainly from simulation studies, suggests that different cost functions can explain different aspects of gait features. We hope to reimplement our hybrid evolutionary-gradient-based search using a range of alternative costs, such as negative work, absolute work, a positive-negative mix of mechanical work, as well as costs incorporating joint moment norms.

• Insights from the predictions made by a range of different locomotor costs can provide an avenue for selecting the costs with empirically plausible predictions and using them as a set of cost components informing an inverse optimal control problem. Accordingly, fine-tuning the weightings on each cost component would potentially infer the true cost combinations that more accurately capture human behavior.

• We can use a simpler model of walking proposed in the literature, such as the one by Seethapathi et al. [104], to implement our action selection algorithm in the simulation of split-belt walking across a range of belt speed ratios.
By implementing the same search algorithm on a simpler model, the gaps between prior simulation studies and the current work would be bridged, since the effects of model choice and action selection algorithm could then be disambiguated.

• In an extension to the work in Chapter 6, we can perform the cost dimensionality reduction and obtain the cost basis constructs for each individual separately to account for the effects of between-individual variability. The resulting cost basis constructs would then be used to implement a subject-specific cost inference given the behavior of each individual.

• Complementary to the work in this dissertation, the methodology proposed in Chapters 4 and 6 could be brought together to form an Inverse Good-enough Control framework.

7.1 INVERSE GOOD-ENOUGH CONTROL: A VISIONARY OUTLOOK

As mentioned in the last potential avenue to complement the work proposed in this dissertation, the methodology developed in Chapters 4 and 6 could be utilized together to formalize an Inverse Good-enough Control (IGC) framework. An IGC framework relaxes the strict optimality assumption that IOC imposes on the given example behavior for cost inference and instead attempts to recover the cost function for which the given behavior is good enough. As such, the lower-level problem in cost inference via IGC attends to solving the forward simulations of gait through a gradient-based search along a surrogate cost landscape, finding a manifold of good-enough solutions. The goal of the upper-level problem in IGC then becomes to find the cost weights that maximize the likelihood of the given example pattern(s) relative to the good-enough predictions from the lower-level problem, i.e., to implicitly maximize the probability of the observed data given membership in the good-enough solution family.

When developing a cost inference framework based on Inverse Good-enough Control (IGC), one must recognize some of the challenges involved. First, incorporating a good-enough control algorithm into the lower-level problem incurs a higher computational cost. This is because good-enough control aims to identify a set of behaviors meeting a gradient threshold criterion rather than a unique solution or an uncertainty-bounded region. As a result, replacing optimization with good-enough control in the lower-level problem significantly increases the runtime of the algorithm. To mitigate this, a concrete implementation of the problem that explores the cost landscape for different good-enough solutions in parallel would be necessary. In addition, the considerable computational expense may force the choice of simpler walking models, such as a compass walker (see Appendix A.1), to reduce the run-time. Furthermore, the use of a good-enough control component in the lower-level problem may encounter challenges related to multicollinearity among the various cost gradients. Even after decorrelating the costs via dimensionality reduction approaches such as that proposed in Section 6.2.4, several costs may still exhibit covarying gradients in the vicinity of specific solutions. This poses a challenge for cost inference with IGC, as it becomes difficult to disentangle the effects of costs with similar gradients on the predicted behavior.
Despite these inherent challenges, the utility of Inverse Good-enough Control (IGC) for cost inference presents a promising avenue for future research and development. 174 References [1] F . J. Valero-Cuevas, H. Hoffmann, M. U. Kurse, J. J. Kutch, and E. A. Theodorou. Computational models for neuromuscular function. IEEE Reviews in Biomedical Engineering, 2:110–135, 2009. [2] Emanuel Todorov. Optimality principles in sensorimotor control. Nature neuroscience, 7(9):907–915, 2004. [3] Robert F Stengel. Optimal control and estimation. Courier Corporation, 1994. [4] Dimitri P Bertsekas. Dynamic programming and optimal control 3rd edition, volume ii. Belmont, MA: Athena Scientific , 2011. [5] Mitsuo Kawato. Internal models for motor control and trajectory planning. Current opinion in neurobiology, 9(6):718–727, 1999. [6] Aymar De Rugy, Gerald E Loeb, and Timothy J Carroll. Muscle coordination is habitual rather than optimal. Journal of Neuroscience, 32(21):7384–7391, 2012. [7] Konrad K¨ ording. Decision theory: what” should” the nervous system do? Science, 318(5850):606–610, 2007. [8] Emanuel Todorov and Michael I Jordan. Optimal feedback control as a theory of motor coordination. Nature neuroscience, 5(11):1226–1235, 2002. [9] Donald E Kirk. Optimal control theory: an introduction. Courier Corporation, 2004. [10] Emanuel Todorov. Optimal control theory. Bayesian brain: probabilistic approaches to neural coding, pages 268–298, 2006. [11] R McNeill Alexander. Principles of animal locomotion. Princeton University Press, 2003. [12] R McN Alexander. The gaits of bipedal and quadrupedal animals. The International Journal of Robotics Research, 3(2):49–59, 1984. [13] R McNeill Alexander. Optima for animals. Princeton University Press, 2021. 175 References [14] Yves Nubar and Renato Contini. A minimal principle in biomechanics. The bulletin of mathematical biophysics, 23(4):377–391, 1961. [15] WA Sparrow and KM Newell. Metabolic energy expenditure and the regulation of movement economy. Psychonomic Bulletin & Review, 5(2):173–196, 1998. [16] Manoj Srinivasan and Andy Ruina. Computer optimization of a minimal biped model discovers walking and running. Nature, 439(7072):72–75, 2006. [17] MY Zarrugh and CW Radcliffe. Predicting metabolic cost of level walking. European Journal of Applied Physiology and Occupational Physiology, 38(3):215–223, 1978. [18] Brian R Umberger and Philip E Martin. Mechanical power and efficiency of level walking with different stride rates. Journal of Experimental Biology, 210(18):3255–3265, 2007. [19] J Maxwell Donelan, Rodger Kram, and Arthur D Kuo. Mechanical work for step-to-step transitions is a major determinant of the metabolic cost of human walking. Journal of Experimental Biology, 205(23):3717–3727, 2002. [20] Keith E Gordon, Daniel P Ferris, and Arthur D Kuo. Metabolic and mechanical energy costs of reducing vertical center of mass movement during gait. Archives of physical medicine and rehabilitation, 90(1):136–144, 2009. [21] John EA Bertram and Andy Ruina. Multiple walking speed–frequency relations are predicted by constrained optimization. Journal of theoretical Biology, 209(4):445–453, 2001. [22] RM Alexander. Optimization and gaits in the locomotion of vertebrates. Physiological reviews, 69(4):1199–1227, 1989. [23] AE Minetti and R McN Alexander. A theory of metabolic costs for bipedal gaits. Journal of Theoretical Biology, 186(4):467–476, 1997. [24] MY Zarrugh, FN Todd, and HJ Ralston. Optimization of energy expenditure during level walking. 
European journal of applied physiology and occupational physiology, 33(4):293–306, 1974. [25] Henry J Ralston. Energy-speed relation and optimal speed during level walking. Internationale Zeitschrift f¨ ur Angewandte Physiologie Einschliesslich Arbeitsphysiologie, 17(4):277–283, 1958. [26] Richard G Ellis, Kevin C Howard, and Rodger Kram. The metabolic and mechanical costs of step time asymmetry in walking. Proceedings of the Royal Society B: Biological Sciences, 280(1756):20122784, 2013. 176 References [27] Daniel P Ferris, Gregory S Sawicki, and Monica A Daley. A physiologist’s perspective on robotic exoskeletons for human locomotion. International Journal of Humanoid Robotics, 4(03):507–528, 2007. [28] Keith E Gordon and Daniel P Ferris. Learning to walk with a robotic ankle exoskeleton. Journal of biomechanics, 40(12):2636–2644, 2007. [29] Gregory S Sawicki and Daniel P Ferris. Mechanics and energetics of level walking with powered ankle exoskeletons. Journal of Experimental Biology, 211(9):1402–1413, 2008. [30] Jessica C Selinger, Shawn M O’Connor, Jeremy D Wong, and J Maxwell Donelan. Humans can continuously optimize energetic cost during walking. Current Biology, 25(18):2452–2456, 2015. [31] James M Finley, Amy J Bastian, and Jinger S Gottschall. Learning to be economical: the energy cost of walking tracks motor adaptation. The Journal of physiology, 591(4):1081–1095, 2013. [32] Natalia S´ anchez, Sungwoo Park, and James M Finley. Evidence of energetic optimization during adaptation differs for metabolic, mechanical, and perceptual estimates of energetic cost. Scientific Reports , 7(1):1–14, 2017. [33] Natalia S´ anchez, Surabhi N Simha, J Maxwell Donelan, and James M Finley. Taking advantage of external mechanical work to reduce metabolic cost: the mechanics and energetics of split-belt treadmill walking. The Journal of physiology, 597(15):4053–4068, 2019. [34] Jan Stenum and Julia T Choi. Step time asymmetry but not step length asymmetry is adapted to optimize energy cost of split-belt treadmill walking. The Journal of Physiology, 598(18):4063–4078, 2020. [35] Brian E Maki. Gait changes in older adults: predictors of falls or indicators of fear? Journal of the American geriatrics society, 45(3):313–320, 1997. [36] Hylton B Menz, Stephen R Lord, and Richard C Fitzpatrick. Acceleration patterns of the head and pelvis when walking on level and irregular surfaces. Gait & posture, 18(1):35–46, 2003. [37] Mark D Latt, Hylton B Menz, Victor S Fung, and Stephen R Lord. Walking speed, cadence and step length are selected to optimize the stability of head and pelvis accelerations. Experimental brain research, 184(2):201–209, 2008. [38] Gerald E Loeb. Optimal isn’t good enough. Biological cybernetics, 106(11):757–765, 2012. 177 References [39] NA Bernstein. The coordination and regulation of movements. oxford, uk, pergamon. 1967. [40] Auke Jan Ijspeert. Central pattern generators for locomotion control in animals and robots: a review. Neural networks, 21(4):642–653, 2008. [41] Auke Jan Ijspeert, Jun Nakanishi, Heiko Hoffmann, Peter Pastor, and Stefan Schaal. Dynamical movement primitives: learning attractor models for motor behaviors. Neural computation, 25(2):328–373, 2013. [42] Jun Nakanishi, Jun Morimoto, Gen Endo, Gordon Cheng, Stefan Schaal, and Mitsuo Kawato. Learning from demonstration and adaptation of biped locomotion. Robotics and autonomous systems, 47(2-3):79–91, 2004. [43] CK Chow and DH Jacobson. Studies of human locomotion via optimal programming. 
Mathematical Biosciences, 10(3-4):239–306, 1971. [44] Miles A Townsend and Ali A Seireg. Effect of model complexity and gait criteria on the synthesis of bipedal locomotion. IEEE Transactions on Biomedical Engineering, (6):433–444, 1973. [45] Miomir Vukobratovic, Andrew A Frank, and Davor Juricic. On the stability of biped locomotion. IEEE Transactions on Biomedical Engineering, (1):25–36, 1970. [46] CK Chow and DH Jacobson. Further studies of human locomotion: postural stability and control. Mathematical Biosciences, 15(1-2):93–108, 1972. [47] Arvikar Seireg and RJ Arvikar. The prediction of muscular load sharing and joint forces in the lower extremities during walking. Journal of biomechanics, 8(2):89–102, 1975. [48] Antonio Pedotti, VV Krishnan, and L Stark. Optimization of muscle-force sequencing in human locomotion. Mathematical Biosciences, 38(1-2):57–76, 1978. [49] Arthur D Kuo. A simple model of bipedal walking predicts the preferred speed–step length relationship. J. Biomech. Eng., 123(3):264–269, 2001. [50] Frank C Anderson and Marcus G Pandy. Dynamic optimization of human walking. J. Biomech. Eng., 123(5):381–390, 2001. [51] Manoj Srinivasan. Optimal speeds for walking and running, and walking on a moving walkway. Chaos: An Interdisciplinary Journal of Nonlinear Science, 19(2):026112, 2009. 178 References [52] Manoj Srinivasan. Fifteen observations on the structure of energy-minimizing gaits in many simple biped models. Journal of The Royal Society Interface, 8(54):74–98, 2011. [53] Marko Ackermann and Antonie J Van den Bogert. Optimality principles for model-based prediction of human gait. Journal of biomechanics, 43(6):1055–1060, 2010. [54] Amira Aloulou and Olfa Boubaker. Minimum jerk-based control for a three dimensional bipedal robot. In Sabina Jeschke, Honghai Liu, and Daniel Schilberg, editors, Intelligent Robotics and Applications, pages 251–262, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg. [55] Jack M Wang, David J Fleet, and Aaron Hertzmann. Optimizing walking controllers. In ACM SIGGRAPH Asia 2009 papers, pages 1–8. 2009. [56] John EA Bertram. Constrained optimization in human walking: cost minimization and gait plasticity. Journal of experimental biology, 208(6):979–991, 2005. [57] ALAN Hreljac. Preferred and energetically optimal gait transition speeds in human locomotion. Medicine and Science in Sports and Exercise, 25(10):1158–1162, 1993. [58] Rodolfo Margaria. Biomechanics and energetics of muscular exercise. Oxford University Press, 1976. [59] Alf Thorstensson and H Roberthson. Adaptations to changing speed in human locomotion: speed of transition between walking and running. Acta Physiologica Scandinavica, 131(2):211–214, 1987. [60] Steven J Wickler, Donald F Hoyt, Edward A Cogger, and Gregory Myers. The energetics of the trot–gallop transition. Journal of Experimental Biology, 206(9):1557–1564, 2003. [61] Claire T Farley and C Richard Taylor. A mechanical trigger for the trot-gallop transition in horses. Science, 253(5017):306–308, 1991. [62] LC Hunter, EC Hendrix, and JC Dean. The cost of walking downhill: is the preferred gait energetically optimal? Journal of biomechanics, 43(10):1910–1915, 2010. [63] Karl E Zelik and Arthur D Kuo. Mechanical work as an indirect measure of subjective costs influencing human movement. PloS one, 7(2):e31143, 2012. 179 References [64] Brenda Brouwer, Krishnaji Parvataneni, and Sandra J Olney. A comparison of gait biomechanics and metabolic requirements of overground and treadmill walking in people with stroke. 
Appendices

APPENDIX 1: TRAJECTORY OPTIMIZATION FOR GAIT SIMULATIONS OF A COMPASS WALKER USING DIRECT COLLOCATION

In the implementation of inverse optimization for cost inference, we initially started with a simple compass walker model that has two leg segments and is actuated by an internal torque placed at the hip. We also performed our inverse optimization using a bi-level approach with a gradient-free evolutionary algorithm as the upper-level optimization algorithm.

Model of Bipedal Compass Walker

We developed a compass walking model for our simulations of overground walking (Fig. A.1A). This model consisted of two lumped legs modeled as rigid bodies connected to each other at the hip joint. The hip joint and the legs' contact points with the treadmill were modeled as frictionless revolute joints. The relative separation between the two leg segments at the hip joint was actuated by an internal torque of value u.

Figure A.1: Schematics of the compass walker model. A) The Cartesian coordinates of the joints and the centers of mass of the segments, the limb angles, and the internal torque at the hip joint. B) The free body diagram used to derive the kinematics and dynamics of the model.

We define q = [q_1, q_2]^T as the vector of limb angles and take the ankle joint of the stance limb as the origin (P_0 = 0). Traversing the kinematic chain of the biped and applying the chain rule, one can obtain the Cartesian coordinates, velocities, and accelerations of the joints (P_i, \dot{P}_i, \ddot{P}_i) and of the segments' centers of mass (G_i, \dot{G}_i, \ddot{G}_i):

    P_1 = [ -l_1 \sin(q_1), \; l_1 \cos(q_1) ]^T,
    P_2 = [ -l_1 \sin(q_1) + l_2 \sin(q_2), \; l_1 \cos(q_1) - l_2 \cos(q_2) ]^T,        (A.1a)

    \dot{P}_i = (\partial P_i / \partial q)\, \dot{q}, \qquad \dot{G}_i = (\partial G_i / \partial q)\, \dot{q},        (A.1b)

    \ddot{P}_i = (\partial \dot{P}_i / \partial x)\, \dot{x}, \qquad \ddot{G}_i = (\partial \dot{G}_i / \partial x)\, \dot{x},        (A.1c)

where \dot{q} = [\dot{q}_1, \dot{q}_2]^T and \ddot{q} = [\ddot{q}_1, \ddot{q}_2]^T. Further, x = [q, \dot{q}]^T and \dot{x} = [\dot{q}, \ddot{q}]^T denote the vectors of states and state derivatives, respectively.

With explicit kinematic terms in hand, one can derive the dynamics of the model. The control was defined as the internal torque at the hip joint, denoted by u. Using the free body diagram of the compass walker (Fig. A.1B) and writing moment balance equations about the leading contact point P_0 and the hip joint P_1, we derived the dynamics of the model:

    \sum M_{P_1} = 0 \;\Longrightarrow\; u + [(G_2 - P_1) \times (-m_2 g \hat{j})] \cdot \hat{k} = [(G_2 - P_1) \times (-m_2 \ddot{G}_2) + \ddot{q}_2 I_2 \hat{k}] \cdot \hat{k},        (A.2a)

    \sum M_{P_0} = 0 \;\Longrightarrow\; \sum_{j=1}^{2} [(G_j - P_0) \times (-m_j g \hat{j})] \cdot \hat{k} = \sum_{j=1}^{2} [(G_j - P_0) \times (-m_j \ddot{G}_j) + \ddot{q}_j I_j \hat{k}] \cdot \hat{k}.        (A.2b)

The equations of motion in Eq. (A.2) describe the dynamics of our biped model by establishing the necessary relations between the states, their derivatives, and the joint torques across all segments.
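To make the kinematic relations in Eq. (A.1) concrete, a minimal numerical sketch is given below. This is an illustrative Python/NumPy fragment written for this appendix, not the simulation code used in the dissertation; the function name and the default segment lengths l1 and l2 are placeholders.

import numpy as np

def compass_kinematics(q, qdot, l1=1.0, l2=1.0):
    """Positions and velocities of the compass-walker joints (Eq. A.1).

    q    : [q1, q2], limb angles in radians (stance ankle P0 at the origin)
    qdot : [q1_dot, q2_dot], limb angular velocities in rad/s
    l1, l2 : stance- and swing-leg lengths (placeholder values)
    """
    q1, q2 = q
    # Eq. (A.1a): Cartesian coordinates of the hip (P1) and the swing foot (P2)
    P1 = np.array([-l1 * np.sin(q1), l1 * np.cos(q1)])
    P2 = np.array([-l1 * np.sin(q1) + l2 * np.sin(q2),
                   l1 * np.cos(q1) - l2 * np.cos(q2)])
    # Eq. (A.1b): velocities via the Jacobians dP_i/dq evaluated at q
    J1 = np.array([[-l1 * np.cos(q1), 0.0],
                   [-l1 * np.sin(q1), 0.0]])
    J2 = np.array([[-l1 * np.cos(q1), l2 * np.cos(q2)],
                   [-l1 * np.sin(q1), l2 * np.sin(q2)]])
    return P1, P2, J1 @ np.asarray(qdot), J2 @ np.asarray(qdot)

Accelerations follow in the same way from Eq. (A.1c) by differentiating the Jacobian-velocity products with respect to the full state x.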
Gait Optimization in Compass Walker

The method used here to simulate bipedal gait was trajectory optimization solved via direct collocation (Fig. A.2). The variables in the optimization problem were the limb states, i.e., limb angles and angular velocities, as well as the control, i.e., the internal torque at the hip.

Figure A.2: Overview of gait optimization using direct collocation. A) Prior to optimization, the states and control are discretized onto an equally spaced grid. B) The decision variables for the compass walker used in direct collocation. C) The constrained optimization problem for gait in the compass walker (5n decision variables, solved with an interior-point method).

The states and controls were first discretized (Fig. A.2A-B). Then, the constrained optimization problem in Fig. A.2C was solved numerically to obtain the optimal solutions for the discretized states and controls. The collocation method used in our forward simulations of gait was Hermite-Simpson, in which the dynamics of the system, generally written as \dot{x}(t) = f(t, x, u), are enforced numerically through optimization constraints of the form

    \dot{x} = f \;\Longrightarrow\; \int_{t_k}^{t_{k+1}} \dot{x}\, dt = \int_{t_k}^{t_{k+1}} f\, dt,        (A.3)

which leads to the following collocation equations:

    x_{k+1} - x_k = \frac{h_k}{6} \left( f_k + 4 f_{k+1/2} + f_{k+1} \right),        (A.4a)

    x_{k+1/2} = \frac{1}{2} (x_k + x_{k+1}) + \frac{h_k}{8} \left( f_k - f_{k+1} \right).        (A.4b)

APPENDIX 2: A NAÏVE EVOLUTIONARY BI-LEVEL INVERSE OPTIMIZATION

We present preliminary results from performing naïve IOC on simulated data using a simple compass walker. A compass walker is a plausible model of bipedal walking and, owing to its simplicity, substantially reduces the computational cost, allowing faster and more efficient testing of different hypotheses through simulation. We used the IOC framework represented in Fig. A.3 to investigate the validity of naïve IOC through its capability to recover the true weights of the cost features in two cases with synthetic behaviors.

Synthetic Gait Data Used in a Naïve Bi-level Inverse Optimization

Prior to applying our developed IOC method to experimental behavioral data, we generated synthetic data with our model. To generate the first set of synthetic behaviors, we ran forward simulations of walking, each time optimizing the objective function with a unitary weight on one individual cost feature and zero weight on all others. Each of these simulations produces motor patterns on which one can perform IOC to infer the underlying objective function. We would expect IOC on each of these cases to recover the unitary weighting on the true cost feature, with all other weights close to zero. To generate the second set of synthetic behaviors, we performed forward simulations of gait with arbitrary weights on multiple, rather than individual, cost features. As in the first case, we expect an IOC framework, given the generated synthetic behaviors, to recover the intended weighted objective structure.

Here, we aim to investigate the functionality of a feature-reduced IOC applied to simulated gait data from the biped model. We hypothesize that, when running IOC multiple times with randomized initial guesses on these simulated data, the feature-reduced IOC will recover the true weights on the intended cost features more accurately and with higher confidence than a naïve IOC. To investigate this, we obtained estimates of the cost feature weights by performing IOC on each set of synthetic data using both a naïve method (without feature reduction) and our feature-reduced IOC.

Figure A.3: Overview of an IOC problem implemented in a nested bi-level paradigm.

For each case with synthetic data, since there is only one set of reference trajectories for each set of arbitrary weights, the metric of dissimilarity (shown as the loss in (6.5a) and (6.8a)) between the forward optimization predictions x^* and the reference behavior \tilde{x} remains the Euclidean distance between the two trajectories, i.e., Loss = \| x^* - \tilde{x} \|. We compared the weights (\omega^* and \omega_B^*) and their confidence intervals, obtained from performing IOC multiple times, across the naïve and feature-reduced methods.
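As a concrete illustration of this nested bi-level procedure, a simplified Python sketch is shown below. It is written for this appendix only: forward_gait_optimization stands in for the lower-level direct-collocation problem described above, the feature names are placeholders for the candidate cost features, and the random-restart Nelder-Mead search is a simplified stand-in for the gradient-free evolutionary optimizer we actually used at the upper level.

import numpy as np
from scipy.optimize import minimize

CANDIDATE_FEATURES = ["torque_sq", "torque_rate_sq", "abs_work", "accel_sq"]

def naive_bilevel_ioc(x_ref, forward_gait_optimization, n_restarts=10, seed=0):
    """Naive bi-level IOC: find cost-feature weights whose forward-optimal gait matches x_ref.

    x_ref : reference kinematics (joint angles and angular velocities) used as the demonstration
    forward_gait_optimization : placeholder lower-level solver; maps a weight vector to the
        predicted optimal kinematics x_star for the same task
    """
    rng = np.random.default_rng(seed)

    def upper_level_loss(w):
        w = np.abs(w)                                  # keep weights non-negative
        x_star = forward_gait_optimization(w)          # lower-level trajectory optimization
        return np.linalg.norm(x_star - x_ref)          # Loss = ||x* - x_ref|| (Euclidean)

    best = None
    for _ in range(n_restarts):                        # randomized initial guesses on the weights
        w0 = rng.uniform(0.0, 1.0, size=len(CANDIDATE_FEATURES))
        result = minimize(upper_level_loss, w0, method="Nelder-Mead")
        if best is None or result.fun < best.fun:
            best = result
    return np.abs(best.x), best.fun

Each evaluation of the upper-level loss requires a full lower-level gait optimization, which is what makes the naive bi-level approach computationally expensive and motivates the simple compass walker model.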
Case 1. Unitary weights on individual cost features

In this case, we assumed the composite cost function in IOC was composed of four cost features: the integral of torque squared, the integral of torque rate squared, the absolute work, and the integral of accelerations squared. We then performed four simulations of walking, optimizing each of the individual cost features separately, to generate synthetic behaviors. The results of each optimization consisted of the joint angle profiles, joint angular velocity profiles, and the internal torque pattern between the two segments. We then used the kinematics from these results, i.e., joint angles and velocities, as the reference for IOC in each condition. Ideally, performing IOC on the reference behaviors from each of the four individual optimization cases would recover a unitary weight on the truly optimized feature and zero weights for the rest.

After implementing these steps and performing IOC, the kinematics were reproduced with a sum of squared error of less than 0.1 (summed over angle and velocity reproduction errors). Of 10 different IOC runs with different random guesses on the weights, the estimates with the best kinematic recovery were selected (Fig. A.4). This figure shows the weights on the cost features recovered by IOC (left panel) and the weighted contribution of each cost feature to the composite cost (right panel). The number on the horizontal axes indicates which cost feature had been truly optimized: numbers 1 through 4 represent the torque squared integral, torque rate squared integral, absolute work, and acceleration squared integral, respectively. An ideal IOC procedure would recover unitary weights on the features corresponding to each condition; further, in each condition, the cost feature that had been truly optimized should contribute the most to the valuation of the composite cost.

Figure A.4: The weights identified by IOC and the weighted contributions of the cost features to the composite cost for case 1. Each number on the horizontal axis identifies the condition in which only the cost feature corresponding to that number had been optimized to generate the synthetic behavior; ideally, IOC would recover unitary weights on the feature number corresponding to each condition number.

It can be seen in Fig. A.4 that the naïve IOC algorithm succeeded in recovering nearly unitary weights and full contribution on the desired cost features in conditions 1 and 2, i.e., for the torque squared integral and the torque rate squared integral. In contrast, when we used IOC to infer the unitary weights for absolute work and acceleration squared in conditions 3 and 4, respectively, there were mismatches between the estimated and true weights. Specifically, when trying to recover the true weight for absolute work or acceleration squared, IOC attributed nonzero weight to cost features other than the single intended one. This suggests that behavior driven by the optimization of, e.g., absolute work or accelerations squared could also be produced by other cost features (torque squared, here), given the feature space in this problem. It is therefore highly likely that, for our specific locomotor task, optimizing several of these cost features has similar effects on the behavioral outcome, i.e., some cost features are multicollinear.
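One simple way to probe this kind of multicollinearity, sketched below for illustration, is to evaluate each candidate feature over a set of gait trajectories and inspect the correlation structure of the resulting feature values. The feature definitions here are simplified discrete-time stand-ins written for this appendix (trapezoidal integrals over a single hip-torque trajectory u and the relative hip-angle trajectory q on a time grid t), not the exact feature implementations used in the simulations.

import numpy as np

def feature_values(t, q, u):
    """Simplified cost-feature integrals for one gait trajectory (illustrative only).

    t : time samples (1D), q : relative hip angle (1D, rad), u : hip torque (1D, N.m)
    """
    qdot = np.gradient(q, t)
    qddot = np.gradient(qdot, t)
    udot = np.gradient(u, t)
    return np.array([
        np.trapz(u**2, t),               # integral of torque squared
        np.trapz(udot**2, t),            # integral of torque rate squared
        np.trapz(np.abs(u * qdot), t),   # absolute work
        np.trapz(qddot**2, t),           # integral of accelerations squared
    ])

def feature_correlation(trajectories):
    """Correlation of feature values across trajectories; entries near +/-1 flag multicollinearity."""
    F = np.array([feature_values(t, q, u) for (t, q, u) in trajectories])
    return np.corrcoef(F, rowvar=False)

Strongly correlated features imply that different weightings can produce nearly identical behavior, which is exactly the identifiability problem observed in conditions 3 and 4.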
Case 2. Cost features combined with nonzero weights

In this case, we assumed the composite cost function in IOC was a weighted combination of five cost features: the four features used above and an additional term on jerk. We performed a forward simulation of walking with the model by minimizing a composite cost function consisting of these features with nonzero weights. We selected the arbitrary vector [2, 3, 1, 4, 1.5]^T as the set of weights on the torque squared, torque rate squared, absolute work, acceleration squared, and jerk squared cost features, respectively. By performing forward optimization of gait with these five cost features weighted by the given coefficients, we obtained the optimal synthetic behavior for this case. We then used IOC to estimate the weights on the given cost features so that the behavior predictions from the lower-level optimization in IOC would match the simulated results.

Figure A.5: The kinematic and dynamic trajectories for case 2 with a combined cost (stance- and swing-leg hip angles and angular velocities, and the internal hip torque, over the step cycle, for the demonstration and the IOC predictions).

Figure A.5 shows that IOC succeeded in reproducing the given kinematics as well as the internal torque trajectories, even though there was no direct cost on the torque prediction error. The joint angle trajectories, angular velocity trajectories, and internal torques reproduced by IOC matched the reference trajectories for almost all 10 different IOC runs.

Figure A.6: The weights and valuations of the cost features for combined features with nonzero weights (case 2), comparing the recovered weight estimates with the true weights and showing each feature's contribution to the composite cost.

However, Fig. A.6 reveals a wide range for the set of cost feature weights, all of which predicted the given synthetic behavior with the same desired order of accuracy. This suggests that reproducing the given synthetic behavior does not map to a unique solution for the cost feature weights and that the mechanistic explanation of gait behavior in the given cost feature space is redundant. This motivates the use of a structured stochastic cost inference approach to infer a distribution of cost weights that could predict the same given gait behavior.
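To indicate what such a structured stochastic inference could look like, the fragment below sketches a basic random-walk Metropolis sampler over the cost weights, where the unnormalized likelihood of a weight vector is based on how well its forward-optimal gait reproduces the reference behavior. This is an illustrative sketch written for this appendix, not the Feature-Reduced Bayesian framework developed in the dissertation; forward_gait_optimization, the noise scale sigma, and the proposal step size are placeholder assumptions.

import numpy as np

def sample_weight_posterior(x_ref, forward_gait_optimization, n_features=5,
                            n_samples=2000, step=0.1, sigma=0.05, seed=0):
    """Random-walk Metropolis sampling of cost-feature weights (illustrative sketch).

    The unnormalized log-posterior penalizes the reproduction error ||x*(w) - x_ref||
    under a Gaussian noise model with scale sigma, with a flat prior on non-negative weights.
    """
    rng = np.random.default_rng(seed)

    def log_post(w):
        if np.any(w < 0):
            return -np.inf                                   # flat prior restricted to w >= 0
        err = np.linalg.norm(forward_gait_optimization(w) - x_ref)
        return -0.5 * (err / sigma) ** 2

    w = np.ones(n_features)                                  # arbitrary starting point
    lp = log_post(w)
    samples = []
    for _ in range(n_samples):
        w_prop = w + step * rng.standard_normal(n_features)  # symmetric random-walk proposal
        lp_prop = log_post(w_prop)
        if np.log(rng.uniform()) < lp_prop - lp:             # Metropolis acceptance rule
            w, lp = w_prop, lp_prop
        samples.append(w.copy())
    return np.array(samples)                                 # spread of samples reflects weight redundancy

The spread of the retained samples provides a direct picture of how many distinct weightings reproduce the same gait, which is the quantity the Bayesian approach in the main chapters is designed to characterize.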
Abstract
This dissertation addresses the utility of model-based computational approaches to simulations of treadmill walking for understanding the principles and factors that underlie human locomotor behavior. Split-belt walking is a common paradigm for adaptive gait through which locomotor studies can uncover the underlying principles of motor control in novel conditions. Combining the principles of optimality and good-enough control, we proposed a hybrid evolutionary-gradient-based algorithm for simulations of adaptive gait on a split-belt treadmill that selects a family of suboptimal solutions that are feasible and good enough. We then aimed to predict empirical features of gait observed during split-belt walking by selecting the good-enough solutions with respect to an effort-related cost function. Our predicted ranges of good-enough spatiotemporal asymmetries were in agreement with physics-based predictions. Moreover, our results were closer to the empirical observations of split-belt walking than prior simulation work. Across various tasks and contexts, it is highly likely that a composite cost representing a trade-off among several energetic and non-energetic components would better explain the observed gait features. Despite efforts to uncover this cost trade-off in human locomotor control through inverse optimization studies, attempts to infer a range of costs and to resolve cost multicollinearity have been limited. We therefore developed a Feature-Reduced Bayesian inverse optimization framework to identify a range of plausible costs that would explain the observed locomotor patterns in human walking. We implemented this approach to infer costs from synthetic gait data for which the cost weights were known. Our Feature-Reduced Bayesian inference recovered the true weights dictating the synthetic locomotor behavior more accurately than the naïve approach.