Page 5
3.2.3 Controlling a Single Active Drifter  38
3.2.4 Conclusion  42
3.3 Multi-Drifter Control  45
3.3.1 Problem Formulation  45
3.3.2 Drifter Control Methods  47
3.3.3 Performance Evaluation  51
3.4 Conclusion  64
II Task Data Scarcity  65
4 Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors  66
4.1 Related Work  68
4.2 Problem Statement  71
4.3 Dynamics Simulation  72
4.3.1 Rigid Body Dynamics for Quadrotors  72
4.3.2 Normalized Motor Thrust Input  74
4.3.3 Simulation of Non-Ideal Motors  75
4.3.4 Observation Model  76
4.3.5 Numerical Methods  77
4.4 Learning & Verification  77
4.4.1 Randomization  78
4.4.2 Policy Representation  79
4.4.3 Policy Learning  79
4.4.4 Sim-to-Sim Verification  80
4.4.5 Sim-to-Real Verification  81
4.5 Experiments  82
4.5.1 Ablation Analysis on Cost Components  84
4.5.2 Sim-to-Real: Learning with Estimated Model  85
4.5.3 Sim-to-Multi-Real: Learning without Model  87
4.5.4 Control Policy Robustness and Recovery  89
4.6 Conclusions  91
5 Task Specific Learning with Scarce Data via Meta-learned Losses  93
5.1 Related Work  95
5.2 Meta-Learning via Learned Loss  97
5.2.1 ML3 for Supervised Learning  98
5.2.2 ML3 Reinforcement Learning  100
5.2.3 Shaping ML3 loss by adding extra loss information during meta-train  104
5.3 Experiments  104
5.3.1 Learning to mimic and improve over known task losses  105
5.3.2 Shaping loss landscapes by adding extra information at meta-train time  111
Object Description
Title | Data scarcity in robotics: leveraging structural priors and representation learning |
Author | Molchanov, Artem |
Author email | a.molchanov86@gmail.com;molchano@usc.edu |
Degree | Doctor of Philosophy |
Document type | Dissertation |
Degree program | Computer Science |
School | Viterbi School of Engineering |
Date defended/completed | 2020-05-11 |
Date submitted | 2020-08-11 |
Date approved | 2020-08-11 |
Restricted until | 2020-08-11 |
Date published | 2020-08-11 |
Advisor (committee chair) | Sukhatme, Gaurav Suhas |
Advisor (committee member) | Ayanian, Nora; Culbertson, Heather; Gupta, Satyandra K.
Abstract | Recent advances in Artificial Intelligence have benefited significantly from access to large pools of data, accompanied in many cases by labels, ground-truth values, or perfect demonstrations. In robotics, however, such data are scarce or entirely absent. Overcoming this issue is a major barrier to moving robots from structured laboratory settings into the unstructured real world. In this dissertation, by leveraging structural priors and representation learning, we provide several solutions for settings where the data required to operate robotic systems are scarce or absent.

In the first part of this dissertation we study sensory feedback scarcity. We show how to use high-dimensional alternative sensory modalities to extract data when primary sensory sources are absent. In a robot grasping setting, we address the problem of contact localization and solve it using multi-modal tactile feedback as the alternative source of information. We leverage multiple tactile modalities provided by electrodes and hydro-acoustic sensors to structure the problem as spatio-temporal inference, and we employ the representational power of neural networks to acquire the complex mapping between tactile sensors and contact locations. We also investigate feedback that is scarce due to the high cost of measurements. We study this problem in a challenging field-robotics setting where multiple severely underactuated aquatic vehicles must be coordinated. We show how to leverage collaboration among the vehicles, together with the spatio-temporal smoothness of ocean currents as a prior, to densify feedback about the currents and thereby achieve better controllability.

In the second part of this dissertation, we investigate scarcity of data related to the desired task. We develop a method that efficiently leverages simulated dynamics priors to perform sim-to-real transfer of a control policy when no data about the target system are available. We investigate this problem in the scenario of sim-to-real transfer of low-level stabilizing quadrotor control policies, and we demonstrate that we can learn robust policies in simulation and transfer them to the real system while acquiring no samples from the real quadrotor. Finally, we consider the general problem of learning a model from a very limited number of samples using meta-learned losses. We show how such losses can encode prior structure about families of tasks to create well-behaved loss landscapes for efficient model optimization. We demonstrate the efficiency of our approach for learning policies and dynamics models in multiple robotics settings. |
Keyword | robotics; machine learning; artificial intelligence |
Language | English |
Part of collection | University of Southern California dissertations and theses |
Publisher (of the original version) | University of Southern California |
Place of publication (of the original version) | Los Angeles, California |
Publisher (of the digital version) | University of Southern California. Libraries |
Provenance | Electronically uploaded by the author |
Type | texts |
Legacy record ID | usctheses-m |
Contributing entity | University of Southern California |
Rights | Molchanov, Artem |
Physical access | The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given. |
Repository name | University of Southern California Digital Library |
Repository address | USC Digital Library, University of Southern California, University Park Campus MC 7002, 106 University Village, Los Angeles, California 90089-7002, USA |
Repository email | cisadmin@lib.usc.edu |
Filename | etd-MolchanovA-8923.pdf |
Archival file | Volume13/etd-MolchanovA-8923.pdf |