MOBILE ROBOT OBSTACLE AVOIDANCE USING A COMPUTATIONAL MODEL OF THE LOCUST BRAIN

by

Manu Viswanathan

A Thesis Presented to the
FACULTY OF THE USC VITERBI SCHOOL OF ENGINEERING
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
MASTER OF SCIENCE (COMPUTER SCIENCE)

May 2011

Copyright 2011 Manu Viswanathan

Acknowledgements

First and foremost, I would like to express my heartfelt gratitude to Dr. Laurent Itti for his unwavering support, optimism, patience and general good humour. I wish that every graduate student would have the great fortune to have someone like him as an advisor. He gives his students enormous freedom to choose research topics and explore different ideas, yet keeps us on track without ever being pushy or domineering.

Christian Siagian (now Dr. Christian Siagian) got me into iLab and was ever ready to help me with a variety of different projects, regardless of whether they were related to our research or not. Moreover, he has graciously endured my putting our joint work on the back burner to concentrate my efforts on this project. To him, I owe much.

I was lucky that Farhan Baluch’s other research interests kept him busy, thus allowing me to play a large role in the Robolocust project. However, he has been most gracious and supportive of my efforts, treating me as a valued collaborator rather than an intrusive competitor. He has also been a wonderful fount of ideas for trying different approaches to get things working.

Chin-Kai Chang, iLab’s resident hardware expert, was the force behind the hardware of the initial version of the robot. Coming to graduate school after over a decade in the software industry, and having long forgotten the electrical engineering course work from my undergraduate years, I think it is safe to say that without his help I would have been quite lost in the ocean of my own ignorance.
Randolph Voorhies spent many hours debugging my implementation of the FastSLAM algorithm and its Monte Carlo Localization module, which is the backbone on which much of the robot’s data collection and analysis rests. When a Propeller board’s USB port snapped off, he soldered it back on for me and also helped me out with various other hardware related issues. And a quick bit of trivia here: Rand was the one who suggested the name “Lobot” for this system.

Lior Elezary set up Lobot’s codebase and wrote the first few modules to get me started. Much of that code has long since been either revised beyond recognition or totally excised. Nonetheless, his initial efforts spared me the inertia and terror associated with starting from a completely blank slate.

Dicky Sihite implemented a remote control interface, an RPM sensor and some other bits and pieces for Lobot version 1.0.

David Berg cheerfully explained different aspects of neuroscience that I needed to understand.

Nader Noori was kind enough to clear the mathematical roadblocks I ran into every now and again. Moreover, his genuine interest in my progress and infectious laughter have been quite heartening.

Thanks are also due to Drs. Gaurav Sukhatme, Stefan Schaal and Tansu Celikel for agreeing to serve as members of the thesis committee. Dr. Sukhatme, in particular, challenged me to apply ideas and techniques that I learnt in his wonderfully taught probabilistic robotics course to different parts of the Robolocust project. Dr. Celikel was always ready to lend me an ear and gave me a lot of moral support.

Finally, on a personal note, I would like to thank my wife. Every so often, I have some hare-brained idea that eventually ends up uprooting our lives (move to the USA, quit work and go to grad school). Yet, she has never complained, never hesitated to make the hard changes to make it all work out. That kind of selflessness is a true blessing!
Therefore, more than anyone else, her sacrifices and endurance have made not only this thesis, but so much else in my life, possible and downright fun even through the most trying of times.

Table of Contents

Acknowledgements
List of Tables
List of Figures
Abbreviations
Abstract

Chapter 1: Introduction
1.1 Collision Detection Using Insect Vision
1.1.1 Challenges for Application to Robotics
1.2 Related Work
1.2.1 Vision-based Obstacle Avoidance
1.2.2 Bio-interfacing
1.2.3 Our Contribution
1.3 Scope and Organization of This Work

Chapter 2: A Sensor Model for the LGMD
2.1 Characterizing the LGMD
2.2 Generating Artificial LGMD Spikes
2.2.1 Model Parameters
2.3 Bayesian Time-to-impact Estimator
2.3.1 Sensor Model Split
2.3.2 Implementation Details
2.3.3 Estimator Parameters
2.4 Testing the LGMD Sensor Model
2.4.1 Experimental Methodology
2.4.2 Performance Metrics
2.4.3 Data Analysis
2.5 Results
2.6 Discussion
2.6.1 The Effect of δ
2.6.2 The Effect of Noise
2.6.3 The Effect of Robot Speed
2.6.4 The Locust Array as a Range Sensor

Chapter 3: Robot Controller Architecture
3.1 Hardware Platform
3.2 Low-level Software
3.2.1 Development Platform
3.2.2 iRobot Create Command Module Code
3.2.3 Command Module Interface Thread
3.3 Behaviour-based Framework
3.3.1 The Main Thread
3.3.2 Sensorimotor Interface Objects
3.3.3 Behaviour Threads
3.3.4 DAMN Arbiters
3.3.5 Visualization
3.3.6 Metrics Logging
3.4 The Fundamental Behaviours
3.4.1 Forward
3.4.2 Emergency Stop
3.4.3 Extricate
3.4.4 Open Path

Chapter 4: LGMD-based Obstacle Avoidance
4.1 The Algorithms
4.1.1 EMD
4.1.2 VFF
4.1.3 TTI
4.2 Local Navigation Task
4.2.1 Localization Module
4.2.2 Goal Seeking
4.3 Testing Obstacle Avoidance
4.3.1 Experimental Methodology
4.3.2 Performance Metrics
4.3.3 Data Analysis
4.4 Results
4.5 Discussion
4.5.1 The EMD Algorithm
4.5.2 The VFF Algorithm
4.5.3 The TTI Algorithm

Chapter 5: Conclusion
5.1 Evaluation of Project Goals
5.2 Comparisons to Other Similar Efforts
5.3 Future Work

Bibliography
Appendix: Raw Data for the Slaloming Experiments

List of Tables

Table 2.1: Values of different spike generation parameters
Table 2.2: Values of different TTI estimator parameters
Table 2.3: Values of variable parameters for TTI estimation tests
Table 3.1: Lobot’s hardware components
Table 3.2: Packages required to build Lobot’s software
Table 3.3: Messages exchanged by high and low-level controllers
Table 4.1: Metrics collected by local navigation experiments
Table 4.2: ANOVA computations for LGMD-based extrication success rate
Table 4.3: ANOVA follow-up for effect of algorithm type on success rate
Table 4.4: ANOVA follow-up for effect of LGMD noise on success rate
Table 4.5: ANOVA computations for average time-to-goal
Table 4.6: ANOVA computations for average robot speed

List of Figures

Figure 1.1: Physical structure of the LGMD
Figure 1.2: Plots illustrating how LGMD spike rate varies w.r.t. time-to-impact
Figure 1.3: Commercially available equipment for electrophysiology
Figure 1.4: Test platform for developing LGMD-based avoidance algorithms
Figure 2.1: Original and final versions of Robolocust test platform
Figure 2.2: Computing LGMD spikes using cameras
Figure 2.3: The effect of C, α, δ and l/v on the LGMD spike rate
Figure 2.4: Generating artificial LGMD spikes from virtual locusts
Figure 2.5: Probability distribution built by Bayesian time-to-impact estimator
Figure 2.6: Conceptual illustration of Bayesian state estimator’s sensor model
Figure 2.7: The TTI estimator’s representation of the belief
Figure 2.8: Experimental setup for TTI estimation tests
Figure 2.9: Plots of LGMD spike rate and predicted TTI vs. actual TTI
Figure 2.10: Plots of the overall prediction errors vs. δ
Figure 2.11: Feedback system for filtering LGMD input
Figure 3.1: Format of commands sent from high-level to low-level controller
Figure 3.2: The overall architecture of Lobot’s controller
Figure 3.3: Class diagram for the App class
Figure 3.4: Class diagram for the LocustModel class
Figure 3.5: Class diagram for the Robot class
Figure 3.6: Class diagram for the Behavior class
Figure 3.7: Overall structure of a DAMN controller
Figure 3.8: Class diagram for Lobot’s implementation of DAMN
Figure 3.9: Class diagram for high-level controller’s UI objects
Figure 3.10: Screenshots of high-level controller’s UI
Figure 3.11: Lobot’s implementation of the open path algorithm
Figure 4.1: Illustration of lgmd extricate emd’s modus operandi
Figure 4.2: Slalom course for testing obstacle avoidance algorithms
Figure 4.3: Robot trajectories for the different obstacle avoidance algorithms
Figure 4.4: Extrication statistics for the different algorithms
Figure 4.5: Time and speed statistics for the different algorithms
Figure 4.6: Plot of EMD threshold for different noise levels
Figure A.1: Raw data for LRF-based extricate
Figure A.2: Raw data for EMD algorithm, no noise
Figure A.3: Raw data for EMD algorithm, 25Hz noise
Figure A.4: Raw data for EMD algorithm, 50Hz noise
Figure A.5: Raw data for EMD algorithm, 100Hz noise
Figure A.6: Raw data for VFF algorithm, no noise
Figure A.7: Raw data for VFF algorithm, 25Hz noise
Figure A.8: Raw data for VFF algorithm, 50Hz noise
Figure A.9: Raw data for VFF algorithm, 100Hz noise
Figure A.10: Raw data for TTI algorithm, no noise
Figure A.11: Raw data for TTI algorithm, 25Hz noise
Figure A.12: Raw data for TTI algorithm, 50Hz noise
Figure A.13: Raw data for TTI algorithm, 100Hz noise

Abbreviations

AI     Artificial Intelligence
ANOVA  Analysis of Variance
API    Application Programming Interface
CPU    Central Processing Unit
DAMN   Distributed Architecture for Mobile Navigation
DCMD   Descending Contralateral Movement Detector
EMD    Elementary Motion Detector
FOV    Field of View
INVT   iLab Neuromorphic Vision Toolkit
I/O    Input/Output
LED    Light Emitting Diode
LGMD   Lobula Giant Movement Detector
LRF    Laser Range Finder
MCL    Monte Carlo Localization
MLSD   Modified Least Significant Difference
OI     Open Interface
PID    Proportional Integral Derivative
R/C    Radio Controlled
SLAM   Simultaneous Localization and Mapping
TTI    Time-to-impact
UART   Universal Asynchronous Receiver Transmitter
UI     User Interface
USB    Universal Serial Bus
VFF    Vector Force Field
WMA    Weighted Moving Average
w.r.t. with respect to

Abstract

The Lobula Giant Movement Detector (LGMD), a visual interneuron in the locust’s brain, responds preferentially to objects approaching along collisional trajectories. The goal of the Robolocust project is to build a robot that interfaces with this neuron and uses the LGMD spikes to make steering decisions. However, before interfacing with actual locusts, we would like to simulate the LGMD and develop suitable obstacle avoidance algorithms that take LGMD spikes as their sensory input. To this end, we have implemented a computational model of the LGMD that uses a laser range finder mounted on an iRobot Create to generate artificial spikes that are then fed into obstacle avoidance algorithms.

The main contribution of this thesis is a Bayesian state estimator that models the time-to-impact as a hidden variable and uses the LGMD spikes to gauge how far away the robot is from approaching obstacles. Additionally, we present three different LGMD-based obstacle avoidance algorithms that we have developed.

Chapter 1: Introduction

Early successes in the field of Artificial Intelligence (AI) led researchers to an overly optimistic view of future developments.
Herbert Simon, co-developer of the General Problem Solver (Newell and Simon, 1963), stated that “machines will be capable, within twenty years, of doing any work a man can do” (Crevier, 1993). That was in 1965.

Not to trivialize the AI pioneers or their achievements, or the work of many brilliant scientists and engineers since in various subfields of AI, but it seems that getting a computer to play chess is a tad easier than getting an embodied, autonomous agent to behave intelligently.

Nature, on the other hand, has churned out a bewildering array of embodied, autonomous agents. While we are not all gifted with the intellectual prowess to be expert chess players or achieve other such formidable feats of abstract thought, nearly every one of us can, with a modicum of alertness and some common sense, perform a complicated task such as crossing a busy street with, say, a child in tow, while simultaneously engaged in a conversation with another person, and yet keep track of the child and be generally aware of the traffic as we move. In fact, we do these things so effortlessly that we do not even consider them to be examples of particularly intelligent behaviour. Until, of course, one of us tries to get a machine to do something similar...

The point being belaboured here is that when it comes to constructing embodied, autonomous agents such as mobile robots, it behooves us to look to nature for inspiration. Furthermore, considering that evolution has already spent an enormous amount of time perfecting the design and operation of various biological sensorimotor systems such as eyes and muscles, might we be able to benefit from all that “prior art” by direct integration?

Specifically, for the Robolocust project, we want to build a robot that uses the locust’s visual system as a bio-sensor in order to detect and avoid collisions. Our goals are two-fold:

1.
Develop suitable sensor models and obstacle avoidance algorithms that we can deploy on mobile robots that use vision as a primary sensing modality.

2. Interface with actual locusts and use the neural spike trains emanating from their visual systems as a collision detection sensor to guide a mobile robot away from approaching obstacles.

This thesis, which has grown out of a larger effort, will focus on the first of the two goals mentioned above. We will touch upon the brain-machine interfacing aspects of the problem, but only in passing; our chief concern here shall remain algorithms, modeling and testing on a “traditional” mobile robot platform (i.e., one not connected to a live organism).

1.1 Collision Detection Using Insect Vision

Despite sporting tiny brains, many insects nonetheless exhibit sophisticated behaviours that rely heavily on visual perception. Bees, for instance, use optic flow to measure distances flown (Si et al., 2003). Fruit flies orient themselves so that, when they jump, they will end up moving as far away as possible from an approaching object. These preflight postural adjustments are made on the basis of visual input (Card and Dickinson, 2008). The tobacco hawk moth has two classes of neurons that code for looming and receding stimuli (Wicklein and Strausfeld, 2000). The moth uses these neurons while feeding in order to maintain its relative position with respect to a flower.

Figure 1.1: Physical structure of the locust’s Lobula Giant Movement Detector (LGMD). This visual interneuron detects objects moving along collisional trajectories and fires with increasing frequency as they get closer.

In contrast to the moth’s Class 1 and Class 2 neurons, the locust has a single visual interneuron, viz., the Lobula Giant Movement Detector (LGMD), that is tuned to pick up objects moving toward the animal on a direct collision course.
The Descending Contralateral Movement Detector (DCMD) is a post-synaptic target of the LGMD and it connects to motor neurons that trigger escape behaviours. Figure 1.1* shows the physical structure of the LGMD. Further details about the LGMD’s physiology may be found in Judge and Rind (1997); Nowel and Shelton (1981); O’Shea and Williams (1974); Rind and Simmons (1999).

In addition to being a single neuron responsible for detecting collisions, the LGMD, as suggested by the “giant” in its name, is a relatively large cell. This makes it more accessible compared to equivalent systems in other insects (such as moths and flies). These features make the locust’s LGMD an attractive choice as a collision detection bio-sensor for a mobile robot. However, using the LGMD in this manner entails several hardware and software challenges, as described below.

* Reprinted from Rind and Simmons (1999) with permission from Elsevier.

Figure 1.2: Plots illustrating how the LGMD spike rate behaves w.r.t. the time-to-impact. As an object gets closer, the LGMD fires with increasing frequency, achieving a peak shortly before a collision and then experiencing a rapid decline. We refer to the portion before the peak as the “looming phase” and to the portion after the peak as the “blanking phase.” Slow moving objects do not produce a sharp looming-peak-blanking response. Thus, from an algorithmic standpoint, it would be better if things were to approach our robot at high speeds.

Figure 1.3: Some sample equipment for recording electrophysiological data. This is the Model 2400 Patch Clamp Amplifier from A-M Systems and is designed for intracellular recordings. It requires a 120V, 60Hz AC power supply, weighs 22lbs, measures 17" × 4.75" × 11.25" and retails for about $3,000.
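The qualitative behaviour captured in figure 1.2 can be reproduced with the η-function commonly used in the LGMD literature to fit looming responses: the firing rate is a (delayed) product of the stimulus’s angular speed and an exponentially decaying function of its angular size. The sketch below is purely illustrative; the parameter values are invented for the example and are not the ones this thesis uses (chapter 2 covers the actual model and its parameters C, α, δ and l/v).

```python
import math

def angular_size(t, l, v):
    """Full angular size (radians) subtended by an object of half-size l
    approaching at constant speed v, at time-to-impact t > 0."""
    return 2.0 * math.atan2(l, v * t)

def lgmd_rate(t, C=1.0, alpha=2.0, delta=0.05, l=0.25, v=2.0):
    """Illustrative eta-function firing rate: C * |dtheta/dt| * exp(-alpha * theta),
    evaluated delta seconds behind the stimulus to mimic neural latency."""
    td = t + delta          # the response at TTI t reflects the stimulus delta earlier
    if td <= 0.0:
        return 0.0
    theta = angular_size(td, l, v)
    theta_dot = 2.0 * l * v / (v * v * td * td + l * l)   # |d(theta)/dt|
    return C * theta_dot * math.exp(-alpha * theta)

# Sampling the curve shows the figure 1.2 shape: the rate climbs during the
# looming phase, peaks shortly before impact (t = 0) and then falls away.
samples = [(t, lgmd_rate(t)) for t in (2.0, 1.0, 0.5, 0.2, 0.1, 0.05, 0.02)]
peak_t = max(samples, key=lambda p: p[1])[0]
```

With these made-up parameters the sampled peak sits at a small but non-zero time-to-impact, after which the rate collapses; this is exactly the blanking behaviour that makes last-moment detection unreliable.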
1.1.1 Challenges for Application to Robotics

High Speed Requirements

As we can see from figure 1.2, slow moving objects do not excite the LGMD as pointedly as fast moving objects (Gabbiani et al., 1999a). Clearly, to better recognize imminent collisions, we will have to operate our robot at high speeds. Otherwise, we will not be able to elicit the necessary spiking pattern to which we can respond in a reliable manner.

Unfortunately, small and medium-sized mobile robot platforms that are available commercially are typically designed for low-speed operation, usually up to 0.5m/s or thereabouts. This means that we will have to build a custom robot capable of moving at high speeds (e.g., up to 5m/s) without losing stability.

Physical Connection Issues

Signal acquisition, amplification and filtering are crucial aspects of recording electrophysiological data from biological specimens (see, for example, Bureš et al., 1960, chap. 2). Figure 1.3* shows an example of a commercially available device used for this task.

Clearly, the equipment pictured above is ill-suited for deployment atop a high-speed robot that has limited payload and is subject to significant jolts and vibrations. Consequently, we will have to redesign the electrophysiological recording equipment, focusing on miniaturization and robustness without sacrificing accuracy and resistance to noise.

* Photo (c) 2010 A-M Systems (http://www.a-msystems.com/). Used with permission.

Stopping the Robot

As an object approaches along a collisional trajectory, the LGMD spikes with increasing frequency. Shortly before collision, the spike rate reaches a peak and then experiences a very sharp decline (see the lower plot in figure 1.2). In order to reliably stop the robot before it bumps into something, or to steer it away from obstacles, we must detect these peaks and take appropriate action. The problem here is that the spike rate peaks occur very close to impact.
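To make the difficulty concrete, consider the naive stopping strategy the paragraph above implies: track the running maximum of the spike rate and react once the rate has clearly fallen below it. The following is our own illustration of that idea, not the detection scheme this thesis actually develops:

```python
def detect_peak(rates, drop_fraction=0.8):
    """Return the index of the first sample whose rate has fallen below
    drop_fraction of the running maximum (i.e. the peak has passed),
    or None if no such decline is seen."""
    peak = 0.0
    for i, r in enumerate(rates):
        if r > peak:
            peak = r
        elif peak > 0.0 and r < drop_fraction * peak:
            return i          # decline detected, necessarily after the peak
    return None

# A rise-peak-decline sequence like the fast-object plot in figure 1.2:
# the true peak is at index 4, but the alarm only fires at index 6.
alarm = detect_peak([5, 10, 30, 80, 120, 110, 60, 20])
```

By construction such a detector can only fire after the peak has already passed, and since the peak itself occurs very close to impact, a purely peak-triggered stop leaves the robot almost no time to react. This is why chapter 2 develops a predictive time-to-impact estimator rather than relying on peak detection alone.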
A locust’s musculoskeletal system is capable of exerting and withstanding over 18 gees of acceleration (Bennet-Clark, 1975). Therefore, a last-minute escape is a perfectly viable option for the creature. A mobile robot, however, would be somewhat challenged to effect a similar collision avoidance strategy. Consequently, we need to develop algorithms that can predict oncoming collisions well in advance.

Steering Decisions

Finally, the locust’s brain contains two LGMDs, one on the right and another on the left. The locust makes a simple reactive decision regarding escape direction based on which LGMD is firing more rapidly (Yue et al., 2010). Thus, if the left LGMD is producing more spikes, the locust will jump right, and vice versa.

As we shall see in subsequent chapters, such a simple approach is not well suited to a mobile robot attempting to use the LGMD as a collision sensor, especially in situations where we need the robot to behave in a goal-directed manner. This means that we will have to mount several locusts on our robot and combine the LGMD readings from each of them so as to effect reasonably intelligent and goal-directed actions.

1.2 Related Work

We mentioned earlier that the Robolocust project aims to develop sensor models and obstacle avoidance algorithms that will enable a mobile robot to detect and avoid impending collisions on the basis of LGMD spikes. We would then like to apply these algorithms to control a robot that is connected directly to the visual systems of live locusts. These goals impinge upon the following two areas of robotics research:

• Vision-based obstacle avoidance
• Bio-interfacing

Therefore, to help contextualize our work, we present a brief survey of the available literature on these two topics.

1.2.1 Vision-based Obstacle Avoidance

Moravec (1977) implemented one of the earliest robots to use vision-based obstacle avoidance. His system first applied an “interest” operator on the input image to produce a list of recognizable features.
Then, it tracked these features across successive frames using a correlation operator. Stereo computations helped determine the range to the objects identified by the feature list (Gennery, 1977).

Since then, researchers have developed numerous techniques applicable to vision-based robotics. de Souza and Kak (2002) and Bonin-Font et al. (2008) provide comprehensive surveys of the current state of the art. Here, we only discuss a few papers that we believe have direct relevance to the Robolocust project.

One approach to vision-based obstacle avoidance is to use stereoscopically arranged cameras to extract depth information from the visual input and use it to build a three dimensional terrain map (Singh and Digney, 1999). The robot can then evaluate several candidate paths based on how much roll and pitch it will likely encounter, the likelihood of hitting an object protruding from the ground, etc.

Alternatively, in reasonably structured indoor environments, a less involved solution is to segment the input image so as to separate the floor from potential obstacles and use stereo to gauge their shapes and relative positions (Hashima et al., 1997).

Instead of trying to determine the full three dimensional shapes and positions of obstacles, Murray and Little (2000) project everything into two dimensions and use the resulting coordinates to update an occupancy grid (Elfes, 1989), which they then use for path planning and obstacle avoidance. In effect, the authors have converted a visual sensor into a range sensor such as a sonar or laser ranger.

The visual sonar approach due to Lenser and Veloso (2003) sidesteps the complexities of stereo systems (camera calibration, input image rectification, correspondence matching and final triangulation) by opting for monocular vision. First, the algorithm performs color segmentation on the input image and categorizes the resulting segments into different classes such as “floor,” “obstacle,” “unknown” and so on.
It then overlays scan lines radiating outward from the robot’s current position in 5° steps and identifies objects along these scan lines by applying various domain-specific heuristics. Finally, distances are calculated by casting rays from the closest pixels of identified objects in the image plane onto the ground plane.

Although promising, we should point out that visual sonar was developed in the context of robotic soccer. How well it will scale to more general environments is an open question. For example, we could expect a home with wood floors having prominent grain lines, partially covered by area rugs of different patterns and colors, as well as containing furniture in several colors, to break the algorithm.

Michels et al. (2005) also use monocular vision instead of stereo to determine distances to various objects in the camera’s field of view. However, rather than trying to model different things in the scene, they follow a hybrid approach, wherein they use a laser range finder to first train a predictor module. They then apply supervised reinforcement learning to develop a steering control policy that uses the vision module’s distance predictions as input. With this setup, they have gotten their robot to drive outdoors over rough terrain at high speeds (2–5m/s) for about a minute without crashing into anything.

The preceding paragraphs touched upon the use of vision toward the construction of a 2D or 3D world model in which obstacles are modeled either directly or indirectly (via learning) in such a way that the robot knows exactly (subject, of course, to some margin of error) where they are w.r.t. its current location. Optical flow techniques, on the other hand, eschew such elaborate world modeling. Instead, they rely on motion-induced visual cues to recognize potentially hazardous conditions without requiring explicit range estimates.
For instance, Coombs et al. (1995) computed the divergence in the central and peripheral optical flows using two cameras, one equipped with a narrow-angle lens to capture foveal vision and the other sporting a wide-angle lens for peripheral vision. The difference between these two flows enabled the robot to avoid collisions by centering itself within a “conceptual corridor.”

Many insects navigate in exactly this fashion. Santos-Victor et al. (1993) designed a robot to mimic the centering behaviour observed in bees (Srinivasan et al., 1991). They used two cameras pointed in opposite directions to compute the optical flow on each side and a simple PID control law to minimize the difference in the two flow velocities. Their robot was able to successfully avoid obstacles; however, it did not handle rotational motion very well.

More recently, i Badia et al. (2007) combined neuronal models of the fly’s optomotor system and the locust’s LGMD to produce a robotic blimp capable of maintaining a steady flight path and avoiding obstacles. However, they have only tested their robot in an artificially instrumented room.

In an approach similar to Michels et al. (although predating them by a decade), Nakamura and Asada (1995) applied reinforcement learning to come up with behaviours that encapsulate the sensorimotor processes required to both track targets and avoid obstacles using a single camera as the robot’s sensor. However, in contrast to the work by Michels et al. (2005), Nakamura and Asada used optical flow primitives to make depth estimates rather than training with a range sensor.

1.2.2 Bio-interfacing

Hertz (2004) built a robot named RoachBot that was controlled by a cockroach. It was equipped with some distance sensors that were wired to lights placed around the cockroach. When the sensors detected something nearby, the corresponding light would turn on. The cockroach, perched atop a trackball, would instinctively seek a hiding place and try to scurry away.
Its motor actions were then “transmitted” from the trackball to the robot’s motors to effect appropriate steering and speed control.

RoachBot went through three design iterations. Unfortunately, as reported by Hertz, it never quite achieved the desired obstacle avoidance behaviour. Moreover, RoachBot was meant to function within an artistic and entertainment milieu, helping lay people identify with the insect’s point of view, and to highlight such quintessentially biological attributes as laziness, irrationality, unpredictability and other emotions.

Reger et al. (2000) presented a more technically oriented project wherein they connected the sensorimotor system of a lamprey to that of a small robot. Light sensors on the robot were converted to neural stimulations and delivered to the lamprey via surgically implanted electrodes. The frequency of these stimulations was proportional to the amount of light picked up by the robot. The lamprey’s motor responses were, in turn, converted to drive commands and delivered to the robot. With this setup, the authors were able to get the robot to track a light source, exploiting the lamprey’s natural instinct to stabilize itself w.r.t. its vertical axis as it swims.

In a similar experiment involving a fruit fly, Graetzel et al. (2008) used a high-speed camera to analyze wing beat patterns that were then converted to equivalent motor commands for a robot. As the robot moved about, its range and vision sensors were used to update the optic flow pattern of an LED flight simulator inside of which the fly was tethered. The new pattern caused the fly to adjust its flight path, which, in turn, resulted in the robot changing course and providing new sensor input.

In an earlier effort, DeMarse et al. (2001) cultured cortical tissue from 18 embryonic rats onto a 60 channel multi-electrode array (MEA). A clustering algorithm was used to recognize different spatiotemporal firing patterns in the spike trains produced by the resulting neural network.
Different patterns were mapped to corresponding motions of an “animat” within a virtual world. The animat’s new state was then used to produce appropriate sensory feedback for the network, which was delivered as stimulations to specific parts of the network. This feedback would cause changes in the firing patterns, which, in turn, would move the animat, and so on. The above work later led to the development of MEART (Bakkum et al., 2007), wherein a cultured neural network was used to control robot arms that would draw with colored pens, thus creating “art.”

Whereas the projects described above have all attempted to close a sensorimotor loop in a system constructed from biologic and synthetic components, Herr and Dennis (2004) implemented a swimming robot that used living muscle tissue from a frog without any sensory elements. Instead, the muscle was controlled by an open loop process running on a microcontroller whose only goal was to test various manoeuvres such as starting, stopping, turning and normal forward swimming.

1.2.3 Our Contribution

In keeping with other insect-inspired solutions to obstacle avoidance, ours too is an optical flow based technique. One drawback of such methods is that their depth estimates are implicit in the robot’s behavioural responses rather than being made explicit. This precludes smooth integration with mapping algorithms that rely on range sensing (e.g., Thrun et al., 2005, chap. 9).

We overcome the above limitation by modeling the LGMD as a time-to-impact sensor and combining this information with the robot’s encoder data to produce distance estimates that can then be injected into a variety of range sensing based algorithms. Thus, like Murray and Little (2000), we convert a visual sensor into a range sensor, but without the heavy computational demands of stereo vision or the domain specificity of visual sonar (Lenser and Veloso, 2003).
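Concretely, the conversion just described is dead-reckoning arithmetic: a time-to-impact estimate for some viewing direction, multiplied by the robot’s current encoder-reported speed, yields a range reading in that direction. The sketch below is ours; the variable names and the per-bearing layout are illustrative assumptions, not Lobot’s actual interfaces:

```python
def tti_to_range(tti_s, speed_mps):
    """Convert a time-to-impact estimate (seconds) into a range reading
    (metres) using the robot's current speed from its encoders."""
    if tti_s == float("inf") or speed_mps <= 0.0:
        return float("inf")    # nothing approaching, or robot not moving
    return tti_s * speed_mps

# One (virtual) locust per viewing direction gives one range per bearing,
# so the array as a whole behaves like a coarse laser range finder.
tti_estimates = {-45: 3.0, 0: 1.2, 45: float("inf")}   # bearing (deg) -> TTI (s)
speed = 0.5                                            # m/s, from wheel encoders
ranges = {b: tti_to_range(t, speed) for b, t in tti_estimates.items()}
```

Treated this way, an array of locusts pointed along different bearings acts like a coarse range scanner (section 2.6.4 examines this locust-array-as-range-sensor idea), which is what allows the estimates to feed mapping algorithms that expect range scans.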
Furthermore, sincetheLGMDworksbetterwhenexposedtofast-movingobjects(figure 1.2), we expect a robot equipped with LGMD’s, either real or virtual, to operate well at high speeds. Regarding the bio-interfacing aspects of Robolocust, to the best of our knowledge, it wouldbethefirst“hybrot”(Potter,2002)tooperateuntetheredandaccomplishacomplex, real-world task with the biological component on-board. Earlier efforts have, in effect, been teleoperated, with sensorimotor signals being relayed back and forth between robot and organism. On Robolocust, however, we will have the insects mounted on the robot and we will interpret their sensory inputs and issue appropriate motor commands locally. Thus, the robot will be completely autonomous and self-contained. 1.3 Scope and Organization of This Work This thesis concentrates on the stopping and steering problems described in section 1.1.1. WeuseacomputationalmodeloftheLGMDtogenerateartificialspikes,thereby,simulating the connection between real locusts and a robot. Figure 1.4 shows our test robot. It is an iRobot Create ∗ equipped with a laser range finder that, in conjunction with the Create’s encoders, allows us to make fairly accurate estimates of the time-to-impact in different directions. These times-to-impact are used to generate spike rate curves similar to the ones shown in figure 1.2, which are then fed into behaviours designed to issue stop and turn commands to the robot. In chapter 2, we will go into the details of the above-mentioned LGMD model and show how we have implemented it on our test robot. We then develop a sensor model for the LGMD that enables the robot to predict how long it has until it collides with something. ∗ See http://www.irobot.com/create/. 12 Mini−ITX quad−core Computer Computer Batteries & Power Supply Laser Range Finder iRobot Create Robot Base Figure 1.4: The test platform for developing LGMD-based stopping and steering algo- rithms. 
The laser range finder and robot’s encoders are used to generate artificial LGMD spikes. We have conducted many controlled experiments to thoroughly test and characterize this sensor model. Sections 2.5 and 2.6 present detailed analyses of these experiments.

Next, we apply the LGMD sensor model to a goal-seeking local navigation task involving a slalom obstacle course. However, before we can explain the three LGMD-based obstacle avoidance algorithms we have designed, we need to describe the hardware and software architecture of the robot’s controller. Chapter 3 does that.

Once the above contextual foundation has been laid, we can proceed to understanding how LGMD-based obstacle avoidance works. Chapter 4 delves into this area. To compare the three LGMD-based avoidance algorithms, we use, as a baseline, a behaviour that takes the laser range finder’s distance measurements as its input instead of the artificially generated LGMD spikes. Sections 4.4 and 4.5 show the results of these comparisons and discuss the relative merits and demerits of each of the three algorithms.

Finally, we conclude with chapter 5, looking back at what we have achieved so far and forward to where we hope to go next with this research.

Chapter 2

A Sensor Model for the LGMD

In the early stages of the Robolocust project, we tried to use cameras to generate artificial LGMD spikes. Figure 2.1(a) shows a picture of the original test robot. Unfortunately, we were unable to generate artificial LGMD spikes using the cameras. We attempted to do so following the treatment by Stafford et al. (2007). Figure 2.2 presents a schematic and brief description of this model. The trouble we faced in our implementation was its inability to filter out lateral motions. As a consequence, when the robot moved, it would respond to all motion because it could not distinguish between objects approaching along collisional trajectories and objects simply moving sideways in and out of its field of view.
Since we wanted to focus mainly on using LGMD spikes as sensory input for the robot, we decided to defer a thorough investigation of the above-mentioned problem, or even the development of an alternative vision-based spike generation procedure, to a later date. Instead, we chose to implement the computational model due to Gabbiani et al. (1999a), which defines the LGMD spike rate as a function of the time-to-impact. As we were already using a laser range finder in conjunction with the cameras to verify ground truth about obstacle distances, it made sense to simply discard the cameras for the time being and use the laser range finder alone to help simulate actual locusts.

Figure 2.1: The original and final versions of Lobot, the Robolocust test platform. (a) Lobot 1.0: multiple cameras on an R/C car. (b) Lobot 2.0: laser range finder on a Roomba. We started off on an R/C car with cameras as the primary sensor (the laser range finder was for ground truth). However, we moved to the iRobot Create and laser range finder combination due to hardware problems and, more importantly, the lack of a suitable vision-based computational model of the LGMD.

In sections 2.1 and 2.2 we go into the details of how the LGMD works as per the Gabbiani model and how we use it to generate artificial spikes for our robot. The rest of the chapter describes the sensor model we developed to predict the time-to-impact from LGMD spikes and our test results.

Before we proceed, however, we would like to point out that there is some disagreement within the research community (Gabbiani et al., 1999b; Rind and Santer, 2004) about whether the LGMD operates multiplicatively (Gabbiani et al., 2002) or additively (Rind and Bramwell, 1996). Recent work by i Badia et al.
(2010) suggests that the LGMD’s apparent non-linearity is due to its large pre-synaptic network and not an intrinsic property of the neuron itself.

We remain agnostic on this matter. Our interest is in the interpretation of the LGMD spikes so as to elicit goal-directed obstacle avoidance behaviour. Consequently, from our perspective, the LGMD is a black box that outputs a particular signal. It so happens that this signal follows a functional computation proposed by Gabbiani et al. (1999a) that fits certain electrophysiological data. We are happy to use this correlation to our advantage and do not concern ourselves with exactly how our input signal is produced or whether it is a perfect description of the underlying biology.

Figure 2.2: The processing pipeline implemented by the three layer neural model of the LGMD described by Stafford et al. (2007). The photoreceptive P layer works by taking the difference of the raw input frames in the current and previous time steps. The I layer integrates the result of the P layer’s computations in the current and previous time steps and smooths the result using a convolution. Finally, the S layer computes the difference of the current P layer and previous I layer to produce a membrane potential. When this potential exceeds a certain threshold, the LGMD is considered to have fired.

2.1 Characterizing the LGMD

According to Gabbiani et al. (1999a) (see also Indiveri, 1997), LGMD spikes are produced at a rate that is a function of the time-to-impact:

    f(t) = C θ̇(t − δ) e^(−α θ(t − δ))        (2.1)

where

    f    = the LGMD firing rate
    t    = the time-to-impact
    θ(x) = 2 tan⁻¹(l/(vx))
    θ̇(x) = −(l/v) / (x² + (l/v)²)

C, α, δ and l/v are external parameters to the computations described by the above equations.
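To make equation (2.1) concrete, here is a minimal, self-contained C++ sketch of the spike rate computation. The function names are ours, not Lobot’s, and two simplifications are assumptions on our part: we take the magnitude of θ̇ so that the result reads as a non-negative firing frequency, and we clamp the rate to zero for t ≤ δ rather than model the blanking phase explicitly.

```cpp
#include <cmath>

// Subtended angle of the approaching object and its derivative, as given
// by the definitions accompanying equation (2.1).
double theta(double x, double l_over_v)
{
    return 2.0 * std::atan(l_over_v / x);
}

double theta_dot(double x, double l_over_v)
{
    return -l_over_v / (x * x + l_over_v * l_over_v);
}

// f(t) = C * theta_dot(t - delta) * exp(-alpha * theta(t - delta));
// we report the magnitude so the rate is a non-negative frequency (Hz).
double lgmd_rate(double t, double C, double alpha,
                 double delta, double l_over_v)
{
    double x = t - delta;
    if (x <= 0.0) return 0.0; // simplification: clamp at the peak delay
    return C * std::fabs(theta_dot(x, l_over_v))
             * std::exp(-alpha * theta(x, l_over_v));
}
```

With C = 175, α = 0.001 and l/v = 0.25 (the fixed values of table 2.1), the rate grows as the time-to-impact shrinks toward δ, mirroring the looming phase of figure 1.2.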
C is a constant of proportionality and controls the amplitude of the LGMD signal. Low values of C will result in low spike rate values whereas higher values will produce commensurately higher spike rates. The top two plots in figure 2.3 show the effect of C when α, δ and l/v are kept constant.

l/v is the time it takes for an object to double its size in the locust’s field of view. Equivalently, it is the time it takes for the approaching object to cover a distance equal to its own (half) height. The bottom row in figure 2.3 shows how l/v affects the LGMD spike rate. It controls the sharpness and durations of the looming and blanking phases (see figure 1.2, page 4). Low values indicate a fast-moving object; this will produce a very sharp response with the looming and blanking phases lasting a short while. High values, on the other hand, tend to “expand” the looming and blanking phases of the LGMD signal and produce gentler, lower peaks.

Figure 2.3: Plots illustrating how C, α, δ and l/v affect the LGMD spike rate. The top row shows the effect of C. The second row shows the effect of α. The third row shows the effect of δ. And the bottom row shows the effect of l/v. In each row, the left plot shows the effect when that row’s parameter is low and the right plot when the parameter value is high.

We know quite well by now that as an object approaches, the LGMD fires with increasing frequency. However, after a certain threshold is crossed, feed-forward inhibition will kill the LGMD’s firing. α determines how rapid this blanking phase is, i.e., how quickly the firing rate drops off as the approaching object gets close to collision. The second row of figure 2.3 shows how the LGMD firing rate behaves w.r.t. α while the other three parameters are kept constant.

Finally, δ controls how far away from collision the spike rate peak is achieved.
Lower values will result in peaks closer to collision; higher values, in peaks further away from collision. The third row of figure 2.3 illustrates the effect of the δ parameter.

2.2 Generating Artificial LGMD Spikes

To guide Lobot (figure 2.1(b)) using locusts, we divide the laser range finder’s field of view into adjacent, overlapping angular regions and assign a virtual locust to each region as illustrated in figure 2.4(a). We then apply equation (2.1) to compute the LGMD spike rate for each virtual locust. The spike trains from the entire array of virtual locusts are fed into an obstacle avoidance behaviour to effect the desired steering control.

To be able to use equation (2.1) for determining the LGMD spike rate for a virtual locust, we need an estimate of the current time-to-impact t in that locust’s direction, which we obtain by combining the information returned by the robot’s encoders and the laser range finder (LRF). From the encoders, we can tell the robot’s current speed, s, and its current heading, φ, relative to its own frame of reference (see figures 2.4(b) and 2.4(c)). Thus, the robot’s velocity vector ~v is given by:

    ~v = s (cos φ, sin φ)ᵀ

(a) We divide the laser range finder (LRF) field of view (FOV) into adjacent, overlapping angular regions. A virtual locust is set up facing along the bisector of each such region. Each virtual locust’s FOV is centered about this bisector (we only show a few FOV’s to keep the diagram legible). (b) We project the robot’s velocity vector ~v onto the locust’s line of sight û to get s_ψ (shown in red). We then divide the mean distance measurement in the locust’s FOV by s_ψ to get the time-to-impact in the locust’s direction, which we plug into equation (2.1) to obtain the spike rate. (c) The robot’s frame of reference.
0° corresponds to driving straight ahead; positive angles indicate left turns and negative angles indicate right turns. The angles shown above are exaggerated for illustrative purposes. Also, while turning, the robot will actually sweep out a circular arc. The heading returned by the low-level encoder system (i.e., φ) will be a tangent to this arc.

Figure 2.4: Generating artificial LGMD spikes from virtual locusts.

Now, let us say we want to compute the LGMD spike rate for a virtual locust looking along the direction ψ (measured in the robot’s frame of reference). The unit vector û along this locust’s line of sight will be given by:

    û = (cos ψ, sin ψ)ᵀ

If we project ~v onto û and calculate the magnitude of the resulting vector, we will get the speed of the robot along the locust’s line of sight, viz., s_ψ. Thus:

    s_ψ = ‖(~v · û) û‖        (2.2)

The next step is to find the distance to the nearest obstacle along û. For that, we could simply take the LRF’s range reading corresponding to angle ψ. However, since a single reading can be faulty or noisy, it would be prudent to take the mean range reading in the locust’s field of view. If a virtual locust’s field of view encompasses β degrees on each side of its line of sight, ψ, then the distance to the nearest obstacle along the locust’s line of sight, viz., d_ψ, is given by:

    d_ψ = (1 / (2β/R)) Σ_{x=ψ−β}^{ψ+β} LRF(x)        (2.3)

where

    LRF(x) = LRF range reading in direction x
    R      = angular resolution of the LRF

At this point, we have all the information we need to get the time-to-impact along the locust’s line of sight, which we can calculate by dividing the left hand side of equation (2.3) by that of equation (2.2). That is, we rearrange the terms in speed = distance/time to get:

    t_ψ = d_ψ / s_ψ

We can now pass the above value to the spike rate function defined by equation (2.1) and get the LGMD spike rate for the virtual locust looking in the direction ψ.
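The chain of computations just described (project the velocity, average the range readings, divide) can be sketched as follows. All names are illustrative, angles are taken in radians, and the conventions that non-positive readings are invalid and that −1 signals “no estimate” are assumptions of this sketch, not Lobot’s interfaces.

```cpp
#include <cmath>
#include <vector>

// Speed along a locust's line of sight: magnitude of the projection of
// the velocity (speed s, heading phi) onto direction psi (equation (2.2)).
// Note that v . u = s (cos phi cos psi + sin phi sin psi) = s cos(phi - psi).
double los_speed(double s, double phi, double psi)
{
    return std::fabs(s * std::cos(phi - psi));
}

// Mean of the valid range readings in the locust's FOV (equation (2.3));
// readings <= 0 are treated as invalid and skipped.
double mean_distance(const std::vector<double>& fov_readings)
{
    double sum = 0.0; int n = 0;
    for (double r : fov_readings)
        if (r > 0.0) { sum += r; ++n; }
    return n ? sum / n : -1.0;
}

// t_psi = d_psi / s_psi; returns -1 when no estimate is possible.
double time_to_impact(double s, double phi, double psi,
                      const std::vector<double>& fov_readings)
{
    double sp = los_speed(s, phi, psi);
    double d  = mean_distance(fov_readings);
    if (sp < 1e-9 || d < 0.0) return -1.0; // stationary or no valid data
    return d / sp;
}
```

For instance, a robot driving straight ahead (φ = 0) at 500 mm/s toward an obstacle 1000 mm away along ψ = 0 yields a time-to-impact of 2 s.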
Repeating this process for all the other virtual locusts will give us the spike rates for the entire locust array.

For the sake of clarity, the above discussion has glossed over some minor details. For example, it sometimes happens that the LRF does not have a valid range reading for some direction x. When we compute the mean distance in a locust’s field of view (equation (2.3)), we should discard such readings. Moreover, because LRF and encoder data tend to be quite stable and noise-free, the spike trains we obtain will be fairly clean. Therefore, to better simulate real locusts, we corrupt the computed spike rates by adding Gaussian noise to the output of equation (2.1).

Algorithm 2.1 shows the exact steps applied to generate artificial LGMD spikes for a virtual locust looking in direction ψ. Figure 2.4(b) illustrates some of the geometry involved to help clarify the above discussion and the steps of algorithm 2.1.

2.2.1 Model Parameters

Section 2.1 explained four crucial parameters (C, α, δ, l/v) used by the spike generation routine. Algorithm 2.1 introduced four more, viz., β, σ, f_min and f_max. Yet another that we have not mentioned explicitly thus far is L, the list of virtual locusts, specified by the angles associated with their individual lines of sight (see figure 2.4(a)). Table 2.1 documents the values we used for these different parameters. We arrived at these values mostly by a process of trial and error. However, for some parameters (e.g., δ, σ), we chose values based on test design. Except for these testing related parameters, which we varied across experiments, the others remained fixed throughout, i.e., once we converged on their values, we did not change them thereafter.
Algorithm 2.1 calculate-spike-rate(ψ)

    s = current robot speed                     // from encoders
    φ = current robot heading                   // from encoders
    ~v = s (cos φ, sin φ)ᵀ
    û = (cos ψ, sin ψ)ᵀ
    s_ψ = ‖(~v · û) û‖
    if s_ψ = 0                                  // robot is stationary
        return 0                                // no motion ⇒ no spikes
    n = 0
    Σ = 0
    for x = ψ−β to ψ+β
        if LRF(x) is a valid range reading
            Σ += LRF(x)
            ++n
    if n = 0                                    // no valid range readings
        return 0
    d_ψ = Σ/n
    t_ψ = d_ψ/s_ψ
    // Now apply eq. (2.1) and algorithm 2.2 to obtain the LGMD spike rate
    return f(t_ψ) + gaussian-noise(σ), clamped to the range [f_min, f_max]

Algorithm 2.2 gaussian-noise(σ)

    if σ = 0
        return 0                                // no noise
    return ½ Σ_{i=1}^{12} rand(−σ, σ)           // see Thrun et al. (2005, pg. 124)

    Name    Description                                      Units     Value
    C       Controls amplitude of LGMD signal                          175
    α       Controls slope of LGMD blanking phase                      0.001
    l/v     Controls sharpness and duration of LGMD peak     s         0.25
    δ†      Controls where LGMD peak occurs                  s         0.25, 0.50, 0.75, 1.00,
                                                                       1.25, 1.50, 1.75, 2.00
    σ†      Standard deviation for Gaussian noise for        Hz        0, 25, 50, 100
            corrupting LGMD spikes
    f_min   Minimum spike rate                               Hz        0
    f_max   Maximum spike rate                               Hz        800
    L       Directions of the lines of sight of the          degrees   −95, −75, −60, −45, −30, −15, 0,
            virtual locusts                                            15, 30, 45, 60, 75, 95
    β       Half the FOV of a virtual locust                 degrees   15

Table 2.1: These are the values we used for the different spike generation parameters. Parameters marked with a ‘†’ were the only ones that we varied. The others remained fixed across all the experiments and tests we conducted.

For C, α and l/v, we drove the robot around, using different values, and observed the LGMD responses, eventually settling on those values that resulted in spike rate curves similar to the one shown in the lower plot of figure 1.2 (page 4). We noticed then that sometimes the virtual locusts would produce extremely high spike rates, which prompted us to introduce f_min and f_max to ensure that momentary glitches in LRF and encoder readings would not produce outlandish spiking activity.
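A runnable rendering of the gaussian-noise routine of algorithm 2.2 might look like this; the use of the C++ &lt;random&gt; facilities and the specific generator are our choices, not Lobot’s:

```cpp
#include <random>

// Approximate a zero-mean Gaussian variate with standard deviation sigma
// by summing twelve uniform samples on [-sigma, sigma] and halving the
// sum (cf. Thrun et al., 2005, pg. 124).
double gaussian_noise(double sigma, std::mt19937& rng)
{
    if (sigma == 0.0) return 0.0; // no noise
    std::uniform_real_distribution<double> rand(-sigma, sigma);
    double sum = 0.0;
    for (int i = 0; i < 12; ++i)
        sum += rand(rng);
    return 0.5 * sum;
}
```

By construction, the samples are bounded by ±6σ, which keeps the corrupted spike rates from wandering arbitrarily far from the noise-free value.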
We varied δ and σ to see how well our LGMD sensor model and obstacle avoidance algorithms would work in close quarters (low δ) and how well they could cope with poor sensor performance (i.e., high noise).

The choice of β was somewhat arbitrary: we did not want a very large FOV for each locust because obstacle boundaries would cause sharp discontinuities in the range readings, which, in turn, would lead to very misleading and inaccurate LGMD spike rates. However, we also did not want to constrain the FOV too much.

Finally, for L, we chose the radial arrangement shown in figure 2.4(a) in order to mimic a range sensor such as a sonar ring. We placed the virtual locusts so that there would be partial overlaps in the FOV’s of neighbouring locusts. The intent here was to create some ambiguity about the exact direction from which an obstacle is approaching.

One thing to note about L is that, for the most part, we chose an angular spacing of 15° between neighbouring locusts. However, the leftmost and rightmost virtual locusts were configured to look along 95° and −95° respectively rather than 90° and −90° due to a peculiar aspect of the velocity vector projection (equation (2.2)). We noticed that, on average, the robot spent more time driving straight ahead than engaged in turns. The projection of the straight-ahead velocity vector onto ±90° will always yield zero for s_ψ. As a result, these two locusts rarely produced any spikes. We worked around this problem by offsetting these two virtual locusts by 5° each. While this did ameliorate the situation somewhat, it did not provide a completely satisfactory solution. We will have more to say on this matter in section 2.6.

Figure 2.5: An illustration of the state space and posterior probability distribution built by the Bayesian time-to-impact estimator, which uses the current instantaneous LGMD firing rate obtained from an “LGMD sensor” as its input.
The continuous state variable t, viz., the unknown time-to-impact, which we are trying to estimate from LGMD spikes, is limited to the range (0,10] and discretized in steps of 0.1s.

2.3 Bayesian Time-to-impact Estimator

Equation (2.1) tells us that the LGMD spike rate is a function of the time-to-impact. Unfortunately, as we can see from the many plots shown in figure 2.3, this function is not invertible. That is, there is no closed form expression that will allow us to determine the corresponding time-to-impact given a spike rate from an “LGMD sensor.”

Consequently, we use Bayesian state estimation to determine the most likely time-to-impact (TTI) given an LGMD spike rate. The goal of our TTI estimator is to build a probability distribution over all the possible times-to-impact. Since the time-to-impact, t, is a continuous variable and we have chosen to implement a simple Bayes filter (Thrun et al., 2005, chap. 2), we discretize our state space so that we only consider t starting at 0.1s and going up to 10s in steps of 0.1s (see figure 2.5).

For each t ∈ {0.1, 0.2, ..., 10}, we want to determine P(t|f), where f is the current LGMD firing rate returned by the LGMD sensor. We start off with a uniform belief, i.e., we assume that the likelihood of any state t₀ is the same as that of any other state t. Then, as the robot starts to move and generate LGMD spikes, we apply the Bayesian state update equations to determine the new belief. Now, as per Bayes’s rule:

    P(t|f) = P(t) P(f|t) / P(f)

P(t) is the prior, which we obtain from the current belief. P(f) can be “absorbed” into a normalizer η. This leaves us one remaining term to work out: P(f|t).

The above term, P(f|t), is the so-called causal information because it tells us what value of t causes the observed spike rate f. We can use equation (2.1) to construct a table of such causal probabilities. Figure 2.6 provides a conceptual illustration of what such a table looks like.
The column vectors of this table correspond to the different time-to-impact values as determined by the TTI discretization. The rows correspond to different ranges of the LGMD readings.

To generate appropriate probabilities for each cell in this table, we apply equation (2.1) for each TTI value t to obtain a corresponding LGMD value f. We first assign some arbitrary number to the cell corresponding to (f,t). The remaining cells in the column indexed by t are then assigned decreasing numbers depending on how far away they are from the (f,t) cell. These numbers are computed according to a Gaussian weighting formula. Finally, we normalize the entire column vector corresponding to t to ensure that it represents a valid probability distribution.

Algorithm 2.3 shows the exact steps we used to generate the table of causal probabilities, which we represent as a two-dimensional array containing m rows and n columns, where m and n depend on the exact discretization applied to the LGMD and TTI ranges respectively. By default, n is 100 because we discretize the time-to-impact from zero to ten seconds in steps of 0.1s. However, Lobot’s control software actually allows any kind of discretization. For example, if we configure it to use TTI’s from zero to 20s in steps of 0.5s, n would be 40 rather than 100.

Figure 2.6: A conceptual illustration of the Bayesian state estimator’s sensor model containing causal probabilities P(f|t), where f is the LGMD firing rate and t the corresponding time-to-impact. When the LGMD sensor returns a new observation f, the state estimator will select the corresponding row from the sensor model’s table and combine it with the current belief to produce the posterior distribution (which will become the prior for the next iteration).
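The Gaussian weighting and normalization of a single column can be sketched directly in C++; m, r (the row of the (f,t) cell) and σ are passed in, and the names are ours rather than Lobot’s:

```cpp
#include <cmath>
#include <vector>

// Fill one column of the causal probability table: cell weights fall off
// with a Gaussian centred on row r, and the column is then normalized so
// that it forms a valid probability distribution.
std::vector<double> fill_column(int m, int r, double sigma)
{
    std::vector<double> col(m);
    double eta = 0.0;
    for (int i = 0; i < m; ++i) {
        double z = (r - i) / sigma;
        col[i] = std::exp(-0.5 * z * z); // Gaussian weighting
        eta += col[i];
    }
    for (double& p : col)
        p /= eta; // normalize the column
    return col;
}
```

The resulting column peaks at row r and decays symmetrically away from it, matching the description above.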
Algorithm 2.3 generate-sensor-model

    1   M = new m×n 2D array
    2   for all t ∈ {.1, .2, .3, ..., 10}
    3       c = column index corresponding to t
    4       r = row index corresponding to f(t)      // equation (2.1), page 18
    5       η = 0
    6       for i ∈ {0, 1, 2, ..., m−1}              // zero-based array indices
    7           M_ic = exp(−½ ((r−i)/σ)²)            // Gaussian weighting
    8           η += M_ic
    9       for i ∈ {0, 1, 2, ..., m−1}
    10          M_ic /= η
    11  return M

Lines 3 and 4 of algorithm 2.3 use appropriate functions to determine the row and column indices of the (f,t) cell. The exact details of these two functions are not important; suffice it to say that they incorporate Lobot’s configuration settings into their computations. Lines 5–8 fill the column corresponding to t. Each cell is assigned a number based on how far away it is from the (f,t) cell. Line 7 in particular is the one that applies the Gaussian weighting formula mentioned earlier. Finally, lines 9 and 10 normalize the values in the column to ensure they constitute a valid probability distribution.

Once we have the sensor model properly set up, in each iteration of the state estimation loop, we use the latest LGMD reading to select the appropriate row vector and update the current belief to produce the desired posterior like so:

    ∀t: bel(t_{n+1}) = η P(f_n | t_n) bel(t_n)        (2.4)

Although this should work without any problems, in practice, we found that the belief would often reach 100% for some state and 0% for all the others. Once the estimator achieved such a configuration, it would get stuck and end up ignoring further observations, thereby diverging from the true state in subsequent iterations. To solve this problem, we added a uniform distribution after applying the Bayesian state update shown in equation (2.4). In effect, this additional step helps us foster a healthy skepticism toward the LGMD sensor model and allows us to include random and other effects that are not and cannot be explicitly modeled by our algorithm.
Once we have computed the updated belief, we can use the state t that has the maximum likelihood to estimate distances to approaching obstacles by combining the estimate of the time-to-impact with the robot’s odometry. Repeating this procedure for all the virtual locusts will give us TTI and distance estimates for the entire locust array. This information can then be used to implement a variety of obstacle avoidance algorithms.

2.3.1 Sensor Model Split

To avoid getting bogged down in minutiae, the preceding discussion omitted a few details. Specifically, we use two tables of causal probabilities, one for the looming phase of the LGMD signal and the other for the blanking phase (see figure 1.2, page 4 for an illustration of these two phases). That is, we split the table shown in figure 2.6 into two parts and use one or the other likelihood profile depending on the input signal’s current phase. The table is split based on the value of δ, since it controls where the peak LGMD firing rate occurs. Thus, in a sense, δ is the parameter that partitions the LGMD spike rate curve into its two phases.

The reason we split the sensor model table is as follows. Around δ, the probabilities tend to be almost equal for the nearby states, which can confuse the filter and make it choose the wrong state. To help clarify this point, let us take a concrete example in conjunction with the sensor model depicted in figure 2.6. Say that δ = 2.5s. Further assume that the LGMD of one of the virtual locusts is firing near its peak rate, e.g., 650Hz, and that the Bayes filter has reached a state wherein the most likely time-to-impact is t = 2.5s. As the approaching object gets closer, the blanking phase of the LGMD will kick in and its firing rate will drop, say, to 550Hz. The filter will respond to this new sensor reading by picking the third row vector in figure 2.6. Notice that the probabilities immediately surrounding t = 2.5s are identical.
Thus, when the filter performs its update step, it could end up at either t = 2.4s or t = 2.6s.∗

We found that the filter often worked without getting confused in the manner described above. However, there were situations in which the filter chose the wrong state near these crucial points so that, in subsequent iterations, its TTI estimates increased, indicating an object receding from the robot, when, in fact, collisions were imminent. Since reliable obstacle avoidance is a fundamental requirement for any mobile robot, we considered the above “lapses in judgment” to be a serious problem. Our solution was to split the table of causal probabilities and recognize the current phase of the LGMD signal to decide which likelihood profile to use.

∗ Figure 2.6 actually presents a slightly simplified view of reality for illustrative purposes. In practice, the likelihood values for the states near δ are usually not identical. However, they do tend to be fairly close.

Figure 2.7: The TTI estimator stores its belief about the current state in a single dimensional array, using a simple mapping between array indices and times-to-impact. Each element of the array holds a probability indicating the likelihood associated with that element’s corresponding time-to-impact. All the probabilities sum to unity.

The TTI estimator starts off using the table corresponding to the looming phase. It then tracks the LGMD spike rate, taking the second derivative at each step to detect the signal peak. When the second derivative becomes negative, we have found an inflection point indicating a local maximum (see, for example, Anton et al., 2002, chap. 4). The estimator then switches to the blanking phase’s table. When the spike rate drops below some predetermined threshold, the estimator switches back to the looming phase’s table.
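The switching logic just described amounts to a small state machine. A sketch, with names of our own choosing (the rising and falling thresholds correspond to the Γ_r and Γ_f parameters of table 2.2, and the second derivative is assumed to have been smoothed already):

```cpp
enum Phase { LOOMING, BLANKING };

// Decide the next phase from the current phase, the latest spike rate f,
// and the (smoothed) second derivative of the spike rate signal.
Phase next_phase(Phase current, double f, double second_derivative,
                 double rising_threshold, double falling_threshold)
{
    if (current == LOOMING) {
        // Only administer the second derivative test above the rising
        // threshold, so that noise-induced local maxima are not
        // mistaken for the signal peak.
        if (f > rising_threshold && second_derivative < 0.0)
            return BLANKING;
    } else {
        if (f < falling_threshold)
            return LOOMING;
    }
    return current;
}
```

Each virtual locust’s estimator would carry its own phase variable and consult the table matching that phase on every update.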
Since the input LGMD readings can be noisy, the derivatives of this signal will tend to be noisier still. Therefore, we use a weighted moving average of the five most recent second derivatives to avoid sudden, spurious switches between the two tables. We also use a threshold to decide when to actually administer the second derivative test. Otherwise, the filter could easily misconstrue a local maximum caused by sensor noise as the signal peak.

2.3.2 Implementation Details

Algorithm 2.4 presents a simplified version of the actual C++ code for the TTI estimator’s update function. The bits related to multithreaded synchronization, visualization and some internal bookkeeping have been excised as they are not germane to our discussion here.

Algorithm 2.4 The TTI Estimator’s update function

    1   void TTIEstimator::update()
    2   {
    3      // Use correct sensor model for current phase of LGMD signal
    4      switch (lgmd_phase())
    5      {
    6         case LOOMING:
    7            if (m_lgmd[0] > Params::rising_threshold() &&
    8                m_sder.value() < 0)
    9               m_sensor_model = &blanking_sensor_model();
    10           break;
    11        case BLANKING:
    12           if (m_lgmd[0] < Params::falling_threshold())
    13              m_sensor_model = &looming_sensor_model();
    14           break;
    15     }
    16
    17     using namespace boost::lambda;
    18
    19     // Multiply individual probabilities of current belief with
    20     // corresponding probabilities in appropriate column vector of
    21     // sensor model.
    22     Belief tmp(m_belief.size(), 0.0f);
    23     std::transform(m_belief.begin(), m_belief.end(),
    24                    m_sensor_model->column_vector(m_lgmd[0]).begin(),
    25                    tmp.begin(), _1 * _2);
    26
    27     // Normalize intermediate belief obtained above
    28     float norm = std::accumulate(tmp.begin(), tmp.end(), 0.0f);
    29     std::transform(tmp.begin(), tmp.end(), tmp.begin(), _1/norm);
    30
    31     // Add uniform distribution to above result
    32     float u = 1.0f/tmp.size();
    33     std::transform(tmp.begin(), tmp.end(), m_belief.begin(),
    34                    (_1 + u)/2);
    35  }

The belief is represented as a single dimensional array (see figure 2.7), contained in an std::vector object from the C++ standard library (Stroustrup, 2000). Each TTIEstimator object is attached to a virtual locust. Before TTIEstimator::update() gets called, the current LGMD spike rate will be copied from the virtual locust to the estimator object. This copying function takes care of computing and storing the second derivative of the LGMD signal. The current LGMD spike rate is stored in m_lgmd[0] and the second derivative in m_sder, an object encapsulating the weighted moving average computation and whose value() function returns the latest filtered value of the second derivative.

Thus, lines 7 and 8 in algorithm 2.4 check if the current LGMD spike rate exceeds a predetermined threshold and if the signal has achieved a local maximum. If so, the estimator alters its internal state to use the blanking phase’s sensor model table (line 9). On the other hand, if the estimator is currently using the blanking phase’s sensor model and the spike rate drops below its preset falling threshold, it switches back to the looming phase’s table of causal probabilities (lines 11–14).

Once the estimator has set up the proper likelihood profile corresponding to the current LGMD phase, it finds the column vector within the sensor model table indexed by the current LGMD spike rate, multiplies the individual probabilities of that vector with those of the current belief (lines 22–25) and then normalizes the resulting temporary belief (lines 28–29).

Note that figure 2.6 and its accompanying explanation depict the sensor model table such that its rows are indexed by the LGMD spike rate range and its columns by the discretized times-to-impact. The C++ implementation, however, arranges the sensor model table so that its columns correspond to LGMD spike rate ranges and its rows to the times-to-impact. In effect, the implementation transposes the table.
This does have some implications for algorithm 2.3, which was put together chiefly to show how the sensor model is constructed rather than to present a perfectly accurate picture of the system’s innards. However, it does not affect the Bayesian state update operation in any way whatsoever.

We chose the “transposed” implementation based on the treatment by Thrun et al. (2005, chap. 2). However, the explanation of the sensor model is easier to follow as shown in figure 2.6 because, moving left to right, the likelihood profile’s shape resembles that of the familiar LGMD spike rate curve, making the relationship between the two more readily apparent.

The final step of the TTI estimator’s Bayesian update, lines 32–34 in algorithm 2.4, is to add a uniform distribution so as to prevent the filter from getting stuck in some state as explained earlier.

2.3.3 Estimator Parameters

Lobot’s software allows users to configure many different aspects of LGMD spike generation, time-to-impact estimation and other things that it controls. In table 2.2, we document the various settings that we used for the Bayesian TTI estimator.

As we have already seen, our Bayes filter maintains a probability distribution over all the possible times-to-impact. We discretize this continuous state variable so that it goes from zero to ten seconds in steps of a tenth of a second. T is the parameter that holds this discretization. We show it as a list of states in table 2.2. However, in the configuration file, it is actually specified using three numbers, viz., the TTI range’s start and end values and a step size.

Section 2.3.1 explained the rationale behind splitting the table of causal probabilities and why the state nearest to the δ parameter (table 2.1, page 25) is the logical place to perform the split.
Thus, the looming phase’s sensor model will hold causal probabilities corresponding to TTI values in the range [δ,10], which is the value of the T L parameter, and the blanking phase’s sensor model will hold causal probabilities corresponding to TTI values in the range [0,δ], which is the value of the T B parameter. Since we varied δ across different experiments, we also had to update T L and T B to follow suit. 35 Name Description Units Value T TTI discretization s {.1, .2, .3, ..., 10} T † L TTI range for looming phase s [δ,10] T † B TTI range for blanking phase s [0,δ] F L LGMDfiringraterangesforlooming phase Hz {0, 50, 100,200,300, 400,500,600,650,800} F B LGMD firing rate ranges for blank- ing phase Hz {0, 100,200,300 400,500,600,800} σ L Std. dev. for Gaussian weighting of looming sensor model bins 1.5 σ B Std. dev. for Gaussian weighting of blanking sensor model bins 1 Γ r Rising threshold (for detecting tran- sitions from looming to blanking) Hz 600 Γ f Fallingthreshold(fordetectingtran- sitions from blanking to looming) Hz 100 S Filter size for second derivative weighted moving average 5 Γ c Confidencethresholdfordistancees- timates .1 D Range for distance estimates mm [50, 5000] Table2.2: ThesearethevaluesweusedforthedifferentparametersgoverningtheBayesian time-to-impact state estimator. Parameters marked with a ‘†’ were the only ones that we varied. The others remained fixed across all the experiments and tests we conducted. 36 We should point out here that we could have simply used the δ parameter’s value from the spike generation module’s settings. However, we consider LGMD spike generation to be a black box of which the other parts of the controller know nothing. This abstraction facilitates easy switching between alternate spike sources. For instance, we anticipate being able to use the controller as-is when attached to real locusts. 
In that situation, we will not be able to control δ but would have to rely on an empirically determined value consistent with the underlying electrophysiology.

Furthermore, rather than using a single parameter to decide where to split the sensor model, we use two, viz., T_L and T_B, so that we may specify slight overlaps if deemed necessary. For example, instead of having T_L = [δ, 10] and T_B = [0, δ], we could use T_L = [δ−.5, 10] and T_B = [0, δ+.5]. Again, we anticipate this flexibility to pay off when we deploy Lobot's controller in conjunction with actual locusts.

The next two parameters, viz., F_L and F_B, specify the ranges for the LGMD firing rate. In figure 2.6 (page 29), they would select the appropriate row vector from the sensor model. However, as explained in section 2.3.2, the implementation actually transposes the table shown in figure 2.6. Thus, F_L and F_B select the relevant column vectors from the causal probability tables (see line 24 in algorithm 2.4).

Lobot interprets F_L and F_B in pairs. For instance, with F_L as shown in table 2.2, it creates a sensor model table for the looming phase containing 9 columns. The first column will be for LGMD sensor readings in the range [0, 50); the second column for LGMD sensor readings in the range [50, 100); the third for [100, 200); and so on. F_B is interpreted in the same way.

Algorithm 2.3 (page 29) presented a conceptual outline of how the TTI estimator generates the causal probability tables it needs. As noted above, the actual implementation is somewhat different (because of the transposition of the table from what is depicted in figure 2.6).
However, those details need not detract from the basic idea, viz., to apply equation (2.1) (page 18) and find the cell within the table corresponding to some t and f(t), put an arbitrary number there, update the remaining cells corresponding to t according to a Gaussian weighting formula and, finally, normalize the resulting list to obtain the desired causal probabilities.

The σ_L and σ_B parameters specify the standard deviations for the above-mentioned Gaussian weighting. Apropos figure 2.6, higher values for these parameters will produce blurrier likelihood profiles wherein the colors are more diffuse; lower σ_L and σ_B will result in sharper profiles with clear and distinct boundaries between the colors.

In order to determine when to switch between the likelihood profiles for the two phases of the LGMD signal, the TTI estimator uses spike rate thresholds, viz., Γ_r and Γ_f. The call to Params::rising_threshold() on line 7 of algorithm 2.4 returns Γ_r and line 12's call to Params::falling_threshold() returns Γ_f. When the input spike rates being read from the LGMD sensor exceed Γ_r, the filter uses the weighted moving average (WMA) of the most recent second derivatives to find the signal peak. We have found a window of the five most recent second derivatives to provide good performance. However, the software allows us to change the WMA filter size via the S parameter.

Finally, the filter's TTI estimates are combined with the robot's encoder data to yield distances to approaching obstacles. However, the filter will only produce such estimates when the probability of the most likely state exceeds the value of the Γ_c parameter. Furthermore, all distance estimates are clamped to the range specified by the D parameter.

2.4 Testing the LGMD Sensor Model

We have seen how Lobot uses its laser range finder and wheel encoders to calculate the time-to-impact in different directions and, thereby, generate artificial LGMD spikes.
We have also seen how it uses Bayesian state estimation to recover the time-to-impact from LGMD spikes. We would now like to see how accurate Lobot's TTI estimates are and how well they hold up to increasing levels of sensor noise. Additionally, we would like to characterize Lobot's performance w.r.t. δ and the robot's speed s.

Figure 2.8: The robot was repeatedly driven 2.5m straight toward a wall. Only one virtual locust was active. [Schematic showing the robot with its LRF facing a wall 2.5m away.]

Parameter | Units | Values
δ         | s     | 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00
σ         | Hz    | 0, 25, 50, 100
s         | m/s   | 0.1, 0.2, 0.3, 0.4

Table 2.3: The values of the variable parameters used for testing the Bayesian time-to-impact estimator.

We are interested in δ's effect on state estimation because, in real locusts, the LGMD spike rate peak occurs very close to collision. By varying δ and observing the robot's response, we hope to be able to glean useful information regarding the viability of live LGMDs as collision detection sensors for a mobile robot. As for s, because the LGMD produces more peaked and easily recognized response patterns for fast-moving objects, we wish to investigate the extent to which this variable affects the time-to-impact estimation.

2.4.1 Experimental Methodology

Figure 2.8 shows the test design for gauging the TTI estimator's performance. Table 2.3 lists the different values we used for the variable parameters. Recall that δ controls how far away from collision the LGMD spike rate peaks and σ specifies the standard deviation of the Gaussian noise injected into the spike rate generator's output (see section 2.2.1).

We ran ten individual experiments for each combination of values of δ, σ and s. Since we used eight values for δ, four for σ, and four robot speeds, we ran a total of 10 × 8 × 4 × 4 = 1280 individual experiments. Each run involved driving the robot straight toward a wall at the designated speed for that experiment. It would always start from the same spot and drive a distance of approximately 2.5m.
Although it was controlled manually after a run to bring it back to its start point, the robot operated autonomously when driving forward toward the wall. It was configured to come to a halt just short of the wall. Most runs ended with the robot stopping less than 50mm from the wall. The problematic part was getting it to drive straight, especially at 0.4m/s. We discarded those runs in which Lobot veered significantly off-course.

2.4.2 Performance Metrics

With these experiments, we wanted to test just the Bayesian time-to-impact state estimator, to see how well the TTI predictions for an individual virtual locust would match the ground truth measured by the laser range finder and odometry. Therefore, we turned off all of the thirteen virtual locusts shown in figure 2.4(a) (page 21) except for the one looking straight ahead along 0°.

During each experiment, as the robot drove toward the wall, every few milliseconds, it recorded the actual time-to-impact, the current LGMD spike rate, the predicted time-to-impact, the prediction confidence and several other relevant parameters to a log file. Here is an example of one such log entry, taken from a log file for an experiment where δ = 1.00s, σ = 50Hz and s = 0.2m/s:

    1285645738335 lgmd_extricate_tti TTI estimator info:   \
        current robot speed      = 0.194 m/s               \
        locust direction         = 0 degrees               \
        LGMD spike rate          = 109 Hz                  \
        actual time-to-impact    = 1.8 seconds             \
        predicted time-to-impact = 2.0 seconds             \
        actual distance          = 350 mm                  \
        predicted distance       = 388 mm                  \
        prediction confidence    = 10.0%

Each entry consists of several lines. The first line starts off with a large number, which is a time stamp expressed as the number of milliseconds since midnight, January 1st, 1970, followed by some labels from the Lobot behaviour outputting these messages. The subsequent lines in the entry contain the data in which we are interested.
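Extracting a field from entries of this shape amounts to locating the field's label and reading the number after the '='. A minimal sketch (the function name and parsing strategy are ours, not the actual analysis program's):

```cpp
#include <sstream>
#include <string>

// Pull one numeric field out of a log entry line such as
// "actual time-to-impact = 1.8 seconds". Returns false if the named field
// is not on this line or no number follows the '='.
bool parse_field(const std::string& line, const std::string& name,
                 double& value)
{
   std::string::size_type pos = line.find(name);
   if (pos == std::string::npos)
      return false;
   pos = line.find('=', pos + name.size());
   if (pos == std::string::npos)
      return false;
   std::istringstream is(line.substr(pos + 1));
   return static_cast<bool>(is >> value); // trailing units are ignored
}
```

A caller would apply this line by line, matching on labels such as "actual time-to-impact" and "prediction confidence" to assemble one record per entry.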
Once all the experiments were done, the log files were parsed by a custom analysis program that extracted and combined the pertinent pieces of information from this raw data to help produce the plots presented in section 2.5.

2.4.3 Data Analysis

Since the TTI prediction model is configured to maintain a probability distribution over 0 to 10 seconds in steps of 0.1s, we only considered those log entries that record an actual time-to-impact of 10s or less.

Now, across a dataset, we can have multiple log entries for the same TTI reading. For instance, the example log entry shown earlier was for t = 1.8s. Some other experiment in the same dataset (i.e., δ = 1.00s, σ = 50Hz and s = 0.2m/s) may also contain an entry for t = 1.8s. In fact, it is highly likely that several of the ten experiments in each dataset will have such "repeated" entries.

Therefore, we average the model's predictions across all these repeats to obtain the final predicted value. That is, if we have five experiments in a dataset with entries for t = 1.8s, we compute the mean and standard deviation of all the five predicted TTIs corresponding to the actual time-to-impact of 1.8s and take the average as the final result. We do the same for the prediction confidences and distance readings. These averaged values are then plotted to help characterize the Bayesian TTI prediction model's performance.

To further gauge how well the TTI state estimator works, we define the overall prediction error for a dataset as:

    E = \sqrt{ \sum_t \left( \frac{t - \bar{p}_t}{t} \right)^2 }        (2.5)

where

    t         = the actual time-to-impact
    \bar{p}_t = the averaged prediction corresponding to t

That is, the overall error for a dataset is the square root of the sum of the squares of the individual error ratios. We plot this quantity w.r.t. δ, σ and s to see how the prediction accuracy varies with these parameters.

2.5 Results

Figure 2.9 shows the plots of the LGMD spike rate and predicted time-to-impact versus the actual time-to-impact for δ = 0.25s.
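The averaging of repeated entries and the error metric of equation (2.5) reduce to a short computation; the container layout below (actual TTI mapped to the predictions from each repeat) is our assumption for illustration.

```cpp
#include <cmath>
#include <map>
#include <vector>

// Sketch of equation (2.5): for each actual TTI t, average the predictions
// across repeated log entries to get p_bar(t), then accumulate the squared
// relative errors ((t - p_bar)/t)^2 and take the square root of the sum.
double overall_error(const std::map<double, std::vector<double>>& repeats)
{
   double sum = 0;
   for (const auto& entry : repeats) {
      const double t = entry.first;
      double mean = 0;                      // average across the repeats
      for (double p : entry.second)
         mean += p;
      mean /= entry.second.size();
      const double ratio = (t - mean)/t;    // relative error at this t
      sum += ratio * ratio;
   }
   return std::sqrt(sum);
}
```

Using relative rather than absolute errors keeps a 0.2s miss at t = 0.5s from being weighed the same as a 0.2s miss at t = 9s.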
The plots for the remaining δ values are continued on subsequent pages. Due to space constraints, we show the 16 different plots corresponding to the 4 different noise profiles and 4 robot speeds used for testing on each page. This means that each plot can only occupy a small area and cannot show much detail. Therefore, for the sake of legibility, we omit the error bars from these plots.

Figure 2.10 on page 52 shows the overall prediction errors for each dataset (see equation (2.5)). Note that the absolute values of E are not really all that important; we are more interested in observing the trends in the overall accuracy of the Bayesian TTI estimator w.r.t. δ, σ and s.

2.6 Discussion

2.6.1 The Effect of δ

Although there are some anomalies, from figure 2.10, it is clear that, in general, the time-to-impact predictions exhibit better convergence as we increase δ. This happens because, at higher values of δ, the LGMD spike rate peaks occur further away from collisions. Consequently, the LGMD starts to fire more rapidly early on during obstacle approach.

Now, looking at figure 2.3 on page 19, we see that in the looming phase, the LGMD fires relatively weakly. Then, a little while before δ, it experiences a sudden and rapid upswing in spiking. Notice that in each plot in figure 2.9, the TTI prediction curve starts off relatively far away from the dashed diagonal line corresponding to zero error. It drops toward the zero-error line at about the same time the above-mentioned upswing kicks in.

In other words, the TTI estimator works best within the "steepled" region of the LGMD spike rate curve near δ. Thus, if δ is small, the steeple will occur later during obstacle approach and last a very short while, thereby degrading the estimator's overall performance. This has two important implications:

• Proper obstacle avoidance requires good range estimates. In a very crowded environment, we might have little choice but to set δ to a low value.
Thus, we expect LGMD-based avoidance to be better suited to sparsely populated areas than densely packed ones.

• In real locusts, the LGMD spike rate peak tends to occur in the neighbourhood of 200ms. As our TTI estimator works better when the peaks are nearer 1–2s, we will have to find some means of artificially increasing δ when we are dealing with live LGMDs.

[Figure 2.9: sixteen panels per page (four noise levels: none, 25Hz, 50Hz, 100Hz; four robot speeds: 0.1, 0.2, 0.3 and 0.4m/s), each plotting the LGMD spike rate (0–800Hz) and the predicted TTI (0–10s) against the actual TTI (0–10s), with prediction confidence shown on a 25%/50%/75% color scale; the continuation pages for δ = 0.50s through 2.00s are omitted here.]

Figure 2.9: Plots of LGMD spike rate and predicted TTI vs. actual TTI. This one is for δ = 0.25s; subsequent pages show the plots for δ = 0.50, 0.75, ..., 2.00. The solid black curve is the LGMD spike rate. The solid colored curve shows the predicted TTI. The color of this curve indicates the confidence level associated with the prediction; redder points indicate lower confidence while greener points indicate higher confidence. The dashed diagonal line from the top-left corner of each plot to its bottom-right corner denotes the points where the predicted TTI coincides perfectly with the actual TTI. The closer the colored curve is to the dashed line, the more accurate the model's prediction.

[Figure 2.10: sixteen panels (four noise levels × four robot speeds) plotting the overall prediction error E against δ; each panel is annotated with its Pearson's r, which ranges from +0.06 to −0.99 across the panels.]

Figure 2.10: These plots show the overall prediction error E, as defined by equation (2.5), versus δ for the four noise levels and robot speeds. In each plot, we specify the value of Pearson's r (or coefficient of correlation; see Bartz (1988, chap. 8)) to quantify the strength of the relationship between E and δ. The exact values of E are of less concern than the trends w.r.t. δ, σ (LGMD noise) and s (robot speed). Since most of the plots exhibit moderate to high negative values of r, we can conclude that, in general, prediction errors go down as we increase δ.
We might, for instance, have to place small telescopes in front of the locusts' eyes to make things appear closer than they are.

Another issue with low δ values is that the LGMD spike rate curve fails to achieve its characteristic peak and decay pattern. Instead, the spike rate starts to rise but is then cut short before the blanking phase even begins. This is due to the robot's emergency stop behaviour,∗ which halts the robot when things get too close. As we saw in algorithm 2.1 (page 24), when the robot stops moving, spiking activity is shut down. However, when δ is further away from the actual point of collision, the LGMD model's spike generation will not be preempted by the emergency stop behaviour, thus allowing it to fully realize both its looming and blanking phases.

In Lobot's current incarnation, wherein the laser range finder is the primary sensor, the lack of a blanking phase does not present any insurmountable difficulties. When the LGMD-based behaviours fail to ward off a peril, other behaviours, relying on the laser range finder, are able to rescue the robot and set things right again so that all of the LGMD functionality can, in effect, start anew.

However, if we did not have a traditional range sensor on-board and were instead relying solely on visually generated LGMD-based range sensing, a blanking phase would be crucial because, as explained in section 2.3.1, we split the sensor model in two and recognize spike rate peaking as the trigger for switching between the two likelihood profiles. Without the blanking, the robot would fail to detect imminent collisions, incorrectly interpreting declining LGMD spikes as an object receding from the robot. Again, this underscores the need for controlling δ in such a way that the LGMD spike rate peaks occur well before collisions.
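The switching just described boils down to a two-state machine driven by the rising and falling thresholds (Γ_r and Γ_f in table 2.2). The sketch below simplifies the peak test to a plain threshold crossing — Lobot additionally requires a local maximum detected via the second derivative's WMA — and all names are illustrative.

```cpp
// Minimal sketch of the looming/blanking phase switch. Without a blanking
// phase ever being entered, declining spike rates are indistinguishable
// from a receding object, which is the failure mode discussed above.
enum class Phase { LOOMING, BLANKING };

class PhaseSwitch {
   Phase  m_phase = Phase::LOOMING;
   double m_rising;   // Gamma_r: looming -> blanking transition threshold
   double m_falling;  // Gamma_f: blanking -> looming transition threshold
public:
   PhaseSwitch(double rising, double falling)
      : m_rising(rising), m_falling(falling) {}

   Phase update(double lgmd_spike_rate) {
      if (m_phase == Phase::LOOMING && lgmd_spike_rate >= m_rising)
         m_phase = Phase::BLANKING;  // spike rate peaked: switch tables
      else if (m_phase == Phase::BLANKING && lgmd_spike_rate <= m_falling)
         m_phase = Phase::LOOMING;   // signal has decayed: back to looming
      return m_phase;
   }
};
```

With Γ_r = 600Hz and Γ_f = 100Hz as in table 2.2, a signal that is cut short below 600Hz never reaches the blanking state, so the estimator keeps applying the looming-phase likelihoods.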
[Figure 2.11: block diagram of the proposed feedback arrangement — the raw LGMD input passes through an LGMD filter into the TTI estimator, which outputs the time-to-impact and feeds back into the filter.]

Figure 2.11: To clean up the raw LGMD input, we could feed back the output of the TTI estimator, which, in turn, could use the filtered (and raw) LGMD inputs to make better time-to-impact estimates. The two filters could be used like this in tandem to reinforce and improve each other's beliefs about the robot's true state.

2.6.2 The Effect of Noise

As is to be expected, the predictions become more erratic and less reliable with increasing noise in the spikes. Nonetheless, even under these circumstances, the TTI estimator does converge with reality near collisions, albeit with a greater margin of error.

We should point out here that we did not attempt to filter the input LGMD signal in any way because we wished to ascertain the robustness of our time-to-impact prediction model. However, in a "real" system, it would be reasonable to expect the input LGMD spikes to undergo some initial filtering before being fed into the higher layers of the robot's controller. For example, with real locusts on-board, the raw electrophysiological data will first be run through some low-pass filters before being delivered to the robot's high-level computational unit. Spikes generated using cameras would also first be cleaned up in a similar manner. Such low-level filtering, though not a panacea, will certainly help maintain the TTI estimator's performance within the bounds of its noise tolerances.

Alternatively, one possibility for filtering LGMD input that might be worth investigating would be to feed back the output of the TTI estimator as shown in figure 2.11. The LGMD filter, like the TTI estimator, could also be Bayesian in nature, maintaining a probability distribution over the possible LGMD spike rates.

∗ Chapter 3 will explain Lobot's behaviour-based control system in detail.

2.6.3 The Effect of Robot Speed

This parameter, s, is not as influential as δ or σ.
However, from the relevant plots in figure 2.9, we note that at lower speeds, even when δ is high, we do not obtain the characteristic steeple shape of the LGMD spike rate curve. We have already discussed why it is imperative to obtain a "proper" looming–peak–blanking response from the LGMD. Therefore, here, we merely highlight the fact that when the robot is used with real locusts, since their δ's are inherently small, we can only expect low speed to further exacerbate the situation, thus bearing out our earlier hypothesis regarding the necessity of a high-speed robot platform.

2.6.4 The Locust Array as a Range Sensor

Section 2.2.1 explained the intent behind the radial arrangement of virtual locusts, which was to mimic a sonar ring. Informal tests conducted during development revealed a serious shortcoming of this LGMD-based range sensor: poor peripheral vision.

The problem arises due to the projection of the velocity vector. As we project this vector along the lines of sight of locusts further and further away from the robot's current heading, it will become smaller and smaller. Because time = distance/speed, as the speed goes down, the time-to-impact for these virtual locusts will become larger. Thus, when we apply equation (2.1), the spike rates for the locusts on the sides will be lower than those of the central ones, which, in general, indicates objects further away than they truly might be.

This is not a limitation of the LGMD spike generation model. Rather, it is an unfortunate side effect of the method chosen to calculate times-to-impact, which was coupled to the robot's encoder data. If we had, instead, decided to track objects within each virtual locust's field of view using the laser range finder, we could have computed the desired TTIs independently of the robot's motion and avoided the degradation in the peripheral responses of the LGMD array.
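The peripheral-vision effect can be made concrete with a few lines of arithmetic: the approach speed seen by a virtual locust is the robot's speed projected onto that locust's line of sight, so the computed TTI grows with the angle from the heading. The function name and parameters below are illustrative.

```cpp
#include <cmath>

// Why the radial locust array under-reports obstacles in the periphery:
// the robot's velocity is projected onto each virtual locust's line of
// sight, so off-axis locusts see a smaller approach speed and, therefore,
// a larger time-to-impact (and hence a lower LGMD spike rate).
double time_to_impact(double distance,     // along the locust's LOS (m)
                      double robot_speed,  // from wheel encoders (m/s)
                      double los_degrees)  // locust's angle from heading
{
   const double PI = 3.14159265358979323846;
   const double projected_speed =
      robot_speed * std::cos(los_degrees * PI/180);
   return distance/projected_speed;  // grows without bound as angle -> 90
}
```

At 60° off-axis the projected speed is halved, so an obstacle at the same range appears twice as far away in time as it does to the forward-looking locust; at 90° the projection is zero and the TTI is effectively infinite.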
However, in light of our intent to transition to visual input as the primary sensing modality (using cameras as well as real locusts), we deemed this issue to be a low-priority concern.

A more serious by-product of the coupling between robot motion and LGMD spiking calculations is the fact that there will be no spikes when the robot is stationary, even if there exists relative motion between the robot and objects in its environment. We will see in subsequent chapters that this does have slightly unpleasant consequences when it comes to testing the LGMD-based obstacle avoidance algorithms. Nonetheless, because we were more interested in first developing and testing the afore-mentioned algorithms and because we intend to eventually generate LGMD spikes using a vision-based approach, wherein the robot's motion would be immaterial, we decided against pursuing a full-fledged solution to this problem for the current system.

Chapter 3

Robot Controller Architecture

Whether LGMD spikes are generated artificially or from real locusts, ultimately, the Robolocust project's goal is to use these spike trains for obstacle avoidance and local navigation. However, before we get into that area, it would be useful to first understand how the different parts of Lobot's behaviour-based control system work.

3.1 Hardware Platform

We constructed our simulation testbed (figure 2.1(b), page 16) from several stock components as described in table 3.1. The whole rig was based on an iRobot Create (http://www.irobot.com/create/), which features differential steering and is equipped with wheel encoders and various low-level sensors to detect bumps, wheel drops, etc.

The Create is controlled via a serial protocol referred to as the "Open Interface," which specifies the exact sequences of bytes to be exchanged between the robot's internal microcontroller and a host computer for issuing motor commands and receiving sensor data.
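As an illustration of the kind of byte sequence the OI specifies, a drive command might be assembled as below. The opcode (137) and the big-endian signed 16-bit velocity/radius encoding follow the published Open Interface documentation, but should be verified against the OI specification for the firmware in use.

```cpp
#include <cstdint>
#include <vector>

// Illustrative assembly of an Open Interface "Drive" command: one opcode
// byte followed by a signed 16-bit velocity (mm/s) and a signed 16-bit
// turn radius (mm), both most-significant byte first. Check these values
// against the OI spec for the Create firmware actually in use.
std::vector<uint8_t> oi_drive(int16_t velocity_mm_s, int16_t radius_mm)
{
   std::vector<uint8_t> msg;
   msg.push_back(137);  // Drive opcode per the OI documentation
   msg.push_back(static_cast<uint8_t>((velocity_mm_s >> 8) & 0xFF));
   msg.push_back(static_cast<uint8_t>(velocity_mm_s & 0xFF));
   msg.push_back(static_cast<uint8_t>((radius_mm >> 8) & 0xFF));
   msg.push_back(static_cast<uint8_t>(radius_mm & 0xFF));
   return msg;
}
```

An abstract command such as "drive forward at 0.2m/s" thus becomes `oi_drive(200, radius)` for a suitable turn radius, written out over the serial link.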
Rather than have our high-level system deal directly with the robot base via the Open Interface (OI), we chose to connect the main computer to the robot via the iRobot Create Command Module, an external microcontroller that interfaces with the Create's main I/O port and augments it with extra programmability and I/O capabilities. On the Command Module, we ran a custom program (which we will describe in the next section) that served as an intermediary between the higher layers of the controller and the robot's hardware, taking care of the nitty-gritty associated with translating abstract commands (e.g., turn left along a 5° arc) into the proper bits and bytes required by the OI.

(a) Sensorimotor platform:
    iRobot Create           — differentially steered robot base with encoders and sensors.
    Hokuyo URG-04LX-UG01    — USB-powered laser range finder.
    Sharp GP2D15            — two rear-mounted IR proximity detectors to avoid bumping into things while backing up.

(b) Power subsystem:
    iRobot APS              — 3000mAh NiMH rechargeable batteries for the robot base.
    Ocean Server BBDC-02R   — battery controller for ATX computer.
    Ocean Server BA95HC-FL  — two Li-Ion batteries (14.4V, 95Wh) for the main computer.

(c) Computing units:
    iRobot Command Module       — 8-bit, 18MHz Atmel ATmega168 microcontroller for talking to the iRobot Create.
    Jetway NF93R                — mini-ITX motherboard for the main computer.
    Intel Core2 Q9000           — 2GHz, low-power (45W TDP) quad-core CPU.
    Crucial CT2KIT25664AC800    — 4GB of DDR2 SDRAM for the computer's main memory.
    Emphase FD25S 10000         — 16GB 2.5" SATA solid-state disk drive.
    Qcom LR802UKG               — 54Mbps wireless USB module for connecting to a WiFi network.

Table 3.1: We built Lobot, our LGMD simulation testbed, using low-end, off-the-shelf components that we were able to assemble without having to resort to extensive customization or "hacking."

Since the Create does not feature any contact or proximity sensors on its rear, Lobot would often bump into things when it was backing up.
This was problematic because our test environment was made up of a few empty cardboard boxes that were easily moved. We took care of this issue by mounting two Sharp GP2D15 infrared proximity detectors on the robot's rear. These two sensors were considered part of the low-level system, i.e., they were managed by the above-mentioned custom Command Module program; the higher layers of the controller did not use these two rear proximity sensors.

   autoconf, automake, avr-libc, avrdude, bison, flex, freeglut3-dev, g++, gcc-avr,
   libboost-dev, libboost-filesystem-dev, libboost-program-options-dev,
   libboost-regex-dev, libdc1394-13-dev, libdevil-dev, libfftw3-dev, libglew-dev,
   libjpeg62-dev, libtool, make, subversion, tk8.4-dev

Table 3.2: A partial list of Debian packages required to build the Lobot codebase, which is a part of iLab's Neuromorphic Visualization Toolkit.

We settled on the mini-ITX form factor∗ for the robot's main computer to keep it compact and ensure that it would fit within the bounds of the Create. As we expected to run a heavily multithreaded, behaviour-based control program, we decided to use a quad-core processor. To keep the CPU power requirements within acceptable levels, we opted for the laptop version of the microprocessor. The remaining components of the robot's computer were chosen around these two criteria (viz., form factor and CPU).

3.2 Low-level Software

3.2.1 Development Platform

Lobot was developed on version 5.0 of the amd64 port of the Debian GNU/Linux operating system (http://www.debian.org/). Additionally, we used the iLab Neuromorphic Visualization Toolkit (INVT, http://ilab.usc.edu/toolkit/). In fact, Lobot's codebase is a part of INVT. Both Debian and INVT are freely available in source form. Thus, Lobot's behaviour-based controller may be downloaded and compiled to work on any robot that uses hardware equivalent to the system described in table 3.1. Furthermore, we designed Lobot's control software with extensibility in mind.
Getting it to work on a different robot equipped with slightly different sensorimotor capabilities ought to be only a matter of implementing one or two C++ classes that encapsulate these differences and present a common interface to the rest of the system.

∗ See, for example, http://en.wikipedia.org/wiki/Mini-ITX.

In table 3.2, we document some of the Debian packages required to build Lobot's software. In addition to these standard packages (which are part of the operating system), we built and installed a custom package derived from the URG library (http://www.hokuyo-aut.jp/cgi-bin/urg_programs_en/) for interfacing with the laser range finder. Download and build instructions for INVT are available on the INVT website. The above-mentioned custom version of URG is available upon request.

3.2.2 iRobot Create Command Module Code

Lobot's control software consists of two separate programs:

• The high-level controller is written in C++ and runs on the robot's main computer. It implements the behaviour-based system described in section 3.3. This program is responsible for simulating locusts, goal-directed local navigation, etc. It issues "abstract" motor commands such as "turn left 5°," "drive forward at 0.2 m/s," and so on.

• The low-level controller is written in C and runs on the iRobot Command Module. It receives motor commands from the high-level controller and converts them into appropriate Open Interface byte sequences. It also polls the Create for its sensor data, takes appropriate action in response to bumps, wheel drops, etc., and sends notification messages of these low-level actions to the high-level controller.

These two programs talk to each other via a USB connection and a simple serial protocol consisting of command messages and acknowledgement messages. Commands are fixed-length messages (4 bytes) that flow "down" from the high-level to the low-level controller.
Acknowledgements, on the other hand, are variable-length notification messages sent from the low-level controller "up" to the higher layers of the system. Figure 3.1 shows the byte order and format of command messages. Table 3.3 lists the currently recognized commands and acknowledgements that go back and forth between the high and low-level controllers.

Figure 3.1: Commands are 4-byte messages containing a motor command, a 16-bit parameter and a checksum. (Message bytes: byte 0 is the command code, byte 1 the parameter's high byte, byte 2 the parameter's low byte and byte 3 an XOR checksum.)

(a) Commands sent by the high-level to the low-level controller:
   Name       Parameter     Units
   nop        (none)
   forward    speed         mm/s
   reverse    speed         mm/s
   stop       (none)
   left       turn radius   mm
   right      turn radius   mm
   straight   (none)
   spin       angle         degrees

(b) Acknowledgements from the low-level to the high-level controller:
   Name          Num. Bytes
   ready         1
   bumps         6
   wheel drops   2
   cliffs        6
   remote        2
   sensors       54

Table 3.3: The commands and acknowledgements exchanged by the high and low-level controllers.

One minor complication associated with the low-level controller is that the Command Module sports only one UART. This means that, at any given moment, the Command Module can communicate with either the Create's internal microcontroller (using OI commands) or the high-level controller via the external USB connection. Consequently, the low-level controller must multiplex the Command Module's serial UART.

In its main loop, the low-level control program first directs serial I/O to the Create and requests the latest sensor data using the OI. It then handles any sensor events that need immediate attention. For example, wheel drops or cliffs indicate a potentially hazardous situation such as the robot being on the edge of a staircase. It would be unwise to send a notification to the high-level controller and wait for it to issue an appropriate command.
Instead, the low-level controller takes care of the problem itself and sends a notification to the high-level specifying what it did and why. Similarly, the Create's built-in bump sensor and the two add-on rear proximity detectors are handled by the low-level controller.

After low-level sensor events have been handled, the low-level controller switches the Command Module's UART to the external USB port and sends notifications of low-level actions plus the latest low-level sensor packet (which includes the encoder data). It then sends a READY acknowledgement to the high-level controller to indicate that it will wait a short duration for the high-level to send a command. If no command arrives in this period, the low-level controller will assume that the high-level controller is dead and, therefore, stop the robot. Algorithm 3.1 shows the C code making up the low-level controller's main loop.

The serial port multiplexing implemented by this loop has two important repercussions for the high-level controller:

• Firstly, it cannot send motor commands to the low-level controller whenever it pleases. Instead, it must buffer pending motor commands and send the next one in the queue in response to the READY acknowledgement.

• Secondly, if there are no pending motor commands, the high-level controller must send a NOP to have the robot continue doing whatever it is currently doing because, if the low-level controller fails to get a valid response to its READY acknowledgement within the timeout period, it will halt the robot.

3.2.3 Command Module Interface Thread

Because the Command Module controls communication to and from the robot base, the high-level controller must play a subservient role in this regard. Thus, in the higher layers of the system, we implement a thread dedicated to listening to the serial port connecting the robot's main computer to the Command Module. This thread's sole job is to wait for data to become available on the serial port and respond appropriately.
Algorithm 3.1 The Low-level Controller's Main Loop
 1  for (;;)
 2  {
 3     // Retrieve sensor data and react as required. When user
 4     // presses P on remote or Play button on robot, quit this
 5     // low-level control program.
 6     lo_sensors() ;
 7     if (lo_sensors_pending()) {
 8        react_to_sensors() ;
 9        if (lo_remote_quit() || play_quit())
10           break ;
11     }
12
13     // Send any pending acknowledgements and sensor data to high
14     // level. But before we switch UART to talk to high level, we
15     // need to ensure that drive module's acceleration/deceleration
16     // functionality doesn't try to issue drive commands while
17     // UART is switched. Otherwise, it will send drive commands to
18     // wrong destination and high level will receive what it will
19     // construe as bogus acknowledgement messages.
20     lo_clear_buffers() ;
21     lo_suspend_ramping() ;
22     lo_io_to_usb() ;
23     send_pending_acks() ; // also sends latest sensor packet
24
25     // If conditions are safe (e.g., no wheels dropped) and user
26     // isn't remote controlling robot, then retrieve next high-
27     // level command and execute it. If no command is available,
28     // stop the robot.
29     char cmd[LOBOT_CMD_SIZE] = {0} ;
30     if (ignore_high_level()) // unsafe conditions or remote ctrl
31        ; // don't send ACK_READY
32     else // no unsafe conditions detected
33     {
34        lo_tx(LOBOT_ACK_READY) ;
35        lo_rx(cmd, LOBOT_CMD_SIZE, LOBOT_COMPUTER_TIMEOUT) ;
36        if (lo_io_error() || bad_parity(cmd)) // timeout or bad data
37           cmd[0] = LOBOT_CMD_STOP ;
38     }
39
40     lo_clear_buffers() ;
41     lo_io_to_create() ;
42     lo_resume_ramping() ;
43     if (cmd[0])
44        execute(cmd[0], lo_make_word(cmd[1], cmd[2])) ;
45
46     // Short wait before next iteration
47     lo_wait(LOBOT_UPDATE_DELAY) ;
48  }

When it finds notification messages and sensor packets sent by the low-level controller, it buffers them for later retrieval by other modules. And in response to a READY acknowledgement, it sends out the next pending motor command.
Without this dedicated thread, the high and low-level portions of Lobot's controller can easily become unsynchronized. For example, if the high-level controller is busy with some other task when a READY acknowledgement comes in, it may not respond in time, which would cause the low-level controller to halt the robot. Occasional glitches of this sort can be overlooked or may not even be noticed. However, frequent occurrences of such desynchronizations will result in the robot behaving erratically or, even worse, doing absolutely nothing. Another possibility is that the serial data buffers at both ends could overflow if not read and emptied regularly. This will lead to loss of sensor packets and command messages. The cumulative effects of such losses will usually be quite undesirable. For instance, a behaviour may issue inappropriate motor commands on the basis of outdated or corrupted sensor data; a localization module could quickly go astray without reliable and timely odometry streaming in from the low-level; so on and so forth. A dedicated thread responsible for monitoring the serial connection between the high and low-level controllers avoids these problems.

When different modules in the high-level controller need to issue motor commands or retrieve the latest low-level sensor data, they use an interface object that is responsible for encapsulating the details of the underlying robot platform. For the low-level controller running on the Command Module, this interface object is implemented by the RoombaCM C++ class. Thus, to get the robot to turn, a client of the RoombaCM class would use the RoombaCM::turn() method; to get the robot to drive forwards or backwards or to make it stop, a client would use the RoombaCM::drive() method.

Internally, these functions actually buffer the motor commands with the dedicated interface thread described above, which is, in fact, implemented as a private inner class, viz., Comm, defined within RoombaCM.
This design makes the interface thread for the low-level controller, i.e., RoombaCM::Comm, completely transparent to all the high-level modules.

3.3 Behaviour-based Framework

In the behaviour-based paradigm for robot control, multiple concurrent behaviours are used to make the robot perform in the desired manner. Each behaviour is a self-contained, autonomous, goal-directed module responsible for a small part of the robot's overall operation (Arkin, 1998). Usually, a behaviour implements a single situation–action rule in a tight sensorimotor feedback loop. That is, each behaviour tracks one sensor and issues motor commands based on interpreting that sensor's measurements.

For example, we could have one behaviour that simply drives the robot forward, another that stops the robot when things get too close, and a third that decides on a suitable steering direction based on distances to nearby objects. In isolation, these behaviours will not result in particularly interesting actions. However, together, they lead to a robot that can wander about without bumping into anything.

This approach to controller design breaks the system down into manageable parts with minimal connections across the various modules. Furthermore, incremental addition of behaviours makes it easier to add new capabilities to the system one step at a time.

Figure 3.2 presents a block diagram illustrating the overall architecture of Lobot's controller. The previous section has already described the low-level controller and its interface thread. Here, we will go into the details of the higher layers.

3.3.1 The Main Thread

When the high-level controller starts up, it first reads a configuration file to see which behaviours and sensorimotor subsystems have been enabled. It then creates the various interface objects for the sensors and the robot platform based on the settings in the configuration file. For each active behaviour, it launches a new thread.
When these initialization steps are complete, the main thread enters a sensor update loop. Algorithm 3.2 shows some snippets of code from the main thread's initialization sequence. Algorithm 3.3 shows the main thread's sensor update loop. To keep the exposition focused, we have omitted error handling and other details from the above-mentioned source code listings.

Figure 3.2: The overall architecture of Lobot's behaviour-based controller. Various interface objects encapsulate and abstract the sensorimotor subsystems. In particular, the Robot interface object hides the Command Module interface thread. The main thread updates the interface objects to provide the latest sensor data to the behaviours. The behaviours, each running in a separate thread, use the sensorimotor interface objects to decide on appropriate actions. The arbiters combine the actions of competing behaviours and issue motor commands.

Algorithm 3.2 The High-level Controller's Initialization Sequence
 1  Configuration::load(config_file()) ;
 2
 3  // Create the laser range finder I/O object
 4  if (laser_enabled())
 5     m_lrf = new LaserRangeFinder(laser_device(), laser_baud_rate()) ;
 6
 7  // Create the robot interface object
 8  if (robot_enabled())
 9     m_robot = create_robot(robot_platform(), m_model_manager) ;
10
11  // Create the map object if mapping is enabled
12  if (mapping_enabled())
13     m_map = new Map() ;
14
15  // Setup the locust LGMD models
16  if (video_enabled() && video_input())
17     m_input_source = new InputSource(m_compositor) ;
18  else if (laser_enabled() && laser_input())
19     m_input_source = new InputSource(m_lrf) ;
20  if (m_input_source)
21     create_locust_models(m_input_source, &m_locusts) ;
22
23  // Start the different behaviours
24  create_behaviours(&m_behaviours) ;
25
26  // Create and configure main window
27  if (show_ui())
28     ...

Note that algorithm 3.2 makes some references to video (i.e., camera) I/O when setting up the virtual locusts. This branch of the code is currently unused as the laser range finder is Lobot's primary input source. However, by implementing an appropriate vision-based algorithm for LGMD spike generation, we can use this controller without any changes to the main thread; that is, the facility for using cameras exists within the code but is dormant.

Lines 11–13 in algorithm 3.2 are also noteworthy. They initialize the robot's localization module, which we will describe in chapter 4.

Algorithm 3.3 The Main Thread's Sensor Update Loop
 1  int update_delay =
 2     1000 * clamp(global_conf("update_delay", 100), 1, 1000) ;
 3  while (! Shutdown::signaled())
 4  {
 5     UpdateLock::begin_write() ;
 6     if (m_lrf)
 7        m_lrf->update() ;
 8     if (m_robot)
 9        m_robot->update() ;
10     std::for_each(m_locusts.begin(), m_locusts.end(),
11                   std::mem_fun(&LocustModel::update)) ;
12     UpdateLock::end_write() ;
13     usleep(update_delay) ;
14  }

In the main thread's sensor update loop (algorithm 3.3), we simply call the update() function for each sensor's interface object to retrieve the latest sensor data. Since other threads (the behaviours) also access these interface objects, we protect the update operations with a write lock.

One point that should be clarified is that figure 3.2 makes it look as if there is a direct connection between the main thread and the Command Module's interface thread. In actuality, the connection between these two threads is indirect and hidden within the call to Robot::update() in line 9 of algorithm 3.3. This function takes care of retrieving the latest low-level sensor packet from the Command Module interface thread's internal buffer as described in section 3.2.3.
All of the functionality of the main thread is encapsulated within an application object, which is an instance of the App singleton class (Alexandrescu, 2001; Eckel and Allison, 2004; Gamma et al., 1994). Apart from the initialization and update steps described above, the App object is also responsible for managing the many different objects and resources used by Lobot's high-level controller, i.e., creating them, maintaining references to them and destroying them on application shutdown. Since it is a singleton, it acts as a convenient system-wide point of contact for getting references to the different objects it manages. In particular, behaviours can simply use the App object to get at the different sensorimotor interface objects. Figure 3.3 presents a class diagram indicating the relationship between the App and sensorimotor interface objects.

Figure 3.3: Class diagram showing that the singleton App object, which encapsulates the functionality of the main thread, is responsible for managing the various sensorimotor interface objects within Lobot's high-level controller.

A Note on the Class Diagrams

The intent with these diagrams is merely to convey a basic idea about the control software's organization and not to present detailed designs. Therefore, we do not use a formalized notation such as UML.∗ Instead, we adopt a somewhat ad hoc and very simple notation, which is as follows. Each class is drawn inside a colored box: blue for abstract base classes, green for derived classes and yellow for "normal" classes. Upward pointing triangles indicate an inheritance relationship. Diamonds are for aggregation, i.e., lists of objects.

3.3.2 Sensorimotor Interface Objects

LRF

The LaserRangeFinder class (see algorithm 3.2, line 5) uses the custom-built version of the URG library mentioned in section 3.2.1 to interface with Lobot's laser range finder.
LaserRangeFinder::update() retrieves the latest range measurements and caches them internally. Behaviours can then use the get_distance() function to obtain the range reading in a particular direction.

∗ See, for example, http://en.wikipedia.org/wiki/Unified_Modeling_Language.

Figure 3.4: Class diagram showing how LocustModel is related to other classes within the Lobot high-level controller. Each LocustModel gets its input from either a camera (ImageSource) or the laser range finder.

LGMD

In section 2.2, we saw how Lobot uses its laser range finder to generate artificial LGMD spikes. All of that functionality is encapsulated by the GabbianiModel class, which is derived from LocustModel, an abstract base. Figure 3.4 presents a simplified class diagram for LocustModel and some related classes.

Line 21 in algorithm 3.2 sets up the array of virtual locusts depicted in figure 2.4(a) (page 21) by creating multiple instances of the LocustModel class. Since LocustModel is an abstract base class, the controller actually creates instances of one of the derived classes shown in figure 3.4, viz., StaffordModel or GabbianiModel, with an object factory (Alexandrescu, 2001) and the relevant settings in the Lobot configuration file.

Every class derived from LocustModel must implement the update method, which is declared pure virtual. StaffordModel::update() executes the processing pipeline shown in figure 2.2 (page 17) using input frames streaming in from multiple cameras. As we mentioned in chapter 2, we found this LGMD model insufficiently biased toward collisional stimuli, which prompted us to move to range input and the Gabbiani model. GabbianiModel::update() generates LGMD spikes by following algorithm 2.1's step-by-step procedure (see page 24).
Thus, in the main thread's sensor update loop, we first retrieve the latest range and encoder data (lines 6–9, algorithm 3.3) and then generate LGMD spikes for each virtual locust by calling GabbianiModel::update() via C++'s virtual function mechanism (lines 10 and 11). These spikes are the sensory inputs for the LGMD-based obstacle avoidance algorithms, which we shall get to in chapter 4.

Robot

The Robot class provides a common interface for interacting with the underlying mobile robot platform. This interface consists of the following motion primitives:

Drive: a speed expressed in m/s dictates how fast the robot will move; positive values will drive the robot forward, negative ones will make it back up; zero will stop the robot.

Turn: command the robot to steer toward the specified direction (expressed in degrees); 0° is for driving straight ahead; positive (ccw) angles are for heading left and negative (cw) angles for heading right.

Spin: if the robot supports in-place turns, then this command will cause it to spin in place by the specified angle; positive angles will result in counterclockwise turns, negative angles in clockwise turns.

In addition to these motion primitives, the Robot class also provides a pure virtual method for updating the low-level sensors built into the robot platform, e.g., encoders, proximity detectors, and so on. An inner class, viz., Robot::Sensors, is used to represent the latest values of these built-in sensors. Furthermore, Robot provides a callback mechanism that behaviours can use to be informed as soon as a low-level sensor packet becomes available. The localization module, for example, registers a Robot::SensorHook with the Robot object so as to receive immediate notifications of odometric updates.

In order to actually send motor commands and read low-level sensor packets, we assume that the main computer will be connected to the robot platform via a serial port.
Accordingly, the Robot class uses a Serial object, which it makes available to its subclasses. Figure 3.5 summarizes the above description in the form of a class diagram. The RCCar class was used in conjunction with Lobot 1.0 (figure 2.1(a), page 16). When we switched to the iRobot Create, we implemented the RoombaCM class along with the RoombaCM::Comm thread for interfacing with the low-level controller.

Figure 3.5: Class diagram showing how the Robot class is related to others within Lobot's high-level controller. Sensors and SensorHook are public data structures defined inside the Robot class. The Serial object provides an API for reading and writing bytes from/to a serial port. The Comm thread is private to the RoombaCM class.

As we did when we instantiated the LocustModel class, we use an object factory to instantiate the proper subclass of Robot based on the configuration settings. To get this controller to work on a different robot platform, we need only derive a new class from Robot and provide suitable implementations of the drive, turn, spin and update methods.

3.3.3 Behaviour Threads

Lobot's configuration file specifies the list of active behaviours. Line 24 in algorithm 3.2 calls a function that walks through this list and creates an object for each named behaviour using an object factory. Upon instantiation, a new thread is launched for every behaviour object.

All behaviours running within the Lobot controller are derived from an abstract base class named Behavior (figure 3.6), which defines two crucial methods, viz., run and action. The run method implements a generic main loop (algorithm 3.4), while action, a pure virtual method, is expected to implement each behaviour's custom, goal-directed, sensorimotor situation–action rule.

Figure 3.6: Class diagram illustrating how Lobot's behaviours are organized.
Every behaviour is derived from the Behavior class, which, in turn, is a Drawable (to be discussed shortly).

Algorithm 3.4 Generic Main Loop for all Behaviours
 1  void Behavior::run()
 2  {
 3     try
 4     {
 5        App::wait_for_init() ;
 6        pre_run() ;
 7        while (! Shutdown::signaled()) {
 8           if (Pause::is_clear())
 9              action() ;
10           usleep(m_update_delay) ;
11        }
12        post_run() ;
13     }
14     catch (uhoh& e)
15     {
16        LERROR("behaviour %s encountered an error: %s",
17               name.c_str(), e.what()) ;
18     }
19  }

To add a new capability to the robot, we must derive a new class from Behavior and provide a suitable implementation of the action method. To perform some initialization prior to entering the behaviour's main loop, we can take advantage of the pre_run method made available by the Behavior class. In most cases, this is all we need to do to get a new behaviour up and running. However, in some cases, Behavior::run's generic main loop may not be appropriate. Thus, Lobot's behaviour-based framework allows Behavior subclasses to override its run method. The localization module, for example, does this in order to perform localization updates with minimal odometric accumulation.

Some Notes on Algorithm 3.4

The main thread's application object (see section 3.3.1) uses a flag to signal other threads when the high-level controller's initialization sequence is complete. Some threads care about this condition, others do not. In the case of behaviours, we need to care because every Behavior is a Drawable and certain visualization related steps may only be performed after the visualization thread (section 3.3.5) comes up. Unfortunately, the visualization thread can only be vivified after all the behaviours are running because it needs to have all the active drawables ready before it can begin operating. This is why line 5 in algorithm 3.4 is necessary. For a new behaviour, if we need to override Behavior::run, it is imperative that we retain the call to App::wait_for_init().
The Pause object is a system-wide flag used to temporarily pause the high-level controller. When the flag is set, all threads will skip their usual processing and simply spin uselessly. We use this flag to allow users to pause the robot and also to allow behaviours to pause the robot when relevant conditions are met (e.g., reaching a goal). We omitted this flag in algorithm 3.3 for the sake of clarity; however, the main thread, like all other Lobot threads, also checks the Pause object. A new behaviour that overrides Behavior::run must keep this in mind.

Line 10 implements a short delay between successive iterations of the behaviour's main loop. Behavior::m_update_delay is set during behaviour initialization from the update_delay setting in the configuration file. This is an important parameter for all behaviours. The absolute and relative values of different behaviours' update delays play a significant role in determining the robot's overall responses to various situations. Each behaviour should provide a reasonable default and proper bounds for this parameter. Furthermore, users should be careful when tweaking the defaults.

The uhoh class used on line 14 is an exception base class derived from std::runtime_error. All the exceptions generated by different Lobot modules are subclasses of uhoh.

3.3.4 DAMN Arbiters

As we know, in a behaviour-based controller, each behaviour monitors one (or a small subset) of the robot's sensors and controls one (or a small subset) of its actuators in accordance with some specific objective (e.g., locate green objects, maintain a distance of one foot from a wall, and so on). Since such a controller consists of multiple, independent behaviours all executing at the same time, conflicts are bound to arise. As a result, selecting one behaviour and action from a set of several contenders is a significant design issue for behaviour-based control.
Many different approaches have been advocated and can be roughly classified into two broad categories (Pirjanian, 1999):

Arbitration techniques: a single behaviour gets to briefly control the robot based on its priority and various other criteria.

Command fusion techniques: the motor commands from multiple behaviours are combined according to various relevant criteria.

The subsumption architecture (Brooks, 1986) is an example of an arbitration-based action selection mechanism wherein a higher priority behaviour can "subsume" any lower priority behaviour by either suppressing its sensory inputs or inhibiting its actuator outputs. Thus, in subsumption, communication between behaviours is strictly hierarchical. Relationships between the behaviours are fixed at design time and, often, hard-wired into the controller,
DAMN provides a common interface for behaviours to communicate their intentions without imposing any sort of structural or representational policies on them. This allows for a lot of heterogeneity within the controller’s behavioural network. Each component in the system can proceed completely asynchronously w.r.t. all of the others; no behaviour is constrained to “report” to the arbiter at a fixed interval. Thus, a behaviour attempting to analyze images streaming in from a camera can take the time it needs to get its job done without slowing down a low-level obstacle avoidance behaviour working with range sensors. 76 This flexibility enables reactive and deliberative behaviours to coexist and interact in a natural way, i.e., without the need for a complicated middle layer as in hybrid controllers (Arkin, 1998). Since no behaviour is summarily overridden, no information or state compu- tations are discarded outright. Instead, all behaviours get a say in the actions to be carried out by the robot and, where appropriate, compromises can be made. Users can decide how much of a say each behaviour gets in the final decision by configuring the mode manager to assign specific weights to each behaviour’s votes. Lobot’simplementationofDAMNisencapsulatedbythe Arbiterclassanditssubclasses (figure3.8). TheArbiterbaseclassdefinesaVoteBaseinnerclassthatismeanttobeextended by derived classes. Behaviours can vote for different actions with the Arbiter::vote method. In addition to this voting infrastructure, Arbiter also furnishes a main loop much like the Behavior class. Algorithm 3.5 presents the C++ code for the arbiter main loop. There are two noteworthy points regarding algorithm 3.5: • The call to the init priorities function on line 6 creates a mapping between behaviour names and their corresponding priorities. In lieu of the mode manager, we simply use each active behaviour’s speed priority, turn priority and spin priority settings from the configuration file. 
• Line 16 is the one that tallies the currently available votes. Since each type of arbiter can have different voting semantics, the base class itself cannot tally votes and issue motor commands. Instead, it declares the Arbiter::motor_cmd pure virtual method and expects each subclass to supply a suitable implementation.

Figure 3.8: Class diagram illustrating Lobot's implementation of DAMN. The Arbiter base class provides a framework for voting plus a main loop. The SpeedArbiter, TurnArbiter and SpinArbiter tally votes and use the Robot interface object to issue drive, turn and spin commands respectively.

Algorithm 3.5 Generic Main Loop for all DAMN Arbiters
 1  void Arbiter::run()
 2  {
 3     try
 4     {
 5        App::wait_for_init() ;
 6        init_priorities() ;
 7        if (! App::robot()) {
 8           LERROR("robot sensorimotor subsystem unavailable") ;
 9           return ;
10        }
11        pre_run() ;
12        while (! Shutdown::signaled()) {
13           if (Pause::is_clear()) {
14              AutoMutex M(m_votes_mutex) ;
15              if (! m_votes.empty()) {
16                 motor_cmd(m_votes, App::robot()) ;
17                 purge_container(m_votes) ;
18                 m_votes.clear() ;
19              }
20           }
21           usleep(m_update_delay) ;
22        }
23        post_run() ;
24     }
25     catch (uhoh& e)
26     {
27        LERROR("arbiter error: %s", e.what()) ;
28        AutoMutex M(m_votes_mutex) ;
29        purge_container(m_votes) ;
30        m_votes.clear() ;
31     }
32  }

In section 3.3.2, we detailed the abstract motion primitives provided by the Robot class, viz., drive, turn and spin. Lobot's DAMN implementation mirrors this interface in the form of three corresponding arbiters, viz., SpeedArbiter, TurnArbiter and SpinArbiter.

The SpeedArbiter's vote tallying procedure is purely priority-based. That is, it finds the vote cast by the highest priority behaviour and issues that vote's drive command. According to Rosenblatt (1997), to issue an appropriate motor command, a speed arbiter should pick the minimum speed that will satisfy all the behaviours.
However, for Robolocust, we prefer to take the behaviour priorities into account. Otherwise, for example, a behaviour voting for a full stop because something has gotten too close will always override another voting to back up and away from that very thing. By issuing the drive commands of the highest priority behaviour, we ensure that low priority behaviours issuing lower speed commands do not override the directives of higher priority behaviours voting for higher speeds.

To influence the robot's heading, a behaviour must vote for or against each possible turn direction. If a behaviour votes −1 for some direction, it means that the behaviour is dead-set against turning in that direction; a vote of +1 indicates a strong preference for going in that direction; and a vote of zero means the behaviour is neutral with regard to that direction. Fractional values indicate varying degrees of being for or against a particular steering direction. For example, an obstacle avoidance behaviour might scale vote values based on the distances to obstacles in different directions. Once a behaviour has assembled its steering preferences in a TurnArbiter::Vote structure, it can cast that vote with the turn arbiter.

To issue an appropriate turn command, the turn arbiter uses the turn_priority settings of the behaviours to calculate a weighted sum of all the votes. It then smooths the vote values for the different directions using a Gaussian. Finally, it picks the turn direction with the maximum vote value. This procedure is the same as the one described by Rosenblatt (1997, chap. 4) except that we do not factor each vote's age into the weighted sum; nor do we implement the final vote interpolation and parabolic fitting step.

The SpinArbiter computes the amount of rotation for in-place turns as a weighted sum of the requested spin angles, where the behaviours' spin_priority settings serve as the weights.
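The turn arbiter's tallying procedure described above can be sketched as follows. This is a simplified illustration under assumed conventions (votes indexed by discrete turn directions, a brute-force Gaussian smoothing pass); it is not Lobot's actual TurnArbiter code:

```cpp
#include <cmath>
#include <vector>

// votes[b][d] is behaviour b's vote (in [-1,+1]) for turn direction d.
// Step 1: priority-weighted sum of the votes for each direction.
std::vector<double> weighted_sum(const std::vector<std::vector<double>>& votes,
                                 const std::vector<double>& priorities)
{
    std::vector<double> sums(votes[0].size(), 0.0);
    for (size_t b = 0; b < votes.size(); ++b)
        for (size_t d = 0; d < sums.size(); ++d)
            sums[d] += priorities[b] * votes[b][d];
    return sums;
}

// Step 2: smooth the per-direction sums with a normalized Gaussian kernel.
std::vector<double> gaussian_smooth(const std::vector<double>& v, double sigma)
{
    std::vector<double> out(v.size(), 0.0);
    for (size_t i = 0; i < v.size(); ++i) {
        double total = 0.0, norm = 0.0;
        for (size_t j = 0; j < v.size(); ++j) {
            double dx = double(i) - double(j);
            double w  = std::exp(-dx * dx / (2.0 * sigma * sigma));
            total += w * v[j];
            norm  += w;
        }
        out[i] = total / norm;
    }
    return out;
}

// Step 3: pick the index of the direction with the maximum smoothed vote.
size_t best_direction(const std::vector<std::vector<double>>& votes,
                      const std::vector<double>& priorities, double sigma)
{
    std::vector<double> s = gaussian_smooth(weighted_sum(votes, priorities), sigma);
    size_t best = 0;
    for (size_t d = 1; d < s.size(); ++d)
        if (s[d] > s[best]) best = d;
    return best;
}
```

For instance, with two behaviours voting for different directions, the higher priority behaviour's preferred direction wins after smoothing.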
In general, for steering control, all the behaviours should be configured to use either the turn commands or the spin commands; the two steering command sets should not be mixed. Although the iRobot Create does support in-place turns, we chose to use turn rather than spin commands for Lobot's steering control.

Figure 3.9: Class diagram for the high-level controller's UI objects. The MainWindow sets up an OpenGL window and launches a visualization thread. All modules that perform visualization must be derived from Drawable. For example, the LaserViz class visualizes the LRF data, while behaviours and arbiters are set up for visualization via their respective base classes. The Drawable::RenderHook data structure can be used to register callback functions that render arbitrary overlays on the Drawable.

3.3.5 Visualization

To aid in debugging, we developed a graphical user interface for the high-level controller that enables users to visualize the robot's sensory inputs and the motor commands being issued by different behaviours. Figure 3.9 presents a class diagram showing how the different components of the UI relate to each other. To visualize something, we must derive it from the Drawable class. All the Drawable objects are managed by the MainWindow, which is responsible for initializing an OpenGL rendering context and popping up a GLUT window on the screen (Woo et al., 1997). The MainWindow also launches a new thread and runs GLUT's main loop in this thread. Using a separate OpenGL rendering thread ensures that visualization will not slow down the behaviours and other modules. Each Drawable is assigned to a particular area within the MainWindow via a geometry setting in the configuration file. Furthermore, the configuration file also associates every Drawable object with a visualization flag, allowing users to selectively enable or disable drawables.
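To make the callback-based overlay mechanism mentioned in figure 3.9's caption more concrete, here is a minimal sketch (illustrative only; the real Drawable::RenderHook works against an OpenGL rendering context, and the names here are our own assumptions):

```cpp
#include <functional>
#include <string>
#include <vector>

// Sketch of a drawable with registrable rendering hooks. Client modules
// register callbacks; after the drawable renders itself, it triggers each
// hook in the order in which they were registered. For illustration we
// "render" into a string instead of issuing OpenGL calls.
class Drawable {
public:
    using RenderHook = std::function<void(std::string&)>;

    void add_hook(RenderHook hook) { m_hooks.push_back(hook); }

    std::string render() {
        std::string canvas = "base"; // the drawable's own rendering
        for (RenderHook& hook : m_hooks)
            hook(canvas); // overlays drawn in registration order
        return canvas;
    }

private:
    std::vector<RenderHook> m_hooks;
};
```

This is the design used, for example, by the open path behaviour to paint candidate paths on top of the LRF visualization and by the localization behaviours to share a map drawable.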
These features make Lobot's UI a very flexible visualization tool, as illustrated by figure 3.10, which shows two screenshots of the UI configured with quite different settings.

Figure 3.10: Screenshots of the high-level controller's user interface. (a) Top: laser range finder (LRF) visualization in flat mode; middle: LGMD spike rate histories of five virtual locusts; bottom (left–right): emergency_stop behaviour, extricate behaviour, turn arbiter. (b) Left half: localization module; right half (left–right, top–bottom): forward, emergency_stop, extricate and lgmd_extricate_tti behaviours, LRF in polar mode with open paths overlaid, LGMD spikes in polar mode. The ability to turn drawables on or off and to specify their exact sizes and positions makes for a very flexible robot state visualization tool.

In addition to rendering itself, a Drawable also supports rendering arbitrary overlays on top of itself. This is useful, for instance, with the open path behaviour (section 3.4.4), which paints candidate open paths on top of the laser range finder's data visualization. Several localization behaviours also take advantage of overlay rendering to visualize their respective states on a shared map. Overlays are rendered using callbacks. Client modules may register any number of Drawable::RenderHook functions with a target Drawable. After rendering itself, the Drawable will trigger each of these rendering hooks in the order in which they were registered.

3.3.6 Metrics Logging

In section 2.4.2, we presented an example log entry containing various pieces of information relevant to the evaluation of the Bayesian time-to-impact state estimator. That entry was output by the lgmd_extricate_tti behaviour (which we shall meet in chapter 4). However, all log entries follow a similar pattern: a time-stamp followed by some informational content. The metrics behaviour offers a central logging facility that takes care of the details of generating such log messages.
This behaviour does not control the robot in any way, nor does it perform any sort of visualization. Its sole job is to buffer log messages and periodically write them out to a log file. The log file's name is generated from a user-specified setting and a time-stamp corresponding to when the high-level controller began running. The public interface of the metrics logging API wraps around the C++ stream insertion operator, which allows it to be used with arbitrary data types as long as there is a way to send the target type to an std::ostream. The above-mentioned wrapper also checks if the metrics behaviour is active. If not, it ensures that client modules do not incur any overheads related to building and buffering log messages. Furthermore, having a separate thread handle log messages ensures that behaviours responsible for crucial robot control functions are not slowed down by disk I/O. Even so, the metrics logging facility is not meant to be a generalized, high-performance API for tasks such as debug logging. Rather, it is a specialized API designed to record data related to the various experiments we conducted to evaluate Lobot's different algorithms. In that capacity, it provides an efficient and convenient interface for data collection and is used by all behaviours that need to output such messages.

3.4 The Fundamental Behaviours

Although Lobot was built primarily to investigate LGMD-based obstacle avoidance, its high-level controller is fairly generic and can be deployed in a multitude of situations by selective activation and deactivation of various behaviours. However, for general use, there are certain behaviours, the so-called "fundamental behaviours," that must always be active. Without them, the robot will lack the most basic capabilities required of a mobile agent. There are four such fundamental behaviours:

1. forward: simply drive the robot forward.
2. emergency_stop: stop the robot when things get too close.
3.
extricate: get the robot out of the situation that has resulted in an emergency stop.
4. open_path: steer the robot in the direction of maximal traversability.

Strictly speaking, the first three are the only ones that are absolutely necessary to ensure collision-free operation. However, the fourth behaviour listed above adds an exploratory component to the robot's movements, which makes it a lot more interesting; that is why we classify open_path as a fundamental behaviour. Together, these four behaviours comprise levels of competence 0, 1 and 2 as defined by Brooks (1986).

In some special cases, we may not need all the fundamental behaviours. For instance, when we tested the LGMD sensor model as described in section 2.4, we only used the forward and emergency_stop behaviours. The forward behaviour was configured to drive the robot toward the wall at the required speed for each experiment while emergency_stop brought it to a halt shortly before it made contact with the wall. In addition to these two fundamental behaviours, we also had the lgmd_extricate_tti behaviour turned on. This is an LGMD-based obstacle avoidance behaviour. To prevent it from steering the robot away from the approaching wall, we set its turn_priority to zero. We will go into further details about lgmd_extricate_tti in chapter 4; here, suffice it to say that it generated the raw data from which we created the plots for figures 2.9 and 2.10.

3.4.1 Forward

The forward behaviour affects both the robot's heading and its speed. That is, it casts votes with both the TurnArbiter and the SpeedArbiter. The steering command is always to drive straight ahead, i.e., a vote of +1 for 0°. The other directions get a vote value less than one, computed according to the formula:

    v(x, C) = 1 − |x − C|/M    (3.1)

where C is the "central" direction (0° in this case), x ranges over all the other steering directions, and M is the maximum steering angle supported by the robot platform. In equation (3.1), when x = C, the vote will be +1.
Turn directions further away from C will get votes less than one. Since the TurnArbiter requires all votes to lie in the range [−1, +1], we clamp the output of equation (3.1) to this range. To make the application of equation (3.1) a little more concrete, let us say that the robot supports 9 steering angles, which are, going from left to right, {20°, 15°, 10°, 5°, 0°, −5°, −10°, −15°, −20°}. Then, the forward behaviour will produce the following votes for the TurnArbiter: {−1, −0.5, 0, 0.5, 1, 0.5, 0, −0.5, −1}, which are obtained from equation (3.1) using C = 0° and M = 20°.

For the speed command, the forward behaviour operates in one of two modes, viz., fixed or adaptive. In fixed mode, it simply votes for the cruising speed setting from the configuration file. In adaptive mode, the behaviour regulates the speed based on the distance to the closest obstacle within a small angular field of view, [−F, F], centered about 0°. To "convert" distance readings from the laser range finder into equivalent speed values, we use a linear interpolation between a distance range, [D_min, D_max], and a speed range, [S_min, S_max]. Thus, we regulate the robot's speed by performing the following computation:

    S_min + (S_max − S_min) × (d − D_min)/(D_max − D_min)

where d = min LRF(θ) over all θ ∈ [−F, F].

The actual code for this function is a little more complicated due to multithreaded coordination and visualization related steps. Moreover, the FOV setting for adaptive mode, viz., F, is specified using two configuration parameters, fov and averaging_block, which involve an angular step size plus an averaging operation. However, these details are not important for a conceptual understanding of what the forward behaviour does, for which the preceding explanation is quite adequate.

Finally, we note that forward driving is the lowest priority behaviour: we use a value of one for its turn_priority and speed_priority settings. All other behaviours get higher priorities.
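The forward behaviour's two computations, the steering vote of equation (3.1) with clamping and the adaptive-mode speed regulation, can be sketched as follows (function names are illustrative, not Lobot's API):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Equation (3.1): vote for steering direction x given central direction C
// and maximum steering angle M, clamped to the TurnArbiter's [-1,+1] range.
double turn_vote(double x, double C, double M)
{
    double v = 1.0 - std::fabs(x - C) / M;
    return std::max(-1.0, std::min(1.0, v));
}

// Adaptive-mode speed regulation: take the distance d to the closest
// obstacle within the field of view and linearly interpolate it from the
// distance range [Dmin, Dmax] into the speed range [Smin, Smax].
double adaptive_speed(const std::vector<double>& lrf_in_fov,
                      double Dmin, double Dmax, double Smin, double Smax)
{
    double d = *std::min_element(lrf_in_fov.begin(), lrf_in_fov.end());
    d = std::max(Dmin, std::min(Dmax, d)); // keep interpolation in range
    return Smin + (Smax - Smin) * (d - Dmin) / (Dmax - Dmin);
}
```

So a nearby obstacle in the field of view pulls the cruising speed down toward S_min, while a clear view ahead lets the robot drive at S_max.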
Thus, any other behaviour may override forward driving.∗

∗ Recall, however, that in a DAMN controller, behaviours are not summarily overridden. Rather, their priorities are used to weight their votes. Low priority behaviours simply get less of a say in the final motor commands output by the arbiters.

3.4.2 Emergency Stop

The emergency_stop behaviour monitors a danger zone around the robot. When an object penetrates this zone, it votes to stop the robot. Since the SpeedArbiter picks the speed command of the highest priority behaviour, at the very least, emergency_stop always overrides the forward behaviour. However, as we shall see shortly, emergency_stop is not assigned the highest speed_priority setting.

The danger zone is specified by dividing the laser range finder's field of view into several angular blocks. For example, a list such as {−90°, −30°, 30°, 91°} will be construed as the following three blocks: [−90°, −30°), [−30°, 30°) and [30°, 91°). Each block is associated with a distance threshold and a count threshold, (d, c). When more than c LRF readings in a block have values less than d, we consider the danger zone penetrated. Having a count threshold in addition to a distance threshold ensures that momentary sensor glitches do not bring the robot to a screeching halt.

Furthermore, using multiple blocks as described above instead of a purely circular danger zone with a single distance and count threshold allows us to fine-tune the robot's response to nearby objects. For instance, we can configure it so that it is more tolerant of things on its sides than of things in front of it. This fine-tuning ability is also advantageous for non-circular robots and for cases wherein the laser range finder is not mounted at the robot's center.
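The block-based danger zone test can be sketched as follows (an illustrative simplification; the real emergency_stop behaviour reads its blocks and thresholds from the configuration file):

```cpp
#include <cstddef>
#include <vector>

// Each angular block carries a distance threshold d and a count threshold c:
// the zone counts as penetrated only when more than c readings in the block
// fall below d, which keeps momentary sensor glitches from halting the robot.
struct Block {
    int begin, end;   // angular extent [begin, end) in degrees
    double d;         // distance threshold (same units as the LRF readings)
    int c;            // count threshold
};

// lrf[i] is the range reading at angle (i + angle_offset) degrees.
bool danger_zone_penetrated(const std::vector<double>& lrf, int angle_offset,
                            const std::vector<Block>& blocks)
{
    for (const Block& b : blocks) {
        int count = 0;
        for (int angle = b.begin; angle < b.end; ++angle) {
            int i = angle - angle_offset;
            if (i < 0 || i >= static_cast<int>(lrf.size()))
                continue; // angle outside the LRF's field of view
            if (lrf[i] < b.d)
                ++count;
        }
        if (count > b.c)
            return true; // this block's thresholds were exceeded
    }
    return false;
}
```

Giving the side blocks larger d or c values than the frontal block yields exactly the "more tolerant of things on its sides" configuration described above.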
3.4.3 Extricate

This behaviour monitors the robot's danger zone and commands the robot to back up, turn and/or drive forward as required to get it away from the obstacle currently obstructing its path so that it can start moving normally again. The extrication algorithm is based on the vector force field (VFF) idea due to Borenstein and Koren (1989). Each range reading inside the danger zone exerts a virtual repulsive force on the robot while distance measurements outside of the danger zone exert attractive forces. The sum of all these force vectors is used to drive and steer the robot away from obstacles.

The extricate behaviour does not implement the full VFF algorithm described by Borenstein and Koren. In particular, there is no certainty grid; moreover, we do not perform the low-pass filtering, damping, speed control and trap detection steps. Instead, we generate drive and turn commands based on the quadrant in which the steering vector produced by the VFF calculations lies. Below, we present a snippet of C++ code from the Extricate class's action method:∗

const int T = random(TurnArbiter::turn_step(), TurnArbiter::turn_max()) ;
switch (quadrant(direction(F)))
{
   case 1: C.drive =  1 ; C.turn =  T ; break ;
   case 2: C.drive = -1 ; C.turn = -T ; break ;
   case 3: C.drive = -1 ; C.turn =  T ; break ;
   case 4: C.drive =  1 ; C.turn = -T ; break ;
}

∗ See section 3.3.3 for an explanation of how behaviours are implemented.

If the obstacle is directly ahead of the robot, the VFF steering vector will point behind it (quadrants 2 or 3) and we will issue a negative drive command, i.e., the robot will reverse rather than drive forward. If, however, the obstacle is on the robot's flanks or slightly behind it, the VFF vector will point forward, i.e., quadrants 1 or 4, in which case we will simply continue to drive forward.
As the robot backs up or drives forward per the VFF steering vector's direction, we also make it turn so that, at the end of the extrication, when it resumes normal forward driving, it will be facing away from the obstacle. If the steering vector points ahead to the left, the turn angle will be positive; if it points ahead to the right, the turn angle will be negative. However, when reversing (cases 2 and 3 above), the turn angles have to be negated to achieve the desired effect.

Note that the switch statement shown above only computes the drive direction. This value, viz., ±1, is multiplied by the extricate_speed configuration setting to yield the actual speed for which the extricate behaviour will vote. The turn angle, however, is passed as-is to the turn arbiter. An interesting point about the amount of turn is that it is a small random number within the steering range supported by the turn arbiter. This trick yields a crude but workable alternative to a more formalized trap detection approach. If necessary, a higher level behaviour can be used to effect a better solution.

It is crucial that the extricate behaviour's speed_priority be greater than emergency_stop's. Otherwise, the drive commands that implement extricate's escape strategy will always be overridden, resulting in a robot that stays put upon reaching its first static obstacle and refuses to move thereafter.

One last thing to say about the extricate behaviour is its use of the restart_mode flag. In section 2.6.4, we pointed out a major shortcoming of our LGMD range sensor, namely, that spiking activity ceases when the robot becomes immobile. Thus, LGMD-based obstacle avoidance behaviours will lose their sensory input and not be able to complete a full extrication when emergency_stop kicks in. To work around this problem, we run the LRF-based extricate (i.e., this) behaviour in addition to the LGMD-based version(s) of extricate.
However, to give the LGMD-based extrications precedence, we turn on this behaviour's restart_mode setting. In this mode, the extricate behaviour waits for the robot to come to a full stop before issuing any extrication commands. Furthermore, we reduce the frequency with which this behaviour runs. These two conditions together ensure that this behaviour "rescues" the robot and restarts LGMD spiking only when absolutely necessary.

3.4.4 Open Path

If we run only the forward, emergency_stop and extricate behaviours, the robot will keep driving straight ahead until it encounters an obstacle, at which point it will veer away and then resume driving straight ahead until the next obstacle, and then straight again... ad nauseam. It would be much more interesting if the robot were to "proactively" choose trajectories that offer maximal traversability.

To that end, we have implemented the open path algorithm (von Wahlde et al., 2009). This behaviour steers the robot toward the direction with the longest path that can accommodate the robot's width. The idea behind the open path algorithm is to use the current set of distance readings from the laser range finder to build a list of candidate open paths and then pick the one with the maximum traversable area. Though the end results are the same, Lobot's implementation, which is based on trigonometry, is very different from the search procedure described by von Wahlde et al. Figure 3.11 shows a diagram that will help clarify the following discussion.

Let us say that the robot's width is 2w units. Our goal is to find the longest path that is at least this wide and at least m units long. To do this, we search the entire angular range of the laser range finder, determining the possible path length in each direction, and then pick the direction with the maximum path length as the final steering angle.

Figure 3.11: Lobot's implementation of the open path algorithm is based on trigonometry.
The laser range finder is at the origin. The shaded blue region represents the LRF measurements for directions θ−α to θ+α. The maximum distance that can be traversed in direction θ will be the projection of the minimum range reading within rectangle ABCD onto the direction vector of θ.

To find the path length in some direction θ, we start off by constructing the (imaginary) rectangle ABCD as shown in figure 3.11. The edge AB is perpendicular to the direction vector along θ and is at a distance m from the origin (where the laser range finder is located). Its length is 2w. CD is located at a distance equal to the range reading along θ. The path length along θ will be the minimum range reading that lies within ABCD projected onto the direction vector associated with θ. In figure 3.11, the minimum range reading is OP. The projection of P onto θ's direction vector yields point P′. Thus, the desired path length is OP′.

To get to P′, consider the right triangle △OPP′. Let γ be the angle subtended by OP at the origin. Then ∠POP′ will be θ−γ, which gives us:

    PP′ = OP × sin|θ−γ|    (3.2)

and

    OP′ = OP × cos|θ−γ|    (3.3)

Recall that OP is the laser range finder measurement along direction γ. That is, OP = LRF(γ). Now, for any γ ∈ [θ−α, θ+α], if we apply equation (3.2), we will get the orthogonal component of the projection of LRF(γ) onto the direction vector along θ. If this value is ≤ w, then the range reading along γ lies within the region bounded by the lines AD and BC. In this situation, we can apply equation (3.3) to get the projection of LRF(γ) along the direction θ. The minimum value yielded by repeated applications of equation (3.3) will be the maximum traversable distance in direction θ before the robot encounters an obstacle.

One thing we have not mentioned so far is the angle α. As we can see from figure 3.11, to find the maximum traversable path in rectangle ABCD, we need to search the LRF readings in the angular range demarcated by ∠AOB.
From the right triangle △OMA, we see that α = tan⁻¹(w/m). Since the search range is symmetric about θ, ∠AOB spans θ−α to θ+α.

Algorithm 3.6 formalizes the procedure described above. The actual code, of course, is a little more involved due to the use of various error checks and data structures required for visualization.

Algorithm 3.6 Find path length along direction θ
L = ∞                          // initialize path length to some large value
for all γ ∈ [θ−α, θ+α]
    if LRF(γ) sin|θ−γ| ≤ w
        d = LRF(γ) cos|θ−γ|
        if d < L
            L = d
if L = ∞ or L < m              // m is the minimum path length
    return −1                  // no viable open path in direction θ
return L

Repeated calls to algorithm 3.6 will yield a list of candidate paths in different directions. The behaviour then votes to steer in the direction corresponding to the longest candidate path. If there are no viable paths in any direction, perhaps because obstacle distances in all directions are less than m, the open_path behaviour will not do anything. In such a situation, we would expect the extricate behaviour to bring the robot back into a state where open_path can resume functioning normally.

In addition to w and m, the open_path behaviour uses the following three parameters: fov, step and dead_zone. The first two allow us to restrict the search for open paths to some subportion of the laser range finder's entire field of view while skipping some directions. For example, fov = 45° and step = 10° will result in the behaviour searching for open paths every 10° from −45° to +45°. Furthermore, to ensure that the robot does not keep turning unnecessarily, we mark some angular range directly ahead of it as a "dead zone." When the most open path lies in this range, the behaviour will not issue any steering commands. For instance, if dead_zone is [−10°, +10°], the robot will turn only when the most open path is offset from straight ahead by more than 10°.
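Under the simplifying assumption of one LRF reading per degree, algorithm 3.6 can be turned into runnable code along these lines (an illustrative sketch, not Lobot's actual open_path implementation):

```cpp
#include <cmath>
#include <vector>

// lrf[i] is the range at angle (i + angle_offset) degrees; all angle
// arithmetic below is done in degrees.
const double DEG = 3.14159265358979323846 / 180.0; // degrees -> radians

// Returns the open path length along direction theta (degrees), or -1 when
// no path at least min_length long and 2*half_width wide exists.
double open_path_length(const std::vector<double>& lrf, int angle_offset,
                        double theta, double half_width, double min_length)
{
    // Half-angle of the search cone: alpha = atan(w / m), in degrees.
    double alpha = std::atan(half_width / min_length) / DEG;
    double L = -1.0; // -1 means "no in-rectangle reading found yet"
    for (int gamma = (int)std::ceil(theta - alpha);
         gamma <= (int)std::floor(theta + alpha); ++gamma) {
        int i = gamma - angle_offset;
        if (i < 0 || i >= (int)lrf.size())
            continue; // direction outside the LRF's field of view
        double dtheta = std::fabs(theta - gamma) * DEG;
        if (lrf[i] * std::sin(dtheta) <= half_width) { // reading inside ABCD?
            double d = lrf[i] * std::cos(dtheta);      // projection onto theta
            if (L < 0.0 || d < L)
                L = d;
        }
    }
    return (L < min_length) ? -1.0 : L;
}
```

Calling this function once per candidate direction (every `step` degrees across the configured `fov`) yields the list of candidate paths from which the behaviour picks the longest.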
Finally, regarding behaviour priorities, since open_path is only concerned with the robot's steering, we only need to specify a turn_priority for it, which we set to a value between those of the forward and extricate behaviours.

Before closing this discussion, we would like to highlight that, despite its simplicity, the open path algorithm results in seemingly purposeful motion. For instance, in our lab, we find that, starting from one corner of a room, the robot often "seeks" the door at the other end, neatly skirting intervening furniture, then follows the corridor and turns into another office. Casual observers tend to mistake this as some sort of goal seeking and are a little surprised to learn that it is, in fact, just random exploration. As we shall see in the next chapter, we utilize this faux purposefulness to good effect in a slalom obstacle course, wherein the open_path behaviour is used to guide the robot toward a goal located at the other end of the course.

Chapter 4

LGMD-based Obstacle Avoidance

Of the four fundamental behaviours discussed in the previous chapter, extricate is the one that effects obstacle avoidance. It does so on the basis of the laser range finder's distance measurements. To use LGMD spikes instead, it is reasonable to follow the design precedent already established and simply replace the LRF-based version of extricate with versions that take LGMD spikes as input. In this chapter, we describe three such behaviours:

1. lgmd_extricate_emd
2. lgmd_extricate_vff
3. lgmd_extricate_tti

The first two output steering commands using the raw LGMD spikes as their input. The third behaviour uses the Bayesian time-to-impact estimator described in section 2.3 to obtain range estimates, which it then converts to virtual forces just like the LRF-based extricate.
Before we dive into the details of the above-mentioned algorithms, we should note that Lobot's codebase contains a fourth LGMD-based obstacle avoidance behaviour, viz., lgmd_extricate_sim, which simply turns the robot in the direction of minimum spiking activity. The problem with this simplistic approach is that it causes many conflicting turn commands, i.e., frenzied alternation between left and right.

In a behaviour-based system, we expect each behaviour to issue motor commands in short bursts. For example, when the robot encounters an obstacle, an extricate behaviour will request a slight back-up. Then, the forward behaviour will resume normal driving, but only for a very short while because extrication will trigger again, followed, perhaps, by emergency_stop and, once more, forward... This back-and-forth will continue until the obstacle has been fully cleared.

Thus, it is normal for all extricate behaviours to repeatedly vote for the necessary actions in relation to a single obstacle. However, such successive motor commands should not be polar opposites. That is, if an obstacle is on the right, the extrication's steering votes should generally point to the left. Unfortunately, with lgmd_extricate_sim, we found that the initial turn command would usually be correct but that subsequent ones would initiate an alternating pattern. What was happening was that as the robot turned, the LGMD spikes from the virtual locusts facing the obstacle would start to drop. Then, lgmd_extricate_sim would misread the drop and react by turning in the wrong direction.

Since extrication is a high-priority behaviour, this sort of repeated zigzagging tends to interfere with the ability of other behaviours to function effectively. Thus, we need extrications to be controlled and fluid movements, with the entire event (comprised of several individual "mini-extrications") being a relatively short affair so that the robot can quickly return to seeking goals and performing other useful tasks.
These requirements and lgmd_extricate_sim's inability to fulfill them made it abundantly clear that we needed more sophisticated approaches. In the following sections, we will detail the alternative solutions we devised.

4.1 LGMD-based Obstacle Avoidance Algorithms

4.1.1 EMD

EMD stands for "Elementary Motion Detector" (Reichardt, 1987). Figure 4.1(a) shows how a bidirectional EMD works. The lgmd_extricate_emd behaviour uses the LGMD spike rates from each pair of adjacent virtual locusts as the inputs to individual EMDs, as shown in figure 4.1(b).

Figure 4.1: The lgmd_extricate_emd behaviour steers the robot away from the direction of maximal LGMD spiking using an array of Elementary Motion Detectors. (a) A bidirectional Elementary Motion Detector combines two inputs and outputs a positive number when the input signal "moves" in one direction and a negative number when it moves in the other direction. (b) The lgmd_extricate_emd behaviour feeds the LGMD spike rates from each pair of adjacent virtual locusts into an EMD and turns based on a vector sum of the outputs of all the EMDs.

Each detector in the EMD array is assigned a line of sight vector that is the bisector of the lines of sight of its left and right inputs. The absolute value of the output of the EMD is used to scale the direction vector along its line of sight. Finally, we sum these individual vectors to produce a single vector that points in the direction of maximal LGMD spiking activity currently taking place. The lgmd_extricate_emd behaviour outputs appropriate drive and turn commands when the magnitude of the above vector crosses some predetermined threshold. The snippet of code shown in algorithm 4.1 summarizes this behaviour's extrication logic.
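The delay-and-correlate structure of figure 4.1(a) can be sketched in a few lines. The following is an illustrative discrete-time version with a one-sample delay standing in for the time delay τ; it is not Lobot's LGMDExtricateEMD code:

```cpp
// Minimal sketch of a bidirectional (Reichardt) Elementary Motion Detector.
// Each half-detector multiplies one channel's delayed sample with the other
// channel's current sample; subtracting the two halves yields a signed
// output whose sign encodes the direction of "motion" across the inputs.
struct EMD {
    double left_prev  = 0.0; // delayed left input
    double right_prev = 0.0; // delayed right input

    // Feed the current left/right samples; returns the correlation output.
    double update(double left, double right)
    {
        double out = left_prev * right - right_prev * left;
        left_prev  = left;   // update the delay lines
        right_prev = right;
        return out;
    }
};
```

Feeding a pulse that moves from the left input to the right input yields a positive output, while the reverse ordering yields a negative output, which is what lets the behaviour infer the direction of maximal spiking.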
Algorithm 4.1 LGMDExtricateEMD::action()
if (magnitude(emd) >= Params::threshold())
{
   const int T = random(TurnArbiter::turn_step(), TurnArbiter::turn_max()) ;
   switch (octant(direction(emd)))
   {
      case 1: // obstacle in front and on the left
         C.drive = -1 ; C.turn =  T ; break ;
      case 2: // obstacle on the left
      case 3: // obstacle on the left
      case 4: // obstacle behind and on the left
         C.drive =  1 ; C.turn = -T ; break ;
      case 5: // obstacle behind and on the right
      case 6: // obstacle on the right
      case 7: // obstacle on the right
         C.drive =  1 ; C.turn =  T ; break ;
      case 8: // obstacle in front and on the right
         C.drive = -1 ; C.turn = -T ; break ;
   }
   . . .
}

4.1.2 VFF

The lgmd_extricate_vff behaviour replaces the range measurements used by extricate with LGMD spike rates. Thus, instead of being specified in terms of distances, the danger zone is specified in terms of a spike rate threshold. When a virtual locust's spike rate is above this threshold, we take it to mean that a collision is imminent. Thus, LGMD spike rates above the threshold are converted to repulsive forces. Conversely, spike rates below the threshold become attractive forces. The resulting vector sum of the attractive and repulsive forces produces a steering vector, which is used exactly as described in section 3.4.3.

To prevent this behaviour from becoming overly sensitive, we require the LGMD spike rates of a minimum number of locusts to exceed the above threshold before we engage the extrication algorithm. This minimum is specified via the count configuration setting. Thus, the total number of repulsive force vectors must exceed count before lgmd_extricate_vff will perform any extrication.

4.1.3 TTI

The TTIEstimator C++ class encapsulates the Bayesian time-to-impact state estimation filter described in section 2.3. When the lgmd_extricate_tti behaviour starts, it creates and attaches a new TTIEstimator object to each virtual locust.
Thus, each input LGMD spike rate is converted into a corresponding time-to-impact. We then combine these predicted times-to-impact with the robot's current velocity vector (obtained from the encoders) to determine the distances to obstacles in each virtual locust's direction. These distances are then converted into virtual force vectors based on a threshold, i.e., distances below the threshold act as repulsive forces whereas distances above the threshold act as attractive forces. The vector sum of these individual forces produces a steering vector whose quadrant decides the exact drive and turn commands to be issued (see section 3.4.3). As we did for lgmd_extricate_vff, we require the total number of repulsive force vectors to exceed a preconfigured count before engaging the extrication algorithm.

Additionally, we noted that when the robot was backing up as part of an overall extrication, the LGMDs would fire a small amount. The lgmd_extricate_tti behaviour often misconstrued this minor spiking as further actionable input, which induced extended bouts of reverse driving. We quelled these spurious extrications by introducing an interference_threshold parameter, which specifies a threshold speed that must be met before the behaviour's extrication algorithm will kick in. By setting this parameter to a small positive quantity such as 0.05 m/s or 0.1 m/s, we can ensure that lgmd_extricate_tti remains passive while the robot is backing up (wherein the low-level controller reports a negative speed). In effect, the interference threshold enables us to recognize a potentially ongoing extrication and prevents unnecessary preemptions of such events.

4.2 Local Navigation Task

To properly gauge the efficacy of the three LGMD-based obstacle avoidance algorithms described in the previous section, we designed a slalom course and subjected Lobot to a goal seeking task. Figure 4.2 shows pictures as well as a schematic bird's eye view of this course.
Lobot’s objective was to find its way from a fixed starting point to a goal location (or rather a goal rectangle) at the other end of the course. As it negotiated the obstacle course, the robot logged various metrics, which we shall present and analyze in the coming sections of this chapter. Before we get to that, however, let us briefly examine the behaviours that facilitate this navigation task. 4.2.1 Localization Module As the robot proceeds from start position to goal, we need to record its trajectory so that we can later compare how many extrication events took place, how long the robot took to reach the goal, how many times the LGMD-based obstacle avoidance behaviours failed to properly guide the robot, etc. Keeping track of a robot’s current location is a moderately complicated undertaking. Accordingly, Lobot has quite a few C++ classes devoted to localization. However, we only need an overview of three of them to understand how this large chunk of code works. The survey Behaviour The survey behaviour provides the core localization functionality. It uses the FastSLAM algorithm (Thrun et al., 2005, chap. 13). Thus, the robot could automatically build the map shown in figure 4.2(b) and this was our original intent. Unfortunately, we found the mapping process to be error-prone, requiring comparison against ground truth for proper interpretation of the trajectory data. Because we were not interested in the robot’s cartographic abilities per se and because the ground truth in this case was easy to obtain and input, we disabled the mapping component of the FastSLAM algorithm and utilized only its localization part, whichapplies 97 (a) Pictures of the slalom course. 88 24 168 32 86 41 66 82 26 92 75 21 84 100 80 3 79 38 63 Goal Area Initial Pose (b) A schematic bird’s eye view of the slalom course. All dimensions are in centimeters. Figure 4.2: The slalom obstacle course designed to test Lobot’s obstacle avoidance be- haviours. 
The robot is expected to find its way from a start position to a target area at the other end. Various metrics collected along the way serve as the basis for comparing the different algorithms. 98 the Monte Carlo Localization (MCL) algorithm (Thrun et al., 2005, chap. 8). That is, we provide a known map to the survey behaviour and use MCL to keep track of the robot. There are two important points about the survey behaviour’s implementation: • During initialization, it registers a SensorHook callback function with the Robot inter- face object (see section 3.3.2) to obtain the latest low-level odometry packets as soon as they become available to the high-level controller. • It overrides and reimplements Behavior::run using a thread signaling mechanism that allows it to invoke the action method in response to odometric updates rather than relying on the default delay-based loop shown in algorithm 3.4 on page 73. Without these enhancements to the default procedure, the robot could potentially accumu- late quite a bit of odometry before performing a SLAM (or, in this case, MCL) update, which can lead to highly inaccurate mapping and localization results. Finally, we note that the survey behaviour does not influence the robot’s motion in any way. It only keeps track of the robot and does not vote for any drive or turn commands. The Map Data Structure The Map class implements a shared (i.e., thread-safe), two-dimensional occupancy grid that keeps track of where obstacles are as well as the robot’s current pose. The cells in this grid hold probability values rather than simple Boolean flags indicating the presence or absence of obstacles. Whenthesurveybehaviourisdonewithamappingand/orlocalizationupdate,itrecords the latest results from the FastSLAM algorithm in the system-wide Map. For our naviga- tion task, since we only use FastSLAM’s MCL subsystem, we load a known map (viz., figure 4.2(b)) during initialization and it does not get updated. 
Instead, the survey behaviour only updates the robot pose at the end of each invocation of its action method.

Visualization is an important responsibility of the Map data structure. All the localization modules render relevant bits of state on the Map using the overlay rendering mechanism described in section 3.3.5. The left half of figure 3.10(b) on page 81 shows the various localization related behaviours and modules in action. The black shaded regions are the obstacles making up the slalom course; the red arrowhead shape represents the robot. Map takes care of these two things itself. The blue-green dots near the arrowhead's tip are the MCL algorithm's particles and are put there by the survey behaviour. The trail behind the arrowhead is output by the track behaviour (to be discussed shortly). The dashed rectangle in the upper area represents the goal and is rendered by the goal behaviour, which we will get to in section 4.2.2.

In addition to visualization support, the Map also provides the PoseHook callback mechanism to notify behaviours and other interested parties about updates to the robot's pose.

The track Behaviour

This behaviour is mostly cosmetic; we do not require it to be active for the localization to work. However, it does provide some useful visual feedback to users and takes care of some metrics logging, which makes it a worthwhile adjunct to the overall navigation task.

The track behaviour maintains the 25 most recent robot poses. As just mentioned, it draws a trail behind the robot's current pose on the Map so that we know how the robot got to its present location and heading. More importantly, every second, the track behaviour sends the robot's current speed to the metrics log. It also takes advantage of the Map::PoseHook mechanism to send the robot's latest pose to the metrics log as soon as it becomes available.
4.2.2 Goal Seeking

The goal behaviour allows users to specify a list of goals and then uses the localization module to drive the robot to each goal in the list. A goal is a rectangle specified in Map coordinates. The behaviour will direct the robot towards the center of each goal rectangle. However, the goal is considered satisfied when the robot is located anywhere within the target rectangle.

When the robot reaches a goal, the behaviour can be configured to continue on to the next goal or to pause the robot and wait for the user to resume normal operation. For the navigation and data collection task, we only have one goal, as shown in figure 4.2(b). Thus, when the robot reaches this goal, the experiment is concluded.

The goal behaviour guides the robot to the goal by applying the VFF algorithm* (Borenstein and Koren, 1989), with the goal location applying an attractive force. Unfortunately, we found that in some situations, the robot would get stuck due to clashes between the goal-seeking and extrication behaviours, which caused repetitive back-and-forth motions near certain obstacles. This led to inordinate delays in the completion of the slalom course and artificially inflated the number of extrications recorded; a serious problem because it made the LGMD-based avoidance algorithms appear much worse w.r.t. the baseline LRF-based extricate.

Luckily, it turned out that the structure of the slalom course rendered active goal-seeking unnecessary. Instead, the open_path behaviour's actions proved sufficient to guide the robot through the slalom. Consequently, we configured the goal behaviour to operate in passive mode, wherein it simply monitored the robot's current location to check if it was within the goal area or not. If so, it would stop the robot and record the time so that we could later calculate how long it had taken to get to the goal. In passive mode, the goal behaviour does not issue any motor commands.
4.3 Testing Obstacle Avoidance

4.3.1 Experimental Methodology

Section 4.1 described the three LGMD-based obstacle avoidance algorithms we have designed, viz., EMD, VFF and TTI. We tested them by deploying their respective extricate behaviours in the local navigation task detailed in section 4.2. Each run of the robot from start to goal constitutes one experiment.

(* Not to be confused with the lgmd_extricate_vff behaviour.)

For each algorithm, we ran 25 individual experiments. Furthermore, we used four different noise profiles for the LGMD spikes: no noise, 25Hz noise, 50Hz noise and 100Hz noise. This made for a total of 3×4×25 = 300 experiments for the LGMD-based avoidance algorithms.

Finally, to serve as a baseline for evaluating LGMD-based avoidance, we used LRF-based extrication, i.e., the extricate fundamental behaviour. Since LGMD spikes are not required for the LRF-based extricate, there was no question of additional batches of experiments for different noise levels. Thus, we only needed 25 runs for it. This brought the grand total to 325 individual experiments for evaluating the obstacle avoidance algorithms.

Note that Lobot operated autonomously in each experiment. During an experiment, we did not aid it in any way whatsoever. If it bumped into an obstacle or seemed to get stuck at some point, we let that happen. Of course, at the end of an experiment, we would bring the robot back to the start position and point it in the requisite direction prior to commencing the next run.

4.3.2 Performance Metrics

The following behaviours were active in each experiment:

• forward
• emergency_stop
• extricate
• open_path
• metrics
• survey
• track
• goal
• bump_counter

Naturally, while testing LGMD-based avoidance, we also had one of lgmd_extricate_emd, lgmd_extricate_vff or lgmd_extricate_tti running. In these cases, we configured extricate to operate in restart mode as described in section 3.4.3.
When the robot started off across the slalom course, the goal behaviour would log a message stating that goal seeking had begun. When the robot reached the goal, the behaviour logged another message to that effect. The difference between the time-stamps of these two messages allows us to calculate the time it took the robot to successfully negotiate the entire course.

Along the way, when the robot got close to obstacles, the emergency_stop behaviour would record the event with a log message specifying the robot's current pose. Similarly, when any of the extricate behaviours acted, they too would record the robot's current pose in the metrics log file.

The goal, emergency_stop and extricate behaviours recorded various important events. However, these events are not enough for reconstructing the entire trajectory from start to finish. For that, as already discussed in section 4.2.1, we used the track behaviour, which logged the robot's pose after each MCL update and also periodically sent the robot's current speed to the log file.

Finally, the one behaviour we have not described thus far is bump_counter, which monitors the iRobot Create's bump sensor with a Robot::SensorHook (see section 3.3.2). Whenever the low-level controller reported physical contact between the robot and something, bump_counter made a note in the metrics log of where this occurred.

Table 4.1 summarizes the different metrics we collected from each of the local navigation experiments. In addition to collecting these metrics in plain text log files, we also performed screen captures of the Lobot UI for each experiment. All the raw logs as well as the screen capture movies are available at http://ilab.usc.edu/mviswana/robolocust-data/.

4.3.3 Data Analysis

Point List Averaging

As we can see from table 4.1, the metrics logs record several different lists of points. The first step towards making sense of all this data is to somehow equalize the number of points across an entire dataset of 25 experiments.
• List of points specifying robot's trajectory from initial position to goal
• The total time taken to go from start to finish
• The robot's average speed from start to finish
• List of points where emergency stops occurred
• List of points where LRF-based extrications took place
• List of points where LGMD-based extrications took place
• List of points where the robot physically bumped into objects

Table 4.1: Summary of metrics collected by local navigation experiments. Different behaviours recorded the relevant pieces of data in "raw" form. An analysis program parsed the log entries and extracted the above "useable" information.

To help clarify, consider the list of points making up the robot's trajectory. One experiment might record a hundred points from start to finish; in another experiment, we might get 110 points; a third experiment may yield 85 points; and so on.

In order to find the average-case trajectory, we will have to "normalize" the cardinalities of all the trajectories so that they end up containing the same number of points. To do that, we consider the experiment with the median point list size as the "reference" experiment. For example, the dataset for the EMD algorithm with 50Hz noise resulted in the following list sizes for the robot trajectories:

           A    B    C    D    E
      1  307  263  262  296  265
      2  262  227  262  228  290
      3  245  309  315  241  301
      4  265  289  306  244  320
      5  333  271  291  262  300

The median of the above list of numbers is 271, which corresponds to experiment B5. Thus, for this dataset, experiment B5 is the reference experiment.

Now, for each trajectory point in the reference experiment, we find the nearest point in all the other experiments in that dataset. This will give us 24 new point lists, each containing 271 points. The situation is depicted below:

    A1′              A2′              ···   B5               ···   E5′
    (x_1, y_1)       (x_1, y_1)       ···   (x_1, y_1)       ···   (x_1, y_1)
    (x_2, y_2)       (x_2, y_2)       ···   (x_2, y_2)       ···   (x_2, y_2)
    ...              ...                    ...                    ...
    (x_271, y_271)   (x_271, y_271)   ···   (x_271, y_271)   ···   (x_271, y_271)

Note that we use the reference experiment B5 as-is. However, to "transform" the point list for experiment A1 to obtain the new point list A1′, we have to discard 36 points to get the required 271 points from the original 307. Similarly, to get A2′ from A2, we have to duplicate 9 points from the original 262. Since the reference experiment's point list size is the median, we will discard some points in half of the experiments making up a dataset and duplicate some points in the other half.

After we have equalized the point list sizes as described above, we create the final averaged trajectory by taking the centroids of the corresponding points. That is, in the point matrix shown earlier, we go row by row and compute the centroid of all the points on each row. This operation yields a trajectory containing 271 points averaged over the 25 experiments in the dataset.

The preceding discussion was in terms of the robot's trajectory. However, we can apply the same procedure to the other point lists to get the averaged list of points where emergency stops occurred, the averaged list of points where extrications took place, and so on. The only point list that we do not average in the manner just described is the list of points where the robot bumped into things because, for most experiments, it is an empty list and we usually get less than a handful of bumps across the entire dataset.

Evaluating the Algorithms

Given our test design, the most important measure of an LGMD-based obstacle avoidance algorithm's effectiveness is the number of failures the robot experiences as it moves across the slalom course from the start location to the goal. We classify failures into two types:

Hard failures: wherein the robot makes physical contact with an object.
Soft failures: wherein an LGMD-based version of the extricate behaviour is unable to prevent emergency_stop from halting the robot, which leads to a loss of spiking activity and requires an intervention by extricate to get the robot moving again so that LGMD spiking may resume.

Needless to say, we hope to observe few soft failures and even fewer hard failures.

From these two basic counts, which we obtain from the raw metrics logs, we can calculate the success rate for each LGMD-based avoidance behaviour as the following ratio:

    R = M / (L + M)                                                  (4.1)

where M = total number of LGMD-based extrications and L = total number of LRF-based extrications.

We apply equation (4.1) to each experiment and then compute the average across the entire dataset.

Apart from being used to calculate an LGMD-based avoidance algorithm's success rate, the number of LGMD extrication events itself is an important measure. Too few, and the robot will constantly need rescuing by the LRF-based extricate; too many, on the other hand, will have a detrimental effect on the robot's ability to perform other important tasks due to the high priority assigned to extrication behaviours. Ideally, we would like the LGMD-based extrications to occur at roughly the same locations and with about the same frequency as the LRF-based extrications.

Some other useful performance metrics include the average time it takes the robot to complete the course and the average speed it is able to maintain under the influence of different avoidance algorithms. For the average time-to-goal, we simply calculate each experiment's duration using the start and finish times reported by the goal behaviour and then take the mean of all the 25 durations for each dataset.

The average speed computation is a little more involved. The forward behaviour controls the robot's normal driving speed. For the local navigation task, we ran this behaviour in adaptive mode with speed range [0.150 m/s, 0.350 m/s] (see section 3.4.1).
However, the forward behaviour is not the only one that affects the robot's speed; emergency_stop and the extrication behaviours are also major clients of Lobot's SpeedArbiter. Furthermore, the low-level controller ramps up or down to the commanded speed so as to prevent lurching motions. As a result of so many different actors on this particular stage, several of the speed entries logged by the track behaviour will have negative or small values, indicating brief periods wherein the robot was backing up from an obstacle or speeding up to its normal forward driving speed. Such entries would incorrectly skew the average forward driving speed computation. To work around this problem, when analyzing the metrics logs for the robot speed, we ignore entries below 0.099 m/s. The remaining entries across the entire dataset are then used to compute the average robot speed and its standard deviation.

The next section presents various results derived from the analyses outlined here.

4.4 Results

Figure 4.3 shows the averaged trajectories for the different algorithms and noise levels. Each plot in figure 4.3 also indicates the bump and emergency stop locations plus the points where extrications took place. The appendix (page 134) shows the raw data from which we obtained figure 4.3's plots.

Figure 4.3: These plots show the averaged trajectories from start to finish for the different obstacle avoidance algorithms (rows: LRF, EMD, VFF, TTI) under varying noise levels in LGMD spiking (columns: no noise, 25Hz, 50Hz and 100Hz noise). Since the LRF-based extricate behaviour does not depend on LGMD spiking, there is no data for it under noisy spiking conditions. In addition to the robot trajectories, we also show the following lists of points: where the emergency_stop behaviour handled danger zone penetrations, where the LRF and LGMD-based extrications occurred, and where the robot bumped into things. Except for the bump points, all the other point lists are averaged as described in section 4.3.3. In each plot, the individual cells making up the grid are 25cm squares. The shaded gray regions represent the obstacles. The shaded rectangle in the top-left area of the plot is the goal. The red arrowhead in the bottom-left area shows the robot in its initial pose.

Figure 4.4 shows some statistics related to the performance of the LGMD-based avoidance algorithms.

Figure 4.4: These plots show how well the LGMD-based obstacle avoidance algorithms performed w.r.t. each other; the upper plot shows the number of LGMD extrication events and the lower plot the LGMD extrication success rate (%), for each algorithm at each noise level. For the sake of reference, the upper plot also marks the extricate behaviour's average number of extrications with a solid horizontal line and the standard deviation for this statistic with dotted lines. The TTI algorithm's performance under 25Hz Gaussian noise in the LGMD spikes is quite anomalous and is discussed in the text. For each bar in the lower plot, we apply equation (4.1) to each experiment and then calculate the mean and standard deviation for the entire dataset of 25 experiments.

    Source of Variance          df         SS        MS        F    R^2    f^2
    Algorithm (rows, r)          2  14,804.86  7,402.43  71.01 †    .25    .33
    Noise level (columns, c)     3   8,443.31  2,814.44  27.00 †    .14    .16
    Algorithm by Noise (r×c)     6   6,996.29  1,116.05  10.71 †    .12    .13
    Within groups (WG)         288  30,021.39    104.24
    Totals                     299  60,267.85

Table 4.2: Summary of two-way ANOVA computations for analyzing LGMD-based extrication success rate w.r.t. algorithm type and noise level. Since we conducted 25 experiments per algorithm per noise level, the total sample size is 25×3×4 = 300, which yields df_WG = 300−3×4 = 288. F-values marked with a '†' indicate a significant result at the .001 level. For the significance levels of the F ratios, we picked the entry closest to 288 for the denominator's degrees of freedom from the F-distribution tables found in Moore (2000). When an F-test indicated a significant effect, we measured the effect size by computing R^2, the coefficient of determination, and, from it, Cohen's f^2 to see if follow-up analysis would be worthwhile.

Particularly noteworthy is the anomaly in the TTI algorithm's number of extrications under 25Hz LGMD noise. There were five experiments out of the 25 for this algorithm and noise level that generated an enormous number of extrications. If we ignore these five experiments, then the number of extrications becomes 23±8 rather than 46±47 as plotted. The maximum number of extrications without the five "bad" runs is 44. The five bad runs, however, had the following numbers of extrication events:

    A3: 139    A4: 103    D5: 147    E3: 149    E4: 150

The above experiment labels (A3, E3, etc.) are w.r.t. the grid column and row labels shown in figure A.11 (page 146). We do not know why the algorithm behaved so out of its norm in these five instances.
Clearly, these five experiments throw off the performance profile of the TTI algorithm in the 25Hz noise case by a considerable amount. Nonetheless, we decided to retain these data as-is instead of rerunning the experiments to obtain better results.

To gauge whether the differences between the LGMD-based extrication success rates of the three algorithms are statistically significant, we performed a two-way ANOVA (see, for example, Bartz, 1988). Since we have three LGMD-based avoidance algorithms and four noise levels, we used a 3×4 factorial design for the ANOVA with the rows in the table of factors assigned to the EMD, VFF and TTI algorithms and the columns to the four noise levels, viz., 0Hz, 25Hz, 50Hz and 100Hz. Table 4.2 summarizes the ANOVA computations for the LGMD-based extrication success rate.

When an F-test for a factor revealed a significant effect, we computed a corresponding coefficient of determination, R^2, for it to determine how much of the variation in the response variable was due to that factor. We calculated R^2 from the following formula (Keren and Lewis, 1979):

    R^2 = SS_effect / SS_total

We then measured the effect size by computing Cohen's f^2 (Cohen, 1992):

    f^2 = R^2 / (1 − R^2)

Per convention, if the value of f^2 fell near .35, we deemed the effect a large one worthy of further analysis. We also take a closer look at cases where f^2 was near .15 (medium effects). But for f^2 ≤ .10, we considered the effect too small to warrant a follow-up to the ANOVA.

From the f^2 column in table 4.2, it is clear that the extrication strategy plays a crucial role in determining how successful an LGMD-based avoidance behaviour will be at getting the robot to steer clear of approaching obstacles. The amount of noise in the LGMD spikes also has a significant effect on the success rate. Moreover, there is some interaction between extrication algorithm and LGMD noise level that affects the final outcome of an obstacle avoidance event.
To further investigate these issues, we applied the Fisher-Hayter test of Least Significant Differences (Williams and Hervé, 2010), following the recommendations by Seaman et al. (1991). Because of the significant interaction between algorithm type and noise level, we performed pairwise comparisons of the success rate means for the three algorithms on a per-noise-level basis. Additionally, to get an idea of the overall performance differences between the three algorithms, we also compared the marginal means.

The MLSD (Modified Least Significant Difference) statistic required by the Fisher-Hayter test is computed using the formula:

    MLSD = q_{α,k−1,ν} × sqrt(MS_WG / n)

In this equation, q_{α,k−1,ν} is the α-level critical value of the Studentized range distribution for a range of k−1 and ν degrees of freedom. MS_WG is the within groups mean sum of squares entry from the tables summarizing the ANOVA computations and n is the number of experiments per algorithm per noise level (25 in our case). The degrees of freedom ν is calculated as k×(n−1).

Since we have three algorithms, for the pairwise comparisons of mean success rates for each algorithm, k will be 3. Thus, ν = 3×(25−1) = 72. This yields the following values of MLSD for α = {.05, .01, .001}:

    MLSD_.05  = 2.829 × sqrt(104.24/25) = 5.78
    MLSD_.01  = 3.762 × sqrt(104.24/25) = 7.68
    MLSD_.001 = 4.894 × sqrt(104.24/25) = 9.99

We used the tables published by Harter (1960) to obtain the above values of q.

The differences of the means of the success rates of the LGMD-based extrication algorithms must exceed these MLSD cutoffs in order to be considered significant. For each significant difference, to further gauge the magnitude of the effect implied by the difference, we calculated Cohen's d (Cohen, 1992):

    d = (X̄_1 − X̄_2) / s_p

where s_p = pooled standard deviation = sqrt(MS_WG).

Table 4.3 presents the results of the follow-up analysis described above. The procedure for gauging the effects of noise on the extrication success rate is similar.
Table 4.4 shows the pairwise comparisons across the ANOVA columns.

Figure 4.5 shows some statistics related to the robot's performance in the local navigation task. Tables 4.5 and 4.6 summarize the ANOVA computations for the average time-to-goal and average robot speed respectively.

As we can see from table 4.6, the forward driving speed is not affected either by the LGMD extrication strategy or the noise level in the LGMD spikes. We expect this to be the case because the forward behaviour controls this particular aspect of the robot's actions based on the distances to nearby obstacles without regard to any other factors. Since the slalom environment is static and fairly restricted, the forward behaviour will rarely have occasion to stray outside of a limited range of possible speeds.

    Noise level   Alg. 1   Alg. 2   X̄_1 − X̄_2       d
    0Hz           EMD      VFF           2.09
                  EMD      TTI          20.12 †    1.97
                  VFF      TTI          22.21 †    2.18
    25Hz          EMD      VFF           2.00
                  EMD      TTI          13.84 †    1.36
                  VFF      TTI          15.84 †    1.55
    50Hz          EMD      VFF          17.84 †    1.75
                  EMD      TTI          10.96 †    1.07
                  VFF      TTI          28.80 †    2.82
    100Hz         EMD      VFF            .31
                  EMD      TTI            .32
                  VFF      TTI            .01
    Marginals     EMD      VFF           5.56
                  EMD      TTI          11.15 †    1.09
                  VFF      TTI          16.71 †    1.64

Table 4.3: Follow-up analysis for the effect of algorithm type on the success rate of an LGMD-based obstacle avoidance behaviour. Differences marked with a '†' are significant at the .001 level. For these differences, the corresponding d value measures the effect size. By convention, d > .8 is considered a large effect.

Table 4.4: Follow-up analysis for the effect of spiking noise on the success rate of an LGMD-based obstacle avoidance behaviour. The MLSD cutoff points for α = {.05, .01, .001} are 6.94, 8.74 and 10.96 respectively, which correspond to differences marked '∗', '‡' and '†' respectively. For significant differences, Cohen's d measures the effect size. By convention, d > .8 is indicative of a large effect while d between .5 and .8 implies a medium effect.
    Algorithm    Noise 1   Noise 2   X̄_1 − X̄_2       d
    EMD          0Hz       25Hz          3.85
                 0Hz       50Hz         16.34 †    1.60
                 0Hz       100Hz        11.33 †    1.12
                 25Hz      50Hz         20.19 †    1.98
                 25Hz      100Hz        15.18 †    1.49
                 50Hz      100Hz         5.01
    VFF          0Hz       25Hz          3.76
                 0Hz       50Hz           .59
                 0Hz       100Hz        13.11 †    1.28
                 25Hz      50Hz          4.35
                 25Hz      100Hz        16.87 †    1.65
                 50Hz      100Hz        12.52 †    1.23
    TTI          0Hz       25Hz         10.13 ‡     .99
                 0Hz       50Hz          7.18 ∗     .70
                 0Hz       100Hz         9.11 ‡     .89
                 25Hz      50Hz         17.31 †    1.70
                 25Hz      100Hz         1.02
                 50Hz      100Hz        16.29 †    1.60
    Marginals    0Hz       25Hz          5.91
                 0Hz       50Hz          8.04 ∗     .79
                 0Hz       100Hz         5.11
                 25Hz      50Hz         13.95 †    1.37
                 25Hz      100Hz        11.03 †    1.08
                 50Hz      100Hz         2.92

Figure 4.5: These plots show how long the robot took to go from the slalom course's start position to the goal (upper plot, "Average Time-to-goal", in seconds) and how quickly it moved along the way (lower plot, "Average Robot Speed", in m/s), for each algorithm at each noise level. The solid horizontal line represents the average performance of the LRF-based extricate behaviour while the dotted horizontal lines show its standard deviation.

    Source of Variance          df         SS        MS       F    R^2    f^2
    Algorithm (r)                2     952.74    476.37    2.25
    Noise level (c)              3   6,344.72  2,114.91    9.97 †   .08    .09
    Algorithm by Noise (r×c)     6   6,732.66  1,122.11    5.29 †   .09    .10
    Within groups (WG)         288  61,104.27    212.17
    Totals                     299  75,134.39

Table 4.5: Summary of two-way ANOVA computations for gauging the performance of the LGMD-based avoidance algorithms w.r.t. the time it takes the robot to achieve the local navigation task's goal state. The factorial design of the ANOVA is the same as the one for the extrication success rate (table 4.2). F-values marked with a '†' indicate a significant result at the .001 level.
    Source of Variance          df            SS         MS      F    R^2    f^2
    Algorithm (r)                2     28,610.85  14,305.40    .45
    Noise level (c)              3     36,041.11  12,013.70    .38
    Algorithm by Noise (r×c)     6     22,197.16   3,699.53    .12
    Within groups (WG)         288  9,126,178.81  31,688.12
    Totals                     299  9,213,027.93

Table 4.6: Summary of two-way ANOVA computations for gauging the performance of the LGMD-based avoidance algorithms w.r.t. the robot's average forward driving speed during the local navigation task. For these computations, we converted all the speeds recorded in the metrics logs from m/s to mm/s to prevent floating point roundoff errors associated with squaring fractional numbers. That is why the SS and MS values in this table are so large.

In a similar vein, as evidenced by the non-significant F-ratio in the first row of table 4.5, the extrication algorithm does not affect the amount of time it takes the robot to work its way from the start position to the goal. That is, at any given level of LGMD noise, all three algorithms (viz., EMD, VFF and TTI) are equally matched. However, the noise level does affect the time-to-goal. There is also a significant interaction effect between the noise level and extrication algorithm that influences the time-to-goal. Nonetheless, the entries in the f^2 column of table 4.5 indicate that the sizes of these two effects are small and, therefore, do not require any further explanation.

To justify the above decision from another perspective, we can use the ω^2 statistic (see, for example, Olejnik and Algina, 2003):

    ω^2_c = (SS_c − df_c × MS_WG) / (SS_tot + MS_WG)
          = (6344.72 − 3 × 212.17) / (75134.39 + 212.17)
          = 0.076 = 7.6%

This tells us that only 7.6% of the variability observed in the response variable (time-to-goal) can be explained by the effect of the LGMD noise level. Similarly, for the interaction effect in table 4.5, ω^2_{r×c} = 7.2%. Thus, even though the F-test indicates a highly significant effect,
In fact, a visual inspection of the upper plot in figure 4.5 suggests that most of the variability is due to the somewhat anomalous 25Hz noise case. While the Fisher-Hayter test will certainly confirm (or disconfirm) this hypothesis, due to the small effect involved, it is doubtful that we will gain any additional insight beyond what has already been clarified by figure 4.5 and table 4.5, which is why it is reasonable to ignore the results of the omnibus F-test and skip the pairwise comparison of means in this case.

4.5 Discussion

4.5.1 The EMD Algorithm

As we can see from table 4.4 and figure 4.4, there is a precipitous drop in the EMD algorithm's success rate when the LGMD noise goes from 25Hz to 50Hz. Beyond the 50Hz mark, the algorithm's performance remains depressed but at the same level.

Moreover, from table 4.3, we see that the EMD and VFF algorithms are equally matched except when the noise is 50Hz. Thus, something happens when we go from 25 to 50Hz LGMD noise that causes the dramatic performance hit. Looking back at algorithm 4.1 (page 95), we see that the EMD algorithm only kicks in when the magnitude of the final EMD vector exceeds some pre-configured threshold. Here are the values we used for this threshold at the different noise levels:

    Noise (Hz):         0      25      50      100
    EMD Threshold:    5,000  27,500  75,000  150,000

Figure 4.6: Plot of EMD threshold for different noise levels. The numbers "attached" to the line segments indicate their slopes.

Figure 4.6 shows a plot of these values. As we can see, the change in slope of the line segments from 25Hz to 50Hz is 1000 units, while the other two slope differences are only 600 and 400 units. This large change in the threshold at the 50Hz noise level increased the damping effect of this threshold to a considerably greater extent than at the other noise levels. As a consequence, when the LGMD noise hit 50Hz, the lgmd extricate emd behaviour
[Figure 4.6 plot: EMD Threshold (×10³) against Noise (Hz); the slope labels on the three line segments read 900, 1900 and 1500.]

unnecessarily ignored many situations that it ought to have handled, thus resulting in the observed performance drop.

The obvious solution to this problem is to lower the threshold. Unfortunately, we arrived at the settings shown earlier via a process of trial and error. Needless to say, such hand-tuning is cumbersome and, therefore, quite suboptimal and error-prone, especially because the vector produced by the EMD algorithm is essentially a mathematical abstraction that is very difficult to relate to in a physical sense. A better approach then would be to have the robot automatically learn a function that returns appropriate values for this and other settings based on all relevant factors.

Another benefit of automatic tuning would be the near elimination of hard failures, i.e., points where the lgmd extricate emd behaviour failed to prevent the robot from bumping into obstacles. The EMD algorithm already has a very low incidence of hard failures. For the 0Hz and 25Hz noise levels, we only observed two bumps across all 25 runs∗ (see figure 4.3, page 108). However, at the higher noise levels, there were no bumps at all, not because the behaviour's performance became better but because the EMD threshold made it very conservative in its actions, which caused the more reliable LRF-based emergency stop and extricate behaviours to exert greater control.

With an automatic thresholding procedure, we can configure the behaviour to take into account not only the proper amount of "boldness" it needs to exhibit so as to maximize its

∗ Actually, in the 0Hz case, there is a third bump at the very beginning of one of the experiments (B2, figure A.2, page 137). However, that was due to a glitch in the low-level system. Occasionally, the iRobot Create reports spurious bumps and other events. We should have implemented some extra debouncing logic in the low-level controller to filter out these errors.
success rate and avoid having to be rescued by the LRF-based version of extricate, but also the requisite amount of "reticence" to keep the robot sufficiently far away from the objects in its environment so that they are not hit.

4.5.2 The VFF Algorithm

Of the three LGMD-based obstacle avoidance algorithms we developed, we found this one to be the most resistant to noise. Its success rate remained in the nineties up to 50Hz noise and dropped to about 77% when the LGMD noise went up to 100Hz (see figure 4.4 and table 4.4). The algorithm also has very favourable characteristics w.r.t. hard failures: two across all 25 runs for 0Hz noise, three for 25Hz and only one for the 50Hz case (figure 4.3). At the 100Hz noise level, we observed zero hard failures. Once again, this does not imply that the algorithm got better in the face of increasing noise; rather, it is an artifact of potential misconfiguration.

An advantage the VFF algorithm has over EMD is that its controlling parameters are physical quantities that are much easier to conceptualize and adjust. Nonetheless, we expect automatic parameter tuning will produce superior results.

Lest we conclude that, overall, VFF is the best LGMD-based extrication technique, we should point out an important caveat associated with its high success rate: the number of extrication events tends to be quite high (see upper plot in figure 4.4, page 109). Consequently, when we apply equation (4.1) (page 106) to calculate the success rate, we get a large number.

In section 4.3.3, we stated that, ideally, LGMD-based extrications should occur at roughly the same locations and with about the same frequency as LRF-based extrications. From figure 4.3 (page 108), we see that the lgmd extricate vff behaviour's extrication response is active right from the start all the way up to the goal. In practical terms, this leads to overly jittery driving.
That is, the robot constantly stops, backs up a little bit and then resumes forward driving despite there being no obstacles nearby to elicit such an action. Even when the extrication commands involve forward motion, the change in speed from the one commanded by the forward behaviour to the speed specified by the extricate speed setting perpetuates the staccato effect. It is unclear whether or not automatic tuning will eliminate this problem.

4.5.3 The TTI Algorithm

This algorithm's performance profile is truly baffling. When there is no noise in the LGMD spikes, its success rate is over 20 percentage points below both EMD and VFF (table 4.3, page 113). However, when we raise the noise level from 0 to 25Hz, the success rate goes up from 68.4% to 78.5%, still not on par with EMD or VFF, but the best it could achieve (on average). Further increasing the noise from 25 to 50Hz results in a drop in success rate to 61.2%. And then, at 100Hz noise, it goes up to 77.5%. Another odd aspect of this algorithm's performance is its high levels of variability (see the error bars in figure 4.4, page 109). Perhaps the only saving grace here is that at the highest noise level, all three algorithms were equally good (or, depending on one's philosophical outlook, equally bad).

To make matters worse, the number of hard failures was abysmally high. For the other two algorithms, we usually got two or three bumps at most per batch of 25 experiments. Without LGMD noise, the situation was similar, with the lgmd extricate tti behaviour experiencing three bumps. However, at the 25, 50 and 100Hz noise levels, it collided with obstacles 8, 5 and 7 times respectively.

Section 4.1.3 explained how this algorithm works. Basically, it does the same thing as the LRF-based extricate, viz., computing a steering vector from distance readings converted into attractive and repulsive forces.
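The extricate-style steering computation just described can be sketched as follows. The force model, gain and sign conventions here are illustrative stand-ins, not the actual constants used in the thesis code: each range reading becomes a repulsive force pointing away from the obstacle (stronger when closer), a fixed attractive force pulls the robot forward, and the vector sum yields a steering direction.

```python
# Sketch of a potential-field style steering vector: repulsive forces from
# range readings plus a constant attractive force along the heading.
import math

def steering_direction(readings, repulse_gain=1e5, attract=(1.0, 0.0)):
    """readings: (bearing_deg, distance_mm) pairs; returns a bearing in degrees."""
    fx, fy = attract                          # attraction along the heading
    for bearing, dist in readings:
        mag = repulse_gain / (dist * dist)    # repulsion falls off with distance^2
        rad = math.radians(bearing)
        fx -= mag * math.cos(rad)             # push away from the obstacle
        fy -= mag * math.sin(rad)
    return math.degrees(math.atan2(fy, fx))

# An obstacle close on the right (bearing -30 deg) steers the robot left:
print(steering_direction([(-30.0, 300.0)]) > 0)  # True
```

With no nearby readings, only the attractive force remains and the steering direction stays on the current heading.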
The only difference is that lgmd extricate tti obtains its range measurements from a Bayesian state estimator that uses artificially generated LGMD spikes as its input.

The danger zone for this behaviour is specified in terms of a single distance threshold plus a minimum count of repulsive force vectors. Here are the different values we used for these two parameters at the different noise levels:

    Noise (Hz):                  0     25     50      100
    Distance Threshold (mm):   385    425    432.5   432.5
    Count Threshold:             2      2      4       5

We do not see how the above variations in the configuration settings could possibly produce the strange zigzag pattern in the algorithm's performance. The only plausible explanation we can think of is that the δ parameter (explained in section 2.1) has something to do with lgmd extricate tti's travails. Recall that δ controls the amount of time remaining between the LGMD spike rate peak and the collision event to which the LGMD is responding. When δ is small, the peaks will occur very close to collision; when it is large, the peaks will be further away.

The analysis in section 2.6.1 concluded that LGMD-based obstacle avoidance would be better suited to sparsely populated areas because densely packed environments would force us to use low values of δ, which do not bode well for the predictive abilities of the LGMD-based time-to-impact estimator (see figure 2.10 on page 52). The slalom obstacle course (figure 4.2) that we used to test the three LGMD-based avoidance behaviours was a fairly confined area. Accordingly, after some trial and error, we settled on δ = .5s. The tight passageways and corners within the obstacle course further compounded the problem by restricting the robot's speed to about 0.170 m/s. As we can see from the plots for δ = .5s (figure 2.9, page 44), these settings are not conducive to getting the Bayesian TTI estimator to do its best work.
We believe that the low values of δ and robot speed resulted in high levels of uncertainty in the time-to-impact predictions, which, in turn, led to the high variability we see in the lgmd extricate tti behaviour's responses. For example, the best time we ever got was 48.754s, which happened in experiment D5, figure A.10 (page 145). The worst time, 183.763s, was also due to lgmd extricate tti: experiment E1, figure A.13 (page 148). Similarly, experiment C1, figure A.12 (page 147), had the worst LGMD extrication success rate of all: 0%; yet, among the total of 300 experiments conducted, experiment E3's 97.4% (figure A.11, page 146) was among the ten best success rates.

Unfortunately, as we did not record the metrics related to the time-to-impact predictions,∗ we have no way to validate the above conjecture that δ is behind lgmd extricate tti's strange performance profile. Therefore, we declare the TTI algorithm's results inconclusive and in need of further study.

On a positive note, an edge this algorithm has over EMD and VFF is that it generates explicit range estimates from LGMD spikes, which allows it, an optical flow based obstacle avoidance technique, to be integrated with mapping algorithms that rely on range sensing. This is why we feel that working to resolve the problems highlighted above would be a worthwhile effort. Moreover, since this behaviour's controlling parameters are distances, one can readily grasp how it ought to be configured. Nonetheless, due to its dependence on a complex Bayes filter that sports several of its own parameters, an automated configuration system that learns the appropriate settings is practically a necessity to obtain near-optimal actions from the algorithm.

∗ A grave oversight in retrospect.

Chapter 5

Conclusion

5.1 Evaluation of Project Goals

The main objectives of this thesis have been to:

• Develop a sensor model for the locust's Lobula Giant Movement Detector, a collision detection neuron.
• Use LGMD spikes emanating from multiple locusts to guide a mobile robot away from approaching obstacles.

• Thoroughly characterize the sensor model and obstacle avoidance algorithms to gauge their applicability to a bio-interfacing context, wherein live LGMD's will serve as the sensory inputs to a robot.

We addressed the first of these goals by observing that the LGMD spike rate is strongly correlated with the time-to-impact. However, since one cannot directly recover the time-to-impact given a spike rate, we devised a probabilistic sensor model that predicted the most likely time-to-impact from input spikes streaming in from virtual locusts.

We then conducted many controlled experiments to understand how the above model would react to various situations. These tests revealed two important points regarding the use of live LGMD's for mobile robot guidance. First, that slow robots would not be served very well by LGMD's. And second, that we will need to instrument the locusts' eyes somehow to make things appear closer than they are. Otherwise, the LGMD spike rate peaks will occur too close to collisions, seriously jeopardizing the practicability of LGMD's as collision detection sensors. Although, at the outset of the project, we felt these two conditions would be necessary, the detailed tests we ran bolster these hypotheses with ample evidence.

For the second goal, viz., combining the LGMD spike rates from multiple locusts to achieve directed, local navigation, we designed three different LGMD-based obstacle avoidance algorithms. Two of these used the raw LGMD spike rate as their input while the third generated range estimates using the Bayesian time-to-impact estimator we had developed. We tested these three algorithms quite extensively, using different levels of sensor (i.e., LGMD) noise to simulate actual spike rates we would expect to see if the robot were receiving its input from live LGMD's.
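The core idea behind the Bayesian time-to-impact estimator mentioned above can be illustrated with a heavily simplified sketch: maintain a discrete belief over candidate times-to-impact and reweight it by the likelihood of the observed spike rate. The spike-rate model below (rate inversely related to TTI) and all its parameters are stand-ins for illustration only, not the model actually developed in chapter 2.

```python
# Toy discrete Bayes filter over candidate times-to-impact (TTI).
import math

ttis = [0.5 * k for k in range(1, 11)]        # candidate TTIs: 0.5s .. 5.0s
belief = [1.0 / len(ttis)] * len(ttis)        # uniform prior

def expected_rate(tti, peak=200.0):
    return peak / (1.0 + tti)                 # stand-in spike-rate model (Hz)

def update(belief, observed_rate, sigma=15.0):
    # Reweight each hypothesis by a Gaussian likelihood, then renormalize.
    post = [b * math.exp(-((observed_rate - expected_rate(t)) ** 2)
                         / (2 * sigma ** 2))
            for b, t in zip(belief, ttis)]
    s = sum(post)
    return [p / s for p in post]

belief = update(belief, observed_rate=100.0)  # 100Hz matches expected_rate(1.0s)
best = ttis[belief.index(max(belief))]
print(best)  # 1.0
```

Feeding in successive spike-rate observations concentrates the belief around the most likely time-to-impact, which is the quantity the TTI behaviour converts into range estimates.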
Based on the detailed analyses of these experiments, at this time, we recommend using the VFF algorithm on a robot interfaced with live locusts. See sections 4.1.2 and 4.5.2 for further details about this algorithm.

One area where we fell short was the lackluster performance of the TTI algorithm (section 4.1.3), for which we had high hopes. However, as we pointed out in section 4.5.3, the performance data for this algorithm are, in fact, inconclusive and warrant further study. Despite the lack of a clear result in this one case, we believe that, in all other respects, we have accomplished all that we set out to achieve for the particular subportion of the Robolocust project on which this thesis is focused. Additionally, an ancillary benefit from this work has been the implementation of a versatile and extensible behaviour-based controller that can be deployed on multiple mobile robot platforms.

5.2 Comparisons to Other Similar Efforts

i Badia et al. (2007) built a robotic blimp equipped with two cameras, which were used to simulate a pair of LGMD's. When collisions were detected, the robot would veer left or right depending on which side's LGMD was producing more spikes, with the amount of turn being proportional to the difference in the spike rates. Yue et al. (2010) refer to this as the "steering wheel" algorithm. In their paper, they also describe a winner-take-all approach wherein one of the LGMD's suppresses the other's motor command.

Our work on LGMD-based obstacle avoidance is related to these projects. However, the above-mentioned authors are more concerned with elucidating the underlying biological principles and mechanisms operating within the locust's optomotor system. Our focus, on the other hand, is on applying the LGMD to mobile robots. We are especially interested in designing techniques that mesh seamlessly into a larger behavioural framework, assisting rather than disrupting the robot's other ongoing activities and goals.
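In sketch form, the "steering wheel" rule described above amounts to a one-line control law: turn away from the side whose LGMD fires more, by an amount proportional to the rate difference. The gain and sign convention (positive = turn left) below are our own illustrative choices, not values from the cited papers.

```python
# Sketch of the "steering wheel" rule: turn command proportional to the
# difference between the two LGMD spike rates.
def steering_wheel(left_rate_hz, right_rate_hz, gain=0.5):
    """Return a turn command in degrees; positive turns left."""
    # More spikes on the right LGMD means the threat looms on the right,
    # so veer left (positive), and vice versa.
    return gain * (right_rate_hz - left_rate_hz)

print(steering_wheel(40.0, 90.0))  # 25.0  -> veer left, away from the busier right LGMD
print(steering_wheel(90.0, 40.0))  # -25.0 -> veer right
```

The winner-take-all variant would instead compare the two rates and let only the stronger side's motor command through.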
This is why we rejected the simplistic approaches outlined above (see the discussion regarding lgmd extricate sim in the opening paragraphs of chapter 4). While the algorithms we have developed are slightly more computationally involved than a simple left-or-right decision, they are not so intense as to lose the advantages of reactivity. Moreover, they are much better suited to situations requiring goal-directedness.

The time-to-impact Bayes filter, in particular, offers a means of converting visual input into distance measurements that can then be injected into a number of different range sensing based algorithms that have been developed over the years and are in wide use in the field. To the best of our knowledge, no other insect-inspired obstacle avoidance system provides such a capability for flexible integration with other modules within a robot's controller.

However, a glaring deficiency in our robot in its present form is that we simulated LGMD spiking with a range sensor. Thus, we started off with range input, which we converted into an approximation of visual spiking activity. We then, in effect, regurgitated the original distance measurements from these simulated neural spikes.

We separated logically distinct functions into their own black boxes so that the components consuming the LGMD spikes would not know how that input was generated. As far as the sensor model and avoidance algorithms are concerned, their input could be coming from live locusts or cameras. Furthermore, we added noise to the artificial spikes to better simulate real spiking patterns. Nevertheless, because the sensing modality used to generate the input and then verify the output is the same, it leaves us open to the perfectly valid criticism that the results we obtained cannot be construed as being truly representative of an actual visually-based system.
It goes without saying that it would have been much better had we used cameras to obtain LGMD spikes and then verified the Bayesian state estimator's distance estimates with the laser range finder. As explained in chapter 2, we sidestepped that approach for the sake of expediency. However, in the coming months, we plan to rectify the above methodological weakness as described in the next section.

5.3 Future Work

A common theme that emerged from the analysis in section 4.5 is that hand-tuning of various parameters is undesirable. Moreover, for the TTI algorithm, we suspect that complex interplay between several different parameters led to the problematic and inexplicable performance profile we observed for it. Therefore, in the near term, we hope to implement some relevant machine learning algorithms to have the robot automatically converge on optimal settings for the different LGMD-based avoidance behaviours as well as the Bayesian time-to-impact estimator.

Once we have such a system in place, we would like to retest the obstacle avoidance algorithms and compare the results with the ones we obtained here to see if we get any substantial improvements. One thing to keep in mind for this next round of experiments would be to test in restrictive as well as more open environments and control the δ parameter (see section 2.1) for the LGMD spike generation to gauge whether or not it boosts the TTI algorithm's performance.

Recalling that the work we undertook here was part of a first step toward interfacing with real locusts, an obvious avenue of further research is the integration of the algorithms and software developed for this thesis with the ongoing efforts by other members in our lab toward the bio-interfacing aspects of the Robolocust project.
However, since we do not expect ubiquitous deployment of live locusts as sensory components on mobile robots, a more practical pursuit would be the development of a computational model of the LGMD that works with input frames streaming in from cameras. In chapter 2, we mentioned our initial efforts with the model by Stafford et al. (2007) and the trouble we faced with it. Over the next few months, we will mount two or three cameras (or perhaps a panoramic camera) on the robot and resume our investigation into vision-based LGMD modeling. An informal survey of the literature suggests that there are several alternatives to the algorithm by Stafford et al. We plan on implementing a couple of them and, if necessary, developing another.

Once we are able to generate LGMD spikes from visual input, we can test the time-to-impact estimator described in chapter 2 as well as chapter 4's avoidance algorithms and verify performance against the laser range finder's distance measurements.

Eventually, we hope to be able to apply the above vision-based time-to-impact estimator to a range sensing task such as occupancy grid mapping (Thrun et al., 2005, chap. 9) and compare the results with grids created using the laser range finder, stereo vision (Murray and Little, 2000) and other applicable techniques (e.g., depth estimates from visual sonar (Lenser and Veloso, 2003) or reinforcement learning (Michels et al., 2005)).

In the long run, if LGMD-based range sensing turns out to work well, we envision building a hardware sensor that imitates the LGMD and incorporates the Bayesian state estimator to output time-to-impact, thus producing a monocular, passive range sensing solution. Compared to active range sensors, we expect such a device to be smaller, less power-hungry and, hopefully, cheaper. Possible applications include collision sensing for automobiles and mobile robots.
If it can be made highly compact with minimal power requirements, it may also function as a useful aid to the visually impaired.

Bibliography

Alexandrescu, A. Modern C++ Design: Generic Programming and Design Patterns Applied. Pearson Education, 2001. ISBN 978-81-317-1115-6.

Anton, H., Bivens, I. and Davis, S. Calculus. John Wiley and Sons, 7th edition, 2002. ISBN 9971-51-431-1.

Arkin, R. C. Behavior-Based Robotics. MIT Press, 1998.

Bakkum, D. J., Gamblen, P. M., Ben-Ary, G., Chao, Z. C. and Potter, S. M. "MEART: The Semi-living Artist." Frontiers in Neurorobotics, 1(5):1–10, 2007.

Bartz, A. E. Basic Statistical Concepts. Macmillan Publishing Company, 3rd edition, 1988. ISBN 0-02-306445-5.

Bennet-Clark, H. C. "The Energetics of the Jump of the Locust Schistocerca gregaria." Journal of Experimental Biology, 63:53–83, 1975.

Bonin-Font, F., Ortiz, A. and Oliver, G. "Visual Navigation for Mobile Robots: A Survey." Journal of Intelligent and Robotic Systems, 53(3):263–296, November 2008.

Borenstein, J. and Koren, Y. "Real-time Obstacle Avoidance for Fast Mobile Robots." IEEE Transactions on Systems, Man, and Cybernetics, 19(5):1179–1187, September 1989.

Brooks, R. A. "A Robust Layered Control System for a Mobile Robot." IEEE Journal of Robotics and Automation, 2(1):14–23, March 1986.

Bureš, J., Petráň, M. and Zachar, J. Electrophysiological Methods in Biological Research. Czechoslovak Academy of Sciences, 1960.

Card, G. and Dickinson, M. H. "Visually Mediated Motor Planning in the Escape Response of Drosophila." Current Biology, 18(17):1300–1307, September 2008.

Cohen, J. "A Power Primer." Psychological Bulletin, 112(1):155–159, 1992.

Coombs, D., Herman, M., Hong, T. and Nashman, M. "Real-time Obstacle Avoidance Using Central Flow Divergence and Peripheral Flow." In Proceedings of the 5th International Conference on Computer Vision, pages 276–283, June 1995.

Crevier, D. AI: The Tumultuous History of the Search for Artificial Intelligence. Basic Books, 1993.
ISBN 0-465-02997-3.

de Marse, T. B., Wagenaar, D. A., Blau, A. W. and Potter, S. M. "The Neurally Controlled Animat: Biological Brains Acting with Simulated Bodies." Autonomous Robots, 11:305–310, 2001.

de Souza, G. N. and Kak, A. C. "Vision for Mobile Robot Navigation: A Survey." IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(2):237–267, February 2002.

Eckel, B. and Allison, C. Thinking in C++, Volume Two: Practical Programming. Pearson Prentice Hall, 2004. ISBN 0-13-035313-2.

Elfes, A. "Using Occupancy Grids for Mobile Robot Perception and Navigation." Computer, 22(6):46–57, June 1989.

Gabbiani, F., Krapp, H. G., Koch, C. and Laurent, G. "Multiplicative Computation in a Visual Neuron Sensitive to Looming." Nature, 420:320–324, 2002.

Gabbiani, F., Krapp, H. G. and Laurent, G. "Computation of Object Approach by a Wide-Field Motion-Sensitive Neuron." Journal of Neuroscience, 19(3):1122–1141, February 1999a.

Gabbiani, F., Laurent, G., Hastopoulos, N. and Krapp, H. G. "The Many Ways of Building Collision-sensitive Neurons." Trends in Neuroscience, 22(10):437–438, 1999b.

Gamma, E., Helm, R., Johnson, R. and Vlissides, J. M. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1994.

Gennery, D. B. "A Stereo Vision System for an Autonomous Vehicle." In 5th International Joint Conference on Artificial Intelligence, August 1977.

Graetzel, C. F., Medici, V., Rohrseitz, N., Nelson, B. J. and Fry, S. N. "The Cyborg Fly: A Biorobotic Platform to Investigate Dynamic Coupling Effects Between a Fruit Fly and a Robot." In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 14–19, September 2008.

Harter, H. L. "Tables of Range and Studentized Range." The Annals of Mathematical Statistics, 31(4):1122–1147, 1960.

Hashima, M., Hasegawa, F., Kanda, S., Maruyama, T. and Uchiyama, T.
"Localization and Obstacle Detection for a Robot Carrying Food Trays." In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 345–351, 1997.

Herr, H. and Dennis, R. G. "A Swimming Robot Actuated by Living Muscle Tissue." Journal of NeuroEngineering and Rehabilitation, 1(1), October 2004.

Hertz, G. "Cockroach Controlled Mobile Robot." Website, 2004. URL http://www.conceptlab.com/roachbot/.

i Badia, B. S., Bernard, U. and Verschure, P. F. M. J. "Non-linear Neuronal Responses as an Emergent Property of Afferent Networks: A Case Study of the Locust Lobula Giant Movement Detector." PLoS Computational Biology, 6(3), March 2010.

i Badia, B. S., Pyk, P. and Verschure, P. F. M. J. "A Fly-Locust Based Neuronal Control System Applied to an Unmanned Aerial Vehicle: The Invertebrate Neuronal Principles for Course Stabilization, Altitude Control and Collision Avoidance." International Journal of Robotics Research, 26(7):759–772, July 2007.

Indiveri, G. "Analog VLSI Model of Locust DCMD Neuron Response for Computation of Object Approach." In L. S. Smith and A. Hamilton, editors, Neuromorphic Systems: Engineering Silicon from Neurobiology. World Scientific, 1997.

Judge, S. J. and Rind, C. F. "The Locust DCMD, a Movement-detecting Neurone Tightly Tuned to Collision Trajectories." Journal of Experimental Biology, 200:2209–2216, 1997.

Keren, G. and Lewis, C. "Partial Omega Squared for Anova Designs." Educational and Psychological Measurements, 39:119–128, 1979.

Lenser, S. and Veloso, M. "Visual Sonar: Fast Obstacle Avoidance Using Monocular Vision." In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 886–891, October 2003.

Michels, J., Saxena, A. and Ng, A. Y. "High Speed Obstacle Avoidance Using Monocular Vision and Reinforcement Learning." In Proceedings of the 21st International Conference on Machine Learning, 2005.

Moore, D. S. The Basic Practice of Statistics. W. H. Freeman and Company, 2nd edition, 2000.
ISBN 0-7167-3627-6.

Moravec, H. "Towards Automatic Visual Obstacle Avoidance." In 5th International Joint Conference on Artificial Intelligence, August 1977.

Murray, D. and Little, J. "Using Real-Time Stereo Vision for Mobile Robot Navigation." Autonomous Robots, 8(2):161–171, April 2000.

Nakamura, T. and Asada, M. "Motion Sketch: Acquisition of Visual Motion Guided Behaviors." In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 126–132, 1995.

Newell, A. and Simon, H. A. "GPS: A Program that Simulates Human Thought." In E. A. Feigenbaum and J. Feldman, editors, Computers and Thought. McGraw-Hill, 1963.

Nowel, M. S. and Shelton, P. M. J. "A Golgi-electron-microscopial Study of the Structure and Development of the Lamina Ganglionaris of the Locust Optic Lobe." Cell and Tissue Research, 216(2):377–401, 1981.

Olejnik, S. and Algina, J. "Generalized Eta and Omega Squared Statistics: Measuring Effect Size for Some Common Research Designs." Psychological Methods, 8(4):434–447, 2003.

O'Shea, M. and Williams, J. L. D. "The Anatomy and Output Connections of a Locust Visual Interneuron: The Lobula Giant Movement Detector (LGMD) Neurone." Journal of Comparative Physiology, 92:257–266, 1974.

Pirjanian, P. "Behavior Coordination Mechanisms – State-of-the-art." Technical Report IRIS-99-375, Institute of Robotics and Intelligent Systems, University of Southern California, October 1999.

Potter, S. M. "Hybrots: Hybrid Systems of Cultured Neurons+Robots for Studying Dynamic Computation and Learning." In J. M. Carmena and G. Maistros, editors, Proceedings of the 7th International Conference on the Simulation of Adaptive Behaviour, Workshop on Motor Control in Humans and Robots: On the Interplay of Real Brains and Artificial Devices, 2002.

Reger, B. D., Fleming, K. M., Sanguineti, V., Alford, S. and Mussa-Ivaldi, F. A.
"Connecting Brains to Robots: An Artificial Body for Studying the Computational Properties of Neural Tissues." Artificial Life, 6:307–324, 2000.

Reichardt, W. "Evaluation of Optical Motion Information by Movement Detectors." Journal of Comparative Physiology A, 161:533–547, 1987.

Rind, F. C. and Bramwell, D. I. "Neural Network Based on the Input Organization of an Identified Neuron Signaling Impending Collisions." Journal of Neurophysiology, 75(3):967–985, 1996.

Rind, F. C. and Santer, R. D. "Collision Avoidance and a Looming Sensitive Neuron: Size Matters but Biggest is not Necessarily Best." In Proceedings of the Royal Society, Biological Sciences, volume 271, February 2004.

Rind, F. C. and Simmons, P. J. "Seeing What is Coming: Building Collision-sensitive Neurones." Trends in Neuroscience, 22(5):215–220, May 1999.

Rosenblatt, J. K. DAMN: A Distributed Architecture for Mobile Navigation. Ph.D. thesis, The Robotics Institute, Carnegie Mellon University, January 1997. CMU-RI-TR-97-01.

Santos-Victor, J., Sandini, G., Curotto, F. and Garibaldi, S. "Divergent Stereo in Autonomous Navigation: Learning from Bees." In IEEE CS Conference on Computer Vision and Pattern Recognition, 1993.

Seaman, M. A., Levin, J. R. and Serlin, R. C. "New Developments in Pairwise Multiple Comparisons: Some Powerful and Practicable Procedures." Psychological Bulletin, 110(3):577–586, 1991.

Si, A., Srinivasan, M. V. and Zhang, S. W. "Honeybee Navigation: Properties of the Visually Driven Odometer." Journal of Experimental Biology, 206:1265–1273, 2003.

Singh, S. and Digney, B. "Autonomous Cross-country Navigation Using Stereo Vision." Technical Report CMU-RI-TR-99-03, The Robotics Institute, Carnegie Mellon University, January 1999.

Srinivasan, M. V., Lehre, M., Kirchner, W. H. and Zhang, S. W. "Range Perception Through Apparent Image Speed in Freely Flying Honeybees." Visual Neuroscience, 6:519–535, 1991.

Stafford, R., Santer, R. D. and Rind, F. C.
"A Bio-inspired collision mechanism for cars: Combining insect inspired neurons to create a robust system." BioSystems, 87:162–169, 2007.

Stroustrup, B. The C++ Programming Language. Addison-Wesley, 2000. ISBN 0-201-70073-5.

Thrun, S., Burgard, W. and Fox, D. Probabilistic Robotics. MIT Press, 2005.

von Wahlde, R., Wiedenman, N., Brown, W. A. and Viqueira, C. "An Open-path Obstacle Avoidance Algorithm Using Scanning Laser Range Data." Technical Report ARL-MR-0715, Army Research Laboratory, Aberdeen Proving Ground, February 2009.

Wicklien, M. and Strausfield, N. J. "Organization and Significance of Neurons that Detect Change of Depth in Hawk Moth Manduca sexta." Journal of Comparative Neurology, 424(2):356–376, August 2000.

Williams, L. J. and Hervé, A. "Fisher's Least Significant Difference (LSD) Test." In N. J. Salkind, D. M. Dougherty and B. Frey, editors, Encyclopedia of Research Design, pages 491–494. Thousand Oaks (CA): Sage, 2010.

Woo, M., Neider, J. and Davis, T. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.1. Addison-Wesley, 2nd edition, 1997. ISBN 0-201-46138-2.

Yue, S., Santer, R. D., Yamawaki, Y. and Rind, F. C. "Reactive Direction Control for a Mobile Robot: A Locust-like Control of Escape Direction Emerges When a Bilateral Pair of Model Locust Visual Neurons are Integrated." Autonomous Robots, 28(2):151–167, February 2010.

Appendix

Raw Data for the Slaloming Experiments

Section 4.3 described how we tested the three LGMD-based obstacle avoidance algorithms by applying their respective behaviours as the extrication method in a local navigation task. Section 4.4 then presented various statistics detailing Lobot's performance in this task. Specifically, figure 4.3 (page 108) showed the averaged trajectories from start to finish along with the locations where the robot bumped into obstacles and where its extrication and emergency stop behaviours were engaged in motor activity.
To avoid unnecessarily breaking the flow of ideas in chapter 4, we deferred the presentation of the raw data from which figure 4.3 was generated. Here, we remedy that omission. Testing three algorithms under four different noise levels yielded 12 datasets, each comprising 25 individual experiments. The plain laser range finder based version of the extricate behaviour gave us another dataset. The remaining pages of this chapter are devoted to the raw data for these 13 datasets. Each set of plots appears on a page by itself. These plots are arranged in a 5×5 grid whose columns are labeled by letters (A–E) and rows by numbers (1–5). The raw datasets available from http://ilab.usc.edu/mviswana/robolocust-data/ are also arranged in similar grids. Thus, for example, to view the video of experiment C3 for some algorithm and noise level, select the C3 hyperlink for it on the above-mentioned web page.

[Figures A.1 through A.13 each present one dataset as a 5×5 grid of plots. In addition to the robot's trajectory from start to finish, each plot shows where emergency stops, extrications and bumps occurred. The individual cells making up the grid are 25 cm squares; the shaded gray regions represent the obstacles; the shaded rectangle in the top-left area of each plot is the goal; and the red arrowhead in the bottom-left shows the robot in its initial pose. Each figure carries a key for its point lists: bumps, emergency stops, LRF extrications and, for the LGMD-based algorithms, LGMD extrications.]

Figure A.1: Raw data for the LRF-based version of the extricate behaviour, from which the first row of figure 4.3 was obtained.
Figure A.2: Raw data for the EMD algorithm with no noise in the LGMD spikes.
Figure A.3: Raw data for the EMD algorithm with 25Hz noise in the LGMD spikes.
Figure A.4: Raw data for the EMD algorithm with 50Hz noise in the LGMD spikes.
Figure A.5: Raw data for the EMD algorithm with 100Hz noise in the LGMD spikes.
Figure A.6: Raw data for the VFF algorithm with no noise in the LGMD spikes.
Figure A.7: Raw data for the VFF algorithm with 25Hz noise in the LGMD spikes.
Figure A.8: Raw data for the VFF algorithm with 50Hz noise in the LGMD spikes.
Figure A.9: Raw data for the VFF algorithm with 100Hz noise in the LGMD spikes.
Figure A.10: Raw data for the TTI algorithm with no noise in the LGMD spikes.
Figure A.11: Raw data for the TTI algorithm with 25Hz noise in the LGMD spikes.
Figure A.12: Raw data for the TTI algorithm with 50Hz noise in the LGMD spikes.
Figure A.13: Raw data for the TTI algorithm with 100Hz noise in the LGMD spikes.
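The bookkeeping behind these datasets (three LGMD-based algorithms at four noise levels, plus the LRF-only baseline, each run as a 5×5 grid of experiments) can be sketched in a few lines. The dataset names below are illustrative only; the actual names on the Robolocust data page may differ.

```python
# Hypothetical enumeration of the 13 slaloming datasets: 3 LGMD-based
# algorithms x 4 noise levels, plus the plain laser range finder (LRF)
# baseline. Names are made up for illustration.
ALGORITHMS = ["EMD", "VFF", "TTI"]
NOISE_LEVELS_HZ = [0, 25, 50, 100]

datasets = [f"{algo}-{hz}Hz" for algo in ALGORITHMS for hz in NOISE_LEVELS_HZ]
datasets.append("LRF-baseline")  # the extricate behaviour driven by the LRF alone

# Each dataset's 25 experiments sit in a 5x5 grid whose columns are
# lettered A-E and whose rows are numbered 1-5, e.g. experiment "C3".
experiment_labels = [f"{col}{row}" for row in range(1, 6) for col in "ABCDE"]
```

This mirrors the grid addressing used on the data web page, where each label such as C3 is a hyperlink to the corresponding experiment's video.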
Abstract
The Lobula Giant Movement Detector (LGMD), a visual interneuron in the locust's brain, responds preferentially to objects approaching along collisional trajectories. The goal of the Robolocust project is to build a robot that interfaces with this neuron and uses the LGMD spikes to make steering decisions.
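The looming selectivity described in the abstract is often captured by a phenomenological model in which the LGMD's firing rate multiplies the angular edge velocity of the approaching object by a decaying exponential of its angular size (the eta-function of Gabbiani and colleagues). Whether the thesis uses this exact formulation is not stated in the abstract; the sketch below, with made-up parameter values, merely illustrates the characteristic pre-collision response peak.

```python
import math

def eta_response(l=0.05, v=2.0, t_collision=1.0, alpha=3.0, dt=0.001):
    """Sample f(t) = dtheta/dt * exp(-alpha * theta) for an object of
    half-size l metres approaching at v m/s; all values illustrative."""
    times, rates = [], []
    prev_theta = None
    t = 0.0
    while t < t_collision - 0.005:
        # Full angular size subtended by the approaching object.
        theta = 2.0 * math.atan(l / (v * (t_collision - t)))
        if prev_theta is not None:
            dtheta = (theta - prev_theta) / dt  # angular edge velocity
            rates.append(dtheta * math.exp(-alpha * theta))
            times.append(t)
        prev_theta = theta
        t += dt
    return times, rates

times, rates = eta_response()
peak_t = times[rates.index(max(rates))]
# In this model the response peaks before collision, when the angular
# size crosses a fixed threshold; here that works out to roughly
# t_collision - alpha * l / v.
```

The useful property for obstacle avoidance is that the peak anticipates impact, giving a robot time to steer away before the collision actually occurs.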
Asset Metadata
Creator
Viswanathan, Manu
(author)
Core Title
Mobile robot obstacle avoidance using a computational model of the locust brain
School
Viterbi School of Engineering
Degree
Master of Science
Degree Program
Computer Science (Robotics)
Publication Date
01/26/2011
Defense Date
01/10/2011
Publisher
University of Southern California (original); University of Southern California. Libraries (digital)
Tag
Bayesian state estimation,collision detection,insect vision,locust LGMD,mobile robotics,OAI-PMH Harvest,obstacle avoidance,probabilistic sensor model,time-to-impact
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Itti, Laurent (committee chair), Celikel, Tansu (committee member), Schaal, Stefan (committee member), Sukhatme, Gaurav S. (committee member)
Creator Email
mviswana@usc.edu,mvnathan@att.net
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m3629
Unique identifier
UC1155625
Identifier
etd-Viswanathan-4280 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-433779 (legacy record id),usctheses-m3629 (legacy record id)
Legacy Identifier
etd-Viswanathan-4280.pdf
Dmrecord
433779
Document Type
Thesis
Rights
Viswanathan, Manu
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu