VISUOMOTOR COORDINATION IN ANURANS, MAMMALS, AND ROBOTS

by

Jim-Shih Liaw

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
Doctor of Philosophy (Computer Science)

December 1993

Copyright 1993 Jim-Shih Liaw

UMI Number: DP22870. All rights reserved. INFORMATION TO ALL USERS: The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. UMI DP22870. Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346.

UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007

This dissertation, written by Jim-Shih Liaw under the direction of his Dissertation Committee, and approved by all its members, has been presented to and accepted by The Graduate School, in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY. Dean of Graduate Studies. Date: November 29, 1993. DISSERTATION COMMITTEE: Chairperson.

To my wife Cissy

Acknowledgments

Looking back at my journey through the Ph.D. and the completion of this thesis, I feel very fortunate to have been associated with so many people who made this experience fulfilling and joyful. My deepest appreciation goes to Prof. Arbib for providing such an exciting environment, inspiring me to ask more questions, and guiding me in the search for answers. I have benefited not only from his being a constant source of knowledge himself, but also from his successful efforts in making the accumulation of knowledge in the Brain Simulation Laboratory (BSL) smooth and easy, and in creating opportunities to collaborate with researchers from around the world. I thank my committee members, Prof. Blum for teaching me the mathematical foundations of neural networks and Prof. Herrera for showing me the neurobiology of frogs and for his advice on developing my research career. The collaboration with Prof. Weerasuriya at Mercer University on motor control has been most fruitful and enjoyable. His knowledge of neurobiology and the experience of conducting behavioral experiments with him in his laboratory broadened my view of neuroscience. The intriguing comments of Prof. Gaillard of the University of Poitiers, France, inspired the development of a mechanism to estimate the absolute speed of a looming object. I thank the colleagues in the "frog group" at BSL, Bill Betts, Fernando Corbacho, Mathew Lamb, Hyun Bong Lee, Lucia Simo, and DeLiang Wang, for constant and at times lengthy discussions and exchanges of ideas. The retina model developed by Jeff Teeters and extended by Fernando and Hyun Bong greatly facilitated my research. This was made possible by the Neural Simulation Language, NSL, developed by Alfredo Weitzenfeld. I thank Lucia for teaching me neuroanatomy.
It was the fellow students, Amanda Bischoff, Peter Dominey, Andy Fagg, Jean Marc Fellous, Bruce Hoff, Irwin King, Nicolas Schweighofer, and Reza Shadmehr, who created the exciting and enjoyable atmosphere for studying, researching, and growing. My special thanks go to Irwin for working together on motion perception, and Andy on robotic application. I'd like to thank Paulina Tagle for her assistance in administrative matters throughout the years. Lastly, I want to thank my wife, Cissy, for her love, patience and support.

Table of Contents

Dedication
Acknowledgments
List of Figures
Abstract
Chapter 1  Introduction
Chapter 2  Motion Perception in Anurans
    2.1 The computational requirements
    2.2 Neural network model for avoiding looming objects
        2.2.1 The basic cell types
        2.2.2 The looming-detection network
        2.2.3 Computing the 3-dimensional motion
        2.2.4 The escape direction
        2.2.5 Motor schema selection: ducking vs. jumping
    2.3 Computer simulation
        2.3.1 Escape direction based on stimulus position and direction
        2.3.2 The effect of stimulus speed and size
        2.3.3 Continuity of a looming stimulus
        2.3.4 Stimulus/background contrast
        2.3.5 Stimulus distance and size
        2.3.6 Motor schema selection
    2.4 Discussion
Chapter 3  Time to Contact or Urgency to Act
    3.1 The Computational Issues
        3.1.1 Defining the problem
        3.1.2 Outline of a solution
    3.2 The Firing Range
        3.2.1 Starting distance
        3.2.2 Stimulus size
        3.2.3 Stimulus speed
        3.2.4 Worm-like stimulus
    3.3 Computing Depth by Neural Modulation
    3.4 Discussion
Chapter 4  Motion Perception: Mammalian Visual Systems
    4.1 The Issues
    4.2 Hierarchy in the Mammalian Visual Pathway
    4.3 Vector Analysis of Optical Flow Field
    4.4 A Neural Network Model
        4.4.1 Looming stimuli
        4.3.2 Receding stimuli
        4.3.4 Rotational motion
    4.5 Implementation and Simulation
        4.5.1 Neural network model for motion perception
        4.5.2 Computer simulation
            Motion in depth
            Rotational motion
    4.6 A Comparison of Anuran and Mammalian Motion Perception
Chapter 5  Snapping Motor Pattern Generation
    5.1 Introduction
    5.2 Controlling Single Muscle Activation
        5.2.1 Neurophysiology of the premotor neurons
        5.2.2 Exploring the premotor space
            Onset time
            Rising rate
            Peak amplitude
            Duration
            Total activation
            Decay rate
    5.3 Anuran Snapping: A Case Study of Motor Coordination
        5.3.1 The myology of snapping
        5.3.2 The neurophysiology of snapping
        5.3.3 Synchronizing Motor Synergies
            Sequential activation
            Parallel activation
    5.4 On-Line Correction
        5.4.1 Controlling distance by mediation
        5.4.2 Controlling distance by modulation
    5.5 Behavioral Study of Motor Coordination
        5.5.1 Correlation between lunging and snapping variations
        5.5.2 Mechanisms for Error Correction
        5.5.3 On-Line Estimation of Distance
    5.6 Modulation by Afferent Feedback
    5.7 Discussion
Chapter 6  Sensorimotor Transformation: From Animal to Agent
    6.1 Application of the Looming Avoidance Model to Robot Control
    6.2 Detecting A Looming Object
    6.3 Obstacle Avoidance
        6.3.1 Related work
        6.3.2 Technical considerations
            Assumptions
            Performance measurements
        6.3.3 Sensorimotor transformation
            Motor heading map
            Size constancy via motion parallax
    6.4 Discussion
Chapter 7  Conclusion
Bibliography
Appendix
    A. The leaky integrator model
    B. The mathematical formulation of the model

List of Figures

2.1  The escape direction
2.2  Looming-sensitive neurons
2.3  Looming-detection model
2.4  Synaptic connectivity of T3 neurons
2.5  Detection of 3-D motion
2.6  Gating of the tectal projection onto the motor heading map
2.7  Escape direction
2.8  Stimulus speed and avoidance response rate
2.9  Stimulus size and avoidance response rate
2.10 Motor schema selection
2.11 Various synaptic connections for T3 neurons
3.1  Onset distance vs. starting distance
3.2  Onset distance vs. starting distance for 4 square objects
3.3  Visual angle at onset distance vs. starting distance for 4 square objects
3.4  Onset distance vs. stimulus speed for square objects
3.5  Visual angle at onset distance vs. stimulus speed for square objects
3.6  Onset distance vs. starting distance for worm-like stimulus
3.7  Onset distance vs. stimulus speed for worm-like stimulus
3.8  Integration with modulation
3.9  Absolute speed of looming stimulus
4.1  The hierarchical organization of a neural network for detecting looming patterns
4.2  The synaptic weight masks
4.3  The flow pattern generated by a looming object
4.4  The structure of a detector of counter-clockwise rotational stimuli
4.5  The synaptic masks for direction selective neurons
4.6  Two expanding patterns that eventually overlap each other
4.7  Two shrinking patterns with identical shapes and locations
4.8  Two rotational stimuli in phase and angular velocity on intersecting path
5.1  Basic MPG module
5.2  Simulation of the basic MPG module
5.3  Exploration of the premotor space
5.4  Snapping in anurans
5.5  Sequential network of snapping MPG
5.6  Behavior of sequential model
5.7  Parallel network of snapping MPG
5.8  Behavior of parallel model
5.9  A model of feedback control in motor coordination
5.10 Controlling snapping distance via mediation
5.11 Controlling snapping distance via modulation
5.12 Compensation of long lunge by shortened tongue movement
5.13 Lateral variability in snapping
5.14 Variation in lunging and tongue extension
5.15 Time to contact at mouth opening
5.16 Distance to target at mouth opening
5.17 Time delay from onset of lunge to start of mouth opening
5.18 Shaping of MPG by afferent feedback
5.19 Feedback from motor system to sensory center
6.1  Experimental setup for robot navigation
6.2  Collision layer from experiment #1
6.3  Experiment #2
6.4  Experiment #3
6.5  Experiment #4
6.6  Avoiding four obstacles
6.7  Estimating depth based on motion parallax
Abstract

The objective of this thesis is to study visuomotor coordination, including issues of motion perception, sensorimotor transformation, and motor pattern generation. The problems of computing the trajectory of a looming stimulus in 3-dimensional space and estimating its absolute velocity form the core of the study of motion perception. A model of the anuran visual system capable of performing these two functions is presented. The detection and localization of a looming stimulus are carried out by a population of neurons which integrate signals from the expanding edges. The direction of the looming stimulus is computed by monitoring the shift of the peak of neuronal activity in this population. The absolute velocity of a looming stimulus is computed by integrating the activity of the looming-sensitive neurons. After two integration processes which are modulated by the activity of the looming detectors, a signal that depends only on the true speed of looming stimuli can be obtained. A model for extracting looming, receding, and rotating patterns from the optical flow field is developed to compare and contrast the anuran and mammalian visual systems. The mammalian model is constructed within a mathematical framework for integrating motion parameters of different levels of complexity. The anuran model is applied to the task of obstacle avoidance for an autonomous robot. The capability and reliability of the model in handling real-world visual data, and its capability of achieving size constancy, are demonstrated in this experiment. A model of the motor pattern generator controlling tongue snapping in anurans is developed to study issues concerning motor control. The model allows us to explore the premotor space in search of ways to control muscle activity and coordinate motor synergies.
The modeling study also led to several biological experiments which provide data supporting flexibility in snapping and allow us to discern several control strategies for coordinating body and tongue movements in a high-velocity, high-accuracy prey-catching behavior.

CHAPTER 1
INTRODUCTION

The objective of this thesis is to study visuomotor coordination. The scope of this study is defined along two dimensions. In the dimension of functionality, the issues concern motion perception, sensorimotor transformation, and motor pattern generation. In the dimension of subject species, the primary focus is on anurans, with comparative studies of similar systems in mammals and applications to robot control.

More specifically, the problems of computing the trajectory of a moving stimulus in 3-dimensional space and estimating its absolute velocity from the expanding retinal image form the core of the study of motion perception. A model of the anuran visual system capable of performing these two functions, without information about the optical flow field, will be presented. To compare and contrast it with the mammalian visual system, a model for extracting looming, receding, and rotating patterns from the optical flow field is developed. A mathematical framework for integrating motion parameters of different levels of complexity is formulated as the basis for the mammalian model. The anuran model is applied to carry out the task of obstacle avoidance for an autonomous robot. The capability and reliability of the model in handling real-world visual data are tested in this experiment.

On the topic of motor control, a model of the motor pattern generator controlling tongue snapping in anurans is developed. The model is used to explore the premotor space in order to find the parameters and strategies for controlling muscle activity and motor coordination. The modeling study led to several biological experiments which provide data supporting flexibility in snapping and enable us to discern several control strategies for coordinating body and tongue movements in a high-velocity, high-accuracy prey-catching behavior.

The model for motion perception in anurans is presented in Chapter 2. This model is based on neurobiological data concerning how frogs avoid a looming stimulus. Avoiding looming objects (possible predators) is essential for the survival of animals. This thesis presents a neural network model to account for the detection of and response to a looming stimulus. The generation of an appropriate response involves five tasks: detection of a looming stimulus, localization of the stimulus position, computation of the direction of the stimulus movement, determination of the escape direction, and selection of a proper motor action. The detection of a looming stimulus is based on the expansion of the retinal image. The spatial location of the stimulus is encoded by a population of neurons. The direction of the looming stimulus is computed by monitoring the shift of the peak of neuronal activity in this population. The signal encoding the stimulus location is gated by the direction-selective neurons onto a motor heading map which specifies the escape direction. The selection of a proper action is achieved through competition among different groups of motor neurons.
Our model exhibits many features that are consistent with animal behavior, such as dependence on the speed and size of a looming object and sensitivity to stimulus continuity and contrast reversal. Some of these features are common to the visual systems of many species while others are unique to anurans. Through such comparisons, the model provides some insight into why the visual systems of mammals and anurans share many basic properties yet differ in some peculiar ways. For example, the frog is unable to detect a white object approaching against a black background. This is due to the preference of the retinal ganglion cells for dark moving stimuli. Moreover, the model provides an account for the observation that frogs fail to respond to discontinuous looming patterns such as an expanding ring or donut.

Knowing the time to collision (Tc) with a looming object is important for a variety of behaviors. The issues of extracting this visual parameter and the role it plays in guiding behavior have attracted the attention of a large number of researchers. Most of this work is centered around a mathematical notion, tau (τ), which under certain conditions is equivalent to Tc. However, the question of how the nervous system computes this visual parameter is still unanswered. In Chapter 3, a mechanism for computing the absolute velocity of a looming stimulus regardless of its size or starting distance is proposed. This is achieved by integrating the activity of the looming-sensitive neurons. Two features of the looming detectors are crucial for the integration process. First, the looming detectors respond to an approaching object only when it enters a specific range. The boundary of this range depends on the size of the stimulus and hence does not signify the true distance of the stimulus. The second feature is the preference of the looming detectors for smaller stimuli. After two integration processes which are modulated by the activity of the looming detectors, the activity of the second integrator neuron converges to a curve that correlates well with the true speed of looming stimuli, independent of their size or the distance at which they start approaching. The activity of the second integrator neuron provides a signal of urgency to act. Such a signal has several advantages over Tc. In particular, its flexibility greatly facilitates adaptation and learning. The biological plausibility of the two integrator neurons is also discussed.

Chapter 4 presents a study that seeks to unify theoretical and experimental approaches to research on motion perception. In particular, the focus is placed on issues concerning approaching, receding, and rotating motion. The mathematical analysis of the optic flow field often involves temporal and spatial differentiation to obtain information about the flow field. However, differentiation introduces noise and may not be biologically plausible. Experimental data, on the other hand, have revealed many neuronal populations sensitive to various aspects of motion such as looming, rotation, and time-to-contact. Although the neural mechanisms underlying such motion perception are unknown, it has been shown that the biological visual system relies heavily on its sophisticated capability for directional selectivity. In this study, we first develop a mathematical formulation based on Green's theorem and Stokes's theorem which demonstrates that the two approaches to motion perception are mathematically equivalent.
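The formulation itself is developed in Chapter 4 and is not reproduced here; as a hedged illustration of why such an equivalence can hold, the flux and circulation forms of Green's theorem relate measurements of a flow field $\mathbf{v}$ taken only on a closed contour $C$ to area integrals of its divergence and curl over the enclosed region $S$:

$$\oint_C \mathbf{v}\cdot\mathbf{n}\,ds \;=\; \iint_S (\nabla\cdot\mathbf{v})\,dA, \qquad \oint_C \mathbf{v}\cdot d\boldsymbol{\ell} \;=\; \iint_S (\nabla\times\mathbf{v})\cdot\hat{\mathbf{k}}\,dA .$$

A detector that merely sums the outward (or tangential) components of the flow around the boundary of its receptive field therefore measures the same quantity as one that differentiates the flow field and integrates its divergence (expansion, i.e., looming) or its curl (rotation) over the interior; this is the sense in which a boundary-based neuronal scheme and a differential scheme can coincide.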
We then build a neural network model, based on this mathematical formulation, that is capable of detecting motion in depth and rotation. The availability of the optical flow field gives mammalian visual systems functional capabilities beyond those of anurans. Lacking optical flow information, the anuran is not capable of detecting a receding object, nor can it distinguish between looming and certain rotating patterns.

The study of motor coordination in the snapping behavior of anurans is given in Chapter 5. The transformation of sensory signals into appropriate spatio-temporal patterns of activity in motoneurons is postulated to be carried out in part by a motor pattern generator (MPG). The chapter describes a biologically constrained neural network model of such an MPG to (a) explore ways in which a physiologically identified push-pull mechanism built into the MPG can be used to generate and synchronize motor synergies, (b) offer a hypothesis on on-line correction of snapping distance based on visual feedback, and (c) investigate the role of afferent feedback in the two-way information flow between sensory centers, motor circuits, and the periphery. In particular, a hypothesis is postulated in which distance estimation (the urgency signal) based on looming perception plays a role in the on-line correction. This hypothesis provides a crucial link between the study of the perceptual and motor systems.

Snapping in anurans has long been described as a sequence of highly stereotypical, ballistic movements with little or no variability. The high degree of accuracy in prey-catching is thought to be achieved by aiming of the head which is sufficiently precise that ballistic tongue snapping will hit the target. However, computer simulation of our model of the MPG for controlling jaw and tongue muscles demonstrated a capability for tunable, rather than ballistic, control of snapping. This led to the design of a series of experiments to test the hypothesis that prey-catching involves variability in head movements which is compensated by controlling the tongue during snapping, and to elucidate the parameters used in such control. The data obtained from frame-by-frame analysis of snapping in anurans provide strong support for the hypothesis. Furthermore, they allow us to discern several plausible strategies for coordinating the body, head, mouth, and tongue movements involved in prey-catching.

The problem of computing 3-D motion is a challenging one even when optical flow information is available. The model of anuran motion perception provides a simple and efficient way of performing the task. Chapter 6 describes an experiment in which the model is applied to detect obstacles and generate a detour path. The application to robot control demonstrates that this mechanism is robust and reliable in handling real image data in the presence of noise and occlusion. Sensorimotor transformation is a major topic in this experiment. The motor map first introduced in Chapter 2 provides a substrate for transforming the sensory signal of obstacles into an appropriate motor heading. Another difficult task confronting the model is to respond to a small obstacle at a closer location but not to a larger one at a greater distance, even though the latter casts a larger image on the camera. The model demonstrates such size constancy qualitatively, based on motion parallax.
CHAPTER 2
MOTION PERCEPTION IN ANURANS

2.1 The computational requirements

Motion perception is crucial to the survival of the animal, and biological visual systems are efficient in analyzing and interpreting the intensity changes in the visual field produced by dynamic stimuli. In the natural environment, the animal must infer motion in 3-D space from the 2-D retinal images of a moving stimulus. In a psychophysical experiment, Schiff (1965) demonstrated the capability of detecting motion in depth in many animals. By magnifying a shadow cast on a monitor screen, he observed avoidance responses in all subjects tested, including crabs, frogs, chicks, and humans. A similar ability has also been observed in lower animals. For instance, a flying insect tethered and suspended from above, when exposed to a rapidly expanding pattern, will extend its legs as if it were getting ready to land (Goodman 1960, Braitenberg and Taddei-Ferretti 1966). Such capability has been studied intensively in humans (e.g., Regan and Beverley 1978, Schiff and Oldak 1990, Braddick and Holliday 1991, Freeman and Harris 1992). In a series of physiological experiments, it has been demonstrated in mammals that expanding patterns suffice to elicit a response in neurons sensitive to motion in depth (Beverley and Regan 1973 in cats; Cynader and Regan 1978, Motter and Mountcastle 1981, Saito et al. 1986, Tanaka and Saito 1989 in monkeys).

Figure 2.1: The escape direction. When a looming stimulus is on a collision course with a frog, the escape direction of the frog is a compromise between the forward direction and that away from the looming object (A, B). If the stimulus is not on a colliding trajectory, the frog will jump in such a direction as to "cut back" behind the looming object (C) (adapted from Ingle and Hoff, 1990). In each figure, the rays provide a histogram of escape directions for the given direction of approach of the looming stimulus.

Different visually elicited avoidance responses are observed in frogs depending on the stimulus situation (Ingle 1976, Ingle and Hoff 1990, Ewert 1984, Cobas and Arbib 1991, 1992), but here we concentrate on the response to a looming object, which involves issues concerning motion perception in general. In this behavior, when the approaching object is small and from the upper visual field, frogs will duck. If the approaching object is large and on a collision course, frogs jump away from it, with the direction of such a jump being a compromise between the forward direction and that away from the threat. When the object is not on a collision course, frogs tend to cut behind it (Fig. 2.1). Such adaptive behavior requires the computation of the trajectory of a stimulus in 3-dimensional space and the transformation of that percept into appropriate motor signals.

Furthermore, the success of interacting with a moving object depends on the timing of the action. Animals can use motion cues to guide their behavior, as exemplified by the correlation between the velocity of an expanding pattern and the landing response observed in flying insects (Coggshall 1972). The gannet times the retraction of its wings when diving to catch fish in the sea (Nelson 1978). Schiff and Detwiler (1979) and Todd (1981) demonstrated, by manipulating the rate of expansion of cast images, that human observers can perceive time to contact without information about distance or observer speed.
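For reference, a standard small-angle derivation (added here for clarity; the symbols S, D, and V below are not from the original text) shows why the rate of image expansion alone can specify time to contact. An object of physical size S approaching at constant speed V from distance D(t) subtends a visual angle θ(t), and

$$\theta(t)\;\approx\;\frac{S}{D(t)}, \qquad \tau(t)\;\equiv\;\frac{\theta(t)}{\dot{\theta}(t)}\;=\;\frac{D(t)}{V}\;=\;T_c ,$$

so the ratio of image size to its rate of expansion equals the remaining time to contact without requiring knowledge of S, D, or V individually.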
By changing the size of a ball (deflating it) in mid-air, Savelsberg et al. (1991) showed that the hand aperture is adjusted, on the basis of the expansion pattern, to catch balls of various sizes. Cellular recordings in the pigeon brain reveal neurons with maximum response at a constant time before contact for visual stimuli of different speeds and sizes (Wang and Frost 1992). These data illustrate the need for a neural mechanism capable of initiating a timely response to a moving stimulus based on the expansion rate of the retinal image. A novel mechanism for computing Tc based on frog neurobiology will be presented in Chapter 3.

The analysis of 3-dimensional motion has thrived within the framework of optical flow proposed by Gibson (1958). Information about 3-dimensional motion extracted from the optic flow field has been applied to control steering (Lee 1976), obtain egomotion (Prazdny 1980), and infer the 3-dimensional structure of a moving object (Koenderink and van Doorn 1976). Beverley and Regan (1973) proposed that the trajectory of a looming object can be calculated from the speed differences in the two retinal images. Applying learning rules, Hatsopoulos and Warren (1991) trained a neural network to extract the translational components of heading from an optic flow field.

The existence and extent of direction selectivity in anuran retinal ganglion cells is a controversial issue (Norton et al. 1970, Backstrom et al. 1978, Watanabe and Murakami 1984, Grüsser-Cornehls and Langeveld 1985). The controversy raises an interesting question: is it possible to compute 3-D motion and the time to contact (Tc) of a looming stimulus without an optical flow field? A mechanism for computing 3-D motion without requiring an optical flow field is of great computational advantage, since the optimization procedure for obtaining a flow field is computationally expensive (e.g., Horn and Schunck 1981, Nagel 1983, Wang et al. 1989). The study of looming-avoidance behavior in anurans thus presents a unique opportunity for discovering a method of computational importance.

In this chapter, a biologically constrained neural network model is developed to account for the recognition of a looming stimulus and the translation of that recognition into an appropriate motor command. The detection of a looming stimulus is achieved by neurons in the tectum and pretectum. The spatial location of the stimulus is encoded by a population of tectal neurons. The direction of stimulus movement is computed from the shift in the peak of activity in this population. The instantaneous stimulus distance (a signal of closeness) is computed by integrating the activity of the looming-detecting neurons. The signal of the stimulus position is gated by the activity of the direction-selective neurons onto a motor heading map which specifies the direction of the avoidance movement. The avoidance action is initiated by the signals from the tectum and pretectum. The selection of a proper action is obtained via competition among different motor schemas. The model is implemented in NSL, a Neural Simulation Language (Weitzenfeld 1991).
Figure 2.2: Looming-sensitive neurons. Top: a T3 neuron's response to (A) turning the general illumination on and off, (B) a disc of 3 cm diameter moving at a constant distance (25 cm) along a 60° arc (dashed line), and (C, D) the same stimulus moving towards (t) and away from (a) the eye between distances of 25 cm and 3 cm (Grüsser and Grüsser-Cornehls, 1976). The (t) indicates the start of movement towards the animal, during which the cell fires vigorously; (a) signals the start of movement away, during which the cell is almost silent. Bottom: TH6 neurons respond strongly to a 5 cm disc approaching from the fronto-dorsal visual field (a), less strongly to the same stimulus from a lateral-horizontal position (b), and weakly to the same object moving around at a constant 5 cm distance from the eye (Ewert, 1971). Note that the firing rate of TH6 neurons increases as the stimulus moves closer, whereas that of T3 neurons stays constant as the stimulus approaches the animal.

2.2 Neural network model for avoiding looming objects

2.2.1 The basic cell types

Two types of neuron are sensitive to a stimulus moving toward the frog's eye:

T3 neurons: Grüsser and Grüsser-Cornehls (1970) reported that T3 neurons in the optic tectum respond vigorously to stimuli moving toward the frog's eye, in contrast to their low sensitivity to movement away from the eye or around the body at a constant distance (Fig. 2.2A). The excitatory receptive field (ERF) of these neurons ranges from 20° to 30°. To elicit a T3 response, the stimulus must have an angular size of at least 3° (Grüsser and Grüsser-Cornehls 1973).

TH6 neurons: Ewert (1971) categorized neurons from the caudal thalamus into 10 classes based on their receptive fields and response characteristics. Among these pretectal neurons, TH6 neurons are sensitive to a looming stimulus, especially one from the upper visual field (Fig. 2.2B). Moreover, the closer the object, the stronger the TH6 response it can elicit. TH6 neurons respond more strongly to a 3 cm object approaching to 5 cm than to a 10 cm object approaching to only 10 cm, even though the latter subtends a larger visual angle. TH6 neurons have ERFs of either 180° or 360°. The input to the "total field" TH6 neurons comes from the contralateral eye via the ipsilateral optic tectum (Brown and Ingle 1973): small tectal lesions produce scotomata in these neurons. A major difference between these two types of looming-detection neurons is that TH6 neurons are sensitive to the distance of a stimulus while T3 neurons are not (Fig. 2.2).

Other types of neuron relevant to looming-avoidance behavior include:

T2 neurons: T2 neurons have large receptive fields, usually extending more than 90°. Some T2 neurons show directional selectivity in which the preferred direction points from the temporal to the nasal part of the visual field (Grüsser and Grüsser-Cornehls 1976).

T6 neurons: The ERF of T6 neurons of the optic tectum lies in the upper visual field, with a fronto-caudal extent of at least 120° and a left-right extent greater than 90°. Their activity can be elicited by stimuli larger than 8° in diameter and moving faster than 5°/sec.

TH3 neurons: The ERF of TH3 neurons in the thalamus ranges from about 30° to 46°. These neurons receive input from the contralateral eye. They respond best to moving stimuli that fill their ERF. Based on the preference of TH3 neurons for large visual stimuli, Ewert and Wietersheim (1974) postulated that these neurons are important in the recognition of a predator based on the size of a stimulus.

Based on lesion studies, Ingle (1983) suggested that the spatial map of stimulus location is encoded in the optic tectum. Following unilateral removal of the tectum, optic fibers regenerate to the remaining (opposite) tectum.
In these "rewired" frogs, it is found that instead of jumping away from a looming threat, they jump toward it. This finding implies that the escape direction is determined on the basis of information stored in the optic tectum. Furthermore, after a pretectal lesion, frogs are still able to avoid looming stimuli correctly (with a 50% response rate compared to normal frogs), indicating that T3 neurons alone are able to mediate looming-avoidance behavior. Given these data, we postulate that the optic tectum contains a spatial map of looming stimulus location encoded by a population of T3 neurons, while T2 neurons detect movement in the temporo-nasal direction, which indicates a crossing trajectory.

Figure 2.3: The looming-detection model. The visual stimulus is transmitted to the network via R3 and R4 ganglion cells. T3 neurons detect movement of a looming object and T6 neurons monitor stimulus activity in the upper visual field. These tectal signals, along with depth information, converge onto the TH6 neurons, which determine whether the visual stimulus is a looming threat. T2 neurons detect temporo-nasal movement whereas TH3 neurons are more sensitive to larger stimuli. The spatial signals conveyed by T2 and T3 neurons are integrated and transformed in the tegmentum to form a motor heading map which specifies the escape direction. The motor output (jump or duck) depends on the size and elevation of the stimulus. (Structures shown, from input to output: retina, optic tectum, thalamic pretectum, tegmentum, medulla, spinal cord.)

The spatial signal from the T3 neurons is gated by the activity of T2 neurons to guide the ipsilateral or contralateral jump. TH6 neurons carry out the recognition of a looming stimulus based on converging tectal inputs and depth information. Once a predator is recognized by the pretectal system, the prey-catching system in the optic tectum is shut off by inhibitory modulation from the pretectum.¹ If the looming stimulus is large (signaled by the size-sensitive TH3 neurons), the jumping motor schema will be activated. On the other hand, if the stimulus is recognized as a small airborne object (signaled by T6 neurons), the ducking motor schema will be elicited. In summary, the topography of the T3 neurons provides the basis for the spatial map of the stimulus location, whereas the activation of the tectal and pretectal neurons serves as the triggering command for the predator-avoidance behavior (Fig. 2.3).

¹ It has been postulated that the prey-catching system involves inhibitory modulation of the optic tectum by the pretectum (Ewert and von Seelen, 1974). Ewert (1974) found that the activity of T5.2 cells is highly correlated with the prey-catching response. Cervantes-Perez et al. (1985) proposed a tectal column model of prey-catching. However, a detailed study of the neural interactions between prey-catching and predator-avoidance is beyond the scope of the present thesis.

The retinal input to our model comprises R3 and R4 neurons. This is based on the observation in frogs that the activation of R3 and R4 retinal neurons is highly correlated with the flight and hiding response (Grüsser and Grüsser-Cornehls 1976). They are more sensitive to larger stimuli than other retinal neurons. Ewert (1984) reported that the toad's caudal thalamus receives inputs from R3 and R4 neurons but not from R2 neurons. R3 neurons of toads respond more strongly to the leading edge of a black stimulus moving against a white background than to the trailing edge (about 5:1). But for a white stimulus moving against a black background, R3 neurons respond more strongly to the trailing edge (about 2:1).
Increasing the stimulus speed does not influence this contrast-dependent edge preference (Tsai and Ewert 1987). R4 neurons of frogs respond to any moving object (larger than 2-5°) that produces a dimming in their ERF. The essential factor is the overall dimming rather than the size, shape, movement, or contrast of the stimulus. Continuous excitation is elicited when a large dark object moves into the ERF and stops there (Grüsser and Grüsser-Cornehls 1976).

Most of the neuron types discussed here have been found in both frogs and toads with qualitatively similar response characteristics (for a review see Grüsser and Grüsser-Cornehls 1976, Ewert 1987). In our current model, we do not distinguish between data obtained from frogs and from toads.

2.2.2 The looming-detection network

The individual "neuron" in our networks is modeled as a "leaky integrator" unit which interacts with others only via its "firing rate". The firing rate of a cell depends only on a single membrane potential (each cell is modeled as a single compartment). It is a coarse approximation to a real neuron and expresses the intuition that "the higher the membrane potential, the higher the firing rate" (a detailed mathematical treatment of the model is given in Appendix A). Each neuron type is represented as a two-dimensional array of leaky integrator units. In the current implementation, the size of these arrays is 21x21, with each unit covering a region of about 8°x8° of the left visual field.

In our model, the visual stimulus is transmitted to the avoidance circuitry via R3 and R4 ganglion cells. These are modeled by a version of the Teeters (1989) model of the frog and toad retina, implemented in NSL (following Wang & Arbib 1991, Teeters & Arbib 1991). T3 neurons detect movement toward the eye, and T6 neurons monitor stimulus activity in the upper visual field.

Figure 2.4: Synaptic connectivity of T3 neurons. The R3 signals are projected through a radial mask to T3 neurons. The R3 cells respond to the leading edges of a stimulus, so the highest activity concentrates in the peripheral region. A radially arranged connectivity pattern (off-center, on-surround) is suitable for integrating activity from the periphery (the shading of each square indicates the synaptic weight). The R4 signals, which indicate general dimming in the visual field, are projected to T3 neurons through a Gaussian mask. The position of a looming object is determined by localizing its center: the T3 neuron whose mask center is aligned with the center of the looming stimulus has the highest activation, and neighboring neurons have activations that decay with distance from the center. Thus the spatial position of a looming stimulus is encoded in a population of T3 neurons in which the more central neurons have the higher activations. (Panels, left to right: R3 and R4 activity, the R3-to-T3 and R4-to-T3 weight masks, and the resulting T3 membrane potential and firing rate.)
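The exact equations of the leaky-integrator unit are given in Appendix A and are not reproduced here. The following Python sketch is only illustrative: the time constant, step size, and the sigmoid used to map membrane potential to firing rate are assumptions of this sketch, not parameter choices taken from the thesis.

    import numpy as np

    def sigmoid(x, gain=4.0, threshold=0.5):
        # Monotone squashing: the higher the membrane potential, the higher the firing rate.
        return 1.0 / (1.0 + np.exp(-gain * (x - threshold)))

    def leaky_integrator_step(m, total_input, tau=0.1, dt=0.01):
        # One Euler step of  tau * dm/dt = -m + input  for a whole 2-D array of units.
        return m + (dt / tau) * (-m + total_input)

    # Example: a 21x21 array of units, as in the model, driven by a constant input patch.
    m = np.zeros((21, 21))                          # membrane potentials
    s = np.zeros((21, 21)); s[8:13, 8:13] = 1.0     # synaptic input
    for _ in range(200):
        m = leaky_integrator_step(m, s)
    firing_rate = sigmoid(m)

Each neuron type in the model (R3, R4, T2, T3, T6, TH3, TH6) would correspond to one such array, updated once per simulation time step.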
Based on Brown and Ingle's (1973) findings on pretectal "total field" neurons, there is no direct retinal projection to the TH6 neurons. Instead, T3 and T6 tectal signals, along with depth information (provided by some depth perception circuitry), converge onto the TH6 neurons, which determine whether the visual stimulus is a looming threat (Fig. 2.3). The topography of the T3 neurons provides the spatial map of the stimulus location and, together with the activity of the TH6 neurons, indicates the presence of a looming stimulus.

There are two cues which can be used to detect a looming object: the expansion of the object's image and the decreasing distance to it. Schiff (1965) demonstrated that expanding shadows projected on a screen could elicit avoidance responses in frogs. T3 neurons are not sensitive to the distance of a stimulus (Fig. 2.2A), suggesting that the detection of movement toward the eye by these neurons is based solely on the expansion of the stimulus. This can be achieved by arranging the mask for the projection from the R3 to the T3 neurons so that the synaptic weights increase as one moves away from the center of the mask (Fig. 2.4; equation (3) in the Appendix, referred to as [Eq. A3] in the following text). Through such a mask, the sensitivity of R3 cells to the leading edges of a visual stimulus provides information about edges expanding in all directions. This gives the signal for an approaching object. In contrast, the matrix of synaptic weights connecting R4 to T3 neurons is a Gaussian mask. The prolonged excitation of R4 neurons by the dimming of light ensures continuity inside the expanding edges. Thus, a T3 neuron is able to detect looming objects and yet is not fooled by expanding edges alone (such as two vertical bars moving away from each other).

How can we localize a stimulus which is expanding in all directions? We propose a method of population encoding as a solution to this problem: the position of a looming object can be determined by localizing its center. Through the radial mask in our model, a cell that corresponds to the center of a region (in the input layer) with higher activity around its perimeter will have higher activation. In a looming situation, the highest activity in the R3 neurons concentrates on the perimeter of the stimulus (the expanding edges). Therefore, the T3 neuron which corresponds to the center of the stimulus receives the strongest R3 signal when these signals are projected through a radial mask. Neighboring neurons have less activation, which decays as one moves away from the center (Fig. 2.4). Furthermore, the Gaussian arrangement of the synaptic weights for the R4 neurons can extract the center of the dimmed region cast by a looming object. Together, the R3 and R4 signals differentially activate T3 neurons according to the degree of alignment of their ERFs with the center of a looming stimulus. Therefore, the spatial position of a looming stimulus is encoded in a population of T3 neurons in which the more central neurons have the higher activations.

The ERF of a T6 neuron is in the upper visual field. This effect is achieved by projecting the retinal input to T6 neurons through a mask which has larger weights for higher positions [Eq. A8]. Through such a mask, a stimulus located in the upper visual field will elicit a stronger response in T6 neurons than one in the lower visual field. Input from T6 neurons to TH6 neurons makes the latter more responsive to an approaching aerial threat. TH6 neurons also receive depth information from some depth perception circuitry [Eq. A12]. In our current model, such depth information is obtained from a schema which is not analyzed further. A candidate for a neural implementation is the Cue Interaction Model of House (1989), but the present schema computes the depth map of each point on the retina at each time step on the basis of the initial position, speed, and final position of the stimulus. More precisely, the output of the schema is "closeness", inversely proportional to the distance of the stimulus. By properly combining the inputs from T3 neurons, which signal the presence of expanding edges, and the depth signal from the depth perception system, the TH6 neurons respond more strongly to a smaller but closer stimulus than to a larger but farther one.
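The actual weight matrices are those of [Eq. A3] and the related appendix equations; the Python sketch below is a hedged illustration only (the mask sizes, weight ranges, and the toy edge/dimming input are assumptions). It shows how a radial, off-center/on-surround mask applied to R3-like edge activity, plus a Gaussian mask applied to R4-like dimming activity, yields a T3-like population whose peak lies near the center of the looming stimulus.

    import numpy as np
    from scipy.signal import correlate2d

    def radial_mask(size=7, w_min=0.1, w_max=1.0):
        # Off-center, on-surround: weights grow with distance from the mask center.
        c = size // 2
        y, x = np.mgrid[0:size, 0:size]
        r = np.hypot(y - c, x - c)
        return w_min + (w_max - w_min) * r / r.max()

    def gaussian_mask(size=7, sigma=1.5):
        # On-center mask for the R4 (dimming) projection.
        c = size // 2
        y, x = np.mgrid[0:size, 0:size]
        return np.exp(-((y - c) ** 2 + (x - c) ** 2) / (2 * sigma ** 2))

    # Toy input on a 21x21 field: R3 activity on the expanding edge of a dark square,
    # R4 activity over the whole dimmed interior.
    r3 = np.zeros((21, 21)); r3[7:14, 7:14] = 1.0; r3[8:13, 8:13] = 0.0
    r4 = np.zeros((21, 21)); r4[7:14, 7:14] = 1.0

    t3_input = (correlate2d(r3, radial_mask(), mode="same")
                + correlate2d(r4, gaussian_mask(), mode="same"))
    center = np.unravel_index(np.argmax(t3_input), t3_input.shape)  # peak near (10, 10)

Because the edge activity falls on the high-weight surround of the unit whose mask is centered on the stimulus, and the dimming activity falls on the peak of its Gaussian mask, that unit receives the largest combined drive; surrounding units receive progressively less, giving the population code for stimulus position described above.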
A candidate for neural implementation is the Cue Interaction Model of House (1989), but the present schema computes the depth map of each point on the retina at each time step on the basis of the initial position, speed, and final position of a stimulus. More precisely, the output of the schema is "closeness", inversely proportional to the distance of the stimulus. By properly combining the inputs from T3 neurons, which signal the presence of expanding edges, and the depth signal from the depth perception system, 20 the TH6 neurons will respond more strongly to a smaller but closer stimulus than to a larger but farther one. 2.2.3 Computing the 3-dimensional motion i Ingle and Hoff (1990) reported that a frog's response to a looming stimulus is ! i dependent upon the direction of the stimulus movement. If the stimulus is on a j i collision trajectory, the frog jumps to the contralateral visual field. If the stimulus is on a course that will pass in front of the animal, the frog will "cut back" behind the incoming stimulus by jumping into the ipsilateral visual field. This amounts to the | detection of the stimulus movement in three dimensions. j The best known biological motion detection scheme is based on research on the I neural mechanisms underlying insect optomotor response (Hassenstein & Reichardt j 1956, Reichardt 1969, Reichardt & Guo 1986). The basic unit of motion detection in | l their model is a pair of receptors connected in such a way that the delayed output of I one receptor is multiplied by the output of the other unit. The model made several j interesting predictions which were consistent with the data obtained from insects. i Barlow and Levick (1965) demonstrated that unidirectional lateral inhibition is ; ! crucial in rabbit retinal ganglion cells. Neurons with directional selectivity have also | been found in the frog's retina (Norton et al. 1970, Backstrom et al. 1978, Watanabe and Murakami 1984, Griisser-Comehls and Langeveld 1985) and optic tectum (T2 neurons reported by Griisser and Griisser-Comehls 1976). However, it is not likely that neurons in the retina have enough information to compute 3-dimensional motion such as a looming object moving on a crossing trajectory. In the optic tectum, T2 , neurons are sensitive to movement in the temporal to the nasal direction. Selectivity in this direction can be used to signal a crossing stimulus which elicits the "cut-back" 21 Preferred Direction Null Direction T3 Neurons Delay T2 Neurons T3 Neurons T2 Neurons Figure 2.5: Detection of 3-D motion. A. The perception of 3-dimensional motion is achieved by detecting a shift of the center of a looming object manifested in a shift of the peak of T3 neuronal activity. T2 neurons receive signals from T3 neurons for the current position of the peak and asymmetrical projections from the delay units — excitatory inputs in the preferred direction and inhibitory inputs in the null direction. Therefore, T2 neurons are activated only when the stimulus is moving toward the eye in a temporo-nasal trajectory. B. The dependence of T2 velocity-sensitivity on the delay time can be replaced by changes in the spatial connectivity with the delay units. For example, given a fixed delay time, if the excitatory connections from the delay unit to the T2 neurons is shifted one cell to the right, an object moving at a speed that is previously able to activate a T2 response will fail to do so, but one that moves at a slightly higher speed will succeed. response. 
We postulate a mechanism for detecting looming movement direction that I suggests that T2 neurons make use of the information processed by T3 neurons. ! ! The idea of computing 3-dimensional motion is to detect the shift of the center { of a looming stimulus. We have already seen how the position of the center may be 1 encoded in a population of T3 neurons. When a stimulus looms directly (in the ! positive z-direction) at the animal, while the population of activated T3 neurons becomes larger, its center remains stationary. Conversely, a shift of the peak of neuronal activity in this population indicates a looming movement with components j i in the x-y direction. In particular, a motion trajectory which passes in front of an | animal has a horizontal component pointing from the temporal part toward the nasal j part of the animal's visual field. Therefore, by monitoring a shift of the peak in T3 J i neuronal activity in the temporo-nasal direction, T2 neurons can signal a crossing ! i (instead of a colliding) looming stimulus and elicit a proper "cut-back" action. The j network that computes 3-dimensional movement direction is shown in Fig. 2.5 A. In j the network, T2 neurons receive asymmetrical inputs from T3 neurons, i.e., excitatory inputs in the preferred (temporo-nasal) direction and inhibitory inputs in the null direction [Eq. A6 and Eq. A7]. In this way, only a stimulus that crosses from ■ the temporal part toward the nasal part of the visual field would elicit response in T2 , neurons. 2.2.4 The escape direction Once a looming object is detected, toads must determine which direction to jump to avoid it. Cobas and Arbib (1991,1992) propose a Motor Heading Map hypothesis i for the determination of the direction to jump: prey-catching and predator-avoidance T2 Neuron T3 Neuron left right left right left right Motor Heading Map — — Cut-Back • Collision Figure 2.6: Gating of the tectal projection onto the motor heading map. Only half the projections are shown here for simplicity. The T3 neurons project bilaterally to the heading map with differential weights. The T2 neurons gate T3 signal by projecting inhibitory signals to the ipsilateral heading map and excitatory signals to the contralateral one. When the looming stimulus is on the colliding trajectory (and hence provides no T2 modulation), the ipsilateral heading map would have higher activity and suppress the contralateral map. But when the stimulus is crossing the visual field, the T2 signal would gate the T3 signal onto the contralateral heading map, thus resulting in a "cut-back" jump. Note that the contralateral jumps are directed to the frontal part of the visual field whereas the ipsilateral jumps are directed more towards the caudal visual field. systems share a common map for the heading (coded in body-coordinates) of the responding movements, as distinct from a common tectal map for the direction of the stimulus.2 The projection from optic tectum to the heading map thus must differ 2 The direction of prey and the direction of prey-catching are the same, but the direction of a predator and the direction of escape are different. Thus, in the latter case, the sensory map and the motor map must be distinguished. 24 depending on whether a visual stimulus is identified as prey or predator. We extend ^ this idea to explain the avoidance behavior in which frogs have to choose between ipsilateral and contralateral escape directions depending on whether a looming t stimulus is on a collision course or not. 
We postulate that the motor heading map provides a basis for integrating signals from multiple sensory circuits in the following way: Signals from different sources i I converge onto the map and interact/compete with each other, and an appropriate | I heading direction can be obtained through proper coordination of their interaction. In j I looming avoidance, the T3 neurons, which encode the stimulus position, project to j the ipsilateral and contralateral heading maps with the ipsilateral one receiving higher | excitation, whereas the T2 neurons project excitatory signals to the contralateral map : and inhibitory signals to the ipsilateral map to direct a "cut-back" maneuver (Fig. 1 I 2.6). When the looming stimulus is on a collision course (and hence there is no T2 I modulation), the ipsilateral heading map will have higher activity and suppress the contralateral map. As a result, the frog will jump to the contralateral visual field. J ; However, when the stimulus is crossing the visual field, the T2 signal inhibits the T3 ; signal while exciting the contralateral heading map, thus resulting in a cut-back jump. 2.2.5 Motor schema selection: ducking vs. jumping i I Toads respond to different stimulus situations with different movement patterns. Cobas and Arbib (1991, 1992) proposed a general mechanism of motor pattern selection through the interaction of motor schemas. In our present model, we are concerned with the toad's response to a looming stimulus. If the approaching object j is small and from the upper visual field, the toad would duck, while if the object is large or groundbome, the toad tends to jump away (Ewert, 1984). We postulate a ; specific mechanism for the selection of the proper action (ducking vs. jumping) j depending on the size and position of a looming stimulus. i I Our approach is to utilize the sensitivity of TH3 neurons to stimulus size and the sensitivity of T6 neurons to stimuli in the upper visual field. In our model, each TH3 neuron integrates visual signals from a large number of R3 and R4 cells (through a 7x7 mask which conforms to its ERF size, see [Eq. A13]). In this arrangement, the larger the stimulus, the stronger the activity of, as well as the greater the number of, TH3 neurons it can elicit. T6 neurons, on the other hand, are more sensitive to the elevation of the looming stimulus. By connecting TH3, TH6, and T3 neurons to the jumping motor schema, a higher activation of this motor schema will be elicited by a larger stimulus (Fig. 2.3). On the other hand, by connecting TH6 and T6 neurons to the ducking motor schema, a higher ducking response will be elicited if the stimulus approached from a higher elevation. We have set the weights so that an appropriate action will be chosen through their competition, depending on the size and elevation of the stimulus (Section 2.3.6). 2.3 Computer simulation The looming-avoidance model is implemented in NSL (Weitzenfeld 1991) running on a Sun workstation. Visual stimuli are represented as black squares in a 2 dimensional array of receptors monitoring the left visual field. In our current simulation, the dimension of this array is 21x21 and each square corresponds to a region of about 8°x8s visual angle. This is a rather coarse grain in comparison to the resolution of toad's retina, and we wish to increase the resolution of our model in the near future so that precise quantitative simulation can be carried out. 
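The next paragraphs describe the stimulus-generation program used in the simulations; as a stand-alone illustration of what such a projection involves, here is a minimal Python sketch. The 21x21 receptor array and the roughly 8° per receptor follow the text, while the object size, speed, starting distance, and time step used below are arbitrary illustrative values, not the parameters of the NSL model:

```python
import numpy as np

N_RECEPTORS = 21          # 21 x 21 receptor array monitoring the left visual field
DEG_PER_CELL = 8.0        # each receptor covers roughly 8 degrees of visual angle

def looming_frame(size_cm, dist_cm, center=(10, 10)):
    """Project a black square of side `size_cm` at distance `dist_cm`
    onto the receptor array (1 = dark, 0 = background)."""
    angle = 2.0 * np.degrees(np.arctan(size_cm / (2.0 * dist_cm)))  # subtended angle
    half = int(round(angle / DEG_PER_CELL / 2.0))                   # half-width in cells
    frame = np.zeros((N_RECEPTORS, N_RECEPTORS))
    r0, c0 = center
    frame[max(0, r0 - half):r0 + half + 1, max(0, c0 - half):c0 + half + 1] = 1.0
    return frame, angle

# A 6 cm square approaching at 20 cm/s from 50 cm away, sampled every 0.25 s
dt, speed, dist = 0.25, 20.0, 50.0
while dist > 4.0:
    frame, angle = looming_frame(6.0, dist)
    print(f"d = {dist:5.1f} cm  angle = {angle:5.1f} deg  dark receptors = {int(frame.sum())}")
    dist -= speed * dt
```

Note that with these numbers the distant stimulus begins as a single dark receptor and expands as it approaches, which matches the description of the simulations later in this chapter.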
A program is used to generate visual stimuli with different positions (on the x-y plane), initial distances, and speeds. The distance (closeness) of the stimulus at each time step is also calculated by this program. The visual stimulus is then projected to the retinal network and on to the predator-avoidance model.

Figure 2.7: Escape direction. The arrows indicate the directions of the looming stimulus. The circles around the frog represent escape directions and those behind the arrows indicate the corresponding looming directions. (A) and (B) show two examples of the computer simulation of our model and (C) shows the findings from behavioral experiments conducted by Ingle and Hoff (1990). Note that there is some degree of variability in the selected escape direction in the simulation results.

2.3.1 Escape direction based on stimulus position and direction

Fig. 2.7 shows the escape directions in response to an object looming from different positions and directions. In this figure, the arrows indicate the directions of the looming stimulus. The circles around the frog represent escape directions and those behind the arrows indicate the corresponding looming directions. (A) and (B) show two examples of the computer simulation of our model and (C) shows the findings from behavioral experiments conducted by Ingle and Hoff (1990). When a looming stimulus is on a colliding trajectory or on one that crosses the lateral part of the visual field, our model chooses escape directions that are roughly a compromise between the forward direction and the direction opposite to the incoming stimulus. When a looming stimulus is on a trajectory that crosses the frontal part of the visual field, our model chooses escape directions that cut behind the incoming stimulus. The simulation result is a reasonable approximation to the experimental data shown in Fig. 2.7C.

2.3.2 The effect of stimulus speed and size

Neuronal response is dependent upon the speed of the stimulus. The average impulse frequency of R3 and R4 ganglion cells increases as the stimulus speeds up, within limits; when the stimulus moves too slowly (< 1°/sec) or too quickly (> 100°/sec), toads do not respond at all (Grüsser and Grüsser-Cornehls 1976). Within these limits, the avoidance activity of toads increases almost linearly with the speed of the stimulus (Ewert and Rehn 1969): Response = K·V^0.9 for 1 < V < 50 (°/sec), where V is the speed of the stimulus and K is some constant. In our model, TH6 is the primary command neuron that activates the motor schemas. Therefore, we simulate the model with different stimulus speeds and compare the average firing rate of the TH6 neuron with the observed response rate of the toads (Fig. 2.8).

Figure 2.8: Stimulus speed and avoidance response rate. The solid line is the projection obtained from computer simulation. The dashed line represents the data from an experiment in which a disc is moved around over the toad's head (Ewert and Rehn, 1969).

Figure 2.9: Stimulus size and avoidance response rate. The solid line is the projection obtained from computer simulation. The dashed line represents the corresponding experimental data (Ewert and Rehn, 1969).
Without tuning the parameters specifically for this simulation, the result fits the experimental data well: the response increases almost linearly with the speed of the stimulus, and the lower and upper limits for eliciting an avoidance response match those observed.

The toad's avoidance response rate is also subject to the size of the visual stimulus. Ewert and Rehn (1969) found that the toad's escape activity increases with the size of the stimulus, with a diameter of about 50° being optimal. By generating stimuli of varying sizes and projecting them to our model, we obtain the effect of stimulus size on avoidance activity plotted in Fig. 2.9. The model behaves much like the observed animal behavior, except that it shows saturation of response activity when the stimulus is larger than the optimal size, while the observed data show a slight decrease in this situation. An interesting hypothesis and predictions can be deduced from this discrepancy, as presented in the Discussion section.

2.3.3 Continuity of a looming stimulus

One of the cues for determining whether an object is moving toward the eye is the expansion of the image size on the retina. The expansion results in leading edges moving away from each other. Such movement of the edges can also be produced by several objects moving away from each other. Ambiguity thus arises if this is the only cue available. Schiff (1965) found that discontinuous expanding shadows (e.g., a square with an opening in the center) failed to elicit avoidance responses in animals such as crabs and frogs. In our model, this ambiguity is resolved by integrating the signal from R4 neurons. The R4 neurons exhibit prolonged activation to the dimming of light and, in effect, can ensure the continuity inside the expanding edges. Computer simulation shows that the T3 neuron is not fooled by such stimulus situations as four squares (or two vertical bars) moving away from each other. But if two vertical bars overlap at first, then for a short period of time after they start to move away from each other they appear to be a single expanding bar. In this case, some activity is elicited in the simulated T3 neuron.

2.3.4 Stimulus/background contrast

Schiff (1965) reported that frogs do not respond to a white expanding stimulus on a black background, and Ingle (personal communication) observed that frogs are hit by a white object looming against a black background. A similar result has been reported in the housefly (Holmqvist and Srinivasan 1991). We carried out a related simulation and observed that our model did not respond to a white looming stimulus on a black background. This phenomenon can be explained as follows. R3 neurons respond more strongly to moving edges along the transition from a bright region to a dark region, and this response becomes weaker if the contrast is reversed. R4 neurons are tuned to detect dimming in the visual field and hence do not respond to an expanding white pattern against a dark background. As a result of the reduction of neuronal activity in the R3 and R4 ganglion cells, the retinal signal is not strong enough to activate T3 neurons.

2.3.5 Stimulus distance and size

Ewert (1971) reported that TH6 neurons respond more vigorously to a 3 cm object approaching to 5 cm than to a 10 cm object approaching to only 10 cm, even though the latter subtends a larger visual angle. This discrimination requires information about both the size and distance of the object.
The signal of the object size comes from TH3 neurons, whose response increases in proportion to the object size. The information about the distance of the object is computed by a depth schema. By integrating signals of the distance and size of the object through proper weights, the TH6 neurons will respond more strongly to a smaller but closer object. If TH6 neurons are deprived of the depth information, their functional capability would degenerate to that of a T3 neuron, i.e., they would no longer be able to distinguish a distant stimulus from a close one.

Figure 2.10: Motor schema selection. The space formed by the size and elevation of a looming stimulus is divided into a ducking zone and a jumping zone. The grey area represents the jumping zone predicted by computer simulation. It conforms to the observation that frogs tend to duck in response to a small airborne stimulus but jump away from a larger or groundborne threat.

2.3.6 Motor schema selection

There are no quantitative data regarding the stimulus parameters which discriminate the ducking and jumping responses. Hence, we carried out a series of simulations to determine the effect of varying the size and position of a looming stimulus on the choice of avoidance action. Our simulation results show that the space formed by the size and elevation of a stimulus is divided into a ducking zone and a jumping zone (Fig. 2.10). The shaded region in the figure represents the jumping zone and the white region corresponds to the ducking zone.

2.4 Discussion

In developing the predator-avoidance model, we have tried to make it as consistent as possible with known biological data. All the functional units employed have been physiologically identified, except the schema for depth perception. However, many of the details of the connections between different layers of neurons are unknown and, therefore, assumptions are made in developing the model. Several hypotheses are postulated for the mechanisms underlying the toad's detection of a looming stimulus, determination of the direction to jump, and selection of proper avoidance actions. These assumptions and hypotheses suggest testable predictions about the anatomical structures and their functions in avoidance behavior. Some predictions drawn from computer simulation of our model are discussed here.

1) We propose a mechanism to localize the position of a looming object, which appears to expand in all directions. The position of a looming object is determined by localizing its center and is encoded in a population of T3 neurons in which the more central neurons have higher activations. The T3 neuron receives retinal input with synaptic weights that favor signals away from the center of its ERF (off-center, on-surround) (Fig. 2.4A). This is in contrast to the commonly observed on-center, off-surround organization of neuronal ERFs. Different arrangements of the synaptic connectivity for T3 neurons were used in coding the position of a looming stimulus, and the results are shown in Fig. 2.11. In our simulation, the radial mask used in our model shows better performance. In general, through a radial mask, a

Figure 2.11: Various synaptic connections for T3 neurons. Different connectivity patterns for T3 neurons are shown in the left column while the corresponding simulation results of the firing rate of T3 neurons are shown in the right. Darker squares indicate higher synaptic weights in these matrices.
That is, the uppermost one is off-center, on-surround, the middle one is homogeneous, and the lowest one is on-center, off-surround. Note the more precise localization of the peak in the top row, and the splitting into "subpeaks" in the bottom row. cell that corresponds to the center of a region (in the input layer) which has higher activity in its perimeter (e.g., expanding edges of a looming pattern) will have higher activation. A Gaussian mask, on the other hand, indicates the loci of the higher activity in the input layer. By arranging the projections from R3 neurons in a radial fashion and R4 neurons in a Gaussian distribution, the information regarding the expanding edges (R3) and the center of the dimming region (R4) can be maximally integrated by the T3 neurons. Similar synaptic organization has been proposed for mammalian visual systems ; | where a radial mask was used for the looming detecting neurons to integrate signals from direction-selective neurons (Saito et al. 1986, Tanaka et al. 1989, Sereno and ! Sereno 1991). Our model demonstrates that looming detection can be achieved j without first computing the 2-dimensional motion (on the frontal plane) of a looming I i stimulus by direction-selective neurons. The existence (and absence) of this intermediate processing stage can explain an interesting difference between anuran and mammalian visual systems, i.e., the response to discontinuous looming patterns. As described in section 3.3, frogs do not respond to discontinuous looming patterns ! [ t due to the decrease of R4 signals. Looming-sensitive neurons in the mammalian ; visual pathways, on the other hand, respond to discontinuous stimuli such as two bars moving away from each other (Beverley and Regan 1973, Cynader and Regan 1978, ; ! Mottor and Mountcastle 1981). Such "opposing vector organization" (so termed by ' ! Mottor and Mountcastle 1981) can be obtained by combining signals from two j groups of direction-selective neurons each tuned to one of the opposite motion j directions. Looming detectors can in turn be constructed by integrating signals from an array of opposite motion detectors each selectively responsive to opposite motion ! in different directions (see Chapter 4 for a discussion of how primitive motion I features can be integrated into higher motion parameters). Furthermore, such edge information is based on the abrupt change in image contrast (either from dark to bright or vice versa) therefore, the looming detectors in mammalian visual pathway are not sensitive to contrast reversal as seen in the anurans. 2) We propose a mechanism for computing 3-dimensional motion by : detecting the shift of the center of a looming object. Our model demonstrates that the trajectory of a looming object is obtainable without first computing the optic flow 35 \ field. This may explain why animals with poorly developed direction-selective 1 i neurons (and hence limited capability to generate an optic flow field) are able to i respond to looming stimuli efficiently. | 3) Cobas and Arbib (1991, 1992) postulated a motor heading map as a j common interface between the spatially-coded sensory map and the frequency-coded 1 motor output for both prey-catching and predator-avoidance behavior. In our model, | I we extend this idea to suggest that the motor heading map is the basis for integrating ' t signals from multiple sensory circuits. 
Signals from different sensory circuits converge on the motor map and through their cooperation/competition, a proper direction of movement can be determined. We use this mechanism to explain the , avoidance behavior in which frogs have to choose between ipsilateral/contralateral j escape directions depending on whether or not a looming stimulus is on a collision i I course. The resulting escape direction is consistent with the observed animal i j behavior. , Based on lesion studies, Grobstein (1989) suggests that before reaching the j motor pattern generators, efferents from the tectal map are transformed in midbrain ; tegmentum into an abstract representation in which the horizontal component is I independent of elevation and distance components. Similar results have been shown for the head movements in bam owls (Masino and Knudsen 1990), and hand-pointing in human (Flanders and Soechting 1990, Flanders et al. 1992). A prediction based on our model is that, after unilateral lesion of the tectal efferent, the frog would jump straight forward or towards a looming stimulus approaching from the visual contralateral to the lesion site. The forward jump is due to the lack of signal from the ' motor map while that toward the stimulus is a result of the weak contralateral 36 projection from the tectum to the motor map. This prediction has been verified by the data reported by Roche and Comer (1993). Further hints about the neural substrates involved in the sensorimotor transformation are given by Ingle (1991) based on localized lesions of the tegmental nuclei. In the first experiment, the anteroventral nucleus in the tegmentum was lesioned and the contraversive jump in frogs was abolished and all the jumps were directed toward the incoming stimulus from all directions. An opposite result was observed following the ablation of the posteroventral nucleus in the tegmentum, the frogs jumped to the contralateral visual field in escaping looming objects from all directions. It is thus reasonable to assume that the motor heading map is created in the midbrain tegmentum. However, more careful analysis is needed to relate the i motor heading to these nuclei. The motor heading map in our current model is place-coded, i.e., the position of the neurons in the map correlates to the escape direction, but Masino and Grobstein (1990) and Smeraski and Grobstein (1991) show that the tecto-tegmental projection I is not topographically organized. This does not invalidate our motor heading map hypothesis, rather it suggests that the neural implementation of the motor heading I map probably is done in some other form of population encoding scheme. Future development of our model will address the issue of neural organization of the : tegmentum which is essential to the sensorimotor transformation. 4) We have reproduced, without tuning any parameters, the observed phenomena in which the avoidance activity increases with the size and speed of a looming stimulus (Fig. 2.8 and 2.9). The simulation results fit the experimental data 1 well except for stimuli larger than 51°: our model shows a saturated response rate in contrast to the experimental findings which show a slight decrease. The increase in 37 t neuronal activity seen in our simulation is due to the increase in the number of R3 1 and R4 cells activated by a larger stimulus. In the simulation, the initial image size | on the retina for all of the visual stimuli are the same (i.e., every stimulus starts as a single square on the receptor field). 
Therefore, the T3 neurons can integrate, to the j fullest extent, the activity from R3 cells in tracking the expanding edges. Since each T3 neuron is connected to a certain number of retina cells, there is an upper bound to its activity. Beyond such a limit, a saturated activity is observed. However, in the I behavioral experiments, different visual stimuli were presented at the same starting j position, and hence with different initial image size on the frog's retina. For a larger j stimulus which starts with a larger retinal image, T3 neurons can integrate only I ! partial activity of R3 activity tracking the expanding edges. This loss is compensated by higher R4 activity (due to the larger dimming region), up to a certain extent. ; When the size of a stimulus increases beyond an optimum, the loss of R3 activity outweighs the compensation from R4 activity, thus resulting in a decrease of activity j in T3 neurons. j A prediction that follows this hypothesis is that a visual stimulus will elicit a stronger response if it starts expanding from a smaller size (i.e., approaching the ' animal from a further position). This prediction is somewhat counter-intuitive in that a stimulus which starts looming at a closer position would indicate a more urgent collision situation and should be taken care of immediately, while the model predicts a lower response to it. Computer simulations with looming stimuli starting at different distances indicate that the model did respond differently to the same stimulus depending on the starting position: a stronger response is elicited when the stimulus starts approaching from a further position. Consistent data has been reported in an experiment designed to test the critical threshold for eliciting an I j avoidance behavior, in which the response rate in crabs decreased when the j magnification of a cast image started from a larger visual angle (Schiff 1965). ! Similar data was obtained in flies (Borst and Bahde 1988, Holmqvist and Srinivasan 1991). Such a counter-intuitive behavior might seem hard to explain, yet our model offers a hypothesis that predicts the same response. | The size-dependency of visual stimuli approaching from a constant starting [ distance is complicated by the interaction of the inputs from R3, which is stronger for [ smaller stimuli, and R4 which favors larger ones. The initial increase of the number j I of of R3 and R4 neurons elicited by a larger stimulus will give rise to higher response | i in T3 neurons. Such an increase will taper off as the stimulus becomes so large that less R3 can respond to the expanding edges. Finally, the response rate of the , looming detectors will start to decline until it cease to fire since the input from R4 j alone is not enough to fire T3. This prediction is consistent with the observation that j i T3 neurons respond only to small but not to large looming stimuli (Ewert, unplished J data). | These predictions have significant implications for the visual parameters, especially that of Tc, used in controlling movement. If x is the essential parameter as | proposed by Lee and others (e.g., Lee 1976, 1980, Todd 1981, Savelsberg et al. 1991), then the timing of eliciting an escape response should be independent of either the initial distance or the size of a looming stimulus. 
If, on the other hand, some measurement such as an "urgency" signal obtained by integrating the activity of the looming detectors, contributes to the initiation of an action, then the timing would depend on such factors as initial distance or stimulus size. Behavioral data consistent ' with our hypothesis have been reported in the housefly where the timing of landing was shown to depend on the size of the approaching surface (Borst and Bahde 1988) while Holmqvist and Srinivasan (1991) demonstrated that timing of escape depends on both the stimulus size and initial distance. Although, for the size-dependence, I they only test the animals with surfaces up to a limit (12 cm and 5 cm, respectively) i i such that only the increase of size effect was observed, such a dependence is , nonetheless demonstrated. A quantitative study of this issues will be given in the next i chapter. ( I 5) Avoidance behavior is dependent on a multiplicity of signals including T2, \ T3, T6, TH3 and TH6 (Fig. 2.3). T2 and T3 provide the metric parameters 1 i concerning the trajectory of a looming stimulus. Such metric information is not j i exclusive to avoidance behavior; but rather, it is "neutral" in that it is important for i prey-catching behavior as well. It has been postulated that distance estimation based ; i on looming perception can play a role in controlling snapping distance (Chapter VI). The decision of an appropriate action is carried out by the cooperation and competition among a multiplicity of neurons that extract various features of a | i stimulus that is significant for such interaction. For example, the T6 neuron is tuned j I to detect an airborne object and TH3 is sensitive to large stimuli that might signify a ! I predator while T5.2 neurons respond best to a prey-like object. No matter what j action is chosen, the metric information extracted by the T2-T3 neurons can be used in controlling that movement. Such a flexible use of a common circuitry in different behaviors is an important property underlying the adaptibility of the nervous system. 6) There are many behavioral experiments in which the animal's response is nondeterministic (e.g., Fig 7C). The source of such behavioral variability is not known yet. A common practice in modeling nondeterminism is to add some form of random "noise" to the computation of the model's activity. Our model exhibits some degree of variability in computer simulations (Fig 7 A and B) without any "noise" 40 term. The source of nondeterminism lies in the transformation of the place-coding of stimulus position in tectum into the motor heading map. The place-coding is distributed over a population of T3 neurons whereas the signal in the motor heading i map is more localized, thus more than one neuron in the motor heading map receives j I enough activation to overcome its threshold, leading to variability in the choice of escape direction. However, the current model does not take into account the frequency-coding seen in the reticular efferents. Further study in this direction of sensorimotor transformation is needed to gain more insight into the basic | characteristics of behavioral variability. CH A PTER 3 TIME TO CONTACT OR URGENCY TO ACT 3.1. The Computational Issues To interact with a moving object, one needs to know not only its path in 3-D space but also its temporal properties. One temporal feature of particular behavioral significance is the time to contact (Tc) of a looming stimulus. 
For example, an escape jump initiated too soon will leave the predator ample time to change its pursuit course, whereas a late escape is not much different from no move at all. Over the past two decades there has been considerable interest in the perception of Tc (Lee 1976, 1980, Schiff and Detwiler 1979, Todd 1981, Cavallo and Laurent 1988, Tresilian 1991, Savelsberg et al. 1991, Wang and Frost 1992). The use of Tc in guiding behavior has been demonstrated in a wide variety of species for different purposes (as briefly described in Section 2.1).

Can a timely command be issued based on the looming pattern alone? Lee (1976, 1980) provided a theory of a visual parameter termed tau (τ) that, under certain conditions, is equivalent to Tc, even though the absolute speed of a looming object cannot be recovered from the image velocity field. The conditions required are that a rigid looming stimulus is on a trajectory directly towards the eye at a constant speed without rotation. Under such conditions, τ can be computed from the ratio of the instantaneous visual angle subtended by the looming object and its expansion rate. That is, τ = 2Θ(t)/Θ′(t), where Θ(t) can be either the solid angle subtended by the looming object or the angle between two points on the object. Subbarao (1990) showed that the formula for Tc can be determined from the first-order derivatives of image flow (see also Arbib 1989, §7.2). In principle, Tc can also be computed as the remaining distance divided by the speed. However, it is difficult to distinguish which mechanism is employed (Tresilian 1991, Simpson 1993). Furthermore, the neural mechanism underlying the estimation of Tc is unknown, especially when differentiation is involved, whose biological feasibility is doubtful (Nakayama 1982).

In this chapter, the looming avoidance model presented in the previous chapter is extended to address this issue. The basic idea is to integrate over time the activity of T3, which signifies a looming stimulus. Since T3 activity is positively correlated with the speed of a looming stimulus, the temporal integration provides some indication of the distance traveled by the stimulus (i.e., how close the stimulus is). If the distance at which T3 starts to respond is known, then the instantaneous distance of a looming object can be obtained. Such a signal should be independent of the starting position or the size of a looming stimulus. However, it turns out that the T3 response depends on both the speed and the size of a looming stimulus. This poses a challenge for finding a mechanism that can compute the stimulus speed independent of its size. Neural modulation of the integrator neurons based on T3 activity is postulated to be a biologically plausible mechanism for accomplishing this goal. Described in the following sections is a detailed quantitative study of the looming perception circuit to develop this hypothesis.

3.1.1 Defining the problem

How can the anuran visual system extract motion parameters which allow a timely reaction to be evoked? Does it compute Tc, or does it make use of the information processed by the looming-sensitive T3 neurons? If it is the latter, then how? In this chapter, we study the neural mechanism underlying the use of looming perception to generate a motor command with appropriate timing. The crucial signal for issuing a timely reaction is one that conveys, in some way, the speed (Vz) at which a stimulus is approaching the animal.
The activity of the retinal ganglion cells, R3 and R4, is correlated with the speed of the expanding edges on the fronto-parallel plane, measured in visual angle (Grüsser and Grüsser-Cornehls 1976, Ewert 1984). Therefore, the visual system must produce a signal that reflects Vz given dθ/dt. Recovering Vz from dθ/dt is an ill-posed problem. This is due to the fact that the visual angle an object subtends depends not only on its size, but also on its distance from the eye. This can be shown as follows, where r is the radius of the object and z its distance from the eye:

    θ = tan⁻¹(r/z)                              (3.1)
    dθ/dt = (dθ/dz)(dz/dt)                      (3.2)
    dθ/dz = −r/(r² + z²)                        (3.3)
    dθ/dt = Vz · r/(r² + z²)                    (3.4)
    Vz = ((r² + z²)/r) · dθ/dt                  (3.5)

with dz/dt = −Vz for an approaching object. That is, to recover the absolute speed of a looming stimulus from its angular velocity, the system needs to know both the size of the stimulus and its distance. However, an approximation of Tc, which does not depend on stimulus size or distance, can be obtained if the visual angle is small, such that θ ≈ tan θ. Therefore,

    θ ≈ tan θ = r/z                             (3.6)
    dθ/dt ≈ (r/z²) · Vz                         (3.7)
    Tc = z/Vz ≈ θ/(dθ/dt)                       (3.8)

This formulation is the basis of the mechanisms proposed by Lee and others (Lee 1976, Simpson 1993). However, in this chapter, we explore a different approach based on our model of looming perception in anurans discussed in Chapter 2.

3.1.2 Outline of a solution

As discussed in Section 2.4, the looming detecting cells (T3) respond more strongly to a smaller than to a larger looming stimulus. This preference provides information regarding the stimulus size. If T3 also signals, in some way, the distance of a stimulus, then the absolute speed of an approaching object can be computed, and hence the time to contact. The basic idea is to integrate T3 activity over time. Since T3 activity is positively correlated with the speed of a looming stimulus, the temporal integration provides some indication of the distance traveled by the stimulus (i.e., how close the stimulus is). Two hypotheses can be formulated.

(i) If T3 begins to fire when a stimulus of a certain size, approaching from some sufficiently long distance, reaches a specific visual angle, then a timely reaction can be obtained by integrating the T3 activity from that point on.

(ii) If some signal can be extracted from the activity of such an integrator, independent of stimulus size, then a timely command can be generated for stimuli of different sizes. More specifically, if the curves of the activity of such an integrator for stimuli of different sizes intersect at a certain point in time, then this will provide a reference point that depends only on the absolute speed of looming stimuli and is independent of their sizes.

Neural modulation of the integrator neurons based on T3 activity is postulated to be a biologically plausible mechanism for accomplishing this goal. This idea motivated a detailed quantitative study of the looming perception circuit to fully develop and test these hypotheses.

3.2 The Firing Range

The first parameter investigated in this study is the distance at which a looming stimulus elicits the first response in T3 neurons (this distance will be referred to as the onset distance). It has been reported that T3 neurons respond to a looming stimulus only when it enters a certain range of distances, and their continuous discharge stops after it exits that range (Gaillard, 1984). Such a firing range for T3 begins at a distance of about 15 cm from the animal and ends at about 4 cm away. Our model also exhibits this firing characteristic.
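Before turning to the firing-range simulations, Eqs. (3.1)-(3.8) can be checked with a few lines of arithmetic. The sketch below uses arbitrarily chosen values for the object radius, approach speed, and sample distances; it is an illustration of the geometry, not part of the model:

```python
import numpy as np

r, Vz = 3.0, 20.0                      # object radius (cm) and approach speed (cm/s)
z = np.array([60.0, 30.0, 15.0, 8.0])  # sample distances (cm)

theta = np.arctan(r / z)               # Eq. (3.1)
theta_dot = Vz * r / (r**2 + z**2)     # Eq. (3.4)
Vz_rec = (r**2 + z**2) / r * theta_dot # Eq. (3.5): recovering Vz needs both r and z
tau = theta / theta_dot                # small-angle estimate of Tc, Eq. (3.8)
Tc = z / Vz                            # true time to contact

print("Vz recovered from Eq. 3.5:", Vz_rec)
for zi, t_est, t_true in zip(z, tau, Tc):
    print(f"z = {zi:5.1f} cm   theta/theta_dot = {t_est:5.2f} s   true Tc = {t_true:5.2f} s")
```

The output shows that the angle-based estimate matches the true time to contact closely at large distances and degrades as the subtended angle grows, which is exactly the small-angle restriction stated above.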
The existence of such a firing range raises an interesting question: does the activity of T3 provide information regarding the distance of a looming stimulus? More generally, we ask what information is given by the activity of T3. A series of computer simulations was conducted to search for the answer. In the simulation experiments, one of the following features, namely the size, speed, or starting distance of a looming stimulus, is varied while the other two are held constant. The size of the stimuli will be given in cm unless otherwise specified.

Figure 3.1. Onset distance vs. starting distance. In this simulation, a 6x6 square is moved towards the eye starting from different distances. The abscissa represents the distance from which the square starts to approach, whereas the ordinate represents the distance between the stimulus and the eye when the first response in T3 is elicited (open circles) or the visual angle that the stimulus would subtend at the corresponding distance (solid circles). Note that for all simulations in which the stimulus starts to approach from a distance greater than 35 cm, the first response in T3 is elicited when the stimulus reaches a distance of 15 cm from the eye (open circles). The curve connecting the solid circles plots the same data except that the ordinate now represents the visual angle that the square would subtend at the corresponding distance.

3.2.1 Starting distance

In this experiment, a square object starting from different distances at a constant speed (20 cm/s) was simulated. A given stimulus starting from a long distance away elicits the first response in T3 when it reaches a constant distance, independent of the starting distance. For example, a 6x6 square always activates T3 when it reaches 15 cm away, as long as it starts to approach from further than 35 cm away (Fig. 3.1). T3 ceases to fire when the stimulus is closer than 9 cm. The onset distance decreases as the starting distance is reduced below 35 cm. The visual angle that the looming object subtends at the onset distance is also drawn in Fig. 3.1, and a reversed relation, as described above, is observed. This experiment confirms the existence of a firing range within which a looming stimulus will elicit a response in T3. Furthermore, the distance at which this range begins (i.e., the onset distance) increases as the starting point of a looming object becomes farther, up to a limit beyond which the onset distance becomes constant.

Figure 3.2. Onset distance vs. starting distance for 4 square objects. The simulation illustrated in Fig. 3.1 is repeated for squares of different sizes (2x2, 6x6, 10x10, and 14x14). The abscissa represents the distance from which the square starts to approach, whereas the ordinate represents the distance between the stimulus and the eye when the first response in T3 is elicited. Note that for all simulations with the same stimulus approaching from beyond a certain distance, the first response in T3 is elicited when the stimulus reaches a constant distance from the eye.

3.2.2 Stimulus size

In the next series of experiments, we test whether the aforementioned relationship between the firing range and the starting distance remains constant over different sizes of looming stimuli.
First, we tested stimuli of 10 sizes (2x2, 4x4, up to 20x20) approaching from 50 cm away at a speed of 20 cm/s. It turns out that the onset distance depends on the size of the looming object: the bigger the object, the greater the onset distance. Next, the same simulation of varying starting distance described above was repeated for four square objects with dimensions varying from 2x2 to 14x14 (Fig. 3.2). A different onset distance is obtained for each stimulus size. The starting distance at which the onset distance saturates also increases for larger stimuli. However, the slope of the initial increase in onset distance vs. starting distance is constant for all sizes of stimuli tested.

An interesting regularity is revealed when the visual angle the looming stimuli subtend at their respective onset distances is plotted against the starting distance. For starting distances greater than 30 cm, the visual angle at the onset distance is constant for all 10 square objects simulated (22 ± 3°), except for the 2x2 object (Fig. 3.3). The data suggest that T3 is sensitive to the size of the image of a looming stimulus cast on the retina but not to its absolute size.

Figure 3.3. Visual angle at onset distance vs. starting distance for 4 square objects. The same data illustrated in Fig. 3.2 are plotted by changing the ordinate from the distance between the stimulus and the eye when the first response in T3 is elicited to the visual angle that the square would subtend at the corresponding distance. Plotted in this way, it can be seen that the first response in T3 is elicited when the stimulus reaches a critical visual angle (about 13°, except for the 2x2 square).

3.2.3 Stimulus speed

The speed of a looming stimulus is an important element in the timing of a reaction. The effect of changing the looming velocity on the firing range was simulated and the result is shown in Fig. 3.4. In this experiment, the speed of a looming stimulus is varied from 5 cm/s to 60 cm/s. The simulation is repeated for square objects from 2x2 to 14x14, all starting from 50 cm away. A linear decrease in onset distance is associated with the increase in looming speed. The average slope of the decrease is about 0.17; that is, the effect of looming speed on the onset distance is small. When the visual angle at the onset distance is plotted, a curve described by an arc tangent function is observed (Fig. 3.5).

Figure 3.4. Onset distance vs. stimulus speed for square objects. The effect of varying the speed of a looming stimulus is simulated for squares of sizes 2x2, 3x3, 4x4, 6x6, 10x10, and 14x14. The abscissa represents the speed of the stimulus and the ordinate represents the distance at T3 onset. There is a small decrease in the onset distance at higher looming speeds.

Figure 3.5. Visual angle at onset distance vs. stimulus speed for square objects. The same data illustrated in Fig. 3.4 are plotted by changing the ordinate from the distance at T3 onset to the visual angle that the stimulus would subtend at the corresponding distance.
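The size dependence of the onset distance is what one would expect if T3 onset were triggered at a roughly fixed image size. A small sketch makes this concrete; the critical angle of 22° is taken from the simulations above, the object sizes are illustrative, and onset is assumed to occur exactly at that angle:

```python
import math

THETA_C = 22.0   # critical visual angle at T3 onset (degrees), from the simulations above

def onset_distance(size_cm, theta_c=THETA_C):
    """Distance at which a square of side `size_cm` subtends the critical angle."""
    return size_cm / (2.0 * math.tan(math.radians(theta_c) / 2.0))

for s in (2, 6, 10, 14, 20):
    print(f"{s:2d} cm square  ->  onset at about {onset_distance(s):5.1f} cm")
```

For the 6 cm square this gives roughly 15 cm, close to the onset distance reported above; the 2x2 stimulus, which the simulations flag as an exception, is not expected to follow this simple rule.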
3.2.4 Worm-like stimulus

The above series of experiments was repeated for worm-like (elongated) stimuli. Similar results were obtained. For large starting distances (>12.5 cm, which corresponds roughly to the limit of the snapping zone), a given worm-like stimulus activates T3 at a constant distance (Fig. 3.6). The onset distance decreases when the starting distance becomes smaller than a threshold distance. The relationship between the visual angle at the onset distance and the starting distance is harder to obtain since the lengths of the stimulus in the x and y dimensions are different. However, it is observed that the average of the angles in the x and y dimensions is quite stable over different worm-like stimuli. A constant decrease in onset distance for increased looming speed is also seen for the worm-like stimuli (Fig. 3.7).

Figure 3.6. Onset distance vs. starting distance for worm-like stimulus. The simulation illustrated in Fig. 3.2 is repeated for worm-like stimuli of different sizes (2x0.5, 3x0.5, 3x1, and 4x1). The abscissa represents the distance from which the stimulus starts to approach, whereas the ordinate represents the distance at T3 onset. The same observation made with large squares is also seen here: for all simulations with the same stimulus approaching from beyond a certain distance, the first response in T3 is elicited when the stimulus reaches a constant distance from the eye.

Figure 3.7. Onset distance vs. stimulus speed for worm-like stimulus. The effect of varying the speed of approaching worm-like stimuli is simulated. The abscissa represents the speed of the stimulus and the ordinate represents the distance at T3 onset. There is a small decrease in the onset distance at higher looming speeds.

3.3 Computing Depth by Neural Modulation

The activity of T3 does not offer direct information about the true speed or distance of a looming stimulus, since its onset depends on the size of the looming stimulus. Furthermore, the existence of the firing range excludes the use of Lee's algorithm, since after the response starts to decrease from its peak, T3 activity no longer signifies the expansion rate of a looming pattern, which continues to increase rapidly. However, T3 provides one important datum: a constant onset distance for stimuli starting from positions beyond a threshold distance. Another critical piece of information is its preference for smaller looming objects and the increase in its firing for faster looming stimuli (as discussed in Section 2.4). With such information, we postulate a biologically plausible neural mechanism for computing the absolute depth of a looming object.

The basic step in the computation is to integrate the activity of T3 over time. The rationale is that, due to its positive correlation with stimulus speed, the temporal integration will indicate the distance traveled after T3 is activated, since the onset distance is constant for a given stimulus starting from any position beyond the threshold. Such a signal will indicate the closeness of a looming stimulus. The second step is to resolve the size-dependency of T3. A hint of the stimulus size is provided by the firing rate of T3: higher activity for a smaller object. One may hope that the increased firing rate can compensate for the delayed activation for a smaller object. However, simulation of our model demonstrated otherwise: the increase in firing rate alone is insufficient (Fig. 3.8 and Fig. 3.9).
The most effective way for the integration of delayed T3 activity, in the presence of a smaller stimulus, to catch up with that of the advanced activation of T3 by a larger one is to increase the rising rate of the integrator. One biologically plausible way to achieve such an effect is to modulate the time constant of the integrator neuron.¹ Specifically, the degree of modulation is proportional to the activity of T3, i.e., the higher the T3 activation, the greater the modulation, which results in a shorter time constant for the integrator neuron. With such modulation, not only is the peak amplitude of the integrator increased, it also reaches its peak within a shorter time (Fig. 3.8). After a second round of integration, the activity of the integrator neuron for different stimulus sizes converges. When the signal of the second integrator neuron is subjected to a threshold function, the time of its activation occurs within a small time window for stimuli of different sizes looming at the same speed (Fig. 3.9). Such a signal will be useful in eliciting a timely action. One example of using this signal is in controlling the snapping distance during prey-capture behavior (see Chapter 5).

¹ The biochemical basis of neural modulation can be found in Kaczmarek and Levitan (1987). Servan-Schreiber et al. (1990) present a mathematical model.

Figure 3.8. Integration with modulation. (A) The results of simulations with different stimulus sizes are superimposed. The top row shows traces of the T3 activity in response to square objects of sizes 5x5, 6x6, 7x7, 8x8, and 9x9 (from right to left). The middle row shows the activity of the first integrator neuron responding to the same set of stimuli. The bottom row shows the activity of the second integrator neuron. Note that all traces in the bottom row converge. (B) The effect of neural modulation is increased in the simulation. The rising rate of the traces for smaller objects becomes higher (middle) and convergence is reached in a shorter time (bottom).

Figure 3.9. Absolute speed of looming stimulus. The results of simulations with different stimulus sizes and speeds are superimposed. (A) From right to left: the responses to a 4x4 square moving at 10 cm/s, an 8x8 square at 10 cm/s, a 4x4 at 20 cm/s, and an 8x8 at 20 cm/s in T3, the first integrator, and the second integrator (starting from the top). The bottom trace shows the result of passing the signal from the second integrator through a threshold function and limiting its firing to a few time steps. When the threshold is set to a value close to the point of convergence in the second integrator, a signal can be generated that reflects the absolute speed of the looming stimuli. (B) The responses to 2 additional stimuli (2x2 and 6x6 squares) at the same 2 speeds are superimposed onto the traces in (A).

A candidate for the first integrator neuron is the looming-sensitive neurons in the pretectum classified as TH6 cells by Ewert (1971). TH6 neurons have an ERF of either 180° or 360°. The input to TH6 comes from the contralateral eye via the ipsilateral optic tectum (Brown and Ingle 1973) — small tectal lesions produce scotomata in these neurons. The closer a looming object approaches, the stronger the TH6 response it can elicit, and so it is with the first integrator in our model. TH6 neurons respond more strongly to a 3 cm object approaching to 5 cm than to a 10 cm object approaching to only 10 cm, even though the latter subtends a larger visual angle.
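A minimal sketch of the two-stage integration with T3-dependent modulation of the time constant is given below. The leaky-integrator form, the time constants, the modulation gain, and the synthetic T3 traces are all illustrative assumptions rather than the parameters of the NSL model; the point is only to show how input-dependent shortening of the time constant lets the integrator driven by the smaller (later, but more active) stimulus catch up. The actual convergence shown in Figs. 3.8 and 3.9 depends on the model's own parameters.

```python
import numpy as np

dt = 0.01

def leaky_integrate(drive, tau0, k_mod=0.0):
    """Leaky integrator whose effective time constant shrinks as its input grows."""
    out, y = [], 0.0
    for u in drive:
        tau = tau0 / (1.0 + k_mod * u)      # stronger input -> shorter time constant
        y += dt * (-y + u) / tau
        out.append(y)
    return np.array(out)

def t3_trace(size, t, gain=6.0):
    """Synthetic T3 activity: earlier onset for larger stimuli, higher rate for smaller."""
    onset_t = 0.8 - 0.02 * size             # larger stimulus -> earlier onset
    amp = gain / size                        # smaller stimulus -> stronger firing
    return amp * np.clip(t - onset_t, 0.0, None)

t = np.arange(0.0, 2.0, dt)
for size in (4, 6, 8):
    t3 = t3_trace(size, t)
    first  = leaky_integrate(t3, tau0=0.3, k_mod=2.0)     # modulated first integrator
    second = leaky_integrate(first, tau0=0.3, k_mod=2.0)  # second integrator
    print(f"size {size}: second-integrator value at t = 1.5 s -> {second[int(1.5/dt)]:.2f}")
```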
The integrator in the model also exhibits this property. To the best of our knowledge, there is no record of neurons in anurans with properties similar to the second integrator in our model which produces constant response to looming stimuli of different sizes. However, in the nucleus rotundus of pigeons, neurons which reach their peak response at a constant time before contact over a wide range of the size or speed of looming stimuli has been reported (Wang and Frost 1992). The activity of the second integrator correlates to the speed of a looming stimulus regardless of its size (Fig. 3.9). In Fig. 3.9, looming stimuli of 3 sizes (4x4, 6x6 and 8x8 squares) moving at 2 speeds (10 and 20 cm/s) are simulated. The response of the second integrator converges to 2 distinct ranges corresponding to the 2 speeds, independent of the sizes, of the looming stimuli. Note that in Fig. 3.9 A, the onset time of T3 is the same for a 4x4 square moving at 20 cm/s and a 8x8 square at 10 cm/s. However, in the second integrator, the two curves converges separately to curves of the corresponding speed. 3.4 Discussion We have obtained from T3 activity, via two modulated integrations, a visual parameter that is a close approximate of the absolute speed of a looming object, independent of its size or starting distance (within limit). We will refer to this visual parameter as upsilon (v), the urgency to act. It is reciprocally related to Tc. There are i several advantages of using v over Tc (or x). First, there is no differentiation involved. To compute v, only integration processes are required. Differentiation is j problematic because it introduces noise and may not be biologically plausible (Nakayama 1982, van Dom and Koenderink 1982, Brost and Bahde 1986,1988). ■ Second, it is highly flexible. The degree of urgency signaled by v can be j modulated (Fig. 3.8 and 3.9). This is an important property since the range of I interaction with an approaching object is wide (e.g., escaping a predator, landing on a j j surface, or catching a ball) and the timing of these actions varies. Besides the j i flexibility for dealing with different interaction, it also greatly facilitates learning and adaptation by offering a specific site of plasticity with predictable outcome of modification. Third, it is an increasing signal (as the distance decreases) which is readily usable by the down stream centers; as oppose to the need of taking the reverse of Tc or t. ■ There is evidence that distance related signal is coded in a similar manner, i.e., the | closer a target, the higher the intensity of the signal in the pathway that carries ; information about target distance (Grobstein and Staradub 1989, Grobstein 1991). ! Fourth, it maybe a better reflection of biological reality. The visual parameter Tc (or x) is a mathematical notion. x > , on the other hand, is biologically-based. As most biological phenomena, it show some degree of variability and some operation range beyond which its performance degrades. For example, the convergence of curves of different sizes is not complete, but rather, some variation across different sizes is 1 observed (Fig. 3.9). The constancy of onset distance only occurs with larger starting distance. The speed or size correlation of T3 only holds true within certain range (cf. Fig. 2.8, 2.9). Similarly, Holmqvist and Srinivasan (1991) reported behavioral data 57 showing that the escape response of the housefly depends on the initial distance, speed l and size of a looming stimulus. 
The significance of biological consistency is profound ' in that it forces us to confront the problem of how the nervous system is able to perform reliably in the presence of such constraints as noisy signals (both external and internal) of limited range of reliability. Besides offering a better understanding of the brain, progress in this direction certainly will have great technological implications as well. ! 58 CHAPTER 4 ! MOTION PERCEPTION: MAMMALIAN VISUAL SYSTEMS j 4.1 The Issues This study seeks to unify theoretical and experimental approaches to research on motion perception. In particular, we focus on the issues concerning motion in depth and rotation. The mathematical analysis of the optic flow field often involves t temporal and spatial differentiation. However, differentiation introduces noise and may not be biologically plausible. Experimental data, on the other hand, has revealed many neuronal populations sensitive to various aspects of motion such as looming, rotation, and time-to-contact. Although the neural mechanisms underlying i » such motion perception are unknown, it has been shown that the biological visual i system relies heavily on its sophisticated capability of directional selectivity. In this ' t study, we first introduce a theory which provides a bridge to link these two I approaches to motion perception. Then we develop a neural network model that is capable of detecting motion in depth and rotation based on data from mammalian visual system in which the direction selective neurons form the optical flow field. Directional selective neurons provides the mammalian visual system with an extended functional capability in comparison to the anuran visual system. Much theoretical study of motion perception is based on the notion of the optic 1 flow field originated by Gibson (1955, 1958). The motion in depth has been I characterized by the divergence, div, of the optic flow field (Koenderink and von 59 Doom 1976, Longuet-Higgins and Prazdny 1980, Regan 1986). There are several ! problems faced by such a differential approach. First, the computation of the spatial j i derivatives of the flow field requires relatively smooth surfaces in the environment. | Second, the spatial differentiation introduces noise into the flow field. Third, there is j the question of how biologically plausible is the differentiation of the flow field to j take place in the visual systems as pointed out by Nakayama (1985). We offer an I alternative formulation based on Green’ s theorem that avoid spatial differentiation. Such a formulation has been used to derived a qualitative description (approaching or i I receding) of the div derived from the normal flow field for guiding robot navigation ; (Sharma 1992). ! I In spite of the progress they make in their own perspectives, little connection has ' j been made to bring together research on the theoretical and experimental tracks. : i This chapter present an efforts to provide a bridge between the two approaches. In the following sections, we first review experimental data concerning motion ; i perception in mammalian visual systems; followed by an introduction of vector ; I analysis based on Green's and Stokes's theorems to show how the visual processes | underlying mammalian motion perception can be described by vector integration. ! Finally, we present the implementation of the mathematical formalism by two neural network models capable of detecting motion in depth and rotation. 
4.2 H ierarchy in the M ammalian Visual Pathway The analysis of visual motion is complicated. It is natural, from a computational point of view, to decompose a complicated process into a hierarchy of subprocesses, j For example, motion perception can be decomposed into subtasks such as detection of translational motion and rotational motion, under which we may have more primitive functions like directional selectivity building on top of motion sensitivity. It is interesting to see that, amongst its massive parallelism, functional hierarchies organized in a fashion similar to what one would suggest based on computational considerations have been observed in biological visual systems . Hubei and Wiesel (1962, 1968) studied the diversity and the complexity of different sensory cell types and connections in visual cortex from extracellular microelectrode recordings in anesthetized cats and monkeys and discovered cells with orientation and velocity sensitivity characteristics of these units toward stationary and moving stimuli. Van Essen and Maunsell (1983, 1985) identified a hierarchical organization in the visual cortex of macaque monkey according to architectural structures, anatomical pathways, and functional properties. For example, area VI contains cells that are sensitive to spatial frequency (orientation, length, and width) as well as temporal frequency (direction and speed). Further up the hierarchy we observe two distinct functional divisions. One function focuses specifically on the analysis of forms, shapes, and color while the other is involved with the analysis and interpretation of moving stimuli. Hence, specific functions are developed from a pool of more general ones from the lower hierarchy. They found the middle temporal area (MT) to be associated highly with motion analysis. In particular, MT may be involved in computing qualitatively different information, perhaps a composite measurement, which characterizes more complex aspects of motion such as direction-selectivity, speed-selectivity, binocular-disparity-selectivity, and preference of opposing movement, an important cue in looming detection (Regan and Beverly 1978, Mottor and Mountcastle 1981, Maunsell and Van Essen 1983, Sakata et al. 1985). ; 6i i i Even higher in the hierarchy, neurons in the medial superior temporal area (MST) are capable of detecting looming and rotational motion (Saito et al. 1986, 1989). Intensive projection from MT to MST suggests that these looming and \ rotation detecting neurons integrate information processed by the MT neurons. ! Leinonen (1980) reported cells in posterior area 7 responding positively to rotational stimuli in the awake monkey. Rizzolatti et al. (1981) found similar neurons in the } premotor cortex. From studies of rhesus monkeys, Sakata et al. (1985) decomposed cortical area 7a, which is concerned with space vision, into various functional groups. They are visual fixation (VF) neurons, visual tracking (VT) neurons, and j passive visual (PV) neurons. They found PV neurons specifically sensitive to the j visual rotational stimulus. Additionally, some of these neurons responded to rotation \ in the frontoparallel plane and others responded to rotation in depth. Saito et al. ; (1986) discovered neurons in the medial superior temporal area (MST) capable of detecting looming and rotational motion. Intensive projection from MT to MST j suggests that these looming and rotation detecting neurons integrate information I processed by the MT neurons (Ungerleider et al. 
4.3 Vector Analysis of Optical Flow Field

We introduce here a theoretical framework of motion integration for extracting higher motion parameters (including looming and rotation) in a biologically plausible way. These motion parameters are extracted based on the gradient of an optic flow field. For example, a looming pattern is characterized by a radial outward gradient (the divergence, div) of the optic flow field, and a rotating motion can be represented as the curl of the field. We will show that two interpretations of the div and curl of a general vector field are mathematically equivalent. This result is then applied to the optic flow field, showing its biological implication.

Given a vector field V in a domain D of 2-D space, one has two scalar functions Vx and Vy such that V = (Vx, Vy). We assume that these functions have first partial derivatives in D. The divergence, div, of the vector field V is the scalar

    div V = ∂Vx/∂x + ∂Vy/∂y                                      (4.1)

The curl of V is measured by its z-component,

    curlz V = ∂Vy/∂x − ∂Vx/∂y                                    (4.2)

where the x and y coordinate axes span the image plane and z is the axis normal to it. Based on Green's theorem (Kaplan 1984), it can be shown that

    ∮C Vn ds = ∬R div V dx dy                                    (4.3)

where D is a domain of the xy-plane, C is a piecewise smooth simple closed curve in D whose interior is also in D, Vn denotes the component of V normal to the direction of increasing arc length s along C, and R is the closed region bounded by C. This result is a 2-D form of Gauss's theorem. The left hand side of the equation can be interpreted as the total mass leaving R per unit time (i.e., the flux across the boundary C) whereas the right hand side measures the rate at which the density decreases throughout R (i.e., the total loss of mass per unit time). Furthermore, based on Stokes's theorem (Kaplan 1984), one has the following equation:

    ∮C Vt ds = ∬R curlz V dx dy                                  (4.4)

where Vt denotes the tangential component of V in the direction of increasing arc length s and curlz V represents the rotation of the vector field around the z-axis. This result is a special case of Stokes's theorem.

Equations (4.3) and (4.4) have special significance in the research of motion perception. Equation (4.3) states that the integration of the directional (normal) component along a closed curve is equivalent to that of the divergence of the vector field over the region bounded by the same curve, while equation (4.4) shows that the integration of the directional (tangential) component along a closed curve is equivalent to that of the z-component of its curl over the region bounded by the same curve. In theoretical analyses of motion perception, the divergence of an optical flow field has been used as a measurement of looming (or receding) patterns, and its curl as the signal for rotational patterns. However, there are reasons to believe that, due to its low-pass filtering nature, it is not plausible for biological visual systems to compute the spatial derivatives of an optic flow field (Nakayama 1985). Meanwhile, neurobiologists have reported a rich set of direction selective neurons which are closely related to the looming and rotation detectors in biological visual systems. Equations (4.3) and (4.4) provide a mathematical ground that can bring together the theoretical and experimental ends of the research on motion perception. We will demonstrate how this scheme of motion integration can be implemented in mammalian visual systems.
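To make the equivalences in Eqs. (4.3) and (4.4) concrete, the following minimal numerical sketch (written in Python, independent of the NSL implementation; the test fields and grid sizes are chosen only for illustration) compares the boundary integrals of the normal and tangential flow components with the corresponding area integrals of div and curlz for an expanding field V = (x, y) and a rotating field V = (−y, x). The two sides of each equation agree up to discretization error.

    import numpy as np

    def boundary_integrals(field, radius=1.0, n_boundary=2000):
        """Boundary integrals over a circle of the given radius:
        flux = sum of Vn ds (Eq. 4.3, left side), circ = sum of Vt ds (Eq. 4.4, left side)."""
        theta = np.linspace(0.0, 2.0 * np.pi, n_boundary, endpoint=False)
        ds = 2.0 * np.pi * radius / n_boundary
        x, y = radius * np.cos(theta), radius * np.sin(theta)
        vx, vy = field(x, y)
        nx, ny = np.cos(theta), np.sin(theta)        # outward normal
        tx, ty = -np.sin(theta), np.cos(theta)       # tangent in increasing arc length
        flux = np.sum(vx * nx + vy * ny) * ds
        circ = np.sum(vx * tx + vy * ty) * ds
        return flux, circ

    def area_integrals(field, radius=1.0, n_grid=400, h=1e-4):
        """Area integrals of div and curlz over the disc (right sides of Eqs. 4.3, 4.4),
        estimated with central differences on a regular grid."""
        xs = np.linspace(-radius, radius, n_grid)
        X, Y = np.meshgrid(xs, xs)
        inside = X ** 2 + Y ** 2 <= radius ** 2
        dA = (xs[1] - xs[0]) ** 2
        dvx_dx = (field(X + h, Y)[0] - field(X - h, Y)[0]) / (2 * h)
        dvy_dy = (field(X, Y + h)[1] - field(X, Y - h)[1]) / (2 * h)
        dvy_dx = (field(X + h, Y)[1] - field(X - h, Y)[1]) / (2 * h)
        dvx_dy = (field(X, Y + h)[0] - field(X, Y - h)[0]) / (2 * h)
        div_int = np.sum((dvx_dx + dvy_dy)[inside]) * dA
        curl_int = np.sum((dvy_dx - dvx_dy)[inside]) * dA
        return div_int, curl_int

    looming = lambda x, y: (x, y)        # div = 2, curlz = 0
    rotation = lambda x, y: (-y, x)      # div = 0, curlz = 2
    for name, f in [("looming", looming), ("rotation", rotation)]:
        print(name, boundary_integrals(f), area_integrals(f))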
4.4 A Neural Network Model

Now we demonstrate how the integration of the normal vectors of an optic flow field can be performed in a hierarchical way, as seen in the biological visual pathways. First, we form an opponent vector detector by pairing direction-selective neurons whose preferred directions point away from each other. For example, a horizontal opponent vector detector is composed of a set of leftward motion detectors on the left half of the receptive field and another set of rightward motion detectors on the right (Fig. 4.1). Each pair of such opponent vector detectors computes the normal vector of the flow field in a certain direction. A set of these pairs, each corresponding to a direction, gives rise to a discrete approximation of the outward flux of the optic flow throughout the receptive field. Second, another type of opponent vector detector, composed of a pair of neurons whose preferred directions point towards each other, computes the inward flux in a certain direction and is constructed in a similar way. Finally, a div integrator, which sums up all the local flux, is constructed by summing up the outward flux computed by the set of outward opponent vector detectors and subtracting the inward flux calculated by the set of inward opponent vector detectors. Such inhibition has been documented by Cynader and Regan (1978) in neurons sensitive to motion in depth.

Fig. 4.1: The hierarchical organization of a neural network for detecting looming patterns. At the lowest level are the direction selective neurons, which are paired to form the detectors of an opponent vector. There are two types of opponent vector: one with components pointing away from each other (the upper part of the figure) and one with components pointing towards each other (the lower part). The looming detector is constructed by connecting the first type of opponent vector detectors via excitatory synapses and the other type via inhibitory synapses.

4.4.1 Looming stimuli

Let R be the receptive field of a looming detector bounded by a closed curve Ω. The integrated divergence of the optic flow field covered by the looming detector is given by the integral ∬R div V dx dy. Therefore, a looming pattern in the flow field is indicated by the condition that

    ∬R div V dx dy > θ                                           (4.5)

where θ is some predefined threshold for detecting looming stimuli. From Eq. (4.3), this condition is equivalent to the following one:

    ∮Ω Vn ds > θ                                                 (4.6)
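The discrete approximation of this condition can be illustrated with the following sketch (Python; the receptive-field geometry, the number of preferred directions, and the test stimulus are assumptions chosen for illustration). Rectified responses along each preferred direction stand in for the direction-selective layers; pooling them over the corresponding half-fields mimics the outward and inward opponent vector detectors, and their difference approximates the signed flux that is compared with the threshold θ. The pooling here covers the whole receptive field rather than strictly its boundary, in the spirit of the position-invariant connectivity discussed below.

    import numpy as np

    def looming_flux(vx, vy, center, radius, n_dirs=8, theta=0.0):
        """Approximate the outward flux of the flow (vx, vy) within one receptive
        field from opponent pairs of rectified direction-selective responses."""
        H, W = vx.shape
        ys, xs = np.mgrid[0:H, 0:W]
        dx, dy = xs - center[0], ys - center[1]
        r = np.hypot(dx, dy)
        rf = (r <= radius) & (r > 0)
        outward = inward = 0.0
        for k in range(n_dirs):
            u = np.array([np.cos(2 * np.pi * k / n_dirs), np.sin(2 * np.pi * k / n_dirs)])
            proj = vx * u[0] + vy * u[1]
            half = rf & ((dx * u[0] + dy * u[1]) > 0)          # half-field lying along u
            outward += np.clip(proj, 0.0, None)[half].sum()    # motion away from the center
            inward += np.clip(-proj, 0.0, None)[half].sum()    # motion toward the center
        flux = (outward - inward) / n_dirs
        return flux > theta, flux

    # Example: an expanding flow centered at (32, 32) satisfies the looming condition,
    # while a purely translational flow yields a flux near zero.
    ys, xs = np.mgrid[0:64, 0:64]
    print(looming_flux(xs - 32.0, ys - 32.0, center=(32, 32), radius=20))
    print(looming_flux(np.ones((64, 64)), np.zeros((64, 64)), center=(32, 32), radius=20))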
The integration in Eq. (4.3) is performed along the boundary Ω. In order for a div integrator to sum up the signals from opponent vector detectors only along the boundary of its receptive field, one can arrange the synaptic connectivity in the way shown in Fig. 4.2 A. Although a better approximation of the divergence can be obtained via such a synaptic connectivity pattern, there are reasons on biological grounds that oppose it. Saito et al. (1986) reported a unique property of the looming detectors found in the MT and MST areas of macaque monkeys, namely that they respond to looming stimuli anywhere within a restricted central region of their receptive field (position invariance). Furthermore, the initial size of an expanding stimulus is not important in activating the looming detectors. A synaptic connectivity pattern that resembles a Gaussian distribution (Fig. 4.2 B) could perform better in terms of these biological properties.

Figure 4.2: The synaptic weight masks. These are qualitative representations of weight masks for the network. (A) The inverted hat is utilized to approximate the integration of the normal vectors along the boundary of the receptive field of a looming detector. (B) An example of a Gaussian mask, which is the alternative choice for the synaptic connectivity.

The position of a looming object can be determined by localizing its center. As shown in Fig. 4.3 A, a looming detector with its receptive field aligned with the center of a looming pattern in the optic flow field receives only excitatory inputs from the opponent vector detectors without any inhibition, whereas a looming detector whose receptive field covers a more peripheral region of the looming flow pattern receives inhibitory inputs, with the strength of such inhibition increasing as one moves further away from the center. As a result, the looming detectors that correspond to the center of a looming flow pattern have a higher activation level and the activity decays as one moves toward the perimeter. Therefore, the spatial position of a looming stimulus is encoded in a population of looming detectors where the more central neurons have higher activations (Fig. 4.3 B).

Fig. 4.3: A. The flow pattern generated by a looming object. The normal vectors summed by a looming detector whose receptive field covers the center of the flow all have positive values whereas, at a more peripheral region, a looming detector integrates both positive and negative normal vectors that cancel each other. B. As a result, the activity of the looming detectors is highest in those positions that correspond to the center of a looming flow pattern and the activity decays as one moves toward the perimeter.
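The two candidate connectivity patterns of Fig. 4.2 can be sketched as follows (Python; the sizes, widths, and exact functional forms are illustrative assumptions, not the masks used in the simulations). The ring-shaped inverted hat concentrates the weights of the div integrator near the receptive-field boundary, as Eq. (4.3) suggests, while the Gaussian mask pools the opponent-vector signals over the central region and thereby favors position invariance.

    import numpy as np

    def ring_mask(size, radius, width):
        """'Inverted-hat' mask: weights concentrated near the receptive-field
        boundary, approximating integration along the closed curve (Fig. 4.2 A)."""
        c = (size - 1) / 2.0
        ys, xs = np.mgrid[0:size, 0:size]
        r = np.hypot(xs - c, ys - c)
        return np.exp(-0.5 * ((r - radius) / width) ** 2)

    def gaussian_mask(size, sigma):
        """Gaussian mask: strongest near the center, falling off smoothly, which
        favors position invariance of the looming detector (Fig. 4.2 B)."""
        c = (size - 1) / 2.0
        ys, xs = np.mgrid[0:size, 0:size]
        r2 = (xs - c) ** 2 + (ys - c) ** 2
        return np.exp(-0.5 * r2 / sigma ** 2)

    # Middle rows of two 15 x 15 masks: the ring mask peaks near the rim,
    # the Gaussian mask peaks at the center.
    print(ring_mask(15, radius=6.0, width=1.5)[7].round(2))
    print(gaussian_mask(15, sigma=3.0)[7].round(2))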
4.4.2 Receding stimuli

Similarly, a receding pattern is characterized by a radial inward flux of the optic flow field. Now the sum of the divergence of the flow field is smaller than zero (or some predefined threshold):

    ∮Ω Vn ds = ∬R div V dx dy < θ                                (4.7)

The integration of the inward flux of the optic flow is obtained in a similar way. The only difference is that the integration of the activities of the opponent vector detectors is constructed in a reverse manner. Now the div integrator sums up the inward flux computed by the set of inward opponent vector detectors and subtracts the outward flux calculated by the set of outward opponent vector detectors. From a mathematical point of view, the function of a receding detector is the negation of that of a looming detector.

4.4.3 Rotational motion

A rotational pattern in the optic flow field on the xy-plane is characterized by the z-component of its curl, given by the integral ∬R curlz V dx dy, where R is the receptive field of a corresponding rotation detector. This integral measures the extent to which the flow field is a rotation around the z-axis. Thus a rotational pattern in an optical flow field is signaled by the condition that

    ∬R curlz V dx dy > θ                                         (4.8)

where θ denotes some predefined threshold for detecting rotation. From Eq. (4.4), this condition is equivalent to the one below:

    ∮Ω Vt ds > θ                                                 (4.9)

We now show how a discrete approximation of ∮Ω Vt ds of an optic flow can be achieved by a hierarchical network built on top of direction-selective neurons. The rotation-sensitive cells are formed by arranging the projection of the direction-selective neurons in a special way. When a rotational stimulus moves in the visual field, say an object moving on a circular path, each layer of direction-selective neurons will respond to the stimulus when the path it traces matches coarsely the preferred direction of that layer. That is, as a stimulus rotates, a sequence of layers will be activated, one after another, when the instantaneous movement direction of the stimulus falls in line with the preferred direction of a layer. The transient activation of these layers indicates the tangential component of the optic flow in their receptive fields. By lining up the projections of these layers to the rotation detecting layer with their respective preferred directions, the cells of the latter layer are able to sum up activities in this sequence of activated layers (Fig. 4.4). The temporal summation of the activity across the sequence of layers by the rotation-detecting neurons is achieved by the spatial distribution of synaptic weights tailored towards the preferred direction of each of the input layers. The total activity of the rotation detecting layer is an approximation of the curlz of the optic flow field.

Fig. 4.4: The structure of a detector of counter-clockwise rotational stimuli. The rotation detector receives inputs from various layers of direction selective neurons. Each of the direction selective neurons detects motion in the direction tangent to the boundary of the rotation detector at a particular location. For example, the neuron that is sensitive to upward motion monitors such motion at the right part of the receptive field of the rotation detector, while one that is tuned to leftward motion monitors that in the upper portion.
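A corresponding sketch of the discrete approximation of Eq. (4.9) is given below (Python; the sector width, the number of direction-selective pools, and the test stimulus are illustrative assumptions). Each direction-selective pool monitors the sector of the receptive field where its preferred direction coincides with the counter-clockwise tangent, as in Fig. 4.4, and the pooled rectified responses approximate the circulation that is compared with the threshold θ.

    import numpy as np

    def rotation_circulation(vx, vy, center, radius, n_dirs=8, theta=0.0):
        """Approximate the circulation of the flow (vx, vy) within one receptive
        field: each direction-selective pool watches the sector where its preferred
        direction is tangent to a counter-clockwise rotation about the center."""
        H, W = vx.shape
        ys, xs = np.mgrid[0:H, 0:W]
        dx, dy = xs - center[0], ys - center[1]
        r = np.hypot(dx, dy)
        rf = (r <= radius) & (r > 0)
        circ = 0.0
        for k in range(n_dirs):
            u = np.array([np.cos(2 * np.pi * k / n_dirs), np.sin(2 * np.pi * k / n_dirs)])
            rad = np.array([u[1], -u[0]])   # radial direction whose CCW tangent equals u
            sector = rf & ((dx * rad[0] + dy * rad[1]) > 0.7 * r)
            resp = np.clip(vx * u[0] + vy * u[1], 0.0, None)   # rectified motion along u
            circ += resp[sector].sum() / n_dirs
        return circ > theta, circ

    # Example: a counter-clockwise rotation about (32, 32) gives a large circulation,
    # while an expanding flow about the same point gives almost none.
    ys, xs = np.mgrid[0:64, 0:64]
    print(rotation_circulation(-(ys - 32.0), xs - 32.0, center=(32, 32), radius=20))
    print(rotation_circulation(xs - 32.0, ys - 32.0, center=(32, 32), radius=20))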
4.5 Implementation and Simulation

4.5.1 Neural network model for motion perception

The neural network for motion perception is composed of layers of neurons. We will use I(i,j,t) to denote the activity of the (i,j) cell of the input layer, a sheet of receptors, at time t. The second level consists of D layers of directionally selective units, each corresponding to a different direction. Note that these are functional layers; they do not correspond to distinct anatomical layers of cortex.

Fig. 4.5: Synaptic masks for direction selective neurons. (A) This mask is used in the construction of direction sensitive units by having excitation (inhibition) in the preferred (null) direction. (B) We employ this type of mask when we sum up all the direction sensitive layers, as the masks Wk in equation (4.15). This mask is rotated according to the preferred directions of these input layers so that each of them monitors a particular part of the receptive field where the tangent vector points in the preferred direction.

Neurons at every point (i, j) in the layer Lk are most sensitive to movements in the preferred direction of Lk. Lk receives input from the input layer I and has activity

    τ · dLk(t)/dt = −Lk(t) + a·I(t) + Mk*I(t)                    (4.10)

where * denotes the convolution operator (defined in Appendix A), a = 2.0 is a scaling factor for the input stimulus, and the time constant τ is 2.5 ms. The mask Mk is based on the principle of asymmetric synaptic weights for computing movement direction, i.e., excitatory weights towards the preferred direction and inhibitory weights towards the null direction (Fig. 4.5 A). The equations of leaky integrator neurons, which have the general form

    τ · dM(t)/dt = −M(t) + S(t),

will be written in the shorter form M := S throughout the thesis (see Appendix A for more detail). For example, Eq. (4.10) can be written as Lk := a·I + Mk*I.

The detector of opponent vectors pointing away from each other, O+(i,j,t), is formed by combining direction-selective neurons of opposite directions from each half of the receptive field:

    O+ := M1*Lk+ + M2*Lk−                                        (4.11)

where M1 and M2 are shown in Fig. 4.5 B, and Lk+ and Lk− are two layers of direction-selective neurons with opposite preferred directions. The detector of the other type of opponent vectors (pointing towards each other), O−(i,j,t), is built by combining direction-selective neurons of opposite directions from the other halves of the receptive field. This is done simply by switching the convolution masks for the two direction-selective layers:

    O− := M2*Lk+ + M1*Lk−                                        (4.12)

A layer of looming detectors is constructed by converging the opponent vector detectors, Ok+ and Ok−, onto an output layer in the following way:

    LOOM := Σk (k1·Ok+ − k2·Ok−)                                 (4.13)

where k ranges over the opponent vector detecting layers and k1 and k2 are positive scaling constants. A layer of receding detectors is constructed in the reverse way:

    RECEDING := Σk (k1·Ok− − k2·Ok+)                             (4.14)
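The following sketch (Python with NumPy and SciPy rather than NSL) illustrates how Eqs. (4.10)-(4.13) can be stepped in discrete time using the leaky-integrator shorthand. The mask shapes, mask sizes, and the scaling constants k1 = k2 = 1 are assumptions made for illustration, while a = 2.0 and τ = 2.5 ms follow the text; only one opponent axis (leftward/rightward) is shown.

    import numpy as np
    from scipy.ndimage import correlate

    def leaky_step(state, drive, tau, dt=1.0):
        """One Euler step of tau * dX/dt = -X + S, i.e., the 'X := S' shorthand."""
        return state + (dt / tau) * (-state + drive)

    def asym_mask(u, size=5, sigma=1.0):
        """Asymmetric mask Mk (Fig. 4.5 A): excitation offset toward the preferred
        direction u and inhibition toward the null direction."""
        c = (size - 1) / 2.0
        ys, xs = np.mgrid[0:size, 0:size].astype(float)
        g = lambda ox, oy: np.exp(-((xs - c - ox) ** 2 + (ys - c - oy) ** 2) / (2 * sigma ** 2))
        return g(u[0], u[1]) - g(-u[0], -u[1])

    # A drifting bar as input, and one pair of opposite preferred directions.
    I = np.zeros((64, 64)); I[30:34, 20:24] = 1.0
    right, left = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
    L_right, L_left = np.zeros_like(I), np.zeros_like(I)
    a, tau = 2.0, 2.5
    for t in range(30):
        I = np.roll(I, 1, axis=1)                                           # stimulus drifts rightward
        L_right = leaky_step(L_right, a * I + correlate(I, asym_mask(right)), tau)   # Eq. (4.10)
        L_left = leaky_step(L_left, a * I + correlate(I, asym_mask(left)), tau)

    # Half-field pooling masks for the opponent detectors of Eqs. (4.11)-(4.12).
    left_half = np.zeros((9, 9)); left_half[:, :4] = 1.0
    right_half = np.zeros((9, 9)); right_half[:, 5:] = 1.0
    O_plus = correlate(L_left, left_half) + correlate(L_right, right_half)   # outward pair
    O_minus = correlate(L_right, left_half) + correlate(L_left, right_half)  # inward pair
    LOOM = O_plus - O_minus                    # one term of Eq. (4.13), with k1 = k2 = 1
    print(float(LOOM.max()), float(LOOM.min()))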
We now turn our attention to the rotation detection model. There are two output layers in the rotation detection network. The first layer sums up the distributed activities in all the direction-selective layers to trace out the boundary of the rotational motion. The activity in this layer can be described as

    Lout1 := Σk Wk*Lk                                            (4.15)

where Lk denotes the kth direction-selective layer, Wk is a large mask as illustrated in Fig. 4.5 B, and the time constant is 10 ms. Note that the distribution of positive weights in Wk coincides with the preferred direction of the input layer Lk. As a stimulus rotates, a sequence of layers will be activated, one after another, when the instantaneous movement direction of the stimulus falls in line with the preferred direction of a layer. Thus, the correspondence between the arrangement of synaptic weights and the preferred direction enables neurons in the output layer to sum up activities in the sequence of the activated input layers.

The final output layer, which corresponds to the center of a rotating stimulus, is given by Eq. (4.16), with the time constant equal to 5 ms, where Wout2 (defined in Eq. 4.17) is an on-surround and off-center (inverse) Gaussian mask illustrated in Fig. 4.2 A, with k = 0.18 and σ = 0.38. An inverse Gaussian mask is suitable for extracting the center of a population of active neurons in which the activity concentrates on the periphery. However, in the case where two groups of active neurons intersect with each other (caused by two intersecting rotations), an inverse Gaussian mask with an off-center (zero or small positive synaptic weights near the center) will lead to strong activity in the overlapping region, signaling a single, incorrect center of rotation. To overcome this problem, the inverse Gaussian mask is modified to have an inhibitory center so that only neurons that correspond to low or no activity in the input layer can be activated. Such a mask works fairly well in extracting the centers of multiple rotations, even when they overlap with each other, as shown in the next section.
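A possible form of such an on-surround mask, and its modification to an inhibitory center, is sketched below (Python). The functional form is an assumption made for illustration and is not necessarily the mask of Eq. (4.17); only the constants k = 0.18 and σ = 0.38 from the text are reused, with the radius normalized to the half-width of the mask.

    import numpy as np

    def inverse_gaussian_mask(size, k=0.18, sigma=0.38, center_gain=0.0):
        """On-surround mask: small weights near the center, larger toward the
        surround.  center_gain > 0 turns the off-center into an inhibitory center."""
        c = (size - 1) / 2.0
        ys, xs = np.mgrid[0:size, 0:size]
        r = np.hypot(xs - c, ys - c) / c               # radius normalized to [0, 1]
        return k * (1.0 - (1.0 + center_gain) * np.exp(-r ** 2 / (2.0 * sigma ** 2)))

    off_center = inverse_gaussian_mask(11)                      # near-zero weight at the center
    inhib_center = inverse_gaussian_mask(11, center_gain=1.0)   # negative weight at the center
    print(off_center[5, 5], inhib_center[5, 5], off_center[5, 0])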
4.5.2 Computer simulation

We simulated our model with different motion patterns; the simulations were carried out on a SUN workstation using NSL. We present some of the results here. All the examples involve overlapping patterns, since the problem of handling occlusion has been the major weakness of the optic flow approach to motion perception.

Fig. 4.6: Two expanding patterns that eventually overlap each other (A-C). There is a considerable amount of interference in the flow field generated by these two patterns. For example, at some point in time between snapshots A and B, two edges of the two patterns move closer towards each other, giving rise to an illusion of a shrinking pattern. Moments later (after snapshot C), they start to move away from each other, thus signaling a third looming pattern. Nevertheless, the network locates the centers of the two looming stimuli correctly (D).

Fig. 4.7: Two shrinking patterns with identical initial shapes and locations. Interference caused by these two patterns can be seen in B, in which there appear to be three instead of two shrinking patterns. Again, the network correctly locates the centers of the two shrinking patterns.

Motion in depth

Two looming patterns whose expanding images overlap each other were presented to the model (Fig. 4.6 A-C). The model was able to localize the centers of the two looming stimuli, signaled by two peaks of activation in the layer of looming detectors (Fig. 4.6 D). In another set of simulations, we presented two shrinking patterns coupled with translational motion to the model (Fig. 4.7 A-C). Again, the model was able to detect two shrinking patterns, indicated by the two groups of activated receding detectors (Fig. 4.7 D). However, the network was not always successful in detecting overlapping patterns, and in some cases it may be confused by the interfering patterns and produce erroneous results such as a single peak of activation. We also tested the property of positional invariance of our model by presenting looming and receding patterns in different spatial locations. The model was able to detect these patterns in a restricted region within its receptive field.

Figure 4.8: Two rotational stimuli in phase and angular velocity on intersecting paths. There are some ambiguities around the areas where the two rotational stimuli intersect, as seen in the memory trace of the first layer of rotation detectors (A), but eventually the output became stable and the centers of both rotational stimuli are detected (B).

Rotational motion

We tested the rotation detecting network by simulating two intersecting rotational stimuli, each composed of 3 dots following a circular path (Fig. 4.8). The network successfully gives the correct result of two circular traces and the corresponding centers of rotation. We also presented a rotational pattern with translational motion to the model. The model was able to track the moving rotatory trace and its center.

4.6 A Comparison of Anuran and Mammalian Motion Perception

We have studied and modeled two different types of motion perception system: one based on the anuran visual system, which is capable of perceiving 3-D motion without optical flow information, and the other based on the mammalian visual system, which relies on such information. Even though the anuran visual system is able to extract essential features of 3-D motion such as its trajectory and its speed of approach, there are limitations of this system due to the lack of optical flow information.

First, it is highly sensitive to the reversal of contrast. Both behavioral and simulation data show that a frog is not capable of detecting a white object looming against a black background. The property of direction selectivity does not depend on the order of contrast; it responds to a change in contrast either from bright to dark or the reverse. Therefore, the looming detectors in the mammalian visual system can perform under both contrast conditions.

Second, the simulation of our model suggests that the anuran visual system cannot perceive a receding stimulus. Two factors contribute to this limitation. One is the low sensitivity to the trailing edges of a moving stimulus by the R3 retinal ganglion cells. A leading (trailing) edge of a moving stimulus is the region where the contrast changes from bright to dark (dark to bright) in the direction of the movement. For a receding stimulus, along the shrinking edges the contrast changes from dark to bright, which elicits only a low response in R3. Furthermore, the shrinking area also activates a decreasing number of R4. As a result of these two factors, T3 does not respond to a receding stimulus. A shrinking white stimulus against a dark background can elicit a response in T3; however, the number of activated T3 neurons decreases as the stimulus becomes smaller. With the contrast independence of its direction-selective neurons, the mammalian visual system is able to identify correctly whether a stimulus is approaching or receding under different contrast conditions.

Third, T3 cannot distinguish a looming stimulus from a rotating one if it fills an appropriate portion of its receptive field (i.e., as long as it activates the group of R3s that are connected to the T3). However, this has little effect at the behavioral level since only one (or a few) T3 can be activated by such a stimulus.

In summary, the major difference between the anuran and mammalian visual systems, namely the availability of optical flow information, results in several differences in their functional capabilities. With optical flow information, the mammalian visual system achieves higher distinguishing power. However, not all differences at the cellular level bear behavioral significance.
CHAPTER 5

SNAPPING MOTOR PATTERN GENERATION

5.1 Introduction

Performing a motor task requires synchronized execution of muscle synergies. Analysis of electromyography (EMG) data obtained from a wide variety of motor tasks has revealed correlations between features of the EMG record and task parameters. For example, in an experiment where a human subject is asked to perform rapid reversal movements, it was found that the peak amplitude in biceps and the rising rate of triceps are closely correlated with movement speed (Schmidt et al. 1988), whereas variations in the peak amplitude of both biceps and triceps are correlated with movement distance (Sherwood et al. 1988). Gottlieb and colleagues (1992) demonstrated that, in fast reaching movements, the peak amplitude in biceps and the latency to onset of triceps are closely correlated with movement distance. However, the neural mechanisms of the control of muscle activity are poorly understood due to the complexity of the premotor and motor circuits involved.

Most of the modeling work on goal-directed movements concentrates on the optimization criteria used in the computation of inverse kinematics (Flash and Hogan 1985, Bermejo and Zeigler 1989, Bizzi et al. 1992, Hoff and Arbib 1993). Abstract models of coordinating serial motor synergies have been proposed (Rumelhart and Norman 1982 for typing, Schomaker 1992 for handwriting). Research on motor pattern generators (MPGs), on the other hand, has thrived in studying rhythmic behavior such as gastric mill rhythms in lobsters (Selverston and Moulins 1987) and locomotion (Grillner et al. 1988; see Harris-Warrick and Marder 1991 for a review from a modulatory perspective). However, an account of the neural mechanisms of the MPGs involved in coordinating goal-directed motor synergies is still lacking.

The relative simplicity of snapping behavior and the accessibility of the premotor circuitry in anurans present an important opportunity to study the neural mechanisms underlying the coordination of multiple motor synergies. This behavior carries all the essential aspects of motor control, including the interaction of agonist and antagonist in generating a single motor synergy (e.g., the projection and retraction of the tongue), and the synchronization of multiple motor synergies. To this end, we develop a model of the snapping MPG as a vehicle to investigate the following questions. What are the characteristic features of the activity of a single muscle? How can these features be controlled by the premotor circuit? What are the strategies employed to generate and synchronize motor synergies? What is the role of afferent feedback in shaping the activity of an MPG?

The neural network model of the MPG is composed of leaky integrator neurons. The model is implemented in NSL, the Neural Simulation Language (Weitzenfeld 1991), and the simulation is conducted on a SUN SPARCstation. Though the parameters of the model are quantitatively consistent with available experimental data, the purpose of our study is not to offer "the" set of parameters, such as synaptic weights, for the MPG, since the building unit of our model is itself a highly simplified abstraction of a real neuron. Rather, we intend to use "a" biologically consistent model to elucidate the organizational and functional principles underlying the generation and coordination of motor synergies and to explore a behavioral repertoire beyond the currently available experimental data set. Ultimately, we wish to demonstrate the
Ultimately, we wish to demonstrate the 81 , j potential of a close coupling of experimentation and modeling approaches in advancing our understanding of the nervous system. 5.2 Controlling Single Muscle Activation 5.2.1 Neurophysiology of the premotor neurons The reticular formation in the medulla has been considered a major premotor , i area in many species including anurans. Schwippert et al. (1989, 1990, Ewert et al. j 1990) recorded in toads from this premotor area and categorized 10 functional units (M l to M10) pertinent to sensorimotor integration. W eerasuriya (1989) and Matsushima et al. (1989) found that the premotor neurons in the medulla send axon collaterals into motoneuron pools. Among these physiologically identified premotor neurons, two classes of neurons exhibit opposite firing characteristics. Shortly before a movement, M8.1 firing rate increases from less than 5 Hz to about 40 Hz, while that of M8.2 decreases from 40 Hz to less than 3 Hz. Such duality between M8.1 and M8.2 is observed in a wide variety of movements including eye closure, walking, stalking, orienting, and snapping. We hypothesize that the opposing characteristic of M8.1 and M8.2 provides a push-pull mechanism for coordinating the activity of various motoneuron pools. M 10 neurons discharge a sequence of bursting spikes at an average frequency between 15 to 20 Hz. Premotor bursting cells in mammals has been found to contact motoneurons directly in controlling eye, neck and limb movements (Robinson 1981, Ito 1986, Grantyn and Berthoz 1988). Therefore, it is postulated that M10 bursting cells are the output neurons of the snapping MPG and provide signals to drive the motoneurons in the facial and hypoglossal nuclei. 82 M8.2 I l M10 Muscle Figure 5.1. Basic MPG module. The push-pull between M8.1 and M8.2 forms the basis of a basic MPG module. They receive independent inputs and send excitatory and inhibitory signals, respectively, to M10. In the model, each module is postulated to control one muscle synergy. A network of MPGs for cordinating motor synergies can be constructed using this triplet as a building block.. Based on these experimental data, we composed a basic MPG module consisting of a triplet of M8.1-M8.2-M10, as shown in Fig. 5.1. The premotor circuit consists of several such basic modules with each one controlling a given set of muscles. M8.1 and M8.2 neurons receive signals through independent input lines and project to M10 with excitatory and inhibitory signals, respectively. The input to M8.1 increases its firing from the baseline 5 Hz to 40 Hz, whereas the input to M8.2 reduces its discharge rate from 40 Hz to 3 Hz. Between them, they provide a push- pull mechanism for the coordination of different muscle synergies. M 10 neurons 83 M8.1 M S. 2 M 10 Figure 5.2. Simulation of the basic MPG module. The temporal profile of M8.1, M8.2, M10, and motoneuron (MN) in response to an input signal that excites M8.1 and inhibits M8.2 at the same time. The firing rate of M8.1 increases from 5 to 40 Hz whereas that of M8.2 decreases from 40Hz to 3Hz. The removal of tonic inhibition by M8.2 coupled with an excitatory input from M8.1 produces a response in M10 which in turns activates the motoneuron.. integrate signals from M8 neurons and discharge with bursting spikes to drive the motoneurons (Fig. 5.2). Mathematically, the basic three-neuron circuit can be described as the following. 
    M8.1 := k1·IN1 + 5                                           (5.1)

    M8.2 := −k2·IN2 + 40                                         (5.2)

    M10 := k3·f(M8.1) − k4·f(M8.2)                               (5.3)

where τ1, τ2, and τ3 are the time constants for M8.1, M8.2 and M10, respectively, Mxx denotes the membrane potential of neuron "xx" and f(Mxx) is its firing rate. The ki are positive numbers representing the synaptic weights. IN1 and IN2 are the input signals to M8.1 and M8.2, respectively. Note that the input to M8.2 is inhibitory, such that it reduces the firing rate of M8.2 from 40 to 3 Hz. To facilitate discussion, when we refer to the "activation" of M8.2, we mean that some input signal to M8.2 produces a decrease in its tonic discharge. The tonic inhibition from M8.2 to M10 is released when an input arrives at M8.2 to reduce such inhibition. At this time, the excitatory input from M8.1 is able to elicit a response in M10.
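A minimal simulation of Eqs. (5.1)-(5.3) is sketched below (Python, Euler integration). The baseline rates of 5 Hz and 40 Hz and the target rates of about 40 Hz and 3 Hz follow the text; the time constants, synaptic weights, input amplitudes and the rectifying firing-rate function f are illustrative assumptions rather than the values used in the NSL implementation.

    import numpy as np

    def f(v):
        """Illustrative firing-rate function: rectified membrane potential."""
        return max(v, 0.0)

    def simulate_module(T=300, dt=1.0, tau=(20.0, 20.0, 20.0), k=(1.0, 1.0, 1.0, 1.0),
                        stim=(50, 250), in1=35.0, in2=37.0):
        """Euler integration of Eqs. (5.1)-(5.3): IN1 drives M8.1 from about 5 Hz
        to about 40 Hz, IN2 suppresses M8.2 from about 40 Hz to about 3 Hz, and
        M10 is excited by f(M8.1) and inhibited by f(M8.2)."""
        m81, m82, m10 = 5.0, 40.0, 0.0
        trace = []
        for t in range(T):
            on = stim[0] <= t < stim[1]
            IN1, IN2 = (in1, in2) if on else (0.0, 0.0)
            m81 += dt / tau[0] * (-m81 + k[0] * IN1 + 5.0)               # Eq. (5.1)
            m82 += dt / tau[1] * (-m82 - k[1] * IN2 + 40.0)              # Eq. (5.2)
            m10 += dt / tau[2] * (-m10 + k[2] * f(m81) - k[3] * f(m82))  # Eq. (5.3)
            trace.append((m81, m82, f(m10)))
        return np.array(trace)

    trace = simulate_module()
    print("peak M8.1 %.1f Hz, minimum M8.2 %.1f Hz, peak M10 output %.1f"
          % (trace[:, 0].max(), trace[50:250, 1].min(), trace[:, 2].max()))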
5.2.2 Exploring the premotor space

We begin our investigation of issues concerning motor control by examining what characterizes a muscle activity as reflected by its EMG recording, and how these characteristic features can be controlled. The action potential transmitted from the axons of motoneurons to the muscle fibers is called the muscle action potential. Electrodes placed on the surface of a muscle (or inserted in the muscle) will record the algebraic sum of all muscle action potentials being transmitted along the muscle fibers near the electrode (Winter 1979, Loeb and Gans 1986, Gans 1992). That is, an EMG record is a measure of the output of a population of motoneurons which send action potentials to muscle fibers that run through the neighborhood of the recording site. Therefore, the average of a group of EMGs recorded from repeated performance of a motor task is a good indicator of the output of the motoneuron population. In general, an EMG record can be characterized by its time of onset, rising rate, peak amplitude, duration, total activation (the area of the EMG) and decay rate. We explore ways in which the push-pull mechanism built into the basic MPG module can be used to control these features of a single muscle activity by varying the model parameters within a biologically plausible range. The results of the experiment are summarized below.

Onset time

Based on the computer simulation of the MPG module, many factors can influence the onset time of the motoneuron, including the time of arrival of the inputs to M8.1 and M8.2 (Fig. 5.3 A), and the time constants of M8.1, M8.2 and M10 (Fig. 5.3 B, C, D). However, some subtle differences exist between their effects. A decrease in the time constant of the premotor neurons (but not of the motoneuron itself, Fig. 5.3 E) results in a shorter latency to the onset of the motoneuron along with a slight increase in its rising rate (12%), peak amplitude (20%) and total activation (10%). On the other hand, an earlier arrival of the input signal to either M8.1 or M8.2 alone causes a small decrease in the rising rate, peak amplitude, and total activation due to the asynchrony in the push-pull mechanism. Advancing the input to M8.1 is about 3 times more effective than advancing that to M8.2, with less change in the total activation of the motoneuron (6% vs. 9% reduction). A synchronized advancement of the input signals to both M8.1 and M8.2 produces a shortening in the latency without any effect on the other features.

Figure 5.3. Exploration of the premotor space. The traces in the top row correspond to M8.1, the 2nd row to M8.2, the 3rd row to M10, and the bottom row to the motoneuron. (A) Superimposed traces of the simulation results obtained by varying the timing of the arrival of the input signal to M8.1 as shown. Note the development of a two-stage decay in the motoneuron as the activation of M8.1 advances. (B) Three simulation results are shown in which the time constant of M8.1 varies from 10 to 20 to 30 ms. (C) In this graph, the time constant of M8.2 is changed from 10 to 20 to 30 ms. Note that varying the agonist (M8.1) is more effective in advancing the onset of the motoneuron and in increasing its peak amplitude than varying the antagonist (M8.2). (D) This graph shows superimposed traces obtained by varying the time constant of M10 from 10 to 20 to 30 ms. The change in M10 is most effective in affecting the onset and peak amplitude of the motoneuron. (E) The result of varying the time constant of the motoneuron over the same values is shown. With a shorter time constant, it reaches a higher peak amplitude faster, with a shorter duration (and hence no change in the total activation).

Rising rate

The single most effective way of tuning the rising rate is to change the time constant of the motoneuron. An increase in the peak amplitude is associated with a shorter time constant, and as a result, the total activation remains almost constant (<0.1% change). The second factor is the time constant of the premotor neurons, which also produces changes in the onset time, peak amplitude and total activation.

Peak amplitude

Among all parameters tested, the strength of the input signals to M8.1 and M8.2 is the most influential in determining the peak amplitude. It has no effect on the onset time, but is always accompanied by a corresponding change in the total activation. As mentioned above, a change in the time constant of the motoneuron also affects the peak amplitude with no change in the total activation.

Duration

Two major factors can affect the duration of muscle activation. First, the time constant of the motoneuron is positively correlated with the duration (however, the time constant of the premotor neurons has little effect). The second factor is the duration of the input signal, which also produces a marked increase in both the peak amplitude and the total activation.

Total activation

The most significant factors in controlling the total activation are the strength of the input signal and its duration. As discussed above, decreasing the time constant of the premotor neurons increases the total activation, while asynchrony in the push-pull between M8.1 and M8.2 produces a reduction. The reason for the reduction is that when the activation of M8.1 and M8.2 does not occur at the same time, say M8.1 is activated earlier, the tonic inhibition from M8.2 is not yet fully reduced. Therefore, the resulting activation of M10 is less than when their activity is synchronized, i.e., when M8.1 reaches its peak firing at the same time as M8.2 reaches its lowest.

Decay rate

Besides the obvious factor of the time constant of the motoneuron, one unexpected and interesting way to modify the decay rate is the asynchrony of the M8.1 and M8.2 activation: a delay between their respective evocations (up for M8.1 and down for M8.2) results in a slower decay rate. As such a delay is prolonged, a two-stage decay in the motoneuron activity begins to emerge (Fig. 5.3 A).

In summary, we observed that there exist multiple ways to control every characteristic feature of the muscle activity. Conversely, a change in the premotor parameters affects several features of the muscle activation. Combinations of the control parameters produce a rich functional repertoire for motor control. For example, a prolonged input with reduced strength will produce an increase in the duration without changing the peak amplitude. Equipped with such knowledge, we proceed to study issues concerning the strategies used in agonist-antagonist interaction and in synchronizing multiple synergies, and how such strategies can be realized by the premotor circuitry proposed here, using anuran snapping behavior as an example.
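The style of exploration summarized above can be illustrated by the following sketch (Python), which sweeps a single parameter, here the time constant of a leaky-integrator stand-in for the motoneuron, and measures the onset latency, peak amplitude, duration and total activation of its response to a fixed input pulse. All numerical values are assumed; the sweep reproduces the qualitative observation that a shorter time constant yields an earlier onset, a higher peak and a shorter duration while leaving the total activation essentially unchanged.

    import numpy as np

    def motoneuron_features(tau, dt=1.0, T=400, pulse=(50, 150), amp=30.0, thresh=5.0):
        """Response of a single leaky integrator (tau * dx/dt = -x + S) to a fixed
        input pulse; returns onset latency, peak, duration above threshold, and
        total activation.  The threshold and amplitude are illustrative."""
        x, xs = 0.0, []
        for t in range(T):
            S = amp if pulse[0] <= t < pulse[1] else 0.0
            x += dt / tau * (-x + S)
            xs.append(x)
        xs = np.array(xs)
        above = np.where(xs > thresh)[0]
        onset = above[0] - pulse[0] if above.size else None
        duration = above.size * dt if above.size else 0.0
        return onset, xs.max(), duration, xs.sum() * dt

    for tau in (10.0, 20.0, 30.0):
        onset, peak, dur, total = motoneuron_features(tau)
        print("tau=%4.0f ms  onset=%s ms  peak=%5.1f  duration=%5.0f ms  total=%7.1f"
              % (tau, onset, peak, dur, total))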
5.3 Anuran Snapping: A Case Study of Motor Coordination

Anuran prey capture consists of a sequence of motor synergies released by a specific stimulus. This series of steps includes an approach or orientation towards the prey stimulus, a fixation of the prey in the frontal visual field, and the consummatory event of snapping at the prey and swallowing it. Snapping starts with a lunge of the head towards the prey, followed by mouth opening, protrusion of the tongue, retraction of the tongue with the prey, closure of the mouth, return of the head and body to the resting position, and swallowing of the prey. Since the observation that the sequence proceeds to completion once initiated, despite removal of the target (Hinsche 1935), snapping has been regarded as highly ballistic, with little or no variability based on sensory feedback (Ingle 1983, Ewert 1984, Matsushima et al. 1989, Weerasuriya 1983, 1989).

The key stimulus that elicits prey capture is either visual, tactile or olfactory, and the outputs of their respective sensory analyzers share common access to the motor pattern generators responsible for the elaboration of the appropriate motor outputs. The motoneurons controlling the tongue muscles are located in the hypoglossal nucleus of the medulla oblongata, whereas prey recognition is performed in the optic tectum. The transformation of sensory signals into appropriate spatio-temporal patterns of activity in the motoneurons is considered to be carried out by an MPG network in the medial reticular formation of the medulla (Edinger 1908, Herrick 1930, Nieuwenhuys and Opdam 1976, Weerasuriya 1983, 1989, Schwippert et al. 1989).

Figure 5.4. Snapping in anurans. A sequence of drawings of the tongue flip is shown on the left and the EMG recordings of three major muscles involved in snapping are given on the right. Left: Median (A) and sagittal (B) sections of the tongue at rest. (C) The tongue is lifted. Note that M. submentalis (SM) rotates from a horizontal position (in B) to a vertical one (see the displacement of the "+" from B to C). A vertically oriented, rod-shaped mass (the lingual rod) is formed by the contraction of M. genioglossus (GG). (D) The lingual rod and lingual pad pass beyond SM. SM is pulled caudally by M. geniohyoideus medialis (GHM) and M. intermandibularis posterior (IMP). (E) The tongue is fully protruded. Right: Each bar of the graph represents the product of the mean spike number times the mean amplitude as a percentage of the maximum value. The maximum value of cumulative voltage in units of 0.08 mV is given for each graph. The dashed lines define the stages of preparatory, protrusive, retractive, and closing movements by marking the start of mouth opening, start of tongue protrusion, start of retraction, and end of retraction (from Gans and Gorniak 1982).

5.3.1 The myology of snapping

Snapping is controlled by several extrinsic tongue muscles for flipping the tongue and by jaw muscles for opening and closing the mouth.
Gans and Gorniak (1982) divided the feeding sequence of toads into four phases: (1) preparatory, (2) tongue protrusion, (3) tongue retraction, and (4) mouth closing. We focus on the middle two phases of this sequence. Fig. 5.4 illustrates the protraction of the tongue (left) and the EMG recordings of the major muscles involved in snapping (right). In the preparatory phase prior to mouth opening, M. depressor mandibulae (DM) plays a major role in mouth opening by lowering the jaw; activation of DM reaches its first peak before the mouth starts to open. Shortly after the activity in DM reaches its peak, M. genioglossus (GG) starts to contract to project the tongue. During its contraction, GG forms a transverse, rod-shaped mass (the lingual rod in Fig. 5.4), shortens the lingual tissue and provides a forward momentum. In the retraction phase, M. hyoglossus (HG) provides the major force to bring the tongue back into the mouth. The activation of HG begins before, and peaks at, the start of retraction. The second burst of activity in DM opens the mouth even wider to allow the prey to be brought in, followed by the melting of the basal mass (a small increase of GG near the end of the retraction phase).

5.3.2 The neurophysiology of snapping

In anurans, the optic tectum is the primary visual center mediating prey-catching behavior. Based on physiological criteria, ten classes of neurons (T1-T10) have been identified (for review see Grüsser and Grüsser-Cornehls 1976, Ewert 1984). The activity of these tectal cell types is correlated with a variety of behaviors such as prey-catching, predator-avoidance, and general arousal. Here we review the tectal cells involved in prey-catching behavior. Based on their sensitivity to different stimulus configurations, Ewert and Wietersheim (1974) divided the T5 cells into 3 subclasses. Among them, the activity of the worm-selective T5.2 cells is highly correlated with prey-catching behavior. More recent data obtained from extracellular recording in freely moving toads showed that T5.2 neurons respond to a worm-like stimulus with discharges of increasing frequency for about 200 ms before snapping (Schürg-Pfeiffer 1989). The firing rate is low for the first 150 ms and then increases dramatically to about 150-200 spikes/s 50 ms prior to snapping. These discharges precede snapping; T5.2 are silent during snapping. T4 neurons have a large receptive field and can be activated by either visual or tactile stimuli. Some of the T4 neurons show the property of general arousal ("newness"). Several medullary premotor neurons are relevant to the control of snapping behavior. M4 neurons, with large receptive fields, are more sensitive to a novel stimulus. M5.2 neurons, whose receptive fields are relatively small and located mostly in the frontal regions, show a strong preference for worm-like stimuli.

5.3.3 Synchronizing Motor Synergies

We construct a model of the MPG for snapping behavior, involving the coordination of jaw and tongue synergies, using the basic premotor module described above as a building block. Two types of input signal to an MPG have been distinguished. The first type of input is the trigger command that activates an MPG, and the other type is the control signal that sculptures its temporal pattern. The control signal conveys such information as the direction and amplitude of a movement.
Separate pathways have been reported to carry these two types of signal from the optic tectum to the medulla (Ingle 1983, Grobstein 1989). A similar functional separation of the sensorimotor signal has been observed in other goal-directed MPGs, such as those controlling saccadic eye movements (Robinson 1981, Scudder 1988). Based on such observations, we put forth a general principle for constructing the snapping MPG: the intrinsic connectivity between the modules is responsible for generating a "default" snap upon the arrival of a trigger command, which is modified by the control signal coming into each module.

Figure 5.5. Sequential network of snapping MPG. The MPG modules for controlling different muscle groups share a common architecture, as shown in the shaded area. The lines with an arrow at the end represent excitatory connections whereas those with a small circle reflect inhibitory synapses. Note that the connections from DM to GG and from GG to HG provide the temporal order in their activation. The feedback from HG to DM produces the second peak in DM activity.

The premotor circuit consists of several basic modules described in the previous section, with each one controlling a given set of muscles (Fig. 5.5). T5.2 neurons in the tectum, which respond best to a prey-like stimulus, provide the triggering command to activate snapping. M4 neurons (receiving signals from their tectal counterparts, the T4 neurons), which have the property of general arousal ("newness"), elicit in the motoneurons an initial IPSP which may serve two purposes: (1) resetting the system before initiating a new movement and (2) preventing the toad from snapping at undesirable objects which may elicit low activity in T5.2 neurons. Tactile stimuli are conveyed to the premotor system via the torus. M8.1 neurons receive excitatory inputs from M5.2 neurons (for visual stimuli) and the torus (for tactile inputs), while M8.2 cells receive inhibitory inputs from both M5.2 neurons and the torus. M8.1 and M8.2 neurons project to M10 with excitatory and inhibitory signals, respectively. Between them, they provide a push-pull mechanism for the coordination of different MPGs. M10 neurons integrate signals from the M8 neurons and discharge with bursting spikes to drive the motoneurons. After snapping, the feedback from M10 to the tectum turns off the activity in T5.2 and T4 neurons.

The intrinsic connectivity between the basic modules in an MPG provides the primary coordination of the muscle synergies involved, which is fine-tuned by control signals. Here we introduce an example involving the three major muscles (DM, GG, and HG) for snapping and show how the push-pull mechanism between M8.1 and M8.2 can be utilized to achieve motor coordination in synchronizing the jaw movement and the protraction and retraction of the tongue. We explore two plausible intrinsic connectivity schemes between the jaw and tongue modules which produce the temporally ordered sequence of mouth opening, tongue protraction and retraction. As we will see in Section 5.4, these two schemes lead to different control strategies for controlling snapping distance based on visual feedback.
Figure 5.6. Behavior of sequential model. The activity of the network in response to a prey-like stimulus lasting 450 ms (top trace). Note the slow build-up of the discharge of T5.2 neurons (2nd trace), which terminates before the peak of the activation of the protractor motoneurons (as observed in moving animals). The push-pull mechanism of M8.1 and M8.2 for the DM module is shown here (3rd and 4th traces). The next trace shows the activity of the M10 for the DM module. The last three traces correspond to the activity of the motoneurons (DM, GG, and then HG). The temporal ordering of the activities in DM, GG, and HG corresponds closely to observed behavioral data. Immediately following the peak in HG, a second, lower activation in DM occurs. However, the 30 ms delay between the onset of GG and HG is too long to match the experimental data (see text for discussion).

Sequential activation

First, we test a scheme in which the temporal order of snapping behavior is generated by a sequential connectivity pattern between the modules of the MPG model (Fig. 5.5). In this scheme, the DM module is the first of the three to be triggered, by M5.2 neurons. The activation is propagated from DM to GG and then from GG to HG, thus creating a temporal delay between the activations of DM, GG and HG. To activate any of the three modules, the trigger command excites the M8.1 neuron and inhibits the M8.2 neuron of the respective module; this includes the feedback from HG to DM that triggers its second peak. Inhibition from the premotor system (via GG in this example) terminates the activation of the tectal neurons. The parameter values of the model, e.g., the time constants of the neurons and their synaptic weights, are chosen such that once activated, the model will generate a temporal pattern of activity close to the EMG data (Fig. 5.6). The temporal pattern of each motor synergy is a good match with the EMG data. Furthermore, the long delay between the activation of DM and GG (~35 ms) suggests that the latency to the onset of GG is suitable for synchronizing jaw and tongue synergies. However, there is a discrepancy in the temporal delay between GG and HG activation (~30 ms in the simulation vs. ~5 ms in the EMG data). Instead, the data show that the difference in the time of peaking in GG and HG is more significant (a 30 ms delay between them, which corresponds roughly to the delay between tongue protraction and retraction). Such a discrepancy suggests a different connectivity pattern for the MPG.
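The sequential scheme can be sketched as a chain of the basic triplets (Python; the connection weights, time constants and trigger amplitude are assumptions chosen only to illustrate the DM-before-GG-before-HG ordering and the contribution of the HG-to-DM feedback, not the parameter set used in the simulations above).

    import numpy as np

    class Module:
        """One M8.1-M8.2-M10 triplet (leaky integrators; all constants assumed)."""
        def __init__(self, tau=20.0):
            self.m81, self.m82, self.m10 = 5.0, 40.0, -35.0
            self.tau = tau
        def step(self, drive, dt=1.0, k=(1.0, 1.0, 1.0, 1.0)):
            # the same drive excites M8.1 and suppresses M8.2 (push-pull input lines)
            self.m81 += dt / self.tau * (-self.m81 + k[0] * drive + 5.0)
            self.m82 += dt / self.tau * (-self.m82 - k[1] * drive + 40.0)
            self.m10 += dt / self.tau * (-self.m10 + k[2] * max(self.m81, 0.0)
                                         - k[3] * max(self.m82, 0.0))
            return max(self.m10, 0.0)          # rectified output toward the motoneuron

    dm, gg, hg = Module(), Module(), Module()
    out_dm = out_gg = out_hg = 0.0
    history = []
    for t in range(400):
        trigger = 35.0 if 50 <= t < 250 else 0.0   # T5.2-like trigger command
        out_dm = dm.step(trigger + 0.4 * out_hg)   # HG feedback contributes a 2nd DM peak
        out_gg = gg.step(0.9 * out_dm)             # DM drives GG
        out_hg = hg.step(0.9 * out_gg)             # GG drives HG
        history.append((out_dm, out_gg, out_hg))

    h = np.array(history)
    print("onset times (ms):", [int(np.argmax(h[:, i] > 1.0)) for i in range(3)])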
Figure 5.7. Parallel network of snapping MPG. An alternative model for the snapping MPG. In this model, DM activates both GG and HG. As a result, GG and HG are activated at the same time. However, HG reaches its peak activity 30 ms later than GG does. Such a temporal order between GG and HG is generated by the difference in their time constants.

Parallel activation

We developed an alternative model for producing the temporal order seen in the snapping behavior. The basic organization is similar to the mediation model, except that in this model DM activates both GG and HG, using the same push-pull mechanism of M8.1 and M8.2 (Fig. 5.7). Besides the change in their interconnection, there are some subtle differences between these two models. First, in the sequential model, the time constants for all the components in the HG module are set to be greater than their counterparts in the GG module, such that a longer peaking and decay time in HG can be obtained. In the parallel model, the time constant of the HG motoneuron is also greater than that of the GG motoneuron, for the same reason. However, if all other components of the HG module (e.g., M8.1, M8.2 and M10) also had greater time constants, then the onset time of the HG motoneuron would be much later than that of the GG motoneuron. Therefore, the time constants of the premotor neurons in the HG module have to assume values smaller than their GG counterparts. The parallel model produces temporal patterns that closely match the EMG data (Fig. 5.8).

Figure 5.8. Behavior of parallel model. The activity of the network in response to a prey-like stimulus lasting 450 ms. The three traces correspond to DM, GG and HG (from top to bottom). The activation of DM is about 35 ms ahead of both GG and HG. Though activated at about the same time, GG reaches its peak activity 30 ms before HG does.

A snap of the tongue is a rapid and powerful movement in which the reversal of its direction causes the tongue to wrap around a prey to bring it into the mouth. In order to create such a transition in movement direction, the retractor muscles may have to be activated early to prepare for the transition. This can explain why there is only about a 5 ms delay between the evocation of the protractor and retractor muscles. A similar agonist-antagonist relation is also seen in rapid reversals of arm movement. In the next section, we will discuss how the sequential and parallel models can support different strategies of controlling snapping distance based on visual feedback.

5.4 On-Line Correction

We have presented a biologically consistent model for the snapping MPG and demonstrated how a push-pull mechanism can be used to generate and coordinate motor synergies. The model suggests a simple architecture for the MPG with the flexibility to generate a wide variety of motor patterns. In particular, the flexibility in controlling snapping distance is important considering the lunging movement that always precedes jaw and tongue movements. A lunge involves a fast forward movement of the whole body over a long distance in a short time (about 140 ms), so that one may expect a rather high degree of error. Such an error has to be corrected by the subsequent tongue movement using visual feedback if the frog is to catch a worm. This hypothesis is supported by preliminary data analysis from a behavioral experiment, which in turn led to the postulation of several hypotheses concerning the information processing required by such flexibility in motor control (described in the next section).

In principle, both the jaw and the tongue synergy can be modified to control snapping distance in order to compensate for the error induced by lunging. The jaw synergy can be varied to speed up or slow down a snap, thus changing the point of contact of the tip of the tongue. Alternatively, the tongue synergy can be adjusted. EMG studies from a variety of motor tasks suggest different ways in which the activity of agonist and antagonist muscles can be altered to control the distance of a movement.

Figure 5.9. A model of feedback control in motor coordination. The MPG model is situated in a more general context of motor coordination based on sensory feedback. The recognition of a prey activates the sequence of capture behavior by triggering the lunging MPG, which always precedes snapping. Once triggered by the lunging MPG, the intrinsic connectivity of the snapping MPG produces the "default" temporal pattern, which is subject to modulation based on the estimation of target distance. The distance signal exerts its influence by varying the initiation of the retractor subunits. It is further postulated that distance estimation from looming cues generated by the lunging movement plays an important role in such a modulation mechanism.
In prey-capture behavior, the small amount of time (about a 100 ms delay between the lunging and snapping movements) available for processing sensory information constrains the strategy for motor coordination. Therefore, we hypothesize that after the initial estimation of the distance from a target, the temporal relation between the lunging and the jaw synergies is fixed, and the control is performed on the tongue synergy, so that the nervous system can make use of the longest possible time to process sensory inputs (Fig. 5.9). Delaying the control until the last synergy in a given motor task seems to be a general strategy employed by various species for coordinating prehensile behavior. For example, while a relatively fixed temporal relation exists between the reaching and grasping components of prehension in the pigeon (Goodale 1983) or human (Jeannerod 1984), the aperture in grasping can be adjusted to compensate for the error in transporting (Bermejo and Zeigler 1989 in pigeon, Wing et al. 1986 in human). Such flexibility in motor control offers the animal an effective way of coordinating the elements of a fast movement.

The strategy of a snap is to protract and retract the tongue with intense force such that the reversal in movement direction causes the tongue to wrap around a prey, enabling the tongue to bring it into the mouth. The requirement of such an intense force may saturate the amplitude in both the protractor and retractor muscles. As a result, it is likely that only the temporal parameters of the muscle activation are available for synchronizing the snapping movements. In particular, we focus on the two temporal parameters that are most often seen to vary with task parameters in a variety of EMG recordings, namely the onset time and the rising rate. The reason that temporal features of a motor synergy can be used to control its spatial parameters is as follows. The tongue is composed of soft tissue; therefore, much like hitting a target with a whip, the extension of the tongue can be controlled, in addition to the angle it is aimed at, by the timing of its retraction, i.e., the earlier the retraction, the shorter the extension, and vice versa.

When the time of onset is chosen as the point of control, the control signal is treated as part of the input to the MPG in order to modify the onset time, a scheme which is referred to as mediation. In the case of controlling the rate of peaking, the control signal modifies the membrane property of the motoneurons, which leads to a change in the rate of their activation, and hence the rate of muscle fiber recruitment. This scheme of control is known as neural modulation (the biochemical basis of neural modulation can be found in Kaczmarek and Levitan 1987, Servan-Schreiber et al. 1990 present a mathematical model, and Harris-Warrick and Marder 1991 provide a review of modulation in MPGs). We will demonstrate the differential effects of these two schemes on agonist-antagonist interaction in the protraction and retraction of the tongue.
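The difference between the two schemes can be sketched on a single leaky-integrator stand-in for the HG motoneuron (Python; the drive amplitude, the time constant, the threshold, and the specific forms by which the closeness signal enters are assumptions for illustration). Under mediation the closeness signal enters as an additional input, which raises the peak and the total activation and advances the threshold crossing; under modulation it shortens the time constant, which speeds the rise toward the peak while leaving the total activation essentially unchanged.

    import numpy as np

    def hg_response(closeness=0.0, scheme="mediation", dt=1.0, T=400,
                    trigger=(50, 150), drive=30.0, tau=25.0):
        """Leaky-integrator stand-in for the HG motoneuron under the two schemes."""
        if scheme == "modulation":
            tau = tau / (1.0 + closeness)          # assumed form: closeness shortens tau
        x, xs = 0.0, []
        for t in range(T):
            on = trigger[0] <= t < trigger[1]
            S = drive if on else 0.0
            if scheme == "mediation" and on:
                S += 10.0 * closeness              # assumed form: closeness adds to the input
            x += dt / tau * (-x + S)
            xs.append(x)
        return np.array(xs)

    for scheme in ("mediation", "modulation"):
        for c in (0.0, 0.5, 1.0):
            r = hg_response(closeness=c, scheme=scheme)
            onset = int(np.argmax(r > 5.0)) - 50
            t90 = int(np.argmax(r > 0.9 * r.max())) - 50
            print("%-10s closeness=%.1f onset=%2d ms rise-to-90%%=%3d ms peak=%5.1f total=%7.1f"
                  % (scheme, c, onset, t90, r.max(), r.sum()))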
5.4.1 Controlling distance by mediation

The sequential connectivity pattern from DM to GG to HG, which creates a relatively long delay between tongue protrusion and retraction, is suitable for applying the mediation scheme. On-line correction based on visual feedback is achieved by simply adding a closeness signal to the input of the HG subunit. The result of computer simulations in which the amplitude and timing of the closeness signal to HG were varied is summarized here. The activation time of HG can be pushed to a maximum of 30 ms after that of GG, in proportion to the strength of the closeness signal, from 35 ms after without it (Fig. 5.10). The surprise is that there is a temporal window such that the closeness signal is effective only when it arrives at HG within that window. This makes sense in that, even though the retraction of the tongue needs to be regulated by the closeness signal, the temporal order DM-GG-HG must be maintained. Therefore, there must be a temporal limit to the effect of the closeness signal. The simulation shows that, within the window, the effectiveness of the closeness signal first increases to an optimum and then decreases to zero. Though the result does not match the EMG data, which show that GG and HG are activated at about the same time (separated by about 5 ms), the insight gained here can be useful for synchronizing motor synergies in performing certain motor tasks.

Figure 5.10. Controlling snapping distance via mediation. Simulations of the sequential model with 6 different "closeness" signals are superimposed. The traces correspond to DM, GG, and HG (from top to bottom). As the arrival of the "closeness" signal advances, the latency to the onset of HG shortens progressively.

5.4.2 Controlling distance by modulation

Modulation is most effective in situations where the delay between the muscle synergies to be synchronized is short and the difference in their rising rates is significant. Such a case is seen in the parallel connectivity pattern, in which DM activates GG and HG simultaneously. The modulation operates on the time constant of HG, speeding up its peaking with a stronger closeness signal. The result of computer simulations in which the timing of the closeness signal to HG was varied is shown in Fig. 5.11. Again, we observe a temporal window in which the distance signal first advances the onset time of HG up to a maximum, after which the effect decreases back to zero.

Figure 5.11. Controlling snapping distance via modulation. Simulations of the parallel model with 6 different "closeness" signals are superimposed. The traces correspond to DM, GG, and HG (from top to bottom). In this scheme, the latency to the onset of HG varies only slightly, while its activity peaks with a shorter delay when the control signal arrives earlier. Note also that the amplitude of HG becomes greater while the duration is shortened, as discussed in Section 5.2.2.

In addition, some unexpected effects were observed in the experiment. First, as a result of making the time constant of HG slower than that of GG, the activation of HG is significantly delayed. It turned out that we need to set the time constants of HG.M8.1 and HG.M8.2 faster than those of their GG counterparts in order to compensate for the reverse relationship between the corresponding motoneurons (i.e., GG and HG). Second, a comparison of the two schemes reveals that the temporal window in the mediation scheme occurs earlier than that in the modulation scheme (relative to the onset of HG).
That is, in the mediation scheme, the closeness signal serves to "prime" HG and therefore must arrive before the onset of HG. In the modulation scheme, the window extends from 10 ms before HG onset to 25 ms after; during that period, the reduction of the time constant has an effect on the rise of the HG membrane potential. Third, the contribution of the feedback from HG to DM has to be reduced in the sequential scheme. The reason is that in the sequential model the control signal is treated as part of the input in order to modify the onset time; therefore, it adds to the excitation of HG by the trigger signal. In the parallel scheme, on the other hand, the control signal only changes the membrane property (the time constant) of the HG motoneuron, which has little effect on its total activity integrated over time. This difference becomes significant when the control signal is intense.

We also investigated the origin of the temporal window within which the arrival of the control signal is effective in tuning the HG synergy. Such a window exists because the signal of the instantaneous target distance is relayed to the MPG via an interneuron which discharges a brief burst of spikes when the distance reaches a threshold value. If the relay is removed and the instantaneous target distance is used as the control signal, a different result is obtained. First, the temporal window disappears; instead, as long as the closeness signal arrives at HG early enough, it will change its onset time (or its rising rate). Second, the second peak in DM becomes larger as HG peaks earlier. Consequently, there is also an increase in the second peak of GG, and a much larger one. This is reminiscent of one snap observed in our behavioral studies (Liaw et al., in preparation) in which a frog lunged too far ahead of the worm and, in attempting to correct the error, lowered its jaw so much and for so long that it almost tipped over.

5.5 Behavioral Study of Motor Coordination

The simulation of our model of the MPG for controlling jaw and tongue muscles demonstrated the capability of tunable, rather than ballistic, control of snapping. Snapping is a rapid movement (about 140 ms for lunging and 80 ms for tongue flipping). Prey-catching, on the other hand, is highly accurate, i.e., the frog catches its prey almost every time. To achieve such a high degree of accuracy with ballistic actions would require great precision in the sensing, planning, execution and coordination of all the motor elements, including the four limbs, trunk, head, jaw, and tongue movements. Is this plausible?

These considerations led to the design of a series of experiments, conducted in the laboratory of Ananda Weerasuriya in October 1992, to test the hypothesis that prey-catching involves variability in head movements which is compensated by controlling the tongue during snapping, and to elucidate the parameters used in such control. Answers to the following intriguing questions are also sought: (i) Is the anuran nervous system capable of varying the time of tongue retraction in prey-catching? (ii) If so, what kind of sensory information is used? (iii) How does the nervous system make use of the control parameters to coordinate multiple body movements such as body lunging, mouth opening and tongue flipping? (iv) What is the information processing required?

Figure 5.12. Compensation of a long lunge by shortened tongue movement.
In this figure, the lunge carries the frog over the worm (note the position of the snout relative to the worm in frames 4, 5, and 6). The tongue starts to retract before the completion of the protraction phase (frame 4) to compensate for the overshooting lunge.

Figure 5.13. Lateral variability in snapping. This sequence illustrates a successful capture of a worm that is swinging on a thread in front of the frog. Frames 2 and 3 depict the lunge of the frog toward the worm. In frame 4 the tongue is seen to extend directly forward and is apparently about to miss the worm, which is slightly to the right of the frog. The rapid on-line correction of tongue extension to capture the worm is seen in frame 5. The tongue has now slightly curled to the right and has contacted the worm. The successful retrieval of the worm into the mouth is seen in frame 6.

Figure 5.14. Variation in lunging and tongue extension. The columns represent the number of snaps in which the lunge carries the frog over the worm (first column) or not (second column). The rows represent the number of snaps in which the tongue is fully extended (top row) or not (bottom row). The counts are:

                         Lunge too far?   Yes    No
  Tongue fully extended? Yes                9    35
                         No                15     0

5.5.1 Correlation between lunging and snapping variations

Here we report the findings obtained from a behavioral experiment designed to answer the first three questions posed above, and we postulate a plausible mechanism for the last one. A high-resolution video camera was used to record the prey-catching behavior of seven normal frogs. Frame-by-frame analysis reveals that tongue movements in the frog have a high degree of flexibility in direction and distance (Figs. 5.12 and 5.13). This is not so surprising if one considers the potential for variation in the lunging of the body that always precedes snapping. In order to eliminate the possibility that the variability in snapping distance is a random phenomenon, we compiled data to show the correlation between the extension of the tongue and the error in lunging (Fig. 5.14). In 24 snaps where the lunging movement was too long, 15 (63%) were corrected by a shorter than normal tongue extension, whereas in all 35 cases in which the lunging did not overshoot the target, the tongue extended fully. Taken together, the data provide strong support for the hypothesis that error in lunging is corrected by varying tongue extension during snapping. In a third of the cases where the lunging was too long, the frogs were unable to make a correction. This is not unexpected, considering the short time available for the frog nervous system to process sensory information.

A point worth noting here is that the approach to studying snapping in anurans has been either to treat each of the constituent motor elements of snapping in isolation (e.g., Gans and Gorniak 1982, Satou et al. 1989, Sokoloff 1991) or to treat snapping as a whole without any regard to its components (e.g., Collett 1977, Ingle 1976). Here, we present the first effort to treat the body lunging, jaw movement, and tongue flipping as analyzable elements of an integral behavior and to look at snapping behavior from the perspective of motor coordination. Many intriguing issues arise from this perspective, and several interesting analogies can be drawn from other motor behaviors such as pecking in birds and reaching and grasping in primates (see Discussion).
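The raw counts in Fig. 5.14 are simple enough that the rates quoted above can be checked in a few lines. The snippet below is only a bookkeeping aid added here for illustration; the 2x2 counts are taken directly from the figure.

# Counts read off Fig. 5.14: rows = tongue fully extended (yes/no),
# columns = lunge carries the frog over the worm (yes/no).
counts = {("extended", "too_far"): 9,  ("extended", "on_target"): 35,
          ("short",    "too_far"): 15, ("short",    "on_target"): 0}

too_far   = counts[("extended", "too_far")] + counts[("short", "too_far")]      # 24 snaps
on_target = counts[("extended", "on_target")] + counts[("short", "on_target")]  # 35 snaps

corrected = counts[("short", "too_far")] / too_far         # shortened tongue after an overshoot
full_ext  = counts[("extended", "on_target")] / on_target  # full extension when the lunge is on target

print(f"overshooting lunges followed by a shortened tongue: {corrected:.1%} of {too_far}")
print(f"on-target lunges followed by full tongue extension: {full_ext:.1%} of {on_target}")

Under these counts every on-target lunge was followed by a full extension, while roughly five of every eight overshooting lunges were rescued by an early retraction; this asymmetry is what the argument above relies on.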
5.5.2 Mechanisms for Error Correction

Having established that the frog appropriately employs sensory feedback to correct error in lunging by varying the tongue movements, we search for other possible mechanisms for error correction and, at the same time, address the second question of what kind of sensory information is used in controlling the tongue movement. Is it information about time-to-contact (Tc)? Time-to-contact has been demonstrated to be a crucial visual cue used in guiding movement in many species, such as flies (Wagner 1982), birds (Lee and Reddish 1981, Wang and Frost 1992), and humans (Schiff and Detweiler 1979, Savelsberg et al. 1991). If snapping were highly stereotypical and ballistic, as previously believed, then Tc would be the most reliable control parameter for initiating snapping. However, as shown in Fig. 5.15, there is a wide spread (from 15 to 60 ms) in Tc measured as the time elapsed between the start of mouth opening and the moment when the tongue makes contact with a worm. (The start of mouth opening has been used as a good indication of the initiation of the ballistic tongue movement due to its visibility and its rather constant relationship to tongue protrusion, which usually lags by about 5 ms; e.g., Gans and Gorniak 1982.)

Figure 5.15. Time to contact at mouth opening. This histogram shows the distribution of time to contact at the beginning of mouth opening (N = 50 snaps). A wide distribution is obtained in this experiment.

A possible alternative is a simple but effective approximation of Tc, namely, distance-to-contact (Dc), which is especially attractive in our case due to the demand for fast information processing. The Dc data (Fig. 5.16) show a large spread in its range (from 0.1 to 5.4 cm), thus ruling out Dc as a reliable parameter for determining when to initiate mouth and tongue movement. Although there is a wide spread in the range of Dc at the start of mouth opening, there is a clear correlation between the initial distance to the worm before the lunge and that at the start of mouth opening.

Figure 5.16. Distance to target at mouth opening. The fixation distance before the lunge (cm) is plotted against the distance to the target at the beginning of mouth opening (cm). Each dot represents a snap. A positive correlation between the fixation distance and the distance at mouth opening can be seen here. A total of 57 snaps are sampled here.

But how can we interpret these data? In order to answer this question, we looked at the time elapsed between the onset of lunging and mouth opening and found a narrow window of time during which the mouth movement is initiated (Fig. 5.17). For a longer initial distance, a more powerful jump and hence a higher speed is generated, which carries the frog a greater distance in a constant time, giving rise to the correlation shown in Fig. 5.16. Taken together, the data exclude compensation of lunging error by varying the time at which tongue snapping is initiated.

Figure 5.17. Time delay from onset of lunge to start of mouth opening. This histogram shows the delay between the onset of the lunge and the beginning of mouth opening. A concentrated distribution around 130-160 ms for a total of 87 snaps is observed. The data suggest that the delay between the lunging synergy and the jaw-tongue synergy is quite constant.
Instead, we see a rigid temporal relation between lunging and tongue snapping, i.e., the frog initiates jaw and tongue movements at a more or less fixed interval after body lunging, leaving tongue extension and retraction as the only means for correcting the error in lunging.

The above data allow us to delineate the neural mechanism underlying the coordination of the multiple movements involved in the high-speed, high-precision prey-snapping behavior of anurans. The small amount of time available for processing sensory information dictates the strategy for motor coordination. After the initial estimation of the distance from a target, the temporal relation among the various motor elements is, we hypothesize, fixed except for the last one, so that the nervous system can make use of the longest possible time to process sensory inputs (Fig. 5.9). This seems to be a general strategy employed by various species for coordinating prehensile behavior. For example, while a relatively fixed temporal relation exists between the reaching and grasping components of prehension in pigeon (Goodale 1983) or human (Jeannerod 1984), the aperture in grasping can be adjusted to compensate for the error in transporting (Bermejo and Zeigler 1989 in pigeon, Wing et al. 1986 in human). The study of visuomotor coordination in the frog, which is a simpler preparation in comparison to mammals, thus provides an opportunity to work out in some detail the circuitry involved from sensory processing through sensorimotor transformation to motor control, which can provide insights for the mammalian nervous system as well. With this we turn to the mechanism for processing sensory inputs to extract the distance information used by the snapping MPG to coordinate multiple motor elements.

5.5.3 On-Line Estimation of Distance

It has been reported that the anuran is able to use disparity and accommodation cues to compute distance in snapping behavior (Ingle 1976, Collett 1977; see House 1989 for models). However, the snapping behavior in these experiments was treated as a whole without any concern for its constituent motor elements, and the distance estimation was assumed to be performed before the snap. By demonstrating in frogs the capability of adjusting tongue extension after snapping is initiated, our new finding poses the challenging question of finding the neural mechanism underlying distance computation from rapidly changing visual stimuli. Is depth perception based on disparity and accommodation cues adequate to serve this purpose? The answer is probably "no" for depth perception from accommodation cues, which relies on the feedback of focusing, a measurement that is hard to obtain under fast-moving conditions. Anurans, on the other hand, are able to perform under monocular viewing conditions. This suggests the existence of a mechanism for depth perception from other monocular cues, such as the looming cue generated by the lunging movement. Although neurons responsive to looming stimuli have been recorded in the optic tectum and pretectum of anurans (Grüsser and Grüsser-Cornehls 1976, Ewert 1984), it has yet to be demonstrated that the anuran is indeed capable of computing distance from looming perception. However, a modeling study of the looming-avoidance behavior of frogs and toads has shown that looming perception can be achieved within one synaptic relay after the arrival of retinal efferents at the optic tectum (Chapter 2; see also Liaw and Arbib 1993). With one more synaptic junction to integrate the looming activity, relative distance information can be obtained.
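One way to see how a distance estimate could fall out of the looming signal, under strong simplifying assumptions introduced here for illustration only, is the small-angle relation D = v * theta / (d theta / dt): the angular size of the prey divided by its rate of expansion gives time-to-contact, and scaling by the frog's own approach speed (available, in principle, from the lunge command) converts it to distance. The sketch below is not the tectal circuit of Chapter 2; it only shows the arithmetic such an integration stage would have to perform.

def distance_from_looming(theta_rad, dtheta_dt, approach_speed):
    """Small-angle estimate of target distance from image expansion.

    theta_rad      : current angular size of the target (radians)
    dtheta_dt      : rate of change of angular size (radians / s)
    approach_speed : observer's speed toward the target (m / s), assumed known
    """
    if dtheta_dt <= 0:
        return float("inf")                      # not expanding: no looming-based estimate
    time_to_contact = theta_rad / dtheta_dt      # the classical "tau"
    return approach_speed * time_to_contact

# Example with made-up numbers: a 1 cm worm seen at 4 cm subtends ~0.25 rad;
# lunging toward it at 0.5 m/s makes the image expand at ~3.1 rad/s.
print(round(distance_from_looming(0.25, 3.125, 0.5), 3))   # ~0.04 m, i.e. about 4 cm

Because theta divided by its rate of change only yields a time, the estimate remains relative unless the approach speed is supplied from elsewhere; this is consistent with the suggestion below that the looming-based estimate complements, rather than replaces, the static disparity and accommodation cues used before the lunge.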
Such a fast pathway for depth estimation provides an attractive alternative suitable for extracting distance parameters for coordinating fast movements. We can depict a scenario in which depth perception based on static cues such as disparity or accommodation plays a dominant role in the initial estimation of target distance to guide lunging, while depth perception based on motion cues such as looming is crucial in the subsequent error correction by tongue extension. We are currently designing experiments that will allow us to discern the involvement of the various mechanisms for depth perception based on static and dynamic cues.

5.6 Modulation by Afferent Feedback

Afferent feedback from the periphery plays an important role in motor control. We study the issues concerning the afferent signal in shaping the behavior of a central MPG by examining some puzzling experiments concerning anuran snapping. Emerson (1977) showed that stimulation of Geniohyoideus (which controls the hyoid), but not DM (which opens the mouth), produces mouth opening in toads. Weerasuriya (1989, 1991) reported that bilateral lesion of the hypoglossal nerve in toads abolishes tongue movement, which is a direct consequence of the transection of axons innervating tongue muscles. Surprisingly, such a lesion also results in the absence of mouth opening, even though DM is not innervated by the hypoglossal nerve. Taken together, these data indicate that afferent feedback from tongue muscles, which enters the brain through the hypoglossal nerve, is critical to the normal function of the jaw synergy.

The first glimpse of the mechanism underlying the regulation of the MPG by afferent feedback is provided by experiments conducted by Nishikawa and Gans (1992). In particular, EMG recordings obtained in their experiments reveal that both DM and LM (Levator mandibulae, which raises the lower jaw to close the mouth) are activated at about the same time (within 10 ms) both before and after bilateral lesion of the hypoglossal nerve. Before the lesion, LM reaches its peak activity about 86 ms later than DM, whereas LM and DM peak at about the same time after the lesion. Such synchronized activation and peaking of agonist and antagonist muscle synergies may explain the failure to open the mouth. Based on data from EMG recording, extracellular recording in axons of trigeminal motoneurons that innervate LM, and direct stimulation of various muscles, they offered a hypothesis in which sensory feedback from the tongue inhibits the phasic activity of LM for approximately 86 ms during snapping, thus enabling DM to open the mouth.

Based on lessons learned from our previous exploration of the premotor space, we propose a slightly different hypothesis for the role of afferent feedback. We postulate that the afferent signal modulates the time constant of the LM module such that it peaks 86 ms later than DM, even though they are activated at about the same time. The rationale for this hypothesis is that modulation of time constants is most suitable for changing peaking time without affecting onset time.

However, a delay of 86 ms in peaking time is significant. Computer simulation of our MPG model indicates that changing the time constant of the motoneurons alone is not sufficient. We tested time constants up to 300 ms, which is beyond the plausible limit of several tens of ms, and obtained a difference in peaking time of only about 40 ms.
A greater delay in peaking time was obtained when the time constant of M10 was also increased, but still not enough. To our surprise, we observed that a decrease in the strength of the inhibition from M8.2 is very effective in delaying the peaking of the motoneurons. In retrospect, we recognized that by integrating the increase in M8.1 and the decrease in M8.2, the rate of rise in M10 is twice that of M8.1 or M8.2. Reducing one of its opposing inputs (either the excitation from M8.1 or the inhibition from M8.2) slows the rise of M10. Under such conditions, increasing the time constants of M8.1 and M8.2 from 20 to 30 ms, and of M10 and the motoneuron from 20 to 85 ms, delays the peaking by about 84 ms, while the activation time is delayed by only 7 ms (Fig. 5.18). Without the afferent signal, LM is activated and peaks at about the same time as DM. As discussed earlier, an increase in the time constant of a premotor neuron reduces the total activation of the motoneurons. Therefore, the synaptic efficacy of the motoneuron has to be increased to balance this effect. The combination of increased time constant and synaptic weight produces a long decay in the LM activity. Such dynamic modulation of the central motor pattern generator by afferent feedback is pervasive and essential in the nervous system.

Figure 5.18. Shaping of the MPG by afferent feedback. Two simulations of the MPG consisting of DM and LM are superimposed. The top graph shows the proprioceptive signal, which is turned on (upper trace) during the first simulation and turned off (lower trace) during the second one. When the afferent signal is absent, DM (middle graph) and LM (bottom graph) become active and reach their peak activation at about the same time. With the afferent signal present, LM is activated about 7 ms later than DM and reaches its peak activity about 84 ms later than DM.

Encouraged by the new data provided by Nishikawa and Gans (1992), we explored other possible roles of the afferent signal in sculpting the activity of the MPG. In particular, the origin of the second peak in DM activity was reconsidered. EMG recordings of DM activity evoked by stimulation of the "snapping zone" in the optic tectum (a tectal region which has been shown to elicit snapping when stimulated, e.g., Ewert 1967) did not show a second peak (Matsushima et al. 1985). One plausible explanation for this discrepancy is that the second peak in DM activity is elicited by the afferent signal from the tongue after contact with the prey. In this case, a second peak appears in natural feeding behavior but not in electrically evoked DM activation. We constructed an alternative model in which the somatosensory feedback from the hypoglossal nerve evokes the second peak in the DM module. In our simulations, the major difference between the afferent feedback-based and the intrinsic feedback-based model is that the timing of the second peak varies over a greater range in the former model, depending on the arrival of the afferent signal.

5.7 Discussion

Anuran snapping shares many similarities with other prehensile movements seen in various species, such as pecking in birds and reaching and grasping in primates, and an understanding of snapping in anurans will shed some light on the more general class of prehensile behaviors. We have developed a neural network model to study many important issues in generating and controlling prehensile movements. However, there are several limitations to our current model.
First, it is a single-neuron model and therefore is not adequate for addressing several population issues, such as the size principle in progressive muscle fiber recruitment. Second, it is unilateral, and therefore unable to account for the control of snapping direction. Third, it is composed of leaky integrator neurons, which neither include the processing occurring in the dendrites nor produce the temporal pattern of the spike train traveling along an axon. Despite these limits, many lessons were learned from experimentation with the model.

Using a physiologically identified push-pull mechanism built into the premotor circuit, we explored ways in which characteristic features of a single muscle synergy can be controlled. A large number of methods for controlling muscle activity was obtained in our experiments. Such a rich functional repertoire is necessary to account for the variations observed in performing motor tasks. For example, in human movement, a variety of patterns have been observed to control the distance of a movement. An increase in the peak amplitude of both agonist and antagonist muscles was observed in producing rapid reversals of movement over greater distances (Sherwood et al. 1988). Furthermore, it was found that the peak amplitude of the agonist and the rising rate of the antagonist are closely correlated with movement speed (Schmidt et al. 1988). In rapid arm reaching movements over greater distances, an increase in the amplitude of the agonist muscles and a delay in the peaking of the antagonist muscles were shown (Gottlieb 1992), whereas a delay in the onset of the antagonist muscle correlates well with the amplitude of wrist movements (Mustard and Lee 1987). Similarly, different ways of controlling movement velocity under different loads have been observed.

Next, we experimented with using a simple three-neuron module incorporating such a push-pull mechanism as a building block to construct MPGs for generating and coordinating complex motor synergies. One important issue in motor control is timing, especially in synchronizing multiple motor synergies. The timing in coordinating the reaching and grasping components based on visual feedback has been studied in humans by perturbing the position and size of the target during reaching (Paulignan et al. 1991a, b, Gentilucci et al. 1992). The results showed a subtle coupling between these two synergies in correcting arm and hand movements towards the new target location. Such a phenomenon was explained by a mathematical model in which the synchronization is based on estimates of the duration of the two motor components (Hoff and Arbib 1993). However, our knowledge of the underlying neural mechanism is still lacking.

To this end, we explored two strategies for using a sensory signal of target distance to control the snapping distance by regulating the timing of the motor synergies involved. Our model incorporates separate but interacting modules for the control of the depressor mandibuli (jaw opener), genioglossus (tongue protractor), and hyoglossus (tongue retractor) muscles. The push-pull mechanism built into each module facilitates the coordination of the motoneurons and also provides flexibility in motoneuron activation. Two schemes for achieving this flexibility, mediation and modulation, were explored to simulate variable tongue extension by having a "closeness" signal control the synchronization of the tongue protractor and retractor.
In the mediation scheme, the signal is used to control the onset time of the retractor, whereas in the modulation scheme it changes the time constant of the retractor, thus leading to a variation in its rate of peaking.

An important insight gained from this study is the way alternative functional hypotheses allow us to tease apart the structural organization of the nervous system. Connectivity at the neuronal level is an essential aspect of understanding brain function at the system level. To date, neuronal connectivity is difficult, if not impossible, to obtain experimentally. Functional (behavioral or physiological) data, on the other hand, are more readily available. Therefore, constructing alternative hypotheses that link the structure of the system and its function, so as to allow comparison of the simulation data produced by different models with the experimental data, is a useful strategy for identifying the underlying structural organization.

The mediation and modulation schemes are suitable for synchronizing motor synergies in different situations. In general, mediation is more effective if the time delay between the activation of the motor synergies is large (e.g., from DM to GG), whereas modulation is more suitable if they are evoked at about the same time (e.g., between GG and HG). Furthermore, during much of a movement, agonist and antagonist muscles are simultaneously active, implying that the modulation scheme is important in the agonist-antagonist interaction. For example, EMG recordings from human subjects reaching by hand to targets at different locations reveal that the modulation scheme is employed to control the peaking rate of the antagonist muscles while the timing of the agonist muscles remains stable (Gottlieb et al. 1989, 1992). A progressively slower rate of rise in the antagonist muscles is observed as the distance to the target increases.

On-line correction based on visual feedback has been shown to be an essential feature in controlling human arm movements (van Sonderen et al. 1989). Our model allows us to test several mechanisms for achieving such flexible motor control. Our hypothesis on the role of visual feedback in tuning the timing of tongue retraction is supported by preliminary data analysis from a series of behavioral experiments, which in turn led to the postulation of several hypotheses concerning the information processing required by such flexibility in motor control (Liaw et al., in preparation). In particular, we hypothesize that depth perception based on static and motion cues plays different roles in guiding the snapping movements (Fig. 5.9). This hypothesis is inspired by a model of looming perception in avoidance behavior (Chapter 2; see also Liaw and Arbib 1993) and by the role looming perception plays in catching a moving target in humans (Savelsbergh et al. 1991). In this hypothesis, we extend the role of distance estimation based on looming cues, as an alternative to time-to-contact, from avoidance behavior to prehensile behavior.

A crucial aspect of motor control is the shaping of MPG behavior by afferent feedback from the periphery. We address this issue by attempting to solve a puzzling phenomenon discovered by Weerasuriya (1989): bilateral lesion of the hypoglossal nerve in toads abolishes not only tongue protrusion but also mouth opening, even though DM is not innervated by the hypoglossal nerve.
EMG data obtained by Nishikawa and Gans (1992) demonstrate that the difference in the relative timing of the peak amplitudes of DM and LM is the key feature underlying this puzzling phenomenon. We hypothesized that the afferent signal conveyed by the hypoglossal nerve modulates the time constant of the MPG module controlling LM such that it peaks 86 ms later than DM. Without the afferent modulation, LM is activated and peaks at about the same time as DM, thus preventing the mouth from opening. The afferent feedback may convey information such as the kinetics of the tongue (e.g., its speed or extension in bringing in a prey), which dictates when the mouth should be closed (depending on, say, the weight of the prey, which affects the speed of tongue retraction).

The correlation between morphological and functional changes through evolution is one area in which exploration with the MPG model will be fruitful. In particular, Nishikawa and colleagues (1992) demonstrated that the failure to open the mouth is seen only with the phylogenetically new development of a highly protrusible tongue, in Bufo marinus and Rana pipiens. In other species with a shorter tongue, such as the frog Discoglossus pictus, hypoglossal sensory feedback is not necessary for mouth opening (Nishikawa and Roth 1991).

Figure 5.19. Feedback from the motor system to a sensory center. The response of the network to a continuous prey-like stimulus. Note that now the T5.2 neurons discharge a sequence of bursts as recorded in experiments. The activity of the T4 neurons decreases after the first burst, which is consistent with their "newness" characteristic. This is due to the feedback inhibition from M10 neurons.

In addition to the feedback from the periphery to the MPG, such a loop also exists between the sensory and motor centers. For example, the discharge of prey-selective neurons (T5.2 cells) in the optic tectum terminates abruptly before the onset of snapping, presumably turned off by motor feedback. Simulation of such inhibitory feedback from the MPG to the optic tectum reveals an unexpected role of the afferent signal in influencing the properties of sensory neurons. The animals are often immobilized before cellular recording, thus precluding the expression of a motor response, and the visual stimulus persists for a long time. To simulate such an experimental condition, a long train of visual stimulation was generated (Fig. 5.19). Note that now the T5.2 neurons discharge a sequence of bursts as recorded in experiments (Matsumoto et al. 1986). Surprisingly, the activity of the T4 neurons decreases after the first burst, displaying their "newness" characteristic. It is interesting to see that the "newness" characteristic of cells in the tectum can be sculpted by feedback from the medulla. Since a tactile stimulus can also trigger a response in the snapping MPG (via the torus), a prediction that follows is that a similar reduction in the response to visual stimuli should be observed after such a tactile stimulation. This phenomenon was documented by Liege and Galand (1972).

Two-way flow of information throughout the sensory systems, motor systems, and periphery is a general organizational principle emerging from evolution (e.g., Cohen 1992 on locomotion and Ewert 1987 on loops in anurans). The activity of any part of the nervous system is subject to the influence of both descending and ascending signals.
The push-pull mechanism presented here provides a simple and effective way to construct MPGs capable of accommodating such two-way interaction.

The tongue flip in anurans not only serves the purpose of prey-catching, but is also involved in the rejection of unpalatable food, as when the food tastes bad or the animal is stung by such prey as bees (Dean 1980). The lingual signal is transmitted through the glossopharyngeal nerve (IX). By means of intracellular recording, Matsushima et al. (1986) demonstrated that stimulation of the IX nerve elicits responses in tongue-protracting and tongue-retracting motoneurons. Moreover, they found spatial facilitation between the tectal EPSPs and the glossopharyngeal EPSPs in some motoneurons. Further study of the medullary reticular formation, the premotor area, revealed the convergence of signals from the optic tectum and the IX nerve (Matsushima et al. 1986, 1989). One may postulate that both the optic tectum and the glossopharyngeal nerve access the same motor pattern generating circuit in the medullary reticular formation for the tongue flip, serving the purposes of prey-catching and prey-rejection, respectively. Exploration of the differences in tongue movements between striking and rejection will provide further insight into the neural mechanisms underlying lingual muscle control.

In summary, we believe that the study of visuomotor coordination in the frog provides an opportunity to work out in some detail the circuitry involved from sensory processing through sensorimotor transformation to motor control, which can provide insights for mammalian nervous systems as well.

CHAPTER 6
SENSORIMOTOR TRANSFORMATION: FROM ANIMAL TO AGENT

6.1 Application of the Looming Avoidance Model to Robot Control

The study of animal behavior can provide valuable insights for the design of an autonomous mobile robot. This chapter presents one such case, in which a biologically based neural network model (described in Chapter 2) of how frogs escape looming stimuli is applied to mediate obstacle avoidance in mobile robots. The motivation for applying the model to robots lies in the fact that, from a relative-motion point of view, the situation in which an object moves towards a robot is similar to that in which the robot moves towards the object. Moreover, the two situations become identical when a mobile robot has to negotiate moving obstacles.

The robotic experiments are carried out in an integrated testbed consisting of the Neural Simulation Language (NSL), for implementing the neural network model, and the Rapid Robotics Application Development environment (R2AD), for dynamic control of a robot arm (Fig. 6.1). In the first experiment, the ability of the model to compute 3-D motion from real image data is evaluated. In the second experiment, the capability of the model to guide the robot arm to avoid obstacles is tested. A camera mounted on the moving robot arm provides the visual inputs to the neural network, which computes the 3-D motion of the obstacles and determines an appropriate escape direction to guide the robot arm around the obstacles.

[Figure 6.1 diagram components: NSL control program, gateway, R2AD kernel, Puma arm, vision system, images, arm motion commands.]

Fig. 6.1. Experimental setup for robot navigation. The illustration at the top shows the architecture of the integrated system. The high-level control is performed by neural network models written in NSL, whereas the low-level robot manipulation is computed by R2AD.
An interface (gateway) provides the protocol for communication between these two components. In addition to controlling the robot arm, R2AD also provides frame-grabbing routines for visual inputs from a camera. One image frame is taken after every move made by the robot and is fed into the looming perception neural network. The neural network locates the position of the obstacles based on the looming patterns and selects an optimal path based on some criterion (the locally shortest path in this case) by specifying the direction of the next movement.

The looming detector circuit, implemented in NSL, acts as the high-level controller for the robot arm subsystem. The controller for the camera system preprocesses the incoming image by reducing the 512x480 image down to an image size of 32x32. A sequence of these reduced-resolution images is sent to the looming detector circuit, which estimates the trajectory of the predator. From this estimated trajectory, the network triggers the execution of an appropriate escape behavior.

6.2 Detecting a Looming Object

For the first experiment, the simulated predator is mounted on the end-effector of a Puma arm. A separate control process interfaces with the Puma control module to generate a set of straight-line trajectories which pass near the camera. In this experiment, we selected four real-world time-varying image sequences representing various objects moving in the space around the visual sensor. The sequences are: (1) an object directly looming towards the camera, (2) an object moving transversely (frontoparallel) across the visual field, (3) a crossing looming stimulus which passes through the midline of the visual field, and (4) an object which looms from the upper visual field in a downward and leftward direction. Of these four image segments, only the first one poses a direct danger to the robot, which must generate a sequence of motor patterns to avoid the object, while the other perceptual sequences will not induce the same behavior even though all stimuli are looming towards the robot.

Figures 6.2 to 6.5 illustrate the output from the neural layer detecting looming stimuli, the leftward (180 degree) movement direction-selectivity layer, and the collision layer which responds to directly looming objects.

Figure 6.2. The collision layer from experiment #1. It indicates the presence of an object moving directly towards the camera. Similar activity also exists in the loomf and d180f layers, illustrating that the moving object is expanding (d180f) while looming (loomf).

Figure 6.3. In experiment #2, we have an object moving across the visual field from right to left. This motion corresponds to the activation of neuron firing in the d180f layer, which is sensitive to movements in the right-to-left direction. However, we observe that the object does not loom towards the robot, which explains the inactivity in the loomf layer.

Figure 6.4. (a) The neural layer indicating the presence of a looming stimulus, with an object approaching from the left as described in experimental setup #3. (b) The motion-sensitive layer which is directionally selective to movements in the right-to-left direction.

Figure 6.5. In experiment #4, the looming stimulus approaches from above, which accounts for the position of the peak neural firing towards the edge of these two figures. Nevertheless, these activities do not elicit a response in the collision layer.
The firing patterns in these figures are topologically organized in retinal coordinates, with the location of the peak activity corresponding to the center of the looming stimulus.

6.3 Obstacle Avoidance

The objective of this experiment is to test the capability of the looming avoidance model in detecting obstacles and providing a detour path. A camera mounted on the moving Puma arm provides the visual inputs to the model, which computes the 3-D motion of the obstacles and determines an appropriate escape direction to guide the robot arm around the obstacles. At each time step, the Puma arm moves to a new position specified by the model. The camera takes an image frame at the new position, which is fed into the model. The model computes the trajectory of the obstacles, whose positions in the image vary due to the movements of the camera. Below we review several related works in which a neural network model is applied to robot control.

6.3.1 Related work

CMU NAVLAB - Pomerleau (1990). A three-layered neural network is trained to imitate a human driver using back-propagation. As a person drives the Navlab, a camera takes in images as the training input, and the direction in which the driver is currently steering serves as the desired output. The network learned to drive under various road (and weather) conditions. Though the term "neural" is used, this network obtains its functionality through training, and relevance to biology is not among the issues addressed by this study.

MAVIN - Baloch and Waxman (1991). They developed a control system for mobile robots which utilizes dynamical neural networks for learning and performance. The system includes networks for early visual perception, eye motion control, pattern learning and object recognition, object associations and delayed expectation learning, emotional states, behavioral actions, and associative switching. They demonstrated that a variety of behavioral conditioning phenomena are emergent consequences of this modular neural system. However, in the experiment, instead of using real image inputs, the system is provided with simulated patterns; therefore it is not clear how well the system can perform in a natural environment.

Behavior-based robot control - Brooks (1986). Brooks proposed a hierarchy of behaviors in which the behavior in each layer is capable of performing some function on its own. A behavior at the lower levels of the hierarchy performs a very simple task; behaviors become more complicated as one moves up the hierarchy. This approach is more biologically oriented. However, while Brooks stresses the independence between different layers in order to reduce computational complexity, biological systems are capable of integrating various behaviors to achieve optimal performance (optimal in an ecological sense).

6.3.2 Technical considerations

Assumptions. Two assumptions are made in applying the model to the obstacle avoidance problem. 1. Contrast: the obstacles are black (or dark) objects placed against a white background. 2. Initial distance: the robot always moves towards obstacles from some distance away, such that initially an obstacle covers at most a few pixels (in order to allow the expanding image to be integrated).

Performance measurements. The performance of the model in the obstacle avoidance task is measured by two criteria. First, the minimum requirement is that it must be able to detect and avoid obstacles. The second criterion is the optimality of the chosen path. There are several possible parameters for this measurement, such as the length of the chosen path, the total number of turns, or the desirability of the path. In the present study, the length of the path is used as the measure of performance.
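Before turning to the sensorimotor transformation itself, it may help to make the overall experimental loop of Section 6.1 concrete. The sketch below is only a schematic of that cycle (grab a frame, reduce it to 32x32, run the looming network, read the winning heading off the motor map, and command the next arm move); the class and function names are placeholders introduced here for illustration, not the actual NSL or R2AD interfaces.

import numpy as np

def downsample(frame):
    """Reduce a camera frame, assumed to arrive as a 480x512 (rows x columns) array,
    to the 32x32 input used by the looming network, by block-averaging."""
    img = np.asarray(frame, dtype=float)[:480, :512]
    return img.reshape(32, 15, 32, 16).mean(axis=(1, 3))

def control_loop(camera, looming_net, arm, n_steps=100):
    """One hypothetical sense-act cycle per arm move."""
    for _ in range(n_steps):
        frame = camera.grab()                     # frame grabbing on the R2AD side
        retina = downsample(frame)                # 512x480 -> 32x32
        heading_map = looming_net.step(retina)    # looming / motor map on the NSL side
        heading = np.unravel_index(heading_map.argmax(), heading_map.shape)
        arm.move_toward(heading)                  # next low-level move via the gateway

The only point of the sketch is that one image is processed per move, which is what makes the limited 32x32 resolution (and the computational bottleneck discussed in Section 6.4) tolerable.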
6.3.3 Sensorimotor transformation

The detection of obstacles and the computation of their trajectories relative to the moving camera are performed in the same way as described in Chapter 2. In the model, the detection of approaching obstacles is based on the expansion of the retinal images, and the spatial location of the objects is encoded by a population of looming-sensitive neurons. The direction of the looming stimulus is computed by monitoring the shift of the peak of neuronal activity within this population. The signal encoding the stimulus location is gated by direction-selective neurons onto a motor heading map which specifies the escape direction. However, there is one major difference between the looming avoidance behavior and obstacle avoidance: in the simulation of looming avoidance, only one approaching stimulus is presented at a time, whereas multiple obstacles are present in this experiment. This raises several interesting issues that the model must deal with, including occlusion, detection of gaps between obstacles, size constancy for obstacles of different sizes at different distances, and the interaction and integration of multiple obstacle signals.

Motor heading map. Unlike avoidance of a single looming object, where the data on escape direction are compiled experimentally and the projection from the looming detectors is fixed, the heading for avoiding multiple obstacles has to be determined dynamically based on their spatial arrangement. Finding a collision-free path is a problem that has attracted the attention of many researchers in the field of robotics. Two major approaches have been developed to provide such a path. The first is the graph-searching approach, in which the obstacles are represented as nodes in a connected graph and the problem is transformed into a search through the graph for an optimal path based on some criterion (Lozano-Perez 1983, Schwartz et al. 1987, Mitchel 1990, Zhu and Latombe 1991). The second is the potential field approach, in which a "force field" is created by integrating repulsive forces exerted by the obstacles and attractive forces generated by targets; a path is represented as a "valley" in this vector field (Arbib and House 1987, Arkin 1989, Warren 1990, Tilove 1990, Camurri et al. 1993). Here we propose a biologically inspired approach based on the motor heading map which is postulated to exist in the midbrain tegmentum of the frog (see Sections 2.2.4 and 2.4). The motor map provides a substrate on which the signals of multiple obstacles interact and compete with each other, and a heading for the next step the robot should take emerges from this interaction. For this scheme to work, the signal should indicate not the location and extent of the obstacle; rather, it should specify the openings beyond the edges of an obstacle. This is achieved by projecting the looming detectors to the motor map via a connectivity pattern that resembles an inverted DOG (Difference of Gaussians). Through a mask of an inverted DOG, neurons activated by an obstacle inhibit cells on the map that correspond to the same spatial location and excite those that are some distance away. As a result, only cells whose locations correspond to an opening (or a gap between two obstacles) can be activated. The presence of multiple obstacles (and hence multiple gaps) raises the question of which route to choose. A winner-takes-all mechanism proposed by Didday (1970; see also Amari and Arbib 1977) is built into the motor map. A simple mechanism for obtaining a locally optimal path (shortest length) is employed in the simulation. The idea is to favor the most centrally located gap. This is implemented as a differential tonic firing rate for the cells in the motor map, such that more centrally located cells have a higher background activity. This inequality gives central cells a higher total activity, and they emerge as the winners on the map. Typical experimental results for avoiding arbitrarily arranged obstacles are shown in Fig. 6.6.

Fig. 6.6. Avoiding four obstacles. Three experiments with slightly different starting positions are shown. Note that the chosen path (solid line) is locally optimal but is not globally optimal (dashed line).
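The following sketch illustrates, in one dimension and with made-up numbers, the two operations just described: an inverted difference-of-Gaussians projection that converts obstacle locations into activity at the openings beside them, and a center-favoring bias plus winner-take-all choice of the next heading. It is a toy rendering of the idea, not the NSL network; the kernel widths, bias slope, and map size are arbitrary assumptions.

import numpy as np

MAP = 32                                   # headings across the visual field

def inverted_dog(sigma_c=1.5, sigma_s=4.0):
    """Surround-excitatory, center-inhibitory kernel: an obstacle suppresses its own
    heading and excites headings a little to either side (the 'openings')."""
    x = np.arange(-MAP // 2, MAP // 2)
    g = lambda s: np.exp(-x**2 / (2 * s**2)) / s
    return g(sigma_s) - g(sigma_c)         # inverted DOG

def heading_map(obstacle_activity, center_bias=0.02):
    """Project looming activity through the inverted DOG, add a tonic bias that
    favors central headings, and pick the winner."""
    kernel = inverted_dog()
    mapped = np.convolve(obstacle_activity, kernel, mode="same")
    bias = center_bias * (MAP // 2 - np.abs(np.arange(MAP) - MAP // 2))
    activity = np.maximum(mapped, 0.0) + bias
    return int(activity.argmax()), activity

# Two hypothetical obstacles near headings 12 and 20: the winner should land in the
# gap between them, since the gap collects excitatory surround from both obstacles.
obstacles = np.zeros(MAP)
obstacles[11:14] = 1.0
obstacles[19:22] = 1.0
winner, _ = heading_map(obstacles)
print("chosen heading index:", winner)

With these numbers the winning heading falls between the two obstacle clusters rather than on either flank, and the tonic bias breaks any remaining tie in favor of the most central opening, which mirrors the locally-shortest-path criterion used in Fig. 6.6.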
Size constancy via motion parallax. Since the only available information is the expansion of the image of the obstacles, the model is confronted with a challenging problem: without depth information, how can the model respond to a small obstacle at a closer distance instead of a larger one at a greater distance, when the latter subtends a bigger visual angle (i.e., it casts a bigger image on the camera)? Although only expanding patterns in the image are available, they provide a crucial clue about the distance of an object, namely, motion parallax. Motion parallax refers to the fact that for two objects moving at the same speed at different distances, the closer one generates a greater retinal shift, i.e., its image moves at a higher speed than that of the more distant one. Therefore, once the camera starts moving, the looming detector responds more strongly to the smaller and closer obstacle, since it activates more retinal ganglion cells. In several experiments with obstacles of different sizes, the model demonstrated the capability of correctly avoiding them based on motion parallax. A sequence of movements made by the robot arm under such conditions is given in Fig. 6.7.

Fig. 6.7. Estimating depth based on motion parallax. The top row shows three snapshots of the robot arm moving around three obstacles (from left to right). The frames in the bottom row show images seen through the robot's eye (the video camera). In this experiment, a large object is placed behind two small objects. The looming neural network is able to detect the small ones first based on motion parallax and guides the robot arm through them first.

6.4 Discussion

The model has demonstrated the capability of detecting multiple obstacles in the presence of noise and occlusion. It also qualitatively achieves size constancy. However, there are several limitations to the model. One of our major obstacles is the lack of sufficient computational power for processing sensory information. Even with the limited 32x32 environment, we were unable to achieve real-time coordination between the sensory computation and the motor action. However, this may be rectified adequately with speedier hardware, i.e., parallel or concurrent hardware systems.

The model is sensitive to contrast reversal, as discussed in Section 2.3.4. Furthermore, it also requires an initial distance greater than some threshold value. The robot must move towards obstacles from some distance away, such that initially an obstacle covers at most a few pixels. If somehow an obstacle is placed close to the robot, it will get stuck.
There are several ways to solve this problem. First, we can make use of the texture of the object; however, this requires much higher resolution and may slow down the system considerably. Second, we can make the robot back up when stuck. Third, the robot can look around to find more information to get out of the situation.

Although our primary simulations have been modeled after visual perception mechanisms, we hope to model other sensory information processing circuitry in the future. If an individual sensory unit can be simulated successfully, one interesting topic of research would be to find ways for neural networks of diverse modalities to fuse together to provide a more realistic perception.

A simple mechanism for finding a locally optimal path is employed in the present experiment. There are several ways to improve the mechanism towards a more globally optimal solution. One is to take into account obstacles at a greater distance. Another is to seek also to minimize the deviation from the target. Minimization of the deviation from the target can be obtained by shifting the background activity of the cells on the motor map according to the current heading. The shift may be achieved by neural modulation with a gradient corresponding to the deviation from the target.

CHAPTER 7
CONCLUSION

The study of visuomotor coordination has been approached from several directions, from motion perception through sensorimotor transformation to motor control, in anurans, mammals, and robots. Some progress has been made in these areas, which we hope may contribute to a better understanding of the issues involved. However, many more questions remain unanswered, and new ones are raised by the study. Here, some topics for future research are discussed.

The issue of transforming sensory information into motor patterns is general to all animals. In anurans, the stimulus location is place-coded in the tectum, yet the command to the motor pattern generator is coded by the overall level of activity in the population of command fibers (a population code). Based on lesion studies in frogs, Grobstein suggested that one transformation of the tectal efferents occurs in the midbrain tegmentum: a shift from initial 2-D signals place-coded in sensory coordinates to 3-D head- or body-centered signals whose components are carried by at least two pathways, one for horizontal eccentricity and the other for stimulus elevation and distance (Grobstein 1989). Similar results have been reported in many species from lower to higher vertebrates, e.g., wave localization in clawed frogs (Elepfandt 1988), head movement in barn owls (Masino and Knudsen 1990), saccadic eye movements in cats and monkeys (Buttner et al. 1977, Cohen et al. 1986; for a review see Sparks and Mays 1990), and hand pointing in humans (Flanders and Soechting 1990, Flanders et al. 1992). These findings suggest that even though the schemes for target localization may differ in these cases, they share a common mechanism for transforming target positions into motor commands.

Future research will address the issues of sensorimotor integration in the tegmentum and medulla involving both predator-avoidance and prey-orienting behaviors. We also plan to analyze more complicated stimulus situations and the integration of the corresponding responses with detour behavior when barriers are present. Moreover, many interesting and challenging issues raised by lesion experiments remain to be addressed in our modeling study.
The capability to compute the 3-D trajectory of a moving object is critical for an animal catching or avoiding a visual target. Neurons which respond specifically to a looming object on a trajectory that will lead to contact with a particular region of the body surface have been recorded physiologically (Gentilucci et al. 1988). The response of these neurons depends only on the trajectory of the looming object, regardless of eye position. This raises two interesting issues, namely, the integration of visual and somatosensory signals and the underlying mechanism for the transformation from retinal coordinates to somatosensory coordinates.

Another important contribution of motion perception is in stabilizing the visual perception of the dynamic world. Visual stabilization requires the integration of motion perception with proprioceptive information about eye and body movement and feedback from higher cognitive centers. There is evidence that cells in the posterior parietal cortex (area 7a) are involved in the integration of signals about the position of a visual target and the position of the eyes (Andersen and Zipser 1988). The role of this type of neuron in visual perception is an interesting subject for future research.

The proposed neural mechanism for detecting motion in depth is based on the expansion of the monocular retinal image. Another crucial cue based on depth information has been excluded from our model. There are two issues for future research: one is the incorporation of the binocular disparity signal into the detection of motion in depth, and the other is how motion perception may aid depth perception. There are neurons sensitive to edges in the two retinal images moving in opposite directions (Regan and Beverly 1978, Motter and Mountcastle 1981). This response requires resolving the disparity between the two images. On the other hand, motion in depth based on monocular cues should be able to support depth perception. The close relation between the perception of depth and the perception of motion in depth is an intriguing issue for further investigation.

Object recognition from motion has been a subject of vision research for a long time. While object recognition from static images has generated many exciting results (Binford 1971, Marr and Nishihara 1978, Rosenfeld 1984, Biederman 1987, Riseman and Hanson 1987, Nevatia and Price 1989, Liaw and McCormick 1990, King 1993), less is known about object recognition based on motion. Although it is well known that co-movement of features in a scene is a strong cue for grouping moving parts into a coherent object, questions remain as to how to construct primitive motion elements as building blocks for a dynamic scene. On the biological front, one may investigate the neural substrates and mechanisms that carry out the visual binding process. Lesion studies reveal some of the substrates involved in object recognition. Through clinical studies of patients with lesions of the posterior right hemisphere, Vaina (1989) found deficits in motion interpretation and in stereopsis using Julesz's random-dot stereograms. The group with a right occipito-parietal lesion showed impairment on the stereopsis and speed-comparison tasks. Furthermore, they failed to achieve perceptual grouping of moving elements into a rigid 3-D structure. On the other hand, the group with a right occipito-temporal lesion failed to identify 2-D form from motion or stereopsis.
Vaina further hypothesized that the role of the middle temporal cortex (MT) is to achieve some type of perceptual grouping based on the integration of elementary and local measurements (Siegel and Andersen 1986). It might conduct a correlation or correspondence process between sets of elements over time and space.

Finally, we can extend the capability of the robot control system in four directions: (i) experimentation with multiple moving objects; (ii) the addition of a pattern recognition sub-system such that the robot will reach for a target while avoiding obstacles (this extension makes the system an ideal testbed for studying issues of sensory fusion, namely, how target and obstacle information are integrated to mediate behavior); (iii) employing a more advanced strategy, such as rotating the camera (simulated eye movements), to actively gather more information and make better decisions; and (iv) designing and simulating physiological experiments regarding sensorimotor integration. Ultimately, we hope to demonstrate the close coupling of experimental and modeling approaches to neuroscience: biological studies help in the design of better intelligent robots, while the computational systems in turn help the advancement of biological research.

REFERENCES

Amari, S., and Arbib, M.A., 1977, Competition and cooperation in neural nets, in Systems Neuroscience (J. Metzler, ed.), pp. 119-165, Academic Press.

Andersen, R., and Zipser, D., 1988, The role of the posterior parietal cortex in the coordinate transformations for visual-motor integration, Canadian J. Physiol. and Pharm. 66: 488-501.

Arbib, M.A., and House, D.H., 1987, Depth and detours: an essay on visually-guided behavior, in Vision, Brain, and Cooperative Computation (M.A. Arbib and A.R. Hanson, eds.), Cambridge, MA: A Bradford Book/MIT Press, pp. 129-163.

Arkin, R.C., 1989, Neuroscience in motion: the application of schema theory to mobile robotics, in Visuomotor Coordination: Amphibians, Comparisons, Models, and Robots (J.-P. Ewert and M.A. Arbib, eds.), New York: Plenum Press, pp. 649-671.

Backstrom, A.-C., Hemila, S., and Reuter, T., 1978, Directional selectivity and color coding in the frog retina, Medical Biology 56: 72-83.

Baloch, A.A., and Waxman, A.M., 1991, Visual learning, adaptive expectation, and behavioral conditioning of the mobile robot MAVIN, Neural Networks, 4: 271-302.

Barlow, H.B., and Levick, W.R., 1965, The mechanism of directional selectivity in the rabbit's retina, J. Physiol. 173: 377-407.

Bermejo, R., and Zeigler, H.P., 1989, Prehension in the pigeon II. Kinematic analysis, Exp. Brain Res. 75: 577-585.

Beverly, K.I., and Regan, D., 1973, Evidence for the existence of neural mechanisms selectively sensitive to the direction of movement in space, J. Physiol., 235: 17-29.

Biederman, I., 1987, Recognition-by-components: a theory of human image understanding, Psychol. Review, 94: 115-147.

Binford, T.O., 1971, Visual perception by computer, Proc. IEEE Conf. on Systems and Controls.

Bizzi, E., Hogan, N., Mussa-Ivaldi, F.A., and Giszter, S., 1992, Does the nervous system use equilibrium-point control to guide single and multiple joint movements? Behav. Brain Sci. 15: 603-614.

Braddick, O., and Holliday, I., 1991, Serial search for targets defined by divergence or deformation of optic flow, Perception, 20: 345-354.

Braitenberg, V., and Taddei-Ferretti, C., 1966, Landing reaction of Musca domestica, Naturwissenschaften, 53: 155-156.
Brooks, R.A., 1986, A robust layered control system for a mobile robot, IEEE J. Robotics and Automation, RA-2: 14-23.

Borst, A., and Bahde, S., 1986, What kind of movement detector is triggering the landing response of the housefly? Biol. Cybern. 55: 59-69.

Borst, A., and Bahde, S., 1988, Spatio-temporal integration of motion, a simple strategy for safe landing in flies, Naturwissenschaften 75: 265-267.

Brown, W.T., and Ingle, D., 1973, Receptive field changes produced in frog thalamic units by lesions of the optic tectum, Brain Res. 59: 405-409.

Buttner, U., Buttner-Ennever, J.A., and Henn, V., 1977, Vertical eye movement related unit activity in the rostral mesencephalic reticular formation of the alert monkey, Brain Res. 130: 239-252.

Camurri, A., Frixione, M., Vercelli, G., and Zaccaria, R., 1993, How to do actions with symbols: analogical reasoning in a hybrid representation system, submitted to Artificial Intelligence.

Cavallo, V., and Laurent, M., 1988, Visual information and skill level in time-to-collision estimation, Perception, 17: 623-632.

Cervantes-Perez, F., Lara, R., and Arbib, M.A., 1985, A neural model of interactions subserving prey-predator discrimination and size preference in anuran amphibia, J. Theor. Biol. 113: 117-152.

Cobas, A., and Arbib, M.A., 1991, Prey-catching and predator-avoidance in frog and toad 1: maps and schemas, in Visual Structures and Integrated Functions, M.A. Arbib and J.-P. Ewert, eds., Research Notes in Neural Computing, NY: Springer-Verlag, pp. 139-152.

Cobas, A., and Arbib, M.A., 1992, Prey-catching and predator-avoidance in frog and toad: defining the schemas, Journal of Theoretical Biology, 157: 271-304.

Coggshall, J.C., 1972, The landing response and visual processing in the milkweed bug, Oncopeltus fasciatus, J. Exp. Biol. 52: 401-414.

Cohen, A.H., 1992, The role of heterarchical control in the evolution of central pattern generators, Brain Behav. Evol. 40: 112-124.

Cohen, B., Waitzman, D.M., Buttner-Ennever, J.A., and Matsuo, V., 1986, Horizontal saccades and the central mesencephalic reticular formation, Prog. Brain Res. 64: 243-256.

Collett, T., 1977, Stereopsis in toads, Nature 267: 349-351.

Cynader, M., and Regan, D., 1978, Neurons in the cat parastriate cortex sensitive to the direction of motion in three-dimensional space, J. Physiol. 274: 549-569.

Dean, J., 1980, Encounter between bombardier beetles and two species of toads (Bufo americanus, B. marinus): speed of prey-capture does not determine success, J. Comp. Physiol. A 135: 41-50.

Didday, R.L., 1976, A model of visuomotor mechanisms in the frog optic tectum, Math. Biosci. 30: 169-180.

Edinger, L., 1908, Vorlesungen über den Bau der nervösen Zentralorgane des Menschen und der Tiere, Vogel, Leipzig.

Elepfandt, A., 1988, Central organization of wave localization in the clawed frog, Xenopus laevis: involvement and bilateral organization of the midbrain, Brain Behav. Evol. 31: 349-357.

Emerson, S.B., 1977, Movement of the hyoid in frogs during feeding, Am. J. Anat. 149: 115-120.

Ewert, J.-P., 1967, Aktivierung der Verhaltensfolge beim Beutefang der Erdkröte (Bufo bufo L.) durch elektrische Mittelhirnreizung, Z. Vergl. Physiol. 54: 455-481.

Ewert, J.-P., 1971, Single unit response of the toad (Bufo americanus) caudal thalamus to visual objects, Z. Vergl. Physiol. 74: 81-102.

Ewert, J.-P., 1974, The neural basis of visually guided behavior, Scientific American 230: 34-42.
Ewert, J.-P., 1984, Tectal mechanism that underlies prey-catching and avoidance behavior in toads, in: Comparative Neurology of the Optic Tectum, H. Vanegas, ed., NY: Plenum Press, pp. 247-416.

Ewert, J.-P., 1987, Neuroethology of releasing mechanisms: prey-catching in toads, Behav. and Brain Sci., 10: 337-405.

Ewert, J.-P., and Rehn, B., 1969, Quantitative Analyse der Reiz-Reaktionsbeziehungen bei visuellem Auslösen des Fluchtverhaltens der Wechselkröte (Bufo bufo Laur.), Behaviour 35: 212-234.

Ewert, J.-P., and Seelen, W.v., 1974, Neurobiologie und System-Theorie eines visuellen Muster-Erkennungsmechanismus bei Kröten, Kybernetik 14: 167-183.

Ewert, J.-P., and Wietersheim, A.v., 1974, Musterauswertung durch tectale und thalamus/praetectale Nervennetze im visuellen System der Kröte (Bufo bufo L.), J. Comp. Physiol. 92: 131-148.

Ewert, J.-P., Weerasuriya, A., Schurg-Pfeiffer, E., and Framing, E., 1990, Responses of medullary neurons to moving visual stimuli in the common toad: I) Characterization of medial reticular neurons by extracellular recording, J. Comp. Physiol. A, 167: 495-508.

Flanders, M., and Soechting, J., 1990, Parcellation of sensorimotor transformations for arm movements, J. Neuroscience, 10: 2420-2427.

Flanders, M., Helms-Tillery, S., and Soechting, J., 1992, Early stages in a sensorimotor transformation, Behav. Brain Sci. 15: 309-362.

Flash, T., and Hogan, N., 1985, The coordination of arm movements: an experimentally confirmed mathematical model, J. Neurosci. 5: 1688-1703.

Freeman, T., and Harris, M., 1992, Human sensitivity to expanding and rotating motion: effects of complementary masking and directional structure, Vision Res. 32: 81-87.

Gaillard, F., 1984, Binocular vision in frog: functional properties, anatomical characteristics and neuronal integration of the different retino-tectal inputs, Doctoral Dissertation, Université de Poitiers.

Gans, C., and Gorniak, G.C., 1982, Functional morphology of lingual protrusion in marine toads (Bufo marinus), Amer. J. Anatomy, 163: 195-222.

Gans, C., 1992, Electromyography, in Biomechanics Structures and Systems, A Practical Approach, Biewener, A.A. (ed.), NY, NY: Oxford University Press.

Gentilucci, M., Fogassi, L., Luppino, G., Matelli, M., Camarda, R., and Rizzolatti, G., 1988, Functional organization of the inferior area 6 in the macaque monkey I. Somatotopy and the control of proximal movements, Exp. Brain Res., 71: 475-490.

Gentilucci, M., Chieffi, S., Scarpa, M., and Castiello, U., 1992, Temporal coupling between transport and grasp components during prehension movements: effects of visual perturbation, Behav. Brain Res. 47: 71-82.

Gibson, J.J., 1955, The optical expansion-pattern in aerial locomotion, Am. J. Psychol. 68: 480-484.

Gibson, J.J., 1958, Visually controlled locomotion and visual orientation in animals, Br. J. Psychol. 49: 182-194.

Goodale, M.A., 1983, Visually guided pecking in the pigeon (Columba livia), Brain Behav. Evol. 22: 22-41.

Goodman, L.J., 1960, The landing response of insects, I. The landing response of the fly Lucilia sericata and other Calliphorinae, J. Exp. Biol. 37: 854-878.

Gottlieb, G.L., Corcos, D.M., Latash, M.L., and Agarwal, G.C., 1992, Organizing principles for single-joint movements: I. A speed-insensitive strategy, J. Neurophysiol. 62: 342-357.

Gottlieb, G.L., Latash, M.L., Corcos, D.M., Liubinskas, T.J., and Agarwal, G.C., 1992, Organizing principles for single-joint movements: V. Agonist-antagonist interactions, J. Neurophysiol. 67: 1417-1427.
Grantyn, A., and Berthoz, A., 1988, The role of the tectoreticulospinal system in the control of head movement, in Control of Head Movement, B.W. Peterson and F.J. Richmond, eds., NY: Oxford University Press, pp. 224-244.

Grillner, S., Buchanan, J.T., Wallen, P., and Brodin, L., 1988, Neural control of locomotion in lower vertebrates: from behavior to ionic mechanisms, in Neural Control of Rhythmic Movements in Vertebrates, A.H. Cohen, S. Rossignol, and S. Grillner, eds., NY: John Wiley & Sons, pp. 1-40.

Grobstein, P., 1989, Organization in the sensorimotor interface: a case study with increased resolution, in Visuomotor Coordination: Amphibians, Comparisons, Models, and Robots, J.-P. Ewert and M.A. Arbib, eds., NY: Plenum, pp. 537-568.

Grobstein, P., 1991, Directed movement in the frog: a closer look at a central representation of spatial location, in Visual Structures and Integrated Functions, M.A. Arbib and J.-P. Ewert, eds., Research Notes in Neural Computing, NY: Springer-Verlag, pp. 125-138.

Grobstein, P., and Staradub, V., 1989, Frog orienting behavior: the descending distance signal, Soc. Neurosci. Abstr. 15: 54.

Grüsser, O.-J., and Grüsser-Cornehls, U., 1970, Die Neurophysiologie visuell gesteuerter Verhaltensweisen bei Anuren, Verhandlungsbericht Dtsch. Zool. Ges., 64: 201-218.

Grüsser, O.-J., and Grüsser-Cornehls, U., 1973, Neuronal mechanisms of visual movement perception and some psychophysical and behavioral correlations, in: Handbook of Sensory Physiology, R. Jung, ed., vol. VII/3A, Springer-Verlag, NY, pp. 333-429.

Grüsser, O.-J., and Grüsser-Cornehls, U., 1976, Neurophysiology of the anuran visual system, in Frog Neurobiology, R. Llinas and W. Precht, eds., NY: Springer-Verlag, pp. 297-385.

Grüsser-Cornehls, U., and Langeveld, S., 1985, Velocity and directional selectivity of frog retinal ganglion cells depend on chromaticity of moving stimuli, Brain, Behavior and Evolution 27: 165-185.

Harris-Warrick, R.M., and Marder, E., 1991, Modulation of neural networks for behavior, Ann. Rev. Neurosci. 14: 39-57.

Hassenstein, B., and Reichardt, W., 1956, Functional structure of a mechanism of perception of optical movement, in Proc. 1st Intl. Congr. Cybernet., Namur, pp. 797-801.

Hatsopoulos, N., and Warren, W., 1991, Visual navigation with a neural network, Neural Networks, 4: 303-317.

Herrick, C.J., 1930, The medulla oblongata of Necturus, J. Comp. Neurol. 50: 1-96.

Hinsche, G., 1935, Ein Schnappreflex nach "Nichts" bei Anuren, Zool. Anz., 111: 113-122.

Hoff, B., and Arbib, M.A., 1993, Models of trajectory formation and temporal interaction of reach and grasp, J. Motor Behavior (in press).

Holmqvist, M., and Srinivasan, M., 1991, A visually evoked escape response of the housefly, J. Comp. Physiol. A, 169: 451-459.

Horn, B.K.P., and Schunck, B.G., 1981, Determining optical flow, Artif. Intell. 17: 185-202.

House, D., 1989, Depth Perception in Frogs and Toads: A Study in Neural Computing, Lecture Notes in Biomathematics 80, Springer-Verlag, Berlin.

Hubel, D.H., and Wiesel, T.N., 1962, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, J. Physiol., 160: 106-154.

Hubel, D.H., and Wiesel, T.N., 1968, Receptive fields and functional architecture of monkey striate cortex, J. Physiol., 195: 215-243.

Ingle, D., 1976, Spatial vision in anurans, in: The Amphibian Visual System, K.V. Fite, ed., Academic Press, NY, pp. 119-141.
Ingle, D., 1983, Brain mechanisms of visual localization by frogs and toads, in: Advances in Vertebrate Neuroethology, J.-P. Ewert, R. Capranica, and D. Ingle, eds., Plenum Press, NY, pp. 177-226.

Ingle, D., 1991, Control of frog evasive direction: triggering and biasing systems, in: Visual Structures and Integrated Functions, M.A. Arbib and J.-P. Ewert, eds., Research Notes in Neural Computing, NY: Springer-Verlag, pp. 181-189.

Ingle, D., and Hoff, K.v.S., 1990, Visually elicited evasive behavior in frogs: giving memory research an ethological context, BioScience 40: 284-291.

Ito, M., 1986, Neural systems controlling movements, TINS 9: 515-518.

Jeannerod, M., 1984, The timing of natural prehension movements, J. Motor Behav. 16: 235-254.

Kaczmarek, L.K., and Levitan, I.B., 1987, Neuromodulation: The Biochemical Control of Neuronal Excitability, NY: Oxford University Press.

Kaplan, 1984, Advanced Calculus, Reading, MA: Addison-Wesley.

King, I.K.C., 1993, A High Level Approach to the Interpretation of Motion in Dynamic Scenes, PhD Dissertation, Department of Computer Science, Univ. of Southern California.

Koenderink, J.J., and van Doorn, A.J., 1975, Invariant properties of the motion parallax field due to the movement of rigid bodies relative to an observer, Optica Acta 22: 773-791.

Koenderink, J.J., and van Doorn, A.J., 1976, Local structure of motion parallax of the plane, J. Optical Society of America 66: 717-723.

Lee, D.N., 1976, A theory of visual control of braking based on information about time to collision, Perception 5: 437-459.

Lee, D.N., 1980, The optic flow field: the foundation of vision, Phil. Trans. R. Soc. Lond. B 290: 169-179.

Lee, D.N., and Reddish, P.E., 1981, Plummeting gannets: a paradigm of ecological optics, Nature 293: 293-294.

Leinonen, L., 1980, Functional properties of neurones in the posterior part of area 7 in awake monkey, Acta Physiol. Scandinavica, 108: 301-308.

Liaw, J.-S., and McCormick, B.H., 1990, Mapping biological structures using finite element analysis, Proc. First International Conference on Visualization in Biomedical Computing, pp. 352-357.

Liaw, J.-S., and Arbib, M.A., 1993, Neural mechanisms underlying directional-selective avoidance behavior in frog and toad, J. Adaptive Behavior 1: 227-261.

Liaw, J.-S., Weerasuriya, A., and Arbib, M.A., 1993, Neural strategies for coordinating fast movements (to appear).

Liege, B., and Galand, G., 1972, Single-unit responses in the frog's brain, Vision Res. 12: 609-622.

Loeb, G.E., and Gans, C., 1986, Electromyography for Experimentalists, Univ. of Chicago Press, Chicago.

Longuet-Higgins, H.C., 1981, A computer algorithm for reconstructing a scene from two projections, Nature 293: 133-135.

Lozano-Perez, T., 1983, Spatial planning: a configuration space approach, IEEE Trans. Comput., 32: 108-120.

Marr, D., and Nishihara, H.K., 1976, Representation and recognition of the spatial organization of three-dimensional shapes, AI Memo 377, MIT AI Lab, Cambridge, MA.

Masino, T., and Grobstein, P., 1990, Tectal connectivity in the frog Rana pipiens: tectotegmental projections and a general analysis of topographic organization, J. Comparative Neurology 291: 103-127.

Masino, T., and Knudsen, E.I., 1990, Horizontal and vertical components of head movement are controlled by distinct neural circuits in the barn owl, Nature 345: 434-437.
Matsumoto, N., Schwippert, W., and Ewert, J.-P., 1986, Intracellular activity of morphologically identified neurons of the grass frog's optic tectum in response to moving configurational visual stimuli, J. Comp. Physiol. A 159: 721-739.

Matsushima, T., Satou, M., and Ueda, K., 1985, An electromyographic analysis of electrically-evoked prey-catching behavior by means of stimuli applied to the optic tectum in the Japanese toad, Neurosci. Res. 3: 154-161.

Matsushima, T., Satou, M., and Ueda, K., 1986, Glossopharyngeal and tectal influences on tongue-muscle motoneurons in the Japanese toad, Brain Res. 365: 198-203.

Matsushima, T., Satou, M., and Ueda, K., 1989, Medullary reticular neurons in the Japanese toad: morphologies and excitatory inputs from the optic tectum, J. Comp. Physiol. A 166: 7-22.

Maunsell, J.H.R., and van Essen, D.C., 1983, The connections of the middle temporal visual area (MT) and their relationship to a cortical hierarchy in the macaque monkey, J. Neurosci., 3: 2563-2586.

Mitchell, J.S.B., 1990, Algorithmic approaches to optimal route planning, SPIE Conf. on Mobile Robots.

Motter, B.C., and Mountcastle, V.B., 1981, The functional properties of the light-sensitive neurons in the posterior parietal cortex studied in waking monkeys: foveal sparing and opponent vector organization, J. Neurosci. 1: 3-26.

Mustard, B.E., and Lee, R.G., 1987, Relationship between EMG pattern and kinematic properties for flexion movements at the human wrist, Exp. Brain Res. 66: 247-256.

Nagel, H.-H., 1983, Constraints for the estimation of displacement vector fields from image sequences, Proceedings IJCAI-83, Karlsruhe, F.R.G., pp. 545-558.

Nakayama, K., 1981, Differential motion hyperacuity under conditions of common image motion, Vision Res. 21: 1475-1482.

Nakayama, K., 1985, Biological motion processing: a review, Vision Res. 25: 625-660.

Nelson, B., 1978, The Gannet, Berkhamsted: T. & A.D. Poyser.

Nevatia, R., and Price, K., 1989, Research in knowledge-based vision techniques for the autonomous land vehicle program (IRIS-255), Los Angeles: University of Southern California.

Nieuwenhuys, R., and Opdam, P., 1976, Structure of the brain stem, in Frog Neurobiology, R. Llinas and W. Precht, eds., NY: Springer-Verlag, pp. 811-855.

Nishikawa, K., and Roth, G., 1991, The mechanism of tongue protraction during prey capture in the frog Discoglossus pictus, J. Exp. Biol. 159: 217-234.

Nishikawa, K., Anderson, C.W., Deban, S.M., and O'Reilly, J.C., 1992, The evolution of neural circuits controlling feeding behavior in frogs, Brain Behav. Evol. 40: 125-140.

Nishikawa, K., and Gans, C., 1992, The role of hypoglossal sensory feedback during feeding in the marine toad, Bufo marinus, J. Exp. Zool. 264: 245-252.

Norton, A.L., Spekreijse, H., Wagner, H.G., and Wolbarsht, M.L., 1970, Response to directional stimuli in retinal preganglionic units, J. Physiol. 206: 93-107.

Paulignan, Y., MacKenzie, C., Marteniuk, R., and Jeannerod, M., 1991a, Selective perturbation of visual input during prehension movements: 1. The effects of changing object position, Exp. Brain Res. 83: 502-512.

Paulignan, Y., Jeannerod, M., MacKenzie, C., and Marteniuk, R., 1991b, Selective perturbation of visual input during prehension movements: 2. The effects of changing object size, Exp. Brain Res. 87: 407-420.

Pomerleau, D.A., 1990, Neural network based autonomous navigation, in Vision and Navigation, E. Thorpe, ed., Boston: Kluwer Academic Publishers, pp. 83-93.
Prager, J.M., and Arbib, M.A., 1983, Computing the optic flow: the MATCH algorithm and prediction, Comp. Vision, Graphics and Image Proc. 24: 271-304.

Prazdny, K., 1980, Egomotion and relative depth map from optic flow, Biol. Cybernet. 36: 87-102.

Regan, D., 1986, Visual processing of four kinds of relative motion, Vision Res., 26: 127-145.

Regan, D., and Beverly, K.I., 1978, Looming detectors in the human visual pathway, Vision Research, 18: 415-421.

Reichardt, W.E., 1969, Movement perception in insects, in: Processing of Optical Data by Organisms and Machines, W. Reichardt, ed., NY: Academic, pp. 465-493.

Reichardt, W.E., and Guo, A.-K., 1986, Elementary pattern discrimination (behavioral experiments with the fly Musca domestica), Biol. Cybernet. 46: 1-30.

Riseman, E.M., and Hanson, A.R., 1987, A methodology for the development of general knowledge-based vision systems, in Vision, Brain and Cooperative Computation, M.A. Arbib and A.R. Hanson, eds., Bradford Book/MIT Press.

Rizzolatti, G., Scandolara, C., Matelli, M., and Gentilucci, M., 1981, Afferent properties of periarcuate neurons in macaque monkeys, II. Visual responses, Behav. Brain Res. 2: 147-163.

Robinson, D.A., 1981, Control of eye movements, in Handbook of Physiology, J.M. Brookhart, V.B. Mountcastle, V.B. Brooks, and B.L. Roberts, eds., Bethesda: American Physiology Society, pp. 1275-1320.

Roche, J.M., and Comer, C.M., 1993, The spatial organization and neural basis of visually triggered escape in the frog, Society for Neuroscience Abstracts, p. 150.

Rosenfeld, A., 1984, Image analysis: problems, progress and prospects, Pattern Recognition, 17: 3-12.

Rumelhart, D.E., and Norman, D.A., 1982, Simulating a skilled typist: a study of skilled cognitive-motor performance, Cognitive Science 6: 1-36.

Saito, H.-A., Yukie, M., Tanaka, K., Hikosaka, K., Fukada, Y., and Iwai, E., 1986, Integration of direction signals of image motion in the superior temporal sulcus of the macaque monkey, J. Neurosci., 6: 145-157.

Sakata, H., Shibutani, H., Kawano, K., and Harrington, T.L., 1985, Neural mechanisms of space vision in the parietal association cortex of the monkey, Vision Research, 25: 453-463.

Satou, M., Takeuchi, H., and Ueda, K., 1989, Tongue-muscle-controlling motoneurons in the Japanese toad: neural inputs from the thalamus, Brain Res. 481: 39-46.

Savelsbergh, G., Whiting, H., and Bootsma, R., 1991, Grasping tau, J. Experimental Psychology: Human Perception and Performance 17: 315-322.

Schiff, W., 1965, Perception of impending collision, Psychological Monographs, 79: 1-26.

Schiff, W., and Detwiler, M.L., 1979, Information used in judging impending collision, Perception 8: 647-658.

Schiff, W., and Oldak, R., 1990, Accuracy of judging time to arrival: effects of modality, trajectory, and gender, J. Experimental Psychology: Human Perception and Performance, 16: 303-316.

Schmidt, R.A., Sherwood, D.E., and Walter, C.B., 1988, Rapid movement with reversals in direction: 1. The control of movement time, Exp. Brain Res. 69: 344-354.

Schomaker, L.R.B., 1992, A neural oscillator-network model of temporal pattern generation, Human Movement Science, 11: 181-192.

Schurg-Pfeiffer, E., 1989, Behavior-correlated properties of tectal neurons in freely moving toads, in Visuomotor Coordination: Amphibians, Comparisons, Models, and Robots, J.-P. Ewert and M.A. Arbib, eds., NY: Plenum Press, pp. 451-480.
Schwippert, W., Beneke, T., and Framing, E., 1989, Visual integration in bulbar structures of toads: intra/extra-cellular recording and labeling studies, in Visuomotor Coordination: Amphibians, Comparisons, Models, and Robots, J.-P. Ewert and M.A. Arbib, eds., NY: Plenum Press, pp. 481-536.

Schwippert, W., Beneke, T., and Ewert, J.-P., 1990, Responses of medullary neurons to moving visual stimuli in the common toad: II) An intracellular recording and cobalt-lysine labeling study, J. Comp. Physiol. A, 167: 509-520.

Scudder, C.A., 1988, A new local feedback model of the saccadic burst generator, J. Neurophysiol. 59: 1455-1475.

Selverston, A.I., and Moulins, M., eds., 1987, The Crustacean Stomatogastric System, NY: Springer-Verlag.

Sereno, M.I., and Sereno, M.E., 1991, Learning to see rotation and dilation with a Hebb rule, in Advances in Neural Information Processing Systems 3, R.P. Lippmann, J. Moody, and D.S. Touretzky, eds., San Mateo, CA: Morgan Kaufmann Publishers, pp. 320-326.

Servan-Schreiber, D., Printz, H., and Cohen, J., 1990, A network model of catecholamine effects: gain, signal-to-noise ratio, and behavior, Science, 249: 892-895.

Sharma, R., 1992, Active vision in robot navigation: monitoring time-to-collision while tracking, Proc. Intl. Conf. Intelligent Robots and Systems, Raleigh, NC, pp. 2203-2208.

Sherwood, D.E., Schmidt, R.A., and Walter, C.B., 1988, Rapid movement with reversals in direction: 2. The control of movement amplitude and inertial load, Exp. Brain Res. 69: 355-367.

Siegel, R.M., and Andersen, R.M., 1986, Motion perceptual deficits following ibotenic acid lesions of the middle temporal area (MT) in the behaving rhesus monkey, Society for Neuroscience Abstracts, 12: 1183.

Simpson, W.A., 1993, Optic flow and depth perception, Spatial Vision, 7: 35-75.

Smeraski, C., and Grobstein, P., 1991, Directed movement in the frog: electrophysiological studies of a tecto-tegmental pathway, Soc. Neuroscience Abstracts, 17: 1578.

Sokoloff, A.J., 1991, Musculotopic organization of the hypoglossal nucleus in the grass frog, Rana pipiens, J. Comp. Neurol. 308: 505-512.

Sparks, D.L., and Mays, L.E., 1990, Signal transformations required for the generation of saccadic eye movements, Ann. Rev. Neurosci., 13: 309-336.

Schwartz, J.T., Sharir, M., and Hopcroft, J., 1987, Planning, Geometry and Complexity of Robot Motion, Norwood, NJ: Ablex.

Tanaka, K., Fukada, Y., and Saito, H.-A., 1989, Underlying mechanisms of the response specificity of expansion, contraction, and rotation cells in the dorsal part of the medial superior temporal area of the Macaque monkey, J. Neurophysiol. 62: 642-656.

Teeters, J.L., 1989, A Simulation System for Neural Networks and Model for the Anuran Retina, Technical Report 89-01, Center for Neural Engineering, Univ. of Southern California.

Teeters, J.L., and Arbib, M.A., 1991, A model of anuran retina relating interneurons to ganglion cell responses, Biol. Cybern. 64: 197-207.

Tilove, R.B., 1990, Local obstacle avoidance for mobile robots based on the method of artificial potentials, Proc. IEEE Intl. Conf. Robotics and Automation, pp. 566-571.

Todd, J.T., 1981, Visual information about moving objects, J. Exp. Psychol. Hum. Percept. Perform. 7: 795-810.

Tresilian, J., 1991, Empirical and theoretical issues in the perception of time to contact, J. Experimental Psychology: Human Perception and Performance 17: 865-876.
Tsai, H.-J., and Ewert, J.-P., 1987, Edge preference of retinal and tectal neurons in common toads (Bufo bufo) in response to worm-like moving stripes: the question of behaviorally relevant 'position indicators', J. Comp. Physiol. A, 161: 295-304.

Ungerleider, L.G., Desimone, R., and Mishkin, M., 1982, Cortical projections of area MT in the macaque, Soc. Neurosci. Abstr. 8: 680.

Vaina, L.M., 1989, Selective impairment of visual motion interpretation following lesions of the right occipito-parietal area in humans, Biol. Cybern., 61: 347-359.

van Doorn, A.J., and Koenderink, J.J., 1975, Visibility of moving gradients, Biol. Cybernet. 44: 167-175.

Van Essen, D.C., and Maunsell, J.H.R., 1983, Hierarchical organization and functional streams in the visual cortex, Trends in Neurosciences, 6: 370-375.

van Sonderen, J.F., Gielen, C.C.A.M., and Denier van der Gon, J.J., 1989, Motor programs for goal-directed movements are continuously adjusted according to changes in target location, Exp. Brain Res. 78: 139-146.

Wagner, H., 1982, Flow-field variables trigger landing in flies, Nature 297: 147-148.

Wang, D.L., and Arbib, M.A., 1991, How does the toad's visual system discriminate different worm-like stimuli? Biol. Cybern. 64: 251-261.

Wang, H., Mathur, B., and Koch, C., 1989, Computing optical flow in the primate visual system, Neural Computation 1: 92-103.

Wang, Y., and Frost, B., 1992, Time to collision is signalled by neurons in the nucleus rotundus of pigeons, Nature 356: 236-238.

Warren, C.W., 1990, Multiple robot path coordination using artificial potential fields, Proc. IEEE Intl. Conf. Robotics and Automation, pp. 500-505.

Watanabe, S., and Murakami, M., 1984, Synaptic mechanisms of directional selectivity in ganglion cells of frog retina as revealed by intracellular recordings, Japanese Journal of Physiology 34: 497-511.

Weerasuriya, A., 1983, Snapping in toads: some aspects of sensorimotor interfacing and motor pattern generation, in Advances in Vertebrate Neuroethology, J.-P. Ewert, R.R. Capranica, and D.J. Ingle, eds., NY: Plenum Press, pp. 613-627.

Weerasuriya, A., 1989, In search of the pattern generator for snapping in toads, in Visuomotor Coordination: Amphibians, Comparisons, Models, and Robots, J.-P. Ewert and M.A. Arbib, eds., NY: Plenum Press, pp. 589-614.

Weerasuriya, A., 1991, Motor pattern generators in anuran prey capture, in Visual Structures and Integrated Functions, M.A. Arbib and J.-P. Ewert, eds., Research Notes in Neural Computing, NY: Springer-Verlag, pp. 255-270.

Weitzenfeld, A., 1991, NSL, Neural Simulation Language, Technical Report 91-05, Center for Neural Engineering, Univ. of Southern California.

Wing, A.M., Turton, A., and Fraser, C., 1986, Grasp size and accuracy of approach in reaching, J. Motor Behav., 18: 245-260.

Winter, D.A., 1979, Biomechanics of Human Movement, NY, NY: John Wiley & Sons.

Zhu, D., and Latombe, J.-C., 1991, New heuristic algorithms for efficient hierarchical path planning, IEEE Trans. Robotics and Automation, 7: 9-20.

APPENDIX

A. The leaky integrator model

Each neuron in our model is represented by two quantities: its membrane potential and its firing rate. The membrane potential of a neuron is described by the following differential equation:

    τ dm(t)/dt = -m(t) + S(t),    (1)

where m(t) denotes the membrane potential at time t, τ is the time constant of the membrane potential, and S(t) represents the total input received by the neuron. The solution of this equation is obtained numerically by using the Euler method.
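As an illustration only (the model itself is written in NSL, not Python, and the input signal, step size, and threshold below are arbitrary choices rather than values from the model), the Euler integration of equation (1) and the thresholding of the membrane potential can be sketched as follows.

    import numpy as np

    def euler_leaky_integrator(S, tau=0.03, dt=0.001, m0=0.0):
        """Integrate tau * dm/dt = -m + S(t) with the Euler method.

        S   : array of input samples, one per time step
        tau : membrane time constant in seconds
        dt  : integration step in seconds
        """
        m = np.empty(len(S))
        m_prev = m0
        for i, s in enumerate(S):
            # Euler step: m(t+dt) = m(t) + (dt/tau) * (-m(t) + S(t))
            m_prev = m_prev + (dt / tau) * (-m_prev + s)
            m[i] = m_prev
        return m

    def firing_rate(m, threshold=0.0):
        """Simple rectifying 'thresholding function': zero below threshold."""
        return np.where(m < threshold, 0.0, m)

    # A step input charges the membrane toward the input value with time constant tau.
    inp = np.concatenate([np.zeros(100), np.ones(400)])
    potential = euler_leaky_integrator(inp, tau=0.03, dt=0.001)
    rate = firing_rate(potential)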
In NSL, equation (1) is written as

    m := S.

The firing rate of a neuron is approximated by passing the membrane potential through the "thresholding function" below:

    f(x) = 0 if x < 0; x otherwise.    (2)

Each layer of neurons is represented by a 2-dimensional array. The connection between two layers of neurons is defined by an interconnection mask storing synaptic weights. A mask is applied via a spatial convolution, denoted by the symbol *.

B. The mathematical formulation of the model

The model of the R3 and R4 neurons is based on the retina model developed by Teeters and Arbib (1991). Here we only give the mathematical formulation of the neurons beyond the retina.

T3: This layer of neurons detects expanding edges (via R3 input) as well as the continuity inside these edges (via R4 input):

    T3 := W1*R3f + W2*R4f,    (3)

where R3f and R4f are the firing rates of the R3 and R4 retinal ganglion cells, respectively. It is difficult to estimate the time constant from the cellular recordings of the T3 neurons (Fig. 2A). The elapsed time between the onset of a stimulus (e.g., light being turned on or off, or an object starting to approach) and the occurrence of the first spike varies from about 200 ms to 260 ms, averaging about 230 ms. The relay time through the retina ranges from about 60 ms to 200 ms (Hartline 1940, Barlow 1953, Dowling 1976, Nagano et al. 1988). In our current formulation, the slowest case is chosen for the relay time (i.e., 200 ms). The time constant is set to the difference between the average response time of the T3 neurons and this longest relay time, that is, 30 ms.

W1 and W2 are 5x5 matrices. The size of the mask is chosen based on the size of the ERF of the R3 (7°), R4 (12°), and T3 (30°) neurons. There is considerable overlap in the receptive fields of the retinal ganglion cells. Assuming a 50% overlap, every retinal ganglion cell extends, on average, the ERF of a T3 neuron by about 5°, except the first ganglion cell, which contributes the full 10°. Therefore, by connecting a T3 neuron to 5 ganglion cells in one dimension, we obtain 10° + 4x5° = 30° as the range of the ERF. The entries of W1 and W2 are discrete values which approximate the functions defined below:

    W1(x,y) = K - (a / (2πσxσy)) exp(-(x²/(2σx²) + y²/(2σy²))),    (4)

where a is a scaling factor of the synaptic weights and σ controls the width of the Gaussian window (5x5 in this case). W1 is an on-surround, off-center (upside-down Gaussian) mask, illustrated in Fig. 3A, with K = 0.8, a = 15, and σx = σy = 2. It consists of a sharp negative peak at the center and a broader positive surround.

    W2(x,y) = (a / (2πσxσy)) exp(-(x²/(2σx²) + y²/(2σy²))),    (5)

where a = 0.1 and σx, σy are the same as above.

The firing rate of T3 neurons varies from 20 to 25 spikes/s (Fig. 2A) and is thus defined in the model as

    T3f = 0 if T3 < 20; 25 otherwise,

where the threshold of 20 is chosen such that a stimulus moving at a constant distance will not elicit any firing in the T3 neurons.

T2: This layer of neurons detects the shift in the T3 firing pattern:

    T2 := K*T3f - W1*T3_delayf,    (6)

where the time constant = 50 ms, K = 1.2, and T3_delayf is the delayed signal from the T3 neurons. There are no data on this type of T2 neuron, which is sensitive to crossing stimuli only; therefore, the parameters here are chosen to obtain the desired direction selectivity. The mask, W1, has positive values in the preferred direction and negative values in the null direction.
    W1 = k1*cos(x) + k2*sin(2y), -2 ≤ x ≤ 2 and -1 ≤ y ≤ 1,    (7)

where k1 = 2.0 and k2 = 0.6 are scaling factors of the excitatory and the inhibitory weights, respectively. Note that the synaptic weights are positive for connections in the preferred direction and negative in the null direction.

The characteristic of the T2 neurons depends on the length of the delay of the T3 signals. For example, when the delay is short, only a faster moving object can elicit a response in the T2 neurons. However, the T2 characteristic also depends spatially on the connectivity with the delay units. For example, given a delay time, if the excitatory connections from the delay units to the T2 neurons are shifted one cell to the right, an object moving at a speed that previously could activate a T2 response will fail to do so, but one that moves at a slightly higher speed will (Fig. 5B). Therefore, one may replace the temporal dependence with the spatial dependence (or vice versa). That is, by adjusting the connections from the T3 neurons and the delay units to the T2 neurons, one may tune the velocity sensitivity of the T2 neurons even if the delay time is fixed. This is why, instead of a one-to-one connection, a 5x3 mask is used, so that the T2 neurons can be activated by a crossing stimulus as long as its speed falls within some range (about 5°/s to 20°/s in the current simulation).

The firing rate of T2 neurons is defined as T2f = g(T2,kx1,ky1,ky2), where kx1 = 36.0, ky1 = 0.0, and ky2 = 37.5.

T6: This layer of neurons monitors the stimulus activity in the upper visual field (for aerial predators). This is done by convolving R3 and R4 with a mask which has higher weights for higher positions in the visual field:

    T6 := W*R3f + W*R4f,    (8)

where W is a 1x21 vector that lets the ERF of T6 cover the entire visual field in the rostro-caudal dimension. The value of the vector increases in correspondence with the vertical position of the visual field, i.e.,

    W = kY,    (9)

where k = 0.1 and Y is the coordinate of the vertical axis. There are no data available to estimate the time constant of the T6 neurons; therefore it is set to be the same as that of the T3 neurons, i.e., 30 ms. With no available data to constrain the parameters, the firing rate of T6 neurons is defined to be the same as its membrane potential with a threshold at 0: T6f = g(T6,kx1,ky1,ky2), where kx1 = 0, ky1 = 0, and ky2 = 0.

DEPTH: This is modeled as a black box which computes the distance of a stimulus. The distance depends on two parameters, the initial distance (init_dist) and the speed, v, of the stimulus:

    DEPTH = init_dist - vt,    (10)

where t is the current simulation time. The firing rate of the depth perception neurons is inversely proportional to the distance of the stimulus:

    CLOSENESS = k/DEPTH,    (11)

where k = 100.

TH6: This layer of neurons carries out the recognition of a looming stimulus, whereas T3 just recognizes an expanding pattern. The signals from T3, T6, and DEPTH converge onto this layer:

    TH6 := W1*T3f + W2*T6f + W3*CLOSENESS,    (12)

where the time constant = 30 ms and W3 = 3.0. W1 and W2 are defined similarly to Eq. (9), such that the ERF of the TH6 neurons covers the entire rostro-caudal extent of the visual field, with the coefficient k = 0.03 for W1 and k = 0.01 for W2. The firing rate of the TH6 neurons increases from about 10 spikes/s to more than 50 spikes/s (Fig. 2B) and is thus defined as TH6f = g(TH6,kx1,ky1,ky2), where kx1 = 10.0, ky1 = 0.0, and ky2 = 0.0.
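The convergence of expansion (T3), upper-field (T6), and closeness (DEPTH) signals onto TH6 can be illustrated with the following Python sketch. It is only a schematic rendering of equations (10)-(12): the weights are collapsed to scalars, the input firing rates and the approach speed are made-up numbers, and the simple threshold stands in for the NSL function g.

    import numpy as np

    def closeness(init_dist, speed, t, k=100.0):
        """Eqs. (10)-(11): distance shrinks linearly, closeness grows as 1/distance."""
        depth = max(init_dist - speed * t, 1e-3)   # clamp to avoid division by zero
        return k / depth

    def th6_drive(t3f, t6f, close, w1=0.03, w2=0.01, w3=3.0):
        """Eq. (12) with scalar weights: drive from expansion, elevation, and closeness."""
        return w1 * t3f + w2 * t6f + w3 * close

    def th6_rate(drive, threshold=10.0):
        """Rectified firing rate with a threshold, as for the other cell types."""
        return max(drive - threshold, 0.0)

    # An object starting 20 cm away and approaching at 40 cm/s:
    for t in np.arange(0.0, 0.45, 0.1):
        c = closeness(init_dist=20.0, speed=40.0, t=t)
        drive = th6_drive(t3f=25.0, t6f=5.0, close=c)
        print(f"t = {t:.1f} s  closeness = {c:6.1f}  TH6 rate = {th6_rate(drive):6.1f}")

The sketch only shows the qualitative point of the TH6 layer: an expanding pattern alone gives a modest drive, but the drive grows sharply as the closeness signal rises during an approach.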
TH3: This layer of neurons shows maximal activity when the stimulus fills its entire receptive field (about 30° to 46°). This is achieved by integrating R3 and R4 inputs with homogeneous synaptic weights:

    TH3 := W1*R3f + W2*R4f,    (13)

where W1 and W2 are 7x7 matrices. The choice of the size of the matrix of synaptic weights is based on the same calculation described for the T3 neurons (10° + 6x5° = 40°). W1(i,j) = 1.2 and W2(i,j) = 0.02, for 1 ≤ i, j ≤ 7. There are no data to constrain the time constant, so it is set to be the same as that of the T3 neurons. The firing rate of TH3 neurons is defined as TH3f = g(TH3,kx1,ky1,ky2), where kx1 = 335.0, ky1 = 0.0, and ky2 = 0.0, such that it increases, after overcoming the threshold, as the membrane potential increases.

Motor Heading Map: The gating of T3 signals onto the motor heading map by the T2 neuronal activity is shown in Fig. 6. The NSL implementation is basically by connecting the T2 and T3 neurons to the appropriate locations on the heading map as illustrated in the figure.

    Left_Ipsilateral_Map(max-i) = W1*Left_T3(i) - W2*Left_T2(i),    (14)

where max is the dimension of the map, W1 = 2.5, and W2 = 2.0.

    Right_Cut_Back_Map(max-i) = W1*Left_T3(i) + W2*Left_T2(i),    (15)

where W1 = 1.5 and W2 = 1.0.

JUMP: This represents the motor schema for a jumping movement. It receives inputs from the T3, TH6, and TH3 neurons. There are no data to constrain the parameters of the JUMP and DUCK schemas. The parameters are chosen such that the JUMP schema will win the competition if the looming stimulus is relatively large or approaching from a lower elevation, whereas if a small object is looming from a higher position in the visual field, the DUCK schema will be elicited.

    JUMP := W1*T3f + W2*TH3f + W3*TH6f - DUCKf,    (16)

where the time constant = 50 ms. W1, W2, and W3 are vectors of length 5, where W1(i) = 0.005, W2(i) = 0.01, and W3(i) = 0.06, for 1 ≤ i ≤ 5. The firing rate of the JUMP command neurons is defined as JUMPf = g(JUMP,kx1,ky1,ky2), where kx1 = 25.0, ky1 = 0.0, and ky2 = 30.0.

DUCK: This represents the motor schema for a ducking movement. It receives inputs from the T6 and TH6 neurons.

    DUCK := W1*T6f + W2*TH6f - JUMPf,    (17)

where the time constant = 50 ms, W1 = 0.2, and W2 = 0.8. The firing rate of DUCK neurons is defined as DUCKf = g(DUCK,kx1,ky1,ky2), where kx1 = 25.0, ky1 = 0.0, and ky2 = 30.0.
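To show how the mutual inhibition in equations (16) and (17) lets one motor schema suppress the other, here is a minimal Python sketch. The excitatory inputs are lumped into a single arbitrary drive per schema, and the Euler relaxation and rectification are illustrative stand-ins for the NSL simulation, not its actual code.

    def schema_competition(jump_drive, duck_drive, steps=300, dt=0.001, tau=0.05, thr=25.0):
        """Leaky-integrator relaxation of eqs. (16)-(17): each schema receives its
        excitatory drive minus the other schema's firing rate (mutual inhibition)."""
        jump = duck = jump_f = duck_f = 0.0
        for _ in range(steps):
            jump += (dt / tau) * (-jump + jump_drive - duck_f)
            duck += (dt / tau) * (-duck + duck_drive - jump_f)
            jump_f = max(jump - thr, 0.0)   # schematic rectification, cf. kx1 = 25
            duck_f = max(duck - thr, 0.0)
        return jump_f, duck_f

    # The schema with the larger excitatory drive suppresses the other.
    print(schema_competition(jump_drive=60.0, duck_drive=40.0))   # JUMP wins
    print(schema_competition(jump_drive=30.0, duck_drive=55.0))   # DUCK wins

The design choice being illustrated is simply that subtracting each schema's output from the other's input yields a winner-take-all decision without any explicit comparison step.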