HUMAN VISUAL PERCEPTION OF CENTERS OF OPTIC FLOWS
by
Junkwan Lee
Copyright 2012 Junkwan Lee
____________________________________________________________________
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
BIOMEDICAL ENGINEERING
May 2012
ACKNOWLEDGEMENTS
It was a long journey. I was looking at somewhere higher, but I did not know
how to reach it. It took me quite a while to learn how to prepare real stepping stones
and step on them, instead of stepping on air. I am deeply indebted to two people in that
regard. I thank my advisor Norberto for, besides all his intellectual guidance along the
way, his endless encouragement about what I had done when I was discouraged by
what I could not do. I also thank Eun-Jin for always patiently listening to my
rants on why A is not, and in the end pointing to why A could be.
I thank Dr. Bartlett Mel and Dr. Bosco Tjan for their critical review and
constructive suggestions on my PhD work, which only made it better and stronger. I
thank Dr. Michael Khoo for his generous support and belief in me when they were
most needed.
I thank Bosun, who was another family to me and treated me as one for the last
several years. We met abroad as strangers, and he has been my best friend and my
brother ever since. I thank Xiwu; it was a great experience to grow with you. I thank
Markus and Winston; many of my most pleasant memories here are with you guys.
I thank dad, mom and my two younger brothers, who always believed in and
supported their son and elder brother. I love you all.
And I love you, DDong, my friend now resting in my heart.
TABLE OF CONTENTS
Acknowledgements
List of Figures
Abstract
Chapter 1: General Introduction
Optic Flow
Elementary Flows for Analysis of Optic Flows
Human Perception of the Parameters of Elementary Optic Flows
Roadmap of the Dissertation
Chapter 2: Dynamics of Center Estimation for Rotation and Expansion Optic Flows
Introduction
General Methods
Experiment 1: Rapid Visual Presentation of Randomly-Positioned Centers
Experiment 2: Pointing to Centers with Varying Presentation Times
Experiment 3: Perception of Center in a Mixture of Rotation and Translation
Discussion
Chapter 3: Perception of Optic Flow Centers with Masked Rotations and Expansions
Introduction
Masking Experiment
Simulation Study
Discussion
Chapter 4: An Interpretation of the Motion-Opponency Model in the Bayesian Framework
Introduction
Bayesian Formulation of True Motion Estimation
Cosine Weighting as a Generative Model of Aperture-Constrained Measurements
Posterior Probability of Uni-directional Motion
Discussion
Chapter 5: Conclusion
Summary of Findings
Implications / Future Areas of Research
Bibliography
Appendix: Mathematical Details of the Two Center Estimation Models
LIST OF FIGURES
Fig 1.1. Four elementary flows
Fig 2.1. Schematics of the RSVP experiments
Fig 2.2. Duration thresholds for correct detection of target in RSVP sequences
Fig 2.3. Example of pointing data
Fig 2.4. Pointing-discrimination thresholds
Fig 2.5. Ambiguity in the instantaneous velocity fields in the mixture of rotation and translation
Fig 2.6. Stimuli configuration for Experiment 3
Fig 2.7. Vertical component of the last position of the perceived rotation center
Fig 2.8. Normalized sensitivity of focus and rate-of-expansion perception
Fig 3.1. Illustration of masked stimuli
Fig 3.2. Representative example of pointing data
Fig 3.3. Results of linear regression on pointing data
Fig 3.4. Schematics of the models tested
Fig 3.5. Simulation results of the models on masking experiments
Fig 3.6. Details of how the models respond to different masking profiles
Fig 3.7. Directional differences between two expansions with different centers
Fig 4.1. Direction likelihoods
Fig 4.2. Relationship between the likelihood and the posterior under the two-class prior assumption
ABSTRACT
Optic flow contains rich information about the relative motion between an
observer and the world, and about the three-dimensional layout of the environment.
Elementary flows such as expansion and rotation were proposed as bases for analysis
of complex optic flow in the brain [93]. In this dissertation various aspects of human
visual perception of centers of expansion and rotation optic flows were investigated
using psychophysical probes and computational modeling. In the first section, the
temporal dependence of center perception in expansion and rotation optic flows
was measured by pointing and detection tasks. It was found that the dynamics of
center perception for rotation and expansion flows were fast. A significant percept of
center location had already developed by 100 ms, with the exact time course
depending on the spatial velocity profile of the motion fields. These time courses
were faster than those observed for the measurement of rates of expansion, and they
could explain an optical illusion related to a rolling-wheel-like motion
in a fronto-parallel plane. Mathematically, the brain would have enough
information to estimate the motion correctly by integrating multiple frames of such
flows. However, the subjects’ perception of the rotational center was strongly biased
toward the center of instantaneous flow even with extended viewing time. In the
second section, perception of the centers of expansion and rotation optic flows was
tested under various masking conditions with short stimulus presentations (100 ms).
Different error tendencies were observed depending on the spatial profile of the
masking. When a rectangular mask was presented with its long side aligned with
the direction in which center eccentricities changed (Parallel masking condition),
mean perceived eccentricities were unaffected, although response variance increased.
But when the rectangular mask was aligned so that center eccentricities changed
along its short side (Perpendicular masking condition), a strong bias in mean
perceived eccentricities was observed. This bias persisted even with a small square
mask, which was contained in both the parallel and perpendicular masks. Two
possible strategies for estimating centers of expansion and rotation optic flows were
tested to see whether they could account for the observed human data. In the motion-
opponency model, centers of such flows were found by locating the point where
motions were maximally balanced in any direction. In the template model, centers
were found by locating the point that best fitted expansion or rotation motion
templates. The simplest implementations of both strategies were simulated on
the same experimental setups of the various masking conditions. The results showed
that the motion-opponency model could reproduce important aspects of human
perception under masking conditions, whereas the template model could not explain
the observed human data. In the third section of the dissertation, a mathematical
formulation was presented, which provides a theoretical background for how local
motion direction and center location can be estimated together through the
motion-opponency operation. First, the motion-opponency computation can be thought of as
computing optimal uni-directional motion with a generative model of aperture-
constrained local motion measurements. Second, the motion-opponency model of
center estimation can be optimal in locating points that are less likely to arise from
uni-directional motion. The constraint intrinsic in the direction-likelihood formulation
allows direct comparison in terms of likelihoods between cross-measurements under
simple prior assumptions. Collectively, the results suggest an important contribution
of the brain area MT to the estimation of the centers of optic flows.
CHAPTER 1.
GENERAL INTRODUCTION
Optic flow
When I was young, I used to sit in the back of the car my father was driving.
My relatives lived in a city an hour's drive away from ours, and we had to
drive through an everlasting rural landscape to get there. I liked to watch the passing
scene outside. Everything retreated to my left, but at different speeds. Flowers and trees by
the highway flew away, leaving only traces of yellow, red and green. Cows in the
farms usually stayed a bit longer in my sight. And the mountains, they followed us!
Actually, they moved so fast that it took a long time for our car to outrun them. Now
that I am grown up, I sit in the driver's seat at the front of my car. How the scene moves
while I drive is very different from the one I used to watch when I was young. Now
things move in every direction. Things on my left keep going left until they disappear,
while those on my right exit toward my right. Lane marks and letters on the road
disappear downwards under the windshield. And the sky, it keeps growing taller.
The two examples above depict two different optic flows experienced when
an observer translates laterally or forward in the three-dimensional (3D) environment,
respectively. The term “Optic flow”, first coined by Gibson [15], denotes the pattern
of apparent motion of the points in the projected plane or sphere when there is a
relative motion between an observer and the world. The shape of the pattern is
determined by the relative motion itself as well as the layout and geometry of the
environment. As such, optic flow carries rich information that a human or animal
can use to interact effectively with its environment.
Some variables are relatively straightforward to extract from the flow. For example,
the apparent speed of local velocity tells about the depth of the objects when an
observer translates (Motion parallax; [15, 36, 40, 41]). In the above example, the
nearby trees move faster and the mountains move slower as they are farther away.
Also the focus of the radial flow indicates the heading direction of locomotion [15,
41]. It was shown that humans can integrate local-motion signals and retrieve the
location of the focus of expansion to infer heading direction within one or two
degrees of accuracy [81, 86]. The geometry of the scene can also be inferred. For
example, the 3D structure of a rigid object can be recovered up to affine similarity
from two frames [4, 37, 52, 74, 77], and up to Euclidean similarity from three frames
of known correspondences [74, 76, 77].
Elementary flows for analysis of optic flows
Of particular interest is the fact that the first-order local derivatives of an
instantaneous velocity field of optic flow carry rich information about the relative
motion and the geometry of the surface, such as the slant of a plane [34-36, 41].
Divergence, curl and the two axes of deformation form a mutually independent basis
set that is generally invariant to observer rotation. This raises the possibility that the
brain may locally decompose complex optic flows into the corresponding elementary
flows, i.e., expansion, rotation and deformations, for further analysis. Fig 1.1 shows
these four elementary flows. In fact, numerous psychophysical studies showed the
existence of channels specialized for such elementary flows [47, 57, 58] working
independently [13, 46, 72]. Cells sensitive to those components were also found in
the medial superior temporal area (MST) of the monkey brain [10, 16, 20, 48, 49, 70,
71].
Fig 1.1. Four elementary flows. A) Expansion, B) Rotation, C) and D) Deformation
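As an illustration, the four components can be read directly off the 2×2 Jacobian of the velocity field at a point. The following is a minimal sketch in pure Python; the function name and the example values are illustrative, not taken from the dissertation:

```python
def flow_invariants(jacobian):
    """First-order differential invariants of a 2D velocity field.

    `jacobian` is the 2x2 matrix of partial derivatives
    [[du/dx, du/dy], [dv/dx, dv/dy]] at a point. Returns the
    divergence, curl, and the two deformation components.
    """
    dudx, dudy = jacobian[0]
    dvdx, dvdy = jacobian[1]
    div = dudx + dvdy    # expansion
    curl = dvdx - dudy   # rotation
    def1 = dudx - dvdy   # deformation along the axes
    def2 = dudy + dvdx   # deformation along the diagonals
    return div, curl, def1, def2

# A pure expansion v = rho * (x, y) has Jacobian rho * I:
print(flow_invariants([[1.5, 0.0], [0.0, 1.5]]))  # (3.0, 0.0, 0.0, 0.0)
# A pure rotation has Jacobian omega * [[0, -1], [1, 0]]:
print(flow_invariants([[0.0, -2.3], [2.3, 0.0]]))  # (0.0, 4.6, 0.0, 0.0)
```

Note that a pure expansion activates only the divergence term and a pure rotation only the curl term, which is what makes the four components a natural basis for decomposing a complex flow.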
Human perception of the parameters of elementary optic flows
Expansion and rotation flows are defined by two parameters. One is the
speed of the motion, i.e., rate of expansion and angular velocity of rotation
respectively. The other is the position of center of expansion and rotation. Humans
can make fine discriminations of the rate of expansion [91] and of angular velocity [2]. It
was shown that subjects can judge correctly how fast global motions are
even when local velocities contradict the speed of the global motion. These studies also
showed the importance of estimating the center position of the flows, because
mislocalization of the center causes a systematic bias in the perception of the
expansion rate. Similarly, mislocalization of the center of 2D rotation causes errors
in the estimations of angular velocities.
Yuille & Grzywacz proposed a theoretical framework for how visual motion
may be processed in the brain [93]. The theory explains how the brain may fit internal
models of motion to various regions of visual input in a Bayesian-optimal way.
Clustering can be done in the parameter space of familiar internal models, or by
non-parametric statistical tests. Thus, the theory predicts that for processing of
familiar motion models such as the elementary flows, both of the motion parameters
– the rate and the center position of elementary flows – are estimated simultaneously.
Perception of centers was also studied in terms of heading perception.
Heading-perception studies often involve complex flows. They may contain both
observer 3D rotation and 3D scenes subtending a range of depths [1, 61, 79, 84].
Computational models of human heading perception generally focus on correct
estimation of heading under those complex situations [3, 23, 27, 50, 51, 59, 60]. Thus
most models take the computation of locating the center of an expansion flow as given
when there is no observer rotation, only suggesting possibilities of simple means
such as finding intersections of vectors or template matching implemented by
population coding. But such explanations have not been thoroughly tested.
Roadmap of the dissertation
Various aspects of human visual perception of centers of expansion and
rotation optic flows were investigated with psychophysical probes and computational
modeling. The results have been described in the following three chapters.
The second chapter of the dissertation investigated the temporal dynamics of
how quickly the center percept develops in human visual perception of expansion and
rotation flow fields. For these measurements, two tasks were used: pointing
and detection tasks parameterized by center position with varying stimulus durations.
Center perception was also tested in a situation where different temporal-integration
windows would predict different center percepts. In the rolling-wheel-like motion of a
fronto-parallel plane, the center of the instantaneous velocity field and that of the
actual rotational motion are different. The brain can disambiguate the two only by
integrating multiple frames of such flows.
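The rolling-wheel ambiguity has a simple algebraic form: adding a lateral translation t to a fronto-parallel rotation of angular velocity ω about center c produces an instantaneous velocity field identical to a pure rotation about the shifted center c + Jt/ω, where J is the 90° rotation matrix. The sketch below (variable names are illustrative) checks this numerically with numpy:

```python
import numpy as np

J = np.array([[0.0, -1.0], [1.0, 0.0]])  # 90-degree rotation matrix

def rotation_field(points, center, omega):
    """Instantaneous velocities of a fronto-parallel rotation about `center`."""
    return omega * (points - center) @ J.T

omega = 1.5                      # angular velocity (rad/sec)
c = np.array([0.0, 0.0])         # true rotational center
t = np.array([3.0, 0.0])         # added lateral translation (deg/sec)

rng = np.random.default_rng(0)
pts = rng.random((100, 2)) * 20 - 10   # sample points on the plane

# Rotation about c plus the translation t ...
combined = rotation_field(pts, c, omega) + t
# ... equals a pure rotation about the shifted center c + J t / omega:
shifted = c + J @ t / omega
print(np.allclose(combined, rotation_field(pts, shifted, omega)))  # True
```

Because a single frame of the combined flow is indistinguishable from a pure rotation, only integration across frames (during which the shifted center drifts while the true center does not) can reveal the translation.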
The third chapter describes human center-perception performance when
some portion of the stimuli was occluded with different masking profiles. Two
kinds of computational models of center estimation were tested in computer
simulations to see whether they could reproduce the observed human results.
The fourth chapter introduces a Bayesian optimality formulation that provides
one way of interpreting the motion-opponency model of center estimation introduced
in Chapter 3. The formulation explains the motion-opponency calculation used in the
model, as well as the optimality of the proposed center-estimation model.
The last chapter summarizes the findings and discusses their implications and
areas of future research.
CHAPTER 2.
DYNAMICS OF CENTER ESTIMATION FOR ROTATION
AND EXPANSION OPTIC FLOWS
Introduction
The time required to perceive the center of global flow, such as expansion,
has been studied in terms of heading perception. Heading-perception studies often
involve complex flows. They may contain both observer 3D rotation and 3D scenes
subtending a range of depths [1, 61, 84] (reviewed in [79, 82]). Without these
complexities, calculating heading is computationally equivalent to estimating the
center of the radial flow. In those simple cases, human performance of heading
discrimination requires about 300 ms to plateau [8]. Unfortunately, these
measurements of the time course of heading discrimination have not been published
in detail. Hence, we do not know the exact experimental conditions and the
dynamics of heading perception before 300 ms.
In another study, Hooge et al. [28] asked subjects to make a saccade towards
the perceived heading direction. They found that the error at the end of the saccade
leveled off at about 500 ms, putting the estimated processing time for the heading
direction at about 430 ms. In turn, van den Berg [78] measured positional lags in the perception
of laterally stepping radial flows. The processing time estimated from the difference
between the perceived and the actual heading of the last radial flow was between 300
and 600 ms. However, these measurements could only serve as the upper limit of the
estimated processing time. In Hooge et al. [28], the measured time was compounded
with the time required for the performance of eye movements which often requires
more than one saccade. In van den Berg [78], most subjects did not notice a change
of the center position when its velocity was more than 6°/sec, giving
total displacements of over 9° in 1.5 sec. The failure at these velocities may
have been due to the flash-like transition of the stepping radial flows. In them, all
dots were replaced across consecutive steps. Therefore, this stepping paradigm
appeared to fail to elicit precise center localization, which in turn may translate into
longer-than-actual estimated pooling time.
On the other hand, other studies hint at fast integration times for center
localization. For instance, a strong bias toward singularities in perceived heading
directions was observed with simulated eye-rotation stimuli [1, 61]. In these stimuli,
the optic flow simulating both observer translation and rotation is presented to the
fixated subjects. In many of such experimental set-ups, subjects’ heading responses
are not towards the true heading direction. Rather, subjects show a bias toward the
instantaneous singularities of the combined flow, as opposed to the true heading
direction. It is interesting to note that in simulated eye-rotation stimuli, the
instantaneous singularities drift over the screen throughout the entire stimulus
presentation. The fact that the subjects’ misperceived heading directions are strongly
correlated with the positions of instantaneous singularities may indicate possible
human sensitivity to the ever-drifting singularities, i.e., fast estimation of the
singularity positions.
In this chapter, the temporal characteristics of center perception were studied
in three experiments. In Experiment 1, threshold stimulus-presentation times for
detection of pre-cued center locations in expansion or rotation flows were
measured. The “target” flow was presented inside a random sequence of
“distracters.” The target and distracters were the same type of flow, but differed in
their center locations. In Experiment 2, change of sensitivity to the center location
was measured with increasing number of frames of rotation or expansion flow
stimuli. Perceived center locations were measured by direct pointing as stimulus-
presentation times were varied. Signal-detection-theory-like measures of
discrimination threshold were derived from pointing data to investigate the profile of
threshold change according to the stimulus-presentation times. In Experiment 3, the
perception of the location of the rotational center was investigated when a fronto-parallel dot
field underwent a rolling-wheel-like motion of simultaneous rotation and translation.
In this stimulus, the location of the rotational center, as well as the singularity of the
instantaneous velocity field, changes continuously over the presentation time.
General Methods
STIMULI
Random-dot motion stimuli were generated using Psychotoolbox MATLAB
extension and were presented on a 35 cm × 26 cm CRT display. The stimuli were
viewed binocularly from 40 cm away, so that the stimulus area subtended 48° × 35°.
Stimuli were composed of 1675 dots, giving a dot density of approximately 1
dot/deg². Each dot had a diameter of 12′ and zero luminance, while the background
luminance was 26.5 cd/m². Using dark dots minimized the undesirable static
luminance-streak cue caused by the slow passive decay of the CRT phosphor. The random-
dot field was refreshed every frame at 85 Hz. Each dot had a motion lifetime of nine
frames or 106 ms. After reaching the end of their lifetime, dots died and reappeared
in random position in the next frame to maintain their total number constant. The
phases of the dots inside the nine-frame cycle were randomly staggered to avoid
luminance blinking. Finally, dots were replaced in random positions when they
moved out of the stimulus area.
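The dot-lifetime bookkeeping described above can be sketched as follows. This is a simplified illustration with hypothetical names, not the actual Psychtoolbox code; in particular, whether an out-of-bounds respawn also resets a dot's lifetime phase is an assumption here:

```python
import numpy as np

LIFETIME = 9                 # frames (106 ms at 85 Hz)
N_DOTS = 1675
WIDTH, HEIGHT = 48.0, 35.0   # stimulus area in degrees

rng = np.random.default_rng(0)
pos = rng.random((N_DOTS, 2)) * [WIDTH, HEIGHT]
# Stagger the lifetime phases so dots do not all die on the same frame,
# which would cause luminance blinking.
age = rng.integers(0, LIFETIME, size=N_DOTS)

def update(pos, age, velocity):
    """Advance dots one frame; respawn dead or out-of-bounds dots.

    `velocity` is an (N_DOTS, 2) array of per-dot velocities in deg/sec.
    """
    pos = pos + velocity / 85.0          # per-frame displacement at 85 Hz
    age = age + 1
    dead = age >= LIFETIME
    out = ((pos[:, 0] < 0) | (pos[:, 0] > WIDTH) |
           (pos[:, 1] < 0) | (pos[:, 1] > HEIGHT))
    respawn = dead | out
    # Respawn in random positions to keep the total dot count constant.
    pos[respawn] = rng.random((respawn.sum(), 2)) * [WIDTH, HEIGHT]
    age[dead] = 0
    return pos, age
```

Calling `update` once per frame keeps the dot count fixed while limiting each dot's coherent motion to nine frames.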
In Experiments 1 and 2, center perception was tested for three different velocity
configurations of global motion. Two were rigid motions with different
global rates, and one was a non-rigid motion. The rigid global motion
was that of either an approaching rigid fronto-parallel plane or a fronto-parallel plane
rotating in front of the observer about the viewing axis. The rates of global motion, ρ,
in the rigid-motion stimuli were fixed at ρ = 1.5 sec⁻¹ or ρ = 2.3 sec⁻¹. (These global
rates were either the rate of expansion or the angular velocity.) Dots moved with local
velocity determined by v = ρ × r, where r was the distance from the center of the
global motion. In the non-rigid global motion, local velocities were 12 deg/sec for all
dots; thus ρ effectively decreased with distance from the center. This velocity profile
gave a percept of non-rigid motion, such as the rotational motion in a whirlpool drain.
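The three velocity profiles can be summarized in a few lines. This sketch (function and parameter names are illustrative) computes per-dot velocities for the rigid rule v = ρ × r and for the non-rigid, constant-speed (12 deg/sec) rule:

```python
import numpy as np

def local_velocity(points, center, rho=1.5, kind="rigid-rotation"):
    """Per-dot velocity for the three profiles used in Experiments 1-2.

    'rigid-expansion' / 'rigid-rotation': speed grows as v = rho * r.
    'nonrigid-rotation': constant 12 deg/sec local speed, so the
    effective rate rho falls off with distance from the center.
    """
    d = points - center
    r = np.linalg.norm(d, axis=1, keepdims=True)
    radial = d / np.where(r > 0, r, 1.0)                # unit vector outward
    tangential = radial @ np.array([[0.0, 1.0],
                                    [-1.0, 0.0]])       # rotated 90 degrees
    if kind == "rigid-expansion":
        return rho * r * radial
    if kind == "rigid-rotation":
        return rho * r * tangential
    if kind == "nonrigid-rotation":
        return 12.0 * tangential
    raise ValueError(kind)
```

For a dot 2 deg from the center with ρ = 1.5 sec⁻¹, the rigid rules give a local speed of 3 deg/sec, whereas the non-rigid rule gives 12 deg/sec regardless of eccentricity.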
SUBJECTS
Three subjects with normal or corrected-to-normal visual acuity participated
in all three motion experiments. JK was one of the authors and the other two, JF and
CW, were naïve paid subjects.
Experiment 1: Rapid Visual Presentation of Randomly-Positioned
Centers
RATIONALE
The task of center localization was presented as a detection task using the
Rapid Serial Visual Presentation (RSVP) protocol [54, 55, 69]. We measured the
threshold stimulus duration that allowed correct detection of a target embedded
among distracters presented in a rapid sequence. The target and distracters were the
same type of flow, but differed only in their center locations.
METHODS
Two sequences of rapidly changing optic-flow patterns, whose centers
jumped around, were presented; the target appeared only once, in one of the two sequences. Subjects
were asked to make a 2-alternative forced choice (2AFC) of which sequence
contained the target. The target and the distracters were either all expansions or all
rotations. The center of the target lay anywhere between 2 and 10 deg eccentricity
on the 45˚ diagonal line crossing the first quadrant (upper right) of the screen (Fig.
2.1A). In turn, the distracters had centers in the diagonal lines passing the second,
third, or fourth quadrants. Each sequence comprised ten optic flows with random
centers. The target appeared only once in any of the second through ninth optic
flows, while the others were distracters (Fig. 2.1B). The duration of each optic flow
in the sequence, or the independent variable, was adjusted by a two-up-one-down
staircase for rigid stimuli and a three-up-one-down staircase for non-rigid stimuli.
For each test condition, two staircases were randomly interleaved to make each trial
less predictable from the previous one. One staircase started at a long interval (500
ms for rigid motion and 200 ms for non-rigid motion) while the other started at the
shortest one (2 frames or 24 ms). We tested each condition of the motion
(expansion/rotation and rigid/non-rigid) separately.
Fig 2.1. Schematics of the RSVP experiments. A) The center of the target optic
flow lay randomly at a position on the dashed line, while the centers of the
distracters lay on the dotted lines. B) Example of an RSVP sequence containing the
target, marked by an arrow. The white dots indicate the centers of the optic flows in
each sequence. Two sequences, each comprising ten random center
positions, were shown per trial. Only one of the two sequences contained the
target, which appeared once anywhere between the second and ninth intervals.
Subjects made a 2AFC judgment of which sequence contained the target.
All the responses of the two staircases for each condition were pooled together.
Then a maximum-likelihood estimate of the lognormal psychometric curve was
obtained. We defined the 75%-correct point of this curve as the threshold.
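A minimal version of the adaptive procedure might look as follows. The interpretation of "two-up-one-down" as "two successive correct responses shorten the duration, one error lengthens it" is an assumption here, as are the step size and the class name:

```python
class Staircase:
    """Minimal adaptive staircase for the RSVP duration threshold.

    Assumed rule: `n_correct` successive correct responses shorten
    the flow duration (harder); a single error lengthens it (easier).
    """

    def __init__(self, start_ms, step_ms=12.0, floor_ms=24.0, n_correct=2):
        self.duration = start_ms
        self.step = step_ms
        self.floor = floor_ms      # shortest duration (2 frames at 85 Hz)
        self.n_correct = n_correct
        self.streak = 0
        self.history = []          # (duration, correct) pairs for fitting

    def respond(self, correct):
        self.history.append((self.duration, correct))
        if correct:
            self.streak += 1
            if self.streak == self.n_correct:
                self.streak = 0
                self.duration = max(self.floor, self.duration - self.step)
        else:
            self.streak = 0
            self.duration += self.step

# Two interleaved staircases per condition, one starting long and one
# starting at the shortest duration, as in Experiment 1:
pair = [Staircase(start_ms=500.0), Staircase(start_ms=24.0)]
```

Pooling `history` from both staircases and fitting a lognormal psychometric function to the pooled responses then yields the 75%-correct threshold described in the text.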
Subjects were familiarized with the task with ten training trials at the longest
intervals (500 ms for rigid motion and 200 ms for non-rigid motion). We made sure
that the subjects had 100% correct responses at those intervals. Moreover, we
explicitly instructed them not to track each jump of center positions. Rather, they
had to attend to the target region, with their eyes always fixated at the mark on the
middle of the screen.
RESULTS
Fig 2.2. Duration-of-optic-flow thresholds for correct detection of the target in RSVP
sequences. Thresholds with their standard errors are shown. Inter-subject means and
their standard errors for each velocity profile are shown with bold diamond symbols.
Thresholds are not longer than 100 ms, being shorter both for larger global rates of
rigid motion and for non-rigid motion.
Threshold optic-flow durations for both rigid expansion and rigid rotation at
ρ = 1.5 sec⁻¹ were around 100 ms (Fig. 2.2). These thresholds decreased with faster
rates of rigid global motion, giving a value of about 60 ms for ρ = 2.3 sec⁻¹. The
thresholds were even lower for our non-rigid motion stimuli, being around 20 ms.
Experiment 2: Pointing to Centers with Varying Presentation Times
RATIONALE
Experiment 1 showed that very short presentation times (around or less than
100 ms) are enough for subjects to perform our detection task. In this experiment,
we measured more directly how sensitivity to the centers of rotation and expansion
flows develops with varying stimulus presentation times. Perceived center locations
were measured by direct pointing as presentation times varied. To quantify changing
sensitivity, we derived a signal-detection-theory-like measure of discrimination
threshold from pointing data.
METHODS
Random-dot motion fields of expansion or rotation with vertically displaced
centers were presented for 24, 35, 59, 94, 141, 212, 329, or 506 ms. We preceded
and followed these optic flows with a 750-ms and a 500-ms blank screen, respectively.
A fixation mark was present for both of the blank screens and for the motion.
Subjects were asked to maintain fixation when the mark was present. We removed it
after the post-motion blank screen and a mouse cursor then appeared at the location
of the fixation. We asked subjects to move the cursor to the perceived location of the
center. Subjects could move their eyes freely during this pointing phase.
Test positions of the centers were displaced vertically around the fixation. For each
stimulus presentation time, we tested thirty positions, equally spaced from 8 deg
below to 8 deg above fixation. Hence, we had 8 durations × 30 positions = 240 trials,
which we tested in random order in each experimental block for each stimulus
condition.
RESULTS
An example of the results with the center-pointing task is shown in Fig. 2.3.
Pointed positions showed apparent correlation with the actual centers (Fig. 2.3A
shows an example at 35-ms stimulus presentation time). We performed linear
regression to quantify this correlation. Trials deviating by more than 3 times the
standard deviation from the regression line were regarded as outliers and
discarded (< 1% of total trials). We then performed the linear regression again on the
refined data. Slopes from the refined linear regression were generally less than 1,
showing the typical bias toward the fixation reported elsewhere (Fig. 2.3B; [2, 28, 30,
73, 78, 91]). With increased stimulus presentation, the perception of the center
improved, as indicated by the slopes of the regression line getting closer to 1 (Fig.
2.3B) and the standard deviations of the residuals (STD) decreasing (Fig. 2.3C). To
quantify this improvement of sensitivity to the center position, we derived from the
linear regression results a simple variable defined as STD divided by the slope.
Higher STD meant larger spread from the mean perceived position. In turn, higher
slope meant better separation of perceived centers. Therefore, this variable had a
meaning analogous to 1 over d’ of signal-detection theory. We termed the new
variable “pointing-discrimination threshold”.
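The analysis pipeline can be sketched as below; the function name is illustrative, while the 3-SD outlier rule and the STD/slope definition follow the text:

```python
import numpy as np

def pointing_threshold(actual, perceived):
    """Pointing-discrimination threshold: residual STD / slope.

    `actual` and `perceived` are 1-D numpy arrays of center positions
    (deg). Fits perceived = slope * actual + offset, drops trials whose
    residual exceeds 3 standard deviations, refits, and returns the
    residual STD divided by the slope (analogous to 1/d'), along with
    the refitted slope and residual STD.
    """
    slope, offset = np.polyfit(actual, perceived, 1)
    resid = perceived - (slope * actual + offset)
    keep = np.abs(resid) <= 3 * resid.std()      # 3-SD outlier rule
    slope, offset = np.polyfit(actual[keep], perceived[keep], 1)
    resid = perceived[keep] - (slope * actual[keep] + offset)
    return resid.std() / slope, slope, resid.std()
```

For the example in Fig. 2.3A, a slope of 0.54 and a residual STD of 1.34 deg give 1.34 / 0.54 ≈ 2.48 deg.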
Fig 2.3. A) Example of pointing data for subject CW with rigid rotation of ρ = 1.5
sec⁻¹ presented for 35 ms. Every dot represents the perceived center position versus
the actual center position. The solid line is the linear regression, while the dashed
line represents the identity of perceived and actual positions. The slope from the
linear regression for this condition is 0.54 while the standard deviation of the
residual is 1.34 deg. These two values give a pointing-discrimination threshold of
2.48 deg. B) Slopes with 95% confidence intervals and C) STDs as a function of
stimulus duration for the same subject and rotation stimuli. Performance improves
with stimulus duration as evident from the increase in slope and decrease in STD.
Inter-subject means of pointing-discrimination thresholds for all tested
conditions appear in Fig. 2.4. Not surprisingly, the data showed that the threshold
falls with stimulus duration. For rigid motions, the threshold reached a statistical
plateau by the first 60 ms of stimulus presentation. This rapid fall in threshold
indicated that a significant amount of the percept of center location could be formed
with the first few frames of stimulation. In contrast, the threshold took longer to
reach a statistical plateau for non-rigid stimuli. For non-rigid expansions, the
threshold reached a statistical plateau by 100 ms, while for non-rigid rotations, the
threshold reached the plateau later. However, thresholds for non-rigid motions were comparable to or
lower than those for rigid stimuli for all tested conditions. Consequently, our non-
rigid stimuli elicited stronger percepts about center location at all presentation
durations. These results were in line with the observations in Experiment 1, which
yielded lower duration thresholds to detect targets in our non-rigid configurations
than in the rigid ones.
Fig 2.4. Inter-subject means and standard errors of pointing-discrimination
thresholds for expansion and rotation as a function of stimulus duration. The
thresholds can reach plateau performance quickly (in < 60 ms) for rigid motions.
In conclusion, both Experiments 1 and 2 show that the perception of optic-
flow center arises rapidly, typically not taking more than 100 ms and often reaching
a plateau sooner. Such center-estimation times are almost as short as those for the
measurement of local velocity [45, 67]. Therefore, one can say that the estimation of
the center of the optic flow is almost “instantaneous.” The brain obtains the center
as fast as practically possible.
[Fig. 2.4 plots: threshold (deg) vs. stimulus duration (ms, log scale), one panel for Expansion and one for Rotation, each with conditions Rigid r_o = 1.5, Rigid r_o = 2.3, and Nonrigid.]
Experiment 3: Perception of Center in a Mixture of Rotation and
Translation
RATIONALE
The first two experiments showed that the perception of optic-flow center is
practically instantaneous. This instantaneity would have implications for brain
computations requiring relatively long temporal integration. In Experiment 3, we
give an example of such a computation. In this example, the brain could obtain the
correct position of the center through long temporal integration, but would obtain a
wrong answer by performing an instantaneous computation. What does the brain do
when confronted with such a situation?
When a fronto-parallel plane rotates around an axis parallel to the observer's line of sight, a rotational flow arises on the observer's retina. The center of the rotational flow on the retina matches the center of the rotational motion of the plane. Here, we will call this projected center the true center. Interestingly, the center of the rotational flow and the true center start to deviate when one adds lateral translational motion to the plane, as in the motion of a rolling wheel. The instantaneous retinal flow produced
by this “rolling wheel” is another (perfect) rotation with the same angular velocity,
with its center shifted from the true center. The important consequence of this shift
is illustrated in Fig. 2.5. One can combine different translations and rotations of the
plane (Figs. 2.5A and B) to obtain a single retinal rotation (Fig. 2.5C). In other
words, the decomposition of a single instantaneous rotational flow into
corresponding rotational and translational components is degenerate. One can find
an infinite number of solutions for this decomposition. However, this decomposition
ambiguity could be resolved with time. In a wheel-like motion, the center of the
retinal flow translates in time in different amounts as determined by the magnitude of
the translation of the plane (Figs 2.5D and E). With extended time, one can
determine in principle the motion parameters of the rotation and translation of the
Fig 2.5. Ambiguity in the instantaneous velocity fields arising from the combinations of different rotational and translational flows, and its temporal resolution. A) Combination of rotation with center C_1 and angular velocity ρ, and translation with velocity V_1. B) Combination of rotation with center C_2 and angular velocity ρ, and translation with velocity V_2. C) Both combinations in A and B result in the same instantaneous rotational field with the shifted center C_r. D) Optic flows resulting from the progress over time of the "rotating wheel" arising from the combination of the rotation and translation in A. E) Optic flows resulting from the progress over time of the rotating wheel arising from the combination of the rotation and translation in B. Although the combinations in A and B result in an ambiguous instantaneous flow (C), one can disambiguate the motion by temporal integration (D and E).
plane. In Experiment 3, we tested whether the brain integrates motion information
over time to find the true center. The alternative would be a calculation of the center
based on “instantaneous” flow information.
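The equivalence between a rotation-plus-translation and a single shifted rotation can be checked numerically. The sketch below is our illustration (the function names are ours, not from this dissertation); it uses the Experiment 3 values of ω = 1.5 rad/s and a 6 deg/s translation, which shift the center by 4 deg:

```python
import numpy as np

def flow(points, center, omega, T=(0.0, 0.0)):
    """Velocity at each 2D point for rotation about `center` (omega, rad/s)
    plus a uniform translation T: v = omega * z_hat x (x - C) + T."""
    d = points - np.asarray(center)
    rot = omega * np.stack([-d[:, 1], d[:, 0]], axis=1)  # omega * z_hat x d
    return rot + np.asarray(T)

def shifted_center(center, omega, T):
    """Center of the equivalent pure rotation: C' = C + (z_hat x T) / omega."""
    return np.asarray(center) + np.array([-T[1], T[0]]) / omega

# Experiment 3 values: omega = 1.5 rad/s, T = 6 deg/s -> a 4-deg center shift
pts = np.random.default_rng(0).uniform(-10, 10, (100, 2))
C, omega, T = np.zeros(2), 1.5, np.array([6.0, 0.0])
v_mix = flow(pts, C, omega, T)                           # rotation + translation
v_pure = flow(pts, shifted_center(C, omega, T), omega)   # pure rotation, shifted C
assert np.allclose(v_mix, v_pure)                        # identical instantaneous fields
```

Because the two fields are identical frame by frame, only the different trajectories of the shifted center over time (Figs. 2.5D and E) can disambiguate the underlying motion.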
METHODS
Stimuli were random-dot flow fields, which rotated and translated
simultaneously. The stimuli were shown for 50, 250, 650, or 1000 ms, preceded and
followed by 750 and 500 ms of blank screens respectively. The fixation mark was
present throughout all stimuli and blank screens. After the second blank screen,
subjects were asked to point to the location of the last perceived position of the
rotation center. We allowed for eye movements during the pointing phase. Subjects
were naïve about the motion of the random-dot field, except that they were told that
the center of rotation might be moving over time. The angular velocity of rotation
was 1.5 rad/s and the three values of translation velocities were 6, 12, and 18 deg/s.
These angular and translational velocities gave rise to 4, 8, and 12 degrees shift of
center position (Fig. 2.6). All four combinations between directions of rotation
(clockwise and anticlockwise) and translation (left and right) were tested ten times
each, giving a total of 120 trials per presentation time (3 speeds × 4 directions ×
10 repetitions). We tested these trials in one block in randomized sequence (Fig.
2.6A). Each translational velocity tested had its own associated vertical position of
the true center of rotation. This position was such that the true and the
instantaneous-flow center of rotation had the same distance from fixation to control
for eccentricity. The initial horizontal position of the true center also depended on
the translational velocity. We made sure that at the end of the presentation time, the
center arrived exactly at the midline crossing the fixation. The actual and the
instantaneous centers always stayed within the stimulus area. An example of one of
the configurations appears in Fig. 2.6B.
Fig 2.6. Stimuli configuration for Experiment 3. A) All four combinations between directions of rotation and translation. The fixation mark is the symbol "+." Instantaneous centers (not shown in the figure) appeared downward for the top two combinations and upward for the bottom two combinations. B) Example of trajectory in one of the four conditions. The position of the actual rotation center (C_A) was situated so that C_A and the instantaneous-flow rotation center (C_I) had the same eccentricity. This separation of centers depended on the magnitude of the translational motion added. As the motion proceeded, both of the centers translated horizontally and ended at the vertical midline crossing the fixation mark.
This rotation/translation task required training. Initial subjective reports by
subject JF indicated that he had trouble tracking where the center went and that he
was not sure where it ended. We found that after performing the RSVP tasks in
Experiment 1, the results in Experiment 3 became clean and consistent across the
22
subjects, as well as thresholds measured in Experiment 2 improved. Subject JK (one
of the authors) and CW did not require training.
RESULTS
The results showed that the vertical perceived positions of the rotation centers
at the end of the stimuli were always closer to the centers of the instantaneous
velocity fields than to the actual ones. This bias towards the instantaneous center
occurred regardless of the presentation time. The systematic deviations from the
positions of the instantaneous centers were similar to those predicted by the
fixational bias observed in Experiment 2.
When the presentation time was very short, there was not enough
information about translational motion. Thus the stimulus was almost
indistinguishable from the static rotational field centered at the position of the
instantaneous center (See Fig. 2.5, C). Subjects could correctly track the center
positions of such stimuli (Fig. 2.7, 50 ms condition). The results showed that subjects
could not achieve correct decomposition of rotation and translation even with
prolonged presentation of the stimuli. Instead, subjects reported perception of lateral
movement of the rotational center from the initially perceived height, which is consistent with the similar vertical positions of the perceived centers over all the presentation times.
Fig. 2.7. Vertical component of the last position of the perceived center of rotation as
a function of the velocity of translation and parametric on the stimulus presentation
time. The data are for three subjects, namely, CW, JK and JF. The vertical positions
of actual (dashed line) and instantaneously computed (dotted line) rotational centers
are shown for comparison. Subjects show a consistent bias of the perceived last position towards the instantaneous center.
Our data included two-dimensional measurements of perceived center
positions. Thus, we could also study possible horizontal systematic errors of
perceived center following the rationale in [78]. If the center took a certain time lag
Δt to compute, then the actual position of the center would be vΔt ahead from the
perceived one, where v is the velocity of translation. The experiment was not
primarily designed to compare such lags and thus, no randomization in horizontal
end points was performed. Nevertheless, we observed a consistent spatial lag in the
reported horizontal position of the last perceived center. We then plotted this spatial
lag as a function of v and calculated the slope. This yielded temporal lags of 50 to 200 ms, depending on the test condition (data not shown). These lags were short
enough to be consistent with the results of Experiments 1 and 2. In contrast, these
lags were much shorter than those reported in the earlier direction-of-heading
experiments [78].
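This slope calculation can be sketched as follows; the spatial-lag values are illustrative placeholders, not the measured data:

```python
import numpy as np

# illustrative spatial lags (deg) at the three translation speeds of Experiment 3
v = np.array([6.0, 12.0, 18.0])               # translation velocity, deg/s
spatial_lag = np.array([0.55, 1.25, 1.80])    # hypothetical reported lags, deg

# slope of the spatial lag vs. v line is the temporal lag Delta-t (in seconds)
dt = np.polyfit(v, spatial_lag, 1)[0]
print(f"temporal lag ~ {dt * 1e3:.0f} ms")    # falls in the 50-200 ms range
```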
Taken together, our results indicate fast estimation of the centers of optic flow by the visual system. In particular, Experiment 3 shows that this speed of estimation comes at the expense of the ability to decompose rotation and translation correctly, even with prolonged presentations of the stimuli. Rather, the brain appears to prefer "instantaneous" centers.
Discussion
FAST TEMPORAL DYNAMICS OF CENTER PERCEPTION
The main conclusion of this chapter is that the estimation of optic-flow center
is fast (Figs. 2.2 and 2.4). A significant percept of center position emerges within the
first few frames of stimuli. Time intervals of around 100 ms or less are enough for
estimating the center position. The visual system even appears sometimes to prefer
fast rather than correct estimations of center. For instance, Experiment 3 showed an
example in which the visual system failed to obtain veridical interpretations of
underlying motions, although the information was available in the optic flow (Fig.
2.7). Such a failure happened because the disambiguating information was extended over time and thus required temporal integration.
To grasp how fast the estimation of optic-flow centers is, one must compare
its speed to those of other relevant computations in the motion pathway of the brain.
As pointed out in the conclusion of Experiment 2, the time taken to estimate the
optic-flow center is almost as fast as that of the measurement of local velocity [45,
67]. The computation of optic-flow center requires several local estimates of
motions at multiple locations around it. Consequently, this computation is global, not local, and thus should take extra time to perform. We can think of two non-mutually
exclusive hypotheses for why the computation of center does not take substantially
longer than that of local velocity: First, perhaps the computation of center does not
require full velocity estimates. Human perception of direction saturates faster than
that of speed [9], and directional measurements without speed estimates may be
enough. That the speeds in our non-rigid motions are all equal suggests that this
hypothesis is valid, since we can find the center with just local directions (Figs. 2.2
and 2.4). Second, perhaps the computations of both local motion and center of optic
flow are performed simultaneously by a shared mechanism. For instance, motion
opponency seems to be a part of the computation of local direction [14, 25, 38].
Motion-opponent mechanisms tend to yield zeros in the centers of an optic flow and
thus, may signal them.
Other relevant computations are the estimations of global parameters of the
optic flow that are more complex than the center. Elementary flow patterns such as
rotation and expansion have been proposed as a possible basis for optic-flow analysis [34-36, 46, 57]. These patterns are compatible with the electrophysiological-response properties found in the primate's medial superior temporal (MST) cortical
cells [10, 65, 71]. The main parameters of rotation and expansion besides their
centers are angular velocity and rate of expansion, respectively. Is the computation of the center faster than that of these parameters? In their Bayesian framework for the
perception of visual motion, Yuille and Grzywacz [93] proposed the simultaneous
estimation of all optic-flow parameters. Thus, if their framework were right, one
would expect the time courses of the computations of center, angular velocity, and
rate of expansion to be similar. Alternatively, the computation of center may occur
sooner. Thus, it may be an input to the estimation of angular velocity and rate of
expansion. In Fig. 2.8, we compare the time courses of the computation of the focus
(Fig. 2.4) and the rate of expansion [91] for dot fields expanding at 1.5 s⁻¹. To
perform this comparison, we graph sensitivity, i.e., the inverse of threshold. The
normalized sensitivity curve for center perception improves faster than that for the
rate of expansion. The former reaches over 80% of its best performance before 100
ms, whereas the latter requires about 200 ms of stimulus presentation to reach the
same level. This difference implies separate processing of the center position and of
the rate of global motion. Furthermore, because the center is estimated more rapidly,
it might be used in the computation of global-motion parameters that are more
complex.
Fig. 2.8. Normalized sensitivity of focus and rate-of-expansion perception as a
function of stimulus presentation time. Data are means and standard errors from three subjects. The time course of center perception is faster than
that of the rate of expansion.
SENSITIVITY OF PERCEIVED CENTER POSITION TO THE SPATIAL
VELOCITY PROFILE OF MOTION STIMULI
For the estimation of optic-flow centers to be very fast, it must use the
“instantaneous” spatial distribution of local velocities, avoiding significant temporal
integration. Therefore, it is not surprising that the perceptual strength of center position varies with the velocity profile of the motion stimuli. For
instance, rigid global-motion stimuli with high rates of motion give better sensitivity
in pointing tasks than when the rates are low (Fig. 2.4). In addition, fast stimuli
require shorter presentation times for correct detection (Fig. 2.2). These results are
consistent with pointing errors being smaller for stimuli with larger rates of
expansion and rotation [73]. Moreover, the results are consistent with the threshold
of heading discrimination decreasing with increasing approach speed [86]. In turn,
non-rigid global motions gave the best sensitivity in the pointing task (Fig. 2.4) and the shortest threshold times for detection (Fig. 2.2). These fast-motion and
non-rigid results can be explained by assuming that center estimation is based on
only the directional information of local motion signals. For this explanation, we use
three observations. First, local directions are sufficient for finding the center,
because the non-rigid stimulus does not contain local-speed information (Figs. 2.2,
2.3, and 2.4; [17]). Second, directional signals near the center are more valuable than
are those at the periphery in determining the center location [21]. Third, directional
signals are impoverished at low speeds [9]. Therefore, non-rigid rotations and
expansions should typically generate better perception of center than rigid ones,
because in the latter, speeds converge to zero as one approaches the center.
Similarly, fast rotations and expansions should evoke better perception of center than
slow ones, because near it, they will induce faster speeds. This explanation for the
advantages of both fast and non-rigid motions is consistent with the psychophysical
observation that randomization of direction but not of speed affects heading
perception severely [83]. Similarly, electrophysiological responses from MST cells
show only marginal decrease when speed gradients are removed but directions of
motion are preserved in rigid expansion and rotation motion stimuli [71].
“INSTANT” FLOW INTERPRETATION IN ROLLING-WHEEL-LIKE RANDOM
DOT MOTION FIELD
When we showed a random-dot motion field undergoing simultaneous rotation and translation, subjects kept perceiving the center of the instantaneous flow as the center of the rotational motion, regardless of viewing time. This result is surprising because, in principle, the correct decomposition into rotation and translation is possible with multiple frames of the velocity field. Instead, subjects reported straight-to-somewhat-rugged lateral movement of the instantaneous rotational centers. Such perception can be explained if we assume that percepts of instantaneous centers concatenate temporally to give rise to a sense of higher-order motion [42, 43]. The ruggedness in the perceived lateral motion may be due to local motion signals near the instantaneous centers always being in conflict with the direction in which these centers proceed. In our mixture stimuli, the positions of the instantaneous centers move horizontally, whereas the local motion signals around them are always vertical. A similar integration of conflicting motion signals has been reported in [75], where first-order horizontal motion of a grating and second-order vertical motion of its aperture are integrated to give the sensation of illusory diagonal motion in peripheral vision.
It is interesting to note the similarity between our results in Experiment 3 and the misperceived heading toward a single wall in simulated-rotation stimuli [1, 61]. Observer translation gives a radial optic flow. Observer rotation adds a lamellar flow to it and shifts the instantaneous center of the radial flow in the direction opposite to the lamellar flow. In both cases, estimation of the true motion parameters, the true rotational center and the true heading direction respectively, is possible in principle. But subjects' perception is biased toward the singularities of the instantaneous velocity fields regardless of viewing time. Thus, both of the observed biases might be understood in terms of very fast perception of the flow centers and a sensation of higher-order motion arising from the temporal concatenation of these "instantaneously processed" flow centers.
Why would the brain interpret optic-flow information in such a seemingly misleading, "stroboscopic" way, disregarding temporal information? One possibility points to the necessity of actual eye motion, or its efference copy, to cancel out the lamellar (translational) flow and simplify the complex retinal flow. Humans can indeed perceive correct heading with actual eye rotation, even when the resulting retinal flow is the same as in the simulated case [1, 61]. With these complications resolved by extraretinal signals, fast processing of the instantaneous velocity field may benefit the organism by providing information about the environment as quickly as possible.
CHAPTER 3.
PERCEPTION OF OPTIC FLOW CENTERS WITH
MASKED ROTATIONS AND EXPANSIONS
Introduction
The estimation of optic-flow center location requires spatial integration of
local motion signals. Hence, occluding parts of the motion stimuli to control the available amount of local motion cues can be an effective way to probe the integration
necessary to develop the percept of center location. Several studies have investigated
the effects of masking parts of the optic flow with either no motion (as occluded by
static objects) or random noise. When the center region is covered by a circular
aperture, human performance of center estimation deteriorates. This deterioration
appears as decreased percent correct [87], increased error [73] or increased
coherence threshold [21]. Some of these works additionally reported a shift, or bias, in the mean perceived center position. Interestingly, a small bias toward the fixation with increasing mask diameter was observed in [87], whereas a bias in the opposite direction was observed in [73].
The optic-flow stimulus presentation time was 3.6 s in [87] and unlimited in [73]. However, elsewhere we have shown that the dynamics of center estimation on expansion and rotation optic flows are fast (especially for the non-rigid velocity profiles used in this study): a significant amount of center percept already develops by 100 ms. Also, the above studies considered only circular masking.
Hence, in this chapter we tested the perception of centers of expansion and rotation
optic flows with short stimulus presentation and three different spatial profiles of
masking. We found that with such short presentations, different spatial profiles of
masking had different effects on the perceived eccentricities of the centers. We then
simulated two general models of center estimation to test whether they could account
for these effects.
Masking Experiment
METHOD
Subjects were asked to point to the locations of the subjectively perceived centers after watching movies of random-dot motion fields undergoing coherent radial and rotary motions. Perceived centers were measured for a control condition (no masking, Fig. 3.1A) and three masking conditions in which the center region was covered with different masking profiles. The tested profiles were 1) rectangular masking of size 8° × 40° with its long side aligned with the direction of changing center eccentricities ('parallel' masking, Fig. 3.1B), 2) the same rectangular masking with its short side aligned with that direction ('perpendicular' masking, Fig. 3.1C), and 3) square masking with side length equal to the short side of the rectangles used in 1) and 2) ('square' masking, Fig. 3.1D).
Each stimulus was shown for 106 ms (9 frames at 85 Hz). Dots moved along
concentric radial or clockwise rotary trajectories on expansion and rotation flows
33
respectively, with linear velocity of 14 deg/sec. This velocity profile gave a percept
of non-rigid motion, such as the rotational motion in a whirlpool drain. Tested
eccentricities of the centers were 1°, 3°, 5°, and 7° (see Fig. 3.1), presented at 15 equally spaced angles, i.e., 0°, 24°, 48°, ..., 336°. Thus, there were 15 measurements per eccentricity and 4 × 15 = 60 trials per masking condition. The control condition was measured separately, and the three masking conditions were measured together in random order.
Subjects were asked to move a cursor to the perceived location of the center
with a provided mouse after watching the motion stimuli. Each motion stimulus
presentation was preceded and followed by 750-ms and 500-ms blank screens, respectively. A fixation cross was presented during both blank screens and during the
motion. Subjects maintained fixation when the cross was present. We removed it
after the post-motion blank screen and a mouse cursor then appeared at the location
of the fixation. Subjects could move their eyes freely during this pointing phase.
Random-dot motion stimuli were generated using the Psychtoolbox MATLAB extension and were presented on a 35 cm × 26 cm CRT display. The stimuli were viewed binocularly from 36 cm away. The stimulus area subtended a circular region of radius 20°. Stimuli were composed of 1256 dots, giving a dot density of approximately 1 dot/deg². Each dot had a diameter of 12′ and zero luminance, while the background luminance was 26.5 cd/m². Using dark dots minimized the undesirable static cue of luminance streaks due to the slow passive decay of the CRT phosphor. Because the stimulus duration was short, no dot replacement was done. All dots were
randomly placed on the first frame of the movie and survived until the last (9th) frame. The random-dot field was refreshed every frame at 85 Hz.
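The dot-field generation described above can be sketched as follows. This is our reconstruction from the stated parameters (1256 dots, a 20°-radius aperture, 14 deg/s linear speed, 9 frames at 85 Hz), not the original Psychtoolbox code:

```python
import numpy as np

RADIUS, N_DOTS, SPEED, FPS = 20.0, 1256, 14.0, 85   # deg, dots, deg/s, Hz

def nonrigid_rotation_frames(center, n_frames=9, seed=0):
    """Dot positions per frame for a non-rigid ('whirlpool') rotation: every
    dot steps tangentially around `center` by the same amount each frame."""
    rng = np.random.default_rng(seed)
    r = RADIUS * np.sqrt(rng.random(N_DOTS))        # uniform density in the disk
    th = 2 * np.pi * rng.random(N_DOTS)
    pos = np.stack([r * np.cos(th), r * np.sin(th)], axis=1)
    frames, step = [pos.copy()], SPEED / FPS        # tangential step per frame
    for _ in range(n_frames - 1):
        d = pos - center
        dist = np.linalg.norm(d, axis=1, keepdims=True)
        tang = np.stack([-d[:, 1], d[:, 0]], axis=1) / np.maximum(dist, 1e-9)
        pos = pos + step * tang                     # equal speed at all eccentricities
        frames.append(pos.copy())
    return frames

frames = nonrigid_rotation_frames(np.array([1.0, 0.0]))  # center at 1 deg eccentricity
```

Because every dot moves at the same linear speed, the field carries only directional information about the center, which is the defining property of the non-rigid stimulus.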
Fig. 3.1. Stimuli configurations illustrated in the form of Glass patterns from 3 frames of the rotational motion stimuli. Tested center eccentricities are shown with white dots. The fixation mark is the white cross. Note that all tested center positions were inside the masking in all three masking conditions. The entire stimulus display was presented rotated in the actual experiment. A) control, B) parallel masking, C) perpendicular masking, and D) square masking.
SUBJECTS
Three subjects participated in the experiment. JK is one of the authors. CW and JF had participated in a series of psychophysical experiments on the perception of centers of optic flow, but they had never been tested on masked conditions and were naïve to the purpose of the experiment.
RESULTS
The 2D pointing data were decomposed into radial and tangential components relative to the axis connecting the fixation and the true center. Then, the radial components were averaged over all the presentation angles to obtain the mean eccentricities of the perceived center locations. An example of the pointing data is shown in Fig. 3.2. Perceived center eccentricities under the control condition (Fig. 3.2A)
showed the typical underestimation observed in center-perception studies [2, 28, 30, 73, 78, 91], but otherwise correct tracking of the actual center positions. As parts of the stimuli were occluded by masking, the eccentricities of the perceived centers started to deviate from those of the control condition (Fig. 3.2B). Different tendencies in errors were observed across masking conditions. The parallel masking condition exhibited increased variable error; the mean perceived eccentricities showed similar tracking of the actual center positions but with larger variance around the mean. Interestingly, both the perpendicular and square masking conditions exhibited increased systematic error without an obvious increase in variance. These two conditions showed a strong bias in mean perceived eccentricities toward the masking center. Centers presented closer to fixation were perceived farther away, and those presented more peripherally were perceived closer, making the entire line of mean perceived centers quite flat. To quantify this trend, we conducted linear regression between the actual eccentricities of the tested center locations and the corresponding perceived eccentricities. Results are shown in Fig. 3.3 for all subjects for both the non-rigid expansion and non-rigid rotation flows.
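The radial-projection and regression steps above can be sketched as follows, using synthetic responses with a 20% underestimation rather than the measured data:

```python
import numpy as np

def eccentricity_regression(true_pos, perceived_pos):
    """Project each 2D pointing response onto the fixation-to-true-center axis,
    then regress perceived eccentricity on actual eccentricity."""
    ecc = np.linalg.norm(true_pos, axis=1)           # actual eccentricities
    axis = true_pos / ecc[:, None]                   # unit radial axis per trial
    radial = (perceived_pos * axis).sum(axis=1)      # radial component of response
    slope, intercept = np.polyfit(ecc, radial, 1)
    resid_std = (radial - (slope * ecc + intercept)).std(ddof=2)
    return slope, resid_std

# synthetic data: centers at 1-7 deg on 15 equally spaced angles,
# responses underestimated by 20% with additive 2D noise
rng = np.random.default_rng(0)
angles = np.deg2rad(np.arange(15) * 24.0)
ecc = np.repeat([1.0, 3.0, 5.0, 7.0], 15)
ang = np.tile(angles, 4)
true_pos = np.stack([ecc * np.cos(ang), ecc * np.sin(ang)], axis=1)
perceived = 0.8 * true_pos + rng.normal(0, 0.3, true_pos.shape)
slope, resid_std = eccentricity_regression(true_pos, perceived)  # slope near 0.8
```

A slope below one with unchanged residual STD corresponds to the systematic bias seen in the square and perpendicular conditions, while a slope near one with a larger residual STD corresponds to the parallel condition.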
Examination of Fig. 3.3 confirms our initial observations. The slopes of the linear regression lines for the square and perpendicular masking conditions were significantly smaller than for the control condition, while the standard deviation of the residuals (STD) remained at the same level, indicating a strong systematic bias. In contrast, the parallel masking condition gave a slope comparable to the control condition, but its STD was consistently the largest of all four conditions. These tendencies were observed for both the non-rigid expansion and non-rigid rotation motion stimuli.
Fig. 3.2. A representative example of pointing data (subject JK on non-rigid rotation stimuli). Means and standard errors of the perceived center eccentricity are shown as a function of the actual center eccentricity. A) control and B) masking conditions.
Fig. 3.3. Slopes and standard deviations (STD) of the residuals of perceived eccentricities for all four conditions, obtained from linear regression on the pointing data. Slopes with 95% confidence intervals are shown on the left, and STDs of the residuals are shown on the right. Data from the three subjects are shown with different symbols (see legends). Data for non-rigid expansion are on the top and non-rigid rotation on the bottom.
Simulation Study
RATIONALE
Two strategies for locating the centers of expansion and rotation flows were considered in this study. The first, termed the 'motion-opponency model' hereafter, was based on the fact that around the center, the distribution of local motion directions is isotropic: for every direction of motion, there normally exists
a similar amount of motion in the opposite direction. In other words, motion is balanced along every direction around the center. Thus, a detector that calculates motion opponency in any given direction will tend to yield zero when situated at the center. Hence, we can build a model that calculates motion opponency in every direction and estimates the center by finding the point where the output, summed over all directions, is minimal.
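A minimal sketch of this motion-opponency strategy (our illustration; the dissertation's exact formulation is in its appendix) pools local directions around each candidate point, correlates them with eight translation directions, rectifies, sums, and takes the minimum:

```python
import numpy as np

POOL_R = 5.0                                    # pooling radius, deg
DIRS = np.stack([(np.cos(a), np.sin(a))         # 8 translation directions
                 for a in np.arange(8) * np.pi / 4])

def opponency_sum(dot_pos, dot_dir, c):
    """Sum of rectified correlations with the 8 translations, pooled around c."""
    inside = np.linalg.norm(dot_pos - c, axis=1) < POOL_R
    if not inside.any():
        return np.inf
    corr = dot_dir[inside] @ DIRS.T             # correlation with each translation
    return np.abs(corr.mean(axis=0)).sum()      # rectify, then sum over directions

def estimate_center(dot_pos, dot_dir, candidates):
    """Candidate with minimal opponency sum (ideally zero at the true center)."""
    return candidates[np.argmin([opponency_sum(dot_pos, dot_dir, c)
                                 for c in candidates])]

# demo: non-rigid rotation about (3, 0) inside a 20-deg-radius aperture
rng = np.random.default_rng(0)
r, th = 20 * np.sqrt(rng.random(5000)), 2 * np.pi * rng.random(5000)
pos = np.stack([r * np.cos(th), r * np.sin(th)], axis=1)
d = pos - np.array([3.0, 0.0])
u = np.stack([-d[:, 1], d[:, 0]], axis=1)       # tangential directions
u /= np.linalg.norm(u, axis=1, keepdims=True)   # unit vectors (direction only)
grid = np.array([(x, y) for x in np.arange(-6, 7) for y in np.arange(-6, 7)], float)
est = estimate_center(pos, u, grid)             # should land near (3, 0)
```

Note that the sketch uses only unit direction vectors, matching the idea that directional signals alone suffice for finding the center.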
The second strategy, termed the 'template model' hereafter, was based on the more straightforward fact that when templates of expansion and rotation motion flows with different centers are fitted to a given stimulus, the template whose center coincides with the true center gives the maximal correlation. A directional template model is consistent with many aspects of human heading perception. For example, randomization of direction but not of speed affects heading perception severely [83]. Electrophysiological responses from MST cells show only a marginal decrease when speed gradients are removed but directions of motion are preserved in rigid expansion and rotation motion stimuli [71]. Perhaps most importantly, a directional template model can account for the importance of motion signals near the singularity in determining the center position. Human perception of heading deteriorates when the singularity is out of the visual field or masked [7, 73, 87]. Subtraction of the direction fields (i.e., fields in which the magnitudes of all vectors are one) of two differently centered expansion flows yields a spatially non-homogeneous difference field. In fact, the differential cues are most prominent between the two centers and become negligible peripherally ([21], see also Fig. 3.7), explaining the deterioration of heading perception when the singularity region is not shown.
METHOD
The two strategies for finding the center location were implemented in their simplest forms. Schematics of the motion-opponency model are shown in Fig 3.4, A. Given sampled signals over a certain region, we calculated the correlation with every direction of translation and summed up their rectified values. Obtaining a correlation is a simple way of calculating motion opponency. Correlating with, for example, upward translational motion gives zero if there are equal amounts of upward and downward motion in the sampled motion signals, indicating that motions in the region are balanced along the up-down axis. Rectification is important: without it, the sum of correlations would be zero everywhere, because the correlation operator is linear and the correlations in one direction and in the opposite direction always cancel out. Once the sums of rectified correlations were calculated over the whole stimulus area, the location of the center could be estimated by finding the minimum (ideally, zero) position. Schematics of the template model are shown in Fig 3.4, B. We fitted directional templates of expansion or rotation motion flow over the whole stimulus area, calculated the correlation between the template and the signals underneath, and estimated the center location by finding the position of maximal correlation. Mathematical details of both models are given in the appendix.
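The two strategies above can be sketched in a few lines of code. The following is a minimal illustration, not the dissertation's actual implementation; it uses the parameter values quoted in the text (1256 dots, 5° direction noise, a 5° circular pooling radius), but the field size and candidate grid are arbitrary choices:

```python
import numpy as np

def opponency_score(pos, theta, center, radius=5.0, n_dirs=8):
    """Sum of rectified correlations with translations in n_dirs directions,
    pooled over a circular region; ideally zero at a balanced (center) point."""
    m = np.linalg.norm(pos - center, axis=1) < radius
    phis = np.arange(n_dirs) * 2 * np.pi / n_dirs
    return sum(abs(np.mean(np.cos(theta[m] - phi))) for phi in phis)

def template_score(pos, theta, center, radius=5.0):
    """Mean correlation with an expansion (radial) template at `center`."""
    d = pos - center
    m = np.linalg.norm(d, axis=1) < radius
    pref = np.arctan2(d[m, 1], d[m, 0])   # preferred (radial) directions
    return np.mean(np.cos(theta[m] - pref))

# expansion flow of 1256 dots centered at (1, 0), with 5 deg direction noise
rng = np.random.default_rng(0)
pos = rng.uniform(-8.0, 8.0, size=(1256, 2))
true_center = np.array([1.0, 0.0])
d = pos - true_center
theta = np.arctan2(d[:, 1], d[:, 0]) + rng.normal(0.0, np.deg2rad(5.0), 1256)

# scan candidate centers along the horizontal meridian
cands = [np.array([x, 0.0]) for x in np.linspace(-4.0, 4.0, 81)]
est_opp = min(cands, key=lambda c: opponency_score(pos, theta, c))  # minimum
est_tmp = max(cands, key=lambda c: template_score(pos, theta, c))   # maximum
```

Both estimates land near the true center at x = 1: the opponency score is minimized where local directions are balanced, and the template correlation is maximized where the radial template matches the flow.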
40
The two models of center estimation were tested on the experimental setup of the various masking conditions described before. The tested eccentricities of center locations were 1°, 2.5°, 4°, 5.5° and 7°. Each eccentricity was simulated 1000 times, with random dot placement in each trial. The same number of dots as in the experiment was used in the simulation (i.e., 1256 dots). Random additive noise from a Gaussian distribution with standard deviation 5° was added to the motion direction of every dot. In the motion-opponency model, correlations with translational motions were obtained in a total of 8 directions (0°, 45°, …, 315°). Simulations were done only on non-rigid expansion stimuli because both models predict the same outputs for expansion and rotation. Finally, the pooling region of the translational detectors in the motion-opponency model and the expansion template in the template model were both circular regions of radius 5 degrees.
Fig. 3.4. Schematics of the models tested. A) Motion-opponency model, B) Template model. See the text as well as the appendix for details of the calculations performed at each stage.
RESULTS
Mean eccentricities of the model outputs are plotted in Fig 3.5 A. Both models showed correct estimation of centers in the control case. Note that neither model attempts to explain the fixational bias in the control case; thus their slopes in the control condition were one. The results of the motion-opponency model showed a clear separation of mean lines in the square and perpendicular masking conditions, whereas the results of the template model were rather congregated. We performed linear regression on the model outputs and plotted the slopes (Fig 3.5, B) and STDs (Fig 3.5, C) in the same format as the human perception data (Fig 3.3).
The linear regression results revealed that the motion-opponency model gave predicted center positions very similar to those of human perception in terms of slope and variance. Both the square and perpendicular masking conditions showed a strong bias in mean center eccentricities. The parallel masking condition gave a slope similar to the control condition but the largest STD of all four conditions. The predictions of the template model, however, deviated from the human data in two important aspects. First, the square and perpendicular masking conditions did show biased slopes, but the amounts were not on the same scale as the human data. Second, the perpendicular, not the parallel, masking condition gave the largest STD.
Fig 3.6 shows details of how each model behaves under the different masking profiles when the actual center is situated at 1° of eccentricity. Outputs of the individual pairs of correlation detectors, as well as the summed output, are shown as a function of eccentricity in Fig. 3.6 LEFT. The functional meaning of the motion-opponency model can be better understood in terms of a pair of unit correlation detectors. Suppose we pick two opposite-direction detectors, for example the upward and downward translation detectors. The rectified sum of the two correlation detectors ranges from 0 to 1, where 1 means strong translational motion in either the upward or the downward direction, and 0 means motions are balanced and there is no net translation along the up-down axis. The curves of the outputs from the individual pairs, as well as the total sum, show a minimum at the correct eccentricity (1° in the example shown) in the control and parallel masking conditions. But the minima of the individual pairs and of the total shift toward the center of the mask in the square and perpendicular masking cases, because motion signals are occluded by the mask so that the new most-balanced point is displaced from the actual center position. It can also be seen that the drop in the curves is widest in the parallel masking condition, implying the most spread-out appearance of minimum points across random experiments, which in turn translates into the largest STD of the residuals in predicted center positions.
On the other hand, the maximum correlation point does not change in the template model. Whether part of the motion signals is occluded or not, the correlation with the template is always maximal at the actual center position under all masking conditions (Fig. 3.6 RIGHT). Instead, the slopes of the curves around this maximum correlation point become asymmetric in the square and perpendicular masking conditions. Because one side is shallower than the other, random iterations yield more correlation maxima on the shallower side of the curve than on the other. This asymmetry explains the small bias in mean predicted center eccentricities. Also, it is the perpendicular masking condition that has the widest curve bandwidth and thus gives the most spread in maximum points across random experiments.
Fig. 3.5. Simulation results of the models on the masking experiments. Top: Motion-opponency model, Bottom: Template model. A) Mean predicted center eccentricities are shown as a function of the actual center eccentricity. Each point is an averaged center eccentricity from 1000 trials with random dot placement in each trial and Gaussian additive noise in local motion direction. Linear regression was done on the simulated data to obtain B) slopes with 95% confidence intervals and C) STDs of the residuals.
Fig. 3.6. Details of how the models respond to different masking profiles. Left: outputs of the Motion-opponency model, Middle: control and three masking conditions, Right: outputs of the Template model.
Fig. 3.6, continued
Discussion
We tested the perception of the centers of expansion and rotation optic flows under various masking conditions with short stimulus presentations (100 ms). We observed different tendencies in error according to the masking profiles. When a rectangular mask was presented with its long side aligned with the direction in which center eccentricities changed (parallel masking condition), mean perceived eccentricities were unaffected (although the response variance increased). However, when the rectangular mask was aligned so that center eccentricities changed along its short side (perpendicular masking condition), a strong bias in mean perceived eccentricities was observed. This bias persisted with a small square mask, which was contained in both the parallel and perpendicular masks (Figs 3.2 and 3.3). The small-mask results were surprising, as this condition provided all the information allowed by the perpendicular mask and more.
Two possible strategies for estimating the centers of expansion and rotation flows were considered. In the motion-opponency model, the centers of such flows were found by locating the point where local motions were maximally balanced, thus yielding the least correlation with uniform translation in any direction. In the template model, centers were found by locating the point that best fitted expansion or rotation motion templates. The simplest implementations of both strategies were simulated on the same experimental setups of the various masking conditions. The results showed that the motion-opponency model could reproduce important aspects of human perception under masking conditions. In contrast, the template model could not explain the observed human data.
Below, we discuss further the limitations of the template model in explaining the observed human perception of centers under masking conditions. We spend additional time analyzing how this model fails, as it is an important alternative in the literature. We also discuss the compatibility of the two models of center estimation considered in this study with the known physiology of brain areas MT and MST.
DISCREPANCY OF THE TEMPLATE MODEL WITH THE OBSERVED
HUMAN PERFORMANCE
The template model predicts the largest variance in estimated centers in the perpendicular rather than in the parallel masking condition. This is because the directional difference between the stimulus flow and the template is not spatially homogeneous, and the perpendicular mask occludes the most cues among the three masking conditions. As illustrated in Fig. 3.7, the difference in direction is prominent around the region between the centers of the template and the stimulus flow (see [21]). In addition, this difference decreases rapidly toward the periphery, and the largest differences lie perpendicular to the line connecting the two centers (Fig 3.7, D). Thus, perpendicular masking occludes signals that are more valuable for discriminating between center positions than parallel masking does, although the total amount of occluded signal is similar in the two conditions. In other words, in the template model, the least discriminative power, and hence the largest variance in determining the center position, occurs in the perpendicular masking condition. This prediction of the template model contradicts the observed human performance.
The mean center position predicted by the template model shows a small bias in the direction consistent with the observed human data. However, this bias cannot be increased arbitrarily by changing model parameters, because it is due to the asymmetry of the correlation curve rather than a shift of the actual maximum point. As can be seen in Fig 3.6 RIGHT, the correlation with the template is always maximal at the true center regardless of the shape of the mask. Hence a larger shift in mean estimated centers can only arise from larger asymmetry with the maximum point unchanged. Such increased asymmetry would be accompanied by larger variance, which again is not compatible with the observed human data.
Fig. 3.7. Directional differences between two expansions with different centers. A) and B): two expansion fields with separate centers, C) difference vectors between the expansion fields shown in A) and B), D) contour plot of the magnitudes of the difference vectors.
CONNECTION OF THE MODELS TO THE KNOWN PHYSIOLOGY OF MT
AND MST
The motion-opponency model and the template model presented in this work are directly compatible with the known physiology of the brain areas MT and MST, respectively. For example, cells in MT show selectivity to unidirectional translational motion, with a decrease in their response when motion in the opposite direction is present within their receptive fields [24, 56, 68]. Also, the best-fit linear-nonlinear model of MT pattern cells found in [63] shows cosine-like weighting in the linear component, followed by a rectifying non-linearity.
Many cells in MST show sensitivity to expansion or rotation optic flows [11, 10, 39, 64, 71]. While they respond specifically to one or a combination of such flows, their response amplitude varies with the center position within their receptive field. Thus they can serve as templates with different centers and encode the position of centers through population coding of multiple such cells responding to the same type of flow.
In the previous chapter we showed that the threshold of perception of the focus of expansion saturates faster than that of perception of the rate of expansion. Taken together with the finding in this study that the model compatible with the known MT physiology can better explain human perception of the centers of expansion and rotation flows under different masking profiles, this raises the possibility that the location of centers may be computed separately from, and at an earlier stage than, the speed of global flows such as expansion and rotation, possibly in area MT.
LIMITATIONS AND EXTENSIBILITY OF THE MOTION-OPPONENCY MODEL
The computation required for center estimation differs from that for heading estimation. The center of radial flow and the actual heading direction are simply not the same when there is observer rotation in addition to translation. The model also cannot explain, as it stands, the numerous biases in perceived heading when a lamellar flow occludes (as in occlusion by a moving object) or is superimposed transparently on an expanding flow [12, 62]. In such conditions, a heading bias toward the direction opposite to the lamellar flow was observed. It has not been tested whether such a bias would still be observed with short stimulus presentations as in our study (around 100 ms). But the present model predicts a shift in mean center position toward the direction of the lamellar flow when subjected to such stimuli; hence an additional stage of computation would be required to link the estimated center position to the heading if one were to explain the observed bias. On the other hand, models that adopt directional templates of expansion or rotation were shown to have limited capability to explain the biases presented in this work. Models that use differences of velocities for locating heading [27, 59, 60] provide an elegant framework that explains heading perception in the presence of observer rotation, as well as various heading biases in the presence of moving objects, but they do not apply to motion stimuli with constant speed as used in our study.
Rather, we propose that the model suggests how the brain integrates local direction signals over a very short time span (around 100 ms) to locate points of interest where local motions are balanced, or where the distribution of motion signals is not peaked. Centers of optic flows are one such point of interest and can be useful for further processing of visual motion information. For example, the estimates of center position can be used for the estimation of angular velocity or rate of expansion, both of which require estimation of the distance from the centers of the flows [2, 91]. It has also been proposed that information around the singularity of an optic flow is particularly useful for estimating the underlying motions [80]. Kinetic boundaries defined by surfaces moving in different directions are another example, as briefly discussed in [19, 68]. One drawback in terms of physiological implementation is that the points of interest are signaled by the absence of activity rather than its presence. Thus separating 'balanced motion' from 'no motion' can be a problem. One possible solution is to consider together the activities of V1 directionally selective cells as well as component-like MT cells, whose responses are not suppressed by the presence of non-preferred-direction motion signals and thus can serve as a more reliable indicator of the presence or absence of motion [56, 63, 68].
CHAPTER 4.
AN INTERPRETATION OF THE MOTION-OPPONENCY
MODEL IN THE BAYESIAN FRAMEWORK
Introduction
The motion-opponency model of center estimation could account well for the results of the masked-center perception experiment in the previous chapter. The motion-opponency model integrates local motions with cosine weighting and then rectifies. The output of an individual detector increases with a peaked distribution of motion directions and decreases with a more spread, or balanced, distribution of directions. Hence, the output of each detector can be thought of as encoding the strength of uniform motion in one given direction. This computation is very similar to how the pattern cells in brain area MT sum up component motions [63]. The model then sums up the rectified outputs from the individual detectors to find the least total output. If one interprets the output of each detector as the strength of uniform motion in a given direction, then the total can be interpreted as the strength of uniform motion in any direction.
Below, a Bayesian probabilistic framework of motion integration is developed. The results suggest several interesting notions for interpreting the computations involved in center estimation. First, cosine weighting approximates a reasonable generative model of how local motion measurements result from a given true motion direction in connection with the underlying texture. Thus the weighted sum in the individual detector can be interpreted as calculating the likelihood of a uniform motion in the given direction. Second, this naturally gives the final total the meaning of a sum of probabilities, so that the output indicates the probability of uniform motion in any direction. A large output indicates a higher probability of uniform motion; a lower probability of uniform motion indicates the presence of multiple motions. A center of optic flows is one such structure. Last, the intrinsic constraint in the direction likelihood formulation simplifies the computation of the posterior probability. This allows likelihoods computed for different regions of the scene to be compared directly under simple prior assumptions.
Bayesian Formulation of True Motion Estimation
Let $\vec{\theta}$ be the vector of the directions of all the local motion measurements within the sampling region, i.e.,

$$\vec{\theta} = \begin{pmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \end{pmatrix}$$

where $\theta_i$ is the direction of the $i$-th local motion measurement, $i = 1, \cdots, n$.

Similarly, let $\vec{\psi}$ be the vector of the directions of the corresponding true motions, i.e.,

$$\vec{\psi} = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_n \end{pmatrix}$$

where $\psi_i$ is the direction of the true motion corresponding to the $i$-th local motion measurement, $i = 1, \cdots, n$.

Then by Bayes' theorem,

$$P(\vec{\psi} \mid \vec{\theta}) = \frac{P(\vec{\theta} \mid \vec{\psi})\, P(\vec{\psi})}{P(\vec{\theta})}$$
INDEPENDENCE AND IDENTITY ASSUMPTION
When the observed directions of local motion $\theta_i$ are independent from each other, the likelihood term $P(\vec{\theta} \mid \vec{\psi})$ can be decomposed as follows:

$$P(\vec{\theta} \mid \vec{\psi}) = \prod_{i=1}^{n} P(\theta_i \mid \psi_i)$$

Taking the logarithm of both sides,

$$\log P(\vec{\theta} \mid \vec{\psi}) = \log \prod_{i=1}^{n} P(\theta_i \mid \psi_i) = \sum_{i=1}^{n} \log P(\theta_i \mid \psi_i) \quad \text{(Eq. 1)}$$
Cosine Weighting as a Generative Model of Aperture-Constrained Measurements
DIRECTION LIKELIHOODS
Local motion detectors with a small integration area can only observe the motion component parallel to the direction of the spatial gradient. When the direction of the true motion $\psi_i$ is given, the possible values of $\theta_i$ range from $\psi_i - 90°$ to $\psi_i + 90°$, depending on the orientation of the spatial texture at that location ([26], reproduced in Fig 4.1 A). It is reasonable to assume that the orientation of the texture and the orientation of the true motion are independent of each other. Then, in the ideal situation with sensitivity to an infinitesimal direction signal, the probability of $\theta_i$ taking any value between $\psi_i - 90°$ and $\psi_i + 90°$ should be uniform, and zero everywhere else. The likelihood in this ideal case is shown in Fig. 4.1 B.
Fig 4.1. A) Examples of possible local measurements when the true motion is rightward. B) Two direction likelihoods: uniform (solid line) and von Mises (dashed line) distributions.
However, the rectangular likelihood shown in Fig 4.1 B needs to be smoothed for several reasons. First, the speed goes to zero around ±90°, so the direction signal in that case cannot be reliably measured. Second, due to noise in the local measurements, the distribution of local measurements is often spread to an extent that would make the whole likelihood zero. One can use, for example, the von Mises distribution as a smoothed version of the rectangular likelihood (Fig 4.1, B), given as below:

$$P(\theta_i \mid \psi_i) = \frac{1}{2\pi I_0(\kappa)}\, e^{\kappa \cos(\theta_i - \psi_i)}$$

where $\kappa$ is a parameter that determines the concentration of the probability density function; a smaller $\kappa$ makes the distribution more uniform and a higher $\kappa$ makes the distribution more peaked around $\psi_i$. $I_0(\kappa)$ is a normalization factor, given by the modified Bessel function of order 0. Choosing $\kappa = 1$ and denoting the constant $\alpha = \frac{1}{2\pi I_0(1)}$, the log-likelihood in Eq. 1 becomes

$$\log P(\vec{\theta} \mid \vec{\psi}) = \sum_{i=1}^{n} \log P(\theta_i \mid \psi_i) = \sum_{i=1}^{n} \log \alpha e^{\cos(\theta_i - \psi_i)} = \sum_{i=1}^{n} \left[ \log \alpha + \log e^{\cos(\theta_i - \psi_i)} \right]$$

$$= \sum_{i=1}^{n} \cos(\theta_i - \psi_i) + n \log \alpha \;\propto\; \sum_{i=1}^{n} \cos(\theta_i - \psi_i) \quad \text{(Eq. 2)}$$
Thus, the sum of cosines of direction differences is directly proportional to the log-likelihood $\log P(\vec{\theta} \mid \vec{\psi})$, i.e., it determines the likelihood of the true motion directions $\vec{\psi}$ given the measurements $\vec{\theta}$. Furthermore, constructing the vectors $\vec{\theta}_v$ and $\vec{\psi}_v$ as follows,

$$\vec{\theta}_v = \begin{pmatrix} \cos\theta_1 \\ \sin\theta_1 \\ \vdots \\ \cos\theta_n \\ \sin\theta_n \end{pmatrix}, \quad \vec{\psi}_v = \begin{pmatrix} \cos\psi_1 \\ \sin\psi_1 \\ \vdots \\ \cos\psi_n \\ \sin\psi_n \end{pmatrix}$$

Eq. 2 can be rewritten as

$$\sum_{i=1}^{n} \cos(\theta_i - \psi_i) = \sum_{i=1}^{n} (\cos\theta_i \cos\psi_i + \sin\theta_i \sin\psi_i) = \vec{\theta}_v \cdot \vec{\psi}_v$$
Hence, the likelihood can also be obtained by calculating the inner product between the measurement vector and the true motion vector.

The likelihood computed by Eq. 2 is very similar to the computation used in the motion-opponency model of center estimation, except that the correlation in the motion-opponency model has an additional normalization by the signal length. The likelihood formulation provides a way to understand the outcome of the motion-opponency computation; that is, local velocity measurements whose directions lie within ±90° of the true motion add to the likelihood of that motion, while local measurements outside that range subtract from, and thus decrease, that likelihood. Also, the computation described in Eq. 2 is very similar to the computation performed in the model of MT pattern cells that appears in Rust et al. [63]. Hence the model of MT pattern cells may be interpreted as computing the likelihood of uniform translational motion given measurements of local velocities.
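A minimal numerical sketch of Eq. 2 (not from the dissertation; the direction values and noise level are arbitrary) shows that the cosine-sum and inner-product forms agree:

```python
import numpy as np

def log_likelihood_cosine(theta, psi):
    """Sum of cos(theta_i - psi_i): proportional to the log-likelihood
    of the true directions psi given measurements theta (Eq. 2, kappa = 1)."""
    return np.sum(np.cos(theta - psi))

def log_likelihood_inner(theta, psi):
    """Same quantity via the inner product of the stacked
    (cos, sin) vectors theta_v and psi_v."""
    tv = np.column_stack([np.cos(theta), np.sin(theta)]).ravel()
    pv = np.column_stack([np.cos(psi), np.sin(psi)]).ravel()
    return tv @ pv

rng = np.random.default_rng(1)
psi = np.full(10, np.deg2rad(30.0))              # uniform true motion at 30 deg
theta = psi + rng.normal(0, np.deg2rad(20), 10)  # noisy local measurements
a = log_likelihood_cosine(theta, psi)
b = log_likelihood_inner(theta, psi)
```

The likelihood is also larger for the true direction than for a direction 90° away, as the ±90° argument above implies.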
Posterior Probability of Uni-directional Motion
One can divide the whole space of $\vec{\psi}$ into two mutually exclusive groups: the group $G_1$, whose members have all directions uniform, and the group $G_2$ for all the rest, i.e., those that have more than one direction among their elements:

$$G_1 = \left\{ \vec{\psi} \;\middle|\; \vec{\psi} = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_n \end{pmatrix} \text{ where } \psi_1 = \psi_2 = \cdots = \psi_n \right\}$$

$$G_2 = \left\{ \vec{\psi} \;\middle|\; \vec{\psi} = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_n \end{pmatrix} \text{ where at least two distinct directions are present among the elements} \right\}$$

Then $G_1$ and $G_2$ are mutually exclusive, and $G_1 \cup G_2$ comprises the whole possible space of $\vec{\psi}$. For example, when the values of each $\psi_i$ are discretized into $m$ possible directions and the number of elements is $n$, then $G_1$ has $m$ members and $G_2$ has $m^n - m$ members.
The probability of uni-directional motion given the local measurements can be obtained by summing the probabilities of all $\vec{\psi}$ that belong to $G_1$:

$$P(G_1 \mid \vec{\theta}) = \sum_{\vec{\psi} \in G_1} P(\vec{\psi} \mid \vec{\theta}) = \sum_{\vec{\psi} \in G_1} \frac{P(\vec{\theta} \mid \vec{\psi})\, P(\vec{\psi})}{P(\vec{\theta})} = \frac{P(\vec{\psi})}{P(\vec{\theta})} \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi}) \quad \text{(Eq. 3)}$$

assuming that uniform motion in every direction has equal probability, so that $P(\vec{\psi})$ can be treated as a constant for all $\vec{\psi} \in G_1$.
The rightmost part of Eq. 3 is the sum of likelihoods that can be obtained from Eq. 2. This term is also similar to the final output of the motion-opponency model of center estimation. The summed likelihood is scaled by $P(\vec{\psi})$, a constant representing the probability of $\vec{\psi}$ being a uniform motion, and divided by $P(\vec{\theta})$, the probability of the measurements, to give the posterior probability that the local measurements arise from any uni-directional motion.

Suppose two different sets of local measurements, $\vec{\theta}_1$ and $\vec{\theta}_2$, are sampled over two different regions of a scene. In general $P(\vec{\theta}_1)$ and $P(\vec{\theta}_2)$ are different; thus, to compare $P(G_1 \mid \vec{\theta}_1)$ and $P(G_1 \mid \vec{\theta}_2)$, the terms $P(\vec{\theta}_1)$ and $P(\vec{\theta}_2)$ would need to be computed exhaustively by integrating over all possible likelihoods. However, it is shown below that the constraint intrinsic in the formulation of the direction likelihoods significantly simplifies the computation of $P(\vec{\theta})$ and allows direct comparison in terms of likelihoods under two simple prior assumptions.
THE CONSTRAINT INTRINSIC IN LIKELIHOOD FORMULATION
Assuming each $P(\theta_i \mid \psi_i)$ has an identical probability density,

$$\int_{\text{all } \vec{\psi}} P(\vec{\theta} \mid \vec{\psi})\, d\vec{\psi} = \int_{\psi_1} \cdots \int_{\psi_n} \prod_{i=1}^{n} P(\theta_i \mid \psi_i)\, d\psi_1 \cdots d\psi_n = 1 \quad \text{(Eq. 4)}$$

since each $P(\theta_i \mid \psi_i)$ is itself a probability density function which sums to 1 over one cycle of $\theta_i$. This function is shift-invariant, i.e., the probability depends only on the difference between $\theta_i$ and $\psi_i$. Hence the integral over one cycle of $\psi_i$ also gives 1.
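The constraint of Eq. 4 can be checked numerically with a discretized, shift-invariant von Mises pmf (a small illustrative check; the values of m, n, kappa, and the measurements are arbitrary choices, not from the dissertation):

```python
import numpy as np
from itertools import product

m, n, kappa = 8, 3, 1.0
dirs = np.arange(m) * 2 * np.pi / m
# shift-invariant discretized von Mises pmf P(theta | psi); columns sum to 1
M = np.exp(kappa * np.cos(dirs[:, None] - dirs[None, :]))
M /= M.sum(axis=0, keepdims=True)

theta = [0, 2, 5]   # arbitrary fixed measurements (bin indices)
# sum over all m**n candidate psi vectors of prod_i P(theta_i | psi_i)
total = sum(np.prod([M[t, p] for t, p in zip(theta, psi)])
            for psi in product(range(m), repeat=n))
```

Because the pmf is shift-invariant, summing over one cycle of each $\psi_i$ gives 1 per factor, and the full sum over all candidate $\vec{\psi}$ equals 1.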
This constraint, that the sum of the likelihoods is always one, significantly simplifies the computation of $P(\vec{\theta})$ and allows direct comparison in terms of likelihoods under the following two simple prior assumptions.
COMPLETE POSTERIOR UNDER FLAT PRIOR ASSUMPTION
$$P(\vec{\theta}) = \sum_{\text{all } \vec{\psi}} P(\vec{\theta} \mid \vec{\psi})\, P(\vec{\psi}) = P(\vec{\psi}) \sum_{\text{all } \vec{\psi}} P(\vec{\theta} \mid \vec{\psi}) = P(\vec{\psi}) \quad \text{by Eq. 4}$$

Hence,

$$P(G_1 \mid \vec{\theta}) = \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi})$$

and

$$P(G_2 \mid \vec{\theta}) = 1 - P(G_1 \mid \vec{\theta}) = 1 - \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi})$$
Therefore, under the flat prior assumption, the posterior probability that the sampled motion signals result from a uniform motion is simply the sum of the likelihoods over all possible single motion directions. Note that $P(G_1 \mid \vec{\theta})$ is very similar to the output of the model of center estimation proposed in the previous chapter, except for the normalization part of the correlation calculation. Also, $P(G_2 \mid \vec{\theta})$ decreases as $P(G_1 \mid \vec{\theta})$ increases, so locating the point that gives the minimum output translates directly into locating the point that maximizes the probability of multiple motions being present.
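The flat-prior posterior can be sketched directly from the definitions above (an illustration with an assumed discretization of m directions and kappa = 1, not the dissertation's code):

```python
import numpy as np

m, kappa = 36, 1.0
dirs = np.arange(m) * 2 * np.pi / m
# P(theta_bin | psi_bin): discretized von Mises, normalized over theta bins
M = np.exp(kappa * np.cos(dirs[:, None] - dirs[None, :]))
M /= M.sum(axis=0, keepdims=True)

def p_uniform(theta_bins):
    """Flat-prior posterior that all measurements come from one direction:
    the sum over the m uniform candidates psi of prod_i P(theta_i | psi)."""
    like = np.prod(M[theta_bins, :], axis=0)   # likelihood of each candidate
    return like.sum()

rng = np.random.default_rng(2)
n = 20
# all measurements near direction bin 0 (uniform motion, small noise)
uniform = rng.normal(0, 2, n).round().astype(int) % m
# half the measurements opposite the other half (balanced motions)
opposed = np.concatenate([np.zeros(n // 2, int), np.full(n - n // 2, m // 2)])
```

A set of nearly parallel measurements yields a much larger posterior of uni-directional motion than a balanced, opposing set, which is exactly the quantity the motion-opponency model minimizes when searching for a center.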
COMPLETE POSTERIOR UNDER 2-CLASS PRIOR ASSUMPTION
Taking advantage of Eq. 4, an analytic expression for the posterior probability $P(G_1 \mid \vec{\theta})$ can be derived for a prior with an arbitrary choice of $P(G_1)$ and $P(G_2)$. Note that this still requires the assumption of constant probability within each group.

More specifically,
$$P(\vec{\theta}) = \sum_{\text{all } \vec{\psi}} P(\vec{\theta} \mid \vec{\psi})\, P(\vec{\psi}) = \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi})\, P(\vec{\psi}) + \sum_{\vec{\psi} \in G_2} P(\vec{\theta} \mid \vec{\psi})\, P(\vec{\psi})$$

$$= p_1 \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi}) + p_2 \sum_{\vec{\psi} \in G_2} P(\vec{\theta} \mid \vec{\psi})$$

where $p_1$ and $p_2$ are the probabilities of any individual member of $G_1$ and $G_2$, respectively. Because there are $m$ members in $G_1$ and $m^n - m$ members in $G_2$,

$$m\, p_1 = P(G_1), \quad \text{thus } p_1 = \frac{P(G_1)}{m}$$

and

$$(m^n - m)\, p_2 = P(G_2), \quad \text{thus } p_2 = \frac{P(G_2)}{m^n - m}$$
Also, by Eq. 4,

$$\sum_{\vec{\psi} \in G_2} P(\vec{\theta} \mid \vec{\psi}) = 1 - \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi})$$
Hence, denoting

$$L_1 = \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi}),$$

we have

$$P(\vec{\theta}) = p_1 \sum_{\vec{\psi} \in G_1} P(\vec{\theta} \mid \vec{\psi}) + p_2 \sum_{\vec{\psi} \in G_2} P(\vec{\theta} \mid \vec{\psi}) = \frac{P(G_1)}{m} L_1 + \frac{P(G_2)}{m^n - m}\,(1 - L_1)$$

and thus, finally,

$$P(G_1 \mid \vec{\theta}) = \frac{\dfrac{P(G_1)}{m} L_1}{\dfrac{P(G_1)}{m} L_1 + \dfrac{P(G_2)}{m^n - m}(1 - L_1)}$$

and

$$P(G_2 \mid \vec{\theta}) = \frac{\dfrac{P(G_2)}{m^n - m}(1 - L_1)}{\dfrac{P(G_1)}{m} L_1 + \dfrac{P(G_2)}{m^n - m}(1 - L_1)}$$
$P(G_1 \mid \vec{\theta})$ is plotted as a function of $L_1$ in Fig 4.2. The posterior $P(G_1 \mid \vec{\theta})$ is still monotonic in the likelihood, with a sigmoidal saturation. This means that with a prior that takes two arbitrary values $P(G_1)$ and $P(G_2)$, it is still possible to make inferences about $P(G_1 \mid \vec{\theta})$ by calculating only the likelihood $L_1$. The final output of the center estimation model can be interpreted as the sum of likelihoods $L_1$. And under the two simple prior assumptions, the posterior probability has a monotonic relationship with $L_1$. Thus, under those simple prior conditions, the proposed model of center estimation is optimal in locating points that are least likely to arise from uni-directional motion in any direction.
Fig 4.2. Relationship between $L_1$ (x axis) and $P(G_1 \mid \vec{\theta})$ (y axis) for three values of $P(G_1)$ (see legend). The number of measurements, $n$, is 78.
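The closed-form posterior above can be implemented directly. The sketch below uses smaller values of m and n than the figure's n = 78 so that the $m^n - m$ term stays comfortably within floating-point range (an illustrative choice, not the dissertation's code):

```python
import numpy as np

def posterior_uniform(L1, p_g1, m=8, n=5):
    """Two-class-prior posterior P(G1 | theta) as a function of the summed
    likelihood L1, using the closed form derived above."""
    a = (p_g1 / m) * L1                              # prior mass per G1 member
    b = ((1.0 - p_g1) / (m ** n - m)) * (1.0 - L1)   # prior mass per G2 member
    return a / (a + b)

# monotonic, sigmoid-like dependence on L1 for several priors P(G1)
L1 = np.linspace(1e-9, 1.0 - 1e-9, 1000)
posts = {p: posterior_uniform(L1, p) for p in (0.1, 0.5, 0.9)}
```

For every prior choice, the posterior increases strictly with $L_1$, which is the property that lets the model rank candidate locations by likelihood alone.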
In conclusion, the following was shown in this chapter. First, the introduction of a generative model $P(\theta_i \mid \psi_i)$ of aperture-constrained local motion measurements allows the interpretation of the motion-opponency computation in terms of optimal uni-directional motion estimation. Second, the constraint intrinsic in the direction likelihood formulation allows direct comparison in terms of likelihoods across measurement sets under simple prior assumptions.
Discussion
It was pointed out in chapter two that center estimation in human visual perception is fast. The time scale of performance saturation was comparable to that of human perception of local motion directions (Figs 2.2 and 2.4, [9, 45, 67]). Thus, the possibility was raised that the center may be estimated at an early stage in the brain where local motion computations are performed [14, 25, 38]. The mathematical formulation presented in this chapter provides a theoretical background for how such computations can be done in a single stage. More specifically, the likelihood of a set of local motion measurements arising from a uni-directional velocity can be estimated by obtaining a cosine-weighted sum. The output of such a computation can give the maximum-likelihood solution for the direction of the true motion. In turn, the sum of those likelihoods can indicate the posterior probability of the set of local motion measurements arising from any uni-directional motion. Thus, locating the points that are less likely to come from uni-directional motion can serve as a mechanism to locate regions where multiple motions are likely to be present, such as at the center of optic flows. Further implications of the results, as well as limitations and the relation of the framework to the existing literature, are discussed below.
INTERPRETATION OF NON UNI-DIRECTIONAL MOTION
It was shown that human center perception in the masked conditions could not be explained in terms of the best-fitting locations of rotation or expansion templates. Instead, the theoretical treatment in this chapter suggests that the motion-opponency model of center estimation is optimal in searching for points that are least likely to arise from uni-directional motion. The notion of "less likely to arise from uni-directional motion" is very flexible and can include multiple aspects of interest in the visual processing of optic flows. First, it can include the centers of spiral optic flow patterns, not only rotation and expansion. Spiral motion arises commonly when the observer moves along a curvilinear path [32, 33, 85]. The model can capture the center without the need for separate templates for such spiral flow fields. Second, kinetic boundaries defined by surfaces moving in different directions are another feature that may be captured by the same model. The probabilistic interpretation presented in this chapter provides a theoretical background for understanding how the motion-opponency model works, as well as for developing the model further for the practical application of locating centers and kinetic boundaries in natural movies. It will be interesting to see how the model behaves in detecting such entities.
VALIDITY OF INDEPENDENCE ASSUMPTION
Many works assume independence of local texture to simplify the likelihood computation [44, 66, 88, 89]. However, the assumption of texture independence, which allows the factorization of $P(\vec{\theta} \mid \vec{\psi})$ into individual terms, is, strictly speaking, not realistic for natural scenes. In a natural scene, textures that belong to one surface tend to be correlated, so independence does not generally hold. Many successful models of texture synthesis exploit the spatial dependence of natural texture [5, 6, 22, 31, 53]. But the texture independence assumption becomes more reasonable as the sampling range increases, as the range will start to include textures from multiple objects in the scene and the distribution of local orientations will become flatter.

The flat prior on the true motion directions also assumes an independent motion for each local motion measurement. Many studies suggest a preference for slower speeds [18, 66, 90] and spatially smooth velocity fields [29, 92, 93, 26]. More investigation is needed to know how incorporating such priors would affect the detectability of centers in the framework presented in this chapter.
VELOCITY LIKELIHOOD VS. DIRECTION LIKELIHOOD
The velocity likelihood of aperture-constrained measurements has been widely
used in Bayesian formulations of flow estimation [44, 66, 88, 89]. Velocity
measurements (i.e., local direction and speed) are more informative than
direction measurements alone: in noiseless conditions, two measurements of local
velocities suffice to recover the true uniform motion underlying them. Hence
most works estimate the direction and speed of flow velocities simultaneously.
In contrast, two measurements of local directions cannot determine a unique
direction of the underlying motion. However, the Bayesian formulation in terms
of only directional information, as presented in this chapter, lends a clearer
explanation of the motion-opponency computation that prevails in MT pattern
cells [14, 24, 56, 68]. It was also shown that the constraint intrinsic to the
direction-likelihood formulation leads to a simplified expression of the
posterior probability. Because the full velocity space includes the direction
space, it should be possible to derive similar results with a Bayesian
formulation of full velocity estimation.
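The claim that two noiseless velocity measurements determine the true uniform motion can be checked with a small intersection-of-constraints computation: each aperture-constrained measurement fixes only the velocity component along its edge normal. The vectors below are hypothetical values chosen for illustration.

```python
import numpy as np

def intersect_constraints(normals, normal_speeds):
    """Recover the 2-D velocity from aperture-constrained measurements.
    Each measurement i constrains only v . n_i = c_i (the velocity
    component along edge normal n_i); two non-parallel normals determine
    v uniquely, solved here by least squares."""
    A = np.asarray(normals, dtype=float)
    c = np.asarray(normal_speeds, dtype=float)
    v, *_ = np.linalg.lstsq(A, c, rcond=None)
    return v

v_true = np.array([2.0, 1.0])          # hypothetical uniform motion
n1 = np.array([1.0, 0.0])              # normal of a vertical edge
n2 = np.array([0.6, 0.8])              # normal of an oblique edge
v_est = intersect_constraints([n1, n2], [v_true @ n1, v_true @ n2])
```

A direction-only measurement, by contrast, confines the true velocity merely to a half-plane, and two half-planes still leave a cone of candidate directions rather than a unique one, consistent with the statement above.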
CHAPTER 5.
CONCLUSION
Summary of Findings
The temporal dependence of the perception of centers in expansion and rotation
optic flows was measured by pointing and detection tasks. We found that the
dynamics of center perception for rotation and expansion flows were fast. A
significant percept of center location was already developed by 100 ms, with the
exact time course depending on the spatial velocity profile of the motion
fields. Such time courses were faster than those observed for the measurement of
rates of expansion. We also demonstrated that such fast time courses could
explain an optical illusion related to a rolling-wheel-like motion in a
fronto-parallel plane. Mathematically, the brain would have enough information
to estimate the motion correctly by integrating multiple frames of such flows.
However, the subjects' perception of the rotational center was strongly biased
toward the center of the instantaneous flow even with extended viewing time.
Perception of the centers of expansion and rotation optic flows was then tested
under various masking conditions with short stimulus presentations (100 ms). We
observed different tendencies in error according to the spatial profiles of the
masking. When a rectangular mask was presented with its long side aligned with
the direction in which center eccentricities changed (Parallel masking
condition), mean perceived eccentricities were unaffected, with increased
variance in response. But when the rectangular mask was aligned so that center
eccentricities changed along its short side (Perpendicular masking condition), a
strong bias in mean perceived eccentricities was observed. This bias still
persisted with a small square mask that was contained in both the parallel and
perpendicular masks. Two possible strategies for estimating the centers of
expansion and rotation optic flows were tested to see whether they could account
for the observed human data. In the motion-opponency model, centers of such
flows were found by locating the point where motions were maximally balanced in
any direction. In the template model, centers were found by locating the point
that best fit expansion or rotation motion templates. The simplest
implementations of both strategies were simulated on the same experimental
setups under the various masking conditions. The results showed that the
motion-opponency model could reproduce important aspects of human perception
under masking, whereas the template model could not explain the observed human
data.
Lastly, a mathematical formulation was presented that yields a Bayesian
interpretation of the motion-opponency model of center estimation. First, the
motion-opponency computation can be thought of as computing the optimal
uni-directional motion under a generative model of aperture-constrained local
motion measurements. Second, the motion-opponency model of center estimation can
be optimal for locating points that are unlikely to arise from uni-directional
motion. The constraint intrinsic to the direction-likelihood formulation allows
a direct comparison of likelihoods between cross-measurements under simple prior
assumptions. The Bayesian formulation thus provides a theoretical background for
how local motion direction and center location can be estimated together through
the motion-opponency operation.
Implications / Possible Future Research
Yuille & Grzywacz proposed a Bayesian framework for decomposing complex optic
flows into simpler elementary flows [93]. In their original formulation, the two
parameters of expansion and rotation are estimated simultaneously. However, the
results in this dissertation indicate separate processing of the two parameters.
The temporal dependence of human perception of center location was faster than
that of the rate of expansion (Fig. 2.8). Furthermore, the time scale of
performance saturation was comparable to that of human perception of local
motion directions (Figs. 2.2 and 2.4; [9, 45, 67]). Human perception of centers
in masked expansion and rotation flows could also be well accounted for by the
model incorporating the motion-opponency mechanism found in brain area MT [14,
24, 56, 68]. Finally, the Bayesian formulation in chapter four provides a
theoretical background for how local motion computation and detection of centers
can be performed in a single stage. Taken together, these results raise the
possibility that centers of optic flows are initially estimated in brain area
MT. Furthermore, the motion-opponency model of center estimation predicts that
structures other than centers of expansion and rotation can be detected with the
same mechanism. For example, centers of spiral flows, as well as kinetic
boundaries defined by two differently moving surfaces, could also be encoded in
the activities of MT cells.
BIBLIOGRAPHY
1. Banks, M.S., et al., Estimating heading during real and simulated eye
movements. Vision Res, 1996. 36(3): p. 431-43.
2. Barraza, J.F. and N.M. Grzywacz, Local computation of angular velocity in
rotational visual motion. J Opt Soc Am A Opt Image Sci Vis, 2003. 20(7): p.
1382-90.
3. Beintema, J.A. and A.V. van den Berg, Heading detection using motion
templates and eye velocity gain fields. Vision Res, 1998. 38(14): p. 2155-79.
4. Bennett, B.M., et al., Structure from two orthographic views of rigid motion.
J Opt Soc Am A, 1989. 6(7): p. 1052-69.
5. Bergen, J.R. and E.H. Adelson, Visual Texture Segmentation Based on
Energy Measures. Journal of the Optical Society of America a-Optics Image
Science and Vision, 1986. 3(13): p. P99-P99.
6. Caelli, T. and B. Julesz, On perceptual analyzers underlying visual texture
discrimination: part I. Biol Cybern, 1978. 28(3): p. 167-75.
7. Crowell, J.A. and M.S. Banks, Perceiving heading with different retinal
regions and types of optic flow. Percept Psychophys, 1993. 53(3): p. 325-37.
8. Crowell, J.A., et al., Optic flow and heading judgements. Invest. Ophthalmol.
Visual Sci. Suppl., 1990(31): p. 522.
9. De Bruyn, B. and G.A. Orban, Human velocity and direction discrimination
measured with random dot patterns. Vision Res, 1988. 28(12): p. 1323-35.
10. Duffy, C.J. and R.H. Wurtz, Sensitivity of MST neurons to optic flow stimuli.
I. A continuum of response selectivity to large-field stimuli. J Neurophysiol,
1991. 65(6): p. 1329-45.
11. Duffy, C.J. and R.H. Wurtz, Sensitivity of MST neurons to optic flow stimuli.
II. Mechanisms of response selectivity revealed by small-field stimuli. J
Neurophysiol, 1991. 65(6): p. 1346-59.
12. Duffy, C.J. and R.H. Wurtz, An illusory transformation of optic flow fields.
Vision Res, 1993. 33(11): p. 1481-90.
13. Freeman, T.C. and M.G. Harris, Human sensitivity to expanding and rotating
motion: effects of complementary masking and directional structure. Vision
Res, 1992. 32(1): p. 81-7.
14. Garcia, J.O. and E.D. Grossman, Motion opponency and transparency in the
human middle temporal area. Eur J Neurosci, 2009. 30(6): p. 1172-82.
15. Gibson, J.J., Perception of the Visual World. 1950: Boston: Houghton Mifflin.
16. Graziano, M.S., R.A. Andersen, and R.J. Snowden, Tuning of MST neurons
to spiral motions. J Neurosci, 1994. 14(1): p. 54-67.
17. Grzywacz, N.M., J.M. Harris, and F.R. Amthor, Computational and neural
constraints for the measurement of local visual motion, in Visual Detection
of Motion, A.T. Smith and R.J. Snowden, Editors. 1994, Academic Press: San
Diego, California. p. 19-50.
18. Grzywacz, N.M. and A.L. Yuille, Massively parallel implementations of
theories for apparent motion. Spat Vis, 1988. 3(1): p. 15-44.
19. Grzywacz, N.M. and A.L. Yuille, A model for the estimate of local image
velocity by cells in the visual cortex. Proc R Soc Lond B Biol Sci, 1990.
239(1295): p. 129-61.
20. Gu, Y., et al., Visual and nonvisual contributions to three-dimensional
heading selectivity in the medial superior temporal area. J Neurosci, 2006.
26(1): p. 73-85.
21. Harvey, B.M. and O.J. Braddick, Psychophysical differences in processing of
global motion and form detection and position discrimination. J Vis, 2008.
8(7): 14, p. 1-18.
22. Hassner, M. and J. Sklansky, The Use of Markov Random-Fields as Models
of Texture. Computer Graphics and Image Processing, 1980. 12(4): p. 357-
370.
23. Hatsopoulos, N.G. and W.H. Warren, Visual Navigation with a Neural
Network. Neural Networks, 1991. 4(3): p. 303-317.
24. Heeger, D.J., et al., Motion opponency in visual cortex. Journal of
Neuroscience, 1999. 19(16): p. 7162-74.
25. Heeger, D.J., et al., Motion opponency in visual cortex. J Neurosci, 1999.
19(16): p. 7162-74.
26. Hildreth, E.C., The measurement of visual motion. ACM distinguished
dissertations. 1984, Cambridge, Mass.: MIT Press. 241 p.
27. Hildreth, E.C., Recovering heading for visually-guided navigation. Vision
Res, 1992. 32(6): p. 1177-92.
28. Hooge, I.T., J.A. Beintema, and A.V. van den Berg, Visual search of heading
direction. Exp Brain Res, 1999. 129(4): p. 615-28.
29. Horn, B.K.P. and B.G. Schunck, Determining Optical-Flow. Proceedings of
the Society of Photo-Optical Instrumentation Engineers, 1981. 281: p. 319-
331.
30. Johnston, I.R., G.R. White, and R.W. Cumming, The role of optical
expansion patterns in locomotor control. Am J Psychol, 1973. 86(2): p. 311-
24.
31. Julesz, B., et al., Inability of Humans to Discriminate between Visual
Textures That Agree in Second-Order Statistics - Revisited. Perception, 1973.
2(4): p. 391-405.
32. Kim, N.G., B.R. Fajen, and M.T. Turvey, Perceiving circular heading in
noncanonical flow fields. J Exp Psychol Hum Percept Perform, 2000. 26(1):
p. 31-56.
33. Kim, N.G. and M.T. Turvey, Visually perceiving heading on circular and
elliptical paths. J Exp Psychol Hum Percept Perform, 1998. 24(6): p. 1690-
704.
34. Koenderink, J.J., Optic flow. Vision Res, 1986. 26(1): p. 161-79.
35. Koenderink, J.J. and A.J. van Doorn, Invariant properties of the motion
parallax field due to the movement of rigid bodies relative to an observer.
Opt. Acta, 1975(22): p. 773-791.
36. Koenderink, J.J. and A.J. van Doorn, Local structure of movement parallax of
the plane. J. Opt. Soc. Am., 1976(66): p. 717-723.
37. Koenderink, J.J. and A.J. van Doorn, Affine structure from motion. J Opt Soc
Am A, 1991. 8(2): p. 377-85.
38. Krekelberg, B. and T.D. Albright, Motion mechanisms in macaque MT. J
Neurophysiol, 2005. 93(5): p. 2908-21.
39. Lagae, L., et al., Responses of macaque STS neurons to optic flow
components: a comparison of areas MT and MST. J Neurophysiol, 1994.
71(5): p. 1597-626.
40. Lee, D.N., The optic flow field: the foundation of vision. Philos Trans R Soc
Lond B Biol Sci, 1980. 290(1038): p. 169-79.
41. Longuet-Higgins, H.C. and K. Prazdny, The interpretation of a moving
retinal image. Proc R Soc Lond B Biol Sci, 1980. 208(1173): p. 385-97.
42. Lu, Z.L. and G. Sperling, The functional architecture of human visual motion
perception. Vision Res, 1995. 35(19): p. 2697-722.
43. Lu, Z.L. and G. Sperling, Three-systems theory of human visual motion
perception: review and update. J Opt Soc Am A Opt Image Sci Vis, 2001.
18(9): p. 2331-70.
44. Lucas, B.D. and T. Kanade. An iterative image registration technique with an
application to stereo vision. in Imaging Understanding Workshop. 1981.
45. McKee, S.P. and L. Welch, Sequential recruitment in the discrimination of
velocity. J Opt Soc Am A, 1985. 2(2): p. 243-51.
46. Meese, T.S. and M.G. Harris, Independent detectors for expansion and
rotation, and for orthogonal components of deformation. Perception, 2001.
30(10): p. 1189-202.
47. Morrone, M.C., D.C. Burr, and L.M. Vaina, Two stages of visual processing
for radial and circular motion. Nature, 1995. 376(6540): p. 507-9.
48. Orban, G.A., et al., First-order analysis of optical flow in monkey brain. Proc
Natl Acad Sci U S A, 1992. 89(7): p. 2595-9.
49. Paolini, M., et al., Responses to continuously changing optic flow in area
MST. J Neurophysiol, 2000. 84(2): p. 730-43.
50. Perrone, J.A., Model for the computation of self-motion in biological systems.
J Opt Soc Am A, 1992. 9(2): p. 177-94.
51. Perrone, J.A. and L.S. Stone, A model of self-motion estimation within
primate extrastriate visual cortex. Vision Res, 1994. 34(21): p. 2917-38.
52. Pollick, F.E., The perception of motion and structure in structure-from-
motion: comparisons of affine and Euclidean formulations. Vision Res, 1997.
37(4): p. 447-66.
53. Portilla, J. and E.P. Simoncelli, A parametric texture model based on joint
statistics of complex wavelet coefficients. International Journal of Computer
Vision, 2000. 40(1): p. 49-71.
54. Potter, M.C., Meaning in visual search. Science, 1975. 187(4180): p. 965-6.
55. Potter, M.C. and E.I. Levy, Recognition memory for a rapid sequence of
pictures. J Exp Psychol, 1969. 81(1): p. 10-5.
56. Qian, N. and R.A. Andersen, Transparent Motion Perception as Detection of
Unbalanced Motion Signals .2. Physiology. Journal of Neuroscience, 1994.
14(12): p. 7367-7380.
57. Regan, D., Visual processing of four kinds of relative motion. Vision Res,
1986. 26(1): p. 127-45.
58. Regan, D. and K.I. Beverley, Visual responses to vorticity and the neural
analysis of optic flow. J Opt Soc Am A, 1985. 2(2): p. 280-3.
59. Royden, C.S., Mathematical analysis of motion-opponent mechanisms used
in the determination of heading and depth. J Opt Soc Am A Opt Image Sci
Vis, 1997. 14(9): p. 2128-43.
60. Royden, C.S., Computing heading in the presence of moving objects: a model
that uses motion-opponent operators. Vision Res, 2002. 42(28): p. 3043-58.
61. Royden, C.S., J.A. Crowell, and M.S. Banks, Estimating heading during eye
movements. Vision Res, 1994. 34(23): p. 3197-214.
62. Royden, C.S. and E.C. Hildreth, Human heading judgments in the presence
of moving objects. Percept Psychophys, 1996. 58(6): p. 836-56.
63. Rust, N.C., et al., How MT cells analyze the motion of visual patterns. Nat
Neurosci, 2006. 9(11): p. 1421-31.
64. Saito, H., et al., Integration of direction signals of image motion in the
superior temporal sulcus of the macaque monkey. Journal of Neuroscience,
1986. 6(1): p. 145-57.
65. Saito, H., et al., Integration of direction signals of image motion in the
superior temporal sulcus of the macaque monkey. J Neurosci, 1986. 6(1): p.
145-57.
66. Simoncelli, E.P., Local analysis of visual motion, in The Visual
Neurosciences, L.M. Chalupa and J.S. Werner, Editors. 2003, MIT Press. p.
523-530.
67. Snowden, R.J. and O.J. Braddick, The temporal integration and resolution of
velocity signals. Vision Res, 1991. 31(5): p. 907-14.
68. Snowden, R.J., et al., The response of area MT and V1 neurons to
transparent motion. Journal of Neuroscience, 1991. 11(9): p. 2768-85.
69. Tadin, D., J.S. Lappin, and R. Blake, Fine temporal properties of center-
surround interactions in motion revealed by reverse correlation. J Neurosci,
2006. 26(10): p. 2614-22.
70. Tanaka, K., Y. Fukada, and H.A. Saito, Underlying mechanisms of the
response specificity of expansion/contraction and rotation cells in the dorsal
part of the medial superior temporal area of the macaque monkey. J
Neurophysiol, 1989. 62(3): p. 642-56.
71. Tanaka, K. and H. Saito, Analysis of motion of the visual field by direction,
expansion/contraction, and rotation cells clustered in the dorsal part of the
medial superior temporal area of the macaque monkey. J Neurophysiol, 1989.
62(3): p. 626-41.
72. Te Pas, S.F., A.M. Kappers, and J.J. Koenderink, Detection of first-order
structure in optic flow fields. Vision Res, 1996. 36(2): p. 259-70.
73. te Pas, S.F., A.M.L. Kappers, and J.J. Koenderink, Locating the singular
point in first-order optical flow fields. J. Exp. Psychol. Human Percept.
Perform., 1998(24): p. 1415-1430.
74. Todd, J.T. and P. Bressan, The perception of 3-dimensional affine structure
from minimal apparent motion sequences. Perception & Psychophysics, 1990.
48(5): p. 419-30.
75. Tse, P.U. and P.J. Hsieh, The infinite regress illusion reveals faulty
integration of local and global motion signals. Vision Res, 2006. 46(22): p.
3881-5.
76. Ullman, S., The interpretation of visual motion. The MIT Press series in
artificial intelligence. 1979, Cambridge, Mass.: MIT Press. 229 p.
77. Ullman, S., Recent computational studies in the interpretation of structure
from motion, in Human and machine vision, J. Beck and A. Rosenfeld,
Editors. 1983, Academic Press: New York. p. 459-480.
78. van den Berg, A.V., Predicting the present direction of heading. Vision Res,
1999. 39(21): p. 3608-20.
79. van den Berg, A.V., Human ego-motion perception. Int Rev Neurobiol, 2000.
44: p. 3-25.
80. Verri, A., M. Straforini, and V. Torre, Computational aspects of motion
perception in natural and artificial vision systems. Philos Trans R Soc Lond
B Biol Sci, 1992. 337(1282): p. 429-43.
81. Warren, R., The perception of egomotion. J Exp Psychol Hum Percept
Perform, 1976. 2(3): p. 448-56.
82. Warren, W.H., Jr., Optic flow, in The Visual Neurosciences, L. Chalupa and J.
Werner, Editors. 2004, MIT Press: Cambridge, MA. p. 1247-1259.
83. Warren, W.H., Jr., et al., On the sufficiency of the velocity field for perception
of heading. Biol Cybern, 1991. 65(5): p. 311-20.
84. Warren, W.H., Jr. and D.J. Hannon, Eye movements and optical flow. J Opt
Soc Am A, 1990. 7(1): p. 160-9.
85. Warren, W.H., Jr., et al., Perception of circular heading from optical flow. J
Exp Psychol Hum Percept Perform, 1991. 17(1): p. 28-43.
86. Warren, W.H., Jr., M.W. Morris, and M. Kalish, Perception of translational
heading from optical flow. J Exp Psychol Hum Percept Perform, 1988. 14(4):
p. 646-60.
87. Warren, W.H. and K.J. Kurtz, The role of central and peripheral vision in
perceiving the direction of self-motion. Percept Psychophys, 1992. 51(5): p.
443-54.
88. Weiss, Y. and D.J. Fleet, Velocity likelihoods from generative models.
Investigative Ophthalmology & Visual Science, 2000. 41(4): p. S795-S795.
89. Weiss, Y. and D.J. Fleet, Velocity likelihoods in biological and machine
vision. Probabilistic Models of the Brain: Perception and Neural Function,
2002: p. 77-96.
90. Weiss, Y., E.P. Simoncelli, and E.H. Adelson, Motion illusions as optimal
percepts. Nature Neuroscience, 2002. 5(6): p. 598-604.
91. Wurfel, J.D., J.F. Barraza, and N.M. Grzywacz, Measurement of rate of
expansion in the perception of radial motion. Vision Res, 2005. 45(21): p.
2740-51.
92. Yuille, A.L. and N.M. Grzywacz, A Computational Theory for the Perception
of Coherent Visual-Motion. Nature, 1988. 333(6168): p. 71-74.
93. Yuille, A.L. and N.M. Grzywacz, A theoretical framework for visual motion,
in High-Level Motion Processing - Computational, Neurobiological, and
Psychological Perspectives, T. Watanabe, Editor. 1998, MIT Press:
Cambridge, MA. p. 187-211.
APPENDIX: Mathematical Details of the Two Center Estimation Models

For a given location x, we pooled the motion signals within a circular region
centered at x with a fixed radius (5° in this study). Because the stimuli all
had the same constant speed, only the directions of the pooled motion signals
were considered. Let \vec{v}_s be the vector formed by alternately stacking the
cosines and sines of all the pooled directions, i.e.,

    \vec{v}_s = (\cos\theta_1, \sin\theta_1, \ldots, \cos\theta_n, \sin\theta_n)^\top,

assuming that a total of n motion signals were pooled for the region. The vector
\vec{v}_T representing a translational motion of direction \theta_T is simply

    \vec{v}_T = (\cos\theta_T, \sin\theta_T, \ldots, \cos\theta_T, \sin\theta_T)^\top.

The correlation between the signal and the translation template is given by

    \frac{\vec{v}_s \cdot \vec{v}_T}{|\vec{v}_s|\,|\vec{v}_T|} = \frac{\vec{v}_s \cdot \vec{v}_T}{n},

since |\vec{v}_s| = |\vec{v}_T| = \sqrt{n}. Defining a rectifying function R(x) as

    R(x) = \begin{cases} 0 & \text{if } x < 0 \\ x & \text{if } x \ge 0, \end{cases}

the output of model A at position x is the sum of the rectified correlations
over all possible translational directions, i.e.,

    \text{Output of model A at } x = \sum_{\theta_T} R\!\left( \frac{\vec{v}_s \cdot \vec{v}_T}{n} \right),

and the estimated center position is the position that minimizes this output:

    \hat{x} = \arg\min_x \sum_{\theta_T} R\!\left( \frac{\vec{v}_s \cdot \vec{v}_T}{n} \right).

Likewise, the vector \vec{v}_E representing an expanding motion centered at x is

    \vec{v}_E = (\cos\varphi_1, \sin\varphi_1, \ldots, \cos\varphi_n, \sin\varphi_n)^\top,

where \varphi_i, i = 1, \ldots, n, is the angle of the vector from x to the
position of the i-th motion signal. The correlation between the signal and the
expansion template is then

    \frac{\vec{v}_s \cdot \vec{v}_E}{|\vec{v}_s|\,|\vec{v}_E|} = \frac{\vec{v}_s \cdot \vec{v}_E}{n},

and the estimated center position is the position that maximizes this correlation:

    \hat{x} = \arg\max_x \frac{\vec{v}_s \cdot \vec{v}_E}{n}.
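The two estimation rules above can be exercised numerically. The sketch below builds a noiseless expansion flow on a grid and shows that the opponency output (model A) is minimized, and the expansion-template correlation maximized, at the true center. The grid spacing, pooling radius, and number of template directions are illustrative choices, not the study's parameters.

```python
import numpy as np

def pooled_dirs(x, points, directions, radius=2.0):
    # Pool the motion directions measured within `radius` of candidate x
    return directions[np.linalg.norm(points - x, axis=1) <= radius]

def model_a_output(x, points, directions, n_dirs=36):
    """Sum over translation directions theta_T of R((v_s . v_T)/n)."""
    th = pooled_dirs(x, points, directions)
    thetas_T = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
    corr = np.cos(th[:, None] - thetas_T[None, :]).mean(axis=0)  # (v_s.v_T)/n
    return np.maximum(corr, 0.0).sum()                           # rectify, sum

def template_score(x, points, directions, radius=2.0):
    """Normalized correlation (v_s . v_E)/n with an expansion template at x."""
    keep = np.linalg.norm(points - x, axis=1) <= radius
    phi = np.arctan2(points[keep, 1] - x[1], points[keep, 0] - x[0])
    return np.cos(directions[keep] - phi).mean()

# Noiseless expansion flow: directions point away from `center`
g = np.arange(-4.0, 4.01, 0.25)
points = np.array([(px, py) for px in g for py in g])
center = np.array([0.5, 1.0])
points = points[np.linalg.norm(points - center, axis=1) > 1e-9]  # drop singular point
directions = np.arctan2(points[:, 1] - center[1], points[:, 0] - center[0])

# Coarse grid search over candidate center positions
cands = [np.array([cx, cy]) for cx in np.arange(-2.0, 2.01, 0.5)
                            for cy in np.arange(-2.0, 2.01, 0.5)]
best_a = min(cands, key=lambda c: model_a_output(c, points, directions))
best_t = max(cands, key=lambda c: template_score(c, points, directions))
```

In the spirit of the masking simulations, occluded samples would simply be excluded from the pooled set; the same functions then apply unchanged to masked flows.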
ABSTRACT

Optic flow contains rich information about the relative motion between an observer and the world, and about the three-dimensional layout of the environment. Elementary flows such as expansion and rotation have been proposed as bases for the analysis of complex optic flow in the brain [93]. In this dissertation, various aspects of human visual perception of the centers of expansion and rotation optic flows were investigated using psychophysical probes and computational modeling.

In the first section, the temporal dependence of the perception of centers in expansion and rotation optic flows was measured by pointing and detection tasks. The dynamics of center perception for rotation and expansion flows were fast: a significant percept of center location was already developed by 100 ms, with the exact time course depending on the spatial velocity profile of the motion fields. Such time courses were faster than those observed for the measurement of rates of expansion, and could explain an optical illusion related to a rolling-wheel-like motion in a fronto-parallel plane. Mathematically, the brain would have enough information to estimate the motion correctly by integrating multiple frames of such flows. However, the subjects' perception of the rotational center was strongly biased toward the center of the instantaneous flow even with extended viewing time.

In the second section, perception of the centers of expansion and rotation optic flows was tested under various masking conditions with short stimulus presentations (100 ms). Different tendencies in error were observed according to the spatial profiles of the masking. When a rectangular mask was presented with its long side aligned with the direction in which center eccentricities changed (Parallel masking condition), mean perceived eccentricities were unaffected, with increased variance in response. But when the rectangular mask was aligned so that center eccentricities changed along its short side (Perpendicular masking condition), a strong bias in mean perceived eccentricities was observed. This bias still persisted with a small square mask that was contained in both the parallel and perpendicular masks. Two possible strategies for estimating the centers of such flows were tested to see whether they could account for the observed human data. In the motion-opponency model, centers were found by locating the point where motions were maximally balanced in any direction. In the template model, centers were found by locating the point that best fit expansion or rotation motion templates. The simplest implementations of both strategies were simulated on the same experimental setups under the various masking conditions. The motion-opponency model reproduced important aspects of human perception under masking, whereas the template model could not explain the observed human data.

In the third section, a mathematical formulation was presented that provides a theoretical background for how local motion direction and center location can be estimated together through the motion-opponency operation. First, the motion-opponency computation can be thought of as computing the optimal uni-directional motion under a generative model of aperture-constrained local motion measurements. Second, the motion-opponency model of center estimation can be optimal for locating points that are unlikely to arise from uni-directional motion. The constraint intrinsic to the direction-likelihood formulation allows a direct comparison of likelihoods between cross-measurements under simple prior assumptions. Collectively, the results suggest an important contribution of brain area MT to the estimation of the centers of optic flows.
Asset Metadata
Creator: Lee, Junkwan
Title: Human visual perception of centers of optic flows
School: Andrew and Erna Viterbi School of Engineering, University of Southern California
Degree: Doctor of Philosophy, Biomedical Engineering
Defense Date: 05/01/2012
Publication Date: 05/01/2013
Publisher: University of Southern California
Advisor: Grzywacz, Norberto M. (committee chair); Mel, Bartlett W. (committee member); Tjan, Bosco S. (committee member)
Tags: expansion; optic flow; rotation; visual motion
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-20032