Computational modeling and utilization of attention, surprise and attention gating [slides]
(USC Thesis Other) 

Transcript (if available)
Content
Explaining Observer Performance in Dynamic Vision Tasks using Bayesian Surprise

There are many models of feature-based attention, but in general they lack a temporal component to detect unique feature changes across time.

Treisman, A.M., & Gelade, G. (1980). Cognitive Psychology, 12 (1), 97-136.
Koch, C., & Ullman, S. (1985). Human Neurobiology, 4 (4), 219-227.
Itti, L., Koch, C., & Niebur, E. (1998). IEEE PAMI 20 (11), 1254-1259.
See also: Shiffrin, R.M., & Schneider, W. (1977). Psychological Review, 84 (2), 127-190.

http://www.nerd-cam.com

[Figure: Koch and Ullman; Itti and Koch]

2AFC Task. Did you see an Animal? (or Transportation Method). Two separate experiments.
20 Hz exposure. Natural scene distracters.

An attention gate may predict what is attended to in a dynamic scene.
Can we create a model of an attentional gate that extends the notion of feature integration into time?

Sperling, G., & Weichselgartner, E. (1995). Psychological Review, 102 (3), 503-532.
Shih, S.-I., & Sperling, G. (2002). Psychological Review, 109 (2), 260-305.

We can predict, using Bayesian Surprise, the activity of an attention gate.
The attention gate is a triage system for controlling what visual information from targets is able to pass initial processing to higher visual centers.
A triage system for visual information assumes an information bottleneck, which necessitates a choice of information selection, since not all information can be processed by increasingly complex visual systems.
Hypothesis: A large part of RSVP performance can be explained in terms of an attentional gate (Sperling et al., Cave, Chun & Potter, etc.).
What gets past the attention gate is perceivable.
Things which are more interesting or more important, perhaps more informative, should have a better ability to control the attention gate.
The attention gate triages information from RSVP.
Can we create and test such a mechanism?


Sperling, G., & Weichselgartner, E. (1995). Psychological Review, 102 (3), 503-532.
Cave, K.R. (1999). Psychological Research, 62, 182-194.
Chun, M.M., & Potter, M.C. (1995). Journal of Exp. Psychology: Human Perc. and Perf., 21, 109-127.

Given an input stream of images, what is truly new and informative?
Information outliers are surprising.
We should be able to resist garbage information like 1/f noise.
Information is based on image features (Treisman & Gelade, Koch & Ullman, Itti & Koch).
What is informative should be better able to pass the Attention Gate.



Blue/Yellow Color Opponent

  Shannon (1948): D = a dataset; 𝒟 = the set of all possible datasets.
  Problems:
  Fine for communication, but what about semantic/subjective aspects?
  Information vs. value, importance, relevance, or surprise.
  I is informative compared with what?
  White snow paradox:
    TV news, sports, music, action movies, etc.: 0.3 MByte/s (640x480, MPEG4, 46,000 frames)
    Greyscale snow: 5.0 MByte/s
Shift emphasis:
  from objective probability of occurrence of data
  …to effects of data onto subjective beliefs of observers.
[Figure: prior P(M) over models M = MTV, CNN, FOX, BBC, ..., Snow]
  Family M of observer-dependent models or hypotheses about the world.
  Observer beliefs: the prior P(M) over the models M.
  Bayesian foundation of probability: data is what changes a prior into a posterior: P(M|D) = P(D|M) P(M) / P(D).
[Figure sequence: prior P(M) and posterior P(M|D) over models M = MTV, CNN, FOX, BBC, ..., Snow; surprise is the change from P(M) to P(M|D)]
Beliefs stabilize, prior and posterior become identical, and additional snow frames carry no surprise.

  Surprise: Surprise = d[P(M), P(M|D)], using, e.g., the Kullback-Leibler (KL) distance for d.
  Shannon’s Information:
  Moral: We want relative information rather than
absolute information.
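
The snow example can be sketched numerically. The following is a minimal illustration, not the thesis implementation: it assumes a discrete family of five channel models, a uniform prior, and made-up likelihood values for a single snow frame. Surprise is taken as the KL divergence from prior to posterior, KL(P(M|D) || P(M)), the direction used by Itti & Baldi (2006). The function names `bayes_update` and `surprise` are hypothetical.

```python
import math

def bayes_update(prior, likelihood):
    """Posterior P(M|D) from a prior P(M) and per-model likelihoods P(D|M)."""
    unnormalized = {m: prior[m] * likelihood[m] for m in prior}
    evidence = sum(unnormalized.values())  # P(D)
    return {m: p / evidence for m, p in unnormalized.items()}

def surprise(prior, posterior):
    """KL(posterior || prior) in bits: how far the data moved the beliefs."""
    return sum(q * math.log2(q / prior[m])
               for m, q in posterior.items() if q > 0)

# Hypothetical observer: uniform prior over five channel models.
models = ["MTV", "CNN", "FOX", "BBC", "Snow"]
beliefs = {m: 1.0 / len(models) for m in models}

# Assumed likelihood of one snow frame under each model (illustrative only).
snow_frame = {"MTV": 0.05, "CNN": 0.05, "FOX": 0.05, "BBC": 0.05, "Snow": 0.8}

history = []
for _ in range(6):  # six consecutive snow frames
    posterior = bayes_update(beliefs, snow_frame)
    history.append(surprise(beliefs, posterior))
    beliefs = posterior  # today's posterior is tomorrow's prior

print([round(s, 4) for s in history])
```

Under these assumed numbers the first snow frame moves the beliefs the most, and each later frame moves them less, even though every snow frame carries the same amount of raw data.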

We want to start to quantify surprise by models of something directly measurable such as image features. This is an easy way to quantify an image.
Models of image features are the expected feature response given past feature measurements.
Approximate P(M) and P(M|D) with a probability distribution.

Feature Response - Models
Use a Gamma probability distribution over feature responses, since we assume Poisson noise (neural spike trains) and a response range from zero to infinity.
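
A minimal sketch of how a Gamma belief over one feature response could be updated and scored for surprise. It assumes the Gamma-Poisson update with a forgetting factor described by Itti & Baldi (2006); the decay value 0.7, the starting parameters, and the input sequence are illustrative assumptions, and `digamma`, `kl_gamma`, and `update` are hypothetical helper names rather than code from the thesis.

```python
import math

def digamma(x):
    """Digamma function for x > 0: recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def kl_gamma(a_post, b_post, a_prior, b_prior):
    """KL(Gamma(a_post, b_post) || Gamma(a_prior, b_prior)) in nats.

    Shape/rate parameterization; this is the standard closed form.
    """
    return ((a_post - a_prior) * digamma(a_post)
            - math.lgamma(a_post) + math.lgamma(a_prior)
            + a_prior * (math.log(b_post) - math.log(b_prior))
            + a_post * (b_prior - b_post) / b_post)

def update(alpha, beta, response, decay=0.7):
    """Assumed belief update: decay the old evidence, then add the new response."""
    return decay * alpha + response, decay * beta + 1.0

# Illustrative input: a steady feature response with one sudden jump.
responses = [5.0] * 20 + [50.0] + [5.0] * 5

alpha, beta = 1.0, 1.0  # assumed starting belief
surprises = []
for r in responses:
    new_alpha, new_beta = update(alpha, beta, r)
    surprises.append(kl_gamma(new_alpha, new_beta, alpha, beta))
    alpha, beta = new_alpha, new_beta

print([round(s, 3) for s in surprises])
```

With a constant input the belief settles and surprise falls toward zero; the single jump at frame 20 produces a large spike. The forgetting factor keeps the belief from becoming so confident that nothing can surprise it again.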
So… How can Updating our Beliefs Surprise us?

Create models over all features and locations in images, then combine into Surprise Maps.
Recall the RSVP task you saw earlier. Can we make predictions about how observers will perform on it?

Run 8 subjects on 500 RSVP sequences with Animal Targets and natural distracters (see Example Below)
Some are easy and subjects always spot the target.
Some are hard and subjects always miss the target.
Can surprise tell us why?

Can we make RSVP sequences based on surprise statistics?
Idea: Change order of images to block the target with images which are more Surprising than it.
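
The reordering idea can be sketched as follows. This is a hypothetical illustration, not the procedure used in the cited experiments: it assumes each frame already has a scalar surprise score and simply moves the two most surprising distracters to the frames immediately before and after the target. At 20 Hz, one frame is 50 ms. The function name `flank_target` and all scores are made up.

```python
def flank_target(frames, target_index, surprise):
    """Reorder frames so the two most surprising distracters flank the target.

    frames: list of frame labels; surprise: one score per frame (same order).
    The target keeps its position; other distracters keep their relative order.
    """
    if not 1 <= target_index <= len(frames) - 2:
        raise ValueError("target needs a frame on each side")
    distracter_ids = [i for i in range(len(frames)) if i != target_index]
    ranked = sorted(distracter_ids, key=lambda i: surprise[i], reverse=True)
    before, after = ranked[0], ranked[1]
    rest = [i for i in distracter_ids if i not in (before, after)]
    order = (rest[:target_index - 1]
             + [before, target_index, after]
             + rest[target_index - 1:])
    return [frames[i] for i in order]

# Illustrative sequence with made-up surprise scores.
frames = ["d0", "d1", "animal", "d3", "d4", "d5"]
scores = [0.9, 0.2, 0.5, 0.1, 0.7, 0.3]
harder = flank_target(frames, 2, scores)
print(harder)
```

The sketch only changes frame order, so the image content of the sequence stays the same and any change in detection would come from the ordering alone.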

Einhäuser, W., Mundhenk, T.N., Baldi, P., Koch, C., & Itti, L. (2007). Journal of Vision, 7 (10), 1-13.

Can we make “Easy” RSVP sequences hard based on surprise statistics?


The M-W Pattern Emerges
+/- 50 ms is critical for Surprise masking in RSVP
W: Easy
M: Hard

Mundhenk, T.N., Einhäuser, W., & Itti, L. (2009). Vision Research, In Press


For example, Chun and Potter (1995)
Can we recreate the attention gate to reveal what image contents are detectable?
Easy target locations should have a higher likelihood of passing the attention gate than hard target locations.
Build on the idea that more surprise for a target means more pass through.
Note: We expand the data set to include transportation targets in addition to the animal targets.

If the attention gate is valid, then the surprise attention gate should overlap more with easy targets than with hard targets.
Thus, targets are easier because more image information gets past the attention gate.
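
One simple way to put a number on "overlap" is the fraction of target pixels at which the gate is open. The sketch below is a hypothetical score for illustration only; it is not the Metric of Attention Gating (MAG) described in the thesis, and the masks are made-up 3x3 grids.

```python
def gate_overlap(gate, target_mask):
    """Mean gate value over target pixels (gate values assumed in [0, 1])."""
    inside = [g
              for gate_row, mask_row in zip(gate, target_mask)
              for g, is_target in zip(gate_row, mask_row)
              if is_target]
    if not inside:
        raise ValueError("target mask is empty")
    return sum(inside) / len(inside)

# Made-up example: the target occupies the upper-right 2x2 block.
target_mask = [[0, 1, 1],
               [0, 1, 1],
               [0, 0, 0]]

# Assumed gates: one mostly on the target, one mostly elsewhere.
easy_gate = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
hard_gate = [[1, 0, 0],
             [1, 0, 1],
             [1, 1, 0]]

easy_score = gate_overlap(easy_gate, target_mask)
hard_score = gate_overlap(hard_gate, target_mask)
print(easy_score, hard_score)
```

Under the hypothesis stated above, a score like this would be expected to be higher on average for easy targets than for hard ones.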





Surprise seems to reveal the activity of the attentional gate for quick automatic attention.
Surprise at the feature level supports a two-stage model of attention and agrees with data on lag sparing and attentional blink in RSVP.
We have greater insight into what parts of a dynamic scene can be perceived by observers.

The current Attention Gate model of Bayesian Surprise works very well with different types of targets.
The model does need some more work and testing.
The model has some neat properties, such as giving an explanation for split attention effects.
Surprise may unify saliency maps with automatic attention gating.

•  Itti, L., & Baldi, P. (2006). Bayesian Surprise attracts human attention. Advances in Neural Information Processing Systems (NIPS), 19 (pp. 547-554): MIT Press.
•  Einhäuser, W., Mundhenk, T.N., Baldi, P., Koch, C., & Itti, L. (2007). A bottom-up model of spatial attention predicts human error patterns in rapid scene recognition. Journal of Vision, 7 (10), 1-13.
•  Mundhenk, T.N., Einhäuser, W., & Itti, L. (2009). Automatic Computation of an Image’s Statistical Surprise Predicts Performance of Human Observers on a Natural Image Detection Task. Vision Research, In Press

•  http://www.mundhenk.com/thesis/ 
Asset Metadata
Creator Mundhenk, Terrell Nathan (author) 
Core Title Computational modeling and utilization of attention, surprise and attention gating [slides] 
Contributor Electronically uploaded by the author (provenance) 
School Andrew and Erna Viterbi School of Engineering 
Degree Doctor of Philosophy 
Degree Program Computer Science 
Degree Conferral Date 2009-08 
Publication Date 08/04/2009 
Defense Date 04/21/2009 
Publisher University of Southern California (original), University of Southern California. Libraries (digital) 
Tag attention,attentional blink,bayes,biologically inspired,CINNIC,computation,contour,detection,gating,H2SV,Human Performance,iLab,image processing,information,Integration,Itti,Koch,MAG,masking,Nerd-Cam,Neuromorphic Vision Toolkit,OAI-PMH Harvest,RSVP,saliency,spot light,statistics,Surprise,tracking,vision,visual cortex,visual saliency 
Format 36 pages (extent) 
Language English
Advisor Itti, Laurent (committee chair),  Arbib, Michael A. (committee member),  Biederman, Irving (committee member),  Schaal, Stefan (committee member) 
Creator Email nathan@mundhenk.com,tnmundhenk@hrl.com 
Permanent Link (DOI) https://doi.org/10.25549/usctheses-c127-15525 
Unique identifier UC188659 
Identifier usctheses-c127-15525 (legacy record id) 
Legacy Identifier etd-Mundhenk-2997-defense_slides 
Dmrecord 15525 
Document Type Dissertation 
Rights Mundhenk, Terrell Nathan 
Internet Media Type application/pdf 
Type texts
Source University of Southern California (contributing entity) 
Repository Name Libraries, University of Southern California
Repository Location Los Angeles, California
Repository Email uscdl@usc.edu
Abstract (if available)
Abstract What draws human attention, and can we create computational models of it which work the same way? Here we explore this question with several attentional models and applications of them. Each is designed to address a fundamental function of attention missing from the original saliency model designed by Itti and Koch. These include temporal attention and attention from non-classical feature interactions. Additionally, attention is utilized in an applied setting for the purposes of video tracking. Attention for non-classical feature interactions is handled by a model called CINNIC, which faithfully implements a model of contour integration in visual cortex. It is able to integrate illusory contours of unconnected elements such that the contours “pop out” as they are supposed to, and its behavior matches the performance of human observers. Temporal attention is discussed in the context of an implementation of, and extensions to, a model of surprise. We show that surprise predicts subject performance well on natural image Rapid Serial Visual Presentation (RSVP) and gives us a good idea of how an attention gate works in the human visual cortex. The attention gate derived from surprise also gives us a good idea of how visual information is passed to further processing in later stages of the human brain. We also discuss how to extend the model of surprise using a Metric of Attention Gating (MAG) as a baseline for model performance. This allows us to find different model components and parameters which better explain the attentional blink in RSVP.
Linked assets
Computational modeling and utilization of attention, surprise and attention gating