The Neural Correlates of Creativity and Perceptual Pleasure:
From Simple Shapes to Humor
Ori Amir
______________________________________
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Psychology Graduate Program)
August 2015
Copyright 2015 Ori Amir
Table of Contents
Chapter 1: Ha Ha! Versus Aha!
Introduction
Materials and Methods
Results
Discussion
References
Chapter 2: The Neural Basis for Shape Preferences
Introduction
Experiment 1: Adult Shape Preferences
Experiment 2: Infant Shape Preferences
Experiment 3: MRI Study
General Discussion
References
Chapter 3: The Neural Genesis of a Joke
Introduction
Method
Results
Discussion
References
Chapter 1: Ha Ha! Versus Aha!
Abstract
While humor typically involves a surprising discovery, not all discoveries are perceived
as humorous or lead to a feeling of mirth. Is there a difference in the neural signature of
humorous versus nonhumorous discovery? Subjects viewed drawings that were uninterpretable
until a caption was presented that provided either: 1) a nonhumorous interpretation (or insight) of
an object from an unusual or partial view (UV) or 2) a humorous interpretation (HU) of the
image achieved by linking remote and unexpected concepts. fMRI activation elicited by the UV
captions was a subset of that elicited by the humorous HU captions, with only the latter showing
activity in the temporal poles and temporo-occipital junction (linking remote concepts), and
medial prefrontal cortex (unexpected reward). Mirth may be a consequence of the linking of
remote ideas producing high—and unexpected—activation in association and classical reward
areas. We suggest that this process is mediated by opioid activity as part of a system rewarding
attention to novel information.
Keywords: Cortical µ-opioid gradient, fMRI, humor, medial prefrontal cortex, temporal pole
1. Introduction
Humor involves a discovery of an unexpected interpretation or perspective similar to
what occurs when we experience insight. However, not all insight experiences are funny.
Previous imaging studies of humor have contrasted humorous stimuli with the same stimuli
without the humorous element (e.g., Mobbs et al. 2003; Moran et al. 2004; Bartolo et al. 2006;
Kohn et al. 2011), and/or compared different kinds of humor (e.g., Goel and Dolan 2001; Watson
et al. 2007; Samson et al. 2008, 2009; Franklin and Adams 2011). Unfortunately, a comparison
among subtypes of humor necessarily ignores the regions involved in all humor, and a
comparison of humorous stimuli to the same stimuli with the humorous element removed may be
too broad, as controls lack the element of discovery. The present investigation directly compares
humorous versus nonhumorous cognitive discovery.
More recently, studies have compared humor with more subtle controls, for example,
humorous clips versus nonhumorous stimuli that inspire a positive/enjoyable feeling (Neely et al.
2012; Vrticka et al. 2013) and humorous resolvable incongruities versus nonresolvable incongruities
(Samson et al. 2008, 2009; Bekinschtein et al. 2011; Chan, Chou, Chen, Liang 2012; Chan,
Chou, Chen, Yeh et al. 2012). Yet in all of those studies, an element of discovery was arguably
present in the humorous conditions but not in the controls, or at least, was not systematically
controlled for.
The imaging studies of humor cited above generally reported activation in portions of the
temporal lobes often near the junctions with the parietal or occipital lobes and/or the temporal
poles (TPs) and some classical reward regions. Additional regions were also reported, but not
consistently across the different studies. The inconsistency might be the result of a lack of
control for the element of discovery, as well as additional confounds, common in comedy, which
enhance the reward, but are not necessary for humor, such as schadenfreude, superiority, or
sexual titillation (Hurley et al. 2011). In the present investigation, we use humorous stimuli that
were not designed to induce these emotions. In some of the prior experiments, as noted by
Bartolo et al. (2006), humor relied heavily on attribution of intentions, eliciting activity in the
regions associated with theory of mind. Our stimuli did not rely heavily on such attributions.
Several imaging studies have explored the neural basis of insight (e.g., Bowden et al.
2005; Aziz-Zadeh et al. 2009). Owing to the large variation in the nature of the stimuli, the tasks,
and the controls (ranging from anagrams to physical problem solving involving object
interactions), no consistent pattern of activation emerged across the various studies (for a review,
see Dietrich and Kanso 2010).
To study the neural correlates of mirth (enjoying humor) while controlling for the element of
discovery, subjects in the present investigation viewed 2 kinds of simple line drawings that
were, by themselves, uninterpretable with respect to their possible referents (Fig. 1). A
subsequently presented caption elicited, for one condition, an insight-like
(but nonhumorous) interpretation of an object depicted from a partial or unusual view (UV) (Fig.
1A,B) by providing the referent for that object (UV condition). Responses to UV stimuli were
compared with responses to the second kind of drawing/caption, with a humorous interpretation
(HU) (Fig. 1C,D), based on a surprising linking of remote associations (HU condition). Each HU
and UV stimulus had a control caption that merely provided a physical description of the
drawing, adding no referential information. For each drawing, half the subjects viewed the
original caption and half the control caption. Critically, both UV and HU conditions (unlike their
controls) result in a discovery, that is, the referent or a new interpretation, of the previously
uninterpretable line drawing.
2. Materials and Methods
Subjects viewed simple line drawings (Fig. 1) whose referent was inaccessible in the
absence of a caption. To compare the brain's response to humor and insight, we compared the
activation elicited from humorous “droodles” (Price 1955, 1976, 2000), such as the ones shown
in Figure 1C,D, in which the humor is based on the unexpected linking of remote concepts, to
drawings of objects in partial or unusual views (Nishimoto et al. 2010; e.g., Fig. 1A,B) defining
an insight condition where the description provides the referent but does not elicit a humorous
response. There is some disagreement in the literature whether the label “insight” should be
applied to instances in which the solution is provided, as is the case in our experiment, or only
when subjects reach the solution on their own. However, in the present experiment providing the
solution for the insight condition renders that condition more comparable with the humorous
droodles condition, as the HU is similarly provided (rather than asking the subjects to come up
with it themselves).
Figure 1. Examples of the 4 conditions. Credits: (A and B): Adapted from Nishimoto et al.
(2010). (C and D): Droodles excerpts from “Droodles - the Classic Collection” © 2000 by
Tallfellow Press, Los Angeles. Used by Permission. All rights reserved. (C): Droodle orientation
and caption differ from the original Droodle.
2.1. Participants
Fifteen adults (7 females; all right-handed except for 1 male), aged 19–31 (mean = 22.6), participated.
All were students at the University of Southern California, except for one graduate from another
university. We obtained informed consent from all subjects, and they were compensated for their
participation. The study was approved by the University Park Institutional Review Board at the
University of Southern California.
2.2. Stimuli
The stimuli were 200 line drawings along with both interpretive and physical
descriptions. One hundred images whose original captions provided the HU condition were
scanned from 3 books by Price (1955, 1976, 2000), whose drawings with their captions he
termed “Droodles.” We used only Droodles that were readily understood and did not require
knowledge of out-of-date conventions or objects. The remaining stimuli, 100 (insight) images
whose original captions referred to objects depicted from a partial or unusual view (UV), were
taken from Nishimoto et al. (2010). Out of their database of 196 exemplars created for
experimental purposes, 100 were selected based on interpretability.
Interpretive descriptions were generally taken as written by the original authors. Some
minor changes were made to the descriptions of UVs to make them more accessible to American
subjects. Physical descriptions were created which described the drawing physically without
conveying its interpretation and which, on average, approximated the length of the drawing's
interpretive description with respect to the number of words.
Subjects viewed both types of drawings, HUs and UVs, followed by either the referential
or the control descriptions for a total of 4 conditions. Each drawing's 2 descriptions were
counterbalanced between subjects so that half the subjects saw the control description and half
saw the referential description. A particular drawing was viewed in only a single condition by a
given subject.
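As a rough illustration of the counterbalancing described above, the following Python sketch assigns captions so that each subject sees every drawing exactly once and sees 50 drawings in each of the 4 conditions. The function name `assign_conditions`, the random split of each drawing type into two lists, and the alternation by subject parity are assumptions made for illustration; the text does not specify how drawings were actually assigned to subjects.

```python
import random

def assign_conditions(n_per_type=100, n_subjects=15, seed=0):
    """Map each subject to {(type, drawing_index): condition label}."""
    rng = random.Random(seed)
    # list_a: drawings whose interpretive caption is shown to even-numbered subjects
    # (hypothetical scheme, not taken from the source).
    list_a = set()
    for kind in ("HU", "UV"):
        ids = [(kind, i) for i in range(n_per_type)]
        rng.shuffle(ids)
        list_a.update(ids[: n_per_type // 2])

    design = {}
    for subject in range(n_subjects):
        even = subject % 2 == 0
        design[subject] = {}
        for kind in ("HU", "UV"):
            for i in range(n_per_type):
                # Interpretive caption for list A on even subjects, list B on odd ones.
                interpretive = ((kind, i) in list_a) == even
                design[subject][(kind, i)] = kind if interpretive else kind + "cont"
    return design
```

With 15 subjects the two caption versions cannot be split exactly in half, so under this assumed scheme 8 subjects would see one version of a given drawing and 7 the other.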
2.3. Procedure
Each 10-s trial began with a 2-s display of a drawing (subtending a visual angle of ∼8°)
in the center of the screen. The drawing was shown alone to pique the subject's curiosity (Fig. 2).
Then for 5 s, either a physical or interpretive description was displayed below the drawing
followed by 3 s of just the drawing itself, allowing for the interpretation to be fully appreciated.
Trials were separated by a 2-s blank interval. Presentation sequences were programmed with
Psychophysics Toolbox (Brainard 1997; Pelli 1997) running on MATLAB (The MathWorks,
Natick, MA, USA). Subjects were instructed to rate each drawing for funniness on a scale of 1–4
with 4 buttons in the scanner: 1 indicated they did not understand the description and/or its
relationship to the drawing; 2–4 indicated the degree of funniness with 2 being “not funny,” 3 “a
little funny,” and 4 “funny.” A pretest had determined that “funny” was more effective than
“very funny” in encouraging subjects to use the full scale. Responses were collected from the
onset of the description to the end of the trial. Note that only one-fourth of the trials were
designed to be funny (HU type), while another one-fourth of the trials were interpretive but of
the UV type, and one-half the trials were the control physical descriptions.
In one session, each subject participated in an anatomical scan and 4 experimental runs of
50 drawings, each lasting 10 min.
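The run length follows directly from the trial timing given above. The short Python check below adds up the trial phases. The constant names are illustrative, and the volume count assumes the 2-s functional TR reported under Data acquisition with no additional dummy volumes, which the text does not state.

```python
# Trial phases in seconds, as described in the Procedure section.
DRAWING_ALONE_S = 2   # drawing shown without a caption
CAPTION_S = 5         # drawing plus physical or interpretive description
DRAWING_AFTER_S = 3   # drawing alone again
BLANK_S = 2           # blank interval between trials
TRIALS_PER_RUN = 50
TR_S = 2              # functional repetition time (assumed constant across the run)

trial_s = DRAWING_ALONE_S + CAPTION_S + DRAWING_AFTER_S
run_s = TRIALS_PER_RUN * (trial_s + BLANK_S)
volumes_per_run = run_s // TR_S  # assumes no extra dummy volumes

print(trial_s, run_s / 60, volumes_per_run)
```

Under these assumptions each trial lasts 10 s, each run lasts 10 min, and a run comprises 300 functional volumes.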
Figure 2. The temporal arrangement of a trial in the fMRI experiment. Droodles excerpts from
“Droodles - the Classic Collection” © 2000 by Tallfellow press, Los Angeles. Used by
Permission. All rights reserved.
2.4. Data acquisition
All fMRI images were scanned at USC's Dana and David Dornsife Cognitive
Neuroscience Imaging Center on a Siemens Trio 3T scanner with a standard 16-channel head
coil. Each subject underwent a high-resolution T1-weighted structural scan using an MPRAGE
sequence (repetition time (TR) = 1100 ms, 192 sagittal slices, 256 × 256 matrix size, 1 × 1 × 1
mm voxels).
Functional images were acquired using an echo-planar imaging (EPI) pulse sequence
with the following parameters: TR = 2000 ms, TE = 30 ms, flip angle = 62°, 256 × 256 matrix size,
in-plane resolution 3 × 3 mm, 3-mm-thick slices, 32 axial slices covering as much of the brain as
possible, always including the TPs, but occasionally missing the superior rim of the primary
motor and somatosensory cortices.
2.5. Data Analysis
Preprocessing (3D motion correction using trilinear interpolation, 3D spatial smoothing
using a 4-mm full-width at half-max Gaussian filter, linear trend removal using a high-pass filter
set to 3 cycles over the run's length) was done with the Brain Voyager software package (Brain
Innovation BV, Maastricht, The Netherlands). Statistical analysis was done using MATLAB
scripts along with Brain Voyager. Motion corrected functional images were coregistered with the
same session's anatomical scan. Coregistered images were then transformed to Talairach
coordinates and underwent statistical analysis.
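Two of the preprocessing settings can be restated in more familiar units. The sketch below converts the 4-mm FWHM of the smoothing kernel to a Gaussian standard deviation using the standard relation sigma = FWHM / (2 sqrt(2 ln 2)), and converts the high-pass setting of 3 cycles per run to a cutoff frequency. It assumes a 600-s run (the 10-min runs described under Procedure) and does not reproduce how Brain Voyager implements either step; the names are illustrative.

```python
import math

def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

FWHM_MM = 4.0
RUN_S = 600.0           # assumed run length (10 min)
HIGH_PASS_CYCLES = 3.0  # cycles over the run's length

sigma_mm = fwhm_to_sigma(FWHM_MM)
cutoff_hz = HIGH_PASS_CYCLES / RUN_S
cutoff_period_s = RUN_S / HIGH_PASS_CYCLES

print(round(sigma_mm, 3), cutoff_hz, cutoff_period_s)
```

Under these assumptions the kernel has a standard deviation of about 1.7 mm, and the filter removes fluctuations slower than roughly one cycle per 200 s.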
Statistical analysis was based on a general linear model with a separate regressor for 10
TRs from the beginning of each trial type. The 6 motion correction parameters (3D translation
and 3D rotation) were included in the design matrix of the regression to eliminate any potential
motion artifacts. We then conducted a whole-brain, random-effects group average analysis. The
contrasts between the different conditions were computed as a subtraction of TRs 4–9 (i.e., 7–18
s from the beginning of the trial) of each condition with an uncorrected threshold of P < 0.005
(with an additional cluster threshold of 10 consecutive voxels applied). This procedure was used
to define the regions showing differential activity for both HU and UV conditions compared with
their respective physical controls (HUcont and UVcont), and compared with each other after the
controls were subtracted to reveal regions uniquely activated for humor but not insight ([HU −
HUcont] − [UV − UVcont]), as well as a conjunction of the 2 conditions to reveal regions
activated for both humor and insight ([HU − HUcont] and [UV − UVcont]). In the same way, we
also obtained a contrast map of only those HU drawings which were rated 4-“funny” minus the
HUs rated 3-“a little funny”, by each subject individually, resulting in a map of regions selective
for the degree of funniness. For the rating analysis, all parameters were the same, but we used a
fixed effect analysis, because of reduced power (nevertheless, as described in the results, a
comparison of the HU with the UV conditions yielded highly similar results to a comparison of a
rating of 4 vs. 3, with some additional regions found in the latter analysis). We used the
relatively broad time window (TRs 4–9) to make sure that we captured the sum of activations
independent of the slight variations over subjects and trials in the time course (e.g., of “getting
the joke”).
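The contrasts described above can be sketched in a few lines of Python with NumPy. The sketch assumes that each condition's response estimates are already available as an array with subjects on the first axis and one value per TR on the last axis (10 TRs per trial type); this array layout and all function names are hypothetical. The conjunction is written as a minimum-statistic test, which is one common choice; the text does not say which conjunction method was used. This is not the Brain Voyager/MATLAB pipeline itself, and it omits the voxelwise threshold and the cluster threshold.

```python
import numpy as np

WINDOW_TRS = range(4, 10)  # TRs 4-9, 1-based and inclusive

def window_mean(betas):
    """Average per-TR estimates over TRs 4-9; the last axis indexes the 10 TRs."""
    idx = [t - 1 for t in WINDOW_TRS]  # convert to 0-based indices
    return betas[..., idx].mean(axis=-1)

def humor_specific(hu, hu_cont, uv, uv_cont):
    """Double subtraction [HU - HUcont] - [UV - UVcont] on window-averaged estimates."""
    return (window_mean(hu) - window_mean(hu_cont)) - (
        window_mean(uv) - window_mean(uv_cont)
    )

def group_t(values):
    """One-sample t statistic across subjects (axis 0), as in a random-effects test."""
    n = values.shape[0]
    return values.mean(axis=0) / (values.std(axis=0, ddof=1) / np.sqrt(n))

def conjunction_t(t_a, t_b):
    """Minimum-statistic conjunction (assumed method): keep the smaller t per voxel."""
    return np.minimum(t_a, t_b)
```

Applied to arrays of shape (subjects, voxels, 10), `group_t(humor_specific(hu, hu_cont, uv, uv_cont))` yields one t value per voxel, with 14 degrees of freedom for 15 subjects.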
3. Results
3.1. Behavioral Results
Subjects rated a drawing as “1” (I do not understand) on only 1.3% of the trials, and those
trials were removed from further analysis. Mean ratings for the 4 conditions are shown in Table
1. The HU drawings were rated as significantly funnier than the UV interpretation drawings,
t(14) = 12.39, P < 0.001, and the physical description control of the HU condition (HUcont),
t(14) = 12.42, P < 0.001. UV drawings, which were not meant to be humorous, still received a
slightly, but significantly, higher mean rating than their controls (UVcont), t(14) = 2.93, P = 0.011.
Table 1. Mean funniness ratings and standard deviations for the 4 conditions.
Condition        Mean funniness rating   Standard deviation
HU (humorous)    3.35                    0.33
UV (insight)     2.22                    0.25
HU control       2.09                    0.15
UV control       2.04                    0.06
Funniness ratings were 2—“not funny,” 3—“a little funny,” and 4—“funny.” Ratings of 1—“I
don't understand” are excluded.
The mean response time (RT) for the HU drawings was 3.64 s (SD = 0.86), which
included the time to read the description. This time was significantly longer than that for UV, M
= 2.92, SD = 1.15, t(14) = 5.04, P < 0.001. The RTs for the control conditions HUcont, M =
3.08, SD = 1.12, and UVcont M = 2.51, SD = 1.18, were both significantly shorter than those of
their respective interpretive conditions, respectively t(14) = 4.53 and t(14) = 7.01, both P's <
0.001. The 400- to 600-ms RT difference between HU and UV may reflect the longer time
required to get the referent for the HU stimuli and, perhaps, also deciding its funniness. Owing to
the large window of time selected for the imaging data analysis (TRs 4–9, TR = 2 s), we likely
captured all of the event-related activity for both conditions.
3.2. Humorous Interpretation Drawings and Unusual View Interpretation Drawings vs. Controls
The HU and UV drawings were contrasted with their physical description controls.
Figure 3 displays these contrasts on an inflated cortical map (see also Fig. 4, for additional
regions not apparent on the inflated maps). We used the P < 0.005 threshold for all statistical
analyses. We found that the regions exhibiting greater activation for the UVs versus their controls
were a subset of those activated for the HUs versus their controls.
Figure 3. (A) BOLD response to Unusual View (UV) Interpretation “Insight” Drawings minus
their physical description controls (UVcont). (B) Humorous Interpretation (HU) Drawings minus
their physical description controls (HUcont). (C) Color scale of t-values. Maps are displayed at a
threshold of P < 0.005 uncorrected. Note: right side of the brain is on the right. TP, temporal
pole; LOC, lateral occipital complex; TOJ, temporo-occipital junction; dACC, dorsal anterior
cingulate cortex (supplementary motor area); mPFC, medial prefrontal cortex.
Figure 4. Top—regions activated for HU but not UV (based on the contrast: [HU − HUcont] −
[UV − UVcont]). Bottom—regions activated for both HU − HUcont and UV − UVcont as
revealed by conjunction analysis. All maps are displayed with a threshold of P < 0.005,
uncorrected. TP, temporal pole; TOJ, temporo-occipital junction; PHG, parahippocampal gyrus;
LOC, lateral occipital complex; mPFC, medial prefrontal cortex; lAMG, left amygdala; STR,
striatum; dACC, dorsal anterior cingulate cortex (this area of activation includes SMA,
supplementary motor area). Figures conform to radiology conventions, with left side on the right.
A conjunction analysis revealed the regions activated by both UV and HU conditions
compared with their respective controls (Table 2). Among those regions were the following: left
frontal regions largely overlapping with Broca's language area, a small portion of the posterior
inferior frontal gyrus on the right, previously suggested to involve holding multiple alternative
meanings in language processing (Mashal et al. 2005), and an area largely overlapping with
bilateral lateral occipital complex (LOC)—a region selective for visual images of object shape
(Grill-Spector et al. 1999; Hayworth and Biederman 2006). The parahippocampal gyrus (PHG),
a region which activates more strongly to visual scenes compared with single objects (Epstein
and Kanwisher 1998), and responds more strongly to visual objects and scenes that produce rich
contextual associations (Bar et al. 2008), was activated more for both interpretive conditions.
Table 2. Regions activated by both HU and UV minus their respective controls, as revealed by a
conjunction analysis
ROI No. of voxels X Y Z Mean t Mean P value
rLOC 8571 36 −53 −26 4.090 0.002
lFrontal 5042 −47 20 1 4.071 0.002
rCer 4568 30 −61 −33 4.131 0.002
lSTR 4005 −15 −7 11 3.876 0.002
lLOC 3789 −41 −49 −22 3.772 0.003
ldACC 3736 −5 12 56 3.825 0.002
rSTR 2036 11 −3 11 3.781 0.002
lAMG 1316 −30 −6 −7 3.897 0.002
lPHG 981 −32 −34 −22 3.915 0.002
rPHG 468 28 −40 −24 3.968 0.002
rFrontal 342 55 19 24 3.748 0.003
Note: ROIs are defined using a threshold of P = 0.005, uncorrected. ROI sizes in voxels as well
as Talairach coordinates, average t and P values are provided.
l, left; r, right; m, middle; LOC, lateral occipital complex; SMA, supplementary motor cortex;
STR, striatum; Cer, cerebellum; PHG, parahippocampal gyrus; AMG, amygdala; dACC, dorsal
anterior cingulate cortex; ROI, regions of interest.
Increased activation for both interpretive conditions was also observed in the bilateral
ventral striatum, and the left amygdala (lAMG) (while the right amygdala was significantly
activated for both HU − HUcont and UV− UVcont, no right amygdala activation survived either
the conjunction of these 2 contrasts or their subtraction [see Supplementary Tables 1 and 2]),
both of which are part of the dopaminergic reward network (Lee et al. 2004), and both were
reported in previous fMRI studies of humor (e.g., lAMG specifically was reported in Mobbs et
al. (2003), Bartolo et al. (2006), Watson et al. (2007), although some studies found bilateral
amygdala activity, e.g., Moran et al. (2004) and still others reported no amygdala activity). Our
findings suggest that activation in these 2 reward regions occurs in response to nonhumorous
discovery as well (note, however, that a small region within the left striatum was significantly
more activated for HU than UV [see Fig. 4, top left image, and Table 3]). Activity in both the
PHG and the striatum has been found to correlate positively with preference for visual scenes
(Yue et al. 2007).
Table 3. Regions activated for humor but not nonhumorous discovery (insight) as revealed by the
subtraction [HU − HUcont] − [UV − UVcont]
ROI No. of voxels X Y Z Mean t Mean P value
rTP 4989 38 14 −25 4.113 0.002
rTPJ 3570 42 −82 11 4.437 0.001
lmPFC 3543 −7 48 40 4.118 0.002
lTP 3139 −50 3 −23 3.991 0.002
rTOJ 2545 57 −56 −2 3.876 0.002
rOFC 1133 28 11 −18 3.864 0.002
lTPJ 1013 −57 −61 6 3.823 0.002
Neg_rParietal 640 17 −47 54 −3.792 0.002
lTOJ 528 −61 −49 −9 3.947 0.002
rCer 515 20 −38 −33 4.037 0.002
rHip 430 25 −32 −16 3.929 0.002
lSTR 375 −7 −11 10 3.795 0.003
Neg_srFrontal 362 26 29 38 −3.838 0.002
Neg_rSTG 311 41 −34 8 −3.702 0.003
Note: ROIs are defined using a threshold of P = 0.005, uncorrected. ROI sizes in voxels as well
as Talairach coordinates, t and P values are provided.
l, left; r, right; m, middle; s, superior; Neg, negative (i.e., activation is significantly lower than
baseline); TP, temporal pole; TPJ, temporoparietal junction; TOJ, temporo-occipital junction;
OFC, orbitofrontal cortex; Cer, cerebellum; mPFC, medial prefrontal cortex; Hip, hippocampus;
STR, striatum; STG, superior temporal gyrus; ROI, regions of interest.
Finally, a region extending from the dorsal portions of the anterior cingulate cortex
(dACC) to the supplementary motor area (SMA) was activated for both interpretive conditions.
Mobbs et al. (2003) reported activation in this region for humorous stimuli, and they suggested it
to be involved in the motor planning of laughter. However, that region's elevated activity for the
nonhumorous UV condition suggests an alternative cognitive function: the reassessment of the
display based on the new interpretation (an executive function often attributed to ACC,
specifically in studies of insight; e.g., Aziz-Zadeh et al. (2009)).
3.3. Regions Only Activated for Humorous Interpretations
A subtraction of activation for UV drawings from HU drawings (after each condition's
respective control activation was subtracted: [HU − HUcont] − [UV − UVcont]),
revealed 4 regions of activation (see Fig. 4 top and Table 3): 1) The medial prefrontal cortex
(mPFC), an area found to be involved in reward learning that responds to unexpected reward
contingencies (Rolls 1996), and also has been found to correlate with funniness ratings of stimuli
in previous studies (e.g., Goel and Dolan 2001), 2) the TPs, and 3) the temporo-occipital
junctions (TOJs) extending to 4) temporoparietal junctions (TPJs). Mobbs et al. (2003) found
activation in both TOJ and TP as well, but only on the left side, and suggested TP activity may
relate to lexical retrieval or semantic knowledge processing, and that the TOJ performs semantic
processing and may be detecting incongruity/surprise. TPJ activity, particularly in the right
hemisphere, has been linked to mentalizing; however, the left TPJ is responsive to other similar
cognitive tasks not requiring modeling of other minds, that is, perspective taking and comparing
interpretations (Perner et al. 2006). Regardless of their exact function in the context of humor,
these are association regions involved in high-level semantic processing (Meyer and Damasio
2009; Man et al. 2012). While there is some inconsistency in the humor literature against which
we have compared our results, as discussed in the Introduction, this may be the result of the
variable nature of controls (which often either included some funny elements or were lacking the
element of discovery). Unlike previous studies, we compared the humorous HU condition with a
nonhumorous UV condition which retained the element of discovery, allowing regions
responsive to humor to be distinguished from those responsive to insight or reinterpretation.
3.4. Eliminating Potential Confounds
The HU drawings were visually more complex than the UV drawings (they contained, on
average, more pixels), and their verbal interpretations tended to be
longer. Consequently, the comparisons of HU and UV interpretive conditions were always made
after subtracting from the interpretive conditions their respective descriptive controls. The
descriptive controls for the HUs and UVs were such that the images were identical and the
captions were of the same length (see Materials and Methods). To assess whether any of the
differences between the HUs and the UVs minus their respective controls were the result of
differences in figure complexity or descriptive length/content, we examined the contrast between
the 2 physical controls: HUcont − UVcont. This contrast only yielded significant activation in
early visual areas up to and including LOC (but not PHG) and in the left hemisphere frontal
language areas (using the same threshold, P < 0.005, uncorrected, as in all of the analyses we report). Note
that none of these regions were the ones differentiating HU from UV activations.
Another concern was that laughter, in the humorous HU condition, would result in head
movements that might artificially increase the BOLD signal in some of the regions we reported.
While head movements were small to begin with (always <2 mm, and typically <0.5 mm), and
they were corrected in the preprocessing (see Materials and Methods), we conducted an
additional test to confirm that head movements did not explain our findings: we added the 6
motion correction parameters (3D translations and rotations) to the design matrix of the GLM in
BrainVoyager and obtained results virtually identical to those obtained when the motion correction
parameters were excluded from the design matrix. All results presented here were computed with the design matrix
including the motion correction parameters.
3.5. Ratings of Humor
We contrasted the activation, within the interpretive HU condition, for drawings that
were rated by each subject as 4 (funny), minus those that were rated 3 (a little funny) (Fig. 5) to
assess whether a “dose–response” effect would be evident and, if so, whether this effect would
also involve those regions differentially activated by the humorous minus the nonhumorous
insight conditions. These comparisons included fewer trials (because they constituted only the
trials from the HU drawings, and only those rated either 3 or 4). Nevertheless, under the same
threshold (P < 0.005), robust activation was evident in the same regions that differentiated HU
from UV conditions in the previous analyses: bilateral TPs, the temporo-occipital junction, and
bilateral medial prefrontal-frontal cortex. However, in this analysis, the region of differential
activation labeled mPFC extended more dorsally within the mPFC, and it was more prominent
on the left side. The striatum and parahippocampal cortices were also activated bilaterally, as
well as most of the left cingulate cortex, precuneus, and lingual gyri on both sides. This overlap
between regions activated for higher funniness ratings and the regions specified by comparing
the HU minus the UV conditions suggests that the latter difference indeed amounts to a
manipulation of humor.
Figure 5. A contrast map of HU drawings rated 4—“Funny” minus those rated 3—“a little
funny” for a threshold of P < 0.005. mPFC, medial prefrontal cortex; TP, temporal pole; TOJ,
temporo-occipital junction; CC, cingulate cortex.
Previous imaging studies of humor that had both a nonhumorous control and obtained
funniness ratings did not report that all regions showing higher activation for the humorous
condition, relative to the nonhumorous control, also showed a dose response (e.g., Goel and
Dolan 2001; Watson et al. 2007). The fact that we did find a dose response suggests that we have
tighter control over this complex behavior.
4. Discussion
4.1. The “Humor” Network
Both the humorous HU drawings and the insight-like UV drawings had an element of
discovery and stimulus reinterpretation. However, a network of 4 regions, uniquely and
bilaterally active for the humorous stimuli, was revealed by subtracting activation for the UV
from the HU drawings: the TPs, temporo-occipital junction (TOJ), extending to the
temporoparietal junction (TPJ), and the mPFC. The role of this network in humor processing is
further supported by its greater activation in the separate analysis comparing, within the HU
condition, those drawings that were rated funniest with those rated as less funny.
4.2. What Makes Humor Humor?
In an often cited (almost by default) attempt to characterize the cognitive process
underlying humor appreciation, Suls (1972) suggested that humor is a form of problem solving
in which there are 2 stages: 1) a perception of incongruity between the “punch line” and what
was expected, and 2) a resolution of the incongruity. However, that also characterizes
nonhumorous insight (Ruch and Hehl 1998). What, then, distinguishes the 2 experiences?
Navon (1988) suggested that jokes are characterized by a situation/action that is
appropriate from one perspective typically assumed by the protagonist of the joke, but virtually
inappropriate or absurd in reality. Similarly, Ruch and Hehl (1998) suggested a third stage to
Suls' theory, in which there is a realization that the resolution does not make sense (is only an “as
if” resolution). Hurley et al. (2011) suggested that all humor is based on realizing an
inappropriate interpretation/perception was reached because of an erroneous
assumption/heuristic. McGraw and Warren (2010) more generally described the conditions for
humor as the perception of a benign violation. What is common to the 4 theories above is that
some absurd, erroneous, or inappropriate perception/behavior is entertained resulting in violated
expectations. Indeed, the funniest drawings elicited activity in the anterior cingulate cortex,
which has been suggested to encode errors/conflicts (Botvinick et al. 2004; Brown and Braver
2007; Shenhav et al. 2013). These accounts also imply that, in order for these conditions to be
met, humor needs to be based on remote associations—associations not readily elicited by the set
up—simply because otherwise it might be impossible to mislead or surprise the listener. The
linking of the remote associations (e.g., in Fig. 1C, a pig looking at book titles in the library)
likely occurs in association cortex where its novelty, the violation of expectations, and/or the
rejection of the (apparently appropriate) link as absurd (e.g., because pigs do not read),
necessarily results in heightened activity in association regions if the elements are to be actively
maintained so that they can be linked by the punch line (or caption, in our study). Once we hear a
joke, its repetition is no longer funny as a conceptual structure has been formed that has already
integrated the previous remote associations.
Accordingly, we find that these humorous stimuli induce activation in association
regions, namely bilateral TOJ, TPJ, and TP, where information converges, and where remote
associations may be integrated.
Could the same explanation apply to any other arbitrary constellation of regions? We
believe it could not, since this constellation of regions is unique in satisfying 2 requirements: 1)
it consists of higher order association areas where remote associations converge to give rise to
semantic interpretations; 2) the regions have been reported in previous independent
investigations of humor. (Although those regions have been reported in previous investigations,
they generally were reported as part of a larger constellation of regions, and our work reduces the
set of relevant regions, by controlling for the element of discovery.)
4.3. The Element of Discovery
Unlike previous studies in which only the humorous stimuli retained an element of
discovery, both the humorous HU and nonhumorous (insight) UV conditions in our study
retained an element of discovery, that is, the drawings were uninterpretable until their
descriptions revealed what they depicted. Regions that are activated for the UV condition (minus
its descriptive controls) appear to be largely a subset of those activated by the HU condition
(minus its controls). Our findings (see The Humor Network), therefore, suggest reconsideration
of at least 2 interpretations of prior studies.
Watson et al. (2007) found that, late in the processing of their jokes, sight gags led to an
increased activation in visual areas, and verbal gags led to increased activation in language areas.
They interpreted the findings as support for Suls’ (1972) incongruity-resolution theory of humor,
suggesting the late visual or linguistic activity arises from the resolution of the visual or
linguistic incongruity, respectively. Goel and Dolan (2001) used similar results to reach the same
conclusion. However, we found a late increase in activation of LOC (a visual object recognition
region) for both HU and UV conditions compared with their controls (Fig. 6). In other words, we
found the same activation pattern of “resolution” for both our humorous HU condition and the
nonhumorous UV condition, results that challenge Suls' characterization of the conditions for
humor.
Figure 6. Time course of activation for the HU and UV conditions minus their controls, at P <
0.05, uncorrected (the relaxed threshold here is for presentation purposes only; all analyses and
other figures used a P < 0.005 threshold). Notice that activation in LOC increases (relative to
controls) starting around 9 s (or a little earlier) after the beginning of the trial, suggesting a
reinterpretation of the drawing.
Second, as discussed in the Results and Discussion, since the SMA was activated for both
HU and UV, it seems less likely its only function in humor processing is laughter production (as
was suggested by Mobbs et al. 2003).
As noted, humor typically has an element of absurdity or benign violations (e.g., Navon
1988). In visual humor, typical depictions might include talking animals or a person in an
incongruous pose. That is, even without the caption, such depictions look funny—and would
have imposed a serious confound in our effort to distinguish humor from discovery. Fortunately,
the HU droodles depicted no such absurdities. Instead, like the control UV stimuli, they appeared
“abstract” with no real-world referent. It was the caption, not the drawing, that conveyed the
absurdity of the referent that thus rendered the composite funny. We could thus assess the impact
of the captions alone, unconfounded with the images.
4.5. Why is Humor Pleasurable?
Many theories have suggested evolutionary advantages for enjoying humor, such as a
motivation for error debugging (Hurley et al. 2011), and a number of previous studies reported
activation in some classical reward areas (e.g., Mobbs et al. 2003). We are aware of only one
account, however, that suggested a neural mechanism linking the cognitive processing of humor
and the pleasurable feeling of mirth (Biederman and Vessel 2006).
There is near consensus in the field that “surprise” is an element of humor, and that the
surprise is positive (or at least benign). Often the surprise induces a feeling of superiority,
Schadenfreude, relief, or sexual titillation, but these factors are not necessary (Hurley et al.
2011). None of these were aspects of our humorous stimuli. The positive surprise induced by
our stimuli seems to lie in the “cleverness” of the interpretation. Indeed, in a separate survey, 20
student subjects rated the same HUs used in the fMRI study, and the survey revealed a high
correlation of Funniness ratings with ratings of both Cleverness (r = 0.823, P < 0.001) and
Surprise (r = 0.711, P < 0.001) but not Complexity (r = −0.122, P = 0.188). The link between
cleverness and pleasure may be an aspect of a general motivational system, which renders the
consumption of novel and richly interpretable information as pleasurable.
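As an illustration of how correlations between rating scales of this kind can be computed, the following minimal Python sketch calculates a Pearson r between two sets of ratings. The function name pearson_r and the ratings are hypothetical and serve only as an example; they are not the survey data reported above.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical mean ratings (1-7 scale) for five drawings; NOT the study's data.
funniness = [2.1, 3.4, 4.0, 5.2, 6.1]
cleverness = [2.5, 3.0, 4.4, 5.0, 6.3]
r = pearson_r(funniness, cleverness)
```

The sketch returns only the coefficient; a significance test would additionally require the number of rated items, which is an assumption left out here.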
Lewis et al. (1981) discovered a gradient of µ-opioid receptors in the ventral cortical
visual pathway, sparse in early sensory areas, but gradually increasing in density through higher
sensory cortices, with the greatest density in association cortex (e.g., PHG, TPs). Zadina et al.
(1997) reported the presence of µ-opioid receptors in prefrontal cortex. Biederman and Vessel
(2006) hypothesized that neural activation in association cortex would be greater in response to
richly interpretable, novel, and surprising experiences, resulting in greater opioid release,
possibly triggering further activation of classical reward regions, producing pleasure. They
proposed that such activation of cortical opioids might be the neural basis for the human
motivation for seeking novel, richly interpretable information (for additional evidence, see Yue
et al. 2007; Amir et al. 2011).
The TP and TOJ association areas were significantly activated for HUs but not UVs.
Also responding exclusively to HUs was the mPFC, which is implicated in reward learning and
responds most strongly to unexpected rewards (or their absence; Rolls 1996; Lee et al. 2004).
This characterization fits well with the unexpected and rewarding nature of punch lines.
In conclusion, the mirthful feeling accompanying humor appreciation may come about as
a result of unexpected greater activation in cortical association regions rich in µ-opioid receptors
(e.g., bilateral TOJ and TP), triggering, in response to this unexpected reward, activation in
classical reward/reward-learning regions such as the mPFC, lAMG, and striatum (with the latter
2 regions also activated for nonhumorous discovery).
4.6. Summary
A handful of regions differentiate the discovery inherent in humor from nonhumorous
discovery. The same regions were also more responsive to funnier humorous stimuli, thus
exhibiting a dose–response effect. We hypothesize that the surprisingly high activation in
association regions (e.g., TPs and TOJ) leads to the high activation in reward regions (e.g.,
mPFC) in line with previous studies (e.g., Biederman and Vessel 2006) that linked greater
associative activity with perceptual pleasure.
Acknowledgment
We thank Mark Lescroart for sharing his Matlab code, and Jianchang Zhuang for his assistance
with the MRI scanning.
References
Amir, O., Biederman, I., & Hayworth, K.J. (2011). The neural basis for shape
preferences. Vision Research, 51, 2198-2206.
Aziz-Zadeh, L., Kaplan, J., & Iacoboni, M. (2009). “Aha!”: The neural correlates of verbal
insight solutions. Human Brain Mapping, 30, 908–916.
Bar, M., Aminoff, E., & Schacter, D. (2008). Scenes unseen: The parahippocampal cortex
intrinsically subserves contextual associations, not scenes or places per se. The Journal of
Neuroscience, 28, 8539–8544.
Bartolo, A., Benuzzi, F., Nocetti, L., Baraldi, P., & Nichelli, P. (2006). Humor comprehension
and appreciation: An fMRI study. Journal of Cognitive Neuroscience, 18, 1789–1798.
Biederman, I., & Vessel, E. A. (2006). Perceptual Pleasure and the Brain. American Scientist,
94, 249-255.
Bowden, E. M., Jung-Beeman, M., Fleck, J., & Kounios, J. (2005). New approaches to
demystifying insight. Trends in Cognitive Sciences, 9(7), 322-328.
Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436.
Dietrich, A., & Kanso, R. (2010). A review of EEG, ERP, and neuroimaging studies of creativity
and insight. Psychological Bulletin, 136, 822–848.
Epstein, R., & Kanwisher, N. (1998). A cortical representation of the local visual
environment. Nature, 392, 598–601.
Goel, V., & Dolan, R. (2001). The functional anatomy of humor: segregating cognitive and
affective components. Nature Neuroscience, 4(3), 237-8.
Grill-Spector, K., Kushnir, T., Edelman, S., Avidan, G., Itzchak, Y., et al. (1999). Differential
processing of objects under various viewing conditions in the human lateral occipital
complex. Neuron, 24, 187–203.
Hayworth, K., & Biederman, I. (2006). Neural evidence for intermediate representations in
object recognition. Vision Research, 46, 4024–4031.
Hurley, M. M., Dennett, D. C., & Adams Jr, R. B. (2011). Inside jokes: Using humor to reverse-
engineer the mind. MIT Press.
Lee, G. P., Meador, K. J., Loring, D. W., Allison, J. D., Brown, W. S., Paul, L. K., et al. (2004).
Neural substrates of emotion as revealed by functional magnetic resonance imaging.
Cognitive and Behavioral Neurology, 17, 9–17.
Lewis, M. E., Mishkin, M., Bragin, E., Brown, R. M., Pert, C. B., & Pert, A. (1981). Opiate receptor
gradients in monkey cerebral cortex: correspondence with sensory processing hierarchies.
Science, 211, 1166–1169.
Man, K., Kaplan, J. T., Damasio, A., & Meyer, K. (2012). Sight and Sound Converge to Form
Modality-Invariant Representations in Temporoparietal Cortex. The Journal of
Neuroscience, 32(47), 16629-16636.
Mashal, N., Faust, M., & Hendler, T. (2005). The role of the right hemisphere in processing
nonsalient metaphorical meanings: Application of Principal Components Analysis to
fMRI data. Neuropsychologia, 43(14), 2084-2100.
Meyer, K., & Damasio, A. (2009). Convergence and divergence in a neural architecture for
recognition and memory. Trends in Neurosciences, 32(7), 376-382.
Mobbs, D., Greicius, M., Abdel-Azim, E., Menon, V., & Reiss, A. (2003). Humor modulates the
mesolimbic reward centers. Neuron, 40, 1041–1048.
Moran, J., Wig, G., Adams, R., Janata, P., & Kelley, W. (2004). Neural correlates of humor
detection and appreciation. NeuroImage, 21, 1055–1060.
Navon, D. (1988). The seemingly appropriate but virtually inappropriate: Notes on
characteristics of jokes. Poetics, 17(3), 207-219.
Nishimoto, T., Ueda, T., Miyawaki, K., Une, Y., & Takahashi, M. (2010). A normative set of 98
pairs of nonsensical pictures (droodles). Behavior Research Methods, 42(3), 685-691.
Perner, J., Aichhorn, M., Kronbichler, M., Staffen, W., & Ladurner, G. (2006). Thinking of
mental and other representations: The roles of left and right temporo-parietal
junction. Social Neuroscience, 1(3-4), 245-258.
Price, R. (1955). Oodles of Droodles. New York: H. Wolff Book.
Price, R. (1976). Droodles #1. Los Angeles: Price/Stern/Sloan Publishers Inc.
Price, R. (2000). Droodles: The Classic Collection. Los Angeles: Tallfellow.
Rolls, E. (1996). The orbitofrontal cortex. Philosophical Transactions of the Royal Society of
London. Series B, Biological Sciences, 351, 1433–1444.
Ruch, W., & Hehl, F.-J. (1998). A two-mode model of humor appreciation: Its relation to
aesthetic appreciation and simplicity–complexity of personality. In W. Ruch (Ed.), The
sense of humor: Explorations of a personality characteristic (pp. 109–142). Berlin:
Mouton de Gruyter.
Samson, A. C., Zysset, S., & Huber, O. (2008). Cognitive humor processing: Different logical
mechanisms in nonverbal cartoons—an fMRI study. Social neuroscience, 3(2), 125-140.
Samson, A., Hempelmann, C., Huber, O., & Zysset, S. (2009). Neural substrates of incongruity-
resolution and nonsense humor. Neuropsychologia, 47, 1023–1033.
Saxe, R., Carey, S., & Kanwisher, N. (2004). Understanding other minds: Linking
developmental psychology and functional neuroimaging. Annual Review of Psychology, 55, 87-124.
Suls, J. (1972). A two-stage model for the appreciation of jokes and cartoons. In J. Goldstein &
P. McGhee (Eds.), Psychology of humor. New York: Academic Press.
Watson, K., Matthews, B., & Allman, J. (2007). Brain activation during sight gags and language-
dependent humor. Cerebral Cortex, 17, 314–324.
Yue, X., Vessel, E. A., & Biederman, I. (2007). The neural basis of scene
preferences. NeuroReport, 18, 525-529.
Zadina, J. E., Hackler, L., Ge, L. J., & Kastin, A. J. (1997). A potent and selective endogenous
agonist for the µ-opiate receptor. Nature, 386, 499-502.
Chapter 2: The Neural Basis for Shape Preferences
Abstract
Several dimensions of shape, such as curvature or taper, can be regarded as extending from a
singular or zero value (e.g., a straight contour with 0 curvature or parallel contours with a 0
angle of convergence) to an infinity of non-singular values (e.g., curves and non-parallel
contours). As orientation in depth is varied, a singular value remains singular, and a non-singular
value will vary but remains non-singular. Infant and adult human participants viewed pairs of
geons where one member had a singular and the other had a non-singular value on a given shape
dimension, e.g., a cylinder vs. a cone. The participants preferred to look at the non-singular geons.
The non-singular geons also produced greater fMRI activation in shape-selective cortex (LOC), a
result consistent with their producing greater single unit activity in macaque IT (Kayaert et al.,
2005). That non-singular stimuli elicit higher neural activity and attract eye movements may
account for search asymmetries in that these stimuli pop out from their singular distractors but
not the reverse. A positive association between greater activation in higher-level areas of the
ventral pathway and visual preference has been demonstrated previously for real world scenes
(Yue et al., 2007) and may reflect the workings of a motivational system that leads humans to
seek novel but richly interpretable information.
Keywords: Shape, Lateral occipital complex, Visual Search Asymmetries, Infant Eye
Movements, Visual Preferences
1. Introduction
It has been known for over a century that the location of visual fixations is decidedly
non-random: people are prone to fixating some stimuli (or regions of a scene) more frequently than
others. The present study investigated whether these fixation biases would be manifested when
looking at simple shapes, such as geons. Given fixation biases, is there a fundamental
characteristic of the shapes that predicts the biases? And is there an underlying neural correlate
that might explain the biases?
Several 3D dimensions of shapes, such as curvature or cotermination, can be regarded as
extending from a singular or zero value (such as a straight contour with 0 curvature or a pair of
contours that terminate at a common point with a 0 difference in the loci of their endpoints) to an
infinity of non-singular values (as with curves and non-coterminating contours). A singular
value will always produce a 2D projection (i.e., an image) that is singular at all orientations in
depth; the 2D projection of a non-singular value will vary, but it will remain non-singular at all
orientations (up to an “accident” of viewpoint).¹ Parallelism (here termed taper), with a 0 angle
of convergence, represents a special case in that with rotation in depth, perspective will produce
an apparent convergence of the contours. However, there is a strong bias towards maintaining a
perception of parallelism under depth rotation even with modest non-zero angles of convergence
at modest extensions in depth (King et al., 1976).²
¹ Singular or non-singular values are each, categorically, nonaccidental in that rotation in depth
will not alter whether they are singular or non-singular. The difference between two contours, one
with a singular and the other with a non-singular value, e.g., a straight vs. a curved contour,
defines an invariant, or nonaccidental, property (NAP) difference. A difference between two
non-singular values defines a metric property (MP) difference.
² This effect is readily witnessed with the familiar psychological demonstrations of the
trapezoidal window or room. However, even when the shapes do not resemble familiar objects,
e.g., windows or rooms, and the observer can hold and study a trapezoid, paradoxically, the
greater the angle of rotation in depth (and hence the greater the convergence of the 2D projection
of the trapezoid’s edges), the greater the likelihood that the trapezoid will be perceived as a
rectangle, i.e., a singular value (King et al., 1976). This bias towards parallelism is magnified the
more time that the observer has to view the trapezoid. A similar bias also characterizes the
perception of symmetry, in that a vertically elongated ellipse under depth rotation around the Y-
axis (thus foreshortening the shorter axis) will appear as a circle.
We explored visual shape preferences for these dimensions with geons, a set of shape
primitives formed from a partition of the set of simple generalized cones (GCs) (Binford, 1971).
A GC is a volume created by sweeping a cross section along an axis. The cross section can be
round (as with a cylinder) or straight (as with a brick) and the axis can be straight or curved.
The cross section can remain constant in size (as with a brick or cylinder), expand (or contract,
when considered from the reverse direction), as with a cone or a wedge, or expand and then
contract (producing sides with positive curvature). Each of these nonaccidental variations would
produce a different geon. Any object can be represented by a collection of GC parts in specified
relations (Marr & Nishihara, 1978). For the stimuli used in the present experiment (Fig. 1),
singular geons had straight axes and/or parallel sides. Non-singular geons had curved axes,
and/or non-parallel straight sides or sides with positive curvature.
In the present investigation we manipulated four of those dimensions (Fig. 1): main axis
curvature, positive curvature of sides, taper, and a conjunction of taper and positive curvature
(all but taper have an element of curvature in the non-singular shape that is not present
in the singular variant). Previous studies have shown that human adults prefer curved stimuli
(both real objects and nonsense shapes) over a version of those objects with sharp (pointy)
corners instead of the curve (Silvia & Barona 2009; Bar & Neta 2006), which may be the
consequence of pointy objects perceived as threatening, as suggested by the increased amygdala
activation elicited by such stimuli (Bar and Neta, 2007). Similar preferences for curvature, where
the non-curved/less-curved versions of the objects had both sharp angles and straight edges in
place of the curves, have been reported (Carbon, 2010; Leder and Carbon, 2005; Hevner 1935).
All of the manipulations in our study that involved curvature compared straight edges (in the
singular shape) to curved edges (non-singular shape) along extended contours. In contrast with
the prior studies, none of our stimuli compared a discontinuous extremum of curvature (i.e., a
pointy corner) with a continuous (i.e., curved) extremum. (The two shape measures are
independent in that the derivative at the extrema of a curved or a straight contour could or could
not be continuous.) There have been reports with complex stimuli that infants prefer looking at,
for example, bulls-eye patterns with curved lines to ones with straight lines, although some of the
stimuli with straight contours had pointy corners (Quinn et al., 1997; Ruff & Birch, 1974).
In the present investigation, infants and adult human participants viewed simultaneously
presented pairs of geons where one geon had a singular and the other had a non-singular value,
e.g., a cylinder vs. a cone. We assessed, by eye tracking, which shape the participants
spontaneously fixated. We also assessed (in adults) whether singular or non-singular values
would elicit a greater fMRI BOLD response in the lateral occipital complex (LOC), an area
selective for shape and critical for shape recognition (James et al., 2003). We found that non-
singular values are more likely to attract fixations and elicit greater BOLD responses in LOC.
The latter result is consistent with Kayaert, Biederman, Op De Beeck, and Vogels’ (2005)
finding of greater single unit activity to non-singular values in the inferior temporal (IT) region
of the macaque, an area homologous to LOC (Kriegeskorte et al., 2008).
Figure 1. Sample stimuli sets. The dimensions manipulated are A. main axis curvature, B. taper,
C. positive curvature of sides, D. a conjunction of taper and positive curvature of sides. Note that
the “non-singular” label refers to the value along the dimension manipulated within a row, not to
all the attributes of the geon, which could have non-singular and singular values.
2. Experiment 1: Adult Shape Preferences
We tracked eye movements of adults (Exp. 1) and infants (Exp. 2) in a free-viewing
paradigm for singular vs. non-singular geons. The geons varied along simple GC dimensions:
curvature of main axis, parallelism (taper), and positive curvature of the geon’s sides.
2.1. Method
2.1.1. Participants
Fourteen adults (10 females, ages 19-36), all students at the University of Southern California,
participated. All were compensated for their participation except three members of the P.I.’s
lab, whose data were virtually identical to those of the other participants.
2.1.2. Stimuli
The stimuli for all 3 experiments were from a total set of 36 line drawings of 3D appearing geons,
24 of which were used in Exp. 2 and all 36 in Exps. 1 and 3. All 36 geons can be viewed at
http://geon.usc.edu/~ori/36Shapes.html. Images were generated using Autodesk’s 3D-studio
Max 8, and MATLAB (The MathWorks, Natick, MA). The edges, corresponding to orientation
and depth discontinuities, were white, with the inner surfaces a uniform grey. They were
displayed on a black background (see Fig. 1). The stimuli were organized into 12 sets of 3 geons
each, with the geons in each set differing along one dimension (10 sets) or two dimensions (2
sets). Each set consisted of one “singular” object that had a zero value along the manipulated
dimension(s) in that set (e.g. a brick with parallel sides, thus zero taper), one object with a high
value along the same dimension(s) (e.g. a brick with a high degree of taper), and one object of
intermediate value (e.g. medium taper). The intermediate geon (which was only used in Exp. 3)
was physically equidistant from the other two geons as scaled by the Gabor-jet model, a
multiscale, multiorientation model of V1 simple-cell filtering (Lades et al., 1993; Xu et al., 2010).
The model computes physical similarity between pairs of stimuli that correlates almost perfectly
with psychophysical discriminability when the stimuli vary metrically. The measure does not
distinguish between metric and nonaccidental differences so it can be employed as a gauge to
assess differential psychophysical sensitivity to such differences. The model can be downloaded
from http://geon.usc.edu/~biederman/GWTgrid_Simple.m.
The objects were all rotated 15° clockwise from the upright to reduce the tendency for the
singular shapes to appear more vertical, to avoid possible confounds as a result of anisotropy in
orientation sensitivities (Hansen & Essock, 2004; Leehey et al., 1975). For the six sets that
manipulated the dimension of main axis curvature, the objects’ orientations in the plane were
further adjusted so that a line drawn from the center of the bottom of the base to the top surface
would have approximately the same orientation for all 3 objects in the set. These manipulations
were done prior to the Gabor scaling. Variations in the shapes were selected so as not to introduce
differences in vertices, as would occur if a cross section with straight edges was compared with a
curved cross section, e.g., a brick vs. a cylinder. Geons with sharp points were also not included
because prior research (e.g., Bar & Neta, 2006) had shown that they were not preferred. This
limited the variations to axis curvature and those changes that would be produced by variations
in the size of the cross section as it was swept along an axis, e.g., parallel sides (when the cross
section remained constant in size), and positive and negative curvature of the sides of the geon.
For each set, all 3 objects were resized so that, with respect to each other, they were equal in area
(defined as the number of pixels within the geon) and, consequently, in luminance. In the present
study, nine participants were studied with the resized shapes only, one with the shapes before
resizing, and four with both versions of the shapes. There was no effect of the variation in size so
the data are shown collapsed across the different size levels.
2.1.3. Procedure
Stimuli were presented on a 46 in LCD screen (Sony Bravia XBR-III, 1016 × 571.5 mm)
and displayed using C++ and SDL on a Unix platform. The screen was 100 cm from the
subject’s head, which rested on a chin-rest (Fig. 2a). Four participants had their head restrained
with a helmet-like apparatus (instead of the chin-rest) 125 cm from the screen, which did not
affect the pattern of results. Stimulus width and height subtended an average of ~2.5° and ~4.5°
of visual angle, respectively.
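The relation between physical size, viewing distance, and visual angle used in the description above can be sketched in a few lines of Python. This is an illustrative calculation only: the function names are hypothetical, and the formula assumes a flat stimulus centered on, and perpendicular to, the line of sight.

```python
import math

def visual_angle_deg(size_mm, distance_mm):
    """Visual angle (degrees) subtended by an extent of size_mm viewed from distance_mm."""
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * distance_mm)))

def size_for_angle_mm(angle_deg, distance_mm):
    """Physical extent (mm) that subtends angle_deg at distance_mm (inverse of the above)."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)

# Values taken from the text: a 1016 mm wide screen viewed from 100 cm (1000 mm).
screen_width_deg = visual_angle_deg(1016.0, 1000.0)

# Approximate on-screen size of a stimulus subtending ~2.5 deg x ~4.5 deg at 100 cm.
stim_width_mm = size_for_angle_mm(2.5, 1000.0)
stim_height_mm = size_for_angle_mm(4.5, 1000.0)
```

Under these assumptions, a stimulus width of 2.5° corresponds to roughly 44 mm on the screen at the 100 cm viewing distance.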
Figure 2. A. Set up for experiment 1, adult eye-tracking. B. Set up for experiment 2, infant eye-
tracking.
Each trial began with a central fixation plus sign (~.25°) followed by a simultaneous
display of two objects (one singular and one non-singular from the same set) presented 6°-7° to
the right and to the left of the screen center. The singular object was on the right for half of the
trials, and stimulus pairs from all 12 sets were shown before there was a repetition of a pair. A
given geon would never appear in successive trials.
Participants performed from four to eight runs of 40 trials each with breaks of
approximately one min between runs. Some trials were excluded because the eye tracker could not
locate the position of the eye for most of the trial. Each participant completed an average of 163
valid trials.
Between trials, a white fixation plus sign was presented on a gray background to
minimize afterimages of the gray geons. At the beginning of each trial, which was initiated by
the participant’s key press, the fixation plus sign flickered five times and disappeared (along with
the gray background) immediately followed by a pair of gray geons presented for 2 s on a black
background. Participants were instructed to look wherever they wanted once the fixation plus
disappeared. Participants’ eye movements were tracked from the beginning of the flicker to the
end of the trial (when the geons were removed from the screen and the fixation plus reappeared).
2.1.4. Eye-tracking analysis
The left eye position was tracked with a video scanner (ISCAN RK-464) at 240 Hz, in
pupil-CR mode. The data were transferred to Matlab for subsequent analysis. Preference for an
object was assessed by calculating the proportion of first looks and looking time, calculated
using the total number of first saccades or dwell time to one object divided by the total number
of first looks or looking time to both objects.
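The two preference measures described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study; the trial records, their field names, and the function name are hypothetical, and trials without a first look to either shape are simply ignored for the first-look measure.

```python
def preference_proportions(trials):
    """Return (proportion of first looks, proportion of looking time) to the non-singular object.

    Each trial is a dict with hypothetical keys:
      'first_look': 'singular', 'non_singular', or None (no look at either shape)
      'dwell_singular_ms', 'dwell_non_singular_ms': dwell time on each shape
    """
    first_ns = sum(1 for t in trials if t["first_look"] == "non_singular")
    first_s = sum(1 for t in trials if t["first_look"] == "singular")
    dwell_ns = sum(t["dwell_non_singular_ms"] for t in trials)
    dwell_s = sum(t["dwell_singular_ms"] for t in trials)
    if first_ns + first_s == 0 or dwell_ns + dwell_s == 0:
        raise ValueError("no looks to either object were recorded")
    # Each measure is the total for one object divided by the total for both objects.
    prop_first = first_ns / (first_ns + first_s)
    prop_time = dwell_ns / (dwell_ns + dwell_s)
    return prop_first, prop_time

# Made-up example trials; NOT data from the experiment.
example_trials = [
    {"first_look": "non_singular", "dwell_singular_ms": 400, "dwell_non_singular_ms": 900},
    {"first_look": "singular", "dwell_singular_ms": 700, "dwell_non_singular_ms": 600},
    {"first_look": "non_singular", "dwell_singular_ms": 300, "dwell_non_singular_ms": 1100},
    {"first_look": None, "dwell_singular_ms": 0, "dwell_non_singular_ms": 0},
]
prop_first, prop_time = preference_proportions(example_trials)
```

A value of 0.5 on either measure corresponds to chance, i.e., no preference between the two shapes.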
On each trial, the two shapes were displayed for 2 s. On 77% of the trials, participants
fixated both shapes. A fixation was considered to be on a shape whenever it fell within an 8° x 8°
invisible frame that extended 2.5° around the boundary of the shape, which was always beyond
5° to the right or the left of initial fixation. The invisible frame was constant for all trials and
shapes. For the purpose of counting the number of fixations within each region of interest, a
fixation was defined as gaze longer than 100 ms remaining within a circle with a radius of .67°
of visual angle.
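The fixation criterion described above (gaze remaining within a circle of radius .67° for longer than 100 ms, sampled at 240 Hz) can be sketched as a simple dispersion-based detector. This is an illustrative sketch only, not the analysis code used in the study: the function name, the use of a running centroid as the center of the circle, and the synthetic gaze samples are all assumptions.

```python
import math

def detect_fixations(samples, hz=240.0, min_dur_ms=100.0, radius_deg=0.67):
    """Return a list of (start_index, end_index_exclusive, center_x, center_y).

    samples: list of (x_deg, y_deg) gaze positions recorded at a fixed rate `hz`.
    A window counts as a fixation when every sample lies within `radius_deg`
    of the window centroid and the window lasts longer than `min_dur_ms`.
    """
    fixations = []
    n = len(samples)
    start = 0
    while start < n:
        sum_x, sum_y = samples[start]
        cx, cy = sum_x, sum_y
        end = start + 1
        while end < n:
            # Tentative centroid if the next sample were added to the window.
            k = end - start + 1
            new_cx = (sum_x + samples[end][0]) / k
            new_cy = (sum_y + samples[end][1]) / k
            window = samples[start:end + 1]
            # Simple (quadratic-time) check that the whole window fits the circle.
            if all(math.hypot(x - new_cx, y - new_cy) <= radius_deg for x, y in window):
                sum_x += samples[end][0]
                sum_y += samples[end][1]
                cx, cy = new_cx, new_cy
                end += 1
            else:
                break
        duration_ms = (end - start) * 1000.0 / hz
        if duration_ms > min_dur_ms:
            fixations.append((start, end, cx, cy))
            start = end      # continue after the detected fixation
        else:
            start += 1       # too short: slide the window start by one sample
    return fixations

# Synthetic gaze trace: 30 samples (125 ms) at the center, then 30 samples 6 deg to the right.
gaze = [(0.0, 0.0)] * 30 + [(6.0, 0.0)] * 30
fixations = detect_fixations(gaze)
```

Each detected fixation could then be assigned to a shape by testing whether its center falls inside that shape's 8° x 8° frame.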
2.2. Results and Discussion
All 14 participants fixated longer at the non-singular shapes, M = 61.1%, of the total time
looking at any of the shapes, which was reliably greater than chance (50%), t(13) = 3.48, p
< .005. In addition, 13 of the 14 participants made first saccades more often than chance to the
non-singular object, M = 60.9%, t(13) = 3.32, p < .006. The non-singular geon in all 12 object
sets had longer looking times on average and, for 11 of the 12 sets, subjects looked at the non-
singular shape first more often (for one set it was looked at first 49.1% of the time, not reliably
different from chance).
In ten of the sets there was an element of curvature added in the non-singular variant that
was absent in the singular variant, so it is possible that some of the effect could be explained by
preference for curved features. However, the preference results held for the two comparisons for
which only taper was manipulated. The non-singular objects of these two sets were fixated
longer than chance, M = 60.6%, t(13) = 3.59, p < .004. They were also looked at first on a larger
percentage of the trials, but with fewer observations (and hence reduced power) the first-look
preference for the taper dimension alone fell short of significance (M = 55.8% of the trials,
t(13) = 1.76, p = .102). The
mean number of fixations per trial to the non-singular geon, 2.47, was greater than the mean of
1.88 fixations to the singular geon (t(13) = 3.88, p < .01); and the mean length of a fixation at a
non-singular geon, 317 ms, was longer than that to the singular geon, 292 ms, t(13) = 2.32, p
< .05. Table 1 shows the results for the individual shape dimensions. All of the preferences, as
reflected by eye movements, favored the non-singular objects.
Table 1
Adults’ preference results: percent of trials looking first at the non-singular objects

Dimension                      Num of sets   M       SD      df   t      sig.
Taper                          2             55.8%   12.3%   13   1.76   p = .102
Main Axis Curvature            6             60.1%   12.6%   13   2.99   p < .02
Positive curvature of sides    2             64.3%   18.4%   13   2.91   p < .02
Taper + Positive Curvature     2             65.2%   14.7%   13   3.86   p < .002
All Objects                    12            60.9%   12.3%   13   3.32   p < .006

Adults’ preference results: percent of time looking at the non-singular objects.

Dimension                      Num of sets   M       SD      df   t      sig.
Taper                          2             60.6%   11.1%   13   3.59   p < .004
Main Axis Curvature            6             61.7%   11.2%   13   3.91   p < .001
Positive curvature of sides    2             60.1%   18.9%   13   1.99   p < .07
Taper + Positive Curvature     2             61.1%   14.8%   13   2.81   p < .02
All Objects                    12            61.1%   12.0%   13   3.48   p < .005
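The t values in Table 1 can be approximately recovered from the tabled means and standard deviations, assuming (as the comparison against chance implies) a one-sample t-test of the 14 participants' percentages against 50%. The sketch below is illustrative only; the function name is hypothetical, and because the tabled means and SDs are rounded, the recomputed values may differ slightly from the reported ones.

```python
import math

def one_sample_t(mean, sd, n, mu0=50.0):
    """Return (t, df) for a one-sample t-test of `mean` against `mu0`."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# (M, SD) in percent for the first-look measure, as printed in Table 1; n = 14 participants.
first_look_rows = {
    "Taper": (55.8, 12.3),
    "Main Axis Curvature": (60.1, 12.6),
    "Positive curvature of sides": (64.3, 18.4),
    "Taper + Positive Curvature": (65.2, 14.7),
    "All Objects": (60.9, 12.3),
}
recomputed = {name: one_sample_t(m, sd, 14) for name, (m, sd) in first_look_rows.items()}
```

For the All Objects row this gives t ≈ 3.32 with 13 degrees of freedom, in line with the tabled value.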
The singular stimuli served in more than one set more frequently than the non-singular
stimuli, so there was greater repetition over a run of the singular stimuli compared to the non-
singular stimuli. It is possible that their repetition might have reduced their fixation preference,
an effect that would be at odds with Zajonc’s (2001) “mere-exposure effect,” in which liking for
a stimulus grows with exposure. To assess whether the greater frequency of presentation of
singular stimuli could have resulted in reduced preference, we compared preferences from the
first and last runs (40 trials each). There were no reliable differences either in percent looking
time (First: 61.2% vs. Last: 61.1%, t < 1.00) or percent first fixations to non-singular stimuli
(61.0% vs. 60.9%, t < 1.00). In sum, adults show a preference, consistent across participants and
shapes, for non-singular objects, as measured both by total looking time and first look.
3. Experiment 2: Infant Shape Preferences
It is possible that the preferences shown by adults have their origin in language,
geometrical knowledge, and extensive experience with various objects. Would preferences for
the non-singular shapes be manifested in infancy? Similar to the choice presented to the adults
in Exp. 1, infants were presented with pairs of singular and non-singular objects (one on each
side of the screen), while their eye movements were recorded.
3.1. Method
3.1.1. Participants
Nineteen 5-month-old infants (10 girls, M = 5 months, 2.6 days, range: 4 mos, 8 days to 6
mos, 1 day) were run at the Babylab at the Centre for Brain and Cognitive Development,
Birkbeck, University of London. Data from five additional infants were excluded from final
analyses due to fussiness (i.e., watching less than 10 seconds). Infants were recruited via local-
area advertisements in the greater London metropolitan area and each was given an “I Am an
Infant Scientist” t-shirt or bib for participating.
3.1.2. Stimuli
The same stimuli were used as in Exp. 1, except that the shapes were not scaled to be of
equal relative size (which, based on Exp.1, had little effect), and the stimuli in Exp. 2 covered a
greater area on the screen than in Exp. 1, subtending a visual angle of 9° in height and 4.5° in
width. The objects’ horizontal distance from the center of the screen was ~8-10°.
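For readers who wish to convert the visual angles above into on-screen sizes, the standard relation size = 2 d tan(θ/2) can be applied. The sketch below is illustrative only: the function name is hypothetical, and it assumes the 50 cm viewing distance reported in the Procedure and a flat screen viewed head-on.

```python
import math

def size_from_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) of a stimulus subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

VIEWING_DISTANCE_CM = 50.0  # viewing distance reported in the Procedure

height_cm = size_from_visual_angle(9.0, VIEWING_DISTANCE_CM)
width_cm = size_from_visual_angle(4.5, VIEWING_DISTANCE_CM)
print(f"height ~ {height_cm:.1f} cm, width ~ {width_cm:.1f} cm")
```

On these assumptions the stimuli would have measured roughly 7.9 cm by 3.9 cm on the monitor.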
3.1.3. Procedure
Infants sat in a car seat in a small, quiet room 50 cm from a monitor above which
was mounted a Tobii 1750 corneal-reflection eye-tracker, as shown in Fig. 2b. Their caregivers,
who sat out of their view, were instructed to refrain from commenting on the movies or
interacting with their infant. Fixation calibration was accomplished at 5 points (the corners and
center of a square), and all infants were calibrated to at least 4 points. To accommodate the
infants’ short attention span, each of the 12 pairs of geons was shown once but, because of an
error of presentation, two of the trials mixed shapes from two different sets. These trials were
excluded from the analysis which thus included only the 10 trial types that were identical to
those studied with adults. Each trial lasted 5 s, which was considerably longer than the 2 s
exposure duration for the adults. With an exposure duration longer than 2 s, adults would
become bored with such simple displays and, given their greater number of trials than the
infants, it would be too difficult for them to maintain attention to such a simple task.
The design was similar to that of Exp. 1. During a session for a particular infant, singular
and non-singular objects appeared at most twice in the same locations on successive trials.
Attention getters (still kaleidoscopic circles or squares with either a brring or boing sound) were
presented after every trial to orient infants to the center of the screen. These 1s movie clips
looped until the infants returned their gaze to the center of the screen for approximately 1500 ms.
The experimenter monitored the infants’ looks through an external video camera to turn off the
attention getter with a key press that initiated the next trial. After every fourth trial, a 7 s Sesame
Street clip (“Mahna Mahna”) was presented to promote attention to the screen and general
interest in the task.
3.1.4. Eye-tracking analysis
The raw data were exported through Tobii’s ClearView software. Only fixations longer
than 100 ms were considered for further analysis. These data were then analyzed with Matlab
scripts to assess two measures: 1) the object that was fixated first, and 2) the proportion of the
total time spent looking at singular and non-singular objects. As with adults, fixations were
considered to fall on a particular object if they were recorded within the left or right areas of
interest (15° x 15°) surrounding the object, the closest edge of which was 8-10° away from the
center of the screen. These measures were then averaged across all objects for each infant. One-
sample two-tailed t-tests on the difference between looking to singular and non-singular objects
assessed whether both looking measures were reliably above chance (50%).
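The two looking measures described above can be sketched as follows. This is not the Matlab code used in the study; it is a minimal Python illustration in which all names are hypothetical. It assumes that the areas of interest are centered vertically on the screen midline and that their inner edge lies 8° from center (the text gives 8-10°); the fixation values are made up.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    x_deg: float        # horizontal position relative to screen center (deg)
    y_deg: float        # vertical position relative to screen center (deg)
    duration_ms: float  # fixation duration (ms)

MIN_DURATION_MS = 100.0   # only fixations longer than this are analyzed (from the text)
AOI_SIZE_DEG = 15.0       # 15 x 15 deg areas of interest (from the text)
AOI_INNER_EDGE_DEG = 8.0  # assumption: the text gives 8-10 deg

def aoi_side(fix):
    """Return 'left', 'right', or None if the fixation is outside both AOIs."""
    in_vertical = abs(fix.y_deg) <= AOI_SIZE_DEG / 2
    in_horizontal = (AOI_INNER_EDGE_DEG <= abs(fix.x_deg)
                     <= AOI_INNER_EDGE_DEG + AOI_SIZE_DEG)
    if not (in_vertical and in_horizontal):
        return None
    return "left" if fix.x_deg < 0 else "right"

def trial_measures(fixations, non_singular_side):
    """First-look outcome and looking-time proportion for a single trial."""
    valid = [(aoi_side(f), f.duration_ms)
             for f in fixations if f.duration_ms > MIN_DURATION_MS]
    valid = [(side, dur) for side, dur in valid if side is not None]
    if not valid:
        return None, None  # no usable fixations on either object
    first_to_non_singular = valid[0][0] == non_singular_side
    total = sum(dur for _, dur in valid)
    on_non_singular = sum(dur for side, dur in valid if side == non_singular_side)
    return first_to_non_singular, on_non_singular / total

# Made-up trial: central fixation, look right, a too-short look left, look left
trial = [Fixation(0.5, 0.0, 250), Fixation(12.0, 1.0, 400),
         Fixation(-11.0, -2.0, 80), Fixation(-12.0, 0.5, 600)]
first_look, proportion = trial_measures(trial, non_singular_side="right")
print(first_look, proportion)  # prints: True 0.4
```

Per-infant averages of these two trial-level values would then be the quantities tested against the 50% chance level.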
3.2. Results and Discussion
Infants, like adults, made first saccades more often than chance to the non-singular geons,
M = 59.5% of the trials, t(18) = 2.82, p < .02. Thirteen of the 19 infants made more first saccades to
the non-singular geons. Unlike adults, infants showed no difference in total looking time,
fixating the non-singular geons 48.6% of the time, t(18) < 1.00. The pattern of first saccades was
similar to that of adults suggesting an early onset of the bias, one that precedes language
acquisition and formal training in geometry. The lack of a significant difference in looking time
may have developmental significance, or it could be the result of fussiness and general lack of
interest in the stimuli. Table 2 presents results for individual dimensions.
Given that in five of the ten geon pairs the non-singular geon was slightly bigger than the
singular (and vice versa for the other five), and the finding that infants prefer looking at larger
stimuli (Cohen, 1979), we were concerned that it may be the larger size of those five non-
singular geons that was the cause for the bias to look at them first. If that were the case there
would be a positive correlation between the relative size of the non-singular shape compared to
the singular shape in the same set, and first looks. In fact this correlation was non-significant,
and if anything, negative (r = -.16), suggesting that the small size differences do not explain the
bias.
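The size control described above amounts to a Pearson correlation, computed across geon pairs, between relative size and the percentage of first looks to the non-singular member. A minimal sketch is given below; the function name is hypothetical, relative size is assumed here to be expressed as a ratio, and the data values are invented for illustration and are not the values from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented values for ten geon pairs (NOT the study's data):
# ratio of non-singular to singular size, and % first looks to the non-singular geon
relative_size = [1.10, 1.05, 1.08, 1.12, 1.03, 0.95, 0.92, 0.97, 0.90, 0.94]
first_look_pct = [55.0, 63.0, 58.0, 52.0, 61.0, 60.0, 66.0, 57.0, 62.0, 59.0]

r = pearson_r(relative_size, first_look_pct)
print(f"r = {r:.2f}")
```

A reliably positive r would have indicated that larger non-singular geons attracted more first looks; a near-zero or negative r argues against a size account.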
Table 2.
Infants’ preference: percent of trials looking first at the non-singular objects
Dimension                     Num of sets   M       N     Binomial sig.
Taper                         1             62.5%   16    p < .13
Main Axis Curvature           5             58.1%   86    p < .03
Positive curvature of sides   2             63.9%   36    p < .04
Taper + Positive Curvature    2             57.1%   35    p < .10
All Sets                      10            59.5%   173   p < .005
Infants’ preference: percent of time looking at the non-singular objects
Dimension                     Num of sets   M       SD      df   t       sig.
Taper                         1             49.0%   36.0%   15   <1.00   ns
Main Axis Curvature           5             48.4%   16.4%   18   <1.00   ns
Positive curvature of sides   2             49.4%   22.3%   18   <1.00   ns
Taper + Positive Curvature    2             46.6%   26.8%   18   <1.00   ns
All Sets                      10            48.6%   11.0%   18   <1.00   ns
4. Experiment 3: MRI Study
Yue, Vessel and Biederman (2007) and Vessel and Biederman (2006) reported that
preferred scenes induce greater activity in scene selective cortex. Would the preferred (non-
singular) shapes from Exps. 1 and 2 induce greater activation in shape selective cortex?
The present experiment was initially designed as an fMRI-adaptation study (fMRI-a)
(Grill-Spector et al., 1999) to assess sensitivity to nonaccidental vs. metric properties (NAPs vs.
MPs) in the lateral occipital complex (LOC), an area critical for object recognition (James et al.,
2003). fMRI-a exploits the finding that the repetition of identical stimuli generally produces a
reduced BOLD response relative to when the stimuli are changed. The release of adaptation is
taken as a measure of the dissimilarity of the stimuli. Because we had two
types of changes, NAPs and MPs, the design required trials in which identical stimuli were
repeated, namely singular and non-singular stimuli that were involved in the changes.
Unexpectedly, the repetition of the non-singular stimuli elicited significantly greater BOLD
responses than repetition of the singular stimuli, which precluded the employment of the release
from adaptation effects to assess NAP vs. MP sensitivity. Instead, the study became one of
comparing the baseline activation of singular and non-singular stimuli.
4.1. Method
4.1.1. Participants
Nineteen adult students from the University of Southern California (USC) participated (10 females,
ages 22-36). All participants were screened for safety and gave informed consent in accordance
with the USC Institutional Review Board Guidelines. The sixteen participants who were not
members of I.B.’s lab were compensated for their participation. Lab members and compensated
subjects gave highly similar data. One of the 19 participants (male) was excluded from the
analysis because of low BOLD signal in LOC.
4.1.2 Stimuli
The shapes were presented in isolation at the center of the screen subtending
approximately 1.5° visual angle. Stimuli were displayed with Psychophysics Toolbox (Brainard,
1997) running under Matlab (The MathWorks, Natick, MA). The equal-sized shapes used in Exp.
1 were used for 7 out of the 18 participants. The rest were presented with the shapes before
resizing, as in Exp. 2. Since no differences were observed between the groups, the results were
collapsed over shape size.
4.1.3 Procedure
At the onset of each 2 s trial, two objects from the same set appeared sequentially,
alone at the center of the screen, each for 100 ms, with an inter-stimulus interval of 400 ms between
them. There was a small (~0.5°) lateral translation in the position of the two stimuli with respect to
each other to reduce low-level adaptation effects when the stimuli were identical. When the
objects were not being shown, a white fixation dot was displayed at the center of the screen.
There were four conditions (only the first two of which are relevant to the present investigation):
1. Singular Identical (S-id): The singular object was presented twice (e.g., a straight elongated
brick appearing twice).
2. Non-Singular Identical (NS-id): The object with the highest value along the dimension was
presented twice (e.g., a brick bent to the right such that its main axis is curved, appearing twice).
3. Nonaccidental Property Difference (NAP): The intermediate object was followed by the singular
object (e.g., a brick bent partially to the right followed by a straight brick).
4. Metric Property Difference (MP): The intermediate object was followed by the object with the
highest value on the dimension (e.g., the brick partially bent to the right followed by a brick
with greater main axis curvature in the same direction).
The physical similarity of the pairs of stimuli for the NAP and MP changes was equated by the
Gabor-jet metric.
In a random jittered design, the four conditions and blank trials were balanced with
respect to trial history. While in the scanner, participants were given a task that was intended to
be orthogonal to the main hypothesis so as to not confound the results with task difficulty, but
merely to make sure participants were attending to the display. The task was to press a button
when the fixation dot changed from white to red (100 ms after the presentation of the two
objects), which would occur on 12% of the trials (an equal proportion of times for every
condition). Task performance was at ceiling for all participants. These detection trials were
excluded from the analysis.
Each subject participated in an anatomical scan, three or four experimental runs of 252
trials each, and two LOC localizer runs consisting of blocks of intact and scrambled images of
objects, faces, and scenes (LOC was localized as an area showing greater activation for intact
objects compared with scrambled objects), all in the same session.
4.1.4. Data acquisition
Scanning was performed at USC’s Dana and David Dornsife Cognitive Neuroscience
Imaging Center on a Siemens Trio 3T scanner. A standard 16-channel head coil was used for all
acquisitions.
High-resolution T1 weighted structural scans were performed on each subject using
an MPRAGE sequence (TR=2070 ms, 160 sagittal slices, 256 x 256 matrix size, 1 x 1 x 1 mm
voxels).
Full brain functional images were acquired using an echo planar imaging (EPI) pulse
sequence (TR=1000 ms, TE=30 ms, flip angle=62°, 64 x 64 matrix size, in-plane resolution 3 x 3 mm,
3 mm thick slices, 18 roughly axial slices centered on ventral aspects of the occipital and
temporal lobes).
4.1.5. Data analysis
Preprocessing (3D motion correction using trilinear interpolation, 3D spatial smoothing
using a 4 mm full-width at half-max Gaussian filter, linear trend removal using a high pass filter
set to 3 cycles over the run’s length) and statistical analysis were done with the Brain Voyager
software package (Brain Innovation BV, Maastricht, The Netherlands). Motion-corrected
functional images were coregistered with the same-session anatomical scan and then
transformed into Talairach coordinates. All statistical analysis was done on the transformed data.
We performed a standard, region of interest analysis identical to that described in Hayworth and
Biederman (2006). LOC was defined for each subject individually with an independent localizer
by comparing intact images of objects to scrambled versions of the same images (threshold was
set to: Bonferroni corrected, p < 0.01). LOC ROIs in both hemispheres were subdivided into
posterior, LO (lateral occipital cortex), and anterior, pFs (posterior fusiform gyrus). Early visual
cortex was roughly defined anatomically by a set of voxels about the posterior aspect of the
calcarine sulcus that were activated more for the experimental conditions than blank trials.
For the fast event-related experimental runs, a deconvolution analysis was performed on
data averaged over all voxels within each participant’s ROIs. The BOLD response over each
ROI was deconvolved using a 20-point fitting function. The β values for this deconvolution were
used to calculate the %BOLD change (as a function of time). The peak (average of the %BOLD
change for time points 5-7) of the deconvolved hemodynamic responses for the four conditions
was used in the subsequent analysis. P-values were obtained for the two comparisons (singular
vs. non-singular; NAP vs. MP) with a paired t-test, across participants.
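The last two steps of this analysis (taking the peak as the mean of time points 5-7, then comparing conditions with a paired t-test across participants) can be sketched as follows. This is an illustration rather than the Brain Voyager or Matlab procedure actually used: the function names are hypothetical, time points are assumed to be numbered from 1, and all values are made up.

```python
import math

def peak_response(timecourse, first_point=5, last_point=7):
    """Mean %BOLD change over time points first_point..last_point (inclusive).

    Assumption: time points are numbered from 1, so point 5 is timecourse[4].
    """
    window = timecourse[first_point - 1:last_point]
    return sum(window) / len(window)

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n), n - 1

# Made-up 20-point deconvolved response for one participant and condition
toy_response = [0.0, 0.05, 0.12, 0.22, 0.30, 0.33, 0.29, 0.20, 0.12, 0.06] + [0.0] * 10
print(f"peak = {peak_response(toy_response):.3f}")

# Made-up per-participant peaks for two conditions (NOT the study's data)
ns_id_peaks = [0.31, 0.27, 0.35, 0.22, 0.29]
s_id_peaks = [0.28, 0.26, 0.30, 0.21, 0.25]
t_value, df = paired_t(ns_id_peaks, s_id_peaks)
print(f"t({df}) = {t_value:.2f}")
```

Averaging over a short window around the expected peak, rather than taking the single maximum, makes the estimate less sensitive to noise in any one time point.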
4.2. Results and Discussion
The two Identical conditions were meant to establish baseline activation for singular vs.
non-singular objects. The NAP and MP conditions were designed to assess release from
adaptation effects for NAP vs. MP property changes. For the Identical conditions, non-singular
objects induced higher activation in LOC compared to singular objects, t(18) = 2.95, p < .01, in
LO, and t = 2.21, p < .05, in pFs (Fig. 3). No such baseline activation difference was observed in
early visual areas (t < 1.00). No significant difference was found between the NAP and MP
conditions in any of the ROIs.³
Could the greater baseline activation for non-singular shapes be a consequence of the
singular shapes appearing more similar to each other? To test this we split the experimental runs
in two, comparing runs 1 and 2 to runs 3 and 4 for subjects who completed 4 runs, or run 1 to run
3 for those who completed 3 runs. An ANOVA of Condition (S-Id vs. NS-Id) X
Half (Initial half of experiment vs. Final half) revealed only a main effect of Half, as the average
BOLD signal change in the initial half of the experiment was higher than in the final half, both
in LO (Initial half: M = .29, Final half: M = .20, F(1,18) = 9.53, p < .01) and in pFs (Initial half:
M = .23, Final half: M = .14, F = 14.77, p < .001). Because of reduced power, the Condition
main effect fell short of significance both in LO (S-Id: M = .23, NS-Id: M = .26, F = 2.25, p =
.151) and pFs (S-Id: M = .17, NS-Id: M = .20, F(1,18) = 3.06, p < .1). Critically, there was no
interaction between Half and Condition, with non-singular geons inducing greater activation, on
average, than singular geons both in the initial and final halves of the experiment both in LO
(Initial half: S-Id: M = .28, NS-Id: M = .30; Final half: S-Id: M = .18, NS-Id: M = .21; F < 1),
and pFs (Initial half: S-Id: M = .22, NS-Id: M = .24; Final half: S-Id: M = .12, NS-Id: M = .16;
F < 1). The absence of this interaction suggests that exposure did not have an effect on the main
finding, that non-singular geons induce greater activation.
³ This does not necessarily mean that the expected greater adaptation for the MP compared to
NAP conditions, which would result in a smaller BOLD response for the MP condition, did not
occur. While S1 in both conditions was the intermediate shape, the S2 was singular in the NAP
and non-singular in the MP conditions, which would predict, based on our uneven baseline
activation for the singular vs. non-singular result, that the MP condition would show greater
activation. It could be that both effects were present and cancelled each other.
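For a 2 x 2 fully within-subject design such as Condition X Half, the interaction F(1, n-1) equals the square of a one-sample t computed on each participant's difference of differences. The sketch below illustrates that equivalence; it is not the analysis code used in the study, the function name is hypothetical, and the values are toy numbers in arbitrary units.

```python
import math

def interaction_F(s_initial, ns_initial, s_final, ns_final):
    """Interaction F for a 2 x 2 within-subject design.

    Computed as t squared, where t is a one-sample t-test (against zero) on each
    participant's contrast: (NS - S in initial half) - (NS - S in final half).
    """
    contrast = [(ns_i - s_i) - (ns_f - s_f)
                for s_i, ns_i, s_f, ns_f
                in zip(s_initial, ns_initial, s_final, ns_final)]
    n = len(contrast)
    mean_c = sum(contrast) / n
    var_c = sum((c - mean_c) ** 2 for c in contrast) / (n - 1)
    t = mean_c / math.sqrt(var_c / n)
    return t * t, (1, n - 1)

# Toy values for four participants (arbitrary units, NOT the study's data)
F, dfs = interaction_F(s_initial=[1, 2, 3, 4], ns_initial=[3, 3, 5, 5],
                       s_final=[1, 1, 2, 2], ns_final=[2, 3, 3, 5])
print(f"F{dfs} = {F:.3f}")
```

An F below 1, as in this toy example, indicates that the size of the non-singular advantage did not change reliably between halves.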
Figure 3. Top: percent signal change for Singular Identical and Non-Singular Identical
conditions, showing the baseline activation for each object type, in LO and pFs. Non-singular
shapes induce significantly higher activation. Bottom: Averaged percent signal change across
time-points 5-7 for all experimental conditions. S-Id and NS-Id differ reliably, but NAP and MP
do not, although the subtraction of the control Identical conditions (S-Id for NAP and NS-Id for
MP) would yield a greater BOLD response for the NAP than the MP conditions.
To what extent is the greater fMRI activity observed here in LOC to non-singular shapes
mirrored in single cell tuning in macaque IT, the likely homologue to LOC (Kriegeskorte, et al.,
2008)? Kayaert et al. (2005) presented a series of 2D shapes and recorded single-unit activity in
IT. The 2D shapes were GCs in which the cross section is a line. Starting from a rectangle or a
triangle, the shapes were varied along several dimensions, with the non-singular variations
defining six levels for each dimension: axis curvature, negative curvature of the sides, positive
curvature of the sides, and (for the rectangle only) taper. Three results are relevant for the
present discussion: a) A multidimensional analysis
of the responses of 98 IT neurons showed that 95% of the activity could be accounted for by
independent tuning to the various dimensions, b) almost all the tuning functions were monotonic,
with the peak response at an extreme of a continuum, e.g., a given neuron might have the highest
activity to a straight axis whereas another would have the highest activity to the most highly
curved axis, and c) there was a strong tendency for the highest activity, overall, to be at the
non-singular extreme of each of the dimensions (Fig. 4). In the general discussion, we propose a
linkage between higher activity and preference.
Figure 4. Average firing rate of 98 cells in macaque IT. Notice the greater population response
for non-singular shapes with high values along the dimensions. The two singular shapes, the
rectangle and triangle, are leftmost on each row. Adapted with permission from Kayaert, et al.
2005.
5. General Discussion
To summarize, we found that both adults and infants preferentially look at shapes with
high non-singular values, and that the same non-singular shapes produce greater BOLD activity
in adult human shape selective cortex. This effect is not feed-forward from early visual areas and
the BOLD results are consistent with a previous report of greater single-unit activity to non-
singular shapes in macaque IT (Kayaert, et al. 2005).
5.1. Eye Tracking and fMRI
Adults and infants looked first, and adults looked longer, at non-singular geons. Non-
singular geons were tapered or had particular curved contours that were straight in the singular
geons. With respect to curvature, these results are consistent with previous studies reporting
preferential looking at curved features (though typically confounded with pointy vs. curved
contour terminations rather than along the contour) both in infants (Cohen, 1979; Quinn et al.,
1997; Ruff & Birch, 1974) and adults (e.g., Bar & Neta, 2006).
We found greater activation to non-singular geons in shape selective cortex, LOC, but not
in early visual areas. The fMRI results are consistent with reports of greater activity in macaque
IT to non-singular shapes (Kayaert, et al. 2005).
5.2 Relation to Asymmetries in Visual Search
Treisman and Gormican (1988) reported search asymmetries in which, for example,
curved targets “pop-out” from straight distractors or converging lines pop out from parallel
distractors but not the reverse (e.g., a straight target does not pop out from curved distractors). Ju
(1990) found such search asymmetries with geons. Our results provide, for the first time, a
neural account of the asymmetries: non-singular targets elicit greater neural activity and
(presumably because of this greater activity) attract fixations, thus rendering these non-singular
targets more detectable.
5.3 Why would non-singular shapes elicit greater neural activity and attract eye fixations?
Why would non-singular shapes attract visual fixations? There is evidence that human
saccades tend to maximize the rate of information acquisition. Loschky and McConkie (2002)
reported that in gaze-contingent displays, saccades avoided regions that were blurred during the
initiation of the saccade. First fixations tend to go to salient locations (Itti & Koch, 2001) and
locations that have greater uncertainty or a maximum amount of local information (Renninger et
al., 2007). Previous studies have shown that we are more sensitive to nonaccidental than metric
differences (Biederman & Bar, 1999; Biederman et al., 2009). Although a definitive answer
will have to await additional research, we speculate that because the non-singular values are
indeterminate and sensitive to rotation in depth they require more processing and/or their
neuronal representation is less sparse/efficient than that of the nonaccidental singular values.⁴
The increased activity might be an internal correlate of this indeterminacy.
5.4. Verbal Reports vs. Eye Tracking.
Verbal expressions of preference by adult subjects to curved over pointy shapes have
been documented in a number of studies (Silvia & Barona, 2009; Bar & Neta, 2006; Carbon,
2010; Leder & Carbon, 2005; Hevner, 1935). In contrast to our studies in which curved edges
were always compared to straight ones along the length of the contour, these studies compared
curved with pointy endpoints, where contours coterminated. A negative preference for shapes
with sharp corners may be the result of sharp corners signaling danger, as supported by greater
activation in the amygdala for those “pointy” shapes (Bar & Neta, 2007). Another critical
difference is that the above studies used verbal reports of liking for curvy shapes, while we
recorded eye movements and showed that participants looked more at the shapes with curved
contours than at those with straight ones.
Would verbal reports of liking agree with eye-tracking results? To test this, we used the
same display of singular/non-singular pairs of objects, but instead of recording eye movements
as in the free-viewing Exp. 1, we asked subjects to report (by key press) which one of the two
shapes they liked better. None of the subjects had prior exposure to the stimuli. All eight subjects
liked the singular shapes better. It is likely that the singular geons conformed more to what the
Gestaltists termed “good figure,” or Prägnanz, or what might be termed a “Platonic ideal.” That
is, these shapes seemed more well formed, symmetrical, and “not broken/twisted”. However, the
pattern of visual saccades suggests that the non-singular shapes are more interesting, and that
ultimately people are motivated to look at the non-singular shapes. Similarly one would expect
the average face to be preferred (average faces are beautiful/healthy), but twisted faces are more
interesting, draw more attention, and elicit increased activation in face selective visual cortex
(Leopold et al., 2006).
⁴ Greater activity, however, does not necessarily equal greater sensitivity. There is much
evidence that humans are more sensitive to a shape change from a singular to a non-singular
value or vice versa, i.e., a nonaccidental change, than to a physically equivalent shape change
from one non-singular value to another, i.e., a metric change (Biederman & Bar, 1999; Biederman
et al., 2009). This greater behavioral sensitivity is mirrored in greater modulation of macaque IT
cells to nonaccidental changes (Kayaert et al., 2003).
5.5. Motivation for Information
The results show that we are motivated to look at non-singular stimuli, presumably
because they offer more information, and that the non-singular stimuli produce greater activation
in shape selective cortex. What might be the underlying motivational factor causing this greater
inspection of stimuli with more information?
There is evidence that stimuli that offer more novel (Ranganath & Rainer, 2003) and
richly interpretable (Grill-Spector et al., 1998) information elicit greater activity in ventral stream
areas such as LOC and PPA and people are more prone to look at such stimuli that induce greater
activation in those areas. Direct evidence for a link between activation and preference was
obtained by Yue et al. (2007) who found that scenes that elicited greater activation in the
parahippocampal place area (PPA, Epstein & Kanwisher, 1998) received higher preference
ratings.⁵ Here we have shown a similar relation between visual sampling of simple shapes and
activity in object selective cortex LOC.
We speculate that a mediating factor between the greater activation and the perceptual
preference could be cortical opioid activity. In 1981, Lewis et al. discovered a gradient of mu
opioid receptors in the ventral cortical pathway mediating object recognition in the macaque.
Such receptors are sparse in the early stages (e.g., V1 and V2) but increase in density reaching
their maximum in associative cortex, where perceptual information activates stored
associations and where, presumably, interpretation and comprehension are achieved (Bar, 2004).
Biederman and Vessel (2006) proposed that increased activation in these associative areas would
produce greater opioid activity and, assuming that such activity is pleasurable, produce greater
pleasure and preference. This cortical opioid system may be the neural correlate motivating us to
maximize information assimilation.
⁵ As noted earlier, there are other factors besides interest that may affect preference ratings, such
as symmetry for shapes, health indicators for faces, etc., and there are also factors unrelated to
information content that can elicit increased activation, e.g., fearful faces cause greater activation
in right FFA than neutral faces (Vuilleumier et al., 2001). We (and Yue et al., 2007) used stimuli
that were emotionally neutral to minimize such effects.
5.6. Developmental Implications.
Our finding that 5-month-old infants tended to look first at the non-singular geons
suggests that the bias to look at non-singular shapes begins before language acquisition or formal
training in geometry takes place, and does not reflect cultural values. Instead, it likely reflects a
cognitive mechanism, much like that of adults, which directs infants’ attention to the more
informative locations in the environment. Unlike adults, however, infants did not show greater
looking time at the non-singular geons, possibly due to the long (5 s) presentation time, and
fussiness/lack of enthusiasm for our stimuli. If we assume that the infant’s bias to look at the
non-singular geons has the same neural basis as that of the adult, and that the adult’s bias is
related to greater LOC activation, then an implication is that the infant’s ventral pathway is
sufficiently functional in infancy to produce this bias.
6. Conclusions.
Our findings suggest that a mechanism is in place to direct our attention to sample the
more informative segments of the visual environment; a mechanism that may use increased
activation as a signal for assimilating information-rich stimuli. Such a mechanism may exist
from infancy to render us infovores.
Acknowledgments
Supported by NSF BCS 04-20794, 05-31177, 06-17699 to I.B. The infant study was supported
by a grant to R.W. and N. Z. Kirkham from the University of London Central Research Fund and
a grant to Mark Johnson from the UK Medical Research Council, G0701484. The infant data
were collected by Rachel Wu. We thank Paul Quinn, Denis Mareschal, Natasha Z. Kirkham, and
Mark H. Johnson, for their input on the infant shape preference study, Laurent Itti for allowing
us to use his eye-tracker and custom code, and Nader Noori for his help in setting up the adult
eyetracking experiment. We also thank Ken Hayworth, Mark Lescroart, Xiaokun Xu and Jiye
Kim for their many helpful inputs and Matlab code, Xiaomin Yue for the Gabor-jet scaling codes,
and Jianchang Zhuang for his assistance with the MRI scanning.
References
Bar, M. (2004). Visual objects in context. Nature Reviews Neuroscience, 5, 617–629.
Bar, M., & Neta, M. (2006). Humans prefer curved visual objects. Psychological Science, 17,
645-648.
Biederman, I., & Vessel, E. (2006). Perceptual pleasure and the brain. American Scientist, 94,
249–255.
Binford, T. O. (1971). Visual perception by computer. Paper presented at the Proceedings of the
IEEE conference on Systems and Control. Miami, FL.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10, 433–436.
Cohen, L. (1979). Our Developing Knowledge of Infant Perception and Cognition. American
Psychologist, 34(10), 894–899.
Epstein, R., & Kanwisher, N. (1998). A cortical representation of the local visual environment.
Nature, 392(6676), 598–601.
Grill-Spector, K., Kushnir, T., Edelman, S., Avidan, G., & Itzchak, Y. (1999). Differential
Processing of Objects under Various Viewing Conditions in the Human Lateral Occipital
Complex. Neuron, 24, 187-203.
Grill-Spector, K., Kushnir, T., Hendler, T., Edelman, S., Itzchak, Y., Malach, R., et al. (1998). A
Sequence of Object-Processing Stages Revealed by fMRI in the Human Occipital Lobe.
Human Brain Mapping, 6(4), 316-328.
Hansen, B. C., & Essock, E. A. (2004). A horizontal bias in human visual processing of
orientation and its correspondence to the structural components of natural scenes. Journal of
vision, 4(12), 1044-60.
Hayworth, K. J., & Biederman, I. (2006). Neural evidence for intermediate representations in
object recognition. Vision research, 46(23), 4024-31.
Itti, L., & Koch, C. (2001). Computational modeling of visual attention. Nature reviews.
Neuroscience, 2(3), 194-203.
James, T. W., Culham, J., Humphrey, G. K., Milner, A. D., & Goodale, M. A. (2003). Ventral
occipital lesions impair object recognition but not object-directed grasping: an fMRI study.
Brain: A Journal of Neurology, 126(11), 2463-75.
Ju, G. (1990). The role of attention in object recognition: The attentional costs of processing
contrasts of non-accidental properties. Unpublished doctoral dissertation, State University of
New York at Buffalo.
Kayaert, G., Biederman, I., Op De Beeck, H. P., & Vogels, R. (2005). Tuning for shape
dimensions in macaque inferior temporal cortex. The European journal of neuroscience, 22(1),
212-24.
Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H., et al. (2008).
Matching categorical object representations in inferior temporal cortex of man and monkey.
Neuron, 60(6), 1126-41.
King, M., Meyer, G. E., Tangney, J., & Biederman, I. (1976). Shape constancy and a perceptual
bias towards symmetry. Perception & Psychophysics, 19, 129-136.
Lades, M., et al. (1993). Distortion Invariant Object Recognition in the Dynamic Link
Architecture. IEEE Transactions on Computers, 42, 300-311.
Leehey, S. C., Moskowitz-Cook, A., Brill, S., & Held, R. (1975). Orientational anisotropy in
infant vision. Science (New York, N.Y.), 190(4217), 900-2.
Leopold, D. A., Bondar, I. V., & Giese, M. A. (2006). Norm-based face encoding by single
neurons in the monkey inferotemporal cortex. Nature, 442(7102), 572-5.
Lewis, M., Mishkin, M., Bragin, E., Brown, R., Pert, C. A., et al. (1981). Opiate receptor
gradients in monkey cerebral cortex: correspondence with sensory processing hierarchies.
Science, 211(4487), 1166-9.
Loschky, L. C., & McConkie, G. W. (2002). Investigating spatial vision and dynamic attentional
selection using a gaze-contingent multiresolutional display. Journal of Experimental
Psychology: Applied, 8(2), 99-117.
Marr, D., & Nishihara, H. (1978). Representation and recognition of the spatial organization of
three-dimensional shapes. Proceedings of the Royal Society of London. Series B, Biological
Sciences, 200(1140), 269–294.
Quinn, P. C., Brown, C. R., & Streppa, M. L. (1997). Complex Perceptual Organization of
Visual Configurations by Young Infants. Infant Behavior and Development, 20(1), 35-46.
Ranganath, C., & Rainer, G. (2003). Neural mechanisms for detecting and remembering novel
events. Nature Reviews Neuroscience, 4(3), 193-202.
Renninger, L. W., Verghese, P., & Coughlan, J. (2007). Where to look next? Eye movements
reduce local uncertainty. Journal of vision, 7(3), 6.
Ruff, H. A., & Birch, H. G. (1974). Infant Visual Fixation: The Effect of Concentricity,
Curvilinearity, and Number of Directions. Journal of Experimental Child Psychology, 17, 460-473.
Treisman, A., & Gormican, S. (1988). Feature Analysis in Early Vision : Evidence From Search
Asymmetries. Psychological Review, 95(1), 15-48.
Vuilleumier, P., Armony, J. L., Driver, J., & Dolan, R. J. (2001). Effects of attention and
emotion on face processing in the human brain: an event-related fMRI study. Neuron, 30(3),
829-41.
Xu, X., Yue, X., Lescroart, M., Biederman, I., & Kim, J. G. (2009). Adaptation in the fusiform
face area (FFA): Image or Person? Vision Research, 49, 2000-2007.
Yue, X., Vessel, E. A., & Biederman, I. (2007). The neural basis of scene preferences.
Neuroreport, 18(6), 525-9.
Zajonc, R. (2001). Mere Exposure: A Gateway to the Subliminal. Current Directions in
Psychological Science, 10(6), 224-228.
Chapter 3: The Neural Genesis of a Joke
Abstract
Unlike humor appreciation, humor creation has not previously been studied with fMRI. Forty
participants including professional and amateur improv comedians and controls viewed New
Yorker cartoon drawings of human interactions (without the captions) and, in one condition
(HUM), generated a humorous caption, and, in another (MUN), a mundane caption. The
HUM condition produced bilaterally greater activation in the striatum, medial prefrontal
cortex (mPFC) and temporal association regions. A dose response function was evident such
that greater activation in those regions correlated with funnier punchlines. While the same
regions were involved in both humor appreciation and creation, only in humor creation did
activation in reward regions precede activation in temporal association regions. Greater
comedic experience was associated with decreased activation in the striatum and mPFC, but
increased activation in anterior temporal regions. We propose that the mPFC helps to direct
the search through association space taking place in the anterior temporal regions and that
such intervention is needed less for more experienced comedians who, to a greater extent,
reap the fruits of their spontaneous associations.
Keywords: humor creation; fMRI; creativity; expertise; medial prefrontal cortex; anterior
temporal cortex.
1. Introduction
While several imaging studies investigated the neural basis of passive humor appreciation
(e.g. Amir et al., 2013; Vrticka et al., 2013), none has investigated active humor creation. A
prominent figure in humor research even argued that such a study would be impossible due to the
spontaneous fashion in which humorous ideas are conceived (Martin, 2010). The phenomenon of
improv comedy (e.g. as in the TV show “Whose Line is it Anyway”), in which comedians
routinely generate humorous responses rapidly and on cue, however, suggests otherwise. In the
current investigation, we had professional improv comedians generate humorous or mundane
captions to New Yorker cartoons while undergoing an fMRI scan and found a highly consistent
pattern of activation underlying the genesis of their creations. To investigate the effects of
expertise/talent, two additional groups of participants were included: amateur comedians and
controls with no experience in comedy performance.
1.1. Humor Appreciation
Previous imaging studies of passive humor appreciation typically suggest the activation
of brain regions engaged in three functions: a) Regions necessary for the detection and cognitive
processing of humor, e.g., the temporo-occipital junction for the detection of incongruity (Chan
et al., 2012; Mobbs et al., 2003), and anterior temporal regions where remote associations might
be assessed as to whether they provide a humorous resolution (Amir et al., 2013), b)
Classical reward regions that underlie the emotional or hedonic component of humor
appreciation, e.g., striatum, amygdala, and ventromedial prefrontal cortex (e.g. Moran et al.,
2004; Vrticka et al., 2013), c) Regions that are involved in the processing of only specific types of
humor, e.g. language regions for puns, visual regions for visual gags (Watson et al., 2006), right
temporal-parietal junction for jokes relying on theory of mind (Samson et al., 2008).
However, the division into cognitive processing vs. hedonic regions might be an
oversimplification. Amir et al. (2013) found a dose response, i.e. greater activation associated
with greater funniness ratings, not only in classical reward regions, but also in the bilateral
temporo-occipital junction and temporal poles. That finding is in line with a body of research
demonstrating a positive correlation between perceptual preferences and activation in higher-level
association cortices, e.g. preferences for images of scenes and simple shapes are associated
with greater activation in higher level visual cortex (Amir et al., 2011; Yue et al., 2007). Such
findings support Biederman and Vessel’s (2006) hypothesized motivational system that underlies
the spontaneous attentional selection to novel and richly interpretable information, the kind of
information that generates greater activity in higher-level association cortex. They speculated
that such a motivational system might be implemented via a gradient of µ-opioid receptors,
discovered by Lewis et al. (1981), that are sparse in early cortices (e.g. primary visual cortex)
and increase in density as one proceeds up the cortical hierarchy (e.g. association visual cortex,
semantic temporal regions).
1.2. Expertise
The few studies of the neural correlates of expertise compared experts to controls on a
task related to their field of expertise, but which required no creativity (e.g., architects
evaluating the aesthetics of buildings, Kirk et al., 2009; dancers viewing dance videos,
Calvo-Merino et al., 2005). The few studies of creativity that studied experts (typically of musicians
improvising) did not have a control group of non-experts (e.g. Limb & Braun, 2008; Liu et al.,
2012). Here, for the first time, we compare activation of experts (professional comedians)
performing a creative task (generating humorous captions to cartoons) to non-experts.
1.3. Creativity
A review of the literature on creativity offers little overlap in the pattern of activation for
different operationalizations of “creativity” (Dietrich & Kanso, 2010). The evidence suggests
that creative endeavors in different domains (e.g. music improvisation vs. painting) may rely on
different regions similar to the modularity observed in perception, e.g. auditory vs. visual cortex.
A more promising approach to understanding the neural processes underlying creativity might be
to compare the pattern of activation associated with appreciation of the creative product to that
underlying its creation. Here we explore the similarities and differences in the time course of
activation for humor creation vs. appreciation by overlaying our current results on humor
creation with those from our previous study of humor appreciation (Amir et al., 2013).
2. Method
Participants underwent fMRI scanning while looking at line drawings of human
interactions in various contexts (e.g. office, cocktail party), originally appearing in the
New Yorker magazine. In order to isolate active humor generation from any effects of passive
humor appreciation, we selected drawings that were not funny by themselves (the funny captions
that originally appeared with the drawings and all other text were removed, and some drawings
were processed with Photoshop to remove elements that were inherently funny). Prior to the
presentation of each cartoon, subjects were cued to generate a) a humorous caption, b) a
mundane caption or c) no caption. Each participant rated on a 4-point scale how funny their
caption was on each trial. In addition, independent ratings of those captions were obtained from other
raters, allowing us to compare the neural correlates of successful vs. unsuccessful humor
generation. Finally, to observe the effect of experience/talent we compared 3 groups of
participants: professional comedians, amateur comedians and controls.
2.1. Participants
The location of our lab, near Hollywood, made it not too difficult to recruit professional
and promising amateur comedians. The 40 participants were categorized into 3 groups:
a) professional comedians (13 individuals, mean age 35.4, range: 26-47; 1 female). Six
were members of the famous Los Angeles’ “Groundlings” troupe and seven were professional
stand up comedians with stand up related TV credits (e.g. late night show appearances, stand up
specials). No significant differences were observed in the pattern of activity of professional
improv or stand up comedians, so the two groups were collapsed in all further analyses into a
“Professionals” group.
b) Nine promising amateur comedians (Mean age 27.2, range: 20-33; 2 females) each
with several years of experience in stand up and/or improv, who demonstrated a significant
potential to become professional comedians relative to their peers.
c) 18 controls (Mean age 24.9, range: 19-34; 7 females). Controls were all either honor
students, graduate students or faculty at the University of Southern California selected to roughly
match the high intelligence reported for successful comedians (Greengross et al., 2012). We have
controlled for age effects in all group comparisons.
2.2. Procedure
Each trial (see Fig. 1) lasted 17 seconds. For the first 2 seconds of a trial, a single word
cued the desired caption type: 1) Humor (HUM), the participant's task was to think of a funny
caption for the drawing (cue word: “Humorous”); 2) Mundane (MUN) think of a caption that
would fit the drawing but be mundane and expected (cue word: “Expected”); 3) Nothing
(NOTH) just look at the drawing without thinking of a caption (cue word: “Nothing”). Then a
drawing depicting a human interaction appeared at the center of the screen (subtending a visual
angle of ∼8 deg.). In the HUM and MUN conditions participants had 15 seconds to generate a
caption for the drawing and rate it for funniness. Once participants had thought of a caption
they were instructed to immediately rate it, using a keyboard, on a 4-point scale (1 - not funny, 2
- a little funny, 3 - pretty funny, 4 - very funny). Each participant saw each drawing once, and
drawings were counterbalanced across the 3 conditions between participants. Each run lasted 7.9
minutes with 24 jittered trials sequenced such that each sequence of 2 conditions appeared the
same number of times. Most participants completed 6 runs; all completed at least 4. No runs or
participants were discarded. Presentation sequences were programmed with Psychophysics
Toolbox (Brainard 1997; Pelli 1997) running on MATLAB (The MathWorks, Natick, MA,
USA).
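The balancing constraint described above (each ordered pair of consecutive conditions occurring equally often) can be checked with a short script. The following is only a minimal sketch in Python, not the Psychophysics Toolbox code used in the study; the function names and the 10-trial example sequence are hypothetical, and it assumes that repeats of the same condition count as pairs.

```python
from collections import Counter
from itertools import product

CONDITIONS = ("HUM", "MUN", "NOTH")  # condition labels taken from the text


def transition_counts(sequence):
    """Count how often each ordered pair of consecutive conditions occurs."""
    return Counter(zip(sequence, sequence[1:]))


def is_transition_balanced(sequence, conditions=CONDITIONS):
    """Return True if every ordered pair of conditions occurs equally often."""
    counts = transition_counts(sequence)
    per_pair = [counts[pair] for pair in product(conditions, repeat=2)]
    return len(set(per_pair)) == 1


# Hypothetical 10-trial sequence in which each of the 9 ordered pairs occurs once.
example = ["HUM", "HUM", "MUN", "MUN", "NOTH",
           "NOTH", "HUM", "NOTH", "MUN", "HUM"]
```

Calling is_transition_balanced(example) returns True for this sequence, whereas a simple alternation such as HUM, MUN, HUM, MUN fails the check because most pairs never occur.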
Figure 1: The time-course of a trial.
2.3. Data Acquisition
Data acquisition and preprocessing parameters were matched with those of a previous
investigation of the neural correlates of passive humor perception (Amir et al., 2013), to allow a
comparison to humor generation. All fMRI images were scanned at USC’s Dana and David
Dornsife Cognitive Neuroscience Imaging Center on a Siemens Trio 3T scanner with a standard
16-channel head coil. Each subject underwent a high-resolution T1-weighted structural scan using an
MPRAGE sequence (repetition time (TR) = 1100 ms, 192 sagittal slices, 256 × 256 matrix size,
1 × 1 × 1 mm voxels).
Functional images were acquired using an echo-planar imaging (EPI) pulse sequence
with the parameters: TR = 2000 ms, TE = 30 ms, flip angle = 62°, 256 × 256 matrix size, in-plane
resolution 3 × 3 mm, 3 mm thick slices, 32 axial slices covering as much of the brain as
possible, always including the Temporal Poles, but occasionally missing the superior rim of the
primary motor and somatosensory cortices.
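As a rough arithmetic check on these parameters, the number of functional volumes per run and the extent of the slice stack can be computed directly. This is a sketch in Python under the assumption of no inter-slice gap, which the text does not state; the constant and function names are illustrative only.

```python
TR_S = 2.0                # functional repetition time (2000 ms in the text)
RUN_MINUTES = 7.9         # run duration reported in the Procedure section
N_SLICES = 32             # axial slices per functional volume
SLICE_THICKNESS_MM = 3.0  # slice thickness reported in the text


def volumes_per_run(run_minutes=RUN_MINUTES, tr_s=TR_S):
    """Number of functional volumes acquired in one run."""
    return round(run_minutes * 60 / tr_s)


def slab_coverage_mm(n_slices=N_SLICES, thickness_mm=SLICE_THICKNESS_MM,
                     gap_mm=0.0):
    """Extent of the slice stack; gap_mm = 0 is an assumption, not a reported value."""
    return n_slices * thickness_mm + (n_slices - 1) * gap_mm
```

Under these assumptions a run comprises 237 volumes and the stack spans 96 mm, which would be consistent with occasionally missing the most superior cortex when the temporal poles are always included.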
2.4. Data Analysis
Preprocessing (3D motion correction using Trilinear interpolation, 3D spatial smoothing
using a 4-mm full-width at half-max Gaussian filter, linear trend removal using a high-pass filter
set to 3 cycles over the run’s length) was done with the Brain Voyager software package (Brain
Innovation BV, Maastricht, The Netherlands). Statistical analysis was done using MATLAB
scripts along with Brain Voyager, and Python. Motion corrected functional images were
coregistered with the same session’s anatomical scan. Coregistered images were then
transformed to Talairach coordinates and underwent statistical analysis.
Statistical analysis was based on a general linear model with a separate regressor for 12 TRs
from the beginning of each trial type. The 6 motion correction parameters (3D translation and 3D
rotation) were included in the design matrix of the regression to eliminate any potential motion
artifacts. We then conducted a whole-brain, random-effects group average analysis. We defined
regions of interest (ROIs) using the data from all participants with different contrasts (HUM-
MUN, HUM+MUN-2xNOTH), TR-intervals (3-6, 5-9), as well as TRs obtained in a previous
experiment on passive humor appreciation. For the purpose of defining ROIs, we used different
p-values for different (contrast, TR-interval) combinations, never higher than p = .01
uncorrected. P values were made more conservative in order to define smaller, well defined,
ROIs as necessary (e.g. for the contrast HUM-MUN, we used p < .01 Bonferroni corrected).
The ROIs were then used to compare activation in the different participants groups, and to assess
whether the pattern of activation in the region encodes the funniness of the caption.
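One plausible reading of "a separate regressor for 12 TRs from the beginning of each trial type" is a finite impulse response (FIR) style design, with one 0/1 column per condition and post-onset TR. The sketch below, in Python, builds such a matrix; it is an illustration under that assumption rather than the Brain Voyager or MATLAB code used here, and the function name and onset values are hypothetical. In a full model the 6 motion parameters would be appended as additional columns.

```python
def fir_design_matrix(onsets_by_condition, n_scans, n_lags=12):
    """Build a 0/1 design matrix with one column per (condition, lag) pair.

    onsets_by_condition maps a condition label to its trial onsets, expressed
    as 0-based scan (TR) indices. Columns are ordered by sorted condition
    label, then by lag.
    """
    conditions = sorted(onsets_by_condition)
    n_cols = len(conditions) * n_lags
    matrix = [[0] * n_cols for _ in range(n_scans)]
    for c_idx, cond in enumerate(conditions):
        for onset in onsets_by_condition[cond]:
            for lag in range(n_lags):
                scan = onset + lag
                if scan < n_scans:  # drop lags that run past the end of the run
                    matrix[scan][c_idx * n_lags + lag] = 1
    return conditions, matrix


# Hypothetical toy run: one HUM trial at scan 0 and one MUN trial at scan 9.
conds, X = fir_design_matrix({"HUM": [0], "MUN": [9]}, n_scans=20)
```

Each fitted coefficient then estimates the response of one condition at one TR after trial onset, which is what allows contrasts to be restricted to TR-intervals such as 3-6 or 5-9.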
2.5. Obtaining Independent Ratings
Following the fMRI scan, participants were presented with the images from their last 1-2
trials (time permitting) and were asked to recall and write down the captions they had generated.
81 students of the psychology department were recruited to rate the recalled captions for course
credit. Each spent an hour rating a fraction (typically a quarter) of the total number of captions
on a 7-point scale for funniness, cleverness and offensiveness. Ratings were normalized for each
participant before all ratings were averaged.
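The normalization step can be illustrated as follows. This Python sketch assumes that "normalized" means z-scoring within each rater before averaging across raters for each caption; the text does not specify the exact procedure, and the data layout and function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore_by_rater(ratings):
    """ratings: {rater: {caption_id: raw score}} -> same layout, z-scored per rater."""
    normalized = {}
    for rater, scores in ratings.items():
        mu = mean(scores.values())
        sigma = pstdev(scores.values())
        normalized[rater] = {
            # A rater who gave identical scores to everything carries no information.
            cap: (s - mu) / sigma if sigma > 0 else 0.0
            for cap, s in scores.items()
        }
    return normalized


def average_by_caption(normalized):
    """Average the normalized scores of each caption across raters."""
    pooled = {}
    for scores in normalized.values():
        for cap, z in scores.items():
            pooled.setdefault(cap, []).append(z)
    return {cap: mean(zs) for cap, zs in pooled.items()}


# Hypothetical ratings on the 7-point scale from two raters.
raw = {"rater_a": {"c1": 1, "c2": 3}, "rater_b": {"c1": 2, "c2": 6}}
funniness = average_by_caption(zscore_by_rater(raw))
```

Normalizing first prevents a rater who uses only the top of the scale from dominating the average; in this toy example both raters agree on the ordering, so c1 receives -1 and c2 receives +1.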
3. Results
3.1. Humorous vs. Mundane
Taken as one group the 40 participants showed significantly greater activation during
HUM relative to MUN trials in bilateral striatum, medial prefrontal cortex (mPFC), temporo-occipital
junction (TOJ) and primary visual cortex (V1)⁶ (p < .01, Bonferroni corrected; see
Table 1). A conjunction analysis of MUN and HUM conditions, contrasted with activation
during the NOTH condition revealed additional activations in temporal regions – particularly the
bilateral anterior temporal regions and the temporo-occipital junctions (p < .001, uncorrected; see
Fig 2 & Table 2). All of those regions have previously been implicated in studies of humor
appreciation (Amir et al., 2013; Watson et al., 2006), with two differences: the center of
activation in humor production relative to appreciation for both the anterior temporal regions and
temporo-occipital junctions was slightly more posterior, and the time course of activation in
those regions differed markedly from that for humor appreciation.
⁶ The greater activation of V1 during the HUM condition may reflect a greater effort to search for
aspects of the drawing with a comedic potential. In the case of passive humor appreciation, Watson
et al. (2006) reported greater visual cortex activation for visual gags (relative to non humorous
control visual stimuli), but not to language gags; this additional visual activation was suggested to
reflect the resolution of the punchline.
Figure 2: Regions with differential activation for HUM vs. MUN. The activation was bilateral in
all regions; we display only the temporal regions in the right hemisphere for convenience.
3.2. Funniness Dose Response
We observed a dose response in professional comedians in the striatum, and temporal
regions (particularly bilateral TOJ) so that the funnier they rated their caption the greater the
activation was in these regions that were localized by subtracting MUN from HUM trials (only
rated captions from the HUM condition were included in this analysis; see Fig 3 & Table 1). The
dose response occurred early in the time-course of activation (the early peak of TR=4-6 was
used), suggesting it pertains to the process of creating the humorous caption rather than its
evaluation. Professional Comedians also showed a dose response when the funniness of their
captions was measured by independent raters.
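One simple way to quantify such a dose response for a single ROI is to correlate each trial's funniness rating with its mean signal over the early peak (TRs 4-6). The Python sketch below is an illustration only: the text does not state which statistic was used, the 1-based TR indexing is an assumption, and the toy data are hypothetical.

```python
from math import sqrt


def mean_early_peak(timecourse, first_tr=4, last_tr=6):
    """Average signal over TRs first_tr..last_tr (1-based, inclusive)."""
    window = timecourse[first_tr - 1:last_tr]
    return sum(window) / len(window)


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def dose_response(ratings, timecourses):
    """Correlate per-trial funniness ratings with early-peak ROI activation."""
    peaks = [mean_early_peak(tc) for tc in timecourses]
    return pearson_r(ratings, peaks)


# Hypothetical trials whose early-peak signal scales with the 1-4 rating.
toy_ratings = [1, 2, 3, 4]
toy_timecourses = [[0, 0, 0, 0.1 * r, 0.1 * r, 0.1 * r, 0, 0, 0, 0, 0, 0]
                   for r in toy_ratings]
```

A positive correlation in the early window, before the rating is made, is what would support the interpretation that the effect pertains to caption creation rather than evaluation.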
Controls and amateurs showed no dose response in the regions localized by the contrast
HUM minus MUN. Controls did show a dose response in some of the regions localized by the
conjunction of HUM and MUN (Table 2), but only when their captions’ funniness was measured by
independent raters. The fact that, unlike comedians, controls showed no dose response when
their own funniness ratings were used might be explained by the time course of activation:
whereas in comedians the dose response is sustained throughout the trial, controls display it
only early in the trial, suggesting that information about the quality of the joke vanishes from
representation once controls are called on to rate it.
Figure 3: Dose response of a typical comedian. The same general areas showed greater
activation for Humorous minus Mundane and High minus Low Funniness.
Table 1. ROIs as localized by the contrast of HUM minus MUN (Random Effects Analysis) with
a threshold of p < .01 Bonferroni corrected. With number of Voxels, Talairach coordinates and
dose response for Professionals (P), Amateurs (A) and Controls (C). For self rating (OWN), and
independent ratings of funniness (FUN), cleverness (CLV) and offensiveness (OFF).
Significance levels are: * p<.1, ** p<.05, *** p<.01.
ROI NrOfVoxels X Y Z OWN FUN CLV OFF
V1 4523 -3 -78 -11 P** P*
mPFC 637 -3 49 27
STR 11985 -1 -3 6 P** P*
CER 539 0 -49 -32
lTOJ 5978 -32 -78 -6 P** P*** P*
rTOJ 3395 30 -82 5 P*** P**
Key: r – right; l – left; V1 – primary visual cortex; mPFC – medial prefrontal cortex; CER –
cerebellum; STR – striatum; TOJ - temporo-occipital junction.
Table 2. ROIs as localized by a conjunction of HUM and MUN minus twice NOTH with a
threshold of p < .001 uncorrected. With number of Voxels, Talairach coordinates and dose
response for Professionals (P), Amateurs (A) and Controls (C). For self rating (OWN), and
independent ratings of funniness (FUN), cleverness (CLV) and offensiveness (OFF).
Significance levels are: * p<.1, ** p<.05, *** p<.01.
ROI NrOfVoxels X Y Z OWN FUN CLV OFF
Neg rParietal supramarginal gyrus 10571 51 -43 31 P** P*
Neg rAntTPJ 1758 47 -10 3 P**
Neg lAntFrontal 620 38 44 11 P** P*
Neg rDLPFC 649 32 20 38 P*
Neg PCC 2940 4 -34 36 P*
ACCdSPM 26325 -5 21 42 P* C*
mCER 2337 -2 -48 -29
vmPFC 1773 -2 48 -6 P** P*
Cuneus 3358 -7 -55 13 C** C**
lAMG 380 -40 -14 -23
Neg lParietal supramarginal gyrus 948 -60 -42 33 C*** C*** C**
lSupAntTemporal 4911 -52 -8 -13 C*
lTP 8737 -49 11 -17 P**
lFrontal 21542 -45 21 6 P*
lSupFrontal 16089 -42 0 34 C** C**
lSTR 14276 -17 -7 5 P** C*
rSTR 8273 13 -4 7 P** C* P*
rSupAntTemporal 6234 44 -10 -16 P*
rPostSupParietal 9542 43 -42 -1 P**
rTOJ 32003 29 -81 -14 P** P**
lTOJ 17677 -38 -75 -11 P* P** P* P*
Key: r – right; l – left; m – medial; Neg – negative (i.e. the region was localized by a
significantly greater activity for the NOTH condition); TOJ – temporo-occipital junction; STR –
striatum; Sup – superior; Ant – anterior; TP – temporal pole; AMG – amygdala; TPJ – temporal
parietal junction; CER – cerebellum; ACCdSPM – anterior cingulate cortex/dorsal
supplementary motor cortex; PCC – posterior cingulate cortex; vmPFC – ventromedial prefrontal
cortex.
3.3. Group Differences
We observed a clear function of comedic experience/talent so that HUM minus MUN
activity in the striatum (b = -.031, p < .05, d = .985) and mPFC (b = -.062, p < .005, d = 1.18)
was greater for controls than for professional comedians (with amateurs falling in between). The
reverse was true in the left superior anterior temporal regions (b = .033, p < .05, d = .12; the right
sup. ant. temporal region showed a similar pattern that was not significant; Fig 4). The statistical
measures of these effects were computed with a regression analysis of the average difference
between the HUM and MUN conditions over the full duration of a trial, with age included as a
regressor of no interest, with b the regression coefficient of expertise, p its significance, and d
(Cohen’s d) the effect size of the difference between professional comedians and controls.
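The two statistics reported above can be sketched in a few lines of Python. The coding of expertise as 0 (controls), 1 (amateurs) and 2 (professionals) is an assumption made for illustration, as are the function names and toy values; the original analysis scripts are not reproduced here. The regression is ordinary least squares with an intercept, expertise, and age, and Cohen's d uses the pooled standard deviation.

```python
from math import sqrt
from statistics import mean, variance


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ols_expertise_with_age(y, expertise, age):
    """Return (intercept, b_expertise, b_age) via the normal equations."""
    rows = [[1.0, float(e), float(a)] for e, a in zip(expertise, age)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical data: HUM minus MUN activity generated from known coefficients.
toy_expertise = [0, 0, 1, 1, 2, 2]
toy_age = [20, 30, 25, 35, 28, 40]
toy_y = [1 + 2 * e + 0.5 * a for e, a in zip(toy_expertise, toy_age)]
coeffs = ols_expertise_with_age(toy_y, toy_expertise, toy_age)
```

Because age enters the model alongside expertise, the expertise coefficient reflects the group trend with the age difference between groups held constant.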
The mPFC is perhaps the region most consistently reported in fMRI creativity studies
(e.g. in jazz improvisation, Limb & Braun, 2008; rap improvisation, Liu et al., 2012; story
generation, Howard-Jones et al., 2005; search for anagram solutions, Aziz-Zadeh et al., 2009).
However, its role is likely to be cognitive control over the creative process (Ridderinkhof et al.,
2004). The fact it showed less activation, while anterior temporal regions were more active in
professional comedians, relative to controls, suggests professionals rely more on the spontaneous
flow of associations in the temporal regions in the search for funny ideas, with less mPFC
interference.
Figure 4: Activation for the contrast HUM-MUN for the 3 groups: Professional comedians,
Amateur comedians and Controls. An expertise effect was significant in the striatum, mPFC, left
sup. ant. temporal cortex, but not in the right sup. ant. temporal cortex.
3.4. Humor Appreciation vs. Creation
In Amir et al. (2013) participants viewed line drawings that were uninterpretable until a
caption revealed the referent. The caption was either humorous or mundane. We compared the
activation time course obtained by subtracting activation for the mundane from that of the
humorous condition in this passive humor appreciation experiment, to that obtained by the
subtraction HUM-MUN in the present experiment of active humor creation (Fig 5). In humor
appreciation (i.e., the passive function in Fig. 5), the time-course of anterior temporal activation
peaks early and declines rapidly relative to the active (creation) function, presumably reflecting
that “getting a joke” generally occurs, and is completed, more quickly than the act of
creating a joke. The active humor creation condition resulted in a gradual increase in anterior
temporal activation throughout the trial, suggesting the gradual construction of comedic meaning
via the discovery and linkage of remote associations. Striatum activity peaks and declines with a
nearly identical time-course for both humor appreciation and creation, thus coinciding with
anterior temporal activity in the case of appreciation and preceding it in the case of creation. The
antecedence of striatum activity in the case of humor creation might merely index an expectation
for the reward of creating a funny caption, or it could serve to signal other parts of the brain (e.g.
the anterior temporal region) to activate humorous associations. Evidence for the latter
interpretation arises from the dose response in the striatum: the greater the activation in the
region, the greater the funniness of the subsequently generated caption.
Figure 5: Active humor creation (red) vs. passive humor appreciation (blue).
5. Discussion
Professional and amateur comedians as well as controls underwent fMRI while
generating captions to line drawings of human interactions. When attempting to generate
humorous captions, greater activation was observed in the striatum, mPFC, and, in the case of
comedians only, bilateral anterior temporal cortex.
5.1. Anterior Temporal Cortex and mPFC
There is evidence that the anterior temporal lobes are where humor comprehension is
achieved (e.g. Amir et al., 2013; Vrticka et al., 2013), and activation there during humorous
caption creation suggests that is where associations are activated and linked in the process of
punch-line generation. Comedians show greater activation in the region relative to controls when
generating funny captions, suggesting either a greater wealth of stored associations that are useful
for comedy creation, a greater reliance on the undirected (i.e. not guided by mPFC) flow of
associations – with a greater confidence that such flow will result in a discovery of an
appropriate funny idea, or both.
While in the case of humor appreciation a rapid peak in activation followed by a rapid
decline characterizes the region, in humor creation a gradual increase in activation emerges (Fig
5). This pattern might reflect the necessarily instantaneous event of “getting the joke” in the case
of appreciation, vs. the gradual construction of a joke via the linking of remote associations in
the case of creation. In both humor creation and appreciation, greater activation in the region
reflects greater funniness.
The greater activation in this µ-opioid rich region for professional comedians when creating
comedy might, according to Biederman and Vessel’s (2006) hypothesis, underlie the pleasure
they experience when writing/performing comedy (even in spite of lower striatum activation).
The reverse group effect is observed in mPFC where activity is greater for the less experienced
comedians and controls. mPFC has been implicated in cognitive control (Ridderinkhof et al.,
2004; Passingham et al., 2010). Thus, a possible interpretation is that the mPFC helps to direct
the search through association space taking place in the anterior temporal regions and that such
intervention is needed less for more experienced comedians who, to a greater extent, reap the
fruits of their spontaneous associations.
5.2. Striatum
The striatum is part of the classical reward system and is activated in response to any
pleasurable stimulus, including humor as well as other forms of art (Vessel et al., 2012). Unlike
the case of humor appreciation (e.g. Amir et al., 2013), where striatal activation follows or
coincides with activation of temporal regions, activation in the striatum preceded the peak of
temporal activation in the case of humor creation. This may be in expectation of the reward for
having generated a funny idea, but the early dose response in the region, i.e., that greater striatal
activation precedes funnier captions, suggests it might play a causal role, likely by signaling to
other parts of the brain of a context that is more likely to activate associations with a potential to
be humorous. Common comedy coaches’ advice comes to mind: “have fun and you will be
funnier”. Professional comedians show less activity in the region relative to amateurs and
controls, possibly since their practice makes them more prone to generate humorous associations,
so that less striatal signal may be necessary.
5.3. Conclusion
A small set of regions is involved in humor creation, including bilateral temporal regions,
mPFC and the striatum. In addition to a greater activity in those regions during creation of
humorous relative to mundane captions, the regions exhibit a dose response, i.e. greater activity
in those regions results in funnier captions. Experienced comedians show lower activity in the
striatum and mPFC during comedy creation, but greater activation in the anterior temporal
regions. Over several studies we have found that activation in the anterior temporal cortex, with
its high density of µ-opioid receptors, is experienced as pleasurable. That activation may be the
underlying motivational system driving comedians and other experts to indulge in their passion.
Acknowledgments
We would like to thank neuroscientists Jonas Kaplan and Bosco Tjan, and comedians
Troy Conrad, Dave Reinitz, Shane Mauss, & Greg Wilson for their helpful insights, and JC
Zhuang for his help with the MRI scans. Supported by NSF BCS 04-20794, 05-31177, 06-17699
to I.B.
References
Amir, O., Biederman, I., & Hayworth, K. J. (2011). The neural basis for shape
preferences. Vision research, 51(20), 2198-2206.
Amir, O., Biederman, I., Wang, Z., & Xu, X. (2013). Ha Ha! Versus Aha! A Direct Comparison
of Humor to Nonhumorous Insight for Determining the Neural Correlates of Mirth. Cerebral
Cortex, bht343.
Biederman, I., & Vessel, E. (2006). Perceptual Pleasure and the Brain. American scientist, 94(3),
247-253.
Calvo-Merino, B., Glaser, D. E., Grèzes, J., Passingham, R. E., & Haggard, P. (2005). Action
observation and acquired motor skills: an FMRI study with expert dancers. Cerebral
cortex, 15(8), 1243-1249.
Davidson, R. J. (2012). The Emotional Life of Your Brain: How Its Unique Patterns Affect the
Way You Think, Feel, and Live--and How You Can Change Them. Penguin.
Dietrich, A., & Kanso, R. (2010). A review of EEG, ERP, and neuroimaging studies of creativity
and insight. Psychological bulletin, 136(5), 822.
Greengross, G., Martin, R. A., & Miller, G. F. (2012). Personality traits, intelligence, humor
styles, and humor production ability of professional stand-up comedians compared to college
students. Psychology of Aesthetics, Creativity, and the Arts, 6(1), 74.
Howard-Jones, P. A., Blakemore, S. J., Samuel, E. A., Summers, I. R., &
Claxton, G. (2005). Semantic divergence and creative story generation: An
fMRI investigation. Cognitive Brain Research, 25(1), 240-250.
Kirk, U., Skov, M., Christensen, M. S., & Nygaard, N. (2009). Brain correlates of aesthetic
expertise: a parametric fMRI study. Brain and cognition, 69(2), 306-315.
Lewis, M. E., Mishkin, M., Bragin, E., Brown, R. M., Pert, C. B., & Pert, A. (1981). Opiate
receptor gradients in monkey cerebral cortex: correspondence with sensory processing
hierarchies. Science, 211(4487), 1166-1169.
Limb, C. J., & Braun, A. R. (2008). Neural substrates of spontaneous musical
performance: An fMRI study of jazz improvisation. PLoS One, 3(2), e1679.
Liu, S., Chow, H. M., Xu, Y., Erkkinen, M. G., Swett, K. E., Eagle, M. W., ... & Braun, A. R.
(2012). Neural correlates of lyrical improvisation: an fMRI study of freestyle rap. Scientific
reports, 2.
Maguire, E. A., Gadian, D. G., Johnsrude, I. S., Good, C. D., Ashburner, J., Frackowiak, R. S., &
Frith, C. D. (2000). Navigation-related structural change in the hippocampi of taxi
drivers. Proceedings of the National Academy of Sciences, 97(8), 4398-4403.
Martin, R. A. (2010). The psychology of humor: An integrative approach. Academic Press.
Moran, J. M., Wig, G. S., Adams, R. B., Janata, P., & Kelley, W. M. (2004). Neural correlates of
humor detection and appreciation. Neuroimage, 21(3), 1055-1060.
Passingham, R. E., Bengtsson, S. L., & Lau, H. C. (2010). Medial frontal cortex: from self-
generated action to reflection on one's own performance. Trends in cognitive sciences, 14(1),
16-21.
Ridderinkhof, K. R., Ullsperger, M., Crone, E. A., & Nieuwenhuis, S. (2004). The role of the
medial frontal cortex in cognitive control. Science, 306(5695), 443-447.
Samson, A. C., Zysset, S., & Huber, O. (2008). Cognitive humor processing: different logical
mechanisms in nonverbal cartoons—an fMRI study. Social Neuroscience, 3(2), 125-140.
Vessel, E. A., Starr, G. G., & Rubin, N. (2012). The brain on art: intense aesthetic experience
activates the default mode network. Frontiers in human neuroscience, 6.
Vrticka, P., Black, J. M., & Reiss, A. L. (2013). The neural basis of humour processing. Nature
Reviews Neuroscience, 14(12), 860-868.
Yue, X., Vessel, E. A., & Biederman, I. (2007). The neural basis of scene
preferences. Neuroreport, 18(6), 525-529.
Asset Metadata
Creator: Amir, Ori (author)
Core Title: The neural correlates of creativity and perceptual pleasure: from simple shapes to humor
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Psychology
Publication Date: 07/28/2016
Defense Date: 03/27/2015
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: cortical μ-opioid gradient, creativity, fMRI, humor, medial prefrontal cortex, OAI-PMH Harvest, shape preferences, temporal pole
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Biederman, Irving (committee chair), Baker, Laura A. (committee member), Kaplan, Jonas (committee member), Kellman, Barnet (committee member), Tjan, Bosco S. (committee member)
Creator Email: oamir@usc.edu, oriacadem@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-611226
Unique identifier: UC11300015
Identifier: etd-AmirOri-3733.pdf (filename), usctheses-c3-611226 (legacy record id)
Legacy Identifier: etd-AmirOri-3733.pdf
Dmrecord: 611226
Document Type: Dissertation
Rights: Amir, Ori
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA