Functional Magnetic Resonance Imaging Characterization of Peripheral Form Vision

by Rachel Millin

A Dissertation Presented to the Faculty of the Graduate School of The University of Southern California in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy (Neuroscience)

The University of Southern California
December 2015

Copyright 2015 Rachel Millin

Acknowledgements

The work presented in this thesis, and my completed doctorate, would have been impossible without the help and support of many remarkable people.

I thank my family for their encouragement throughout my PhD. My parents, Barbara and Jerry, encouraged me to pursue a PhD, and provided me with the best preparation possible by teaching me to question and to think carefully outside of academics and science. They demonstrated amazing patience throughout this long process, and were always willing (or convincingly pretended to be willing) to listen, advise, humor, and console. I cannot overstate how lucky I feel to have such wonderful parents. My husband, Rishabh, who survived the PhD experience at the same time, often put aside his own concerns to help me work through problems, scientific or otherwise. Finding such a kind partner was one of the most important accomplishments of my PhD.

I also could not have asked for a better advisor for my PhD. Dr. Bosco Tjan taught me not only about peripheral vision and fMRI, but statistics, signal processing, cognition, grant writing, public speaking…not to mention the job market, diplomacy, and where to get the best food in LA. I have yet to find a subject in which he is not brilliant, but he shows amazing patience with us mere mortals. He is also one of the kindest people I know, and helped me through difficult periods of the PhD. I thank him for his patience and generosity, and for making me laugh when it was most needed.

I would also like to thank the other members of my thesis committee, Dr. Judith Hirsch and Dr. Bartlett Mel.
Judith inspired me with her intelligence, wit, and determination, and helped me to believe that I could succeed with a sharp tongue and sharp elbows. Her guidance has been invaluable. Bartlett’s advice on both science and career helped to shape my thinking on both subjects. I always enjoyed chatting with him, and appreciate his generosity with his time and wisdom.

Lastly, I am grateful for the friends I made during the PhD. In particular, my outstanding labmates, Anirvan, MiYoung, Pinglei, Kilho, Ben, and Helga helped me on numerous occasions through insightful discussions related to our research. They also made the lab a place I was happy to return to each day (not just for the air conditioning). I am so happy to have them as friends.

Thank you all.

Table of Contents

Acknowledgements
List of Figures
List of Tables
Chapter 1: Introduction
Chapter 2: Visual Crowding in V1
    Summary
    Introduction
    Methods
        General Methods
            Participants
            Stimuli
            fMRI preprocessing
            Localization of visual areas
            Data analysis
        Experiment 1
            Participants
            Stimuli
            Data analysis
        Experiment 2
            Participants
            Stimuli
        Experiment 3
            Participants
            Stimuli
        Experiment 4
            Participants
            Stimuli
    Results
    Discussion
Chapter 3: A non-neuronal model of BOLD fMRI in retinotopic visual cortex
    Summary
    Introduction
    Methods
        Annotation conventions
        Model Overview
        Noise modeling and synthesis
        BOLD signal synthesis
        Mapping from visual space to cortical surface
        Subject-specific vs. general model
        Subjects
        Stimuli
        fMRI acquisition
        fMRI preprocessing
        Model assessment
    Results
        Model calibration
        Near generalization
        Far generalization
    Discussion
        Match between simulation and data
        Model assumptions
        Potential applications of the model
Chapter 4: Applications of the BOLD fMRI model for experiment design and interpretation
    Summary
    Introduction
    Methods
        Annotation conventions
        Subjects
        Stimuli
            pRF mapping
            Letter MVPA experiment
        fMRI acquisition
        fMRI preprocessing
        fMRI data simulation
        Data Analysis
            Multivoxel pattern analysis
            Population receptive fields
    Results
        Multi-voxel pattern analysis
        Population receptive fields
    Discussion
        Multi-voxel pattern analysis
            High MVPA performance using simulation
            Implications for using MVPA to study peripheral vision
        Population receptive fields
            Comparison to published studies
            Data vs. simulation pRFs
            Neural and hemodynamic contributions to the pRF measurement
Appendix A: Hemodynamic model
    Model Equations
    Model Parameters
Appendix B: V1 cortical model
    Model Equations
    Cortical Magnification Parameters
References

List of Figures

Figure 1.1: Depiction of cortical magnification, from Motter and Simoni (2003).
Figure 1.2: Crowding demonstrations
Figure 2.1: Stimuli and protocols used in the experiments.
Figure 2.2: Stimuli and hemodynamic response functions for ROIs in Experiment 1, averaged across 14 subjects.
Figure 2.3: Averaged hemodynamic response timecourses for ROIs in Experiment 2.
Figure 2.4: BOLD response to flankers is unaffected by spacing.
Figure 2.5: Behavioral performance in the scanner for Experiment 3, measured in proportion correct.
Figure 2.6: Averaged hemodynamic response timecourses for ROIs in Experiment 3.
Figure 2.7: Behavioral performance in the scanner for Experiment 4, measured in proportion correct.
Figure 2.9: LOC subregions observed in Experiments 2-4.
Figure 2.10: Suppression, not response saturation, explains Experiment 1 results.
Figure 3.1: Model Overview.
Figure 3.2: Noise model.
Figure 3.3: Signal synthesis.
Figure 3.4: Results of model calibration, for individual subject and general models.
Figure 3.5: Results of near generalization, for individual subject and general models.
Figure 3.6: Results of far generalization, for individual subject and general models.
Figure 4.1: Individual subject cortical models.
Figure 4.2: Data analysis scheme for each subject.
Figure 4.3: MVPA results for data (blue) and simulation (red).
Figure 4.4: Eccentricity dependence of pRF size for data and simulation, for two different assumed HRFs.

List of Tables

Table A1: Balloon model parameters.
Table B1: Cortical model parameters.

Chapter 1: Introduction

The healthy visual system allows us to perform a wide variety of tasks almost effortlessly, from recognizing familiar faces and reading, to detecting movement and locating objects. Central and peripheral vision provide complementary functionality, and are well-suited for different tasks. Central vision performs well for tasks that require high-resolution vision, such as finding keys on a cluttered table or reading the newspaper. Peripheral vision excels in the detection of motion and vision in low-light conditions.

The differences that result in the divergence in central and peripheral visual function begin in the eye, but are magnified in the brain. The fovea, the part of the retina responsible for processing the center of gaze, provides information about the visual scene with high resolution to later stages of visual processing. The peripheral retina sends a lower resolution signal with better temporal precision (see Wassle, 2004 for a review). At the input stage of the retina, the cone photoreceptors, which have small receptive fields sensitive to color, occur with high density in the fovea and low density in the periphery. Rod photoreceptors, which have larger receptive fields and high contrast sensitivity but no color sensitivity, are more numerous in the peripheral retina (Osterberg, 1935). The difference in representation of fovea and periphery continues beyond the eye.
In the early stages of cortical processing, the visual field is mapped topographically onto the cortex, but in a distorted manner: information about the peripheral visual field is processed by a relatively small cortical area, while a disproportionate amount of cortex is dedicated to central visual processing (Fig. 1.1). This over-representation of the fovea at the expense of the periphery is known as “cortical magnification”.

Figure 1.1: Depiction of cortical magnification, from Motter and Simoni (2003). Identically sized letter T’s appearing in the visual field (left) are shown projected onto a model of the primary visual cortex that accounts for the cortical magnification function. The amount of cortex representing a T decreases moving from the central to peripheral visual field.

This organization results in a peripheral visual system poorly suited for high-resolution tasks such as face recognition and reading. For people who lose their central vision due to common diseases of the retina, such as age-related macular degeneration, these normally simple daily tasks become challenging. Even people with healthy vision can face problems due to the limitations of peripheral vision: a driver, for example, may be unaware of pedestrians about to step into the street. The limiting factor for many such tasks is not low resolution, but the phenomenon of visual “crowding”: the inability to identify an object or visual feature when surrounded by other objects or features (Korte 1923). Crowding occurs even after compensation for the low resolution in the periphery. The distance at which an object crowds a target object, known as the “critical spacing”, is approximately half the eccentricity of the target object (Bouma, 1970), and is relatively constant across different stimuli and individuals (Strasburger et al., 1991; Levi et al., 2002; Tripathy and Cavanaugh, 2002; Pelli et al., 2004; Louie et al., 2007; Wallace and Tjan, 2011).

Figure 1.2: Crowding demonstrations. A. The letter “r” is easily identifiable whether one fixates at the gray cross or moves the “r” further into the periphery by fixating on the red minus. B. However, if one focuses on the red minus, the “r” becomes difficult to identify when surrounded by other letters. This crowding phenomenon occurs across diverse visual stimuli, from simple Gabor patches, to objects, to features within an object (such as the windows and doors of a house). C. Crowding can occur in everyday situations: the child on the left is difficult to distinguish from the surrounding signs when a driver is focusing on the road ahead. Sources labeled within the figure: Whitney and Levi (2011); Pelli and Tillman (2008).

The cause of crowding is not well understood, but is believed to involve errors in feature integration and segmentation, processes basic to form vision. A better understanding of peripheral visual processing, particularly its deficiencies, is an important step toward developing training regimens and technological aids to improve peripheral vision for patients with central vision loss. Because the same basic mechanisms likely govern central and peripheral visual processing, an improved understanding of peripheral vision also aids in a comprehensive understanding of visual processing.

The study of peripheral visual processing in humans is challenging, as neural activity cannot be measured directly in healthy participants. One of the most common methods for indirectly measuring brain activity associated with a task or stimulus is functional magnetic resonance imaging (fMRI). fMRI has been used to identify a number of phenomena previously observed in electrophysiology experiments, such as surround suppression (Zenger-Landolt and Heeger, 2003; Williams et al., 2003; Pihlaja et al., 2008), contrast-response functions (Boynton et al., 1999; Olman et al., 2004), and attentional modulation of responses (Buracas and Boynton, 2007).
However, the blood-oxygen-level-dependent (BOLD) signal measured in fMRI is related to the neural response through a complex process, by which increased neuronal activity leads to an increase in oxygenated hemoglobin (providing oxygen for neural metabolism), leading to a change in BOLD signal (Thulborn et al., 1982; Ogawa et al., 1990; Kwong et al., 1992). As a result, BOLD signal reflects not only neural activity, but also the other components of this complex signaling process. A substantial contribution to the BOLD signal comes from large draining veins, downstream from the neural population driving the response (Ogawa et al., 1993; Kim et al., 1994). As a result of these factors, BOLD signal is spread out in both space and time. It is also affected by numerous sources of noise (Kruger and Glover, 2002), some of which cause alterations in blood flow that are difficult to distinguish from stimulus-driven changes.

All of these factors pose a problem for fMRI in general, but fMRI studies of peripheral vision are further complicated by cortical magnification. Because of the condensed representation of the peripheral visual field at early stages of visual processing, combined with the low spatial resolution of fMRI of around 3 mm at 3 Tesla (Ugurbil et al., 2013), very few volumetric pixels (known as voxels) in an fMRI measurement will reflect the neural response to a peripheral stimulus. This is particularly true for the type of stimuli that are used to study crowding, because the stimuli must be small enough that their center-to-center distance can be within the critical spacing.

In this thesis, I will describe several studies addressing how fMRI can be used to better understand peripheral vision. In the first study, we used fMRI to identify the earliest visual area involved in crowding. By knowing the first stage of visual processing at which crowding occurs, we can constrain theories on the process that gives rise to crowding.
In this study, we eliminate the problem of localizing the signal of a flanked target letter by measuring changes in BOLD signal within a large region of interest (ROI) that includes both target and flankers. This allowed us to detect a suppression of BOLD signal associated with crowding as early as primary visual cortex (V1).

In the second part of the thesis, I will describe a model we developed to aid in design and interpretation of more complex analyses that do not require averaging within a large ROI, but instead examine the individual responses of multiple voxels. These techniques are relatively new, and their viability in the peripheral representation in early visual cortex is not established. By incorporating the signal and noise characteristics of BOLD measurement described above, our model provides a means for establishing the limits they impose on analyses of BOLD studies of peripheral vision in early visual cortex. The model also provides a non-neuronal baseline to which results can be compared to aid in interpretation of experiment results. Lastly, I demonstrate the application of this model to two commonly used analysis methods that can inform on peripheral visual processing: multi-voxel pattern analysis (MVPA) and population receptive field (pRF) estimation.

Chapter 2: Visual Crowding in V1

SUMMARY

In peripheral vision, objects in clutter are difficult to identify. The exact cause of this “crowding” effect is unclear. To perceive coherent shapes in clutter, the visual system must integrate certain local features across receptive fields while preventing others from being combined. It is believed that this selective feature integration–segmentation process is impaired in peripheral vision, leading to crowding. We used functional magnetic resonance imaging (fMRI) to investigate the neural origin of crowding.
We found that crowding was associated with suppressed fMRI signal as early as V1, regardless of whether attention was directed toward or away from a target stimulus. This suppression in early visual cortex was greatest for stimuli that produced the strongest crowding. In contrast, the pattern of activity was mixed in higher level visual areas, such as the lateral occipital cortex. These results support the view that the deficiency in feature integration and segmentation in peripheral vision is present at the earliest stages of cortical processing.

INTRODUCTION

In peripheral vision, flanking an otherwise identifiable object with other objects or patterns interferes with its identification (Bouma 1970; Anstis 1974; Flom 1991). This crowding phenomenon greatly reduces the utility of peripheral vision for everyday form-vision tasks such as reading and face and object recognition. The cause of crowding is under debate, but is believed to involve failures in the core visual processes of feature integration and segmentation. While behavioral studies have suggested loci of crowding at multiple stages of visual processing (Louie et al. 2007; Whitney and Levi 2011), knowing the earliest stage where crowding occurs can significantly constrain theories of peripheral form vision.

Several recent models of peripheral visual processing propose that peripheral representations in visual cortex capture only local statistics of the visual image (Parkes et al. 2001; Balas et al. 2009; Van den Berg et al. 2010; Freeman and Simoncelli 2011), resulting in the distorted percept associated with crowding. This reductive statistical representation is believed to be a consequence of pooling of image features within large receptive fields. Such theories typically suggest an origin of crowding after V1. In contrast, Neri and Levi (2006) as well as Pelli and Tillman (2008) suggest that imprecise feature binding in V1 is a component of crowding.
Nandy and Tjan (2012) also propose V1 as the earliest source of crowding, due to inappropriate feature integration via horizontal connections.

Psychophysical experiments provide ambiguous evidence for crowding at the level of V1. The finding that crowding is reduced by placing target and flanker on opposite sides of the vertical meridian but not the horizontal meridian (Liu et al. 2009) suggests that the low-level origin of crowding can be either V1 or hV4, since both have a contiguous representation of the horizontal meridian. Several experiments have demonstrated a dependence of crowding on the perceived rather than physical stimulus (Dakin et al. 2011; Maus et al. 2011; Wallis and Bex 2011), with the interpretation that crowding occurs beyond V1 (Dakin et al. 2011). However, crowding affects orientation-specific adaptation even when target and flankers are removed from awareness (Ho and Cheung 2011), suggesting an earlier origin.

While physiological effects of crowding have been reported for V2 and beyond (Motter 2006; Bi et al. 2009; Freeman et al. 2011), physiological evidence for a V1 locus of crowding is currently lacking. A single recent study found a correlation between blood oxygenation level–dependent (BOLD) adaptation in V1 and crowding (Anderson et al. 2012). However, this effect was measured while subjects detected changes of a crowded target. Hence, the effect may be percept-driven as opposed to input-driven. Whether there is a bottom-up, input-driven origin of crowding in V1 remains undetermined.

To investigate the neural origin of crowding, we used functional magnetic resonance imaging (fMRI) to measure BOLD signals in visual cortex resulting from crowded and noncrowded letter stimuli presented in the periphery.
To overcome the difficulty of separating signals for small, closely spaced peripheral target and flankers (due to the small cortical magnification factor relative to imaging resolution), we designed the experiments and the associated analyses to use regions of interest that encompassed both the target and flankers. We examined the effect of crowding on BOLD signal for attended and unattended stimuli in separate experiments. In the first, subjects’ attention was directed away from the letter stimuli with an unrelated task at fixation, to emphasize the automatic aspect of the crowding mechanism. In the second, they identified the target letter in crowded and noncrowded conditions, a common paradigm for assessing crowding in behavioral studies. Regardless of whether attention was directed to the letter stimuli, we observed a suppression of BOLD signal with crowding in early visual areas, including V1. A third experiment showed that this suppression was greatest for stimuli that induced the strongest crowding, and a fourth ruled out task difficulty and response accuracy as confounds. While crowding may involve multiple levels of visual processing, these results, taken together, argue for a significant role of the primary visual cortex in crowding.

METHODS

General Methods

Participants

All subjects were University of Southern California students with normal or corrected-to-normal vision. Subjects had definable retinotopic and higher visual processing areas using the techniques described below. The Institutional Review Board of the university approved the experimental protocol, and each subject provided written informed consent.

Stimuli

All stimuli were generated on a Macintosh computer, using MATLAB and the PsychToolbox (Brainard 1997; Pelli 1997).
Stimuli consisting of one, two, or three uppercase Sloan letters were presented with the center letter at 5° eccentricity in the lower right or upper left quadrant of the subject’s visual field, midway between the horizontal and vertical meridians. The target letter was presented alone or flanked by a letter on each side. Targets and flankers were randomly chosen from four letters: K, N, R, or S in Sloan font for Macintosh (provided by Denis Pelli). Target and flanker letters subtended 0.75° of visual angle and were presented at 100% contrast. Four target-flanker center-to-center separations were used: 0.9° (crowded condition), 1.5°, 2.25° (non-crowded condition), and infinity (target presented alone).

Center-to-center spacing between target and flankers was manipulated in a manner that did not change the eccentricity of the flankers and at the same time avoided any overlapping between target and flankers at the smallest letter spacing (Figure 2.1A). At a center-to-center spacing of one letter height (0.75°, a condition not tested), the target and flankers were arranged horizontally and thus in a configuration midway between radial and tangential relative to fixation. Larger letter spacings were generated by moving each flanker from this horizontal “home” position on an arc centered at fixation. Experiments 1, 2, and 4 tested only separations 0.9° (crowded) and 2.25° (non-crowded), while Experiment 3 tested all four. We chose these stimuli such that identification was comparable in the 2.25° condition to that for an isolated letter, and was strongly impaired for the crowded condition (Figure 2.5).

Stimuli were displayed on a 32 x 24 cm rear projection screen mounted perpendicularly to the toe-head axis in the bore of the magnet, directly above the subject’s head. The viewing distance was 75 cm. The background luminance of the display was set at 156 cd/m², and the maximum luminance at 100% contrast was 312 cd/m².
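The flanker placement described above can be illustrated with a short geometric sketch. This is not the stimulus code used in the experiments (which was written in MATLAB with the PsychToolbox); it is a minimal Python reconstruction under stated assumptions: fixation is at the origin, the lower right quadrant corresponds to a polar angle of -45°, and each flanker moves along its arc on the same side of the target as its horizontal “home” position. The function name and its parameters are hypothetical.

```python
import math


def flanker_positions(spacing_deg, ecc_deg=5.0, polar_deg=-45.0, letter_deg=0.75):
    """Return the (x, y) centers of the two flankers, in degrees of visual angle.

    Assumptions (not taken from the original stimulus code): fixation is at
    (0, 0), and each flanker stays on the same side of the target as its
    horizontal "home" position.
    """
    t = math.radians(polar_deg)
    tx, ty = ecc_deg * math.cos(t), ecc_deg * math.sin(t)  # target center
    positions = []
    for side in (-1.0, 1.0):  # left flanker, then right flanker
        # Home position: one letter height to the side of the target.
        hx, hy = tx + side * letter_deg, ty
        r = math.hypot(hx, hy)  # flanker eccentricity, held constant
        # Law of cosines: angle at fixation between the target and a point at
        # eccentricity r that lies spacing_deg away from the target.
        cos_delta = (ecc_deg ** 2 + r ** 2 - spacing_deg ** 2) / (2 * ecc_deg * r)
        delta = math.acos(cos_delta)
        # Rotate away from the target in the direction of the home position.
        home_angle = math.atan2(hy, hx)
        sign = 1.0 if math.sin(home_angle - t) >= 0 else -1.0
        a = t + sign * delta
        positions.append((r * math.cos(a), r * math.sin(a)))
    return positions


for s in (0.9, 1.5, 2.25):
    print(s, flanker_positions(s))
```

With `spacing_deg` equal to the letter height (0.75), the sketch returns the horizontal home positions; larger spacings slide each flanker along its own arc, so flanker eccentricity does not depend on spacing. The target-alone condition (infinite spacing) has no flankers and is not represented.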
Figure 2.1: Stimuli and protocols used in the experiments. A. Configuration of letter stimuli. The target always appeared in the same position at 5º eccentricity from fixation, midway between the two flankers. Spacing between the target and flankers was measured center-to-center in degrees. Flanker position varied as a function of condition: single (no flankers), 0.9º, 1.5º, and 2.25º (1.2, 2, and 3 times the letter height, respectively). Each flanker had its own arc, centered at fixation, along which it was placed. This arrangement prevented the flankers from overlapping with the target at small target-flanker spacing while keeping the eccentricity of each flanker constant, independent of target-flanker spacing. B. Example of block design used in Experiment 1, with each color representing a condition (gray is rest). The sequence of blocks was counterbalanced across scans, so that each block type was preceded equally often by every block type (Kourtzi and Kanwisher 2000, 2001; Tjan et al. 2006). C. Example of rapid event-related design used in Experiments 2, 3 and 4. Trial order was counterbalanced for a history of two preceding trials, with a total of 127 trials in each scan, and 6 scans in a session.
[Figure 2.1 labels: Upper Left, Lower Right, Letter spacing; block duration 16 s (B); trial duration 3 s (C).]

Subjects practiced identifying the target letter outside of the scanner for the equivalent duration of one scan prior to the MRI session.

fMRI acquisition
All scans were performed using a 3 Tesla whole-body magnet (Siemens MAGNETOM Trio) at the USC Dana and David Dornsife Cognitive Neuroscience Imaging Center at the University of Southern California with a single-channel send-receive circular polarization (CP) coil for Experiment 3 and a 12-channel matrix coil running a simulated CP mode for Experiments 1, 2, and 4 (a major scanner upgrade occurred mid-course during the study; Experiment 3 was the first experiment). Functional scans were acquired using a gradient-echo echo-planar imaging (EPI) sequence (TR = 1000 ms, TE = 30 ms, flip angle = 65º) with an isotropic voxel resolution of 3 mm. Fourteen oblique slices oriented perpendicular to the calcarine sulcus were acquired to ensure that all early visual areas would be captured. Anatomical scans were acquired using a Magnetization Prepared Rapid Acquisition Gradient Echo (MPRAGE) sequence (TI = 1100 ms, TR = 2070 ms, TE = 4.14 ms, flip angle = 12º) with a resolution of 1 x 1 x 1.2 mm.

fMRI preprocessing
BrainVoyager was used to preprocess the imaging data. Preprocessing of functional images included motion correction, slice-timing correction, linear-trend removal, and high-pass temporal filtering. No spatial smoothing was applied. Intra-session functional scans were aligned to one another and co-registered with the intra-session anatomical scan. Inter-session co-registration was achieved by co-registering the anatomical scans from each session. Aligned data were mapped to the flattened representation of the subject’s cortical surface. Data from each scan was normalized by z-transform prior to analysis. Specifically, the timecourse of each voxel in each run was normalized by first subtracting the mean voxel signal of the entire run, then dividing by the standard deviation of the voxel signal of the same run. This normalization removes differences in signal baseline and fluctuation amplitude for each voxel across runs in order to improve model estimation. (Each run contained an equal number of trials of each condition, such that this normalization does not bias any particular condition.)

Localization of visual areas
We used a standard paradigm to localize LOC (Kourtzi and Kanwisher 2000, 2001). Retinotopic visual areas were defined using rotating wedge and expanding ring stimuli (Engel et al. 1994, 1997).
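The per-run z-transform described in the preprocessing paragraph above can be written in a few lines. This is a minimal sketch rather than the BrainVoyager or in-house MATLAB implementation. The function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not specify which was used.

```python
import statistics

def zscore_run(timecourse):
    """Normalize one voxel's timecourse within a single run.

    Subtracts the run mean, then divides by the run standard deviation.
    Assumption: population SD (pstdev); the source does not say which SD.
    """
    mean = statistics.fmean(timecourse)
    sd = statistics.pstdev(timecourse)
    if sd == 0:
        raise ValueError("constant timecourse cannot be z-transformed")
    return [(x - mean) / sd for x in timecourse]

if __name__ == "__main__":
    # Two runs of the same voxel with different baseline and amplitude
    # (made-up numbers) map onto the same normalized timecourse.
    run_a = [100.0, 102.0, 98.0, 101.0, 99.0]
    run_b = [3.0 * x + 50.0 for x in run_a]
    print(zscore_run(run_a))
    print(zscore_run(run_b))
```

Because each run is normalized separately, differences in baseline and fluctuation amplitude between runs drop out before the runs are combined for model estimation.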
Data analysis
BrainVoyager QX, BVQXToolbox (Brain Innovation, Maastricht, The Netherlands) and in-house MATLAB code were used for anatomical and functional data analysis, and to extract BOLD timecourses from regions of interest (ROIs) of each individual subject. Unless otherwise stated, statistics were performed on the mean of the z-transformed signal within ROIs for V1, V2, V3, and hV4. In analyses for Experiments 2 through 4, we only included the dorsal representations of V1, V2, and V3 in the left hemisphere because stimuli were presented in the lower right visual field.

For Experiments 1 and 2, ROIs for visual areas V1, V2, and V3 were defined by finding the corresponding subregion in each visual area that mapped to the retinal location of the stimulus, based on the subject’s retinotopic map. These retinotopically-defined ROIs represented a contiguous region in visual space that included the target and all possible flanker positions. Using this all-encompassing ROI allowed us to capture more comprehensively the effects of target-flanker interaction. Retinotopic polar angle and eccentricity maps were used to identify voxels corresponding to the stimulus (flankers and target) location. To ensure inclusion of the target and the flankers, ROIs were defined to include voxels that responded most strongly to eccentricities between 3 and 7 degrees in the quadrant in which the stimulus was presented. Because of difficulties in mapping within hV4 due to distortions caused by vasculature (Winawer et al. 2010), the hV4 ROI was determined by stimulus-evoked activation.

For Experiments 3 and 4, additional reference scans were used to help define regions of visual cortex that responded selectively to the area of visual space where the target and flanking letters appeared in the main experiment. Stimuli were single letter presentations, displayed at any one of the seven possible locations where a letter could appear in the main Experiment 3.
Reference scans consisted of 24-second blocks of 1.5-second trials, blocked by letter position, and separate blocks for a fixation-only condition. ROIs were defined by restricting the localized retinotopic and LOC regions to subregions that were significantly activated by the stimuli, which included any of the seven letter positions (from reference scans) and all possible target-flanker configurations (from the main experiment).

To extract the mean timecourse for each ROI, preprocessed fMRI data averaged across all voxels within an ROI was deconvolved against an indicator function formed by placing a Dirac delta function at each timepoint to be estimated, with separate indicator functions for each event type. In Experiment 1, statistics were performed on the area under the resulting curve from 6 seconds to 16 seconds post stimulus. For all other experiments, the peak of the timecourse was estimated from fitting with a difference-of-gamma function (Boynton and Finney 2003), and entered into statistical tests.

SPSS and MATLAB code were used for group- and ROI-level statistical analyses. Paired t-tests were used to test for within-subject differences across stimulus conditions, and repeated measures ANOVA was used to identify interactions between stimulus manipulations. The mean time-course across subjects for each ROI was fit separately for visualization. Standard error was calculated as the within-subjects error (Loftus and Masson 1994) to facilitate the visualization of within-subjects differences.

We also examined the response in the lateral occipital complex, LOC, when subjects attended to the letter stimuli (Experiments 2 through 4). Using two different methods, we tested for the presence of subregions within LOC that were affected differentially by crowding.
For each individual subject for whom LOC mapping data was available, we created a sign-of-difference map for the “crowded” minus “noncrowded” condition with the activity restricted to the ROI for LOC, defined by the localization methods described in the main text. We applied a general linear model with a separate predictor for each event type and an assumed HRF. We contrasted the response to the “crowded” (0.9º) with the “noncrowded” (2.25º) target-present conditions. The sign-of-difference maps we created for LOC generally revealed distinct sub-regions: superior-posterior regions where the response to the crowded stimulus was less than that to the noncrowded stimulus, and inferior-anterior regions that showed the opposite pattern.

In order to verify that the separate regions identified for LOC could be considered functional sub-regions and did not arise by chance, we calculated the probability that a random preference for crowded and noncrowded configurations, segregated into contiguous areas due to the inherent smoothness of fMRI data, could have produced the same number of sub-regions or fewer. Intrinsic spatial smoothness of unsmoothed fMRI data was estimated separately for each dataset, using the methods outlined by Worsley et al. and Kiebel et al. and implemented on the cortical sheet using code adapted from smoothest3d in BrainVoyager BVQX Toolbox (Worsley et al. 1996; Kiebel et al. 1999). For each subject, pixels were randomly assigned preference for the crowded or noncrowded stimulus (with probability 0.5), and randomly placed in the subject’s LOC ROI with no sign map. The density of pixels was determined by data smoothness. A Voronoi diagram was then used to identify a region of influence around each pixel, and the pixels belonging to this region were assigned the same map value as the defining pixel (Figure 2.8). This was repeated 5,000 times to obtain a histogram of the number of contiguous regions formed with random seeding.
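The random-seeding procedure above can be illustrated with a simplified simulation. This is a sketch under strong assumptions, not the analysis code: the ROI is replaced by a rectangular pixel grid instead of a patch of cortical surface, the number of seed pixels is passed in directly as the hypothetical parameter `n_seeds` rather than derived from an estimate of data smoothness, and contiguity is defined by 4-connectivity. All function names are invented for the example.

```python
import random

def random_sign_map(height, width, n_seeds, rng):
    """Seed random points with a random sign; fill the grid by nearest seed."""
    seeds = [(rng.uniform(0, height), rng.uniform(0, width), rng.choice((-1, 1)))
             for _ in range(n_seeds)]
    grid = []
    for i in range(height):
        row = []
        for j in range(width):
            # Voronoi assignment: each pixel takes the sign of its closest seed.
            nearest = min(seeds, key=lambda s: (s[0] - i) ** 2 + (s[1] - j) ** 2)
            row.append(nearest[2])
        grid.append(row)
    return grid

def count_regions(grid):
    """Count contiguous same-sign regions (4-connectivity, assumed)."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    n_regions = 0
    for i in range(h):
        for j in range(w):
            if seen[i][j]:
                continue
            n_regions += 1
            seen[i][j] = True
            stack = [(i, j)]
            while stack:  # flood fill from this pixel
                a, b = stack.pop()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    c, d = a + da, b + db
                    if (0 <= c < h and 0 <= d < w and not seen[c][d]
                            and grid[c][d] == grid[a][b]):
                        seen[c][d] = True
                        stack.append((c, d))
    return n_regions

def null_p_value(observed, height, width, n_seeds, n_sims=5000, seed=0):
    """Fraction of random maps with `observed` or fewer contiguous regions."""
    rng = random.Random(seed)
    counts = [count_regions(random_sign_map(height, width, n_seeds, rng))
              for _ in range(n_sims)]
    return sum(c <= observed for c in counts) / n_sims, counts

if __name__ == "__main__":
    p, counts = null_p_value(observed=3, height=20, width=20, n_seeds=30, n_sims=200)
    print(p, min(counts), max(counts))
```

The returned fraction plays the role of the tail probability of the simulated histogram: a small value means that random maps rarely contain as few contiguous regions as the observed map.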
We used this null distribution to estimate the probability of observing the actual number of regions in the sign-of-difference map in LOC if there were no systematic effect of crowding.

A group-level LOC analysis was also performed in Talairach coordinates using spatially smoothed data (FWHM = 6 mm) from subjects who participated in any of Experiments 2 - 4. A common ROI for LOC was defined based on the LOC localizer runs of the subjects for whom LOC mapping data was available. This was done using a conventional voxel-wise mixed effects GLM, thresholded at a conservative false discovery rate (FDR) of 0.02. Within this common ROI, a contrast map for crowded (0.9º target-flanker separation) and non-crowded (2.25º) conditions was determined using voxel-wise mixed effects GLM on spatially smoothed (FWHM = 6 mm) data from Experiments 2 - 4. This map was thresholded to achieve an FDR of 0.05 to identify the subregions of LOC where the response to the crowded condition was significantly different from that to the non-crowded condition.

Experiment 1

Participants
Fifteen subjects (four female) participated. (One male subject was excluded from the analysis because the subject reported difficulty with the central task.)

Stimuli
Stimuli were either 3 letters (target and flankers) or 2 letters (flankers alone), in crowded and non-crowded configurations (Figure 2.2). These four experimental conditions were displayed in a block design, with alternating 16-second blocks where stimuli were presented in the upper left and lower right visual fields, respectively (Figure 2.1B). During the first half of each 2-second trial, letter stimuli were counter-flickered twice for two cycles at 20 Hz in the periphery while a small colored square blinked twice at fixation, and subjects were asked to press the button that corresponded to the color of the square.
For six subjects, the central stimulus was instead a square that changed colors four times, and subjects were asked to respond if the last color was the same as either of the first two.

Data analysis
Both left and right hemispheres were used in the analysis, with active blocks defined as blocks where stimuli were presented in the contralateral visual field. Timecourses for the target-absent conditions were subtracted from the target-present curves to obtain a time-course representing the effect of the target letter. For visualization, data were averaged across hemispheres and subjects.

To look for evidence of voxel-wise suppression of BOLD response with crowding, we estimated the responses of each unsmoothed voxel within the ROIs in V1, V2, and V3 to each condition of Experiment 1. To handle the relatively low signal-to-noise level for unsmoothed voxels, we modeled the response with a weighted sum of two terms: an impulse function convolved with an HRF to capture the initial transient response to the stimulus, and a step function of block duration convolved with an HRF to represent the sustained response. For each voxel in each condition, we estimated the response of the voxel as the sum of the fitted model response over the stimulus block (from 6 to 16 seconds after stimulus onset, over 11 time points). We quantified suppression in the crowded configuration by calculating the sum of differences in the responses between target-absent and target-present conditions for each ROI of each subject. To determine significance, we compared this measure for each ROI to a null distribution generated by randomly reassigning target-present and target-absent labels to each single-voxel response with a probability of 0.5, and calculating the resulting sum of response differences over the stimulus block. This generation procedure was repeated 5000 times to obtain a null distribution that we used to compute the p-values.
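The label-reassignment test described above amounts to randomly swapping the target-present and target-absent labels within each voxel, which flips the sign of that voxel's difference. A minimal sketch follows. It is not the in-house MATLAB code: the function name, the sign convention (target-absent minus target-present), the one-sided comparison, and the add-one correction in the p-value are assumptions made for illustration.

```python
import random

def label_swap_test(present, absent, n_perm=5000, seed=0):
    """Permutation test on the summed per-voxel response difference.

    `present` and `absent` hold one response estimate per voxel.
    Assumed convention: statistic = sum(absent - present), so a large
    positive value indicates suppression when the target is added.
    """
    diffs = [a - p for p, a in zip(present, absent)]
    observed = sum(diffs)
    rng = random.Random(seed)
    n_extreme = 0
    for _ in range(n_perm):
        # Swapping the two labels of a voxel (probability 0.5) negates its difference.
        null_stat = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if null_stat >= observed:
            n_extreme += 1
    # Add-one correction (an assumption) keeps the estimate above zero.
    p_value = (1 + n_extreme) / (n_perm + 1)
    return observed, p_value

if __name__ == "__main__":
    # Made-up responses for 20 voxels, for illustration only.
    present = [0.1 * i for i in range(20)]
    absent = [x + 0.5 for x in present]
    print(label_swap_test(present, absent))
```

With 5000 reassignments, the smallest p-value this sketch can report is 1/5001, which sets the resolution of the test.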
Experiment 2

Participants
Fifteen subjects participated in the study (three female).

Stimuli
Stimuli corresponding to the four experimental conditions used in Experiment 1 were presented in the lower right visual quadrant. In order to minimize condition- or task-difficulty-dependent fluctuations in attention, the stimulus conditions plus an unannounced rest condition were displayed in a fast event-related design. Subjects were instructed to respond with the identity of the center letter when three letters were present and not to respond when the center letter was missing. Behavioral data (accuracy and response time) were collected while subjects were performing the letter identification task inside the scanner. Each trial lasted three seconds (Figure 2.1C). The stimulus was presented during the first 100 ms of each trial, and subjects had the remainder of the trial to respond.

Experiment 3

Participants
Six subjects participated in the study (one female). The data for one male subject was excluded due to poor data quality.

Stimuli
There were four experimental conditions with respect to the target-flanker separation: 0.9º, 1.5º, 2.25º, and target alone. Behavioral data (accuracy and response time) were collected while subjects were performing the letter identification task inside the scanner, in order to make direct comparisons between behavioral and BOLD response. The event-timing protocol was identical to that for Experiment 2.

Experiment 4

Participants
Six subjects participated (one female), three of whom also participated in Experiment 3. The data for one male subject who also participated in Experiment 3 was excluded due to poor data quality.

Stimuli
Experiment 4 used the crowded (0.9º target-flanker separation) and non-crowded (2.25º) triplets from the previous experiments. In addition, two new conditions were generated by dividing the center letter in the non-crowded condition into 36 tiles and randomly rearranging either 50% (moderately scrambled) or 90% (severely scrambled) of the tiles. The amount of scrambling was chosen such that the subjects' performance in identifying the highly scrambled target (proportion correct) was approximately equal to that for identifying the crowded target (Figure 2.7).

RESULTS
To determine the effect of crowding on processing in visual areas, we measured the BOLD response to crowded and non-crowded configurations with target and flankers or flankers alone within V1, V2, V3, and hV4. By design, the ROI within each of these visual areas encompassed both target and flankers. In Experiment 1, we directed the subjects’ attention to a demanding fixation task and away from the target letter and flankers. We found that the response to a non-crowded target and flankers was greater than that for the flankers alone: adding the target led to a positive change in BOLD signal in areas V1 [t(13)=4.41, p<0.001], V2 [t(13)=5.89, p<0.001], V3 [t(13)=4.02, p<0.005], and hV4 [t(13)=2.46, p<0.05] (Figure 2.2B&D). In contrast, the response to a crowded target and flankers did not differ significantly from that to the flankers alone (Figure 2.2C&D) in any of these areas (p’s>0.46).
This interaction between target presence and crowding reached statistical significance in V1 [F(1,13)=5.25, p<0.05] and V2 [F(1,13)=7.84, p<0.05]. While the target added the same amount of contrast energy to the stimulus in each condition, the addition of the target to the non-crowded configuration led to a reliably larger signal change than to the crowded configuration in V1 and V2. These results show that the addition of the target in the crowded configuration led to mutual suppression of the signals for the target and flankers, particularly in V1 and V2. The central task was challenging, and mean performance was nearly identical (ranging from 80% to 81%) for all conditions, eliminating the possibility that differential attention to the task versus letter stimuli across conditions was responsible for this outcome.

Figure 2.2: Stimuli and hemodynamic response functions for ROIs in Experiment 1, averaged across 14 subjects.
[Figure 2.2 panels: A, stimuli (non-crowded, crowded; target present, target absent); B, non-crowded; C, crowded; D, target present minus target absent; columns V1, V2, V3, hV4; axes: Time (s) vs. Δ BOLD (beta value).]
A. Stimulus configurations, from upper left to lower right: non-crowded target and flankers, crowded target and flankers, non-crowded configuration with flankers only, crowded configuration with flankers only. Light gray region illustrates the projection of the retinotopically-defined regions of interest to visual space. B and C show the BOLD timecourses measured from four visual cortical areas (V1, V2, V3, and hV4) over the stimulus block for non-crowded and crowded configurations, respectively, with the target present (blue and red curves) and without (cyan and magenta curves). The gray bar in the first panel of B indicates the stimulus duration. Subtractions of the target-absent from target-present responses are shown in D.
Shaded regions in all plots represent the within-subjects standard error (Loftus and Masson, 1994). Saturated color marks the time-points used in analyses. Addition of the target led to a significant increase in BOLD response in the non-crowded configuration, but not the crowded configuration. This interaction of target presence with crowding was observed as early as V1.

Results from Experiment 1 show that the mutual suppression between target and flankers in the crowded condition occurs when attention is directed away from the target, suggesting that crowding occurs automatically, regardless of the behavioral relevance of a stimulus. Nevertheless, crowding persists when attention is directed to the target. We therefore expect to observe crowding-induced signal suppression with attention. To test this, we examined the effect of crowding on BOLD response using the same stimuli as in Experiment 1, but where the subjects’ task was to identify the target (center) letter (Experiment 2). Accuracy measured inside the scanner for identifying the target letter was significantly better in the non-crowded condition (97% in the non-crowded condition vs. 59% in the crowded condition; t(14) = 10.1; p<0.001), indicating that the crowding manipulation was effective. We found that the suppression in BOLD signal with crowding observed in Experiment 1 persisted when attention was directed to the letter stimuli. The addition of the target letter led to a significant increase in peak BOLD response in the non-crowded condition in V1 [t(14) = 5.18, p<0.001], V2 [t(14) = 2.59, p<0.05], and hV4 [t(14) = 3.75, p<0.005], but not the crowded condition (p’s>0.10; Figure 2.3). In V3, target presence did not lead to any measurable increase in response in the non-crowded condition (p=0.41), but in the crowded condition, it led to a significant decrease in response [t(14) = -2.72, p<0.05]. The V3 result is consistent with those for other visual areas.
This is because the magnitude of the mutual suppression between the target and flankers relative to any signal gain caused by the target is not known. What is relevant for the current study is whether mutual suppression between target and flankers is stronger in the crowded condition than in the non-crowded condition. The interaction between crowding and target presence was statistically significant in each visual area tested: V1 [F(1,13) = 7.69, p<0.05], V2 [F(1,13) = 11.02, p<0.01], V3 [F(1,13) = 4.98, p<0.05], and hV4 [F(1,13) = 6.35, p<0.05]. Despite the differences in experiment design and attentional state, Experiments 1 and 2 yielded qualitatively similar results, particularly in V1 and V2.

Figure 2.3: Averaged hemodynamic response timecourses for ROIs in Experiment 2. ROIs were defined similarly to those in Experiment 1 (Fig. 1A). Panels A and B show the time-course of the BOLD signal for non-crowded and crowded configurations, respectively, with the target present (blue and red curves) and without (cyan and magenta curves) averaged across 15 subjects. Subtractions of the target-absent from target-present responses are shown in C. Shaded regions in all plots represent the within-subjects standard error. Compared to the non-crowded condition (black curves), adding the target in the crowded condition (green curves) led to a smaller increase (or even a decrease) in BOLD response. This interaction between target presence and crowding was significant in all ROIs, including V1.
[Figure 2.3 panels: A, non-crowded; B, crowded; C, target present minus target absent; columns V1, V2, V3, hV4; axes: Time (s) vs. Δ BOLD (beta value).]

A key property of crowding is that it decreases with increasing distance between target and flankers. We thus expected the BOLD signal suppression we observed to depend on target-flanker spacing. To test this, we again measured BOLD signal while subjects identified a target letter, and varied the spacing between target and flankers (Experiment 3). Letters were displayed at a center-to-center spacing of 0.9º (strongly crowded), 1.5º (weakly crowded), 2.25º (non-crowded), or “infinity” (single letter) (Figure 2.3A). We found in Experiment 2 that the BOLD response within the all-inclusive ROI to the flankers-alone stimuli was not affected by flanker-flanker separation (p’s>0.10, Figure 2.4), which justifies the omission of the target-absent conditions.

Figure 2.4: BOLD response to flankers is unaffected by spacing. Panel A shows the time-course of the BOLD signal for noncrowded (cyan) and crowded (magenta) conditions, when only the flankers were displayed, averaged across the 15 subjects from Experiment 2. Subtraction of the crowded from noncrowded configuration responses is shown in Panel B. Shaded regions in all plots represent the within-subjects standard error (Loftus and Masson, 1994). There was no significant difference in response between the two flanker-only conditions in any of the ROIs (p’s>0.10).
[Figure 2.4 panels: A, target absent (crowded, non-crowded); B, non-crowded minus crowded; columns V1, V2, V3, hV4; axes: Time (s) vs. Δ BOLD (beta value).]

Averaged across five subjects, the accuracy for identifying the target letter was 63±3% when the target-flanker separation was 0.9º; it improved to 94±2% at a separation of 2.25º, which was not different from the single-letter condition (Figure 2.5). Response time was shorter for conditions with higher accuracy (1.3 s for target-flanker separation of 0.9º, 0.95 s for the single letter); hence, there was no speed-accuracy tradeoff.
Figure 2.5: Behavioral performance in the scanner for Experiment 3, measured in proportion correct. Individual subject data is shown, with error bars representing the standard error across experiment blocks. Letter identification accuracy decreased with decreasing letter spacing. Crowded (0.9º separation) and noncrowded (2.25º separation) conditions led to significantly different letter-identification accuracy [t(4)=7.40, p<.01] and response time [t(4)=3.84, p<.05]. Subjects were less accurate and slower in the crowded condition. The fMRI results for this experiment for visual areas V1-hV4 are discussed in the main text. In cortical areas where crowding was associated with a reduced BOLD amplitude, such a reduction was not due to subjects “giving up” on the crowded condition: response time in the crowded condition was significantly longer.
[Figure 2.5 axes: Condition (0.90°, 1.50°, 2.25°, Single) vs. Prop. Correct; legend: subjects KC, ASN, MN, SD, BB.]

We again saw a clear separation between the timecourses corresponding to the crowded and non-crowded conditions in areas V1, V2, V3, and hV4, where the crowded condition was associated with lower peak BOLD amplitude (Figure 2.3B). Repeated measures ANOVA confirmed a main within-subjects effect of crowding (0.9º vs. 2.25º) on BOLD amplitude [F(1,4) = 88.1, p<.001] but found no significant interaction between crowding and visual area. Consistent with Experiments 1 and 2, signal for the crowded condition was significantly lower than that for the non-crowded condition in V1 [t(4) = 2.79, p<0.05], V2 [t(4) = 5.85, p<0.01], V3 [t(4) = 3.37, p<0.05] and hV4 [t(4) = 5.81, p<0.005]. The relationship between BOLD amplitude and target-flanker separation was monotonic and systematic. BOLD amplitude of the weakly crowded (1.5º) condition fell between those of the crowded (0.9º) and non-crowded (2.25º) conditions (Figure 2.6B), consistent with behavioral measures.
As expected, BOLD amplitudes for the single-letter condition were lower than those for the non-crowded condition [F(1,4) = 22.49, p<0.01] because the ROIs included both the target and flanker locations. However, there was no significant difference between the amplitudes for the single-letter condition and the crowded condition [F(1,4) = 1.7, p = 0.26], despite the crowded stimulus having three times the total contrast energy.

Figure 2.6: Averaged hemodynamic response timecourses for ROIs in Experiment 3. A. Stimulus configurations, from upper left to lower right: single letter, target and flankers with center-to-center spacing of 0.9º (crowded), center-to-center spacing of 1.5º (weakly crowded), and center-to-center spacing of 2.25º (non-crowded). Light gray region illustrates the projection of the retinotopically-defined regions of interest to visual space. B. The time-course of the BOLD signal (data symbols) for each condition averaged across five subjects was fitted with a difference-of-gamma function (solid lines). The estimated peak response amplitudes are plotted in the insets. Error bars in bar plots represent the within-subjects standard error. The peak response to the crowded condition (red line/bar) was lower than that to the non-crowded condition (blue line/bar) in all four ROIs.
[Figure 2.6 panels: A, stimuli; B, timecourses for V1, V2, V3, hV4; conditions Single, 0.9°, 1.5°, 2.25°; axes: Time (s) vs. Δ BOLD (beta value).]

It could be suggested that subjects might “give up” in the crowded condition, which was more difficult when the task was letter identification (Experiments 2 and 3), resulting in less attention to stimuli and less attentional enhancement of the BOLD signal (Ress et al. 2000). This was not the case.
We performed a control experiment (Experiment 4) that included the crowded and non-crowded conditions, plus two new conditions that were identical to the non-crowded condition but with the target moderately or severely scrambled (Figure 2.6A). The mean accuracy for target identification was significantly different across all conditions (p<0.05) except between the crowded and severely scrambled conditions (48% vs. 60%, with within-subjects differences ranging from -23% to 9%), which constituted the most difficult conditions (Figure 2.7).

Figure 2.7: Behavioral performance in the scanner for Experiment 4, measured in proportion correct. By condition: crowded (0.9º), severely scrambled (2.25º, 90%), moderately scrambled (2.25º, 50%), and noncrowded (2.25º). Individual subject data is shown, with error bars representing the standard error across experiment blocks.
[Figure 2.7 axes: Condition (0.90°; 2.25°, 90%; 2.25°, 50%; 2.25°) vs. Prop. Correct; legend: subjects LC, ASN, PB, ACA, BB.]

We examined the effects of crowding and task difficulty on BOLD signal within the ROIs in V1 through hV4. Crowding (crowded vs. non-crowded) was again associated with a reduction in BOLD amplitude [F(1,4) = 19.6, p<0.05] (Figure 2.6B), and there was no interaction between crowding and visual area [F(3,12) = 1.77, p = 0.20]. More importantly, we found a lower response to the crowded condition than to the highly scrambled condition [F(1,4) = 9.45, p<0.05], which were comparable in performance, and there was no interaction with visual area. In contrast, there was no discernible difference in BOLD signal amplitude between the two scrambled conditions and the non-crowded condition [F(2,4) = 1.22, p = 0.35], despite their varying levels of behavioral performance. The reduction of BOLD signal in the early visual areas was associated with crowding and not task difficulty. There was no correlation between task difficulty and BOLD response, most likely because anticipation was not possible in the rapid event-related design.
Combining data from the two conditions (crowded and non-crowded) common to all three event-related, letter-identification experiments yielded the same conclusion: there was a strong main effect of crowding [F(1,21) = 31.16, p < 0.0005]. Paired t-tests on the combined data showed an effect of crowding in all four cortical areas: V1 [t(21) = -4.56, p < 0.0005], V2 [t(21) = -4.49, p < 0.0005], V3 [t(21) = -5.50, p < 0.00005], and hV4 [t(21) = -4.22, p < 0.0005]. (Three subjects participated in two of the three experiments. A single entry was created for each subject using the average BOLD amplitude of each condition across the two experiments.)

Figure 2.8: Stimuli and averaged hemodynamic response timecourses for ROIs in Experiment 4. A. Stimulus configurations, from upper left to lower right: crowded; non-crowded with 90% of the target tiles scrambled; non-crowded with 50% of tiles scrambled; and non-crowded with intact target. Light gray region illustrates the projection of the retinotopically-defined regions of interest to visual space. B. The time-course of the BOLD signal for each condition averaged across five subjects was fitted with a difference-of-gamma function. The estimated peak response amplitudes are plotted in the insets. Error bars in bar plots represent the within-subjects standard error. Greater task difficulty for the scrambled conditions (gray lines/bars) was not associated with a significant reduction in BOLD amplitude.
[Figure 2.8 panels: A, stimuli; B, timecourses for V1, V2, V3, hV4; conditions 0.9°; 2.25°, 90%; 2.25°, 50%; 2.25°; axes: Time (s) vs. Δ BOLD (beta value).]

Results from these four experiments show a highly significant physiological effect of crowding as early as in V1. They do not rule out the contribution of other areas to crowding, nor do they by themselves suggest a V1 origin of crowding. Crowding may occur in higher-level visual areas (Louie et al. 2007; Farzin et al.
2009), in which case regions beyond retinotopic visual cortex could be expected to exhibit signal suppression with crowding.

In the lateral occipital complex (LOC), a large higher-level object-selective region, we found that the effect of crowding systematically varied across subregions for most subjects. Within the subregions of LOC that were significantly modulated by the stimuli in Experiments 2, 3, and 4 (where subjects attended to the stimuli), we computed the sign of the voxel-wise differences in BOLD amplitude between crowded and non-crowded conditions. For eleven of the fifteen subjects who participated in the experiments and had localized LOCs, including the three subjects who participated in multiple experiments and yielded consistent results, distinct patches of voxels were evident in LOC: the more superior-posterior subregions of LOC showed the same sign of difference as the early visual areas, where the crowded condition evoked a lower BOLD signal level; in contrast, the more anterior-inferior subregions showed the opposite: the crowded condition produced the larger response (Figure 2.9A). These distinct regions were generally contiguous such that it is statistically unlikely that this configuration could have occurred by chance (bootstrapped p<0.05 for each of the five subjects; Figure 2.9B-D). The lack of statistically significant subregions in a minority of subjects may have resulted from low power due to the relatively small voxel-wise signal differences in rapid event-related designs without spatial smoothing. To increase our power to identify LOC subregions, we performed a voxel-wise mixed-effects group analysis within an independently defined group-level LOC ROI using spatially smoothed data. The resulting group LOC map again revealed posterior regions where the BOLD response was greater for the non-crowded condition, and an anterior region where response was greater for the crowded condition.
The pattern of results in LOC could be due to the representation of different classes of stimuli in segregated subregions of LOC (Konkle and Oliva 2012), or the dedication of different regions to distinct types of processing (Grill-Spector et al. 1999; Kourtzi and Huberle 2005). One possible explanation for this pattern of results is that an impoverished bottom-up feed-forward signal from the early stages of visual processing is received by some subregions of LOC, while other subregions are forced to engage in top-down inference (which is often unsuccessful in disambiguating the crowded signal).

Figure 2.9: LOC subregions observed in Experiments 2-4. A. Visual cortices of 2 representative subjects, with LOC sign-of-difference maps. Cortical surfaces were cut along the calcarine fissure (black dotted line) and flattened. [Figure 2.9 panel titles: Individual Sign-of-Difference Maps (subjects BB and ASN; areas V1, V2, V3, hV4, LOC); Observed Map (no. of subregions = 11); Simulated Map (no. of subregions = 36); Significance Test (frequency vs. no. of subregions; observed value and null distribution); Group-level Map.] Regions of LOC that showed numerically higher activation for non-crowded stimuli are indicated with blue; regions that showed higher activation for crowded stimuli are shown in red. B. For subject ASN, 11 subregions, including 3 prominent ones, are apparent. C. A simulated random sign-of-difference map for the same subject, produced by randomly assigning preference for crowded or non-crowded stimuli while preserving the local correlation between voxels. Many (36) small, distinct subregions are apparent. D. Null distribution (histogram of the number of subregions in a random map over 5,000 simulations), for the same subject. Fewer than 0.1% of simulated maps had 11 or fewer subregions, thus rejecting the null hypothesis that the observed subregions arose by chance.
Eleven of the 15 subjects from Experiments 2-4 who had participated in a LOC localizer scan showed consistent results, rejecting the null hypothesis of random subdivision of LOC at a significance level of 5%. E. Mixed-effect group analysis (N=22) of the contrast of responses to crowded and non-crowded conditions, thresholded for false discovery rate of 0.05, displayed on the inflated cortical surface of subject BB for comparison. Orange indicates a subregion where response was higher in the crowded condition, blue in the non-crowded condition.

DISCUSSION

We found that crowding was associated with a decrease in BOLD signal in early stages of visual processing, from V1 to hV4. The suppression of BOLD signal correlates with crowding: suppression was strongest for the most closely spaced stimulus configuration. In V1 and V2, this effect persists regardless of whether attention is directed toward or away from the stimulus, implicating an automatic rather than behavior-driven process. In accord with this view, the complex response pattern of LOC to crowding also makes a top-down explanation less parsimonious. Results from these experiments are consistent with an early locus of crowding in V1.

Crowding has been described as a distinct process from “surround suppression”, where a mask placed adjacent to a target interferes with detection of the target (Levi et al. 2002; Petrov and McKee 2006; Petrov and Popple 2007; Petrov et al. 2007; Levi 2008). This distinction is largely based on differences in psychophysical measures. However, the two phenomena have several properties in common (see Table 1 of Petrov et al. 2007), and a precise delineation between crowding and surround suppression has not been established. Surround suppression can very well be a component of crowding (Maniglia et al. 2011). Indeed, for studies that directly compare crowding with surround suppression, it is not uncommon to define the phenomena in terms of the stimuli (e.g. Petrov et al.
2007): small laterally-placed flankers for crowding, a large grating annulus matched in spatial frequency to a central target grating for surround suppression. In order to emphasize crowding in case the phenomena are in fact distinct, we used stimuli typically designed to induce crowding rather than surround suppression. We used spatially broadband letter stimuli rather than oriented gratings (Zenger-Landolt and Heeger 2003). We kept our “surround” region very small, consisting of only two letters, one on each side of the target. We displayed the target (and flankers) at 100% contrast (Pelli et al. 2007).

The amount of BOLD signal suppression we observed with our crowding stimuli was very strong: the BOLD response to the target and flankers in the ROIs that encompassed both target and flankers was similar to the BOLD response evoked by the flankers alone, regardless of the attention state (Experiments 1 and 2). The lack of response increase when the target was added to the crowded configuration was not due to saturation of the BOLD response at the voxel level. (Response clearly did not saturate at the ROI level, since the ROI responses were higher in the target-present non-crowded condition than in the crowded condition.) If voxel-level saturation were the only mechanism underlying our finding in the crowded condition, we would expect to never observe a decrease in response when a target was added. However, individual subject data showed a significant and reliable decrease in single-voxel responses when the target was added (Figure 2.10), which rules out the saturation of voxel responses as an explanation of our finding.

Figure 2.10: Suppression, not response saturation, explains Experiment 1 results. Recent work by Kay et al. (2013) suggests that the response of a voxel may saturate as stimuli fill its receptive field.
To determine if response saturation could explain the absence of response increase when the target was added in the crowded condition, we examined the responses of single voxels summed over the stimulus block (from 6 to 16 seconds after stimulus onset) in the crowded condition. Single-voxel responses are shown for upper visual field in the crowded condition with target present (blue) and target absent (red) for a representative subject. For visualization, voxels are sorted by response (from highest to lowest) to the target-absent condition. Target-present responses are generally lower than target-absent responses. In V1 we found that eight out of fourteen subjects’ data showed a significant suppression of single-voxel responses (p < 0.05), rather than saturation, when the target was added to stimuli presented in the upper visual field. In V2 and V3 this was the case for seven out of fourteen subjects. For stimuli presented in the lower visual field, data from three subjects showed significant suppression in V1, four in V2, and three in V3. If voxel-level saturation were the sole mechanism underlying our finding in the crowded condition, we would expect to never observe a decrease in response for any subject in any visual field when a target was added. The fact that most subjects showed a statistically significant decrease in BOLD response when the target was added argues against this possibility.
[Figure 2.10 axes: ∆ BOLD (sum of beta values) vs. voxel number; panels V1, V2, V3; subject HA; crowded condition, traces for target present and target absent.]

Our V1 results differ from those of three recent neuroimaging studies that investigated the effect of crowding on BOLD signal in retinotopic visual areas (Fang and He 2008; Bi et al. 2009; Freeman et al. 2011); none of these found any effect of crowding on BOLD signal in V1 when attention was directed away from their stimuli. Bi et al.
measured a crowding-modulated adaptation effect rather than directly measuring crowding. Fang and He, as well as Bi et al., attempted to isolate the responses to target from those to flankers. This necessitated the use of relatively large stimuli and large center-to-center spacing between target and flankers, which induced only weak behavioral crowding. Freeman et al. manipulated crowding with the relative timing between target and flankers. Their behavioral experiment suggested a large difference between their crowded (simultaneous presentation of the target and flankers) and non-crowded (sequential presentation of the target and flankers) conditions. The target and flankers were spatially adjacent, and like the current study, the authors did not try to spatially delineate target response from flanker response. While they found that crowding disrupted the temporal correlation between V1 and the visual word-form area, they did not observe any effect of crowding on BOLD amplitude in V1. However, the differences in the temporal dynamics between their crowded and non-crowded stimuli may have obscured the effect in V1. This is because for short inter-stimulus intervals, as in their non-crowded condition, temporal summation of sequentially evoked BOLD responses is sublinear, probably due to vascular refractory effects (Liu et al. 2010). This means that neuronal response could be disproportionately underestimated in their non-crowded (sequential presentation) condition, thus masking the difference between crowded and non-crowded conditions.

In the current study, we measured signal modulated by crowding directly under identical temporal regimes for the crowded and non-crowded conditions. We were able to use typical crowding stimuli with small center-to-center spacing that induced very strong behavioral crowding because our method of analysis does not require separation of responses to target and flankers.
Our method allowed us to reliably observe the suppression of BOLD signal in V1 with crowding, both with and without attention directed to the stimuli.

Crowding is reduced when flankers are contralateral to the target with respect to the vertical but not the horizontal meridian (Liu et al. 2009). This implicates a locus for crowding where the upper and lower visual fields are contiguously represented across the horizontal meridian, such as V1 or hV4. It has been suggested that cortical magnification in V1 combined with inappropriate feature integration over a constant radius on the cortex could explain the scaling of the spatial extent of crowding (the critical spacing) with eccentricity (Pelli and Tillman 2008; Nandy and Tjan 2012). Furthermore, Nandy and Tjan provided the first quantitative account of the elliptical shape of the spatial extent of crowding based in part on the physiological and anatomical properties specific to V1, and thus implicated V1 as a site for crowding. The current finding of crowding-induced suppression in V1 is consistent with this view.

The combined results from our four experiments provide support for an input-driven origin of crowding in V1. Anderson et al. (2012) observed a release from adaptation of BOLD signal in V1 when subjects noticed a change in a flanked target, but not when crowding interfered with change detection. However, this effect was measured while subjects attended to the crowding stimuli, such that it could be specific to stimuli that are behaviorally relevant, rather than automatic. In the present study, the fact that there was substantial signal suppression in V1 regardless of the behavioral relevance of the stimulus could suggest a lack of intervention by feedback and recurrent processing that are task-specific. While it may be tempting to exclude feedback as a possibility, feedback may be necessary for task-independent processing of visual input.
Regardless, our result is consistent with the view that there is a generic bottom-up image-processing component to crowding.

Our finding does not preclude the possibility that crowding arises at multiple stages of processing. Crowding occurs for stimuli of varying complexity (sinusoidal wavelets, line segments, letters, faces, objects), which could reflect contributions to crowding at various stages of processing (Louie et al. 2007; Farzin et al. 2009; Whitney and Levi 2011; Anderson et al. 2012). It is possible that, while crowding begins at a low level, higher cortical areas additionally contribute to the crowding effect. Our view is that, starting with V1, crowding is caused by similar local processes in multiple visual areas.

Chapter 3: A non-neuronal model of BOLD fMRI in retinotopic visual cortex

SUMMARY

In the last decade, a number of fMRI analysis methods have been developed that allow for characterization of data beyond an increase or decrease of response within a localized region of cortex. However, interpretation of the results of these analyses is complicated by a lack of information about what results can be expected based on the limitations and characteristics of BOLD signal, regardless of the underlying neural processes. To address this, we have developed a forward model of MVPA in primary visual cortex. This model begins with a representation of the V1 cortical surface and its mapping with the visual space (Rovamo and Virsu, 1984) and incorporates the Balloon Model (Buxton et al., 1998) to generate the BOLD response. Known components of BOLD signal are integrated, including spatial correlation in BOLD signal due to hemodynamic spread. We also introduce a new noise model that captures the spatially and temporally correlated noise that is inherent in BOLD measurements.
For each subject, data acquired during viewing of an expanding ring stimulus were used to tune model parameters to achieve the best match between the data and a model simulation of the same dataset in terms of BOLD SNR and signal and noise correlation characteristics. The model was then used to simulate BOLD fMRI data for a rotating wedge stimulus with similar protocol and appearance to the ring stimulus (near generalization) and letter stimuli presented at 5 degrees eccentricity with timing distinct from the other two experiments (far generalization). We show that, once calibrated, the model can be used to predict the results of novel experiments. The calibrated model can thus be used both to optimize experiment design and to provide a baseline result, exclusive of the effects of neural interactions, that can be used to interpret the results of fMRI experiments.

INTRODUCTION

Until recently, standard analyses of BOLD fMRI data were mostly limited to either voxel-wise search for regions of activation or examination of the mean signal within pre-defined regions of interest (ROIs). However, more complex techniques that allow for additional information to be gained from BOLD fMRI have become widely used. These methods, while powerful, are susceptible to the same difficulties that plague conventional fMRI analyses.

A ubiquitous problem for fMRI analysis is the corruption of the signal by a significant amount of noise from physiological and other sources. The typical signal-to-noise ratio (SNR) of a block design fMRI experiment ranges from only about 0.25 to 2.5, and is poorer for event-related designs (Chen et al., 2003). Here, “signal” is the component of the BOLD response directly related to the stimulus or task of interest.
What is referred to as “noise” in fMRI data has varied sources, including task-irrelevant neural activity, head motion, physiologic processes such as breathing, electronic noise due to nearby equipment, and blurring or distortion due to data pre-processing techniques (Kruger and Glover, 2001; Woolrich et al., 2004; Liu et al., 2006). The contributions from these factors can be estimated from conventional experiments, where noise is modulation unexplained by the experimental manipulation, or in resting state studies or phantom scans where experiment-related modulation is absent. Such studies have demonstrated the complexity of BOLD noise. Short and long-range (beyond 5 mm) spatial correlations (Biswal et al., 1995; Zarahn et al., 1997; Poot et al., 2008) have been observed in resting state data; in fact, correlations between the signal timecourses of brain regions separated by centimeters are observable in the absence of a stimulus or task (Xiong et al., 1999; Fox et al., 2005).

Even in low noise, the stimulus-relevant component of the BOLD response is less than ideal. The spatial and temporal resolution of typical BOLD data are low: about 3 mm and 2 s for full brain coverage, respectively (Ugurbil et al., 2013; Smith et al., 2013). In addition to these limits of the measurement, the signal source, deoxygenated hemoglobin in the vasculature (large drainage veins in the case of a typical gradient echo EPI sequence), induces a signal that is spread out in both space and time. The spatial point-spread of the BOLD signal is estimated between 2.5 mm and 4 mm FWHM at typical field strengths (Engel et al., 1998; Parkes et al., 2005). This means that measured activation could be in response to a change in neural activity more than a millimeter from the site of the measurement.
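As a rough worked illustration of this point, assume the point-spread function is Gaussian (an assumption made here only for the arithmetic, although a Gaussian kernel is also what the model described later in this chapter uses). The relative contribution of a neural source at cortical distance $d$ from the measurement site is then:

```latex
% Gaussian point-spread with full width at half maximum FWHM (assumed shape)
w(d) = \exp\!\left(-\,\frac{4 \ln 2 \; d^{2}}{\mathrm{FWHM}^{2}}\right)

% Relative weight of a source 1 mm away, at the two ends of the quoted range
w(1\ \mathrm{mm}) \approx 0.64 \quad (\mathrm{FWHM} = 2.5\ \mathrm{mm}), \qquad
w(1\ \mathrm{mm}) \approx 0.84 \quad (\mathrm{FWHM} = 4\ \mathrm{mm})
```

Under this assumption, a source 1 mm from the measurement site contributes roughly two-thirds to five-sixths as strongly as a source located exactly at the site.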
The hemodynamic response has a temporal lag, such that the peak response typically occurs about 6 seconds after the neural activity that drives it, and returns to baseline only after about 20 seconds (Buckner, 1998).

Standard methods are often used to reduce the impact of these nuisance factors on conventional statistical analyses such as the general linear model (GLM) to detect BOLD activation. Such approaches include using regressors of no interest (for example, polynomials, see Worsley et al., 2002, or selected independent or principal components, see Thomas et al., 2002; Behzadi et al., 2007), or autoregressive correlation correction (Bullmore et al., 1996). However, the influence of complex noise on newer, more complex analyses is unclear. All of these factors limit the information about the signal available to a given method for analysis. For example, analyses that rely on the spatial pattern of response may be unable to separate spatially correlated BOLD signal from similarly correlated BOLD noise.

In early visual areas, information available in the BOLD response is further constrained by the cortical magnification function, resulting in a small representation on cortex of the peripheral visual field. Combined with the low spatial resolution and spatial spread of the BOLD signal, this compression of the representation of the peripheral visual field has the potential to affect fMRI measurements in a manner that is unaccounted for in methods for its analysis. This poses two important problems: 1) How does one design an experiment that will be informative in spite of these limitations, and 2) What conclusions can be drawn from the data about the neural process of interest, given the influence of these nuisance factors on the measurement? These problems motivated the development of a model of BOLD fMRI in retinotopic visual cortex to aid in the design and interpretation of fMRI experiments.
While the neural activity that is reflected in BOLD signal is unknown, many properties of BOLD signal and noise can be estimated and modeled without knowledge of the underlying neural activity. These factors can be incorporated into a non-neuronal model to predict how such factors impact BOLD measurements and how an experiment can be optimized to maximize the probability of an informative outcome. The model is designed to be sufficiently general that it can be used for experiment optimization without requiring additional data acquisition by the experimenter. In this chapter, I describe the model and demonstrate its efficacy for describing and predicting BOLD signal in primary visual cortex.

METHODS

Annotation conventions

Let $y(u,v,t)$ be the mean-normalized BOLD (T2* weighted) measurement obtained at cortical location $(u,v)$ and time $t$. Depending on context, we may choose to represent $y$, as well as other vectorizable quantities, in a number of ways:
a) $y_{uv}(t) = y_t(u,v) = y(u,v,t)$
b) $\boldsymbol{y} = [y(u,v,t)]$ represents the BOLD measurements over cortical space and time.
c) $\boldsymbol{y}(t) = \boldsymbol{y}_t = [y(u,v,t)]_t$ represents the BOLD measurement over cortical space at a given time $t$.
d) $\boldsymbol{y}(u,v) = \boldsymbol{y}_{uv} = [y(u,v,t)]_{uv}$ represents the BOLD measurements over time obtained at cortical location $(u,v)$.

Model Overview

We assume $\boldsymbol{y}$ to be a sum of the stimulus-evoked BOLD signal and noise. Specifically, we assume:

$$ y_{uv}(t) = b(u,v,t) + \epsilon_{uv}(t) = b(I_c(u,v,t)) + \epsilon_{uv}(t) \tag{1} $$

where $I_c(u,v,t)$ is the cortical image corresponding to a stimulus, $b(u,v,t)$ is the BOLD signal evoked by the cortical image, and $\epsilon_{uv}(t)$ is noise, the component of the measurement unrelated to the stimulus.

Figure 3.1: Model Overview.

Given an experimental data set, we first estimate the noise by separating it from signal using GLM:

$$ y_{uv}(t) = \boldsymbol{X}(t)\,\boldsymbol{\beta}_{uv} + \epsilon_{uv}(t) \tag{2} $$

where $\boldsymbol{X}$ is the design matrix of the BOLD experiment and $\epsilon_{uv}(t)$ is the noise. We then characterize the spatiotemporal structure of the estimated noise in order to synthesize noise in the model.
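The separation in Eq. (2) amounts to an ordinary least-squares fit at each vertex, with the residual taken as the noise estimate. The sketch below is a minimal illustration under that assumption; the array layout (time by vertex) and the function name are hypothetical and are not taken from the analysis code used in this work.

```python
import numpy as np


def glm_residual_noise(y, X):
    """Split measurements into GLM-explained signal and residual noise.

    y : (n_timepoints, n_vertices) mean-normalized BOLD timecourses
    X : (n_timepoints, n_regressors) design matrix
    Returns (beta, signal, noise) with signal = X @ beta and noise = y - signal.
    """
    # Ordinary least squares, solved for all vertices at once
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    signal = X @ beta          # X(t) * beta_uv in Eq. (2)
    noise = y - signal         # epsilon_uv(t) in Eq. (2)
    return beta, signal, noise
```

By construction the residual is orthogonal to every column of the design matrix, so any stimulus-locked variance that the regressors can express is assigned to the signal term.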
Noise modeling and synthesis

We modeled the noise as having two components, one of which is signal-dependent and the other of which is independent of signal:

$$ \mathrm{var}(\boldsymbol{\epsilon}_{uv}) = (1 + \gamma_s)\,\sigma_0^2 \tag{3} $$

where $\gamma_s$ is the signal-dependence and $\sigma_0^2$ is the signal-independent (“basal”) noise variance. Because the signal-dependent contribution is independent from other noise sources, the variance of $\epsilon_{uv}(t)$ is assumed to be the sum of the variance ($\sigma_0^2$) of the “basal” noise and a term that depends on the power (square of the root-mean-square) of the BOLD signal at the location $(u,v)$. Using the definition of BOLD signal used in the GLM (Eq. 2), this gives

$$ \mathrm{var}(\boldsymbol{\epsilon}_{uv}) = \gamma\,\mathrm{rms}(\boldsymbol{X}\boldsymbol{\beta}_{uv})^2 + \sigma_0^2 \tag{4} $$

where $\gamma$ and $\sigma_0$ are constant across the modeled cortex, which we estimated from the BOLD measurements across the cortex from Eqs. (1) and (2) using a least squares procedure. Only vertices with a sufficiently high R² value (from the original GLM fit to separate signal from noise) were used in the fitting procedure, as voxels with low R² do not provide reliable information about this relationship.

[Figure 3.1 panel labels: stimulus; map to model V1; BOLD signal model; BOLD noise model; predicted voxel-wise response.]

Figure 3.2: Noise model. Gaussian white noise on the V1 model surface is filtered to produce the spatiotemporal correlation structure of “real” noise. The noise is then scaled voxelwise by a constant that captures signal-dependent and signal-independent contributions to the variance of the noise.
[Figure 3.2 panel labels: Gaussian white noise; spatiotemporal kernel; signal-dependent scaling factor; simulated noise.]

Given $\gamma$ and $\sigma_0$, the basal (signal-independent) noise $\xi_{uv}(t)$ can be determined from

$$ \xi_{uv}(t) = \epsilon_{uv}(t)\,\sqrt{\frac{\sigma_0^2}{\gamma\,\mathrm{rms}(\boldsymbol{X}\boldsymbol{\beta}_{uv})^2 + \sigma_0^2}} \tag{5} $$

To synthesize $\boldsymbol{\xi}$, and eventually $\boldsymbol{\epsilon}$, we first estimate the spatiotemporal amplitude spectrum of $\boldsymbol{\xi}$. We assume that the spectrum is isotropic in spatial frequency, such that the data can be averaged over the smaller spatial dimension (say $v$) to improve estimation.
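The fit of Eq. (4) and the rescaling of Eq. (5) can be sketched as follows. This is a minimal illustration that assumes a plain linear least-squares fit of per-vertex noise variance against per-vertex signal power; it omits the exclusion of low-R² vertices described above, and the function names and array layout (time by vertex) are hypothetical.

```python
import numpy as np


def fit_noise_variance_model(signal, noise):
    """Fit var(noise_uv) = gamma * rms(signal_uv)^2 + sigma0_sq (Eq. 4).

    signal, noise : (n_timepoints, n_vertices) arrays, e.g. the GLM fit
    and its residual. Returns (gamma, sigma0_sq).
    """
    signal_power = np.mean(signal ** 2, axis=0)   # rms^2 over time, per vertex
    noise_var = np.var(noise, axis=0)             # variance over time, per vertex
    design = np.column_stack([signal_power, np.ones_like(signal_power)])
    (gamma, sigma0_sq), *_ = np.linalg.lstsq(design, noise_var, rcond=None)
    return gamma, sigma0_sq


def basal_noise(signal, noise, gamma, sigma0_sq):
    """Remove the signal-dependent scaling from the noise (Eq. 5)."""
    signal_power = np.mean(signal ** 2, axis=0)
    scale = np.sqrt(sigma0_sq / (gamma * signal_power + sigma0_sq))
    return noise * scale                          # broadcast over time
```

After this rescaling, the variance of the basal noise at each vertex equals `sigma0_sq` whenever the noise variance follows Eq. (4) exactly, which is what allows a single spectrum to be estimated across the cortex.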
We further assume that the spectrum is space-time separable, such that spatial and temporal spectra can be estimated separately by averaging over the other dimension. We obtain the space-time separable amplitude spectrum for the new noise, $A_\xi$:

$$ \xi(r,t) = \langle \xi(u,v,t) \rangle_v $$
$$ A_\xi(\Delta r, \Delta t) = \left| \mathcal{F}[\xi(r,t)] \right| \tag{6} $$
$$ A_\xi(\Delta r) = \langle A_\xi(\Delta r, \Delta t) \rangle_{\Delta t} \tag{7} $$
$$ A_\xi(\Delta t) = \langle A_\xi(\Delta r, \Delta t) \rangle_{\Delta r} \tag{8} $$
$$ A_\xi(\Delta r, \Delta t) = A_\xi(\Delta r) \otimes A_\xi(\Delta t) \tag{9} $$
$$ A_\xi = A_\xi(\Delta u, \Delta v, \Delta t) = A_\xi\!\left(\sqrt{\Delta u^2 + \Delta v^2},\ \Delta t\right) \tag{10} $$

where the Fourier transform ($\mathcal{F}$) and its inverse were implemented with FFT and IFFT.

To synthesize the basal noise $\xi(u,v,t)$, we first draw a white noise sample of unit power density $w(u,v,t)$ and filter it with $A_\xi$ in the Fourier domain. That is,

$$ \xi(u,v,t) = \mathcal{F}^{-1}\!\left[ \mathcal{F}[w(u,v,t)] \circ A_\xi \right] \tag{11} $$

To synthesize the total noise $\epsilon(u,v,t)$, we invert Eq. (5) after having synthesized the BOLD signal $b(u,v,t)$ (see next section):

$$ \epsilon(u,v,t) = \xi(u,v,t)\,\sqrt{\frac{\gamma\,\mathrm{rms}(\boldsymbol{b}_{uv})^2 + \sigma_0^2}{\sigma_0^2}} \tag{12} $$

BOLD signal synthesis

BOLD signal, $b(u,v,t)$, which was approximated as $\boldsymbol{X}(t)\boldsymbol{\beta}_{uv}$ in Eq. (2), is synthesized from a projection of the retinal image $I_r(x,y,t)$ to a cortical region of interest, which we refer to as a cortical image $I_c(u,v,t)$. The mapping between $I_r$ and $I_c$ is based on a model of retinotopy, which we will describe in the next section. Instead of using a linear model, the BOLD signal is synthesized via a more elaborate nonlinear biophysical model, in order to make the model more generalizable across tasks and stimuli.

First, a point-wise BOLD signal is generated at each cortical location using the biophysical model. This signal is then spatially smoothed based on the known point-spread function of BOLD signal. Finally, the smoothed signal is scaled across the cortex by a local property that we call “responsiveness”.

We used the hemodynamic model $H(\cdot\,;\Theta)$ proposed by Friston et al. (2003), which incorporated the Balloon Model of Buxton et al.
(1998) to generate point-wise BOLD signal $p(u,v,t)$ from the cortical projection of the stimulus:

$$ p(u,v,t) = H(I_c(u,v,t);\,\Theta) \tag{13} $$

where the parameters $\Theta$ of the Friston et al. model are taken from the literature, adjusting for the field strength and acquisition parameters we used in our fMRI experiments (see Appendix A).

The point-wise BOLD signal is then smoothed spatially by a Gaussian kernel with FWHM = 3.5 mm (Engel et al., 1997; Parkes et al., 2005):

$$ q(u,v,t) = p_t(u,v) \ast \exp\!\left( -\frac{u^2 + v^2}{\left( 1.75 / \sqrt{-\ln 0.5} \right)^2} \right) \tag{14} $$

Finally, the synthesized BOLD signal $b(u,v,t)$ is taken to be a spatial point-wise product between $q(u,v,t)$ and a spatially varying scaling function $\rho(u,v)$, which represents the observable hemodynamic responsiveness at each cortical location. That is,

$$ b(u,v,t) = q_t(u,v) \circ \rho(u,v) \tag{15} $$

$\rho(u,v)$ is inferred from measured data ($y(u,v,t)$) such that it minimizes the discrepancy between the synthesized point-wise signal-to-noise ratio, $\mathrm{SNR}_{syn}(u,v)$, and the measured $\mathrm{SNR}_{mea}(u,v)$ associated with Eq. (1). Specifically,

$$ \mathrm{SNR}_{mea}(u,v) = \frac{\mathrm{rms}(\boldsymbol{X}\boldsymbol{\beta}_{uv})}{\sqrt{\mathrm{var}(\boldsymbol{\epsilon}_{uv})}} \tag{16} $$

and

$$ \mathrm{SNR}_{syn}(u,v) = \frac{\mathrm{rms}(\boldsymbol{b}_{uv})}{\sqrt{\mathrm{var}(\boldsymbol{\epsilon}_{uv})}} \tag{17} $$

$$ \mathrm{SNR}_{syn}(u,v) = 1 \Big/ \sqrt{ \mathrm{var}(\boldsymbol{\xi}_{uv}) \left[ \frac{\gamma}{\sigma_0^2} + \frac{1}{\mathrm{rms}(\boldsymbol{b}_{uv})^2} \right] } = 1 \Big/ \sqrt{ \mathrm{var}(\boldsymbol{\xi}_{uv}) \left[ \frac{\gamma}{\sigma_0^2} + \frac{1}{\mathrm{rms}(q_{uv}(t) \circ \rho_{uv})^2} \right] } \tag{18} $$

where $\mathrm{rms}(\cdot)$ and $\mathrm{var}(\cdot)$ are over time, and Eq. (18) is derived from Eq. (12).

We then adopted an efficient way of estimating the point-wise responsiveness function $\rho(u,v)$ from the SNR. Specifically, we equate Eq. (16) with Eq. (18) and solve for $\rho(u,v)$ to obtain the responsiveness, and then obtain the simulated BOLD signal via Eq. (15).

Figure 3.3: Signal synthesis. Spectrum of real eye-tracking data is used to produce timecourse of simulated gaze positions according to which input image is jittered. The stimulus is then projected to a V1 cortical model, and the balloon model is used to generate the timecourse for each point on the cortical model. Cortical response is smoothed with a Gaussian kernel, equivalent to the typical point-spread function of fMRI.
Finally, the signal at each vertex is scaled according to “responsiveness”: the relative strength of each vertex’s BOLD response given the same amount of neural stimulation.
[Figure 3.3 panel labels: timecourse of gaze positions; Friston-Buxton non-linear model for voxel-wise timecourse; Gaussian kernel, FWHM = 3.5 mm; voxel-wise scaling by “responsiveness”; simulated signal.]

Mapping from visual space to cortical surface

We model the cortical input that drives the BOLD response as a cortical image $I_c(u,v,t)$, obtained by projection of the retinal image $I_r(x,y,t)$ via a model of the V1 surface developed by Rovamo and Virsu (1984). This model accounts for the retinotopic organization of V1 and the cortical magnification function (see Appendix B). For each subject, we estimated the two parameters $c_1$ and $c_2$ of the cortical magnification as a function of eccentricity $\omega$ (Duncan and Boynton, 2003)

$$ M(\omega) = \frac{1}{c_1 \omega + c_2} \tag{19} $$

by minimizing the least-squares difference between the cortical distance from the fovea to the cortical location representing a given eccentricity for the subject’s retinotopic mapping and the resulting cortical model. Thus, for each subject, a cortical location was referenced to the corresponding location on the model cortex via the shared retinal coordinate. We used this system to map each point on the subject’s cortex to a point on the model surface. The model surface was then flattened to a 2D mesh to simplify the combination of signal (simulated on the 3D surface) and noise (synthesized in 2D).

Subject-specific vs. general model

Our aim was to provide a model that could be used to predict the results of novel experiments without requiring the acquisition of data with a particular participant or stimulus protocol. As a first step, we calibrated our model separately for each subject, and tested whether this subject-specific model was generalizable across experiments.
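The fit of the cortical magnification parameters in Eq. (19) can be sketched as follows. Integrating $M(\omega)$ from the fovea gives the cortical distance $D(\omega) = \ln(1 + c_1\omega/c_2)/c_1$, which is the quantity compared with the measured distances. This is a minimal illustration rather than the procedure used in this work: the search strategy (a one-dimensional search over the ratio $c_1/c_2$, with $1/c_1$ solved in closed form at each candidate ratio), the function names, and the choice of millimeters for cortical distance are all assumptions.

```python
import numpy as np


def cortical_distance(ecc_deg, c1, c2):
    """Distance from the fovea: integral of M(w) = 1/(c1*w + c2) from 0 to ecc."""
    return np.log1p(c1 * ecc_deg / c2) / c1


def fit_magnification(ecc_deg, dist_mm, ratio_bounds=(1e-2, 1e2),
                      n_grid=200, n_refine=4):
    """Least-squares estimate of (c1, c2) from eccentricity/distance pairs.

    For a fixed ratio a = c1/c2 the model is dist = (1/c1) * log(1 + a*ecc),
    which is linear in 1/c1, so only the ratio needs a numerical search.
    """
    lo, hi = np.log(ratio_bounds[0]), np.log(ratio_bounds[1])
    for _ in range(n_refine):
        log_ratios = np.linspace(lo, hi, n_grid)
        basis = np.log1p(np.exp(log_ratios)[:, None] * ecc_deg[None, :])
        # Closed-form least-squares scale (1/c1) for every candidate ratio
        scale = basis @ dist_mm / np.sum(basis ** 2, axis=1)
        sse = np.sum((scale[:, None] * basis - dist_mm[None, :]) ** 2, axis=1)
        k = int(np.argmin(sse))
        step = log_ratios[1] - log_ratios[0]
        lo, hi = log_ratios[k] - step, log_ratios[k] + step   # zoom in
    c1 = 1.0 / scale[k]
    c2 = c1 / np.exp(log_ratios[k])
    return float(c1), float(c2)
```

Given paired eccentricities and cortical distances from a subject's retinotopic map, `fit_magnification` returns the pair of parameters whose predicted distances are closest in the least-squares sense within the searched range of ratios.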
A general model, for simulation without requiring data acquisition, was developed from the data of all subjects. The parameters for the general model were determined from the individual subject models as follows:

The noise was generated in the same manner as for the individual subject models, but the parameters for the noise model were obtained by combining parameter estimates across subjects. The spectrum of the basal noise, $\xi(u,v,t)$, was synthesized by first merging the temporal and spatial spectra separately across subjects:

$$ A_\xi(\Delta r) = \langle A_s(\Delta r) \rangle_s \tag{20} $$
$$ A_\xi(\Delta t) = \langle A_s(\Delta t) \rangle_s \tag{21} $$

where $s$ indexes subjects. When the amplitude for a frequency was available from more than one subject’s noise spectrum, the amplitude values for the frequency were averaged across subjects. The resulting spectra were then sampled using linear interpolation in order to produce noise with the desired dimensions. $\xi(u,v,t)$ was synthesized from the resulting spectra using the same method as for the subject-specific model. $\gamma$ and $\sigma_0$, the parameters describing the signal-dependence of the noise, were estimated from the noise variance and rms signal of all subjects. The total noise was then synthesized in the same manner as for the individual subject models.

BOLD signal was synthesized in the same manner as for the individual subject models. The hemodynamic responsiveness, $\rho(u,v)$, was obtained through the following procedure. The spatial 2D amplitude spectrum of the responsiveness function was determined for each subject, and a combined spectrum was obtained in the same manner as the noise spectrum. This spectrum was then used to generate a new 2D set of responsiveness values, with randomized phase $\varphi$:

$$ \rho(u,v) = \mathcal{F}^{-1}\!\left[ e^{2\pi i \varphi} \circ A_\rho \right] \tag{22} $$

The resulting responsiveness values were rescaled to match the mean and standard deviation of the responsiveness across subjects.

The cortical model described above and in Appendix B was also used for the general model, with cortical magnification function parameters $c_1 = 0.065$ and $c_2$
= 0.054, the average values across 10 subjects obtained by Duncan and Boynton (2003).

Subjects

All subjects had normal or corrected-to-normal vision, and were between the ages of 23 and 37 (2 males and 2 females). All subjects had definable retinotopic visual areas, as defined by the procedure described below. The experimental protocol was approved by the Institutional Review Board of the university, and all subjects provided written informed consent.

Stimuli

Stimuli were generated on a Macintosh computer using MATLAB and the PsychToolbox (Brainard 1997; Pelli 1997).

Retinotopic mapping used two types of full-contrast, color and contrast-reversing stimuli. The first, for mapping eccentricity, consisted of 20 rings of increasing radius, expanding out from fixation. The width of the rings was log-scaled with their diameter to activate approximately equal cortical area regardless of eccentricity. The second, for mapping polar angle, consisted of a wedge that rotated about fixation.

For the letter experiment, a letter (K, N, R, or S) in Sloan font subtending 4 degrees was displayed at 5 degrees eccentricity in the lower right visual quadrant. Each trial lasted 2 seconds, during which a single white letter appeared at full contrast on a gray background for 200 ms every second. Trials were blocked in groups of 5 trials of the same letter, separated by blocks of 5 trials with no peripheral stimulation. The sequence of letter blocks was counterbalanced within each scan, such that each block occurred the same number of times and was preceded by each other block the same number of times.

A fixation task was used to discourage eye movements and keep subjects’ attention constant and away from the letter stimulus. On each trial, a small (0.3 degree) square appeared at fixation, and proceeded through a sequence of 4 colors selected from red, yellow, green, and blue.
In half of the trials (randomly selected), all four colors in the sequence were different; in the other half of the trials, the 4th color in the sequence was the same as either the 1st or 2nd. Subjects were asked to indicate with a button press when the fourth color was a repetition of one of the first two.

Stimuli were projected on a 32 × 24-cm screen mounted perpendicularly to the toe–head axis in the bore of the magnet, directly above the subject’s head. The viewing distance was 88 cm. The display was set at a background luminance of 156 cd/m², and the maximum luminance at 100% contrast was 312 cd/m².

fMRI acquisition

All scans were performed using a 3 Tesla whole-body magnet (Siemens Prisma) at the USC Dana and David Dornsife Cognitive Neuroscience Imaging Center at the University of Southern California with a 32-channel matrix coil running a simulated CP mode. Functional scans were acquired using a multiband Gradient Echo Planar (EPI) sequence (TR = 1000 ms, TE = 35 ms, flip angle = 65°) with an isotropic voxel resolution of 2.5 mm, with several exceptions. Retinotopic scans for S2 and S3 were acquired with a TR of 1.2 seconds, and letter data with an isotropic voxel resolution of 3 mm. Anatomical scans were acquired using a Magnetization Prepared Rapid Acquisition Gradient Echo (MPRAGE) sequence (TI = 1100 ms, TR = 2070 ms, TE = 4.14 ms, flip angle = 12°) with a resolution of 1 × 1 × 1.2 mm.

fMRI preprocessing

Preprocessing of imaging data was performed with BrainVoyager, including motion-correction, linear-trend removal, and high-pass temporal filtering. Intrasession functional scans were aligned to one another and co-registered with the intrasession anatomical scan. Intersession co-registration was achieved by co-registering the anatomical scans from each session. Aligned data was projected to the flattened mesh representation of the subject’s cortical surface.
Data was %-transformed (mean-subtracted and normalized) to eliminate differences in signal baseline and match the units of the balloon model for consistency between real and simulated data. Data for vertices of the mesh that fell within V1 boundaries (as identified from response to the wedge stimulus) was selected for analysis.

Model assessment

The eccentricity mapping scan (ring) was used to determine model parameters (“calibration”). The model was then tested for fidelity of prediction of the polar angle mapping scan (wedge), which we refer to as “near generalization” due to the similarities in stimulus and timing of the ring and wedge experiments. The letter scans were also used to test model prediction, and we refer to this as “far generalization” because the letter stimuli differ in appearance and timing from the ring stimuli. To make an equivalent comparison between the simulation and data (where true signal and noise are unknown), both were fit with a general linear model (GLM), with one predictor for each stimulus condition, convolved with a canonical hemodynamic response function (Kay et al., 2013) prior to assessment. The quality of the model fit to data and the fidelity with which it predicted novel data were assessed based on the following: the signal-to-noise ratio (SNR), as defined above; the F-statistic of the GLM fit; and the dependence of the pairwise noise correlation of vertices on their cortical separation.

In the case of the letter, or far generalization, experiment, 5 or 6 scans were acquired for each subject, allowing for comparison of test-retest reliability and model fidelity. To assess this for each subject, we used a bootstrapping procedure in which we sampled vertices with replacement from each of the acquired scans 1000 times, and calculated the resulting distribution of SNR for the sampled set of vertices. We then determined the Hellinger distance between the distributions for pairs of runs.
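The bootstrapped Hellinger comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the thesis: it assumes that the two runs share the same set of V1 vertices, that SNR distributions are estimated as histograms on common, equally spaced bins, and that the same resampled vertex indices are applied to both runs in each bootstrap iteration. All function names and the synthetic SNR values are hypothetical.

```python
import math
import random


def make_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced bin edges spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + k * width for k in range(n_bins + 1)]


def histogram(values, edges):
    """Bin probabilities of `values` on equally spaced `edges`.

    Values outside the range are counted in the first or last bin.
    """
    n_bins = len(edges) - 1
    width = edges[1] - edges[0]
    counts = [0] * n_bins
    for v in values:
        k = int(math.floor((v - edges[0]) / width))
        counts[min(max(k, 0), n_bins - 1)] += 1
    return [c / len(values) for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def bootstrap_hellinger(snr_a, snr_b, edges, n_boot=1000, seed=0):
    """Hellinger distances between two runs' SNR histograms over bootstrap resamples of vertices."""
    rng = random.Random(seed)
    n = len(snr_a)  # assumed: both runs are defined on the same n vertices
    distances = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # vertices sampled with replacement
        p = histogram([snr_a[i] for i in idx], edges)
        q = histogram([snr_b[i] for i in idx], edges)
        distances.append(hellinger(p, q))
    return distances


def percentile(values, pct):
    """Nearest-rank percentile of `values` for an integer `pct` between 1 and 100."""
    ordered = sorted(values)
    k = int(math.ceil(pct * len(ordered) / 100)) - 1
    return ordered[min(max(k, 0), len(ordered) - 1)]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical per-vertex SNR values for two real runs and one simulated run.
    run_a = [rng.gauss(1.0, 0.3) for _ in range(500)]
    run_b = [rng.gauss(1.0, 0.3) for _ in range(500)]
    simulated = [rng.gauss(1.1, 0.3) for _ in range(500)]

    edges = make_edges(0.0, 2.5, 25)
    retest = bootstrap_hellinger(run_a, run_b, edges)
    threshold = percentile(retest, 95)
    sim_vs_data = hellinger(histogram(simulated, edges), histogram(run_a, edges))
    print("95th percentile of test-retest distances:", threshold)
    print("simulation vs. data distance:", sim_vs_data)
    print("within test-retest range:", sim_vs_data <= threshold)
```

A simulated run would be judged consistent with the data when its distance to a real run does not exceed the 95th percentile of the test-retest distances.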
This resulted in a distribution of distance measures that reflects the test-retest reliability of the data. To compare this to the difference between simulation and data, we calculated the Hellinger distance between the SNR distribution of a simulated run and those of each of the runs of real data, and determined whether the resulting values fell within the 95th percentile of the bootstrapped distribution.

RESULTS

We developed a non-neuronal model of BOLD response in retinotopic visual areas for experiment planning and interpretation. Our model begins with the mapping of an image in visual space to a model of retinotopic visual cortex (demonstrated with primary visual cortex, V1). A timecourse for the signal evoked by the stimulus at each cortical location is generated using the Balloon model (Buxton et al., 1998) as implemented by Friston and colleagues (2003). Signal is smoothed according to the point-spread function of the BOLD response measured previously (Engel et al., 1997; Parkes et al., 2005). Multiplicative noise with a spatiotemporal spectrum matched to noise measured in BOLD data (taken as the residual of the GLM fit, assuming additivity) is combined with the signal to produce the model output: a timecourse for each cortical position, which captures many of the observed properties of the measured BOLD response.

Model calibration

The parameters of the model were determined as described in Methods, through a fitting process to measured data on an individual subject basis. The calibration data was collected during retinotopic mapping, using an expanding ring stimulus (see Methods), while subjects performed a central task. The results of the fitting procedure are shown in Figure 3.4. Because, for individual subject models, there was a direct mapping between locations (vertices) on the model cortex and the subject’s cortex via the visual space, we compared the timecourses of corresponding vertices that represent the identical visual coordinate (Figure 3.4A).
There is a close match between timecourses from the data and simulation. This is consistent with a reasonable match between the signal timecourses generated using the balloon model and the hemodynamic response to the stimulus, which has been demonstrated in other studies at 3 T (Mildner et al., 2001). To assess the match in signal-to-noise across the full dataset, we fit a GLM (with one predictor per event type; in this case, one predictor for each radius of the ring stimulus), and compared the cumulative distribution function (CDF) of F-statistics of this fit across vertices. We also used the component of the data accounted for by the GLM (“signal”) and the residual (“noise”), the same procedure used to separate signal from noise for model calibration, to calculate CDFs for the SNR across vertices for the data and the simulation. The CDFs for SNR were nearly identical between the data and model for each subject (R² = 0.99, 0.99, 0.92, and 0.99), which is to be expected because SNR was matched during calibration. The CDFs for F-statistic were also very similar (R² = 0.53, 0.80, 0.69, and 0.83), showing that after typical data processing with GLM the model and data are still closely matched. While expected in the calibration step if a good fit is achieved, this was not ensured during calibration, because the signal and noise for the model were not obtained through GLM at that stage (the synthesized signal and noise were used). Lastly, we examined the correlation structure of the noise determined after GLM fit of the data and simulation. For two of the four subjects, the pairwise correlations of the simulation noise had a similar distance-dependence to that of the data noise; for the other two subjects, the long-range correlations were underestimated. We also examined the ring experiment data generated with the general model, with parameters estimated from the combined data across subjects (see Methods).
There is no direct correspondence between general model and data for comparison; however, the timecourses produced with the general model have similar features to data timecourses, such as differing peak heights and noise spikes. The CDFs for F-statistic and SNR of the general simulation are similar to those for the combined data across subjects (Figure 3.4B&C, bottom row), but somewhat less similar than those for the combined data across subject-specific simulations.

Figure 3.4: Results of model calibration, for individual subject and general models. For all panels, blue indicates the measured BOLD data, red the individual subject model, and gray the model generalized across subjects. A. Two representative timecourses for corresponding vertices (representing the identical retinal location) from the subject’s data and the simulation. The last row contains representative timecourses from the general model. B. and C. show the estimated cumulative distribution functions (CDFs) of the signal-to-noise ratio (as defined in Methods) and the F-statistic of the GLM fit across vertices. Error bands represent the 95% confidence interval of the CDF estimate. The last row shows the CDFs for the combined data across all subjects (blue), all subject-specific models (red), and the general model (gray). D. The dependence of pairwise timecourse correlation on cortical distance. Each point represents a pair of vertices, and the correlation between their timecourses is shown as a function of their cortical separation.

[Figure 3.4 image: rows S1 to S4; panel A, % signal change vs. time (s); panels B and C, cumulative proportion vs. SNR and F-statistic; panel D, correlation coefficient vs. cortical distance (mm). Legend: data, subject simulation, general simulation.]

Near generalization

After establishing a reasonable fit to the data, we tested whether our model could generalize to unseen stimuli and protocols. First, we used the model to synthesize BOLD signal evoked by a rotating wedge stimulus (see Methods). This stimulus is similar to the ring stimulus, in that both types of stimuli are contrast-reversing at the same temporal frequency, presented at full contrast, with each condition (spatial location) presented for the duration of one TR. We performed the same assessments on this simulated data as for the data simulated during calibration (Figure 3.5). The match between simulation and data was quite reasonable for all but one subject (SNR distribution R² = -1.9, 0.95, 0.84, and 0.81), though somewhat worse than during calibration; however, this is to be expected even for an ideal model, as there is variability in signal and noise between runs, even using the same stimulus within the same scan session (Rombouts et al., 1998).

Figure 3.5: Results of near generalization, for individual subject and general models. For all panels, blue indicates the measured BOLD data, red the individual subject model, and gray the model generalized across subjects. A. Two representative timecourses for corresponding vertices (representing the identical retinal location) from the subject’s data and the simulation. The last row contains representative timecourses from the general model. B. and C. show the estimated cumulative distribution functions (CDFs) of the signal-to-noise ratio and the F-statistic of the GLM fit across vertices. Error bands represent the 95% confidence interval of the CDF estimate. D. The dependence of pairwise timecourse correlation on cortical distance.
Each point represents a pair of vertices, and the correlation between their timecourses is shown as a function of their cortical separation.

[Figure 3.5 image: rows S1 to S4; panel A, % signal change vs. time (s); panels B and C, cumulative proportion vs. SNR and F-statistic; panel D, correlation coefficient vs. cortical distance (mm). Legend: data, subject simulation, general simulation.]

Far generalization

For a more demanding test of model generalizability, we simulated data from an experiment that differed both in stimulus and protocol from the ring experiment used for calibration. In this experiment, subjects performed a central task while letters subtending 4 degrees were presented briefly at 5 degrees eccentricity in the lower right visual field. For this experiment, we collected 5 such runs for each subject, so that we could compare the test-retest reliability of the data to the agreement between model and data. This provides a means for assessing similarity between the data and simulation, while accounting for the variation in data across runs. This comparison is shown in Figure 3.6B,C&D. The error bands of the CDFs of SNR and F-statistic from the real data represent the range of estimated CDFs across all runs collected. The difference between SNR distributions across runs, and between the simulation and data, was assessed using the Hellinger distance (Figure 3.6E; see Methods). The degree to which the simulation matched the real data varied widely between subjects.
The majority of the Hellinger distances between the SNR distributions of the simulation and data for subject 4 fell within the 95th percentile of the bootstrapped distribution of Hellinger distances between SNR distributions of pairs of runs of real data. For subject 1, two out of five distances between real and simulated SNR distributions fell within the 95th percentile; for subject 2, one distance met this criterion; for subject 3, no distance met this criterion.

Figure 3.6: Results of far generalization, for individual subject and general models. For all panels, blue indicates the measured BOLD data, red the individual subject model, and gray the model generalized across subjects. A. Two representative timecourses for corresponding vertices (representing the identical retinal location) from the subject’s data and the simulation. The last row contains representative timecourses from the general model. B. and C. show the estimated cumulative distribution functions (CDFs) of the signal-to-noise ratio (as defined in Methods) and the F-statistic of the GLM fit across vertices. Error bands show the range of CDF estimates across all runs of letter data collected. D. Bootstrapped distributions of Hellinger distance (HD) between distributions of SNR for pairs of runs from the real data are shown in blue. The HDs between the distribution of SNR of the simulation and each run of real data are shown in red. The black line indicates the 95th percentile of the distribution. E. The dependence of pairwise timecourse correlation on cortical distance. Each point represents a pair of vertices, and the correlation between their timecourses is shown as a function of their cortical separation.

DISCUSSION

We have developed a realistic model of BOLD fMRI in retinotopic visual cortex, and demonstrated its capacity to synthesize data for novel experiments. While simple, the model incorporates observable spatial and temporal properties of the measured BOLD signal.
A multiplicative, correlated noise model was used to simulate noise with similar spatial and temporal properties to noise determined from real data. Once calibrated, the model provides reasonable predictions of data acquired with different stimuli and protocols. A general version of the model can be used for simulation without data acquisition for a specific subject.

Match between simulation and data

The model achieves a good fit to data used for calibration, and closely reproduces data from a similar experiment (near generalization, as assessed through timecourse similarity, distribution of SNR, and distribution of F-statistic). The model’s fidelity in generalization to an experiment that differs markedly from the experiment used for calibration is more variable across subjects. There are several possible explanations for this discrepancy. First, data from subject 3, for whom there was the poorest fidelity, was collected in two different scan sessions, with two substantially different EPI protocols (see Methods). The relative contributions of system noise and physiological noise differ depending on the voxel size and echo time (Kruger and Glover, 2001), both of which were different between the retinotopy and letter scans for this subject. This could result in a failure of the noise model parameters estimated during calibration with the ring experiment to generalize to the letter experiment. This could indicate a general limitation of the model’s utility for simulating experiments acquired with scan protocols different from those used for calibration. Adding a noise parameter that is a function of the scan protocol could possibly ameliorate these discrepancies. Another possible source of this discrepancy is the subject’s attentional state across scans, which can be affected by both the subject’s level of alertness and the stimulus itself.
Elevated attention to a stimulus generally results in an increase in SNR (Kastner et al., 1999; Buracas and Boynton, 2007). The sudden appearance of a peripheral stimulus, in the case of the letter experiment, may also have induced more pronounced changes in attention or eye movements in some subjects, despite the demanding central task. Even for the same protocol, there can be variation in the number and location of active voxels across scans (Rombouts et al., 1998). There does not appear to be a systematic relationship across subjects between the SNR distribution predicted by the simulation and that determined from the real data; thus, it may be that the scan order (randomized across subjects) had an impact on the SNR. Finally, because the model is deliberately non-neuronal, it does not account for neural factors that may impact the BOLD response to these stimuli differently. BOLD signal in V1 has been shown to reflect a mild spatial non-linearity: as a stimulus increases in size, there is an increase in the BOLD response of a given voxel to the stimulus that eventually saturates (Kay et al., 2014). This saturation likely has a neural component. Saturation, and eventually a decrease, has been observed in the responses of V1 neurons to gratings or bars of increasing size displayed within their receptive fields in numerous studies (Hubel and Wiesel 1968; Dreher 1972; DeAngelis et al. 1994; Levitt and Lund 1997; Kapadia et al., 1999; Sceniak et al., 1999). Similar results have been observed in large regions of interest in V1 using fMRI (Press et al., 2001). The retinotopic mapping and letter stimuli may be affected to differing degrees by this suppressive effect.

Model assumptions

For simplicity, we make several assumptions about the noise in BOLD data. Consistent with the use of GLM to distinguish signal from noise, we assume that the noise is multiplicative in the sense that it depends on the signal, but is otherwise additive.
This is certainly not strictly true. Neural noise in visual cortex has been described as multiplicative (Geisler and Albrecht, 1995; Carandini, 2004); however, neural noise is only one of many factors that contribute to BOLD noise, and its contribution as seen through the vascular response will be more complex. The largest contribution to BOLD fMRI noise is believed to come from physiological sources such as breathing and heartbeat, which are aliased due to the sampling rate of BOLD fMRI relative to their temporal frequency (Kruger and Glover, 2001). These contributions are also likely not additive in the sense employed here, as they induce changes in the blood response that, together with the changes in flow evoked by neural activation, have a non-linear effect on the BOLD response (Buxton et al., 1998). Assumptions of additivity are standard and effective in BOLD analyses; noise removal methods that utilize ICA and PCA to project out nuisance factors lead to large reductions in variance not attributable to the stimulus protocol (Behzadi et al., 2007; Kay et al., 2013). On the other hand, an assumption of additivity likely gives the simulation an advantage over real data in separating signal from noise using standard methods such as GLM. We also assume that the noise is stationary. This is another commonly made assumption that is not strictly correct (Bullmore et al., 2001; Diedrichsen and Shadmehr, 2005; Long et al., 2005; Lund et al., 2006), and likely allows more successful separation of signal from noise in the case of the simulation. Finally, we assume an isotropic point-spread function. A large, if not dominant, source of the point-spread function is the network of veins that are the source of the BOLD signal (Duong et al., 2000; Parkes et al., 2005). The venous structure is variable across the cortex, resulting in an inhomogeneous point-spread (Yu et al., 2012).
This may be problematic for modeling the response of a specific cortical location; however, conclusions about populations of voxels or vertices are likely reasonable given this assumption.

Potential applications of the model

The general version of the model and tools for its calibration with individual subject data will be provided on GitHub. We hope that it will be a useful tool for both planning and interpretation of experiments. Because the model allows for simulation of the entirety of each hemisphere of V1 (and other retinotopic visual areas by replacing the cortical magnification function in the cortical model), we anticipate that it will be useful for optimization of protocols for experiments that make use of multi-voxel pattern analysis (MVPA; employed first by Haxby et al., 2001; reviewed in Norman et al., 2006). MVPA relies on a differential pattern of activation of groups of voxels across different conditions. This can be challenging in studies of peripheral vision, where stimuli have a relatively small representation in retinotopic visual areas due to the cortical magnification of the foveal representation. This model could be used to optimize stimuli to maximize the chances of differentiable patterns of activation given the constraints imposed by the cortical magnification factor, spatial spread of the BOLD response, and spatially correlated noise. Because the model intentionally does not incorporate neural properties or interactions, it can also be used to determine whether a given result can be explained by the properties of the BOLD response alone, without hypothesizing a neural mechanism. Simulation could contribute to the evidence for neural sources of phenomena such as the radial bias and population receptive fields.

Chapter 4: Applications of the BOLD fMRI model for experiment design and interpretation

SUMMARY

In this chapter, I demonstrate how the model can be applied to aid in two widely used computational approaches to fMRI data analysis.
The first, multi-voxel pattern analysis (MVPA) of BOLD fMRI data, has become a popular technique for studying visual processing in humans. MVPA is in general limited by the spatial spread of the hemodynamic response, the voxel size relative to the spatial scale of the stimulus-relevant response, and the noise inherent in BOLD fMRI. Application of MVPA to the study of early peripheral visual processing is especially challenging due to the cortical magnification factor, which limits the number of voxels that carry stimulus information for a small, peripherally presented stimulus. An informative outcome may thus be infeasible for a given peripheral visual MVPA experiment. We used our model to simulate MVPA with peripherally presented letters, in order to determine the lower bound on stimulus size for a successful MVPA experiment. The MVPA performance predicted by the model is higher than that of the real data for large letter sizes, but similar at small sizes; this may indicate that factors captured by the model are limiting at small stimulus size, but factors not considered in the model are limiting for large stimuli. The second method, voxel population receptive field (pRF) modeling, characterizes and quantifies the spatial response properties of an fMRI voxel. fMRI pRF measurements are influenced by both the underlying response properties of neurons and the spatiotemporal properties of the associated hemodynamic response. We used the model to determine to what extent population receptive fields in early visual cortical areas can be accounted for without neuronal receptive fields. Using standard protocols, stimuli, and analysis methods, we found that pRF sizes near the foveal representation, determined from the simulated data, are similar to those obtained with subject data.
Since the model does not include neuronal receptive field sizes, the resulting pRF sizes represent the contribution of factors that are orthogonal to neuronal receptive field properties, including cortical magnification, the retinotopic organization of the V1 cortex, the point-spread function of the BOLD response, and the thermal and physiological noise associated with the fMRI measurement. Assessment of the non-neuronal contribution potentially allows for its removal, and improved estimation of the underlying neural population response properties.

INTRODUCTION

Until recently, standard analyses of BOLD fMRI data were mostly limited to either a search for the brain regions involved in a particular task or process via tests of single-voxel activation, or the examination of how a task or process modulates BOLD response within a specific brain region via analysis of mean signal in a region of interest (Friston et al., 1994). In the past 10 years, however, a number of new techniques have been developed to make use of responses across populations of voxels to infer response properties or stimulus information. In multi-voxel pattern analysis (MVPA), for example, the pattern of response across a voxel population is assessed for stimulus or behavior-correlated information (employed first by Haxby et al., 2001; reviewed in Norman et al., 2006). For such a technique to be successful, it is neither necessary nor sufficient for there to be a neural response specialized for the task or stimulus property of interest; however, there must be a BOLD response that carries information about the task or stimulus. For example, if the spatial scale of the neural code is too small, neural information may not be conveyed in the BOLD response. Conversely, if the BOLD signal does carry stimulus information, the reason is not always clear.
For example, why orientation information can be read out with MVPA in V1 is currently under debate (Kamitani and Tong, 2005; Haynes and Rees, 2005; Gardner, 2010; Op de Beeck, 2010; Swisher et al., 2010; Alink et al., 2013).

These methods are potentially powerful new tools for studying peripheral vision. However, because of their novelty, little is known about the experimental conditions under which they can be expected to provide an informative outcome, or about how an outcome should be interpreted. The model presented in the previous chapter can be used to inform on both of these issues. In this chapter, we give two examples of applications of the model, relevant to the study of peripheral vision: one for maximizing the likelihood of an informative experimental result, and one for aiding in interpretation of an analysis method. For the first application, we use the model to optimize the stimulus for an MVPA experiment. MVPA is generally limited by the spatial spread of the hemodynamic response, the voxel size relative to the spatial scale of the stimulus-relevant response, and the noise inherent in BOLD fMRI. Application of MVPA to the study of early peripheral visual processing is especially challenging due to the reduced cortical magnification factor, which limits the number of voxels that carry stimulus information for a small, peripherally presented stimulus. We demonstrate use of the model to determine the minimum size for a stimulus to be identifiable using MVPA when presented at 5 degrees eccentricity. This application is important to the design of experiments to study phenomena such as “visual crowding” that occur for stimuli in peripheral vision with small center-to-center separation that can only be achieved with small stimuli.
For the second application, we determine the contribution of non-neuronal factors captured in the model to measurements of voxel population receptive fields (pRFs; Smith et al., 2001; Dumoulin and Wandell, 2008). Voxel pRFs are thought to reflect the location and size of the combined receptive fields of neurons within a voxel (Wandell and Winawer, 2015). However, the measurement will also be influenced by the contribution of the hemodynamic response evoked by neural activity in neighboring voxels. To assess the extent of this influence, we performed pRF measurements on our non-neuronal model.

METHODS

Annotation conventions

Subjects

All subjects had normal or corrected-to-normal vision, and were between the ages of 23 and 37 (2 males and 2 females). All subjects had definable retinotopic visual areas, as defined by the procedure described below. The experimental protocol was approved by the Institutional Review Board of the university, and all subjects provided written informed consent.

Stimuli

Stimuli were generated on a Macintosh computer using MATLAB and the PsychToolbox (Brainard 1997; Pelli 1997). Stimuli were projected on a 32 × 24-cm screen mounted perpendicularly to the toe–head axis in the bore of the magnet, directly above the subject’s head. The viewing distance was 88 cm. The display was set at a background luminance of 156 cd/m², and the maximum luminance at 100% contrast was 312 cd/m².

pRF mapping

pRF mapping used two types of full-contrast, color- and contrast-reversing stimuli. The first consisted of 20 rings of increasing radius, expanding out from fixation. The diameter of the rings was log-scaled to traverse approximately the same area of cortex between ring presentations regardless of eccentricity. The width of the rings was log-scaled with their diameter to activate approximately equal cortical area regardless of eccentricity. Each ring was displayed for 1 TR, with 4 TRs of rest following each cycle of expansion.
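The ring scaling described above can be illustrated with the cortical magnification function of Duncan and Boynton (2003), M(ω) = 1/(c1 ω + c2), in mm of cortex per degree. Integrating M from 0 to ω gives the cortical distance from the foveal representation, D(ω) = ln(1 + c1 ω / c2) / c1, so equal steps along the cortex correspond to ring edges at ω = (c2 / c1)(exp(c1 D) − 1), which grow approximately logarithmically with eccentricity. The Python sketch below is only one possible way to obtain such spacing, not the stimulus code used here. It assumes the across-subject average parameters (c1 = 0.065, c2 = 0.054), and the outermost eccentricity is a hypothetical value taken from the half-height of the 32 × 24 cm screen at 88 cm; the function names are illustrative.

```python
import math

C1 = 0.065  # deg/mm per degree of eccentricity (Duncan and Boynton, 2003, average)
C2 = 0.054  # deg/mm at the fovea (Duncan and Boynton, 2003, average)


def magnification(ecc_deg, c1=C1, c2=C2):
    """Cortical magnification M(w) = 1 / (c1 * w + c2), in mm of cortex per degree."""
    return 1.0 / (c1 * ecc_deg + c2)


def cortical_distance(ecc_deg, c1=C1, c2=C2):
    """Distance (mm) from the foveal representation: the integral of M from 0 to ecc_deg."""
    return math.log(1.0 + c1 * ecc_deg / c2) / c1


def eccentricity_at(distance_mm, c1=C1, c2=C2):
    """Inverse of cortical_distance: eccentricity (deg) at a given cortical distance (mm)."""
    return (c2 / c1) * (math.exp(c1 * distance_mm) - 1.0)


def ring_edges(n_rings, max_ecc_deg, c1=C1, c2=C2):
    """Ring boundaries (deg) such that every ring spans the same cortical distance."""
    d_max = cortical_distance(max_ecc_deg, c1, c2)
    return [eccentricity_at(d_max * k / n_rings, c1, c2) for k in range(n_rings + 1)]


if __name__ == "__main__":
    # Hypothetical outer limit: half the screen height (12 cm) viewed from 88 cm.
    max_ecc = math.degrees(math.atan(12.0 / 88.0))
    edges = ring_edges(20, max_ecc)
    for k in range(20):
        inner, outer = edges[k], edges[k + 1]
        print("ring %2d: %.3f to %.3f deg (width %.3f deg)" % (k + 1, inner, outer, outer - inner))
```

With this construction the ring width in degrees increases with eccentricity, while each ring covers the same extent of the model cortex along the eccentricity axis.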
The second stimulus consisted of a wedge that rotated about fixation. A full rotation consisted of 32 wedges, each subtending 45 degrees polar angle, with no rest between rotation cycles. Each wedge was displayed for one TR. The same stimuli were used to define the boundaries of retinotopic visual areas.

Letter MVPA experiment

For the letter experiment, a letter (K, N, R, or S) in Sloan font subtending 4 degrees was displayed at 5 degrees eccentricity in the lower right visual quadrant. Each trial lasted 2 seconds, during which a single white letter appeared at full contrast on a gray background for 200 ms every second. Trials were blocked in groups of 5 trials of the same letter, separated by blocks of 5 trials with no peripheral stimulation. The sequence of letter blocks was counterbalanced within each scan, such that each block occurred the same number of times and was preceded by each other block the same number of times. A fixation task was used to discourage eye movements and keep subjects’ attention constant and away from the letter stimulus. On each trial, a small (0.3 degree) square appeared at fixation, and proceeded through a sequence of 4 colors selected from red, yellow, green, and blue. In half of the trials (randomly selected), all four colors in the sequence were different; in the other half of the trials, the 4th color in the sequence was the same as either the 1st or 2nd. Subjects were asked to indicate with a button press when the fourth color was a repetition of one of the first two. One subject performed the same experiment with the letter size decreased to 3 degrees, 2 degrees, and 1 degree.

fMRI acquisition

All scans were performed using a 3 Tesla whole-body magnet (Siemens Prisma) at the USC Dana and David Dornsife Cognitive Neuroscience Imaging Center at the University of Southern California with a 32-channel matrix coil running a simulated CP mode.
Functional scans were acquired using a multiband gradient-echo echo-planar imaging (EPI) sequence (TR = 1000 ms, TE = 35 ms, flip angle = 65°) with an isotropic voxel resolution of 2.5 mm, with several exceptions. Retinotopic scans for S2 and S3 were acquired with a TR of 1.2 seconds, and letter data with an isotropic voxel resolution of 3 mm. Anatomical scans were acquired using a Magnetization Prepared Rapid Acquisition Gradient Echo (MPRAGE) sequence (TI = 1100 ms, TR = 2070 ms, TE = 4.14 ms, flip angle = 12°) with a resolution of 1 × 1 × 1.2 mm.

fMRI preprocessing

Preprocessing of imaging data was performed with BrainVoyager, including motion correction, linear-trend removal, and high-pass temporal filtering. Intrasession functional scans were aligned to one another and co-registered with the intrasession anatomical scan. Intersession co-registration was achieved by co-registering the anatomical scans from each session. Aligned data were projected to the flattened mesh representation of the subject's cortical surface. Data were converted to percent signal change (mean-subtracted and normalized) to eliminate differences in signal baseline and to match the units of the balloon model for consistency between real and simulated data. Data for vertices of the mesh that fell within V1 boundaries (as identified from the response to the wedge stimulus) were selected for analysis.

fMRI data simulation

We used the model described in the previous chapter to simulate both pRF and MVPA experiments. fMRI data for each subject were simulated using an individualized version of the model, which captured the subject's unique cortical magnification function (Figure 4.1) and noise. To determine the cortical magnification function, we estimated the two parameters $c_1$ and $c_2$ of the cortical magnification as a function of eccentricity $\omega$ (Duncan and Boynton, 2003),

$$M(\omega) = \frac{1}{c_1\omega + c_2},$$

by minimizing the least-squares difference between the cortical distance from the fovea to the cortical location representing a given eccentricity in the subject's retinotopic mapping and the corresponding distance in the resulting cortical model. This is particularly important for the pRF analysis, as pRF size has been shown to correlate with cortical magnification factor (Harvey and Dumoulin, 2011).

The pRF stimuli were converted to black and white images before being input to the model, with color treated as full contrast. Otherwise, stimuli and protocols were identical for data collection and simulation. To reduce computing time, simulation was performed only for the left hemisphere (stimuli were presented in the right visual field, leading to activation of the retinotopically organized areas in the left hemisphere).

Figure 4.1: Individual subject cortical models. Isoeccentricity lines in the visual field (far left) are mapped onto the cortical models fit to individual subject data. The model fit from Duncan and Boynton (2003), an average across 10 subjects, is shown for comparison. [Panels: S1, S2, S3, S4, Duncan and Boynton (2003); isoeccentricity lines at 2°, 5°, and 10°.]

Data Analysis

The same analysis procedures were applied to both data and simulation (Figure 4.2).

Figure 4.2: Data analysis scheme for each subject. Data from the ring experiment are used to calibrate the V1 fMRI model. The calibrated model is then used to simulate data for each of the experiments performed with subjects. Collected and simulated ring and wedge data are used separately to estimate population receptive fields for each V1 vertex in the subject and model cortices. Letter data are used to train and test a linear SVM classifier for MVPA in a leave-one-run-out cross-validation scheme.

Multivoxel pattern analysis

MVPA was performed in a leave-one-run-out cross-validated classification framework with a linear SVM.
To determine whether stimulus information was contained in the pattern of voxel activity in V1, a classifier was trained to identify the letter shown during each block (K, N, R, or S), and then tested on an independent dataset. Data were separated into training (4 runs) and testing (1 run) sets. Using the training set, vertex selection was performed by identifying vertices within V1 (data or model) that showed significant activation with presentation of any of the letter stimuli. The timecourses of each selected vertex from both the training and testing sets were then fit with a GLM with a separate predictor for each presentation block, yielding one value per vertex per block. These values were then z-scored for each vertex. The resulting values for the training data, paired with the associated letter labels, were used to train the linear SVM, using a one-vs.-one multiclass scheme in the libsvm MATLAB package (Chang and Lin, 2011), to distinguish between patterns of activation for the 4 letters. The trained classifier was then used to predict which letter appeared in each block of the testing data. This process was repeated using each of the five runs as the testing run. Performance was assessed as the proportion of blocks correctly classified by stimulus.

Population receptive fields

We analyzed both real and simulated data from the wedge and ring scans using a standard pRF estimation technique (Dumoulin and Wandell, 2008), yielding the 3-parameter non-negative isotropic Gaussian pRF

$$pRF(x, y) = \exp\left(-\frac{(x - x_0)^2 + (y - y_0)^2}{2\sigma^2}\right)$$

that best predicted the time course of each V1 vertex. Here $\sigma$ is the size of the pRF, which is centered at visual coordinate $(x_0, y_0)$.
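A minimal sketch of the Gaussian pRF defined above, and of the dot product between a pRF and a stimulus aperture that forms the first step of a predicted timecourse, is given here for illustration only. The grid, the pRF parameters, and the two apertures are hypothetical, and the HRF convolution step is not included.

```python
import math

def gaussian_prf(xs, ys, x0, y0, sigma):
    """Evaluate exp(-((x-x0)^2 + (y-y0)^2) / (2 sigma^2)) on a grid (rows = y)."""
    return [[math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))
             for x in xs] for y in ys]

def predicted_response(prf, aperture):
    """Dot product between the pRF and a stimulus aperture on the same grid."""
    return sum(p * s
               for prf_row, ap_row in zip(prf, aperture)
               for p, s in zip(prf_row, ap_row))

# Hypothetical grid: -10..10 deg in 1-deg steps (not the resolution used in the study).
coords = [float(v) for v in range(-10, 11)]

# Hypothetical pRF centred at 5 deg eccentricity with sigma = 1.5 deg.
prf = gaussian_prf(coords, coords, x0=3.0, y0=-4.0, sigma=1.5)

# Two binary apertures: a ring covering 4-6 deg, and a central disc of radius 2 deg.
ring = [[1.0 if 4.0 <= math.hypot(x, y) <= 6.0 else 0.0 for x in coords] for y in coords]
disc = [[1.0 if math.hypot(x, y) <= 2.0 else 0.0 for x in coords] for y in coords]

resp_ring = predicted_response(prf, ring)
resp_disc = predicted_response(prf, disc)
print(f"ring: {resp_ring:.3f}, disc: {resp_disc:.3f}")
```

The ring that overlaps the pRF center yields the larger response, which is the property the fitting procedure exploits when it searches over $(x_0, y_0, \sigma)$.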
Parameters were first estimated using a fine grid search and then optimized, with this estimate as the starting value, by minimizing the least-squares difference between the observed timecourse and the timecourse predicted from the current pRF estimate. The predicted timecourse was generated by first taking the dot product between the current pRF estimate and the stimulus at each TR and then convolving the result with a canonical hemodynamic response function (HRF). We used the difference-of-gamma-functions HRF implemented by Dumoulin and Wandell (2008):

$$H(t) = \left(\frac{t}{d_1}\right)^{a_1} e^{-(t - d_1)/b_1} - c\left(\frac{t}{d_2}\right)^{a_2} e^{-(t - d_2)/b_2}$$

Because the temporal width of the hemodynamic response has been shown to affect the estimates of pRF size (Dumoulin and Wandell, 2008), we used two sets of parameters for the canonical HRF. The first set, $a_1 = 6$, $a_2 = 12$, $b_1 = 0.9$, $b_2 = 0.9$, $c = 0.35$, and $d_i = a_i b_i$, was used by Harvey and Dumoulin (2011). The second set, $a_1 = 5.74$, $a_2 = 13.18$, $b_1 = 1.19$, $b_2 = 1.00$, $c = 0.22$, and $d_i = a_i b_i$, was derived to achieve a FWHM of 6 s, comparable to the width found to minimize the measured pRF size (Dumoulin and Wandell, 2008) and similar to the canonical HRF used by Kay et al. (2013).

To determine the relationship between the eccentricity of the center of the pRF and its size, we performed a linear fit. 95% confidence intervals for the fit were calculated by bootstrapping: pRFs were randomly sampled with replacement and refit 500 times. This yielded intervals for slope and intercept.

RESULTS

Multi-voxel pattern analysis

For the first application of our model, we determine the smallest peripheral (5 degrees eccentricity) letter stimulus for which stimulus information can be detected in the BOLD data. We simulated an experiment where letters K, N, R, and S were displayed in a block design.
MVPA was used to determine whether information about the stimulus shown (assessed by training and testing a linear SVM classifier within a cross-validation scheme) was contained in data from a subject's V1 or the corresponding simulated V1 data. We first tested MVPA performance for a 4 degree letter, which we had previously been able to classify successfully using linear SVM on subject data. We compared the proportion of blocks correctly identified from the simulated data with those from the subject data (Figure 4.3A). For all subjects, the classification performance was significantly higher for the simulation than for the data.

We then examined the effect of decreasing the size of the stimulus on the classification performance for the data and simulation. Classification performance for both the simulation and data increased with increasing letter size. Interestingly, for the smallest letter, the performance was similar across cross-validation folds for the simulation and data, but performance for the simulation diverged from that for the data at large letter sizes (Figure 4.3B).

Figure 4.3: MVPA results for data (blue) and simulation (red). A. Proportion of blocks of 4 degree letters correctly identified by linear SVM for each of the four subjects. B. Proportion of letters correctly identified for subject 4 using 1, 2, 3, and 4 degree letters. Error bars in both plots show the standard error across cross-validation folds. [Axes: proportion correct (0 to 1) versus subject (A; S1 to S4) and letter size (B; 1° to 4°).]

Population receptive fields

We performed population receptive field mapping, using the method developed by Dumoulin and Wandell (2008), on both data and simulation for each subject. Because the width of the HRF assumed in the pRF model can affect the estimated pRF sizes (Dumoulin and Wandell, 2008), we performed the fitting procedure with two different HRFs that differed in width.
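The difference in width between the two assumed HRFs can be checked numerically. The sketch below is an illustration rather than the analysis code used here: it evaluates the difference-of-gammas form given in the Methods with both parameter sets and estimates the full width at half maximum of the positive lobe by dense sampling. The function names, sampling step, and time window are assumptions.

```python
import math

def hrf(t, a1, a2, b1, b2, c):
    """Difference-of-gammas HRF, with d_i = a_i * b_i as in the Methods."""
    if t <= 0.0:
        return 0.0
    d1, d2 = a1 * b1, a2 * b2
    return ((t / d1) ** a1 * math.exp(-(t - d1) / b1)
            - c * (t / d2) ** a2 * math.exp(-(t - d2) / b2))

def fwhm(params, dt=0.01, t_max=30.0):
    """FWHM (s) of the positive lobe, estimated by dense sampling."""
    n = int(round(t_max / dt))
    ts = [i * dt for i in range(n + 1)]
    hs = [hrf(t, **params) for t in ts]
    half = max(hs) / 2.0
    # Only the main positive lobe exceeds half of the peak value.
    above = [t for t, h in zip(ts, hs) if h >= half]
    return above[-1] - above[0]

# Parameter sets as listed in the Methods.
NARROW = dict(a1=6.0, a2=12.0, b1=0.9, b2=0.9, c=0.35)     # Harvey and Dumoulin (2011)
BROAD = dict(a1=5.74, a2=13.18, b1=1.19, b2=1.00, c=0.22)  # derived for a FWHM of 6 s

fwhm_narrow = fwhm(NARROW)
fwhm_broad = fwhm(BROAD)
print(f"narrow: {fwhm_narrow:.2f} s, broad: {fwhm_broad:.2f} s")
```

The second parameter set should come out close to the 6 s width stated in the Methods, and the first noticeably narrower.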
We compared the relationship between eccentricity of the pRF center and the pRF size (Figure 4.4), which previous studies have described well with a linear fit (Dumoulin and Wandell, 2008; Binda et al., 2013; Kay et al., 2015). A linear model provided a good fit to the data of each of our subjects when the narrow HRF was used for fitting (R² = 0.62, 0.71, 0.59, and 0.60, respectively). The fit for this relationship for the simulation was reasonable, but more variable (R² = 0.68, 0.10, 0.70, and 0.20, respectively). The eccentricity-size relationship for pRFs obtained with a broader HRF, which is more similar to that used in other studies (Dumoulin and Wandell, 2008; Kay et al., 2013), was not described as well by a linear fit (R² = 0.09, 0.29, 0.24, and 0.15 for the data; R² = 0.11, 0.17, 0.11, and 0.06 for the simulation), but pRF size estimates were closer to those reported in the literature.

Figure 4.4: Eccentricity dependence of pRF size for data and simulation, for two different assumed HRFs. Linear fits to the relationship between the eccentricity of a pRF center and its size are displayed for data (dark and light blue) and model (red and orange) for each subject. Dark blue and red lines indicate that the assumed HRF from Harvey and Dumoulin (2011) was used in the pRF model; light blue and orange indicate that a double-gamma HRF with FWHM = 6 s was used. Error bands represent the bootstrapped 95% confidence interval of the fit. The fits for the combined data of all subjects are shown with data from Binda et al., 2013 (black; upper line: drifting bar stimulus; lower line: multifocal stimulus) and Dumoulin and Wandell, 2008 (gray line).

DISCUSSION

We demonstrated two applications of the model described in the previous chapter to experiments relevant to the study of peripheral vision.
Multi-voxel pattern analysis

High MVPA performance using simulation

We see a marked difference in MVPA performance between the real and simulated data at large letter sizes, which decreases at small letter sizes. As both simulation and data have near-chance MVPA performance for small letters, it may be that for small letter sizes the lack of cortical magnification in the periphery is the limiting factor for MVPA. At large letter sizes, the model predicts that letters should be easily decodable from the data, but this is not the case. It is likely that when the stimulus is large, the cortical magnification factor is no longer limiting, but another factor, unaccounted for in the model, restricts performance.

One such possible factor is unaccounted-for properties of the noise. We model the noise as non-stationary, additive, and Gaussian; however, this is an approximation that does not hold for real BOLD noise (Zarahn et al., 1997; Woolrich et al., 2004). The features from both the data and the model that are submitted to the classifier are obtained using a GLM, which makes the same assumptions. Preprocessing could therefore successfully remove noise from the simulation but not from the data; if noise is the limiting factor for MVPA in a particular regime, the model may be unable to capture this.

Another possibility is that the stimulus information that reaches V1 from earlier stages of processing cannot be represented as pixel contrast. For example, as early as the retina, there are ganglion cells that respond preferentially to local contrast differences (reviewed in Masland, 2001).

Implications for using MVPA to study peripheral vision

If it is indeed the case that the lack of cortical magnification is the limiting factor for stimulus size in MVPA, this places a constraint on how stimuli can be designed for an MVPA experiment on peripheral vision.
In the case of crowding, a natural MVPA experiment would be to determine the information content related to a flanked versus unflanked stimulus available in BOLD measured in visual areas. However, such an experiment may not be feasible given the limitations of MVPA for even an isolated small peripheral stimulus. Perhaps a stimulus where crowding occurs between many objects, or within a large object only under certain conditions, could be used to overcome this limitation.

Population receptive fields

Comparison to published studies

The pRF sizes measured for our data and simulations are larger than some measurements reported in the literature (Dumoulin and Wandell, 2008; Kay et al., 2015). The protocol used for pRF measurement, as well as the hemodynamic response function assumed in the pRF fitting, can affect the estimated pRF size (Dumoulin and Wandell, 2008; Binda et al., 2013; Alvarez et al., 2015). We used an assumed HRF, which may have been optimal for neither the simulation nor the data. The ring and wedge stimuli we used also each traveled in only one direction (expanding out and rotating counterclockwise), such that movement in space and progression in time were linked. This complicates the attribution of temporal response spread to hemodynamic temporal lag versus spatial spread of activation. Using stimuli identical to those used in other studies might help to eliminate the discrepancy in pRF sizes.

Data vs. simulation pRFs

While the intercepts of the linear fit to the eccentricity-size relationship do not differ systematically between the data and the simulation, the simulation predicts systematically smaller pRFs centered at high eccentricities. It is possible that the elements included in the model (cortical magnification, correlated signal (point spread function), and correlated noise) are sufficient to describe the pRFs measured for vertices with pRFs centered at low eccentricities, but not for those centered at peripheral locations.
Besides neural factors that influence the pRF estimate and are not included in the model (see below for further discussion), there are several possible sources of difference between data and model pRFs. The shape of the assumed HRF had a significant effect on both the estimated sizes of the pRFs and the linearity of the relationship. The HRFs used in this analysis were optimized for neither the data nor the simulation, and likely described the two with differing accuracy. Measuring each subject's HRF (and, similarly, the impulse response function of the simulation) for use in the pRF fit might reduce the discrepancy. The discrepancies between estimates for the data and the model may also be due to imperfect estimation of the cortical magnification function, which relies on somewhat noisy retinotopic data. This may be responsible for the overestimation and noisy fit for subject S4.

There are properties of the pRF that are unaccounted for in both the pRF model used for fitting and our simulation. The pRF model we used was linear in space, which is consistent with the assumptions made in our simulation. However, there is evidence that the response of a V1 voxel to an increasingly large stimulus is not linear, which could lead to overestimation of receptive field sizes in the data but not the simulation (Kay et al., 2013; Kay et al., 2015). This could also contribute to the discrepancy we see between subject and model pRF sizes.

Neural and hemodynamic contributions to the pRF measurement

The pRF measurement reflects the spatial properties of both neural and hemodynamic responses, which cannot be disambiguated with fMRI. The cumulative receptive field of all neurons within a voxel depends on both the sizes of the receptive fields and the spread of their centers in visual space. Thus, the size of a Gaussian approximation to the population receptive field measurement can be described by

$$\sigma^2 = \sigma_{\text{neuro}}^2 + \sigma_{\text{center}}^2 + \sigma_{\text{hemo}}^2$$

where $\sigma_{\text{neuro}}^2$ describes the contribution from the true neural receptive field sizes, $\sigma_{\text{center}}^2$ describes the spread of neural receptive field centers, and $\sigma_{\text{hemo}}^2$ describes the blurring from hemodynamic factors. Each of these terms will exhibit an eccentricity dependence. The receptive field size and hemodynamic spread are expected to scale linearly with eccentricity in the visual field due to the cortical magnification function. Our V1 model assumes that the spread of receptive field centers will also scale with eccentricity, as they are determined by the mapping from the visual space to the cortical model based on the cortical magnification function. However, electrophysiological measurements indicate that the neural point-spread is not constant across the cortex (Gattass et al., 1987), but decreases sharply between the fovea and the cortical representation of 10 degrees eccentricity. This relationship may also hold for humans. If this is the case, the spread between receptive field centers must be greater in the periphery. Thus, it is likely that in the fovea, where neural receptive fields are small and highly overlapping, the population receptive field size is dominated by the hemodynamic spread:

$$\sigma_{\text{hemo}}^2 > \sigma_{\text{neuro}}^2 > \sigma_{\text{center}}^2$$

Included in $\sigma_{\text{hemo}}^2$ is the hemodynamic point-spread, which could lead to a BOLD response in a particular voxel when a part of the visual space is stimulated to which neurons within the voxel do not respond. Another potential hemodynamic factor is an imperfect estimate of the temporal hemodynamic response. Here, we assume an HRF, and some of the activation due to the hemodynamic lag is certainly left unaccounted for by the pRF model. However, even if an HRF is measured independently, it is impossible to know the true HRF evoked during pRF mapping. The HRF may change somewhat depending on protocol (hence the use of the non-linear balloon model in our simulation), and a constrained HRF model is often assumed to overcome the influence of noise on the estimation.
Thus, some impact of an imperfect HRF on the measurement is inevitable.

Hemodynamics may be less important to pRF measurements in the periphery, where receptive fields are large and spread out, and neural factors likely dominate the measurement. However, it may not be the receptive field size that is most dominant, but the large spread in receptive field locations within a single voxel:

$$\sigma_{\text{center}}^2 > \sigma_{\text{neuro}}^2 > \sigma_{\text{hemo}}^2$$

While we cannot measure the size or spread of neural receptive fields in humans to tease apart these two factors, we can use fMRI to inform on the relative contributions from hemodynamic versus neural factors. Our simulations are generated with a model that does not include neurons and is intended to capture the hemodynamic contribution alone; however, included effects such as the point-spread function (Engel et al., 1997) and the correlation structure of noise could arise in part from neural phenomena. It is unclear to what extent the point-spread function is determined by the spread of neural activity versus hemodynamic spread (Parkes et al., 2003). However, the point-spread varies significantly with both MR field strength (from 2.3 mm at 7 T, Shmuel et al., 2007, to 3.5 mm at 1.5 T, Engel et al., 1997) and pulse sequence (3.4 mm for spin echo (SE) vs. 3.9 mm for gradient echo (GE) at 3 T). This is likely due to differential signal contributions from large veins versus microvasculature (Parkes et al., 2005). While this could reflect proximity to different neural populations, it more likely reflects differences in the blood-driven signal in different vascular compartments. Therefore, it may be that the true neural pRFs at the foveal representation are sufficiently small to be concealed by hemodynamic factors using gradient echo sequences at 3 T, but sufficiently large to be detectable using spin echo at 7 T. Measurements of pRFs at different field strengths could help to elucidate the factors that determine the pRF measurement.
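To give a sense of scale for the hemodynamic term, a cortical point-spread in millimeters can be converted to visual degrees by dividing by the cortical magnification. The sketch below is a back-of-envelope illustration, not an analysis performed in this work. It assumes a local linear approximation, uses the general-model parameters of Duncan and Boynton (2003) listed in Table B1, and takes the 3.9 mm gradient-echo point-spread quoted above as a FWHM. The function names are hypothetical, and no values for the neural terms are assumed.

```python
import math

# General-model parameters from Table B1 (Duncan and Boynton, 2003).
C1, C2 = 0.065, 0.054

def deg_per_mm(ecc_deg):
    """Inverse cortical magnification, 1/M(omega) = c1*omega + c2, in deg/mm."""
    return C1 * ecc_deg + C2

def hemo_sigma_deg(fwhm_mm, ecc_deg):
    """Convert a cortical point-spread FWHM (mm) to a Gaussian sigma in degrees.

    Assumes the magnification is locally constant over the extent of the blur.
    """
    sigma_mm = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return sigma_mm * deg_per_mm(ecc_deg)

def total_sigma(sigma_neuro, sigma_center, sigma_hemo):
    """Combine the three contributions in quadrature, as in the decomposition above."""
    return math.sqrt(sigma_neuro ** 2 + sigma_center ** 2 + sigma_hemo ** 2)

for ecc in (0.5, 5.0, 10.0):
    print(f"{ecc:4.1f} deg eccentricity: sigma_hemo ~ {hemo_sigma_deg(3.9, ecc):.2f} deg")
```

Because the terms add in quadrature, whichever term is largest at a given eccentricity dominates the measured pRF size, which is the basis for the orderings proposed above.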
Appendix A: Hemodynamic model

MODEL EQUATIONS

We use the Balloon Model (Buxton et al., 1998) as implemented by Friston et al. (2003) to describe the temporal response of each voxel to a stimulus. This response is described by several differential equations. First, a stimulus $u$ evokes a neural response $z$:

$$\frac{dz(t)}{dt} = -z(t) + u(t)$$

The resulting neural activity leads to a signal to the vasculature, $s$, which results in a change in the blood flow into the vasculature ($f_{in}$):

$$\frac{ds(t)}{dt} = z(t) - \kappa\,s(t) - \gamma\,(f_{in}(t) - 1)$$

$$\frac{df_{in}(t)}{dt} = s(t)$$

According to the Balloon Model, the flow into the vasculature leads to an increase in pressure in the vein, an expansion (an increase in volume, $v$), and an increase in the flow out of the vein ($f_{out}$). When a steady state is reached as the flow in and flow out become equivalent, the balloon begins to deflate. At the same time, the neural activity leads to an increase in the rate of oxygen extraction from the blood ($E$). The changes in blood flow, blood volume, and oxygen extraction rate cause the amount of deoxyhemoglobin in the vein ($q$) to change.

$$E(t) = 1 - (1 - E_0)^{1/f_{in}(t)}$$

$$f_{out}(t) = v(t)^{1/\alpha}$$

$$\frac{dv(t)}{dt} = \frac{f_{in}(t) - f_{out}(t)}{\tau}$$

$$\frac{dq(t)}{dt} = \frac{1}{\tau}\left(f_{in}(t)\,\frac{E(t)}{E_0} - f_{out}(t)\,\frac{q(t)}{v(t)}\right)$$

This drives a change in BOLD signal ($BOLD$), which is dependent on extravascular (first term) and intravascular (second term) signal changes that result from changes in the amount of deoxyhemoglobin and blood volume, and their relative weighting (third term).

$$BOLD(t) = v_0\left[k_1\,(1 - q(t)) + k_2\left(1 - \frac{q(t)}{v(t)}\right) + k_3\,(1 - v(t))\right]$$

MODEL PARAMETERS

Parameters for the preceding equations were taken from Friston et al. (2003), Buxton et al. (1998), or Buxton et al. (2004). The parameters were chosen to achieve reasonable similarity in width and time-to-peak between the impulse response produced by the model and the canonical HRFs for V1 of Boynton et al. (1996), Dumoulin and Wandell (2008), and Kay et al. (2013). Weighting parameters $k_1$, $k_2$, and $k_3$ for the extravascular signal component, intravascular signal component, and the effect of changing their balance, respectively, were calculated based on Zhao et al. (2007).

Parameter    Value                         Source
$\kappa$     0.65                          Friston et al. (2003)
$\gamma$     0.41                          Friston et al. (2003)
$E_0$        0.4                           Buxton et al. (2004)
$\alpha$     0.4                           Buxton et al. (2004)
$\tau$       3                             Buxton et al. (2004)
$v_0$        0.01                          Buxton et al. (1998)
$k_1$        $346.58\,E_0\,TE$             Zhao et al. (2007)
$k_2$        $100\,e^{-2.¡¢£}\,E_0\,TE$    Zhao et al. (2007)
$k_3$        $1 - e^{-2.¡¢£}\,E_0$         Zhao et al. (2007)

Table A1: Balloon model parameters.

Appendix B: V1 cortical model

MODEL EQUATIONS

The V1 cortical model developed by Rovamo and Virsu (1982) maps a point in the visual (or retinal) space defined by its polar angle and eccentricity $(\theta, \omega)$ to a coordinate on a 3-dimensional, rotationally symmetric surface $(\varphi, r, z)$. This surface represents a plausible shape for V1 if it were unfolded, assuming locally isotropic cortical magnification; specifically, as defined on the cortical model, the local magnification along a meridian ($M_\omega$, corresponding to constant polar angle) is equivalent to the local magnification along an isoeccentricity contour ($M_\theta$, points of constant eccentricity):

$$M_\omega = M_\theta \tag{1}$$

The model also assumes that the rotation angle about the symmetric surface is equivalent to the polar angle in the visual field:

$$\varphi = \theta \tag{2}$$

The cortical magnifications along the eccentricity and polar angle directions, respectively, are given by:

$$M_\omega = \frac{\left(dz^2 + dr^2\right)^{0.5}}{d\omega} \tag{3}$$

$$M_\theta = \frac{r\,d\varphi}{\sin\omega\,d\theta} \tag{4}$$

Rovamo and Virsu solve these equations to give $r$ and $z$ in terms of $\theta$ and $\omega$, defining the projection from the visual field to the cortical model when combined with (2):

$$r = M(\omega)\,\sin\omega \tag{5}$$

$$z = \int_0^{\omega}\left[M(\omega)^2 - \left(\frac{dr}{d\omega}\right)^2\right]^{0.5} d\omega \tag{6}$$

We used the cortical magnification function described by Duncan and Boynton (2003):

$$M(\omega) = \frac{1}{c_1\omega + c_2} \tag{7}$$

CORTICAL MAGNIFICATION PARAMETERS

Parameters for the cortical magnification function for general and individual subject models are listed below, with the parameters for the general model obtained from Duncan and Boynton (2003).

Subject    $c_1$    $c_2$
general    0.065    0.054
S1         0.072    0.078
S2         0.045    0.14
S3         0.066    0.036
S4         0.023    0.18

Table B1: Cortical model parameters.

References

Alink, A., Krugliak, A., Walther, A., and Kriegeskorte, N. (2013). fMRI orientation decoding in V1 does not require global maps or globally coherent orientation stimuli. Front Psychol 4.
Anderson, E.J., Dakin, S.C., Schwarzkopf, D.S., Rees, G., and Greenwood, J.A. The Neural Correlates of Crowding-Induced Changes in Appearance. Current Biology.
Anstis, S. (1974). A chart demonstrating variations in acuity with retinal position. Vision Research 14, 589–592.
Balas, B., Nakano, L., and Rosenholtz, R. (2009). A summary-statistic representation in peripheral vision explains visual crowding. Journal of Vision 9, 13, 1–18.
Behzadi, Y., Restom, K., Liau, J., and Liu, T.T. (2007). A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage 37, 90–101.
van den Berg, R., Roerdink, J.B.T.M., and Cornelissen, F.W. (2010). A Neurophysiologically Plausible Population Code Model for Feature Integration Explains Visual Crowding. PLoS Comput Biol 6, e1000646.
Bi, T., Cai, P., Zhou, T., and Fang, F. (2009). The effect of crowding on orientation-selective adaptation in human early visual cortex. Journal of Vision 9, 13, 1–10.
Binda, P., Thomas, J.M., Boynton, G.M., and Fine, I. (2013). Minimizing biases in estimating the reorganization of human visual areas with BOLD retinotopic mapping. Journal of Vision 13, 13–13.
Biswal, B., Yetkin, F.Z., Haughton, V.M., and Hyde, J.S. (1995). Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magn Reson Med 34, 537–541.
Bouma, H. (1970). Interaction Effects in Parafoveal Letter Recognition.
Nature 226, 177–178.
Boynton, G.M., and Finney, E.M. (2003). Orientation-Specific Adaptation in Human Visual Cortex. The Journal of Neuroscience 23, 8781–8787.
Boynton, G.M., Engel, S.A., Glover, G.H., and Heeger, D.J. (1996). Linear Systems Analysis of Functional Magnetic Resonance Imaging in Human V1. The Journal of Neuroscience 16, 4207–4221.
Boynton, G.M., Demb, J.B., Glover, G.H., and Heeger, D.J. (1999). Neuronal basis of contrast discrimination. Vision Research 39, 257–269.
Brainard, D.H. (1997). The Psychophysics Toolbox. Spatial Vision 10, 433–436.
Buckner, R.L. (1998). Event-related fMRI and the hemodynamic response. Hum Brain Mapp 6, 373–377.
Bullmore, E., Brammer, M., Williams, S.C., Rabe-Hesketh, S., Janot, N., David, A., Mellers, J., Howard, R., and Sham, P. (1996). Statistical methods of estimation and inference for functional MR image analysis. Magn Reson Med 35, 261–277.
Bullmore, E., Long, C., Suckling, J., Fadili, J., Calvert, G., Zelaya, F., Carpenter, T.A., and Brammer, M. (2001). Colored noise and computational inference in neurophysiological (fMRI) time series analysis: resampling methods in time and wavelet domains. Hum Brain Mapp 12, 61–78.
Buracas, G.T., and Boynton, G.M. (2007). The Effect of Spatial Attention on Contrast Response Functions in Human Visual Cortex. J. Neurosci. 27, 93–97.
Buxton, R.B., Wong, E.C., and Frank, L.R. (1998). Dynamics of blood flow and oxygenation changes during brain activation: The balloon model. Magn. Reson. Med. 39, 855–864.
Buxton, R.B., Uludağ, K., Dubowitz, D.J., and Liu, T.T. (2004). Modeling the hemodynamic response to brain activation. NeuroImage 23, Supplement 1, S220–S233.
Carandini, M. (2004). Amplification of Trial-to-Trial Response Variability by Neurons in Visual Cortex. PLoS Biol 2, e264.
Chang, C.-C., and Lin, C.-J. (2011). LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2, 27:1–27:27.
Chen, C.-C., Tyler, C.W., and Baseler, H.A.
(2003). Statistical properties of BOLD magnetic resonance activity in the human brain. NeuroImage 20, 1096–1109.
Dakin, S.C., Greenwood, J.A., Carlson, T.A., and Bex, P.J. (2011). Crowding is tuned for perceived (not physical) location. Journal of Vision 11.
DeAngelis, G.C., Freeman, R.D., and Ohzawa, I. (1994). Length and width tuning of neurons in the cat's primary visual cortex. Journal of Neurophysiology 71, 347–374.
Diedrichsen, J., and Shadmehr, R. (2005). Detecting and adjusting for artifacts in fMRI time series data. NeuroImage 27, 624–634.
Dreher, B. (1972). Hypercomplex cells in the cat's striate cortex. Invest Ophthalmol 11, 355–356.
Duncan, R.O., and Boynton, G.M. (2003). Cortical Magnification within Human Primary Visual Cortex Correlates with Acuity Thresholds. Neuron 38, 659–671.
Duong, T.Q., Kim, D.-S., Uğurbil, K., and Kim, S.-G. (2000). Spatiotemporal dynamics of the BOLD fMRI signals: Toward mapping submillimeter cortical columns using the early negative response. Magn. Reson. Med. 44, 231–242.
Engel, S.A., Rumelhart, D.E., Wandell, B.A., Lee, A.T., et al. (1994). fMRI of human visual cortex. Nature 369, 525.
Engel, S.A., Glover, G.H., and Wandell, B.A. (1997). Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cerebral Cortex 7, 181–192.
Fang, F., and He, S. (2008). Crowding alters the spatial distribution of attention modulation in human primary visual cortex. Journal of Vision 8, 6, 1–9.
Farzin, F., Rivera, S.M., and Whitney, D. (2009). Holistic crowding of Mooney faces. Journal of Vision 9, 18, 1–15.
Flom, M.C. (1991). Contour interaction and the crowding effect. Problems in Optometry 3, 237–257.
Fox, M.D., Snyder, A.Z., Vincent, J.L., Corbetta, M., Essen, D.C.V., and Raichle, M.E. (2005). The human brain is intrinsically organized into dynamic, anticorrelated functional networks. PNAS 102, 9673–9678.
Freeman, J., and Simoncelli, E.P. (2011). Metamers of the ventral stream.
Nat Neurosci 14, 1195–1201.
Freeman, J., Donner, T.H., and Heeger, D.J. (2011). Inter-area correlations in the ventral visual pathway reflect feature integration. Journal of Vision 11, 15, 1–23.
Friston, K.J., Holmes, A.P., Worsley, K.J., Poline, J.-P., Frith, C.D., and Frackowiak, R.S.J. (1994). Statistical parametric maps in functional imaging: A general linear approach. Hum. Brain Mapp. 2, 189–210.
Friston, K.J., Harrison, L., and Penny, W.D. (2003). Dynamic Causal Modelling. NeuroImage 19, 1273–1302.
Gardner, J.L. (2010). Is cortical vasculature functionally organized? NeuroImage 49, 1953–1956.
Gattass, R., Sousa, A.P., and Rosa, M.G. (1987). Visual topography of V1 in the Cebus monkey. J. Comp. Neurol. 259, 529–548.
Geisler, W.S., and Albrecht, D.G. (1995). Bayesian analysis of identification performance in monkey visual cortex: Nonlinear mechanisms and stimulus certainty. Vision Research 35, 2723–2730.
Harvey, B.M., and Dumoulin, S.O. (2011). The Relationship between Cortical Magnification Factor and Population Receptive Field Size in Human Visual Cortex: Constancies in Cortical Architecture. J. Neurosci. 31, 13604–13612.
Haxby, J.V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J.L., and Pietrini, P. (2001). Distributed and Overlapping Representations of Faces and Objects in Ventral Temporal Cortex. Science 293, 2425–2430.
Haynes, J.-D., and Rees, G. (2005). Predicting the orientation of invisible stimuli from activity in human primary visual cortex. Nat Neurosci 8, 686–691.
Ho, C., and Cheung, S.-H. (2011). Crowding by Invisible Flankers. PLoS ONE 6, e28814.
Hubel, D.H., and Wiesel, T.N. (1968). Receptive fields and functional architecture of monkey striate cortex. J Physiol 195, 215–243.
Kamitani, Y., and Tong, F. (2005). Decoding the visual and subjective contents of the human brain. Nat Neurosci 8, 679–685.
Kapadia, M.K., Westheimer, G., and Gilbert, C.D. (1999). Dynamics of spatial summation in primary visual cortex of alert monkeys.
PNAS 96, 12073–12078.
Kastner, S., Pinsk, M.A., De Weerd, P., Desimone, R., and Ungerleider, L.G. (1999). Increased Activity in Human Visual Cortex during Directed Attention in the Absence of Visual Stimulation. Neuron 22, 751–761.
Kay, K.N., Rokem, A., Winawer, J., Dougherty, R.F., and Wandell, B.A. (2013a). GLMdenoise: a fast, automated technique for denoising task-based fMRI data. Front Neurosci 7.
Kay, K.N., Winawer, J., Mezer, A., and Wandell, B.A. (2013b). Compressive spatial summation in human visual cortex. Journal of Neurophysiology 110, 481–494.
Kay, K.N., Weiner, K.S., and Grill-Spector, K. (2015). Attention Reduces Spatial Uncertainty in Human Ventral Temporal Cortex. Current Biology 25, 595–600.
Kim, S.G., Hendrich, K., Hu, X., Merkle, H., and Uğurbil, K. (1994). Potential pitfalls of functional MRI using conventional gradient-recalled echo techniques. NMR Biomed 7, 69–74.
Korte, W. (1923). Über die Gestaltauffassung im indirekten Sehen. Z. Psychologie 93, 17–82.
Krüger, G., and Glover, G.H. (2001). Physiological noise in oxygenation-sensitive magnetic resonance imaging. Magn. Reson. Med. 46, 631–637.
Kwong, K.K., Belliveau, J.W., Chesler, D.A., Goldberg, I.E., Weisskoff, R.M., Poncelet, B.P., Kennedy, D.N., Hoppel, B.E., Cohen, M.S., and Turner, R. (1992). Dynamic magnetic resonance imaging of human brain activity during primary sensory stimulation. PNAS 89, 5675–5679.
Levi, D.M. (2008). Crowding—An essential bottleneck for object recognition: A mini-review. Vision Research 48, 635–654.
Levi, D.M., Hariharan, S., and Klein, S.A. (2002). Suppressive and facilitatory spatial interactions in peripheral vision: Peripheral crowding is neither size invariant nor simple contrast masking. Journal of Vision 2, 167–177.
Levitt, J.B., and Lund, J.S. (1997). Contrast dependence of contextual effects in primate visual cortex. Nature 387, 73–76.
Liu, C.-S.J., Miki, A., Hulvershorn, J., Bloy, L., Gualtieri, E.E., Liu, G.T., Leigh, J.S., Haselgrove, J.C., and Elliott, M.A. (2006). Spatial and Temporal Characteristics of Physiological Noise in fMRI at 3T. Academic Radiology 13, 313–323.
Liu, T., Jiang, Y., Sun, X., and He, S. (2009). Reduction of the Crowding Effect in Spatially Adjacent but Cortically Remote Visual Stimuli. Current Biology 19, 127–132.
Loftus, G., and Masson, M. (1994). Using confidence intervals in within-subject designs. Psychonomic Bulletin & Review 1, 476–490.
Long, C.J., Brown, E.N., Triantafyllou, C., Aharon, I., Wald, L.L., and Solo, V. (2005). Nonstationary noise estimation in functional MRI. NeuroImage 28, 890–903.
Louie, E.G., Bressler, D.W., and Whitney, D. (2007). Holistic crowding: Selective interference between configural representations of faces in crowded scenes. Journal of Vision 7, 24–24.
Lund, T.E., Madsen, K.H., Sidaros, K., Luo, W.-L., and Nichols, T.E. (2006). Non-white noise in fMRI: Does modelling have an impact? NeuroImage 29, 54–66.
Maniglia, M., Pavan, A., Cuturi, L.F., Campana, G., Sato, G., and Casco, C. (2011). Reducing Crowding by Weakening Inhibitory Lateral Interactions in the Periphery with Perceptual Learning. PLoS ONE 6, e25568.
Masland, R.H. (2001). Neuronal diversity in the retina. Current Opinion in Neurobiology 11, 431–436.
Mildner, T., Norris, D.G., Schwarzbauer, C., and Wiggins, C.J. (2001). A qualitative test of the balloon model for BOLD-based MR signal changes at 3T. Magn Reson Med 46, 891–899.
Motter, B.C. (2006). Modulation of Transient and Sustained Response Components of V4 Neurons by Temporal Crowding in Flashed Stimulus Sequences. The Journal of Neuroscience 26, 9683–9694.
Motter, B.C., and Simoni, D.A. (2007). The roles of cortical image separation and size in active visual search performance. Journal of Vision 7, 6–6.
Nandy, A.S., and Tjan, B.S. (2012). Saccade-confounded image statistics explain visual crowding.
Nat Neurosci 15, 463–469.
Neri, P., and Levi, D.M. (2006). Spatial Resolution for Feature Binding Is Impaired in Peripheral and Amblyopic Vision. Journal of Neurophysiology 96, 142–153.
Norman, K.A., Polyn, S.M., Detre, G.J., and Haxby, J.V. (2006). Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences 10, 424–430.
Ogawa, S., Lee, T.M., Kay, A.R., and Tank, D.W. (1990). Brain magnetic resonance imaging with contrast dependent on blood oxygenation. PNAS 87, 9868–9872.
Ogawa, S., Menon, R.S., Tank, D.W., Kim, S.G., Merkle, H., Ellermann, J.M., and Ugurbil, K. (1993). Functional brain mapping by blood oxygenation level-dependent contrast magnetic resonance imaging. A comparison of signal characteristics with a biophysical model. Biophys J 64, 803–812.
Olman, C.A., Ugurbil, K., Schrater, P., and Kersten, D. (2004). BOLD fMRI and psychophysical measurements of contrast response to broadband images. Vision Research 44, 669–683.
Op de Beeck, H.P. (2010). Against hyperacuity in brain reading: Spatial smoothing does not hurt multivariate fMRI analyses? NeuroImage 49, 1943–1948.
Osterberg, G. (1935). Topography of the layer of rods and cones in the human retina (Nyt Nordisk Forlag).
Parkes, L., Lund, J., Angelucci, A., Solomon, J.A., and Morgan, M. (2001). Compulsory averaging of crowded orientation signals in human vision. Nat Neurosci 4, 739–744.
Parkes, L.M., Schwarzbach, J.V., Bouts, A.A., Deckers, R.H.R., Pullens, P., Kerskens, C.M., and Norris, D.G. (2005). Quantifying the spatial resolution of the gradient echo and spin echo BOLD response at 3 Tesla. Magn. Reson. Med. 54, 1465–1472.
Pelli, D.G. (1997). The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spatial Vision 10, 437–442.
Pelli, D.G., and Tillman, K.A. (2008). The uncrowded window of object recognition. Nat Neurosci 1129–1135.
Pelli, D.G., Tillman, K.A., Freeman, J., Su, M., Berger, T.D., and Majaj, N.J. (2007).
Crowding and eccentricity determine reading rate. Journal of Vision 7, 20, 1–36.
Petrov, Y., and McKee, S.P. (2006). The effect of spatial configuration on surround suppression of contrast sensitivity. Journal of Vision 6, 224–238.
Petrov, Y., and Popple, A.V. (2007). Crowding is directed to the fovea and preserves only feature contrast. Journal of Vision 7, 8, 1–9.
Petrov, Y., Popple, A.V., and McKee, S.P. (2007). Crowding and surround suppression: Not to be confused. Journal of Vision 7, 12, 1–9.
Pihlaja, M., Henriksson, L., James, A.C., and Vanni, S. (2008). Quantitative multifocal fMRI shows active suppression in human V1. Hum. Brain Mapp. 29, 1001–1014.
Poot, D.H.J., Sijbers, J., and den Dekker, A.J. (2008). An exploration of spatial similarities in temporal noise spectra in fMRI measurements. p. 69142F–69142F–8.
Press, W.A., Brewer, A.A., Dougherty, R.F., Wade, A.R., and Wandell, B.A. (2001). Visual areas and spatial summation in human visual cortex. Vision Research 41, 1321–1332.
Ress, D., Backus, B.T., and Heeger, D.J. (2000). Activity in primary visual cortex predicts performance in a visual detection task. Nat Neurosci 3, 940–945.
Rombouts, S.A.R.B., Barkhof, F., Hoogenraad, F.G.C., Sprenger, M., and Scheltens, P. (1998). Within-Subject Reproducibility of Visual Activation Patterns With Functional Magnetic Resonance Imaging Using Multislice Echo Planar Imaging. Magnetic Resonance Imaging 16, 105–113.
Rovamo, J., and Virsu, V. (1984). Isotropy of cortical magnification and topography of striate cortex. Vision Res. 24, 283–286.
Sceniak, M.P., Ringach, D.L., Hawken, M.J., and Shapley, R. (1999). Contrast’s effect on spatial summation by macaque V1 neurons. Nat Neurosci 2, 733–739.
Smith, A.T., Singh, K.D., Williams, A.L., and Greenlee, M.W. (2001). Estimating Receptive Field Size from fMRI Data in Human Striate and Extrastriate Visual Cortex. Cereb. Cortex 11, 1182–1190.
Smith, S.M., Miller, K.L., Salimi-Khorshidi, G., Webster, M., Beckmann, C.F., Nichols, T.E., Ramsey, J.D., and Woolrich, M.W. (2011). Network modelling methods for FMRI. NeuroImage 54, 875–891.
Strasburger, H., Harvey, L.O., and Rentschler, I. (1991). Contrast thresholds for identification of numeric characters in direct and eccentric view. Perception & Psychophysics 49, 495–508.
Swisher, J.D., Gatenby, J.C., Gore, J.C., Wolfe, B.A., Moon, C.-H., Kim, S.-G., and Tong, F. (2010). Multiscale pattern analysis of orientation-selective activity in the primary visual cortex. J Neurosci 30, 325.
Thomas, C.G., Harshman, R.A., and Menon, R.S. (2002). Noise Reduction in BOLD-Based fMRI Using Component Analysis. NeuroImage 17, 1521–1537.
Thulborn, K.R., Waterton, J.C., Matthews, P.M., and Radda, G.K. (1982). Oxygenation dependence of the transverse relaxation time of water protons in whole blood at high field. Biochimica et Biophysica Acta (BBA) - General Subjects 714, 265–270.
Tjan, B.S., Lestou, V., and Kourtzi, Z. (2006). Uncertainty and Invariance in the Human Visual Cortex. Journal of Neurophysiology 96, 1556–1568.
Tripathy, S.P., and Cavanagh, P. (2002). The extent of crowding in peripheral vision does not scale with target size. Vision Research 42, 2357–2369.
Uğurbil, K., Xu, J., Auerbach, E.J., Moeller, S., Vu, A.T., Duarte-Carvajalino, J.M., Lenglet, C., Wu, X., Schmitter, S., Van de Moortele, P.F., et al. (2013). Pushing spatial and temporal resolution for functional and diffusion MRI in the Human Connectome Project. NeuroImage 80, 80–104.
Wallace, J.M., and Tjan, B.S. (2011). Object crowding. Journal of Vision 11, 19–19.
Wandell, B.A., and Winawer, J. (2015). Computational neuroimaging and population receptive fields. Trends in Cognitive Sciences 19, 349–357.
Wässle, H. (2004). Parallel processing in the mammalian retina. Nat Rev Neurosci 5, 747–757.
Watson, A., and Pelli, D. (1983). Quest: A Bayesian adaptive psychometric method.
Attention, Perception, & Psychophysics 33, 113–120.
Whitney, D., and Levi, D.M. (2011). Visual crowding: a fundamental limit on conscious perception and object recognition. Trends in Cognitive Sciences 15, 160–168.
Williams, A.L., Singh, K.D., and Smith, A.T. (2003). Surround Modulation Measured With Functional MRI in the Human Visual Cortex. Journal of Neurophysiology 89, 525–533.
Winawer, J., Horiguchi, H., Sayres, R.A., Amano, K., and Wandell, B.A. (2010). Mapping hV4 and ventral occipital cortex: The venous eclipse. Journal of Vision 10, 1, 1–22.
Woolrich, M.W., Jenkinson, M., Brady, J.M., and Smith, S.M. (2004). Fully Bayesian spatio-temporal modeling of FMRI data. IEEE Transactions on Medical Imaging 23, 213–231.
Worsley, K.J., Liao, C.H., Aston, J., Petre, V., Duncan, G.H., Morales, F., and Evans, A.C. (2002). A General Statistical Analysis for fMRI Data. NeuroImage 15, 1–15.
Xiong, J., Parsons, L.M., Gao, J.-H., and Fox, P.T. (1999). Interregional connectivity to primary motor cortex revealed using MRI resting state images. Hum. Brain Mapp. 8, 151–156.
Yu, X., Glen, D., Wang, S., Dodd, S., Hirano, Y., Saad, Z., Reynolds, R., Silva, A.C., and Koretsky, A.P. (2012). Direct imaging of macrovascular and microvascular contributions to BOLD fMRI in layers IV–V of the rat whisker–barrel cortex. NeuroImage 59, 1451–1460.
Zarahn, E., Aguirre, G.K., and D’Esposito, M. (1997). Empirical Analyses of BOLD fMRI Statistics. NeuroImage 5, 179–197.
Zenger-Landolt, B., and Heeger, D.J. (2003). Response Suppression in V1 Agrees with Psychophysics of Surround Masking. J. Neurosci. 23, 6884–6893.
Zhao, J.M., Clingman, C.S., Närväinen, M.J., Kauppinen, R.A., and van Zijl, P.C.M. (2007). Oxygenation and hematocrit dependence of transverse relaxation rates of blood at 3T. Magn. Reson. Med. 58, 592–597.
Abstract
The peripheral visual system is poorly suited for high-resolution tasks such as face recognition and reading. For people who lose their central vision due to common diseases of the retina, such as age-related macular degeneration, these normally simple daily tasks become challenging. Even people with healthy vision can face problems due to the limitations of peripheral vision: a driver, for example, may be unaware of pedestrians about to step into the street. A better understanding of peripheral visual processing, particularly its deficiencies, is an important step toward developing training regimens and technological aids to improve peripheral vision. Because the same basic mechanisms likely govern central and peripheral visual processing, an improved understanding of peripheral vision also contributes to a comprehensive understanding of visual processing.

The study of peripheral visual processing in humans is challenging, as neural activity cannot be measured directly in healthy participants. One of the most common methods for indirectly measuring brain activity associated with a task or stimulus is functional magnetic resonance imaging (fMRI). However, applying fMRI to the study of peripheral vision is difficult because of properties of the fMRI measurement, as well as the limited cortical representation of the peripheral visual field. As a result of these factors, the fMRI signal for a peripheral stimulus is small, spatially and temporally blurred, and noisy.

In this thesis, I describe several studies addressing how fMRI can be used to better understand peripheral vision and its limitations. In the first study, we used fMRI to identify the earliest visual area involved in crowding, the difficulty in identifying cluttered objects in peripheral vision. Crowding is believed to be caused by neural mechanisms responsible for basic visual processes such as feature integration and segmentation. Knowing the first stage of visual processing at which crowding occurs constrains theories of the process that gives rise to it. In this first study, we eliminated the problem of localizing the signal of a flanked target letter by measuring changes in the blood-oxygen-level-dependent (BOLD) signal within a large region of interest (ROI) that includes both target and flankers. This allowed us to detect a suppression of BOLD signal associated with crowding as early as primary visual cortex (V1). In the second part of the thesis, I describe a model we developed to aid in the design and interpretation of more complex analyses that do not require averaging within a large ROI, but instead examine the individual responses of multiple voxels. These techniques are relatively new, and their viability in the peripheral representation in early visual cortex is not established. By incorporating the signal and noise characteristics of the BOLD measurement and the cortical representation of the visual field, our model provides a means of establishing the limits these factors impose on analyses of BOLD studies of peripheral vision in early visual cortex. The model also provides a non-neuronal baseline against which results can be compared to aid in the interpretation of fMRI experiments. Lastly, I demonstrate the application of this model to two commonly used analysis methods that can inform the study of peripheral visual processing: multi-voxel pattern analysis (MVPA) and population receptive field (pRF) estimation.
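The trade-off the abstract describes, between averaging within a large ROI and examining individual voxel responses, can be illustrated with a toy simulation. The sketch below is not the thesis's model: the block design, the 0.5 signal amplitude, the noise level of 2.0, and every function name are hypothetical, and the noise is assumed to be independent across voxels. Real BOLD noise is spatially and temporally correlated, so the gain from averaging would typically be smaller than shown here.

```python
import math
import random
import statistics


def block_design(n_timepoints, block_len, amplitude):
    """Hypothetical on/off response: 0.0 during 'off' blocks, amplitude during 'on' blocks."""
    return [amplitude if (t // block_len) % 2 == 1 else 0.0
            for t in range(n_timepoints)]


def simulate_voxels(n_voxels, signal, noise_sd, seed=0):
    """Each voxel carries the same signal plus independent Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    return [[s + rng.gauss(0.0, noise_sd) for s in signal]
            for _ in range(n_voxels)]


def roi_average(voxels):
    """Average across voxels at each time point, giving one ROI time course."""
    return [statistics.fmean(column) for column in zip(*voxels)]


def rms_error(timecourse, signal):
    """Root-mean-square deviation of a measured time course from the true signal."""
    return math.sqrt(statistics.fmean((y - s) ** 2
                                      for y, s in zip(timecourse, signal)))


signal = block_design(n_timepoints=200, block_len=20, amplitude=0.5)
voxels = simulate_voxels(n_voxels=100, signal=signal, noise_sd=2.0, seed=1)
roi = roi_average(voxels)

print("single-voxel RMS error:", round(rms_error(voxels[0], signal), 3))
print("ROI-average RMS error: ", round(rms_error(roi, signal), 3))
```

Under the independence assumption, averaging N voxels reduces the noise standard deviation by roughly a factor of the square root of N, which is why a small response can be detectable in the ROI average even when it is buried in noise at the single-voxel level. Multi-voxel analyses such as MVPA and pRF estimation give up this gain in exchange for spatial detail.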
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Functional models of fMRI BOLD signal in the visual cortex
Crowding and form vision deficits in peripheral vision
Crowding in peripheral vision
Characterization of visual cortex function in late-blind individuals with retinitis pigmentosa and Argus II patients
Explicit encoding of spatial relations in the human visual system: evidence from functional neuroimaging
The neural correlates of face recognition
Cortical and subcortical responses to electrical stimulation of rat retina
Value-based decision-making in complex choice: brain regions involved and implications of age
Modeling the development of mid-level visual cortex
Sensory information processing by retinothalamic neural circuits
Selectivity for visual speech in posterior temporal cortex
Computational models and model-based fMRI studies in motor learning
Exploring sensory responses in the different subdivisions of the visual thalamus
Contextual modulation of sensory processing via the pulvinar nucleus
Attention, movie cuts, and natural vision: a functional perspective
The brain's virtuous cycle: an investigation of gratitude and good human conduct
The neural correlates of creativity and perceptual pleasure: from simple shapes to humor
Neuroscience inspired algorithms for lifelong learning and machine vision
Synaptic integration in dendrites: theories and applications
The behavioral and neural bases of tactile object localization
Asset Metadata
Creator
Millin, Rachel (author)
Core Title
Functional magnetic resonance imaging characterization of peripheral form vision
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Neuroscience
Publication Date
11/09/2015
Defense Date
10/09/2015
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
fMRI, OAI-PMH Harvest, peripheral vision, primary visual cortex, visual crowding
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Tjan, Bosco S. (committee chair), Hirsch, Judith A. (committee member), Mel, Bartlett W. (committee member)
Creator Email
rachel3ai@gmail.com,rmillin@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-197147
Unique identifier
UC11278127
Identifier
etd-MillinRach-4020.pdf (filename), usctheses-c40-197147 (legacy record id)
Legacy Identifier
etd-MillinRach-4020.pdf
Dmrecord
197147
Document Type
Dissertation
Rights
Millin, Rachel
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA